Deployment

Deployment is an advanced feature. This tutorial shows how to use it to produce a deployed model. Actually using the deployed model in another project additionally requires knowing how to build against and call the generated library from that project.

Deployment is still in an experimental state and might change in the future!

How to use

SOL’s deployment functionality works in a similar way to sol.optimize(). However, instead of creating an optimized model that is compiled when it is first called, sol.deploy creates a ready-compiled library of the given model at the given path. Both functions are controlled by the same environment variables and sol.config settings, so, for example, autotuning can be enabled or disabled by setting sol.config["autotuning"]. Unlike sol.optimize(), though, deploy does not read many of its options directly from the model; instead you have to define them manually. The function’s signature looks like this:

def deploy(model, args=[], kwargs={}, fwargs={}, *,
		library_type:str					= "shared_lib",
		library_func:str					= "predict",
		library_name:str					= "sol_deploy",
		library_path:str					= ".",

		device:str,
		device_idx:int						= 0,
		device_idx_runtime:int				= 0,

		weights_type:str					= "linked",
		weights_path:str					= ".",
		vdims:Optional[Union[List, Tuple]]	= None,
		compiler_args						= {},
		no_avx_512							= False
	):
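
For example, autotuning can be switched off before the call, just as for sol.optimize(). A minimal sketch, assuming a boolean value is accepted for the sol.config entry mentioned above:

import sol

# Same switch that controls autotuning for sol.optimize(); set it before calling sol.deploy.
sol.config["autotuning"] = False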

These options can be separated into four categories:

SOL-Options

These options define how SOL optimizes and compiles the network. Most of these are similar to sol.optimize().

Parameter | Valid Options/Type | Description
model | nn.Module / tf.Model / ONNX model | The model, defined in any framework that is supported by SOL
args | torch.Tensor / tf.Tensor | An example input, used to determine datatypes and shapes
kwargs | dict | Keyword arguments
fwargs | dict | Framework arguments, e.g. shape
vdims | list | Variable dimensions, see VDim
compiler_args | dict | Dictionary which is passed directly to the underlying JIT compilers

Library Options

These options define the properties of the generated library.

Parameter | Valid Options | Description
library_type | shared_lib, static_lib | Creates a shared (.so) or static (.a) library.
library_name | str | Name of the library. “sol_” will be used as prefix.
library_func | str | Name of the main function of that library.
library_path | str | Path to which the generated files will be written.

Weight Options

These options define how weights are stored and read by the generated library.

Parameter | Valid Options | Description
weights_type | linked, extern | Weights are either linked into the generated library or stored in an external location.
weights_path | str | Path to store external weights. (Can be None for linked weights.)

Weights that are linked into the library require less manual handling, because they are part of the same file. However, the size of a shared library is limited on Linux, so this option cannot be used for larger networks. Keeping the weights external also allows you to free them on the host once the model has been moved to another device during execution; the library itself, on the other hand, stays loaded on the host side as long as it is in use.
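
For illustration, the two variants differ only in the weights_type and weights_path arguments. A minimal sketch with a placeholder model and input (not part of the original example):

import sol
import torch

model = torch.nn.Linear(8, 4).eval()   # placeholder model
x = torch.rand(1, 8)                   # placeholder example input

# Linked weights: everything ends up in the generated library, weights_path is not needed.
sol.deploy(model, [x], device="x86", weights_type="linked")

# External weights: stored separately under weights_path.
sol.deploy(model, [x], device="x86", weights_type="extern", weights_path="weights")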

Device Options

These options define the device and the execution mode of the deployed model.

Parameter | Valid Options | Description
device | x86, nvidia, ve, veo | Target device for which the model is compiled
device_idx | int | Device id during compilation (+ autotuning)
device_idx_runtime | int | Device id used by the generated library at runtime

The difference between ve and veo lies in their respective execution modes. ve generates a .vso library that can be linked into an executable running directly on the Vector Engine. veo stands for Vector Engine offloading and generates a host library and a device library; you can use the host library in your x86 project, and it will automatically offload the model and its inputs and outputs to the Vector Engine at runtime.
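
As a sketch, an offloading deployment could look like this (placeholder model and device indices, chosen for illustration):

import sol
import torch

model = torch.nn.Linear(8, 4).eval()   # placeholder model
x = torch.rand(1, 8)                   # placeholder example input

# veo: builds an x86 host library plus a device library for the Vector Engine.
# device_idx is the VE used while compiling (and autotuning),
# device_idx_runtime is the VE the generated library offloads to.
sol.deploy(model, [x],
	device             = "veo",
	device_idx         = 0,
	device_idx_runtime = 1,
)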

Example

Here is an example script:

import sol
import torch
import torchvision.models as models

model = models.get_model("convnext_small")
model.eval()

input = torch.rand(10,3,224,224, dtype=torch.float32)

sol.deploy(model, [input], 
	library_type		= "shared_lib",
	library_func		= "predict",
	library_name		= "sol_convnext",
	library_path		= "/LOCAL/deployment",
	device_idx			= 0,
	device				= "nvidia",
	weights_type		= "extern",
	weights_path		= "data",

	# Advanced features
	vdims				= [False],
	compiler_args		= {"ispc::vector":"avx2", "gcc:march":"native"},
)

Output Files

This script creates a folder in /LOCAL/deployment with the following contents:

demouser:LOCAL/deployment$ ls
data  libsol_convnext.so  sol_convnext_example.c  sol_convnext_example.py  sol_convnext.h 

To compile and run the generated example, run the following commands:

demouser:LOCAL/deployment$ g++ sol_convnext_example.c -lsol_convnext -o sol_convnext_exe
demouser:LOCAL/deployment$ ./sol_convnext_exe

Note that LIBRARY_PATH needs to include the location of libsol_convnext.so for the linker (ld) to be able to link against it. To run the executable, the loader (ld.so) needs to be able to locate the shared library as well. Set LD_LIBRARY_PATH accordingly or move libsol_convnext.so to a folder included in these paths.
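
For example, assuming the library stays in /LOCAL/deployment:

demouser:LOCAL/deployment$ export LIBRARY_PATH=/LOCAL/deployment:$LIBRARY_PATH
demouser:LOCAL/deployment$ export LD_LIBRARY_PATH=/LOCAL/deployment:$LD_LIBRARY_PATH
demouser:LOCAL/deployment$ g++ sol_convnext_example.c -lsol_convnext -o sol_convnext_exe
demouser:LOCAL/deployment$ ./sol_convnext_exe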

Alternatively, you can run the generated Python example:

demouser:LOCAL/deployment$ python sol_convnext_example.py

The generated examples are meant to be human readable and show you how to call the deployed model from their respective languages.