Deployment is an advanced feature. This tutorial shows how to use it to produce a deployed model. Actually using the deployed model in another project requires knowledge about how to compile and link against the generated library.
Deployment is still in an experimental state and might change in the future!
SOL’s deployment functionality works in a similar way to sol.optimize(). But instead of creating an optimized model that is compiled when it is called, sol.deploy creates a directly compiled library of the given model at the given path. Both functions can be controlled by the same environment variables and sol.config. So, for example, autotuning can be enabled or disabled by setting sol.config["autotuning"], as sketched below. However, deploy does not read many of its options directly from the model. Instead, you have to define them manually. The function’s signature looks like this:
def deploy(model, args=[], kwargs={}, fwargs={}, *,
library_type:str = "shared_lib",
library_func:str = "predict",
library_name:str = "sol_deploy",
library_path:str = ".",
device:str,
device_idx:int = 0,
device_idx_runtime:int = 0,
weights_type:str = "linked",
weights_path:str = ".",
vdims:Optional[Union[List, Tuple]] = None,
compiler_args = {},
no_avx_512 = False
):
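Before looking at the individual options, here is a minimal sketch showing that sol.config settings such as autotuning also apply to deployment (model and example_input are placeholders, and treating sol.config["autotuning"] as a boolean switch is an assumption):
import sol

# Disable autotuning for this deployment run (same switch as for sol.optimize()).
sol.config["autotuning"] = False

# Deploy the model as a shared library for x86. "model" and "example_input"
# are assumed to be defined elsewhere, as in the full example below.
sol.deploy(model, [example_input], device="x86")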
These options can be separated into four categories:
These options define how SOL optimizes and compiles the network. Most of these are similar to sol.optimize().
Parameter | Valid Options/Type | Description |
---|---|---|
model | nn.module/tf.model/onnx_model | The Model, defined in any framework that is supported by SOL |
args | torch.Tensor/tf.Tensor | An example input, used to determine data types and shapes |
kwargs | dict | Keyword arguments |
fwargs | dict | Framework arguments, e.g. shape |
vdims | list | Variable dimensions, see VDim |
compiler_args | dict | Dictionary which is given directly to underlying JIT compilers |
These options define the properties of the generated library.
Parameter | Valid Options | Description |
---|---|---|
library_type | shared_lib, static_lib | Creates a shared (.so) or static (.a) library. |
library_name | str | Name of the library. “sol_” will be used as a prefix. |
library_func | str | Name of the main function of that library. |
library_path | str | Path to which the generated files will be written. |
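As an illustrative sketch (the values below are not the defaults), the library-related options could be combined like this:
import sol

# Build a static library instead of a shared one; the main entry point is
# named "infer" and all generated files go to ./deploy_out. "model" and
# "example_input" are placeholders.
sol.deploy(model, [example_input],
           device       = "x86",
           library_type = "static_lib",
           library_func = "infer",
           library_path = "./deploy_out")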
These options define how weights are stored and read by the generated library.
Parameter | Valid Options | Description |
---|---|---|
weights_type | linked, extern | Weights are either linked into the generated library or stored in an external location. |
weights_path | str | Path to store external weights. (Can be None for linked weights.) |
Weights that are linked into the library require less manual handling, because they are part of the same file. However, Linux limits the size of a shared library, so linked weights cannot be used for larger networks. Keeping the weights external also allows you to free them on the host if you have moved your model to another device during execution; the library itself, on the other hand, stays loaded on the host side as long as it is in use.
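For example, a sketch that stores the weights next to the library instead of linking them in (paths are illustrative):
import sol

# Store the weights in an external directory instead of linking them into
# the library; useful for larger networks. "model" and "example_input" are
# placeholders as before.
sol.deploy(model, [example_input],
           device       = "x86",
           weights_type = "extern",
           weights_path = "./deploy_out/data")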
These options define the device and the execution mode of the deployed model.
Parameter | Valid Options | Description |
---|---|---|
device | x86, nvidia, ve, veo | Target device of the deployed model. |
device_idx | int | Device id used during compilation (and autotuning) |
device_idx_runtime | int | Device id used by the generated library at runtime |
The difference between ve and veo is their respective execution mode. ve generates a .vso library that can be linked into an executable that runs directly on the Vector Engine. veo stands for Vector Engine offloading and generates a host library and a device library. You can use the host library in your x86 project and it will automatically offload the model and its inputs and outputs to the Vector Engine at runtime.
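For instance, a minimal sketch of an offloading deployment (values are illustrative; the resulting host library can then be linked into an x86 application):
import sol

# Deploy for Vector Engine offloading: SOL compiles (and autotunes) on VE 0,
# and the generated host library will also target VE 0 at runtime.
# "model" and "example_input" are placeholders.
sol.deploy(model, [example_input],
           device             = "veo",
           device_idx         = 0,
           device_idx_runtime = 0)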
Here is an example script:
import sol
import torch
import torchvision.models as models
model = models.get_model("convnext_small")
model.eval()
input = torch.rand(10,3,224,224, dtype=torch.float32)
sol.deploy(model, [input],
library_type = "shared_lib",
library_func = "predict",
library_name = "sol_convnext",
library_path = "/LOCAL/deployment",
device_idx = 0,
device = "nvidia",
weights_type = "extern",
weights_path = "data",
# Advanced features
vdims = [False],
compiler_args = {"ispc::vector":"avx2", "gcc:march":"native"},
)
This script creates a folder in /LOCAL/deployment with the following contents:
demouser:LOCAL/deployment$ ls
data libsol_convnext.so sol_convnext_example.c sol_convnext_example.py sol_convnext.h
To compile and run the generated example, run the following commands:
demouser:LOCAL/deployment$ g++ sol_convnext_example.c -lsol_convnext -o sol_convnext_exe
demouser:LOCAL/deployment$ ./sol_convnext_exe
Note that LIBRARY_PATH needs to include the location of libsol_convnext.so for the linker (ld) to be able to link against it. To run the executable, the loader (ld.so) needs to be able to locate the shared library as well. Set LD_LIBRARY_PATH accordingly or move libsol_convnext.so to a folder that is already included in these paths.
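For example, assuming the library stays in /LOCAL/deployment, the following environment setup makes both linking and running work:
demouser:LOCAL/deployment$ export LIBRARY_PATH=/LOCAL/deployment:$LIBRARY_PATH
demouser:LOCAL/deployment$ export LD_LIBRARY_PATH=/LOCAL/deployment:$LD_LIBRARY_PATH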
Alternatively, you can run the generated Python example:
demouser:LOCAL/deployment$ python sol_convnext_example.py
The generated examples are meant to be human readable and show you how to call the deployed model from their respective languages.
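As a rough illustration of what such an integration involves, the following assumption-laden sketch loads the deployed library from Python via ctypes (this is not the generated example itself; the exact argument layout of predict is declared in sol_convnext.h and demonstrated in the generated files):
import ctypes

# Load the deployed library; LD_LIBRARY_PATH must contain /LOCAL/deployment,
# or an absolute path to libsol_convnext.so can be passed instead.
lib = ctypes.CDLL("libsol_convnext.so")

# Look up the entry point chosen via library_func during deployment.
predict = lib.predict

# The argument and return types of predict are model-specific; consult
# sol_convnext.h and sol_convnext_example.py for the exact call signature.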