The NEC SX-Aurora supports two execution modes: transparent offloading and native offloading. If you only care about inference, transparent offloading is the easiest to use. For training, use native offloading if it is available for your framework, because it performs much better.
Within PyTorch, we support native tensors. Program PyTorch as if you were using a GPU, but replace all calls to cuda with ve:
    model.ve()              # copy model to VE#0
    input = input.ve()      # copy data to VE#0
    model(input)            # gets executed on the device
    torch.ve.synchronize()  # wait for execution to complete
(see https://pytorch.org/docs/stable/cuda.html for description)
    torch.Tensor.ve()
    torch.Tensor.to('ve')
    torch.Tensor.to('ve:X')
    torch.nn.Module.ve()
    torch.ve.synchronize(device=0)
    torch.ve.is_available()
    torch.ve.current_device()
    torch.ve.set_device(device)
    torch.ve.device_count()
    torch.ve.memory_allocated(device=None)
    CLASS torch.ve.device(device)
    CLASS torch.ve.device_of(device)
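As a rough illustration of how the query functions listed above could be combined, the following sketch chooses a device string and falls back to the CPU when no VE is usable. The helper pick_device and the stand-in objects are hypothetical and not part of PyTorch or SOL. Only is_available() and device_count() are taken from the list above, and the sketch assumes that the ve attribute is missing from the torch module when the VE backend is not loaded.

```python
from types import SimpleNamespace


def pick_device(torch_module, index=0):
    """Return 've:<index>' if the VE backend is usable, otherwise 'cpu'."""
    # Assumption: torch_module has no 've' attribute if the backend is not loaded.
    ve = getattr(torch_module, "ve", None)
    if ve is not None and ve.is_available() and index < ve.device_count():
        return f"ve:{index}"
    return "cpu"


# Stand-in objects so the sketch runs without PyTorch or an SX-Aurora.
fake_torch_with_ve = SimpleNamespace(
    ve=SimpleNamespace(is_available=lambda: True, device_count=lambda: 2)
)
fake_torch_plain = SimpleNamespace()

print(pick_device(fake_torch_with_ve, 1))  # index 1 exists on the fake
print(pick_device(fake_torch_with_ve, 5))  # index out of range, falls back
print(pick_device(fake_torch_plain))       # no VE backend, falls back
```

With a real installation you would pass the imported torch module instead, for example input = input.to(pick_device(torch)).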
SOL adds a new VE device to TensorFlow, so you only need to wrap your code in a with tf.device('VE:0'): block to tell TF to use the SX-Aurora. Further, you need to call the sol.optimize(...) function within the with block to ensure that the model gets allocated on the device.
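    import tensorflow as tf
    import sol

    model = init_my_model()
    with tf.device('VE:0'):
        input_ve = tf.identity(input)    # copy the host data to VE#0
        sol_model = sol.optimize(model)  # optimize inside the block so the model is allocated on the VE
        output_ve = sol_model(input_ve)  # run the optimized model on the device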
If your input data is located on the host system, you might need to execute tf.identity(...) within the with block to ensure that TF copies the data onto the device. If the data is on the host system while the model is allocated on the device, the behavior is undefined.
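Because mixing host data with a device model is undefined, it can be worth checking where a tensor lives before calling the model. The sketch below parses a TensorFlow device string, such as the one an eager tensor exposes through its device attribute. The helper is_on_ve is hypothetical, and it assumes that VE devices are reported with the device type VE in the usual TensorFlow naming scheme (for example .../device:VE:0). Verify this against the names your installation reports.

```python
def is_on_ve(device_name, index=None):
    """Return True if a TensorFlow device string names a VE device.

    If index is given, the device index must match as well.
    """
    parts = device_name.upper().split(":")
    # Full names end in "...device:VE:0", short names look like "VE:0".
    if len(parts) < 2 or parts[-2] != "VE":
        return False
    return index is None or parts[-1] == str(index)


# Example device strings in TensorFlow's naming scheme.
on_ve = "/job:localhost/replica:0/task:0/device:VE:0"
on_cpu = "/job:localhost/replica:0/task:0/device:CPU:0"

print(is_on_ve(on_ve))
print(is_on_ve(on_cpu))
print(is_on_ve("VE:1", index=0))
```

A check such as is_on_ve(input_ve.device) before calling the model would catch host-resident inputs early instead of running into undefined behavior.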
To use the NEC SX-Aurora with transparent offloading, call sol.device.set("ve", deviceIdx), where deviceIdx is the zero-based index of the Aurora to run on. In this mode the input data must be located on the host system.
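On machines with several Auroras it may be convenient to choose deviceIdx outside the script. The following sketch reads a zero-based index from an environment variable and validates it. Both the helper ve_index_from_env and the variable name MY_VE_INDEX are hypothetical and are not defined by SOL; only the sol.device.set("ve", deviceIdx) call in the final comment comes from the text above.

```python
import os


def ve_index_from_env(env=None, var="MY_VE_INDEX", default=0):
    """Return a zero-based VE index read from a hypothetical environment variable."""
    env = os.environ if env is None else env
    raw = env.get(var)
    if raw is None:
        return default
    idx = int(raw)  # raises ValueError for non-numeric input
    if idx < 0:
        raise ValueError(f"{var} must be >= 0, got {idx}")
    return idx


# Plain dictionaries stand in for os.environ so the sketch runs anywhere.
print(ve_index_from_env({"MY_VE_INDEX": "1"}))
print(ve_index_from_env({}))

# With SOL installed, the result would be passed on as described above:
# sol.device.set("ve", ve_index_from_env())
```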
|The AI framework reports that an operation is not supported by device type "VE"|
|This happens because only a minimal subset of VE function calls, e.g., +, -, *, /, ..., can be executed "eagerly" within the framework. If you encounter this problem, please open an issue for VEDA-PyTorch or VEDA-TensorFlow.|
|SOL reports "not found" for NCC compiler.|
|Possible Cause 1||
SOL is unable to find |
|Possible Cause 2||
If there is a problem with your NCC license, SOL is unable to properly detect the
compiler. Please run |
|SOL crashes with |
On some systems NCC v3.4.2 crashes when compiling code generated by SOL. If you
encounter this problem, please switch to an older version of the compiler using