NEC SX-Aurora

Requirements

Requirement   Version
VEOS          ≥ 2.7
NCC           ≥ 5.0 (if using VE3)

Native Offloading (PyTorch)

Within PyTorch, SOL supports the use of native VE tensors. To use them, program PyTorch as if you were targeting a GPU, but replace all calls to cuda with ve. E.g.:

model.ve()               # copy model to VE#0
input = input.ve()       # copy data to VE#0
model(input)             # gets executed on the device
torch.ve.synchronize()   # wait for execution to complete

Available functions

(see https://pytorch.org/docs/stable/cuda.html for descriptions of the corresponding CUDA functions)

torch.Tensor.ve()
torch.Tensor.to('ve')
torch.Tensor.to('ve:X')
torch.nn.Module.ve()
torch.ve.synchronize(device=0)
torch.ve.is_available()
torch.ve.current_device()
torch.ve.set_device(device)
torch.ve.device_count()
torch.ve.memory_allocated(device=None)
CLASS torch.ve.device(device)
CLASS torch.ve.device_of(device)
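
For illustration, a minimal sketch of how these calls mirror their CUDA counterparts (the device index 0 is only an example; depending on your installation, the VEDA-PyTorch extension may need to be loaded before torch.ve is available):

import torch

if torch.ve.is_available():
	print("VE devices:", torch.ve.device_count())
	torch.ve.set_device(0)               # select VE#0
	x = torch.randn(4, 4).to('ve:0')     # copy a tensor to VE#0
	y = x * 2                            # gets executed on the device
	torch.ve.synchronize(device=0)       # wait for execution to complete
	print("bytes allocated:", torch.ve.memory_allocated(0))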

Training

Loss functions are not implemented natively for the VE. Instead, use a wrapper model that adds the loss function to the SOL-optimized model:

class TrainingModel(torch.nn.Module):
	def __init__(self, model, loss):
		super().__init__()
		self.model = model
		self.loss  = loss

	def forward(self, input, target):
		output = self.model(input)
		loss   = self.loss (output, target)
		return output, loss

And adjust your training loop:

device = 've:0'
model.to(device)

optimizer      = torch.optim.SGD(model.parameters(), lr=0.01)
training_model = TrainingModel(model, torch.nn.L1Loss())
training_model = sol.optimize(training_model)

for input, target in dataset:
	input, target = input.to(device), target.to(device)
	optimizer.zero_grad()
	output, loss  = training_model(input, target)
	loss.backward()
	optimizer.step()

training_model and model share the same weights, so you don’t need to further adjust your code.

For optimal performance, and if you don’t need PyTorch-identical pseudo-random numbers, use sol.optimize(..., determinism=sol.pytorch.determinism(sol.Determinism.Rand_Fastest)), which enables the use of much faster random number generators.
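
In the training setup above, this could look as follows (a sketch; only the determinism argument from the previous sentence is added):

training_model = sol.optimize(
	training_model,
	determinism=sol.pytorch.determinism(sol.Determinism.Rand_Fastest),
)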

Native Offloading (TensorFlow)

Due to the increasing number of unresolved issues in the TensorFlow PluggableDevice API (e.g., #55497, #57095, #60883, #60895), we decided to no longer maintain our veda-tensorflow extension. Therefore you can no longer use with tf.device("/VE:0"):. Instead, please use Transparent Offloading via sol.device.set('ve', 0) (see the example in the Transparent Offloading section below). We are sorry for the inconvenience, but we don’t see any commitment from the TensorFlow team to accept our bug fixes, nor to fix the issues themselves.

Transparent Offloading (all frameworks)

To use the NEC SX-Aurora, it is necessary to call sol.device.set("ve", deviceIdx), where deviceIdx is the index of the Aurora to run on, starting from 0. Further, the input data needs to be located on the host system.

As explained in our paper SOL: Effortless Device Support for AI Frameworks without Source Code Changes, running inference with Transparent Offloading has nearly zero impact on performance. Training performance, however, will be very poor!
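
A minimal sketch of transparent offloading for inference, assuming model and input are defined as in the PyTorch examples above (with transparent offloading, SOL handles the transfers between host and device):

import sol

sol.device.set("ve", 0)            # run on Aurora #0
opt_model = sol.optimize(model)    # optimize the model for the VE
output    = opt_model(input)       # input stays on the host system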

Config Options

Option       Type/Default   Description
ve::trace    bool / false   Enables the use of ftrace.
ve::packed   bool / false   Enables the use of packed vectors for float32.

Known Issues

  1. float16 or bfloat16 data types are not supported
  2. 3D Convolution or DeConvolution are not supported
  3. running SOL in a write-protected folder might cause compilation to fail #1325
  4. PyTorch’s Bernoulli and Dropout don’t produce identical pseudo random numbers, due to unavailability of MKL’s VSL Bernoulli algorithm for VE.

Env Vars

EnvVar                 Default                       Description
NAR                    "/opt/nec/ve/bin/nar"         Path to nar
NCXX                   "/opt/nec/ve/bin/nc++"        Path to nc++
NOBJCOPY               "/opt/nec/ve/bin/nobjcopy"    Path to nobjcopy
VEDA_VISIBLE_DEVICES                                 See VEDA for description
VE_NODE_NUMBER                                       See VEDA for description
VE_OMP_NUM_THREADS                                   See VEDA for description
_VENODELIST                                          See VEDA for description
VE_LD_LIBRARY_PATH                                   See VEDA for description
NCPATH                                               Used as include paths
NC_INCLUDE_PATH                                      Used as include paths
NCPLUS_INCLUDE_PATH                                  Used as include paths
NLIBRARY_PATH                                        Used as library paths

FAQ

The AI framework reports that an operation is not supported by device type "VE"

This is caused by the fact that only a minimal subset of operations (e.g., +, -, *, /, …) is supported to be executed “eagerly” on the VE within the framework. If you encounter this problem, please open an issue for VEDA-PyTorch.

SOL reports "not found" for NCC compiler.
Possible Cause 1

SOL is unable to find /opt/nec/ve/bin/nc++. If you don’t use a standard installation, please use the NCXX, NAR and NLD env vars to specify the paths to your NCC installation.
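
For illustration, a hedged sketch of overriding these paths from Python before SOL is used (the version in the paths below is only an example of a non-default installation; exporting the variables in your shell before starting Python works just as well):

import os

# example paths of a non-default NCC installation (adjust to your system)
os.environ["NCXX"] = "/opt/nec/ve/ncc/5.1.0/bin/nc++"
os.environ["NAR"]  = "/opt/nec/ve/ncc/5.1.0/bin/nar"

import sol  # assumed: SOL picks up these variables when it invokes the VE compiler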

Possible Cause 2

If there is a problem with your NCC license, SOL is unable to properly detect the compiler. Please run nc++ --version and check for any error messages.

SOL crashes with nc++: /opt/nec/ve/ncc/3.4.2/libexec/ccom is abnormally terminated by SIGSEGV.

On some systems, NCC v3.4.2 crashes when compiling code generated by SOL. If you encounter this problem, please switch to an older version of the compiler using the NCXX env var.

SOL reports VEDA_ERROR: VEDA_ERROR_CANNOT_CREATE_CONTEXT.

This error message is triggered when the VE is occupied by another process. SOL relies on AVEO which requires exclusive access to the device. To resolve this issue, terminate all other processes on the device. You can use VE_NODE_NUMBER=0 /opt/nec/ve/ve-top to identify running processes.

Singularity Containers

You can use the following scripts to build Singularity Containers that contain SOL.

Remote Setup

If you want to install SOL using the official repository, use this script.

BootStrap: docker
From: rockylinux/rockylinux:8.10

%post
	# setup OS
	dnf update -y
	dnf install -y gcc-toolset-10 python312					# SOL requirements
	dnf install -y epel-release								# VEOS requirements
	dnf install -y libquadmath libdhash protobuf-c log4c	# VEOS requirements
	
	# setup VENV
	python3 -m venv /venv

	. /venv/bin/activate
	python3 -m pip install --upgrade pip
	python3 -m pip install {{ PYTHON_FRAMEWORKS }}
	python3 -m pip install nec-sol
	python3 -m nec-sol install -u "{{ SOL_USERNAME }}" -p "{{ SOL_PASSWORD }}" --accept-license --devices ve
	deactivate

%environment
	# init VE paths
	export PATH=$PATH:/opt/nec/ve/bin
	export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/nec/ve/veos/lib64

	# init GCC/10
	. scl_source enable gcc-toolset-10
	export CC=/opt/rh/gcc-toolset-10/root/usr/bin/gcc
	export CXX=/opt/rh/gcc-toolset-10/root/usr/bin/g++

	# init VENV
	. /venv/bin/activate

	# Configure Proxy here if needed
	# export https_proxy=192.168.0.1:1234
	# export http_proxy=192.168.0.1:1234

Create a file sol4ve.cfg and set the correct values for the variables; replace {USERNAME} and {PASSWORD} with your credentials:

SOL_USERNAME={USERNAME}
SOL_PASSWORD={PASSWORD}
PYTHON_FRAMEWORKS=torch torchvision

And then build it with the following command.

sudo -E singularity build --build-arg-file sol4ve.cfg sol4ve.sif sol4ve.def

Local Setup

In case you want to install SOL from a local folder, you can use the following script.

BootStrap: docker
From: rockylinux/rockylinux:8.10

%files
	{{ SOL_PATH }} /sol

%post
	# setup OS
	dnf update -y
	dnf install -y gcc-toolset-10 python312					# SOL requirements
	dnf install -y epel-release								# VEOS requirements
	dnf install -y libquadmath libdhash protobuf-c log4c	# VEOS requirements
	
	# setup VENV
	python3 -m venv /venv

	. /venv/bin/activate
	python3 -m pip install --upgrade pip
	python3 -m pip install {{ PYTHON_FRAMEWORKS }}
	python3 -m pip install --pre nec-sol-core[ve,torch] veda-pytorch -f /sol # add features if needed
	deactivate

	rm /sol/*.*
	rmdir /sol

%environment
	# init VE paths
	export PATH=$PATH:/opt/nec/ve/bin
	export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/nec/ve/veos/lib64

	# init GCC/10
	. scl_source enable gcc-toolset-10
	export CC=/opt/rh/gcc-toolset-10/root/usr/bin/gcc
	export CXX=/opt/rh/gcc-toolset-10/root/usr/bin/g++

	# init VENV
	. /venv/bin/activate

	# Configure Proxy here if needed
	# export https_proxy=192.168.0.1:1234
	# export http_proxy=192.168.0.1:1234

Create a file sol4ve.cfg and set the correct values for the variables; replace {PATH} with the path to the SOL download folder:

SOL_PATH={PATH}
PYTHON_FRAMEWORKS=torch torchvision

And then build it with the following command.

sudo -E singularity build --build-arg-file sol4ve.cfg sol4ve.sif sol4ve.def

Execution

To run the container, execute the following command, which manually binds the required folders:

singularity shell --writable-tmpfs --bind /opt/nec:/opt/nec:ro --bind /usr/lib64/libaurlic.so.1:/usr/lib64/libaurlic.so.1:ro --bind /var/opt/nec/ve/veos/:/var/opt/nec/ve/veos/:rw sol4ve.sif

Alternatively, you can use the SINGULARITY_BIND env var instead of the --bind arguments:

export SINGULARITY_BIND=/opt/nec:/opt/nec:ro,/usr/lib64/libaurlic.so.1:/usr/lib64/libaurlic.so.1:ro,/var/opt/nec/ve/veos/:/var/opt/nec/ve/veos/:rw
singularity shell --writable-tmpfs sol4ve.sif