Performance/Determinism

Since v0.6, SOL obeys framework-specific determinism settings. Please refer to the respective framework's options (e.g., PyTorch below).

tl;dr:

For deterministic results use:

sol.config['autotune'] = False

For best performance use:

sol.config['autotune'] = True

SOL

In v0.6 SOL introduced determinism flags. These allow you to change the numerical behavior of SOL. By default, SOL obeys the numerical behavior of the AI framework; see below for information about PyTorch and TensorFlow.

If you want to control the determinism yourself, you can pass a set, list, or tuple of sol.hlir.Determinism values to sol.optimize(..., determinism=...) or torch.compile(..., backend='sol', determinism=...).

It is important that you ALWAYS select exactly one Framework_* option! The selected framework mode influences whether certain other options are obeyed. Not all device types support all options; unavailable options are ignored. A usage sketch follows the table below.

Currently supported options:

Option                 Effect
Framework_PyTorch      Sets PyTorch mode.
Framework_TensorFlow   Sets TensorFlow mode.
Framework_Numpy        Sets Numpy mode.
Framework_ONNX         Sets ONNX mode.
GEMM_TF32              Allows TF32 as a replacement for FP32 in GEMMs.
GEMM_FP16              Allows FP16 accumulators in FP16 GEMMs.
GEMM_BF16              Allows BF16 accumulators in BF16 GEMMs.
GEMM_BF16_2x           RESERVED
Conv_Benchmark         Enables Conv benchmarking.
Conv_TF32              Allows TF32 as a replacement for FP32 in Conv.
Conv_Nondeterministic  Allows non-deterministic Conv implementations.
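
For example, a minimal sketch of selecting options at compile time (the model is a placeholder and any further arguments of sol.optimize are omitted; only the determinism keyword and the option names are documented above):

import torch
import sol

model = torch.nn.Linear(128, 64)  # placeholder model

# Exactly one Framework_* option, plus performance-oriented relaxations:
opt_model = sol.optimize(model, determinism={
    sol.hlir.Determinism.Framework_PyTorch,
    sol.hlir.Determinism.GEMM_TF32,
    sol.hlir.Determinism.Conv_Benchmark,
})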

PyTorch

For deterministic results use:

# cuDNN: disable autotuning and select deterministic kernels
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.allow_tf32 = False
# matmul: full FP32 precision, no TF32, no reduced-precision reductions
torch.set_float32_matmul_precision("highest")
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False
torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = False

For best performance use:

# cuDNN: autotune kernels and allow non-deterministic/TF32 implementations
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.allow_tf32 = True
# matmul: "medium" is the lowest valid precision setting and allows TF32/BF16
torch.set_float32_matmul_precision("medium")
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True
torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = True

More information can be found in PyTorch's documentation on numerical accuracy.

TensorFlow

SOL obeys the tf.config.experimental.enable_tensor_float_32_execution setting.
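
For example, using the standard TensorFlow API named above (set it before building or running the model):

import tensorflow as tf

# Deterministic: keep FP32 matmuls/convolutions in full FP32 (no TF32).
tf.config.experimental.enable_tensor_float_32_execution(False)

# Best performance on TF32-capable GPUs: allow TF32 in place of FP32.
# tf.config.experimental.enable_tensor_float_32_execution(True)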