v0.8 Gacrux

Version: v0.8.0
Date: 17.04.2026

This is a feature update. Performance may have decreased compared to the v0.7 release. Upcoming releases will focus on performance improvements.

Highlights

  • All sol.config[...] values can now also be set via environment variables. The variable name always starts with SOL_CONFIG_, followed by the config key in upper case with :: replaced by _, e.g., sol.config["dfp::debug"]=True becomes SOL_CONFIG_DFP_DEBUG=TRUE. Boolean values can be given as TRUE, ON or 1.
  • Switched to intrinsics-like code generation for compute kernels.
  • Added support for variable pixel dimensions of conv and pooling layers.
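The naming rule for the SOL_CONFIG_ environment variables can be sketched in a few lines of Python. The helper name sol_config_env_name is hypothetical and not part of SOL. The sketch also assumes that the variable has to be set before SOL reads its configuration, which these notes do not state.

```python
import os


def sol_config_env_name(key: str) -> str:
    """Map a sol.config key such as "dfp::debug" to its env var name.

    Hypothetical helper (not part of SOL): upper-case the key,
    replace "::" with "_" and prepend "SOL_CONFIG_".
    """
    return "SOL_CONFIG_" + key.upper().replace("::", "_")


# Env var equivalent of sol.config["dfp::debug"] = True.
# Assumption: this must happen before SOL reads its configuration.
os.environ[sol_config_env_name("dfp::debug")] = "TRUE"

print(sol_config_env_name("dfp::debug"))  # SOL_CONFIG_DFP_DEBUG
```

Setting the variable in the shell before starting Python (SOL_CONFIG_DFP_DEBUG=TRUE) follows the same naming rule.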

Breaking Changes

  • The default location for the SOL cache (.sol folder) is now at $HOME/.cache/sol. Please use the env var SOL_CWD=/path/to/cache to move the SOL cache if required.
  • python3 -m sol fix-omp has been deprecated. Use nec-sol fix-omp instead. The deprecated command will be removed in v0.9.
  • For performance reasons, variable dimensions are no longer enabled by default. Use vdims=[...] to enable them manually.
  • Changing backend heuristics via sol.config[...] has been removed.
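To check where the SOL cache is likely to end up after the move to $HOME/.cache/sol, the lookup can be approximated as below. This is a sketch under assumptions, not SOL's actual implementation: the function name default_sol_cache_dir is hypothetical, SOL_CWD is assumed to be used as the cache path directly and to take precedence, and XDG_CACHE_HOME (see #1899) is assumed to replace $HOME/.cache when set, as in the XDG Base Directory convention.

```python
import os
from pathlib import Path


def default_sol_cache_dir(env=None) -> Path:
    """Approximate the SOL cache location (hypothetical helper).

    Assumed precedence: SOL_CWD, then XDG_CACHE_HOME/sol,
    then $HOME/.cache/sol.
    """
    env = os.environ if env is None else env

    # Assumption: SOL_CWD points directly at the cache directory.
    override = env.get("SOL_CWD")
    if override:
        return Path(override)

    # XDG convention: fall back to ~/.cache if XDG_CACHE_HOME is unset or empty.
    xdg = env.get("XDG_CACHE_HOME")
    if xdg:
        base = Path(xdg)
    else:
        home = env.get("HOME") or str(Path.home())
        base = Path(home) / ".cache"
    return base / "sol"


print(default_sol_cache_dir())
```

Passing a plain dict instead of os.environ makes it easy to try out different combinations of the three variables without touching the real environment.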

Closed Issues

  • #1943 [PyTorch] Test v2.11.0
  • #1938 [TF] Test v2.21.0
  • #1934 [CUBLAS] Set RPATH for libcublas.so in CUBLAS-handle
  • #1933 [LicenseServer] Change .postVERSION to .postEXPIRATIONDATE
  • #1931 [DFP] Improve Constant Setting
  • #1930 [DFP] Improve mask sorting
  • #1928 [HLIR] Performance problem when using BatchNormConvFuse, if the weights are significantly larger than the input data
  • #1927 [DFP] atomicAdd(0.0) does not need to be executed!
  • #1926 [DFP] Wrong gradients for complex mul and div
  • #1925 [DFP] VE Intrinsics performance regression
  • #1923 [DFP] Inline constant computations
  • #1921 [HLIR] Add topk
  • #1920 [DFP] Add code path for CUDA atomics for float16/32 dtypes
  • #1918 [CUDNNv9] Returns CUDNN_STATUS_NOT_INITIALIZED with float16/bfloat16 but works with float32
  • #1916 [DFP] Lookup boundary checks
  • #1914 [Keras] Keras.predict does not throw exception when wrong input dtype is detected
  • #1913 [License] Add .postX for every time a new license is issued, to prevent caching effects of expired license files.
  • #1912 [DFP] move autoSqueeze from lowerIR to Planner
  • #1911 [DFP] Unnecessary select for select(not(X), scalar(1), scalar(0))
  • #1910 [DFP] Minimize register usage
  • #1909 [SQLite] Improve stability when SOL cache is on a network drive
  • #1906 [DFP] Investigate why PyTorch is faster on PY_Issue_1740
  • #1901 [DFP] AllReduce performance optimization
  • #1899 [Cache] Add XDG_CACHE_HOME to determine ~/.cache
  • #1896 [PyTorch] Test v2.9.1
  • #1895 [Core] Consider to move .sol to $HOME/.cache/sol as default
  • #1890 [Installer] Add support for credential files
  • #1867 [HLIR] ViewDType breaks sol_tensor_ptr_XXX(...) dtype check!
  • #1854 [PyTorch] ANEMOI parsing: torch.compile reports different DTypes than SOL
  • #1851 [Debug] Estimated Peak Memory differs from Memory Dump
  • #1845 [Installer] Keyring
  • #1834 [CUDNNv8] Remove SymPad and use explicit padding if leftPadding != rightPadding
  • #1831 [Installer] add --fix-omp option
  • #1819 [VE] VEDA_ERROR_CANT_FREE_NON_DELAYED_VPTR in TF_Dropout using VEO mode
  • #1812 [HLIR] Enable Conv with vdims in pixel dimensions
  • #1797 [JIT] Use SQL lock to mutex compilation of handle/modules in parallel processes
  • #1795 [VE] VEDA_ERROR_UNKNOWN_VEDA_ARCHITECTURE thrown in initVEDA if no VE and no VEOS is present in system
  • #1782 [DFP] Split nested loops that underutilize the vectors
  • #1781 [Python] SOL might use wrong Python
  • #1780 [Installer] Project-URL is wrong, should be v0.X not v0.X.Y
  • #1778 [HLIR] Remove returnCarry from PrefixSum, instead handle like in MaxPool
  • #1777 [Utils] cleanup file paths
  • #1774 [DNN] Rework Handlers
  • #1773 [DNNL] Report DNNL Workspace Size
  • #1771 [YAAL] Workspaces
  • #1770 [CUDNN] Fix CUDA Graphs when using CUDNN
  • #1769 [TF] sol.optimize example inputs support
  • #1767 [Python] "There is a newer version of SOL available" detects post installers as NEW SOL version
  • #1764 [DFP] Deprecate LoopStack::sort
  • #1763 [DFP] Improve implementation of CoresSIMD -> SIMD
  • #1757 [DNN] Can we remove sol_ctx from the constructor of DeviceMap?
  • #1734 [OMP] Remove Cost Model
  • #1710 [Macros] Unify SOL_HAS_FUNC and SOL_HAS_FUNC2
  • #1662 [VLLM] Remove support
  • #1615 [HLIR] Implement Interpolate as Gather?
  • #1438 [DFP] AllReduce
  • #919 [HLIR] Merge two Reorder's with different shape sizes
  • #472 [X86, VE, NVIDIA] Reverse PrefixSum