17.04.2026 v0.8.0 Docs |
This is a feature update. Performance may have decreased compared to the
v0.7 release. We are focusing on performance improvements in upcoming releases.
Highlights
- All sol.config[...] values can now also be set via environment variables. The variable name always starts with SOL_CONFIG_, followed by the config key in upper case with :: replaced by _, e.g., sol.config["dfp::debug"]=True becomes SOL_CONFIG_DFP_DEBUG=TRUE. For boolean values, TRUE, ON or 1 can be used.
- Switched to intrinsics-like code generation for compute kernels.
- Added support for variable pixel dimensions of conv and pooling layers.
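The naming rule for the SOL_CONFIG_ environment variables can be sketched in a few lines of Python. The helper names `sol_config_env_name` and `parse_bool_env` are hypothetical and not part of SOL. Treating the boolean spellings as case-insensitive, and every other value as false, is an assumption made for illustration only.

```python
def sol_config_env_name(key: str) -> str:
    """Map a sol.config key to its environment variable name.

    Hypothetical helper (not part of SOL): replaces '::' with '_',
    upper-cases the key, and prepends the 'SOL_CONFIG_' prefix.
    """
    return "SOL_CONFIG_" + key.replace("::", "_").upper()


# Truthy spellings named in the release notes.
_TRUE_VALUES = {"TRUE", "ON", "1"}


def parse_bool_env(value: str) -> bool:
    """Interpret a boolean env value.

    Assumption: comparison is case-insensitive and anything not listed
    in _TRUE_VALUES counts as false; SOL's real parsing may differ.
    """
    return value.strip().upper() in _TRUE_VALUES


if __name__ == "__main__":
    print(sol_config_env_name("dfp::debug"))  # SOL_CONFIG_DFP_DEBUG
```

With this mapping, setting SOL_CONFIG_DFP_DEBUG=TRUE in the shell before starting Python corresponds to sol.config["dfp::debug"]=True in the script.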
Breaking Changes
- The default location for the SOL cache (.sol folder) is now at $HOME/.cache/sol. Please use the env var SOL_CWD=/path/to/cache to move the SOL cache if required.
- python3 -m sol fix-omp has been deprecated. Use nec-sol fix-omp instead! python3 -m sol-fix will be removed in v0.9!
- For performance reasons, no variable dimensions are enabled by default. Use vdims=[...] to manually enable variable dimensions.
- Changing backend heuristics via sol.config[...] has been removed.
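To check where the SOL cache would end up on a given machine, the new default can be approximated as below. This is a rough sketch, not SOL's actual lookup code. The function name `resolve_sol_cache_dir` is hypothetical, and the precedence shown (SOL_CWD first, then XDG_CACHE_HOME as mentioned in #1899, then $HOME/.cache) is an assumption. It also assumes SOL_CWD names the cache directory itself.

```python
import os
from pathlib import Path


def resolve_sol_cache_dir(env=None) -> Path:
    """Guess the SOL cache directory from environment variables.

    Hypothetical helper; the lookup order is an assumption, not
    confirmed SOL behavior.
    """
    env = os.environ if env is None else env

    # Explicit override described in the release notes.
    override = env.get("SOL_CWD")
    if override:
        return Path(override)

    # Assumed: XDG_CACHE_HOME replaces ~/.cache when set (see #1899).
    xdg = env.get("XDG_CACHE_HOME")
    if xdg:
        return Path(xdg) / "sol"

    # Documented default: $HOME/.cache/sol.
    home = env.get("HOME") or str(Path.home())
    return Path(home) / ".cache" / "sol"


if __name__ == "__main__":
    print(resolve_sol_cache_dir())
```

Passing a plain dict instead of os.environ makes it easy to try different combinations of variables without touching the real environment.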
Closed Issues
- #1943 [PyTorch] Test v2.11.0
- #1938 [TF] Test v2.21.0
- #1934 [CUBLAS] Set RPATH for libcublas.so in CUBLAS-handle
- #1933 [LicenseServer] Change .postVERSION to .postEXPIRATIONDATE
- #1931 [DFP] Improve Constant Setting
- #1930 [DFP] Improve mask sorting
- #1928 [HLIR] Performance problem when using BatchNormConvFuse, if the weights are significantly larger than the input data
- #1927 [DFP] atomicAdd(0.0) does not need to be executed!
- #1926 [DFP] Wrong gradients for complex mul and div
- #1925 [DFP] VE Intrinsics performance regression
- #1923 [DFP] Inline constant computations
- #1921 [HLIR] Add topk
- #1920 [DFP] Add code path for CUDA atomics for float16/32 dtypes
- #1918 [CUDNNv9] Returns CUDNN_STATUS_NOT_INITIALIZED with float16/bfloat16 but works with float32
- #1916 [DFP] Lookup boundary checks
- #1914 [Keras] Keras.predict does not throw exception when wrong input dtype is detected
- #1913 [License] Add .postX for every time a new license is issued, to prevent caching effects of expired license files.
- #1912 [DFP] move autoSqueeze from lowerIR to Planner
- #1911 [DFP] Unnecessary select for select(not(X), scalar(1), scalar(0))
- #1910 [DFP] Minimize register usage
- #1909 [SQLite] Improve stability when SOL cache is on a network drive
- #1906 [DFP] Investigate why PyTorch is faster on PY_Issue_1740
- #1901 [DFP] AllReduce performance optimization
- #1899 [Cache] Add XDG_CACHE_HOME to determine ~/.cache
- #1896 [PyTorch] Test v2.9.1
- #1895 [Core] Consider to move .sol to $HOME/.cache/sol as default
- #1890 [Installer] Add support for credential files
- #1867 [HLIR] ViewDType breaks sol_tensor_ptr_XXX(...) dtype check!
- #1854 [PyTorch] ANEMOI parsing: torch.compile reports different DTypes than SOL
- #1851 [Debug] Estimated Peak Memory differs from Memory Dump
- #1845 [Installer] Keyring
- #1834 [CUDNNv8] Remove SymPad and use explicit padding if leftPadding != rightPadding
- #1831 [Installer] add --fix-omp option
- #1819 [VE] VEDA_ERROR_CANT_FREE_NON_DELAYED_VPTR in TF_Dropout using VEO mode
- #1812 [HLIR] Enable Conv with vdims in pixel dimensions
- #1797 [JIT] Use SQL lock to mutex compilation of handle/modules in parallel processes
- #1795 [VE] VEDA_ERROR_UNKNOWN_VEDA_ARCHITECTURE thrown in initVEDA if no VE and no VEOS is present in system
- #1782 [DFP] Split nested loops that underutilize the vectors
- #1781 [Python] SOL might use wrong Python
- #1780 [Installer] Project-URL is wrong, should be v0.X not v0.X.Y
- #1778 [HLIR] Remove returnCarry from PrefixSum, instead handle like in MaxPool
- #1777 [Utils] cleanup file paths
- #1774 [DNN] Rework Handlers
- #1773 [DNNL] Report DNNL Workspace Size
- #1771 [YAAL] Workspaces
- #1770 [CUDNN] Fix CUDA Graphs when using CUDNN
- #1769 [TF] sol.optimize example inputs support
- #1767 [Python] "There is a newer version of SOL available" detects post installers as NEW SOL version
- #1764 [DFP] Deprecate LoopStack::sort
- #1763 [DFP] Improve implementation of CoresSIMD -> SIMD
- #1757 [DNN] Can we remove sol_ctx from the constructor of DeviceMap?
- #1734 [OMP] Remove Cost Model
- #1710 [Macros] Unify SOL_HAS_FUNC and SOL_HAS_FUNC2
- #1662 [VLLM] Remove support
- #1615 [HLIR] Implement Interpolate as Gather?
- #1438 [DFP] AllReduce
- #919 [HLIR] Merge two Reorder's with different shape sizes
- #472 [X86, VE, NVIDIA] Reverse PrefixSum
|