Releases: RIKEN-RCCS/GEMMul8
v2.0.16
v2.0.15
Modified test programs for HIP environments
v2.0.14
Add: execution of cuBLAS Ozaki-I and BF16x9 in test_watt.hpp
Updated the test program for measuring watts and GFLOPS/watt.
v2.0.13
Fix an illegal memory access observed in test_flops.hpp after the Ozaki-I path on an NVIDIA B200 SXM 192GB system with cuBLAS 13.1.80.
Reported by T. Yamashita from SAKURA internet Inc.
v2.0.12
Add: test for Zgemm3m in sample program
v2.0.11
Fix: compilation error in test programs using cuBLAS version < 13.1
v2.0.10
Fix: HIP execution via INT8
Fixed bugs in INT8-based emulation on HIP.
v2.0.9
Update: Added Ozaki-I and BF16x9 to the test code
v2.0.8
Fix: reset workspace buffer state when reallocating with cudaFreeAsync
v2.0.7
Update: test problems