Releases: RIKEN-RCCS/GEMMul8

v2.0.16

01 Apr 03:54

Modified test_flops.hpp

v2.0.15

31 Mar 07:31

Modified test programs for HIP environments

v2.0.14

29 Mar 00:00

Add: execution of cuBLAS Ozaki-I and BF16x9 in test_watt.hpp

Updated the test program for measuring power (watts) and GFLOPS/watt

v2.0.13

26 Mar 05:26

Fix an illegal memory access observed in test_flops.hpp after the Ozaki-I path on an NVIDIA B200 SXM 192GB system with cuBLAS 13.1.80.

Reported by T. Yamashita from SAKURA internet Inc.

v2.0.12

25 Mar 10:29

Add: test for Zgemm3m in the sample program

v2.0.11

25 Mar 00:44

Fix: compilation error in test programs using cuBLAS version < 13.1

v2.0.10

23 Mar 06:47

Fix: HIP execution via INT8

Fixed bugs in INT8-based emulation on HIP

v2.0.9

21 Mar 03:34

Update: Added Ozaki-I and BF16x9 to the test code

v2.0.8

13 Mar 05:44

Fix: reset workspace buffer state when reallocating with cudaFreeAsync

v2.0.7

27 Feb 15:03

Update: test problems