dgesv_batched / dgetrs_batched fails for combination [batchCount, N, nrhs] = [1, >1025, >1025]
Hi,
The existing implementations of dgesv_batched and dgetrs_batched fail for the stated combination of parameters in the tests. From testing/testing_dgesv_batched:
% BatchCount N NRHS CPU Gflop/s (sec) GPU Gflop/s (sec) ||B - AX|| / N*||A||*||X||
%============================================================================================
1 1025 1025 --- ( --- ) 16.12 ( 0.18) 1.26e-07 failed
This is causing downstream problems in PyTorch as referenced here: https://github.com/pytorch/pytorch/issues/36921
Comments (4)
-
reporter -
Please include the complete input and output of the tester, and some context about the platform you are running on (MAGMA version, CUDA version, BLAS/LAPACK library, Linux/macOS/Windows, etc.). This aids in reproducing problems.
-
reporter The complete input and output is given here:
$ ./testing_dgesv_batched -N 1025 --nrhs 1025 --batch 1
% MAGMA 2.5.3 svn compiled for CUDA capability >= 5.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 10000, driver 10010. OpenMP threads 4.
% device 0: GeForce 940M, 1176.0 MHz clock, 2004.5 MiB memory, capability 5.0
% Tue Apr 21 14:41:45 2020
% Usage: ./testing_dgesv_batched [options] [-h|--help]
% BatchCount N NRHS CPU Gflop/s (sec) GPU Gflop/s (sec) ||B - AX|| / N*||A||*||X||
%============================================================================================
1 1025 1025 --- ( --- ) 13.24 ( 0.22) 1.26e-07 failed
Running with cuda-memcheck reveals an "invalid configuration argument" error on a kernel launch inside magma_dgetrs_batched, which probably indicates that the last CUDA error is not checked after the launch.
$ cuda-memcheck ./testing_dgesv_batched -N 1025 --nrhs 1025 --batch 1
========= CUDA-MEMCHECK
% MAGMA 2.5.3 svn compiled for CUDA capability >= 5.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 10000, driver 10010. OpenMP threads 4.
% device 0: GeForce 940M, 1176.0 MHz clock, 2004.5 MiB memory, capability 5.0
% Tue Apr 21 14:41:53 2020
% Usage: ./testing_dgesv_batched [options] [-h|--help]
% BatchCount N NRHS CPU Gflop/s (sec) GPU Gflop/s (sec) ||B - AX|| / N*||A||*||X||
%============================================================================================
========= Program hit cudaErrorInvalidConfiguration (error 9) due to "invalid configuration argument" on CUDA API call to cudaLaunchKernel.
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so.1 [0x390513]
=========     Host Frame:/usr/local/cuda-10.0/lib64/libcudart.so.10.0 (cudaLaunchKernel + 0x265) [0x4e405]
=========     Host Frame:/media/vishwak/Official-1/magma-src/lib/libmagma.so (_Z59__device_stub__Z31dlaswp_rowserial_kernel_batchediPPdiiiPPiiPPdiiiPPi + 0x14a) [0x5ec70a]
=========     Host Frame:/media/vishwak/Official-1/magma-src/lib/libmagma.so (magma_dlaswp_rowserial_batched + 0xad) [0x5ec7dd]
=========     Host Frame:/media/vishwak/Official-1/magma-src/lib/libmagma.so (magma_dgetrs_batched + 0x3ca) [0x48befa]
=========     Host Frame:./testing_dgesv_batched [0x2bbd]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xeb) [0x26b6b]
=========     Host Frame:./testing_dgesv_batched [0x34fa]
=========
1 1025 1025 --- ( --- ) 0.77 ( 3.73) 1.26e-07 failed
========= ERROR SUMMARY: 1 error
OS: Linux Ubuntu 19.04, BLAS used is OpenBLAS (OpenBLAS 0.2.20dev)
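For anyone trying to narrow this down: the backtrace above puts the rejected launch in dlaswp_rowserial_kernel_batched, called from magma_dlaswp_rowserial_batched. The failing size of 1025 sits just past the 1024 threads-per-block limit of compute capability 5.0, which hints at (but does not prove) a block dimension that grows with N or nrhs. The following is a minimal host-only C++ sketch of the kind of check that would catch such a configuration before the launch. It is not MAGMA's code: Dim3, LaunchLimits, launch_config_ok, block_per_column and grid_for_fixed_block are hypothetical names, the "one thread per column" pattern is an assumption for illustration, and the limits are the documented values for compute capability 5.0.

```cpp
#include <cassert>

// Stand-in for CUDA's dim3 so this sketch builds without the CUDA toolkit.
struct Dim3 {
    long long x, y, z;
};

// Assumed defaults: documented launch limits for compute capability 5.0,
// the capability reported for the GeForce 940M in the log above.
struct LaunchLimits {
    long long max_threads_per_block = 1024;
    long long max_block_x = 1024, max_block_y = 1024, max_block_z = 64;
    long long max_grid_x = 2147483647LL, max_grid_y = 65535, max_grid_z = 65535;
};

// Returns true when a launch with these dimensions stays within the limits.
// A configuration that fails this check is the kind the runtime rejects
// with "invalid configuration argument".
bool launch_config_ok(const Dim3& grid, const Dim3& block,
                      const LaunchLimits& lim = LaunchLimits()) {
    // Every dimension must be at least 1.
    if (grid.x < 1 || grid.y < 1 || grid.z < 1) return false;
    if (block.x < 1 || block.y < 1 || block.z < 1) return false;
    // Per-dimension limits on the block and the grid.
    if (block.x > lim.max_block_x || block.y > lim.max_block_y ||
        block.z > lim.max_block_z) return false;
    if (grid.x > lim.max_grid_x || grid.y > lim.max_grid_y ||
        grid.z > lim.max_grid_z) return false;
    // Limit on the total number of threads in one block.
    return block.x * block.y * block.z <= lim.max_threads_per_block;
}

// Integer ceiling division for positive divisors.
long long ceil_div(long long a, long long b) {
    assert(b > 0);
    return (a + b - 1) / b;
}

// Hypothetical pattern A: one thread per column in a single block.
// The block size grows with n, so it breaks once n exceeds 1024.
Dim3 block_per_column(long long n) {
    return Dim3{n, 1, 1};
}

// Pattern B: fixed block size, with grid.x covering the n columns and
// grid.z indexing the batch. The block size no longer depends on n.
Dim3 grid_for_fixed_block(long long n, long long block_size, long long batch) {
    return Dim3{ceil_div(n, block_size), 1, batch};
}
```

With these helpers, launch_config_ok(Dim3{1, 1, 1}, block_per_column(1025)) is false, while a 256-thread block paired with grid_for_fixed_block(1025, 256, 1) passes. In real CUDA code the rejected launch itself is reported by calling cudaGetLastError() right after the kernel launch, which is the check that appears to be missing here.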
-
- changed status to resolved
Hi,
We are making a sweep over the lingering issues in MAGMA. This one should now be fixed as of 725793b.