warpSize likely incorrect for AMD GPU

Issue #50 new
Daily, Jeff created an issue

warpSize is a constexpr in HIP headers and is conditionally defined based on gfx arch. It is 64 for AMD Instinct™ MI Series Accelerators, and 32 for others. The source file zmdot_shfl.cu seems to use warpSize in some places, and a hard-coded 32 in others.

https://bitbucket.org/icl/magma/src/0c45d7181b40f4cc94ff82ef1416171ccf2cab84/sparse/blas/zmdot_shfl.cu#lines-322

This file and any others using warp primitives should be reviewed for correctness on AMD GPUs.

It is unclear whether a unit test exists for the above function.
