With the Intel 17 compiler:
At least when called from phist, ghost always prints messages that a slow fallback kernel is used, even though I compiled the block size into ghost (tested with spmv and tsmttsm). It always seems to think the data is unaligned, which should not be the case here.
From the phist build dir I run:
./Dbench_mvecT_times_mvec 1000000 8 8 100
./Dbench_sparseMat_times_mvec BENCH3D-128-A0 8
With gcc 7.2 it works fine.
I also noticed that the output line explaining how to re-run cmake for ghost with the missing kernels says GHOST_SPMMV= instead of GHOST_AUTOGEN...
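
Regarding the alignment question: to rule out that the vector data really is unaligned under the Intel build, a standalone check along these lines could be run on the raw data pointer right before the kernel call. This is only a sketch and not ghost code. The helper name is_aligned is hypothetical, and the 64-byte boundary in the usage note is an assumption (a typical cache line / AVX-512 width); the alignment ghost actually requires may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ghost or phist:
 * returns 1 if ptr lies on a multiple of `alignment` bytes, 0 otherwise.
 * `alignment` must be non-zero. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

For example, is_aligned(val, 64) on the pointer to the first vector entry (val is a placeholder name) would show whether the data itself is misaligned or whether the kernel selection misjudges it.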