AVX2+FMA?
Issue #22
resolved
Hi,
I've heard that SHtns can use AVX2+FMA to double performance compared with AVX alone, but your overview doesn't mention this.
Is this currently supported, and if so, does it require an additional compilation option when installing SHtns?
Comments (2)
repo owner -
repo owner - changed status to resolved
Hello,
there is nothing special to do to enable that. By default, SHtns is compiled with -march=native, which lets the compiler use every instruction set your CPU supports, including AVX2 and FMA when available. If you want to compare with and without, try replacing -march=native with -mavx in the Makefile generated by ./configure, then rebuild.
In my tests, AVX2+FMA gave a 1.7x to 1.8x speedup compared to AVX.
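
Since -march=native only enables what the build machine supports, it can help to check first whether the CPU reports AVX2 and FMA at all. Below is a minimal sketch for x86 Linux, assuming the usual "flags" line in /proc/cpuinfo. The helper name simd_features is made up for this example and is not part of SHtns.

```python
from pathlib import Path

# Hypothetical helper (not part of SHtns): report which SIMD features
# appear in the "flags" line of x86 Linux /proc/cpuinfo text.
WANTED = ("avx", "avx2", "fma")


def simd_features(cpuinfo_text):
    """Return {feature: bool} based on the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            return {name: name in flags for name in WANTED}
    # No 'flags' line (non-x86 or non-Linux input): report nothing found.
    return {name: False for name in WANTED}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print(simd_features(path.read_text()))
    else:
        print("no /proc/cpuinfo on this system")
```

If avx2 and fma both come back True, a default build with -march=native should be able to use them. If only avx is True, the AVX-only comparison is effectively what you already get.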