Provide SIMD support for the ARM architecture
Description
In recent years, the ARM architecture has become one of the most prevalent processor architectures. However, despite its widespread use, the Blaze library currently only provides SIMD support for current and upcoming x86 architectures. Blaze should also provide SIMD vectorization for the ARM architecture.
Tasks
- extend the existing SIMD module with support for the ARM architecture
- verify that the expected performance is achieved
- add new and extend existing test cases as necessary
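To make the first task more concrete, here is a minimal sketch of what a NEON-backed operation with a scalar fallback could look like. It is not taken from Blaze: `add_floats` is a hypothetical helper, and the sketch assumes a compiler that defines the ACLE macro `__ARM_NEON` and ships `<arm_neon.h>` when NEON is available. On other targets only the scalar loop is compiled.

```cpp
#include <cstddef>

#if defined(__ARM_NEON)
#  include <arm_neon.h>  // NEON intrinsics (float32x4_t, vld1q_f32, ...)
#endif

// Hypothetical helper (not part of Blaze): c[i] = a[i] + b[i] for n floats.
inline void add_floats( const float* a, const float* b, float* c, std::size_t n )
{
   std::size_t i = 0;

#if defined(__ARM_NEON)
   // Process four floats per iteration using 128-bit NEON registers.
   for( ; i + 4 <= n; i += 4 ) {
      const float32x4_t va = vld1q_f32( a + i );
      const float32x4_t vb = vld1q_f32( b + i );
      vst1q_f32( c + i, vaddq_f32( va, vb ) );
   }
#endif

   // Scalar loop: handles the remainder on ARM and all elements elsewhere.
   for( ; i < n; ++i ) {
      c[i] = a[i] + b[i];
   }
}
```

The unaligned load/store intrinsics are used here so that the sketch works for arbitrary pointers; an actual integration would have to follow whatever alignment and padding guarantees the existing SIMD module provides.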
Comments (14)
-
Are there any updates on this? I saw your talk at CppCon 2016 where you said ARM support was in the works and it would be awesome to have!
-
reporter Hi Emil!
Unfortunately this feature is still not available. However, since adding a different kind of vectorization is pretty straightforward, this could be your chance to create a pull request. Pull request #11 gives you an impression of what needs to be done to provide SIMD support for ARM architectures. If you have questions, you are always welcome to ask.
Best regards,
Klaus!
-
Hi Klaus!
Thanks for the hint! I took a look at the SIMD implementation and it was quite straightforward, so it should be no real problem to add the intrinsics. It can be a fun evening project!
One question, since I do not have an overview of the testing framework: is there a way to test the SIMD implementation of one operation at a time (Abs, ...) to see that each is working?
Thanks! BR Emil
-
reporter Hi Emil!
Take a look at the blazetest directory. It contains the entire test suite of the Blaze library. In order to test the SIMD functionality, you can go to blazetest/src/mathtest/simd, which contains the SIMD tests for all relevant data types (int, float, double, ...).
The test suite is primarily written for Linux and MacOS. The first step is to fill out the Configfile in the blazetest directory. Then run the configure script:
./configure
It will create the necessary files and will enable you to use make in the blazetest/src/mathtest/simd directory. If you happen to use Windows, I can send you a corresponding CMakeLists.txt file.
Please use BLAZE_NEON_MODE as the compilation switch for the NEON vectorization. Also, please try to adhere to the existing formatting rules. Then I don't see any problems for a pull request.
One more request: Please wait till the next push, as this will touch a lot of the functionality that you will have to modify. Merging this later might prove to be difficult. The push should happen today or tomorrow at the latest.
Best regards,
Klaus!
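As an illustration of the two points above (a compilation switch and checking one operation at a time), here is a minimal standalone sketch. Only the name BLAZE_NEON_MODE comes from this thread. How the switch is derived here (from the ACLE macro `__ARM_NEON`) is an assumption, and `abs_floats` and `abs_matches_reference` are hypothetical helpers that are not part of Blaze or its test suite.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed derivation of the switch: 1 if the compiler reports NEON, else 0.
#if defined(__ARM_NEON)
#  define BLAZE_NEON_MODE 1
#  include <arm_neon.h>
#else
#  define BLAZE_NEON_MODE 0
#endif

// Hypothetical helper: out[i] = |in[i]| for n floats.
inline void abs_floats( const float* in, float* out, std::size_t n )
{
   std::size_t i = 0;

#if BLAZE_NEON_MODE
   // Four floats per iteration; vabsq_f32 computes the lane-wise absolute value.
   for( ; i + 4 <= n; i += 4 ) {
      vst1q_f32( out + i, vabsq_f32( vld1q_f32( in + i ) ) );
   }
#endif

   // Scalar loop: remainder on ARM, all elements on other targets.
   for( ; i < n; ++i ) {
      out[i] = std::fabs( in[i] );
   }
}

// Hypothetical per-operation check: compare against a scalar reference.
// Inputs are assumed to contain no NaN values (NaN never compares equal).
inline bool abs_matches_reference( const float* in, std::size_t n )
{
   std::vector<float> out( n );
   abs_floats( in, out.data(), n );
   for( std::size_t i = 0; i < n; ++i ) {
      if( out[i] != std::fabs( in[i] ) ) return false;
   }
   return true;
}
```

Choosing a length that is not a multiple of four in such a check exercises both the vector loop and the scalar remainder loop on a NEON target.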
-
Thanks for the instructions! I have the testing running, just waiting for all the commits until I start testing a little.
BR Emil
-
reporter Hi Emil!
The last two pushes have introduced the major changes of our 3.4 refactoring. Please feel free to fork now. Thanks for your patience!
Best regards,
Klaus!
-
Hi Klaus!
Sorry for the delay, I am currently finalizing my PhD thesis and will be away a bit. I have started the implementation but sadly have to pause until my thesis is done.
I will be following this issue meanwhile if there are any comments/questions. Also, I got myself an nVidia TX1 for testing on, is there any other hardware you'd like to test on? It has NEON, but perhaps we need to test on something more as well.
BR Emil
-
Any progress on this? I am looking for a mathematics library for the ARM architecture. I've seen that Blaze has great performance. Does this also apply to ARM? Can I use Blaze on ARM?
-
reporter Hi Amin!
Unfortunately there hasn't been any progress yet. But we would be willing to accept a contribution from a volunteer with ARM experience. Our expectation is that it would take approx. one day of work to introduce ARM support if you are familiar with the ARM intrinsics.
Best regards,
Klaus!
-
Hi Klaus!
ARM support isn’t something we need right now, but while looking at using Sleef+Blaze for SIMD I remembered this issue. It looks like Sleef has support for ARM (and POWER9 too), and I thought that might be the easiest way to ultimately add ARM support to Blaze, either by you or somebody else. Anyway, just me “thinking out loud”.
Best wishes,
Nils
-
reporter Hi Nils!
Indeed, Sleef would be perfectly suited for that job. The disadvantage would be that in order to use ARM vectorization a user would have to install Sleef. But I believe that the advantage of having ARM support available definitely outweighs this disadvantage. Unfortunately, my problem at the moment is that I cannot test the implementation due to the lack of a suitable ARM CPU. But that will change sooner rather than later, when I upgrade my MacBook to one of the new releases with an M1 processor.
Best regards,
Klaus!
-
Hi Klaus! Just wondering if you ever got that upgraded MacBook and if so, if adding Sleef+ARM might be in the cards at some point? I’m still just a beginner with ARM assembly, so I don’t think I yet would have the skillset to add this myself (but would certainly be happy to help test!)
I also wonder whether Apple’s Accelerate framework might ever be viable for Blaze? Also just thinking out loud… I know that on ARM Macs this framework makes use of Apple’s undocumented AMX coprocessor for about a 2x performance boost over NEON, but I don’t know enough about Blaze’s internals to know whether the Accelerate framework (the only supported way to make use of the Apple AMX coprocessor) has what you would need.
Issue #153 was marked as a duplicate of this issue.