Using Blaze's SIMD implementation directly?
Hi Klaus!
We have some code that isn’t easily applied to an entire DenseVector but that we would like to perform using SIMD operations (it’s several nested root finds that need to be done pointwise in a mesh). With Sleef it seems like Blaze supports pretty much all the math functions we would need, so it would be nice to avoid relying on yet another 3rd party library like XSIMD, NSIMD, etc. Before using what’s in math/simd directly I had a few questions:
- Would you consider it reasonable to have the SIMD code in math/simd be user facing? I realize it’s still subject to refactors, etc., but basically do you consider it an implementation detail that users shouldn’t use, or can we use it?
- Some of our operations are things like max(a, 0.), which as far as I can tell (not being anywhere close to an expert on SIMD) we would do with something like max(a, SIMDdouble{0.}) if SIMDdouble had a constructor SIMDdouble(ValueType v) : value( _mm256_set1_pd(v) ) {}. Would it be reasonable to add support for this constructor? Do you have a suggestion on how to better do operations like max(a, SIMDdouble{0.})?
- We’ll need to have a clamp function. Is that something that could be added to the math/simd code?
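To make the second question concrete, here is a minimal, portable sketch of the idea. It assumes a hypothetical 4-lane type Pack4d that stands in for SIMDdouble; it is not Blaze's class, and the names Pack4d and positive_part are made up for illustration. An AVX version would wrap a __m256d and use _mm256_set1_pd in the constructor instead of a loop.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

namespace sketch {

// Hypothetical 4-lane pack of doubles (NOT Blaze's SIMDdouble).
struct Pack4d {
    std::array<double, 4> lane{};

    Pack4d() = default;

    // Explicit broadcasting constructor: every lane receives v.
    explicit Pack4d(double v) { lane.fill(v); }
};

// Lane-wise maximum of two packs.
inline Pack4d max(const Pack4d& a, const Pack4d& b) {
    Pack4d r;
    for (std::size_t i = 0; i < r.lane.size(); ++i)
        r.lane[i] = std::max(a.lane[i], b.lane[i]);
    return r;
}

// The max(a, 0.) operation from the question, written with the broadcast.
inline Pack4d positive_part(const Pack4d& a) {
    return max(a, Pack4d{0.0});
}

}  // namespace sketch
```

Marking the constructor explicit keeps a scalar from silently turning into a pack, at the cost of writing the type name at each call site.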
This is all really just exploring what our options are in terms of having some SIMD wrappers available to us. Like I said, ideally we’d just use Blaze+Sleef rather than having to add yet another 3rd party library.
Thanks in advance!
Best wishes,
Nils
Comments (4)
-
reporter Hi Klaus!
Absolutely nothing to apologize for! Thank you for the detailed reply.
- Okay, great! I think between the header file names, your helpful responses to issues, Intel’s intrinsics guide, and Sleef documentation we shouldn’t have any troubles using what’s in math/simd.
- Ah, okay, that makes sense. I agree that the behavior is a bit ambiguous. We can use the set and setall you pointed out. That’ll work perfectly, thank you!
- There doesn’t seem to be an existing low-level clamp, but the STL experimental SIMD support uses this implementation: https://github.com/VcDevel/std-simd/blob/c69cb8f6a8627c186427b08662b05693176c73b2/experimental/bits/simd.h#L3568 which I think is just a min(hi, max(lo, data)). Any thoughts on this?
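As a rough illustration of the min(hi, max(lo, data)) formulation, here is a portable sketch. The Pack4d type and the free functions are hypothetical stand-ins for a SIMD pack and its lane-wise min()/max(); they are not part of Blaze or std-simd. The sketch assumes lo <= hi in every lane.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

namespace sketch {

// Hypothetical 4-lane pack of doubles, standing in for a SIMD register.
struct Pack4d {
    std::array<double, 4> lane;
};

// Lane-wise minimum.
inline Pack4d min(const Pack4d& a, const Pack4d& b) {
    Pack4d r{};
    for (std::size_t i = 0; i < r.lane.size(); ++i)
        r.lane[i] = std::min(a.lane[i], b.lane[i]);
    return r;
}

// Lane-wise maximum.
inline Pack4d max(const Pack4d& a, const Pack4d& b) {
    Pack4d r{};
    for (std::size_t i = 0; i < r.lane.size(); ++i)
        r.lane[i] = std::max(a.lane[i], b.lane[i]);
    return r;
}

// clamp built only from min and max; assumes lo <= hi in every lane.
inline Pack4d clamp(const Pack4d& data, const Pack4d& lo, const Pack4d& hi) {
    return min(hi, max(lo, data));
}

}  // namespace sketch
```

Because the clamp uses only min and max, it needs no branching per lane, which is why this form maps naturally onto SIMD instructions.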
Thank you so much and all the best!
Best wishes,
Nils
-
Hi Nils!
I agree that the use of min() and max() is (probably) the best implementation of a SIMD clamp(). I’ll use this issue for adding this feature. I hope that I’ll have some time to add that after CppCon.
Best regards,
Klaus!
-
reporter Hi Klaus!
Okay, that sounds great, thank you! I’m really looking forward to your CppCon talks. I won’t be there in person, but I’ll watch the recorded ones. I actually learned about Blaze from your 2016 CppCon talk. Good luck with all the preparations!
Best wishes,
Nils
Hi Nils!
First, allow me to apologize for the late reply. Then allow me to address your questions:
- SIMDdouble{1.} as semantically ambiguous. Should the entire SIMD vector be initialized to 1 or just the first element (as would happen if it were an array)? For that reason there are a couple of “factory functions” like set(): max(a, set(0)); that perform the necessary SIMD operations (see for instance <math/simd/Set.h> or <math/simd/Setall.h>).
- clamp()? If yes, that would be very reasonable; if no, we would have to try to implement this on our own and prove by benchmarks that it performs well.
Best regards,
Klaus!
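The ambiguity described above can be seen with a plain array. This is only an illustrative sketch: Pack4d here is an alias for std::array, and first_only and broadcast are hypothetical names, not Blaze functions. It shows why a named factory function is clearer than brace initialization.

```cpp
#include <array>

namespace sketch {

// A plain array standing in for a 4-lane SIMD vector.
using Pack4d = std::array<double, 4>;

// Array-style initialization: only the first element gets v,
// the remaining elements are value-initialized to zero.
inline Pack4d first_only(double v) {
    return Pack4d{{v}};
}

// Factory-style initialization: the name states that all lanes get v.
inline Pack4d broadcast(double v) {
    Pack4d p{};
    p.fill(v);
    return p;
}

}  // namespace sketch
```

With an array, `{1.}` means "first element only", so a reader could reasonably expect either behavior from a SIMD type; a factory function removes the guesswork.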