Using Blaze's SIMD implementation directly?

Issue #418 new
Nils Deppe created an issue

Hi Klaus!

We have some code that isn’t easily applied to an entire DenseVector, but that we would like to run using SIMD operations (it’s several nested root finds that need to be done pointwise in a mesh). With Sleef it seems like Blaze supports pretty much all the math functions we would need, so it would be nice to avoid having to rely on yet another 3rd party library like XSIMD, NSIMD, etc. Before using what’s in math/simd directly I had a few questions:

  1. would you consider it reasonable to have the SIMD code in math/simd be user facing? I realize it’s still subject to refactors, etc. but basically do you consider it an implementation detail that users shouldn’t use or can we use it?
  2. some of our operations are things like max(a, 0.) which as far as I can tell (not being anywhere close to an expert on SIMD) we would do with something like max(a, SIMDdouble{0.}) if SIMDdouble had a constructor SIMDdouble(ValueType v) : value( _mm256_set1_pd(v) ) {}. Would it be reasonable to add support for this constructor? Do you have a suggestion on how to better do operations like max(a, SIMDdouble{0.})?
  3. we’ll need to have a clamp function. Is that something that could be added to the math/simd code?

This is all really just exploring what our options are in terms of having some SIMD wrappers available to us. Like I said, ideally we’d just use Blaze+Sleef rather than having to add yet another 3rd party library 🙂

Thanks in advance!

Best wishes,

Nils

Comments (4)

  1. Klaus Iglberger

    Hi Nils!

    First, allow me to apologize for the late reply. Then allow me to address your questions:

    1. I don’t plan any major refactoring to the SIMD code in the foreseeable future. I also consider it to be pretty stable since a lot of Blaze code depends on that. Therefore I would argue that it is reasonable to use it directly. It unfortunately just lacks the necessary documentation.
    2. I always considered calls like SIMDdouble{1.} as semantically ambiguous. Should the entire SIMD vector be initialized to 1 or just the first element (as would happen if it were an array)? For that reason there are a couple of “factory functions” like set(): max(a, set(0)); that perform the necessary SIMD operations (see for instance <math/simd/Set.h> or <math/simd/Setall.h>).
    3. I didn’t check: Is there some intrinsic function that performs a clamp()? If yes, that would be very reasonable; if no, we would have to try to implement this on our own and prove by benchmarks that it performs well.
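
To make the ambiguity in point 2 concrete, here is a minimal sketch in portable C++. The names Pack4, broadcast, first_lane_only, and pack_max are hypothetical and used for illustration only; they are not Blaze's SIMD types or functions. With Blaze itself, the broadcast role would be played by the set() factory mentioned above.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 4-lane SIMD register of doubles.
// This is NOT Blaze's SIMDdouble; it only mimics lane-wise semantics.
struct Pack4 {
    std::array<double, 4> lane{};  // all lanes start at 0.0
};

// Reading 1 of "Pack{v}": every lane receives the same value (a broadcast).
inline Pack4 broadcast(double v) {
    Pack4 p;
    p.lane.fill(v);
    return p;
}

// Reading 2 of "Pack{v}": array-like initialization, where only the first
// lane is set and the remaining lanes stay zero.
inline Pack4 first_lane_only(double v) {
    Pack4 p;
    p.lane[0] = v;
    return p;
}

// Lane-wise maximum of two packs.
inline Pack4 pack_max(const Pack4& a, const Pack4& b) {
    Pack4 r;
    for (std::size_t i = 0; i < r.lane.size(); ++i) {
        r.lane[i] = std::max(a.lane[i], b.lane[i]);
    }
    return r;
}
```

With a = {-1, 2, -3, 4}, the call pack_max(a, broadcast(0.0)) yields {0, 2, 0, 4}. Naming the factory explicitly makes the intended reading visible at the call site, which is the point of preferring a factory function over a single-value constructor.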

    Best regards,

    Klaus!

  2. Nils Deppe reporter

    Hi Klaus!

    Absolutely nothing to apologize for! Thank you for the detailed reply.

    1. Okay, great! I think between the header file names, your helpful responses to issues, Intel’s intrinsics guide, and Sleef documentation we shouldn’t have any troubles using what’s in math/simd.
    2. Ah, okay that makes sense. I agree that the behavior is a bit ambiguous. We can use the set and setall you pointed out. That’ll work perfectly, thank you!
    3. There doesn’t seem to be an existing low-level clamp, but the STL experimental SIMD support uses this implementation: https://github.com/VcDevel/std-simd/blob/c69cb8f6a8627c186427b08662b05693176c73b2/experimental/bits/simd.h#L3568 which I think is just a min(hi, max(lo, data)). Any thoughts on this?
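
The min(hi, max(lo, data)) formulation can be sketched lane by lane in portable C++. The names Lanes, lanewise_min, lanewise_max, and clamp_lanes are hypothetical and used only for illustration; this is neither the std-simd code linked above nor anything in Blaze, and it assumes lo <= hi in every lane.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical stand-in for a SIMD register: N doubles handled lane by lane.
template <std::size_t N>
using Lanes = std::array<double, N>;

// Lane-wise maximum.
template <std::size_t N>
Lanes<N> lanewise_max(const Lanes<N>& a, const Lanes<N>& b) {
    Lanes<N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = std::max(a[i], b[i]);
    return r;
}

// Lane-wise minimum.
template <std::size_t N>
Lanes<N> lanewise_min(const Lanes<N>& a, const Lanes<N>& b) {
    Lanes<N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = std::min(a[i], b[i]);
    return r;
}

// Clamp built from the two primitives: min(hi, max(lo, x)).
// Assumption: lo[i] <= hi[i] for every lane i.
template <std::size_t N>
Lanes<N> clamp_lanes(const Lanes<N>& x, const Lanes<N>& lo, const Lanes<N>& hi) {
    return lanewise_min(hi, lanewise_max(lo, x));
}
```

For x = {-2, 0.5, 3, 1} with lo = 0 and hi = 1 in every lane, clamp_lanes returns {0, 0.5, 1, 1}. A real SIMD version would replace the two loops with the corresponding min and max operations on registers, so the clamp costs two instructions per vector.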

    Thank you so much and all the best!

    Best wishes,

    Nils

  3. Klaus Iglberger

    Hi Nils!

    I agree that the use of min() and max() is (probably) the best implementation of a SIMD clamp(). I’ll use this issue for adding this feature. I hope to have some time to add that after CppCon.

    Best regards,

    Klaus!

  4. Nils Deppe reporter

    Hi Klaus!

    Okay, that sounds great, thank you! I’m really looking forward to your CppCon talks. I won’t be there in person, but I’ll watch the recorded ones. I actually learned about Blaze from your 2016 CppCon talk. Good luck with all the preparations!

    Best wishes,

    Nils
