- marked as enhancement
- removed comment
Support KNL's AVX512 instruction set
The main point of this pull request is to support KNL's AVX512 instruction set. It also contains a few other changes on which the AVX512 support depends.
Comments (14)
-
reporter -
- removed comment
This is hard to review. Could you provide a pull request that separates the result of clang-format (which doesn't need review) from the other changes (i.e. new code and the CCTK_VWarn -> CCTK_VWARN changes)?
-
- changed status to open
- removed comment
-
reporter - removed comment
I factored it into individual topical commits. If you look at the commits, you see thematically-related changes. Reformatting is restricted to a single commit.
-
- changed status to open
- removed comment
OK, thank you. Let me see if I can figure out how to look at the commits individually (and comment on them). I tried to have a look and failed to find out which branch to look at. :-)
-
- removed comment
Alright, this worked without problems; I'm not sure what I had tried to do before. The only downside is that the comments do not show up in the pull request diff but only in the individual commits, so here are links to them:
- https://bitbucket.org/cactuscode/cactusutils/commits/624ecd2afa085fad9794e42905f87233527c121e?at=master
- https://bitbucket.org/cactuscode/cactusutils/commits/555cfe712f869ba45da200e1bdde52e8cf39fb9a?at=master
Looks fine in general. Usual disclaimers apply: these comments are all very terse and just comments that came to my mind when reading the code, not intended to be very well worded or final requests.
Two more comments:
- does the final version of the commits still pass through clang-format unchanged?
- I don't know what was done in thorn Vectors (compared to vecmathlib) before: are approximated answers used, or was there a promise that (as much as possible) code that simply uses vectorization to do the same thing to multiple grid points returns the same result as doing the same thing to multiple grid points in a loop (ignoring fma and the like)?
-
reporter - removed comment
- Before committing, I will re-run clang-format to ensure the formatting is right.
- Answers are not approximated, except for functions such as sin and cos where the IEEE standard does not require accuracy of all bits; in these cases, I follow the OpenCL standard with respect to accuracy (how many bits can be wrong, usually at most 4 out of 53). inf and nan might also be handled differently. However, in particular basic arithmetic and square roots are correct in all digits. fma might or might not round between multiplying and adding. So, in the absence of trigonometric functions, the answer is "yes, result will be the same".
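The remark that fma "might or might not round between multiplying and adding" can be illustrated with a minimal sketch. The two helper names below are hypothetical and are not part of thorn Vectors or vecmathlib; the sketch uses only `std::fma` from the C++ standard library and assumes IEEE double precision with round-to-nearest.

```cpp
#include <cmath>

// Hypothetical helper (not from thorn Vectors): computes a*b + c with the
// product rounded to double before the addition (two roundings in total).
double mul_then_add(double a, double b, double c) {
  // volatile forces the product to be stored as a double and keeps the
  // compiler from contracting the expression into a single fma instruction.
  volatile double prod = a * b;
  return prod + c;
}

// Hypothetical helper: computes a*b + c with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 in double precision. The unfused version therefore returns 0, while the fused version returns -2^-54. For inputs whose product is exactly representable, both versions agree.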
-
- changed status to open
- removed comment
Ok. Thank you for the clarifications. Please apply.
-
reporter - changed status to resolved
- removed comment
Merged.
-
- changed status to open
- removed comment
This seems to have caused test failures on Jenkins. Specifically,
- CarpetProlongateTest.test_o11/2procs
- CarpetProlongateTest.test_o7/2procs
- CarpetProlongateTest.test_o9/2procs
all now fail. For o11, the error is
cactus_sim: /home/jenkins/workspace/EinsteinToolkit/arrangements/Carpet/CarpetLib/src/prolongate_3d_rf2.cc:258: T CarpetLib::interp1(const T*, size_t) [with T = double; int ORDER = 11; int di = 1; size_t = long unsigned int]: Assertion `i == (ptrdiff_t(coeffs::imax) - ptrdiff_t(coeffs::ncoeffs % VP::size()))' failed.
Backtrace from rank 0 pid 30341:
1. CarpetLib::signal_handler(int) [/home/jenkins/workspace/EinsteinToolkit/../simulations/EinsteinToolkit_eec4338ddb95f8ab8c59ed7cd91635b8c4ff0f23_2/SIMFACTORY/exe/cactus_sim(_ZN9CarpetLib14signal_handlerEi+0xda) [0x23f031a]]
2. /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7fd54f6cf4b0]
3. /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7fd54f6cf428]
4. /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7fd54f6d102a]
5. /lib/x86_64-linux-gnu/libc.so.6(+0x2dbd7) [0x7fd54f6c7bd7]
6. /lib/x86_64-linux-gnu/libc.so.6(+0x2dc82) [0x7fd54f6c7c82]
7. /home/jenkins/workspace/EinsteinToolkit/../simulations/EinsteinToolkit_eec4338ddb95f8ab8c59ed7cd91635b8c4ff0f23_2/SIMFACTORY/exe/cactus_sim() [0xc88a70]
8. void CarpetLib::prolongate_3d_rf2<double, 11>(double const*, vect<int, 3> const&, vect<int, 3> const&, double*, vect<int, 3> const&, vect<int, 3> const&, bbox<int, 3> const&, bbox<int, 3> const&, bbox<int, 3> const&, bbox<int, 3> const&, void*) [/home/jenkins/workspace/EinsteinToolkit/../simulations/EinsteinToolkit_eec4338ddb95f8ab8c59ed7cd91635b8c4ff0f23_2/SIMFACTORY/exe/cactus_sim(_ZN9CarpetLib17prolongate_3d_rf2IdLi11EEEvPKT_RK4vectIiLi3EES7_PS1_S7_S7_RK4bboxIiLi3EESC_SC_SC_Pv+0x1527) [0x2463de7]]
9. /home/jenkins/workspace/EinsteinToolkit/../simulations/EinsteinToolkit_eec4338ddb95f8ab8c59ed7cd91635b8c4ff0f23_2/SIMFACTORY/exe/cactus_sim() [0x2465eb6]
a. /usr/lib/x86_64-linux-gnu/libgomp.so.1(+0xf43e) [0x7fd54fc8943e]
b. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fd5522446ba]
c. /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fd54f7a13dd]
The hexadecimal addresses in this backtrace can also be interpreted with a debugger (e.g. gdb), or with the 'addr2line' (or 'gaddr2line') command line tool: 'addr2line -e cactus_sim <address>'.
The other orders are similar.
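The failing assertion compares a loop index with the coefficient count minus its remainder modulo the vector size. The following is a minimal sketch of the general pattern such a check guards (a main loop in vector-sized chunks followed by a scalar remainder). It is not the CarpetLib code: the function name `chunked_sum` and its parameters are hypothetical, and `width` merely stands in for something like `VP::size()`. It does not attempt to explain why the tests fail.

```cpp
#include <cstddef>

// Hypothetical illustration, not CarpetLib's interp1: sums n values by
// processing full chunks of `width` elements first, then the leftovers.
// *vec_end receives the index at which the chunked part stopped.
double chunked_sum(const double* p, std::ptrdiff_t n, std::ptrdiff_t width,
                   std::ptrdiff_t* vec_end) {
  double sum = 0.0;
  std::ptrdiff_t i = 0;
  // Main loop: only full chunks of `width` elements.
  for (; i + width <= n; i += width) {
    for (std::ptrdiff_t k = 0; k < width; ++k) {
      sum += p[i + k];
    }
  }
  // Invariant after the main loop: i == n - n % width.
  *vec_end = i;
  // Remainder loop: the last n % width elements, one at a time.
  for (; i < n; ++i) {
    sum += p[i];
  }
  return sum;
}
```

For example, with 12 values and a width of 8 the chunked part stops at index 8 and 4 elements remain, whereas with a width of 4 it stops at index 12 and nothing remains.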
-
- changed status to open
- assigned issue to
- removed comment
-
- removed comment
The fix now has a "please apply" in #2074.
-
- changed status to resolved
- removed comment
Applied as dd703e6a of Carpet.