- changed status to open
- removed comment
Compiling PittNullCode is slow
Compiling PittNullCode is very slow on some systems (e.g. with gcc). I believe this is because files such as NullConstr_R00.F90 contain many whole-array operations that the compiler has to analyse.
I suggest rewriting these routines, e.g. with forall or do loops. If we want to keep the elegant, index-free notation, I suggest adding an elemental subroutine for the actual calculations and calling it with whole arrays.
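A minimal sketch of the elemental-subroutine variant, to make the suggestion concrete. The module name `constr_sketch`, the routine `constr_point`, and the formula are hypothetical placeholders, not code from NullConstr_R00.F90; only the structure (scalar pointwise kernel, called with whole arrays) is the point.

```fortran
! Hypothetical sketch: names and formula are illustrative only,
! not taken from NullConstr_R00.F90.
module constr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Pointwise calculation on scalars. Because the routine is elemental,
  ! it can be called with scalars or with conforming arrays of any rank.
  elemental subroutine constr_point(a, b, res)
    real(dp), intent(in)  :: a, b
    real(dp), intent(out) :: res
    real(dp) :: s2, s3   ! scalar temporaries instead of array temporaries
    s2 = a * b
    s3 = a + b
    res = s2 * s3 - a
  end subroutine constr_point
end module constr_sketch
```

A caller keeps the index-free notation: with rank-2 arrays `a`, `b` and `r` of the same shape, `call constr_point(a, b, r)` applies the kernel to every element. Whether this actually compiles faster than the whole-array expressions would have to be measured per compiler.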
Keyword: NullConstr
Comments (7)
-
-
reporter - removed comment
I think this is a good approach. Could we also get a "good to go" from one of the thorn's maintainers?
-
- removed comment
Bela Szilagyi had a look at the proposed patch and pointed out that the temporaries s2-s10 and the e*'s no longer need to be arrays. Otherwise the patch was fine. The attached patch array2.patch replaces the old array.patch and implements this. The results are identical to those of the array version (one has to actually add them to the regression test by hand).
Applied as rev 11 of NullConstr.
-
- changed status to resolved
- removed comment
-
- changed status to open
- removed comment
NullConstr_R00 still takes a very long time to compile with ifort (IFORT) 14.0.0 20130728. It takes 20 minutes on the Datura head node.
-
- changed status to resolved
- removed comment
Datura no longer exists. Compiling the file with gcc 7.2 on my workstation takes about 1 minute (57s to compile and link the whole thing using 1 CPU). With Intel 16.0.3 on BW, which is usually among the slowest machines to build on, it takes 1m8.411s (just the one file).
The file that currently takes longest to compile is in QuasiLocalMeasures (also F90 code).
-
- changed status to closed
- edited description
The attached patch replaces the array operations with two nested do loops. The changes were generated via search and replace. The patch passes the test in SphericalHarmonicRecon and reduces compilation time with Intel 12 from many minutes to under 1 minute.
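For illustration, the kind of transformation described here might look like the sketch below. The module `loop_sketch`, the routine `constr_loops`, and the formula are hypothetical and not taken from the patch; the temporaries are written as scalars, as suggested in the comments on this ticket.

```fortran
! Hypothetical sketch: names and formula are illustrative only,
! not taken from the attached patch.
module loop_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  subroutine constr_loops(a, b, res)
    real(dp), intent(in)  :: a(:,:), b(:,:)
    real(dp), intent(out) :: res(:,:)
    real(dp) :: s2, s3   ! scalar temporaries, reused at every point
    integer :: i, j
    ! Whole-array form that the loops replace:
    !   s2 = a * b;  s3 = a + b;  res = s2 * s3 - a
    ! The first index runs in the inner loop, matching Fortran's
    ! column-major storage order.
    do j = 1, size(a, 2)
      do i = 1, size(a, 1)
        s2 = a(i,j) * b(i,j)
        s3 = a(i,j) + b(i,j)
        res(i,j) = s2 * s3 - a(i,j)
      end do
    end do
  end subroutine constr_loops
end module loop_sketch
```

Since the loop body is the original expression with `(i,j)` appended to each array name, a rewrite of this shape lends itself to mechanical search and replace.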