run-tests not portable to many Cray systems
As currently written, the run-tests script executes both tests that use GASNet-EX and some that do not. The latter present a problem on many Cray systems.
The script runs the non-GASNet-EX tests directly.
This results in SIGILL for some of these tests on Cray systems using the vendor's ALPS batch system, including Theta at ALCF.
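This is because the job script (or interactive shell) executes on a service node, not a compute node, so a test binary built for the compute nodes can hit instructions that the service node's CPU does not support.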
This is not an issue at NERSC because, unlike ALPS, SLURM executes the job script (or interactive shell) on the first compute node of the allocation.
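For illustration only, here is a minimal sketch of what launching the non-GASNet-EX tests through the batch system's launcher could look like. It is not what run-tests does today. The function names `serial_launcher` and `run_serial_test` are hypothetical, and the sketch assumes that finding `aprun` on the PATH is a good enough signal of an ALPS system.

```shell
# Hypothetical helper: choose a launcher prefix for tests that do not use GASNet-EX.
# Assumption: if aprun is available, this is an ALPS system where the job script
# runs on a service node, so the test must be pushed to a compute node.
serial_launcher() {
  if command -v aprun >/dev/null 2>&1; then
    echo "aprun -n 1"
  else
    # SLURM-style systems already run the job script on a compute node.
    echo ""
  fi
}

# Hypothetical helper: run one test executable (plus its arguments) under the
# selected launcher. The unquoted substitution is intentional so that the
# prefix is split into separate words; an empty prefix means a direct run.
run_serial_test() {
  $(serial_launcher) "$@"
}
```

A call such as `run_serial_test ./some-test` would then run the test directly when no `aprun` is found and under `aprun -n 1` otherwise.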
Rather than trying to concoct some convoluted workaround on short notice, I propose simply commenting out the problematic tests. I will prepare a PR for this shortly.
Comments (3)
-
reporter -
reporter - changed status to resolved
Resolve issue #165 (run-tests problem on Crays)
This commit resolves issue #165 (run-tests not portable to many Cray systems) by the simple expedient of commenting out the three tests which are problematic (cannot safely be launched with the current infrastructure).
Testing on Theta at ALCF (where the original problem was observed) shows the problem to be resolved. The only required departure from the documented instructions was to set the GASNET environment variable to point to a pre-staged GASNet-EX tarball. This was necessary because batch and interactive jobs run without outside network access.
→ <<cset 7d1809279fee>>
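The GASNET setting described in the commit message can be sketched as follows. The helper name `use_prestaged_gasnet` and the tarball path in the usage note are hypothetical. The only detail taken from the commit message is that GASNET should point to a GASNet-EX tarball staged on the system in advance.

```shell
# Hypothetical helper: point GASNET at a tarball that was copied to the system
# before the job started, since jobs have no outside network access.
use_prestaged_gasnet() {
  tarball="$1"
  # Fail early if the tarball is missing or unreadable from this node.
  if [ ! -r "$tarball" ]; then
    echo "error: cannot read GASNet-EX tarball: $tarball" >&2
    return 1
  fi
  GASNET="$tarball"
  export GASNET
}
```

Inside a batch or interactive job this might be used as `use_prestaged_gasnet "$HOME/staging/gasnet-ex.tar.gz"` (a placeholder path) before following the documented run-tests instructions.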
See PR #47 for proposed solution