run-tests not portable to many Cray systems

Issue #165 resolved
Paul Hargrove created an issue

As currently written, the run-tests script executes some tests that use GASNet-EX and some that do not. The latter present a problem on many Cray systems.

The script runs the non-GASNet-EX tests directly, without a job launcher.
This results in SIGILL for some of these tests on Cray systems using the vendor's ALPS batch system, including Theta at ALCF.
This is because the job script (or interactive shell) executes on a service node, not a compute node, so a test built for the compute nodes can contain instructions that the service node's CPU does not support.

This is not an issue at NERSC because, unlike ALPS, SLURM executes the job script (or interactive shell) on the first compute node of the allocation.
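
As a rough illustration of how the failure described above shows up in a shell script: a process killed by a signal is reported with exit status 128 plus the signal number, so SIGILL typically appears as 132. The sketch below is not taken from run-tests; `classify_status` is a hypothetical helper, and the signal number assumes Linux.

```shell
# Hypothetical helper, not part of run-tests.
# Assumption: Linux signal numbering, where SIGILL is signal 4.
SIGILL_STATUS=$((128 + 4))

# Map a command's exit status to a short label.
classify_status() {
    case "$1" in
        0) echo "pass" ;;
        "$SIGILL_STATUS") echo "sigill" ;;
        *) echo "fail" ;;
    esac
}

# Possible usage (the test name is a placeholder):
#   ./some-test; classify_status $?
```

A status of "sigill" for a test launched directly from the job script would be consistent with the test having run on a service node rather than a compute node.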

Rather than trying to concoct some convoluted workaround on short notice, I propose to simply comment out the problematic tests. I will be preparing a PR for this shortly.

Comments (3)

  1. Paul Hargrove reporter

    Resolve issue #165 (run-tests problem on Crays)

    This commit resolves issue #165 (run-tests not portable to many Cray systems) by the simple expedient of commenting out the three problematic tests (those that cannot safely be launched with the current infrastructure).

    Testing on Theta at ALCF (where the original problem was observed) shows the problem to be resolved. The only required departure from the documented instructions was to set the GASNET environment variable to point to a pre-staged GASNet-EX tarball. This was necessary because batch and interactive jobs run without outside network access.

    → <<cset 7d1809279fee>>


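
The commit message in the comment thread mentions setting the GASNET environment variable to point to a pre-staged GASNet-EX tarball on systems where jobs have no outside network access. A minimal sketch of that step might look like the following; the helper name `use_staged_gasnet` and the example path are assumptions for illustration, and how run-tests itself consumes GASNET is not shown in this issue.

```shell
# Hypothetical helper: export GASNET only if the pre-staged tarball is readable.
use_staged_gasnet() {
    tarball="$1"
    if [ -r "$tarball" ]; then
        GASNET="$tarball"
        export GASNET
    else
        echo "staged tarball not readable: $tarball" >&2
        return 1
    fi
}

# Possible usage before invoking run-tests (the path is a placeholder):
#   use_staged_gasnet "$HOME/staging/GASNet-EX.tar.gz"
```

Checking readability up front gives an immediate error on the node where the job script runs, rather than a later failure when a download is attempted without network access.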