Make tests amenable to automatic verification

Issue #6 resolved
Dan Bonachea created an issue

Currently many, if not most, of the tests require manual inspection of their output to validate success. The expected output usually depends on the number of ranks and is often non-deterministic, because output from multiple ranks is multiplexed and can interleave in any order. In extreme cases like rpc_barrier, the validation depends upon parallel stdout ordering guarantees that don't exist on distributed systems.

All of this makes it difficult to automate rigorous validation of test success.

Ideally, every test should perform its own rigorous internal validation and either:

  • On failure, assert/crash or print output from any rank that includes a uniform failure substring (like "ERROR: ..." or "FAIL: ...")
  • On success, perform a barrier or equivalent sync (to ensure all ranks have finished their work), and then print output from a single rank that includes a uniform success substring (like "SUCCESS"). The success substring should not depend on the number of ranks or other configuration-dependent properties. It should not be concurrent with output from other ranks, to reduce the possibility of interleaved output garbling the result and leading to false negatives.
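
The two rules above can be sketched in a few lines. This is a minimal illustration only, not the project's test harness: it simulates ranks with Python threads, and the names run_ranks, validate and failed are hypothetical. The shared failed list stands in for whatever cross-rank reduction a real distributed test would use.

```python
import threading


def run_ranks(num_ranks, validate):
    """Simulate num_ranks ranks with threads and return the lines they print.

    validate(rank) is a hypothetical per-rank check returning True on success.
    """
    output = []
    output_lock = threading.Lock()
    # Assumption: a shared list replaces a real cross-rank reduction.
    failed = [False] * num_ranks
    barrier = threading.Barrier(num_ranks)

    def rank_main(rank):
        try:
            ok = validate(rank)
        except Exception:
            ok = False  # treat an exception in the check as a failure
        if not ok:
            failed[rank] = True
            with output_lock:
                # Any rank may report failure, using a uniform substring.
                output.append(f"ERROR: rank {rank} failed validation")
        # Sync so every rank has finished validating before success is reported.
        barrier.wait()
        if rank == 0 and not any(failed):
            # A single rank prints a success string that does not depend
            # on the number of ranks.
            with output_lock:
                output.append("SUCCESS")

    threads = [
        threading.Thread(target=rank_main, args=(r,)) for r in range(num_ranks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output
```

Because the success line is emitted only after the barrier, it cannot be interleaved with per-rank output in this sketch.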

No test should require visual inspection of the output to determine correctness, because in practice that makes it mostly useless for automated regression testing. It's fine to include additional output that's useful for human consumption, but the ultimate outcome of the test should be presented in an unambiguous and machine-parsable manner.
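
On the consuming side, a CI script could then decide the outcome with plain substring checks. The following is a hedged sketch, not the actual CI code: classify and the marker constants are hypothetical names, and the markers are the example strings suggested above.

```python
# Example markers taken from the suggestions in this issue.
FAILURE_MARKERS = ("ERROR:", "FAIL:")
SUCCESS_MARKER = "SUCCESS"


def classify(output_text, exit_code=0):
    """Return "pass" or "fail" for one test run's captured output."""
    if exit_code != 0:
        return "fail"  # an assert or crash counts as failure
    lines = output_text.splitlines()
    # A failure substring from any rank overrides everything else.
    if any(marker in line for line in lines for marker in FAILURE_MARKERS):
        return "fail"
    # Require an explicit success marker; silence is not success.
    if any(SUCCESS_MARKER in line for line in lines):
        return "pass"
    return "fail"
```

Extra human-readable lines in the output do not affect the result; only the markers and the exit code do.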

Comments (4)

  1. Dan Bonachea reporter

    Thanks Steven. I updated the automated CI to check for the new success strings; updated results should show up here and here starting this morning.

    I disabled the color codes in the output because they were generating garbage characters and false negatives for the CI, which always redirects test output to files and handles it as plain text. If you think it's important to produce output with color codes, we should discuss how to check $TERM and ensure stdout is a tty or color-capable pager before using them.
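
One possible shape for such a check, as a minimal sketch under stated assumptions: should_use_color and colorize are hypothetical helpers, a TERM value that is unset or "dumb" is assumed to mean no color support, and detection of a color-capable pager is not attempted.

```python
import os
import sys


def should_use_color(stream=None, environ=None):
    """Return True only when the stream is a tty and TERM looks color-capable."""
    stream = sys.stdout if stream is None else stream
    environ = os.environ if environ is None else environ
    # Assumption: an unset or "dumb" TERM means no color support.
    if environ.get("TERM", "") in ("", "dumb"):
        return False
    # Output redirected to a file or pipe reports isatty() == False.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def colorize(text, ansi_code, stream=None, environ=None):
    """Wrap text in an ANSI color sequence only when color is appropriate."""
    if should_use_color(stream, environ):
        return f"\033[{ansi_code}m{text}\033[0m"
    return text
```

With a guard like this, output redirected to a file stays plain text, so the success and failure substrings remain intact for the CI.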
