Updating test suite to include controls on strictness of tests

#95 Merged at 6d298d2
Deleted repository
week-of-code (bae44f4876d8)
  1. chummels

I added a few test_runner.py flags for controlling the strictness of the testing constraints. --tolerance=int sets the order of the relative error allowed, --bitwise sets whether or not bitwise tests are run, and --strict=[low, medium, high] sets both the tolerance and bitwise flags to preset levels. --strict defaults to low, which means tolerance=3 and bitwise=false. In addition, I combined make_new_tests.py into test_runner.py, so there is one fewer step to run. Lastly, I updated the docs to reflect all of these modifications (and to flesh out how to troubleshoot failures).

The result is that nearly all of the tests now pass on my test machines (using the default low strictness), but developers can tighten the precision to arbitrary levels before pushing or while testing their own code. See the docs for more information on all of this.

UPDATE 1: Thanks to Mike Kuhlen and Nathan Goldbaum for homing in on some problematic test problems and coming up with ways of pruning these tests so they don't cause strange behavior across compilers/platforms.

Fixed a few bugs with the --bitwise and --sim-only flags.

Now we purge all of the newly created *__test_standard.py files after run time instead of just ignoring them.

UPDATE 2: Documentation updated to reflect ngoldbaum's comment (and other slight modifications)

UPDATE 3: Modified the flags in yt's testing framework so that one can now use --answer-store to designate whether to store or compare, and --answer-name=X to designate which reference to store to or compare against. I also renamed --local-store to simply --local. Additionally, I made the default --answer-name enzogold2.2 (since that will be the cloud gold standard for this version), but one can change this in future versions as the codebase gets better. I set the default behavior to run the quick suite when no other tests are selected. And I documented all of these changes. This makes running the test suite a lot easier, because there are sensible defaults.
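As a rough illustration of the defaults described in UPDATE 3, here is a minimal sketch using optparse. The flag names and the enzogold2.2 default come from the description above; the helper names build_answer_parser and select_suite, the boolean treatment of --answer-store and --local, and reducing "no other tests are picked" to "no --suite given" are assumptions made for the example, not the actual test_runner.py code.

```python
from optparse import OptionParser


def build_answer_parser():
    """Hypothetical parser showing the defaults described in UPDATE 3."""
    parser = OptionParser()
    # Assumed to be a simple on/off switch: store answers vs. compare.
    parser.add_option("--answer-store", action="store_true", default=False)
    # Name of the reference answers to store to or compare against.
    parser.add_option("--answer-name", default="enzogold2.2")
    # Assumed boolean replacement for the old --local-store flag.
    parser.add_option("--local", action="store_true", default=False)
    # Simplification: the only way to pick tests in this sketch is --suite.
    parser.add_option("--suite", default=None)
    return parser


def select_suite(options):
    """Fall back to the quick suite when nothing else was selected."""
    return options.suite if options.suite is not None else "quick"
```

With no arguments, this sketch compares against enzogold2.2 and runs the quick suite, which is the "sensible defaults" behavior described above.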

Comments (5)

  1. Sam Skillman

    This is great, Cameron. I've tested with --strict=low (all pass), --strict=medium (all pass), --strict=high (some fail), and --bitwise (more fail) against a slightly different optimization for the quick suite. I think this is good to go.

  2. chummels author

    I set --strict to operate as follows (but I am open to changing these values):

    --strict=low means --tolerance=3 and --bitwise is not set

    --strict=medium means --tolerance=6 and --bitwise is not set

    --strict=high means --tolerance=13 and --bitwise is set

    In addition, the values used for tolerance and bitwise are printed out to STDOUT at the beginning of a run, and they're included in the test_results.txt file for later use.
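The preset levels above can be sketched as a small lookup table. This is only an illustration: the names STRICT_PRESETS, resolve_strictness and within_tolerance are hypothetical, and two behaviors are assumptions rather than facts from the PR: that an explicit --tolerance or --bitwise overrides the preset, and that "order of the relative error" means a relative error below 10**-tolerance.

```python
# Preset values as listed in the comment above.
STRICT_PRESETS = {
    "low": {"tolerance": 3, "bitwise": False},
    "medium": {"tolerance": 6, "bitwise": False},
    "high": {"tolerance": 13, "bitwise": True},
}


def resolve_strictness(strict="low", tolerance=None, bitwise=None):
    """Return (tolerance, bitwise) for a preset.

    Assumption: explicitly passed values take precedence over the preset.
    """
    preset = STRICT_PRESETS[strict]
    if tolerance is None:
        tolerance = preset["tolerance"]
    if bitwise is None:
        bitwise = preset["bitwise"]
    return tolerance, bitwise


def within_tolerance(value, reference, tolerance):
    """Assumed meaning of --tolerance: relative error below 10**-tolerance."""
    if reference == 0:
        # Relative error is undefined here; require an exact match.
        return value == 0
    return abs(value - reference) / abs(reference) < 10.0 ** (-tolerance)
```

Under these assumptions, a result that differs from the reference by one part in 10^4 passes at --strict=low but fails at --strict=medium.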

  3. Britton Smith

    Cameron, this looks awesome. One comment: you can simplify checking for valid arguments to the --strict flag with the "choices" keyword in the add_option command. If you set the choices keyword, the parser will do the checking itself. You can see an example of this with the --suite flag.
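For reference, a minimal sketch of the "choices" approach with the standard-library optparse module. The function name build_strict_parser and the help text are made up for the example; this is not the actual test_runner.py code.

```python
from optparse import OptionParser


def build_strict_parser():
    """Hypothetical parser that lets optparse validate --strict itself."""
    parser = OptionParser()
    parser.add_option(
        "--strict",
        type="choice",                      # restrict the value to `choices`
        choices=["low", "medium", "high"],
        default="low",
        help="strictness preset: low, medium or high [default: %default]",
    )
    return parser


# Example use:
#   options, args = build_strict_parser().parse_args(["--strict=medium"])
#   options.strict is then "medium"; an unlisted value makes optparse
#   print an error message and exit.
```

The benefit is that the list of valid values lives in one place and no hand-written validation is needed after parsing.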

  4. chummels author

    Britton, I'll look at this. I saw the --suite flag, but it looked like there was a lot going on that I didn't really understand, so I stuck with this homegrown but shorter version... But I'll see if the choices keyword helps me out.