mbpart ungraceful exit with invalid data

Issue #28 resolved
Nico Schlömer created an issue
$ mbpart 1 -m ML_KWAY pacman.h5m pacman2.h5m 
Loading file pacman.h5m...
Computing partition using ML_KWAY method for 1 processors...
[fuji:03734] *** Process received signal ***
[fuji:03734] Signal: Floating point exception (8)
[fuji:03734] Signal code: Integer divide-by-zero (1)
[fuji:03734] Failing at address: 0x7fedb68d8bf8
[fuji:03734] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x352f0) [0x7fedb6b212f0]
[fuji:03734] [ 1] /usr/lib/x86_64-linux-gnu/libmetis.so.5(METIS_PartGraphKway+0x1c8) [0x7fedb68d8bf8]
[fuji:03734] [ 2] /usr/lib/x86_64-linux-gnu/libMOAB.so.4(_ZN16MetisPartitioner14partition_meshEiPKcibbbbS1_b+0xe44) [0x7fedb7b83624]
[fuji:03734] [ 3] mbpart(main+0x1500) [0x40bba0]
[fuji:03734] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fedb6b0ca40]
[fuji:03734] [ 5] mbpart(_start+0x29) [0x40cfc9]
[fuji:03734] *** End of error message ***
Floating point exception (core dumped)

The case p==1 should probably be intercepted.
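
A minimal sketch of what such an interception could look like, assuming the check runs before the partitioner is called. The names `PartRequest`, `classify_part_count`, and `trivial_assignment` are hypothetical and not part of MOAB or METIS. The idea is only that a request for one part can be answered without a graph partitioner, and that a count below one can be rejected with an error instead of a crash.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of MOAB: classifies the requested number of
// parts before any graph partitioner is invoked.
enum class PartRequest { Invalid, Trivial, NeedsPartitioner };

PartRequest classify_part_count(int nparts)
{
    if (nparts < 1) {
        return PartRequest::Invalid;          // nothing sensible to compute
    }
    if (nparts == 1) {
        return PartRequest::Trivial;          // every element goes to part 0
    }
    return PartRequest::NeedsPartitioner;     // hand off to the real partitioner
}

// Hypothetical fallback for the single-part case: assign all elements to
// part 0 without calling the partitioning library at all.
std::vector<int> trivial_assignment(std::size_t num_elements)
{
    return std::vector<int>(num_elements, 0);
}
```

With a check like this, the caller could report an error and exit cleanly for `Invalid`, write out the trivial assignment for `Trivial`, and only reach the partitioning library in the remaining case.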

Comments (8)

  1. Vijay M

    Ah, so we are not validating input here. p==1 is a weird case though and is not part of the expected use case, but I agree that we should catch that.

  2. Vijay M

    Does the branch (vijaysm/mbpart-validate-input) fix this issue? If you are satisfied, I'll submit a PR for it. If there are more issues, let me know.

  3. Nico Schlömer reporter

    Unfortunately, I'm not able to check out that branch:

    $ git checkout vijaysm/<tab><tab>
    vijaysm/cmake-modify-tests       vijaysm/matrix3_testutil         vijaysm/metis-serial-part 
    vijaysm/lasso-merge              vijaysm/matrix3_testutil_modif   vijaysm/mpi-no-hdf5-testfixes
    
  4. Nico Schlömer reporter

    After a fresh clone, I can check out the branch, but mbpart isn't built anymore for some reason. Perhaps a recent change to the build chain? I'll have to investigate.

  5. Nico Schlömer reporter

    Well... I'm inclined to say that this issue is so easy to reproduce, and the fix so easy to verify, that it's okay to trust the fix if you alone can confirm it.
