Abnormal Termination Processing (ATP)

Issue #32 new
jg piccinali repo owner created an issue

GNU

Setup

module load atp/1.8.1
cp -R $ATP_HOME/demos atpDemos
cd atpDemos/
cc -o testMPIApp testMPIApp.c

Run without ATP

  • aprun -n4 ./testMPIApp 1 4
testApp: (31561) starting up...
testApp: (31564) starting up...
testApp: (31562) starting up...
testApp: (31563) starting up...

_pmiu_daemon(SIGCHLD): [NID 00012] [c0-0c0s3n0] [Fri May 29 22:20:18 2015] PE RANK 3 exit signal Illegal instruction
[NID 00012] 2015-05-29 22:20:18 Apid 158515: initiated application termination
Application 158515 exit codes: 132

Run with ATP

  • ATP_ENABLED=1 aprun -n4 ./testMPIApp 1 4
testApp: (31663) starting up...
testApp: (31666) starting up...
testApp: (31664) starting up...
testApp: (31665) starting up...
Application 158516 is crashing. ATP analysis proceeding...

ATP Stack walkback for Rank 3 starting:
  _start@start.S:113
  __libc_start_main@libc-start.c:242
  main@0x401546
  foo@0x401395
  raise@pt-raise.c:42
ATP Stack walkback for Rank 3 done
Process died with signal 4: 'Illegal instruction'
Forcing core dumps of ranks 3, 0
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat

_pmiu_daemon(SIGCHLD): [NID 00012] [c0-0c0s3n0] [Fri May 29 22:21:54 2015] PE RANK 1 exit signal Killed
Application 158516 exit codes: 137
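
The exit codes in the two runs (132 without ATP, 137 with ATP) can be read as signal numbers. The sketch below assumes the usual shell convention that a code above 128 means 128 plus the number of the terminating signal. The helper name decode_exit is hypothetical and is not part of ATP or aprun.

```shell
#!/bin/sh
# Decode an exit code as printed in "Application <apid> exit codes: N".
# Assumption: codes above 128 follow the 128 + signal-number convention.
# decode_exit is a hypothetical helper, not an ATP or aprun command.
decode_exit() {
    code=$1
    if [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
    else
        echo "exited with status $code"
    fi
}

decode_exit 132   # run without ATP
decode_exit 137   # run with ATP
```

Under that assumption, 132 corresponds to signal 4, which matches the "Illegal instruction" reported for rank 3, and 137 corresponds to signal 9, which matches the "Killed" reported for rank 1 after the ATP analysis.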
