
Debugging Enzo with GDB

While it is relatively straightforward to debug Enzo in parallel with a commercial parallel debugger like Totalview or DDT, it is not so straightforward to debug Enzo with a free, open source serial debugger like GDB. This page describes one way to do it: make Enzo pause at startup so that GDB can attach to the running process. The method works well if you do not have access to a supercomputer or cluster with a commercial parallel debugger installed, if you would like to run and debug Enzo on a small workstation, or if you prefer to use free and open source software.

Modify Enzo to allow GDB to attach to a running Enzo process

Open enzo.C, located in the main Enzo source directory, and modify the beginning of the MAIN_NAME function (the main function where execution begins) so it looks like the following:

Eint32 MAIN_NAME(Eint32 argc, char *argv[])
{
  int i;
  // Initialize Communications
  CommunicationInitialize(&argc, &argv);

#define DEBUG_MPI
#ifdef DEBUG_MPI
  if (MyProcessorNumber == ROOT_PROCESSOR) {
    int impi = 0;
    char hostname[256];
    gethostname(hostname, sizeof(hostname));
    printf("PID %d on %s ready for debugger attach\n", getpid(), hostname);
    fflush(stdout);
    while (impi == 0)
      sleep(5);
  }
#endif

All you should need to do is uncomment the #define DEBUG_MPI line. This code block makes the root process print the name of the host it's running on and its process ID number, then wait in a loop until impi is changed from within a debugger. You will need both the host name and the process ID when you attach to Enzo with GDB.
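The wait loop can also be sketched as a standalone helper. This is only an illustration, not code from Enzo: the names attach_message and wait_for_debugger, the volatile flag, and the max_polls limit are all assumptions introduced here. The flag is declared volatile because an optimizing compiler may otherwise assume that a local variable never changes inside the loop, in which case setting it from GDB would have no effect.

```cpp
#include <cstdio>
#include <string>
#include <unistd.h>  // gethostname, getpid, sleep (POSIX)

// Hypothetical helper (not part of Enzo): builds the same message that the
// DEBUG_MPI block prints, without the trailing newline.
std::string attach_message() {
  char hostname[256];
  if (gethostname(hostname, sizeof(hostname)) != 0) {
    std::snprintf(hostname, sizeof(hostname), "unknown-host");
  }
  hostname[sizeof(hostname) - 1] = '\0';  // gethostname may not terminate on truncation
  char buf[512];
  std::snprintf(buf, sizeof(buf), "PID %d on %s ready for debugger attach",
                static_cast<int>(getpid()), hostname);
  return std::string(buf);
}

// Hypothetical helper (not part of Enzo): polls *flag until it becomes nonzero,
// for example after "set var" in GDB. A negative max_polls waits forever, like
// the original loop. Returns true if the flag was set, false if polling gave up.
bool wait_for_debugger(volatile int *flag, unsigned poll_seconds, int max_polls) {
  for (int polls = 0; *flag == 0; ++polls) {
    if (max_polls >= 0 && polls >= max_polls) return false;
    sleep(poll_seconds);
  }
  return true;
}
```

A caller would print attach_message(), flush stdout, and then call wait_for_debugger(&impi, 5, -1) to reproduce the behaviour of the block above.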

Once you've modified enzo.C, you will need to rebuild Enzo. If you haven't done so already, make sure Enzo is built with debugging symbols and with compiler optimizations turned off. On most systems this can be accomplished by running make opt-debug at the command line before rebuilding (see :ref:`MakeOptions`).

Run Enzo

Now you're ready to run a test simulation. This method works best when using only a few processors, so don't start a simulation with hundreds of processors and try to attach to it with GDB unless you know what you're doing. If you're running Enzo on a cluster, make sure that you can SSH into the compute nodes; if you cannot, this debugging method will not work. Start Enzo normally using mpirun. Enzo should print something like:

humperdinck:GDB_test goldbaum$ mpirun -np 4 ./enzo.exe -d -r DD0096/DD0096
MPI_Init: NumberOfProcessors = 4
PID 34352 on humperdinck.ucolick.org ready for debugger attach

This says that Enzo is running as four MPI processes and that the root process has process ID 34352 on the host humperdinck.ucolick.org.

Attach and Debug With GDB

Next, in a new terminal window, SSH into the host that Enzo reported. If you're running on your local workstation there is no need to SSH. Then start a GDB session and attach to the PID that Enzo printed:

humperdinck:enzo goldbaum$ gdb
GNU gdb 6.3.50-20050815 (Apple version gdb-1515) (Sat Jan 15 08:33:48 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin".
(gdb) attach 34352

GDB should report a long list of warning messages about code in libraries that Enzo links against that was not compiled with debugging symbols. It's safe to ignore these warnings since we will only be debugging the Enzo source. GDB is now attached to Enzo's process, which is probably stopped somewhere in your system's implementation of the sleep() function. To see the execution stack, tell GDB to print a stack trace:

0x00007fff8730da6a in __semwait_signal ()
(gdb) backtrace
#0  0x00007fff8730da6a in __semwait_signal ()
#1  0x00007fff8730d8f9 in nanosleep ()
#2  0x00007fff8735a9ac in sleep ()
#3  0x0000000100008cee in main (argc=4, argv=0x7fff5fbfef70) at enzo.C:259
(gdb)

In this example GDB is stopped three stack frames below where we want to be, which is inside enzo.C. Move up the stack:

(gdb) up 3
#3  0x0000000100008cee in main (argc=4, argv=0x7fff5fbfef70) at enzo.C:259
259      sleep(5);
Current language:  auto; currently c++
(gdb) l
254    char hostname[256];
255    gethostname(hostname, sizeof(hostname));
256    printf("PID %d on %s ready for debugger attach\n", getpid(), hostname);
257    fflush(stdout);
258    while (impi == 0)
259      sleep(5);
260  }
261  #endif
262
263
(gdb)

Now GDB is at line 259 of enzo.C. To break out of the infinite loop, you will need to modify impi so that it is no longer zero:

(gdb) set var impi = 7

At this point you can resume execution by typing continue or c. Before you do, you can optionally set a breakpoint elsewhere in the Enzo source tree:

(gdb) break EvolveLevel.C:738

This will pause execution right before Enzo enters RebuildHierarchy for the first time.

That should be enough to get you going. It's also possible to start multiple GDB sessions so you can attach to all of the parallel MPI processes. See the GDB documentation and the Open MPI FAQ page for more information.
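To attach to every MPI process, each rank has to report its own PID rather than only the root process. A minimal sketch of such a per-rank message is shown here, assuming the ROOT_PROCESSOR check is dropped so that every rank prints and waits. The function name format_attach_line and the "Rank N:" prefix are hypothetical and do not appear in Enzo.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Enzo): formats one attach line per MPI rank
// so that each process can be matched to its own GDB session.
std::string format_attach_line(int rank, int pid, const std::string &host) {
  char buf[512];
  std::snprintf(buf, sizeof(buf),
                "Rank %d: PID %d on %s ready for debugger attach",
                rank, pid, host.c_str());
  return std::string(buf);
}
```

Inside enzo.C this would presumably be called with MyProcessorNumber, getpid(), and the host name from gethostname(). Each rank would then wait in its own loop, so impi would have to be set once per process from the GDB session attached to that process.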
