How to debug a core file?
Usually you have a choice of two debuggers, dbx or gdb:
.dosu dbx application_that_produces_this_core core
.dosu gdb application_that_produces_this_core core
The command `strings core | head -2` tells you which
application produced the core (line 1), including the full
path of the application (line 2).
Apparently the above command (`strings core | head -2`) only
works on Irix. On Linux, the name is not in the first two
lines, so you need to look further into the output:
.dosu strings core | head -10
The name of the executable shows up somewhere in the middle.
Within a debugger while dissecting a core file:
p (curr_pat)pdata
This will get you information about patid, envid, etc.
What about multi-threaded core files?
Here is an explanation of core files in multi-threaded apps. It is
from an old FAQ, but I think it still applies:
G.2: Does it work with post-mortem debugging?
Not very well. Generally, the core file does not correspond to the
thread that crashed. The reason is that the kernel will not dump core
for a process that shares its memory with other processes, such as the
other threads of your program. So, the thread that crashes silently
disappears without generating a core file. Then, all other threads of
your program die on the same signal that killed the crashing thread.
(This is required behavior according to the POSIX standard.) The last
one that dies is no longer sharing its memory with anyone else, so the
kernel generates a core file for that thread. Unfortunately, that's not
the thread you are interested in.
G.3: Any other ways to debug multithreaded programs, then?
Assertions and printf() are your best friends. Try to debug sequential
parts in a single-threaded program first. Then, put printf() statements
all over the place to get execution traces. Also, check invariants often
with the assert() macro. In truth, there is no other effective way (save
for a full formal proof of your program) to track down concurrency bugs.
Debuggers are not really effective for subtle concurrency problems,
because they disrupt program execution too much.
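
To make the advice in G.3 concrete, here is a minimal sketch of tracing plus invariant checking in a threaded C program. It is only an illustration, not code from the FAQ: the names `run_workers`, `worker`, `struct shared` and `MAX_WORKERS` are made up for this example, and the invariant (a counter that must stay between 0 and a known limit) is an assumed stand-in for whatever invariant your real program has. Traces go to stderr rather than through plain printf() because stderr is unbuffered, so the messages are less likely to be lost if the program dies.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define MAX_WORKERS 16   /* hypothetical upper bound for this sketch */

/* Hypothetical shared state: a counter guarded by a mutex. */
struct shared {
    pthread_mutex_t lock;
    long counter;
    long limit;          /* invariant: 0 <= counter <= limit */
};

struct worker_arg {
    struct shared *s;
    int id;
    long iterations;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;

    /* Execution trace: which thread started and what it will do. */
    fprintf(stderr, "[worker %d] start, %ld iterations\n",
            a->id, a->iterations);

    for (long i = 0; i < a->iterations; i++) {
        pthread_mutex_lock(&a->s->lock);
        a->s->counter++;
        /* Check the invariant while still holding the lock. */
        assert(a->s->counter >= 0 && a->s->counter <= a->s->limit);
        pthread_mutex_unlock(&a->s->lock);
    }

    fprintf(stderr, "[worker %d] done\n", a->id);
    return NULL;
}

/* Start nthreads workers, wait for all of them, return the final count. */
long run_workers(int nthreads, long iterations)
{
    pthread_t tid[MAX_WORKERS];
    struct worker_arg args[MAX_WORKERS];
    struct shared s;
    int rc;

    assert(nthreads > 0 && nthreads <= MAX_WORKERS);

    pthread_mutex_init(&s.lock, NULL);
    s.counter = 0;
    s.limit = (long)nthreads * iterations;

    for (int i = 0; i < nthreads; i++) {
        args[i].s = &s;
        args[i].id = i;
        args[i].iterations = iterations;
        rc = pthread_create(&tid[i], NULL, worker, &args[i]);
        assert(rc == 0);
        (void)rc;        /* keeps the build quiet when NDEBUG is set */
    }
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_join(tid[i], NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* Final invariant: every increment must have been counted. */
    assert(s.counter == s.limit);

    pthread_mutex_destroy(&s.lock);
    return s.counter;
}
```

Call `run_workers(4, 1000)` from your own `main` and build with `-pthread`. If the locking were wrong, the final assert would abort the program at the point where the invariant is first observed to be broken, and the trace lines on stderr would show which workers had started and finished by then.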