Post-Mortem Analysis for Multiprocessing Program in C
Last updated on December 24, 2023
When a program crashes, the shell sometimes reports that a core was dumped, for example Segmentation fault (core dumped).
The resulting core dump file is extremely useful for post-mortem analysis.
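In a multiprocess program, the parent can also tell from a child's wait status whether the child dumped core. The following is a minimal sketch of that idea, not code from the original post: the struct crash_report and the function run_crashing_child are hypothetical names, and the child simply calls abort() to stand in for a real crash. WCOREDUMP is available on Linux but is not part of POSIX, so the sketch guards it.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical summary of how a crashing child ended. */
struct crash_report {
    int signaled;    /* 1 if the child was killed by a signal */
    int signal_no;   /* signal number, meaningful when signaled == 1 */
    int core_dumped; /* 1 = core dumped, 0 = no core, -1 = cannot tell */
};

/* Fork a child that aborts, then inspect its wait status in the parent.
 * Returns 0 on success, -1 if fork() or waitpid() failed. */
int run_crashing_child(struct crash_report *report)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
        abort(); /* child: raises SIGABRT, whose default action dumps core */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    report->signaled = WIFSIGNALED(status) ? 1 : 0;
    report->signal_no = report->signaled ? WTERMSIG(status) : 0;
#ifdef WCOREDUMP
    report->core_dumped = (report->signaled && WCOREDUMP(status)) ? 1 : 0;
#else
    report->core_dumped = -1; /* WCOREDUMP is not available on this platform */
#endif
    return 0;
}
```

Whether core_dumped ends up as 1 depends on the core size limit and the core pattern of the machine, which the next sections configure.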
Enable Core Dump
Here is how to enable core dumps on CentOS.
Set ulimit
If ulimit -c reports 0, core dumps are disabled. Raise the limit (for example with ulimit -c unlimited) to enable them.
This setting only affects the current shell session, so it is better to add the command to the user's .bashrc as well.
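The same limit can also be raised from inside the program, which helps when the process is not started from a shell that has the ulimit setting. This is a minimal sketch under that assumption; enable_core_dumps is a hypothetical helper name, while getrlimit, setrlimit and RLIMIT_CORE are the standard interfaces behind ulimit -c.

```c
#include <sys/resource.h>

/* Raise the soft core-size limit to the hard limit for this process
 * and for any children it forks afterwards.
 * Returns 0 on success, -1 on error (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may raise the soft limit only up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling it early in main() is the natural place. If the hard limit itself is 0, the call still succeeds but core dumps stay disabled, so the hard limit has to be raised by the administrator first.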
Configure dump file location
On CentOS, the core dump destination is defined in /proc/sys/kernel/core_pattern. Writing a new pattern to that file requires root permission (for example via sudo).
The pattern used here places dump files in the dumps directory under the current working directory, with the file name core.%e.%t.%p. The placeholders mean:
- %e: name of the executable
- %t: time of the dump, in seconds since the UNIX Epoch
- %p: process ID of the dumped process
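To make the placeholder substitution concrete, here is a small sketch that expands the three placeholders above the way the resulting file name is built. It only illustrates the naming scheme and is not the kernel's implementation: expand_core_pattern is a hypothetical function, it handles only %e, %t, %p and %%, and it rejects anything else instead of mimicking the kernel's handling of other specifiers.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

/* Expand %e, %t, %p and %% in `pattern` into `out` (capacity `cap`).
 * Returns the length written, or -1 if the result does not fit or an
 * unsupported placeholder (including a trailing lone '%') is found. */
int expand_core_pattern(const char *pattern, const char *exe, time_t when,
                        pid_t pid, char *out, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;

    for (const char *p = pattern; *p != '\0'; p++) {
        char piece[64];
        const char *src = piece;

        if (*p != '%') {
            piece[0] = *p; /* ordinary character, copied through */
            piece[1] = '\0';
        } else {
            p++;
            switch (*p) {
            case 'e': src = exe; break;
            case 't': snprintf(piece, sizeof piece, "%lld", (long long)when); break;
            case 'p': snprintf(piece, sizeof piece, "%ld", (long)pid); break;
            case '%': piece[0] = '%'; piece[1] = '\0'; break;
            default:  return -1;
            }
        }

        size_t len = strlen(src);
        if (used + len >= cap) /* keep room for the terminating NUL */
            return -1;
        memcpy(out + used, src, len);
        used += len;
    }

    out[used] = '\0';
    return (int)used;
}
```

For example, expanding dumps/core.%e.%t.%p for an executable named worker, a timestamp of 1703376000 and a PID of 4242 yields dumps/core.worker.1703376000.4242.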
Inside a Docker container, /proc/sys/kernel/core_pattern is read-only, because the container shares its kernel with the host and the pattern has to be changed on the host side. Docker on Windows uses WSL2 as its backend, so changing the pattern inside WSL2 works. However, the change may be lost after a reboot.
Analyze Core Dump
Now that core dumps are enabled, we can use gdb to analyze the state a crashed program was in when it exited, by loading the executable together with its core file (gdb <executable> <core file>).
Here are some useful commands:
- bt: backtrace, show the stack trace
- info locals: show the local variables of the current stack frame
- frame <frame id> or f <frame id>: switch to a specific stack frame shown in the backtrace
- list: show the source code of the current stack frame
POSIX Threads
- info threads: show information about all threads; the current thread is marked with *
- thread <thread id>: switch to a specific thread
Mutex
If a thread is stuck in the __lll_lock_wait() function, it is waiting for a mutex.
Use p (pthread_mutex_t) <mutex> to print a mutex's value.
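- __owner: the ID of the thread that currently holds the mutex; it corresponds to the LWP number listed by info threads
- __nusers: the number of threads currently using (holding) the mutex, rather than the number of threads waiting for it
- __kind: the type of the mutex; 0 stands for PTHREAD_MUTEX_NORMAL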
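The same fields can be logged by the program itself, which is occasionally handy for comparing against what gdb shows in the core. This sketch assumes glibc: __data.__owner, __data.__nusers and __data.__kind are internal members, not part of POSIX, and may change between versions. describe_mutex is a hypothetical helper, and reading these fields without holding the lock is inherently racy, so it is for diagnostics only.

```c
#include <pthread.h>
#include <stdio.h>

/* Format the glibc-internal mutex fields that gdb prints.
 * ASSUMPTION: glibc's pthread_mutex_t layout; other C libraries
 * get a fixed fallback message instead.
 * Returns the value of snprintf(). */
int describe_mutex(const pthread_mutex_t *m, char *out, size_t cap)
{
#ifdef __GLIBC__
    return snprintf(out, cap, "owner=%d nusers=%u kind=%d",
                    m->__data.__owner, m->__data.__nusers, m->__data.__kind);
#else
    (void)m; /* internals unknown on this C library */
    return snprintf(out, cap, "mutex internals not available");
#endif
}
```

On glibc, a mutex created with PTHREAD_MUTEX_INITIALIZER and not yet locked is described as owner=0 nusers=0 kind=0. A nonzero owner in a core dump therefore points at the thread to inspect next.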
Condition Variable
If a thread is stuck in the pthread_cond_wait() function, it is waiting for a condition variable.
Similarly, use p (pthread_cond_t) <cond> to print a condition variable's value.
Analyze Deadlocked Program
Even though this post is mainly about analyzing core dumps, here is a short tip for when a running program is deadlocked.
Replace <pid> with the process ID of the program, which can be found with the top command.
- Use gdb --pid <pid> to attach to the program and debug as usual.
- Save the current state of the program as a core dump file.
  - Use gcore or gcore <name> to save the core when inside gdb.
  - Use gcore <pid> or gcore -o <name> <pid> to dump the core when in the shell.
- Use