Crash is a tool used to analyse the core dump file created by a tool like kdump. Crash depends upon the kdump/kexec utilities to obtain its input file. A standard Linux kernel, when booted with the crashkernel argument, reserves a small amount of memory for a standby dump-capture kernel.
Upon a kernel panic, the kexec utility triggers a warm reboot into the dump kernel, where the memory contents of the panicked kernel are backed up. A warm reboot does not erase the contents of memory, so they remain accessible across the reboot. Once the memory contents are dumped to a preconfigured location, the system cold reboots into the standard kernel. The dump can later be used to analyse the panic.
Installing and configuring Crash
To install the Crash tool, you can either install a distribution-specific RPM/deb package, or compile it from source with the following steps (as root):
# crash 5.1.1 was the current version when this article was written
wget -c http://people.redhat.com/anderson/crash-5.1.1.tar.gz
tar -zxvf crash-5.1.1.tar.gz
cd crash-5.1.1
make && make install
Apart from this, you need to prepare your target machine for dump capture. Make sure that the kernel running on this machine is compiled with the options CONFIG_KEXEC, CONFIG_DEBUG_INFO, CONFIG_CRASH_DUMP and CONFIG_PROC_VMCORE. You also need the kexec-tools package, which you can download and build from source.
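The kernel options listed above can be checked with a few lines of shell. This is only a sketch: the function name check_kdump_config is made up for illustration, and it assumes your distribution ships a plain-text kernel config such as /boot/config-$(uname -r), which is a common convention but not guaranteed.

```shell
#!/bin/sh
# Report which kdump-related options are missing from a kernel config file.
# Assumption: the argument is a plain-text config (for example
# /boot/config-$(uname -r)); that location is a distribution convention.

check_kdump_config() {
    config_file=$1
    missing=""
    for opt in CONFIG_KEXEC CONFIG_DEBUG_INFO CONFIG_CRASH_DUMP CONFIG_PROC_VMCORE; do
        # An option that is built in appears as "CONFIG_FOO=y".
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            missing="$missing $opt"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Example invocation (the path is an assumption; adjust it for your system):
# check_kdump_config "/boot/config-$(uname -r)"
```

If any option is reported as missing, the kernel has to be rebuilt with it enabled before dump capture can work.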
Once you compile and install this package, you are provided with the kdump, kexec, makedumpfile and makedumprd binaries, which are used during the various phases of the panic and dump capture. For the machine to be able to boot into the dump kernel, the crashkernel argument must be appended to the bootloader's kernel line. On my Ubuntu system, the kernel line looks like this:
linux /boot/vmlinuz-2.6.35-24-generic crashkernel=384M-2G:64M,2G-:128M
Here, crashkernel is the required keyword. The memory settings read as follows: 384M-2G:64M means that if the installed RAM is between 384 MB and 2 GB, 64 MB is reserved; 2G-:128M means that if it is 2 GB or more, 128 MB is reserved. If RAM is less than 384 MB, no memory is reserved. So, depending on your system's configuration, you can reserve a suitable amount of memory for the dump kernel.
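To make the range syntax concrete, here is a small shell sketch that works out which reservation a given crashkernel value selects for a given amount of RAM. The helper names to_mb and crashkernel_for are hypothetical, only the M and G suffixes are handled, and the sketch assumes that a range includes its lower bound and excludes its upper bound.

```shell
#!/bin/sh
# Convert a size such as "384M" or "2G" to megabytes (only M and G are handled).
to_mb() {
    case $1 in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Print the reservation selected by a range spec for a given RAM size in MB.
# Prints "0" when no range matches.
crashkernel_for() {
    spec=$1
    ram_mb=$2
    old_ifs=$IFS
    IFS=,
    set -- $spec            # split the spec into "range:size" entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mb "${range%%-*}") || return 1
        end=${range#*-}
        if [ -n "$end" ]; then
            end=$(to_mb "$end") || return 1
        fi
        # Assumed rule: start <= RAM < end; an empty end means no upper limit.
        if [ "$ram_mb" -ge "$start" ] && { [ -z "$end" ] || [ "$ram_mb" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    echo 0
}
```

With the value from the kernel line above, crashkernel_for '384M-2G:64M,2G-:128M' 1024 prints 64M, the same call with 4096 prints 128M, and with 256 it prints 0.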
On some Fedora and Red Hat-based distributions, you see syntax like crashkernel=128M@16M. This means: reserve 128 MB of memory after the first 16 MB. Once these arguments are appended to the bootloader kernel line and saved, the system is rebooted with these settings, and is ready to capture the panic and dump it. Once a panic happens, the following files are fed to the crash utility to perform a dump analysis:
- Kernel (namelist): This is the uncompressed kernel binary (vmlinux) and not the vmlinuz file that you have in the /boot directory; vmlinux can be obtained easily from the compilation directory of the kernel. If you are running a stock kernel, you need to obtain vmlinux from your vendor.
- Dump image (dumpfile): This is the vmcore file or the /dev/mem file.
- Map file: This is typically the system map file, which is found in the kernel source directory after compilation. This file is passed to the Crash tool with the -S parameter.
Once the above files are obtained from the panicked system, we are ready to perform dump analysis.
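Since mixing up vmlinux and vmlinuz is an easy mistake, a quick sanity check before starting the analysis can save some time. The sketch below only tests whether a file begins with the ELF magic bytes; the function names are made up for illustration, and passing the check does not prove that the image carries debug information.

```shell
#!/bin/sh
# Succeed if the file starts with the ELF magic bytes (0x7f 'E' 'L' 'F').
is_elf() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Print a short verdict about a namelist candidate.
check_namelist() {
    if is_elf "$1"; then
        echo "$1: ELF image, plausible namelist"
    else
        echo "$1: not an ELF image (possibly a compressed vmlinuz)"
        return 1
    fi
}

# Example invocation (the path is a placeholder):
# check_namelist ./vmlinux
```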
Exploring Crash with a sample dump
Let’s trigger a crash, and use the dump we obtain to understand the Crash utility. Trigger a crash by trying the following command:
echo c > /proc/sysrq-trigger
This triggers a panic; the system boots into the crash kernel and dumps system memory to a file named vmcore in the directory /var/crash/<date-time>/. Once done, it boots back into the normal kernel.
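Before running the trigger, it is worth confirming that a dump-capture kernel is actually loaded; otherwise the machine simply panics without producing a vmcore. The sketch below reads /sys/kernel/kexec_crash_loaded, which reports 1 when a crash kernel is loaded, and afterwards looks up the newest dump. The function names are hypothetical, and the /var/crash/<date-time>/vmcore layout is taken from this article's setup, so it may differ on your distribution.

```shell
#!/bin/sh
# Succeed only if a dump-capture kernel is loaded.
# The optional argument overrides the status file (useful for testing).
crash_kernel_loaded() {
    status_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$status_file" ] && [ "$(cat "$status_file")" = "1" ]
}

# Print the most recently modified vmcore below a dump directory, if any.
# Relies on "ls -t", so it assumes directory names without newlines.
latest_vmcore() {
    dump_dir=${1:-/var/crash}
    ls -t "$dump_dir"/*/vmcore 2>/dev/null | head -n 1
}

# Example usage:
# crash_kernel_loaded || echo "no crash kernel loaded; a panic will not be dumped"
# latest_vmcore
```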
With the help of the vmcore, vmlinux and System.map files, we will invoke the Crash tool, and view the sample output from it:
[root@DELL-RnD-India linux-2.6]# crash -S System.map vmlinux /var/crash/2011-01-10-12\:23/vmcore
crash 5.1.1
---snip---
crash: overriding /boot/System.map with System.map
GNU gdb (GDB) 7.0
This GDB was configured as "x86_64-unknown-linux-gnu"...
---snip------
SYSTEM MAP: System.map
DEBUG KERNEL: vmlinux (2.6.36-rc6-ftrace+)
DUMPFILE: /var/crash/2011-01-10-12:23/vmcore
CPUS: 4
DATE: Mon Jan 10 12:21:33 2011
UPTIME: 00:06:56
LOAD AVERAGE: 0.80, 0.65, 0.31
TASKS: 278
NODENAME: DELL-RnD-India
RELEASE: 2.6.36-rc6-ftrace+
VERSION: #2 SMP Wed Sep 29 16:43:59 IST 2010
MACHINE: x86_64 (2666 Mhz)
MEMORY: 2 GB
PANIC: "Oops: 0002 [#1] SMP " (check log for details)
PID: 7203
COMMAND: "bash"
TASK: ffff88007b0d0000 [THREAD_INFO: ffff88007a6ba000]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash>
The above output shows you details about the kernel, the number of processors on the target machine, the command which caused the panic, and so on.
Crash can also be run against a live system by using /dev/mem instead of the vmcore file. For this to work, you need to disable the CONFIG_STRICT_DEVMEM option while compiling the kernel. Stock kernels come with this option enabled, and will not let you use /dev/mem this way.
The help command
The most useful command is help, which lists all the commands available within the crash tool:
bt      gdb     p       sig     waitq
btop    help    ps      struct  whatis
dev     irq     pte     swap    wr
dis     kmem    ptob    sym     q
eval    list    ptov    sys
exit    log     rd      task
extend  mach    repeat  timer

crash version: 5.1.1    gdb version: 7.0
To obtain help on any command, run help followed by the command name; for example, help vm.
The bt command
The bt (backtrace) command gives you the stack trace of the current context, and bt -a gives you the stack traces of the active tasks on all CPUs. When the crash tool loads a dump, it sets the initial context to the panicked process. Here is sample output of the command:
crash> bt
PID: 7203 TASK: ffff88007b0d0000 CPU: 0 COMMAND: "bash"
#0 [ffff88007a6bbb00] machine_kexec at ffffffff81027ac7
#1 [ffff88007a6bbb80] crash_kexec at ffffffff810888c9
#2 [ffff88007a6bbc50] oops_end at ffffffff814570c4
#3 [ffff88007a6bbc80] no_context at ffffffff81032ee7
<snipped>
The ps command
This command shows the status of all the processes, or of selected ones. It has a large number of options that provide detailed information during dump analysis. Refer to the help section for more details. Here is a sample output:
crash> ps -a 5390
PID: 5390 TASK: ffff8800799ac650 CPU: 2 COMMAND: "httpd"
ARG: /usr/sbin/httpd
ENV: TERM=linux
PATH=/sbin:/usr/sbin:/bin:/usr/bin
runlevel=5
<snipped....>
The set command
You can change the current context using the set command, which takes the PID of the process (which can be obtained from the ps command) or its task address. It takes various other arguments as well, which can be learnt by running help set. If set is used without arguments, it shows information about the current context. For example:
crash> set ffff88007d7c0000
PID: 1
COMMAND: "init"
TASK: ffff88007d7c0000 [THREAD_INFO: ffff88007d7ba000]
CPU: 0
STATE: TASK_INTERRUPTIBLE
Here, the address is the task pointer of the init process.
The files command
This can be used to get all the open files in the current context; it is a context-sensitive command:
crash> set 1
PID: 1
COMMAND: "init"
TASK: ffff88007d7c0000 [THREAD_INFO: ffff88007d7ba000]
CPU: 0
STATE: TASK_INTERRUPTIBLE
crash> files
PID: 1 TASK: ffff88007d7c0000 CPU: 0 COMMAND: "init"
ROOT: / CWD: /
FD FILE             DENTRY           INODE            TYPE PATH
0  ffff880037a58f00 ffff88007cd5be40 ffff88007d090c90 CHR  /dev/null
1  ffff880037a58f00 ffff88007cd5be40 ffff88007d090c90 CHR  /dev/null
2  ffff880037a58f00 ffff88007cd5be40 ffff88007d090c90 CHR  /dev/null
3  ffff880037a58a80 ffff88003747b000 ffff88003750d540 FIFO
4  ffff880037a586c0 ffff88003747b000 ffff88003750d540 FIFO
5  ffff880037a58c00 ffff880037493240 ffff88007cdc2ca0 UNKN anon_inode:/inotify
6  ffff880037a58180 ffff8800374936c0 ffff88007cdc2ca0 UNKN anon_inode:/inotify
7  ffff880076087a80 ffff8800376d8540 ffff88007ceb87b0 SOCK
8  ffff880079a25d80 ffff88007a205e40 ffff880079eabc30 SOCK
9  ffff88007688b6c0 ffff88007a8f0480 ffff88003752e830 SOCK
We have looked into some regularly used commands. For other commands, kindly refer to the help section.
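The commands covered here can also be run non-interactively, which is handy when you want the same first-pass report for every dump. The sketch below writes them to a command file. The function name and the file paths are hypothetical, and the commented invocation assumes your crash build supports the -i option (read commands from a file) and the -s option (suppress the start-up banner) as described in the crash man page.

```shell
#!/bin/sh
# Write a crash command file containing the commands covered in this article.
write_crash_commands() {
    cat > "$1" <<'EOF'
bt
bt -a
ps
set 1
files
exit
EOF
}

# Hypothetical invocation; the file names and dump path are placeholders:
# write_crash_commands /tmp/triage.cmds
# crash -s -i /tmp/triage.cmds -S System.map vmlinux /var/crash/<date-time>/vmcore > triage.txt
```

Ending the command file with exit makes crash quit after the last command instead of dropping to the interactive prompt.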
Acknowledgement
I referred to the Documentation/kdump/kdump.txt file in the kernel source tree while writing this article. Apart from that, I also occasionally referred to numerous other articles available on the Web.