You may know the following situation. You arrive at the office in the morning, do what you always do and check out the latest changes to the software you are working on. After a bit of compile time and the first coffee, you start the freshly built application. Boom, kernel panic. After rebooting and looking through the changes, you may have an idea what the reason could be: a colleague of yours is working on a fancy new feature which needed changes to a kernel module. As you know almost nothing about this code, you ask him for help and, as it of course doesn't happen on his computer, he asks for a backtrace of the panic. You now have two problems. First, you need to see the panic yourself, and second, it would be nice to get a copy of the backtrace to share in a bug tracker. In this post I will show how both aims can easily be achieved.
Let the kernel manage the graphical modes
As most people work under X11, they don't see the output of a kernel panic. When a kernel panic happens, the kernel prints the reason for the panic and a kernel backtrace to the console and immediately stops its own execution. The text is not written to a log file or anywhere else. As a consequence, you can't read the panic text, because the graphical mode is still active. Historically, mode setting was done by the graphics driver of the X11 system, so the kernel had no idea whether a graphics mode was in use, let alone which one. Fortunately, the kernel hackers invented a new infrastructure which lets the kernel do the mode switch. This subsystem is called Kernel Mode Setting (KMS). As the kernel does the mode setting, it can switch back to the console on a panic, regardless of which graphical mode is currently configured. Besides this, KMS brings other improvements like fast user switching and a flicker-free switch between text and graphics mode. On the other hand, it is highly hardware dependent, and even though it was introduced with version 2.6.29, not all hardware available today can make use of it. If you own an Intel graphics card, you are in good shape. Radeon and NVidia cards have limited support through the in-kernel drivers radeon and nouveau. For an Intel i915 card you need to enable the following kernel options:
CONFIG_DRM_I915=y
  Location:
    -> Device Drivers
      -> Graphics support
        -> Direct Rendering Manager (XFree86 4.1.0 and higher DRI support) (DRM [=y])
          -> Intel 830M, 845G, 852GM, 855GM, 865G ( [=y])

CONFIG_DRM_I915_KMS=y
  Location:
    -> Device Drivers
      -> Graphics support
        -> Direct Rendering Manager (XFree86 4.1.0 and higher DRI support) (DRM [=y])
          -> Intel 830M, 845G, 852GM, 855GM, 865G ( [=y])
            -> i915 driver (DRM_I915 [=y])
The kernel line in your favorite boot loader needs the following additional parameter:
X11 should have this minimal configuration for the device section:
Section "Device"
    Identifier "i915"
    Driver     "intel"
    Option     "DRI" "true"
EndSection
Please note that you of course need a recent kernel, X11 version and Intel X11 driver to make this work. After compiling, installing and booting the new kernel, KMS should be in use. You will notice it because the boot messages are printed in a much higher resolution than the usual text mode provides. The next time a kernel panic occurs, the kernel will switch back to the console before the panic is printed. This allows you to see the printed information, and maybe you get a useful hint about the reason for the panic.
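Before rebooting, it can save a round trip to verify that the two kernel options listed earlier really ended up in the kernel configuration. The following is only a minimal sketch in Python: the function names parse_kernel_config and missing_options are made up for this example, and accepting both "y" and "m" as "enabled" is my assumption (this post uses "y" for both options).

```python
import re

# The two options this post names for Intel KMS.
REQUIRED = ("CONFIG_DRM_I915", "CONFIG_DRM_I915_KMS")


def parse_kernel_config(text):
    """Return {option: value} for every 'CONFIG_X=value' line of a .config."""
    options = {}
    for line in text.splitlines():
        # Lines like '# CONFIG_X is not set' start with '#' and do not match.
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def missing_options(text, required=REQUIRED):
    """List the required options that are neither 'y' nor 'm' (assumption)."""
    options = parse_kernel_config(text)
    return [name for name in required if options.get(name) not in ("y", "m")]


# Hypothetical excerpt of a .config where KMS was forgotten.
sample = "CONFIG_DRM=y\nCONFIG_DRM_I915=y\n# CONFIG_DRM_I915_KMS is not set\n"
print(missing_options(sample))  # ['CONFIG_DRM_I915_KMS']
```

To check a real configuration, read the .config file of your kernel source tree (where it lives depends on your distribution) and pass its content to missing_options; an empty list means both options are set.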
Post the panic
If you can't use KMS or don't want to transcribe the panic text by hand into the bug tracker, it would be nice if the text could be made available on another computer. Kernel hackers usually use the serial port for that. Unfortunately, most modern computers don't have a serial port anymore. You would also need two hosts with a serial port, and the setup is complex (you have to know about baud rates, parity and the like). But there is a simpler solution: netconsole. Netconsole is a kernel module which sends kernel messages over the network using UDP. The setup is really simple. In the kernel configuration you need the following setting:
CONFIG_NETCONSOLE=m
  Location:
    -> Device Drivers
      -> Network device support (NETDEVICES [=y])
I prefer to compile it as a module, which allows me to turn it on only when I need it. Load it with the following command:
modprobe netconsole netconsole=@/,@192.168.220.10/
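The parameter string looks cryptic at first. According to the kernel's netconsole documentation its shape is [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr], and several targets are separated by semicolons. Here is a small Python sketch that assembles such strings; the helper names netconsole_target and netconsole_param are my own invention, and the second address and the interface eth0 in the usage note are placeholders.

```python
def netconsole_target(target_ip, target_port=None, source_ip="",
                      source_port=None, device="", target_mac=""):
    """Build one '[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]' entry.

    Fields left empty are omitted, so the kernel uses its defaults for them.
    """
    src = f"{source_port or ''}@{source_ip}/{device}"
    tgt = f"{target_port or ''}@{target_ip}/{target_mac}"
    return f"{src},{tgt}"


def netconsole_param(*targets):
    """Join one or more target entries with ';' into a netconsole= value."""
    return "netconsole=" + ";".join(targets)


# Reproduces the parameter used in the modprobe call above.
print(netconsole_param(netconsole_target("192.168.220.10")))
# netconsole=@/,@192.168.220.10/
```

A call like netconsole_target("192.168.220.10", 6666, "192.168.220.20", 6665, "eth0") gives 6665@192.168.220.20/eth0,6666@192.168.220.10/. When you pass more than one target on the command line, remember to quote the whole parameter, because the shell would otherwise interpret the semicolon.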
The IP address in the modprobe command (192.168.220.10) has to be replaced by that of your target computer. You can of course tune it much more, like setting the source and target ports or even letting netconsole send the text to more than one host. On the target computer you need a network tool which can read from a socket and print what it reads to stdout. Netcat or nc are two tools which are able to do just that. The call for nc looks like the following (some netcat variants want the port given as -p 6666):
nc -l -u 6666
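If you also want a copy of the messages on disk (handy for attaching to the bug tracker), a few lines of Python can stand in for nc. This is only a minimal sketch: the function name receive_netconsole, the file name panic.log and the max_packets parameter are assumptions made for illustration; the port 6666 matches the nc call above.

```python
import socket
import sys


def receive_netconsole(port=6666, logfile="panic.log", bind_addr="0.0.0.0",
                       max_packets=None, out=sys.stdout):
    """Print UDP datagrams arriving on `port` and append them to `logfile`.

    Runs until interrupted, or until `max_packets` datagrams were received.
    Returns the number of datagrams handled.
    """
    received = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(logfile, "a", encoding="utf-8") as log:
        sock.bind((bind_addr, port))
        while max_packets is None or received < max_packets:
            data, _sender = sock.recvfrom(65535)
            text = data.decode("utf-8", errors="replace")
            out.write(text)
            out.flush()
            log.write(text)
            log.flush()  # keep the file current even if we get interrupted
            received += 1
    return received
```

Calling receive_netconsole() without arguments listens on all interfaces until you press Ctrl-C and leaves everything it has seen in panic.log.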
Now, if a kernel panic happens, you will see output like this:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: rb_erase+0x15c/0x320
PGD 6942f067 PUD a1e4067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/block/md1/dev
CPU 3
Modules linked in: vboxnetadp vboxnetflt vboxdrv netconsole ...
Pid: 18887, comm: VirtualBox Tainted: G W 2.6.36-gentoo #4 DG33TL/
RIP: 0010: rb_erase+0x15c/0x320
RSP: 0018:ffff8800b430db58 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff880069557a68 RCX: 0000000000000001
RDX: ffff880069557a68 RSI: ffff880001d8ed58 RDI: 0000000000000000
RBP: ffff8800b430db68 R08: 0000000000000001 R09: 000000008edcb5d6
R10: 0000000000000000 R11: 0000000000000202 R12: ffff880001d8ed58
R13: 0000000000000000 R14: 000000000000ed00 R15: 0000000000000002
FS: 00007fffde457710(0000) GS:ffff880001d80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000064f9000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process VirtualBox (pid: 18887, threadinfo ffff8800b430c000, task ffff880091e227f0)
Stack:
 ffff88000a03ba18 ffff880001d8ed48 ffff8800b430dba8 ffffffff8105bf06
<0> ffff8800b430dba8 ffffffff8105c97c ffff8800b430dbc8 ffff88000a03ba18
<0> 00004ff8a86ba455 ffff880001d8ed48 ffff8800b430dc48 ffffffff8105ce77
Call Trace:
 __remove_hrtimer+0x36/0xb0
 ? lock_hrtimer_base+0x2c/0x60
 __hrtimer_start_range_ns+0x2b7/0x3c0
 ? rtR0SemEventMultiLnxWait+0x250/0x3d0 [vboxdrv]
 ? RTLogLoggerExV+0x12f/0x180 [vboxdrv]
 hrtimer_start+0x13/0x20
 rtTimerLnxStartSubTimer+0x60/0x120 [vboxdrv]
 rtTimerLnxStartOnSpecificCpu+0x21/0x30 [vboxdrv]
 rtmpLinuxWrapper+0x23/0x30 [vboxdrv]
 RTMpOnSpecific+0x99/0xa0 [vboxdrv]
 ? rtTimerLnxStartOnSpecificCpu+0x0/0x30 [vboxdrv]
 RTTimerStart+0x2a6/0x2e0 [vboxdrv]
 ? g_abExecMemory+0x33665/0x180000 [vboxdrv]
 g_abExecMemory+0xc678/0x180000 [vboxdrv]
 g_abExecMemory+0x328d7/0x180000 [vboxdrv]
 supdrvIOCtlFast+0x6a/0x70 [vboxdrv]
 VBoxDrvLinuxIOCtl+0x47/0x1e0 [vboxdrv]
 ? pick_next_task_fair+0xde/0x150
 do_vfs_ioctl+0xa1/0x590
 ? sys_futex+0x76/0x170
 sys_ioctl+0x4a/0x80
 system_call_fastpath+0x16/0x1b
Code: 07 a8 01 75 9d eb 81 0f 1f 84 00 00 00 00 00 48 3b 78 10 0f 84 ...
RIP rb_erase+0x15c/0x320
 RSP
CR2: 0000000000000000
---[ end trace 4eaa2a86a8e2da24 ]---
Normally only kernel panics are sent to the console. You can increase the verbosity level by executing the following command as root:
dmesg -n 8
To continue with the story from the beginning: with the methods shown, you can hope your colleague gets enough information to find the reason for the kernel panic. To be more helpful, the next step would be to try to debug the problem yourself. Even though the kernel debugger KGDB is part of the kernel (since 2.6.26, with the KDB shell following in version 2.6.35), it is not really usable for me. The reason seems to be that kernel hackers usually have really old hardware which has a serial port, a PS/2 keyboard or both. Otherwise I can't find a reason why USB keyboards don't work. I asked on the KGDB mailing list about the status of USB keyboard support, and I can only hope support will be integrated soon.