Bug #1791

Panic: "Fatal trap 12: page fault while in kernel mode"

Added by ftigeot almost 4 years ago. Updated almost 4 years ago.

Status: New
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

I have just been bitten by this panic on a 4-core Xeon server running
DragonFly 2.6.3/i386.
It is usually lightly loaded and mainly used for mail services.

I have also set up a local chroot with a devfs mount of /dev and a nullfs
mount of /usr/pkgsrc/distfiles to build packages.
The panic occurred just after I exited a chrooted root shell.

I wasn't able to get a core dump: the keyboard was completely unresponsive
in the debugger and I had to reboot the machine with the reset switch.

Details of the crash follow:

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 2; lapic.id = 04000000
fault virtual address = 0
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc02d0a55
stack pointer = 0x10:0xe30729ac
frame pointer = 0x10:0xe3072a40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 89056 (zsh)
current thread = pri 6
<- SMP: XXX
kernel: type 12 trap, code=0

CPU2 stopping CPUs: 0x0000000b
stopped
Stopped at devfs_getattr+0x1b: movl 0(%edi),%edx
db>

History

#1 Updated by dillon almost 4 years ago

:I have just been bitten by this panic on a 4-core Xeon server running
:DragonFly 2.6.3/i386.
:It is usually lightly loaded and mainly used for mail services.
:..
:fault virtual address = 0
: stopped
:Stopped at devfs_getattr+0x1b: movl 0(%edi),%edx
:Francois Tigeot

This line of assembly corresponds to the (dev = node->d_dev)
inside node_sync_dev_get() in vfs/devfs/devfs_vnops.c. node
is NULL.

0xc04853f9 <devfs_getattr+27>: mov (%edi),%edx (dev = node->d_dev)
(but 'node' is NULL so it faults)
0xc04853fb <devfs_getattr+29>: test %edx,%edx test dev for NULL

It looks like the devfs node is getting ripped out from
under the vnode but unfortunately there isn't enough information
to figure out why that happened.

-Matt
Matthew Dillon
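The failure mode described above can be illustrated with a minimal user-space sketch. Everything in it is a hypothetical stand-in: the `sketch_*` types and functions are not the real devfs structures or the real `node_sync_dev_get()`, and placing `d_dev` as the first member is an assumption chosen to match the reported fault address of 0. It only shows why reading `node->d_dev` through a NULL `node` faults at address 0, and what a defensive check would look like. It is not a proposed fix for this bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real DragonFly structures. */
struct sketch_dev {
    int unit;
};

struct sketch_node {
    struct sketch_dev *d_dev;   /* assumed first member, i.e. offset 0 */
};

/*
 * Unchecked form: mirrors "dev = node->d_dev".
 * With node == NULL this reads from address 0 + offsetof(d_dev),
 * which is address 0 under the layout assumed above.
 */
struct sketch_dev *
sketch_dev_get_unchecked(struct sketch_node *node)
{
    return node->d_dev;
}

/*
 * Checked form: a node that has been detached (NULL) is reported
 * as "no device" instead of being dereferenced.
 */
struct sketch_dev *
sketch_dev_get_checked(struct sketch_node *node)
{
    if (node == NULL)
        return NULL;
    return node->d_dev;
}

/* Small self-check covering the attached and the detached case. */
void
sketch_selfcheck(void)
{
    struct sketch_dev dev = { 7 };
    struct sketch_node node = { &dev };

    assert(sketch_dev_get_checked(&node) == &dev);
    assert(sketch_dev_get_unchecked(&node) == &dev);
    assert(sketch_dev_get_checked(NULL) == NULL);
}
```

A check like this would only hide the symptom. The open question in this report is why the node was detached from the vnode while the vnode was still in use.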

#2 Updated by ftigeot almost 4 years ago

On Sun, Jul 04, 2010 at 10:03:38AM -0700, Matthew Dillon wrote:
> :I have just been bitten by this panic on a 4-core Xeon server running
> :DragonFly 2.6.3/i386.
> :It is usually lightly loaded and mainly used for mail services.
> :..
> :fault virtual address = 0
> : stopped
> :Stopped at devfs_getattr+0x1b: movl 0(%edi),%edx
> :Francois Tigeot
>
> This line of assembly corresponds to the (dev = node->d_dev)
> inside node_sync_dev_get() in vfs/devfs/devfs_vnops.c. node
> is NULL.
>
> 0xc04853f9 <devfs_getattr+27>: mov (%edi),%edx (dev = node->d_dev)
> (but 'node' is NULL so it faults)
> 0xc04853fb <devfs_getattr+29>: test %edx,%edx test dev for NULL
>
>
> It looks like the devfs node is getting ripped out from
> under the vnode but unfortunately there isn't enough information
> to figure out why that happened.

I have been able to reproduce the problem on a test machine.
The keyboard was still functional, but I couldn't get a core dump either:
"call dumpsys" just printed "0xdf4637cc" and I was stuck in the debugger.

#3 Updated by ftigeot almost 4 years ago

On Tue, Jul 06, 2010 at 08:27:42AM +0200, Francois Tigeot wrote:
> On Sun, Jul 04, 2010 at 10:03:38AM -0700, Matthew Dillon wrote:
> > :I have just been bitten by this panic on a 4-core Xeon server running
> > :DragonFly 2.6.3/i386.
> > :It is usually lightly loaded and mainly used for mail services.
> > :..
> > :Stopped at devfs_getattr+0x1b: movl 0(%edi),%edx
> >
> > It looks like the devfs node is getting ripped out from
> > under the vnode but unfortunately there isn't enough information
> > to figure out why that happened.

I've got a core dump, albeit from an x86_64 system.

The files are available here:
http://www.wolfpond.org/crash.dfly/
