Bug #2399

DFBSD v3.1.0.1249.ge27e67 - Panic on lwkt_reltoken from vm_mmap

Added by tuxillo almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Start date: 08/10/2012
Priority: High
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Hi folks,

In an i386 virtual machine with 512MB RAM, the system panics after a few hours of running test/stress/tuxload.
I haven't tried it on x86_64.

Unread portion of the kernel message buffer:
panic: assertion "ref >= &td->td_toks_base && ref->tr_tok == tok" failed in lwkt_reltoken at /home/source/dfbsd/sys/kern/lwkt_token.c:778
cpuid = 1
Trace beginning at frame 0xcd3b9a0c
panic(ffffffff,1,c071e170,cd3b9a40,c8dc0660) at panic+0x1a8 0xc03cfa80
panic(c071e170,c07825f0,c078280b,c07825c4,30a) at panic+0x1a8 0xc03cfa80
lwkt_reltoken(c990fe88,cd3c8690,0,0,cd3b9c74) at lwkt_reltoken+0x5e 0xc03dcf40
vm_mmap(c990fdf0,cd3b9c74,800000,3,7) at vm_mmap+0x6af 0xc05e088e
kern_mmap(c990fdf0,0,800000,3,1,12a,0,0,cd3b9cf0) at kern_mmap+0x448 0xc05e1758
sys_mmap(cd3b9cf0,cd3b9d00,20,0,0) at sys_mmap+0x5a 0xc05e1804
syscall2(cd3b9d40) at syscall2+0x270 0xc06fc52c
Xint0x80_syscall() at Xint0x80_syscall+0x36 0xc06cb466
Debugger("panic")

---------------------------------

(kgdb) bt
#0 _get_mycpu () at ./machine/thread.h:79
#1 md_dumpsys (di=0xc0c22540) at /home/source/dfbsd/sys/platform/pc32/i386/dump_machdep.c:266
#2 0xc03cf22e in dumpsys () at /home/source/dfbsd/sys/kern/kern_shutdown.c:925
#3 0xc01922ca in db_fncall (dummy1=-1066624222, dummy2=0, dummy3=-1072092741, dummy4=0xcd3b989c "\364^l\300\202yr\300") at /home/source/dfbsd/sys/ddb/db_command.c:539
#4 0xc01927af in db_command (aux_cmd_tablep_end=0xc07eec8c, aux_cmd_tablep=0xc07eec70, cmd_table=<optimized out>, last_cmdp=<optimized out>)
at /home/source/dfbsd/sys/ddb/db_command.c:401
#5 db_command_loop () at /home/source/dfbsd/sys/ddb/db_command.c:467
#6 0xc019530e in db_trap (type=3, code=0) at /home/source/dfbsd/sys/ddb/db_trap.c:71
#7 0xc06c9e95 in kdb_trap (type=3, code=0, regs=0xcd3b99bc) at /home/source/dfbsd/sys/platform/pc32/i386/db_interface.c:151
#8 0xc06fbf5a in trap (frame=0xcd3b99bc) at /home/source/dfbsd/sys/platform/pc32/i386/trap.c:844
#9 0xc06cb3b7 in calltrap () at /home/source/dfbsd/sys/platform/pc32/i386/exception.s:787
#10 0xc06c9b22 in breakpoint () at ./cpu/cpufunc.h:72
#11 Debugger (msg=0xc077fb58 "panic") at /home/source/dfbsd/sys/platform/pc32/i386/db_interface.c:333
#12 0xc03cfa95 in panic (fmt=0xc071e170 "assertion \"%s\" failed in %s at %s:%u") at /home/source/dfbsd/sys/kern/kern_shutdown.c:822
#13 0xc03dcf40 in lwkt_reltoken (tok=0xc990fe88) at /home/source/dfbsd/sys/kern/lwkt_token.c:778
#14 0xc05e088e in vm_mmap (map=0xc990fdf0, addr=0xcd3b9c74, size=8388608, prot=7 '\a', maxprot=7 '\a', flags=1, handle=0xcc8675f8, foff=0)
at /home/source/dfbsd/sys/vm/vm_mmap.c:1441
#15 0xc05e1758 in kern_mmap (vms=0xc990fdf0, uaddr=0x0, ulen=8388608, uprot=3, uflags=1, fd=298, upos=0, res=0xcd3b9cf0) at /home/source/dfbsd/sys/vm/vm_mmap.c:400
#16 0xc05e1804 in sys_mmap (uap=0xcd3b9cf0) at /home/source/dfbsd/sys/vm/vm_mmap.c:423
#17 0xc06fc52c in syscall2 (frame=0xcd3b9d40) at /home/source/dfbsd/sys/platform/pc32/i386/trap.c:1334
#18 0xc06cb466 in Xint0x80_syscall () at /home/source/dfbsd/sys/platform/pc32/i386/exception.s:878
#19 0x0000001f in ?? ()

Cores are available in case someone wants to take a look.

Cheers,
Antonio Huete


Related issues

Related to Bug #2336: 3.0.3 catchall Resolved 03/26/2012
Related to Bug #2402: Showstopper panics for Release 3.2 New 08/15/2012

History

#1 Updated by tuxillo almost 2 years ago

  • Status changed from New to In Progress
  • Priority changed from Normal to High

Grab.

No progress yet, but work is ongoing. I am trying to trace token acquires/releases with KTR, as per Matt's advice.

#2 Updated by vsrinivas almost 2 years ago

Per the dump, the thread calling mmap() holds the MP token twice, the tty token, a vmobj token, the vm_token, and three other vmobj tokens. From the code paths, this is not possible; kern_mmap already takes the vm_map's private token, along with vm_mmap. There were no paths taking the mp_token here, and even if so, it should definitely have been released by that point.

#3 Updated by dillon almost 2 years ago

:Issue #2399 has been updated by Venkatesh Srinivas.
:
:Per the dump, the thread calling mmap() holds the MP token twice, the tty token, a vmobj token, the vm_token, and three other vmobj tokens. From the code paths, this is not possible; kern_mmap already takes the vm_map's private token, along with vm_mmap. There were no paths taking the mp_token here, and even if so, it should definitely have been released by that point.
:----------------------------------------
:Bug #2399: DFBSD v3.1.0.1249.ge27e67 - Panic on lwkt_reltoken from vm_mmap
:http://bugs.dragonflybsd.org/issues/2399

Looking at the token array for the thread that crashed the system
won't help you; you'll just get the tokens that were held by the panic
code as it dumped.

The panic code copies the thread's token array to a global array.
Take a look at that:

struct lwkt_tokref panic_tokens[LWKT_MAXTOKENS];
int panic_tokens_count;

-Matt
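
To illustrate why the saved copy matters, here is a minimal user-space sketch of the idea, not the kernel code. The names thread_model, snapshot_tokens, acquire, saved_tokens and MAXTOKENS are hypothetical stand-ins; only panic_tokens and panic_tokens_count come from the comment above. The point is that a snapshot taken at panic time stays unchanged even if later code takes more tokens on the same thread.

```c
#include <assert.h>
#include <string.h>

#define MAXTOKENS 32   /* hypothetical stand-in for LWKT_MAXTOKENS */

/* Simplified stand-ins for an LWKT token and struct lwkt_tokref. */
struct token  { const char *name; };
struct tokref { struct token *tr_tok; };

/* Hypothetical model of a thread's per-thread token array. */
struct thread_model {
    struct tokref toks[MAXTOKENS];
    int ntoks;
};

/* Plays the role that panic_tokens / panic_tokens_count play above. */
static struct tokref saved_tokens[MAXTOKENS];
static int saved_tokens_count;

/* Copy the thread's token array aside before anything else touches it. */
static void snapshot_tokens(const struct thread_model *td)
{
    saved_tokens_count = td->ntoks;
    memcpy(saved_tokens, td->toks, sizeof(td->toks[0]) * (size_t)td->ntoks);
}

/* Record one more held token; returns -1 if the array is full. */
static int acquire(struct thread_model *td, struct token *tok)
{
    if (td->ntoks == MAXTOKENS)
        return -1;
    td->toks[td->ntoks++].tr_tok = tok;
    return 0;
}
```

In this model, any acquire() made after snapshot_tokens() changes the live array but not saved_tokens, which is why the saved copy is the one worth inspecting in a dump.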

#4 Updated by dillon almost 2 years ago

  • Status changed from In Progress to Closed

Antonio's token debugging finally narrowed it down and we found and fixed the sucker. Turned out to be an error path in vm_map_find().
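
The exact defect in vm_map_find() is not described in this report. As a generic illustration only, the sketch below shows one way an error path can unbalance token accounting so that a later release trips a check like the one in the panic message. Everything here (tokstack, tok_get, tok_rel, find_buggy, find_fixed, caller) is a hypothetical user-space model, not DragonFly code.

```c
#include <assert.h>
#include <stddef.h>

#define MAXTOKENS 8   /* hypothetical limit for this model */

struct token    { int id; };
struct tokstack { struct token *held[MAXTOKENS]; int n; };

/* Push a token onto the per-thread stack of held tokens. */
static void tok_get(struct tokstack *s, struct token *t)
{
    assert(s->n < MAXTOKENS);
    s->held[s->n++] = t;
}

/*
 * Release a token. Returns 0 on success, or -1 where a check modeled on
 * "ref >= &td->td_toks_base && ref->tr_tok == tok" would fail: either
 * nothing is held, or the most recently acquired token is not `t`.
 */
static int tok_rel(struct tokstack *s, struct token *t)
{
    if (s->n == 0 || s->held[s->n - 1] != t)
        return -1;
    s->n--;
    return 0;
}

/* Hypothetical callee whose error path drops a token its caller owns. */
static int find_buggy(struct tokstack *s, struct token *map_tok, int fail)
{
    if (fail) {
        (void)tok_rel(s, map_tok);   /* stray release on the error path */
        return -1;
    }
    return 0;
}

/* Same callee with the error path leaving token ownership to the caller. */
static int find_fixed(struct tokstack *s, struct token *map_tok, int fail)
{
    (void)s;
    (void)map_tok;
    return fail ? -1 : 0;
}

/* Caller takes the token, calls the callee, then releases the token. */
static int caller(struct tokstack *s, struct token *map_tok,
                  int fail, int use_fixed)
{
    tok_get(s, map_tok);
    if (use_fixed)
        (void)find_fixed(s, map_tok, fail);
    else
        (void)find_buggy(s, map_tok, fail);
    return tok_rel(s, map_tok);      /* -1 models the assertion firing */
}
```

With the buggy callee, the failure only shows up at the caller's release, far from the code that caused it, which matches the shape of the backtrace above where the assertion fires in lwkt_reltoken called from vm_mmap.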
