DFBSD v126.96.36.1999.ge27e67 - Panic on lwkt_reltoken from vm_mmap
In an i386 virtual machine with 512MB RAM, the system panics after a few hours of running test/stress/tuxload.
I haven't tried it on x86_64.
Unread portion of the kernel message buffer:
panic: assertion "ref >= &td->td_toks_base && ref->tr_tok == tok" failed in lwkt_reltoken at /home/source/dfbsd/sys/kern/lwkt_token.c:778
cpuid = 1
Trace beginning at frame 0xcd3b9a0c
panic(ffffffff,1,c071e170,cd3b9a40,c8dc0660) at panic+0x1a8 0xc03cfa80
panic(c071e170,c07825f0,c078280b,c07825c4,30a) at panic+0x1a8 0xc03cfa80
lwkt_reltoken(c990fe88,cd3c8690,0,0,cd3b9c74) at lwkt_reltoken+0x5e 0xc03dcf40
vm_mmap(c990fdf0,cd3b9c74,800000,3,7) at vm_mmap+0x6af 0xc05e088e
kern_mmap(c990fdf0,0,800000,3,1,12a,0,0,cd3b9cf0) at kern_mmap+0x448 0xc05e1758
sys_mmap(cd3b9cf0,cd3b9d00,20,0,0) at sys_mmap+0x5a 0xc05e1804
syscall2(cd3b9d40) at syscall2+0x270 0xc06fc52c
Xint0x80_syscall() at Xint0x80_syscall+0x36 0xc06cb466
#0 _get_mycpu () at ./machine/thread.h:79
#1 md_dumpsys (di=0xc0c22540) at /home/source/dfbsd/sys/platform/pc32/i386/dump_machdep.c:266
#2 0xc03cf22e in dumpsys () at /home/source/dfbsd/sys/kern/kern_shutdown.c:925
#3 0xc01922ca in db_fncall (dummy1=-1066624222, dummy2=0, dummy3=-1072092741, dummy4=0xcd3b989c "\364^l\300\202yr\300") at /home/source/dfbsd/sys/ddb/db_command.c:539
#4 0xc01927af in db_command (aux_cmd_tablep_end=0xc07eec8c, aux_cmd_tablep=0xc07eec70, cmd_table=<optimized out>, last_cmdp=<optimized out>)
#5 db_command_loop () at /home/source/dfbsd/sys/ddb/db_command.c:467
#6 0xc019530e in db_trap (type=3, code=0) at /home/source/dfbsd/sys/ddb/db_trap.c:71
#7 0xc06c9e95 in kdb_trap (type=3, code=0, regs=0xcd3b99bc) at /home/source/dfbsd/sys/platform/pc32/i386/db_interface.c:151
#8 0xc06fbf5a in trap (frame=0xcd3b99bc) at /home/source/dfbsd/sys/platform/pc32/i386/trap.c:844
#9 0xc06cb3b7 in calltrap () at /home/source/dfbsd/sys/platform/pc32/i386/exception.s:787
#10 0xc06c9b22 in breakpoint () at ./cpu/cpufunc.h:72
#11 Debugger (msg=0xc077fb58 "panic") at /home/source/dfbsd/sys/platform/pc32/i386/db_interface.c:333
#12 0xc03cfa95 in panic (fmt=0xc071e170 "assertion \"%s\" failed in %s at %s:%u") at /home/source/dfbsd/sys/kern/kern_shutdown.c:822
#13 0xc03dcf40 in lwkt_reltoken (tok=0xc990fe88) at /home/source/dfbsd/sys/kern/lwkt_token.c:778
#14 0xc05e088e in vm_mmap (map=0xc990fdf0, addr=0xcd3b9c74, size=8388608, prot=7 '\a', maxprot=7 '\a', flags=1, handle=0xcc8675f8, foff=0)
#15 0xc05e1758 in kern_mmap (vms=0xc990fdf0, uaddr=0x0, ulen=8388608, uprot=3, uflags=1, fd=298, upos=0, res=0xcd3b9cf0) at /home/source/dfbsd/sys/vm/vm_mmap.c:400
#16 0xc05e1804 in sys_mmap (uap=0xcd3b9cf0) at /home/source/dfbsd/sys/vm/vm_mmap.c:423
#17 0xc06fc52c in syscall2 (frame=0xcd3b9d40) at /home/source/dfbsd/sys/platform/pc32/i386/trap.c:1334
#18 0xc06cb466 in Xint0x80_syscall () at /home/source/dfbsd/sys/platform/pc32/i386/exception.s:878
#19 0x0000001f in ?? ()
Cores are available in case someone wants to take a look.
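For anyone reading the assertion in the panic message cold: as far as I can tell, it checks that the newest entry on the thread's token stack exists and refers to the token being released, i.e. tokens have to be released in the reverse order they were acquired. Below is a minimal user-space sketch of that rule, not the kernel code. All `model_*` names and MODEL_MAXTOKENS are made up for illustration; only the `tr_tok` field name is taken from the assertion text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAXTOKENS 8   /* hypothetical limit, not the kernel's value */

struct model_token  { const char *name; };            /* stand-in for a token */
struct model_tokref { struct model_token *tr_tok; };  /* one held-token slot */

struct model_thread {
    struct model_tokref toks[MODEL_MAXTOKENS];  /* stack of held tokens */
    struct model_tokref *toks_stop;             /* one past the newest entry */
};

static void model_thread_init(struct model_thread *td)
{
    td->toks_stop = &td->toks[0];
}

/* Push a token reference; returns false if the stack is full. */
static bool model_gettoken(struct model_thread *td, struct model_token *tok)
{
    if (td->toks_stop == &td->toks[MODEL_MAXTOKENS])
        return false;
    td->toks_stop->tr_tok = tok;
    td->toks_stop++;
    return true;
}

/*
 * Pop the newest reference. Returns false in the cases where the
 * assertion would fire: nothing is held, or the newest entry refers
 * to a different token than the one being released.
 */
static bool model_reltoken(struct model_thread *td, struct model_token *tok)
{
    struct model_tokref *ref;

    if (td->toks_stop == &td->toks[0])
        return false;               /* empty stack */
    ref = td->toks_stop - 1;
    if (ref->tr_tok != tok)
        return false;               /* out-of-order release */
    td->toks_stop = ref;
    return true;
}
```

In this model, acquiring A then B and releasing A first fails, which is the shape of failure the panic reports: the top of the stack in vm_mmap was not the token it tried to release.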
- Status changed from New to In Progress
- Priority changed from Normal to High
No progress yet, but work is ongoing. I am trying to trace token acquires/releases with KTR, as per Matt's advice.
Per the dump, the thread calling mmap() holds the MP token twice, the tty token, a vmobj token, the vm_token, and three other vmobj tokens. Judging from the code paths, this should not be possible: kern_mmap already takes the vm_map's private token, as does vm_mmap. No path here takes the mp_token, and even if one did, it should definitely have been released by that point.
:Issue #2399 has been updated by Venkatesh Srinivas.
Looking at the token array for the thread that crashed the system
won't help you; you'll just get the tokens that were held by the panic
code as it dumped.
The panic code copies the thread's token array to a global array.
Take a look at that instead:
struct lwkt_tokref panic_tokens[LWKT_MAXTOKENS];
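To illustrate why the global copy is the one to trust, here is a small user-space sketch of the idea: snapshot the thread's token array first, then let later code disturb the live array. This is only a model under my own assumptions. The `snap_*` names, SNAP_MAXTOKENS, and the `ntoks` counter are hypothetical, and the real panic path may do the copy differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SNAP_MAXTOKENS 8    /* hypothetical stand-in for LWKT_MAXTOKENS */

struct snap_tokref { const void *tr_tok; };   /* one held-token slot */

struct snap_thread {
    struct snap_tokref toks[SNAP_MAXTOKENS];  /* live per-thread array */
    size_t ntoks;                             /* number of valid entries */
};

/* Global snapshot, analogous in spirit to panic_tokens[]. */
static struct snap_tokref snap_panic_tokens[SNAP_MAXTOKENS];
static size_t snap_panic_count;

/* Copy the live array before any later code can overwrite it. */
static void snap_save_tokens(const struct snap_thread *td)
{
    memcpy(snap_panic_tokens, td->toks, sizeof(td->toks));
    snap_panic_count = td->ntoks;
}

/* Simulate dump-time code reusing the live array for its own tokens. */
static void snap_clobber_live(struct snap_thread *td, const void *tok)
{
    td->toks[0].tr_tok = tok;
    td->ntoks = 1;
}
```

After snap_save_tokens() runs, whatever snap_clobber_live() does to the thread's own array no longer matters: the snapshot still shows what was held at the moment of the panic.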