Bug #2026

race between fill_kinfo_lwp() and lwp_dispose()

Added by y0n3t4n1 over 5 years ago. Updated over 5 years ago.

I caught the following panic under moderate load, after the machine had
survived for a day or two. `disass $rip` shows that the faulting
instruction dereferenced lwp->lwp_proc, which must have been NULL at the
time of the panic, even though it is non-NULL when inspected in the kgdb
session below.

Fatal trap 12: page fault while in kernel mode
cpuid = 1; lapic->id = 01000000
fault virtual address = 0x5c
fault code = supervisor read data, page not present
instruction pointer = 0x8:0xffffffff8029c10c
stack pointer = 0x10:0xffffffe05c1714c0
frame pointer = 0x10:0xffffffe05c1714e8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 0, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 3161
current thread = pri 10
trap number = 12
panic: page fault
cpuid = 1
Trace beginning at frame 0xffffffe05c1711f8
panic() at panic+0x222
panic() at panic+0x222
trap_fatal() at trap_fatal+0x3da
trap_pfault() at trap_pfault+0x15d
trap() at trap+0x421
calltrap() at calltrap+0x8
--- trap 000000000000000c, rip = ffffffff8029c10c, rsp = ffffffe05c1714c0, rbp = ffffffe05c1714e8 ---
fill_kinfo_lwp() at fill_kinfo_lwp+0x1f
(kgdb) bt
#8 0xffffffff804aec3e in calltrap ()
at /usr/src/sys/platform/pc64/x86_64/exception.S:185
#9 0xffffffff8029c10c in fill_kinfo_lwp (lwp=0xffffffe06d2f9c00,
kl=0xffffffe05c171790) at /usr/src/sys/kern/kern_kinfo.c:173
#10 0xffffffff802a3804 in sysctl_out_proc (p=<value optimized out>,
req=0xffffffe05c1719c8, flags=<value optimized out>)
at /usr/src/sys/kern/kern_proc.c:825
#11 0xffffffff802a3b8c in sysctl_kern_proc (oidp=<value optimized out>,
arg1=<value optimized out>, arg2=<value optimized out>,
req=<value optimized out>) at /usr/src/sys/kern/kern_proc.c:954
#12 0xffffffff802c58b0 in sysctl_root (oidp=<value optimized out>,
arg1=<value optimized out>, arg2=3, req=0xffffffe05c1719c8)
at /usr/src/sys/kern/kern_sysctl.c:1202
#13 0xffffffff802c59da in userland_sysctl (name=0xffffffe05c171ac8, namelen=3,
old=<value optimized out>, oldlenp=0x0, inkernel=<value optimized out>,
new=0x0, newlen=0, retval=0xffffffe05c171ac0)
at /usr/src/sys/kern/kern_sysctl.c:1284
#14 0xffffffff802c5d14 in sys___sysctl (uap=0xffffffe05c171b58)
at /usr/src/sys/kern/kern_sysctl.c:1224
#15 0xffffffff804b60f3 in syscall2 (frame=0xffffffe05c171c08)
at /usr/src/sys/platform/pc64/x86_64/trap.c:1182
#16 0xffffffff804aee7f in Xfast_syscall ()
at /usr/src/sys/platform/pc64/x86_64/exception.S:318
(kgdb) fr 9
#9 0xffffffff8029c10c in fill_kinfo_lwp (lwp=0xffffffe06d2f9c00,
kl=0xffffffe05c171790) at /usr/src/sys/kern/kern_kinfo.c:173
173 kl->kl_pid = lwp->lwp_proc->p_pid;
(kgdb) p lwp->lwp_proc->p_pid
$1 = 62417
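
As a rough illustration of why the fault address is 0x5c rather than 0: if
lwp->lwp_proc was NULL when line 173 ran, the load of ->p_pid would touch
NULL plus the offset of p_pid inside struct proc. The sketch below assumes
that offset is 0x5c, which is only inferred from the trap output and has not
been checked against the real struct proc. The names proc_sketch, lwp_sketch
and p_pid_load_address are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct proc. Only the position of p_pid
 * matters here; the 0x5c bytes of padding are an assumption chosen to
 * match the reported fault address, not the real kernel layout.
 */
struct proc_sketch {
	char	pad[0x5c];
	int32_t	p_pid;
};

/* Hypothetical stand-in for struct lwp, reduced to the one field used. */
struct lwp_sketch {
	struct proc_sketch *lwp_proc;
};

/* Compile-time check that the sketch layout puts p_pid at 0x5c. */
static_assert(offsetof(struct proc_sketch, p_pid) == 0x5c,
    "p_pid is expected at offset 0x5c in this sketch");

/*
 * Compute the address that a load of lwp->lwp_proc->p_pid would touch,
 * without actually dereferencing lwp_proc.
 */
static uintptr_t
p_pid_load_address(const struct lwp_sketch *lwp)
{
	return (uintptr_t)lwp->lwp_proc +
	    offsetof(struct proc_sketch, p_pid);
}
```

With lwp_proc set to NULL, p_pid_load_address() yields 0x5c under these
assumptions, which matches "fault virtual address = 0x5c" above. That would
fit lwp_proc being cleared on another CPU around the time of the load and
holding a valid pointer again by the time the dump was examined.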


#1 Updated by y0n3t4n1 over 5 years ago

This may have been fixed by one or more of the recent changes; I haven't
been able to reproduce it for about a month now.
