Bug #2353: panic: assertion "gd->gd_spinlocks_wr == 0" failed in bsd4_schedulerclock - DragonFlyBSD - DragonFlyBSD bugtracker

Bug #2353

open

panic: assertion "gd->gd_spinlocks_wr == 0" failed in bsd4_schedulerclock

Added by jaydg over 13 years ago. Updated over 12 years ago.

Status: In Progress
Priority: Normal
Assignee: alexh
Category:
Target version: 6.4
Start date: 04/27/2012
Due date:
% Done:
Estimated time:

Description

Version String: DragonFly v3.1.0.322.gc231b-DEVELOPMENT #0: Fri Mar 16 20:31:33 CET 2012

#0 _get_mycpu () at ./machine/thread.h:69
#1 md_dumpsys (di=<optimized out>) at /usr/src/sys/platform/pc64/x86_64/dump_machdep.c:263
#2 0xffffffff802fb2d2 in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:937
#3 0xffffffff802fb936 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:387
#4 0xffffffff802fbbed in panic (fmt=0xffffffff80585cf0 "assertion \"%s\" failed in %s at %s:%u") at /usr/src/sys/kern/kern_shutdown.c:843
#5 0xffffffff803036fb in bsd4_schedulerclock (lp=<optimized out>, period=<optimized out>, cpstamp=<optimized out>) at /usr/src/sys/kern/usched_bsd4.c:622
#6 0xffffffff802db16c in schedclock (info=0xffffffff820106e8, in_ipi=<optimized out>, frame=<optimized out>) at /usr/src/sys/kern/kern_clock.c:776
#7 0xffffffff802f9e03 in systimer_intr (timep=<optimized out>, in_ipi=<optimized out>, frame=0x0) at /usr/src/sys/kern/kern_systimer.c:102
#8 0xffffffff8054ad04 in lapic_timer_process_oncpu (gd=0xffffffff8200f000, frame=0x0) at /usr/src/sys/platform/pc64/apic/lapic.c:336
#9 0xffffffff8054ad3c in lapic_timer_process () at /usr/src/sys/platform/pc64/apic/lapic.c:342
#10 0xffffffff80540d69 in splz_timer () at /usr/src/sys/platform/pc64/x86_64/ipl.s:461
#11 0x0000000000000000 in ?? ()

Before dumping, several text blocks like this one were emitted:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic->id = 00000000
fault virtual address = 0x20
fault code = supervisor write data, page not present
instruction pointer = 0x8:0xffffffff802ec7fc
stack pointer = 0x10:0xffffffe0a8b4a980
frame pointer = 0x10:0xffffffe0a8b4a9c8
code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 0, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 8139
current thread = pri 6 (CRIT)
kernel: type 12 trap, code=2

A dump is available.

Related issues 4 (1 open — 3 closed)

Updated by vsrinivas over 13 years ago

If you could upload the dump to leaf or someplace accessible, that would be very useful!

Thanks,
-- vs;

Updated by jaydg over 13 years ago

I've uploaded the dump on leaf, ~jaydg/crash/2353.

Updated by vsrinivas over 13 years ago

Some preliminary debugging:

We were in the 'moused' thread when we were interrupted or otherwise ended up calling splz. We were probably in a critical section; crit_exit can call splz() itself.

splz
    ++critcount from splz itself (exp critcount=1)
    splz_timer
        lapic_timer_process
            lapic_timer_process_oncpu
                systimer_intr
                    ++crit_enter from systimer_intr        (exp.cc=2)
                    ++gd_syst_next from systimer_intr
                    --crit_enter from systimer_intr        (exp.cc=1)
                    schedclock [first systimer]
                        lp = 'moused thread'
                        bsd4_schedulerclock
                            on CPU0; below rrinterval, no need_user_resched
                                                [exp.critcount=1, found cc=4]
                    (pollclock)
                    (emergency_intr_timer_callback)
                    (hardclock)
                    (statclock)
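
To make the bookkeeping in the trace above explicit, here is a small userland sketch in C that replays the expected enter/exit sequence. It is only a model: `struct model_thread`, `model_crit_enter`, `model_crit_exit` and the two helper functions are hypothetical names invented for illustration, not kernel APIs, and the sketch assumes the count starts at 0 before splz runs.

```c
#include <assert.h>

/* Hypothetical model of a per-thread critical-section counter (not the kernel's struct). */
struct model_thread {
    int critcount;
};

static void model_crit_enter(struct model_thread *td) { td->critcount++; }
static void model_crit_exit(struct model_thread *td)  { td->critcount--; }

/*
 * Replays the sequence from the trace and returns the count we would
 * expect to see on entry to bsd4_schedulerclock.
 * Assumption: the count is 0 before splz is entered.
 */
static int expected_critcount_at_schedclock(void)
{
    struct model_thread td = { 0 };

    model_crit_enter(&td);  /* splz bumps the count itself:        1 */
    model_crit_enter(&td);  /* systimer_intr enters:               2 */
    model_crit_exit(&td);   /* systimer_intr exits before calling
                               the systimer callback:              1 */
    return td.critcount;
}

/* Number of nesting levels the trace does not account for. */
static int unexplained_levels(int found)
{
    return found - expected_critcount_at_schedclock();
}
```

With the value reported in the trace, `unexplained_levels(4)` gives the number of critical-section levels that the expected path does not explain, i.e. levels that something else on this CPU must still be holding.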

Updated by vsrinivas about 13 years ago

Okay,

The problem is that we hold an MTX spinlock while attempting to go to sleep. The specific callpath and problem is that we hold the syscons MTX spinlock at :769, :771 of sys/dev/misc/syscons/syscons.c, around a device ioctl routine which may explicitly tsleep. The specific tsleep in question is via sysmouse_event, ultimately hitting kern_kevent and sleeping in kqueue.

First, why are we using MTX spinlocks at all?

Second, it is probably inappropriate to hold an MTX spinlock around the entire ioctl path here. The path acquires the tty_token, among many other things, so the chances of blocking are high.

What is it synchronizing that the tty_token is not?
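
As a rough illustration of the pattern described above, the following userland C sketch models a lock held across a call that may block, next to one possible restructuring that releases the lock first. Everything here is hypothetical: `struct model_globaldata` and the `model_*` and `ioctl_*` functions are stand-ins invented for this sketch, not the syscons code or the real spinlock API. The only thing borrowed from the report is the `gd_spinlocks_wr == 0` condition that the scheduler clock asserts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU globaldata; only the counter matters here. */
struct model_globaldata {
    int gd_spinlocks_wr;    /* spinlocks currently held on this CPU */
};

static void model_spin_lock(struct model_globaldata *gd)   { gd->gd_spinlocks_wr++; }
static void model_spin_unlock(struct model_globaldata *gd) { gd->gd_spinlocks_wr--; }

/* Models the scheduler clock check: true if the assertion would hold. */
static bool model_schedulerclock_ok(const struct model_globaldata *gd)
{
    return gd->gd_spinlocks_wr == 0;
}

/*
 * Models a handler that may block. Assumption for this sketch: the
 * scheduler clock gets to run at the point where the handler would sleep.
 */
static bool model_blocking_ioctl(const struct model_globaldata *gd)
{
    return model_schedulerclock_ok(gd);
}

/* Problem pattern: the lock is held around the whole (possibly blocking) call. */
static bool ioctl_lock_held_across_sleep(struct model_globaldata *gd)
{
    bool ok;

    model_spin_lock(gd);
    ok = model_blocking_ioctl(gd);   /* clock check sees a held spinlock */
    model_spin_unlock(gd);
    return ok;
}

/* One possible restructuring: release the lock before anything that may block. */
static bool ioctl_lock_dropped_before_sleep(struct model_globaldata *gd)
{
    model_spin_lock(gd);
    /* ... touch only the state the lock actually protects ... */
    model_spin_unlock(gd);
    return model_blocking_ioctl(gd); /* clock check sees no held spinlock */
}
```

In this model the first variant trips the check and the second does not. Whether narrowing the locked region is the right fix for syscons depends on the answer to the question above, namely what the lock protects that the tty_token does not.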

Updated by alexh over 12 years ago

Status changed from New to In Progress
Assignee set to alexh
