Bug #1024

closed

panic: assertion: dd->uschedcp != lp in bsd4_resetpriority

Added by qhwt+dfly almost 16 years ago. Updated almost 16 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Description

Hi.

I have hit this panic several times recently, but only on an SMP kernel
running under VMware Fusion with dual CPU support (it does not happen when
the guest OS is given a single active processor). It occurs under load,
for instance when building the kernel or world. I don't remember when it
started, so it may or may not be VMware's problem; I still need to try
it on real SMP hardware. Anyway, ...

(kgdb) bt
#0 dumpsys () at ./machine/thread.h:83
#1 0xc02bc881 in boot (howto=256)
at /dfsrc/current/sys/kern/kern_shutdown.c:375
#2 0xc02bcb44 in panic (fmt=0xc0507fb7 "assertion: %s in %s")
at /dfsrc/current/sys/kern/kern_shutdown.c:800
#3 0xc02c2515 in bsd4_resetpriority (lp=0xcadf8900)
at /dfsrc/current/sys/kern/usched_bsd4.c:792
#4 0xc02c26f8 in bsd4_recalculate_estcpu (lp=0xcadf8900)
at /dfsrc/current/sys/kern/usched_bsd4.c:701
#5 0xc02ca0ef in schedcpu_stats (p=0xcae6f858, data=0x0)
at /dfsrc/current/sys/kern/kern_synch.c:212
#6 0xc02b68e6 in allproc_scan (callback=0xc02ca099 <schedcpu_stats>, data=0x0)
at /dfsrc/current/sys/kern/kern_proc.c:533
#7 0xc02c9d52 in schedcpu (arg=0x0)
at /dfsrc/current/sys/kern/kern_synch.c:186
#8 0xc02cce3a in softclock_handler (arg=0xc060c7e0)
at /dfsrc/current/sys/kern/kern_timeout.c:308
#9 0xc02c450b in lwkt_deschedule_self (td=Cannot access memory at address 0x8
)
at /dfsrc/current/sys/kern/lwkt_thread.c:223

At first I thought that another CPU had modified dd->uschedcp after this
CPU unlocked bsd4_spin but before the KKASSERT, so I tried to defer
spin_unlock_wr() as done in bsd4_setrunqueue():

%%%
diff --git a/sys/kern/usched_bsd4.c b/sys/kern/usched_bsd4.c
index b934e3d..e3478a0 100644
--- a/sys/kern/usched_bsd4.c
+++ b/sys/kern/usched_bsd4.c
@@ -779,7 +779,6 @@ bsd4_resetpriority(struct lwp *lp)
 		lp->lwp_priority = newpriority;
 		reschedcpu = 1;
 	}
-	spin_unlock_wr(&bsd4_spin);
 
 	/*
 	 * Determine if we need to reschedule the target cpu.  This only
@@ -789,9 +788,14 @@ bsd4_resetpriority(struct lwp *lp)
 	 */
 	if (reschedcpu >= 0) {
 		dd = &bsd4_pcpu[reschedcpu];
-		KKASSERT(dd->uschedcp != lp);
+		if (dd->uschedcp == lp) {
+			kprintf("%p(%d): dd->uschedcp=lp=%p\n",
+				curthread, mycpu->gd_cpuid, lp);
+			goto out;
+		}
 		if ((dd->upri & ~PRIMASK) > (lp->lwp_priority & ~PRIMASK)) {
 			dd->upri = lp->lwp_priority;
+			spin_unlock_wr(&bsd4_spin);
 #ifdef SMP
 			if (reschedcpu == mycpu->gd_cpuid) {
 				need_user_resched();
@@ -802,8 +806,12 @@ bsd4_resetpriority(struct lwp *lp)
 #else
 			need_user_resched();
 #endif
+			crit_exit();
+			return;
 		}
 	}
+out:
+	spin_unlock_wr(&bsd4_spin);
 	crit_exit();
 }

%%%

This seemed to stop the assertion from firing, but adding debugging code
(like kprintf() or a reference to mycpu) also seemed to avoid the assertion
(or at least made it difficult to reproduce), so I'm not 100% sure this
is enough. The other places that manipulate bsd4_pcpu[] are:

bsd4_acquire_curproc:252: only when dd->uschedcp == NULL
bsd4_release_curproc:321: only when dd->uschedcp == lp
bsd4_setrunqueue:463: only when gd == mycpu
bsd4_schedulerclock:581: only modifier of rrcount

None of these seem to need spinlocks (correct me if I'm wrong).
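
To make the locking idea in the patch easier to discuss, here is a minimal user-space sketch of it, not kernel code. Everything in it is a hypothetical stand-in: `struct lwp_model`, `struct pcpu_model`, `sched_lock` (playing the role of bsd4_spin) and `reset_priority_model()` are made-up names, and the priority masking from the real function is left out. It only illustrates that the check of `uschedcp` and the update of `upri` happen while the lock is still held, with the unlock deferred to the exit path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPU_MODEL 2

/* Hypothetical stand-ins for struct lwp and one bsd4_pcpu[] entry. */
struct lwp_model  { int priority; };
struct pcpu_model { struct lwp_model *uschedcp; int upri; };

/* A tiny test-and-set spinlock, standing in for bsd4_spin. */
static atomic_flag sched_lock = ATOMIC_FLAG_INIT;
static struct pcpu_model pcpu_model[NCPU_MODEL];

static void
model_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&sched_lock,
	    memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void
model_unlock(void)
{
	atomic_flag_clear_explicit(&sched_lock, memory_order_release);
}

/*
 * Set lp's priority and decide whether the given cpu needs a reschedule.
 * Returns 1 if a reschedule would be requested, 0 if nothing needs to be
 * done, and -1 if lp is already the current user process on that cpu
 * (the case the original KKASSERT tripped on).
 */
static int
reset_priority_model(struct lwp_model *lp, int cpu, int newpri)
{
	struct pcpu_model *dd;
	int result = 0;

	assert(cpu >= 0 && cpu < NCPU_MODEL);

	model_lock();
	lp->priority = newpri;
	dd = &pcpu_model[cpu];
	if (dd->uschedcp == lp) {
		result = -1;		/* checked while still locked */
	} else if (dd->upri > lp->priority) {
		dd->upri = lp->priority;
		result = 1;
	}
	model_unlock();			/* unlock deferred to the exit path */
	return result;
}
```

Whether holding the lock this long is acceptable in the real scheduler is a separate question; the sketch only shows the ordering of check and unlock.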

Cheers.
