Bug #1024


panic: assertion: dd->uschedcp != lp in bsd4_resetpriority

Added by qhwt+dfly almost 16 years ago. Updated almost 16 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: -
Due date: -
% Done: 0%
Estimated time: -
Description

Hi.

I've been seeing this panic several times recently, but only on an SMP
kernel running on VMware Fusion with dual-CPU support (not when the
number of active processors available to the guest OS is 1). It occurs
under load, for instance when building the kernel or world. I don't
remember when this started, so it may or may not be VMware's problem.
I need to try it on real SMP hardware. Anyway, ...

(kgdb) bt
#0 dumpsys () at ./machine/thread.h:83
#1 0xc02bc881 in boot (howto=256)
at /dfsrc/current/sys/kern/kern_shutdown.c:375
#2 0xc02bcb44 in panic (fmt=0xc0507fb7 "assertion: %s in %s")
at /dfsrc/current/sys/kern/kern_shutdown.c:800
#3 0xc02c2515 in bsd4_resetpriority (lp=0xcadf8900)
at /dfsrc/current/sys/kern/usched_bsd4.c:792
#4 0xc02c26f8 in bsd4_recalculate_estcpu (lp=0xcadf8900)
at /dfsrc/current/sys/kern/usched_bsd4.c:701
#5 0xc02ca0ef in schedcpu_stats (p=0xcae6f858, data=0x0)
at /dfsrc/current/sys/kern/kern_synch.c:212
#6 0xc02b68e6 in allproc_scan (callback=0xc02ca099 <schedcpu_stats>, data=0x0)
at /dfsrc/current/sys/kern/kern_proc.c:533
#7 0xc02c9d52 in schedcpu (arg=0x0)
at /dfsrc/current/sys/kern/kern_synch.c:186
#8 0xc02cce3a in softclock_handler (arg=0xc060c7e0)
at /dfsrc/current/sys/kern/kern_timeout.c:308
#9 0xc02c450b in lwkt_deschedule_self (td=Cannot access memory at address 0x8)
at /dfsrc/current/sys/kern/lwkt_thread.c:223

At first I thought that another CPU had modified dd->uschedcp just after
this CPU unlocked bsd4_spin but before the KKASSERT ran, so I tried to
defer spin_unlock_wr() as is done in bsd4_setrunqueue():

%%%
diff --git a/sys/kern/usched_bsd4.c b/sys/kern/usched_bsd4.c
index b934e3d..e3478a0 100644
--- a/sys/kern/usched_bsd4.c
+++ b/sys/kern/usched_bsd4.c
@@ -779,7 +779,6 @@ bsd4_resetpriority(struct lwp *lp)
 		lp->lwp_priority = newpriority;
 		reschedcpu = 1;
 	}
-	spin_unlock_wr(&bsd4_spin);
 
 	/*
 	 * Determine if we need to reschedule the target cpu.  This only
@@ -789,9 +788,14 @@ bsd4_resetpriority(struct lwp *lp)
 	 */
 	if (reschedcpu >= 0) {
 		dd = &bsd4_pcpu[reschedcpu];
-		KKASSERT(dd->uschedcp != lp);
+		if (dd->uschedcp == lp) {
+			kprintf("%p(%d): dd->uschedcp=lp=%p\n",
+				curthread, mycpu->gd_cpuid, lp);
+			goto out;
+		}
 		if ((dd->upri & ~PRIMASK) > (lp->lwp_priority & ~PRIMASK)) {
 			dd->upri = lp->lwp_priority;
+			spin_unlock_wr(&bsd4_spin);
 #ifdef SMP
 			if (reschedcpu == mycpu->gd_cpuid) {
 				need_user_resched();
@@ -802,8 +806,12 @@ bsd4_resetpriority(struct lwp *lp)
 #else
 			need_user_resched();
 #endif
+			crit_exit();
+			return;
 		}
 	}
+out:
+	spin_unlock_wr(&bsd4_spin);
 	crit_exit();
 }

%%%

This seemed to make the assertion go away, but adding debugging code
(like kprintf() or mycpu) also seemed to avoid the assertion
(or made it difficult to reproduce), so I'm not 100% sure this
is enough, but other places manipulating bsd4_pcpu[] include:

bsd4_acquire_curproc:252: only when dd->uschedcp == NULL
bsd4_release_curproc:321: only when dd->uschedcp == lp
bsd4_setrunqueue:463: only when gd == mycpu
bsd4_schedulerclock:581: only modifier of rrcount

which don't seem to need spinlocks (maybe; correct me if I'm wrong).

Cheers.

Actions #1

Updated by dillon almost 16 years ago

:Hi.
:
:I'm seeing this panic several times recently, only on an SMP kernel
:running on VMware Fusion, with dual CPU support (not when the
:number of active processors available to guest OS is 1). It occurs
:under load, for instance when building kernel or world. I don't remember
:when this started, so it may or may not be VMware's problem. I need to
:try it on a real SMP hardware. Anyway, ...
:...
:
:At first I thought that other CPU has just modified after this CPU
:has unlocked bsd4_spin but before KKASSERT, so I tried to defer
:spin_unlock_wr() as done in bsd4_setrunqueue():
:...
:This seemed to cease the assertion, but adding debugging stuff
:(like kprintf() or mycpu) also seemed to avoid the assertion
:(or made it difficult to reproduce), so I'm not 100% sure this
:is enough, but other places manipulating bsd4_pcpu[] include:
:
: bsd4_acquire_curproc:252: only when dd->uschedcp == NULL
: bsd4_release_curproc:321: only when dd->uschedcp == lp
: bsd4_setrunqueue:463: only when gd == mycpu
: bsd4_schedulerclock:581: only modifier of rrcount
:
:which don't seem to need spinlocks (maybe, correct me if I'm wrong).
:
:Cheers.

I think you basically found the bug, and it does make sense that
kprintf's could disrupt the race that causes it, because what you found
is a true 'SMP' race between two cpus.

Almost anything related to the globals (around line 146) is supposed to
be spin-locked. Numerous elements related to the per-cpu data
(bsd4_pcpu[]), when operating on the current cpu, should not require
spin locking. That is the idea, anyway.

Setting and NULLing dd->uschedcp in the per-cpu data, for the current
cpu, is not intended to need a spin lock. Similarly, the per-cpu
rrcount field is a heuristic only used by the current cpu and does not
need a spin lock.

Scanning the global scheduler queue is intended to require a spinlock,
since all the cpus are 'pulling' from the same global queue.

The code where you found the bug appears to be bsd4_resetpriority() when
called on a 'random' lwp instead of the 'current' lwp on the calling
cpu, that is, when the softclock does the allproc_scan(). That
scan only occurs on one cpu, with the MP lock held (I think), but can
race against manipulation of the per-cpu scheduler data (dd->*) on other
cpus. Such manipulation can occur on other cpus without the MP lock held
and without the spinlock held.

Because the LWP in the failing path (the one doing the allproc_scan) is
potentially owned by some other cpu, that KKASSERT() can race another
cpu and cause a panic. The bug is actually the presence of the KKASSERT
and not the manipulation of the spin lock.

Moving the spinlock thus might not completely solve the problem.

I think all that needs to be done here is to remove the KKASSERT().  The
worst that happens is that a cpu gets an extra need_resched IPI (which
does no harm), or dd->upri winds up with the wrong value, which should
also do no real harm. I believe the KKASSERT itself is wrong. Instead,
augment the code comment above that section to indicate that the
dd_uschedcp variable can be ripped out from under the code and cause a
spurious need_user_resched() on the target cpu and cause dd->upri to be
wrong for a short period of time.

-Matt
Matthew Dillon
Actions #2

Updated by qhwt+dfly almost 16 years ago

On Thu, May 29, 2008 at 10:27:08AM -0700, Matthew Dillon wrote:
...

The code where you found the bug appears to be bsd4_resetpriority() when
called on a 'random' lwp instead of the 'current' lwp on the calling
cpu, that is, when the softclock does the allproc_scan(). That
scan only occurs on one cpu, with the MP lock held (I think), but can
race against manipulation of the per-cpu scheduler data (dd->*) on other
cpus. Such manipulation can occur on other cpus without the MP lock held
and without the spinlock held.

Because the LWP in the failing path (the one doing the allproc_scan) is
potentially owned by some other cpu, that KKASSERT can race another
cpu and cause a panic. The bug is actually the presence of the KKASSERT
and not the manipulation of the spin lock.

Moving the spinlock thus might not completely solve the problem.

I think all that needs to be done here is to remove the KKASSERT. The
worst that happens is that a cpu gets an extra need_resched IPI (which
does no harm), or dd->upri winds up with the wrong value, which should
also do no real harm. I believe the KKASSERT itself is wrong. Instead,
augment the code comment above that section to indicate that the
dd_uschedcp variable can be ripped out from under the code and cause a
spurious need_user_resched() on the target cpu and cause dd->upri to be
wrong for a short period of time.

... assuming that I'm expected to augment the comment, does this look OK?

  • Remove the KKASSERT from a code block where the calling CPU does not
    always own the LWP, which occasionally leads to a panic due to a race
    between CPUs.

--- usched_bsd4.c.orig	2008-05-15 16:15:08.000000000 +0900
+++ usched_bsd4.c	2008-06-18 17:54:07.000000000 +0900
@@ -786,10 +786,16 @@
 	 * occurs if the LWP is already on a scheduler queue, which means
 	 * that idle cpu notification has already occured.  At most we
 	 * need only issue a need_user_resched() on the appropriate cpu.
+	 *
+	 * The LWP may be owned by a CPU different from the current one,
+	 * in which case dd->uschedcp may be modified without an MP lock
+	 * or a spinlock held.  The worst that happens is that the code
+	 * below causes a spurious need_user_resched() on the target CPU
+	 * and dd->upri to be wrong for a short period of time, both of
+	 * which are harmless.
 	 */
 	if (reschedcpu >= 0) {
 		dd = &bsd4_pcpu[reschedcpu];
-		KKASSERT(dd->uschedcp != lp);
 		if ((dd->upri & ~PRIMASK) > (lp->lwp_priority & ~PRIMASK)) {
 			dd->upri = lp->lwp_priority;
 #ifdef SMP

Thanks.

Actions #3

Updated by dillon almost 16 years ago

:... assuming that I'm expected to augment the comment, does this look OK?
:

Looks fine.
-Matt
Actions #4

Updated by qhwt+dfly almost 16 years ago

Committed.
