Bug #714

closed

SMP kernel panic at boot: assertion: ((int)sr->sysid ..

Added by thomas.nikolajsen almost 17 years ago. Updated almost 17 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Description

Using HEAD I get this panic on every boot with an SMP kernel;
this is on a Pentium 4 w/ HTT.
A few months ago I had no problem using the same KERNCONF on this host.
The same KERNCONF works on AMD64 (one core).

The commit below introduced the KKASSERT causing the panic:
http://leaf.dragonflybsd.org/mailarchive/commits/2007-04/msg00211.html

-thomas

Files

Actions #1

Updated by dillon almost 17 years ago

Whoa.  That's really odd, it shouldn't be possible for that to happen.
Are you sure you have the latest HEAD?

When it panics, please do this from the db> prompt:

    print *ncpus_fit
    print *ncpus_fit_mask

-Matt
Actions #2

Updated by thomas.nikolajsen almost 17 years ago

Well, it is a full build of HEAD from June 27th.
I updated the source & rebuilt today, but that didn't change the panic.

Anyway, I guess it was a good idea that you put in the ASSERT :)

thomas

db> print *ncpus_fit
2
db> print *ncpus_fit_mask
1
Actions #3

Updated by thomas.nikolajsen almost 17 years ago

I looked into this by sprinkling kprintf's around:
gd_sysid_alloc==1 and gd_cpuid==0 when sysres_init, which panics, is called.

It turns out that the problem is that ncpus isn't fixed for SMP:
it's initialized to 1 and later changed to the number of CPUs
(in /sys/platform/pc32/i386/mp_machdep.c).

sysref_ctor is called once while ncpus==1.

A simple fix (hack?) is to initialize ncpus to MAXCPU,
and set ncpus* accordingly.

-thomas
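
A minimal user-space sketch of the failure described above, not the kernel code itself. It assumes each CPU's counter starts at its own cpuid and is advanced by the current stride, and that the assertion checks that the low bits of a sysid equal the allocating CPU's id. The names fake_gd, alloc_sysid, sysid_matches_cpu and replay_early_allocation are hypothetical; only gd_cpuid, gd_sysid_alloc and the values 2 and 1 for ncpus_fit and ncpus_fit_mask come from the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the two per-CPU fields named in the thread. */
struct fake_gd {
    int  gd_cpuid;        /* index of this CPU */
    long gd_sysid_alloc;  /* last sysid handed out on this CPU */
};

/* Advance this CPU's counter by the given stride and return the new id. */
long alloc_sysid(struct fake_gd *gd, int stride)
{
    assert(stride > 0);
    gd->gd_sysid_alloc += stride;
    return gd->gd_sysid_alloc;
}

/* Assumed invariant: the low bits of a sysid identify the allocating CPU. */
int sysid_matches_cpu(const struct fake_gd *gd, long sysid, int fit_mask)
{
    return ((int)sysid & fit_mask) == gd->gd_cpuid;
}

/*
 * Replay the reported sequence on CPU 0: one allocation while the stride
 * is still 1 (ncpus == 1), then a check against the SMP mask of 1.
 * Returns 1 if the invariant holds, 0 if the assertion would fire.
 */
int replay_early_allocation(void)
{
    struct fake_gd cpu0 = { 0, 0 };
    long sysid = alloc_sysid(&cpu0, 1);  /* early call, stride still 1 */

    printf("sysid=%ld cpuid=%d\n", sysid, cpu0.gd_cpuid);
    return sysid_matches_cpu(&cpu0, sysid, 1);
}
```

Under these assumptions replay_early_allocation() returns 0: the early call leaves the counter at 1, and 1 & 1 does not equal cpuid 0. Later allocations with stride 2 keep the counter odd, so the mismatch persists once it has been introduced.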
Actions #4

Updated by dillon almost 17 years ago

Ok.  Let's find out where this is.  Add a conditional that
checks for ncpus == 1 in sysref_ctor() and calls db_print_backtrace()
along with your kprintf.  Tell me what it says!  e.g.

    sysref_ctor(...)
    {
        if (ncpus == 1) {
            kprintf("ncpus is one!!!!\n");
            db_print_backtrace();
            /* Debugger("blah"); OPTIONAL (continue booting with 'cont') */
        }
    }

I'm still coming up blank.  I added a check for ncpus == 1 in
sysref_ctor() on HEAD on my test box and it never gets hit.  Maybe
your cvs repository is out of date or something... try cvsup'ing
directly from the master site or maybe even clean it all out and
cvsup a fresh copy from the master site.

-Matt
Actions #5

Updated by thomas.nikolajsen almost 17 years ago

DDB trace uploaded in file .3; symbols weren't set up yet (early in boot),
so I used 'nm -n' on the kernel and added symbols to the trace myself.
perfmon popped up; after removing that from my KERNCONF the problem isn't seen.

For a solution we could just use MAXCPU rounded up to the nearest power of 2
as the value added to gd_sysref_alloc to generate a new sysid (instead of
adding ncpus).

It will give fewer sysids than the current scheme when ncpus is much smaller
than MAXCPU, but MAXCPU has to be supported anyway.

-thomas
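
The fixed-stride idea above can be sketched as follows. This is an illustration only, assuming a made-up MAXCPU value and hypothetical helper names (MAXCPU_EXAMPLE, round_up_pow2, alloc_sysid_fixed, low_bits_match); it is not the fix that was committed.

```c
#include <assert.h>
#include <limits.h>

#define MAXCPU_EXAMPLE 16u  /* hypothetical value, for illustration only */

/* Round n up to the nearest power of two (n must be at least 1). */
unsigned round_up_pow2(unsigned n)
{
    unsigned p = 1;

    assert(n >= 1 && n <= UINT_MAX / 2 + 1);
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Advance a per-CPU counter by a stride that never changes during boot.
 * The counter is assumed to start at the CPU's own id.
 */
long alloc_sysid_fixed(long *counter)
{
    *counter += (long)round_up_pow2(MAXCPU_EXAMPLE);
    return *counter;
}

/* Check that the low bits of a sysid still encode the allocating CPU. */
int low_bits_match(long sysid, int cpuid)
{
    long mask = (long)round_up_pow2(MAXCPU_EXAMPLE) - 1;

    return (int)(sysid & mask) == cpuid;
}
```

Because the stride no longer depends on ncpus, an allocation made before the CPU count is known keeps the same low bits as one made afterwards. The cost, as noted above, is that each CPU gets fewer ids when far fewer than MAXCPU CPUs are present.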
Actions #6

Updated by dillon almost 17 years ago

This is easy. It's due to PERFMON trying to initialize its devices
way, way too early.
Fix coming up in a sec.
-Matt
Actions #7

Updated by thomas.nikolajsen almost 17 years ago

Ah, nice.

Btw: I did check my source tree for bit rot; didn't find any.
I compared it with a fresh checkout from my local cvs repo and from the primary.

-thomas