Bug #1634

open

panic: spin_lock: 0xe4ad1320, indefinitive wait!

Added by elekktretterr over 14 years ago. Updated over 9 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Kernel
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Our DragonFly box panicked last night. It runs master from last week.

I wrote down some text from the panic:

mp_lock = 00000002; cpuid = 2

Trace beginning at frame 0xe42d3b88

panic(...)
panic(...)
exponential_backoff(ff010000, 1 ,3c, e4ad1320, c727dbb)
spin_lock_wr_contested(e4ad1320, 0000000)
crfree(e4ad12b0,2, e42f3cc4, c00ef73, e42f3c00)
nlookup_done(e42f3c00, e4960720, c0304fe3, 0, c0384b5c)
sys_lstat(e32f3cf0,6,0,0dd9fce18)
syscall2(e42f3d440)

Whoever can understand this, can you tell me what happened and how can
this be fixed?

Also, is there any way to automate this: when panic occurs, automatically
start a memory dump process and then automatically reboot? I've been going
to the data center way too many times recently to investigate why the box
has crashed and it would be very helpful.

Petr

Actions #1

Updated by swildner over 14 years ago

On 22.12.2009 02:17, wrote:

Also, is there any way to automate this: when panic occurs, automatically
start a memory dump process and then automatically reboot? I've been going
to the data center way too many times recently to investigate why the box
has crashed and it would be very helpful.

Check out the DDB_UNATTENDED kernel option.

Sascha

Actions #2

Updated by ahuete.devel over 14 years ago

Hi Petr,

Could you please upload the dump to somewhere where we can grab it?

Thanks,
Antonio Huete

Actions #3

Updated by elekktretterr over 14 years ago

Hi Petr,

Could you please upload the dump to somewhere where we can grab it?

Hi Antonio,

Unfortunately I don't have a dump. When I got to the server and plugged in
a keyboard, it wasn't responding :(

Actions #4

Updated by qhwt+dfly over 14 years ago

On Tue, Dec 22, 2009 at 12:17:49PM +1100, wrote:

Our DragonFly box panicked last night. It runs master from last week.

The output from `uname -v' contains the version of the source tree
you compiled the kernel from. It might be helpful to determine
if the problem has already been solved in the newer versions.

Cheers.

Actions #5

Updated by dillon over 14 years ago

In the release code the spinlock is very short duration. All I
can think of is that there might be an MP race somewhere. There
are a few places in the release kernel where p_ucred is used as
if it were MPSAFE when, in fact, it was not MPSAFE.

Well, that spin lock was removed from master and the cred handling
code was rewritten.
-Matt
Matthew Dillon

Actions #6

Updated by elekktretterr over 14 years ago

The output from `uname -v' contains the version of the source tree
you compiled the kernel from. It might be helpful to determine
if the problem has already been solved in the newer versions.

It's master from last Wednesday, I believe. A couple of days later there
was this commit:

http://leaf.dragonflybsd.org/mailarchive/commits/2009-12/msg00133.html

So maybe it has been fixed. We'll have to wait and see I guess.

Actions #7

Updated by ahuete.devel over 14 years ago

Hi Petr,

As far as I know you don't have to recompile with DDB_UNATTENDED, just
change the sysctl debug.debugger_on_panic to 0.
Note that you need to have a correctly set dumpdev (which the
installer now sets in /etc/rc.conf to your swap device) in order to
get the dumps.

With that, once you get a panic, there will be a dump and the machine
should be restarted without intervention.
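
As a rough sketch of the two settings described above: the device name below is a hypothetical placeholder (use the swap partition on your own machine), and placing the sysctl in /etc/sysctl.conf so it persists across reboots is an assumption, not something stated in this thread. At runtime the same sysctl can be changed with `sysctl debug.debugger_on_panic=0`.

```sh
# /etc/rc.conf is read as shell variable assignments.
# Hypothetical device name: substitute your own swap partition.
dumpdev="/dev/ad0s1b"

# /etc/sysctl.conf takes plain name=value lines. The line is shown as a
# comment here because it is not shell syntax:
#   debug.debugger_on_panic=0
```

With both in place, a panic should write a dump to the dump device and reboot instead of waiting at the DDB prompt.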

Cheers,
Antonio Huete

2009/12/23 <>:

Petr,

I guess you have dumpdev configured on the server, no? Since the first of
December we have had minidumps, which produce quite small dumps regardless
of how much memory the machine has. You will find the cores in
/var/crash as usual, so next time maybe we can catch the panic :)

There is no keyboard permanently attached to the server though. If it
panics, I need to plug one in, but then I can't type in the ddb screen so I
can only press the reset button. Which is why I was asking, is there any
way to automate this? i.e. 1) System panics and gets into DDB, 2) a dump is
automatically created, 3) System automatically reboots.

I've been told there is a DDB_UNATTENDED kernel option, but will this
create a dump before rebooting?

Petr

Actions #8

Updated by tuxillo over 9 years ago

  • Description updated (diff)
  • Category set to Kernel
  • Assignee deleted (0)
  • Target version set to Unverifiable

Moving to unverifiable.
