Bug #1835

Panic: Bad link elm prev->next != elm

Added by ftigeot over 4 years ago. Updated over 3 years ago.

Status: In Progress
Start date:
Priority: Normal
Due date:
Assignee: dillon
% Done: 0%
Category: -
Target version: -
Description

The exact message is:

panic: Bad link elm 0xfffffffe5655be58 prev->next != elm

It seems launching a browser and opening a bunch of tabs at once is enough
to cause this panic.

I have put a bunch of core and kernel files at
http://www.wolfpond.org/crash.dfly/
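
For readers unfamiliar with the message: its wording matches the link-consistency checks that BSD-style <sys/queue.h> list macros run before unlinking an element. The sketch below is a user-space illustration under that assumption, not the DragonFly kernel code. The names `node`, `list_head`, `link_ok` and `list_remove_checked` are hypothetical, and the function returns false where a kernel would panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node modeled on the BSD LIST layout: `prev` points at
 * the previous element's `next` field (or at the list head pointer). */
struct node {
    struct node *next;
    struct node **prev;
};

struct list_head {
    struct node *first;
};

static void list_insert_head(struct list_head *h, struct node *elm)
{
    assert(h != NULL && elm != NULL);
    elm->next = h->first;
    if (h->first != NULL)
        h->first->prev = &elm->next;
    h->first = elm;
    elm->prev = &h->first;
}

/* The "prev->next != elm" test: whatever precedes elm must still
 * point at elm. If it does not, the list has been corrupted. */
static bool link_ok(const struct node *elm)
{
    assert(elm != NULL);
    return *elm->prev == elm;
}

/* Unlink elm only if its back link is consistent. A kernel built with
 * such checks would panic at the point where this returns false. */
static bool list_remove_checked(struct node *elm)
{
    if (!link_ok(elm))
        return false;
    if (elm->next != NULL)
        elm->next->prev = elm->prev;
    *elm->prev = elm->next;
    return true;
}
```

Removing the same element twice, which is what two CPUs racing on an unprotected list can effectively do, makes the second call fail the check because the element's stale `prev` no longer leads back to it.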

History

#1 Updated by dillon over 4 years ago

:The exact message is:
:
:panic: Bad link elm 0xfffffffe5655be58 prev->next != elm
:
:It seems launching a browser and opening a bunch of tabs at once is enough
:to cause this panic.
:
:I have put a bunch of core and kernel files at
:http://www.wolfpond.org/crash.dfly/
:
:--
:Francois Tigeot

Your kernel is more recent than my fix as far as I can tell but I was
sure I fixed that one. Try doing a full recompile of the kernel.

-Matt
Matthew Dillon

#2 Updated by dillon over 4 years ago

I went through the code in kern.26 and it was definitely up-to-date.

I think this crash may be related to one Jan Lentfer is getting where
a socket is getting ripped out from under some code that it shouldn't
be getting ripped out from under.

-Matt
Matthew Dillon

#3 Updated by ftigeot over 4 years ago

On Sun, Sep 12, 2010 at 01:54:47PM -0700, Matthew Dillon wrote:
> I went through the code in kern.26 and it was definitely up-to-date.
>
> I think this crash may be related to one Jan Lentfer is getting where
> a socket is getting ripped out from under some code that it shouldn't
> be getting ripped out from under.

I have upgraded anyway; hopefully the new socket assertions will help in
tracking this down.

#4 Updated by dillon over 4 years ago

:On Sun, Sep 12, 2010 at 01:54:47PM -0700, Matthew Dillon wrote:
:> I went through the code in kern.26 and it was definitely up-to-date.
:>
:> I think this crash may be related to one Jan Lentfer is getting where
:> a socket is getting ripped out from under some code that it shouldn't
:> be getting ripped out from under.
:
:I have upgraded anyway; hopefully the new socket assertions will help in
:tracking this down.
:
:--
:Francois Tigeot

I believe these should now be fixed for real. There was an additional
issue with the list the inpcb was placed on that caused another MP
race.

-Matt
Matthew Dillon

#5 Updated by ftigeot over 4 years ago

On Sun, Sep 12, 2010 at 11:26:17PM -0700, Matthew Dillon wrote:
>
> I believe these should now be fixed for real. There was an additional
> issue with the list the inpcb was placed on that caused another MP
> race.

Yep, this one can be closed.
I couldn't reproduce the crash with the last kernel.

#6 Updated by dillon over 4 years ago

:Yep, this one can be closed.
:I couldn't reproduce the crash with the last kernel.
:
:--
:Francois Tigeot

Ok, even more fixes committed, I missed two list corruption cases but
I am definitely on the right track now.

-Matt
Matthew Dillon

#7 Updated by pavalos over 4 years ago

The original submitter says it can't be reproduced with a recent kernel.

#8 Updated by thomas.nikolajsen over 4 years ago

I still get this panic, with HEAD from today: on SMP kernel
running a few buildkernels with /usr/src & /usr/obj NFS mounted.

Should I upload crash dump?

-thomas

#9 Updated by dillon over 4 years ago

:Thomas Nikolajsen added the comment:
:
:I still get this panic, with HEAD from today: on SMP kernel
:running a few buildkernels with /usr/src & /usr/obj NFS mounted.
:
:Should I upload crash dump?
:
: -thomas

Yes, we're still trying to find this one.

-Matt
Matthew Dillon

#10 Updated by peter over 4 years ago

On Tue, Sep 14, 2010 at 09:23:18PM -0700, Matthew Dillon wrote:
> :Thomas Nikolajsen added the comment:
> :
> :I still get this panic, with HEAD from today: on SMP kernel
> :running a few buildkernels with /usr/src & /usr/obj NFS mounted.
> :
> :Should I upload crash dump?
> :
> : -thomas
>
> Yes, we're still trying to find this one.
>

I got one of these the past few days, and it looked like that was
actually a secondary panic. Can you verify that's the actual panic
message (the first one that gets displayed on the console)?

--Peter

#11 Updated by thomas.nikolajsen over 4 years ago

Crash dump uploaded to ~thomas/crash/19 on leaf.

I do think that subj was on console; but don't remember for sure,
please look into crash dump to decide primary reason.

I have a few more crash dumps w/ same panic string.

-thomas

#12 Updated by dillon over 4 years ago

:Thomas Nikolajsen added the comment:
:
:Crash dump uploaded to ~thomas/crash/19 on leaf.
:
:I do think that subj was on console; but don't remember for sure,
:please look into crash dump to decide primary reason.
:
:I have a few more crash dumps w/ same panic string.
:
: -thomas

Thanks Thomas. It looks like I forgot to wrap the nfs_write()
function with a token. I've pushed the fixes. Please update
and tell us if it fixed the problem!

-Matt
Matthew Dillon
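
As a rough illustration of what "wrapping a function with a token" buys, here is a user-space sketch in which an entry point takes a lock around every update of a shared list. Everything in it is an assumption made for the example: `struct token` is a plain C11 spinlock and does not model real DragonFly token semantics, and `demo_write`, `dirty_list` and `buf` are hypothetical names, not the actual nfs_write() code or its data structures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a serializing token: a simple spinlock. */
struct token {
    atomic_flag held;
};

static void token_get(struct token *t)
{
    while (atomic_flag_test_and_set_explicit(&t->held, memory_order_acquire))
        ;   /* spin until the current holder releases it */
}

static void token_rel(struct token *t)
{
    atomic_flag_clear_explicit(&t->held, memory_order_release);
}

struct buf {
    struct buf *next;
};

/* Shared list plus the token that protects it. */
struct dirty_list {
    struct token tok;
    struct buf *first;
    int count;
};

static void dirty_list_init(struct dirty_list *dl)
{
    atomic_flag_clear(&dl->tok.held);
    dl->first = NULL;
    dl->count = 0;
}

/* Only safe while the caller holds dl->tok. */
static void dirty_insert_locked(struct dirty_list *dl, struct buf *bp)
{
    assert(bp != NULL);
    bp->next = dl->first;
    dl->first = bp;
    dl->count++;
}

/* Entry point: the whole list update runs with the token held, so two
 * CPUs cannot interleave their pointer updates and corrupt the links. */
static void demo_write(struct dirty_list *dl, struct buf *bp)
{
    token_get(&dl->tok);
    dirty_insert_locked(dl, bp);
    token_rel(&dl->tok);
}
```

If the entry point skipped token_get()/token_rel(), two concurrent callers could both read the same `first` pointer and one insertion would be lost or leave stale links, which is the kind of damage a later link check reports.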

#13 Updated by dillon over 3 years ago

See issue 2037, possible fix committed.
