Bug #1824: kernel panic, x86, 2.7.3.859.ge5104 - DragonFlyBSD - DragonFlyBSD bugtracker

#1

Updated by akirchhoff135014 over 15 years ago

On Tuesday 07 September 2010 19:22:57 Adam K Kirchhoff wrote:

SMP enabled, otherwise a GENERIC kernel.

Full backtrace:

(kgdb) bt
#0 _get_mycpu (di=0xc06ebee0) at ./machine/thread.h:83
#1 md_dumpsys (di=0xc06ebee0) at /usr/src/sys/platform/pc32/i386/dump_machdep.c:263
#2 0xc0315d31 in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:880
#3 0xc03162f1 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:387
#4 0xc03165ba in panic (fmt=0xc06122c4 "m_copym, length > size of mbuf chain") at /usr/src/sys/kern/kern_shutdown.c:786
#5 0xc034f662 in m_copym (m=0x0, off0=32, len=24, wait=4) at /usr/src/sys/kern/uipc_mbuf.c:1113
#6 0xc03ec2e6 in tcp_output (tp=0xdcc96d88) at /usr/src/sys/netinet/tcp_output.c:723
#7 0xc03f38da in tcp_usr_send (so=0xc6a73698, flags=<value optimized out>, m=0xe20ca900, nam=0x0, control=0x0, td=0xdcc96988) at /usr/src/sys/netinet/tcp_usrreq.c:762
#8 0xc03518a4 in netmsg_pru_send (msg=0xe5c15b5c) at /usr/src/sys/kern/uipc_msg.c:548
#9 0xc03a4674 in netmsg_service (msg=0xe5c15b5c, mpsafe_mode=1, mplocked=0) at /usr/src/sys/net/netisr.c:310
#10 0xc03a478a in netmsg_service_loop (arg=0xc068cda0) at /usr/src/sys/net/netisr.c:357
#11 0xc031f2ae in lwkt_deschedule_self (td=Cannot access memory at address 0x8 ) at /usr/src/sys/kern/lwkt_thread.c:278
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

I was building some packages in the background at the time. I was ssh'ed
in from a linux box, and running x2x to allow me to move the mouse and
keyboard from the linux box to the DF box. At the time of the lock up, I
was moving the mouse around on the DF box. Though I have no idea if
that's related to the crash, that's the only thing that comes to mind.

Adam

The kernel and core are in my ~/crash folder on leaf.

Adam
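
For readers following the backtrace, here is a minimal sketch of how a range copy over a buffer chain can run out of chain before the requested length is satisfied, which is the condition the panic string in frame #4 names. It is not the code from uipc_mbuf.c: `toy_mbuf`, `toy_copy_range_check`, and the result codes are hypothetical names invented for illustration, and the real m_copym does considerably more (allocation, packet headers, cluster sharing).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel mbuf. */
struct toy_mbuf {
    struct toy_mbuf *m_next;  /* next buffer in the chain */
    size_t m_len;             /* bytes of data held by this buffer */
};

enum toy_copy_result {
    TOY_COPY_OK,          /* the chain covers [off, off + len) */
    TOY_OFFSET_PAST_END,  /* chain ended while skipping the offset */
    TOY_LENGTH_PAST_END   /* chain ended before len bytes were found */
};

/* Walk the chain the way a range copy would, without copying anything. */
static enum toy_copy_result
toy_copy_range_check(const struct toy_mbuf *m, size_t off, size_t len)
{
    /* Skip whole buffers until the offset falls inside the current one. */
    while (off > 0) {
        if (m == NULL)
            return TOY_OFFSET_PAST_END;
        if (off < m->m_len)
            break;
        off -= m->m_len;
        m = m->m_next;
    }

    /* Consume len bytes, starting off bytes into the current buffer. */
    while (len > 0) {
        if (m == NULL)
            return TOY_LENGTH_PAST_END;  /* the case the panic reports */
        size_t avail = m->m_len - off;
        size_t take = avail < len ? avail : len;
        len -= take;
        off = 0;
        m = m->m_next;
    }
    return TOY_COPY_OK;
}
```

With an illustrative chain of 32 and 16 bytes (sizes made up, not taken from the core), a request for offset 32 and length 24 walks off the end of the chain with 8 bytes still outstanding, leaving the chain pointer NULL in the same way frame #5 shows m=0x0.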

#2

Updated by dillon over 15 years ago

:> I was building some packages in the background at the time. I was ssh'ed
:> in from a linux box, and running x2x to allow me to move the mouse and
:> keyboard from the linux box to the DF box. At the time of the lock up, I
:> was moving the mouse around on the DF box. Though I have no idea if
:> that's related to the crash, that's the only thing that comes to mind.
:>
:> Adam
:
:The kernel and core are in my ~/crash folder on leaf.
:
:Adam

Very interesting. Somehow the so_snd sockbuf has fewer bytes worth
of mbufs than it says in its sb_cc field. I have no idea how it
managed to get into that state. We will have to keep an eye on
things and collect more information.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>
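
The inconsistency described above (a byte count that claims more data than the buffer chain holds) can be sketched as a simple invariant check. This is an illustration under stated assumptions, not kernel code: `toy_mbuf`, `toy_sockbuf`, `toy_chain_bytes`, and `toy_sb_consistent` are hypothetical names, and the structs keep only the fields needed to show the idea.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_mbuf {
    struct toy_mbuf *m_next;  /* next buffer in the chain */
    size_t m_len;             /* bytes of data held by this buffer */
};

struct toy_sockbuf {
    struct toy_mbuf *sb_mb;   /* head of the buffer chain */
    size_t sb_cc;             /* byte count the sockbuf claims to hold */
};

/* Sum the bytes actually present in the chain. */
static size_t
toy_chain_bytes(const struct toy_mbuf *m)
{
    size_t total = 0;

    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/* Return 1 when the claimed count matches the chain, 0 otherwise. */
static int
toy_sb_consistent(const struct toy_sockbuf *sb)
{
    return toy_chain_bytes(sb->sb_mb) == sb->sb_cc;
}
```

A sockbuf whose count says 56 bytes while its chain sums to 48 fails this check. Running such a check at every point where the chain is modified, rather than only when data is sent, is one way to catch the corruption closer to its source.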

#3

Updated by dillon over 15 years ago

Adam, was the machine that crashed serving NFS ? I was able to
reproduce the exact same crash while serving NFS.

-Matt

#4

Updated by akirchhoff135014 over 15 years ago

On Wed, 8 Sep 2010 10:12:00 -0700 (PDT)
Matthew Dillon <dillon@apollo.backplane.com> wrote:

> Adam, was the machine that crashed serving NFS ? I was able to
> reproduce the exact same crash while serving NFS.
>
> -Matt

The NFS server was running, but nothing was connected at the time.

#5

Updated by dillon over 15 years ago

I've pushed a bunch of work, please update to the latest master
and continue testing!

No smoking gun but I suspect machine load may be causing m_reclaim()
to get run, which drains various protocol caches. Those caches were
not MPSAFE. The changes address that issue.

If these changes don't fix the problem then compile a fresh kernel
with two options added to your kernel config:

    options         SOCKBUF_DEBUG
    options         MBUF_DEBUG

These are fairly invasive options so only compile them in if the
problems have not gone away. Hopefully then we will get a panic
closer to where the actual bad code is instead of well after the
fact.

-Matt

#6

Updated by tuxillo about 13 years ago

Description updated (diff)
Status changed from New to Feedback
Assignee deleted (0)

Hi Adam,

Did you have a chance to try what Matt said? Can you please provide feedback about it?

Thanks,
Antonio Huete
