Bug #3170: repeatable nfsd crash - DragonFlyBSD - DragonFlyBSD bugtracker

Added by tse about 6 years ago. Updated almost 5 years ago.

Status: New
Priority: Normal
Assignee:
Category:
Target version: 6.4
Start date: 10/07/2019
Due date:
% Done:
Estimated time: (Total: 0:00 h)

Description

I created a Linux VM on QEMU with an NFS share exported from DragonFly, so that I could install the go-app-engine for Google Cloud. I could read and write small files on the NFS share, but running google-cloud-sdk/install.sh from the VM on the NFS share quickly causes this panic:

panic: assertion "m->m_type == MT_DATA" failed in m_dup_data at /usr/src/sys/kern/uipc_mbuf.c:1820
cpuid = 1
Trace beginning at frame 0xfffff802f71bf500
m_dup_data() at m_dup_data+0x12b 0xffffffff805e8a7b
m_dup_data() at m_dup_data+0x12b 0xffffffff805e8a7b
nfs_realign.isra.3() at nfs_realign.isra.3+0x48 0xffffffff807177c8
nfsrv_rcv() at nfsrv_rcv+0x490 0xffffffff8071c110
sys_nfssvc() at sys_nfssvc+0x13e7 0xffffffff8071fcf7
syscall2() at syscall2+0x238 0xffffffff8098c0d8

I've switched from nfsd to unfsd, and that works fine, though it took me a day of fiddling with unfsd because I didn't know not to run it with mountd. Hehe, such is life :)

Sorry, I'm sure it would take me many months of work before I could think of supplying a patch for this bug.

But I'm happy. It's my first time setting up an NFS share and a Linux VM, and now I can use them to access things from DragonFly like Google App Engine, or clang sanitizers to hold my hand when I'm writing C.

Files

core.txt.3 (296 KB)

tse, 01/28/2019 04:19 AM

Subtasks 1 (1 open — 0 closed)

Updated by sepherosa about 6 years ago

Hi,

Do you have dumps available?

Thanks,
sephe

Updated by tse about 6 years ago

File core.txt.3 added

There's also the vmcore.3, which I can host somewhere.

And an unrelated warning I get occasionally, but didn't think it was worth opening an issue for:
Jan 27 09:39:08 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 27 09:39:08 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 27 20:33:18 beloved kernel: error: [drm:pid-1:intel_pipe_update_start] ERROR Potential atomic update failure on pipe A
...
Jan 28 08:57:14 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 28 08:57:14 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 28 10:19:36 beloved kernel: error: [drm:pid938:intel_pipe_update_start] ERROR Potential atomic update failure on pipe A

Updated by sepherosa about 6 years ago

Please upload the vmcore and the kernel file somewhere.

Thanks,
sephe

Updated by tse about 6 years ago

This archive should contain vmcore.3 and kern.3:
https://gitlab.com/gratis/dfly/-/archive/master/dfly-master.tar.gz

Updated by dillon about 6 years ago

Hmm... OK, you can delete vmcore.3 and kern.3; we've downloaded them. Unfortunately, it looks like the core dump was corrupt. It unpacked as follows:

-rw-r--r-- 1 2024 wheel 118793728 Jan 29 08:11 kern.3
-rw-r--r-- 1 2024 wheel 1849134952 Jan 29 08:11 vmcore.3

If those sizes are correct then it might have generated a corrupt core dump. It might be possible to try again and the next core dump winds up being ok, but sometimes when core dumps get corrupted like this the same corruption occurs on each crash.

You can test the generated core yourself. If it is sitting in /var/crash you can do:

kgdb -n 3

and then use the 'back' command to get a stack trace, assuming it doesn't crap out. If it says 'cannot access memory at address 0x10', then the original core was corrupt. You can try causing another panic and generating another core. Definitely make sure you have enough room to store the core, they can get pretty big.
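
A minimal sketch of the "enough room" check, in case it helps. The has_room helper and the 4 GiB figure are illustrative assumptions, not anything the dump tooling requires; the function only compares the free space that df reports against a threshold you pick.

```shell
#!/bin/sh
# Sketch: check that a crash directory has room for the next core dump.
# The 4 GiB threshold in the example is an assumed margin, not a documented
# requirement; pick a value that suits the machine's RAM size.

# has_room DIR NEED_KB -> exit status 0 if DIR's filesystem has at least
# NEED_KB kilobytes available, non-zero otherwise.
has_room() {
    dir=$1
    need_kb=$2
    # df -P prints one line per filesystem; column 4 is the available space,
    # reported in kilobytes because of -k.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$need_kb" ]
}

# Example, run by hand on the server:
#   has_room /var/crash 4194304 && echo "enough room" || echo "free up space first"
#   kgdb -n 3      # then type 'back' at the prompt
```

If the check fails, clearing out old dumps before triggering the next panic avoids ending up with another truncated core.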

Sephe and I couldn't find anything looking at the source code so at the moment we don't know what could have caused that assertion to occur.

-Matt

Updated by tse about 6 years ago

Thanks guys,

https://drive.google.com/open?id=1u-dD43h3S0aNO2NVPBXUj6mPOHCHxJiF
https://drive.google.com/open?id=1yWEM3sDTLI18E-z2UqdXNCYdfrwdQaKi

Hopefully these will work. They don't seem to be corrupt, but I also get no stack trace. The crash also happens when just unpacking the Google SDK tarball onto the NFS share (when doing it from within Linux on QEMU).

Updated by dillon about 6 years ago

Ugh, somehow lost track of this one. Let's try a different approach... this was an NFS mount to a Linux client? Which Linux dist? And any particular mount arguments? I can try to replicate the crash by exporting to a Linux client and doing stuff.

-Matt

Updated by samuel about 6 years ago

/etc/rc.conf
rpcbind_enable="YES"
mountd_enable="YES"
nfs_server_enable="YES"
nfs_server_flags="-u -t -n 1"
mountd_flags="-r -n"

/etc/exports
/share/linux -mapall=tse:wheel -network 127.0.0.1 -mask 255.255.255.0

qemu-system-x86_64 \
-cpu max -smp 4 -m 2048 \
-drive file=snapshot.qcow2,format=qcow2 \
-M q35 -usb -device usb-host,hostbus=4,hostport=3 \
-netdev user,id=net0,net=10.0.2.25,hostfwd=tcp::2222:22 \
-device e1000,netdev=net0 \
-device virtio-rng-pci \
-soundhw hda

snapshot.qcow2 is an Ubuntu image. I think it was the latest .img from here: https://cloud-images.ubuntu.com/cosmic/current/

I rebooted the VM with those settings. Doing `cat /dev/urandom > rand.txt` did not cause the crash, but `tar -xf 25MB.tar.gz` quickly did.

The tarball was google-cloud-sdk-231.0.0-linux-x86_64.tar.gz
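
For anyone trying to replicate this, the step that triggered the panic can be sketched as a small helper. The unpack_on_share name and the /mnt/share mount point are hypothetical (the client-side mount point and mount options are not stated here); only the tar extraction itself comes from the steps above.

```shell
#!/bin/sh
# Sketch of the reproduction step: unpack a tarball onto the NFS mount.
# Both arguments are placeholders supplied by the caller.

# unpack_on_share TARBALL MOUNTPOINT -> extracts TARBALL into a fresh
# scratch directory under MOUNTPOINT and prints that directory's path.
unpack_on_share() {
    tarball=$1
    share=$2
    # A fresh directory keeps repeated runs from overwriting each other.
    scratch=$(mktemp -d "$share/repro.XXXXXX") || return 1
    # tar creates the archive's files back to back, a different write
    # pattern from the single stream of `cat /dev/urandom > rand.txt`.
    tar -xf "$tarball" -C "$scratch" || return 1
    echo "$scratch"
}

# Example, on the Linux guest (/mnt/share is a hypothetical mount point):
#   unpack_on_share google-cloud-sdk-231.0.0-linux-x86_64.tar.gz /mnt/share
```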
