Bug #3170

repeatable nfsd crash

Added by tse 9 months ago. Updated 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: Latest stable
Start date: 01/26/2019
Due date:
% Done: 0%
Estimated time:

Description

I created a Linux VM on QEMU with an NFS share exported from DragonFly, so that I could install the go-app-engine for Google Cloud. I could read and write small files on the NFS share, but running google-cloud-sdk/install.sh from the VM on the NFS share quickly causes this panic:

panic: assertion "m->m_type == MT_DATA" failed in m_dup_data at /usr/src/sys/kern/uipc_mbuf.c:1820
cpuid = 1
Trace beginning at frame 0xfffff802f71bf500
m_dup_data() at m_dup_data+0x12b 0xffffffff805e8a7b
m_dup_data() at m_dup_data+0x12b 0xffffffff805e8a7b
nfs_realign.isra.3() at nfs_realign.isra.3+0x48 0xffffffff807177c8
nfsrv_rcv() at nfsrv_rcv+0x490 0xffffffff8071c110
sys_nfssvc() at sys_nfssvc+0x13e7 0xffffffff8071fcf7
syscall2() at syscall2+0x238 0xffffffff8098c0d8

I've switched from nfsd to unfsd, and that works fine. Though it took me a day of fiddling for unfsd, just because I didn't know not to run it with mountd. Hehe, such is life :)

Sorry, I'm sure it would take me many months of work before I could think of supplying a patch for this bug.

But I'm happy. It's my first time setting up an NFS share and a Linux VM, and now I can use them to access things from DragonFly like Google App Engine, or clang sanitizers to hold my hand when I'm writing C.


Files

core.txt.3 (296 KB) tse, 01/28/2019 04:19 AM

History

#1

Updated by sepherosa 9 months ago

Hi,

Do you have dumps available?

Thanks,
sephe

--
Tomorrow Will Never Die

#2

Updated by tse 9 months ago

There's also the vmcore.3, which I can host somewhere.

And an unrelated warning I get occasionally, but didn't think it was worth opening an issue for:
Jan 27 09:39:08 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 27 09:39:08 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 27 20:33:18 beloved kernel: error: [drm:pid-1:intel_pipe_update_start] *ERROR* Potential atomic update failure on pipe A
...
Jan 28 08:57:14 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 28 08:57:14 beloved root: Unknown USB device: vendor 0x8087 product 0x07dc bus uhub0
Jan 28 10:19:36 beloved kernel: error: [drm:pid938:intel_pipe_update_start] *ERROR* Potential atomic update failure on pipe A

#3

Updated by sepherosa 9 months ago

Please upload vmcore and kernel file somewhere.

Thanks,
sephe

--
Tomorrow Will Never Die

#5

Updated by dillon 9 months ago

Hmm.. Ok, you can delete vmcore.3 and kern.3, we've downloaded them. Unfortunately it looks like the core dump was corrupt. It unpacked as follows:

-rw-r--r-- 1 2024 wheel 118793728 Jan 29 08:11 kern.3
-rw-r--r-- 1 2024 wheel 1849134952 Jan 29 08:11 vmcore.3

If those sizes are correct then it might have generated a corrupt core dump. It might be possible to try again and the next core dump winds up being ok, but sometimes when core dumps get corrupted like this the same corruption occurs on each crash.

You can test the generated core yourself. If it is sitting in /var/crash you can do:

kgdb -n 3

and then use the 'back' command to get a stack trace, assuming it doesn't crap out. If it says 'cannot access memory at address 0x10', then the original core was corrupt. You can try causing another panic and generating another core. Definitely make sure you have enough room to store the core, they can get pretty big.
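
The "make sure you have enough room" advice can be sketched as a small shell check to run before triggering another panic. This is only a sketch, not something from the original comment: the helper name has_room_kb and the 4 GB threshold are illustrative assumptions. Only `kgdb -n 3` and the 'back' command come from the text above.

```shell
#!/bin/sh
# Sketch: confirm the dump directory has room before forcing another panic.
# has_room_kb is a hypothetical helper; the threshold below is an example.

# has_room_kb DIR NEEDED_KB -> exit status 0 if DIR has at least NEEDED_KB free
has_room_kb() {
    dir=$1
    needed_kb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    # An empty value means df failed (for example, DIR does not exist).
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$needed_kb" ]
}

# Example (assumed values): the vmcore above was about 1.8 GB, so ask for 4 GB.
# if has_room_kb /var/crash 4194304; then
#     kgdb -n 3      # then type 'back' at the kgdb prompt
# fi
```

If the check fails, free up space in /var/crash (or wherever cores are stored on your system) before reproducing the crash.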

Sephe and I couldn't find anything looking at the source code so at the moment we don't know what could have caused that assertion to occur.

-Matt

#6

Updated by tse 9 months ago

Thanks guys,

https://drive.google.com/open?id=1u-dD43h3S0aNO2NVPBXUj6mPOHCHxJiF
https://drive.google.com/open?id=1yWEM3sDTLI18E-z2UqdXNCYdfrwdQaKi

Hopefully these will work. Seems not corrupt, but also no stacktrace. The crash also happens when just unpacking the google sdk tar onto the nfs share (when doing it from within linux on qemu)

#7

Updated by dillon 7 months ago

Ugh. Somehow lost track of this one. Let's try a different approach... this was an NFS mount to a Linux client? Which Linux dist? And any particular mount arguments? I can try to replicate the crash by exporting to a Linux client and doing stuff.

-Matt

#8

Updated by samuel 7 months ago

# /etc/rc.conf
rpcbind_enable="YES"
mountd_enable="YES"
nfs_server_enable="YES"
nfs_server_flags="-u -t -n 1"
mountd_flags="-r -n"

# /etc/exports
/share/linux -mapall=tse:wheel -network 127.0.0.1 -mask 255.255.255.0

qemu-system-x86_64 \
-cpu max -smp 4 -m 2048 \
-drive file=snapshot.qcow2,format=qcow2 \
-M q35 -usb -device usb-host,hostbus=4,hostport=3 \
-netdev user,id=net0,net=10.0.2.25,hostfwd=tcp::2222-:22 \
-device e1000,netdev=net0 \
-device virtio-rng-pci \
-soundhw hda

snapshot.qcow2 is an ubuntu image. I think it was the latest .img from
here: https://cloud-images.ubuntu.com/cosmic/current/

I rebooted the VM with those settings. Doing `cat /dev/urandom > rand.txt`
did not cause the crash, but `tar -xf 25MB.tar.gz` quickly did.

The .gz was google-cloud-sdk-231.0.0-linux-x86_64.tar.gz
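
For anyone trying to replicate this, the guest-side steps might look roughly like the sketch below. It is not the exact procedure used here: SERVER and MNT are placeholders, no mount options are given because none are stated in this report, and the tarball is assumed to already be on the share. Only the export path and the tarball name come from the settings above.

```shell
#!/bin/sh
# Sketch of the client-side reproduction, to be run inside the Linux guest.
# SERVER and MNT are placeholders (assumptions), not values from the report.
SERVER=${SERVER:-nfs-host.example}   # hypothetical: address of the DragonFly host
MNT=${MNT:-/mnt/share}               # hypothetical mount point in the guest
TARBALL=${TARBALL:-google-cloud-sdk-231.0.0-linux-x86_64.tar.gz}

# repro_commands prints the steps instead of running them, so they can be
# reviewed (and mount options added) before being executed as root.
repro_commands() {
    printf 'mount -t nfs %s:/share/linux %s\n' "$SERVER" "$MNT"
    printf 'cd %s\n' "$MNT"
    printf 'tar -xf %s\n' "$TARBALL"
}

repro_commands
```

Printing the commands rather than running them keeps the sketch harmless on a machine that is not set up as the guest.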
