Bug #3088

closed

Kernel panic in syncache_add

Added by pa3k over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 10/21/2017
Due date:
% Done: 0%
Estimated time:
Description

Hello,
I'm using a SuperMicro A1SAi-C2750 as a small server with jails. It was running fine with the latest DragonFly BSD 4.8.1,
but since I upgraded to DragonFly BSD 5.0 I have been experiencing random crashes, always in syncache_add.

I also tried a clean install and a kernel rebuild;
the server is currently running DragonFly v5.0.0.6.g0978b-RELEASE.

The server passes 24 hours of memtest without problems.
Crash summaries of the latest server crashes are attached to this issue. I cannot find a pattern that reliably crashes the server,
but core.txt.5 was produced right after the reboot following a panic, while I was trying to copy core.txt.4 from the server via scp.

Any help or hints on how to fix or debug this problem would be appreciated.

Also, should I be worried about seeing messages like this on the console?
softdep_sync_metadata_bp(1): caught buf 0xffffff80af383f50 going away

Thanks in advance.


Files

core.txt.3 (119 KB), pa3k, 10/21/2017 02:17 PM
a1sai.txt (35 KB), lshw, pa3k, 10/21/2017 02:17 PM
dfly.5.0.jpg (47.7 KB), KVM screen Fatal trap 12, pa3k, 10/21/2017 02:32 PM
dfly.5.0.panic.jpg (60.6 KB), KVM first panic command, pa3k, 10/21/2017 02:33 PM
dfly.5.0.trace.jpg (71.6 KB), KVM trace, pa3k, 10/21/2017 02:33 PM
core.txt.4 (126 KB), pa3k, 10/21/2017 02:51 PM
core.txt.5 (129 KB), pa3k, 10/21/2017 02:51 PM
Actions #1

Updated by dillon over 6 years ago

  • Status changed from New to In Progress

This appears to be a missed initialization in the kernel: two kmalloc() calls in netinet/tcp_syncache.c did not specify M_ZERO when they should have. I have pushed a fix to master and to the 5.0 release. Please try pulling the latest from the repo and rebuilding the kernel, and report back if the problem continues to occur.
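For readers unfamiliar with this class of bug, here is a minimal userspace sketch of why a missing zeroing flag can matter. It is not the actual tcp_syncache.c code or the committed fix: `sketch_alloc`, `SKETCH_ZERO`, and `struct sketch_entry` are hypothetical stand-ins for kmalloc(), M_ZERO, and a syncache record, and the 0xA5 fill only simulates stale heap contents.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag standing in for the kernel's M_ZERO. */
#define SKETCH_ZERO 0x1

/* Hypothetical record; NOT the real struct syncache layout. */
struct sketch_entry {
    void *opts;   /* optional sub-object, expected to be NULL when absent */
    int   flags;
};

/*
 * Userspace stand-in for kmalloc(): zero the memory only when asked,
 * otherwise fill it with a poison byte to simulate stale heap contents.
 */
static void *
sketch_alloc(size_t size, int flags)
{
    void *p = malloc(size);

    if (p == NULL)
        return NULL;
    memset(p, (flags & SKETCH_ZERO) ? 0 : 0xA5, size);
    return p;
}

/* Returns 1 if later code would believe the entry carries an opts pointer. */
static int
sketch_has_opts(const struct sketch_entry *e)
{
    return e->opts != NULL;
}

/* Compares both allocation styles; returns 0 when they behave as described. */
static int
sketch_demo(void)
{
    struct sketch_entry *stale = sketch_alloc(sizeof(*stale), 0);
    struct sketch_entry *clean = sketch_alloc(sizeof(*clean), SKETCH_ZERO);
    int ok;

    assert(stale != NULL && clean != NULL);
    /* Without zeroing, the garbage pointer looks "present". */
    ok = sketch_has_opts(stale) == 1 &&
         sketch_has_opts(clean) == 0 &&
         clean->flags == 0;
    free(stale);
    free(clean);
    return ok ? 0 : 1;
}
```

In the non-zeroed case the garbage `opts` field passes a NULL check, so code that trusts that check would go on to dereference a bogus pointer, which is the kind of failure that can surface as a fatal trap. Requesting zeroed memory at allocation time avoids having to initialize every field by hand.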

You are getting some good dumps, so if my fix has not fixed it, we will try to debug the issue further by having you probe your dumps.

Thanks!

-Matt

Actions #2

Updated by pa3k over 6 years ago

Thank you!

The server has now been running the rebuilt kernel, DragonFly v5.0.0.7.gb21cd7-RELEASE, for several hours without a panic.
I will let it run for a few more hours, and if nothing happens I will close this issue as resolved.

Is the message "softdep_sync_metadata_bp(1): caught buf 0xffffff80af235c48 going away" only informational and nothing to worry about,
or does it point to some deeper issue?

Actions #3

Updated by pa3k over 6 years ago

  • Status changed from In Progress to Resolved

After 24 hours with kernel DragonFly v5.0.0.7.gb21cd7-RELEASE, the server is still up and running without a panic.
Closing as resolved.

Thank you.
