Kernel panic in syncache_add
I'm using a SuperMicro A1SAi-C2750 as a small server with jails. It ran fine on the latest DragonFly BSD 4.8.1,
but since I upgraded to DragonFly BSD 5.0 I have been experiencing random crashes, always in syncache_add.
I have also tried a clean install and a kernel rebuild;
the server is currently running DragonFly v22.214.171.124.g0978b-RELEASE.
The server passes 24 hours of memtest without any errors.
Crash summaries of the latest server crashes are attached to this issue. I cannot find a pattern for reliably crashing the server,
but core.txt.5 was produced right after the reboot that followed a panic, while I was trying to copy core.txt.4 from the server via scp.
Any help or hint on how to fix or debug this problem is appreciated.
Also, should I be worried about messages like this:
softdep_sync_metadata_bp(1): caught buf 0xffffff80af383f50 going away
Thanks in advance.
- Status changed from New to In Progress
This appears to be a missed initialization in the kernel. Basically, there are two kmalloc() calls in netinet/tcp_syncache.c that did not specify M_ZERO when they should have. I have pushed a fix to master and to the 5.0 release. Please try pulling the latest from the repo and rebuilding the kernel, and report back if the problem continues to occur.
You are getting some good dumps, so if my fix has not fixed it, we will try to debug the issue further by having you probe your dumps.
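For readers unfamiliar with this class of bug, here is a minimal userland sketch of what a missing M_ZERO can lead to. It is not the code from netinet/tcp_syncache.c: struct entry and the three helper functions are hypothetical names invented for illustration, malloc() stands in for kmalloc() without M_ZERO, and calloc() stands in for kmalloc() with M_ZERO. The 0xA5 fill only simulates stale heap contents so that the effect is deterministic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kernel structure; NOT the real syncache entry. */
struct entry {
    struct entry *next;   /* list linkage, expected to start out as NULL */
    int           flags;  /* state bits, expected to start out as 0 */
};

/*
 * Userland analogue of an allocation without M_ZERO: the memory contents
 * are indeterminate. The 0xA5 fill simulates leftover data from a previous
 * use of the same memory (an assumption made for the demonstration).
 */
struct entry *entry_alloc_uninit(void)
{
    struct entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    memset(e, 0xA5, sizeof(*e));
    return e;
}

/* Userland analogue of an allocation with M_ZERO: every field starts at zero. */
struct entry *entry_alloc_zeroed(void)
{
    struct entry *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    return e;
}

/* Returns 1 if the entry looks freshly initialized, 0 otherwise. */
int entry_is_clean(const struct entry *e)
{
    return e->next == NULL && e->flags == 0;
}
```

Code that trusts a freshly allocated entry to have a NULL next pointer will follow a garbage pointer in the first case, which would be consistent with random-looking panics that depend on what the memory previously held.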
The server has now been running the rebuilt kernel, DragonFly v126.96.36.199.gb21cd7-RELEASE, for several hours without a panic.
I will let the server run for a few more hours and, if nothing happens, I will close this issue as resolved.
Is the message "softdep_sync_metadata_bp(1): caught buf 0xffffff80af235c48 going away" only informational and nothing to worry about,
or does it point to some deeper issue?