softupdates locking bug
softupdates may still have some locking issues:
Panic where softdep_update_inodeblock() called bwrite() with a NULL buffer
getdirtybuf() returned 'gotit', yet either it returned a NULL bp or the buffer pointer was nulled after
it was saved in the inodedep structure. getdirtybuf() can block, and it releases the softdep
lock while locking dirty buffers, but it is not clear whether another thread can race in and cause the failure mode seen.
* (from 3.0.3 catchall bug (2336)):
Deadlock in -master with softdep. No more details available.
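The suspected race in the first item (a helper that drops the lock while it sleeps, leaving a saved pointer stale) can be illustrated with a small user-space sketch. This is not the kernel code: `struct dep`, `lock_dirty_buf()`, `racer_clear()` and `write_saved_buf()` are hypothetical stand-ins for the inodedep, getdirtybuf(), a competing thread and the bwrite() caller. A pthread mutex plays the role of the softdep lock. The point is only that anything read before the lock was dropped has to be re-read and re-checked afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's struct buf or struct inodedep. */
struct buf { int dirty; };

struct dep {
    pthread_mutex_t lock;   /* plays the role of the softdep lock */
    struct buf *saved_bp;   /* plays the role of the bp saved in the inodedep */
};

/* A competing code path, run while the lock is dropped. */
typedef void (*racer_fn)(struct dep *);

/*
 * Simulates a helper that may sleep: it drops the lock, lets a racer run,
 * retakes the lock, and reports success. Called with d->lock held.
 */
static int lock_dirty_buf(struct dep *d, racer_fn racer)
{
    assert(d != NULL);
    pthread_mutex_unlock(&d->lock);
    if (racer != NULL)
        racer(d);                   /* the racer takes the lock itself */
    pthread_mutex_lock(&d->lock);
    return 1;                       /* "got it" */
}

/* Example racer: clears the saved pointer under the lock. */
static void racer_clear(struct dep *d)
{
    pthread_mutex_lock(&d->lock);
    d->saved_bp = NULL;
    pthread_mutex_unlock(&d->lock);
}

/* Returns 0 if the buffer was "written", -1 if it went away meanwhile. */
static int write_saved_buf(struct dep *d, racer_fn racer)
{
    int rc = -1;

    pthread_mutex_lock(&d->lock);
    if (d->saved_bp != NULL && lock_dirty_buf(d, racer)) {
        /* Re-read after the lock was dropped; the earlier check is stale. */
        struct buf *bp = d->saved_bp;
        if (bp != NULL) {
            bp->dirty = 0;          /* stands in for bwrite(bp) */
            rc = 0;
        }
    }
    pthread_mutex_unlock(&d->lock);
    return rc;
}
```

Without the second NULL check, the call with `racer_clear` would dereference a NULL pointer, which is the same shape as the reported panic. Whether the kernel path can actually be raced this way is exactly what the report leaves open.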
Updated by marino about 1 year ago
Occasionally I see this kernel message while packages are building:
softdep_setup_freeblocks_bp(1): caught <id> going away
A new bug related to soft updates: panic: flush_pagedep_deps: MKDIR_BODY
full core txt: http://leaf.dragonflybsd.org/~marino/core/core.flush_pagedep_deps.txt
core file located at ~/marino/crash on leaf: core.flush_pagedep_deps.txz
I hit this exact panic again today.
Do you need the core or is the first one good enough?
Commit 8224c9ea7d94389a63b07be4401f0b05912f8f4a likely fixes this bug; getdirtybuf could return success incorrectly earlier.
Some hours of fsstress testing hit a deadlock w/ softdep and the patch; I haven't been able to root-cause it, but here are some hints:
1) the syncer (syncer0) is waiting for vnlru to make progress; its backtrace is:
*_ WE DO NOT HOLD THE SOFTDEP LOCK AROUND VFS_VGET _*
2) vnlru is not making progress; it is trying to lock a buffer associated with UFS, its backtrace is:
3) The buffer in question is a BUF_CMD_WRITE buffer, its lock is marked with LK_KERNTHREAD, and it is a softdep buffer (seen via b_ops being the softdep bioops). The vnode associated with the buffer is held locked by the vnlru thread. I think these are the buffer's flags: B_CACHE|B_HASHED|B_BNOCLIP|B_IODEBUG|B_VNCLEAN|B_VMIO