Bug #2374: softupdates locking bug - DragonFlyBSD - DragonFlyBSD bugtracker

Bug #2374

closed

softupdates locking bug

Added by vsrinivas about 13 years ago. Updated over 12 years ago.

Status: Resolved

Priority: Normal

Start date: 05/23/2012

Description

softupdates may still have some locking issues:

In -master:

http://leaf.dragonflybsd.org/~marino/core/core.20120523.txt
Panic where softdep_update_inodeblock() called bwrite() with a NULL buffer

getdirtybuf returned 'gotit', yet it either returned a NULL bp or the buffer was nulled after
it was saved in the inodedep structure. getdirtybuf can block and does release the softdep
lock while locking dirty buffers, but it is not clear if anyone can race in and result in the failure mode seen.
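The general hazard can be sketched in isolation. This is a minimal single-threaded model, not the DragonFly code and not a diagnosis of this panic: every name below (`model_buf`, `model_inodedep`, `model_getdirtybuf_*`, `race_hook`) is hypothetical, and the "other thread" is simulated by a callback that runs inside the window where the big lock would be dropped. It only shows why a helper that can sleep without the lock should re-read the saved pointer before reporting success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel structures. */
struct model_buf { int dirty; };
struct model_inodedep { struct model_buf *saved_bp; };

/* Models whatever another thread might do while we sleep unlocked. */
typedef void (*race_hook)(struct model_inodedep *);

/* Reports success without looking at the saved pointer again. */
static bool
model_getdirtybuf_unchecked(struct model_inodedep *dep, race_hook hook)
{
	assert(dep != NULL);
	if (dep->saved_bp == NULL)
		return false;
	/* The big lock would be dropped here while we wait for the buffer. */
	if (hook != NULL)
		hook(dep);
	/* The big lock would be retaken here. */
	return true;	/* stale answer: saved_bp may be NULL by now */
}

/* Same window, but the pointer is revalidated after "relocking". */
static bool
model_getdirtybuf_checked(struct model_inodedep *dep, race_hook hook)
{
	struct model_buf *bp;

	assert(dep != NULL);
	bp = dep->saved_bp;
	if (bp == NULL)
		return false;
	if (hook != NULL)
		hook(dep);
	/* Succeed only if the pointer is unchanged and still dirty. */
	return dep->saved_bp == bp && bp->dirty != 0;
}

/* A racing actor that clears the saved buffer pointer. */
static void
model_clear_bp(struct model_inodedep *dep)
{
	dep->saved_bp = NULL;
}
```

With `model_clear_bp` as the hook, the unchecked variant still reports success while `saved_bp` is NULL, which is the shape of the failure described above; the checked variant reports failure instead.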

(from 3.0.3 catchall bug (2336)):
Deadlock in -master with softdep. No more details available.

Related issues 2 (0 open — 2 closed)

Updated by marino about 13 years ago

Perhaps related:

Occasionally I see this kernel message while packages are building:
softdep_setup_freeblocks_bp(1): caught <id> going away

Updated by marino almost 13 years ago

A new bug related to soft updates: panic: flush_pagedep_deps: MKDIR_BODY
full core txt: http://leaf.dragonflybsd.org/~marino/core/core.flush_pagedep_deps.txt

core file located at ~/marino/crash on leaf: core.flush_pagedep_deps.txz

Updated by marino almost 13 years ago

I hit this exact panic again today.
Do you need the core or is the first one good enough?

Updated by vsrinivas almost 13 years ago

Commit 8224c9ea7d94389a63b07be4401f0b05912f8f4a likely fixes this bug; getdirtybuf could return success incorrectly earlier.

Updated by vsrinivas almost 13 years ago

Status changed from New to Feedback

Updated by vsrinivas almost 13 years ago

Some hours of fsstress testing hit a deadlock w/ softdep and the patch; I haven't been able to root-cause it, but here are some hints:

1) the syncer (syncer0) is waiting for vnlru to make progress; its backtrace is:

(bioops callback)
softdep_process_worklist
process_worklist_item
handle_workitem_remove
(WE DO NOT HOLD THE SOFTDEP LOCK AROUND VFS_VGET)
vfs_vget
ffs_vget
getnewvnode
allocvnode
vnlru_proc_wait

2) vnlru is not making progress; it is trying to lock a buffer associated with UFS; its backtrace is:
ssleep
acquire
lockmgr
(BUF_TIMELOCK)
vinvalbuf_bp
vlrureclaim
mountlist_scan

3) The buffer in question is a BUF_CMD_WRITE buffer, its lock is marked by LK_KERNTHREAD, and it is a softdep buffer (seen via b_ops being the softdep bioops). The vnode associated with the buffer is held locked by the vnlru thread. I think these are the buffer's flags: B_CACHE|B_HASHED|B_BNOCLIP|B_IODEBUG|B_VNCLEAN|B_VMIO.
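
The hints above describe a chain of waits, and a tiny wait-for graph is one way to check whether a proposed set of edges closes into a cycle. This is a sketch only: the `wf_*` names and node ids are hypothetical, and only the first two edges (syncer0 waits on vnlru, vnlru waits on the buffer lock) come from the observations above. The third edge, that releasing the buffer depends on the syncer, is an assumption added to show what would complete the cycle; it was not established here.

```c
#include <assert.h>
#include <stdbool.h>

enum { WF_SYNCER = 0, WF_VNLRU = 1, WF_BUF = 2, WF_MAX_NODES = 8 };

/* Each node blocks on at most one other node; -1 means "not waiting". */
struct waitfor_graph {
	int waits_on[WF_MAX_NODES];
};

static void
wf_init(struct waitfor_graph *g)
{
	for (int i = 0; i < WF_MAX_NODES; i++)
		g->waits_on[i] = -1;
}

static void
wf_add_edge(struct waitfor_graph *g, int waiter, int target)
{
	assert(waiter >= 0 && waiter < WF_MAX_NODES);
	assert(target >= 0 && target < WF_MAX_NODES);
	g->waits_on[waiter] = target;
}

/* Follow the chain of waits from start; true if it leads back to start. */
static bool
wf_has_cycle_from(const struct waitfor_graph *g, int start)
{
	int cur = g->waits_on[start];

	/* The step bound keeps this finite even if the chain loops elsewhere. */
	for (int steps = 0; cur != -1 && steps < WF_MAX_NODES; steps++) {
		if (cur == start)
			return true;
		cur = g->waits_on[cur];
	}
	return false;
}
```

With only the two observed edges, `wf_has_cycle_from(&g, WF_SYNCER)` is false; adding the assumed `WF_BUF` to `WF_SYNCER` edge makes it true, so the open question is what the buffer's lock holder is actually waiting for.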

Updated by vsrinivas over 12 years ago

Status changed from Feedback to Resolved

vnode LRU deadlock was solved by 62ae46c924bd3c2efd985c79dac02be03360e6a6. flush_pagedep_deps panic was solved by ca55765aeb1b1a6aa5f39b49ea1e514c7ab60178.
