:New submission from Thomas Nikolajsen <thomas.nikolajsen@mail.dk>:
:
:dfly 2.4.0
:
:'git gc' stalled (host didn't freeze) after some
:'nfs server .. not responding' / '.. is alive again'
:git repo on nfs mounted dir on local host.
:Other commands using nfs mount also stalls.
:
:Same experience with nfs mounted on remote host.
:(have seen this a few times over the last ~2 months,
:was guessing problem was network HW (although ping did respond))
:
:(did `shutdown' shortly before escape to debugger;
:it didn't seem to shutdown: returned to shell prompt;
:can do other core dump if needed)
:
:'git gc' did succeed using local fs (hammer) directly (no nfs).
:Can reproduce, as prev. state of git repo is in snapshot.
:
:Core dump *.39 uploading to leaf.
You have a ton of NFS mounts here. Hmm. The NFS client is stuck
waiting for a response from the NFS server (on the same host). The
NFS server processes (the nfsd's) are stuck in a vnode lock on the
HAMMER filesystem, waiting for the buffer cache.

This looks like another HAMMER buffer cache exhaustion deadlock,
again probably due to the 128M of RAM in the machine. However,
it looks like a different issue than the one from your other
bug report.

I dug into why HAMMER was stalling in the core and it looked like
it shouldn't be stalling. HAMMER was only reserving one buffer.
The bufdaemon and bufdaemon_hw are both in wdrn1, which implies
they were flushing data to disk.

It could be that the issue here is not an actual deadlock but simply
a great deal of disk write activity causing long stalls in the
system. Did you notice a significant amount of hard drive activity
while the system was in this state? The only thing you are running
is the 'git gc'. It could be that write activity from the git gc
is creating long stalls and causing NFS to report the problem.
If that is the case, the issue is probably more one of HAMMER simply
being massively inefficient due to the tiny buffer cache, but otherwise
operating.
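
One rough way to tell a real deadlock from long stalls would be to
run a small write probe against the affected mount while the 'git gc'
is going. The following is only a sketch in Python, not something
taken from the core dump analysis: the function name
probe_write_latency and the probe file name are made up for
illustration, and it assumes Python is available on the box. If the
probe's writes complete, just very slowly, the system is stalling
rather than deadlocked. If the probe never returns, that points back
at a deadlock.

```python
import os
import sys
import time


def probe_write_latency(directory, samples=5, interval=1.0, payload=b"x" * 4096):
    """Write and fsync a small file `samples` times in `directory`.

    Returns a list with the wall-clock seconds each write+fsync took.
    The helper name and the probe file name are hypothetical.
    """
    path = os.path.join(directory, ".write_probe.tmp")
    latencies = []
    try:
        for i in range(samples):
            start = time.monotonic()
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
            try:
                os.write(fd, payload)
                os.fsync(fd)  # force the data out so the flush time is measured
            finally:
                os.close(fd)
            latencies.append(time.monotonic() - start)
            if i + 1 < samples:
                time.sleep(interval)
    finally:
        # Clean up the probe file if it was created.
        if os.path.exists(path):
            os.remove(path)
    return latencies


if __name__ == "__main__":
    # Usage: python probe.py /path/to/mount  (defaults to the current directory)
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for n, secs in enumerate(probe_write_latency(target), 1):
        print(f"write {n}: {secs:.3f}s")
```

Running it once on the NFS mount and once directly on the HAMMER
filesystem would also show whether the stalls are specific to the NFS
path or are visible on the local filesystem as well.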
-Matt
Matthew Dillon
<dillon@backplane.com>