Bug #1291

closed

NFS + HAMMER => stalls

Added by thomas.nikolajsen over 15 years ago. Updated over 15 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Using 2.2.0-RELEASE with the HAMMER patch from issue1276, or HEAD,
I experience stalls on the NFS client during e.g. buildworld:
it just sits there doing no work. Doing some operation on another NFS mount
(same server) kicks the client into continuing the buildworld.

NFS server uses HAMMER for export, no PFS.

NFS client uses NFS mount for /usr/src & /usr/obj &
some other dirs, e.g. home dirs.

Stalls are not very frequent, typically once during a buildworld.

The problem has been seen rather infrequently on HEAD for a few months.

How is this best debugged?

-thomas
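
One possible way to pin down when a stall happens, offered as a sketch rather than an established procedure: watch the modification time of the buildworld log and flag the build as stalled once the log has not changed for some threshold. The log path, threshold, and function names are hypothetical. The log should live on a local filesystem, since activity on an NFS mount is reported above to kick the build back into motion.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Return how many seconds ago `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime


def is_stalled(path, threshold_s, now=None):
    """True if `path` has not been modified for at least `threshold_s` seconds."""
    return seconds_since_modified(path, now) >= threshold_s


def watch(path, threshold_s=300, interval_s=60):
    """Print a timestamp each time the log looks stalled (runs until interrupted)."""
    while True:
        if is_stalled(path, threshold_s):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, "no build output for", threshold_s, "seconds")
        time.sleep(interval_s)
```

Calling watch("/var/tmp/buildworld.log") (a hypothetical local path) while the build writes its output there would give timestamps that can be matched against network or server logs.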
Actions #1

Updated by dillon over 15 years ago


Does pinging the box over the network unstick it?  If it does,
please report the interface adapter.
-Matt
Matthew Dillon
Actions #2

Updated by thomas.nikolajsen over 15 years ago

>Does pinging the box over the network unstick it? If it does,
>please report the interface adapter.

No, ping works, but it doesn't unstick build.
Same goes for typing in ssh session from NFS client to NFS server.

But `ls -d NFS' does unstick build.

It doesn't seem like a general network problem; but I could be wrong.

NFS server uses xl(4) and client rl(4) NICs;
could change to other NICs, e.g. em(4).

-thomas
Actions #3

Updated by dillon over 15 years ago


Hmm.  Sounds like it could be a wakeup race in NFS, then.  I have
not experienced it yet. I may have to try to reproduce the problem
over here.
When you ran the buildworld what bits were being accessed via NFS?
/usr/src ? /usr/obj ? Both?
-Matt
Matthew Dillon
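
For readers unfamiliar with the term: in a wakeup race (lost wakeup), the wakeup is issued before the waiter has actually gone to sleep, so the waiter sleeps until some unrelated later event wakes it. That would be consistent with the build resuming after unrelated NFS activity. The sketch below is a generic user-space illustration in Python, not DragonFly NFS code, and every name in it is hypothetical. It shows the usual guard: change the state and check the predicate under the same lock.

```python
import threading


class Reply:
    """A one-shot reply slot; hypothetical stand-in for a pending request."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False
        self._value = None

    def deliver(self, value):
        # The state change and the notification happen under the same lock.
        with self._cond:
            self._value = value
            self._done = True
            self._cond.notify_all()

    def wait(self, timeout=None):
        # wait_for re-checks the predicate under the lock, so a deliver()
        # that ran before wait() is still observed instead of being lost.
        with self._cond:
            if not self._cond.wait_for(lambda: self._done, timeout=timeout):
                raise TimeoutError("no reply")
            return self._value
```

If wait() instead slept unconditionally without consulting the flag, a deliver() that ran first would leave it blocked until something else happened to notify it.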
Actions #4

Updated by dillon over 15 years ago

Also, I need the NFS mount line from the fstab and any special options
you might be using. Are nfsiod's running?

-Matt
Matthew Dillon
Actions #5

Updated by thomas.nikolajsen over 15 years ago

Both /usr/src & /usr/obj are NFS mounts.

NFS client rc.conf, NFS part
(the NFS server wasn't used on this host):
nfs_reserved_port_only="YES"
nfs_client_enable="YES"
nfs_server_enable="YES"
mountd_enable="YES"
rpcbind_enable="YES"

fstab: (more entries, but only these used)
ask:/hammer/DragonFlyBSD/current/usr/src /usr/src nfs rw,-i
ask:/hammer/DragonFlyBSD/current/usr/obj /usr/obj nfs rw,-i
ask:/hammer/home/thomas /home/thomas nfs rw,-i
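
As a small aid for reading the entries above, here is a sketch that splits fstab lines into server, export, mount point and options. The function name is hypothetical and the parser only handles the simple whitespace-separated layout shown here. It makes it easy to confirm that all three mounts come from the same server (ask) with the same options (rw,-i).

```python
def parse_fstab_nfs(text):
    """Return (server, export, mountpoint, options) for each NFS line in `text`."""
    mounts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4 or fields[2] != "nfs":
            continue
        server, _, export = fields[0].partition(":")
        mounts.append((server, export, fields[1], fields[3].split(",")))
    return mounts


# The three entries quoted in this comment.
FSTAB = """\
ask:/hammer/DragonFlyBSD/current/usr/src /usr/src nfs rw,-i
ask:/hammer/DragonFlyBSD/current/usr/obj /usr/obj nfs rw,-i
ask:/hammer/home/thomas /home/thomas nfs rw,-i
"""

for server, export, mountpoint, options in parse_fstab_nfs(FSTAB):
    print(server, export, "->", mountpoint, options)
```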

Today the build was unstuck by logging in as root;
root doesn't use NFS mounts for login.

I never had this problem using UFS on NFS server.

-thomas
Actions #6

Updated by thomas.nikolajsen over 15 years ago

This was a false alert, sorry!

I haven't seen these symptoms since the issue was opened.

The problem seems to be a bad LAN ethernet switch / NIC / driver / network stack
(most probable first, in my guess).
I did see a stall once, but moving the NFS client's ethernet cable
to another LAN switch (four are used for the LAN) helped.
I didn't try reconnecting to the same switch port to see if that also
resolves the stall; will do that next time.

-thomas