Bug #2365
Hammer pfs-destroy and prune-everything can cause network loss
| Status: | Closed | Start date: | 05/09/2012 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 0% |
| Category: | - | | |
| Target version: | - | | |
Description
Fairly consistently, when I run hammer prune-everything, at some point in the process my ssh session hangs and does not recover, and the machine becomes unreachable on the network. It eventually returns to normal and I can reconnect to it. Today I ran a hammer pfs-destroy on a 1.3TB pfs and the same thing happened. While the network was out, the system was writing these messages to the log:
swap_pager: indefinite wait buffer: offset: 6489661440, size: 4096
swap_pager: indefinite wait buffer: offset: 23187279872, size: 4096
swap_pager: indefinite wait buffer: offset: 6642294784, size: 4096
swap_pager: indefinite wait buffer: offset: 6536355840, size: 4096
swap_pager: indefinite wait buffer: offset: 6500122624, size: 4096
swap_pager: indefinite wait buffer: offset: 6489661440, size: 4096
swap_pager: indefinite wait buffer: offset: 6725648384, size: 4096
swap_pager: indefinite wait buffer: offset: 23187279872, size: 4096
swap_pager: indefinite wait buffer: offset: 6642294784, size: 4096
swap_pager: indefinite wait buffer: offset: 6536355840, size: 4096
swap_pager: indefinite wait buffer: offset: 6500122624, size: 4096
swap_pager: indefinite wait buffer: offset: 861982720, size: 4096
swap_pager: indefinite wait buffer: offset: 6489661440, size: 4096
swap_pager: indefinite wait buffer: offset: 6725648384, size: 4096
swap_pager: indefinite wait buffer: offset: 23187279872, size: 4096
swap_pager: indefinite wait buffer: offset: 6642294784, size: 4096
swap_pager: indefinite wait buffer: offset: 1027108864, size: 4096
swap_pager: indefinite wait buffer: offset: 6536355840, size: 4096
swap_pager: indefinite wait buffer: offset: 6500122624, size: 4096
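The messages above repeat a small set of offsets: the 19 lines contain only 8 distinct values. A minimal sketch for tallying them from a saved log follows. It assumes the messages have the exact format quoted above; the function name `tally_offsets` and the three-line sample are illustrative only and not part of any DragonFly tool.

```python
import re
from collections import Counter

# Matches the console lines quoted in this report (format assumed from the log above).
LINE_RE = re.compile(
    r"swap_pager: indefinite wait buffer: offset: (\d+), size: (\d+)"
)


def tally_offsets(log_text):
    """Return a Counter mapping each reported swap offset to its number of occurrences."""
    counts = Counter()
    for match in LINE_RE.finditer(log_text):
        counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample: three lines copied from the log in this report.
sample = """\
swap_pager: indefinite wait buffer: offset: 6489661440, size: 4096
swap_pager: indefinite wait buffer: offset: 23187279872, size: 4096
swap_pager: indefinite wait buffer: offset: 6489661440, size: 4096
"""

if __name__ == "__main__":
    # Print offsets from most to least frequently reported.
    for offset, n in tally_offsets(sample).most_common():
        print(f"offset {offset}: {n} report(s)")
```

A tally like this only shows whether the waits are concentrated on a few swap blocks or spread across many; it says nothing about the cause.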
History
Updated by swildner about 1 year ago
Is this on i386 or x86_64? I've seen this issue here too, although not triggered by prune-everything or pfs-destroy. It just happens from time to time on my i386 box.
http://87.78.98.243/tmp/IMG_20120424_220035.jpg
In my case the trigger is not clear.
And I've never seen it on any x86_64 box.
Updated by t_dfbsd 12 months ago
This is on x86_64 AMD. I should add that those swap_pager messages were happening during the pfs-destroy, but I don't know whether or not that was the trigger. More concerning to me is the loss of network connectivity.
Updated by swildner 9 months ago
- File indefinite_wait_buffer.png added
- File indefinite_wait_buffer2.png added
- File indefinite_wait_buffer3.png added
I just had the "indefinite wait buffer" in an i386 VM and I captured the beginnings of it. I was building in pkgsrc and then thought I should clean up my HAMMER, so I ran /etc/periodic/daily/160.clean-hammer. It went fine until it got to /home (second image). Although none of the package building took place on /home (afaics), it hung there, and soon the "indefinite wait buffer" messages started to appear on the console (first image). At first the offsets were all the same, but later on there were several different ones (third image).
Maybe this gives a better clue? Until now I had never witnessed it as it happened.
Updated by t_dfbsd 8 months ago
I upgraded my system (x86_64) to include all the recent scheduler changes and tried hammer prune-everything, and for the first time in a long time it didn't kill the network. I did notice a slowdown, but that was all. Could those changes have fixed this problem? I'll do more testing.