Bug #2542
closed
Kernel freezes for minutes at a time
Added by ftigeot over 11 years ago.
Updated almost 8 years ago.
Description
Running the new poudriere-devel 2.3_4 port makes the system freeze for minutes at a time after a while.
- This is not the case at first; poudriere has to run for at least one hour before the phenomenon becomes visible
- Display is frozen in place, periodically refreshed widgets such as clocks stop moving
- Keyboard and mouse input is not processed during a freeze but is handled a few minutes later, when the system becomes operational again
- Launching big applications such as firefox is enough to make the system freeze
- "Warning: deep namecache recursion at (null)" messages are printed on the console
When the system becomes operational again, top(1) shows one CPU thread using 100% CPU (system time).
A core dump taken when the system is in this state will be made available.
This behavior was seen on 8-thread Xeon systems, with both i386 and amd64 DragonFly-3.4rc installations.
In one freeze instance an xterm was still responsive and I could run some simple utilities.
systat -pv 1:
timer ipi extint user% nice% sys% intr% idle% tokcol token
cpu0 282 8 79 0.0 0.0 0.0 0.8 99.2 0
cpu1 283 0 0 0.0 0.0 0.0 0.0 100.0 0
cpu2 302 1 0 0.0 0.0 0.0 0.0 100.0 0
cpu3 281 0 0 0.0 0.0 100.0 0.0 0.0 6504549 vmobj
cpu4 282 0 0 0.0 0.0 0.0 0.0 100.0 0
cpu5 5 391 4 0 0.8 0.0 0.0 0.0 99.2 54844 mp_toke
cpu6 288 3 0 1.5 0.0 0.0 0.0 98.5 0
cpu7 282 8 0 0.0 0.0 0.0 0.0 100.0 19 pool
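For anyone comparing several such captures, here is a minimal Python sketch that picks out the CPUs stuck in system time and the token they are colliding on. It assumes the column order shown in the header above. Because the cpu5 row carries one more numeric field than the header accounts for, it reads each row from the right (optional token name, then tokcol, then the five percentages). The names `CpuRow`, `parse_row` and `busy_cpus`, and the 90% threshold, are made up for this illustration and are not part of systat.

```python
from typing import List, NamedTuple, Optional

# Rows copied from the systat -pv capture in this report.
SAMPLE = """\
cpu0 282 8 79 0.0 0.0 0.0 0.8 99.2 0
cpu1 283 0 0 0.0 0.0 0.0 0.0 100.0 0
cpu2 302 1 0 0.0 0.0 0.0 0.0 100.0 0
cpu3 281 0 0 0.0 0.0 100.0 0.0 0.0 6504549 vmobj
cpu4 282 0 0 0.0 0.0 0.0 0.0 100.0 0
cpu5 5 391 4 0 0.8 0.0 0.0 0.0 99.2 54844 mp_toke
cpu6 288 3 0 1.5 0.0 0.0 0.0 98.5 0
cpu7 282 8 0 0.0 0.0 0.0 0.0 100.0 19 pool
"""


class CpuRow(NamedTuple):
    cpu: str
    user_pct: float
    nice_pct: float
    sys_pct: float
    intr_pct: float
    idle_pct: float
    tokcol: int
    token: Optional[str]


def _is_number(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


def parse_row(line: str) -> CpuRow:
    """Parse one per-CPU row, reading the fields from the right."""
    fields = line.split()
    cpu, rest = fields[0], fields[1:]
    # The token name is only present when a token is being contended.
    token = rest.pop() if rest and not _is_number(rest[-1]) else None
    if len(rest) < 6:
        raise ValueError("unexpected row layout: " + line)
    # Last numeric field is tokcol; the five before it are the percentages.
    user, nice, sys_, intr, idle = (float(x) for x in rest[-6:-1])
    return CpuRow(cpu, user, nice, sys_, intr, idle, int(rest[-1]), token)


def busy_cpus(text: str, sys_threshold: float = 90.0) -> List[CpuRow]:
    """Return the rows whose sys% is at or above the (arbitrary) threshold."""
    rows = [parse_row(l) for l in text.splitlines() if l.startswith("cpu")]
    return [r for r in rows if r.sys_pct >= sys_threshold]


if __name__ == "__main__":
    for row in busy_cpus(SAMPLE):
        print(f"{row.cpu} sys={row.sys_pct}% tokcol={row.tokcol} token={row.token}")
```

On the capture above this singles out cpu3, which is spending all of its time in the kernel with a very large collision count on the vmobj token.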
top(1) also showed some 'mv' zombie processes; my window manager was frozen in 'RUN' state and did not respond to any command.
A core dump is available in leaf:~ftigeot/crash/crash.issue2542
- % Done changed from 0 to 10
Setting vm.read_shortcut_enable to 0 has made the freezes disappear.
As an interested 3rd party, does "vm.read_shortcut_enable=1=freeze" imply where the problem lies? And do we know why this phenomenon (apparently) is not seen universally?
I had a machine that was in the middle of a build and sat in "suspended animation" for 4 days.
It started working spontaneously after I hit "ctrl-t" to see what state poudriere was in. Apparently this stimulation kicked it back on track. The machine uses HAMMER.
ftigeot thinks this is the same thing.
If so, bug report confirmed.
- Description updated (diff)
I was able to reproduce this (or a very similar) bug on a dual-core KVM VPS running DFly 3.4.3. Basically, it was impossible to complete a "make buildworld" due to long and frequent freezes (clock skew, no response to keyboard input or ping). My setup does not use HAMMER, and changing vm.read_shortcut_enable did not seem to make any difference.
Since updating to 3.5 master (by waiting out the freezes), the issue is resolved - I'm now able to make buildworld without any freezing whatsoever.
- Status changed from New to Resolved
Fixed in more recent DragonFly versions, closing.