Bug #2542
closed
Kernel freezes for minutes at a time
10%
Description
Running the new poudriere-devel 2.3_4 port makes the system freeze for minutes at a time after a while.
- This does not happen at first; poudriere has to run for at least one hour before the freezes become visible
- Display is frozen in place, periodically refreshed widgets such as clocks stop moving
- Keyboard and mouse input is not processed but is handled after a few minutes when the system becomes operational again
- Launching big applications such as firefox is enough to make the system freeze
- "Warning: deep namecache recursion at (null)" messages are printed on the console
When the system becomes operational again, top(1) shows one CPU thread at 100% CPU usage (system time).
A core dump taken when the system is in this state will be made available.
Updated by ftigeot over 11 years ago
This behavior was seen on 8-thread Xeon systems, with both i386 and amd64 DragonFly-3.4rc installations.
Updated by ftigeot over 11 years ago
In one freeze instance an xterm was still responsive and I could run some simple utilities.
systat -pv 1:
        timer   ipi extint user% nice%  sys% intr% idle%  tokcol token
cpu0      282     8     79   0.0   0.0   0.0   0.8  99.2       0
cpu1      283     0      0   0.0   0.0   0.0   0.0 100.0       0
cpu2      302     1      0   0.0   0.0   0.0   0.0 100.0       0
cpu3      281     0      0   0.0   0.0 100.0   0.0   0.0 6504549 vmobj
cpu4      282     0      0   0.0   0.0   0.0   0.0 100.0       0
cpu5    5 391     4      0   0.8   0.0   0.0   0.0  99.2   54844 mp_toke
cpu6      288     3      0   1.5   0.0   0.0   0.0  98.5       0
cpu7      282     8      0   0.0   0.0   0.0   0.0 100.0      19 pool
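The systat output above can also be scanned mechanically for the stuck CPU. The following is a minimal Python sketch and not part of the original report: it assumes the column order shown in the systat header, the helper names `parse_systat_rows` and `busiest_cpu` are hypothetical, and the sample holds three rows copied from the output above.

```python
# Sketch: find the CPU spending the most time in the kernel and the token it contends on.
# Assumed row layout (from the systat header):
#   name timer ipi extint user% nice% sys% intr% idle% tokcol [token]

SAMPLE = """\
cpu0 282 8 79 0.0 0.0 0.0 0.8 99.2 0
cpu3 281 0 0 0.0 0.0 100.0 0.0 0.0 6504549 vmobj
cpu7 282 8 0 0.0 0.0 0.0 0.0 100.0 19 pool
"""

def parse_systat_rows(text):
    """Parse per-CPU rows into dicts; rows with too few fields are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].startswith("cpu"):
            continue
        rows.append({
            "cpu": fields[0],
            "sys": float(fields[6]),      # sys% column
            "idle": float(fields[8]),     # idle% column
            "tokcol": int(fields[9]),     # token collision count
            "token": fields[10] if len(fields) > 10 else None,
        })
    return rows

def busiest_cpu(rows):
    """Return the row with the highest system time."""
    return max(rows, key=lambda r: r["sys"])

rows = parse_systat_rows(SAMPLE)
worst = busiest_cpu(rows)
print(worst["cpu"], worst["sys"], worst["tokcol"], worst["token"])
```

On this sample the sketch picks out cpu3, which sits at 100% system time with 6504549 collisions on the vmobj token.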
top(1) also showed some 'mv' zombie processes; my window manager was stuck in the 'RUN' state and did not respond to any command.
Updated by ftigeot over 11 years ago
A core dump is available in leaf:~ftigeot/crash/crash.issue2542
Updated by ftigeot over 11 years ago
- % Done changed from 0 to 10
Setting vm.read_shortcut_enable to 0 has made the freezes disappear.
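For anyone wanting to try the same workaround, the Python sketch below builds the sysctl(8) invocation using the usual BSD `name=value` syntax. It is an illustration only: the helper name `sysctl_set_command` is hypothetical, the command is printed rather than executed, and actually applying it is assumed to need root privileges and a kernel that exposes this sysctl.

```python
import shlex

def sysctl_set_command(name, value):
    """Return the argv for setting a sysctl via sysctl(8) name=value syntax."""
    return ["sysctl", f"{name}={value}"]

cmd = sysctl_set_command("vm.read_shortcut_enable", 0)
print(shlex.join(cmd))
# To apply it for real (as root, at your own risk):
#   import subprocess; subprocess.run(cmd, check=True)
```

A line of the same `name=value` form in /etc/sysctl.conf is the usual way to keep such a setting across reboots.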
Updated by marino over 11 years ago
As an interested 3rd party, does "vm.read_shortcut_enable=1=freeze" imply where the problem lies? And do we know why this phenomenon (apparently) is not seen universally?
Updated by marino over 11 years ago
I had a machine building that was in "suspended animation" for 4 days.
It started working spontaneously after I hit "ctrl-t" to see what state poudriere was in. Apparently this stimulation kicked it back on track. It uses hammer.
ftigeot thinks this is the same thing.
If so, bug report confirmed.
Updated by daniel.ramos about 11 years ago
- Description updated
I was able to reproduce this (or a very similar) bug on a dual-core KVM VPS running DFly 3.4.3. Basically, it was impossible to complete a "make buildworld" due to long and frequent freezes (clock skew, no response to keyboard input or ping). My setup does not use HAMMER, and changing vm.read_shortcut_enable did not seem to make any difference.
Since updating to 3.5 master (by waiting out the freezes), the issue is resolved - I'm now able to make buildworld without any freezing whatsoever.
Updated by ftigeot almost 8 years ago
- Status changed from New to Resolved
Fixed in more recent DragonFly versions, closing.