Bug #693

heavy swapping may kill system?

Added by justin over 7 years ago. Updated about 5 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Description

I had a DragonFly 1.8.2 system lock up on me completely - even the console
was frozen. All that was printed was a long series of this:

swap_pager_getswapspace: failed

I hit the power button and rebooted, but I've noticed an ongoing series of
this: (from dmesg)

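pid 788 (imapd), uid 1006: exited on signal 6
pid 806 (imapd), uid 1006: exited on signal 6
pid 815 (imapd), uid 1006: exited on signal 6
pid 825 (imapd), uid 1006: exited on signal 6
swap_pager_getswapspace: failed
swap_pager_getswapspace: failed
swap_pager_getswapspace: failed
swap_pager_getswapspace: failed
(repeat...)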

Investigating, one of the user accounts has some multi-gigabyte backup
files in ~, and imap-uw is trying to index them as mail messages. The
error here lies either in my configuration or imap-uw, but: it appears to
have locked up the system after a while. Should that runaway application
have been able to bring down the entire computer that way?

History

#1 Updated by memmerto over 7 years ago

IMO, this is what ulimit (and friends) are for.

Here's what ulimit says on my 1.9 box:

root@jekell# ulimit -a
cpu time (seconds, -t) unlimited
file size (512-blocks, -f) unlimited
data seg size (kbytes, -d) 524288
stack size (kbytes, -s) 65536
core file size (512-blocks, -c) unlimited
max memory size (kbytes, -m) unlimited
locked memory (kbytes, -l) unlimited
max user processes (-u) 3214
open files (-n) 6429
virtual mem size (kbytes, -v) unlimited
sbsize (bytes, -b) unlimited
root@jekell#

With unlimited -m/-v/-b/-f/-t settings, there's nothing to restrict a
process from attempting to consume more memory (phys or virt) or CPU than is
available. In your case, having -m/-v set to reasonable limits would have
prevented uw-imap from bringing down the system -- as once those limits were
exceeded, a malloc or shmget would have failed and the process would have
likely died - instead of bringing down the system.
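
To illustrate the point about -v, here is a minimal sketch (not taken from this ticket) that caps a child process's virtual address space and then has it attempt an allocation. The helper name run_with_address_space_limit and the 1 GiB / 2 GiB figures are made up for the example, and it assumes a platform where Python's resource module exposes RLIMIT_AS and the kernel enforces it (the usual case on Linux and the BSDs; I have not checked this on DragonFly 1.8/1.9).

```python
import resource
import subprocess
import sys


def run_with_address_space_limit(limit_bytes, alloc_bytes):
    """Run a child Python process that tries to allocate alloc_bytes while
    its virtual address space soft limit is capped at limit_bytes.

    Returns the child's exit code; 0 means the allocation succeeded.
    """

    def apply_limit():
        # Runs in the child between fork and exec, roughly what
        # `ulimit -S -v` does in a shell. Only the soft limit is lowered;
        # the hard limit is left as inherited.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))

    # The child simply asks for one large zero-filled buffer.
    child_code = "buf = bytearray({})".format(alloc_bytes)
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        preexec_fn=apply_limit,
        stderr=subprocess.DEVNULL,  # hide the child's MemoryError traceback
    )
    return result.returncode


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Asking for 2 GiB under a 1 GiB cap: the allocation is refused and the
    # child exits with an error instead of pushing the machine into swap.
    print("over limit :", run_with_address_space_limit(GIB, 2 * GIB))
    # A small allocation under the same cap is unaffected.
    print("under limit:", run_with_address_space_limit(GIB, 1024 ** 2))
```

The over-limit child dies on its own with a failed allocation, which is the behaviour described above: the runaway process fails, and the rest of the system is left alone.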

It seems to me that in the BSD world, the default limits are quite
permissive (i.e., unlimited), whereas in the "commercial" Un*x world (e.g., AIX,
Solaris, etc.) the ulimits are set to reasonably restrictive values. On the
AIX boxes I use, I always end up having to change the ulimits for my sessions,
since files get truncated at 1G, applications fail because they can't allocate
enough memory, and core files always get truncated. It's quite a pain, but
it's also quite useful from an administration POV since rogue users aren't
able to take the system down.

Regards,
--
Matt Emmerton

#2 Updated by dillon over 7 years ago

:I had a DragonFly 1.8.2 system lock up on me completely - even the console
:was frozen. All that was printed was a long series of this:
:
:swap_pager_getswapspace: failed
:
:I hit the power button and rebooted, but I've noticed an ongoing series of
:this: (from dmesg)

Unless the swap usage is creating a memory leak, I think the two might
be unrelated.

:> pid 788 (imapd), uid 1006: exited on signal 6
:> pid 806 (imapd), uid 1006: exited on signal 6
:> pid 815 (imapd), uid 1006: exited on signal 6
:> pid 825 (imapd), uid 1006: exited on signal 6
:> swap_pager_getswapspace: failed
:> swap_pager_getswapspace: failed
:> swap_pager_getswapspace: failed
:> swap_pager_getswapspace: failed
:(repeat...)
:
:Investigating, one of the user accounts has some multi-gigabyte backup
:files in ~, and imap-uw is trying to index them as mail messages. The
:error here lies either in my configuration or imap-uw, but: it appears to
:have locked up the system after a while. Should that runaway application
:have been able to bring down the entire computer that way?

No, the worst that should happen is that the kernel will kill the
largest offending processes on the system, which is what it did. At
least theoretically.

-Matt
Matthew Dillon

#3 Updated by justin over 7 years ago

I've noticed it went through several iterations of swapspace errors and the
IMAP server being killed; it could be that this activity, combined with
something else that was going on, was enough to trigger a problem.

I'll keep an eye on it; I haven't been able to get it to repeat again.

#4 Updated by tuxillo about 5 years ago

A few weeks ago, I did a stupid test on my system. I tried hard to kill it
by taking up all the memory I could, in the hope of crashing the system. The
offending big memory consumers were just killed and the kernel kept running flawlessly.

Justin, did you have similar issues? Can we close this ticket?

#5 Updated by justin about 5 years ago

Marking resolved. The original circumstance where imap-uw was reading huge
non-mail files hasn't happened again, but then again neither has the crash.

I'll mark this closed.
