Bug #651

system semi-freezes on mbuf cluster limit

Added by corecode over 7 years ago. Updated almost 3 years ago.

Status: Closed
Start date:
Priority: High
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

I just experienced a nasty situation: I ran out of mbuf clusters (6656) and ppp was one of the processes stuck in objcache_get.

Even after some clusters drained (according to netstat -m output), the objcache depot didn't get free entries back and ppp stayed stuck. And of course, because of this, no mbuf clusters were freed (ppp would have to transmit them, I guess). I was doing some serious down/uploading at the time.

This should not happen, or the system should at least degrade more gracefully.

cheers
simon

History

#1 Updated by dillon over 7 years ago

:I just experienced a nasty situation: I ran out of mbuf clusters (6656)
:and ppp was one of the processes stuck in objcache_get.
:
:even after some clusters drained (from netstat -m output), the objcache
:depot didn't get free entries back and ppp stayed stuck. and of course
:because of this no mbuf clusters were freed (ppp would have to transmit
:them, i guess). I was doing some serious down/uploading at the moment.
:
:this should not happen, or at least more gracefully.
:
:cheers
: simon

I was debugging something similar earlier this month. Basically
what can happen is that if a machine is running a lot of simultaneous
TCP connections, particularly outgoing connections which may build up
a lot of data in the socket buffers, the machine can hit its mbuf
cluster limit.

Is that what is happening to you? Lots of outgoing tcp connections
with lots of data backed up (netstat -tn | fgrep tcp4)? I want to
make sure it isn't an mbuf leak.
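As a rough aid for answering that question, the sketch below totals the Send-Q column for tcp4 lines in `netstat -tn` output. It assumes the usual BSD column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state); the function name `summarize_tcp4_backlog` is hypothetical and not part of any existing tool.

```python
def summarize_tcp4_backlog(netstat_output):
    """Summarize queued send data for tcp4 lines of `netstat -tn` output.

    Assumes the column order: Proto Recv-Q Send-Q Local Foreign (state).
    Returns (connection_count, backed_up_count, total_send_q_bytes).
    """
    connections = 0
    backed_up = 0
    total_send_q = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and non-tcp4 sockets.
        if len(fields) < 5 or fields[0] != "tcp4":
            continue
        try:
            send_q = int(fields[2])
        except ValueError:
            continue  # not a data line after all
        connections += 1
        total_send_q += send_q
        if send_q > 0:
            backed_up += 1
    return connections, backed_up, total_send_q
```

A large total spread over many connections points to socket-buffer backlog; a high cluster count with little queued data would be more suspicious of a leak.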

When the cluster limit is reached, the sheer demand for packets
prevents the system from being able to recover mbufs. Eventually the
tcp connections start timing out and freeing all of their mbufs, and
the machine then recovers.

At the moment the only real solution is to increase the number of mbufs
at boot time (set kern.ipc.nmbclusters and kern.ipc.nmbufs in
/boot/loader.conf).
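For illustration, the two tunables would appear in /boot/loader.conf roughly as follows. The numbers are placeholders chosen only to show the syntax, not tuned recommendations; suitable values depend on available memory and workload.

```
# /boot/loader.conf (illustrative values only)
kern.ipc.nmbclusters="32768"
kern.ipc.nmbufs="65536"
```

The settings take effect at the next boot.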

One thing that would be nice would be to have some sort of algorithm,
similar to what linux has, where it detects the mbuf load on the system
and reduces the amount of data it allows the tcp connections to build
up dynamically, resulting in more graceful degradation.
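A minimal sketch of that idea, not Linux's actual algorithm and not anything in the DragonFly kernel: shrink the per-connection send-buffer limit linearly once mbuf cluster usage passes a low-water mark. All names, defaults, and the linear policy are assumptions made for illustration.

```python
def scaled_sndbuf_limit(clusters_in_use, cluster_limit,
                        max_buf=65536, min_buf=4096, low_water=0.5):
    """Return a per-connection send-buffer limit in bytes.

    Below the low-water fraction of the cluster limit, connections get
    max_buf. Above it, the limit falls linearly, reaching min_buf when
    every cluster is in use.
    """
    if cluster_limit <= 0:
        raise ValueError("cluster_limit must be positive")
    if not 0.0 <= low_water < 1.0:
        raise ValueError("low_water must be in [0, 1)")
    # Clamp the load to [0, 1] so odd counter values cannot escape the range.
    load = min(max(clusters_in_use / cluster_limit, 0.0), 1.0)
    if load <= low_water:
        return max_buf
    pressure = (load - low_water) / (1.0 - low_water)
    return int(max_buf - pressure * (max_buf - min_buf))
```

With the 6656-cluster limit from the report, this policy would leave connections at the full 65536 bytes up to 3328 clusters in use and squeeze them to 4096 bytes at exhaustion.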

-Matt
Matthew Dillon

#2 Updated by sepherosa over 5 years ago

The bug that the following commit was intended to fix is related to this report:
8d968f1d34d7f672c6bda28d5a6c71f93bba1c2c

#3 Updated by pavalos almost 3 years ago

  • Description updated (diff)
  • Status changed from New to Closed
  • Assignee deleted (0)
