Bug #2534

closed

Network connections freeze with heavy email traffic

Added by ftigeot over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 03/27/2013
Due date:
% Done: 100%
Estimated time:

Description

When under heavy load (receiving a stream of mails at 1Gb/s from a LAN machine), network connections freeze on a mail relay machine.
This machine is also a PPP/ADSL router.

The console shows this message:
Warning, objcache(cluster mbuf): Exhausted!

netstat -m output:
5306/4480 mbuf clusters in use (current/max)

Some processes (most often ppp and sendmail) end up in a weird state, as shown by these top outputs:

  PID USERNAME NICE SIZE   RES STATE    CPU   TIME  CTIME   CPU COMMAND
 1879 root        0  14M 2864K CPU0       0   0:00   0:00 0.34% top
  260 root        0  24M 2476K tunread    2   0:05   0:05 0.00% ppp

  PID USERNAME NICE SIZE   RES STATE    CPU   TIME  CTIME   CPU COMMAND
82699 ftigeot     0  15M 2872K CPU3       3   0:00   0:00 0.39% top
  265 root        0  25M 1172K objcache   1 154:06 154:06 0.00% ppp

Only a reboot allows the machine to resume network operations.

A core dump is available in leaf:~ftigeot/crash/crash.objcache
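
For anyone who wants to watch for this condition, here is a minimal Python sketch that parses the mbuf cluster line quoted above and flags when the current count has reached the maximum. It assumes the line has exactly the format shown in this report; the function names are hypothetical, and other kernel versions may print the line differently.

```python
import re

# Pattern for the summary line as quoted in this report, e.g.
# "5306/4480 mbuf clusters in use (current/max)".
# Assumption: other netstat versions may format this line differently.
CLUSTER_RE = re.compile(r"(\d+)/(\d+) mbuf clusters in use \(current/max\)")


def parse_mbuf_clusters(netstat_output):
    """Return (current, maximum) from `netstat -m` text, or None if absent."""
    match = CLUSTER_RE.search(netstat_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def clusters_exhausted(netstat_output):
    """True when the reported current count has reached or passed the maximum."""
    parsed = parse_mbuf_clusters(netstat_output)
    if parsed is None:
        return False
    current, maximum = parsed
    return current >= maximum


if __name__ == "__main__":
    sample = "5306/4480 mbuf clusters in use (current/max)"
    print(parse_mbuf_clusters(sample), clusters_exhausted(sample))
```

Fed the line from this report, it yields (5306, 4480) and reports the clusters as exhausted.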

Actions #1

Updated by sepherosa over 11 years ago

On Wed, Mar 27, 2013 at 3:42 PM, Francois Tigeot via Redmine wrote:

Issue #2534 has been reported by ftigeot.

I suggest testing the latest master; many network bugs have been fixed since
the kernel version that crashed.

Best Regards,
sephe

--
Tomorrow Will Never Die

Actions #2

Updated by ftigeot over 11 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

The issue does indeed seem to be resolved; I couldn't reproduce it with today's kernel.
The number of mbuf clusters in use no longer exceeds the maximum limit.

Thanks for the tip!
