Bug #2534

Network connections freeze with heavy email traffic

Added by ftigeot over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 03/27/2013
Due date:
% Done: 100%

Description

Under heavy load (receiving a stream of mail at 1 Gb/s from a LAN machine), network connections freeze on a mail relay machine.
This machine is also a PPP/ADSL router.

The console shows this message:
Warning, objcache(cluster mbuf): Exhausted!

netstat -m output:
5306/4480 mbuf clusters in use (current/max)
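
A minimal sketch of how this figure could be checked from a script, assuming the `netstat -m` line always has the form quoted above. The function names are hypothetical and only the single line shown in this report was used as a model:

```python
import re

# Matches lines such as "5306/4480 mbuf clusters in use (current/max)".
# Assumption: the format is inferred from the one line quoted in this report.
CLUSTER_RE = re.compile(r"(\d+)/(\d+) mbuf clusters in use \(current/max\)")


def parse_mbuf_clusters(netstat_output):
    """Return (current, max) from `netstat -m` text, or None if no line matches."""
    match = CLUSTER_RE.search(netstat_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def clusters_exhausted(netstat_output):
    """Return True when the cluster count has reached or passed the maximum."""
    usage = parse_mbuf_clusters(netstat_output)
    if usage is None:
        raise ValueError("no mbuf cluster line found")
    current, maximum = usage
    return current >= maximum


sample = "5306/4480 mbuf clusters in use (current/max)"
print(parse_mbuf_clusters(sample))
print(clusters_exhausted(sample))
```

Run periodically against live `netstat -m` output, a check like this could flag the condition before connections freeze.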

Some processes (most often ppp and sendmail) end up stuck in unusual wait states, such as tunread or objcache, as shown by these top outputs:

  PID USERNAME NICE  SIZE    RES STATE    CPU   TIME  CTIME   CPU COMMAND
 1879 root        0   14M  2864K CPU0       0   0:00   0:00 0.34% top
  260 root        0   24M  2476K tunread    2   0:05   0:05 0.00% ppp

  PID USERNAME NICE  SIZE    RES STATE    CPU   TIME  CTIME   CPU COMMAND
82699 ftigeot     0   15M  2872K CPU3       3   0:00   0:00 0.39% top
  265 root        0   25M  1172K objcache   1 154:06 154:06 0.00% ppp

Only a reboot allows the machine to resume network operations.

A core dump is available in leaf:~ftigeot/crash/crash.objcache

History

#1 Updated by sepherosa over 1 year ago

On Wed, Mar 27, 2013 at 3:42 PM, Francois Tigeot via Redmine
<> wrote:
>
> Issue #2534 has been reported by ftigeot.
>
> ----------------------------------------
> Bug #2534: Network connections freeze with heavy email traffic
> http://bugs.dragonflybsd.org/issues/2534

I suggest testing the latest master; many network bugs have been fixed
since the kernel version that crashed.

Best Regards,
sephe

--
Tomorrow Will Never Die

#2 Updated by ftigeot over 1 year ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

The issue does seem to be resolved: I couldn't reproduce it with today's kernel.
The number of mbuf clusters in use no longer exceeds the maximum limit.

Thanks for the tip!
