Bug #2268

closed

Panic when loading BGP full route table IPv4 + IPv6

Added by david about 12 years ago. Updated about 12 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 01/02/2012
Due date:
% Done: 0%
Estimated time:
Description

Hi,

I'm using DragonFlyBSD to build border routers.
When loading routes on an SMP system (382492 routes: 375250 IPv4 + 7242 IPv6),
the kernel panics.

The test machine runs DragonFly v2.13.0.781.gfaddf-DEVELOPMENT (i386), and the routes are injected by Quagga 0.99.17.

The backtrace is as follows:

(kgdb) backtrace
#0  _get_mycpu () at ./machine/thread.h:79
#1  md_dumpsys (di=0xc0ae6260) at /usr/src/sys/platform/pc32/i386/dump_machdep.c:264
#2  0xc0372a18 in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:925
#3  0xc037302e in boot (howto=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:387
#4  0xc0373297 in panic (fmt=0xc06977f7 "from debugger") at /usr/src/sys/kern/kern_shutdown.c:831
#5  0xc018ac62 in db_panic (addr=-1067341502, have_addr=0, count=-1, modif=0xd71dcb70 "")
   at /usr/src/sys/ddb/db_command.c:445
#6  0xc018b32f in db_command (aux_cmd_tablep_end=0xc0721904, aux_cmd_tablep=0xc07218e8,
   cmd_table=<optimized out>, last_cmdp=<optimized out>) at /usr/src/sys/ddb/db_command.c:401
#7  db_command_loop () at /usr/src/sys/ddb/db_command.c:467
#8  0xc018de8e in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
#9  0xc061acb5 in kdb_trap (type=3, code=0, regs=0xd71dcc90)
   at /usr/src/sys/platform/pc32/i386/db_interface.c:152
#10 0xc064475b in trap (frame=0xd71dcc90) at /usr/src/sys/platform/pc32/i386/trap.c:844
#11 0xc061c1a7 in calltrap () at /usr/src/sys/platform/pc32/i386/exception.s:787
#12 0xc061a942 in breakpoint () at ./cpu/cpufunc.h:72
#13 Debugger (msg=0xc06b0e43 "panic") at /usr/src/sys/platform/pc32/i386/db_interface.c:334
#14 0xc0373278 in panic (fmt=0xc06fbf44 "rtrequest1_msghandler: rtrequest table error was not on cpu #0")
   at /usr/src/sys/kern/kern_shutdown.c:822
#15 0xc0417bea in rtrequest1_msghandler (msg=0xd5d1fc74) at /usr/src/sys/net/route.c:809
#16 0xc041647c in rtable_service_loop (dummy=0x0) at /usr/src/sys/net/route.c:199
#17 0xc037d9ff in lwkt_deschedule_self (td=Cannot access memory at address 0x8
) at /usr/src/sys/kern/lwkt_thread.c:362
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Actions #1

Updated by sepherosa about 12 years ago

On Mon, Jan 2, 2012 at 7:11 PM, David BÉRARD via Redmine wrote:

> Hi,
>
> I'm using DragonFlyBSD to build border routers.
> When loading routes on an SMP system (382492 routes: 375250 IPv4 + 7242 IPv6),
> the kernel panics.
>
> The test machine runs DragonFly v2.13.0.781.gfaddf-DEVELOPMENT (i386), and the routes are injected by Quagga 0.99.17.

Hmm, I have added some logging for the R_Malloc error in rtrequest1. I
suspect that error is what causes your panic.
Could you retry with the latest master @ 38c2eb266c21ce17c37c1b4b8d2a6bc8c73aa26c?

Try to locate panic messages like:
rtrequest1: alloc rtentry failed on on cpuX

The default kmalloc limit of M_RTABLE could be too small in your case.
If the panic is caused by the kmalloc size limit, we may be able to fix it with some
simple changes.

Best Regards,
sephe

--
Tomorrow Will Never Die

Actions #2

Updated by david about 12 years ago

Running @ 38c2eb266c21ce17c37c1b4b8d2a6bc8c73aa26c, I get this panic message:
rtrequest1_msghandler: rtrequest table error was cpu1, err 55

But this panic is not always reproducible (about 1 run in 2). Sometimes Quagga can inject only
139245 routes (always this number, as returned by netstat -rn | wc -l),
and I then get errors like these (for example, after ping 10.0.1.254):
arplookup 10.0.1.254 failed: could not allocate llinfo
arpresolve: can't allocate llinfo for 10.0.1.254 rt

Best Regards,
Thanks for your work.

Actions #3

Updated by sepherosa about 12 years ago

On Mon, Jan 16, 2012 at 12:14 AM, David BÉRARD via Redmine wrote:

> Running @ 38c2eb266c21ce17c37c1b4b8d2a6bc8c73aa26c, I get this panic message:
>        rtrequest1_msghandler: rtrequest table error was cpu1, err 55

Yeah, there is not enough memory in M_RTABLE.

Try the latest master @ bb58b775cfe6cfab22c2062609551692b95a3209.

First take a look at:
vmstat -m | grep routetbl

You could probably double the current "Limit" value (the 5th value) by putting:
net.route.kmalloc_limit="your_value"
in /boot/loader.conf

Hopefully, it will not break the kmalloc limit.
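
As a rough illustration of the setting described above, a loader.conf entry might look like the sketch below. The value 256M is the one reported as working later in this thread, not a general recommendation, and whether the tunable accepts a size suffix in exactly this form is an assumption to check against your DragonFly version.

```shell
# /boot/loader.conf (illustrative fragment; loader tunables apply at the next boot)
# Raise the kmalloc limit for the route table (M_RTABLE, shown as "routetbl").
# Pick the value from the current "Limit" reported by:
#   vmstat -m | grep routetbl
# The thread suggests roughly doubling it; 256M is the value reported to work here.
net.route.kmalloc_limit="256M"
```

After rebooting, running vmstat -m | grep routetbl again is one way to check whether the new limit took effect.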


Actions #4

Updated by david about 12 years ago

With the latest master and net.route.kmalloc_limit set to 256M, everything works without any issue.

Thanks!

Actions #5

Updated by swildner about 12 years ago

  • Status changed from New to Closed

Fixed in master.
