Bug #2268

Panic when loading BGP full route table IPv4 + IPv6

Added by david almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Start date: 01/02/2012
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Hi,

I'm using DragonFlyBSD to build border routers.
When loading routes on SMP systems (382492 routes: 375250 IPv4 + 7242 IPv6),
the kernel panics.

The test machine runs DragonFly v2.13.0.781.gfaddf-DEVELOPMENT (i386), and the routes are injected by Quagga 0.99.17.

The backtrace is as follows:

(kgdb) backtrace
#0  _get_mycpu () at ./machine/thread.h:79
#1  md_dumpsys (di=0xc0ae6260) at /usr/src/sys/platform/pc32/i386/dump_machdep.c:264
#2  0xc0372a18 in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:925
#3  0xc037302e in boot (howto=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:387
#4  0xc0373297 in panic (fmt=0xc06977f7 "from debugger") at /usr/src/sys/kern/kern_shutdown.c:831
#5  0xc018ac62 in db_panic (addr=-1067341502, have_addr=0, count=-1, modif=0xd71dcb70 "")
    at /usr/src/sys/ddb/db_command.c:445
#6  0xc018b32f in db_command (aux_cmd_tablep_end=0xc0721904, aux_cmd_tablep=0xc07218e8,
    cmd_table=<optimized out>, last_cmdp=<optimized out>) at /usr/src/sys/ddb/db_command.c:401
#7  db_command_loop () at /usr/src/sys/ddb/db_command.c:467
#8  0xc018de8e in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
#9  0xc061acb5 in kdb_trap (type=3, code=0, regs=0xd71dcc90)
    at /usr/src/sys/platform/pc32/i386/db_interface.c:152
#10 0xc064475b in trap (frame=0xd71dcc90) at /usr/src/sys/platform/pc32/i386/trap.c:844
#11 0xc061c1a7 in calltrap () at /usr/src/sys/platform/pc32/i386/exception.s:787
#12 0xc061a942 in breakpoint () at ./cpu/cpufunc.h:72
#13 Debugger (msg=0xc06b0e43 "panic") at /usr/src/sys/platform/pc32/i386/db_interface.c:334
#14 0xc0373278 in panic (fmt=0xc06fbf44 "rtrequest1_msghandler: rtrequest table error was not on cpu #0")
    at /usr/src/sys/kern/kern_shutdown.c:822
#15 0xc0417bea in rtrequest1_msghandler (msg=0xd5d1fc74) at /usr/src/sys/net/route.c:809
#16 0xc041647c in rtable_service_loop (dummy=0x0) at /usr/src/sys/net/route.c:199
#17 0xc037d9ff in lwkt_deschedule_self (td=Cannot access memory at address 0x8
) at /usr/src/sys/kern/lwkt_thread.c:362
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

History

#1 Updated by sepherosa almost 3 years ago

On Mon, Jan 2, 2012 at 7:11 PM, David BÉRARD via Redmine wrote:
> Hi,
>
> I'm using DragonFlyBSD to make border routers.
> When loading routes on SMP systems ( 382492 routes, 375250 ipv4 + 7242 ipv6 ),
> the kernel panic.
>
> The test machine is DragonFly v2.13.0.781.gfaddf-DEVELOPMENT (i386) and the routes are injected by Quagga 0.99.17.

Hmm, I have added some logging for the R_Malloc failure in rtrequest1. I
suspect that is what causes your panic.
Could you retry with the latest master @ 38c2eb266c21ce17c37c1b4b8d2a6bc8c73aa26c?

Try to locate panic messages like:
rtrequest1: alloc rtentry failed on cpuX

The default kmalloc limit of M_RTABLE could be too small in your case.
If the panic is caused by the kmalloc size limit, we may be able to fix it
with some simple changes.
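
To make the messages above easier to spot, the kernel log can be filtered for the rtrequest1 prefixes. This is only a minimal sketch: the function name rt_alloc_msgs is made up for illustration, and the sample lines are illustrative text modelled on the messages quoted in this thread, not captured output.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): keep only the
# lines on stdin that start a rtrequest1 / rtrequest1_msghandler message.
rt_alloc_msgs() {
    grep -E 'rtrequest1(_msghandler)?: '
}

# On the affected machine one would run:  dmesg | rt_alloc_msgs
# Here the filter is fed illustrative lines instead of real dmesg output.
printf '%s\n' \
    'rtrequest1: alloc rtentry failed on cpu1' \
    'some unrelated kernel message' \
    'rtrequest1_msghandler: rtrequest table error was cpu1, err 55' |
    rt_alloc_msgs
```

The filter matches on the message prefix only, so it does not depend on the exact wording after the colon.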

Best Regards,
sephe


--
Tomorrow Will Never Die

#2 Updated by david almost 3 years ago

Running @ 38c2eb266c21ce17c37c1b4b8d2a6bc8c73aa26c, I get this panic message:
rtrequest1_msghandler: rtrequest table error was cpu1, err 55

But the panic is not always reproducible (about half the time). Sometimes Quagga can inject only
139245 routes (always the same number, as returned by netstat -rn | wc -l),
and I then get, for example after ping 10.0.1.254:
arplookup 10.0.1.254 failed: could not allocate llinfo
arpresolve: can't allocate llinfo for 10.0.1.254 rt

Best Regards,
Thanks for your work.

#3 Updated by sepherosa almost 3 years ago

On Mon, Jan 16, 2012 at 12:14 AM, David BÉRARD via Redmine wrote:
> Running @ 38c2eb266c21ce17c37c1b4b8d2a6bc8c73aa26c, I get this panic message:
>        rtrequest1_msghandler: rtrequest table error was cpu1, err 55

Yeah, there is not enough memory in M_RTABLE.

Try the latest master @ bb58b775cfe6cfab22c2062609551692b95a3209.

First take a look at:
vmstat -m | grep routetbl

You could probably double the current "Limit" value (the 5th value) by putting:
net.route.kmalloc_limit="your_value"
in /boot/loader.conf

Hopefully, it will not break the kmalloc limit.
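
The doubling step can be sketched as a small shell helper. This is an illustration under assumptions, not a tested procedure: the function name suggest_limit is invented, the sample line is made up rather than real vmstat output, and it assumes the 5th whitespace-separated field is the limit written as an integer with an optional K, M, or G suffix. Whether the tunable accepts a value with that suffix is also an assumption, so check the printed line before using it.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): read one
# `vmstat -m` line for routetbl on stdin and print a loader.conf line
# with the limit doubled.
# Assumption: field 5 is the limit, e.g. "65536K" or a bare integer.
suggest_limit() {
    awk '{
        limit = $5
        unit = ""
        # Split off a trailing K/M/G unit suffix, if there is one.
        if (limit ~ /[KMG]$/) {
            unit = substr(limit, length(limit))
            limit = substr(limit, 1, length(limit) - 1)
        }
        printf "net.route.kmalloc_limit=\"%d%s\"\n", limit * 2, unit
    }'
}

# Made-up sample line; on a real system use:
#   vmstat -m | grep routetbl | suggest_limit
echo "routetbl 1000 2048K 4096K 65536K 5000" | suggest_limit
```

The helper only prints the suggested line; it does not edit /boot/loader.conf, so the value can be reviewed first.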



#4 Updated by david almost 3 years ago

With the latest master and net.route.kmalloc_limit set to 256M, everything works without any issue.

Thanks !

#5 Updated by swildner almost 3 years ago

  • Status changed from New to Closed

Fixed in master.
