Bug #2251

panic in lapic_timer_process

Added by y0n3t4n1 over 2 years ago. Updated over 2 years ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Hi.
Does one of the recent commits fix the panic shown below? I've been
bisecting this problem for a while, and have narrowed it down to
168997..981239a (the former being the `good' one, the latter the `bad' one).
The commits to /sys within that range are:

981239a x86_64/ioapic_abi: Implement MachIntrABI.rman_setup
2b195d6 x86_64/ioapic_abi: Disable interrupt load balance by default
5e4b399 accept: Implement fast soaccept predication
6ac7736 bce: Use MPSAFE callout

There's no bce device on this system, so the last one can be eliminated.

Frames 0 - 8 in the backtrace (at the bottom of this message) appear
to be from the secondary panic. At first I thought that the panic came
from an inlined portion of lapic_timer_process_oncpu(), but the disassembled
output shows that it's not inlined, and that the panic occurred before
it was called. Unfortunately, `info registers' doesn't show the contents
of the %gs register. I can provide the kern and vmcore files for this panic
on request.

Best Regards,
YONETANI Tomokazu.

(kgdb) disass
Dump of assembler code for function lapic_timer_process:
0xffffffff805b7114 <+0>: push %rbp
0xffffffff805b7115 <+1>: mov %rsp,%rbp
=> 0xffffffff805b7118 <+4>: mov %gs:0x0,%rdi
0xffffffff805b7121 <+13>: mov $0x0,%esi
0xffffffff805b7126 <+18>: callq 0xffffffff805b70ab <lapic_timer_process_oncpu>
0xffffffff805b712b <+23>: leaveq
0xffffffff805b712c <+24>: retq
End of assembler dump.

(kgdb) info registers
rax 0x0 0
rbx 0xffffffe032d70870 -136586000272
rcx 0x0 0
rdx 0x2 2
rsi 0x14aa2 84642
rdi 0xffffffe032d70870 -136586000272
rbp 0xffffffe0c2bbfc00 0xffffffe0c2bbfc00
rsp 0xffffffe0c2bbfc00 0xffffffe0c2bbfc00
r8 0xffffffffffffffbf -65
r9 0x581 1409
r10 0x801514be0 34381843424
r11 0x8015fc880 34382792832
r12 0x7c 124
r13 0x14000062270 1374389936752
r14 0x7c 124
r15 0x0 0
rip 0xffffffff805b7118 0xffffffff805b7118 <lapic_timer_process+4>
eflags 0x10046 [ PF ZF RF ]
cs 0x8 8
ss 0x0 0
ds *value not available*
es *value not available*
fs *value not available*
gs *value not available*
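
(Editorial note: the flag names kgdb prints next to eflags can be cross-checked by hand. The sketch below is only an illustration; the helper name decode_eflags is made up for this note, and the bit positions are the architectural x86 ones for the flags listed.)

```python
# Architectural bit positions of selected x86 EFLAGS/RFLAGS flags.
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8,
    "IF": 9, "DF": 10, "OF": 11, "RF": 16,
}


def decode_eflags(value):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for name, bit in FLAG_BITS.items() if value & (1 << bit)]


# eflags value taken from the `info registers' output above.
flags = decode_eflags(0x10046)
print(flags)
print("IF set:", "IF" in flags)
```

For 0x10046 this yields PF, ZF and RF, matching what kgdb printed; IF (bit 9) is clear in this register snapshot.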

:
ile in kernel mode
cpuid = 0; lapic->id = 00000000
fault virtual address = 0x7440025
fault code = supervisor read data, page not present
instruction pointer = 0x8:0xffffffff805ae3d9
stack pointer = 0x10:0xffffffe0c2bbf810
frame pointer = 0x10:0xffffffe0c2bbf828
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 6953
current thread = pri 6 (CRIT)
trap number = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 13h7m0s
Physical memory: 4047 MB
Dumping 1628 MB: 1613 1597 1581 1565 1549 1533 1517 1501 1485 1469 1453 1437 1421 1405 1389 1373 1357 1341 1325 1309 1293 1277 1261 1245 1229 1213 1197 1181 1165 1149 1133 1117 1101 1085 1069 1053 1037 1021 1005 989 973 957 941 925 909 893 877 861 845 829 813 797 781 765 749 733 717 701 685 669 653 637 621 605 589 573 557 541 525 509 493 477 461 445 429 413 397 381 365 349 333 317 301 285 269 253 237 221 205 189 173 157 141 125 109SECONDARY PANIC ON CPU 2 THREAD 0xffffffe032d6ecf0
93 77 61 45 29 13

(kgdb) bt
#0 _get_mycpu () at ./machine/thread.h:69
#1 md_dumpsys (di=<optimized out>)
at /usr/src/sys/platform/pc64/x86_64/dump_machdep.c:263
#2 0xffffffff8039b744 in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:925
#3 0xffffffff8039bda8 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:387
#4 0xffffffff8039c05d in panic (fmt=0xffffffff805ff22b "%s")
at /usr/src/sys/kern/kern_shutdown.c:831
#5 0xffffffff805b3b14 in trap_fatal (frame=0xffffffe0c2bbf758,
eva=<optimized out>) at /usr/src/sys/platform/pc64/x86_64/trap.c:1034
#6 0xffffffff805b3d24 in trap_pfault (frame=0xffffffe0c2bbf758, usermode=0)
at /usr/src/sys/platform/pc64/x86_64/trap.c:928
#7 0xffffffff805b42cc in trap (frame=0xffffffe0c2bbf758)
at /usr/src/sys/platform/pc64/x86_64/trap.c:630
#8 0xffffffff805acade in calltrap ()
at /usr/src/sys/platform/pc64/x86_64/exception.S:187
#9 0xffffffff805ae3d9 in db_read_bytes (addr=121896997, size=8,
data=0xffffffe0c2bbf838 "")
at /usr/src/sys/platform/pc64/x86_64/db_interface.c:244
#10 0xffffffff80289c8c in db_get_value (addr=121896997, size=8, is_signed=0)
at /usr/src/sys/ddb/db_access.c:58
#11 0xffffffff805af06a in db_nextframe (ip=<optimized out>, fp=<optimized out>)
at /usr/src/sys/platform/pc64/x86_64/db_trace.c:233
#12 db_stack_trace_cmd (addr=<optimized out>, have_addr=<optimized out>,
count=<optimized out>, modif=<optimized out>)
at /usr/src/sys/platform/pc64/x86_64/db_trace.c:439
#13 0xffffffff805af228 in print_backtrace (count=-1027868616)
at /usr/src/sys/platform/pc64/x86_64/db_trace.c:451
#14 0xffffffff8039c028 in panic (fmt=0xffffffff805ff22b "%s")
at /usr/src/sys/kern/kern_shutdown.c:820
#15 0xffffffff805b3b14 in trap_fatal (frame=0xffffffe0c2bbfb38,
eva=<optimized out>) at /usr/src/sys/platform/pc64/x86_64/trap.c:1034
#16 0xffffffff805b3d24 in trap_pfault (frame=0xffffffe0c2bbfb38, usermode=0)
at /usr/src/sys/platform/pc64/x86_64/trap.c:928
#17 0xffffffff805b42cc in trap (frame=0xffffffe0c2bbfb38)
at /usr/src/sys/platform/pc64/x86_64/trap.c:630
#18 0xffffffff805acade in calltrap ()
at /usr/src/sys/platform/pc64/x86_64/exception.S:187
#19 0xffffffff805b7118 in lapic_timer_process ()
at /usr/src/sys/platform/pc64/apic/lapic.c:306
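
(Editorial note: kgdb prints some values in decimal, which hides how they relate to the hexadecimal values in the trap message. The sketch below converts two of them; the helper name to_unsigned64 is made up for this note, and the interpretation is a reading of the output above, not a confirmed diagnosis.)

```python
def to_unsigned64(value):
    """Reinterpret a signed 64-bit integer as its unsigned bit pattern."""
    return value & 0xFFFFFFFFFFFFFFFF


# addr= argument of db_read_bytes() in frame #9, printed in decimal by kgdb.
db_read_addr = 121896997
# "fault virtual address" from the trap message of the later fault.
fault_va = 0x7440025

print(hex(db_read_addr))
print(db_read_addr == fault_va)

# rbx/rdi from `info registers', shown by kgdb as a negative decimal.
print(hex(to_unsigned64(-136586000272)))
```

The address passed to db_read_bytes() in frame #9 equals the reported fault virtual address 0x7440025, which is consistent with the fault in the trap message being the one taken while print_backtrace() was walking the stack, and with frames 0 - 8 belonging to that secondary panic.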

History

#1 Updated by sepherosa over 2 years ago

On Fri, Dec 2, 2011 at 6:36 PM, YONETANI Tomokazu via Redmine
<> wrote:
> Hi.
> Does one of the recent commits fix the panic shown below?  I've been
> bisecting this problem for a while, and it's narrowed down to
> 168997..981239a (former being the `good' one, latter the `bad' one).
> Commits to /sys between the range are:
>
> 981239a x86_64/ioapic_abi: Implement MachIntrABI.rman_setup
> 2b195d6 x86_64/ioapic_abi: Disable interrupt load balance by default
> 5e4b399 accept: Implement fast soaccept predication
> 6ac7736 bce: Use MPSAFE callout
>
> There's no bce device on this system, so the last one can be eliminated.

Hmm, this is strange. Could you test the latest master? I have
several boxes running the latest master, and none of them has hit the
panic you have shown. BTW, is it a UP box?

Best Regards,
sephe

--
Tomorrow Will Never Die

#2 Updated by y0n3t4n1 over 2 years ago

  • Status changed from New to Resolved

(Oh, I didn't know that replies to bugtracker-admin@ don't make it onto the list or into the bug tracker.)

It's an SMP box (Atom D510): it has two CPU cores with HT enabled, so hw.ncpu=4.
The box has been running pbulk on a kernel built from 7a92c04 for 4 days
without a panic, so I guess some of the commits in 981239a..7a92c04 might
have fixed it.

#3 Updated by swildner over 2 years ago

  • Status changed from Resolved to Closed
