Bug #770

page fault at m_xhalf on 1.10 while browsing

Added by kmb810 about 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Assignee: -
% Done: 0%
Category: -
Target version: -
Start date:
Due date:

Description

G'day All,

On 1.10 I regularly get page faults while browsing. The backtrace
from the core dump is below. It is always the same crash.

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000001; cpuid = 1; lapic.id = 01000000
fault virtual address = 0xdeadc0ea
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc035d4c5
stack pointer = 0x10:0xce2bd88c
frame pointer = 0x10:0xce2bd894
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
current thread = pri 12

#0 dumpsys () at thread.h:83
#1 0xc0167a35 in db_fncall (dummy1=1936, dummy2=0, dummy3=-835987788,
dummy4=0xce2bd6b0 "�\a") at /usr/src/sys/ddb/db_command.c:541
#2 0xc01677ef in db_command (last_cmdp=0xc0645990, cmd_table=0x0,
aux_cmd_tablep=0xc05ce894, aux_cmd_tablep_end=0xc05ce8b0)
at /usr/src/sys/ddb/db_command.c:343
#3 0xc01678cf in db_command_loop () at /usr/src/sys/ddb/db_command.c:469
#4 0xc016a488 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:71
#5 0xc051102f in kdb_trap (type=12, code=0, regs=0xce2bd844) at
/usr/src/sys/platform/pc32/i386/db_interface.c:148
#6 0xc05268e6 in trap_fatal (frame=0xce2bd844, eva=0) at
/usr/src/sys/platform/pc32/i386/trap.c:1092
#7 0xc052655d in trap_pfault (frame=0xce2bd844, usermode=0,
eva=3735929066) at /usr/src/sys/platform/pc32/i386/trap.c:998
#8 0xc05260f3 in trap (frame=0xce2bd844) at
/usr/src/sys/platform/pc32/i386/trap.c:681
#9 0xc05120f6 in calltrap () at /usr/src/sys/platform/pc32/i386/exception.s:783
#10 0xc035d4c5 in m_xhalf (m=0xcf4b4c00, k=12, err=0xce2bd8b0) at
/usr/src/sys/net/bpf_filter.c:156
#11 0xc035d647 in bpf_filter (pc=0xcf20ba08, p=0xcf4b4c00 "",
wirelen=3735929054, buflen=0) at /usr/src/sys/net/bpf_filter.c:238
#12 0xc035cdb6 in bpf_mtap (bp=0xce445280, m=0xcf4b4c00) at
/usr/src/sys/net/bpf.c:1089
#13 0xc020cb63 in bge_start (ifp=0xccef1000) at
/usr/src/sys/dev/netif/bge/if_bge.c:2785
#14 0xc0360f4e in ether_output_frame (ifp=0xcf4b4c00, m=0xcf4b4c00) at
ifq_var.h:195
#15 0xc0360c50 in ether_output (ifp=0xccef1000, m=0xcf4b4c00,
dst=0xcef018f0, rt=0xcee46fc0) at /usr/src/sys/net/if_ethersubr.c:358
#16 0xc0396a97 in ip_output (m0=0xcf489810, opt=0xccef1000,
ro=0xce2bdb44, flags=0, imo=0x0, inp=0x0) at
/usr/src/sys/netinet/ip_output.c:1006
#17 0xc039f30c in tcp_respond (tp=0x0, ipgen=0xcf489824,
th=0xcf489824, m=0xcf4b4c00, ack=0, seq=1316665762, flags=4) at
/usr/src/sys/netinet/tcp_subr.c:632
#18 0xc039c1f9 in tcp_input (m=0xcf4b4c00) at
/usr/src/sys/netinet/tcp_input.c:2623
#19 0xc0393949 in transport_processing_oncpu (m=0xcf4b4c00, hlen=20,
ip=0xdeadc0d2, nexthop=0xcf4b4c00) at
/usr/src/sys/netinet/ip_input.c:404
#20 0xc03943db in ip_input (m=0xcf4b4c00) at
/usr/src/sys/netinet/ip_input.c:1101
#21 0xc03939b9 in ip_input_handler (msg0=0xdeadc0d2) at
/usr/src/sys/netinet/ip_input.c:434
#22 0xc039ec85 in tcpmsg_service_loop (dummy=0x0) at
/usr/src/sys/netinet/tcp_subr.c:385
#23 0xc02fb6a0 in lwkt_create (func=0xdeadc0d2, arg=0xdeadc0d2,
tdp=0xc0678968, template=0xdeadc0d2, tdflags=-559038254,
cpu=-559038254,
fmt=0xdeadc0d2 <Address 0xdeadc0d2 out of bounds>) at
/usr/src/sys/kern/lwkt_thread.c:1302
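
Several values in the trap message and backtrace above look unrelated only because gdb prints some in hex, some as unsigned decimal and some as signed decimal. The sketch below normalizes them to 32-bit hex and compares them with 0xdeadc0de. That constant is an assumption here: it is the fill pattern BSD kernels commonly write over freed memory in debug builds, and it is not named anywhere in this report. The helper names are made up for illustration.

```python
# Assumed freed-memory fill pattern; not stated in the report itself.
POISON = 0xdeadc0de

def as_u32(value):
    """Reinterpret a (possibly negative) 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF

def describe(value):
    """Return the value as 32-bit hex plus its signed distance from POISON."""
    u = as_u32(value)
    return "0x%08x (POISON%+d)" % (u, u - POISON)

# Values copied from the trap message and the backtrace frames.
samples = {
    "fault virtual address": 0xdeadc0ea,
    "trap_pfault eva": 3735929066,
    "bpf_filter wirelen": 3735929054,
    "lwkt_create tdflags": -559038254,
    "ip_input_handler msg0": 0xdeadc0d2,
}

for name, value in samples.items():
    print("%-24s %s" % (name, describe(value)))
```

The fault address and eva both come out as the pattern plus 12, which matches k=12 in the m_xhalf frame (#10), and wirelen is the pattern itself. That is consistent with bpf_filter reading through an mbuf whose contents had already been overwritten.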

Also, unattended core dumps do not work even after dumpon has been
executed. I have to call dumpsys manually.

Let me know if more information is required.

Why is lwkt_create called with freed memory?

Cheers
kmb

History

#1 Updated by sepherosa about 7 years ago

Try the following patch:
http://leaf.dragonflybsd.org/~sephe/bge_encap.diff

Best Regards,
sephe

#2 Updated by kmb810 about 7 years ago

On 8/9/07, Sepherosa Ziehau <> wrote:

Thanks Sephe for the patch. I am going to try it now. But just to let
you know, I have examined the core dump and the mbuf pointer is not
valid (starting from ip_input), so I doubt the patch will work.

#3 Updated by sepherosa about 7 years ago

Nah, the mbuf is probably freed in bge_encap(), so if you use kgdb to
inspect the frames before bge_encap(), you will see trashed mbuf
contents. I think BPF is opened by DHCP, so there is one thing we can
test with ease:
Use a fixed IP (i.e. stop DHCP), and see whether your system still
crashes regularly.
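
To illustrate why the outer frames also show a trashed mbuf, here is a toy model in Python. It is only a sketch under assumptions: the Mbuf class, the encap() step that frees the buffer, the 0xdeadc0de fill pattern and the example data address are all hypothetical stand-ins, not the actual bge or BPF code.

```python
# Assumed fill pattern written over freed memory by a debug kernel.
POISON = 0xdeadc0de

class Mbuf:
    """Toy stand-in for a kernel mbuf; only two fields are modeled."""
    def __init__(self, data, length):
        self.m_data = data
        self.m_len = length

def free_mbuf(m):
    """Model freed memory being overwritten with POISON."""
    m.m_data = POISON
    m.m_len = POISON

def encap(m):
    """Hypothetical encap step that consumes (frees) the mbuf."""
    free_mbuf(m)

def tap_read_address(m, k):
    """Address a BPF-style tap would dereference for packet offset k."""
    return m.m_data + k

m = Mbuf(data=0xcf4b4c80, length=60)  # made-up data address
caller_view = m                       # an outer frame holds the same pointer
encap(m)                              # the driver frees the mbuf ...
addr = tap_read_address(m, 12)        # ... and the tap then reads through it

print(hex(addr))
print(hex(caller_view.m_len))
```

Every frame that was handed the same pointer sees the overwritten fields once the buffer is freed, so a debugger looking at ip_input after the fact shows a bad mbuf even though it was valid when that frame was entered.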

Best Regards,
sephe

#4 Updated by kmb810 about 7 years ago

On 8/9/07, Sepherosa Ziehau <> wrote:

Thanks Sephe for the explanation. Indeed, the mbuf was a valid pointer
when using ddb. The patch works.

Cheers
kmb

#5 Updated by sepherosa about 7 years ago

Sorry, I am not a native speaker. Do you mean it fixes the panic you have seen?

Best Regards,
sephe

#6 Updated by kmb810 about 7 years ago

On 8/9/07, Sepherosa Ziehau <> wrote:

Yes, it does.
