Bug #2547
crashed while doing a dry run of pkg_rolling-replace
Description
I rebooted into a kernel compiled from sources I updated just a few days ago. I ran pkg_rolling-replace -nuv and it ran for a few hours, then crashed. The kernel dump is number 14 in my crash directory. The version is v3.5.0.25.g97861-DEVELOPMENT.
(kgdb) #0 _get_mycpu () at ./machine/thread.h:79
#1 md_dumpsys (di=0xc03e8b35)
    at /usr/src/sys/platform/pc32/i386/dump_machdep.c:266
#2 0xc09646a0 in db_command_table ()
#3 0xc03e8b35 in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:913
#4 0xc01968a3 in db_fncall (dummy1=-831973476, dummy2=0, dummy3=-1065827167,
    dummy4=0xce691798 "(\361~\300\336L\212\300")
    at /usr/src/sys/ddb/db_command.c:539
#5 0xc0196ca2 in db_command (aux_cmd_tablep_end=<optimized out>,
    aux_cmd_tablep=<optimized out>, cmd_table=<optimized out>,
    last_cmdp=0xc09f9318) at /usr/src/sys/ddb/db_command.c:401
#6 db_command_loop () at /usr/src/sys/ddb/db_command.c:467
#7 0xc019984b in db_trap (type=type@entry=12, code=code@entry=0)
    at /usr/src/sys/ddb/db_trap.c:71
#8 0xc07903d0 in kdb_trap (type=type@entry=12, code=code@entry=0,
    regs=regs@entry=0xce69190c)
    at /usr/src/sys/platform/pc32/i386/db_interface.c:149
#9 0xc07c2602 in trap_fatal (frame=frame@entry=0xce69190c, eva=eva@entry=136)
    at /usr/src/sys/platform/pc32/i386/trap.c:1107
#10 0xc07c27bd in trap_pfault (frame=frame@entry=0xce69190c, usermode=0,
    usermode@entry=136, eva=eva@entry=136)
    at /usr/src/sys/platform/pc32/i386/trap.c:1018
#11 0xc07c2dc3 in trap (frame=0xce69190c)
    at /usr/src/sys/platform/pc32/i386/trap.c:695
#12 0xc0791c37 in calltrap ()
    at /usr/src/sys/platform/pc32/i386/exception.s:787
#13 0xc0686b17 in fq_balance_self (tdio=0xc09edf00)
    at /usr/src/sys/kern/dsched/fq/fq_core.c:351
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
I also saw it freeze the first time I started a kde session after rebooting.
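A note on reading the trace: frames #0 through #11 are the dump, debugger and trap-handling path, so the frame directly below calltrap is the code that was running when the page fault was taken. The following is a minimal Python sketch of that reading, not part of the original report. The helper names (parse_frames, faulting_frame, fault_address) are hypothetical, only an excerpt of the frames is embedded, and it assumes that eva is the faulting virtual address.

```python
import re

# Excerpt of the kgdb backtrace from this report (frames #9, #11-#13).
BACKTRACE = """\
#9 0xc07c2602 in trap_fatal (frame=frame@entry=0xce69190c, eva=eva@entry=136)
    at /usr/src/sys/platform/pc32/i386/trap.c:1107
#11 0xc07c2dc3 in trap (frame=0xce69190c)
    at /usr/src/sys/platform/pc32/i386/trap.c:695
#12 0xc0791c37 in calltrap ()
    at /usr/src/sys/platform/pc32/i386/exception.s:787
#13 0xc0686b17 in fq_balance_self (tdio=0xc09edf00)
    at /usr/src/sys/kern/dsched/fq/fq_core.c:351
"""

# "#N [0xADDR in ]function (" at the start of a line opens a new frame.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \(")
# "at path:line" gives the source location of the current frame.
LOCATION_RE = re.compile(r"\bat (\S+):(\d+)")
# "eva=136" or "eva=eva@entry=136" (assumed: the faulting address).
EVA_RE = re.compile(r"\beva=(?:eva@entry=)?(\d+)")


def parse_frames(text):
    """Return a list of frames, each a dict with num, func, file, line."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append({"num": int(m.group(1)), "func": m.group(2),
                           "file": None, "line": None})
        loc = LOCATION_RE.search(line)
        if loc and frames and frames[-1]["file"] is None:
            frames[-1]["file"] = loc.group(1)
            frames[-1]["line"] = int(loc.group(2))
    return frames


def faulting_frame(frames, trap_entry="calltrap"):
    """Return the frame listed right after the trap entry point, if any."""
    for i, frame in enumerate(frames):
        if frame["func"] == trap_entry and i + 1 < len(frames):
            return frames[i + 1]
    return None


def fault_address(text):
    """Return the first eva value found in the trace, or None."""
    m = EVA_RE.search(text)
    return int(m.group(1)) if m else None


if __name__ == "__main__":
    frame = faulting_frame(parse_frames(BACKTRACE))
    print(frame["func"], frame["file"], frame["line"])
    print(hex(fault_address(BACKTRACE)))
```

On this excerpt the sketch points at fq_balance_self in sys/kern/dsched/fq/fq_core.c, line 351, with a fault address of 136 (0x88). An address that small often suggests a field access through a NULL structure pointer, though the trace alone does not confirm that.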
Updated by phma over 11 years ago
The kernel crashed again at 1:08. I was doing nothing; that's during the nightly periodic job. Maybe there's a bug in Hammer. I rebooted into the previous kernel.
Updated by phma over 11 years ago
It is now 1:29 and hammer (new world) is still running on the old kernel. The bug is therefore in the hammer code of the kernel.
Updated by alexh over 11 years ago
The bug is most definitely not in HAMMER but rather in dsched or dsched fq.
Cheers,
Alex
On 2013-04-16 06:30, Pierre Abbat via Redmine wrote:
Issue #2547 has been updated by phma.
It is now 1:29 and hammer (new world) is still running on the old
kernel. The bug is therefore in the hammer code of the kernel.
----------------------------------------
Bug #2547: crashed while doing a dry run of pkg_rolling-replace
http://bugs.dragonflybsd.org/issues/2547
Author: phma
Status: New
Priority: High
Assignee:
Category:
Target version:
Updated by phma over 11 years ago
Both disks are set to fq. Should I set them to something else and see what happens?
Updated by vsrinivas over 11 years ago
Yea, you'll probably want to use the noop scheduler; it is the best-tested of all the dsched modules.
Updated by phma over 11 years ago
Confirmed: the bug is in fq. I rebooted it with both disks set to noop, and it's still up.