Bug #1875
closed
panic: assertion: leaf->base.delete_tid == 0 in hammer_ip_delete_range
Description
Hi,
I just got this panic on master. I don't know how to reproduce it.
Kernel and dump are on leaf, ~mh/crash/*.0
DragonFly v2.7.3.1283.gfa568-DEVELOPMENT #1: Wed Oct 13 12:08:28 CEST
2010 x86_64
Unread portion of the kernel message buffer:
panic: assertion: leaf->base.delete_tid == 0 in hammer_ip_delete_range
Trace beginning at frame 0xffffffe01b43c9c0
panic() at panic+0x16b
panic() at panic+0x16b
hammer_ip_delete_range() at hammer_ip_delete_range+0x19a
hammer_sync_inode() at hammer_sync_inode+0x1df
hammer_flusher_slave_thread() at hammer_flusher_slave_thread+0x9d
Debugger("panic")
panic: from debugger
Uptime: 23m42s
Physical memory: 2026 MB
Dumping 546 MB: 531 515 499 483 467 451 435 419 403 387 371 355 339 323
307 291 275 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3
Reading symbols from /boot/kernel/snd_emu10k1.ko...done.
Loaded symbols for /boot/kernel/snd_emu10k1.ko
Reading symbols from /boot/kernel/sound.ko...done.
Loaded symbols for /boot/kernel/sound.ko
Reading symbols from /boot/kernel/acpi.ko...done.
Loaded symbols for /boot/kernel/acpi.ko
Reading symbols from /boot/kernel/ahci.ko...done.
Loaded symbols for /boot/kernel/ahci.ko
Reading symbols from /boot/kernel/ehci.ko...done.
Loaded symbols for /boot/kernel/ehci.ko
Reading symbols from /boot/kernel/kate.ko...done.
Loaded symbols for /boot/kernel/kate.ko
Reading symbols from /boot/kernel/powernow.ko...done.
Loaded symbols for /boot/kernel/powernow.ko
Reading symbols from /boot/kernel/pf.ko...done.
Loaded symbols for /boot/kernel/pf.ko
_get_mycpu (di=0xffffffff808aeea0) at ./machine/thread.h:73
73 __asm ("movq %%gs:globaldata,%0" : "=r" (gd) :
"m"(_mycpu__dummy));
(kgdb) bt
#0 _get_mycpu (di=0xffffffff808aeea0) at ./machine/thread.h:73
#1 md_dumpsys (di=0xffffffff808aeea0) at
/usr/src/sys/platform/pc64/x86_64/dump_machdep.c:262
#2 0xffffffff8035c4d6 in dumpsys () at
/usr/src/sys/kern/kern_shutdown.c:880
#3 0xffffffff8035cb63 in boot (howto=-2004318071) at
/usr/src/sys/kern/kern_shutdown.c:387
#4 0xffffffff8035cd76 in panic (fmt=0xffffffff806073b4 "from debugger")
at /usr/src/sys/kern/kern_shutdown.c:786
#5 0xffffffff80190625 in db_panic (addr=<value optimized out>,
have_addr=0, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:448
#6 0xffffffff80190cdb in db_command () at /usr/src/sys/ddb/db_command.c:344
#7 db_command_loop () at /usr/src/sys/ddb/db_command.c:470
#8 0xffffffff80193a61 in db_trap (type=<value optimized out>,
code=<value optimized out>) at /usr/src/sys/ddb/db_trap.c:71
#9 0xffffffff805da5fe in kdb_trap (type=3, code=0,
regs=0xffffffe01b43c8f8) at
/usr/src/sys/platform/pc64/x86_64/db_interface.c:176
#10 0xffffffff805dfbe2 in trap (frame=0xffffffe01b43c8f8) at
/usr/src/sys/platform/pc64/x86_64/trap.c:705
#11 0xffffffff805d8d0e in calltrap () at
/usr/src/sys/platform/pc64/x86_64/exception.S:179
#12 0xffffffff805da4d5 in breakpoint (msg=<value optimized out>) at
./cpu/cpufunc.h:73
#13 Debugger (msg=<value optimized out>) at
/usr/src/sys/platform/pc64/x86_64/db_interface.c:359
#14 0xffffffff8035cd6f in panic (fmt=0xffffffff805fbf4a "assertion: %s
in %s") at /usr/src/sys/kern/kern_shutdown.c:784
#15 0xffffffff8053280f in hammer_ip_delete_range
(cursor=0xffffffe01b43cb40, ip=0xffffffe03eb2d978, ran_beg=0,
ran_end=9223372036854775807, truncating=2) at
/usr/src/sys/vfs/hammer/hammer_object.c:1973
#16 0xffffffff8052a3b2 in hammer_sync_inode (trans=0xffffffe01b420188,
ip=0xffffffe03eb2d978) at /usr/src/sys/vfs/hammer/hammer_inode.c:2831
#17 0xffffffff80524f60 in hammer_flusher_flush_inode (arg=<value
optimized out>) at /usr/src/sys/vfs/hammer/hammer_flusher.c:512
#18 hammer_flusher_slave_thread (arg=<value optimized out>) at
/usr/src/sys/vfs/hammer/hammer_flusher.c:455
#19 0xffffffff80365e34 in lwkt_deschedule_self (td=0xffffffff8092af18)
at /usr/src/sys/kern/lwkt_thread.c:281
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Max
Updated by dillon over 14 years ago
Tracked assertion back to missing snapshot check when attempting to
truncate/destroy a file with nlinks == 0 on last-close.
Updated by dillon over 14 years ago
:Hi,
:
:I just got this panic on master. I don't know how to reproduce it.
:Kernel and dump are on leaf, ~mh/crash/*.0
:
:DragonFly v2.7.3.1283.gfa568-DEVELOPMENT #1: Wed Oct 13 12:08:28 CEST
:2010 x86_64
:
:Max
Thanks Max. I think I figured out what happened and I'm glad I had
the assertion there. Basically it asserted because it was trying to
modify an inode snapshot instead of a current inode.
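As a rough illustration of why a snapshot view can trip this assertion (a simplified model, not the actual HAMMER code): assume each record carries a creation and a deletion transaction id, with delete_tid == 0 meaning "still live". The struct and function names below are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified record header: not the real HAMMER layout. */
struct model_record {
    uint64_t create_tid;   /* transaction id that created the record */
    uint64_t delete_tid;   /* 0 while live, else the deleting transaction id */
};

/* The condition the assertion checks: the record has not been deleted. */
bool model_record_is_live(const struct model_record *r)
{
    return r->delete_tid == 0;
}

/*
 * Assumed visibility rule for an "as-of" (snapshot) view: the record was
 * created at or before asof and was not yet deleted at that point.
 */
bool model_visible_asof(const struct model_record *r, uint64_t asof)
{
    return r->create_tid <= asof &&
           (r->delete_tid == 0 || r->delete_tid > asof);
}
```

Under these assumptions, a record created at tid 100 and deleted at tid 200 is still visible to a snapshot taken at tid 150, yet it is not live. A delete pass run through that snapshot view would therefore meet a record with a non-zero delete_tid.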
I tracked the problem down to a situation where a snapshot inode is
caught with its nlinks count set to 0. This isn't supposed to be
possible (that is, the inode should not be visible in the filesystem
if nlinks is 0), but there are a few places where a raw B-Tree scan can
instantiate an inode, and I think it can happen there.
When that happens and the vnode is later discarded by the kernel, a
piece of code designed to truncate the data if the file had
nlinks == 0 (i.e. no longer visible in the directory hierarchy) ran
without checking whether the file was a snapshot.
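The missing check can be sketched roughly as follows. This is a minimal model and not the committed fix: the inode struct, the MODEL_ASOF_CURRENT sentinel, and all function names are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel: an inode viewed "as of now" rather than a snapshot. */
#define MODEL_ASOF_CURRENT UINT64_MAX

/* Hypothetical, simplified in-memory inode: not the real HAMMER structure. */
struct model_inode {
    uint64_t nlinks;   /* directory link count */
    uint64_t asof;     /* MODEL_ASOF_CURRENT for a current inode */
};

bool model_is_snapshot(const struct model_inode *ip)
{
    return ip->asof != MODEL_ASOF_CURRENT;
}

/* Behaviour as described before the fix: only the link count is checked. */
bool model_should_truncate_old(const struct model_inode *ip)
{
    return ip->nlinks == 0;
}

/* Behaviour with the snapshot check added: snapshots are never truncated. */
bool model_should_truncate_fixed(const struct model_inode *ip)
{
    return ip->nlinks == 0 && !model_is_snapshot(ip);
}
```

With this model, a snapshot inode with nlinks == 0 is selected for truncation by the old predicate and skipped by the fixed one, while a current inode with nlinks == 0 is still truncated on last close.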
I have committed a fix to master which will likely be MFCd to the
release branch before we release.
-Matt
Matthew Dillon
<dillon@backplane.com>