Bug #1577

panic: assertion: leaf->base.obj_id == ip->obj_id in hammer_ip_delete_range

Added by qhwt+dfly about 5 years ago. Updated over 4 years ago.

Status:NewStart date:
Priority:NormalDue date:
Assignee:-% Done:

0%

Category:-
Target version:-

Description

Hi.
I caught this panic while running pbulk and vkernel on the same machine.
The PC has 1.5G bytes of main memory, occasionally using the swap up to
20M or so bytes (according to top).
The kernel is built from source as of cf0e75a4(+ small modifications not
touching the kernel). I've turned off periodic/daily/160.clean-hammer
and run `hammer cleanup /pfs/*' by hand only occasionally (but I'm sure
I can't fill up this 1TB hard drive anyway). The hammer filesystem version
is 2. I'll upload the vmcore to leaf on request.

Cheers.

(kgdb) bt
#0 dumpsys () at ./machine/thread.h:83
#1 0xc01a0c79 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:378
#2 0xc01a0f3e in panic (fmt=0xc02fe5be "assertion: %s in %s")
at /usr/src/sys/kern/kern_shutdown.c:813
#3 0xc02812e0 in hammer_ip_delete_range (cursor=0xd88b4c9c, ip=0xe19acb50,
ran_beg=0, ran_end=9223372036854775807, truncating=2)
at /usr/src/sys/vfs/hammer/hammer_object.c:1946
#4 0xc02788bf in hammer_sync_inode (trans=0xd818a114, ip=0xe19acb50)
at /usr/src/sys/vfs/hammer/hammer_inode.c:2714
#5 0xc0276a5b in hammer_flusher_slave_thread (arg=0xd6e78510)
at /usr/src/sys/vfs/hammer/hammer_flusher.c:496
#6 0xc01a981a in lwkt_deschedule_self (td=Cannot access memory at address 0x8
)
at /usr/src/sys/kern/lwkt_thread.c:271

(kgdb) p ip->obj_id
$1 = 374280419483
(kgdb) p leaf->base
$2 = {obj_id = 374247118508, key = 16384, create_tid = 374281176448,
delete_tid = 0, rec_type = 16, obj_type = 2 '\002', btype = 82 'R',
localization = 393218}
(kgdb)

History

#1 Updated by dillon about 5 years ago

:Hi.
:I caught this panic while running pbulk and vkernel on the same machine.
:The PC has 1.5G bytes of main memory, occasionally using the swap up to
:20M or so bytes (according to top).
:The kernel is built from source as of cf0e75a4(+ small modifications not
:touching the kernel). I've turned off periodic/daily/160.clean-hammer
:and run `hammer cleanup /pfs/*' by hand only occasionally (but I'm sure
:I can't fill up this 1TB hard drive anyway). The hammer filesystem version
:is 2. I'll upload the vmcore to leaf on request.
:
:Cheers.

Yes, please.

-Matt

#2 Updated by qhwt+dfly about 5 years ago

:
> Yes, please.

It's at ~y0netan1/crash/{kernel,vmcore}.9 on leaf.

#3 Updated by dillon about 5 years ago

:It's at ~y0netan1/crash/{kernel,vmcore}.9 on leaf.
:

The file it blew up on was a nohistory-flagged file being deleted
about 2 seconds after it was created. The parts of the B-Tree that
the hammer cursor was sitting on look fine, so it's a software bug
in the iteration somewhere that the assertion caught.

I think I may have found the issue but it's a very rare and difficult
edge case so I just can't be sure yet. I will need to do some testing
before I'm even willing to post the patch.

Basically we have this situation in the B-Tree leaf:

F1 = records related to file 1 (cached records not yet inserted)
F2 = records related to file 2 (cached records not yet deleted,
and not on media)
F3 = records related to file 3 (records on-media)

We are trying to delete F2, so hammer_ip_delete_range() is called
for F2's records. There happen to be no records on-disk but it
doesn't know that yet.

BTREE NODE ON MEDIA:
F3 F3 F3
^ index

The iteration looking for records for F2 is essentially at the end,
having not found any records for F2. But then before the iteration
can finish the following happens:

* All the records for F3 are deleted due to the rm -rf (i.e.
some other inode flushing its deletion at the same time F2
is).

* Because the node is now empty the btree tries to remove the
entire node.

* This pushes the cursor for the F2 delete_range iterations up to
the parent.

* BUT btree_remove() is unsucessful at removing the entire node
due to hitting a deadlock, so now the cursor for F2 is pointing
at the parent instead of pointing at the node.

* Records for file F1 are inserted via another unrelated parallel
flush op. Since the node could not be deleted the records
are inserted in the node. And since the cursor for F2 is now
pointing at the parent, it is effectively pointing at the
records for F1 (which are indexed prior to the F2 cursor's begin
point).

* On the next iteration of the cursor it pushes down from the parent
node to the node that could not deleted and finds the records for
F1. This is the bug. The cursor for F2's deletion is now indexed
at F1 which is prior to the cursor begin point.

BTREE NODE ON MEDIA (BUG):
F1 F1 F1
^ index for F2's iteration

BTREE NODE ON MEDIA (DESIRED):
F1 F1 F1
^ index for F2's iteration

The iteration code assumes that the index already points to a record
greater or equal to the starting point, so it only checks that the index
has not gone past the ending point. But due to the insertion of the
records for F1 the index is now pointing to a point BEFORE the starting
point of the range we want to delete.

The KKASSERT catches the situation and quite properly panics the system
in order to avoid corrupting the on-disk media.

The solution is basically that I cannot call hammer_cursor_removed_node()
until the node is actually removed. That is, if F2's cursor is still
pointing at the node instead of the parent, then when F1's records are
inserted F2's cursor's node index will be incremented and stay past them,
instead of pointing at the parent which then pushes down into the node
and 'finds' the F1 records.

This creates a fairly severe complication in btree_remove() due to the
depth-first recursion it does. The patch looks trivial but if I make
a mistake we wave goodbye to the B-Tree. Hence I have to carefully
test the patch before I'm willing to even post it. So... it is going
to be a day or two I think.

I think this bug has been hit occassionally by several people over the
last year, including myself. Fortunately that assertion prevented
the media from getting corrupted (at the cost of panicing the system :-)).

-Matt
Matthew Dillon
<>

#4 Updated by dillon about 5 years ago

Ok Tomokazu, here a patch to test:

fetch http://apollo.backplane.com/DFlyMisc/hammer05.patch

It survived my bulk build tests so now the question is whether it
will survive yours.

-Matt
Matthew Dillon
<>

#5 Updated by qhwt+dfly about 5 years ago

On Wed, Oct 21, 2009 at 10:36:36AM -0700, Matthew Dillon wrote:
> Ok Tomokazu, here a patch to test:
>
> fetch http://apollo.backplane.com/DFlyMisc/hammer05.patch
>
> It survived my bulk build tests so now the question is whether it
> will survive yours.

I've installed the lastest kernel/world with above patch on my PC
and started pbulk and vkernel on it. I've tried the patch on vkernel
before installing it on PC, but I didn't see any problems then.
The HAMMER filesystem on vkernel has already been upgraded to version 2,
so I didn't test the patch on HAMMER filesystem version 1 yet.

#6 Updated by qhwt+dfly about 5 years ago

On Wed, Oct 21, 2009 at 10:36:36AM -0700, Matthew Dillon wrote:
> fetch http://apollo.backplane.com/DFlyMisc/hammer05.patch
>
> It survived my bulk build tests so now the question is whether it
> will survive yours.

It survived 7.5 hours before crashing in a different place.
The kernel and the vmcore are at ~y0etan1/crash/{kernel,vmcore}.10.

panic: assertion: s <= 0 in hammer_btree_iterate
mp_lock = 00000000; cpuid = 0
boot() called on cpu#0
Uptime: 7h24m33s

(kgdb) bt
#0 _get_mycpu () at ./machine/thread.h:83
#1 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:657
#2 0xc01a0cbd in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:378
#3 0xc01a0f82 in panic (fmt=0xc02ff03e "assertion: %s in %s")
at /usr/src/sys/kern/kern_shutdown.c:813
#4 0xc02719d5 in hammer_btree_iterate (cursor=0xd88b4c9c)
at /usr/src/sys/vfs/hammer/hammer_btree.c:240
#5 0xc02804eb in hammer_ip_next (cursor=0xd88b4c9c)
at /usr/src/sys/vfs/hammer/hammer_object.c:1589
#6 0xc0281ed3 in hammer_ip_delete_range (cursor=0xd88b4c9c, ip=0xe310bc50,
ran_beg=0, ran_end=9223372036854775807, truncating=2)
at /usr/src/sys/vfs/hammer/hammer_object.c:2022
#7 0xc0278a63 in hammer_sync_inode (trans=0xd818a114, ip=0xe310bc50)
at /usr/src/sys/vfs/hammer/hammer_inode.c:2714
#8 0xc0276bdb in hammer_flusher_flush_inode (arg=0xd6e78510)
at /usr/src/sys/vfs/hammer/hammer_flusher.c:496
#9 hammer_flusher_slave_thread (arg=0xd6e78510)
at /usr/src/sys/vfs/hammer/hammer_flusher.c:440
#10 0xc01a985e in lwkt_deschedule_self (td=Cannot access memory at address 0x8
)
at /usr/src/sys/kern/lwkt_thread.c:271
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(kgdb) fr 4
#4 0xc02719d5 in hammer_btree_iterate (cursor=0xd88b4c9c)
at /usr/src/sys/vfs/hammer/hammer_btree.c:240
240 KKASSERT(s <= 0);
(kgdb) list
235 if (r == 0 && (cursor->flags &
236 HAMMER_CURSOR_END_INCLUSIVE) == 0) {
237 error = ENOENT;
238 break;
239 }
240 KKASSERT(s <= 0);
241
242 /*
243 * Better not be zero
244 */
(kgdb) p s
$1 = 4

#7 Updated by dillon about 5 years ago

:On Wed, Oct 21, 2009 at 10:36:36AM -0700, Matthew Dillon wrote:
:> fetch http://apollo.backplane.com/DFlyMisc/hammer05.patch
:>
:> It survived my bulk build tests so now the question is whether it
:> will survive yours.
:
:It survived 7.5 hours before crashing in a different place.
:The kernel and the vmcore are at ~y0etan1/crash/{kernel,vmcore}.10.
:
:panic: assertion: s <= 0 in hammer_btree_iterate
:mp_lock = 00000000; cpuid = 0

The cursor is pointing at the wrong place again, this time at
the wrong index of an internal node.

Try this patch. The only thing I found looking at the code was a
possible problem with hammer_cursor_up_locked(). If the routine
fails due to a deadlock it does not properly restore the cursor's
index.

This is a rollup patch which contains all active uncommitted patches
for HAMMER.

fetch http://apollo.backplane.com/DFlyMisc/hammer06.patch

-Matt

#8 Updated by qhwt+dfly about 5 years ago

On Thu, Oct 22, 2009 at 06:19:01PM -0700, Matthew Dillon wrote:
> This is a rollup patch which contains all active uncommitted patches
> for HAMMER.
>
> fetch http://apollo.backplane.com/DFlyMisc/hammer06.patch

Before starting pbulk after booting the patched kernel, I issued
`hammer cleanup /pfs/archives' where I keep packages built by bulkbuild,
and I noticed the following line in the /var/log/messages:

Oct 23 19:15:49 firebolt kernel: HAMMER: Debug: rebalance hit right bndry

I guess this means that the new code in the patch avoided some bad
condition, right?

By the way, I caught a different panic on vkernel. I think the last
I ran `hammer cleanup' on /usr/obj was before applying hammer05.patch
or hammer06.patch to vkernel.

$ ls /usr/obj
build.log t
$ rm -rf /usr/obj/*; sync
$ panic: assertion: parent != NULL in hammer_cursor_removed_node
mp_lock = 00000000; cpuid = 0
Trace beginning at frame 0x5b7dfabc
panic(5b7dfae0,0,0,58ba7168,5b7dfaf8) at 0x80dadce
panic(824684c,8269077,82826b7,5e226588,0) at 0x80dadce
hammer_cursor_removed_node(58ba7168,0,0,5b7dfc9c,0) at 0x81fe93a
hammer_btree_do_propagation(58ba7168,0,58ba71b8,58ba71b8,58ba71b8) at 0x81fc4e0
hammer_btree_do_propagation(0,0,52e22040,b,5b7dfbe8) at 0x81fc4c5
hammer_btree_delete(5b7dfc9c,1,1,5b7d0114,52e22040) at 0x81fc8c4
hammer_delete_at_cursor(5b7dfc9c,3,50c7fb30,b,4ae1c1b6) at 0x820b74e
hammer_ip_delete_record(5b7dfc9c,5e41ead0,50c7fb30,b,4000) at 0x820b9b6
hammer_ip_delete_range(5b7dfc9c,5e41ead0,0,0,ffffffff) at 0x820bda0
hammer_sync_inode(5b7d0114,5e41ead0,5b7d0000,5b7d0114,5b7d0000) at 0x82029bd
hammer_flusher_create(561212f0,0,0,0,0) at 0x8200b47
Debugger("panic")

CPU0 stopping CPUs: 0x000000fe
stopped
Stopped at 0x823b061: movb $0,0x83a7ff4
db>

So far the panic is repeatable, and removing `build.log' followed
by sync appears to trigger the panic. `undo -i' shows a warning like this.
# undo -i build.log
HAMMER: WARNING: Missing inode for dirent "build.log@@0x0000000b4d648360"
obj_id = 0000000b4d640668, asof=0000000b4d648360, lo=00070000
load: 0.00 cmd: undo 730 [fifoor] 0.00u 0.00s 0% 820k <-- ctrl+T

Since there's no important data in /usr/obj (and it's on vkernel anyway),
I don't mind losing the contents or the history, but will destroying and
recreating the PFS fix the issue?

#9 Updated by dillon about 5 years ago

:
:I guess this means that the new code in the patch avoided some bad
:condition, right?

Yes, that validates that the bug related to the rebalancing code
was real and that the bug fix actually fixed it.

:By the way, I caught a different panic on vkernel. I think the last
:I ran `hammer cleanup' on /usr/obj was before applying hammer05.patch
:or hammer06.patch to vkernel.
:
:$ ls /usr/obj
:build.log t
:$ rm -rf /usr/obj/*; sync
:$ panic: assertion: parent != NULL in hammer_cursor_removed_node
:mp_lock = 00000000; cpuid = 0
:Trace beginning at frame 0x5b7dfabc
:panic(5b7dfae0,0,0,58ba7168,5b7dfaf8) at 0x80dadce
:panic(824684c,8269077,82826b7,5e226588,0) at 0x80dadce
:hammer_cursor_removed_node(58ba7168,0,0,5b7dfc9c,0) at 0x81fe93a
:hammer_btree_do_propagation(58ba7168,0,58ba71b8,58ba71b8,58ba71b8) at 0x81fc4e0
:hammer_btree_do_propagation(0,0,52e22040,b,5b7dfbe8) at 0x81fc4c5
:hammer_btree_delete(5b7dfc9c,1,1,5b7d0114,52e22040) at 0x81fc8c4
:hammer_delete_at_cursor(5b7dfc9c,3,50c7fb30,b,4ae1c1b6) at 0x820b74e
:hammer_ip_delete_record(5b7dfc9c,5e41ead0,50c7fb30,b,4000) at 0x820b9b6
:hammer_ip_delete_range(5b7dfc9c,5e41ead0,0,0,ffffffff) at 0x820bda0
:hammer_sync_inode(5b7d0114,5e41ead0,5b7d0000,5b7d0114,5b7d0000) at 0x82029bd
:hammer_flusher_create(561212f0,0,0,0,0) at 0x8200b47
:Debugger("panic")

It's possible this will be fixed by the new patch as the propagation
code uses the recovery mechanism that was patched (I think), or the bug
fix from the problem corecode reported (also in patch 6). If you
have that core I'd like to see it, though, to make sure.

:by sync appears to trigger the panic. `undo -i' shows a warning like this.
:# undo -i build.log
:HAMMER: WARNING: Missing inode for dirent "build.log@@0x0000000b4d648360"
: obj_id = 0000000b4d640668, asof=0000000b4d648360, lo=00070000
:load: 0.00 cmd: undo 730 [fifoor] 0.00u 0.00s 0% 820k <-- ctrl+T
:
:Since there's no important data in /usr/obj (and it's on vkernel anyway),
:I don't mind losing the contents or the history, but will destroying and
:recreating the PFS fix the issue?

Generally speaking you can ignore missing inode warnings on historical
entries *unless* the transaction id is on a snapshot boundary (same as
a snapshot transaction id), in which case there's a problem.

The warnings can be ignored for any file marked nohistory or if the
mount is nohistory. All bets are off for nohistory files/mounts (the
data is just not in the B-Tree any more).

There's a second bug related to these missing inode issues and that is
if the undo code tries to actually open() the file, the file will look
like a FIFO, as in 'mkfifo' style FIFO, a mistake on my part I should
probably change it, and then get stuck on the open() until you ^C
because of that.

The real issue here is more of a long-standing disconnect between
the transaction the directory entry is in and the transaction the
initial laydown of the inode is in. It is something I need to fix
but it isn't critical.

-Matt
Matthew Dillon
<>

#10 Updated by qhwt+dfly about 5 years ago

On Fri, Oct 23, 2009 at 10:07:59AM -0700, Matthew Dillon wrote:
> :By the way, I caught a different panic on vkernel. I think the last
> :I ran `hammer cleanup' on /usr/obj was before applying hammer05.patch
> :or hammer06.patch to vkernel.
> :
> :$ ls /usr/obj
> :build.log t
> :$ rm -rf /usr/obj/*; sync
> :$ panic: assertion: parent != NULL in hammer_cursor_removed_node
> :mp_lock = 00000000; cpuid = 0
> :Trace beginning at frame 0x5b7dfabc
> :panic(5b7dfae0,0,0,58ba7168,5b7dfaf8) at 0x80dadce
> :panic(824684c,8269077,82826b7,5e226588,0) at 0x80dadce
> :hammer_cursor_removed_node(58ba7168,0,0,5b7dfc9c,0) at 0x81fe93a
> :hammer_btree_do_propagation(58ba7168,0,58ba71b8,58ba71b8,58ba71b8) at 0x81fc4e0
> :hammer_btree_do_propagation(0,0,52e22040,b,5b7dfbe8) at 0x81fc4c5
> :hammer_btree_delete(5b7dfc9c,1,1,5b7d0114,52e22040) at 0x81fc8c4
> :hammer_delete_at_cursor(5b7dfc9c,3,50c7fb30,b,4ae1c1b6) at 0x820b74e
> :hammer_ip_delete_record(5b7dfc9c,5e41ead0,50c7fb30,b,4000) at 0x820b9b6
> :hammer_ip_delete_range(5b7dfc9c,5e41ead0,0,0,ffffffff) at 0x820bda0
> :hammer_sync_inode(5b7d0114,5e41ead0,5b7d0000,5b7d0114,5b7d0000) at 0x82029bd
> :hammer_flusher_create(561212f0,0,0,0,0) at 0x8200b47
> :Debugger("panic")
>
> It's possible this will be fixed by the new patch as the propagation
> code uses the recovery mechanism that was patched (I think), or the bug
> fix from the problem corecode reported (also in patch 6). If you
> have that core I'd like to see it, though, to make sure.

The PC caught a panic with a similarly looking backtrace 9 hours after
starting the kernel with hammer06.patch. It's in ~y0netan1/crash on leaf,
as {kernel,vmcore}.11 .

#11 Updated by dillon about 5 years ago

:> :$ panic: assertion: parent != NULL in hammer_cursor_removed_node
:> :mp_lock = 00000000; cpuid = 0
:> :Trace beginning at frame 0x5b7dfabc
:> :panic(5b7dfae0,0,0,58ba7168,5b7dfaf8) at 0x80dadce
:> :panic(824684c,8269077,82826b7,5e226588,0) at 0x80dadce
:> :hammer_cursor_removed_node(58ba7168,0,0,5b7dfc9c,0) at 0x81fe93a
:> :hammer_btree_do_propagation(58ba7168,0,58ba71b8,58ba71b8,58ba71b8) at 0x81fc4e0
:> :hammer_btree_do_propagation(0,0,52e22040,b,5b7dfbe8) at 0x81fc4c5
:> :hammer_btree_delete(5b7dfc9c,1,1,5b7d0114,52e22040) at 0x81fc8c4
:
:The PC caught a panic with a similarly looking backtrace 9 hours after
:starting the kernel with hammer06.patch. It's in ~y0netan1/crash on leaf,
:as {kernel,vmcore}.11 .

I'd like the specific core for that one too.

I'll commit what I have so far.

-Matt
Matthew Dillon
<>

#12 Updated by qhwt+dfly about 5 years ago

On Fri, Oct 30, 2009 at 06:14:13PM -0700, Matthew Dillon wrote:
>
> :> :$ panic: assertion: parent != NULL in hammer_cursor_removed_node
> :> :mp_lock = 00000000; cpuid = 0
> :> :Trace beginning at frame 0x5b7dfabc
> :> :panic(5b7dfae0,0,0,58ba7168,5b7dfaf8) at 0x80dadce
> :> :panic(824684c,8269077,82826b7,5e226588,0) at 0x80dadce
> :> :hammer_cursor_removed_node(58ba7168,0,0,5b7dfc9c,0) at 0x81fe93a
> :> :hammer_btree_do_propagation(58ba7168,0,58ba71b8,58ba71b8,58ba71b8) at 0x81fc4e0
> :> :hammer_btree_do_propagation(0,0,52e22040,b,5b7dfbe8) at 0x81fc4c5
> :> :hammer_btree_delete(5b7dfc9c,1,1,5b7d0114,52e22040) at 0x81fc8c4
> :
> :The PC caught a panic with a similarly looking backtrace 9 hours after
> :starting the kernel with hammer06.patch. It's in ~y0netan1/crash on leaf,
> :as {kernel,vmcore}.11 .
>
> I'd like the specific core for that one too.

The vmcore file from the real kernel (not vkernel) is already on my
leaf account, as ~y0netan1/crash/vmcore.11, as mentioned in the quoted part.
If you meant the core file from vkernel, I have no idea which signal
I need to send to it. `call dumpsys' won't work inside vkernel, either.

> I'll commit what I have so far.

Hmm... currently I'm running vkernel built from source as of
e4eb09b2 (that is, hammer06.patch unapplied), and I can't reproduce
the panic on this kernel. I'm compiling vkernel from f3a4893b0
and see if I can reproduce this panic again.

#13 Updated by qhwt+dfly about 5 years ago

On Sat, Oct 31, 2009 at 07:54:07PM +0900, YONETANI Tomokazu wrote:
> > :> :$ panic: assertion: parent != NULL in hammer_cursor_removed_node
> > :> :mp_lock = 00000000; cpuid = 0
> > :> :Trace beginning at frame 0x5b7dfabc
> > :> :panic(5b7dfae0,0,0,58ba7168,5b7dfaf8) at 0x80dadce
> > :> :panic(824684c,8269077,82826b7,5e226588,0) at 0x80dadce
> > :> :hammer_cursor_removed_node(58ba7168,0,0,5b7dfc9c,0) at 0x81fe93a
> > :> :hammer_btree_do_propagation(58ba7168,0,58ba71b8,58ba71b8,58ba71b8) at 0x81fc4e0
> > :> :hammer_btree_do_propagation(0,0,52e22040,b,5b7dfbe8) at 0x81fc4c5
> > :> :hammer_btree_delete(5b7dfc9c,1,1,5b7d0114,52e22040) at 0x81fc8c4

(NOTE: the original panic mentioned in the Subject: line hasn't bitten me
for a while, but the new panic in hammer_cursor_removed_node is)

> Hmm... currently I'm running vkernel built from source as of
> e4eb09b2 (that is, hammer06.patch unapplied), and I can't reproduce
> the panic on this kernel. I'm compiling vkernel from f3a4893b0
> and see if I can reproduce this panic again.

Although my /usr/obj on vkernel no longer triggers the panic,
mirror-write'ing to a slave then pfs-destroy'ing it can still trigger
the panic. It's not necessarily the same panic triggered by removing
my /usr/obj followed by a sync, but at least it's reproducible.
I believe that the following results eliminate the possiblity of bad
compiler causing the panic:

vkernel from e4eb09b2 built on Oct 22 world(PC): no panic
vkernel from e4eb09b2 built on Oct 31 world(vkernel): no panic
vkernel from f3a4893b built on Oct 22 world(PC): panic
vkernel from f3a4893b built on Oct 31 world(vkernel): panic

Matt, can you determine if the situation is one of the following?
Because I'm not sure how to validate the mirror-stream data:

A) the mirror-stream data is somehow corrupted, and f3a4893b0 caught
something that e4eb09b2 missed
B) f3a4893b has introduced a new bug not existed in e4eb09b2

I've put the mirror-stream image which can trigger the panic
on my leaf account, as ~y0netan1/crash/usr.src.hammer.gz.
Its shared-uuid is c7d03096-4ce7-11de-85c5-0108543c929d.

# cd /pfs
# hammer pfs-slave test shared-uuid=c7d03096-4ce7-11de-85c5-0108543c929d
# zcat ~y0netan1/crash/usr.src.hammer.gz | hammer mirror-write
# hammer pfs-destroy

#14 Updated by qhwt+dfly almost 5 years ago

On Thu, Oct 22, 2009 at 06:19:01PM -0700, Matthew Dillon wrote:
> fetch http://apollo.backplane.com/DFlyMisc/hammer06.patch

It seems like btree_remove() sets cursor->parent to NULL in its
recursion path starting at hammer_btree.c:2226 but somehow returns 0
which ends up hitting the first KKASSERT() in hammer_cursor_removed_node().

#15 Updated by dillon almost 5 years ago

:On Thu, Oct 22, 2009 at 06:19:01PM -0700, Matthew Dillon wrote:
:> fetch http://apollo.backplane.com/DFlyMisc/hammer06.patch
:
:It seems like btree_remove() sets cursor->parent to NULL in its
:recursion path starting at hammer_btree.c:2226 but somehow returns 0
:which ends up hitting the first KKASSERT() in hammer_cursor_removed_node().

In your vkernel panic. By the way, the panic message will be correct
but the symbols in the backtrace are clearly messed up. For some
reason the vkernel reports the symbols incorrectly, I don't know
why.

But given that panic message:

panic: assertion: parent != NULL in hammer_cursor_removed_node

The only possible path is via btree_remove(). I'm a bit at a loss
here. I don't see how that panic could still occur with the recent
patches. I went through your emails again and found this comment:

"By the way, I caught a different panic on vkernel. I think the last
I ran `hammer cleanup' on /usr/obj was before applying hammer05.patch
or hammer06.patch to vkernel."

Was that vkernel backtrace a pre-patch panic?

--------

Going back to crash dumps .10 and .11 which paniced at:

panic: assertion: s <= 0 in hammer_btree_iterate

I think I see a possible issue. If hammer_btree_remove() fails with
EDEADLK hammer_btree_delete() ignores the error on line 897. This
is correct, we WANT to ignore the error because it is ok for
hammer_btree_remove()'s recursion to fail... it just means we could
not recursively delete the internal nodes to get rid of the empty
leaf. The leaf is simply left empty.

However I think this opens up an error path where the cursor can wind
up in a bad state when EDEADLK is returned and lead to the assertion.
It's just a guess at the moment. The normal case is clearly not
causing any problems, otherwise you'd get the panic instantly. It
takes a cpu/disk load and time to cause the panic to occur so it has
to be in the EDEADLK handling somewhere.

Another possibility is via hammer_btree_do_propagation(), which is also
called indirectly inside that loop. This code pushes the cursor,
does some work, then pops the cursor. But pushing a cursor unlocks it,
so some other third party operation can wind up adjusting it. It
is possible that some other deletion caused the node under the cursor
to be removed, causing the cursor to be adjusted to the parent node.
If the node that was removed was a node under the root node, then the
new cursor->node will become the root node and the cursor->parent will
become NULL. This should be ok (we are talking about the s <= 0 panic
here, not the cursor->parent != NULL panic).

But I'm thinking one of the above two conditions is causing the cursor
to get whacked out of shape badly enough to hit the s <= 0 assertion
in the iteration.

For the assertion to fail the cursor would have to be indexed to
BEFORE the beginning of the range. This can only occur if the
cursor gets adjusted while unlocked or is munged beyond hope in
the hammer_btree_do_propagation() or hammer_btree_remove() sequence.

I haven't found the smoking gun yet. This code is terribly complex.

-Matt
Matthew Dillon
<>

#16 Updated by qhwt+dfly almost 5 years ago

On Mon, Nov 02, 2009 at 09:04:32PM -0800, Matthew Dillon wrote:
> :On Thu, Oct 22, 2009 at 06:19:01PM -0700, Matthew Dillon wrote:
> :> fetch http://apollo.backplane.com/DFlyMisc/hammer06.patch
> :
> :It seems like btree_remove() sets cursor->parent to NULL in its
> :recursion path starting at hammer_btree.c:2226 but somehow returns 0
> :which ends up hitting the first KKASSERT() in hammer_cursor_removed_node().

Duh. I THOUGHT I saw the same panic message in vmcore.11, but now I
checked that again and it turned out the same with vmcore.10 and not
with the vkernel one. Sorry about the confusion.

> In your vkernel panic. By the way, the panic message will be correct
> but the symbols in the backtrace are clearly messed up. For some
> reason the vkernel reports the symbols incorrectly, I don't know
> why.
>
> But given that panic message:
>
> panic: assertion: parent != NULL in hammer_cursor_removed_node
>
> The only possible path is via btree_remove(). I'm a bit at a loss
> here. I don't see how that panic could still occur with the recent
> patches.

the backtrace should actually look like this (confirmed this by inserting
kprintf's here and there):
hammer_cursor_removed_node
btree_remove
btree_remove
hammer_btree_delete(5c865738,5c865738,5,5c865a0c,52b17040) at 0x81fcb6a
hammer_delete_at_cursor(5c865738,2,0,0,4af05332) at 0x820ba90

The hammer_cursor_removed_node() is called right after returning from
the third recursion to btree_remove(), which calls hammer_cursor_up()
at the bottom of it. hammer_cursor_up(), when called with a non-NULL
cursor->parent, assigns cursor->parent to cursor->node, then calls
hammer_load_cursor_parent(). This function sets cursor->parent to NULL
and returns 0 when called with cursor->node->ondisk->parent == 0, that is,
when cursor->parent->ondisk->parent == 0 in the third btree_remove().

> I went through your emails again and found this comment:
>
> "By the way, I caught a different panic on vkernel. I think the last
> I ran `hammer cleanup' on /usr/obj was before applying hammer05.patch
> or hammer06.patch to vkernel."
>
> Was that vkernel backtrace a pre-patch panic?

The backtrace was from the vkernel with hammer6.patch applied.

#17 Updated by qhwt+dfly almost 5 years ago

Hi.
With recent source tree (at least the fix for cursor tracking bug, f3a4893b,
is in) and the patch in the following message applied:
http://leaf.dragonflybsd.org/mailarchive/bugs/2009-11/msg00020.html
the same panic reproduced after running pbulk for about a day:

(kgdb) bt
#0 _get_mycpu () at ./machine/thread.h:83
#1 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:657
#2 0xc01a1069 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:378
#3 0xc01a132e in panic (fmt=0xc030099e "assertion: %s in %s")
at /usr/src/sys/kern/kern_shutdown.c:813
#4 0xc028297e in hammer_ip_delete_range (cursor=0xd81ffc9c, ip=0xe2dcc8d0,
ran_beg=0, ran_end=9223372036854775807, truncating=2)
at /usr/src/sys/vfs/hammer/hammer_object.c:1949
#5 0xc0279619 in hammer_sync_inode (trans=0xd81ea114, ip=0xe2dcc8d0)
at /usr/src/sys/vfs/hammer/hammer_inode.c:2715
#6 0xc0277525 in hammer_flusher_flush_inode (arg=0xd6d332d8)
at /usr/src/sys/vfs/hammer/hammer_flusher.c:507
#7 hammer_flusher_slave_thread (arg=0xd6d332d8)
at /usr/src/sys/vfs/hammer/hammer_flusher.c:451
#8 0xc01a9c0a in lwkt_deschedule_self (td=Cannot access memory at address 0x8
)
at /usr/src/sys/kern/lwkt_thread.c:271
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

My pbulk directory and $WRKOBJDIR is nohistory-flagged, and periodic
`hammer cleanup' has been turned off (manually invoked about once a week
when the machine is idle). Less than 40% of 1Tbyte harddrive is occuppied,
according to the output from df command.

The vmcore is on my leaf account, as ~y0netan1/crash/{kernel,vmcore}.13.

BTW, sshd on leaf dumps core when I tried to login unless I turn off
tty allocation (via -T). Is anything bad happening on it?

Best regards.

#18 Updated by dillon almost 5 years ago

:BTW, sshd on leaf dumps core when I tried to login unless I turn off
:tty allocation (via -T). Is anything bad happening on it?
:
:Best regards.

Woa, that's odd. It is dying in getenv(). It looks like
/usr/src/crypto/openssh/session.c is doing something really nasty
with the environ global, but I don't know what.

-Matt
Matthew Dillon
<>

#19 Updated by dillon almost 5 years ago

:
:
::BTW, sshd on leaf dumps core when I tried to login unless I turn off
::tty allocation (via -T). Is anything bad happening on it?
::
::Best regards.
:
: Woa, that's odd. It is dying in getenv(). It looks like
: /usr/src/crypto/openssh/session.c is doing something really nasty
: with the environ global, but I don't know what.
:

The crypto/openssha code is messing with the environment in ways
that really interfere with libc's management of it. I will commit
a change to sshd to work around the issue.

Changing libc is not really in the cards, we would have to completely
rewrite the environment handling code.

-Matt
Matthew Dillon
<>

#20 Updated by qhwt+dfly over 4 years ago

Hi.
I'm still seeing this panic, although very occasionally, during bulkbuild.
It's a DELL PowerEdge with Core 2 Duo and 2G bytes of RAM, running
an SMP x86_64 kernel built from source as of c652b. The HAMMER filesystem
is version 4 (originally version 2 but migrated to version 4 later), has
never been filled up, and about 40% is in use. I have
`daily_clean_hammer_enable="NO"' in /etc/periodic.conf and occasionally run
`hammer cleanup /pfs/*', so no cleanup should be involved at the time of
panic. I happened to be running two instances of hammer command at that
time, one for iostats and another for bstats, but I'm not sure it's anything
to do with the panic because I remember seeing the panic without those
commands running.

The backtrace looks almost the same as that of i386 version, but I can
upload the dump on leaf tomorrow if you think it's worth looking at it.

Cheers.

#21 Updated by dillon over 4 years ago

:The backtrace looks almost the same as that of i386 version, but I can
:upload the dump on leaf tomorrow if you think it's worth looking at it.
:
:Cheers.

Definitely worth uploading.

-Matt
Matthew Dillon
<>

#22 Updated by qhwt+dfly over 4 years ago

On Sat, Mar 06, 2010 at 06:42:11PM -0800, Matthew Dillon wrote:
> :The backtrace looks almost the same as that of i386 version, but I can
> :upload the dump on leaf tomorrow if you think it's worth looking at it.
> :
> :Cheers.
>
> Definitely worth uploading.

Uploaded as ~y0nean1/crash/{kern,vmcore}.4 on leaf.

#23 Updated by dillon over 4 years ago

:>
:> Definitely worth uploading.
:
:Uploaded as ~y0nean1/crash/{kern,vmcore}.4 on leaf.

This helps a lot. I think I see what is going on finally, the other
fixes have made it more apparent.

What is happening is that the record deletion updates the mirror_tid
in the B-Tree recursively going upwards, during which the original
cursor is unlocked. The unlocked cursor winds up getting moved to
a parent (internal) B-Tree node which is fine, but then a new record
is inserted before the cursor is re-locked.

Because the cursor got moved to an internal node the new insertion can
get in between the cursor's position and the original range that was
being scanned. The unexpected record then causes the assertion.

I'll have another patch for you to try in a day or two. I have several
solutions in mind but I want to try to find the one that is minimally
invasive for the release. Basically either (1) Unlocked cursors which
have to be moved into a parent node can't track and require a
relookup. Or (2) when moving an unlocked cursor into its parent the
cursor's index in the parent must also be bumped to prevent insertions
from being able to wiggle their way in.

-Matt
Matthew Dillon
<>

#24 Updated by dillon over 4 years ago

Here's another patch to try:

fetch http://apollo.backplane.com/DFlyMisc/hammer12.patch

Hopefully it will print out the debugging message and continue on its
merry way instead of asserting in the higher level HAMMER code.

What I'm trying to do is to catch the one unlocked cursor adjustment
which I believe can cause the problem by setting a flag, and then
adjusting the iteration code to skip any records which might have
wiggled their way in. Hopefully then the higher level code will
not assert on the obj_id being incorrect.

-Matt
Matthew Dillon
<>

#25 Updated by qhwt+dfly over 4 years ago

On Tue, Mar 09, 2010 at 11:27:56AM -0800, Matthew Dillon wrote:
> fetch http://apollo.backplane.com/DFlyMisc/hammer12.patch
>
> Hopefully it will print out the debugging message and continue on its
> merry way instead of asserting in the higher level HAMMER code.
>
> What I'm trying to do is to catch the one unlocked cursor adjustment
> which I believe can cause the problem by setting a flag, and then
> adjusting the iteration code to skip any records which might have
> wiggled their way in. Hopefully then the higher level code will
> not assert on the obj_id being incorrect.

I'm running pbulk on a few machines around me with the patch applied.
I'll let you know if anything unusual happens again.

Thanks.

#26 Updated by qhwt+dfly over 4 years ago

On Tue, Mar 09, 2010 at 11:27:56AM -0800, Matthew Dillon wrote:
> Here's another patch to try:
>
> fetch http://apollo.backplane.com/DFlyMisc/hammer12.patch
>
> Hopefully it will print out the debugging message and continue on its
> merry way instead of asserting in the higher level HAMMER code.

It survived the first bulkbuild, but crashed several hours after
starting the second one. I found many warnings like this in /var/log/messages
after the crash:
(the hexadecimal numbers differ between lines)
Mar 15 14:31:19 dfly64 kernel: HAMMER Warning: depth limit reached on setup recursion, inode 0xfffffffe4d5274e0 00000004c138829a

according to the last command's output, I guess these were logged just
before the crash. The crash dump is on my leaf account as
~y0netan1/crash/{kern,vmcore}.5 .

#27 Updated by dillon over 4 years ago

:It survived the first bulkbuild, but crashed several hours after
:starting the second one. I found many warnings like this in /var/log/messages
:after the crash:
:(the hexadecimal numbers differ between lines)
:Mar 15 14:31:19 dfly64 kernel: HAMMER Warning: depth limit reached on setup recursion, inode 0xfffffffe4d5274e0 00000004c138829a
:
:according to the last command's output, I guess these were logged just
:before the crash. The crash dump is on my leaf account as
:~y0netan1/crash/{kern,vmcore}.5 .

Those are ok, probably unrelated.

More importantly, it didn't hit any of my debugging kprintf()s so
I haven't caught the path that is causing the cursor to stray. It's
the same situation as before... the cursor is sitting at the prior
object in a node.

We have to figure out what that hammer command that's running is
trying to do (thread 7, pid 843). That might give me a hint as
to what sort of operation is confusing the cursor.

-Matt
Matthew Dillon
<>

#28 Updated by qhwt+dfly over 4 years ago

On Mon, Mar 15, 2010 at 10:13:27AM -0700, Matthew Dillon wrote:
> We have to figure out what that hammer command that's running is
> trying to do (thread 7, pid 843). That might give me a hint as
> to what sort of operation is confusing the cursor.

I don't remember which, but those two hammer processes were
doing `hammer iostats' and `hammer bstats'.

#29 Updated by dillon over 4 years ago

:On Mon, Mar 15, 2010 at 10:13:27AM -0700, Matthew Dillon wrote:
:> We have to figure out what that hammer command that's running is
:> trying to do (thread 7, pid 843). That might give me a hint as
:> to what sort of operation is confusing the cursor.
:
:I don't remember which, but those two hammer processes were
:doing `hammer iostats' and `hammer bstats'.

Shoot. That can't hurt anything. I was hoping there'd be a
hammer cleanup running somewhere but those are the only two
hammer user processes.

So maybe this isn't related to a collision with a hammer cleanup
operation but instead something more conventional.

The hammer cursor is at the correct node, just not at the proper
index within that node.

Ok, I found another clue. The transaction id for the elements
inserted into the node for the other object id is the same as the
transaction id being used to delete the range. That implies that
the operation which inserted those elements is another flusher thread.
This makes sense. What doesn't make sense is how they managed to get
inserted without the node's index being properly updated.

I found a second place where I have to flag HAMMER_CURSOR_ITERATE_CHECK,
this time in the normal delete path (no special cases other than a
node becoming empty and being destroyed).

I've committed all the patches so far including some related to
Francois's bug report. You can reset and pull the whole mess, or
just apply thte latest patch ec9b6294f7ea2d9365124dd1e50ebe21c78da8b2
to your already patched test system.

Oh, p.s. is the HAMMER filesystem mounted nohistory? If so that would
explain why this second case is getting hit.

-Matt
Matthew Dillon
<>

#30 Updated by qhwt+dfly over 4 years ago

On Mon, Mar 15, 2010 at 10:53:08AM -0700, Matthew Dillon wrote:
> So maybe this isn't related to a collision with a hammer cleanup
> operation but instead something more conventional.

I have daily_clean_hammer_enable="NO" in /etc/periodic.conf and I haven't
issued it while running bulkbuild. Looking at the output from `ps' macro,
it seems to me to be something between pkg_add (doing fsync) and hammer-S2.

:
> I've committed all the patches so far including some related to
> Francois's bug report. You can reset and pull the whole mess, or
> just apply thte latest patch ec9b6294f7ea2d9365124dd1e50ebe21c78da8b2
> to your already patched test system.

OK, I'll try the new code.

> Oh, p.s. is the HAMMER filesystem mounted nohistory? If so that would
> explain why this second case is getting hit.

No, the HAMMER filesystem itself is not mounted with nohistory, but
PFSes involved in pbulk have nohistory flag set.

#31 Updated by dillon over 4 years ago

:> Oh, p.s. is the HAMMER filesystem mounted nohistory? If so that would
:> explain why this second case is getting hit.
:
:No, the HAMMER filesystem itself is not mounted with nohistory, but
:PFSes involved in pbulk have nohistory flag set.

Ok, good. The odds are greatly increased for this particular issue
for no-history file deletions as those actually remove the B-Tree
elements (verses just marking them as deleted with the transaction id).

-Matt
Matthew Dillon
<>

Also available in: Atom PDF