Bug #1085

Another HAMMER crash

Added by bastyaelvtars about 6 years ago. Updated about 6 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%

Description

Self-explanatory:

# mount_hammer -T 0000000130ad51bd /dev/ad0s1f /usr
ASOF
panic: assertion: volume->io.lock.refs == 0 in hammer_unload_volume
Trace beginning at frame 0xcad43998
panic(cad439bc,c156e348,c1765aac,cad2b000,cad439d4) at panic+0x8c
panic(c051fcb7,c05434c4,c05077f9,1001,c156e348) at panic+0x8c
hammer_unload_volume(c156e348,0,0,0,0) at hammer_unload_volume+0x6c
hammer_vol_rb_tree_RB_SCAN(cad2b00c,0,c043776d,0,cad2b02c) at hammer_vol_rb_tree_RB_SCAN+0xad
hammer_free_hmp(cad2b340,c156e348,1,2,bfbffb2d) at hammer_free_hmp+0x166
hammer_vfs_mount(c9b048d8,bfbffb39,bfbff8f0,c15258c8,c1682528) at hammer_vfs_mount+0x84a
sys_mount(cad43cf0,6,0,0,c9b04218) at sys_mount+0x66c
syscall2(cad43d40) at syscall2+0x1e9
Xint0x80_syscall() at Xint0x80_syscall+0x36
Debugger("panic")
Stopped at Debugger+0x34: movb $0,in_Debugger.3949

I cannot get dumps for some reason. I 'panic' and 'call dumpsys' like
crazy but it just does not work.

History

#1 Updated by dblazakis about 6 years ago

I got the same panic when I forced an undo check to fail.

If recovery fails during mount, it leaves a ref behind: on error,
hammer_recovery does not call hammer_recover_flush_buffers at its end
(which would unref that io ref).
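
The leak described above can be illustrated with a toy model. This is only a sketch and not HAMMER's code: struct toy_volume, toy_recover() and toy_can_unload() are hypothetical names, and the single counter stands in for volume->io.lock.refs. It shows how an early return on the error path leaves the count non-zero, which is the condition the assertion in hammer_unload_volume trips on.

```c
#include <stdbool.h>

/* Toy stand-in for a volume with an I/O reference count (hypothetical). */
struct toy_volume {
	int refs;
};

/*
 * Toy recovery routine.  It takes a reference and, when release_on_error
 * is false, forgets to drop it if the undo check fails, mimicking the
 * leak described in the text.  Returns 0 on success, -1 on failure.
 */
int
toy_recover(struct toy_volume *vol, bool undo_check_fails,
    bool release_on_error)
{
	vol->refs++;			/* reference taken during recovery */
	if (undo_check_fails) {
		if (release_on_error)
			vol->refs--;	/* error path that drops the ref */
		return -1;		/* otherwise the ref is leaked */
	}
	vol->refs--;			/* normal end-of-recovery release */
	return 0;
}

/* Unloading is only safe once nobody holds a reference any more. */
bool
toy_can_unload(const struct toy_volume *vol)
{
	return vol->refs == 0;
}
```

Calling toy_recover() with undo_check_fails set and release_on_error clear leaves refs at 1, so toy_can_unload() reports false; with release_on_error set, the count returns to 0. Whether the real fix releases the ref in this way is not stated in this thread.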

-- Dion

#2 Updated by dillon about 6 years ago

:Self-explanatory:
:
:# mount_hammer -T 0000000130ad51bd /dev/ad0s1f /usr

:I cannot get dumps for some reason. I 'panic' and 'call dumpsys' like
:crazy but it just does not work.
:
:--
:Gergo Szakal MD <>

I'm guessing it can't find the root inode with that as-of mount.
Try putting a 0x in front of the timestamp.

mount_hammer -T 0x0000000130ad51bd /dev/ad0s1f /usr
                ^^^
                Add a 0x

The code needs the 0x prefix:

case 'T':
info.asof = strtoull(optarg, NULL, 0);
break;
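
Why the prefix matters: with base 0, strtoull(3) picks the base from the string itself, so a leading "0" without an "x" means octal and parsing stops at the first character that is not an octal digit. The sketch below shows this with the transaction id from the report. parse_asof() and consumed_chars() are hypothetical helpers written for illustration; they are not part of mount_hammer.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an as-of transaction id with base 0,
 * the same call the -T option handler makes.
 */
unsigned long long
parse_asof(const char *arg)
{
	return strtoull(arg, NULL, 0);
}

/* Number of characters strtoull() actually consumed from arg. */
size_t
consumed_chars(const char *arg)
{
	char *end;

	(void)strtoull(arg, &end, 0);
	return (size_t)(end - arg);
}
```

For "0000000130ad51bd" the conversion stops at the 'a' after 10 characters and returns octal 0130, which is 88, so the mount is attempted as-of a meaningless transaction id. For "0x0000000130ad51bd" all 18 characters are consumed and the result is 0x130ad51bd.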

I will fix both panics when I get home this evening. Panics are bad :-)

I recommend always doing a normal mount and using cd @@0x<TID> to
push into a snapshot.

-Matt
Matthew Dillon

#3 Updated by bastyaelvtars about 6 years ago

On Thu, 24 Jul 2008 09:12:17 -0700 (PDT)
Matthew Dillon <> wrote:

Yes, that helped.

I had no snapshot. I just wanted to try the as-of mount. I read the
manual but failed to understand it. hammer(5) says:

'Prior versions of files or directories are accessible by appending @@
and a transaction ID to the name.'

I could not access them this way.

#4 Updated by dillon about 6 years ago

:Yes, that helped.
:
:> I recommend always doing a normal mount and using cd @@0x<TID> to
:> push into a snapshot.
:
:I had no snapshot. I just wanted to try the as-of mount. I read the
:manual but failed to understand it. hammer(5) says:
:
:'Prior versions of files or directories are accessible by appending @@
:and a transaction ID to the name.'
:
:I could not access them this way.
:
:--
:Gergo Szakal MD <>
:University Of Szeged, HU

You can generate the transaction ids manually using 'hammer synctid',
or you can extract them from a file using 'undo -i <filename>'. You
get a snapshot every 30-60 seconds even if you don't lift a finger,
but unless you record the transaction ids somewhere you'd have to
do some sleuthing (e.g. with undo -i) to get them.

Typically what you would do is have a cron job create a convenient
snapshot softlink once an hour, once a day, whatever, like this:

mkdir /mnt/snapshots
hammer snapshot /mnt/snapshots
(creates a softlink in that directory called snap-DATE-TIME which
gives you a snapshot of the entire filesystem as-of that point).

If you have a snapshots directory you can then prune the filesystem
based on the contents of the directory. Hmm. I guess we need more
of a tutorial on how to get started with snapshots in our hammer(5)
man page.

-Matt
Matthew Dillon

#5 Updated by swildner about 6 years ago

The file system by default will update historical information of a file
upon every sync. You can list the prior versions of any file with
'hammer history <file>'. These are the versions that could be accessed
by appending @@0x<id>. Pruning will free this space again.

But as Matt notes, snapshots are the common way of accessing historical
data. I've updated hammer(5) to be a bit more clear about this. If you
still feel it's unclear, please let me know.

Sascha

#6 Updated by dillon about 6 years ago

Here's a patch against 2.0 which should fix both problems plus
another one I found related to umounting a read-only mount that
had recovery associated with it. Please test.

fetch http://apollo.backplane.com/DFlyMisc/hammer01.patch

-Matt

#7 Updated by mneumann about 6 years ago

The first argument to "hammer snapshot" takes a template similar to
strftime(3), so you can generate snapshots with:

hammer snapshot /mnt/snapshots/SNAP-%Y-%m-%d

Which would generate a symlink

/mnt/snapshots/SNAP-2008-08-25

If /mnt/snapshots itself is not on a hammer filesystem you should
also specify the hammer filesystem to snapshot:

hammer snapshot /mnt/snapshots/SNAP-%Y-%m-%d /hammer
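
A note on the conversion specifiers: in strftime(3), %m is the month and %d the day of the month, while %M is the minute and %H the hour, so a name like SNAP-2008-08-25 corresponds to %Y-%m-%d. The sketch below expands a template for a fixed date to make that visible. expand_template() is a hypothetical helper, and it only shows what strftime(3) itself does; it assumes, as stated above, that "hammer snapshot" treats the template similarly.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: expand a strftime(3)-style template for a fixed
 * calendar date and time so the result is reproducible.  Returns the
 * number of characters written, or 0 if buf is too small.
 */
size_t
expand_template(char *buf, size_t len, const char *tmpl,
    int year, int month, int day, int hour, int minute)
{
	struct tm tm = { 0 };

	tm.tm_year = year - 1900;	/* struct tm counts years from 1900 */
	tm.tm_mon = month - 1;		/* and months from 0 */
	tm.tm_mday = day;
	tm.tm_hour = hour;
	tm.tm_min = minute;
	return strftime(buf, len, tmpl, &tm);
}
```

For 2008-08-25 at 14:30, the template "SNAP-%Y-%m-%d" expands to "SNAP-2008-08-25", whereas "SNAP-%Y-%M-%H" expands to "SNAP-2008-30-14".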

Regards,

Michael

#8 Updated by mneumann about 6 years ago

What does "discarding of recovered buffers" in the case of read-only
mounts mean? Is anything written to disk?

Regards,

Michael

#9 Updated by corecode about 6 years ago

I think we should have a way to specify fuzzy values like "yesterday", "10
minutes ago" or even "previous" (call it "-1", "-2" for the one before,
etc.)

Otherwise people will have the impression that hammer's history feature
needs snapshots, which it does not.

cheers
simon

#10 Updated by swildner about 6 years ago

It would be handy if the template could be applied in the directory
names as well, e.g. 'hammer snapshot /mnt/snapshots/%Y/%m/%d/%H%M'.

Sascha

#11 Updated by mneumann about 6 years ago

Sure! But I'm not sure whether "hammer prune" likes subdirectories.

Regards,

Michael

#12 Updated by Johannes.Hofmann about 6 years ago

It would be nice to have something like undo -i that works recursively
on a directory.
Is there a more efficient way to do it than

find . -print0 | xargs -0 undo -i | sort

?

Johannes

#13 Updated by dillon about 6 years ago

The prune code does not currently scan multiple directories. It only
scans one, so all the softlinks have to be in the same directory.

I don't think splitting them up into that many sub-directories would
be all that helpful. One is fine, really!

-Matt
Matthew Dillon

#14 Updated by dillon about 6 years ago

:I think we should have a way to specify fuzzy values like "yesterday", "10
:minutes ago" or even "previous" (call it "-1", "-2" for the one before,
:etc.)
:
:Otherwise people will have the impression that hammer's history feature
:needs snapshots, which it does not.
:
:cheers
: simon

I originally had that sort of feature but I ripped it out because
the moving target got really confusing.

There is another issue here, which is that the filesystem is only
guaranteed to be consistent on snapshot boundaries. This will be
true for all fine-grained history, but once you prune the filesystem
anything older than the most recent snapshot softlink really has to
be accessed via a softlink, or at least a transaction id that exists
in a softlink (one of the softlinks that the prune code was told to
retain). If you use a random transaction id you will get an
inconsistent version of the filesystem.

The reason is that pruning operations create holes in the history in
those areas it was told to prune. If you try to access the filesystem
in those areas you will get a combination of records and holes.

Originally the pruning code fixed up the B-Tree records to cover the
holes but I decided that was too dangerous to do, and it also made
mirroring impossible (because fixed historical records were being
modified by the pruning code).

-Matt
Matthew Dillon

#15 Updated by dillon about 6 years ago

:What does "discarding of recovered buffers" in the case of read-only
:mounts mean? Is anything written to disk?
:
:Regards,
:
: Michael

When you do a read-only mount of a HAMMER filesystem after a crash,
HAMMER must still run all the undos in order to create a consistent
view of the filesystem. It leaves the modified buffers in-memory and
is not supposed to flush them to disk (it being a read-only mount!).

-Matt
Matthew Dillon
