Bug #2287

HAMMER(ROOT) Illegal UNDO TAIL signature at 300000001967c000

Added by y0n3t4n1 almost 3 years ago. Updated over 2 years ago.

Status: New
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Hello.

After a few panics, the root filesystem can no longer be mounted,
even in read-only mode:

(booted from a USB stick)
# mount -thammer -o ro /dev/da0s1d /mnt
HAMMER(ROOT) recovery check seqno=3364b54c
HAMMER(ROOT) Illegal UNDO TAIL signature at 300000001967c000
HAMMER(ROOT) recovery failure during seqno fwdscan
HAMMER(ROOT) recovery complete
Failed to recover HAMMER filesystem on mount
hammer: mount on /mnt: Input/output error

I haven't tried `hammer recover' yet, as I have no idea what it does.
Is there anything I can do to recover from this situation? This is
a machine dedicated to testing DragonFly stability, so I can install
from scratch in the worst case.

The error message looks similar to the one described in issue1984,
but in this case even R/O mount fails.

The current kernel is built from 4f459, and it occasionally panics
even under almost no CPU or disk load. It could be a hardware failure,
but I saw no indication of one on the console while it booted from
the USB stick.
The previous kernel was built from 190f1b64, and it survived 10 days
without a panic under pbulk load.

Best Regards,
YONETANI Tomokazu.

History

#1 Updated by sepherosa almost 3 years ago

On Mon, Jan 23, 2012 at 10:55 AM, YONETANI Tomokazu via Redmine
<> wrote:
>
> Issue #2287 has been reported by YONETANI Tomokazu.
>
> ----------------------------------------
> Bug #2287: HAMMER(ROOT) Illegal UNDO TAIL signature at 300000001967c000
> http://bugs.dragonflybsd.org/issues/2287
>
> Author: YONETANI Tomokazu
> Status: New
> Priority: Normal
> Assignee:
> Category:
> Target version:
>
>
> Hello.
>
> After having experienced a few panics, the root filesystem is
> no longer able to be mounted, even in read-only mode:
>
> (booted from a USB stick)
> # mount -thammer -o ro /dev/da0s1d /mnt
> HAMMER(ROOT) recovery check seqno=3364b54c
> HAMMER(ROOT) Illegal UNDO TAIL signature at 300000001967c000
> HAMMER(ROOT) recovery failure during seqno fwdscan
> HAMMER(ROOT) recovery complete
> Failed to recover HAMMER filesystem on mount
> hammer: mount  on /mnt: Input/output error
>
> I haven't tried `hammer recover' yet, as I have no idea what it does.
> Is there anything I can do to recover from this situation?  This is
> a machine dedicated to testing DragonFly stability, so I can install
> from scratch in the worst case.
>
> The error message looks similar to the one described in issue1984,
> but in this case even R/O mount fails.
>
> The current kernel is built from 4f459, and it occasionally panics
> even under almost no CPU or disk load.  It could be a hardware failure,
> but I couldn't find any indication of it as far as I watched the console
> while it booted with the USB stick.

Could it be bad memory? I once installed dfly on a box w/ bad memory,
and the system crashed many times w/o any activity. I think you may want
to run memtest on your box.

Best Regards,
sephe

--
Tomorrow Will Never Die

#2 Updated by y0n3t4n1 almost 3 years ago

Hi, I let memtest run overnight, but (un)fortunately it has found no errors so far.

Best Regards,
YONETANI Tomokazu

#3 Updated by y0n3t4n1 over 2 years ago

Hi. I tried `hammer show-undo'. It found a record with a zero size
field at 300000001967c000, and every subsequent entry looked the same
until I pressed ctrl+c.

Volume header UNDO 3000000019679708-300000001967aea8/3000000040000000
Undo map is 1024MB
3000000000000000 UNDO(0200) seq=33366e05 dataoff=2000003e69742938 bytes=472
3000000000000200 UNDO(0200) seq=33366e06 dataoff=2000003e69742b10 bytes=472
:
300000001967bc00 UNDO(0200) seq=3364b571 dataoff=2000003eb86d4c40 bytes=472
300000001967be00 UNDO(0200) seq=3364b572 dataoff=2000003eb86d4e18 bytes=472
300000001967c000 UNKNOWN(0000,0000) seq=00000000
Illegal size field, skipping to next boundary
300000001967c000 UNKNOWN(0000,0000) seq=00000000
Illegal size field, skipping to next boundary
:

Maybe format_undomap() can fix (reset) this truncated UNDO map?
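
To illustrate what the listing above shows, here is a minimal sketch in
Python of a forward scan over a toy FIFO. It is not the HAMMER on-disk
format: the record layout (16-bit signature, 16-bit type, 32-bit size,
32-bit sequence number), the signature value, and the names scan_fifo,
make_record, ALIGN and SIG are assumptions made up for illustration.
Only the 0x200 stride between records is taken from the offsets in the
output.

```python
import struct

ALIGN = 0x200                  # stride between records, as seen in the listing
SIG = 0xC84E                   # hypothetical head signature for this toy model
HEAD = struct.Struct("<HHII")  # signature, type, size, seq (assumed layout)

def make_record(seq, rec_type=1):
    """Build one ALIGN-sized toy record with a valid header."""
    return HEAD.pack(SIG, rec_type, ALIGN, seq).ljust(ALIGN, b"\0")

def scan_fifo(buf):
    """Walk the buffer record by record.

    Returns (last_good_seq, bad_offset). bad_offset is None when every
    record up to the end of the buffer has a valid header and the
    sequence numbers increase by one.
    """
    off = 0
    last_seq = None
    while off + HEAD.size <= len(buf):
        sig, _rec_type, size, seq = HEAD.unpack_from(buf, off)
        # A zeroed region fails here: signature and size are both 0.
        if sig != SIG or size == 0 or size % ALIGN != 0:
            return last_seq, off
        if last_seq is not None and seq != last_seq + 1:
            return last_seq, off
        last_seq = seq
        off += size
    return last_seq, None

# Three good records followed by a zeroed (never written) region.
fifo = b"".join(make_record(s) for s in (0x10, 0x11, 0x12)) + bytes(2 * ALIGN)
last_seq, bad_off = scan_fifo(fifo)
print(f"last good seq={last_seq:#x}, first bad offset={bad_off:#x}")
# prints: last good seq=0x12, first bad offset=0x600
```

In this toy model the scan stops at the first header that does not
parse, which is the same kind of boundary the output above reports at
300000001967c000.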

Best Regards,
YONETANI Tomokazu.

#4 Updated by y0n3t4n1 over 2 years ago

Hi,

I've recovered from this situation by writing a modified version
of the hammer utility that fills the rest of the UNDO FIFO with DUMMY
records.
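
The idea can be sketched with a toy model in Python. This is not the
modified hammer utility, and it does not use the real on-disk format:
the record layout, the signature, the type codes TYPE_UNDO and
TYPE_DUMMY, and the function names are all hypothetical. It only shows
the principle: starting at the first unreadable offset, write valid
placeholder records through the end of the FIFO so that a forward scan
no longer trips on zeroed space.

```python
import struct

ALIGN = 0x200                  # record stride, matching the show-undo offsets
SIG = 0xC84E                   # hypothetical head signature
TYPE_UNDO = 1                  # hypothetical type codes for this toy model
TYPE_DUMMY = 2
HEAD = struct.Struct("<HHII")  # signature, type, size, seq (assumed layout)

def record(rec_type, seq):
    """Build one ALIGN-sized toy record."""
    return HEAD.pack(SIG, rec_type, ALIGN, seq).ljust(ALIGN, b"\0")

def first_invalid(buf):
    """Return the offset of the first record with a bad header, or None."""
    for off in range(0, len(buf) - ALIGN + 1, ALIGN):
        sig, _rec_type, size, _seq = HEAD.unpack_from(buf, off)
        if sig != SIG or size != ALIGN:
            return off
    return None

def pad_with_dummies(buf, start, next_seq):
    """Overwrite buf[start:] with DUMMY records; return how many were written."""
    count = 0
    for off in range(start, len(buf) - ALIGN + 1, ALIGN):
        buf[off:off + ALIGN] = record(TYPE_DUMMY, next_seq + count)
        count += 1
    return count

# Two good records, then three zeroed slots that were never written.
fifo = bytearray(record(TYPE_UNDO, 0x10) + record(TYPE_UNDO, 0x11)
                 + bytes(3 * ALIGN))
bad = first_invalid(fifo)
written = pad_with_dummies(fifo, bad, next_seq=0x12)
print(f"padded {written} records from {bad:#x}; "
      f"first invalid now: {first_invalid(fifo)}")
# prints: padded 3 records from 0x400; first invalid now: None
```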

But in practice we need a read-only-without-recovery option
in hammer_mount (a read-only mount still runs recover_stage1, which
failed for me), so that we can check whether the filesystem is usable
without processing the UNDO records.

Best Regards,
YONETANI Tomokazu.

