Bug #1104
panic: assertion: layer2->zone == zone in hammer_blockmap_free
Status: closed
Description
Hi.
Caught this panic on the latest DragonFly_RELEASE_2_0 (minus the fix to
vfs_subr.c, but the machine has less than 2G RAM) when I tried to
destroy a PFS slave on a HAMMER filesystem.  The PFS hasn't been actively
used, only to try a mirror-copy of another PFS, which is used as /usr/obj.
The panic also occurs when I try to reblock this PFS, but there is no
problem upgrading it to a PFS master.  I don't have a problem reblocking
or pruning other PFSs or the HAMMER filesystem containing them.
The backtrace looks like this:
(kgdb) bt
#0  dumpsys () at ./machine/thread.h:83
#1  0xc0198cbd in boot (howto=260)
    at /home/source/dragonfly/R2_0/src/sys/kern/kern_shutdown.c:375
#2  0xc0198f80 in panic (fmt=0xc02d6a9f "assertion: %s in %s")
    at /home/source/dragonfly/R2_0/src/sys/kern/kern_shutdown.c:800
#3  0xc02508cf in hammer_blockmap_free (trans=0xd5fc2a88,
    zone_offset=10376293547717272640, bytes=128)
    at /home/source/dragonfly/R2_0/src/sys/vfs/hammer/hammer_blockmap.c:734
#4  0xc025e8df in hammer_delete_at_cursor (cursor=0xd5fc2894, delete_flags=2,
    delete_tid=0, delete_ts=1217680974, track=1, stat_bytes=0x0)
    at /home/source/dragonfly/R2_0/src/sys/vfs/hammer/hammer_object.c:2103
#5  0xc0262d78 in hammer_pfs_rollback (trans=0xd5fc2a88, pfsm=0xcc2e45a8,
    trunc_tid=0)
    at /home/source/dragonfly/R2_0/src/sys/vfs/hammer/hammer_pfs.c:403
#6  0xc0262ee0 in hammer_ioc_destroy_pseudofs (trans=0xd5fc2a88,
    ip=0xd59b7cd0, pfs=0xd5fc2c1c)
    at /home/source/dragonfly/R2_0/src/sys/vfs/hammer/hammer_pfs.c:244
#7  0xc025b748 in hammer_ioctl (ip=0xd59b7cd0, com=3224135691,
    data=0xd5fc2c1c "", fflag=1, cred=0xd5c13d98)
    at /home/source/dragonfly/R2_0/src/sys/vfs/hammer/hammer_ioctl.c:100
#8  0xc026ab2c in hammer_vop_ioctl (ap=0xd5fc2ae0)
    at /home/source/dragonfly/R2_0/src/sys/vfs/hammer/hammer_vnops.c:2011
#9  0xc01ec326 in vop_ioctl (ops=0xd57993b0, vp=0xd56dcba8,
    command=3224135691, data=0xd5fc2c1c "", fflag=1, cred=0xd5c13d98)
    at /home/source/dragonfly/R2_0/src/sys/kern/vfs_vopops.c:372
#10 0xc01eb98c in vn_ioctl (fp=0xcfecd758, com=3224135691, data=0xd5fc2c1c "",
    ucred=0xd5c13d98)
    at /home/source/dragonfly/R2_0/src/sys/kern/vfs_vnops.c:1006
#11 0xc01b8268 in mapped_ioctl (fd=3, com=3224135691,
    uspc_data=0xbfbffb90 <Address 0xbfbffb90 out of bounds>, map=0x0)
    at /home/source/dragonfly/R2_0/src/sys/sys/file2.h:87
#12 0xc01b82f0 in sys_ioctl (uap=0xd5fc2cf0)
    at /home/source/dragonfly/R2_0/src/sys/kern/sys_generic.c:525
#13 0xc02b24ef in syscall2 (frame=0xd5fc2d40)
    at /home/source/dragonfly/R2_0/src/sys/platform/pc32/i386/trap.c:1384
#14 0xc029ed26 in Xint0x80_syscall ()
    at /home/source/dragonfly/R2_0/src/sys/platform/pc32/i386/exception.s:876
#15 0x080511ec in ?? ()
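(For context on the assertion: a HAMMER blockmap offset carries its zone
number in the top 4 bits, and hammer_blockmap_free() checks that zone
against the one recorded in the freemap's layer2 entry for the big-block
being freed.  The sketch below is a simplified, userland-compilable
illustration of that check using stand-in types, not the actual kernel
code; the offset and byte count come from frame #3 of the backtrace,
while the layer2 values are hypothetical.)

    #include <assert.h>
    #include <stdint.h>

    /* Stand-in for HAMMER's per-big-block "layer2" freemap entry. */
    struct layer2_entry {
            int      zone;          /* zone that owns this 8MB big-block */
            uint32_t bytes_free;    /* free bytes remaining in it */
    };

    static void
    blockmap_free_sketch(struct layer2_entry *layer2,
                         uint64_t zone_offset, int bytes)
    {
            int zone = (int)(zone_offset >> 60); /* zone is in the top 4 bits */

            /* The check that panics in this report: the layer2 entry
             * disagrees about which zone owns the big-block, i.e. the
             * block looks as if it had already been freed or reassigned. */
            assert(layer2->zone == zone);
            layer2->bytes_free += (uint32_t)bytes;
    }

    int
    main(void)
    {
            /* zone_offset and bytes as reported in frame #3; the offset
             * decodes to zone 9. */
            struct layer2_entry l2 = { .zone = 9, .bytes_free = 8388480 };
            blockmap_free_sketch(&l2, 10376293547717272640ULL, 128);
            return 0;
    }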
The kernel dump has been uploaded as ~y0netan1/crash/{kernel,vmcore}.13
on my leaf account.
Cheers.
Updated by dillon about 17 years ago
    :Hi.
:Caught this panic on latest DragonFly_RELEASE_2_0 (minus the fix to
:vfs_subr.c, but the machine has less than 2G RAM), when I tried to
:destroy a PFS slave on HAMMER filesystem.  The PFS hasn't been actively
:used, only to try mirror-copy of another PFS, which is used as /usr/obj.
:The panic also occurs when I try to reblock this PFS, but no problem
:to upgrade to a PFS master.  I don't have a problem reblocking or pruning
:other PFS's or HAMMER filesystem containing them.
:...
:The kernel dump has been uploaded as ~y0netan1/crash/{kernel,vmcore}.13
:on my leaf account.
:
:Cheers.
This looks fairly serious.  The big-block has been marked free but the
rollback code is trying to delete a data element that still
references it.

I am going to need some additional information from you.  Please sync
your filesystem and then do the following:

    hammer -f <device> blockmap > textfile
    hammer -f <device> -vvv show >> textfile

And gzip and upload that to leaf.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>
Updated by dillon about 17 years ago
    :Hi.
:Caught this panic on latest DragonFly_RELEASE_2_0 (minus the fix to
:vfs_subr.c, but the machine has less than 2G RAM), when I tried to
:destroy a PFS slave on HAMMER filesystem.  The PFS hasn't been actively
:..
:
:The kernel dump has been uploaded as ~y0netan1/crash/{kernel,vmcore}.13
:on my leaf account.
:
:Cheers.
Also:

It looks like a 1TB disk, mostly empty, with ~500K inodes.

- Did you ever fill it up?  (I'm guessing you haven't)
- What types of operations did you run on it other than as a mirroring
  slave?  e.g. cpdup, tar, buildworlds, etc.?
- Did you upgrade it to a master before getting the panic or after?
- How many PFS operations did you run on the filesystem, approximately?
- How much reblocking have you done overall?
- You said you were playing with mirroring, with it as a slave.  Did
  you mirror-copy or mirror-write into it?  If so, how much data was
  transferred?
- Are the ~500K inodes mostly associated with the slave or unrelated?
	-Matt 
 Matthew Dillon
 <dillon@backplane.com>
Updated by qhwt+dfly about 17 years ago
Matt, you sent me two other messages privately, but I think this message
covers what you asked me in them.  apollo doesn't like my IP address, so
I need to configure my mail to go through my ISP to do so (and I haven't, yet).
$ ls -l /HAMMER
total 0
lrwxr-xr-x  1 root    wheel   26 Jul 19 15:09 obj -> @0xffffffffffffffff:00001@0xffffffffffffffff:00002
lrwxr-xr-x  1 root    wheel   26 Jul 19 16:24 slave -> 
drwxr-xr-x  1 source  source   0 Jul 24 11:48 source
/HAMMER is the only HAMMER filesystem on this machine and is mounted without
nohistory flags.
`obj' is used as the objdir for build{kernel,world} through nullfs, and has
the nouhistory flag set (it used to have noshistory too, but setting noshistory
confused a few commands running as non-root, so I cleared that flag later).
`source' is a plain directory on the HAMMER filesystem and contains a NetBSD
CVS repository, a pkgsrc git tree cpdupped from a UFS filesystem (unused
after cpdup), and a pkgsrc git tree updated once a day using the togit script
(from the `fromcvs' package).
It has experienced two types of major crashes so far: the first was
triggered by an attempted cross-device link in the middle of July.
The other was triggered by network code (reused socket on connect).
According to /var/log/messages, the recovery was run only once, though.
  Jul 19 11:34:52 firebolt kernel: HAMMER(HAMMER) Start Recovery 30000000002c7350 - 30000000002c93f0 (8352 bytes of UNDO)(RW)
  Jul 19 11:34:53 firebolt kernel: HAMMER(HAMMER) End Recovery

>     * Did you ever fill it up?  (I'm guessing you haven't)

No.

>     * What types of operations did you run on it other than as a mirroring
>       slave?  e.g. cpdup, tar, buildworlds, etc.?

The `slave' was a plain pfs-slave, so I didn't do anything to it before
this panic.

>     * Did you upgrade it to a master before getting the panic or after?

After.

>     * How many PFS operations did you run on the filesystem, approximately?

More than several to less than twenty, I guess.  Mostly creating and
destroying PFS's without actual mirror-copy or other file operations.

>     * How much reblocking have you done overall?

Several times.

>     * Did you mirror-copy or mirror-write into it?  If so, how much data was
>       transferred?

I use mirror-copy to sync the slave.  ${.OBJDIR} for buildworld usually
grows up to 2Gbytes, and ${WRKDIR}s for pkgsrc can reach around 1Gbytes
if I build a meta-package.  I usually mirror-copy after buildworld or building
packages, remove the directories in the master, then mirror-copy again to
see if removing files or directories is properly propagated to the slave.

I remember interrupting a reblock on /HAMMER/obj, but I haven't done a
mirror-copy to the slave after that, so I don't think it has anything to do
with it.

>     * Are the ~500K inodes mostly associated with the slave or unrelated?

They are mostly associated with /HAMMER/source and /HAMMER/obj.
Cheers.
Updated by qhwt+dfly about 17 years ago
    Done as ~y0netan1/crash/blockmap.gz.
Cheers.
Updated by dillon about 17 years ago
    :Matt, you sent me two other messages privately, but I think this message
:covers what you asked me in them.  apollo doesn't like my IP address, so
:I need to configure my mail to go through ISP to do so (and I haven't, yet).
Oh, it's the .ppp. in the reverse DNS.  I need to change over to spamcop
or something.

:$ ls -l /HAMMER
:total 0
:lrwxr-xr-x  1 root    wheel   26 Jul 19 15:09 obj -> @0xffffffffffffffff:00001@0xffffffffffffffff:00002
:lrwxr-xr-x  1 root    wheel   26 Jul 19 16:24 slave -> 
:
:/HAMMER is the only HAMMER filesystem on this machine and is mounted without
:nohistory flags.
:
:...
:It experienced two types of major crashes until now: the first one was
:triggered by an attempt of cross-device link in the middle of July.
:The other was triggered by network code (reused socket on connect).
:According to /var/log/messages, the recovery was run only once, though.
:
:  Jul 19 11:34:52 firebolt kernel: HAMMER Start Recovery 30000000002c7350 - 30000000002c93f0 (8352 bytes of UNDO)(RW)
:  Jul 19 11:34:53 firebolt kernel: HAMMER End Recovery
Ok, I'm not so worried about the net crash.  The cross-device link
crashes (before we fixed it) are interesting... those could be important.

Did you newfs_hammer the filesystem after the cross-device link crashes,
or is it possible that some cruft from those crashes leaked through to
the present day?  The timestamp in that filesystem's FSID reads July 18th,
which was right around when you reported that issue.

:I use mirror-copy to sync the slave.  ${.OBJDIR} for buildworld usually
:grows upto 2Gbytes, and ${WRKDIR}s for pkgsrc can reach around 1Gbytes
:if I build a meta-package.  Usually mirror-copy after buildwold or building
:packages, remove the directories in master, then mirror-copy again to
:see if removing files or directories are properly propagated to slave.
Yah, that's pretty much what I've been doing for testing too.
I usually also throw in a reblock run in a sleep loop, and an
occasional prune-everything in its own sleep loop.  Running everything
in parallel is a pretty good test.

:I remember interrupting reblock on /HAMMER/obj, but I haven't done
:mirror-copy to slave after that, so I don't think it's something to do
:with it.

Ok.  Interrupting reblocking should be fine; it isn't a real interrupt,
the kernel code polls for the signal at a safe point.

:>     * Are the ~500K inodes mostly associated with the slave or unrelated?
:
:They are mostly assosiated to /HAMMER/source and /HAMMER/obj.
:
:Cheers.
I'm crossing my fingers and hoping that the issue was related
to the cross-device link crashes.  If your filesystem still has some
cruft from those crashes then I will undo the test locally so I can
reproduce the cross-link crashes and see if I can corrupt the
filesystem that way.

If you can, please wipe that filesystem and continue testing fresh,
and see if you can reproduce that panic (or any bug).

--

While trying to reproduce your panic today I found another, unrelated
bug which I will commit a fix for tomorrow.  There's a small window
of opportunity where reblocking live data can interfere with programs
accessing that live data.  It only affects the data though, not the
meta-data, so it can't be related to the panic you got.

I am also planning on writing a 'hammer fsck' feature to clean up
corrupted freemaps & (maybe) B-Trees... kind of a last-resort directive.
It will probably take most of next week to do.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>
Updated by dillon about 17 years ago
    :>     I am going to need some additional information from you.  Please sync
:>     your filesystem and then do the following:
:> 
:>     hammer -f <device> blockmap > textfile
:>     hammer -f <device> -vvv show >> textfile
:> 
:>     And gzip and upload that to leaf.
:
:Done as ~y0netan1/crash/blockmap.gz.
:
:Cheers.
Excellent!  This is great!  Ok, here is the big-block blockmap entry:

    4000000174800000 zone=9 app=8388608 free=8388480

Each big-block can hold 8MB (8388608 bytes).  This one has 128 bytes
allocated out of it.  From the B-Tree dump four inodes are allocated
out of that block, though, so it should have read 512 bytes
allocated (free=8388096).

I already see a pattern, and I'm hoping it will lead to the bug.
The LO field is 0002xxxx == PFS#2.  The tids are create:delete...
notice that one has a delete_tid of 0, which means it is live.
The others have a non-zero delete_tid; those have been deleted.

We get a nice ctime too... the ctime is in microseconds.  First, here
are the inode records:

G------ ELM 14 R obj=0000000000000001 key=0000000000000000
         lo=00020001 rt=01 ot=01            <<<< top 16 bits of LO
                                                  is the PFS id
                 tids 000000010848ffbb:0000000000000000   <<<<<  create:delete
                 dataoff=9000000174dda260/128 crc=149d91f4  <<<< dataoff
                 fills=z9:745=1%
                 size=0 nlinks=1 mode=01777 uflags=00000000
                 ctime=0004525a5174d38a pobjid=0000000000000000 obj_type=1
                 mtime=0004537db932151f
            ^^^^^^^^^^
            time stamps
G------ ELM 15 R obj=0000000000000001 key=0000000000000000
         lo=00020001 rt=01 ot=01
               d tids 0000000108490171:0000000108490179   <<<< create:delete
                 dataoff=9000000174dda440/128 crc=761cd3d0
                 fills=z9:745=1%
                 size=0 nlinks=1 mode=00755 uflags=00000000
                 ctime=0004525b5cd8b43f pobjid=0000000000000000 obj_type=1
                 mtime=0004525b5cd8b43f
G------ ELM 22 R obj=0000000107e44a04 key=0000000000000000
         lo=00020001 rt=01 ot=02
               d tids 0000000108490169:0000000108490175 
                 dataoff=9000000174dda2e0/128 crc=1861f98c
                 fills=z9:745=1%
                 size=0 nlinks=2 mode=00644 uflags=00000000
                 ctime=0004525b4031d815 pobjid=0000000000000001 obj_type=2
                 mtime=0004525b4031d815
G------ ELM 23 R obj=0000000107e44a05 key=0000000000000000
         lo=00020001 rt=01 ot=07
               d tids 000000010849016d:0000000108490175 
                 dataoff=9000000174dda360/128 crc=c57654ef
                 fills=z9:745=1%
                 size=3 nlinks=1 mode=00755 uflags=00000000
                 ctime=0004525b40e6eaf6 pobjid=0000000000000001 obj_type=7
                 mtime=0004525b40e6eaf6
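(As an aside, the hex ctime/mtime fields above are easy to convert if you
treat them as microseconds since the Unix epoch, per the note above.  A
small illustrative program; run in the reporter's timezone it should
reproduce the list that follows:)

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    /* Convert HAMMER timestamps (microseconds since the Unix epoch)
     * to local time.  The four values are the ctimes of ELM 14, 15,
     * 22 and 23 above. */
    int
    main(void)
    {
            uint64_t stamps[] = {
                    0x0004525a5174d38aULL,
                    0x0004525b5cd8b43fULL,
                    0x0004525b4031d815ULL,
                    0x0004525b40e6eaf6ULL,
            };
            char buf[64];

            for (size_t i = 0; i < sizeof(stamps) / sizeof(stamps[0]); ++i) {
                    time_t secs = (time_t)(stamps[i] / 1000000ULL);
                    strftime(buf, sizeof(buf), "%a %b %e %H:%M:%S %Y",
                             localtime(&secs));
                    printf("%016jx -> %s\n", (uintmax_t)stamps[i], buf);
            }
            return 0;
    }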
The converted ctimes are:

    Fri Jul 18 23:09:33 2008
    Sat Jul 19 00:24:20 2008
    Sat Jul 19 00:16:19 2008
    Sat Jul 19 00:16:31 2008

So the inodes related to this broken freemap block were all created
on Jul 18th and 19th.  They are all root inodes for PFS #2.

This is very good news.  It's either related to the cross-device link
bug that we fixed, or it's related to the PFS root inode creation or
deletion code.

I will try to cross-reference the data offsets against the rest of
the dump to see if there are any more freemap discrepancies.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>
Updated by qhwt+dfly about 17 years ago
    I newfs_hammer'd before the first cross-device link crash and I haven't
done newfs_hammer after that.  And the above message was after the second
cross-device link crash.
:If you can, please wipe that filesystem and continue testing fresh,
:and see if you can reproduce that panic (or any bug).

Sure.
Updated by dillon about 17 years ago
    :G------ ELM 14 R obj=0000000000000001 key=0000000000000000 
:                 ctime=0004525a5174d38a pobjid=0000000000000000 obj_type=1
:G------ ELM 15 R obj=0000000000000001 key=0000000000000000
:                 ctime=0004525b5cd8b43f pobjid=0000000000000000 obj_type=1
:G------ ELM 22 R obj=0000000107e44a04 key=0000000000000000
:                 ctime=0004525b4031d815 pobjid=0000000000000001 obj_type=2
:G------ ELM 23 R obj=0000000107e44a05 key=0000000000000000
:                 ctime=0004525b40e6eaf6 pobjid=0000000000000001 obj_type=7
:
:    The converted ctimes are:
:
:    Fri Jul 18 23:09:33 2008
:    Sat Jul 19 00:24:20 2008
:    Sat Jul 19 00:16:19 2008
:    Sat Jul 19 00:16:31 2008
Let me rephrase that a little.  The first two are PFS root inodes
for PFS #2 (one deleted, the other current).  The third is a regular
file (obj_type=2).  The fourth is a softlink (obj_type=7).

The link count for the regular file is 2, but I couldn't find any
directory entries pointing at it.  I'll bet that was related to
the cross link issue.  The link count for the softlink is 1, and
I couldn't find a directory entry for it either.

I'll know for sure once I cross reference the text dump to see
if any other blockmap entries are wrong.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>
Updated by dillon about 17 years ago
    I wrote a program to track down all records which blew out their
    layer2 allocation and ran it.  The results are in your
    ~crash/checkfree.out.
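(The checker itself isn't posted in this thread, but the idea is simple:
walk the show dump, total the bytes each record occupies inside its 8MB
big-block, and compare that against the free count the blockmap reports.
A rough, hypothetical sketch of that cross-check, fed with the four
128-byte inode records from the dump above:)

    #include <stdint.h>
    #include <stdio.h>

    #define BIGBLOCK_SIZE   (8 * 1024 * 1024)       /* 8388608 bytes */
    #define BIGBLOCK_MASK   ((uint64_t)BIGBLOCK_SIZE - 1)

    /* Hypothetical record pulled out of the "hammer show" text dump. */
    struct dump_record {
            uint64_t data_off;      /* e.g. 0x9000000174dda260 */
            uint32_t data_len;      /* e.g. 128 */
    };

    /* Records referencing a big-block should account for exactly
     * BIGBLOCK_SIZE - bytes_free.  For the entry in this bug, four
     * 128-byte inode records imply 8388608 - 4*128 = 8388096 bytes
     * free, yet the blockmap said free=8388480. */
    static void
    check_bigblock(const struct dump_record *recs, int nrecs,
                   uint64_t bigblock_base, uint32_t blockmap_free)
    {
            uint32_t used = 0;

            for (int i = 0; i < nrecs; ++i) {
                    if ((recs[i].data_off & ~BIGBLOCK_MASK) == bigblock_base)
                            used += recs[i].data_len;
            }
            if (BIGBLOCK_SIZE - used != blockmap_free) {
                    printf("big-block %016jx: records use %u bytes, "
                           "blockmap claims %u free (expected %u)\n",
                           (uintmax_t)bigblock_base, used, blockmap_free,
                           (uint32_t)BIGBLOCK_SIZE - used);
            }
    }

    int
    main(void)
    {
            struct dump_record recs[] = {
                    { 0x9000000174dda260ULL, 128 },
                    { 0x9000000174dda440ULL, 128 },
                    { 0x9000000174dda2e0ULL, 128 },
                    { 0x9000000174dda360ULL, 128 },
            };
            check_bigblock(recs, 4, 0x9000000174800000ULL, 8388480);
            return 0;
    }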
There were quite a few, but every single one that was an inode had
a ctime close to:

    Sat Jul 19 00:26:41 2008    (PDT, I think)

Not only records, but often their related B-Tree nodes too.  So whatever
caused that freemap corruption occurred only on that day.

I think I know what happened, too.  There were some commits right around
that time that fixed a case where HAMMER's buffer cache was being
flushed by the kernel on a panic, causing things to really get out
of sync because the volume header's UNDO pointers wouldn't get updated
in that case.  We really want to just throw those buffers away.  I'll bet
what happened was that you had a crash related to the cross-link tests
on a kernel without those fixes, and the kernel flushed out HAMMER's
unsynchronized meta-data and caused the blockmap to get out of sync.

I think it would be prudent for me to do some more crash testing, to
make sure that bug got fixed.  The base bug you reported can be
considered closed though!

--

I've found two more assertions while testing reblocking and pruning
at the same time.  One is related to a buffer alias occurring from
the reblocking, panicking in hammer_io_new().  The other is related to
a B-Tree sanity check related, I think, to the pruner (a window of
opportunity when the B-Tree is deleting a chain of nodes where an
insertion can occur, causing the chain deletion to assert on the node
being unexpectedly non-empty).  I should be able to get both fixed
fairly quickly and they shouldn't cause any media corruption.

These are all related to heavy simultaneous pruning and reblocking
only, not normal use.  Just in case you are doing that, this is a
heads-up that I am aware of them.

-Matt
Updated by qhwt+dfly about 17 years ago
    It's nice to hear that it's already been fixed in our 2.0 release.
In case you're interested, I put the relevant part of my /var/log/messages
as ~y0netan1/crash/message.gz.  I think I did most operations through
sudo, so the command line arguments are recorded in it.  While looking
through it, I realized I overlooked another panic, but maybe this was
the result of the previous panics.  I've been doing the same operation
after newfs_hammer'ing the partition but haven't managed to reproduce
any of these panics.
Jul 29 18:32:02 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer pfs-update slave label=slave of obj
Jul 29 18:32:37 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy slave
Jul 29 18:32:48 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:40:19 firebolt kernel: Warning: BTREE_REMOVE: Defering parent removal2  8000000293323000, skipping@0x000000017a500990:00002 ; USER=root ; COMMAND=/bin/rm 
Jul 29 18:40:26 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:41:22 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER/rf /var/obj/current>io.lock.refs == 0 in hammer_recover_flush_buffer_callback
Jul 29 18:41:26 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:41:48 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer prune-everything /HAMMER/obj/
Jul 29 18:41:53 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer prune-everything /HAMMER/slave
Jul 29 18:42:07 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:42:08 firebolt kernel: HAMMER: Critical error inode=4427368967 while syncing inode
Jul 29 18:42:08 firebolt kernel: HAMMER: Forcing read-only mode
Jul 29 18:43:48 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer pfs-destroy foo
Jul 29 18:44:19 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER ; USER=root ; COMMAND=/bin/rm foo
Jul 29 18:44:25 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER ; USER=root ; COMMAND=/bin/rm foo
Jul 29 18:44:43 firebolt su: BAD SU qhwt to root on /dev/ttyp4
Jul 29 18:45:01 firebolt su: qhwt to root on /dev/ttyp3
Jul 29 21:00:59 firebolt syslogd: kernel boot file is /kernel
Jul 29 21:00:59 firebolt kernel: HAMMER read-only -> read-write
Jul 29 21:00:59 firebolt kernel: panic: assertion: buffer
Jul 29 21:00:59 firebolt kernel: mp_lock = 00000000; cpuid = 0
Jul 29 21:00:59 firebolt kernel: boot() called on cpu#0
Jul 29 21:00:59 firebolt kernel:
Jul 29 21:00:59 firebolt kernel: syncing disks... 778 777 777 777 777 777 777 777 781 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777
Thanks.
Updated by dillon about 17 years ago
I'm readying a commit+MFC for Wednesday if my testing today works out.
So far so good.  I've been running simultaneous buildworld, reblocking,
and pruning loops all day with no problems.  I believe I have fixed the
reblocker for good.

The B-Tree panic is still on my list (a very rare deletion race that
asserts on a sanity check and does not corrupt the media).  This is
currently the only known panic.

During testing I found a directory disconnection issue which can leave
an inaccessible sub-directory structure on the media if a crash
occurs during a rm -rf.  The structures don't corrupt the filesystem
other than eating up some space.  I made one fix that gets most of the
problem but I think there may be some side cases still present.

Francois's rmdir-on-regular-file bug will also be in Wednesday's commit.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>