Bug #934
closed
stacked vn(4) borkitude
Description
To make things more flexible, I've started using one largish partition
and creating vn disks for various uses underneath it.
Last night I started to work on updating my vnconfig patch using this
new scheme and got a corrupted filesystem, as follows:
- vnconfig -c -s labels vn10 /path/to/home.img
- mount /dev/vn10s0a /home
- cd /home/niftyscriptness
- do some stuff which generates a disk image for vkernels
  (dd, vnconfig, disklabel, newfs, mount, make installworld, etc.),
  which mounts /dev/vn0s0a underneath /home
- strangeness occurs
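For anyone trying to reproduce this, the steps above can be roughly
sketched as the dry-run script below. It only prints the commands unless
DRY_RUN=0 is set. The layer 1 commands are taken from the report; the
layer 2 image name, its size, the mount point, the /usr/src location and
the exact dd/disklabel invocations are assumptions for illustration, not
the actual image builder script.

```sh
#!/bin/sh
# Dry-run sketch of the stacked-vn reproduction steps.
# Prints each command by default; set DRY_RUN=0 to really execute them.
# ASSUMPTIONS (not from the report): VK_IMG, VK_MNT, the image size,
# the /usr/src location and the dd/disklabel arguments.
DRY_RUN="${DRY_RUN:-1}"
HOME_IMG="/path/to/home.img"                 # layer 1 backing file
VK_IMG="/home/niftyscriptness/vkernel.img"   # hypothetical layer 2 image
VK_MNT="/home/niftyscriptness/mnt"           # hypothetical layer 2 mount point

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_steps() {
    # Layer 1: /home lives on a vn-backed image.
    run vnconfig -c -s labels vn10 "$HOME_IMG" || return 1
    run mount /dev/vn10s0a /home || return 1

    # Layer 2: build a vkernel disk image inside the layer 1 filesystem.
    run dd if=/dev/zero of="$VK_IMG" bs=1m count=512 || return 1
    run vnconfig -c -s labels vn0 "$VK_IMG" || return 1
    run disklabel -r -w vn0s0 auto || return 1
    run newfs /dev/vn0s0a || return 1
    run mkdir -p "$VK_MNT" || return 1
    run mount /dev/vn0s0a "$VK_MNT" || return 1
    run sh -c "cd /usr/src && make installworld DESTDIR=$VK_MNT" || return 1
}

repro_steps
```

The point of the sketch is only the ordering: vn0 is configured on a
file that itself lives inside the filesystem on vn10.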
Basically, it seems like the 1.10.1 VFS/vn code is getting confused when
a vn is stacked on top of another vn.
The first time I did this procedure, the first 'mount' resulted in an
error (input/output error). Thinking I might have accidentally done
something wrong with my vn allocation, I started over, and then
started to get weird things in the working directory (layer 1 vn, holds
the mountpoint for layer 2): the files the vkernel image builder uses
to keep track of things (.formatted, etc.) were showing up in 'ls', but
'ls -l' would say 'no such file or directory'. Thinking a bug was upon
me, I rebooted, and when I tried to fsck the 'layer 1' /home vn, it
reported many errors; 'fsck -y' essentially trashed the filesystem.
I started a second round of tests today after restoring /home.
The first time 'worked', i.e. the initial mount of /dev/vn0s0a into
/dev/vn10s0a's /home filesystem was OK, but the make installworld of
the vkernel system panicked the system midway (sorry for the hand-copied
trace; I still need to get my debug infrastructure up to date):
panic
ffs_valloc
ufs_makeinode
ufs_create
ufs_vnoperate
vop_old_create
vop_compat_ncreate ? (can't read my writing :)
vop_default
vfs_vnoperate
vop_ncreate
vn_open
kern_open
sys_open
syscall2
Xint80_syscall
When I rebooted, the /home filesystem was OK, so I started the process
again and got the same kind of corruption as before.
On the first try things seemed OK, so I interrupted, unmounted,
vnconfig -u'ed, etc. and tried again.
On this try the first mount of the vn (vn0s0a) failed (input/output
error), with a simultaneous console message:
dscheck(#vn/80): attempt to access nonexistent partition
and possibly (saw this at some point):
vn0: reading primary partition table error accessing offset 00000000 for 2
At this point, or shortly thereafter, doing an 'ls' within the layer 1
/home filesystem came back blank, and 'cd .. ; ls -al' started yielding
the 'no such file or directory' strangeness.
I rebooted, and the /home filesystem fsck'ed clean but mounted empty;
df, however, showed it as being 96% full (4G filesystem).
While typing this, I realized that the script that creates the 'layer 2'
vns was not leaving any label space in the disklabel. That said,
I don't think that should cause corruption on the 'host' /home
filesystem in any case.
The script was used many times before on a UP 'raw partition' /home;
I just switched to a 1.10.1 SMP vn(4) /home. The new machine seems
otherwise stable.
One other note: /home was NFS exported, but only mounted during the
initial crash.
Pointers (or perhaps fixed pointers :) on the next steps welcome.
Thanks in advance,
- Chris