Bug #934
closed
stacked vn(4) borkitude
Description
To make things more flexible, I've started using one largish partition
and creating vn disks for various uses underneath it.
Last night I started to work on updating my vnconfig patch using this
new scheme and got a corrupted filesystem as follows:
- vnconfig -c -s labels vn10 /path/to/home.img
  mount /dev/vn10s0a /home
- cd /home/niftyscriptness
- do some stuff which generates a disk image for vkernels
dd, vnconfig, disklabel, newfs, mount, make installworld, etc.
which mounts /dev/vn0s0a underneath /home
- strangeness occurs
Basically, it seems like the 1.10.1 VFS/vn is getting confused when a vn
is stacked on top of another vn.
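The steps above can be sketched roughly as the following dry-run script. It is only an illustration of the stacking order, not the actual image-builder script: the layer 2 image path, mount point, and image size are hypothetical placeholders, and the dd/disklabel/newfs sequence is assumed from the usual vkernel image recipe rather than taken from the report. By default it only prints the commands; set RUN=1 to execute them (as root, on a scratch machine).

```shell
#!/bin/sh
# Dry-run sketch of the two-layer vn setup described in the report.
# Only HOME_IMG, vn10, vn0 and /home come from the report; everything
# else below is a placeholder chosen for illustration.
HOME_IMG=${HOME_IMG:-/path/to/home.img}
VK_IMG=${VK_IMG:-/home/niftyscriptness/vkernel.img}   # hypothetical path
VK_MNT=${VK_MNT:-/home/niftyscriptness/mnt}           # hypothetical mount point

# Print the command unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Layer 1: /home lives inside a vn-backed image.
layer1() {
    run vnconfig -c -s labels vn10 "$HOME_IMG"
    run mount /dev/vn10s0a /home
}

# Layer 2: a vkernel disk image created inside the layer 1 filesystem.
layer2() {
    run dd if=/dev/zero of="$VK_IMG" bs=1m count=512   # size is an assumption
    run vnconfig -c -s labels vn0 "$VK_IMG"
    run disklabel -r -w vn0s0 auto
    # An 'a' partition still has to be defined in the label at this point
    # (normally an interactive 'disklabel -e vn0s0'); omitted from the sketch.
    run newfs /dev/vn0s0a
    run mount /dev/vn0s0a "$VK_MNT"
}

layer1
layer2
```

The second mount is the point where vn0 ends up stacked on top of vn10.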
The first time I did this procedure, the first 'mount' resulted in an
error (input/output error). Thinking I might have accidentally done
something wrong with my vn allocation, I started over, and then
started to get weird things in the working directory (layer 1 vn, holds
the mountpoint for layer 2): the files the vkernel image builder uses
to keep track of things (.formatted, etc.) were showing up in 'ls', but
'ls -l' would say 'no such file or directory'. Thinking a bug was upon me,
I rebooted, and when I tried to fsck the 'layer 1' /home vn, it reported
many errors; 'fsck -y' essentially trashed the filesystem.
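For anyone trying to confirm the same symptom, a small check along these lines can list directory entries whose names are returned by 'ls' but which cannot be stat'ed. This is a minimal sketch, not something used in the original tests: the function name is made up, and it assumes file names without embedded newlines.

```shell
#!/bin/sh
# Print names that show up in a directory listing but fail 'ls -ld',
# i.e. the "shows in ls, but ls -l says no such file" symptom.
# 'orphans' is a hypothetical helper name.
orphans() {
    dir=$1
    ls -a "$dir" | while IFS= read -r name; do
        case $name in
            .|..) continue ;;
        esac
        # ls -ld does not follow symlinks, so dangling links are not reported.
        ls -ld "$dir/$name" >/dev/null 2>&1 || printf '%s\n' "$name"
    done
}

# Check the directory given as the first argument, or the current one.
orphans "${1:-.}"
```

On a healthy filesystem this prints nothing; any output names an entry worth looking at before running fsck.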
I started a second round of tests today after restoring /home. The
first time 'worked', i.e. the initial mount of /dev/vn0s0a into
/dev/vn10s0a's /home filesystem was ok, but the make installworld of
the vkernel system panicked the system midway (sorry for the hand-copied
trace, I still need to get my debug infrastructure up to date):
panic
ffs_valloc
ufs_makeinode
ufs_create
ufs_vnoperate
vop_old_create
vop_compat_ncreate ? (can't read my writing :)
vop_default
vfs_vnoperate
vop_ncreate
vn_open
kern_open
sys_open
syscall2
Xint80_syscall
when I rebooted, the /home filesystem was ok, so I started the process
again, and got the same kind of corruption as before -
first try, things seemed ok, so I interrupted, unmounted, vnconfig
-u'ed, etc & tried again -
on this try the first mount of the VN (vn0s0a) failed (input/output
error), with a simultaneous console message :
dscheck(#vn/80): attempt to access nonexistent partition
and possibly (saw this at some point):
vn0: reading primary partition table error accessing offset 00000000 for 2
at this point, or shortly thereafter, doing an 'ls' within the layer 1
/home filesystem came back blank, and 'cd .. ; ls -al' started yielding
the 'no such file or directory' strangeness.
I rebooted, and the /home filesystem fsck'ed clean, but mounted empty;
df showed it as being 96% full, however (4G filesystem).
While typing this, I realized that the script to create the 'layer 2'
vns was not leaving any label space in the disklabel. That being said,
I don't think that should cause corruption on the 'host' /home
filesystem in any case.
Script was used many times before on a UP 'raw partition' /home -
just switched to a 1.10.1 SMP vn(4) /home - the new machine seems
otherwise stable.
one other note: /home was NFS exported but only mounted during the
initial crash
pointers (or perhaps fixed pointers :) on the next steps welcome..
Thanks in advance,
- Chris
Updated by alexh about 15 years ago
Our vn(4) has undergone some serious modifications, integrating it into the disk
layer. I don't think what is described here can still happen, but it would be
good if someone could confirm that this now works as expected.
Cheers,
Alex Hornung
Updated by alexh over 14 years ago
The disk subsystem makes it possible to stack vn devices without problems.