HAMMER2. Hang up, reboot, and crash.
I started the build as
nice -n 9 make -j 8 buildworld
@USR-SRC is mounted at /usr/src, @USR-OBJ is mounted at /usr/obj.
The compilation stalled, so I pressed reset. After the reboot the system crashed and dumped immediately.
After that I was able to boot, but the PFSes were gone.
Each attempt to mount any of them led to the following errors:
Sep 19 17:23:31 fly kernel: hammer2_mount
Sep 19 17:23:31 fly kernel: hammer2_mount: dev="/dev/serno/WD-WCC2EP295836.s1d" label="USR-OBJ" rdonly=0
Sep 19 17:23:31 fly kernel: hammer2: using volume header #3
Sep 19 17:23:31 fly kernel: alloc spmp 0xffffff8525bc0000 tid 000000000000c00b
Sep 19 17:23:31 fly kernel: chain 00000068a0f4480a.01 key=0000000000000000 meth=31 CHECK FAIL (flags=00144002, bref/data f84c9b21bc9ef301/1e55118462e960c9)
Sep 19 17:23:31 fly kernel: hammer2_mount: error Check Error reading super-root
Sep 19 17:23:31 fly kernel: hammer2_unmount hmp=0xffffff852613a000 mount_count=0
Sep 19 17:23:31 fly kernel: unmount hmp 0xffffff852613a000 remove spmp 0xffffff8525bc0000
Sep 19 17:23:31 fly kernel: unmount hmp 0xffffff852613a000 last ref to PMP=0xffffff8525bc0000
Sep 19 17:23:31 fly kernel: pfsfree: 0xffffff8525bc0000 lrucount=0
Sep 19 17:23:31 fly kernel: hammer2_unmount(A): devvp /dev/serno/WD-WCC2EP295836.s1d rbdirty 0 ronly=0
Sep 19 17:23:31 fly kernel: hammer2_unmount(B): devvp /dev/serno/WD-WCC2EP295836.s1d rbdirty 0
Sep 19 17:23:31 fly kernel: v-chain 0xffffff852613a4c0.255 0000000000000000/0 mir=000000000000c00b
Sep 19 17:23:31 fly kernel:  (?) refs=1
Sep 19 17:23:31 fly kernel: f-chain 0xffffff852613a640.254 0000000000000000/0 mir=000000000000c00a
Sep 19 17:23:31 fly kernel:  (?) refs=0
- Assignee set to dillon
I think I may have fixed this one last night in master (commit id through to a964af6f47472). I also believe that the stall should be fixed too (19808ac9def). There was a bug in the flush code that could catch some indirect block management in the middle of moving elements into or out of an indirect block, causing a damaged topology to be committed to media. This bug self-corrected during a normal shutdown, halt, or reboot, but not if the machine crashed or underwent a hard reset.
Unfortunately, once damaged, the topology pretty much can't be repaired and the filesystem needs to be newfs_hammer2'd.
So I would say, update, reinstall, and keep watch. If it happens again with a kernel with a commitid of a964af6f47472 or later then we need to look at it more closely.
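One way to "keep watch" could be to scan the system log for the failure signature quoted in this report. The following is only a rough sketch, not part of HAMMER2 or any official tool: the function name scan_hammer2_log is made up for illustration, the regular expressions are derived solely from the kernel messages pasted above, and the log path /var/log/messages is an assumption.

```python
import re
import sys

# Patterns derived only from the kernel messages quoted in this report;
# other kernel versions may word these lines differently.
MOUNT_DEV = re.compile(r'hammer2_mount: dev="(?P<dev>[^"]+)" label="(?P<label>[^"]+)"')
CHECK_FAIL = re.compile(r"chain (?P<chain>\S+) key=[0-9a-f]+ meth=[0-9a-f]+ CHECK FAIL")
MOUNT_ERR = re.compile(r"hammer2_mount: error (?P<reason>.+)$")


def scan_hammer2_log(lines):
    """Return one dict per mount attempt that ended in 'hammer2_mount: error'."""
    findings = []
    current = None
    for line in lines:
        m = MOUNT_DEV.search(line)
        if m:
            # A new mount attempt starts; forget any previous successful one.
            current = {"dev": m["dev"], "label": m["label"],
                       "check_fail": [], "error": None}
            continue
        if current is None:
            continue
        m = CHECK_FAIL.search(line)
        if m:
            current["check_fail"].append(m["chain"])
            continue
        m = MOUNT_ERR.search(line)
        if m:
            current["error"] = m["reason"].strip()
            findings.append(current)
            current = None
    return findings


if __name__ == "__main__":
    # Assumed default log location; pass another path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as fh:
        for f in scan_hammer2_log(fh):
            print(f'{f["label"]} on {f["dev"]}: {f["error"]} '
                  f'(failed chains: {", ".join(f["check_fail"]) or "none"})')
```

If this prints anything on a kernel at or after a964af6f47472, that would be the case worth reporting back here.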
#3 Updated by yellowrabbit2010 11 months ago
Since then, the system has survived occasional reboots quite well. One last question: if I see messages like the following at boot, does this mean that I need to recreate the HAMMER2 partition?
Sep 23 20:19:17 fly kernel: reconnect to cluster: nc=1 focus=0
Sep 23 20:19:17 fly kernel: not a local device mount
They are not harmless: mount -a returns an error code and the /etc/rc.d/mountcritlocal script does not work properly.
Perhaps I specified them incorrectly in /etc/fstab?
/dev/serno/WD-WCC2EP295836.s1d@LOCAL /mnt/aux-hdd hammer2 noatime,rw 1 1
@DOWNLOADS /mnt/dl hammer2 noatime,rw 0 0
@QEMU-IMGS /mnt/qemu-imgs hammer2 noatime,rw 0 0