Bug #1218
panic: assertion: error == 0 in hammer_start_transaction
Description
The machine was mostly idle and the panic happened during the night while I was
asleep, so I don't remember much about what I was running.
The only thing I do remember is a vkernel running under gdb. (I still have to
find out why a diskless vkernel run outside gdb displays quite a few
"RPC timeout for server 192.168.0.16" messages before the network starts
working and it goes on booting, whereas under gdb I get perhaps 2 of those
messages and then nothing happens; it gets stuck.) The vkernel was
semi-diskless (root on NFS, and one HAMMER fs partition on the first vkd), but
since it was stuck at the RPC timeout messages it shouldn't have gotten far
enough to mount the root, let alone the local HAMMER partition, unless it
started booting sometime during the night.
The backtrace:
panic: assertion: error == 0 in hammer_start_transaction
mp_lock = 00000000; cpuid = 0
Trace beginning at frame 0xe28c9968
panic(e28c998c,c02c6806,e28c9a84,c39a6738,e28c99a8) at panic+0x14d
panic(c03d0698,c03de4df,c03bbd41,6,45d61) at panic+0x14d
hammer_start_transaction(e28c9a84,debc0000,c39a6738,1,1) at hammer_start_transaction+0x41
hammer_ioctl(de0bc550,c02c6806,e28c9c1c,1,c39a6738) at hammer_ioctl+0x2d
hammer_vop_ioctl(e28c9ae0,c04314e0,d272ad10,e27c46e8,0) at hammer_vop_ioctl+0x2f
vop_ioctl(d272ad10,e27c46e8,c02c6806,e28c9c1c,1) at vop_ioctl+0x38
vn_ioctl(d61e90c0,c02c6806,e28c9c1c,c39a6738,d61e90c0) at vn_ioctl+0xbf
mapped_ioctl(4,c02c6806,bfbff8e0,0,e28c9d34) at mapped_ioctl+0x3e1
sys_ioctl(e28c9cf0,6,1e82,0,d8f675d8) at sys_ioctl+0x16
syscall2(e28c9d40) at syscall2+0x265
Xint0x80_syscall() at Xint0x80_syscall+0x36
boot() called on cpu#0
The dump is located at leaf:~rumko/crash/{kernel,vmcore}.0
The kernel was compiled on the 2nd January around noon CET ... so the sources
should have been from around then as well.
--
Regards,
Rumko
Updated by dillon almost 16 years ago
:...
:few "RPC timeout for server 192.168.0.16" before the network starts working
:and goes on booting, but if I run the vkernel under gdb, I only get perhaps 2
:of those messages and after that nothing happens - gets stuck) which was
:semi-diskless (root on nfs, and one hammer fs partition on the first vkd, but
:since it was stuck at the RPC timeout messages it shouldn't have gotten far
:enough to mount the root, let alone the local hammer partition - unless it
:started booting sometime during the night).
:
:The backtrace:
:panic: assertion: error == 0 in hammer_start_transaction
:mp_lock = 00000000; cpuid = 0
:Trace beginning at frame 0xe28c9968
:panic(e28c998c,c02c6806,e28c9a84,c39a6738,e28c99a8) at panic+0x14d
:...
:The dump is located at leaf:~rumko/crash/{kernel,vmcore}.0
:
:The kernel was compiled on the 2nd January around noon CET ... so the sources
:should have been from around then as well.
:--
:Regards,
:Rumko
Looking at the core the error code was 6, ENXIO, which implies
the underlying block device to the HAMMER filesystem went away.
It looks like a HAMMER mount on /mnt, backed by a VN device
(/dev/vn0s1a):
f_mntonname = "/mnt", '\0' <repeats 75 times>,
f_mntfromname = "VROOT", '\0' <repeats 74 times>,
vol_name = 0xe1c84080 "/dev/vn0s1a",
My guess is that your VN device is backed by a file over NFS
and NFS errored out.
-Matt
Matthew Dillon
<dillon@backplane.com>
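For reference, the error value read from the core can be decoded in user space. The following is only a minimal sketch, not kernel code: `describe_errno` and `device_went_away` are hypothetical helper names introduced here, and the message text returned by `strerror` differs between platforms.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn a numeric error code into readable text. */
const char *
describe_errno(int error)
{
	return (strerror(error));
}

/*
 * Hypothetical helper: ENXIO (6 here) is the code the thread associates
 * with the backing block device having gone away.
 */
int
device_went_away(int error)
{
	return (error == ENXIO);
}
```

Calling `printf("%d: %s\n", 6, describe_errno(6));` prints the platform's description of the error found in the dump.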
Updated by rumcic almost 16 years ago
Matthew Dillon wrote:
Ah damn. In that case never mind, I wonder what I was doing, hm.
--
Regards,
Rumko
Updated by corecode almost 16 years ago
Still shouldn't panic or something, no?
cheers
simon
Updated by rumcic almost 16 years ago
Simon 'corecode' Schubert wrote:
Well it would be lovely if it wouldn't panic, but at least I have a faint idea
what caused it and will be more careful in the future.
--
Regards,
Rumko
Updated by dillon almost 16 years ago
:Well it would be lovely if it wouldn't panic, but at least I have a faint idea
:what caused it and will be more careful in the future.
:--
:Regards,
:Rumko
Various error paths would have to be added to the transaction API
to allow it to return an error and abort the sequence, instead
of panicking there. I'd rather not mess with it now but I guess we
will want to deal with it at some point in the future.
-Matt
Matthew Dillon
<dillon@backplane.com>
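The change described above can be illustrated with a small user-space sketch that contrasts the two styles. Everything here is an assumption for illustration: the `sketch_*` types and functions are hypothetical stand-ins, not the real HAMMER structures or API, and the real code would need matching error handling in every caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real HAMMER structures. */
struct sketch_volume {
	int io_error;	/* 0 if the backing device is healthy, else an errno */
};

struct sketch_transaction {
	struct sketch_volume *rootvol;
	int active;
};

/* Hypothetical: acquiring the volume fails if the device went away. */
int
sketch_get_root_volume(struct sketch_volume *vol)
{
	return (vol->io_error);
}

/* Assertion style: any error while starting the transaction is fatal. */
void
sketch_start_transaction_assert(struct sketch_transaction *trans,
    struct sketch_volume *vol)
{
	int error = sketch_get_root_volume(vol);

	assert(error == 0);	/* aborts, like the panic in this report */
	trans->rootvol = vol;
	trans->active = 1;
}

/* Error-returning style: the caller decides how to abort the sequence. */
int
sketch_start_transaction(struct sketch_transaction *trans,
    struct sketch_volume *vol)
{
	int error = sketch_get_root_volume(vol);

	if (error != 0) {
		trans->rootvol = NULL;
		trans->active = 0;
		return (error);
	}
	trans->rootvol = vol;
	trans->active = 1;
	return (0);
}

/* A caller (for example an ioctl path) propagates the error upward. */
int
sketch_ioctl(struct sketch_volume *vol)
{
	struct sketch_transaction trans;
	int error;

	error = sketch_start_transaction(&trans, vol);
	if (error != 0)
		return (error);	/* e.g. ENXIO reaches the caller, no abort */
	/* ... the work done under the transaction would go here ... */
	trans.active = 0;
	return (0);
}
```

With a volume whose `io_error` is `ENXIO`, `sketch_ioctl` returns `ENXIO` to its caller, whereas the assertion variant would abort the program. The cost of the second style is that each caller needs its own error path.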
Updated by alexh over 14 years ago
Matt,
is this fixed properly now?
Cheers,
Alex Hornung
Updated by tuxillo almost 10 years ago
- Description updated (diff)
- Category set to VFS subsystem
- Status changed from New to In Progress
- Assignee deleted (0)
- Target version set to 4.2
Hi,
Is there a way to reproduce this?
Cheers,
Antonio Huete