Bug #1162

Possible hammer problem

Added by bastyaelvtars about 6 years ago. Updated over 5 years ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -
Description

The lighttpd process got stuck in the select state unkillably. I decided to
reboot and got a panic.

syncing disks... 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
giving up on 1 buffers
Debugger("busy buffer problem")
Stopped at Debugger+0x34: movb $0,in_Debugger.3949
db> trace
Debugger(c05382b5,c053829c,1,0,0) at Debugger+0x34
boot(0,caf18d34,c04d2fdd,caf18cf0,6) at boot+0x1e2
sys_reboot(caf18cf0,6,55d93,0,c9708c58) at sys_reboot+0x23
syscall2(caf18d40) at syscall2+0x1e9
Xint0x80_syscall() at Xint0x80_syscall+0x36
db>

Needless to say, lighttpd serves from a hammer partition and also runs from
another. Kernel and vmcore are also available:

http://160.114.134.7/~szg/crash081111/

History

#1 Updated by dillon about 6 years ago

:The lighttpd process got stuck in the select state unkillably. I decided to
:reboot and got a panic.
:
:syncing disks... 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
:giving up on 1 buffers
:Debugger("busy buffer problem")
:Stopped at Debugger+0x34: movb $0,in_Debugger.3949
:db> trace
:Debugger(c05382b5,c053829c,1,0,0) at Debugger+0x34
:boot(0,caf18d34,c04d2fdd,caf18cf0,6) at boot+0x1e2
:sys_reboot(caf18cf0,6,55d93,0,c9708c58) at sys_reboot+0x23
:syscall2(caf18d40) at syscall2+0x1e9
:Xint0x80_syscall() at Xint0x80_syscall+0x36
:db>
:
:Needless to say, lighttpd serves from a hammer partition and also runs from
:another. Kernel and vmcore are also available:
:
:http://160.114.134.7/~szg/crash081111/

Excellent core dump, I have downloaded it and tracked the problem down
to a deadlock in the HAMMER code. I am working on a patch now.

-Matt
Matthew Dillon

#2 Updated by dillon about 6 years ago

This may take another day. It turns out to be a deep vnode lock being
acquired from inside a B-tree cursor. The cursor is holding a B-tree
node lock and I can't easily release it.

-Matt
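
The deadlock described above has the general shape of a lock-order inversion: one path holds a B-tree node lock and then wants a vnode lock, while another path holds the vnode lock and wants the node lock. The following is only a user-space sketch of that general pattern in Python, not HAMMER code and not the fix that was eventually committed. All names (`inverted_order`, `consistent_order`, `node_lock`, `vnode_lock`, `cursor`, `vnode_op`) are hypothetical, and the timeout exists solely so the demonstration terminates instead of hanging.

```python
import threading


def inverted_order(timeout=0.2):
    """Two paths take the same two locks in opposite order."""
    node_lock = threading.Lock()    # hypothetical stand-in for a B-tree node lock
    vnode_lock = threading.Lock()   # hypothetical stand-in for a vnode lock
    holding = threading.Barrier(2)    # both threads now hold their first lock
    attempted = threading.Barrier(2)  # both threads have tried the second lock
    got_second = {}

    def path(name, first, second):
        with first:
            holding.wait()
            # A blocking kernel lock would wait here forever; the timeout
            # only lets this sketch finish and report the failure.
            ok = second.acquire(timeout=timeout)
            got_second[name] = ok
            attempted.wait()  # keep holding `first` until both have tried
        if ok:
            second.release()

    threads = [
        threading.Thread(target=path, args=("cursor", node_lock, vnode_lock)),
        threading.Thread(target=path, args=("vnode_op", vnode_lock, node_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


def consistent_order():
    """Both paths take the vnode lock first, then the node lock."""
    node_lock = threading.Lock()
    vnode_lock = threading.Lock()
    completed = []

    def path(name):
        with vnode_lock:
            with node_lock:
                completed.append(name)

    threads = [
        threading.Thread(target=path, args=(name,))
        for name in ("cursor", "vnode_op")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(completed)
```

In the inverted case neither thread obtains its second lock, which is the stuck state; with a single agreed order both complete. As the comment above notes, the difficulty in the real code is that the cursor cannot easily drop its node lock, so simply reordering may not be an option there.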

#3 Updated by bastyaelvtars about 6 years ago

Oops, I misconfigured my mail client; this was actually me.
@Matt: thanks for looking into it. There is no hurry, since I haven't
been able to reproduce it since then. :-)

"DFBSD" wrote in message
news:4919d02a$0$882$415eb37d@crater_reader.dragonflybsd.org...
> The lighttpd process got stuck in the select state unkillably. I decided
> to reboot and got a panic.
>
> syncing disks... 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
> giving up on 1 buffers
> Debugger("busy buffer problem")
> Stopped at Debugger+0x34: movb $0,in_Debugger.3949
> db> trace
> Debugger(c05382b5,c053829c,1,0,0) at Debugger+0x34
> boot(0,caf18d34,c04d2fdd,caf18cf0,6) at boot+0x1e2
> sys_reboot(caf18cf0,6,55d93,0,c9708c58) at sys_reboot+0x23
> syscall2(caf18d40) at syscall2+0x1e9
> Xint0x80_syscall() at Xint0x80_syscall+0x36
> db>
>
> Needless to say, lighttpd serves from a hammer partition and also runs
> from another. Kernel and vmcore are also available:
>
> http://160.114.134.7/~szg/crash081111/

#4 Updated by aoiko almost 6 years ago

Any progress with this bug?

Aggelos

#5 Updated by dillon almost 6 years ago

:Matthew Dillon wrote:
:> This may take another day. It turns out to be a deep vnode lock being
:> acquired from inside a B-tree cursor. The cursor is holding a B-tree
:> node lock and I can't easily release it.
:>
:> -Matt
:
:Any progress with this bug?
:
:Aggelos

This should be fixed now.

-Matt
Matthew Dillon

#6 Updated by bastyaelvtars almost 6 years ago

"Matthew Dillon" wrote in message
news:...
>
> :Matthew Dillon wrote:
> :> This may take another day. It turns out to be a deep vnode lock being
> :> acquired from inside a B-tree cursor. The cursor is holding a B-tree
> :> node lock and I can't easily release it.
> :>
> :> -Matt
> :
> :Any progress with this bug?
> :
> :Aggelos
>
> This should be fixed now.

Also in 2.0 or whatever it is called now? :-P
