Bug #2556

DragonFly v3.5.0.81.gd3479 - Process signal weirdness

Added by tuxillo over 1 year ago. Updated about 1 year ago.

Status: Feedback
Start date: 05/07/2013
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Hi,

tmux has a memory leak somewhere, which was causing a lot of swap to be used. I started using truss(1) to see what it was doing, and after 3-4 tracing attempts the tmux process got stuck in the "stopevent" state. I sent it all sorts of signals (CONT, HUP, KILL), but the process stayed stopped in the same state. I also tried disabling tracing with ktrace(1), but it didn't help.

Eventually I tried to reboot the virtual machine, but it hung for around 10 minutes without being able to shut down. As a last step I dropped to DDB and produced a dump. The dump is available on leaf on demand. See below for the backtrace of the thread involved (the tmux process).

Best regards,
Antonio Huete

------

(kgdb) thread 26
[Switching to thread 26 (pid 835/0, tmux)]
#0 0xffffffff8050dfb0 in lwkt_switch () at /home/source/dfbsd/sys/kern/lwkt_thread.c:872
872 /home/source/dfbsd/sys/kern/lwkt_thread.c: No such file or directory.
(kgdb) bt
#0 0xffffffff8050dfb0 in lwkt_switch () at /home/source/dfbsd/sys/kern/lwkt_thread.c:872
#1 0xffffffff80518f0e in tsleep (ident=ident@entry=0xffffffe071efc990, flags=flags@entry=1024, wmesg=wmesg@entry=0xffffffff8097e168 "stopevent", timo=timo@entry=0) at /home/source/dfbsd/sys/kern/kern_synch.c:612
#2 0xffffffff80538091 in stopevent (p=p@entry=0xffffffe071efc780, event=event@entry=4, val=val@entry=7) at /home/source/dfbsd/sys/kern/sys_process.c:771
#3 0xffffffff808c5a7a in syscall2 (frame=0xffffffe07280e9f8) at /home/source/dfbsd/sys/platform/pc64/x86_64/trap.c:1229
#4 0xffffffff808af75b in ?? () at /home/source/dfbsd/sys/platform/pc64/x86_64/exception.S:323
#5 0x00000000000000c5 in ?? ()
#6 0x0000000000000000 in ?? ()

History

#1 Updated by vsrinivas over 1 year ago

Commit e0836e94092043dfdc5c34d00c214369c411de76 in -master should resolve this issue.

While tracing, truss marked tmux to stop on certain events (syscall entry/exit, ...); truss was then killed. Closing the procfs file descriptor through which truss controlled tmux should have unset the stop latch and woken tmux. However, procfs's close() VOP was not issuing its wakeup on the same variable that stopevent() was sleeping on, so the process was ready to wake but never did.
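
The mismatch described above can be illustrated with a small toy model. This is only a sketch in Python, not DragonFly kernel code: the `SleepQueue` class, the channel names `STOP_CHANNEL` and `OTHER_CHANNEL`, and the `close_tracer` helper are all hypothetical and stand in for tsleep()/wakeup() idents and the procfs close path. It assumes only what the comment states: a sleeper is woken solely by a wakeup on the exact ident it went to sleep on.

```python
from collections import defaultdict


class SleepQueue:
    """Toy model of tsleep()/wakeup(): threads sleep on an opaque ident."""

    def __init__(self):
        self._sleepers = defaultdict(list)

    def tsleep(self, ident, thread):
        # Park the thread on the given ident (the "wait channel").
        self._sleepers[ident].append(thread)

    def wakeup(self, ident):
        # Wake only the threads parked on exactly this ident.
        return self._sleepers.pop(ident, [])

    def asleep(self):
        # Names of all threads that are still parked, on any ident.
        return sorted(t for ts in self._sleepers.values() for t in ts)


# Hypothetical channel names; the real kernel uses addresses of variables.
STOP_CHANNEL = "proc.stop_channel"   # what the stopped process sleeps on
OTHER_CHANNEL = "proc.other_field"   # what the buggy close path woke


def close_tracer(queue, wake_ident):
    """Model the tracer's descriptor being closed: issue one wakeup."""
    return queue.wakeup(wake_ident)


def demo():
    q = SleepQueue()
    q.tsleep(STOP_CHANNEL, "tmux")                 # traced process stops
    woken_buggy = close_tracer(q, OTHER_CHANNEL)   # wrong ident: no effect
    still_asleep = q.asleep()
    woken_fixed = close_tracer(q, STOP_CHANNEL)    # matching ident
    return woken_buggy, still_asleep, woken_fixed, q.asleep()
```

In this model the first close wakes nobody and "tmux" stays parked, which matches the symptom in the backtrace; only the wakeup on the matching ident releases it.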

Needs to be MFCed.

#2 Updated by tuxillo about 1 year ago

  • Description updated (diff)
  • Status changed from New to Feedback

Need to check if this is really solved now that we have a new release.
