Bug #926

tcsh consumes much cpu when trying to gdb anything

Added by nant over 6 years ago. Updated over 6 years ago.

Status:ClosedStart date:
Priority:NormalDue date:
Assignee:nant% Done:

0%

Category:-
Target version:-

Description

how to replicate:

1) set the environment variable SHELL=/bin/tcsh
2) gdb /usr/bin/true
3) (gdb) run

a ktrace of the offending process reveals the following:

[...]
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
2203 tcsh CALL fork
2203 tcsh RET fork RESTART
[...]

This is on a recent HEAD.

I'll try to chase this but any help would be appreciated. :)

Cheers,
Nuno

gdb-tcsh-fork.diff Magnifier (540 Bytes) corecode, 01/24/2008 12:30 PM

History

#1 Updated by nant over 6 years ago

Commenting out the SIGCHLD signal masking part seems to fix this issue for me.
But i'm not sure if this is a correct fix or not.

diff --git a/contrib/tcsh-6/tc.func.c b/contrib/tcsh-6/tc.func.c
index e2db430..9edd6ff 100644
--- a/contrib/tcsh-6/tc.func.c
+++ b/contrib/tcsh-6/tc.func.c
@@ -1990,10 +1990,12 @@ remotehost(void)
int fds[2], wait_options, status;
pid_t pid, wait_res;

+#if 0
sa.sa_handler = SIG_DFL; /* Make sure a zombie is created */
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sigaction(SIGCHLD, &sa, NULL);
+#endif
mypipe(fds);
pid = fork();
if (pid == 0) {

#2 Updated by corecode over 6 years ago

Could you try the attached fix?

cheers
simon

#3 Updated by nant over 6 years ago

On Jan 24, 2008 12:25 PM, Simon 'corecode' Schubert
<> wrote:
> Nuno Antunes wrote:
> > Nuno Antunes <> added the comment:
> >
> > Commenting out the SIGCHLD signal masking part seems to fix this issue for me.
> > But i'm not sure if this is a correct fix or not.
>
> Could you try the attached fix?
>

Confirmed! Thanks!

Cheers
Nuno

#4 Updated by nant over 6 years ago

On Jan 24, 2008 12:25 PM, Simon 'corecode' Schubert
<> wrote:
> Nuno Antunes wrote:
> > Nuno Antunes <> added the comment:
> >
> > Commenting out the SIGCHLD signal masking part seems to fix this issue for me.
> > But i'm not sure if this is a correct fix or not.
>
> Could you try the attached fix?
>
> cheers
> simon
>
> --
> Serve - BSD +++ RENT this banner advert +++ ASCII Ribbon /"\
> Work - Mac +++ space for low €€€ NOW!1 +++ Campaign \ /
> Party Enjoy Relax | http://dragonflybsd.org Against HTML \
> Dude 2c 2 the max ! http://golden-apple.biz Mail + News / \
>
> Index: sys/signal2.h
> ===================================================================
> RCS file: /home/dcvs/src/sys/sys/signal2.h,v
> retrieving revision 1.1
> diff -u -p -r1.1 signal2.h
> --- sys/signal2.h 25 Feb 2007 23:17:13 -0000 1.1
> +++ sys/signal2.h 24 Jan 2008 09:27:00 -0000
> @@ -103,9 +103,8 @@ __cursignb(struct lwp *lp)
> p = lp->lwp_proc;
> tmpset = lwp_sigpend(lp);
> SIGSETNAND(tmpset, lp->lwp_sigmask);
> - if ((!(p->p_flag & P_TRACED) && SIGISEMPTY(tmpset))) {
> + if (SIGISEMPTY(tmpset))
> return(FALSE);
> - }
> return (TRUE);
> }
>
>
>

This fix from Simon seems to work for me. Are there any objections to
commiting this?

Thanks,
Nuno

#5 Updated by dillon over 6 years ago

:> > Commenting out the SIGCHLD signal masking part seems to fix this issue =
:for me.
:> > But i'm not sure if this is a correct fix or not.
:>
:> Could you try the attached fix?
:>
:> cheers
:> simon
:...
:> RCS file: /home/dcvs/src/sys/sys/signal2.h,v
:> retrieving revision 1.1
:> diff -u -p -r1.1 signal2.h
:> --- sys/signal2.h 25 Feb 2007 23:17:13 -0000 1.1
:> +++ sys/signal2.h 24 Jan 2008 09:27:00 -0000
:> @@ -103,9 +103,8 @@ __cursignb(struct lwp *lp)
:> p =3D lp->lwp_proc;
:> tmpset =3D lwp_sigpend(lp);
:> SIGSETNAND(tmpset, lp->lwp_sigmask);
:> - if ((!(p->p_flag & P_TRACED) && SIGISEMPTY(tmpset))) {
:> + if (SIGISEMPTY(tmpset))
:> return(FALSE);
:> - }
:> return (TRUE);
:> }
:
:This fix from Simon seems to work for me. Are there any objections to
:commiting this?
:
:Thanks,
:Nuno

Hmm. There are two functions there that test for P_TRACED, do both
need to be fixed?

Also I'm not entirely sure what the mechanism is here. Clearly the
original code (this is original code) isn't quite compatible with
the new signaling code, but the question is: What is the correct
thing to do about it?

-Matt
Matthew Dillon
<>

#6 Updated by corecode over 6 years ago

No, cursig() will always stop the process and let the tracing parent look
at the signal.

However, as we don't want to block there, we are returning ERESTART from
fork1(). Now we should *only* do this if there is a signal. Otherwise
we'll just restart the syscall and the situation won't change.

No, this is "new" code :) It was added by you in 2006, sys/signalvar.h
rev 1.15.

I'm sure this is the correct fix.

cheers
simon

#7 Updated by dillon over 6 years ago

:No, cursig() will always stop the process and let the tracing parent look=
:=20
:at the signal.
:
:However, as we don't want to block there, we are returning ERESTART from =
:
:fork1(). Now we should *only* do this if there is a signal. Otherwise=20
:we'll just restart the syscall and the situation won't change.
:...
:> thing to do about it?
:
:No, this is "new" code :) It was added by you in 2006, sys/signalvar.h=20
:rev 1.15.
:
:I'm sure this is the correct fix.
:
:cheers
: simon

Ah, I see now. Ok, commit that change.

-Matt

#8 Updated by nant over 6 years ago

Commited! Thanks Simon.

Cheers,
Nuno

Also available in: Atom PDF