Bug #1322
panic with high signal load (closed)
Description
I managed to panic the kernel while doing some testing using signals.
The application was doing the following:
fork child
child sets up signal handler for SIGHUP then loops forever calling
    pause().
parent sets up a signal handler for SIGHUP, then loops count times
    sending a SIGHUP to the child then calls pause().
child's SIGHUP handler just sends a SIGHUP to the parent.
parent's SIGHUP handler calculates the round-trip time for the signal.
This appears to work fine for count < 1000 or so.  I tried an
iteration where count = 5000 and panicked the kernel.  I was unable to
get the panic message from the serial console but was able to get the
following trace from DDB:
db> trace
Debugger(c03d444f) at Debugger+0x34
panic(c03c8398,c040a210,c03c7238,d2684d58,2) at panic+0x9f
userret(6,0,0,d2684d58,c041f11c) at userret+0x16a
syscall2(d8c9dd40) at syscall2+0x2d6
Xint0x80_syscall() at Xint0x80_syscall+0x36
I can attempt to reproduce this if needed and can also provide the
source for the application.  I still have the debug kernel but wasn't
able to glean any useful information from it myself.
Thanks,
Joe
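For readers who want to experiment, the steps described above can be sketched roughly as follows. This is not the reporter's program, which was never posted. It assumes the parent waits for each echo before sending the next SIGHUP, and it swaps pause() for sigsuspend() with SIGHUP blocked so that a signal arriving early cannot be lost. The names pingpong_roundtrips, on_sighup, got_reply, is_child and peer_pid are hypothetical, and the timing uses clock_gettime(CLOCK_MONOTONIC) rather than gettimeofday().

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_reply; /* parent: set when the echo arrives */
static volatile sig_atomic_t is_child;  /* selects the handler's role */
static pid_t peer_pid;                  /* child only: pid of the parent */

static void
on_sighup(int sig)
{
    (void)sig;
    if (is_child)
        kill(peer_pid, SIGHUP);         /* child: bounce the signal back */
    else
        got_reply = 1;                  /* parent: note the round trip */
}

/*
 * Hypothetical helper: run `count` SIGHUP round trips between the caller
 * and a forked child.  Returns the number of completed round trips, or -1
 * on error, and stores the mean round-trip time in *avg_usec.
 */
int
pingpong_roundtrips(int count, double *avg_usec)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    struct timespec t0, t1;
    pid_t child;
    int done = 0;

    /* Keep SIGHUP blocked so it is only delivered inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGHUP);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1)
        return -1;
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGHUP);

    /* Install the handler before fork() so the child inherits it. */
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGHUP, &sa, &old_sa);
    is_child = 0;

    child = fork();
    if (child == -1) {
        sigaction(SIGHUP, &old_sa, NULL);
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        return -1;
    }
    if (child == 0) {
        peer_pid = getppid();
        is_child = 1;
        for (;;)
            sigsuspend(&wait_mask);     /* echo every SIGHUP until killed */
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while (done < count) {
        got_reply = 0;
        if (kill(child, SIGHUP) == -1)
            break;
        while (!got_reply)
            sigsuspend(&wait_mask);     /* wait for the child's echo */
        done++;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Tear down the child and restore the caller's signal state. */
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    sigaction(SIGHUP, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);

    if (avg_usec != NULL && done > 0)
        *avg_usec = ((t1.tv_sec - t0.tv_sec) * 1e6 +
            (t1.tv_nsec - t0.tv_nsec) / 1e3) / done;
    return done;
}
```

Calling pingpong_roundtrips(5000, &avg) from a main() corresponds to the count = 5000 run described above. Whether this variant exercises the same kernel path as the original test is unknown.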
       Updated by dillon over 16 years ago
I think I need the program to reproduce it.  I wrote a program based
on your description, which I include below, but it doesn't seem to
reproduce the problem.

In my program, instead of having the parent send a SIGHUP from its
main loop, I just have the two signal handlers ping-pong the signal,
with the child serving the first ball.

-Matt
Matthew Dillon
<dillon@backplane.com>

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

static void sig_child(int sig);
static void sig_parent(int sig);

pid_t Pid_parent;
pid_t Pid_child;
struct timeval Tv1;         /* time of the previous SIGHUP in the parent */
struct timeval Tv2;         /* time of the current SIGHUP in the parent */
struct timeval TvDelta;     /* accumulated time between parent SIGHUPs */
int64_t Count;              /* number of SIGHUPs the parent has received */

int
main(int ac, char **av)
{
    Pid_parent = getpid();
    signal(SIGHUP, sig_parent);

    if ((Pid_child = fork()) == 0) {
        Pid_child = getpid();

        signal(SIGHUP, sig_child);
        kill(Pid_parent, SIGHUP);        /* start it going */
        for (;;)
            pause();
    }

    /*
     * NOTE: Count and TvDelta updates can race, so we may occasionally
     * print a bad value.
     */
    for (;;) {
        pause();
        if (Count % 10000 == 0) {
            printf("%9lld %6.2fuS\n",
                (long long)Count,
                ((double)TvDelta.tv_sec * 1000000.0 +
                (double)TvDelta.tv_usec) / (double)Count
            );
        }
    }
}

/* Child: bounce every SIGHUP straight back to the parent. */
static
void
sig_child(int sig)
{
    kill(Pid_parent, SIGHUP);
}

/* Parent: accumulate the time since the last SIGHUP, then send the next. */
static
void
sig_parent(int sig)
{
    int usec;

    ++Count;
    Tv1 = Tv2;
    gettimeofday(&Tv2, NULL);
    if (Count > 1) {
        usec = (Tv2.tv_sec - Tv1.tv_sec) * 1000000 +
               (Tv2.tv_usec - Tv1.tv_usec);

        usec += TvDelta.tv_usec;
        if (usec >= 1000000) {          /* carry whole seconds into tv_sec */
            TvDelta.tv_sec += usec / 1000000;
            usec %= 1000000;
        }
        TvDelta.tv_usec = usec;
    }
    kill(Pid_child, SIGHUP);
}
       Updated by corecode over 16 years ago
    Joseph, can you please post the original test code?
       Updated by josepht over 16 years ago
    On Thu, Apr 23, 2009 at 07:54:59AM +0000, Simon 'corecode' Schubert (via DragonFly issue tracker) wrote:
> Simon 'corecode' Schubert <corecode@fs.ei.tum.de> added the comment:
> Joseph, can you please post the original test code?
I think this issue can be closed.  Neither I nor Matt were able to
reproduce this.
Joe