Bug #276: Patch to try - Re: Sendmail rset command hangs socket on 1.6-Release - DragonFlyBSD - DragonFlyBSD bugtracker

Added by sven over 19 years ago. Updated over 19 years ago.

Status: Closed
Priority: High

Description

On Wed, 2006-08-02 at 10:25 -0700, Matthew Dillon wrote:

Please try this patch and tell me if it works. I think we have an issue
when one process holds an exclusive lock while 2 or more processes are
trying to get a shared lock, or vice versa.

-Matt

Index: kern_lockf.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_lockf.c,v
retrieving revision 1.32
diff -u -r1.32 kern_lockf.c
--- kern_lockf.c 25 Jul 2006 20:01:50 -0000 1.32
+++ kern_lockf.c 2 Aug 2006 17:23:56 -0000
@@ -772,8 +772,10 @@
             TAILQ_REMOVE(&lock->lf_blocked, range, lf_link);
             range->lf_flags = 1;
             wakeup(range);
+#if 0
             if (range->lf_start >= start && range->lf_end <= end)
                 break;
+#endif
         }
     }

I have applied the patch (and recompiled) and am letting the system run
full steam right now (including the milter, etc); the initial results
look promising as it has not exhibited the aberrant behavior as of yet.
I will post a followup after letting this run all night (assuming it
does so) or after it fails (which hopefully won't happen).

Sven
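
For reference, the scenario Matt describes (one process holding an exclusive lock while two or more wait for a shared lock) can be exercised from userland with fcntl() byte-range locks. The following is only a minimal sketch under that assumption: `whole_file_lock` and `count_woken_readers` are hypothetical helper names, the lock file path is up to the caller, and nothing in this thread confirms that this exact sequence triggers the hang. On a correct kernel every waiting reader should get its lock once the writer unlocks.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Apply a whole-file fcntl() lock of the given type (F_RDLCK, F_WRLCK, F_UNLCK). */
static int whole_file_lock(int fd, short type, int cmd)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                 /* 0 means "to end of file" */
    return fcntl(fd, cmd, &fl);
}

/*
 * Hypothetical helper: take an exclusive lock, fork nreaders children that
 * block waiting for a shared lock, release the exclusive lock, and count the
 * children that obtained their lock within timeout_s seconds (must be > 0).
 * Returns -1 if the file cannot be opened or locked.
 */
int count_woken_readers(const char *path, int nreaders, unsigned timeout_s)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (whole_file_lock(fd, F_WRLCK, F_SETLK) < 0) {
        close(fd);
        return -1;
    }

    int started = 0;
    for (int i = 0; i < nreaders; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            alarm(timeout_s);     /* a waiter that is never woken dies by SIGALRM */
            int cfd = open(path, O_RDWR);
            if (cfd < 0 || whole_file_lock(cfd, F_RDLCK, F_SETLKW) < 0)
                _exit(1);
            _exit(0);             /* got the shared lock */
        }
        started++;
    }

    sleep(1);                     /* give the children time to block */
    whole_file_lock(fd, F_UNLCK, F_SETLK);

    int woken = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            woken++;
    }
    close(fd);
    unlink(path);
    return woken;
}
```

Calling `count_woken_readers(path, 2, 5)` and getting fewer than 2 back would suggest that a waiter was left asleep; the alarm keeps a stuck child from hanging the test itself.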

Updated by sven about 20 years ago

On Wed, 2006-08-02 at 17:41 -0400, Sven Willenberger wrote:

On Wed, 2006-08-02 at 10:25 -0700, Matthew Dillon wrote:

Please try this patch and tell me if it works. I think we have an issue
when one process holds an exclusive lock while 2 or more processes are
trying to get a shared lock, or vice versa.

-Matt

Index: kern_lockf.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_lockf.c,v
retrieving revision 1.32
diff -u -r1.32 kern_lockf.c
--- kern_lockf.c 25 Jul 2006 20:01:50 -0000 1.32
+++ kern_lockf.c 2 Aug 2006 17:23:56 -0000
@@ -772,8 +772,10 @@
             TAILQ_REMOVE(&lock->lf_blocked, range, lf_link);
             range->lf_flags = 1;
             wakeup(range);
+#if 0
             if (range->lf_start >= start && range->lf_end <= end)
                 break;
+#endif
         }
     }

I have applied the patch (and recompiled) and am letting the system run
full steam right now (including the milter, etc); the initial results
look promising as it has not exhibited the aberrant behavior as of yet.
I will post a followup after letting this run all night (assuming it
does so) or after it fails (which hopefully won't happen).

Sven

As a followup, the server has been running without a hitch now for 18
hours so it would appear that the above patch has fixed the situation,
unless some other more rare situation/condition crops up that would
cause this lock.

Sven

Updated by dillon about 20 years ago

:> I have applied the patch (and recompiled) and am letting the system run
:> full steam right now (including the milter, etc); the initial results
:> look promising as it has not exhibited the aberrant behavior as of yet.
:> I will post a followup after letting this run all night (assuming it
:> does so) or after it fails (which hopefully won't happen).
:>
:> Sven
:>
:
:As a followup, the server has been running without a hitch now for 18
:hours so it would appear that the above patch has fixed the situation,
:unless some other more rare situation/condition crops up that would
:cause this lock.
:
:Sven

Ok, that's good to hear.  I'll get the patch committed to both HEAD and
    REL.

This bug is serious enough to warrant rolling 1.6.1 next week, probably
    Monday.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>

Updated by qhwt+dfly about 20 years ago

On Thu, Aug 03, 2006 at 08:56:47AM -0700, Matthew Dillon wrote:

:> I have applied the patch (and recompiled) and am letting the system run
:> full steam right now (including the milter, etc); the initial results
:> look promising as it has not exhibited the aberrant behavior as of yet.
:> I will post a followup after letting this run all night (assuming it
:> does so) or after it fails (which hopefully won't happen).
:>
:> Sven
:>
:
:As a followup, the server has been running without a hitch now for 18
:hours so it would appear that the above patch has fixed the situation,
:unless some other more rare situation/condition crops up that would
:cause this lock.
:
:Sven

Ok, that's good to hear. I'll get the patch committed to both HEAD and
REL.

Does 1.4.x-RELEASE have this problem too?

Updated by dillon about 20 years ago

:
:Does 1.4.x-RELEASE have this problem too?

The lockf code is different in 1.4.  There is a similar test in lf_wakeup,
    and it doesn't look right to me, but I don't know if the bug can be
    triggered or not.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>

Updated by hamilton about 20 years ago

Matthew Dillon <dillon@apollo.backplane.com>, said on Thu Aug 03, 2006 [08:12:27 PM]:
} :
} :Does 1.4.x-RELEASE have this problem too?
}
} The lockf code is different in 1.4. There is a similar test in lf_wakeup,
} and it doesn't look right to me, but I don't know if the bug can be
} triggered or not.

The postfix problem I had was present on 1.4.x as well as 1.6 and 1.7.

Jon Hamilton 
   hamilton@pobox.com

Updated by dillon about 20 years ago

:...
:} and it doesn't look right to me, but I don't know if the bug can be
:} triggered or not.
:
:The postfix problem I had was present on 1.4.x as well as 1.6 and 1.7.
:
:--
:
: Jon Hamilton
: hamilton@pobox.com

I'll commit a similar patch to 1.4.x to hopefully fix it there.

It looks like I copied the original bug to the new lock code when I
    rewrote it during 1.5.  I'm not sure when it was first introduced but
    it looks like the code attempted to optimize the unblocking code by
    breaking out of the loop early in certain situations, but it turns out
    the optimization check it was doing was insufficient and it was breaking
    out too early.

-Matt
                    Matthew Dillon 
                    <dillon@backplane.com>
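
To illustrate the early-exit problem described above, here is a minimal userland model of a wake-up loop over blocked requests. It is a sketch under stated assumptions, not the kernel code: `struct waiter` and `wake_blocked` are hypothetical names, the array stands in for the blocked queue, and the overlap test before the wake-up is assumed. Only the `start`/`end` comparison mirrors the condition that the patch disables.

```c
#include <stddef.h>

/* Simplified stand-in for a blocked lock request; not the kernel's structure. */
struct waiter {
    long start, end;   /* inclusive byte range the waiter is blocked on */
    int  woken;        /* set to 1 once the waiter has been woken */
};

/*
 * Wake every waiter whose range overlaps the released range [start, end].
 * With early_break set, stop after the first woken waiter that lies entirely
 * inside [start, end], as the disabled check did. Returns the number woken.
 */
size_t wake_blocked(struct waiter *w, size_t n, long start, long end,
                    int early_break)
{
    size_t woken = 0;

    for (size_t i = 0; i < n; i++) {
        if (w[i].end < start || w[i].start > end)
            continue;              /* no overlap with the released range */
        w[i].woken = 1;
        woken++;
        if (early_break && w[i].start >= start && w[i].end <= end)
            break;                 /* later waiters are never examined */
    }
    return woken;
}
```

With two waiters blocked on the same range inside the released region, the early-exit variant wakes only the first and leaves the second asleep; with `early_break` set to 0, both are woken.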

Updated by joerg about 20 years ago

On Fri, Aug 04, 2006 at 09:21:30AM -0700, Matthew Dillon wrote:

It looks like I copied the original bug to the new lock code when I
rewrote it during 1.5. I'm not sure when it was first introduced but
it looks like the code attempted to optimize the unblocking code by
breaking out of the loop early in certain situations, but it turns out
the optimization check it was doing was insufficient and it was breaking
out too early.

Braino. The check was backwards, i.e. it should check whether the
range covers the given [start, end], not the other way around.

Joerg
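
The two directions of the comparison can be written side by side. This is a small sketch with hypothetical function names (`range_within`, `range_covers`) and plain integer offsets; the thread does not establish whether an early exit would be safe with the reversed test, and the patch posted here disables the early exit rather than reversing it.

```c
/* Test from the disabled code: the waiter's range lies inside [start, end]. */
int range_within(long long rs, long long re, long long start, long long end)
{
    return rs >= start && re <= end;
}

/* Reversed test, following Joerg's comment: the waiter's range covers [start, end]. */
int range_covers(long long rs, long long re, long long start, long long end)
{
    return rs <= start && re >= end;
}
```

For a waiter on [10, 20] and a released range of [0, 100], `range_within` is true while `range_covers` is false; the two agree only when the ranges are identical.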

Updated by justin about 20 years ago

Fixed in 1.6.1.
