DragonFlyBSD bugtracker https://bugs.dragonflybsd.org/ 2008-02-03T09:46:01Z
DragonFlyBSD - Bug #937: tcp_sack related panic https://bugs.dragonflybsd.org/issues/937?journal_id=4262 2008-02-03T09:46:01Z dillon
<ul></ul><p>:#6 0xc02fe396 in calltrap () at /usr/src/sys/platform/pc32/i386/exception.s:783<br />
:#7 0xc0233d36 in sack_block_lookup (scb=0xdace6b0c, seq=1554912228, sb=0xdaa45a90) at /usr/src/sys/netinet/tcp_sack.c:128<br />
:#8 0xc0233eda in tcp_sack_nextseg (tp=0xdace6a20, nextrexmt=0xdaa45ad0, plen=0xdaa45ad4, lostdup=0xdaa45acc) at /usr/src/sys/netinet/tcp_sack.c:496<br />
:#9 0xc022f603 in tcp_sack_rexmt (tp=0xdace6a20, th=<value optimized ou=</p>
<pre><code>Hmm. I see two places where a node is removed from the sackblocks list<br /> but lastfound is not cleared on match. I don't know if this is the<br /> issue but it's the most obvious from looking at the failure.</code></pre>
<pre><code>I'll commit this tomorrow if no new developments come up.</code></pre>
<pre><code>-Matt<br /> Matthew Dillon <br /> &lt;<a class="email" href="mailto:dillon@backplane.com">dillon@backplane.com</a>&gt;</code></pre>
<pre><code>Index: tcp_sack.c
===================================================================
RCS file: /cvs/src/sys/netinet/tcp_sack.c,v
retrieving revision 1.6
diff -u -p -r1.6 tcp_sack.c
--- tcp_sack.c 22 Apr 2007 01:13:14 -0000 1.6
+++ tcp_sack.c 3 Feb 2008 01:32:16 -0000
@@ -176,7 +176,7 @@
 sb = TAILQ_FIRST(&amp;scb-&gt;sackblocks);
 while (sb &amp;&amp; SEQ_LEQ(sb-&gt;sblk_end, th_ack)) {
 nb = TAILQ_NEXT(sb, sblk_list);
- if (sb == scb-&gt;lastfound)
+ if (scb-&gt;lastfound == sb)
 scb-&gt;lastfound = NULL;
 TAILQ_REMOVE(&amp;scb-&gt;sackblocks, sb, sblk_list);
 free_sackblock(sb);
@@ -334,6 +334,8 @@ SEQ_GEQ(workingblock-&gt;sblk_end, sb-
 struct sackblock *nextblock;

 nextblock = TAILQ_NEXT(sb, sblk_list);
+ if (scb-&gt;lastfound == sb)
+ scb-&gt;lastfound = NULL;
 /* Remove completely overlapped block */
 TAILQ_REMOVE(&amp;scb-&gt;sackblocks, sb, sblk_list);
 free_sackblock(sb);
@@ -346,6 +348,8 @@ if (sb != NULL &amp;&amp;
 SEQ_GEQ(workingblock-&gt;sblk_end, sb-&gt;sblk_start)) {
 /* Extend new block to cover partially overlapped old block. */
 workingblock-&gt;sblk_end = sb-&gt;sblk_end;
+ if (scb-&gt;lastfound == sb)
+ scb-&gt;lastfound = NULL;
 TAILQ_REMOVE(&amp;scb-&gt;sackblocks, sb, sblk_list);
 free_sackblock(sb);
 --scb-&gt;nblocks;</code></pre>
DragonFlyBSD - Bug #937: tcp_sack related panic https://bugs.dragonflybsd.org/issues/937?journal_id=4263 2008-02-03T10:14:00Z pavalos
<ul></ul><p>Also just got this with the same sources:</p>
<p>panic: zone: freeing free entry<br />mp_lock = 00000000; cpuid = 0<br />boot() called on cpu#0<br />Uptime: 1d11h35m59s</p>
<p>dumping to dev #da/0x20001, blockno 378927</p>
<p>(kgdb) bt<br />
#0 dumpsys () at ./machine/thread.h:83<br />
#1 0xc01a2ea9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:375<br />
#2 0xc01a316c in panic (fmt=0xc034328a "zone: freeing free entry") at /usr/src/sys/kern/kern_shutdown.c:800<br />
#3 0xc02a6aa8 in zerror (error=2) at /usr/src/sys/vm/vm_zone.c:567<br />
#4 0xc02a6ff5 in zfree (z=0xd7049438, item=0xdb991760) at /usr/src/sys/vm/vm_zone.c:98<br />
#5 0xc02341ac in tcp_sack_update_scoreboard (tp=0xdad397c0, to=0xdaa45be8) at /usr/src/sys/netinet/tcp_sack.c:165<br />
#6 0xc02318d9 in tcp_input (m=0xeb7df200) at /usr/src/sys/netinet/tcp_input.c:1900<br />
#7 0xc0229ae2 in transport_processing_oncpu (m=0xeb7df200, hlen=20, ip=&lt;value optimized out&gt;, nexthop=0x0) at /usr/src/sys/netinet/ip_input.c:391<br />
#8 0xc022bae0 in ip_input (m=0xeb7df200) at /usr/src/sys/netinet/ip_input.c:1092<br />
#9 0xc022bbb4 in ip_input_handler (msg0=0xeb7df218) at /usr/src/sys/netinet/ip_input.c:421<br />
#10 0xc0235653 in tcpmsg_service_loop (dummy=0x0) at /usr/src/sys/netinet/tcp_subr.c:385<br />
#11 0xc01a9fa5 in lwkt_deschedule_self (td=Cannot access memory at address 0x8<br />
) at /usr/src/sys/kern/lwkt_thread.c:214<br />
Backtrace stopped: previous frame inner to this frame (corrupt stack?)</p>
<p>Do you think it's the same problem?</p>
DragonFlyBSD - Bug #937: tcp_sack related panic https://bugs.dragonflybsd.org/issues/937?journal_id=4265 2008-02-03T13:37:00Z pavalos
<ul></ul><p>FYI, the vmcores are on leaf:~pavalos/crash. The first one is *12 and<br />the 2nd is *13.</p>
<p>--Peter</p>
DragonFlyBSD - Bug #937: tcp_sack related panic https://bugs.dragonflybsd.org/issues/937?journal_id=4269 2008-02-04T05:38:02Z dillon
<ul></ul><p>:Also just got this with the same sources:<br />
:<br />
:panic: zone: freeing free entry<br />
:mp_lock = 00000000; cpuid = 0<br />
:boot() called on cpu#0<br />
:Uptime: 1d11h35m59s<br />
:...<br />
:#3 0xc02a6aa8 in zerror (error=2) at /usr/src/sys/vm/vm_zone.c:567<br />
:#4 0xc02a6ff5 in zfree (z=0xd7049438, item=0xdb991760) at /usr/src/sys/vm/vm_zone.c:98<br />
:#5 0xc02341ac in tcp_sack_update_scoreboard (tp=0xdad397c0, to=0xdaa45be8) at /usr/src/sys/netinet/tcp_sack.c:165<br />
:#6 0xc02318d9 in tcp_input (m=0xeb7df200) at /usr/src/sys/netinet/tcp_input.c:1900<br />
:#7 0xc0229ae2 in transport_processing_oncpu (m=0xeb7df200, hlen=20, ip=<br />
:<br />
:Do you think it's the same problem?</p>
<pre><code>Same sources prior to the patch? It's quite possible.</code></pre>
<pre><code>I tracked this second crash to line 321 of tcp_sack.c (the kgdb backtrace<br /> is all wrong due to all the inlining). It's freeing 'newblock' here,<br /> which should always succeed at this particular point in the code.</code></pre>
<pre><code>I think this case can only occur if the list had previously been<br /> corrupted due to the hint not getting NULL'd out in those two places.</code></pre>
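<p>The hint invariant discussed above can be sketched in userland C. This is only an illustration, not the kernel code: <code>struct block</code>, <code>struct board</code>, and the <code>board_*</code> helpers are hypothetical stand-ins for the sackblock and scoreboard structures, and plain integer comparisons replace the kernel's wraparound-aware <code>SEQ_*</code> macros. The point it tries to show is that every removal path goes through one helper that clears the cached pointer before the node is freed.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a SACK block: covers [start, end). */
struct block {
	unsigned int start;
	unsigned int end;
	TAILQ_ENTRY(block) link;
};

TAILQ_HEAD(block_list, block);

/* Hypothetical stand-in for the scoreboard, with a cached lookup hint. */
struct board {
	struct block_list blocks;
	struct block *lastfound;	/* hint; must never point at freed memory */
	int nblocks;
};

static void
board_init(struct board *b)
{
	TAILQ_INIT(&b->blocks);
	b->lastfound = NULL;
	b->nblocks = 0;
}

/* Append a block at the tail; returns NULL if allocation fails. */
static struct block *
board_append(struct board *b, unsigned int start, unsigned int end)
{
	struct block *blk = malloc(sizeof(*blk));

	if (blk == NULL)
		return NULL;
	blk->start = start;
	blk->end = end;
	TAILQ_INSERT_TAIL(&b->blocks, blk, link);
	++b->nblocks;
	return blk;
}

/* Find the block containing seq, trying the hint before scanning. */
static struct block *
board_lookup(struct board *b, unsigned int seq)
{
	struct block *blk = b->lastfound;

	/* No wraparound handling here; the kernel uses SEQ_* macros. */
	if (blk != NULL && seq >= blk->start && seq < blk->end)
		return blk;
	TAILQ_FOREACH(blk, &b->blocks, link) {
		if (seq >= blk->start && seq < blk->end) {
			b->lastfound = blk;	/* remember for the next lookup */
			return blk;
		}
	}
	return NULL;
}

/* Single removal path: clear the hint first, then unlink and free. */
static void
board_remove(struct board *b, struct block *blk)
{
	assert(blk != NULL);
	if (b->lastfound == blk)
		b->lastfound = NULL;
	TAILQ_REMOVE(&b->blocks, blk, link);
	free(blk);
	--b->nblocks;
}
```

<p>If a removal site unlinked and freed a block without the <code>lastfound</code> check, the next <code>board_lookup()</code> would dereference freed memory through the hint, which is the kind of failure the two panics in this report appear to show.</p>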
<pre><code>-Matt<br /> Matthew Dillon <br /> &lt;<a class="email" href="mailto:dillon@backplane.com">dillon@backplane.com</a>&gt;</code></pre>
DragonFlyBSD - Bug #937: tcp_sack related panic https://bugs.dragonflybsd.org/issues/937?journal_id=5876 2009-01-21T03:55:40Z corecode
<ul></ul><p>did this get committed?</p>
DragonFlyBSD - Bug #937: tcp_sack related panic https://bugs.dragonflybsd.org/issues/937?journal_id=5969 2009-01-26T06:36:08Z pavalos
<ul></ul><p>Committed in 9e3d6c9645ed28ef5b07a9b13e380e13a86deeb8. I haven't seen this<br />panic in about a year, so let's call it good.</p>