https://bugs.dragonflybsd.org/https://bugs.dragonflybsd.org/favicon.ico?16293952082010-08-05T02:50:06ZDragonFlyBSD bugtrackerDragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=87392010-08-05T02:50:06Ztuxillo
<ul></ul><p>Venk,</p>
<p>Both kern and vmcore are inaccessible:</p>
<p>The requested URL /dfly/tmpfs_20100413_panic/vmcore.0.gz was not found on this<br />server.</p>
<p>Cheers,<br />Antonio Huete</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=87712010-08-13T05:35:31Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>Hey, sorry, I removed them for lack of space.</p>
<p><a class="external" href="http://endeavour.zapto.org/src/tmpfskern.0.gz">http://endeavour.zapto.org/src/tmpfskern.0.gz</a><br />and <br /><a class="external" href="http://endeavour.zapto.org/src/tmpfsvmcore.0.gz">http://endeavour.zapto.org/src/tmpfsvmcore.0.gz</a></p>
<p>exhibit the same problem, on a kernel built from git today. These dumps are from<br />a system with 64M of RAM, but the problem also appears on a machine with 1GB of RAM.</p>
<p>-- vs</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88042010-08-25T10:12:03Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>I believe <br /><a class="external" href="http://gitweb.dragonflybsd.org/dragonfly.git/commit/42f6f6b1b2dcc2ca10d31421d2dd">http://gitweb.dragonflybsd.org/dragonfly.git/commit/42f6f6b1b2dcc2ca10d31421d2dd</a><br />6273851e012d, <br /><a class="external" href="http://gitweb.dragonflybsd.org/dragonfly.git/commit/8e771504ede4fe826607300e9e4c">http://gitweb.dragonflybsd.org/dragonfly.git/commit/8e771504ede4fe826607300e9e4c</a><br />0c7444652cc4, and <br /><a class="external" href="http://gitweb.dragonflybsd.org/dragonfly.git/commit/dcaa8a41662f2b0cf579a6e91256">http://gitweb.dragonflybsd.org/dragonfly.git/commit/dcaa8a41662f2b0cf579a6e91256</a><br />4c9fc8275ac1 allow it to survive this. I've tested on a vkernel only, would <br />appreciate testing on real hardware.</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88062010-08-26T09:13:05Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>While we will survive fsstress, since the name zone hits its limit before the <br />dirent one under fsstress, the basic problem (that limits can be reached and <br />tmpfs wasn't counting resources from each zone) remains for dirent structures. A <br />well-written test that makes many symlinks while minimizing name zone usage <br />would still panic the kernel.</p>
<p>Before I close this bug, it'd be nice if more people could confirm/deny that <br />fsstress can run on tmpfs.</p>
<p>Things to do to make the world better: <br />1) Move the tmpfs name zone from a global malloc zone to a per-mount zone<br />2) Convert dirent allocations from M_WAITOK to M_WAITOK | M_NULLOK and handle <br />the null return case; there are only two places that dirents are allocated, so <br />this wouldn't be too bad.</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88072010-08-26T12:40:41Zthomas.nikolajsen
<ul></ul><p>I did test and it eventually panicked on memory exhaustion;<br />would you like a dump?</p>
<p>Test was on SMP kernel on 2GB RAM system;<br />on slower system, UP, 2GB RAM, I had no panic on running over night (too <br />slow?).<br />Also ran fsx; both programs w/ params. in do* file with program.</p>
<p>On UP system I saw message on console on shutdown:<br />Warning: deep namecache recursion at (null)<br />don't know if this happened during run or at shutdown;<br />fs test programs were stopped (^Z) at shutdown.</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88082010-08-26T15:13:57Zahuete.devel
<ul></ul><p>Venk,</p>
<p>I could panic the kernel fairly easily on a 256MB VM. It hit the malloc<br />limit in about 2 minutes running fsstress:</p>
<p><a class="external" href="http://www.imgpaste.com/i/ilplf.jpg">http://www.imgpaste.com/i/ilplf.jpg</a></p>
<p>Cheers,<br />Antonio Huete</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88092010-08-26T15:36:45Zahuete.devel
<ul></ul><p>Hi,</p>
<p>Mount command was:</p>
<pre><code># sudo mount -t tmpfs tmpfs /mnt/tmpfs/
# vmstat -m | grep tmpfs
tmpfs node 1 1K 0K 30924K 1 0 0
tmpfs mount 1 1K 0K 24830K 1 0 0</code></pre>
<p>Cheers,<br />Antonio Huete</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88862010-09-10T20:11:48Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>Hi,</p>
<p>I just committed 881dac8bcf7f6e26635fa38f071b93347ef92192, which I think solves <br />the problem tuxillo hit. I'd love it if people tried it out - last time I thought <br />tmpfs was solved, it wasn't :D.</p>
<p>The fix allows the malloc zone for nodes to return NULL when its limit is <br />exhausted or when we are unable to satisfy the malloc (I've seen that on some <br />low-memory systems here); tmpfs_node_init would not survive a NULL node either, <br />which I just fixed.</p>
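<p>A minimal userland sketch of the behaviour described above, assuming a byte-limited zone whose allocator may return NULL instead of panicking. The names <code>struct zone</code>, <code>zone_alloc</code>, <code>Z_NULLOK</code> and <code>node_create</code> are hypothetical and are not the kernel's kmalloc interface; returning ENOSPC to the caller is likewise an assumption made for illustration.</p>

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a limited allocation zone (not the kernel API). */
struct zone {
	size_t used;	/* bytes currently charged to the zone */
	size_t limit;	/* hard cap on bytes; used never exceeds it */
};

#define Z_NULLOK 0x1	/* hypothetical flag: return NULL instead of aborting */

static void *
zone_alloc(struct zone *z, size_t size, int flags)
{
	void *p;

	/* Limit check first: this is the "malloc limit exceeded" case. */
	if (size > z->limit - z->used) {
		if (flags & Z_NULLOK)
			return NULL;
		abort();	/* stands in for the kernel panic */
	}
	/* The underlying allocation can also fail on a low-memory system. */
	p = malloc(size);
	if (p == NULL) {
		if (flags & Z_NULLOK)
			return NULL;
		abort();
	}
	z->used += size;
	return p;
}

static void
zone_free(struct zone *z, void *p, size_t size)
{
	free(p);
	z->used -= size;
}

struct node {
	int id;
};

/* Caller that survives a NULL node: it reports an error instead of
 * dereferencing the pointer. */
static int
node_create(struct zone *z, int id, struct node **out)
{
	struct node *n = zone_alloc(z, sizeof(*n), Z_NULLOK);

	if (n == NULL)
		return ENOSPC;	/* assumed error code, for illustration only */
	n->id = id;
	*out = n;
	return 0;
}
```

<p>The point of the sketch is only that both halves are needed: the allocator must be allowed to fail, and every caller on that path must check for NULL before initializing the node.</p>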
<p>-- vs</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88872010-09-10T20:50:04Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>thomas - the deep namecache recursions 'should' be happening on shutdown <br />(actually unmount); they are a real problem - a full tmpfs (by nodes) on a <br />system with 1.5GB of RAM takes upwards of 5 min to unmount. Why the ncp->nc_name <br />field is empty I don't know either....</p>
<p>status:<br />- tmpfs should survive fsstress at the minute<br />- >1 tmpfs will be a problem, the name zone is shared<br />- there are workloads which will still panic it...<br />- unmount after fsstress takes a long time</p>
<p>stuff to do still standing:<br />- Move the tmpfs name zone from a global malloc zone to a per-mount zone<br />- Convert dirent allocations from M_WAITOK to M_WAITOK | M_NULLOK and handle <br />the null return case; there are only two places that dirents are allocated, so <br />this wouldn't be too bad.<br />- Figure out why the unmount is hitting so many namecache entries with null <br />names<br />- Write a link stress test (something that makes a <em>lot</em> of links) to see if we <br />can exhaust the dirent zone currently.</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88902010-09-11T04:07:23Ztuxillo
<ul></ul><p>Venk,</p>
<p>After 3h testing w/ fsstress I didn't have any panics, but on shutdown the <br />namecache recursion issue is still there.</p>
<p>Cheers,<br />Antonio Huete</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88972010-09-11T20:00:10Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>Tux: That's good to hear!</p>
<hr />
<p>This test program:</p>
<pre><code>#include &lt;unistd.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;

int
main(void)
{
	int i;
	char id[320] = {0};

	/* Try to create 10,000,000 hard links to sin.c, named by a zero-padded counter. */
	for (i = 0; i &lt; 10000000; i++) {
		snprintf(id, sizeof(id), "%09d", i);
		link("sin.c", id);	/* return value ignored; the loop keeps going on failure */
	}

	return 0;
}</code></pre>
<p>I expected it to exhaust the tmpfs dirent zone on a low-memory system, where the <br />dirent zone limit was less than the system limit on hardlinks. Instead I <br />exhausted the vfscache zone:</p>
<p>panic: vfscache: malloc limit exceeded<br />mp_lock = 00000000; cpuid = 0<br />Trace beginning at frame 0x54ee5a10<br />panic(ffffffff,54ee5a38,55492c08,82d43e0,40400840) at 0x80e1d33<br />panic(8287563,829176f,4c8b7339,0,55492c08) at 0x80e1d33<br />kmalloc(a,82d43e0,2,0,54ee5bec) at 0x80df67c<br />cache_unlock(0,0,52b48d00,52ba4b00,40400000) at 0x812c274<br />cache_nlookup(54ee5bec,54ee5af4,54ee5bec,54ee5bec,40400000) at 0x81302ed<br />nlookup(54ee5bec,5503e4c8,54ee5c24,52a45540,5503e4c8) at 0x8138665<br />kern_link(54ee5c24,54ee5bec,552881d8,52ba4b00,526dc698) at 0x8141aa6<br />sys_link(54ee5c94,0,0,82c46cc,292) at 0x8147475<br />syscall2(54ee5d40,52a1dd40,0,0,54ee5d38) at 0x8265d6d<br />user_trap(54ee5d40,54e8bb88,82667bd,0,0) at 0x82660af<br />go_user(54ee5d38,0,0,7b,0) at 0x826663e<br />Debugger("panic")</p>
<p>CPU0 stopping CPUs: 0x00000000<br /> stopped<br />Stopped at 0x826352d: movb $0,0x83f6194<br />db></p>
<p>(hardlinks are one of two things in tmpfs that allocate dirents; the other <br />allocation is already bounded by the node limits, so it's not a problem).</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=88982010-09-11T21:45:18Zeocallaghan
<ul></ul><p>I was able to reproduce an equivalent panic on HAMMER with the test case from<br />vsrinivas.</p>
<p>(kgdb) bt<br />
#0 _get_mycpu (di=0xc06d4ca0) at ./machine/thread.h:83<br />
#1 md_dumpsys (di=0xc06d4ca0)<br />
 at /usr/src/sys/platform/pc32/i386/dump_machdep.c:263<br />
#2 0xc0304d15 in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:880<br />
#3 0xc03052d5 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:387<br />
#4 0xc030559e in panic (fmt=0xc05bb41b "%s: malloc limit exceeded")<br />
 at /usr/src/sys/kern/kern_shutdown.c:786<br />
#5 0xc03032bb in kmalloc (size=25, type=0xc1d8f590, flags=258)<br />
 at /usr/src/sys/kern/kern_slaballoc.c:503<br />
#6 0xc04aa5a3 in hammer_alloc_mem_record (ip=0xcb803d50, data_len=25)<br />
 at /usr/src/sys/vfs/hammer/hammer_object.c:280<br />
#7 0xc04aa91f in hammer_ip_add_directory (trans=0xce350ad4, <br />
 dip=0xcb803d50, name=0xd3cdb1d0 "000452457", bytes=9, ip=0xce31df50)<br />
 at /usr/src/sys/vfs/hammer/hammer_object.c:666<br />
#8 0xc04bbf8a in hammer_vop_nlink (ap=0xce350b2c)<br />
 at /usr/src/sys/vfs/hammer/hammer_vnops.c:1388<br />
#9 0xc036cc1f in vop_nlink_ap (ap=0xce350b2c)<br />
 at /usr/src/sys/kern/vfs_vopops.c:1978<br />
#10 0xc03717ca in null_nlink (ap=0xce350b2c)<br />
 at /usr/src/sys/vfs/nullfs/null_vnops.c:164<br />
#11 0xc036d465 in vop_nlink (ops=0xcdbbe030, nch=0xce350c48, <br />
 dvp=0xce0913e8, vp=0xce2f04e8, cred=0xcdef1738)<br />
 at /usr/src/sys/kern/vfs_vopops.c:1397<br />
#12 0xc0365496 in kern_link (nd=0xce350c80, linknd=0xce350c48)<br />
 at /usr/src/sys/kern/vfs_syscalls.c:2320<br />
#13 0xc036ad49 in sys_link (uap=0xce350cf0)<br />
 at /usr/src/sys/kern/vfs_syscalls.c:2345<br />
#14 0xc055f6b3 in syscall2 (frame=0xce350d40)<br />
 at /usr/src/sys/platform/pc32/i386/trap.c:1310<br />
#15 0xc0547fb6 in Xint0x80_syscall ()<br />
 at /usr/src/sys/platform/pc32/i386/exception.s:876<br />
#16 0x0000001f in ?? ()<br />
Backtrace stopped: previous frame inner to this frame (corrupt stack?)<br />
(kgdb)</p>
<p>Cheers,<br />Edward.</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=89432010-09-17T00:16:50Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>Hi,</p>
<p>I just converted the tmpfs name zone from a systemwide zone to a per-mount zone. <br />All of the panics from tmpfs directly from zone exhaustion should be taken care <br />of, so I think it'd be worth closing this bug and opening two new ones, one for <br />the vfscache exhaustion, one for the deep recursion on unmount.</p>
<p>I'll mark this as testing till more people can try beating up on tmpfs?</p>
<p>-- vs</p> DragonFlyBSD - Bug #1726: tmpfs "malloc limit exceeded" panichttps://bugs.dragonflybsd.org/issues/1726?journal_id=90082010-09-27T03:55:46Zvsrinivasvsrinivas@ops101.org
<ul></ul><p>Per last testing from tuxillo, this seems to be finally resolved! Cheers!</p>
<p>[The namecache recursion on unmount and the vfscache limit bugs still exist, as <br />a warning.]</p>
<p>-- vs</p>