DragonFlyBSD bug tracker
Bug #1129: hammer-inodes: malloc limit exceeded
https://bugs.dragonflybsd.org/issues/1129

Comment by dillon (2008-09-01 03:01):
:Hi,
:
:I have replaced my backup system with a new machine running DragonFly and
:Hammer. It is an SMP Opteron with 2GB memory. The backup partition is a
:single disk on a 3Ware RAID controller.
:
:The previous machine (FreeBSD 7/UFS2) ran rsnapshot every 4 hours and this
:one continues with the same configuration. I copied the content of the
:old rsnapshot directory to the new backup disk before putting it into
:production.
:
:For details on rsnapshot, see http://www.rsnapshot.org/
:
:/backup is a 400GB Hammer disk:
:
:Filesystem    Size   Used  Avail Capacity   iused   ifree  %iused
:Backup        371G   239G   132G    64%   2652953       0    100%
:
:I have just encountered this panic:
:
:panic: hammer-inodes: malloc limit exceeded
:mp_lock = 00000000; cpuid = 0
:...
:The backtrace was quickly copied by hand. I may be able to post the full trace
:tomorrow if needed.
:
:--
:Francois Tigeot
    Francois, is this on a 2.0 release system or a HEAD or latest
    release (from CVS) system?

    I was sure I fixed this issue for machines with large amounts of RAM.
    Please do this:

        sysctl vfs.maxvnodes

    And tell me what it says. If the value is greater than 70000, set it
    to 70000.

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by ftigeot (2008-09-01 03:16):
This is 2.0 + patches (current DragonFly_RELEASE_2_0_Slip).

There is no such sysctl. I used kern.maxvnodes instead; the original value
was 129055.

Comment by dillon (2008-09-01 05:15):
:This is 2.0 + patches (current DragonFly_RELEASE_2_0_Slip)
:
:...
:> And tell me what it says. If the value is greater than 70000, set it
:> to 70000.
:
:There is no such sysctl. I used kern.maxvnodes instead; the original value
:was 129055.
:
:--
:Francois Tigeot

    Yah, I mistyped that: it's kern.maxvnodes. The fix I had made was
    MFC'd to 2.0_Slip, so the calculation must still be off. Reducing
    maxvnodes should solve the panic. The basic problem is that HAMMER's
    struct hammer_inode is larger than struct vnode, so the vnode limit
    calculations wind up being off.
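A minimal sketch of the suggested workaround (the 70000 cap is the figure
from this thread, not a general recommendation):

    # Check the current vnode limit.
    sysctl kern.maxvnodes

    # Cap it on the running system; HAMMER inodes are larger than
    # vnodes, so a lower vnode ceiling keeps the HAMMER-inodes
    # malloc pool under its limit.
    sysctl kern.maxvnodes=70000

    # Persist the cap across reboots.
    echo 'kern.maxvnodes=70000' >> /etc/sysctl.conf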
    You don't need to use the hardlink trick if backing up to a HAMMER
    filesystem. I still need to write utility support to streamline
    the user interface, but basically all you have to do is use rdist,
    rsync, or cpdup (without the hardlink trick) to overwrite the same
    destination directory on the HAMMER backup system, then generate a
    snapshot softlink. Repeat each day.

    This is how I back up DragonFly systems. I have all the systems
    NFS-exported to the backup system, and it uses cpdup and the hammer
    snapshot feature to create a softlink for each day.
    backup# df -g -i /backup
    Filesystem  1G-blocks  Used  Avail  Capacity    iused  ifree  %iused  Mounted on
    TEST              696   281    414       40%  3605109      0    100%  /backup

    backup# cd /backup/mirrors
    backup# ls -la
    ...
    drwxr-xr-x  1 root  wheel   0 Aug 31 03:20 pkgbox
    lrwxr-xr-x  1 root  wheel  26 Jul 14 22:22 pkgbox.20080714 -> pkgbox@@0x00000001061a92cd
    lrwxr-xr-x  1 root  wheel  26 Jul 16 01:58 pkgbox.20080716 -> pkgbox@@0x000000010c351e83
    lrwxr-xr-x  1 root  wheel  26 Jul 17 03:08 pkgbox.20080717 -> pkgbox@@0x000000010d9ee6ad
    lrwxr-xr-x  1 root  wheel  26 Jul 18 03:12 pkgbox.20080718 -> pkgbox@@0x000000010f78313d
    lrwxr-xr-x  1 root  wheel  26 Jul 19 03:25 pkgbox.20080719 -> pkgbox@@0x0000000112505014
    ...
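A minimal sketch of one such daily cycle, under a few assumptions: the
client is reachable via an NFS mount (/net/pkgbox here is illustrative),
and hammer(8) provides the synctid subcommand to obtain a transaction id:

    #!/bin/sh
    # Daily HAMMER backup cycle, per the description above.
    SRC=/net/pkgbox              # illustrative NFS mount of the client
    DST=/backup/mirrors/pkgbox

    # Overwrite the same destination tree; no hardlink trick needed.
    cpdup -i0 "$SRC" "$DST"

    # Freeze today's state behind a transaction-id softlink;
    # HAMMER resolves name@@0x... to a point-in-time view.
    TID=$(hammer synctid /backup)
    ln -s "pkgbox@@${TID}" "/backup/mirrors/pkgbox.$(date +%Y%m%d)"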
    Doing backups this way has some minor management issues, and we really
    need an official user utility to address them. When the backup disk
    gets over 90% full I will have to start deleting softlinks and running
    hammer prune, and I run about 30 minutes worth of hammer reblocking
    ops every night from cron.
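A hedged sketch of that nightly maintenance; hammer prune and hammer
reblock are the relevant hammer(8) subcommands, but the exact argument
forms below are assumptions, so check the man page for your release:

    # /etc/crontab additions (sketch): prune according to the snapshot
    # softlinks, then reblock for at most 30 minutes (-t seconds).
    0  3 * * *  root  hammer prune /backup/mirrors
    30 3 * * *  root  hammer -t 1800 reblock /backup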
    HAMMER locks down atime/mtime when accessed via a snapshot, so
    tar | md5 can be used to create a sanity check for each snapshot.
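For example, using one of the snapshot softlinks listed above (the frozen
timestamps make the digest reproducible across runs):

    tar -cf - /backup/mirrors/pkgbox.20080714 | md5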
    By my estimation it is going to take at least another 200+ days of
    daily backups before I get to that point on my /backup system. I
    may speed it up by creating some filler files so I can write and test
    a user utility to do the management.

    --
    Another way of doing backups is to use the mirroring feature. This only
    works when both the source and target filesystems are HAMMER filesystems,
    though, and the snapshot softlink would have to be created manually
    (so we need more utility support to make it easier for userland to do).
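A hedged sketch of that alternative; hammer(8) gained mirror-read and
mirror-write streaming around this time, and the exact invocations below
(including the one-shot mirror-copy form and the host name) are assumptions:

    # Stream a HAMMER filesystem to an offsite HAMMER slave over ssh.
    hammer mirror-read /backup | ssh offsite hammer mirror-write /backup

    # Or the one-shot convenience form:
    hammer mirror-copy /backup offsite:/backup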
                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by fjwcash (2008-09-01 06:56):

On Sun, Aug 31, 2008 at 3:12 PM, Matthew Dillon
<dillon@apollo.backplane.com> wrote:
> You don't need to use the hardlink trick if backing up to a HAMMER
> filesystem. I still need to write utility support to streamline
> the user interface but basically all you have to do is use rdist, rsync,
> or cpdup (without the hardlink trick) to overwrite the same destination
> directory on the HAMMER backup system, then generate a snapshot
> softlink. Repeat each day.
<p>In-filesystem snapshot support is such a handy tool. It's something<br />that I really miss on our Linux systems (LVM snapshots are separate<br />volumes, and you have to guesstimate how much room each one will use,<br />and you have to leave empty space in your volume group to support<br />them).</p>
<p>We use a similar setup for our remote backups box at work. It's a 2x<br />dual-core Opteron system with 8 GB of RAM and 12x 400 GB SATA HDs on<br />one 3Ware controller and 12x 500 GB SATA HDs on a second 3Ware<br />controller (all configured as Single Disks), running FreeBSD 7-STABLE<br />off a pair of 2 GB CompactFlash cards (gmirror'd). / is on the CF,<br />everything else (/usr, /usr/ports, /usr/local, /usr/ports/distfiles,<br />/usr/src, /usr/obj, /home, /tmp, /var, /storage) are ZFS filesystems<br />(the 24 drives are configured as a single raidz2 pool). There's a<br />quad-port Intel Pro/1000 gigabit NIC configured via lagg(4) as a<br />single load-balancing interface.</p>
<p>Every night, a cronjob creates a snapshot of /storage, then the server<br />connects to the remote servers via SSH, runs rsync against the entire<br />harddrive and a directory under /storage. For 37 servers, it takes<br />just under 2 hours for the rsync runs (the initial rsync can takes<br />upwards of 12 hours per server, depending on the amount of data that<br />needs to be transferred). A normal snapshot uses <4 GB.</p>
[...] similar fashion, with Hammer filesystems.

<snip>
We've used just over 1 TB to completely archive 37 servers. Daily
snapshots use <5 GB each. This particular server (9 TB) should last
us for a couple of years. :) Even after we get the full 75 remote
servers being backed up, we should be good to keep at least 6 months
of daily backups online. :)

Or, you can use the mirror feature to mirror your backup server to an
offsite server. :) That's what we're planning on doing with ours,
using the "snapshot send" and "snapshot receive" features in ZFS.

There's lots of great work going on in filesystems right now. It's
nice to see the BSDs up near the front (FreeBSD with ZFS, DFlyBSD with
Hammer) again.

Comment by dillon (2008-09-01 07:20):
:We've used just over 1 TB to completely archive 37 servers. Daily
:snapshots use <5 GB each. This particular server (9 TB) should last
:us for a couple of years. :) Even after we get the full 75 remote
:servers being backed up, we should be good to keep at least 6 months
:of daily backups online. :)

    It is a far cry from the tape backups we all had to use a decade ago.
    These days if the backups aren't live, they are virtually worthless.

:Or, you can use the mirror feature to mirror your backup server to an
:offsite server. :) That's what we're planning on doing with ours,
:using the "snapshot send" and "snapshot receive" features in ZFS.
    Ah, yes. I should have mentioned that. It is an excellent way to
    bridge from a non-HAMMER filesystem to a HAMMER filesystem. At the
    moment my off-site backup system is running linux (I'm stealing a 700G
    disk from a friend of mine) so I can't run HAMMER, but hopefully at some
    point before the 2.2 release I'll be able to get my new DFly colo server
    installed in the same colo facility, and then I will be able to use
    the mirroring stream to back up from the LAN backup machine to the
    off-site backup machine, HAMMER-to-HAMMER.
:There's lots of great work going on in filesystems right now. It's
:nice to see the BSDs up near the front (FreeBSD with ZFS, DFlyBSD with
:Hammer) again.
:
:--
:Freddie Cash
:fjwcash@gmail.com

    I think the linux folks have wandered a little, but it only goes to
    show that major filesystem design is the work of individuals, not OS
    projects.

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by ftigeot (2008-09-04 15:30):
The panic occurred again with kern.maxvnodes set to 70000.
I have reduced it to 35000; we will see if the system is still stable in a
few days...
> You don't need to use the hardlink trick if backing up to a HAMMER
> filesystem. I still need to write utility support to streamline
> the user interface but basically all you have to do is use rdist, rsync,
> or cpdup (without the hardlink trick) to overwrite the same destination
> directory on the HAMMER backup system, then generate a snapshot
> softlink. Repeat each day.

I agree this is a better way with Hammer. I just don't want to use something
too different from my other backup servers for the time being...

Comment by dillon (2008-09-04 23:29):
:> MFC'd to 2.0_Slip so the calculation must still be off. Reducing
:> maxvnodes should solve the panic. The basic problem is that HAMMER's
:> struct hammer_inode is larger than struct vnode so the vnode limit
:> calculations wind up being off.
:
:The panic occurred again with kern.maxvnodes set to 70000.
:I have reduced it to 35000; we will see if the system is still stable in a
:few days...

    That doesn't sound right; it should have been fine at 70000. Do
    you have a kernel & core crashdump I can look at? (Email me privately
    if you do.)
                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by ftigeot (2008-09-05 00:44):
Unfortunately, this machine wasn't configured to save a crash dump.

I have now set up dumpdev and reset maxvnodes to the default value. We
should get a dump in a few days; the interval between crashes has never
exceeded a week.

Comment by ftigeot (2008-09-11 21:14):
I've got a new panic.

How can I be sure to get a crash dump? This machine actually panicked
once again before, but for some reason didn't dump the core.

It is sitting at the kernel debugger screen for the moment.

Comment by dillon (2008-09-12 00:14):
:I've got a new panic.
:
:How can I be sure to get a crash dump? This machine actually panicked
:once again before, but for some reason didn't dump the core.
:
:It is sitting at the kernel debugger screen for the moment.
:
:--
:Francois Tigeot

    It has to be set up beforehand. If it isn't, there isn't much you can
    do from the debugger. If it is set up beforehand, you can type 'panic'
    at the debugger prompt and hit return twice (usually) and it will dump
    before rebooting.
    Generally speaking, you set up to get a crash dump like this:
    * Have enough swap space to cover main memory, i.e. if you have 4G of
      RAM, you need 4G of swap.

    * Set dumpdev to point to the swap device in /etc/rc.conf. Example:
      dumpdev=/dev/ad6s1b. Takes effect when you reboot; you can also
      manually set the dumpdev on the running system by running
      'dumpon /dev/ad6s1b'.

    * Add kern.sync_on_panic=0 to your /etc/sysctl.conf to tell the
      system not to try to flush the buffer cache when it crashes. This
      improves its chances of being able to get to the dump code.
    You can set the kernel up to automatically reboot on a panic (and
    generate a crash dump if it has been set up to do one) by compiling
    the kernel with:

        options DDB
        options DDB_TRACE
        options DDB_UNATTENDED
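Pulled together, the setup above looks roughly like this (ad6s1b is the
example device from the list; substitute your own swap partition):

    # /etc/rc.conf -- dump to the swap device
    dumpdev=/dev/ad6s1b

    # /etc/sysctl.conf -- don't flush the buffer cache on panic
    kern.sync_on_panic=0

    # Apply to the running system without a reboot:
    dumpon /dev/ad6s1b
    sysctl kern.sync_on_panic=0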
                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by ftigeot (2008-09-16 00:35):

[...]

Thanks for the instructions, I was finally able to get a crash dump.

I have put the content of /var/crash at this location:
http://www.wolfpond.org/crash.dfly/

Comment by dillon (2008-09-17 02:37):
    Ok, I'm looking at the core. There do not appear to be any
    memory leaks, but HAMMER got behind on reclaiming inodes whose ref
    count has dropped to 0.

    In looking at the code I see a case that I am not handling in
    VOP_SETATTR. Was the code you were running doing a lot of chmod,
    chown, or other operations on file paths that do not require open()ing
    the file?

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by ftigeot (2008-09-17 03:42):
Definitely.

Every time I got a crash, the machine was re-creating an hourly rsnapshot
directory tree from the previous one. It should have been a mix of mkdir /
chmod / chown ...

Comment by dillon (2008-09-17 06:00):
:Definitely.
:
:Every time I got a crash, the machine was re-creating an hourly rsnapshot
:directory tree from the previous one. It should have been a mix of mkdir /
:chmod / chown ...
:
:--
:Francois Tigeot

    Ok, please try the patch below. This is kind of a kitchen-sink approach
    and will reduce performance somewhat when doing lots of
    hardlinks/chmods/etc, but I want to see if it deals with the problem.

    Also reduce kern.maxvnodes to 100000.

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>
Index: hammer_vnops.c
===================================================================
RCS file: /cvs/src/sys/vfs/hammer/hammer_vnops.c,v
retrieving revision 1.96
diff -u -p -r1.96 hammer_vnops.c
--- hammer_vnops.c	9 Aug 2008 07:04:16 -0000	1.96
+++ hammer_vnops.c	16 Sep 2008 22:52:12 -0000
@@ -1038,6 +1038,7 @@ hammer_vop_nlink(struct vop_nlink_args *
 		cache_setvp(nch, ap->a_vp);
 	}
 	hammer_done_transaction(&trans);
+	hammer_inode_waitreclaims(dip->hmp);
 	return (error);
 }

@@ -1108,6 +1109,7 @@ hammer_vop_nmkdir(struct vop_nmkdir_args
 		}
 	}
 	hammer_done_transaction(&trans);
+	hammer_inode_waitreclaims(dip->hmp);
 	return (error);
 }

@@ -1873,6 +1875,8 @@ done:
 	if (error == 0)
 		hammer_modify_inode(ip, modflags);
 	hammer_done_transaction(&trans);
+	if (ap->a_vp->v_opencount == 0)
+		hammer_inode_waitreclaims(ip->hmp);
 	return (error);
 }

Comment by ftigeot (2008-09-18 04:08):
Done.

If you don't hear about any new crash for a week, it means this patch is
good.

Comment by ftigeot (2008-09-21 14:43):
The machine has been stable so far.

I just noticed these unusual messages in the logs today:

Sep 21 09:04:29 akane kernel: HAMMER: Warning: UNDO area too small!
Sep 21 09:05:00 akane kernel: HAMMER: Warning: UNDO area too small!
Sep 21 09:06:11 akane kernel: HAMMER: Warning: UNDO area too small!

The time corresponds to an rsnapshot hourly run.

I had to reboot this machine for an unrelated problem. We should wait
a few more days to be sure the patch really fixes things.

I was never able to get more than 5-6 days of uptime before.

Comment by dillon (2008-09-22 00:52):
:The machine has been stable so far.
:
:I just noticed these unusual messages in the logs today:
:
:Sep 21 09:04:29 akane kernel: HAMMER: Warning: UNDO area too small!
:Sep 21 09:05:00 akane kernel: HAMMER: Warning: UNDO area too small!
:Sep 21 09:06:11 akane kernel: HAMMER: Warning: UNDO area too small!
:
:The time corresponds to an rsnapshot hourly run.
:...
:Francois Tigeot
    How large is the filesystem your rsnapshot is writing to?

    HAMMER tries to estimate how much space dependencies take up in the
    UNDO FIFO and tries to split the work up into multiple flush cycles
    such that each flush cycle does not exhaust the UNDO space.

    The warning means that the HAMMER backend had to issue a flush cycle
    before it really wanted to, potentially causing some directory
    dependencies to get split between two flush cycles. If a crash were
    to occur during those particular flush cycles, the hard link count
    between file and directory entry could wind up being wrong.

    I think the problem may be caused by the hardlink trick you are using
    to duplicate directory trees. HAMMER's estimator is probably not
    taking into account the tens of thousands of hardlinks (directory-to-file
    link count dependencies) and directory-to-directory dependencies from
    creating the target directory hierarchy that can build up when files are
    simply being linked.

    For now, keep watch on it. The warning itself is not a big deal.
    If HAMMER panics on insufficient UNDO space, though, that's a
    different matter.
                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by ftigeot (2008-09-22 01:05):
It is a single volume on a 400GB disk:

$ df -ih .
Filesystem    Size   Used  Avail Capacity   iused   ifree  %iused  Mounted on
Backup        371G   314G    57G    85%   2642085       0    100%  /backup

Comment by dillon (2008-09-22 01:23):
:It is a single volume on a 400GB disk:
:
:$ df -ih .
:Filesystem    Size   Used  Avail Capacity   iused   ifree  %iused  Mounted on
:Backup        371G   314G    57G    85%   2642085       0    100%  /backup
:
:--
:Francois Tigeot

    Interesting. It should have a full-sized undo area, which means the
    dependencies resulted in 600MB+ worth of undos. I'll have to start
    testing with hardlinks.
    Be sure to regularly prune and reblock that sucker. If you aren't
    using the history feature, I expect you'll want to mount it 'nohistory'
    too. I found out the hard way that one still needs to spend about
    5 minutes a day reblocking a HAMMER filesystem to keep fragmentation
    in check.
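For instance (hedged: this assumes the nohistory mount option and file
flag behave as documented in later HAMMER releases, and the rsnapshot
path is illustrative):

    # Disable fine-grained history for the whole filesystem:
    mount -u -o nohistory /backup

    # Or exempt just the rsnapshot working tree:
    chflags -R nohistory /backup/rsnapshot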
                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by ftigeot (2008-09-25 03:25):
The machine is still stable, with 4 days of uptime.

I have found a new strange warning in the logs:
Warning: BTREE_REMOVE: Defering parent removal2 @ 80000058efe06000, skipping

It occurred during an rsnapshot hourly run.

Comment by dillon (2008-09-25 04:44):
:The machine is still stable, with 4 days of uptime.
:
:I have found a new strange warning in the logs:
:Warning: BTREE_REMOVE: Defering parent removal2 @ 80000058efe06000, skipping
:
:It occurred during an rsnapshot hourly run.
:
:--
:Francois Tigeot

    You can ignore that one, it's harmless.

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by matthias (2008-10-21 16:09):
Hi,

I have encountered the same panic on one of my machines running HAMMER
here:

panic: hammer-inodes: malloc limit exceeded
mp_lock = 00000000; cpuid = 0
Trace beginning at frame 0xdf0b1a20
panic(df0b1a44,c03f67c0,ff80048c,248,df0b1a68) at panic+0x142
panic(c039cecf,c03a501c,0,11,ff800000) at panic+0x142
kmalloc(248,c03f67c0,102,db54f000,c02e3a62) at kmalloc+0xa5
hammer_create_inode(df0b1ac0,df0b1b5c,de4efa88,e268b738,0) at hammer_create_inode+0x26
hammer_vop_ncreate(df0b1b00,c03eab70,c41090b8,0,0) at hammer_vop_ncreate+0x72
vop_ncreate(c41090b8,df0b1c84,e157a9b8,df0b1c08,de4efa88) at vop_ncreate+0x3d
vn_open(df0b1c84,d746e638,603,1a4,c410ecf8) at vn_open+0xf3
kern_open(df0b1c84,602,1b6,df0b1cf0,ec0636c8) at kern_open+0x84
sys_open(df0b1cf0,c192e8,0,da362b78,c03db85c) at sys_open+0x32
syscall2(df0b1d40) at syscall2+0x240
Xint0x80_syscall() at Xint0x80_syscall+0x36
Debugger("panic")

CPU0 stopping CPUs: 0x00000002
 stopped
panic: from debugger
mp_lock = 00000000; cpuid = 0
boot() called on cpu#0
Uptime: 20d22h48m26s

dumping to dev #ad/0x20051, blockno 4269104
dump
Fatal double fault:
eip = 0xc03733f1
esp = 0xdf0aefb0
ebp = 0xdf0af024
mp_lock = 00000000; cpuid = 0; lapic.id = 00000000
panic: double fault
mp_lock = 00000000; cpuid = 0
boot() called on cpu#0
Uptime: 20d22h48m26s
Dump already in progress, bailing...
spin_lock: 0xc4107d6c, indefinite wait!
spin_lock: 0xc4107d64, indefinite wait!
Shutting down ACPI
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.
Rebooting...
Unfortunately no crash dump :( The machine is running HEAD from Tue Sep
30 11:47:27 CEST 2008. It is an Intel C2D 3GHz with 2GB RAM running an
SMP kernel. The fs layout is as follows:
ROOT                             292G   60G  232G  21%  /
/dev/ad10s1a                     252M  138M   94M  59%  /boot
/pfs/@@0xffffffffffffffff:00001  292G   60G  232G  21%  /usr
/pfs/@@0xffffffffffffffff:00003  292G   60G  232G  21%  /var
/pfs/@@0xffffffffffffffff:00006  292G   60G  232G  21%  /tmp
/pfs/@@0xffffffffffffffff:00007  292G   60G  232G  21%  /home
/pfs/@@0xffffffffffffffff:00005  292G   60G  232G  21%  /var/tmp
/pfs/@@0xffffffffffffffff:00002  292G   60G  232G  21%  /usr/obj
/pfs/@@0xffffffffffffffff:00004  292G   60G  232G  21%  /var/crash
The machine performed a pkgsrc "cvs update" before it crashed. If more
information is needed, I'll provide it. After the reboot kern.maxvnodes
is 129055, if that matters ...

Regards

    Matthias

Comment by dillon (2008-10-21 23:42):
:The machine performed a pkgsrc "cvs update" before it crashed. If more
:information is needed, I'll provide it. After the reboot kern.maxvnodes
:is 129055, if that matters ...
:
:Regards
:
:	Matthias

    It's the same issue. Drop kern.maxvnodes to 100000.

    I am going to add an API to set the kmalloc pool's limit so HAMMER
    can size it according to the size of hammer_inode.

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by aoiko (2008-12-01 20:29):
Should this be closed?

Comment by aoiko (2008-12-01 21:42):

Fix committed by dillon@

Comment by qhwt+dfly (2008-12-27 11:59):
Do I still need to lower kern.maxvnodes to avoid the panic on a machine
with >=2GB of RAM? I still see this panic while running blogbench
for a couple of hours, without increasing or decreasing kern.maxvnodes.

    (kgdb) bt
    :
    #2  0xc0198e0c in panic (fmt=0xc02e71dd "%s: malloc limit exceeded")
        at /home/source/dragonfly/current/src/sys/kern/kern_shutdown.c:800
    #3  0xc0196a4f in kmalloc (size=584, type=0xc4170010, flags=258)
        at /home/source/dragonfly/current/src/sys/kern/kern_slaballoc.c:490
    #4  0xc0260056 in hammer_get_inode (trans=0xde16db20, dip=0xe47032d0,
        obj_id=180316461440, asof=18446744073709551615, localization=131072,
        flags=0, errorp=0xde16da68)
        at /home/source/dragonfly/current/src/sys/vfs/hammer/hammer_inode.c:376
    #5  0xc026fc95 in hammer_vop_nresolve (ap=0xde16db78)
        at /home/source/dragonfly/current/src/sys/vfs/hammer/hammer_vnops.c:924
    #6  0xc01ee2d4 in vop_nresolve_ap (ap=0xde16db78)
        at /home/source/dragonfly/current/src/sys/kern/vfs_vopops.c:1613
    #7  0xde35b032 in ?? ()
    :

    (kgdb) p *type
    $10 = {ks_next = 0xc416ff50, ks_memuse = {55452800, 51921920,
        0 <repeats 14 times>}, ks_loosememuse = 107374720, ks_limit = 107374182,
      ks_size = 0, ks_inuse = {86645, 81128, 0 <repeats 14 times>},
      ks_calls = 1694049, ks_maxused = 0, ks_magic = 877983977,
      ks_shortdesc = 0xc02f039d "HAMMER-inodes", ks_limblocks = 0,
      ks_mapblocks = 0, ks_reserved = {0, 0, 0, 0}}

Comment by dillon (2008-12-27 14:00):
:Do I still need to lower kern.maxvnodes to avoid the panic on a machine
:with >=2GB of RAM? I still see this panic while running blogbench
:for a couple of hours, without increasing or decreasing kern.maxvnodes.
:
: (kgdb) bt
: :
: #2  0xc0198e0c in panic (fmt=0xc02e71dd "%s: malloc limit exceeded")
:     at /home/source/dragonfly/current/src/sys/kern/kern_shutdown.c:800
    I'd like to get another crash dump if possible, before you lower the
    limit. There is still clearly an issue which I would like to get
    fixed before the January release.

    Once you get a crash dump over to leaf then please lower the limit
    and see if you can panic the machine.

                                        -Matt

Comment by qhwt+dfly (2008-12-27 17:47):
Ok, scp'ed as ~y0netan1/crash/{kernel,vmcore}.4 .

Sure. Oh, BTW, although this machine has two HAMMER partitions mounted
(/HAMMER and /var/vkernel), I was only using the former for blogbench;
the latter was mounted but totally idle.

Comment by qhwt+dfly (2008-12-28 12:03):
I don't know exactly how desiredvnodes limits the amount of kmalloc's,
but I did notice that there are two places where it's used to compute
HAMMER-related values:

  - hammer_vfs_init(): vfs.hammer.limit_iqueued is computed at the first
    call and set to desiredvnodes / 5; after that, you need to set it
    manually.

  - hammer_vfs_mount(): if I understand the code correctly, the malloc limit
    is only updated when the HAMMER volume is unmounted and then mounted
    again.

So I went into single-user mode, set these parameters, unmounted and
re-mounted the HAMMER filesystems (see the sketch below). It seems that
with kern.maxvnodes=100000 it can still panic the machine.
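A sketch of that single-user procedure; the limit_iqueued value is an
assumption derived from the desiredvnodes / 5 rule above, and the mount
points (which appear earlier in this thread) are assumed to have fstab
entries:

    # Cap the limits, then cycle the mounts so hammer_vfs_mount()
    # recomputes the malloc limit.
    sysctl kern.maxvnodes=100000
    sysctl vfs.hammer.limit_iqueued=20000   # ~ maxvnodes / 5
    umount /var/vkernel; umount /HAMMER
    mount /HAMMER; mount /var/vkernel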
BTW, I have a few questions WRT kmalloc():

kern_slaballoc.c:478
    while (type->ks_loosememuse >= type->ks_limit) {
        int i;
        long ttl;

        for (i = ttl = 0; i < ncpus; ++i)
            ttl += type->ks_memuse[i];
        type->ks_loosememuse = ttl;     /* not MP synchronized */
        if (ttl >= type->ks_limit) {
            if (flags & M_NULLOK) {
                logmemory(malloc, NULL, type, size, flags);
                return(NULL);
            }
            panic("%s: malloc limit exceeded", type->ks_shortdesc);
        }
    }
1. Don't we need an M_LOOPOK flag, which tells kmalloc() to wait until
   the sum of ks_memuse[] becomes lower than ks_limit? Of course,
   only when !M_NULLOK && M_WAITOK.
   struct hammer_inode is fairly small in size, so there could be
   a good chance that a couple of them get reclaimed after a while.

2. I know ks_loosememuse is not MP synchronized, but ks_memuse[] is
   summed up without any locks, either. Couldn't there be a race?

3. Shouldn't the conditionals be

       while (type->ks_loosememuse + size >= type->ks_limit) {
       ...
       if (ttl + size >= type->ks_limit) ...

   to catch the situation earlier?

Thanks in advance.

Comment by dillon (2008-12-28 12:48):
:1. Don't we need an M_LOOPOK flag, which tells kmalloc() to wait until
:   the sum of ks_memuse[] becomes lower than ks_limit? Of course,
:   only when !M_NULLOK && M_WAITOK.
:   struct hammer_inode is fairly small in size, so there could be
:   a good chance that a couple of them get reclaimed after a while.

    No, because there is no guarantee that the caller won't deadlock.
    The bug is that the subsystem (HAMMER in this case) didn't control
    the allocations it was making.

:2. I know ks_loosememuse is not MP synchronized, but ks_memuse[] is
:   summed up without any locks, either. Couldn't there be a race?

    ks_loosememuse can be very wrong. Summing up ks_memuse[] will
    give a correct result, and while races can occur the difference
    will only be the difference due to the race, not some potentially
    wildly incorrect value.

:3. Shouldn't the conditionals be
:
:       while (type->ks_loosememuse + size >= type->ks_limit) {
:       ...
:       if (ttl + size >= type->ks_limit) ...
:
:   to catch the situation earlier?
:
:Thanks in advance.

    I don't think this will help. Subsystems have to control their
    memory use; the kernel can't really save them. HAMMER has an
    issue where it can allocate a virtually unlimited number of
    hammer_inode structures. I have lots of code in there to try
    to slow it down when it gets bloated, but clearly some cases are
    getting through and still causing the allocations to spiral out
    of control.

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by dillon (2008-12-28 13:36):
    What arguments to blogbench are you using, and how long does it run
    before it hits the malloc panic?

    I found one possible path where inodes can build up, but it still
    doesn't feel quite right because even that path has upwards of a
    2-second tsleep. With so many blogbench threads running in
    parallel it could be the cause, but it still ought to take a while
    to build up that many extra inodes. I think I need to reproduce the
    problem locally to determine if that path is the cause.

                                        -Matt

Comment by qhwt+dfly (2008-12-28 16:45):
I almost forgot, but fortunately vmcore.4 contains it:
blogbench -d0 -i1000 -o

`0' is the work directory, which has the nohistory flag on it.

I think it survived for about three hours according to `last':
reboot           ~                Sat Dec 27 02:37
qhwt    ttyp0    eden             Fri Dec 26 23:40 - crash (02:56)
reboot           ~                Fri Dec 26 23:39

Comment by dillon (2008-12-29 06:01):
:I almost forgot, but fortunately vmcore.4 contains it:
:blogbench -d0 -i1000 -o
:
:`0' is the work directory, which has the nohistory flag on it.
:
:I think it survived for about three hours according to `last':
:reboot           ~                Sat Dec 27 02:37
:qhwt    ttyp0    eden             Fri Dec 26 23:40 - crash (02:56)
:reboot           ~                Fri Dec 26 23:39
    Cool, I reproduced the issue. It is quite interesting. What is
    happening is that the load on HAMMER is causing the HAMMER flusher
    to get excessive deadlocks, stalling it out and preventing it from
    making any progress. Because of that, no matter how much I slow down
    new inode creation, the inode count just keeps building up.

    What is really weird is that if I ^Z the blogbench and let the flusher
    catch up, then resume it, the flusher is then able to stay caught up
    for a few minutes before starting to get behind again.

    I am experimenting with a number of possible solutions, including
    having the flusher try a different inode if it hits a deadlock, to
    see if I can prevent the stall-outs.

                                        -Matt
                                        Matthew Dillon
                                        <dillon@backplane.com>

Comment by dillon (2008-12-29 07:41):
    I've committed a HAMMER update to the master branch which should fix
    the issue revealed by blogbench.

    It is a bit of a hack, but it seems to work in my tests so far.

                                        -Matt

Comment by qhwt+dfly (2008-12-29 20:14):

No more panics so far, thanks!

Comment by dillon (2009-01-21 09:50):
Believed to be fixed now; closing.