Bug #884

Performance/memory problems under filesystem IO load

Added by hasso almost 7 years ago. Updated 7 months ago.

Status: In Progress
Priority: High
Assignee: -
% Done: 0%
Category: VM subsystem
Target version: 3.8.0

Description

While testing a drive with dd I noticed serious performance problems.
Programs that need disk access block for 10 seconds or more. Sometimes they
don't resume work until dd is finished. Raw disk access (i.e. writing
directly to the disk rather than to a file) is reported to be OK (I can't
test it myself).

All tests are done with this command:
dd if=/dev/zero of=./file bs=4096k count=1000

Syncing after each dd helps to reproduce it more reliably (cache?).

There is one more strange thing in running these tests. I looked at memory
stats in top before and after running dd.

Before:
Mem: 42M Active, 40M Inact, 95M Wired, 304K Cache, 53M Buf, 795M Free
After:
Mem: 70M Active, 679M Inact, 175M Wired, 47M Cache, 109M Buf, 1752K Free

And as a side effect, I can't get my network interfaces up any more after
running dd: "em0: Could not setup receive structures".

History

#1 Updated by nthery almost 7 years ago

[...]
> There is one more strange thing in running these tests. I looked at memory
> stats in top before and after running dd.
>
> Before:
> Mem: 42M Active, 40M Inact, 95M Wired, 304K Cache, 53M Buf, 795M Free
> After:
> Mem: 70M Active, 679M Inact, 175M Wired, 47M Cache, 109M Buf, 1752K Free

FWIW, I observe similar figures. I also noticed that deleting ./file
and waiting a bit restores memory to the "before" state.

The size increase in the wired pool can be reproduced more simply with:

sysctl vm.stats.vm.v_wire_count # A
dd if=/dev/zero of=./file bs=4096k count=1
sysctl vm.stats.vm.v_wire_count # B
rm ./file
sysctl vm.stats.vm.v_wire_count # C

A == C && B == A + 1
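
nthery's check can be scripted. A minimal sketch: the DragonFly-specific part (the vm.stats.vm.v_wire_count sysctl from the report) is left commented out, and the invariant itself is factored into a small portable helper so it can be checked against sample numbers:

```shell
#!/bin/sh
# Encodes nthery's observed invariant: the wired count returns to its
# starting value after rm (A == C), and writing one 4 KiB block wires
# exactly one extra page (B == A + 1).
matches_observation() {
    a=$1; b=$2; c=$3
    [ "$a" -eq "$c" ] && [ "$b" -eq $((a + 1)) ]
}

# DragonFly-specific measurement (assumed sysctl name, from the report):
# a=$(sysctl -n vm.stats.vm.v_wire_count)
# dd if=/dev/zero of=./file bs=4096 count=1 2>/dev/null
# b=$(sysctl -n vm.stats.vm.v_wire_count)
# rm ./file
# c=$(sysctl -n vm.stats.vm.v_wire_count)

# Example with illustrative numbers of the shape seen in the thread:
if matches_observation 11486 11487 11486; then
    echo "A == C && B == A + 1"
fi
```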

I traced this with gdb. The additional wired page is part of a struct
buf (b_xio) instance tied to the ./file vnode. I reckon this vnode
stays cached (namecache?) when the dd process ends and deleting ./file
forces destruction of the vnode.

AFAIU wired pages cannot be reclaimed by the pager when memory is
low. So is it normal to keep b_xio pages wired when they are "just"
cached in a vnode (i.e. no ongoing r/w operation)?

#2 Updated by nthery almost 7 years ago

Oops. I've just noticed that the block size is 4096k in the original post,
not 4096. I did all my experiments with the latter (there was a paste
error in my previous email).

#3 Updated by corecode almost 7 years ago

I can confirm that creating a 4MB file consumes 1026 wired pages.

So this is either a) insane, b) a bug, or c) a counting issue, where the
buffer cache gets counted among the wired pages.
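
A quick sanity check of the arithmetic (assuming 4 KiB pages) shows that the file data alone accounts for 1024 of those 1026 pages:

```shell
# Data pages needed to hold a 4 MiB file with 4 KiB pages:
echo $((4 * 1024 * 1024 / 4096))    # 1024
```

So only 2 of the observed wired pages are not plain file data.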

I didn't look at the code yet, but does this mean that, in addition to
the buffer cache data, this b_xio is kept as well, holding the *same* data?

cheers
simon

#4 Updated by dillon almost 7 years ago

:I can confirm that creating a 4MB file consumes 1026 wired pages.
:
:So either this is a) insane b) a bug or c) a counting issue, where the
:buffer cache gets placed with the wired pages.

They're just wired because they're in the buffer cache. The real issue
here is buffer cache management, that's all.

:>> I traced this with gdb. The additional wired page is part of a struct
:>> buf (b_xio) instance tied to the ./file vnode. I reckon this vnode
:>> stays cached (namecache?) when the dd process ends and deleting ./file
:>> forces destruction of the vnode.
:
:I didn't yet look at the code, but does this mean that additionally to
:the buffer cache data, this b_xio is kept as well, keeping the *same* data?
:
:cheers
: simon

No, the b_xio is part of the struct buf. The pages are wired only once.

This is really just a buffer cache management issue. There's no reason
the pages have to remain wired that long... the related buffer can
certainly be destroyed after having been flushed to disk under heavy-load
situations while still leaving the backing VM pages intact in the VM
page cache. I would focus any fixes in that area.

-Matt
Matthew Dillon

#5 Updated by tuxillo 7 months ago

  • Description updated (diff)
  • Category set to VM subsystem
  • Status changed from New to In Progress
  • Assignee deleted (0)
  • Target version set to 3.8.0

Hi,

I've done the same test under a vkernel:

# sysctl vm.stats.vm.v_wire_count
vm.stats.vm.v_wire_count: 11486
# dd if=/dev/zero of=./file bs=4m count=1
1+0 records in
1+0 records out
4194304 bytes transferred in 0.011742 secs (357201747 bytes/sec)
# sysctl vm.stats.vm.v_wire_count
vm.stats.vm.v_wire_count: 11675
# rm file
# sysctl vm.stats.vm.v_wire_count
vm.stats.vm.v_wire_count: 10647

And the same test on real hardware:

antonioh@nas:~$ sysctl vm.stats.vm.v_wire_count
vm.stats.vm.v_wire_count: 379492
antonioh@nas:~$ dd if=/dev/zero of=./file bs=4m count=1
1+0 records in
1+0 records out
4194304 bytes transferred in 0.035698 secs (117494297 bytes/sec)
antonioh@nas:~$ sysctl vm.stats.vm.v_wire_count
vm.stats.vm.v_wire_count: 379500
antonioh@nas:~$ rm file
antonioh@nas:~$ sysctl vm.stats.vm.v_wire_count
vm.stats.vm.v_wire_count: 378476
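
The wired-page deltas from the two runs above (simple arithmetic on the numbers reported):

```shell
# Pages wired by the dd run: count after dd minus count before.
echo $((11675 - 11486))     # vkernel: 189
echo $((379500 - 379492))   # real hardware: 8
```

Both deltas are far below the ~1024 pages a fully buffered 4 MiB file would occupy.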

I don't see the high usage corecode showed in his test.

Matt, there has been a ton of work in the VM subsystem since then; is it possible that this is no longer the case?

Best regards,
Antonio Huete

#6 Updated by alexh 7 months ago

On 2014-02-18 14:47, wrote:
> I don't see the high usage corecode showed in his test.
>
> Matt, there was a ton of work in the VM subsystem, is it possible that
> this is not the case anymore?

Well, we most definitely have quite bad problems under heavy I/O load.
Interactive performance drops significantly. Getting BFQ or similar
dsched policy up to scratch would go a long way here.

Cheers,
Alex
