Submit #2767

[PATCH] sys/vfs/hammer: make description on low level storage layout up-to-date with code

Added by tkusumi about 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: VFS subsystem
Target version:
Start date: 01/14/2015
Due date:
% Done: 100%


Description

This patch fixes the description of hammer's low-level storage layout, which is based on two levels of blockmap layers. It aims to make the description explicit and up to date with what the actual code does. The following describes the six hunks in this patch.

hunk 1. The per-zone storage limit is not 2^59; it is 2^(64 - zone_bits) = 2^60. Only two comments in the hammerfs source say that the zone address space is limited to 2^59, and I think both are wrong. This patch fixes them in hunks 1 and 4. The following gives some background on the zone address space, which can be seen in the existing code as well as in two publicly available documents, https://www.dragonflybsd.org/hammer/hammer.pdf and http://www.dragonflybsd.org/presentations/nycbsdcon08/, both published in 2008.

- The per-zone address space is the lower 60 bits of the 64-bit address space; the upper 4 bits are used only as a zone identifier. All zone address spaces except zones 2 and 4 are virtual: they occupy their own address space, but their addresses are eventually converted to zone 2 (raw buffer zone) addresses in order to do real I/O.
- In the backend, zone 4 (freemap zone) holds the physical offset of blockmap layer 1, and all the other zones obtain free space through the two layers of metadata that manage free space. These blockmap layers are created by the newfs_hammer command.
- The free space allocation/reservation functions in kernel space (sys/vfs/hammer/hammer_blockmap.c), which are used once the filesystem is mounted, have no 59-bit limitation. When the 64-bit address reaches the next zone (the zzzz part is incremented by 1), it simply loops back to buffer offset 0 of the current zone (same zzzz) and tries to find free space from the beginning, probably in the hope that lower space has been reblocked. This loop-back behavior shows that each zone address space logically has a capacity of the full 60 bits (not 59 bits), which equals the physical capacity of the entire filesystem. Therefore 2^59 no longer seems relevant to the current hammerfs implementation.
- Note that the 60-bit zone address space includes both the layer 1 and layer 2 blockmap structures, so the actual limit on filesystem data is less than that. The layer 1 blockmap itself takes 2^18 hammer_blockmap_layer1 entries = 2^18 * 32 bytes = 8MB. The newfs_hammer command places them in the first big block (8MB chunk) of the root volume. A layer 2 blockmap structure, hammer_blockmap_layer2, corresponds to each big block (8MB chunk) of filesystem data, so the number of layer 2 entries depends on the size and number of volumes.
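
The loop-back behavior described in the list above can be sketched as follows. This is only an illustration, assuming the zone identifier is the upper 4 bits of the 64-bit offset. ZONE_SHIFT, ZONE_MASK, OFFSET_MASK and zone_advance() are hypothetical names made up for this sketch, not identifiers from hammer_blockmap.c or hammer_disk.h, and the real allocator does much more than this.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not taken from the hammer source. */
#define ZONE_SHIFT   60                              /* upper 4 bits hold the zone id */
#define ZONE_MASK    (UINT64_C(0xF) << ZONE_SHIFT)
#define OFFSET_MASK  (~ZONE_MASK)                    /* lower 60 bits: per-zone space */

/*
 * Advance a zone offset by nbytes.  If the addition would change the
 * zzzz bits (the address has run into the next zone), loop back to
 * offset 0 of the same zone instead.
 */
static uint64_t
zone_advance(uint64_t off, uint64_t nbytes)
{
	uint64_t next = off + nbytes;

	if ((next & ZONE_MASK) != (off & ZONE_MASK))
		next = off & ZONE_MASK;	/* same zzzz, offset 0 */
	return next;
}
```

With these assumptions, an offset in zone 8 only leaves the zone after all 60 lower bits are exhausted, which is the point made above about 2^60 rather than 2^59.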

layer1/layer2 direct map, taken from sys/vfs/hammer/hammer_disk.h
zzzzvvvvvvvvoooo oooooooooooooooo oooooooooooooooo oooooooooooooooo - a 64-bit address consists of zone, volume and offset within the volume
----111111111111 1111112222222222 222222222ooooooo oooooooooooooooo - two levels of blockmap layer and offset within an 8MB chunk
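
As a rough illustration of the two views in the map above, the fields can be extracted with shifts and masks. The field widths (4/8/52 and 18/19/23) are read off the diagrams. The struct, the enum constants and the function names are hypothetical and are not the macros that hammer_disk.h actually defines.

```c
#include <stdint.h>

/* Field widths read off the two bit diagrams; all names are hypothetical. */
enum {
	ZONE_BITS = 4, VOL_BITS = 8, VOL_OFF_BITS = 52,
	L1_BITS = 18, L2_BITS = 19, BIGBLOCK_BITS = 23
};

struct decoded_offset {
	unsigned zone;			/* zzzz */
	unsigned volume;		/* vvvvvvvv */
	uint64_t vol_offset;		/* 52-bit offset within the volume */
	uint32_t l1_index;		/* 18-bit layer 1 index */
	uint32_t l2_index;		/* 19-bit layer 2 index */
	uint32_t bigblock_offset;	/* 23-bit offset within an 8MB chunk */
};

/* Extract `width` bits of v starting at bit position `shift`. */
static uint64_t
bits(uint64_t v, unsigned shift, unsigned width)
{
	return (v >> shift) & ((UINT64_C(1) << width) - 1);
}

static struct decoded_offset
decode_offset(uint64_t off)
{
	struct decoded_offset d;

	/* First view: zone, volume, offset within volume. */
	d.zone = (unsigned)bits(off, VOL_BITS + VOL_OFF_BITS, ZONE_BITS);
	d.volume = (unsigned)bits(off, VOL_OFF_BITS, VOL_BITS);
	d.vol_offset = bits(off, 0, VOL_OFF_BITS);

	/* Second view: the same lower 60 bits split as 18 + 19 + 23. */
	d.l1_index = (uint32_t)bits(off, L2_BITS + BIGBLOCK_BITS, L1_BITS);
	d.l2_index = (uint32_t)bits(off, BIGBLOCK_BITS, L2_BITS);
	d.bigblock_offset = (uint32_t)bits(off, 0, BIGBLOCK_BITS);
	return d;
}
```

Note that the layer 1 index overlaps the volume bits, so in this sketch the same 60 bits are simply read in two different ways.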

hunk 2. Most 64-bit offset variables are aligned to HAMMER_BUFSIZE bytes. I don't see them being aligned to 64 bytes.

hunk 3. Add the missing description for zone 11 (small data zone).

hunk 4. Layer 2 handles 19 bits, which brings the total to 60 bits (1EB) of address space for the entire hammerfs. A similar comment at L264 already says that layer 2 consists of 19 bits, not 18 bits.

hunk 5. Explicitly use "zone:4" instead of "z:4".

hunk 6. Explicitly show that 18+19+23=60 bits totals 2^60 = 1EB, just as the next line does for 2^(19+23) = 4TB.
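
The arithmetic behind these totals can be checked with a few lines of C. This is a sketch using only the numbers quoted in this description (18, 19 and 23 bits, and 32 bytes per layer 1 entry). The constant and function names are hypothetical and do not come from hammer_disk.h.

```c
#include <stdint.h>

/* Numbers quoted in the description; names are hypothetical. */
enum {
	LAYER1_BITS = 18,		/* layer 1 index width */
	LAYER2_BITS = 19,		/* layer 2 index width */
	CHUNK_BITS = 23,		/* offset within an 8MB big block */
	LAYER1_ENTRY_BYTES = 32		/* size of one layer 1 entry */
};

/* Whole per-zone address space: 2^(18+19+23) = 2^60 bytes. */
static uint64_t
zone_space_bytes(void)
{
	return UINT64_C(1) << (LAYER1_BITS + LAYER2_BITS + CHUNK_BITS);
}

/* Address range under one layer 1 index: 2^(19+23) bytes. */
static uint64_t
layer1_span_bytes(void)
{
	return UINT64_C(1) << (LAYER2_BITS + CHUNK_BITS);
}

/* One big block: 2^23 bytes. */
static uint64_t
bigblock_bytes(void)
{
	return UINT64_C(1) << CHUNK_BITS;
}

/* Full layer 1 table: 2^18 entries of 32 bytes each. */
static uint64_t
layer1_table_bytes(void)
{
	return (UINT64_C(1) << LAYER1_BITS) * LAYER1_ENTRY_BYTES;
}
```

Under these numbers the full layer 1 table comes out to exactly one 8MB big block, which is consistent with newfs_hammer placing it in the first big block of the root volume.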

Associated revisions

Revision 23234e40 (diff)
Added by tkusumi about 2 years ago

sys/vfs/hammer: make description on low level storage layout up-to-date with code

- This patch fixes description regarding hammer's low level storage
layout based on two levels of blockmap layer. It aims to make the
description explicit and up-to-date with what the actual code
does.

Closes: #2767

History

#1 Updated by tuxillo about 2 years ago

  • Category set to VFS subsystem
  • Assignee set to tuxillo
  • Target version set to 4.2.x

#2 Updated by tkusumi about 2 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100
