From 13acd533d4630bcb9442be591c37433fcd864600 Mon Sep 17 00:00:00 2001
From: Tomohiro Kusumi
Date: Tue, 13 Jan 2015 20:13:28 +0900
Subject: [PATCH] sys/vfs/hammer: make description on low level storage layout up-to-date with code

This patch fixes the description of hammer's low level storage layout,
which is based on two levels of blockmap layers. It aims to make the
description explicit and up-to-date with what the actual code does.
The following describes the six hunks within this patch.

hunk 1.
Per-zone storage limit is not 2^59, it's (2 ^ (64 - zone_bits)) = 2^60.
There are (only) two comments within the hammerfs source that say the
limit of the zone address space is 2^59, and I think these two are
wrong. This patch fixes them in hunk 1 and 4.

The following explains some background on the zone address space that
you can see from the existing code as well as from two publicly
available documents,
https://www.dragonflybsd.org/hammer/hammer.pdf
and
http://www.dragonflybsd.org/presentations/nycbsdcon08/
both published in 2008.

- Per-zone address space is the lower 60 bits of the 64 bits address
  space, where the upper 4 bits are merely used as a zone identifier.
  All the zone address spaces except for zone 2 and 4 are actually
  virtual address spaces that are eventually converted to a zone 2
  (raw buffer zone) address in order to do real i/o. They virtually
  occupy their own address space, but they all get converted to a
  single zone 2 address for real i/o.

- In the backend, zone 4 (freemap zone) has the physical offset of the
  blockmap layer 1, and all the rest of the zones get free space by
  having access to two layers of metadata that manage free space.
  These blockmap layers are created by the newfs_hammer command.

- Free space allocation/reservation functions within kernel space
  (sys/vfs/hammer/hammer_blockmap.c) that are used once the fs is
  mounted don't have any limitation regarding 59 bits.
  Once the 64 bits address reaches the next zone (the zzzz part gets
  +1), it simply loops back to buffer offset 0 of the current zone
  (same zzzz) and tries to find free space from the beginning,
  probably hoping that lower space has been reblocked. This loop back
  behavior shows that each zone address space logically has a capacity
  of the full 60 bits (not 59 bits), which equals the physical
  capacity of the entire filesystem. Therefore 2^59 no longer seems
  relevant to the current hammerfs implementation.

- Note that the 60 bits of zone address space include both the layer 1
  and layer 2 blockmap structures, thus the actual filesystem data
  limit is less than that. The layer 1 blockmap structure itself takes
  2^18 hammer_blockmap_layer1 = 2^18 * 32 bytes = 8MB. The
  newfs_hammer command actually locates them in the first big block
  (8MB chunk) of the root volume. A layer 2 blockmap structure
  hammer_blockmap_layer2 corresponds to every big block (8MB chunk) of
  filesystem data, therefore the # of layer 2 blockmaps depends on the
  size and # of volumes.

layer1/layer2 direct map taken from sys/vfs/hammer/hammer_disk.h

zzzzvvvvvvvvoooo oooooooooooooooo oooooooooooooooo oooooooooooooooo
- 64 bits address consists of zone, volume and offset within volume

----111111111111 1111112222222222 222222222ooooooo oooooooooooooooo
- two levels of blockmap layer and offset within 8MB chunk

hunk 2.
Most 64 bits offset variables are HAMMER_BUFSIZE bytes aligned.
I don't see them being 64 bytes aligned.

hunk 3.
Add missing description for zone 11 (small data zone).

hunk 4.
The layer 2 handles 19 bits, which totals 60 bits (1EB) of address
space for the entire hammerfs. There is a similar comment at L264
which says the layer 2 consists of 19 bits, not 18 bits.

hunk 5.
Explicitly use "zone:4" instead of "z:4".

hunk 6.
Explicitly show that 18+19+23=60 totals 2^60 = 1EB, as the next line
does the same for 2^(19+23) = 4TB.
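As a rough illustration only (not part of this patch), the two views of
a 64 bits address described in hunk 1 can be decoded as sketched below.
All ex_* names are made up for this sketch and are not the macros or
types in hammer_disk.h; only the 4/8/52 and 4/18/19/23 bit widths are
taken from the layout above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; bit widths follow the layout above. */
enum {
	EX_ZONE_BITS     = 4,	/* zzzz */
	EX_VOLUME_BITS   = 8,	/* vvvvvvvv */
	EX_LAYER1_BITS   = 18,	/* layer 1 index */
	EX_LAYER2_BITS   = 19,	/* layer 2 index */
	EX_BIGBLOCK_BITS = 23	/* offset within an 8MB big block */
};

/* 18+19+23 must cover exactly the 60 bits left after the zone id. */
static_assert(EX_LAYER1_BITS + EX_LAYER2_BITS + EX_BIGBLOCK_BITS ==
	      64 - EX_ZONE_BITS, "per-zone address space is 60 bits");

struct ex_decoded {
	unsigned zone;			/* upper 4 bits */
	unsigned volume;		/* next 8 bits (zone/volume/offset view) */
	uint32_t layer1_index;		/* blockmap view */
	uint32_t layer2_index;		/* blockmap view */
	uint32_t bigblock_offset;	/* blockmap view */
};

/* Return the low 'bits' bits of 'v'. */
static uint64_t ex_low_bits(uint64_t v, int bits)
{
	return v & ((UINT64_C(1) << bits) - 1);
}

/* Split one 64 bits address according to both views of the same bits. */
static struct ex_decoded ex_decode(uint64_t off)
{
	struct ex_decoded d;

	d.zone = (unsigned)(off >> (64 - EX_ZONE_BITS));
	d.volume = (unsigned)ex_low_bits(
	    off >> (64 - EX_ZONE_BITS - EX_VOLUME_BITS), EX_VOLUME_BITS);
	d.layer1_index = (uint32_t)ex_low_bits(
	    off >> (EX_LAYER2_BITS + EX_BIGBLOCK_BITS), EX_LAYER1_BITS);
	d.layer2_index = (uint32_t)ex_low_bits(
	    off >> EX_BIGBLOCK_BITS, EX_LAYER2_BITS);
	d.bigblock_offset = (uint32_t)ex_low_bits(off, EX_BIGBLOCK_BITS);
	return d;
}

/* Build an address from the blockmap view fields. */
static uint64_t ex_encode(unsigned zone, uint32_t l1, uint32_t l2,
			  uint32_t boff)
{
	return ((uint64_t)zone << (64 - EX_ZONE_BITS)) |
	       ((uint64_t)l1 << (EX_LAYER2_BITS + EX_BIGBLOCK_BITS)) |
	       ((uint64_t)l2 << EX_BIGBLOCK_BITS) |
	       (uint64_t)boff;
}
```

The volume bits and the top of the layer 1 index overlap on purpose:
they are two readings of the same 60 bits, so ex_decode() fills in both.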
---
 sys/vfs/hammer/hammer_disk.h |   15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/sys/vfs/hammer/hammer_disk.h b/sys/vfs/hammer/hammer_disk.h
index d0fed5a..32cf35f 100644
--- a/sys/vfs/hammer/hammer_disk.h
+++ b/sys/vfs/hammer/hammer_disk.h
@@ -63,7 +63,7 @@
  * 64K X-bufs are used for blocks >= a file's 1MB mark.
  *
  * Per-volume storage limit: 52 bits 4096 TB
- * Per-Zone storage limit: 59 bits 512 KTB (due to blockmap)
+ * Per-Zone storage limit: 60 bits 1 MTB
  * Per-filesystem storage limit: 60 bits 1 MTB
  */
 #define HAMMER_BUFSIZE 16384
@@ -104,7 +104,7 @@
  *
  * Hammer offsets are used for FIFO indexing and embed a cycle counter
  * and volume number in addition to the offset. Most offsets are required
- * to be 64-byte aligned.
+ * to be 16 KB aligned.
  */
 typedef u_int64_t hammer_tid_t;
 typedef u_int64_t hammer_off_t;
@@ -134,6 +134,7 @@ typedef u_int32_t hammer_crc_t;
  * zone 8 (z,v,o): B-Tree - actually zone-2 address
  * zone 9 (z,v,o): Record - actually zone-2 address
  * zone 10 (z,v,o): Large-data - actually zone-2 address
+ * zone 11 (z,v,o): Small-data - actually zone-2 address
  * zone 15: reserved for sanity
  *
  * layer1/layer2 direct map:
@@ -201,9 +202,9 @@ typedef u_int32_t hammer_crc_t;
  * Large-Block backing store
  *
  * A blockmap is a two-level map which translates a blockmap-backed zone
- * offset into a raw zone 2 offset. Each layer handles 18 bits. The 8M
- * large-block size is 23 bits so two layers gives us 23+18+18 = 59 bits
- * of address space.
+ * offset into a raw zone 2 offset. The layer 1 handles 18 bits and the
+ * layer 2 handles 19 bits. The 8M large-block size is 23 bits so two
+ * layers gives us 18+19+23 = 60 bits of address space.
  *
  * When using hinting for a blockmap lookup, the hint is lost when the
  * scan leaves the HINTBLOCK, which is typically several LARGEBLOCK's.
@@ -273,7 +274,7 @@ typedef struct hammer_blockmap *hammer_blockmap_t;
  * thus any space allocated via the freemap can be directly translated
  * to a zone:2 (or zone:8-15) address.
  *
- * zone-X blockmap offset: [z:4][layer1:18][layer2:19][bigblock:23]
+ * zone-X blockmap offset: [zone:4][layer1:18][layer2:19][bigblock:23]
  */
 struct hammer_blockmap_layer1 {
 	hammer_off_t blocks_free; /* big-blocks free */
@@ -324,7 +325,7 @@ typedef struct hammer_blockmap_layer2 *hammer_blockmap_layer2_t;
 #define HAMMER_BLOCKMAP_RADIX2_PERBUFFER \
 	(HAMMER_BLOCKMAP_RADIX2 / (HAMMER_LARGEBLOCK_SIZE / HAMMER_BUFSIZE))
 
-#define HAMMER_BLOCKMAP_LAYER1 /* 18+19+23 */ \
+#define HAMMER_BLOCKMAP_LAYER1 /* 18+19+23 - 1EB */ \
 	(HAMMER_BLOCKMAP_RADIX1 * HAMMER_BLOCKMAP_LAYER2)
 #define HAMMER_BLOCKMAP_LAYER2 /* 19+23 - 4TB */ \
 	(HAMMER_BLOCKMAP_RADIX2 * HAMMER_LARGEBLOCK_SIZE64)
-- 
1.7.1