@@ -5,6 +5,44 @@ The MSF File Format
  .. contents::
     :local:

+ .. _msf_layout:
+
+ File Layout
+ ===========
+
+ The MSF file format consists of the following components:
+
+ 1. :ref:`msf_superblock`
+ 2. :ref:`msf_freeblockmap` (also known as the Free Page Map, or FPM)
+ 3. Data
+
+ Each component is stored as an indexed block, the length of which is specified
+ in ``SuperBlock::BlockSize``. The file consists of 1 or more iterations of the
+ following pattern (sometimes referred to as an "interval"):
+
+ 1. 1 block of data
+ 2. Free Block Map 1 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 1)
+ 3. Free Block Map 2 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 2)
+ 4. ``SuperBlock::BlockSize - 3`` blocks of data
+
+ In the first interval, the first data block is used to store
+ :ref:`msf_superblock`.
+
+ The following diagram demonstrates the general layout of the file (\| denotes
+ the end of an interval, and is for visualization purposes only):
+
+ +-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
+ | Block Index | 0                     | 1                | 2                | 3 - 4095 | \| | 4096 | 4097 | 4098 | 4099 - 8191 | \| | ... |
+ +=============+=======================+==================+==================+==========+====+======+======+======+=============+====+=====+
+ | Meaning     | :ref:`msf_superblock` | Free Block Map 1 | Free Block Map 2 | Data     | \| | Data | FPM1 | FPM2 | Data        | \| | ... |
+ +-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
+
+ The file may end after any block, including immediately after an FPM1.
+
+ .. note::
+   LLVM only supports 4096 byte blocks (sometimes referred to as the "BigMsf"
+   variant), so the rest of this document will assume a block size of 4096.
+
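The interval pattern described above can be sketched as a small classifier.
This is only an illustration under stated assumptions, not LLVM's
implementation: ``BlockKind`` and ``classifyBlock`` are hypothetical names, and
the default block size of 4096 follows the note above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not part of LLVM's API.
enum class BlockKind { SuperBlock, Fpm1, Fpm2, Data };

// Classifies a block by its index within the file, following the interval
// pattern: position 0 is data (the SuperBlock in the first interval),
// position 1 is FPM1, position 2 is FPM2, and the remaining
// BlockSize - 3 positions are data.
inline BlockKind classifyBlock(std::uint32_t blockIndex,
                               std::uint32_t blockSize = 4096) {
  assert(blockSize >= 3 && "an interval needs room for both FPM blocks");
  if (blockIndex == 0)
    return BlockKind::SuperBlock;
  switch (blockIndex % blockSize) {
  case 1:
    return BlockKind::Fpm1;
  case 2:
    return BlockKind::Fpm2;
  default:
    return BlockKind::Data;
  }
}
```

With a block size of 4096, ``classifyBlock(4097)`` yields ``BlockKind::Fpm1``
and ``classifyBlock(4096)`` yields ``BlockKind::Data``, matching the diagram.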
  .. _msf_superblock:

  The Superblock
@@ -32,14 +70,9 @@ follows:
    sizes of 4KiB, and all further discussion assumes a block size of 4KiB.
  - **FreeBlockMapBlock** - The index of a block within the file, at which begins
    a bitfield representing the set of all blocks within the file which are "free"
-   (i.e. the data within that block is not used). This bitfield is spread across
-   the MSF file at ``BlockSize`` intervals.
-   **Important**: ``FreeBlockMapBlock`` can only be ``1`` or ``2``! This field
-   is designed to support incremental and atomic updates of the underlying MSF
-   file. While writing to an MSF file, if the value of this field is `1`, you
-   can write your new modified bitfield to page 2, and vice versa. Only when
-   you commit the file to disk do you need to swap the value in the SuperBlock
-   to point to the new ``FreeBlockMapBlock``.
+   (i.e. the data within that block is not used). See :ref:`msf_freeblockmap` for
+   more information.
+   **Important**: ``FreeBlockMapBlock`` can only be ``1`` or ``2``!
  - **NumBlocks** - The total number of blocks in the file. ``NumBlocks * BlockSize``
    should equal the size of the file on disk.
  - **NumDirectoryBytes** - The size of the stream directory, in bytes. The stream
@@ -53,7 +86,32 @@ follows:
    contains the list of blocks that the stream directory occupies, and the stream
    directory itself can be stitched together accordingly. The number of
    ``ulittle32_t``'s in this array is given by ``ceil(NumDirectoryBytes / BlockSize)``.
-
+
+ .. _msf_freeblockmap:
+
+ The Free Block Map
+ ==================
+
+ The Free Block Map (sometimes referred to as the Free Page Map, or FPM) is a
+ series of blocks which contains a bit flag for every block in the file. The
+ flag will be set to 0 if the block is in use, and 1 if the block is unused.
+
+ Each file contains two FPMs, one of which is active at any given time. This
+ feature is designed to support incremental and atomic updates of the underlying
+ MSF file. While writing to an MSF file, if the active FPM is FPM1, you can
+ write your new modified bitfield to FPM2, and vice versa. Only when you commit
+ the file to disk do you need to swap the value in the SuperBlock to point to
+ the new ``FreeBlockMapBlock``.
+
+ The Free Block Maps are stored as a series of single blocks throughout the file
+ at intervals of ``BlockSize``. Because each FPM block is of size ``BlockSize``
+ bytes, it contains 8 times as many bits as an interval has blocks. This means
+ that the first block of each FPM refers to the first 8 intervals of the file
+ (the first 32768 blocks), the second block of each FPM refers to the next 8
+ intervals, and so on. This results in far more FPM blocks being present than are
+ required, but in order to maintain backwards compatibility the format must stay
+ this way.
+
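The arithmetic above can be sketched as a lookup from a block index to the
location of its flag. This is a hedged sketch rather than LLVM's
implementation: ``FpmBitLocation`` and ``locateFpmBit`` are hypothetical names,
and the bit order within a byte (least significant bit first) is an assumption
that the text above does not specify.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not part of LLVM's API.
struct FpmBitLocation {
  std::uint32_t fileBlock;   // index of the FPM block within the MSF file
  std::uint32_t byteInBlock; // byte offset inside that FPM block
  std::uint32_t bitInByte;   // bit inside that byte (ASSUMED LSB-first)
};

// Finds where the flag for `blockIndex` lives in the FPM selected by
// `freeBlockMapBlock` (1 or 2). Each FPM block holds blockSize * 8 flags,
// and consecutive blocks of one FPM sit blockSize blocks apart in the file.
inline FpmBitLocation locateFpmBit(std::uint32_t blockIndex,
                                   std::uint32_t freeBlockMapBlock,
                                   std::uint32_t blockSize = 4096) {
  assert(freeBlockMapBlock == 1 || freeBlockMapBlock == 2);
  assert(blockSize > 0);
  const std::uint32_t bitsPerFpmBlock = blockSize * 8;
  const std::uint32_t fpmBlockNumber = blockIndex / bitsPerFpmBlock;
  const std::uint32_t bitInFpmBlock = blockIndex % bitsPerFpmBlock;
  return {freeBlockMapBlock + fpmBlockNumber * blockSize,
          bitInFpmBlock / 8, bitInFpmBlock % 8};
}
```

For example, with FPM2 active and a block size of 4096, the flag for block
32768 is the first bit of file block 4098, while the FPM2 blocks in the
intervening seven intervals carry no flags that this lookup ever reads.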
  The Stream Directory
  ====================
  The Stream Directory is the root of all access to the other streams in an MSF
@@ -66,10 +124,10 @@ file. Beginning at byte 0 of the stream directory is the following structure:
      ulittle32_t StreamSizes[NumStreams];
      ulittle32_t StreamBlocks[NumStreams][];
    };
-
+
  And this structure occupies exactly ``SuperBlock->NumDirectoryBytes`` bytes.
  Note that each of the last two arrays is of variable length, and in particular
- that the second array is jagged.
+ that the second array is jagged.

  **Example:** Suppose a hypothetical PDB file with a 4KiB block size, and 4
  streams of lengths {1000 bytes, 8000 bytes, 16000 bytes, 9000 bytes}.
@@ -97,7 +155,7 @@
        {10, 15, 12}
      };
    };
-
+
  In total, this occupies ``15 * 4 = 60`` bytes, so ``SuperBlock->NumDirectoryBytes``
  would equal ``60``, and ``SuperBlock->BlockMapAddr`` would be an array of one
  ``ulittle32_t``, since ``60 <= SuperBlock->BlockSize``.
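The size calculation in this example can be reproduced with a short sketch.
The function names ``blocksNeeded`` and ``streamDirectoryBytes`` are
hypothetical, and the code assumes every stream size is an ordinary byte count;
it is an illustration of the arithmetic, not LLVM's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of blocks needed to hold numBytes, i.e. ceil(numBytes / blockSize).
inline std::uint32_t blocksNeeded(std::uint32_t numBytes,
                                  std::uint32_t blockSize) {
  assert(blockSize > 0);
  return numBytes / blockSize + (numBytes % blockSize != 0 ? 1 : 0);
}

// Size in bytes of a stream directory: one ulittle32_t for NumStreams, one
// per entry of StreamSizes, and one per block listed in StreamBlocks.
inline std::uint32_t
streamDirectoryBytes(const std::vector<std::uint32_t> &streamSizes,
                     std::uint32_t blockSize = 4096) {
  std::uint32_t words = 1 + static_cast<std::uint32_t>(streamSizes.size());
  for (std::uint32_t size : streamSizes)
    words += blocksNeeded(size, blockSize);
  return words * 4; // each ulittle32_t occupies 4 bytes
}
```

For the four streams above, the block counts are 1, 2, 4 and 3, so the
directory holds ``1 + 4 + 10 = 15`` values and occupies 60 bytes, which fits in
a single block.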