Files are a wonderful abstraction: a stream of bytes that resides under a name, sorted into a hierarchy. Simple enough that a child can use it, powerful enough to be the motto of an entire family of operating systems. "Everything is a file" is one of the defining features of Unix, but it is also an abstraction, and as such, it is subject to the Law of Leaky Abstractions.
When building a storage engine, we need to have a pretty good idea of how to manage files. As it turns out, there are a lot of things that are just wrong about how we think about files. The "All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications" paper tested ten applications (from SQLite to Git to PostgreSQL) to find out whether they were properly writing to files. This paper is usually referred to as the ALICE (Application-Level Intelligent Crash Explorer) paper, after the tool created to explore failures in file system usage.
There are a lot of details that you need to take into account. For example, you may assume that changing a file and then calling fsync() will ensure that the changes to the file are made durable. That is correct, as long as you haven’t changed the file size. While the file data has been flushed, the file metadata was not. Which may mean some fun times in the future with the debugger.
If you want to build a reliable storage engine, you are going to need to develop some paranoid tendencies. The fact that the documentation says something is possible or not doesn’t translate to how things really are. When you care about reliability, the file system abstraction leaks badly, and it is hard to find a way around that.
You can simulate some errors, but the sheer variety and scope involved make things hard. Especially because the error that hits you may come not from the code that you are using but from several layers down. If you test errors from the file system, but the file system didn’t check for errors from the block device, you may end up in a funny state. In 99.999% of the cases it won’t matter, but you are going to hit that one-in-a-billion chance of an actual error at just the wrong time and see a broken system.
There have been a number of studies on the topic, and I think that Dan Luu’s summary of them lays a lot of the issues on the table. There is not a single storage hardware solution that doesn’t have failure conditions that can cause data corruption. And file systems don’t always handle them properly.
To a large degree, we make certain assumptions about the system, then we verify them constantly. The less we demand from the bottom layers, the more reliable we can make our system.
LWN has some good articles on the topic of making sure that the data actually reaches the disk and the complexities involved. The situation is made more complex by the fact that this depends on which OS and file system you use, and even what mode you used to mount a particular drive. As the author of a storage engine, you have to deal with these details in one of two ways:
- Explicitly specify the supported configurations, and raise hell if the user attempts to run on an unsupported one.
- Make it work across the board. Plan for failure, and when it happens, have the facilities in place to recover from it.
gavran/pal.h - The file system interface that we’ll consume for our storage engine
link:../include/gavran/pal.h[role=include]
Because working with files is such a huge, complex mess, and because it is different across operating systems, we’ll hide this complexity behind a platform abstraction layer (PAL). gavran/pal.h - The file system interface that we’ll consume for our storage engine shows the core functions that the PAL exposes.
The interface shown in gavran/pal.h - The file system interface that we’ll consume for our storage engine is very small and quite big at the same time. It is small, considering that it is the entire interface between a storage engine and a file system. At the same time, there is quite a bit going on here.
There are two important structures that we define here: span_t and file_handle_t. The span_t represents a range of memory, and it is named in honor of Span&lt;T&gt; from .NET. It is going to be used for memory mapping and in general to represent arbitrarily sized chunks of memory. The file_handle_t is going to be used to represent a file handle.
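To make this concrete, here is a rough sketch of what these two structures might look like. The field names and layout here are my assumptions for illustration; the real definitions live in gavran/pal.h.

[source,c]
----
#include <stddef.h>

// Sketch only: field names are assumptions, not the actual Gavran definitions
typedef struct span {
  void *address; // start of the memory range
  size_t size;   // length of the range, in bytes
} span_t;

typedef struct file_handle {
  int fd;         // the POSIX file descriptor (an int on Linux)
  char *filename; // the resolved, absolute path; owned by the handle
} file_handle_t;
----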
Note: Cross platform considerations
On POSIX, a file descriptor is an int. [...] And yes, I’m aware of "Famous last words".
The goal of this API is to allow us to do the right thing, regardless of what platform we are running on. Later on, we’ll build the storage engine implementation on top of these functions. Let’s look at test.c - Using the file API to create a file with a minimum size to see how we can make use of this API to work with files.
test.c - Using the file API to create a file with a minimum size
link:./code/test.c[role=include]
The code in test.c - Using the file API to create a file with a minimum size should ensure that, at the end of the day, we have a file with a minimum size of 8KB which will retain its size even in the case of an error or a system crash. That sounds easy enough to do in theory, but requires some dancing around to get there. Right now I’m going to focus on Linux as the implementation platform, but we’ll get to other systems down the line.
Now that we know how to use the API, let’s see how it is implemented. pal.linux.c - Opening (or creating) a file shows how we open or create a file in Gavran. It isn’t a long function, but you can see how all the different parts we have built come together to make it (much) easier to write correct code.
pal.linux.c - Opening (or creating) a file
link:./code/pal.linux.c[role=include]
- Allocate the memory for the handle and ensure that it will be freed if the function fails.
- Create a mutable copy of the file name, then ensure that the file exists (which requires a mutable string).
- Resolve a potentially relative path to an absolute one.
- Open the actual file. For now, you can ignore the meaning of the flags; we’ll discuss them later.
- Call fsync() on the parent directory if this is a new file; we’ll see why this is needed shortly.
pal.linux.c - Opening (or creating) a file starts by allocating a temporary string and then calling pal_ensure_path, which will ensure that the path to the file is valid and create it as an empty file.
It then proceeds to allocate the memory for the handle and set up try_defer() to ensure that in the case of failure, we’ll free the allocated memory automatically.
We then store a copy of the path in the handle’s filename field. Note that we are doing that through realpath(), which will resolve relative paths, symbolic links, etc. We need to free() the memory it returns. Because realpath() requires a file to exist before it can resolve it, we create an empty file in pal_ensure_path() and use the size of the file to know whether we are dealing with an existing database or a new one.
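Condensing that description into code, a sketch of the flow might look like the following. This is an illustration under assumptions: it reuses the hypothetical file_handle_t sketch from earlier, the real function uses Gavran’s ensure() / try_defer() infrastructure, and pal_ensure_path() is elided to a comment.

[source,c]
----
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int open_file_sketch(const char *path, file_handle_t *handle) {
  char *mutable_path = strdup(path); // pal_ensure_path() needs a mutable copy
  if (!mutable_path) return -1;
  // ... pal_ensure_path(mutable_path) would create the parent directories
  //     and an empty file here ...
  char *resolved = realpath(mutable_path, NULL); // the file must already exist
  free(mutable_path);
  if (!resolved) return -1;
  int fd = open(resolved, O_RDWR);
  if (fd == -1) {
    free(resolved);
    return -1;
  }
  handle->fd = fd;
  handle->filename = resolved; // the handle owns the resolved path now
  // ... for a brand new file, we would also fsync() the parent directory ...
  return 0;
}
----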
Beyond what was already mentioned, you need to take into account users who pass invalid values (a file name containing /, for example), all the intricacies of soft and hard links, size quotas, etc. The LWN post about this will probably turn your hair gray.
To keep the code size small and avoid overburdening ourselves with validation code, I’m going to state that I’m trusting the callers of the API to have already validated the data. As you can see in pal.linux.c - Opening (or creating) a file, we are only trying to prevent accidents, not trying to protect against malicious input. Gavran assumes that it is called with non-malicious values (note that this is different from valid values), as it is an embedded library and is meant to be used as part of your system.
There is one very important validation / setup step that we run before we open the file. We call pal_ensure_path(), which ensures that we can create the file. It will create the full directory path if needed, make sure that there isn’t a directory with that name in place, and in general tidy up the place before we get started. You can see how it is implemented in pal.linux.c - Ensuring that the provided path is valid to create a file on.
pal.linux.c - Ensuring that the provided path is valid to create a file on
link:./code/pal.linux.c[role=include]
Another responsibility that pal.linux.c - Opening (or creating) a file has, aside from creating the full directory path, is to let us know whether we are creating a new file or opening an existing one. Why do we care? After all, we are opening the file using O_CREAT, so the operating system will take care of that detail for us. The answer is that a newly created file requires us to fsync() its parent directory, as we’ll see next.
Warning: Beware the paths
On Linux, once you have opened a file, you no longer have access to its name. The file may have multiple names (hard links) or none (anonymous, or it has been deleted). That is why I’m keeping track of the provided name (after resolving to the real path) as part of the handle. It is too good a source of information to just discard.
The act of creating a file is a non-trivial operation, since we need to make sure that the file creation is atomic and durable.
pal.linux.c - Calling fsync() on a parent directory to ensure that the file metadata has been preserved deals with a fairly nasty problem on Linux: the management of file metadata. In particular, adding a file or changing the size of a file will cause changes not to the file itself, but to its parent directory. This isn’t actually true, but it is a fairly good lie, in the sense that it gives you enough information to have a mostly correct gut feeling about how things work.
If you want to understand how this works in more depth, I would recommend an excellent book on the topic: Practical File System Design with the Be File System. It is an old one (I read it for the first time close to two decades ago), but it does an excellent job of covering a lot of the details. At under 250 pages, it makes for an excellent read on a very complex topic. Another resource you might want to consider is GotenksFS, a file system explicitly designed for learning purposes. There is also libfuse, which allows you to define file systems in user space and can be a really interesting peek into how file systems really work.
Note: What about fork support?
When opening the file, we use [...]
Metadata updates not being part of fsync() makes it possible for you to make changes to a file (creating the file or increasing its size), call fsync() on the file, and then lose data because the existence or the size of the file wasn’t properly persisted to stable media. I’ll refer you to LWN again for the gory details.
pal.linux.c - Opening (or creating) a file and pal.linux.c - Calling fsync() on a parent directory to ensure that the file metadata has been preserved have quite a lot of error handling, but most of it isn’t visible, hidden in the defer() calls which simplify the overall system logic.
Why am I being so paranoid about error handling? To the point that I defined a whole infrastructure for managing it before writing a single byte to a file? The answer is simple: we aim to create an ACID storage engine, one which will take data and keep it. As such, we have to be aware that the underlying system can fail in interesting ways.
pal.linux.c - Calling fsync() on a parent directory to ensure that the file metadata has been preserved
link:./code/pal.linux.c[role=include]
- If there is no / in the path, assume that this is in the current directory.
- We need to find the name of the parent directory; we add a null in the right location to find it.
The ALICE paper found numerous issues in projects that have been heavily battle tested. And a few years ago there was a case of data loss in PostgreSQL that was tracked down to not checking the return value of an fsync() call. This LWN article summarizes the incident quite well.
If we aim to build a robust system, we must assume that anything can fail, and react accordingly.
Warning: What happens if close() fails from the defer()?
[...]
What pal.linux.c - Calling fsync() on a parent directory to ensure that the file metadata has been preserved does, essentially, is open() the parent directory using O_RDONLY and then call fsync() on the returned file descriptor. This instructs the file system to properly persist the directory information and protects us from losing a new file. Note that we rely on the fact that strings are mutable in C to truncate the file value by adding a null terminator for the parent directory (we restore it immediately afterward).
This trick allows us to avoid allocating memory during these operations. It is safe to make this change in memory because the string we mutate is the one belonging to the handle. That is our own memory, and we are fine to modify it.
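Here is a minimal sketch of that trick, assuming a plain int file descriptor and eliding the real error infrastructure (as well as edge cases such as a file sitting directly under the root):

[source,c]
----
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int fsync_parent_directory_sketch(char *file) {
  char *last_sep = strrchr(file, '/');
  int fd;
  if (!last_sep) {
    fd = open(".", O_RDONLY); // no '/', assume the current directory
  } else {
    *last_sep = 0; // truncate in place: file now reads as the parent dir
    fd = open(file, O_RDONLY);
    *last_sep = '/'; // restore the separator immediately
  }
  if (fd == -1) return -1;
  int rc = fsync(fd);
  if (close(fd) == -1) rc = -1;
  return rc;
}
----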
The reason we need to call fsync_parent_directory() is that, to make the life of the user easier, we are going to create the file if it does not exist, including any parent directories. And if we are creating these directories, we need to ensure that they won’t go away because of file system metadata issues.
Sadly, safely creating the full path is a somewhat tedious task. You can see how we approach it in pal.linux.c - Ensuring that the full path provided exists and the caller has access to it, where quite a lot is going on.
If the file already exists, we can return successfully immediately. If the file is a directory, we return an error. We then scan the path one directory at a time and check whether the directory exists. I’m using a trick here by putting a 0 in the place of the current directory separator.
pal.linux.c - Ensuring that the full path provided exists and the caller has access to it
link:./code/pal.linux.c[role=include]
In other words, given a /db/phones\0, we’ll start the search for / from the second character and then place a null terminator at that position. The filename would then be: /db\0phones\0. If needed, we will create the directory and then move to the next one. As part of that, we’ll restore the / and continue from the next /. This code requires a mutable string to work, but it does the job with no allocations, which is nice.
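A minimal sketch of that walk is shown below; the function name is hypothetical, and the fsync of each newly created directory (as well as the checks on an already existing file) is elided.

[source,c]
----
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

// Walk the path one component at a time, creating directories as needed.
// Requires a mutable string; works with zero allocations.
static int ensure_path_sketch(char *path) {
  for (char *p = strchr(path + 1, '/'); p; p = strchr(p + 1, '/')) {
    *p = 0; // path now reads as the prefix, e.g. "/db\0phones"
    if (mkdir(path, S_IRWXU) == -1 && errno != EEXIST) {
      *p = '/';
      return -1;
    }
    *p = '/'; // restore the separator, continue with the next component
  }
  return 0;
}
----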
Automatically creating a directory when given a path is a small feature. Call mkdir() before open(), pretty much. Doing it reliably is more complex, but not too hard. There are things you need to consider when doing this: what permissions you should give, how to handle any soft or hard links that you find along the way, etc.
On the other hand, not creating the directory automatically adds a tiny bump of frustration for the user. There is an extra, usually unnecessary, step to take along the way.
If we need to create a new directory, we make sure to call fsync_parent_directory()
to ensure that a power failure will not cause the directory to go poof.
As careful as I am being here, note that there are many scenarios that I’m not trying to cover. Using soft and hard links or junction points is the first example that pops to mind. And the work doubles if you need to deal with files or paths that come from an untrusted source. OWASP has quite a bit to say about the kinds of vulnerabilities that this might expose.
Earlier I discussed wanting to get the proper primitives and get as far away as possible from the level of code that you would usually need to write to deal with the file system. I think that now it is much clearer exactly why I want to get to that level as soon as I can.
When a file is created, it has zero bytes. That makes perfect sense; after all, there is no data there. When you write to the file, the file system will allocate the additional space needed on the fly. This is simple, requires no thinking on our part, and is exactly the wrong thing to want in a storage engine.
We just saw how hard we have to work to properly ensure that changes to the metadata (such as, for example, changing a file’s size) are properly protected against possible power failures. If we needed to call fsync_parent_directory() after every write, we could kiss our hopes for good performance goodbye. Instead of letting the file system allocate the disk space for our file on the fly, we’ll ask it for the space in advance, in well known locations. That will ensure that we only rarely need to call fsync_parent_directory().
Requesting the disk space in advance has another major benefit: it gives the file system the most information about how much disk space we want. It means that we give the file system the chance to give us long sequences of consecutive disk space. In the age of SSD and NVMe it isn’t as critical as it used to be, but it still matters. Depending on your age, you may recall running defrag to gain a substantial performance increase on your system, or you may never have heard of it at all.
pal.linux.c - Pre-allocate disk space by letting the file system know ahead of time what we need shows how we request that the file system allocate enough disk space for us. At its core, we simply call ftruncate(), which will extend the file for us, if needed. The function allows you to specify minimum and maximum file sizes because we’ll need to support truncation of files in the future, and since the amount of code is small enough, I would rather show you the whole thing at once than tease it out.
pal.linux.c - Pre-allocate disk space by letting the file system know ahead of time what we need
link:./code/pal.linux.c[role=include]
Note that in pal.linux.c - Pre-allocate disk space by letting the file system know ahead of time what we need we call fsync_parent_directory(). That isn’t interesting on its own; it is interesting because we need to pass it a mutable string.
Of course, fsync_parent_directory() makes sure to return things to normal by the time it returns, but it means that we cannot pass it a constant value and expect things to work. And handle→filename is going to be part of our database’s shared state, so we can’t mutate it casually. There might be other threads peeking at the value while it is being mutated. Therefore, we have to use a temporary copy. Convenience functions such as mem_duplicate_string() and using defer() make this a breeze.
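A condensed sketch of that flow, reusing the hypothetical sketches from earlier; strdup() stands in for mem_duplicate_string(), and the explicit free() for what defer() would normally take care of.

[source,c]
----
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static int allocate_file_space_sketch(file_handle_t *handle, off_t min_size) {
  if (ftruncate(handle->fd, min_size) == -1) // extend the file if needed
    return -1;
  // handle->filename is shared state, so fsync the parent via a private copy
  char *copy = strdup(handle->filename);
  if (!copy) return -1;
  int rc = fsync_parent_directory_sketch(copy);
  free(copy); // defer() would normally handle this cleanup
  return rc;
}
----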
Warning: The perils of paths
I’m doing dynamic memory allocation when working with these paths. One of the advantages of stack allocation is that it is automatically cleaned up. We get the same behavior via defer().
After quite a journey, we are almost at the end. The only function we are left to implement, in order to be able to compile the code in test.c - Using the file API to create a file with a minimum size, is pal_close_file(). You can see the code in pal.linux.c - Closing a file.
pal.linux.c - Closing a file
link:./code/pal.linux.c[role=include]
The pal_close_file() function simply calls close() and adds some error handling, nothing more. We aren’t trying to call fsync() or do anything fancy at this layer. That will be the responsibility of higher tiers in the code. Note that we set up the handle→filename and the handle itself to be freed, but we aren’t calling that explicitly. I find that it is cleaner to do things this way. I don’t need to think about the error conditions that I may need to cover.
One thing deserves calling out here: an error from close() isn’t theoretical. In almost all cases, whenever you do I/O, you are not interacting with the actual hardware, but with the page cache. That means that almost all I/O is done in an asynchronous fashion, and close() is one way for you to get notified if there have been any errors.
Even when checking the return value of close(), you still need to take into account that errors will happen. Unless fsync() was called, the file system is free to take your writes to a close()d file and just throw them away. This is not a theoretical issue, by the way; it happens quite often in many failure scenarios. The recommendation from the file system mailing list is to call fsync() and then close(), to get the highest durability mode.
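In sketch form, that recommendation boils down to a hypothetical helper like this, surfacing write-back errors from both calls instead of silently dropping them:

[source,c]
----
#include <unistd.h>

static int durable_close_sketch(int fd) {
  int rc = 0;
  if (fsync(fd) == -1) rc = -1; // flush the data (and metadata) to the device
  if (close(fd) == -1) rc = -1; // close() can still report async I/O errors
  return rc;
}
----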
Now that we are able to create a file and allocate disk space for it, we need to tackle the next challenge: deciding how we are going to read from and write to this file. I’ll defer talking about the internal organization of the file to the next chapter; for now, let’s talk about the low level interface that the PAL will offer to the rest of the system.
We already saw them in gavran/pal.h - The file system interface that we’ll consume for our storage engine: pal_mmap() and pal_unmap(), as well as pal_write_file() and pal_read_file(), are the functions provided for this purpose. There isn’t much there, which is quite surprising. There is a vast difference between the performance of reading from disk (even a fast one) and reading from memory. For this reason, databases and storage engines typically spend quite a bit of time managing buffer pools and reducing the number of times they have to go to disk.
I’m going to use a really cool technique to avoid the issue entirely. By mapping the file into memory, I don’t have to write a buffer pool; I can use the page cache that already exists in the operating system. Using the system’s page cache has a lot of advantages. I ran into this idea for the first time when reading LMDB’s codebase, and it is a fundamental property of how Voron (RavenDB’s storage engine) achieves its speed. I also recommend reading the "You’re Doing It Wrong" paper by Poul-Henning Kamp, which goes into great detail about why this is a great idea.
The idea is that we’ll ask the operating system to mmap() the file into memory, and we’ll be able to access the data through direct memory access. The operating system is in charge of the page cache, getting the right data into memory, etc. That is a lot of code that we don’t have to write, code which has gone through literal decades of optimizations. In particular, the page cache implements strategies such as read-ahead, automatic eviction as needed, and many more behaviors that you would usually need to write in a buffer pool implementation. By leaning on the OS’ virtual memory manager to do all that, we gain enormous leverage.
We could map the memory for both reads and writes, but I believe that it makes more sense to map the file data only for reads. This avoids cases where we accidentally write over the file data in an unintended manner. Instead, we create an explicit call to write the data to the file: pal_write_file().
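A sketch of such a read-only mapping is shown below; because the mapping is PROT_READ only, an accidental write through it faults instead of silently corrupting the file.

[source,c]
----
#include <stddef.h>
#include <sys/mman.h>

// Map `size` bytes of the file read-only; returns NULL on failure
static void *map_file_readonly_sketch(int fd, size_t size) {
  void *address = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
  return address == MAP_FAILED ? NULL : address;
}
----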
pal_unmap() is just the other side of the pal_mmap() operation, allowing us to clean up after ourselves. I’m adding pal_read_file() for completeness’ sake, and because if we are running in 32 bits we can’t really mmap() a 10 GB file, so we also need another fallback. You can see the mapping and un-mapping code in pal.linux.c - Implementing mapping and un-mapping of memory from our data file.
What is the best option for reading and writing, then? Go memory mapped I/O all the way, or rely on read() and write()? And can we mix them? Why have two ways to go about this? I’m mostly going to be using memory mapped I/O for reads, but writes will use pwrite(). That gives me the benefit of the OS’ buffer cache and makes the best use of the system resources. Using pwrite(), on the other hand, ensures that we aren’t accidentally writing beyond the end of the buffer and will help us catch any errors.
On Linux, that is safe to do, because both mmap() and the write() call use the same page cache and are coherent with respect to one another. On Windows, on the other hand, that is not the case. Mixing file I/O calls and memory mapped files leads to situations where data written using the I/O API takes some time to become visible through the memory view. For further reading, you can see how I found out about this delightful state of affairs. When we get to the Windows side of things, we’ll show how to deal with this limitation properly.
pal.linux.c - Implementing mapping and un-mapping of memory from our data file
link:./code/pal.linux.c[role=include]
There really isn’t much in pal.linux.c - Implementing mapping and un-mapping of memory from our data file. We just need to call mmap() or munmap() and do some basic error reporting if something fails.
In pal.linux.c - Allowing to close() the file and unmap() via defer(), however, we allow closing the file and unmapping the memory via a defer() call.
pal.linux.c - Allowing to close() the file and unmap() via defer()
link:./code/pal.linux.c[role=include]
In pal.linux.c - Writing and reading data to the file we have the implementation of writing to and reading from files using normal file I/O.
The pal_write_file() call is a simple wrapper around the pwrite() call, with the only difference being that we’ll repeat the write until the entire buffer has been written. In practice, this usually means that we’ll only do a single call to pwrite(), which will perform all the work. And pal_read_file() does the same on top of pread(). The only difference there is that pal_read_file() will consider partial reads as errors, so you either get the whole buffer you asked for, or an error.
pal.linux.c - Writing and reading data to the file
link:./code/pal.linux.c[role=include]
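The core retry loop can be sketched as follows (hypothetical signature; the real function reports failures through Gavran’s error infrastructure rather than returning an int):

[source,c]
----
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

static int write_file_sketch(int fd, off_t offset,
                             const char *buffer, size_t size) {
  while (size > 0) {
    ssize_t written = pwrite(fd, buffer, size, offset);
    if (written == -1) {
      if (errno == EINTR) continue; // interrupted before writing, just retry
      return -1;
    }
    buffer += written; // a short write isn't an error, write the rest
    offset += written;
    size -= (size_t)written;
  }
  return 0;
}
----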
As much as I want to tunnel my write I/O purely through the pwrite() system call, there are going to be cases where we’ll want to enable a writable memory map, for the ease with which it lets us make changes in memory. Supporting that is quite easy using the mprotect() system call; you can see the code in pal.linux.c - Enable writable memory maps and allow to disable them.
pal.linux.c - Enable writable memory maps and allow to disable them
link:./code/pal.linux.c[role=include]
pal.linux.c - Enable writable memory maps and allow to disable them doesn’t define a pal_disable_writes() method; instead it jumps directly to the defer call. You are expected to call defer(pal_disable_writes, span); immediately after calling ensure(pal_enable_writes(span));. This API reflects the notion that writable memory maps are meant to be rare.
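Under the hood, this is little more than flipping page protections. A sketch, reusing the span_t sketch from earlier (mprotect() requires the range to be page aligned, which file mappings already are):

[source,c]
----
#include <sys/mman.h>

static int enable_writes_sketch(span_t *span) {
  return mprotect(span->address, span->size, PROT_READ | PROT_WRITE);
}

static int disable_writes_sketch(span_t *span) {
  return mprotect(span->address, span->size, PROT_READ); // back to read-only
}
----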
There is only one area of the API that we haven’t looked at yet. The pal_fsync() function is implemented just as simply as you would expect it to be. You can see it in pal.linux.c - Flushing a file.
pal.linux.c - Flushing a file
link:./code/pal.linux.c[role=include]
Actually, there is a surprise in pal.linux.c - Flushing a file: we aren’t calling fsync(), we are only calling fdatasync(). Note that elsewhere in the PAL we used fsync(), in pal.linux.c - Calling fsync() on a parent directory to ensure that the file metadata has been preserved, for example. Why this discrepancy? And why call the method pal_fsync() and not pal_fdatasync()?
When we call fsync(), we are asking to flush the file’s data to disk, as well as the file’s metadata (last modified time, access time, etc.). When we call fdatasync(), we only ask to flush the file’s data; the metadata isn’t required to be included. That makes fdatasync() cheaper for scenarios where the metadata updates aren’t important. Remember, fsync() and fdatasync() are expensive, so any reduction in cost is welcome.
I’m calling it pal_fsync() because the choice of fdatasync() is an implementation detail. On Mac, we’ll need to use fcntl(fd, F_FULLFSYNC); because fsync() doesn’t actually do what it is supposed to do. On Windows, we’ll call FlushFileBuffers(), which is the fsync() equivalent.
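Putting those platform notes into code, a sketch of the per-platform choice might look like this (the Windows FlushFileBuffers() branch would live in its own file, so only the POSIX side is shown):

[source,c]
----
#include <fcntl.h>
#include <unistd.h>

static int fsync_sketch(int fd) {
#if defined(__APPLE__)
  return fcntl(fd, F_FULLFSYNC); // fsync() alone isn't enough on a Mac
#else
  return fdatasync(fd); // data only; we don't need the metadata flushed here
#endif
}
----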
We’ll learn all about why fsync() is important to a storage engine in Chapter 8.
There is another aspect of durable writes that we have to look at. In pal.linux.c - Opening (or creating) a file I mentioned the flags argument and said that I would talk about it later. That later is now, and you can see the flags’ options in gavran/pal.h - Flags for opening a file.
gavran/pal.h - Flags for opening a file
link:../include/gavran/pal.h[role=include]
There aren’t really many options here; we have durable mode and non-durable mode, that is all. Looking at pal.linux.c - Opening (or creating) a file, we can see that if we set the flags argument of pal_create_file() to pal_file_creation_flags_durable, it will simply add the following flags to the open() call: O_DIRECT | O_DSYNC. What does this mean?
Tip: The cost of fsync()
[...]
I’m going to cover this in more detail in Chapter 8, but the general idea is that O_DSYNC tells the operating system that every write to the file needs to behave as if we had called fdatasync() after it, and O_DIRECT instructs the file system to skip all buffers and use direct I/O. Together, however, they allow us to do something far more interesting. They tell the operating system that we want to make a durable write to disk, bypassing all buffering in the middle. This is a dramatically more expensive option, and using it in this manner requires that we accept some harsh limitations on how we can actually write. All data must be page aligned, all writes must be page aligned, etc.
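To give a feel for those constraints, here is a standalone sketch of a durable, direct write. The 4096 byte alignment is an assumption for illustration; the proper values depend on the device and file system.

[source,c]
----
#define _GNU_SOURCE // O_DIRECT is a GNU extension on Linux
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int durable_direct_write_sketch(const char *path) {
  int fd = open(path, O_WRONLY | O_DIRECT | O_DSYNC);
  if (fd == -1) return -1;
  void *buffer;
  // O_DIRECT requires the buffer, size and offset to all be aligned
  if (posix_memalign(&buffer, 4096, 4096)) {
    close(fd);
    return -1;
  }
  memset(buffer, 0, 4096);
  ssize_t written = pwrite(fd, buffer, 4096, 0); // durable once it returns
  free(buffer);
  if (close(fd) == -1) return -1;
  return written == 4096 ? 0 : -1;
}
----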
We’ll be using this mode sparingly, but it is a crucial one for ensuring that we can get proper durable writes in the system. You can see how it is used in Chapter 8. For now, we’ll just use the non-durable mode everywhere.
Note: Using a block device instead of bothering with the file system
Technically speaking, the model that I intend to use will work just as well for raw block devices as it does for files. Indeed, there are some real benefits to bypassing the file system for a storage engine. What I most want from a file system, as a storage engine author, is that it gets out of my way. There are also some performance benefits: avoiding data fragmentation, the overhead of the file system, etc. That said, working with files is ever so much easier. There are commands that let you treat a file as a block device, which would allow us to switch to a real block device at a later point in time, but having direct access to the file is just too convenient to give up. However, there is another consideration to take into account. A database server can expect to have the right kind of permissions to open a raw block device, but an embedded storage engine needs to deal with the limited rights of the processes hosting it.
We are still very early on in the process, but I think that peeking at test.c - Using the file API to read & write will show you how far we have come. We are making use of all of our functions to store and read data from the file.
test.c - Using the file API to read & write
link:./code/test.c[role=include]
test.c - Using the file API to read & write showcases the benefits of defer again, which is rapidly becoming my favorite approach to dealing with error handling in this codebase. We have to deal with multiple resources, but there isn’t any jump in complexity in the function. Everything just works.
As for the actual code, we create a new database, ensure that it has the right size, map it into memory, and then write a value to it using the pal_write_file() function. We then read the value back using the memory mapped address. The code reads nearly as well as the description of the code, with little in the way of unnecessary details.
This may seem like a humble beginning, but we are currently building the foundation of our storage engine. In the next chapter, we are going to start talking about how we are going to make use of this functionality.
To make sure that we are producing good software, we need tests. Now that the chapter is over, it is time to see its tests. I’m not going to dive too deeply into the testing side of the pool, but I have found tests to be invaluable for ensuring that the software does what you think it does.
In this case, you already saw the major test cases, test.c - Using the file API to read & write and test.c - Using the file API to create a file with a minimum size, but I have a few that test more esoteric pieces of the API as well in test.c - Testing the PAL API.
In each chapter, I’m going to create a set of unit tests that will verify the functionality of the system. Each chapter concludes with a system that passes all of its (and the previous chapters’) tests. That is important to ensure that any change that I’m making isn’t going to break the system.
That said, be aware that you can spend a lot of time tying yourself to particular implementation choices using tests. Right now I’m writing fairly silly tests, mostly because I’m not going to try to test that we are writing durably to disk. That requires a lab with a UPS that I can trigger via an API and a machine that I don’t mind frying.
And now we are ready to see how we are going to be using the files.
test.c - Testing the PAL API
link:./code/test.c[role=include]