A lightweight tar library written in ANSI C.
This version is a fork of rxi's microtar with bugfixes and API changes aimed at improving usability, but still keeping with the minimal design of the original library.
This library is free software; you can redistribute it and/or modify it under the terms of the MIT license. See LICENSE for details.
No effort has been put into handling every tar format variant. Basically,
what is accepted is the "old-style" format, which appears to work well
enough to access basic archives created by GNU tar.
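To make "old-style" a little more concrete: in the classic tar layout each member starts with a 512-byte header block whose numeric fields are octal ASCII, protected by a simple additive checksum. The sketch below is not microtar's code. The function names and offset macros are invented for illustration, and only the field offsets come from the classic header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Classic tar header layout (names here are hypothetical). */
#define TAR_BLOCK_SIZE 512
#define TAR_SIZE_OFF   124  /* size field: 12 bytes of octal ASCII */
#define TAR_SIZE_LEN   12
#define TAR_CHKSUM_OFF 148  /* checksum field: 8 bytes of octal ASCII */
#define TAR_CHKSUM_LEN 8

/* Parse an octal ASCII field: skip leading spaces, then consume
 * digits until the first non-octal byte (usually NUL or space). */
unsigned tar_parse_octal(const unsigned char* field, size_t len)
{
    unsigned value = 0;
    size_t i = 0;
    assert(field != NULL);
    while (i < len && field[i] == ' ')
        i++;
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (unsigned)(field[i] - '0');
    return value;
}

/* Sum all 512 header bytes, counting the checksum field itself
 * as if it were filled with spaces. */
unsigned tar_checksum(const unsigned char* block)
{
    unsigned sum = 0;
    size_t i;
    assert(block != NULL);
    for (i = 0; i < TAR_BLOCK_SIZE; i++) {
        if (i >= TAR_CHKSUM_OFF && i < TAR_CHKSUM_OFF + TAR_CHKSUM_LEN)
            sum += ' ';
        else
            sum += block[i];
    }
    return sum;
}

/* Returns 1 if the stored checksum matches the computed one. */
int tar_header_valid(const unsigned char* block)
{
    return tar_parse_octal(block + TAR_CHKSUM_OFF, TAR_CHKSUM_LEN)
        == tar_checksum(block);
}
```

An all-zero block fails this check, which is one reason null records can be told apart from real headers.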
The library consists of two files, microtar.c and microtar.h, which only
depend on a tiny part of the standard C library and can be easily incorporated
into a host project's build system.
The core library does not include any I/O hooks as these are supposed to be
provided by the host application. If the C library's fopen and friends are
good enough, you can use microtar-stdio.c.
Initialization is very simple. Everything the library needs is contained in
the mtar_t struct; there is no memory allocation and no global state. It is
enough to zero-initialize an mtar_t object to put it into a "closed" state.
You can use mtar_is_open() to query whether the archive is open or not.
An archive can be opened for reading or writing, but not both. You have to specify which access mode you're using when you create the archive.
mtar_t tar;
mtar_init(&tar, MTAR_READ, my_io_ops, my_stream);
Or, if using microtar-stdio.c:
int error = mtar_open(&tar, "file.tar", "rb");
if(error) {
    /* do something about it */
}
Note that mtar_init() is called for you in this case and the access mode is
deduced from the mode flags.
If you opened an archive for reading, you'll likely want to iterate over all the files. Here's the long way of doing it:
mtar_t tar;
int err;
/* Go to the start of the archive... Not necessary if you've
 * just opened the archive and are already at the beginning.
 * (And of course you normally want to check the return value.) */
mtar_rewind(&tar);
/* Iterate over the archive members */
while((err = mtar_next(&tar)) == MTAR_ESUCCESS) {
    /* Get a pointer to the current file header. It will
     * remain valid until you move to another record with
     * mtar_next() or call mtar_rewind() */
    const mtar_header_t* header = mtar_get_header(&tar);
    /* %u assumes header->size is an unsigned int */
    printf("%s (%u bytes)\n", header->name, header->size);
}
if(err != MTAR_ENULLRECORD) {
    /* ENULLRECORD means we hit end of file; any
     * other return value is an actual error. */
}
There's a useful shortcut for this type of iteration which removes the loop boilerplate, replacing it with another kind of boilerplate that may be more palatable in some cases.
/* Will be called for each archive member visited by mtar_foreach().
 * The member's header is passed in as an argument so you don't need
 * to fetch it manually with mtar_get_header(). You can freely read
 * data (if present) and seek around. There is no special cleanup
 * required and it is not necessary to read to the end of the stream.
 *
 * The callback should return zero (= MTAR_ESUCCESS) to continue the
 * iteration or return nonzero to abort. On abort, the value returned
 * by the callback will be returned from mtar_foreach(). Since it may
 * also return normal microtar error codes, it is suggested to use a
 * positive value or pass the result via 'arg'.
 */
int foreach_cb(mtar_t* tar, const mtar_header_t* header, void* arg)
{
    // ...
    return 0;
}
int main(void)
{
    mtar_t tar;
    // ...
    int ret = mtar_foreach(&tar, foreach_cb, NULL);
    if(ret < 0) {
        /* Microtar error codes are negative and may be returned if
         * there is a problem with the iteration. */
    } else if(ret == MTAR_ESUCCESS) {
        /* If the iteration reaches the end of the archive without
         * errors, the return code is MTAR_ESUCCESS. */
    } else if(ret > 0) {
        /* Positive values might be returned by the callback to
         * signal some condition was met; they'll never be returned
         * by microtar */
    }
    return 0;
}
The other thing you're likely to do is look for a specific file:
/* Seek to a specific member in the archive */
int err = mtar_find(&tar, "foo.txt");
if(err == MTAR_ESUCCESS) {
    /* File was found -- read the header with mtar_get_header() */
} else if(err == MTAR_ENOTFOUND) {
    /* File wasn't in the archive */
} else {
    /* Some error occurred */
}
Note this isn't terribly efficient since it scans the entire archive looking for the file.
Once pointed at a file via mtar_next() or mtar_find(), you can read the
data with a simple POSIX-like API.
- mtar_read_data(tar, buf, count) reads up to count bytes into buf, returning the actual number of bytes read, or a negative error value. If at EOF, this returns zero.
- mtar_seek_data(tar, offset, whence) works exactly like fseek() with whence being one of SEEK_SET, SEEK_CUR, or SEEK_END and offset indicating a point relative to the beginning, current position, or end of the file. Returns zero on success, or a negative error code.
- mtar_eof_data(tar) returns nonzero if the end of the file has been reached. It is possible to seek backward to clear this condition.
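Because a read may return fewer bytes than requested, callers that need an exact amount usually loop. The following is a minimal sketch of such a loop, written against a hypothetical read_fn callback type rather than microtar itself so that it stands alone. It assumes only the conventions described above: a positive count on progress, zero at EOF, and a negative value on error. The chunk_src_t source exists purely to simulate short reads.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a reader with mtar_read_data()-like
 * return values: >0 bytes read, 0 at EOF, <0 on error. */
typedef int (*read_fn)(void* ctx, void* buf, unsigned count);

/* Keep reading until `count` bytes have arrived, EOF is hit, or an
 * error occurs. Returns the total bytes read, or the first negative
 * error code returned by the reader. */
int read_fully(read_fn reader, void* ctx, void* buf, unsigned count)
{
    unsigned char* out = buf;
    unsigned total = 0;
    assert(reader != NULL);
    while (total < count) {
        int n = reader(ctx, out + total, count - total);
        if (n < 0)
            return n;       /* propagate the error unchanged */
        if (n == 0)
            break;          /* EOF: return what we have so far */
        total += (unsigned)n;
    }
    return (int)total;
}

/* Demo source that hands out at most `chunk` bytes per call. */
typedef struct {
    const unsigned char* data;
    unsigned len;
    unsigned pos;
    unsigned chunk;
} chunk_src_t;

int chunk_read(void* ctx, void* buf, unsigned count)
{
    chunk_src_t* s = ctx;
    unsigned n = s->len - s->pos;
    if (n > count)
        n = count;
    if (n > s->chunk)
        n = s->chunk;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (int)n;
}
```

With microtar, the reader passed to such a loop would be a thin adapter that forwards to mtar_read_data().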
Microtar has limited support for creating archives. When an archive is opened
for writing, you can add new members using mtar_write_header().
- mtar_write_header(tar, header) writes out the header for a new member. The amount of data that follows is dictated by header->size, though if the underlying stream supports seeking and re-writing data, this size can be updated later with mtar_update_header() or mtar_update_file_size().
- mtar_update_header(tar, header) will re-write the previously written header. This may be used to change any header field. The underlying stream must support seeking. On a successful return the stream will be returned to the position it was at before the call.
File data can be written with mtar_write_data(), and if the underlying stream
supports seeking, you can seek with mtar_seek_data() and read back previously
written data with mtar_read_data(). Note that it is not possible to truncate
the file stream by any means.
- mtar_write_data(tar, buf, count) will write up to count bytes from buf to the current member's data. Returns the number of bytes actually written or a negative error code.
- mtar_update_file_size(tar) will update the header size to reflect the actual amount of written data. This is intended to be called right before mtar_end_data() if you are not declaring file sizes in advance.
- mtar_end_data(tar) will end the current member. It will complain if the amount of data you wrote does not match the size given in the header. This must be called before writing the next header.
- mtar_finalize(tar) is called after you have written all members to the archive. It writes out some null records which mark the end of the archive, so you cannot write any more archive members after this.
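It can help to know how these calls map onto bytes in the output. In the tar format, every member occupies one 512-byte header record followed by its data, zero-padded up to the next 512-byte boundary. The sketch below computes the resulting archive size. The helper names are invented, and the trailing count of two null records is the usual tar convention, assumed here since the text above only says "some null records".

```c
#include <assert.h>
#include <stddef.h>

#define TAR_RECORD_SIZE 512u

/* Zero bytes needed after `size` bytes of member data to reach
 * the next record boundary (0 if already aligned). */
unsigned tar_padding(unsigned size)
{
    return (TAR_RECORD_SIZE - size % TAR_RECORD_SIZE) % TAR_RECORD_SIZE;
}

/* Total archive size for the given member data sizes, assuming one
 * header record per member and two trailing null records. */
unsigned long tar_archive_size(const unsigned* sizes, size_t count)
{
    unsigned long total = 0;
    size_t i;
    assert(sizes != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += TAR_RECORD_SIZE + sizes[i] + tar_padding(sizes[i]);
    return total + 2 * TAR_RECORD_SIZE;
}
```

For example, members of 100, 512, and 0 bytes occupy 1024, 1024, and 512 bytes respectively, so under these assumptions the finished archive is 3584 bytes long.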
Note that mtar_close() can fail if there was a problem flushing buffered
data to disk, so its return value should always be checked.
Most functions that return int return an error code from enum mtar_error.
Zero is success and all other error codes are negative. mtar_strerror() can
return a string describing the error code.
A couple of functions use a different return value convention:
- mtar_foreach() may return error codes or an arbitrary nonzero value provided by the callback.
- mtar_read_data() and mtar_write_data() return the number of bytes read or written, or a negative error code. In particular, zero means that no bytes were read or written.
- mtar_get_header() may return NULL if there is no valid header. It is only possible to see a null pointer if misusing the API or after a previous error, so checking for this is usually not necessary.
There is essentially no support for error recovery. After an error you can
only do two things reliably: close the archive with mtar_close() or try
rewinding to the beginning with mtar_rewind().
You can provide your own I/O hooks in a mtar_ops_t struct. The same ops
struct can be shared among multiple mtar_t objects but each object gets
its own void* stream pointer.
Name | Arguments | Required |
---|---|---|
read | void* stream, void* data, unsigned size | If reading |
write | void* stream, void* data, unsigned size | If writing |
seek | void* stream, unsigned pos | If reading |
close | void* stream | Always |
read and write should transfer the number of bytes indicated
and return the number of bytes actually read or written, or a negative
enum mtar_error code on error.
seek must have semantics like lseek(..., pos, SEEK_SET); that is,
the position is an absolute byte offset in the stream. Seeking is not
optional for read support, but the library only performs backward
seeks under two circumstances:
- mtar_rewind() seeks to position 0.
- mtar_seek_data() may seek backward if the user requests it.
Therefore, you will be able to get away with a limited forward-only
seek function if you're able to read everything in a single pass and use
the API carefully. Note that mtar_find() and mtar_foreach() will call
mtar_rewind().
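A forward-only seek hook along these lines could be sketched as follows. It tracks the absolute position, skips ahead by reading and discarding bytes, and refuses to move backward. Everything here is hypothetical: the fwd_ names, the pull callback that models a pipe-like source, the int return types, and the FWD_ error values standing in for enum mtar_error codes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for enum mtar_error values. */
enum { FWD_OK = 0, FWD_EFAIL = -1 };

/* Pipe-like source: can only hand out the next bytes in order. */
typedef int (*fwd_pull_fn)(void* src, void* buf, unsigned size);

typedef struct {
    fwd_pull_fn pull;
    void* src;
    unsigned pos;   /* absolute offset consumed so far */
} fwd_stream_t;

/* read hook: pull bytes and keep the position up to date. */
int fwd_read(void* stream, void* data, unsigned size)
{
    fwd_stream_t* s = stream;
    int n = s->pull(s->src, data, size);
    if (n > 0)
        s->pos += (unsigned)n;
    return n;
}

/* seek hook: absolute position, forward only. */
int fwd_seek(void* stream, unsigned pos)
{
    fwd_stream_t* s = stream;
    unsigned char scratch[256];
    assert(s != NULL);
    if (pos < s->pos)
        return FWD_EFAIL;           /* cannot go backward */
    while (s->pos < pos) {
        unsigned want = pos - s->pos;
        if (want > (unsigned)sizeof(scratch))
            want = (unsigned)sizeof(scratch);
        if (fwd_read(s, scratch, want) <= 0)
            return FWD_EFAIL;       /* error or stream ended early */
    }
    return FWD_OK;
}

/* Demo source backed by memory, used only to exercise the hooks. */
typedef struct {
    const unsigned char* data;
    unsigned len;
    unsigned off;
} fwd_mem_src_t;

int fwd_mem_pull(void* src, void* buf, unsigned size)
{
    fwd_mem_src_t* m = src;
    unsigned n = m->len - m->off;
    if (n > size)
        n = size;
    memcpy(buf, m->data + m->off, n);
    m->off += n;
    return (int)n;
}
```

In this sketch a seek to position 0 still succeeds on a freshly opened stream, because the tracked position is already 0. A rewind after any data has been consumed fails.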
close is called by mtar_close() to clean up the stream. Note the
library assumes that the stream handle is cleaned up by close even
if an error occurs.
seek and close should return an enum mtar_error code: either
MTAR_ESUCCESS on success or a negative value on error.
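As an illustration of the hook contracts, here is a minimal sketch of a complete set of hooks over a fixed-size memory buffer. The argument lists follow the table above. The int return types, the memio_ names, and the MEMIO_ values (stand-ins for enum mtar_error codes) are assumptions made for this example, so check microtar.h for the real declarations before wiring hooks like these into a mtar_ops_t.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for enum mtar_error values. */
enum { MEMIO_OK = 0, MEMIO_EFAIL = -1 };

typedef struct {
    unsigned char* buf;
    unsigned size;      /* number of valid bytes in buf */
    unsigned capacity;  /* total space available in buf */
    unsigned pos;       /* current absolute position */
} memio_t;

/* read: return the bytes actually read; 0 at end of data. */
int memio_read(void* stream, void* data, unsigned size)
{
    memio_t* m = stream;
    unsigned avail = m->size - m->pos;
    if (size > avail)
        size = avail;
    memcpy(data, m->buf + m->pos, size);
    m->pos += size;
    return (int)size;
}

/* write: fail rather than overflow the fixed buffer. */
int memio_write(void* stream, void* data, unsigned size)
{
    memio_t* m = stream;
    if (size > m->capacity - m->pos)
        return MEMIO_EFAIL;
    memcpy(m->buf + m->pos, data, size);
    m->pos += size;
    if (m->pos > m->size)
        m->size = m->pos;
    return (int)size;
}

/* seek: absolute offset, like lseek(..., pos, SEEK_SET). */
int memio_seek(void* stream, unsigned pos)
{
    memio_t* m = stream;
    if (pos > m->size)
        return MEMIO_EFAIL;
    m->pos = pos;
    return MEMIO_OK;
}

/* close: nothing to release; the caller owns the buffer. */
int memio_close(void* stream)
{
    memio_t* m = stream;
    assert(m != NULL);
    m->pos = 0;
    return MEMIO_OK;
}
```

Since the hooks keep all of their state in the memio_t passed as the stream pointer, one set of functions can serve any number of archives, which matches the sharing model described earlier.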