bitvec
enables refining memory manipulation from single-byte precision to
single-bit precision. The bit-precision pointers in this crate allow creation of
more powerful bit-masks, set arithmetic, and I/O packet processing.
The core export of this crate is the type BitSlice
. This type is a region of
memory with individually-addressable bits. It is accessed by standard Rust
references: &BitSlice
and &mut BitSlice
. These references are able to
describe and operate on regions that start and end on any arbitrary bit address,
regardless of alignment to a byte or processor word.
Rust provides three types to manipulate a sequence of memory: &[T]
/&mut [T]
to borrow a region, Box<[T]>
to statically own a region, and Vec<T>
to
dynamically own a region. bitvec
provides parallel types for each: &BitSlice
and &mut BitSlice
borrow, BitBox
statically owns, and BitVec
dynamically
owns. These types mirror the relationships and APIs, including inherent methods
and trait implementations, that are found in the standard library.
The most significant differences are that bitvec
provides arbitrary bit
ordering through the Cursor
trait, and provides a full-featured slice type.
Other crates have fixed orderings, and are often unable to produce slices that
begin at any arbitrary bit.
Additionally, the bitvec
types implement the full extent of the standard
library APIs possible, in both inherent methods and trait implementations.
The bitvec
types’ handles are exactly the size of their standard library
counterparts, while the other crates carry bit index information in separate
fields, making their handles wider. Depending on your needs, this may sway your
opinion in either direction. bitvec
is more compact, but mangles the internal
pointer representation and requires more complex logic to use the bit region,
while other crates’ types are larger, but have more straightforward internal
logic.
- You need to directly control a bitstream’s representation in memory.
- You need to do unpleasant things with I/O communications protocols.
- You need a list of
bool
s that doesn’t waste 7 bits for every bit used. - You need to do set arithmetic, or numeric arithmetic, on those lists.
- You are running a find/replace command on your repository from
&[bool]
to&BitSlice
, orVec<bool>
toBitVec
, and expect minimal damage as a result.
- Your concern with the memory representation of bitsets includes sequence
compression.
BitSlice
performs absolutely no compression, and maps bits directly into memory. Compressed bit sets can be found in other crates, such as thecompacts
crate, which uses the Roaring BitSet format. - You want discontiguous data structures, such as a hash table, a tree, or any
other hallmark of computer science beyond the flat array.
bitvec
does not, and will not, accomplish this. You may be able to usebitvec
’s flat structures beneath a type wrapper which handles index processing, butbitvec
types are incapable of accomplishing this task themselves. Also, I don’t know how any of those data structures work.
Minimum Rust Version: 1.36.0
The 1.36
release of Rust stabilized the alloc
crate, allowing allocating
features (such as the BitVec
type) to be used in #![no_std]
environments
with the stable compiler series.
# Cargo.toml
[dependencies]
bitvec = "0.15"
bitvec
is highly modular, and requires several items to function correctly.
The simplest way to use it is via prelude glob import:
use bitvec::prelude::*;
This imports the following symbols:
BigEndian
BitBox
(only when an allocator is present)BitSlice
BitStore
BitVec
(only when an allocator is present)Bits
BitsMut
Cursor
LittleEndian
bitbox!
(only when an allocator is present)bitvec!
(only when an allocator is present)
If you do not want these names imported directly into the local scope –
Cursor
, BigEndian
, and LittleEndian
are likely culprits for name collision
– then you can import the prelude with a scope guard:
use bitvec::prelude as bv;
and those symbols will all be available only with a bv::
prefix.
bitvec
uses Cargo features to conditionally control some behavior.
The most prominent such behavior is one that cannot be controlled by Cargo
configuration: u64
is only usable with this library when targeting a 64-bit
system. 32-bit system targets are only permitted to use u8
, u16
, and u32
.
bitvec
uses atomic read/modify/write instructions by default. This is
necessary to avoid data races in &mut BitSlice
operations without using
heavier synchronization mechanisms. If your target does not support Rust’s
AtomicU*
types, or you do not want to use atomic RMW instructions, you may
disable the atomic
feature:
# Cargo.toml
[dependencies.bitvec]
default-features = false
features = [
# "atomic",
"std",
]
The two owning structures, BitBox
and BitVec
, require the presence of an
allocator. As bitvec
is written specifically for use cases where an allocator
may not exist, this dependence can be disabled. bitvec
is
#![no_std]
-compatible once the std
feature is disabled. It is not a design
goal to be #![no_core]
-compatible.
# Cargo.toml
[dependencies.bitvec]
default-features = false
features = [
"atomic",
# "std",
]
If you are working in a #![no_std]
environment that does have an allocator
available, you can reënable allocator support with the alloc
feature:
# Cargo.toml
[dependencies.bitvec]
default-features = false # disables "std"
features = ["alloc"] # enables the allocator
This uses #![feature(alloc)]
, which requires the nightly compiler.
De/serialization of bit slices is implemented through the serde
crate. This
functionality is governed by both the serde
feature and the std
feature.
By default, when serde
is enabled, BitSlice
, BitBox
, and BitVec
all gain
the Serialize
trait, and BitBox
and BitVec
gain the Deserialize
trait.
When std
is disabled, the BitBox
and BitVec
types are removed, leaving
only BitSlice
with Serialize
.
# Cargo.toml
[dependencies.bitvec]
features = ["serde"]
bitvec
’s three data structures are &BitSlice
, BitBox
, and BitVec
. Each
of these types takes two type parameters, which I have elided previously.
The first type parameter is the Cursor
trait. This trait governs how a bit
index maps to a bit position in the underlying memory. This parameter defaults
to the BigEndian
type, which counts from the most significant bit first to the
least significant bit last. The LittleEndian
type counts in the opposite
direction.
The second type parameter is the BitStore
trait. This trait abstracts over the
Rust fundamental types u8
, u16
, and u32
. On 64-bit targets, u64
is also
available. This parameter defaults to u8
, which acts on individual bytes.
These traits are both explained in the next section.
&BitSlice<C: Cursor, T: BitStore>
is an immutable region of memory,
addressable at bit precision. This has all the inherent methods of Rust’s slice
primitive, &[bool]
, and all the trait implementations. It has additional
methods which are specialized to its status as a slice of individual bits.
&mut BitSlice<C: Cursor, T: BitStore>
is a mutable region of memory. This
functions identically to &mut [bool]
, with the exception that IndexMut
is
impossible: you cannot write bitslice[index] = bit;
. This restriction is
sidestepped with the C++-style method at
: *bitslice.at(index) = bit;
is the
shim for write indexing.
The slice references have no restrictions on the alignment of their start or end bits.
The owning references, described below, will always begin their slice aligned to
the edge of their T: BitStore
type parameter. While this is not strictly
required by the implementation, it is convenient for ensuring that the
allocation pointer is preserved.
BitBox<C: Cursor, T: BitStore>
is a &mut BitSlice<C: Cursor, T: BitStore>
in
owned memory. It has few useful methods and no trait implementations of its own.
It is only capable of taking a bit slice into owned memory.
BitVec<C: Cursor, T: BitStore>
is a BitBox
that can adjust its allocation
size. It follows the inherent and trait API of the standard library’s Vec
type.
The API for these types is deliberately uninteresting. They are written to be as
close to drop-in replacements for the standard library types as possible. The
end goal of bitvec
is that you should be able to adopt it by running three
sed
find/replace commands on your repository. This is not literally possible,
but the work required for replacement is intended to be minimal.
bitvec
generalizes its behavior through the use of traits. In order to
optimize performance, these traits are not object-safe, and may not be used
as dyn Trait
patterns for type erasure. Refactoring bitvec
to support type
erasure would require significantly rewriting core infrastructure, and this is
not a design goal. I am willing to consider it if demand is shown, but I am not
going to proactively pursue it.
The Cursor
trait is an open-ended trait, that you are free to implement
yourself. It has one required function: fn at<T: BitStore>(BitIdx) -> BitPos
.
This function translates a semantic index to an electrical position. bitvec
provides two implementations for you: BigEndian
and LittleEndian
, described
above. The invariants this function must uphold are listed in its documentation.
The BitStore
trait is sealed, and may only be implemented by this library. It
is used to abstract over the Rust fundamentals u8
, u16
, u32
, and (on
64-bit systems) u64
.
Your choice in fundamental types governs how the Cursor
type translates
indices, and how the memory underneath your slice is written. The document
doc/Bit Patterns.md
enumerates the effect of the Cursor
and BitStore
combinations on raw memory.
If you are using bitvec
to perform set arithmetic, and you expect that your
sets will have more full elements in the interior than partially-used elements
on the front and back edge, it is advantageous to use the local CPU word. The
BitSlice
operations which traverse the slice are required to perform
bit-by-bit crawls on partial-use elements, but are able to use whole-word
instructions on full elements. The latter is a marked acceleration.
If you are using bitvec
to perform I/O packet manipulation, you should use the
fundamental best suited for your protocols. This is likely u8
, which is why it
is the default type.
The Bits
and BitsMut
traits are entry interfaces to the BitSlice
types.
These are equivalent to the AsRef
and AsMut
reference conversion traits in
the standard library, and should be used as such.
These traits are implemented on the Rust fundamentals that implement BitStore
(uN
), on slices of those fundamentals ([uN]
), and the first thirty-two
arrays of them ([uN; 0]
to [uN; 32]
). Each implementation of these traits
causes a linear expansion of compile time, and going beyond thirty-two both
surpasses the standard library’s manual implementation limits, and is a
denial-of-service attack on each rebuild.
These traits are left open so that if you need to implement them on wider arrays, you are able to do so.
You can use these traits to attach .as_bitslice::<C: Cursor>()
and
.as_mut_bitslice::<C: Cursor>()
conversion methods to any implementor, and
gain access to a BitSlice
over that type, or to bound a generic function
similar to how the standard library uses AsRef<Path>
:
let mut base = [0u8; 8];
let bits = base.as_mut_bitslice::<LittleEndian>();
// bits is now an `&mut BitSlice<LittleEndian, u8>`
println!("{}", bits.len()); // 64
fn operate_on_bits(mut data: impl BitsMut) {
let bits = data.as_mut_bitslice::<BigEndian>();
// `bits` is now an `&mut BitSlice<BigEndian, _>`
}
The bitbox!
and bitvec!
macros allow convenient production of their
eponymous types, equivalent to the vec!
macro in the standard library.
These macros accept an optional cursor token, an optional type token, and either a list of bits or a single bit and a repetition counter.
Because these are standard macros, not proc-macros, they do not yet produce well-optimized expanded code.
These macros are more thoroughly explained, including a list of all available use syntaxes, in their documentation.
This snippet runs through a selection of library functionality to demonstrate behavior. It is deliberately not representative of likely usage.
extern crate bitvec;
use bitvec::prelude::*;
use std::iter::repeat;
fn main() {
let mut bv = bitvec![BigEndian, u8; 0, 1, 0, 1];
bv.reserve(8);
bv.extend(repeat(false).take(4).chain(repeat(true).take(4)));
// Memory access
assert_eq!(bv.as_slice(), &[0b0101_0000, 0b1111_0000]);
// index 0 -^ ^- index 11
assert_eq!(bv.len(), 12);
assert!(bv.capacity() >= 16);
// Stack operations
bv.push(true);
bv.push(false);
bv.push(true);
assert!(bv[12]);
assert!(!bv[13]);
assert!(bv[14]);
assert!(bv.get(15).is_none());
bv.pop();
bv.pop();
bv.pop();
// Set operations. These deliberately have no effect.
bv &= repeat(true);
bv = bv | repeat(false);
bv ^= repeat(true);
bv = !bv;
// Arithmetic operations
let one = bitvec![1];
bv += one.clone();
assert_eq!(bv.as_slice(), &[0b0101_0001, 0b0000_0000]);
bv -= one.clone();
assert_eq!(bv.as_slice(), &[0b0101_0000, 0b1111_0000]);
// Borrowing iteration
let mut iter = bv.iter();
// index 0
assert_eq!(iter.next().unwrap(), false);
// index 11
assert_eq!(iter.next_back().unwrap(), true);
assert_eq!(iter.len(), 10);
}
Immutable and mutable access to the underlying memory is provided by the AsRef
and AsMut
implementations, so the BitVec
can be readily passed to transport
functions.
BitVec
implements Borrow
down to BitSlice
, and BitSlice
implements
ToOwned
up to BitVec
, so they can be used in a Cow
or wherever this API
is desired. Any case where a Vec
/[T]
pair cannot be replaced with a
BitVec
/BitSlice
pair is a bug in this library, and a bug report is
appropriate.
BitVec
can relinquish its owned memory with .into_vec()
or
.into_boxed_slice()
, and BitSlice
can relinquish its borrow by going out
of scope.
The BitSlice
type is able to cause memory aliasing, as multiple independent
&mut BitSlice
instances may use the same underlying memory. This crate takes
care to ensure that all observed behavior is exactly as expected, without any
side effects.
The BitSlice
methods only use whole-element instructions when the slice spans
the full width of the element; when the slice has only partial use, the methods
crawl each bit individually. This is slower on most architectures, but
guarantees safety.
Race conditions are avoided through use of the atomic read/modify/write
instructions stabilized in 1.34.0
, as described above.
Contributions of items in this list are absolutely welcome! Contributions of other features are also welcome, but I’ll have to be sold on them.
- Creation of specialized pointers
Rc<BitSlice>
andArc<BitSlice>
. - Procedural macros for
bitvec!
andbitbox!
- An FFI module, and bindings from other languages.