An async, fast synchronization primitives for task wakeup.
diatomic-waker
is similar to atomic-waker
in that it enables
concurrent updates and notifications to a wrapped Waker
. Unlike the latter,
however, it does not use spinlocks1 and is significantly faster, in
particular when the consumer needs to be notified periodically rather than just
once. It can in particular be used as a very fast, single-consumer eventcount
to turn a non-blocking data structure into an asynchronous one (see MPSC channel
receiver example).
This library is an offshoot of Asynchronix, an ongoing effort at a high performance asynchronous computation framework for system simulation.
Add this to your Cargo.toml
:
[dependencies]
diatomic-waker = "0.2.3"
By default, this crate enables the alloc
feature to provide the owned
WakeSink
and WakeSource
. It can be made no-std
-compatible by specifying
default-features = false
.
A multi-producer, single-consumer channel of capacity 1 for sending
NonZeroUsize
values, with an asynchronous receiver:
use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use diatomic_waker::{WakeSink, WakeSource};
// The sending side of the channel.
#[derive(Clone)]
struct Sender {
wake_src: WakeSource,
value: Arc<AtomicUsize>,
}
// The receiving side of the channel.
struct Receiver {
wake_sink: WakeSink,
value: Arc<AtomicUsize>,
}
// Creates an empty channel.
fn channel() -> (Sender, Receiver) {
let value = Arc::new(AtomicUsize::new(0));
let wake_sink = WakeSink::new();
let wake_src = wake_sink.source();
(
Sender {
wake_src,
value: value.clone(),
},
Receiver { wake_sink, value },
)
}
impl Sender {
// Sends a value if the channel is empty.
fn try_send(&self, value: NonZeroUsize) -> bool {
let success = self
.value
.compare_exchange(0, value.get(), Ordering::Relaxed, Ordering::Relaxed)
.is_ok();
if success {
self.wake_src.notify()
};
success
}
}
impl Receiver {
// Receives a value asynchronously.
async fn recv(&mut self) -> NonZeroUsize {
// Wait until the predicate returns `Some(value)`, i.e. when the atomic
// value becomes non-zero.
self.wake_sink
.wait_until(|| NonZeroUsize::new(self.value.swap(0, Ordering::Relaxed)))
.await
}
}
This is a low-level primitive and as such its implementation relies on unsafe
.
The test suite makes extensive use of Loom to assess its correctness. As
amazing as it is, however, Loom is only a tool: it cannot formally prove the
absence of data races.
A distinguishing feature of diatomic-waker
is its use of two waker storage
slots (hence its name) rather than one. This makes it possible to achieve
lock-freedom in situations where waker registration and notification are
performed concurrently. In the case of concurrent notifications, even though one
notifier does hold a notification lock, other notifiers never block: they merely
request the holder of the lock to send another notification, which is a
wait-free operation.
Compared to atomic-waker
, dummy notifications (with no waker registered) are
much cheaper. The overall cost of a successful notification (registration +
notification itself) is also much cheaper in the common case where the
registered/unregistered waker is always the same, because the last waker is
always cached to avoid undue cloning. Quantitatively, the costs in terms of
atomic Read-Modify-Write (RMW) operations are:
- dummy notification due to no waker being registered: 1 RMW vs 2 RMWs for
atomic-waker
, - registration of the same waker as the last registered waker + notification:
1+3=4 RMWs vs 3+4=7 RMWs for
atomic-waker
(this assumes 1 RMW forWaker::wake_by_ref
, 2 RMWs forWaker::wake
, 1 RMW forWaker::clone
). - registration of a new waker + notification: 3+3=6 RMWs vs 3+4=7 RMWs for
atomic-waker
(same assumptions as above + 1 RMW forWaker::drop
); this is typically only necessary for the very first registration, - very few RMWs and predictable cost on contention due to the absence of spinlocks.
This software is licensed under the Apache License, Version 2.0 or the MIT license, at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Footnotes
-
The implementation of AtomicWaker yields to the runtime on contention, which is in effect an executor-mediated spinlock. ↩