forked from cloudius-systems/osv
1. Make the structure smaller
-----------------------------
If we reduce the size of mutex (currently 40 bytes) to 36 bytes, and also
reduce the size of condvar, we can use mutex and condvar directly in the
pthread implementation instead of wrapping them in lazy_indirect<>.
We shouldn't do this unless there's evidence that lazy_indirect<> is
hurting performance.
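To make the trade-off concrete, here is a minimal sketch of what a lazily
allocated indirection wrapper could look like. It is not the actual
lazy_indirect<> code: the name lazy_indirect_sketch, its get() method, and the
allocation strategy are assumptions made for illustration. The point is only
that each access pays for an extra pointer load (and the first access for an
allocation), which embedding the object directly would avoid.

```cpp
#include <atomic>

// Hypothetical stand-in for a lazy indirection wrapper. The wrapper itself is
// only pointer-sized; the real object is allocated on first use.
template <typename T>
class lazy_indirect_sketch {
    std::atomic<T*> ptr{nullptr};
public:
    lazy_indirect_sketch() = default;
    lazy_indirect_sketch(const lazy_indirect_sketch&) = delete;
    lazy_indirect_sketch& operator=(const lazy_indirect_sketch&) = delete;
    ~lazy_indirect_sketch() { delete ptr.load(std::memory_order_acquire); }

    // Return the wrapped object, allocating it on the first call.
    T* get() {
        T* p = ptr.load(std::memory_order_acquire);
        if (!p) {
            T* fresh = new T();
            // Publish our object unless another thread published one first.
            if (ptr.compare_exchange_strong(p, fresh,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
                p = fresh;
            } else {
                delete fresh;  // lost the race; p now points to the winner's object
            }
        }
        return p;
    }
};
```

Every operation on the wrapped object goes through get(), so the cost to
measure is that extra dependent load on each lock and unlock.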
Some ideas on how to make the structures smaller:
1. Make sequence, handoff, and/or depth 16-bit (in osv/mutex.h depth is
already 16-bit).
We could also make "count" 16-bit, but that would limit the number of
waiting threads to about 64K (2^16), so it is probably not a good idea.
2. Make queue_mpsc's two pointers 32-bit. On thread creation, give each
thread a 32-bit pointer (or a recycled thread_id - see below - indexing
a global array) which can be used instead of putting the wait struct on
the stack. Or perhaps we can put all stacks in the low 4 GB of the address
space, so that a 32-bit pointer suffices?
3. Do the same to the two pointers in condvar to make condvar smaller too.
4. Have a new recycled low (32-bit) numeric "threadid" and use it for owner
instead of a 64-bit pointer.
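The ideas above can be combined into a layout sketch like the following. This
is not the actual definition in osv/mutex.h: the struct name, the field order,
the queue field names, thread_sketch, thread_table, max_threads, and
thread_from_id() are all hypothetical, and the sketch only shows that 16-bit
counters plus 32-bit thread ids and queue handles fit comfortably under the
36-byte target.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical compact layout. "count" stays 32-bit; everything that used to
// be a 64-bit pointer becomes a 32-bit id or handle.
struct compact_mutex_sketch {
    std::atomic<std::int32_t>  count{0};
    std::atomic<std::uint32_t> owner_id{0};        // recycled thread id, 0 = no owner
    std::atomic<std::uint32_t> queue_pushlist{0};  // 32-bit handle, not a pointer
    std::uint32_t              queue_poplist{0};   // 32-bit handle, not a pointer
    std::atomic<std::uint16_t> sequence{0};
    std::atomic<std::uint16_t> handoff{0};
    std::uint16_t              depth{0};
};

static_assert(sizeof(compact_mutex_sketch) <= 36,
              "sketch should fit in the 36-byte target");

// Hypothetical global table mapping a recycled 32-bit thread id back to the
// per-thread structure; id 0 is reserved to mean "no thread".
struct thread_sketch { int placeholder; };
constexpr std::uint32_t max_threads = 1024;  // assumed limit, for illustration
static thread_sketch* thread_table[max_threads];

inline thread_sketch* thread_from_id(std::uint32_t id) {
    assert(id != 0 && id < max_threads);
    return thread_table[id];
}
```

The cost of this approach is the extra table lookup whenever the owner or a
waiter has to be turned back into a thread pointer, for example on wakeup.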
2. Memory ordering
------------------
Using sequentially consistent memory ordering for all atomic variables is
definitely not needed, and it significantly slows down the mutex. I have
already relaxed the memory ordering in the fast path and saw a significant
improvement in the uncontended case, but this work still needs to be
completed for the handoff case.
Use measure_uncontended<handoff_stressing_mutex> to measure that case
(currently I see 48 ns, versus 29 ns for the non-handoff case).
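As an illustration of what relaxing the ordering means on the fast path, here
is a minimal sketch using std::atomic. It is not the mutex implementation
discussed in these notes: the class fast_path_sketch and its single "count"
word are assumptions, and it has no wait queue or handoff at all. It only
shows acquire on lock and release on unlock replacing the default
sequentially consistent operations.

```cpp
#include <atomic>

// Hypothetical minimal lock, used only to show the memory orderings.
class fast_path_sketch {
    std::atomic<int> count{0};  // 0 = free, 1 = held
public:
    // Uncontended acquisition: one compare-and-swap.
    bool try_lock() {
        int expected = 0;
        // acquire on success keeps the critical section from being reordered
        // before the lock is taken; a failed attempt needs no ordering.
        return count.compare_exchange_strong(expected, 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    // Uncontended release: a single store.
    void unlock() {
        // release makes the critical section's writes visible to the next
        // thread that acquires the lock.
        count.store(0, std::memory_order_release);
    }
};
```

The handoff path is harder to relax because several atomic variables interact
there, so each weakened ordering has to be justified against the others.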