Skip to content

Commit

Permalink
x86-64: Remove unnecessary barrier in vread_tsc
Browse files Browse the repository at this point in the history
RDTSC is completely unordered on modern Intel and AMD CPUs.  The
Intel manual says that lfence;rdtsc causes all previous instructions
to complete before the tsc is read, and the AMD manual says to use
mfence;rdtsc to do the same thing.

From a decent amount of testing [1] this is enough to make rdtsc
be ordered with respect to subsequent loads across a wide variety
of CPUs.

On Sandy Bridge (i7-2600), this improves a loop of
clock_gettime(CLOCK_MONOTONIC) by more than 5 ns/iter.

[1] https://lkml.org/lkml/2011/4/18/350

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Borislav Petkov <[email protected]>
Link: http://lkml.kernel.org/r/%3C1c158b9d74338aa5361f96dd473d0e6a58235302.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner <[email protected]>
  • Loading branch information
amluto authored and KAGA-KOKO committed May 24, 2011
1 parent 8c49d9a commit 057e6a8
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions arch/x86/kernel/tsc.c
Original file line number Diff line number Diff line change
Expand Up @@ -769,13 +769,14 @@ static cycle_t __vsyscall_fn vread_tsc(void)
cycle_t ret;

/*
* Surround the RDTSC by barriers, to make sure it's not
* speculated to outside the seqlock critical section and
* does not cause time warps:
* Empirically, a fence (of type that depends on the CPU)
* before rdtsc is enough to ensure that rdtsc is ordered
* with respect to loads. The various CPU manuals are unclear
* as to whether rdtsc can be reordered with later loads,
* but no one has ever seen it happen.
*/
rdtsc_barrier();
ret = (cycle_t)vget_cycles();
rdtsc_barrier();

return ret >= VVAR(vsyscall_gtod_data).clock.cycle_last ?
ret : VVAR(vsyscall_gtod_data).clock.cycle_last;
Expand Down

0 comments on commit 057e6a8

Please sign in to comment.