Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu fixes and cleanups from Ingo Molnar:
 "This is _way_ more cleanups than fixes, but the bugs were subtle and
  hard to hit, and the primary reason for them existing was the
  unnecessary historical complexity of some of the x86/fpu interfaces.

  The first bunch of commits clean up and simplify the xstate user copy
  handling functions, in reaction to the collective head-scratching
  about the xstate user-copy handling code that leads up to the fix for
  this Skylake xstate handling bug:

     0852b37: x86/fpu: Add FPU state copying quirk to handle XRSTOR failure on Intel Skylake CPUs

  The cleanups don't change any functionality, they just (hopefully)
  make it all clearer, more consistent, more debuggable and more robust.

  Note that most of the linecount increase comes from these commits,
  where we better split the user/kernel copy logic by having more
  variants, instead of repeated fragile patterns of:

               if (kbuf) {
                       memcpy(kbuf + pos, data, copy);
               } else {
                       if (__copy_to_user(ubuf + pos, data, copy))
                               return -EFAULT;
               }

  The next bunch of commits simplify the FPU state-machine to get rid of
  old lazy-FPU idiosyncrasies - a defensive simplification to make all
  the code easier to review and fix. No change in functionality.

  Then there's a couple of additional debugging tweaks: a static checker
  warning fix and moving an FPU-related warning under WARN_ON_FPU(),
  followed by another bunch of commits that represent a fine-grained
  split-up of the fixes from Eric Biggers to handle weird xstate bits
  properly.

  I did this fine-grained split-up because some of these fixes also
  impact the ABI for weird xstate handling, for which we'd like to have
  good bisection results, should they cause any problems. (We also had
  one regression with the more monolithic fixes, so splitting it all up
  sounded prudent for robustness reasons as well.)

  About the whole series: the commits up to 03eaec8 have been in
  -next for months - but I've recently rebased them to remove a state
  machine clean-up commit that was objected to, and to make it more
  bisectable - so technically it's a new, rebased tree.

  Robustness history: this series had some regressions along the way,
  and all reported regressions have been fixed. All but one of the
  regressions manifested themselves as easy-to-report warnings. The previous
  version of this latest series was also in linux-next, with one
  (warning-only) regression reported which is fixed in the latest
  version.

  Barring last minute brown paper bag bugs (and the commits are now
  older by a day which I'd hope helps paperbag reduction), I'm
  reasonably confident about its general robustness.

  Famous last words ..."

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVES
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_user_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate()
  x86/fpu: Copy the full header in copy_user_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_kernel_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_kernel_to_xstate()
  x86/fpu: Copy the full state_header in copy_kernel_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in __fpu__restore_sig()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in xstateregs_set()
  x86/fpu: Introduce validate_xstate_header()
  x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__prepare_[read|write]()
  x86/fpu: Rename fpu__activate_curr() to fpu__initialize()
  x86/fpu: Simplify and speed up fpu__copy()
  x86/fpu: Fix stale comments about lazy FPU logic
  x86/fpu: Rename fpu::fpstate_active to fpu::initialized
  x86/fpu: Remove fpu__current_fpstate_write_begin/end()
  x86/fpu: Fix fpu__activate_fpstate_read() and update comments
  x86/fpu: Reinitialize FPU registers if restoring FPU state fails
  x86/fpu: Don't let userspace set bogus xcomp_bv
  x86/fpu: Turn WARN_ON() in context switch into WARN_ON_FPU()
  ...
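
The pull message above describes replacing the repeated `if (kbuf) ... else ...` pattern with separate kernel-buffer and user-buffer variants. The following is a minimal user-space sketch of that idea, not the kernel code: `copy_part_to_kernel()`, `copy_part_to_user()` and `fake_copy_to_user()` are hypothetical names, and a NULL destination is an assumed stand-in for a faulting user pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the kernel's __copy_to_user(): returns the
 * number of bytes NOT copied (0 on success). A NULL destination
 * simulates a faulting user pointer in this sketch.
 */
static size_t fake_copy_to_user(void *ubuf, const void *src, size_t len)
{
	if (!ubuf)
		return len;
	memcpy(ubuf, src, len);
	return 0;
}

/* Kernel-buffer variant: a plain memcpy() that cannot fault. */
static int copy_part_to_kernel(void *kbuf, size_t pos,
			       const void *data, size_t len)
{
	assert(kbuf && data);
	memcpy((char *)kbuf + pos, data, len);
	return 0;
}

/* User-buffer variant: the only path that can return -EFAULT. */
static int copy_part_to_user(void *ubuf, size_t pos,
			     const void *data, size_t len)
{
	assert(data);
	if (fake_copy_to_user(ubuf ? (char *)ubuf + pos : NULL, data, len))
		return -EFAULT;
	return 0;
}
```

With two variants, each caller states at the call site which kind of buffer it has, and the fault-handling path lives only in the user-buffer helper.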
torvalds committed Sep 27, 2017
2 parents dc972a6 + 8474c53 commit 7031b64
Showing 15 changed files with 375 additions and 315 deletions.
2 changes: 1 addition & 1 deletion arch/x86/ia32/ia32_signal.c
@@ -231,7 +231,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 		 ksig->ka.sa.sa_restorer)
 		sp = (unsigned long) ksig->ka.sa.sa_restorer;

-	if (fpu->fpstate_active) {
+	if (fpu->initialized) {
 		unsigned long fx_aligned, math_size;

 		sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size);

90 changes: 22 additions & 68 deletions arch/x86/include/asm/fpu/internal.h
@@ -23,11 +23,9 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__activate_curr(struct fpu *fpu);
-extern void fpu__activate_fpstate_read(struct fpu *fpu);
-extern void fpu__activate_fpstate_write(struct fpu *fpu);
-extern void fpu__current_fpstate_write_begin(void);
-extern void fpu__current_fpstate_write_end(void);
+extern void fpu__initialize(struct fpu *fpu);
+extern void fpu__prepare_read(struct fpu *fpu);
+extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
 extern void fpu__restore(struct fpu *fpu);
 extern int fpu__restore_sig(void __user *buf, int ia32_frame);
@@ -120,20 +118,11 @@ extern void fpstate_sanitize_xstate(struct fpu *fpu);
 	err; \
 })

-#define check_insn(insn, output, input...) \
-({ \
-	int err; \
+#define kernel_insn(insn, output, input...) \
 	asm volatile("1:" #insn "\n\t" \
 		     "2:\n" \
-		     ".section .fixup,\"ax\"\n" \
-		     "3: movl $-1,%[err]\n" \
-		     " jmp 2b\n" \
-		     ".previous\n" \
-		     _ASM_EXTABLE(1b, 3b) \
-		     : [err] "=r" (err), output \
-		     : "0"(0), input); \
-	err; \
-})
+		     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_fprestore) \
+		     : output : input)

 static inline int copy_fregs_to_user(struct fregs_state __user *fx)
 {
@@ -153,20 +142,16 @@ static inline int copy_fxregs_to_user(struct fxregs_state __user *fx)

 static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
 {
-	int err;
-
 	if (IS_ENABLED(CONFIG_X86_32)) {
-		err = check_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
+		kernel_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 	} else {
 		if (IS_ENABLED(CONFIG_AS_FXSAVEQ)) {
-			err = check_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
+			kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 		} else {
 			/* See comment in copy_fxregs_to_kernel() below. */
-			err = check_insn(rex64/fxrstor (%[fx]), "=m" (*fx), [fx] "R" (fx), "m" (*fx));
+			kernel_insn(rex64/fxrstor (%[fx]), "=m" (*fx), [fx] "R" (fx), "m" (*fx));
 		}
 	}
-	/* Copying from a kernel buffer to FPU registers should never fail: */
-	WARN_ON_FPU(err);
 }

 static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
@@ -183,9 +168,7 @@ static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)

 static inline void copy_kernel_to_fregs(struct fregs_state *fx)
 {
-	int err = check_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
-
-	WARN_ON_FPU(err);
+	kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }

 static inline int copy_user_to_fregs(struct fregs_state __user *fx)
@@ -281,18 +264,13 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
  * Use XRSTORS to restore context if it is enabled. XRSTORS supports compact
  * XSAVE area format.
  */
-#define XSTATE_XRESTORE(st, lmask, hmask, err) \
+#define XSTATE_XRESTORE(st, lmask, hmask) \
 	asm volatile(ALTERNATIVE(XRSTOR, \
 				 XRSTORS, X86_FEATURE_XSAVES) \
 		     "\n" \
-		     "xor %[err], %[err]\n" \
 		     "3:\n" \
-		     ".pushsection .fixup,\"ax\"\n" \
-		     "4: movl $-2, %[err]\n" \
-		     "jmp 3b\n" \
-		     ".popsection\n" \
-		     _ASM_EXTABLE(661b, 4b) \
-		     : [err] "=r" (err) \
+		     _ASM_EXTABLE_HANDLE(661b, 3b, ex_handler_fprestore)\
+		     : \
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \
 		     : "memory")

@@ -336,7 +314,10 @@ static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
 	else
 		XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);

-	/* We should never fault when copying from a kernel buffer: */
+	/*
+	 * We should never fault when copying from a kernel buffer, and the FPU
+	 * state we set at boot time should be valid.
+	 */
 	WARN_ON_FPU(err);
 }

@@ -350,7 +331,7 @@ static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
 	u32 hmask = mask >> 32;
 	int err;

-	WARN_ON(!alternatives_patched);
+	WARN_ON_FPU(!alternatives_patched);

 	XSTATE_XSAVE(xstate, lmask, hmask, err);

@@ -365,12 +346,8 @@ static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
-	int err;

-	XSTATE_XRESTORE(xstate, lmask, hmask, err);
-
-	/* We should never fault when copying from a kernel buffer: */
-	WARN_ON_FPU(err);
+	XSTATE_XRESTORE(xstate, lmask, hmask);
 }

 /*
@@ -526,37 +503,16 @@ static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
  */
 static inline void fpregs_deactivate(struct fpu *fpu)
 {
-	WARN_ON_FPU(!fpu->fpregs_active);
-
-	fpu->fpregs_active = 0;
 	this_cpu_write(fpu_fpregs_owner_ctx, NULL);
 	trace_x86_fpu_regs_deactivated(fpu);
 }

 static inline void fpregs_activate(struct fpu *fpu)
 {
-	WARN_ON_FPU(fpu->fpregs_active);
-
-	fpu->fpregs_active = 1;
 	this_cpu_write(fpu_fpregs_owner_ctx, fpu);
 	trace_x86_fpu_regs_activated(fpu);
 }

-/*
- * The question "does this thread have fpu access?"
- * is slightly racy, since preemption could come in
- * and revoke it immediately after the test.
- *
- * However, even in that very unlikely scenario,
- * we can just assume we have FPU access - typically
- * to save the FP state - we'll just take a #NM
- * fault and get the FPU access back.
- */
-static inline int fpregs_active(void)
-{
-	return current->thread.fpu.fpregs_active;
-}
-
 /*
  * FPU state switching for scheduling.
  *
@@ -571,14 +527,13 @@ static inline int fpregs_active(void)
 static inline void
 switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
-	if (old_fpu->fpregs_active) {
+	if (old_fpu->initialized) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
 			old_fpu->last_cpu = -1;
 		else
 			old_fpu->last_cpu = cpu;

 		/* But leave fpu_fpregs_owner_ctx! */
-		old_fpu->fpregs_active = 0;
 		trace_x86_fpu_regs_deactivated(old_fpu);
 	} else
 		old_fpu->last_cpu = -1;
@@ -595,7 +550,7 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 {
 	bool preload = static_cpu_has(X86_FEATURE_FPU) &&
-		       new_fpu->fpstate_active;
+		       new_fpu->initialized;

 	if (preload) {
 		if (!fpregs_state_valid(new_fpu, cpu))
@@ -617,8 +572,7 @@ static inline void user_fpu_begin(void)
 	struct fpu *fpu = &current->thread.fpu;

 	preempt_disable();
-	if (!fpregs_active())
-		fpregs_activate(fpu);
+	fpregs_activate(fpu);
 	preempt_enable();
 }

32 changes: 6 additions & 26 deletions arch/x86/include/asm/fpu/types.h
@@ -68,6 +68,9 @@ struct fxregs_state {
 /* Default value for fxregs_state.mxcsr: */
 #define MXCSR_DEFAULT 0x1f80

+/* Copy both mxcsr & mxcsr_flags with a single u64 memcpy: */
+#define MXCSR_AND_FLAGS_SIZE sizeof(u64)
+
 /*
  * Software based FPU emulation state. This is arbitrary really,
  * it matches the x87 format to make it easier to understand:
@@ -290,36 +293,13 @@ struct fpu {
 	unsigned int last_cpu;

 	/*
-	 * @fpstate_active:
+	 * @initialized:
 	 *
-	 * This flag indicates whether this context is active: if the task
+	 * This flag indicates whether this context is initialized: if the task
 	 * is not running then we can restore from this context, if the task
 	 * is running then we should save into this context.
 	 */
-	unsigned char fpstate_active;
-
-	/*
-	 * @fpregs_active:
-	 *
-	 * This flag determines whether a given context is actively
-	 * loaded into the FPU's registers and that those registers
-	 * represent the task's current FPU state.
-	 *
-	 * Note the interaction with fpstate_active:
-	 *
-	 * # task does not use the FPU:
-	 * fpstate_active == 0
-	 *
-	 * # task uses the FPU and regs are active:
-	 * fpstate_active == 1 && fpregs_active == 1
-	 *
-	 * # the regs are inactive but still match fpstate:
-	 * fpstate_active == 1 && fpregs_active == 0 && fpregs_owner == fpu
-	 *
-	 * The third state is what we use for the lazy restore optimization
-	 * on lazy-switching CPUs.
-	 */
-	unsigned char fpregs_active;
+	unsigned char initialized;

 	/*
 	 * @state:

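The `MXCSR_AND_FLAGS_SIZE` define added in the types.h hunk above relies on two 32-bit fields sitting next to each other, so that one 8-byte copy moves both. A minimal user-space sketch of that idea follows. The struct is an illustrative fragment rather than the kernel's `struct fxregs_state`, and the field name `mxcsr_mask` and the helper `copy_mxcsr_pair()` are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MXCSR_AND_FLAGS_SIZE sizeof(uint64_t)

/* Illustrative fragment only: two adjacent 32-bit fields. */
struct mxcsr_pair_sketch {
	uint32_t mxcsr;
	uint32_t mxcsr_mask;	/* assumed name for the second field */
};

/* The single copy is only valid if the pair is exactly 8 bytes. */
_Static_assert(sizeof(struct mxcsr_pair_sketch) == MXCSR_AND_FLAGS_SIZE,
	       "fields must be adjacent with no padding");

/* Hypothetical helper: copy both fields with one 8-byte memcpy(). */
static void copy_mxcsr_pair(struct mxcsr_pair_sketch *dst,
			    const struct mxcsr_pair_sketch *src)
{
	assert(dst && src);
	memcpy(dst, src, MXCSR_AND_FLAGS_SIZE);
}
```

The compile-time assertion documents the layout assumption, so a change to the struct that breaks adjacency would fail the build instead of silently copying the wrong bytes.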
12 changes: 8 additions & 4 deletions arch/x86/include/asm/fpu/xstate.h
@@ -48,8 +48,12 @@ void fpu__xstate_clear_all_cpu_caps(void);
 void *get_xsave_addr(struct xregs_state *xsave, int xstate);
 const void *get_xsave_field_ptr(int xstate_field);
 int using_compacted_format(void);
-int copyout_from_xsaves(unsigned int pos, unsigned int count, void *kbuf,
-			void __user *ubuf, struct xregs_state *xsave);
-int copyin_to_xsaves(const void *kbuf, const void __user *ubuf,
-		     struct xregs_state *xsave);
+int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
+int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
+int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
+int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
+
+/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
+extern int validate_xstate_header(const struct xstate_header *hdr);
+
 #endif
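
The xstate.h hunk above only declares `validate_xstate_header()`; its body is not part of the diff shown here. As a rough illustration of what validating a userspace-supplied header can involve, here is a user-space sketch. The struct layout, the `supported` mask parameter and the specific checks are assumptions made for the example, loosely motivated by the "Don't let userspace set bogus xcomp_bv" commit in the list, and should not be read as the kernel's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout modelled on a 64-byte XSAVE header; names are illustrative. */
struct xstate_header_sketch {
	uint64_t xfeatures;
	uint64_t xcomp_bv;
	uint64_t reserved[6];
};

/*
 * Hypothetical validator. 'supported' is an assumed mask of the feature
 * bits that userspace is allowed to set.
 */
static int validate_header_sketch(const struct xstate_header_sketch *hdr,
				  uint64_t supported)
{
	size_t i;

	assert(hdr);

	/* Reject feature bits outside the supported mask. */
	if (hdr->xfeatures & ~supported)
		return -EINVAL;

	/* Assume userspace buffers must use the non-compacted format. */
	if (hdr->xcomp_bv)
		return -EINVAL;

	/* Reserved header words must be zero. */
	for (i = 0; i < sizeof(hdr->reserved) / sizeof(hdr->reserved[0]); i++)
		if (hdr->reserved[i])
			return -EINVAL;

	return 0;
}
```

Centralizing these checks in one function is what lets the ptrace and sigreturn paths named in the declaration's comment share a single definition of a valid header.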
11 changes: 4 additions & 7 deletions arch/x86/include/asm/trace/fpu.h
@@ -12,25 +12,22 @@ DECLARE_EVENT_CLASS(x86_fpu,

 	TP_STRUCT__entry(
 		__field(struct fpu *, fpu)
-		__field(bool, fpregs_active)
-		__field(bool, fpstate_active)
+		__field(bool, initialized)
 		__field(u64, xfeatures)
 		__field(u64, xcomp_bv)
 	),

 	TP_fast_assign(
 		__entry->fpu = fpu;
-		__entry->fpregs_active = fpu->fpregs_active;
-		__entry->fpstate_active = fpu->fpstate_active;
+		__entry->initialized = fpu->initialized;
 		if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 			__entry->xfeatures = fpu->state.xsave.header.xfeatures;
 			__entry->xcomp_bv = fpu->state.xsave.header.xcomp_bv;
 		}
 	),
-	TP_printk("x86/fpu: %p fpregs_active: %d fpstate_active: %d xfeatures: %llx xcomp_bv: %llx",
+	TP_printk("x86/fpu: %p initialized: %d xfeatures: %llx xcomp_bv: %llx",
 		__entry->fpu,
-		__entry->fpregs_active,
-		__entry->fpstate_active,
+		__entry->initialized,
 		__entry->xfeatures,
 		__entry->xcomp_bv
 	)