Skip to content

Commit

Permalink
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Browse files Browse the repository at this point in the history
Pull KVM updates from Paolo Bonzini:
 "One of the largest releases for KVM...  Hardly any generic
  changes, but lots of architecture-specific updates.

  ARM:
   - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
   - PMU support for guests
   - 32bit world switch rewritten in C
   - various optimizations to the vgic save/restore code.

  PPC:
   - enabled KVM-VFIO integration ("VFIO device")
   - optimizations to speed up IPIs between vcpus
   - in-kernel handling of IOMMU hypercalls
   - support for dynamic DMA windows (DDW).

  s390:
   - provide the floating point registers via sync regs;
   - separated instruction vs.  data accesses
   - dirty log improvements for huge guests
   - bugfixes and documentation improvements.

  x86:
   - Hyper-V VMBus hypercall userspace exit
   - alternative implementation of lowest-priority interrupts using
     vector hashing (for better VT-d posted interrupt support)
   - fixed guest debugging with nested virtualizations
   - improved interrupt tracking in the in-kernel IOAPIC
   - generic infrastructure for tracking writes to guest
     memory - currently its only use is to speedup the legacy shadow
     paging (pre-EPT) case, but in the future it will be used for
     virtual GPUs as well
   - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits)
  KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch
  KVM: x86: disable MPX if host did not enable MPX XSAVE features
  arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
  arm64: KVM: vgic-v3: Reset LRs at boot time
  arm64: KVM: vgic-v3: Do not save an LR known to be empty
  arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
  arm64: KVM: vgic-v3: Avoid accessing ICH registers
  KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit
  KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit
  KVM: arm/arm64: vgic-v2: Reset LRs at boot time
  KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty
  KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function
  KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required
  KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers
  KVM: s390: allocate only one DMA page per VM
  KVM: s390: enable STFLE interpretation only if enabled for the guest
  KVM: s390: wake up when the VCPU cpu timer expires
  KVM: s390: step the VCPU timer while in enabled wait
  KVM: s390: protect VCPU cpu timer with a seqcount
  KVM: s390: step VCPU cpu timer during kvm_run ioctl
  ...
  • Loading branch information
torvalds committed Mar 16, 2016
2 parents 047486d + f958ee7 commit 10dc374
Show file tree
Hide file tree
Showing 142 changed files with 6,757 additions and 2,939 deletions.
99 changes: 95 additions & 4 deletions Documentation/virtual/kvm/api.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2507,8 +2507,9 @@ struct kvm_create_device {

4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
KVM_CAP_VCPU_ATTRIBUTES for vcpu device
Type: device ioctl, vm ioctl, vcpu ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
Expand All @@ -2533,8 +2534,9 @@ struct kvm_device_attr {

4.81 KVM_HAS_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
KVM_CAP_VCPU_ATTRIBUTES for vcpu device
Type: device ioctl, vm ioctl, vcpu ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
Expand Down Expand Up @@ -2577,6 +2579,8 @@ Possible features:
Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
Depends on KVM_CAP_ARM_PSCI_0_2.
- KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
Depends on KVM_CAP_ARM_PMU_V3.


4.83 KVM_ARM_PREFERRED_TARGET
Expand Down Expand Up @@ -3035,6 +3039,87 @@ Returns: 0 on success, -1 on error

Queues an SMI on the thread's vcpu.

4.97 KVM_CAP_PPC_MULTITCE

Capability: KVM_CAP_PPC_MULTITCE
Architectures: ppc
Type: vm

This capability means the kernel is capable of handling hypercalls
H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
space. This significantly accelerates DMA operations for PPC KVM guests.
User space should expect that its handlers for these hypercalls
are not going to be called if user space previously registered LIOBN
in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).

In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
user space might have to advertise it for the guest. For example,
IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
present in the "ibm,hypertas-functions" device-tree property.

The hypercalls mentioned above may or may not be processed successfully
in the kernel based fast path. If they can not be handled by the kernel,
they will get passed on to user space. So user space still has to have
an implementation for these despite the in kernel acceleration.

This capability is always enabled.

4.98 KVM_CREATE_SPAPR_TCE_64

Capability: KVM_CAP_SPAPR_TCE_64
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_create_spapr_tce_64 (in)
Returns: file descriptor for manipulating the created TCE table

This is an extension for KVM_CAP_SPAPR_TCE which only supports 32bit
windows, described in 4.62 KVM_CREATE_SPAPR_TCE

This capability uses extended struct in ioctl interface:

/* for KVM_CAP_SPAPR_TCE_64 */
struct kvm_create_spapr_tce_64 {
__u64 liobn;
__u32 page_shift;
__u32 flags;
__u64 offset; /* in pages */
__u64 size; /* in pages */
};

The aim of extension is to support an additional bigger DMA window with
a variable page size.
KVM_CREATE_SPAPR_TCE_64 receives a 64bit window size, an IOMMU page shift and
a bus offset of the corresponding DMA window, @size and @offset are numbers
of IOMMU pages.

@flags are not used at the moment.

The rest of functionality is identical to KVM_CREATE_SPAPR_TCE.

4.98 KVM_REINJECT_CONTROL

Capability: KVM_CAP_REINJECT_CONTROL
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_reinject_control (in)
Returns: 0 on success,
-EFAULT if struct kvm_reinject_control cannot be read,
-ENXIO if KVM_CREATE_PIT or KVM_CREATE_PIT2 didn't succeed earlier.

i8254 (PIT) has two modes, reinject and !reinject. The default is reinject,
where KVM queues elapsed i8254 ticks and monitors completion of interrupt from
vector(s) that i8254 injects. Reinject mode dequeues a tick and injects its
interrupt whenever there isn't a pending interrupt from i8254.
!reinject mode injects an interrupt as soon as a tick arrives.

struct kvm_reinject_control {
__u8 pit_reinject;
__u8 reserved[31];
};

pit_reinject = 0 (!reinject mode) is recommended, unless running an old
operating system that uses the PIT for timing (e.g. Linux 2.4.x).

5. The kvm_run structure
------------------------

Expand Down Expand Up @@ -3339,6 +3424,7 @@ EOI was received.

struct kvm_hyperv_exit {
#define KVM_EXIT_HYPERV_SYNIC 1
#define KVM_EXIT_HYPERV_HCALL 2
__u32 type;
union {
struct {
Expand All @@ -3347,6 +3433,11 @@ EOI was received.
__u64 evt_page;
__u64 msg_page;
} synic;
struct {
__u64 input;
__u64 result;
__u64 params[2];
} hcall;
} u;
};
/* KVM_EXIT_HYPERV */
Expand Down
2 changes: 2 additions & 0 deletions Documentation/virtual/kvm/devices/s390_flic.txt
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,8 @@ struct kvm_s390_io_adapter_req {
perform a gmap translation for the guest address provided in addr,
pin a userspace page for the translated address and add it to the
list of mappings
Note: A new mapping will be created unconditionally; therefore,
the calling code should avoid making duplicate mappings.

KVM_S390_IO_ADAPTER_UNMAP
release a userspace page for the translated address specified in addr
Expand Down
33 changes: 33 additions & 0 deletions Documentation/virtual/kvm/devices/vcpu.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
Generic vcpu interface
====================================

The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
kvm_device_attr as other devices, but targets VCPU-wide settings and controls.

The groups and attributes per virtual cpu, if any, are architecture specific.

1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
Architectures: ARM64

1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
pointer to an int
Returns: -EBUSY: The PMU overflow interrupt is already set
-ENXIO: The overflow interrupt not set when attempting to get it
-ENODEV: PMUv3 not supported
-EINVAL: Invalid PMU overflow interrupt number supplied

A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
type must be same for each vcpu. As a PPI, the interrupt number is the same for
all vcpus, while as an SPI it must be a separate number per vcpu.

1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
Parameters: no additional parameter in kvm_device_attr.addr
Returns: -ENODEV: PMUv3 not supported
-ENXIO: PMUv3 not properly configured as required prior to calling this
attribute
-EBUSY: PMUv3 already initialized

Request the initialization of the PMUv3.
52 changes: 52 additions & 0 deletions Documentation/virtual/kvm/devices/vm.txt
Original file line number Diff line number Diff line change
Expand Up @@ -84,3 +84,55 @@ Returns: -EBUSY in case 1 or more vcpus are already activated (only in write
-EFAULT if the given address is not accessible from kernel space
-ENOMEM if not enough memory is available to process the ioctl
0 in case of success

3. GROUP: KVM_S390_VM_TOD
Architectures: s390

3.1. ATTRIBUTE: KVM_S390_VM_TOD_HIGH

Allows user space to set/get the TOD clock extension (u8).

Parameters: address of a buffer in user space to store the data (u8) to
Returns: -EFAULT if the given address is not accessible from kernel space
-EINVAL if setting the TOD clock extension to != 0 is not supported

3.2. ATTRIBUTE: KVM_S390_VM_TOD_LOW

Allows user space to set/get bits 0-63 of the TOD clock register as defined in
the POP (u64).

Parameters: address of a buffer in user space to store the data (u64) to
Returns: -EFAULT if the given address is not accessible from kernel space

4. GROUP: KVM_S390_VM_CRYPTO
Architectures: s390

4.1. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_AES_KW (w/o)

Allows user space to enable aes key wrapping, including generating a new
wrapping key.

Parameters: none
Returns: 0

4.2. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_DEA_KW (w/o)

Allows user space to enable dea key wrapping, including generating a new
wrapping key.

Parameters: none
Returns: 0

4.3. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_AES_KW (w/o)

Allows user space to disable aes key wrapping, clearing the wrapping key.

Parameters: none
Returns: 0

4.4. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_DEA_KW (w/o)

Allows user space to disable dea key wrapping, clearing the wrapping key.

Parameters: none
Returns: 0
6 changes: 3 additions & 3 deletions Documentation/virtual/kvm/mmu.txt
Original file line number Diff line number Diff line change
Expand Up @@ -392,11 +392,11 @@ To instantiate a large spte, four constraints must be satisfied:
write-protected pages
- the guest page must be wholly contained by a single memory slot

To check the last two conditions, the mmu maintains a ->write_count set of
To check the last two conditions, the mmu maintains a ->disallow_lpage set of
arrays for each memory slot and large page size. Every write protected page
causes its write_count to be incremented, thus preventing instantiation of
causes its disallow_lpage to be incremented, thus preventing instantiation of
a large spte. The frames at the end of an unaligned memory slot have
artificially inflated ->write_counts so they can never be instantiated.
artificially inflated ->disallow_lpages so they can never be instantiated.

Zapping all pages (page generation count)
=========================================
Expand Down
41 changes: 3 additions & 38 deletions arch/arm/include/asm/kvm_asm.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,38 +19,7 @@
#ifndef __ARM_KVM_ASM_H__
#define __ARM_KVM_ASM_H__

/* 0 is reserved as an invalid value. */
#define c0_MPIDR 1 /* MultiProcessor ID Register */
#define c0_CSSELR 2 /* Cache Size Selection Register */
#define c1_SCTLR 3 /* System Control Register */
#define c1_ACTLR 4 /* Auxiliary Control Register */
#define c1_CPACR 5 /* Coprocessor Access Control */
#define c2_TTBR0 6 /* Translation Table Base Register 0 */
#define c2_TTBR0_high 7 /* TTBR0 top 32 bits */
#define c2_TTBR1 8 /* Translation Table Base Register 1 */
#define c2_TTBR1_high 9 /* TTBR1 top 32 bits */
#define c2_TTBCR 10 /* Translation Table Base Control R. */
#define c3_DACR 11 /* Domain Access Control Register */
#define c5_DFSR 12 /* Data Fault Status Register */
#define c5_IFSR 13 /* Instruction Fault Status Register */
#define c5_ADFSR 14 /* Auxilary Data Fault Status R */
#define c5_AIFSR 15 /* Auxilary Instrunction Fault Status R */
#define c6_DFAR 16 /* Data Fault Address Register */
#define c6_IFAR 17 /* Instruction Fault Address Register */
#define c7_PAR 18 /* Physical Address Register */
#define c7_PAR_high 19 /* PAR top 32 bits */
#define c9_L2CTLR 20 /* Cortex A15/A7 L2 Control Register */
#define c10_PRRR 21 /* Primary Region Remap Register */
#define c10_NMRR 22 /* Normal Memory Remap Register */
#define c12_VBAR 23 /* Vector Base Address Register */
#define c13_CID 24 /* Context ID Register */
#define c13_TID_URW 25 /* Thread ID, User R/W */
#define c13_TID_URO 26 /* Thread ID, User R/O */
#define c13_TID_PRIV 27 /* Thread ID, Privileged */
#define c14_CNTKCTL 28 /* Timer Control Register (PL1) */
#define c10_AMAIR0 29 /* Auxilary Memory Attribute Indirection Reg0 */
#define c10_AMAIR1 30 /* Auxilary Memory Attribute Indirection Reg1 */
#define NR_CP15_REGS 31 /* Number of regs (incl. invalid) */
#include <asm/virt.h>

#define ARM_EXCEPTION_RESET 0
#define ARM_EXCEPTION_UNDEFINED 1
Expand Down Expand Up @@ -86,19 +55,15 @@ struct kvm_vcpu;
extern char __kvm_hyp_init[];
extern char __kvm_hyp_init_end[];

extern char __kvm_hyp_exit[];
extern char __kvm_hyp_exit_end[];

extern char __kvm_hyp_vector[];

extern char __kvm_hyp_code_start[];
extern char __kvm_hyp_code_end[];

extern void __kvm_flush_vm_context(void);
extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
extern void __kvm_tlb_flush_vmid(struct kvm *kvm);

extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);

extern void __init_stage2_translation(void);
#endif

#endif /* __ARM_KVM_ASM_H__ */
20 changes: 10 additions & 10 deletions arch/arm/include/asm/kvm_emulate.h
Original file line number Diff line number Diff line change
Expand Up @@ -68,12 +68,12 @@ static inline bool vcpu_mode_is_32bit(struct kvm_vcpu *vcpu)

static inline unsigned long *vcpu_pc(struct kvm_vcpu *vcpu)
{
return &vcpu->arch.regs.usr_regs.ARM_pc;
return &vcpu->arch.ctxt.gp_regs.usr_regs.ARM_pc;
}

static inline unsigned long *vcpu_cpsr(struct kvm_vcpu *vcpu)
{
return &vcpu->arch.regs.usr_regs.ARM_cpsr;
return &vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr;
}

static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
Expand All @@ -83,13 +83,13 @@ static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)

static inline bool mode_has_spsr(struct kvm_vcpu *vcpu)
{
unsigned long cpsr_mode = vcpu->arch.regs.usr_regs.ARM_cpsr & MODE_MASK;
unsigned long cpsr_mode = vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr & MODE_MASK;
return (cpsr_mode > USR_MODE && cpsr_mode < SYSTEM_MODE);
}

static inline bool vcpu_mode_priv(struct kvm_vcpu *vcpu)
{
unsigned long cpsr_mode = vcpu->arch.regs.usr_regs.ARM_cpsr & MODE_MASK;
unsigned long cpsr_mode = vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr & MODE_MASK;
return cpsr_mode > USR_MODE;;
}

Expand All @@ -108,11 +108,6 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(struct kvm_vcpu *vcpu)
return ((phys_addr_t)vcpu->arch.fault.hpfar & HPFAR_MASK) << 8;
}

static inline unsigned long kvm_vcpu_get_hyp_pc(struct kvm_vcpu *vcpu)
{
return vcpu->arch.fault.hyp_pc;
}

static inline bool kvm_vcpu_dabt_isvalid(struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_hsr(vcpu) & HSR_ISV;
Expand Down Expand Up @@ -143,6 +138,11 @@ static inline bool kvm_vcpu_dabt_iss1tw(struct kvm_vcpu *vcpu)
return kvm_vcpu_get_hsr(vcpu) & HSR_DABT_S1PTW;
}

static inline bool kvm_vcpu_dabt_is_cm(struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_hsr(vcpu) & HSR_DABT_CM);
}

/* Get Access Size from a data abort */
static inline int kvm_vcpu_dabt_get_as(struct kvm_vcpu *vcpu)
{
Expand Down Expand Up @@ -192,7 +192,7 @@ static inline u32 kvm_vcpu_hvc_get_imm(struct kvm_vcpu *vcpu)

static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
{
return vcpu->arch.cp15[c0_MPIDR] & MPIDR_HWID_BITMASK;
return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
}

static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
Expand Down
Loading

0 comments on commit 10dc374

Please sign in to comment.