Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
 "ARM:
   - Improved guest IPA space support (32 to 52 bits)

   - RAS event delivery for 32bit

   - PMU fixes

   - Guest entry hardening

   - Various cleanups

   - Port of dirty_log_test selftest

  PPC:
   - Nested HV KVM support for radix guests on POWER9. The performance
     is much better than with PR KVM. Migration and arbitrary levels of
     nesting are supported.

   - Disable nested HV-KVM on early POWER9 chips that need a particular
     hardware bug workaround

   - One VM per core mode to prevent potential data leaks

   - PCI pass-through optimization

   - Merge the ppc-kvm topic branch and kvm-ppc-fixes to get a better base

  s390:
   - Initial version of AP crypto virtualization via vfio-mdev

   - Improvements for vfio-ap

   - Set the host program identifier

   - Optimize page table locking

  x86:
   - Enable nested virtualization by default

   - Implement Hyper-V IPI hypercalls

   - Improve #PF and #DB handling

   - Allow guests to use Enlightened VMCS

   - Add migration selftests for VMCS and Enlightened VMCS

   - Allow coalesced PIO accesses

   - Add an option to perform nested VMCS host state consistency check
     through hardware

   - Automatic tuning of lapic_timer_advance_ns

   - Many fixes, minor improvements, and cleanups"

* tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned
  Revert "kvm: x86: optimize dr6 restore"
  KVM: PPC: Optimize clearing TCEs for sparse tables
  x86/kvm/nVMX: tweak shadow fields
  selftests/kvm: add missing executables to .gitignore
  KVM: arm64: Safety check PSTATE when entering guest and handle IL
  KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips
  arm/arm64: KVM: Enable 32 bits kvm vcpu events support
  arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension()
  KVM: arm64: Fix caching of host MDCR_EL2 value
  KVM: VMX: enable nested virtualization by default
  KVM/x86: Use 32bit xor to clear registers in svm.c
  kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD
  kvm: vmx: Defer setting of DR6 until #DB delivery
  kvm: x86: Defer setting of CR2 until #PF delivery
  kvm: x86: Add payload operands to kvm_multiple_exception
  kvm: x86: Add exception payload fields to kvm_vcpu_events
  kvm: x86: Add has_payload and payload to kvm_queued_exception
  KVM: Documentation: Fix omission in struct kvm_vcpu_events
  KVM: selftests: add Enlightened VMCS test
  ...
torvalds committed Oct 26, 2018
2 parents 83c4087 + 22a7cdc commit 0d1e8b8
Showing 138 changed files with 12,456 additions and 3,259 deletions.
837 changes: 837 additions & 0 deletions Documentation/s390/vfio-ap.txt


135 changes: 129 additions & 6 deletions Documentation/virtual/kvm/api.txt
@@ -123,6 +123,37 @@ memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
flag KVM_VM_MIPS_VZ.


On arm64, the physical address size for a VM (IPA Size limit) is limited
to 40 bits by default. The limit can be configured if the host supports
the extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
identifier, where IPA_Bits is the maximum width of any physical
address used by the VM. IPA_Bits is encoded in bits [7-0] of the
machine type identifier.

e.g. to configure a guest to use a 48-bit physical address size:

vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));

The requested size (IPA_Bits) must be:
 0 - Implies the default size, 40 bits (for backward compatibility)

or

 N - Implies N bits, where N is a positive integer such that
     32 <= N <= Host_IPA_Limit

Host_IPA_Limit is the maximum possible value for IPA_Bits on the host and
depends on the CPU capability and the kernel configuration. The limit can
be retrieved using KVM_CAP_ARM_VM_IPA_SIZE with the KVM_CHECK_EXTENSION
ioctl() at run-time.

Please note that configuring the IPA size does not affect the capability
exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
the size of the address translated at stage 2 (guest physical to
host physical address translations).
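
(Editor's illustration, not part of this patch: a minimal sketch of how
userspace might combine the two ioctls, probing Host_IPA_Limit and then
requesting the largest IPA space the host allows; error handling for
open() is omitted for brevity.)

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int create_vm_with_max_ipa(void)
{
        int dev_fd = open("/dev/kvm", O_RDWR);
        int ipa_bits;

        /* Returns Host_IPA_Limit, or 0 if the extension is unsupported. */
        ipa_bits = ioctl(dev_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_VM_IPA_SIZE);
        if (ipa_bits <= 0)
                return ioctl(dev_fd, KVM_CREATE_VM, 0); /* default 40-bit IPA */

        /* IPA_Bits lives in bits [7-0] of the machine type identifier. */
        return ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(ipa_bits));
}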


4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST

Capability: basic, KVM_CAP_GET_MSR_FEATURES for KVM_GET_MSR_FEATURE_INDEX_LIST
@@ -850,7 +881,7 @@ struct kvm_vcpu_events {
__u8 injected;
__u8 nr;
__u8 has_error_code;
-__u8 pad;
+__u8 pending;
__u32 error_code;
} exception;
struct {
@@ -873,15 +904,23 @@ struct kvm_vcpu_events {
__u8 smm_inside_nmi;
__u8 latched_init;
} smi;
__u8 reserved[27];
__u8 exception_has_payload;
__u64 exception_payload;
};

-Only two fields are defined in the flags field:
+The following bits are defined in the flags field:

-- KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
+- KVM_VCPUEVENT_VALID_SHADOW may be set to signal that
  interrupt.shadow contains a valid state.

-- KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
-  smi contains a valid state.
+- KVM_VCPUEVENT_VALID_SMM may be set to signal that smi contains a
+  valid state.
+
+- KVM_VCPUEVENT_VALID_PAYLOAD may be set to signal that the
+  exception_has_payload, exception_payload, and exception.pending
+  fields contain a valid state. This bit will be set whenever
+  KVM_CAP_EXCEPTION_PAYLOAD is enabled.

ARM/ARM64:

@@ -961,6 +1000,11 @@ shall be written into the VCPU.

KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.

If KVM_CAP_EXCEPTION_PAYLOAD is enabled, KVM_VCPUEVENT_VALID_PAYLOAD
can be set in the flags field to signal that the
exception_has_payload, exception_payload, and exception.pending fields
contain a valid state and shall be written into the VCPU.

ARM/ARM64:

Set the pending SError exception state for this VCPU. It is not possible to
@@ -1922,6 +1966,7 @@ registers, find a list below:
PPC | KVM_REG_PPC_TIDR | 64
PPC | KVM_REG_PPC_PSSCR | 64
PPC | KVM_REG_PPC_DEC_EXPIRY | 64
PPC | KVM_REG_PPC_PTCR | 64
PPC | KVM_REG_PPC_TM_GPR0 | 64
...
PPC | KVM_REG_PPC_TM_GPR31 | 64
@@ -2269,6 +2314,10 @@ The supported flags are:
The emulated MMU supports 1T segments in addition to the
standard 256M ones.

- KVM_PPC_NO_HASH
This flag indicates that HPT guests are not supported by KVM,
thus all guests must use radix MMU mode.

The "slb_size" field indicates how many SLB entries are supported

The "sps" array contains 8 entries indicating the supported base
@@ -3676,6 +3725,34 @@ Returns: 0 on success, -1 on error
This copies the vcpu's kvm_nested_state struct from userspace to the kernel. For
the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.

4.116 KVM_(UN)REGISTER_COALESCED_MMIO

Capability: KVM_CAP_COALESCED_MMIO (for coalesced mmio)
KVM_CAP_COALESCED_PIO (for coalesced pio)
Architectures: all
Type: vm ioctl
Parameters: struct kvm_coalesced_mmio_zone
Returns: 0 on success, < 0 on error

Coalesced I/O is a performance optimization that defers hardware
register write emulation so that userspace exits are avoided. It is
typically used to reduce the overhead of emulating frequently accessed
hardware registers.

When a hardware register is configured for coalesced I/O, write accesses
do not exit to userspace; instead, their values are recorded in a ring
buffer that is shared between kernel and userspace.

Coalesced I/O can be used when one or more write accesses to a hardware
register can be deferred until a read of, or a write to, another hardware
register on the same device. That last access still causes a vmexit, and
userspace processes the accesses queued in the ring buffer before
emulating it. This avoids exiting to userspace on each repeated write.

Coalesced pio is based on coalesced mmio. There is little difference
between coalesced mmio and pio except that coalesced pio records accesses
to I/O ports.
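
(Editor's sketch, not part of the patch: registering a coalesced PIO zone
and draining the shared ring on the next exit. The ring page sits inside
the vcpu's kvm_run mapping, at the page offset returned by
KVM_CHECK_EXTENSION for KVM_CAP_COALESCED_MMIO. handle_write() is a
hypothetical device-emulation hook, and a production implementation would
also pair the ring accesses with the appropriate memory barriers.)

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical device-emulation callback, assumed to exist elsewhere. */
extern void handle_write(__u64 addr, void *data, __u32 len, __u32 is_pio);

static int register_coalesced_pio(int vm_fd, __u64 port, __u32 size)
{
        struct kvm_coalesced_mmio_zone zone = {
                .addr = port,
                .size = size,
                .pio  = 1,      /* 0 would request a coalesced MMIO zone */
        };

        return ioctl(vm_fd, KVM_REGISTER_COALESCED_MMIO, &zone);
}

/*
 * Drain deferred writes before emulating the access that triggered the
 * exit. max_entries is what fits in the ring page:
 * (page_size - sizeof(*ring)) / sizeof(ring->coalesced_mmio[0]).
 */
static void drain_coalesced_ring(struct kvm_coalesced_mmio_ring *ring,
                                 __u32 max_entries)
{
        while (ring->first != ring->last) {
                struct kvm_coalesced_mmio *e =
                        &ring->coalesced_mmio[ring->first];

                handle_write(e->phys_addr, e->data, e->len, e->pio);
                ring->first = (ring->first + 1) % max_entries;
        }
}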

5. The kvm_run structure
------------------------

@@ -4522,7 +4599,7 @@ hpage module parameter is not set to 1, -EINVAL is returned.
While it is generally possible to create a huge page backed VM without
this capability, the VM will not be able to run.

-7.14 KVM_CAP_MSR_PLATFORM_INFO
+7.15 KVM_CAP_MSR_PLATFORM_INFO

Architectures: x86
Parameters: args[0] whether feature should be enabled or not
@@ -4531,6 +4608,45 @@ With this capability, a guest may read the MSR_PLATFORM_INFO MSR. Otherwise,
a #GP would be raised when the guest tries to access it. Currently, this
capability does not enable write permissions of this MSR for the guest.

7.16 KVM_CAP_PPC_NESTED_HV

Architectures: ppc
Parameters: none
Returns: 0 on success, -EINVAL when the implementation doesn't support
nested-HV virtualization.

HV-KVM on POWER9 and later systems allows for "nested-HV"
virtualization, which provides a way for a guest VM to run guests that
can run using the CPU's supervisor mode (privileged non-hypervisor
state). Enabling this capability on a VM depends on the CPU having
the necessary functionality and on the facility being enabled with a
kvm-hv module parameter.
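
(Editor's sketch, not from the patch: userspace opts a VM into nested-HV
with a vm-level KVM_ENABLE_CAP; the ioctl fails with -EINVAL when HV-KVM
or the kvm-hv nested facility is unavailable.)

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int enable_nested_hv(int vm_fd)
{
        struct kvm_enable_cap cap;

        memset(&cap, 0, sizeof(cap));
        cap.cap = KVM_CAP_PPC_NESTED_HV;

        return ioctl(vm_fd, KVM_ENABLE_CAP, &cap); /* 0 on success */
}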

7.17 KVM_CAP_EXCEPTION_PAYLOAD

Architectures: x86
Parameters: args[0] whether feature should be enabled or not

With this capability enabled, CR2 will not be modified prior to the
emulated VM-exit when L1 intercepts a #PF exception that occurs in
L2. Similarly, for kvm-intel only, DR6 will not be modified prior to
the emulated VM-exit when L1 intercepts a #DB exception that occurs in
L2. As a result, when KVM_GET_VCPU_EVENTS reports a pending #PF (or
#DB) exception for L2, exception.has_payload will be set and the
faulting address (or the new DR6 bits*) will be reported in the
exception_payload field. Similarly, when userspace injects a #PF (or
#DB) into L2 using KVM_SET_VCPU_EVENTS, it is expected to set
exception.has_payload and to put the faulting address (or the new DR6
bits*) in the exception_payload field.

This capability also enables exception.pending in struct
kvm_vcpu_events, which allows userspace to distinguish between pending
and injected exceptions.


* For the new DR6 bits, note that bit 16 is set iff the #DB exception
will clear DR6.RTM.
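
(Editor's sketch, not part of the patch: the userspace side of this
protocol, enabling the capability on the VM and then reading a pending
#PF's faulting address out of exception_payload via KVM_GET_VCPU_EVENTS.)

#include <sys/ioctl.h>
#include <linux/kvm.h>

#define PF_VECTOR 14    /* x86 page-fault exception vector */

static int get_pending_pf_address(int vm_fd, int vcpu_fd, __u64 *addr)
{
        struct kvm_enable_cap cap = {
                .cap = KVM_CAP_EXCEPTION_PAYLOAD,
                .args = { 1 },  /* args[0]: enable the feature */
        };
        struct kvm_vcpu_events events;

        if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
                return -1;
        if (ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &events) < 0)
                return -1;

        if ((events.flags & KVM_VCPUEVENT_VALID_PAYLOAD) &&
            events.exception.pending &&
            events.exception.nr == PF_VECTOR &&
            events.exception_has_payload) {
                *addr = events.exception_payload; /* the would-be CR2 value */
                return 0;
        }
        return -1;
}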

8. Other capabilities.
----------------------

@@ -4772,3 +4888,10 @@ CPU when the exception is taken. If this virtual SError is taken to EL1 using
AArch64, this value will be reported in the ISS field of ESR_ELx.

See KVM_CAP_VCPU_EVENTS for more details.

8.20 KVM_CAP_HYPERV_SEND_IPI

Architectures: x86

This capability indicates that KVM supports paravirtualized Hyper-V IPI send
hypercalls:
HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
12 changes: 12 additions & 0 deletions MAINTAINERS
@@ -12800,6 +12800,18 @@ W: http://www.ibm.com/developerworks/linux/linux390/
S: Supported
F: drivers/s390/crypto/

S390 VFIO AP DRIVER
M: Tony Krowiak <[email protected]>
M: Pierre Morel <[email protected]>
M: Halil Pasic <[email protected]>
L: [email protected]
W: http://www.ibm.com/developerworks/linux/linux390/
S: Supported
F: drivers/s390/crypto/vfio_ap_drv.c
F: drivers/s390/crypto/vfio_ap_private.h
F: drivers/s390/crypto/vfio_ap_ops.c
F: Documentation/s390/vfio-ap.txt

S390 ZFCP DRIVER
M: Steffen Maier <[email protected]>
M: Benjamin Block <[email protected]>
3 changes: 1 addition & 2 deletions arch/arm/include/asm/kvm_arm.h
@@ -133,8 +133,7 @@
* space.
*/
#define KVM_PHYS_SHIFT (40)
#define KVM_PHYS_SIZE (_AC(1, ULL) << KVM_PHYS_SHIFT)
#define KVM_PHYS_MASK (KVM_PHYS_SIZE - _AC(1, ULL))

#define PTRS_PER_S2_PGD (_AC(1, ULL) << (KVM_PHYS_SHIFT - 30))

/* Virtualization Translation Control Register (VTCR) bits */
13 changes: 12 additions & 1 deletion arch/arm/include/asm/kvm_host.h
@@ -273,7 +273,7 @@ static inline void __cpu_init_stage2(void)
kvm_call_hyp(__init_stage2_translation);
}

-static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
+static inline int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
{
return 0;
}
@@ -354,4 +354,15 @@ static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {}
struct kvm *kvm_arch_alloc_vm(void);
void kvm_arch_free_vm(struct kvm *kvm);

static inline int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
{
/*
* On 32bit ARM, VMs get a static 40bit IPA stage2 setup,
* so any non-zero value used as type is illegal.
*/
if (type)
return -EINVAL;
return 0;
}

#endif /* __ARM_KVM_HOST_H__ */
15 changes: 10 additions & 5 deletions arch/arm/include/asm/kvm_mmu.h
@@ -35,23 +35,26 @@
addr; \
})

/*
* KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.
*/
#define KVM_MMU_CACHE_MIN_PAGES 2

#ifndef __ASSEMBLY__

#include <linux/highmem.h>
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_hyp.h>
#include <asm/pgalloc.h>
#include <asm/stage2_pgtable.h>

/* Ensure compatibility with arm64 */
#define VA_BITS 32

#define kvm_phys_shift(kvm) KVM_PHYS_SHIFT
#define kvm_phys_size(kvm) (1ULL << kvm_phys_shift(kvm))
#define kvm_phys_mask(kvm) (kvm_phys_size(kvm) - 1ULL)
#define kvm_vttbr_baddr_mask(kvm) VTTBR_BADDR_MASK

#define stage2_pgd_size(kvm) (PTRS_PER_S2_PGD * sizeof(pgd_t))

int create_hyp_mappings(void *from, void *to, pgprot_t prot);
int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
void __iomem **kaddr,
@@ -355,6 +358,8 @@ static inline int hyp_map_aux_data(void)

#define kvm_phys_to_vttbr(addr) (addr)

static inline void kvm_set_ipa_limit(void) {}

static inline bool kvm_cpu_has_cnp(void)
{
return false;
54 changes: 32 additions & 22 deletions arch/arm/include/asm/stage2_pgtable.h
@@ -19,43 +19,53 @@
#ifndef __ARM_S2_PGTABLE_H_
#define __ARM_S2_PGTABLE_H_

#define stage2_pgd_none(pgd) pgd_none(pgd)
#define stage2_pgd_clear(pgd) pgd_clear(pgd)
#define stage2_pgd_present(pgd) pgd_present(pgd)
#define stage2_pgd_populate(pgd, pud) pgd_populate(NULL, pgd, pud)
#define stage2_pud_offset(pgd, address) pud_offset(pgd, address)
#define stage2_pud_free(pud) pud_free(NULL, pud)

#define stage2_pud_none(pud) pud_none(pud)
#define stage2_pud_clear(pud) pud_clear(pud)
#define stage2_pud_present(pud) pud_present(pud)
#define stage2_pud_populate(pud, pmd) pud_populate(NULL, pud, pmd)
#define stage2_pmd_offset(pud, address) pmd_offset(pud, address)
#define stage2_pmd_free(pmd) pmd_free(NULL, pmd)

#define stage2_pud_huge(pud) pud_huge(pud)
/*
* kvm_mmu_cache_min_pages() is the number of pages required
* to install a stage-2 translation. We pre-allocate the entry
* level table at VM creation. Since we have a 3 level page-table,
* we need only two pages to add a new mapping.
*/
#define kvm_mmu_cache_min_pages(kvm) 2

#define stage2_pgd_none(kvm, pgd) pgd_none(pgd)
#define stage2_pgd_clear(kvm, pgd) pgd_clear(pgd)
#define stage2_pgd_present(kvm, pgd) pgd_present(pgd)
#define stage2_pgd_populate(kvm, pgd, pud) pgd_populate(NULL, pgd, pud)
#define stage2_pud_offset(kvm, pgd, address) pud_offset(pgd, address)
#define stage2_pud_free(kvm, pud) pud_free(NULL, pud)

#define stage2_pud_none(kvm, pud) pud_none(pud)
#define stage2_pud_clear(kvm, pud) pud_clear(pud)
#define stage2_pud_present(kvm, pud) pud_present(pud)
#define stage2_pud_populate(kvm, pud, pmd) pud_populate(NULL, pud, pmd)
#define stage2_pmd_offset(kvm, pud, address) pmd_offset(pud, address)
#define stage2_pmd_free(kvm, pmd) pmd_free(NULL, pmd)

#define stage2_pud_huge(kvm, pud) pud_huge(pud)

/* Open coded p*d_addr_end that can deal with 64bit addresses */
static inline phys_addr_t stage2_pgd_addr_end(phys_addr_t addr, phys_addr_t end)
static inline phys_addr_t
stage2_pgd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
{
phys_addr_t boundary = (addr + PGDIR_SIZE) & PGDIR_MASK;

return (boundary - 1 < end - 1) ? boundary : end;
}

#define stage2_pud_addr_end(addr, end) (end)
#define stage2_pud_addr_end(kvm, addr, end) (end)

static inline phys_addr_t stage2_pmd_addr_end(phys_addr_t addr, phys_addr_t end)
static inline phys_addr_t
stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
{
phys_addr_t boundary = (addr + PMD_SIZE) & PMD_MASK;

return (boundary - 1 < end - 1) ? boundary : end;
}

#define stage2_pgd_index(addr) pgd_index(addr)
#define stage2_pgd_index(kvm, addr) pgd_index(addr)

#define stage2_pte_table_empty(ptep) kvm_page_empty(ptep)
#define stage2_pmd_table_empty(pmdp) kvm_page_empty(pmdp)
#define stage2_pud_table_empty(pudp) false
#define stage2_pte_table_empty(kvm, ptep) kvm_page_empty(ptep)
#define stage2_pmd_table_empty(kvm, pmdp) kvm_page_empty(pmdp)
#define stage2_pud_table_empty(kvm, pudp) false

#endif /* __ARM_S2_PGTABLE_H_ */
