Skip to content

Commit

Permalink
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Browse files Browse the repository at this point in the history
Pull more KVM updates from Paolo Bonzini:
 - ARM bugfix and MSI injection support
 - x86 nested virt tweak and OOPS fix
 - Simplify pvclock code (vdso bits acked by Andy Lutomirski).

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  nvmx: mark ept single context invalidation as supported
  nvmx: remove comment about missing nested vpid support
  KVM: lapic: fix access preemption timer stuff even if kernel_irqchip=off
  KVM: documentation: fix KVM_CAP_X2APIC_API information
  x86: vdso: use __pvclock_read_cycles
  pvclock: introduce seqcount-like API
  arm64: KVM: Set cpsr before spsr on fault injection
  KVM: arm: vgic-irqfd: Workaround changing kvm_set_routing_entry prototype
  KVM: arm/arm64: Enable MSI routing
  KVM: arm/arm64: Enable irqchip routing
  KVM: Move kvm_setup_default/empty_irq_routing declaration in arch specific header
  KVM: irqchip: Convey devid to kvm_set_msi
  KVM: Add devid in kvm_kernel_irq_routing_entry
  KVM: api: Pass the devid in the msi routing entry
  • Loading branch information
torvalds committed Aug 6, 2016
2 parents 4305f42 + 45e1181 commit 80fac0f
Show file tree
Hide file tree
Showing 21 changed files with 273 additions and 117 deletions.
53 changes: 37 additions & 16 deletions Documentation/virtual/kvm/api.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1433,13 +1433,16 @@ KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or guest IRQ is allowed.
4.52 KVM_SET_GSI_ROUTING

Capability: KVM_CAP_IRQ_ROUTING
Architectures: x86 s390
Architectures: x86 s390 arm arm64
Type: vm ioctl
Parameters: struct kvm_irq_routing (in)
Returns: 0 on success, -1 on error

Sets the GSI routing table entries, overwriting any previously set entries.

On arm/arm64, GSI routing has the following limitation:
- GSI routing does not apply to KVM_IRQ_LINE but only to KVM_IRQFD.

struct kvm_irq_routing {
__u32 nr;
__u32 flags;
Expand Down Expand Up @@ -1468,7 +1471,13 @@ struct kvm_irq_routing_entry {
#define KVM_IRQ_ROUTING_S390_ADAPTER 3
#define KVM_IRQ_ROUTING_HV_SINT 4

No flags are specified so far, the corresponding field must be set to zero.
flags:
- KVM_MSI_VALID_DEVID: used along with KVM_IRQ_ROUTING_MSI routing entry
type, specifies that the devid field contains a valid value. The per-VM
KVM_CAP_MSI_DEVID capability advertises the requirement to provide
the device ID. If this capability is not available, userspace should
never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.
- zero otherwise

struct kvm_irq_routing_irqchip {
__u32 irqchip;
Expand All @@ -1479,9 +1488,16 @@ struct kvm_irq_routing_msi {
__u32 address_lo;
__u32 address_hi;
__u32 data;
__u32 pad;
union {
__u32 pad;
__u32 devid;
};
};

If KVM_MSI_VALID_DEVID is set, devid contains a unique device identifier
for the device that wrote the MSI message. For PCI, this is usually a
BFD identifier in the lower 16 bits.

On x86, address_hi is ignored unless the KVM_X2APIC_API_USE_32BIT_IDS
feature of KVM_CAP_X2APIC_API capability is enabled. If it is enabled,
address_hi bits 31-8 provide bits 31-8 of the destination id. Bits 7-0 of
Expand Down Expand Up @@ -2199,18 +2215,19 @@ struct kvm_msi {
__u8 pad[12];
};

flags: KVM_MSI_VALID_DEVID: devid contains a valid value
devid: If KVM_MSI_VALID_DEVID is set, contains a unique device identifier
for the device that wrote the MSI message.
For PCI, this is usually a BFD identifier in the lower 16 bits.
flags: KVM_MSI_VALID_DEVID: devid contains a valid value. The per-VM
KVM_CAP_MSI_DEVID capability advertises the requirement to provide
the device ID. If this capability is not available, userspace
should never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.

The per-VM KVM_CAP_MSI_DEVID capability advertises the need to provide
the device ID. If this capability is not set, userland cannot rely on
the kernel to allow the KVM_MSI_VALID_DEVID flag being set.
If KVM_MSI_VALID_DEVID is set, devid contains a unique device identifier
for the device that wrote the MSI message. For PCI, this is usually a
BFD identifier in the lower 16 bits.

On x86, address_hi is ignored unless the KVM_CAP_X2APIC_API capability is
enabled. If it is enabled, address_hi bits 31-8 provide bits 31-8 of the
destination id. Bits 7-0 of address_hi must be zero.
On x86, address_hi is ignored unless the KVM_X2APIC_API_USE_32BIT_IDS
feature of KVM_CAP_X2APIC_API capability is enabled. If it is enabled,
address_hi bits 31-8 provide bits 31-8 of the destination id. Bits 7-0 of
address_hi must be zero.


4.71 KVM_CREATE_PIT2
Expand Down Expand Up @@ -2383,9 +2400,13 @@ Note that closing the resamplefd is not sufficient to disable the
irqfd. The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.

On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
given by gsi + 32.
On arm/arm64, gsi routing being supported, the following can happen:
- in case no routing entry is associated to this gsi, injection fails
- in case the gsi is associated to an irqchip routing entry,
irqchip.pin + 32 corresponds to the injected SPI ID.
- in case the gsi is associated to an MSI routing entry, the MSI
message and device ID are translated into an LPI (support restricted
to GICv3 ITS in-kernel emulation).

4.76 KVM_PPC_ALLOCATE_HTAB

Expand Down
2 changes: 2 additions & 0 deletions arch/arm/kvm/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ config KVM
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
---help---
Support hosting virtualized guest machines.
Expand Down
1 change: 1 addition & 0 deletions arch/arm/kvm/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -29,4 +29,5 @@ obj-y += $(KVM)/arm/vgic/vgic-v2.o
obj-y += $(KVM)/arm/vgic/vgic-mmio.o
obj-y += $(KVM)/arm/vgic/vgic-mmio-v2.o
obj-y += $(KVM)/arm/vgic/vgic-kvm-device.o
obj-y += $(KVM)/irqchip.o
obj-y += $(KVM)/arm/arch_timer.o
19 changes: 19 additions & 0 deletions arch/arm/kvm/irq.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
/*
* irq.h: in kernel interrupt controller related definitions
* Copyright (c) 2016 Red Hat, Inc.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This header is included by irqchip.c. However, on ARM, interrupt
* controller declarations are located in include/kvm/arm_vgic.h since
* they are mostly shared between arm and arm64.
*/

#ifndef __IRQ_H
#define __IRQ_H

#include <kvm/arm_vgic.h>

#endif
2 changes: 2 additions & 0 deletions arch/arm64/kvm/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,8 @@ config KVM
select KVM_ARM_VGIC_V3
select KVM_ARM_PMU if HW_PERF_EVENTS
select HAVE_KVM_MSI
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
---help---
Support hosting virtualized guest machines.
We don't support KVM with 16K page tables yet, due to the multiple
Expand Down
1 change: 1 addition & 0 deletions arch/arm64/kvm/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -30,5 +30,6 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v2.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v3.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-kvm-device.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-its.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/irqchip.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
kvm-$(CONFIG_KVM_ARM_PMU) += $(KVM)/arm/pmu.o
12 changes: 5 additions & 7 deletions arch/arm64/kvm/inject_fault.c
Original file line number Diff line number Diff line change
Expand Up @@ -132,16 +132,14 @@ static u64 get_except_vector(struct kvm_vcpu *vcpu, enum exception_type type)
static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr)
{
unsigned long cpsr = *vcpu_cpsr(vcpu);
bool is_aarch32;
bool is_aarch32 = vcpu_mode_is_32bit(vcpu);
u32 esr = 0;

is_aarch32 = vcpu_mode_is_32bit(vcpu);

*vcpu_spsr(vcpu) = cpsr;
*vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu);

*vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync);

*vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64;
*vcpu_spsr(vcpu) = cpsr;

vcpu_sys_reg(vcpu, FAR_EL1) = addr;

Expand Down Expand Up @@ -172,11 +170,11 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
unsigned long cpsr = *vcpu_cpsr(vcpu);
u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);

*vcpu_spsr(vcpu) = cpsr;
*vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu);

*vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync);

*vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64;
*vcpu_spsr(vcpu) = cpsr;

/*
* Build an unknown exception, depending on the instruction
Expand Down
19 changes: 19 additions & 0 deletions arch/arm64/kvm/irq.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
/*
* irq.h: in kernel interrupt controller related definitions
* Copyright (c) 2016 Red Hat, Inc.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This header is included by irqchip.c. However, on ARM, interrupt
* controller declarations are located in include/kvm/arm_vgic.h since
* they are mostly shared between arm and arm64.
*/

#ifndef __IRQ_H
#define __IRQ_H

#include <kvm/arm_vgic.h>

#endif
25 changes: 5 additions & 20 deletions arch/x86/entry/vdso/vclock_gettime.c
Original file line number Diff line number Diff line change
Expand Up @@ -96,9 +96,8 @@ static notrace cycle_t vread_pvclock(int *mode)
{
const struct pvclock_vcpu_time_info *pvti = &get_pvti0()->pvti;
cycle_t ret;
u64 tsc, pvti_tsc;
u64 last, delta, pvti_system_time;
u32 version, pvti_tsc_to_system_mul, pvti_tsc_shift;
u64 last;
u32 version;

/*
* Note: The kernel and hypervisor must guarantee that cpu ID
Expand All @@ -123,29 +122,15 @@ static notrace cycle_t vread_pvclock(int *mode)
*/

do {
version = pvti->version;

smp_rmb();
version = pvclock_read_begin(pvti);

if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
*mode = VCLOCK_NONE;
return 0;
}

tsc = rdtsc_ordered();
pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
pvti_tsc_shift = pvti->tsc_shift;
pvti_system_time = pvti->system_time;
pvti_tsc = pvti->tsc_timestamp;

/* Make sure that the version double-check is last. */
smp_rmb();
} while (unlikely((version & 1) || version != pvti->version));

delta = tsc - pvti_tsc;
ret = pvti_system_time +
pvclock_scale_delta(delta, pvti_tsc_to_system_mul,
pvti_tsc_shift);
ret = __pvclock_read_cycles(pvti);
} while (pvclock_read_retry(pvti, version));

/* refer to vread_tsc() comment for rationale */
last = gtod->cycle_last;
Expand Down
39 changes: 23 additions & 16 deletions arch/x86/include/asm/pvclock.h
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,24 @@ void pvclock_resume(void);

void pvclock_touch_watchdogs(void);

static __always_inline
unsigned pvclock_read_begin(const struct pvclock_vcpu_time_info *src)
{
unsigned version = src->version & ~1;
/* Make sure that the version is read before the data. */
virt_rmb();
return version;
}

static __always_inline
bool pvclock_read_retry(const struct pvclock_vcpu_time_info *src,
unsigned version)
{
/* Make sure that the version is re-read after the data. */
virt_rmb();
return unlikely(version != src->version);
}

/*
* Scale a 64-bit delta by scaling and multiplying by a 32-bit fraction,
* yielding a 64-bit result.
Expand Down Expand Up @@ -69,23 +87,12 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
}

static __always_inline
unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
cycle_t *cycles, u8 *flags)
cycle_t __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src)
{
unsigned version;
cycle_t offset;
u64 delta;

version = src->version;
/* Make the latest version visible */
smp_rmb();

delta = rdtsc_ordered() - src->tsc_timestamp;
offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
src->tsc_shift);
*cycles = src->system_time + offset;
*flags = src->flags;
return version;
u64 delta = rdtsc_ordered() - src->tsc_timestamp;
cycle_t offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
src->tsc_shift);
return src->system_time + offset;
}

struct pvclock_vsyscall_time_info {
Expand Down
17 changes: 6 additions & 11 deletions arch/x86/kernel/pvclock.c
Original file line number Diff line number Diff line change
Expand Up @@ -64,14 +64,9 @@ u8 pvclock_read_flags(struct pvclock_vcpu_time_info *src)
u8 flags;

do {
version = src->version;
/* Make the latest version visible */
smp_rmb();

version = pvclock_read_begin(src);
flags = src->flags;
/* Make sure that the version double-check is last. */
smp_rmb();
} while ((src->version & 1) || version != src->version);
} while (pvclock_read_retry(src, version));

return flags & valid_flags;
}
Expand All @@ -84,10 +79,10 @@ cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
u8 flags;

do {
version = __pvclock_read_cycles(src, &ret, &flags);
/* Make sure that the version double-check is last. */
smp_rmb();
} while ((src->version & 1) || version != src->version);
version = pvclock_read_begin(src);
ret = __pvclock_read_cycles(src);
flags = src->flags;
} while (pvclock_read_retry(src, version));

if (unlikely((flags & PVCLOCK_GUEST_STOPPED) != 0)) {
src->flags &= ~PVCLOCK_GUEST_STOPPED;
Expand Down
3 changes: 3 additions & 0 deletions arch/x86/kvm/irq.h
Original file line number Diff line number Diff line change
Expand Up @@ -120,4 +120,7 @@ void __kvm_migrate_timers(struct kvm_vcpu *vcpu);

int apic_has_pending_timer(struct kvm_vcpu *vcpu);

int kvm_setup_default_irq_routing(struct kvm *kvm);
int kvm_setup_empty_irq_routing(struct kvm *kvm);

#endif
3 changes: 3 additions & 0 deletions arch/x86/kvm/lapic.c
Original file line number Diff line number Diff line change
Expand Up @@ -1349,6 +1349,9 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic)

bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
{
if (!lapic_in_kernel(vcpu))
return false;

return vcpu->arch.apic->lapic_timer.hv_timer_in_use;
}
EXPORT_SYMBOL_GPL(kvm_lapic_hv_timer_in_use);
Expand Down
15 changes: 7 additions & 8 deletions arch/x86/kvm/vmx.c
Original file line number Diff line number Diff line change
Expand Up @@ -2809,12 +2809,8 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
vmx->nested.nested_vmx_ept_caps |=
VMX_EPT_EXECUTE_ONLY_BIT;
vmx->nested.nested_vmx_ept_caps &= vmx_capability.ept;
/*
* For nested guests, we don't do anything specific
* for single context invalidation. Hence, only advertise
* support for global context invalidation.
*/
vmx->nested.nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT;
vmx->nested.nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
VMX_EPT_EXTENT_CONTEXT_BIT;
} else
vmx->nested.nested_vmx_ept_caps = 0;

Expand Down Expand Up @@ -2945,7 +2941,6 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
vmx->nested.nested_vmx_secondary_ctls_high);
break;
case MSR_IA32_VMX_EPT_VPID_CAP:
/* Currently, no nested vpid support */
*pdata = vmx->nested.nested_vmx_ept_caps |
((u64)vmx->nested.nested_vmx_vpid_caps << 32);
break;
Expand Down Expand Up @@ -7609,12 +7604,16 @@ static int handle_invept(struct kvm_vcpu *vcpu)

switch (type) {
case VMX_EPT_EXTENT_GLOBAL:
/*
* TODO: track mappings and invalidate
* single context requests appropriately
*/
case VMX_EPT_EXTENT_CONTEXT:
kvm_mmu_sync_roots(vcpu);
kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
nested_vmx_succeed(vcpu);
break;
default:
/* Trap single context invalidation invept calls */
BUG_ON(1);
break;
}
Expand Down
Loading

0 comments on commit 80fac0f

Please sign in to comment.