Skip to content

Commit

Permalink
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Browse files Browse the repository at this point in the history
Pull KVM updates from Paolo Bonzini:
 "First round of KVM updates for 3.14; PPC parts will come next week.

  Nothing major here, just bugfixes all over the place.  The most
  interesting part is the ARM guys' virtualized interrupt controller
  overhaul, which lets userspace get/set the state and thus enables
  migration of ARM VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
  kvm: make KVM_MMU_AUDIT help text more readable
  KVM: s390: Fix memory access error detection
  KVM: nVMX: Update guest activity state field on L2 exits
  KVM: nVMX: Fix nested_run_pending on activity state HLT
  KVM: nVMX: Clean up handling of VMX-related MSRs
  KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
  KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
  KVM: nVMX: Leave VMX mode on clearing of feature control MSR
  KVM: VMX: Fix DR6 update on #DB exception
  KVM: SVM: Fix reading of DR6
  KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
  add support for Hyper-V reference time counter
  KVM: remove useless write to vcpu->hv_clock.tsc_timestamp
  KVM: x86: fix tsc catchup issue with tsc scaling
  KVM: x86: limit PIT timer frequency
  KVM: x86: handle invalid root_hpa everywhere
  kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
  kvm: vfio: silence GCC warning
  KVM: ARM: Remove duplicate include
  arm/arm64: KVM: relax the requirements of VMA alignment for THP
  ...
  • Loading branch information
torvalds committed Jan 23, 2014
2 parents bb1281f + 7650b68 commit 7ebd3fa
Show file tree
Hide file tree
Showing 55 changed files with 1,515 additions and 406 deletions.
13 changes: 9 additions & 4 deletions Documentation/virtual/kvm/api.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2104,7 +2104,7 @@ Returns: 0 on success, -1 on error
Allows setting an eventfd to directly trigger a guest interrupt.
kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
kvm_irqfd.gsi specifies the irqchip pin toggled by this event. When
an event is tiggered on the eventfd, an interrupt is injected into
an event is triggered on the eventfd, an interrupt is injected into
the guest using the specified gsi pin. The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.
Expand All @@ -2115,7 +2115,7 @@ interrupts. When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
additional eventfd in the kvm_irqfd.resamplefd field. When operating
in resample mode, posting of an interrupt through kvm_irq.fd asserts
the specified gsi in the irqchip. When the irqchip is resampled, such
as from an EOI, the gsi is de-asserted and the user is notifed via
as from an EOI, the gsi is de-asserted and the user is notified via
kvm_irqfd.resamplefd. It is the user's responsibility to re-queue
the interrupt if the device making use of it still requires service.
Note that closing the resamplefd is not sufficient to disable the
Expand Down Expand Up @@ -2327,7 +2327,7 @@ current state. "addr" is ignored.
Capability: basic
Architectures: arm, arm64
Type: vcpu ioctl
Parameters: struct struct kvm_vcpu_init (in)
Parameters: struct kvm_vcpu_init (in)
Returns: 0 on success; -1 on error
Errors:
 EINVAL:    the target is unknown, or the combination of features is invalid.
Expand Down Expand Up @@ -2391,7 +2391,8 @@ struct kvm_reg_list {
This ioctl returns the guest registers that are supported for the
KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.

4.85 KVM_ARM_SET_DEVICE_ADDR

4.85 KVM_ARM_SET_DEVICE_ADDR (deprecated)

Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
Architectures: arm, arm64
Expand Down Expand Up @@ -2429,6 +2430,10 @@ must be called after calling KVM_CREATE_IRQCHIP, but before calling
KVM_RUN on any of the VCPUs. Calling this ioctl twice for any of the
base addresses will return -EEXIST.

Note, this IOCTL is deprecated and the more flexible SET/GET_DEVICE_ATTR API
should be used instead.


4.86 KVM_PPC_RTAS_DEFINE_TOKEN

Capability: KVM_CAP_PPC_RTAS
Expand Down
73 changes: 73 additions & 0 deletions Documentation/virtual/kvm/devices/arm-vgic.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
ARM Virtual Generic Interrupt Controller (VGIC)
===============================================

Device types supported:
KVM_DEV_TYPE_ARM_VGIC_V2 ARM Generic Interrupt Controller v2.0

Only one VGIC instance may be instantiated through either this API or the
legacy KVM_CREATE_IRQCHIP api. The created VGIC will act as the VM interrupt
controller, requiring emulated user-space devices to inject interrupts to the
VGIC instead of directly to CPUs.

Groups:
KVM_DEV_ARM_VGIC_GRP_ADDR
Attributes:
KVM_VGIC_V2_ADDR_TYPE_DIST (rw, 64-bit)
Base address in the guest physical address space of the GIC distributor
register mappings.

KVM_VGIC_V2_ADDR_TYPE_CPU (rw, 64-bit)
Base address in the guest physical address space of the GIC virtual cpu
interface register mappings.

KVM_DEV_ARM_VGIC_GRP_DIST_REGS
Attributes:
The attr field of kvm_device_attr encodes two values:
bits: | 63 .... 40 | 39 .. 32 | 31 .... 0 |
values: | reserved | cpu id | offset |

All distributor regs are (rw, 32-bit)

The offset is relative to the "Distributor base address" as defined in the
GICv2 specs. Getting or setting such a register has the same effect as
reading or writing the register on the actual hardware from the cpu
specified with cpu id field. Note that most distributor fields are not
banked, but return the same value regardless of the cpu id used to access
the register.
Limitations:
- Priorities are not implemented, and registers are RAZ/WI
Errors:
-ENODEV: Getting or setting this register is not yet supported
-EBUSY: One or more VCPUs are running

KVM_DEV_ARM_VGIC_GRP_CPU_REGS
Attributes:
The attr field of kvm_device_attr encodes two values:
bits: | 63 .... 40 | 39 .. 32 | 31 .... 0 |
values: | reserved | cpu id | offset |

All CPU interface regs are (rw, 32-bit)

The offset specifies the offset from the "CPU interface base address" as
defined in the GICv2 specs. Getting or setting such a register has the
same effect as reading or writing the register on the actual hardware.

The Active Priorities Registers APRn are implementation defined, so we set a
fixed format for our implementation that fits with the model of a "GICv2
implementation without the security extensions" which we present to the
guest. This interface always exposes four register APR[0-3] describing the
maximum possible 128 preemption levels. The semantics of the register
indicate if any interrupts in a given preemption level are in the active
state by setting the corresponding bit.

Thus, preemption level X has one or more active interrupts if and only if:

APRn[X mod 32] == 0b1, where n = X / 32

Bits for undefined preemption levels are RAZ/WI.

Limitations:
- Priorities are not implemented, and registers are RAZ/WI
Errors:
-ENODEV: Getting or setting this register is not yet supported
-EBUSY: One or more VCPUs are running
5 changes: 4 additions & 1 deletion Documentation/virtual/kvm/hypercalls.txt
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ S390:
S390 uses diagnose instruction as hypercall (0x500) along with hypercall
number in R1.

For further information on the S390 diagnose call as supported by KVM,
refer to Documentation/virtual/kvm/s390-diag.txt.

PowerPC:
It uses R3-R10 and hypercall number in R11. R4-R11 are used as output registers.
Return value is placed in R3.
Expand Down Expand Up @@ -74,7 +77,7 @@ Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
kernel mode for an event to occur (ex: a spinlock to become available) can
execute HLT instruction once it has busy-waited for more than a threshold
time-interval. Execution of HLT instruction would cause the hypervisor to put
the vcpu to sleep until occurence of an appropriate event. Another vcpu of the
the vcpu to sleep until occurrence of an appropriate event. Another vcpu of the
same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
is used in the hypercall for future use.
4 changes: 2 additions & 2 deletions Documentation/virtual/kvm/locking.txt
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ The Dirty bit is lost in this case.

In order to avoid this kind of issue, we always treat the spte as "volatile"
if it can be updated out of mmu-lock, see spte_has_volatile_bits(), it means,
the spte is always atomicly updated in this case.
the spte is always atomically updated in this case.

3): flush tlbs due to spte updated
If the spte is updated from writable to readonly, we should flush all TLBs,
Expand All @@ -125,7 +125,7 @@ be flushed caused by this reason in mmu_spte_update() since this is a common
function to update spte (present -> present).

Since the spte is "volatile" if it can be updated out of mmu-lock, we always
atomicly update the spte, the race caused by fast page fault can be avoided,
atomically update the spte, the race caused by fast page fault can be avoided,
See the comments in spte_has_volatile_bits() and mmu_spte_update().

3. Reference
Expand Down
2 changes: 1 addition & 1 deletion Documentation/virtual/kvm/ppc-pv.txt
Original file line number Diff line number Diff line change
Expand Up @@ -115,7 +115,7 @@ If any other bit changes in the MSR, please still use mtmsr(d).
Patched instructions
====================

The "ld" and "std" instructions are transormed to "lwz" and "stw" instructions
The "ld" and "std" instructions are transformed to "lwz" and "stw" instructions
respectively on 32 bit systems with an added offset of 4 to accommodate for big
endianness.

Expand Down
80 changes: 80 additions & 0 deletions Documentation/virtual/kvm/s390-diag.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
The s390 DIAGNOSE call on KVM
=============================

KVM on s390 supports the DIAGNOSE call for making hypercalls, both for
native hypercalls and for selected hypercalls found on other s390
hypervisors.

Note that bits are numbered as by the usual s390 convention (most significant
bit on the left).


General remarks
---------------

DIAGNOSE calls by the guest cause a mandatory intercept. This implies
all supported DIAGNOSE calls need to be handled by either KVM or its
userspace.

All DIAGNOSE calls supported by KVM use the RS-a format:

--------------------------------------
| '83' | R1 | R3 | B2 | D2 |
--------------------------------------
0 8 12 16 20 31

The second-operand address (obtained by the base/displacement calculation)
is not used to address data. Instead, bits 48-63 of this address specify
the function code, and bits 0-47 are ignored.

The supported DIAGNOSE function codes vary by the userspace used. For
DIAGNOSE function codes not specific to KVM, please refer to the
documentation for the s390 hypervisors defining them.


DIAGNOSE function code 'X'500' - KVM virtio functions
-----------------------------------------------------

If the function code specifies 0x500, various virtio-related functions
are performed.

General register 1 contains the virtio subfunction code. Supported
virtio subfunctions depend on KVM's userspace. Generally, userspace
provides either s390-virtio (subcodes 0-2) or virtio-ccw (subcode 3).

Upon completion of the DIAGNOSE instruction, general register 2 contains
the function's return code, which is either a return code or a subcode
specific value.

Subcode 0 - s390-virtio notification and early console printk
Handled by userspace.

Subcode 1 - s390-virtio reset
Handled by userspace.

Subcode 2 - s390-virtio set status
Handled by userspace.

Subcode 3 - virtio-ccw notification
Handled by either userspace or KVM (ioeventfd case).

General register 2 contains a subchannel-identification word denoting
the subchannel of the virtio-ccw proxy device to be notified.

General register 3 contains the number of the virtqueue to be notified.

General register 4 contains a 64bit identifier for KVM usage (the
kvm_io_bus cookie). If general register 4 does not contain a valid
identifier, it is ignored.

After completion of the DIAGNOSE call, general register 2 may contain
a 64bit identifier (in the kvm_io_bus cookie case).

See also the virtio standard for a discussion of this hypercall.


DIAGNOSE function code 'X'501 - KVM breakpoint
----------------------------------------------

If the function code specifies 0x501, breakpoint functions may be performed.
This function code is handled by userspace.
2 changes: 1 addition & 1 deletion Documentation/virtual/kvm/timekeeping.txt
Original file line number Diff line number Diff line change
Expand Up @@ -467,7 +467,7 @@ at any time. This causes problems as the passage of real time, the injection
of machine interrupts and the associated clock sources are no longer completely
synchronized with real time.

This same problem can occur on native harware to a degree, as SMM mode may
This same problem can occur on native hardware to a degree, as SMM mode may
steal cycles from the naturally on X86 systems when SMM mode is used by the
BIOS, but not in such an extreme fashion. However, the fact that SMM mode may
cause similar problems to virtualization makes it a good justification for
Expand Down
2 changes: 1 addition & 1 deletion MAINTAINERS
Original file line number Diff line number Diff line change
Expand Up @@ -4915,7 +4915,7 @@ F: include/linux/sunrpc/
F: include/uapi/linux/sunrpc/

KERNEL VIRTUAL MACHINE (KVM)
M: Gleb Natapov <gleb@redhat.com>
M: Gleb Natapov <gleb@kernel.org>
M: Paolo Bonzini <[email protected]>
L: [email protected]
W: http://www.linux-kvm.org
Expand Down
3 changes: 3 additions & 0 deletions arch/arm/include/asm/kvm_host.h
Original file line number Diff line number Diff line change
Expand Up @@ -225,4 +225,7 @@ static inline int kvm_arch_dev_ioctl_check_extension(long ext)
int kvm_perf_init(void);
int kvm_perf_teardown(void);

u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);

#endif /* __ARM_KVM_HOST_H__ */
1 change: 1 addition & 0 deletions arch/arm/include/asm/kvm_mmu.h
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,7 @@ static inline void coherent_icache_guest_page(struct kvm *kvm, hva_t hva,
}

#define kvm_flush_dcache_to_poc(a,l) __cpuc_flush_dcache_area((a), (l))
#define kvm_virt_to_phys(x) virt_to_idmap((unsigned long)(x))

#endif /* !__ASSEMBLY__ */

Expand Down
28 changes: 28 additions & 0 deletions arch/arm/include/uapi/asm/kvm.h
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,26 @@ struct kvm_arch_memory_slot {
#define KVM_REG_ARM_32_CRN_MASK 0x0000000000007800
#define KVM_REG_ARM_32_CRN_SHIFT 11

#define ARM_CP15_REG_SHIFT_MASK(x,n) \
(((x) << KVM_REG_ARM_ ## n ## _SHIFT) & KVM_REG_ARM_ ## n ## _MASK)

#define __ARM_CP15_REG(op1,crn,crm,op2) \
(KVM_REG_ARM | (15 << KVM_REG_ARM_COPROC_SHIFT) | \
ARM_CP15_REG_SHIFT_MASK(op1, OPC1) | \
ARM_CP15_REG_SHIFT_MASK(crn, 32_CRN) | \
ARM_CP15_REG_SHIFT_MASK(crm, CRM) | \
ARM_CP15_REG_SHIFT_MASK(op2, 32_OPC2))

#define ARM_CP15_REG32(...) (__ARM_CP15_REG(__VA_ARGS__) | KVM_REG_SIZE_U32)

#define __ARM_CP15_REG64(op1,crm) \
(__ARM_CP15_REG(op1, 0, crm, 0) | KVM_REG_SIZE_U64)
#define ARM_CP15_REG64(...) __ARM_CP15_REG64(__VA_ARGS__)

#define KVM_REG_ARM_TIMER_CTL ARM_CP15_REG32(0, 14, 3, 1)
#define KVM_REG_ARM_TIMER_CNT ARM_CP15_REG64(1, 14)
#define KVM_REG_ARM_TIMER_CVAL ARM_CP15_REG64(3, 14)

/* Normal registers are mapped as coprocessor 16. */
#define KVM_REG_ARM_CORE (0x0010 << KVM_REG_ARM_COPROC_SHIFT)
#define KVM_REG_ARM_CORE_REG(name) (offsetof(struct kvm_regs, name) / 4)
Expand All @@ -143,6 +163,14 @@ struct kvm_arch_memory_slot {
#define KVM_REG_ARM_VFP_FPINST 0x1009
#define KVM_REG_ARM_VFP_FPINST2 0x100A

/* Device Control API: ARM VGIC */
#define KVM_DEV_ARM_VGIC_GRP_ADDR 0
#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
#define KVM_DEV_ARM_VGIC_GRP_CPU_REGS 2
#define KVM_DEV_ARM_VGIC_CPUID_SHIFT 32
#define KVM_DEV_ARM_VGIC_CPUID_MASK (0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT)
#define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
#define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)

/* KVM_IRQ_LINE irq field index values */
#define KVM_ARM_IRQ_TYPE_SHIFT 24
Expand Down
Loading

0 comments on commit 7ebd3fa

Please sign in to comment.