Skip to content

Commit

Permalink
Merge commit 'df84f17' into HEAD
Browse files Browse the repository at this point in the history
This merge fixes a semantic conflict with the trivial tree.

Signed-off-by: Paolo Bonzini <[email protected]>
  • Loading branch information
bonzini committed Oct 26, 2019
2 parents 856bd2c + df84f17 commit 673652a
Show file tree
Hide file tree
Showing 46 changed files with 2,224 additions and 1,034 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
Expand Up @@ -58,3 +58,6 @@
[submodule "roms/opensbi"]
path = roms/opensbi
url = https://git.qemu.org/git/opensbi.git
[submodule "roms/qboot"]
path = roms/qboot
url = https://github.com/bonzini/qboot
9 changes: 9 additions & 0 deletions MAINTAINERS
Original file line number Diff line number Diff line change
Expand Up @@ -1275,6 +1275,15 @@ F: include/hw/timer/hpet.h
F: include/hw/timer/i8254*
F: include/hw/rtc/mc146818rtc*

microvm
M: Sergio Lopez <[email protected]>
M: Paolo Bonzini <[email protected]>
S: Maintained
F: docs/microvm.rst
F: hw/i386/microvm.c
F: include/hw/i386/microvm.h
F: pc-bios/bios-microvm.bin

Machine core
M: Eduardo Habkost <[email protected]>
M: Marcel Apfelbaum <[email protected]>
Expand Down
1 change: 1 addition & 0 deletions default-configs/i386-softmmu.mak
Original file line number Diff line number Diff line change
Expand Up @@ -28,3 +28,4 @@
CONFIG_ISAPC=y
CONFIG_I440FX=y
CONFIG_Q35=y
CONFIG_MICROVM=y
13 changes: 13 additions & 0 deletions docs/hyperv.txt
Original file line number Diff line number Diff line change
Expand Up @@ -184,6 +184,19 @@ enabled.

Requires: hv-vpindex, hv-synic, hv-time, hv-stimer

3.17. hv-no-nonarch-coresharing=on/off/auto
===========================================
This enlightenment tells guest OS that virtual processors will never share a
physical core unless they are reported as sibling SMT threads. This information
is required by Windows and Hyper-V guests to properly mitigate SMT related CPU
vulnerabilities.
When the option is set to 'auto' QEMU will enable the feature only when KVM
reports that non-architectural coresharing is impossible, this means that
hyper-threading is not supported or completely disabled on the host. This
setting also prevents migration as SMT settings on the destination may differ.
When the option is set to 'on' QEMU will always enable the feature, regardless
of host setup. To keep guests secure, this can only be used in conjunction with
exposing correct vCPU topology and vCPU pinning.

4. Development features
========================
Expand Down
108 changes: 108 additions & 0 deletions docs/microvm.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,108 @@
====================
microvm Machine Type
====================

``microvm`` is a machine type inspired by ``Firecracker`` and
constructed after its machine model.

It's a minimalist machine type without ``PCI`` nor ``ACPI`` support,
designed for short-lived guests. microvm also establishes a baseline
for benchmarking and optimizing both QEMU and guest operating systems,
since it is optimized for both boot time and footprint.


Supported devices
-----------------

The microvm machine type supports the following devices:

- ISA bus
- i8259 PIC (optional)
- i8254 PIT (optional)
- MC146818 RTC (optional)
- One ISA serial port (optional)
- LAPIC
- IOAPIC (with kernel-irqchip=split by default)
- kvmclock (if using KVM)
- fw_cfg
- Up to eight virtio-mmio devices (configured by the user)


Limitations
-----------

Currently, microvm does *not* support the following features:

- PCI-only devices.
- Hotplug of any kind.
- Live migration across QEMU versions.


Using the microvm machine type
------------------------------

Machine-specific options
~~~~~~~~~~~~~~~~~~~~~~~~

It supports the following machine-specific options:

- microvm.x-option-roms=bool (Set off to disable loading option ROMs)
- microvm.pit=OnOffAuto (Enable i8254 PIT)
- microvm.isa-serial=bool (Set off to disable the instantiation an ISA serial port)
- microvm.pic=OnOffAuto (Enable i8259 PIC)
- microvm.rtc=OnOffAuto (Enable MC146818 RTC)
- microvm.auto-kernel-cmdline=bool (Set off to disable adding virtio-mmio devices to the kernel cmdline)


Boot options
~~~~~~~~~~~~

By default, microvm uses ``qboot`` as its BIOS, to obtain better boot
times, but it's also compatible with ``SeaBIOS``.

As no current FW is able to boot from a block device using
``virtio-mmio`` as its transport, a microvm-based VM needs to be run
using a host-side kernel and, optionally, an initrd image.


Running a microvm-based VM
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, microvm aims for maximum compatibility, enabling both
legacy and non-legacy devices. In this example, a VM is created
without passing any additional machine-specific option, using the
legacy ``ISA serial`` device as console::

$ qemu-system-x86_64 -M microvm \
-enable-kvm -cpu host -m 512m -smp 2 \
-kernel vmlinux -append "earlyprintk=ttyS0 console=ttyS0 root=/dev/vda" \
-nodefaults -no-user-config -nographic \
-serial stdio \
-drive id=test,file=test.img,format=raw,if=none \
-device virtio-blk-device,drive=test \
-netdev tap,id=tap0,script=no,downscript=no \
-device virtio-net-device,netdev=tap0

While the example above works, you might be interested in reducing the
footprint further by disabling some legacy devices. If you're using
``KVM``, you can disable the ``RTC``, making the Guest rely on
``kvmclock`` exclusively. Additionally, if your host's CPUs have the
``TSC_DEADLINE`` feature, you can also disable both the i8259 PIC and
the i8254 PIT (make sure you're also emulating a CPU with such feature
in the guest).

This is an example of a VM with all optional legacy features
disabled::

$ qemu-system-x86_64 \
-M microvm,x-option-roms=off,pit=off,pic=off,isa-serial=off,rtc=off \
-enable-kvm -cpu host -m 512m -smp 2 \
-kernel vmlinux -append "console=hvc0 root=/dev/vda" \
-nodefaults -no-user-config -nographic \
-chardev stdio,id=virtiocon0 \
-device virtio-serial-device \
-device virtconsole,chardev=virtiocon0 \
-drive id=test,file=test.img,format=raw,if=none \
-device virtio-blk-device,drive=test \
-netdev tap,id=tap0,script=no,downscript=no \
-device virtio-net-device,netdev=tap0
10 changes: 5 additions & 5 deletions hw/acpi/cpu_hotplug.c
Original file line number Diff line number Diff line change
Expand Up @@ -128,7 +128,7 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
Aml *one = aml_int(1);
MachineClass *mc = MACHINE_GET_CLASS(machine);
const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
PCMachineState *pcms = PC_MACHINE(machine);
X86MachineState *x86ms = X86_MACHINE(machine);

/*
* _MAT method - creates an madt apic buffer
Expand Down Expand Up @@ -236,9 +236,9 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
/* The current AML generator can cover the APIC ID range [0..255],
* inclusive, for VCPU hotplug. */
QEMU_BUILD_BUG_ON(ACPI_CPU_HOTPLUG_ID_LIMIT > 256);
if (pcms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
if (x86ms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
error_report("max_cpus is too large. APIC ID of last CPU is %u",
pcms->apic_id_limit - 1);
x86ms->apic_id_limit - 1);
exit(1);
}

Expand Down Expand Up @@ -315,8 +315,8 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
* ith up to 255 elements. Windows guests up to win2k8 fail when
* VarPackageOp is used.
*/
pkg = pcms->apic_id_limit <= 255 ? aml_package(pcms->apic_id_limit) :
aml_varpackage(pcms->apic_id_limit);
pkg = x86ms->apic_id_limit <= 255 ? aml_package(x86ms->apic_id_limit) :
aml_varpackage(x86ms->apic_id_limit);

for (i = 0, apic_idx = 0; i < apic_ids->len; i++) {
int apic_id = apic_ids->cpus[i].arch_id;
Expand Down
10 changes: 10 additions & 0 deletions hw/i386/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,16 @@ config Q35
select SMBIOS
select FW_CFG_DMA

config MICROVM
bool
imply SERIAL_ISA
select ISA_BUS
select APIC
select IOAPIC
select I8259
select MC146818RTC
select VIRTIO_MMIO

config VTD
bool

Expand Down
2 changes: 2 additions & 0 deletions hw/i386/Makefile.objs
Original file line number Diff line number Diff line change
@@ -1,8 +1,10 @@
obj-$(CONFIG_KVM) += kvm/
obj-y += e820_memory_layout.o multiboot.o
obj-y += x86.o
obj-y += pc.o
obj-$(CONFIG_I440FX) += pc_piix.o
obj-$(CONFIG_Q35) += pc_q35.o
obj-$(CONFIG_MICROVM) += microvm.o
obj-y += fw_cfg.o pc_sysfw.o
obj-y += x86-iommu.o
obj-$(CONFIG_VTD) += intel_iommu.o
Expand Down
29 changes: 17 additions & 12 deletions hw/i386/acpi-build.c
Original file line number Diff line number Diff line change
Expand Up @@ -361,6 +361,7 @@ static void
build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
{
MachineClass *mc = MACHINE_GET_CLASS(pcms);
X86MachineState *x86ms = X86_MACHINE(pcms);
const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(pcms));
int madt_start = table_data->len;
AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(pcms->acpi_dev);
Expand Down Expand Up @@ -390,7 +391,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
io_apic->address = cpu_to_le32(IO_APIC_DEFAULT_ADDRESS);
io_apic->interrupt = cpu_to_le32(0);

if (pcms->apic_xrupt_override) {
if (x86ms->apic_xrupt_override) {
intsrcovr = acpi_data_push(table_data, sizeof *intsrcovr);
intsrcovr->type = ACPI_APIC_XRUPT_OVERRIDE;
intsrcovr->length = sizeof(*intsrcovr);
Expand Down Expand Up @@ -1831,6 +1832,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
CrsRangeSet crs_range_set;
PCMachineState *pcms = PC_MACHINE(machine);
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
X86MachineState *x86ms = X86_MACHINE(machine);
AcpiMcfgInfo mcfg;
uint32_t nr_mem = machine->ram_slots;
int root_bus_limit = 0xFF;
Expand Down Expand Up @@ -2103,7 +2105,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
* with half of the 16-bit control register. Hence, the total size
* of the i/o region used is FW_CFG_CTL_SIZE; when using DMA, the
* DMA control register is located at FW_CFG_DMA_IO_BASE + 4 */
uint8_t io_size = object_property_get_bool(OBJECT(pcms->fw_cfg),
uint8_t io_size = object_property_get_bool(OBJECT(x86ms->fw_cfg),
"dma_enabled", NULL) ?
ROUND_UP(FW_CFG_CTL_SIZE, 4) + sizeof(dma_addr_t) :
FW_CFG_CTL_SIZE;
Expand Down Expand Up @@ -2336,6 +2338,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
int srat_start, numa_start, slots;
uint64_t mem_len, mem_base, next_base;
MachineClass *mc = MACHINE_GET_CLASS(machine);
X86MachineState *x86ms = X86_MACHINE(machine);
const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
PCMachineState *pcms = PC_MACHINE(machine);
ram_addr_t hotplugabble_address_space_size =
Expand Down Expand Up @@ -2406,16 +2409,16 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
}

/* Cut out the ACPI_PCI hole */
if (mem_base <= pcms->below_4g_mem_size &&
next_base > pcms->below_4g_mem_size) {
mem_len -= next_base - pcms->below_4g_mem_size;
if (mem_base <= x86ms->below_4g_mem_size &&
next_base > x86ms->below_4g_mem_size) {
mem_len -= next_base - x86ms->below_4g_mem_size;
if (mem_len > 0) {
numamem = acpi_data_push(table_data, sizeof *numamem);
build_srat_memory(numamem, mem_base, mem_len, i - 1,
MEM_AFFINITY_ENABLED);
}
mem_base = 1ULL << 32;
mem_len = next_base - pcms->below_4g_mem_size;
mem_len = next_base - x86ms->below_4g_mem_size;
next_base = mem_base + mem_len;
}

Expand Down Expand Up @@ -2634,6 +2637,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
{
PCMachineState *pcms = PC_MACHINE(machine);
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
X86MachineState *x86ms = X86_MACHINE(machine);
GArray *table_offsets;
unsigned facs, dsdt, rsdt, fadt;
AcpiPmInfo pm;
Expand Down Expand Up @@ -2795,7 +2799,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
*/
int legacy_aml_len =
pcmc->legacy_acpi_table_size +
ACPI_BUILD_LEGACY_CPU_AML_SIZE * pcms->apic_id_limit;
ACPI_BUILD_LEGACY_CPU_AML_SIZE * x86ms->apic_id_limit;
int legacy_table_size =
ROUND_UP(tables_blob->len - aml_len + legacy_aml_len,
ACPI_BUILD_ALIGN_SIZE);
Expand Down Expand Up @@ -2885,13 +2889,14 @@ void acpi_setup(void)
{
PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
X86MachineState *x86ms = X86_MACHINE(pcms);
AcpiBuildTables tables;
AcpiBuildState *build_state;
Object *vmgenid_dev;
TPMIf *tpm;
static FwCfgTPMConfig tpm_config;

if (!pcms->fw_cfg) {
if (!x86ms->fw_cfg) {
ACPI_BUILD_DPRINTF("No fw cfg. Bailing out.\n");
return;
}
Expand Down Expand Up @@ -2922,7 +2927,7 @@ void acpi_setup(void)
acpi_add_rom_blob(acpi_build_update, build_state,
tables.linker->cmd_blob, "etc/table-loader", 0);

fw_cfg_add_file(pcms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
fw_cfg_add_file(x86ms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
tables.tcpalog->data, acpi_data_len(tables.tcpalog));

tpm = tpm_find();
Expand All @@ -2932,13 +2937,13 @@ void acpi_setup(void)
.tpm_version = tpm_get_version(tpm),
.tpmppi_version = TPM_PPI_VERSION_1_30
};
fw_cfg_add_file(pcms->fw_cfg, "etc/tpm/config",
fw_cfg_add_file(x86ms->fw_cfg, "etc/tpm/config",
&tpm_config, sizeof tpm_config);
}

vmgenid_dev = find_vmgenid_dev();
if (vmgenid_dev) {
vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), pcms->fw_cfg,
vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), x86ms->fw_cfg,
tables.vmgenid);
}

Expand All @@ -2951,7 +2956,7 @@ void acpi_setup(void)
uint32_t rsdp_size = acpi_data_len(tables.rsdp);

build_state->rsdp = g_memdup(tables.rsdp->data, rsdp_size);
fw_cfg_add_file_callback(pcms->fw_cfg, ACPI_BUILD_RSDP_FILE,
fw_cfg_add_file_callback(x86ms->fw_cfg, ACPI_BUILD_RSDP_FILE,
acpi_build_update, NULL, build_state,
build_state->rsdp, rsdp_size, true);
build_state->rsdp_mr = NULL;
Expand Down
3 changes: 2 additions & 1 deletion hw/i386/amd_iommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -1540,6 +1540,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
MachineState *ms = MACHINE(qdev_get_machine());
PCMachineState *pcms = PC_MACHINE(ms);
X86MachineState *x86ms = X86_MACHINE(ms);
PCIBus *bus = pcms->bus;

s->iotlb = g_hash_table_new_full(amdvi_uint64_hash,
Expand Down Expand Up @@ -1568,7 +1569,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
}

/* Pseudo address space under root PCI bus. */
pcms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);

/* set up MMIO */
memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio",
Expand Down
3 changes: 2 additions & 1 deletion hw/i386/intel_iommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -3733,6 +3733,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
{
MachineState *ms = MACHINE(qdev_get_machine());
PCMachineState *pcms = PC_MACHINE(ms);
X86MachineState *x86ms = X86_MACHINE(ms);
PCIBus *bus = pcms->bus;
IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
Expand Down Expand Up @@ -3773,7 +3774,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
sysbus_mmio_map(SYS_BUS_DEVICE(s), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
/* Pseudo address space under root PCI bus. */
pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
qemu_add_machine_init_done_notifier(&vtd_machine_done_notify);
}

Expand Down
Loading

0 comments on commit 673652a

Please sign in to comment.