Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/gi…
…t/mst/vhost

Pull virtio updates from Michael Tsirkin:

 - device feature provisioning in ifcvf, mlx5

 - new SolidNET driver

 - support for zoned block device in virtio blk

 - numa support in virtio pmem

 - VIRTIO_F_RING_RESET support in vhost-net

 - more debugfs entries in mlx5

 - resume support in vdpa

 - completion batching in virtio blk

 - cleanup of dma api use in vdpa

 - now simulating more features in vdpa-sim

 - documentation, features, fixes all over the place

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits)
  vdpa/mlx5: support device features provisioning
  vdpa/mlx5: make MTU/STATUS presence conditional on feature bits
  vdpa: validate device feature provisioning against supported class
  vdpa: validate provisioned device features against specified attribute
  vdpa: conditionally read STATUS in config space
  vdpa: fix improper error message when adding vdpa dev
  vdpa/mlx5: Initialize CVQ iotlb spinlock
  vdpa/mlx5: Don't clear mr struct on destroy MR
  vdpa/mlx5: Directly assign memory key
  tools/virtio: enable to build with retpoline
  vringh: fix a typo in comments for vringh_kiov
  vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails
  scsi: virtio_scsi: fix handling of kmalloc failure
  vdpa: Fix a couple of spelling mistakes in some messages
  vhost-net: support VIRTIO_F_RING_RESET
  vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit
  vdpa: mlx5: support per virtqueue dma device
  vdpa: set dma mask for vDPA device
  virtio-vdpa: support per vq dma device
  vdpa: introduce get_vq_dma_device()
  ...
torvalds committed Feb 25, 2023
2 parents 49d5759 + deeacf3 commit 84cc667
Showing 47 changed files with 3,535 additions and 502 deletions.
1 change: 1 addition & 0 deletions Documentation/driver-api/index.rst
@@ -108,6 +108,7 @@ available subsections can be seen below.
vfio-mediated-device
vfio
vfio-pci-device-specific-driver-acceptance
virtio/index
xilinx/index
xillybus
zorro
11 changes: 11 additions & 0 deletions Documentation/driver-api/virtio/index.rst
@@ -0,0 +1,11 @@
.. SPDX-License-Identifier: GPL-2.0

======
Virtio
======

.. toctree::
   :maxdepth: 1

   virtio
   writing_virtio_drivers
145 changes: 145 additions & 0 deletions Documentation/driver-api/virtio/virtio.rst
@@ -0,0 +1,145 @@
.. SPDX-License-Identifier: GPL-2.0
.. _virtio:

===============
Virtio on Linux
===============

Introduction
============

Virtio is an open standard that defines a protocol for communication
between drivers and devices of different types; see Chapter 5 ("Device
Types") of the virtio spec (`[1]`_). Originally developed as a standard
for paravirtualized devices implemented by a hypervisor, it can be used
to interface any compliant device (real or emulated) with a driver.

For illustrative purposes, this document will focus on the common case
of a Linux kernel running in a virtual machine and using paravirtualized
devices provided by the hypervisor, which exposes them as virtio devices
via standard mechanisms such as PCI.


Device - Driver communication: virtqueues
=========================================

Although the virtio devices are really an abstraction layer in the
hypervisor, they're exposed to the guest as if they were physical devices
using a specific transport method -- PCI, MMIO or CCW -- that is
orthogonal to the device itself. The virtio spec defines these transport
methods in detail, including device discovery, capabilities and
interrupt handling.

The communication between the driver in the guest OS and the device in
the hypervisor is done through shared memory (that's what makes virtio
devices so efficient) using specialized data structures called
virtqueues, which are actually ring buffers [#f1]_ of buffer descriptors
similar to the ones used in a network device:

.. kernel-doc:: include/uapi/linux/virtio_ring.h
    :identifiers: struct vring_desc

All the buffers the descriptors point to are allocated by the guest and
used by the host either for reading or for writing but not for both.
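
As a rough userspace illustration of the read-or-write rule, the sketch
below walks a descriptor chain and adds up how many bytes the device may
read and how many it may write. The flag values follow
include/uapi/linux/virtio_ring.h; ``struct toy_desc`` and
``toy_chain_bytes()`` are hypothetical names made up for this example,
and endianness conversion is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/virtio_ring.h */
#define VRING_DESC_F_NEXT	1	/* chain continues at 'next' */
#define VRING_DESC_F_WRITE	2	/* buffer is write-only for the device */

/* Userspace stand-in for struct vring_desc (hypothetical, no endianness) */
struct toy_desc {
	uint64_t addr;	/* guest-physical address of the buffer */
	uint32_t len;	/* buffer length in bytes */
	uint16_t flags;	/* VRING_DESC_F_* */
	uint16_t next;	/* next descriptor index if F_NEXT is set */
};

/*
 * Walk the chain starting at 'head'. Each descriptor counts either as
 * device-readable or as device-writable, never both. Returns the number
 * of descriptors in the chain.
 */
static unsigned int toy_chain_bytes(const struct toy_desc *table, size_t n,
				    uint16_t head, uint32_t *dev_readable,
				    uint32_t *dev_writable)
{
	unsigned int count = 0;
	uint16_t i = head;

	*dev_readable = 0;
	*dev_writable = 0;
	for (;;) {
		assert(i < n);		/* index must stay inside the table */
		assert(count < n);	/* guards against a looping chain */
		if (table[i].flags & VRING_DESC_F_WRITE)
			*dev_writable += table[i].len;
		else
			*dev_readable += table[i].len;
		count++;
		if (!(table[i].flags & VRING_DESC_F_NEXT))
			break;
		i = table[i].next;
	}
	return count;
}
```

A request that carries a header for the device to read followed by a
buffer for the device to fill in would therefore use at least two
descriptors, the second one flagged ``VRING_DESC_F_WRITE``.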

Refer to Chapter 2.5 ("Virtqueues") of the virtio spec (`[1]`_) for the
reference definitions of virtqueues and to the "Virtqueues and virtio
ring: How the data travels" blog post (`[2]`_) for an illustrated
overview of how the host device and the guest driver communicate.

The :c:type:`vring_virtqueue` struct models a virtqueue, including the
ring buffers and management data. Embedded in this struct is the
:c:type:`virtqueue` struct, which is the data structure that's
ultimately used by virtio drivers:

.. kernel-doc:: include/linux/virtio.h
    :identifiers: struct virtqueue

The callback function pointed to by this struct is triggered when the
device has consumed the buffers provided by the driver. More
specifically, the trigger will be an interrupt issued by the hypervisor
(see vring_interrupt()). Interrupt request handlers are registered for
a virtqueue during the virtqueue setup process (transport-specific).

.. kernel-doc:: drivers/virtio/virtio_ring.c
    :identifiers: vring_interrupt


Device discovery and probing
============================

In the kernel, the virtio core contains the virtio bus driver and
transport-specific drivers like `virtio-pci` and `virtio-mmio`. Then
there are individual virtio drivers for specific device types that are
registered to the virtio bus driver.

How a virtio device is found and configured by the kernel depends on how
the hypervisor defines it. Take the `QEMU virtio-console
<https://gitlab.com/qemu-project/qemu/-/blob/master/hw/char/virtio-console.c>`__
device as an example: when using PCI as a transport method, the device
will present itself on the PCI bus with vendor 0x1af4 (Red Hat, Inc.)
and device id 0x1003 (virtio console), as defined in the spec, so the
kernel will detect it as it would do with any other PCI device.
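
The relation between the PCI ids and the virtio device type can be
sketched in plain C. This is a simplified userspace model based on the
PCI device id ranges described in the virtio spec (0x1000 to 0x103f for
transitional devices, 0x1040 plus the virtio device id for
non-transitional ones); ``toy_virtio_id_from_pci()`` is a hypothetical
helper, not a kernel function.

```c
#include <stdint.h>

#define TOY_VENDOR_REDHAT_QUMRANET	0x1af4

/*
 * Return the virtio device id for a PCI function, or -1 if the ids do
 * not describe a virtio device. Hypothetical helper for illustration.
 */
static int toy_virtio_id_from_pci(uint16_t vendor, uint16_t device,
				  uint16_t subsystem_device)
{
	if (vendor != TOY_VENDOR_REDHAT_QUMRANET)
		return -1;
	if (device < 0x1000 || device > 0x107f)
		return -1;
	/* transitional device: the type is in the subsystem device id */
	if (device < 0x1040)
		return subsystem_device;
	/* non-transitional device: the type is encoded in the device id */
	return device - 0x1040;
}
```

With these assumptions, the console from the example above (device id
0x1003, subsystem device id 3) and a non-transitional console (device id
0x1043) both resolve to virtio device id 3.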

During the PCI enumeration process, if a device is found to match the
virtio-pci driver (according to the virtio-pci device table, any PCI
device with vendor id = 0x1af4)::

	/* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
	static const struct pci_device_id virtio_pci_id_table[] = {
		{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
		{ 0 }
	};

then the virtio-pci driver is probed and, if the probing goes well, the
device is registered to the virtio bus::

	static int virtio_pci_probe(struct pci_dev *pci_dev,
				    const struct pci_device_id *id)
	{
		...

		if (force_legacy) {
			rc = virtio_pci_legacy_probe(vp_dev);
			/* Also try modern mode if we can't map BAR0 (no IO space). */
			if (rc == -ENODEV || rc == -ENOMEM)
				rc = virtio_pci_modern_probe(vp_dev);
			if (rc)
				goto err_probe;
		} else {
			rc = virtio_pci_modern_probe(vp_dev);
			if (rc == -ENODEV)
				rc = virtio_pci_legacy_probe(vp_dev);
			if (rc)
				goto err_probe;
		}

		...

		rc = register_virtio_device(&vp_dev->vdev);

When the device is registered to the virtio bus the kernel will look
for a driver in the bus that can handle the device and call that
driver's ``probe`` method.
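
The driver lookup can be pictured as a scan of each driver's id table.
The following userspace sketch is loosely modeled on how the virtio bus
compares ids; the ``toy_`` types and functions are hypothetical
stand-ins, and ``TOY_DEV_ANY_ID`` plays the role of
``VIRTIO_DEV_ANY_ID``.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DEV_ANY_ID	0xffffffffu	/* wildcard, like VIRTIO_DEV_ANY_ID */

/* Hypothetical stand-in for struct virtio_device_id */
struct toy_device_id {
	uint32_t device;
	uint32_t vendor;
};

/* A table entry matches if both fields are equal or wildcards */
static bool toy_id_match(const struct toy_device_id *dev,
			 const struct toy_device_id *id)
{
	if (id->device != dev->device && id->device != TOY_DEV_ANY_ID)
		return false;
	return id->vendor == TOY_DEV_ANY_ID || id->vendor == dev->vendor;
}

/* Scan a driver's id table; the table ends at an entry with device == 0 */
static bool toy_driver_matches(const struct toy_device_id *dev,
			       const struct toy_device_id *table)
{
	for (size_t i = 0; table[i].device != 0; i++) {
		if (toy_id_match(dev, &table[i]))
			return true;
	}
	return false;
}
```

In this model, a driver whose table contains ``{ 3, TOY_DEV_ANY_ID }``
would be selected for a console device (device id 3) from any vendor.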

At this point, the virtqueues will be allocated and configured by
calling the appropriate ``virtio_find`` helper function, such as
virtio_find_single_vq() or virtio_find_vqs(), which will end up calling
a transport-specific ``find_vqs`` method.


References
==========

_`[1]` Virtio Spec v1.2:
https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html

.. Check for later versions of the spec as well.

_`[2]` Virtqueues and virtio ring: How the data travels
https://www.redhat.com/en/blog/virtqueues-and-virtio-ring-how-data-travels

.. rubric:: Footnotes

.. [#f1] that's why they may be also referred to as virtrings.
197 changes: 197 additions & 0 deletions Documentation/driver-api/virtio/writing_virtio_drivers.rst
@@ -0,0 +1,197 @@
.. SPDX-License-Identifier: GPL-2.0
.. _writing_virtio_drivers:

======================
Writing Virtio Drivers
======================

Introduction
============

This document serves as a basic guideline for driver programmers that
need to hack a new virtio driver or understand the essentials of the
existing ones. See :ref:`Virtio on Linux <virtio>` for a general
overview of virtio.


Driver boilerplate
==================

As a bare minimum, a virtio driver needs to register in the virtio bus
and configure the virtqueues for the device according to its spec. The
configuration of the virtqueues in the driver side must match the
virtqueue definitions in the device. A basic driver skeleton could look
like this::

	#include <linux/virtio.h>
	#include <linux/virtio_ids.h>
	#include <linux/virtio_config.h>
	#include <linux/module.h>
	#include <linux/slab.h>

	/* device private data (one per device) */
	struct virtio_dummy_dev {
		struct virtqueue *vq;
	};

	static void virtio_dummy_recv_cb(struct virtqueue *vq)
	{
		struct virtio_dummy_dev *dev = vq->vdev->priv;
		char *buf;
		unsigned int len;

		while ((buf = virtqueue_get_buf(dev->vq, &len)) != NULL) {
			/* process the received data */
		}
	}

	static int virtio_dummy_probe(struct virtio_device *vdev)
	{
		struct virtio_dummy_dev *dev = NULL;

		/* initialize device data */
		dev = kzalloc(sizeof(struct virtio_dummy_dev), GFP_KERNEL);
		if (!dev)
			return -ENOMEM;

		/* the device has a single virtqueue */
		dev->vq = virtio_find_single_vq(vdev, virtio_dummy_recv_cb, "input");
		if (IS_ERR(dev->vq)) {
			/* read the error code before freeing dev */
			int err = PTR_ERR(dev->vq);

			kfree(dev);
			return err;
		}
		vdev->priv = dev;

		/* from this point on, the device can notify and get callbacks */
		virtio_device_ready(vdev);

		return 0;
	}

	static void virtio_dummy_remove(struct virtio_device *vdev)
	{
		struct virtio_dummy_dev *dev = vdev->priv;
		void *buf;

		/*
		 * disable vq interrupts: equivalent to
		 * vdev->config->reset(vdev)
		 */
		virtio_reset_device(vdev);

		/* detach unused buffers */
		while ((buf = virtqueue_detach_unused_buf(dev->vq)) != NULL) {
			kfree(buf);
		}

		/* remove virtqueues */
		vdev->config->del_vqs(vdev);

		kfree(dev);
	}

	static const struct virtio_device_id id_table[] = {
		{ VIRTIO_ID_DUMMY, VIRTIO_DEV_ANY_ID },
		{ 0 },
	};

	static struct virtio_driver virtio_dummy_driver = {
		.driver.name =	KBUILD_MODNAME,
		.driver.owner =	THIS_MODULE,
		.id_table =	id_table,
		.probe =	virtio_dummy_probe,
		.remove =	virtio_dummy_remove,
	};

	module_virtio_driver(virtio_dummy_driver);
	MODULE_DEVICE_TABLE(virtio, id_table);
	MODULE_DESCRIPTION("Dummy virtio driver");
	MODULE_LICENSE("GPL");

The device id ``VIRTIO_ID_DUMMY`` here is a placeholder. Virtio drivers
should be added only for devices that are defined in the spec, see
include/uapi/linux/virtio_ids.h. Device ids need to be at least reserved
in the virtio spec before being added to that file.

If your driver doesn't have to do anything special in its ``init`` and
``exit`` methods, you can use the module_virtio_driver() helper to
reduce the amount of boilerplate code.

The ``probe`` method does the minimum driver setup in this case
(memory allocation for the device data) and initializes the
virtqueue. virtio_device_ready() is used to enable the virtqueue and to
notify the device that the driver is ready to manage the device
("DRIVER_OK"). Calling it explicitly is only required when the driver
needs to use the virtqueues inside ``probe``; otherwise the core enables
them automatically after ``probe`` returns.

.. kernel-doc:: include/linux/virtio_config.h
    :identifiers: virtio_device_ready

In any case, the virtqueues need to be enabled before adding buffers to
them.

Sending and receiving data
==========================

The virtio_dummy_recv_cb() callback in the code above will be triggered
when the device notifies the driver after it finishes processing a
descriptor or descriptor chain, either for reading or writing. However,
that's only the second half of the virtio device-driver communication
process, as the communication is always started by the driver regardless
of the direction of the data transfer.

To set up a buffer transfer between the driver and the device, first you
have to add the buffers -- packed as `scatterlists` -- to the
appropriate virtqueue using one of virtqueue_add_inbuf(),
virtqueue_add_outbuf() or virtqueue_add_sgs(), depending on whether you
need to add one input `scatterlist` (for the device to fill in), one
output `scatterlist` (for the device to consume) or multiple
`scatterlists`, respectively. Then, once the buffers have been added, a
call to virtqueue_kick() sends a notification that will be serviced by
the hypervisor that implements the device::

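	struct scatterlist sg[1];

	/* wrap the driver-owned buffer in a single-entry scatterlist */
	sg_init_one(sg, buffer, BUFLEN);
	/* expose it to the device; 'buffer' is also the token returned later */
	virtqueue_add_inbuf(dev->vq, sg, 1, buffer, GFP_ATOMIC);
	/* notify the device that new buffers are available */
	virtqueue_kick(dev->vq);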

.. kernel-doc:: drivers/virtio/virtio_ring.c
    :identifiers: virtqueue_add_inbuf

.. kernel-doc:: drivers/virtio/virtio_ring.c
    :identifiers: virtqueue_add_outbuf

.. kernel-doc:: drivers/virtio/virtio_ring.c
    :identifiers: virtqueue_add_sgs

Then, after the device has read or written the buffers prepared by the
driver and notifies it back, the driver can call virtqueue_get_buf() to
read the data produced by the device (if the virtqueue was set up with
input buffers) or simply to reclaim the buffers if they were already
consumed by the device:

.. kernel-doc:: drivers/virtio/virtio_ring.c
    :identifiers: virtqueue_get_buf_ctx
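
The add, kick and reclaim sequence described above can be mimicked with
a toy ring in userspace. This is only a sketch of the ordering of the
steps, not of the real ring layout: every ``toy_`` name is hypothetical,
the "device" is a plain function that fills each pending buffer with a
payload, and buffers are processed strictly in order.

```c
#include <stddef.h>
#include <string.h>

#define TOY_QUEUE_SIZE	4

struct toy_slot {
	char *buf;		/* driver-owned buffer, also the token */
	unsigned int cap;	/* buffer capacity in bytes */
	unsigned int len;	/* bytes written by the device */
};

struct toy_vq {
	struct toy_slot ring[TOY_QUEUE_SIZE];
	unsigned int avail_idx;	/* next slot the driver will fill */
	unsigned int used_idx;	/* next slot the device will process */
	unsigned int last_used;	/* next slot the driver will reclaim */
};

/* Driver side: expose an input buffer, cf. virtqueue_add_inbuf() */
static int toy_add_inbuf(struct toy_vq *vq, char *buf, unsigned int cap)
{
	struct toy_slot *s;

	if (vq->avail_idx - vq->last_used == TOY_QUEUE_SIZE)
		return -1;	/* ring full */
	s = &vq->ring[vq->avail_idx % TOY_QUEUE_SIZE];
	s->buf = buf;
	s->cap = cap;
	s->len = 0;
	vq->avail_idx++;
	return 0;
}

/* Stands in for virtqueue_kick() plus the device's reaction to it */
static void toy_kick(struct toy_vq *vq, const char *payload)
{
	size_t n = strlen(payload) + 1;

	while (vq->used_idx != vq->avail_idx) {
		struct toy_slot *s = &vq->ring[vq->used_idx % TOY_QUEUE_SIZE];

		/* the payload is truncated if the buffer is too small */
		s->len = n < s->cap ? (unsigned int)n : s->cap;
		memcpy(s->buf, payload, s->len);
		vq->used_idx++;
	}
}

/* Driver side: reclaim a processed buffer, cf. virtqueue_get_buf() */
static char *toy_get_buf(struct toy_vq *vq, unsigned int *len)
{
	struct toy_slot *s;

	if (vq->last_used == vq->used_idx)
		return NULL;	/* nothing processed yet */
	s = &vq->ring[vq->last_used % TOY_QUEUE_SIZE];
	vq->last_used++;
	*len = s->len;
	return s->buf;
}
```

In this model, as in the real flow, ``toy_get_buf()`` returns ``NULL``
until the driver has both added a buffer and kicked the queue, which
reflects that the driver always starts the exchange.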

The virtqueue callbacks can be disabled and re-enabled using the
virtqueue_disable_cb() and the family of virtqueue_enable_cb() functions
respectively. See drivers/virtio/virtio_ring.c for more details:

.. kernel-doc:: drivers/virtio/virtio_ring.c
    :identifiers: virtqueue_disable_cb

.. kernel-doc:: drivers/virtio/virtio_ring.c
    :identifiers: virtqueue_enable_cb

But note that some spurious callbacks can still be triggered under
certain scenarios. The way to disable callbacks reliably is to reset the
device or the virtqueue (virtio_reset_device()).
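
A common way to combine these calls is to drain the queue with callbacks
disabled and to loop again if re-enabling them reports that more buffers
arrived in the meantime. The userspace sketch below models only that
control flow, assuming that the re-enable step returns false when work
is still pending, as virtqueue_enable_cb() is documented to do. All
``toy_`` names are hypothetical and the ``late`` field merely simulates
completions that land while callbacks are off.

```c
#include <stdbool.h>

struct toy_cbq {
	unsigned int pending;	/* completed buffers not yet reclaimed */
	unsigned int late;	/* completions arriving while callbacks are off */
	bool cb_enabled;
};

static void toy_disable_cb(struct toy_cbq *q)
{
	q->cb_enabled = false;
}

/* Returns false if buffers are pending when callbacks are re-enabled */
static bool toy_enable_cb(struct toy_cbq *q)
{
	q->cb_enabled = true;
	/* simulate the device finishing more work in the race window */
	q->pending += q->late;
	q->late = 0;
	return q->pending == 0;
}

/* Reclaim one completed buffer; returns false when none is left */
static bool toy_cbq_get_buf(struct toy_cbq *q)
{
	if (q->pending == 0)
		return false;
	q->pending--;
	return true;
}

/* Drain loop: repeat until callbacks are on and nothing is pending */
static unsigned int toy_drain(struct toy_cbq *q)
{
	unsigned int handled = 0;

	do {
		toy_disable_cb(q);
		while (toy_cbq_get_buf(q))
			handled++;
	} while (!toy_enable_cb(q));

	return handled;
}
```

The loop ends only when re-enabling succeeds with an empty queue, so no
completed buffer is left waiting for a callback that will never come.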


References
==========

_`[1]` Virtio Spec v1.2:
https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html

Check for later versions of the spec as well.
5 changes: 5 additions & 0 deletions MAINTAINERS
@@ -22057,6 +22057,7 @@ S: Maintained
F: Documentation/ABI/testing/sysfs-bus-vdpa
F: Documentation/ABI/testing/sysfs-class-vduse
F: Documentation/devicetree/bindings/virtio/
F: Documentation/driver-api/virtio/
F: drivers/block/virtio_blk.c
F: drivers/crypto/virtio/
F: drivers/net/virtio_net.c
@@ -22077,6 +22078,10 @@ IFCVF VIRTIO DATA PATH ACCELERATOR
R: Zhu Lingshan <[email protected]>
F: drivers/vdpa/ifcvf/

SNET DPU VIRTIO DATA PATH ACCELERATOR
R: Alvaro Karsz <[email protected]>
F: drivers/vdpa/solidrun/

VIRTIO BALLOON
M: "Michael S. Tsirkin" <[email protected]>
M: David Hildenbrand <[email protected]>