Add superpipe supporting infrastructure to device driver for the IBM CXL
Flash adapter. This patch allows userspace applications to take advantage
of the accelerated I/O features that this adapter provides and bypass the
traditional filesystem stack.

Signed-off-by: Matthew R. Ochs <[email protected]>
Signed-off-by: Manoj N. Kumar <[email protected]>
Reviewed-by: Michael Neuling <[email protected]>
Reviewed-by: Wen Xiong <[email protected]>
Reviewed-by: Brian King <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
Matthew R. Ochs authored and James Bottomley committed Aug 27, 2015
1 parent 5cdac81, commit 65be2c7
Showing 11 changed files with 2,868 additions and 4 deletions.
@@ -314,6 +314,7 @@ Code Seq#(hex) Include File Comments
 0xB3  00     linux/mmc/ioctl.h
 0xC0  00-0F  linux/usb/iowarrior.h
 0xCA  00-0F  uapi/misc/cxl.h
+0xCA  80-8F  uapi/scsi/cxlflash_ioctl.h
 0xCB  00-1F  CBM serial IEC bus      in development:
              <mailto:[email protected]>
 0xCD  01     linux/reiserfs_fs.h
@@ -0,0 +1,257 @@
Introduction
============

The IBM Power architecture provides support for CAPI (Coherent
Accelerator Power Interface), which is available to certain PCIe slots
on Power 8 systems. CAPI can be thought of as a special tunneling
protocol through PCIe that allows PCIe adapters to look like special
purpose co-processors which can read or write an application's
memory and generate page faults. As a result, the host interface to
an adapter running in CAPI mode does not require the data buffers to
be mapped to the device's memory (IOMMU bypass) nor does it require
memory to be pinned.

On Linux, Coherent Accelerator (CXL) kernel services present CAPI
devices as PCI devices by implementing a virtual PCI host bridge.
This abstraction simplifies the infrastructure and programming
model, allowing for drivers to look similar to other native PCI
device drivers.

CXL provides a mechanism by which user space applications can
directly talk to a device (network or storage) bypassing the typical
kernel/device driver stack. The CXL Flash Adapter Driver gives a
user space application direct access to Flash storage.

The CXL Flash Adapter Driver is a kernel module that sits in the
SCSI stack as a low level device driver (below the SCSI disk and
protocol drivers) for the IBM CXL Flash Adapter. This driver is
responsible for the initialization of the adapter, setting up the
special path for user space access, and performing error recovery. It
communicates directly with the Flash Accelerator Functional Unit (AFU)
as described in Documentation/powerpc/cxl.txt.

The cxlflash driver supports two mutually exclusive modes of
operation at the device (LUN) level:

    - Any flash device (LUN) can be configured to be accessed as a
      regular disk device (e.g. /dev/sdc). This is the default mode.

    - Any flash device (LUN) can be configured to be accessed from
      user space with a special block library. This mode further
      specifies the means of accessing the device and provides for
      either raw access to the entire LUN (referred to as direct
      or physical LUN access) or access to a kernel/AFU-mediated
      partition of the LUN (referred to as virtual LUN access). The
      segmentation of a disk device into virtual LUNs is assisted
      by special translation services provided by the Flash AFU.

Overview
========

The Coherent Accelerator Interface Architecture (CAIA) introduces a
concept of a master context. A master typically has special privileges
granted to it by the kernel or hypervisor allowing it to perform AFU
wide management and control. The master may or may not be involved
directly in each user I/O, but at the minimum is involved in the
initial setup before the user application is allowed to send requests
directly to the AFU.

The CXL Flash Adapter Driver establishes a master context with the
AFU. It uses memory mapped I/O (MMIO) for this control and setup. The
Adapter Problem Space Memory Map looks like this:

    +-------------------------------+
    |     512 * 64 KB User MMIO     |
    |         (per context)         |
    |        User Accessible        |
    +-------------------------------+
    |    512 * 128 B per context    |
    |    Provisioning and Control   |
    |   Trusted Process accessible  |
    +-------------------------------+
    |          64 KB Global         |
    |   Trusted Process accessible  |
    +-------------------------------+

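As an illustration of the memory map above, the following sketch computes
byte offsets for each region. It assumes the three regions are contiguous,
laid out in the order drawn, with the user MMIO areas starting at offset 0.
The macro and function names are made up for this example and are not taken
from the driver.

```c
#include <stdint.h>

/* Sizes taken from the memory map above. */
#define EX_MAX_CONTEXTS     512u
#define EX_USER_MMIO_SIZE   (64u * 1024u)  /* per-context user MMIO area */
#define EX_CTRL_AREA_SIZE   128u           /* per-context provisioning and control */
#define EX_GLOBAL_AREA_SIZE (64u * 1024u)  /* single global area */

/*
 * ASSUMPTION: regions are contiguous and ordered as in the diagram,
 * starting at offset 0. All names here are illustrative only.
 */

/* Offset of the user-accessible MMIO area for context 'ctx'. */
static inline uint64_t ex_user_mmio_offset(unsigned int ctx)
{
    return (uint64_t)ctx * EX_USER_MMIO_SIZE;
}

/* Offset of the provisioning and control area for context 'ctx'. */
static inline uint64_t ex_ctrl_area_offset(unsigned int ctx)
{
    return (uint64_t)EX_MAX_CONTEXTS * EX_USER_MMIO_SIZE +
           (uint64_t)ctx * EX_CTRL_AREA_SIZE;
}

/* Offset of the global area, which follows all per-context areas. */
static inline uint64_t ex_global_area_offset(void)
{
    return (uint64_t)EX_MAX_CONTEXTS *
           ((uint64_t)EX_USER_MMIO_SIZE + EX_CTRL_AREA_SIZE);
}
```

Under these assumptions the user areas span 32 MB, the control areas span
64 KB, and the global area starts right after them.
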
This driver configures itself into the SCSI software stack as an
adapter driver. The driver is the only entity that is considered a
Trusted Process to program the Provisioning and Control and Global
areas in the MMIO Space shown above. The master context driver
discovers all LUNs attached to the CXL Flash adapter and instantiates
SCSI block devices (/dev/sdb, /dev/sdc, etc.) for each unique LUN
seen from each path.

Once these SCSI block devices are instantiated, an application
written to a specification provided by the block library may get
access to the Flash from user space (without requiring a system call).

This master context driver also provides a series of ioctls for this
block library to enable this user space access. The driver supports
two modes for accessing the block device.

The first mode is called the virtual mode. In this mode a single SCSI
block device (/dev/sdb) may be carved up into any number of distinct
virtual LUNs. The virtual LUNs may be resized as long as the sum of
the sizes of all the virtual LUNs, along with their associated
metadata, does not exceed the physical capacity.

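The resize constraint for virtual LUNs can be sketched as a simple capacity
check. This is only an illustration of the arithmetic: the function name is
hypothetical, and the metadata is modeled as a single caller-supplied total,
which is an assumption rather than a description of how the driver or AFU
accounts for it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when the sum of all virtual LUN sizes plus the metadata
 * fits within the physical capacity. All values use the same unit
 * (for example, blocks).
 *
 * ASSUMPTION: 'metadata' is one aggregate figure supplied by the caller.
 */
static bool ex_vluns_fit(const uint64_t *vlun_sizes, size_t count,
                         uint64_t metadata, uint64_t physical_capacity)
{
    uint64_t used = metadata;

    if (used > physical_capacity)
        return false;

    for (size_t i = 0; i < count; i++) {
        /* Compare against the remaining space to avoid overflow. */
        if (vlun_sizes[i] > physical_capacity - used)
            return false;
        used += vlun_sizes[i];
    }
    return true;
}
```

A resize request would be checked by substituting the proposed new size for
the affected virtual LUN before calling the function.
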
The second mode is called the physical mode. In this mode a single
block device (/dev/sdb) may be opened directly by the block library
and the entire space for the LUN is available to the application.

Only the physical mode provides persistence of the data, i.e. the
data written to the block device will survive application exit and
restart as well as reboot. The virtual LUNs do not persist (i.e. they
do not survive after the application terminates or the system reboots).


Block library API
=================

Applications intending to get access to the CXL Flash from user
space should use the block library, as it abstracts the details of
interfacing directly with the cxlflash driver that are necessary for
performing administrative actions (e.g. setup, teardown, resize).
The block library can be thought of as a 'user' of services,
implemented as IOCTLs, that are provided by the cxlflash driver
specifically for devices (LUNs) operating in user space access
mode. While it is not a requirement that applications understand
the interface between the block library and the cxlflash driver,
a high-level overview of each supported service (IOCTL) is provided
below.

The block library can be found on GitHub:
http://www.github.com/mikehollinger/ibmcapikv


CXL Flash Driver IOCTLs
=======================

Users, such as the block library, that wish to interface with a flash
device (LUN) via user space access need to use the services provided
by the cxlflash driver. As these services are implemented as ioctls,
a file descriptor handle must first be obtained in order to establish
the communication channel between a user and the kernel. This file
descriptor is obtained by opening the device special file associated
with the SCSI disk device (/dev/sdb) that was created during LUN
discovery. Because of where the cxlflash driver sits within the
SCSI protocol stack, this open is not actually seen by the cxlflash
driver. Upon successful open, the user receives a file descriptor
(herein referred to as fd1) that should be used for issuing the
subsequent ioctls listed below.

The structure definitions for these IOCTLs are available in:
uapi/scsi/cxlflash_ioctl.h

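The ioctl-number.txt hunk in this commit reserves code 0xCA, sequence
numbers 0x80 through 0x8F, for uapi/scsi/cxlflash_ioctl.h. The sketch below
shows how a command number in that range is encoded and checked using the
standard macros from linux/ioctl.h. The payload structure and the command
itself are hypothetical and exist only to demonstrate the encoding; the real
definitions are in the header named above.

```c
#include <linux/ioctl.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the ioctl-number.txt entry added by this commit. */
#define EX_CXLFLASH_MAGIC 0xCA
#define EX_NR_FIRST       0x80
#define EX_NR_LAST        0x8F

/* HYPOTHETICAL payload; not the layout used by the real header. */
struct ex_payload {
    uint64_t context_id;
    uint64_t flags;
};

/* HYPOTHETICAL command, built only to illustrate the encoding. */
#define EX_CMD _IOWR(EX_CXLFLASH_MAGIC, 0x80, struct ex_payload)

/* True when 'cmd' falls in the code/sequence range reserved above. */
static bool ex_in_cxlflash_range(unsigned long cmd)
{
    return _IOC_TYPE(cmd) == EX_CXLFLASH_MAGIC &&
           _IOC_NR(cmd) >= EX_NR_FIRST &&
           _IOC_NR(cmd) <= EX_NR_LAST;
}
```

A real caller would open the SCSI disk device to obtain fd1 and pass one of
the commands from uapi/scsi/cxlflash_ioctl.h to ioctl() on that descriptor.
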
DK_CXLFLASH_ATTACH
------------------

This ioctl obtains, initializes, and starts a context using the CXL
kernel services. These services specify a context id (u16) by which
to uniquely identify the context and its allocated resources. The
services additionally provide a second file descriptor (herein
referred to as fd2) that is used by the block library to initiate
memory mapped I/O (via mmap()) to the CXL flash device and poll for
completion events. This file descriptor is intentionally installed by
this driver and not the CXL kernel services to allow for intermediary
notification and access in the event of a non-user-initiated close(),
such as a killed process. This design point is described in further
detail in the description for the DK_CXLFLASH_DETACH ioctl.

There are a few important aspects regarding the "tokens" (context id
and fd2) that are provided back to the user:

    - These tokens are only valid for the process under which they
      were created. The child of a forked process cannot continue
      to use the context id or file descriptor created by its parent.

    - These tokens are only valid for the lifetime of the context and
      the process under which they were created. Once either is
      destroyed, the tokens are to be considered stale and subsequent
      usage will result in errors.

    - When a context is no longer needed, the user shall detach from
      the context via the DK_CXLFLASH_DETACH ioctl.

    - A close on fd2 will invalidate the tokens. This operation is not
      required by the user.

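A user of these tokens may want to guard against accidental reuse in a
forked child or after a detach. The sketch below is one possible user-side
bookkeeping scheme, not something the driver or block library is known to
provide. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* HYPOTHETICAL user-side record of the tokens returned by an attach. */
struct ex_tokens {
    uint16_t context_id;  /* context id (u16) from the attach */
    int      fd2;         /* second file descriptor, -1 once stale */
    pid_t    owner;       /* process that performed the attach */
};

/* Record the tokens together with the pid of the attaching process. */
static struct ex_tokens ex_tokens_record(uint16_t context_id, int fd2)
{
    struct ex_tokens t = { context_id, fd2, getpid() };
    return t;
}

/* Tokens are usable only in the owning process and before invalidation. */
static bool ex_tokens_usable(const struct ex_tokens *t)
{
    return t->fd2 >= 0 && t->owner == getpid();
}

/* Mark the tokens stale, e.g. after a detach or a close on fd2. */
static void ex_tokens_invalidate(struct ex_tokens *t)
{
    t->fd2 = -1;
}
```

Because the owner pid is captured at attach time, a child created by fork()
would see ex_tokens_usable() return false and would need its own attach.
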
DK_CXLFLASH_USER_DIRECT
-----------------------
This ioctl is responsible for transitioning the LUN to direct
(physical) mode access and configuring the AFU for direct access from
user space on a per-context basis. Additionally, the block size and
last logical block address (LBA) are returned to the user.

As mentioned previously, when operating in user space access mode,
LUNs may be accessed in whole or in part. Only one mode is allowed
at a time and if one mode is active (outstanding references exist),
requests to use the LUN in a different mode are denied.

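The mode exclusivity rule described above can be modeled with a mode field
and a reference count. This is a minimal sketch of the rule as stated in the
text, with hypothetical names; it does not reproduce the driver's actual
data structures or locking.

```c
#include <stdbool.h>

/* HYPOTHETICAL model of a LUN's user space access mode. */
enum ex_lun_mode { EX_MODE_NONE, EX_MODE_PHYSICAL, EX_MODE_VIRTUAL };

struct ex_lun {
    enum ex_lun_mode mode;  /* mode of the outstanding references */
    unsigned int     refs;  /* number of outstanding references */
};

/* Take a reference in mode 'want'; denied if another mode is active. */
static bool ex_lun_get(struct ex_lun *lun, enum ex_lun_mode want)
{
    if (want == EX_MODE_NONE)
        return false;
    if (lun->refs > 0 && lun->mode != want)
        return false;  /* a different mode has outstanding references */
    lun->mode = want;
    lun->refs++;
    return true;
}

/* Drop a reference; the mode is cleared when the last one goes away. */
static void ex_lun_put(struct ex_lun *lun)
{
    if (lun->refs > 0 && --lun->refs == 0)
        lun->mode = EX_MODE_NONE;
}
```

Once every reference has been dropped, a request for the other mode succeeds
again, which matches the "outstanding references exist" condition above.
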
The AFU is configured for direct access from user space by adding an
entry to the AFU's resource handle table. The index of the entry is
treated as a resource handle that is returned to the user. The user
is then able to use the handle to reference the LUN during I/O.

DK_CXLFLASH_RELEASE
-------------------
This ioctl is responsible for releasing a previously obtained
reference to either a physical or virtual LUN. This can be
thought of as the inverse of the DK_CXLFLASH_USER_DIRECT or
DK_CXLFLASH_USER_VIRTUAL ioctls. Upon success, the resource handle
is no longer valid and the entry in the resource handle table is
made available to be used again.

As part of the release process for virtual LUNs, the virtual LUN
is first resized to 0 to clear out and free the translation tables
associated with the virtual LUN reference.

DK_CXLFLASH_DETACH
------------------
This ioctl is responsible for unregistering a context with the
cxlflash driver and releasing outstanding resources that were
not explicitly released via the DK_CXLFLASH_RELEASE ioctl. Upon
success, all "tokens" which had been provided to the user from the
DK_CXLFLASH_ATTACH onward are no longer valid.

DK_CXLFLASH_VERIFY
------------------
This ioctl is used to detect various changes such as the capacity of
the disk changing, the number of LUNs visible changing, etc. In cases
where the changes affect the application (such as a LUN resize), the
cxlflash driver will report the changed state to the application.

The user calls in when they want to validate that a LUN hasn't been
changed in response to a check condition. As the user is operating out
of band from the kernel, they will see these types of events without
the kernel's knowledge. When encountered, the user's architected
behavior is to call in to this ioctl, indicating what they want to
verify and passing along any appropriate information. For now, only
verifying a LUN change (i.e. a different size) with sense data is
supported.

DK_CXLFLASH_RECOVER_AFU
-----------------------
This ioctl is used to drive recovery (if such an action is warranted)
of a specified user context. Any state associated with the user context
is re-established upon successful recovery.

User contexts are put into an error condition when the device needs to
be reset or is terminating. Users are notified of this error condition
by seeing all 0xF's on an MMIO read. Upon encountering this, the
architected behavior for a user is to call into this ioctl to recover
their context. A user may also call into this ioctl at any time to
check if the device is operating normally. If a failure is returned
from this ioctl, the user is expected to gracefully clean up their
context via release/detach ioctls. Until they do, the context they
hold is not relinquished. The user may also optionally exit the process
at which time the context/resources they held will be freed as part of
the release fop.

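Detecting the error condition described above amounts to checking whether an
MMIO read returned all ones. The sketch below assumes 64-bit registers, which
the text does not state, and uses hypothetical helper names. It shows only
the detection step, not the recovery ioctl call.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * ASSUMPTION: registers are read as 64-bit values, so "all 0xF's"
 * corresponds to UINT64_MAX. Names are illustrative only.
 */

/* Read a 64-bit register from a mapped MMIO area. */
static uint64_t ex_mmio_read64(const volatile uint64_t *reg)
{
    return *reg;
}

/* True when a read value indicates the context is in an error condition. */
static bool ex_mmio_read_signals_error(uint64_t value)
{
    return value == UINT64_MAX;
}
```

When the check returns true, the architected next step is to issue
DK_CXLFLASH_RECOVER_AFU and, if that fails, to clean up via the release and
detach ioctls.
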
DK_CXLFLASH_MANAGE_LUN
----------------------
This ioctl is used to switch a LUN from a mode where it is available
for file-system access (legacy), to a mode where it is set aside for
exclusive user space access (superpipe). In case a LUN is visible
across multiple ports and adapters, this ioctl is used to uniquely
identify each LUN by its World Wide Node Name (WWNN).
@@ -1,2 +1,2 @@
 obj-$(CONFIG_CXLFLASH) += cxlflash.o
-cxlflash-y += main.o
+cxlflash-y += main.o superpipe.o lunmgt.o