Skip to content

Commit

Permalink
Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/…
Browse files Browse the repository at this point in the history
…git/cxl/cxl

Pull Compute Express Link (CXL) updates from Dan Williams:
 "To date Linux has been dependent on platform-firmware to map CXL RAM
  regions and handle events / errors from devices. With this update we
  can now parse / update the CXL memory layout, and report events /
  errors from devices. This is a precursor for the CXL subsystem to
  handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that
  for DDR-attached-DRAM is handled by the EDAC driver where it maps
  system physical address events to a field-replaceable-unit (FRU /
  endpoint device). In general, CXL has the potential to standardize
  what has historically been a pile of memory-controller-specific error
  handling logic.

  Another change of note is the default policy for handling RAM-backed
  device-dax instances. Previously the default access mode was "device",
  mmap(2) a device special file to access memory. The new default is
  "kmem" where the address range is assigned to the core-mm via
  add_memory_driver_managed(). This saves typical users from wondering
  why their platform memory is not visible via free(1) and stuck behind
  a device-file. At the same time it allows expert users to deploy
  policy to, for example, get dedicated access to high performance
  memory, or hide low performance memory from general purpose kernel
  allocations. This affects not only CXL, but also systems with
  high-bandwidth-memory that platform-firmware tags with the
  EFI_MEMORY_SP (special purpose) designation.

  Summary:

   - CXL RAM region enumeration: instantiate 'struct cxl_region' objects
     for platform firmware created memory regions

   - CXL RAM region provisioning: complement the existing PMEM region
     creation support with RAM region support

   - "Soft Reservation" policy change: Online (memory hot-add)
     soft-reserved memory (EFI_MEMORY_SP) by default, but still allow
     for setting aside such memory for dedicated access via device-dax.

   - CXL Events and Interrupts: Takeover CXL event handling from
     platform-firmware (ACPI calls this CXL Memory Error Reporting) and
     export CXL Events via Linux Trace Events.

   - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
     subsystem interrogate the result of CXL _OSC negotiation.

   - Emulate CXL DVSEC Range Registers as "decoders": Allow for
     first-generation devices that pre-date the definition of the CXL
     HDM Decoder Capability to translate the CXL DVSEC Range Registers
     into 'struct cxl_decoder' objects.

   - Set timestamp: Per spec, set the device timestamp in case of
     hotplug, or if platform-firwmare failed to set it.

   - General fixups: linux-next build issues, non-urgent fixes for
     pre-production hardware, unit test fixes, spelling and debug
     message improvements"

* tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits)
  dax/kmem: Fix leak of memory-hotplug resources
  cxl/mem: Add kdoc param for event log driver state
  cxl/trace: Add serial number to trace points
  cxl/trace: Add host output to trace points
  cxl/trace: Standardize device information output
  cxl/pci: Remove locked check for dvsec_range_allowed()
  cxl/hdm: Add emulation when HDM decoders are not committed
  cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders
  cxl/hdm: Emulate HDM decoder from DVSEC range registers
  cxl/pci: Refactor cxl_hdm_decode_init()
  cxl/port: Export cxl_dvsec_rr_decode() to cxl_port
  cxl/pci: Break out range register decoding from cxl_hdm_decode_init()
  cxl: add RAS status unmasking for CXL
  cxl: remove unnecessary calling of pci_enable_pcie_error_reporting()
  dax/hmem: build hmem device support as module if possible
  dax: cxl: add CXL_REGION dependency
  cxl: avoid returning uninitialized error code
  cxl/pmem: Fix nvdimm registration races
  cxl/mem: Fix UAPI command comment
  cxl/uapi: Tag commands from cxl_query_cmd()
  ...
  • Loading branch information
torvalds committed Feb 25, 2023
2 parents d8e4731 + e686c32 commit 7c3dc44
Show file tree
Hide file tree
Showing 60 changed files with 3,788 additions and 740 deletions.
79 changes: 53 additions & 26 deletions Documentation/ABI/testing/sysfs-bus-cxl
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,21 @@ Description:
capability.


What: /sys/bus/cxl/devices/{port,endpoint}X/parent_dport
Date: January, 2023
KernelVersion: v6.3
Contact: [email protected]
Description:
(RO) CXL port objects are instantiated for each upstream port in
a CXL/PCIe switch, and for each endpoint to map the
corresponding memory device into the CXL port hierarchy. When a
descendant CXL port (switch or endpoint) is enumerated it is
useful to know which 'dport' object in the parent CXL port
routes to this descendant. The 'parent_dport' symlink points to
the device representing the downstream port of a CXL switch that
routes to {port,endpoint}X.


What: /sys/bus/cxl/devices/portX/dportY
Date: June, 2021
KernelVersion: v5.14
Expand Down Expand Up @@ -183,7 +198,7 @@ Description:

What: /sys/bus/cxl/devices/endpointX/CDAT
Date: July, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RO) If this sysfs entry is not present no DOE mailbox was
Expand All @@ -194,7 +209,7 @@ Description:

What: /sys/bus/cxl/devices/decoderX.Y/mode
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
Expand All @@ -214,7 +229,7 @@ Description:

What: /sys/bus/cxl/devices/decoderX.Y/dpa_resource
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
Expand All @@ -225,7 +240,7 @@ Description:

What: /sys/bus/cxl/devices/decoderX.Y/dpa_size
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
Expand All @@ -245,7 +260,7 @@ Description:

What: /sys/bus/cxl/devices/decoderX.Y/interleave_ways
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RO) The number of targets across which this decoder's host
Expand All @@ -260,7 +275,7 @@ Description:

What: /sys/bus/cxl/devices/decoderX.Y/interleave_granularity
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RO) The number of consecutive bytes of host physical address
Expand All @@ -270,25 +285,25 @@ Description:
interleave_granularity).


What: /sys/bus/cxl/devices/decoderX.Y/create_pmem_region
Date: May, 2022
KernelVersion: v5.20
What: /sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region
Date: May, 2022, January, 2023
KernelVersion: v6.0 (pmem), v6.3 (ram)
Contact: [email protected]
Description:
(RW) Write a string in the form 'regionZ' to start the process
of defining a new persistent memory region (interleave-set)
within the decode range bounded by root decoder 'decoderX.Y'.
The value written must match the current value returned from
reading this attribute. An atomic compare exchange operation is
done on write to assign the requested id to a region and
allocate the region-id for the next creation attempt. EBUSY is
returned if the region name written does not match the current
cached value.
of defining a new persistent, or volatile memory region
(interleave-set) within the decode range bounded by root decoder
'decoderX.Y'. The value written must match the current value
returned from reading this attribute. An atomic compare exchange
operation is done on write to assign the requested id to a
region and allocate the region-id for the next creation attempt.
EBUSY is returned if the region name written does not match the
current cached value.


What: /sys/bus/cxl/devices/decoderX.Y/delete_region
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(WO) Write a string in the form 'regionZ' to delete that region,
Expand All @@ -297,17 +312,18 @@ Description:

What: /sys/bus/cxl/devices/regionZ/uuid
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) Write a unique identifier for the region. This field must
be set for persistent regions and it must not conflict with the
UUID of another region.
UUID of another region. For volatile ram regions this
attribute is a read-only empty string.


What: /sys/bus/cxl/devices/regionZ/interleave_granularity
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) Set the number of consecutive bytes each device in the
Expand All @@ -318,7 +334,7 @@ Description:

What: /sys/bus/cxl/devices/regionZ/interleave_ways
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) Configures the number of devices participating in the
Expand All @@ -328,7 +344,7 @@ Description:

What: /sys/bus/cxl/devices/regionZ/size
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) System physical address space to be consumed by the region.
Expand All @@ -343,9 +359,20 @@ Description:
results in the same address being allocated.


What: /sys/bus/cxl/devices/regionZ/mode
Date: January, 2023
KernelVersion: v6.3
Contact: [email protected]
Description:
(RO) The mode of a region is established at region creation time
and dictates the mode of the endpoint decoder that comprise the
region. For more details on the possible modes see
/sys/bus/cxl/devices/decoderX.Y/mode


What: /sys/bus/cxl/devices/regionZ/resource
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RO) A region is a contiguous partition of a CXL root decoder
Expand All @@ -357,7 +384,7 @@ Description:

What: /sys/bus/cxl/devices/regionZ/target[0..N]
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) Write an endpoint decoder object name to 'targetX' where X
Expand All @@ -376,7 +403,7 @@ Description:

What: /sys/bus/cxl/devices/regionZ/commit
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: [email protected]
Description:
(RW) Write a boolean 'true' string value to this attribute to
Expand Down
1 change: 1 addition & 0 deletions MAINTAINERS
Original file line number Diff line number Diff line change
Expand Up @@ -5912,6 +5912,7 @@ M: Dan Williams <[email protected]>
M: Vishal Verma <[email protected]>
M: Dave Jiang <[email protected]>
L: [email protected]
L: [email protected]
S: Supported
F: drivers/dax/

Expand Down
2 changes: 1 addition & 1 deletion drivers/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ obj-$(CONFIG_FB_INTEL) += video/fbdev/intelfb/
obj-$(CONFIG_PARPORT) += parport/
obj-y += base/ block/ misc/ mfd/ nfc/
obj-$(CONFIG_LIBNVDIMM) += nvdimm/
obj-$(CONFIG_DAX) += dax/
obj-y += dax/
obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf/
obj-$(CONFIG_NUBUS) += nubus/
obj-y += cxl/
Expand Down
4 changes: 2 additions & 2 deletions drivers/acpi/numa/hmat.c
Original file line number Diff line number Diff line change
Expand Up @@ -718,7 +718,7 @@ static void hmat_register_target_devices(struct memory_target *target)
for (res = target->memregions.child; res; res = res->sibling) {
int target_nid = pxm_to_node(target->memory_pxm);

hmem_register_device(target_nid, res);
hmem_register_resource(target_nid, res);
}
}

Expand Down Expand Up @@ -869,4 +869,4 @@ static __init int hmat_init(void)
acpi_put_table(tbl);
return 0;
}
device_initcall(hmat_init);
subsys_initcall(hmat_init);
3 changes: 3 additions & 0 deletions drivers/acpi/pci_root.c
Original file line number Diff line number Diff line change
Expand Up @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
host_bridge->native_dpc = 0;

if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
host_bridge->native_cxl_error = 0;

/*
* Evaluate the "PCI Boot Configuration" _DSM Function. If it
* exists and returns 0, we must preserve any PCI resource
Expand Down
14 changes: 12 additions & 2 deletions drivers/cxl/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -104,19 +104,29 @@ config CXL_SUSPEND
depends on SUSPEND && CXL_MEM

config CXL_REGION
bool
bool "CXL: Region Support"
default CXL_BUS
# For MAX_PHYSMEM_BITS
depends on SPARSEMEM
select MEMREGION
select GET_FREE_REGION
help
Enable the CXL core to enumerate and provision CXL regions. A CXL
region is defined by one or more CXL expanders that decode a given
system-physical address range. For CXL regions established by
platform-firmware this option enables memory error handling to
identify the devices participating in a given interleaved memory
range. Otherwise, platform-firmware managed CXL is enabled by being
placed in the system address map and does not need a driver.

If unsure say 'y'

config CXL_REGION_INVALIDATION_TEST
bool "CXL: Region Cache Management Bypass (TEST)"
depends on CXL_REGION
help
CXL Region management and security operations potentially invalidate
the content of CPU caches without notifiying those caches to
the content of CPU caches without notifying those caches to
invalidate the affected cachelines. The CXL Region driver attempts
to invalidate caches when those events occur. If that invalidation
fails the region will fail to enable. Reasons for cache
Expand Down
5 changes: 3 additions & 2 deletions drivers/cxl/acpi.c
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ struct cxl_cxims_data {

/*
* Find a targets entry (n) in the host bridge interleave list.
* CXL Specfication 3.0 Table 9-22
* CXL Specification 3.0 Table 9-22
*/
static int cxl_xor_calc_n(u64 hpa, struct cxl_cxims_data *cximsd, int iw,
int ig)
Expand Down Expand Up @@ -731,7 +731,8 @@ static void __exit cxl_acpi_exit(void)
cxl_bus_drain();
}

module_init(cxl_acpi_init);
/* load before dax_hmem sees 'Soft Reserved' CXL ranges */
subsys_initcall(cxl_acpi_init);
module_exit(cxl_acpi_exit);
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
Expand Down
3 changes: 3 additions & 0 deletions drivers/cxl/core/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,14 @@ obj-$(CONFIG_CXL_BUS) += cxl_core.o
obj-$(CONFIG_CXL_SUSPEND) += suspend.o

ccflags-y += -I$(srctree)/drivers/cxl
CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)

cxl_core-y := port.o
cxl_core-y += pmem.o
cxl_core-y += regs.o
cxl_core-y += memdev.o
cxl_core-y += mbox.o
cxl_core-y += pci.o
cxl_core-y += hdm.o
cxl_core-$(CONFIG_TRACING) += trace.o
cxl_core-$(CONFIG_CXL_REGION) += region.o
7 changes: 4 additions & 3 deletions drivers/cxl/core/core.h
Original file line number Diff line number Diff line change
Expand Up @@ -11,15 +11,18 @@ extern struct attribute_group cxl_base_attribute_group;

#ifdef CONFIG_CXL_REGION
extern struct device_attribute dev_attr_create_pmem_region;
extern struct device_attribute dev_attr_create_ram_region;
extern struct device_attribute dev_attr_delete_region;
extern struct device_attribute dev_attr_region;
extern const struct device_type cxl_pmem_region_type;
extern const struct device_type cxl_dax_region_type;
extern const struct device_type cxl_region_type;
void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled);
#define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
#define CXL_REGION_TYPE(x) (&cxl_region_type)
#define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
#define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type)
#define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
int cxl_region_init(void);
void cxl_region_exit(void);
#else
Expand All @@ -37,6 +40,7 @@ static inline void cxl_region_exit(void)
#define CXL_REGION_TYPE(x) NULL
#define SET_CXL_REGION_ATTR(x)
#define CXL_PMEM_REGION_TYPE(x) NULL
#define CXL_DAX_REGION_TYPE(x) NULL
#endif

struct cxl_send_command;
Expand All @@ -56,9 +60,6 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
extern struct rw_semaphore cxl_dpa_rwsem;

bool is_switch_decoder(struct device *dev);
struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);

int cxl_memdev_init(void);
void cxl_memdev_exit(void);
void cxl_mbox_init(void);
Expand Down
Loading

0 comments on commit 7c3dc44

Please sign in to comment.