Merge remote-tracking branches 'ras/edac-drivers', 'ras/edac-misc' and 'ras/edac-amd-atl' into edac-updates-for-v6.9

* ras/edac-drivers:
  EDAC/i10nm: Add Intel Grand Ridge micro-server support
  EDAC/igen6: Add one more Intel Alder Lake-N SoC support

* ras/edac-misc:
  EDAC/versal: Convert to platform remove callback returning void
  EDAC/versal: Make the bit position of injected errors configurable
  EDAC/synopsys: Convert to devm_platform_ioremap_resource()

* ras/edac-amd-atl:
  RAS/AMD/FMPM: Fix off by one when unwinding on error
  RAS/AMD/FMPM: Add debugfs interface to print record entries
  RAS/AMD/FMPM: Save SPA values
  RAS: Export helper to get ras_debugfs_dir
  RAS/AMD/ATL: Fix bit overflow in denorm_addr_df4_np2()
  RAS: Introduce a FRU memory poison manager
  RAS/AMD/ATL: Add MI300 row retirement support
  Documentation: Move RAS section to admin-guide
  RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support
  RAS/AMD/ATL: Fix array overflow in get_logical_coh_st_fabric_id_mi300()
  RAS/AMD/ATL: Add MI300 support
  Documentation: RAS: Add index and address translation section
  EDAC/amd64: Use new AMD Address Translation Library
  RAS: Introduce AMD Address Translation Library

Signed-off-by: Borislav Petkov (AMD) <[email protected]>
bp3tk0v committed Mar 11, 2024
3 parents e77086c + 4527a21 + bd17b7c commit af65545
Showing 30 changed files with 5,161 additions and 335 deletions.
24 changes: 24 additions & 0 deletions Documentation/admin-guide/RAS/address-translation.rst
@@ -0,0 +1,24 @@
.. SPDX-License-Identifier: GPL-2.0
Address translation
===================

x86 AMD
-------

Zen-based AMD systems include a Data Fabric that manages the layout of
physical memory. Devices attached to the Fabric, like memory controllers,
I/O, etc., may not have a complete view of the system physical memory map.
These devices may provide a "normalized", i.e. device physical, address
when reporting memory errors. Normalized addresses must be translated to
a system physical address for the kernel to action on the memory.

AMD Address Translation Library (CONFIG_AMD_ATL) provides translation for
this case.

Glossary of acronyms used in address translation for Zen-based systems

* CCM = Cache Coherent Moderator
* COD = Cluster-on-Die
* COH_ST = Coherent Station
* DF = Data Fabric
@@ -1,15 +1,10 @@
.. SPDX-License-Identifier: GPL-2.0
Reliability, Availability and Serviceability features
=====================================================

This documents different aspects of the RAS functionality present in the
kernel.

Error decoding
---------------
==============

* x86
x86
---

Error decoding on AMD systems should be done using the rasdaemon tool:
https://github.com/mchehab/rasdaemon/
7 changes: 7 additions & 0 deletions Documentation/admin-guide/RAS/index.rst
@@ -0,0 +1,7 @@
.. SPDX-License-Identifier: GPL-2.0
.. toctree::
:maxdepth: 2

main
error-decoding
address-translation
@@ -1,8 +1,12 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

============================================
Reliability, Availability and Serviceability
============================================
==================================================
Reliability, Availability and Serviceability (RAS)
==================================================

This documents different aspects of the RAS functionality present in the
kernel.

RAS concepts
************
2 changes: 1 addition & 1 deletion Documentation/admin-guide/index.rst
@@ -122,7 +122,7 @@ configure specific aspects of kernel behavior to your liking.
pmf
pnp
rapidio
ras
RAS/index
rtc
serial-console
svga
1 change: 0 additions & 1 deletion Documentation/index.rst
@@ -113,7 +113,6 @@ to ReStructured Text format, or are simply too old.
:maxdepth: 1

staging/index
RAS/ras


Translations
15 changes: 13 additions & 2 deletions MAINTAINERS
@@ -897,6 +897,12 @@ Q: https://patchwork.kernel.org/project/linux-rdma/list/
F: drivers/infiniband/hw/efa/
F: include/uapi/rdma/efa-abi.h

AMD ADDRESS TRANSLATION LIBRARY (ATL)
M: Yazen Ghannam <[email protected]>
L: [email protected]
S: Supported
F: drivers/ras/amd/atl/*

AMD AXI W1 DRIVER
M: Kris Chaplin <[email protected]>
R: Thomas Delev <[email protected]>
@@ -7578,7 +7584,6 @@ R: Robert Richter <[email protected]>
L: [email protected]
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git edac-for-next
F: Documentation/admin-guide/ras.rst
F: Documentation/driver-api/edac.rst
F: drivers/edac/
F: include/linux/edac.h
@@ -18353,11 +18358,17 @@ M: Tony Luck <[email protected]>
M: Borislav Petkov <[email protected]>
L: [email protected]
S: Maintained
F: Documentation/admin-guide/ras.rst
F: Documentation/admin-guide/RAS
F: drivers/ras/
F: include/linux/ras.h
F: include/ras/ras_event.h

RAS FRU MEMORY POISON MANAGER (FMPM)
M: Yazen Ghannam <[email protected]>
L: [email protected]
S: Maintained
F: drivers/ras/amd/fmpm.c

RC-CORE / LIRC FRAMEWORK
M: Sean Young <[email protected]>
L: [email protected]
1 change: 1 addition & 0 deletions drivers/edac/Kconfig
@@ -78,6 +78,7 @@ config EDAC_GHES
config EDAC_AMD64
tristate "AMD64 (Opteron, Athlon64)"
depends on AMD_NB && EDAC_DECODE_MCE
imply AMD_ATL
help
Support for error detection and correction of DRAM ECC errors on
the AMD64 families (>= K8) of memory controllers.