aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/drivers/ras/amd/atl/system.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2024-07-15Merge tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/rasLinus Torvalds1-1/+20
Pull EDAC updates from Borislav Petkov: - The AMD memory controllers data fabric version 4.5 supports non-power-of-2 denormalization in the sense that certain bits of the system physical address cannot be reconstructed from the normalized address reported by the RAS hardware. Add support for handling such addresses - Switch the EDAC drivers to the new Intel CPU model defines - The usual fixes and cleanups all over the place * tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC: Add missing MODULE_DESCRIPTION() macros EDAC/dmc520: Use devm_platform_ioremap_resource() EDAC/igen6: Add Intel Arrow Lake-U/H SoCs support RAS/AMD/FMPM: Use atl internal.h for INVALID_SPA RAS/AMD/ATL: Implement DF 4.5 NP2 denormalization RAS/AMD/ATL: Validate address map when information is gathered RAS/AMD/ATL: Expand helpers for adding and removing base and hole RAS/AMD/ATL: Read DRAM hole base early RAS/AMD/ATL: Add amd_atl pr_fmt() prefix RAS/AMD/ATL: Add a missing module description EDAC, i10nm: make skx_common.o a separate module EDAC/skx: Switch to new Intel CPU model defines EDAC/sb_edac: Switch to new Intel CPU model defines EDAC, pnd2: Switch to new Intel CPU model defines EDAC/i10nm: Switch to new Intel CPU model defines EDAC/ghes: Add missing newline to pr_info() statement RAS/AMD/ATL: Add missing newline to pr_info() statement EDAC/thunderx: Remove unused struct error_syndrome
2024-06-16RAS/AMD/ATL: Use system settings for MI300 DRAM to normalized address translationYazen Ghannam1-1/+1
The currently used normalized address format is not applicable to all MI300 systems. This leads to incorrect results during address translation. Drop the fixed layout and construct the normalized address from system settings. Fixes: 87a612375307 ("RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20240607-mi300-dram-xl-fix-v1-2-2f11547a178c@amd.com
2024-06-09RAS/AMD/ATL: Read DRAM hole base earlyJohn Allen1-0/+19
Read DRAM hole base when constructing the address map as the value will not change during run time. Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20240606203313.51197-3-john.allen@amd.com
2024-06-09RAS/AMD/ATL: Add amd_atl pr_fmt() prefixJohn Allen1-1/+1
Prefix all AMD ATL pr_* statements with "amd_atl:". Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240606203313.51197-2-john.allen@amd.com
2024-02-01RAS/AMD/ATL: Add MI300 DRAM to normalized address translation supportYazen Ghannam1-1/+5
Zen-based AMD systems report DRAM ECC errors through Unified Memory Controller (UMC) MCA banks. The value provided in MCA_ADDR is a "normalized" address which represents the UMC's view of its managed memory. The normalized address must be translated to a system physical address for software to take action. MI300 systems, uniquely, do not provide a normalized address in MCA_ADDR for DRAM ECC errors. Rather, the "DRAM" address is reported. This value includes identifiers for the bank, row, column, pseudochannel and stack of the memory location. The DRAM address must be converted to a normalized address in order to be further translated to a system physical address. Add helper functions to do the DRAM to normalized translation for MI300 systems. The method is based on the fixed hardware layout of the on-chip memory. [ bp: Massage commit message, decapitalize some, rename function. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Muralidhara M K <muralidhara.mk@amd.com> Link: https://lore.kernel.org/r/20240131165732.88297-1-yazen.ghannam@amd.com
2024-01-29RAS/AMD/ATL: Add MI300 supportMuralidhara M K1-0/+3
AMD MI300 systems include on-die HBM3 memory and a unique topology. And they fall under Data Fabric version 4.5 in overall design. Generally, topology information (IDs, etc.) is gathered from Data Fabric registers. However, the unique topology for MI300 means that some topology information is fixed in hardware and follows arbitrary mappings. Furthermore, not all hardware instances are software-visible, so register accesses must be adjusted. Recognize and add helper functions for the new MI300 interleave modes. Add lookup tables for fixed values where appropriate. Adjust how Die and Node IDs are found and used. Also, fix some register bitmasks that were mislabeled. Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Co-developed-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240128155950.1434067-1-yazen.ghannam@amd.com
2024-01-24RAS: Introduce AMD Address Translation LibraryYazen Ghannam1-0/+281
AMD Zen-based systems report memory errors through Machine Check banks representing Unified Memory Controllers (UMCs). The address value reported for DRAM ECC errors is a "normalized address" that is relative to the UMC. This normalized address must be converted to a system physical address to be usable by the OS. Support for this address translation was introduced to the MCA subsystem with Zen1 systems. The code was later moved to the AMD64 EDAC module, since this was the only user of the code at the time. However, there are uses for this translation outside of EDAC. The system physical address can be used in MCA for preemptive page offlining as done in some MCA notifier functions. Also, this translation is needed as the basis of similar functionality needed for some CXL configurations on AMD systems. Introduce a common address translation library that can be used for multiple subsystems including MCA, EDAC, and CXL. Include support for UMC normalized to system physical address translation for current CPU systems. The Data Fabric Indirect register access offsets and one of the register fields were changed. Default to the current offsets and register field definition. And fallback to the older values if running on a "legacy" system. Provide built-in code to facilitate the loading and unloading of the library module without affecting other modules or built-in code. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240123041401.79812-2-yazen.ghannam@amd.com