aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/ras/amd/atl/internal.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-05-06Merge tag 'v6.15-rc5' into x86/cpu, to resolve conflictsIngo Molnar1-0/+3
Conflicts: tools/arch/x86/include/asm/cpufeatures.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-14x86/platform/amd: Move the <asm/amd_node.h> header to <asm/amd/node.h>Ingo Molnar1-1/+1
Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Link: https://lore.kernel.org/r/20250413084144.3746608-7-mingo@kernel.org
2025-04-14x86/platform/amd: Move the <asm/amd_nb.h> header to <asm/amd/nb.h>Ingo Molnar1-1/+1
Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Link: https://lore.kernel.org/r/20250413084144.3746608-4-mingo@kernel.org
2025-04-08RAS/AMD/FMPM: Get masked addressYazen Ghannam1-0/+3
Some operations require checking, or ignoring, specific bits in an address value. For example, this can be comparing address values to identify unique structures. Currently, the full address value is compared when filtering for duplicates. This results in over counting and creation of extra records. This gives the impression that more unique events occurred than did in reality. Mask the address for physical rows on MI300. [ bp: Simplify. ] Fixes: 6f15e617cc99 ("RAS: Introduce a FRU memory poison manager") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org
2025-01-08x86/amd_nb: Move SMN access code to a new amd_node driverMario Limonciello1-0/+1
SMN access was bolted into amd_nb mostly as convenience. This has limitations though that require incurring tech debt to keep it working. Move SMN access to the newly introduced AMD Node driver. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> # pdx86 Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> # PMF, PMC Link: https://lore.kernel.org/r/20241206161210.163701-11-yazen.ghannam@amd.com
2024-08-01RAS/AMD/ATL: Translate normalized to system physical addresses using PRMJohn Allen1-0/+10
AMD Zen-based systems report memory error addresses through machine check banks representing Unified Memory Controllers (UMCs) in the form of UMC relative "normalized" addresses. A normalized address must be converted to a system physical address to be usable by the OS. Future AMD platforms will provide a UEFI PRM module that implements a number of address translation PRM handlers. This will provide an interface for the OS to call platform specific code without requiring the use of SMM or other heavy firmware operations. Add support for the normalized to system physical address translation PRM handler in the AMD Address Translation Library and prefer it over native code if available. The GUID and parameter buffer structure are specific to the normalized to system physical address handler provided by the address translation PRM module included in future AMD systems. The address translation PRM module is documented in chapter 22 of the publicly available "AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh ACPI v6.5 Porting Guide". [ bp: Massage commit message. ] Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240730151731.15363-3-john.allen@amd.com
2024-07-15Merge tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/rasLinus Torvalds1-0/+48
Pull EDAC updates from Borislav Petkov: - The AMD memory controllers data fabric version 4.5 supports non-power-of-2 denormalization in the sense that certain bits of the system physical address cannot be reconstructed from the normalized address reported by the RAS hardware. Add support for handling such addresses - Switch the EDAC drivers to the new Intel CPU model defines - The usual fixes and cleanups all over the place * tag 'edac_updates_for_v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC: Add missing MODULE_DESCRIPTION() macros EDAC/dmc520: Use devm_platform_ioremap_resource() EDAC/igen6: Add Intel Arrow Lake-U/H SoCs support RAS/AMD/FMPM: Use atl internal.h for INVALID_SPA RAS/AMD/ATL: Implement DF 4.5 NP2 denormalization RAS/AMD/ATL: Validate address map when information is gathered RAS/AMD/ATL: Expand helpers for adding and removing base and hole RAS/AMD/ATL: Read DRAM hole base early RAS/AMD/ATL: Add amd_atl pr_fmt() prefix RAS/AMD/ATL: Add a missing module description EDAC, i10nm: make skx_common.o a separate module EDAC/skx: Switch to new Intel CPU model defines EDAC/sb_edac: Switch to new Intel CPU model defines EDAC, pnd2: Switch to new Intel CPU model defines EDAC/i10nm: Switch to new Intel CPU model defines EDAC/ghes: Add missing newline to pr_info() statement RAS/AMD/ATL: Add missing newline to pr_info() statement EDAC/thunderx: Remove unused struct error_syndrome
2024-06-16RAS/AMD/ATL: Use system settings for MI300 DRAM to normalized address translationYazen Ghannam1-1/+1
The currently used normalized address format is not applicable to all MI300 systems. This leads to incorrect results during address translation. Drop the fixed layout and construct the normalized address from system settings. Fixes: 87a612375307 ("RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20240607-mi300-dram-xl-fix-v1-2-2f11547a178c@amd.com
2024-06-09RAS/AMD/ATL: Implement DF 4.5 NP2 denormalizationJohn Allen1-0/+40
Unlike with previous Data Fabric versions, with Data Fabric 4.5 non-power-of-2 denormalization, there are bits of the system physical address that can't be fully reconstructed from the normalized address. To determine the proper combination of missing system physical address bits, iterate through each possible combination of these bits, normalize the resulting system physical address, and compare to the original address that is being translated. If the addresses match, then the correct permutation of bits has been found. Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20240606203313.51197-6-john.allen@amd.com
2024-06-09RAS/AMD/ATL: Expand helpers for adding and removing base and holeJohn Allen1-0/+3
The ret_addr field in struct addr_ctx contains the intermediate value of the returned address as it passes through multiple steps in the translation process. Currently, adding the DRAM base and legacy hole is only done once, so it operates directly on the intermediate value. However, for DF 4.5 non-power-of-2 denormalization, adding and removing the DRAM base and legacy hole needs to be done for multiple temporary address values. During this process, the intermediate value should not be lost so the ret_addr value can't be reused. Update the existing 'add' helper to operate on an arbitrary address and introduce a new 'remove' helper to do the inverse operations. Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20240606203313.51197-4-john.allen@amd.com
2024-06-09RAS/AMD/ATL: Read DRAM hole base earlyJohn Allen1-0/+2
Read DRAM hole base when constructing the address map as the value will not change during run time. Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20240606203313.51197-3-john.allen@amd.com
2024-06-09RAS/AMD/ATL: Add amd_atl pr_fmt() prefixJohn Allen1-0/+3
Prefix all AMD ATL pr_* statements with "amd_atl:". Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240606203313.51197-2-john.allen@amd.com
2024-02-01RAS/AMD/ATL: Add MI300 DRAM to normalized address translation supportYazen Ghannam1-0/+1
Zen-based AMD systems report DRAM ECC errors through Unified Memory Controller (UMC) MCA banks. The value provided in MCA_ADDR is a "normalized" address which represents the UMC's view of its managed memory. The normalized address must be translated to a system physical address for software to take action. MI300 systems, uniquely, do not provide a normalized address in MCA_ADDR for DRAM ECC errors. Rather, the "DRAM" address is reported. This value includes identifiers for the bank, row, column, pseudochannel and stack of the memory location. The DRAM address must be converted to a normalized address in order to be further translated to a system physical address. Add helper functions to do the DRAM to normalized translation for MI300 systems. The method is based on the fixed hardware layout of the on-chip memory. [ bp: Massage commit message, decapitalize some, rename function. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Muralidhara M K <muralidhara.mk@amd.com> Link: https://lore.kernel.org/r/20240131165732.88297-1-yazen.ghannam@amd.com
2024-01-29RAS/AMD/ATL: Add MI300 supportMuralidhara M K1-1/+9
AMD MI300 systems include on-die HBM3 memory and a unique topology. And they fall under Data Fabric version 4.5 in overall design. Generally, topology information (IDs, etc.) is gathered from Data Fabric registers. However, the unique topology for MI300 means that some topology information is fixed in hardware and follows arbitrary mappings. Furthermore, not all hardware instances are software-visible, so register accesses must be adjusted. Recognize and add helper functions for the new MI300 interleave modes. Add lookup tables for fixed values where appropriate. Adjust how Die and Node IDs are found and used. Also, fix some register bitmasks that were mislabeled. Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Co-developed-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240128155950.1434067-1-yazen.ghannam@amd.com
2024-01-24RAS: Introduce AMD Address Translation LibraryYazen Ghannam1-0/+297
AMD Zen-based systems report memory errors through Machine Check banks representing Unified Memory Controllers (UMCs). The address value reported for DRAM ECC errors is a "normalized address" that is relative to the UMC. This normalized address must be converted to a system physical address to be usable by the OS. Support for this address translation was introduced to the MCA subsystem with Zen1 systems. The code was later moved to the AMD64 EDAC module, since this was the only user of the code at the time. However, there are uses for this translation outside of EDAC. The system physical address can be used in MCA for preemptive page offlining as done in some MCA notifier functions. Also, this translation is needed as the basis of similar functionality needed for some CXL configurations on AMD systems. Introduce a common address translation library that can be used for multiple subsystems including MCA, EDAC, and CXL. Include support for UMC normalized to system physical address translation for current CPU systems. The Data Fabric Indirect register access offsets and one of the register fields were changed. Default to the current offsets and register field definition. And fallback to the older values if running on a "legacy" system. Provide built-in code to facilitate the loading and unloading of the library module without affecting other modules or built-in code. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240123041401.79812-2-yazen.ghannam@amd.com