path: root/drivers/acpi/apei/Makefile
Age | Commit message | Author | Files | Lines
2016-06-29 | ACPI / APEI: Add Boot Error Record Table (BERT) support | Huang Ying | 1 | -1/+1
ACPI/APEI is designed to verify/report H/W errors, such as Corrected Errors (CE) and Uncorrected Errors (UC). It contains four tables: HEST, ERST, EINJ and BERT. The first three tables were merged long ago, but BERT support has been pending until now because BIOS support for it was lacking. It has recently become supported on the ARM64 platform, so here we come.

Under normal circumstances, when a hardware error occurs, the kernel is notified via NMI, MCE or some other method; the kernel then processes the error condition, reports it, and recovers from it if possible. But sometimes the situation is so bad that firmware may choose to reset directly without notifying the Linux kernel.

The Linux kernel can use the Boot Error Record Table (BERT) to get the un-notified hardware errors that occurred in a previous boot. In this patch, the error information is reported via printk.

For more information about BERT, please refer to ACPI Specification version 6.0, section 18.3.1:

http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf

The following log is a BERT record after system reboot because of hitting a fatal memory error:

    BERT: Error records from previous boot:
    [Hardware Error]: It has been corrected by h/w and requires no further action
    [Hardware Error]: event severity: corrected
    [Hardware Error]: Error 0, type: recoverable
    [Hardware Error]: section_type: memory error
    [Hardware Error]: error_status: 0x0000000000000400
    [Hardware Error]: physical_address: 0xffffffffffffffff
    [Hardware Error]: card: 1 module: 2 bank: 3 row: 1 column: 2 bit_position: 5
    [Hardware Error]: error_type: 2, single-bit ECC

[Tomasz Nowicki: Clear error status at the end of error handling]
[Tony: Applied some cleanups suggested by Fu Wei]
[Fu Wei: delete EXPORT_SYMBOL_GPL(bert_disable), improve the code]

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Tested-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
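
The boot-time flow this commit describes (read the records firmware left behind, report them, then clear the error status at the end of handling) can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel's code: `struct boot_error_block`, `report_boot_errors` and the two-field layout are hypothetical simplifications, and the region is modeled as a plain array rather than firmware-owned memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for one firmware error status block. */
struct boot_error_block {
    uint32_t block_status;   /* non-zero: the block holds a valid record */
    uint32_t error_severity; /* opaque severity code, printed as-is */
};

/*
 * Report every valid block in the region, then clear its status so the
 * same record is not reported again on a later pass.
 * Returns the number of records reported.
 */
int report_boot_errors(struct boot_error_block *blocks, size_t count)
{
    int reported = 0;

    for (size_t i = 0; i < count; i++) {
        if (blocks[i].block_status == 0)
            continue; /* nothing recorded in this slot */

        printf("BERT: record %zu, severity %u\n", i,
               (unsigned int)blocks[i].error_severity);
        reported++;

        /* Clear the status only after the record has been handled. */
        blocks[i].block_status = 0;
    }
    return reported;
}
```

Calling `report_boot_errors` twice on the same array reports each record once: the second call finds every status already cleared and returns 0.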
2013-10-31 | Move cper.c from drivers/acpi/apei to drivers/firmware/efi | Luck, Tony | 1 | -1/+1
cper.c contains code to decode and print "Common Platform Error Records". It was originally added under drivers/acpi/apei because the only user was in that same directory, but now we have another consumer, and we shouldn't have to force CONFIG_ACPI_APEI to get access to this code. Since CPER is defined in the UEFI specification, the logical home for this code is under drivers/firmware/efi/

Acked-by: Matt Fleming <matt.fleming@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-08-14 | ACPI, APEI, ERST debug support | Huang Ying | 1 | -0/+1
This patch adds debugging/testing support to ERST. A misc device is implemented to export raw ERST read/write/clear etc. operations to user space. With this patch, we can add ERST testing support to linuxfirmwarekit ISO (linuxfirmwarekit.org) to verify the kernel support and the firmware implementation.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 | ACPI, APEI, Error Record Serialization Table (ERST) support | Huang Ying | 1 | -1/+1
ERST is a way provided by APEI to save and retrieve hardware error records to and from some simple persistent storage (such as flash). The Linux kernel support implementation is quite simple and workable in NMI context, so it can be used to save hardware error records into flash in a hardware error exception or NMI handler, where other more complex persistent storage such as disk is not usable. After saving hardware error records via ERST in a hardware error exception or NMI handler, the error records can be retrieved and logged to disk or network after a clean reboot.

For more information about ERST, please refer to ACPI Specification version 4.0, section 17.4.

This patch incorporates fixes from Jin Dongming.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 | ACPI, APEI, Generic Hardware Error Source memory error support | Huang Ying | 1 | -0/+1
Generic Hardware Error Source (GHES) provides a way to report platform hardware errors (such as those from the chipset). It works in so-called "Firmware First" mode, that is, hardware errors are reported to firmware first, then reported to Linux by firmware. This way, some non-standard hardware error registers or non-standard hardware links can be checked by firmware to produce more valuable hardware error information for Linux.

For now, only the SCI notification type and memory errors are supported. More notification types and hardware error types will be added later. These memory errors are reported to user space through /dev/mcelog by faking a corrected Machine Check, so that the erroneous memory page can be offlined by /sbin/mcelog if the error count for one page is beyond the threshold.

On some machines, Machine Check cannot report the physical address for some corrected memory errors, but GHES can. So this simplified GHES is implemented first.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
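
The per-page threshold policy mentioned above (offline a page once its corrected-error count goes beyond a threshold) can be sketched as a small counter table. This is a hypothetical illustration, not how /sbin/mcelog is implemented: `struct ce_tracker`, `ce_tracker_record`, the fixed table size and the 4 KiB page size are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12 /* assumption: 4 KiB pages */
#define MAX_TRACKED   64 /* hypothetical table size */

struct page_counter {
    uint64_t pfn;       /* page frame number of the affected page */
    unsigned int count; /* corrected errors seen on that page */
};

struct ce_tracker {
    struct page_counter pages[MAX_TRACKED];
    size_t used;
    unsigned int threshold;
};

/*
 * Record one corrected error at phys_addr. Returns true when the count
 * for that page has gone beyond the threshold, i.e. the page is a
 * candidate for offlining.
 */
bool ce_tracker_record(struct ce_tracker *t, uint64_t phys_addr)
{
    uint64_t pfn = phys_addr >> PAGE_SHIFT_4K;

    for (size_t i = 0; i < t->used; i++) {
        if (t->pages[i].pfn == pfn)
            return ++t->pages[i].count > t->threshold;
    }

    if (t->used == MAX_TRACKED)
        return false; /* table full: this sketch stops tracking new pages */

    t->pages[t->used].pfn = pfn;
    t->pages[t->used].count = 1;
    t->used++;
    return 1 > t->threshold;
}
```

The sketch depends on every error carrying a physical address, which is the property the commit message says GHES can provide where Machine Check sometimes cannot.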
2010-05-19 | ACPI, APEI, UEFI Common Platform Error Record (CPER) header | Huang Ying | 1 | -1/+1
CPER stands for Common Platform Error Record. It is the hardware error record format used to describe platform hardware errors by various APEI tables, such as ERST, BERT and HEST. For more information about CPER, please refer to Appendix N of UEFI Specification version 2.3.

This patch mainly includes the data structure definition header file used by other files.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 | ACPI, APEI, EINJ support | Huang Ying | 1 | -0/+1
EINJ provides a hardware error injection mechanism, which is useful for debugging and testing other APEI and RAS features.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 | ACPI, APEI, HEST table parsing | Huang Ying | 1 | -1/+1
HEST describes error sources in detail, communicating operational parameters (i.e. severity levels, masking bits, and threshold values) to the OS as necessary. It also allows the platform to report error sources for which the OS would typically not implement support (for example, chipset-specific error registers).

HEST information may be needed by other subsystems. For example, HEST PCIE AER error source information describes whether a PCIE root port works in "firmware first" mode; this is needed by the general PCIE AER error subsystem. So a public HEST table parsing interface is provided.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 | ACPI, APEI, APEI supporting infrastructure | Huang Ying | 1 | -0/+3
APEI stands for ACPI Platform Error Interface, which allows errors (for example from the chipset) to be reported to the operating system. This especially improves NMI handling. In addition, it supports error serialization and error injection.

For more information about APEI, please refer to ACPI Specification version 4.0, chapter 17.

This patch provides some common functions used by more than one APEI table, mainly the framework of an interpreter for EINJ and ERST. A machine-readable language is defined for EINJ and ERST for the OS to execute, and so to drive the firmware to fulfill the corresponding functions. The machine languages for EINJ and ERST are compatible, so a common framework is defined for them.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
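
The idea of one interpreter shared by two tables can be sketched as a table-driven dispatcher: each instruction names an operation and a register, and a handler table maps operations to functions. This is a minimal user-space illustration under stated assumptions, not the kernel's framework: the opcode set, `struct instr`, `struct exec_ctx` and `run_program` are hypothetical, and the "registers" are a plain array rather than firmware-described hardware registers.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 4 /* hypothetical register count */

/* Hypothetical opcode set, loosely modeled on register read/write steps. */
enum op {
    OP_NOOP,
    OP_READ_REGISTER,        /* ctx->value = regs[reg] */
    OP_WRITE_REGISTER,       /* regs[reg] = ctx->value */
    OP_WRITE_REGISTER_VALUE, /* regs[reg] = the instruction's own value */
    OP_COUNT
};

struct instr {
    enum op op;
    unsigned int reg;
    uint64_t value;
};

struct exec_ctx {
    uint64_t regs[NUM_REGS]; /* stand-in for firmware registers */
    uint64_t value;          /* working value carried between steps */
};

typedef void (*op_handler)(struct exec_ctx *ctx, const struct instr *in);

static void do_noop(struct exec_ctx *ctx, const struct instr *in)
{
    (void)ctx;
    (void)in;
}

static void do_read(struct exec_ctx *ctx, const struct instr *in)
{
    ctx->value = ctx->regs[in->reg];
}

static void do_write(struct exec_ctx *ctx, const struct instr *in)
{
    ctx->regs[in->reg] = ctx->value;
}

static void do_write_value(struct exec_ctx *ctx, const struct instr *in)
{
    ctx->regs[in->reg] = in->value;
}

/* One handler table; a second table type could supply its own. */
static const op_handler handlers[OP_COUNT] = {
    [OP_NOOP] = do_noop,
    [OP_READ_REGISTER] = do_read,
    [OP_WRITE_REGISTER] = do_write,
    [OP_WRITE_REGISTER_VALUE] = do_write_value,
};

/* Execute len instructions; returns 0 on success, -1 on a bad instruction. */
int run_program(struct exec_ctx *ctx, const struct instr *prog, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        const struct instr *in = &prog[i];

        if ((unsigned int)in->op >= OP_COUNT || in->reg >= NUM_REGS)
            return -1;
        handlers[in->op](ctx, in);
    }
    return 0;
}
```

Because dispatch goes through a table, two instruction sets with compatible encodings can reuse `run_program` and differ only in the handlers they register, which is the kind of sharing the commit message describes.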