aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/Documentation/edac
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/edac')
-rw-r--r--Documentation/edac/features.rst103
-rw-r--r--Documentation/edac/index.rst12
-rw-r--r--Documentation/edac/memory_repair.rst152
-rw-r--r--Documentation/edac/scrub.rst342
4 files changed, 609 insertions, 0 deletions
diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst
new file mode 100644
index 000000000000..3f283de297c7
--- /dev/null
+++ b/Documentation/edac/features.rst
@@ -0,0 +1,103 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+=================
+EDAC/RAS features
+=================
+
+Copyright (c) 2024-2025 HiSilicon Limited.
+
+:Author: Shiju Jose <shiju.jose@huawei.com>
+:License: The GNU Free Documentation License, Version 1.2 without
+ Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
+ (dual licensed under the GPL v2)
+
+- Written for: 6.15
+
+Introduction
+------------
+
+EDAC/RAS components plugging and high-level design:
+
+1. Scrub control
+
+2. Error Check Scrub (ECS) control
+
+3. ACPI RAS2 features
+
+4. Post Package Repair (PPR) control
+
+5. Memory Sparing Repair control
+
+High level design is illustrated in the following diagram::
+
+ +-----------------------------------------------+
+ | Userspace - Rasdaemon |
+ | +-------------+ |
+ | | RAS CXL mem | +---------------+ |
+ | |error handler|---->| | |
+ | +-------------+ | RAS dynamic | |
+ | +-------------+ | scrub, memory | |
+ | | RAS memory |---->| repair control| |
+ | |error handler| +----|----------+ |
+ | +-------------+ | |
+ +--------------------------|--------------------+
+ |
+ |
+ +-------------------------------|------------------------------+
+ | Kernel EDAC extension for | controlling RAS Features |
+ |+------------------------------|----------------------------+ |
+ || EDAC Core Sysfs EDAC| Bus | |
+ || +--------------------------|---------------------------+| |
+ || |/sys/bus/edac/devices/<dev>/scrubX/ | | EDAC device || |
+ || |/sys/bus/edac/devices/<dev>/ecsX/ |<->| EDAC MC || |
+ || |/sys/bus/edac/devices/<dev>/repairX | | EDAC sysfs || |
+ || +---------------------------|--------------------------+| |
+ || EDAC|Bus | |
+ || | | |
+ || +----------+ Get feature | Get feature | |
+ || | | desc +---------|------+ desc +----------+ | |
+ || |EDAC scrub|<-----| EDAC device | | | | |
+ || +----------+ | driver- RAS |----->| EDAC mem | | |
+ || +----------+ | feature control| | repair | | |
+ || | |<-----| | +----------+ | |
+ || |EDAC ECS | +---------|------+ | |
+ || +----------+ Register RAS|features | |
+ || ______________________|_____________ | |
+ |+---------|---------------|------------------|--------------+ |
+ | +-------|----+ +-------|-------+ +----|----------+ |
+ | | | | CXL mem driver| | Client driver | |
+ | | ACPI RAS2 | | scrub, ECS, | | memory repair | |
+ | | driver | | sparing, PPR | | features | |
+ | +-----|------+ +-------|-------+ +------|--------+ |
+ | | | | |
+ +--------|-----------------|--------------------|--------------+
+ | | |
+ +--------|-----------------|--------------------|--------------+
+ | +---|-----------------|--------------------|-------+ |
+ | | | |
+ | | Platform HW and Firmware | |
+ | +--------------------------------------------------+ |
+ +--------------------------------------------------------------+
+
+
+1. EDAC Features components - Create feature-specific descriptors. For
+ example: scrub, ECS, memory repair in the above diagram.
+
+2. EDAC device driver for controlling RAS Features - Get feature's attribute
+ descriptors from EDAC RAS feature component and registers device's RAS
+ features with EDAC bus and expose the features control attributes via
+ sysfs. For example, /sys/bus/edac/devices/<dev-name>/<feature>X/
+
+3. RAS dynamic feature controller - Userspace sample modules in rasdaemon for
+ dynamic scrub/repair control to issue scrubbing/repair when excess number
+ of corrected memory errors are reported in a short span of time.
+
+RAS features
+------------
+1. Memory Scrub
+
+Memory scrub features are documented in `Documentation/edac/scrub.rst`.
+
+2. Memory Repair
+
+Memory repair features are documented in `Documentation/edac/memory_repair.rst`.
diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst
new file mode 100644
index 000000000000..420c6601dbae
--- /dev/null
+++ b/Documentation/edac/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+==============
+EDAC Subsystem
+==============
+
+.. toctree::
+ :maxdepth: 1
+
+ features
+ memory_repair
+ scrub
diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst
new file mode 100644
index 000000000000..5f8da7c9b186
--- /dev/null
+++ b/Documentation/edac/memory_repair.rst
@@ -0,0 +1,152 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+==========================
+EDAC Memory Repair Control
+==========================
+
+Copyright (c) 2024-2025 HiSilicon Limited.
+
+:Author: Shiju Jose <shiju.jose@huawei.com>
+:License: The GNU Free Documentation License, Version 1.2 without
+ Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
+ (dual licensed under the GPL v2)
+:Original Reviewers:
+
+- Written for: 6.15
+
+Introduction
+------------
+
+Some memory devices support repair operations to address issues in their
+memory media. Post Package Repair (PPR) and memory sparing are examples of
+such features.
+
+Post Package Repair (PPR)
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Post Package Repair is a maintenance operation which requests the memory
+device to perform repair operation on its media. It is a memory self-healing
+feature that fixes a failing memory location by replacing it with a spare row
+in a DRAM device.
+
+For example, a CXL memory device with DRAM components that support PPR
+features implements maintenance operations. DRAM components support those
+types of PPR functions:
+
+ - hard PPR, for a permanent row repair, and
+ - soft PPR, for a temporary row repair.
+
+Soft PPR is much faster than hard PPR, but the repair is lost after a power
+cycle.
+
+The data may not be retained and memory requests may not be correctly
+processed during a repair operation. In such case, the repair operation should
+not be executed at runtime.
+
+For example, for CXL memory devices, see CXL spec rev 3.1 [1]_ sections
+8.2.9.7.1.1 PPR Maintenance Operations, 8.2.9.7.1.2 sPPR Maintenance Operation
+and 8.2.9.7.1.3 hPPR Maintenance Operation for more details.
+
+Memory Sparing
+~~~~~~~~~~~~~~
+
+Memory sparing is a repair function that replaces a portion of memory with
+a portion of functional memory at a particular granularity. Memory
+sparing has cacheline/row/bank/rank sparing granularities. For example, in
+rank memory-sparing mode, one memory rank serves as a spare for other ranks on
+the same channel in case they fail.
+
+The spare rank is held in reserve and not used as active memory until
+a failure is indicated, with reserved capacity subtracted from the total
+available memory in the system.
+
+After an error threshold is surpassed in a system protected by memory sparing,
+the content of a failing rank of DIMMs is copied to the spare rank. The
+failing rank is then taken offline and the spare rank placed online for use as
+active memory in place of the failed rank.
+
+For example, CXL memory devices can support various subclasses for sparing
+operation vary in terms of the scope of the sparing being performed.
+
+Cacheline sparing subclass refers to a sparing action that can replace a full
+cacheline. Row sparing is provided as an alternative to PPR sparing functions
+and its scope is that of a single DDR row. Bank sparing allows an entire bank
+to be replaced. Rank sparing is defined as an operation in which an entire DDR
+rank is replaced.
+
+See CXL spec 3.1 [1]_ section 8.2.9.7.1.4 Memory Sparing Maintenance
+Operations for more details.
+
+.. [1] https://computeexpresslink.org/cxl-specification/
+
+Use cases of generic memory repair features control
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+1. The soft PPR, hard PPR and memory-sparing features share similar control
+ attributes. Therefore, there is a need for a standardized, generic sysfs
+ repair control that is exposed to userspace and used by administrators,
+ scripts and tools.
+
+2. When a CXL device detects an error in a memory component, it informs the
+ host of the need for a repair maintenance operation by using an event
+ record where the "maintenance needed" flag is set. The event record
+ specifies the device physical address (DPA) and attributes of the memory
+ that requires repair. The kernel reports the corresponding CXL general
+ media or DRAM trace event to userspace, and userspace tools (e.g.
+ rasdaemon) initiate a repair maintenance operation in response to the
+ device request using the sysfs repair control.
+
+3. Userspace tools, such as rasdaemon, request a repair operation on a memory
+ region when maintenance need flag set or an uncorrected memory error or
+ excess of corrected memory errors above a threshold value is reported or an
+ exceed corrected errors threshold flag set for that memory.
+
+4. Multiple PPR/sparing instances may be present per memory device.
+
+5. Drivers should enforce that live repair is safe. In systems where memory
+ mapping functions can change between boots, one approach to this is to log
+ memory errors seen on this boot against which to check live memory repair
+ requests.
+
+The File System
+---------------
+
+The control attributes of a registered memory repair instance could be
+accessed in the /sys/bus/edac/devices/<dev-name>/mem_repairX/
+
+sysfs
+-----
+
+Sysfs files are documented in
+`Documentation/ABI/testing/sysfs-edac-memory-repair`.
+
+Examples
+--------
+
+The memory repair usage takes the form shown in this example:
+
+1. CXL memory sparing
+
+Memory sparing is defined as a repair function that replaces a portion of
+memory with a portion of functional memory at that same DPA. The subclass
+for this operation, cacheline/row/bank/rank sparing, vary in terms of the
+scope of the sparing being performed.
+
+Memory sparing maintenance operations may be supported by CXL devices that
+implement CXL.mem protocol. A sparing maintenance operation requests the
+CXL device to perform a repair operation on its media. For example, a CXL
+device with DRAM components that support memory sparing features may
+implement sparing maintenance operations.
+
+2. CXL memory Soft Post Package Repair (sPPR)
+
+Post Package Repair (PPR) maintenance operations may be supported by CXL
+devices that implement CXL.mem protocol. A PPR maintenance operation
+requests the CXL device to perform a repair operation on its media.
+For example, a CXL device with DRAM components that support PPR features
+may implement PPR Maintenance operations. Soft PPR (sPPR) is a temporary
+row repair. Soft PPR may be faster, but the repair is lost with a power
+cycle.
+
+Sysfs files for memory repair are documented in
+`Documentation/ABI/testing/sysfs-edac-memory-repair`
diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
new file mode 100644
index 000000000000..2cfa74fa1ffd
--- /dev/null
+++ b/Documentation/edac/scrub.rst
@@ -0,0 +1,342 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+=============
+Scrub Control
+=============
+
+Copyright (c) 2024-2025 HiSilicon Limited.
+
+:Author: Shiju Jose <shiju.jose@huawei.com>
+:License: The GNU Free Documentation License, Version 1.2 without
+ Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
+ (dual licensed under the GPL v2)
+
+- Written for: 6.15
+
+Introduction
+------------
+
+Increasing DRAM size and cost have made memory subsystem reliability an
+important concern. These modules are used where potentially corrupted data
+could cause expensive or fatal issues. Memory errors are among the top
+hardware failures that cause server and workload crashes.
+
+Memory scrubbing is a feature where an ECC (Error-Correcting Code) engine
+reads data from each memory media location, corrects if necessary and writes
+the corrected data back to the same memory media location.
+
+DIMMs can be scrubbed at a configurable rate to detect uncorrected memory
+errors and attempt recovery from detected errors, providing the following
+benefits:
+
+1. Proactively scrubbing DIMMs reduces the chance of a correctable error
+ becoming uncorrectable.
+
+2. When detected, uncorrected errors caught in unallocated memory pages are
+ isolated and prevented from being allocated to an application or the OS.
+
+3. This reduces the likelihood of software or hardware products encountering
+ memory errors.
+
+4. The additional data on failures in memory may be used to build up
+ statistics that are later used to decide whether to use memory repair
+ technologies such as Post Package Repair or Sparing.
+
+There are 2 types of memory scrubbing:
+
+1. Background (patrol) scrubbing while the DRAM is otherwise idle.
+
+2. On-demand scrubbing for a specific address range or region of memory.
+
+Several types of interfaces to hardware memory scrubbers have been
+identified, such as CXL memory device patrol scrub, CXL DDR5 ECS, ACPI
+RAS2 memory scrubbing, and ACPI NVDIMM ARS (Address Range Scrub).
+
+The control mechanisms vary across different memory scrubbers. To enable
+standardized userspace tooling, there is a need to present these controls
+through a standardized ABI.
+
+A generic memory EDAC scrub control allows users to manage underlying
+scrubbers in the system through a standardized sysfs control interface. It
+abstracts the management of various scrubbing functionalities into a unified
+set of functions.
+
+Use cases of common scrub control feature
+-----------------------------------------
+
+1. Several types of interfaces for hardware memory scrubbers have been
+ identified, including the CXL memory device patrol scrub, CXL DDR5 ECS,
+ ACPI RAS2 memory scrubbing features, ACPI NVDIMM ARS (Address Range Scrub),
+ and software-based memory scrubbers.
+
+ Of the identified interfaces to hardware memory scrubbers some support
+ control over patrol (background) scrubbing (e.g., ACPI RAS2, CXL) and/or
+ on-demand scrubbing (e.g., ACPI RAS2, ACPI ARS). However, the scrub control
+ interfaces vary between memory scrubbers, highlighting the need for
+ a standardized, generic sysfs scrub control interface that is accessible to
+ userspace for administration and use by scripts/tools.
+
+2. User-space scrub controls allow users to disable scrubbing if necessary,
+ for example, to disable background patrol scrubbing or adjust the scrub
+ rate for performance-aware operations where background activities need to
+ be minimized or disabled.
+
+3. User-space tools enable on-demand scrubbing for specific address ranges,
+ provided that the scrubber supports this functionality.
+
+4. User-space tools can also control memory DIMM scrubbing at a configurable
+ scrub rate via sysfs scrub controls. This approach offers several benefits:
+
+ 4.1. Detects uncorrectable memory errors early, before user access to affected
+ memory, helping facilitate recovery.
+
+ 4.2. Reduces the likelihood of correctable errors developing into uncorrectable
+ errors.
+
+5. Policy control for hotplugged memory is necessary because there may not
+ be a system-wide BIOS or similar control to manage scrub settings for a CXL
+ device added after boot. Determining these settings is a policy decision,
+ balancing reliability against performance, so userspace should control it.
+ Therefore, a unified interface is recommended for handling this function in
+ a way that aligns with other similar interfaces, rather than creating a
+ separate one.
+
+Scrubbing features
+------------------
+
+CXL Memory Scrubbing features
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+CXL spec r3.1 [1]_ section 8.2.9.9.11.1 describes the memory device patrol
+scrub control feature. The device patrol scrub proactively locates and makes
+corrections to errors in regular cycle. The patrol scrub control allows the
+userspace request to change CXL patrol scrubber's configurations.
+
+The patrol scrub control allows the requester to specify the number of
+hours in which the patrol scrub cycles must be completed, provided that
+the requested scrub rate must be within the supported range of the
+scrub rate that the device is capable of. In the CXL driver, the
+number of seconds per scrub cycles, which user requests via sysfs, is
+rescaled to hours per scrub cycles.
+
+In addition, they allow the host to disable the feature in case it interferes
+with performance-aware operations which require the background operations to
+be turned off.
+
+Error Check Scrub (ECS)
+~~~~~~~~~~~~~~~~~~~~~~~
+
+CXL spec r3.1 [1]_ section 8.2.9.9.11.2 describes Error Check Scrub (ECS)
+- a feature defined in the JEDEC DDR5 SDRAM Specification (JESD79-5) and
+allowing DRAM to internally read, correct single-bit errors, and write back
+corrected data bits to the DRAM array while providing transparency to error
+counts.
+
+The DDR5 device contains number of memory media Field Replaceable Units (FRU)
+per device. The DDR5 ECS feature and thus the ECS control driver supports
+configuring the ECS parameters per FRU.
+
+ACPI RAS2 Hardware-based Memory Scrubbing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ACPI spec 6.5 [2]_ section 5.2.21 ACPI RAS2 describes an ACPI RAS2 table
+which provides interfaces for platform RAS features and supports independent
+RAS controls and capabilities for a given RAS feature for multiple instances
+of the same component in a given system.
+
+Memory RAS features apply to RAS capabilities, controls and operations that
+are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS features
+have a Feature Type of 0x00 (Memory).
+
+The platform can use the hardware-based memory scrubbing feature to expose
+controls and capabilities associated with hardware-based memory scrub
+engines. The RAS2 memory scrubbing feature supports as per spec,
+
+1. Independent memory scrubbing controls for each NUMA domain, identified
+ using its proximity domain.
+
+2. Provision for background (patrol) scrubbing of the entire memory system,
+ as well as on-demand scrubbing for a specific region of memory.
+
+ACPI Address Range Scrubbing (ARS)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ACPI spec 6.5 [2]_ section 9.19.7.2 describes Address Range Scrubbing (ARS).
+ARS allows the platform to communicate memory errors to system software.
+This capability allows system software to prevent accesses to addresses with
+uncorrectable errors in memory. ARS functions manage all NVDIMMs present in
+the system. Only one scrub can be in progress system wide at any given time.
+
+The following functions are supported as per the specification:
+
+1. Query ARS Capabilities for a given address range, indicates platform
+ supports the ACPI NVDIMM Root Device Unconsumed Error Notification.
+
+2. Start ARS triggers an Address Range Scrub for the given memory range.
+ Address scrubbing can be done for volatile or persistent memory, or both.
+
+3. Query ARS Status command allows software to get the status of ARS,
+ including the progress of ARS and ARS error record.
+
+4. Clear Uncorrectable Error.
+
+5. Translate SPA
+
+6. ARS Error Inject etc.
+
+The kernel supports an existing control for ARS and ARS is currently not
+supported in EDAC.
+
+.. [1] https://computeexpresslink.org/cxl-specification/
+
+.. [2] https://uefi.org/specs/ACPI/6.5/
+
+Comparison of various scrubbing features
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ +--------------+-----------+-----------+-----------+-----------+
+ | | ACPI | CXL patrol| CXL ECS | ARS |
+ | Name | RAS2 | scrub | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | On-demand | Supported | No | No | Supported |
+ | Scrubbing | | | | |
+ | | | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | Background | Supported | Supported | Supported | No |
+ | scrubbing | | | | |
+ | | | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | Mode of | Scrub ctrl| per device| per memory| Unknown |
+ | scrubbing | per NUMA | | media | |
+ | | domain. | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | Query scrub | Supported | Supported | Supported | Supported |
+ | capabilities | | | | |
+ | | | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | Setting | Supported | No | No | Supported |
+ | address range| | | | |
+ | | | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | Setting | Supported | Supported | No | No |
+ | scrub rate | | | | |
+ | | | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | Unit for | Not | in hours | No | No |
+ | scrub rate | Defined | | | |
+ | | | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | | Supported | | | |
+ | Scrub | on-demand | No | No | Supported |
+ | status/ | scrubbing | | | |
+ | Completion | only | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+ | UC error | |CXL general|CXL general| ACPI UCE |
+ | reporting | Exception |media/DRAM |media/DRAM | notify and|
+ | | |event/media|event/media| query |
+ | | |scan? |scan? | ARS status|
+ +--------------+-----------+-----------+-----------+-----------+
+ | | | | | |
+ | Support for | Supported | Supported | Supported | No |
+ | EDAC control | | | | |
+ | | | | | |
+ +--------------+-----------+-----------+-----------+-----------+
+
+The File System
+---------------
+
+The control attributes of a registered scrubber instance could be
+accessed in:
+
+/sys/bus/edac/devices/<dev-name>/scrubX/
+
+sysfs
+-----
+
+Sysfs files are documented in
+`Documentation/ABI/testing/sysfs-edac-scrub`
+
+`Documentation/ABI/testing/sysfs-edac-ecs`
+
+Examples
+--------
+
+The usage takes the form shown in these examples:
+
+1. CXL memory Patrol Scrub
+
+The following are the use cases identified why we might increase the scrub rate.
+
+- Scrubbing is needed at device granularity because a device is showing
+ unexpectedly high errors.
+
+- Scrubbing may apply to memory that isn't online at all yet. Likely this
+ is a system wide default setting on boot.
+
+- Scrubbing at a higher rate because the monitor software has determined that
+ more reliability is necessary for a particular data set. This is called
+ Differentiated Reliability.
+
+1.1. Device based scrubbing
+
+CXL memory is exposed to memory management subsystem and ultimately userspace
+via CXL devices. Device-based scrubbing is used for the first use case
+described in "Section 1 CXL Memory Patrol Scrub".
+
+When combining control via the device interfaces and region interfaces,
+"see Section 1.2 Region based scrubbing".
+
+Sysfs files for scrubbing are documented in
+`Documentation/ABI/testing/sysfs-edac-scrub`
+
+1.2. Region based scrubbing
+
+CXL memory is exposed to memory management subsystem and ultimately userspace
+via CXL regions. CXL Regions represent mapped memory capacity in system
+physical address space. These can incorporate one or more parts of multiple CXL
+memory devices with traffic interleaved across them. The user may want to control
+the scrub rate via this more abstract region instead of having to figure out the
+constituent devices and program them separately. The scrub rate for each device
+covers the whole device. Thus if multiple regions use parts of that device then
+requests for scrubbing of other regions may result in a higher scrub rate than
+requested for this specific region.
+
+Region-based scrubbing is used for the third use case described in
+"Section 1 CXL Memory Patrol Scrub".
+
+Userspace must follow below set of rules on how to set the scrub rates for any
+mixture of requirements.
+
+1. Taking each region in turn from lowest desired scrub rate to highest and set
+ their scrub rates. Later regions may override the scrub rate on individual
+ devices (and hence potentially whole regions).
+
+2. Take each device for which enhanced scrubbing is required (higher rate) and
+ set those scrub rates. This will override the scrub rates of individual devices,
+ setting them to the maximum rate required for any of the regions they help back,
+ unless a specific rate is already defined.
+
+Sysfs files for scrubbing are documented in
+`Documentation/ABI/testing/sysfs-edac-scrub`
+
+2. CXL memory Error Check Scrub (ECS)
+
+The Error Check Scrub (ECS) feature enables a memory device to perform error
+checking and correction (ECC) and count single-bit errors. The associated
+memory controller sets the ECS mode with a trigger sent to the memory
+device. CXL ECS control allows the host, thus the userspace, to change the
+attributes for error count mode, threshold number of errors per segment
+(indicating how many segments have at least that number of errors) for
+reporting errors, and reset the ECS counter. Thus the responsibility for
+initiating Error Check Scrub on a memory device may lie with the memory
+controller or platform when unexpectedly high error rates are detected.
+
+Sysfs files for scrubbing are documented in
+`Documentation/ABI/testing/sysfs-edac-ecs`