Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r-- Documentation/admin-guide/LSM/SafeSetID.rst | 4
-rw-r--r-- Documentation/admin-guide/cgroup-v2.rst | 16
-rw-r--r-- Documentation/admin-guide/dell_rbu.rst | 128
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-dust.rst (renamed from Documentation/admin-guide/device-mapper/dm-dust.txt) | 243
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-integrity.rst | 5
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-raid.rst | 2
-rw-r--r-- Documentation/admin-guide/device-mapper/index.rst | 1
-rw-r--r-- Documentation/admin-guide/hw-vuln/mds.rst | 7
-rw-r--r-- Documentation/admin-guide/hw-vuln/tsx_async_abort.rst | 5
-rw-r--r-- Documentation/admin-guide/index.rst | 65
-rw-r--r-- Documentation/admin-guide/iostats.rst | 56
-rw-r--r-- Documentation/admin-guide/kernel-parameters.rst | 1
-rw-r--r-- Documentation/admin-guide/kernel-parameters.txt | 91
-rw-r--r-- Documentation/admin-guide/perf/imx-ddr.rst | 48
-rw-r--r-- Documentation/admin-guide/perf/index.rst | 1
-rw-r--r-- Documentation/admin-guide/perf/thunderx2-pmu.rst | 20
-rw-r--r-- Documentation/admin-guide/ras.rst | 31
-rw-r--r-- Documentation/admin-guide/sysctl/kernel.rst | 12
18 files changed, 487 insertions, 249 deletions
diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
index 212434ef65ad..7bff07ce4fdd 100644
--- a/Documentation/admin-guide/LSM/SafeSetID.rst
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -56,7 +56,7 @@ setid capabilities from the application completely and refactor the process
spawning semantics in the application (e.g. by using a privileged helper program
to do process spawning and UID/GID transitions). Unfortunately, there are a
number of semantics around process spawning that would be affected by this, such
-as fork() calls where the program doesn’t immediately call exec() after the
+as fork() calls where the program doesn't immediately call exec() after the
fork(), parent processes specifying custom environment variables or command line
args for spawned child processes, or inheritance of file handles across a
fork()/exec(). Because of this, as solution that uses a privileged helper in
@@ -72,7 +72,7 @@ own user namespace, and only approved UIDs/GIDs could be mapped back to the
initial system user namespace, affectively preventing privilege escalation.
Unfortunately, it is not generally feasible to use user namespaces in isolation,
without pairing them with other namespace types, which is not always an option.
-Linux checks for capabilities based off of the user namespace that “owns” some
+Linux checks for capabilities based off of the user namespace that "owns" some
entity. For example, Linux has the notion that network namespaces are owned by
the user namespace in which they were created. A consequence of this is that
capability checks for access to a given network namespace are done by checking
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 5361ebec3361..0636bcb60b5a 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1120,8 +1120,9 @@ PAGE_SIZE multiple when read back.
Best-effort memory protection. If the memory usage of a
cgroup is within its effective low boundary, the cgroup's
- memory won't be reclaimed unless memory can be reclaimed
- from unprotected cgroups. Above the effective low boundary (or
+ memory won't be reclaimed unless there is no reclaimable
+ memory available in unprotected cgroups.
+ Above the effective low boundary (or
effective min boundary if it is higher), pages are reclaimed
proportionally to the overage, reducing reclaim pressure for
smaller overages.
@@ -1288,7 +1289,12 @@ PAGE_SIZE multiple when read back.
inactive_anon, active_anon, inactive_file, active_file, unevictable
Amount of memory, swap-backed and filesystem-backed,
on the internal memory management lists used by the
- page reclaim algorithm
+ page reclaim algorithm.
+
+ As these represent internal list state (e.g. shmem pages are on anon
+ memory management lists), inactive_foo + active_foo may not be equal to
+ the value for the foo counter, since the foo counter is type-based, not
+ list-based.
slab_reclaimable
Part of "slab" that might be reclaimed, such as
@@ -1334,7 +1340,7 @@ PAGE_SIZE multiple when read back.
pgdeactivate
- Amount of pages moved to the inactive LRU lis
+ Amount of pages moved to the inactive LRU list
pglazyfree
@@ -1920,7 +1926,7 @@ Cpuset Interface Files
It accepts only the following input values when written to.
- "root" - a paritition root
+ "root" - a partition root
"member" - a non-root member of a partition
When set to be a partition root, the current cgroup is the
diff --git a/Documentation/admin-guide/dell_rbu.rst b/Documentation/admin-guide/dell_rbu.rst
new file mode 100644
index 000000000000..8d70e1fc9f9d
--- /dev/null
+++ b/Documentation/admin-guide/dell_rbu.rst
@@ -0,0 +1,128 @@
+=========================================
+Dell Remote BIOS Update driver (dell_rbu)
+=========================================
+
+Purpose
+=======
+
+Document demonstrating the use of the Dell Remote BIOS Update driver
+for updating BIOS images on Dell servers and desktops.
+
+Scope
+=====
+
+This document discusses the functionality of the rbu driver only.
+It does not cover the support needed from applications to enable the BIOS to
+update itself with the image downloaded into memory.
+
+Overview
+========
+
+This driver works with Dell OpenManage or Dell Update Packages for updating
+the BIOS on Dell servers (starting from servers sold since 1999), desktops
+and notebooks (starting from those sold in 2005).
+
+Please register at http://support.dell.com to find information on
+OpenManage and Dell Update Packages (DUP).
+
+Libsmbios can also be used to update the BIOS on Dell systems; see
+http://linux.dell.com/libsmbios/ for details.
+
+The dell_rbu driver supports BIOS update using the monolithic image and
+packetized image methods. In the monolithic case, the driver allocates a
+contiguous chunk of physical pages to hold the BIOS image. In the packetized
+case, the application using the driver breaks the image into packets of a fixed
+size and the driver places each packet in contiguous physical memory. The
+driver also maintains a linked list of packets for reading them back.
+
+If the dell_rbu driver is unloaded all the allocated memory is freed.
+
+The rbu driver needs to have an application (as mentioned above) which will
+inform the BIOS to enable the update in the next system reboot.
+
+The user should not unload the rbu driver after downloading the BIOS image
+or updating.
+
+The driver load creates the following directories under the /sys file system::
+
+ /sys/class/firmware/dell_rbu/loading
+ /sys/class/firmware/dell_rbu/data
+ /sys/devices/platform/dell_rbu/image_type
+ /sys/devices/platform/dell_rbu/data
+ /sys/devices/platform/dell_rbu/packet_size
+
+The driver supports two types of update mechanism: monolithic and packetized.
+Which mechanism can be used depends on the BIOS currently running on the system.
+Most Dell systems support a monolithic update, where the BIOS image is
+copied to a single contiguous block of physical memory.
+
+With the packet mechanism, the image is instead split into smaller chunks
+of contiguous memory, and the BIOS image is scattered across these packets.
+
+By default the driver uses monolithic memory for the update type. This can be
+changed to packets during the driver load time by specifying the load
+parameter image_type=packet. This can also be changed later as below::
+
+ echo packet > /sys/devices/platform/dell_rbu/image_type
+
+In packet update mode the packet size has to be given before any packets can
+be downloaded. It is done as below::
+
+ echo XXXX > /sys/devices/platform/dell_rbu/packet_size
+
+In the packet update mechanism, the user needs to create a new file containing
+packets of data arranged back to back. It can be done as follows:
+the user creates a packet header, takes a chunk of the BIOS image and
+places it right after the packet header; the packet header and the BIOS image
+chunk together must match the specified packet_size. This makes one
+packet. The user creates more such packets out of the entire BIOS
+image file and then arranges all these packets back to back in one single
+file.
+
+This file is then copied to /sys/class/firmware/dell_rbu/data.
+Once this file gets to the driver, the driver extracts packet_size data from
+the file and spreads it across the physical memory in contiguous packet_sized
+space.
+
+This method makes sure that all the packets get to the driver in a single operation.
+
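The packet file layout described in dell_rbu.rst above can be sketched in Python. This is only an illustration under stated assumptions: the document does not specify the packet header layout or how a short final chunk is handled, so `placeholder_header`, `header_size`, and the zero padding below are hypothetical stand-ins, not the format Dell's tools use.

```python
from typing import Callable


def build_packet_file(
    image: bytes,
    packet_size: int,
    make_header: Callable[[int, bytes], bytes],
    header_size: int,
    pad_byte: bytes = b"\x00",
) -> bytes:
    """Return packets (header + image chunk) arranged back to back."""
    if header_size >= packet_size:
        raise ValueError("packet_size must be larger than the header")
    chunk_size = packet_size - header_size
    packets = []
    for index, start in enumerate(range(0, len(image), chunk_size)):
        chunk = image[start:start + chunk_size]
        # ASSUMPTION: pad a short final chunk so every packet matches
        # packet_size; the source document does not describe this case.
        chunk = chunk.ljust(chunk_size, pad_byte)
        header = make_header(index, chunk)
        if len(header) != header_size:
            raise ValueError("header has unexpected length")
        packets.append(header + chunk)
    return b"".join(packets)


def placeholder_header(index: int, chunk: bytes) -> bytes:
    # HYPOTHETICAL 4-byte header holding only the packet index.
    # This is NOT the real Dell packet header layout.
    return index.to_bytes(4, "little")
```

The resulting bytes would be written to a file and copied to /sys/class/firmware/dell_rbu/data in one operation, after packet_size has been set to the same value used here.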
+In monolithic update the user simply gets the BIOS image (.hdr file) and copies
+it to the data file as is, without any change to the BIOS image itself.
+
+Do the steps below to download the BIOS image.
+
+1) echo 1 > /sys/class/firmware/dell_rbu/loading
+2) cp bios_image.hdr /sys/class/firmware/dell_rbu/data
+3) echo 0 > /sys/class/firmware/dell_rbu/loading
+
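The three download steps listed in dell_rbu.rst above could be scripted roughly as follows. This is a minimal sketch, not part of the driver: the function name and the `firmware_dir` parameter are illustrative, the default path is the one given in the document, and writing to it requires root on a system with dell_rbu loaded.

```python
import shutil
from pathlib import Path


def download_monolithic_image(
    image: Path,
    firmware_dir: Path = Path("/sys/class/firmware/dell_rbu"),
) -> None:
    """Mirror steps 1-3: start loading, copy the image, finish loading."""
    loading = firmware_dir / "loading"
    data = firmware_dir / "data"

    loading.write_text("1\n")            # step 1: echo 1 > loading
    with image.open("rb") as src, data.open("wb") as dst:
        shutil.copyfileobj(src, dst)     # step 2: cp bios_image.hdr data
    loading.write_text("0\n")            # step 3: echo 0 > loading
```

Passing a scratch directory as `firmware_dir` makes it possible to exercise the sequence without touching sysfs.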
+The /sys/class/firmware/dell_rbu/ entries will remain till the following is
+done.
+
+::
+
+ echo -1 > /sys/class/firmware/dell_rbu/loading
+
+Until this step is completed the driver cannot be unloaded.
+
+Also, echoing either mono, packet or init into image_type will free up the
+memory allocated by the driver.
+
+If a user accidentally executes steps 1 and 3 above without executing step 2,
+the /sys/class/firmware/dell_rbu/ entries will disappear.
+
+The entries can be recreated by doing the following::
+
+ echo init > /sys/devices/platform/dell_rbu/image_type
+
+.. note:: echoing init in image_type does not change its original value.
+
+The driver also provides a read-only file, /sys/devices/platform/dell_rbu/data,
+to read back the downloaded image.
+
+.. note::
+
+ After updating the BIOS image a user mode application needs to execute
+ code which sends the BIOS update request to the BIOS. So on the next reboot
+ the BIOS knows about the new image downloaded and it updates itself.
+ Also don't unload the rbu driver if the image has to be updated.
+
diff --git a/Documentation/admin-guide/device-mapper/dm-dust.txt b/Documentation/admin-guide/device-mapper/dm-dust.rst
index 954d402a1f6a..b6e7e7ead831 100644
--- a/Documentation/admin-guide/device-mapper/dm-dust.txt
+++ b/Documentation/admin-guide/device-mapper/dm-dust.rst
@@ -31,218 +31,233 @@ configured "bad blocks" will be treated as bad, or bypassed.
This allows the pre-writing of test data and metadata prior to
simulating a "failure" event where bad sectors start to appear.
-Table parameters:
------------------
+Table parameters
+----------------
<device_path> <offset> <blksz>
Mandatory parameters:
- <device_path>: path to the block device.
- <offset>: offset to data area from start of device_path
- <blksz>: block size in bytes
+ <device_path>:
+ Path to the block device.
+
+ <offset>:
+ Offset to data area from start of device_path
+
+ <blksz>:
+ Block size in bytes
+
(minimum 512, maximum 1073741824, must be a power of 2)
-Usage instructions:
--------------------
+Usage instructions
+------------------
-First, find the size (in 512-byte sectors) of the device to be used:
+First, find the size (in 512-byte sectors) of the device to be used::
-$ sudo blockdev --getsz /dev/vdb1
-33552384
+ $ sudo blockdev --getsz /dev/vdb1
+ 33552384
Create the dm-dust device:
(For a device with a block size of 512 bytes)
-$ sudo dmsetup create dust1 --table '0 33552384 dust /dev/vdb1 0 512'
+
+::
+
+ $ sudo dmsetup create dust1 --table '0 33552384 dust /dev/vdb1 0 512'
(For a device with a block size of 4096 bytes)
-$ sudo dmsetup create dust1 --table '0 33552384 dust /dev/vdb1 0 4096'
+
+::
+
+ $ sudo dmsetup create dust1 --table '0 33552384 dust /dev/vdb1 0 4096'
Check the status of the read behavior ("bypass" indicates that all I/O
-will be passed through to the underlying device):
-$ sudo dmsetup status dust1
-0 33552384 dust 252:17 bypass
+will be passed through to the underlying device)::
+
+ $ sudo dmsetup status dust1
+ 0 33552384 dust 252:17 bypass
-$ sudo dd if=/dev/mapper/dust1 of=/dev/null bs=512 count=128 iflag=direct
-128+0 records in
-128+0 records out
+ $ sudo dd if=/dev/mapper/dust1 of=/dev/null bs=512 count=128 iflag=direct
+ 128+0 records in
+ 128+0 records out
-$ sudo dd if=/dev/zero of=/dev/mapper/dust1 bs=512 count=128 oflag=direct
-128+0 records in
-128+0 records out
+ $ sudo dd if=/dev/zero of=/dev/mapper/dust1 bs=512 count=128 oflag=direct
+ 128+0 records in
+ 128+0 records out
-Adding and removing bad blocks:
--------------------------------
+Adding and removing bad blocks
+------------------------------
At any time (i.e.: whether the device has the "bad block" emulation
enabled or disabled), bad blocks may be added or removed from the
-device via the "addbadblock" and "removebadblock" messages:
+device via the "addbadblock" and "removebadblock" messages::
-$ sudo dmsetup message dust1 0 addbadblock 60
-kernel: device-mapper: dust: badblock added at block 60
+ $ sudo dmsetup message dust1 0 addbadblock 60
+ kernel: device-mapper: dust: badblock added at block 60
-$ sudo dmsetup message dust1 0 addbadblock 67
-kernel: device-mapper: dust: badblock added at block 67
+ $ sudo dmsetup message dust1 0 addbadblock 67
+ kernel: device-mapper: dust: badblock added at block 67
-$ sudo dmsetup message dust1 0 addbadblock 72
-kernel: device-mapper: dust: badblock added at block 72
+ $ sudo dmsetup message dust1 0 addbadblock 72
+ kernel: device-mapper: dust: badblock added at block 72
These bad blocks will be stored in the "bad block list".
-While the device is in "bypass" mode, reads and writes will succeed:
+While the device is in "bypass" mode, reads and writes will succeed::
-$ sudo dmsetup status dust1
-0 33552384 dust 252:17 bypass
+ $ sudo dmsetup status dust1
+ 0 33552384 dust 252:17 bypass
-Enabling block read failures:
------------------------------
+Enabling block read failures
+----------------------------
-To enable the "fail read on bad block" behavior, send the "enable" message:
+To enable the "fail read on bad block" behavior, send the "enable" message::
-$ sudo dmsetup message dust1 0 enable
-kernel: device-mapper: dust: enabling read failures on bad sectors
+ $ sudo dmsetup message dust1 0 enable
+ kernel: device-mapper: dust: enabling read failures on bad sectors
-$ sudo dmsetup status dust1
-0 33552384 dust 252:17 fail_read_on_bad_block
+ $ sudo dmsetup status dust1
+ 0 33552384 dust 252:17 fail_read_on_bad_block
With the device in "fail read on bad block" mode, attempting to read a
-block will encounter an "Input/output error":
+block will encounter an "Input/output error"::
-$ sudo dd if=/dev/mapper/dust1 of=/dev/null bs=512 count=1 skip=67 iflag=direct
-dd: error reading '/dev/mapper/dust1': Input/output error
-0+0 records in
-0+0 records out
-0 bytes copied, 0.00040651 s, 0.0 kB/s
+ $ sudo dd if=/dev/mapper/dust1 of=/dev/null bs=512 count=1 skip=67 iflag=direct
+ dd: error reading '/dev/mapper/dust1': Input/output error
+ 0+0 records in
+ 0+0 records out
+ 0 bytes copied, 0.00040651 s, 0.0 kB/s
...and writing to the bad blocks will remove the blocks from the list,
-therefore emulating the "remap" behavior of hard disk drives:
+therefore emulating the "remap" behavior of hard disk drives::
-$ sudo dd if=/dev/zero of=/dev/mapper/dust1 bs=512 count=128 oflag=direct
-128+0 records in
-128+0 records out
+ $ sudo dd if=/dev/zero of=/dev/mapper/dust1 bs=512 count=128 oflag=direct
+ 128+0 records in
+ 128+0 records out
-kernel: device-mapper: dust: block 60 removed from badblocklist by write
-kernel: device-mapper: dust: block 67 removed from badblocklist by write
-kernel: device-mapper: dust: block 72 removed from badblocklist by write
-kernel: device-mapper: dust: block 87 removed from badblocklist by write
+ kernel: device-mapper: dust: block 60 removed from badblocklist by write
+ kernel: device-mapper: dust: block 67 removed from badblocklist by write
+ kernel: device-mapper: dust: block 72 removed from badblocklist by write
+ kernel: device-mapper: dust: block 87 removed from badblocklist by write
-Bad block add/remove error handling:
-------------------------------------
+Bad block add/remove error handling
+-----------------------------------
Attempting to add a bad block that already exists in the list will
-result in an "Invalid argument" error, as well as a helpful message:
+result in an "Invalid argument" error, as well as a helpful message::
-$ sudo dmsetup message dust1 0 addbadblock 88
-device-mapper: message ioctl on dust1 failed: Invalid argument
-kernel: device-mapper: dust: block 88 already in badblocklist
+ $ sudo dmsetup message dust1 0 addbadblock 88
+ device-mapper: message ioctl on dust1 failed: Invalid argument
+ kernel: device-mapper: dust: block 88 already in badblocklist
Attempting to remove a bad block that doesn't exist in the list will
-result in an "Invalid argument" error, as well as a helpful message:
+result in an "Invalid argument" error, as well as a helpful message::
-$ sudo dmsetup message dust1 0 removebadblock 87
-device-mapper: message ioctl on dust1 failed: Invalid argument
-kernel: device-mapper: dust: block 87 not found in badblocklist
+ $ sudo dmsetup message dust1 0 removebadblock 87
+ device-mapper: message ioctl on dust1 failed: Invalid argument
+ kernel: device-mapper: dust: block 87 not found in badblocklist
-Counting the number of bad blocks in the bad block list:
---------------------------------------------------------
+Counting the number of bad blocks in the bad block list
+-------------------------------------------------------
To count the number of bad blocks configured in the device, run the
-following message command:
+following message command::
-$ sudo dmsetup message dust1 0 countbadblocks
+ $ sudo dmsetup message dust1 0 countbadblocks
A message will print with the number of bad blocks currently
-configured on the device:
+configured on the device::
-kernel: device-mapper: dust: countbadblocks: 895 badblock(s) found
+ kernel: device-mapper: dust: countbadblocks: 895 badblock(s) found
-Querying for specific bad blocks:
----------------------------------
+Querying for specific bad blocks
+--------------------------------
To find out if a specific block is in the bad block list, run the
-following message command:
+following message command::
-$ sudo dmsetup message dust1 0 queryblock 72
+ $ sudo dmsetup message dust1 0 queryblock 72
-The following message will print if the block is in the list:
-device-mapper: dust: queryblock: block 72 found in badblocklist
+The following message will print if the block is in the list::
-The following message will print if the block is in the list:
-device-mapper: dust: queryblock: block 72 not found in badblocklist
+ device-mapper: dust: queryblock: block 72 found in badblocklist
+
+The following message will print if the block is not in the list::
+
+ device-mapper: dust: queryblock: block 72 not found in badblocklist
The "queryblock" message command will work in both the "enabled"
and "disabled" modes, allowing the verification of whether a block
will be treated as "bad" without having to issue I/O to the device,
or having to "enable" the bad block emulation.
-Clearing the bad block list:
-----------------------------
+Clearing the bad block list
+---------------------------
To clear the bad block list (without needing to individually run
a "removebadblock" message command for every block), run the
-following message command:
+following message command::
-$ sudo dmsetup message dust1 0 clearbadblocks
+ $ sudo dmsetup message dust1 0 clearbadblocks
-After clearing the bad block list, the following message will appear:
+After clearing the bad block list, the following message will appear::
-kernel: device-mapper: dust: clearbadblocks: badblocks cleared
+ kernel: device-mapper: dust: clearbadblocks: badblocks cleared
If there were no bad blocks to clear, the following message will
-appear:
+appear::
-kernel: device-mapper: dust: clearbadblocks: no badblocks found
+ kernel: device-mapper: dust: clearbadblocks: no badblocks found
-Message commands list:
-----------------------
+Message commands list
+---------------------
Below is a list of the messages that can be sent to a dust device:
-Operations on blocks (requires a <blknum> argument):
+Operations on blocks (requires a <blknum> argument)::
-addbadblock <blknum>
-queryblock <blknum>
-removebadblock <blknum>
+ addbadblock <blknum>
+ queryblock <blknum>
+ removebadblock <blknum>
...where <blknum> is a block number within range of the device
- (corresponding to the block size of the device.)
+(corresponding to the block size of the device.)
-Single argument message commands:
+Single argument message commands::
-countbadblocks
-clearbadblocks
-disable
-enable
-quiet
+ countbadblocks
+ clearbadblocks
+ disable
+ enable
+ quiet
-Device removal:
----------------
+Device removal
+--------------
-When finished, remove the device via the "dmsetup remove" command:
+When finished, remove the device via the "dmsetup remove" command::
-$ sudo dmsetup remove dust1
+ $ sudo dmsetup remove dust1
-Quiet mode:
------------
+Quiet mode
+----------
On test runs with many bad blocks, it may be desirable to avoid
excessive logging (from bad blocks added, removed, or "remapped").
-This can be done by enabling "quiet mode" via the following message:
+This can be done by enabling "quiet mode" via the following message::
-$ sudo dmsetup message dust1 0 quiet
+ $ sudo dmsetup message dust1 0 quiet
This will suppress log messages from add / remove / removed by write
operations. Log messages from "countbadblocks" or "queryblock"
message commands will still print in quiet mode.
-The status of quiet mode can be seen by running "dmsetup status":
+The status of quiet mode can be seen by running "dmsetup status"::
-$ sudo dmsetup status dust1
-0 33552384 dust 252:17 fail_read_on_bad_block quiet
+ $ sudo dmsetup status dust1
+ 0 33552384 dust 252:17 fail_read_on_bad_block quiet
-To disable quiet mode, send the "quiet" message again:
+To disable quiet mode, send the "quiet" message again::
-$ sudo dmsetup message dust1 0 quiet
+ $ sudo dmsetup message dust1 0 quiet
-$ sudo dmsetup status dust1
-0 33552384 dust 252:17 fail_read_on_bad_block verbose
+ $ sudo dmsetup status dust1
+ 0 33552384 dust 252:17 fail_read_on_bad_block verbose
(The presence of "verbose" indicates normal logging.)
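The table line passed to "dmsetup create" in dm-dust.rst above can be assembled and checked before use. The helper below is an illustrative sketch (the name `dust_table` is not part of dm-dust or dmsetup); it only enforces the block size limits stated under "Table parameters": minimum 512, maximum 1073741824, power of 2.

```python
def dust_table(num_sectors: int, device_path: str,
               offset: int = 0, blksz: int = 512) -> str:
    """Build a dm-dust table line: '0 <sectors> dust <dev> <offset> <blksz>'."""
    is_power_of_two = blksz > 0 and (blksz & (blksz - 1)) == 0
    if not (512 <= blksz <= 1073741824) or not is_power_of_two:
        raise ValueError(
            "blksz must be a power of 2 between 512 and 1073741824")
    return f"0 {num_sectors} dust {device_path} {offset} {blksz}"


# Sector count as reported by 'blockdev --getsz' in the example above.
table = dust_table(33552384, "/dev/vdb1", 0, 512)
```

The returned string is what follows `--table` in the `dmsetup create dust1` examples.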
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index a30aa91b5fbe..594095b54b29 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -177,6 +177,11 @@ bitmap_flush_interval:number
The bitmap flush interval in milliseconds. The metadata buffers
are synchronized when this interval expires.
+fix_padding
+ Use a smaller padding of the tag area that is more
+ space-efficient. If this option is not present, large padding is
+ used - that is for compatibility with older kernels.
+
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time can
be changed when reloading the target (load an inactive table and swap the
diff --git a/Documentation/admin-guide/device-mapper/dm-raid.rst b/Documentation/admin-guide/device-mapper/dm-raid.rst
index 2fe255b130fb..f6344675e395 100644
--- a/Documentation/admin-guide/device-mapper/dm-raid.rst
+++ b/Documentation/admin-guide/device-mapper/dm-raid.rst
@@ -417,3 +417,5 @@ Version History
deadlock/potential data corruption. Update superblock when
specific devices are requested via rebuild. Fix RAID leg
rebuild errors.
+ 1.15.0 Fix size extensions not being synchronized in case of new MD bitmap
+ pages allocated; also fix those not occurring after previous reductions
diff --git a/Documentation/admin-guide/device-mapper/index.rst b/Documentation/admin-guide/device-mapper/index.rst
index c77c58b8f67b..4872fb6d2952 100644
--- a/Documentation/admin-guide/device-mapper/index.rst
+++ b/Documentation/admin-guide/device-mapper/index.rst
@@ -9,6 +9,7 @@ Device Mapper
cache
delay
dm-crypt
+ dm-dust
dm-flakey
dm-init
dm-integrity
diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
index e3a796c0d3a2..2d19c9f4c1fe 100644
--- a/Documentation/admin-guide/hw-vuln/mds.rst
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -265,8 +265,11 @@ time with the option "mds=". The valid arguments for this option are:
============ =============================================================
-Not specifying this option is equivalent to "mds=full".
-
+Not specifying this option is equivalent to "mds=full". For processors
+that are affected by both TAA (TSX Asynchronous Abort) and MDS,
+specifying just "mds=off" without an accompanying "tsx_async_abort=off"
+will have no effect as the same mitigation is used for both
+vulnerabilities.
Mitigation selection guide
--------------------------
diff --git a/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
index fddbd7579c53..af6865b822d2 100644
--- a/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+++ b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
@@ -174,7 +174,10 @@ the option "tsx_async_abort=". The valid arguments for this option are:
CPU is not vulnerable to cross-thread TAA attacks.
============ =============================================================
-Not specifying this option is equivalent to "tsx_async_abort=full".
+Not specifying this option is equivalent to "tsx_async_abort=full". For
+processors that are affected by both TAA and MDS, specifying just
+"tsx_async_abort=off" without an accompanying "mds=off" will have no
+effect as the same mitigation is used for both vulnerabilities.
The kernel command line also allows to control the TSX feature using the
parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 34cc20ee7f3a..4405b7485312 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -57,60 +57,61 @@ configure specific aspects of kernel behavior to your liking.
.. toctree::
:maxdepth: 1
- initrd
- cgroup-v2
- cgroup-v1/index
- serial-console
- braille-console
- parport
- md
- module-signing
- rapidio
- sysrq
- unicode
- vga-softcursor
- binfmt-misc
- mono
- java
- ras
- bcache
- blockdev/index
- ext4
- binderfs
- cifs/index
- xfs
- jfs
- ufs
- pm/index
- thunderbolt
- LSM/index
- mm/index
- namespaces/index
- perf-security
acpi/index
aoe/index
+ auxdisplay/index
+ bcache
+ binderfs
+ binfmt-misc
+ blockdev/index
+ braille-console
btmrvl
+ cgroup-v1/index
+ cgroup-v2
+ cifs/index
clearing-warn-once
cpu-load
cputopology
+ dell_rbu
device-mapper/index
efi-stub
+ ext4
gpio/index
highuid
hw_random
+ initrd
iostats
+ java
+ jfs
kernel-per-CPU-kthreads
laptops/index
- auxdisplay/index
lcd-panel-cgram
ldm
lockup-watchdogs
+ LSM/index
+ md
+ mm/index
+ module-signing
+ mono
+ namespaces/index
numastat
+ parport
+ perf-security
+ pm/index
pnp
+ rapidio
+ ras
rtc
+ serial-console
svga
- wimax/index
+ sysrq
+ thunderbolt
+ ufs
+ unicode
+ vga-softcursor
video-output
+ wimax/index
+ xfs
.. only:: subproject and html
diff --git a/Documentation/admin-guide/iostats.rst b/Documentation/admin-guide/iostats.rst
index 5d63b18bd6d1..df5b8345c41d 100644
--- a/Documentation/admin-guide/iostats.rst
+++ b/Documentation/admin-guide/iostats.rst
@@ -46,81 +46,91 @@ each snapshot of your disk statistics.
In 2.4, the statistics fields are those after the device name. In
the above example, the first field of statistics would be 446216.
By contrast, in 2.6+ if you look at ``/sys/block/hda/stat``, you'll
-find just the eleven fields, beginning with 446216. If you look at
-``/proc/diskstats``, the eleven fields will be preceded by the major and
+find just the 15 fields, beginning with 446216. If you look at
+``/proc/diskstats``, the 15 fields will be preceded by the major and
minor device numbers, and device name. Each of these formats provides
-eleven fields of statistics, each meaning exactly the same things.
+15 fields of statistics, each meaning exactly the same things.
All fields except field 9 are cumulative since boot. Field 9 should
go to zero as I/Os complete; all others only increase (unless they
-overflow and wrap). Yes, these are (32-bit or 64-bit) unsigned long
-(native word size) numbers, and on a very busy or long-lived system they
-may wrap. Applications should be prepared to deal with that; unless
-your observations are measured in large numbers of minutes or hours,
-they should not wrap twice before you notice them.
+overflow and wrap). Wrapping might eventually occur on a very busy
+or long-lived system, so applications should be prepared to deal with
+it. The fields are either unsigned int (32-bit) or unsigned long
+(32-bit or 64-bit, depending on your machine), as noted per field
+below. Unless your observations are spaced very far apart in time,
+these fields should not wrap twice before you notice it.
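One way an application might handle a wrapped counter is modular subtraction over the field's width. The sketch below assumes the counter wrapped at most once between two samples, as the paragraph above suggests is normally the case; the function name and the `bits` parameter are illustrative, and the caller picks 32 or 64 from the per-field type and the machine's word size.

```python
def counter_delta(previous: int, current: int, bits: int) -> int:
    """Difference between two samples of an unsigned counter of the given
    width, correct as long as the counter wrapped at most once."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    return (current - previous) % (1 << bits)
```

For example, a 32-bit field sampled at 4294967290 and then at 5 has advanced by 11, not by a negative amount.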
Each set of stats only applies to the indicated device; if you want
system-wide stats you'll have to find all the devices and sum them all up.
-Field 1 -- # of reads completed
+Field 1 -- # of reads completed (unsigned long)
This is the total number of reads completed successfully.
-Field 2 -- # of reads merged, field 6 -- # of writes merged
+Field 2 -- # of reads merged, field 6 -- # of writes merged (unsigned long)
Reads and writes which are adjacent to each other may be merged for
efficiency. Thus two 4K reads may become one 8K read before it is
ultimately handed to the disk, and so it will be counted (and queued)
as only one I/O. This field lets you know how often this was done.
-Field 3 -- # of sectors read
+Field 3 -- # of sectors read (unsigned long)
This is the total number of sectors read successfully.
-Field 4 -- # of milliseconds spent reading
+Field 4 -- # of milliseconds spent reading (unsigned int)
This is the total number of milliseconds spent by all reads (as
measured from __make_request() to end_that_request_last()).
-Field 5 -- # of writes completed
+Field 5 -- # of writes completed (unsigned long)
This is the total number of writes completed successfully.
-Field 6 -- # of writes merged
+Field 6 -- # of writes merged (unsigned long)
See the description of field 2.
-Field 7 -- # of sectors written
+Field 7 -- # of sectors written (unsigned long)
This is the total number of sectors written successfully.
-Field 8 -- # of milliseconds spent writing
+Field 8 -- # of milliseconds spent writing (unsigned int)
This is the total number of milliseconds spent by all writes (as
measured from __make_request() to end_that_request_last()).
-Field 9 -- # of I/Os currently in progress
+Field 9 -- # of I/Os currently in progress (unsigned int)
The only field that should go to zero. Incremented as requests are
given to appropriate struct request_queue and decremented as they finish.
-Field 10 -- # of milliseconds spent doing I/Os
+Field 10 -- # of milliseconds spent doing I/Os (unsigned int)
This field increases so long as field 9 is nonzero.
Since 5.0 this field counts jiffies when at least one request was
started or completed. If request runs more than 2 jiffies then some
I/O time will not be accounted unless there are other requests.
-Field 11 -- weighted # of milliseconds spent doing I/Os
+Field 11 -- weighted # of milliseconds spent doing I/Os (unsigned int)
This field is incremented at each I/O start, I/O completion, I/O
merge, or read of these stats by the number of I/Os in progress
(field 9) times the number of milliseconds spent doing I/O since the
last update of this field. This can provide an easy measure of both
I/O completion time and the backlog that may be accumulating.
-Field 12 -- # of discards completed
+Field 12 -- # of discards completed (unsigned long)
This is the total number of discards completed successfully.
-Field 13 -- # of discards merged
+Field 13 -- # of discards merged (unsigned long)
See the description of field 2
-Field 14 -- # of sectors discarded
+Field 14 -- # of sectors discarded (unsigned long)
This is the total number of sectors discarded successfully.
-Field 15 -- # of milliseconds spent discarding
+Field 15 -- # of milliseconds spent discarding (unsigned int)
This is the total number of milliseconds spent by all discards (as
measured from __make_request() to end_that_request_last()).
+Field 16 -- # of flush requests completed
+ This is the total number of flush requests completed successfully.
+
+ Block layer combines flush requests and executes at most one at a time.
+ This counts flush requests executed by disk. Not tracked for partitions.
+
+Field 17 -- # of milliseconds spent flushing
+ This is the total number of milliseconds spent by all flush requests.
+
To avoid introducing performance bottlenecks, no locks are held while
modifying these counters. This implies that minor inaccuracies may be
introduced when changes collide, so (for instance) adding up all the
diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index d05d531b4ec9..6d421694d98e 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -127,6 +127,7 @@ parameter is applicable::
NET Appropriate network support is enabled.
NUMA NUMA support is enabled.
NFS Appropriate NFS support is enabled.
+ OF Devicetree is enabled.
OSS OSS sound support is enabled.
PV_OPS A paravirtualized kernel is enabled.
PARIDE The ParIDE (parallel port IDE) subsystem is enabled.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 8dee8f68fe15..ade4e6ec23e0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -113,7 +113,7 @@
the GPE dispatcher.
This facility can be used to prevent such uncontrolled
GPE floodings.
- Format: <int>
+ Format: <byte>
acpi_no_auto_serialize [HW,ACPI]
Disable auto-serialization of AML methods
@@ -437,8 +437,6 @@
no delay (0).
Format: integer
- bootmem_debug [KNL] Enable bootmem allocator debug messages.
-
bert_disable [ACPI]
Disable BERT OS support on buggy BIOSes.
@@ -983,12 +981,10 @@
earlycon= [KNL] Output early console device and options.
- [ARM64] The early console is determined by the
- stdout-path property in device tree's chosen node,
- or determined by the ACPI SPCR table.
-
- [X86] When used with no options the early console is
- determined by the ACPI SPCR table.
+ When used with no options, the early console is
+ determined by stdout-path property in device tree's
+ chosen node or the ACPI SPCR table if supported by
+ the platform.
cdns,<addr>[,options]
Start an early, polled-mode console on a Cadence
@@ -1101,7 +1097,7 @@
mapped with the correct attributes.
linflex,<addr>
- Use early console provided by Freescale LinFlex UART
+ Use early console provided by Freescale LINFlexD UART
serial driver for NXP S32V234 SoCs. A valid base
address must be provided, and the serial port must
already be setup and configured.
@@ -1168,7 +1164,8 @@
Format: {"off" | "on" | "skip[mbr]"}
efi= [EFI]
- Format: { "old_map", "nochunk", "noruntime", "debug" }
+ Format: { "old_map", "nochunk", "noruntime", "debug",
+ "nosoftreserve" }
old_map [X86-64]: switch to the old ioremap-based EFI
runtime services mapping. 32-bit still uses this one by
default.
@@ -1177,6 +1174,12 @@
firmware implementations.
noruntime : disable EFI runtime services support
debug: enable misc debug output
+ nosoftreserve: The EFI_MEMORY_SP (Specific Purpose)
+ attribute may cause the kernel to reserve the
+ memory range for a memory mapping driver to
+ claim. Specify efi=nosoftreserve to disable this
+ reservation and treat the memory by its base type
+ (i.e. EFI_CONVENTIONAL_MEMORY / "System RAM").
efi_no_storage_paranoia [EFI; X86]
Using this parameter you can use more than 50% of
@@ -1189,15 +1192,21 @@
updating original EFI memory map.
Region of memory which aa attribute is added to is
from ss to ss+nn.
+
If efi_fake_mem=2G@4G:0x10000,2G@0x10a0000000:0x10000
is specified, EFI_MEMORY_MORE_RELIABLE(0x10000)
attribute is added to range 0x100000000-0x180000000 and
0x10a0000000-0x1120000000.
+ If efi_fake_mem=8G@9G:0x40000 is specified, the
+ EFI_MEMORY_SP(0x40000) attribute is added to
+ range 0x240000000-0x43fffffff.
+
Using this parameter you can do debugging of EFI memmap
- related feature. For example, you can do debugging of
+ related features. For example, you can do debugging of
Address Range Mirroring feature even if your box
- doesn't support it.
+ doesn't support it, or mark specific memory as
+ "soft reserved".
efivar_ssdt= [EFI; X86] Name of an EFI variable that contains an SSDT
that is to be dynamically loaded by Linux. If there are
@@ -2473,6 +2482,12 @@
SMT on vulnerable CPUs
off - Unconditionally disable MDS mitigation
+ On TAA-affected machines, mds=off can be prevented by
+ an active TAA mitigation, because both vulnerabilities
+ are mitigated with the same mechanism. To disable this
+ mitigation, you need to specify tsx_async_abort=off
+ too.
+
Not specifying this option is equivalent to
mds=full.
@@ -3110,9 +3125,9 @@
[X86,PV_OPS] Disable paravirtualized VMware scheduler
clock and use the default one.
- no-steal-acc [X86,KVM] Disable paravirtualized steal time accounting.
- steal time is computed, but won't influence scheduler
- behaviour
+ no-steal-acc [X86,KVM,ARM64] Disable paravirtualized steal time
+ accounting. Steal time is computed, but won't
+ influence scheduler behaviour.
nolapic [X86-32,APIC] Do not enable or use the local APIC.
@@ -3221,6 +3236,12 @@
This can be set from sysctl after boot.
See Documentation/admin-guide/sysctl/vm.rst for details.
+ of_devlink [OF, KNL] Create device links between consumer and
+ supplier devices by scanning the devicetree to infer the
+ consumer/supplier relationships. A consumer device
+ will not be probed until all the supplier devices have
+ probed successfully.
+
ohci1394_dma=early [HW] enable debugging via the ohci1394 driver.
See Documentation/debugging-via-ohci1394.txt for more
info.
@@ -3519,8 +3540,15 @@
hpiosize=nn[KMG] The fixed amount of bus space which is
reserved for hotplug bridge's IO window.
Default size is 256 bytes.
+ hpmmiosize=nn[KMG] The fixed amount of bus space which is
+ reserved for hotplug bridge's MMIO window.
+ Default size is 2 megabytes.
+ hpmmioprefsize=nn[KMG] The fixed amount of bus space which is
+ reserved for hotplug bridge's MMIO_PREF window.
+ Default size is 2 megabytes.
hpmemsize=nn[KMG] The fixed amount of bus space which is
- reserved for hotplug bridge's memory window.
+ reserved for hotplug bridge's MMIO and
+ MMIO_PREF window.
Default size is 2 megabytes.
hpbussize=nn The minimum amount of additional bus numbers
reserved for buses below a hotplug bridge.
@@ -3567,6 +3595,8 @@
even if the platform doesn't give the OS permission to
use them. This may cause conflicts if the platform
also tries to use these services.
+ dpc-native Use native PCIe service for DPC only. May
+ cause conflicts if firmware uses AER or DPC.
compat Disable native PCIe services (PME, AER, DPC, PCIe
hotplug).
@@ -4931,6 +4961,11 @@
vulnerable to cross-thread TAA attacks.
off - Unconditionally disable TAA mitigation
+ On MDS-affected machines, tsx_async_abort=off can be
+ prevented by an active MDS mitigation, because both
+ vulnerabilities are mitigated with the same mechanism. To
+ disable this mitigation, you need to specify mds=off too.
+
Not specifying this option is equivalent to
tsx_async_abort=full. On CPUs which are MDS affected
and deploy MDS mitigation, TAA mitigation is not
@@ -5090,13 +5125,13 @@
Flags is a set of characters, each corresponding
to a common usb-storage quirk flag as follows:
a = SANE_SENSE (collect more than 18 bytes
- of sense data);
+ of sense data, not on uas);
b = BAD_SENSE (don't collect more than 18
- bytes of sense data);
+ bytes of sense data, not on uas);
c = FIX_CAPACITY (decrease the reported
device capacity by one sector);
d = NO_READ_DISC_INFO (don't use
- READ_DISC_INFO command);
+ READ_DISC_INFO command, not on uas);
e = NO_READ_CAPACITY_16 (don't use
READ_CAPACITY_16 command);
f = NO_REPORT_OPCODES (don't use report opcodes
@@ -5111,17 +5146,18 @@
j = NO_REPORT_LUNS (don't use report luns
command, uas only);
l = NOT_LOCKABLE (don't try to lock and
- unlock ejectable media);
+ unlock ejectable media, not on uas);
m = MAX_SECTORS_64 (don't transfer more
- than 64 sectors = 32 KB at a time);
+ than 64 sectors = 32 KB at a time,
+ not on uas);
n = INITIAL_READ10 (force a retry of the
- initial READ(10) command);
+ initial READ(10) command, not on uas);
o = CAPACITY_OK (accept the capacity
- reported by the device);
+ reported by the device, not on uas);
p = WRITE_CACHE (the device cache is ON
- by default);
+ by default, not on uas);
r = IGNORE_RESIDUE (the device reports
- bogus residue values);
+ bogus residue values, not on uas);
s = SINGLE_LUN (the device has only one
Logical Unit);
t = NO_ATA_1X (don't allow ATA(12) and ATA(16)
@@ -5130,7 +5166,8 @@
w = NO_WP_DETECT (don't test whether the
medium is write-protected).
y = ALWAYS_SYNC (issue a SYNCHRONIZE_CACHE
- even if the device claims no cache)
+ even if the device claims no cache,
+ not on uas)
Example: quirks=0419:aaf5:rl,0421:0433:rc
user_debug= [KNL,ARM]
diff --git a/Documentation/admin-guide/perf/imx-ddr.rst b/Documentation/admin-guide/perf/imx-ddr.rst
index 517a205abad6..3726a10a03ba 100644
--- a/Documentation/admin-guide/perf/imx-ddr.rst
+++ b/Documentation/admin-guide/perf/imx-ddr.rst
@@ -17,36 +17,54 @@ The "format" directory describes format of the config (event ID) and config1
(AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/
devices/imx8_ddr0/format/. The "events" directory describes the events types
hardware supported that can be used with perf tool, see /sys/bus/event_source/
-devices/imx8_ddr0/events/.
- e.g.::
+devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented
+in DDR PMU, see /sys/bus/event_source/devices/imx8_ddr0/caps/.
+
+ .. code-block:: bash
+
perf stat -a -e imx8_ddr0/cycles/ cmd
perf stat -a -e imx8_ddr0/read/,imx8_ddr0/write/ cmd
AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write)
to count reading or writing matches filter setting. Filter setting is various
from different DRAM controller implementations, which is distinguished by quirks
-in the driver.
+in the driver. You can also read this information from userspace: filter in the
+"caps" directory indicates whether the PMU supports the AXI ID filter, and
+enhanced_filter indicates whether the PMU supports the enhanced AXI ID filter.
+A value of 0 means unsupported and a value of 1 means supported.
-* With DDR_CAP_AXI_ID_FILTER quirk.
+* With DDR_CAP_AXI_ID_FILTER quirk (filter: 1, enhanced_filter: 0).
Filter is defined with two configuration parts:
--AXI_ID defines AxID matching value.
--AXI_MASKING defines which bits of AxID are meaningful for the matching.
- 0:corresponding bit is masked.
- 1: corresponding bit is not masked, i.e. used to do the matching.
+
+ - 0: corresponding bit is masked.
+ - 1: corresponding bit is not masked, i.e. used to do the matching.
AXI_ID and AXI_MASKING are mapped on DPCR1 register in performance counter.
When non-masked bits are matching corresponding AXI_ID bits then counter is
incremented. Perf counter is incremented if
- AxID && AXI_MASKING == AXI_ID && AXI_MASKING
+ AxID && AXI_MASKING == AXI_ID && AXI_MASKING
This filter doesn't support filter different AXI ID for axid-read and axid-write
event at the same time as this filter is shared between counters.
- e.g.::
- perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD/ cmd
- perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD/ cmd
-
- NOTE: axi_mask is inverted in userspace(i.e. set bits are bits to mask), and
- it will be reverted in driver automatically. so that the user can just specify
- axi_id to monitor a specific id, rather than having to specify axi_mask.
- e.g.::
+
+ .. code-block:: bash
+
+ perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD/ cmd
+ perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD/ cmd
+
+ .. note::
+
+ axi_mask is inverted in userspace (i.e. set bits are bits to mask), and
+ it will be reverted in driver automatically, so that the user can just specify
+ axi_id to monitor a specific id, rather than having to specify axi_mask.
+
+ .. code-block:: bash
+
perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12
+
+* With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk (filter: 1, enhanced_filter: 1).
+ This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits
+ counting the number of bytes (as opposed to the number of bursts) from DDR
+ read and write transactions concurrently with another set of data counters.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index ee4bfd2a740f..47c99f40cc16 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -8,6 +8,7 @@ Performance monitor support
:maxdepth: 1
hisi-pmu
+ imx-ddr
qcom_l2_pmu
qcom_l3_pmu
arm-ccn
diff --git a/Documentation/admin-guide/perf/thunderx2-pmu.rst b/Documentation/admin-guide/perf/thunderx2-pmu.rst
index 08e33675853a..01f158238ae1 100644
--- a/Documentation/admin-guide/perf/thunderx2-pmu.rst
+++ b/Documentation/admin-guide/perf/thunderx2-pmu.rst
@@ -3,24 +3,26 @@ Cavium ThunderX2 SoC Performance Monitoring Unit (PMU UNCORE)
=============================================================
The ThunderX2 SoC PMU consists of independent, system-wide, per-socket
-PMUs such as the Level 3 Cache (L3C) and DDR4 Memory Controller (DMC).
+PMUs such as the Level 3 Cache (L3C), DDR4 Memory Controller (DMC) and
+Cavium Coherent Processor Interconnect (CCPI2).
The DMC has 8 interleaved channels and the L3C has 16 interleaved tiles.
Events are counted for the default channel (i.e. channel 0) and prorated
to the total number of channels/tiles.
-The DMC and L3C support up to 4 counters. Counters are independently
-programmable and can be started and stopped individually. Each counter
-can be set to a different event. Counters are 32-bit and do not support
-an overflow interrupt; they are read every 2 seconds.
+The DMC and L3C support up to 4 counters, while the CCPI2 supports up to 8
+counters. Counters are independently programmable to different events and
+can be started and stopped individually. None of the counters support an
+overflow interrupt. DMC and L3C counters are 32-bit and read every 2 seconds.
+The CCPI2 counters are 64-bit and assumed not to overflow in normal operation.
PMU UNCORE (perf) driver:
The thunderx2_pmu driver registers per-socket perf PMUs for the DMC and
-L3C devices. Each PMU can be used to count up to 4 events
-simultaneously. The PMUs provide a description of their available events
-and configuration options under sysfs, see
-/sys/devices/uncore_<l3c_S/dmc_S/>; S is the socket id.
+L3C devices. Each PMU can be used to count up to 4 (DMC/L3C) or up to 8
+(CCPI2) events simultaneously. The PMUs provide a description of their
+available events and configuration options under sysfs, see
+/sys/devices/uncore_<l3c_S/dmc_S/ccpi2_S/>; S is the socket id.
The driver does not support sampling, therefore "perf record" will not
work. Per-task perf sessions are also not supported.
diff --git a/Documentation/admin-guide/ras.rst b/Documentation/admin-guide/ras.rst
index 2b20f5f7380d..0310db624964 100644
--- a/Documentation/admin-guide/ras.rst
+++ b/Documentation/admin-guide/ras.rst
@@ -330,9 +330,12 @@ There can be multiple csrows and multiple channels.
.. [#f4] Nowadays, the term DIMM (Dual In-line Memory Module) is widely
used to refer to a memory module, although there are other memory
- packaging alternatives, like SO-DIMM, SIMM, etc. Along this document,
- and inside the EDAC system, the term "dimm" is used for all memory
- modules, even when they use a different kind of packaging.
+ packaging alternatives, like SO-DIMM, SIMM, etc. The UEFI
+ specification (Version 2.7) defines a memory module in the Common
+ Platform Error Record (CPER) section to be an SMBIOS Memory Device
+ (Type 17). Along this document, and inside the EDAC subsystem, the term
+ "dimm" is used for all memory modules, even when they use a
+ different kind of packaging.
Memory controllers allow for several csrows, with 8 csrows being a
typical value. Yet, the actual number of csrows depends on the layout of
@@ -349,12 +352,14 @@ controllers. The following example will assume 2 channels:
| | ``ch0`` | ``ch1`` |
+============+===========+===========+
| ``csrow0`` | DIMM_A0 | DIMM_B0 |
- +------------+ | |
- | ``csrow1`` | | |
+ | | rank0 | rank0 |
+ +------------+ - | - |
+ | ``csrow1`` | rank1 | rank1 |
+------------+-----------+-----------+
| ``csrow2`` | DIMM_A1 | DIMM_B1 |
- +------------+ | |
- | ``csrow3`` | | |
+ | | rank0 | rank0 |
+ +------------+ - | - |
+ | ``csrow3`` | rank1 | rank1 |
+------------+-----------+-----------+
In the above example, there are 4 physical slots on the motherboard
@@ -374,11 +379,13 @@ which the memory DIMM is placed. Thus, when 1 DIMM is placed in each
Channel, the csrows cross both DIMMs.
Memory DIMMs come single or dual "ranked". A rank is a populated csrow.
-Thus, 2 single ranked DIMMs, placed in slots DIMM_A0 and DIMM_B0 above
-will have just one csrow (csrow0). csrow1 will be empty. On the other
-hand, when 2 dual ranked DIMMs are similarly placed, then both csrow0
-and csrow1 will be populated. The pattern repeats itself for csrow2 and
-csrow3.
+In the example above 2 dual ranked DIMMs are similarly placed. Thus,
+both csrow0 and csrow1 are populated. On the other hand, when 2 single
+ranked DIMMs are placed in slots DIMM_A0 and DIMM_B0, then they will
+have just one csrow (csrow0) and csrow1 will be empty. The pattern
+repeats itself for csrow2 and csrow3. Also note that some memory
+controllers don't have any logic to identify the memory module, see
+``rankX`` directories below.
The representation of the above is reflected in the directory
tree in EDAC's sysfs interface. Starting in directory
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 032c7cd3cede..def074807cee 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -831,8 +831,8 @@ printk_ratelimit:
=================
Some warning messages are rate limited. printk_ratelimit specifies
-the minimum length of time between these messages (in jiffies), by
-default we allow one every 5 seconds.
+the minimum length of time between these messages (in seconds).
+The default value is 5 seconds.
A value of 0 will disable rate limiting.
@@ -845,6 +845,8 @@ seconds, we do allow a burst of messages to pass through.
printk_ratelimit_burst specifies the number of messages we can
send before ratelimiting kicks in.
+The default value is 10 messages.
+
printk_devkmsg:
===============
@@ -1101,7 +1103,7 @@ During initialization the kernel sets this value such that even if the
maximum number of threads is created, the thread structures occupy only
a part (1/8th) of the available RAM pages.
-The minimum value that can be written to threads-max is 20.
+The minimum value that can be written to threads-max is 1.
The maximum value that can be written to threads-max is given by the
constant FUTEX_TID_MASK (0x3fffffff).
@@ -1109,10 +1111,6 @@ constant FUTEX_TID_MASK (0x3fffffff).
If a value outside of this range is written to threads-max an error
EINVAL occurs.
-The value written is checked against the available RAM pages. If the
-thread structures would occupy too much (more than 1/8th) of the
-available RAM pages threads-max is reduced accordingly.
-
unknown_nmi_panic:
==================