aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/fault-injection
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/fault-injection')
-rw-r--r--Documentation/fault-injection/fault-injection.rst72
-rw-r--r--Documentation/fault-injection/notifier-error-inject.rst4
-rw-r--r--Documentation/fault-injection/nvme-fault-injection.rst2
-rw-r--r--Documentation/fault-injection/provoke-crashes.rst57
4 files changed, 89 insertions, 46 deletions
diff --git a/Documentation/fault-injection/fault-injection.rst b/Documentation/fault-injection/fault-injection.rst
index f51bb21d20e4..17779a2772e5 100644
--- a/Documentation/fault-injection/fault-injection.rst
+++ b/Documentation/fault-injection/fault-injection.rst
@@ -16,15 +16,23 @@ Available fault injection capabilities
injects page allocation failures. (alloc_pages(), get_free_pages(), ...)
+- fail_usercopy
+
+ injects failures in user memory access functions. (copy_from_user(), get_user(), ...)
+
- fail_futex
injects futex deadlock and uaddr fault errors.
+- fail_sunrpc
+
+ injects kernel RPC client and server failures.
+
- fail_make_request
injects disk IO errors on devices permitted by setting
/sys/block/<device>/make-it-fail or
- /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())
+ /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())
- fail_mmc_request
@@ -74,8 +82,10 @@ configuration of fault-injection capabilities.
- /sys/kernel/debug/fail*/times:
- specifies how many times failures may happen at most.
- A value of -1 means "no limit".
+ specifies how many times failures may happen at most. A value of -1
+ means "no limit". Note, though, that this file only accepts unsigned
+ values. So, if you want to specify -1, you better use 'printf' instead
+ of 'echo', e.g.: $ printf %#x -1 > times
- /sys/kernel/debug/fail*/space:
@@ -122,16 +132,16 @@ configuration of fault-injection capabilities.
Format: { 'Y' | 'N' }
- default is 'N', setting it to 'Y' won't inject failures into
- highmem/user allocations.
+ default is 'Y', setting it to 'N' will also inject failures into
+ highmem/user allocations (__GFP_HIGHMEM allocations).
- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
Format: { 'Y' | 'N' }
- default is 'N', setting it to 'Y' will inject failures
- only into non-sleep allocations (GFP_ATOMIC allocations).
+ default is 'Y', setting it to 'N' will also inject failures
+ into allocations that can sleep (__GFP_DIRECT_RECLAIM allocations).
- /sys/kernel/debug/fail_page_alloc/min-order:
@@ -145,6 +155,27 @@ configuration of fault-injection capabilities.
default is 'N', setting it to 'Y' will disable failure injections
when dealing with private (address space) futexes.
+- /sys/kernel/debug/fail_sunrpc/ignore-client-disconnect:
+
+ Format: { 'Y' | 'N' }
+
+ default is 'N', setting it to 'Y' will disable disconnect
+ injection on the RPC client.
+
+- /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect:
+
+ Format: { 'Y' | 'N' }
+
+ default is 'N', setting it to 'Y' will disable disconnect
+ injection on the RPC server.
+
+- /sys/kernel/debug/fail_sunrpc/ignore-cache-wait:
+
+ Format: { 'Y' | 'N' }
+
+ default is 'N', setting it to 'Y' will disable cache wait
+ injection on the RPC server.
+
- /sys/kernel/debug/fail_function/inject:
Format: { 'function-name' | '!function-name' | '' }
@@ -163,11 +194,13 @@ configuration of fault-injection capabilities.
- ERRNO: retval must be -1 to -MAX_ERRNO (-4096).
- ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4096).
-- /sys/kernel/debug/fail_function/<functiuon-name>/retval:
+- /sys/kernel/debug/fail_function/<function-name>/retval:
- specifies the "error" return value to inject to the given
- function for given function. This will be created when
- user specifies new injection entry.
+ specifies the "error" return value to inject to the given function.
+ This will be created when the user specifies a new injection entry.
+ Note that this file only accepts unsigned values. So, if you want to
+ use a negative errno, you better use 'printf' instead of 'echo', e.g.:
+ $ printf %#x -12 > retval
Boot option
^^^^^^^^^^^
@@ -177,6 +210,7 @@ use the boot option::
failslab=
fail_page_alloc=
+ fail_usercopy=
fail_make_request=
fail_futex=
mmc_core.fail_request=<interval>,<probability>,<space>,<times>
@@ -222,7 +256,7 @@ How to add new fault injection capability
- debugfs entries
- failslab, fail_page_alloc, and fail_make_request use this way.
+ failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this way.
Helper functions:
fault_create_debugfs_attr(name, parent, attr);
@@ -250,10 +284,10 @@ Application Examples
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
- echo -1 > /sys/kernel/debug/$FAILTYPE/times
+ printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
- echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
+ echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
faulty_system()
{
@@ -304,11 +338,11 @@ Application Examples
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
- echo -1 > /sys/kernel/debug/$FAILTYPE/times
+ printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
- echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
- echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
+ echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
+ echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
@@ -331,11 +365,11 @@ Application Examples
FAILTYPE=fail_function
FAILFUNC=open_ctree
echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
- echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
+ printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 100 > /sys/kernel/debug/$FAILTYPE/probability
echo 0 > /sys/kernel/debug/$FAILTYPE/interval
- echo -1 > /sys/kernel/debug/$FAILTYPE/times
+ printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 1 > /sys/kernel/debug/$FAILTYPE/verbose
diff --git a/Documentation/fault-injection/notifier-error-inject.rst b/Documentation/fault-injection/notifier-error-inject.rst
index 1668b6e48d3a..fdf2dc433ead 100644
--- a/Documentation/fault-injection/notifier-error-inject.rst
+++ b/Documentation/fault-injection/notifier-error-inject.rst
@@ -91,8 +91,8 @@ For more usage examples
There are tools/testing/selftests using the notifier error injection features
for CPU and memory notifiers.
- * tools/testing/selftests/cpu-hotplug/on-off-test.sh
- * tools/testing/selftests/memory-hotplug/on-off-test.sh
+ * tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh
+ * tools/testing/selftests/memory-hotplug/mem-on-off-test.sh
These scripts first do simple online and offline tests and then do fault
injection tests if notifier error injection module is available.
diff --git a/Documentation/fault-injection/nvme-fault-injection.rst b/Documentation/fault-injection/nvme-fault-injection.rst
index cdb2e829228e..1d4427890d75 100644
--- a/Documentation/fault-injection/nvme-fault-injection.rst
+++ b/Documentation/fault-injection/nvme-fault-injection.rst
@@ -3,7 +3,7 @@ NVMe Fault Injection
Linux's fault injection framework provides a systematic way to support
error injection via debugfs in the /sys/kernel/debug directory. When
enabled, the default NVME_SC_INVALID_OPCODE with no retry will be
-injected into the nvme_end_request. Users can change the default status
+injected into the nvme_try_complete_req. Users can change the default status
code and no retry flag via the debugfs. The list of Generic Command
Status can be found in include/linux/nvme.h
diff --git a/Documentation/fault-injection/provoke-crashes.rst b/Documentation/fault-injection/provoke-crashes.rst
index 9279a3e12278..3abe84225613 100644
--- a/Documentation/fault-injection/provoke-crashes.rst
+++ b/Documentation/fault-injection/provoke-crashes.rst
@@ -1,16 +1,19 @@
-===============
-Provoke crashes
-===============
+.. SPDX-License-Identifier: GPL-2.0
-The lkdtm module provides an interface to crash or injure the kernel at
-predefined crashpoints to evaluate the reliability of crash dumps obtained
-using different dumping solutions. The module uses KPROBEs to instrument
-crashing points, but can also crash the kernel directly without KRPOBE
-support.
+============================================================
+Provoking crashes with Linux Kernel Dump Test Module (LKDTM)
+============================================================
+The lkdtm module provides an interface to disrupt (and usually crash)
+the kernel at predefined code locations to evaluate the reliability of
+the kernel's exception handling and to test crash dumps obtained using
+different dumping solutions. The module uses KPROBEs to instrument the
+trigger location, but can also trigger the kernel directly without KPROBE
+support via debugfs.
-You can provide the way either through module arguments when inserting
-the module, or through a debugfs interface.
+You can select the location of the trigger ("crash point name") and the
+type of action ("crash point type") either through module arguments when
+inserting the module, or through the debugfs interface.
Usage::
@@ -18,31 +21,37 @@ Usage::
[cpoint_count={>0}]
recur_count
- Recursion level for the stack overflow test. Default is 10.
+ Recursion level for the stack overflow test. By default this is
+ dynamically calculated based on kernel configuration, with the
+ goal of being just large enough to exhaust the kernel stack. The
+ value can be seen at `/sys/module/lkdtm/parameters/recur_count`.
cpoint_name
- Crash point where the kernel is to be crashed. It can be
+ Where in the kernel to trigger the action. It can be
one of INT_HARDWARE_ENTRY, INT_HW_IRQ_EN, INT_TASKLET_ENTRY,
- FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_DISPATCH_CMD,
- IDE_CORE_CP, DIRECT
+ FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_QUEUE_RQ, or DIRECT.
cpoint_type
Indicates the action to be taken on hitting the crash point.
- It can be one of PANIC, BUG, EXCEPTION, LOOP, OVERFLOW,
- CORRUPT_STACK, UNALIGNED_LOAD_STORE_WRITE, OVERWRITE_ALLOCATION,
- WRITE_AFTER_FREE,
+ These are numerous, and best queried directly from debugfs. Some
+ of the common ones are PANIC, BUG, EXCEPTION, LOOP, and OVERFLOW.
+ See the contents of `/sys/kernel/debug/provoke-crash/DIRECT` for
+ a complete list.
cpoint_count
Indicates the number of times the crash point is to be hit
- to trigger an action. The default is 10.
+ before triggering the action. The default is 10 (except for
+ DIRECT, which always fires immediately).
You can also induce failures by mounting debugfs and writing the type to
-<mountpoint>/provoke-crash/<crashpoint>. E.g.::
+<debugfs>/provoke-crash/<crashpoint>. E.g.::
- mount -t debugfs debugfs /mnt
- echo EXCEPTION > /mnt/provoke-crash/INT_HARDWARE_ENTRY
+ mount -t debugfs debugfs /sys/kernel/debug
+ echo EXCEPTION > /sys/kernel/debug/provoke-crash/INT_HARDWARE_ENTRY
+The special file `DIRECT` will induce the action directly without KPROBE
+instrumentation. This mode is the only one available when the module is
+built for a kernel without KPROBEs support::
-A special file is `DIRECT` which will induce the crash directly without
-KPROBE instrumentation. This mode is the only one available when the module
-is built on a kernel without KPROBEs support.
+ # Instead of having a BUG kill your shell, have it kill "cat":
+ cat <(echo WRITE_RO) >/sys/kernel/debug/provoke-crash/DIRECT