aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/Documentation/dev-tools/kcsan.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/dev-tools/kcsan.rst')
-rw-r--r--Documentation/dev-tools/kcsan.rst197
1 files changed, 121 insertions, 76 deletions
diff --git a/Documentation/dev-tools/kcsan.rst b/Documentation/dev-tools/kcsan.rst
index ce4bbd918648..94b6802ab0ab 100644
--- a/Documentation/dev-tools/kcsan.rst
+++ b/Documentation/dev-tools/kcsan.rst
@@ -1,5 +1,8 @@
-The Kernel Concurrency Sanitizer (KCSAN)
-========================================
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright (C) 2019, Google LLC.
+
+Kernel Concurrency Sanitizer (KCSAN)
+====================================
The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector, which
relies on compile-time instrumentation, and uses a watchpoint-based sampling
@@ -8,7 +11,8 @@ approach to detect races. KCSAN's primary purpose is to detect `data races`_.
Usage
-----
-KCSAN requires Clang version 11 or later.
+KCSAN is supported by both GCC and Clang. With GCC we require version 11 or
+later, and with Clang also require version 11 or later.
To enable KCSAN configure the kernel with::
@@ -23,75 +27,57 @@ Error reports
A typical data race report looks like this::
==================================================================
- BUG: KCSAN: data-race in generic_permission / kernfs_refresh_inode
-
- write to 0xffff8fee4c40700c of 4 bytes by task 175 on cpu 4:
- kernfs_refresh_inode+0x70/0x170
- kernfs_iop_permission+0x4f/0x90
- inode_permission+0x190/0x200
- link_path_walk.part.0+0x503/0x8e0
- path_lookupat.isra.0+0x69/0x4d0
- filename_lookup+0x136/0x280
- user_path_at_empty+0x47/0x60
- vfs_statx+0x9b/0x130
- __do_sys_newlstat+0x50/0xb0
- __x64_sys_newlstat+0x37/0x50
- do_syscall_64+0x85/0x260
- entry_SYSCALL_64_after_hwframe+0x44/0xa9
-
- read to 0xffff8fee4c40700c of 4 bytes by task 166 on cpu 6:
- generic_permission+0x5b/0x2a0
- kernfs_iop_permission+0x66/0x90
- inode_permission+0x190/0x200
- link_path_walk.part.0+0x503/0x8e0
- path_lookupat.isra.0+0x69/0x4d0
- filename_lookup+0x136/0x280
- user_path_at_empty+0x47/0x60
- do_faccessat+0x11a/0x390
- __x64_sys_access+0x3c/0x50
- do_syscall_64+0x85/0x260
- entry_SYSCALL_64_after_hwframe+0x44/0xa9
+ BUG: KCSAN: data-race in test_kernel_read / test_kernel_write
+
+ write to 0xffffffffc009a628 of 8 bytes by task 487 on cpu 0:
+ test_kernel_write+0x1d/0x30
+ access_thread+0x89/0xd0
+ kthread+0x23e/0x260
+ ret_from_fork+0x22/0x30
+
+ read to 0xffffffffc009a628 of 8 bytes by task 488 on cpu 6:
+ test_kernel_read+0x10/0x20
+ access_thread+0x89/0xd0
+ kthread+0x23e/0x260
+ ret_from_fork+0x22/0x30
+
+ value changed: 0x00000000000009a6 -> 0x00000000000009b2
Reported by Kernel Concurrency Sanitizer on:
- CPU: 6 PID: 166 Comm: systemd-journal Not tainted 5.3.0-rc7+ #1
- Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
+ CPU: 6 PID: 488 Comm: access_thread Not tainted 5.12.0-rc2+ #1
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
==================================================================
The header of the report provides a short summary of the functions involved in
the race. It is followed by the access types and stack traces of the 2 threads
-involved in the data race.
+involved in the data race. If KCSAN also observed a value change, the observed
+old value and new value are shown on the "value changed" line respectively.
The other less common type of data race report looks like this::
==================================================================
- BUG: KCSAN: data-race in e1000_clean_rx_irq+0x551/0xb10
-
- race at unknown origin, with read to 0xffff933db8a2ae6c of 1 bytes by interrupt on cpu 0:
- e1000_clean_rx_irq+0x551/0xb10
- e1000_clean+0x533/0xda0
- net_rx_action+0x329/0x900
- __do_softirq+0xdb/0x2db
- irq_exit+0x9b/0xa0
- do_IRQ+0x9c/0xf0
- ret_from_intr+0x0/0x18
- default_idle+0x3f/0x220
- arch_cpu_idle+0x21/0x30
- do_idle+0x1df/0x230
- cpu_startup_entry+0x14/0x20
- rest_init+0xc5/0xcb
- arch_call_rest_init+0x13/0x2b
- start_kernel+0x6db/0x700
+ BUG: KCSAN: data-race in test_kernel_rmw_array+0x71/0xd0
+
+ race at unknown origin, with read to 0xffffffffc009bdb0 of 8 bytes by task 515 on cpu 2:
+ test_kernel_rmw_array+0x71/0xd0
+ access_thread+0x89/0xd0
+ kthread+0x23e/0x260
+ ret_from_fork+0x22/0x30
+
+ value changed: 0x0000000000002328 -> 0x0000000000002329
Reported by Kernel Concurrency Sanitizer on:
- CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc7+ #2
- Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
+ CPU: 2 PID: 515 Comm: access_thread Not tainted 5.12.0-rc2+ #1
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
==================================================================
This report is generated where it was not possible to determine the other
racing thread, but a race was inferred due to the data value of the watched
-memory location having changed. These can occur either due to missing
-instrumentation or e.g. DMA accesses. These reports will only be generated if
-``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y`` (selected by default).
+memory location having changed. These reports always show a "value changed"
+line. A common reason for reports of this type are missing instrumentation in
+the racing thread, but could also occur due to e.g. DMA accesses. Such reports
+are shown only if ``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y``, which is
+enabled by default.
Selective analysis
~~~~~~~~~~~~~~~~~~
@@ -102,7 +88,8 @@ the below options are available:
* KCSAN understands the ``data_race(expr)`` annotation, which tells KCSAN that
any data races due to accesses in ``expr`` should be ignored and resulting
- behaviour when encountering a data race is deemed safe.
+ behaviour when encountering a data race is deemed safe. Please see
+ `"Marking Shared-Memory Accesses" in the LKMM`_ for more information.
* Disabling data race detection for entire functions can be accomplished by
using the function attribute ``__no_kcsan``::
@@ -114,12 +101,6 @@ the below options are available:
To dynamically limit for which functions to generate reports, see the
`DebugFS interface`_ blacklist/whitelist feature.
- For ``__always_inline`` functions, replace ``__always_inline`` with
- ``__no_kcsan_or_inline`` (which implies ``__always_inline``)::
-
- static __no_kcsan_or_inline void foo(void) {
- ...
-
* To disable data race detection for a particular compilation unit, add to the
``Makefile``::
@@ -130,6 +111,8 @@ the below options are available:
KCSAN_SANITIZE := n
+.. _"Marking Shared-Memory Accesses" in the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt
+
Furthermore, it is possible to tell KCSAN to show or hide entire classes of
data races, depending on preferences. These can be changed via the following
Kconfig options:
@@ -144,6 +127,18 @@ Kconfig options:
causes KCSAN to not report data races due to conflicts where the only plain
accesses are aligned writes up to word size.
+* ``CONFIG_KCSAN_PERMISSIVE``: Enable additional permissive rules to ignore
+ certain classes of common data races. Unlike the above, the rules are more
+ complex involving value-change patterns, access type, and address. This
+ option depends on ``CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=y``. For details
+ please see the ``kernel/kcsan/permissive.h``. Testers and maintainers that
+ only focus on reports from specific subsystems and not the whole kernel are
+ recommended to disable this option.
+
+To use the strictest possible rules, select ``CONFIG_KCSAN_STRICT=y``, which
+configures KCSAN to follow the Linux-kernel memory consistency model (LKMM) as
+closely as possible.
+
DebugFS interface
~~~~~~~~~~~~~~~~~
@@ -209,17 +204,17 @@ Ultimately this allows to determine the possible executions of concurrent code,
and if that code is free from data races.
KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
-``atomic_*``, etc.), but is oblivious of any ordering guarantees and simply
-assumes that memory barriers are placed correctly. In other words, KCSAN
-assumes that as long as a plain access is not observed to race with another
-conflicting access, memory operations are correctly ordered.
-
-This means that KCSAN will not report *potential* data races due to missing
-memory ordering. Developers should therefore carefully consider the required
-memory ordering requirements that remain unchecked. If, however, missing
-memory ordering (that is observable with a particular compiler and
-architecture) leads to an observable data race (e.g. entering a critical
-section erroneously), KCSAN would report the resulting data race.
+``atomic_*``, etc.), and a subset of ordering guarantees implied by memory
+barriers. With ``CONFIG_KCSAN_WEAK_MEMORY=y``, KCSAN models load or store
+buffering, and can detect missing ``smp_mb()``, ``smp_wmb()``, ``smp_rmb()``,
+``smp_store_release()``, and all ``atomic_*`` operations with equivalent
+implied barriers.
+
+Note, KCSAN will not report all data races due to missing memory ordering,
+specifically where a memory barrier would be required to prohibit subsequent
+memory operation from reordering before the barrier. Developers should
+therefore carefully consider the required memory ordering requirements that
+remain unchecked.
Race Detection Beyond Data Races
--------------------------------
@@ -273,6 +268,56 @@ marked operations, if all accesses to a variable that is accessed concurrently
are properly marked, KCSAN will never trigger a watchpoint and therefore never
report the accesses.
+Modeling Weak Memory
+~~~~~~~~~~~~~~~~~~~~
+
+KCSAN's approach to detecting data races due to missing memory barriers is
+based on modeling access reordering (with ``CONFIG_KCSAN_WEAK_MEMORY=y``).
+Each plain memory access for which a watchpoint is set up, is also selected for
+simulated reordering within the scope of its function (at most 1 in-flight
+access).
+
+Once an access has been selected for reordering, it is checked along every
+other access until the end of the function scope. If an appropriate memory
+barrier is encountered, the access will no longer be considered for simulated
+reordering.
+
+When the result of a memory operation should be ordered by a barrier, KCSAN can
+then detect data races where the conflict only occurs as a result of a missing
+barrier. Consider the example::
+
+ int x, flag;
+ void T1(void)
+ {
+ x = 1; // data race!
+ WRITE_ONCE(flag, 1); // correct: smp_store_release(&flag, 1)
+ }
+ void T2(void)
+ {
+ while (!READ_ONCE(flag)); // correct: smp_load_acquire(&flag)
+ ... = x; // data race!
+ }
+
+When weak memory modeling is enabled, KCSAN can consider ``x`` in ``T1`` for
+simulated reordering. After the write of ``flag``, ``x`` is again checked for
+concurrent accesses: because ``T2`` is able to proceed after the write of
+``flag``, a data race is detected. With the correct barriers in place, ``x``
+would not be considered for reordering after the proper release of ``flag``,
+and no data race would be detected.
+
+Deliberate trade-offs in complexity but also practical limitations mean only a
+subset of data races due to missing memory barriers can be detected. With
+currently available compiler support, the implementation is limited to modeling
+the effects of "buffering" (delaying accesses), since the runtime cannot
+"prefetch" accesses. Also recall that watchpoints are only set up for plain
+accesses, and the only access type for which KCSAN simulates reordering. This
+means reordering of marked accesses is not modeled.
+
+A consequence of the above is that acquire operations do not require barrier
+instrumentation (no prefetching). Furthermore, marked accesses introducing
+address or control dependencies do not require special handling (the marked
+access cannot be reordered, later dependent accesses cannot be prefetched).
+
Key Properties
~~~~~~~~~~~~~~
@@ -295,8 +340,8 @@ Key Properties
4. **Detects Racy Writes from Devices:** Due to checking data values upon
setting up watchpoints, racy writes from devices can also be detected.
-5. **Memory Ordering:** KCSAN is *not* explicitly aware of the LKMM's ordering
- rules; this may result in missed data races (false negatives).
+5. **Memory Ordering:** KCSAN is aware of only a subset of LKMM ordering rules;
+ this may result in missed data races (false negatives).
6. **Analysis Accuracy:** For observed executions, due to using a sampling
strategy, the analysis is *unsound* (false negatives possible), but aims to