path: root/Documentation/admin-guide
Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r-- Documentation/admin-guide/LSM/LoadPin.rst | 6
-rw-r--r-- Documentation/admin-guide/LSM/SafeSetID.rst | 29
-rw-r--r-- Documentation/admin-guide/LSM/Yama.rst | 7
-rw-r--r-- Documentation/admin-guide/LSM/tomoyo.rst | 16
-rw-r--r-- Documentation/admin-guide/README.rst | 147
-rw-r--r-- Documentation/admin-guide/abi-obsolete.rst | 11
-rw-r--r-- Documentation/admin-guide/abi-removed.rst | 5
-rw-r--r-- Documentation/admin-guide/abi-stable.rst | 14
-rw-r--r-- Documentation/admin-guide/abi-testing.rst | 20
-rw-r--r-- Documentation/admin-guide/abi.rst | 11
-rw-r--r-- Documentation/admin-guide/acpi/cppc_sysfs.rst | 6
-rw-r--r-- Documentation/admin-guide/acpi/dsdt-override.rst | 13
-rw-r--r-- Documentation/admin-guide/acpi/fan_performance_states.rst | 28
-rw-r--r-- Documentation/admin-guide/acpi/index.rst | 1
-rw-r--r-- Documentation/admin-guide/acpi/initrd_table_override.rst | 2
-rw-r--r-- Documentation/admin-guide/acpi/ssdt-overlays.rst | 51
-rw-r--r-- Documentation/admin-guide/auxdisplay/cfag12864b.rst | 2
-rw-r--r-- Documentation/admin-guide/auxdisplay/ks0108.rst | 2
-rw-r--r-- Documentation/admin-guide/bcache.rst | 31
-rw-r--r-- Documentation/admin-guide/binderfs.rst | 21
-rw-r--r-- Documentation/admin-guide/binfmt-misc.rst | 12
-rw-r--r-- Documentation/admin-guide/blockdev/drbd/figures.rst | 4
-rw-r--r-- Documentation/admin-guide/blockdev/drbd/index.rst | 2
-rw-r--r-- Documentation/admin-guide/blockdev/drbd/peer-states-8.dot (renamed from Documentation/admin-guide/blockdev/drbd/node-states-8.dot) | 5
-rw-r--r-- Documentation/admin-guide/blockdev/floppy.rst | 6
-rw-r--r-- Documentation/admin-guide/blockdev/index.rst | 6
-rw-r--r-- Documentation/admin-guide/blockdev/paride.rst | 2
-rw-r--r-- Documentation/admin-guide/blockdev/ramdisk.rst | 66
-rw-r--r-- Documentation/admin-guide/blockdev/zram.rst | 41
-rw-r--r-- Documentation/admin-guide/bootconfig.rst | 127
-rw-r--r-- Documentation/admin-guide/bug-bisect.rst | 2
-rw-r--r-- Documentation/admin-guide/bug-hunting.rst | 53
-rw-r--r-- Documentation/admin-guide/cgroup-v1/blkio-controller.rst | 155
-rw-r--r-- Documentation/admin-guide/cgroup-v1/cpusets.rst | 13
-rw-r--r-- Documentation/admin-guide/cgroup-v1/hugetlb.rst | 107
-rw-r--r-- Documentation/admin-guide/cgroup-v1/index.rst | 3
-rw-r--r-- Documentation/admin-guide/cgroup-v1/memcg_test.rst | 25
-rw-r--r-- Documentation/admin-guide/cgroup-v1/memory.rst | 118
-rw-r--r-- Documentation/admin-guide/cgroup-v1/misc.rst | 4
-rw-r--r-- Documentation/admin-guide/cgroup-v1/rdma.rst | 2
-rw-r--r-- Documentation/admin-guide/cgroup-v2.rst | 688
-rw-r--r-- Documentation/admin-guide/cifs/authors.rst | 6
-rw-r--r-- Documentation/admin-guide/cifs/changes.rst | 5
-rw-r--r-- Documentation/admin-guide/cifs/introduction.rst | 30
-rw-r--r-- Documentation/admin-guide/cifs/todo.rst | 36
-rw-r--r-- Documentation/admin-guide/cifs/usage.rst | 25
-rwxr-xr-x Documentation/admin-guide/cifs/winucase_convert.pl | 2
-rw-r--r-- Documentation/admin-guide/cpu-load.rst | 67
-rw-r--r-- Documentation/admin-guide/cputopology.rst | 124
-rw-r--r-- Documentation/admin-guide/dell_rbu.rst | 2
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-crypt.rst | 14
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-dust.rst | 32
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-ebs.rst | 51
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-ima.rst | 715
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-integrity.rst | 50
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-raid.rst | 4
-rw-r--r-- Documentation/admin-guide/device-mapper/dm-zoned.rst | 68
-rw-r--r-- Documentation/admin-guide/device-mapper/index.rst | 2
-rw-r--r-- Documentation/admin-guide/device-mapper/verity.rst | 17
-rw-r--r-- Documentation/admin-guide/device-mapper/writecache.rst | 45
-rw-r--r-- Documentation/admin-guide/devices.rst | 9
-rw-r--r-- Documentation/admin-guide/devices.txt | 41
-rw-r--r-- Documentation/admin-guide/dynamic-debug-howto.rst | 271
-rw-r--r-- Documentation/admin-guide/edid.rst | 60
-rw-r--r-- Documentation/admin-guide/efi-stub.rst | 4
-rw-r--r-- Documentation/admin-guide/ext4.rst | 36
-rw-r--r-- Documentation/admin-guide/features.rst | 3
-rw-r--r-- Documentation/admin-guide/filesystem-monitoring.rst | 78
-rw-r--r-- Documentation/admin-guide/gpio/gpio-aggregator.rst | 111
-rw-r--r-- Documentation/admin-guide/gpio/gpio-mockup.rst | 51
-rw-r--r-- Documentation/admin-guide/gpio/gpio-sim.rst | 134
-rw-r--r-- Documentation/admin-guide/gpio/index.rst | 3
-rw-r--r-- Documentation/admin-guide/hw-vuln/core-scheduling.rst | 226
-rw-r--r-- Documentation/admin-guide/hw-vuln/index.rst | 4
-rw-r--r-- Documentation/admin-guide/hw-vuln/l1d_flush.rst | 69
-rw-r--r-- Documentation/admin-guide/hw-vuln/l1tf.rst | 2
-rw-r--r-- Documentation/admin-guide/hw-vuln/multihit.rst | 4
-rw-r--r-- Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst | 260
-rw-r--r-- Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst | 150
-rw-r--r-- Documentation/admin-guide/hw-vuln/spectre.rst | 120
-rw-r--r-- Documentation/admin-guide/hw-vuln/tsx_async_abort.rst | 4
-rw-r--r-- Documentation/admin-guide/index.rst | 12
-rw-r--r-- Documentation/admin-guide/init.rst | 76
-rw-r--r-- Documentation/admin-guide/initrd.rst | 2
-rw-r--r-- Documentation/admin-guide/iostats.rst | 11
-rw-r--r-- Documentation/admin-guide/kdump/gdbmacros.txt | 159
-rw-r--r-- Documentation/admin-guide/kdump/kdump.rst | 195
-rw-r--r-- Documentation/admin-guide/kdump/vmcoreinfo.rst | 167
-rw-r--r-- Documentation/admin-guide/kernel-parameters.rst | 31
-rw-r--r-- Documentation/admin-guide/kernel-parameters.txt | 2358
-rw-r--r-- Documentation/admin-guide/kernel-per-CPU-kthreads.rst | 28
-rw-r--r-- Documentation/admin-guide/laptops/disk-shock-protection.rst | 2
-rw-r--r-- Documentation/admin-guide/laptops/laptop-mode.rst | 11
-rw-r--r-- Documentation/admin-guide/laptops/lg-laptop.rst | 6
-rw-r--r-- Documentation/admin-guide/laptops/sonypi.rst | 2
-rw-r--r-- Documentation/admin-guide/laptops/thinkpad-acpi.rst | 95
-rw-r--r-- Documentation/admin-guide/lockup-watchdogs.rst | 4
-rw-r--r-- Documentation/admin-guide/md.rst | 8
-rw-r--r-- Documentation/admin-guide/media/au0828-cardlist.rst | 39
-rw-r--r-- Documentation/admin-guide/media/avermedia.rst | 94
-rw-r--r-- Documentation/admin-guide/media/bt8xx.rst | 157
-rw-r--r-- Documentation/admin-guide/media/bttv-cardlist.rst | 683
-rw-r--r-- Documentation/admin-guide/media/bttv.rst | 1762
-rw-r--r-- Documentation/admin-guide/media/building.rst | 357
-rw-r--r-- Documentation/admin-guide/media/cafe_ccic.rst | 62
-rw-r--r-- Documentation/admin-guide/media/cardlist.rst | 29
-rw-r--r-- Documentation/admin-guide/media/cec-drivers.rst | 10
-rw-r--r-- Documentation/admin-guide/media/ci.rst | 77
-rw-r--r-- Documentation/admin-guide/media/cpia2.rst | 145
-rw-r--r-- Documentation/admin-guide/media/cx18-cardlist.rst | 17
-rw-r--r-- Documentation/admin-guide/media/cx231xx-cardlist.rst | 99
-rw-r--r-- Documentation/admin-guide/media/cx23885-cardlist.rst | 267
-rw-r--r-- Documentation/admin-guide/media/cx88-cardlist.rst | 383
-rw-r--r-- Documentation/admin-guide/media/cx88.rst | 58
-rw-r--r-- Documentation/admin-guide/media/davinci-vpbe.rst | 65
-rw-r--r-- Documentation/admin-guide/media/dvb-drivers.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-a800-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-af9005-cardlist.rst | 20
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-af9015-cardlist.rst | 80
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-af9035-cardlist.rst | 74
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-anysee-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-au6610-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-az6007-cardlist.rst | 20
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-az6027-cardlist.rst | 24
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-ce6230-cardlist.rst | 18
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-cinergyT2-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-cxusb-cardlist.rst | 40
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-dib0700-cardlist.rst | 162
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-dibusb-mb-cardlist.rst | 42
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-dibusb-mc-cardlist.rst | 30
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-digitv-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-dtt200u-cardlist.rst | 22
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-dtv5100-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-dvbsky-cardlist.rst | 42
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-dw2102-cardlist.rst | 56
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-ec168-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-gl861-cardlist.rst | 20
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-gp8psk-cardlist.rst | 22
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-lmedm04-cardlist.rst | 20
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-m920x-cardlist.rst | 26
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-mxl111sf-cardlist.rst | 36
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-nova-t-usb2-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-opera1-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-pctv452e-cardlist.rst | 20
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-rtl28xxu-cardlist.rst | 80
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-technisat-usb2-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-ttusb2-cardlist.rst | 24
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-umt-010-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-vp702x-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-vp7045-cardlist.rst | 18
-rw-r--r-- Documentation/admin-guide/media/dvb-usb-zd1301-cardlist.rst | 16
-rw-r--r-- Documentation/admin-guide/media/dvb.rst | 12
-rw-r--r-- Documentation/admin-guide/media/dvb_intro.rst | 616
-rw-r--r-- Documentation/admin-guide/media/dvb_references.rst | 29
-rw-r--r-- Documentation/admin-guide/media/em28xx-cardlist.rst | 440
-rw-r--r-- Documentation/admin-guide/media/faq.rst | 216
-rw-r--r-- Documentation/admin-guide/media/fimc.rst | 153
-rw-r--r-- Documentation/admin-guide/media/frontend-cardlist.rst | 226
-rw-r--r-- Documentation/admin-guide/media/gspca-cardlist.rst | 451
-rw-r--r-- Documentation/admin-guide/media/i2c-cardlist.rst | 296
-rw-r--r-- Documentation/admin-guide/media/imx.rst | 714
-rw-r--r-- Documentation/admin-guide/media/imx6q-sabreauto.dot | 51
-rw-r--r-- Documentation/admin-guide/media/imx6q-sabresd.dot | 56
-rw-r--r-- Documentation/admin-guide/media/imx7.rst | 221
-rw-r--r-- Documentation/admin-guide/media/index.rst | 63
-rw-r--r-- Documentation/admin-guide/media/intro.rst | 27
-rw-r--r-- Documentation/admin-guide/media/ipu3.rst | 600
-rw-r--r-- Documentation/admin-guide/media/ipu3_rcb.svg | 331
-rw-r--r-- Documentation/admin-guide/media/ivtv-cardlist.rst | 139
-rw-r--r-- Documentation/admin-guide/media/ivtv.rst | 218
-rw-r--r-- Documentation/admin-guide/media/lmedm04.rst | 107
-rw-r--r-- Documentation/admin-guide/media/meye.rst | 93
-rw-r--r-- Documentation/admin-guide/media/misc-cardlist.rst | 28
-rw-r--r-- Documentation/admin-guide/media/omap3isp.rst | 92
-rw-r--r-- Documentation/admin-guide/media/omap4_camera.rst | 62
-rw-r--r-- Documentation/admin-guide/media/opera-firmware.rst | 33
-rw-r--r-- Documentation/admin-guide/media/other-usb-cardlist.rst | 92
-rw-r--r-- Documentation/admin-guide/media/pci-cardlist.rst | 109
-rw-r--r-- Documentation/admin-guide/media/philips.rst | 247
-rw-r--r-- Documentation/admin-guide/media/platform-cardlist.rst | 91
-rw-r--r-- Documentation/admin-guide/media/pulse8-cec.rst | 13
-rw-r--r-- Documentation/admin-guide/media/qcom_camss.rst | 185
-rw-r--r-- Documentation/admin-guide/media/qcom_camss_8x96_graph.dot | 106
-rw-r--r-- Documentation/admin-guide/media/qcom_camss_graph.dot | 43
-rw-r--r-- Documentation/admin-guide/media/radio-cardlist.rst | 44
-rw-r--r-- Documentation/admin-guide/media/rcar-fdp1.rst | 39
-rw-r--r-- Documentation/admin-guide/media/remote-controller.rst | 76
-rw-r--r-- Documentation/admin-guide/media/rkisp1.dot | 18
-rw-r--r-- Documentation/admin-guide/media/rkisp1.rst | 197
-rw-r--r-- Documentation/admin-guide/media/saa7134-cardlist.rst | 803
-rw-r--r-- Documentation/admin-guide/media/saa7134.rst | 89
-rw-r--r-- Documentation/admin-guide/media/saa7164-cardlist.rst | 71
-rw-r--r-- Documentation/admin-guide/media/si470x.rst | 167
-rw-r--r-- Documentation/admin-guide/media/si4713.rst | 192
-rw-r--r-- Documentation/admin-guide/media/si476x.rst | 160
-rw-r--r-- Documentation/admin-guide/media/siano-cardlist.rst | 56
-rw-r--r-- Documentation/admin-guide/media/technisat.rst | 100
-rw-r--r-- Documentation/admin-guide/media/tm6000-cardlist.rst | 83
-rw-r--r-- Documentation/admin-guide/media/ttusb-dec.rst | 45
-rw-r--r-- Documentation/admin-guide/media/tuner-cardlist.rst | 100
-rw-r--r-- Documentation/admin-guide/media/usb-cardlist.rst | 156
-rw-r--r-- Documentation/admin-guide/media/v4l-drivers.rst | 34
-rw-r--r-- Documentation/admin-guide/media/vimc.dot | 26
-rw-r--r-- Documentation/admin-guide/media/vimc.rst | 110
-rw-r--r-- Documentation/admin-guide/media/vivid.rst | 1416
-rw-r--r-- Documentation/admin-guide/media/zoran-cardlist.rst | 51
-rw-r--r-- Documentation/admin-guide/media/zr364xx.rst | 102
-rw-r--r-- Documentation/admin-guide/mm/cma_debugfs.rst | 10
-rw-r--r-- Documentation/admin-guide/mm/concepts.rst | 6
-rw-r--r-- Documentation/admin-guide/mm/damon/index.rst | 17
-rw-r--r-- Documentation/admin-guide/mm/damon/lru_sort.rst | 294
-rw-r--r-- Documentation/admin-guide/mm/damon/reclaim.rst | 265
-rw-r--r-- Documentation/admin-guide/mm/damon/start.rst | 127
-rw-r--r-- Documentation/admin-guide/mm/damon/usage.rst | 702
-rw-r--r-- Documentation/admin-guide/mm/hugetlbpage.rst | 99
-rw-r--r-- Documentation/admin-guide/mm/index.rst | 10
-rw-r--r-- Documentation/admin-guide/mm/ksm.rst | 58
-rw-r--r-- Documentation/admin-guide/mm/memory-hotplug.rst | 889
-rw-r--r-- Documentation/admin-guide/mm/multigen_lru.rst | 162
-rw-r--r-- Documentation/admin-guide/mm/nommu-mmap.rst | 283
-rw-r--r-- Documentation/admin-guide/mm/numa_memory_policy.rst | 41
-rw-r--r-- Documentation/admin-guide/mm/numaperf.rst | 14
-rw-r--r-- Documentation/admin-guide/mm/pagemap.rst | 79
-rw-r--r-- Documentation/admin-guide/mm/shrinker_debugfs.rst | 135
-rw-r--r-- Documentation/admin-guide/mm/swap_numa.rst | 80
-rw-r--r-- Documentation/admin-guide/mm/transhuge.rst | 57
-rw-r--r-- Documentation/admin-guide/mm/userfaultfd.rst | 307
-rw-r--r-- Documentation/admin-guide/mm/zswap.rst | 168
-rw-r--r-- Documentation/admin-guide/module-signing.rst | 2
-rw-r--r-- Documentation/admin-guide/mono.rst | 4
-rw-r--r-- Documentation/admin-guide/nfs/fault_injection.rst | 70
-rw-r--r-- Documentation/admin-guide/nfs/index.rst | 1
-rw-r--r-- Documentation/admin-guide/nfs/nfs-client.rst | 19
-rw-r--r-- Documentation/admin-guide/nfs/nfs-rdma.rst | 2
-rw-r--r-- Documentation/admin-guide/nfs/nfsroot.rst | 8
-rw-r--r-- Documentation/admin-guide/nfs/pnfs-block-server.rst | 2
-rw-r--r-- Documentation/admin-guide/nfs/pnfs-scsi-server.rst | 2
-rw-r--r-- Documentation/admin-guide/numastat.rst | 31
-rw-r--r-- Documentation/admin-guide/perf-security.rst | 155
-rw-r--r-- Documentation/admin-guide/perf/alibaba_pmu.rst | 100
-rw-r--r-- Documentation/admin-guide/perf/arm-ccn.rst | 2
-rw-r--r-- Documentation/admin-guide/perf/arm-cmn.rst | 65
-rw-r--r-- Documentation/admin-guide/perf/hisi-pcie-pmu.rst | 106
-rw-r--r-- Documentation/admin-guide/perf/hisi-pmu.rst | 54
-rw-r--r-- Documentation/admin-guide/perf/hns3-pmu.rst | 136
-rw-r--r-- Documentation/admin-guide/perf/imx-ddr.rst | 5
-rw-r--r-- Documentation/admin-guide/perf/index.rst | 4
-rw-r--r-- Documentation/admin-guide/pm/amd-pstate.rst | 483
-rw-r--r-- Documentation/admin-guide/pm/cpufreq.rst | 17
-rw-r--r-- Documentation/admin-guide/pm/cpufreq_drivers.rst | 274
-rw-r--r-- Documentation/admin-guide/pm/cpuidle.rst | 200
-rw-r--r-- Documentation/admin-guide/pm/intel-speed-select.rst | 939
-rw-r--r-- Documentation/admin-guide/pm/intel_idle.rst | 16
-rw-r--r-- Documentation/admin-guide/pm/intel_pstate.rst | 139
-rw-r--r-- Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst | 60
-rw-r--r-- Documentation/admin-guide/pm/suspend-flows.rst | 270
-rw-r--r-- Documentation/admin-guide/pm/system-wide.rst | 1
-rw-r--r-- Documentation/admin-guide/pm/working-state.rst | 4
-rw-r--r-- Documentation/admin-guide/pnp.rst | 4
-rw-r--r-- Documentation/admin-guide/pstore-blk.rst | 234
-rw-r--r-- Documentation/admin-guide/ramoops.rst | 22
-rw-r--r-- Documentation/admin-guide/ras.rst | 28
-rw-r--r-- Documentation/admin-guide/reporting-bugs.rst | 182
-rw-r--r-- Documentation/admin-guide/reporting-issues.rst | 1764
-rw-r--r-- Documentation/admin-guide/reporting-regressions.rst | 451
-rw-r--r-- Documentation/admin-guide/security-bugs.rst | 9
-rw-r--r-- Documentation/admin-guide/serial-console.rst | 2
-rw-r--r-- Documentation/admin-guide/spkguide.txt | 1620
-rw-r--r-- Documentation/admin-guide/svga.rst | 7
-rw-r--r-- Documentation/admin-guide/syscall-user-dispatch.rst | 90
-rw-r--r-- Documentation/admin-guide/sysctl/abi.rst | 73
-rw-r--r-- Documentation/admin-guide/sysctl/fs.rst | 6
-rw-r--r-- Documentation/admin-guide/sysctl/kernel.rst | 1461
-rw-r--r-- Documentation/admin-guide/sysctl/net.rst | 105
-rw-r--r-- Documentation/admin-guide/sysctl/user.rst | 6
-rw-r--r-- Documentation/admin-guide/sysctl/vm.rst | 176
-rw-r--r-- Documentation/admin-guide/sysrq.rst | 48
-rw-r--r-- Documentation/admin-guide/tainted-kernels.rst | 36
-rw-r--r-- Documentation/admin-guide/thunderbolt.rst | 63
-rw-r--r-- Documentation/admin-guide/unicode.rst | 4
-rw-r--r-- Documentation/admin-guide/wimax/i2400m.rst | 283
-rw-r--r-- Documentation/admin-guide/wimax/index.rst | 19
-rw-r--r-- Documentation/admin-guide/wimax/wimax.rst | 89
-rw-r--r-- Documentation/admin-guide/xfs.rst | 80
284 files changed, 33553 insertions, 3968 deletions
diff --git a/Documentation/admin-guide/LSM/LoadPin.rst b/Documentation/admin-guide/LSM/LoadPin.rst
index 716ad9b23c9a..dd3ca68b5df1 100644
--- a/Documentation/admin-guide/LSM/LoadPin.rst
+++ b/Documentation/admin-guide/LSM/LoadPin.rst
@@ -11,8 +11,8 @@ restrictions without needing to sign the files individually.
The LSM is selectable at build-time with ``CONFIG_SECURITY_LOADPIN``, and
can be controlled at boot-time with the kernel command line option
-"``loadpin.enabled``". By default, it is enabled, but can be disabled at
-boot ("``loadpin.enabled=0``").
+"``loadpin.enforce``". By default, it is enabled, but can be disabled at
+boot ("``loadpin.enforce=0``").
LoadPin starts pinning when it sees the first file loaded. If the
block device backing the filesystem is not read-only, a sysctl is
@@ -28,4 +28,4 @@ different mechanisms such as ``CONFIG_MODULE_SIG`` and
``CONFIG_KEXEC_VERIFY_SIG`` to verify kernel module and kernel image while
still use LoadPin to protect the integrity of other files kernel loads. The
full list of valid file types can be found in ``kernel_read_file_str``
-defined in ``include/linux/fs.h``.
+defined in ``include/linux/kernel_read_file.h``.
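Annotation (not part of the patch): the LoadPin hunk above renames the boot option to ``loadpin.enforce``. To check which value a given kernel command line carries, something like the following sketch may help. The function name ``loadpin_enforce_setting`` is hypothetical, and the fallback value of 1 only mirrors the "enabled by default" wording in the text above; it does not query the kernel.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): report the value of
# loadpin.enforce found in a kernel command line string.
# $1 = command line text, e.g. the contents of /proc/cmdline
loadpin_enforce_setting() (
    set -f                      # no filename expansion while word-splitting
    setting=1                   # assumed default, per the documentation text
    for word in $1; do
        case "$word" in
            loadpin.enforce=*) setting=${word#loadpin.enforce=} ;;
        esac
    done
    echo "$setting"
)

# Example with a made-up command line; on a live system one could pass
# "$(cat /proc/cmdline)" instead.
loadpin_enforce_setting "root=/dev/sda1 ro loadpin.enforce=0 quiet"
```

The function body runs in a subshell so that ``set -f`` does not leak into the caller's shell.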
diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
index 7bff07ce4fdd..0ec34863c674 100644
--- a/Documentation/admin-guide/LSM/SafeSetID.rst
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -3,9 +3,9 @@ SafeSetID
=========
SafeSetID is an LSM module that gates the setid family of syscalls to restrict
UID/GID transitions from a given UID/GID to only those approved by a
-system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+system-wide allowlist. These restrictions also prohibit the given UIDs/GIDs
from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
-allowing a user to set up user namespace UID mappings.
+allowing a user to set up user namespace UID/GID mappings.
Background
@@ -98,10 +98,21 @@ Directions for use
==================
This LSM hooks the setid syscalls to make sure transitions are allowed if an
applicable restriction policy is in place. Policies are configured through
-securityfs by writing to the safesetid/add_whitelist_policy and
-safesetid/flush_whitelist_policies files at the location where securityfs is
-mounted. The format for adding a policy is '<UID>:<UID>', using literal
-numbers, such as '123:456'. To flush the policies, any write to the file is
-sufficient. Again, configuring a policy for a UID will prevent that UID from
-obtaining auxiliary setid privileges, such as allowing a user to set up user
-namespace UID mappings.
+securityfs by writing to the safesetid/uid_allowlist_policy and
+safesetid/gid_allowlist_policy files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>' or '<GID>:<GID>',
+using literal numbers, and ending with a newline character such as '123:456\n'.
+Writing an empty string "" will flush the policy. Again, configuring a policy
+for a UID/GID will prevent that UID/GID from obtaining auxiliary setid
+privileges, such as allowing a user to set up user namespace UID/GID mappings.
+
+Note on GID policies and setgroups()
+====================================
+In v5.9 we are adding support for limiting CAP_SETGID privileges as was done
+previously for CAP_SETUID. However, for compatibility with common sandboxing
+related code conventions in userspace, we currently allow arbitrary
+setgroups() calls for processes with CAP_SETGID restrictions. Until we add
+support in a future release for restricting setgroups() calls, these GID
+policies add no meaningful security. setgroups() restrictions will be enforced
+once we have the policy checking code in place, which will rely on GID policy
+configuration code added in v5.9.
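Annotation (not part of the patch): the policy-writing step described in the SafeSetID hunk above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the function name ``add_setid_policy`` is hypothetical, and ``/sys/kernel/security`` is only the conventional securityfs mount point, so check where securityfs is mounted on your system. The policy file is passed as a parameter so the helper can be tried against an ordinary file first.

```shell
#!/bin/sh
# Hypothetical helper: write one '<from>:<to>' policy line, newline-terminated
# as the documentation above requires.
# $1 = policy file, e.g.
#      /sys/kernel/security/safesetid/uid_allowlist_policy
#      (mount point assumed; file name taken from the documentation above)
# $2 = source UID/GID, $3 = target UID/GID (literal numbers)
add_setid_policy() {
    policy_file=$1
    from_id=$2
    to_id=$3
    # Reject anything that is not a plain decimal number.
    for id in "$from_id" "$to_id"; do
        case "$id" in
            ''|*[!0-9]*) echo "IDs must be literal numbers" >&2; return 1 ;;
        esac
    done
    printf '%s:%s\n' "$from_id" "$to_id" > "$policy_file"
}

# Usage (as root, with the assumed mount point):
#   add_setid_policy /sys/kernel/security/safesetid/uid_allowlist_policy 123 456
```

Validating the IDs before the redirect means a malformed request never touches the policy file.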
diff --git a/Documentation/admin-guide/LSM/Yama.rst b/Documentation/admin-guide/LSM/Yama.rst
index d0a060de3973..d9cd937ebd2d 100644
--- a/Documentation/admin-guide/LSM/Yama.rst
+++ b/Documentation/admin-guide/LSM/Yama.rst
@@ -19,9 +19,10 @@ attach to other running processes (e.g. Firefox, SSH sessions, GPG agent,
etc) to extract additional credentials and continue to expand the scope
of their attack without resorting to user-assisted phishing.
-This is not a theoretical problem. SSH session hijacking
-(http://www.storm.net.nz/projects/7) and arbitrary code injection
-(http://c-skills.blogspot.com/2007/05/injectso.html) attacks already
+This is not a theoretical problem. `SSH session hijacking
+<https://www.blackhat.com/presentations/bh-usa-05/bh-us-05-boileau.pdf>`_
+and `arbitrary code injection
+<https://c-skills.blogspot.com/2007/05/injectso.html>`_ attacks already
exist and remain possible if ptrace is allowed to operate as before.
Since ptrace is not commonly used by non-developers and non-admins, system
builders should be allowed the option to disable this debugging system.
diff --git a/Documentation/admin-guide/LSM/tomoyo.rst b/Documentation/admin-guide/LSM/tomoyo.rst
index e2d6b6e15082..4bc9c2b4da6f 100644
--- a/Documentation/admin-guide/LSM/tomoyo.rst
+++ b/Documentation/admin-guide/LSM/tomoyo.rst
@@ -27,29 +27,29 @@ Where is documentation?
=======================
User <-> Kernel interface documentation is available at
-http://tomoyo.osdn.jp/2.5/policy-specification/index.html .
+https://tomoyo.osdn.jp/2.5/policy-specification/index.html .
Materials we prepared for seminars and symposiums are available at
-http://osdn.jp/projects/tomoyo/docs/?category_id=532&language_id=1 .
+https://osdn.jp/projects/tomoyo/docs/?category_id=532&language_id=1 .
Below lists are chosen from three aspects.
What is TOMOYO?
TOMOYO Linux Overview
- http://osdn.jp/projects/tomoyo/docs/lca2009-takeda.pdf
+ https://osdn.jp/projects/tomoyo/docs/lca2009-takeda.pdf
TOMOYO Linux: pragmatic and manageable security for Linux
- http://osdn.jp/projects/tomoyo/docs/freedomhectaipei-tomoyo.pdf
+ https://osdn.jp/projects/tomoyo/docs/freedomhectaipei-tomoyo.pdf
TOMOYO Linux: A Practical Method to Understand and Protect Your Own Linux Box
- http://osdn.jp/projects/tomoyo/docs/PacSec2007-en-no-demo.pdf
+ https://osdn.jp/projects/tomoyo/docs/PacSec2007-en-no-demo.pdf
What can TOMOYO do?
Deep inside TOMOYO Linux
- http://osdn.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf
+ https://osdn.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf
The role of "pathname based access control" in security.
- http://osdn.jp/projects/tomoyo/docs/lfj2008-bof.pdf
+ https://osdn.jp/projects/tomoyo/docs/lfj2008-bof.pdf
History of TOMOYO?
Realities of Mainlining
- http://osdn.jp/projects/tomoyo/docs/lfj2008.pdf
+ https://osdn.jp/projects/tomoyo/docs/lfj2008.pdf
What is future plan?
====================
diff --git a/Documentation/admin-guide/README.rst b/Documentation/admin-guide/README.rst
index cc6151fc0845..9a969c0157f1 100644
--- a/Documentation/admin-guide/README.rst
+++ b/Documentation/admin-guide/README.rst
@@ -1,9 +1,9 @@
.. _readme:
-Linux kernel release 5.x <http://kernel.org/>
+Linux kernel release 6.x <http://kernel.org/>
=============================================
-These are the release notes for Linux version 5. Read them carefully,
+These are the release notes for Linux version 6. Read them carefully,
as they tell you what this is all about, explain how to install the
kernel, and what to do if something goes wrong.
@@ -63,7 +63,7 @@ Installing the kernel source
directory where you have permissions (e.g. your home directory) and
unpack it::
- xz -cd linux-5.x.tar.xz | tar xvf -
+ xz -cd linux-6.x.tar.xz | tar xvf -
Replace "X" with the version number of the latest kernel.
@@ -72,12 +72,12 @@ Installing the kernel source
files. They should match the library, and not get messed up by
whatever the kernel-du-jour happens to be.
- - You can also upgrade between 5.x releases by patching. Patches are
+ - You can also upgrade between 6.x releases by patching. Patches are
distributed in the xz format. To install by patching, get all the
newer patch files, enter the top level directory of the kernel source
- (linux-5.x) and execute::
+ (linux-6.x) and execute::
- xz -cd ../patch-5.x.xz | patch -p1
+ xz -cd ../patch-6.x.xz | patch -p1
Replace "x" for all versions bigger than the version "x" of your current
source tree, **in_order**, and you should be ok. You may want to remove
@@ -85,13 +85,13 @@ Installing the kernel source
that there are no failed patches (some-file-name# or some-file-name.rej).
If there are, either you or I have made a mistake.
- Unlike patches for the 5.x kernels, patches for the 5.x.y kernels
+ Unlike patches for the 6.x kernels, patches for the 6.x.y kernels
(also known as the -stable kernels) are not incremental but instead apply
- directly to the base 5.x kernel. For example, if your base kernel is 5.0
- and you want to apply the 5.0.3 patch, you must not first apply the 5.0.1
- and 5.0.2 patches. Similarly, if you are running kernel version 5.0.2 and
- want to jump to 5.0.3, you must first reverse the 5.0.2 patch (that is,
- patch -R) **before** applying the 5.0.3 patch. You can read more on this in
+ directly to the base 6.x kernel. For example, if your base kernel is 6.0
+ and you want to apply the 6.0.3 patch, you must not first apply the 6.0.1
+ and 6.0.2 patches. Similarly, if you are running kernel version 6.0.2 and
+ want to jump to 6.0.3, you must first reverse the 6.0.2 patch (that is,
+ patch -R) **before** applying the 6.0.3 patch. You can read more on this in
:ref:`Documentation/process/applying-patches.rst <applying_patches>`.
Alternatively, the script patch-kernel can be used to automate this
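Annotation (not part of the patch): the rule in the hunk above, that -stable patches apply to the base release rather than to each other, can be sketched as a small helper that prints the commands for a given move. The function name ``stable_upgrade_cmds`` is hypothetical, and the ``../patch-<version>.xz`` naming simply follows the README's own examples. The helper only prints the commands; it does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands needed to move a source
# tree from release $1 to stable release $2.
# Assumes patch files sit one directory up, named patch-<version>.xz.
stable_upgrade_cmds() {
    current=$1
    target=$2
    # A stable release such as 6.0.2 has two dots: its patch must be
    # reversed first. A base release such as 6.0 needs no revert.
    case "$current" in
        *.*.*) echo "xz -cd ../patch-$current.xz | patch -R -p1" ;;
    esac
    echo "xz -cd ../patch-$target.xz | patch -p1"
}

stable_upgrade_cmds 6.0.2 6.0.3
```

For the 6.0.2 to 6.0.3 case this prints the reverse step followed by the apply step; starting from a plain 6.0 tree it prints only the apply step, matching the README's statement that 6.0.1 and 6.0.2 must not be applied first.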
@@ -114,7 +114,7 @@ Installing the kernel source
Software requirements
---------------------
- Compiling and running the 5.x kernels requires up-to-date
+ Compiling and running the 6.x kernels requires up-to-date
versions of various software packages. Consult
:ref:`Documentation/process/changes.rst <changes>` for the minimum version numbers
required and how to get updates for these packages. Beware that using
@@ -132,12 +132,12 @@ Build directory for the kernel
place for the output files (including .config).
Example::
- kernel source code: /usr/src/linux-5.x
+ kernel source code: /usr/src/linux-6.x
build directory: /home/name/build/kernel
To configure and build the kernel, use::
- cd /usr/src/linux-5.x
+ cd /usr/src/linux-6.x
make O=/home/name/build/kernel menuconfig
make O=/home/name/build/kernel
sudo make O=/home/name/build/kernel modules_install install
@@ -209,20 +209,28 @@ Configuring the kernel
store the lsmod of that machine into a file
and pass it in as a LSMOD parameter.
+ Also, you can preserve modules in certain folders
+ or kconfig files by specifying their paths in
+ parameter LMC_KEEP.
+
target$ lsmod > /tmp/mylsmod
target$ scp /tmp/mylsmod host:/tmp
- host$ make LSMOD=/tmp/mylsmod localmodconfig
+ host$ make LSMOD=/tmp/mylsmod \
+ LMC_KEEP="drivers/usb:drivers/gpu:fs" \
+ localmodconfig
The above also works when cross compiling.
"make localyesconfig" Similar to localmodconfig, except it will convert
- all module options to built in (=y) options.
+ all module options to built in (=y) options. You can
+ also preserve modules by LMC_KEEP.
- "make kvmconfig" Enable additional options for kvm guest kernel support.
+ "make kvm_guest.config" Enable additional options for kvm guest kernel
+ support.
- "make xenconfig" Enable additional options for xen dom0 guest kernel
- support.
+ "make xen.config" Enable additional options for xen dom0 guest kernel
+ support.
"make tinyconfig" Configure the tiniest possible kernel.
@@ -251,11 +259,9 @@ Configuring the kernel
Compiling the kernel
--------------------
- - Make sure you have at least gcc 4.6 available.
+ - Make sure you have at least gcc 5.1 available.
For more information, refer to :ref:`Documentation/process/changes.rst <changes>`.
- Please note that you can still run a.out user programs with this kernel.
-
- Do a ``make`` to create a compressed kernel image. It is also
possible to do ``make install`` if you have lilo installed to suit the
kernel makefiles, but you may want to check your particular lilo setup first.
@@ -315,94 +321,19 @@ Compiling the kernel
reboot, and enjoy!
If you ever need to change the default root device, video mode,
- ramdisk size, etc. in the kernel image, use the ``rdev`` program (or
- alternatively the LILO boot options when appropriate). No need to
- recompile the kernel to change these parameters.
+ etc. in the kernel image, use your bootloader's boot options
+ where appropriate. No need to recompile the kernel to change
+ these parameters.
- Reboot with the new kernel and enjoy.
If something goes wrong
-----------------------
- - If you have problems that seem to be due to kernel bugs, please check
- the file MAINTAINERS to see if there is a particular person associated
- with the part of the kernel that you are having trouble with. If there
- isn't anyone listed there, then the second best thing is to mail
- them to me (torvalds@linux-foundation.org), and possibly to any other
- relevant mailing-list or to the newsgroup.
-
- - In all bug-reports, *please* tell what kernel you are talking about,
- how to duplicate the problem, and what your setup is (use your common
- sense). If the problem is new, tell me so, and if the problem is
- old, please try to tell me when you first noticed it.
-
- - If the bug results in a message like::
-
- unable to handle kernel paging request at address C0000010
- Oops: 0002
- EIP: 0010:XXXXXXXX
- eax: xxxxxxxx ebx: xxxxxxxx ecx: xxxxxxxx edx: xxxxxxxx
- esi: xxxxxxxx edi: xxxxxxxx ebp: xxxxxxxx
- ds: xxxx es: xxxx fs: xxxx gs: xxxx
- Pid: xx, process nr: xx
- xx xx xx xx xx xx xx xx xx xx
-
- or similar kernel debugging information on your screen or in your
- system log, please duplicate it *exactly*. The dump may look
- incomprehensible to you, but it does contain information that may
- help debugging the problem. The text above the dump is also
- important: it tells something about why the kernel dumped code (in
- the above example, it's due to a bad kernel pointer). More information
- on making sense of the dump is in Documentation/admin-guide/bug-hunting.rst
-
- - If you compiled the kernel with CONFIG_KALLSYMS you can send the dump
- as is, otherwise you will have to use the ``ksymoops`` program to make
- sense of the dump (but compiling with CONFIG_KALLSYMS is usually preferred).
- This utility can be downloaded from
- https://www.kernel.org/pub/linux/utils/kernel/ksymoops/ .
- Alternatively, you can do the dump lookup by hand:
-
- - In debugging dumps like the above, it helps enormously if you can
- look up what the EIP value means. The hex value as such doesn't help
- me or anybody else very much: it will depend on your particular
- kernel setup. What you should do is take the hex value from the EIP
- line (ignore the ``0010:``), and look it up in the kernel namelist to
- see which kernel function contains the offending address.
-
- To find out the kernel function name, you'll need to find the system
- binary associated with the kernel that exhibited the symptom. This is
- the file 'linux/vmlinux'. To extract the namelist and match it against
- the EIP from the kernel crash, do::
-
- nm vmlinux | sort | less
-
- This will give you a list of kernel addresses sorted in ascending
- order, from which it is simple to find the function that contains the
- offending address. Note that the address given by the kernel
- debugging messages will not necessarily match exactly with the
- function addresses (in fact, that is very unlikely), so you can't
- just 'grep' the list: the list will, however, give you the starting
- point of each kernel function, so by looking for the function that
- has a starting address lower than the one you are searching for but
- is followed by a function with a higher address you will find the one
- you want. In fact, it may be a good idea to include a bit of
- "context" in your problem report, giving a few lines around the
- interesting one.
-
- If you for some reason cannot do the above (you have a pre-compiled
- kernel image or similar), telling me as much about your setup as
- possible will help. Please read the :ref:`admin-guide/reporting-bugs.rst <reportingbugs>`
- document for details.
-
- - Alternatively, you can use gdb on a running kernel. (read-only; i.e. you
- cannot change values or set break points.) To do this, first compile the
- kernel with -g; edit arch/x86/Makefile appropriately, then do a ``make
- clean``. You'll also need to enable CONFIG_PROC_FS (via ``make config``).
-
- After you've rebooted with the new kernel, do ``gdb vmlinux /proc/kcore``.
- You can now use all the usual gdb commands. The command to look up the
- point where your system crashed is ``l *0xXXXXXXXX``. (Replace the XXXes
- with the EIP value.)
-
- gdb'ing a non-running kernel currently fails because ``gdb`` (wrongly)
- disregards the starting offset for which the kernel is compiled.
+If you have problems that seem to be due to kernel bugs, please follow the
+instructions at 'Documentation/admin-guide/reporting-issues.rst'.
+
+Hints on understanding kernel bug reports are in
+'Documentation/admin-guide/bug-hunting.rst'. More on debugging the kernel
+with gdb is in 'Documentation/dev-tools/gdb-kernel-debugging.rst' and
+'Documentation/dev-tools/kgdb.rst'.
diff --git a/Documentation/admin-guide/abi-obsolete.rst b/Documentation/admin-guide/abi-obsolete.rst
new file mode 100644
index 000000000000..d095867899c5
--- /dev/null
+++ b/Documentation/admin-guide/abi-obsolete.rst
@@ -0,0 +1,11 @@
+ABI obsolete symbols
+====================
+
+Documents interfaces that are still remaining in the kernel, but are
+marked to be removed at some later point in time.
+
+The description of the interface will document the reason why it is
+obsolete and when it can be expected to be removed.
+
+.. kernel-abi:: $srctree/Documentation/ABI/obsolete
+ :rst:
diff --git a/Documentation/admin-guide/abi-removed.rst b/Documentation/admin-guide/abi-removed.rst
new file mode 100644
index 000000000000..f7e9e43023c1
--- /dev/null
+++ b/Documentation/admin-guide/abi-removed.rst
@@ -0,0 +1,5 @@
+ABI removed symbols
+===================
+
+.. kernel-abi:: $srctree/Documentation/ABI/removed
+ :rst:
diff --git a/Documentation/admin-guide/abi-stable.rst b/Documentation/admin-guide/abi-stable.rst
new file mode 100644
index 000000000000..70490736e0d3
--- /dev/null
+++ b/Documentation/admin-guide/abi-stable.rst
@@ -0,0 +1,14 @@
+ABI stable symbols
+==================
+
+Documents the interfaces that the developer has defined to be stable.
+
+Userspace programs are free to use these interfaces with no
+restrictions, and backward compatibility for them will be guaranteed
+for at least 2 years.
+
+Most interfaces (like syscalls) are expected to never change and always
+be available.
+
+.. kernel-abi:: $srctree/Documentation/ABI/stable
+ :rst:
diff --git a/Documentation/admin-guide/abi-testing.rst b/Documentation/admin-guide/abi-testing.rst
new file mode 100644
index 000000000000..b205b16a72d0
--- /dev/null
+++ b/Documentation/admin-guide/abi-testing.rst
@@ -0,0 +1,20 @@
+ABI testing symbols
+===================
+
+Documents interfaces that are felt to be stable,
+as the main development of this interface has been completed.
+
+The interface can be changed to add new features, but the
+current interface will not break by doing this, unless grave
+errors or security problems are found in them.
+
+Userspace programs can start to rely on these interfaces, but they must
+be aware of changes that can occur before these interfaces move to
+be marked stable.
+
+Programs that use these interfaces are strongly encouraged to add their
+name to the description of these interfaces, so that the kernel
+developers can easily notify them if any changes occur.
+
+.. kernel-abi:: $srctree/Documentation/ABI/testing
+ :rst:
diff --git a/Documentation/admin-guide/abi.rst b/Documentation/admin-guide/abi.rst
new file mode 100644
index 000000000000..bcab3ef2597c
--- /dev/null
+++ b/Documentation/admin-guide/abi.rst
@@ -0,0 +1,11 @@
+=====================
+Linux ABI description
+=====================
+
+.. toctree::
+ :maxdepth: 2
+
+ abi-stable
+ abi-testing
+ abi-obsolete
+ abi-removed
diff --git a/Documentation/admin-guide/acpi/cppc_sysfs.rst b/Documentation/admin-guide/acpi/cppc_sysfs.rst
index a4b99afbe331..e53d76365aa7 100644
--- a/Documentation/admin-guide/acpi/cppc_sysfs.rst
+++ b/Documentation/admin-guide/acpi/cppc_sysfs.rst
@@ -4,11 +4,13 @@
Collaborative Processor Performance Control (CPPC)
==================================================
+.. _cppc_sysfs:
+
CPPC
====
CPPC defined in the ACPI spec describes a mechanism for the OS to manage the
-performance of a logical processor on a contigious and abstract performance
+performance of a logical processor on a contiguous and abstract performance
scale. CPPC exposes a set of registers to describe abstract performance scale,
to request performance levels and to measure per-cpu delivered performance.
@@ -45,7 +47,7 @@ for each cpu X::
* lowest_freq : CPU frequency corresponding to lowest_perf (in MHz).
* nominal_freq : CPU frequency corresponding to nominal_perf (in MHz).
The above frequencies should only be used to report processor performance in
- freqency instead of abstract scale. These values should not be used for any
+ frequency instead of abstract scale. These values should not be used for any
functional decisions.
* feedback_ctrs : Includes both Reference and delivered performance counter.
diff --git a/Documentation/admin-guide/acpi/dsdt-override.rst b/Documentation/admin-guide/acpi/dsdt-override.rst
deleted file mode 100644
index 50bd7f194bf4..000000000000
--- a/Documentation/admin-guide/acpi/dsdt-override.rst
+++ /dev/null
@@ -1,13 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===============
-Overriding DSDT
-===============
-
-Linux supports a method of overriding the BIOS DSDT:
-
-CONFIG_ACPI_CUSTOM_DSDT - builds the image into the kernel.
-
-When to use this method is described in detail on the
-Linux/ACPI home page:
-https://01.org/linux-acpi/documentation/overriding-dsdt
diff --git a/Documentation/admin-guide/acpi/fan_performance_states.rst b/Documentation/admin-guide/acpi/fan_performance_states.rst
index 98fe5c333121..b9e4b4d146c1 100644
--- a/Documentation/admin-guide/acpi/fan_performance_states.rst
+++ b/Documentation/admin-guide/acpi/fan_performance_states.rst
@@ -60,3 +60,31 @@ For example::
When a given field is not populated or its value provided by the platform
firmware is invalid, the "not-defined" string is shown instead of the value.
+
+ACPI Fan Fine Grain Control
+=============================
+
+When the _FIF object specifies support for fine grain control, the fan speed
+can be set from 0 to 100% with the recommended minimum "step size" via the
+_FSL object. The user can adjust the fan speed using the thermal sysfs cooling device.
+
+Here the user can look at fan performance states for a reference speed (speed_rpm)
+and set it by changing the cooling device cur_state. If fine grain control
+is supported, the user can also select other speeds which are
+not defined in the performance states.
+
+The support of fine grain control is presented via sysfs attribute
+"fine_grain_control". If fine grain control is present, this attribute
+will show "1", otherwise "0".
+
+This sysfs attribute is presented in the same directory as performance states.
+
+ACPI Fan Performance Feedback
+=============================
+
+The optional _FST object provides status information for the fan device.
+This includes a field that provides the current fan speed, in revolutions per
+minute, at which the fan is rotating.
+
+This speed is presented in the sysfs using the attribute "fan_speed_rpm",
+in the same directory as performance states.
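Reviewer note, not part of the patch: the two attributes documented in this hunk can be read together with a small shell helper. This is a minimal sketch under stated assumptions. The helper name ``fan_summary`` is hypothetical, and the caller is assumed to pass the sysfs directory that already holds the fan performance state attributes.

```shell
#!/bin/sh
# fan_summary DIR
# DIR is assumed to be the sysfs directory that holds the fan
# performance state attributes. The helper name is hypothetical.
fan_summary() {
    dir="$1"

    # "fine_grain_control" reads "1" when fine grain control is present.
    if [ "$(cat "$dir/fine_grain_control" 2>/dev/null)" = "1" ]; then
        echo "fine grain control: supported"
    else
        echo "fine grain control: not supported"
    fi

    # Assumption: "fan_speed_rpm" may be absent, since the _FST object
    # is described as optional, so only print it when it is readable.
    if [ -r "$dir/fan_speed_rpm" ]; then
        echo "fan speed: $(cat "$dir/fan_speed_rpm") RPM"
    fi
}
```

Taking the directory as an argument keeps the sketch independent of any particular ACPI device path.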
diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
index 71277689ad97..b078fdb8f4c9 100644
--- a/Documentation/admin-guide/acpi/index.rst
+++ b/Documentation/admin-guide/acpi/index.rst
@@ -9,7 +9,6 @@ the Linux ACPI support.
:maxdepth: 1
initrd_table_override
- dsdt-override
ssdt-overlays
cppc_sysfs
fan_performance_states
diff --git a/Documentation/admin-guide/acpi/initrd_table_override.rst b/Documentation/admin-guide/acpi/initrd_table_override.rst
index cbd768207631..bb24fa6b5fbe 100644
--- a/Documentation/admin-guide/acpi/initrd_table_override.rst
+++ b/Documentation/admin-guide/acpi/initrd_table_override.rst
@@ -102,7 +102,7 @@ Where to retrieve userspace tools
=================================
iasl and acpixtract are part of Intel's ACPICA project:
-http://acpica.org/
+https://acpica.org/
and should be packaged by distributions (for example in the acpica package
on SUSE).
diff --git a/Documentation/admin-guide/acpi/ssdt-overlays.rst b/Documentation/admin-guide/acpi/ssdt-overlays.rst
index da37455f96c9..b5fbf54dca19 100644
--- a/Documentation/admin-guide/acpi/ssdt-overlays.rst
+++ b/Documentation/admin-guide/acpi/ssdt-overlays.rst
@@ -30,22 +30,21 @@ following ASL code can be used::
{
Device (STAC)
{
- Name (_ADR, Zero)
Name (_HID, "BMA222E")
+ Name (RBUF, ResourceTemplate ()
+ {
+ I2cSerialBus (0x0018, ControllerInitiated, 0x00061A80,
+ AddressingMode7Bit, "\\_SB.I2C6", 0x00,
+ ResourceConsumer, ,)
+ GpioInt (Edge, ActiveHigh, Exclusive, PullDown, 0x0000,
+ "\\_SB.GPO2", 0x00, ResourceConsumer, , )
+ { // Pin list
+ 0
+ }
+ })
Method (_CRS, 0, Serialized)
{
- Name (RBUF, ResourceTemplate ()
- {
- I2cSerialBus (0x0018, ControllerInitiated, 0x00061A80,
- AddressingMode7Bit, "\\_SB.I2C6", 0x00,
- ResourceConsumer, ,)
- GpioInt (Edge, ActiveHigh, Exclusive, PullDown, 0x0000,
- "\\_SB.GPO2", 0x00, ResourceConsumer, , )
- { // Pin list
- 0
- }
- })
Return (RBUF)
}
}
@@ -63,7 +62,7 @@ which can then be compiled to AML binary format::
ASL Input: minnomax.asl - 30 lines, 614 bytes, 7 keywords
AML Output: minnowmax.aml - 165 bytes, 6 named objects, 1 executable opcodes
-[1] http://wiki.minnowboard.org/MinnowBoard_MAX#Low_Speed_Expansion_Connector_.28Top.29
+[1] https://www.elinux.org/Minnowboard:MinnowMax#Low_Speed_Expansion_.28Top.29
The resulting AML code can then be loaded by the kernel using one of the methods
below.
@@ -75,7 +74,7 @@ This option allows loading of user defined SSDTs from initrd and it is useful
when the system does not support EFI or when there is not enough EFI storage.
It works in a similar way with initrd based ACPI tables override/upgrade: SSDT
-aml code must be placed in the first, uncompressed, initrd under the
+AML code must be placed in the first, uncompressed, initrd under the
"kernel/firmware/acpi" path. Multiple files can be used and this will translate
in loading multiple tables. Only SSDT and OEM tables are allowed. See
initrd_table_override.txt for more details.
@@ -103,12 +102,14 @@ This is the preferred method, when EFI is supported on the platform, because it
allows a persistent, OS independent way of storing the user defined SSDTs. There
is also work underway to implement EFI support for loading user defined SSDTs
and using this method will make it easier to convert to the EFI loading
-mechanism when that will arrive.
+mechanism when it arrives. To enable it,
+CONFIG_EFI_CUSTOM_SSDT_OVERLAYS should be set to y.
-In order to load SSDTs from an EFI variable the efivar_ssdt kernel command line
-parameter can be used. The argument for the option is the variable name to
-use. If there are multiple variables with the same name but with different
-vendor GUIDs, all of them will be loaded.
+In order to load SSDTs from an EFI variable the ``"efivar_ssdt=..."`` kernel
+command line parameter can be used (the name has a limitation of 16 characters).
+The argument for the option is the variable name to use. If there are multiple
+variables with the same name but with different vendor GUIDs, all of them will
+be loaded.
In order to store the AML code in an EFI variable the efivarfs filesystem can be
used. It is enabled and mounted by default in /sys/firmware/efi/efivars in all
@@ -127,7 +128,7 @@ variable with the content from a given file::
#!/bin/sh -e
- while ! [ -z "$1" ]; do
+ while [ -n "$1" ]; do
case "$1" in
"-f") filename="$2"; shift;;
"-g") guid="$2"; shift;;
@@ -167,14 +168,14 @@ variable with the content from a given file::
Loading ACPI SSDTs from configfs
================================
-This option allows loading of user defined SSDTs from userspace via the configfs
+This option allows loading of user defined SSDTs from user space via the configfs
interface. The CONFIG_ACPI_CONFIGFS option must be select and configfs must be
mounted. In the following examples, we assume that configfs has been mounted in
-/config.
+/sys/kernel/config.
-New tables can be loading by creating new directories in /config/acpi/table/ and
-writing the SSDT aml code in the aml attribute::
+New tables can be loaded by creating new directories in /sys/kernel/config/acpi/table
+and writing the SSDT AML code in the aml attribute::
- cd /config/acpi/table
+ cd /sys/kernel/config/acpi/table
mkdir my_ssdt
cat ~/ssdt.aml > my_ssdt/aml
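Reviewer note, not part of the patch: the configfs steps shown in this hunk (create a directory, then write the AML into its ``aml`` attribute) can be wrapped in a helper that checks its inputs first. This is a sketch only. The function name ``load_ssdt`` and its argument order are hypothetical, and the table directory is passed in rather than hard-coded.

```shell
#!/bin/sh
# load_ssdt TABLE_DIR NAME AML_FILE
# TABLE_DIR is the configfs ACPI table directory (the documentation
# uses /sys/kernel/config/acpi/table). The helper name is hypothetical.
load_ssdt() {
    table_dir="$1"
    name="$2"
    aml_file="$3"

    # Refuse to continue if configfs is not mounted where expected
    # or the AML file cannot be read.
    [ -d "$table_dir" ] || { echo "no such directory: $table_dir" >&2; return 1; }
    [ -r "$aml_file" ] || { echo "cannot read: $aml_file" >&2; return 1; }

    # Creating the directory requests a new table slot; writing the
    # "aml" attribute hands the compiled SSDT to the kernel.
    mkdir "$table_dir/$name" || return 1
    cat "$aml_file" > "$table_dir/$name/aml"
}
```

With configfs mounted as described, a call would look like ``load_ssdt /sys/kernel/config/acpi/table my_ssdt ~/ssdt.aml``.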
diff --git a/Documentation/admin-guide/auxdisplay/cfag12864b.rst b/Documentation/admin-guide/auxdisplay/cfag12864b.rst
index 18c2865bd322..da385d851acc 100644
--- a/Documentation/admin-guide/auxdisplay/cfag12864b.rst
+++ b/Documentation/admin-guide/auxdisplay/cfag12864b.rst
@@ -3,7 +3,7 @@ cfag12864b LCD Driver Documentation
===================================
:License: GPLv2
-:Author & Maintainer: Miguel Ojeda Sandonis
+:Author & Maintainer: Miguel Ojeda <ojeda@kernel.org>
:Date: 2006-10-27
diff --git a/Documentation/admin-guide/auxdisplay/ks0108.rst b/Documentation/admin-guide/auxdisplay/ks0108.rst
index c0b7faf73136..a7d3fe509373 100644
--- a/Documentation/admin-guide/auxdisplay/ks0108.rst
+++ b/Documentation/admin-guide/auxdisplay/ks0108.rst
@@ -3,7 +3,7 @@ ks0108 LCD Controller Driver Documentation
==========================================
:License: GPLv2
-:Author & Maintainer: Miguel Ojeda Sandonis
+:Author & Maintainer: Miguel Ojeda <ojeda@kernel.org>
:Date: 2006-10-27
diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst
index c0ce64d75bbf..8d3a2d045c0a 100644
--- a/Documentation/admin-guide/bcache.rst
+++ b/Documentation/admin-guide/bcache.rst
@@ -5,11 +5,14 @@ A block layer cache (bcache)
Say you've got a big slow raid 6, and an ssd or three. Wouldn't it be
nice if you could use them as cache... Hence bcache.
-Wiki and git repositories are at:
+The bcache wiki can be found at:
+ https://bcache.evilpiepirate.org
- - http://bcache.evilpiepirate.org
- - http://evilpiepirate.org/git/linux-bcache.git
- - http://evilpiepirate.org/git/bcache-tools.git
+This is the git repository of bcache-tools:
+ https://git.kernel.org/pub/scm/linux/kernel/git/colyli/bcache-tools.git/
+
+The latest bcache kernel code can be found in the mainline Linux kernel:
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
It's designed around the performance characteristics of SSDs - it only allocates
in erase block sized buckets, and it uses a hybrid btree/log to track cached
@@ -41,17 +44,21 @@ in the cache it first disables writeback caching and waits for all dirty data
to be flushed.
Getting started:
-You'll need make-bcache from the bcache-tools repository. Both the cache device
+You'll need the bcache utility from the bcache-tools repository. Both the cache device
and backing device must be formatted before use::
- make-bcache -B /dev/sdb
- make-bcache -C /dev/sdc
+ bcache make -B /dev/sdb
+ bcache make -C /dev/sdc
-make-bcache has the ability to format multiple devices at the same time - if
+`bcache make` has the ability to format multiple devices at the same time - if
you format your backing devices and cache device at the same time, you won't
have to manually attach::
- make-bcache -B /dev/sda /dev/sdb -C /dev/sdc
+ bcache make -B /dev/sda /dev/sdb -C /dev/sdc
+
+If your bcache-tools is not updated to the latest version and does not have the
+unified `bcache` utility, you may use the legacy `make-bcache` utility to format
+bcache devices with the same -B and -C parameters.
bcache-tools now ships udev rules, and bcache devices are known to the kernel
immediately. Without udev, you can manually register devices like this::
@@ -188,7 +195,7 @@ D) Recovering data without bcache:
If bcache is not available in the kernel, a filesystem on the backing
device is still available at an 8KiB offset. So either via a loopdev
of the backing device created with --offset 8K, or any value defined by
---data-offset when you originally formatted bcache with `make-bcache`.
+--data-offset when you originally formatted bcache with `bcache make`.
For example::
@@ -210,7 +217,7 @@ E) Wiping a cache device
After you boot back with bcache enabled, you recreate the cache and attach it::
- host:~# make-bcache -C /dev/sdh2
+ host:~# bcache make -C /dev/sdh2
UUID: 7be7e175-8f4c-4f99-94b2-9c904d227045
Set UUID: 5bc072a8-ab17-446d-9744-e247949913c1
version: 0
@@ -318,7 +325,7 @@ want for getting the best possible numbers when benchmarking.
The default metadata size in bcache is 8k. If your backing device is
RAID based, then be sure to align this by a multiple of your stride
- width using `make-bcache --data-offset`. If you intend to expand your
+ width using `bcache make --data-offset`. If you intend to expand your
disk array in the future, then multiply a series of primes by your
raid stripe size to get the disk multiples that you would like.
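Reviewer note, not part of the patch: the updated text says that ``bcache make`` replaces ``make-bcache`` but that older bcache-tools installations only ship the legacy utility. A script that has to run on both can pick the formatter at run time. This is a minimal sketch. The helper name ``bcache_make_cmd`` is hypothetical, and it assumes that a ``bcache`` executable on the PATH is the unified utility from bcache-tools.

```shell
#!/bin/sh
# bcache_make_cmd
# Prints the formatting command available on this system.
# Assumption: a "bcache" executable on PATH is the unified utility.
bcache_make_cmd() {
    if command -v bcache >/dev/null 2>&1; then
        echo "bcache make"
    elif command -v make-bcache >/dev/null 2>&1; then
        echo "make-bcache"
    else
        echo "neither bcache nor make-bcache found" >&2
        return 1
    fi
}
```

Both variants take the same -B and -C parameters, so the printed command can be followed directly by the device arguments, for example ``$(bcache_make_cmd) -B /dev/sdb``.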
diff --git a/Documentation/admin-guide/binderfs.rst b/Documentation/admin-guide/binderfs.rst
index c009671f8434..41a4db00df8d 100644
--- a/Documentation/admin-guide/binderfs.rst
+++ b/Documentation/admin-guide/binderfs.rst
@@ -33,6 +33,12 @@ max
a per-instance limit. If ``max=<count>`` is set then only ``<count>`` number
of binder devices can be allocated in this binderfs instance.
+stats
+ Using ``stats=global`` enables global binder statistics.
+ ``stats=global`` is only available for a binderfs instance mounted in the
+ initial user namespace. An attempt to use the option to mount a binderfs
+ instance in another user namespace will return a permission error.
+
Allocating binder Devices
-------------------------
@@ -64,5 +70,18 @@ Deleting binder Devices
Binderfs binder devices can be deleted via `unlink() <unlink_>`_. This means
that the `rm() <rm_>`_ tool can be used to delete them. Note that the
``binder-control`` device cannot be deleted since this would make the binderfs
-instance unuseable. The ``binder-control`` device will be deleted when the
+instance unusable. The ``binder-control`` device will be deleted when the
binderfs instance is unmounted and all references to it have been dropped.
+
+Binder features
+---------------
+
+Assuming an instance of binderfs has been mounted at ``/dev/binderfs``, the
+features supported by the binder driver can be located under
+``/dev/binderfs/features/``. The presence of individual files can be tested
+to determine whether a particular feature is supported by the driver.
+
+Example::
+
+ cat /dev/binderfs/features/oneway_spam_detection
+ 1
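Reviewer note, not part of the patch: since the new section says that the presence of a file under ``features/`` indicates driver support, a script can test for the file instead of reading it. This is a sketch only. The helper name ``binder_has_feature`` is hypothetical, and the mount point is passed in by the caller.

```shell
#!/bin/sh
# binder_has_feature MOUNTPOINT FEATURE
# Succeeds when the named feature file exists under the binderfs
# mount point. The helper name is hypothetical.
binder_has_feature() {
    [ -e "$1/features/$2" ]
}
```

With binderfs mounted at ``/dev/binderfs`` as in the example, ``binder_has_feature /dev/binderfs oneway_spam_detection`` returns success on a driver that supports the feature.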
diff --git a/Documentation/admin-guide/binfmt-misc.rst b/Documentation/admin-guide/binfmt-misc.rst
index 97b0d7927078..59cd902e3549 100644
--- a/Documentation/admin-guide/binfmt-misc.rst
+++ b/Documentation/admin-guide/binfmt-misc.rst
@@ -1,5 +1,5 @@
-Kernel Support for miscellaneous (your favourite) Binary Formats v1.1
-=====================================================================
+Kernel Support for miscellaneous Binary Formats (binfmt_misc)
+=============================================================
This Kernel feature allows you to invoke almost (for restrictions see below)
every program by simply typing its name in the shell.
@@ -23,7 +23,7 @@ Here is what the fields mean:
- ``name``
is an identifier string. A new /proc file will be created with this
- ``name below /proc/sys/fs/binfmt_misc``; cannot contain slashes ``/`` for
+ name below ``/proc/sys/fs/binfmt_misc``; cannot contain slashes ``/`` for
obvious reasons.
- ``type``
is the type of recognition. Give ``M`` for magic and ``E`` for extension.
@@ -83,7 +83,7 @@ Here is what the fields mean:
``F`` - fix binary
The usual behaviour of binfmt_misc is to spawn the
binary lazily when the misc format file is invoked. However,
- this doesn``t work very well in the face of mount namespaces and
+ this doesn't work very well in the face of mount namespaces and
changeroots, so the ``F`` mode opens the binary as soon as the
emulation is installed and uses the opened image to spawn the
emulator, meaning it is always available once installed,
@@ -140,8 +140,8 @@ Hints
-----
If you want to pass special arguments to your interpreter, you can
-write a wrapper script for it. See Documentation/admin-guide/java.rst for an
-example.
+write a wrapper script for it.
+See :doc:`Documentation/admin-guide/java.rst <./java>` for an example.
Your interpreter should NOT look in the PATH for the filename; the kernel
passes it the full filename (or the file descriptor) to use. Using ``$PATH`` can
diff --git a/Documentation/admin-guide/blockdev/drbd/figures.rst b/Documentation/admin-guide/blockdev/drbd/figures.rst
index bd9a4901fe46..9f73253ea353 100644
--- a/Documentation/admin-guide/blockdev/drbd/figures.rst
+++ b/Documentation/admin-guide/blockdev/drbd/figures.rst
@@ -25,6 +25,6 @@ Sub graphs of DRBD's state transitions
:alt: disk-states-8.dot
:align: center
-.. kernel-figure:: node-states-8.dot
- :alt: node-states-8.dot
+.. kernel-figure:: peer-states-8.dot
+ :alt: peer-states-8.dot
:align: center
diff --git a/Documentation/admin-guide/blockdev/drbd/index.rst b/Documentation/admin-guide/blockdev/drbd/index.rst
index 68ecd5c113e9..561fd1e35917 100644
--- a/Documentation/admin-guide/blockdev/drbd/index.rst
+++ b/Documentation/admin-guide/blockdev/drbd/index.rst
@@ -10,7 +10,7 @@ Description
clusters and in this context, is a "drop-in" replacement for shared
storage. Simplistically, you could see it as a network RAID 1.
- Please visit http://www.drbd.org to find out more.
+ Please visit https://www.drbd.org to find out more.
.. toctree::
:maxdepth: 1
diff --git a/Documentation/admin-guide/blockdev/drbd/node-states-8.dot b/Documentation/admin-guide/blockdev/drbd/peer-states-8.dot
index bfa54e1f8016..6dc3954954d6 100644
--- a/Documentation/admin-guide/blockdev/drbd/node-states-8.dot
+++ b/Documentation/admin-guide/blockdev/drbd/peer-states-8.dot
@@ -1,8 +1,3 @@
-digraph node_states {
- Secondary -> Primary [ label = "ioctl_set_state()" ]
- Primary -> Secondary [ label = "ioctl_set_state()" ]
-}
-
digraph peer_states {
Secondary -> Primary [ label = "recv state packet" ]
Primary -> Secondary [ label = "recv state packet" ]
diff --git a/Documentation/admin-guide/blockdev/floppy.rst b/Documentation/admin-guide/blockdev/floppy.rst
index 4a8f31cf4139..0328438ebe2c 100644
--- a/Documentation/admin-guide/blockdev/floppy.rst
+++ b/Documentation/admin-guide/blockdev/floppy.rst
@@ -6,7 +6,7 @@ FAQ list:
=========
A FAQ list may be found in the fdutils package (see below), and also
-at <http://fdutils.linux.lu/faq.html>.
+at <https://fdutils.linux.lu/faq.html>.
LILO configuration options (Thinkpad users, read this)
@@ -220,11 +220,11 @@ It also contains additional documentation about the floppy driver.
The latest version can be found at fdutils homepage:
- http://fdutils.linux.lu
+ https://fdutils.linux.lu
The fdutils releases can be found at:
- http://fdutils.linux.lu/download.html
+ https://fdutils.linux.lu/download.html
http://www.tux.org/pub/knaff/fdutils/
diff --git a/Documentation/admin-guide/blockdev/index.rst b/Documentation/admin-guide/blockdev/index.rst
index b903cf152091..957ccf617797 100644
--- a/Documentation/admin-guide/blockdev/index.rst
+++ b/Documentation/admin-guide/blockdev/index.rst
@@ -1,8 +1,8 @@
.. SPDX-License-Identifier: GPL-2.0
-===========================
-The Linux RapidIO Subsystem
-===========================
+=============
+Block Devices
+=============
.. toctree::
:maxdepth: 1
diff --git a/Documentation/admin-guide/blockdev/paride.rst b/Documentation/admin-guide/blockdev/paride.rst
index 87b4278bf314..e1ce90af602a 100644
--- a/Documentation/admin-guide/blockdev/paride.rst
+++ b/Documentation/admin-guide/blockdev/paride.rst
@@ -220,7 +220,7 @@ example::
Finally, you can load high-level drivers for each kind of device that
you have connected. By default, each driver will autoprobe for a single
device, but you can support up to four similar devices by giving their
-individual co-ordinates when you load the driver.
+individual coordinates when you load the driver.
For example, if you had two no-name CD-ROM drives both using the
KingByte KBIC-951A adapter, one on port 0x378 and the other on 0x3bc
diff --git a/Documentation/admin-guide/blockdev/ramdisk.rst b/Documentation/admin-guide/blockdev/ramdisk.rst
index b7c2268f8dec..9ce6101e8dd9 100644
--- a/Documentation/admin-guide/blockdev/ramdisk.rst
+++ b/Documentation/admin-guide/blockdev/ramdisk.rst
@@ -6,7 +6,7 @@ Using the RAM disk block device with Linux
1) Overview
2) Kernel Command Line Parameters
- 3) Using "rdev -r"
+ 3) Using "rdev"
4) An Example of Creating a Compressed RAM Disk
@@ -59,51 +59,27 @@ default is 4096 (4 MB).
rd_size
See ramdisk_size.
-3) Using "rdev -r"
-------------------
+3) Using "rdev"
+---------------
-The usage of the word (two bytes) that "rdev -r" sets in the kernel image is
-as follows. The low 11 bits (0 -> 10) specify an offset (in 1 k blocks) of up
-to 2 MB (2^11) of where to find the RAM disk (this used to be the size). Bit
-14 indicates that a RAM disk is to be loaded, and bit 15 indicates whether a
-prompt/wait sequence is to be given before trying to read the RAM disk. Since
-the RAM disk dynamically grows as data is being written into it, a size field
-is not required. Bits 11 to 13 are not currently used and may as well be zero.
-These numbers are no magical secrets, as seen below::
+"rdev" is an obsolete, deprecated, antiquated utility that could be used
+to set the boot device in a Linux kernel image.
- ./arch/x86/kernel/setup.c:#define RAMDISK_IMAGE_START_MASK 0x07FF
- ./arch/x86/kernel/setup.c:#define RAMDISK_PROMPT_FLAG 0x8000
- ./arch/x86/kernel/setup.c:#define RAMDISK_LOAD_FLAG 0x4000
+Instead of using rdev, just place the boot device information on the
+kernel command line and pass it to the kernel from the bootloader.
-Consider a typical two floppy disk setup, where you will have the
-kernel on disk one, and have already put a RAM disk image onto disk #2.
+You can also pass arguments to the kernel by setting FDARGS in
+arch/x86/boot/Makefile and specify an initrd image by setting FDINITRD in
+arch/x86/boot/Makefile.
-Hence you want to set bits 0 to 13 as 0, meaning that your RAM disk
-starts at an offset of 0 kB from the beginning of the floppy.
-The command line equivalent is: "ramdisk_start=0"
+Some of the kernel command line boot options that may apply here are::
-You want bit 14 as one, indicating that a RAM disk is to be loaded.
-The command line equivalent is: "load_ramdisk=1"
-
-You want bit 15 as one, indicating that you want a prompt/keypress
-sequence so that you have a chance to switch floppy disks.
-The command line equivalent is: "prompt_ramdisk=1"
-
-Putting that together gives 2^15 + 2^14 + 0 = 49152 for an rdev word.
-So to create disk one of the set, you would do::
-
- /usr/src/linux# cat arch/x86/boot/zImage > /dev/fd0
- /usr/src/linux# rdev /dev/fd0 /dev/fd0
- /usr/src/linux# rdev -r /dev/fd0 49152
+ ramdisk_start=N
+ ramdisk_size=M
If you make a boot disk that has LILO, then for the above, you would use::
- append = "ramdisk_start=0 load_ramdisk=1 prompt_ramdisk=1"
-
-Since the default start = 0 and the default prompt = 1, you could use::
-
- append = "load_ramdisk=1"
-
+ append = "ramdisk_start=N ramdisk_size=M"
4) An Example of Creating a Compressed RAM Disk
-----------------------------------------------
@@ -151,12 +127,9 @@ f) Put the RAM disk image onto the floppy, after the kernel. Use an offset
dd if=/tmp/ram_image.gz of=/dev/fd0 bs=1k seek=400
-g) Use "rdev" to set the boot device, RAM disk offset, prompt flag, etc.
- For prompt_ramdisk=1, load_ramdisk=1, ramdisk_start=400, one would
- have 2^15 + 2^14 + 400 = 49552::
-
- rdev /dev/fd0 /dev/fd0
- rdev -r /dev/fd0 49552
+g) Make sure that you have already specified the boot information in
+ FDARGS and FDINITRD or that you use a bootloader to pass kernel
+ command line boot options to the kernel.
That is it. You now have your boot/root compressed RAM disk floppy. Some
users may wish to combine steps (d) and (f) by using a pipe.
@@ -167,11 +140,14 @@ users may wish to combine steps (d) and (f) by using a pipe.
Changelog:
----------
+SEPT-2020 :
+
+ Removed usage of "rdev"
+
10-22-04 :
Updated to reflect changes in command line options, remove
obsolete references, general cleanup.
James Nelson (james4765@gmail.com)
-
12-95 :
Original Document
diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 27c77d853028..c73b16930449 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -251,8 +251,6 @@ line of text and contains the following stats separated by whitespace:
================ =============================================================
orig_data_size uncompressed size of data stored in this disk.
- This excludes same-element-filled pages (same_pages) since
- no memory is allocated for them.
Unit: bytes
compr_data_size compressed size of data stored in this disk
mem_used_total the amount of memory allocated for this disk. This
@@ -268,6 +266,7 @@ line of text and contains the following stats separated by whitespace:
No memory is allocated for such pages.
pages_compacted the number of pages freed during compaction
huge_pages the number of incompressible pages
+ huge_pages_since the number of incompressible pages since zram was set up
================ =============================================================
File /sys/block/zram<id>/bd_stat
@@ -316,8 +315,8 @@ To use the feature, admin should set up backing device via::
echo /dev/sda5 > /sys/block/zramX/backing_dev
-before disksize setting. It supports only partition at this moment.
-If admin wants to use incompressible page writeback, they could do via::
+before disksize setting. It supports only partitions at this moment.
+If admin wants to use incompressible page writeback, they could do it via::
echo huge > /sys/block/zramX/writeback
@@ -329,12 +328,30 @@ as idle::
From now on, any pages on zram are idle pages. The idle mark
will be removed until someone requests access of the block.
IOW, unless there is access request, those pages are still idle pages.
+Additionally, when CONFIG_ZRAM_MEMORY_TRACKING is enabled, pages can be
+marked as idle based on how long (in seconds) it's been since they were
+last accessed::
+
+ echo 86400 > /sys/block/zramX/idle
+
+In this example all pages which haven't been accessed in more than 86400
+seconds (one day) will be marked idle.
Admin can request writeback of those idle pages at right timing via::
echo idle > /sys/block/zramX/writeback
-With the command, zram writeback idle pages from memory to the storage.
+With the command, zram will write back idle pages from memory to the storage.
+
+Additionally, if a user chooses to write back only huge and idle pages
+this can be accomplished with::
+
+ echo huge_idle > /sys/block/zramX/writeback
+
+If an admin wants to write a specific page in zram device to the backing device,
+they could write a page index into the interface::
+
+ echo "page_index=1251" > /sys/block/zramX/writeback
If there are lots of write IO with flash device, potentially, it has
flash wearout problem so that admin needs to design write limitation
@@ -342,7 +359,7 @@ to guarantee storage health for entire product life.
To overcome the concern, zram supports "writeback_limit" feature.
The "writeback_limit_enable"'s default value is 0 so that it doesn't limit
-any writeback. IOW, if admin wants to apply writeback budget, he should
+any writeback. IOW, if admin wants to apply writeback budget, they should
enable writeback_limit_enable via::
$ echo 1 > /sys/block/zramX/writeback_limit_enable
@@ -353,7 +370,7 @@ until admin sets the budget via /sys/block/zramX/writeback_limit.
(If admin doesn't enable writeback_limit_enable, writeback_limit's value
assigned via /sys/block/zramX/writeback_limit is meaningless.)
-If admin want to limit writeback as per-day 400M, he could do it
+If admin wants to limit writeback to 400M per day, they could do it
like below::
$ MB_SHIFT=20
@@ -362,17 +379,17 @@ like below::
/sys/block/zram0/writeback_limit.
$ echo 1 > /sys/block/zram0/writeback_limit_enable
-If admins want to allow further write again once the bugdet is exhausted,
-he could do it like below::
+If admins want to allow further write again once the budget is exhausted,
+they could do it like below::
$ echo $((400<<MB_SHIFT>>4K_SHIFT)) > \
/sys/block/zram0/writeback_limit
-If admin wants to see remaining writeback budget since last set::
+If an admin wants to see the remaining writeback budget since last set::
$ cat /sys/block/zramX/writeback_limit
-If admin want to disable writeback limit, he could do::
+If an admin wants to disable writeback limit, they could do::
$ echo 0 > /sys/block/zramX/writeback_limit_enable
@@ -381,7 +398,7 @@ system reboot, echo 1 > /sys/block/zramX/reset) so keeping how many of
writeback happened until you reset the zram to allocate extra writeback
budget in next setting is user's job.
-If admin wants to measure writeback count in a certain period, he could
+If admin wants to measure writeback count in a certain period, they could
know it via /sys/block/zram0/bd_stat's 3rd column.
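The shift arithmetic used for the budget above can be checked with a small
sketch. It assumes, as the ``>>4K_SHIFT`` step implies, that writeback_limit
is counted in 4K units. The helper names and the sample bd_stat line are
hypothetical and only illustrate the calculation.

```python
MB_SHIFT = 20        # 1 MB = 1 << 20 bytes
FOUR_K_SHIFT = 12    # 4K = 1 << 12 bytes (4K_SHIFT in the shell example)


def writeback_budget(megabytes):
    """Convert a budget in MB into the 4K units written to writeback_limit."""
    return (megabytes << MB_SHIFT) >> FOUR_K_SHIFT


def writeback_count(bd_stat_line):
    """Return the 3rd whitespace-separated column of one bd_stat line."""
    return int(bd_stat_line.split()[2])


if __name__ == "__main__":
    # Value that would be echoed into writeback_limit for a 400M budget.
    print(writeback_budget(400))
```

Reading the 3rd bd_stat column before and after a period and subtracting the
two values gives the writeback count for that period.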
memory tracking
diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
index cf2edcd09183..d99994345d41 100644
--- a/Documentation/admin-guide/bootconfig.rst
+++ b/Documentation/admin-guide/bootconfig.rst
@@ -23,7 +23,7 @@ of dot-connected-words, and key and value are connected by ``=``. The value
has to be terminated by semi-colon (``;``) or newline (``\n``).
For array value, array entries are separated by comma (``,``). ::
-KEY[.WORD[...]] = VALUE[, VALUE2[...]][;]
+ KEY[.WORD[...]] = VALUE[, VALUE2[...]][;]
Unlike the kernel command line syntax, spaces are OK around the comma and ``=``.
@@ -71,6 +71,16 @@ For example,::
foo = bar, baz
foo = qux # !ERROR! we can not re-define same key
+If you want to update the value, you must use the override operator
+``:=`` explicitly. For example::
+
+ foo = bar, baz
+ foo := qux
+
+then, the ``qux`` is assigned to ``foo`` key. This is useful for
+overriding the default value by adding (partial) custom bootconfigs
+without parsing the default bootconfig.
+
If you want to append the value to existing key as an array member,
you can use ``+=`` operator. For example::
@@ -79,12 +89,35 @@ you can use ``+=`` operator. For example::
In this case, the key ``foo`` has ``bar``, ``baz`` and ``qux``.
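The behaviour of the three assignment operators can be modelled with a short
sketch. This is not the kernel parser: ``apply`` is a hypothetical helper that
keeps values in a plain dict, and treating ``+=`` on a missing key as a normal
definition is an assumption.

```python
def apply(config, key, op, values):
    """Apply one 'key op values' statement to config (dict: key -> list)."""
    if op == "=":
        # Plain assignment must not re-define an existing key.
        if key in config:
            raise ValueError("can not re-define key: " + key)
        config[key] = list(values)
    elif op == ":=":
        # Override: replace whatever value the key had before.
        config[key] = list(values)
    elif op == "+=":
        # Append: add the values as further array members.
        config.setdefault(key, []).extend(values)
    else:
        raise ValueError("unknown operator: " + op)
    return config


if __name__ == "__main__":
    cfg = apply({}, "foo", "=", ["bar", "baz"])
    apply(cfg, "foo", "+=", ["qux"])
    print(cfg)
```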
-However, a sub-key and a value can not co-exist under a parent key.
-For example, following config is NOT allowed.::
+Moreover, sub-keys and a value can coexist under a parent key.
+For example, following config is allowed.::
foo = value1
- foo.bar = value2 # !ERROR! subkey "bar" and value "value1" can NOT co-exist
+ foo.bar = value2
+ foo := value3 # This will update foo's value.
+
+Note, since there is no syntax to put a raw value directly under a
+structured key, you have to define it outside of the brace. For example::
+
+ foo {
+ bar = value1
+ bar {
+ baz = value2
+ qux = value3
+ }
+ }
+
+Also, the order of the value node under a key is fixed. If there
+are a value and subkeys, the value is always the first child node
+of the key. Thus if user specifies subkeys first, e.g.::
+
+ foo.bar = value1
+ foo = value2
+
+In the program (and /proc/bootconfig), it will be shown as below::
+ foo = value2
+ foo.bar = value1
Comments
--------
@@ -125,18 +158,33 @@ Each key-value pair is shown in each line with following style::
Boot Kernel With a Boot Config
==============================
-Since the boot configuration file is loaded with initrd, it will be added
-to the end of the initrd (initramfs) image file with size, checksum and
-12-byte magic word as below.
+There are two options to boot the kernel with bootconfig: attaching the
+bootconfig to the initrd image or embedding it in the kernel itself.
-[initrd][bootconfig][size(u32)][checksum(u32)][#BOOTCONFIG\n]
+Attaching a Boot Config to Initrd
+---------------------------------
+
+Since the boot configuration file is loaded with initrd by default,
+it will be added to the end of the initrd (initramfs) image file with
+padding, size, checksum and 12-byte magic word as below.
+
+[initrd][bootconfig][padding][size(le32)][checksum(le32)][#BOOTCONFIG\n]
+
+The size and checksum fields are unsigned 32-bit little-endian values.
+
+When the boot configuration is added to the initrd image, the total
+file size is aligned to 4 bytes. To fill the gap, null characters
+(``\0``) will be added. Thus the ``size`` is the length of the bootconfig
+file + padding bytes.
The Linux kernel decodes the last part of the initrd image in memory to
get the boot configuration data.
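The layout above can be illustrated with a minimal sketch that appends such a
trailer and reads it back. The checksum is assumed here to be a 32-bit sum of
the bytes, which the text does not state, and ``attach`` and ``detach`` are
hypothetical names. Use the ``bootconfig`` tool for real images.

```python
import struct

MAGIC = b"#BOOTCONFIG\n"           # 12-byte magic word
FOOTER_LEN = 4 + 4 + len(MAGIC)    # size(le32) + checksum(le32) + magic


def byte_sum(data):
    """Assumed checksum: 32-bit sum of all bytes (not stated in the text)."""
    return sum(data) & 0xFFFFFFFF


def attach(initrd, bootconfig):
    """Return initrd + bootconfig + padding + size + checksum + magic."""
    total = len(initrd) + len(bootconfig) + FOOTER_LEN
    padding = b"\0" * (-total % 4)   # align the whole file to 4 bytes
    payload = bootconfig + padding   # 'size' covers the data plus padding
    footer = struct.pack("<II", len(payload), byte_sum(payload)) + MAGIC
    return initrd + payload + footer


def detach(image):
    """Split an image into (initrd, bootconfig payload), or return None."""
    if len(image) < FOOTER_LEN or not image.endswith(MAGIC):
        return None
    size, checksum = struct.unpack("<II", image[-FOOTER_LEN:-len(MAGIC)])
    end = len(image) - FOOTER_LEN
    if size > end:
        return None
    payload = image[end - size:end]
    if byte_sum(payload) != checksum:
        return None
    return image[:end - size], payload
```

Because the trailer is located by counting back from the end of the image,
the reader only finds it when it is given the exact file size.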
Because of this "piggyback" method, there is no need to change or
-update the boot loader and the kernel image itself.
+update the boot loader and the kernel image itself as long as the boot
+loader passes the correct initrd file size. If by any chance, the boot
+loader passes a longer size, the kernel fails to find the bootconfig data.
-To do this operation, Linux kernel provides "bootconfig" command under
+To do this operation, Linux kernel provides ``bootconfig`` command under
tools/bootconfig, which allows admin to apply or delete the config file
to/from initrd image. You can build it by the following command::
@@ -154,6 +202,62 @@ To remove the config from the image, you can use -d option as below::
Then add "bootconfig" on the normal kernel command line to tell the
kernel to look for the bootconfig at the end of the initrd file.
+Embedding a Boot Config into Kernel
+-----------------------------------
+
+If you can not use initrd, you can also embed the bootconfig file in the
+kernel by Kconfig options. In this case, you need to recompile the kernel
+with the following configs::
+
+ CONFIG_BOOT_CONFIG_EMBED=y
+ CONFIG_BOOT_CONFIG_EMBED_FILE="/PATH/TO/BOOTCONFIG/FILE"
+
+``CONFIG_BOOT_CONFIG_EMBED_FILE`` requires an absolute path or a relative
+path to the bootconfig file from source tree or object tree.
+The kernel will embed it as the default bootconfig.
+
+Just as when attaching the bootconfig to the initrd, you need ``bootconfig``
+option on the kernel command line to enable the embedded bootconfig.
+
+Note that even if you set this option, you can override the embedded
+bootconfig by another bootconfig which is attached to the initrd.
+
+Kernel parameters via Boot Config
+=================================
+
+In addition to the kernel command line, the boot config can be used for
+passing the kernel parameters. All the key-value pairs under ``kernel``
+key will be passed to kernel cmdline directly. Moreover, the key-value
+pairs under ``init`` will be passed to init process via the cmdline.
+The parameters are concatenated with user-given kernel cmdline string
+in the following order, so that the command line parameter can override
+bootconfig parameters (this depends on how the subsystem handles parameters
+but in general, earlier parameter will be overwritten by later one.)::
+
+ [bootconfig params][cmdline params] -- [bootconfig init params][cmdline init params]
+
+Here is an example of the bootconfig file for kernel/init parameters.::
+
+ kernel {
+ root = 01234567-89ab-cdef-0123-456789abcd
+ }
+ init {
+ splash
+ }
+
+This will be copied into the kernel cmdline string as the following::
+
+ root="01234567-89ab-cdef-0123-456789abcd" -- splash
+
+If user gives some other command line like,::
+
+ ro bootconfig -- quiet
+
+The final kernel cmdline will be the following::
+
+ root="01234567-89ab-cdef-0123-456789abcd" ro bootconfig -- splash quiet
+
+
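The ordering rule can be sketched as follows. This is only an illustration of
the concatenation order, not the kernel code: ``merge`` and ``split_cmdline``
are hypothetical helpers, and splitting on whitespace assumes that no quoted
value contains a space.

```python
def split_cmdline(cmdline):
    """Split 'kernel params -- init params' into two word lists."""
    words = cmdline.split()
    if "--" in words:
        i = words.index("--")
        return words[:i], words[i + 1:]
    return words, []


def merge(bootconfig_kernel, bootconfig_init, user_cmdline):
    """Bootconfig parameters go first, user cmdline parameters after them."""
    kernel, init = split_cmdline(user_cmdline)
    kernel = bootconfig_kernel + kernel
    init = bootconfig_init + init
    if init:
        return " ".join(kernel + ["--"] + init)
    return " ".join(kernel)


if __name__ == "__main__":
    print(merge(['root="01234567-89ab-cdef-0123-456789abcd"'],
                ["splash"],
                "ro bootconfig -- quiet"))
```

Placing the bootconfig parameters first is what lets a later command line
parameter take precedence in subsystems that keep the last value seen.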
Config File Limitation
======================
@@ -165,7 +269,8 @@ up to 512 key-value pairs. If keys contains 3 words in average, it can
contain 256 key-value pairs. In most cases, the number of config items
will be under 100 entries and smaller than 8KB, so it would be enough.
If the node number exceeds 1024, parser returns an error even if the file
-size is smaller than 32KB.
+size is smaller than 32KB. (Note that this maximum size does not include
+the padding null characters.)
Anyway, since bootconfig command verifies it when appending a boot config
to initrd image, user can notice it before boot.
diff --git a/Documentation/admin-guide/bug-bisect.rst b/Documentation/admin-guide/bug-bisect.rst
index 59567da344e8..325c5d0ed34a 100644
--- a/Documentation/admin-guide/bug-bisect.rst
+++ b/Documentation/admin-guide/bug-bisect.rst
@@ -15,7 +15,7 @@ give up. Report as much as you have found to the relevant maintainer. See
MAINTAINERS for who that is for the subsystem you have worked on.
Before you submit a bug report read
-:ref:`Documentation/admin-guide/reporting-bugs.rst <reportingbugs>`.
+'Documentation/admin-guide/reporting-issues.rst'.
Devices not appearing
=====================
diff --git a/Documentation/admin-guide/bug-hunting.rst b/Documentation/admin-guide/bug-hunting.rst
index 44b8a4edd348..95299b08c405 100644
--- a/Documentation/admin-guide/bug-hunting.rst
+++ b/Documentation/admin-guide/bug-hunting.rst
@@ -49,15 +49,19 @@ the issue, it may also contain the word **Oops**, as on this one::
Despite being an **Oops** or some other sort of stack trace, the offended
line is usually required to identify and handle the bug. Along this chapter,
-we'll refer to "Oops" for all kinds of stack traces that need to be analized.
+we'll refer to "Oops" for all kinds of stack traces that need to be analyzed.
-.. note::
+If the kernel is compiled with ``CONFIG_DEBUG_INFO``, you can enhance the
+quality of the stack trace by using :file:`scripts/decode_stacktrace.sh`.
+
+Modules linked in
+-----------------
+
+Modules that are tainted or are being loaded or unloaded are marked with
+"(...)", where the taint flags are described in
+:file:`Documentation/admin-guide/tainted-kernels.rst`, "being loaded" is
+annotated with "+", and "being unloaded" is annotated with "-".
- ``ksymoops`` is useless on 2.6 or upper. Please use the Oops in its original
- format (from ``dmesg``, etc). Ignore any references in this or other docs to
- "decoding the Oops" or "running it through ksymoops".
- If you post an Oops from 2.6+ that has been run through ``ksymoops``,
- people will just tell you to repost it.
Where is the Oops message located?
----------------------------------
@@ -71,7 +75,7 @@ by running ``journalctl`` command.
Sometimes ``klogd`` dies, in which case you can run ``dmesg > file`` to
read the data from the kernel buffers and save it. Or you can
``cat /proc/kmsg > file``, however you have to break in to stop the transfer,
-``kmsg`` is a "never ending file".
+since ``kmsg`` is a "never ending file".
If the machine has crashed so badly that you cannot enter commands or
the disk is not available then you have three options:
@@ -81,9 +85,9 @@ the disk is not available then you have three options:
planned for a crash. Alternatively, you can take a picture of
the screen with a digital camera - not nice, but better than
nothing. If the messages scroll off the top of the console, you
- may find that booting with a higher resolution (eg, ``vga=791``)
+ may find that booting with a higher resolution (e.g., ``vga=791``)
will allow you to read more of the text. (Caveat: This needs ``vesafb``,
- so won't help for 'early' oopses)
+ so won't help for 'early' oopses.)
(2) Boot with a serial console (see
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`),
@@ -104,7 +108,7 @@ Kernel source file. There are two methods for doing that. Usually, using
gdb
^^^
-The GNU debug (``gdb``) is the best way to figure out the exact file and line
+The GNU debugger (``gdb``) is the best way to figure out the exact file and line
number of the OOPS from the ``vmlinux`` file.
The usage of gdb works best on a kernel compiled with ``CONFIG_DEBUG_INFO``.
@@ -165,7 +169,7 @@ If you have a call trace, such as::
[<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
...
-this shows the problem likely in the :jbd: module. You can load that module
+this shows the problem likely is in the :jbd: module. You can load that module
in gdb and list the relevant code::
$ gdb fs/jbd/jbd.ko
@@ -199,8 +203,9 @@ in the kernel hacking menu of the menu configuration.) For example::
You need to be at the top level of the kernel tree for this to pick up
your C files.
-If you don't have access to the code you can also debug on some crash dumps
-e.g. crash dump output as shown by Dave Miller::
+If you don't have access to the source code you can still debug some crash
+dumps using the following method (example crash dump output as shown by
+Dave Miller)::
EIP is at +0x14/0x4c0
...
@@ -230,6 +235,9 @@ e.g. crash dump output as shown by Dave Miller::
mov 0x8(%ebp), %ebx ! %ebx = skb->sk
mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
+:file:`scripts/decodecode` can be used to automate most of this, depending
+on what CPU architecture is being debugged.
+
Reporting the bug
-----------------
@@ -241,7 +249,7 @@ used for the development of the affected code. This can be done by using
the ``get_maintainer.pl`` script.
For example, if you find a bug at the gspca's sonixj.c file, you can get
-their maintainers with::
+its maintainers with::
$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
@@ -253,16 +261,17 @@ their maintainers with::
Please notice that it will point to:
-- The last developers that touched on the source code. On the above example,
- Tejun and Bhaktipriya (in this specific case, none really envolved on the
- development of this file);
+- The last developers that touched the source code (if this is done inside
+ a git tree). In the above example, Tejun and Bhaktipriya (in this
+ specific case, neither really involved in the development of this file);
- The driver maintainer (Hans Verkuil);
- The subsystem maintainer (Mauro Carvalho Chehab);
- The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
- the Linux Kernel mailing list (linux-kernel@vger.kernel.org).
Usually, the fastest way to have your bug fixed is to report it to mailing
-list used for the development of the code (linux-media ML) copying the driver maintainer (Hans).
+list used for the development of the code (linux-media ML) copying the
+driver maintainer (Hans).
If you are totally stumped as to whom to send the report, and
``get_maintainer.pl`` didn't provide you anything useful, send it to
@@ -303,9 +312,9 @@ protection fault message can be simply cut out of the message files
and forwarded to the kernel developers.
Two types of address resolution are performed by ``klogd``. The first is
-static translation and the second is dynamic translation. Static
-translation uses the System.map file in much the same manner that
-ksymoops does. In order to do static translation the ``klogd`` daemon
+static translation and the second is dynamic translation.
+Static translation uses the System.map file.
+In order to do static translation the ``klogd`` daemon
must be able to find a system map file at daemon initialization time.
See the klogd man page for information on how ``klogd`` searches for map
files.
diff --git a/Documentation/admin-guide/cgroup-v1/blkio-controller.rst b/Documentation/admin-guide/cgroup-v1/blkio-controller.rst
index 36d43ae7dc13..16253eda192e 100644
--- a/Documentation/admin-guide/cgroup-v1/blkio-controller.rst
+++ b/Documentation/admin-guide/cgroup-v1/blkio-controller.rst
@@ -17,36 +17,37 @@ level logical devices like device mapper.
HOWTO
=====
+
Throttling/Upper Limit policy
-----------------------------
-- Enable Block IO controller::
+Enable Block IO controller::
CONFIG_BLK_CGROUP=y
-- Enable throttling in block layer::
+Enable throttling in block layer::
CONFIG_BLK_DEV_THROTTLING=y
-- Mount blkio controller (see cgroups.txt, Why are cgroups needed?)::
+Mount blkio controller (see cgroups.txt, Why are cgroups needed?)::
mount -t cgroup -o blkio none /sys/fs/cgroup/blkio
-- Specify a bandwidth rate on particular device for root group. The format
- for policy is "<major>:<minor> <bytes_per_second>"::
+Specify a bandwidth rate on particular device for root group. The format
+for policy is "<major>:<minor> <bytes_per_second>"::
echo "8:16 1048576" > /sys/fs/cgroup/blkio/blkio.throttle.read_bps_device
- Above will put a limit of 1MB/second on reads happening for root group
- on device having major/minor number 8:16.
+This will put a limit of 1MB/second on reads happening for root group
+on device having major/minor number 8:16.
-- Run dd to read a file and see if rate is throttled to 1MB/s or not::
+Run dd to read a file and see if rate is throttled to 1MB/s or not::
# dd iflag=direct if=/mnt/common/zerofile of=/dev/null bs=4K count=1024
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 4.0001 s, 1.0 MB/s
- Limits for writes can be put using blkio.throttle.write_bps_device file.
+Limits for writes can be put using blkio.throttle.write_bps_device file.
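As a quick sanity check of the numbers above, the following sketch builds the
rule string in the "<major>:<minor> <bytes_per_second>" format and estimates
how long the dd read should take under that limit. The helper names are
hypothetical, and the estimate ignores any caching or startup effects.

```python
def throttle_rule(major, minor, bytes_per_second):
    """Format a rule as '<major>:<minor> <bytes_per_second>'."""
    return "%d:%d %d" % (major, minor, bytes_per_second)


def expected_seconds(total_bytes, bytes_per_second):
    """Time a fully throttled transfer of total_bytes should take."""
    return total_bytes / bytes_per_second


if __name__ == "__main__":
    limit = 1048576                  # 1MB/second, as in the example
    print(throttle_rule(8, 16, limit))
    # dd with bs=4K count=1024 reads 4096 * 1024 bytes in total.
    print(expected_seconds(4096 * 1024, limit))
```

An estimate of about 4 seconds is consistent with the 4.0001 s reported by dd
in the example.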
Hierarchical Cgroups
====================
@@ -79,85 +80,89 @@ following::
Various user visible config options
===================================
-CONFIG_BLK_CGROUP
- - Block IO controller.
-CONFIG_BFQ_CGROUP_DEBUG
- - Debug help. Right now some additional stats file show up in cgroup
+ CONFIG_BLK_CGROUP
+ Block IO controller.
+
+ CONFIG_BFQ_CGROUP_DEBUG
+ Debug help. Right now some additional stats file show up in cgroup
if this option is enabled.
-CONFIG_BLK_DEV_THROTTLING
- - Enable block device throttling support in block layer.
+ CONFIG_BLK_DEV_THROTTLING
+ Enable block device throttling support in block layer.
Details of cgroup files
=======================
+
Proportional weight policy files
--------------------------------
-- blkio.weight
- - Specifies per cgroup weight. This is default weight of the group
- on all the devices until and unless overridden by per device rule.
- (See blkio.weight_device).
- Currently allowed range of weights is from 10 to 1000.
-- blkio.weight_device
- - One can specify per cgroup per device rules using this interface.
- These rules override the default value of group weight as specified
- by blkio.weight.
+ blkio.bfq.weight
+ Specifies per cgroup weight. This is default weight of the group
+ on all the devices until and unless overridden by per device rule
+ (see `blkio.bfq.weight_device` below).
+
+ Currently allowed range of weights is from 1 to 1000. For more details,
+ see Documentation/block/bfq-iosched.rst.
+
+ blkio.bfq.weight_device
+ Specifies per cgroup per device weights, overriding the default group
+ weight. For more details, see Documentation/block/bfq-iosched.rst.
Following is the format::
- # echo dev_maj:dev_minor weight > blkio.weight_device
+ # echo dev_maj:dev_minor weight > blkio.bfq.weight_device
Configure weight=300 on /dev/sdb (8:16) in this cgroup::
- # echo 8:16 300 > blkio.weight_device
- # cat blkio.weight_device
+ # echo 8:16 300 > blkio.bfq.weight_device
+ # cat blkio.bfq.weight_device
dev weight
8:16 300
Configure weight=500 on /dev/sda (8:0) in this cgroup::
- # echo 8:0 500 > blkio.weight_device
- # cat blkio.weight_device
+ # echo 8:0 500 > blkio.bfq.weight_device
+ # cat blkio.bfq.weight_device
dev weight
8:0 500
8:16 300
Remove specific weight for /dev/sda in this cgroup::
- # echo 8:0 0 > blkio.weight_device
- # cat blkio.weight_device
+ # echo 8:0 0 > blkio.bfq.weight_device
+ # cat blkio.bfq.weight_device
dev weight
8:16 300
-- blkio.time
- - disk time allocated to cgroup per device in milliseconds. First
+ blkio.time
+ Disk time allocated to cgroup per device in milliseconds. First
two fields specify the major and minor number of the device and
third field specifies the disk time allocated to group in
milliseconds.
-- blkio.sectors
- - number of sectors transferred to/from disk by the group. First
+ blkio.sectors
+ Number of sectors transferred to/from disk by the group. First
two fields specify the major and minor number of the device and
third field specifies the number of sectors transferred by the
group to/from the device.
-- blkio.io_service_bytes
- - Number of bytes transferred to/from the disk by the group. These
+ blkio.io_service_bytes
+ Number of bytes transferred to/from the disk by the group. These
are further divided by the type of operation - read or write, sync
or async. First two fields specify the major and minor number of the
device, third field specifies the operation type and the fourth field
specifies the number of bytes.
-- blkio.io_serviced
- - Number of IOs (bio) issued to the disk by the group. These
+ blkio.io_serviced
+ Number of IOs (bio) issued to the disk by the group. These
are further divided by the type of operation - read or write, sync
or async. First two fields specify the major and minor number of the
device, third field specifies the operation type and the fourth field
specifies the number of IOs.
-- blkio.io_service_time
- - Total amount of time between request dispatch and request completion
+ blkio.io_service_time
+ Total amount of time between request dispatch and request completion
for the IOs done by this cgroup. This is in nanoseconds to make it
meaningful for flash devices too. For devices with queue depth of 1,
this time represents the actual service time. When queue_depth > 1,
@@ -170,8 +175,8 @@ Proportional weight policy files
specifies the operation type and the fourth field specifies the
io_service_time in ns.
-- blkio.io_wait_time
- - Total amount of time the IOs for this cgroup spent waiting in the
+ blkio.io_wait_time
+ Total amount of time the IOs for this cgroup spent waiting in the
scheduler queues for service. This can be greater than the total time
elapsed since it is cumulative io_wait_time for all IOs. It is not a
measure of total time the cgroup spent waiting but rather a measure of
@@ -185,24 +190,24 @@ Proportional weight policy files
minor number of the device, third field specifies the operation type
and the fourth field specifies the io_wait_time in ns.
-- blkio.io_merged
- - Total number of bios/requests merged into requests belonging to this
+ blkio.io_merged
+ Total number of bios/requests merged into requests belonging to this
cgroup. This is further divided by the type of operation - read or
write, sync or async.
-- blkio.io_queued
- - Total number of requests queued up at any given instant for this
+ blkio.io_queued
+ Total number of requests queued up at any given instant for this
cgroup. This is further divided by the type of operation - read or
write, sync or async.
-- blkio.avg_queue_size
- - Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
+ blkio.avg_queue_size
+ Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
The average queue size for this cgroup over the entire time of this
cgroup's existence. Queue size samples are taken each time one of the
queues of this cgroup gets a timeslice.
-- blkio.group_wait_time
- - Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
+ blkio.group_wait_time
+ Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
This is the amount of time the cgroup had to wait since it became busy
(i.e., went from 0 to 1 request queued) to get a timeslice for one of
its queues. This is different from the io_wait_time which is the
@@ -212,8 +217,8 @@ Proportional weight policy files
will only report the group_wait_time accumulated till the last time it
got a timeslice and will not include the current delta.
-- blkio.empty_time
- - Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
+ blkio.empty_time
+ Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
This is the amount of time a cgroup spends without any pending
requests when not being served, i.e., it does not include any time
spent idling for one of the queues of the cgroup. This is in
@@ -221,8 +226,8 @@ Proportional weight policy files
the stat will only report the empty_time accumulated till the last
time it had a pending request and will not include the current delta.
-- blkio.idle_time
- - Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
+ blkio.idle_time
+ Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
This is the amount of time spent by the IO scheduler idling for a
given cgroup in anticipation of a better request than the existing ones
from other queues/cgroups. This is in nanoseconds. If this is read
@@ -230,60 +235,60 @@ Proportional weight policy files
idle_time accumulated till the last idle period and will not include
the current delta.
-- blkio.dequeue
- - Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y. This
+ blkio.dequeue
+ Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y. This
gives the statistics about how many times a group was dequeued
from service tree of the device. First two fields specify the major
and minor number of the device and third field specifies the number
of times a group was dequeued from a particular device.
-- blkio.*_recursive
- - Recursive version of various stats. These files show the
+ blkio.*_recursive
+ Recursive version of various stats. These files show the
same information as their non-recursive counterparts but
include stats from all the descendant cgroups.
Throttling/Upper limit policy files
-----------------------------------
-- blkio.throttle.read_bps_device
- - Specifies upper limit on READ rate from the device. IO rate is
+ blkio.throttle.read_bps_device
+ Specifies upper limit on READ rate from the device. IO rate is
specified in bytes per second. Rules are per device. Following is
the format::
echo "<major>:<minor> <rate_bytes_per_second>" > /cgrp/blkio.throttle.read_bps_device
-- blkio.throttle.write_bps_device
- - Specifies upper limit on WRITE rate to the device. IO rate is
+ blkio.throttle.write_bps_device
+ Specifies upper limit on WRITE rate to the device. IO rate is
specified in bytes per second. Rules are per device. Following is
the format::
echo "<major>:<minor> <rate_bytes_per_second>" > /cgrp/blkio.throttle.write_bps_device
-- blkio.throttle.read_iops_device
- - Specifies upper limit on READ rate from the device. IO rate is
+ blkio.throttle.read_iops_device
+ Specifies upper limit on READ rate from the device. IO rate is
specified in IO per second. Rules are per device. Following is
the format::
echo "<major>:<minor> <rate_io_per_second>" > /cgrp/blkio.throttle.read_iops_device
-- blkio.throttle.write_iops_device
- - Specifies upper limit on WRITE rate to the device. IO rate is
+ blkio.throttle.write_iops_device
+ Specifies upper limit on WRITE rate to the device. IO rate is
specified in io per second. Rules are per device. Following is
the format::
echo "<major>:<minor> <rate_io_per_second>" > /cgrp/blkio.throttle.write_iops_device
-Note: If both BW and IOPS rules are specified for a device, then IO is
- subjected to both the constraints.
+ Note: If both BW and IOPS rules are specified for a device, then IO is
+ subjected to both the constraints.
-- blkio.throttle.io_serviced
- - Number of IOs (bio) issued to the disk by the group. These
+ blkio.throttle.io_serviced
+ Number of IOs (bio) issued to the disk by the group. These
are further divided by the type of operation - read or write, sync
or async. First two fields specify the major and minor number of the
device, third field specifies the operation type and the fourth field
specifies the number of IOs.
-- blkio.throttle.io_service_bytes
- - Number of bytes transferred to/from the disk by the group. These
+ blkio.throttle.io_service_bytes
+ Number of bytes transferred to/from the disk by the group. These
are further divided by the type of operation - read or write, sync
or async. First two fields specify the major and minor number of the
device, third field specifies the operation type and the fourth field
@@ -291,6 +296,6 @@ Note: If both BW and IOPS rules are specified for a device, then IO is
Common files among various policies
-----------------------------------
-- blkio.reset_stats
- - Writing an int to this file will result in resetting all the stats
+ blkio.reset_stats
+ Writing an int to this file will result in resetting all the stats
for that cgroup.
diff --git a/Documentation/admin-guide/cgroup-v1/cpusets.rst b/Documentation/admin-guide/cgroup-v1/cpusets.rst
index 86a6ae995d54..5d844ed4df69 100644
--- a/Documentation/admin-guide/cgroup-v1/cpusets.rst
+++ b/Documentation/admin-guide/cgroup-v1/cpusets.rst
@@ -1,3 +1,5 @@
+.. _cpusets:
+
=======
CPUSETS
=======
@@ -223,6 +225,17 @@ cpu_online_mask using a CPU hotplug notifier, and the mems file
automatically tracks the value of node_states[N_MEMORY]--i.e.,
nodes with memory--using the cpuset_track_online_nodes() hook.
+The cpuset.effective_cpus and cpuset.effective_mems files are
+normally read-only copies of cpuset.cpus and cpuset.mems files
+respectively. If the cpuset cgroup filesystem is mounted with the
+special "cpuset_v2_mode" option, the behavior of these files will become
+similar to the corresponding files in cpuset v2. In other words, hotplug
+events will not change cpuset.cpus and cpuset.mems. Those events will
+only affect cpuset.effective_cpus and cpuset.effective_mems which show
+the actual cpus and memory nodes that are currently used by this cpuset.
+See Documentation/admin-guide/cgroup-v2.rst for more information about
+cpuset v2 behavior.
+
1.4 What are exclusive cpusets ?
--------------------------------
diff --git a/Documentation/admin-guide/cgroup-v1/hugetlb.rst b/Documentation/admin-guide/cgroup-v1/hugetlb.rst
index a3902aa253a9..0fa724d82abb 100644
--- a/Documentation/admin-guide/cgroup-v1/hugetlb.rst
+++ b/Documentation/admin-guide/cgroup-v1/hugetlb.rst
@@ -2,13 +2,6 @@
HugeTLB Controller
==================
-The HugeTLB controller allows to limit the HugeTLB usage per control group and
-enforces the controller limit during page fault. Since HugeTLB doesn't
-support page reclaim, enforcing the limit at page fault time implies that,
-the application will get SIGBUS signal if it tries to access HugeTLB pages
-beyond its limit. This requires the application to know beforehand how much
-HugeTLB pages it would require for its use.
-
HugeTLB controller can be created by first mounting the cgroup filesystem.
# mount -t cgroup -o hugetlb none /sys/fs/cgroup
@@ -28,23 +21,115 @@ process (bash) into it.
Brief summary of control files::
- hugetlb.<hugepagesize>.limit_in_bytes # set/show limit of "hugepagesize" hugetlb usage
- hugetlb.<hugepagesize>.max_usage_in_bytes # show max "hugepagesize" hugetlb usage recorded
- hugetlb.<hugepagesize>.usage_in_bytes # show current usage for "hugepagesize" hugetlb
- hugetlb.<hugepagesize>.failcnt # show the number of allocation failure due to HugeTLB limit
+ hugetlb.<hugepagesize>.rsvd.limit_in_bytes # set/show limit of "hugepagesize" hugetlb reservations
+ hugetlb.<hugepagesize>.rsvd.max_usage_in_bytes # show max "hugepagesize" hugetlb reservations and no-reserve faults
+ hugetlb.<hugepagesize>.rsvd.usage_in_bytes # show current reservations and no-reserve faults for "hugepagesize" hugetlb
+ hugetlb.<hugepagesize>.rsvd.failcnt # show the number of allocation failures due to HugeTLB reservation limit
+ hugetlb.<hugepagesize>.limit_in_bytes # set/show limit of "hugepagesize" hugetlb faults
+ hugetlb.<hugepagesize>.max_usage_in_bytes # show max "hugepagesize" hugetlb usage recorded
+ hugetlb.<hugepagesize>.usage_in_bytes # show current usage for "hugepagesize" hugetlb
+ hugetlb.<hugepagesize>.failcnt # show the number of allocation failures due to HugeTLB usage limit
+ hugetlb.<hugepagesize>.numa_stat # show the NUMA information of the hugetlb memory charged to this cgroup
For a system supporting three hugepage sizes (64k, 32M and 1G), the control
files include::
hugetlb.1GB.limit_in_bytes
hugetlb.1GB.max_usage_in_bytes
+ hugetlb.1GB.numa_stat
hugetlb.1GB.usage_in_bytes
hugetlb.1GB.failcnt
+ hugetlb.1GB.rsvd.limit_in_bytes
+ hugetlb.1GB.rsvd.max_usage_in_bytes
+ hugetlb.1GB.rsvd.usage_in_bytes
+ hugetlb.1GB.rsvd.failcnt
hugetlb.64KB.limit_in_bytes
hugetlb.64KB.max_usage_in_bytes
+ hugetlb.64KB.numa_stat
hugetlb.64KB.usage_in_bytes
hugetlb.64KB.failcnt
+ hugetlb.64KB.rsvd.limit_in_bytes
+ hugetlb.64KB.rsvd.max_usage_in_bytes
+ hugetlb.64KB.rsvd.usage_in_bytes
+ hugetlb.64KB.rsvd.failcnt
hugetlb.32MB.limit_in_bytes
hugetlb.32MB.max_usage_in_bytes
+ hugetlb.32MB.numa_stat
hugetlb.32MB.usage_in_bytes
hugetlb.32MB.failcnt
+ hugetlb.32MB.rsvd.limit_in_bytes
+ hugetlb.32MB.rsvd.max_usage_in_bytes
+ hugetlb.32MB.rsvd.usage_in_bytes
+ hugetlb.32MB.rsvd.failcnt
+
+
+1. Page fault accounting
+
+hugetlb.<hugepagesize>.limit_in_bytes
+hugetlb.<hugepagesize>.max_usage_in_bytes
+hugetlb.<hugepagesize>.usage_in_bytes
+hugetlb.<hugepagesize>.failcnt
+
+The HugeTLB controller allows users to limit the HugeTLB usage (page fault) per
+control group and enforces the limit during page fault. Since HugeTLB
+doesn't support page reclaim, enforcing the limit at page fault time implies
+that the application will get a SIGBUS signal if it tries to fault in HugeTLB
+pages beyond its limit. Therefore the application needs to know exactly how many
+HugeTLB pages it uses beforehand, and the sysadmin needs to make sure that
+there are enough available on the machine for all the users to avoid processes
+getting SIGBUS.
+
+
+2. Reservation accounting
+
+hugetlb.<hugepagesize>.rsvd.limit_in_bytes
+hugetlb.<hugepagesize>.rsvd.max_usage_in_bytes
+hugetlb.<hugepagesize>.rsvd.usage_in_bytes
+hugetlb.<hugepagesize>.rsvd.failcnt
+
+The HugeTLB controller allows users to limit the HugeTLB reservations per control
+group and enforces the controller limit at reservation time and at the fault of
+HugeTLB memory for which no reservation exists. Since reservation limits are
+enforced at reservation time (on mmap or shmget), reservation limits never cause
+the application to get a SIGBUS signal if the memory was reserved beforehand. For
+MAP_NORESERVE allocations, the reservation limit behaves the same as the fault
+limit, enforcing memory usage at fault time and causing the application to
+receive a SIGBUS if it crosses its limit.
+
+Reservation limits are superior to the page fault limits described above, since
+reservation limits are enforced at reservation time (on mmap or shmget), and
+never cause the application to get a SIGBUS signal if the memory was reserved
+beforehand. This allows for easier fallback to alternatives such as
+non-HugeTLB memory. In the case of page fault accounting, it's very
+hard to avoid processes getting SIGBUS since the sysadmin needs to know precisely
+the HugeTLB usage of all the tasks in the system and make sure there are enough
+pages to satisfy all requests. Avoiding tasks getting SIGBUS on overcommitted
+systems is practically impossible with page fault accounting.
+
+
+3. Caveats with shared memory
+
+For shared HugeTLB memory, both HugeTLB reservation and page faults are charged
+to the first task that causes the memory to be reserved or faulted, and all
+subsequent uses of this reserved or faulted memory are done without charging.
+
+Shared HugeTLB memory is only uncharged when it is unreserved or deallocated.
+This is usually when the HugeTLB file is deleted, and not when the task that
+caused the reservation or fault has exited.
+
+
+4. Caveats with HugeTLB cgroup offline
+
+When a HugeTLB cgroup goes offline with some reservations or faults still
+charged to it, the behavior is as follows:
+
+- The fault charges are charged to the parent HugeTLB cgroup (reparented),
+- the reservation charges remain on the offline HugeTLB cgroup.
+
+This means that if a HugeTLB cgroup gets offlined while there are still HugeTLB
+reservations charged to it, that cgroup persists as a zombie until all HugeTLB
+reservations are uncharged. HugeTLB reservations behave in this manner to match
+the memory controller whose cgroups also persist as zombies until all charged
+memory is uncharged. Also, the tracking of HugeTLB reservations is a bit more
+complex compared to the tracking of HugeTLB faults, so it is significantly
+harder to reparent reservations at offline time.
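As a small illustration of the control-file naming used in the hugetlb.rst changes above, the following Python sketch builds the name of a limit file and the byte value to write for a given number of huge pages. The helper names `hugepage_bytes` and `limit_for_pages` are hypothetical, and the sketch assumes the size labels (64KB, 32MB, 1GB) denote binary units.

```python
import re

# Assumption: the size labels in the file names are binary units.
_UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def hugepage_bytes(label):
    """Convert a size label such as "32MB" into a byte count."""
    match = re.fullmatch(r"(\d+)(KB|MB|GB)", label)
    if match is None:
        raise ValueError(f"unrecognised huge page size label: {label!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def limit_for_pages(label, pages, reserved=True):
    """Return (control file name, limit in bytes) for `pages` huge pages.

    reserved=True selects the reservation limit (rsvd.limit_in_bytes);
    reserved=False selects the page fault limit (limit_in_bytes).
    """
    infix = "rsvd." if reserved else ""
    name = f"hugetlb.{label}.{infix}limit_in_bytes"
    return name, pages * hugepage_bytes(label)


print(limit_for_pages("32MB", 4))
```

The returned value would then be written to the named file inside the cgroup directory, in the same way as the other limit files.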
diff --git a/Documentation/admin-guide/cgroup-v1/index.rst b/Documentation/admin-guide/cgroup-v1/index.rst
index 10bf48bae0b0..99fbc8a64ba9 100644
--- a/Documentation/admin-guide/cgroup-v1/index.rst
+++ b/Documentation/admin-guide/cgroup-v1/index.rst
@@ -1,3 +1,5 @@
+.. _cgroup-v1:
+
========================
Control Groups version 1
========================
@@ -15,6 +17,7 @@ Control Groups version 1
hugetlb
memcg_test
memory
+ misc
net_cls
net_prio
pids
diff --git a/Documentation/admin-guide/cgroup-v1/memcg_test.rst b/Documentation/admin-guide/cgroup-v1/memcg_test.rst
index 3f7115e07b5d..a402359abb99 100644
--- a/Documentation/admin-guide/cgroup-v1/memcg_test.rst
+++ b/Documentation/admin-guide/cgroup-v1/memcg_test.rst
@@ -97,7 +97,7 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
=============
Page Cache is charged at
- - add_to_page_cache_locked().
+ - filemap_add_folio().
The logic is very clear. (About migration, see below)
@@ -133,18 +133,9 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
8. LRU
======
- Each memcg has its own private LRU. Now, its handling is under global
- VM's control (means that it's handled under global pgdat->lru_lock).
- Almost all routines around memcg's LRU is called by global LRU's
- list management functions under pgdat->lru_lock.
-
- A special function is mem_cgroup_isolate_pages(). This scans
- memcg's private LRU and call __isolate_lru_page() to extract a page
- from LRU.
-
- (By __isolate_lru_page(), the page is removed from both of global and
- private LRU.)
-
+ Each memcg has its own vector of LRUs (inactive anon, active anon,
+ inactive file, active file, unevictable) of pages from each node,
+ each LRU handled under a single lru_lock for that memcg and node.
9. Typical Tests.
=================
@@ -219,13 +210,11 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
This is an easy way to test page migration, too.
-9.5 mkdir/rmdir
----------------
+9.5 nested cgroups
+------------------
- When using hierarchy, mkdir/rmdir test should be done.
- Use tests like the following::
+ Use tests like the following for testing nested cgroups::
- echo 1 >/opt/cgroup/01/memory/use_hierarchy
mkdir /opt/cgroup/01/child_a
mkdir /opt/cgroup/01/child_b
diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 0ae4f564c2d6..5b86245450bd 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -64,6 +64,7 @@ Brief summary of control files.
threads
cgroup.procs show list of processes
cgroup.event_control an interface for event_fd()
+ This knob is not available on CONFIG_PREEMPT_RT systems.
memory.usage_in_bytes show current usage for memory
(See 5.5 for details)
memory.memsw.usage_in_bytes show current usage for memory+Swap
@@ -75,8 +76,11 @@ Brief summary of control files.
memory.max_usage_in_bytes show max memory usage recorded
memory.memsw.max_usage_in_bytes show max memory+Swap usage recorded
memory.soft_limit_in_bytes set/show soft limit of memory usage
+ This knob is not available on CONFIG_PREEMPT_RT systems.
memory.stat show various statistics
memory.use_hierarchy set/show hierarchical account enabled
+ This knob is deprecated and shouldn't be
+ used.
memory.force_empty trigger forced page reclaim
memory.pressure_level set memory pressure notifications
memory.swappiness set/show swappiness parameter of vmscan
@@ -85,10 +89,8 @@ Brief summary of control files.
memory.oom_control set/show oom controls.
memory.numa_stat show the number of memory usage per numa
node
- memory.kmem.limit_in_bytes set/show hard limit for kernel memory
- This knob is deprecated and shouldn't be
- used. It is planned that this be removed in
- the foreseeable future.
+ memory.kmem.limit_in_bytes This knob is deprecated and writing to
+ it will return -ENOTSUPP.
memory.kmem.usage_in_bytes show current kernel memory allocation
memory.kmem.failcnt show the number of kernel memory usage
hits limits
@@ -199,11 +201,11 @@ An RSS page is unaccounted when it's fully unmapped. A PageCache page is
unaccounted when it's removed from radix-tree. Even if RSS pages are fully
unmapped (by kswapd), they may exist as SwapCache in the system until they
are really freed. Such SwapCaches are also accounted.
-A swapped-in page is not accounted until it's mapped.
+A swapped-in page is accounted after it is added to the swapcache.
Note: The kernel does swapin-readahead and reads multiple swaps at once.
-This means swapped-in pages may contain pages for other tasks than a task
-causing page fault. So, we avoid accounting at swap-in I/O.
+Since a page's memcg is recorded in swap regardless of whether memsw is enabled,
+the page will be accounted after swapin.
At page migration, accounting information is kept.
@@ -222,18 +224,13 @@ the cgroup that brought it in -- this will happen on memory pressure).
But see section 8.2: when moving a task to another cgroup, its pages may
be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
-Exception: If CONFIG_MEMCG_SWAP is not used.
-When you do swapoff and make swapped-out pages of shmem(tmpfs) to
-be backed into memory in force, charges for pages are accounted against the
-caller of swapoff rather than the users of shmem.
-
-2.4 Swap Extension (CONFIG_MEMCG_SWAP)
+2.4 Swap Extension
--------------------------------------
-Swap Extension allows you to record charge for swap. A swapped-in page is
-charged back to original page allocator if possible.
+Swap usage is always recorded for each cgroup. Swap Extension allows you to
+read and limit it.
-When swap is accounted, following files are added.
+When CONFIG_SWAP is enabled, the following files are added.
- memory.memsw.usage_in_bytes.
- memory.memsw.limit_in_bytes.
@@ -290,22 +287,19 @@ When oom event notifier is registered, event will be delivered.
2.6 Locking
-----------
- lock_page_cgroup()/unlock_page_cgroup() should not be called under
- the i_pages lock.
-
- Other lock order is following:
+Lock order is as follows:
- PG_locked.
- mm->page_table_lock
- pgdat->lru_lock
- lock_page_cgroup.
+ Page lock (PG_locked bit of page->flags)
+ mm->page_table_lock or split pte_lock
+ lock_page_memcg (memcg->move_lock)
+ mapping->i_pages lock
+ lruvec->lru_lock.
- In many cases, just lock_page_cgroup() is called.
+Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
+lruvec->lru_lock; PG_lru bit of page->flags is cleared before
+isolating a page from its LRU under lruvec->lru_lock.
- per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by
- pgdat->lru_lock, it has no lock of its own.
-
-2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM)
+2.7 Kernel Memory Extension
-----------------------------------------------
With the Kernel memory extension, the Memory Controller is able to limit
@@ -366,8 +360,8 @@ U != 0, K = unlimited:
U != 0, K < U:
Kernel memory is a subset of the user memory. This setup is useful in
- deployments where the total amount of memory per-cgroup is overcommited.
- Overcommiting kernel memory limits is definitely not recommended, since the
+ deployments where the total amount of memory per-cgroup is overcommitted.
+ Overcommitting kernel memory limits is definitely not recommended, since the
box can still run out of non-reclaimable memory.
In this case, the admin could set up K so that the sum of all groups is
never greater than the total memory, and freely set U at the cost of his
@@ -392,8 +386,6 @@ U != 0, K >= U:
a. Enable CONFIG_CGROUPS
b. Enable CONFIG_MEMCG
-c. Enable CONFIG_MEMCG_SWAP (to use swap extension)
-d. Enable CONFIG_MEMCG_KMEM (to use kmem extension)
3.1. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
-------------------------------------------------------------------
@@ -500,16 +492,13 @@ cgroup might have some charge associated with it, even though all
tasks have migrated away from it. (because we charge against pages, not
against tasks.)
-We move the stats to root (if use_hierarchy==0) or parent (if
-use_hierarchy==1), and no change on the charge except uncharging
+We move the stats to the parent, and there is no change to the charge except uncharging
from the child.
Charges recorded in swap information is not updated at removal of cgroup.
Recorded information is discarded and a cgroup which uses swap (swapcache)
will be charged as a new owner of it.
-About use_hierarchy, see Section 6.
-
5. Misc. interfaces
===================
@@ -527,13 +516,6 @@ About use_hierarchy, see Section 6.
charged file caches. Some out-of-use page caches may keep charged until
memory pressure happens. If you want to avoid that, force_empty will be useful.
- Also, note that when memory.kmem.limit_in_bytes is set the charges due to
- kernel pages will still be seen. This is not considered a failure and the
- write will still return success. In this case, it is expected that
- memory.kmem.usage_in_bytes == memory.usage_in_bytes.
-
- About use_hierarchy, see Section 6.
-
5.2 stat file
-------------
@@ -680,31 +662,20 @@ hierarchy::
d e
In the diagram above, with hierarchical accounting enabled, all memory
-usage of e, is accounted to its ancestors up until the root (i.e, c and root),
-that has memory.use_hierarchy enabled. If one of the ancestors goes over its
-limit, the reclaim algorithm reclaims from the tasks in the ancestor and the
-children of the ancestor.
+usage of e is accounted to its ancestors up until the root (i.e., c and root).
+If one of the ancestors goes over its limit, the reclaim algorithm reclaims
+from the tasks in the ancestor and the children of the ancestor.
-6.1 Enabling hierarchical accounting and reclaim
-------------------------------------------------
+6.1 Hierarchical accounting and reclaim
+---------------------------------------
-A memory cgroup by default disables the hierarchy feature. Support
-can be enabled by writing 1 to memory.use_hierarchy file of the root cgroup::
+Hierarchical accounting is enabled by default. Disabling the hierarchical
+accounting is deprecated. An attempt to do it will result in a failure
+and a warning printed to dmesg.
- # echo 1 > memory.use_hierarchy
-
-The feature can be disabled by::
+For compatibility reasons, writing 1 to memory.use_hierarchy will always pass::
- # echo 0 > memory.use_hierarchy
-
-NOTE1:
- Enabling/disabling will fail if either the cgroup already has other
- cgroups created below it, or if the parent cgroup has use_hierarchy
- enabled.
-
-NOTE2:
- When panic_on_oom is set to "2", the whole system will panic in
- case of an OOM event in any cgroup.
+ # echo 1 > memory.use_hierarchy
7. Soft limits
==============
@@ -873,6 +844,9 @@ At reading, current status of OOM is shown.
(if 1, oom-killer is disabled)
- under_oom 0 or 1
(if 1, the memory cgroup is under OOM, tasks may be stopped.)
+ - oom_kill integer counter
+ The number of processes belonging to this cgroup killed by any
+ kind of OOM killer.
11. Memory Pressure
===================
@@ -985,21 +959,21 @@ References
2. Singh, Balbir. Memory Controller (RSS Control),
http://lwn.net/Articles/222762/
3. Emelianov, Pavel. Resource controllers based on process cgroups
- http://lkml.org/lkml/2007/3/6/198
+ https://lore.kernel.org/r/45ED7DEC.7010403@sw.ru
4. Emelianov, Pavel. RSS controller based on process cgroups (v2)
- http://lkml.org/lkml/2007/4/9/78
+ https://lore.kernel.org/r/461A3010.90403@sw.ru
5. Emelianov, Pavel. RSS controller based on process cgroups (v3)
- http://lkml.org/lkml/2007/5/30/244
+ https://lore.kernel.org/r/465D9739.8070209@openvz.org
6. Menage, Paul. Control Groups v10, http://lwn.net/Articles/236032/
7. Vaidyanathan, Srinivasan, Control Groups: Pagecache accounting and control
subsystem (v3), http://lwn.net/Articles/235534/
8. Singh, Balbir. RSS controller v2 test results (lmbench),
- http://lkml.org/lkml/2007/5/17/232
+ https://lore.kernel.org/r/464C95D4.7070806@linux.vnet.ibm.com
9. Singh, Balbir. RSS controller v2 AIM9 results
- http://lkml.org/lkml/2007/5/18/1
+ https://lore.kernel.org/r/464D267A.50107@linux.vnet.ibm.com
10. Singh, Balbir. Memory controller v6 test results,
- http://lkml.org/lkml/2007/8/19/36
+ https://lore.kernel.org/r/20070819094658.654.84837.sendpatchset@balbir-laptop
11. Singh, Balbir. Memory controller introduction (v6),
- http://lkml.org/lkml/2007/8/17/69
+ https://lore.kernel.org/r/20070817084228.26003.12568.sendpatchset@balbir-laptop
12. Corbet, Jonathan, Controlling memory use in cgroups,
http://lwn.net/Articles/243795/
diff --git a/Documentation/admin-guide/cgroup-v1/misc.rst b/Documentation/admin-guide/cgroup-v1/misc.rst
new file mode 100644
index 000000000000..661614c24df3
--- /dev/null
+++ b/Documentation/admin-guide/cgroup-v1/misc.rst
@@ -0,0 +1,4 @@
+===============
+Misc controller
+===============
+Please refer to the "Misc" documentation in Documentation/admin-guide/cgroup-v2.rst
diff --git a/Documentation/admin-guide/cgroup-v1/rdma.rst b/Documentation/admin-guide/cgroup-v1/rdma.rst
index 2fcb0a9bf790..e69369b7252e 100644
--- a/Documentation/admin-guide/cgroup-v1/rdma.rst
+++ b/Documentation/admin-guide/cgroup-v1/rdma.rst
@@ -114,4 +114,4 @@ Following resources can be accounted by rdma controller.
(d) Delete resource limit::
- echo echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
+ echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 3f801461f0f3..dc254a3cb956 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1,3 +1,5 @@
+.. _cgroup-v2:
+
================
Control Group v2
================
@@ -9,7 +11,7 @@ This is the authoritative documentation on the design, interface and
conventions of cgroup v2. It describes all userland-visible aspects
of cgroup including core and specific controller behaviors. All
future changes must be reflected in this document. Documentation for
-v1 is available under Documentation/admin-guide/cgroup-v1/.
+v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgroup-v1>`.
.. CONTENTS
@@ -54,6 +56,7 @@ v1 is available under Documentation/admin-guide/cgroup-v1/.
5-3-3. IO Latency
5-3-3-1. How IO Latency Throttling Works
5-3-3-2. IO Latency Interface Files
+ 5-3-4. IO Priority
5-4. PID
5-4-1. PID Interface Files
5-5. Cpuset
@@ -63,8 +66,11 @@ v1 is available under Documentation/admin-guide/cgroup-v1/.
5-7-1. RDMA Interface Files
5-8. HugeTLB
5.8-1. HugeTLB Interface Files
- 5-8. Misc
- 5-8-1. perf_event
+ 5-9. Misc
+ 5.9-1 Miscellaneous cgroup Interface Files
+ 5.9-2 Migration and Ownership
+ 5-10. Others
+ 5-10-1. perf_event
5-N. Non-normative information
5-N-1. CPU controller root cgroup process behaviour
5-N-2. IO controller root cgroup process behaviour
@@ -172,15 +178,21 @@ disabling controllers in v1 and make them always available in v2.
cgroup v2 currently supports the following mount options.
nsdelegate
-
Consider cgroup namespaces as delegation boundaries. This
option is system wide and can only be set on mount or modified
through remount from the init namespace. The mount option is
ignored on non-init namespace mounts. Please refer to the
Delegation section for details.
- memory_localevents
+ favordynmods
+ Reduce the latencies of dynamic cgroup modifications such as
+ task migrations and controller on/offs at the cost of making
+ hot path operations such as forks and exits more expensive.
+ The static usage pattern of creating a cgroup, enabling
+ controllers, and then seeding it with CLONE_INTO_CGROUP is
+ not affected by this option.
+ memory_localevents
Only populate memory.events with data for the current cgroup,
and not any subtrees. This is legacy behaviour, the default
behaviour without this option is to include subtree counts.
@@ -188,6 +200,16 @@ cgroup v2 currently supports the following mount options.
modified through remount from the init namespace. The mount
option is ignored on non-init namespace mounts.
+ memory_recursiveprot
+ Recursively apply memory.min and memory.low protection to
+ entire subtrees, without requiring explicit downward
+ propagation into leaf cgroups. This allows protecting entire
+ subtrees from one another, while retaining free competition
+ within those subtrees. This should have been the default
+ behavior but is a mount-option to avoid regressing setups
+ relying on the original semantics (e.g. specifying bogusly
+ high 'bypass' protection values at higher tree levels).
+
Organizing Processes and Threads
--------------------------------
@@ -703,9 +725,7 @@ Conventions
- Settings for a single feature should be contained in a single file.
- The root cgroup should be exempt from resource control and thus
- shouldn't have resource control interface files. Also,
- informational files on the root cgroup which end up showing global
- information available elsewhere shouldn't exist.
+ shouldn't have resource control interface files.
- The default time unit is microseconds. If a different unit is ever
used, an explicit unit suffix must be present.
@@ -777,7 +797,6 @@ Core Interface Files
All cgroup core files are prefixed with "cgroup."
cgroup.type
-
A read-write single value file which exists on non-root
cgroups.
@@ -942,9 +961,49 @@ All cgroup core files are prefixed with "cgroup."
it's possible to delete a frozen (and empty) cgroup, as well as
create new sub-cgroups.
+ cgroup.kill
+ A write-only single value file which exists in non-root cgroups.
+ The only allowed value is "1".
+
+ Writing "1" to the file causes the cgroup and all descendant cgroups to
+ be killed. This means that all processes located in the affected cgroup
+ tree will be killed via SIGKILL.
+
+ Killing a cgroup tree will deal with concurrent forks appropriately and
+ is protected against migrations.
+
+ In a threaded cgroup, writing this file fails with EOPNOTSUPP as
+ killing cgroups is a process directed operation, i.e. it affects
+ the whole thread-group.
+
+ cgroup.pressure
+ A read-write single value file whose allowed values are "0" and "1".
+ The default is "1".
+
+ Writing "0" to the file will disable the cgroup PSI accounting.
+ Writing "1" to the file will re-enable the cgroup PSI accounting.
+
+ This control attribute is not hierarchical, so disabling or enabling PSI
+ accounting in a cgroup does not affect PSI accounting in its descendants
+ and does not require enablement to be passed down via ancestors from the root.
+
+ The reason this control attribute exists is that PSI accounts stalls for
+ each cgroup separately and aggregates them at each level of the hierarchy.
+ This may cause non-negligible overhead for some workloads deep in the
+ hierarchy, in which case this control attribute can
+ be used to disable PSI accounting in the non-leaf cgroups.
+
+ irq.pressure
+ A read-write nested-keyed file.
+
+ Shows pressure stall information for IRQ/SOFTIRQ. See
+ :ref:`Documentation/accounting/psi.rst <psi>` for details.
+
Controllers
===========
+.. _cgroup-v2-cpu:
+
CPU
---
@@ -974,7 +1033,7 @@ CPU Interface Files
All time durations are in microseconds.
cpu.stat
- A read-only flat-keyed file which exists on non-root cgroups.
+ A read-only flat-keyed file.
This file exists whether the controller is enabled or not.
It always reports the following three stats:
@@ -988,6 +1047,8 @@ All time durations are in microseconds.
- nr_periods
- nr_throttled
- throttled_usec
+ - nr_bursts
+ - burst_usec
cpu.weight
A read-write single value file which exists on non-root
@@ -1019,11 +1080,17 @@ All time durations are in microseconds.
$PERIOD duration. "max" for $MAX indicates no limit. If only
one number is written, $MAX is updated.
+ cpu.max.burst
+ A read-write single value file which exists on non-root
+ cgroups. The default is "0".
+
+ The allowed burst; the value must be in the range [0, $MAX].
+
cpu.pressure
- A read-only nested-key file which exists on non-root cgroups.
+ A read-write nested-keyed file.
Shows pressure stall information for CPU. See
- Documentation/accounting/psi.rst for details.
+ :ref:`Documentation/accounting/psi.rst <psi>` for details.
cpu.uclamp.min
A read-write single value file which exists on non-root cgroups.
@@ -1103,7 +1170,7 @@ PAGE_SIZE multiple when read back.
proportionally to the overage, reducing reclaim pressure for
smaller overages.
- Effective min boundary is limited by memory.min values of
+ Effective min boundary is limited by memory.min values of
all ancestor cgroups. If there is memory.min overcommitment
(child cgroup or cgroups are requiring more protected memory
than parent will allow), then each child cgroup will get
@@ -1161,10 +1228,52 @@ PAGE_SIZE multiple when read back.
Under certain circumstances, the usage may go over the limit
temporarily.
+ In the default configuration, regular 0-order allocations always
+ succeed unless the OOM killer chooses the current task as a victim.
+
+ Some kinds of allocations don't invoke the OOM killer.
+ The caller could retry them differently, return -ENOMEM to
+ userspace, or silently ignore the failure in cases like disk readahead.
+
This is the ultimate protection mechanism. As long as the
high limit is used and monitored properly, this limit's
utility is limited to providing the final safety net.
+ memory.reclaim
+ A write-only nested-keyed file which exists for all cgroups.
+
+ This is a simple interface to trigger memory reclaim in the
+ target cgroup.
+
+ This file accepts a single key, the number of bytes to reclaim.
+ No nested keys are currently supported.
+
+ Example::
+
+ echo "1G" > memory.reclaim
+
+ The interface can be later extended with nested keys to
+ configure the reclaim behavior. For example, specify the
+ type of memory to reclaim from (anon, file, ..).
+
+ Please note that the kernel can over or under reclaim from
+ the target cgroup. If fewer bytes are reclaimed than the
+ specified amount, -EAGAIN is returned.
+
+ Please note that the proactive reclaim (triggered by this
+ interface) is not meant to indicate memory pressure on the
+ memory cgroup. Therefore socket memory balancing triggered by
+ the memory reclaim normally is not exercised in this case.
+ This means that the networking layer will not adapt based on
+ reclaim induced by memory.reclaim.
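 A caller driving this interface from a program might handle the -EAGAIN case along the following lines. This is a minimal Python sketch, not a reference implementation: the function name `request_reclaim` and the cgroup directory passed to it are hypothetical, and the only behaviour assumed is what is described above (a single value is written, and EAGAIN signals that less than the requested amount was reclaimed).

```python
import errno
import os


def request_reclaim(cgroup_dir, amount):
    """Write `amount` (e.g. "1G") to memory.reclaim inside cgroup_dir.

    Returns True if the write succeeded and False if the kernel reported
    EAGAIN, i.e. fewer bytes were reclaimed than requested. Any other
    error is re-raised.
    """
    path = os.path.join(cgroup_dir, "memory.reclaim")
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, amount.encode("ascii"))
    except OSError as err:
        if err.errno == errno.EAGAIN:
            return False
        raise
    finally:
        os.close(fd)
    return True
```

 For example, `request_reclaim("/sys/fs/cgroup/workload", "1G")` would mirror the echo example above for a hypothetical cgroup named "workload". A False result is not necessarily fatal; the caller may retry or accept the partial reclaim.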
+
+ memory.peak
+ A read-only single value file which exists on non-root
+ cgroups.
+
+ The max memory usage recorded for the cgroup and its
+ descendants since the creation of the cgroup.
+
memory.oom.group
A read-write single value file which exists on non-root
cgroups. The default value is "0".
@@ -1191,7 +1300,7 @@ PAGE_SIZE multiple when read back.
Note that all fields in this file are hierarchical and the
file modified event can be generated due to an event down the
- hierarchy. For for the local events at the cgroup level see
+ hierarchy. For the local events at the cgroup level see
memory.events.local.
low
@@ -1217,22 +1326,17 @@ PAGE_SIZE multiple when read back.
The number of time the cgroup's memory usage was
reached the limit and allocation was about to fail.
- Depending on context result could be invocation of OOM
- killer and retrying allocation or failing allocation.
-
- Failed allocation in its turn could be returned into
- userspace as -ENOMEM or silently ignored in cases like
- disk readahead. For now OOM in memory cgroup kills
- tasks iff shortage has happened inside page fault.
-
This event is not raised if the OOM killer is not
considered as an option, e.g. for failed high-order
- allocations.
+ allocations or if the caller asked not to retry attempts.
oom_kill
The number of processes belonging to this cgroup
killed by any kind of OOM killer.
+ oom_group_kill
+ The number of times a group OOM has occurred.
+
memory.events.local
Similar to memory.events but the fields in the file are local
to the cgroup i.e. not hierarchical. The file modified event
@@ -1251,6 +1355,10 @@ PAGE_SIZE multiple when read back.
can show up in the middle. Don't rely on items remaining in a
fixed position; use the keys to look up specific values!
+ If an entry has no per-node counter (that is, it does not show up in
+ memory.numa_stat), we use 'npn' (non-per-node) as the tag
+ to indicate that it will not show up in memory.numa_stat.
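 The advice to look values up by key rather than by position can be illustrated with a short Python sketch. The function name `parse_flat_keyed` is hypothetical and the sample values are made up for illustration; the only assumption is the flat-keyed layout of one "key value" pair per line.

```python
def parse_flat_keyed(text):
    """Parse flat-keyed content ("key value" per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        key, value = fields
        stats[key] = int(value)
    return stats


# Made-up sample in the memory.stat layout; real files contain more keys.
sample = """\
anon 1048576
file 4194304
kernel_stack 16384
pgfault 1234
"""

stats = parse_flat_keyed(sample)
print(stats["file"])
# Keys can be added between kernel versions, so default missing ones.
print(stats.get("zswap", 0))
```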
+
anon
Amount of memory used in anonymous mappings such as
brk(), sbrk(), and mmap(MAP_ANONYMOUS)
@@ -1259,20 +1367,42 @@ PAGE_SIZE multiple when read back.
Amount of memory used to cache filesystem data,
including tmpfs and shared memory.
+ kernel (npn)
+ Amount of total kernel memory, including
+ (kernel_stack, pagetables, percpu, vmalloc, slab) in
+ addition to other kernel memory use cases.
+
kernel_stack
Amount of memory allocated to kernel stacks.
- slab
- Amount of memory used for storing in-kernel data
- structures.
+ pagetables
+ Amount of memory allocated for page tables.
+
+ sec_pagetables
+ Amount of memory allocated for secondary page tables;
+ this currently includes KVM MMU allocations on x86
+ and arm64.
- sock
+ percpu (npn)
+ Amount of memory used for storing per-cpu kernel
+ data structures.
+
+ sock (npn)
Amount of memory used in network transmission buffers
+ vmalloc (npn)
+ Amount of memory used for vmap backed memory.
+
shmem
Amount of cached filesystem data that is swap-backed,
such as tmpfs, shm segments, shared anonymous mmap()s
+ zswap
+ Amount of memory consumed by the zswap compression backend.
+
+ zswapped
+ Amount of application memory swapped out to zswap.
+
file_mapped
Amount of cached filesystem data mapped with mmap()
@@ -1284,10 +1414,22 @@ PAGE_SIZE multiple when read back.
Amount of cached filesystem data that was modified and
is currently being written back to disk
+ swapcached
+ Amount of swap cached in memory. The swapcache is accounted
+ against both memory and swap usage.
+
anon_thp
Amount of memory used in anonymous mappings backed by
transparent hugepages
+ file_thp
+ Amount of cached filesystem data backed by transparent
+ hugepages
+
+ shmem_thp
+ Amount of shm, tmpfs, shared anonymous mmap()s backed by
+ transparent hugepages
+
inactive_anon, active_anon, inactive_file, active_file, unevictable
Amount of memory, swap-backed and filesystem-backed,
on the internal memory management lists used by the
@@ -1306,64 +1448,108 @@ PAGE_SIZE multiple when read back.
Part of "slab" that cannot be reclaimed on memory
pressure.
- pgfault
- Total number of page faults incurred
+ slab (npn)
+ Amount of memory used for storing in-kernel data
+ structures.
- pgmajfault
- Number of major page faults incurred
+ workingset_refault_anon
+ Number of refaults of previously evicted anonymous pages.
- workingset_refault
+ workingset_refault_file
+ Number of refaults of previously evicted file pages.
- Number of refaults of previously evicted pages
+ workingset_activate_anon
+ Number of refaulted anonymous pages that were immediately
+ activated.
- workingset_activate
+ workingset_activate_file
+ Number of refaulted file pages that were immediately activated.
- Number of refaulted pages that were immediately activated
+ workingset_restore_anon
+ Number of restored anonymous pages which have been detected as
+ an active workingset before they got reclaimed.
- workingset_nodereclaim
+ workingset_restore_file
+ Number of restored file pages which have been detected as an
+ active workingset before they got reclaimed.
+ workingset_nodereclaim
Number of times a shadow node has been reclaimed
- pgrefill
+ pgscan (npn)
+ Amount of scanned pages (in an inactive LRU list)
- Amount of scanned pages (in an active LRU list)
+ pgsteal (npn)
+ Amount of reclaimed pages
- pgscan
+ pgscan_kswapd (npn)
+ Amount of scanned pages by kswapd (in an inactive LRU list)
- Amount of scanned pages (in an inactive LRU list)
+ pgscan_direct (npn)
+ Amount of scanned pages directly (in an inactive LRU list)
- pgsteal
+ pgsteal_kswapd (npn)
+ Amount of reclaimed pages by kswapd
- Amount of reclaimed pages
+ pgsteal_direct (npn)
+ Amount of reclaimed pages directly
+
+ pgfault (npn)
+ Total number of page faults incurred
- pgactivate
+ pgmajfault (npn)
+ Number of major page faults incurred
- Amount of pages moved to the active LRU list
+ pgrefill (npn)
+ Amount of scanned pages (in an active LRU list)
- pgdeactivate
+ pgactivate (npn)
+ Amount of pages moved to the active LRU list
+ pgdeactivate (npn)
Amount of pages moved to the inactive LRU list
- pglazyfree
-
+ pglazyfree (npn)
Amount of pages postponed to be freed under memory pressure
- pglazyfreed
-
+ pglazyfreed (npn)
Amount of reclaimed lazyfree pages
- thp_fault_alloc
-
+ thp_fault_alloc (npn)
Number of transparent hugepages which were allocated to satisfy
- a page fault, including COW faults. This counter is not present
- when CONFIG_TRANSPARENT_HUGEPAGE is not set.
-
- thp_collapse_alloc
+ a page fault. This counter is not present when CONFIG_TRANSPARENT_HUGEPAGE
+ is not set.
+ thp_collapse_alloc (npn)
Number of transparent hugepages which were allocated to allow
collapsing an existing range of pages. This counter is not
present when CONFIG_TRANSPARENT_HUGEPAGE is not set.
+ memory.numa_stat
+ A read-only nested-keyed file which exists on non-root cgroups.
+
+ This breaks down the cgroup's memory footprint into different
+ types of memory, type-specific details, and other information
+ per node on the state of the memory management system.
+
+ This is useful for providing visibility into the NUMA locality
+ information within a memcg since the pages are allowed to be
+ allocated from any physical node. One of the use cases is evaluating
+ application performance by combining this information with the
+ application's CPU allocation.
+
+ All memory amounts are in bytes.
+
+ The output format of memory.numa_stat is::
+
+ type N0=<bytes in node 0> N1=<bytes in node 1> ...
+
+ The entries are ordered to be human readable, and new entries
+ can show up in the middle. Don't rely on items remaining in a
+ fixed position; use the keys to look up specific values!
+
+ See memory.stat for the meaning of each entry.
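The output format above lends itself to key-based lookup. The following is a minimal sketch, assuming every value is a plain integer byte count and every node key has the form N<id>; the function name and the sample text are made up for illustration and are not real output.

```python
def parse_numa_stat(text):
    """Parse memory.numa_stat content into {type: {node_id: bytes}}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        per_node = {}
        for item in fields[1:]:
            node, _, value = item.partition("=")
            # Node keys look like "N0", "N1", ...; drop the leading "N".
            per_node[int(node[1:])] = int(value)
        stats[fields[0]] = per_node
    return stats


# Hypothetical sample, for illustration only.
sample = "anon N0=4096 N1=8192\nfile N0=0 N1=12288\n"
numa = parse_numa_stat(sample)
print(numa["file"][1])
```

Looking values up by type and node id, rather than by line position, follows the advice above that new entries can appear in the middle of the file.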
+
memory.swap.current
A read-only single value file which exists on non-root
cgroups.
@@ -1371,6 +1557,22 @@ PAGE_SIZE multiple when read back.
The total amount of swap currently being used by the cgroup
and its descendants.
+ memory.swap.high
+ A read-write single value file which exists on non-root
+ cgroups. The default is "max".
+
+ Swap usage throttle limit. If a cgroup's swap usage exceeds
+ this limit, all its further allocations will be throttled to
+ allow userspace to implement custom out-of-memory procedures.
+
+ This limit marks a point of no return for the cgroup. It is NOT
+ designed to manage the amount of swapping a workload does
+ during regular operation. Compare to memory.swap.max, which
+ prohibits swapping past a set amount, but lets the cgroup
+ continue unimpeded as long as other memory can be reclaimed.
+
+ Healthy workloads are not expected to reach this limit.
+
memory.swap.max
A read-write single value file which exists on non-root
cgroups. The default is "max".
@@ -1384,6 +1586,10 @@ PAGE_SIZE multiple when read back.
otherwise, a value change in this file generates a file
modified event.
+ high
+ The number of times the cgroup's swap usage was over
+ the high threshold.
+
max
The number of times the cgroup's swap usage was about
to go over the max boundary and swap allocation
@@ -1399,11 +1605,26 @@ PAGE_SIZE multiple when read back.
higher than the limit for an extended period of time. This
reduces the impact on the workload and memory management.
+ memory.zswap.current
+ A read-only single value file which exists on non-root
+ cgroups.
+
+ The total amount of memory consumed by the zswap compression
+ backend.
+
+ memory.zswap.max
+ A read-write single value file which exists on non-root
+ cgroups. The default is "max".
+
+ Zswap usage hard limit. If a cgroup's zswap pool reaches this
+ limit, it will refuse to take any more stores before existing
+ entries fault back in or are written out to disk.
+
memory.pressure
- A read-only nested-key file which exists on non-root cgroups.
+ A read-only nested-keyed file.
Shows pressure stall information for memory. See
- Documentation/accounting/psi.rst for details.
+ :ref:`Documentation/accounting/psi.rst <psi>` for details.
Usage Guidelines
@@ -1463,8 +1684,7 @@ IO Interface Files
~~~~~~~~~~~~~~~~~~
io.stat
- A read-only nested-keyed file which exists on non-root
- cgroups.
+ A read-only nested-keyed file.
Lines are keyed by $MAJ:$MIN device numbers and not ordered.
The following nested keys are defined.
@@ -1478,13 +1698,13 @@ IO Interface Files
dios Number of discard IOs
====== =====================
- An example read output follows:
+ An example read output follows::
8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=353 dbytes=0 dios=0
8:0 rbytes=90430464 wbytes=299008000 rios=8950 wios=1252 dbytes=50331648 dios=3021
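Since the lines are keyed by device number and not ordered, a reader would typically index them by the $MAJ:$MIN key. A minimal sketch, assuming all nested values are integers as in the example output above (the function name is hypothetical):

```python
def parse_io_stat(text):
    """Parse io.stat content into {"MAJ:MIN": {key: int}}."""
    devices = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        nested = {}
        for item in fields[1:]:
            key, _, value = item.partition("=")
            nested[key] = int(value)  # assumption: every value is an integer
        devices[fields[0]] = nested
    return devices


# The two lines from the example read output above.
sample = (
    "8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=353 dbytes=0 dios=0\n"
    "8:0 rbytes=90430464 wbytes=299008000 rios=8950 wios=1252 dbytes=50331648 dios=3021\n"
)
io = parse_io_stat(sample)
print(io["8:16"]["rbytes"])
```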
io.cost.qos
- A read-write nested-keyed file with exists only on the root
+ A read-write nested-keyed file which exists only on the root
cgroup.
This file configures the Quality of Service of the IO cost
@@ -1539,7 +1759,7 @@ IO Interface Files
automatic mode can be restored by setting "ctrl" to "auto".
io.cost.model
- A read-write nested-keyed file with exists only on the root
+ A read-write nested-keyed file which exists only on the root
cgroup.
This file configures the cost model of the IO cost model based
@@ -1640,10 +1860,10 @@ IO Interface Files
8:16 rbps=2097152 wbps=max riops=max wiops=max
io.pressure
- A read-only nested-key file which exists on non-root cgroups.
+ A read-only nested-keyed file.
Shows pressure stall information for IO. See
- Documentation/accounting/psi.rst for details.
+ :ref:`Documentation/accounting/psi.rst <psi>` for details.
Writeback
@@ -1664,9 +1884,9 @@ per-cgroup dirty memory states are examined and the more restrictive
of the two is enforced.
cgroup writeback requires explicit support from the underlying
-filesystem. Currently, cgroup writeback is implemented on ext2, ext4
-and btrfs. On other filesystems, all writeback IOs are attributed to
-the root cgroup.
+filesystem. Currently, cgroup writeback is implemented on ext2, ext4,
+btrfs, f2fs, and xfs. On other filesystems, all writeback IOs are
+attributed to the root cgroup.
There are inherent differences in memory and writeback management
which affects how cgroup ownership is tracked. Memory is tracked per
@@ -1765,7 +1985,7 @@ IO Latency Interface Files
io.latency
This takes a similar format as the other controllers.
- "MAJOR:MINOR target=<target time in microseconds"
+ "MAJOR:MINOR target=<target time in microseconds>"
io.stat
If the controller is enabled you will see extra stats in io.stat in
@@ -1785,6 +2005,60 @@ IO Latency Interface Files
duration of time between evaluation events. Windows only elapse
with IO activity. Idle periods extend the most recent window.
+IO Priority
+~~~~~~~~~~~
+
+A single attribute controls the behavior of the I/O priority cgroup policy,
+namely the io.prio.class attribute. The following values are accepted for
+that attribute:
+
+ no-change
+ Do not modify the I/O priority class.
+
+ none-to-rt
+ For requests that do not have an I/O priority class (NONE),
+ change the I/O priority class into RT. Do not modify
+ the I/O priority class of other requests.
+
+ restrict-to-be
+ For requests that do not have an I/O priority class or that have I/O
+ priority class RT, change it into BE. Do not modify the I/O priority
+ class of requests that have priority class IDLE.
+
+ idle
+ Change the I/O priority class of all requests into IDLE, the lowest
+ I/O priority class.
+
+The following numerical values are associated with the I/O priority policies:
+
++----------------+---+
+| no-change      | 0 |
++----------------+---+
+| none-to-rt     | 1 |
++----------------+---+
+| restrict-to-be | 2 |
++----------------+---+
+| idle           | 3 |
++----------------+---+
+
+The numerical value that corresponds to each I/O priority class is as follows:
+
++-------------------------------+---+
+| IOPRIO_CLASS_NONE | 0 |
++-------------------------------+---+
+| IOPRIO_CLASS_RT (real-time) | 1 |
++-------------------------------+---+
+| IOPRIO_CLASS_BE (best effort) | 2 |
++-------------------------------+---+
+| IOPRIO_CLASS_IDLE | 3 |
++-------------------------------+---+
+
+The algorithm to set the I/O priority class for a request is as follows:
+
+- Translate the I/O priority class policy into a number.
+- Change the request I/O priority class into the maximum of the I/O priority
+ class policy number and the numerical I/O priority class.
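The two steps above can be sketched as follows. This is only an illustration of the arithmetic, not kernel code; it assumes the policy numbers 0 to 3 map onto the accepted attribute values in the order they are listed, and the dictionary and function names are hypothetical.

```python
# Policy names as accepted by the attribute, with their policy numbers.
POLICY_NUMBER = {"no-change": 0, "none-to-rt": 1, "restrict-to-be": 2, "idle": 3}
# I/O priority classes with their numerical values.
CLASS_NUMBER = {"NONE": 0, "RT": 1, "BE": 2, "IDLE": 3}
CLASS_NAME = {number: name for name, number in CLASS_NUMBER.items()}


def effective_class(policy, request_class):
    """Return the class a request ends up with under the given policy."""
    number = max(POLICY_NUMBER[policy], CLASS_NUMBER[request_class])
    return CLASS_NAME[number]


for policy in POLICY_NUMBER:
    print(policy, [effective_class(policy, cls) for cls in CLASS_NUMBER])
```

Taking the maximum means a policy can only lower a request's priority class (a higher number), never raise it, which matches the per-policy descriptions: under "restrict-to-be" an RT request becomes BE while an IDLE request stays IDLE.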
+
PID
---
@@ -1853,7 +2127,7 @@ Cpuset Interface Files
from the requested CPUs.
The CPU numbers are comma-separated numbers or ranges.
- For example:
+ For example::
# cat cpuset.cpus
0-4,6,8-10
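A list in this format can be expanded into individual CPU numbers with a few lines of code. A minimal sketch, assuming well-formed input (the function name is hypothetical); the same approach would apply to memory node lists:

```python
def parse_cpu_list(text):
    """Expand a list such as "0-4,6,8-10" into a sorted list of integers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file yields an empty list
        first, _, last = part.partition("-")
        # A single number has no "-", so the range covers just that number.
        cpus.update(range(int(first), int(last or first) + 1))
    return sorted(cpus)


print(parse_cpu_list("0-4,6,8-10"))
```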
@@ -1892,7 +2166,7 @@ Cpuset Interface Files
from the requested memory nodes.
The memory node numbers are comma-separated numbers or ranges.
- For example:
+ For example::
# cat cpuset.mems
0-1,3
@@ -1905,6 +2179,17 @@ Cpuset Interface Files
The value of "cpuset.mems" stays constant until the next update
and won't be affected by any memory nodes hotplug events.
+ Setting a non-empty value to "cpuset.mems" causes memory of
+ tasks within the cgroup to be migrated to the designated nodes if
+ they are currently using memory outside of the designated nodes.
+
+ There is a cost for this memory migration. The migration
+ may not be complete and some memory pages may be left behind.
+ So it is recommended that "cpuset.mems" should be set properly
+ before spawning new tasks into the cpuset. Even if there is
+ a need to change "cpuset.mems" with active tasks, it shouldn't
+ be done frequently.
+
cpuset.mems.effective
A read-only multiple values file which exists on all
cpuset-enabled cgroups.
@@ -1926,73 +2211,95 @@ Cpuset Interface Files
cpuset-enabled cgroups. This flag is owned by the parent cgroup
and is not delegatable.
- It accepts only the following input values when written to.
-
- "root" - a partition root
- "member" - a non-root member of a partition
-
- When set to be a partition root, the current cgroup is the
- root of a new partition or scheduling domain that comprises
- itself and all its descendants except those that are separate
- partition roots themselves and their descendants. The root
- cgroup is always a partition root.
-
- There are constraints on where a partition root can be set.
- It can only be set in a cgroup if all the following conditions
- are true.
-
- 1) The "cpuset.cpus" is not empty and the list of CPUs are
- exclusive, i.e. they are not shared by any of its siblings.
- 2) The parent cgroup is a partition root.
- 3) The "cpuset.cpus" is also a proper subset of the parent's
- "cpuset.cpus.effective".
- 4) There is no child cgroups with cpuset enabled. This is for
- eliminating corner cases that have to be handled if such a
- condition is allowed.
-
- Setting it to partition root will take the CPUs away from the
- effective CPUs of the parent cgroup. Once it is set, this
- file cannot be reverted back to "member" if there are any child
- cgroups with cpuset enabled.
-
- A parent partition cannot distribute all its CPUs to its
- child partitions. There must be at least one cpu left in the
- parent partition.
-
- Once becoming a partition root, changes to "cpuset.cpus" is
- generally allowed as long as the first condition above is true,
- the change will not take away all the CPUs from the parent
- partition and the new "cpuset.cpus" value is a superset of its
- children's "cpuset.cpus" values.
-
- Sometimes, external factors like changes to ancestors'
- "cpuset.cpus" or cpu hotplug can cause the state of the partition
- root to change. On read, the "cpuset.sched.partition" file
- can show the following values.
-
- "member" Non-root member of a partition
- "root" Partition root
- "root invalid" Invalid partition root
-
- It is a partition root if the first 2 partition root conditions
- above are true and at least one CPU from "cpuset.cpus" is
- granted by the parent cgroup.
-
- A partition root can become invalid if none of CPUs requested
- in "cpuset.cpus" can be granted by the parent cgroup or the
- parent cgroup is no longer a partition root itself. In this
- case, it is not a real partition even though the restriction
- of the first partition root condition above will still apply.
- The cpu affinity of all the tasks in the cgroup will then be
- associated with CPUs in the nearest ancestor partition.
-
- An invalid partition root can be transitioned back to a
- real partition root if at least one of the requested CPUs
- can now be granted by its parent. In this case, the cpu
- affinity of all the tasks in the formerly invalid partition
- will be associated to the CPUs of the newly formed partition.
- Changing the partition state of an invalid partition root to
- "member" is always allowed even if child cpusets are present.
+ It accepts only the following input values when written to.
+
+ ========== =====================================
+ "member" Non-root member of a partition
+ "root" Partition root
+ "isolated" Partition root without load balancing
+ ========== =====================================
+
+ The root cgroup is always a partition root and its state
+ cannot be changed. All other non-root cgroups start out as
+ "member".
+
+ When set to "root", the current cgroup is the root of a new
+ partition or scheduling domain that comprises itself and all
+ its descendants except those that are separate partition roots
+ themselves and their descendants.
+
+ When set to "isolated", the CPUs in that partition root will
+ be in an isolated state without any load balancing from the
+ scheduler. Tasks placed in such a partition with multiple
+ CPUs should be carefully distributed and bound to each of the
+ individual CPUs for optimal performance.
+
+ The value shown in "cpuset.cpus.effective" of a partition root
+ is the CPUs that the partition root can dedicate to a potential
+ new child partition root. The new child subtracts available
+ CPUs from its parent's "cpuset.cpus.effective".
+
+ A partition root ("root" or "isolated") can be in one of the
+ two possible states - valid or invalid. An invalid partition
+ root is in a degraded state where some state information may
+ be retained, but behaves more like a "member".
+
+ All possible state transitions among "member", "root" and
+ "isolated" are allowed.
+
+ On read, the "cpuset.cpus.partition" file can show the following
+ values.
+
+ ============================= =====================================
+ "member" Non-root member of a partition
+ "root" Partition root
+ "isolated" Partition root without load balancing
+ "root invalid (<reason>)" Invalid partition root
+ "isolated invalid (<reason>)" Invalid isolated partition root
+ ============================= =====================================
+
+ In the case of an invalid partition root, a descriptive string on
+ why the partition is invalid is included within parentheses.
+
+ For a partition root to become valid, the following conditions
+ must be met.
+
+ 1) The "cpuset.cpus" is exclusive with its siblings, i.e. they
+ are not shared by any of its siblings (exclusivity rule).
+ 2) The parent cgroup is a valid partition root.
+ 3) The "cpuset.cpus" is not empty and must contain at least
+ one of the CPUs from parent's "cpuset.cpus", i.e. they overlap.
+ 4) The "cpuset.cpus.effective" cannot be empty unless there is
+ no task associated with this partition.
+
+ External events like hotplug or changes to "cpuset.cpus" can
+ cause a valid partition root to become invalid and vice versa.
+ Note that a task cannot be moved to a cgroup with empty
+ "cpuset.cpus.effective".
+
+ For a valid partition root with the sibling cpu exclusivity
+ rule enabled, changes made to "cpuset.cpus" that violate the
+ exclusivity rule will invalidate the partition as well as its
+ sibling partitions with conflicting cpuset.cpus values. So
+ care must be taken in changing "cpuset.cpus".
+
+ A valid non-root parent partition may distribute out all its CPUs
+ to its child partitions when there is no task associated with it.
+
+ Care must be taken to change a valid partition root to
+ "member" as all its child partitions, if present, will become
+ invalid causing disruption to tasks running in those child
+ partitions. These inactivated partitions could be recovered if
+ their parent is switched back to a partition root with a proper
+ set of "cpuset.cpus".
+
+ Poll and inotify events are triggered whenever the state of
+ "cpuset.cpus.partition" changes. That includes changes caused
+ by write to "cpuset.cpus.partition", cpu hotplug or other
+ changes that modify the validity status of the partition.
+ This will allow user space agents to monitor unexpected changes
+ to "cpuset.cpus.partition" without the need to do continuous
+ polling.
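A monitoring agent woken by such an event would re-read the file and decode the state. A minimal sketch of the decoding step only, assuming the read values have exactly the forms listed in the table above; the function name and the reason string in the sample are hypothetical.

```python
import re

# Matches "member", "root", "isolated", optionally followed by
# " invalid (<reason>)" as shown in the table of read values.
_STATE_RE = re.compile(r"^(member|root|isolated)(?: invalid \((.*)\))?$")


def parse_partition_state(text):
    """Split a cpuset.cpus.partition value into type, validity and reason."""
    match = _STATE_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized partition state: %r" % text)
    kind, reason = match.groups()
    return {"type": kind, "valid": reason is None, "reason": reason}


# Hypothetical reason string, for illustration only.
state = parse_partition_state("root invalid (example reason)\n")
print(state["valid"], state["reason"])
```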
Device controller
@@ -2004,26 +2311,26 @@ existing device files.
Cgroup v2 device controller has no interface files and is implemented
on top of cgroup BPF. To control access to device files, a user may
-create bpf programs of the BPF_CGROUP_DEVICE type and attach them
-to cgroups. On an attempt to access a device file, corresponding
-BPF programs will be executed, and depending on the return value
-the attempt will succeed or fail with -EPERM.
+create bpf programs of type BPF_PROG_TYPE_CGROUP_DEVICE and attach
+them to cgroups with BPF_CGROUP_DEVICE flag. On an attempt to access a
+device file, corresponding BPF programs will be executed, and depending
+on the return value the attempt will succeed or fail with -EPERM.
-A BPF_CGROUP_DEVICE program takes a pointer to the bpf_cgroup_dev_ctx
-structure, which describes the device access attempt: access type
-(mknod/read/write) and device (type, major and minor numbers).
-If the program returns 0, the attempt fails with -EPERM, otherwise
-it succeeds.
+A BPF_PROG_TYPE_CGROUP_DEVICE program takes a pointer to the
+bpf_cgroup_dev_ctx structure, which describes the device access attempt:
+access type (mknod/read/write) and device (type, major and minor numbers).
+If the program returns 0, the attempt fails with -EPERM, otherwise it
+succeeds.
-An example of BPF_CGROUP_DEVICE program may be found in the kernel
-source tree in the tools/testing/selftests/bpf/dev_cgroup.c file.
+An example of BPF_PROG_TYPE_CGROUP_DEVICE program may be found in
+tools/testing/selftests/bpf/progs/dev_cgroup.c in the kernel source tree.
RDMA
----
The "rdma" controller regulates the distribution and accounting of
-of RDMA resources.
+RDMA resources.
RDMA Interface Files
~~~~~~~~~~~~~~~~~~~~
@@ -2086,9 +2393,90 @@ HugeTLB Interface Files
are local to the cgroup i.e. not hierarchical. The file modified event
generated on this file reflects only the local events.
+ hugetlb.<hugepagesize>.numa_stat
+ Similar to memory.numa_stat, it shows the numa information of the
+ hugetlb pages of <hugepagesize> in this cgroup. Only hugetlb pages that
+ are actively in use are included. The per-node values are in bytes.
+
Misc
----
+The Miscellaneous cgroup provides the resource limiting and tracking
+mechanism for the scalar resources which cannot be abstracted like the other
+cgroup resources. The controller is enabled by the CONFIG_CGROUP_MISC config
+option.
+
+A resource can be added to the controller via enum misc_res_type{} in the
+include/linux/misc_cgroup.h file and the corresponding name via misc_res_name[]
+in the kernel/cgroup/misc.c file. Provider of the resource must set its
+capacity prior to using the resource by calling misc_cg_set_capacity().
+
+Once a capacity is set then the resource usage can be updated using charge and
+uncharge APIs. All of the APIs to interact with misc controller are in
+include/linux/misc_cgroup.h.
+
+Misc Interface Files
+~~~~~~~~~~~~~~~~~~~~
+
+The miscellaneous controller provides the following interface files. If two misc resources (res_a and res_b) are registered then:
+
+ misc.capacity
+ A read-only flat-keyed file shown only in the root cgroup. It shows
+ miscellaneous scalar resources available on the platform along with
+ their quantities::
+
+ $ cat misc.capacity
+ res_a 50
+ res_b 10
+
+ misc.current
+ A read-only flat-keyed file shown in the non-root cgroups. It shows
+ the current usage of the resources in the cgroup and its children::
+
+ $ cat misc.current
+ res_a 3
+ res_b 0
+
+ misc.max
+ A read-write flat-keyed file shown in the non-root cgroups. Allowed
+ maximum usage of the resources in the cgroup and its children::
+
+ $ cat misc.max
+ res_a max
+ res_b 4
+
+ Limit can be set by::
+
+ # echo res_a 1 > misc.max
+
+ Limit can be set to max by::
+
+ # echo res_a max > misc.max
+
+ Limits can be set higher than the capacity value in the misc.capacity
+ file.
+
+ misc.events
+ A read-only flat-keyed file which exists on non-root cgroups. The
+ following entries are defined. Unless specified otherwise, a value
+ change in this file generates a file modified event. All fields in
+ this file are hierarchical.
+
+ max
+ The number of times the cgroup's resource usage was
+ about to go over the max boundary.
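The flat-keyed files above can be combined to see how much room a cgroup has left before it hits its limit. A minimal sketch, using the res_a/res_b sample values from above and assuming each line is "<name> <value>" where the value is an integer or "max"; the function names are hypothetical.

```python
def parse_flat_keyed(text):
    """Parse a flat-keyed file into {name: int}, with None meaning "max"."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        key, value = fields
        values[key] = None if value == "max" else int(value)
    return values


def headroom(current, limits):
    """Return remaining units per resource; None means no limit is set."""
    result = {}
    for key, used in current.items():
        limit = limits.get(key)
        result[key] = None if limit is None else limit - used
    return result


current = parse_flat_keyed("res_a 3\nres_b 0\n")    # misc.current sample
limits = parse_flat_keyed("res_a max\nres_b 4\n")   # misc.max sample
print(headroom(current, limits))
```

Because limits can be set higher than the capacity, a tool that wants the real ceiling would also need to take misc.capacity from the root cgroup into account; that step is left out here.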
+
+Migration and Ownership
+~~~~~~~~~~~~~~~~~~~~~~~
+
+A miscellaneous scalar resource is charged to the cgroup in which it is used
+first, and stays charged to that cgroup until that resource is freed. Migrating
+a process to a different cgroup does not move the charge to the destination
+cgroup where the process has moved.
+
+Others
+------
+
perf_event
~~~~~~~~~~
@@ -2145,7 +2533,7 @@ Without cgroup namespace, the "/proc/$PID/cgroup" file shows the
complete path of the cgroup of a process. In a container setup where
a set of cgroups and namespaces are intended to isolate processes the
"/proc/$PID/cgroup" file may leak potential system level information
-to the isolated processes. For Example::
+to the isolated processes. For example::
# cat /proc/self/cgroup
0::/batchjobs/container_id1
diff --git a/Documentation/admin-guide/cifs/authors.rst b/Documentation/admin-guide/cifs/authors.rst
index b02d6dd6c070..5c1d2f0fa7d1 100644
--- a/Documentation/admin-guide/cifs/authors.rst
+++ b/Documentation/admin-guide/cifs/authors.rst
@@ -5,10 +5,10 @@ Authors
Original Author
---------------
-Steve French (sfrench@samba.org)
+Steve French (smfrench@gmail.com, sfrench@samba.org)
The author wishes to express his appreciation and thanks to:
-Andrew Tridgell (Samba team) for his early suggestions about smb/cifs VFS
+Andrew Tridgell (Samba team) for his early suggestions about SMB/CIFS VFS
improvements. Thanks to IBM for allowing me time and test resources to pursue
this project, to Jim McDonough from IBM (and the Samba Team) for his help, to
the IBM Linux JFS team for explaining many esoteric Linux filesystem features.
@@ -51,7 +51,7 @@ Patch Contributors
- Ronnie Sahlberg (for SMB3 xattr work, bug fixes, and lots of great work on compounding)
- Shirish Pargaonkar (for many ACL patches over the years)
- Sachin Prabhu (many bug fixes, including for reconnect, copy offload and security)
-- Paulo Alcantara
+- Paulo Alcantara (for some excellent work in DFS, and in booting from SMB3)
- Long Li (some great work on RDMA, SMB Direct)
diff --git a/Documentation/admin-guide/cifs/changes.rst b/Documentation/admin-guide/cifs/changes.rst
index 71f2ecb62299..3147bbae9c43 100644
--- a/Documentation/admin-guide/cifs/changes.rst
+++ b/Documentation/admin-guide/cifs/changes.rst
@@ -3,6 +3,7 @@ Changes
=======
See https://wiki.samba.org/index.php/LinuxCIFSKernel for summary
-information (that may be easier to read than parsing the output of
-"git log fs/cifs") about fixes/improvements to CIFS/SMB2/SMB3 support (changes
+information about fixes/improvements to CIFS/SMB2/SMB3 support (changes
to cifs.ko module) by kernel version (and cifs internal module version).
+This may be easier to read than parsing the output of "git log fs/cifs"
+by release.
diff --git a/Documentation/admin-guide/cifs/introduction.rst b/Documentation/admin-guide/cifs/introduction.rst
index 0b98f672d36f..53ea62906aa5 100644
--- a/Documentation/admin-guide/cifs/introduction.rst
+++ b/Documentation/admin-guide/cifs/introduction.rst
@@ -7,19 +7,19 @@ Introduction
protocol which was the successor to the Server Message Block
(SMB) protocol, the native file sharing mechanism for most early
PC operating systems. New and improved versions of CIFS are now
- called SMB2 and SMB3. Use of SMB3 (and later, including SMB3.1.1)
- is strongly preferred over using older dialects like CIFS due to
- security reaasons. All modern dialects, including the most recent,
- SMB3.1.1 are supported by the CIFS VFS module. The SMB3 protocol
- is implemented and supported by all major file servers
- such as all modern versions of Windows (including Windows 2016
- Server), as well as by Samba (which provides excellent
- CIFS/SMB2/SMB3 server support and tools for Linux and many other
- operating systems). Apple systems also support SMB3 well, as
- do most Network Attached Storage vendors, so this network
- filesystem client can mount to a wide variety of systems.
- It also supports mounting to the cloud (for example
- Microsoft Azure), including the necessary security features.
+ called SMB2 and SMB3. Use of SMB3 (and later, including SMB3.1.1,
+ the most current dialect) is strongly preferred over using older
+ dialects like CIFS for security reasons. All modern dialects,
+ including the most recent, SMB3.1.1, are supported by the CIFS VFS
+ module. The SMB3 protocol is implemented and supported by all major
+ file servers such as Windows (including Windows 2019 Server), as
+ well as by Samba (which provides excellent CIFS/SMB2/SMB3 server
+ support and tools for Linux and many other operating systems).
+ Apple systems also support SMB3 well, as do most Network Attached
+ Storage vendors, so this network filesystem client can mount to a
+ wide variety of systems. It also supports mounting to the cloud
+ (for example Microsoft Azure), including the necessary security
+ features.
The intent of this module is to provide the most advanced network
file system function for SMB3 compliant servers, including advanced
@@ -27,8 +27,8 @@ Introduction
POSIX compliance, secure per-user session establishment, encryption,
high performance safe distributed caching (leases/oplocks), optional packet
signing, large files, Unicode support and other internationalization
- improvements. Since both Samba server and this filesystem client support
- the CIFS Unix extensions (and in the future SMB3 POSIX extensions),
+ improvements. Since both Samba server and this filesystem client support the
+ CIFS Unix extensions, and the Linux client also supports SMB3 POSIX extensions,
the combination can provide a reasonable alternative to other network and
cluster file systems for fileserving in some Linux to Linux environments,
not just in Linux to Windows (or Linux to Mac) environments.
diff --git a/Documentation/admin-guide/cifs/todo.rst b/Documentation/admin-guide/cifs/todo.rst
index 084c25f92dcb..2646ed2e2d3e 100644
--- a/Documentation/admin-guide/cifs/todo.rst
+++ b/Documentation/admin-guide/cifs/todo.rst
@@ -13,24 +13,26 @@ is a partial list of the known problems and missing features:
a) SMB3 (and SMB3.1.1) missing optional features:
- - multichannel (started), integration with RDMA
- - directory leases (improved metadata caching), started (root dir only)
+ - multichannel (partially integrated), integration of multichannel with RDMA
+ - directory leases (improved metadata caching). Currently only implemented for root dir
- T10 copy offload ie "ODX" (copy chunk, and "Duplicate Extents" ioctl
currently the only two server side copy mechanisms supported)
b) improved sparse file support (fiemap and SEEK_HOLE are implemented
- but additional features would be supportable by the protocol).
+ but additional features would be supportable by the protocol such
+ as FALLOC_FL_COLLAPSE_RANGE and FALLOC_FL_INSERT_RANGE)
c) Directory entry caching relies on a 1 second timer, rather than
using Directory Leases, currently only the root file handle is cached longer
+ by leveraging Directory Leases
-d) quota support (needs minor kernel change since quota calls
- to make it to network filesystems or deviceless filesystems)
+d) quota support (needs minor kernel change since quota calls otherwise
+ won't make it to network filesystems or deviceless filesystems).
e) Additional use cases can be optimized to use "compounding" (e.g.
open/query/close and open/setinfo/close) to reduce the number of
roundtrips to the server and improve performance. Various cases
- (stat, statfs, create, unlink, mkdir) already have been improved by
+ (stat, statfs, create, unlink, mkdir, xattrs) already have been improved by
using compounding but more can be done. In addition we could
significantly reduce redundant opens by using deferred close (with
handle caching leases) and better using reference counters on file
@@ -60,7 +62,9 @@ k) Add tools to take advantage of more smb3 specific ioctls and features
metadata attributes easier from tools (e.g. extending what was done
in smb-info tool).
-l) encrypted file support
+l) encrypted file support (currently the attribute showing the file is
+ encrypted on the server is reported, but changing the attribute is not
+ supported).
m) improved stats gathering tools (perhaps integration with nfsometer?)
to extend and make easier to use what is currently in /proc/fs/cifs/Stats
@@ -69,14 +73,13 @@ n) Add support for claims based ACLs ("DAC")
o) mount helper GUI (to simplify the various configuration options on mount)
-p) Add support for witness protocol (perhaps ioctl to cifs.ko from user space
- tool listening on witness protocol RPC) to allow for notification of share
- move, server failover, and server adapter changes. And also improve other
- failover scenarios, e.g. when client knows multiple DFS entries point to
- different servers, and the server we are connected to has gone down.
+p) Expand support for witness protocol to allow for notification of share
+ move, and server network adapter changes. Currently only notifications by
+ the witness protocol for server move are supported by the Linux client.
q) Allow mount.cifs to be more verbose in reporting errors with dialect
- or unsupported feature errors.
+ or unsupported feature errors. This would now be easier due to the
+ implementation of the new mount API.
r) updating cifs documentation, and user guide.
@@ -87,18 +90,17 @@ t) split cifs and smb3 support into separate modules so legacy (and less
secure) CIFS dialect can be disabled in environments that don't need it
and simplify the code.
-v) POSIX Extensions for SMB3.1.1 (started, create and mkdir support added
- so far).
+v) Additional testing of POSIX Extensions for SMB3.1.1
w) Add support for additional strong encryption types, and additional spnego
- authentication mechanisms (see MS-SMB2)
+ authentication mechanisms (see MS-SMB2). GCM-256 is now partially implemented.
x) Finish support for SMB3.1.1 compression
Known Bugs
==========
-See http://bugzilla.samba.org - search on product "CifsVFS" for
+See https://bugzilla.samba.org - search on product "CifsVFS" for
current bug list. Also check http://bugzilla.kernel.org (Product = File System, Component = CIFS)
1) existing symbolic links (Windows reparse points) are recognized but
diff --git a/Documentation/admin-guide/cifs/usage.rst b/Documentation/admin-guide/cifs/usage.rst
index d3fb67b8a976..3766bf8a1c20 100644
--- a/Documentation/admin-guide/cifs/usage.rst
+++ b/Documentation/admin-guide/cifs/usage.rst
@@ -16,8 +16,7 @@ standard for interoperating between Macs and Windows and major NAS appliances.
Please see
MS-SMB2 (for detailed SMB2/SMB3/SMB3.1.1 protocol specification)
-http://protocolfreedom.org/ and
-http://samba.org/samba/PFIF/
+or https://samba.org/samba/PFIF/
for more details.
@@ -32,7 +31,7 @@ Build instructions
For Linux:
-1) Download the kernel (e.g. from http://www.kernel.org)
+1) Download the kernel (e.g. from https://www.kernel.org)
and change directory into the top of the kernel directory tree
(e.g. /usr/src/linux-2.5.73)
2) make menuconfig (or make xconfig)
@@ -84,7 +83,7 @@ and encrypted shares and stronger signing and authentication algorithms.
There are additional mount options that may be helpful for SMB3 to get
improved POSIX behavior (NB: can use vers=3.0 to force only SMB3, never 2.1):
- ``mfsymlinks`` and ``cifsacl`` and ``idsfromsid``
+ ``mfsymlinks`` and either ``cifsacl`` or ``modefromsid`` (usually with ``idsfromsid``)
Allowing User Mounts
====================
@@ -116,7 +115,7 @@ later source tree in docs/manpages/mount.cifs.8
Allowing User Unmounts
======================
-To permit users to ummount directories that they have user mounted (see above),
+To permit users to unmount directories that they have user mounted (see above),
the utility umount.cifs may be used. It may be invoked directly, or if
umount.cifs is placed in /sbin, umount can invoke the cifs umount helper
(at least for most versions of the umount utility) for umount of cifs
@@ -198,7 +197,7 @@ that is ignored by local server applications and non-cifs clients and that will
not be traversed by the Samba server). This is opaque to the Linux client
application using the cifs vfs. Absolute symlinks will work to Samba 3.0.5 or
later, but only for remote clients using the CIFS Unix extensions, and will
-be invisbile to Windows clients and typically will not affect local
+be invisible to Windows clients and typically will not affect local
applications running on the same server as Samba.
Use instructions
@@ -268,7 +267,7 @@ would be forbidden for Windows/CIFS semantics) as long as the server is
configured for Unix Extensions (and the client has not disabled
/proc/fs/cifs/LinuxExtensionsEnabled). In addition the mount option
``mapposix`` can be used on CIFS (vers=1.0) to force the mapping of
-illegal Windows/NTFS/SMB characters to a remap range (this mount parm
+illegal Windows/NTFS/SMB characters to a remap range (this mount parameter
is the default for SMB3). This remap (``mapposix``) range is also
compatible with Mac (and "Services for Mac" on some older Windows).
@@ -715,6 +714,7 @@ DebugData Displays information about active CIFS sessions and
version.
Stats Lists summary resource usage information as well as per
share statistics.
+open_files List all the open file handles on all active SMB sessions.
======================= =======================================================
Configuration pseudo-files:
@@ -734,10 +734,9 @@ SecurityFlags Flags which control security negotiation and
using weaker password hashes is 0x37037 (lanman,
plaintext, ntlm, ntlmv2, signing allowed). Some
SecurityFlags require the corresponding menuconfig
- options to be enabled (lanman and plaintext require
- CONFIG_CIFS_WEAK_PW_HASH for example). Enabling
- plaintext authentication currently requires also
- enabling lanman authentication in the security flags
+ options to be enabled. Enabling plaintext
+ authentication currently requires also enabling
+ lanman authentication in the security flags
because the cifs module only supports sending
plaintext passwords using the older lanman dialect
form of the session setup SMB. (e.g. for authentication
@@ -795,6 +794,8 @@ LinuxExtensionsEnabled If set to one then the client will attempt to
support and want to map the uid and gid fields
to values supplied at mount (rather than the
actual values, then set this to zero. (default 1)
+dfscache List the content of the DFS cache.
+ If set to 0, the client will clear the cache.
======================= =======================================================
These experimental features and tracing can be enabled by changing flags in
@@ -831,7 +832,7 @@ the active sessions and the shares that are mounted.
Enabling Kerberos (extended security) works but requires version 1.2 or later
of the helper program cifs.upcall to be present and to be configured in the
/etc/request-key.conf file. The cifs.upcall helper program is from the Samba
-project(http://www.samba.org). NTLM and NTLMv2 and LANMAN support do not
+project (https://www.samba.org). NTLM and NTLMv2 and LANMAN support do not
require this helper. Note that NTLMv2 security (which does not require the
cifs.upcall helper program), instead of using Kerberos, is sufficient for
some use cases.
diff --git a/Documentation/admin-guide/cifs/winucase_convert.pl b/Documentation/admin-guide/cifs/winucase_convert.pl
index 322a9c833f23..993186beea20 100755
--- a/Documentation/admin-guide/cifs/winucase_convert.pl
+++ b/Documentation/admin-guide/cifs/winucase_convert.pl
@@ -16,7 +16,7 @@
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
+# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
while(<>) {
diff --git a/Documentation/admin-guide/cpu-load.rst b/Documentation/admin-guide/cpu-load.rst
index 2d01ce43d2a2..21a984337080 100644
--- a/Documentation/admin-guide/cpu-load.rst
+++ b/Documentation/admin-guide/cpu-load.rst
@@ -61,51 +61,54 @@ will lead to quite erratic information inside ``/proc/stat``::
static volatile sig_atomic_t stop;
- static void sighandler (int signr)
+ static void sighandler(int signr)
{
- (void) signr;
- stop = 1;
+ (void) signr;
+ stop = 1;
}
+
static unsigned long hog (unsigned long niters)
{
- stop = 0;
- while (!stop && --niters);
- return niters;
+ stop = 0;
+ while (!stop && --niters);
+ return niters;
}
+
int main (void)
{
- int i;
- struct itimerval it = { .it_interval = { .tv_sec = 0, .tv_usec = 1 },
- .it_value = { .tv_sec = 0, .tv_usec = 1 } };
- sigset_t set;
- unsigned long v[HIST];
- double tmp = 0.0;
- unsigned long n;
- signal (SIGALRM, &sighandler);
- setitimer (ITIMER_REAL, &it, NULL);
-
- hog (ULONG_MAX);
- for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog (ULONG_MAX);
- for (i = 0; i < HIST; ++i) tmp += v[i];
- tmp /= HIST;
- n = tmp - (tmp / 3.0);
-
- sigemptyset (&set);
- sigaddset (&set, SIGALRM);
-
- for (;;) {
- hog (n);
- sigwait (&set, &i);
- }
- return 0;
+ int i;
+ struct itimerval it = {
+ .it_interval = { .tv_sec = 0, .tv_usec = 1 },
+ .it_value = { .tv_sec = 0, .tv_usec = 1 } };
+ sigset_t set;
+ unsigned long v[HIST];
+ double tmp = 0.0;
+ unsigned long n;
+ signal(SIGALRM, &sighandler);
+ setitimer(ITIMER_REAL, &it, NULL);
+
+ hog (ULONG_MAX);
+ for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog(ULONG_MAX);
+ for (i = 0; i < HIST; ++i) tmp += v[i];
+ tmp /= HIST;
+ n = tmp - (tmp / 3.0);
+
+ sigemptyset(&set);
+ sigaddset(&set, SIGALRM);
+
+ for (;;) {
+ hog(n);
+ sigwait(&set, &i);
+ }
+ return 0;
}
References
----------
-- http://lkml.org/lkml/2007/2/12/6
-- Documentation/filesystems/proc.txt (1.8)
+- https://lore.kernel.org/r/loom.20070212T063225-663@post.gmane.org
+- Documentation/filesystems/proc.rst (1.8)
Thanks
diff --git a/Documentation/admin-guide/cputopology.rst b/Documentation/admin-guide/cputopology.rst
index b90dafcc8237..d29cacc9b3c3 100644
--- a/Documentation/admin-guide/cputopology.rst
+++ b/Documentation/admin-guide/cputopology.rst
@@ -2,105 +2,28 @@
How CPU topology info is exported via sysfs
===========================================
-Export CPU topology info via sysfs. Items (attributes) are similar
-to /proc/cpuinfo output of some architectures. They reside in
-/sys/devices/system/cpu/cpuX/topology/:
-
-physical_package_id:
-
- physical package id of cpuX. Typically corresponds to a physical
- socket number, but the actual value is architecture and platform
- dependent.
-
-die_id:
-
- the CPU die ID of cpuX. Typically it is the hardware platform's
- identifier (rather than the kernel's). The actual value is
- architecture and platform dependent.
-
-core_id:
-
- the CPU core ID of cpuX. Typically it is the hardware platform's
- identifier (rather than the kernel's). The actual value is
- architecture and platform dependent.
-
-book_id:
-
- the book ID of cpuX. Typically it is the hardware platform's
- identifier (rather than the kernel's). The actual value is
- architecture and platform dependent.
-
-drawer_id:
-
- the drawer ID of cpuX. Typically it is the hardware platform's
- identifier (rather than the kernel's). The actual value is
- architecture and platform dependent.
-
-core_cpus:
-
- internal kernel map of CPUs within the same core.
- (deprecated name: "thread_siblings")
-
-core_cpus_list:
-
- human-readable list of CPUs within the same core.
- (deprecated name: "thread_siblings_list");
-
-package_cpus:
-
- internal kernel map of the CPUs sharing the same physical_package_id.
- (deprecated name: "core_siblings")
-
-package_cpus_list:
-
- human-readable list of CPUs sharing the same physical_package_id.
- (deprecated name: "core_siblings_list")
-
-die_cpus:
-
- internal kernel map of CPUs within the same die.
-
-die_cpus_list:
-
- human-readable list of CPUs within the same die.
-
-book_siblings:
-
- internal kernel map of cpuX's hardware threads within the same
- book_id.
-
-book_siblings_list:
-
- human-readable list of cpuX's hardware threads within the same
- book_id.
-
-drawer_siblings:
-
- internal kernel map of cpuX's hardware threads within the same
- drawer_id.
-
-drawer_siblings_list:
-
- human-readable list of cpuX's hardware threads within the same
- drawer_id.
+CPU topology info is exported via sysfs. Items (attributes) are similar
+to /proc/cpuinfo output of some architectures. They reside in
+/sys/devices/system/cpu/cpuX/topology/. Please refer to the ABI file:
+Documentation/ABI/stable/sysfs-devices-system-cpu.
Architecture-neutral, drivers/base/topology.c, exports these attributes.
-However, the book and drawer related sysfs files will only be created if
-CONFIG_SCHED_BOOK and CONFIG_SCHED_DRAWER are selected, respectively.
-
-CONFIG_SCHED_BOOK and CONFIG_SCHED_DRAWER are currently only used on s390,
-where they reflect the cpu and cache hierarchy.
+However, the die, cluster, book, and drawer hierarchy related sysfs files will
+only be created if an architecture provides the related macros as described
+below.
For an architecture to support this feature, it must define some of
these macros in include/asm-XXX/topology.h::
#define topology_physical_package_id(cpu)
#define topology_die_id(cpu)
+ #define topology_cluster_id(cpu)
#define topology_core_id(cpu)
#define topology_book_id(cpu)
#define topology_drawer_id(cpu)
#define topology_sibling_cpumask(cpu)
#define topology_core_cpumask(cpu)
+ #define topology_cluster_cpumask(cpu)
#define topology_die_cpumask(cpu)
#define topology_book_cpumask(cpu)
#define topology_drawer_cpumask(cpu)
@@ -116,15 +39,16 @@ not defined by include/asm-XXX/topology.h:
1) topology_physical_package_id: -1
2) topology_die_id: -1
-3) topology_core_id: 0
-4) topology_sibling_cpumask: just the given CPU
-5) topology_core_cpumask: just the given CPU
-6) topology_die_cpumask: just the given CPU
-
-For architectures that don't support books (CONFIG_SCHED_BOOK) there are no
-default definitions for topology_book_id() and topology_book_cpumask().
-For architectures that don't support drawers (CONFIG_SCHED_DRAWER) there are
-no default definitions for topology_drawer_id() and topology_drawer_cpumask().
+3) topology_cluster_id: -1
+4) topology_core_id: 0
+5) topology_book_id: -1
+6) topology_drawer_id: -1
+7) topology_sibling_cpumask: just the given CPU
+8) topology_core_cpumask: just the given CPU
+9) topology_cluster_cpumask: just the given CPU
+10) topology_die_cpumask: just the given CPU
+11) topology_book_cpumask: just the given CPU
+12) topology_drawer_cpumask: just the given CPU
Additionally, CPU topology information is provided under
/sys/devices/system/cpu and includes these files. The internal
@@ -135,9 +59,9 @@ source for the output is in brackets ("[]").
[NR_CPUS-1]
offline: CPUs that are not online because they have been
- HOTPLUGGED off (see cpu-hotplug.txt) or exceed the limit
- of CPUs allowed by the kernel configuration (kernel_max
- above). [~cpu_online_mask + cpus >= NR_CPUS]
+ HOTPLUGGED off or exceed the limit of CPUs allowed by the
+ kernel configuration (kernel_max above).
+ [~cpu_online_mask + cpus >= NR_CPUS]
online: CPUs that are online and being scheduled [cpu_online_mask]
@@ -173,5 +97,5 @@ online.)::
possible: 0-127
present: 0-3
-See cpu-hotplug.txt for the possible_cpus=NUM kernel start parameter
-as well as more information on the various cpumasks.
+See Documentation/core-api/cpu_hotplug.rst for the possible_cpus=NUM
+kernel start parameter as well as more information on the various cpumasks.
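The list files under /sys/devices/system/cpu shown above (for example "possible: 0-127" and "present: 0-3") use a compact range notation. The following is a minimal Python sketch, not part of the patch, of how a script might expand that notation. The helper names `parse_cpu_list` and `read_cpu_list` are hypothetical, and the sketch assumes the files contain only comma-separated single CPUs and `first-last` ranges.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means an empty CPU set
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_cpu_list(name, root="/sys/devices/system/cpu"):
    """Read and expand one list file, e.g. "online", "possible" or "present"."""
    return parse_cpu_list(Path(root, name).read_text())
```

On a Linux system, `read_cpu_list("online")` would return the CPUs that are currently online; the parser itself does not touch sysfs and can be exercised with literal strings.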
diff --git a/Documentation/admin-guide/dell_rbu.rst b/Documentation/admin-guide/dell_rbu.rst
index 8d70e1fc9f9d..2196caf1b939 100644
--- a/Documentation/admin-guide/dell_rbu.rst
+++ b/Documentation/admin-guide/dell_rbu.rst
@@ -26,7 +26,7 @@ Please go to http://support.dell.com register and you can find info on
OpenManage and Dell Update packages (DUP).
Libsmbios can also be used to update BIOS on Dell systems go to
-http://linux.dell.com/libsmbios/ for details.
+https://linux.dell.com/libsmbios/ for details.
Dell_RBU driver supports BIOS update using the monolithic image and packetized
image methods. In case of monolithic the driver allocates a contiguous chunk
diff --git a/Documentation/admin-guide/device-mapper/dm-crypt.rst b/Documentation/admin-guide/device-mapper/dm-crypt.rst
index 8f4a3f889d43..aa2d04d95df6 100644
--- a/Documentation/admin-guide/device-mapper/dm-crypt.rst
+++ b/Documentation/admin-guide/device-mapper/dm-crypt.rst
@@ -46,7 +46,7 @@ Parameters::
capi:authenc(hmac(sha256),xts(aes))-random
capi:rfc7539(chacha20,poly1305)-random
- The /proc/crypto contains a list of curently loaded crypto modes.
+ The /proc/crypto contains a list of currently loaded crypto modes.
<key>
Key used for encryption. It is encoded either as a hexadecimal number
@@ -67,7 +67,7 @@ Parameters::
the value passed in <key_size>.
<key_type>
- Either 'logon' or 'user' kernel key type.
+ Either 'logon', 'user', 'encrypted' or 'trusted' kernel key type.
<key_description>
The kernel keyring key description crypt target should look for
@@ -92,7 +92,7 @@ Parameters::
<#opt_params>
Number of optional parameters. If there are no optional parameters,
- the optional paramaters section can be skipped or #opt_params can be zero.
+ the optional parameters section can be skipped or #opt_params can be zero.
Otherwise #opt_params is the number of following arguments.
Example of optional parameters section:
@@ -121,6 +121,14 @@ submit_from_crypt_cpus
thread because it benefits CFQ to have writes submitted using the
same context.
+no_read_workqueue
+ Bypass dm-crypt internal workqueue and process read requests synchronously.
+
+no_write_workqueue
+ Bypass dm-crypt internal workqueue and process write requests synchronously.
+ This option is automatically enabled for host-managed zoned block devices
+ (e.g. host-managed SMR hard-disks).
+
integrity:<bytes>:<type>
The device requires additional <bytes> metadata per-sector stored
in per-bio integrity structure. This metadata must be provided
diff --git a/Documentation/admin-guide/device-mapper/dm-dust.rst b/Documentation/admin-guide/device-mapper/dm-dust.rst
index b6e7e7ead831..e35ec8cd2f88 100644
--- a/Documentation/admin-guide/device-mapper/dm-dust.rst
+++ b/Documentation/admin-guide/device-mapper/dm-dust.rst
@@ -69,10 +69,11 @@ Create the dm-dust device:
$ sudo dmsetup create dust1 --table '0 33552384 dust /dev/vdb1 0 4096'
Check the status of the read behavior ("bypass" indicates that all I/O
-will be passed through to the underlying device)::
+will be passed through to the underlying device; "verbose" indicates that
+bad block additions, removals, and remaps will be verbosely logged)::
$ sudo dmsetup status dust1
- 0 33552384 dust 252:17 bypass
+ 0 33552384 dust 252:17 bypass verbose
$ sudo dd if=/dev/mapper/dust1 of=/dev/null bs=512 count=128 iflag=direct
128+0 records in
@@ -164,7 +165,7 @@ following message command::
A message will print with the number of bad blocks currently
configured on the device::
- kernel: device-mapper: dust: countbadblocks: 895 badblock(s) found
+ countbadblocks: 895 badblock(s) found
Querying for specific bad blocks
--------------------------------
@@ -176,11 +177,11 @@ following message command::
The following message will print if the block is in the list::
- device-mapper: dust: queryblock: block 72 found in badblocklist
+ dust_query_block: block 72 found in badblocklist
The following message will print if the block is not in the list::
- device-mapper: dust: queryblock: block 72 not found in badblocklist
+ dust_query_block: block 72 not found in badblocklist
The "queryblock" message command will work in both the "enabled"
and "disabled" modes, allowing the verification of whether a block
@@ -198,12 +199,28 @@ following message command::
After clearing the bad block list, the following message will appear::
- kernel: device-mapper: dust: clearbadblocks: badblocks cleared
+ dust_clear_badblocks: badblocks cleared
If there were no bad blocks to clear, the following message will
appear::
- kernel: device-mapper: dust: clearbadblocks: no badblocks found
+ dust_clear_badblocks: no badblocks found
+
+Listing the bad block list
+--------------------------
+
+To list all bad blocks in the bad block list (using an example device
+with blocks 1 and 2 in the bad block list), run the following message
+command::
+
+ $ sudo dmsetup message dust1 0 listbadblocks
+ 1
+ 2
+
+If there are no bad blocks in the bad block list, the command will
+execute with no output::
+
+ $ sudo dmsetup message dust1 0 listbadblocks
Message commands list
---------------------
@@ -223,6 +240,7 @@ Single argument message commands::
countbadblocks
clearbadblocks
+ listbadblocks
disable
enable
quiet
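As an illustration of the status line shown in this patch ("0 33552384 dust 252:17 bypass verbose"), the sketch below splits such a line into named fields. It is not part of the patch: the `DustStatus` type, its field names, and `parse_dust_status` are hypothetical, and the six-field layout is assumed from the single example above rather than taken from the driver source.

```python
from typing import NamedTuple


class DustStatus(NamedTuple):
    start: int      # first sector of the mapping
    length: int     # length of the mapping in sectors
    target: str     # target type, expected to be "dust"
    device: str     # underlying device as "major:minor"
    mode: str       # read behavior, e.g. "bypass"
    logging: str    # logging behavior, e.g. "verbose"


def parse_dust_status(line):
    """Parse one line of `dmsetup status` output for a dust target."""
    fields = line.split()
    if len(fields) != 6 or fields[2] != "dust":
        raise ValueError("not a dust status line: %r" % line)
    return DustStatus(int(fields[0]), int(fields[1]), *fields[2:])
```

A caller could then check `status.mode == "bypass"` before running a test that expects reads to pass through to the underlying device.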
diff --git a/Documentation/admin-guide/device-mapper/dm-ebs.rst b/Documentation/admin-guide/device-mapper/dm-ebs.rst
new file mode 100644
index 000000000000..534fa38e8862
--- /dev/null
+++ b/Documentation/admin-guide/device-mapper/dm-ebs.rst
@@ -0,0 +1,51 @@
+======
+dm-ebs
+======
+
+
+This target is similar to the linear target except that it emulates
+a smaller logical block size on a device with a larger logical block
+size. Its main purpose is to provide emulation of 512 byte sectors on
+devices that do not provide this emulation (i.e. 4K native disks).
+
+Supported emulated logical block sizes are 512, 1024, 2048 and 4096 bytes.
+
+Underlying block size can be set to > 4K to test buffering larger units.
+
+
+Table parameters
+----------------
+ <dev path> <offset> <emulated sectors> [<underlying sectors>]
+
+Mandatory parameters:
+
+ <dev path>:
+ Full pathname to the underlying block-device,
+ or a "major:minor" device-number.
+ <offset>:
+ Starting sector within the device;
+ has to be a multiple of <emulated sectors>.
+ <emulated sectors>:
+ Number of sectors defining the logical block size to be emulated;
+ 1, 2, 4, 8 sectors of 512 bytes supported.
+
+Optional parameter:
+
+ <underlying sectors>:
+ Number of sectors defining the logical block size of <dev path>.
+ 2^N supported, e.g. 8 = emulate 8 sectors of 512 bytes = 4KiB.
+ If not provided, the logical block size of <dev path> will be used.
+
+
+Examples:
+
+Emulate 1 sector = 512 bytes logical block size on /dev/sda starting at
+offset 1024 sectors with the underlying device's block size automatically set:
+
+ebs /dev/sda 1024 1
+
+Emulate 2 sectors = 1KiB logical block size on /dev/sda starting at
+offset 128 sectors, enforce 2KiB underlying device block size.
+This presumes 2KiB logical blocksize on /dev/sda or less to work:
+
+ebs /dev/sda 128 2 4
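The parameter rules described in the new dm-ebs document can be checked before a table line is built. The following Python sketch is an illustration only and is not part of the patch: `ebs_target_args` is a hypothetical helper, and it produces just the target portion of the line in the same form as the two examples above.

```python
def ebs_target_args(dev_path, offset, emulated_sectors, underlying_sectors=None):
    """Build the "ebs ..." portion of a table line after validating it."""
    # 1, 2, 4 or 8 sectors of 512 bytes are the documented emulated sizes.
    if emulated_sectors not in (1, 2, 4, 8):
        raise ValueError("emulated sectors must be 1, 2, 4 or 8")
    # The offset has to be a multiple of <emulated sectors>.
    if offset % emulated_sectors != 0:
        raise ValueError("offset must be a multiple of emulated sectors")
    args = ["ebs", dev_path, str(offset), str(emulated_sectors)]
    if underlying_sectors is not None:
        # The optional parameter must be a power of two (2^N).
        if underlying_sectors < 1 or underlying_sectors & (underlying_sectors - 1):
            raise ValueError("underlying sectors must be a power of two")
        args.append(str(underlying_sectors))
    return " ".join(args)
```

Called with the values from the examples above, the helper reproduces the same strings; an offset that is not a multiple of the emulated sector count is rejected before anything is passed to dmsetup.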
diff --git a/Documentation/admin-guide/device-mapper/dm-ima.rst b/Documentation/admin-guide/device-mapper/dm-ima.rst
new file mode 100644
index 000000000000..a4aa50a828e0
--- /dev/null
+++ b/Documentation/admin-guide/device-mapper/dm-ima.rst
@@ -0,0 +1,715 @@
+======
+dm-ima
+======
+
+For a given system, various external services/infrastructure tools
+(including the attestation service) interact with it - both during the
+setup and during rest of the system run-time. They share sensitive data
+and/or execute critical workload on that system. The external services
+may want to verify the current run-time state of the relevant kernel
+subsystems before fully trusting the system with business-critical
+data/workload.
+
+Device mapper plays a critical role on a given system by providing
+various important functionalities to the block devices using various
+target types like crypt, verity, integrity etc. Each of these target
+types’ functionalities can be configured with various attributes.
+The attributes chosen to configure these target types can significantly
+impact the security profile of the block device, and in-turn, of the
+system itself. For instance, the type of encryption algorithm and the
+key size determines the strength of encryption for a given block device.
+
+Therefore, verifying the current state of various block devices as well
+as their various target attributes is crucial for external services before
+fully trusting the system with business-critical data/workload.
+
+IMA kernel subsystem provides the necessary functionality for
+device mapper to measure the state and configuration of
+various block devices -
+
+- by device mapper itself, from within the kernel,
+- in a tamper resistant way,
+- and re-measured - triggered on state/configuration change.
+
+Setting the IMA Policy:
+=======================
+For IMA to measure the data on a given system, the IMA policy on the
+system needs to be updated to have the following line, and the system needs
+to be restarted for the measurements to take effect.
+
+::
+
+ /etc/ima/ima-policy
+ measure func=CRITICAL_DATA label=device-mapper template=ima-buf
+
+The measurements will be reflected in the IMA logs, which are located at:
+
+::
+
+ /sys/kernel/security/integrity/ima/ascii_runtime_measurements
+ /sys/kernel/security/integrity/ima/binary_runtime_measurements
+
+The IMA ASCII measurement log has the following format:
+
+::
+
+ <PCR> <TEMPLATE_DATA_DIGEST> <TEMPLATE_NAME> <TEMPLATE_DATA>
+
+ PCR := Platform Configuration Register, in which the values are registered.
+ This is applicable if TPM chip is in use.
+
+ TEMPLATE_DATA_DIGEST := Template data digest of the IMA record.
+ TEMPLATE_NAME := Template name that registered the integrity value (e.g. ima-buf).
+
+ TEMPLATE_DATA := <ALG> ":" <EVENT_DIGEST> <EVENT_NAME> <EVENT_DATA>
+ It contains data for the specific event to be measured,
+ in a given template data format.
+
+ ALG := Algorithm to compute event digest
+ EVENT_DIGEST := Digest of the event data
+ EVENT_NAME := Description of the event (e.g. 'dm_table_load').
+ EVENT_DATA := The event data to be measured.
+
+|
+
+| *NOTE #1:*
+| The DM target data measured by IMA subsystem can alternatively
+ be queried from userspace by setting DM_IMA_MEASUREMENT_FLAG with
+ DM_TABLE_STATUS_CMD.
+
+|
+
+| *NOTE #2:*
+| The Kernel configuration CONFIG_IMA_DISABLE_HTABLE allows measurement of duplicate records.
+| To support recording duplicate IMA events in the IMA log, the Kernel needs to be configured with
+ CONFIG_IMA_DISABLE_HTABLE=y.
+
+Supported Device States:
+========================
+The following device state changes will trigger IMA measurements:
+
+ 1. Table load
+ #. Device resume
+ #. Device remove
+ #. Table clear
+ #. Device rename
+
+1. Table load:
+---------------
+When a new table is loaded in a device's inactive table slot,
+the device information and target specific details from the
+targets in the table are measured.
+
+The IMA measurement log has the following format for 'dm_table_load':
+
+::
+
+ EVENT_NAME := "dm_table_load"
+ EVENT_DATA := <dm_version_str> ";" <device_metadata> ";" <table_load_data>
+
+ dm_version_str := "dm_version=" <N> "." <N> "." <N>
+ Same as Device Mapper driver version.
+ device_metadata := <device_name> "," <device_uuid> "," <device_major> "," <device_minor> ","
+ <minor_count> "," <num_device_targets> ";"
+
+ device_name := "name=" <dm-device-name>
+ device_uuid := "uuid=" <dm-device-uuid>
+ device_major := "major=" <N>
+ device_minor := "minor=" <N>
+ minor_count := "minor_count=" <N>
+ num_device_targets := "num_targets=" <N>
+ dm-device-name := Name of the device. If it contains special characters like '\', ',', ';',
+ they are prefixed with '\'.
+ dm-device-uuid := UUID of the device. If it contains special characters like '\', ',', ';',
+ they are prefixed with '\'.
+
+ table_load_data := <target_data>
+ Represents the data (as name=value pairs) from various targets in the table,
+ which is being loaded into the DM device's inactive table slot.
+ target_data := <target_data_row> | <target_data><target_data_row>
+
+ target_data_row := <target_index> "," <target_begin> "," <target_len> "," <target_name> ","
+ <target_version> "," <target_attributes> ";"
+ target_index := "target_index=" <N>
+ Represents nth target in the table (from 0 to N-1 targets specified in <num_device_targets>)
+ If all the data for N targets doesn't fit in the given buffer - then the data that fits
+ in the buffer (say from target 0 to x) is measured in a given IMA event.
+ The remaining data from targets x+1 to N-1 is measured in the subsequent IMA events,
+ with the same format as that of 'dm_table_load'
+ i.e. <dm_version_str> ";" <device_metadata> ";" <table_load_data>.
+
+ target_begin := "target_begin=" <N>
+ target_len := "target_len=" <N>
+ target_name := Name of the target. 'linear', 'crypt', 'integrity' etc.
+ The targets that are supported for IMA measurements are documented below in the
+ 'Supported targets' section.
+ target_version := "target_version=" <N> "." <N> "." <N>
+ target_attributes := Data containing comma separated list of name=value pairs of target specific attributes.
+
+ For instance, if a linear device is created with the following table entries,
+ # dmsetup create linear1
+ 0 2 linear /dev/loop0 512
+ 2 2 linear /dev/loop0 512
+ 4 2 linear /dev/loop0 512
+ 6 2 linear /dev/loop0 512
+
+ Then IMA ASCII measurement log will have the following entry:
+ (converted from ASCII to text for readability)
+
+ 10 a8c5ff755561c7a28146389d1514c318592af49a ima-buf sha256:4d73481ecce5eadba8ab084640d85bb9ca899af4d0a122989252a76efadc5b72
+ dm_table_load
+ dm_version=4.45.0;
+ name=linear1,uuid=,major=253,minor=0,minor_count=1,num_targets=4;
+ target_index=0,target_begin=0,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+ target_index=1,target_begin=2,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+ target_index=2,target_begin=4,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+ target_index=3,target_begin=6,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+
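The 'dm_table_load' event data described above is a sequence of ";"-terminated records, each holding ","-separated name=value pairs, with '\', ',' and ';' inside names and UUIDs prefixed by '\'. The Python sketch below, which is not part of the patch, shows one way a verifier might split the readable form of that data. All function names are hypothetical, and the sketch assumes the event data has already been converted to text as in the example above.

```python
def split_escaped(text, sep):
    """Split on sep, ignoring separators that follow a backslash.

    Escape sequences are left in place so a later pass can still see them.
    """
    parts, start, i = [], 0, 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif text[i] == sep:
            parts.append(text[start:i])
            i += 1
            start = i
        else:
            i += 1
    parts.append(text[start:])
    return parts


def unescape(text):
    """Drop the backslash in front of each escaped character."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            i += 1
        out.append(text[i])
        i += 1
    return "".join(out)


def parse_event_data(event_data):
    """Return one dict of name=value pairs per ";"-terminated record."""
    records = []
    for record in split_escaped(event_data, ";"):
        if not record:
            continue  # the trailing ";" leaves an empty final piece
        pairs = {}
        for field in split_escaped(record, ","):
            key, _, value = field.partition("=")
            pairs[key] = unescape(value)
        records.append(pairs)
    return records
```

For the linear1 example above, the first record holds `dm_version`, the second holds the device metadata, and each following record describes one target. Values are kept as strings, so a caller that needs `num_targets` as a number has to convert it.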
+2. Device resume:
+------------------
+When a suspended device is resumed, the device information and the hash of the
+data from previous load of an active table are measured.
+
+The IMA measurement log has the following format for 'dm_device_resume':
+
+::
+
+ EVENT_NAME := "dm_device_resume"
+ EVENT_DATA := <dm_version_str> ";" <device_metadata> ";" <active_table_hash> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_metadata := As described in the 'Table load' section above.
+ active_table_hash := "active_table_hash=" <table_hash_alg> ":" <table_hash>
+ Represents the hash of the IMA data being measured for the
+ active table for the device.
+ table_hash_alg := Algorithm used to compute the hash.
+ table_hash := Hash of the (<dm_version_str> ";" <device_metadata> ";" <table_load_data> ";")
+ as described in the 'dm_table_load' above.
+ Note: If the table_load data spans across multiple IMA 'dm_table_load'
+ events for a given device, the hash is computed combining all the event data
+ i.e. (<dm_version_str> ";" <device_metadata> ";" <table_load_data> ";")
+ across all those events.
+ current_device_capacity := "current_device_capacity=" <N>
+
+ For instance, if a linear device is resumed with the following command,
+ #dmsetup resume linear1
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 56c00cc062ffc24ccd9ac2d67d194af3282b934e ima-buf sha256:e7d12c03b958b4e0e53e7363a06376be88d98a1ac191fdbd3baf5e4b77f329b6
+ dm_device_resume
+ dm_version=4.45.0;
+ name=linear1,uuid=,major=253,minor=0,minor_count=1,num_targets=4;
+ active_table_hash=sha256:4d73481ecce5eadba8ab084640d85bb9ca899af4d0a122989252a76efadc5b72;current_device_capacity=8;
+
+3. Device remove:
+------------------
+When a device is removed, the device information and a sha256 hash of the
+data from an active and inactive table are measured.
+
+The IMA measurement log has the following format for 'dm_device_remove':
+
+::
+
+ EVENT_NAME := "dm_device_remove"
+ EVENT_DATA := <dm_version_str> ";" <device_active_metadata> ";" <device_inactive_metadata> ";"
+ <active_table_hash> "," <inactive_table_hash> "," <remove_all> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_active_metadata := Device metadata that reflects the currently loaded active table.
+ The format is same as 'device_metadata' described in the 'Table load' section above.
+ device_inactive_metadata := Device metadata that reflects the inactive table.
+ The format is same as 'device_metadata' described in the 'Table load' section above.
+ active_table_hash := Hash of the currently loaded active table.
+ The format is same as 'active_table_hash' described in the 'Device resume' section above.
+ inactive_table_hash := Hash of the inactive table.
+ The format is same as 'active_table_hash' described in the 'Device resume' section above.
+ remove_all := "remove_all=" <yes_no>
+ yes_no := "y" | "n"
+ current_device_capacity := "current_device_capacity=" <N>
+
+ For instance, if a linear device is removed with the following command,
+ #dmsetup remove l1
+
+ then IMA ASCII measurement log will have the following entry:
+ (converted from ASCII to text for readability)
+
+ 10 790e830a3a7a31590824ac0642b3b31c2d0e8b38 ima-buf sha256:ab9f3c959367a8f5d4403d6ce9c3627dadfa8f9f0e7ec7899299782388de3840
+ dm_device_remove
+ dm_version=4.45.0;
+ device_active_metadata=name=l1,uuid=,major=253,minor=2,minor_count=1,num_targets=2;
+ device_inactive_metadata=name=l1,uuid=,major=253,minor=2,minor_count=1,num_targets=1;
+ active_table_hash=sha256:4a7e62efaebfc86af755831998b7db6f59b60d23c9534fb16a4455907957953a,
+ inactive_table_hash=sha256:9d79c175bc2302d55a183e8f50ad4bafd60f7692fd6249e5fd213e2464384b86,remove_all=n;
+ current_device_capacity=2048;
+
+4. Table clear:
+----------------
+When an inactive table is cleared from the device, the device information and a sha256 hash of the
+data from an inactive table are measured.
+
+The IMA measurement log has the following format for 'dm_table_clear':
+
+::
+
+ EVENT_NAME := "dm_table_clear"
+ EVENT_DATA := <dm_version_str> ";" <device_inactive_metadata> ";" <inactive_table_hash> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_inactive_metadata := Device metadata that was captured when the inactive table being cleared was loaded.
+ The format is the same as 'device_metadata' described in the 'Table load' section above.
+ inactive_table_hash := Hash of the inactive table being cleared from the device.
+ The format is the same as 'active_table_hash' described in the 'Device resume' section above.
+ current_device_capacity := "current_device_capacity=" <N>
+
+ For instance, if a linear device's inactive table is cleared,
+ #dmsetup clear l1
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 77d347408f557f68f0041acb0072946bb2367fe5 ima-buf sha256:42f9ca22163fdfa548e6229dece2959bc5ce295c681644240035827ada0e1db5
+ dm_table_clear
+ dm_version=4.45.0;
+ name=l1,uuid=,major=253,minor=2,minor_count=1,num_targets=1;
+ inactive_table_hash=sha256:75c0dc347063bf474d28a9907037eba060bfe39d8847fc0646d75e149045d545;current_device_capacity=1024;
+
+5. Device rename:
+------------------
+When a device's NAME or UUID is changed, the device information and the new NAME and UUID
+are measured.
+
+The IMA measurement log has the following format for 'dm_device_rename':
+
+::
+
+ EVENT_NAME := "dm_device_rename"
+ EVENT_DATA := <dm_version_str> ";" <device_active_metadata> ";" <new_device_name> "," <new_device_uuid> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_active_metadata := Device metadata that reflects the currently loaded active table.
+ The format is the same as 'device_metadata' described in the 'Table load' section above.
+ new_device_name := "new_name=" <dm-device-name>
+ dm-device-name := Same as <dm-device-name> described in 'Table load' section above
+ new_device_uuid := "new_uuid=" <dm-device-uuid>
+ dm-device-uuid := Same as <dm-device-uuid> described in 'Table load' section above
+ current_device_capacity := "current_device_capacity=" <N>
+
+ E.g. 1: if a linear device's UUID is set with the following command,
+ #dmsetup rename linear1 --setuuid 1234-5678
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 8b0423209b4c66ac1523f4c9848c9b51ee332f48 ima-buf sha256:6847b7258134189531db593e9230b257c84f04038b5a18fd2e1473860e0569ac
+ dm_device_rename
+ dm_version=4.45.0;
+ name=linear1,uuid=,major=253,minor=2,minor_count=1,num_targets=1;new_name=linear1,new_uuid=1234-5678;
+ current_device_capacity=1024;
+
+ E.g. 2: if a linear device's name is changed with the following command,
+ # dmsetup rename linear1 linear=2
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 bef70476b99c2bdf7136fae033aa8627da1bf76f ima-buf sha256:8c6f9f53b9ef9dc8f92a2f2cca8910e622543d0f0d37d484870cb16b95111402
+ dm_device_rename
+ dm_version=4.45.0;
+ name=linear1,uuid=1234-5678,major=253,minor=2,minor_count=1,num_targets=1;
+ new_name=linear\=2,new_uuid=1234-5678;
+ current_device_capacity=1024;
+
+Supported targets:
+==================
+
+The following targets support measuring their data using IMA:
+
+ 1. cache
+ #. crypt
+ #. integrity
+ #. linear
+ #. mirror
+ #. multipath
+ #. raid
+ #. snapshot
+ #. striped
+ #. verity
+
+1. cache
+---------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'cache' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <metadata_mode> "," <cache_metadata_device> ","
+ <cache_device> "," <cache_origin_device> "," <writethrough> "," <writeback> ","
+ <passthrough> "," <metadata2> "," <no_discard_passdown> ";"
+
+ target_name := "target_name=cache"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ metadata_mode := "metadata_mode=" <cache_metadata_mode>
+ cache_metadata_mode := "fail" | "ro" | "rw"
+ cache_metadata_device := "cache_metadata_device=" <cache_metadata_device_string>
+ cache_device := "cache_device=" <cache_device_name_string>
+ cache_origin_device := "cache_origin_device=" <cache_origin_device_string>
+ writethrough := "writethrough=" <yes_no>
+ writeback := "writeback=" <yes_no>
+ passthrough := "passthrough=" <yes_no>
+ metadata2 := "metadata2=" <yes_no>
+ no_discard_passdown := "no_discard_passdown=" <yes_no>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'cache' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'cache' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;name=cache1,uuid=cache_uuid,major=253,minor=2,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=28672,target_name=cache,target_version=2.2.0,metadata_mode=rw,
+ cache_metadata_device=253:4,cache_device=253:3,cache_origin_device=253:5,writethrough=y,writeback=n,
+ passthrough=n,metadata2=y,no_discard_passdown=n;
+
+
+2. crypt
+---------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'crypt' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <allow_discards> "," <same_cpu_crypt> ","
+ <submit_from_crypt_cpus> "," <no_read_workqueue> "," <no_write_workqueue> ","
+ <iv_large_sectors> "," [<integrity_tag_size> ","] [<cipher_auth> ","]
+ [<sector_size> ","] [<cipher_string> ","] <key_size> "," <key_parts> ","
+ <key_extra_size> "," <key_mac_size> ";"
+
+ target_name := "target_name=crypt"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ allow_discards := "allow_discards=" <yes_no>
+ same_cpu_crypt := "same_cpu_crypt=" <yes_no>
+ submit_from_crypt_cpus := "submit_from_crypt_cpus=" <yes_no>
+ no_read_workqueue := "no_read_workqueue=" <yes_no>
+ no_write_workqueue := "no_write_workqueue=" <yes_no>
+ iv_large_sectors := "iv_large_sectors=" <yes_no>
+ integrity_tag_size := "integrity_tag_size=" <N>
+ cipher_auth := "cipher_auth=" <string>
+ sector_size := "sector_size=" <N>
+ cipher_string := "cipher_string=" <string>
+ key_size := "key_size=" <N>
+ key_parts := "key_parts=" <N>
+ key_extra_size := "key_extra_size=" <N>
+ key_mac_size := "key_mac_size=" <N>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'crypt' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'crypt' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=crypt1,uuid=crypt_uuid1,major=253,minor=0,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=1953125,target_name=crypt,target_version=1.23.0,
+ allow_discards=y,same_cpu_crypt=n,submit_from_crypt_cpus=n,no_read_workqueue=n,no_write_workqueue=n,
+ iv_large_sectors=n,cipher_string=aes-xts-plain64,key_size=32,key_parts=1,key_extra_size=0,key_mac_size=0;
+
+3. integrity
+-------------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'integrity' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <dev_name> "," <start> ","
+ <tag_size> "," <mode> "," [<meta_device> ","] [<block_size> ","] <recalculate> ","
+ <allow_discards> "," <fix_padding> "," <fix_hmac> "," <legacy_recalculate> ","
+ <journal_sectors> "," <interleave_sectors> "," <buffer_sectors> ";"
+
+ target_name := "target_name=integrity"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ dev_name := "dev_name=" <device_name_str>
+ start := "start=" <N>
+ tag_size := "tag_size=" <N>
+ mode := "mode=" <integrity_mode_str>
+ integrity_mode_str := "J" | "B" | "D" | "R"
+ meta_device := "meta_device=" <meta_device_str>
+ block_size := "block_size=" <N>
+ recalculate := "recalculate=" <yes_no>
+ allow_discards := "allow_discards=" <yes_no>
+ fix_padding := "fix_padding=" <yes_no>
+ fix_hmac := "fix_hmac=" <yes_no>
+ legacy_recalculate := "legacy_recalculate=" <yes_no>
+ journal_sectors := "journal_sectors=" <N>
+ interleave_sectors := "interleave_sectors=" <N>
+ buffer_sectors := "buffer_sectors=" <N>
+ yes_no := "y" | "n"
+
+ E.g.
+ When an 'integrity' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'integrity' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=integrity1,uuid=,major=253,minor=1,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=7856,target_name=integrity,target_version=1.10.0,
+ dev_name=253:0,start=0,tag_size=32,mode=J,recalculate=n,allow_discards=n,fix_padding=n,
+ fix_hmac=n,legacy_recalculate=n,journal_sectors=88,interleave_sectors=32768,buffer_sectors=128;
+
+
+4. linear
+----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'linear' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <device_name> "," <start> ";"
+
+ target_name := "target_name=linear"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ device_name := "device_name=" <linear_device_name_str>
+ start := "start=" <N>
+
+ E.g.
+ When a 'linear' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'linear' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=linear1,uuid=linear_uuid1,major=253,minor=2,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=28672,target_name=linear,target_version=1.4.0,
+ device_name=253:1,start=2048;
+
+5. mirror
+----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'mirror' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <nr_mirrors> ","
+ <mirror_device_data> "," <handle_errors> "," <keep_log> "," <log_type_status> ";"
+
+ target_name := "target_name=mirror"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ nr_mirrors := "nr_mirrors=" <NR>
+ mirror_device_data := <mirror_device_row> | <mirror_device_data><mirror_device_row>
+ mirror_device_row is repeated <NR> times - for <NR> described in <nr_mirrors>.
+ mirror_device_row := <mirror_device_name> "," <mirror_device_status>
+ mirror_device_name := "mirror_device_" <X> "=" <mirror_device_name_str>
+ where <X> ranges from 0 to (<NR> -1) - for <NR> described in <nr_mirrors>.
+ mirror_device_status := "mirror_device_" <X> "_status=" <mirror_device_status_char>
+ where <X> ranges from 0 to (<NR> -1) - for <NR> described in <nr_mirrors>.
+ mirror_device_status_char := "A" | "F" | "D" | "S" | "R" | "U"
+ handle_errors := "handle_errors=" <yes_no>
+ keep_log := "keep_log=" <yes_no>
+ log_type_status := "log_type_status=" <log_type_status_str>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'mirror' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'mirror' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=mirror1,uuid=mirror_uuid1,major=253,minor=6,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=2048,target_name=mirror,target_version=1.14.0,nr_mirrors=2,
+ mirror_device_0=253:4,mirror_device_0_status=A,
+ mirror_device_1=253:5,mirror_device_1_status=A,
+ handle_errors=y,keep_log=n,log_type_status=;
+
+6. multipath
+-------------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'multipath' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <nr_priority_groups>
+ ["," <pg_state> "," <priority_groups> "," <priority_group_paths>] ";"
+
+ target_name := "target_name=multipath"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ nr_priority_groups := "nr_priority_groups=" <NPG>
+ priority_groups := <priority_groups_row>|<priority_groups_row><priority_groups>
+ priority_groups_row := "pg_state_" <X> "=" <pg_state_str> "," "nr_pgpaths_" <X> "=" <NPGP> ","
+ "path_selector_name_" <X> "=" <string> "," <priority_group_paths>
+ where <X> ranges from 0 to (<NPG> -1) - for <NPG> described in <nr_priority_groups>.
+ pg_state_str := "E" | "A" | "D"
+ priority_group_paths := <priority_group_paths_row> | <priority_group_paths_row><priority_group_paths>
+ priority_group_paths_row := "path_name_" <X> "_" <Y> "=" <string> "," "is_active_" <X> "_" <Y> "=" <is_active_str> ","
+ "fail_count_" <X> "_" <Y> "=" <N> "," "path_selector_status_" <X> "_" <Y> "=" <path_selector_status_str>
+ where <X> ranges from 0 to (<NPG> -1) - for <NPG> described in <nr_priority_groups>,
+ and <Y> ranges from 0 to (<NPGP> -1) - for <NPGP> described in <priority_groups_row>.
+ is_active_str := "A" | "F"
+
+ E.g.
+ When a 'multipath' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'multipath' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=mp,uuid=,major=253,minor=0,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=2097152,target_name=multipath,target_version=1.14.0,nr_priority_groups=2,
+ pg_state_0=E,nr_pgpaths_0=2,path_selector_name_0=queue-length,
+ path_name_0_0=8:16,is_active_0_0=A,fail_count_0_0=0,path_selector_status_0_0=,
+ path_name_0_1=8:32,is_active_0_1=A,fail_count_0_1=0,path_selector_status_0_1=,
+ pg_state_1=E,nr_pgpaths_1=2,path_selector_name_1=queue-length,
+ path_name_1_0=8:48,is_active_1_0=A,fail_count_1_0=0,path_selector_status_1_0=,
+ path_name_1_1=8:64,is_active_1_1=A,fail_count_1_1=0,path_selector_status_1_1=;
+
+7. raid
+--------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'raid' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <raid_type> "," <raid_disks> "," <raid_state> ","
+ <raid_device_status> ["," <journal_dev_mode>] ";"
+
+ target_name := "target_name=raid"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ raid_type := "raid_type=" <raid_type_str>
+ raid_disks := "raid_disks=" <NRD>
+ raid_state := "raid_state=" <raid_state_str>
+ raid_state_str := "frozen" | "reshape" | "resync" | "check" | "repair" | "recover" | "idle" | "undef"
+ raid_device_status := <raid_device_status_row> | <raid_device_status_row><raid_device_status>
+ <raid_device_status_row> is repeated <NRD> times - for <NRD> described in <raid_disks>.
+ raid_device_status_row := "raid_device_" <X> "_status=" <raid_device_status_str>
+ where <X> ranges from 0 to (<NRD> -1) - for <NRD> described in <raid_disks>.
+ raid_device_status_str := "A" | "D" | "a" | "-"
+ journal_dev_mode := "journal_dev_mode=" <journal_dev_mode_str>
+ journal_dev_mode_str := "writethrough" | "writeback" | "invalid"
+
+ E.g.
+ When a 'raid' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'raid' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=raid_LV1,uuid=uuid_raid_LV1,major=253,minor=12,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=2048,target_name=raid,target_version=1.15.1,
+ raid_type=raid10,raid_disks=4,raid_state=idle,
+ raid_device_0_status=A,
+ raid_device_1_status=A,
+ raid_device_2_status=A,
+ raid_device_3_status=A;
+
+
+8. snapshot
+------------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'snapshot' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <snap_origin_name> ","
+ <snap_cow_name> "," <snap_valid> "," <snap_merge_failed> "," <snapshot_overflowed> ";"
+
+ target_name := "target_name=snapshot"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ snap_origin_name := "snap_origin_name=" <string>
+ snap_cow_name := "snap_cow_name=" <string>
+ snap_valid := "snap_valid=" <yes_no>
+ snap_merge_failed := "snap_merge_failed=" <yes_no>
+ snapshot_overflowed := "snapshot_overflowed=" <yes_no>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'snapshot' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'snapshot' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=snap1,uuid=snap_uuid1,major=253,minor=13,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=4096,target_name=snapshot,target_version=1.16.0,
+ snap_origin_name=253:11,snap_cow_name=253:12,snap_valid=y,snap_merge_failed=n,snapshot_overflowed=n;
+
+9. striped
+-----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'striped' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <stripes> "," <chunk_size> ","
+ <stripe_data> ";"
+
+ target_name := "target_name=striped"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ stripes := "stripes=" <NS>
+ chunk_size := "chunk_size=" <N>
+ stripe_data := <stripe_data_row>|<stripe_data><stripe_data_row>
+ stripe_data_row := <stripe_device_name> "," <stripe_physical_start> "," <stripe_status>
+ stripe_device_name := "stripe_" <X> "_device_name=" <stripe_device_name_str>
+ where <X> ranges from 0 to (<NS> -1) - for <NS> described in <stripes>.
+ stripe_physical_start := "stripe_" <X> "_physical_start=" <N>
+ where <X> ranges from 0 to (<NS> -1) - for <NS> described in <stripes>.
+ stripe_status := "stripe_" <X> "_status=" <stripe_status_str>
+ where <X> ranges from 0 to (<NS> -1) - for <NS> described in <stripes>.
+ stripe_status_str := "D" | "A"
+
+ E.g.
+ When a 'striped' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'striped' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=striped1,uuid=striped_uuid1,major=253,minor=5,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=640,target_name=striped,target_version=1.6.0,stripes=2,chunk_size=64,
+ stripe_0_device_name=253:0,stripe_0_physical_start=2048,stripe_0_status=A,
+ stripe_1_device_name=253:3,stripe_1_physical_start=2048,stripe_1_status=A;
+
+10. verity
+----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'verity' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <hash_failed> "," <verity_version> ","
+ <data_device_name> "," <hash_device_name> "," <verity_algorithm> "," <root_digest> ","
+ <salt> "," <ignore_zero_blocks> "," <check_at_most_once> ["," <root_hash_sig_key_desc>]
+ ["," <verity_mode>] ";"
+
+ target_name := "target_name=verity"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ hash_failed := "hash_failed=" <hash_failed_str>
+ hash_failed_str := "C" | "V"
+ verity_version := "verity_version=" <verity_version_str>
+ data_device_name := "data_device_name=" <data_device_name_str>
+ hash_device_name := "hash_device_name=" <hash_device_name_str>
+ verity_algorithm := "verity_algorithm=" <verity_algorithm_str>
+ root_digest := "root_digest=" <root_digest_str>
+ salt := "salt=" <salt_str>
+ salt_str := "-" | <verity_salt_str>
+ ignore_zero_blocks := "ignore_zero_blocks=" <yes_no>
+ check_at_most_once := "check_at_most_once=" <yes_no>
+ root_hash_sig_key_desc := "root_hash_sig_key_desc=" <string>
+ verity_mode := "verity_mode=" <verity_mode_str>
+ verity_mode_str := "ignore_corruption" | "restart_on_corruption" | "panic_on_corruption" | "invalid"
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'verity' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'verity' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=test-verity,uuid=,major=253,minor=2,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=1953120,target_name=verity,target_version=1.8.0,hash_failed=V,
+ verity_version=1,data_device_name=253:1,hash_device_name=253:0,verity_algorithm=sha256,
+ root_digest=29cb87e60ce7b12b443ba6008266f3e41e93e403d7f298f8e3f316b29ff89c5e,
+ salt=e48da609055204e89ae53b655ca2216dd983cf3cb829f34f63a297d106d53e2d,
+ ignore_zero_blocks=n,check_at_most_once=n;
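The EVENT_DATA grammars in dm-ima.rst above all share one shape: groups terminated by ";", each made of "," separated key=value pairs. The following is a minimal Python sketch of a reader for that shape, not code from the kernel or from any IMA tool. It assumes that a backslash escapes the following character, which is inferred only from the `new_name=linear\=2` example in the 'Device rename' section; the function names are hypothetical.

```python
def split_unescaped(text, sep):
    """Split text on sep, ignoring any sep that follows a backslash."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Keep the escape sequence intact; it is resolved in unescape().
            current.append(text[i:i + 2])
            i += 2
            continue
        if ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def unescape(value):
    """Drop each escaping backslash and keep the character after it."""
    out, i = [], 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            out.append(value[i + 1])
            i += 2
        else:
            out.append(value[i])
            i += 1
    return "".join(out)


def parse_event_data(event_data):
    """Return one dict per ';'-terminated group of EVENT_DATA."""
    groups = []
    for group in split_unescaped(event_data, ";"):
        if not group.strip():
            continue  # skip the empty piece after the final ';'
        fields = {}
        for item in split_unescaped(group, ","):
            if not item.strip():
                continue
            pieces = split_unescaped(item, "=")
            key = pieces[0].strip()
            # Only the first unescaped '=' separates key from value.
            fields[key] = unescape("=".join(pieces[1:]))
        groups.append(fields)
    return groups


# The 'dm_device_rename' example (E.g. 2) joined onto a single line.
SAMPLE = (
    r"dm_version=4.45.0;"
    r"name=linear1,uuid=1234-5678,major=253,minor=2,minor_count=1,num_targets=1;"
    r"new_name=linear\=2,new_uuid=1234-5678;"
    r"current_device_capacity=1024;"
)

if __name__ == "__main__":
    for fields in parse_event_data(SAMPLE):
        print(fields)
```

Because only the first unescaped "=" is treated as the key/value separator, a field such as `device_active_metadata=name=l1` in the 'dm_device_remove' event yields the key `device_active_metadata` with the value `name=l1`; a caller that needs the nested pair has to split that value again.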
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index c00f9f11e3f3..8db172efa272 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -45,7 +45,7 @@ To use the target for the first time:
will format the device
3. unload the dm-integrity target
4. read the "provided_data_sectors" value from the superblock
-5. load the dm-integrity target with the the target size
+5. load the dm-integrity target with the target size
"provided_data_sectors"
6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target
with the size "provided_data_sectors"
@@ -99,7 +99,7 @@ interleave_sectors:number
the superblock is used.
meta_device:device
- Don't interleave the data and metadata on on device. Use a
+ Don't interleave the data and metadata on the device. Use a
separate device for metadata.
buffer_sectors:number
@@ -117,7 +117,7 @@ journal_watermark:number
commit_time:number
Commit time in milliseconds. When this time passes, the journal is
- written. The journal is also written immediatelly if the FLUSH
+ written. The journal is also written immediately if the FLUSH
request is received.
internal_hash:algorithm(:key) (the key is optional)
@@ -143,11 +143,11 @@ recalculate
journal_crypt:algorithm(:key) (the key is optional)
Encrypt the journal using given algorithm to make sure that the
attacker can't read the journal. You can use a block cipher here
- (such as "cbc(aes)") or a stream cipher (for example "chacha20",
- "salsa20" or "ctr(aes)").
+ (such as "cbc(aes)") or a stream cipher (for example "chacha20"
+ or "ctr(aes)").
The journal contains history of last writes to the block device,
- an attacker reading the journal could see the last sector nubmers
+ an attacker reading the journal could see the last sector numbers
that were written. From the sector numbers, the attacker can infer
the size of files that were written. To protect against this
situation, you can encrypt the journal.
@@ -177,17 +177,45 @@ bitmap_flush_interval:number
The bitmap flush interval in milliseconds. The metadata buffers
are synchronized when this interval expires.
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
+
fix_padding
Use a smaller padding of the tag area that is more
space-efficient. If this option is not present, large padding is
used - that is for compatibility with older kernels.
+fix_hmac
+ Improve security of internal_hash and journal_mac:
+
+ - the section number is mixed into the mac, so that an attacker can't
+ copy sectors from one journal section to another journal section
+ - the superblock is protected by journal_mac
+ - a 16-byte salt stored in the superblock is mixed into the mac, so
+ that the attacker can't detect that two disks have the same hmac
+ key and also to prevent the attacker from moving sectors from one
+ disk to another
+
+legacy_recalculate
+ Allow recalculating of volumes with HMAC keys. This is disabled by
+ default for security reasons - an attacker could modify the volume,
+ set recalc_sector to zero, and the kernel would not detect the
+ modification.
+
+The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
+allow_discards can be changed when reloading the target (load an inactive
+table and swap the tables with suspend and resume). The other arguments
+should not be changed when reloading the target because the layout of disk
+data depends on them and the reloaded target would be non-functional.
+
+
+Status line:
-The journal mode (D/J), buffer_sectors, journal_watermark, commit_time can
-be changed when reloading the target (load an inactive table and swap the
-tables with suspend and resume). The other arguments should not be changed
-when reloading the target because the layout of disk data depend on them
-and the reloaded target would be non-functional.
+1. the number of integrity mismatches
+2. provided data sectors - that is the number of sectors that the user
+ could use
+3. the current recalculating position (or '-' if we didn't recalculate)
The layout of the formatted block device:
diff --git a/Documentation/admin-guide/device-mapper/dm-raid.rst b/Documentation/admin-guide/device-mapper/dm-raid.rst
index 695a2ea1d1ae..bb17e26e3c1b 100644
--- a/Documentation/admin-guide/device-mapper/dm-raid.rst
+++ b/Documentation/admin-guide/device-mapper/dm-raid.rst
@@ -71,7 +71,7 @@ The target is named "raid" and it accepts the following parameters::
============= ===============================================================
Reference: Chapter 4 of
- http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
+ https://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
<#raid_params>: The number of parameters that follow.
@@ -418,6 +418,6 @@ Version History
specific devices are requested via rebuild. Fix RAID leg
rebuild errors.
1.15.0 Fix size extensions not being synchronized in case of new MD bitmap
- pages allocated; also fix those not occuring after previous reductions
+ pages allocated; also fix those not occurring after previous reductions
1.15.1 Fix argument count and arguments for rebuild/write_mostly/journal_(dev|mode)
on the status line.
diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
index 07f56ebc1730..0fac051caeac 100644
--- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
+++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
@@ -14,7 +14,7 @@ host-aware zoned block devices.
For a more detailed description of the zoned block device models and
their constraints see (for SCSI devices):
-http://www.t10.org/drafts.htm#ZBC_Family
+https://www.t10.org/drafts.htm#ZBC_Family
and (for ATA devices):
@@ -24,7 +24,7 @@ The dm-zoned implementation is simple and minimizes system overhead (CPU
and memory usage as well as storage capacity loss). For a 10TB
host-managed disk with 256 MB zones, dm-zoned memory usage per disk
instance is at most 4.5 MB and as little as 5 zones will be used
-internally for storing metadata and performaing reclaim operations.
+internally for storing metadata and performing reclaim operations.
dm-zoned target devices are formatted and checked using the dmzadm
utility available at:
@@ -37,9 +37,13 @@ Algorithm
dm-zoned implements an on-disk buffering scheme to handle non-sequential
write accesses to the sequential zones of a zoned block device.
Conventional zones are used for caching as well as for storing internal
-metadata.
+metadata. It can also use a regular block device together with the zoned
+block device; in that case the regular block device will be split logically
+in zones with the same size as the zoned block device. These zones will be
+placed in front of the zones from the zoned block device and will be handled
+just like conventional zones.
-The zones of the device are separated into 2 types:
+The zones of the device(s) are separated into 2 types:
1) Metadata zones: these are conventional zones used to store metadata.
Metadata zones are not reported as useable capacity to the user.
@@ -98,7 +102,7 @@ the buffer zone assigned. If the accessed chunk has no mapping, or the
accessed blocks are invalid, the read buffer is zeroed and the read
operation terminated.
-After some time, the limited number of convnetional zones available may
+After some time, the limited number of conventional zones available may
be exhausted (all used to map chunks or buffer sequential zones) and
unaligned writes to unbuffered chunks become impossible. To avoid this
situation, a reclaim process regularly scans used conventional zones and
@@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
discard requests. Read requests can be processed concurrently while
metadata flush is being executed.
+If a regular device is used in conjunction with the zoned block device,
+a third set of metadata (without the zone bitmaps) is written to the
+start of the zoned block device. This metadata has a generation counter of
+'0' and will never be updated during normal operation; it just serves for
+identification purposes. The first and second copy of the metadata
+are located at the start of the regular block device.
+
Usage
=====
@@ -138,9 +149,46 @@ Ex::
dmzadm --format /dev/sdxx
-For a formatted device, the target can be created normally with the
-dmsetup utility. The only parameter that dm-zoned requires is the
-underlying zoned block device name. Ex::
- echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
- dmsetup create dmz-`basename ${dev}`
+If two drives are to be used, both devices must be specified, with the
+regular block device as the first device.
+
+Ex::
+
+ dmzadm --format /dev/sdxx /dev/sdyy
+
+
+Formatted device(s) can be started with the dmzadm utility, too:
+
+Ex::
+
+ dmzadm --start /dev/sdxx /dev/sdyy
+
+
+Information about the internal layout and current usage of the zones can
+be obtained with the 'status' callback from dmsetup:
+
+Ex::
+
+ dmsetup status /dev/dm-X
+
+will return a line
+
+ 0 <size> zoned <nr_zones> zones <nr_unmap_rnd>/<nr_rnd> random <nr_unmap_seq>/<nr_seq> sequential
+
+where <nr_zones> is the total number of zones, <nr_unmap_rnd> is the number
+of unmapped (i.e. free) random zones, <nr_rnd> the total number of random zones,
+<nr_unmap_seq> the number of unmapped sequential zones, and <nr_seq> the
+total number of sequential zones.
+
+Normally the reclaim process will be started once fewer than 50 percent of
+the random zones are free. In order to start the reclaim process manually
+even before reaching this threshold, the 'dmsetup message' function can be
+used:
+
+Ex::
+
+ dmsetup message /dev/dm-X 0 reclaim
+
+will start the reclaim process and random zones will be moved to sequential
+zones.
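The status line described in dm-zoned.rst above can be turned into numbers before deciding whether to send the 'reclaim' message. The following is a minimal Python sketch, assuming the line matches the documented format exactly; the function names and the sample values are hypothetical, and the 50 percent figure is the threshold quoted in the text.

```python
import re

# Mirrors: 0 <size> zoned <nr_zones> zones <nr_unmap_rnd>/<nr_rnd> random
#          <nr_unmap_seq>/<nr_seq> sequential
STATUS_RE = re.compile(
    r"^(?P<start>\d+) (?P<size>\d+) zoned (?P<nr_zones>\d+) zones "
    r"(?P<nr_unmap_rnd>\d+)/(?P<nr_rnd>\d+) random "
    r"(?P<nr_unmap_seq>\d+)/(?P<nr_seq>\d+) sequential$"
)


def parse_zoned_status(line):
    """Parse one dm-zoned status line into a dict of integers."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        raise ValueError("not a dm-zoned status line: %r" % line)
    return {name: int(value) for name, value in match.groupdict().items()}


def free_random_percent(status):
    """Percentage of random zones that are currently unmapped (free)."""
    if status["nr_rnd"] == 0:
        return 0.0
    return 100.0 * status["nr_unmap_rnd"] / status["nr_rnd"]


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    sample = "0 1048576 zoned 64 zones 10/16 random 40/48 sequential"
    status = parse_zoned_status(sample)
    print(status, free_random_percent(status))
```

A caller could compare the result of `free_random_percent()` against 50 to see whether the automatic reclaim described above would already be running.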
diff --git a/Documentation/admin-guide/device-mapper/index.rst b/Documentation/admin-guide/device-mapper/index.rst
index ec62fcc8eece..cde52cc09645 100644
--- a/Documentation/admin-guide/device-mapper/index.rst
+++ b/Documentation/admin-guide/device-mapper/index.rst
@@ -11,7 +11,9 @@ Device Mapper
dm-clone
dm-crypt
dm-dust
+ dm-ebs
dm-flakey
+ dm-ima
dm-init
dm-integrity
dm-io
diff --git a/Documentation/admin-guide/device-mapper/verity.rst b/Documentation/admin-guide/device-mapper/verity.rst
index bb02caa45289..a65c1602cb23 100644
--- a/Documentation/admin-guide/device-mapper/verity.rst
+++ b/Documentation/admin-guide/device-mapper/verity.rst
@@ -69,7 +69,7 @@ Construction Parameters
<#opt_params>
Number of optional parameters. If there are no optional parameters,
- the optional paramaters section can be skipped or #opt_params can be zero.
+ the optional parameters section can be skipped or #opt_params can be zero.
Otherwise #opt_params is the number of following arguments.
Example of optional parameters section:
@@ -83,6 +83,10 @@ restart_on_corruption
not compatible with ignore_corruption and requires user space support to
avoid restart loops.
+panic_on_corruption
+ Panic the device when a corrupted block is discovered. This option is
+ not compatible with ignore_corruption and restart_on_corruption.
+
ignore_zero_blocks
Do not verify blocks that are expected to contain zeroes and always return
zeroes instead. This may be useful if the partition contains unused blocks
@@ -130,7 +134,16 @@ root_hash_sig_key_desc <key_description>
the pkcs7 signature of the roothash. The pkcs7 signature is used to validate
the root hash during the creation of the device mapper block device.
Verification of roothash depends on the config DM_VERITY_VERIFY_ROOTHASH_SIG
- being set in the kernel.
+ being set in the kernel. The signatures are checked against the builtin
+ trusted keyring by default, or the secondary trusted keyring if
+ DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING is set. The secondary
+ trusted keyring includes by default the builtin trusted keyring, and it can
+ also gain new certificates at run time if they are signed by a certificate
+ already in the secondary trusted keyring.
+
+try_verify_in_tasklet
+ If verity hashes are in cache, verify data blocks in kernel tasklet instead
+ of workqueue. This option can reduce IO latency.
Theory of operation
===================
diff --git a/Documentation/admin-guide/device-mapper/writecache.rst b/Documentation/admin-guide/device-mapper/writecache.rst
index d3d7690f5e8d..60c16b7fd5ac 100644
--- a/Documentation/admin-guide/device-mapper/writecache.rst
+++ b/Documentation/admin-guide/device-mapper/writecache.rst
@@ -12,7 +12,6 @@ first sector should contain valid superblock from previous invocation.
Constructor parameters:
1. type of the cache device - "p" or "s"
-
- p - persistent memory
- s - SSD
2. the underlying device that will be cached
@@ -37,10 +36,10 @@ Constructor parameters:
autocommit_blocks n (default: 64 for pmem, 65536 for ssd)
when the application writes this amount of blocks without
issuing the FLUSH request, the blocks are automatically
- commited
+ committed
autocommit_time ms (default: 1000)
autocommit time in milliseconds. The data is automatically
- commited if this time passes and no FLUSH request is
+ committed if this time passes and no FLUSH request is
received
fua (by default on)
applicable only to persistent memory - use the FUA flag
@@ -53,19 +52,51 @@ Constructor parameters:
- some underlying devices perform better with fua, some
with nofua. The user should test it
+ cleaner
+ when this option is activated (either in the constructor
+ arguments or by a message), the cache will not promote
+ new writes (however, writes to already cached blocks are
+ promoted, to avoid data corruption due to misordered
+ writes) and it will gradually writeback any cached
+ data. The userspace can then monitor the cleaning
+ process with "dmsetup status". When the number of cached
+ blocks drops to zero, userspace can unload the
+ dm-writecache target and replace it with dm-linear or
+ other targets.
+ max_age n
+ specifies the maximum age of a block in milliseconds. If
+ a block is stored in the cache for too long, it will be
+ written to the underlying device and cleaned up.
+ metadata_only
+ only metadata is promoted to the cache. This option
+ improves performance for heavier REQ_META workloads.
+ pause_writeback n (default: 3000)
+ pause writeback if there was some write I/O redirected to
+ the origin volume in the last n milliseconds
Status:
+
1. error indicator - 0 if there was no error, otherwise error number
2. the number of blocks
3. the number of free blocks
4. the number of blocks under writeback
+5. the number of read blocks
+6. the number of read blocks that hit the cache
+7. the number of write blocks
+8. the number of write blocks that hit uncommitted block
+9. the number of write blocks that hit committed block
+10. the number of write blocks that bypass the cache
+11. the number of write blocks that are allocated in the cache
+12. the number of write requests that are blocked on the freelist
+13. the number of flush requests
+14. the number of discarded blocks
Messages:
flush
- flush the cache device. The message returns successfully
+ Flush the cache device. The message returns successfully
if the cache device was flushed without an error
flush_on_suspend
- flush the cache device on next suspend. Use this message
+ Flush the cache device on next suspend. Use this message
when you are going to remove the cache device. The proper
sequence for removing the cache device is:
@@ -77,3 +108,7 @@ Messages:
5. resume the device, so that it will use the linear
target
6. the cache device is now inactive and it can be deleted
+ cleaner
+ See above "cleaner" constructor documentation.
+ clear_stats
+ Clear the statistics that are reported on the status line
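
The numbered status fields listed above lend themselves to a small parser. This is a sketch only: the field names are invented for this example, and the input is assumed to be just the target-specific values, that is, whatever follows the start, length and target name in the "dmsetup status" output. Kernels that predate the statistics fields report only the first four values, so shorter input is accepted.

```python
# Names are this sketch's own labels for status fields 1-14 as documented.
WRITECACHE_STATUS_FIELDS = (
    "error",
    "blocks",
    "free_blocks",
    "writeback_blocks",
    "read_blocks",
    "read_hits",
    "write_blocks",
    "write_hits_uncommitted",
    "write_hits_committed",
    "writes_bypassing_cache",
    "writes_allocated",
    "writes_blocked_on_freelist",
    "flushes",
    "discarded_blocks",
)


def parse_writecache_status(values):
    """Map the whitespace-separated status values onto field names."""
    numbers = [int(token) for token in values.split()]
    if len(numbers) < 4:
        raise ValueError("expected at least 4 status values")
    # zip() stops at the shorter sequence, so extra or missing trailing
    # values are tolerated.
    return dict(zip(WRITECACHE_STATUS_FIELDS, numbers))


def cached_blocks(status):
    """Blocks currently holding data: total blocks minus free blocks."""
    return status["blocks"] - status["free_blocks"]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    status = parse_writecache_status("0 1000 250 10 5000 4000 3000 100 200 50 2650 0 12 7")
    print(cached_blocks(status), status["read_hits"] / status["read_blocks"])
```

When the "cleaner" option is active, polling cached_blocks() until it reaches zero is one way to follow the cleaning process described above.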
diff --git a/Documentation/admin-guide/devices.rst b/Documentation/admin-guide/devices.rst
index d41671aeaef0..e3776d77374b 100644
--- a/Documentation/admin-guide/devices.rst
+++ b/Documentation/admin-guide/devices.rst
@@ -7,17 +7,16 @@ This list is the Linux Device List, the official registry of allocated
device numbers and ``/dev`` directory nodes for the Linux operating
system.
-The LaTeX version of this document is no longer maintained, nor is
-the document that used to reside at lanana.org. This version in the
-mainline Linux kernel is the master document. Updates shall be sent
-as patches to the kernel maintainers (see the
+The version of this document at lanana.org is no longer maintained. This
+version in the mainline Linux kernel is the master document. Updates
+shall be sent as patches to the kernel maintainers (see the
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` document).
Specifically explore the sections titled "CHAR and MISC DRIVERS", and
"BLOCK LAYER" in the MAINTAINERS file to find the right maintainers
to involve for character and block devices.
This document is included by reference into the Filesystem Hierarchy
-Standard (FHS). The FHS is available from http://www.pathname.com/fhs/.
+Standard (FHS). The FHS is available from https://www.pathname.com/fhs/.
Allocations marked (68k/Amiga) apply to Linux/68k on the Amiga
platform only. Allocations marked (68k/Atari) apply to Linux/68k on
diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
index 2a97aaec8b12..9764d6edb189 100644
--- a/Documentation/admin-guide/devices.txt
+++ b/Documentation/admin-guide/devices.txt
@@ -4,7 +4,7 @@
1 char Memory devices
1 = /dev/mem Physical memory access
- 2 = /dev/kmem Kernel virtual memory access
+ 2 = /dev/kmem OBSOLETE - replaced by /proc/kcore
3 = /dev/null Null device
4 = /dev/port I/O port access
5 = /dev/zero Null byte source
@@ -289,7 +289,7 @@
152 = /dev/kpoll Kernel Poll Driver
153 = /dev/mergemem Memory merge device
154 = /dev/pmu Macintosh PowerBook power manager
- 155 = /dev/isictl MultiTech ISICom serial control
+ 155 =
156 = /dev/lcd Front panel LCD display
157 = /dev/ac Applicom Intl Profibus card
158 = /dev/nwbutton Netwinder external button
@@ -375,8 +375,9 @@
239 = /dev/uhid User-space I/O driver support for HID subsystem
240 = /dev/userio Serio driver testing device
241 = /dev/vhost-vsock Host kernel driver for virtio vsock
+ 242 = /dev/rfkill Turning off radio transmissions (rfkill)
- 242-254 Reserved for local use
+ 243-254 Reserved for local use
255 Reserved for MISC_DYNAMIC_MINOR
11 char Raw keyboard device (Linux/SPARC only)
@@ -476,11 +477,6 @@
18 block Sanyo CD-ROM
0 = /dev/sjcd Sanyo CD-ROM
- 19 char Cyclades serial card
- 0 = /dev/ttyC0 First Cyclades port
- ...
- 31 = /dev/ttyC31 32nd Cyclades port
-
19 block "Double" compressed disk
0 = /dev/double0 First compressed disk
...
@@ -492,11 +488,6 @@
See the Double documentation for the meaning of the
mirror devices.
- 20 char Cyclades serial card - alternate devices
- 0 = /dev/cub0 Callout device for ttyC0
- ...
- 31 = /dev/cub31 Callout device for ttyC31
-
20 block Hitachi CD-ROM (under development)
0 = /dev/hitcd Hitachi CD-ROM
@@ -1442,7 +1433,7 @@
...
The driver and documentation may be obtained from
- http://www.winradio.com/
+ https://www.winradio.com/
82 block I2O hard disk
0 = /dev/i2o/hdag 33rd I2O hard disk, whole disk
@@ -1656,12 +1647,12 @@
dynamically, so there is no fixed mapping from subdevice
pathnames to minor numbers.
- See http://www.comedi.org/ for information about the Comedi
+ See https://www.comedi.org/ for information about the Comedi
project.
98 block User-mode virtual block device
0 = /dev/ubda First user-mode block device
- 16 = /dev/udbb Second user-mode block device
+ 16 = /dev/ubdb Second user-mode block device
...
Partitions are handled in the same way as for IDE
@@ -1723,7 +1714,7 @@
implementations a kernel presence for caching and easy
mounting. For more information about the project,
write to <arla-drinkers@stacken.kth.se> or see
- http://www.stacken.kth.se/project/arla/
+ https://www.stacken.kth.se/project/arla/
103 block Audit device
0 = /dev/audit Audit device
@@ -1942,7 +1933,7 @@
...
255= /dev/umem/d15p15 15th partition of 16th board.
- 117 char COSA/SRP synchronous serial card
+ 117 char [REMOVED] COSA/SRP synchronous serial card
0 = /dev/cosa0c0 1st board, 1st channel
1 = /dev/cosa0c1 1st board, 2nd channel
...
@@ -2348,13 +2339,7 @@
disks (see major number 3) except that the limit on
partitions is 31.
- 162 char Raw block device interface
- 0 = /dev/rawctl Raw I/O control device
- 1 = /dev/raw/raw1 First raw I/O device
- 2 = /dev/raw/raw2 Second raw I/O device
- ...
- max minor number of raw device is set by kernel config
- MAX_RAW_DEVS or raw module parameter 'max_raw_devs'
+ 162 char Used for (now removed) raw block device interface
163 char
@@ -3002,10 +2987,10 @@
65 = /dev/infiniband/issm1 Second InfiniBand IsSM device
...
127 = /dev/infiniband/issm63 63rd InfiniBand IsSM device
- 128 = /dev/infiniband/uverbs0 First InfiniBand verbs device
- 129 = /dev/infiniband/uverbs1 Second InfiniBand verbs device
+ 192 = /dev/infiniband/uverbs0 First InfiniBand verbs device
+ 193 = /dev/infiniband/uverbs1 Second InfiniBand verbs device
...
- 159 = /dev/infiniband/uverbs31 31st InfiniBand verbs device
+ 223 = /dev/infiniband/uverbs31 31st InfiniBand verbs device
232 char Biometric Devices
0 = /dev/biometric/sensor0/fingerprint first fingerprint sensor on first device
diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst b/Documentation/admin-guide/dynamic-debug-howto.rst
index 252e5ef324e5..faa22f77847a 100644
--- a/Documentation/admin-guide/dynamic-debug-howto.rst
+++ b/Documentation/admin-guide/dynamic-debug-howto.rst
@@ -5,135 +5,115 @@ Dynamic debug
Introduction
============
-This document describes how to use the dynamic debug (dyndbg) feature.
+Dynamic debug allows you to dynamically enable/disable kernel
+debug-print code to obtain additional kernel information.
-Dynamic debug is designed to allow you to dynamically enable/disable
-kernel code to obtain additional kernel information. Currently, if
-``CONFIG_DYNAMIC_DEBUG`` is set, then all ``pr_debug()``/``dev_dbg()`` and
-``print_hex_dump_debug()``/``print_hex_dump_bytes()`` calls can be dynamically
-enabled per-callsite.
+If ``/proc/dynamic_debug/control`` exists, your kernel has dynamic
+debug. You'll need root access (sudo su) to use this.
-If ``CONFIG_DYNAMIC_DEBUG`` is not set, ``print_hex_dump_debug()`` is just
-shortcut for ``print_hex_dump(KERN_DEBUG)``.
+Dynamic debug provides:
-For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, format string is
-its ``prefix_str`` argument, if it is constant string; or ``hexdump``
-in case ``prefix_str`` is built dynamically.
+ * a Catalog of all *prdbgs* in your kernel.
+ ``cat /proc/dynamic_debug/control`` to see them.
-Dynamic debug has even more useful features:
-
- * Simple query language allows turning on and off debugging
- statements by matching any combination of 0 or 1 of:
+ * a Simple query/command language to alter *prdbgs* by selecting on
+ any combination of 0 or 1 of:
- source filename
- function name
- line number (including ranges of line numbers)
- module name
- format string
-
- * Provides a debugfs control file: ``<debugfs>/dynamic_debug/control``
- which can be read to display the complete list of known debug
- statements, to help guide you
-
-Controlling dynamic debug Behaviour
-===================================
-
-The behaviour of ``pr_debug()``/``dev_dbg()`` are controlled via writing to a
-control file in the 'debugfs' filesystem. Thus, you must first mount
-the debugfs filesystem, in order to make use of this feature.
-Subsequently, we refer to the control file as:
-``<debugfs>/dynamic_debug/control``. For example, if you want to enable
-printing from source file ``svcsock.c``, line 1603 you simply do::
-
- nullarbor:~ # echo 'file svcsock.c line 1603 +p' >
- <debugfs>/dynamic_debug/control
-
-If you make a mistake with the syntax, the write will fail thus::
-
- nullarbor:~ # echo 'file svcsock.c wtf 1 +p' >
- <debugfs>/dynamic_debug/control
- -bash: echo: write error: Invalid argument
+ - class name (as known/declared by each module)
Viewing Dynamic Debug Behaviour
===============================
-You can view the currently configured behaviour of all the debug
-statements via::
+You can view the currently configured behaviour in the *prdbg* catalog::
- nullarbor:~ # cat <debugfs>/dynamic_debug/control
+ :#> head -n7 /proc/dynamic_debug/control
# filename:lineno [module]function flags format
- /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:323 [svcxprt_rdma]svc_rdma_cleanup =_ "SVCRDMA Module Removed, deregister RPC RDMA transport\012"
- /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:341 [svcxprt_rdma]svc_rdma_init =_ "\011max_inline : %d\012"
- /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:340 [svcxprt_rdma]svc_rdma_init =_ "\011sq_depth : %d\012"
- /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:338 [svcxprt_rdma]svc_rdma_init =_ "\011max_requests : %d\012"
- ...
+ init/main.c:1179 [main]initcall_blacklist =_ "blacklisting initcall %s\012"
+ init/main.c:1218 [main]initcall_blacklisted =_ "initcall %s blacklisted\012"
+ init/main.c:1424 [main]run_init_process =_ " with arguments:\012"
+ init/main.c:1426 [main]run_init_process =_ " %s\012"
+ init/main.c:1427 [main]run_init_process =_ " with environment:\012"
+ init/main.c:1429 [main]run_init_process =_ " %s\012"
+The 3rd space-delimited column shows the current flags, preceded by
+a ``=`` for easy use with grep/cut. ``=p`` shows enabled callsites.
-You can also apply standard Unix text manipulation filters to this
-data, e.g.::
+Controlling dynamic debug Behaviour
+===================================
- nullarbor:~ # grep -i rdma <debugfs>/dynamic_debug/control | wc -l
- 62
+The behaviour of *prdbg* sites is controlled by writing
+query/commands to the control file. Example::
- nullarbor:~ # grep -i tcp <debugfs>/dynamic_debug/control | wc -l
- 42
+ # grease the interface
+ :#> alias ddcmd='echo $* > /proc/dynamic_debug/control'
-The third column shows the currently enabled flags for each debug
-statement callsite (see below for definitions of the flags). The
-default value, with no flags enabled, is ``=_``. So you can view all
-the debug statement callsites with any non-default flags::
+ :#> ddcmd '-p; module main func run* +p'
+ :#> grep =p /proc/dynamic_debug/control
+ init/main.c:1424 [main]run_init_process =p " with arguments:\012"
+ init/main.c:1426 [main]run_init_process =p " %s\012"
+ init/main.c:1427 [main]run_init_process =p " with environment:\012"
+ init/main.c:1429 [main]run_init_process =p " %s\012"
- nullarbor:~ # awk '$3 != "=_"' <debugfs>/dynamic_debug/control
- # filename:lineno [module]function flags format
- /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c:1603 [sunrpc]svc_send p "svc_process: st_sendto returned %d\012"
+Error messages go to console/syslog::
+
+ :#> ddcmd mode foo +p
+ dyndbg: unknown keyword "mode"
+ dyndbg: query parse failed
+ bash: echo: write error: Invalid argument
+
+If debugfs is also enabled and mounted, ``dynamic_debug/control`` is
+also under the mount-dir, typically ``/sys/kernel/debug/``.
Command Language Reference
==========================
-At the lexical level, a command comprises a sequence of words separated
+At the basic lexical level, a command is a sequence of words separated
by spaces or tabs. So these are all equivalent::
- nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
- <debugfs>/dynamic_debug/control
- nullarbor:~ # echo -n ' file svcsock.c line 1603 +p ' >
- <debugfs>/dynamic_debug/control
- nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
- <debugfs>/dynamic_debug/control
+ :#> ddcmd file svcsock.c line 1603 +p
+ :#> ddcmd "file svcsock.c line 1603 +p"
+ :#> ddcmd ' file svcsock.c line 1603 +p '
Command submissions are bounded by a write() system call.
Multiple commands can be written together, separated by ``;`` or ``\n``::
- ~# echo "func pnpacpi_get_resources +p; func pnp_assign_mem +p" \
- > <debugfs>/dynamic_debug/control
-
-If your query set is big, you can batch them too::
-
- ~# cat query-batch-file > <debugfs>/dynamic_debug/control
+ :#> ddcmd "func pnpacpi_get_resources +p; func pnp_assign_mem +p"
+ :#> ddcmd <<"EOC"
+ func pnpacpi_get_resources +p
+ func pnp_assign_mem +p
+ EOC
+ :#> cat query-batch-file > /proc/dynamic_debug/control
-Another way is to use wildcards. The match rule supports ``*`` (matches
-zero or more characters) and ``?`` (matches exactly one character). For
-example, you can match all usb drivers::
+You can also use wildcards in each query term. The match rule supports
+``*`` (matches zero or more characters) and ``?`` (matches exactly one
+character). For example, you can match all usb drivers::
- ~# echo "file drivers/usb/* +p" > <debugfs>/dynamic_debug/control
+ :#> ddcmd file "drivers/usb/*" +p # "" to suppress shell expansion
-At the syntactical level, a command comprises a sequence of match
-specifications, followed by a flags change specification::
+Syntactically, a command is a sequence of keyword-value pairs, followed
+by a flags change or setting::
command ::= match-spec* flags-spec
-The match-spec's are used to choose a subset of the known pr_debug()
-callsites to which to apply the flags-spec. Think of them as a query
-with implicit ANDs between each pair. Note that an empty list of
-match-specs will select all debug statement callsites.
+The match-specs select *prdbgs* from the catalog, and the flags-spec is
+applied to the selected callsites. All constraints are ANDed together. An
+absent keyword is the same as keyword "*".
-A match specification comprises a keyword, which controls the
-attribute of the callsite to be compared, and a value to compare
-against. Possible keywords are:::
+
+A match specification is a keyword, which selects the attribute of
+the callsite to be compared, and a value to compare against. Possible
+keywords are::
match-spec ::= 'func' string |
'file' string |
'module' string |
'format' string |
+ 'class' string |
'line' line-range
line-range ::= lineno |
@@ -156,15 +136,18 @@ func
of each callsite. Example::
func svc_tcp_accept
+ func *recv* # in rfcomm, bluetooth, ping, tcp
file
- The given string is compared against either the full pathname, the
- src-root relative pathname, or the basename of the source file of
- each callsite. Examples::
+ The given string is compared against either the src-root relative
+ pathname, or the basename of the source file of each callsite.
+ Examples::
file svcsock.c
- file kernel/freezer.c
- file /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c
+ file kernel/freezer.c # ie column 1 of control file
+ file drivers/usb/* # all callsites under it
+ file inode.c:start_* # parse :tail as a func (above)
+ file inode.c:1-100 # parse :tail as a line-range (above)
module
The given string is compared against the module name
@@ -174,6 +157,7 @@ module
module sunrpc
module nfsd
+ module drm* # both drm, drm_kms_helper
format
The given string is searched for in the dynamic debug format
@@ -191,6 +175,16 @@ format
format "nfsd: SETATTR" // a neater way to match a format with whitespace
format 'nfsd: SETATTR' // yet another way to match a format with whitespace
+class
+ The given class_name is validated against each module, which may
+ have declared a list of known class_names. If the class_name is
+ found for a module, callsite & class matching and adjustment
+ proceeds. Examples::
+
+ class DRM_UT_KMS # a DRM.debug category
+ class JUNK # silent non-match
+ // class TLD_* # NOTICE: no wildcard in class names
+
line
The given line number or range of line numbers is compared
against the line number of each ``pr_debug()`` callsite. A single
@@ -216,17 +210,16 @@ of the characters::
The flags are::
p enables the pr_debug() callsite.
- f Include the function name in the printed message
- l Include line number in the printed message
- m Include module name in the printed message
- t Include thread ID in messages not generated from interrupt context
- _ No flags are set. (Or'd with others on input)
+ _ enables no flags.
-For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only ``p`` flag
-have meaning, other flags ignored.
+ Decorator flags add to the message-prefix, in order:
+ t Include thread ID, or <intr>
+ m Include module name
+ f Include the function name
+ l Include line number
-For display, the flags are preceded by ``=``
-(mnemonic: what the flags are currently equal to).
+For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only
+the ``p`` flag has meaning; other flags are ignored.
Note the regexp ``^[-+=][flmpt_]+$`` matches a flags specification.
To clear all flags at once, use ``=_`` or ``-flmpt``.
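
A script that generates dyndbg queries can validate the flags specification with the regexp quoted above before writing to the control file. The sketch below is an illustration under that assumption; the helper name split_flags_spec is hypothetical and is not part of any kernel tooling.

```python
import re

# Same character classes as the documented regexp ^[-+=][flmpt_]+$ ;
# fullmatch() is used so a trailing newline is not silently accepted.
FLAGS_SPEC_RE = re.compile(r"[-+=][flmpt_]+")


def split_flags_spec(spec):
    """Return (operator, set of flag characters) for a valid flags-spec."""
    if FLAGS_SPEC_RE.fullmatch(spec) is None:
        raise ValueError("not a valid flags specification: %r" % spec)
    return spec[0], set(spec[1:])


if __name__ == "__main__":
    print(split_flags_spec("+p"))
    print(split_flags_spec("=_"))
    print(split_flags_spec("-flmpt"))
```

Rejecting a malformed flags-spec in userspace avoids a round trip through the "write error: Invalid argument" path shown earlier.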
@@ -237,14 +230,13 @@ Debug messages during Boot Process
To activate debug messages for core code and built-in modules during
the boot process, even before userspace and debugfs exists, use
-``dyndbg="QUERY"``, ``module.dyndbg="QUERY"``, or ``ddebug_query="QUERY"``
-(``ddebug_query`` is obsoleted by ``dyndbg``, and deprecated). QUERY follows
+``dyndbg="QUERY"`` or ``module.dyndbg="QUERY"``. QUERY follows
the syntax described above, but must not exceed 1023 characters. Your
bootloader may impose lower limits.
These ``dyndbg`` params are processed just after the ddebug tables are
-processed, as part of the arch_initcall. Thus you can enable debug
-messages in all code run after this arch_initcall via this boot
+processed, as part of the early_initcall. Thus you can enable debug
+messages in all code run after this early_initcall via this boot
parameter.
On an x86 system for example ACPI enablement is a subsys_initcall and::
@@ -258,8 +250,7 @@ this boot parameter for debugging purposes.
If ``foo`` module is not built-in, ``foo.dyndbg`` will still be processed at
boot time, without effect, but will be reprocessed when module is
-loaded later. ``ddebug_query=`` and bare ``dyndbg=`` are only processed at
-boot.
+loaded later. Bare ``dyndbg=`` is only processed at boot.
Debug Messages at Module Initialization Time
@@ -303,7 +294,7 @@ For ``CONFIG_DYNAMIC_DEBUG`` kernels, any settings given at boot-time (or
enabled by ``-DDEBUG`` flag during compilation) can be disabled later via
the debugfs interface if the debug messages are no longer needed::
- echo "module module_name -p" > <debugfs>/dynamic_debug/control
+ echo "module module_name -p" > /proc/dynamic_debug/control
Examples
========
@@ -311,43 +302,75 @@ Examples
::
// enable the message at line 1603 of file svcsock.c
- nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
- <debugfs>/dynamic_debug/control
+ :#> ddcmd 'file svcsock.c line 1603 +p'
// enable all the messages in file svcsock.c
- nullarbor:~ # echo -n 'file svcsock.c +p' >
- <debugfs>/dynamic_debug/control
+ :#> ddcmd 'file svcsock.c +p'
// enable all the messages in the NFS server module
- nullarbor:~ # echo -n 'module nfsd +p' >
- <debugfs>/dynamic_debug/control
+ :#> ddcmd 'module nfsd +p'
// enable all 12 messages in the function svc_process()
- nullarbor:~ # echo -n 'func svc_process +p' >
- <debugfs>/dynamic_debug/control
+ :#> ddcmd 'func svc_process +p'
// disable all 12 messages in the function svc_process()
- nullarbor:~ # echo -n 'func svc_process -p' >
- <debugfs>/dynamic_debug/control
+ :#> ddcmd 'func svc_process -p'
// enable messages for NFS calls READ, READLINK, READDIR and READDIR+.
- nullarbor:~ # echo -n 'format "nfsd: READ" +p' >
- <debugfs>/dynamic_debug/control
+ :#> ddcmd 'format "nfsd: READ" +p'
// enable messages in files of which the paths include string "usb"
- nullarbor:~ # echo -n '*usb* +p' > <debugfs>/dynamic_debug/control
+ :#> ddcmd 'file *usb* +p' > /proc/dynamic_debug/control
// enable all messages
- nullarbor:~ # echo -n '+p' > <debugfs>/dynamic_debug/control
+ :#> ddcmd '+p' > /proc/dynamic_debug/control
// add module, function to all enabled messages
- nullarbor:~ # echo -n '+mf' > <debugfs>/dynamic_debug/control
+ :#> ddcmd '+mf' > /proc/dynamic_debug/control
// boot-args example, with newlines and comments for readability
Kernel command line: ...
// see whats going on in dyndbg=value processing
- dynamic_debug.verbose=1
- // enable pr_debugs in 2 builtins, #cmt is stripped
- dyndbg="module params +p #cmt ; module sys +p"
+ dynamic_debug.verbose=3
+ // enable pr_debugs in the btrfs module (can be builtin or loadable)
+ btrfs.dyndbg="+p"
+ // enable pr_debugs in all files under init/
+ // and the function parse_one, #cmt is stripped
+ dyndbg="file init/* +p #cmt ; func parse_one +p"
// enable pr_debugs in 2 functions in a module loaded later
pc87360.dyndbg="func pc87360_init_device +p; func pc87360_find +p"
+
+Kernel Configuration
+====================
+
+Dynamic Debug is enabled via kernel config items::
+
+ CONFIG_DYNAMIC_DEBUG=y # build catalog, enables CORE
+ CONFIG_DYNAMIC_DEBUG_CORE=y # enable mechanics only, skip catalog
+
+If you do not want to enable dynamic debug globally (e.g. in some embedded
+systems), you may set ``CONFIG_DYNAMIC_DEBUG_CORE`` for basic dynamic debug
+support and add ``ccflags := -DDYNAMIC_DEBUG_MODULE`` to the Makefile of any
+modules which you'd like to dynamically debug later.
+
+
+Kernel *prdbg* API
+==================
+
+The following functions are cataloged and controllable when dynamic
+debug is enabled::
+
+ pr_debug()
+ dev_dbg()
+ print_hex_dump_debug()
+ print_hex_dump_bytes()
+
+Otherwise, they are off by default; ``ccflags += -DDEBUG`` or
+``#define DEBUG`` in a source file will enable them appropriately.
+
+If ``CONFIG_DYNAMIC_DEBUG`` is not set, ``print_hex_dump_debug()`` is
+just a shortcut for ``print_hex_dump(KERN_DEBUG)``.
+
+For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, the format string
+is the ``prefix_str`` argument if it is a constant string, or ``hexdump``
+if ``prefix_str`` is built dynamically.
diff --git a/Documentation/admin-guide/edid.rst b/Documentation/admin-guide/edid.rst
new file mode 100644
index 000000000000..80deeb21a265
--- /dev/null
+++ b/Documentation/admin-guide/edid.rst
@@ -0,0 +1,60 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====
+EDID
+====
+
+In the good old days when graphics parameters were configured explicitly
+in a file called xorg.conf, even broken hardware could be managed.
+
+Today, with the advent of Kernel Mode Setting, a graphics board is
+either correctly working because all components follow the standards -
+or the computer is unusable, because the screen remains dark after
+booting or it displays the wrong area. Cases when this happens are:
+
+- The graphics board does not recognize the monitor.
+- The graphics board is unable to detect any EDID data.
+- The graphics board incorrectly forwards EDID data to the driver.
+- The monitor sends no or bogus EDID data.
+- A KVM sends its own EDID data instead of querying the connected monitor.
+
+Adding the kernel parameter "nomodeset" helps in most cases, but causes
+restrictions later on.
+
+As a remedy for such situations, the kernel configuration item
+CONFIG_DRM_LOAD_EDID_FIRMWARE was introduced. It allows an individually
+prepared or corrected EDID data set to be provided in the /lib/firmware
+directory, from where it is loaded via the firmware interface. The code
+(see drivers/gpu/drm/drm_edid_load.c) contains built-in data sets for
+commonly used screen resolutions (800x600, 1024x768, 1280x1024, 1600x1200,
+1680x1050, 1920x1080) as binary blobs, but the kernel source tree does
+not contain code to create these data. In order to elucidate the origin
+of the built-in binary EDID blobs and to facilitate the creation of
+individual data for a specific misbehaving monitor, commented sources
+and a Makefile environment are given here.
+
+To create binary EDID and C source code files from the existing data
+material, simply type "make" in tools/edid/.
+
+If you want to create your own EDID file, copy the file 1024x768.S,
+replace the settings with your own data and add a new target to the
+Makefile. Please note that the EDID data structure expects the timing
+values in a different form than the standard X11 format.
+
+X11:
+ HTimings:
+ hdisp hsyncstart hsyncend htotal
+ VTimings:
+ vdisp vsyncstart vsyncend vtotal
+
+EDID::
+
+ #define XPIX hdisp
+ #define XBLANK htotal-hdisp
+ #define XOFFSET hsyncstart-hdisp
+ #define XPULSE hsyncend-hsyncstart
+
+ #define YPIX vdisp
+ #define YBLANK vtotal-vdisp
+ #define YOFFSET vsyncstart-vdisp
+ #define YPULSE vsyncend-vsyncstart
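
The mapping from X11 timings to the EDID defines above is plain arithmetic, so it can be checked before editing a copy of 1024x768.S. The following sketch applies exactly the eight formulas listed; the function name x11_to_edid is hypothetical, and the sample numbers are the commonly published 1024x768 at 60 Hz timings, used here only as example input.

```python
def x11_to_edid(hdisp, hsyncstart, hsyncend, htotal,
                vdisp, vsyncstart, vsyncend, vtotal):
    """Convert X11-style timings into the values used by the EDID defines."""
    return {
        "XPIX": hdisp,
        "XBLANK": htotal - hdisp,
        "XOFFSET": hsyncstart - hdisp,
        "XPULSE": hsyncend - hsyncstart,
        "YPIX": vdisp,
        "YBLANK": vtotal - vdisp,
        "YOFFSET": vsyncstart - vdisp,
        "YPULSE": vsyncend - vsyncstart,
    }


if __name__ == "__main__":
    # Example input: 1024x768 at 60 Hz timings in X11 order.
    values = x11_to_edid(1024, 1048, 1184, 1344, 768, 771, 777, 806)
    for name, value in values.items():
        print("#define %s %d" % (name, value))
```

The printed lines follow the shape of the defines shown above, which makes it easy to compare them against the values in an existing source file.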
diff --git a/Documentation/admin-guide/efi-stub.rst b/Documentation/admin-guide/efi-stub.rst
index 833edb0d0bc4..b24e7c40d832 100644
--- a/Documentation/admin-guide/efi-stub.rst
+++ b/Documentation/admin-guide/efi-stub.rst
@@ -7,10 +7,10 @@ as a PE/COFF image, thereby convincing EFI firmware loaders to load
it as an EFI executable. The code that modifies the bzImage header,
along with the EFI-specific entry point that the firmware loader
jumps to are collectively known as the "EFI boot stub", and live in
-arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
+arch/x86/boot/header.S and drivers/firmware/efi/libstub/x86-stub.c,
respectively. For ARM the EFI stub is implemented in
arch/arm/boot/compressed/efi-header.S and
-arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
+drivers/firmware/efi/libstub/arm32-stub.c. EFI stub code that is shared
between architectures is in drivers/firmware/efi/libstub.
For arm64, there is no compressed kernel support, so the Image itself
diff --git a/Documentation/admin-guide/ext4.rst b/Documentation/admin-guide/ext4.rst
index 9443fcef1876..4c559e08d11e 100644
--- a/Documentation/admin-guide/ext4.rst
+++ b/Documentation/admin-guide/ext4.rst
@@ -392,9 +392,16 @@ When mounting an ext4 filesystem, the following option are accepted:
dax
Use direct access (no page cache). See
- Documentation/filesystems/dax.txt. Note that this option is
+ Documentation/filesystems/dax.rst. Note that this option is
incompatible with data=journal.
+ inlinecrypt
+ When possible, encrypt/decrypt the contents of encrypted files using the
+ blk-crypto framework rather than filesystem-layer encryption. This
+ allows the use of inline encryption hardware. The on-disk format is
+ unaffected. For more details, see
+ Documentation/block/inline-encryption.rst.
+
Data Mode
=========
There are 3 different data modes:
@@ -482,6 +489,9 @@ Files in /sys/fs/ext4/<devname>:
multiple of this tuning parameter if the stripe size is not set in the
ext4 superblock
+ mb_max_inode_prealloc
+ The maximum length of per-inode ext4_prealloc_space list.
+
mb_max_to_scan
The maximum number of extents the multiblock allocator will search to
find the best extent.
@@ -522,21 +532,21 @@ Files in /sys/fs/ext4/<devname>:
Ioctls
======
-There is some Ext4 specific functionality which can be accessed by applications
-through the system call interfaces. The list of all Ext4 specific ioctls are
-shown in the table below.
+Ext4 implements various ioctls which can be used by applications to access
+ext4-specific functionality. An incomplete list of these ioctls is shown in the
+table below. This list includes truly ext4-specific ioctls (``EXT4_IOC_*``) as
+well as ioctls that may have been ext4-specific originally but are now supported
+by some other filesystem(s) too (``FS_IOC_*``).
-Table of Ext4 specific ioctls
+Table of Ext4 ioctls
- EXT4_IOC_GETFLAGS
+ FS_IOC_GETFLAGS
Get additional attributes associated with inode. The ioctl argument is
- an integer bitfield, with bit values described in ext4.h. This ioctl is
- an alias for FS_IOC_GETFLAGS.
+ an integer bitfield, with bit values described in ext4.h.
- EXT4_IOC_SETFLAGS
+ FS_IOC_SETFLAGS
Set additional attributes associated with inode. The ioctl argument is
- an integer bitfield, with bit values described in ext4.h. This ioctl is
- an alias for FS_IOC_SETFLAGS.
+ an integer bitfield, with bit values described in ext4.h.
EXT4_IOC_GETVERSION, EXT4_IOC_GETVERSION_OLD
Get the inode i_generation number stored for each inode. The
@@ -611,7 +621,7 @@ kernel source: <file:fs/ext4/>
programs: http://e2fsprogs.sourceforge.net/
-useful links: http://fedoraproject.org/wiki/ext3-devel
+useful links: https://fedoraproject.org/wiki/ext3-devel
http://www.bullopensource.org/ext4/
http://ext4.wiki.kernel.org/index.php/Main_Page
- http://fedoraproject.org/wiki/Features/Ext4
+ https://fedoraproject.org/wiki/Features/Ext4
diff --git a/Documentation/admin-guide/features.rst b/Documentation/admin-guide/features.rst
new file mode 100644
index 000000000000..8c167082a84f
--- /dev/null
+++ b/Documentation/admin-guide/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features
diff --git a/Documentation/admin-guide/filesystem-monitoring.rst b/Documentation/admin-guide/filesystem-monitoring.rst
new file mode 100644
index 000000000000..ab8dba76283c
--- /dev/null
+++ b/Documentation/admin-guide/filesystem-monitoring.rst
@@ -0,0 +1,78 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+File system Monitoring with fanotify
+====================================
+
+File system Error Reporting
+===========================
+
+Fanotify supports the FAN_FS_ERROR event type for file system-wide error
+reporting. It is meant to be used by file system health monitoring
+daemons, which listen for these events and take actions (notify
+sysadmin, start recovery) when a file system problem is detected.
+
+By design, a FAN_FS_ERROR notification exposes sufficient information
+for a monitoring tool to know a problem in the file system has happened.
+It doesn't necessarily provide a user space application with semantics
+to verify an IO operation was successfully executed. That is out of
+scope for this feature. Instead, it is only meant as a framework for
+early file system problem detection and reporting for recovery tools.
+
+When a file system operation fails, it is common for dozens of kernel
+errors to cascade after the initial failure, hiding the original failure
+log, which is usually the most useful debug data to troubleshoot the
+problem. For this reason, FAN_FS_ERROR tries to report only the first
+error that occurred for a file system since the last notification, and
+it simply counts additional errors. This ensures that the most
+important pieces of information are never lost.
+
+FAN_FS_ERROR requires the fanotify group to be set up with the
+FAN_REPORT_FID flag.
+
+At the time of this writing, the only file system that emits FAN_FS_ERROR
+notifications is Ext4.
+
+A FAN_FS_ERROR notification has the following format:
+
+ ::
+
+ [ Notification Metadata (Mandatory) ]
+ [ Generic Error Record (Mandatory) ]
+ [ FID record (Mandatory) ]
+
+The order of records is not guaranteed, and new records might be added
+in the future. Therefore, applications must not rely on the order and
+must be prepared to skip over unknown records. Please refer to
+``samples/fanotify/fs-monitor.c`` for an example parser.
+
+Generic error record
+--------------------
+
+The generic error record provides enough information for a file system
+agnostic tool to learn about a problem in the file system, without
+providing any additional details about the problem. This record is
+identified by ``struct fanotify_event_info_header.info_type`` being set
+to FAN_EVENT_INFO_TYPE_ERROR.
+
+ ::
+
+ struct fanotify_event_info_error {
+ struct fanotify_event_info_header hdr;
+ __s32 error;
+ __u32 error_count;
+ };
+
+The ``error`` field identifies the type of error using errno values.
+``error_count`` tracks the number of errors that occurred and were
+suppressed to preserve the original error information, since the last
+notification.
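+
+The record walk described above can be sketched as follows. This is only a
+minimal illustration, not the code in ``samples/fanotify/fs-monitor.c``. It
+assumes kernel headers that already define FAN_FS_ERROR, a caller with
+sufficient privileges, and a mark on the whole file system
+(FAN_MARK_FILESYSTEM). The buffer size and the output format are arbitrary
+choices of the sketch. Records other than the generic error record,
+including the FID record, are skipped by their length::
+
+    #define _GNU_SOURCE
+    #include <fcntl.h>
+    #include <stdio.h>
+    #include <unistd.h>
+    #include <sys/fanotify.h>
+
+    int main(int argc, char **argv)
+    {
+        /* Aligned so that the event structures can be read in place. */
+        _Alignas(struct fanotify_event_metadata) char buf[4096];
+        ssize_t len;
+        int fd;
+
+        if (argc != 2) {
+            fprintf(stderr, "usage: %s <mount point>\n", argv[0]);
+            return 1;
+        }
+
+        /* FAN_FS_ERROR requires a group created with FAN_REPORT_FID. */
+        fd = fanotify_init(FAN_CLASS_NOTIF | FAN_REPORT_FID, O_RDONLY);
+        if (fd < 0) {
+            perror("fanotify_init");
+            return 1;
+        }
+
+        /* Watch the whole file system that contains argv[1]. */
+        if (fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_FILESYSTEM,
+                          FAN_FS_ERROR, AT_FDCWD, argv[1]) < 0) {
+            perror("fanotify_mark");
+            return 1;
+        }
+
+        while ((len = read(fd, buf, sizeof(buf))) > 0) {
+            struct fanotify_event_metadata *ev = (void *)buf;
+
+            for (; FAN_EVENT_OK(ev, len); ev = FAN_EVENT_NEXT(ev, len)) {
+                char *rec = (char *)ev + ev->metadata_len;
+                char *end = (char *)ev + ev->event_len;
+
+                /* Walk the records in whatever order they arrive. */
+                while (rec + sizeof(struct fanotify_event_info_header) <= end) {
+                    struct fanotify_event_info_header *hdr = (void *)rec;
+
+                    if (hdr->len < sizeof(*hdr))
+                        break;  /* malformed record, stop parsing */
+
+                    if (hdr->info_type == FAN_EVENT_INFO_TYPE_ERROR) {
+                        struct fanotify_event_info_error *err = (void *)rec;
+
+                        printf("error=%d error_count=%u\n",
+                               err->error, err->error_count);
+                    }
+
+                    /* Unknown record types are skipped by length. */
+                    rec += hdr->len;
+                }
+            }
+        }
+
+        close(fd);
+        return 0;
+    }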
+
+FID record
+----------
+
+The FID record can be used to uniquely identify the inode that triggered
+the error through the combination of fsid and file handle. A file system
+specific application can use that information to attempt a recovery
+procedure. Errors that are not related to an inode are reported with an
+empty file handle of type FILEID_INVALID.
diff --git a/Documentation/admin-guide/gpio/gpio-aggregator.rst b/Documentation/admin-guide/gpio/gpio-aggregator.rst
new file mode 100644
index 000000000000..5cd1e7221756
--- /dev/null
+++ b/Documentation/admin-guide/gpio/gpio-aggregator.rst
@@ -0,0 +1,111 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+GPIO Aggregator
+===============
+
+The GPIO Aggregator provides a mechanism to aggregate GPIOs, and expose them as
+a new gpio_chip. This supports the following use cases.
+
+
+Aggregating GPIOs using Sysfs
+-----------------------------
+
+GPIO controllers are exported to userspace using /dev/gpiochip* character
+devices. Access control to these devices is provided by standard UNIX file
+system permissions, on an all-or-nothing basis: either a GPIO controller is
+accessible for a user, or it is not.
+
+The GPIO Aggregator provides access control for a set of one or more GPIOs, by
+aggregating them into a new gpio_chip, which can be assigned to a group or user
+using standard UNIX file ownership and permissions. Furthermore, this
+simplifies and hardens exporting GPIOs to a virtual machine, as the VM can just
+grab the full GPIO controller, and no longer needs to care about which GPIOs to
+grab and which not, reducing the attack surface.
+
+Aggregated GPIO controllers are instantiated and destroyed by writing to
+write-only attribute files in sysfs.
+
+ /sys/bus/platform/drivers/gpio-aggregator/
+
+ "new_device" ...
+ Userspace may ask the kernel to instantiate an aggregated GPIO
+ controller by writing a string describing the GPIOs to
+ aggregate to the "new_device" file, using the format
+
+ .. code-block:: none
+
+ [<gpioA>] [<gpiochipB> <offsets>] ...
+
+ Where:
+
+ "<gpioA>" ...
+ is a GPIO line name,
+
+ "<gpiochipB>" ...
+ is a GPIO chip label, and
+
+ "<offsets>" ...
+ is a comma-separated list of GPIO offsets and/or
+ GPIO offset ranges denoted by dashes.
+
+ Example: Instantiate a new GPIO aggregator by aggregating GPIO
+ line 19 of "e6052000.gpio" and GPIO lines 20-21 of
+ "e6050000.gpio" into a new gpio_chip:
+
+ .. code-block:: sh
+
+ $ echo 'e6052000.gpio 19 e6050000.gpio 20-21' > new_device
+
+ "delete_device" ...
+ Userspace may ask the kernel to destroy an aggregated GPIO
+ controller after use by writing its device name to the
+ "delete_device" file.
+
+ Example: Destroy the previously-created aggregated GPIO
+ controller, assumed to be "gpio-aggregator.0":
+
+ .. code-block:: sh
+
+ $ echo gpio-aggregator.0 > delete_device
+
+
+Generic GPIO Driver
+-------------------
+
+The GPIO Aggregator can also be used as a generic driver for a simple
+GPIO-operated device described in DT, without a dedicated in-kernel driver.
+This is useful in industrial control, and is not unlike e.g. spidev, which
+allows the user to communicate with an SPI device from userspace.
+
+Binding a device to the GPIO Aggregator is performed either by modifying the
+gpio-aggregator driver, or by writing to the "driver_override" file in Sysfs.
+
+Example: If "door" is a GPIO-operated device described in DT, using its own
+compatible value::
+
+ door {
+ compatible = "myvendor,mydoor";
+
+ gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>,
+ <&gpio2 20 GPIO_ACTIVE_LOW>;
+ gpio-line-names = "open", "lock";
+ };
+
+it can be bound to the GPIO Aggregator by either:
+
+1. Adding its compatible value to ``gpio_aggregator_dt_ids[]``,
+2. Binding manually using "driver_override":
+
+.. code-block:: sh
+
+ $ echo gpio-aggregator > /sys/bus/platform/devices/door/driver_override
+ $ echo door > /sys/bus/platform/drivers/gpio-aggregator/bind
+
+After that, a new gpiochip "door" has been created:
+
+.. code-block:: sh
+
+ $ gpioinfo door
+ gpiochip12 - 2 lines:
+ line 0: "open" unused input active-high
+ line 1: "lock" unused input active-high
diff --git a/Documentation/admin-guide/gpio/gpio-mockup.rst b/Documentation/admin-guide/gpio/gpio-mockup.rst
new file mode 100644
index 000000000000..493071da1738
--- /dev/null
+++ b/Documentation/admin-guide/gpio/gpio-mockup.rst
@@ -0,0 +1,51 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+GPIO Testing Driver
+===================
+
+The GPIO Testing Driver (gpio-mockup) provides a way to create simulated GPIO
+chips for testing purposes. The lines exposed by these chips can be accessed
+using the standard GPIO character device interface as well as manipulated
+using the dedicated debugfs directory structure.
+
+Creating simulated chips using module params
+--------------------------------------------
+
+When loading the gpio-mockup driver a number of parameters can be passed to the
+module.
+
+ gpio_mockup_ranges
+
+ This parameter takes an argument in the form of an array of integer
+ pairs. Each pair defines the base GPIO number (non-negative integer)
+ and the first number after the last of this chip. If the base GPIO
+ is -1, the gpiolib will assign it automatically, while the following
+ parameter is the number of lines exposed by the chip.
+
+ Example: gpio_mockup_ranges=-1,8,-1,16,405,409
+
+ The line above creates three chips. The first one will expose 8 lines,
+ the second 16 and the third 4. The base GPIO for the third chip is set
+ to 405 while for the first two chips it will be assigned automatically.
+
+ gpio_mockup_named_lines
+
+ This parameter doesn't take any arguments. It lets the driver know that
+ GPIO lines exposed by it should be named.
+
+ The name format is: gpio-mockup-X-Y where X is mockup chip's ID
+ and Y is the line offset.
+
+Manipulating simulated lines
+----------------------------
+
+Each mockup chip creates its own subdirectory in /sys/kernel/debug/gpio-mockup/.
+The directory is named after the chip's label. A symlink is also created, named
+after the chip's name, which points to the label directory.
+
+Inside each subdirectory, there's a separate attribute for each GPIO line. The
+name of the attribute represents the line's offset in the chip.
+
+Reading from a line attribute returns the current value. Writing to it (0 or 1)
+changes the configuration of the simulated pull-up/pull-down resistor
+(1 - pull-up, 0 - pull-down).
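+
+As an illustration, the following session loads the driver with two
+automatically numbered chips and named lines, then reads line 3 of one chip,
+sets its simulated pull to pull-up and reads it back. It is only a sketch.
+It assumes that debugfs is mounted at /sys/kernel/debug and that the chip
+was registered under the name "gpiochip0". Both depend on the system:
+
+.. code-block:: sh
+
+    $ modprobe gpio-mockup gpio_mockup_ranges=-1,8,-1,16 gpio_mockup_named_lines
+    $ cat /sys/kernel/debug/gpio-mockup/gpiochip0/3
+    $ echo 1 > /sys/kernel/debug/gpio-mockup/gpiochip0/3
+    $ cat /sys/kernel/debug/gpio-mockup/gpiochip0/3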
diff --git a/Documentation/admin-guide/gpio/gpio-sim.rst b/Documentation/admin-guide/gpio/gpio-sim.rst
new file mode 100644
index 000000000000..d8a90c81b9ee
--- /dev/null
+++ b/Documentation/admin-guide/gpio/gpio-sim.rst
@@ -0,0 +1,134 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Configfs GPIO Simulator
+=======================
+
+The configfs GPIO Simulator (gpio-sim) provides a way to create simulated GPIO
+chips for testing purposes. The lines exposed by these chips can be accessed
+using the standard GPIO character device interface as well as manipulated
+using sysfs attributes.
+
+Creating simulated chips
+------------------------
+
+The gpio-sim module registers a configfs subsystem called ``'gpio-sim'``. For
+details of the configfs filesystem, please refer to the configfs documentation.
+
+The user can create a hierarchy of configfs groups and items as well as modify
+values of exposed attributes. Once the chip is instantiated, this hierarchy
+will be translated to appropriate device properties. The general structure is:
+
+**Group:** ``/config/gpio-sim``
+
+This is the top directory of the gpio-sim configfs tree.
+
+**Group:** ``/config/gpio-sim/gpio-device``
+
+**Attribute:** ``/config/gpio-sim/gpio-device/dev_name``
+
+**Attribute:** ``/config/gpio-sim/gpio-device/live``
+
+This is a directory representing a GPIO platform device. The ``'dev_name'``
+attribute is read-only and lets user space read the platform device
+name (e.g. ``'gpio-sim.0'``). The ``'live'`` attribute triggers the
+actual creation of the device once it's fully configured. The accepted values
+are: ``'1'`` to enable the simulated device and ``'0'`` to disable and tear
+it down.
+
+**Group:** ``/config/gpio-sim/gpio-device/gpio-bankX``
+
+**Attribute:** ``/config/gpio-sim/gpio-device/gpio-bankX/chip_name``
+
+**Attribute:** ``/config/gpio-sim/gpio-device/gpio-bankX/num_lines``
+
+This group represents a bank of GPIOs under the top platform device. The
+``'chip_name'`` attribute is read-only and lets user space read the
+device name of the bank device. The ``'num_lines'`` attribute specifies
+the number of lines exposed by this bank.
+
+**Group:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY``
+
+**Attribute:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY/name``
+
+This group represents a single line at the offset Y. The ``'name'`` attribute
+sets the line name as represented by the ``'gpio-line-names'`` property.
+
+**Item:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY/hog``
+
+**Attribute:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY/hog/name``
+
+**Attribute:** ``/config/gpio-sim/gpio-device/gpio-bankX/lineY/hog/direction``
+
+This item makes the gpio-sim module hog the associated line. The ``'name'``
+attribute specifies the in-kernel consumer name to use. The ``'direction'``
+attribute specifies the hog direction and must be one of: ``'input'``,
+``'output-high'`` and ``'output-low'``.
+
+Inside each bank directory, there's a set of attributes that can be used to
+configure the new chip. Additionally the user can ``mkdir()`` subdirectories
+inside the chip's directory to pass additional configuration for
+specific lines. The name of those subdirectories must take the form of:
+``'line<offset>'`` (e.g. ``'line0'``, ``'line20'``, etc.) as the name will be
+used by the module to assign the config to the specific line at the given offset.
+
+Once the configuration is complete, the ``'live'`` attribute must be set to 1 in
+order to instantiate the chip. It can be set back to 0 to destroy the simulated
+chip. The module will synchronously wait for the new simulated device to be
+successfully probed and if this doesn't happen, writing to ``'live'`` will
+result in an error.
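+
+As an illustration, the sequence below creates a device with a single
+8-line bank, names line 0, instantiates the chip and later tears it down
+again. It is only a sketch. It assumes that configfs is mounted at
+``/config``, as in the paths above (on many systems it is mounted at
+``/sys/kernel/config`` instead). The names ``gpio-device``, ``gpio-bank0``
+and ``sim-foo`` are arbitrary:
+
+.. code-block:: sh
+
+    $ mkdir /config/gpio-sim/gpio-device
+    $ mkdir /config/gpio-sim/gpio-device/gpio-bank0
+    $ echo 8 > /config/gpio-sim/gpio-device/gpio-bank0/num_lines
+    $ mkdir /config/gpio-sim/gpio-device/gpio-bank0/line0
+    $ echo sim-foo > /config/gpio-sim/gpio-device/gpio-bank0/line0/name
+    $ echo 1 > /config/gpio-sim/gpio-device/live
+    $ cat /config/gpio-sim/gpio-device/dev_name
+    $ cat /config/gpio-sim/gpio-device/gpio-bank0/chip_name
+    $ echo 0 > /config/gpio-sim/gpio-device/live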
+
+Simulated GPIO chips can also be defined in device-tree. The compatible string
+must be: ``"gpio-simulator"``. Supported properties are:
+
+ ``"gpio-sim,label"`` - chip label
+
+Other standard GPIO properties (like ``"gpio-line-names"``, ``"ngpios"`` or
+``"gpio-hog"``) are also supported. Please refer to the GPIO documentation for
+details.
+
+An example device-tree code defining a GPIO simulator:
+
+.. code-block:: none
+
+ gpio-sim {
+ compatible = "gpio-simulator";
+
+ bank0 {
+ gpio-controller;
+ #gpio-cells = <2>;
+ ngpios = <16>;
+ gpio-sim,label = "dt-bank0";
+ gpio-line-names = "", "sim-foo", "", "sim-bar";
+ };
+
+ bank1 {
+ gpio-controller;
+ #gpio-cells = <2>;
+ ngpios = <8>;
+ gpio-sim,label = "dt-bank1";
+
+ line3 {
+ gpio-hog;
+ gpios = <3 0>;
+ output-high;
+ line-name = "sim-hog-from-dt";
+ };
+ };
+ };
+
+Manipulating simulated lines
+----------------------------
+
+Each simulated GPIO chip creates a separate sysfs group under its device
+directory for each exposed line
+(e.g. ``/sys/devices/platform/gpio-sim.X/gpiochipY/``). The name of each group
+is of the form: ``'sim_gpioX'`` where X is the offset of the line. Inside each
+group there are two attributes:
+
+ ``pull`` - reads and sets the current simulated pull setting of the
+ line; when writing, the value must be one of: ``'pull-up'``,
+ ``'pull-down'``
+
+ ``value`` - reads the current value of the line, which may be
+ different from the pull if the line is being driven from
+ user-space
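+
+As an illustration, the following session reads line 0 of a simulated chip,
+sets its pull to pull-up and reads both attributes back. It is only a
+sketch. The device name ``gpio-sim.0`` and the chip name ``gpiochip1`` are
+assumptions; the real names can be read from the ``dev_name`` and
+``chip_name`` configfs attributes:
+
+.. code-block:: sh
+
+    $ cat /sys/devices/platform/gpio-sim.0/gpiochip1/sim_gpio0/value
+    $ echo pull-up > /sys/devices/platform/gpio-sim.0/gpiochip1/sim_gpio0/pull
+    $ cat /sys/devices/platform/gpio-sim.0/gpiochip1/sim_gpio0/pull
+    $ cat /sys/devices/platform/gpio-sim.0/gpiochip1/sim_gpio0/value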
diff --git a/Documentation/admin-guide/gpio/index.rst b/Documentation/admin-guide/gpio/index.rst
index a244ba4e87d5..f6861ca16ffe 100644
--- a/Documentation/admin-guide/gpio/index.rst
+++ b/Documentation/admin-guide/gpio/index.rst
@@ -7,7 +7,10 @@ gpio
.. toctree::
:maxdepth: 1
+ gpio-aggregator
sysfs
+ gpio-mockup
+ gpio-sim
.. only:: subproject and html
diff --git a/Documentation/admin-guide/hw-vuln/core-scheduling.rst b/Documentation/admin-guide/hw-vuln/core-scheduling.rst
new file mode 100644
index 000000000000..cf1eeefdfc32
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/core-scheduling.rst
@@ -0,0 +1,226 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Core Scheduling
+===============
+Core scheduling support allows userspace to define groups of tasks that can
+share a core. These groups can be specified either for security use cases (one
+group of tasks doesn't trust another), or for performance use cases (some
+workloads may benefit from running on the same core as they don't need the same
+hardware resources of the shared core, or may prefer different cores if they
+do share hardware resource needs). This document only describes the security
+use case.
+
+Security use case
+-----------------
+A cross-HT attack involves the attacker and victim running on different Hyper
+Threads of the same core. MDS and L1TF are examples of such attacks. The only
+full mitigation of cross-HT attacks is to disable Hyper Threading (HT). Core
+scheduling is a scheduler feature that can mitigate some (not all) cross-HT
+attacks. It allows HT to be turned on safely by ensuring that only tasks in a
+user-designated trusted group can share a core. This increase in core sharing
+can also improve performance. An improvement is not guaranteed, though it has
+been observed with a number of real world workloads. In theory, core scheduling
+aims to perform at least as well as when Hyper Threading is disabled. In
+practice, this is mostly but not always the case, because synchronizing
+scheduling decisions across 2 or more CPUs in a core involves additional
+overhead, especially when the system is lightly loaded. When
+``total_threads <= N_CPUS/2``, where N_CPUS is the total number of CPUs, the
+extra overhead may cause core scheduling to perform worse than with SMT
+disabled. Always measure the performance of your workloads.
+
+Usage
+-----
+Core scheduling support is enabled via the ``CONFIG_SCHED_CORE`` config option.
+Using this feature, userspace defines groups of tasks that can be co-scheduled
+on the same core. The core scheduler uses this information to make sure that
+tasks that are not in the same group never run simultaneously on a core, while
+doing its best to satisfy the system's scheduling requirements.
+
+Core scheduling can be enabled via the ``PR_SCHED_CORE`` prctl interface.
+This interface provides support for the creation of core scheduling groups, as
+well as admission and removal of tasks from created groups::
+
+ #include <sys/prctl.h>
+
+ int prctl(int option, unsigned long arg2, unsigned long arg3,
+ unsigned long arg4, unsigned long arg5);
+
+option:
+ ``PR_SCHED_CORE``
+
+arg2:
+ Command for operation, must be one of:
+
+ - ``PR_SCHED_CORE_GET`` -- get core_sched cookie of ``pid``.
+ - ``PR_SCHED_CORE_CREATE`` -- create a new unique cookie for ``pid``.
+ - ``PR_SCHED_CORE_SHARE_TO`` -- push core_sched cookie to ``pid``.
+ - ``PR_SCHED_CORE_SHARE_FROM`` -- pull core_sched cookie from ``pid``.
+
+arg3:
+ ``pid`` of the task for which the operation applies.
+
+arg4:
+ ``pid_type`` for which the operation applies. It is one of
+ ``PR_SCHED_CORE_SCOPE_``-prefixed macro constants. For example, if arg4
+ is ``PR_SCHED_CORE_SCOPE_THREAD_GROUP``, then the operation of this command
+ will be performed for all tasks in the task group of ``pid``.
+
+arg5:
+ userspace pointer to an unsigned long for storing the cookie returned by
+ ``PR_SCHED_CORE_GET`` command. Should be 0 for all other commands.
+
+In order for a process to push a cookie to, or pull a cookie from, a process, it
+is required to have the ptrace access mode ``PTRACE_MODE_READ_REALCREDS`` to the
+process.
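+
+The interface can be exercised with a sketch like the one below. It is an
+illustration under assumptions, not a reference implementation. It needs a
+kernel built with ``CONFIG_SCHED_CORE`` and headers that define the
+``PR_SCHED_CORE*`` constants. The optional command line argument (a target
+pid) is a convention of the sketch. ``uint64_t`` is used for the cookie
+buffer so that it is 8 bytes wide on every architecture, which is also an
+assumption of the sketch::
+
+    #include <stdint.h>
+    #include <stdio.h>
+    #include <stdlib.h>
+    #include <unistd.h>
+    #include <sys/prctl.h>
+    #include <linux/prctl.h>
+
+    int main(int argc, char **argv)
+    {
+        uint64_t cookie = 0;
+        unsigned long self = (unsigned long)getpid();
+
+        /* Create a new unique cookie for all threads of this process. */
+        if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, self,
+                  PR_SCHED_CORE_SCOPE_THREAD_GROUP, 0) < 0) {
+            perror("PR_SCHED_CORE_CREATE");
+            return 1;
+        }
+
+        /* Read the cookie of the main thread back into 'cookie'. */
+        if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, self,
+                  PR_SCHED_CORE_SCOPE_THREAD, (unsigned long)&cookie) < 0) {
+            perror("PR_SCHED_CORE_GET");
+            return 1;
+        }
+        printf("cookie: %#llx\n", (unsigned long long)cookie);
+
+        /* Optionally push the cookie to the process given as argv[1]. */
+        if (argc > 1) {
+            unsigned long pid = strtoul(argv[1], NULL, 10);
+
+            if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_SHARE_TO, pid,
+                      PR_SCHED_CORE_SCOPE_THREAD_GROUP, 0) < 0) {
+                perror("PR_SCHED_CORE_SHARE_TO");
+                return 1;
+            }
+        }
+
+        return 0;
+    }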
+
+Building hierarchies of tasks
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The simplest way to build hierarchies of threads/processes which share a
+cookie and thus a core is to rely on the fact that the core-sched cookie is
+inherited across forks/clones and execs, thus setting a cookie for the
+'initial' script/executable/daemon will place every spawned child in the
+same core-sched group.
+
+Cookie Transferral
+~~~~~~~~~~~~~~~~~~
+Transferring a cookie between the current and other tasks is possible using
+PR_SCHED_CORE_SHARE_FROM and PR_SCHED_CORE_SHARE_TO to inherit a cookie from a
+specified task or share a cookie with a task. In combination this allows a
+simple helper program to pull a cookie from a task in an existing core
+scheduling group and share it with already running tasks.
+
+Design/Implementation
+---------------------
+Each task that is tagged is assigned a cookie internally in the kernel. As
+mentioned in `Usage`_, tasks with the same cookie value are assumed to trust
+each other and share a core.
+
+The basic idea is that, every schedule event tries to select tasks for all the
+siblings of a core such that all the selected tasks running on a core are
+trusted (same cookie) at any point in time. Kernel threads are assumed trusted.
+The idle task is considered special, as it trusts everything and everything
+trusts it.
+
+During a schedule() event on any sibling of a core, the highest priority task on
+the sibling's core is picked and assigned to the sibling calling schedule(), if
+the sibling has the task enqueued. For the rest of the siblings in the core, the
+highest priority task with the same cookie is selected if there is one runnable
+in their individual run queues. If a task with the same cookie is not available,
+the idle task is selected. The idle task is globally trusted.
+
+Once a task has been selected for all the siblings in the core, an IPI is sent to
+siblings for whom a new task was selected. Siblings on receiving the IPI will
+switch to the new task immediately. If an idle task is selected for a sibling,
+then the sibling is considered to be in a `forced idle` state. I.e., it may
+have tasks on its own runqueue to run, however it will still have to run idle.
+More on this in the next section.
+
+Forced-idling of hyperthreads
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The scheduler tries its best to find tasks that trust each other such that all
+tasks selected to be scheduled are of the highest priority in a core. However,
+it is possible that some runqueues had tasks that were incompatible with the
+highest priority ones in the core. Favoring security over fairness, one or more
+siblings could be forced to select a lower priority task if the highest
+priority task is not trusted with respect to the core wide highest priority
+task. If a sibling does not have a trusted task to run, it will be forced idle
+by the scheduler (idle thread is scheduled to run).
+
+When the highest priority task is selected to run, a reschedule-IPI is sent to
+the sibling to force it into idle. This results in 4 cases which need to be
+considered depending on whether a VM or a regular usermode process was running
+on either HT::
+
+ HT1 (attack) HT2 (victim)
+ A idle -> user space user space -> idle
+ B idle -> user space guest -> idle
+ C idle -> guest user space -> idle
+ D idle -> guest guest -> idle
+
+Note that for better performance, we do not wait for the destination CPU
+(victim) to enter idle mode. This is because the sending of the IPI would bring
+the destination CPU immediately into kernel mode from user space, or VMEXIT
+in the case of guests. At best, this would only leak some scheduler metadata
+which may not be worth protecting. It is also possible that the IPI is received
+too late on some architectures, but this has not been observed in the case of
+x86.
+
+Trust model
+~~~~~~~~~~~
+Core scheduling maintains trust relationships amongst groups of tasks by
+assigning them a tag that is the same cookie value.
+When a system with core scheduling boots, all tasks are considered to trust
+each other. This is because the core scheduler does not have information about
+trust relationships until userspace uses the above mentioned interfaces to
+communicate them. In other words, all tasks have a default cookie value of 0
+and are considered system-wide trusted. The forced-idling of siblings running
+cookie-0 tasks is also avoided.
+
+Once userspace uses the above mentioned interfaces to group sets of tasks, tasks
+within such groups are considered to trust each other, but do not trust those
+outside. Tasks outside the group also don't trust tasks within.
+
+Limitations of core-scheduling
+------------------------------
+Core scheduling tries to guarantee that only trusted tasks run concurrently on a
+core. But there could be a small window of time during which untrusted tasks run
+concurrently or the kernel could be running concurrently with a task not trusted
+by the kernel.
+
+IPI processing delays
+~~~~~~~~~~~~~~~~~~~~~
+Core scheduling selects only trusted tasks to run together. An IPI is used to
+notify the siblings to switch to the new task. But there could be hardware delays
+in receiving the IPI on some architectures (on x86, this has not been observed).
+This may cause an attacker task to start running on a CPU before its siblings
+receive the IPI. Even though the cache is flushed on entry to user mode, victim
+tasks on siblings may populate data in the cache and micro architectural buffers
+after the attacker starts to run, and this is a possibility for a data leak.
+
+Open cross-HT issues that core scheduling does not solve
+--------------------------------------------------------
+1. For MDS
+~~~~~~~~~~
+Core scheduling cannot protect against MDS attacks between the siblings
+running in user mode and the others running in kernel mode. Even though all
+siblings run tasks which trust each other, when the kernel is executing
+code on behalf of a task, it cannot trust the code running in the
+sibling. Such attacks are possible for any combination of sibling CPU modes
+(host or guest mode).
+
+2. For L1TF
+~~~~~~~~~~~
+Core scheduling cannot protect against an L1TF guest attacker exploiting a
+guest or host victim. This is because the guest attacker can craft invalid
+PTEs which are not inverted due to a vulnerable guest kernel. The only
+solution is to disable EPT (Extended Page Tables).
+
+For both MDS and L1TF, if the guest vCPUs are configured to not trust each
+other (by tagging separately), then the guest to guest attacks would go away.
+Or it could be a system admin policy which considers guest to guest attacks as
+a guest problem.
+
+Another approach to resolve these would be to make every untrusted task on the
+system not trust every other untrusted task. While this could reduce
+parallelism of the untrusted tasks, it would still solve the above issues while
+allowing system processes (trusted tasks) to share a core.
+
+3. Protecting the kernel (IRQ, syscall, VMEXIT)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Unfortunately, core scheduling does not protect kernel contexts running on
+sibling hyperthreads from one another. Prototypes of mitigations have been posted
+to LKML to solve this, but it is debatable whether such windows are practically
+exploitable, and whether the performance overhead of the prototypes is worth
+it (not to mention, the added code complexity).
+
+Other Use cases
+---------------
+The main use case for core scheduling is mitigating the cross-HT vulnerabilities
+with SMT enabled. There are other use cases where this feature could be used:
+
+- Isolating tasks that need a whole core: Examples include realtime tasks, tasks
+  that use SIMD instructions etc.
+- Gang scheduling: Requirements for a group of tasks that need to be scheduled
+  together could also be realized using core scheduling. One example is vCPUs of
+  a VM.
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 0795e3c2643f..4df436e7c417 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -14,3 +14,7 @@ are configurable at compile, boot or run time.
mds
tsx_async_abort
multihit.rst
+ special-register-buffer-data-sampling.rst
+ core-scheduling.rst
+ l1d_flush.rst
+ processor_mmio_stale_data.rst
diff --git a/Documentation/admin-guide/hw-vuln/l1d_flush.rst b/Documentation/admin-guide/hw-vuln/l1d_flush.rst
new file mode 100644
index 000000000000..210020bc3f56
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/l1d_flush.rst
@@ -0,0 +1,69 @@
+L1D Flushing
+============
+
+With an increasing number of vulnerabilities being reported around data
+leaks from the Level 1 Data cache (L1D), the kernel provides an opt-in
+mechanism to flush the L1D cache on context switch.
+
+This mechanism can be used to address e.g. CVE-2020-0550. For applications
+that opt in, the mechanism mitigates vulnerabilities related to leaks from
+(snooping of) the L1D cache.
+
+
+Related CVEs
+------------
+The following CVEs can be addressed by this
+mechanism:
+
+ ============= ======================== ==================
+ CVE-2020-0550 Improper Data Forwarding OS related aspects
+ ============= ======================== ==================
+
+Usage Guidelines
+----------------
+
+Please see document: :ref:`Documentation/userspace-api/spec_ctrl.rst
+<set_spec_ctrl>` for details.
+
+**NOTE**: The feature is disabled by default; applications need to
+specifically opt into the feature to enable it.
+
+Mitigation
+----------
+
+When PR_SET_L1D_FLUSH is enabled for a task a flush of the L1D cache is
+performed when the task is scheduled out and the incoming task belongs to a
+different process and therefore to a different address space.
+
+If the underlying CPU supports L1D flushing in hardware, the hardware
+mechanism is used; a software fallback for the mitigation is not supported.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows controlling the L1D flush mitigations at boot
+time with the option "l1d_flush=". The valid arguments for this option are:
+
+ ============ =============================================================
+ on Enables the prctl interface, applications trying to use
+ the prctl() will fail with an error if l1d_flush is not
+ enabled
+ ============ =============================================================
+
+By default the mechanism is disabled.
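+
+A task can opt in with a sketch like the one below. It is an illustration
+under assumptions. The constant names ``PR_SET_SPECULATION_CTRL``,
+``PR_GET_SPECULATION_CTRL``, ``PR_SPEC_L1D_FLUSH`` and ``PR_SPEC_ENABLE``
+come from the speculation control prctl interface in ``<linux/prctl.h>``,
+not from this text, and need headers recent enough to define
+``PR_SPEC_L1D_FLUSH``. The kernel must have been booted with
+``l1d_flush=on``, otherwise the first call is expected to fail::
+
+    #include <stdio.h>
+    #include <sys/prctl.h>
+    #include <linux/prctl.h>
+
+    int main(void)
+    {
+        int state;
+
+        /* Ask for an L1D flush when this task is scheduled out. */
+        if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH,
+                  PR_SPEC_ENABLE, 0, 0) < 0) {
+            perror("PR_SPEC_L1D_FLUSH");
+            return 1;
+        }
+
+        /* Query the resulting state; the result is a bit mask. */
+        state = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, 0, 0, 0);
+        if (state < 0) {
+            perror("PR_GET_SPECULATION_CTRL");
+            return 1;
+        }
+        printf("L1D flush %s for this task\n",
+               (state & PR_SPEC_ENABLE) ? "enabled" : "not enabled");
+
+        return 0;
+    }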
+
+Limitations
+-----------
+
+The mechanism does not mitigate L1D data leaks between tasks belonging to
+different processes which are concurrently executing on sibling threads of
+a physical CPU core when SMT is enabled on the system.
+
+This can be addressed by controlled placement of processes on physical CPU
+cores or by disabling SMT. See the relevant chapter in the L1TF mitigation
+document: :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.
+
+**NOTE** : The opt-in of a task for L1D flushing works only when the task's
+affinity is limited to cores running in non-SMT mode. If a task which
+requested L1D flushing is scheduled on a SMT-enabled core the kernel sends
+a SIGBUS to the task.
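Note on l1d_flush.rst: the opt-in described above can be sketched from user space roughly as follows. This is a minimal Python sketch and not part of the patch. It assumes the constants from include/uapi/linux/prctl.h, with PR_SET_SPECULATION_CTRL and the PR_SPEC_L1D_FLUSH control being what the text presumably means by PR_SET_L1D_FLUSH. The helper names (_prctl, describe_state, opt_in_l1d_flush, query_l1d_flush) are hypothetical.

```python
import ctypes
import os

# Constants assumed to match include/uapi/linux/prctl.h.
PR_GET_SPECULATION_CTRL = 52
PR_SET_SPECULATION_CTRL = 53
PR_SPEC_L1D_FLUSH = 2
PR_SPEC_PRCTL = 1 << 0
PR_SPEC_ENABLE = 1 << 1
PR_SPEC_DISABLE = 1 << 2
PR_SPEC_FORCE_DISABLE = 1 << 3


def _prctl(option, arg2, arg3):
    """Call prctl(2) through libc and return (result, errno)."""
    libc = ctypes.CDLL(None, use_errno=True)
    ctypes.set_errno(0)
    res = libc.prctl(ctypes.c_int(option), ctypes.c_ulong(arg2),
                     ctypes.c_ulong(arg3), ctypes.c_ulong(0),
                     ctypes.c_ulong(0))
    return res, ctypes.get_errno()


def describe_state(state):
    """Render the bit mask returned by PR_GET_SPECULATION_CTRL as text."""
    names = [
        (PR_SPEC_PRCTL, "prctl-controllable"),
        (PR_SPEC_ENABLE, "enabled"),
        (PR_SPEC_DISABLE, "disabled"),
        (PR_SPEC_FORCE_DISABLE, "force-disabled"),
    ]
    parts = [name for bit, name in names if state & bit]
    return ", ".join(parts) if parts else "no state bits set"


def opt_in_l1d_flush():
    """Request an L1D flush when this task is scheduled out.

    Returns None on success, otherwise the errno reported by prctl().
    """
    res, err = _prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH,
                      PR_SPEC_ENABLE)
    return None if res == 0 else err


def query_l1d_flush():
    """Return a readable description of the task's L1D flush state."""
    res, err = _prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, 0)
    return describe_state(res) if res >= 0 else os.strerror(err)
```

Per the document, opt_in_l1d_flush() should be expected to fail unless the kernel was booted with l1d_flush=on. Given the SIGBUS behaviour noted above, a caller would normally restrict its affinity to cores running in non-SMT mode first, for example with os.sched_setaffinity().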
diff --git a/Documentation/admin-guide/hw-vuln/l1tf.rst b/Documentation/admin-guide/hw-vuln/l1tf.rst
index f83212fae4d5..3eeeb488d955 100644
--- a/Documentation/admin-guide/hw-vuln/l1tf.rst
+++ b/Documentation/admin-guide/hw-vuln/l1tf.rst
@@ -268,7 +268,7 @@ Guest mitigation mechanisms
/proc/irq/$NR/smp_affinity[_list] files. Limited documentation is
available at:
- https://www.kernel.org/doc/Documentation/IRQ-affinity.txt
+ https://www.kernel.org/doc/Documentation/core-api/irq/irq-affinity.rst
.. _smt_control:
diff --git a/Documentation/admin-guide/hw-vuln/multihit.rst b/Documentation/admin-guide/hw-vuln/multihit.rst
index ba9988d8bce5..140e4cec38c3 100644
--- a/Documentation/admin-guide/hw-vuln/multihit.rst
+++ b/Documentation/admin-guide/hw-vuln/multihit.rst
@@ -80,6 +80,10 @@ The possible values in this file are:
- The processor is not vulnerable.
* - KVM: Mitigation: Split huge pages
- Software changes mitigate this issue.
+ * - KVM: Mitigation: VMX unsupported
+ - KVM is not vulnerable because Virtual Machine Extensions (VMX) is not supported.
+ * - KVM: Mitigation: VMX disabled
+ - KVM is not vulnerable because Virtual Machine Extensions (VMX) is disabled.
* - KVM: Vulnerable
- The processor is vulnerable, but no mitigation enabled
diff --git a/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst b/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
new file mode 100644
index 000000000000..c98fd11907cc
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
@@ -0,0 +1,260 @@
+=========================================
+Processor MMIO Stale Data Vulnerabilities
+=========================================
+
+Processor MMIO Stale Data Vulnerabilities are a class of memory-mapped I/O
+(MMIO) vulnerabilities that can expose data. The sequences of operations for
+exposing data range from simple to very complex. Because most of the
+vulnerabilities require the attacker to have access to MMIO, many environments
+are not affected. System environments using virtualization where MMIO access is
+provided to untrusted guests may need mitigation. These vulnerabilities are
+not transient execution attacks. However, these vulnerabilities may propagate
+stale data into core fill buffers where the data can subsequently be inferred
+by an unmitigated transient execution attack. Mitigation for these
+vulnerabilities includes a combination of microcode update and software
+changes, depending on the platform and usage model. Some of these mitigations
+are similar to those used to mitigate Microarchitectural Data Sampling (MDS) or
+those used to mitigate Special Register Buffer Data Sampling (SRBDS).
+
+Data Propagators
+================
+Propagators are operations that result in stale data being copied or moved from
+one microarchitectural buffer or register to another. Processor MMIO Stale Data
+Vulnerabilities are operations that may result in stale data being directly
+read into an architectural, software-visible state or sampled from a buffer or
+register.
+
+Fill Buffer Stale Data Propagator (FBSDP)
+-----------------------------------------
+Stale data may propagate from fill buffers (FB) into the non-coherent portion
+of the uncore on some non-coherent writes. Fill buffer propagation by itself
+does not make stale data architecturally visible. Stale data must be propagated
+to a location where it is subject to reading or sampling.
+
+Sideband Stale Data Propagator (SSDP)
+-------------------------------------
+The sideband stale data propagator (SSDP) is limited to the client (including
+Intel Xeon server E3) uncore implementation. The sideband response buffer is
+shared by all client cores. For non-coherent reads that go to sideband
+destinations, the uncore logic returns 64 bytes of data to the core, including
+both requested data and unrequested stale data, from a transaction buffer and
+the sideband response buffer. As a result, stale data from the sideband
+response and transaction buffers may now reside in a core fill buffer.
+
+Primary Stale Data Propagator (PSDP)
+------------------------------------
+The primary stale data propagator (PSDP) is limited to the client (including
+Intel Xeon server E3) uncore implementation. Similar to the sideband response
+buffer, the primary response buffer is shared by all client cores. For some
+processors, MMIO primary reads will return 64 bytes of data to the core fill
+buffer including both requested data and unrequested stale data. This is
+similar to the sideband stale data propagator.
+
+Vulnerabilities
+===============
+Device Register Partial Write (DRPW) (CVE-2022-21166)
+-----------------------------------------------------
+Some endpoint MMIO registers incorrectly handle writes that are smaller than
+the register size. Instead of aborting the write or only copying the correct
+subset of bytes (for example, 2 bytes for a 2-byte write), more bytes than
+specified by the write transaction may be written to the register. On
+processors affected by FBSDP, this may expose stale data from the fill buffers
+of the core that created the write transaction.
+
+Shared Buffers Data Sampling (SBDS) (CVE-2022-21125)
+----------------------------------------------------
+After propagators may have moved data around the uncore and copied stale data
+into client core fill buffers, processors affected by MFBDS can leak data from
+the fill buffer. It is limited to the client (including Intel Xeon server E3)
+uncore implementation.
+
+Shared Buffers Data Read (SBDR) (CVE-2022-21123)
+------------------------------------------------
+It is similar to Shared Buffers Data Sampling (SBDS) except that the data is
+directly read into the architectural software-visible state. It is limited to
+the client (including Intel Xeon server E3) uncore implementation.
+
+Affected Processors
+===================
+Not all the CPUs are affected by all the variants. For instance, most
+processors for the server market (excluding Intel Xeon E3 processors) are
+impacted by only Device Register Partial Write (DRPW).
+
+Below is the list of affected Intel processors [#f1]_:
+
+  ===================  ============  =========
+  Common name          Family_Model  Steppings
+  ===================  ============  =========
+  HASWELL_X            06_3FH        2,4
+  SKYLAKE_L            06_4EH        3
+  BROADWELL_X          06_4FH        All
+  SKYLAKE_X            06_55H        3,4,6,7,11
+  BROADWELL_D          06_56H        3,4,5
+  SKYLAKE              06_5EH        3
+  ICELAKE_X            06_6AH        4,5,6
+  ICELAKE_D            06_6CH        1
+  ICELAKE_L            06_7EH        5
+  ATOM_TREMONT_D       06_86H        All
+  LAKEFIELD            06_8AH        1
+  KABYLAKE_L           06_8EH        9 to 12
+  ATOM_TREMONT         06_96H        1
+  ATOM_TREMONT_L       06_9CH        0
+  KABYLAKE             06_9EH        9 to 13
+  COMETLAKE            06_A5H        2,3,5
+  COMETLAKE_L          06_A6H        0,1
+  ROCKETLAKE           06_A7H        1
+  ===================  ============  =========
+
+If a CPU is in the affected processor list, but not affected by a variant, it
+is indicated by new bits in MSR IA32_ARCH_CAPABILITIES. As described in a later
+section, mitigation largely remains the same for all the variants, i.e. to
+clear the CPU fill buffers via VERW instruction.
+
+New bits in MSRs
+================
+Newer processors and microcode updates on existing affected processors add new
+bits to the IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate
+specific variants of Processor MMIO Stale Data vulnerabilities and mitigation
+capability.
+
+MSR IA32_ARCH_CAPABILITIES
+--------------------------
+Bit 13 - SBDR_SSDP_NO - When set, processor is not affected by either the
+ Shared Buffers Data Read (SBDR) vulnerability or the sideband stale
+ data propagator (SSDP).
+Bit 14 - FBSDP_NO - When set, processor is not affected by the Fill Buffer
+ Stale Data Propagator (FBSDP).
+Bit 15 - PSDP_NO - When set, processor is not affected by Primary Stale Data
+ Propagator (PSDP).
+Bit 17 - FB_CLEAR - When set, VERW instruction will overwrite CPU fill buffer
+ values as part of MD_CLEAR operations. Processors that do not
+ enumerate MDS_NO (meaning they are affected by MDS) but that do
+ enumerate support for both L1D_FLUSH and MD_CLEAR implicitly enumerate
+ FB_CLEAR as part of their MD_CLEAR support.
+Bit 18 - FB_CLEAR_CTRL - Processor supports read and write to MSR
+ IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]. On such processors, the FB_CLEAR_DIS
+ bit can be set to cause the VERW instruction to not perform the
+ FB_CLEAR action. Not all processors that support FB_CLEAR will support
+ FB_CLEAR_CTRL.
+
+MSR IA32_MCU_OPT_CTRL
+---------------------
+Bit 3 - FB_CLEAR_DIS - When set, VERW instruction does not perform the FB_CLEAR
+action. This may be useful to reduce the performance impact of FB_CLEAR in
+cases where system software deems it warranted (for example, when performance
+is more critical, or the untrusted software has no MMIO access). Note that
+FB_CLEAR_DIS has no impact on enumeration (for example, it does not change
+FB_CLEAR or MD_CLEAR enumeration) and it may not be supported on all processors
+that enumerate FB_CLEAR.
+
+Mitigation
+==========
+Like MDS, all variants of Processor MMIO Stale Data vulnerabilities share the
+same mitigation strategy: force the CPU to clear the affected buffers before
+an attacker can extract the secrets.
+
+This is achieved by using the otherwise unused and obsolete VERW instruction in
+combination with a microcode update. The microcode clears the affected CPU
+buffers when the VERW instruction is executed.
+
+The kernel reuses the MDS function to invoke the buffer clearing:
+
+ mds_clear_cpu_buffers()
+
+On MDS affected CPUs, the kernel already invokes CPU buffer clear on
+kernel/userspace, hypervisor/guest and C-state (idle) transitions. No
+additional mitigation is needed on such CPUs.
+
+For CPUs not affected by MDS or TAA, mitigation is needed only against an
+attacker with MMIO capability. Therefore, VERW is not required on
+kernel/userspace transitions. In the virtualization case, VERW is only needed
+at VMENTER for a guest with MMIO capability.
+
+Mitigation points
+-----------------
+Return to user space
+^^^^^^^^^^^^^^^^^^^^
+Same mitigation as MDS when affected by MDS/TAA, otherwise no mitigation
+needed.
+
+C-State transition
+^^^^^^^^^^^^^^^^^^
+Control register writes by CPU during C-state transition can propagate data
+from fill buffer to uncore buffers. Execute VERW before C-state transition to
+clear CPU fill buffers.
+
+Guest entry point
+^^^^^^^^^^^^^^^^^
+Same mitigation as MDS when the processor is also affected by MDS/TAA;
+otherwise execute VERW at VMENTER only for MMIO-capable guests. On CPUs not
+affected by MDS/TAA, a guest without MMIO access cannot extract secrets using
+Processor MMIO Stale Data vulnerabilities, so there is no need to execute VERW
+for such guests.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+The kernel command line allows control of the Processor MMIO Stale Data
+mitigations at boot time with the option "mmio_stale_data=". The valid
+arguments for this option are:
+
+  ==========  =================================================================
+  full        If the CPU is vulnerable, enable mitigation; CPU buffer clearing
+              on exit to userspace and when entering a VM. Idle transitions are
+              protected as well. It does not automatically disable SMT.
+  full,nosmt  Same as full, with SMT disabled on vulnerable CPUs. This is the
+              complete mitigation.
+  off         Disables mitigation completely.
+  ==========  =================================================================
+
+If the CPU is affected and mmio_stale_data=off is not supplied on the kernel
+command line, then the kernel selects the appropriate mitigation.
+
+Mitigation status information
+-----------------------------
+The Linux kernel provides a sysfs interface to enumerate the current
+vulnerability status of the system: whether the system is vulnerable, and
+which mitigations are active. The relevant sysfs file is:
+
+ /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
+
+The possible values in this file are:
+
+  .. list-table::
+
+     * - 'Not affected'
+       - The processor is not vulnerable
+     * - 'Vulnerable'
+       - The processor is vulnerable, but no mitigation enabled
+     * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+       - The processor is vulnerable, but microcode is not updated. The
+         mitigation is enabled on a best effort basis.
+     * - 'Mitigation: Clear CPU buffers'
+       - The processor is vulnerable and the CPU buffer clearing mitigation is
+         enabled.
+     * - 'Unknown: No mitigations'
+       - The processor vulnerability status is unknown because it is
+         out of Servicing period. Mitigation is not attempted.
+
+Definitions:
+------------
+
+Servicing period: The process of providing functional and security updates to
+Intel processors or platforms, utilizing the Intel Platform Update (IPU)
+process or other similar mechanisms.
+
+End of Servicing Updates (ESU): ESU is the date at which Intel will no
+longer provide Servicing, such as through IPU or other similar update
+processes. ESU dates will typically be aligned to end of quarter.
+
+If the processor is vulnerable then the following information is appended to
+the above information:
+
+  ========================  ===========================================
+  'SMT vulnerable'          SMT is enabled
+  'SMT disabled'            SMT is disabled
+  'SMT Host state unknown'  Kernel runs in a VM, Host SMT state unknown
+  ========================  ===========================================
+
+References
+----------
+.. [#f1] Affected Processors
+ https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html
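Note on processor_mmio_stale_data.rst: the enumeration bits listed under "MSR IA32_ARCH_CAPABILITIES" can be decoded with a few lines of code. The following is a minimal Python sketch, not part of the patch. The bit positions are taken from the text above. The MSR address 0x10a and the /dev/cpu/N/msr read (which needs the msr driver and root privileges) are assumptions beyond what the document states, and the function names are hypothetical.

```python
import os
import struct

# Assumed MSR address of IA32_ARCH_CAPABILITIES (not stated in the document).
MSR_IA32_ARCH_CAPABILITIES = 0x10A

# Bit positions as listed in the "MSR IA32_ARCH_CAPABILITIES" section.
ARCH_CAP_BITS = {
    13: "SBDR_SSDP_NO",
    14: "FBSDP_NO",
    15: "PSDP_NO",
    17: "FB_CLEAR",
    18: "FB_CLEAR_CTRL",
}


def decode_arch_capabilities(value):
    """Return the names of the MMIO-related bits set in value, lowest first."""
    return [name for bit, name in sorted(ARCH_CAP_BITS.items())
            if (value >> bit) & 1]


def read_msr(address, cpu=0):
    """Read a 64-bit MSR through the msr character device (assumed setup)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The file offset selects the MSR; each read returns 8 bytes.
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)
```

On a system where the msr device is available, decode_arch_capabilities(read_msr(MSR_IA32_ARCH_CAPABILITIES)) would list which of these bits the CPU enumerates. The decode step itself works on any integer and needs no privileges.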
diff --git a/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
new file mode 100644
index 000000000000..966c9b3296ea
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
@@ -0,0 +1,150 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+SRBDS - Special Register Buffer Data Sampling
+=============================================
+
+SRBDS is a hardware vulnerability that allows MDS techniques (see
+Documentation/admin-guide/hw-vuln/mds.rst) to infer values returned from
+special register accesses. Special register accesses are accesses to
+off-core registers. According to Intel's evaluation,
+the special register reads that have a security expectation of privacy are
+RDRAND, RDSEED and SGX EGETKEY.
+
+When RDRAND, RDSEED and EGETKEY instructions are used, the data is moved
+to the core through the special register mechanism that is susceptible
+to MDS attacks.
+
+Affected processors
+-------------------
+Core models (desktop, mobile, Xeon-E3) that implement RDRAND and/or RDSEED may
+be affected.
+
+A processor is affected by SRBDS if its Family_Model and stepping are
+in the following list, with the exception of the listed processors
+exporting MDS_NO while Intel TSX is available yet not enabled. The
+latter class of processors is only affected when Intel TSX is enabled
+by software using TSX_CTRL_MSR; otherwise they are not affected.
+
+  =============  ============  ========
+  common name    Family_Model  Stepping
+  =============  ============  ========
+  IvyBridge      06_3AH        All
+
+  Haswell        06_3CH        All
+  Haswell_L      06_45H        All
+  Haswell_G      06_46H        All
+
+  Broadwell_G    06_47H        All
+  Broadwell      06_3DH        All
+
+  Skylake_L      06_4EH        All
+  Skylake        06_5EH        All
+
+  Kabylake_L     06_8EH        <= 0xC
+  Kabylake       06_9EH        <= 0xD
+  =============  ============  ========
+
+Related CVEs
+------------
+
+The following CVE entry is related to this SRBDS issue:
+
+  ==============  =====  =====================================
+  CVE-2020-0543   SRBDS  Special Register Buffer Data Sampling
+  ==============  =====  =====================================
+
+Attack scenarios
+----------------
+An unprivileged user can extract values returned from RDRAND and RDSEED
+executed on another core or sibling thread using MDS techniques.
+
+
+Mitigation mechanism
+--------------------
+Intel will release microcode updates that modify the RDRAND, RDSEED, and
+EGETKEY instructions to overwrite secret special register data in the shared
+staging buffer before the secret data can be accessed by another logical
+processor.
+
+During execution of the RDRAND, RDSEED, or EGETKEY instructions, off-core
+accesses from other logical processors will be delayed until the special
+register read is complete and the secret data in the shared staging buffer is
+overwritten.
+
+This has three effects on performance:
+
+#. RDRAND, RDSEED, or EGETKEY instructions have higher latency.
+
+#. Executing RDRAND at the same time on multiple logical processors will be
+ serialized, resulting in an overall reduction in the maximum RDRAND
+ bandwidth.
+
+#. Executing RDRAND, RDSEED or EGETKEY will delay memory accesses from other
+ logical processors that miss their core caches, with an impact similar to
+ legacy locked cache-line-split accesses.
+
+The microcode updates provide an opt-out mechanism (RNGDS_MITG_DIS) to disable
+the mitigation for RDRAND and RDSEED instructions executed outside of Intel
+Software Guard Extensions (Intel SGX) enclaves. On logical processors that
+disable the mitigation using this opt-out mechanism, RDRAND and RDSEED do not
+take longer to execute and do not impact performance of sibling logical
+processors' memory accesses. The opt-out mechanism does not affect Intel SGX
+enclaves (including execution of RDRAND or RDSEED inside an enclave, as well
+as EGETKEY execution).
+
+IA32_MCU_OPT_CTRL MSR Definition
+--------------------------------
+Along with the mitigation for this issue, Intel added a new thread-scope
+IA32_MCU_OPT_CTRL MSR (address 0x123). The presence of this MSR and
+RNGDS_MITG_DIS (bit 0) is enumerated by CPUID.(EAX=07H,ECX=0).EDX[SRBDS_CTRL =
+9]==1. This MSR is introduced through the microcode update.
+
+Setting IA32_MCU_OPT_CTRL[0] (RNGDS_MITG_DIS) to 1 for a logical processor
+disables the mitigation for RDRAND and RDSEED executed outside of an Intel SGX
+enclave on that logical processor. Opting out of the mitigation for a
+particular logical processor does not affect the RDRAND and RDSEED mitigations
+for other logical processors.
+
+Note that inside of an Intel SGX enclave, the mitigation is applied regardless
+of the value of RNGDS_MITG_DIS.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+The kernel command line allows control over the SRBDS mitigation at boot time
+with the option "srbds=". The option for this is:
+
+  =============  =============================================================
+  off            This option disables SRBDS mitigation for RDRAND and RDSEED on
+                 affected platforms.
+  =============  =============================================================
+
+SRBDS System Information
+------------------------
+The Linux kernel provides vulnerability status information through sysfs. For
+SRBDS this can be accessed by the following sysfs file:
+/sys/devices/system/cpu/vulnerabilities/srbds
+
+The possible values contained in this file are:
+
+  ==============================  =============================================
+  Not affected                    Processor not vulnerable
+  Vulnerable                      Processor vulnerable and mitigation disabled
+  Vulnerable: No microcode        Processor vulnerable and microcode is missing
+                                  mitigation
+  Mitigation: Microcode           Processor is vulnerable and mitigation is in
+                                  effect.
+  Mitigation: TSX disabled        Processor is only vulnerable when TSX is
+                                  enabled while this system was booted with TSX
+                                  disabled.
+  Unknown: Dependent on
+  hypervisor status               Running on virtual guest processor that is
+                                  affected but with no way to know if host
+                                  processor is mitigated or vulnerable.
+  ==============================  =============================================
+
+SRBDS Default mitigation
+------------------------
+This new microcode serializes processor access during execution of RDRAND
+and RDSEED and ensures that the shared buffer is overwritten before it is
+released for reuse. Use the "srbds=off" kernel command line option to disable
+the mitigation for RDRAND and RDSEED.
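Note on special-register-buffer-data-sampling.rst: the sysfs status file described under "SRBDS System Information" can be checked from a script. The following is a minimal Python sketch, not part of the patch. The directory path comes from the document. The helper names (read_status, classify) and the bucketing by leading keyword are illustrative assumptions.

```python
from pathlib import Path

# Directory holding one status file per vulnerability, as named in the text.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_status(name, base=VULN_DIR):
    """Return the stripped contents of a status file, or None if unreadable."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


def classify(status):
    """Bucket a status string by its leading keyword (illustrative only)."""
    text = status.strip()
    if text == "Not affected":
        return "not-affected"
    prefixes = (
        ("Mitigation", "mitigated"),
        ("Vulnerable", "vulnerable"),
        ("Unknown", "unknown"),
    )
    for prefix, label in prefixes:
        if text.startswith(prefix):
            return label
    return "unrecognized"
```

For example, read_status("srbds") returns the current SRBDS line on a kernel that exposes the file, and classify() reduces it to a coarse bucket that a monitoring script could alert on. The same helpers apply to the other files in that directory, such as mmio_stale_data.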
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index e05e581af5cf..c4dcdb3d0d45 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -60,8 +60,8 @@ privileged data touched during the speculative execution.
Spectre variant 1 attacks take advantage of speculative execution of
conditional branches, while Spectre variant 2 attacks use speculative
execution of indirect branches to leak privileged memory.
-See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[7] <spec_ref7>`
-:ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
+See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[6] <spec_ref6>`
+:ref:`[7] <spec_ref7>` :ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
Spectre variant 1 (Bounds Check Bypass)
---------------------------------------
@@ -131,6 +131,19 @@ steer its indirect branch speculations to gadget code, and measure the
speculative execution's side effects left in level 1 cache to infer the
victim's data.
+Yet another variant 2 attack vector is for the attacker to poison the
+Branch History Buffer (BHB) to speculatively steer an indirect branch
+to a specific Branch Target Buffer (BTB) entry, even if the entry isn't
+associated with the source address of the indirect branch. Specifically,
+the BHB might be shared across privilege levels even in the presence of
+Enhanced IBRS.
+
+Currently the only known real-world BHB attack vector is via
+unprivileged eBPF. Therefore, it's highly recommended to not enable
+unprivileged eBPF, especially when eIBRS is used (without retpolines).
+For a full mitigation against BHB attacks, it's recommended to use
+retpolines (or eIBRS combined with retpolines).
+
Attack scenarios
----------------
@@ -364,13 +377,15 @@ The possible values in this file are:
- Kernel status:
- ==================================== =================================
- 'Not affected' The processor is not vulnerable
- 'Vulnerable' Vulnerable, no mitigation
- 'Mitigation: Full generic retpoline' Software-focused mitigation
- 'Mitigation: Full AMD retpoline' AMD-specific software mitigation
- 'Mitigation: Enhanced IBRS' Hardware-focused mitigation
- ==================================== =================================
+ ======================================== =================================
+ 'Not affected' The processor is not vulnerable
+ 'Mitigation: None' Vulnerable, no mitigation
+ 'Mitigation: Retpolines' Use Retpoline thunks
+ 'Mitigation: LFENCE' Use LFENCE instructions
+ 'Mitigation: Enhanced IBRS' Hardware-focused mitigation
+ 'Mitigation: Enhanced IBRS + Retpolines' Hardware-focused + Retpolines
+ 'Mitigation: Enhanced IBRS + LFENCE' Hardware-focused + LFENCE
+ ======================================== =================================
- Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is
used to protect against Spectre variant 2 attacks when calling firmware (x86 only).
@@ -407,6 +422,14 @@ The possible values in this file are:
'RSB filling' Protection of RSB on context switch enabled
============= ===========================================
+ - EIBRS Post-barrier Return Stack Buffer (PBRSB) protection status:
+
+ =========================== =======================================================
+ 'PBRSB-eIBRS: SW sequence' CPU is affected and protection of RSB on VMEXIT enabled
+ 'PBRSB-eIBRS: Vulnerable' CPU is vulnerable
+ 'PBRSB-eIBRS: Not affected' CPU is not affected by PBRSB
+ =========================== =======================================================
+
Full mitigation might require a microcode update from the CPU
vendor. When the necessary microcode is not available, the kernel will
report vulnerability.
@@ -468,7 +491,7 @@ Spectre variant 2
before invoking any firmware code to prevent Spectre variant 2 exploits
using the firmware.
- Using kernel address space randomization (CONFIG_RANDOMIZE_SLAB=y
+ Using kernel address space randomization (CONFIG_RANDOMIZE_BASE=y
and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) makes
attacks on the kernel generally more difficult.
@@ -490,9 +513,8 @@ Spectre variant 2
Restricting indirect branch speculation on a user program will
also prevent the program from launching a variant 2 attack
- on x86. All sand-boxed SECCOMP programs have indirect branch
- speculation restricted by default. Administrators can change
- that behavior via the kernel command line and sysfs control files.
+ on x86. Administrators can change that behavior via the kernel
+ command line and sysfs control files.
See :ref:`spectre_mitigation_control_command_line`.
Programs that disable their indirect branch speculation will have
@@ -584,71 +606,26 @@ kernel command line.
Specific mitigations can also be selected manually:
- retpoline
- replace indirect branches
- retpoline,generic
- google's original retpoline
- retpoline,amd
- AMD-specific minimal thunk
+ retpoline auto pick between generic,lfence
+ retpoline,generic Retpolines
+ retpoline,lfence LFENCE; indirect branch
+ retpoline,amd alias for retpoline,lfence
+ eibrs enhanced IBRS
+ eibrs,retpoline enhanced IBRS + Retpolines
+ eibrs,lfence enhanced IBRS + LFENCE
+ ibrs use IBRS to protect kernel
Not specifying this option is equivalent to
spectre_v2=auto.
-For user space mitigation:
-
- spectre_v2_user=
-
- [X86] Control mitigation of Spectre variant 2
- (indirect branch speculation) vulnerability between
- user space tasks
-
- on
- Unconditionally enable mitigations. Is
- enforced by spectre_v2=on
-
- off
- Unconditionally disable mitigations. Is
- enforced by spectre_v2=off
-
- prctl
- Indirect branch speculation is enabled,
- but mitigation can be enabled via prctl
- per thread. The mitigation control state
- is inherited on fork.
-
- prctl,ibpb
- Like "prctl" above, but only STIBP is
- controlled per thread. IBPB is issued
- always when switching between different user
- space processes.
-
- seccomp
- Same as "prctl" above, but all seccomp
- threads will enable the mitigation unless
- they explicitly opt out.
-
- seccomp,ibpb
- Like "seccomp" above, but only STIBP is
- controlled per thread. IBPB is issued
- always when switching between different
- user space processes.
-
- auto
- Kernel selects the mitigation depending on
- the available CPU features and vulnerability.
-
- Default mitigation:
- If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
-
- Not specifying this option is equivalent to
- spectre_v2_user=auto.
-
In general the kernel by default selects
reasonable mitigations for the current CPU. To
disable Spectre variant 2 mitigations, boot with
spectre_v2=off. Spectre variant 1 mitigations
cannot be disabled.
+For spectre_v2_user see Documentation/admin-guide/kernel-parameters.txt
+
Mitigation selection guide
--------------------------
@@ -674,9 +651,8 @@ Mitigation selection guide
off by disabling their indirect branch speculation when they are run
(See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
This prevents untrusted programs from polluting the branch target
- buffer. All programs running in SECCOMP sandboxes have indirect
- branch speculation restricted by default. This behavior can be
- changed via the kernel command line and sysfs control files. See
+ buffer. This behavior can be changed via the kernel command line
+ and sysfs control files. See
:ref:`spectre_mitigation_control_command_line`.
3. High security mode
@@ -730,7 +706,7 @@ AMD white papers:
.. _spec_ref6:
-[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesforManagingSpeculation_WP_7-18Update_FNL.pdf>`_.
+[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf>`_.
ARM white papers:
diff --git a/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
index af6865b822d2..76673affd917 100644
--- a/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+++ b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
@@ -60,7 +60,7 @@ Hyper-Thread attacks are possible.
The victim of a malicious actor does not need to make use of TSX. Only the
attacker needs to begin a TSX transaction and raise an asynchronous abort
-which in turn potenitally leaks data stored in the buffers.
+which in turn potentially leaks data stored in the buffers.
More detailed technical information is available in the TAA specific x86
architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
@@ -136,8 +136,6 @@ enables the mitigation by default.
The mitigation can be controlled at boot time via a kernel command line option.
See :ref:`taa_mitigation_control_command_line`.
-.. _virt_mechanism:
-
Virtualization mitigation
^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index f1d0ccffbe72..5bfafcbb9562 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -18,6 +18,9 @@ etc.
devices
sysctl/index
+ abi
+ features
+
This section describes CPU vulnerabilities and their mitigations.
.. toctree::
@@ -31,7 +34,8 @@ problems and bugs in particular.
.. toctree::
:maxdepth: 1
- reporting-bugs
+ reporting-issues
+ reporting-regressions
security-bugs
bug-hunting
bug-bisect
@@ -41,6 +45,7 @@ problems and bugs in particular.
init
kdump/index
perf/index
+ pstore-blk
This is the beginning of a section with information of interest to
application developers. Documents covering various aspects of the kernel
@@ -75,8 +80,10 @@ configure specific aspects of kernel behavior to your liking.
cputopology
dell_rbu
device-mapper/index
+ edid
efi-stub
ext4
+ filesystem-monitoring
nfs/index
gpio/index
highuid
@@ -92,6 +99,7 @@ configure specific aspects of kernel behavior to your liking.
lockup-watchdogs
LSM/index
md
+ media/index
mm/index
module-signing
mono
@@ -106,13 +114,13 @@ configure specific aspects of kernel behavior to your liking.
rtc
serial-console
svga
+ syscall-user-dispatch
sysrq
thunderbolt
ufs
unicode
vga-softcursor
video-output
- wimax/index
xfs
.. only:: subproject and html
diff --git a/Documentation/admin-guide/init.rst b/Documentation/admin-guide/init.rst
index e89d97f31eaf..41f06a09152e 100644
--- a/Documentation/admin-guide/init.rst
+++ b/Documentation/admin-guide/init.rst
@@ -1,52 +1,48 @@
-Explaining the dreaded "No init found." boot hang message
+Explaining the "No working init found." boot hang message
=========================================================
+:Authors: Andreas Mohr <andi at lisas period de>
+ Cristian Souza <cristianmsbr at gmail period com>
-OK, so you've got this pretty unintuitive message (currently located
-in init/main.c) and are wondering what the H*** went wrong.
-Some high-level reasons for failure (listed roughly in order of execution)
-to load the init binary are:
-
-A) Unable to mount root FS
-B) init binary doesn't exist on rootfs
-C) broken console device
-D) binary exists but dependencies not available
-E) binary cannot be loaded
-
-Detailed explanations:
-
-A) Set "debug" kernel parameter (in bootloader config file or CONFIG_CMDLINE)
- to get more detailed kernel messages.
-B) make sure you have the correct root FS type
- (and ``root=`` kernel parameter points to the correct partition),
- required drivers such as storage hardware (such as SCSI or USB!)
- and filesystem (ext3, jffs2 etc.) are builtin (alternatively as modules,
- to be pre-loaded by an initrd)
-C) Possibly a conflict in ``console= setup`` --> initial console unavailable.
- E.g. some serial consoles are unreliable due to serial IRQ issues (e.g.
- missing interrupt-based configuration).
+This document provides some high-level reasons for failure
+(listed roughly in order of execution) to load the init binary.
+
+1) **Unable to mount root FS**: Set "debug" kernel parameter (in bootloader
+ config file or CONFIG_CMDLINE) to get more detailed kernel messages.
+
+2) **init binary doesn't exist on rootfs**: Make sure you have the correct
+ root FS type (and ``root=`` kernel parameter points to the correct
+ partition), required drivers such as storage hardware (such as SCSI or
+ USB!) and filesystem (ext3, jffs2, etc.) are builtin (alternatively as
+ modules, to be pre-loaded by an initrd).
+
+3) **Broken console device**: Possibly a conflict in ``console=`` setup
+ --> initial console unavailable. E.g. some serial consoles are unreliable
+ due to serial IRQ issues (e.g. missing interrupt-based configuration).
Try using a different ``console=`` device or e.g. ``netconsole=``.
-D) e.g. required library dependencies of the init binary such as
- ``/lib/ld-linux.so.2`` missing or broken. Use
- ``readelf -d <INIT>|grep NEEDED`` to find out which libraries are required.
-E) make sure the binary's architecture matches your hardware.
- E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM hardware.
- In case you tried loading a non-binary file here (shell script?),
- you should make sure that the script specifies an interpreter in its shebang
- header line (``#!/...``) that is fully working (including its library
- dependencies). And before tackling scripts, better first test a simple
- non-script binary such as ``/bin/sh`` and confirm its successful execution.
- To find out more, add code ``to init/main.c`` to display kernel_execve()s
- return values.
+
+4) **Binary exists but dependencies not available**: E.g. required library
+ dependencies of the init binary such as ``/lib/ld-linux.so.2`` missing or
+ broken. Use ``readelf -d <INIT>|grep NEEDED`` to find out which libraries
+ are required.
+
+5) **Binary cannot be loaded**: Make sure the binary's architecture matches
+ your hardware. E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM
+ hardware. In case you tried loading a non-binary file here (shell script?),
+ you should make sure that the script specifies an interpreter in its
+ shebang header line (``#!/...``) that is fully working (including its
+ library dependencies). And before tackling scripts, better first test a
+ simple non-script binary such as ``/bin/sh`` and confirm its successful
+ execution. To find out more, add code to ``init/main.c`` to display
+ ``kernel_execve()``'s return values.
Please extend this explanation whenever you find new failure causes
(after all loading the init binary is a CRITICAL and hard transition step
-which needs to be made as painless as possible), then submit patch to LKML.
+which needs to be made as painless as possible), then submit a patch to LKML.
Further TODOs:
- Implement the various ``run_init_process()`` invocations via a struct array
which can then store the ``kernel_execve()`` result value and on failure
log it all by iterating over **all** results (very important usability fix).
-- try to make the implementation itself more helpful in general,
- e.g. by providing additional error messages at affected places.
+- Try to make the implementation itself more helpful in general, e.g. by
+ providing additional error messages at affected places.
-Andreas Mohr <andi at lisas period de>
diff --git a/Documentation/admin-guide/initrd.rst b/Documentation/admin-guide/initrd.rst
index a03dabaaf3a3..67bbad8806e8 100644
--- a/Documentation/admin-guide/initrd.rst
+++ b/Documentation/admin-guide/initrd.rst
@@ -376,7 +376,7 @@ Resources
---------
.. [#f1] Almesberger, Werner; "Booting Linux: The History and the Future"
- http://www.almesberger.net/cv/papers/ols2k-9.ps.gz
+ https://www.almesberger.net/cv/papers/ols2k-9.ps.gz
.. [#f2] newlib package (experimental), with initrd example
https://www.sourceware.org/newlib/
.. [#f3] util-linux: Miscellaneous utilities for Linux
diff --git a/Documentation/admin-guide/iostats.rst b/Documentation/admin-guide/iostats.rst
index df5b8345c41d..609a3201fd4e 100644
--- a/Documentation/admin-guide/iostats.rst
+++ b/Documentation/admin-guide/iostats.rst
@@ -76,7 +76,7 @@ Field 3 -- # of sectors read (unsigned long)
Field 4 -- # of milliseconds spent reading (unsigned int)
This is the total number of milliseconds spent by all reads (as
- measured from __make_request() to end_that_request_last()).
+ measured from blk_mq_alloc_request() to __blk_mq_end_request()).
Field 5 -- # of writes completed (unsigned long)
This is the total number of writes completed successfully.
@@ -89,7 +89,7 @@ Field 7 -- # of sectors written (unsigned long)
Field 8 -- # of milliseconds spent writing (unsigned int)
This is the total number of milliseconds spent by all writes (as
- measured from __make_request() to end_that_request_last()).
+ measured from blk_mq_alloc_request() to __blk_mq_end_request()).
Field 9 -- # of I/Os currently in progress (unsigned int)
The only field that should go to zero. Incremented as requests are
@@ -100,7 +100,7 @@ Field 10 -- # of milliseconds spent doing I/Os (unsigned int)
Since 5.0 this field counts jiffies when at least one request was
started or completed. If request runs more than 2 jiffies then some
- I/O time will not be accounted unless there are other requests.
+ I/O time might not be accounted in case of concurrent requests.
Field 11 -- weighted # of milliseconds spent doing I/Os (unsigned int)
This field is incremented at each I/O start, I/O completion, I/O
@@ -120,7 +120,7 @@ Field 14 -- # of sectors discarded (unsigned long)
Field 15 -- # of milliseconds spent discarding (unsigned int)
This is the total number of milliseconds spent by all discards (as
- measured from __make_request() to end_that_request_last()).
+ measured from blk_mq_alloc_request() to __blk_mq_end_request()).
Field 16 -- # of flush requests completed
This is the total number of flush requests completed successfully.
@@ -143,6 +143,9 @@ are summed (possibly overflowing the unsigned long variable they are
summed to) and the result given to the user. There is no convenient
user interface for accessing the per-CPU counters themselves.
+Since 4.19 request times are measured with nanosecond precision and
+truncated to milliseconds before being shown in this interface.
+
Disks vs Partitions
-------------------
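For illustration only (this is not part of the patch): a minimal Python sketch of how a reader might pick out the millisecond fields discussed in the hunks above from one line in the ``/proc/diskstats`` layout, where the numbered fields follow the major number, minor number and device name. The helper names ``parse_diskstats_line`` and ``ns_to_reported_ms`` and the sample values are hypothetical; the truncation helper only mirrors the statement that nanosecond measurements are truncated to milliseconds.

```python
def parse_diskstats_line(line):
    """Split one diskstats-style line into a few named fields.

    Field numbers follow the documentation: field 1 is the first value
    after the device name.
    """
    parts = line.split()
    name = parts[2]
    fields = [int(value) for value in parts[3:]]

    def field(number):
        # Older kernels expose fewer fields, so missing ones become None.
        return fields[number - 1] if number <= len(fields) else None

    return {
        "device": name,
        "reads_completed": field(1),
        "ms_reading": field(4),
        "ms_writing": field(8),
        "io_in_progress": field(9),
        "ms_doing_io": field(10),
        "ms_discarding": field(15),
    }


def ns_to_reported_ms(nanoseconds):
    """Truncate (not round) a nanosecond total to whole milliseconds."""
    return nanoseconds // 1_000_000


# Hypothetical sample line, not taken from a real system.
sample = "8 0 sda 100 5 2000 150 50 2 800 90 0 120 240"
stats = parse_diskstats_line(sample)
```

With the sample line above, ``stats["ms_reading"]`` holds the fourth field and ``stats["ms_discarding"]`` is ``None`` because the line carries only eleven fields.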
diff --git a/Documentation/admin-guide/kdump/gdbmacros.txt b/Documentation/admin-guide/kdump/gdbmacros.txt
index 220d0a80ca2c..82aecdcae8a6 100644
--- a/Documentation/admin-guide/kdump/gdbmacros.txt
+++ b/Documentation/admin-guide/kdump/gdbmacros.txt
@@ -170,57 +170,103 @@ document trapinfo
address the kernel panicked.
end
-define dump_log_idx
- set $idx = $arg0
- if ($argc > 1)
- set $prev_flags = $arg1
+define dump_record
+ set var $desc = $arg0
+ set var $info = $arg1
+ if ($argc > 2)
+ set var $prev_flags = $arg2
else
- set $prev_flags = 0
+ set var $prev_flags = 0
end
- set $msg = ((struct printk_log *) (log_buf + $idx))
- set $prefix = 1
- set $newline = 1
- set $log = log_buf + $idx + sizeof(*$msg)
-
- # prev & LOG_CONT && !(msg->flags & LOG_PREIX)
- if (($prev_flags & 8) && !($msg->flags & 4))
- set $prefix = 0
+
+ set var $prefix = 1
+ set var $newline = 1
+
+ set var $begin = $desc->text_blk_lpos.begin % (1U << prb->text_data_ring.size_bits)
+ set var $next = $desc->text_blk_lpos.next % (1U << prb->text_data_ring.size_bits)
+
+ # handle data-less record
+ if ($begin & 1)
+ set var $text_len = 0
+ set var $log = ""
+ else
+ # handle wrapping data block
+ if ($begin > $next)
+ set var $begin = 0
+ end
+
+ # skip over descriptor id
+ set var $begin = $begin + sizeof(long)
+
+ # handle truncated message
+ if ($next - $begin < $info->text_len)
+ set var $text_len = $next - $begin
+ else
+ set var $text_len = $info->text_len
+ end
+
+ set var $log = &prb->text_data_ring.data[$begin]
+ end
+
+ # prev & LOG_CONT && !(info->flags & LOG_PREFIX)
+ if (($prev_flags & 8) && !($info->flags & 4))
+ set var $prefix = 0
end
- # msg->flags & LOG_CONT
- if ($msg->flags & 8)
+ # info->flags & LOG_CONT
+ if ($info->flags & 8)
# (prev & LOG_CONT && !(prev & LOG_NEWLINE))
if (($prev_flags & 8) && !($prev_flags & 2))
- set $prefix = 0
+ set var $prefix = 0
end
- # (!(msg->flags & LOG_NEWLINE))
- if (!($msg->flags & 2))
- set $newline = 0
+ # (!(info->flags & LOG_NEWLINE))
+ if (!($info->flags & 2))
+ set var $newline = 0
end
end
if ($prefix)
- printf "[%5lu.%06lu] ", $msg->ts_nsec / 1000000000, $msg->ts_nsec % 1000000000
+ printf "[%5lu.%06lu] ", $info->ts_nsec / 1000000000, $info->ts_nsec % 1000000000
end
- if ($msg->text_len != 0)
- eval "printf \"%%%d.%ds\", $log", $msg->text_len, $msg->text_len
+ if ($text_len)
+ eval "printf \"%%%d.%ds\", $log", $text_len, $text_len
end
if ($newline)
printf "\n"
end
- if ($msg->dict_len > 0)
- set $dict = $log + $msg->text_len
- set $idx = 0
- set $line = 1
- while ($idx < $msg->dict_len)
- if ($line)
- printf " "
- set $line = 0
+
+ # handle dictionary data
+
+ set var $dict = &$info->dev_info.subsystem[0]
+ set var $dict_len = sizeof($info->dev_info.subsystem)
+ if ($dict[0] != '\0')
+ printf " SUBSYSTEM="
+ set var $idx = 0
+ while ($idx < $dict_len)
+ set var $c = $dict[$idx]
+ if ($c == '\0')
+ loop_break
+ else
+ if ($c < ' ' || $c >= 127 || $c == '\\')
+ printf "\\x%02x", $c
+ else
+ printf "%c", $c
+ end
end
- set $c = $dict[$idx]
+ set var $idx = $idx + 1
+ end
+ printf "\n"
+ end
+
+ set var $dict = &$info->dev_info.device[0]
+ set var $dict_len = sizeof($info->dev_info.device)
+ if ($dict[0] != '\0')
+ printf " DEVICE="
+ set var $idx = 0
+ while ($idx < $dict_len)
+ set var $c = $dict[$idx]
if ($c == '\0')
- printf "\n"
- set $line = 1
+ loop_break
else
if ($c < ' ' || $c >= 127 || $c == '\\')
printf "\\x%02x", $c
@@ -228,33 +274,46 @@ define dump_log_idx
printf "%c", $c
end
end
- set $idx = $idx + 1
+ set var $idx = $idx + 1
end
printf "\n"
end
end
-document dump_log_idx
- Dump a single log given its index in the log buffer. The first
- parameter is the index into log_buf, the second is optional and
- specified the previous log buffer's flags, used for properly
- formatting continued lines.
+document dump_record
+ Dump a single record. The first parameter is the descriptor,
+ the second parameter is the info, the third parameter is
+ optional and specifies the previous record's flags, used for
+ properly formatting continued lines.
end
define dmesg
- set $i = log_first_idx
- set $end_idx = log_first_idx
- set $prev_flags = 0
+ # definitions from kernel/printk/printk_ringbuffer.h
+ set var $desc_committed = 1
+ set var $desc_finalized = 2
+ set var $desc_sv_bits = sizeof(long) * 8
+ set var $desc_flags_shift = $desc_sv_bits - 2
+ set var $desc_flags_mask = 3 << $desc_flags_shift
+ set var $id_mask = ~$desc_flags_mask
+
+ set var $desc_count = 1U << prb->desc_ring.count_bits
+ set var $prev_flags = 0
+
+ set var $id = prb->desc_ring.tail_id.counter
+ set var $end_id = prb->desc_ring.head_id.counter
while (1)
- set $msg = ((struct printk_log *) (log_buf + $i))
- if ($msg->len == 0)
- set $i = 0
- else
- dump_log_idx $i $prev_flags
- set $i = $i + $msg->len
- set $prev_flags = $msg->flags
+ set var $desc = &prb->desc_ring.descs[$id % $desc_count]
+ set var $info = &prb->desc_ring.infos[$id % $desc_count]
+
+ # skip non-committed record
+ set var $state = 3 & ($desc->state_var.counter >> $desc_flags_shift)
+ if ($state == $desc_committed || $state == $desc_finalized)
+ dump_record $desc $info $prev_flags
+ set var $prev_flags = $info->flags
end
- if ($i == $end_idx)
+
+ set var $id = ($id + 1) & $id_mask
+ if ($id == $end_id)
loop_break
end
end
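As an illustration only (not part of the patch), the bit arithmetic used by the dmesg macro above can be sketched in Python: the top two bits of a descriptor's state variable hold the state, and the remaining bits hold the descriptor ID. The function names and the 64-bit default for the size of ``long`` are assumptions made for this sketch.

```python
# State values as set by the macro ($desc_committed, $desc_finalized).
DESC_COMMITTED = 1
DESC_FINALIZED = 2


def decode_state_var(state_var, long_bits=64):
    """Return (state, descriptor_id) from a raw state variable."""
    flags_shift = long_bits - 2
    flags_mask = 3 << flags_shift
    # Python integers are unbounded, so clamp the inverted mask to the
    # word size that gdb would use implicitly.
    id_mask = ~flags_mask & ((1 << long_bits) - 1)
    state = 3 & (state_var >> flags_shift)
    return state, state_var & id_mask


def next_id(desc_id, long_bits=64):
    """Advance a descriptor ID, wrapping inside the ID bits."""
    id_mask = (1 << (long_bits - 2)) - 1
    return (desc_id + 1) & id_mask


def is_readable(state):
    """Mirror the macro's check for committed or finalized records."""
    return state in (DESC_COMMITTED, DESC_FINALIZED)
```

The descriptor and info entries for an ID are then found at ``id % desc_count``, as the macro does with ``$id % $desc_count``.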
diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index ac7e131d2935..a748e7eb4429 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -2,7 +2,7 @@
Documentation for Kdump - The kexec-based Crash Dumping Solution
================================================================
-This document includes overview, setup and installation, and analysis
+This document includes overview, setup, installation, and analysis
information.
Overview
@@ -13,9 +13,9 @@ dump of the system kernel's memory needs to be taken (for example, when
the system panics). The system kernel's memory image is preserved across
the reboot and is accessible to the dump-capture kernel.
-You can use common commands, such as cp and scp, to copy the
-memory image to a dump file on the local disk, or across the network to
-a remote system.
+You can use common commands, such as cp, scp or makedumpfile, to copy
+the memory image to a dump file on the local disk, or across the network
+to a remote system.
Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
s390x, arm and arm64 architectures.
@@ -26,13 +26,15 @@ the dump-capture kernel. This ensures that ongoing Direct Memory Access
The kexec -p command loads the dump-capture kernel into this reserved
memory.
-On x86 machines, the first 640 KB of physical memory is needed to boot,
-regardless of where the kernel loads. Therefore, kexec backs up this
-region just before rebooting into the dump-capture kernel.
+On x86 machines, the first 640 KB of physical memory is needed for boot,
+regardless of where the kernel loads. For simpler handling, the whole
+low 1M is reserved to avoid any later kernel or device driver writing
+data into this area. This way, the low 1M can be reused as system RAM
+by the kdump kernel without extra handling.
-Similarly on PPC64 machines first 32KB of physical memory is needed for
-booting regardless of where the kernel is loaded and to support 64K page
-size kexec backs up the first 64KB memory.
+On PPC64 machines, the first 32KB of physical memory is needed for booting
+regardless of where the kernel is loaded; to support a 64K page size,
+kexec backs up the first 64KB of memory.
For s390x, when kdump is triggered, the crashkernel region is exchanged
with the region [0, crashkernel region size] and then the kdump kernel
@@ -46,14 +48,14 @@ passed to the dump-capture kernel through the elfcorehdr= boot
parameter. Optionally the size of the ELF header can also be passed
when using the elfcorehdr=[size[KMG]@]offset[KMG] syntax.
-
With the dump-capture kernel, you can access the memory image through
/proc/vmcore. This exports the dump as an ELF-format file that you can
-write out using file copy commands such as cp or scp. Further, you can
-use analysis tools such as the GNU Debugger (GDB) and the Crash tool to
-debug the dump file. This method ensures that the dump pages are correctly
-ordered.
-
+write out using file copy commands such as cp or scp. You can also use
+the makedumpfile utility to analyze and write out filtered contents with
+options, e.g. with '-d 31' it will only write out kernel data. Further,
+you can use analysis tools such as the GNU Debugger (GDB) and the Crash
+tool to debug the dump file. This method ensures that the dump pages are
+correctly ordered.
Setup and Installation
======================
@@ -125,9 +127,18 @@ dump-capture kernels for enabling kdump support.
System kernel config options
----------------------------
-1) Enable "kexec system call" in "Processor type and features."::
+1) Enable "kexec system call" or "kexec file based system call" in
+ "Processor type and features."::
+
+ CONFIG_KEXEC=y or CONFIG_KEXEC_FILE=y
+
+ Both of them select KEXEC_CORE::
+
+ CONFIG_KEXEC_CORE=y
- CONFIG_KEXEC=y
+ Subsequently, CRASH_CORE is selected by KEXEC_CORE::
+
+ CONFIG_CRASH_CORE=y
2) Enable "sysfs file system support" in "Filesystem" -> "Pseudo
filesystems." This is usually enabled by default::
@@ -135,9 +146,9 @@ System kernel config options
CONFIG_SYSFS=y
Note that "sysfs file system support" might not appear in the "Pseudo
- filesystems" menu if "Configure standard kernel features (for small
- systems)" is not enabled in "General Setup." In this case, check the
- .config file itself to ensure that sysfs is turned on, as follows::
+ filesystems" menu if "Configure standard kernel features (expert users)"
+ is not enabled in "General Setup." In this case, check the .config file
+ itself to ensure that sysfs is turned on, as follows::
grep 'CONFIG_SYSFS' .config
@@ -175,17 +186,19 @@ Dump-capture kernel config options (Arch Dependent, i386 and x86_64)
CONFIG_HIGHMEM4G
-2) On i386 and x86_64, disable symmetric multi-processing support
- under "Processor type and features"::
+2) With CONFIG_SMP=y, nr_cpus=1 usually needs to be specified on the kernel
+ command line when loading the dump-capture kernel, because one
+ CPU is enough for the kdump kernel to dump vmcore on most systems.
- CONFIG_SMP=n
+ However, you can also specify nr_cpus=X to enable multiple processors
+ in the kdump kernel. In this case, "disable_cpu_apicid=" is needed to
+ tell the kdump kernel which CPU is the first kernel's BSP. Please refer to
+ admin-guide/kernel-parameters.txt for more details.
- (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
- when loading the dump-capture kernel, see section "Load the Dump-capture
- Kernel".)
+ With CONFIG_SMP=n, the above does not apply.
-3) If one wants to build and use a relocatable kernel,
- Enable "Build a relocatable kernel" support under "Processor type and
+3) Building a relocatable kernel is recommended by default. If it is not yet enabled,
+ enable "Build a relocatable kernel" support under "Processor type and
features"::
CONFIG_RELOCATABLE=y
@@ -232,7 +245,7 @@ Dump-capture kernel config options (Arch Dependent, ia64)
as a dump-capture kernel if desired.
The crashkernel region can be automatically placed by the system
- kernel at run time. This is done by specifying the base address as 0,
+ kernel at runtime. This is done by specifying the base address as 0,
or omitting it all together::
crashkernel=256M@0
@@ -241,10 +254,6 @@ Dump-capture kernel config options (Arch Dependent, ia64)
crashkernel=256M
- If the start address is specified, note that the start address of the
- kernel will be aligned to 64Mb, so if the start address is not then
- any space below the alignment point will be wasted.
-
Dump-capture kernel config options (Arch Dependent, arm)
----------------------------------------------------------
@@ -260,46 +269,82 @@ Dump-capture kernel config options (Arch Dependent, arm64)
on non-VHE systems even if it is configured. This is because the CPU
will not be reset to EL2 on panic.
-Extended crashkernel syntax
+crashkernel syntax
===========================
+1) crashkernel=size@offset
-While the "crashkernel=size[@offset]" syntax is sufficient for most
-configurations, sometimes it's handy to have the reserved memory dependent
-on the value of System RAM -- that's mostly for distributors that pre-setup
-the kernel command line to avoid a unbootable system after some memory has
-been removed from the machine.
+ Here 'size' specifies how much memory to reserve for the dump-capture kernel
+ and 'offset' specifies the beginning of this reserved memory. For example,
+ "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
+ starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
-The syntax is::
+ The crashkernel region can be automatically placed by the system
+ kernel at run time. This is done by specifying the base address as 0,
+ or omitting it all together::
- crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
- range=start-[end]
+ crashkernel=256M@0
-For example::
+ or::
- crashkernel=512M-2G:64M,2G-:128M
+ crashkernel=256M
-This would mean:
+ If the start address is specified, note that the start address of the
+ kernel will be aligned to an architecture-dependent value, so if the
+ start address is not aligned, any space below the alignment point will
+ be wasted.
- 1) if the RAM is smaller than 512M, then don't reserve anything
- (this is the "rescue" case)
- 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
- 3) if the RAM size is larger than 2G, then reserve 128M
+2) range1:size1[,range2:size2,...][@offset]
+ While the "crashkernel=size[@offset]" syntax is sufficient for most
+ configurations, sometimes it's handy to have the reserved memory dependent
+ on the value of System RAM -- that's mostly for distributors that pre-setup
+ the kernel command line to avoid an unbootable system after some memory has
+ been removed from the machine.
+ The syntax is::
-Boot into System Kernel
-=======================
+ crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
+ range=start-[end]
+
+ For example::
+
+ crashkernel=512M-2G:64M,2G-:128M
+
+ This would mean:
+
+ 1) if the RAM is smaller than 512M, then don't reserve anything
+ (this is the "rescue" case)
+ 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
+ 3) if the RAM size is larger than 2G, then reserve 128M
+
+3) crashkernel=size,high and crashkernel=size,low
+ If memory above 4G is preferred, crashkernel=size,high can be used. With
+ it, physical memory may be allocated from the top, so the region can be
+ above 4G if the system has more than 4G of RAM installed. Otherwise, the
+ memory region will be allocated below 4G if available.
+
+ When crashkernel=X,high is passed, the kernel may allocate the physical
+ memory region above 4G; some low memory under 4G is still needed in this
+ case. There are three ways to get low memory:
+
+ 1) Kernel will allocate at least 256M memory below 4G automatically
+ if crashkernel=Y,low is not specified.
+ 2) Let user specify low memory size instead.
+ 3) Specified value 0 will disable low memory allocation::
+
+ crashkernel=0,low
+
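For illustration only (not part of the patch), the range-based syntax in item 2) above can be sketched in Python. The function names ``parse_size`` and ``reserved_size`` are hypothetical, and treating each range as inclusive at its start and exclusive at its end is an assumption of this sketch, not a statement about the kernel's parser.

```python
_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '64M' or '2G' to bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in _UNITS:
        return int(text[:-1]) * _UNITS[suffix]
    return int(text)


def reserved_size(spec, ram_bytes):
    """Return the bytes to reserve for a '<range>:<size>,...' spec.

    Returns 0 when no range matches, which corresponds to the
    "don't reserve anything" case in the text above.
    """
    ranges = spec.split("@", 1)[0]  # the optional offset is ignored here
    for entry in ranges.split(","):
        ram_range, size = entry.split(":")
        start, _, end = ram_range.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else None  # open-ended range
        if ram_bytes >= low and (high is None or ram_bytes < high):
            return parse_size(size)
    return 0


SPEC = "512M-2G:64M,2G-:128M"
small = reserved_size(SPEC, parse_size("256M"))
medium = reserved_size(SPEC, parse_size("1G"))
large = reserved_size(SPEC, parse_size("4G"))
```

With the example specification, a 256M machine reserves nothing, a 1G machine reserves 64M and a 4G machine reserves 128M, matching the three cases listed in the text.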
+Boot into System Kernel
+-----------------------
1) Update the boot loader (such as grub, yaboot, or lilo) configuration
files as necessary.
-2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
- where Y specifies how much memory to reserve for the dump-capture kernel
- and X specifies the beginning of this reserved memory. For example,
- "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
- starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
+2) Boot the system kernel with the boot parameter "crashkernel=Y@X".
- On x86 and x86_64, use "crashkernel=64M@16M".
+ On x86 and x86_64, use "crashkernel=Y[@X]". Most of the time, the
+ start address 'X' is not necessary; the kernel will search for a
+ suitable area unless an explicit start address is given.
On ppc64, use "crashkernel=128M@32M".
@@ -331,8 +376,8 @@ of dump-capture kernel. Following is the summary.
For i386 and x86_64:
- - Use vmlinux if kernel is not relocatable.
- Use bzImage/vmlinuz if kernel is relocatable.
+ - Use vmlinux if kernel is not relocatable.
For ppc64:
@@ -392,7 +437,7 @@ loading dump-capture kernel.
For i386, x86_64 and ia64:
- "1 irqpoll maxcpus=1 reset_devices"
+ "1 irqpoll nr_cpus=1 reset_devices"
For ppc64:
@@ -400,7 +445,7 @@ For ppc64:
For s390x:
- "1 maxcpus=1 cgroup_disable=memory"
+ "1 nr_cpus=1 cgroup_disable=memory"
For arm:
@@ -408,7 +453,7 @@ For arm:
For arm64:
- "1 maxcpus=1 reset_devices"
+ "1 nr_cpus=1 reset_devices"
Notes on loading the dump-capture kernel:
@@ -488,6 +533,14 @@ the following command::
cp /proc/vmcore <dump-file>
+or use scp to write out the dump file between hosts on a network, e.g.::
+
+ scp /proc/vmcore remote_username@remote_ip:<dump-file>
+
+You can also use the makedumpfile utility to write out the dump file
+with specified options to filter out unwanted contents, e.g.::
+
+ makedumpfile -l --message-level 1 -d 31 /proc/vmcore <dump-file>
Analysis
========
@@ -509,9 +562,12 @@ ELF32-format headers using the --elf32-core-headers kernel option on the
dump kernel.
You can also use the Crash utility to analyze dump files in Kdump
-format. Crash is available on Dave Anderson's site at the following URL:
+format. Crash is available at the following URL:
- http://people.redhat.com/~anderson/
+ https://github.com/crash-utility/crash
+
+Crash documentation can be found at:
+ https://crash-utility.github.io/
Trigger Kdump on WARN()
=======================
@@ -521,11 +577,18 @@ will cause a kdump to occur at the panic() call. In cases where a user wants
to specify this during runtime, /proc/sys/kernel/panic_on_warn can be set to 1
to achieve the same behaviour.
+Trigger Kdump on add_taint()
+============================
+
+The kernel parameter panic_on_taint facilitates a conditional call to panic()
+from within add_taint() whenever the value set in this bitmask matches with the
+bit flag being set by add_taint().
+This will cause a kdump to occur at the add_taint()->panic() call.
+
Contact
=======
-- Vivek Goyal (vgoyal@redhat.com)
-- Maneesh Soni (maneesh@in.ibm.com)
+- kexec@lists.infradead.org
GDB macros
==========
diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst
index 007a6b86e0ee..6726f439958c 100644
--- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
+++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
@@ -39,6 +39,12 @@ call.
User-space tools can get the kernel name, host name, kernel release
number, kernel version, architecture name and OS type from it.
+(uts_namespace, name)
+---------------------
+
+Offset of the name's member. Crash Utility and Makedumpfile get
+the start address of the init_uts_ns.name from this.
+
node_online_map
---------------
@@ -93,6 +99,11 @@ It exists in the sparse memory mapping model, and it is also somewhat
similar to the mem_map variable, both of them are used to translate an
address.
+MAX_PHYSMEM_BITS
+----------------
+
+Defines the maximum number of physical address bits supported.
+
page
----
@@ -184,50 +195,123 @@ from this.
Free areas descriptor. User-space tools use this value to iterate the
free_area ranges. MAX_ORDER is used by the zone buddy allocator.
-log_first_idx
+prb
+---
+
+A pointer to the printk ringbuffer (struct printk_ringbuffer). This
+may be pointing to the static boot ringbuffer or the dynamically
+allocated ringbuffer, depending on when the core dump occurred.
+Used by user-space tools to read the active kernel log buffer.
+
+printk_rb_static
+----------------
+
+A pointer to the static boot printk ringbuffer. If @prb has a
+different value, this is useful for viewing the initial boot messages,
+which may have been overwritten in the dynamically allocated
+ringbuffer.
+
+clear_seq
+---------
+
+The sequence number of the printk() record after the last clear
+command. It indicates the first record after the last
+SYSLOG_ACTION_CLEAR, like issued by 'dmesg -c'. Used by user-space
+tools to dump a subset of the dmesg log.
+
+printk_ringbuffer
+-----------------
+
+The size of a printk_ringbuffer structure. This structure contains all
+information required for accessing the various components of the
+kernel log buffer.
+
+(printk_ringbuffer, desc_ring|text_data_ring|dict_data_ring|fail)
+-----------------------------------------------------------------
+
+Offsets for the various components of the printk ringbuffer. Used by
+user-space tools to view the kernel log buffer without requiring the
+declaration of the structure.
+
+prb_desc_ring
-------------
-Index of the first record stored in the buffer log_buf. Used by
-user-space tools to read the strings in the log_buf.
+The size of the prb_desc_ring structure. This structure contains
+information about the set of record descriptors.
-log_buf
--------
+(prb_desc_ring, count_bits|descs|head_id|tail_id)
+-------------------------------------------------
+
+Offsets for the fields describing the set of record descriptors. Used
+by user-space tools to be able to traverse the descriptors without
+requiring the declaration of the structure.
+
+prb_desc
+--------
+
+The size of the prb_desc structure. This structure contains
+information about a single record descriptor.
+
+(prb_desc, info|state_var|text_blk_lpos|dict_blk_lpos)
+------------------------------------------------------
-Console output is written to the ring buffer log_buf at index
-log_first_idx. Used to get the kernel log.
+Offsets for the fields describing a record descriptor. Used by
+user-space tools to be able to read descriptors without requiring
+the declaration of the structure.
-log_buf_len
+prb_data_blk_lpos
+-----------------
+
+The size of the prb_data_blk_lpos structure. This structure contains
+information about where the text or dictionary data (data block) is
+located within the respective data ring.
+
+(prb_data_blk_lpos, begin|next)
+-------------------------------
+
+Offsets for the fields describing the location of a data block. Used
+by user-space tools to be able to locate data blocks without
+requiring the declaration of the structure.
+
+printk_info
-----------
-log_buf's length.
+The size of the printk_info structure. This structure contains all
+the meta-data for a record.
-clear_idx
----------
+(printk_info, seq|ts_nsec|text_len|dict_len|caller_id)
+------------------------------------------------------
-The index that the next printk() record to read after the last clear
-command. It indicates the first record after the last SYSLOG_ACTION
-_CLEAR, like issued by 'dmesg -c'. Used by user-space tools to dump
-the dmesg log.
+Offsets for the fields providing the meta-data for a record. Used by
+user-space tools to be able to read the information without requiring
+the declaration of the structure.
-log_next_idx
-------------
+prb_data_ring
+-------------
-The index of the next record to store in the buffer log_buf. Used to
-compute the index of the current buffer position.
+The size of the prb_data_ring structure. This structure contains
+information about a set of data blocks.
-printk_log
-----------
+(prb_data_ring, size_bits|data|head_lpos|tail_lpos)
+---------------------------------------------------
+
+Offsets for the fields describing a set of data blocks. Used by
+user-space tools to be able to access the data blocks without
+requiring the declaration of the structure.
+
+atomic_long_t
+-------------
-The size of a structure printk_log. Used to compute the size of
-messages, and extract dmesg log. It encapsulates header information for
-log_buf, such as timestamp, syslog level, etc.
+The size of the atomic_long_t structure. Used by user-space tools to
+be able to copy the full structure, regardless of its
+architecture-specific implementation.
-(printk_log, ts_nsec|len|text_len|dict_len)
--------------------------------------------
+(atomic_long_t, counter)
+------------------------
-It represents field offsets in struct printk_log. User space tools
-parse it and check whether the values of printk_log's members have been
-changed.
+Offset for the long value of an atomic_long_t variable. Used by
+user-space tools to access the long value without requiring the
+architecture-specific declaration.
(free_area.free_list, MIGRATE_TYPES)
------------------------------------
@@ -393,6 +477,31 @@ KERNELOFFSET
The kernel randomization offset. Used to compute the page offset. If
KASLR is disabled, this value is zero.
+KERNELPACMASK
+-------------
+
+The mask to extract the Pointer Authentication Code from a kernel virtual
+address.
+
+TCR_EL1.T1SZ
+------------
+
+Indicates the size offset of the memory region addressed by TTBR1_EL1.
+The region size is 2^(64-T1SZ) bytes.
+
+TTBR1_EL1 is the table base address register specified by ARMv8-A
+architecture which is used to lookup the page-tables for the Virtual
+addresses in the higher VA range (refer to ARMv8 ARM document for
+more details).
+
+MODULES_VADDR|MODULES_END|VMALLOC_START|VMALLOC_END|VMEMMAP_START|VMEMMAP_END
+-----------------------------------------------------------------------------
+
+Used to get the correct ranges:
+ MODULES_VADDR ~ MODULES_END-1 : Kernel module space.
+ VMALLOC_START ~ VMALLOC_END-1 : vmalloc() / ioremap() space.
+ VMEMMAP_START ~ VMEMMAP_END-1 : vmemmap region, used for struct page array.
+
arm
===
diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index 6d421694d98e..959f73a32712 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -3,8 +3,8 @@
The kernel's command-line parameters
====================================
-The following is a consolidated list of the kernel parameters as
-implemented by the __setup(), core_param() and module_param() macros
+The following is a consolidated list of the kernel parameters as implemented
+by the __setup(), early_param(), core_param() and module_param() macros
and sorted into English Dictionary order (defined as ignoring all
punctuation and sorting digits before letters in a case insensitive
manner), and with descriptions where known.
@@ -60,7 +60,7 @@ Note that for the special case of a range one can split the range into equal
sized groups and for each group use some amount from the beginning of that
group:
- <cpu number>-cpu number>:<used size>/<group size>
+ <cpu number>-<cpu number>:<used size>/<group size>
For example one can add to the command line following parameter:
@@ -68,7 +68,19 @@ For example one can add to the command line following parameter:
where the final item represents CPUs 100,101,125,126,150,151,...
+The value "N" can be used to represent the numerically last CPU on the system,
+i.e. "foo_cpus=16-N" would be equivalent to "16-31" on a 32 core system.
+Keep in mind that "N" is dynamic, so if system changes cause the bitmap width
+to change, such as fewer cores in the CPU list, then N and any ranges using N
+will also change. Use the same on a small 4 core system, and "16-N" becomes
+"16-3" and now the same boot input will be flagged as invalid (start > end).
+
+The special case-tolerant group name "all" has a meaning of selecting all CPUs,
+so that "nohz_full=all" is the equivalent of "nohz_full=0-N".
+
+The semantics of "N" and "all" are supported at the bitmap level and hold for
+all users of bitmap_parse().
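The CPU list rules described above (plain numbers, ranges, grouped ranges, "N", and "all") can be sketched in Python as follows. This is an illustrative model only, not the kernel's parser: the function name parse_cpu_list and its nr_cpus argument are hypothetical, and the sketch assumes "N" must be upper case while "all" is matched case-insensitively.

```python
def parse_cpu_list(spec, nr_cpus):
    """Expand a CPU list string into a sorted list of CPU numbers.

    Supports "a", "a-b", "a-b:used/group", "N" (last CPU) and "all".
    """
    last = nr_cpus - 1
    if spec.strip().lower() == "all":
        return list(range(nr_cpus))

    def number(token):
        token = token.strip()
        return last if token == "N" else int(token)

    cpus = set()
    for item in spec.split(","):
        rng, _, grp = item.partition(":")
        start_s, sep, end_s = rng.partition("-")
        start = number(start_s)
        end = number(end_s) if sep else start
        if start > end:
            raise ValueError("invalid range %r: start > end" % item)
        if end > last:
            raise ValueError("CPU %d out of range in %r" % (end, item))
        if grp:
            used_s, _, size_s = grp.partition("/")
            used, size = int(used_s), int(size_s)
            if used <= 0 or size <= 0 or used > size:
                raise ValueError("invalid group spec in %r" % item)
        else:
            used, size = 1, 1
        # Take the first `used` CPUs of every `size`-wide group in the range.
        for base in range(start, end + 1, size):
            cpus.update(range(base, min(base + used, end + 1)))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpu_list("16-N", 32))
    print(parse_cpu_list("100-2000:2/25", 4096)[:6])
```

With nr_cpus=4 the input "16-N" expands to "16-3" and the sketch raises ValueError, mirroring the "start > end" rejection described in the text.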
This document may not be entirely up to date and comprehensive. The command
"modinfo -p ${modulename}" shows a current list of all parameters of a loadable
@@ -87,6 +99,7 @@ parameter is applicable::
ALSA ALSA sound support is enabled.
APIC APIC support is enabled.
APM Advanced Power Management support is enabled.
+ APPARMOR AppArmor support is enabled.
ARM ARM architecture is enabled.
ARM64 ARM64 architecture is enabled.
AX25 Appropriate AX.25 support is enabled.
@@ -96,15 +109,15 @@ parameter is applicable::
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
- EIDE EIDE/ATAPI support is enabled.
EVM Extended Verification Module
FB The frame buffer device is enabled.
FTRACE Function tracing enabled.
GCOV GCOV profiling is enabled.
+ HIBERNATION HIBERNATION is enabled.
HW Appropriate hardware is enabled.
+ HYPER_V HYPERV support is enabled.
IA-64 IA-64 architecture is enabled.
IMA Integrity measurement architecture is enabled.
- IOSCHED More than one I/O scheduler is enabled.
IP_PNP IP DHCP, BOOTP, or RARP is enabled.
IPV6 IPv6 support is enabled.
ISAPNP ISA PnP code is enabled.
@@ -128,7 +141,6 @@ parameter is applicable::
NUMA NUMA support is enabled.
NFS Appropriate NFS support is enabled.
OF Devicetree is enabled.
- OSS OSS sound support is enabled.
PV_OPS A paravirtualized kernel is enabled.
PARIDE The ParIDE (parallel port IDE) subsystem is enabled.
PARISC The PA-RISC architecture is enabled.
@@ -140,6 +152,7 @@ parameter is applicable::
PPT Parallel port support is enabled.
PS2 Appropriate PS/2 support is enabled.
RAM RAM disk support is enabled.
+ RISCV RISCV architecture is enabled.
RDT Intel Resource Director Technology.
S390 S390 architecture is enabled.
SCSI Appropriate SCSI support is enabled.
@@ -147,7 +160,6 @@ parameter is applicable::
the Documentation/scsi/ sub-directory.
SECURITY Different security models are enabled.
SELINUX SELinux support is enabled.
- APPARMOR AppArmor support is enabled.
SERIAL Serial support is enabled.
SH SuperH architecture is enabled.
SMP The kernel is an SMP kernel.
@@ -155,7 +167,6 @@ parameter is applicable::
SWSUSP Software suspend (hibernation) is enabled.
SUSPEND System suspend states are enabled.
TPM TPM drivers are enabled.
- TS Appropriate touchscreen support is enabled.
UMS USB Mass Storage support is enabled.
USB USB support is enabled.
USBHID USB Human Interface Device support is enabled.
@@ -164,7 +175,6 @@ parameter is applicable::
VGA The VGA console has been enabled.
VT Virtual terminal support is enabled.
WDT Watchdog support is enabled.
- XT IBM PC/XT MFM hard disk support is enabled.
X86-32 X86-32, aka i386 architecture is enabled.
X86-64 X86-64 architecture is enabled.
More X86-64 boot options can be found in
@@ -172,6 +182,7 @@ parameter is applicable::
X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
X86_UV SGI UV support is enabled.
XEN Xen support is enabled
+ XTENSA xtensa architecture is enabled.
In addition, the following text indicates that the option::
@@ -197,7 +208,7 @@ The number of kernel parameters is not limited, but the length of the
complete command line (parameters including spaces etc.) is limited to
a fixed number of characters. This limit depends on the architecture
and is between 256 and 4096 characters. It is defined in the file
-./include/asm/setup.h as COMMAND_LINE_SIZE.
+./include/uapi/asm-generic/setup.h as COMMAND_LINE_SIZE.
Finally, the [KMG] suffix is commonly described after a number of kernel
parameter values. These 'K', 'M', and 'G' letters represent the _binary_
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c07815d230bc..a465d5242774 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -22,11 +22,13 @@
default: 0
acpi_backlight= [HW,ACPI]
- acpi_backlight=vendor
- acpi_backlight=video
- If set to vendor, prefer vendor specific driver
+ { vendor | video | native | none }
+ If set to vendor, prefer vendor-specific driver
(e.g. thinkpad_acpi, sony_acpi, etc.) instead
of the ACPI video.ko driver.
+ If set to video, use the ACPI video.ko driver.
+ If set to native, use the device's native backlight mode.
+ If set to none, disable the ACPI backlight interface.
acpi_force_32bit_fadt_addr
force FADT to use 32 bit addresses rather than the
@@ -48,7 +50,7 @@
CONFIG_ACPI_DEBUG must be enabled to produce any ACPI
debug output. Bits in debug_layer correspond to a
_COMPONENT in an ACPI source file, e.g.,
- #define _COMPONENT ACPI_PCI_COMPONENT
+ #define _COMPONENT ACPI_EVENTS
Bits in debug_level correspond to a level in
ACPI_DEBUG_PRINT statements, e.g.,
ACPI_DEBUG_PRINT((ACPI_DB_INFO, ...
@@ -58,8 +60,6 @@
Enable processor driver info messages:
acpi.debug_layer=0x20000000
- Enable PCI/PCI interrupt routing info messages:
- acpi.debug_layer=0x400000
Enable AML "Debug" output, i.e., stores to the Debug
object while interpreting AML:
acpi.debug_layer=0xffffffff acpi.debug_level=0x2
@@ -113,7 +113,7 @@
the GPE dispatcher.
This facility can be used to prevent such uncontrolled
GPE floodings.
- Format: <byte>
+ Format: <byte> or <bitmap-list>
acpi_no_auto_serialize [HW,ACPI]
Disable auto-serialization of AML methods
@@ -225,14 +225,23 @@
For broken nForce2 BIOS resulting in XT-PIC timer.
acpi_sleep= [HW,ACPI] Sleep options
- Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig,
- old_ordering, nonvs, sci_force_enable, nobl }
+ Format: { s3_bios, s3_mode, s3_beep, s4_hwsig,
+ s4_nohwsig, old_ordering, nonvs,
+ sci_force_enable, nobl }
See Documentation/power/video.rst for information on
s3_bios and s3_mode.
s3_beep is for debugging; it makes the PC's speaker beep
as soon as the kernel's real-mode entry point is called.
+ s4_hwsig causes the kernel to check the ACPI hardware
+ signature during resume from hibernation, and gracefully
+ refuse to resume if it has changed. This complies with
+ the ACPI specification but not with reality, since
+ Windows does not do this and many laptops do change it
+ on docking. So the default behaviour is to allow resume
+ and simply warn when the signature changes, unless the
+ s4_hwsig option is enabled.
s4_nohwsig prevents ACPI hardware signature from being
- used during resume from hibernation.
+ used (or even warned about) during resume.
old_ordering causes the ACPI 1.0 ordering of the _PTS
control method, with respect to putting devices into
low power states, to be enforced (the ACPI 2.0 ordering
@@ -287,13 +296,21 @@
do not want to use tracing_snapshot_alloc() as it needs
to be done where GFP_KERNEL allocations are allowed.
+ allow_mismatched_32bit_el0 [ARM64]
+ Allow execve() of 32-bit applications and setting of the
+ PER_LINUX32 personality on systems where only a strict
+ subset of the CPUs support 32-bit EL0. When this
+ parameter is present, the set of CPUs supporting 32-bit
+ EL0 is indicated by /sys/devices/system/cpu/aarch32_el0
+ and hot-unplug operations may be restricted.
+
+ See Documentation/arm64/asymmetric-32bit.rst for more
+ information.
+
amd_iommu= [HW,X86-64]
Pass parameters to the AMD IOMMU driver in the system.
Possible values are:
- fullflush - enable flushing of IO/TLB entries when
- they are unmapped. Otherwise they are
- flushed before they will be reused, which
- is a lot of faster
+ fullflush - Deprecated, equivalent to iommu.strict=1
off - do not initialize any AMD IOMMU found in
the system
force_isolation - Force device isolation for all
@@ -301,6 +318,11 @@
allowed anymore to lift isolation
requirements as needed. This option
does not override iommu=pt
+ force_enable - Force enable the IOMMU on platforms known
+ to be buggy with IOMMU enabled. Use this
+ option with care.
+ pgtbl_v1 - Use v1 page table for DMA-API (Default).
+ pgtbl_v2 - Use v2 page table for DMA-API.
amd_iommu_dump= [HW,X86-64]
Enable AMD IOMMU driver option to dump the ACPI table
@@ -354,7 +376,7 @@
shot down by NMI
autoconf= [IPV6]
- See Documentation/networking/ipv6.txt.
+ See Documentation/networking/ipv6.rst.
show_lapic= [APIC,X86] Advanced Programmable Interrupt Controller
Limit apic dumping. The parameter defines the maximal
@@ -371,6 +393,21 @@
arcrimi= [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
Format: <io>,<irq>,<nodeID>
+ arm64.nobti [ARM64] Unconditionally disable Branch Target
+ Identification support
+
+ arm64.nopauth [ARM64] Unconditionally disable Pointer Authentication
+ support
+
+ arm64.nomte [ARM64] Unconditionally disable Memory Tagging Extension
+ support
+
+ arm64.nosve [ARM64] Unconditionally disable Scalable Vector
+ Extension support
+
+ arm64.nosme [ARM64] Unconditionally disable Scalable Matrix
+ Extension support
+
ataflop= [HW,M68k]
atarimouse= [HW,MOUSE] Atari Mouse
@@ -432,6 +469,12 @@
Format: <io>,<irq>,<mode>
See header of drivers/net/hamradio/baycom_ser_hdx.c.
+ bert_disable [ACPI]
+ Disable BERT OS support on buggy BIOSes.
+
+ bgrt_disable [ACPI][X86]
+ Disable BGRT to avoid flickering OEM logo.
+
blkdevparts= Manual partition parsing of block device(s) for
embedded devices based on command line input.
See Documentation/block/cmdline-partition.rst
@@ -447,13 +490,10 @@
See Documentation/admin-guide/bootconfig.rst
- bert_disable [ACPI]
- Disable BERT OS support on buggy BIOSes.
-
bttv.card= [HW,V4L] bttv (bt848 + bt878 based grabber cards)
bttv.radio= Most important insmod options are available as
kernel args too.
- bttv.pll= See Documentation/media/v4l-drivers/bttv.rst
+ bttv.pll= See Documentation/admin-guide/media/bttv.rst
bttv.tuner=
bulk_remove=off [PPC] This parameter disables the use of the pSeries
@@ -488,16 +528,21 @@
ccw_timeout_log [S390]
See Documentation/s390/common_io.rst for details.
- cgroup_disable= [KNL] Disable a particular controller
- Format: {name of the controller(s) to disable}
+ cgroup_disable= [KNL] Disable a particular controller or optional feature
+ Format: {name of the controller(s) or feature(s) to disable}
The effects of cgroup_disable=foo are:
- foo isn't auto-mounted if you mount all cgroups in
a single hierarchy
- foo isn't visible as an individually mountable
subsystem
+ - if foo is an optional feature then the feature is
+ disabled and corresponding cgroup files are not
+ created
{Currently only "memory" controller deal with this and
cut the overhead, others just disable the usage. So
only cgroup_disable=memory is actually worthy}
+ Specifying "pressure" disables per-cgroup pressure
+ stall information accounting feature
cgroup_no_v1= [KNL] Disable cgroup controllers and named hierarchies in v1
Format: { { controller | "all" | "named" }
@@ -513,7 +558,7 @@
nosocket -- Disable socket memory accounting.
nokmem -- Disable kernel memory accounting.
- checkreqprot [SELINUX] Set initial checkreqprot flag value.
+ checkreqprot= [SELINUX] Set initial checkreqprot flag value.
Format: { "0" | "1" }
See security/selinux/Kconfig help text.
0 -- check protection applied by kernel (includes
@@ -522,9 +567,29 @@
Default value is set via a kernel config option.
Value can be changed at runtime via
/sys/fs/selinux/checkreqprot.
+ Setting checkreqprot to 1 is deprecated.
cio_ignore= [S390]
See Documentation/s390/common_io.rst for details.
+
+ clearcpuid=X[,X...] [X86]
+ Disable CPUID feature X for the kernel. See
+ arch/x86/include/asm/cpufeatures.h for the valid bit
+ numbers X. Note the Linux-specific bits are not necessarily
+ stable over kernel options, but the vendor-specific
+ ones should be.
+ X can also be a string as appearing in the flags: line
+ in /proc/cpuinfo which does not have the above
+ instability issue. However, not all features have names
+ in /proc/cpuinfo.
+ Note that using this option will taint your kernel.
+ Also note that user programs calling CPUID directly
+ or using the feature without checking anything
+ will still see it. This just prevents it from
+ being used by the kernel or shown in /proc/cpuinfo.
+ Also note the kernel might malfunction if you disable
+ some critical bits.
+
clk_ignore_unused
[CLK]
Prevents the clock framework from automatically gating
@@ -571,27 +636,47 @@
loops can be debugged more effectively on production
systems.
- clearcpuid=BITNUM [X86]
- Disable CPUID feature X for the kernel. See
- arch/x86/include/asm/cpufeatures.h for the valid bit
- numbers. Note the Linux specific bits are not necessarily
- stable over kernel options, but the vendor specific
- ones should be.
- Also note that user programs calling CPUID directly
- or using the feature without checking anything
- will still see it. This just prevents it from
- being used by the kernel or shown in /proc/cpuinfo.
- Also note the kernel might malfunction if you disable
- some critical bits.
+ clocksource.max_cswd_read_retries= [KNL]
+ Number of clocksource_watchdog() retries due to
+ external delays before the clock will be marked
+ unstable. Defaults to two retries, that is,
+ three attempts to read the clock under test.
+
+ clocksource.verify_n_cpus= [KNL]
+ Limit the number of CPUs checked for clocksources
+ marked with CLOCK_SOURCE_VERIFY_PERCPU that
+ are marked unstable due to excessive skew.
+ A negative value says to check all CPUs, while
+ zero says not to check any. Values larger than
+ nr_cpu_ids are silently truncated to nr_cpu_ids.
+ The actual CPUs are chosen randomly, with
+ no replacement if the same CPU is chosen twice.
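The selection rules for clocksource.verify_n_cpus described above can be modelled with a short Python sketch. It is a simplified reading of the text, not the kernel implementation: the function name cpus_to_check and the rng argument are hypothetical, and details such as skipping the CPU that runs the check are deliberately left out.

```python
import random


def cpus_to_check(verify_n_cpus, nr_cpu_ids, rng=None):
    """Return the set of CPU numbers that would be checked."""
    rng = rng or random.Random()
    if verify_n_cpus < 0:
        return set(range(nr_cpu_ids))      # negative: check all CPUs
    if verify_n_cpus == 0:
        return set()                       # zero: check none
    n = min(verify_n_cpus, nr_cpu_ids)     # silently truncate large values
    chosen = set()
    for _ in range(n):
        # A CPU drawn twice is not redrawn, so fewer than n distinct
        # CPUs may end up being checked.
        chosen.add(rng.randrange(nr_cpu_ids))
    return chosen


if __name__ == "__main__":
    print(sorted(cpus_to_check(-1, 8)))
    print(sorted(cpus_to_check(3, 8)))
```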
+
+ clocksource-wdtest.holdoff= [KNL]
+ Set the time in seconds that the clocksource
+ watchdog test waits before commencing its tests.
+ Defaults to zero when built as a module and to
+ 10 seconds when built into the kernel.
cma=nn[MG]@[start[MG][-end[MG]]]
- [ARM,X86,KNL]
+ [KNL,CMA]
Sets the size of kernel global memory area for
contiguous memory allocations and optionally the
placement constraint by the physical address range of
memory allocations. A value of 0 disables CMA
altogether. For more information, see
- include/linux/dma-contiguous.h
+ kernel/dma/contiguous.c
+
+ cma_pernuma=nn[MG]
+ [ARM64,KNL,CMA]
+ Sets the size of kernel per-numa memory area for
+ contiguous memory allocations. A value of 0 disables
+ per-numa CMA altogether. If this option is not
+ specified, the default value is 0.
+ With per-numa CMA enabled, DMA users on node nid will
+ first try to allocate buffer from the pernuma area
+ which is located in node nid; if the allocation fails,
+ they will fall back to the global default memory area.
cmo_free_hint= [PPC] Format: { yes | no }
Specify whether pages are marked as being inactive
@@ -632,7 +717,7 @@
See Documentation/admin-guide/serial-console.rst for more
information. See
- Documentation/networking/netconsole.txt for an
+ Documentation/networking/netconsole.rst for an
alternative.
uart[8250],io,<addr>[,options]
@@ -653,6 +738,12 @@
hvc<n> Use the hypervisor console device <n>. This is for
both Xen and PowerPC hypervisors.
+ { null | "" }
+ Use to disable console output, i.e., to have kernel
+ console messages discarded.
+ This must be the only console= parameter used on the
+ kernel command line.
+
If the device connected to the port is not a TTY but a braille
device, prepend "brl," before the device type, for instance
console=brl,ttyS0
@@ -679,7 +770,7 @@
coredump_filter=
[KNL] Change the default value for
/proc/<pid>/coredump_filter.
- See also Documentation/filesystems/proc.txt.
+ See also Documentation/filesystems/proc.rst.
coresight_cpu_debug.enable
[ARM,ARM64]
@@ -688,6 +779,24 @@
0: default value, disable debugging
1: enable debugging at boot time
+ cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver
+ Format:
+ <first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
+
+ cpu0_hotplug [X86] Turn on CPU0 hotplug feature when
+ CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
+ Some features depend on CPU0. Known dependencies are:
+ 1. Resume from suspend/hibernate depends on CPU0.
+ Suspend/hibernate will fail if CPU0 is offline and you
+ need to online CPU0 before suspend/hibernate.
+ 2. PIC interrupts also depend on CPU0. CPU0 can't be
+ removed if a PIC interrupt is detected.
+ It's said poweroff/reboot may depend on CPU0 on some
+ machines although I haven't seen such issues so far
+ after CPU0 is offline on a few tested machines.
+ If the dependencies are under your control, you can
+ turn on cpu0_hotplug.
+
cpuidle.off=1 [CPU_IDLE]
disable the cpuidle sub-system
@@ -697,15 +806,24 @@
cpufreq.off=1 [CPU_FREQ]
disable the cpufreq sub-system
+ cpufreq.default_governor=
+ [CPU_FREQ] Name of the default cpufreq governor or
+ policy to use. This governor must be registered in the
+ kernel before the cpufreq driver probes.
+
cpu_init_udelay=N
[X86] Delay for N microsec between assert and de-assert
of APIC INIT to start processors. This delay occurs
on every CPU online, such as boot, and resume from suspend.
Default: 10000
- cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver
- Format:
- <first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
+ crash_kexec_post_notifiers
+ Run kdump after running panic-notifiers and dumping
+ kmsg. This is only for users who doubt that kdump
+ always succeeds in any situation.
+ Note that this also increases risks of kdump failure,
+ because some panic notifiers can make the crashed
+ kernel more unstable.
crashkernel=size[KMG][@offset[KMG]]
[KNL] Using kexec, Linux can switch to a 'crash kernel'
@@ -713,7 +831,7 @@
memory region [offset, offset + size] for that kernel
image. If '@offset' is omitted, then a suitable offset
is selected automatically.
- [KNL, x86_64] select a region under 4G first, and
+ [KNL, X86-64] Select a region under 4G first, and
fall back to reserve region above 4G when '@offset'
hasn't been specified.
See Documentation/admin-guide/kdump/kdump.rst for further details.
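To make the size[KMG][@offset[KMG]] syntax concrete, here is a minimal Python sketch that splits such a value into byte counts. It is a simplified illustration: the function name parse_crashkernel is hypothetical, only decimal numbers with upper-case K, M, or G suffixes (taken as binary multiples 2^10, 2^20, 2^30) are accepted, and the other crashkernel= forms (ranges, ",high", ",low") are not handled.

```python
import re

# Binary multipliers for the optional K/M/G suffix.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
_PATTERN = re.compile(r"^(\d+)([KMG]?)(?:@(\d+)([KMG]?))?$")


def parse_crashkernel(value):
    """Return (size_bytes, offset_bytes or None) for 'size[KMG][@offset[KMG]]'."""
    m = _PATTERN.match(value)
    if m is None:
        raise ValueError("unsupported crashkernel value: %r" % value)
    size = int(m.group(1)) * _UNITS[m.group(2)]
    if m.group(3) is None:
        # No '@offset': the kernel selects a suitable offset automatically.
        return size, None
    offset = int(m.group(3)) * _UNITS[m.group(4)]
    return size, offset


if __name__ == "__main__":
    print(parse_crashkernel("256M"))
    print(parse_crashkernel("64M@16M"))
```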
@@ -726,27 +844,33 @@
Documentation/admin-guide/kdump/kdump.rst for an example.
crashkernel=size[KMG],high
- [KNL, x86_64] range could be above 4G. Allow kernel
+ [KNL, X86-64, ARM64] range could be above 4G. Allow kernel
to allocate physical memory region from top, so could
be above 4G if system have more than 4G ram installed.
Otherwise memory region will be allocated below 4G, if
available.
It will be ignored if crashkernel=X is specified.
crashkernel=size[KMG],low
- [KNL, x86_64] range under 4G. When crashkernel=X,high
+ [KNL, X86-64] range under 4G. When crashkernel=X,high
is passed, kernel could allocate physical memory region
above 4G, that cause second kernel crash on system
that require some amount of low memory, e.g. swiotlb
requires at least 64M+32K low memory, also enough extra
low memory is needed to make sure DMA buffers for 32-bit
- devices won't run out. Kernel would try to allocate at
+ devices won't run out. Kernel would try to allocate
at least 256M below 4G automatically.
- This one let user to specify own low range under 4G
+ This one lets the user specify own low range under 4G
for second kernel instead.
0: to disable low allocation.
It will be ignored when crashkernel=X,high is not used
or memory reserved is below 4G.
+ [KNL, ARM64] range in low memory.
+ This one lets the user specify a low range in the
+ DMA zone for the crash dump kernel.
+ It will be ignored when crashkernel=X,high is not used
+ or memory reserved is located in the DMA zones.
+
cryptomgr.notests
[KNL] Disable crypto self-tests
@@ -756,6 +880,16 @@
cs89x0_media= [HW,NET]
Format: { rj45 | aui | bnc }
+ csdlock_debug= [KNL] Enable debug add-ons of cross-CPU function call
+ handling. When switched on, additional debug data is
+ printed to the console in case a hanging CPU is
+ detected, and that CPU is pinged again in order to try
+ to resolve the hang situation.
+ 0: disable csdlock debugging (default)
+ 1: enable basic csdlock debugging (minor impact)
+ ext: enable extended csdlock debugging (more impact,
+ but more data)
+
dasd= [HW,NET]
See header of drivers/s390/block/dasd_devmap.c.
@@ -764,11 +898,6 @@
Format: <port#>,<type>
See also Documentation/input/devices/joystick-parport.rst
- ddebug_query= [KNL,DYNAMIC_DEBUG] Enable debug messages at early boot
- time. See
- Documentation/admin-guide/dynamic-debug-howto.rst for
- details. Deprecated, see dyndbg.
-
debug [KNL] Enable kernel debugging (events log level).
debug_boot_weak_hash
@@ -780,13 +909,14 @@
insecure, please do not use on production kernels.
debug_locks_verbose=
- [KNL] verbose self-tests
- Format=<0|1>
+ [KNL] verbose locking self-tests
+ Format: <int>
Print debugging info while doing the locking API
self-tests.
- We default to 0 (no extra messages), setting it to
- 1 will print _a lot_ more information - normally
- only useful to kernel developers.
+ Bitmask for the various LOCKTYPE_ tests. Defaults to 0
+ (no extra messages), setting it to -1 (all bits set)
+ will print _a_lot_ more information - normally only
+ useful to lockdep developers.
debug_objects [KNL] Enable object debugging
@@ -821,29 +951,71 @@
useful to also enable the page_owner functionality.
on: enable the feature
- debugpat [X86] Enable PAT debugging
+ debugfs= [KNL] This parameter enables what is exposed to userspace
+ and debugfs internal clients.
+ Format: { on, no-mount, off }
+ on: All functions are enabled.
+ no-mount:
+ Filesystem is not registered but kernel clients can
+ access APIs and a crashkernel can be used to read
+ its content. There is nothing to mount.
+ off: Filesystem is not registered and clients
+ get a -EPERM as result when trying to register files
+ or directories within debugfs.
+ This is equivalent to the runtime functionality if
+ debugfs was not enabled in the kernel at all.
+ Default value is set at build time by the kernel configuration.
- decnet.addr= [HW,NET]
- Format: <area>[,<node>]
- See also Documentation/networking/decnet.txt.
+ debugpat [X86] Enable PAT debugging
default_hugepagesz=
- [same as hugepagesz=] The size of the default
- HugeTLB page size. This is the size represented by
- the legacy /proc/ hugepages APIs, used for SHM, and
- default size when mounting hugetlbfs filesystems.
- Defaults to the default architecture's huge page size
- if not specified.
+ [HW] The size of the default HugeTLB page. This is
+ the size represented by the legacy /proc/ hugepages
+ APIs. In addition, this is the default hugetlb size
+ used for shmget(), mmap() and mounting hugetlbfs
+ filesystems. If not specified, defaults to the
+ architecture's default huge page size. Huge page
+ sizes are architecture dependent. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: size[KMG]
deferred_probe_timeout=
[KNL] Debugging option to set a timeout in seconds for
deferred probe to give up waiting on dependencies to
probe. Only specific dependencies (subsystems or
- drivers) that have opted in will be ignored. A timeout of 0
- will timeout at the end of initcalls. This option will also
+ drivers) that have opted in will be ignored. A timeout
+ of 0 will time out at the end of initcalls. If the
+ timeout hasn't expired, it'll be restarted by each
+ successful driver registration. This option will also
dump out devices still on the deferred probe list after
retrying.
+ delayacct [KNL] Enable per-task delay accounting
+
+ dell_smm_hwmon.ignore_dmi=
+ [HW] Continue probing hardware even if DMI data
+ indicates that the driver is running on unsupported
+ hardware.
+
+ dell_smm_hwmon.force=
+ [HW] Activate driver even if SMM BIOS signature does
+ not match list of supported models and enable otherwise
+ blacklisted features.
+
+ dell_smm_hwmon.power_status=
+ [HW] Report power status in /proc/i8k
+ (disabled by default).
+
+ dell_smm_hwmon.restricted=
+ [HW] Allow controlling fans only if SYS_ADMIN
+ capability is set.
+
+ dell_smm_hwmon.fan_mult=
+ [HW] Factor to multiply fan speed with.
+
+ dell_smm_hwmon.fan_max=
+ [HW] Maximum configurable fan speed.
+
dfltcc= [HW,S390]
Format: { on | off | def_only | inf_only | always }
on: s390 zlib hardware support for compression on
@@ -865,23 +1037,21 @@
can be useful when debugging issues that require an SLB
miss to occur.
- disable= [IPV6]
- See Documentation/networking/ipv6.txt.
+ stress_slb [PPC]
+ Limits the number of kernel SLB entries, and flushes
+ them frequently to increase the rate of SLB faults
+ on kernel addresses.
- hardened_usercopy=
- [KNL] Under CONFIG_HARDENED_USERCOPY, whether
- hardening is enabled for this boot. Hardened
- usercopy checking is used to protect the kernel
- from reading or writing beyond known memory
- allocation boundaries as a proactive defense
- against bounds-checking flaws in the kernel's
- copy_to_user()/copy_from_user() interface.
- on Perform hardened usercopy checks (default).
- off Disable hardened usercopy checks.
+ disable= [IPV6]
+ See Documentation/networking/ipv6.rst.
disable_radix [PPC]
Disable RADIX MMU mode on POWER9
+ radix_hcall_invalidate=on [PPC/PSERIES]
+ Disable RADIX GTSE feature and use hcall for TLB
+ invalidate.
+
disable_tlbie [PPC]
Disable TLBIE instruction. Currently does not work
with KVM, with HASH MMU, or with coherent accelerators.
@@ -895,18 +1065,12 @@
causing system reset or hang due to sending
INIT from AP to BSP.
- perf_v4_pmi= [X86,INTEL]
- Format: <bool>
- Disable Intel PMU counter freezing feature.
- The feature only exists starting from
- Arch Perfmon v4 (Skylake and newer).
-
disable_ddw [PPC/PSERIES]
- Disable Dynamic DMA Window support. Use this if
+ Disable Dynamic DMA Window support. Use this
to workaround buggy firmware.
disable_ipv6= [IPV6]
- See Documentation/networking/ipv6.txt.
+ See Documentation/networking/ipv6.rst.
disable_mtrr_cleanup [X86]
The kernel tries to adjust MTRR layout from continuous
@@ -943,7 +1107,10 @@
driver later using sysfs.
driver_async_probe= [KNL]
- List of driver names to be probed asynchronously.
+ List of driver names to be probed asynchronously. *
+ matches with all driver names. If * is specified, the
+ rest of the listed driver names are those that will NOT
+ match the *.
Format: <driver_name1>,<driver_name2>...
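The wildcard rule for driver_async_probe= described above can be sketched in Python. This models only the documented behaviour: the function name probes_async is hypothetical, and the sketch assumes that "*" is honoured wherever it appears in the list, which the text does not state.

```python
def probes_async(driver, param):
    """Decide whether `driver` is probed asynchronously for a given
    driver_async_probe= value (comma-separated driver names)."""
    names = [name for name in param.split(",") if name]
    if "*" in names:
        # With '*', every driver is async except the ones listed explicitly.
        return driver not in names
    # Without '*', only the listed drivers are async.
    return driver in names


if __name__ == "__main__":
    print(probes_async("foo", "foo,bar"))
    print(probes_async("foo", "*,foo"))
```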
drm.edid_firmware=[<connector>:]<file>[,[<connector>:]<file>]
@@ -956,7 +1123,7 @@
edid/1680x1050.bin, or edid/1920x1080.bin is given
and no file with the same name exists. Details and
instructions how to build your own EDID data are
- available in Documentation/driver-api/edid.rst. An EDID
+ available in Documentation/admin-guide/edid.rst. An EDID
data set will only be used for a particular connector,
if its name and a colon are prepended to the EDID
name. Each connector may use a unique EDID data
@@ -981,20 +1148,20 @@
what data is available or for reverse-engineering.
dyndbg[="val"] [KNL,DYNAMIC_DEBUG]
- module.dyndbg[="val"]
+ <module>.dyndbg[="val"]
Enable debug messages at boot time. See
Documentation/admin-guide/dynamic-debug-howto.rst
for details.
- nompx [X86] Disables Intel Memory Protection Extensions.
- See Documentation/x86/intel_mpx.rst for more
- information about the feature.
-
nopku [X86] Disable Memory Protection Keys CPU feature found
in some Intel CPUs.
- module.async_probe [KNL]
- Enable asynchronous probe on this module.
+ <module>.async_probe[=<bool>] [KNL]
+ If no <bool> value is specified or if the value
+ specified is not a valid <bool>, enable asynchronous
+ probe on this module. Otherwise, enable/disable
+ asynchronous probe on this module as indicated by the
+ <bool> value. See also: module.async_probe
early_ioremap_debug [KNL]
Enable debug messages in early_ioremap support. This
@@ -1038,6 +1205,11 @@
the driver will use only 32-bit accessors to read/write
the device registers.
+ liteuart,<addr>
+ Start an early console on a litex serial port at the
+ specified address. The serial port must already be
+ setup and configured. Options are not yet supported.
+
meson,<addr>
Start an early, polled-mode console on a meson serial
port at the specified address. The serial port must
@@ -1099,6 +1271,12 @@
A valid base address must be provided, and the serial
port must already be setup and configured.
+ ec_imx21,<addr>
+ ec_imx6q,<addr>
+ Start an early, polled-mode, output-only console on the
+ Freescale i.MX UART at the specified address. The UART
+ must already be setup and configured.
+
ar3700_uart,<addr>
Start an early, polled-mode console on the
Armada 3700 serial port at the specified
@@ -1142,7 +1320,7 @@
Append ",keep" to not disable it when the real console
takes over.
- Only one of vga, efi, serial, or usb debug port can
+ Only one of vga, serial, or usb debug port can
be used at a time.
Currently only ttyS0 and ttyS1 may be specified by
@@ -1157,10 +1335,10 @@
Interaction with the standard serial driver is not
very good.
- The VGA and EFI output is eventually overwritten by
+ The VGA output is eventually overwritten by
the real console.
- The xen output can only be used by Xen PV guests.
+ The xen option can only be used in Xen domains.
The sclp output can only be used on s390.
@@ -1176,34 +1354,27 @@
force: enforce the use of EDAC to report H/W event.
default: on.
- ekgdboc= [X86,KGDB] Allow early kernel console debugging
- ekgdboc=kbd
-
- This is designed to be used in conjunction with
- the boot argument: earlyprintk=vga
-
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}
efi= [EFI]
- Format: { "old_map", "nochunk", "noruntime", "debug",
- "nosoftreserve", "disable_early_pci_dma",
- "no_disable_early_pci_dma" }
- old_map [X86-64]: switch to the old ioremap-based EFI
- runtime services mapping. [Needs CONFIG_X86_UV=y]
+ Format: { "debug", "disable_early_pci_dma",
+ "nochunk", "noruntime", "nosoftreserve",
+ "novamap", "no_disable_early_pci_dma" }
+ debug: enable misc debug output.
+ disable_early_pci_dma: disable the busmaster bit on all
+ PCI bridges while in the EFI boot stub.
nochunk: disable reading files in "chunks" in the EFI
boot stub, as chunking can cause problems with some
firmware implementations.
noruntime : disable EFI runtime services support
- debug: enable misc debug output
nosoftreserve: The EFI_MEMORY_SP (Specific Purpose)
attribute may cause the kernel to reserve the
memory range for a memory mapping driver to
claim. Specify efi=nosoftreserve to disable this
reservation and treat the memory by its base type
(i.e. EFI_CONVENTIONAL_MEMORY / "System RAM").
- disable_early_pci_dma: Disable the busmaster bit on all
- PCI bridges while in the EFI boot stub
+ novamap: do not call SetVirtualAddressMap().
no_disable_early_pci_dma: Leave the busmaster bit set
on all PCI bridges while in the EFI boot stub
@@ -1244,6 +1415,17 @@
eisa_irq_edge= [PARISC,HW]
See header of drivers/parisc/eisa.c.
+ ekgdboc= [X86,KGDB] Allow early kernel console debugging
+ Format: ekgdboc=kbd
+
+ This is designed to be used in conjunction with
+ the boot argument: earlyprintk=vga
+
+ This parameter works in place of the kgdboc parameter
+ but can only be used if the backing tty is available
+ very early in the boot process. For early debugging
+ via a serial port see kgdboc_earlycon instead.
+
elanfreq= [X86-32]
See comment before function elanfreq_setup() in
arch/x86/kernel/cpu/cpufreq/elanfreq.c.
@@ -1265,7 +1447,7 @@
(in particular on some ATI chipsets).
The kernel tries to set a reasonable default.
- enforcing [SELINUX] Set initial enforcing status.
+ enforcing= [SELINUX] Set initial enforcing status.
Format: {"0" | "1"}
See security/selinux/Kconfig help text.
0 -- permissive (log only, no denials).
@@ -1287,13 +1469,27 @@
Permit 'security.evm' to be updated regardless of
current integrity status.
+ early_page_ext [KNL] Enforces page_ext initialization to earlier
+ stages to cover more early boot allocations.
+ Please note that as a side effect some optimizations
+ might be disabled to achieve that (e.g. parallelized
+ memory initialization is disabled) so the boot process
+ might take longer, especially on systems with a lot of
+ memory. Available with CONFIG_PAGE_EXTENSION=y.
+
failslab=
+ fail_usercopy=
fail_page_alloc=
fail_make_request=[KNL]
General fault injection mechanism.
Format: <interval>,<probability>,<space>,<times>
See also Documentation/fault-injection/.
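As a rough illustration of the Format line above, the four comma-separated fields could be split as follows. This is a sketch only: the helper name `parse_fault_attr` and the `FaultAttr` type are hypothetical, and treating every field as a plain integer is an assumption, not the kernel's actual parsing code.

```python
from typing import NamedTuple


class FaultAttr(NamedTuple):
    """Hypothetical container for <interval>,<probability>,<space>,<times>."""
    interval: int
    probability: int
    space: int
    times: int


def parse_fault_attr(value: str) -> FaultAttr:
    """Split a fault-injection boot value into its four integer fields."""
    fields = value.split(",")
    if len(fields) != 4:
        raise ValueError("expected <interval>,<probability>,<space>,<times>")
    return FaultAttr(*(int(field) for field in fields))
```

For instance, `parse_fault_attr("1,100,0,-1")` yields the fields for a value such as `failslab=1,100,0,-1`.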
+ fb_tunnels= [NET]
+ Format: { initns | none }
+ See Documentation/admin-guide/sysctl/net.rst for
+ fb_tunnels_only_for_init_ns
+
floppy= [HW]
See Documentation/admin-guide/blockdev/floppy.rst.
@@ -1315,6 +1511,14 @@
as early as possible in order to facilitate early
boot debugging.
+ ftrace_boot_snapshot
+ [FTRACE] On boot up, a snapshot will be taken of the
+ ftrace ring buffer that can be read at:
+ /sys/kernel/tracing/snapshot.
+ This is useful if you need tracing information from kernel
+ boot up that is likely to be overridden by user space
+ start up functionality.
+
ftrace_dump_on_oops[=orig_cpu]
[FTRACE] will dump the trace buffers on oops.
If no parameter is passed, ftrace will dump
@@ -1324,7 +1528,7 @@
ftrace_filter=[function-list]
[FTRACE] Limit the functions traced by the function
- tracer at boot up. function-list is a comma separated
+ tracer at boot up. function-list is a comma-separated
list of functions. This list can be changed at run
time by the set_ftrace_filter file in the debugfs
tracing directory.
@@ -1338,13 +1542,13 @@
ftrace_graph_filter=[function-list]
[FTRACE] Limit the top level callers functions traced
by the function graph tracer at boot up.
- function-list is a comma separated list of functions
+ function-list is a comma-separated list of functions
that can be changed at run time by the
set_graph_function file in the debugfs tracing directory.
ftrace_graph_notrace=[function-list]
[FTRACE] Do not trace from the functions specified in
- function-list. This list is a comma separated list of
+ function-list. This list is a comma-separated list of
functions that can be changed at run time by the
set_graph_notrace file in the debugfs tracing directory.
@@ -1354,6 +1558,29 @@
can be changed at run time by the max_graph_depth file
in the tracefs tracing directory. default: 0 (no limit)
+ fw_devlink= [KNL] Create device links between consumer and supplier
+ devices by scanning the firmware to infer the
+ consumer/supplier relationships. This feature is
+ especially useful when drivers are loaded as modules as
+ it ensures proper ordering of tasks like device probing
+ (suppliers first, then consumers), supplier boot state
+ clean up (only after all consumers have probed),
+ suspend/resume & runtime PM (consumers first, then
+ suppliers).
+ Format: { off | permissive | on | rpm }
+ off -- Don't create device links from firmware info.
+ permissive -- Create device links from firmware info
+ but use it only for ordering boot state clean
+ up (sync_state() calls).
+ on -- Create device links from firmware info and use it
+ to enforce probe and suspend/resume ordering.
+ rpm -- Like "on", but also use it to order runtime PM.
+
+ fw_devlink.strict=<bool>
+ [KNL] Treat all inferred dependencies as mandatory
+ dependencies. This only applies for fw_devlink=on|rpm.
+ Format: <bool>
+
gamecon.map[2|3]=
[HW,JOY] Multisystem joystick and NES/SNES/PSX pad
support via parallel port (up to 5 devices per port)
@@ -1362,7 +1589,7 @@
gamma= [HW,DRM]
- gart_fix_e820= [X86_64] disable the fix e820 for K8 GART
+ gart_fix_e820= [X86-64] disable the fix e820 for K8 GART
Format: off | on
default: on
@@ -1376,6 +1603,12 @@
Don't use this when you are not running on the
android emulator
+ gpio-mockup.gpio_mockup_ranges
+ [HW] Sets the ranges of gpiochip for this device.
+ Format: <start1>,<end1>,<start2>,<end2>...
+ gpio-mockup.gpio_mockup_named_lines
+ [HW] Let the driver know GPIO lines should be named.
+
gpt [EFI] Forces disk with valid GPT signature but
invalid Protective MBR to be treated as GPT. If the
primary GPT is corrupted, it enables the backup/alternate
@@ -1399,14 +1632,21 @@
Format: <unsigned int> such that (rxsize & ~0x1fffc0) == 0.
Default: 1024
- gpio-mockup.gpio_mockup_ranges
- [HW] Sets the ranges of gpiochip of for this device.
- Format: <start1>,<end1>,<start2>,<end2>...
+ hardened_usercopy=
+ [KNL] Under CONFIG_HARDENED_USERCOPY, whether
+ hardening is enabled for this boot. Hardened
+ usercopy checking is used to protect the kernel
+ from reading or writing beyond known memory
+ allocation boundaries as a proactive defense
+ against bounds-checking flaws in the kernel's
+ copy_to_user()/copy_from_user() interface.
+ on Perform hardened usercopy checks (default).
+ off Disable hardened usercopy checks.
hardlockup_all_cpu_backtrace=
[KNL] Should the hard-lockup detector generate
backtraces on all cpus.
- Format: <integer>
+ Format: 0 | 1
hashdist= [KNL,NUMA] Large hashes allocated during boot
are distributed across NUMA nodes. Defaults on
@@ -1423,6 +1663,15 @@
corresponding firmware-first mode error processing
logic will be disabled.
+ hibernate= [HIBERNATION]
+ noresume Don't check if there's a hibernation image
+ present during boot.
+ nocompress Don't compress/decompress hibernation images.
+ no Disable hibernation and resume.
+ protect_image Turn on image protection during restoration
+ (that will set all pages holding image data
+ during restoration read-only).
+
highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to have an exact
size of <nn>. This works even on boxes that have no
highmem otherwise. This also works to reduce highmem
@@ -1434,6 +1683,19 @@
hlt [BUGS=ARM,SH]
+ hostname= [KNL] Set the hostname (aka UTS nodename).
+ Format: <string>
+ This allows setting the system's hostname during early
+ startup. This sets the name returned by gethostname.
+ Using this parameter to set the hostname makes it
+ possible to ensure the hostname is correctly set before
+ any userspace processes run, avoiding the possibility
+ that a process may call gethostname before the hostname
+ has been explicitly set, resulting in the calling
+ process getting an incorrect result. The string must
+ not exceed the maximum allowed hostname length (usually
+ 64 characters) and will be truncated otherwise.
+
hpet= [X86-32,HPET] option to control HPET usage
Format: { enable (default) | disable | force |
verbose }
@@ -1445,19 +1707,62 @@
hpet_mmap= [X86, HPET_MMAP] Allow userspace to mmap HPET
registers. Default set by CONFIG_HPET_MMAP_DEFAULT.
- hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
- hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
- On x86-64 and powerpc, this option can be specified
- multiple times interleaved with hugepages= to reserve
- huge pages of different sizes. Valid pages sizes on
- x86-64 are 2M (when the CPU supports "pse") and 1G
- (when the CPU supports the "pdpe1gb" cpuinfo flag).
+ hugepages= [HW] Number of HugeTLB pages to allocate at boot.
+ If this follows hugepagesz (below), it specifies
+ the number of pages of hugepagesz to be allocated.
+ If this is the first HugeTLB parameter on the command
+ line, it specifies the number of pages to allocate for
+ the default huge page size. If using node format, the
+ number of pages to allocate per-node can be specified.
+ See also Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: <integer> or (node format)
+ <node>:<integer>[,<node>:<integer>]
+
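The two value shapes accepted by hugepages= (a plain count, or the per-node form) can be told apart as in the sketch below. The function name `parse_hugepages` is hypothetical and the code only mirrors the Format line above; it does not reproduce the kernel's own validation.

```python
def parse_hugepages(value: str):
    """Return an int for '<integer>' or a {node: count} dict for node format."""
    if ":" not in value:
        # Plain form: number of pages of the current/default huge page size.
        return int(value)
    per_node = {}
    for item in value.split(","):
        node, count = item.split(":")
        per_node[int(node)] = int(count)
    return per_node
```

With this sketch, `hugepages=512` maps to `512` and `hugepages=0:2,1:4` maps to `{0: 2, 1: 4}`.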
+ hugepagesz=
+ [HW] The size of the HugeTLB pages. This is used in
+ conjunction with hugepages (above) to allocate huge
+ pages of a specific size at boot. The pair
+ hugepagesz=X hugepages=Y can be specified once for
+ each supported huge page size. Huge page sizes are
+ architecture dependent. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: size[KMG]
+
+ hugetlb_cma= [HW,CMA] The size of a CMA area used for allocation
+ of gigantic hugepages. Alternatively, using node format,
+ the size of a CMA area per node can be specified.
+ Format: nn[KMGTPE] or (node format)
+ <node>:nn[KMGTPE][,<node>:nn[KMGTPE]]
+
+ Reserve a CMA area of given size and allocate gigantic
+ hugepages using the CMA allocator. If enabled, the
+ boot-time allocation of gigantic hugepages is skipped.
+
+ hugetlb_free_vmemmap=
+ [KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+ enabled.
+ Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
+ Allows heavy hugetlb users to free up some more
+ memory (7 * PAGE_SIZE for each 2MB hugetlb page).
+ Format: { on | off (default) }
+
+ on: enable HVO
+ off: disable HVO
+
+ Built with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=y,
+ the default is on.
+
+ Note that the vmemmap pages may be allocated from the added
+ memory block itself when memory_hotplug.memmap_on_memory is
+ enabled; those vmemmap pages cannot be optimized even if this
+ feature is enabled. Other vmemmap pages not allocated from
+ the added memory block itself are not affected.
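The "7 * PAGE_SIZE for each 2MB hugetlb page" figure above can be reproduced with a little arithmetic. The sketch assumes 4 KiB base pages, a 64-byte struct page, and that one vmemmap page is kept per huge page; all three are assumptions for illustration and are not stated in the text above.

```python
PAGE_SIZE = 4096                  # assumption: 4 KiB base pages
STRUCT_PAGE_SIZE = 64             # assumption: sizeof(struct page) == 64
HUGEPAGE_SIZE = 2 * 1024 * 1024   # the 2MB hugetlb page from the text


def vmemmap_pages(hugepage_size: int) -> int:
    """Number of base pages holding struct pages for one huge page."""
    base_pages = hugepage_size // PAGE_SIZE
    return base_pages * STRUCT_PAGE_SIZE // PAGE_SIZE


def freed_pages(hugepage_size: int, kept: int = 1) -> int:
    """Pages freed if `kept` vmemmap pages are retained (assumed to be 1)."""
    return vmemmap_pages(hugepage_size) - kept
```

Under these assumptions a 2MB page has 512 struct pages occupying 8 base pages, of which 7 can be freed.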
hung_task_panic=
[KNL] Should the hung task detector generate panics.
- Format: <integer>
+ Format: 0 | 1
- A nonzero value instructs the kernel to panic when a
+ A value of 1 instructs the kernel to panic when a
hung task is detected. The default value is controlled
by the CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time
option. The value selected by this boot parameter can
@@ -1513,20 +1818,11 @@
architectures force reset to be always executed
i8042.unlock [HW] Unlock (ignore) the keylock
i8042.kbdreset [HW] Reset device connected to KBD port
+ i8042.probe_defer
+ [HW] Allow deferred probing upon i8042 probe errors
i810= [HW,DRM]
- i8k.ignore_dmi [HW] Continue probing hardware even if DMI data
- indicates that the driver is running on unsupported
- hardware.
- i8k.force [HW] Activate i8k driver even if SMM BIOS signature
- does not match list of supported models.
- i8k.power_status
- [HW] Report power status in /proc/i8k
- (disabled by default)
- i8k.restricted [HW] Allow controlling fans only if SYS_ADMIN
- capability is set.
-
i915.invert_brightness=
[DRM] Invert the sense of the variable that is used to
set the brightness of the panel backlight. Normally a
@@ -1544,26 +1840,6 @@
icn= [HW,ISDN]
Format: <io>[,<membase>[,<icn_id>[,<icn_id2>]]]
- ide-core.nodma= [HW] (E)IDE subsystem
- Format: =0.0 to prevent dma on hda, =0.1 hdb =1.0 hdc
- .vlb_clock .pci_clock .noflush .nohpa .noprobe .nowerr
- .cdrom .chs .ignore_cable are additional options
- See Documentation/ide/ide.rst.
-
- ide-generic.probe-mask= [HW] (E)IDE subsystem
- Format: <int>
- Probe mask for legacy ISA IDE ports. Depending on
- platform up to 6 ports are supported, enabled by
- setting corresponding bits in the mask to 1. The
- default value is 0x0, which has a special meaning.
- On systems that have PCI, it triggers scanning the
- PCI bus for the first and the second port, which
- are then probed. On systems without PCI the value
- of 0x0 enables probing the two first ports as if it
- was 0x3.
-
- ide-pci-generic.all-generic-ide [HW] (E)IDE subsystem
- Claim all unknown PCI IDE storage controllers.
idle= [X86]
Format: idle=poll, idle=halt, idle=nomwait
@@ -1575,6 +1851,17 @@
In such case C2/C3 won't be used again.
idle=nomwait: Disable mwait for CPU C-states
+ idxd.sva= [HW]
+ Format: <bool>
+ Allow force disabling of Shared Virtual Memory (SVA)
+ support for the idxd driver. By default it is set to
+ true (1).
+
+ idxd.tc_override= [HW]
+ Format: <bool>
+ Allow override of default traffic class configuration
+ for the device. By default it is set to false (0).
+
ieee754= [MIPS] Select IEEE Std 754 conformance mode
Format: { strict | legacy | 2008 | relaxed }
Default: strict
@@ -1648,7 +1935,7 @@
ima_policy= [IMA]
The builtin policies to load during IMA setup.
Format: "tcb | appraise_tcb | secure_boot |
- fail_securely"
+ fail_securely | critical_data"
The "tcb" policy measures all programs exec'd, files
mmap'd for exec, and all files opened with the read
@@ -1667,6 +1954,9 @@
filesystems with the SB_I_UNVERIFIABLE_SIGNATURE
flag.
+ The "critical_data" policy measures kernel integrity
+ critical data.
+
ima_tcb [IMA] Deprecated. Use ima_policy= instead.
Load a policy which meets the needs of the Trusted
Computing Base. This means IMA will measure all
@@ -1675,7 +1965,8 @@
ima_template= [IMA]
Select one of defined IMA measurements template formats.
- Formats: { "ima" | "ima-ng" | "ima-sig" }
+ Formats: { "ima" | "ima-ng" | "ima-ngv2" | "ima-sig" |
+ "ima-sigv2" }
Default: "ima-ng"
ima_template_fmt=
@@ -1712,8 +2003,27 @@
initcall functions. Useful for debugging built-in
modules and initcalls.
+ initramfs_async= [KNL]
+ Format: <bool>
+ Default: 1
+ This parameter controls whether the initramfs
+ image is unpacked asynchronously, concurrently
+ with devices being probed and
+ initialized. This should normally just work,
+ but as a debugging aid, one can get the
+ historical behaviour of the initramfs
+ unpacking being completed before device_ and
+ late_ initcalls.
+
initrd= [BOOT] Specify the location of the initial ramdisk
+ initrdmem= [KNL] Specify a physical address and size from which to
+ load the initrd. If an initrd is compiled in or
+ specified in the bootparams, it takes priority over this
+ setting.
+ Format: ss[KMG],nn[KMG]
+ Default is 0, 0
+
init_on_alloc= [MM] Fill newly allocated pages and heap objects with
zeroes.
Format: 0 | 1
@@ -1723,7 +2033,7 @@
Format: 0 | 1
Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON.
- init_pkru= [x86] Specify the default memory protection keys rights
+ init_pkru= [X86] Specify the default memory protection keys rights
register contents for all processes. 0x55555554 by
default (disallow access to all but pkey 0). Can
override in debugfs after boot.
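To see why 0x55555554 means "disallow access to all but pkey 0", the value can be decoded two bits per key. The sketch assumes the usual x86 PKRU layout (bit 2i is access-disable and bit 2i+1 is write-disable for key i); the helper name `decode_pkru` is hypothetical.

```python
def decode_pkru(value: int) -> dict:
    """Decode a 32-bit PKRU value into per-key access/write disable flags."""
    rights = {}
    for key in range(16):
        bits = (value >> (2 * key)) & 0b11
        rights[key] = {
            "access_disabled": bool(bits & 0b01),  # AD bit (assumed layout)
            "write_disabled": bool(bits & 0b10),   # WD bit (assumed layout)
        }
    return rights
```

Decoding the default shows key 0 with no restrictions and keys 1 through 15 with access disabled.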
@@ -1731,7 +2041,7 @@
inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver
Format: <irq>
- int_pln_enable [x86] Enable power limit notification interrupt
+ int_pln_enable [X86] Enable power limit notification interrupt
integrity_audit=[IMA]
Format: { "0" | "1" }
@@ -1749,26 +2059,18 @@
bypassed by not enabling DMAR with this option. In
this case, gfx device will use physical address for
DMA.
- forcedac [x86_64]
- With this option iommu will not optimize to look
- for io virtual address below 32-bit forcing dual
- address cycle on pci bus for cards supporting greater
- than 32-bit addressing. The default is to look
- for translation below 32-bit and if not available
- then look in the higher range.
strict [Default Off]
- With this option on every unmap_single operation will
- result in a hardware IOTLB flush operation as opposed
- to batching them for performance.
+ Deprecated, equivalent to iommu.strict=1.
sp_off [Default Off]
By default, super page will be supported if Intel IOMMU
has the capability. With this option, super page will
not be supported.
- sm_on [Default Off]
- By default, scalable mode will be disabled even if the
- hardware advertises that it has support for the scalable
- mode translation. With this option set, scalable mode
- will be used on hardware which claims to support it.
+ sm_on
+ Enable the Intel IOMMU scalable mode if the hardware
+ advertises that it has support for the scalable mode
+ translation.
+ sm_off
+ Disallow use of the Intel IOMMU scalable mode.
tboot_noforce [Default Off]
Do not force the Intel IOMMU enabled under tboot.
By default, tboot will force Intel IOMMU on, which
@@ -1778,11 +2080,6 @@
Note that using this option lowers the security
provided by tboot because it makes the system
vulnerable to DMA attacks.
- nobounce [Default off]
- Disable bounce buffer for unstrusted devices such as
- the Thunderbolt devices. This will treat the untrusted
- devices as the trusted ones, hence might expose security
- risks of DMA attacks.
intel_idle.max_cstate= [KNL,HW,ACPI,X86]
0 disables intel_idle and fall back on acpi_idle.
@@ -1834,7 +2131,7 @@
strict regions from userspace.
relaxed
- iommu= [x86]
+ iommu= [X86]
off
force
noforce
@@ -1844,12 +2141,20 @@
merge
nomerge
soft
- pt [x86]
- nopt [x86]
+ pt [X86]
+ nopt [X86]
nobypass [PPC/POWERNV]
Disable IOMMU bypass, using IOMMU for PCI devices.
- iommu.strict= [ARM64] Configure TLB invalidation behaviour
+ iommu.forcedac= [ARM64, X86] Control IOVA allocation for PCI devices.
+ Format: { "0" | "1" }
+ 0 - Try to allocate a 32-bit DMA address first, before
+ falling back to the full range if needed.
+ 1 - Allocate directly from the full usable range,
+ forcing Dual Address Cycle for PCI cards supporting
+ greater than 32-bit addressing.
+
+ iommu.strict= [ARM64, X86] Configure TLB invalidation behaviour
Format: { "0" | "1" }
0 - Lazy mode.
Request that DMA unmap operations use deferred
@@ -1857,9 +2162,12 @@
throughput at the cost of reduced device isolation.
Will fall back to strict mode if not supported by
the relevant IOMMU driver.
- 1 - Strict mode (default).
+ 1 - Strict mode.
DMA unmap operations invalidate IOMMU hardware TLBs
synchronously.
+ unset - Use value of CONFIG_IOMMU_DEFAULT_DMA_{LAZY,STRICT}.
+ Note: on x86, strict mode specified via one of the
+ legacy driver-specific options takes precedence.
iommu.passthrough=
[ARM64, X86] Configure DMA to bypass the IOMMU by default.
@@ -1868,7 +2176,7 @@
1 - Bypass the IOMMU for DMA.
unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
- io7= [HW] IO7 for Marvel based alpha systems
+ io7= [HW] IO7 for Marvel-based Alpha systems
See comment before marvel_specify_io7 in
arch/alpha/kernel/core_marvel.c.
@@ -1883,7 +2191,7 @@
No delay
ip= [IP_PNP]
- See Documentation/filesystems/nfs/nfsroot.txt.
+ See Documentation/admin-guide/nfs/nfsroot.rst.
ipcmni_extend [KNL] Extend the maximum number of unique System V
IPC identifiers from 32,768 to 16,777,216.
@@ -1988,25 +2296,41 @@
iucv= [HW,NET]
- ivrs_ioapic [HW,X86_64]
+ ivrs_ioapic [HW,X86-64]
Provide an override to the IOAPIC-ID<->DEVICE-ID
- mapping provided in the IVRS ACPI table. For
- example, to map IOAPIC-ID decimal 10 to
- PCI device 00:14.0 write the parameter as:
+ mapping provided in the IVRS ACPI table.
+ By default, PCI segment is 0, and can be omitted.
+ For example:
+ * To map IOAPIC-ID decimal 10 to PCI device 00:14.0
+ write the parameter as:
ivrs_ioapic[10]=00:14.0
+ * To map IOAPIC-ID decimal 10 to PCI segment 0x1 and
+ PCI device 00:14.0 write the parameter as:
+ ivrs_ioapic[10]=0001:00:14.0
- ivrs_hpet [HW,X86_64]
+ ivrs_hpet [HW,X86-64]
Provide an override to the HPET-ID<->DEVICE-ID
- mapping provided in the IVRS ACPI table. For
- example, to map HPET-ID decimal 0 to
- PCI device 00:14.0 write the parameter as:
+ mapping provided in the IVRS ACPI table.
+ By default, PCI segment is 0, and can be omitted.
+ For example:
+ * To map HPET-ID decimal 0 to PCI device 00:14.0
+ write the parameter as:
ivrs_hpet[0]=00:14.0
+ * To map HPET-ID decimal 10 to PCI segment 0x1 and
+ PCI device 00:14.0 write the parameter as:
+ ivrs_hpet[10]=0001:00:14.0
- ivrs_acpihid [HW,X86_64]
+ ivrs_acpihid [HW,X86-64]
Provide an override to the ACPI-HID:UID<->DEVICE-ID
- mapping provided in the IVRS ACPI table. For
- example, to map UART-HID:UID AMD0020:0 to
- PCI device 00:14.5 write the parameter as:
+ mapping provided in the IVRS ACPI table.
+
+ For example, to map UART-HID:UID AMD0020:0 to
+ PCI segment 0x1 and PCI device ID 00:14.5,
+ write the parameter as:
+ ivrs_acpihid[0001:00:14.5]=AMD0020:0
+
+ By default, PCI segment is 0, and can be omitted.
+ For example, for PCI device 00:14.5, write the parameter as:
ivrs_acpihid[00:14.5]=AMD0020:0
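The bracketed ivrs_ioapic/ivrs_hpet syntax with its optional PCI segment can be illustrated with a small parser. This is only a sketch of the format as documented above: the function name `parse_ivrs` is hypothetical, and reading the segment, bus and device fields as hexadecimal is an assumption based on the usual PCI address notation.

```python
import re

_IVRS_RE = re.compile(
    r"^ivrs_(ioapic|hpet)\[(\d+)\]="
    r"(?:([0-9a-fA-F]{4}):)?"          # optional PCI segment, defaults to 0
    r"([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_ivrs(param: str):
    """Return (kind, id, segment, bus, device, function) for one parameter."""
    m = _IVRS_RE.match(param)
    if m is None:
        raise ValueError(f"unrecognized parameter: {param!r}")
    kind, ident, seg, bus, dev, fn = m.groups()
    return (kind, int(ident), int(seg or "0", 16),
            int(bus, 16), int(dev, 16), int(fn))
```

Both `ivrs_ioapic[10]=00:14.0` and `ivrs_ioapic[10]=0001:00:14.0` parse to the same bus, device and function, differing only in the segment (0 versus 1).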
js= [HW,JOY] Analog joystick
@@ -2071,10 +2395,25 @@
kms, kbd format: kms,kbd
kms, kbd and serial format: kms,kbd,<ser_dev>[,baud]
+ kgdboc_earlycon= [KGDB,HW]
+ If the boot console provides the ability to read
+ characters and can work in polling mode, you can use
+ this parameter to tell kgdb to use it as a backend
+ until the normal console is registered. Intended to
+ be used together with the kgdboc parameter which
+ specifies the normal console to transition to.
+
+ The name of the early console should be specified
+ as the value of this parameter. Note that the name of
+ the early console might be different than the tty
+ name passed to kgdboc. It's OK to leave the value
+ blank and the first boot console that implements
+ read() will be picked.
+
kgdbwait [KGDB] Stop kernel execution and enter the
kernel debugger at the earliest opportunity.
- kmac= [MIPS] korina ethernet MAC address.
+ kmac= [MIPS] Korina ethernet MAC address.
Configure the RouterBoard 532 series on-chip
Ethernet adapter MAC address.
@@ -2103,16 +2442,43 @@
0: force disabled
1: force enabled
+ kunit.enable= [KUNIT] Enable executing KUnit tests. Requires
+ CONFIG_KUNIT to be set to be fully enabled. The
+ default value can be overridden via
+ KUNIT_DEFAULT_ENABLED.
+ Default is 1 (enabled)
+
kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
Default is 0 (don't ignore, but inject #GP)
+ kvm.eager_page_split=
+ [KVM,X86] Controls whether or not KVM will try to
+ proactively split all huge pages during dirty logging.
+ Eager page splitting reduces interruptions to vCPU
+ execution by eliminating the write-protection faults
+ and MMU lock contention that would otherwise be
+ required to split huge pages lazily.
+
+ VM workloads that rarely perform writes or that write
+ only to a small region of VM memory may benefit from
+ disabling eager page splitting to allow huge pages to
+ still be used for reads.
+
+ The behavior of eager page splitting depends on whether
+ KVM_DIRTY_LOG_INITIALLY_SET is enabled or disabled. If
+ disabled, all huge pages in a memslot will be eagerly
+ split when dirty logging is enabled on that memslot. If
+ enabled, eager page splitting will be performed during
+ the KVM_CLEAR_DIRTY ioctl, and only for the pages being
+ cleared.
+
+ Eager page splitting is only supported when kvm.tdp_mmu=Y.
+
+ Default is Y (on).
+
kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface.
Default is false (don't support).
- kvm.mmu_audit= [KVM] This is a R/W parameter which allows audit
- KVM MMU at runtime.
- Default is 0 (off)
-
kvm.nx_huge_pages=
[KVM] Controls the software workaround for the
X86_BUG_ITLB_MULTIHIT bug.
@@ -2130,7 +2496,14 @@
[KVM] Controls how many 4KiB pages are periodically zapped
back to huge pages. 0 disables the recovery, otherwise if
the value is N KVM will zap 1/Nth of the 4KiB pages every
- minute. The default is 60.
+ period (see below). The default is 60.
+
+ kvm.nx_huge_pages_recovery_period_ms=
+ [KVM] Controls the time period at which KVM zaps 4KiB pages
+ back to huge pages. If the value is a non-zero N, KVM will
+ zap a portion (see ratio above) of the pages every N msecs.
+ If the value is 0 (the default), KVM will pick a period based
+ on the ratio, such that a page is zapped after 1 hour on average.
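The interplay between the recovery ratio and the recovery period can be sketched as below. The function name is hypothetical, and the one-hour formula is derived from the description above (each period zaps 1/N of the pages, so a full pass takes N periods); it is not copied from the KVM source.

```python
ONE_HOUR_MS = 60 * 60 * 1000


def effective_recovery_period_ms(ratio: int, period_ms: int = 0) -> int:
    """Period between zap passes, or 0 if recovery is disabled (ratio 0)."""
    if ratio == 0:
        return 0                    # recovery disabled
    if period_ms:
        return period_ms            # explicit non-zero period wins
    # Period 0: pick a period so that ratio * period is one hour.
    return ONE_HOUR_MS // ratio
```

With the default ratio of 60 and the default period of 0, this gives 60000 ms, which matches the earlier "every minute" wording.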
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
Default is 1 (enabled)
@@ -2139,6 +2512,21 @@
for all guests.
Default is 1 (enabled) if in 64-bit or 32-bit PAE mode.
+ kvm-arm.mode=
+ [KVM,ARM] Select one of KVM/arm64's modes of operation.
+
+ none: Forcefully disable KVM.
+
+ nvhe: Standard nVHE-based mode, without support for
+ protected guests.
+
+ protected: nVHE-based mode with support for guests whose
+ state is kept private from the host.
+
+ Defaults to VHE/nVHE based on hardware support. Setting
+ mode to "protected" will disable kexec and hibernation
+ for the host.
+
kvm-arm.vgic_v3_group0_trap=
[KVM,ARM] Trap guest accesses to GICv3 group-0
system registers
@@ -2155,13 +2543,25 @@
[KVM,ARM] Allow use of GICv4 for direct injection of
LPIs.
+ kvm_cma_resv_ratio=n [PPC]
+ Reserves given percentage from system memory area for
+ contiguous memory allocation for KVM hash pagetable
+ allocation.
+ By default it reserves 5% of total system memory.
+ Format: <integer>
+ Default: 5
+
kvm-intel.ept= [KVM,Intel] Disable extended page tables
(virtualized MMU) support on capable Intel chips.
Default is 1 (enabled)
kvm-intel.emulate_invalid_guest_state=
- [KVM,Intel] Enable emulation of invalid guest states
- Default is 0 (disabled)
+ [KVM,Intel] Disable emulation of invalid guest state.
+ Ignored if kvm-intel.enable_unrestricted_guest=1, as
+ guest state is never invalid for unrestricted guests.
+ This param doesn't apply to nested guests (L2), as KVM
+ never emulates invalid L2 guest state.
+ Default is 1 (enabled)
kvm-intel.flexpriority=
[KVM,Intel] Disable FlexPriority feature (TPR shadow).
@@ -2192,6 +2592,23 @@
feature (tagged TLBs) on capable Intel chips.
Default is 1 (enabled)
+ l1d_flush= [X86,INTEL]
+ Control mitigation for L1D based snooping vulnerability.
+
+ Certain CPUs are vulnerable to an exploit against CPU
+ internal buffers which can forward information to a
+ disclosure gadget under certain conditions.
+
+ In vulnerable processors, the speculatively
+ forwarded data can be used in a cache side channel
+ attack, to access data to which the attacker does
+ not have direct access.
+
+ This parameter controls the mitigation. The
+ options are:
+
+ on - enable the interface for the mitigation
+
l1tf= [X86] Control mitigation of the L1TF vulnerability on
affected CPUs
@@ -2264,9 +2681,10 @@
lapic [X86-32,APIC] Enable the local APIC even if BIOS
disabled it.
- lapic= [x86,APIC] "notscdeadline" Do not use TSC deadline
+ lapic= [X86,APIC] Do not use TSC deadline
value for LAPIC timer one-shot implementation. Default
back to the programmable timer unit in the LAPIC.
+ Format: notscdeadline
lapic_timer_c2_ok [X86,APIC] trust the local apic timer
in C2 power state.
@@ -2287,14 +2705,14 @@
when set.
Format: <int>
- libata.force= [LIBATA] Force configurations. The format is comma
- separated list of "[ID:]VAL" where ID is
- PORT[.DEVICE]. PORT and DEVICE are decimal numbers
- matching port, link or device. Basically, it matches
- the ATA ID string printed on console by libata. If
- the whole ID part is omitted, the last PORT and DEVICE
- values are used. If ID hasn't been specified yet, the
- configuration applies to all ports, links and devices.
+ libata.force= [LIBATA] Force configurations. The format is a comma-
+ separated list of "[ID:]VAL" where ID is PORT[.DEVICE].
+ PORT and DEVICE are decimal numbers matching port, link
+ or device. Basically, it matches the ATA ID string
+ printed on console by libata. If the whole ID part is
+ omitted, the last PORT and DEVICE values are used. If
+ ID hasn't been specified yet, the configuration applies
+ to all ports, links and devices.
If only DEVICE is omitted, the parameter applies to
the port and all links and devices behind it. DEVICE
@@ -2304,7 +2722,7 @@
host link and device attached to it.
The VAL specifies the configuration to force. As long
- as there's no ambiguity shortcut notation is allowed.
+ as there is no ambiguity, shortcut notation is allowed.
For example, both 1.5 and 1.5G would work for 1.5Gbps.
The following configurations can be forced.
@@ -2317,29 +2735,65 @@
udma[/][16,25,33,44,66,100,133] notation is also
allowed.
+ * nohrst, nosrst, norst: suppress hard, soft and both
+ resets.
+
+ * rstonce: only attempt one reset during hot-unplug
+ link recovery.
+
+ * [no]dbdelay: Enable or disable the extra 200ms delay
+ before debouncing a link PHY and device presence
+ detection.
+
* [no]ncq: Turn on or off NCQ.
- * [no]ncqtrim: Turn off queued DSM TRIM.
+ * [no]ncqtrim: Enable or disable queued DSM TRIM.
- * nohrst, nosrst, norst: suppress hard, soft
- and both resets.
+ * [no]ncqati: Enable or disable NCQ trim on ATI chipset.
- * rstonce: only attempt one reset during
- hot-unplug link recovery
+ * [no]trim: Enable or disable (unqueued) TRIM.
- * dump_id: dump IDENTIFY data.
+ * trim_zero: Indicate that TRIM command zeroes data.
- * atapi_dmadir: Enable ATAPI DMADIR bridge support
+ * max_trim_128m: Set 128M maximum trim size limit.
+
+ * [no]dma: Turn on or off DMA transfers.
+
+ * atapi_dmadir: Enable ATAPI DMADIR bridge support.
+
+ * atapi_mod16_dma: Enable the use of ATAPI DMA for
+ commands that are not a multiple of 16 bytes.
+
+ * [no]dmalog: Enable or disable the use of the
+ READ LOG DMA EXT command to access logs.
+
+ * [no]iddevlog: Enable or disable access to the
+ identify device data log.
+
+ * [no]logdir: Enable or disable access to the general
+ purpose log directory.
+
+ * max_sec_128: Set transfer size limit to 128 sectors.
+
+ * max_sec_1024: Set or clear transfer size limit to
+ 1024 sectors.
+
+ * max_sec_lba48: Set or clear transfer size limit to
+ 65535 sectors.
+
+ * [no]lpm: Enable or disable link power management.
+
+ * [no]setxfer: Indicate if transfer speed mode setting
+ should be skipped.
+
+ * dump_id: Dump IDENTIFY data.
* disable: Disable this device.
If there are multiple matching configurations changing
the same attribute, the last one is used.
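The "[ID:]VAL" rules described for libata.force, including the carry-over of the last PORT and DEVICE, can be sketched as follows. The function name `parse_libata_force` is hypothetical, `None` stands in for "all ports" or "all devices behind the port", and the sketch assumes VAL never contains a colon.

```python
def parse_libata_force(value: str):
    """Return a list of (port, device, val) tuples for a libata.force value."""
    port = device = None            # no ID seen yet: applies to everything
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        if ":" in item:
            ident, val = item.split(":", 1)
            port_s, _, dev_s = ident.partition(".")
            port = int(port_s)
            device = int(dev_s) if dev_s else None
        else:
            val = item              # ID omitted: reuse last PORT and DEVICE
        entries.append((port, device, val))
    return entries
```

For example, `noncq,1:1.5Gbps,2.00:udma/100,nodma` yields a global `noncq`, a port 1 speed limit, and two settings for port 2 device 0.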
- memblock=debug [KNL] Enable memblock debug messages.
-
- load_ramdisk= [RAM] List of ramdisks to load from floppy
- See Documentation/admin-guide/blockdev/ramdisk.rst.
+ load_ramdisk= [RAM] [Deprecated]
lockd.nlm_grace_period=P [NFS] Assign grace period.
Format: <integer>
@@ -2476,11 +2930,11 @@
(machvec) in a generic kernel.
Example: machvec=hpzx1
- machtype= [Loongson] Share the same kernel image file between different
- yeeloong laptop.
+ machtype= [Loongson] Share the same kernel image file between
+ different yeeloong laptops.
Example: machtype=lemote-yeeloong-2f-7inch
- max_addr=nn[KMG] [KNL,BOOT,ia64] All physical memory greater
+ max_addr=nn[KMG] [KNL,BOOT,IA-64] All physical memory greater
than or equal to this physical address is ignored.
maxcpus= [SMP] Maximum number of processors that an SMP kernel
@@ -2542,17 +2996,46 @@
For details see: Documentation/admin-guide/hw-vuln/mds.rst
+ mem=nn[KMG] [HEXAGON] Set the memory size.
+ Must be specified, otherwise memory size will be 0.
+
mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
- Amount of memory to be used when the kernel is not able
- to see the whole system memory or for test.
+ Amount of memory to be used in cases as follows:
+
+ 1 for test;
+ 2 when the kernel is not able to see the whole system memory;
+ 3 memory that lies after 'mem=' boundary is excluded from
+ the hypervisor, then assigned to KVM guests;
+ 4 to limit the memory available for kdump kernel.
+
+ [ARC,MICROBLAZE] - the limit applies only to low memory,
+ high memory is not affected.
+
+ [ARM64] - only limits memory covered by the linear
+ mapping. The NOMAP regions are not affected.
+
[X86] Work as limiting max address. Use together
with memmap= to avoid physical address space collisions.
Without memmap= PCI devices could be placed at addresses
belonging to unused RAM.
+ Note that this only takes effect during boot time since
+ in case 3 above, memory may need to be hot added after boot
+ if the hypervisor's system memory is not sufficient.
+
+ mem=nn[KMG]@ss[KMG]
+ [ARM,MIPS] - override the memory layout reported by
+ firmware.
+ Define a memory region of size nn[KMG] starting at
+ ss[KMG].
+ Multiple different regions can be specified with
+ multiple mem= parameters on the command line.
+
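As a rough illustration of the nn[KMG]@ss[KMG] syntax described above, the sketch below converts such a value into a (start, size) pair in bytes. The helper names are hypothetical, only decimal numbers with the documented K, M and G suffixes are handled, and the kernel's own parser may accept more forms than this.

```python
import re

# Binary multipliers for the K/M/G suffixes documented for nn[KMG].
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse 'nn[KMG]' into bytes (hypothetical helper, not the kernel parser)."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError(f"not a size: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2).upper()]


def parse_mem_region(value):
    """Split 'nn[KMG]@ss[KMG]' into (start, size), both in bytes."""
    size, sep, start = value.partition("@")
    if not sep:
        raise ValueError("expected nn[KMG]@ss[KMG]")
    return parse_size(start), parse_size(size)
```

For instance, parse_mem_region("64M@1G") describes a 64 MiB region starting at the 1 GiB boundary; several such values would correspond to several mem= parameters on the command line.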
mem=nopentium [BUGS=X86-32] Disable usage of 4MB pages for kernel
memory.
+ memblock=debug [KNL] Enable memblock debug messages.
+
memchunk=nn[KMG]
[KNL,SH] Allow user to override the default size for
per-device physically contiguous DMA buffers.
@@ -2572,7 +3055,7 @@
option description.
memmap=nn[KMG]@ss[KMG]
- [KNL] Force usage of a specific region of memory.
+ [KNL, X86, MIPS, XTENSA] Force usage of a specific region of memory.
Region of memory to be used is from ss to ss+nn.
If @ss[KMG] is omitted, it is equivalent to mem=nn[KMG],
which limits max address to nn[KMG].
@@ -2634,7 +3117,26 @@
seconds. Use this parameter to check at some
other rate. 0 disables periodic checking.
- memtest= [KNL,X86,ARM,PPC] Enable memtest
+ memory_hotplug.memmap_on_memory
+ [KNL,X86,ARM] Boolean flag to enable this feature.
+ Format: {on | off (default)}
+ When enabled, runtime hotplugged memory will
+ allocate its internal metadata (struct pages;
+ those vmemmap pages cannot be optimized even
+ if hugetlb_free_vmemmap is enabled) from the
+ hotadded memory itself, which allows hotadding
+ a lot of memory without requiring additional
+ memory to do so.
+ This feature is disabled by default because it
+ has some implications for large (e.g. GB)
+ allocations in some configurations (e.g. small
+ memory blocks).
+ The state of the flag can be read in
+ /sys/module/memory_hotplug/parameters/memmap_on_memory.
+ Note that even when enabled, there are a few cases where
+ the feature is not effective.
+
+ memtest= [KNL,X86,ARM,M68K,PPC,RISCV] Enable memtest
Format: <integer>
default : 0 <disable>
Specifies the number of memtest passes to be
@@ -2652,7 +3154,7 @@
mem_encrypt=on: Activate SME
mem_encrypt=off: Do not activate SME
- Refer to Documentation/virt/kvm/amd-memory-encryption.rst
+ Refer to Documentation/virt/kvm/x86/amd-memory-encryption.rst
for details on when memory encryption can be activated.
mem_sleep_default= [SUSPEND] Default system suspend mode:
@@ -2662,7 +3164,7 @@
See Documentation/admin-guide/pm/sleep-states.rst.
meye.*= [HW] Set MotionEye Camera parameters
- See Documentation/media/v4l-drivers/meye.rst.
+ See Documentation/admin-guide/media/meye.rst.
mfgpt_irq= [IA-32] Specify the IRQ to use for the
Multi-Function General Purpose Timers on AMD Geode
@@ -2675,7 +3177,7 @@
mga= [HW,DRM]
- min_addr=nn[KMG] [KNL,BOOT,ia64] All physical memory below this
+ min_addr=nn[KMG] [KNL,BOOT,IA-64] All physical memory below this
physical address is ignored.
mini2440= [ARM,HW,KNL]
@@ -2697,7 +3199,7 @@
touchscreen support is not enabled in the mainstream
kernel as of 2.6.30, a preliminary port can be found
in the "bleeding edge" mini2440 support kernel at
- http://repo.or.cz/w/linux-2.6/mini2440.git
+ https://repo.or.cz/w/linux-2.6/mini2440.git
mitigations=
[X86,PPC,S390,ARM64] Control optional mitigations for
@@ -2710,17 +3212,23 @@
improves system performance, but it may also
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
- kpti=0 [ARM64]
+ if nokaslr then kpti=0 [ARM64]
nospectre_v1 [X86,PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
spec_store_bypass_disable=off [X86,PPC]
ssbd=force-off [ARM64]
+ nospectre_bhb [ARM64]
l1tf=off [X86]
mds=off [X86]
tsx_async_abort=off [X86]
kvm.nx_huge_pages=off [X86]
+ srbds=off [X86,INTEL]
+ no_entry_flush [PPC]
+ no_uaccess_flush [PPC]
+ mmio_stale_data=off [X86]
+ retbleed=off [X86]
Exceptions:
This does not have any effect on
@@ -2742,6 +3250,8 @@
Equivalent to: l1tf=flush,nosmt [X86]
mds=full,nosmt [X86]
tsx_async_abort=full,nosmt [X86]
+ mmio_stale_data=full,nosmt [X86]
+ retbleed=auto,nosmt [X86]
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -2751,6 +3261,49 @@
log everything. Information is printed at KERN_DEBUG
so loglevel=8 may also need to be specified.
+ mmio_stale_data=
+ [X86,INTEL] Control mitigation for the Processor
+ MMIO Stale Data vulnerabilities.
+
+ Processor MMIO Stale Data is a class of
+ vulnerabilities that may expose data after an MMIO
+ operation. Exposed data could originate or end in
+ the same CPU buffers as affected by MDS and TAA.
+ Therefore, similar to MDS and TAA, the mitigation
+ is to clear the affected CPU buffers.
+
+ This parameter controls the mitigation. The
+ options are:
+
+ full - Enable mitigation on vulnerable CPUs
+
+ full,nosmt - Enable mitigation and disable SMT on
+ vulnerable CPUs.
+
+ off - Unconditionally disable mitigation
+
+ On MDS or TAA affected machines,
+ mmio_stale_data=off can be prevented by an active
+ MDS or TAA mitigation as these vulnerabilities are
+ mitigated with the same mechanism so in order to
+ disable this mitigation, you need to specify
+ mds=off and tsx_async_abort=off too.
+
+ Not specifying this option is equivalent to
+ mmio_stale_data=full.
+
+ For details see:
+ Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
+
+ module.async_probe=<bool>
+ [KNL] When set to true, modules will use async probing
+ by default. To enable/disable async probing for a
+ specific module, use the module specific control that
+ is documented under <module>.async_probe. When both
+ module.async_probe and <module>.async_probe are
+ specified, <module>.async_probe takes precedence for
+ the specific module.
+
module.sig_enforce
[KNL] When CONFIG_MODULE_SIG is set, this means that
modules without (valid) signatures will fail to load.
@@ -2795,26 +3348,12 @@
<name>,<region-number>[,<base>,<size>,<buswidth>,<altbuswidth>]
mtdparts= [MTD]
- See drivers/mtd/cmdlinepart.c.
-
- multitce=off [PPC] This parameter disables the use of the pSeries
- firmware feature for updating multiple TCE entries
- at a time.
-
- onenand.bdry= [HW,MTD] Flex-OneNAND Boundary Configuration
-
- Format: [die0_boundary][,die0_lock][,die1_boundary][,die1_lock]
-
- boundary - index of last SLC block on Flex-OneNAND.
- The remaining blocks are configured as MLC blocks.
- lock - Configure if Flex-OneNAND boundary should be locked.
- Once locked, the boundary cannot be changed.
- 1 indicates lock status, 0 indicates unlock status.
+ See drivers/mtd/parsers/cmdlinepart.c
mtdset= [ARM]
ARM/S3C2412 JIVE boot control
- See arch/arm/mach-s3c2412/mach-jive.c
+ See arch/arm/mach-s3c/mach-jive.c
mtouchusb.raw_coordinates=
[HW] Make the MicroTouch USB driver use raw coordinates
@@ -2837,6 +3376,10 @@
Used for mtrr cleanup. It is spare mtrr entries number.
Set to 2 or more if your graphical card needs more.
+ multitce=off [PPC] This parameter disables the use of the pSeries
+ firmware feature for updating multiple TCE entries
+ at a time.
+
n2= [NET] SDL Inc. RISCom/N2 synchronous serial card
netdev= [NET] Network devices parameters
@@ -2846,6 +3389,11 @@
This usage is only documented in each driver source
file if at all.
+ netpoll.carrier_timeout=
+ [NET] Specifies amount of time (in seconds) that
+ netpoll should wait for a carrier. By default netpoll
+ waits 4 seconds.
+
nf_conntrack.acct=
[NETFILTER] Enable connection tracking flow accounting
0 to disable accounting
@@ -2853,13 +3401,13 @@
Default value is 0.
nfsaddrs= [NFS] Deprecated. Use ip= instead.
- See Documentation/filesystems/nfs/nfsroot.txt.
+ See Documentation/admin-guide/nfs/nfsroot.rst.
nfsroot= [NFS] nfs root filesystem for disk-less boxes.
- See Documentation/filesystems/nfs/nfsroot.txt.
+ See Documentation/admin-guide/nfs/nfsroot.rst.
nfsrootdebug [NFS] enable nfsroot debugging messages.
- See Documentation/filesystems/nfs/nfsroot.txt.
+ See Documentation/admin-guide/nfs/nfsroot.rst.
nfs.callback_nr_threads=
[NFSv4] set the total number of threads that the
@@ -2951,6 +3499,19 @@
driver. A non-zero value sets the minimum interval
in seconds between layoutstats transmissions.
+ nfsd.inter_copy_offload_enable=
+ [NFSv4.2] When set to 1, the server will support
+ server-to-server copies for which this server is
+ the destination of the copy.
+
+ nfsd.nfsd4_ssc_umount_timeout=
+ [NFSv4.2] When used as the destination of a
+ server-to-server copy, knfsd temporarily mounts
+ the source server. It caches the mount in case
+ it will be needed again, and discards it if not
+ used for the number of milliseconds specified by
+ this parameter.
+
nfsd.nfs4_disable_idmapping=
[NFSv4] When set to the default of '1', the NFSv4
server will return only numeric uids and gids to
@@ -2958,6 +3519,11 @@
and gids from such clients. This is intended to ease
migration from NFSv2/v3.
+
+ nmi_backtrace.backtrace_idle [KNL]
+ Dump stacks even of idle CPUs in response to an
+ NMI stack-backtrace request.
+
nmi_debug= [KNL,SH] Specify one or more actions to take
when a NMI is triggered.
Format: [state][,regs][,debounce][,die]
@@ -2978,11 +3544,6 @@
These settings can be accessed at runtime via
the nmi_watchdog and hardlockup_panic sysctls.
- netpoll.carrier_timeout=
- [NET] Specifies amount of time (in seconds) that
- netpoll should wait for a carrier. By default netpoll
- waits 4 seconds.
-
no387 [BUGS=X86-32] Tells the kernel to use the 387 maths
emulation library even if a 387 maths coprocessor
is present.
@@ -2990,6 +3551,8 @@
no5lvl [X86-64] Disable 5-level paging mode. Forces
kernel to use 4-level paging instead.
+ nofsgsbase [X86] Disables FSGSBASE instructions.
+
no_console_suspend
[HW] Never suspend the console
Disable suspending of consoles during suspend and
@@ -3030,31 +3593,21 @@
noautogroup Disable scheduler automatic task group creation.
- nobats [PPC] Do not use BATs for mapping kernel lowmem
- on "Classic" PPC cores.
-
nocache [ARM]
- noclflush [BUGS=X86] Don't use the CLFLUSH instruction
-
- nodelayacct [KNL] Disable per-task delay accounting
-
nodsp [SH] Disable hardware DSP at boot time.
noefi Disable EFI runtime services support.
- noexec [IA-64]
+ no_entry_flush [PPC] Don't flush the L1-D cache when entering the kernel.
- noexec [X86]
- On X86-32 available only on PAE configured kernels.
- noexec=on: enable non-executable mappings (default)
- noexec=off: disable non-executable mappings
+ noexec [IA-64]
- nosmap [X86,PPC]
+ nosmap [PPC]
Disable SMAP (Supervisor Mode Access Prevention)
even if it is supported by processor.
- nosmep [X86,PPC]
+ nosmep [PPC64s]
Disable SMEP (Supervisor Mode Execution Prevention)
even if it is supported by processor.
@@ -3071,12 +3624,14 @@
register save and restore. The kernel will only save
legacy floating-point registers on task switch.
- nohugeiomap [KNL,x86,PPC] Disable kernel huge I/O mappings.
+ nohugeiomap [KNL,X86,PPC,ARM64] Disable kernel huge I/O mappings.
+
+ nohugevmalloc [KNL,X86,PPC,ARM64] Disable kernel huge vmalloc mappings.
nosmt [KNL,S390] Disable symmetric multithreading (SMT).
Equivalent to smt=1.
- [KNL,x86] Disable symmetric multithreading (SMT).
+ [KNL,X86] Disable symmetric multithreading (SMT).
nosmt=force: Force disable SMT, cannot be undone
via the sysfs control file.
@@ -3084,14 +3639,21 @@
(bounds check bypass). With this option data leaks are
possible in the system.
- nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
+ nospectre_v2 [X86,PPC_E500,ARM64] Disable all mitigations for
the Spectre variant 2 (indirect branch prediction)
vulnerability. System may allow data leaks with this
option.
+ nospectre_bhb [ARM64] Disable all mitigations for Spectre-BHB (branch
+ history injection) vulnerability. System may allow data leaks
+ with this option.
+
nospec_store_bypass_disable
[HW] Disable all mitigations for the Speculative Store Bypass vulnerability
+ no_uaccess_flush
+ [PPC] Don't flush the L1-D cache after accessing user data.
+
noxsave [BUGS=X86] Disables x86 extended register state save
and restore using xsave. The kernel will fallback to
enabling legacy floating-point and sse state.
@@ -3111,9 +3673,14 @@
parameter, xsave area per process might occupy more
memory on xsaves enabled systems.
- nohlt [BUGS=ARM,SH] Tells the kernel that the sleep(SH) or
- wfi(ARM) instruction doesn't work correctly and not to
- use it. This is also useful when using JTAG debugger.
+ nohlt [ARM,ARM64,MICROBLAZE,SH] Forces the kernel to busy wait
+ in do_idle() and not use the arch_cpu_idle()
+ implementation; requires CONFIG_GENERIC_IDLE_POLL_SETUP
+ to be effective. This is useful on platforms where the
+ sleep(SH) or wfi(ARM,ARM64) instructions do not work
+ correctly or when doing power measurements to evaluate
+ the impact of the sleep instructions. This is also
+ useful when using a JTAG debugger.
no_file_caps Tells the kernel not to honor file capabilities. The
only way then for a file to be executed with privilege
@@ -3126,6 +3693,20 @@
in certain environments such as networked servers or
real-time systems.
+ no_hash_pointers
+ Force pointers printed to the console or buffers to be
+ unhashed. By default, when a pointer is printed via %p
+ format string, that pointer is "hashed", i.e. obscured
+ by hashing the pointer value. This is a security feature
+ that hides actual kernel addresses from unprivileged
+ users, but it also makes debugging the kernel more
+ difficult since unequal pointers can no longer be
+ compared. However, if this command-line option is
+ specified, then all normal pointers will have their true
+ value printed. This option should only be specified when
+ debugging the kernel. Please do not use on production
+ kernels.
+
nohibernate [HIBERNATION] Disable hibernation and resume.
nohz= [KNL] Boottime enable/disable dynamic ticks
@@ -3142,6 +3723,9 @@
just as if they had also been called out in the
rcu_nocbs= boot parameter.
+ Note that this argument takes precedence over
+ the CONFIG_RCU_NOCB_CPU_DEFAULT_ALL option.
+
noiotrap [SH] Disables trapped I/O port accesses.
noirqdebug [X86-32] Disables the code which attempts to detect and
@@ -3174,17 +3758,14 @@
[X86,PV_OPS] Disable paravirtualized VMware scheduler
clock and use the default one.
- no-steal-acc [X86,KVM,ARM64] Disable paravirtualized steal time
- accounting. steal time is computed, but won't
- influence scheduler behaviour
+ no-steal-acc [X86,PV_OPS,ARM64,PPC/PSERIES] Disable paravirtualized
+ steal time accounting. steal time is computed, but
+ won't influence scheduler behaviour
nolapic [X86-32,APIC] Do not enable or use the local APIC.
nolapic_timer [X86-32,APIC] Do not use the local APIC timer.
- noltlbs [PPC] Do not use large page/tlb entries for kernel
- lowmem mapping on PPC40x and PPC8xx
-
nomca [IA-64] Disable machine check abort handling
nomce [X86-32] Disable Machine Check Exception
@@ -3196,6 +3777,13 @@
shutdown the other cpus. Instead use the REBOOT_VECTOR
irq.
+ nomodeset Disable kernel modesetting. DRM drivers will not perform
+ display-mode changes or accelerated rendering. Only the
+ system framebuffer will be available for use if this was
+ set-up by the firmware or boot loader.
+
+ Useful as fallback, or for testing and debugging.
+
nomodule Disable module load
nopat [X86] Disable PAT (page attribute table extension of
@@ -3209,11 +3797,6 @@
noreplace-smp [X86-32,SMP] Don't replace SMP instructions
with UP alternatives
- nordrand [X86] Disable kernel use of the RDRAND and
- RDSEED instructions even if they are supported
- by the processor. RDRAND and RDSEED are still
- available to user space applications.
-
noresume [SWSUSP] Disables resume and restores original swap
space.
@@ -3223,7 +3806,7 @@
nosbagart [IA-64]
- nosep [BUGS=X86-32] Disables x86 SYSENTER/SYSEXIT support.
+ nosgx [X86-64,SGX] Disables Intel SGX kernel support.
nosmp [SMP] Tells an SMP kernel to act as a UP kernel,
and disable the IO APIC. legacy for "maxcpus=0".
@@ -3239,19 +3822,9 @@
nox2apic [X86-64,APIC] Do not enable x2APIC mode.
- cpu0_hotplug [X86] Turn on CPU0 hotplug feature when
- CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
- Some features depend on CPU0. Known dependencies are:
- 1. Resume from suspend/hibernate depends on CPU0.
- Suspend/hibernate will fail if CPU0 is offline and you
- need to online CPU0 before suspend/hibernate.
- 2. PIC interrupts also depend on CPU0. CPU0 can't be
- removed if a PIC interrupt is detected.
- It's said poweroff/reboot may depend on CPU0 on some
- machines although I haven't seen such issues so far
- after CPU0 is offline on a few tested machines.
- If the dependencies are under your control, you can
- turn on cpu0_hotplug.
+ NOTE: this parameter will be ignored on systems with the
+ LEGACY_XAPIC_DISABLED bit set in the
+ IA32_XAPIC_DISABLE_STATUS MSR.
nps_mtm_hs_ctr= [KNL,ARC]
This parameter sets the maximum duration, in
@@ -3277,7 +3850,11 @@
nr_uarts= [SERIAL] maximum number of UARTs to be registered.
- numa_balancing= [KNL,X86] Enable or disable automatic NUMA balancing.
+ numa=off [KNL, ARM64, PPC, RISCV, SPARC, X86] Disable NUMA. Only
+ set up a single NUMA node spanning all memory.
+
+ numa_balancing= [KNL,ARM64,PPC,RISCV,S390,X86] Enable or disable automatic
+ NUMA balancing.
Allowed values are enable and disable
numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
@@ -3285,14 +3862,8 @@
This can be set from sysctl after boot.
See Documentation/admin-guide/sysctl/vm.rst for details.
- of_devlink [OF, KNL] Create device links between consumer and
- supplier devices by scanning the devictree to infer the
- consumer/supplier relationships. A consumer device
- will not be probed until all the supplier devices have
- probed successfully.
-
ohci1394_dma=early [HW] enable debugging via the ohci1394 driver.
- See Documentation/debugging-via-ohci1394.txt for more
+ See Documentation/core-api/debugging-via-ohci1394.rst for more
info.
olpc_ec_timeout= [OLPC] ms delay when issuing EC commands
@@ -3307,19 +3878,15 @@
For example, to override I2C bus2:
omap_mux=i2c2_scl.i2c2_scl=0x100,i2c2_sda.i2c2_sda=0x100
- oprofile.timer= [HW]
- Use timer interrupt instead of performance counters
+ onenand.bdry= [HW,MTD] Flex-OneNAND Boundary Configuration
- oprofile.cpu_type= Force an oprofile cpu type
- This might be useful if you have an older oprofile
- userland or if you want common events.
- Format: { arch_perfmon }
- arch_perfmon: [X86] Force use of architectural
- perfmon on Intel CPUs instead of the
- CPU specific event set.
- timer: [X86] Force use of architectural NMI
- timer mode (see also oprofile.timer
- for generic hr timer mode)
+ Format: [die0_boundary][,die0_lock][,die1_boundary][,die1_lock]
+
+ boundary - index of last SLC block on Flex-OneNAND.
+ The remaining blocks are configured as MLC blocks.
+ lock - Configure if Flex-OneNAND boundary should be locked.
+ Once locked, the boundary cannot be changed.
+ 1 indicates lock status, 0 indicates unlock status.
oops=panic Always panic on oopses. Default is to just kill the
process, but there is a small probability of
@@ -3349,6 +3916,12 @@
off: turn off poisoning (default)
on: turn on poisoning
+ page_reporting.page_reporting_order=
+ [KNL] Minimal page reporting order
+ Format: <integer>
+ Adjust the minimal page reporting order. The page
+ reporting is disabled when it exceeds (MAX_ORDER-1).
+
panic= [KNL] Kernel behaviour on panic: delay <timeout>
timeout > 0: seconds before rebooting
timeout = 0: wait forever
@@ -3363,18 +3936,28 @@
bit 3: print locks info if CONFIG_LOCKDEP is on
bit 4: print ftrace buffer
bit 5: print all printk messages in buffer
+ bit 6: print all CPUs backtrace (if available in the arch)
+ *Be aware* that this option may print a _lot_ of lines,
+ so there are risks of losing older messages in the log.
+ Use this option carefully; it may be worth setting up a
+ bigger log buffer with "log_buf_len" along with this.
+
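The bit list above can be turned into a numeric value with a small helper. This is only a sketch: it assumes the list belongs to the panic_print= bitmask, it covers just the bits visible in this hunk (3 to 6), and the dictionary keys are made-up labels rather than kernel identifiers.

```python
# Bit positions copied from the list above; lower bits are not shown in
# this hunk and are therefore left out. The labels are illustrative only.
PANIC_PRINT_BITS = {
    "locks": 3,              # print locks info if CONFIG_LOCKDEP is on
    "ftrace": 4,             # print ftrace buffer
    "printk_replay": 5,      # print all printk messages in buffer
    "all_cpu_backtrace": 6,  # print all CPUs backtrace
}


def panic_print_mask(*names):
    """OR together the bits for the requested kinds of panic output."""
    mask = 0
    for name in names:
        mask |= 1 << PANIC_PRINT_BITS[name]
    return mask
```

Requesting the ftrace buffer and the all-CPU backtrace, for example, sets bits 4 and 6, which gives 0x50.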
+ panic_on_taint= Bitmask for conditionally calling panic() in add_taint()
+ Format: <hex>[,nousertaint]
+ Hexadecimal bitmask representing the set of TAINT flags
+ that will cause the kernel to panic when add_taint() is
+ called with any of the flags in this set.
+ The optional switch "nousertaint" prevents userspace
+ from forcing a crash by writing to the sysctl
+ /proc/sys/kernel/tainted any flagset that matches the
+ bitmask set in panic_on_taint.
+ See Documentation/admin-guide/tainted-kernels.rst for
+ extra details on the taint flags that users can pick
+ to compose the bitmask to assign to panic_on_taint.
panic_on_warn panic() instead of WARN(). Useful to cause kdump
on a WARN().
- crash_kexec_post_notifiers
- Run kdump after running panic-notifiers and dumping
- kmsg. This only for the users who doubt kdump always
- succeeds in any situation.
- Note that this also increases risks of kdump failure,
- because some panic notifiers can make the crashed
- kernel more unstable.
-
parkbd.port= [HW] Parallel port number the keyboard adapter is
connected to, default is 0.
Format: <parport#>
@@ -3404,6 +3987,96 @@
Currently this function knows 686a and 8231 chips.
Format: [spp|ps2|epp|ecp|ecpepp]
+ pata_legacy.all= [HW,LIBATA]
+ Format: <int>
+ Set to non-zero to probe primary and secondary ISA
+ port ranges on PCI systems where no PCI PATA device
+ has been found at either range. Disabled by default.
+
+ pata_legacy.autospeed= [HW,LIBATA]
+ Format: <int>
+ Set to non-zero if a chip is present that snoops speed
+ changes. Disabled by default.
+
+ pata_legacy.ht6560a= [HW,LIBATA]
+ Format: <int>
+ Set to 1, 2, or 3 for HT 6560A on the primary channel,
+ the secondary channel, or both channels respectively.
+ Disabled by default.
+
+ pata_legacy.ht6560b= [HW,LIBATA]
+ Format: <int>
+ Set to 1, 2, or 3 for HT 6560B on the primary channel,
+ the secondary channel, or both channels respectively.
+ Disabled by default.
+
+ pata_legacy.iordy_mask= [HW,LIBATA]
+ Format: <int>
+ IORDY enable mask. Set individual bits to allow IORDY
+ for the respective channel. Bit 0 is for the first
+ legacy channel handled by this driver, bit 1 is for
+ the second channel, and so on. The sequence will often
+ correspond to the primary legacy channel, the secondary
+ legacy channel, and so on, but the handling of a PCI
+ bus and the use of other driver options may interfere
+ with the sequence. By default IORDY is allowed across
+ all channels.
+
+ pata_legacy.opti82c46x= [HW,LIBATA]
+ Format: <int>
+ Set to 1, 2, or 3 for Opti 82c465MV on the primary
+ channel, the secondary channel, or both channels
+ respectively. Disabled by default.
+
+ pata_legacy.opti82c611a= [HW,LIBATA]
+ Format: <int>
+ Set to 1, 2, or 3 for Opti 82c611A on the primary
+ channel, the secondary channel, or both channels
+ respectively. Disabled by default.
+
+ pata_legacy.pio_mask= [HW,LIBATA]
+ Format: <int>
+ PIO mode mask for autospeed devices. Set individual
+ bits to allow the use of the respective PIO modes.
+ Bit 0 is for mode 0, bit 1 is for mode 1, and so on.
+ All modes allowed by default.
+
+ pata_legacy.probe_all= [HW,LIBATA]
+ Format: <int>
+ Set to non-zero to probe tertiary and further ISA
+ port ranges on PCI systems. Disabled by default.
+
+ pata_legacy.probe_mask= [HW,LIBATA]
+ Format: <int>
+ Probe mask for legacy ISA PATA ports. Depending on
+ platform configuration and the use of other driver
+ options up to 6 legacy ports are supported: 0x1f0,
+ 0x170, 0x1e8, 0x168, 0x1e0, 0x160, however probing
+ of individual ports can be disabled by setting the
+ corresponding bits in the mask to 1. Bit 0 is for
+ the first port in the list above (0x1f0), and so on.
+ By default all supported ports are probed.
+
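Because a set bit in pata_legacy.probe_mask disables probing of the corresponding port, the value is easy to get backwards. The sketch below derives a mask from the port list quoted above; the function name is hypothetical, and it assumes all six ports are supported on the platform in question.

```python
# Legacy ISA PATA ports in the order given above; bit 0 maps to 0x1f0.
LEGACY_PORTS = [0x1F0, 0x170, 0x1E8, 0x168, 0x1E0, 0x160]


def probe_mask_disabling(ports):
    """Return a mask whose set bits disable probing of the given ports."""
    mask = 0
    for port in ports:
        # index() raises ValueError for a port that is not in the list.
        mask |= 1 << LEGACY_PORTS.index(port)
    return mask
```

Skipping the third and fourth ports (0x1e8 and 0x168) sets bits 2 and 3, so the resulting parameter would be pata_legacy.probe_mask=0xc.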
+ pata_legacy.qdi= [HW,LIBATA]
+ Format: <int>
+ Set to non-zero to probe QDI controllers. By default
+ set to 1 if CONFIG_PATA_QDI_MODULE, 0 otherwise.
+
+ pata_legacy.winbond= [HW,LIBATA]
+ Format: <int>
+ Set to non-zero to probe Winbond controllers. Use
+ the standard I/O port (0x130) if 1, otherwise the
+ value given is the I/O port to use (typically 0x1b0).
+ By default set to 1 if CONFIG_PATA_WINBOND_VLB_MODULE,
+ 0 otherwise.
+
+ pata_platform.pio_mask= [HW,LIBATA]
+ Format: <int>
+ Supported PIO mode mask. Set individual bits to allow
+ the use of the respective PIO modes. Bit 0 is for
+ mode 0, bit 1 is for mode 1, and so on. Mode 0 only
+ allowed by default.
+
pause_on_oops=
Halt all CPUs after the first oops has been printed for
the specified number of seconds. This is to be used if
@@ -3529,6 +4202,15 @@
please report a bug.
nocrs [X86] Ignore PCI host bridge windows from ACPI.
If you need to use this, please report a bug.
+ use_e820 [X86] Use E820 reservations to exclude parts of
+ PCI host bridge windows. This is a workaround
+ for BIOS defects in host bridge _CRS methods.
+ If you need to use this, please report a bug to
+ <linux-pci@vger.kernel.org>.
+ no_e820 [X86] Ignore E820 reservations for PCI host
+ bridge windows. This is the default on modern
+ hardware. If you need to use this, please report
+ a bug to <linux-pci@vger.kernel.org>.
routeirq Do IRQ routing for all PCI devices.
This is normally done in pci_enable_device(),
so this option is a temporary workaround
@@ -3632,6 +4314,8 @@
may put more devices in an IOMMU group.
force_floating [S390] Force usage of floating interrupts.
nomio [S390] Do not use MIO instructions.
+ norid [S390] ignore the RID field and force use of
+ one PCI domain per PCI function
pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power
Management.
@@ -3698,6 +4382,17 @@
Override pmtimer IOPort with a hex value.
e.g. pmtmr=0x508
+ pmu_override= [PPC] Override the PMU.
+ This option takes over the PMU facility, so it is no
+ longer usable by perf. Setting this option starts the
+ PMU counters by setting MMCR0 to 0 (the FC bit is
+ cleared). If a number is given, then MMCR1 is set to
+ that number, otherwise (e.g., 'pmu_override=on'), MMCR1
+ remains 0.
+
+ pm_debug_messages [SUSPEND,KNL]
+ Enable suspend/resume debug messages during boot up.
+
pnp.debug=1 [PNP]
Enable PNP debug messages (depends on the
CONFIG_PNP_DEBUG_MESSAGES option). Change at run-time
@@ -3747,6 +4442,13 @@
Format: {"off"}
Disable Hardware Transactional Memory
+ preempt= [KNL]
+ Select preemption mode if you have CONFIG_PREEMPT_DYNAMIC
+ none - Limited to cond_resched() calls
+ voluntary - Limited to cond_resched() and might_sleep() calls
+ full - Any section that isn't explicitly preempt disabled
+ can be preempted anytime.
+
print-fatal-signals=
[KNL] debug: print fatal signals
@@ -3766,6 +4468,15 @@
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
default: disabled
+ printk.console_no_auto_verbose=
+ Disable console loglevel raise on oops, panic
+ or lockdep-detected issues (only if lock debug is on).
+ Except for setups with a low baud rate on the
+ serial console, keeping this 0 is a good choice
+ in order to provide more debug information.
+ Format: <bool>
+ default: 0 (auto_verbose is enabled)
+
printk.devkmsg={on,off,ratelimit}
Control writing to /dev/kmsg.
on - unlimited logging to /dev/kmsg from userspace
@@ -3795,9 +4506,12 @@
Param: <number> - step/bucket size as a power of 2 for
statistical time based profiling.
- prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk
- before loading.
- See Documentation/admin-guide/blockdev/ramdisk.rst.
+ prompt_ramdisk= [RAM] [Deprecated]
+
+ prot_virt= [S390] enable hosting protected virtual machines
+ isolated from the hypervisor (if hardware supports
+ that).
+ Format: <bool>
psi= [KNL] Enable or disable pressure stall information
tracking.
@@ -3821,7 +4535,7 @@
pt. [PARIDE]
See Documentation/admin-guide/blockdev/paride.rst.
- pti= [X86_64] Control Page Table Isolation of user and
+ pti= [X86-64] Control Page Table Isolation of user and
kernel address spaces. Disabling this feature
removes hardening, but improves performance of
system calls and interrupts.
@@ -3833,7 +4547,7 @@
Not specifying this option is equivalent to pti=auto.
- nopti [X86_64]
+ nopti [X86-64]
Equivalent to pti=off
pty.legacy_count=
@@ -3850,33 +4564,64 @@
ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
See Documentation/admin-guide/blockdev/ramdisk.rst.
+ ramdisk_start= [RAM] RAM disk image start address
+
random.trust_cpu={on,off}
[KNL] Enable or disable trusting the use of the
CPU's random number generator (if available) to
fully seed the kernel's CRNG. Default is controlled
by CONFIG_RANDOM_TRUST_CPU.
+ random.trust_bootloader={on,off}
+ [KNL] Enable or disable trusting the use of a
+ seed passed by the bootloader (if available) to
+ fully seed the kernel's CRNG. Default is controlled
+ by CONFIG_RANDOM_TRUST_BOOTLOADER.
+
+ randomize_kstack_offset=
+ [KNL] Enable or disable kernel stack offset
+ randomization, which provides roughly 5 bits of
+ entropy, frustrating memory corruption attacks
+ that depend on stack address determinism or
+ cross-syscall address exposures. This is only
+ available on architectures that have defined
+ CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET.
+ Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+ Default is CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.
+
ras=option[,option,...] [KNL] RAS-specific options
cec_disable [X86]
Disable the Correctable Errors Collector,
see CONFIG_RAS_CEC help text.
- rcu_nocbs= [KNL]
- The argument is a cpu list, as described above,
- except that the string "all" can be used to
- specify every CPU on the system.
-
- In kernels built with CONFIG_RCU_NOCB_CPU=y, set
- the specified list of CPUs to be no-callback CPUs.
- Invocation of these CPUs' RCU callbacks will be
- offloaded to "rcuox/N" kthreads created for that
- purpose, where "x" is "p" for RCU-preempt, and
- "s" for RCU-sched, and "N" is the CPU number.
- This reduces OS jitter on the offloaded CPUs,
- which can be useful for HPC and real-time
- workloads. It can also improve energy efficiency
- for asymmetric multiprocessors.
+ rcu_nocbs[=cpu-list]
+ [KNL] The optional argument is a cpu list,
+ as described above.
+
+ In kernels built with CONFIG_RCU_NOCB_CPU=y,
+ enable the no-callback CPU mode, which prevents
+ such CPUs' callbacks from being invoked in
+ softirq context. Invocation of such CPUs' RCU
+ callbacks will instead be offloaded to "rcuox/N"
+ kthreads created for that purpose, where "x" is
+ "p" for RCU-preempt, "s" for RCU-sched, and "g"
+ for the kthreads that mediate grace periods; and
+ "N" is the CPU number. This reduces OS jitter on
+ the offloaded CPUs, which can be useful for HPC
+ and real-time workloads. It can also improve
+ energy efficiency for asymmetric multiprocessors.
+
+ If a cpulist is passed as an argument, the specified
+ list of CPUs is set to no-callback mode from boot.
+
+ Otherwise, if the '=' sign and the cpulist
+ arguments are omitted, no CPU will be set to
+ no-callback mode from boot but the mode may be
+ toggled at runtime via cpusets.
+
+ Note that this argument takes precedence over
+ the CONFIG_RCU_NOCB_CPU_DEFAULT_ALL option.
rcu_nocb_poll [KNL]
Rather than requiring that offloaded CPUs
@@ -3918,6 +4663,10 @@
value, meaning that RCU_SOFTIRQ is used by default.
Specify rcutree.use_softirq=0 to use rcuc kthreads.
+ But note that CONFIG_PREEMPT_RT=y kernels disable
+ this kernel boot parameter, forcibly setting it
+ to zero.
+
rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might
@@ -3932,6 +4681,19 @@
latencies, which will choose a value aligned
with the appropriate hardware boundaries.
+ rcutree.rcu_min_cached_objs= [KNL]
+ Minimum number of objects which are cached and
+ maintained per one CPU. Object size is equal
+ to PAGE_SIZE. The cache reduces pressure on the
+ page allocator and helps the whole algorithm
+ behave better under low-memory conditions.
+
+ rcutree.rcu_delay_page_cache_fill_msec= [KNL]
+ Set the page-cache refill delay (in milliseconds)
+ in response to low-memory conditions. Permitted
+ values are in the range 0:100000.
+
rcutree.jiffies_till_first_fqs= [KNL]
Set delay from grace-period initialization to
first attempt to force quiescent states.
@@ -3967,6 +4729,36 @@
(the least-favored priority). Otherwise, when
RCU_BOOST is not set, valid values are 0-99 and
the default is zero (non-realtime operation).
+ When RCU_NOCB_CPU is set, also adjust the
+ priority of NOCB callback kthreads.
+
+ rcutree.rcu_divisor= [KNL]
+ Set the shift-right count to use to compute
+ the callback-invocation batch limit bl from
+ the number of callbacks queued on this CPU.
+ The result will be bounded below by the value of
+ the rcutree.blimit kernel parameter. Every bl
+ callbacks, the softirq handler will exit in
+ order to allow the CPU to do other work.
+
+ Please note that this callback-invocation batch
+ limit applies only to non-offloaded callback
+ invocation. Offloaded callbacks are instead
+ invoked in the context of an rcuoc kthread, which
+ the scheduler will preempt as it does any other task.
+
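A minimal sketch of the rcutree.rcu_divisor arithmetic described above, assuming the batch limit is simply the queue length shifted right by the divisor with rcutree.blimit as a floor. The function name and the sample values are hypothetical and are not taken from the kernel sources.

```python
def batch_limit(queued_callbacks: int, rcu_divisor: int, blimit: int) -> int:
    """Return the callback-invocation batch limit "bl".

    Assumption: bl is the number of queued callbacks shifted right by
    rcu_divisor, bounded below by blimit, as the text describes.
    """
    return max(queued_callbacks >> rcu_divisor, blimit)


if __name__ == "__main__":
    # Hypothetical values: 8000 queued callbacks, divisor 3, blimit 10.
    print(batch_limit(8000, 3, 10))
    # A short queue falls back to the blimit floor.
    print(batch_limit(40, 3, 10))
```

A larger divisor yields a smaller batch, so the softirq handler yields the CPU more often at the cost of more passes over the queue.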
+ rcutree.nocb_nobypass_lim_per_jiffy= [KNL]
+ On callback-offloaded (rcu_nocbs) CPUs,
+ RCU reduces the lock contention that would
+ otherwise be caused by callback floods through
+ use of the ->nocb_bypass list. However, in the
+ common non-flooded case, RCU queues directly to
+ the main ->cblist in order to avoid the extra
+ overhead of the ->nocb_bypass list and its lock.
+ But if there are too many callbacks queued during
+ a single jiffy, RCU pre-queues the callbacks into
+ the ->nocb_bypass queue. The definition of "too
+ many" is supplied by this kernel boot parameter.
rcutree.rcu_nocb_gp_stride= [KNL]
Set the number of NOCB callback kthreads in
@@ -3984,15 +4776,14 @@
Set threshold of queued RCU callbacks below which
batch limiting is re-enabled.
- rcutree.rcu_idle_gp_delay= [KNL]
- Set wakeup interval for idle CPUs that have
- RCU callbacks (RCU_FAST_NO_HZ=y).
-
- rcutree.rcu_idle_lazy_gp_delay= [KNL]
- Set wakeup interval for idle CPUs that have
- only "lazy" RCU callbacks (RCU_FAST_NO_HZ=y).
- Lazy RCU callbacks are those which RCU can
- prove do nothing more than free memory.
+ rcutree.qovld= [KNL]
+ Set threshold of queued RCU callbacks beyond which
+ RCU's force-quiescent-state scan will aggressively
+ enlist help from cond_resched() and sched IPIs to
+ help CPUs more quickly reach quiescent states.
+ Set to less than zero to make this be set based
+ on rcutree.qhimark at boot time and to zero to
+ disable more aggressive help enlistment.
rcutree.rcu_kick_kthreads= [KNL]
Cause the grace-period kthread to get an extra
@@ -4001,46 +4792,67 @@
This wake_up() will be accompanied by a
WARN_ONCE() splat and an ftrace_dump().
+ rcutree.rcu_unlock_delay= [KNL]
+ In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
+ this specifies an rcu_read_unlock()-time delay
+ in microseconds. This defaults to zero.
+ Larger delays increase the probability of
+ catching RCU pointer leaks, that is, buggy use
+ of RCU-protected pointers after the relevant
+ rcu_read_unlock() has completed.
+
rcutree.sysrq_rcu= [KNL]
Commandeer a sysrq key to dump out Tree RCU's
rcu_node tree with an eye towards determining
why a new grace period has not yet started.
- rcuperf.gp_async= [KNL]
+ rcuscale.gp_async= [KNL]
Measure performance of asynchronous
grace-period primitives such as call_rcu().
- rcuperf.gp_async_max= [KNL]
+ rcuscale.gp_async_max= [KNL]
Specify the maximum number of outstanding
callbacks per writer thread. When a writer
thread exceeds this limit, it invokes the
corresponding flavor of rcu_barrier() to allow
previously posted callbacks to drain.
- rcuperf.gp_exp= [KNL]
+ rcuscale.gp_exp= [KNL]
Measure performance of expedited synchronous
grace-period primitives.
- rcuperf.holdoff= [KNL]
+ rcuscale.holdoff= [KNL]
Set test-start holdoff period. The purpose of
this parameter is to delay the start of the
test until boot completes in order to avoid
interference.
- rcuperf.kfree_rcu_test= [KNL]
+ rcuscale.kfree_rcu_test= [KNL]
Set to measure performance of kfree_rcu() flooding.
- rcuperf.kfree_nthreads= [KNL]
+ rcuscale.kfree_rcu_test_double= [KNL]
+ Test the double-argument variant of kfree_rcu().
+ If this parameter has the same value as
+ rcuscale.kfree_rcu_test_single, both the single-
+ and double-argument variants are tested.
+
+ rcuscale.kfree_rcu_test_single= [KNL]
+ Test the single-argument variant of kfree_rcu().
+ If this parameter has the same value as
+ rcuscale.kfree_rcu_test_double, both the single-
+ and double-argument variants are tested.
+
+ rcuscale.kfree_nthreads= [KNL]
The number of threads running loops of kfree_rcu().
- rcuperf.kfree_alloc_num= [KNL]
+ rcuscale.kfree_alloc_num= [KNL]
Number of allocations and frees done in an iteration.
- rcuperf.kfree_loops= [KNL]
- Number of loops doing rcuperf.kfree_alloc_num number
+ rcuscale.kfree_loops= [KNL]
+ Number of loops doing rcuscale.kfree_alloc_num number
of allocations and frees.
- rcuperf.nreaders= [KNL]
+ rcuscale.nreaders= [KNL]
Set number of RCU readers. The value -1 selects
N, where N is the number of CPUs. A value
"n" less than -1 selects N-n+1, where N is again
@@ -4049,23 +4861,23 @@
A value of "n" less than or equal to -N selects
a single reader.
- rcuperf.nwriters= [KNL]
+ rcuscale.nwriters= [KNL]
Set number of RCU writers. The values operate
- the same as for rcuperf.nreaders.
+ the same as for rcuscale.nreaders.
N, where N is the number of CPUs
- rcuperf.perf_type= [KNL]
+ rcuscale.perf_type= [KNL]
Specify the RCU implementation to test.
- rcuperf.shutdown= [KNL]
+ rcuscale.shutdown= [KNL]
Shut the system down after performance tests
complete. This is useful for hands-off automated
testing.
- rcuperf.verbose= [KNL]
+ rcuscale.verbose= [KNL]
Enable additional printk() statements.
- rcuperf.writer_holdoff= [KNL]
+ rcuscale.writer_holdoff= [KNL]
Write-side holdoff between grace periods,
in microseconds. The default of zero says
no holdoff.
@@ -4083,8 +4895,12 @@
in seconds.
rcutorture.fwd_progress= [KNL]
- Enable RCU grace-period forward-progress testing
+ Specifies the number of kthreads to be used
+ for RCU grace-period forward-progress testing
for the types of RCU supporting this notion.
+ Defaults to 1 kthread. Values less than zero or
+ greater than the number of CPUs cause the number
+ of CPUs to be used.
rcutorture.fwd_progress_div= [KNL]
Specify the fraction of a CPU-stall-warning
@@ -4118,6 +4934,18 @@
are zero, rcutorture acts as if they are all
non-zero.
+ rcutorture.irqreader= [KNL]
+ Run RCU readers from irq handlers, or, more
+ accurately, from a timer handler. Not all RCU
+ flavors take kindly to this sort of thing.
+
+ rcutorture.leakpointer= [KNL]
+ Leak an RCU-protected pointer out of the reader.
+ This can of course result in splats, and is
+ intended to test the ability of things like
+ CONFIG_RCU_STRICT_GRACE_PERIOD=y to detect
+ such leaks.
+
rcutorture.n_barrier_cbs= [KNL]
Set callbacks/threads for rcu_barrier() testing.
@@ -4126,6 +4954,14 @@
stress RCU, they don't participate in the actual
test, hence the "fake".
+ rcutorture.nocbs_nthreads= [KNL]
+ Set number of RCU callback-offload togglers.
+ Zero (the default) disables toggling.
+
+ rcutorture.nocbs_toggle= [KNL]
+ Set the delay in milliseconds between successive
+ callback-offload toggling attempts.
+
rcutorture.nreaders= [KNL]
Set number of RCU readers. The value -1 selects
N-1, where N is the number of CPUs. A value
@@ -4143,6 +4979,20 @@
Set time (jiffies) between CPU-hotplug operations,
or zero to disable CPU-hotplug testing.
+ rcutorture.read_exit= [KNL]
+ Set the number of read-then-exit kthreads used
+ to test the interaction of RCU updaters and
+ task-exit processing.
+
+ rcutorture.read_exit_burst= [KNL]
+ The number of times in a given read-then-exit
+ episode that a set of read-then-exit kthreads
+ is spawned.
+
+ rcutorture.read_exit_delay= [KNL]
+ The delay, in seconds, between successive
+ read-then-exit testing episodes.
+
rcutorture.shuffle_interval= [KNL]
Set task-shuffle interval (s). Shuffling tasks
allows some CPUs to go into dyntick-idle mode
@@ -4156,12 +5006,24 @@
Duration of CPU stall (s) to test RCU CPU stall
warnings, zero to disable.
+ rcutorture.stall_cpu_block= [KNL]
+ Sleep while stalling if set. This will result
+ in warnings from preemptible RCU in addition
+ to any other stall-related activity.
+
rcutorture.stall_cpu_holdoff= [KNL]
Time to wait (s) after boot before inducing stall.
rcutorture.stall_cpu_irqsoff= [KNL]
Disable interrupts while stalling if set.
+ rcutorture.stall_gp_kthread= [KNL]
+ Duration (s) of forced sleep within RCU
+ grace-period kthread to test RCU CPU stall
+ warnings, zero to disable. If both stall_cpu
+ and stall_gp_kthread are specified, the
+ kthread is starved first, then the CPU.
+
rcutorture.stat_interval= [KNL]
Time (s) between statistics printk()s.
@@ -4199,8 +5061,26 @@
rcupdate.rcu_cpu_stall_suppress= [KNL]
Suppress RCU CPU stall warning messages.
+ rcupdate.rcu_cpu_stall_suppress_at_boot= [KNL]
+ Suppress RCU CPU stall warning messages and
+ rcutorture writer stall warnings that occur
+ during early boot, that is, during the time
+ before the init task is spawned.
+
rcupdate.rcu_cpu_stall_timeout= [KNL]
Set timeout for RCU CPU stall warning messages.
+ The value is in seconds and the maximum allowed
+ value is 300 seconds.
+
+ rcupdate.rcu_exp_cpu_stall_timeout= [KNL]
+ Set timeout for expedited RCU CPU stall warning
+ messages. The value is in milliseconds
+ and the maximum allowed value is 21000
+ milliseconds. Please note that this value is
+ adjusted to an arch timer tick resolution.
+ Setting this to zero causes the value from
+ rcupdate.rcu_cpu_stall_timeout to be used (after
+ conversion from seconds to milliseconds).
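
The fallback rule for rcupdate.rcu_exp_cpu_stall_timeout can be sketched as follows. This is an illustration only: the function name is hypothetical, the input is assumed to be non-negative, values above the stated maximum are assumed to be clamped, and the adjustment to arch timer tick resolution is ignored.

```python
MAX_EXP_TIMEOUT_MS = 21000  # maximum allowed value stated in the text


def effective_exp_stall_timeout_ms(exp_timeout_ms: int,
                                   cpu_stall_timeout_s: int) -> int:
    """Return the expedited stall-warning timeout in milliseconds.

    Zero means "use rcupdate.rcu_cpu_stall_timeout", converted from
    seconds to milliseconds. Clamping to the maximum is an assumption.
    """
    if exp_timeout_ms == 0:
        return cpu_stall_timeout_s * 1000
    return min(exp_timeout_ms, MAX_EXP_TIMEOUT_MS)


if __name__ == "__main__":
    print(effective_exp_stall_timeout_ms(0, 21))      # falls back to 21 s
    print(effective_exp_stall_timeout_ms(500, 21))    # explicit 500 ms
```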
rcupdate.rcu_expedited= [KNL]
Use expedited grace-period primitives, for
@@ -4226,10 +5106,71 @@
only normal grace-period primitives. No effect
on CONFIG_TINY_RCU kernels.
+ But note that CONFIG_PREEMPT_RT=y kernels enable
+ this kernel boot parameter, forcibly setting
+ it to the value one, that is, converting any
+ post-boot attempt at an expedited RCU grace
+ period to instead use normal non-expedited
+ grace-period processing.
+
+ rcupdate.rcu_task_collapse_lim= [KNL]
+ Set the maximum number of callbacks present
+ at the beginning of a grace period that allows
+ the RCU Tasks flavors to collapse back to using
+ a single callback queue. This switching only
+ occurs when rcupdate.rcu_task_enqueue_lim is
+ set to the default value of -1.
+
+ rcupdate.rcu_task_contend_lim= [KNL]
+ Set the minimum number of callback-queuing-time
+ lock-contention events per jiffy required to
+ cause the RCU Tasks flavors to switch to per-CPU
+ callback queuing. This switching only occurs
+ when rcupdate.rcu_task_enqueue_lim is set to
+ the default value of -1.
+
+ rcupdate.rcu_task_enqueue_lim= [KNL]
+ Set the number of callback queues to use for the
+ RCU Tasks family of RCU flavors. The default
+ of -1 allows this to be automatically (and
+ dynamically) adjusted. This parameter is intended
+ for use in testing.
+
+ rcupdate.rcu_task_ipi_delay= [KNL]
+ Set time in jiffies during which RCU tasks will
+ avoid sending IPIs, starting with the beginning
+ of a given grace period. Setting a large
+ number avoids disturbing real-time workloads,
+ but lengthens grace periods.
+
+ rcupdate.rcu_task_stall_info= [KNL]
+ Set initial timeout in jiffies for RCU task stall
+ informational messages, which give some indication
+ of the problem for those not patient enough to
+ wait for ten minutes. Informational messages are
+ only printed prior to the stall-warning message
+ for a given grace period. Disable with a value
+ less than or equal to zero. Defaults to ten
+ seconds. A change in value does not take effect
+ until the beginning of the next grace period.
+
+ rcupdate.rcu_task_stall_info_mult= [KNL]
+ Multiplier for time interval between successive
+ RCU task stall informational messages for a given
+ RCU tasks grace period. This value is clamped
+ to one through ten, inclusive. It defaults to
+ the value three, so that the first informational
+ message is printed 10 seconds into the grace
+ period, the second at 40 seconds, the third at
+ 160 seconds, and then the stall warning at 600
+ seconds would prevent a fourth at 640 seconds.
+
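The message schedule given for rcupdate.rcu_task_stall_info_mult (10, 40, 160, then a suppressed 640 seconds) can be reproduced with a short sketch. It assumes that the wait before the next message is the multiplier times the time already elapsed in the grace period, which matches the numbers in the text; the function name and this model are illustrative, not the kernel's code.

```python
def stall_info_times(first_s: int = 10, mult: int = 3,
                     stall_timeout_s: int = 600) -> list:
    """Return the times (in seconds) at which informational messages print."""
    if first_s <= 0:
        return []  # a value <= 0 disables informational messages
    mult = max(1, min(10, mult))  # clamped to one through ten, inclusive
    times = []
    t = first_s
    while t < stall_timeout_s:
        times.append(t)
        t += t * mult  # assumption: next wait = mult * elapsed time
    return times


if __name__ == "__main__":
    # Defaults from the text: first message at 10 s, multiplier 3,
    # stall warning at 600 s (ten minutes).
    print(stall_info_times())
```

With the defaults the fourth message would fall at 640 seconds, after the 600-second stall warning, so it is never printed.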
rcupdate.rcu_task_stall_timeout= [KNL]
- Set timeout in jiffies for RCU task stall warning
- messages. Disable with a value less than or equal
- to zero.
+ Set timeout in jiffies for RCU task stall
+ warning messages. Disable with a value less
+ than or equal to zero. Defaults to ten minutes.
+ A change in value does not take effect until
+ the beginning of the next grace period.
rcupdate.rcu_self_test= [KNL]
Run the RCU early boot self tests
@@ -4255,7 +5196,7 @@
reboot= [KNL]
Format (x86 or x86_64):
- [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \
+ [w[arm] | c[old] | h[ard] | s[oft] | g[pio] | d[efault]] \
[[,]s[mp]#### \
[[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \
[[,]f[orce]
@@ -4267,6 +5208,51 @@
reboot_cpu is s[mp]#### with #### being the processor
to be used for rebooting.
+ refscale.holdoff= [KNL]
+ Set test-start holdoff period. The purpose of
+ this parameter is to delay the start of the
+ test until boot completes in order to avoid
+ interference.
+
+ refscale.loops= [KNL]
+ Set the number of loops over the synchronization
+ primitive under test. Increasing this number
+ reduces noise due to loop start/end overhead,
+ but the default has already reduced the per-pass
+ noise to a handful of picoseconds on ca. 2020
+ x86 laptops.
+
+ refscale.nreaders= [KNL]
+ Set number of readers. The default value of -1
+ selects N, where N is roughly 75% of the number
+ of CPUs. A value of zero is an interesting choice.
+
+ refscale.nruns= [KNL]
+ Set number of runs, each of which is dumped onto
+ the console log.
+
+ refscale.readdelay= [KNL]
+ Set the read-side critical-section duration,
+ measured in microseconds.
+
+ refscale.scale_type= [KNL]
+ Specify the read-protection implementation to test.
+
+ refscale.shutdown= [KNL]
+ Shut down the system at the end of the performance
+ test. This defaults to 1 (shut it down) when
+ refscale is built into the kernel and to 0 (leave
+ it running) when refscale is built as a module.
+
+ refscale.verbose= [KNL]
+ Enable additional printk() statements.
+
+ refscale.verbose_batched= [KNL]
+ Batch the additional printk() statements. If zero
+ (the default) or negative, print everything. Otherwise,
+ print every Nth verbose statement, where N is the value
+ specified.
+
relax_domain_level=
[KNL, SMP] Set scheduler's default relax_domain_level.
See Documentation/admin-guide/cgroup-v1/cpusets.rst.
@@ -4282,11 +5268,6 @@
Reserves a hole at the top of the kernel virtual
address space.
- reservelow= [X86]
- Format: nn[K]
- Set the amount of memory to reserve for BIOS at
- the bottom of the address space.
-
reset_devices [KNL] Force drivers to reset the underlying device
during initialization.
@@ -4308,17 +5289,45 @@
Useful for devices that are detected asynchronously
(e.g. USB and MMC devices).
- hibernate= [HIBERNATION]
- noresume Don't check if there's a hibernation image
- present during boot.
- nocompress Don't compress/decompress hibernation images.
- no Disable hibernation and resume.
- protect_image Turn on image protection during restoration
- (that will set all pages holding image data
- during restoration read-only).
-
retain_initrd [RAM] Keep initrd memory after extraction
+ retbleed= [X86] Control mitigation of RETBleed (Arbitrary
+ Speculative Code Execution with Return Instructions)
+ vulnerability.
+
+ AMD-based UNRET and IBPB mitigations alone do not stop
+ sibling threads from influencing the predictions of other
+ sibling threads. For that reason, STIBP is used on
+ processors that support it, and SMT is mitigated on
+ processors that don't.
+
+ off - no mitigation
+ auto - automatically select a mitigation
+ auto,nosmt - automatically select a mitigation,
+ disabling SMT if necessary for
+ the full mitigation (only on Zen1
+ and older without STIBP).
+ ibpb - On AMD, mitigate short speculation
+ windows on basic block boundaries too.
+ Safe, highest perf impact. It also
+ enables STIBP if present. Not suitable
+ on Intel.
+ ibpb,nosmt - Like "ibpb" above but will disable SMT
+ when STIBP is not available. This is
+ the alternative for systems which do not
+ have STIBP.
+ unret - Force enable untrained return thunks,
+ only effective on AMD f15h-f17h based
+ systems.
+ unret,nosmt - Like unret, but will disable SMT when STIBP
+ is not available. This is the alternative for
+ systems which do not have STIBP.
+
+ Selecting 'auto' will choose a mitigation method at run
+ time according to the CPU.
+
+ Not specifying this option is equivalent to retbleed=auto.
+
rfkill.default_state=
0 "airplane mode". All wifi, bluetooth, wimax, gps, fm,
etc. communication is blocked by default.
@@ -4343,6 +5352,8 @@
rodata= [KNL]
on Mark read-only kernel memory as read-only (default).
off Leave read-only kernel memory writable for debugging.
+ full Mark read-only kernel memory and aliases as read-only
+ [arm64]
rockchip.usb_uart
Enable the uart passthrough on the designated usb port
@@ -4380,18 +5391,136 @@
an IOTLB flush. Default is lazy flushing before reuse,
which is faster.
+ s390_iommu_aperture= [KNL,S390]
+ Specifies the size of the per device DMA address space
+ accessible through the DMA and IOMMU APIs as a decimal
+ factor of the size of main memory.
+ The default is 1 meaning that one can concurrently use
+ as many DMA addresses as physical memory is installed,
+ if supported by hardware, and thus map all of memory
+ once. With a value of 2 one can map all of memory twice
+ and so on. As a special case a factor of 0 imposes no
+ restrictions other than those given by hardware at the
+ cost of significant additional memory use for tables.
+
sa1100ir [NET]
See drivers/net/irda/sa1100_ir.c.
- sbni= [NET] Granch SBNI12 leased line adapter
-
- sched_debug [KNL] Enables verbose scheduler debug messages.
+ sched_verbose [KNL] Enables verbose scheduler debug messages.
schedstats= [KNL,X86] Enable or disable scheduler statistics.
Allowed values are enable and disable. This feature
incurs a small amount of overhead in the scheduler
but is useful for debugging and performance tuning.
+ sched_thermal_decay_shift=
+ [KNL, SMP] Set a decay shift for scheduler thermal
+ pressure signal. Thermal pressure signal follows the
+ default decay period of other scheduler PELT
+ signals (usually 32 ms but configurable). Setting
+ sched_thermal_decay_shift will left shift the decay
+ period for the thermal pressure signal by the shift
+ value.
+ For example, with the default PELT decay period of 32 ms:
+ sched_thermal_decay_shift thermal pressure decay period
+ 1 64 ms
+ 2 128 ms
+ and so on.
+ Format: integer between 0 and 10
+ Default is 0.
+
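The table above follows from a plain left shift of the PELT decay period. A minimal sketch, assuming the 32 ms default period mentioned in the text (the function name is hypothetical):

```python
def thermal_decay_period_ms(shift: int, pelt_period_ms: int = 32) -> int:
    """Return the thermal pressure decay period for a given shift."""
    if not 0 <= shift <= 10:
        # The documented format is an integer between 0 and 10.
        raise ValueError("sched_thermal_decay_shift must be in 0..10")
    return pelt_period_ms << shift


if __name__ == "__main__":
    for shift in (0, 1, 2, 10):
        print(shift, thermal_decay_period_ms(shift), "ms")
```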
+ scftorture.holdoff= [KNL]
+ Number of seconds to hold off before starting
+ test. Defaults to zero for module insertion and
+ to 10 seconds for built-in smp_call_function()
+ tests.
+
+ scftorture.longwait= [KNL]
+ Request ridiculously long waits randomly selected
+ up to the chosen limit in seconds. Zero (the
+ default) disables this feature. Please note
+ that requesting even small non-zero numbers of
+ seconds can result in RCU CPU stall warnings,
+ softlockup complaints, and so on.
+
+ scftorture.nthreads= [KNL]
+ Number of kthreads to spawn to invoke the
+ smp_call_function() family of functions.
+ The default of -1 specifies a number of kthreads
+ equal to the number of CPUs.
+
+ scftorture.onoff_holdoff= [KNL]
+ Number of seconds to wait after the start of the
+ test before initiating CPU-hotplug operations.
+
+ scftorture.onoff_interval= [KNL]
+ Number of seconds to wait between successive
+ CPU-hotplug operations. Specifying zero (which
+ is the default) disables CPU-hotplug operations.
+
+ scftorture.shutdown_secs= [KNL]
+ The number of seconds following the start of the
+ test after which to shut down the system. The
+ default of zero avoids shutting down the system.
+ Non-zero values are useful for automated tests.
+
+ scftorture.stat_interval= [KNL]
+ The number of seconds between outputting the
+ current test statistics to the console. A value
+ of zero disables statistics output.
+
+ scftorture.stutter_cpus= [KNL]
+ The number of jiffies to wait between each change
+ to the set of CPUs under test.
+
+ scftorture.use_cpus_read_lock= [KNL]
+ Use cpus_read_lock() instead of the default
+ preempt_disable() to disable CPU hotplug
+ while invoking one of the smp_call_function*()
+ functions.
+
+ scftorture.verbose= [KNL]
+ Enable additional printk() statements.
+
+ scftorture.weight_single= [KNL]
+ The probability weighting to use for the
+ smp_call_function_single() function with a zero
+ "wait" parameter. A value of -1 selects the
+ default if all other weights are -1. However,
+ if at least one weight has some other value, a
+ value of -1 will instead select a weight of zero.
+
+ scftorture.weight_single_wait= [KNL]
+ The probability weighting to use for the
+ smp_call_function_single() function with a
+ non-zero "wait" parameter. See weight_single.
+
+ scftorture.weight_many= [KNL]
+ The probability weighting to use for the
+ smp_call_function_many() function with a zero
+ "wait" parameter. See weight_single.
+ Note well that setting a high probability for
+ this weighting can place serious IPI load
+ on the system.
+
+ scftorture.weight_many_wait= [KNL]
+ The probability weighting to use for the
+ smp_call_function_many() function with a
+ non-zero "wait" parameter. See weight_single
+ and weight_many.
+
+ scftorture.weight_all= [KNL]
+ The probability weighting to use for the
+ smp_call_function_all() function with a zero
+ "wait" parameter. See weight_single and
+ weight_many.
+
+ scftorture.weight_all_wait= [KNL]
+ The probability weighting to use for the
+ smp_call_function_all() function with a
+ non-zero "wait" parameter. See weight_single
+ and weight_many.
+
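The -1 rule shared by the scftorture.weight_* parameters can be sketched as below. The default weights passed in are placeholders chosen for illustration; the text does not state the real defaults, and the function name is hypothetical.

```python
def resolve_weights(weights: dict, defaults: dict) -> dict:
    """Apply the "-1" rule described for scftorture.weight_single.

    If every weight is -1, all defaults are used. If at least one weight
    has some other value, each remaining -1 becomes a weight of zero.
    """
    if all(w == -1 for w in weights.values()):
        return dict(defaults)
    return {name: (0 if w == -1 else w) for name, w in weights.items()}


if __name__ == "__main__":
    # Placeholder defaults, not the values used by scftorture itself.
    defaults = {"single": 1, "single_wait": 1, "many": 1,
                "many_wait": 1, "all": 1, "all_wait": 1}
    unset = {name: -1 for name in defaults}
    print(resolve_weights(unset, defaults))
    # Setting one weight explicitly zeroes every weight left at -1.
    print(resolve_weights({**unset, "many": 5}, defaults))
```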
skew_tick= [KNL] Offset the periodic timer tick per cpu to mitigate
xtime_lock contention on larger systems, and/or RCU lock
contention on all systems with CONFIG_MAXSMP set.
@@ -4421,6 +5550,8 @@
serialnumber [BUGS=X86-32]
+ sev=option[,option...] [X86-64] See Documentation/x86/x86_64/boot-options.rst
+
shapers= [NET]
Maximal number of shapers.
@@ -4429,6 +5560,10 @@
slram= [HW,MTD]
+ slab_merge [MM]
+ Enable merging of slabs with similar size when the
+ kernel is built without CONFIG_SLAB_MERGE_DEFAULT.
+
slab_nomerge [MM]
Disable merging of slabs with similar size. May be
necessary if there is some reason to distinguish
@@ -4440,7 +5575,7 @@
cache (risks via metadata attacks are mostly
unchanged). Debug options disable merging on their
own.
- For more information see Documentation/vm/slub.rst.
+ For more information see Documentation/mm/slub.rst.
slab_max_order= [MM, SLAB]
Determines the maximum allowed order for slabs.
@@ -4448,27 +5583,19 @@
fragmentation. Defaults to 1 for systems with
more than 32MB of RAM, 0 otherwise.
- slub_debug[=options[,slabs]] [MM, SLUB]
+ slub_debug[=options[,slabs][;[options[,slabs]]...] [MM, SLUB]
Enabling slub_debug allows one to determine the
culprit if slab objects become corrupted. Enabling
slub_debug can create guard zones around objects and
may poison objects when not in use. Also tracks the
last alloc / free. For more information see
- Documentation/vm/slub.rst.
-
- slub_memcg_sysfs= [MM, SLUB]
- Determines whether to enable sysfs directories for
- memory cgroup sub-caches. 1 to enable, 0 to disable.
- The default is determined by CONFIG_SLUB_MEMCG_SYSFS_ON.
- Enabling this can lead to a very high number of debug
- directories and files being created under
- /sys/kernel/slub.
+ Documentation/mm/slub.rst.
slub_max_order= [MM, SLUB]
Determines the maximum allowed order for slabs.
A high setting may cause OOMs due to memory
fragmentation. For more information see
- Documentation/vm/slub.rst.
+ Documentation/mm/slub.rst.
slub_min_objects= [MM, SLUB]
The minimum number of objects per slab. SLUB will
@@ -4477,12 +5604,15 @@
the number of objects indicated. The higher the number
of objects the smaller the overhead of tracking slabs
and the less frequently locks need to be acquired.
- For more information see Documentation/vm/slub.rst.
+ For more information see Documentation/mm/slub.rst.
slub_min_order= [MM, SLUB]
Determines the minimum page order for slabs. Must be
lower than slub_max_order.
- For more information see Documentation/vm/slub.rst.
+ For more information see Documentation/mm/slub.rst.
+
+ slub_merge [MM, SLUB]
+ Same as slab_merge.
slub_nomerge [MM, SLUB]
Same as slab_nomerge. This is supported for legacy reasons.
@@ -4491,6 +5621,17 @@
smart2= [HW]
Format: <io1>[,<io2>[,...,<io8>]]
+ smp.csd_lock_timeout= [KNL]
+ Specify the period of time in milliseconds
+ that smp_call_function() and friends will wait
+ for a CPU to release the CSD lock. This is
+ useful when diagnosing bugs involving CPUs
+ disabling interrupts for extended periods
+ of time. Defaults to 5,000 milliseconds, and
+ setting a value of zero disables this feature.
+ This feature may be more efficiently disabled
+ using the csdlock_debug= kernel parameter.
+
smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices
smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port
smsc-ircc2.ircc_sir= [HW] SIR base I/O port
@@ -4502,7 +5643,7 @@
1: Fast pin select (default)
2: ATC IRMode
- smt [KNL,S390] Set the maximum number of threads (logical
+ smt= [KNL,S390] Set the maximum number of threads (logical
CPUs) to use per physical CPU on systems capable of
symmetric multithreading (SMT). Will be capped to the
actual hardware limit.
@@ -4511,18 +5652,18 @@
softlockup_panic=
[KNL] Should the soft-lockup detector generate panics.
- Format: <integer>
+ Format: 0 | 1
- A nonzero value instructs the soft-lockup detector
- to panic the machine when a soft-lockup occurs. This
- is also controlled by CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC
- which is the respective build-time switch to that
- functionality.
+ A value of 1 instructs the soft-lockup detector
+ to panic the machine when a soft-lockup occurs. It is
+ also controlled by the kernel.softlockup_panic sysctl
+ and CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, which is the
+ respective build-time switch to that functionality.
softlockup_all_cpu_backtrace=
[KNL] Should the soft-lockup detector generate
backtraces on all cpus.
- Format: <integer>
+ Format: 0 | 1
sonypi.*= [HW] Sony Programmable I/O Control Device driver
See Documentation/admin-guide/laptops/sonypi.rst
@@ -4554,8 +5695,13 @@
Specific mitigations can also be selected manually:
retpoline - replace indirect branches
- retpoline,generic - google's original retpoline
- retpoline,amd - AMD-specific minimal thunk
+ retpoline,generic - Retpolines
+ retpoline,lfence - LFENCE; indirect branch
+ retpoline,amd - alias for retpoline,lfence
+ eibrs - enhanced IBRS
+ eibrs,retpoline - enhanced IBRS + Retpolines
+ eibrs,lfence - enhanced IBRS + LFENCE
+ ibrs - use IBRS to protect kernel
Not specifying this option is equivalent to
spectre_v2=auto.
@@ -4596,8 +5742,7 @@
auto - Kernel selects the mitigation depending on
the available CPU features and vulnerability.
- Default mitigation:
- If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
+ Default mitigation: "prctl"
Not specifying this option is equivalent to
spectre_v2_user=auto.
@@ -4641,7 +5786,7 @@
will disable SSB unless they explicitly opt out.
Default mitigations:
- X86: If CONFIG_SECCOMP=y "seccomp", otherwise "prctl"
+ X86: "prctl"
On powerpc the options are:
@@ -4659,6 +5804,90 @@
spia_pedr=
spia_peddr=
+ split_lock_detect=
+ [X86] Enable split lock detection or bus lock detection
+
+ When enabled (and if hardware support is present), atomic
+ instructions that access data across cache line
+ boundaries will result in an alignment check exception
+ for split lock detection or a debug exception for
+ bus lock detection.
+
+ off - not enabled
+
+ warn - the kernel will emit rate-limited warnings
+ about applications triggering the #AC
+ exception or the #DB exception. This mode is
+ the default on CPUs that support split lock
+ detection or bus lock detection. Default
+ behavior is by #AC if both features are
+ enabled in hardware.
+
+ fatal - the kernel will send SIGBUS to applications
+ that trigger the #AC exception or the #DB
+ exception. Default behavior is by #AC if
+ both features are enabled in hardware.
+
+ ratelimit:N -
+ Set system wide rate limit to N bus locks
+ per second for bus lock detection.
+ 0 < N <= 1000.
+
+ N/A for split lock detection.
+
+
+ If an #AC exception is hit in the kernel or in
+ firmware (i.e. not while executing in user mode)
+ the kernel will oops in either "warn" or "fatal"
+ mode.
+
+ #DB exception for bus lock is triggered only when
+ CPL > 0.
+
+ srbds= [X86,INTEL]
+ Control the Special Register Buffer Data Sampling
+ (SRBDS) mitigation.
+
+ Certain CPUs are vulnerable to an MDS-like
+ exploit which can leak bits from the random
+ number generator.
+
+ By default, this issue is mitigated by
+ microcode. However, the microcode fix can cause
+ the RDRAND and RDSEED instructions to become
+ much slower. Among other effects, this will
+ result in reduced throughput from /dev/urandom.
+
+ The microcode mitigation can be disabled with
+ the following option:
+
+ off: Disable mitigation and remove
+ performance impact to RDRAND and RDSEED
+
+ srcutree.big_cpu_lim [KNL]
+ Specifies the number of CPUs constituting a
+ large system, such that srcu_struct structures
+ should immediately allocate an srcu_node array.
+ This kernel-boot parameter defaults to 128,
+ but takes effect only when the low-order four
+ bits of srcutree.convert_to_big are equal to 3
+ (decide at boot).
+
+ srcutree.convert_to_big [KNL]
+ Specifies under what conditions an SRCU tree
+ srcu_struct structure will be converted to big
+ form, that is, with an srcu_node tree:
+
+ 0: Never.
+ 1: At init_srcu_struct() time.
+ 2: When rcutorture decides to.
+ 3: Decide at boot time (default).
+ 0x1X: Above plus if high contention.
+
+ Either way, the srcu_node tree will be sized based
+ on the actual runtime number of CPUs (nr_cpu_ids)
+ instead of the compile-time CONFIG_NR_CPUS.
+
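A small decoder for srcutree.convert_to_big values, following the list above: the low-order four bits select the mode and the 0x10 bit adds contention-based conversion. The function and dictionary names are hypothetical; only the bit layout comes from the text.

```python
MODES = {
    0: "never",
    1: "at init_srcu_struct() time",
    2: "when rcutorture decides to",
    3: "decide at boot time",
}


def decode_convert_to_big(value: int) -> tuple:
    """Split a convert_to_big value into (mode description, contention flag)."""
    mode = value & 0xF                    # low-order four bits
    on_contention = bool(value & 0x10)    # "0x1X: above plus if high contention"
    return MODES.get(mode, "unknown"), on_contention


if __name__ == "__main__":
    for value in (0x0, 0x3, 0x13):
        print(hex(value), decode_convert_to_big(value))
```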
srcutree.counter_wrap_check [KNL]
Specifies how frequently to check for
grace-period sequence counter wrap for the
@@ -4676,6 +5905,32 @@
expediting. Set to zero to disable automatic
expediting.
+ srcutree.srcu_max_nodelay [KNL]
+ Specifies the number of no-delay instances
+ per jiffy for which the SRCU grace period
+ worker thread will be rescheduled with zero
+ delay. Beyond this limit, the worker thread will
+ be rescheduled with a sleep delay of one jiffy.
+
+ srcutree.srcu_max_nodelay_phase [KNL]
+ Specifies the per-grace-period-phase number of
+ non-sleeping polls of readers. Beyond this limit,
+ the grace-period worker thread will be rescheduled
+ with a sleep delay of one jiffy between each
+ rescan of the readers, for a grace-period phase.
+
+ srcutree.srcu_retry_check_delay [KNL]
+ Specifies number of microseconds of non-sleeping
+ delay between each non-sleeping poll of readers.
+
+ srcutree.small_contention_lim [KNL]
+ Specifies the number of update-side contention
+ events per jiffy that will be tolerated before
+ initiating a conversion of an srcu_struct
+ structure to big form. Note that the value of
+ srcutree.convert_to_big must have the 0x10 bit
+ set for contention-based conversions to occur.
+
ssbd= [ARM64,HW]
Speculative Store Bypass Disable control
@@ -4700,12 +5955,18 @@
growing up) the main stack are reserved for no other
mapping. Default value is 256 pages.
+ stack_depot_disable= [KNL]
+ Setting this to true through kernel command line will
+ disable the stack depot thereby saving the static memory
+ consumed by the stack hash table. By default this is set
+ to false.
+
stacktrace [FTRACE]
Enable the stack tracer on boot up.
stacktrace_filter=[function-list]
[FTRACE] Limit the functions that the stack tracer
- will trace at boot up. function-list is a comma separated
+ will trace at boot up. function-list is a comma-separated
list of functions. This list can be changed at run
time by the stack_trace_filter file in the debugfs
tracing directory. Note, this enables stack tracing
@@ -4724,6 +5985,15 @@
stifb= [HW]
Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]]
+ strict_sas_size=
+ [X86]
+ Format: <bool>
+ Enable or disable strict sigaltstack size checks
+ against the required signal frame size which
+ depends on the supported FPU features. This can
+ be used to filter out binaries which have
+ not yet been made aware of AT_MINSIGSTKSZ.
+
sunrpc.min_resvport=
sunrpc.max_resvport=
[NFS,SUNRPC]
@@ -4779,20 +6049,27 @@
This parameter controls use of the Protected
Execution Facility on pSeries.
- swapaccount=[0|1]
- [KNL] Enable accounting of swap in memory resource
- controller if no parameter or 1 is given or disable
- it if 0 is given (See Documentation/admin-guide/cgroup-v1/memory.rst)
-
swiotlb= [ARM,IA-64,PPC,MIPS,X86]
- Format: { <int> | force | noforce }
+ Format: { <int> [,<int>] | force | noforce }
<int> -- Number of I/O TLB slabs
+ <int> -- Second integer after comma. Number of swiotlb
+ areas with their own lock. Will be rounded up
+ to a power of 2.
force -- force using of bounce buffers even if they
wouldn't be automatically used by the kernel
noforce -- Never use bounce buffers (for debugging)
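The "rounded up to a power of 2" rule for the second swiotlb integer can be illustrated with a small shell sketch. The helper name `round_up_pow2` is hypothetical and only mirrors the arithmetic described in the text; it is not taken from the kernel sources.

```shell
# Hypothetical helper: round a positive integer up to the next power of 2,
# as the text describes for the number of swiotlb areas.
round_up_pow2() {
    n=$1
    p=1
    # Double p until it is at least n.
    while [ "$p" -lt "$n" ]; do
        p=$((p * 2))
    done
    echo "$p"
}

round_up_pow2 3   # prints 4
round_up_pow2 4   # prints 4
```

Under that reading, a command line such as `swiotlb=65536,3` (values chosen only for illustration) would presumably end up with 4 areas.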
switches= [HW,M68k]
+ sysctl.*= [KNL]
+ Set a sysctl parameter, right before loading the init
+ process, as if the value was written to the respective
+ /proc/sys/... file. Both '.' and '/' are recognized as
+ separators. Unrecognized parameters and invalid values
+ are reported in the kernel log. Sysctls registered
+ later by a loaded module cannot be set this way.
+ Example: sysctl.vm.swappiness=40
+
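As a rough illustration of the sysctl.* entry above, the following shell sketch maps a boot token to the /proc/sys file it corresponds to. The function name `sysctl_param_to_path` is hypothetical, and the sketch assumes that no name component itself contains a '.' character.

```shell
# Hypothetical helper: print the /proc/sys path that corresponds to a
# "sysctl.<name>=<value>" kernel command-line token.
# Assumption: no name component contains a literal '.'.
sysctl_param_to_path() {
    token=$1
    name=${token#sysctl.}   # strip the "sysctl." prefix
    name=${name%%=*}        # strip "=<value>"
    # Both '.' and '/' are separators, so normalize '.' to '/'.
    printf '/proc/sys/%s\n' "$(printf '%s' "$name" | tr '.' '/')"
}

sysctl_param_to_path "sysctl.vm.swappiness=40"   # prints /proc/sys/vm/swappiness
```

After boot, reading the printed file is one way to check whether the value was applied. Rejected names and values are reported in the kernel log, as noted above.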
sysfs.deprecated=0|1 [KNL]
Enable/disable old style sysfs layout for old udev
on older distributions. When this option is enabled
@@ -4812,12 +6089,13 @@
Set the number of tcp_metrics_hash slots.
Default value is 8192 or 16384 depending on total
ram pages. This is used to specify the TCP metrics
- cache size. See Documentation/networking/ip-sysctl.txt
+ cache size. See Documentation/networking/ip-sysctl.rst
"tcp_no_metrics_save" section for more details.
tdfx= [HW,DRM]
- test_suspend= [SUSPEND][,N]
+ test_suspend= [SUSPEND]
+ Format: { "mem" | "standby" | "freeze" }[,N]
Specify "mem" (for Suspend-to-RAM) or "standby" (for
standby suspend) or "freeze" (for suspend type freeze)
as the system sleep state during system startup with
@@ -4871,6 +6149,25 @@
topology updates sent by the hypervisor to this
LPAR.
+ torture.disable_onoff_at_boot= [KNL]
+ Prevent the CPU-hotplug component of torturing
+ until after init has spawned.
+
+ torture.ftrace_dump_at_shutdown= [KNL]
+ Dump the ftrace buffer at torture-test shutdown,
+ even if there were no errors. This can be a
+ very costly operation when many torture tests
+ are running concurrently, especially on systems
+ with rotating-rust storage.
+
+ torture.verbose_sleep_frequency= [KNL]
+ Specifies how many verbose printk()s should be
+ emitted between each sleep. The default of zero
+ disables verbose-printk() sleeping.
+
+ torture.verbose_sleep_duration= [KNL]
+ Duration of each verbose-printk() sleep in jiffies.
+
tp720= [HW,PS2]
tpm_suspend_pcr=[HW,TPM]
@@ -4882,13 +6179,66 @@
This will guarantee that all the other pcrs
are saved.
+ tp_printk [FTRACE]
+ Have the tracepoints sent to printk as well as the
+ tracing ring buffer. This is useful for early boot up
+ where the system hangs or reboots and does not give the
+ option for reading the tracing buffer or performing a
+ ftrace_dump_on_oops.
+
+ To turn off having tracepoints sent to printk,
+ echo 0 > /proc/sys/kernel/tracepoint_printk
+ Note, echoing 1 into this file without the
+ tracepoint_printk kernel cmdline option has no effect.
+
+ The tp_printk_stop_on_boot (see below) can also be used
+ to stop the printing of events to console at
+ late_initcall_sync.
+
+ ** CAUTION **
+
+ Having tracepoints sent to printk() and activating high
+ frequency tracepoints such as irq or sched, can cause
+ the system to live lock.
+
+ tp_printk_stop_on_boot [FTRACE]
+ When tp_printk (above) is set, it can cause a lot of noise
+ on the console. It may be useful to only include the
+ printing of events during boot up, as user space may
+ make the system inoperable.
+
+ This command line option will stop the printing of events
+ to console at the late_initcall_sync() time frame.
+
trace_buf_size=nn[KMG]
[FTRACE] will set tracing buffer size on each cpu.
+ trace_clock= [FTRACE] Set the clock used for tracing events
+ at boot up.
+ local - Use the per CPU time stamp counter
+ (converted into nanoseconds). Fast, but
+ depending on the architecture, may not be
+ in sync between CPUs.
+ global - Event time stamps are synchronized across
+ CPUs. May be slower than the local clock,
+ but better for some race conditions.
+ counter - Simple counting of events (1, 2, ..)
+ note, some counts may be skipped due to the
+ infrastructure grabbing the clock more than
+ once per event.
+ uptime - Use jiffies as the time stamp.
+ perf - Use the same clock that perf uses.
+ mono - Use ktime_get_mono_fast_ns() for time stamps.
+ mono_raw - Use ktime_get_raw_fast_ns() for time
+ stamps.
+ boot - Use ktime_get_boot_fast_ns() for time stamps.
+ Architectures may add more clocks. See
+ Documentation/trace/ftrace.rst for more details.
+
trace_event=[event-list]
[FTRACE] Set and start specified trace events in order
to facilitate early boot debugging. The event-list is a
- comma separated list of trace events to enable. See
+ comma-separated list of trace events to enable. See
also Documentation/trace/events.rst
trace_options=[option-list]
@@ -4907,24 +6257,6 @@
See also Documentation/trace/ftrace.rst "trace options"
section.
- tp_printk[FTRACE]
- Have the tracepoints sent to printk as well as the
- tracing ring buffer. This is useful for early boot up
- where the system hangs or reboots and does not give the
- option for reading the tracing buffer or performing a
- ftrace_dump_on_oops.
-
- To turn off having tracepoints sent to printk,
- echo 0 > /proc/sys/kernel/tracepoint_printk
- Note, echoing 1 into this file without the
- tracepoint_printk kernel cmdline option has no effect.
-
- ** CAUTION **
-
- Having tracepoints sent to printk() and activating high
- frequency tracepoints such as irq or sched, can cause
- the system to live lock.
-
traceoff_on_warning
[FTRACE] enable this option to disable tracing when a
warning is hit. This turns off "tracing_on". Tracing can
@@ -4946,6 +6278,29 @@
See Documentation/admin-guide/mm/transhuge.rst
for more details.
+ trusted.source= [KEYS]
+ Format: <string>
+ This parameter identifies the trust source as a backend
+ for trusted keys implementation. Supported trust
+ sources:
+ - "tpm"
+ - "tee"
+ - "caam"
+ If not specified then it defaults to iterating through
+ the trust source list starting with TPM and assigns the
+ first trust source as a backend which is initialized
+ successfully during iteration.
+
+ trusted.rng= [KEYS]
+ Format: <string>
+ The RNG used to generate key material for trusted keys.
+ Can be one of:
+ - "kernel"
+ - the same value as trusted.source: "tpm" or "tee"
+ - "default"
+ If not specified, "default" is used. In this case,
+ the RNG's choice is left to each individual trust source.
+
tsc= Disable clocksource stability checks for TSC.
Format: <string>
[x86] reliable: mark tsc clocksource as reliable, this
@@ -4965,6 +6320,12 @@
interruptions from clocksource watchdog are not
acceptable).
+ tsc_early_khz= [X86] Skip early TSC calibration and use the given
+ value instead. Useful when the early TSC frequency discovery
+ procedure is not reliable, such as on overclocked systems
+ with CPUID.16h support and partial CPUID.15h support.
+ Format: <unsigned int>
+
tsx= [X86] Control Transactional Synchronization
Extensions (TSX) feature in Intel processors that
support TSX control.
@@ -5085,8 +6446,7 @@
usbcore.old_scheme_first=
[USB] Start with the old device initialization
- scheme, applies only to low and full-speed devices
- (default 0 = off).
+ scheme (default 0 = off).
usbcore.usbfs_memory_mb=
[USB] Memory limit (in MB) for buffers allocated by
@@ -5205,6 +6565,7 @@
device);
j = NO_REPORT_LUNS (don't use report luns
command, uas only);
+ k = NO_SAME (do not use WRITE_SAME, uas only)
l = NOT_LOCKABLE (don't try to lock and
unlock ejectable media, not on uas);
m = MAX_SECTORS_64 (don't transfer more
@@ -5247,7 +6608,7 @@
HIGHMEM regardless of setting
of CONFIG_HIGHPTE.
- vdso= [X86,SH]
+ vdso= [X86,SH,SPARC]
On X86_32, this is an alias for vdso32=. Otherwise:
vdso=1: enable VDSO (the default)
@@ -5273,11 +6634,12 @@
video= [FB] Frame buffer configuration
See Documentation/fb/modedb.rst.
- video.brightness_switch_enabled= [0,1]
+ video.brightness_switch_enabled= [ACPI]
+ Format: [0|1]
If set to 1, on receiving an ACPI notify event
generated by hotkey, video driver will adjust brightness
level and then send out the event to user space through
- the allocated input device; If set to 0, video driver
+ the allocated input device. If set to 0, video driver
will only send out the event without touching backlight
brightness level.
default: 1
@@ -5466,12 +6828,6 @@
default x2apic cluster mode on platforms
supporting x2apic.
- x86_intel_mid_timer= [X86-32,APBT]
- Choose timer option for x86 Intel MID platform.
- Two valid options are apbt timer only and lapic timer
- plus one apbt timer for broadcast timer.
- x86_intel_mid_timer=apbt_only | lapic_and_apbt
-
xen_512gb_limit [KNL,X86-64,XEN]
Restricts the kernel running paravirtualized under Xen
to use only up to 512 GB of RAM. The reason to do so is
@@ -5495,9 +6851,16 @@
Crash from Xen panic notifier, without executing late
panic() code such as dumping handler.
+ xen_msr_safe= [X86,XEN]
+ Format: <bool>
+ Select whether to always use non-faulting (safe) MSR
+ access functions when running as Xen PV guest. The
+ default value is controlled by CONFIG_XEN_PV_MSR_SAFE.
+
xen_nopvspin [X86,XEN]
- Disables the ticketlock slowpath using Xen PV
- optimizations.
+ Disables the qspinlock slowpath using Xen PV optimizations.
+ This parameter is obsoleted by "nopvspin" parameter, which
+ has equivalent effect for XEN platform.
xen_nopv [X86]
Disables the PV optimizations forcing the HVM guest to
@@ -5505,6 +6868,10 @@
This option is obsoleted by the "nopv" option, which
has equivalent effect for XEN platform.
+ xen_no_vector_callback
+ [KNL,X86,XEN] Disable the vector callback for Xen
+ event channel interrupts.
+
xen_scrub_pages= [XEN]
Boolean option to control scrubbing pages before giving them back
to Xen, for use by other domains. Can be also changed at runtime
@@ -5518,11 +6885,38 @@
improve timer resolution at the expense of processing
more timer interrupts.
+ xen.balloon_boot_timeout= [XEN]
+ The time (in seconds) to wait before giving up on booting
+ in case initial ballooning fails to free enough memory.
+ Applies only when running as HVM or PVH guest and
+ started with less memory configured than allowed at
+ max. Default is 180.
+
+ xen.event_eoi_delay= [XEN]
+ How long to delay EOI handling in case of event
+ storms (jiffies). Default is 10.
+
+ xen.event_loop_timeout= [XEN]
+ After which time (jiffies) the event handling loop
+ should start to delay EOI handling. Default is 2.
+
+ xen.fifo_events= [XEN]
+ Boolean parameter to disable using fifo event handling
+ even if available. Normally fifo event handling is
+ preferred over the 2-level event handling, as it is
+ fairer and the number of possible event channels is
+ much higher. Default is on (use fifo events).
+
nopv= [X86,XEN,KVM,HYPER_V,VMWARE]
Disables the PV optimizations forcing the guest to run
as generic guest with no PV drivers. Currently support
XEN HVM, KVM, HYPER_V and VMWARE guest.
+ nopvspin [X86,XEN,KVM]
+ Disables the qspinlock slow path using PV optimizations
+ which allow the hypervisor to 'idle' the guest on lock
+ contention.
+
xirc2ps_cs= [NET,PCMCIA]
Format:
<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
@@ -5536,6 +6930,12 @@
controller on both pseries and powernv
platforms. Only useful on POWER9 and above.
+ xive.store-eoi=off [PPC]
+ By default on POWER10 and above, the kernel will use
+ stores for EOI handling when the XIVE interrupt mode
+ is active. This option allows the XIVE driver to use
+ loads instead, as on POWER9.
+
xhci-hcd.quirks [USB,KNL]
A hex value specifying bitmask with supplemental xhci
host controller quirks. Meaning of each bit can be
diff --git a/Documentation/admin-guide/kernel-per-CPU-kthreads.rst b/Documentation/admin-guide/kernel-per-CPU-kthreads.rst
index baeeba8762ae..e4a5fc26f1a9 100644
--- a/Documentation/admin-guide/kernel-per-CPU-kthreads.rst
+++ b/Documentation/admin-guide/kernel-per-CPU-kthreads.rst
@@ -10,7 +10,7 @@ them to a "housekeeping" CPU dedicated to such work.
References
==========
-- Documentation/IRQ-affinity.txt: Binding interrupts to sets of CPUs.
+- Documentation/core-api/irq/irq-affinity.rst: Binding interrupts to sets of CPUs.
- Documentation/admin-guide/cgroup-v1: Using cgroups to bind tasks to sets of CPUs.
@@ -208,7 +208,7 @@ Do at least one of the following:
2. Enable RCU to do its processing remotely via dyntick-idle by
doing all of the following:
- a. Build with CONFIG_NO_HZ=y and CONFIG_RCU_FAST_NO_HZ=y.
+ a. Build with CONFIG_NO_HZ=y.
b. Ensure that the CPU goes idle frequently, allowing other
CPUs to detect that it has passed through an RCU quiescent
state. If the kernel is built with CONFIG_NO_HZ_FULL=y,
@@ -234,7 +234,7 @@ To reduce its OS jitter, do any of the following:
Such a workqueue can be confined to a given subset of the
CPUs using the ``/sys/devices/virtual/workqueue/*/cpumask`` sysfs
files. The set of WQ_SYSFS workqueues can be displayed using
- "ls sys/devices/virtual/workqueue". That said, the workqueues
+ "ls /sys/devices/virtual/workqueue". That said, the workqueues
maintainer would like to caution people against indiscriminately
sprinkling WQ_SYSFS across all the workqueues. The reason for
caution is that it is easy to add WQ_SYSFS, but because sysfs is
@@ -273,7 +273,7 @@ To reduce its OS jitter, do any of the following:
However, there is an RFC patch from Christoph Lameter
(based on an earlier one from Gilad Ben-Yossef) that
reduces or even eliminates vmstat overhead for some
- workloads at https://lkml.org/lkml/2013/9/4/379.
+ workloads at https://lore.kernel.org/r/00000140e9dfd6bd-40db3d4f-c1be-434f-8132-7820f81bb586-000000@email.amazonses.com.
e. If running on high-end powerpc servers, build with
CONFIG_PPC_RTAS_DAEMON=n. This prevents the RTAS
daemon from running on each CPU every second or so.
@@ -332,23 +332,3 @@ To reduce its OS jitter, do at least one of the following:
kthreads from being created in the first place. However, please
note that this will not eliminate OS jitter, but will instead
shift it to RCU_SOFTIRQ.
-
-Name:
- watchdog/%u
-
-Purpose:
- Detect software lockups on each CPU.
-
-To reduce its OS jitter, do at least one of the following:
-
-1. Build with CONFIG_LOCKUP_DETECTOR=n, which will prevent these
- kthreads from being created in the first place.
-2. Boot with "nosoftlockup=0", which will also prevent these kthreads
- from being created. Other related watchdog and softlockup boot
- parameters may be found in Documentation/admin-guide/kernel-parameters.rst
- and Documentation/watchdog/watchdog-parameters.rst.
-3. Echo a zero to /proc/sys/kernel/watchdog to disable the
- watchdog timer.
-4. Echo a large number of /proc/sys/kernel/watchdog_thresh in
- order to reduce the frequency of OS jitter due to the watchdog
- timer down to a level that is acceptable for your workload.
diff --git a/Documentation/admin-guide/laptops/disk-shock-protection.rst b/Documentation/admin-guide/laptops/disk-shock-protection.rst
index e97c5f78d8c3..22c7ec3e84cf 100644
--- a/Documentation/admin-guide/laptops/disk-shock-protection.rst
+++ b/Documentation/admin-guide/laptops/disk-shock-protection.rst
@@ -135,7 +135,7 @@ single project which, although still considered experimental, is fit
for use. Please feel free to add projects that have been the victims
of my ignorance.
-- http://www.thinkwiki.org/wiki/HDAPS
+- https://www.thinkwiki.org/wiki/HDAPS
See this page for information about Linux support of the hard disk
active protection system as implemented in IBM/Lenovo Thinkpads.
diff --git a/Documentation/admin-guide/laptops/laptop-mode.rst b/Documentation/admin-guide/laptops/laptop-mode.rst
index c984c4262f2e..b61cc601d298 100644
--- a/Documentation/admin-guide/laptops/laptop-mode.rst
+++ b/Documentation/admin-guide/laptops/laptop-mode.rst
@@ -101,17 +101,6 @@ this results in concentration of disk activity in a small time interval which
occurs only once every 10 minutes, or whenever the disk is forced to spin up by
a cache miss. The disk can then be spun down in the periods of inactivity.
-If you want to find out which process caused the disk to spin up, you can
-gather information by setting the flag /proc/sys/vm/block_dump. When this flag
-is set, Linux reports all disk read and write operations that take place, and
-all block dirtyings done to files. This makes it possible to debug why a disk
-needs to spin up, and to increase battery life even more. The output of
-block_dump is written to the kernel output, and it can be retrieved using
-"dmesg". When you use block_dump and your kernel logging level also includes
-kernel debugging messages, you probably want to turn off klogd, otherwise
-the output of block_dump will be logged, causing disk activity that is not
-normally there.
-
Configuration
-------------
diff --git a/Documentation/admin-guide/laptops/lg-laptop.rst b/Documentation/admin-guide/laptops/lg-laptop.rst
index ce9b14671cb9..67fd6932cef4 100644
--- a/Documentation/admin-guide/laptops/lg-laptop.rst
+++ b/Documentation/admin-guide/laptops/lg-laptop.rst
@@ -13,10 +13,8 @@ Hotkeys
The following FN keys are ignored by the kernel without this driver:
- FN-F1 (LG control panel) - Generates F15
-- FN-F5 (Touchpad toggle) - Generates F13
+- FN-F5 (Touchpad toggle) - Generates F21
- FN-F6 (Airplane mode) - Generates RFKILL
-- FN-F8 (Keyboard backlight) - Generates F16.
- This key also changes keyboard backlight mode.
- FN-F9 (Reader mode) - Generates F14
The rest of the FN keys work without a need for a special driver.
@@ -40,7 +38,7 @@ FN lock.
Battery care limit
------------------
-Writing 80/100 to /sys/devices/platform/lg-laptop/battery_care_limit
+Writing 80/100 to /sys/class/power_supply/CMB0/charge_control_end_threshold
sets the maximum capacity to charge the battery. Limiting the charge
reduces battery capacity loss over time.
diff --git a/Documentation/admin-guide/laptops/sonypi.rst b/Documentation/admin-guide/laptops/sonypi.rst
index c6eaaf48f7c1..190da1234314 100644
--- a/Documentation/admin-guide/laptops/sonypi.rst
+++ b/Documentation/admin-guide/laptops/sonypi.rst
@@ -151,7 +151,7 @@ Bugs:
different way to adjust the backlighting of the screen. There
is a userspace utility to adjust the brightness on those models,
which can be downloaded from
- http://www.acc.umu.se/~erikw/program/smartdimmer-0.1.tar.bz2
+ https://www.acc.umu.se/~erikw/program/smartdimmer-0.1.tar.bz2
- since all development was done by reverse engineering, there is
*absolutely no guarantee* that this driver will not crash your
diff --git a/Documentation/admin-guide/laptops/thinkpad-acpi.rst b/Documentation/admin-guide/laptops/thinkpad-acpi.rst
index 822907dcc845..475eb0e81e4a 100644
--- a/Documentation/admin-guide/laptops/thinkpad-acpi.rst
+++ b/Documentation/admin-guide/laptops/thinkpad-acpi.rst
@@ -50,6 +50,9 @@ detailed description):
- WAN enable and disable
- UWB enable and disable
- LCD Shadow (PrivacyGuard) enable and disable
+ - Lap mode sensor
+ - Setting keyboard language
+ - WWAN Antenna type
A compatibility table by model and feature is maintained on the web
site, http://ibm-acpi.sf.net/. I appreciate any success or failure
@@ -904,7 +907,7 @@ temperatures:
The mapping of thermal sensors to physical locations varies depending on
system-board model (and thus, on ThinkPad model).
-http://thinkwiki.org/wiki/Thermal_Sensors is a public wiki page that
+https://thinkwiki.org/wiki/Thermal_Sensors is a public wiki page that
tries to track down these locations for various models.
Most (newer?) models seem to follow this pattern:
@@ -925,7 +928,7 @@ For the R51 (source: Thomas Gruber):
- 3: Internal HDD
For the T43, T43/p (source: Shmidoax/Thinkwiki.org)
-http://thinkwiki.org/wiki/Thermal_Sensors#ThinkPad_T43.2C_T43p
+https://thinkwiki.org/wiki/Thermal_Sensors#ThinkPad_T43.2C_T43p
- 2: System board, left side (near PCMCIA slot), reported as HDAPS temp
- 3: PCMCIA slot
@@ -935,7 +938,7 @@ http://thinkwiki.org/wiki/Thermal_Sensors#ThinkPad_T43.2C_T43p
- 11: Power regulator, underside of system board, below F2 key
The A31 has a very atypical layout for the thermal sensors
-(source: Milos Popovic, http://thinkwiki.org/wiki/Thermal_Sensors#ThinkPad_A31)
+(source: Milos Popovic, https://thinkwiki.org/wiki/Thermal_Sensors#ThinkPad_A31)
- 1: CPU
- 2: Main Battery: main sensor
@@ -1432,6 +1435,20 @@ The first command ensures the best viewing angle and the latter one turns
on the feature, restricting the viewing angles.
+DYTC Lapmode sensor
+-------------------
+
+sysfs: dytc_lapmode
+
+Newer thinkpads and mobile workstations have the ability to determine if
+the device is in deskmode or lapmode. This feature is used by user space
+to decide if WWAN transmission can be increased to maximum power and is
+also useful for understanding the different thermal modes available as
+they differ between desk and lap mode.
+
+The property is read-only. If the platform lacks support, the sysfs
+class is not created.
+
EXPERIMENTAL: UWB
-----------------
@@ -1451,6 +1468,49 @@ Sysfs notes
rfkill controller switch "tpacpi_uwb_sw": refer to
Documentation/driver-api/rfkill.rst for details.
+
+Setting keyboard language
+-------------------------
+
+sysfs: keyboard_lang
+
+This feature sets the keyboard language in ECFW using the ASL interface.
+A few ThinkPad models, such as the T580, T590 and T15 Gen 1, have "=", "(" and
+")" numeric keys that do not display correctly when the keyboard language
+is other than "english". This is because the default keyboard language in ECFW
+is set to "english". Using this sysfs attribute, the user can set the correct
+keyboard language in ECFW, after which these keys work correctly.
+
+Example of command to set keyboard language is mentioned below::
+
+ echo jp > /sys/devices/platform/thinkpad_acpi/keyboard_lang
+
+The text values that can be written to sysfs for each keyboard layout are: be(Belgian),
+cz(Czech), da(Danish), de(German), en(English), es(Spain), et(Estonian),
+fr(French), fr-ch(French(Switzerland)), hu(Hungarian), it(Italy), jp(Japan),
+nl(Dutch), nn(Norway), pl(Polish), pt(Portuguese), sl(Slovenian), sv(Sweden),
+tr(Turkey)
+
+WWAN Antenna type
+-----------------
+
+sysfs: wwan_antenna_type
+
+On some newer Thinkpads we need to set SAR value based on the antenna
+type. This interface will be used by userspace to get the antenna type
+and set the corresponding SAR value, as is required for FCC certification.
+
+The available commands are::
+
+ cat /sys/devices/platform/thinkpad_acpi/wwan_antenna_type
+
+Currently 2 antenna types are supported as mentioned below:
+- type a
+- type b
+
+The property is read-only. If the platform lacks support, the sysfs
+class is not created.
+
Adaptive keyboard
-----------------
@@ -1460,15 +1520,32 @@ This sysfs attribute controls the keyboard "face" that will be shown on the
Lenovo X1 Carbon 2nd gen (2014)'s adaptive keyboard. The value can be read
and set.
-- 1 = Home mode
-- 2 = Web-browser mode
-- 3 = Web-conference mode
-- 4 = Function mode
-- 5 = Layflat mode
+- 0 = Home mode
+- 1 = Web-browser mode
+- 2 = Web-conference mode
+- 3 = Function mode
+- 4 = Layflat mode
For more details about which buttons will appear depending on the mode, please
review the laptop's user guide:
-http://www.lenovo.com/shop/americas/content/user_guides/x1carbon_2_ug_en.pdf
+https://download.lenovo.com/ibmdl/pub/pc/pccbbs/mobiles_pdf/x1carbon_2_ug_en.pdf
+
+Battery charge control
+----------------------
+
+sysfs attributes:
+/sys/class/power_supply/BAT*/charge_control_{start,end}_threshold
+
+These two attributes are created for those batteries that are supported by the
+driver. They enable the user to control the battery charge thresholds of the
+given battery. Both values may be read and set. `charge_control_start_threshold`
+accepts an integer between 0 and 99 (inclusive); this value represents a battery
+percentage level, below which charging will begin. `charge_control_end_threshold`
+accepts an integer between 1 and 100 (inclusive); this value represents a battery
+percentage level, above which charging will stop.
+
+The exact semantics of the attributes may be found in
+Documentation/ABI/testing/sysfs-class-power.
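The ranges described above can be checked before writing to the attributes. The following is a minimal sketch, not part of the driver: the function name `set_charge_thresholds` is hypothetical, and the requirement that the start value be below the end value is an assumption rather than something stated in the text.

```shell
# Hypothetical helper: validate and write both charge thresholds.
# Usage: set_charge_thresholds DIR START END
# DIR is the battery's power_supply directory.
set_charge_thresholds() {
    dir=$1
    start=$2
    end=$3
    # Ranges from the text: start is 0..99, end is 1..100 (inclusive).
    [ "$start" -ge 0 ] && [ "$start" -le 99 ] || return 1
    [ "$end" -ge 1 ] && [ "$end" -le 100 ] || return 1
    # Assumption: charging must begin below the level where it stops.
    [ "$start" -lt "$end" ] || return 1
    echo "$start" > "$dir/charge_control_start_threshold" &&
        echo "$end" > "$dir/charge_control_end_threshold"
}
```

A call might look like `set_charge_thresholds /sys/class/power_supply/BAT0 40 80`, run as root, where `BAT0` is only an example battery name. A driver may reject a start value that is above the currently configured end value, so the write order may need adjusting on real hardware; the sketch does not handle that.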
Multiple Commands, Module Parameters
------------------------------------
diff --git a/Documentation/admin-guide/lockup-watchdogs.rst b/Documentation/admin-guide/lockup-watchdogs.rst
index 290840c160af..3e09284a8b9b 100644
--- a/Documentation/admin-guide/lockup-watchdogs.rst
+++ b/Documentation/admin-guide/lockup-watchdogs.rst
@@ -39,7 +39,7 @@ in principle, they should work in any architecture where these
subsystems are present.
A periodic hrtimer runs to generate interrupts and kick the watchdog
-task. An NMI perf event is generated every "watchdog_thresh"
+job. An NMI perf event is generated every "watchdog_thresh"
(compile-time initialized to 10 and configurable through sysctl of the
same name) seconds to check for hardlockups. If any CPU in the system
does not receive any hrtimer interrupt during that time the
@@ -47,7 +47,7 @@ does not receive any hrtimer interrupt during that time the
generate a kernel warning or call panic, depending on the
configuration.
-The watchdog task is a high priority kernel thread that updates a
+The watchdog job runs in a stop scheduling thread that updates a
timestamp every time it is scheduled. If that timestamp is not updated
for 2*watchdog_thresh seconds (the softlockup threshold) the
'softlockup detector' (coded inside the hrtimer callback function)
diff --git a/Documentation/admin-guide/md.rst b/Documentation/admin-guide/md.rst
index 3c51084ffd37..d8fc9a59c086 100644
--- a/Documentation/admin-guide/md.rst
+++ b/Documentation/admin-guide/md.rst
@@ -5,7 +5,7 @@ Boot time assembly of RAID arrays
---------------------------------
Tools that manage md devices can be found at
- http://www.kernel.org/pub/linux/utils/raid/
+ https://www.kernel.org/pub/linux/utils/raid/
You can boot with your md device with the following kernel command
@@ -221,7 +221,7 @@ All md devices contain:
layout
The ``layout`` for the array for the particular level. This is
- simply a number that is interpretted differently by different
+ simply a number that is interpreted differently by different
levels. It can be written while assembling an array.
array_size
@@ -426,6 +426,10 @@ All md devices contain:
The accepted values when writing to this file are ``ppl`` and ``resync``,
used to enable and disable PPL.
+ uuid
+ This indicates the UUID of the array in the following format:
+ xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+
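A script that consumes the uuid attribute may want to check the value against the format shown above. This is a small sketch under the assumption that each "x" stands for one hexadecimal digit; the function name `is_md_uuid` is hypothetical.

```shell
# Hypothetical helper: succeed if the argument has the
# xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx shape.
# Assumption: each "x" is a hexadecimal digit.
is_md_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}
```

For example, `is_md_uuid "$(cat /sys/block/md0/md/uuid)"` could guard further processing, where `md0` is only an example device name.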
As component devices are added to an md array, they appear in the ``md``
directory as new directories named::
diff --git a/Documentation/admin-guide/media/au0828-cardlist.rst b/Documentation/admin-guide/media/au0828-cardlist.rst
new file mode 100644
index 000000000000..aaaadc934e7a
--- /dev/null
+++ b/Documentation/admin-guide/media/au0828-cardlist.rst
@@ -0,0 +1,39 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+AU0828 cards list
+=================
+
+.. tabularcolumns:: |p{1.4cm}|p{6.5cm}|p{10.0cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - USB IDs
+
+ * - 0
+ - Unknown board
+ -
+
+ * - 1
+ - Hauppauge HVR950Q
+ - 2040:7200, 2040:7210, 2040:7217, 2040:721b, 2040:721e, 2040:721f, 2040:7280, 0fd9:0008, 2040:7260, 2040:7213, 2040:7270
+
+ * - 2
+ - Hauppauge HVR850
+ - 2040:7240
+
+ * - 3
+ - DViCO FusionHDTV USB
+ - 0fe9:d620
+
+ * - 4
+ - Hauppauge HVR950Q rev xxF8
+ - 2040:7201, 2040:7211, 2040:7281
+
+ * - 5
+ - Hauppauge Woodbury
+ - 05e1:0480, 2040:8200
diff --git a/Documentation/admin-guide/media/avermedia.rst b/Documentation/admin-guide/media/avermedia.rst
new file mode 100644
index 000000000000..93ff74002d20
--- /dev/null
+++ b/Documentation/admin-guide/media/avermedia.rst
@@ -0,0 +1,94 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================
+Avermedia DVB-T on BT878 Release Notes
+======================================
+
+February 14th 2006
+
+.. note::
+
+ Several other Avermedia devices are supported. For broader and
+ more up-to-date information about them, please check:
+
+ https://linuxtv.org/wiki/index.php/AVerMedia
+
+The Avermedia DVB-T
+~~~~~~~~~~~~~~~~~~~
+
+The Avermedia DVB-T is a budget PCI DVB card. It has 3 inputs:
+
+* RF Tuner Input
+* Composite Video Input (RCA Jack)
+* SVIDEO Input (Mini-DIN)
+
+The RF Tuner Input is the input to the tuner module of the
+card. The Tuner is otherwise known as the "Frontend". The
+Frontend of the Avermedia DVB-T is a Microtune 7202D. A timely
+post to the linux-dvb mailing list ascertained that the
+Microtune 7202D is supported by the sp887x driver which is
+found in the dvb-hw CVS module.
+
+The DVB-T card is based around the BT878 chip which is a very
+common multimedia bridge and often found on Analogue TV cards.
+There is no on-board MPEG2 decoder, which means that all MPEG2
+decoding must be done in software, or if you have one, on an
+MPEG2 hardware decoding card or chipset.
+
+
+Getting the card going
+~~~~~~~~~~~~~~~~~~~~~~
+
+At this stage, it has not been possible to ascertain the
+functionality of the remaining device nodes in respect of the
+Avermedia DVBT. However, full functionality in respect of
+tuning, receiving and supplying the MPEG2 data stream is
+possible with the currently available versions of the driver.
+It may be possible that additional functionality is available
+from the card (i.e. viewing the additional analogue inputs
+that the card presents), but this has not been tested yet. If
+I get around to this, I'll update the document with whatever I
+find.
+
+To power up the card, load the following modules in the
+following order:
+
+* modprobe bttv (normally loaded automatically)
+* modprobe dvb-bt8xx (or place dvb-bt8xx in /etc/modules)
+
+Insertion of these modules into the running kernel will
+activate the appropriate DVB device nodes. It is then possible
+to start accessing the card with utilities such as scan, tzap,
+dvbstream etc.
+
+The frontend module sp887x.o requires external firmware.
+Please use the command "get_dvb_firmware sp887x" to download
+it. Then copy it to /usr/lib/hotplug/firmware or /lib/firmware/
+(depending on configuration of firmware hotplug).
+
+Known Limitations
+~~~~~~~~~~~~~~~~~
+
+At present I can say with confidence that the frontend tunes
+via /dev/dvb/adapter{x}/frontend0 and supplies an MPEG2 stream
+via /dev/dvb/adapter{x}/dvr0. I have not tested the
+functionality of any other part of the card yet. I will do so
+over time and update this document.
+
+There are some limitations in the i2c layer due to a returned
+error message inconsistency. Although this generates errors in
+dmesg and the system logs, it does not appear to affect the
+ability of the frontend to function correctly.
+
+Further Update
+~~~~~~~~~~~~~~
+
+dvbstream and VideoLAN Client on Windows work a treat with
+DVB; in fact, this is currently serving as my main way of
+viewing DVB-T at the moment. Additionally, VLC is happily
+decoding HDTV signals, although the PC is dropping the odd
+frame here and there - I assume due to processing capability -
+as all the decoding is being done under Windows in software.
+
+Many thanks to Nigel Pearson for the updates to this document
+since the recent revision of the driver.
diff --git a/Documentation/admin-guide/media/bt8xx.rst b/Documentation/admin-guide/media/bt8xx.rst
new file mode 100644
index 000000000000..3589f6ab7e46
--- /dev/null
+++ b/Documentation/admin-guide/media/bt8xx.rst
@@ -0,0 +1,157 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+How to get the bt8xx cards working
+==================================
+
+Authors:
+ Richard Walker,
+ Jamie Honan,
+ Michael Hunold,
+ Manu Abraham,
+ Uwe Bugla,
+ Michael Krufky
+
+General information
+-------------------
+
+This class of cards has a bt878a as the PCI interface, and requires the bttv
+driver for accessing the i2c bus and the gpio pins of the bt8xx chipset.
+
+Please see Documentation/admin-guide/media/bttv-cardlist.rst for a complete
+list of Cards based on the Conexant Bt8xx PCI bridge supported by the
+Linux Kernel.
+
+In order to be able to compile the kernel, some config options should be
+enabled::
+
+ ./scripts/config -e PCI
+ ./scripts/config -e INPUT
+ ./scripts/config -m I2C
+ ./scripts/config -m MEDIA_SUPPORT
+ ./scripts/config -e MEDIA_PCI_SUPPORT
+ ./scripts/config -e MEDIA_ANALOG_TV_SUPPORT
+ ./scripts/config -e MEDIA_DIGITAL_TV_SUPPORT
+ ./scripts/config -e MEDIA_RADIO_SUPPORT
+ ./scripts/config -e RC_CORE
+ ./scripts/config -m VIDEO_BT848
+ ./scripts/config -m DVB_BT8XX
+
+If you want to automatically support all possible variants of the Bt8xx
+cards, you should also do::
+
+ ./scripts/config -e MEDIA_SUBDRV_AUTOSELECT
+
+.. note::
+
+ Please use the following options with care as deselection of drivers which
+ are in fact necessary may result in DVB devices that cannot be tuned due
+ to lack of driver support.
+
+If your goal is to just support a specific board, you may, instead,
+disable MEDIA_SUBDRV_AUTOSELECT and manually select the frontend drivers
+required by your board. With that, you can save some RAM.
+
+You can do that by calling make xconfig/qconfig/menuconfig and looking at
+the options in these menus (only available if
+``Autoselect ancillary drivers`` is disabled):
+
+#) ``Device drivers`` => ``Multimedia support`` => ``Customize TV tuners``
+#) ``Device drivers`` => ``Multimedia support`` => ``Customize DVB frontends``
+
+Then, in each of the above menus, please select your card-specific
+frontend and tuner modules.
+
+
+Loading Modules
+---------------
+
+Regular case: If the bttv driver detects a bt8xx-based DVB card, all
+frontend and backend modules will be loaded automatically.
+
+Exceptions are:
+
+- Old TV cards without EEPROMs, sharing a common PCI subsystem ID;
+- Old TwinHan DST cards or clones with or without CA slot and not
+ containing an EEPROM.
+
+In these cases, overriding the PCI type detection for the bttv and
+dvb-bt8xx drivers by passing modprobe parameters may be necessary.
+
+Running TwinHan and Clones
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As shown in Documentation/admin-guide/media/bttv-cardlist.rst, TwinHan and
+clones use the ``card=113`` modprobe parameter. So, in order to properly
+detect such devices when they have no EEPROM, you should use::
+
+ $ modprobe bttv card=113
+ $ modprobe dst
+
+Useful parameters for verbosity level and debugging the dst module::
+
+ verbose=0:        messages are disabled
+         1:        only error messages are displayed
+         2:        notifications are displayed
+         3:        other useful messages are displayed
+         4:        debug setting
+ dst_addons=0:     card is a free to air (FTA) card only
+            0x20:  card has a conditional access slot for scrambled channels
+ dst_algo=0:       (default) Software tuning algorithm
+          1:       Hardware tuning algorithm
+
+
+The autodetected values are determined by the cards' "response string".
+
+In your logs, look for a line such as ``dst_get_device_id: Recognize [DSTMCI]``.
+
+For bug reports please send in a complete log with verbose=4 activated.
+Please also see Documentation/admin-guide/media/ci.rst.
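+
+As a sketch of how the parameters above fit together (assuming a TwinHan
+clone without EEPROM, as in the example earlier in this section), a debug
+session for a bug report could look like this. The log file name is only
+an illustration::
+
+ $ modprobe bttv card=113
+ $ modprobe dst verbose=4
+ $ dmesg | grep -i dst > dst-debug.log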
+
+Running multiple cards
+~~~~~~~~~~~~~~~~~~~~~~
+
+See Documentation/admin-guide/media/bttv-cardlist.rst for a complete list of
+card IDs. Some examples:
+
+ =========================== ===
+ Brand name                  ID
+ =========================== ===
+ Pinnacle PCTV Sat           94
+ Nebula Electronics Digi TV  104
+ pcHDTV HD-2000 TV           112
+ Twinhan DST and clones      113
+ Avermedia AverTV DVB-T 771  123
+ Avermedia AverTV DVB-T 761  124
+ DViCO FusionHDTV DVB-T Lite 128
+ DViCO FusionHDTV 5 Lite     135
+ =========================== ===
+
+.. note::
+
+ When you have multiple cards, the order of the card IDs should
+ match the order in which the cards are detected by the system. Please
+ note that removing/inserting other PCI cards may change the detection
+ order.
+
+Example::
+
+ $ modprobe bttv card=113,135
+
+In case of further problems please subscribe and send questions to
+the mailing list: linux-media@vger.kernel.org.
+
+Probing the cards with broken PCI subsystem ID
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are some TwinHan cards whose EEPROM has become corrupted for some
+reason. The cards do not have a correct PCI subsystem ID.
+Still, it is possible to force probing the cards with::
+
+ $ echo 109e 0878 $subvendor $subdevice > \
+ /sys/bus/pci/drivers/bt878/new_id
+
+The two numbers there are::
+
+ 109e: PCI_VENDOR_ID_BROOKTREE
+ 0878: PCI_DEVICE_ID_BROOKTREE_878
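+
+The ``$subvendor`` and ``$subdevice`` values are whatever the corrupted
+EEPROM currently reports. A hedged sketch of how they might be looked up,
+assuming the pciutils ``lspci`` tool is installed and that the bt878
+function sits at the hypothetical PCI address ``0000:00:0a.1``::
+
+ $ lspci -nn -v -d 109e:0878
+ $ cat /sys/bus/pci/devices/0000:00:0a.1/subsystem_vendor
+ $ cat /sys/bus/pci/devices/0000:00:0a.1/subsystem_device
+
+The sysfs files print the IDs with a ``0x`` prefix, whereas the example
+above passes plain hexadecimal digits.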
diff --git a/Documentation/admin-guide/media/bttv-cardlist.rst b/Documentation/admin-guide/media/bttv-cardlist.rst
new file mode 100644
index 000000000000..8671d4f7ba7b
--- /dev/null
+++ b/Documentation/admin-guide/media/bttv-cardlist.rst
@@ -0,0 +1,683 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+BTTV cards list
+===============
+
+.. tabularcolumns:: |p{1.4cm}|p{11.1cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - PCI subsystem IDs
+
+ * - 0
+ - *** UNKNOWN/GENERIC ***
+ -
+
+ * - 1
+ - MIRO PCTV
+ -
+
+ * - 2
+ - Hauppauge (bt848)
+ -
+
+ * - 3
+ - STB, Gateway P/N 6000699 (bt848)
+ -
+
+ * - 4
+ - Intel Create and Share PCI/ Smart Video Recorder III
+ -
+
+ * - 5
+ - Diamond DTV2000
+ -
+
+ * - 6
+ - AVerMedia TVPhone
+ -
+
+ * - 7
+ - MATRIX-Vision MV-Delta
+ -
+
+ * - 8
+ - Lifeview FlyVideo II (Bt848) LR26 / MAXI TV Video PCI2 LR26
+ -
+
+ * - 9
+ - IMS/IXmicro TurboTV
+ -
+
+ * - 10
+ - Hauppauge (bt878)
+ - 0070:13eb, 0070:3900, 2636:10b4
+
+ * - 11
+ - MIRO PCTV pro
+ -
+
+ * - 12
+ - ADS Technologies Channel Surfer TV (bt848)
+ -
+
+ * - 13
+ - AVerMedia TVCapture 98
+ - 1461:0002, 1461:0004, 1461:0300
+
+ * - 14
+ - Aimslab Video Highway Xtreme (VHX)
+ -
+
+ * - 15
+ - Zoltrix TV-Max
+ - a1a0:a0fc
+
+ * - 16
+ - Prolink Pixelview PlayTV (bt878)
+ -
+
+ * - 17
+ - Leadtek WinView 601
+ -
+
+ * - 18
+ - AVEC Intercapture
+ -
+
+ * - 19
+ - Lifeview FlyVideo II EZ /FlyKit LR38 Bt848 (capture only)
+ -
+
+ * - 20
+ - CEI Raffles Card
+ -
+
+ * - 21
+ - Lifeview FlyVideo 98/ Lucky Star Image World ConferenceTV LR50
+ -
+
+ * - 22
+ - Askey CPH050/ Phoebe Tv Master + FM
+ - 14ff:3002
+
+ * - 23
+ - Modular Technology MM201/MM202/MM205/MM210/MM215 PCTV, bt878
+ - 14c7:0101
+
+ * - 24
+ - Askey CPH05X/06X (bt878) [many vendors]
+ - 144f:3002, 144f:3005, 144f:5000, 14ff:3000
+
+ * - 25
+ - Terratec TerraTV+ Version 1.0 (Bt848)/ Terra TValue Version 1.0/ Vobis TV-Boostar
+ -
+
+ * - 26
+ - Hauppauge WinCam newer (bt878)
+ -
+
+ * - 27
+ - Lifeview FlyVideo 98/ MAXI TV Video PCI2 LR50
+ -
+
+ * - 28
+ - Terratec TerraTV+ Version 1.1 (bt878)
+ - 153b:1127, 1852:1852
+
+ * - 29
+ - Imagenation PXC200
+ - 1295:200a
+
+ * - 30
+ - Lifeview FlyVideo 98 LR50
+ - 1f7f:1850
+
+ * - 31
+ - Formac iProTV, Formac ProTV I (bt848)
+ -
+
+ * - 32
+ - Intel Create and Share PCI/ Smart Video Recorder III
+ -
+
+ * - 33
+ - Terratec TerraTValue Version Bt878
+ - 153b:1117, 153b:1118, 153b:1119, 153b:111a, 153b:1134, 153b:5018
+
+ * - 34
+ - Leadtek WinFast 2000/ WinFast 2000 XP
+ - 107d:6606, 107d:6609, 6606:217d, f6ff:fff6
+
+ * - 35
+ - Lifeview FlyVideo 98 LR50 / Chronos Video Shuttle II
+ - 1851:1850, 1851:a050
+
+ * - 36
+ - Lifeview FlyVideo 98FM LR50 / Typhoon TView TV/FM Tuner
+ - 1852:1852
+
+ * - 37
+ - Prolink PixelView PlayTV pro
+ -
+
+ * - 38
+ - Askey CPH06X TView99
+ - 144f:3000, 144f:a005, a04f:a0fc
+
+ * - 39
+ - Pinnacle PCTV Studio/Rave
+ - 11bd:0012, bd11:1200, bd11:ff00, 11bd:ff12
+
+ * - 40
+ - STB TV PCI FM, Gateway P/N 6000704 (bt878), 3Dfx VoodooTV 100
+ - 10b4:2636, 10b4:2645, 121a:3060
+
+ * - 41
+ - AVerMedia TVPhone 98
+ - 1461:0001, 1461:0003
+
+ * - 42
+ - ProVideo PV951
+ - aa0c:146c
+
+ * - 43
+ - Little OnAir TV
+ -
+
+ * - 44
+ - Sigma TVII-FM
+ -
+
+ * - 45
+ - MATRIX-Vision MV-Delta 2
+ -
+
+ * - 46
+ - Zoltrix Genie TV/FM
+ - 15b0:4000, 15b0:400a, 15b0:400d, 15b0:4010, 15b0:4016
+
+ * - 47
+ - Terratec TV/Radio+
+ - 153b:1123
+
+ * - 48
+ - Askey CPH03x/ Dynalink Magic TView
+ -
+
+ * - 49
+ - IODATA GV-BCTV3/PCI
+ - 10fc:4020
+
+ * - 50
+ - Prolink PV-BT878P+4E / PixelView PlayTV PAK / Lenco MXTV-9578 CP
+ -
+
+ * - 51
+ - Eagle Wireless Capricorn2 (bt878A)
+ -
+
+ * - 52
+ - Pinnacle PCTV Studio Pro
+ -
+
+ * - 53
+ - Typhoon TView RDS + FM Stereo / KNC1 TV Station RDS
+ -
+
+ * - 54
+ - Lifeview FlyVideo 2000 /FlyVideo A2/ Lifetec LT 9415 TV [LR90]
+ -
+
+ * - 55
+ - Askey CPH031/ BESTBUY Easy TV
+ -
+
+ * - 56
+ - Lifeview FlyVideo 98FM LR50
+ - a051:41a0
+
+ * - 57
+ - GrandTec 'Grand Video Capture' (Bt848)
+ - 4344:4142
+
+ * - 58
+ - Askey CPH060/ Phoebe TV Master Only (No FM)
+ -
+
+ * - 59
+ - Askey CPH03x TV Capturer
+ -
+
+ * - 60
+ - Modular Technology MM100PCTV
+ -
+
+ * - 61
+ - AG Electronics GMV1
+ - 15cb:0101
+
+ * - 62
+ - Askey CPH061/ BESTBUY Easy TV (bt878)
+ -
+
+ * - 63
+ - ATI TV-Wonder
+ - 1002:0001
+
+ * - 64
+ - ATI TV-Wonder VE
+ - 1002:0003
+
+ * - 65
+ - Lifeview FlyVideo 2000S LR90
+ -
+
+ * - 66
+ - Terratec TValueRadio
+ - 153b:1135, 153b:ff3b
+
+ * - 67
+ - IODATA GV-BCTV4/PCI
+ - 10fc:4050
+
+ * - 68
+ - 3Dfx VoodooTV FM (Euro)
+ - 10b4:2637
+
+ * - 69
+ - Active Imaging AIMMS
+ -
+
+ * - 70
+ - Prolink Pixelview PV-BT878P+ (Rev.4C,8E)
+ -
+
+ * - 71
+ - Lifeview FlyVideo 98EZ (capture only) LR51
+ - 1851:1851
+
+ * - 72
+ - Prolink Pixelview PV-BT878P+9B (PlayTV Pro rev.9B FM+NICAM)
+ - 1554:4011
+
+ * - 73
+ - Sensoray 311/611
+ - 6000:0311, 6000:0611
+
+ * - 74
+ - RemoteVision MX (RV605)
+ -
+
+ * - 75
+ - Powercolor MTV878/ MTV878R/ MTV878F
+ -
+
+ * - 76
+ - Canopus WinDVR PCI (COMPAQ Presario 3524JP, 5112JP)
+ - 0e11:0079
+
+ * - 77
+ - GrandTec Multi Capture Card (Bt878)
+ -
+
+ * - 78
+ - Jetway TV/Capture JW-TV878-FBK, Kworld KW-TV878RF
+ - 0a01:17de
+
+ * - 79
+ - DSP Design TCVIDEO
+ -
+
+ * - 80
+ - Hauppauge WinTV PVR
+ - 0070:4500
+
+ * - 81
+ - IODATA GV-BCTV5/PCI
+ - 10fc:4070, 10fc:d018
+
+ * - 82
+ - Osprey 100/150 (878)
+ - 0070:ff00
+
+ * - 83
+ - Osprey 100/150 (848)
+ -
+
+ * - 84
+ - Osprey 101 (848)
+ -
+
+ * - 85
+ - Osprey 101/151
+ -
+
+ * - 86
+ - Osprey 101/151 w/ svid
+ -
+
+ * - 87
+ - Osprey 200/201/250/251
+ -
+
+ * - 88
+ - Osprey 200/250
+ - 0070:ff01
+
+ * - 89
+ - Osprey 210/220/230
+ -
+
+ * - 90
+ - Osprey 500
+ - 0070:ff02
+
+ * - 91
+ - Osprey 540
+ - 0070:ff04
+
+ * - 92
+ - Osprey 2000
+ - 0070:ff03
+
+ * - 93
+ - IDS Eagle
+ -
+
+ * - 94
+ - Pinnacle PCTV Sat
+ - 11bd:001c
+
+ * - 95
+ - Formac ProTV II (bt878)
+ -
+
+ * - 96
+ - MachTV
+ -
+
+ * - 97
+ - Euresys Picolo
+ -
+
+ * - 98
+ - ProVideo PV150
+ - aa00:1460, aa01:1461, aa02:1462, aa03:1463, aa04:1464, aa05:1465, aa06:1466, aa07:1467
+
+ * - 99
+ - AD-TVK503
+ -
+
+ * - 100
+ - Hercules Smart TV Stereo
+ -
+
+ * - 101
+ - Pace TV & Radio Card
+ -
+
+ * - 102
+ - IVC-200
+ - 0000:a155, 0001:a155, 0002:a155, 0003:a155, 0100:a155, 0101:a155, 0102:a155, 0103:a155, 0800:a155, 0801:a155, 0802:a155, 0803:a155
+
+ * - 103
+ - Grand X-Guard / Trust 814PCI
+ - 0304:0102
+
+ * - 104
+ - Nebula Electronics DigiTV
+ - 0071:0101
+
+ * - 105
+ - ProVideo PV143
+ - aa00:1430, aa00:1431, aa00:1432, aa00:1433, aa03:1433
+
+ * - 106
+ - PHYTEC VD-009-X1 VD-011 MiniDIN (bt878)
+ -
+
+ * - 107
+ - PHYTEC VD-009-X1 VD-011 Combi (bt878)
+ -
+
+ * - 108
+ - PHYTEC VD-009 MiniDIN (bt878)
+ -
+
+ * - 109
+ - PHYTEC VD-009 Combi (bt878)
+ -
+
+ * - 110
+ - IVC-100
+ - ff00:a132
+
+ * - 111
+ - IVC-120G
+ - ff00:a182, ff01:a182, ff02:a182, ff03:a182, ff04:a182, ff05:a182, ff06:a182, ff07:a182, ff08:a182, ff09:a182, ff0a:a182, ff0b:a182, ff0c:a182, ff0d:a182, ff0e:a182, ff0f:a182
+
+ * - 112
+ - pcHDTV HD-2000 TV
+ - 7063:2000
+
+ * - 113
+ - Twinhan DST + clones
+ - 11bd:0026, 1822:0001, 270f:fc00, 1822:0026
+
+ * - 114
+ - Winfast VC100
+ - 107d:6607
+
+ * - 115
+ - Teppro TEV-560/InterVision IV-560
+ -
+
+ * - 116
+ - SIMUS GVC1100
+ - aa6a:82b2
+
+ * - 117
+ - NGS NGSTV+
+ -
+
+ * - 118
+ - LMLBT4
+ -
+
+ * - 119
+ - Tekram M205 PRO
+ -
+
+ * - 120
+ - Conceptronic CONTVFMi
+ -
+
+ * - 121
+ - Euresys Picolo Tetra
+ - 1805:0105, 1805:0106, 1805:0107, 1805:0108
+
+ * - 122
+ - Spirit TV Tuner
+ -
+
+ * - 123
+ - AVerMedia AVerTV DVB-T 771
+ - 1461:0771
+
+ * - 124
+ - AverMedia AverTV DVB-T 761
+ - 1461:0761
+
+ * - 125
+ - MATRIX Vision Sigma-SQ
+ -
+
+ * - 126
+ - MATRIX Vision Sigma-SLC
+ -
+
+ * - 127
+ - APAC Viewcomp 878(AMAX)
+ -
+
+ * - 128
+ - DViCO FusionHDTV DVB-T Lite
+ - 18ac:db10, 18ac:db11
+
+ * - 129
+ - V-Gear MyVCD
+ -
+
+ * - 130
+ - Super TV Tuner
+ -
+
+ * - 131
+ - Tibet Systems 'Progress DVR' CS16
+ -
+
+ * - 132
+ - Kodicom 4400R (master)
+ -
+
+ * - 133
+ - Kodicom 4400R (slave)
+ -
+
+ * - 134
+ - Adlink RTV24
+ -
+
+ * - 135
+ - DViCO FusionHDTV 5 Lite
+ - 18ac:d500
+
+ * - 136
+ - Acorp Y878F
+ - 9511:1540
+
+ * - 137
+ - Conceptronic CTVFMi v2
+ - 036e:109e
+
+ * - 138
+ - Prolink Pixelview PV-BT878P+ (Rev.2E)
+ -
+
+ * - 139
+ - Prolink PixelView PlayTV MPEG2 PV-M4900
+ -
+
+ * - 140
+ - Osprey 440
+ - 0070:ff07
+
+ * - 141
+ - Asound Skyeye PCTV
+ -
+
+ * - 142
+ - Sabrent TV-FM (bttv version)
+ -
+
+ * - 143
+ - Hauppauge ImpactVCB (bt878)
+ - 0070:13eb
+
+ * - 144
+ - MagicTV
+ -
+
+ * - 145
+ - SSAI Security Video Interface
+ - 4149:5353
+
+ * - 146
+ - SSAI Ultrasound Video Interface
+ - 414a:5353
+
+ * - 147
+ - VoodooTV 200 (USA)
+ - 121a:3000
+
+ * - 148
+ - DViCO FusionHDTV 2
+ - dbc0:d200
+
+ * - 149
+ - Typhoon TV-Tuner PCI (50684)
+ -
+
+ * - 150
+ - Geovision GV-600
+ - 008a:763c
+
+ * - 151
+ - Kozumi KTV-01C
+ -
+
+ * - 152
+ - Encore ENL TV-FM-2
+ - 1000:1801
+
+ * - 153
+ - PHYTEC VD-012 (bt878)
+ -
+
+ * - 154
+ - PHYTEC VD-012-X1 (bt878)
+ -
+
+ * - 155
+ - PHYTEC VD-012-X2 (bt878)
+ -
+
+ * - 156
+ - IVCE-8784
+ - 0000:f050, 0001:f050, 0002:f050, 0003:f050
+
+ * - 157
+ - Geovision GV-800(S) (master)
+ - 800a:763d
+
+ * - 158
+ - Geovision GV-800(S) (slave)
+ - 800b:763d, 800c:763d, 800d:763d
+
+ * - 159
+ - ProVideo PV183
+ - 1830:1540, 1831:1540, 1832:1540, 1833:1540, 1834:1540, 1835:1540, 1836:1540, 1837:1540
+
+ * - 160
+ - Tongwei Video Technology TD-3116
+ - f200:3116
+
+ * - 161
+ - Aposonic W-DVR
+ - 0279:0228
+
+ * - 162
+ - Adlink MPG24
+ -
+
+ * - 163
+ - Bt848 Capture 14MHz
+ -
+
+ * - 164
+ - CyberVision CV06 (SV)
+ -
+
+ * - 165
+ - Kworld V-Stream Xpert TV PVR878
+ -
+
+ * - 166
+ - PCI-8604PW
+ -
diff --git a/Documentation/admin-guide/media/bttv.rst b/Documentation/admin-guide/media/bttv.rst
new file mode 100644
index 000000000000..125f6f47123d
--- /dev/null
+++ b/Documentation/admin-guide/media/bttv.rst
@@ -0,0 +1,1762 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+The bttv driver
+===============
+
+Release notes for bttv
+----------------------
+
+You'll need at least these config options for bttv::
+
+ ./scripts/config -e PCI
+ ./scripts/config -m I2C
+ ./scripts/config -m INPUT
+ ./scripts/config -m MEDIA_SUPPORT
+ ./scripts/config -e MEDIA_PCI_SUPPORT
+ ./scripts/config -e MEDIA_ANALOG_TV_SUPPORT
+ ./scripts/config -e MEDIA_DIGITAL_TV_SUPPORT
+ ./scripts/config -e MEDIA_RADIO_SUPPORT
+ ./scripts/config -e RC_CORE
+ ./scripts/config -m VIDEO_BT848
+
+If your board has digital TV, you'll also need::
+
+ ./scripts/config -m DVB_BT8XX
+
+In this case, please see Documentation/admin-guide/media/bt8xx.rst
+for additional notes.
+
+Make bttv work with your card
+-----------------------------
+
+If you have bttv compiled and installed, just booting the Kernel
+should be enough for it to try probing it. However, depending
+on the model, the Kernel may require additional information about
+the hardware, as the device may not be able to provide such info
+directly to the Kernel.
+
+If it doesn't work, bttv likely could not autodetect your card and needs some
+insmod options. The most important insmod option for bttv is "card=n"
+to select the correct card type. If you get video but no sound you've
+very likely specified the wrong (or no) card type. A list of supported
+cards is in Documentation/admin-guide/media/bttv-cardlist.rst.
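+
+For instance, assuming the board is a Hauppauge (bt878) card, which is
+card number 10 in the card list, the option can be passed on the command
+line when loading the driver by hand::
+
+ $ modprobe bttv card=10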
+
+If bttv takes very long to load (happens sometimes with the cheap
+cards which have no tuner), try adding this to your modules configuration
+file (usually, it is either ``/etc/modules.conf`` or some file at
+``/etc/modprobe.d/``, but the actual place depends on your
+distribution)::
+
+ options i2c-algo-bit bit_test=1
+
+Some cards may require an extra firmware file to work. For example,
+for the WinTV/PVR you need one firmware file from its driver CD,
+called: ``hcwamc.rbf``. It is inside a self-extracting zip file
+called ``pvr45xxx.exe``. Just placing it in the ``/etc/firmware``
+directory should be enough for it to be autoloaded while the driver
+probes the card (e.g. when the Kernel boots or when the driver is
+manually loaded via the ``modprobe`` command).
+
+If your card isn't listed in Documentation/admin-guide/media/bttv-cardlist.rst
+or if you have trouble making audio work, please read :ref:`still_doesnt_work`.
+
+
+Autodetecting cards
+-------------------
+
+bttv uses the PCI Subsystem ID to autodetect the card type. ``lspci -v``
+lists the Subsystem ID in the second line; the output looks like this:
+
+.. code-block:: none
+
+ 00:0a.0 Multimedia video controller: Brooktree Corporation Bt878 (rev 02)
+ Subsystem: Hauppauge computer works Inc. WinTV/GO
+ Flags: bus master, medium devsel, latency 32, IRQ 5
+ Memory at e2000000 (32-bit, prefetchable) [size=4K]
+
+Only bt878-based cards can have a subsystem ID (which does not mean
+that every card really has one). bt848 cards can't have a Subsystem
+ID and therefore can't be autodetected. There is a list with the IDs
+at Documentation/admin-guide/media/bttv-cardlist.rst
+(in case you are interested or want to mail patches with updates).
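+
+A minimal sketch of how to list only Brooktree devices together with their
+numeric Subsystem IDs, assuming ``lspci`` from pciutils is available
+(``109e`` is the Brooktree PCI vendor ID)::
+
+ $ lspci -nn -v -d 109e: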
+
+
+.. _still_doesnt_work:
+
+Still doesn't work?
+-------------------
+
+I do NOT have a lab with 30+ different grabber boards and a
+PAL/NTSC/SECAM test signal generator at home, so I often can't
+reproduce your problems. This makes debugging very difficult for me.
+
+If you have some knowledge and spare time, please try to fix this
+yourself (patches very welcome of course...) You know: The Linux
+slogan is "Do it yourself".
+
+There is a mailing list at
+http://vger.kernel.org/vger-lists.html#linux-media
+
+If you have trouble with some specific TV card, try to ask there
+instead of mailing me directly. The chance that someone with the
+same card listens there is much higher...
+
+For problems with sound: There are a lot of different systems used
+for TV sound all over the world. And there are also different chips
+which decode the audio signal. Reports about sound problems ("stereo
+doesn't work") are pretty useless unless you include some details
+about your hardware and the TV sound scheme used in your country (or
+at least the country you are living in).
+
+Modprobe options
+----------------
+
+.. note::
+
+
+ The following argument list may be outdated, as we might add more
+ options if ever needed. In case of doubt, please check with
+ ``modinfo <module>``.
+
+ This command prints various information about a kernel
+ module, among them a complete and up-to-date list of insmod options.
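+
+For example, to print only the parameter list of the bttv module (the
+``-p`` option restricts the output to parameters)::
+
+ $ modinfo -p bttv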
+
+
+
+bttv
+ The bt848/878 (grabber chip) driver
+
+ insmod args::
+
+     card=n              card type, see CARDLIST for a list.
+     tuner=n             tuner type, see CARDLIST for a list.
+     radio=0/1           card supports radio
+     pll=0/1/2           pll settings
+
+                           0: don't use PLL
+                           1: 28 MHz crystal installed
+                           2: 35 MHz crystal installed
+
+     triton1=0/1         for Triton1 (+others) compatibility
+     vsfx=0/1            yet another chipset bug compatibility bit
+                         see README.quirks for details on these two.
+
+     bigendian=n         Set the endianness of the gfx framebuffer.
+                         Default is native endian.
+     fieldnr=0/1         Count fields. Some TV descrambling software
+                         needs this, for others it only generates
+                         50 useless IRQs/sec. default is 0 (off).
+     autoload=0/1        autoload helper modules (tuner, audio).
+                         default is 1 (on).
+     bttv_verbose=0/1/2  verbose level (at insmod time, while
+                         looking at the hardware). default is 1.
+     bttv_debug=0/1      debug messages (for capture).
+                         default is 0 (off).
+     irq_debug=0/1       irq handler debug messages.
+                         default is 0 (off).
+     gbuffers=2-32       number of capture buffers for mmap'ed capture.
+                         default is 4.
+     gbufsize=           size of capture buffers. default and
+                         maximum value is 0x208000 (~2MB)
+     no_overlay=0        Enable overlay on broken hardware. There
+                         are some chipsets (SIS for example) which
+                         are known to have problems with the PCI DMA
+                         push used by bttv. bttv will disable overlay
+                         by default on this hardware to avoid crashes.
+                         With this insmod option you can override this.
+     no_overlay=1        Disable overlay. It should be used by broken
+                         hardware that doesn't support PCI2PCI direct
+                         transfers.
+     automute=0/1        Automatically mutes the sound if there is
+                         no TV signal, on by default. You might try
+                         to disable this if you have bad input signal
+                         quality which leads to unwanted sound
+                         dropouts.
+     chroma_agc=0/1      AGC of chroma signal, off by default.
+     adc_crush=0/1       Luminance ADC crush, on by default.
+     i2c_udelay=         Allows reducing the I2C speed. Default is 5
+                         usecs (meaning 66.67 Kbps), which is the
+                         maximum speed supported by the kernel bitbang
+                         algorithm. You may use higher values (lower
+                         speeds) if I2C messages are lost (16 is known
+                         to work on all supported cards).
+
+     bttv_gpio=0/1
+     gpiomask=
+     audioall=
+     audiomux=
+                         See Sound-FAQ for a detailed description.
+
+ remap, card, radio and pll accept up to four comma-separated arguments
+ (for multiple boards).
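+
+ As a sketch, assuming two boards detected in this order, a Hauppauge
+ (bt878) (card 10) with a radio tuner and a Twinhan DST clone (card 113)
+ without one, the per-board values are listed in detection order::
+
+     $ modprobe bttv card=10,113 radio=1,0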
+
+tuner
+ The tuner driver. You need this unless you want to use the board only
+ with a camera or the board doesn't provide analog TV tuning.
+
+ insmod args::
+
+     debug=1             print some debug info to the syslog
+     type=n              type of the tuner chip. n as follows:
+                         see CARDLIST for a complete list.
+     pal=[bdgil]         select PAL variant (used for some tuners
+                         only, important for the audio carrier).
+
+tvaudio
+ Provide a single driver for all simple i2c audio control
+ chips (tda/tea*).
+
+ insmod args::
+
+     tda8425  = 1        enable/disable the support for the
+     tda9840  = 1        various chips.
+     tda9850  = 1        The tea6300 can't be autodetected and is
+     tda9855  = 1        therefore off by default, if you have
+     tda9873  = 1        this one on your card (STB uses these)
+     tda9874a = 1        you have to enable it explicitly.
+     tea6300  = 0        The two tda985x chips use the same i2c
+     tea6420  = 1        address and can't be distinguished from
+     pic16c54 = 1        each other, you might have to disable
+                         the wrong one.
+     debug = 1           print debug messages
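+
+ For example, on a card that carries a tea6300 (which is off by default),
+ a line along these lines in the modules configuration file should enable
+ it. This is a sketch, not a tested configuration::
+
+     options tvaudio tea6300=1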
+
+msp3400
+ The driver for the msp34xx sound processor chips. If you have a
+ stereo card, you probably want to insmod this one.
+
+ insmod args::
+
+     debug=1/2           print some debug info to the syslog,
+                         2 is more verbose.
+     simple=1            Use the "short programming" method. Newer
+                         msp34xx versions support this. You need this
+                         for dbx stereo. Default is on if supported by
+                         the chip.
+     once=1              Don't check the TV station's audio mode
+                         every few seconds, but only once after
+                         channel switches.
+     amsound=1           Audio carrier is AM/NICAM at 6.5 MHz. This
+                         should improve things for French people, the
+                         carrier autoscan seems to work with FM only...
+
+If the box freezes hard with bttv
+---------------------------------
+
+It might be a bttv driver bug. It also might be bad hardware. It also
+might be something else ...
+
+Just mailing me "bttv freezes" isn't going to help much. This README
+has a few hints on how you can help to pin down the problem.
+
+
+bttv bugs
+~~~~~~~~~
+
+If some version works and another doesn't it is likely to be a driver
+bug. It is very helpful if you can tell where exactly it broke
+(i.e. the last working and the first broken version).
+
+With a hard freeze you probably won't find anything in the logfiles.
+The only way to capture any kernel messages is to hook up a serial
+console and let some terminal application log the messages. /me uses
+screen. See Documentation/admin-guide/serial-console.rst for details on
+setting up a serial console.
+
+Read Documentation/admin-guide/bug-hunting.rst to learn how to get any useful
+information out of a register+stack dump printed by the kernel on
+protection faults (so-called "kernel oops").
+
+If you run into some kind of deadlock, you can try to dump a call trace
+for each process using sysrq-t (see Documentation/admin-guide/sysrq.rst).
+This way it is possible to figure where *exactly* some process in "D"
+state is stuck.
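+
+A minimal sketch of both steps, assuming the first serial port at 115200
+baud and a kernel built with ``CONFIG_MAGIC_SYSRQ``. The serial console is
+selected on the kernel command line::
+
+ console=ttyS0,115200 console=tty0
+
+and, as root, the task dump can be requested without a keyboard::
+
+ # echo t > /proc/sysrq-trigger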
+
+I've seen reports that bttv 0.7.x crashes whereas 0.8.x works rock solid
+for some people. Thus probably a small buglet left somewhere in bttv
+0.7.x. I have no idea where exactly, it works stable for me and a lot of
+other people. But in case you have problems with the 0.7.x versions you
+can give 0.8.x a try ...
+
+
+hardware bugs
+~~~~~~~~~~~~~
+
+Some hardware can't deal with PCI-PCI transfers (i.e. grabber => vga).
+Sometimes problems show up with bttv just because of the high load on
+the PCI bus. The bt848/878 chips have a few workarounds for known
+incompatibilities, see README.quirks.
+
+Some folks report that increasing the pci latency helps too,
+although I'm not sure whether this really fixes the problems or
+only makes them less likely to happen. Both bttv and btaudio have an
+insmod option to set the PCI latency of the device.
+
+Some mainboards have problems dealing correctly with multiple devices
+doing DMA at the same time. bttv + ide seems to cause this sometimes;
+if this is the case you will likely see freezes only with video and hard disk
+access at the same time. Updating the IDE driver to get the latest and
+greatest workarounds for hardware bugs might fix these problems.
+
+
+other
+~~~~~
+
+If you use some binary-only junk (like the nvidia module), try to reproduce
+the problem without it.
+
+IRQ sharing is known to cause problems in some cases. It works just
+fine in theory and many configurations. Nevertheless it might be worth a
+try to shuffle around the PCI cards to give bttv another IRQ or make
+it share the IRQ with some other piece of hardware. IRQ sharing with
+VGA cards seems to cause trouble sometimes. I've also seen funny
+effects with bttv sharing the IRQ with the ACPI bridge (and
+ACPI-enabled kernel).
+
+Bttv quirks
+-----------
+
+Below is what the bt878 data book says about the PCI bug compatibility
+modes of the bt878 chip.
+
+The triton1 insmod option sets the EN_TBFX bit in the control register.
+The vsfx insmod option does the same for EN_VSFX bit. If you have
+stability problems you can try if one of these options makes your box
+work solid.
+
+drivers/pci/quirks.c knows about these issues, this way these bits are
+enabled automagically for known-buggy chipsets (look at the kernel
+messages, bttv tells you).
+
+Normal PCI Mode
+~~~~~~~~~~~~~~~
+
+The PCI REQ signal is the logical-or of the incoming function requests.
+The internal GNT[0:1] signals are gated asynchronously with GNT and
+demultiplexed by the audio request signal. Thus the arbiter defaults to
+the video function at power-up and parks there during no requests for
+bus access. This is desirable since the video will request the bus more
+often. However, the audio will have highest bus access priority. Thus
+the audio will have first access to the bus even when issuing a request
+after the video request but before the PCI external arbiter has granted
+access to the Bt879. Neither function can preempt the other once on the
+bus. The duration to empty the entire video PCI FIFO onto the PCI bus is
+very short compared to the bus access latency the audio PCI FIFO can
+tolerate.
+
+
+430FX Compatibility Mode
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+When using the 430FX PCI, the following rules will ensure
+compatibility:
+
+ (1) Deassert REQ at the same time as asserting FRAME.
+ (2) Do not reassert REQ to request another bus transaction until after
+ finishing the previous transaction.
+
+Since the individual bus masters do not have direct control of REQ, a
+simple logical-or of video and audio requests would violate the rules.
+Thus, both the arbiter and the initiator contain 430FX compatibility
+mode logic. To enable 430FX mode, set the EN_TBFX bit as indicated in
+Device Control Register on page 104.
+
+When EN_TBFX is enabled, the arbiter ensures that the two compatibility
+rules are satisfied. Before GNT is asserted by the PCI arbiter, this
+internal arbiter may still logical-or the two requests. However, once
+the GNT is issued, this arbiter must lock in its decision and now route
+only the granted request to the REQ pin. The arbiter decision lock
+happens regardless of the state of FRAME because it does not know when
+FRAME will be asserted (typically - each initiator will assert FRAME on
+the cycle following GNT). When FRAME is asserted, it is the initiator's
+responsibility to remove its request at the same time. It is the
+arbiter's responsibility to allow this request to flow through to REQ and
+not allow the other request to hold REQ asserted. The decision lock may
+be removed at the end of the transaction: for example, when the bus is
+idle (FRAME and IRDY). The arbiter decision may then continue
+asynchronously until GNT is again asserted.
+
+
+Interfacing with Non-PCI 2.1 Compliant Core Logic
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A small percentage of core logic devices may start a bus transaction
+during the same cycle that GNT is de-asserted. This is non PCI 2.1
+compliant. To ensure compatibility when using PCs with these PCI
+controllers, the EN_VSFX bit must be enabled (refer to Device Control
+Register on page 104). When in this mode, the arbiter does not pass GNT
+to the internal functions unless REQ is asserted. This prevents a bus
+transaction from starting the same cycle as GNT is de-asserted. This
+also has the side effect of not being able to take advantage of bus
+parking, thus lowering arbitration performance. The Bt879 drivers must
+query for these non-compliant devices, and set the EN_VSFX bit only if
+required.
+
+
+Other elements of the tvcards array
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you are trying to make a new card work you might find it useful to
+know what the other elements in the tvcards array are good for::
+
+ video_inputs    - # of video inputs the card has
+ audio_inputs    - historical cruft, not used any more.
+ tuner           - which input is the tuner
+ svhs            - which input is svhs (all others are labeled composite)
+ muxsel          - video mux, input->registervalue mapping
+ pll             - same as pll= insmod option
+ tuner_type      - same as tuner= insmod option
+ *_modulename    - hint whether some card needs this or that audio
+                   module loaded to work properly.
+ has_radio       - whether this TV card has a radio tuner.
+ no_msp34xx      - "1" disables loading of msp3400.o module
+ no_tda9875      - "1" disables loading of tda9875.o module
+ needs_tvaudio   - set to "1" to load tvaudio.o module
+
+If some config item is specified both from the tvcards array and as
+insmod option, the insmod option takes precedence.
+
+Cards
+-----
+
+.. note::
+
+ For a more up-to-date list, please check
+ https://linuxtv.org/wiki/index.php/Hardware_Device_Information
+
+Supported cards: Bt848/Bt848a/Bt849/Bt878/Bt879 cards
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All cards with Bt848/Bt848a/Bt849/Bt878/Bt879 and normal
+Composite/S-VHS inputs are supported. Teletext and Intercast (PAL only)
+are supported on all cards via VBI sample decoding in software.
+
+Some cards with additional multiplexing of inputs or other additional
+fancy chips are only partially supported (unless specifications by the
+card manufacturer are given). When a card is listed here it isn't
+necessarily fully supported.
+
+All other cards only differ by additional components such as tuners, sound
+decoders, EEPROMs, teletext decoders ...
+
+
+MATRIX Vision
+~~~~~~~~~~~~~
+
+MV-Delta
+- Bt848A
+- 4 Composite inputs, 1 S-VHS input (shared with 4th composite)
+- EEPROM
+
+http://www.matrix-vision.de/
+
+This card has no tuner but supports all 4 composite inputs (1 shared with an
+S-VHS input) of the Bt848A.
+Very nice card if you only have satellite TV but several tuners connected
+to the card via composite.
+
+Many thanks to Matrix-Vision for giving us 2 cards for free which made
+Bt848a/Bt849 single crystal operation support possible!!!
+
+
+
+Miro/Pinnacle PCTV
+~~~~~~~~~~~~~~~~~~
+
+- Bt848
+ some (all??) come with 2 crystals for PAL/SECAM and NTSC
+- PAL, SECAM or NTSC TV tuner (Philips or TEMIC)
+- MSP34xx sound decoder on add on board
+ decoder is supported but AFAIK does not yet work
+ (other sound MUX setting in GPIO port needed??? somebody who fixed this???)
+- 1 tuner, 1 composite and 1 S-VHS input
+- tuner type is autodetected
+
+http://www.miro.de/
+http://www.miro.com/
+
+
+Many thanks for the free card which made first NTSC support possible back
+in 1997!
+
+
+Hauppauge Win/TV pci
+~~~~~~~~~~~~~~~~~~~~
+
+There are many different versions of the Hauppauge cards with different
+tuners (TV+Radio ...), teletext decoders.
+Note that even cards with the same model number may (depending on the
+revision) have different chips on them.
+
+- Bt848 (and others but always in 2 crystal operation???)
+ newer cards have a Bt878
+
+- PAL, SECAM, NTSC or tuner with or without Radio support
+
+e.g.:
+
+- PAL:
+
+ - TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners
+ - TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3
+
+- NTSC:
+
+ - TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners
+ - TSA5518: no datasheet available on Philips site
+
+- Philips SAA5246 or SAA5284 (or no) Teletext decoder chip
+ with buffer RAM (e.g. Winbond W24257AS-35: 32Kx8 CMOS static RAM)
+ SAA5246 (I2C 0x22) is supported
+
+- 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y
+ with configuration information
+ I2C address 0xa0 (24LC02B also responds to 0xa2-0xaf)
+
+- 1 tuner, 1 composite and (depending on model) 1 S-VHS input
+
+- 14052B: mux for selection of sound source
+
+- sound decoder: TDA9800, MSP34xx (stereo cards)
+
+
+Askey CPH-Series
+~~~~~~~~~~~~~~~~
+Developed by TelSignal(?), OEMed by many vendors (Typhoon, Anubis, Dynalink)
+
+- Card series:
+ - CPH01x: BT848 capture only
+ - CPH03x: BT848
+ - CPH05x: BT878 with FM
+ - CPH06x: BT878 (w/o FM)
+ - CPH07x: BT878 capture only
+
+- TV standards:
+ - CPH0x0: NTSC-M/M
+ - CPH0x1: PAL-B/G
+ - CPH0x2: PAL-I/I
+ - CPH0x3: PAL-D/K
+ - CPH0x4: SECAM-L/L
+ - CPH0x5: SECAM-B/G
+ - CPH0x6: SECAM-D/K
+ - CPH0x7: PAL-N/N
+ - CPH0x8: PAL-B/H
+ - CPH0x9: PAL-M/M
+
+- CPH03x was often sold as "TV capturer".
+
+Identifying:
+
+ #) 878 cards can be identified by PCI Subsystem-ID:
+ - 144f:3000 = CPH06x
+ - 144F:3002 = CPH05x w/ FM
+ - 144F:3005 = CPH06x_LC (w/o remote control)
+ #) The cards have a sticker with "CPH"-model on the back.
+ #) These cards have a number printed on the PCB just above the tuner metal box:
+ - "80-CP2000300-x" = CPH03X
+ - "80-CP2000500-x" = CPH05X
+ - "80-CP2000600-x" = CPH06X / CPH06x_LC
+
+ Askey sells these cards as "Magic TView series", Brand "MagicXpress".
+ Other OEMs often call these "Tview", "TView99" or similar.
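+
+The PCI subsystem ID mentioned above can be read with ``lspci`` from
+pciutils. A minimal sketch, assuming pciutils is installed and that the
+card's Bt8xx chip reports the Brooktree PCI vendor ID ``109e``; the
+subsystem ID appears on the ``Subsystem:`` line of each matching device::
+
+ $ lspci -nn -v -d 109e: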
+
+Lifeview Flyvideo Series:
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The naming of these series differs in time and space.
+
+Identifying:
+ #) Some models can be identified by PCI subsystem ID:
+
+ - 1852:1852 = Flyvideo 98 FM
+ - 1851:1850 = Flyvideo 98
+ - 1851:1851 = Flyvideo 98 EZ (capture only)
+
+ #) There is a print on the PCB:
+
+ - LR25 = Flyvideo (Zoran ZR36120, SAA7110A)
+ - LR26 Rev.N = Flyvideo II (Bt848)
+ - LR26 Rev.O = Flyvideo II (Bt878)
+ - LR37 Rev.C = Flyvideo EZ (Capture only, ZR36120 + SAA7110)
+ - LR38 Rev.A1= Flyvideo II EZ (Bt848 capture only)
+ - LR50 Rev.Q = Flyvideo 98 (w/eeprom and PCI subsystem ID)
+ - LR50 Rev.W = Flyvideo 98 (no eeprom)
+ - LR51 Rev.E = Flyvideo 98 EZ (capture only)
+ - LR90 = Flyvideo 2000 (Bt878)
+ - LR90 Flyvideo 2000S (Bt878) w/Stereo TV (Package incl. LR91 daughterboard)
+ - LR91 = Stereo daughter card for LR90
+ - LR97 = Flyvideo DVBS
+ - LR99 Rev.E = Low profile card for OEM integration (only internal audio!) bt878
+ - LR136 = Flyvideo 2100/3100 (Low profile, SAA7130/SAA7134)
+ - LR137 = Flyvideo DV2000/DV3000 (SAA7130/SAA7134 + IEEE1394)
+ - LR138 Rev.C= Flyvideo 2000 (SAA7130)
+ - LR138 Flyvideo 3000 (SAA7134) w/Stereo TV
+
+ - These exist in variations w/FM and w/Remote sometimes denoted
+ by suffixes "FM" and "R".
+
+ #) You have a laptop (miniPCI card):
+
+ - Product = FlyTV Platinum Mini
+ - Model/Chip = LR212/saa7135
+
+ - Lifeview.com.tw states (Feb. 2002):
+ "The FlyVideo2000 and FlyVideo2000s product name have renamed to FlyVideo98."
+ Their Bt8x8 cards are listed as discontinued.
+ - Flyvideo 2000S was probably sold as Flyvideo 3000 in some countries(Europe?).
+ The new Flyvideo 2000/3000 are SAA7130/SAA7134 based.
+
+"Flyvideo II" had been the name for the 848 cards, nowadays (in Germany)
+this name is re-used for LR50 Rev.W.
+
+The Lifeview website mentioned Flyvideo III at some time, but such a card
+has not yet been seen (perhaps it was the German name for LR90 [stereo]).
+These cards are sold by many OEMs too.
+
+FlyVideo A2 (Elta 8680)= LR90 Rev.F (w/Remote, w/o FM, stereo TV by tda9821) {Germany}
+
+Lifeview 3000 (Elta 8681) as sold by Plus(April 2002), Germany = LR138 w/ saa7134
+
+lifeview config coding on gpio pins 0-9
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- LR50 rev. Q ("PARTS: 7031505116), tuner was detected as no. 5, inputs
+ SVideo, TV, Composite, Audio, Remote:
+
+ - CP9..1=100001001 (1: 0-ohm resistor to GND not populated; 0: populated)
+
+
+Typhoon TV card series:
+~~~~~~~~~~~~~~~~~~~~~~~
+
+These can be CPH, Flyvideo, Pixelview or KNC1 series.
+
+Typhoon is the brand of Anubis.
+
+Model 50680 got re-used, some model no. had different contents over time.
+
+Models:
+
+ - 50680 "TV Tuner PCI Pal BG"(old,red package)=can be CPH03x(bt848) or CPH06x(bt878)
+ - 50680 "TV Tuner Pal BG" (blue package)= Pixelview PV-BT878P+ (Rev 9B)
+ - 50681 "TV Tuner PCI Pal I" (variant of 50680)
+ - 50682 "TView TV/FM Tuner Pal BG" = Flyvideo 98FM (LR50 Rev.Q)
+
+ .. note::
+
+ The package has a picture of CPH05x (which would be a real TView)
+
+ - 50683 "TV Tuner PCI SECAM" (variant of 50680)
+ - 50684 "TV Tuner Pal BG" = Pixelview 878TV(Rev.3D)
+ - 50686 "TV Tuner" = KNC1 TV Station
+ - 50687 "TV Tuner stereo" = KNC1 TV Station pro
+ - 50688 "TV Tuner RDS" (black package) = KNC1 TV Station RDS
+ - 50689 TV SAT DVB-S CARD CI PCI (SAA7146AH, SU1278?) = "KNC1 TV Station DVB-S"
+ - 50692 "TV/FM Tuner" (small PCB)
+ - 50694 TV TUNER CARD RDS (PHILIPS CHIPSET SAA7134HL)
+ - 50696 TV TUNER STEREO (PHILIPS CHIPSET SAA7134HL, MK3ME Tuner)
+ - 50804 PC-SAT TV/Audio Karte = Techni-PC-Sat (ZORAN 36120PQC, Tuner:Alps)
+ - 50866 TVIEW SAT RECEIVER+ADR
+ - 50868 "TV/FM Tuner Pal I" (variant of 50682)
+ - 50999 "TV/FM Tuner Secam" (variant of 50682)
+
+Guillemot
+~~~~~~~~~
+
+Models:
+
+- Maxi-TV PCI (ZR36120)
+- Maxi TV Video 2 = LR50 Rev.Q (FI1216MF, PAL BG+SECAM)
+- Maxi TV Video 3 = CPH064 (PAL BG + SECAM)
+
+Mentor
+~~~~~~
+
+Mentor TV card ("55-878TV-U1") = Pixelview 878TV(Rev.3F) (w/FM w/Remote)
+
+Prolink
+~~~~~~~
+
+- TV cards:
+
+ - PixelView Play TV pro - (Model: PV-BT878P+ REV 8E)
+ - PixelView Play TV pro - (Model: PV-BT878P+ REV 9D)
+ - PixelView Play TV pro - (Model: PV-BT878P+ REV 4C / 8D / 10A )
+ - PixelView Play TV - (Model: PV-BT848P+)
+ - 878TV - (Model: PV-BT878TV)
+
+- Multimedia TV packages (card + software pack):
+
+ - PixelView Play TV Theater - (Model: PV-M4200) = PixelView Play TV pro + Software
+ - PixelView Play TV PAK - (Model: PV-BT878P+ REV 4E)
+ - PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A )
+ - PixelView Studio PAK - (Model: M2200 REV 4C / 8D / 10A )
+ - PixelView PowerStudio PAK - (Model: PV-M3600 REV 4E)
+ - PixelView DigitalVCR PAK - (Model: PV-M2400 REV 4C / 8D / 10A )
+ - PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800
+ - PixelView PlayTV XP PV-M4700,PV-M4700(w/FM)
+ - PixelView PlayTV DVR PV-M4600 package contents:PixelView PlayTV pro, windvr & videoMail s/w
+
+- Further Cards:
+
+ - PV-BT878P+rev.9B (Play TV Pro, opt. w/FM w/NICAM)
+ - PV-BT878P+rev.2F
+ - PV-BT878P Rev.1D (bt878, capture only)
+
+ - XCapture PV-CX881P (cx23881)
+ - PlayTV HD PV-CX881PL+, PV-CX881PL+(w/FM) (cx23881)
+
+ - DTV3000 PV-DTV3000P+ DVB-S CI = Twinhan VP-1030
+ - DTV2000 DVB-S = Twinhan VP-1020
+
+- Video Conferencing:
+
+ - PixelView Meeting PAK - (Model: PV-BT878P)
+ - PixelView Meeting PAK Lite - (Model: PV-BT878P)
+ - PixelView Meeting PAK plus - (Model: PV-BT878P+rev 4C/8D/10A)
+ - PixelView Capture - (Model: PV-BT848P)
+ - PixelView PlayTV USB pro
+ - Model No. PV-NT1004+, PV-NT1004+ (w/FM) = NT1004 USB decoder chip + SAA7113 video decoder chip
+
+Dynalink
+~~~~~~~~
+
+These are CPH series.
+
+Phoebemicro
+~~~~~~~~~~~
+
+- TV Master = CPH030 or CPH060
+- TV Master FM = CPH050
+
+Genius/Kye
+~~~~~~~~~~
+
+- Video Wonder/Genius Internet Video Kit = LR37 Rev.C
+- Video Wonder Pro II (848 or 878) = LR26
+
+Tekram
+~~~~~~
+
+- VideoCap C205 (Bt848)
+- VideoCap C210 (zr36120 +Philips)
+- CaptureTV M200 (ISA)
+- CaptureTV M205 (Bt848)
+
+Lucky Star
+~~~~~~~~~~
+
+- Image World Conference TV = LR50 Rev. Q
+
+Leadtek
+~~~~~~~
+
+- WinView 601 (Bt848)
+- WinView 610 (Zoran)
+- WinFast2000
+- WinFast2000 XP
+
+Support for the Leadtek WinView 601 TV/FM
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Author of this section: Jon Tombs <jon@gte.esi.us.es>
+
+This card is basically the same as all the rest (Bt848A, Philips tuner);
+the main difference is that they have attached a programmable attenuator to 3
+GPIO lines in order to give some volume control. They have also stuck an
+infra-red remote control decoder on the board. I will add support for this
+when I get time (it simply generates an interrupt for each key press, with
+the key code placed in the GPIO port).
+
+I don't yet have any application to test the radio support. The tuner
+frequency setting should work but it is possible that the audio multiplexer
+is wrong. If it doesn't work, send me email.
+
+
+- No thanks to Leadtek: they refused to answer any questions about their
+ hardware. The driver was written by visual inspection of the card. If you
+ use this driver, send an email insult to them, and tell them you won't
+ continue buying their hardware unless they support Linux.
+
+- Little thanks to Princeton Technology Corp (http://www.princeton.com.tw)
+ who make the audio attenuator. The data-sheet publicly available
+ on their web site doesn't include the chip programming information! Hidden
+ on their server are the full data-sheets, but don't ask how I found them.
+
+To use the driver I use the following options; the tuner and pll settings might
+be different in your country. You can force them via modprobe parameters.
+For example::
+
+ modprobe bttv tuner=1 pll=28 radio=1 card=17
+
+Sets tuner type 1 (Philips PAL_I), PLL with a 28 MHz crystal, enables
+FM radio and selects bttv card ID 17 (Leadtek WinView 601).
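+
+To keep such options across reboots, they can usually be placed in a
+modprobe configuration file instead of being typed each time. A minimal
+sketch, reusing the values from the example above; the file name
+``/etc/modprobe.d/bttv.conf`` is only an assumption (any ``.conf`` file in
+that directory should do on most distributions)::
+
+ # /etc/modprobe.d/bttv.conf (hypothetical file name)
+ options bttv tuner=1 pll=28 radio=1 card=17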
+
+
+KNC One
+~~~~~~~
+
+- TV-Station
+- TV-Station SE (+Software Bundle)
+- TV-Station pro (+TV stereo)
+- TV-Station FM (+Radio)
+- TV-Station RDS (+RDS)
+- TV Station SAT (analog satellite)
+- TV-Station DVB-S
+
+.. note:: newer Cards have saa7134, but model name stayed the same?
+
+Provideo
+~~~~~~~~
+
+- PV951 or PV-951, now named PV-951T
+ (also sold as:
+ Boeder TV-FM Video Capture Card,
+ Titanmedia Supervision TV-2400,
+ Provideo PV951 TF,
+ 3DeMon PV951,
+ MediaForte TV-Vision PV951,
+ Yoko PV951,
+ Vivanco Tuner Card PCI Art.-Nr.: 68404
+ )
+
+- Surveillance Series:
+
+ - PV-141
+ - PV-143
+ - PV-147
+ - PV-148 (capture only)
+ - PV-150
+ - PV-151
+
+- TV-FM Tuner Series:
+
+ - PV-951TDV (tv tuner + 1394)
+ - PV-951T/TF
+ - PV-951PT/TF
+ - PV-956T/TF Low Profile
+ - PV-911
+
+Highscreen
+~~~~~~~~~~
+
+Models:
+
+- TV Karte = LR50 Rev.S
+- TV-Boostar = Terratec Terra TV+ Version 1.0 (Bt848, tda9821) "ceb105.pcb"
+
+Zoltrix
+~~~~~~~
+
+Models:
+
+- Face to Face Capture (Bt848 capture only) (PCB "VP-2848")
+- Face To Face TV MAX (Bt848) (PCB "VP-8482 Rev1.3")
+- Genie TV (Bt878) (PCB "VP-8790 Rev 2.1")
+- Genie Wonder Pro
+
+AVerMedia
+~~~~~~~~~
+
+- AVer FunTV Lite (ISA, AV3001 chipset) "M101.C"
+- AVerTV
+- AVerTV Stereo
+- AVerTV Studio (w/FM)
+- AVerMedia TV98 with Remote
+- AVerMedia TV/FM98 Stereo
+- AVerMedia TVCAM98
+- TVCapture (Bt848)
+- TVPhone (Bt848)
+- TVCapture98 (="AVerMedia TV98" in USA) (Bt878)
+- TVPhone98 (Bt878, w/FM)
+
+======== =========== =============== ======= ====== ======== =======================
+PCB PCI-ID Model-Name Eeprom Tuner Sound Country
+======== =========== =============== ======= ====== ======== =======================
+M101.C ISA !
+M108-B Bt848 -- FR1236 US [#f2]_, [#f3]_
+M1A8-A Bt848 AVer TV-Phone FM1216 --
+M168-T 1461:0003 AVerTV Studio 48:17 FM1216 TDA9840T D [#f1]_ w/FM w/Remote
+M168-U 1461:0004 TVCapture98 40:11 FI1216 -- D w/Remote
+M168II-B 1461:0003 Medion MD9592 48:16 FM1216 TDA9873H D w/FM
+======== =========== =============== ======= ====== ======== =======================
+
+.. [#f1] Daughterboard MB68-A with TDA9820T and TDA9840T
+.. [#f2] Sony NE41S soldered (stereo sound?)
+.. [#f3] Daughterboard M118-A w/ pic 16c54 and 4 MHz quartz
+
+- US site has different drivers for (as of 09/2002):
+
+ - EZ Capture/InterCam PCI (BT-848 chip)
+ - EZ Capture/InterCam PCI (BT-878 chip)
+ - TV-Phone (BT-848 chip)
+ - TV98 (BT-848 chip)
+ - TV98 With Remote (BT-848 chip)
+ - TV98 (BT-878 chip)
+ - TV98 With Remote (BT-878)
+ - TV/FM98 (BT-878 chip)
+ - AVerTV
+ - AverTV Stereo
+ - AVerTV Studio
+
+The German site has various drivers for these models (as of 09/2002):
+
+ - TVPhone (848) with Philips tuner FR12X6 (w/ FM radio)
+ - TVPhone (848) with Philips tuner FM12X6 (w/ FM radio)
+ - TVCapture (848) w/Philips tuner FI12X6
+ - TVCapture (848) non-Philips tuner
+ - TVCapture98 (Bt878)
+ - TVPhone98 (Bt878)
+ - AVerTV and TVCapture98 w/VCR (Bt 878)
+ - AVerTVStudio and TVPhone98 w/VCR (Bt878)
+ - AVerTV GO series (no S-Video input)
+ - AVerTV98 (BT-878 chip)
+ - AVerTV98 with remote control (BT-878 chip)
+ - AVerTV/FM98 (BT-878 chip)
+
+ - VDOmate (www.averm.com.cn) = M168U ?
+
+Aimslab
+~~~~~~~
+
+Models:
+
+- Video Highway or "Video Highway TR200" (ISA)
+- Video Highway Xtreme (aka "VHX") (Bt848, FM w/ TEA5757)
+
+IXMicro (former: IMS=Integrated Micro Solutions)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- IXTV BT848 (=TurboTV)
+- IXTV BT878
+- IMS TurboTV (Bt848)
+
+Lifetec/Medion/Tevion/Aldi
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- LT9306/MD9306 = CPH061
+- LT9415/MD9415 = LR90 Rev.F or Rev.G
+- MD9592 = Avermedia TVphone98 (PCI_ID=1461:0003), PCB-Rev=M168II-B (w/TDA9873H)
+- MD9717 = KNC One (Rev D4, saa7134, FM1216 MK2 tuner)
+- MD5044 = KNC One (Rev D4, saa7134, FM1216ME MK3 tuner)
+
+Modular Technologies (www.modulartech.com) UK
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- MM100 PCTV (Bt848)
+- MM201 PCTV (Bt878, Bt832) w/ Quartzsight camera
+- MM202 PCTV (Bt878, Bt832, tda9874)
+- MM205 PCTV (Bt878)
+- MM210 PCTV (Bt878) (Galaxy TV, Galaxymedia ?)
+
+Terratec
+~~~~~~~~
+
+Models:
+
+- Terra TV+ Version 1.0 (Bt848), "ceb105.PCB" printed on the PCB, TDA9821
+- Terra TV+ Version 1.1 (Bt878), "LR74 Rev.E" printed on the PCB, TDA9821
+- Terra TValueRadio, "LR102 Rev.C" printed on the PCB
+- Terra TV/Radio+ Version 1.0, "80-CP2830100-0" TTTV3 printed on the PCB,
+ "CPH010-E83" on the back, SAA6588T, TDA9873H
+- Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB,
+ "CPH011-D83" on back
+- Terra TValue Version 1.0 "ceb105.PCB" (really identical to Terra TV+ Version 1.0)
+- Terra TValue New Revision "LR102 Rev.C"
+- Terra Active Radio Upgrade (tea5757h, saa6588t)
+
+- LR74 is a newer PCB revision of ceb105 (both incl. connector for Active Radio Upgrade)
+
+- Cinergy 400 (saa7134), "E877 11(S)", "PM820092D" printed on PCB
+- Cinergy 600 (saa7134)
+
+Technisat
+~~~~~~~~~
+
+Models:
+
+- Discos ADR PC-Karte ISA (no TV!)
+- Discos ADR PC-Karte PCI (probably no TV?)
+- Techni-PC-Sat (Sat. analog)
+ Rev 1.2 (zr36120, vpx3220, stv0030, saa5246, BSJE3-494A)
+- Mediafocus I (zr36120/zr36125, drp3510, Sat. analog + ADR Radio)
+- Mediafocus II (saa7146, Sat. analog)
+- SatADR Rev 2.1 (saa7146a, saa7113h, stv0056a, msp3400c, drp3510a, BSKE3-307A)
+- SkyStar 1 DVB (AV7110) = Technotrend Premium
+- SkyStar 2 DVB (B2C2) (=Sky2PC)
+
+Siemens
+~~~~~~~
+
+Multimedia eXtension Board (MXB) (SAA7146, SAA7111)
+
+Powercolor
+~~~~~~~~~~
+
+Models:
+
+- MTV878
+ Package comes with different contents:
+
+ a) pcb "MTV878" (CARD=75)
+ b) Pixelview Rev. 4\_
+
+- MTV878R w/Remote Control
+- MTV878F w/Remote Control w/FM radio
+
+Pinnacle
+~~~~~~~~
+
+PCTV models:
+
+- Mirovideo PCTV (Bt848)
+- Mirovideo PCTV SE (Bt848)
+- Mirovideo PCTV Pro (Bt848 + Daughterboard for TV Stereo and FM)
+- Studio PCTV Rave (Bt848 Version = Mirovideo PCTV)
+- Studio PCTV Rave (Bt878 package w/o infrared)
+- Studio PCTV (Bt878)
+- Studio PCTV Pro (Bt878 stereo w/ FM)
+- Pinnacle PCTV (Bt878, MT2032)
+- Pinnacle PCTV Pro (Bt878, MT2032)
+- Pinnacle PCTV Sat (bt878a, HM1821/1221) ["Conexant CX24110 with CX24108 tuner, aka HM1221/HM1811"]
+- Pinnacle PCTV Sat XE
+
+M(J)PEG capture and playback models:
+
+- DC1+ (ISA)
+- DC10 (zr36057, zr36060, saa7110, adv7176)
+- DC10+ (zr36067, zr36060, saa7110, adv7176)
+- DC20 (ql16x24b,zr36050, zr36016, saa7110, saa7187 ...)
+- DC30 (zr36057, zr36050, zr36016, vpx3220, adv7176, ad1843, tea6415, miro FST97A1)
+- DC30+ (zr36067, zr36050, zr36016, vpx3220, adv7176)
+- DC50 (zr36067, zr36050, zr36016, saa7112, adv7176 (2 pcs.?), ad1843, miro FST97A1, Lattice ???)
+
+Lenco
+~~~~~
+
+Models:
+
+- MXR-9565 (=Technisat Mediafocus?)
+- MXR-9571 (Bt848) (=CPH031?)
+- MXR-9575
+- MXR-9577 (Bt878) (=Prolink 878TV Rev.3x)
+- MXTV-9578CP (Bt878) (= Prolink PV-BT878P+4E)
+
+Iomega
+~~~~~~
+
+Buz (zr36067, zr36060, saa7111, saa7185)
+
+LML
+~~~
+ LML33 (zr36067, zr36060, bt819, bt856)
+
+Grandtec
+~~~~~~~~
+
+Models:
+
+- Grand Video Capture (Bt848)
+- Multi Capture Card (Bt878)
+
+Koutech
+~~~~~~~
+
+Models:
+
+- KW-606 (Bt848)
+- KW-607 (Bt848 capture only)
+- KW-606RSF
+- KW-607A (capture only)
+- KW-608 (Zoran capture only)
+
+IODATA (jp)
+~~~~~~~~~~~
+
+Models:
+
+- GV-BCTV/PCI
+- GV-BCTV2/PCI
+- GV-BCTV3/PCI
+- GV-BCTV4/PCI
+- GV-VCP/PCI (capture only)
+- GV-VCP2/PCI (capture only)
+
+Canopus (jp)
+~~~~~~~~~~~~
+
+WinDVR = Kworld "KW-TVL878RF"
+
+www.sigmacom.co.kr
+~~~~~~~~~~~~~~~~~~
+
+Sigma Cyber TV II
+
+www.sasem.co.kr
+~~~~~~~~~~~~~~~
+
+Little OnAir TV
+
+hama
+~~~~
+
+TV/Radio-Tuner Card, PCI (Model 44677) = CPH051
+
+Sigma Designs
+~~~~~~~~~~~~~
+
+Hollywood plus (em8300, em9010, adv7175), (PCB "M340-10") MPEG DVD decoder
+
+Formac
+~~~~~~
+
+Models:
+
+- iProTV (Card for iMac Mezzanine slot, Bt848+SCSI)
+- ProTV (Bt848)
+- ProTV II = ProTV Stereo (Bt878) ["stereo" means FM stereo, tv is still mono]
+
+ATI
+~~~
+
+Models:
+
+- TV-Wonder
+- TV-Wonder VE
+
+Diamond Multimedia
+~~~~~~~~~~~~~~~~~~
+
+DTV2000 (Bt848, tda9875)
+
+Aopen
+~~~~~
+
+- VA1000 Plus (w/ Stereo)
+- VA1000 Lite
+- VA1000 (=LR90)
+
+Intel
+~~~~~
+
+Models:
+
+- Smart Video Recorder (ISA full-length)
+- Smart Video Recorder pro (ISA half-length)
+- Smart Video Recorder III (Bt848)
+
+STB
+~~~
+
+Models:
+
+- STB Gateway 6000704 (bt878)
+- STB Gateway 6000699 (bt848)
+- STB Gateway 6000402 (bt848)
+- STB TV130 PCI
+
+Videologic
+~~~~~~~~~~
+
+Models:
+
+- Captivator Pro/TV (ISA?)
+- Captivator PCI/VC (Bt848 bundled with camera) (capture only)
+
+Technotrend
+~~~~~~~~~~~~
+
+Models:
+
+- TT-SAT PCI (PCB "Sat-PCI Rev.:1.3.1"; zr36125, vpx3225d, stc0056a, Tuner:BSKE6-155A)
+- TT-DVB-Sat
+ - revisions 1.1, 1.3, 1.5, 1.6 and 2.1
+ - This card is sold as OEM from:
+
+ - Siemens DVB-s Card
+ - Hauppauge WinTV DVB-S
+ - Technisat SkyStar 1 DVB
+ - Galaxis DVB Sat
+
+ - Now this card is called TT-PCline Premium Family
+ - TT-Budget (saa7146, bsru6-701a)
+ This card is sold as OEM from:
+
+ - Hauppauge WinTV Nova
+ - Satelco Standard PCI (DVB-S)
+ - TT-DVB-C PCI
+
+Teles
+~~~~~
+
+ DVB-s (Rev. 2.2, BSRV2-301A, data only?)
+
+Remote Vision
+~~~~~~~~~~~~~
+
+MX RV605 (Bt848 capture only)
+
+Boeder
+~~~~~~
+
+Models:
+
+- PC ChatCam (Model 68252) (Bt848 capture only)
+- Tv/Fm Capture Card (Model 68404) = PV951
+
+Media-Surfer (esc-kathrein.de)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- Sat-Surfer (ISA)
+- Sat-Surfer PCI = Techni-PC-Sat
+- Cable-Surfer 1
+- Cable-Surfer 2
+- Cable-Surfer PCI (zr36120)
+- Audio-Surfer (ISA Radio card)
+
+Jetway (www.jetway.com.tw)
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- JW-TV 878M
+- JW-TV 878 = KWorld KW-TV878RF
+
+Galaxis
+~~~~~~~
+
+Models:
+
+- Galaxis DVB Card S CI
+- Galaxis DVB Card C CI
+- Galaxis DVB Card S
+- Galaxis DVB Card C
+- Galaxis plug.in S [new name: Galaxis DVB Card S CI]
+
+Hauppauge
+~~~~~~~~~
+
+Models:
+
+- many many WinTV models ...
+- WinTV DVBs = Technotrend Premium 1.3
+- WinTV NOVA = Technotrend Budget 1.1 "S-DVB DATA"
+- WinTV NOVA-CI "SDVBACI"
+- WinTV Nova USB (=Technotrend USB 1.0)
+- WinTV-Nexus-s (=Technotrend Premium 2.1 or 2.2)
+- WinTV PVR
+- WinTV PVR 250
+- WinTV PVR 450
+
+US models
+
+-990 WinTV-PVR-350 (249USD) (iTVC15 chipset + radio)
+-980 WinTV-PVR-250 (149USD) (iTVC15 chipset)
+-880 WinTV-PVR-PCI (199USD) (KFIR chipset + bt878)
+-881 WinTV-PVR-USB
+-190 WinTV-GO
+-191 WinTV-GO-FM
+-404 WinTV
+-401 WinTV-radio
+-495 WinTV-Theater
+-602 WinTV-USB
+-621 WinTV-USB-FM
+-600 USB-Live
+-698 WinTV-HD
+-697 WinTV-D
+-564 WinTV-Nexus-S
+
+German models:
+
+-603 WinTV GO
+-719 WinTV Primio-FM
+-718 WinTV PCI-FM
+-497 WinTV Theater
+-569 WinTV USB
+-568 WinTV USB-FM
+-882 WinTV PVR
+-981 WinTV PVR 250
+-891 WinTV-PVR-USB
+-541 WinTV Nova
+-488 WinTV Nova-Ci
+-564 WinTV-Nexus-s
+-727 WinTV-DVB-c
+-545 Common Interface
+-898 WinTV-Nova-USB
+
+UK models:
+
+-607 WinTV Go
+-693,793 WinTV Primio FM
+-647,747 WinTV PCI FM
+-498 WinTV Theater
+-883 WinTV PVR
+-893 WinTV PVR USB (Duplicate entry)
+-566 WinTV USB (UK)
+-573 WinTV USB FM
+-429 Impact VCB (bt848)
+-600 USB Live (Video-In 1x Comp, 1xSVHS)
+-542 WinTV Nova
+-717 WinTV DVB-S
+-909 Nova-t PCI
+-893 Nova-t USB (Duplicate entry)
+-802 MyTV
+-804 MyView
+-809 MyVideo
+-872 MyTV2Go FM
+-546 WinTV Nova-S CI
+-543 WinTV Nova
+-907 Nova-S USB
+-908 Nova-T USB
+-717 WinTV Nexus-S
+-157 DEC3000-s Standalone + USB
+
+Spain:
+
+-685 WinTV-Go
+-690 WinTV-PrimioFM
+-416 WinTV-PCI Nicam Estereo
+-677 WinTV-PCI-FM
+-699 WinTV-Theater
+-683 WinTV-USB
+-678 WinTV-USB-FM
+-983 WinTV-PVR-250
+-883 WinTV-PVR-PCI
+-993 WinTV-PVR-350
+-893 WinTV-PVR-USB
+-728 WinTV-DVB-C PCI
+-832 MyTV2Go
+-869 MyTV2Go-FM
+-805 MyVideo (USB)
+
+
+Matrix-Vision
+~~~~~~~~~~~~~
+
+Models:
+
+- MATRIX-Vision MV-Delta
+- MATRIX-Vision MV-Delta 2
+- MVsigma-SLC (Bt848)
+
+Conceptronic (.net)
+~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- TVCON FM, TV card w/ FM = CPH05x
+- TVCON = CPH06x
+
+BestData
+~~~~~~~~
+
+Models:
+
+- HCC100 = VCC100rev1 + camera
+- VCC100 rev1 (bt848)
+- VCC100 rev2 (bt878)
+
+Gallant (www.gallantcom.com) www.minton.com.tw
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- Intervision IV-510 (capture only bt8x8)
+- Intervision IV-550 (bt8x8)
+- Intervision IV-100 (zoran)
+- Intervision IV-1000 (bt8x8)
+
+Asonic (www.asonic.com.cn) (website down)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+SkyEye tv 878
+
+Hoontech
+~~~~~~~~
+
+878TV/FM
+
+Teppro (www.itcteppro.com.tw)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- ITC PCITV (Card Ver 1.0) "Teppro TV1/TVFM1 Card"
+- ITC PCITV (Card Ver 2.0)
+- ITC PCITV (Card Ver 3.0) = "PV-BT878P+ (REV.9D)"
+- ITC PCITV (Card Ver 4.0)
+- TEPPRO IV-550 (For BT848 Main Chip)
+- ITC DSTTV (bt878, satellite)
+- ITC VideoMaker (saa7146, StreamMachine sm2110, tvtuner) "PV-SM2210P+ (REV:1C)"
+
+Kworld (www.kworld.com.tw)
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+PC TV Station:
+
+- KWORLD KW-TV878R TV (no radio)
+- KWORLD KW-TV878RF TV (w/ radio)
+- KWORLD KW-TVL878RF (low profile)
+- KWORLD KW-TV713XRF (saa7134)
+
+
+ MPEG TV Station (same cards as above plus WinDVR Software MPEG en/decoder)
+
+- KWORLD KW-TV878R -Pro TV (no Radio)
+- KWORLD KW-TV878RF-Pro TV (w/ Radio)
+- KWORLD KW-TV878R -Ultra TV (no Radio)
+- KWORLD KW-TV878RF-Ultra TV (w/ Radio)
+
+JTT/ Justy Corp.(http://www.jtt.ne.jp/)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+JTT-02 (JTT TV) "TV watchmate pro" (bt848)
+
+ADS www.adstech.com
+~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- Channel Surfer TV ( CHX-950 )
+- Channel Surfer TV+FM ( CHX-960FM )
+
+AVEC www.prochips.com
+~~~~~~~~~~~~~~~~~~~~~
+
+AVEC Intercapture (bt848, tea6320)
+
+NoBrand
+~~~~~~~
+
+TV Excel = Australian Name for "PV-BT878P+ 8E" or "878TV Rev.3\_"
+
+Mach www.machspeed.com
+~~~~~~~~~~~~~~~~~~~~~~
+
+Mach TV 878
+
+Eline www.eline-net.com/
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- Eline Vision TVMaster / TVMaster FM (ELV-TVM/ ELV-TVM-FM) = LR26 (bt878)
+- Eline Vision TVMaster-2000 (ELV-TVM-2000, ELV-TVM-2000-FM)= LR138 (saa713x)
+
+Spirit
+~~~~~~
+
+- Spirit TV Tuner/Video Capture Card (bt848)
+
+Boser www.boser.com.tw
+~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- HS-878 Mini PCI Capture Add-on Card
+- HS-879 Mini PCI 3D Audio and Capture Add-on Card (w/ ES1938 Solo-1)
+
+Satelco www.citycom-gmbh.de, www.satelco.de
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- TV-FM =KNC1 saa7134
+- Standard PCI (DVB-S) = Technotrend Budget
+- Standard PCI (DVB-S) w/ CI
+- Satelco Highend PCI (DVB-S) = Technotrend Premium
+
+
+Sensoray www.sensoray.com
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- Sensoray 311 (PC/104 bus)
+- Sensoray 611 (PCI)
+
+CEI (Chartered Electronics Industries Pte Ltd [CEI] [FCC ID HBY])
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- TV Tuner - HBY-33A-RAFFLES Brooktree Bt848KPF + Philips
+- TV Tuner MG9910 - HBY33A-TVO CEI + Philips SAA7110 + OKI M548262 + ST STV8438CV
+- Primetime TV (ISA)
+
+ - acquired by Singapore Technologies
+ - now operating as Chartered Semiconductor Manufacturing
+ - Manufacturer of video cards is listed as:
+
+ - Cogent Electronics Industries [CEI]
+
+AITech
+~~~~~~
+
+Models:
+
+- Wavewatcher TV (ISA)
+- AITech WaveWatcher TV-PCI = can be LR26 (Bt848) or LR50 (BT878)
+- WaveWatcher TVR-202 TV/FM Radio Card (ISA)
+
+MAXRON
+~~~~~~
+
+Maxron MaxTV/FM Radio (KW-TV878-FNT) = Kworld or JW-TV878-FBK
+
+www.ids-imaging.de
+~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- Falcon Series (capture only)
+
+In USA: http://www.theimagingsource.com/
+- DFG/LC1
+
+www.sknet-web.co.jp
+~~~~~~~~~~~~~~~~~~~
+
+SKnet Monster TV (saa7134)
+
+A-Max www.amaxhk.com (Colormax, Amax, Napa)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+APAC Viewcomp 878
+
+Cybertainment
+~~~~~~~~~~~~~
+
+Models:
+
+- CyberMail AV Video Email Kit w/ PCI Capture Card (capture only)
+- CyberMail Xtreme
+
+These are Flyvideo
+
+VCR (http://www.vcrinc.com/)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Video Catcher 16
+
+Twinhan
+~~~~~~~
+
+Models:
+
+- DST Card/DST-IP (bt878, twinhan asic) VP-1020
+ - Sold as:
+
+ - KWorld DVBS Satellite TV-Card
+ - Powercolor DSTV Satellite Tuner Card
+ - Prolink Pixelview DTV2000
+ - Provideo PV-911 Digital Satellite TV Tuner Card With Common Interface ?
+
+- DST-CI Card (DVB Satellite) VP-1030
+- DCT Card (DVB cable)
+
+MSI
+~~~
+
+Models:
+
+- MSI TV@nywhere Tuner Card (MS-8876) (CX23881/883) Not Bt878 compatible.
+- MS-8401 DVB-S
+
+Focus www.focusinfo.com
+~~~~~~~~~~~~~~~~~~~~~~~
+
+InVideo PCI (bt878)
+
+Sdisilk www.sdisilk.com/
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- SDI Silk 100
+- SDI Silk 200 SDI Input Card
+
+www.euresys.com
+~~~~~~~~~~~~~~~
+
+PICOLO series
+
+PMC/Pace
+~~~~~~~~
+
+www.pacecom.co.uk website closed
+
+Mercury www.kobian.com (UK and FR)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- LR50
+- LR138RBG-Rx == LR138
+
+TEC sound
+~~~~~~~~~
+
+TV-Mate = Zoltrix VP-8482
+
+Through educated googling, found: www.techmakers.com
+
+(package and manuals don't have any other manufacturer info) TecSound
+
+Lorenzen www.lorenzen.de
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+SL DVB-S PCI = Technotrend Budget PCI (su1278 or bsru version)
+
+Origo (.uk) www.origo2000.com
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+PC TV Card = LR50
+
+I/O Magic www.iomagic.com
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+PC PVR - Desktop TV Personal Video Recorder DR-PCTV100 = Pinnacle ROB2D-51009464 4.0 + Cyberlink PowerVCR II
+
+Arowana
+~~~~~~~
+
+TV-Karte / Poso Power TV (?) = Zoltrix VP-8482 (?)
+
+iTVC15 boards
+~~~~~~~~~~~~~
+
+kuroutoshikou.com ITVC15
+yuan.com MPG160 PCI TV (Internal PCI MPEG2 encoder card plus TV-tuner)
+
+Asus www.asuscom.com
+~~~~~~~~~~~~~~~~~~~~
+
+Models:
+
+- Asus TV Tuner Card 880 NTSC (low profile, cx23880)
+- Asus TV (saa7134)
+
+Hoontech
+~~~~~~~~
+
+http://www.hoontech.de/
+
+- HART Vision 848 (H-ART Vision 848)
+- HART Vision 878 (H-Art Vision 878)
+
+
+
+Chips used in bttv devices
+--------------------------
+
+- all boards:
+
+ - Brooktree Bt848/848A/849/878/879: video capture chip
+
+- Board specific
+
+ - Miro PCTV:
+
+ - Philips or Temic Tuner
+
+ - Hauppauge Win/TV pci (version 405):
+
+ - Microchip 24LC02B or Philips 8582E2Y:
+
+ - 256 Byte EEPROM with configuration information
+ - I2C 0xa0-0xa1, (24LC02B also responds to 0xa2-0xaf)
+
+ - Philips SAA5246AGP/E: Videotext decoder chip, I2C 0x22-0x23
+
+ - TDA9800: sound decoder
+
+ - Winbond W24257AS-35: 32Kx8 CMOS static RAM (Videotext buffer mem)
+
+ - 14052B: analog switch for selection of sound source
+
+- PAL:
+
+ - TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners
+ - TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3
+
+- NTSC:
+
+ - TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners
+ - TSA5518: no datasheet available on Philips site
+
+- STB TV pci:
+
+ - ???
+ - if you want better support for STB cards send me info!
+ Look at the board! What chips are on it?
+
+
+
+
+Specs
+-----
+
+Philips http://www.Semiconductors.COM/pip/
+
+Conexant http://www.conexant.com/
+
+Micronas http://www.micronas.com/en/home/index.html
+
+Thanks
+------
+
+Many thanks to:
+
+- Markus Schroeder <schroedm@uni-duesseldorf.de> for information on the Bt848
+ and tuner programming and his control program xtvc.
+
+- Martin Buck <martin-2.buck@student.uni-ulm.de> for his great Videotext
+ package.
+
+- Gerd Hoffmann for the MSP3400 support and the modular
+ I2C, tuner, ... support.
+
+
+- MATRIX Vision for giving us 2 cards for free, which made support of
+ single crystal operation possible.
+
+- MIRO for providing a free PCTV card and detailed information about the
+ components on their cards. (E.g. how the tuner type is detected)
+ Without their card I could not have debugged the NTSC mode.
+
+- Hauppauge for telling how the sound input is selected and what components
+ they do and will use on their radio cards.
+ Also many thanks for faxing me the FM1216 data sheet.
+
+Contributors
+------------
+
+Michael Chu <mmchu@pobox.com>
+ AverMedia fix and more flexible card recognition
+
+Alan Cox <alan@lxorguk.ukuu.org.uk>
+ Video4Linux interface and 2.1.x kernel adaptation
+
+Chris Kleitsch
+ Hardware I2C
+
+Gerd Hoffmann
+ Radio card (ITT sound processor)
+
+bigfoot <bigfoot@net-way.net>
+
+Ragnar Hojland Espinosa <ragnar@macula.net>
+ ConferenceTV card
+
+
++ many more (please mail me if you are missing in this list and would
+ like to be mentioned)
diff --git a/Documentation/admin-guide/media/building.rst b/Documentation/admin-guide/media/building.rst
new file mode 100644
index 000000000000..2d660b76caea
--- /dev/null
+++ b/Documentation/admin-guide/media/building.rst
@@ -0,0 +1,357 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Building support for a media device
+===================================
+
+The first step is to download the Kernel's source code, either via a
+distribution-specific source file or via the Kernel's main git tree\ [1]_.
+
+Please note, however, that if:
+
+- you're a braveheart and want to experiment with new stuff;
+- you want to report a bug;
+- you're developing new patches
+
+you should use the main media development tree ``master`` branch:
+
+ https://git.linuxtv.org/media_tree.git/
+
+In this case, you may find some useful information at the
+`LinuxTV wiki pages <https://linuxtv.org/wiki>`_:
+
+ https://linuxtv.org/wiki/index.php/How_to_Obtain,_Build_and_Install_V4L-DVB_Device_Drivers
+
+.. [1] The upstream Linux Kernel development tree is located at
+
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
+
+Configuring the Linux Kernel
+============================
+
+You can access a menu of Kernel building options with::
+
+ $ make menuconfig
+
+Then, select all desired options and exit, saving the configuration.
+
+The changed configuration will be stored in the ``.config`` file. It will
+look like::
+
+ ...
+ # CONFIG_RC_CORE is not set
+ # CONFIG_CEC_CORE is not set
+ CONFIG_MEDIA_SUPPORT=m
+ CONFIG_MEDIA_SUPPORT_FILTER=y
+ ...
+
+The media subsystem is controlled by these menu configuration options::
+
+ Device Drivers --->
+ <M> Remote Controller support --->
+ [ ] HDMI CEC RC integration
+ [ ] Enable CEC error injection support
+ [*] HDMI CEC drivers --->
+ <*> Multimedia support --->
+
+The ``Remote Controller support`` option enables the core support for
+remote controllers\ [2]_.
+
+The ``HDMI CEC RC integration`` option enables integration of HDMI CEC
+with Linux, allowing data received via HDMI CEC to be handled as if it were
+produced by a remote controller directly connected to the machine.
+
+The ``HDMI CEC drivers`` option allows selecting platform and USB drivers
+that receive and/or transmit CEC codes via HDMI interfaces\ [3]_.
+
+The last option (``Multimedia support``) enables support for cameras,
+audio/video grabbers and TV.
+
+The media subsystem support can either be built together with the main
+Kernel or as a module. For most use cases, it is preferred to have it
+built as modules.
+
+.. note::
+
+ Instead of using a menu, the Kernel provides a script which allows
+ enabling configuration options directly. To enable media support
+ and remote controller support using Kernel modules, you could use::
+
+ $ scripts/config -m RC_CORE
+ $ scripts/config -m MEDIA_SUPPORT
+
+.. [2] ``Remote Controller support`` should also be enabled if you
+ want to use some TV card drivers that may depend on the remote
+ controller core support.
+
+.. [3] Please note that the DRM subsystem also has drivers for GPUs
+ that use the media HDMI CEC support.
+
+ Those GPU-specific drivers are selected via the ``Graphics support``
+ menu, under ``Device Drivers``.
+
+ When a GPU driver supports HDMI CEC, it will automatically
+ enable the CEC core support in the media subsystem.
+
+Media dependencies
+------------------
+
+Note that enabling the above from a clean config is usually not enough.
+The media subsystem depends on several other Linux core features in
+order to work.
+
+For example, most media devices use a serial communication bus in
+order to talk with some peripherals. This bus is called I²C
+(Inter-Integrated Circuit). In order to be able to build support
+for such hardware, the I²C bus support should be enabled, either via
+menu or with::
+
+ ./scripts/config -m I2C
+
+Another example: the remote controller core requires support for
+input devices, which can be enabled with::
+
+ ./scripts/config -m INPUT
+
+Other core functionality may also be needed (like PCI and/or USB support),
+depending on the specific driver(s) you would like to enable.
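+
+As a minimal sketch, assuming you are at the top of the Kernel source tree
+and a ``.config`` file already exists, the state of the dependencies
+mentioned above can be inspected with ``grep``. After changing options with
+``scripts/config``, running ``make olddefconfig`` lets Kconfig fill in
+default values for any option that is still unset. The list of symbols
+below is only an illustration; the ones you need depend on your driver::
+
+ $ grep -E '^CONFIG_(I2C|INPUT|PCI|USB)=' .config
+ $ make olddefconfig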
+
+Enabling Remote Controller Support
+----------------------------------
+
+The remote controller menu allows selecting drivers for specific devices.
+Its menu looks like this::
+
+ --- Remote Controller support
+ <M> Compile Remote Controller keymap modules
+ [*] LIRC user interface
+ [*] Support for eBPF programs attached to lirc devices
+ [*] Remote controller decoders --->
+ [*] Remote Controller devices --->
+
+The ``Compile Remote Controller keymap modules`` option creates key maps for
+several popular remote controllers.
+
+The ``LIRC user interface`` option adds enhanced functionality when using the
+``lirc`` program, by enabling an API that allows userspace to receive raw data
+from remote controllers.
+
+The ``Support for eBPF programs attached to lirc devices`` option allows
+the usage of special programs (called eBPF) that allow applications
+to add extra remote controller decoding functionality to the Linux Kernel.
+
+The ``Remote controller decoders`` option allows selecting the
+protocols that will be recognized by the Linux Kernel. Unless you
+want to disable some specific decoder, it is suggested to keep all
+sub-options enabled.
+
+The ``Remote Controller devices`` option allows you to select the drivers
+that are needed to support your device.
+
+The same configuration can also be set via the ``scripts/config``
+script. So, for instance, in order to support the ITE remote controller
+driver (found on Intel NUCs and on some ASUS x86 desktops), you could do::
+
+ $ scripts/config -e INPUT
+ $ scripts/config -e ACPI
+ $ scripts/config -e MODULES
+ $ scripts/config -m RC_CORE
+ $ scripts/config -e RC_DEVICES
+ $ scripts/config -e RC_DECODERS
+ $ scripts/config -m IR_RC5_DECODER
+ $ scripts/config -m IR_ITE_CIR
+
+Enabling HDMI CEC Support
+-------------------------
+
+The HDMI CEC support is set automatically when a driver requires it. So,
+all you need to do is to enable support either for a graphics card
+that needs it or for one of the existing HDMI drivers.
+
+The HDMI-specific drivers are available at the ``HDMI CEC drivers``
+menu\ [4]_::
+
+ --- HDMI CEC drivers
+ < > ChromeOS EC CEC driver
+ < > Amlogic Meson AO CEC driver
+ < > Amlogic Meson G12A AO CEC driver
+ < > Generic GPIO-based CEC driver
+ < > Samsung S5P CEC driver
+ < > STMicroelectronics STiH4xx HDMI CEC driver
+ < > STMicroelectronics STM32 HDMI CEC driver
+ < > Tegra HDMI CEC driver
+ < > SECO Boards HDMI CEC driver
+ [ ] SECO Boards IR RC5 support
+ < > Pulse Eight HDMI CEC
+ < > RainShadow Tech HDMI CEC
+
+.. [4] The above content is just an example. The actual options for
+ HDMI devices depend on the system's architecture and may vary
+ on newer Kernels.
+
+Enabling Media Support
+----------------------
+
+The Media menu has a lot more options than the remote controller menu.
+Once selected, you should see the following options::
+
+ --- Media support
+ [ ] Filter media drivers
+ [*] Autoselect ancillary drivers
+ Media device types --->
+ Media core support --->
+ Video4Linux options --->
+ Media controller options --->
+ Digital TV options --->
+ HDMI CEC options --->
+ Media drivers --->
+ Media ancillary drivers --->
+
+Unless you know exactly what you're doing, or you want to build
+a driver for a SoC platform, it is strongly recommended to keep the
+``Autoselect ancillary drivers`` option turned on, as it will auto-select
+the needed I²C ancillary drivers.
+
+There are now two ways to select media device drivers, as described
+below.
+
+``Filter media drivers`` menu
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This menu is meant to ease setup for PC and laptop hardware. It works
+by letting the user specify what kinds of media drivers are desired,
+with these options::
+
+ [ ] Cameras and video grabbers
+ [ ] Analog TV
+ [ ] Digital TV
+ [ ] AM/FM radio receivers/transmitters
+ [ ] Software defined radio
+ [ ] Platform-specific devices
+ [ ] Test drivers
+
+So, if you want to add support for a camera or video grabber only,
+select just the first option. Multiple options are allowed.
+
+Once the options on this menu are selected, the build system will
+auto-select the needed core drivers in order to support the selected
+functionality.
+
+.. note::
+
+ Most TV cards are hybrid: they support both Analog TV and Digital TV.
+
+ If you have a hybrid card, you may need to enable both ``Analog TV``
+ and ``Digital TV`` in the menu.
+
+When using this option, the defaults for the media support core
+functionality are usually good enough to provide the basic functionality
+for the driver. Yet, you could manually enable some desired extra (optional)
+functionality using the settings under each of the following
+``Media support`` sub-menus::
+
+ Media core support --->
+ Video4Linux options --->
+ Media controller options --->
+ Digital TV options --->
+ HDMI CEC options --->
+
+Once you select the desired filters, the drivers that match the filtering
+criteria will be available in the ``Media support->Media drivers`` sub-menu.
+
+``Media Core Support`` menu without filtering
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you disable the ``Filter media drivers`` menu, all drivers available
+for your system whose dependencies are met should be shown at the
+``Media drivers`` menu.
+
+Please notice, however, that you should first ensure that the
+``Media Core Support`` menu has all the core functionality your drivers
+would need, as otherwise the corresponding device drivers won't be shown.
+
+Example
+-------
+
+In order to enable modular support for one of the boards listed on
+:doc:`this table <cx231xx-cardlist>`, with modular media core modules, the
+``.config`` file should contain these lines::
+
+ CONFIG_MODULES=y
+ CONFIG_USB=y
+ CONFIG_I2C=y
+ CONFIG_INPUT=y
+ CONFIG_RC_CORE=m
+ CONFIG_MEDIA_SUPPORT=m
+ CONFIG_MEDIA_SUPPORT_FILTER=y
+ CONFIG_MEDIA_ANALOG_TV_SUPPORT=y
+ CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y
+ CONFIG_MEDIA_USB_SUPPORT=y
+ CONFIG_VIDEO_CX231XX=y
+ CONFIG_VIDEO_CX231XX_DVB=y
+
+Building and installing a new Kernel
+====================================
+
+Once the ``.config`` file has everything needed, all it takes to build
+is to run the ``make`` command::
+
+ $ make
+
+And then install the new Kernel and its modules::
+
+ $ sudo make modules_install
+ $ sudo make install
+
+Building just the new media drivers and core
+============================================
+
+Running a new development Kernel from the development tree is usually risky,
+because it may have experimental changes that may have bugs. So, there are
+some ways to build just the new drivers, using alternative trees.
+
+There is the `Linux Kernel backports project
+<https://backports.wiki.kernel.org/index.php/Main_Page>`_, which contains
+newer drivers meant to be compiled against stable Kernels.
+
+The LinuxTV developers, who are responsible for maintaining the media
+subsystem, also maintain a backport tree, with just the media drivers
+updated daily from the newest Kernel. This tree is available at:
+
+https://git.linuxtv.org/media_build.git/
+
+Note that, while it should be relatively safe to use the
+``media_build`` tree for testing purposes, there are no warranties that
+it will work (or even build) on a random Kernel. This tree is maintained
+using a "best-efforts" principle, as time permits us to fix issues there.
+
+If you notice anything wrong with it, feel free to submit patches to the
+Linux media subsystem's mailing list: linux-media@vger.kernel.org. Please
+add ``[PATCH media-build]`` to the e-mail's subject if you submit a new
+patch for the media-build.
+
+Before using it, you should run::
+
+ $ ./build
+
+.. note::
+
+ 1) you may need to run it twice if the ``media-build`` tree gets
+ updated;
+ 2) you may need to do a ``make distclean`` if you had built it
+ in the past for a different Kernel version than the one you're
+ currently using;
+ 3) by default, it will use the same config options for media as
+ the ones defined on the Kernel you're running.
+
+In order to select different drivers or different config options,
+use::
+
+ $ make menuconfig
+
+Then, you can build and install the new drivers::
+
+ $ make && sudo make install
+
+This will override the previous media drivers that your Kernel was
+using.
diff --git a/Documentation/admin-guide/media/cafe_ccic.rst b/Documentation/admin-guide/media/cafe_ccic.rst
new file mode 100644
index 000000000000..ff7fbce1342a
--- /dev/null
+++ b/Documentation/admin-guide/media/cafe_ccic.rst
@@ -0,0 +1,62 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The cafe_ccic driver
+====================
+
+Author: Jonathan Corbet <corbet@lwn.net>
+
+Introduction
+------------
+
+"cafe_ccic" is a driver for the Marvell 88ALP01 "cafe" CMOS camera
+controller. This is the controller found in first-generation OLPC systems,
+and this driver was written with support from the OLPC project.
+
+Current status: the core driver works. It can generate data in YUV422,
+RGB565, and RGB444 formats. (Anybody looking at the code will see RGB32 as
+well, but that is a debugging aid which will be removed shortly). VGA and
+QVGA modes work; CIF is there but the colors remain funky. Only the OV7670
+sensor is known to work with this controller at this time.
+
+To try it out: either of these commands will work:
+
+.. code-block:: none
+
+ $ mplayer tv:// -tv driver=v4l2:width=640:height=480 -nosound
+ $ mplayer tv:// -tv driver=v4l2:width=640:height=480:outfmt=bgr16 -nosound
+
+The "xawtv" utility also works; gqcam does not, for unknown reasons.
+
+Load time options
+-----------------
+
+There are a few load-time options, most of which can be changed after
+loading via sysfs as well:
+
+ - alloc_bufs_at_load: Normally, the driver will not allocate any DMA
+ buffers until the time comes to transfer data. If this option is set,
+ then worst-case-sized buffers will be allocated at module load time.
+ This option nails down the memory for the life of the module, but
+ perhaps decreases the chances of an allocation failure later on.
+
+ - dma_buf_size: The size of DMA buffers to allocate. Note that this
+ option is only consulted for load-time allocation; when buffers are
+ allocated at run time, they will be sized appropriately for the current
+ camera settings.
+
+ - n_dma_bufs: The controller can cycle through either two or three DMA
+ buffers. Normally, the driver tries to use three buffers; on faster
+ systems, however, it will work well with only two.
+
+ - min_buffers: The minimum number of streaming I/O buffers that the driver
+ will consent to work with. Default is one, but, on slower systems,
+ better behavior with mplayer can be achieved by setting to a higher
+ value (like six).
+
+ - max_buffers: The maximum number of streaming I/O buffers; default is
+ ten. That number was carefully picked out of a hat and should not be
+ assumed to actually mean much of anything.
+
+ - flip: If this boolean parameter is set, the sensor will be instructed to
+ invert the video image. Whether it makes sense is determined by how
+ your particular camera is mounted.
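+
+As a minimal sketch, assuming the driver is built as a module named
+``cafe_ccic`` and that the values shown suit your system (they are
+illustrative only, not recommendations), the options above can be passed
+on the ``modprobe`` command line:
+
+.. code-block:: none
+
+ $ sudo modprobe cafe_ccic alloc_bufs_at_load=1 n_dma_bufs=2 min_buffers=6
+
+Options that can be changed after loading would then normally appear under
+``/sys/module/cafe_ccic/parameters/``.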
diff --git a/Documentation/admin-guide/media/cardlist.rst b/Documentation/admin-guide/media/cardlist.rst
new file mode 100644
index 000000000000..5b38bfd6a19d
--- /dev/null
+++ b/Documentation/admin-guide/media/cardlist.rst
@@ -0,0 +1,29 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========
+Cards List
+==========
+
+The media subsystem provides support for lots of PCI and USB drivers, plus
+platform-specific drivers. It also contains several ancillary I²C drivers.
+
+The platform-specific drivers are usually present on embedded systems,
+or are supported by the main board. Usually, setting them is done via
+OpenFirmware or ACPI.
+
+The PCI and USB drivers, however, are independent of the system's board,
+and may be added/removed by the user.
+
+You may also take a look at
+https://linuxtv.org/wiki/index.php/Hardware_Device_Information
+for more details about supported cards.
+
+.. toctree::
+ :maxdepth: 2
+
+ usb-cardlist
+ pci-cardlist
+ platform-cardlist
+ radio-cardlist
+ i2c-cardlist
+ misc-cardlist
diff --git a/Documentation/admin-guide/media/cec-drivers.rst b/Documentation/admin-guide/media/cec-drivers.rst
new file mode 100644
index 000000000000..8d9686c08df9
--- /dev/null
+++ b/Documentation/admin-guide/media/cec-drivers.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================================
+CEC driver-specific documentation
+=================================
+
+.. toctree::
+ :maxdepth: 2
+
+ pulse8-cec
diff --git a/Documentation/admin-guide/media/ci.rst b/Documentation/admin-guide/media/ci.rst
new file mode 100644
index 000000000000..ded4d8fbbf92
--- /dev/null
+++ b/Documentation/admin-guide/media/ci.rst
@@ -0,0 +1,77 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Digital TV Conditional Access Interface
+=======================================
+
+
+.. note::
+
+ This documentation is outdated.
+
+This document describes the usage of the high level CI API, in
+accordance with the Linux DVB API. This is not documentation for the
+existing low level CI API.
+
+.. note::
+
+ For the Twinhan/Twinhan clones, the dst_ca module handles the CI
+ hardware. This module is loaded automatically if a CI
+ (Common Interface, which holds the CAM (Conditional Access Module))
+ is detected.
+
+ca_zap
+~~~~~~
+
+A userspace application, like ``ca_zap``, is required to handle encrypted
+MPEG-TS streams.
+
+The ``ca_zap`` userland application is in charge of sending the
+descrambling related information to the Conditional Access Module (CAM).
+
+This application requires the following to function properly as of now.
+
+a) Tune to a valid channel, with szap.
+
+ eg: $ szap -c channels.conf -r "TMC" -x
+
+b) a channels.conf containing a valid PMT PID
+
+ eg: TMC:11996:h:0:27500:278:512:650:321
+
+ here 278 is a valid PMT PID. the rest of the values are the
+ same ones that szap uses.
+
+c) after running szap, you have to run ca_zap for the
+ descrambler to function:
+
+ eg: $ ca_zap channels.conf "TMC"
+
+d) Hopefully enjoy your favourite subscribed channel as you do with
+ a FTA card.
+
+.. note::
+
+ Currently, both ca_zap and dst_test are meant for demonstration
+ purposes only; they can become full-fledged applications if necessary.
+
+
+Cards that fall in this category
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+At present, the cards that fall in this category are the Twinhan and its
+clones. These cards are available as VVMER, Tomato, Hercules, Orange and
+so on.
+
+CI modules that are supported
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The CI module support is largely dependent upon the firmware on the cards.
+Some cards do support almost all of the available CI modules. There is
+not much that can be done in order to make additional CI modules
+work with these cards.
+
+Modules that have been tested with this driver at present are:
+
+(1) Irdeto 1 and 2 from SCM
+(2) Viaccess from SCM
+(3) Dragoncam
diff --git a/Documentation/admin-guide/media/cpia2.rst b/Documentation/admin-guide/media/cpia2.rst
new file mode 100644
index 000000000000..f6ffef686462
--- /dev/null
+++ b/Documentation/admin-guide/media/cpia2.rst
@@ -0,0 +1,145 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The cpia2 driver
+================
+
+Authors: Peter Pregler <Peter_Pregler@email.com>,
+Scott J. Bertin <scottbertin@yahoo.com>, and
+Jarl Totland <Jarl.Totland@bdc.no> for the original cpia driver, which
+this one was modelled from.
+
+Introduction
+------------
+
+This is a driver for STMicroelectronics's CPiA2 (second generation
+Colour Processor Interface ASIC) based cameras. This camera outputs an MJPEG
+stream at up to VGA size. It implements the Video4Linux interface as much as
+possible. Since the V4L interface does not support compressed formats, only
+an MJPEG-enabled application can be used with the camera. We have modified the
+gqcam application to view this stream.
+
+The driver is implemented as two kernel modules. The cpia2 module
+contains the camera functions and the V4L interface. The cpia2_usb module
+contains USB-specific functions. The main reason for this was that the size
+of the module was getting out of hand, so I separated them. It is not likely
+that there will be a parallel port version.
+
+Features
+--------
+
+- Supports cameras with the Vision stv6410 (CIF) and stv6500 (VGA) CMOS
+ sensors. I only have the VGA sensor, so can't test the other.
+- Image formats: VGA, QVGA, CIF, QCIF, and a number of sizes in between.
+ VGA and QVGA are the native image sizes for the VGA camera. CIF is done
+ in the coprocessor by scaling QVGA. All other sizes are done by clipping.
+- Palette: YCrCb, compressed with MJPEG.
+- Some compression parameters are settable.
+- Sensor framerate is adjustable (up to 30 fps CIF, 15 fps VGA).
+- Adjust brightness, color, contrast while streaming.
+- Flicker control settable for 50 or 60 Hz mains frequency.
+
+Making and installing the stv672 driver modules
+-----------------------------------------------
+
+Requirements
+~~~~~~~~~~~~
+
+Video4Linux must be either compiled into the kernel or
+available as a module. Video4Linux2 is automatically detected and made
+available at compile time.
+
+Setup
+~~~~~
+
+Use ``modprobe cpia2`` to load and ``modprobe -r cpia2`` to unload. This
+may be done automatically by your distribution.
+
+Driver options
+~~~~~~~~~~~~~~
+
+.. tabularcolumns:: |p{13ex}|L|
+
+
+============== ========================================================
+Option         Description
+============== ========================================================
+video_nr       video device to register (0=/dev/video0, etc)
+               range -1 to 64. default is -1 (first available)
+               If you have more than 1 camera, this MUST be -1.
+buffer_size    Size for each frame buffer in bytes (default 68k)
+num_buffers    Number of frame buffers (1-32, default 3)
+alternate      USB Alternate (2-7, default 7)
+flicker_freq   Frequency for flicker reduction (50 or 60, default 60)
+flicker_mode   0 to disable, or 1 to enable flicker reduction.
+               (default 0). This is only effective if the camera
+               uses a stv0672 coprocessor.
+============== ========================================================
+
+Setting the options
+~~~~~~~~~~~~~~~~~~~
+
+If you are using modules, edit /etc/modules.conf and add an options
+line like this::
+
+ options cpia2 num_buffers=3 buffer_size=65535
+
+If the driver is compiled into the kernel, at boot time specify them
+like this::
+
+ cpia2.num_buffers=3 cpia2.buffer_size=65535
+
+What buffer size should I use?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The maximum image size depends on the alternate you choose, and the
+frame rate achieved by the camera. If the compression engine is able to
+keep up with the frame rate, the maximum image size is given by the table
+below.
+
+The compression engine starts out at maximum compression, and will
+increase image quality until it is close to the size in the table. As long
+as the compression engine can keep up with the frame rate, after a short time
+the images will all be about the size in the table, regardless of resolution.
+
+At low alternate settings, the compression engine may not be able to
+compress the image enough and will reduce the frame rate by producing larger
+images.
+
+The default of 68k should be good for most users. This will handle
+any alternate at frame rates down to 15fps. For lower frame rates, it may
+be necessary to increase the buffer size to avoid having frames dropped due
+to insufficient space.
+
+========== ========== ======== =====
+Alternate  bytes/ms   15fps    30fps
+========== ========== ======== =====
+2          128        8533     4267
+3          384        25600    12800
+4          640        42667    21333
+5          768        51200    25600
+6          896        59733    29867
+7          1023       68200    34100
+========== ========== ======== =====
+
+Table: Image size(bytes)
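+
+The entries in the table appear to follow directly from the bandwidth of
+each alternate. As a worked check (an inference from the numbers above, not
+a statement taken from the driver source), the largest image that fits in
+one frame period would be:
+
+.. math::
+
+ \text{image size (bytes)} = \frac{\text{bytes/ms} \times 1000}{\text{fps}}
+
+For example, alternate 7 at 15 fps gives
+:math:`1023 \times 1000 / 15 = 68200` bytes, which is in line with the
+default ``buffer_size`` of 68k, and alternate 2 at 30 fps gives
+:math:`128 \times 1000 / 30 \approx 4267` bytes.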
+
+
+How many buffers should I use?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For normal streaming, 3 should give the best results. With only 2,
+it is possible for the camera to finish sending one image just after a
+program has started reading the other. If this happens, the driver must drop
+a frame. The exception to this is if you have a heavily loaded machine. In
+this case use 2 buffers. You are probably not reading at the full frame rate.
+If the camera can send multiple images before a read finishes, it could
+overwrite the third buffer before the read finishes, leading to a corrupt
+image. Single and double buffering have extra checks to avoid overwriting.
+
+Using the camera
+~~~~~~~~~~~~~~~~
+
+We are providing a modified gqcam application to view the output. In
+order to avoid confusion, here it is called mview. There is also the qx5view
+program which can also control the lights on the qx5 microscope. MJPEG Tools
+(http://mjpeg.sourceforge.net) can also be used to record from the camera.
diff --git a/Documentation/admin-guide/media/cx18-cardlist.rst b/Documentation/admin-guide/media/cx18-cardlist.rst
new file mode 100644
index 000000000000..26f2da9aa542
--- /dev/null
+++ b/Documentation/admin-guide/media/cx18-cardlist.rst
@@ -0,0 +1,17 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+CX18 cards list
+===============
+
+These cards are supported by the cx18 driver:
+
+- Hauppauge HVR-1600 (ESMT memory)
+- Hauppauge HVR-1600 (Samsung memory)
+- Compro VideoMate H900
+- Yuan MPC718 MiniPCI DVB-T/Analog
+- Conexant Raptor PAL/SECAM
+- Toshiba Qosmio DVB-T/Analog
+- Leadtek WinFast PVR2100
+- Leadtek WinFast DVR3100
+- GoTView PCI DVD3 Hybrid
+- Hauppauge HVR-1600 (s5h1411/tda18271)
diff --git a/Documentation/admin-guide/media/cx231xx-cardlist.rst b/Documentation/admin-guide/media/cx231xx-cardlist.rst
new file mode 100644
index 000000000000..d374101be047
--- /dev/null
+++ b/Documentation/admin-guide/media/cx231xx-cardlist.rst
@@ -0,0 +1,99 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+cx231xx cards list
+==================
+
+.. tabularcolumns:: |p{1.4cm}|p{10.0cm}|p{6.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 12 19
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - USB IDs
+ * - 0
+ - Unknown CX231xx video grabber
+ - 0572:5A3C
+ * - 1
+ - Conexant Hybrid TV - CARRAERA
+ - 0572:58A2
+ * - 2
+ - Conexant Hybrid TV - SHELBY
+ - 0572:58A1
+ * - 3
+ - Conexant Hybrid TV - RDE253S
+ - 0572:58A4
+ * - 4
+ - Conexant Hybrid TV - RDU253S
+ - 0572:58A5
+ * - 5
+ - Conexant VIDEO GRABBER
+ - 0572:58A6, 07ca:c039
+ * - 6
+ - Conexant Hybrid TV - rde 250
+ - 0572:589E
+ * - 7
+ - Conexant Hybrid TV - RDU 250
+ - 0572:58A0
+ * - 8
+ - Hauppauge EXETER
+ - 2040:b120, 2040:b140
+ * - 9
+ - Hauppauge USB Live 2
+ - 2040:c200
+ * - 10
+ - Pixelview PlayTV USB Hybrid
+ - 4000:4001
+ * - 11
+ - Pixelview Xcapture USB
+ - 1D19:6109, 4000:4001
+ * - 12
+ - Kworld UB430 USB Hybrid
+ - 1b80:e424
+ * - 13
+ - Iconbit Analog Stick U100 FM
+ - 1f4d:0237
+ * - 14
+ - Hauppauge WinTV USB2 FM (PAL)
+ - 2040:b110
+ * - 15
+ - Hauppauge WinTV USB2 FM (NTSC)
+ - 2040:b111
+ * - 16
+ - Elgato Video Capture V2
+ - 0fd9:0037
+ * - 17
+ - Geniatech OTG102
+ - 1f4d:0102
+ * - 18
+ - Kworld UB445 USB Hybrid
+ - 1b80:e421
+ * - 19
+ - Hauppauge WinTV 930C-HD (1113xx) / HVR-900H (111xxx) / PCTV QuatroStick 521e
+ - 2040:b130, 2040:b138, 2013:0259
+ * - 20
+ - Hauppauge WinTV 930C-HD (1114xx) / HVR-901H (1114xx) / PCTV QuatroStick 522e
+ - 2040:b131, 2040:b139, 2013:025e
+ * - 21
+ - Hauppauge WinTV-HVR-955Q (111401)
+ - 2040:b123, 2040:b124
+ * - 22
+ - Terratec Grabby
+ - 1f4d:0102
+ * - 23
+ - Evromedia USB Full Hybrid Full HD
+ - 1b80:d3b2
+ * - 24
+ - Astrometa T2hybrid
+ - 15f4:0135
+ * - 25
+ - The Imaging Source DFG/USB2pro
+ - 199e:8002
+ * - 26
+ - Hauppauge WinTV-HVR-935C
+ - 2040:b151
+ * - 27
+ - Hauppauge WinTV-HVR-975
+ - 2040:b150
diff --git a/Documentation/admin-guide/media/cx23885-cardlist.rst b/Documentation/admin-guide/media/cx23885-cardlist.rst
new file mode 100644
index 000000000000..c47514fead33
--- /dev/null
+++ b/Documentation/admin-guide/media/cx23885-cardlist.rst
@@ -0,0 +1,267 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+cx23885 cards list
+==================
+
+.. tabularcolumns:: |p{1.4cm}|p{11.1cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - PCI subsystem IDs
+
+ * - 0
+ - UNKNOWN/GENERIC
+ - 0070:3400
+
+ * - 1
+ - Hauppauge WinTV-HVR1800lp
+ - 0070:7600
+
+ * - 2
+ - Hauppauge WinTV-HVR1800
+ - 0070:7800, 0070:7801, 0070:7809
+
+ * - 3
+ - Hauppauge WinTV-HVR1250
+ - 0070:7911
+
+ * - 4
+ - DViCO FusionHDTV5 Express
+ - 18ac:d500
+
+ * - 5
+ - Hauppauge WinTV-HVR1500Q
+ - 0070:7790, 0070:7797
+
+ * - 6
+ - Hauppauge WinTV-HVR1500
+ - 0070:7710, 0070:7717
+
+ * - 7
+ - Hauppauge WinTV-HVR1200
+ - 0070:71d1, 0070:71d3
+
+ * - 8
+ - Hauppauge WinTV-HVR1700
+ - 0070:8101
+
+ * - 9
+ - Hauppauge WinTV-HVR1400
+ - 0070:8010
+
+ * - 10
+ - DViCO FusionHDTV7 Dual Express
+ - 18ac:d618
+
+ * - 11
+ - DViCO FusionHDTV DVB-T Dual Express
+ - 18ac:db78
+
+ * - 12
+ - Leadtek Winfast PxDVR3200 H
+ - 107d:6681
+
+ * - 13
+ - Compro VideoMate E650F
+ - 185b:e800
+
+ * - 14
+ - TurboSight TBS 6920
+ - 6920:8888
+
+ * - 15
+ - TeVii S470
+ - d470:9022
+
+ * - 16
+ - DVBWorld DVB-S2 2005
+ - 0001:2005
+
+ * - 17
+ - NetUP Dual DVB-S2 CI
+ - 1b55:2a2c
+
+ * - 18
+ - Hauppauge WinTV-HVR1270
+ - 0070:2211
+
+ * - 19
+ - Hauppauge WinTV-HVR1275
+ - 0070:2215, 0070:221d, 0070:22f2
+
+ * - 20
+ - Hauppauge WinTV-HVR1255
+ - 0070:2251, 0070:22f1
+
+ * - 21
+ - Hauppauge WinTV-HVR1210
+ - 0070:2291, 0070:2295, 0070:2299, 0070:229d, 0070:22f0, 0070:22f3, 0070:22f4, 0070:22f5
+
+ * - 22
+ - Mygica X8506 DMB-TH
+ - 14f1:8651
+
+ * - 23
+ - Magic-Pro ProHDTV Extreme 2
+ - 14f1:8657
+
+ * - 24
+ - Hauppauge WinTV-HVR1850
+ - 0070:8541
+
+ * - 25
+ - Compro VideoMate E800
+ - 1858:e800
+
+ * - 26
+ - Hauppauge WinTV-HVR1290
+ - 0070:8551
+
+ * - 27
+ - Mygica X8558 PRO DMB-TH
+ - 14f1:8578
+
+ * - 28
+ - LEADTEK WinFast PxTV1200
+ - 107d:6f22
+
+ * - 29
+ - GoTView X5 3D Hybrid
+ - 5654:2390
+
+ * - 30
+ - NetUP Dual DVB-T/C-CI RF
+ - 1b55:e2e4
+
+ * - 31
+ - Leadtek Winfast PxDVR3200 H XC4000
+ - 107d:6f39
+
+ * - 32
+ - MPX-885
+ -
+
+ * - 33
+ - Mygica X8502/X8507 ISDB-T
+ - 14f1:8502
+
+ * - 34
+ - TerraTec Cinergy T PCIe Dual
+ - 153b:117e
+
+ * - 35
+ - TeVii S471
+ - d471:9022
+
+ * - 36
+ - Hauppauge WinTV-HVR1255
+ - 0070:2259
+
+ * - 37
+ - Prof Revolution DVB-S2 8000
+ - 8000:3034
+
+ * - 38
+ - Hauppauge WinTV-HVR4400/HVR5500
+ - 0070:c108, 0070:c138, 0070:c1f8
+
+ * - 39
+ - AVerTV Hybrid Express Slim HC81R
+ - 1461:d939
+
+ * - 40
+ - TurboSight TBS 6981
+ - 6981:8888
+
+ * - 41
+ - TurboSight TBS 6980
+ - 6980:8888
+
+ * - 42
+ - Leadtek Winfast PxPVR2200
+ - 107d:6f21
+
+ * - 43
+ - Hauppauge ImpactVCB-e
+ - 0070:7133, 0070:7137
+
+ * - 44
+ - DViCO FusionHDTV DVB-T Dual Express2
+ - 18ac:db98
+
+ * - 45
+ - DVBSky T9580
+ - 4254:9580
+
+ * - 46
+ - DVBSky T980C
+ - 4254:980c
+
+ * - 47
+ - DVBSky S950C
+ - 4254:950c
+
+ * - 48
+ - Technotrend TT-budget CT2-4500 CI
+ - 13c2:3013
+
+ * - 49
+ - DVBSky S950
+ - 4254:0950
+
+ * - 50
+ - DVBSky S952
+ - 4254:0952
+
+ * - 51
+ - DVBSky T982
+ - 4254:0982
+
+ * - 52
+ - Hauppauge WinTV-HVR5525
+ - 0070:f038
+
+ * - 53
+ - Hauppauge WinTV Starburst
+ - 0070:c12a
+
+ * - 54
+ - ViewCast 260e
+ - 1576:0260
+
+ * - 55
+ - ViewCast 460e
+ - 1576:0460
+
+ * - 56
+ - Hauppauge WinTV-QuadHD-DVB
+ - 0070:6a28, 0070:6b28
+
+ * - 57
+ - Hauppauge WinTV-QuadHD-ATSC
+ - 0070:6a18, 0070:6b18
+
+ * - 58
+ - Hauppauge WinTV-HVR-1265(161111)
+ - 0070:2a18
+
+ * - 59
+ - Hauppauge WinTV-Starburst2
+ - 0070:f02a
+
+ * - 60
+ - Hauppauge WinTV-QuadHD-DVB(885)
+ -
+
+ * - 61
+ - Hauppauge WinTV-QuadHD-ATSC(885)
+ -
+
+ * - 62
+ - AVerMedia CE310B
+ - 1461:3100
diff --git a/Documentation/admin-guide/media/cx88-cardlist.rst b/Documentation/admin-guide/media/cx88-cardlist.rst
new file mode 100644
index 000000000000..76dc9a14cf91
--- /dev/null
+++ b/Documentation/admin-guide/media/cx88-cardlist.rst
@@ -0,0 +1,383 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+CX88 cards list
+===============
+
+.. tabularcolumns:: |p{1.4cm}|p{11.1cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - PCI subsystem IDs
+
+ * - 0
+ - UNKNOWN/GENERIC
+ -
+
+ * - 1
+ - Hauppauge WinTV 34xxx models
+ - 0070:3400, 0070:3401
+
+ * - 2
+ - GDI Black Gold
+ - 14c7:0106, 14c7:0107
+
+ * - 3
+ - PixelView
+ - 1554:4811
+
+ * - 4
+ - ATI TV Wonder Pro
+ - 1002:00f8, 1002:00f9
+
+ * - 5
+ - Leadtek Winfast 2000XP Expert
+ - 107d:6611, 107d:6613
+
+ * - 6
+ - AverTV Studio 303 (M126)
+ - 1461:000b
+
+ * - 7
+ - MSI TV-@nywhere Master
+ - 1462:8606
+
+ * - 8
+ - Leadtek Winfast DV2000
+ - 107d:6620, 107d:6621
+
+ * - 9
+ - Leadtek PVR 2000
+ - 107d:663b, 107d:663c, 107d:6632, 107d:6630, 107d:6638, 107d:6631, 107d:6637, 107d:663d
+
+ * - 10
+ - IODATA GV-VCP3/PCI
+ - 10fc:d003
+
+ * - 11
+ - Prolink PlayTV PVR
+ -
+
+ * - 12
+ - ASUS PVR-416
+ - 1043:4823, 1461:c111
+
+ * - 13
+ - MSI TV-@nywhere
+ -
+
+ * - 14
+ - KWorld/VStream XPert DVB-T
+ - 17de:08a6
+
+ * - 15
+ - DViCO FusionHDTV DVB-T1
+ - 18ac:db00
+
+ * - 16
+ - KWorld LTV883RF
+ -
+
+ * - 17
+ - DViCO FusionHDTV 3 Gold-Q
+ - 18ac:d810, 18ac:d800
+
+ * - 18
+ - Hauppauge Nova-T DVB-T
+ - 0070:9002, 0070:9001, 0070:9000
+
+ * - 19
+ - Conexant DVB-T reference design
+ - 14f1:0187
+
+ * - 20
+ - Provideo PV259
+ - 1540:2580
+
+ * - 21
+ - DViCO FusionHDTV DVB-T Plus
+ - 18ac:db10, 18ac:db11
+
+ * - 22
+ - pcHDTV HD3000 HDTV
+ - 7063:3000
+
+ * - 23
+ - digitalnow DNTV Live! DVB-T
+ - 17de:a8a6
+
+ * - 24
+ - Hauppauge WinTV 28xxx (Roslyn) models
+ - 0070:2801
+
+ * - 25
+ - Digital-Logic MICROSPACE Entertainment Center (MEC)
+ - 14f1:0342
+
+ * - 26
+ - IODATA GV/BCTV7E
+ - 10fc:d035
+
+ * - 27
+ - PixelView PlayTV Ultra Pro (Stereo)
+ -
+
+ * - 28
+ - DViCO FusionHDTV 3 Gold-T
+ - 18ac:d820
+
+ * - 29
+ - ADS Tech Instant TV DVB-T PCI
+ - 1421:0334
+
+ * - 30
+ - TerraTec Cinergy 1400 DVB-T
+ - 153b:1166
+
+ * - 31
+ - DViCO FusionHDTV 5 Gold
+ - 18ac:d500
+
+ * - 32
+ - AverMedia UltraTV Media Center PCI 550
+ - 1461:8011
+
+ * - 33
+ - Kworld V-Stream Xpert DVD
+ -
+
+ * - 34
+ - ATI HDTV Wonder
+ - 1002:a101
+
+ * - 35
+ - WinFast DTV1000-T
+ - 107d:665f
+
+ * - 36
+ - AVerTV 303 (M126)
+ - 1461:000a
+
+ * - 37
+ - Hauppauge Nova-S-Plus DVB-S
+ - 0070:9201, 0070:9202
+
+ * - 38
+ - Hauppauge Nova-SE2 DVB-S
+ - 0070:9200
+
+ * - 39
+ - KWorld DVB-S 100
+ - 17de:08b2, 1421:0341
+
+ * - 40
+ - Hauppauge WinTV-HVR1100 DVB-T/Hybrid
+ - 0070:9400, 0070:9402
+
+ * - 41
+ - Hauppauge WinTV-HVR1100 DVB-T/Hybrid (Low Profile)
+ - 0070:9800, 0070:9802
+
+ * - 42
+ - digitalnow DNTV Live! DVB-T Pro
+ - 1822:0025, 1822:0019
+
+ * - 43
+ - KWorld/VStream XPert DVB-T with cx22702
+ - 17de:08a1, 12ab:2300
+
+ * - 44
+ - DViCO FusionHDTV DVB-T Dual Digital
+ - 18ac:db50, 18ac:db54
+
+ * - 45
+ - KWorld HardwareMpegTV XPert
+ - 17de:0840, 1421:0305
+
+ * - 46
+ - DViCO FusionHDTV DVB-T Hybrid
+ - 18ac:db40, 18ac:db44
+
+ * - 47
+ - pcHDTV HD5500 HDTV
+ - 7063:5500
+
+ * - 48
+ - Kworld MCE 200 Deluxe
+ - 17de:0841
+
+ * - 49
+ - PixelView PlayTV P7000
+ - 1554:4813
+
+ * - 50
+ - NPG Tech Real TV FM Top 10
+ - 14f1:0842
+
+ * - 51
+ - WinFast DTV2000 H
+ - 107d:665e
+
+ * - 52
+ - Geniatech DVB-S
+ - 14f1:0084
+
+ * - 53
+ - Hauppauge WinTV-HVR3000 TriMode Analog/DVB-S/DVB-T
+ - 0070:1404, 0070:1400, 0070:1401, 0070:1402
+
+ * - 54
+ - Norwood Micro TV Tuner
+ -
+
+ * - 55
+ - Shenzhen Tungsten Ages Tech TE-DTV-250 / Swann OEM
+ - c180:c980
+
+ * - 56
+ - Hauppauge WinTV-HVR1300 DVB-T/Hybrid MPEG Encoder
+ - 0070:9600, 0070:9601, 0070:9602
+
+ * - 57
+ - ADS Tech Instant Video PCI
+ - 1421:0390
+
+ * - 58
+ - Pinnacle PCTV HD 800i
+ - 11bd:0051
+
+ * - 59
+ - DViCO FusionHDTV 5 PCI nano
+ - 18ac:d530
+
+ * - 60
+ - Pinnacle Hybrid PCTV
+ - 12ab:1788
+
+ * - 61
+ - Leadtek TV2000 XP Global
+ - 107d:6f18, 107d:6618, 107d:6619
+
+ * - 62
+ - PowerColor RA330
+ - 14f1:ea3d
+
+ * - 63
+ - Geniatech X8000-MT DVBT
+ - 14f1:8852
+
+ * - 64
+ - DViCO FusionHDTV DVB-T PRO
+ - 18ac:db30
+
+ * - 65
+ - DViCO FusionHDTV 7 Gold
+ - 18ac:d610
+
+ * - 66
+ - Prolink Pixelview MPEG 8000GT
+ - 1554:4935
+
+ * - 67
+ - Kworld PlusTV HD PCI 120 (ATSC 120)
+ - 17de:08c1
+
+ * - 68
+ - Hauppauge WinTV-HVR4000 DVB-S/S2/T/Hybrid
+ - 0070:6900, 0070:6904, 0070:6902
+
+ * - 69
+ - Hauppauge WinTV-HVR4000(Lite) DVB-S/S2
+ - 0070:6905, 0070:6906
+
+ * - 70
+ - TeVii S460 DVB-S/S2
+ - d460:9022
+
+ * - 71
+ - Omicom SS4 DVB-S/S2 PCI
+ - a044:2011
+
+ * - 72
+ - TBS 8920 DVB-S/S2
+ - 8920:8888
+
+ * - 73
+ - TeVii S420 DVB-S
+ - d420:9022
+
+ * - 74
+ - Prolink Pixelview Global Extreme
+ - 1554:4976
+
+ * - 75
+ - PROF 7300 DVB-S/S2
+ - b033:3033
+
+ * - 76
+ - SATTRADE ST4200 DVB-S/S2
+ - b200:4200
+
+ * - 77
+ - TBS 8910 DVB-S
+ - 8910:8888
+
+ * - 78
+ - Prof 6200 DVB-S
+ - b022:3022
+
+ * - 79
+ - Terratec Cinergy HT PCI MKII
+ - 153b:1177
+
+ * - 80
+ - Hauppauge WinTV-IR Only
+ - 0070:9290
+
+ * - 81
+ - Leadtek WinFast DTV1800 Hybrid
+ - 107d:6654
+
+ * - 82
+ - WinFast DTV2000 H rev. J
+ - 107d:6f2b
+
+ * - 83
+ - Prof 7301 DVB-S/S2
+ - b034:3034
+
+ * - 84
+ - Samsung SMT 7020 DVB-S
+ - 18ac:dc00, 18ac:dccd
+
+ * - 85
+ - Twinhan VP-1027 DVB-S
+ - 1822:0023
+
+ * - 86
+ - TeVii S464 DVB-S/S2
+ - d464:9022
+
+ * - 87
+ - Leadtek WinFast DTV2000 H PLUS
+ - 107d:6f42
+
+ * - 88
+ - Leadtek WinFast DTV1800 H (XC4000)
+ - 107d:6f38
+
+ * - 89
+ - Leadtek TV2000 XP Global (SC4100)
+ - 107d:6f36
+
+ * - 90
+ - Leadtek TV2000 XP Global (XC4100)
+ - 107d:6f43
+
+ * - 91
+ - NotOnlyTV LV3H
+ -
diff --git a/Documentation/admin-guide/media/cx88.rst b/Documentation/admin-guide/media/cx88.rst
new file mode 100644
index 000000000000..e4badb18199d
--- /dev/null
+++ b/Documentation/admin-guide/media/cx88.rst
@@ -0,0 +1,58 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The cx88 driver
+===============
+
+Author: Gerd Hoffmann
+
+This is a v4l2 device driver for the cx2388x chip.
+
+
+Current status
+--------------
+
+video
+ - Works.
+ - Overlay isn't supported.
+
+audio
+ - Works. TV standard detection is done by the driver, because the
+ hardware's auto-detection is buggy.
+ - Audio data DMA (i.e. recording without a loopback cable to the
+ sound card) is supported via cx88-alsa.
+
+vbi
+ - Works.
+
+
+How to add support for new cards
+--------------------------------
+
+The driver needs some configuration data for each TV card. This data is in
+cx88-cards.c. If the driver doesn't work well, you likely need a new
+entry for your card in that file. Check the kernel log (using dmesg)
+to see whether the driver knows your card. There is a line
+like this one:
+
+.. code-block:: none
+
+ cx8800[0]: subsystem: 0070:3400, board: Hauppauge WinTV \
+ 34xxx models [card=1,autodetected]
+
+If your card is listed as "board: UNKNOWN/GENERIC" it is unknown to
+the driver. What to do then?
+
+1) Try upgrading to the latest snapshot; the card may have been added
+ in the meantime.
+2) You can try to create a new entry yourself; have a look at
+ cx88-cards.c. If that works, mail me your changes as a unified
+ diff ("diff -u").
+3) Or you can mail me the config information. We need at least the
+ following information to add the card:
+
+ - the PCI Subsystem ID ("0070:3400" from the line above,
+ "lspci -v" output is fine too).
+ - the tuner type used by the card. You can try to find one by
+ trial-and-error using the tuner=<n> insmod option. If you
+ know which one the card has, you can also look it up in the
+ list in CARDLIST.tuner.
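The log check described in the cx88 text above can be sketched in Python. This is only an illustration and not part of the patch: the function name `parse_cx88_line` and the regular expression are assumptions derived solely from the sample log line quoted in the document, and real kernel logs may be formatted differently. Card number 0 is treated as "unknown" because the card list names it UNKNOWN/GENERIC.

```python
import re

# Assumed layout, modelled on the sample line quoted in cx88.rst:
#   cx8800[0]: subsystem: 0070:3400, board: <name> [card=1,autodetected]
# search() is used so that a leading timestamp does not prevent a match.
CX88_LINE = re.compile(
    r"cx88\d*\[(?P<index>\d+)\]:\s+"
    r"subsystem:\s+(?P<subsystem>[0-9a-fA-F]{4}:[0-9a-fA-F]{4}),\s+"
    r"board:\s+(?P<board>.+?)\s+"
    r"\[card=(?P<card>\d+),(?P<how>[^\]]+)\]"
)


def parse_cx88_line(line):
    """Return a dict describing a cx88 probe line, or None if it does not match.

    Hypothetical helper: it only mirrors the fields the document asks
    users to report (PCI subsystem ID, board name, card number).
    """
    match = CX88_LINE.search(line)
    if match is None:
        return None
    card = int(match.group("card"))
    return {
        "subsystem": match.group("subsystem").lower(),
        "board": match.group("board"),
        "card": card,
        # Card 0 is the UNKNOWN/GENERIC entry in the card list.
        "known": card != 0,
    }


if __name__ == "__main__":
    sample = (
        "cx8800[0]: subsystem: 0070:3400, board: Hauppauge WinTV "
        "34xxx models [card=1,autodetected]"
    )
    print(parse_cx88_line(sample))
```

To use it on a real system, one could feed it each line of the `dmesg` output; a result with `known` set to false suggests the card needs a new entry in cx88-cards.c, and the `subsystem` value is the ID to include in a report.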
diff --git a/Documentation/admin-guide/media/davinci-vpbe.rst b/Documentation/admin-guide/media/davinci-vpbe.rst
new file mode 100644
index 000000000000..9e6360fd02db
--- /dev/null
+++ b/Documentation/admin-guide/media/davinci-vpbe.rst
@@ -0,0 +1,65 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The VPBE V4L2 driver design
+===========================
+
+Functional partitioning
+-----------------------
+
+Consists of the following:
+
+ 1. V4L2 display driver
+
+ Implements creation of the video2 and video3 device nodes and
+ provides a V4L2 device interface to manage the VID0 and VID1 layers.
+
+ 2. Display controller
+
+ Loads up VENC, OSD and external encoders such as ths8200. It provides
+ a set of API calls to V4L2 drivers to set the output/standards
+ in the VENC or external sub devices. It also provides
+ a device object to access the services from OSD subdevice
+ using sub device ops. The connection of external encoders to VENC LCD
+ controller port is done at init time based on default output and standard
+ selection, or at run time when the application changes the output through
+ V4L2 IOCTLs.
+
+ When connected to an external encoder, the VPBE controller is also responsible
+ for setting up the interface between VENC and external encoders based on
+ board-specific settings (specified in board-xxx-evm.c). This allows
+ interfacing external encoders such as ths8200. setup_if_config() is
+ implemented for this, as is the configure_venc() API (part of the next patch),
+ which sets timings in VENC for a specific display resolution. As of this
+ patch series, interconnecting, enabling, and configuring the external
+ encoders is not present; it will be part of the next patch series.
+
+ 3. VENC subdevice module
+
+ Responsible for setting outputs provided through internal DACs and also
+ setting timings at the LCD controller port when external encoders are connected
+ at the port or LCD panel timings are required. When an external encoder/LCD panel
+ is connected, the timings for a specific standard/preset are retrieved from
+ the board-specific table and the values are used to set the timings in
+ VENC using non-standard timing mode.
+
+ Supports LCD panel displays using the VENC. For example, supporting a Logic
+ PD display requires setting up the LCD controller port with a set of
+ timings for the supported resolution and setting the dot clock. So we could
+ add the available outputs as a board-specific entry (i.e. add the "LogicPD"
+ output name to board-xxx-evm.c). A table of timings for the various LCDs
+ supported can be maintained in the board-specific setup file to support
+ various LCD displays. As of this patch a basic driver is present, and
+ support for external encoders and displays forms a part of the next
+ patch series.
+
+ 4. OSD module
+
+ The OSD module implements all OSD layer management and hardware-specific
+ features. The VPBE module interacts with the OSD to enable and
+ disable the appropriate OSD features.
+
+Current status
+--------------
+
+A fully functional working version of the V4L2 driver is available. This
+driver has been tested with NTSC and PAL standards and buffer streaming.
diff --git a/Documentation/admin-guide/media/dvb-drivers.rst b/Documentation/admin-guide/media/dvb-drivers.rst
new file mode 100644
index 000000000000..8df637c375f9
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-drivers.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================================
+Digital TV driver-specific documentation
+========================================
+
+.. toctree::
+ :maxdepth: 2
+
+ avermedia
+ bt8xx
+ lmedm04
+ opera-firmware
+ technisat
+ ttusb-dec
+ zr364xx
diff --git a/Documentation/admin-guide/media/dvb-usb-a800-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-a800-cardlist.rst
new file mode 100644
index 000000000000..2ec8bb8230ff
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-a800-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-a800 cards list
+=======================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AVerMedia AverTV DVB-T USB 2.0 (A800)
+ - 07ca:a800, 07ca:a801
diff --git a/Documentation/admin-guide/media/dvb-usb-af9005-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-af9005-cardlist.rst
new file mode 100644
index 000000000000..285160ee82e8
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-af9005-cardlist.rst
@@ -0,0 +1,20 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-af9005 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Afatech DVB-T USB1.1 stick
+ - 15a4:9020
+ * - Ansonic DVB-T USB1.1 stick
+ - 10b9:6000
+ * - TerraTec Cinergy T USB XE
+ - 0ccd:0055
diff --git a/Documentation/admin-guide/media/dvb-usb-af9015-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-af9015-cardlist.rst
new file mode 100644
index 000000000000..c557994f796a
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-af9015-cardlist.rst
@@ -0,0 +1,80 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-af9015 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AVerMedia A309
+ - 07ca:a309
+ * - AVerMedia AVerTV DVB-T Volar X
+ - 07ca:a815
+ * - Afatech AF9015 reference design
+ - 15a4:9015, 15a4:9016
+ * - AverMedia AVerTV Red HD+ (A850T)
+ - 07ca:850b
+ * - AverMedia AVerTV Volar Black HD (A850)
+ - 07ca:850a
+ * - AverMedia AVerTV Volar GPS 805 (A805)
+ - 07ca:a805
+ * - AverMedia AVerTV Volar M (A815Mac)
+ - 07ca:815a
+ * - Conceptronic USB2.0 DVB-T CTVDIGRCU V3.0
+ - 1b80:e397
+ * - DigitalNow TinyTwin
+ - 13d3:3226
+ * - DigitalNow TinyTwin v2
+ - 1b80:e402
+ * - DigitalNow TinyTwin v3
+ - 1f4d:9016
+ * - Fujitsu-Siemens Slim Mobile USB DVB-T
+ - 07ca:8150
+ * - Genius TVGo DVB-T03
+ - 0458:4012
+ * - KWorld Digital MC-810
+ - 1b80:c810
+ * - KWorld PlusTV DVB-T PCI Pro Card (DVB-T PC160-T)
+ - 1b80:c161
+ * - KWorld PlusTV Dual DVB-T PCI (DVB-T PC160-2T)
+ - 1b80:c160
+ * - KWorld PlusTV Dual DVB-T Stick (DVB-T 399U)
+ - 1b80:e399, 1b80:e400
+ * - KWorld USB DVB-T Stick Mobile (UB383-T)
+ - 1b80:e383
+ * - KWorld USB DVB-T TV Stick II (VS-DVB-T 395U)
+ - 1b80:e396, 1b80:e39b, 1b80:e395, 1b80:e39a
+ * - Leadtek WinFast DTV Dongle Gold
+ - 0413:6029
+ * - Leadtek WinFast DTV2000DS
+ - 0413:6a04
+ * - MSI DIGIVOX Duo
+ - 1462:8801
+ * - MSI Digi VOX mini III
+ - 1462:8807
+ * - Pinnacle PCTV 71e
+ - 2304:022b
+ * - Sveon STV20 Tuner USB DVB-T HDTV
+ - 1b80:e39d
+ * - Sveon STV22 Dual USB DVB-T Tuner HDTV
+ - 1b80:e401
+ * - Telestar Starstick 2
+ - 10b9:8000
+ * - TerraTec Cinergy T Stick Dual RC
+ - 0ccd:0099
+ * - TerraTec Cinergy T Stick RC
+ - 0ccd:0097
+ * - TerraTec Cinergy T USB XE
+ - 0ccd:0069
+ * - TrekStor DVB-T USB Stick
+ - 15a4:901b
+ * - TwinHan AzureWave AD-TU700(704J)
+ - 13d3:3237
+ * - Xtensions XD-380
+ - 1ae7:0381
diff --git a/Documentation/admin-guide/media/dvb-usb-af9035-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-af9035-cardlist.rst
new file mode 100644
index 000000000000..63e4170777c4
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-af9035-cardlist.rst
@@ -0,0 +1,74 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-af9035 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AVerMedia AVerTV Volar HD/PRO (A835)
+ - 07ca:a835, 07ca:b835
+ * - AVerMedia HD Volar (A867)
+ - 07ca:1867, 07ca:a867, 07ca:0337
+ * - AVerMedia TD310 DVB-T2
+ - 07ca:1871
+ * - AVerMedia Twinstar (A825)
+ - 07ca:0825
+ * - Afatech AF9035 reference design
+ - 15a4:9035, 15a4:1000, 15a4:1001, 15a4:1002, 15a4:1003
+ * - Asus U3100Mini Plus
+ - 0b05:1779
+ * - Avermedia A835B(1835)
+ - 07ca:1835
+ * - Avermedia A835B(2835)
+ - 07ca:2835
+ * - Avermedia A835B(3835)
+ - 07ca:3835
+ * - Avermedia A835B(4835)
+ - 07ca:4835
+ * - Avermedia AverTV Volar HD 2 (TD110)
+ - 07ca:a110
+ * - Avermedia H335
+ - 07ca:0335
+ * - Digital Dual TV Receiver CTVDIGDUAL_V2
+ - 1b80:e410
+ * - EVOLVEO XtraTV stick
+ - 1f4d:a115
+ * - Hauppauge WinTV-MiniStick 2
+ - 2040:f900
+ * - ITE 9135 Generic
+ - 048d:9135
+ * - ITE 9135(9005) Generic
+ - 048d:9005
+ * - ITE 9135(9006) Generic
+ - 048d:9006
+ * - ITE 9303 Generic
+ - 048d:9306
+ * - Kworld UB499-2T T09
+ - 1b80:e409
+ * - Leadtek WinFast DTV Dongle Dual
+ - 0413:6a05
+ * - Logilink VG0022A
+ - 1d19:0100
+ * - PCTV AndroiDTV (78e)
+ - 2013:025a
+ * - PCTV microStick (79e)
+ - 2013:0262
+ * - Sveon STV22 Dual DVB-T HDTV
+ - 1b80:e411
+ * - TerraTec Cinergy T Stick
+ - 0ccd:0093
+ * - TerraTec Cinergy T Stick (rev. 2)
+ - 0ccd:00aa
+ * - TerraTec Cinergy T Stick Dual RC (rev. 2)
+ - 0ccd:0099
+ * - TerraTec Cinergy TC2 Stick
+ - 0ccd:10b2
+ * - TerraTec T1
+ - 0ccd:10ae
diff --git a/Documentation/admin-guide/media/dvb-usb-anysee-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-anysee-cardlist.rst
new file mode 100644
index 000000000000..1fb5d22a00dc
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-anysee-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-anysee cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Anysee
+ - 04b4:861f, 1c73:861f
diff --git a/Documentation/admin-guide/media/dvb-usb-au6610-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-au6610-cardlist.rst
new file mode 100644
index 000000000000..02b2b742710b
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-au6610-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-au6610 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Sigmatek DVB-110
+ - 058f:6610
diff --git a/Documentation/admin-guide/media/dvb-usb-az6007-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-az6007-cardlist.rst
new file mode 100644
index 000000000000..db27eb47cc8f
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-az6007-cardlist.rst
@@ -0,0 +1,20 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-az6007 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Azurewave 6007
+ - 13d3:0ccd
+ * - Technisat CableStar Combo HD CI
+ - 14f7:0003
+ * - Terratec H7
+ - 0ccd:10b4, 0ccd:10a3
diff --git a/Documentation/admin-guide/media/dvb-usb-az6027-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-az6027-cardlist.rst
new file mode 100644
index 000000000000..6d8575e9d90c
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-az6027-cardlist.rst
@@ -0,0 +1,24 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-az6027 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AZUREWAVE DVB-S/S2 USB2.0 (AZ6027)
+ - 13d3:3275
+ * - Elgato EyeTV Sat
+ - 0fd9:002a, 0fd9:0025, 0fd9:0036
+ * - TERRATEC S7
+ - 0ccd:10a4
+ * - TERRATEC S7 MKII
+ - 0ccd:10ac
+ * - Technisat SkyStar USB 2 HD CI
+ - 14f7:0001, 14f7:0002
diff --git a/Documentation/admin-guide/media/dvb-usb-ce6230-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-ce6230-cardlist.rst
new file mode 100644
index 000000000000..09750e8ac139
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-ce6230-cardlist.rst
@@ -0,0 +1,18 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-ce6230 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AVerMedia A310 USB 2.0 DVB-T tuner
+ - 07ca:a310
+ * - Intel CE9500 reference design
+ - 8086:9500
diff --git a/Documentation/admin-guide/media/dvb-usb-cinergyT2-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-cinergyT2-cardlist.rst
new file mode 100644
index 000000000000..0ee753929eca
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-cinergyT2-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-cinergyT2 cards list
+============================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - TerraTec/qanu USB2.0 Highspeed DVB-T Receiver
+ - 0ccd:0038
diff --git a/Documentation/admin-guide/media/dvb-usb-cxusb-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-cxusb-cardlist.rst
new file mode 100644
index 000000000000..a73f15d1acf5
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-cxusb-cardlist.rst
@@ -0,0 +1,40 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-cxusb cards list
+========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AVerMedia AVerTVHD Volar (A868R)
+ -
+ * - Conexant DMB-TH Stick
+ -
+ * - DViCO FusionHDTV DVB-T Dual Digital 2
+ -
+ * - DViCO FusionHDTV DVB-T Dual Digital 4
+ -
+ * - DViCO FusionHDTV DVB-T Dual Digital 4 (rev 2)
+ -
+ * - DViCO FusionHDTV DVB-T Dual USB
+ -
+ * - DViCO FusionHDTV DVB-T NANO2
+ -
+ * - DViCO FusionHDTV DVB-T USB (LGZ201)
+ -
+ * - DViCO FusionHDTV DVB-T USB (TH7579)
+ -
+ * - DViCO FusionHDTV5 USB Gold
+ -
+ * - DigitalNow DVB-T Dual USB
+ -
+ * - Medion MD95700 (MDUSBTV-HYBRID)
+ -
+ * - Mygica D689 DMB-TH
+ -
diff --git a/Documentation/admin-guide/media/dvb-usb-dib0700-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-dib0700-cardlist.rst
new file mode 100644
index 000000000000..4b76b6f1089b
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-dib0700-cardlist.rst
@@ -0,0 +1,162 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-dib0700 cards list
+==========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - ASUS My Cinema U3000 Mini DVBT Tuner
+ - 0b05:171f
+ * - ASUS My Cinema U3100 Mini DVBT Tuner
+ - 0b05:173f
+ * - AVerMedia AVerTV DVB-T Express
+ - 07ca:b568
+ * - AVerMedia AVerTV DVB-T Volar
+ - 07ca:a807, 07ca:b808
+ * - Artec T14BR DVB-T
+ - 05d8:810f
+ * - Asus My Cinema-U3000Hybrid
+ - 0b05:1736
+ * - Compro Videomate U500
+ - 185b:1e78, 185b:1e80
+ * - DiBcom NIM7090 reference design
+ - 10b8:1bb2
+ * - DiBcom NIM8096MD reference design
+ - 10b8:1fa8
+ * - DiBcom NIM9090MD reference design
+ - 10b8:2384
+ * - DiBcom STK7070P reference design
+ - 10b8:1ebc
+ * - DiBcom STK7070PD reference design
+ - 10b8:1ebe
+ * - DiBcom STK7700D reference design
+ - 10b8:1ef0
+ * - DiBcom STK7700P reference design
+ - 10b8:1e14, 10b8:1e78
+ * - DiBcom STK7770P reference design
+ - 10b8:1e80
+ * - DiBcom STK807xP reference design
+ - 10b8:1f90
+ * - DiBcom STK807xPVR reference design
+ - 10b8:1f98
+ * - DiBcom STK8096-PVR reference design
+ - 2013:1faa, 10b8:1faa
+ * - DiBcom STK8096GP reference design
+ - 10b8:1fa0
+ * - DiBcom STK9090M reference design
+ - 10b8:2383
+ * - DiBcom TFE7090PVR reference design
+ - 10b8:1bb4
+ * - DiBcom TFE7790P reference design
+ - 10b8:1e6e
+ * - DiBcom TFE8096P reference design
+ - 10b8:1f9c
+ * - Elgato EyeTV DTT
+ - 0fd9:0021
+ * - Elgato EyeTV DTT rev. 2
+ - 0fd9:003f
+ * - Elgato EyeTV Diversity
+ - 0fd9:0011
+ * - Elgato EyeTV Dtt Dlx PD378S
+ - 0fd9:0020
+ * - EvolutePC TVWay+
+ - 1e59:0002
+ * - Gigabyte U7000
+ - 1044:7001
+ * - Gigabyte U8000-RH
+ - 1044:7002
+ * - Hama DVB-T Hybrid USB Stick
+ - 147f:2758
+ * - Hauppauge ATSC MiniCard (B200)
+ - 2040:b200
+ * - Hauppauge ATSC MiniCard (B210)
+ - 2040:b210
+ * - Hauppauge Nova-T 500 Dual DVB-T
+ - 2040:9941, 2040:9950
+ * - Hauppauge Nova-T MyTV.t
+ - 2040:7080
+ * - Hauppauge Nova-T Stick
+ - 2040:7050, 2040:7060, 2040:7070
+ * - Hauppauge Nova-TD Stick (52009)
+ - 2040:5200
+ * - Hauppauge Nova-TD Stick/Elgato Eye-TV Diversity
+ - 2040:9580
+ * - Hauppauge Nova-TD-500 (84xxx)
+ - 2040:8400
+ * - Leadtek WinFast DTV Dongle H
+ - 0413:60f6
+ * - Leadtek Winfast DTV Dongle (STK7700P based)
+ - 0413:6f00, 0413:6f01
+ * - Medion CTX1921 DVB-T USB
+ - 1660:1921
+ * - Microsoft Xbox One Digital TV Tuner
+ - 045e:02d5
+ * - PCTV 2002e
+ - 2013:025c
+ * - PCTV 2002e SE
+ - 2013:025d
+ * - Pinnacle Expresscard 320cx
+ - 2304:022e
+ * - Pinnacle PCTV 2000e
+ - 2304:022c
+ * - Pinnacle PCTV 282e
+ - 2013:0248, 2304:0248
+ * - Pinnacle PCTV 340e HD Pro USB Stick
+ - 2304:023d
+ * - Pinnacle PCTV 72e
+ - 2304:0236
+ * - Pinnacle PCTV 73A
+ - 2304:0243
+ * - Pinnacle PCTV 73e
+ - 2304:0237
+ * - Pinnacle PCTV 73e SE
+ - 2013:0245, 2304:0245
+ * - Pinnacle PCTV DVB-T Flash Stick
+ - 2304:0228
+ * - Pinnacle PCTV Dual DVB-T Diversity Stick
+ - 2304:0229
+ * - Pinnacle PCTV HD Pro USB Stick
+ - 2304:023a
+ * - Pinnacle PCTV HD USB Stick
+ - 2304:023b
+ * - Pinnacle PCTV Hybrid Stick Solo
+ - 2304:023e
+ * - Prolink Pixelview SBTVD
+ - 1554:5010
+ * - Sony PlayTV
+ - 1415:0003
+ * - TechniSat AirStar TeleStick 2
+ - 14f7:0004
+ * - Terratec Cinergy DT USB XS Diversity/ T5
+ - 0ccd:0081, 0ccd:10a1
+ * - Terratec Cinergy DT XS Diversity
+ - 0ccd:005a
+ * - Terratec Cinergy HT Express
+ - 0ccd:0060
+ * - Terratec Cinergy HT USB XE
+ - 0ccd:0058
+ * - Terratec Cinergy T Express
+ - 0ccd:0062
+ * - Terratec Cinergy T USB XXS (HD)/ T3
+ - 0ccd:0078, 0ccd:10a0, 0ccd:00ab
+ * - Uniwill STK7700P based (Hama and others)
+ - 1584:6003
+ * - YUAN High-Tech DiBcom STK7700D
+ - 1164:1e8c
+ * - YUAN High-Tech MC770
+ - 1164:0871
+ * - YUAN High-Tech STK7700D
+ - 1164:1efc
+ * - YUAN High-Tech STK7700PH
+ - 1164:1f08
+ * - Yuan EC372S
+ - 1164:1edc
+ * - Yuan PD378S
+ - 1164:2edc
diff --git a/Documentation/admin-guide/media/dvb-usb-dibusb-mb-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-dibusb-mb-cardlist.rst
new file mode 100644
index 000000000000..f25a54721f0d
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-dibusb-mb-cardlist.rst
@@ -0,0 +1,42 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-dibusb-mb cards list
+============================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AVerMedia AverTV DVBT USB1.1
+ - 14aa:0001, 14aa:0002
+ * - Artec T1 USB1.1 TVBOX with AN2135
+ - 05d8:8105, 05d8:8106
+ * - Artec T1 USB1.1 TVBOX with AN2235
+ - 05d8:8107, 05d8:8108
+ * - Artec T1 USB1.1 TVBOX with AN2235 (faulty USB IDs)
+ - 0547:2235
+ * - Artec T1 USB2.0
+ - 05d8:8109, 05d8:810a
+ * - Compro Videomate DVB-U2000 - DVB-T USB1.1 (please confirm to linux-dvb)
+ - 185b:d000, 145f:010c, 185b:d001
+ * - DiBcom USB1.1 DVB-T reference design (MOD3000)
+ - 10b8:0bb8, 10b8:0bb9
+ * - Grandtec USB1.1 DVB-T
+ - 5032:0fa0, 5032:0bb8, 5032:0fa1, 5032:0bb9
+ * - KWorld V-Stream XPERT DTV - DVB-T USB1.1
+ - eb1a:17de, eb1a:17df
+ * - KWorld Xpert DVB-T USB2.0
+ - eb2a:17de
+ * - KWorld/ADSTech Instant DVB-T USB2.0
+ - 06e1:a333, 06e1:a334
+ * - TwinhanDTV USB-Ter USB1.1 / Magic Box I / HAMA USB1.1 DVB-T device
+ - 13d3:3201, 1822:3201, 13d3:3202, 1822:3202
+ * - Unknown USB1.1 DVB-T device ???? please report the name to the author
+ - 1025:005e, 1025:005f
+ * - VideoWalker DVB-T USB
+ - 0458:701e, 0458:701f
diff --git a/Documentation/admin-guide/media/dvb-usb-dibusb-mc-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-dibusb-mc-cardlist.rst
new file mode 100644
index 000000000000..8d03bae0e084
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-dibusb-mc-cardlist.rst
@@ -0,0 +1,30 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-dibusb-mc cards list
+============================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Artec T1 USB2.0 TVBOX (please check the warm ID)
+ - 05d8:8109, 05d8:810a
+ * - Artec T14 - USB2.0 DVB-T
+ - 05d8:810b, 05d8:810c
+ * - DiBcom USB2.0 DVB-T reference design (MOD3000P)
+ - 10b8:0bc6, 10b8:0bc7
+ * - GRAND - USB2.0 DVB-T adapter
+ - 5032:0bc6, 5032:0bc7
+ * - Humax/Coex DVB-T USB Stick 2.0 High Speed
+ - 10b9:5000, 10b9:5001
+ * - LITE-ON USB2.0 DVB-T Tuner
+ - 04ca:f000, 04ca:f001
+ * - Leadtek - USB2.0 Winfast DTV dongle
+ - 0413:6025, 0413:6026
+ * - MSI Digivox Mini SL
+ - eb1a:e360, eb1a:e361
diff --git a/Documentation/admin-guide/media/dvb-usb-digitv-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-digitv-cardlist.rst
new file mode 100644
index 000000000000..2b4d8325e8e9
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-digitv-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-digitv cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Nebula Electronics uDigiTV DVB-T USB2.0
+ - 0547:0201
diff --git a/Documentation/admin-guide/media/dvb-usb-dtt200u-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-dtt200u-cardlist.rst
new file mode 100644
index 000000000000..b4150a7bf31f
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-dtt200u-cardlist.rst
@@ -0,0 +1,22 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-dtt200u cards list
+==========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - WideView WT-220U PenType Receiver (Miglia)
+ - 18f3:0220
+ * - WideView WT-220U PenType Receiver (Typhoon/Freecom)
+ - 14aa:0222, 14aa:0220, 14aa:0221, 14aa:0225, 14aa:0226
+ * - WideView WT-220U PenType Receiver (based on ZL353)
+ - 14aa:022a, 14aa:022b
+ * - WideView/Yuan/Yakumo/Hama/Typhoon DVB-T USB2.0 (WT-200U)
+ - 14aa:0201, 14aa:0301
diff --git a/Documentation/admin-guide/media/dvb-usb-dtv5100-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-dtv5100-cardlist.rst
new file mode 100644
index 000000000000..91d6e35e6f9d
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-dtv5100-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-dtv5100 cards list
+==========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - AME DTV-5100 USB2.0 DVB-T
+ - 06be:a232
diff --git a/Documentation/admin-guide/media/dvb-usb-dvbsky-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-dvbsky-cardlist.rst
new file mode 100644
index 000000000000..9f7b619f35f7
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-dvbsky-cardlist.rst
@@ -0,0 +1,42 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-dvbsky cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - DVBSky S960/S860
+ - 0572:6831
+ * - DVBSky S960CI
+ - 0572:960c
+ * - DVBSky T330
+ - 0572:0320
+ * - DVBSky T680CI
+ - 0572:680c
+ * - MyGica Mini DVB-(T/T2/C) USB Stick T230
+ - 0572:c688
+ * - MyGica Mini DVB-(T/T2/C) USB Stick T230C
+ - 0572:c689
+ * - MyGica Mini DVB-(T/T2/C) USB Stick T230C Lite
+ - 0572:c699
+ * - MyGica Mini DVB-(T/T2/C) USB Stick T230C v2
+ - 0572:c68a
+ * - TechnoTrend TT-connect CT2-4650 CI
+ - 0b48:3012
+ * - TechnoTrend TT-connect CT2-4650 CI v1.1
+ - 0b48:3015
+ * - TechnoTrend TT-connect S2-4650 CI
+ - 0b48:3017
+ * - TechnoTrend TVStick CT2-4400
+ - 0b48:3014
+ * - Terratec Cinergy S2 Rev.4
+ - 0ccd:0105
+ * - Terratec H7 Rev.4
+ - 0ccd:10a5
diff --git a/Documentation/admin-guide/media/dvb-usb-dw2102-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-dw2102-cardlist.rst
new file mode 100644
index 000000000000..e39bc8e4bffe
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-dw2102-cardlist.rst
@@ -0,0 +1,56 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-dw2102 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - DVBWorld DVB-C 3101 USB2.0
+ - 04b4:3101
+ * - DVBWorld DVB-S 2101 USB2.0
+ - 04b4:2101
+ * - DVBWorld DVB-S 2102 USB2.0
+ - 04b4:2102
+ * - DVBWorld DW2104 USB2.0
+ - 04b4:2104
+ * - GOTVIEW Satellite HD
+ - 1fe1:5456
+ * - Geniatech T220 DVB-T/T2 USB2.0
+ - 1f4d:d220
+ * - SU3000HD DVB-S USB2.0
+ - 1f4d:3000
+ * - TeVii S482 (tuner 1)
+ - 9022:d483
+ * - TeVii S482 (tuner 2)
+ - 9022:d484
+ * - TeVii S630 USB
+ - 9022:d630
+ * - TeVii S650 USB2.0
+ - 9022:d650
+ * - TeVii S662
+ - 9022:d662
+ * - TechnoTrend TT-connect S2-4600
+ - 0b48:3011
+ * - TerraTec Cinergy S USB
+ - 0ccd:0064
+ * - Terratec Cinergy S2 PCIe Dual Port 1
+ - 153b:1181
+ * - Terratec Cinergy S2 PCIe Dual Port 2
+ - 153b:1182
+ * - Terratec Cinergy S2 USB BOX
+ - 0ccd:0105
+ * - Terratec Cinergy S2 USB HD
+ - 0ccd:00a8
+ * - Terratec Cinergy S2 USB HD Rev.2
+ - 0ccd:00b0
+ * - Terratec Cinergy S2 USB HD Rev.3
+ - 0ccd:0102
+ * - X3M TV SPC1400HD PCI
+ - 1f4d:3100
diff --git a/Documentation/admin-guide/media/dvb-usb-ec168-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-ec168-cardlist.rst
new file mode 100644
index 000000000000..a3660dfa5dcc
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-ec168-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-ec168 cards list
+========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - E3C EC168 reference design
+ - 18b4:1689, 18b4:fffa, 18b4:fffb, 18b4:1001, 18b4:1002
diff --git a/Documentation/admin-guide/media/dvb-usb-gl861-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-gl861-cardlist.rst
new file mode 100644
index 000000000000..5ec62fe03d64
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-gl861-cardlist.rst
@@ -0,0 +1,20 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-gl861 cards list
+========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - 774 Friio White ISDB-T USB2.0
+ - 7a69:0001
+ * - A-LINK DTU DVB-T USB2.0
+ - 05e3:f170
+ * - MSI Mega Sky 55801 DVB-T USB2.0
+ - 0db0:5581
diff --git a/Documentation/admin-guide/media/dvb-usb-gp8psk-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-gp8psk-cardlist.rst
new file mode 100644
index 000000000000..150fa9f7810a
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-gp8psk-cardlist.rst
@@ -0,0 +1,22 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-gp8psk cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Genpix 8PSK-to-USB2 Rev.1 DVB-S receiver
+ - 09c0:0200, 09c0:0201
+ * - Genpix 8PSK-to-USB2 Rev.2 DVB-S receiver
+ - 09c0:0202
+ * - Genpix SkyWalker-1 DVB-S receiver
+ - 09c0:0203
+ * - Genpix SkyWalker-2 DVB-S receiver
+ - 09c0:0206
diff --git a/Documentation/admin-guide/media/dvb-usb-lmedm04-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-lmedm04-cardlist.rst
new file mode 100644
index 000000000000..2050fbf03d4a
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-lmedm04-cardlist.rst
@@ -0,0 +1,20 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-lmedm04 cards list
+==========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - DM04_LME2510C_DVB-S
+ - 3344:1120
+ * - DM04_LME2510C_DVB-S RS2000
+ - 3344:22f0
+ * - DM04_LME2510_DVB-S
+ - 3344:1122
diff --git a/Documentation/admin-guide/media/dvb-usb-m920x-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-m920x-cardlist.rst
new file mode 100644
index 000000000000..73145940b5c5
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-m920x-cardlist.rst
@@ -0,0 +1,26 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-m920x cards list
+========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - DTV-DVB UDTT7049
+ - 13d3:3219
+ * - Dposh DVB-T USB2.0
+ - 1498:9206, 1498:a090
+ * - LifeView TV Walker Twin DVB-T USB2.0
+ - 10fd:0514, 10fd:0513
+ * - MSI DIGI VOX mini II DVB-T USB2.0
+ - 10fd:1513
+ * - MSI Mega Sky 580 DVB-T USB2.0
+ - 0db0:5580
+ * - Pinnacle PCTV 310e
+ - 13d3:3211
diff --git a/Documentation/admin-guide/media/dvb-usb-mxl111sf-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-mxl111sf-cardlist.rst
new file mode 100644
index 000000000000..6974801c43b6
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-mxl111sf-cardlist.rst
@@ -0,0 +1,36 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-mxl111sf cards list
+===========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - HCW 117xxx
+ - 2040:b702
+ * - HCW 126xxx
+ - 2040:c602, 2040:c60a
+ * - Hauppauge 117xxx ATSC+
+ - 2040:b700, 2040:b703, 2040:b753, 2040:b763, 2040:b757, 2040:b767
+ * - Hauppauge 117xxx DVBT
+ - 2040:b704, 2040:b764
+ * - Hauppauge 126xxx
+ - 2040:c612, 2040:c61a
+ * - Hauppauge 126xxx ATSC
+ - 2040:c601, 2040:c609, 2040:b701
+ * - Hauppauge 126xxx ATSC+
+ - 2040:c600, 2040:c603, 2040:c60b, 2040:c653, 2040:c65b
+ * - Hauppauge 126xxx DVBT
+ - 2040:c604, 2040:c60c
+ * - Hauppauge 138xxx DVBT
+ - 2040:d854, 2040:d864, 2040:d8d4, 2040:d8e4
+ * - Hauppauge Mercury
+ - 2040:d853, 2040:d863, 2040:d8d3, 2040:d8e3, 2040:d8ff
+ * - Hauppauge WinTV-Aero-M
+ - 2040:c613, 2040:c61b
diff --git a/Documentation/admin-guide/media/dvb-usb-nova-t-usb2-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-nova-t-usb2-cardlist.rst
new file mode 100644
index 000000000000..e295f912a585
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-nova-t-usb2-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-nova-t-usb2 cards list
+==============================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Hauppauge WinTV-NOVA-T usb2
+ - 2040:9300, 2040:9301
diff --git a/Documentation/admin-guide/media/dvb-usb-opera1-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-opera1-cardlist.rst
new file mode 100644
index 000000000000..362245f5a46a
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-opera1-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-opera1 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Opera1 DVB-S USB2.0
+ - 04b4:2830, 695c:3829
diff --git a/Documentation/admin-guide/media/dvb-usb-pctv452e-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-pctv452e-cardlist.rst
new file mode 100644
index 000000000000..886d8cc18acb
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-pctv452e-cardlist.rst
@@ -0,0 +1,20 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-pctv452e cards list
+===========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - PCTV HDTV USB
+ - 2304:021f
+ * - Technotrend TT Connect S2-3600
+ - 0b48:3007
+ * - Technotrend TT Connect S2-3650-CI
+ - 0b48:300a
diff --git a/Documentation/admin-guide/media/dvb-usb-rtl28xxu-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-rtl28xxu-cardlist.rst
new file mode 100644
index 000000000000..9f4295331a15
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-rtl28xxu-cardlist.rst
@@ -0,0 +1,80 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-rtl28xxu cards list
+===========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - ASUS My Cinema-U3100Mini Plus V2
+ - 1b80:d3a8
+ * - Astrometa DVB-T2
+ - 15f4:0131
+ * - Compro VideoMate U620F
+ - 185b:0620
+ * - Compro VideoMate U650F
+ - 185b:0650
+ * - Crypto ReDi PC 50 A
+ - 1f4d:a803
+ * - Dexatek DK DVB-T Dongle
+ - 1d19:1101
+ * - Dexatek DK mini DVB-T Dongle
+ - 1d19:1102
+ * - DigitalNow Quad DVB-T Receiver
+ - 0413:6680
+ * - Freecom USB2.0 DVB-T
+ - 14aa:0160, 14aa:0161
+ * - G-Tek Electronics Group Lifeview LV5TDLX DVB-T
+ - 1f4d:b803
+ * - GIGABYTE U7300
+ - 1b80:d393
+ * - Genius TVGo DVB-T03
+ - 0458:707f
+ * - GoTView MasterHD 3
+ - 5654:ca42
+ * - Leadtek WinFast DTV Dongle mini
+ - 0413:6a03
+ * - Leadtek WinFast DTV2000DS Plus
+ - 0413:6f12
+ * - Leadtek Winfast DTV Dongle Mini D
+ - 0413:6f0f
+ * - MSI DIGIVOX Micro HD
+ - 1d19:1104
+ * - MaxMedia HU394-T
+ - 1b80:d394
+ * - PROlectrix DV107669
+ - 1f4d:d803
+ * - Peak DVB-T USB
+ - 1b80:d395
+ * - Realtek RTL2831U reference design
+ - 0bda:2831
+ * - Realtek RTL2832U reference design
+ - 0bda:2832, 0bda:2838
+ * - Sveon STV20
+ - 1b80:d39d
+ * - Sveon STV21
+ - 1b80:d3b0
+ * - Sveon STV27
+ - 1b80:d3af
+ * - TURBO-X Pure TV Tuner DTT-2000
+ - 1b80:d3a4
+ * - TerraTec Cinergy T Stick Black
+ - 0ccd:00a9
+ * - TerraTec Cinergy T Stick RC (Rev. 3)
+ - 0ccd:00d3
+ * - TerraTec Cinergy T Stick+
+ - 0ccd:00d7
+ * - TerraTec NOXON DAB Stick
+ - 0ccd:00b3
+ * - TerraTec NOXON DAB Stick (rev 2)
+ - 0ccd:00e0
+ * - TerraTec NOXON DAB Stick (rev 3)
+ - 0ccd:00b4
+ * - Trekstor DVB-T Stick Terres 2.0
+ - 1f4d:c803
diff --git a/Documentation/admin-guide/media/dvb-usb-technisat-usb2-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-technisat-usb2-cardlist.rst
new file mode 100644
index 000000000000..30ee92ada134
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-technisat-usb2-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-technisat-usb2 cards list
+=================================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Technisat SkyStar USB HD (DVB-S/S2)
+ - 14f7:0500
diff --git a/Documentation/admin-guide/media/dvb-usb-ttusb2-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-ttusb2-cardlist.rst
new file mode 100644
index 000000000000..faa78e5f3f5d
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-ttusb2-cardlist.rst
@@ -0,0 +1,24 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-ttusb2 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Pinnacle 400e DVB-S USB2.0
+ - 2304:020f
+ * - Pinnacle 450e DVB-S USB2.0
+ - 2304:0222
+ * - Technotrend TT-connect CT-3650
+ - 0b48:300d
+ * - Technotrend TT-connect S-2400
+ - 0b48:3006
+ * - Technotrend TT-connect S-2400 (8kB EEPROM)
+ - 0b48:3009
diff --git a/Documentation/admin-guide/media/dvb-usb-umt-010-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-umt-010-cardlist.rst
new file mode 100644
index 000000000000..ce7ce901b5ac
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-umt-010-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-umt-010 cards list
+==========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Hanftek UMT-010 DVB-T USB2.0
+ - 15f4:0001, 15f4:0015
diff --git a/Documentation/admin-guide/media/dvb-usb-vp702x-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-vp702x-cardlist.rst
new file mode 100644
index 000000000000..101442434268
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-vp702x-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-vp702x cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - TwinhanDTV StarBox DVB-S USB2.0 (VP7021)
+ - 13d3:3207
diff --git a/Documentation/admin-guide/media/dvb-usb-vp7045-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-vp7045-cardlist.rst
new file mode 100644
index 000000000000..2fc8fc4ecc32
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-vp7045-cardlist.rst
@@ -0,0 +1,18 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-vp7045 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - DigitalNow TinyUSB 2 DVB-t Receiver
+ - 13d3:3223, 13d3:3224
+ * - Twinhan USB2.0 DVB-T receiver (TwinhanDTV Alpha/MagicBox II)
+ - 13d3:3205, 13d3:3206
diff --git a/Documentation/admin-guide/media/dvb-usb-zd1301-cardlist.rst b/Documentation/admin-guide/media/dvb-usb-zd1301-cardlist.rst
new file mode 100644
index 000000000000..9ca446184753
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb-usb-zd1301-cardlist.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+dvb-usb-zd1301 cards list
+=========================
+
+.. tabularcolumns:: |p{7.0cm}|p{10.5cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 7 13
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - ZyDAS ZD1301 reference design
+ - 0ace:13a1
diff --git a/Documentation/admin-guide/media/dvb.rst b/Documentation/admin-guide/media/dvb.rst
new file mode 100644
index 000000000000..e5258bfa5cd9
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========
+Digital TV
+==========
+
+.. toctree::
+
+ dvb_intro
+ ci
+ faq
+ dvb_references
diff --git a/Documentation/admin-guide/media/dvb_intro.rst b/Documentation/admin-guide/media/dvb_intro.rst
new file mode 100644
index 000000000000..44eac9b3be6c
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb_intro.rst
@@ -0,0 +1,616 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Using the Digital TV Framework
+==============================
+
+Introduction
+~~~~~~~~~~~~
+
+One significant difference between Digital TV and Analogue TV that the
+unwary (like myself) should consider is that, although the component
+structure of DVB-T cards is substantially similar to that of Analogue TV
+cards, they function in substantially different ways.
+
+The purpose of an Analogue TV is to receive and display an Analogue
+Television signal. An Analogue TV signal (otherwise known as composite
+video) is an analogue encoding of a sequence of image frames (25 frames
+per second in Europe) rasterised using an interlacing technique.
+Interlacing takes two fields to represent one frame. Therefore, an
+Analogue TV card for a PC has the following purpose:
+
+* Tune the receiver to receive a broadcast signal.
+* Demodulate the broadcast signal.
+* Demultiplex the analogue video signal and analogue audio
+  signal.
+
+  .. note::
+
+     Some countries employ a digital audio signal
+     embedded within the modulated composite analogue signal,
+     using NICAM signaling.
+
+* Digitize the analogue video signal and make the resulting datastream
+  available to the data bus.
+
+The digital datastream from an Analogue TV card is generated by
+circuitry on the card and is often presented uncompressed. For a PAL TV
+signal encoded at a resolution of 768x576 24-bit color pixels over 25
+frames per second - a fair amount of data is generated and must be
+processed by the PC before it can be displayed on the video monitor
+screen. Some Analogue TV cards for PCs have onboard MPEG2 encoders which
+permit the raw digital data stream to be presented to the PC in an
+encoded and compressed form - similar to the form that is used in
+Digital TV.
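+
+As a rough check of the "fair amount of data" mentioned above, the raw
+rate can be worked out from the quoted figures. This is only a sketch: it
+assumes 3 bytes per pixel, ignores blanking intervals, audio and bus
+overhead, and the function name is illustrative.
+

```python
# Uncompressed PAL data rate, using the figures quoted in the text.
WIDTH, HEIGHT = 768, 576       # pixels per frame
BYTES_PER_PIXEL = 24 // 8      # 24-bit colour (assumption: no padding)
FRAMES_PER_SECOND = 25


def raw_rate_bytes_per_second(width=WIDTH, height=HEIGHT,
                              bytes_per_pixel=BYTES_PER_PIXEL,
                              fps=FRAMES_PER_SECOND):
    """Return the uncompressed video data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps


rate = raw_rate_bytes_per_second()
print(f"{rate} bytes/s = {rate / 1e6:.1f} MB/s = {rate * 8 / 1e6:.1f} Mbit/s")
```

+
+With the figures above this prints
+``33177600 bytes/s = 33.2 MB/s = 265.4 Mbit/s``.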
+
+The purpose of a simple budget digital TV card (DVB-T, C or S) is
+simply to:
+
+* Tune the receiver to receive a broadcast signal.
+* Extract the encoded digital datastream from the broadcast signal.
+* Make the encoded digital datastream (MPEG2) available to the data bus.
+
+The significant difference between the two is that the tuner on the
+analogue TV card spits out an Analogue signal, whereas the tuner on the
+digital TV card spits out a compressed encoded digital datastream. As
+the signal is already digitised, it is trivial to pass this datastream
+to the PC databus with minimal additional processing and then extract
+the digital video and audio datastreams passing them to the appropriate
+software or hardware for decoding and viewing.
+
+Getting the card going
+~~~~~~~~~~~~~~~~~~~~~~
+
+The Device Driver API for DVB under Linux will expose the following
+device nodes via the devfs filesystem:
+
+* /dev/dvb/adapter0/demux0
+* /dev/dvb/adapter0/dvr0
+* /dev/dvb/adapter0/frontend0
+
+The ``/dev/dvb/adapter0/dvr0`` device node is used to read the MPEG2
+Data Stream and the ``/dev/dvb/adapter0/frontend0`` device node is used
+to tune the frontend tuner module. The ``/dev/dvb/adapter0/demux0`` is
+used to control what programs will be received.
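+
+To see which of these nodes a given system actually exposes, the
+directory layout described above can be walked from userspace. The
+following is a minimal sketch, not part of any DVB tool: the helper name
+``list_dvb_nodes`` and its ``dev_root`` parameter are made up for this
+example, and it only lists names without opening the devices.
+

```python
import glob
import os


def list_dvb_nodes(dev_root="/dev/dvb"):
    """Return {adapter name: sorted node names} found under dev_root.

    An empty dict is returned when no adapter directory exists.
    """
    adapters = {}
    for adapter_dir in sorted(glob.glob(os.path.join(dev_root, "adapter*"))):
        if os.path.isdir(adapter_dir):
            adapters[os.path.basename(adapter_dir)] = sorted(os.listdir(adapter_dir))
    return adapters


if __name__ == "__main__":
    for adapter, nodes in list_dvb_nodes().items():
        print(adapter + ":", ", ".join(nodes))
```

+
+On a machine with one simple card, the result would be expected to
+contain ``demux0``, ``dvr0`` and ``frontend0`` under ``adapter0``.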
+
+Depending on the card's feature set, the Device Driver API could also
+expose other device nodes:
+
+* /dev/dvb/adapter0/ca0
+* /dev/dvb/adapter0/audio0
+* /dev/dvb/adapter0/net0
+* /dev/dvb/adapter0/osd0
+* /dev/dvb/adapter0/video0
+
+The ``/dev/dvb/adapter0/ca0`` is used to decode encrypted channels. The
+other device nodes are found only on devices that use the av7110
+driver, which is now obsolete, together with the extra API that such
+devices use.
+
+Receiving a digital TV channel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This section attempts to explain how digital TV reception works and how
+this affects the configuration of a Digital TV card.
+
+In this example, we're considering tuning into DVB-T channels in
+Australia, in the Melbourne region.
+
+The frequencies broadcast by Mount Dandenong transmitters are,
+currently:
+
+Table 1. Transponder Frequencies Mount Dandenong, Vic, Aus.
+
+=========== ===========
+Broadcaster Frequency
+=========== ===========
+Seven       177.500 MHz
+SBS         184.500 MHz
+Nine        191.625 MHz
+Ten         219.500 MHz
+ABC         226.500 MHz
+Channel 31  557.625 MHz
+=========== ===========
+
+The digital TV scan utilities (like dvbv5-scan) use a set of
+default tables for various countries and regions. Those are
+currently provided as a separate package, called dtv-scan-tables. Its
+git tree is located at LinuxTV.org:
+
+ https://git.linuxtv.org/dtv-scan-tables.git/
+
+If none of the tables there suit, you can specify a data file on the
+command line which contains the transponder frequencies. Here is a
+sample file for the above channel transponders, in the old "channel"
+format::
+
+ # Data file for DVB scan program
+ #
+ # C Frequency SymbolRate FEC QAM
+ # S Frequency Polarisation SymbolRate FEC
+ # T Frequency Bandwidth FEC FEC2 QAM Mode Guard Hier
+
+ T 177500000 7MHz AUTO AUTO QAM64 8k 1/16 NONE
+ T 184500000 7MHz AUTO AUTO QAM64 8k 1/8 NONE
+ T 191625000 7MHz AUTO AUTO QAM64 8k 1/16 NONE
+ T 219500000 7MHz AUTO AUTO QAM64 8k 1/16 NONE
+ T 226500000 7MHz AUTO AUTO QAM64 8k 1/16 NONE
+ T 557625000 7MHz AUTO AUTO QPSK 8k 1/16 NONE
+
+Nowadays, we prefer to use a newer format, which is more verbose and easier
+to understand. With the new format, the "Seven" channel transponder's
+data is represented by::
+
+ [Seven]
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 177500000
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = AUTO
+ CODE_RATE_LP = AUTO
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+ INVERSION = AUTO
+
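+The correspondence between the two formats can be sketched in a few lines
+of Python. This is an illustration only, not the converter shipped with
+any DVB tool: the function name is made up, only DVB-T (``T``) lines are
+handled, and the mapping of the ``FEC`` and ``FEC2`` columns to
+``CODE_RATE_HP`` and ``CODE_RATE_LP`` is an assumption based on the two
+samples shown above.
+

```python
def channel_line_to_dvbv5(name, line):
    """Convert one old-format DVB-T line into a new-format entry."""
    fields = line.split()
    if len(fields) != 9 or fields[0] != "T":
        raise ValueError("expected a 9-field DVB-T line: " + line)
    _, freq, bandwidth, fec_hp, fec_lp, modulation, mode, guard, hier = fields

    if not bandwidth.endswith("MHz"):
        raise ValueError("unexpected bandwidth: " + bandwidth)
    bandwidth_hz = int(bandwidth[:-3]) * 1_000_000   # "7MHz" -> 7000000

    # "QAM64" becomes "QAM/64"; other names such as "QPSK" are kept as-is.
    if modulation.startswith("QAM") and modulation[3:].isdigit():
        modulation = "QAM/" + modulation[3:]

    entries = [
        ("DELIVERY_SYSTEM", "DVBT"),
        ("FREQUENCY", int(freq)),
        ("BANDWIDTH_HZ", bandwidth_hz),
        ("CODE_RATE_HP", fec_hp),
        ("CODE_RATE_LP", fec_lp),
        ("MODULATION", modulation),
        ("TRANSMISSION_MODE", mode.upper()),          # "8k" -> "8K"
        ("GUARD_INTERVAL", guard),
        ("HIERARCHY", hier),
        ("INVERSION", "AUTO"),   # the old format has no inversion column
    ]
    lines = ["[" + name + "]"]
    lines += ["\t{} = {}".format(key, value) for key, value in entries]
    return "\n".join(lines)


print(channel_line_to_dvbv5(
    "Seven", "T 177500000 7MHz AUTO AUTO QAM64 8k 1/16 NONE"))
```

+
+Running it on the first line of the old-format sample reproduces the
+fields of the ``[Seven]`` entry shown above.
+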
+For an updated version of the complete table, please see:
+
+ https://git.linuxtv.org/dtv-scan-tables.git/tree/dvb-t/au-Melbourne
+
+When the Digital TV scanning utility runs, it will output a file
+containing the information for all the audio and video programs that
+exist in each channel's transponders which the card's frontend can
+lock onto (i.e. any whose signal is strong enough at your antenna).
+
+Here's the output of the dvbv5 tools from a channel scan taken in
+Melbourne::
+
+ [ABC HDTV]
+ SERVICE_ID = 560
+ VIDEO_PID = 2307
+ AUDIO_PID = 0
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 226500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 3/4
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [ABC TV Melbourne]
+ SERVICE_ID = 561
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 226500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 3/4
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [ABC TV 2]
+ SERVICE_ID = 562
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 226500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 3/4
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [ABC TV 3]
+ SERVICE_ID = 563
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 226500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 3/4
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [ABC TV 4]
+ SERVICE_ID = 564
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 226500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 3/4
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [ABC DiG Radio]
+ SERVICE_ID = 566
+ VIDEO_PID = 0
+ AUDIO_PID = 2311
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 226500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 3/4
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital]
+ SERVICE_ID = 1585
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital 1]
+ SERVICE_ID = 1586
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital 2]
+ SERVICE_ID = 1587
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital 3]
+ SERVICE_ID = 1588
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital]
+ SERVICE_ID = 1589
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital 4]
+ SERVICE_ID = 1590
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital]
+ SERVICE_ID = 1591
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN HD]
+ SERVICE_ID = 1592
+ VIDEO_PID = 514
+ AUDIO_PID = 0
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [TEN Digital]
+ SERVICE_ID = 1593
+ VIDEO_PID = 512
+ AUDIO_PID = 650
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 219500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [Nine Digital]
+ SERVICE_ID = 1072
+ VIDEO_PID = 513
+ AUDIO_PID = 660
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 191625000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [Nine Digital HD]
+ SERVICE_ID = 1073
+ VIDEO_PID = 512
+ AUDIO_PID = 0
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 191625000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [Nine Guide]
+ SERVICE_ID = 1074
+ VIDEO_PID = 514
+ AUDIO_PID = 670
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 191625000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 3/4
+ CODE_RATE_LP = 1/2
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/16
+ HIERARCHY = NONE
+
+ [7 Digital]
+ SERVICE_ID = 1328
+ VIDEO_PID = 769
+ AUDIO_PID = 770
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 177500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [7 Digital 1]
+ SERVICE_ID = 1329
+ VIDEO_PID = 769
+ AUDIO_PID = 770
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 177500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [7 Digital 2]
+ SERVICE_ID = 1330
+ VIDEO_PID = 769
+ AUDIO_PID = 770
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 177500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [7 Digital 3]
+ SERVICE_ID = 1331
+ VIDEO_PID = 769
+ AUDIO_PID = 770
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 177500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [7 HD Digital]
+ SERVICE_ID = 1332
+ VIDEO_PID = 833
+ AUDIO_PID = 834
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 177500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [7 Program Guide]
+ SERVICE_ID = 1334
+ VIDEO_PID = 865
+ AUDIO_PID = 866
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 177500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [SBS HD]
+ SERVICE_ID = 784
+ VIDEO_PID = 102
+ AUDIO_PID = 103
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 536500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [SBS DIGITAL 1]
+ SERVICE_ID = 785
+ VIDEO_PID = 161
+ AUDIO_PID = 81
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 536500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [SBS DIGITAL 2]
+ SERVICE_ID = 786
+ VIDEO_PID = 162
+ AUDIO_PID = 83
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 536500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [SBS EPG]
+ SERVICE_ID = 787
+ VIDEO_PID = 163
+ AUDIO_PID = 85
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 536500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [SBS RADIO 1]
+ SERVICE_ID = 798
+ VIDEO_PID = 0
+ AUDIO_PID = 201
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 536500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
+
+ [SBS RADIO 2]
+ SERVICE_ID = 799
+ VIDEO_PID = 0
+ AUDIO_PID = 202
+ DELIVERY_SYSTEM = DVBT
+ FREQUENCY = 536500000
+ INVERSION = OFF
+ BANDWIDTH_HZ = 7000000
+ CODE_RATE_HP = 2/3
+ CODE_RATE_LP = 2/3
+ MODULATION = QAM/64
+ TRANSMISSION_MODE = 8K
+ GUARD_INTERVAL = 1/8
+ HIERARCHY = NONE
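+
+A listing like the one above is easy to post-process. The sketch below
+groups service names by transponder frequency. It is a hypothetical
+helper, not part of the dvbv5 tools. A small hand-written parser is used
+because section names such as ``[TEN Digital]`` repeat in the scan
+output, which Python's ``configparser`` rejects in its default strict
+mode.
+

```python
from collections import defaultdict


def parse_dvbv5_channels(text):
    """Return a list of dicts, one per [section], keeping duplicates."""
    services = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = {"NAME": line[1:-1]}
            services.append(current)
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return services


def group_by_frequency(services):
    """Map each FREQUENCY (in Hz) to the service names carried on it."""
    groups = defaultdict(list)
    for service in services:
        groups[int(service["FREQUENCY"])].append(service["NAME"])
    return dict(groups)


sample = """
[ABC HDTV]
    SERVICE_ID = 560
    FREQUENCY = 226500000
[TEN Digital]
    SERVICE_ID = 1585
    FREQUENCY = 219500000
[TEN Digital]
    SERVICE_ID = 1589
    FREQUENCY = 219500000
"""
for freq, names in sorted(group_by_frequency(parse_dvbv5_channels(sample)).items()):
    print(freq, names)
```

+
+Each frequency in the result corresponds to one transponder, so all
+services listed under it are received with a single tuning operation.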
diff --git a/Documentation/admin-guide/media/dvb_references.rst b/Documentation/admin-guide/media/dvb_references.rst
new file mode 100644
index 000000000000..4f0fd4259cfa
--- /dev/null
+++ b/Documentation/admin-guide/media/dvb_references.rst
@@ -0,0 +1,29 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+References
+==========
+
+The main development site and GIT repository for Digital TV
+drivers is https://linuxtv.org.
+
+The DVB mailing list linux-dvb is hosted at vger. Please see
+http://vger.kernel.org/vger-lists.html#linux-media for details.
+
+There are also some other old lists hosted at:
+https://linuxtv.org/lists.php. If you're interested in them for historical
+reasons, please check the archive at https://linuxtv.org/pipermail/linux-dvb/.
+
+The media subsystem Wiki is hosted at https://linuxtv.org/wiki/.
+There, you'll find lots of information on both the development and usage
+of media boards. Please check it before asking newbie questions on the
+mailing list or IRC channels.
+
+The API is documented in the kernel tree. You can find the documentation
+in both html and pdf formats, together with other useful documentation at:
+
+ - https://linuxtv.org/docs.php.
+
+You may also find useful material at https://linuxtv.org/downloads/.
+
+In order to get the needed firmware for some drivers to work, there's
+a script in the kernel tree, at scripts/get_dvb_firmware.
diff --git a/Documentation/admin-guide/media/em28xx-cardlist.rst b/Documentation/admin-guide/media/em28xx-cardlist.rst
new file mode 100644
index 000000000000..ace65718ea22
--- /dev/null
+++ b/Documentation/admin-guide/media/em28xx-cardlist.rst
@@ -0,0 +1,440 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+EM28xx cards list
+=================
+
+.. tabularcolumns:: |p{1.4cm}|p{10.0cm}|p{1.9cm}|p{4.2cm}|
+
+.. flat-table::
+   :header-rows: 1
+   :widths: 2 12 3 16
+   :stub-columns: 0
+
+   * - Card number
+     - Card name
+     - Empia Chip
+     - USB IDs
+   * - 0
+     - Unknown EM2800 video grabber
+     - em2800
+     - eb1a:2800
+   * - 1
+     - Unknown EM2750/28xx video grabber
+     - em2820 or em2840
+     - eb1a:2710, eb1a:2820, eb1a:2821, eb1a:2860, eb1a:2861, eb1a:2862, eb1a:2863, eb1a:2870, eb1a:2881, eb1a:2883, eb1a:2868, eb1a:2875
+   * - 2
+     - Terratec Cinergy 250 USB
+     - em2820 or em2840
+     - 0ccd:0036
+   * - 3
+     - Pinnacle PCTV USB 2
+     - em2820 or em2840
+     - 2304:0208
+   * - 4
+     - Hauppauge WinTV USB 2
+     - em2820 or em2840
+     - 2040:4200, 2040:4201
+   * - 5
+     - MSI VOX USB 2.0
+     - em2820 or em2840
+     -
+   * - 6
+     - Terratec Cinergy 200 USB
+     - em2800
+     -
+   * - 7
+     - Leadtek Winfast USB II
+     - em2800
+     - 0413:6023
+   * - 8
+     - Kworld USB2800
+     - em2800
+     -
+   * - 9
+     - Pinnacle Dazzle DVC 90/100/101/107 / Kaiser Baas Video to DVD maker / Kworld DVD Maker 2 / Plextor ConvertX PX-AV100U
+     - em2820 or em2840
+     - 1b80:e302, 1b80:e304, 2304:0207, 2304:021a, 093b:a003
+   * - 10
+     - Hauppauge WinTV HVR 900
+     - em2880
+     - 2040:6500
+   * - 11
+     - Terratec Hybrid XS
+     - em2880
+     -
+   * - 12
+     - Kworld PVR TV 2800 RF
+     - em2820 or em2840
+     -
+   * - 13
+     - Terratec Prodigy XS
+     - em2880
+     -
+   * - 14
+     - SIIG AVTuner-PVR / Pixelview Prolink PlayTV USB 2.0
+     - em2820 or em2840
+     -
+   * - 15
+     - V-Gear PocketTV
+     - em2800
+     -
+   * - 16
+     - Hauppauge WinTV HVR 950
+     - em2883
+     - 2040:6513, 2040:6517, 2040:651b
+   * - 17
+     - Pinnacle PCTV HD Pro Stick
+     - em2880
+     - 2304:0227
+   * - 18
+     - Hauppauge WinTV HVR 900 (R2)
+     - em2880
+     - 2040:6502
+   * - 19
+     - EM2860/SAA711X Reference Design
+     - em2860
+     -
+   * - 20
+     - AMD ATI TV Wonder HD 600
+     - em2880
+     - 0438:b002
+   * - 21
+     - eMPIA Technology, Inc. GrabBeeX+ Video Encoder
+     - em2800
+     - eb1a:2801
+   * - 22
+     - EM2710/EM2750/EM2751 webcam grabber
+     - em2750
+     - eb1a:2750, eb1a:2751
+   * - 23
+     - Huaqi DLCW-130
+     - em2750
+     -
+   * - 24
+     - D-Link DUB-T210 TV Tuner
+     - em2820 or em2840
+     - 2001:f112
+   * - 25
+     - Gadmei UTV310
+     - em2820 or em2840
+     -
+   * - 26
+     - Hercules Smart TV USB 2.0
+     - em2820 or em2840
+     -
+   * - 27
+     - Pinnacle PCTV USB 2 (Philips FM1216ME)
+     - em2820 or em2840
+     -
+   * - 28
+     - Leadtek Winfast USB II Deluxe
+     - em2820 or em2840
+     -
+   * - 29
+     - EM2860/TVP5150 Reference Design
+     - em2860
+     - eb1a:5051
+   * - 30
+     - Videology 20K14XUSB USB2.0
+     - em2820 or em2840
+     -
+   * - 31
+     - Usbgear VD204v9
+     - em2821
+     -
+   * - 32
+     - Supercomp USB 2.0 TV
+     - em2821
+     -
+   * - 33
+     - Elgato Video Capture
+     - em2860
+     - 0fd9:0033
+   * - 34
+     - Terratec Cinergy A Hybrid XS
+     - em2860
+     - 0ccd:004f
+   * - 35
+     - Typhoon DVD Maker
+     - em2860
+     -
+   * - 36
+     - NetGMBH Cam
+     - em2860
+     -
+   * - 37
+     - Gadmei UTV330
+     - em2860
+     - eb1a:50a6
+   * - 38
+     - Yakumo MovieMixer
+     - em2861
+     -
+   * - 39
+     - KWorld PVRTV 300U
+     - em2861
+     - eb1a:e300
+   * - 40
+     - Plextor ConvertX PX-TV100U
+     - em2861
+     - 093b:a005
+   * - 41
+     - Kworld 350 U DVB-T
+     - em2870
+     - eb1a:e350
+   * - 42
+     - Kworld 355 U DVB-T
+     - em2870
+     - eb1a:e355, eb1a:e357, eb1a:e359
+   * - 43
+     - Terratec Cinergy T XS
+     - em2870
+     -
+   * - 44
+     - Terratec Cinergy T XS (MT2060)
+     - em2870
+     - 0ccd:0043
+   * - 45
+     - Pinnacle PCTV DVB-T
+     - em2870
+     -
+   * - 46
+     - Compro, VideoMate U3
+     - em2870
+     - 185b:2870
+   * - 47
+     - KWorld DVB-T 305U
+     - em2880
+     - eb1a:e305
+   * - 48
+     - KWorld DVB-T 310U
+     - em2880
+     -
+   * - 49
+     - MSI DigiVox A/D
+     - em2880
+     - eb1a:e310
+   * - 50
+     - MSI DigiVox A/D II
+     - em2880
+     - eb1a:e320
+   * - 51
+     - Terratec Hybrid XS Secam
+     - em2880
+     - 0ccd:004c
+   * - 52
+     - DNT DA2 Hybrid
+     - em2881
+     -
+   * - 53
+     - Pinnacle Hybrid Pro
+     - em2881
+     -
+   * - 54
+     - Kworld VS-DVB-T 323UR
+     - em2882
+     - eb1a:e323
+   * - 55
+     - Terratec Cinergy Hybrid T USB XS (em2882)
+     - em2882
+     - 0ccd:005e, 0ccd:0042
+   * - 56
+     - Pinnacle Hybrid Pro (330e)
+     - em2882
+     - 2304:0226
+   * - 57
+     - Kworld PlusTV HD Hybrid 330
+     - em2883
+     - eb1a:a316
+   * - 58
+     - Compro VideoMate ForYou/Stereo
+     - em2820 or em2840
+     - 185b:2041
+   * - 59
+     - Pinnacle PCTV HD Mini
+     - em2874
+     - 2304:023f
+   * - 60
+     - Hauppauge WinTV HVR 850
+     - em2883
+     - 2040:651f
+   * - 61
+     - Pixelview PlayTV Box 4 USB 2.0
+     - em2820 or em2840
+     -
+   * - 62
+     - Gadmei TVR200
+     - em2820 or em2840
+     -
+   * - 63
+     - Kaiomy TVnPC U2
+     - em2860
+     - eb1a:e303
+   * - 64
+     - Easy Cap Capture DC-60
+     - em2860
+     - 1b80:e309
+   * - 65
+     - IO-DATA GV-MVP/SZ
+     - em2820 or em2840
+     - 04bb:0515
+   * - 66
+     - Empire dual TV
+     - em2880
+     -
+   * - 67
+     - Terratec Grabby
+     - em2860
+     - 0ccd:0096, 0ccd:10AF
+   * - 68
+     - Terratec AV350
+     - em2860
+     - 0ccd:0084
+   * - 69
+     - KWorld ATSC 315U HDTV TV Box
+     - em2882
+     - eb1a:a313
+   * - 70
+     - Evga inDtube
+     - em2882
+     -
+   * - 71
+     - Silvercrest Webcam 1.3mpix
+     - em2820 or em2840
+     -
+   * - 72
+     - Gadmei UTV330+
+     - em2861
+     -
+   * - 73
+     - Reddo DVB-C USB TV Box
+     - em2870
+     -
+   * - 74
+     - Actionmaster/LinXcel/Digitus VC211A
+     - em2800
+     -
+   * - 75
+     - Dikom DK300
+     - em2882
+     -
+   * - 76
+     - KWorld PlusTV 340U or UB435-Q (ATSC)
+     - em2870
+     - 1b80:a340
+   * - 77
+     - EM2874 Leadership ISDBT
+     - em2874
+     -
+   * - 78
+     - PCTV nanoStick T2 290e
+     - em28174
+     - 2013:024f
+   * - 79
+     - Terratec Cinergy H5
+     - em2884
+     - eb1a:2885, 0ccd:10a2, 0ccd:10ad, 0ccd:10b6
+   * - 80
+     - PCTV DVB-S2 Stick (460e)
+     - em28174
+     - 2013:024c
+   * - 81
+     - Hauppauge WinTV HVR 930C
+     - em2884
+     - 2040:1605
+   * - 82
+     - Terratec Cinergy HTC Stick
+     - em2884
+     - 0ccd:00b2
+   * - 83
+     - Honestech Vidbox NW03
+     - em2860
+     - eb1a:5006
+   * - 84
+     - MaxMedia UB425-TC
+     - em2874
+     - 1b80:e425
+   * - 85
+     - PCTV QuatroStick (510e)
+     - em2884
+     - 2304:0242
+   * - 86
+     - PCTV QuatroStick nano (520e)
+     - em2884
+     - 2013:0251
+   * - 87
+     - Terratec Cinergy HTC USB XS
+     - em2884
+     - 0ccd:008e, 0ccd:00ac
+   * - 88
+     - C3 Tech Digital Duo HDTV/SDTV USB
+     - em2884
+     - 1b80:e755
+   * - 89
+     - Delock 61959
+     - em2874
+     - 1b80:e1cc
+   * - 90
+     - KWorld USB ATSC TV Stick UB435-Q V2
+     - em2874
+     - 1b80:e346
+   * - 91
+     - SpeedLink Vicious And Devine Laplace webcam
+     - em2765
+     - 1ae7:9003, 1ae7:9004
+   * - 92
+     - PCTV DVB-S2 Stick (461e)
+     - em28178
+     - 2013:0258
+   * - 93
+     - KWorld USB ATSC TV Stick UB435-Q V3
+     - em2874
+     - 1b80:e34c
+   * - 94
+     - PCTV tripleStick (292e)
+     - em28178
+     - 2013:025f, 2013:0264, 2040:0264, 2040:8264, 2040:8268
+   * - 95
+     - Leadtek VC100
+     - em2861
+     - 0413:6f07
+   * - 96
+     - Terratec Cinergy T2 Stick HD
+     - em28178
+     - eb1a:8179
+   * - 97
+     - Elgato EyeTV Hybrid 2008 INT
+     - em2884
+     - 0fd9:0018
+   * - 98
+     - PLEX PX-BCUD
+     - em28178
+     - 3275:0085
+   * - 99
+     - Hauppauge WinTV-dualHD DVB
+     - em28174
+     - 2040:0265, 2040:8265
+   * - 100
+     - Hauppauge WinTV-dualHD 01595 ATSC/QAM
+     - em28174
+     - 2040:026d, 2040:826d
+   * - 101
+     - Terratec Cinergy H6 rev. 2
+     - em2884
+     - 0ccd:10b2
+   * - 102
+     - :ZOLID HYBRID TV STICK
+     - em2882
+     -
+   * - 103
+     - Magix USB Videowandler-2
+     - em2861
+     - 1b80:e349
+   * - 104
+     - PCTV DVB-S2 Stick (461e v2)
+     - em28178
+     - 2013:0461, 2013:0259
+   * - 105
+     - MyGica iGrabber
+     - em2860
+     - 1f4d:1abe
diff --git a/Documentation/admin-guide/media/faq.rst b/Documentation/admin-guide/media/faq.rst
new file mode 100644
index 000000000000..b63548b6f313
--- /dev/null
+++ b/Documentation/admin-guide/media/faq.rst
@@ -0,0 +1,216 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+FAQ
+===
+
+.. note::
+
+ 1. With Digital TV, a single physical channel may have different
+ contents inside it. The specs call each one a *service*.
+ This is what a TV user would call a "channel". So, in order to
+ avoid confusion, this FAQ uses *transponder* for the physical
+ channel and *service* for the logical channel.
+ 2. The LinuxTV community maintains some Wiki pages which contain
+ a lot of information related to the media subsystem. If you
+ don't find an answer for your needs here, it is likely that
+ you'll be able to get something useful there. It is hosted
+ at:
+
+ https://www.linuxtv.org/wiki/
+
+Some very frequently asked questions about Linux Digital TV support
+
+1. The signal seems to die a few seconds after tuning.
+
+ It's not a bug, it's a feature. Because the frontends have
+ significant power requirements (and hence get very hot), they
+ are powered down if they are unused (i.e. if the frontend device
+ is closed). The ``dvb-core`` module parameter ``dvb_shutdown_timeout``
+ allows you to change the timeout (default 5 seconds). Setting the
+ timeout to 0 disables the timeout feature.
+
+2. How can I watch TV?
+
+ Together with the Linux Kernel, the Digital TV developers support
+ some simple utilities which are mainly intended for testing
+ and for demonstrating how the DVB API works. These are called the DVB v5
+ tools and are kept in the ``v4l-utils`` git repository:
+
+ https://git.linuxtv.org/v4l-utils.git/
+
+ You can find more information at the LinuxTV wiki:
+
+ https://www.linuxtv.org/wiki/index.php/DVBv5_Tools
+
+ The first step is to get a list of services that are transmitted.
+
+ Several existing tools can do this. You can use,
+ for example, the ``dvbv5-scan`` tool. You can find more information
+ about it at:
+
+ https://www.linuxtv.org/wiki/index.php/Dvbv5-scan
+
+ There are some other applications like ``w_scan`` [#]_ that do a
+ blind scan, trying hard to find all possible channels, but
+ those take a long time to run.
+
+ .. [#] https://www.linuxtv.org/wiki/index.php/W_scan
+
+ Also, some applications like ``kaffeine`` have their own code
+ to scan for services. So, you don't need to use an external
+ application to obtain such a list.
+
+ Most such tools need a file containing a list of the channel
+ transponders available in your area. So, LinuxTV developers
+ maintain tables of Digital TV channel transponders, receiving
+ patches from the community to keep them updated.
+
+ This list is hosted at:
+
+ https://git.linuxtv.org/dtv-scan-tables.git
+
+ It is also packaged by several distributions.
+
+ Kaffeine has some blind scan support for some terrestrial standards.
+ It also relies on the DTV scan tables, although it contains a copy
+ of them internally (and, if requested by the user, it will download
+ newer versions of them).
+
+ If you are lucky you can just use one of the supplied channel
+ transponders. If not, you may need to look for such info on
+ the Internet and create a new file. There are several sites which
+ contain physical channel lists. For cable and satellite, usually
+ knowing how to tune into a single channel is enough for the
+ scanning tool to identify the other channels. In some places,
+ this could also work for terrestrial transmissions.
+
+ Once you have a transponders list, you need to generate a services
+ list with a tool like ``dvbv5-scan``.
+
+ Almost no modern Digital TV cards have built-in hardware
+ MPEG decoders. So, it is up to the application to get the MPEG-TS
+ stream provided by the board, split it into audio, video and other
+ data, and decode it.
+
+3. Which Digital TV applications exist?
+
+ Several media player applications are capable of tuning into
+ digital TV channels, including Kaffeine, VLC, MPlayer and MythTV.
+
+ Kaffeine aims to be very user-friendly, and it is maintained
+ by one of the Kernel driver developers.
+
+ A comprehensive list of those and other apps can be found at:
+
+ https://www.linuxtv.org/wiki/index.php/TV_Related_Software
+
+ Some of the most popular ones are linked below:
+
+ https://kde.org/applications/multimedia/org.kde.kaffeine
+ KDE media player, focused on Digital TV support
+
+ https://www.linuxtv.org/vdrwiki/index.php/Main_Page
+ Klaus Schmidinger's Video Disk Recorder
+
+ https://linuxtv.org/downloads and https://git.linuxtv.org/
+ Digital TV and other media-related applications and
+ Kernel drivers. The ``v4l-utils`` package there contains
+ several Swiss Army knife tools for use with Digital TV.
+
+ http://sourceforge.net/projects/dvbtools/
+ Dave Chapman's dvbtools package, including
+ dvbstream and dvbtune
+
+ http://www.dbox2.info/
+ LinuxDVB on the dBox2
+
+ http://www.tuxbox.org/
+ The TuxBox CVS: many interesting DVB applications and the dBox2
+ DVB source
+
+ http://www.nenie.org/misc/mpsys/
+ MPSYS: an MPEG2 system library and tools
+
+ https://www.videolan.org/vlc/index.pt.html
+ VLC
+
+ http://mplayerhq.hu/
+ MPlayer
+
+ http://xine.sourceforge.net/ and http://xinehq.de/
+ Xine
+
+ http://www.mythtv.org/
+ MythTV - analog TV and digital TV PVR
+
+ http://dvbsnoop.sourceforge.net/
+ DVB sniffer program to monitor, analyze, debug, dump
+ or view dvb/mpeg/dsm-cc/mhp stream information (TS,
+ PES, SECTION)
+
+4. Can't get a signal tuned correctly
+
+ That could have many causes. In my personal experience,
+ TV cards usually need stronger signals than TV sets, and are more
+ sensitive to noise. So, perhaps you just need a better antenna or
+ cabling. Yet, it could also be a hardware or driver issue.
+
+ For example, if you are using a Technotrend/Hauppauge DVB-C card
+ *without* an analog module, you might have to use the module parameter
+ ``adac=-1`` (dvb-ttpci.o).
+
+ Please see the FAQ page at linuxtv.org, as it could contain some
+ valuable information:
+
+ https://www.linuxtv.org/wiki/index.php/FAQ_%26_Troubleshooting
+
+ If that doesn't work, check the linux-media ML archives, to
+ see if someone else had a similar problem with your hardware
+ and/or digital TV service provider:
+
+ https://lore.kernel.org/linux-media/
+
+ If none of this works, you can try sending an e-mail to the
+ linux-media ML and see if someone else could shed some light.
+ The e-mail is linux-media AT vger.kernel.org.
+
+5. The ``dvb_net`` device doesn't give me any packets at all
+
+ Run ``tcpdump`` on the ``dvb0_0`` interface. This sets the interface
+ into promiscuous mode so it accepts any packets from the PID
+ you have configured with the ``dvbnet`` utility. Check if there
+ are any packets with the IP addr and MAC addr you have
+ configured with ``ifconfig`` or with ``ip addr``.
+
+ If ``tcpdump`` doesn't give you any output, check the statistics
+ which ``ifconfig`` or ``netstat -ni`` outputs. (Note: If the MAC
+ address is wrong, ``dvb_net`` won't get any input; thus you have to
+ run ``tcpdump`` before checking the statistics.) If there are no
+ packets at all then maybe the PID is wrong. If there are error packets,
+ then either the PID is wrong or the stream does not conform to
+ the MPE standard (EN 301 192, http://www.etsi.org/). You can
+ use e.g. ``dvbsnoop`` for debugging.
+
+6. The ``dvb_net`` device doesn't give me any multicast packets
+
+ Check whether your routes include the multicast address range.
+ Additionally make sure that "source validation by reversed path
+ lookup" is disabled::
+
+ # echo 0 > /proc/sys/net/ipv4/conf/dvb0/rp_filter
+
+7. What are all those modules that need to be loaded?
+
+ In order to make it more flexible and support different hardware
+ combinations, the media subsystem is written in a modular way.
+
+ So, besides the Digital TV hardware module for the main chipset,
+ the kernel also needs to load a frontend driver, plus the Digital TV
+ core. If the board also has a remote controller, it will also
+ need the remote controller core and the remote controller tables.
+ The same happens if the board has support for analog TV: the
+ core support for video4linux needs to be loaded.
+
+ The actual module names are Linux-kernel version specific, as,
+ from time to time, things change, in order to make the media
+ support more flexible.
diff --git a/Documentation/admin-guide/media/fimc.rst b/Documentation/admin-guide/media/fimc.rst
new file mode 100644
index 000000000000..267ef52fe387
--- /dev/null
+++ b/Documentation/admin-guide/media/fimc.rst
@@ -0,0 +1,153 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+The Samsung S5P/Exynos4 FIMC driver
+===================================
+
+Copyright |copy| 2012 - 2013 Samsung Electronics Co., Ltd.
+
+The FIMC (Fully Interactive Mobile Camera) device available in Samsung
+SoC Application Processors is an integrated camera host interface, color
+space converter, image resizer and rotator. It's also capable of capturing
+data from the LCD controller (FIMD) through the SoC internal writeback data
+path. There are multiple FIMC instances in the SoCs (up to 4), having
+slightly different capabilities, like pixel alignment constraints, rotator
+availability, LCD writeback support, etc. The driver is located in the
+drivers/media/platform/samsung/exynos4-is directory.
+
+Supported SoCs
+--------------
+
+S5PC100 (mem-to-mem only), S5PV210, Exynos4210
+
+Supported features
+------------------
+
+- camera parallel interface capture (ITU-R.BT601/656);
+- camera serial interface capture (MIPI-CSI2);
+- memory-to-memory processing (color space conversion, scaling, mirror
+ and rotation);
+- dynamic pipeline re-configuration at runtime (re-attachment of any FIMC
+ instance to any parallel video input or any MIPI-CSI front-end);
+- runtime PM and system-wide suspend/resume.
+
+Not currently supported
+-----------------------
+
+- LCD writeback input
+- per frame clock gating (mem-to-mem)
+
+User space interfaces
+---------------------
+
+Media device interface
+~~~~~~~~~~~~~~~~~~~~~~
+
+The driver supports the Media Controller API as defined at :ref:`media_controller`.
+The media device driver name is "Samsung S5P FIMC".
+
+The purpose of this interface is to allow changing the assignment of FIMC instances
+to the SoC peripheral camera input at runtime and optionally to control internal
+connections of the MIPI-CSIS device(s) to the FIMC entities.
+
+The media device interface allows configuring the SoC to capture image
+data from the sensor through more than one FIMC instance (e.g. for simultaneous
+viewfinder and still capture setup).
+
+Reconfiguration is done by enabling/disabling media links created by the driver
+during initialization. The internal device topology can be easily discovered
+through media entity and links enumeration.
+
+Memory-to-memory video node
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The V4L2 memory-to-memory interface is available at the /dev/video? device node.
+This is a standalone video device; it has no media pads. Note, however, that
+mem-to-mem and capture video node operation on the same FIMC instance is not
+allowed. The driver detects such cases, but applications should prevent them
+to avoid undefined behaviour.
+
+Capture video node
+~~~~~~~~~~~~~~~~~~
+
+The driver supports the V4L2 Video Capture Interface as defined at
+:ref:`devices`.
+
+At the capture and mem-to-mem video nodes only the multi-planar API is
+supported. For more details see: :ref:`planar-apis`.
+
+Camera capture subdevs
+~~~~~~~~~~~~~~~~~~~~~~
+
+Each FIMC instance exports a sub-device node (/dev/v4l-subdev?). A sub-device
+node is also created for each MIPI-CSI receiver device that is available and
+enabled at the platform level (currently up to two).
+
+sysfs
+~~~~~
+
+In order to enable more precise camera pipeline control through the sub-device
+API, the driver creates a sysfs entry associated with the "s5p-fimc-md" platform
+device. The entry path is: /sys/platform/devices/s5p-fimc-md/subdev_conf_mode.
+
+In a typical use case, the capture pipeline could be configured as follows:
+sensor subdev -> mipi-csi subdev -> fimc subdev -> video node
+
+When we configure these devices through the sub-device API at user space, the
+configuration flow must be from left to right, and the video node is
+configured last.
+
+When we don't use the sub-device user space API, the whole configuration of all
+devices belonging to the pipeline is done at the video node driver.
+The sysfs entry allows instructing the capture node driver not to configure
+the sub-devices (format, crop), to avoid resetting the subdevs' configuration
+when the last configuration steps at the video node are performed.
+
+For full sub-device control support (subdevs configured at user space before
+starting streaming):
+
+.. code-block:: none
+
+ # echo "sub-dev" > /sys/platform/devices/s5p-fimc-md/subdev_conf_mode
+
+For V4L2 video node control only (subdevs configured internally by the host
+driver):
+
+.. code-block:: none
+
+ # echo "vid-dev" > /sys/platform/devices/s5p-fimc-md/subdev_conf_mode
+
+This is the default option.
+
+Device mapping to video and subdev device nodes
+-----------------------------------------------
+
+Two video device nodes are associated with each device instance in
+hardware: video capture and mem-to-mem. There is additionally a subdev node for
+more precise FIMC capture subsystem control. In addition, a separate v4l2
+sub-device node is created for each MIPI-CSIS device.
+
+How to find out which /dev/video? or /dev/v4l-subdev? is assigned to which
+device?
+
+You can either grep through the kernel log to find relevant information, e.g.
+
+.. code-block:: none
+
+ # dmesg | grep -i fimc
+
+(note that udev, if present, might still have rearranged the video nodes),
+
+or retrieve the information from /dev/media? with help of the media-ctl tool:
+
+.. code-block:: none
+
+ # media-ctl -p
+
+Build
+-----
+
+If the driver is built as a loadable kernel module (CONFIG_VIDEO_SAMSUNG_S5P_FIMC=m),
+two modules are created (in addition to the core v4l2 modules): s5p-fimc.ko and
+the optional s5p-csis.ko (MIPI-CSI receiver subdev).
diff --git a/Documentation/admin-guide/media/frontend-cardlist.rst b/Documentation/admin-guide/media/frontend-cardlist.rst
new file mode 100644
index 000000000000..ba5b7c69a978
--- /dev/null
+++ b/Documentation/admin-guide/media/frontend-cardlist.rst
@@ -0,0 +1,226 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Frontend drivers
+================
+
+.. note::
+
+ #) There is no guarantee that every frontend driver works
+ out of the box with every card, because of different wiring.
+
+ #) The demodulator chips can be used with a variety of
+ tuner/PLL chips, and not all combinations are supported. Often
+ the demodulator and tuner/PLL chip are inside a metal box for
+ shielding, and the whole metal box has its own part number.
+
+
+Common Interface (EN50221) controller drivers
+=============================================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+cxd2099         Sony CXD2099AR Common Interface driver
+sp2             CIMaX SP2
+==============  =========================================================
+
+ATSC (North American/Korean Terrestrial/Cable DTV) frontends
+============================================================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+au8522_dig      Auvitek AU8522 based DTV demod
+au8522_decoder  Auvitek AU8522 based ATV demod
+bcm3510         Broadcom BCM3510
+lg2160          LG Electronics LG216x based
+lgdt3305        LG Electronics LGDT3304 and LGDT3305 based
+lgdt3306a       LG Electronics LGDT3306A based
+lgdt330x        LG Electronics LGDT3302/LGDT3303 based
+nxt200x         NxtWave Communications NXT2002/NXT2004 based
+or51132         Oren OR51132 based
+or51211         Oren OR51211 based
+s5h1409         Samsung S5H1409 based
+s5h1411         Samsung S5H1411 based
+==============  =========================================================
+
+DVB-C (cable) frontends
+=======================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+stv0297         ST STV0297 based
+tda10021        Philips TDA10021 based
+tda10023        Philips TDA10023 based
+ves1820         VLSI VES1820 based
+==============  =========================================================
+
+DVB-S (satellite) frontends
+===========================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+cx24110         Conexant CX24110 based
+cx24116         Conexant CX24116 based
+cx24117         Conexant CX24117 based
+cx24120         Conexant CX24120 based
+cx24123         Conexant CX24123 based
+ds3000          Montage Technology DS3000 based
+mb86a16         Fujitsu MB86A16 based
+mt312           Zarlink VP310/MT312/ZL10313 based
+s5h1420         Samsung S5H1420 based
+si21xx          Silicon Labs SI21XX based
+stb6000         ST STB6000 silicon tuner
+stv0288         ST STV0288 based
+stv0299         ST STV0299 based
+stv0900         ST STV0900 based
+stv6110         ST STV6110 silicon tuner
+tda10071        NXP TDA10071
+tda10086        Philips TDA10086 based
+tda8083         Philips TDA8083 based
+tda8261         Philips TDA8261 based
+tda826x         Philips TDA826X silicon tuner
+ts2020          Montage Technology TS2020 based tuners
+tua6100         Infineon TUA6100 PLL
+cx24113         Conexant CX24113/CX24128 tuner for DVB-S/DSS
+itd1000         Integrant ITD1000 Zero IF tuner for DVB-S/DSS
+ves1x93         VLSI VES1893 or VES1993 based
+zl10036         Zarlink ZL10036 silicon tuner
+zl10039         Zarlink ZL10039 silicon tuner
+==============  =========================================================
+
+DVB-T (terrestrial) frontends
+=============================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+af9013          Afatech AF9013 demodulator
+cx22700         Conexant CX22700 based
+cx22702         Conexant cx22702 demodulator (OFDM)
+cxd2820r        Sony CXD2820R
+cxd2841er       Sony CXD2841ER
+cxd2880         Sony CXD2880 DVB-T2/T tuner + demodulator
+dib3000mb       DiBcom 3000M-B
+dib3000mc       DiBcom 3000P/M-C
+dib7000m        DiBcom 7000MA/MB/PA/PB/MC
+dib7000p        DiBcom 7000PC
+dib9000         DiBcom 9000
+drxd            Micronas DRXD driver
+ec100           E3C EC100
+l64781          LSI L64781
+mt352           Zarlink MT352 based
+nxt6000         NxtWave Communications NXT6000 based
+rtl2830         Realtek RTL2830 DVB-T
+rtl2832         Realtek RTL2832 DVB-T
+rtl2832_sdr     Realtek RTL2832 SDR
+s5h1432         Samsung s5h1432 demodulator (OFDM)
+si2168          Silicon Labs Si2168
+sp8870          Spase sp8870 based
+sp887x          Spase sp887x based
+stv0367         ST STV0367 based
+tda10048        Philips TDA10048HN based
+tda1004x        Philips TDA10045H/TDA10046H based
+zd1301_demod    ZyDAS ZD1301
+zl10353         Zarlink ZL10353 based
+==============  =========================================================
+
+Digital terrestrial only tuners/PLL
+===================================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+dvb-pll         Generic I2C PLL based tuners
+dib0070         DiBcom DiB0070 silicon base-band tuner
+dib0090         DiBcom DiB0090 silicon base-band tuner
+==============  =========================================================
+
+ISDB-S (satellite) & ISDB-T (terrestrial) frontends
+===================================================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+mn88443x        Socionext MN88443x
+tc90522         Toshiba TC90522
+==============  =========================================================
+
+ISDB-T (terrestrial) frontends
+==============================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+dib8000         DiBcom 8000MB/MC
+mb86a20s        Fujitsu mb86a20s
+s921            Sharp S921 frontend
+==============  =========================================================
+
+Multistandard (cable + terrestrial) frontends
+=============================================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+drxk            Micronas DRXK based
+mn88472         Panasonic MN88472
+mn88473         Panasonic MN88473
+si2165          Silicon Labs si2165 based
+tda18271c2dd    NXP TDA18271C2 silicon tuner
+==============  =========================================================
+
+Multistandard (satellite) frontends
+===================================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+m88ds3103       Montage Technology M88DS3103
+mxl5xx          MaxLinear MxL5xx based tuner-demodulators
+stb0899         STB0899 based
+stb6100         STB6100 based tuners
+stv090x         STV0900/STV0903(A/B) based
+stv0910         STV0910 based
+stv6110x        STV6110/(A) based tuners
+stv6111         STV6111 based tuners
+==============  =========================================================
+
+SEC control devices for DVB-S
+=============================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+a8293           Allegro A8293
+af9033          Afatech AF9033 DVB-T demodulator
+ascot2e         Sony Ascot2E tuner
+atbm8830        AltoBeam ATBM8830/8831 DMB-TH demodulator
+drx39xyj        Micronas DRX-J demodulator
+helene          Sony HELENE Sat/Ter tuner (CXD2858ER)
+horus3a         Sony Horus3A tuner
+isl6405         ISL6405 SEC controller
+isl6421         ISL6421 SEC controller
+isl6423         ISL6423 SEC controller
+ix2505v         Sharp IX2505V silicon tuner
+lgs8gl5         Silicon Legend LGS-8GL5 demodulator (OFDM)
+lgs8gxx         Legend Silicon LGS8913/LGS8GL5/LGS8GXX DMB-TH demodulator
+lnbh25          LNBH25 SEC controller
+lnbh29          LNBH29 SEC controller
+lnbp21          LNBP21/LNBH24 SEC controllers
+lnbp22          LNBP22 SEC controllers
+m88rs2000       M88RS2000 DVB-S demodulator and tuner
+tda665x         TDA665x tuner
+==============  =========================================================
+
+Tools to develop new frontends
+==============================
+
+==============  =========================================================
+Driver          Name
+==============  =========================================================
+dvb_dummy_fe    Dummy frontend driver
+==============  =========================================================
diff --git a/Documentation/admin-guide/media/gspca-cardlist.rst b/Documentation/admin-guide/media/gspca-cardlist.rst
new file mode 100644
index 000000000000..e3404d1589da
--- /dev/null
+++ b/Documentation/admin-guide/media/gspca-cardlist.rst
@@ -0,0 +1,451 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The gspca cards list
+====================
+
+The modules for the gspca webcam drivers are:
+
+- gspca_main: main driver
+- gspca\_\ *driver*: subdriver module with *driver* as follows
+
+=========  =========  ===================================================================
+*driver*   vend:prod  Device
+=========  =========  ===================================================================
+spca501    0000:0000  MystFromOri Unknown Camera
+spca508    0130:0130  Clone Digital Webcam 11043
+se401      03e8:0004  Endpoints/AoxSE401
+zc3xx      03f0:1b07  HP Premium Starter Cam
+m5602      0402:5602  ALi Video Camera Controller
+spca501    040a:0002  Kodak DVC-325
+spca500    040a:0300  Kodak EZ200
+zc3xx      041e:041e  Creative WebCam Live!
+ov519      041e:4003  Video Blaster WebCam Go Plus
+stv0680    041e:4007  Go Mini
+spca500    041e:400a  Creative PC-CAM 300
+sunplus    041e:400b  Creative PC-CAM 600
+sunplus    041e:4012  PC-Cam350
+sunplus    041e:4013  Creative Pccam750
+zc3xx      041e:4017  Creative Webcam Mobile PD1090
+spca508    041e:4018  Creative Webcam Vista (PD1100)
+spca561    041e:401a  Creative Webcam Vista (PD1100)
+zc3xx      041e:401c  Creative NX
+spca505    041e:401d  Creative Webcam NX ULTRA
+zc3xx      041e:401e  Creative Nx Pro
+zc3xx      041e:401f  Creative Webcam Notebook PD1171
+zc3xx      041e:4022  Webcam NX Pro
+pac207     041e:4028  Creative Webcam Vista Plus
+zc3xx      041e:4029  Creative WebCam Vista Pro
+zc3xx      041e:4034  Creative Instant P0620
+zc3xx      041e:4035  Creative Instant P0620D
+zc3xx      041e:4036  Creative Live !
+sq930x     041e:4038  Creative Joy-IT
+zc3xx      041e:403a  Creative Nx Pro 2
+spca561    041e:403b  Creative Webcam Vista (VF0010)
+sq930x     041e:403c  Creative Live! Ultra
+sq930x     041e:403d  Creative Live! Ultra for Notebooks
+sq930x     041e:4041  Creative Live! Motion
+zc3xx      041e:4051  Creative Live!Cam Notebook Pro (VF0250)
+ov519      041e:4052  Creative Live! VISTA IM
+zc3xx      041e:4053  Creative Live!Cam Video IM
+vc032x     041e:405b  Creative Live! Cam Notebook Ultra (VC0130)
+ov519      041e:405f  Creative Live! VISTA VF0330
+ov519      041e:4060  Creative Live! VISTA VF0350
+ov519      041e:4061  Creative Live! VISTA VF0400
+ov519      041e:4064  Creative Live! VISTA VF0420
+ov519      041e:4067  Creative Live! Cam Video IM (VF0350)
+ov519      041e:4068  Creative Live! VISTA VF0470
+sn9c2028   0458:7003  GeniusVideocam Live v2
+spca561    0458:7004  Genius VideoCAM Express V2
+sn9c2028   0458:7005  Genius Smart 300, version 2
+sunplus    0458:7006  Genius Dsc 1.3 Smart
+zc3xx      0458:7007  Genius VideoCam V2
+zc3xx      0458:700c  Genius VideoCam V3
+zc3xx      0458:700f  Genius VideoCam Web V2
+sonixj     0458:7025  Genius Eye 311Q
+sn9c20x    0458:7029  Genius Look 320s
+sonixj     0458:702e  Genius Slim 310 NB
+sn9c20x    0458:7045  Genius Look 1320 V2
+sn9c20x    0458:704a  Genius Slim 1320
+sn9c20x    0458:704c  Genius i-Look 1321
+sn9c20x    045e:00f4  LifeCam VX-6000 (SN9C20x + OV9650)
+sonixj     045e:00f5  MicroSoft VX3000
+sonixj     045e:00f7  MicroSoft VX1000
+ov519      045e:028c  Micro$oft xbox cam
+kinect     045e:02ae  Xbox NUI Camera
+kinect     045e:02bf  Kinect for Windows NUI Camera
+spca561    0461:0815  Micro Innovations IC200 Webcam
+sunplus    0461:0821  Fujifilm MV-1
+zc3xx      0461:0a00  MicroInnovation WebCam320
+stv06xx    046D:08F0  QuickCamMessenger
+stv06xx    046D:08F5  QuickCamCommunicate
+stv06xx    046D:08F6  QuickCamMessenger (new)
+stv06xx    046d:0840  QuickCamExpress
+stv06xx    046d:0850  LEGOcam / QuickCam Web
+stv06xx    046d:0870  DexxaWebCam USB
+spca500    046d:0890  Logitech QuickCam traveler
+vc032x     046d:0892  Logitech Orbicam
+vc032x     046d:0896  Logitech Orbicam
+vc032x     046d:0897  Logitech QuickCam for Dell notebooks
+zc3xx      046d:089d  Logitech QuickCam E2500
+zc3xx      046d:08a0  Logitech QC IM
+zc3xx      046d:08a1  Logitech QC IM 0x08A1 +sound
+zc3xx      046d:08a2  Labtec Webcam Pro
+zc3xx      046d:08a3  Logitech QC Chat
+zc3xx      046d:08a6  Logitech QCim
+zc3xx      046d:08a7  Logitech QuickCam Image
+zc3xx      046d:08a9  Logitech Notebook Deluxe
+zc3xx      046d:08aa  Labtec Webcam Notebook
+zc3xx      046d:08ac  Logitech QuickCam Cool
+zc3xx      046d:08ad  Logitech QCCommunicate STX
+zc3xx      046d:08ae  Logitech QuickCam for Notebooks
+zc3xx      046d:08af  Logitech QuickCam Cool
+zc3xx      046d:08b9  Logitech QuickCam Express
+zc3xx      046d:08d7  Logitech QCam STX
+zc3xx      046d:08d8  Logitech Notebook Deluxe
+zc3xx      046d:08d9  Logitech QuickCam IM/Connect
+zc3xx      046d:08da  Logitech QuickCam Messenger
+zc3xx      046d:08dd  Logitech QuickCam for Notebooks
+spca500    046d:0900  Logitech Inc. ClickSmart 310
+spca500    046d:0901  Logitech Inc. ClickSmart 510
+sunplus    046d:0905  Logitech ClickSmart 820
+tv8532     046d:0920  Logitech QuickCam Express
+tv8532     046d:0921  Labtec Webcam
+spca561    046d:0928  Logitech QC Express Etch2
+spca561    046d:0929  Labtec Webcam Elch2
+spca561    046d:092a  Logitech QC for Notebook
+spca561 046d:092b Labtec Webcam Plus
+spca561 046d:092c Logitech QC chat Elch2
+spca561 046d:092d Logitech QC Elch2
+spca561 046d:092e Logitech QC Elch2
+spca561 046d:092f Logitech QuickCam Express Plus
+sunplus 046d:0960 Logitech ClickSmart 420
+nw80x 046d:d001 Logitech QuickCam Pro (dark focus ring)
+se401 0471:030b PhilipsPCVC665K
+sunplus 0471:0322 Philips DMVC1300K
+zc3xx 0471:0325 Philips SPC 200 NC
+zc3xx 0471:0326 Philips SPC 300 NC
+sonixj 0471:0327 Philips SPC 600 NC
+sonixj 0471:0328 Philips SPC 700 NC
+zc3xx 0471:032d Philips SPC 210 NC
+zc3xx 0471:032e Philips SPC 315 NC
+sonixj 0471:0330 Philips SPC 710 NC
+se401 047d:5001 Kensington67014
+se401 047d:5002 Kensington6701(5/7)
+se401 047d:5003 Kensington67016
+spca501 0497:c001 Smile International
+sunplus 04a5:3003 Benq DC 1300
+sunplus 04a5:3008 Benq DC 1500
+sunplus 04a5:300a Benq DC 3410
+spca500 04a5:300c Benq DC 1016
+benq 04a5:3035 Benq DC E300
+vicam 04c1:009d HomeConnect Webcam [vicam]
+konica 04c8:0720 IntelYC 76
+finepix 04cb:0104 Fujifilm FinePix 4800
+finepix 04cb:0109 Fujifilm FinePix A202
+finepix 04cb:010b Fujifilm FinePix A203
+finepix 04cb:010f Fujifilm FinePix A204
+finepix 04cb:0111 Fujifilm FinePix A205
+finepix 04cb:0113 Fujifilm FinePix A210
+finepix 04cb:0115 Fujifilm FinePix A303
+finepix 04cb:0117 Fujifilm FinePix A310
+finepix 04cb:0119 Fujifilm FinePix F401
+finepix 04cb:011b Fujifilm FinePix F402
+finepix 04cb:011d Fujifilm FinePix F410
+finepix 04cb:0121 Fujifilm FinePix F601
+finepix 04cb:0123 Fujifilm FinePix F700
+finepix 04cb:0125 Fujifilm FinePix M603
+finepix 04cb:0127 Fujifilm FinePix S300
+finepix 04cb:0129 Fujifilm FinePix S304
+finepix 04cb:012b Fujifilm FinePix S500
+finepix 04cb:012d Fujifilm FinePix S602
+finepix 04cb:012f Fujifilm FinePix S700
+finepix 04cb:0131 Fujifilm FinePix unknown model
+finepix 04cb:013b Fujifilm FinePix unknown model
+finepix 04cb:013d Fujifilm FinePix unknown model
+finepix 04cb:013f Fujifilm FinePix F420
+sunplus 04f1:1001 JVC GC A50
+spca561 04fc:0561 Flexcam 100
+spca1528 04fc:1528 Sunplus MD80 clone
+sunplus 04fc:500c Sunplus CA500C
+sunplus 04fc:504a Aiptek Mini PenCam 1.3
+sunplus 04fc:504b Maxell MaxPocket LE 1.3
+sunplus 04fc:5330 Digitrex 2110
+sunplus 04fc:5360 Sunplus Generic
+spca500 04fc:7333 PalmPixDC85
+sunplus 04fc:ffff Pure DigitalDakota
+nw80x 0502:d001 DVC V6
+spca501 0506:00df 3Com HomeConnect Lite
+sunplus 052b:1507 Megapixel 5 Pretec DC-1007
+sunplus 052b:1513 Megapix V4
+sunplus 052b:1803 MegaImage VI
+nw80x 052b:d001 EZCam Pro p35u
+tv8532 0545:808b Veo Stingray
+tv8532 0545:8333 Veo Stingray
+sunplus 0546:3155 Polaroid PDC3070
+sunplus 0546:3191 Polaroid Ion 80
+sunplus 0546:3273 Polaroid PDC2030
+touptek 0547:6801 TTUCMOS08000KPB, AS MU800
+dtcs033 0547:7303 Anchor Chips, Inc
+ov519 054c:0154 Sonny toy4
+ov519 054c:0155 Sonny toy5
+cpia1 0553:0002 CPIA CPiA (version1) based cameras
+stv0680 0553:0202 STV0680 Camera
+zc3xx 055f:c005 Mustek Wcam300A
+spca500 055f:c200 Mustek Gsmart 300
+sunplus 055f:c211 Kowa Bs888e Microcamera
+spca500 055f:c220 Gsmart Mini
+sunplus 055f:c230 Mustek Digicam 330K
+sunplus 055f:c232 Mustek MDC3500
+sunplus 055f:c360 Mustek DV4000 Mpeg4
+sunplus 055f:c420 Mustek gSmart Mini 2
+sunplus 055f:c430 Mustek Gsmart LCD 2
+sunplus 055f:c440 Mustek DV 3000
+sunplus 055f:c520 Mustek gSmart Mini 3
+sunplus 055f:c530 Mustek Gsmart LCD 3
+sunplus 055f:c540 Gsmart D30
+sunplus 055f:c630 Mustek MDC4000
+sunplus 055f:c650 Mustek MDC5500Z
+nw80x 055f:d001 Mustek Wcam 300 mini
+zc3xx 055f:d003 Mustek WCam300A
+zc3xx 055f:d004 Mustek WCam300 AN
+conex 0572:0041 Creative Notebook cx11646
+ov519 05a9:0511 Video Blaster WebCam 3/WebCam Plus, D-Link USB Digital Video Camera
+ov519 05a9:0518 Creative WebCam
+ov519 05a9:0519 OV519 Microphone
+ov519 05a9:0530 OmniVision
+ov534_9 05a9:1550 OmniVision VEHO Filmscanner
+ov519 05a9:2800 OmniVision SuperCAM
+ov519 05a9:4519 Webcam Classic
+ov534_9 05a9:8065 OmniVision test kit ov538+ov9712
+ov519 05a9:8519 OmniVision
+ov519 05a9:a511 D-Link USB Digital Video Camera
+ov519 05a9:a518 D-Link DSB-C310 Webcam
+sunplus 05da:1018 Digital Dream Enigma 1.3
+stk014 05e1:0893 Syntek DV4000
+gl860 05e3:0503 Genesys Logic PC Camera
+gl860 05e3:f191 Genesys Logic PC Camera
+vicam 0602:1001 ViCam Webcam
+spca561 060b:a001 Maxell Compact Pc PM3
+zc3xx 0698:2003 CTX M730V built in
+topro 06a2:0003 TP6800 PC Camera, CmoX CX0342 webcam
+topro 06a2:6810 Creative Qmax
+nw80x 06a5:0000 Typhoon Webcam 100 USB
+nw80x 06a5:d001 Divio based webcams
+nw80x 06a5:d800 Divio Chicony TwinkleCam, Trust SpaceCam
+spca500 06bd:0404 Agfa CL20
+spca500 06be:0800 Optimedia
+nw80x 06be:d001 EZCam Pro p35u
+sunplus 06d6:0031 Trust 610 LCD PowerC@m Zoom
+sunplus 06d6:0041 Aashima Technology B.V.
+spca506 06e1:a190 ADS Instant VCD
+ov534 06f8:3002 Hercules Blog Webcam
+ov534_9 06f8:3003 Hercules Dualpix HD Weblog
+sonixj 06f8:3004 Hercules Classic Silver
+sonixj 06f8:3008 Hercules Deluxe Optical Glass
+pac7302 06f8:3009 Hercules Classic Link
+pac7302 06f8:301b Hercules Link
+nw80x 0728:d001 AVerMedia Camguard
+spca508 0733:0110 ViewQuest VQ110
+spca501 0733:0401 Intel Create and Share
+spca501 0733:0402 ViewQuest M318B
+spca505 0733:0430 Intel PC Camera Pro
+sunplus 0733:1311 Digital Dream Epsilon 1.3
+sunplus 0733:1314 Mercury 2.1MEG Deluxe Classic Cam
+sunplus 0733:2211 Jenoptik jdc 21 LCD
+sunplus 0733:2221 Mercury Digital Pro 3.1p
+sunplus 0733:3261 Concord 3045 spca536a
+sunplus 0733:3281 Cyberpix S550V
+spca506 0734:043b 3DeMon USB Capture aka
+cpia1 0813:0001 QX3 camera
+ov519 0813:0002 Dual Mode USB Camera Plus
+spca500 084d:0003 D-Link DSC-350
+spca500 08ca:0103 Aiptek PocketDV
+sunplus 08ca:0104 Aiptek PocketDVII 1.3
+sunplus 08ca:0106 Aiptek Pocket DV3100+
+mr97310a 08ca:0110 Trust Spyc@m 100
+mr97310a 08ca:0111 Aiptek PenCam VGA+
+sunplus 08ca:2008 Aiptek Mini PenCam 2 M
+sunplus 08ca:2010 Aiptek PocketCam 3M
+sunplus 08ca:2016 Aiptek PocketCam 2 Mega
+sunplus 08ca:2018 Aiptek Pencam SD 2M
+sunplus 08ca:2020 Aiptek Slim 3000F
+sunplus 08ca:2022 Aiptek Slim 3200
+sunplus 08ca:2024 Aiptek DV3500 Mpeg4
+sunplus 08ca:2028 Aiptek PocketCam4M
+sunplus 08ca:2040 Aiptek PocketDV4100M
+sunplus 08ca:2042 Aiptek PocketDV5100
+sunplus 08ca:2050 Medion MD 41437
+sunplus 08ca:2060 Aiptek PocketDV5300
+tv8532 0923:010f ICM532 cams
+mr97310a 093a:010e All known CIF cams with this ID
+mr97310a 093a:010f All known VGA cams with this ID
+mars 093a:050f Mars-Semi Pc-Camera
+pac207 093a:2460 Qtec Webcam 100
+pac207 093a:2461 HP Webcam
+pac207 093a:2463 Philips SPC 220 NC
+pac207 093a:2464 Labtec Webcam 1200
+pac207 093a:2468 Webcam WB-1400T
+pac207 093a:2470 Genius GF112
+pac207 093a:2471 Genius VideoCam ge111
+pac207 093a:2472 Genius VideoCam ge110
+pac207 093a:2474 Genius iLook 111
+pac207 093a:2476 Genius e-Messenger 112
+pac7311 093a:2600 PAC7311 Typhoon
+pac7311 093a:2601 Philips SPC 610 NC
+pac7311 093a:2603 Philips SPC 500 NC
+pac7311 093a:2608 Trust WB-3300p
+pac7311 093a:260e Gigaware VGA PC Camera, Trust WB-3350p, SIGMA cam 2350
+pac7311 093a:260f SnakeCam
+pac7302 093a:2620 Apollo AC-905
+pac7302 093a:2621 PAC731x
+pac7302 093a:2622 Genius Eye 312
+pac7302 093a:2623 Pixart Imaging, Inc.
+pac7302 093a:2624 PAC7302
+pac7302 093a:2625 Genius iSlim 310
+pac7302 093a:2626 Labtec 2200
+pac7302 093a:2627 Genius FaceCam 300
+pac7302 093a:2628 Genius iLook 300
+pac7302 093a:2629 Genius iSlim 300
+pac7302 093a:262a Webcam 300k
+pac7302 093a:262c Philips SPC 230 NC
+jl2005bcd 0979:0227 Various brands, 19 known cameras supported
+jeilinj 0979:0270 Sakar 57379
+jeilinj 0979:0280 Sportscam DV15, Sakar 57379
+zc3xx 0ac8:0301 Web Camera
+zc3xx 0ac8:0302 Z-star Vimicro zc0302
+vc032x 0ac8:0321 Vimicro generic vc0321
+vc032x 0ac8:0323 Vimicro Vc0323
+vc032x 0ac8:0328 A4Tech PK-130MG
+zc3xx 0ac8:301b Z-Star zc301b
+zc3xx 0ac8:303b Vimicro 0x303b
+zc3xx 0ac8:305b Z-star Vimicro zc0305b
+zc3xx 0ac8:307b PC Camera (ZS0211)
+vc032x 0ac8:c001 Sony embedded vimicro
+vc032x 0ac8:c002 Sony embedded vimicro
+vc032x 0ac8:c301 Samsung Q1 Ultra Premium
+spca508 0af9:0010 Hama USB Sightcam 100
+spca508 0af9:0011 Hama USB Sightcam 100
+ov519 0b62:0059 iBOT2 Webcam
+sonixb 0c45:6001 Genius VideoCAM NB
+sonixb 0c45:6005 Microdia Sweex Mini Webcam
+sonixb 0c45:6007 Sonix sn9c101 + Tas5110D
+sonixb 0c45:6009 spcaCam@120
+sonixb 0c45:600d spcaCam@120
+sonixb 0c45:6011 Microdia PC Camera (SN9C102)
+sonixb 0c45:6019 Generic Sonix OV7630
+sonixb 0c45:6024 Generic Sonix Tas5130c
+sonixb 0c45:6025 Xcam Shanga
+sonixb 0c45:6027 GeniusEye 310
+sonixb 0c45:6028 Sonix Btc Pc380
+sonixb 0c45:6029 spcaCam@150
+sonixb 0c45:602a Meade ETX-105EC Camera
+sonixb 0c45:602c Generic Sonix OV7630
+sonixb 0c45:602d LIC-200 LG
+sonixb 0c45:602e Genius VideoCam Messenger
+sonixj 0c45:6040 Speed NVC 350K
+sonixj 0c45:607c Sonix sn9c102p Hv7131R
+sonixb 0c45:6083 VideoCAM Look
+sonixb 0c45:608c VideoCAM Look
+sonixb 0c45:608f PC Camera (SN9C103 + OV7630)
+sonixb 0c45:60a8 VideoCAM Look
+sonixb 0c45:60aa VideoCAM Look
+sonixb 0c45:60af VideoCAM Look
+sonixb 0c45:60b0 Genius VideoCam Look
+sonixj 0c45:60c0 Sangha Sn535
+sonixj 0c45:60ce USB-PC-Camera-168 (TALK-5067)
+sonixj 0c45:60ec SN9C105+MO4000
+sonixj 0c45:60fb Surfer NoName
+sonixj 0c45:60fc LG-LIC300
+sonixj 0c45:60fe Microdia Audio
+sonixj 0c45:6100 PC Camera (SN9C128)
+sonixj 0c45:6102 PC Camera (SN9C128)
+sonixj 0c45:610a PC Camera (SN9C128)
+sonixj 0c45:610b PC Camera (SN9C128)
+sonixj 0c45:610c PC Camera (SN9C128)
+sonixj 0c45:610e PC Camera (SN9C128)
+sonixj 0c45:6128 Microdia/Sonix SNP325
+sonixj 0c45:612a Avant Camera
+sonixj 0c45:612b Speed-Link REFLECT2
+sonixj 0c45:612c Typhoon Rasy Cam 1.3MPix
+sonixj 0c45:612e PC Camera (SN9C110)
+sonixj 0c45:6130 Sonix Pccam
+sonixj 0c45:6138 Sn9c120 Mo4000
+sonixj 0c45:613a Microdia Sonix PC Camera
+sonixj 0c45:613b Surfer SN-206
+sonixj 0c45:613c Sonix Pccam168
+sonixj 0c45:613e PC Camera (SN9C120)
+sonixj 0c45:6142 Hama PC-Webcam AC-150
+sonixj 0c45:6143 Sonix Pccam168
+sonixj 0c45:6148 Digitus DA-70811/ZSMC USB PC Camera ZS211/Microdia
+sonixj 0c45:614a Frontech E-Ccam (JIL-2225)
+sn9c20x 0c45:6240 PC Camera (SN9C201 + MT9M001)
+sn9c20x 0c45:6242 PC Camera (SN9C201 + MT9M111)
+sn9c20x 0c45:6248 PC Camera (SN9C201 + OV9655)
+sn9c20x 0c45:624c PC Camera (SN9C201 + MT9M112)
+sn9c20x 0c45:624e PC Camera (SN9C201 + SOI968)
+sn9c20x 0c45:624f PC Camera (SN9C201 + OV9650)
+sn9c20x 0c45:6251 PC Camera (SN9C201 + OV9650)
+sn9c20x 0c45:6253 PC Camera (SN9C201 + OV9650)
+sn9c20x 0c45:6260 PC Camera (SN9C201 + OV7670)
+sn9c20x 0c45:6270 PC Camera (SN9C201 + MT9V011/MT9V111/MT9V112)
+sn9c20x 0c45:627b PC Camera (SN9C201 + OV7660)
+sn9c20x 0c45:627c PC Camera (SN9C201 + HV7131R)
+sn9c20x 0c45:627f PC Camera (SN9C201 + OV9650)
+sn9c20x 0c45:6280 PC Camera (SN9C202 + MT9M001)
+sn9c20x 0c45:6282 PC Camera (SN9C202 + MT9M111)
+sn9c20x 0c45:6288 PC Camera (SN9C202 + OV9655)
+sn9c20x 0c45:628c PC Camera (SN9C201 + MT9M112)
+sn9c20x 0c45:628e PC Camera (SN9C202 + SOI968)
+sn9c20x 0c45:628f PC Camera (SN9C202 + OV9650)
+sn9c20x 0c45:62a0 PC Camera (SN9C202 + OV7670)
+sn9c20x 0c45:62b0 PC Camera (SN9C202 + MT9V011/MT9V111/MT9V112)
+sn9c20x 0c45:62b3 PC Camera (SN9C202 + OV9655)
+sn9c20x 0c45:62bb PC Camera (SN9C202 + OV7660)
+sn9c20x 0c45:62bc PC Camera (SN9C202 + HV7131R)
+sn9c2028 0c45:8001 Wild Planet Digital Spy Camera
+sn9c2028 0c45:8003 Sakar #11199, #6637x, #67480 keychain cams
+sn9c2028 0c45:8008 Mini-Shotz ms-350
+sn9c2028 0c45:800a Vivitar Vivicam 3350B
+sunplus 0d64:0303 Sunplus FashionCam DXG
+ov519 0e96:c001 TRUST 380 USB2 SPACEC@M
+etoms 102c:6151 Qcam Sangha CIF
+etoms 102c:6251 Qcam xxxxxx VGA
+ov519 1046:9967 W9967CF/W9968CF WebCam IC, Video Blaster WebCam Go
+zc3xx 10fd:0128 Typhoon Webshot II USB 300k 0x0128
+spca561 10fd:7e50 FlyCam Usb 100
+zc3xx 10fd:804d Typhoon Webshot II Webcam [zc0301]
+zc3xx 10fd:8050 Typhoon Webshot II USB 300k
+ov534 1415:2000 Sony HD Eye for PS3 (SLEH 00201)
+pac207 145f:013a Trust WB-1300N
+pac7302 145f:013c Trust
+sn9c20x 145f:013d Trust WB-3600R
+vc032x 15b8:6001 HP 2.0 Megapixel
+vc032x 15b8:6002 HP 2.0 Megapixel rz406aa
+stk1135 174f:6a31 ASUSlaptop, MT9M112 sensor
+spca501 1776:501c Arowana 300K CMOS Camera
+t613 17a1:0128 TASCORP JPEG Webcam, NGS Cyclops
+vc032x 17ef:4802 Lenovo Vc0323+MI1310_SOC
+pac7302 1ae7:2001 SpeedLinkSnappy Mic SL-6825-SBK
+pac207 2001:f115 D-Link DSB-C120
+sq905c 2770:9050 Disney pix micro (CIF)
+sq905c 2770:9051 Lego Bionicle
+sq905c 2770:9052 Disney pix micro 2 (VGA)
+sq905c 2770:905c All 11 known cameras with this ID
+sq905 2770:9120 All 24 known cameras with this ID
+sq905c 2770:913d All 4 known cameras with this ID
+sq930x 2770:930b Sweex Motion Tracking / I-Tec iCam Tracer
+sq930x 2770:930c Trust WB-3500T / NSG Robbie 2.0
+spca500 2899:012c Toptro Industrial
+ov519 8020:ef04 ov519
+spca508 8086:0110 Intel Easy PC Camera
+spca500 8086:0630 Intel Pocket PC Camera
+spca506 99fa:8988 Grandtec V.cap
+sn9c20x a168:0610 Dino-Lite Digital Microscope (SN9C201 + HV7131R)
+sn9c20x a168:0611 Dino-Lite Digital Microscope (SN9C201 + HV7131R)
+sn9c20x a168:0613 Dino-Lite Digital Microscope (SN9C201 + HV7131R)
+sn9c20x a168:0614 Dino-Lite Digital Microscope (SN9C201 + MT9M111)
+sn9c20x a168:0615 Dino-Lite Digital Microscope (SN9C201 + MT9M111)
+sn9c20x a168:0617 Dino-Lite Digital Microscope (SN9C201 + MT9M111)
+sn9c20x a168:0618 Dino-Lite Digital Microscope (SN9C201 + HV7131R)
+spca561 abcd:cdee Petcam
+========= ========= ===================================================================
diff --git a/Documentation/admin-guide/media/i2c-cardlist.rst b/Documentation/admin-guide/media/i2c-cardlist.rst
new file mode 100644
index 000000000000..ef3b5fff3b01
--- /dev/null
+++ b/Documentation/admin-guide/media/i2c-cardlist.rst
@@ -0,0 +1,296 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+I²C drivers
+===========
+
+The I²C (Inter-Integrated Circuit) bus is a two-wire serial bus used
+internally on media cards for communication between different chips. While
+the bus is not directly visible to the Linux kernel, drivers need to send and
+receive commands over it. The kernel driver abstraction supports a separate
+driver for each component on an I²C bus, as if the bus were visible to the
+main system board.
+
+One of the problems with I²C devices is that sometimes the same device may
+work with different I²C hardware. This is common, for example, on devices
+that come with one tuner for the North American market and another for
+Europe. Some drivers have a ``tuner=`` modprobe parameter that allows
+selecting a different tuner number to address this issue.
+
+The currently supported I²C drivers (not including staging drivers) are
+listed below.
+
+Audio decoders, processors and mixers
+-------------------------------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+cs3308        Cirrus Logic CS3308 audio ADC
+cs5345        Cirrus Logic CS5345 audio ADC
+cs53l32a      Cirrus Logic CS53L32A audio ADC
+msp3400       Micronas MSP34xx audio decoders
+sony-btf-mpx  Sony BTF's internal MPX
+tda1997x      NXP TDA1997x HDMI receiver
+tda7432       Philips TDA7432 audio processor
+tda9840       Philips TDA9840 audio processor
+tea6415c      Philips TEA6415C audio processor
+tea6420       Philips TEA6420 audio processor
+tlv320aic23b  Texas Instruments TLV320AIC23B audio codec
+tvaudio       Simple audio decoder chips
+uda1342       Philips UDA1342 audio codec
+vp27smpx      Panasonic VP27's internal MPX
+wm8739        Wolfson Microelectronics WM8739 stereo audio ADC
+wm8775        Wolfson Microelectronics WM8775 audio ADC with input mixer
+============  ==========================================================
+
+Audio/Video compression chips
+-----------------------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+saa6752hs     Philips SAA6752HS MPEG-2 Audio/Video Encoder
+============  ==========================================================
+
+Camera sensor devices
+---------------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+ccs           MIPI CCS compliant camera sensors (also SMIA++ and SMIA)
+et8ek8        ET8EK8 camera sensor
+hi556         Hynix Hi-556 sensor
+hi846         Hynix Hi-846 sensor
+imx208        Sony IMX208 sensor
+imx214        Sony IMX214 sensor
+imx219        Sony IMX219 sensor
+imx258        Sony IMX258 sensor
+imx274        Sony IMX274 sensor
+imx290        Sony IMX290 sensor
+imx319        Sony IMX319 sensor
+imx334        Sony IMX334 sensor
+imx355        Sony IMX355 sensor
+imx412        Sony IMX412 sensor
+m5mols        Fujitsu M-5MOLS 8MP sensor
+mt9m001       mt9m001
+mt9m032       MT9M032 camera sensor
+mt9m111       mt9m111, mt9m112 and mt9m131
+mt9p031       Aptina MT9P031
+mt9t001       Aptina MT9T001
+mt9t112       Aptina MT9T111/MT9T112
+mt9v011       Micron mt9v011 sensor
+mt9v032       Micron MT9V032 sensor
+mt9v111       Aptina MT9V111 sensor
+noon010pc30   Siliconfile NOON010PC30 sensor
+ov13858       OmniVision OV13858 sensor
+ov13b10       OmniVision OV13B10 sensor
+ov2640        OmniVision OV2640 sensor
+ov2659        OmniVision OV2659 sensor
+ov2680        OmniVision OV2680 sensor
+ov2685        OmniVision OV2685 sensor
+ov5640        OmniVision OV5640 sensor
+ov5645        OmniVision OV5645 sensor
+ov5647        OmniVision OV5647 sensor
+ov5670        OmniVision OV5670 sensor
+ov5675        OmniVision OV5675 sensor
+ov5695        OmniVision OV5695 sensor
+ov6650        OmniVision OV6650 sensor
+ov7251        OmniVision OV7251 sensor
+ov7640        OmniVision OV7640 sensor
+ov7670        OmniVision OV7670 sensor
+ov772x        OmniVision OV772x sensor
+ov7740        OmniVision OV7740 sensor
+ov8856        OmniVision OV8856 sensor
+ov9640        OmniVision OV9640 sensor
+ov9650        OmniVision OV9650/OV9652 sensor
+rj54n1cb0c    Sharp RJ54N1CB0C sensor
+s5c73m3       Samsung S5C73M3 sensor
+s5k4ecgx      Samsung S5K4ECGX sensor
+s5k5baf       Samsung S5K5BAF sensor
+s5k6a3        Samsung S5K6A3 sensor
+s5k6aa        Samsung S5K6AAFX sensor
+sr030pc30     Siliconfile SR030PC30 sensor
+vs6624        ST VS6624 sensor
+============  ==========================================================
+
+Flash devices
+-------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+adp1653       ADP1653 flash
+lm3560        LM3560 dual flash driver
+lm3646        LM3646 dual flash driver
+============  ==========================================================
+
+IR I2C driver
+-------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+ir-kbd-i2c    I2C module for IR
+============  ==========================================================
+
+Lens drivers
+------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+ad5820        AD5820 lens voice coil
+ak7375        AK7375 lens voice coil
+dw9714        DW9714 lens voice coil
+dw9768        DW9768 lens voice coil
+dw9807-vcm    DW9807 lens voice coil
+============  ==========================================================
+
+Miscellaneous helper chips
+--------------------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+video-i2c     I2C transport video
+m52790        Mitsubishi M52790 A/V switch
+st-mipid02    STMicroelectronics MIPID02 CSI-2 to PARALLEL bridge
+ths7303       THS7303/53 Video Amplifier
+============  ==========================================================
+
+RDS decoders
+------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+saa6588       SAA6588 Radio Chip RDS decoder
+============  ==========================================================
+
+SDR tuner chips
+---------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+max2175       Maxim 2175 RF to Bits tuner
+============  ==========================================================
+
+Video and audio decoders
+------------------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+cx25840       Conexant CX2584x audio/video decoders
+saa717x       Philips SAA7171/3/4 audio/video decoders
+============  ==========================================================
+
+Video decoders
+--------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+adv7180       Analog Devices ADV7180 decoder
+adv7183       Analog Devices ADV7183 decoder
+adv748x       Analog Devices ADV748x decoder
+adv7604       Analog Devices ADV7604 decoder
+adv7842       Analog Devices ADV7842 decoder
+bt819         BT819A VideoStream decoder
+bt856         BT856 VideoStream decoder
+bt866         BT866 VideoStream decoder
+ks0127        KS0127 video decoder
+ml86v7667     OKI ML86V7667 video decoder
+saa7110       Philips SAA7110 video decoder
+saa7115       Philips SAA7111/3/4/5 video decoders
+tc358743      Toshiba TC358743 decoder
+tvp514x       Texas Instruments TVP514x video decoder
+tvp5150       Texas Instruments TVP5150 video decoder
+tvp7002       Texas Instruments TVP7002 video decoder
+tw2804        Techwell TW2804 multiple video decoder
+tw9903        Techwell TW9903 video decoder
+tw9906        Techwell TW9906 video decoder
+tw9910        Techwell TW9910 video decoder
+vpx3220       vpx3220a, vpx3216b & vpx3214c video decoders
+============  ==========================================================
+
+Video encoders
+--------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+ad9389b       Analog Devices AD9389B encoder
+adv7170       Analog Devices ADV7170 video encoder
+adv7175       Analog Devices ADV7175 video encoder
+adv7343       ADV7343 video encoder
+adv7393       ADV7393 video encoder
+adv7511-v4l2  Analog Devices ADV7511 encoder
+ak881x        AK8813/AK8814 video encoders
+saa7127       Philips SAA7127/9 digital video encoders
+saa7185       Philips SAA7185 video encoder
+ths8200       Texas Instruments THS8200 video encoder
+============  ==========================================================
+
+Video improvement chips
+-----------------------
+
+============  ==========================================================
+Driver        Name
+============  ==========================================================
+upd64031a     NEC Electronics uPD64031A Ghost Reduction
+upd64083      NEC Electronics uPD64083 3-Dimensional Y/C separation
+============  ==========================================================
+
+Tuner drivers
+-------------
+
+============  ==================================================
+Driver        Name
+============  ==================================================
+e4000         Elonics E4000 silicon tuner
+fc0011        Fitipower FC0011 silicon tuner
+fc0012        Fitipower FC0012 silicon tuner
+fc0013        Fitipower FC0013 silicon tuner
+fc2580        FCI FC2580 silicon tuner
+it913x        ITE Tech IT913x silicon tuner
+m88rs6000t    Montage M88RS6000 internal tuner
+max2165       Maxim MAX2165 silicon tuner
+mc44s803      Freescale MC44S803 Low Power CMOS Broadband tuners
+msi001        Mirics MSi001
+mt2060        Microtune MT2060 silicon IF tuner
+mt2063        Microtune MT2063 silicon IF tuner
+mt20xx        Microtune 2032 / 2050 tuners
+mt2131        Microtune MT2131 silicon tuner
+mt2266        Microtune MT2266 silicon tuner
+mxl301rf      MaxLinear MxL301RF tuner
+mxl5005s      MaxLinear MxL5005S silicon tuner
+mxl5007t      MaxLinear MxL5007T silicon tuner
+qm1d1b0004    Sharp QM1D1B0004 tuner
+qm1d1c0042    Sharp QM1D1C0042 tuner
+qt1010        Quantek QT1010 silicon tuner
+r820t         Rafael Micro R820T silicon tuner
+si2157        Silicon Labs Si2157 silicon tuner
+tuner-types   Simple tuner support
+tda18212      NXP TDA18212 silicon tuner
+tda18218      NXP TDA18218 silicon tuner
+tda18250      NXP TDA18250 silicon tuner
+tda18271      NXP TDA18271 silicon tuner
+tda827x       Philips TDA827X silicon tuner
+tda8290       TDA 8290/8295 + 8275(a)/18271 tuner combo
+tda9887       TDA 9885/6/7 analog IF demodulator
+tea5761       TEA 5761 radio tuner
+tea5767       TEA 5767 radio tuner
+tua9001       Infineon TUA9001 silicon tuner
+xc2028        Xceive xc2028/xc3028 tuners
+xc4000        Xceive XC4000 silicon tuner
+xc5000        Xceive XC5000 silicon tuner
+============  ==================================================
+
+.. toctree::
+ :maxdepth: 1
+
+ tuner-cardlist
+ frontend-cardlist
diff --git a/Documentation/admin-guide/media/imx.rst b/Documentation/admin-guide/media/imx.rst
new file mode 100644
index 000000000000..b8fa70f854fd
--- /dev/null
+++ b/Documentation/admin-guide/media/imx.rst
@@ -0,0 +1,714 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+i.MX Video Capture Driver
+=========================
+
+Introduction
+------------
+
+The Freescale i.MX5/6 contains an Image Processing Unit (IPU), which
+handles the flow of image frames to and from capture devices and
+display devices.
+
+For image capture, the IPU contains the following internal subunits:
+
+- Image DMA Controller (IDMAC)
+- Camera Serial Interface (CSI)
+- Image Converter (IC)
+- Sensor Multi-FIFO Controller (SMFC)
+- Image Rotator (IRT)
+- Video De-Interlacing or Combining Block (VDIC)
+
+The IDMAC is the DMA controller for transfer of image frames to and from
+memory. Various dedicated DMA channels exist for both video capture and
+display paths. During transfer, the IDMAC is also capable of vertical
+image flip, 8x8 block transfer (see IRT description), pixel component
+re-ordering (for example UYVY to YUYV) within the same colorspace, and
+packed <--> planar conversion. The IDMAC can also perform a simple
+de-interlacing by interweaving even and odd lines during transfer
+(without motion compensation which requires the VDIC).
+
+The CSI is the backend capture unit that interfaces directly with
+camera sensors over Parallel, BT.656/1120, and MIPI CSI-2 buses.
+
+The IC handles color-space conversion, resizing (downscaling and
+upscaling), horizontal flip, and 90/270 degree rotation operations.
+
+There are three independent "tasks" within the IC that can carry out
+conversions concurrently: pre-process encoding, pre-process viewfinder,
+and post-processing. Within each task, conversions are split into three
+sections: downsizing section, main section (upsizing, flip, colorspace
+conversion, and graphics plane combining), and rotation section.
+
+The IPU time-shares the IC task operations. The time-slice granularity
+is one burst of eight pixels in the downsizing section, one image line
+in the main processing section, and one image frame in the rotation section.
+
+The SMFC is composed of four independent FIFOs that each can transfer
+captured frames from sensors directly to memory concurrently via four
+IDMAC channels.
+
+The IRT carries out 90 and 270 degree image rotation operations. The
+rotation operation is carried out on 8x8 pixel blocks at a time. This
+operation is supported by the IDMAC which handles the 8x8 block transfer
+along with block reordering, in coordination with vertical flip.
+
+The VDIC handles the conversion of interlaced video to progressive, with
+support for different motion compensation modes (low, medium, and high
+motion). The deinterlaced output frames from the VDIC can be sent to the
+IC pre-process viewfinder task for further conversions. The VDIC also
+contains a Combiner that combines two image planes, with alpha blending
+and color keying.
+
+In addition to the IPU internal subunits, there are two units outside
+the IPU that are also involved in video capture on i.MX:
+
+- MIPI CSI-2 Receiver for camera sensors with the MIPI CSI-2 bus
+  interface. This is a Synopsys DesignWare core.
+- Two video multiplexers for selecting among multiple sensor inputs
+  to send to a CSI.
+
+For more info, refer to the latest versions of the i.MX5/6 reference
+manuals [#f1]_ and [#f2]_.
+
+
+Features
+--------
+
+Some of the features of this driver include:
+
+- Many different pipelines can be configured via the media controller API,
+  corresponding to the hardware video capture pipelines supported in
+  the i.MX.
+
+- Supports parallel, BT.656, and MIPI CSI-2 interfaces.
+
+- Concurrent independent streams, by configuring pipelines to multiple
+  video capture interfaces using independent entities.
+
+- Scaling, color-space conversion, horizontal and vertical flip, and
+  image rotation via IC task subdevs.
+
+- Many pixel formats supported (RGB, packed and planar YUV, partial
+  planar YUV).
+
+- The VDIC subdev supports motion compensated de-interlacing, with three
+  motion compensation modes: low, medium, and high motion. Pipelines are
+  defined that allow sending frames to the VDIC subdev directly from the
+  CSI. Support is also planned for sending frames to the VDIC from
+  memory buffers via output/mem2mem devices.
+
+- Includes a Frame Interval Monitor (FIM) that can correct vertical sync
+  problems with the ADV718x video decoders.
+
+
+Topology
+--------
+
+The following diagrams show the media topologies for the i.MX6Q SabreSD
+and i.MX6Q SabreAuto. Refer to them when reading the entity descriptions
+in the next section.
+
+The i.MX5/6 topologies can differ upstream from the IPUv3 CSI video
+multiplexers, but the internal IPUv3 topology downstream from there
+is common to all i.MX5/6 platforms. For example, the SabreSD, with the
+MIPI CSI-2 OV5640 sensor, requires the i.MX6 MIPI CSI-2 receiver. But
+the SabreAuto has only the ADV7180 decoder on a parallel BT.656 bus, and
+therefore does not require the MIPI CSI-2 receiver, so the receiver is
+absent from its graph.
+
+.. _imx6q_topology_graph:
+
+.. kernel-figure:: imx6q-sabresd.dot
+ :alt: Diagram of the i.MX6Q SabreSD media pipeline topology
+ :align: center
+
+ Media pipeline graph on i.MX6Q SabreSD
+
+.. kernel-figure:: imx6q-sabreauto.dot
+ :alt: Diagram of the i.MX6Q SabreAuto media pipeline topology
+ :align: center
+
+ Media pipeline graph on i.MX6Q SabreAuto
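+
+A comparable graph can be generated on a running system with media-ctl.
+The following is a minimal sketch, assuming the media device is
+/dev/media0 and that Graphviz is installed (both are assumptions, and
+the output file names are arbitrary):
+
+.. code-block:: none
+
+ # Dump the current topology in dot format, then render it
+ media-ctl -d /dev/media0 --print-dot > imx-media.dot
+ dot -Tpng imx-media.dot -o imx-media.png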
+
+Entities
+--------
+
+imx6-mipi-csi2
+--------------
+
+This is the MIPI CSI-2 receiver entity. It has one sink pad to receive
+the MIPI CSI-2 stream (usually from a MIPI CSI-2 camera sensor). It has
+four source pads, corresponding to the four MIPI CSI-2 demuxed virtual
+channel outputs. Multiple source pads can be enabled to independently
+stream from multiple virtual channels.
+
+This entity actually consists of two sub-blocks. One is the MIPI CSI-2
+core, a Synopsys DesignWare MIPI CSI-2 core. The other sub-block
+is a "CSI-2 to IPU gasket". The gasket acts as a demultiplexer of the
+four virtual channel streams, providing four separate parallel buses,
+one per virtual channel, that are routed to CSIs or video
+multiplexers as described below.
+
+On i.MX6 solo/dual-lite, all four virtual channel buses are routed to
+two video multiplexers. Both CSI0 and CSI1 can receive any virtual
+channel, as selected by the video multiplexers.
+
+On i.MX6 Quad, virtual channel 0 is routed to IPU1-CSI0 (after being
+selected by a video mux), virtual channels 1 and 2 are hard-wired to
+IPU1-CSI1 and IPU2-CSI0, respectively, and virtual channel 3 is routed
+to IPU2-CSI1 (again selected by a video mux).
+
+ipuX_csiY_mux
+-------------
+
+These are the video multiplexers. They have two or more sink pads to
+select from either camera sensors with a parallel interface, or from
+MIPI CSI-2 virtual channels from the imx6-mipi-csi2 entity. They have a
+single source pad that routes to a CSI (ipuX_csiY entities).
+
+On i.MX6 solo/dual-lite, there are two video mux entities. One sits
+in front of IPU1-CSI0 to select between a parallel sensor and any of
+the four MIPI CSI-2 virtual channels (a total of five sink pads). The
+other mux sits in front of IPU1-CSI1, and again has five sink pads to
+select between a parallel sensor and any of the four MIPI CSI-2 virtual
+channels.
+
+On i.MX6 Quad, there are two video mux entities. One sits in front of
+IPU1-CSI0 to select between a parallel sensor and MIPI CSI-2 virtual
+channel 0 (two sink pads). The other mux sits in front of IPU2-CSI1 to
+select between a parallel sensor and MIPI CSI-2 virtual channel 3 (two
+sink pads).
+
+ipuX_csiY
+---------
+
+These are the CSI entities. They have a single sink pad receiving from
+either a video mux or from a MIPI CSI-2 virtual channel as described
+above.
+
+This entity has two source pads. The first source pad can link directly
+to the ipuX_vdic entity or the ipuX_ic_prp entity, using hardware links
+that require no IDMAC memory buffer transfer.
+
+When the direct source pad is routed to the ipuX_ic_prp entity, frames
+from the CSI can be processed by one or both of the IC pre-processing
+tasks.
+
+When the direct source pad is routed to the ipuX_vdic entity, the VDIC
+will carry out motion-compensated de-interlace using "high motion" mode
+(see description of ipuX_vdic entity).
+
+The second source pad sends video frames directly to memory buffers
+via the SMFC and an IDMAC channel, bypassing IC pre-processing. This
+source pad is routed to a capture device node, with a node name of the
+format "ipuX_csiY capture".
+
+Note that since the IDMAC source pad makes use of an IDMAC channel,
+pixel reordering within the same colorspace can be carried out by the
+IDMAC channel. For example, if the CSI sink pad is receiving in UYVY
+order, the capture device linked to the IDMAC source pad can capture
+in YUYV order. Also, if the CSI sink pad is receiving a packed YUV
+format, the capture device can capture a planar YUV format such as
+YUV420.
+
+The IDMAC channel at the IDMAC source pad also supports simple
+interweave without motion compensation, which is activated if the source
+pad's field type is sequential top-bottom or bottom-top, and the
+requested capture interface field type is set to interlaced (t-b, b-t,
+or unqualified interlaced). The capture interface will enforce the same
+field order as the source pad field order (interlaced-bt if source pad
+is seq-bt, interlaced-tb if source pad is seq-tb).
+
+For events produced by ipuX_csiY, see :ref:`imx_api_ipuX_csiY`.
+
+Cropping in ipuX_csiY
+---------------------
+
+The CSI supports cropping the incoming raw sensor frames. This is
+implemented in the ipuX_csiY entities at the sink pad, using the
+crop selection subdev API.
+
+The CSI also supports fixed divide-by-two downscaling independently in
+width and height. This is implemented in the ipuX_csiY entities at
+the sink pad, using the compose selection subdev API.
+
+The output rectangle at the ipuX_csiY source pad is the same as
+the compose rectangle at the sink pad. The source pad rectangle
+therefore cannot be negotiated; it must be set using the compose
+selection API at the sink pad (if /2 downscale is desired; otherwise
+the source pad rectangle is equal to the incoming rectangle).
+
+To give an example of crop and /2 downscale, this will crop a
+1280x960 input frame to 640x480, and then /2 downscale in both
+dimensions to 320x240 (assumes ipu1_csi0 is linked to ipu1_csi0_mux):
+
+.. code-block:: none
+
+ media-ctl -V "'ipu1_csi0_mux':2[fmt:UYVY2X8/1280x960]"
+ media-ctl -V "'ipu1_csi0':0[crop:(0,0)/640x480]"
+ media-ctl -V "'ipu1_csi0':0[compose:(0,0)/320x240]"
+
+Frame Skipping in ipuX_csiY
+---------------------------
+
+The CSI supports frame rate decimation, via frame skipping. Frame
+rate decimation is specified by setting the frame intervals at
+sink and source pads. The ipuX_csiY entity then applies the best
+frame skip setting to the CSI to achieve the desired frame rate
+at the source pad.
+
+The following example reduces an assumed incoming 60 Hz frame
+rate by half at the IDMAC output source pad:
+
+.. code-block:: none
+
+ media-ctl -V "'ipu1_csi0':0[fmt:UYVY2X8/640x480@1/60]"
+ media-ctl -V "'ipu1_csi0':2[fmt:UYVY2X8/640x480@1/30]"
+
+Frame Interval Monitor in ipuX_csiY
+-----------------------------------
+
+See :ref:`imx_api_FIM`.
+
+ipuX_vdic
+---------
+
+The VDIC carries out motion compensated de-interlacing, with three
+motion compensation modes: low, medium, and high motion. The mode is
+specified with the menu control V4L2_CID_DEINTERLACING_MODE. The VDIC
+has two sink pads and a single source pad.
+
+The direct sink pad receives from an ipuX_csiY direct pad. With this
+link the VDIC can only operate in high motion mode.
+
+When the IDMAC sink pad is activated, it receives from an output
+or mem2mem device node. With this pipeline, the VDIC can also operate
+in low and medium modes, because these modes require receiving
+frames from memory buffers. Note that an output or mem2mem device
+is not implemented yet, so this sink pad currently has no links.
+
+The source pad routes to the IC pre-processing entity ipuX_ic_prp.
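+
+As a sketch, the menu entries of the de-interlacing mode control can be
+listed with v4l2-ctl on the VDIC subdev node. /dev/v4l-subdev1 is the
+ipu1_vdic node in the topology graphs above; treat it as an assumption,
+since the numbering may differ on other systems:
+
+.. code-block:: none
+
+ # List all controls of the VDIC subdev, including menu entries
+ v4l2-ctl -d /dev/v4l-subdev1 --list-ctrls-menus
+
+The control name and menu index reported by this listing can then be
+passed to v4l2-ctl --set-ctrl to select a mode.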
+
+ipuX_ic_prp
+-----------
+
+This is the IC pre-processing entity. It acts as a router, routing
+data from its sink pad to one or both of its source pads.
+
+This entity has a single sink pad. The sink pad can receive from the
+ipuX_csiY direct pad, or from ipuX_vdic.
+
+This entity has two source pads. One source pad routes to the
+pre-process encode task entity (ipuX_ic_prpenc), the other to the
+pre-process viewfinder task entity (ipuX_ic_prpvf). Both source pads
+can be activated at the same time if the sink pad is receiving from
+ipuX_csiY. Only the source pad to the pre-process viewfinder task entity
+can be activated if the sink pad is receiving from ipuX_vdic (frames
+from the VDIC can only be processed by the pre-process viewfinder task).
+
+ipuX_ic_prpenc
+--------------
+
+This is the IC pre-processing encode entity. It has a single sink
+pad from ipuX_ic_prp, and a single source pad. The source pad is
+routed to a capture device node, with a node name of the format
+"ipuX_ic_prpenc capture".
+
+This entity performs the IC pre-process encode task operations:
+color-space conversion, resizing (downscaling and upscaling),
+horizontal and vertical flip, and 90/270 degree rotation. Flip
+and rotation are provided via standard V4L2 controls.
+
+Like the ipuX_csiY IDMAC source, this entity also supports simple
+de-interlace without motion compensation, and pixel reordering.
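+
+As an illustration, flip and rotation could be set from the command line
+with v4l2-ctl. This sketch assumes that the "ipuX_ic_prpenc capture" node
+is /dev/video1, that a pipeline through this entity is already linked,
+and that the controls appear under the usual v4l2-ctl names rotate and
+horizontal_flip:
+
+.. code-block:: none
+
+ # Rotate by 90 degrees and mirror horizontally (assumed node and names)
+ v4l2-ctl -d /dev/video1 --set-ctrl=rotate=90
+ v4l2-ctl -d /dev/video1 --set-ctrl=horizontal_flip=1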
+
+ipuX_ic_prpvf
+-------------
+
+This is the IC pre-processing viewfinder entity. It has a single sink
+pad from ipuX_ic_prp, and a single source pad. The source pad is routed
+to a capture device node, with a node name of the format
+"ipuX_ic_prpvf capture".
+
+This entity is identical in operation to ipuX_ic_prpenc, with the same
+resizing and CSC operations and flip/rotation controls. It will receive
+and process de-interlaced frames from the ipuX_vdic if ipuX_ic_prp is
+receiving from ipuX_vdic.
+
+Like the ipuX_csiY IDMAC source, this entity supports simple
+interweaving without motion compensation. However, note that if the
+ipuX_vdic is included in the pipeline (ipuX_ic_prp is receiving from
+ipuX_vdic), it's not possible to use interweave in ipuX_ic_prpvf,
+since the ipuX_vdic has already carried out de-interlacing (with
+motion compensation) and therefore the field type output from
+ipuX_vdic can only be none (progressive).
+
+Capture Pipelines
+-----------------
+
+The following sections describe the various use cases supported by the
+pipelines.
+
+The links shown do not include the backend sensor, video mux, or MIPI
+CSI-2 receiver links, which depend on the type of sensor interface
+(parallel or MIPI CSI-2). So these pipelines begin with:
+
+sensor -> ipuX_csiY_mux -> ...
+
+for parallel sensors, or:
+
+sensor -> imx6-mipi-csi2 -> (ipuX_csiY_mux) -> ...
+
+for MIPI CSI-2 sensors. The imx6-mipi-csi2 receiver may need to route
+to the video mux (ipuX_csiY_mux) before sending to the CSI, depending
+on the MIPI CSI-2 virtual channel, hence ipuX_csiY_mux is shown in
+parentheses.
+
+Unprocessed Video Capture:
+--------------------------
+
+Send frames directly from sensor to camera device interface node, with
+no conversions, via ipuX_csiY IDMAC source pad:
+
+-> ipuX_csiY:2 -> ipuX_csiY capture
+
+IC Direct Conversions:
+----------------------
+
+This pipeline uses the preprocess encode entity to route frames directly
+from the CSI to the IC, to carry out scaling up to 1024x1024 resolution,
+CSC, flipping, and image rotation:
+
+-> ipuX_csiY:1 -> 0:ipuX_ic_prp:1 -> 0:ipuX_ic_prpenc:1 -> ipuX_ic_prpenc capture
+
+Motion Compensated De-interlace:
+--------------------------------
+
+This pipeline routes frames from the CSI direct pad to the VDIC entity to
+support motion-compensated de-interlacing (high motion mode only),
+scaling up to 1024x1024, CSC, flip, and rotation:
+
+-> ipuX_csiY:1 -> 0:ipuX_vdic:2 -> 0:ipuX_ic_prp:2 -> 0:ipuX_ic_prpvf:1 -> ipuX_ic_prpvf capture
+
+
+Usage Notes
+-----------
+
+To aid in configuration and for backward compatibility with V4L2
+applications that access controls only from video device nodes, the
+capture device interfaces inherit controls from the active entities
+in the current pipeline, so controls can be accessed either directly
+from the subdev or from the active capture device interface. For
+example, the FIM controls are available either from the ipuX_csiY
+subdevs or from the active capture device.
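+
+For instance, the controls visible at each level can be compared with
+v4l2-ctl. The node names below are assumptions (ipu1_csi0 at
+/dev/v4l-subdev0 and its capture device at /dev/video0), and the second
+listing only includes inherited controls once a pipeline is linked:
+
+.. code-block:: none
+
+ # Controls exposed by the subdev itself
+ v4l2-ctl -d /dev/v4l-subdev0 --list-ctrls
+ # Controls available on the capture device of the active pipeline
+ v4l2-ctl -d /dev/video0 --list-ctrls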
+
+The following are specific usage notes for the Sabre* reference
+boards:
+
+
+i.MX6Q SabreLite with OV5642 and OV5640
+---------------------------------------
+
+This platform requires the OmniVision OV5642 module with a parallel
+camera interface, and the OV5640 module with a MIPI CSI-2
+interface. Both modules are available from Boundary Devices:
+
+- https://boundarydevices.com/product/nit6x_5mp
+- https://boundarydevices.com/product/nit6x_5mp_mipi
+
+Note that if only one camera module is available, the other sensor
+node can be disabled in the device tree.
+
+The OV5642 module is connected to the parallel bus input on the i.MX
+internal video mux to IPU1 CSI0. Its i2c bus connects to i2c bus 2.
+
+The MIPI CSI-2 OV5640 module is connected to the i.MX internal MIPI CSI-2
+receiver, and the four virtual channel outputs from the receiver are
+routed as follows: vc0 to the IPU1 CSI0 mux, vc1 directly to IPU1 CSI1,
+vc2 directly to IPU2 CSI0, and vc3 to the IPU2 CSI1 mux. The OV5640 is
+also connected to i2c bus 2 on the SabreLite, therefore the OV5642 and
+OV5640 must not share the same i2c slave address.
+
+The following basic example configures unprocessed video capture
+pipelines for both sensors. The OV5642 is routed to ipu1_csi0, and
+the OV5640, transmitting on MIPI CSI-2 virtual channel 1 (which is
+imx6-mipi-csi2 pad 2), is routed to ipu1_csi1. Both sensors are
+configured to output 640x480, and the OV5642 outputs YUYV2X8, the
+OV5640 UYVY2X8:
+
+.. code-block:: none
+
+ # Setup links for OV5642
+ media-ctl -l "'ov5642 1-0042':0 -> 'ipu1_csi0_mux':1[1]"
+ media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
+ media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
+ # Setup links for OV5640
+ media-ctl -l "'ov5640 1-0040':0 -> 'imx6-mipi-csi2':0[1]"
+ media-ctl -l "'imx6-mipi-csi2':2 -> 'ipu1_csi1':0[1]"
+ media-ctl -l "'ipu1_csi1':2 -> 'ipu1_csi1 capture':0[1]"
+ # Configure pads for OV5642 pipeline
+ media-ctl -V "'ov5642 1-0042':0 [fmt:YUYV2X8/640x480 field:none]"
+ media-ctl -V "'ipu1_csi0_mux':2 [fmt:YUYV2X8/640x480 field:none]"
+ media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/640x480 field:none]"
+ # Configure pads for OV5640 pipeline
+ media-ctl -V "'ov5640 1-0040':0 [fmt:UYVY2X8/640x480 field:none]"
+ media-ctl -V "'imx6-mipi-csi2':2 [fmt:UYVY2X8/640x480 field:none]"
+ media-ctl -V "'ipu1_csi1':2 [fmt:AYUV32/640x480 field:none]"
+
+Streaming can then begin independently on the capture device nodes
+"ipu1_csi0 capture" and "ipu1_csi1 capture". The v4l2-ctl tool can
+be used to select any supported YUV pixelformat on the capture device
+nodes, including planar.
+
+i.MX6Q SabreAuto with ADV7180 decoder
+-------------------------------------
+
+On the i.MX6Q SabreAuto, an on-board ADV7180 SD decoder is connected to the
+parallel bus input on the internal video mux to IPU1 CSI0.
+
+The following example configures a pipeline to capture from the ADV7180
+video decoder, assuming NTSC 720x480 input signals, using simple
+interweave (unconverted and without motion compensation). The adv7180
+must output sequential or alternating fields (field type 'seq-bt' for
+NTSC, or 'alternate'):
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]"
+ media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
+ media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
+ # Configure pads
+ media-ctl -V "'adv7180 3-0021':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
+ media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x480]"
+ media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"
+ # Configure "ipu1_csi0 capture" interface (assumed at /dev/video4)
+ v4l2-ctl -d4 --set-fmt-video=field=interlaced_bt
+
+Streaming can then begin on /dev/video4. The v4l2-ctl tool can also be
+used to select any supported YUV pixelformat on /dev/video4.
+
+This example configures a pipeline to capture from the ADV7180
+video decoder, assuming PAL 720x576 input signals, with Motion
+Compensated de-interlacing. The adv7180 must output sequential or
+alternating fields (field type 'seq-tb' for PAL, or 'alternate').
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'adv7180 3-0021':0 -> 'ipu1_csi0_mux':1[1]"
+ media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
+ media-ctl -l "'ipu1_csi0':1 -> 'ipu1_vdic':0[1]"
+ media-ctl -l "'ipu1_vdic':2 -> 'ipu1_ic_prp':0[1]"
+ media-ctl -l "'ipu1_ic_prp':2 -> 'ipu1_ic_prpvf':0[1]"
+ media-ctl -l "'ipu1_ic_prpvf':1 -> 'ipu1_ic_prpvf capture':0[1]"
+ # Configure pads
+ media-ctl -V "'adv7180 3-0021':0 [fmt:UYVY2X8/720x576 field:seq-tb]"
+ media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/720x576]"
+ media-ctl -V "'ipu1_csi0':1 [fmt:AYUV32/720x576]"
+ media-ctl -V "'ipu1_vdic':2 [fmt:AYUV32/720x576 field:none]"
+ media-ctl -V "'ipu1_ic_prp':2 [fmt:AYUV32/720x576 field:none]"
+ media-ctl -V "'ipu1_ic_prpvf':1 [fmt:AYUV32/720x576 field:none]"
+ # Configure "ipu1_ic_prpvf capture" interface (assumed at /dev/video2)
+ v4l2-ctl -d2 --set-fmt-video=field=none
+
+Streaming can then begin on /dev/video2. The v4l2-ctl tool can also be
+used to select any supported YUV pixelformat on /dev/video2.
+
+This platform accepts Composite Video analog inputs to the ADV7180 on
+Ain1 (connector J42).
+
+i.MX6DL SabreAuto with ADV7180 decoder
+--------------------------------------
+
+On the i.MX6DL SabreAuto, an on-board ADV7180 SD decoder is connected to the
+parallel bus input on the internal video mux to IPU1 CSI0.
+
+The following example configures a pipeline to capture from the ADV7180
+video decoder, assuming NTSC 720x480 input signals, using simple
+interweave (unconverted and without motion compensation). The adv7180
+must output sequential or alternating fields (field type 'seq-bt' for
+NTSC, or 'alternate'):
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'adv7180 4-0021':0 -> 'ipu1_csi0_mux':4[1]"
+ media-ctl -l "'ipu1_csi0_mux':5 -> 'ipu1_csi0':0[1]"
+ media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
+ # Configure pads
+ media-ctl -V "'adv7180 4-0021':0 [fmt:UYVY2X8/720x480 field:seq-bt]"
+ media-ctl -V "'ipu1_csi0_mux':5 [fmt:UYVY2X8/720x480]"
+ media-ctl -V "'ipu1_csi0':2 [fmt:AYUV32/720x480]"
+ # Configure "ipu1_csi0 capture" interface (assumed at /dev/video0)
+ v4l2-ctl -d0 --set-fmt-video=field=interlaced_bt
+
+Streaming can then begin on /dev/video0. The v4l2-ctl tool can also be
+used to select any supported YUV pixelformat on /dev/video0.
+
+This example configures a pipeline to capture from the ADV7180
+video decoder, assuming PAL 720x576 input signals, with Motion
+Compensated de-interlacing. The adv7180 must output sequential or
+alternating fields (field type 'seq-tb' for PAL, or 'alternate').
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'adv7180 4-0021':0 -> 'ipu1_csi0_mux':4[1]"
+ media-ctl -l "'ipu1_csi0_mux':5 -> 'ipu1_csi0':0[1]"
+ media-ctl -l "'ipu1_csi0':1 -> 'ipu1_vdic':0[1]"
+ media-ctl -l "'ipu1_vdic':2 -> 'ipu1_ic_prp':0[1]"
+ media-ctl -l "'ipu1_ic_prp':2 -> 'ipu1_ic_prpvf':0[1]"
+ media-ctl -l "'ipu1_ic_prpvf':1 -> 'ipu1_ic_prpvf capture':0[1]"
+ # Configure pads
+ media-ctl -V "'adv7180 4-0021':0 [fmt:UYVY2X8/720x576 field:seq-tb]"
+ media-ctl -V "'ipu1_csi0_mux':5 [fmt:UYVY2X8/720x576]"
+ media-ctl -V "'ipu1_csi0':1 [fmt:AYUV32/720x576]"
+ media-ctl -V "'ipu1_vdic':2 [fmt:AYUV32/720x576 field:none]"
+ media-ctl -V "'ipu1_ic_prp':2 [fmt:AYUV32/720x576 field:none]"
+ media-ctl -V "'ipu1_ic_prpvf':1 [fmt:AYUV32/720x576 field:none]"
+ # Configure "ipu1_ic_prpvf capture" interface (assumed at /dev/video2)
+ v4l2-ctl -d2 --set-fmt-video=field=none
+
+Streaming can then begin on /dev/video2. The v4l2-ctl tool can also be
+used to select any supported YUV pixelformat on /dev/video2.
+
+This platform accepts Composite Video analog inputs to the ADV7180 on
+Ain1 (connector J42).
+
+i.MX6Q SabreSD with MIPI CSI-2 OV5640
+-------------------------------------
+
+Similarly to i.MX6Q SabreLite, the i.MX6Q SabreSD supports a parallel
+interface OV5642 module on IPU1 CSI0, and a MIPI CSI-2 OV5640
+module. The OV5642 connects to i2c bus 1 and the OV5640 to i2c bus 2.
+
+The device tree for SabreSD includes OF graphs for both the parallel
+OV5642 and the MIPI CSI-2 OV5640, but as of this writing only the MIPI
+CSI-2 OV5640 has been tested, so the OV5642 node is currently disabled.
+The OV5640 module connects to MIPI connector J5. The NXP part number
+for the OV5640 module that connects to the SabreSD board is H120729.
+
+The following example configures an unprocessed video capture pipeline to
+capture from the OV5640, transmitting on MIPI CSI-2 virtual channel 0:
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'ov5640 1-003c':0 -> 'imx6-mipi-csi2':0[1]"
+ media-ctl -l "'imx6-mipi-csi2':1 -> 'ipu1_csi0_mux':0[1]"
+ media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
+ media-ctl -l "'ipu1_csi0':2 -> 'ipu1_csi0 capture':0[1]"
+ # Configure pads
+ media-ctl -V "'ov5640 1-003c':0 [fmt:UYVY2X8/640x480]"
+ media-ctl -V "'imx6-mipi-csi2':1 [fmt:UYVY2X8/640x480]"
+ media-ctl -V "'ipu1_csi0_mux':0 [fmt:UYVY2X8/640x480]"
+ media-ctl -V "'ipu1_csi0':0 [fmt:AYUV32/640x480]"
+
+Streaming can then begin on the "ipu1_csi0 capture" node. The v4l2-ctl
+tool can be used to select any supported pixelformat on the capture
+device node.
+
+To determine which /dev/video node corresponds to
+"ipu1_csi0 capture":
+
+.. code-block:: none
+
+ media-ctl -e "ipu1_csi0 capture"
+ /dev/video0
+
+/dev/video0 is the streaming element in this case.
+
+Starting the streaming via v4l2-ctl:
+
+.. code-block:: none
+
+ v4l2-ctl --stream-mmap -d /dev/video0
+
+Starting the streaming via GStreamer and sending the content to the display:
+
+.. code-block:: none
+
+ gst-launch-1.0 v4l2src device=/dev/video0 ! kmssink
+
+The following example configures a direct conversion pipeline to capture
+from the OV5640, transmitting on MIPI CSI-2 virtual channel 0. It also
+shows colorspace conversion and scaling at IC output.
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'ov5640 1-003c':0 -> 'imx6-mipi-csi2':0[1]"
+ media-ctl -l "'imx6-mipi-csi2':1 -> 'ipu1_csi0_mux':0[1]"
+ media-ctl -l "'ipu1_csi0_mux':2 -> 'ipu1_csi0':0[1]"
+ media-ctl -l "'ipu1_csi0':1 -> 'ipu1_ic_prp':0[1]"
+ media-ctl -l "'ipu1_ic_prp':1 -> 'ipu1_ic_prpenc':0[1]"
+ media-ctl -l "'ipu1_ic_prpenc':1 -> 'ipu1_ic_prpenc capture':0[1]"
+ # Configure pads
+ media-ctl -V "'ov5640 1-003c':0 [fmt:UYVY2X8/640x480]"
+ media-ctl -V "'imx6-mipi-csi2':1 [fmt:UYVY2X8/640x480]"
+ media-ctl -V "'ipu1_csi0_mux':2 [fmt:UYVY2X8/640x480]"
+ media-ctl -V "'ipu1_csi0':1 [fmt:AYUV32/640x480]"
+ media-ctl -V "'ipu1_ic_prp':1 [fmt:AYUV32/640x480]"
+ media-ctl -V "'ipu1_ic_prpenc':1 [fmt:ARGB8888_1X32/800x600]"
+ # Set a format at the capture interface
+ v4l2-ctl -d /dev/video1 --set-fmt-video=pixelformat=RGB3
+
+Streaming can then begin on the "ipu1_ic_prpenc capture" node.
+
+To determine which /dev/video node corresponds to
+"ipu1_ic_prpenc capture":
+
+.. code-block:: none
+
+ media-ctl -e "ipu1_ic_prpenc capture"
+ /dev/video1
+
+
+/dev/video1 is the streaming element in this case.
+
+Starting the streaming via v4l2-ctl:
+
+.. code-block:: none
+
+ v4l2-ctl --stream-mmap -d /dev/video1
+
+Starting the streaming via GStreamer and sending the content to the display:
+
+.. code-block:: none
+
+ gst-launch-1.0 v4l2src device=/dev/video1 ! kmssink
+
+Known Issues
+------------
+
+1. When using 90 or 270 degree rotation control at capture resolutions
+ near the IC resizer limit of 1024x1024, and combined with planar
+ pixel formats (YUV420, YUV422p), frame capture will often fail with
+ no end-of-frame interrupts from the IDMAC channel. To work around
+ this, use a lower resolution and/or packed formats (YUYV, RGB3, etc.)
+ when 90 or 270 rotations are needed.
+
+
+File list
+---------
+
+drivers/staging/media/imx/
+include/media/imx.h
+include/linux/imx-media.h
+
+References
+----------
+
+.. [#f1] http://www.nxp.com/assets/documents/data/en/reference-manuals/IMX6DQRM.pdf
+.. [#f2] http://www.nxp.com/assets/documents/data/en/reference-manuals/IMX6SDLRM.pdf
+
+
+Authors
+-------
+
+- Steve Longerbeam <steve_longerbeam@mentor.com>
+- Philipp Zabel <kernel@pengutronix.de>
+- Russell King <linux@armlinux.org.uk>
+
+Copyright (C) 2012-2017 Mentor Graphics Inc.
diff --git a/Documentation/admin-guide/media/imx6q-sabreauto.dot b/Documentation/admin-guide/media/imx6q-sabreauto.dot
new file mode 100644
index 000000000000..bd6cf0b358c0
--- /dev/null
+++ b/Documentation/admin-guide/media/imx6q-sabreauto.dot
@@ -0,0 +1,51 @@
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{<port0> 0} | ipu1_csi0\n/dev/v4l-subdev0 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port2 -> n00000005 [style=dashed]
+ n00000001:port1 -> n0000000f:port0 [style=dashed]
+ n00000001:port1 -> n0000000b:port0 [style=dashed]
+ n00000005 [label="ipu1_csi0 capture\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n0000000b [label="{{<port0> 0 | <port1> 1} | ipu1_vdic\n/dev/v4l-subdev1 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000b:port2 -> n0000000f:port0 [style=dashed]
+ n0000000f [label="{{<port0> 0} | ipu1_ic_prp\n/dev/v4l-subdev2 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000f:port1 -> n00000013:port0 [style=dashed]
+ n0000000f:port2 -> n0000001c:port0 [style=dashed]
+ n00000013 [label="{{<port0> 0} | ipu1_ic_prpenc\n/dev/v4l-subdev3 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000013:port1 -> n00000016 [style=dashed]
+ n00000016 [label="ipu1_ic_prpenc capture\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
+ n0000001c [label="{{<port0> 0} | ipu1_ic_prpvf\n/dev/v4l-subdev4 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000001c:port1 -> n0000001f [style=dashed]
+ n0000001f [label="ipu1_ic_prpvf capture\n/dev/video2", shape=box, style=filled, fillcolor=yellow]
+ n0000002f [label="{{<port0> 0} | ipu1_csi1\n/dev/v4l-subdev5 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000002f:port2 -> n00000033 [style=dashed]
+ n0000002f:port1 -> n0000000f:port0 [style=dashed]
+ n0000002f:port1 -> n0000000b:port0 [style=dashed]
+ n00000033 [label="ipu1_csi1 capture\n/dev/video3", shape=box, style=filled, fillcolor=yellow]
+ n0000003d [label="{{<port0> 0} | ipu2_csi0\n/dev/v4l-subdev6 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000003d:port2 -> n00000041 [style=dashed]
+ n0000003d:port1 -> n0000004b:port0 [style=dashed]
+ n0000003d:port1 -> n00000047:port0 [style=dashed]
+ n00000041 [label="ipu2_csi0 capture\n/dev/video4", shape=box, style=filled, fillcolor=yellow]
+ n00000047 [label="{{<port0> 0 | <port1> 1} | ipu2_vdic\n/dev/v4l-subdev7 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000047:port2 -> n0000004b:port0 [style=dashed]
+ n0000004b [label="{{<port0> 0} | ipu2_ic_prp\n/dev/v4l-subdev8 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000004b:port1 -> n0000004f:port0 [style=dashed]
+ n0000004b:port2 -> n00000058:port0 [style=dashed]
+ n0000004f [label="{{<port0> 0} | ipu2_ic_prpenc\n/dev/v4l-subdev9 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000004f:port1 -> n00000052 [style=dashed]
+ n00000052 [label="ipu2_ic_prpenc capture\n/dev/video5", shape=box, style=filled, fillcolor=yellow]
+ n00000058 [label="{{<port0> 0} | ipu2_ic_prpvf\n/dev/v4l-subdev10 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000058:port1 -> n0000005b [style=dashed]
+ n0000005b [label="ipu2_ic_prpvf capture\n/dev/video6", shape=box, style=filled, fillcolor=yellow]
+ n0000006b [label="{{<port0> 0} | ipu2_csi1\n/dev/v4l-subdev11 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000006b:port2 -> n0000006f [style=dashed]
+ n0000006b:port1 -> n0000004b:port0 [style=dashed]
+ n0000006b:port1 -> n00000047:port0 [style=dashed]
+ n0000006f [label="ipu2_csi1 capture\n/dev/video7", shape=box, style=filled, fillcolor=yellow]
+ n00000079 [label="{{<port0> 0 | <port1> 1} | ipu1_csi0_mux\n/dev/v4l-subdev12 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000079:port2 -> n00000001:port0 [style=dashed]
+ n0000007d [label="{{<port0> 0 | <port1> 1} | ipu2_csi1_mux\n/dev/v4l-subdev13 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000007d:port2 -> n0000006b:port0 [style=dashed]
+ n00000081 [label="{{} | adv7180 3-0021\n/dev/v4l-subdev14 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000081:port0 -> n00000079:port1 [style=dashed]
+}
diff --git a/Documentation/admin-guide/media/imx6q-sabresd.dot b/Documentation/admin-guide/media/imx6q-sabresd.dot
new file mode 100644
index 000000000000..7d56cafa1944
--- /dev/null
+++ b/Documentation/admin-guide/media/imx6q-sabresd.dot
@@ -0,0 +1,56 @@
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{<port0> 0} | ipu1_csi0\n/dev/v4l-subdev0 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port2 -> n00000005 [style=dashed]
+ n00000001:port1 -> n0000000f:port0 [style=dashed]
+ n00000001:port1 -> n0000000b:port0 [style=dashed]
+ n00000005 [label="ipu1_csi0 capture\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n0000000b [label="{{<port0> 0 | <port1> 1} | ipu1_vdic\n/dev/v4l-subdev1 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000b:port2 -> n0000000f:port0 [style=dashed]
+ n0000000f [label="{{<port0> 0} | ipu1_ic_prp\n/dev/v4l-subdev2 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000f:port1 -> n00000013:port0 [style=dashed]
+ n0000000f:port2 -> n0000001c:port0 [style=dashed]
+ n00000013 [label="{{<port0> 0} | ipu1_ic_prpenc\n/dev/v4l-subdev3 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000013:port1 -> n00000016 [style=dashed]
+ n00000016 [label="ipu1_ic_prpenc capture\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
+ n0000001c [label="{{<port0> 0} | ipu1_ic_prpvf\n/dev/v4l-subdev4 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000001c:port1 -> n0000001f [style=dashed]
+ n0000001f [label="ipu1_ic_prpvf capture\n/dev/video2", shape=box, style=filled, fillcolor=yellow]
+ n0000002f [label="{{<port0> 0} | ipu1_csi1\n/dev/v4l-subdev5 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000002f:port2 -> n00000033 [style=dashed]
+ n0000002f:port1 -> n0000000f:port0 [style=dashed]
+ n0000002f:port1 -> n0000000b:port0 [style=dashed]
+ n00000033 [label="ipu1_csi1 capture\n/dev/video3", shape=box, style=filled, fillcolor=yellow]
+ n0000003d [label="{{<port0> 0} | ipu2_csi0\n/dev/v4l-subdev6 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000003d:port2 -> n00000041 [style=dashed]
+ n0000003d:port1 -> n0000004b:port0 [style=dashed]
+ n0000003d:port1 -> n00000047:port0 [style=dashed]
+ n00000041 [label="ipu2_csi0 capture\n/dev/video4", shape=box, style=filled, fillcolor=yellow]
+ n00000047 [label="{{<port0> 0 | <port1> 1} | ipu2_vdic\n/dev/v4l-subdev7 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000047:port2 -> n0000004b:port0 [style=dashed]
+ n0000004b [label="{{<port0> 0} | ipu2_ic_prp\n/dev/v4l-subdev8 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000004b:port1 -> n0000004f:port0 [style=dashed]
+ n0000004b:port2 -> n00000058:port0 [style=dashed]
+ n0000004f [label="{{<port0> 0} | ipu2_ic_prpenc\n/dev/v4l-subdev9 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000004f:port1 -> n00000052 [style=dashed]
+ n00000052 [label="ipu2_ic_prpenc capture\n/dev/video5", shape=box, style=filled, fillcolor=yellow]
+ n00000058 [label="{{<port0> 0} | ipu2_ic_prpvf\n/dev/v4l-subdev10 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000058:port1 -> n0000005b [style=dashed]
+ n0000005b [label="ipu2_ic_prpvf capture\n/dev/video6", shape=box, style=filled, fillcolor=yellow]
+ n0000006b [label="{{<port0> 0} | ipu2_csi1\n/dev/v4l-subdev11 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000006b:port2 -> n0000006f [style=dashed]
+ n0000006b:port1 -> n0000004b:port0 [style=dashed]
+ n0000006b:port1 -> n00000047:port0 [style=dashed]
+ n0000006f [label="ipu2_csi1 capture\n/dev/video7", shape=box, style=filled, fillcolor=yellow]
+ n00000079 [label="{{<port0> 0} | imx6-mipi-csi2\n/dev/v4l-subdev12 | {<port1> 1 | <port2> 2 | <port3> 3 | <port4> 4}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000079:port2 -> n0000002f:port0 [style=dashed]
+ n00000079:port3 -> n0000003d:port0 [style=dashed]
+ n00000079:port1 -> n0000007f:port0 [style=dashed]
+ n00000079:port4 -> n00000083:port0 [style=dashed]
+ n0000007f [label="{{<port0> 0 | <port1> 1} | ipu1_csi0_mux\n/dev/v4l-subdev13 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000007f:port2 -> n00000001:port0 [style=dashed]
+ n00000083 [label="{{<port0> 0 | <port1> 1} | ipu2_csi1_mux\n/dev/v4l-subdev14 | {<port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000083:port2 -> n0000006b:port0 [style=dashed]
+ n00000087 [label="{{} | ov5640 1-003c\n/dev/v4l-subdev15 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000087:port0 -> n00000079:port0 [style=dashed]
+}
diff --git a/Documentation/admin-guide/media/imx7.rst b/Documentation/admin-guide/media/imx7.rst
new file mode 100644
index 000000000000..2fa27718f52a
--- /dev/null
+++ b/Documentation/admin-guide/media/imx7.rst
@@ -0,0 +1,221 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+i.MX7 Video Capture Driver
+==========================
+
+Introduction
+------------
+
+The i.MX7, unlike the i.MX5/6 family, does not contain an Image Processing
+Unit (IPU); as a result, its capabilities for operating on or manipulating
+captured frames are less feature-rich.
+
+For image capture the i.MX7 has three units:
+- CMOS Sensor Interface (CSI)
+- Video Multiplexer
+- MIPI CSI-2 Receiver
+
+.. code-block:: none
+
+   MIPI Camera Input ---> MIPI CSI-2 --- > |\
+                                           | \
+                                           |  \
+                                           | M |
+                                           | U | ------>  CSI ---> Capture
+                                           | X |
+                                           |  /
+   Parallel Camera Input ----------------> | /
+                                           |/
+
+For additional information, please refer to the latest versions of the i.MX7
+reference manual [#f1]_.
+
+Entities
+--------
+
+imx-mipi-csi2
+--------------
+
+This is the MIPI CSI-2 receiver entity. It has one sink pad to receive the pixel
+data from the MIPI CSI-2 camera sensor. It has one source pad, corresponding to
+virtual channel 0. This module is compliant with a previous version of the
+Samsung D-PHY, and supports two D-PHY Rx data lanes.
+
+csi-mux
+-------
+
+This is the video multiplexer. It has two sink pads to select from either a
+camera sensor with a parallel interface or MIPI CSI-2 virtual channel 0. It has
+a single source pad that routes to the CSI.
+
+csi
+---
+
+The CSI enables the chip to connect directly to an external CMOS image sensor.
+The CSI can interface directly with parallel and MIPI CSI-2 buses. It has a
+256 x 64 FIFO to store received image pixel data and embedded DMA controllers to
+transfer data from the FIFO through the AHB bus.
+
+This entity has one sink pad that receives from the csi-mux entity and a single
+source pad that routes video frames directly to memory buffers. This pad is
+routed to a capture device node.
+
+Usage Notes
+-----------
+
+To aid in configuration and for backward compatibility with V4L2 applications
+that access controls only from video device nodes, the capture device interfaces
+inherit controls from the active entities in the current pipeline, so controls
+can be accessed either directly from the subdev or from the active capture
+device interface. For example, the sensor controls are available either from the
+sensor subdevs or from the active capture device.
+
+Warp7 with OV2680
+-----------------
+
+On this platform an OV2680 MIPI CSI-2 module is connected to the internal MIPI
+CSI-2 receiver. The following example configures a video capture pipeline with
+an output of 800x600 in BGGR 10-bit Bayer format:
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'ov2680 1-0036':0 -> 'imx7-mipi-csis.0':0[1]"
+ media-ctl -l "'imx7-mipi-csis.0':1 -> 'csi-mux':1[1]"
+ media-ctl -l "'csi-mux':2 -> 'csi':0[1]"
+ media-ctl -l "'csi':1 -> 'csi capture':0[1]"
+
+ # Configure pads for pipeline
+ media-ctl -V "'ov2680 1-0036':0 [fmt:SBGGR10_1X10/800x600 field:none]"
+ media-ctl -V "'csi-mux':1 [fmt:SBGGR10_1X10/800x600 field:none]"
+ media-ctl -V "'csi-mux':2 [fmt:SBGGR10_1X10/800x600 field:none]"
+ media-ctl -V "'imx7-mipi-csis.0':0 [fmt:SBGGR10_1X10/800x600 field:none]"
+ media-ctl -V "'csi':0 [fmt:SBGGR10_1X10/800x600 field:none]"
+
+After this, streaming can start. The v4l2-ctl tool can be used to select any of
+the resolutions supported by the sensor.
+
+.. code-block:: none
+
+ # media-ctl -p
+ Media controller API version 5.2.0
+
+ Media device information
+ ------------------------
+ driver imx7-csi
+ model imx-media
+ serial
+ bus info
+ hw revision 0x0
+ driver version 5.2.0
+
+ Device topology
+ - entity 1: csi (2 pads, 2 links)
+ type V4L2 subdev subtype Unknown flags 0
+ device node name /dev/v4l-subdev0
+ pad0: Sink
+ [fmt:SBGGR10_1X10/800x600 field:none colorspace:srgb xfer:srgb ycbcr:601 quantization:full-range]
+ <- "csi-mux":2 [ENABLED]
+ pad1: Source
+ [fmt:SBGGR10_1X10/800x600 field:none colorspace:srgb xfer:srgb ycbcr:601 quantization:full-range]
+ -> "csi capture":0 [ENABLED]
+
+ - entity 4: csi capture (1 pad, 1 link)
+ type Node subtype V4L flags 0
+ device node name /dev/video0
+ pad0: Sink
+ <- "csi":1 [ENABLED]
+
+ - entity 10: csi-mux (3 pads, 2 links)
+ type V4L2 subdev subtype Unknown flags 0
+ device node name /dev/v4l-subdev1
+ pad0: Sink
+ [fmt:Y8_1X8/1x1 field:none]
+ pad1: Sink
+ [fmt:SBGGR10_1X10/800x600 field:none]
+ <- "imx7-mipi-csis.0":1 [ENABLED]
+ pad2: Source
+ [fmt:SBGGR10_1X10/800x600 field:none]
+ -> "csi":0 [ENABLED]
+
+ - entity 14: imx7-mipi-csis.0 (2 pads, 2 links)
+ type V4L2 subdev subtype Unknown flags 0
+ device node name /dev/v4l-subdev2
+ pad0: Sink
+ [fmt:SBGGR10_1X10/800x600 field:none]
+ <- "ov2680 1-0036":0 [ENABLED]
+ pad1: Source
+ [fmt:SBGGR10_1X10/800x600 field:none]
+ -> "csi-mux":1 [ENABLED]
+
+ - entity 17: ov2680 1-0036 (1 pad, 1 link)
+ type V4L2 subdev subtype Sensor flags 0
+ device node name /dev/v4l-subdev3
+ pad0: Source
+ [fmt:SBGGR10_1X10/800x600@1/30 field:none colorspace:srgb]
+ -> "imx7-mipi-csis.0":0 [ENABLED]
+
+i.MX6ULL-EVK with OV5640
+------------------------
+
+On this platform a parallel OV5640 sensor is connected to the CSI port.
+The following example configures a video capture pipeline with an output
+of 640x480 and UYVY8_2X8 format:
+
+.. code-block:: none
+
+ # Setup links
+ media-ctl -l "'ov5640 1-003c':0 -> 'csi':0[1]"
+ media-ctl -l "'csi':1 -> 'csi capture':0[1]"
+
+ # Configure pads for pipeline
+ media-ctl -v -V "'ov5640 1-003c':0 [fmt:UYVY8_2X8/640x480 field:none]"
+
+After this, streaming can start:
+
+.. code-block:: none
+
+ gst-launch-1.0 -v v4l2src device=/dev/video1 ! video/x-raw,format=UYVY,width=640,height=480 ! v4l2convert ! fbdevsink
+
+.. code-block:: none
+
+ # media-ctl -p
+ Media controller API version 5.14.0
+
+ Media device information
+ ------------------------
+ driver imx7-csi
+ model imx-media
+ serial
+ bus info
+ hw revision 0x0
+ driver version 5.14.0
+
+ Device topology
+ - entity 1: csi (2 pads, 2 links)
+ type V4L2 subdev subtype Unknown flags 0
+ device node name /dev/v4l-subdev0
+ pad0: Sink
+ [fmt:UYVY8_2X8/640x480 field:none colorspace:srgb xfer:srgb ycbcr:601 quantization:full-range]
+ <- "ov5640 1-003c":0 [ENABLED,IMMUTABLE]
+ pad1: Source
+ [fmt:UYVY8_2X8/640x480 field:none colorspace:srgb xfer:srgb ycbcr:601 quantization:full-range]
+ -> "csi capture":0 [ENABLED,IMMUTABLE]
+
+ - entity 4: csi capture (1 pad, 1 link)
+ type Node subtype V4L flags 0
+ device node name /dev/video1
+ pad0: Sink
+ <- "csi":1 [ENABLED,IMMUTABLE]
+
+ - entity 10: ov5640 1-003c (1 pad, 1 link)
+ type V4L2 subdev subtype Sensor flags 0
+ device node name /dev/v4l-subdev1
+ pad0: Source
+ [fmt:UYVY8_2X8/640x480@1/30 field:none colorspace:srgb xfer:srgb ycbcr:601 quantization:full-range]
+ -> "csi":0 [ENABLED,IMMUTABLE]
+
+References
+----------
+
+.. [#f1] https://www.nxp.com/docs/en/reference-manual/IMX7SRM.pdf
diff --git a/Documentation/admin-guide/media/index.rst b/Documentation/admin-guide/media/index.rst
new file mode 100644
index 000000000000..c676af665111
--- /dev/null
+++ b/Documentation/admin-guide/media/index.rst
@@ -0,0 +1,63 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+====================================
+Media subsystem admin and user guide
+====================================
+
+This section contains usage information about the media subsystem and
+its supported drivers.
+
+Please see:
+
+Documentation/userspace-api/media/index.rst
+
+ - for the userspace APIs used on media devices;
+
+Documentation/driver-api/media/index.rst
+
+ - for driver development information and Kernel APIs used by
+ media devices.
+
+The media subsystem
+===================
+
+.. only:: html
+
+ .. class:: toc-title
+
+ Table of Contents
+
+.. toctree::
+ :maxdepth: 2
+ :numbered:
+
+ intro
+ building
+
+ remote-controller
+
+ dvb
+
+ cardlist
+
+ v4l-drivers
+ dvb-drivers
+ cec-drivers
+
+**Copyright** |copy| 1999-2020 : LinuxTV Developers
+
+::
+
+ This documentation is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by the Free
+ Software Foundation; either version 2 of the License, or (at your option) any
+ later version.
+
+ This program is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ more details.
+
+ For more details see the file COPYING in the source distribution of Linux.
diff --git a/Documentation/admin-guide/media/intro.rst b/Documentation/admin-guide/media/intro.rst
new file mode 100644
index 000000000000..fec8122f2412
--- /dev/null
+++ b/Documentation/admin-guide/media/intro.rst
@@ -0,0 +1,27 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+Introduction
+============
+
+The media subsystem consists of Linux support for several different types
+of devices:
+
+- Audio and video grabbers;
+- PC and Laptop Cameras;
+- Complex cameras found on Embedded hardware;
+- Analog and digital TV;
+- HDMI Consumer Electronics Control (CEC);
+- Multi-touch input devices;
+- Remote Controllers;
+- Media encoders and decoders.
+
+Due to the diversity of devices, the subsystem provides several different
+APIs:
+
+- Remote Controller API;
+- HDMI CEC API;
+- Video4Linux API;
+- Media controller API;
+- Video4Linux Request API (experimental);
+- Digital TV API (also known as DVB API).
diff --git a/Documentation/admin-guide/media/ipu3.rst b/Documentation/admin-guide/media/ipu3.rst
new file mode 100644
index 000000000000..83b3cd03b35c
--- /dev/null
+++ b/Documentation/admin-guide/media/ipu3.rst
@@ -0,0 +1,600 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+===============================================================
+Intel Image Processing Unit 3 (IPU3) Imaging Unit (ImgU) driver
+===============================================================
+
+Copyright |copy| 2018 Intel Corporation
+
+Introduction
+============
+
+This file documents the Intel IPU3 (3rd generation Image Processing Unit)
+Imaging Unit drivers located under drivers/media/pci/intel/ipu3 (CIO2) as well
+as under drivers/staging/media/ipu3 (ImgU).
+
+The Intel IPU3 found in certain Kaby Lake (as well as certain Skylake)
+platforms (U/Y processor lines) is made up of two parts, namely the Imaging Unit
+(ImgU) and the CIO2 device (MIPI CSI2 receiver).
+
+The CIO2 device receives the raw Bayer data from the sensors and outputs the
+frames in a format that is specific to the IPU3 (for consumption by the IPU3
+ImgU). The CIO2 driver is available as drivers/media/pci/intel/ipu3/ipu3-cio2*
+and is enabled through the CONFIG_VIDEO_IPU3_CIO2 config option.
+
+The Imaging Unit (ImgU) is responsible for processing images captured
+by the IPU3 CIO2 device. The ImgU driver sources can be found under
+drivers/staging/media/ipu3 directory. The driver is enabled through the
+CONFIG_VIDEO_IPU3_IMGU config option.
+
+The two driver modules are named ipu3_csi2 and ipu3_imgu, respectively.
+
+The drivers have been tested on Kaby Lake platforms (U/Y processor lines).
+
+Both of the drivers implement V4L2, Media Controller and V4L2 sub-device
+interfaces. The IPU3 CIO2 driver supports camera sensors connected to the CIO2
+MIPI CSI-2 interfaces through V4L2 sub-device sensor drivers.
+
+CIO2
+====
+
+The CIO2 is represented as a single V4L2 subdev, which provides a V4L2 subdev
+interface to the user space. There is a video node for each CSI-2 receiver,
+with a single media controller interface for the entire device.
+
+The CIO2 contains four independent capture channels, each with its own MIPI CSI-2
+receiver and DMA engine. Each channel is modelled as a V4L2 sub-device exposed
+to userspace as a V4L2 sub-device node and has two pads:
+
+.. tabularcolumns:: |p{0.8cm}|p{4.0cm}|p{4.0cm}|
+
+.. flat-table::
+    :header-rows: 1
+
+    * - Pad
+      - Direction
+      - Purpose
+
+    * - 0
+      - sink
+      - MIPI CSI-2 input, connected to the sensor subdev
+
+    * - 1
+      - source
+      - Raw video capture, connected to the V4L2 video interface
+
+The V4L2 video interfaces model the DMA engines. They are exposed to userspace
+as V4L2 video device nodes.
+
+Capturing frames in raw Bayer format
+------------------------------------
+
+The CIO2 MIPI CSI2 receiver is used to capture frames (in packed raw Bayer
+format) from the raw sensors connected to the CSI2 ports. The captured frames
+are used as input to the ImgU driver.
+
+Image processing using the IPU3 ImgU requires tools such as raw2pnm [#f1]_ and
+yavta [#f2]_ due to the following unique requirements and / or features specific
+to the IPU3.
+
+-- The IPU3 CSI2 receiver outputs the captured frames from the sensor in packed
+raw Bayer format that is specific to IPU3.
+
+-- Multiple video nodes have to be operated simultaneously.
+
+Let us take the example of an ov5670 sensor connected to CSI2 port 0, for a
+2592x1944 image capture.
+
+Using the media controller APIs, the ov5670 sensor is configured to send
+frames in packed raw Bayer format to the IPU3 CSI2 receiver.
+
+.. code-block:: none
+
+ # This example assumes /dev/media0 as the CIO2 media device
+ export MDEV=/dev/media0
+
+ # and that ov5670 sensor is connected to i2c bus 10 with address 0x36
+ export SDEV=$(media-ctl -d $MDEV -e "ov5670 10-0036")
+
+ # Establish the link for the media devices using media-ctl [#f3]_
+ media-ctl -d $MDEV -l "ov5670:0 -> ipu3-csi2 0:0[1]"
+
+ # Set the format for the media devices
+ media-ctl -d $MDEV -V "ov5670:0 [fmt:SGRBG10/2592x1944]"
+ media-ctl -d $MDEV -V "ipu3-csi2 0:0 [fmt:SGRBG10/2592x1944]"
+ media-ctl -d $MDEV -V "ipu3-csi2 0:1 [fmt:SGRBG10/2592x1944]"
+
+Once the media pipeline is configured, the desired sensor-specific settings
+(such as exposure and gain settings) can be set using the yavta tool.
+
+For example:
+
+.. code-block:: none
+
+ yavta -w "0x009e0903 444" $SDEV
+ yavta -w "0x009e0913 1024" $SDEV
+ yavta -w "0x009e0911 2046" $SDEV
+
+Once the desired sensor settings are set, frames can be captured as below.
+
+For example:
+
+.. code-block:: none
+
+ yavta --data-prefix -u -c10 -n5 -I -s2592x1944 --file=/tmp/frame-#.bin \
+ -f IPU3_SGRBG10 $(media-ctl -d $MDEV -e "ipu3-cio2 0")
+
+With the above command, 10 frames are captured at 2592x1944 resolution in
+sGRBG10 format and output in IPU3_SGRBG10 format.
+
+The captured frames are available as /tmp/frame-#.bin files.
+
+ImgU
+====
+
+The ImgU is represented as two V4L2 subdevs, each of which provides a V4L2
+subdev interface to the user space.
+
+Each V4L2 subdev represents a pipe, which can support a maximum of 2 streams.
+This helps to support advanced camera features like Continuous View Finder (CVF)
+and Snapshot During Video (SDV).
+
+The ImgU contains two independent pipes, each modelled as a V4L2 sub-device
+exposed to userspace as a V4L2 sub-device node.
+
+Each pipe has two sink pads and three source pads for the following purposes:
+
+.. tabularcolumns:: |p{0.8cm}|p{4.0cm}|p{4.0cm}|
+
+.. flat-table::
+    :header-rows: 1
+
+    * - Pad
+      - Direction
+      - Purpose
+
+    * - 0
+      - sink
+      - Input raw video stream
+
+    * - 1
+      - sink
+      - Processing parameters
+
+    * - 2
+      - source
+      - Output processed video stream
+
+    * - 3
+      - source
+      - Output viewfinder video stream
+
+    * - 4
+      - source
+      - 3A statistics
+
+Each pad is connected to a corresponding V4L2 video interface, exposed to
+userspace as a V4L2 video device node.
+
+Device operation
+----------------
+
+With the ImgU, once the input video node ("ipu3-imgu 0/1":0, in
+<entity>:<pad-number> format) is queued with a buffer (in packed raw Bayer
+format), the ImgU starts processing the buffer and produces the video output in
+YUV format and the statistics output on the respective output nodes. The driver
+is expected to have buffers ready for all of the parameter, output and
+statistics nodes when the input video node is queued with a buffer.
+
+At a minimum, the input, main output, 3A statistics and viewfinder video nodes
+should all be enabled for the IPU3 to start image processing.
+
+Each ImgU V4L2 subdev has the following set of video nodes.
+
+input, output and viewfinder video nodes
+----------------------------------------
+
+The frames (in packed raw Bayer format specific to the IPU3) received by the
+input video node are processed by the IPU3 Imaging Unit and are output to 2
+video nodes, each targeting a different purpose (main output and viewfinder
+output).
+
+Details on the Bayer format specific to the IPU3 can be found in
+:ref:`v4l2-pix-fmt-ipu3-sbggr10`.
+
+The driver supports the V4L2 Video Capture Interface as defined at :ref:`devices`.
+
+Only the multi-planar API is supported. More details can be found at
+:ref:`planar-apis`.
+
+Parameters video node
+---------------------
+
+The parameters video node receives the ImgU algorithm parameters that are used
+to configure how the ImgU algorithms process the image.
+
+Details on processing parameters specific to the IPU3 can be found in
+:ref:`v4l2-meta-fmt-params`.
+
+3A statistics video node
+------------------------
+
+The 3A statistics video node is used by the ImgU driver to output the 3A (auto
+focus, auto exposure and auto white balance) statistics for the frames that are
+being processed by the ImgU to user space applications. User space applications
+can use this statistics data to compute the desired algorithm parameters for
+the ImgU.
+
+Configuring the Intel IPU3
+==========================
+
+The IPU3 ImgU pipelines can be configured using the Media Controller, defined at
+:ref:`media_controller`.
+
+Running mode and firmware binary selection
+------------------------------------------
+
+The ImgU is driven by firmware. Currently, the ImgU firmware supports running 2
+pipes in time-sharing mode with data from a single input frame. Each pipe can
+run in a certain mode, "VIDEO" or "STILL": "VIDEO" mode is commonly used for
+video frame capture, and "STILL" mode is used for still frame capture. However,
+you can also select "VIDEO" mode to capture still frames if you want to capture
+images with less system load and power. In "STILL" mode, the ImgU tries to use a
+smaller BDS factor and output a larger Bayer frame for further YUV processing
+than in "VIDEO" mode, to get high quality images. In addition, "STILL" mode
+needs XNR3 for noise reduction, so "STILL" mode needs more power and memory
+bandwidth than "VIDEO" mode. TNR is enabled in "VIDEO" mode and bypassed in
+"STILL" mode. The ImgU runs in "VIDEO" mode by default; the user can use the
+V4L2 control V4L2_CID_INTEL_IPU3_MODE (currently defined in
+drivers/staging/media/ipu3/include/uapi/intel-ipu3.h) to query and set the
+running mode. For the user, buffer queueing does not differ between "VIDEO" and
+"STILL" mode: the mandatory input and main output nodes should be enabled and
+have buffers queued, while the statistics and viewfinder queues are
+optional.
+
+The firmware binary is selected according to the current running mode. A log
+message such as "using binary if_to_osys_striped" or
+"using binary if_to_osys_primary_striped" can be observed if you enable ImgU
+dynamic debug. The binary if_to_osys_striped is selected for "VIDEO" and the
+binary if_to_osys_primary_striped is selected for "STILL".
+
+
+Processing the image in raw Bayer format
+----------------------------------------
+
+Configuring ImgU V4L2 subdev for image processing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ImgU V4L2 subdevs have to be configured with media controller APIs to have
+all the video nodes setup correctly.
+
+Let us take "ipu3-imgu 0" subdev as an example.
+
+.. code-block:: none
+
+ media-ctl -d $MDEV -r
+ media-ctl -d $MDEV -l '"ipu3-imgu 0 input":0 -> "ipu3-imgu 0":0[1]'
+ media-ctl -d $MDEV -l '"ipu3-imgu 0":2 -> "ipu3-imgu 0 output":0[1]'
+ media-ctl -d $MDEV -l '"ipu3-imgu 0":3 -> "ipu3-imgu 0 viewfinder":0[1]'
+ media-ctl -d $MDEV -l '"ipu3-imgu 0":4 -> "ipu3-imgu 0 3a stat":0[1]'
+
+Also, the pipe mode of the corresponding V4L2 subdev should be set as desired
+(e.g. 0 for video mode or 1 for still mode) through the control id 0x009819a1 as
+below.
+
+.. code-block:: none
+
+ yavta -w "0x009819A1 1" /dev/v4l-subdev7
+
+Certain hardware blocks in the ImgU pipeline can change the frame resolution by
+cropping or scaling. These hardware blocks include the Input Feeder (IF), Bayer
+Down Scaler (BDS) and Geometric Distortion Correction (GDC).
+There is also a block which can change the frame resolution, the YUV Scaler; it
+is only applicable to the secondary output.
+
+RAW Bayer frames go through these ImgU pipeline hardware blocks and the final
+processed image is output to the DDR memory.
+
+.. kernel-figure:: ipu3_rcb.svg
+ :alt: ipu3 resolution blocks image
+
+ IPU3 resolution change hardware blocks
+
+**Input Feeder**
+
+The Input Feeder gets the Bayer frame data from the sensor. It can crop lines
+and columns from the frame and then stores the pixels into the device's internal
+pixel buffer, where they are ready to be read out by the following blocks.
+
+**Bayer Down Scaler**
+
+The Bayer Down Scaler is capable of performing image scaling in the Bayer
+domain. The downscale factor can be configured from 1X to 1/4X in each axis with
+configuration steps of 0.03125 (1/32).
+
+**Geometric Distortion Correction**
+
+Geometric Distortion Correction is used to perform correction of distortions
+and image filtering. It needs some extra filter and envelope padding pixels to
+work, so the input resolution of GDC should be larger than the output
+resolution.
+
+**YUV Scaler**
+
+The YUV Scaler is similar to the BDS, but it mainly performs image down scaling
+in the YUV domain. It can support up to 1/12X down scaling, but it cannot be
+applied to the main output.
+
+The ImgU V4L2 subdev has to be configured with the supported resolutions in all
+the above hardware blocks, for a given input resolution.
+For a given supported resolution for an input frame, the Input Feeder, Bayer
+Down Scaler and GDC blocks should be configured with the supported resolutions
+as each hardware block has its own alignment requirement.
+
+You must configure the output resolution of the hardware blocks carefully to
+meet the hardware requirements while keeping the maximum field of view. The
+intermediate resolutions can be generated by a specific tool:
+
+https://github.com/intel/intel-ipu3-pipecfg
+
+More information can be obtained by looking at the following IPU3 ImgU
+configuration table.
+
+https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/master
+
+Under baseboard-poppy/media-libs/cros-camera-hal-configs-poppy/files/gcss
+directory, graph_settings_ov5670.xml can be used as an example.
+
+The following steps prepare the ImgU pipeline for image processing.
+
+1. The ImgU V4L2 subdev data format should be set by using the
+VIDIOC_SUBDEV_S_FMT on pad 0, using the GDC width and height obtained above.
+
+2. The ImgU V4L2 subdev cropping should be set by using the
+VIDIOC_SUBDEV_S_SELECTION on pad 0, with V4L2_SEL_TGT_CROP as the target,
+using the input feeder height and width.
+
+3. The ImgU V4L2 subdev composing should be set by using the
+VIDIOC_SUBDEV_S_SELECTION on pad 0, with V4L2_SEL_TGT_COMPOSE as the target,
+using the BDS height and width.
+
+For the ov5670 example, for an input frame with a resolution of 2592x1944
+(which is input to the ImgU subdev pad 0), the corresponding resolutions
+for input feeder, BDS and GDC are 2592x1944, 2592x1944 and 2560x1920
+respectively.
+
+Once this is done, the received raw Bayer frames can be input to the ImgU
+V4L2 subdev as below, using the open source application v4l2n [#f1]_.
+
+For an image captured with 2592x1944 [#f4]_ resolution, with desired output
+resolution as 2560x1920 and viewfinder resolution as 2560x1920, the following
+v4l2n command can be used. This helps process the raw Bayer frames and produces
+the desired results for the main output image and the viewfinder output, in NV12
+format.
+
+.. code-block:: none
+
+ v4l2n --pipe=4 --load=/tmp/frame-#.bin --open=/dev/video4 \
+ --fmt=type:VIDEO_OUTPUT_MPLANE,width=2592,height=1944,pixelformat=0X47337069 \
+ --reqbufs=type:VIDEO_OUTPUT_MPLANE,count:1 --pipe=1 \
+ --output=/tmp/frames.out --open=/dev/video5 \
+ --fmt=type:VIDEO_CAPTURE_MPLANE,width=2560,height=1920,pixelformat=NV12 \
+ --reqbufs=type:VIDEO_CAPTURE_MPLANE,count:1 --pipe=2 \
+ --output=/tmp/frames.vf --open=/dev/video6 \
+ --fmt=type:VIDEO_CAPTURE_MPLANE,width=2560,height=1920,pixelformat=NV12 \
+ --reqbufs=type:VIDEO_CAPTURE_MPLANE,count:1 --pipe=3 --open=/dev/video7 \
+ --output=/tmp/frames.3A --fmt=type:META_CAPTURE,? \
+ --reqbufs=count:1,type:META_CAPTURE --pipe=1,2,3,4 --stream=5
+
+You can also use the yavta [#f2]_ command to do the same thing as above:
+
+.. code-block:: none
+
+ yavta --data-prefix -Bcapture-mplane -c10 -n5 -I -s2592x1944 \
+ --file=frame-#.out -f NV12 /dev/video5 & \
+ yavta --data-prefix -Bcapture-mplane -c10 -n5 -I -s2592x1944 \
+ --file=frame-#.vf -f NV12 /dev/video6 & \
+ yavta --data-prefix -Bmeta-capture -c10 -n5 -I \
+ --file=frame-#.3a /dev/video7 & \
+ yavta --data-prefix -Boutput-mplane -c10 -n5 -I -s2592x1944 \
+ --file=/tmp/frame-in.cio2 -f IPU3_SGRBG10 /dev/video4
+
+where /dev/video4, /dev/video5, /dev/video6 and /dev/video7 devices point to
+input, output, viewfinder and 3A statistics video nodes respectively.
+
+Converting the raw Bayer image into YUV domain
+----------------------------------------------
+
+The processed images from the above step can be converted to the YUV domain
+as below.
+
+Main output frames
+~~~~~~~~~~~~~~~~~~
+
+.. code-block:: none
+
+ raw2pnm -x2560 -y1920 -fNV12 /tmp/frames.out /tmp/frames.out.ppm
+
+where 2560x1920 is the output resolution, NV12 is the video format, followed
+by the input frame and the output PNM file.
+
+Viewfinder output frames
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: none
+
+ raw2pnm -x2560 -y1920 -fNV12 /tmp/frames.vf /tmp/frames.vf.ppm
+
+where 2560x1920 is the output resolution, NV12 is the video format, followed
+by the input frame and the output PNM file.
+
+Example user space code for IPU3
+================================
+
+User space code that configures and uses IPU3 is available here:
+
+https://chromium.googlesource.com/chromiumos/platform/arc-camera/+/master/
+
+The source can be located under the hal/intel directory.
+
+Overview of IPU3 pipeline
+=========================
+
+The IPU3 pipeline has a number of image processing stages, each of which takes a
+set of parameters as input. The major stages of the pipeline are shown here:
+
+.. kernel-render:: DOT
+ :alt: IPU3 ImgU Pipeline
+ :caption: IPU3 ImgU Pipeline Diagram
+
+ digraph "IPU3 ImgU" {
+ node [shape=box]
+ splines="ortho"
+ rankdir="LR"
+
+ a [label="Raw pixels"]
+ b [label="Bayer Downscaling"]
+ c [label="Optical Black Correction"]
+ d [label="Linearization"]
+ e [label="Lens Shading Correction"]
+ f [label="White Balance / Exposure / Focus Apply"]
+ g [label="Bayer Noise Reduction"]
+ h [label="ANR"]
+ i [label="Demosaicing"]
+ j [label="Color Correction Matrix"]
+ k [label="Gamma correction"]
+ l [label="Color Space Conversion"]
+ m [label="Chroma Down Scaling"]
+ n [label="Chromatic Noise Reduction"]
+ o [label="Total Color Correction"]
+ p [label="XNR3"]
+ q [label="TNR"]
+ r [label="DDR", style=filled, fillcolor=yellow, shape=cylinder]
+ s [label="YUV Downscaling"]
+ t [label="DDR", style=filled, fillcolor=yellow, shape=cylinder]
+
+ { rank=same; a -> b -> c -> d -> e -> f -> g -> h -> i }
+ { rank=same; j -> k -> l -> m -> n -> o -> p -> q -> s -> t}
+
+ a -> j [style=invis, weight=10]
+ i -> j
+ q -> r
+ }
+
+The table below presents a description of the above algorithms.
+
+======================== =======================================================
+Name                     Description
+======================== =======================================================
+Optical Black Correction Optical Black Correction block subtracts a pre-defined
+                         value from the respective pixel values to obtain better
+                         image quality.
+                         Defined in struct ipu3_uapi_obgrid_param.
+Linearization            This algo block uses linearization parameters to
+                         address non-linearity sensor effects. The lookup table
+                         is defined in
+                         struct ipu3_uapi_isp_lin_vmem_params.
+SHD                      Lens shading correction is used to correct spatial
+                         non-uniformity of the pixel response due to optical
+                         lens shading. This is done by applying a different gain
+                         for each pixel. The gain, black level etc. are
+                         configured in struct ipu3_uapi_shd_config_static.
+BNR                      Bayer noise reduction block removes image noise by
+                         applying a bilateral filter.
+                         See struct ipu3_uapi_bnr_static_config for details.
+ANR                      Advanced Noise Reduction is a block based algorithm
+                         that performs noise reduction in the Bayer domain. The
+                         convolution matrix etc. can be found in
+                         struct ipu3_uapi_anr_config.
+DM                       Demosaicing converts raw sensor data in Bayer format
+                         into RGB (Red, Green, Blue) presentation. It then adds
+                         an estimate of the Y channel for subsequent stream
+                         processing by firmware. The struct is defined as
+                         struct ipu3_uapi_dm_config.
+Color Correction         Color Correction algo transforms sensor specific color
+                         space to the standard "sRGB" color space. This is done
+                         by applying a 3x3 matrix defined in
+                         struct ipu3_uapi_ccm_mat_config.
+Gamma correction         Gamma correction struct ipu3_uapi_gamma_config is a
+                         basic non-linear tone mapping correction that is
+                         applied per pixel for each pixel component.
+CSC                      Color space conversion transforms each pixel from the
+                         RGB primary presentation to YUV (Y: brightness,
+                         UV: chrominance) presentation. This is done by applying
+                         a 3x3 matrix defined in
+                         struct ipu3_uapi_csc_mat_config.
+CDS                      Chroma down sampling.
+                         After the CSC is performed, the Chroma Down Sampling
+                         is applied for a UV plane down sampling by a factor
+                         of 2 in each direction for YUV 4:2:0 using a 4x2
+                         configurable filter struct ipu3_uapi_cds_params.
+CHNR                     Chroma noise reduction.
+                         This block processes only the chrominance pixels and
+                         performs noise reduction by cleaning the high
+                         frequency noise.
+                         See struct ipu3_uapi_yuvp1_chnr_config.
+TCC                      Total color correction as defined in
+                         struct ipu3_uapi_yuvp2_tcc_static_config.
+XNR3                     eXtreme Noise Reduction V3 is the third revision of
+                         the noise reduction algorithm used to improve image
+                         quality. This removes the low frequency noise in the
+                         captured image. Two related structs are defined:
+                         struct ipu3_uapi_isp_xnr3_params for ISP data memory
+                         and struct ipu3_uapi_isp_xnr3_vmem_params for vector
+                         memory.
+TNR                      Temporal Noise Reduction block compares successive
+                         frames in time to remove anomalies / noise in pixel
+                         values. struct ipu3_uapi_isp_tnr3_vmem_params and
+                         struct ipu3_uapi_isp_tnr3_params are defined for ISP
+                         vector and data memory respectively.
+======================== =======================================================
+
+Other often encountered acronyms not listed in the above table:
+
+    ACC
+        Accelerator cluster
+    AWB_FR
+        Auto white balance filter response statistics
+    BDS
+        Bayer downscaler parameters
+    CCM
+        Color correction matrix coefficients
+    IEFd
+        Image enhancement filter directed
+    Obgrid
+        Optical black level compensation
+    OSYS
+        Output system configuration
+    ROI
+        Region of interest
+    YDS
+        Y down sampling
+    YTM
+        Y-tone mapping
+
+A few stages of the pipeline are executed by firmware running on the ISP
+processor, while many others use a set of fixed hardware blocks, also
+called the accelerator cluster (ACC), to process pixel data and produce statistics.
+
+ACC parameters of individual algorithms, as defined by
+struct ipu3_uapi_acc_param, can be selected for application by user
+space through struct ipu3_uapi_flags embedded in the
+struct ipu3_uapi_params structure. For parameters that user space has
+not enabled, the corresponding structs are ignored by the driver, in
+which case the existing configuration of the algorithm is
+preserved.
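Editor's sketch (not part of the patch): the flag-gated update described above can be modelled in a few lines of C. The `demo_*` types and the `demo_apply()` helper are hypothetical, heavily simplified stand-ins; they are not the layouts of struct ipu3_uapi_flags, struct ipu3_uapi_acc_param or struct ipu3_uapi_params, whose actual fields are defined in intel-ipu3.h. The sketch only illustrates the rule that a block without its flag set leaves the existing configuration untouched.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the per-block "use" flags. */
struct demo_flags {
	unsigned int acc_csc:1;   /* apply the CSC block from this buffer */
	unsigned int acc_gamma:1; /* apply the gamma block from this buffer */
};

/* Hypothetical stand-in for the per-algorithm ACC parameters. */
struct demo_acc_param {
	int csc_coeff;    /* placeholder for a CSC matrix configuration */
	int gamma_enable; /* placeholder for a gamma configuration */
};

/* Hypothetical stand-in for the parameter buffer queued by user space. */
struct demo_params {
	struct demo_flags use;           /* which blocks below are valid */
	struct demo_acc_param acc_param; /* candidate new settings */
};

/*
 * Model of the behaviour described in the text: copy only the blocks
 * whose flag is set and preserve the active configuration otherwise.
 */
void demo_apply(struct demo_acc_param *active, const struct demo_params *p)
{
	assert(active != NULL && p != NULL);

	if (p->use.acc_csc)
		active->csc_coeff = p->acc_param.csc_coeff;
	if (p->use.acc_gamma)
		active->gamma_enable = p->acc_param.gamma_enable;
}
```

A caller would typically zero the whole buffer with memset(), set only the flags for the blocks it wants to change, fill in those blocks, and hand the buffer over; in this model any block left unflagged is never read.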
+
+References
+==========
+
+.. [#f5] drivers/staging/media/ipu3/include/uapi/intel-ipu3.h
+
+.. [#f1] https://github.com/intel/nvt
+
+.. [#f2] http://git.ideasonboard.org/yavta.git
+
+.. [#f3] http://git.ideasonboard.org/?p=media-ctl.git;a=summary
+
+.. [#f4] ImgU limitation requires an additional 16x16 for all input resolutions
diff --git a/Documentation/admin-guide/media/ipu3_rcb.svg b/Documentation/admin-guide/media/ipu3_rcb.svg
new file mode 100644
index 000000000000..d878421b42a0
--- /dev/null
+++ b/Documentation/admin-guide/media/ipu3_rcb.svg
@@ -0,0 +1,331 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="774pt" height="152pt" viewBox="0 0 774 152" version="1.1">
+<defs>
+<g>
+<symbol overflow="visible" id="glyph0-0">
+<path style="stroke:none;" d="M 1 0 L 1 -15 L 9 -15 L 9 0 Z M 8 -1 L 8 -14 L 2 -14 L 2 -1 Z M 8 -1 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-1">
+<path style="stroke:none;" d="M 4.6875 -1.15625 C 5.519531 -1.15625 6.15625 -1.316406 6.59375 -1.640625 C 7.039062 -1.960938 7.265625 -2.441406 7.265625 -3.078125 C 7.265625 -3.460938 7.179688 -3.789062 7.015625 -4.0625 C 6.859375 -4.34375 6.644531 -4.582031 6.375 -4.78125 C 6.113281 -4.988281 5.816406 -5.171875 5.484375 -5.328125 C 5.148438 -5.484375 4.804688 -5.628906 4.453125 -5.765625 C 4.054688 -5.921875 3.675781 -6.097656 3.3125 -6.296875 C 2.945312 -6.492188 2.617188 -6.726562 2.328125 -7 C 2.046875 -7.269531 1.820312 -7.582031 1.65625 -7.9375 C 1.488281 -8.300781 1.40625 -8.726562 1.40625 -9.21875 C 1.40625 -10.300781 1.742188 -11.144531 2.421875 -11.75 C 3.097656 -12.351562 4.046875 -12.65625 5.265625 -12.65625 C 5.597656 -12.65625 5.925781 -12.628906 6.25 -12.578125 C 6.570312 -12.535156 6.875 -12.476562 7.15625 -12.40625 C 7.4375 -12.34375 7.6875 -12.265625 7.90625 -12.171875 C 8.125 -12.085938 8.300781 -12 8.4375 -11.90625 L 7.921875 -10.515625 C 7.648438 -10.679688 7.28125 -10.84375 6.8125 -11 C 6.351562 -11.15625 5.835938 -11.234375 5.265625 -11.234375 C 4.660156 -11.234375 4.140625 -11.082031 3.703125 -10.78125 C 3.265625 -10.488281 3.046875 -10.039062 3.046875 -9.4375 C 3.046875 -9.09375 3.109375 -8.800781 3.234375 -8.5625 C 3.359375 -8.320312 3.53125 -8.109375 3.75 -7.921875 C 3.96875 -7.742188 4.222656 -7.582031 4.515625 -7.4375 C 4.804688 -7.289062 5.128906 -7.144531 5.484375 -7 C 5.984375 -6.789062 6.441406 -6.578125 6.859375 -6.359375 C 7.285156 -6.148438 7.648438 -5.894531 7.953125 -5.59375 C 8.253906 -5.300781 8.488281 -4.953125 8.65625 -4.546875 C 8.820312 -4.148438 8.90625 -3.664062 8.90625 -3.09375 C 8.90625 -2.019531 8.539062 -1.191406 7.8125 -0.609375 C 7.082031 -0.0234375 6.039062 0.265625 4.6875 0.265625 C 4.238281 0.265625 3.820312 0.234375 3.4375 0.171875 C 3.050781 0.109375 2.707031 0.03125 2.40625 -0.0625 C 2.101562 -0.15625 1.835938 -0.25 1.609375 -0.34375 C 1.390625 -0.4375 1.21875 -0.519531 1.09375 -0.59375 L 1.59375 -1.953125 
C 1.863281 -1.804688 2.257812 -1.632812 2.78125 -1.4375 C 3.300781 -1.25 3.9375 -1.15625 4.6875 -1.15625 Z M 4.6875 -1.15625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-2">
+<path style="stroke:none;" d="M 5.1875 -9.5 C 6.4375 -9.5 7.398438 -9.109375 8.078125 -8.328125 C 8.753906 -7.546875 9.09375 -6.363281 9.09375 -4.78125 L 9.09375 -4.203125 L 2.453125 -4.203125 C 2.523438 -3.242188 2.84375 -2.515625 3.40625 -2.015625 C 3.976562 -1.515625 4.773438 -1.265625 5.796875 -1.265625 C 6.390625 -1.265625 6.890625 -1.3125 7.296875 -1.40625 C 7.710938 -1.5 8.023438 -1.597656 8.234375 -1.703125 L 8.453125 -0.296875 C 8.253906 -0.191406 7.894531 -0.0820312 7.375 0.03125 C 6.851562 0.15625 6.269531 0.21875 5.625 0.21875 C 4.820312 0.21875 4.113281 0.0976562 3.5 -0.140625 C 2.894531 -0.390625 2.394531 -0.726562 2 -1.15625 C 1.601562 -1.582031 1.300781 -2.09375 1.09375 -2.6875 C 0.894531 -3.28125 0.796875 -3.925781 0.796875 -4.625 C 0.796875 -5.445312 0.921875 -6.164062 1.171875 -6.78125 C 1.429688 -7.394531 1.765625 -7.898438 2.171875 -8.296875 C 2.585938 -8.703125 3.054688 -9.003906 3.578125 -9.203125 C 4.097656 -9.398438 4.632812 -9.5 5.1875 -9.5 Z M 7.421875 -5.546875 C 7.421875 -6.328125 7.210938 -6.945312 6.796875 -7.40625 C 6.390625 -7.863281 5.84375 -8.09375 5.15625 -8.09375 C 4.769531 -8.09375 4.421875 -8.019531 4.109375 -7.875 C 3.796875 -7.726562 3.523438 -7.535156 3.296875 -7.296875 C 3.066406 -7.054688 2.882812 -6.78125 2.75 -6.46875 C 2.625 -6.164062 2.539062 -5.859375 2.5 -5.546875 Z M 7.421875 -5.546875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-3">
+<path style="stroke:none;" d="M 1.421875 -9.015625 C 2.015625 -9.160156 2.609375 -9.273438 3.203125 -9.359375 C 3.796875 -9.441406 4.351562 -9.484375 4.875 -9.484375 C 6.113281 -9.484375 7.050781 -9.160156 7.6875 -8.515625 C 8.320312 -7.878906 8.640625 -6.851562 8.640625 -5.4375 L 8.640625 0 L 7 0 L 7 -5.140625 C 7 -5.742188 6.945312 -6.226562 6.84375 -6.59375 C 6.738281 -6.96875 6.585938 -7.257812 6.390625 -7.46875 C 6.191406 -7.675781 5.957031 -7.816406 5.6875 -7.890625 C 5.414062 -7.972656 5.117188 -8.015625 4.796875 -8.015625 C 4.535156 -8.015625 4.253906 -8 3.953125 -7.96875 C 3.648438 -7.9375 3.359375 -7.894531 3.078125 -7.84375 L 3.078125 0 L 1.421875 0 Z M 1.421875 -9.015625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-4">
+<path style="stroke:none;" d="M 7.015625 -2.3125 C 7.015625 -2.644531 6.878906 -2.914062 6.609375 -3.125 C 6.335938 -3.34375 6 -3.53125 5.59375 -3.6875 C 5.1875 -3.851562 4.742188 -4.015625 4.265625 -4.171875 C 3.785156 -4.328125 3.335938 -4.515625 2.921875 -4.734375 C 2.515625 -4.960938 2.175781 -5.242188 1.90625 -5.578125 C 1.632812 -5.910156 1.5 -6.34375 1.5 -6.875 C 1.5 -7.625 1.800781 -8.25 2.40625 -8.75 C 3.007812 -9.25 3.960938 -9.5 5.265625 -9.5 C 5.765625 -9.5 6.285156 -9.460938 6.828125 -9.390625 C 7.367188 -9.316406 7.832031 -9.21875 8.21875 -9.09375 L 7.921875 -7.625 C 7.816406 -7.675781 7.671875 -7.726562 7.484375 -7.78125 C 7.296875 -7.84375 7.082031 -7.894531 6.84375 -7.9375 C 6.601562 -7.988281 6.34375 -8.023438 6.0625 -8.046875 C 5.789062 -8.078125 5.53125 -8.09375 5.28125 -8.09375 C 3.84375 -8.09375 3.125 -7.703125 3.125 -6.921875 C 3.125 -6.640625 3.257812 -6.398438 3.53125 -6.203125 C 3.800781 -6.015625 4.144531 -5.835938 4.5625 -5.671875 C 4.976562 -5.515625 5.425781 -5.351562 5.90625 -5.1875 C 6.382812 -5.019531 6.828125 -4.816406 7.234375 -4.578125 C 7.648438 -4.335938 7.992188 -4.046875 8.265625 -3.703125 C 8.546875 -3.367188 8.6875 -2.941406 8.6875 -2.421875 C 8.6875 -1.578125 8.359375 -0.925781 7.703125 -0.46875 C 7.046875 -0.0078125 6.007812 0.21875 4.59375 0.21875 C 3.957031 0.21875 3.375 0.164062 2.84375 0.0625 C 2.3125 -0.0390625 1.800781 -0.203125 1.3125 -0.421875 L 1.640625 -1.921875 C 2.109375 -1.703125 2.597656 -1.523438 3.109375 -1.390625 C 3.617188 -1.253906 4.171875 -1.1875 4.765625 -1.1875 C 6.265625 -1.1875 7.015625 -1.5625 7.015625 -2.3125 Z M 7.015625 -2.3125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-5">
+<path style="stroke:none;" d="M 9.203125 -4.640625 C 9.203125 -3.910156 9.097656 -3.25 8.890625 -2.65625 C 8.679688 -2.0625 8.390625 -1.550781 8.015625 -1.125 C 7.640625 -0.695312 7.191406 -0.363281 6.671875 -0.125 C 6.160156 0.101562 5.597656 0.21875 4.984375 0.21875 C 4.378906 0.21875 3.820312 0.101562 3.3125 -0.125 C 2.800781 -0.363281 2.359375 -0.695312 1.984375 -1.125 C 1.609375 -1.550781 1.316406 -2.0625 1.109375 -2.65625 C 0.898438 -3.25 0.796875 -3.910156 0.796875 -4.640625 C 0.796875 -5.367188 0.898438 -6.035156 1.109375 -6.640625 C 1.316406 -7.242188 1.609375 -7.753906 1.984375 -8.171875 C 2.359375 -8.585938 2.800781 -8.910156 3.3125 -9.140625 C 3.820312 -9.378906 4.378906 -9.5 4.984375 -9.5 C 5.597656 -9.5 6.160156 -9.378906 6.671875 -9.140625 C 7.191406 -8.910156 7.640625 -8.585938 8.015625 -8.171875 C 8.390625 -7.753906 8.679688 -7.242188 8.890625 -6.640625 C 9.097656 -6.035156 9.203125 -5.367188 9.203125 -4.640625 Z M 7.5 -4.640625 C 7.5 -5.691406 7.269531 -6.519531 6.8125 -7.125 C 6.363281 -7.738281 5.753906 -8.046875 4.984375 -8.046875 C 4.222656 -8.046875 3.617188 -7.738281 3.171875 -7.125 C 2.722656 -6.519531 2.5 -5.691406 2.5 -4.640625 C 2.5 -3.597656 2.722656 -2.773438 3.171875 -2.171875 C 3.617188 -1.566406 4.222656 -1.265625 4.984375 -1.265625 C 5.753906 -1.265625 6.363281 -1.566406 6.8125 -2.171875 C 7.269531 -2.773438 7.5 -3.597656 7.5 -4.640625 Z M 7.5 -4.640625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-6">
+<path style="stroke:none;" d="M 2.140625 0 L 2.140625 -8.78125 C 3.503906 -9.25 4.878906 -9.484375 6.265625 -9.484375 C 6.691406 -9.484375 7.097656 -9.460938 7.484375 -9.421875 C 7.867188 -9.390625 8.296875 -9.320312 8.765625 -9.21875 L 8.453125 -7.765625 C 8.023438 -7.878906 7.648438 -7.953125 7.328125 -7.984375 C 7.003906 -8.023438 6.648438 -8.046875 6.265625 -8.046875 C 5.453125 -8.046875 4.625 -7.929688 3.78125 -7.703125 L 3.78125 0 Z M 2.140625 0 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-7">
+<path style="stroke:none;" d="M 5.8125 -10.984375 L 5.8125 -1.40625 L 8.21875 -1.40625 L 8.21875 0 L 1.78125 0 L 1.78125 -1.40625 L 4.1875 -1.40625 L 4.1875 -10.984375 L 1.78125 -10.984375 L 1.78125 -12.375 L 8.21875 -12.375 L 8.21875 -10.984375 Z M 5.8125 -10.984375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-8">
+<path style="stroke:none;" d="M 1.8125 0 L 1.8125 -12.375 L 8.84375 -12.375 L 8.84375 -10.984375 L 3.453125 -10.984375 L 3.453125 -7.125 L 8.203125 -7.125 L 8.203125 -5.734375 L 3.453125 -5.734375 L 3.453125 0 Z M 1.8125 0 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-9">
+<path style="stroke:none;" d="M 4.078125 0.09375 C 3.878906 0.09375 3.644531 0.0859375 3.375 0.078125 C 3.113281 0.0664062 2.847656 0.0507812 2.578125 0.03125 C 2.316406 0.0078125 2.050781 -0.0195312 1.78125 -0.0625 C 1.507812 -0.101562 1.273438 -0.148438 1.078125 -0.203125 L 1.078125 -12.203125 C 1.273438 -12.253906 1.503906 -12.300781 1.765625 -12.34375 C 2.023438 -12.382812 2.289062 -12.410156 2.5625 -12.421875 C 2.84375 -12.441406 3.113281 -12.457031 3.375 -12.46875 C 3.632812 -12.488281 3.867188 -12.5 4.078125 -12.5 C 4.691406 -12.5 5.265625 -12.445312 5.796875 -12.34375 C 6.328125 -12.238281 6.789062 -12.054688 7.1875 -11.796875 C 7.582031 -11.546875 7.890625 -11.210938 8.109375 -10.796875 C 8.328125 -10.390625 8.4375 -9.878906 8.4375 -9.265625 C 8.4375 -8.960938 8.390625 -8.675781 8.296875 -8.40625 C 8.203125 -8.132812 8.070312 -7.878906 7.90625 -7.640625 C 7.738281 -7.398438 7.546875 -7.1875 7.328125 -7 C 7.109375 -6.820312 6.875 -6.6875 6.625 -6.59375 C 7.300781 -6.40625 7.867188 -6.0625 8.328125 -5.5625 C 8.785156 -5.0625 9.015625 -4.414062 9.015625 -3.625 C 9.015625 -2.394531 8.617188 -1.46875 7.828125 -0.84375 C 7.046875 -0.21875 5.796875 0.09375 4.078125 0.09375 Z M 2.71875 -5.78125 L 2.71875 -1.359375 C 2.75 -1.347656 2.898438 -1.332031 3.171875 -1.3125 C 3.441406 -1.289062 3.785156 -1.28125 4.203125 -1.28125 C 4.609375 -1.28125 5 -1.3125 5.375 -1.375 C 5.757812 -1.445312 6.097656 -1.570312 6.390625 -1.75 C 6.691406 -1.925781 6.929688 -2.160156 7.109375 -2.453125 C 7.285156 -2.753906 7.375 -3.132812 7.375 -3.59375 C 7.375 -4.007812 7.289062 -4.359375 7.125 -4.640625 C 6.957031 -4.921875 6.738281 -5.144531 6.46875 -5.3125 C 6.195312 -5.476562 5.878906 -5.597656 5.515625 -5.671875 C 5.160156 -5.742188 4.789062 -5.78125 4.40625 -5.78125 Z M 2.71875 -7.140625 L 4.015625 -7.140625 C 4.347656 -7.140625 4.679688 -7.171875 5.015625 -7.234375 C 5.347656 -7.304688 5.644531 -7.414062 5.90625 -7.5625 C 6.175781 -7.707031 6.390625 -7.90625 6.546875 -8.15625 C 
6.710938 -8.414062 6.796875 -8.738281 6.796875 -9.125 C 6.796875 -9.476562 6.722656 -9.78125 6.578125 -10.03125 C 6.429688 -10.289062 6.238281 -10.5 6 -10.65625 C 5.757812 -10.820312 5.484375 -10.9375 5.171875 -11 C 4.859375 -11.0625 4.53125 -11.09375 4.1875 -11.09375 C 3.832031 -11.09375 3.523438 -11.085938 3.265625 -11.078125 C 3.003906 -11.078125 2.820312 -11.066406 2.71875 -11.046875 Z M 2.71875 -7.140625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-10">
+<path style="stroke:none;" d="M 9.203125 -6.203125 C 9.203125 -5.054688 9.054688 -4.082031 8.765625 -3.28125 C 8.484375 -2.476562 8.09375 -1.828125 7.59375 -1.328125 C 7.09375 -0.828125 6.5 -0.460938 5.8125 -0.234375 C 5.125 -0.015625 4.378906 0.09375 3.578125 0.09375 C 2.753906 0.09375 1.921875 -0.00390625 1.078125 -0.203125 L 1.078125 -12.203125 C 1.921875 -12.398438 2.753906 -12.5 3.578125 -12.5 C 4.378906 -12.5 5.125 -12.382812 5.8125 -12.15625 C 6.5 -11.925781 7.09375 -11.554688 7.59375 -11.046875 C 8.09375 -10.546875 8.484375 -9.894531 8.765625 -9.09375 C 9.054688 -8.300781 9.203125 -7.335938 9.203125 -6.203125 Z M 2.71875 -1.375 C 3.050781 -1.332031 3.390625 -1.3125 3.734375 -1.3125 C 4.335938 -1.3125 4.875 -1.398438 5.34375 -1.578125 C 5.8125 -1.765625 6.203125 -2.054688 6.515625 -2.453125 C 6.835938 -2.847656 7.082031 -3.351562 7.25 -3.96875 C 7.425781 -4.59375 7.515625 -5.335938 7.515625 -6.203125 C 7.515625 -7.878906 7.191406 -9.109375 6.546875 -9.890625 C 5.898438 -10.679688 4.945312 -11.078125 3.6875 -11.078125 C 3.507812 -11.078125 3.335938 -11.070312 3.171875 -11.0625 C 3.003906 -11.0625 2.851562 -11.046875 2.71875 -11.015625 Z M 2.71875 -1.375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-11">
+<path style="stroke:none;" d="M 7.453125 -6.09375 L 9.09375 -6.09375 L 9.09375 -0.296875 C 8.84375 -0.203125 8.4375 -0.0859375 7.875 0.046875 C 7.320312 0.191406 6.664062 0.265625 5.90625 0.265625 C 5.15625 0.265625 4.472656 0.125 3.859375 -0.15625 C 3.242188 -0.445312 2.71875 -0.863281 2.28125 -1.40625 C 1.851562 -1.957031 1.519531 -2.632812 1.28125 -3.4375 C 1.039062 -4.25 0.921875 -5.171875 0.921875 -6.203125 C 0.921875 -7.242188 1.050781 -8.160156 1.3125 -8.953125 C 1.582031 -9.753906 1.945312 -10.425781 2.40625 -10.96875 C 2.863281 -11.519531 3.398438 -11.9375 4.015625 -12.21875 C 4.628906 -12.507812 5.289062 -12.65625 6 -12.65625 C 6.457031 -12.65625 6.859375 -12.617188 7.203125 -12.546875 C 7.546875 -12.484375 7.835938 -12.40625 8.078125 -12.3125 C 8.328125 -12.226562 8.53125 -12.132812 8.6875 -12.03125 C 8.851562 -11.925781 8.976562 -11.847656 9.0625 -11.796875 L 8.515625 -10.421875 C 8.210938 -10.660156 7.847656 -10.851562 7.421875 -11 C 7.003906 -11.15625 6.5625 -11.234375 6.09375 -11.234375 C 5.59375 -11.234375 5.125 -11.113281 4.6875 -10.875 C 4.257812 -10.632812 3.890625 -10.296875 3.578125 -9.859375 C 3.273438 -9.421875 3.035156 -8.890625 2.859375 -8.265625 C 2.679688 -7.648438 2.59375 -6.960938 2.59375 -6.203125 C 2.59375 -5.453125 2.671875 -4.769531 2.828125 -4.15625 C 2.984375 -3.539062 3.207031 -3.015625 3.5 -2.578125 C 3.789062 -2.140625 4.148438 -1.796875 4.578125 -1.546875 C 5.015625 -1.304688 5.515625 -1.1875 6.078125 -1.1875 C 6.460938 -1.1875 6.757812 -1.210938 6.96875 -1.265625 C 7.1875 -1.316406 7.347656 -1.367188 7.453125 -1.421875 Z M 7.453125 -6.09375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph0-12">
+<path style="stroke:none;" d="M 9.203125 -0.515625 C 8.734375 -0.253906 8.234375 -0.0625 7.703125 0.0625 C 7.179688 0.195312 6.617188 0.265625 6.015625 0.265625 C 5.285156 0.265625 4.609375 0.132812 3.984375 -0.125 C 3.367188 -0.382812 2.832031 -0.773438 2.375 -1.296875 C 1.925781 -1.828125 1.570312 -2.5 1.3125 -3.3125 C 1.050781 -4.132812 0.921875 -5.097656 0.921875 -6.203125 C 0.921875 -7.253906 1.054688 -8.179688 1.328125 -8.984375 C 1.597656 -9.785156 1.96875 -10.457031 2.4375 -11 C 2.90625 -11.539062 3.453125 -11.953125 4.078125 -12.234375 C 4.703125 -12.515625 5.367188 -12.65625 6.078125 -12.65625 C 6.566406 -12.65625 7.066406 -12.585938 7.578125 -12.453125 C 8.097656 -12.328125 8.601562 -12.109375 9.09375 -11.796875 L 8.625 -10.4375 C 7.738281 -10.945312 6.910156 -11.203125 6.140625 -11.203125 C 5.585938 -11.203125 5.09375 -11.082031 4.65625 -10.84375 C 4.226562 -10.613281 3.859375 -10.28125 3.546875 -9.84375 C 3.242188 -9.40625 3.007812 -8.878906 2.84375 -8.265625 C 2.675781 -7.648438 2.59375 -6.960938 2.59375 -6.203125 C 2.59375 -5.347656 2.679688 -4.609375 2.859375 -3.984375 C 3.046875 -3.359375 3.296875 -2.835938 3.609375 -2.421875 C 3.929688 -2.003906 4.316406 -1.695312 4.765625 -1.5 C 5.210938 -1.300781 5.695312 -1.203125 6.21875 -1.203125 C 6.601562 -1.203125 7.007812 -1.25 7.4375 -1.34375 C 7.863281 -1.445312 8.304688 -1.625 8.765625 -1.875 Z M 9.203125 -0.515625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-0">
+<path style="stroke:none;" d="M 0.59375 0 L 0.59375 -9 L 5.40625 -9 L 5.40625 0 Z M 4.796875 -0.59375 L 4.796875 -8.40625 L 1.203125 -8.40625 L 1.203125 -0.59375 Z M 4.796875 -0.59375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-1">
+<path style="stroke:none;" d="M 2.515625 0 L 2.515625 -2.765625 C 2.023438 -3.554688 1.582031 -4.332031 1.1875 -5.09375 C 0.789062 -5.851562 0.445312 -6.628906 0.15625 -7.421875 L 1.265625 -7.421875 C 1.492188 -6.753906 1.757812 -6.113281 2.0625 -5.5 C 2.363281 -4.882812 2.6875 -4.253906 3.03125 -3.609375 C 3.394531 -4.285156 3.71875 -4.929688 4 -5.546875 C 4.28125 -6.160156 4.539062 -6.785156 4.78125 -7.421875 L 5.859375 -7.421875 C 5.554688 -6.640625 5.207031 -5.875 4.8125 -5.125 C 4.414062 -4.382812 3.976562 -3.601562 3.5 -2.78125 L 3.5 0 Z M 2.515625 0 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-2">
+<path style="stroke:none;" d="M 3 0.15625 C 2.5625 0.15625 2.1875 0.09375 1.875 -0.03125 C 1.570312 -0.164062 1.320312 -0.347656 1.125 -0.578125 C 0.9375 -0.804688 0.796875 -1.085938 0.703125 -1.421875 C 0.617188 -1.765625 0.578125 -2.144531 0.578125 -2.5625 L 0.578125 -7.421875 L 1.5625 -7.421875 L 1.5625 -2.65625 C 1.5625 -2.28125 1.59375 -1.96875 1.65625 -1.71875 C 1.726562 -1.46875 1.828125 -1.265625 1.953125 -1.109375 C 2.078125 -0.960938 2.222656 -0.859375 2.390625 -0.796875 C 2.566406 -0.734375 2.769531 -0.703125 3 -0.703125 C 3.226562 -0.703125 3.425781 -0.734375 3.59375 -0.796875 C 3.769531 -0.859375 3.921875 -0.960938 4.046875 -1.109375 C 4.171875 -1.265625 4.265625 -1.46875 4.328125 -1.71875 C 4.398438 -1.96875 4.4375 -2.28125 4.4375 -2.65625 L 4.4375 -7.421875 L 5.421875 -7.421875 L 5.421875 -2.5625 C 5.421875 -2.144531 5.375 -1.765625 5.28125 -1.421875 C 5.195312 -1.085938 5.054688 -0.804688 4.859375 -0.578125 C 4.671875 -0.347656 4.421875 -0.164062 4.109375 -0.03125 C 3.804688 0.09375 3.4375 0.15625 3 0.15625 Z M 3 0.15625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-3">
+<path style="stroke:none;" d="M 1.21875 -7.421875 C 1.320312 -6.921875 1.445312 -6.375 1.59375 -5.78125 C 1.738281 -5.1875 1.890625 -4.585938 2.046875 -3.984375 C 2.210938 -3.390625 2.378906 -2.820312 2.546875 -2.28125 C 2.722656 -1.738281 2.882812 -1.265625 3.03125 -0.859375 C 3.15625 -1.265625 3.300781 -1.742188 3.46875 -2.296875 C 3.644531 -2.847656 3.816406 -3.421875 3.984375 -4.015625 C 4.148438 -4.609375 4.304688 -5.203125 4.453125 -5.796875 C 4.609375 -6.390625 4.734375 -6.929688 4.828125 -7.421875 L 5.859375 -7.421875 C 5.796875 -7.109375 5.691406 -6.679688 5.546875 -6.140625 C 5.398438 -5.597656 5.226562 -4.992188 5.03125 -4.328125 C 4.832031 -3.660156 4.609375 -2.953125 4.359375 -2.203125 C 4.117188 -1.453125 3.863281 -0.71875 3.59375 0 L 2.375 0 C 2.125 -0.71875 1.878906 -1.445312 1.640625 -2.1875 C 1.410156 -2.9375 1.195312 -3.644531 1 -4.3125 C 0.800781 -4.976562 0.628906 -5.582031 0.484375 -6.125 C 0.335938 -6.675781 0.226562 -7.109375 0.15625 -7.421875 Z M 1.21875 -7.421875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-4">
+<path style="stroke:none;" d=""/>
+</symbol>
+<symbol overflow="visible" id="glyph1-5">
+<path style="stroke:none;" d="M 5.515625 -3.71875 C 5.515625 -3.03125 5.425781 -2.445312 5.25 -1.96875 C 5.082031 -1.488281 4.847656 -1.097656 4.546875 -0.796875 C 4.253906 -0.492188 3.898438 -0.273438 3.484375 -0.140625 C 3.078125 -0.00390625 2.628906 0.0625 2.140625 0.0625 C 1.648438 0.0625 1.148438 0 0.640625 -0.125 L 0.640625 -7.3125 C 1.148438 -7.4375 1.648438 -7.5 2.140625 -7.5 C 2.628906 -7.5 3.078125 -7.429688 3.484375 -7.296875 C 3.898438 -7.160156 4.253906 -6.941406 4.546875 -6.640625 C 4.847656 -6.335938 5.082031 -5.941406 5.25 -5.453125 C 5.425781 -4.972656 5.515625 -4.394531 5.515625 -3.71875 Z M 1.625 -0.828125 C 1.832031 -0.804688 2.039062 -0.796875 2.25 -0.796875 C 2.601562 -0.796875 2.921875 -0.847656 3.203125 -0.953125 C 3.484375 -1.054688 3.71875 -1.226562 3.90625 -1.46875 C 4.101562 -1.707031 4.253906 -2.007812 4.359375 -2.375 C 4.460938 -2.75 4.515625 -3.195312 4.515625 -3.71875 C 4.515625 -4.726562 4.316406 -5.46875 3.921875 -5.9375 C 3.535156 -6.40625 2.960938 -6.640625 2.203125 -6.640625 C 2.097656 -6.640625 1.992188 -6.640625 1.890625 -6.640625 C 1.796875 -6.640625 1.707031 -6.628906 1.625 -6.609375 Z M 1.625 -0.828125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-6">
+<path style="stroke:none;" d="M 5.515625 -2.78125 C 5.515625 -2.34375 5.453125 -1.945312 5.328125 -1.59375 C 5.203125 -1.238281 5.023438 -0.929688 4.796875 -0.671875 C 4.578125 -0.410156 4.3125 -0.210938 4 -0.078125 C 3.695312 0.0546875 3.359375 0.125 2.984375 0.125 C 2.628906 0.125 2.296875 0.0546875 1.984375 -0.078125 C 1.679688 -0.210938 1.414062 -0.410156 1.1875 -0.671875 C 0.96875 -0.929688 0.796875 -1.238281 0.671875 -1.59375 C 0.546875 -1.945312 0.484375 -2.34375 0.484375 -2.78125 C 0.484375 -3.21875 0.546875 -3.617188 0.671875 -3.984375 C 0.796875 -4.347656 0.96875 -4.65625 1.1875 -4.90625 C 1.414062 -5.15625 1.679688 -5.347656 1.984375 -5.484375 C 2.296875 -5.628906 2.628906 -5.703125 2.984375 -5.703125 C 3.359375 -5.703125 3.695312 -5.628906 4 -5.484375 C 4.3125 -5.347656 4.578125 -5.15625 4.796875 -4.90625 C 5.023438 -4.65625 5.203125 -4.347656 5.328125 -3.984375 C 5.453125 -3.617188 5.515625 -3.21875 5.515625 -2.78125 Z M 4.5 -2.78125 C 4.5 -3.414062 4.363281 -3.914062 4.09375 -4.28125 C 3.820312 -4.644531 3.453125 -4.828125 2.984375 -4.828125 C 2.523438 -4.828125 2.160156 -4.644531 1.890625 -4.28125 C 1.628906 -3.914062 1.5 -3.414062 1.5 -2.78125 C 1.5 -2.15625 1.628906 -1.660156 1.890625 -1.296875 C 2.160156 -0.929688 2.523438 -0.75 2.984375 -0.75 C 3.453125 -0.75 3.820312 -0.929688 4.09375 -1.296875 C 4.363281 -1.660156 4.5 -2.15625 4.5 -2.78125 Z M 4.5 -2.78125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-7">
+<path style="stroke:none;" d="M 4.109375 0 C 3.992188 -0.269531 3.890625 -0.515625 3.796875 -0.734375 C 3.710938 -0.960938 3.628906 -1.1875 3.546875 -1.40625 C 3.460938 -1.632812 3.378906 -1.867188 3.296875 -2.109375 C 3.210938 -2.359375 3.113281 -2.640625 3 -2.953125 C 2.882812 -2.640625 2.78125 -2.359375 2.6875 -2.109375 C 2.601562 -1.867188 2.519531 -1.632812 2.4375 -1.40625 C 2.351562 -1.1875 2.265625 -0.960938 2.171875 -0.734375 C 2.085938 -0.515625 1.984375 -0.269531 1.859375 0 L 1.109375 0 C 0.890625 -0.976562 0.707031 -1.953125 0.5625 -2.921875 C 0.414062 -3.890625 0.304688 -4.769531 0.234375 -5.5625 L 1.15625 -5.5625 C 1.1875 -5.25 1.210938 -4.941406 1.234375 -4.640625 C 1.265625 -4.347656 1.300781 -4.035156 1.34375 -3.703125 C 1.382812 -3.378906 1.429688 -3.023438 1.484375 -2.640625 C 1.535156 -2.253906 1.59375 -1.820312 1.65625 -1.34375 C 1.78125 -1.664062 1.882812 -1.945312 1.96875 -2.1875 C 2.0625 -2.425781 2.144531 -2.648438 2.21875 -2.859375 C 2.289062 -3.078125 2.359375 -3.296875 2.421875 -3.515625 C 2.492188 -3.742188 2.570312 -4 2.65625 -4.28125 L 3.390625 -4.28125 C 3.472656 -4 3.546875 -3.742188 3.609375 -3.515625 C 3.671875 -3.296875 3.738281 -3.078125 3.8125 -2.859375 C 3.882812 -2.648438 3.957031 -2.425781 4.03125 -2.1875 C 4.113281 -1.945312 4.21875 -1.671875 4.34375 -1.359375 C 4.414062 -1.796875 4.476562 -2.203125 4.53125 -2.578125 C 4.59375 -2.953125 4.640625 -3.304688 4.671875 -3.640625 C 4.710938 -3.972656 4.75 -4.296875 4.78125 -4.609375 C 4.820312 -4.921875 4.851562 -5.238281 4.875 -5.5625 L 5.765625 -5.5625 C 5.734375 -5.164062 5.6875 -4.738281 5.625 -4.28125 C 5.570312 -3.820312 5.503906 -3.351562 5.421875 -2.875 C 5.335938 -2.394531 5.25 -1.910156 5.15625 -1.421875 C 5.0625 -0.929688 4.960938 -0.457031 4.859375 0 Z M 4.109375 0 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-8">
+<path style="stroke:none;" d="M 0.859375 -5.40625 C 1.210938 -5.5 1.566406 -5.566406 1.921875 -5.609375 C 2.273438 -5.660156 2.609375 -5.6875 2.921875 -5.6875 C 3.671875 -5.6875 4.234375 -5.492188 4.609375 -5.109375 C 4.992188 -4.722656 5.1875 -4.109375 5.1875 -3.265625 L 5.1875 0 L 4.203125 0 L 4.203125 -3.078125 C 4.203125 -3.441406 4.171875 -3.734375 4.109375 -3.953125 C 4.046875 -4.179688 3.953125 -4.359375 3.828125 -4.484375 C 3.710938 -4.609375 3.570312 -4.691406 3.40625 -4.734375 C 3.25 -4.785156 3.070312 -4.8125 2.875 -4.8125 C 2.71875 -4.8125 2.546875 -4.800781 2.359375 -4.78125 C 2.179688 -4.757812 2.007812 -4.734375 1.84375 -4.703125 L 1.84375 0 L 0.859375 0 Z M 0.859375 -5.40625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-9">
+<path style="stroke:none;" d="M 4.21875 -1.390625 C 4.21875 -1.585938 4.132812 -1.75 3.96875 -1.875 C 3.800781 -2.007812 3.59375 -2.125 3.34375 -2.21875 C 3.101562 -2.3125 2.835938 -2.40625 2.546875 -2.5 C 2.265625 -2.59375 2 -2.707031 1.75 -2.84375 C 1.507812 -2.976562 1.304688 -3.144531 1.140625 -3.34375 C 0.984375 -3.539062 0.90625 -3.800781 0.90625 -4.125 C 0.90625 -4.570312 1.082031 -4.945312 1.4375 -5.25 C 1.800781 -5.550781 2.375 -5.703125 3.15625 -5.703125 C 3.457031 -5.703125 3.769531 -5.675781 4.09375 -5.625 C 4.414062 -5.582031 4.695312 -5.523438 4.9375 -5.453125 L 4.75 -4.578125 C 4.6875 -4.609375 4.597656 -4.640625 4.484375 -4.671875 C 4.367188 -4.710938 4.238281 -4.742188 4.09375 -4.765625 C 3.957031 -4.796875 3.804688 -4.816406 3.640625 -4.828125 C 3.472656 -4.847656 3.316406 -4.859375 3.171875 -4.859375 C 2.304688 -4.859375 1.875 -4.625 1.875 -4.15625 C 1.875 -3.988281 1.953125 -3.84375 2.109375 -3.71875 C 2.273438 -3.601562 2.484375 -3.5 2.734375 -3.40625 C 2.984375 -3.3125 3.25 -3.210938 3.53125 -3.109375 C 3.820312 -3.015625 4.09375 -2.894531 4.34375 -2.75 C 4.59375 -2.601562 4.796875 -2.425781 4.953125 -2.21875 C 5.117188 -2.019531 5.203125 -1.765625 5.203125 -1.453125 C 5.203125 -0.953125 5.003906 -0.5625 4.609375 -0.28125 C 4.222656 -0.0078125 3.609375 0.125 2.765625 0.125 C 2.378906 0.125 2.023438 0.09375 1.703125 0.03125 C 1.378906 -0.03125 1.078125 -0.125 0.796875 -0.25 L 0.984375 -1.15625 C 1.265625 -1.019531 1.554688 -0.910156 1.859375 -0.828125 C 2.171875 -0.742188 2.503906 -0.703125 2.859375 -0.703125 C 3.765625 -0.703125 4.21875 -0.929688 4.21875 -1.390625 Z M 4.21875 -1.390625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-10">
+<path style="stroke:none;" d="M 0.59375 -2.765625 C 0.59375 -3.273438 0.671875 -3.710938 0.828125 -4.078125 C 0.984375 -4.441406 1.203125 -4.742188 1.484375 -4.984375 C 1.765625 -5.234375 2.09375 -5.414062 2.46875 -5.53125 C 2.84375 -5.644531 3.238281 -5.703125 3.65625 -5.703125 C 3.925781 -5.703125 4.195312 -5.679688 4.46875 -5.640625 C 4.738281 -5.609375 5.023438 -5.546875 5.328125 -5.453125 L 5.09375 -4.59375 C 4.832031 -4.6875 4.59375 -4.75 4.375 -4.78125 C 4.15625 -4.8125 3.929688 -4.828125 3.703125 -4.828125 C 3.421875 -4.828125 3.148438 -4.785156 2.890625 -4.703125 C 2.640625 -4.628906 2.414062 -4.507812 2.21875 -4.34375 C 2.03125 -4.1875 1.878906 -3.976562 1.765625 -3.71875 C 1.660156 -3.457031 1.609375 -3.140625 1.609375 -2.765625 C 1.609375 -2.421875 1.660156 -2.117188 1.765625 -1.859375 C 1.867188 -1.609375 2.015625 -1.398438 2.203125 -1.234375 C 2.390625 -1.078125 2.613281 -0.957031 2.875 -0.875 C 3.144531 -0.789062 3.4375 -0.75 3.75 -0.75 C 4.007812 -0.75 4.253906 -0.765625 4.484375 -0.796875 C 4.722656 -0.828125 4.984375 -0.890625 5.265625 -0.984375 L 5.40625 -0.15625 C 5.125 -0.0507812 4.835938 0.0195312 4.546875 0.0625 C 4.265625 0.101562 3.957031 0.125 3.625 0.125 C 3.175781 0.125 2.765625 0.0664062 2.390625 -0.046875 C 2.023438 -0.171875 1.707031 -0.351562 1.4375 -0.59375 C 1.164062 -0.832031 0.957031 -1.132812 0.8125 -1.5 C 0.664062 -1.863281 0.59375 -2.285156 0.59375 -2.765625 Z M 0.59375 -2.765625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-11">
+<path style="stroke:none;" d="M 3.0625 -0.703125 C 3.3125 -0.703125 3.53125 -0.707031 3.71875 -0.71875 C 3.914062 -0.738281 4.082031 -0.765625 4.21875 -0.796875 L 4.21875 -2.453125 C 4.082031 -2.492188 3.925781 -2.523438 3.75 -2.546875 C 3.570312 -2.566406 3.382812 -2.578125 3.1875 -2.578125 C 3 -2.578125 2.816406 -2.5625 2.640625 -2.53125 C 2.460938 -2.507812 2.304688 -2.460938 2.171875 -2.390625 C 2.035156 -2.316406 1.921875 -2.222656 1.828125 -2.109375 C 1.742188 -1.992188 1.703125 -1.847656 1.703125 -1.671875 C 1.703125 -1.304688 1.820312 -1.050781 2.0625 -0.90625 C 2.3125 -0.769531 2.644531 -0.703125 3.0625 -0.703125 Z M 2.96875 -5.703125 C 3.382812 -5.703125 3.734375 -5.648438 4.015625 -5.546875 C 4.296875 -5.441406 4.523438 -5.296875 4.703125 -5.109375 C 4.878906 -4.929688 5.003906 -4.707031 5.078125 -4.4375 C 5.148438 -4.175781 5.1875 -3.890625 5.1875 -3.578125 L 5.1875 -0.09375 C 4.957031 -0.0507812 4.648438 -0.00390625 4.265625 0.046875 C 3.890625 0.0976562 3.5 0.125 3.09375 0.125 C 2.789062 0.125 2.492188 0.0976562 2.203125 0.046875 C 1.921875 -0.00390625 1.664062 -0.09375 1.4375 -0.21875 C 1.21875 -0.351562 1.039062 -0.535156 0.90625 -0.765625 C 0.769531 -0.992188 0.703125 -1.289062 0.703125 -1.65625 C 0.703125 -1.976562 0.769531 -2.25 0.90625 -2.46875 C 1.039062 -2.6875 1.21875 -2.863281 1.4375 -3 C 1.664062 -3.132812 1.921875 -3.234375 2.203125 -3.296875 C 2.484375 -3.359375 2.769531 -3.390625 3.0625 -3.390625 C 3.445312 -3.390625 3.832031 -3.34375 4.21875 -3.25 L 4.21875 -3.53125 C 4.21875 -3.695312 4.195312 -3.859375 4.15625 -4.015625 C 4.125 -4.171875 4.054688 -4.3125 3.953125 -4.4375 C 3.847656 -4.5625 3.707031 -4.660156 3.53125 -4.734375 C 3.363281 -4.816406 3.144531 -4.859375 2.875 -4.859375 C 2.53125 -4.859375 2.226562 -4.832031 1.96875 -4.78125 C 1.71875 -4.738281 1.523438 -4.691406 1.390625 -4.640625 L 1.265625 -5.453125 C 1.398438 -5.523438 1.625 -5.582031 1.9375 -5.625 C 2.257812 -5.675781 2.601562 -5.703125 2.96875 -5.703125 Z M 2.96875 
-5.703125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-12">
+<path style="stroke:none;" d="M 4.0625 0.125 C 3.707031 0.125 3.410156 0.078125 3.171875 -0.015625 C 2.941406 -0.109375 2.757812 -0.25 2.625 -0.4375 C 2.488281 -0.632812 2.390625 -0.875 2.328125 -1.15625 C 2.273438 -1.4375 2.25 -1.765625 2.25 -2.140625 L 2.25 -7.421875 L 0.640625 -7.421875 L 0.640625 -8.25 L 3.234375 -8.25 L 3.234375 -2.140625 C 3.234375 -1.867188 3.25 -1.644531 3.28125 -1.46875 C 3.320312 -1.289062 3.378906 -1.144531 3.453125 -1.03125 C 3.535156 -0.925781 3.628906 -0.851562 3.734375 -0.8125 C 3.847656 -0.769531 3.984375 -0.75 4.140625 -0.75 C 4.367188 -0.75 4.582031 -0.773438 4.78125 -0.828125 C 4.988281 -0.890625 5.144531 -0.953125 5.25 -1.015625 L 5.40625 -0.1875 C 5.351562 -0.15625 5.28125 -0.117188 5.1875 -0.078125 C 5.101562 -0.046875 5 -0.015625 4.875 0.015625 C 4.757812 0.046875 4.628906 0.0703125 4.484375 0.09375 C 4.347656 0.113281 4.207031 0.125 4.0625 0.125 Z M 4.0625 0.125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-13">
+<path style="stroke:none;" d="M 2.515625 -6.4375 C 2.304688 -6.4375 2.125 -6.503906 1.96875 -6.640625 C 1.8125 -6.785156 1.734375 -6.984375 1.734375 -7.234375 C 1.734375 -7.484375 1.8125 -7.679688 1.96875 -7.828125 C 2.125 -7.972656 2.304688 -8.046875 2.515625 -8.046875 C 2.722656 -8.046875 2.898438 -7.972656 3.046875 -7.828125 C 3.203125 -7.679688 3.28125 -7.484375 3.28125 -7.234375 C 3.28125 -6.984375 3.203125 -6.785156 3.046875 -6.640625 C 2.898438 -6.503906 2.722656 -6.4375 2.515625 -6.4375 Z M 2.25 -4.734375 L 0.640625 -4.734375 L 0.640625 -5.5625 L 3.234375 -5.5625 L 3.234375 -2.140625 C 3.234375 -1.585938 3.3125 -1.21875 3.46875 -1.03125 C 3.625 -0.84375 3.851562 -0.75 4.15625 -0.75 C 4.382812 -0.75 4.597656 -0.773438 4.796875 -0.828125 C 4.992188 -0.890625 5.144531 -0.953125 5.25 -1.015625 L 5.40625 -0.1875 C 5.351562 -0.15625 5.28125 -0.117188 5.1875 -0.078125 C 5.101562 -0.046875 5.003906 -0.015625 4.890625 0.015625 C 4.773438 0.046875 4.644531 0.0703125 4.5 0.09375 C 4.363281 0.113281 4.21875 0.125 4.0625 0.125 C 3.71875 0.125 3.425781 0.078125 3.1875 -0.015625 C 2.957031 -0.109375 2.769531 -0.25 2.625 -0.4375 C 2.488281 -0.632812 2.390625 -0.875 2.328125 -1.15625 C 2.273438 -1.4375 2.25 -1.765625 2.25 -2.140625 Z M 2.25 -4.734375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-14">
+<path style="stroke:none;" d="M 4.15625 -0.515625 C 4.039062 -0.453125 3.863281 -0.382812 3.625 -0.3125 C 3.394531 -0.238281 3.128906 -0.203125 2.828125 -0.203125 C 2.503906 -0.203125 2.195312 -0.253906 1.90625 -0.359375 C 1.625 -0.472656 1.378906 -0.640625 1.171875 -0.859375 C 0.960938 -1.078125 0.796875 -1.351562 0.671875 -1.6875 C 0.546875 -2.03125 0.484375 -2.4375 0.484375 -2.90625 C 0.484375 -3.3125 0.539062 -3.679688 0.65625 -4.015625 C 0.769531 -4.359375 0.9375 -4.65625 1.15625 -4.90625 C 1.375 -5.15625 1.644531 -5.347656 1.96875 -5.484375 C 2.289062 -5.628906 2.65625 -5.703125 3.0625 -5.703125 C 3.539062 -5.703125 3.945312 -5.664062 4.28125 -5.59375 C 4.625 -5.53125 4.910156 -5.46875 5.140625 -5.40625 L 5.140625 -0.4375 C 5.140625 0.425781 4.921875 1.050781 4.484375 1.4375 C 4.054688 1.820312 3.398438 2.015625 2.515625 2.015625 C 2.160156 2.015625 1.835938 1.984375 1.546875 1.921875 C 1.253906 1.867188 0.992188 1.804688 0.765625 1.734375 L 0.953125 0.859375 C 1.160156 0.941406 1.394531 1.007812 1.65625 1.0625 C 1.925781 1.125 2.222656 1.15625 2.546875 1.15625 C 3.117188 1.15625 3.53125 1.035156 3.78125 0.796875 C 4.03125 0.566406 4.15625 0.191406 4.15625 -0.328125 Z M 4.15625 -4.6875 C 4.0625 -4.71875 3.925781 -4.75 3.75 -4.78125 C 3.582031 -4.8125 3.359375 -4.828125 3.078125 -4.828125 C 2.554688 -4.828125 2.160156 -4.648438 1.890625 -4.296875 C 1.628906 -3.941406 1.5 -3.472656 1.5 -2.890625 C 1.5 -2.566406 1.535156 -2.289062 1.609375 -2.0625 C 1.691406 -1.84375 1.796875 -1.65625 1.921875 -1.5 C 2.054688 -1.351562 2.207031 -1.242188 2.375 -1.171875 C 2.539062 -1.109375 2.722656 -1.078125 2.921875 -1.078125 C 3.160156 -1.078125 3.390625 -1.113281 3.609375 -1.1875 C 3.835938 -1.257812 4.019531 -1.34375 4.15625 -1.4375 Z M 4.15625 -4.6875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-15">
+<path style="stroke:none;" d="M 2.8125 -0.703125 C 3.3125 -0.703125 3.691406 -0.796875 3.953125 -0.984375 C 4.222656 -1.171875 4.359375 -1.457031 4.359375 -1.84375 C 4.359375 -2.082031 4.304688 -2.28125 4.203125 -2.4375 C 4.109375 -2.601562 3.984375 -2.75 3.828125 -2.875 C 3.671875 -3 3.488281 -3.109375 3.28125 -3.203125 C 3.082031 -3.296875 2.878906 -3.378906 2.671875 -3.453125 C 2.429688 -3.546875 2.203125 -3.648438 1.984375 -3.765625 C 1.765625 -3.890625 1.566406 -4.03125 1.390625 -4.1875 C 1.222656 -4.351562 1.085938 -4.546875 0.984375 -4.765625 C 0.890625 -4.984375 0.84375 -5.238281 0.84375 -5.53125 C 0.84375 -6.175781 1.046875 -6.679688 1.453125 -7.046875 C 1.859375 -7.410156 2.425781 -7.59375 3.15625 -7.59375 C 3.351562 -7.59375 3.550781 -7.578125 3.75 -7.546875 C 3.945312 -7.523438 4.128906 -7.492188 4.296875 -7.453125 C 4.460938 -7.410156 4.609375 -7.359375 4.734375 -7.296875 C 4.867188 -7.242188 4.976562 -7.191406 5.0625 -7.140625 L 4.75 -6.3125 C 4.59375 -6.40625 4.375 -6.5 4.09375 -6.59375 C 3.8125 -6.695312 3.5 -6.75 3.15625 -6.75 C 2.789062 -6.75 2.476562 -6.65625 2.21875 -6.46875 C 1.957031 -6.289062 1.828125 -6.019531 1.828125 -5.65625 C 1.828125 -5.457031 1.863281 -5.285156 1.9375 -5.140625 C 2.007812 -4.992188 2.113281 -4.863281 2.25 -4.75 C 2.382812 -4.644531 2.535156 -4.546875 2.703125 -4.453125 C 2.878906 -4.367188 3.070312 -4.285156 3.28125 -4.203125 C 3.59375 -4.078125 3.875 -3.945312 4.125 -3.8125 C 4.375 -3.6875 4.585938 -3.535156 4.765625 -3.359375 C 4.953125 -3.179688 5.09375 -2.972656 5.1875 -2.734375 C 5.289062 -2.492188 5.34375 -2.203125 5.34375 -1.859375 C 5.34375 -1.210938 5.125 -0.710938 4.6875 -0.359375 C 4.25 -0.015625 3.625 0.15625 2.8125 0.15625 C 2.539062 0.15625 2.289062 0.132812 2.0625 0.09375 C 1.832031 0.0625 1.625 0.0195312 1.4375 -0.03125 C 1.257812 -0.09375 1.101562 -0.148438 0.96875 -0.203125 C 0.832031 -0.253906 0.726562 -0.304688 0.65625 -0.359375 L 0.953125 -1.171875 C 1.117188 -1.085938 1.359375 -0.988281 1.671875 -0.875 C 1.984375 -0.757812 2.363281 -0.703125 2.8125 -0.703125 Z M 2.8125 -0.703125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-16">
+<path style="stroke:none;" d="M 3.109375 -5.703125 C 3.859375 -5.703125 4.4375 -5.46875 4.84375 -5 C 5.25 -4.53125 5.453125 -3.820312 5.453125 -2.875 L 5.453125 -2.515625 L 1.46875 -2.515625 C 1.507812 -1.941406 1.703125 -1.503906 2.046875 -1.203125 C 2.390625 -0.898438 2.867188 -0.75 3.484375 -0.75 C 3.835938 -0.75 4.132812 -0.773438 4.375 -0.828125 C 4.625 -0.890625 4.8125 -0.953125 4.9375 -1.015625 L 5.078125 -0.1875 C 4.953125 -0.113281 4.734375 -0.046875 4.421875 0.015625 C 4.109375 0.0859375 3.757812 0.125 3.375 0.125 C 2.894531 0.125 2.472656 0.0507812 2.109375 -0.09375 C 1.742188 -0.238281 1.441406 -0.4375 1.203125 -0.6875 C 0.960938 -0.945312 0.78125 -1.253906 0.65625 -1.609375 C 0.539062 -1.960938 0.484375 -2.347656 0.484375 -2.765625 C 0.484375 -3.265625 0.554688 -3.695312 0.703125 -4.0625 C 0.859375 -4.4375 1.0625 -4.742188 1.3125 -4.984375 C 1.5625 -5.222656 1.835938 -5.398438 2.140625 -5.515625 C 2.453125 -5.640625 2.773438 -5.703125 3.109375 -5.703125 Z M 4.453125 -3.328125 C 4.453125 -3.796875 4.328125 -4.164062 4.078125 -4.4375 C 3.828125 -4.71875 3.5 -4.859375 3.09375 -4.859375 C 2.863281 -4.859375 2.65625 -4.8125 2.46875 -4.71875 C 2.28125 -4.632812 2.117188 -4.519531 1.984375 -4.375 C 1.847656 -4.226562 1.738281 -4.0625 1.65625 -3.875 C 1.570312 -3.695312 1.519531 -3.515625 1.5 -3.328125 Z M 4.453125 -3.328125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-17">
+<path style="stroke:none;" d="M 4.15625 -4.390625 C 4.039062 -4.492188 3.875 -4.59375 3.65625 -4.6875 C 3.445312 -4.78125 3.222656 -4.828125 2.984375 -4.828125 C 2.722656 -4.828125 2.5 -4.773438 2.3125 -4.671875 C 2.125 -4.566406 1.96875 -4.421875 1.84375 -4.234375 C 1.726562 -4.054688 1.640625 -3.84375 1.578125 -3.59375 C 1.523438 -3.34375 1.5 -3.070312 1.5 -2.78125 C 1.5 -2.132812 1.648438 -1.632812 1.953125 -1.28125 C 2.253906 -0.925781 2.648438 -0.75 3.140625 -0.75 C 3.390625 -0.75 3.597656 -0.757812 3.765625 -0.78125 C 3.941406 -0.8125 4.070312 -0.835938 4.15625 -0.859375 Z M 4.15625 -8.140625 L 5.140625 -8.3125 L 5.140625 -0.15625 C 4.929688 -0.09375 4.65625 -0.03125 4.3125 0.03125 C 3.976562 0.09375 3.585938 0.125 3.140625 0.125 C 2.742188 0.125 2.378906 0.0546875 2.046875 -0.078125 C 1.722656 -0.210938 1.441406 -0.40625 1.203125 -0.65625 C 0.972656 -0.90625 0.796875 -1.207031 0.671875 -1.5625 C 0.546875 -1.925781 0.484375 -2.332031 0.484375 -2.78125 C 0.484375 -3.21875 0.535156 -3.613281 0.640625 -3.96875 C 0.742188 -4.320312 0.898438 -4.625 1.109375 -4.875 C 1.316406 -5.132812 1.566406 -5.335938 1.859375 -5.484375 C 2.148438 -5.628906 2.488281 -5.703125 2.875 -5.703125 C 3.164062 -5.703125 3.421875 -5.664062 3.640625 -5.59375 C 3.867188 -5.519531 4.039062 -5.441406 4.15625 -5.359375 Z M 4.15625 -8.140625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-18">
+<path style="stroke:none;" d="M 1.28125 0 L 1.28125 -5.265625 C 2.101562 -5.546875 2.925781 -5.6875 3.75 -5.6875 C 4.007812 -5.6875 4.253906 -5.675781 4.484375 -5.65625 C 4.722656 -5.632812 4.976562 -5.59375 5.25 -5.53125 L 5.078125 -4.65625 C 4.816406 -4.726562 4.585938 -4.773438 4.390625 -4.796875 C 4.203125 -4.816406 3.988281 -4.828125 3.75 -4.828125 C 3.269531 -4.828125 2.773438 -4.757812 2.265625 -4.625 L 2.265625 0 Z M 1.28125 0 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-19">
+<path style="stroke:none;" d="M 0.609375 1.046875 C 0.679688 1.085938 0.78125 1.117188 0.90625 1.140625 C 1.039062 1.160156 1.164062 1.171875 1.28125 1.171875 C 1.675781 1.171875 1.984375 1.082031 2.203125 0.90625 C 2.421875 0.738281 2.625 0.460938 2.8125 0.078125 C 2.363281 -0.773438 1.941406 -1.6875 1.546875 -2.65625 C 1.148438 -3.625 0.828125 -4.59375 0.578125 -5.5625 L 1.65625 -5.5625 C 1.738281 -5.25 1.832031 -4.90625 1.9375 -4.53125 C 2.039062 -4.15625 2.160156 -3.769531 2.296875 -3.375 C 2.441406 -2.988281 2.585938 -2.597656 2.734375 -2.203125 C 2.890625 -1.804688 3.054688 -1.425781 3.234375 -1.0625 C 3.367188 -1.4375 3.488281 -1.800781 3.59375 -2.15625 C 3.707031 -2.519531 3.8125 -2.882812 3.90625 -3.25 C 4.007812 -3.613281 4.109375 -3.984375 4.203125 -4.359375 C 4.296875 -4.742188 4.394531 -5.144531 4.5 -5.5625 L 5.53125 -5.5625 C 5.269531 -4.53125 4.984375 -3.523438 4.671875 -2.546875 C 4.359375 -1.566406 4.019531 -0.660156 3.65625 0.171875 C 3.519531 0.484375 3.375 0.753906 3.21875 0.984375 C 3.0625 1.222656 2.890625 1.414062 2.703125 1.5625 C 2.523438 1.71875 2.316406 1.832031 2.078125 1.90625 C 1.847656 1.976562 1.585938 2.015625 1.296875 2.015625 C 1.140625 2.015625 0.972656 1.992188 0.796875 1.953125 C 0.617188 1.910156 0.5 1.875 0.4375 1.84375 Z M 0.609375 1.046875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-20">
+<path style="stroke:none;" d="M 0.34375 -3.71875 C 0.34375 -4.382812 0.40625 -4.960938 0.53125 -5.453125 C 0.664062 -5.941406 0.847656 -6.34375 1.078125 -6.65625 C 1.304688 -6.96875 1.582031 -7.203125 1.90625 -7.359375 C 2.238281 -7.515625 2.601562 -7.59375 3 -7.59375 C 3.394531 -7.59375 3.753906 -7.515625 4.078125 -7.359375 C 4.410156 -7.203125 4.691406 -6.96875 4.921875 -6.65625 C 5.148438 -6.34375 5.328125 -5.941406 5.453125 -5.453125 C 5.585938 -4.960938 5.65625 -4.382812 5.65625 -3.71875 C 5.65625 -3.050781 5.585938 -2.472656 5.453125 -1.984375 C 5.328125 -1.503906 5.148438 -1.101562 4.921875 -0.78125 C 4.691406 -0.457031 4.410156 -0.21875 4.078125 -0.0625 C 3.753906 0.0820312 3.394531 0.15625 3 0.15625 C 2.601562 0.15625 2.238281 0.0820312 1.90625 -0.0625 C 1.582031 -0.21875 1.304688 -0.457031 1.078125 -0.78125 C 0.847656 -1.101562 0.664062 -1.503906 0.53125 -1.984375 C 0.40625 -2.472656 0.34375 -3.050781 0.34375 -3.71875 Z M 1.359375 -3.71875 C 1.359375 -2.738281 1.488281 -1.988281 1.75 -1.46875 C 2.007812 -0.957031 2.414062 -0.703125 2.96875 -0.703125 C 3.53125 -0.703125 3.953125 -0.957031 4.234375 -1.46875 C 4.515625 -1.988281 4.65625 -2.738281 4.65625 -3.71875 C 4.65625 -4.695312 4.515625 -5.445312 4.234375 -5.96875 C 3.953125 -6.488281 3.53125 -6.75 2.96875 -6.75 C 2.414062 -6.75 2.007812 -6.488281 1.75 -5.96875 C 1.488281 -5.445312 1.359375 -4.695312 1.359375 -3.71875 Z M 1.359375 -3.71875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-21">
+<path style="stroke:none;" d="M 5.140625 -0.15625 C 4.929688 -0.101562 4.644531 -0.046875 4.28125 0.015625 C 3.925781 0.0859375 3.507812 0.125 3.03125 0.125 C 2.613281 0.125 2.265625 0.0625 1.984375 -0.0625 C 1.703125 -0.1875 1.472656 -0.363281 1.296875 -0.59375 C 1.117188 -0.820312 0.992188 -1.09375 0.921875 -1.40625 C 0.847656 -1.71875 0.8125 -2.0625 0.8125 -2.4375 L 0.8125 -5.5625 L 1.796875 -5.5625 L 1.796875 -2.65625 C 1.796875 -1.96875 1.894531 -1.476562 2.09375 -1.1875 C 2.300781 -0.894531 2.644531 -0.75 3.125 -0.75 C 3.226562 -0.75 3.332031 -0.753906 3.4375 -0.765625 C 3.550781 -0.773438 3.65625 -0.785156 3.75 -0.796875 C 3.851562 -0.804688 3.9375 -0.816406 4 -0.828125 C 4.070312 -0.847656 4.125 -0.859375 4.15625 -0.859375 L 4.15625 -5.5625 L 5.140625 -5.5625 Z M 5.140625 -0.15625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-22">
+<path style="stroke:none;" d="M 2.921875 -5.5625 L 5.265625 -5.5625 L 5.265625 -4.734375 L 2.921875 -4.734375 L 2.921875 -2.140625 C 2.921875 -1.867188 2.9375 -1.644531 2.96875 -1.46875 C 3.007812 -1.289062 3.078125 -1.144531 3.171875 -1.03125 C 3.265625 -0.925781 3.382812 -0.851562 3.53125 -0.8125 C 3.675781 -0.769531 3.851562 -0.75 4.0625 -0.75 C 4.34375 -0.75 4.570312 -0.773438 4.75 -0.828125 C 4.925781 -0.878906 5.09375 -0.941406 5.25 -1.015625 L 5.40625 -0.1875 C 5.289062 -0.132812 5.109375 -0.0703125 4.859375 0 C 4.617188 0.0820312 4.316406 0.125 3.953125 0.125 C 3.546875 0.125 3.207031 0.078125 2.9375 -0.015625 C 2.675781 -0.109375 2.46875 -0.25 2.3125 -0.4375 C 2.164062 -0.632812 2.066406 -0.875 2.015625 -1.15625 C 1.960938 -1.4375 1.9375 -1.765625 1.9375 -2.140625 L 1.9375 -4.734375 L 0.75 -4.734375 L 0.75 -5.5625 L 1.9375 -5.5625 L 1.9375 -7.125 L 2.921875 -7.296875 Z M 2.921875 -5.5625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-23">
+<path style="stroke:none;" d="M 4.5 -2.765625 C 4.5 -3.421875 4.347656 -3.925781 4.046875 -4.28125 C 3.742188 -4.632812 3.347656 -4.8125 2.859375 -4.8125 C 2.585938 -4.8125 2.375 -4.796875 2.21875 -4.765625 C 2.0625 -4.742188 1.9375 -4.71875 1.84375 -4.6875 L 1.84375 -1.171875 C 1.957031 -1.066406 2.117188 -0.96875 2.328125 -0.875 C 2.546875 -0.789062 2.773438 -0.75 3.015625 -0.75 C 3.273438 -0.75 3.5 -0.800781 3.6875 -0.90625 C 3.875 -1.007812 4.023438 -1.148438 4.140625 -1.328125 C 4.265625 -1.515625 4.351562 -1.726562 4.40625 -1.96875 C 4.46875 -2.21875 4.5 -2.484375 4.5 -2.765625 Z M 5.515625 -2.765625 C 5.515625 -2.347656 5.460938 -1.957031 5.359375 -1.59375 C 5.253906 -1.238281 5.097656 -0.929688 4.890625 -0.671875 C 4.679688 -0.421875 4.429688 -0.222656 4.140625 -0.078125 C 3.847656 0.0546875 3.507812 0.125 3.125 0.125 C 2.832031 0.125 2.570312 0.0859375 2.34375 0.015625 C 2.125 -0.046875 1.957031 -0.125 1.84375 -0.21875 L 1.84375 1.984375 L 0.859375 1.984375 L 0.859375 -5.40625 C 1.066406 -5.46875 1.34375 -5.53125 1.6875 -5.59375 C 2.03125 -5.65625 2.421875 -5.6875 2.859375 -5.6875 C 3.253906 -5.6875 3.613281 -5.617188 3.9375 -5.484375 C 4.269531 -5.347656 4.550781 -5.15625 4.78125 -4.90625 C 5.019531 -4.65625 5.203125 -4.347656 5.328125 -3.984375 C 5.453125 -3.617188 5.515625 -3.210938 5.515625 -2.765625 Z M 5.515625 -2.765625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph1-24">
+<path style="stroke:none;" d="M 3.015625 -3.734375 L 4.15625 -7.421875 L 5.09375 -7.421875 C 5.25 -6.253906 5.359375 -5.054688 5.421875 -3.828125 C 5.492188 -2.609375 5.554688 -1.332031 5.609375 0 L 4.65625 0 C 4.644531 -0.425781 4.632812 -0.890625 4.625 -1.390625 C 4.625 -1.898438 4.613281 -2.421875 4.59375 -2.953125 C 4.582031 -3.492188 4.570312 -4.039062 4.5625 -4.59375 C 4.550781 -5.144531 4.539062 -5.679688 4.53125 -6.203125 L 3.4375 -2.8125 L 2.578125 -2.8125 L 1.46875 -6.203125 C 1.46875 -5.679688 1.457031 -5.144531 1.4375 -4.59375 C 1.425781 -4.050781 1.414062 -3.507812 1.40625 -2.96875 C 1.394531 -2.425781 1.382812 -1.898438 1.375 -1.390625 C 1.363281 -0.890625 1.351562 -0.425781 1.34375 0 L 0.390625 0 C 0.410156 -0.601562 0.4375 -1.222656 0.46875 -1.859375 C 0.5 -2.503906 0.535156 -3.144531 0.578125 -3.78125 C 0.617188 -4.414062 0.671875 -5.039062 0.734375 -5.65625 C 0.796875 -6.269531 0.863281 -6.859375 0.9375 -7.421875 L 1.84375 -7.421875 Z M 3.015625 -3.734375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-0">
+<path style="stroke:none;" d="M 0.640625 2.296875 L 0.640625 -9.171875 L 7.140625 -9.171875 L 7.140625 2.296875 Z M 1.375 1.578125 L 6.421875 1.578125 L 6.421875 -8.4375 L 1.375 -8.4375 Z M 1.375 1.578125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-1">
+<path style="stroke:none;" d="M 6.34375 -6.84375 L 6.34375 -5.75 C 6.007812 -5.925781 5.675781 -6.0625 5.34375 -6.15625 C 5.007812 -6.25 4.675781 -6.296875 4.34375 -6.296875 C 3.582031 -6.296875 2.992188 -6.050781 2.578125 -5.5625 C 2.160156 -5.082031 1.953125 -4.410156 1.953125 -3.546875 C 1.953125 -2.679688 2.160156 -2.007812 2.578125 -1.53125 C 2.992188 -1.050781 3.582031 -0.8125 4.34375 -0.8125 C 4.675781 -0.8125 5.007812 -0.851562 5.34375 -0.9375 C 5.675781 -1.03125 6.007812 -1.171875 6.34375 -1.359375 L 6.34375 -0.265625 C 6.019531 -0.117188 5.679688 -0.0078125 5.328125 0.0625 C 4.984375 0.144531 4.613281 0.1875 4.21875 0.1875 C 3.144531 0.1875 2.289062 -0.144531 1.65625 -0.8125 C 1.03125 -1.488281 0.71875 -2.398438 0.71875 -3.546875 C 0.71875 -4.703125 1.035156 -5.613281 1.671875 -6.28125 C 2.304688 -6.945312 3.179688 -7.28125 4.296875 -7.28125 C 4.648438 -7.28125 5 -7.242188 5.34375 -7.171875 C 5.6875 -7.097656 6.019531 -6.988281 6.34375 -6.84375 Z M 6.34375 -6.84375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-2">
+<path style="stroke:none;" d="M 5.34375 -6.015625 C 5.207031 -6.085938 5.0625 -6.140625 4.90625 -6.171875 C 4.757812 -6.210938 4.59375 -6.234375 4.40625 -6.234375 C 3.75 -6.234375 3.242188 -6.019531 2.890625 -5.59375 C 2.535156 -5.164062 2.359375 -4.550781 2.359375 -3.75 L 2.359375 0 L 1.1875 0 L 1.1875 -7.109375 L 2.359375 -7.109375 L 2.359375 -6 C 2.597656 -6.4375 2.914062 -6.757812 3.3125 -6.96875 C 3.707031 -7.175781 4.1875 -7.28125 4.75 -7.28125 C 4.832031 -7.28125 4.921875 -7.273438 5.015625 -7.265625 C 5.109375 -7.253906 5.21875 -7.238281 5.34375 -7.21875 Z M 5.34375 -6.015625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-3">
+<path style="stroke:none;" d="M 3.984375 -6.296875 C 3.359375 -6.296875 2.863281 -6.050781 2.5 -5.5625 C 2.132812 -5.070312 1.953125 -4.398438 1.953125 -3.546875 C 1.953125 -2.691406 2.128906 -2.019531 2.484375 -1.53125 C 2.847656 -1.050781 3.347656 -0.8125 3.984375 -0.8125 C 4.597656 -0.8125 5.085938 -1.054688 5.453125 -1.546875 C 5.816406 -2.035156 6 -2.703125 6 -3.546875 C 6 -4.390625 5.816406 -5.054688 5.453125 -5.546875 C 5.085938 -6.046875 4.597656 -6.296875 3.984375 -6.296875 Z M 3.984375 -7.28125 C 4.992188 -7.28125 5.789062 -6.945312 6.375 -6.28125 C 6.957031 -5.625 7.25 -4.710938 7.25 -3.546875 C 7.25 -2.378906 6.957031 -1.460938 6.375 -0.796875 C 5.789062 -0.140625 4.992188 0.1875 3.984375 0.1875 C 2.960938 0.1875 2.160156 -0.140625 1.578125 -0.796875 C 1.003906 -1.460938 0.71875 -2.378906 0.71875 -3.546875 C 0.71875 -4.710938 1.003906 -5.625 1.578125 -6.28125 C 2.160156 -6.945312 2.960938 -7.28125 3.984375 -7.28125 Z M 3.984375 -7.28125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-4">
+<path style="stroke:none;" d="M 2.359375 -1.0625 L 2.359375 2.703125 L 1.1875 2.703125 L 1.1875 -7.109375 L 2.359375 -7.109375 L 2.359375 -6.03125 C 2.597656 -6.457031 2.90625 -6.769531 3.28125 -6.96875 C 3.65625 -7.175781 4.101562 -7.28125 4.625 -7.28125 C 5.488281 -7.28125 6.191406 -6.9375 6.734375 -6.25 C 7.273438 -5.5625 7.546875 -4.660156 7.546875 -3.546875 C 7.546875 -2.429688 7.273438 -1.53125 6.734375 -0.84375 C 6.191406 -0.15625 5.488281 0.1875 4.625 0.1875 C 4.101562 0.1875 3.65625 0.0820312 3.28125 -0.125 C 2.90625 -0.332031 2.597656 -0.644531 2.359375 -1.0625 Z M 6.328125 -3.546875 C 6.328125 -4.410156 6.148438 -5.082031 5.796875 -5.5625 C 5.441406 -6.050781 4.957031 -6.296875 4.34375 -6.296875 C 3.726562 -6.296875 3.242188 -6.050781 2.890625 -5.5625 C 2.535156 -5.082031 2.359375 -4.410156 2.359375 -3.546875 C 2.359375 -2.691406 2.535156 -2.019531 2.890625 -1.53125 C 3.242188 -1.039062 3.726562 -0.796875 4.34375 -0.796875 C 4.957031 -0.796875 5.441406 -1.039062 5.796875 -1.53125 C 6.148438 -2.019531 6.328125 -2.691406 6.328125 -3.546875 Z M 6.328125 -3.546875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-5">
+<path style="stroke:none;" d="M 5.75 -6.90625 L 5.75 -5.796875 C 5.425781 -5.960938 5.085938 -6.085938 4.734375 -6.171875 C 4.378906 -6.253906 4.007812 -6.296875 3.625 -6.296875 C 3.039062 -6.296875 2.601562 -6.207031 2.3125 -6.03125 C 2.03125 -5.851562 1.890625 -5.585938 1.890625 -5.234375 C 1.890625 -4.960938 1.988281 -4.75 2.1875 -4.59375 C 2.394531 -4.445312 2.816406 -4.300781 3.453125 -4.15625 L 3.84375 -4.0625 C 4.675781 -3.882812 5.265625 -3.632812 5.609375 -3.3125 C 5.960938 -2.988281 6.140625 -2.539062 6.140625 -1.96875 C 6.140625 -1.300781 5.878906 -0.773438 5.359375 -0.390625 C 4.835938 -0.00390625 4.117188 0.1875 3.203125 0.1875 C 2.816406 0.1875 2.414062 0.148438 2 0.078125 C 1.59375 0.00390625 1.160156 -0.109375 0.703125 -0.265625 L 0.703125 -1.46875 C 1.140625 -1.238281 1.566406 -1.066406 1.984375 -0.953125 C 2.398438 -0.847656 2.8125 -0.796875 3.21875 -0.796875 C 3.769531 -0.796875 4.191406 -0.890625 4.484375 -1.078125 C 4.785156 -1.265625 4.9375 -1.53125 4.9375 -1.875 C 4.9375 -2.1875 4.828125 -2.425781 4.609375 -2.59375 C 4.398438 -2.769531 3.9375 -2.9375 3.21875 -3.09375 L 2.8125 -3.1875 C 2.082031 -3.34375 1.554688 -3.578125 1.234375 -3.890625 C 0.910156 -4.203125 0.75 -4.632812 0.75 -5.1875 C 0.75 -5.851562 0.984375 -6.367188 1.453125 -6.734375 C 1.929688 -7.097656 2.609375 -7.28125 3.484375 -7.28125 C 3.910156 -7.28125 4.316406 -7.25 4.703125 -7.1875 C 5.085938 -7.125 5.4375 -7.03125 5.75 -6.90625 Z M 5.75 -6.90625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-6">
+<path style="stroke:none;" d="M 4.453125 -3.578125 C 3.515625 -3.578125 2.863281 -3.46875 2.5 -3.25 C 2.132812 -3.03125 1.953125 -2.660156 1.953125 -2.140625 C 1.953125 -1.734375 2.085938 -1.40625 2.359375 -1.15625 C 2.628906 -0.914062 3 -0.796875 3.46875 -0.796875 C 4.113281 -0.796875 4.632812 -1.023438 5.03125 -1.484375 C 5.425781 -1.941406 5.625 -2.550781 5.625 -3.3125 L 5.625 -3.578125 Z M 6.78125 -4.0625 L 6.78125 0 L 5.625 0 L 5.625 -1.078125 C 5.351562 -0.648438 5.019531 -0.332031 4.625 -0.125 C 4.226562 0.0820312 3.738281 0.1875 3.15625 0.1875 C 2.425781 0.1875 1.847656 -0.015625 1.421875 -0.421875 C 0.992188 -0.835938 0.78125 -1.382812 0.78125 -2.0625 C 0.78125 -2.863281 1.046875 -3.46875 1.578125 -3.875 C 2.117188 -4.28125 2.921875 -4.484375 3.984375 -4.484375 L 5.625 -4.484375 L 5.625 -4.609375 C 5.625 -5.140625 5.445312 -5.550781 5.09375 -5.84375 C 4.738281 -6.144531 4.238281 -6.296875 3.59375 -6.296875 C 3.1875 -6.296875 2.789062 -6.242188 2.40625 -6.140625 C 2.019531 -6.046875 1.648438 -5.898438 1.296875 -5.703125 L 1.296875 -6.78125 C 1.722656 -6.945312 2.140625 -7.070312 2.546875 -7.15625 C 2.953125 -7.238281 3.34375 -7.28125 3.71875 -7.28125 C 4.75 -7.28125 5.515625 -7.015625 6.015625 -6.484375 C 6.523438 -5.953125 6.78125 -5.144531 6.78125 -4.0625 Z M 6.78125 -4.0625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-7">
+<path style="stroke:none;" d="M 1.21875 -9.875 L 2.390625 -9.875 L 2.390625 0 L 1.21875 0 Z M 1.21875 -9.875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-8">
+<path style="stroke:none;" d="M 7.3125 -3.84375 L 7.3125 -3.28125 L 1.9375 -3.28125 C 1.988281 -2.46875 2.226562 -1.851562 2.65625 -1.4375 C 3.09375 -1.019531 3.695312 -0.8125 4.46875 -0.8125 C 4.914062 -0.8125 5.347656 -0.863281 5.765625 -0.96875 C 6.191406 -1.082031 6.613281 -1.25 7.03125 -1.46875 L 7.03125 -0.359375 C 6.613281 -0.179688 6.179688 -0.046875 5.734375 0.046875 C 5.296875 0.140625 4.851562 0.1875 4.40625 0.1875 C 3.269531 0.1875 2.367188 -0.140625 1.703125 -0.796875 C 1.046875 -1.460938 0.71875 -2.359375 0.71875 -3.484375 C 0.71875 -4.648438 1.03125 -5.570312 1.65625 -6.25 C 2.289062 -6.9375 3.140625 -7.28125 4.203125 -7.28125 C 5.160156 -7.28125 5.914062 -6.972656 6.46875 -6.359375 C 7.03125 -5.742188 7.3125 -4.90625 7.3125 -3.84375 Z M 6.140625 -4.1875 C 6.128906 -4.820312 5.945312 -5.332031 5.59375 -5.71875 C 5.25 -6.101562 4.789062 -6.296875 4.21875 -6.296875 C 3.5625 -6.296875 3.035156 -6.109375 2.640625 -5.734375 C 2.253906 -5.367188 2.03125 -4.851562 1.96875 -4.1875 Z M 6.140625 -4.1875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-9">
+<path style="stroke:none;" d="M 6.328125 -3.546875 C 6.328125 -4.410156 6.148438 -5.082031 5.796875 -5.5625 C 5.441406 -6.050781 4.957031 -6.296875 4.34375 -6.296875 C 3.726562 -6.296875 3.242188 -6.050781 2.890625 -5.5625 C 2.535156 -5.082031 2.359375 -4.410156 2.359375 -3.546875 C 2.359375 -2.691406 2.535156 -2.019531 2.890625 -1.53125 C 3.242188 -1.039062 3.726562 -0.796875 4.34375 -0.796875 C 4.957031 -0.796875 5.441406 -1.039062 5.796875 -1.53125 C 6.148438 -2.019531 6.328125 -2.691406 6.328125 -3.546875 Z M 2.359375 -6.03125 C 2.597656 -6.457031 2.90625 -6.769531 3.28125 -6.96875 C 3.65625 -7.175781 4.101562 -7.28125 4.625 -7.28125 C 5.488281 -7.28125 6.191406 -6.9375 6.734375 -6.25 C 7.273438 -5.5625 7.546875 -4.660156 7.546875 -3.546875 C 7.546875 -2.429688 7.273438 -1.53125 6.734375 -0.84375 C 6.191406 -0.15625 5.488281 0.1875 4.625 0.1875 C 4.101562 0.1875 3.65625 0.0820312 3.28125 -0.125 C 2.90625 -0.332031 2.597656 -0.644531 2.359375 -1.0625 L 2.359375 0 L 1.1875 0 L 1.1875 -9.875 L 2.359375 -9.875 Z M 2.359375 -6.03125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-10">
+<path style="stroke:none;" d="M 1.21875 -7.109375 L 2.390625 -7.109375 L 2.390625 0 L 1.21875 0 Z M 1.21875 -9.875 L 2.390625 -9.875 L 2.390625 -8.390625 L 1.21875 -8.390625 Z M 1.21875 -9.875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-11">
+<path style="stroke:none;" d="M 7.140625 -4.296875 L 7.140625 0 L 5.96875 0 L 5.96875 -4.25 C 5.96875 -4.925781 5.835938 -5.429688 5.578125 -5.765625 C 5.316406 -6.097656 4.921875 -6.265625 4.390625 -6.265625 C 3.765625 -6.265625 3.269531 -6.0625 2.90625 -5.65625 C 2.539062 -5.257812 2.359375 -4.710938 2.359375 -4.015625 L 2.359375 0 L 1.1875 0 L 1.1875 -7.109375 L 2.359375 -7.109375 L 2.359375 -6 C 2.640625 -6.425781 2.96875 -6.742188 3.34375 -6.953125 C 3.71875 -7.171875 4.15625 -7.28125 4.65625 -7.28125 C 5.46875 -7.28125 6.082031 -7.023438 6.5 -6.515625 C 6.925781 -6.015625 7.140625 -5.273438 7.140625 -4.296875 Z M 7.140625 -4.296875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-12">
+<path style="stroke:none;" d="M 5.90625 -3.640625 C 5.90625 -4.484375 5.726562 -5.132812 5.375 -5.59375 C 5.03125 -6.0625 4.539062 -6.296875 3.90625 -6.296875 C 3.28125 -6.296875 2.789062 -6.0625 2.4375 -5.59375 C 2.09375 -5.132812 1.921875 -4.484375 1.921875 -3.640625 C 1.921875 -2.796875 2.09375 -2.140625 2.4375 -1.671875 C 2.789062 -1.210938 3.28125 -0.984375 3.90625 -0.984375 C 4.539062 -0.984375 5.03125 -1.210938 5.375 -1.671875 C 5.726562 -2.140625 5.90625 -2.796875 5.90625 -3.640625 Z M 7.078125 -0.875 C 7.078125 0.332031 6.804688 1.226562 6.265625 1.8125 C 5.722656 2.40625 4.898438 2.703125 3.796875 2.703125 C 3.390625 2.703125 3.003906 2.671875 2.640625 2.609375 C 2.273438 2.546875 1.921875 2.453125 1.578125 2.328125 L 1.578125 1.1875 C 1.921875 1.375 2.257812 1.507812 2.59375 1.59375 C 2.925781 1.6875 3.265625 1.734375 3.609375 1.734375 C 4.378906 1.734375 4.953125 1.535156 5.328125 1.140625 C 5.710938 0.742188 5.90625 0.140625 5.90625 -0.671875 L 5.90625 -1.25 C 5.664062 -0.832031 5.351562 -0.519531 4.96875 -0.3125 C 4.59375 -0.101562 4.144531 0 3.625 0 C 2.75 0 2.046875 -0.332031 1.515625 -1 C 0.984375 -1.664062 0.71875 -2.546875 0.71875 -3.640625 C 0.71875 -4.734375 0.984375 -5.613281 1.515625 -6.28125 C 2.046875 -6.945312 2.75 -7.28125 3.625 -7.28125 C 4.144531 -7.28125 4.59375 -7.175781 4.96875 -6.96875 C 5.351562 -6.757812 5.664062 -6.445312 5.90625 -6.03125 L 5.90625 -7.109375 L 7.078125 -7.109375 Z M 7.078125 -0.875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-13">
+<path style="stroke:none;" d="M 5.125 -8.609375 C 4.1875 -8.609375 3.441406 -8.257812 2.890625 -7.5625 C 2.347656 -6.875 2.078125 -5.929688 2.078125 -4.734375 C 2.078125 -3.535156 2.347656 -2.585938 2.890625 -1.890625 C 3.441406 -1.203125 4.1875 -0.859375 5.125 -0.859375 C 6.050781 -0.859375 6.785156 -1.203125 7.328125 -1.890625 C 7.878906 -2.585938 8.15625 -3.535156 8.15625 -4.734375 C 8.15625 -5.929688 7.878906 -6.875 7.328125 -7.5625 C 6.785156 -8.257812 6.050781 -8.609375 5.125 -8.609375 Z M 5.125 -9.65625 C 6.445312 -9.65625 7.503906 -9.207031 8.296875 -8.3125 C 9.097656 -7.414062 9.5 -6.222656 9.5 -4.734375 C 9.5 -3.234375 9.097656 -2.035156 8.296875 -1.140625 C 7.503906 -0.253906 6.445312 0.1875 5.125 0.1875 C 3.789062 0.1875 2.722656 -0.253906 1.921875 -1.140625 C 1.128906 -2.035156 0.734375 -3.234375 0.734375 -4.734375 C 0.734375 -6.222656 1.128906 -7.414062 1.921875 -8.3125 C 2.722656 -9.207031 3.789062 -9.65625 5.125 -9.65625 Z M 5.125 -9.65625 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-14">
+<path style="stroke:none;" d="M 1.109375 -2.8125 L 1.109375 -7.109375 L 2.265625 -7.109375 L 2.265625 -2.84375 C 2.265625 -2.175781 2.394531 -1.671875 2.65625 -1.328125 C 2.925781 -0.992188 3.320312 -0.828125 3.84375 -0.828125 C 4.476562 -0.828125 4.976562 -1.023438 5.34375 -1.421875 C 5.707031 -1.828125 5.890625 -2.378906 5.890625 -3.078125 L 5.890625 -7.109375 L 7.0625 -7.109375 L 7.0625 0 L 5.890625 0 L 5.890625 -1.09375 C 5.609375 -0.65625 5.28125 -0.332031 4.90625 -0.125 C 4.53125 0.0820312 4.09375 0.1875 3.59375 0.1875 C 2.78125 0.1875 2.160156 -0.0664062 1.734375 -0.578125 C 1.316406 -1.085938 1.109375 -1.832031 1.109375 -2.8125 Z M 4.046875 -7.28125 Z M 4.046875 -7.28125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-15">
+<path style="stroke:none;" d="M 2.375 -9.125 L 2.375 -7.109375 L 4.78125 -7.109375 L 4.78125 -6.203125 L 2.375 -6.203125 L 2.375 -2.34375 C 2.375 -1.757812 2.453125 -1.382812 2.609375 -1.21875 C 2.773438 -1.0625 3.101562 -0.984375 3.59375 -0.984375 L 4.78125 -0.984375 L 4.78125 0 L 3.59375 0 C 2.6875 0 2.0625 -0.164062 1.71875 -0.5 C 1.375 -0.84375 1.203125 -1.457031 1.203125 -2.34375 L 1.203125 -6.203125 L 0.34375 -6.203125 L 0.34375 -7.109375 L 1.203125 -7.109375 L 1.203125 -9.125 Z M 2.375 -9.125 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-16">
+<path style="stroke:none;" d=""/>
+</symbol>
+<symbol overflow="visible" id="glyph2-17">
+<path style="stroke:none;" d="M 1.28125 -9.484375 L 6.71875 -9.484375 L 6.71875 -8.390625 L 2.5625 -8.390625 L 2.5625 -5.609375 L 6.3125 -5.609375 L 6.3125 -4.53125 L 2.5625 -4.53125 L 2.5625 0 L 1.28125 0 Z M 1.28125 -9.484375 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-18">
+<path style="stroke:none;" d="M 6.765625 -5.75 C 7.054688 -6.269531 7.40625 -6.65625 7.8125 -6.90625 C 8.21875 -7.15625 8.695312 -7.28125 9.25 -7.28125 C 9.988281 -7.28125 10.554688 -7.019531 10.953125 -6.5 C 11.359375 -5.976562 11.5625 -5.242188 11.5625 -4.296875 L 11.5625 0 L 10.390625 0 L 10.390625 -4.25 C 10.390625 -4.9375 10.265625 -5.441406 10.015625 -5.765625 C 9.773438 -6.097656 9.410156 -6.265625 8.921875 -6.265625 C 8.316406 -6.265625 7.835938 -6.0625 7.484375 -5.65625 C 7.128906 -5.257812 6.953125 -4.710938 6.953125 -4.015625 L 6.953125 0 L 5.78125 0 L 5.78125 -4.25 C 5.78125 -4.9375 5.660156 -5.441406 5.421875 -5.765625 C 5.179688 -6.097656 4.804688 -6.265625 4.296875 -6.265625 C 3.703125 -6.265625 3.226562 -6.0625 2.875 -5.65625 C 2.53125 -5.25 2.359375 -4.703125 2.359375 -4.015625 L 2.359375 0 L 1.1875 0 L 1.1875 -7.109375 L 2.359375 -7.109375 L 2.359375 -6 C 2.617188 -6.4375 2.9375 -6.757812 3.3125 -6.96875 C 3.6875 -7.175781 4.128906 -7.28125 4.640625 -7.28125 C 5.160156 -7.28125 5.597656 -7.148438 5.953125 -6.890625 C 6.316406 -6.628906 6.585938 -6.25 6.765625 -5.75 Z M 6.765625 -5.75 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-19">
+<path style="stroke:none;" d="M 6.953125 -9.171875 L 6.953125 -7.921875 C 6.472656 -8.148438 6.015625 -8.320312 5.578125 -8.4375 C 5.148438 -8.550781 4.734375 -8.609375 4.328125 -8.609375 C 3.628906 -8.609375 3.085938 -8.472656 2.703125 -8.203125 C 2.328125 -7.929688 2.140625 -7.546875 2.140625 -7.046875 C 2.140625 -6.628906 2.265625 -6.3125 2.515625 -6.09375 C 2.773438 -5.882812 3.253906 -5.710938 3.953125 -5.578125 L 4.734375 -5.421875 C 5.679688 -5.234375 6.382812 -4.910156 6.84375 -4.453125 C 7.300781 -3.992188 7.53125 -3.378906 7.53125 -2.609375 C 7.53125 -1.691406 7.222656 -0.992188 6.609375 -0.515625 C 5.992188 -0.046875 5.085938 0.1875 3.890625 0.1875 C 3.441406 0.1875 2.960938 0.132812 2.453125 0.03125 C 1.953125 -0.0703125 1.429688 -0.222656 0.890625 -0.421875 L 0.890625 -1.734375 C 1.410156 -1.441406 1.921875 -1.222656 2.421875 -1.078125 C 2.921875 -0.929688 3.410156 -0.859375 3.890625 -0.859375 C 4.628906 -0.859375 5.195312 -1 5.59375 -1.28125 C 5.988281 -1.570312 6.1875 -1.984375 6.1875 -2.515625 C 6.1875 -2.984375 6.039062 -3.347656 5.75 -3.609375 C 5.46875 -3.867188 5.003906 -4.066406 4.359375 -4.203125 L 3.578125 -4.359375 C 2.617188 -4.546875 1.925781 -4.84375 1.5 -5.25 C 1.070312 -5.65625 0.859375 -6.21875 0.859375 -6.9375 C 0.859375 -7.78125 1.148438 -8.441406 1.734375 -8.921875 C 2.328125 -9.410156 3.144531 -9.65625 4.1875 -9.65625 C 4.625 -9.65625 5.070312 -9.613281 5.53125 -9.53125 C 6 -9.445312 6.472656 -9.328125 6.953125 -9.171875 Z M 6.953125 -9.171875 "/>
+</symbol>
+<symbol overflow="visible" id="glyph2-20">
+<path style="stroke:none;" d="M 4.1875 0.65625 C 3.851562 1.507812 3.53125 2.0625 3.21875 2.3125 C 2.90625 2.570312 2.488281 2.703125 1.96875 2.703125 L 1.03125 2.703125 L 1.03125 1.734375 L 1.71875 1.734375 C 2.039062 1.734375 2.289062 1.65625 2.46875 1.5 C 2.644531 1.34375 2.835938 0.984375 3.046875 0.421875 L 3.265625 -0.109375 L 0.390625 -7.109375 L 1.625 -7.109375 L 3.84375 -1.546875 L 6.0625 -7.109375 L 7.3125 -7.109375 Z M 4.1875 0.65625 "/>
+</symbol>
+</g>
+</defs>
+<g id="surface268880">
+<rect x="0" y="0" width="774" height="152" style="fill:rgb(100%,100%,100%);fill-opacity:1;stroke:none;"/>
+<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 21.75297 10.408118 L 26.433829 10.408118 L 26.433829 12.281165 L 21.75297 12.281165 Z M 21.75297 10.408118 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 29.079728 10.51222 L 32.829728 10.51222 L 32.829728 12.149915 L 29.079728 12.149915 Z M 29.079728 10.51222 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph0-1" x="20.171875" y="57.705621"/>
+ <use xlink:href="#glyph0-2" x="30.171875" y="57.705621"/>
+ <use xlink:href="#glyph0-3" x="40.171875" y="57.705621"/>
+ <use xlink:href="#glyph0-4" x="50.171875" y="57.705621"/>
+ <use xlink:href="#glyph0-5" x="60.171875" y="57.705621"/>
+ <use xlink:href="#glyph0-6" x="70.171875" y="57.705621"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph0-7" x="174.203125" y="60.053277"/>
+ <use xlink:href="#glyph0-8" x="184.203125" y="60.053277"/>
+</g>
+<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 40.925236 10.544446 L 44.675236 10.544446 L 44.675236 12.090345 L 40.925236 12.090345 Z M 40.925236 10.544446 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 34.883439 10.536634 L 38.633439 10.536634 L 38.633439 12.120032 L 34.883439 12.120032 Z M 34.883439 10.536634 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 47.084806 10.484876 L 52.045743 10.484876 L 52.045743 12.130774 L 47.084806 12.130774 Z M 47.084806 10.484876 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 53.980118 10.376868 L 59.866642 10.376868 L 59.866642 12.279603 L 53.980118 12.279603 Z M 53.980118 10.376868 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 54.048478 13.825501 L 59.868009 13.825501 L 59.868009 15.490345 L 54.048478 15.490345 Z M 54.048478 13.825501 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 26.481876 11.338001 L 28.593009 11.332337 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 28.968009 11.33136 L 28.468595 11.582728 L 28.593009 11.332337 L 28.467228 11.082728 Z M 28.968009 11.33136 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 32.876798 11.329798 L 34.396525 11.328626 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 34.771525 11.328431 L 34.27172 11.578821 L 34.396525 11.328626 L 34.271329 11.078821 Z M 34.771525 11.328431 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 38.633439 11.328431 L 40.438517 11.319642 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 40.813517 11.317884 L 40.314689 11.570228 L 40.438517 11.319642 L 40.312345 11.070423 Z M 40.813517 11.317884 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 44.675236 11.317298 L 46.597892 11.309876 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 46.972892 11.308313 L 46.473868 11.560267 L 46.597892 11.309876 L 46.471915 11.060267 Z M 46.972892 11.308313 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 52.045743 11.307923 L 53.4934 11.323157 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 53.8684 11.327063 L 53.365861 11.57179 L 53.4934 11.323157 L 53.370939 11.07179 Z M 53.8684 11.327063 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph0-9" x="286.757813" y="60.611871"/>
+ <use xlink:href="#glyph0-10" x="296.757813" y="60.611871"/>
+ <use xlink:href="#glyph0-1" x="306.757813" y="60.611871"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph0-11" x="405.660156" y="59.904839"/>
+ <use xlink:href="#glyph0-10" x="415.660156" y="59.904839"/>
+ <use xlink:href="#glyph0-12" x="425.660156" y="59.904839"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph1-1" x="511.308594" y="58.064616"/>
+ <use xlink:href="#glyph1-2" x="517.308757" y="58.064616"/>
+ <use xlink:href="#glyph1-3" x="523.308919" y="58.064616"/>
+ <use xlink:href="#glyph1-4" x="529.309082" y="58.064616"/>
+ <use xlink:href="#glyph1-5" x="535.309245" y="58.064616"/>
+ <use xlink:href="#glyph1-6" x="541.309408" y="58.064616"/>
+ <use xlink:href="#glyph1-7" x="547.30957" y="58.064616"/>
+ <use xlink:href="#glyph1-8" x="553.309733" y="58.064616"/>
+ <use xlink:href="#glyph1-9" x="559.309896" y="58.064616"/>
+ <use xlink:href="#glyph1-10" x="565.310059" y="58.064616"/>
+ <use xlink:href="#glyph1-11" x="571.310221" y="58.064616"/>
+ <use xlink:href="#glyph1-12" x="577.310384" y="58.064616"/>
+ <use xlink:href="#glyph1-13" x="583.310547" y="58.064616"/>
+ <use xlink:href="#glyph1-8" x="589.31071" y="58.064616"/>
+ <use xlink:href="#glyph1-14" x="595.310872" y="58.064616"/>
+</g>
+<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 45.671915 11.342298 L 45.655704 11.342298 L 45.655704 14.657923 L 53.561759 14.657923 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 53.936759 14.657923 L 53.436759 14.907923 L 53.561759 14.657923 L 53.436759 14.407923 Z M 53.936759 14.657923 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph1-15" x="657.078125" y="57.724772"/>
+ <use xlink:href="#glyph1-16" x="663.078288" y="57.724772"/>
+ <use xlink:href="#glyph1-10" x="669.078451" y="57.724772"/>
+ <use xlink:href="#glyph1-6" x="675.078613" y="57.724772"/>
+ <use xlink:href="#glyph1-8" x="681.078776" y="57.724772"/>
+ <use xlink:href="#glyph1-17" x="687.078939" y="57.724772"/>
+ <use xlink:href="#glyph1-11" x="693.079102" y="57.724772"/>
+ <use xlink:href="#glyph1-18" x="699.079264" y="57.724772"/>
+ <use xlink:href="#glyph1-19" x="705.079427" y="57.724772"/>
+ <use xlink:href="#glyph1-4" x="711.07959" y="57.724772"/>
+ <use xlink:href="#glyph1-20" x="717.079753" y="57.724772"/>
+ <use xlink:href="#glyph1-21" x="723.079915" y="57.724772"/>
+ <use xlink:href="#glyph1-22" x="729.080078" y="57.724772"/>
+ <use xlink:href="#glyph1-23" x="735.080241" y="57.724772"/>
+ <use xlink:href="#glyph1-21" x="741.080404" y="57.724772"/>
+ <use xlink:href="#glyph1-22" x="747.080566" y="57.724772"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph1-24" x="673.335938" y="124.170085"/>
+ <use xlink:href="#glyph1-11" x="679.3361" y="124.170085"/>
+ <use xlink:href="#glyph1-13" x="685.336263" y="124.170085"/>
+ <use xlink:href="#glyph1-8" x="691.336426" y="124.170085"/>
+ <use xlink:href="#glyph1-4" x="697.336589" y="124.170085"/>
+ <use xlink:href="#glyph1-20" x="703.336751" y="124.170085"/>
+ <use xlink:href="#glyph1-21" x="709.336914" y="124.170085"/>
+ <use xlink:href="#glyph1-22" x="715.337077" y="124.170085"/>
+ <use xlink:href="#glyph1-23" x="721.33724" y="124.170085"/>
+ <use xlink:href="#glyph1-21" x="727.337402" y="124.170085"/>
+ <use xlink:href="#glyph1-22" x="733.337565" y="124.170085"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph2-1" x="168.71875" y="31.959093"/>
+ <use xlink:href="#glyph2-2" x="175.866102" y="31.959093"/>
+ <use xlink:href="#glyph2-3" x="180.92551" y="31.959093"/>
+ <use xlink:href="#glyph2-4" x="188.879069" y="31.959093"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph2-5" x="288.109375" y="31.681749"/>
+ <use xlink:href="#glyph2-1" x="294.882378" y="31.681749"/>
+ <use xlink:href="#glyph2-6" x="302.029731" y="31.681749"/>
+ <use xlink:href="#glyph2-7" x="309.996039" y="31.681749"/>
+ <use xlink:href="#glyph2-8" x="313.607964" y="31.681749"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph2-5" x="535.988281" y="33.365343"/>
+ <use xlink:href="#glyph2-1" x="542.761285" y="33.365343"/>
+ <use xlink:href="#glyph2-6" x="549.908637" y="33.365343"/>
+ <use xlink:href="#glyph2-7" x="557.874946" y="33.365343"/>
+ <use xlink:href="#glyph2-8" x="561.486871" y="33.365343"/>
+</g>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph2-9" x="26.695313" y="32.365343"/>
+ <use xlink:href="#glyph2-10" x="34.947266" y="32.365343"/>
+ <use xlink:href="#glyph2-11" x="38.559191" y="32.365343"/>
+ <use xlink:href="#glyph2-11" x="46.798394" y="32.365343"/>
+ <use xlink:href="#glyph2-10" x="55.037598" y="32.365343"/>
+ <use xlink:href="#glyph2-11" x="58.649523" y="32.365343"/>
+ <use xlink:href="#glyph2-12" x="66.888726" y="32.365343"/>
+</g>
+<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-dasharray:0.14,0.14;stroke-miterlimit:10;" d="M 45.300431 9.486438 L 60.373478 9.486438 L 60.373478 16.175696 L 45.300431 16.175696 Z M 45.300431 9.486438 " transform="matrix(20,0,0,20,-434.059401,-172.47877)"/>
+<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
+ <use xlink:href="#glyph2-13" x="532.003906" y="11.904405"/>
+ <use xlink:href="#glyph2-14" x="542.236382" y="11.904405"/>
+ <use xlink:href="#glyph2-15" x="550.475586" y="11.904405"/>
+ <use xlink:href="#glyph2-4" x="555.5727" y="11.904405"/>
+ <use xlink:href="#glyph2-14" x="563.824653" y="11.904405"/>
+ <use xlink:href="#glyph2-15" x="572.063856" y="11.904405"/>
+ <use xlink:href="#glyph2-16" x="577.16097" y="11.904405"/>
+ <use xlink:href="#glyph2-17" x="581.293186" y="11.904405"/>
+ <use xlink:href="#glyph2-3" x="588.307346" y="11.904405"/>
+ <use xlink:href="#glyph2-2" x="596.260905" y="11.904405"/>
+ <use xlink:href="#glyph2-18" x="601.377279" y="11.904405"/>
+ <use xlink:href="#glyph2-6" x="614.040853" y="11.904405"/>
+ <use xlink:href="#glyph2-15" x="622.007161" y="11.904405"/>
+ <use xlink:href="#glyph2-15" x="627.104275" y="11.904405"/>
+ <use xlink:href="#glyph2-8" x="632.201389" y="11.904405"/>
+ <use xlink:href="#glyph2-2" x="640.199436" y="11.904405"/>
+ <use xlink:href="#glyph2-16" x="645.544217" y="11.904405"/>
+ <use xlink:href="#glyph2-19" x="649.676432" y="11.904405"/>
+ <use xlink:href="#glyph2-20" x="657.928385" y="11.904405"/>
+ <use xlink:href="#glyph2-5" x="665.621799" y="11.904405"/>
+ <use xlink:href="#glyph2-15" x="672.394803" y="11.904405"/>
+ <use xlink:href="#glyph2-8" x="677.491916" y="11.904405"/>
+ <use xlink:href="#glyph2-18" x="685.489963" y="11.904405"/>
+</g>
+</g>
+</svg>
diff --git a/Documentation/admin-guide/media/ivtv-cardlist.rst b/Documentation/admin-guide/media/ivtv-cardlist.rst
new file mode 100644
index 000000000000..0ffc3b71ae60
--- /dev/null
+++ b/Documentation/admin-guide/media/ivtv-cardlist.rst
@@ -0,0 +1,139 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+IVTV cards list
+===============
+
+.. tabularcolumns:: |p{1.4cm}|p{12.7cm}|p{3.4cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - PCI subsystem IDs
+
+ * - 0
+ - Hauppauge WinTV PVR-250
+ - IVTV16 104d:813d
+
+ * - 1
+ - Hauppauge WinTV PVR-350
+ - IVTV16 104d:813d
+
+ * - 2
+ - Hauppauge WinTV PVR-150
+ - IVTV16 104d:813d
+
+ * - 3
+ - AVerMedia M179
+ - IVTV15 1461:a3cf, IVTV15 1461:a3ce
+
+ * - 4
+ - Yuan MPG600, Kuroutoshikou ITVC16-STVLP
+ - IVTV16 12ab:fff3, IVTV16 12ab:ffff
+
+ * - 5
+ - YUAN MPG160, Kuroutoshikou ITVC15-STVLP, I/O Data GV-M2TV/PCI
+ - IVTV15 10fc:40a0
+
+ * - 6
+ - Yuan PG600, Diamond PVR-550
+ - IVTV16 ff92:0070, IVTV16 ffab:0600
+
+ * - 7
+ - Adaptec VideOh! AVC-2410
+ - IVTV16 9005:0093
+
+ * - 8
+ - Adaptec VideOh! AVC-2010
+ - IVTV16 9005:0092
+
+ * - 9
+ - Nagase Transgear 5000TV
+ - IVTV16 1461:bfff
+
+ * - 10
+ - AOpen VA2000MAX-SNT6
+ - IVTV16 0000:ff5f
+
+ * - 11
+ - Yuan MPG600GR, Kuroutoshikou CX23416GYC-STVLP
+ - IVTV16 12ab:0600, IVTV16 fbab:0600, IVTV16 1154:0523
+
+ * - 12
+ - I/O Data GV-MVP/RX, GV-MVP/RX2W (dual tuner)
+ - IVTV16 10fc:d01e, IVTV16 10fc:d038, IVTV16 10fc:d039
+
+ * - 13
+ - I/O Data GV-MVP/RX2E
+ - IVTV16 10fc:d025
+
+ * - 14
+ - GotView PCI DVD
+ - IVTV16 12ab:0600
+
+ * - 15
+ - GotView PCI DVD2 Deluxe
+ - IVTV16 ffac:0600
+
+ * - 16
+ - Yuan MPC622
+ - IVTV16 ff01:d998
+
+ * - 17
+ - Digital Cowboy DCT-MTVP1
+ - IVTV16 1461:bfff
+
+ * - 18
+ - Yuan PG600-2, GotView PCI DVD Lite
+ - IVTV16 ffab:0600, IVTV16 ffad:0600
+
+ * - 19
+ - Club3D ZAP-TV1x01
+ - IVTV16 ffab:0600
+
+ * - 20
+ - AVerTV MCE 116 Plus
+ - IVTV16 1461:c439
+
+ * - 21
+ - ASUS Falcon2
+ - IVTV16 1043:4b66, IVTV16 1043:462e, IVTV16 1043:4b2e
+
+ * - 22
+ - AVerMedia PVR-150 Plus / AVerTV M113 Partsnic (Daewoo) Tuner
+ - IVTV16 1461:c034, IVTV16 1461:c035
+
+ * - 23
+ - AVerMedia EZMaker PCI Deluxe
+ - IVTV16 1461:c03f
+
+ * - 24
+ - AVerMedia M104
+ - IVTV16 1461:c136
+
+ * - 25
+ - Buffalo PC-MV5L/PCI
+ - IVTV16 1154:052b
+
+ * - 26
+ - AVerMedia UltraTV 1500 MCE / AVerTV M113 Philips Tuner
+ - IVTV16 1461:c019, IVTV16 1461:c01b
+
+ * - 27
+ - Sony VAIO Giga Pocket (ENX Kikyou)
+ - IVTV16 104d:813d
+
+ * - 28
+ - Hauppauge WinTV PVR-350 (V1)
+ - IVTV16 104d:813d
+
+ * - 29
+ - Yuan MPG600GR, Kuroutoshikou CX23416GYC-STVLP (no GR)
+ - IVTV16 104d:813d
+
+ * - 30
+ - Yuan MPG600GR, Kuroutoshikou CX23416GYC-STVLP (no GR/YCS)
+ - IVTV16 104d:813d
diff --git a/Documentation/admin-guide/media/ivtv.rst b/Documentation/admin-guide/media/ivtv.rst
new file mode 100644
index 000000000000..101f16d0263e
--- /dev/null
+++ b/Documentation/admin-guide/media/ivtv.rst
@@ -0,0 +1,218 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The ivtv driver
+===============
+
+Author: Hans Verkuil <hverkuil@xs4all.nl>
+
+This is a v4l2 device driver for the Conexant cx23415/6 MPEG encoder/decoder.
+The cx23415 can do both encoding and decoding; the cx23416 can only do MPEG
+encoding. Currently the only card featuring full decoding support is the
+Hauppauge PVR-350.
+
+.. note::
+
+ #) This driver requires the latest encoder firmware (version 2.06.039, size
+ 376836 bytes). Get the firmware from here:
+
+ https://linuxtv.org/downloads/firmware/#conexant
+
+ #) 'Normal' TV applications do not work with this driver; you need
+ an application that can handle MPEG input such as mplayer, xine, MythTV,
+ etc.
+
+The primary goal of the IVTV project is to provide a "clean room" Linux
+Open Source driver implementation for video capture cards based on the
+iCompression iTVC15 or Conexant CX23415/CX23416 MPEG Codec.
+
+Features
+--------
+
+ * Hardware mpeg2 capture of broadcast video (and sound) via the tuner or
+ S-Video/Composite and audio line-in.
+ * Hardware mpeg2 capture of FM radio where hardware support exists
+ * Supports NTSC, PAL, SECAM with stereo sound
+ * Supports SAP and bilingual transmissions.
+ * Supports raw VBI (closed captions and teletext).
+ * Supports sliced VBI (closed captions and teletext) and is able to insert
+ this into the captured MPEG stream.
+ * Supports raw YUV and PCM input.
+
+Additional features for the PVR-350 (CX23415 based)
+---------------------------------------------------
+
+ * Provides hardware mpeg2 playback
+ * Provides comprehensive OSD (On Screen Display: i.e. graphics overlaying the
+ video signal)
+ * Provides a framebuffer (allowing X applications to appear on the video
+ device)
+ * Supports raw YUV output.
+
+IMPORTANT: In case of problems first read this page:
+ https://help.ubuntu.com/community/Install_IVTV_Troubleshooting
+
+See also
+--------
+
+https://linuxtv.org
+
+IRC
+---
+
+irc://irc.freenode.net/#v4l
+
+----------------------------------------------------------
+
+Devices
+-------
+
+A maximum of 12 ivtv boards are allowed at the moment.
+
+Cards that don't have a video output capability (i.e. non PVR350 cards)
+lack the vbi8, vbi16, video16 and video48 devices. They also do not
+support the framebuffer device /dev/fbx for OSD.
+
+The radio0 device may or may not be present, depending on whether the
+card has a radio tuner or not.
+
+Here is a list of the base v4l devices:
+
+.. code-block:: none
+
+ crw-rw---- 1 root video 81, 0 Jun 19 22:22 /dev/video0
+ crw-rw---- 1 root video 81, 16 Jun 19 22:22 /dev/video16
+ crw-rw---- 1 root video 81, 24 Jun 19 22:22 /dev/video24
+ crw-rw---- 1 root video 81, 32 Jun 19 22:22 /dev/video32
+ crw-rw---- 1 root video 81, 48 Jun 19 22:22 /dev/video48
+ crw-rw---- 1 root video 81, 64 Jun 19 22:22 /dev/radio0
+ crw-rw---- 1 root video 81, 224 Jun 19 22:22 /dev/vbi0
+ crw-rw---- 1 root video 81, 228 Jun 19 22:22 /dev/vbi8
+ crw-rw---- 1 root video 81, 232 Jun 19 22:22 /dev/vbi16
+
+Base devices
+------------
+
+For every extra card you have, the numbers increase by one. For example,
+/dev/video0 is listed as the 'base' encoding capture device so we have:
+
+- /dev/video0 is the encoding capture device for the first card (card 0)
+- /dev/video1 is the encoding capture device for the second card (card 1)
+- /dev/video2 is the encoding capture device for the third card (card 2)
+
+Note that if the first card doesn't have a feature (e.g. no decoder, so no
+video16), the second card will still use video17. The simple rule is 'add
+the card number to the base device number'. If you have other capture
+cards (e.g. WinTV PCI) that are detected first, then you have to tell
+the ivtv module about it so that it will start counting at 1 (or 2, or
+whatever). Otherwise the device numbers can get confusing. The ivtv
+'ivtv_first_minor' module option can be used for that.
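+
+The 'add the card number to the base device number' rule can be sketched in
+shell. This is only an illustration: the 'ivtv_node' helper is hypothetical
+and not part of the driver, and it assumes that the ivtv cards are counted
+from 0 (no 'ivtv_first_minor' offset):
+
+.. code-block:: sh
+
+ # ivtv_node PREFIX BASE CARD: print the device node for a given card.
+ ivtv_node() {
+     printf '/dev/%s%d\n' "$1" "$(($2 + $3))"
+ }
+
+ ivtv_node video 0 1    # encoding capture device of the second card
+ ivtv_node video 16 1   # decoder output device of the second card
+ ivtv_node vbi 8 2      # processed vbi feedback device of the third card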
+
+
+- /dev/video0
+
+ The encoding capture device(s).
+
+ Read-only.
+
+ Reading from this device gets you the MPEG1/2 program stream.
+ Example:
+
+ .. code-block:: none
+
+ cat /dev/video0 > my.mpg # press Ctrl-C to stop
+
+
+- /dev/video16
+
+ The decoder output device(s)
+
+ Write-only. Only present if the MPEG decoder (i.e. CX23415) exists.
+
+ An mpeg2 stream sent to this device will appear on the selected video
+ display; audio will appear on the line-out/audio out. It is only
+ available for cards that support video out. Example:
+
+ .. code-block:: none
+
+ cat my.mpg >/dev/video16
+
+
+- /dev/video24
+
+ The raw audio capture device(s).
+
+ Read-only
+
+ The raw audio PCM stereo stream from the currently selected
+ tuner or audio line-in. Reading from this device results in a raw
+ (signed 16 bit Little Endian, 48000 Hz, stereo pcm) capture.
+ This device only captures audio. This should be replaced by an ALSA
+ device in the future.
+ Note that there is no corresponding raw audio output device; this is
+ not supported in the decoder firmware.
+
+
+- /dev/video32
+
+ The raw video capture device(s)
+
+ Read-only
+
+ The raw YUV video output from the current video input. The YUV format
+ is a 16x16 linear tiled NV12 format (V4L2_PIX_FMT_NV12_16L16)
+
+ Note that the YUV and PCM streams are not synchronized, so they are of
+ limited use.
+
+
+- /dev/video48
+
+ The raw video display device(s)
+
+ Write-only. Only present if the MPEG decoder (i.e. CX23415) exists.
+
+ Writes a YUV stream to the decoder of the card.
+
+
+- /dev/radio0
+
+ The radio tuner device(s)
+
+ Cannot be read or written.
+
+ Used to enable the radio tuner and tune to a frequency. You cannot
+ read or write audio streams with this device. Once you use this
+ device to tune the radio, use /dev/video24 to read the raw pcm stream
+ or /dev/video0 to get an mpeg2 stream with black video.
+
+
+- /dev/vbi0
+
+ The 'vertical blank interval' (Teletext, CC, WSS etc) capture device(s)
+
+ Read-only
+
+ Captures the raw (or sliced) video data sent during the Vertical Blank
+ Interval. This data is used to encode teletext, closed captions, VPS,
+ widescreen signalling, electronic program guide information, and other
+ services.
+
+
+- /dev/vbi8
+
+ Processed vbi feedback device(s)
+
+ Read-only. Only present if the MPEG decoder (i.e. CX23415) exists.
+
+ The sliced VBI data embedded in an MPEG stream is reproduced on this
+ device. So while playing back a recording on /dev/video16, you can
+ read the embedded VBI data from /dev/vbi8.
+
+
+- /dev/vbi16
+
+ The vbi 'display' device(s)
+
+ Write-only. Only present if the MPEG decoder (i.e. CX23415) exists.
+
+ Can be used to send sliced VBI data to the video-out connector.
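+
+As a rough illustration of the raw audio format of /dev/video24 described
+above (signed 16 bit, 48000 Hz, stereo), the size of a capture can be worked
+out in shell. The 10 second duration and the 'audio.raw' file name are
+arbitrary choices for this sketch:
+
+.. code-block:: sh
+
+ rate=48000          # samples per second
+ channels=2          # stereo
+ bytes_per_sample=2  # signed 16 bit
+ seconds=10          # arbitrary capture length
+
+ # Bytes produced per second and for the whole capture.
+ bytes_per_second=$((rate * channels * bytes_per_sample))
+ total_bytes=$((bytes_per_second * seconds))
+ echo "$bytes_per_second bytes/s, $total_bytes bytes in total"
+
+ # On a machine with an ivtv card, the capture itself could then be:
+ #   head -c "$total_bytes" /dev/video24 > audio.raw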
diff --git a/Documentation/admin-guide/media/lmedm04.rst b/Documentation/admin-guide/media/lmedm04.rst
new file mode 100644
index 000000000000..a6ee33413748
--- /dev/null
+++ b/Documentation/admin-guide/media/lmedm04.rst
@@ -0,0 +1,107 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Firmware files for lmedm04 cards
+================================
+
+To extract firmware for the DM04/QQBOX you need to copy the
+following file(s) to this directory.
+
+For DM04+/QQBOX LME2510C (Sharp 7395 Tuner)
+-------------------------------------------
+
+The Sharp 7395 driver can be found in windows/system32/drivers
+
+US2A0D.sys (dated 17 Mar 2009)
+
+
+and run:
+
+.. code-block:: none
+
+ scripts/get_dvb_firmware lme2510c_s7395
+
+will produce dvb-usb-lme2510c-s7395.fw
+
+An alternative but older firmware can be found on the driver
+disk DVB-S_EN_3.5A in BDADriver/driver
+
+LMEBDA_DVBS7395C.sys (dated 18 Jan 2008)
+
+and run:
+
+.. code-block:: none
+
+ ./get_dvb_firmware lme2510c_s7395_old
+
+will produce dvb-usb-lme2510c-s7395.fw
+
+The LG firmware can be found on the driver
+disk DM04+_5.1A[LG] in BDADriver/driver
+
+For DM04 LME2510 (LG Tuner)
+---------------------------
+
+LMEBDA_DVBS.sys (dated 13 Nov 2007)
+
+and run:
+
+
+.. code-block:: none
+
+ ./get_dvb_firmware lme2510_lg
+
+will produce dvb-usb-lme2510-lg.fw
+
+
+Other LG firmware can be extracted manually from US280D.sys
+only found in windows/system32/drivers
+
+.. code-block:: none
+
+ dd if=US280D.sys ibs=1 skip=42360 count=3924 of=dvb-usb-lme2510-lg.fw
+
+For DM04 LME2510C (LG Tuner)
+----------------------------
+
+.. code-block:: none
+
+ dd if=US280D.sys ibs=1 skip=35200 count=3850 of=dvb-usb-lme2510c-lg.fw
+
+
+The Sharp 0194 tuner driver can be found in windows/system32/drivers
+
+US290D.sys (dated 09 Apr 2009)
+
+For LME2510
+-----------
+
+.. code-block:: none
+
+ dd if=US290D.sys ibs=1 skip=36856 count=3976 of=dvb-usb-lme2510-s0194.fw
+
+
+For LME2510C
+------------
+
+
+.. code-block:: none
+
+ dd if=US290D.sys ibs=1 skip=33152 count=3697 of=dvb-usb-lme2510c-s0194.fw
+
+
+The m88rs2000 tuner driver can be found in windows/system32/drivers
+
+US2B0D.sys (dated 29 Jun 2010)
+
+
+.. code-block:: none
+
+ dd if=US2B0D.sys ibs=1 skip=34432 count=3871 of=dvb-usb-lme2510c-rs2000.fw
+
+We need to modify the id in the rs2000 firmware, or it will warm boot with id 3344:1120.
+
+
+.. code-block:: none
+
+
+ echo -ne \\xF0\\x22 | dd conv=notrunc bs=1 count=2 seek=266 of=dvb-usb-lme2510c-rs2000.fw
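+
+The effect of the id patch above can be tried on a scratch file first. In this
+sketch the 'scratch.fw' file is a hypothetical stand-in filled with zeros, not
+real firmware, and printf with octal escapes (\360\042 is F0 22 in hex) is
+used in place of 'echo -ne' for portability:
+
+.. code-block:: sh
+
+ # Create a 512 byte dummy file of zeros.
+ dd if=/dev/zero of=scratch.fw bs=1 count=512 2>/dev/null
+
+ # Overwrite the two bytes at offset 266 without truncating the file.
+ printf '\360\042' | dd conv=notrunc bs=1 count=2 seek=266 of=scratch.fw 2>/dev/null
+
+ # Show the patched bytes in hex, then the unchanged file size.
+ od -A d -t x1 -j 266 -N 2 scratch.fw
+ wc -c < scratch.fw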
+
+Copy the firmware file(s) to /lib/firmware
diff --git a/Documentation/admin-guide/media/meye.rst b/Documentation/admin-guide/media/meye.rst
new file mode 100644
index 000000000000..9098a1e65f8b
--- /dev/null
+++ b/Documentation/admin-guide/media/meye.rst
@@ -0,0 +1,93 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+Vaio Picturebook Motion Eye Camera Driver
+=========================================
+
+Copyright |copy| 2001-2004 Stelian Pop <stelian@popies.net>
+
+Copyright |copy| 2001-2002 Alcôve <www.alcove.com>
+
+Copyright |copy| 2000 Andrew Tridgell <tridge@samba.org>
+
+This driver enables the use of video4linux compatible applications with the
+Motion Eye camera. This driver requires the "Sony Laptop Extras" driver (which
+can be found in the "Misc devices" section of the kernel configuration utility)
+to be compiled and installed (using its "camera=1" parameter).
+
+It supports at most 30 fps at 320x240 or 15 fps at 640x480.
+
+Grabbing is supported in packed YUV colorspace only.
+
+MJPEG hardware grabbing is supported via a private API (see below).
+
+Hardware supported
+------------------
+
+This driver supports the 'second' version of the MotionEye camera :)
+
+The first version was connected directly on the video bus of the Neomagic
+video card and is unsupported.
+
+The second one, made by Kawasaki Steel, is fully supported by this
+driver (PCI vendor/device is 0x136b/0xff01)
+
+The third one, present in recent (more or less last year) Picturebooks
+(C1M* models), is not supported. The manufacturer has given the specs
+to the developers under an NDA (which allows the development of a GPL
+driver however), but things are not moving very fast (see
+http://r-engine.sourceforge.net/) (PCI vendor/device is 0x10cf/0x2011).
+
+There is a fourth model connected on the USB bus in TR1* Vaio laptops.
+This camera is not supported at all by the current driver; in fact,
+little information if any is available for this camera
+(USB vendor/device is 0x054c/0x0107).
+
+Driver options
+--------------
+
+Several options can be passed to the meye driver using the standard
+module argument syntax (<param>=<value> when passing the option to the
+module or meye.<param>=<value> on the kernel boot line when meye is
+statically linked into the kernel). Those options are:
+
+.. code-block:: none
+
+ gbuffers: number of capture buffers, default is 2 (32 max)
+
+ gbufsize: size of each capture buffer, default is 614400
+
+ video_nr: video device to register (0 = /dev/video0, etc)
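+
+The default gbufsize matches one 640x480 frame if packed YUV is taken to use
+2 bytes per pixel. That pixel size is an assumption of this sketch, not
+something stated above:
+
+.. code-block:: sh
+
+ width=640
+ height=480
+ bytes_per_pixel=2   # assumed for packed YUV
+
+ # Size of one frame, and of the maximum of 32 capture buffers.
+ frame_bytes=$((width * height * bytes_per_pixel))
+ echo "one frame: $frame_bytes bytes"
+ echo "32 buffers: $((frame_bytes * 32)) bytes"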
+
+Module use
+----------
+
+In order to automatically load the meye module on use, you can put these lines
+in your /etc/modprobe.d/meye.conf file:
+
+.. code-block:: none
+
+ alias char-major-81 videodev
+ alias char-major-81-0 meye
+ options meye gbuffers=32
+
+Usage:
+------
+
+.. code-block:: none
+
+ xawtv >= 3.49 (<http://bytesex.org/xawtv/>)
+ for display and uncompressed video capture:
+
+ xawtv -c /dev/video0 -geometry 640x480
+ or
+ xawtv -c /dev/video0 -geometry 320x240
+
+ motioneye (<http://popies.net/meye/>)
+ for getting ppm or jpg snapshots, mjpeg video
+
+Bugs / Todo
+-----------
+
+- 'motioneye' still uses the meye private v4l1 API extensions.
diff --git a/Documentation/admin-guide/media/misc-cardlist.rst b/Documentation/admin-guide/media/misc-cardlist.rst
new file mode 100644
index 000000000000..4c26bcfccd61
--- /dev/null
+++ b/Documentation/admin-guide/media/misc-cardlist.rst
@@ -0,0 +1,28 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Firewire driver
+===============
+
+The media subsystem also provides a firewire driver for digital TV:
+
+======= =====================
+Driver Name
+======= =====================
+firedtv FireDTV and FloppyDTV
+======= =====================
+
+Test drivers
+============
+
+In order to test userspace applications, there are a number of virtual
+drivers, which provide test functionality, simulating real hardware
+devices:
+
+======= ======================================
+Driver Name
+======= ======================================
+vicodec Virtual Codec Driver
+vim2m Virtual Memory-to-Memory Driver
+vimc Virtual Media Controller Driver (VIMC)
+vivid Virtual Video Test Driver
+======= ======================================
diff --git a/Documentation/admin-guide/media/omap3isp.rst b/Documentation/admin-guide/media/omap3isp.rst
new file mode 100644
index 000000000000..f32e7375a1a2
--- /dev/null
+++ b/Documentation/admin-guide/media/omap3isp.rst
@@ -0,0 +1,92 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+OMAP 3 Image Signal Processor (ISP) driver
+==========================================
+
+Copyright |copy| 2010 Nokia Corporation
+
+Copyright |copy| 2009 Texas Instruments, Inc.
+
+Contacts: Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
+Sakari Ailus <sakari.ailus@iki.fi>, David Cohen <dacohen@gmail.com>
+
+
+Introduction
+------------
+
+This file documents the Texas Instruments OMAP 3 Image Signal Processor (ISP)
+driver located under drivers/media/platform/ti/omap3isp. The original driver was
+written by Texas Instruments, but it has since been rewritten (twice) at
+Nokia.
+
+The driver has been successfully used on the following versions of OMAP 3:
+
+- 3430
+- 3530
+- 3630
+
+The driver implements V4L2, Media controller and v4l2_subdev interfaces.
+Sensor, lens and flash drivers using the v4l2_subdev interface in the kernel
+are supported.
+
+
+Split to subdevs
+----------------
+
+The OMAP 3 ISP is split into V4L2 subdevs, each of the blocks inside the ISP
+having one subdev to represent it. Each of the subdevs provides a V4L2 subdev
+interface to userspace.
+
+- OMAP3 ISP CCP2
+- OMAP3 ISP CSI2a
+- OMAP3 ISP CCDC
+- OMAP3 ISP preview
+- OMAP3 ISP resizer
+- OMAP3 ISP AEWB
+- OMAP3 ISP AF
+- OMAP3 ISP histogram
+
+Each possible link in the ISP is modelled by a link in the Media controller
+interface. For an example program see [#]_.
+
+
+Controlling the OMAP 3 ISP
+--------------------------
+
+In general, the settings given to the OMAP 3 ISP take effect at the beginning
+of the following frame. This is done when the module becomes idle during the
+vertical blanking period on the sensor. In memory-to-memory operation the pipe
+is run one frame at a time. Applying the settings is done between the frames.
+
+All the blocks in the ISP, excluding the CSI-2 and possibly the CCP2 receiver,
+insist on receiving complete frames. Sensors must thus never send the ISP
+partial frames.
+
+Autoidle does have issues with some ISP blocks on the 3430, at least.
+Autoidle is only enabled on 3630 when the omap3isp module parameter autoidle
+is non-zero.
+
+Technical reference manuals (TRMs) and other documentation
+----------------------------------------------------------
+
+OMAP 3430 TRM:
+<URL:http://focus.ti.com/pdfs/wtbu/OMAP34xx_ES3.1.x_PUBLIC_TRM_vZM.zip>
+Referenced 2011-03-05.
+
+OMAP 35xx TRM:
+<URL:http://www.ti.com/litv/pdf/spruf98o> Referenced 2011-03-05.
+
+OMAP 3630 TRM:
+<URL:http://focus.ti.com/pdfs/wtbu/OMAP36xx_ES1.x_PUBLIC_TRM_vQ.zip>
+Referenced 2011-03-05.
+
+DM 3730 TRM:
+<URL:http://www.ti.com/litv/pdf/sprugn4h> Referenced 2011-03-06.
+
+
+References
+----------
+
+.. [#] http://git.ideasonboard.org/?p=media-ctl.git;a=summary
diff --git a/Documentation/admin-guide/media/omap4_camera.rst b/Documentation/admin-guide/media/omap4_camera.rst
new file mode 100644
index 000000000000..2ada9b1e6897
--- /dev/null
+++ b/Documentation/admin-guide/media/omap4_camera.rst
@@ -0,0 +1,62 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+OMAP4 ISS Driver
+================
+
+Author: Sergio Aguirre <sergio.a.aguirre@gmail.com>
+
+Copyright (C) 2012, Texas Instruments
+
+Introduction
+------------
+
+The OMAP44XX family of chips contains the Imaging SubSystem (a.k.a. ISS),
+which contains several components that can be categorized in 3 big groups:
+
+- Interfaces (2 Interfaces: CSI2-A & CSI2-B/CCP2)
+- ISP (Image Signal Processor)
+- SIMCOP (Still Image Coprocessor)
+
+For more information, please look in [#f1]_ for the latest version of:
+"OMAP4430 Multimedia Device Silicon Revision 2.x"
+
+As of Revision AB, the ISS is described in detail in section 8.
+
+This driver supports **only** the CSI2-A/B interfaces for now.
+
+It makes use of the Media Controller framework [#f2]_, and inherited most of the
+code from the OMAP3 ISP driver (found under drivers/media/platform/ti/omap3isp/\*),
+except that it doesn't need an IOMMU now for ISS buffers memory mapping.
+
+It supports MMAP buffers only (for now).
+
+Tested platforms
+----------------
+
+- OMAP4430SDP, w/ ES2.1 GP & SEVM4430-CAM-V1-0 (Contains IMX060 & OV5640, of
+ which only the latter is supported, outputting YUV422 frames).
+
+- TI Blaze MDP, w/ OMAP4430 ES2.2 EMU (Contains 1 IMX060 & 2 OV5650 sensors, of
+ which only the OV5650 are supported, outputting RAW10 frames).
+
+- PandaBoard, Rev. A2, w/ OMAP4430 ES2.1 GP & OV adapter board, tested with
+ the following sensors:
+ * OV5640
+ * OV5650
+
+- Tested on mainline kernel:
+
+ http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=summary
+
+ Tag: v3.3 (commit c16fa4f2ad19908a47c63d8fa436a1178438c7e7)
+
+File list
+---------
+drivers/staging/media/omap4iss/
+include/linux/platform_data/media/omap4iss.h
+
+References
+----------
+
+.. [#f1] http://focus.ti.com/general/docs/wtbu/wtbudocumentcenter.tsp?navigationId=12037&templateId=6123#62
+.. [#f2] http://lwn.net/Articles/420485/
diff --git a/Documentation/admin-guide/media/opera-firmware.rst b/Documentation/admin-guide/media/opera-firmware.rst
new file mode 100644
index 000000000000..fab3581551de
--- /dev/null
+++ b/Documentation/admin-guide/media/opera-firmware.rst
@@ -0,0 +1,33 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Opera firmware
+==============
+
+Author: Marco Gittler <g.marco@freenet.de>
+
+To extract the firmware for the Opera DVB-S1 USB-Box
+you need to copy the files:
+
+2830SCap2.sys
+2830SLoad2.sys
+
+from the Windows driver disk into this directory.
+
+Then run:
+
+.. code-block:: none
+
+ scripts/get_dvb_firmware opera1
+
+and after that you have 2 files:
+
+dvb-usb-opera-01.fw
+dvb-usb-opera1-fpga-01.fw
+
+in here.
+
+Copy them into /lib/firmware/ .
+
+After that the driver can load the firmware
+(if you have enabled firmware loading
+in kernel config and have hotplug running).
diff --git a/Documentation/admin-guide/media/other-usb-cardlist.rst b/Documentation/admin-guide/media/other-usb-cardlist.rst
new file mode 100644
index 000000000000..bbfdb1389c18
--- /dev/null
+++ b/Documentation/admin-guide/media/other-usb-cardlist.rst
@@ -0,0 +1,92 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Other USB cards list
+====================
+
+================ ====================================== =====================
+Driver           Card name                              USB IDs
+================ ====================================== =====================
+airspy           Airspy                                 1d50:60a1
+dvb-as102        Abilis Systems DVB-Titan               1BA6:0001
+dvb-as102        PCTV Systems picoStick (74e)           2013:0246
+dvb-as102        Elgato EyeTV DTT Deluxe                0fd9:002c
+dvb-as102        nBox DVB-T Dongle                      0b89:0007
+dvb-as102        Sky IT Digital Key (green led)         2137:0001
+b2c2-flexcop-usb Technisat/B2C2 FlexCop II/IIb/III      0af7:0101
+                 Digital TV
+cpia2            Vision's CPiA2 cameras                 0553:0100, 0553:0140,
+                 such as the Digital Blue QX5           0553:0151
+go7007           WIS GO7007 MPEG encoder                1943:a250, 093b:a002,
+                                                        093b:a004, 0eb1:6666,
+                                                        0eb1:6668
+hackrf           HackRF Software Decoder Radio          1d50:6089
+hdpvr            Hauppauge HD PVR                       2040:4900, 2040:4901,
+                                                        2040:4902, 2040:4982,
+                                                        2040:4903
+msi2500          Mirics MSi3101 SDR Dongle              1df7:2500, 2040:d300
+pvrusb2          Hauppauge WinTV-PVR USB2               2040:2900, 2040:2950,
+                                                        2040:2400, 1164:0622,
+                                                        1164:0602, 11ba:1003,
+                                                        11ba:1001, 2040:7300,
+                                                        2040:7500, 2040:7501,
+                                                        0ccd:0039, 2040:7502,
+                                                        2040:7510
+pwc              Creative Webcam 5                      041E:400C
+pwc              Creative Webcam Pro Ex                 041E:4011
+pwc              Logitech QuickCam 3000 Pro             046D:08B0
+pwc              Logitech QuickCam Notebook Pro         046D:08B1
+pwc              Logitech QuickCam 4000 Pro             046D:08B2
+pwc              Logitech QuickCam Zoom (old model)     046D:08B3
+pwc              Logitech QuickCam Zoom (new model)     046D:08B4
+pwc              Logitech QuickCam Orbit/Sphere         046D:08B5
+pwc              Logitech/Cisco VT Camera               046D:08B6
+pwc              Logitech ViewPort AV 100               046D:08B7
+pwc              Logitech QuickCam                      046D:08B8
+pwc              Philips PCA645VC                       0471:0302
+pwc              Philips PCA646VC                       0471:0303
+pwc              Askey VC010 type 2                     0471:0304
+pwc              Philips PCVC675K (Vesta)               0471:0307
+pwc              Philips PCVC680K (Vesta Pro)           0471:0308
+pwc              Philips PCVC690K (Vesta Pro Scan)      0471:030C
+pwc              Philips PCVC730K (ToUCam Fun),         0471:0310
+                 PCVC830 (ToUCam II)
+pwc              Philips PCVC740K (ToUCam Pro),         0471:0311
+                 PCVC840 (ToUCam II)
+pwc              Philips PCVC750K (ToUCam Pro Scan)     0471:0312
+pwc              Philips PCVC720K/40 (ToUCam XS)        0471:0313
+pwc              Philips SPC 900NC                      0471:0329
+pwc              Philips SPC 880NC                      0471:032C
+pwc              Sotec Afina Eye                        04CC:8116
+pwc              Samsung MPC-C10                        055D:9000
+pwc              Samsung MPC-C30                        055D:9001
+pwc              Samsung SNC-35E (Ver3.0)               055D:9002
+pwc              Askey VC010 type 1                     069A:0001
+pwc              AME Co. Afina Eye                      06BE:8116
+pwc              Visionite VCS-UC300                    0d81:1900
+pwc              Visionite VCS-UM100                    0d81:1910
+s2255drv         Sensoray 2255                          1943:2255, 1943:2257
+stk1160          STK1160 USB video capture dongle       05e1:0408
+stkwebcam        Syntek DC1125                          174f:a311, 05e1:0501
+dvb-ttusb-budget Technotrend/Hauppauge Nova-USB devices 0b48:1003, 0b48:1004,
+                                                        0b48:1005
+dvb-ttusb_dec    Technotrend/Hauppauge MPEG decoder     0b48:1006
+                 DEC3000-s
+dvb-ttusb_dec    Technotrend/Hauppauge MPEG decoder     0b48:1007
+dvb-ttusb_dec    Technotrend/Hauppauge MPEG decoder     0b48:1008
+                 DEC2000-t
+dvb-ttusb_dec    Technotrend/Hauppauge MPEG decoder
+                 DEC2540-t                              0b48:1009
+usbtv            Fushicai USBTV007 Audio-Video Grabber  1b71:3002, 1f71:3301,
+                                                        1f71:3306
+zr364xx          USB ZR364XX Camera                     08ca:0109, 041e:4024,
+                                                        0d64:0108, 0546:3187,
+                                                        0d64:3108, 0595:4343,
+                                                        0bb0:500d, 0feb:2004,
+                                                        055f:b500, 08ca:2062,
+                                                        052b:1a18, 04c8:0729,
+                                                        04f2:a208, 0784:0040,
+                                                        06d6:0034, 0a17:0062,
+                                                        06d6:003b, 0a17:004e,
+                                                        041e:405d, 08ca:2102,
+                                                        06d6:003d
+================ ====================================== =====================
diff --git a/Documentation/admin-guide/media/pci-cardlist.rst b/Documentation/admin-guide/media/pci-cardlist.rst
new file mode 100644
index 000000000000..f4d670e632f8
--- /dev/null
+++ b/Documentation/admin-guide/media/pci-cardlist.rst
@@ -0,0 +1,109 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+PCI drivers
+===========
+
+PCI boards are identified by an identifier called the PCI ID. The PCI ID
+is actually composed of two parts:
+
+ - Vendor ID and device ID;
+ - Subsystem ID and Subsystem device ID;
+
+The ``lspci -nn`` command allows identifying the vendor/device PCI IDs:
+
+.. code-block:: none
+ :emphasize-lines: 3
+
+ $ lspci -nn
+ ...
+ 00:0a.0 Multimedia controller [0480]: Philips Semiconductors SAA7131/SAA7133/SAA7135 Video Broadcast Decoder [1131:7133] (rev d1)
+ 00:0b.0 Multimedia controller [0480]: Brooktree Corporation Bt878 Audio Capture [109e:0878] (rev 11)
+ 01:00.0 Multimedia video controller [0400]: Conexant Systems, Inc. CX23887/8 PCIe Broadcast Audio and Video Decoder with 3D Comb [14f1:8880] (rev 0f)
+ 02:01.0 Multimedia video controller [0400]: Internext Compression Inc iTVC15 (CX23415) Video Decoder [4444:0803] (rev 01)
+ 02:02.0 Multimedia video controller [0400]: Conexant Systems, Inc. CX23418 Single-Chip MPEG-2 Encoder with Integrated Analog Video/Broadcast Audio Decoder [14f1:5b7a]
+ 02:03.0 Multimedia video controller [0400]: Brooktree Corporation Bt878 Video Capture [109e:036e] (rev 11)
+ ...
+
+The subsystem IDs can be obtained using ``lspci -vn``:
+
+.. code-block:: none
+ :emphasize-lines: 4
+
+ $ lspci -vn
+ ...
+ 00:0a.0 0480: 1131:7133 (rev d1)
+ Subsystem: 1461:f01d
+ Flags: bus master, medium devsel, latency 32, IRQ 209
+ Memory at e2002000 (32-bit, non-prefetchable) [size=2K]
+ Capabilities: [40] Power Management version 2
+ ...
+
+In the above example, the first card uses the ``saa7134`` driver, and
+has a vendor/device PCI ID equal to ``1131:7133`` and a PCI subsystem
+ID equal to ``1461:f01d`` (see :doc:`Saa7134 card list<saa7134-cardlist>`).
+
+Unfortunately, sometimes the same PCI subsystem ID is used by different
+products. So, several media drivers allow passing a ``card=`` parameter,
+in order to set up a card number that matches the correct settings for
+a specific board.
+
+The currently supported PCI/PCIe cards (not including staging drivers) are
+listed below\ [#]_.
+
+.. [#] some of the drivers have sub-drivers, not shown in this table
+
+================ ========================================================
+Driver           Name
+================ ========================================================
+altera-ci        Altera FPGA based CI module
+b2c2-flexcop-pci Technisat/B2C2 Air/Sky/Cable2PC PCI
+bt878            DVB/ATSC Support for bt878 based TV cards
+bttv             BT8x8 Video For Linux
+cobalt           Cisco Cobalt
+cx18             Conexant cx23418 MPEG encoder
+cx23885          Conexant cx23885 (2388x successor)
+cx25821          Conexant cx25821
+cx88xx           Conexant 2388x (bt878 successor)
+ddbridge         Digital Devices bridge
+dm1105           SDMC DM1105 based PCI cards
+dt3155           DT3155 frame grabber
+dvb-ttpci        AV7110 cards
+earth-pt1        PT1 cards
+earth-pt3        Earthsoft PT3 cards
+hexium_gemini    Hexium Gemini frame grabber
+hexium_orion     Hexium HV-PCI6 and Orion frame grabber
+hopper           HOPPER based cards
+ipu3-cio2        Intel ipu3-cio2 driver
+ivtv             Conexant cx23416/cx23415 MPEG encoder/decoder
+ivtvfb           Conexant cx23415 framebuffer
+mantis           MANTIS based cards
+meye             Sony Vaio Picturebook Motion Eye
+mxb              Siemens-Nixdorf 'Multimedia eXtension Board'
+netup-unidvb     NetUP Universal DVB card
+ngene            Micronas nGene
+pluto2           Pluto2 cards
+saa7134          Philips SAA7134
+saa7164          NXP SAA7164
+smipcie          SMI PCIe DVBSky cards
+solo6x10         Bluecherry / Softlogic 6x10 capture cards (MPEG-4/H.264)
+sta2x11_vip      STA2X11 VIP Video For Linux
+tw5864           Techwell TW5864 video/audio grabber and encoder
+tw686x           Intersil/Techwell TW686x
+tw68             Techwell tw68x Video For Linux
+zoran            Zoran-36057/36067 JPEG codec
+================ ========================================================
+
+Some of those drivers support multiple devices, as shown in the card
+lists below:
+
+.. toctree::
+ :maxdepth: 1
+
+ bttv-cardlist
+ cx18-cardlist
+ cx23885-cardlist
+ cx88-cardlist
+ ivtv-cardlist
+ saa7134-cardlist
+ saa7164-cardlist
+ zoran-cardlist
diff --git a/Documentation/admin-guide/media/philips.rst b/Documentation/admin-guide/media/philips.rst
new file mode 100644
index 000000000000..e2840be10d08
--- /dev/null
+++ b/Documentation/admin-guide/media/philips.rst
@@ -0,0 +1,247 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Philips webcams (pwc driver)
+============================
+
+This file contains some additional information for the Philips and OEM webcams.
+E-mail: webcam@smcc.demon.nl Last updated: 2004-01-19
+Site: http://www.smcc.demon.nl/webcam/
+
+As of this moment, the following cameras are supported:
+
+ * Philips PCA645
+ * Philips PCA646
+ * Philips PCVC675
+ * Philips PCVC680
+ * Philips PCVC690
+ * Philips PCVC720/40
+ * Philips PCVC730
+ * Philips PCVC740
+ * Philips PCVC750
+ * Askey VC010
+ * Creative Labs Webcam 5
+ * Creative Labs Webcam Pro Ex
+ * Logitech QuickCam 3000 Pro
+ * Logitech QuickCam 4000 Pro
+ * Logitech QuickCam Notebook Pro
+ * Logitech QuickCam Zoom
+ * Logitech QuickCam Orbit
+ * Logitech QuickCam Sphere
+ * Samsung MPC-C10
+ * Samsung MPC-C30
+ * Sotec Afina Eye
+ * AME CU-001
+ * Visionite VCS-UM100
+ * Visionite VCS-UC300
+
+The main webpage for the Philips driver is at the address above. It contains
+a lot of extra information, a FAQ, and the binary plugin 'PWCX'. This plugin
+contains decompression routines that allow you to use higher image sizes and
+framerates; in addition the webcam uses less bandwidth on the USB bus (handy
+if you want to run more than 1 camera simultaneously). These routines fall
+under an NDA, and may therefore not be distributed as source; however, its use
+is completely optional.
+
+You can build this code either into your kernel, or as a module. I recommend
+the latter, since it makes troubleshooting a lot easier. The built-in
+microphone is supported through the USB Audio class.
+
+When you load the module you can set some default settings for the
+camera; some programs depend on a particular image-size or -format and
+don't know how to set it properly in the driver. The options are:
+
+size
+ Can be one of 'sqcif', 'qsif', 'qcif', 'sif', 'cif' or
+ 'vga', for an image size of resp. 128x96, 160x120, 176x144,
+ 320x240, 352x288 and 640x480 (of course, only for those cameras that
+ support these resolutions).
+
+fps
+ Specifies the desired framerate. Is an integer in the range of 4-30.
+
+fbufs
+ This parameter specifies the number of internal buffers to use for storing
+ frames from the cam. This will help if the process that reads images from
+ the cam is a bit slow or momentarily busy. However, on slow machines it
+ only introduces lag, so choose carefully. The default is 3, which is
+ reasonable. You can set it between 2 and 5.
+
+mbufs
+ This is an integer between 1 and 10. It will tell the module the number of
+ buffers to reserve for mmap(), VIDIOCGMBUF, VIDIOCMCAPTURE and friends.
+ The default is 2, which is adequate for most applications (double
+ buffering).
+
+ Should you experience a lot of 'Dumping frame...' messages during
+ grabbing with a tool that uses mmap(), you might want to increase it.
+ However, it doesn't really buffer images, it just gives you a bit more
+ slack when your program is behind. But you need a multi-threaded or
+ forked program to really take advantage of these buffers.
+
+ The absolute maximum is 10, but don't set it too high! Every buffer takes
+ up 460 KB of RAM, so unless you have a lot of memory setting this to
+ something more than 4 is an absolute waste. This memory is only
+ allocated during open(), so nothing is wasted when the camera is not in
+ use.
+
+power_save
+ When power_save is enabled (set to 1), the module will try to shut down
+ the cam on close() and re-activate on open(). This will save power and
+ turn off the LED. Not all cameras support this though (the 645 and 646
+ don't have power saving at all), and some models don't work either (they
+ will shut down, but never wake up). Consider this experimental. By
+ default this option is disabled.
+
+compression (only useful with the plugin)
+ With this option you can control the compression factor that the camera
+ uses to squeeze the image through the USB bus. You can set the
+ parameter between 0 and 3::
+
+ 0 = prefer uncompressed images; if the requested mode is not available
+ in an uncompressed format, the driver will silently switch to low
+ compression.
+ 1 = low compression.
+ 2 = medium compression.
+ 3 = high compression.
+
+ High compression takes less bandwidth of course, but it could also
+ introduce some unwanted artefacts. The default is 2, medium compression.
+ See the FAQ on the website for an overview of which modes require
+ compression.
+
+ The compression parameter does not apply to the 645 and 646 cameras
+ and OEM models derived from those (only a few). Most cams honour this
+ parameter.
+
+leds
+ This setting takes 2 integers that define the on/off time for the LED
+ (in milliseconds). One of the interesting things that you can do with
+ this is let the LED blink while the camera is in use. This::
+
+ leds=500,500
+
+ will blink the LED once every second. But with::
+
+ leds=0,0
+
+ the LED never goes on, making it suitable for silent surveillance.
+
+ By default the camera's LED is on solid while in use, and turned off
+ when the camera is not used anymore.
+
+ This parameter works only with the ToUCam range of cameras (720, 730, 740,
+ 750) and OEMs. For other cameras this command is silently ignored, and
+ the LED cannot be controlled.
+
+ Finally: this parameter does not take effect UNTIL the first time you
+ open the camera device. Until then, the LED remains on.
+
+dev_hint
+ A long standing problem with USB devices is their dynamic nature: you
+ never know what device a camera gets assigned; it depends on module load
+ order, the hub configuration, the order in which devices are plugged in,
+ and the phase of the moon (i.e. it can be random). With this option you
+ can give the driver a hint as to what video device node (/dev/videoX) it
+ should use with a specific camera. This is also handy if you have two
+ cameras of the same model.
+
+ A camera is specified by its type (the number from the camera model,
+ like PCA645, PCVC750VC, etc) and optionally the serial number (visible
+ in /sys/kernel/debug/usb/devices). A hint consists of a string with the
+ following format::
+
+ [type[.serialnumber]:]node
+
+ The square brackets mean that both the type and the serialnumber are
+ optional, but a serialnumber cannot be specified without a type (which
+ would be rather pointless). The serialnumber is separated from the type
+ by a '.'; the node number by a ':'.
+
+ This somewhat cryptic syntax is best explained by a few examples::
+
+ dev_hint=3,5 The first detected cam gets assigned
+ /dev/video3, the second /dev/video5. Any
+ other cameras will get the first free
+ available slot (see below).
+
+ dev_hint=645:1,680:2 The PCA645 camera will get /dev/video1,
+ and a PCVC680 /dev/video2.
+
+ dev_hint=645.0123:3,645.4567:0 The PCA645 camera with serialnumber
+ 0123 goes to /dev/video3, the same
+ camera model with the 4567 serial
+ gets /dev/video0.
+
+ dev_hint=750:1,4,5,6 The PCVC750 camera will get /dev/video1, the
+ next 3 Philips cams will use /dev/video4
+ through /dev/video6.
+
+ Some points worth knowing:
+
+ - Serialnumbers are case sensitive and must be written in full, including
+ leading zeroes (it's treated as a string).
+ - If a device node is already occupied, registration will fail and
+ the webcam is not available.
+ - You can have up to 64 video devices; be sure to make enough device
+ nodes in /dev if you want to spread the numbers.
+ After /dev/video9 comes /dev/video10 (not /dev/videoA).
+ - If a camera does not match any dev_hint, it will simply get assigned
+ the first available device node, just as it used to be.
+
+trace
+ In order to better detect problems, it is now possible to turn on a
+ 'trace' of some of the calls the module makes; it logs all items in your
+ kernel log at debug level.
+
+ The trace variable is a bitmask; each bit represents a certain feature.
+ If you want to trace something, look up the bit value(s) in the table
+ below, add the values together and supply that to the trace variable.
+
+ ====== ======= ================================================ =======
+ Value  Value   Description                                      Default
+ (dec)  (hex)
+ ====== ======= ================================================ =======
+ 1      0x1     Module initialization; this will log messages    On
+                while loading and unloading the module
+
+ 2      0x2     probe() and disconnect() traces                  On
+
+ 4      0x4     Trace open() and close() calls                   Off
+
+ 8      0x8     read(), mmap() and associated ioctl() calls      Off
+
+ 16     0x10    Memory allocation of buffers, etc.               Off
+
+ 32     0x20    Showing underflow, overflow and Dumping frame    On
+                messages
+
+ 64     0x40    Show viewport and image sizes                    Off
+
+ 128    0x80    PWCX debugging                                   Off
+ ====== ======= ================================================ =======
+
+ For example, to trace the open() & read() functions, sum 8 + 4 = 12,
+ so you would supply trace=12 during insmod or modprobe. If
+ you want to turn the initialization and probing tracing off, set trace=0.
+ The default value for trace is 35 (0x23).
+
+
+
+Example::
+
+ # modprobe pwc size=cif fps=15 power_save=1
+
+The fbufs, mbufs and trace parameters are global and apply to all connected
+cameras. Each camera has its own set of buffers.
+
+size and fps only specify defaults when you open() the device; this is to
+accommodate some tools that don't set the size. You can change these
+settings after open() with the Video4Linux ioctl() calls. The default of
+defaults is QCIF size at 10 fps.
+
+The compression parameter is semiglobal; it sets the initial compression
+preference for all cameras, but this parameter can be set per camera with
+the VIDIOCPWCSCQUAL ioctl() call.
+
+All parameters are optional.
+
diff --git a/Documentation/admin-guide/media/platform-cardlist.rst b/Documentation/admin-guide/media/platform-cardlist.rst
new file mode 100644
index 000000000000..ac73c4166d1e
--- /dev/null
+++ b/Documentation/admin-guide/media/platform-cardlist.rst
@@ -0,0 +1,91 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Platform drivers
+================
+
+There are several drivers that are focused on providing support for
+functionality that is already included on the main board, and that use
+neither the USB nor the PCI bus. Those drivers are called platform
+drivers, and are very popular on embedded devices.
+
+The currently supported platform drivers (not including staging drivers) are
+listed below:
+
+================= ============================================================
+Driver            Name
+================= ============================================================
+am437x-vpfe       TI AM437x VPFE
+aspeed-video      Aspeed AST2400 and AST2500
+atmel-isc         ATMEL Image Sensor Controller (ISC)
+atmel-isi         ATMEL Image Sensor Interface (ISI)
+c8sectpfe         SDR platform devices
+c8sectpfe         SDR platform devices
+cafe_ccic         Marvell 88ALP01 (Cafe) CMOS Camera Controller
+cdns-csi2rx       Cadence MIPI-CSI2 RX Controller
+cdns-csi2tx       Cadence MIPI-CSI2 TX Controller
+coda-vpu          Chips&Media Coda multi-standard codec IP
+dm355_ccdc        TI DM355 CCDC video capture
+dm644x_ccdc       TI DM6446 CCDC video capture
+exynos-fimc-is    EXYNOS4x12 FIMC-IS (Imaging Subsystem)
+exynos-fimc-lite  EXYNOS FIMC-LITE camera interface
+exynos-gsc        Samsung Exynos G-Scaler
+exy               Samsung S5P/EXYNOS4 SoC series Camera Subsystem
+fsl-viu           Freescale VIU
+imx-pxp           i.MX Pixel Pipeline (PXP)
+isdf              TI DM365 ISIF video capture
+mmp_camera        Marvell Armada 610 integrated camera controller
+mtk_jpeg          Mediatek JPEG Codec
+mtk-mdp           Mediatek MDP
+mtk-vcodec-dec    Mediatek Video Codec
+mtk-vpu           Mediatek Video Processor Unit
+mx2_emmaprp       MX2 eMMa-PrP
+omap3-isp         OMAP 3 Camera
+omap-vout         OMAP2/OMAP3 V4L2-Display
+pxa_camera        PXA27x Quick Capture Interface
+qcom-camss        Qualcomm V4L2 Camera Subsystem
+rcar-csi2         R-Car MIPI CSI-2 Receiver
+rcar_drif         Renesas Digital Radio Interface (DRIF)
+rcar-fcp          Renesas Frame Compression Processor
+rcar_fdp1         Renesas Fine Display Processor
+rcar_jpu          Renesas JPEG Processing Unit
+rcar-vin          R-Car Video Input (VIN)
+renesas-ceu       Renesas Capture Engine Unit (CEU)
+rockchip-rga      Rockchip Raster 2d Graphic Acceleration Unit
+s3c-camif         Samsung S3C24XX/S3C64XX SoC Camera Interface
+s5p-csis          S5P/EXYNOS MIPI-CSI2 receiver (MIPI-CSIS)
+s5p-fimc          S5P/EXYNOS4 FIMC/CAMIF camera interface
+s5p-g2d           Samsung S5P and EXYNOS4 G2D 2d graphics accelerator
+s5p-jpeg          Samsung S5P/Exynos3250/Exynos4 JPEG codec
+s5p-mfc           Samsung S5P MFC Video Codec
+sh_veu            SuperH VEU mem2mem video processing
+sh_vou            SuperH VOU video output
+stm32-dcmi        STM32 Digital Camera Memory Interface (DCMI)
+stm32-dma2d       STM32 Chrom-Art Accelerator Unit
+sun4i-csi         Allwinner A10 CMOS Sensor Interface Support
+sun6i-csi         Allwinner V3s Camera Sensor Interface
+sun8i-di          Allwinner Deinterlace
+sun8i-rotate      Allwinner DE2 rotation
+ti-cal            TI Memory-to-memory multimedia devices
+ti-csc            TI DVB platform devices
+ti-vpe            TI VPE (Video Processing Engine)
+venus-enc         Qualcomm Venus V4L2 encoder/decoder
+via-camera        VIAFB camera controller
+video-mux         Video Multiplexer
+vpif_display      TI DaVinci VPIF V4L2-Display
+vpif_capture      TI DaVinci VPIF video capture
+vpss              TI DaVinci VPBE V4L2-Display
+vsp1              Renesas VSP1 Video Processing Engine
+xilinx-tpg        Xilinx Video Test Pattern Generator
+xilinx-video      Xilinx Video IP (EXPERIMENTAL)
+xilinx-vtc        Xilinx Video Timing Controller
+================= ============================================================
+
+MMC/SDIO DVB adapters
+---------------------
+
+======= ===========================================
+Driver  Name
+======= ===========================================
+smssdio Siano SMS1xxx based MDTV via SDIO interface
+======= ===========================================
+
diff --git a/Documentation/admin-guide/media/pulse8-cec.rst b/Documentation/admin-guide/media/pulse8-cec.rst
new file mode 100644
index 000000000000..356d08b519f3
--- /dev/null
+++ b/Documentation/admin-guide/media/pulse8-cec.rst
@@ -0,0 +1,13 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Pulse-Eight CEC Adapter driver
+==============================
+
+The pulse8-cec driver implements the following module option:
+
+``persistent_config``
+---------------------
+
+By default this is off, but when set to 1 the driver will store the current
+settings to the device's internal eeprom and restore it the next time the
+device is connected to the USB port.
diff --git a/Documentation/admin-guide/media/qcom_camss.rst b/Documentation/admin-guide/media/qcom_camss.rst
new file mode 100644
index 000000000000..a72e17d09cb7
--- /dev/null
+++ b/Documentation/admin-guide/media/qcom_camss.rst
@@ -0,0 +1,185 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+Qualcomm Camera Subsystem driver
+================================
+
+Introduction
+------------
+
+This file documents the Qualcomm Camera Subsystem driver located under
+drivers/media/platform/qcom/camss.
+
+The current version of the driver supports the Camera Subsystem found on
+Qualcomm MSM8916/APQ8016 and MSM8996/APQ8096 processors.
+
+The driver implements V4L2, Media controller and V4L2 subdev interfaces.
+Camera sensors using the V4L2 subdev interface in the kernel are supported.
+
+The driver is implemented using the Qualcomm Camera Subsystem driver for
+Android, as found in Code Aurora [#f1]_ [#f2]_, as a reference.
+
+
+Qualcomm Camera Subsystem hardware
+----------------------------------
+
+The Camera Subsystem hardware found on 8x16 / 8x96 processors and supported by
+the driver consists of:
+
+- 2 / 3 CSIPHY modules. They handle the Physical layer of the CSI2 receivers.
+ A separate camera sensor can be connected to each of the CSIPHY modules;
+- 2 / 4 CSID (CSI Decoder) modules. They handle the Protocol and Application
+ layer of the CSI2 receivers. A CSID can decode the data stream from any of the
+ CSIPHYs. Each CSID also contains a TG (Test Generator) block which can generate
+ artificial input data for test purposes;
+- ISPIF (ISP Interface) module. Handles the routing of the data streams from
+ the CSIDs to the inputs of the VFE;
+- 1 / 2 VFE (Video Front End) module(s). Contain a pipeline of image processing
+ hardware blocks. The VFE has different input interfaces. The PIX (Pixel) input
+ interface feeds the input data to the image processing pipeline. The image
+ processing pipeline also contains a scale and crop module at the end. Three
+ RDI (Raw Dump Interface) input interfaces bypass the image processing
+ pipeline. The VFE also contains the AXI bus interface which writes the output
+ data to memory.
+
+
+Supported functionality
+-----------------------
+
+The current version of the driver supports:
+
+- Input from camera sensor via CSIPHY;
+- Generation of test input data by the TG in CSID;
+- RDI interface of VFE
+
+ - Raw dump of the input data to memory.
+
+ Supported formats:
+
+ - YUYV/UYVY/YVYU/VYUY (packed YUV 4:2:2 - V4L2_PIX_FMT_YUYV /
+ V4L2_PIX_FMT_UYVY / V4L2_PIX_FMT_YVYU / V4L2_PIX_FMT_VYUY);
+ - MIPI RAW8 (8bit Bayer RAW - V4L2_PIX_FMT_SRGGB8 /
+ V4L2_PIX_FMT_SGRBG8 / V4L2_PIX_FMT_SGBRG8 / V4L2_PIX_FMT_SBGGR8);
+ - MIPI RAW10 (10bit packed Bayer RAW - V4L2_PIX_FMT_SBGGR10P /
+ V4L2_PIX_FMT_SGBRG10P / V4L2_PIX_FMT_SGRBG10P / V4L2_PIX_FMT_SRGGB10P /
+ V4L2_PIX_FMT_Y10P);
+ - MIPI RAW12 (12bit packed Bayer RAW - V4L2_PIX_FMT_SBGGR12P /
+ V4L2_PIX_FMT_SGBRG12P / V4L2_PIX_FMT_SGRBG12P / V4L2_PIX_FMT_SRGGB12P).
+ - (8x96 only) MIPI RAW14 (14bit packed Bayer RAW - V4L2_PIX_FMT_SBGGR14P /
+ V4L2_PIX_FMT_SGBRG14P / V4L2_PIX_FMT_SGRBG14P / V4L2_PIX_FMT_SRGGB14P).
+
+ - (8x96 only) Format conversion of the input data.
+
+ Supported input formats:
+
+ - MIPI RAW10 (10bit packed Bayer RAW - V4L2_PIX_FMT_SBGGR10P / V4L2_PIX_FMT_Y10P).
+
+ Supported output formats:
+
+ - Plain16 RAW10 (10bit unpacked Bayer RAW - V4L2_PIX_FMT_SBGGR10 / V4L2_PIX_FMT_Y10).
+
+- PIX interface of VFE
+
+ - Format conversion of the input data.
+
+ Supported input formats:
+
+ - YUYV/UYVY/YVYU/VYUY (packed YUV 4:2:2 - V4L2_PIX_FMT_YUYV /
+ V4L2_PIX_FMT_UYVY / V4L2_PIX_FMT_YVYU / V4L2_PIX_FMT_VYUY).
+
+ Supported output formats:
+
+ - NV12/NV21 (two plane YUV 4:2:0 - V4L2_PIX_FMT_NV12 / V4L2_PIX_FMT_NV21);
+ - NV16/NV61 (two plane YUV 4:2:2 - V4L2_PIX_FMT_NV16 / V4L2_PIX_FMT_NV61).
+ - (8x96 only) YUYV/UYVY/YVYU/VYUY (packed YUV 4:2:2 - V4L2_PIX_FMT_YUYV /
+ V4L2_PIX_FMT_UYVY / V4L2_PIX_FMT_YVYU / V4L2_PIX_FMT_VYUY).
+
+ - Scaling support. Configuration of the VFE Encoder Scale module
+ for downscaling with ratio up to 16x.
+
+ - Cropping support. Configuration of the VFE Encoder Crop module.
+
+- Concurrent and independent usage of two (8x96: three) data inputs -
+ could be camera sensors and/or TG.
+
+
+Driver Architecture and Design
+------------------------------
+
+The driver implements the V4L2 subdev interface. With the goal to model the
+hardware links between the modules and to expose a clean, logical and usable
+interface, the driver is split into V4L2 sub-devices as follows (8x16 / 8x96):
+
+- 2 / 3 CSIPHY sub-devices - each CSIPHY is represented by a single sub-device;
+- 2 / 4 CSID sub-devices - each CSID is represented by a single sub-device;
+- 2 / 4 ISPIF sub-devices - ISPIF is represented by a number of sub-devices
+ equal to the number of CSID sub-devices;
+- 4 / 8 VFE sub-devices - VFE is represented by a number of sub-devices equal to
+ the number of the input interfaces (3 RDI and 1 PIX for each VFE).
+
+The considerations to split the driver in this particular way are as follows:
+
+- representing CSIPHY and CSID modules by a separate sub-device for each module
+ allows modelling the hardware links between these modules;
+- representing VFE by a separate sub-device for each input interface allows
+ using the input interfaces concurrently and independently, as this is
+ supported by the hardware;
+- representing ISPIF by a number of sub-devices equal to the number of CSID
+ sub-devices allows creating linear media controller pipelines when using two
+ cameras simultaneously. This avoids branches in the pipelines which otherwise
+ would require a) userspace and b) media framework (e.g. power on/off
+ operations) to make assumptions about the data flow from a sink pad to a
+ source pad on a single media entity.
+
+Each VFE sub-device is linked to a separate video device node.
+
+The media controller pipeline graph is as follows (with two / three
+OV5645 camera sensors connected):
+
+.. _qcom_camss_graph:
+
+.. kernel-figure:: qcom_camss_graph.dot
+ :alt: qcom_camss_graph.dot
+ :align: center
+
+ Media pipeline graph 8x16
+
+.. kernel-figure:: qcom_camss_8x96_graph.dot
+ :alt: qcom_camss_8x96_graph.dot
+ :align: center
+
+ Media pipeline graph 8x96
+
+
+Implementation
+--------------
+
+Runtime configuration of the hardware (updating settings while streaming) is
+not required to implement the currently supported functionality. The complete
+configuration on each hardware module is applied on STREAMON ioctl based on
+the current active media links, formats and controls set.
+
+The output size of the scaler module in the VFE is configured with the actual
+compose selection rectangle on the sink pad of the 'msm_vfe0_pix' entity.
+
+The crop output area of the crop module in the VFE is configured with the actual
+crop selection rectangle on the source pad of the 'msm_vfe0_pix' entity.
+
+
+Documentation
+-------------
+
+APQ8016 Specification:
+https://developer.qualcomm.com/download/sd410/snapdragon-410-processor-device-specification.pdf
+Referenced 2016-11-24.
+
+APQ8096 Specification:
+https://developer.qualcomm.com/download/sd820e/qualcomm-snapdragon-820e-processor-apq8096sge-device-specification.pdf
+Referenced 2018-06-22.
+
+References
+----------
+
+.. [#f1] https://source.codeaurora.org/quic/la/kernel/msm-3.10/
+.. [#f2] https://source.codeaurora.org/quic/la/kernel/msm-3.18/
diff --git a/Documentation/admin-guide/media/qcom_camss_8x96_graph.dot b/Documentation/admin-guide/media/qcom_camss_8x96_graph.dot
new file mode 100644
index 000000000000..7ed243b41b67
--- /dev/null
+++ b/Documentation/admin-guide/media/qcom_camss_8x96_graph.dot
@@ -0,0 +1,106 @@
+# SPDX-License-Identifier: GPL-2.0
+
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{<port0> 0} | msm_csiphy0\n/dev/v4l-subdev0 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port1 -> n0000000a:port0 [style=dashed]
+ n00000001:port1 -> n0000000d:port0 [style=dashed]
+ n00000001:port1 -> n00000010:port0 [style=dashed]
+ n00000001:port1 -> n00000013:port0 [style=dashed]
+ n00000004 [label="{{<port0> 0} | msm_csiphy1\n/dev/v4l-subdev1 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000004:port1 -> n0000000a:port0 [style=dashed]
+ n00000004:port1 -> n0000000d:port0 [style=dashed]
+ n00000004:port1 -> n00000010:port0 [style=dashed]
+ n00000004:port1 -> n00000013:port0 [style=dashed]
+ n00000007 [label="{{<port0> 0} | msm_csiphy2\n/dev/v4l-subdev2 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000007:port1 -> n0000000a:port0 [style=dashed]
+ n00000007:port1 -> n0000000d:port0 [style=dashed]
+ n00000007:port1 -> n00000010:port0 [style=dashed]
+ n00000007:port1 -> n00000013:port0 [style=dashed]
+ n0000000a [label="{{<port0> 0} | msm_csid0\n/dev/v4l-subdev3 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000a:port1 -> n00000016:port0 [style=dashed]
+ n0000000a:port1 -> n00000019:port0 [style=dashed]
+ n0000000a:port1 -> n0000001c:port0 [style=dashed]
+ n0000000a:port1 -> n0000001f:port0 [style=dashed]
+ n0000000d [label="{{<port0> 0} | msm_csid1\n/dev/v4l-subdev4 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000d:port1 -> n00000016:port0 [style=dashed]
+ n0000000d:port1 -> n00000019:port0 [style=dashed]
+ n0000000d:port1 -> n0000001c:port0 [style=dashed]
+ n0000000d:port1 -> n0000001f:port0 [style=dashed]
+ n00000010 [label="{{<port0> 0} | msm_csid2\n/dev/v4l-subdev5 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000010:port1 -> n00000016:port0 [style=dashed]
+ n00000010:port1 -> n00000019:port0 [style=dashed]
+ n00000010:port1 -> n0000001c:port0 [style=dashed]
+ n00000010:port1 -> n0000001f:port0 [style=dashed]
+ n00000013 [label="{{<port0> 0} | msm_csid3\n/dev/v4l-subdev6 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000013:port1 -> n00000016:port0 [style=dashed]
+ n00000013:port1 -> n00000019:port0 [style=dashed]
+ n00000013:port1 -> n0000001c:port0 [style=dashed]
+ n00000013:port1 -> n0000001f:port0 [style=dashed]
+ n00000016 [label="{{<port0> 0} | msm_ispif0\n/dev/v4l-subdev7 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000016:port1 -> n00000022:port0 [style=dashed]
+ n00000016:port1 -> n0000002b:port0 [style=dashed]
+ n00000016:port1 -> n00000034:port0 [style=dashed]
+ n00000016:port1 -> n0000003d:port0 [style=dashed]
+ n00000016:port1 -> n00000046:port0 [style=dashed]
+ n00000016:port1 -> n0000004f:port0 [style=dashed]
+ n00000016:port1 -> n00000058:port0 [style=dashed]
+ n00000016:port1 -> n00000061:port0 [style=dashed]
+ n00000019 [label="{{<port0> 0} | msm_ispif1\n/dev/v4l-subdev8 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000019:port1 -> n00000022:port0 [style=dashed]
+ n00000019:port1 -> n0000002b:port0 [style=dashed]
+ n00000019:port1 -> n00000034:port0 [style=dashed]
+ n00000019:port1 -> n0000003d:port0 [style=dashed]
+ n00000019:port1 -> n00000046:port0 [style=dashed]
+ n00000019:port1 -> n0000004f:port0 [style=dashed]
+ n00000019:port1 -> n00000058:port0 [style=dashed]
+ n00000019:port1 -> n00000061:port0 [style=dashed]
+ n0000001c [label="{{<port0> 0} | msm_ispif2\n/dev/v4l-subdev9 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000001c:port1 -> n00000022:port0 [style=dashed]
+ n0000001c:port1 -> n0000002b:port0 [style=dashed]
+ n0000001c:port1 -> n00000034:port0 [style=dashed]
+ n0000001c:port1 -> n0000003d:port0 [style=dashed]
+ n0000001c:port1 -> n00000046:port0 [style=dashed]
+ n0000001c:port1 -> n0000004f:port0 [style=dashed]
+ n0000001c:port1 -> n00000058:port0 [style=dashed]
+ n0000001c:port1 -> n00000061:port0 [style=dashed]
+ n0000001f [label="{{<port0> 0} | msm_ispif3\n/dev/v4l-subdev10 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000001f:port1 -> n00000022:port0 [style=dashed]
+ n0000001f:port1 -> n0000002b:port0 [style=dashed]
+ n0000001f:port1 -> n00000034:port0 [style=dashed]
+ n0000001f:port1 -> n0000003d:port0 [style=dashed]
+ n0000001f:port1 -> n00000046:port0 [style=dashed]
+ n0000001f:port1 -> n0000004f:port0 [style=dashed]
+ n0000001f:port1 -> n00000058:port0 [style=dashed]
+ n0000001f:port1 -> n00000061:port0 [style=dashed]
+ n00000022 [label="{{<port0> 0} | msm_vfe0_rdi0\n/dev/v4l-subdev11 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000022:port1 -> n00000025 [style=bold]
+ n00000025 [label="msm_vfe0_video0\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n0000002b [label="{{<port0> 0} | msm_vfe0_rdi1\n/dev/v4l-subdev12 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000002b:port1 -> n0000002e [style=bold]
+ n0000002e [label="msm_vfe0_video1\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
+ n00000034 [label="{{<port0> 0} | msm_vfe0_rdi2\n/dev/v4l-subdev13 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000034:port1 -> n00000037 [style=bold]
+ n00000037 [label="msm_vfe0_video2\n/dev/video2", shape=box, style=filled, fillcolor=yellow]
+ n0000003d [label="{{<port0> 0} | msm_vfe0_pix\n/dev/v4l-subdev14 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000003d:port1 -> n00000040 [style=bold]
+ n00000040 [label="msm_vfe0_video3\n/dev/video3", shape=box, style=filled, fillcolor=yellow]
+ n00000046 [label="{{<port0> 0} | msm_vfe1_rdi0\n/dev/v4l-subdev15 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000046:port1 -> n00000049 [style=bold]
+ n00000049 [label="msm_vfe1_video0\n/dev/video4", shape=box, style=filled, fillcolor=yellow]
+ n0000004f [label="{{<port0> 0} | msm_vfe1_rdi1\n/dev/v4l-subdev16 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000004f:port1 -> n00000052 [style=bold]
+ n00000052 [label="msm_vfe1_video1\n/dev/video5", shape=box, style=filled, fillcolor=yellow]
+ n00000058 [label="{{<port0> 0} | msm_vfe1_rdi2\n/dev/v4l-subdev17 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000058:port1 -> n0000005b [style=bold]
+ n0000005b [label="msm_vfe1_video2\n/dev/video6", shape=box, style=filled, fillcolor=yellow]
+ n00000061 [label="{{<port0> 0} | msm_vfe1_pix\n/dev/v4l-subdev18 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000061:port1 -> n00000064 [style=bold]
+ n00000064 [label="msm_vfe1_video3\n/dev/video7", shape=box, style=filled, fillcolor=yellow]
+ n000000e2 [label="{{} | ov5645 3-0039\n/dev/v4l-subdev19 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n000000e2:port0 -> n00000004:port0 [style=bold]
+ n000000e4 [label="{{} | ov5645 3-003a\n/dev/v4l-subdev20 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n000000e4:port0 -> n00000007:port0 [style=bold]
+ n000000e6 [label="{{} | ov5645 3-003b\n/dev/v4l-subdev21 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n000000e6:port0 -> n00000001:port0 [style=bold]
+}
diff --git a/Documentation/admin-guide/media/qcom_camss_graph.dot b/Documentation/admin-guide/media/qcom_camss_graph.dot
new file mode 100644
index 000000000000..ef7dca92fd0b
--- /dev/null
+++ b/Documentation/admin-guide/media/qcom_camss_graph.dot
@@ -0,0 +1,43 @@
+# SPDX-License-Identifier: GPL-2.0
+
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{<port0> 0} | msm_csiphy0\n/dev/v4l-subdev0 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port1 -> n00000007:port0 [style=dashed]
+ n00000001:port1 -> n0000000a:port0 [style=dashed]
+ n00000004 [label="{{<port0> 0} | msm_csiphy1\n/dev/v4l-subdev1 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000004:port1 -> n00000007:port0 [style=dashed]
+ n00000004:port1 -> n0000000a:port0 [style=dashed]
+ n00000007 [label="{{<port0> 0} | msm_csid0\n/dev/v4l-subdev2 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000007:port1 -> n0000000d:port0 [style=dashed]
+ n00000007:port1 -> n00000010:port0 [style=dashed]
+ n0000000a [label="{{<port0> 0} | msm_csid1\n/dev/v4l-subdev3 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000a:port1 -> n0000000d:port0 [style=dashed]
+ n0000000a:port1 -> n00000010:port0 [style=dashed]
+ n0000000d [label="{{<port0> 0} | msm_ispif0\n/dev/v4l-subdev4 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000d:port1 -> n00000013:port0 [style=dashed]
+ n0000000d:port1 -> n0000001c:port0 [style=dashed]
+ n0000000d:port1 -> n00000025:port0 [style=dashed]
+ n0000000d:port1 -> n0000002e:port0 [style=dashed]
+ n00000010 [label="{{<port0> 0} | msm_ispif1\n/dev/v4l-subdev5 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000010:port1 -> n00000013:port0 [style=dashed]
+ n00000010:port1 -> n0000001c:port0 [style=dashed]
+ n00000010:port1 -> n00000025:port0 [style=dashed]
+ n00000010:port1 -> n0000002e:port0 [style=dashed]
+ n00000013 [label="{{<port0> 0} | msm_vfe0_rdi0\n/dev/v4l-subdev6 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000013:port1 -> n00000016 [style=bold]
+ n00000016 [label="msm_vfe0_video0\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n0000001c [label="{{<port0> 0} | msm_vfe0_rdi1\n/dev/v4l-subdev7 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000001c:port1 -> n0000001f [style=bold]
+ n0000001f [label="msm_vfe0_video1\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
+ n00000025 [label="{{<port0> 0} | msm_vfe0_rdi2\n/dev/v4l-subdev8 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000025:port1 -> n00000028 [style=bold]
+ n00000028 [label="msm_vfe0_video2\n/dev/video2", shape=box, style=filled, fillcolor=yellow]
+ n0000002e [label="{{<port0> 0} | msm_vfe0_pix\n/dev/v4l-subdev9 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000002e:port1 -> n00000031 [style=bold]
+ n00000031 [label="msm_vfe0_video3\n/dev/video3", shape=box, style=filled, fillcolor=yellow]
+ n00000057 [label="{{} | ov5645 1-0076\n/dev/v4l-subdev10 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000057:port0 -> n00000001:port0 [style=bold]
+ n00000059 [label="{{} | ov5645 1-0074\n/dev/v4l-subdev11 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000059:port0 -> n00000004:port0 [style=bold]
+}
diff --git a/Documentation/admin-guide/media/radio-cardlist.rst b/Documentation/admin-guide/media/radio-cardlist.rst
new file mode 100644
index 000000000000..a82a146bf912
--- /dev/null
+++ b/Documentation/admin-guide/media/radio-cardlist.rst
@@ -0,0 +1,44 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Radio drivers
+=============
+
+There is also support for pure AM/FM radio, and even for some FM radio
+transmitters:
+
+===================== =========================================================
+Driver                Name
+===================== =========================================================
+si4713                Silicon Labs Si4713 FM Radio Transmitter
+radio-aztech          Aztech/Packard Bell Radio
+radio-cadet           ADS Cadet AM/FM Tuner
+radio-gemtek          GemTek Radio card (or compatible)
+radio-maxiradio       Guillemot MAXI Radio FM 2000 radio
+radio-miropcm20       miroSOUND PCM20 radio
+radio-aimslab         AIMSlab RadioTrack (aka RadioReveal)
+radio-rtrack2         AIMSlab RadioTrack II
+saa7706h              SAA7706H Car Radio DSP
+radio-sf16fmi         SF16-FMI/SF16-FMP/SF16-FMD Radio
+radio-sf16fmr2        SF16-FMR2/SF16-FMD2 Radio
+radio-shark           Griffin radioSHARK USB radio receiver
+shark2                Griffin radioSHARK2 USB radio receiver
+radio-si470x-common   Silicon Labs Si470x FM Radio Receiver
+radio-si476x          Silicon Laboratories Si476x I2C FM Radio
+radio-tea5764         TEA5764 I2C FM radio
+tef6862               TEF6862 Car Radio Enhanced Selectivity Tuner
+radio-terratec        TerraTec ActiveRadio ISA Standalone
+radio-timb            Enable the Timberdale radio driver
+radio-trust           Trust FM radio card
+radio-typhoon         Typhoon Radio (a.k.a. EcoRadio)
+radio-wl1273          Texas Instruments WL1273 I2C FM Radio
+fm_drv                ISA radio devices
+fm_drv                ISA radio devices
+radio-zoltrix         Zoltrix Radio
+dsbr100               D-Link/GemTek USB FM radio
+radio-keene           Keene FM Transmitter USB
+radio-ma901           Masterkit MA901 USB FM radio
+radio-mr800           AverMedia MR 800 USB FM radio
+radio-raremono        Thanko's Raremono AM/FM/SW radio
+radio-si470x-usb      Silicon Labs Si470x FM Radio Receiver support with USB
+radio-usb-si4713      Silicon Labs Si4713 FM Radio Transmitter support with USB
+===================== =========================================================
diff --git a/Documentation/admin-guide/media/rcar-fdp1.rst b/Documentation/admin-guide/media/rcar-fdp1.rst
new file mode 100644
index 000000000000..88b0edcf9046
--- /dev/null
+++ b/Documentation/admin-guide/media/rcar-fdp1.rst
@@ -0,0 +1,39 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Renesas R-Car Fine Display Processor (FDP1) Driver
+==================================================
+
+The R-Car FDP1 driver implements driver-specific controls as follows.
+
+``V4L2_CID_DEINTERLACING_MODE (menu)``
+ The video deinterlacing mode (such as Bob, Weave, ...). The R-Car FDP1
+ driver implements the following modes.
+
+.. flat-table::
+ :header-rows: 0
+ :stub-columns: 0
+ :widths: 1 4
+
+ * - ``"Progressive" (0)``
+ - The input image video stream is progressive (not interlaced). No
+ deinterlacing is performed. Apart from (optional) format and encoding
+ conversion, output frames are identical to the input frames.
+ * - ``"Adaptive 2D/3D" (1)``
+ - Motion adaptive version of 2D and 3D deinterlacing. Use 3D deinterlacing
+ in the presence of fast motion and 2D deinterlacing with diagonal
+ interpolation otherwise.
+ * - ``"Fixed 2D" (2)``
+ - The current field is scaled vertically by averaging adjacent lines to
+ recover missing lines. This method is also known as blending or Line
+ Averaging (LAV).
+ * - ``"Fixed 3D" (3)``
+ - The previous and next fields are averaged to recover lines missing from
+ the current field. This method is also known as Field Averaging (FAV).
+ * - ``"Previous field" (4)``
+ - The current field is weaved with the previous field, i.e. the previous
+ field is used to fill missing lines from the current field. This method
+ is also known as weave deinterlacing.
+ * - ``"Next field" (5)``
+ - The current field is weaved with the next field, i.e. the next field is
+ used to fill missing lines from the current field. This method is also
+ known as weave deinterlacing.
diff --git a/Documentation/admin-guide/media/remote-controller.rst b/Documentation/admin-guide/media/remote-controller.rst
new file mode 100644
index 000000000000..188944b00f4f
--- /dev/null
+++ b/Documentation/admin-guide/media/remote-controller.rst
@@ -0,0 +1,76 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================================
+Infrared remote control support in video4linux drivers
+======================================================
+
+Authors: Gerd Hoffmann, Mauro Carvalho Chehab
+
+Basics
+======
+
+Most analog and digital TV boards support remote controllers. Several of
+them have a microprocessor that receives the IR carriers, converts them into
+pulse/space sequences and then into scan codes, and returns these codes to
+userspace ("scancode mode"). Other boards return just the pulse/space
+sequences ("raw mode").
+
+Support for remote controllers in scancode mode is provided by the
+standard Linux input layer. Support for raw mode is provided via LIRC.
+
+In order to check the support and test it, it is suggested to download
+the `v4l-utils <https://git.linuxtv.org/v4l-utils.git/>`_. It provides
+two tools to handle remote controllers:
+
+- ir-keytable: provides a way to query the remote controller, list the
+ protocols it supports, enable in-kernel support for IR decoder or
+ switch the protocol and to test the reception of scan codes;
+
+- ir-ctl: provides tools to handle remote controllers that support raw mode
+ via the LIRC interface.
+
+Usually, the remote controller module is auto-loaded when the TV card is
+detected. However, for a few devices, you need to manually load the
+ir-kbd-i2c module.
+
+How it works
+============
+
+The modules register the remote as a keyboard within the Linux input
+layer, i.e. you'll see the keys of the remote as normal key strokes
+(if CONFIG_INPUT_KEYBOARD is enabled).
+
+Using the event devices (CONFIG_INPUT_EVDEV) it is possible for
+applications to access the remote via /dev/input/event<n> devices.
+udev/systemd will automatically create the devices. If you install
+the `v4l-utils <https://git.linuxtv.org/v4l-utils.git/>`_, it may also
+automatically load a different keytable than the default one. Please see
+`v4l-utils <https://git.linuxtv.org/v4l-utils.git/>`_ ir-keytable.1
+man page for details.
+
+The ir-keytable tool is useful for troubleshooting, i.e. to check
+whether the input device is really present, which of the devices it
+is, whether pressing keys on the remote actually generates
+events and the like. You can also use any other input utility that changes
+the keymaps, like the input kbd utility.
+
+
+Using with lircd
+----------------
+
+The latest versions of the lircd daemon support reading events from the
+Linux input layer (via event device). They also support receiving IR codes
+in lirc mode.
+
+
+Using without lircd
+-------------------
+
+Xorg recognizes several IR keycodes whose numerical value is lower
+than 247. With the advent of Wayland, the input driver got updated too,
+and should now accept all keycodes. Yet, you may want to just reassign
+the keycodes to something that your favorite media application likes.
+
+This can be done by setting
+`v4l-utils <https://git.linuxtv.org/v4l-utils.git/>`_ to load your own
+keytable at runtime. Please read the ir-keytable.1 man page for details.
diff --git a/Documentation/admin-guide/media/rkisp1.dot b/Documentation/admin-guide/media/rkisp1.dot
new file mode 100644
index 000000000000..54c1953a6130
--- /dev/null
+++ b/Documentation/admin-guide/media/rkisp1.dot
@@ -0,0 +1,18 @@
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{<port0> 0 | <port1> 1} | rkisp1_isp\n/dev/v4l-subdev0 | {<port2> 2 | <port3> 3}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port2 -> n00000006:port0
+ n00000001:port2 -> n00000009:port0
+ n00000001:port3 -> n00000014 [style=bold]
+ n00000006 [label="{{<port0> 0} | rkisp1_resizer_mainpath\n/dev/v4l-subdev1 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000006:port1 -> n0000000c [style=bold]
+ n00000009 [label="{{<port0> 0} | rkisp1_resizer_selfpath\n/dev/v4l-subdev2 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000009:port1 -> n00000010 [style=bold]
+ n0000000c [label="rkisp1_mainpath\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n00000010 [label="rkisp1_selfpath\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
+ n00000014 [label="rkisp1_stats\n/dev/video2", shape=box, style=filled, fillcolor=yellow]
+ n00000018 [label="rkisp1_params\n/dev/video3", shape=box, style=filled, fillcolor=yellow]
+ n00000018 -> n00000001:port1 [style=bold]
+ n0000001c [label="{{} | imx219 4-0010\n/dev/v4l-subdev3 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000001c:port0 -> n00000001:port0
+}
diff --git a/Documentation/admin-guide/media/rkisp1.rst b/Documentation/admin-guide/media/rkisp1.rst
new file mode 100644
index 000000000000..ccf418713623
--- /dev/null
+++ b/Documentation/admin-guide/media/rkisp1.rst
@@ -0,0 +1,197 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+=========================================
+Rockchip Image Signal Processor (rkisp1)
+=========================================
+
+Introduction
+============
+
+This file documents the driver for the Rockchip ISP1 that is part of RK3288
+and RK3399 SoCs. The driver is located under drivers/staging/media/rkisp1
+and uses the Media-Controller API.
+
+Revisions
+=========
+
+There exist multiple smaller revisions to this ISP that got introduced in
+later SoCs. Revisions can be found in the enum :c:type:`rkisp1_cif_isp_version`
+in the UAPI and the revision of the ISP inside the running SoC can be read
+in the field hw_revision of struct media_device_info as returned by
+ioctl MEDIA_IOC_DEVICE_INFO.
+
+Versions in use are:
+
+- RKISP1_V10: used at least in rk3288 and rk3399
+- RKISP1_V11: declared in the original vendor code, but not used
+- RKISP1_V12: used at least in rk3326 and px30
+- RKISP1_V13: used at least in rk1808
+
+Topology
+========
+.. _rkisp1_topology_graph:
+
+.. kernel-figure:: rkisp1.dot
+ :alt: Diagram of the default media pipeline topology
+ :align: center
+
+
+The driver has 4 video devices:
+
+- rkisp1_mainpath: capture device for retrieving images, usually in higher
+ resolution.
+- rkisp1_selfpath: capture device for retrieving images.
+- rkisp1_stats: a metadata capture device that sends statistics.
+- rkisp1_params: a metadata output device that receives parameter
+ configurations from userspace.
+
+The driver has 3 subdevices:
+
+- rkisp1_resizer_mainpath: used to resize and downsample frames for the
+ mainpath capture device.
+- rkisp1_resizer_selfpath: used to resize and downsample frames for the
+ selfpath capture device.
+- rkisp1_isp: is connected to the sensor and is responsible for all the isp
+ operations.
+
+
+rkisp1_mainpath, rkisp1_selfpath - Frames Capture Video Nodes
+-------------------------------------------------------------
+These are the `mainpath` and `selfpath` capture devices used to capture frames.
+These entities are the DMA engines that write the frames to memory.
+The selfpath video device can capture YUV/RGB formats. Its input is a YUV encoded
+stream and it is able to convert it to RGB. The selfpath is not able to
+capture bayer formats.
+The mainpath can capture both bayer and YUV formats but it is not able to
+capture RGB formats.
+Both capture video devices support
+the ``V4L2_CAP_IO_MC`` :ref:`capability <device-capabilities>`.
+
+
+rkisp1_resizer_mainpath, rkisp1_resizer_selfpath - Resizers Subdevices Nodes
+----------------------------------------------------------------------------
+These are the resizer entities for the mainpath and the selfpath. These entities
+can scale the frames up and down and also change the YUV sampling (for example
+YUV4:2:2 -> YUV4:2:0). They also have cropping capability on the sink pad.
+The resizer entities can only operate on the YUV 4:2:2 format
+(MEDIA_BUS_FMT_YUYV8_2X8).
+The mainpath capture device supports capturing video in bayer formats. In that
+case the resizer of the mainpath is set to 'bypass' mode - it just forwards the
+frame without operating on it.
+
+rkisp1_isp - Image Signal Processing Subdevice Node
+---------------------------------------------------
+This is the isp entity. It is connected to the sensor on sink pad 0 and
+receives the frames using the CSI-2 protocol. It is responsible for configuring
+the CSI-2 protocol. It has a cropping capability on sink pad 0 that is
+connected to the sensor and on source pad 2 connected to the resizer entities.
+Cropping on sink pad 0 defines the image region from the sensor.
+Cropping on source pad 2 defines the region for the Image Stabilizer (IS).
+
+.. _rkisp1_stats:
+
+rkisp1_stats - Statistics Video Node
+------------------------------------
+The statistics video node outputs the 3A (auto focus, auto exposure and auto
+white balance) statistics, and also histogram statistics for the frames that
+are being processed by the rkisp1 to userspace applications.
+Using these data, applications can implement algorithms and re-parameterize
+the driver through the rkisp1_params node to improve image quality during a
+video stream.
+The buffer format is defined by struct :c:type:`rkisp1_stat_buffer`, and
+userspace should set
+:ref:`V4L2_META_FMT_RK_ISP1_STAT_3A <v4l2-meta-fmt-rk-isp1-stat-3a>` as the
+dataformat.
+
+.. _rkisp1_params:
+
+rkisp1_params - Parameters Video Node
+-------------------------------------
+The rkisp1_params video node receives a set of parameters from userspace
+to be applied to the hardware during a video stream, allowing userspace
+to dynamically modify values such as black level, cross talk corrections
+and others.
+
+The buffer format is defined by struct :c:type:`rkisp1_params_cfg`, and
+userspace should set
+:ref:`V4L2_META_FMT_RK_ISP1_PARAMS <v4l2-meta-fmt-rk-isp1-params>` as the
+dataformat.
+
+
+Capturing Video Frames Example
+==============================
+
+In the following example, the sensor connected to pad 0 of 'rkisp1_isp' is
+imx219.
+
+The following commands can be used to capture video from the selfpath video
+node with dimensions 900x800 in planar YUV 4:2:2 format. The example uses all
+available cropping capabilities (see the explanation right below):
+
+.. code-block:: bash
+
+ # set the links
+ "media-ctl" "-d" "platform:rkisp1" "-r"
+ "media-ctl" "-d" "platform:rkisp1" "-l" "'imx219 4-0010':0 -> 'rkisp1_isp':0 [1]"
+ "media-ctl" "-d" "platform:rkisp1" "-l" "'rkisp1_isp':2 -> 'rkisp1_resizer_selfpath':0 [1]"
+ "media-ctl" "-d" "platform:rkisp1" "-l" "'rkisp1_isp':2 -> 'rkisp1_resizer_mainpath':0 [0]"
+
+ # set format for imx219 4-0010:0
+ "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"imx219 4-0010":0 [fmt:SRGGB10_1X10/1640x1232]'
+
+ # set format for rkisp1_isp pads:
+ "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"rkisp1_isp":0 [fmt:SRGGB10_1X10/1640x1232 crop: (0,0)/1600x1200]'
+ "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"rkisp1_isp":2 [fmt:YUYV8_2X8/1600x1200 crop: (0,0)/1500x1100]'
+
+ # set format for rkisp1_resizer_selfpath pads:
+ "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"rkisp1_resizer_selfpath":0 [fmt:YUYV8_2X8/1500x1100 crop: (300,400)/1400x1000]'
+ "media-ctl" "-d" "platform:rkisp1" "--set-v4l2" '"rkisp1_resizer_selfpath":1 [fmt:YUYV8_2X8/900x800]'
+
+ # set format for rkisp1_selfpath:
+ "v4l2-ctl" "-z" "platform:rkisp1" "-d" "rkisp1_selfpath" "-v" "width=900,height=800,"
+ "v4l2-ctl" "-z" "platform:rkisp1" "-d" "rkisp1_selfpath" "-v" "pixelformat=422P"
+
+ # start streaming:
+ v4l2-ctl "-z" "platform:rkisp1" "-d" "rkisp1_selfpath" "--stream-mmap" "--stream-count" "10"
+
+
+In the above example the sensor is configured to bayer format:
+`SRGGB10_1X10/1640x1232`. The rkisp1_isp:0 pad should be configured to the
+same mbus format and dimensions as the sensor, otherwise streaming will fail
+with 'EPIPE' error. So it is also configured to `SRGGB10_1X10/1640x1232`.
+In addition, the rkisp1_isp:0 pad is configured to cropping `(0,0)/1600x1200`.
+
+The cropping dimensions are automatically propagated to be the format of the
+isp source pad `rkisp1_isp:2`. Another cropping operation is configured on
+the isp source pad: `(0,0)/1500x1100`.
+
+The resizer's sink pad `rkisp1_resizer_selfpath:0` should be configured to format
+`YUYV8_2X8/1500x1100` in order to match the format on the other side of the
+link. In addition a cropping `(300,400)/1400x1000` is configured on it.
+
+The source pad of the resizer, `rkisp1_resizer_selfpath:1` is configured to
+format `YUYV8_2X8/900x800`. That means that the resizer first crops a window
+of `(300,400)/1400x1000` from the received frame and then scales this window
+to dimension `900x800`.
+
+Note that the above example does not use the stats-params control loop.
+Therefore the captured frames will not go through the 3A algorithms and
+probably won't have good quality, and can even look dark and greenish.
+
+Configuring Quantization
+========================
+
+The driver supports limited and full range quantization on YUV formats,
+where limited is the default.
+To switch between the two, userspace should use the Colorspace
+Conversion API (CSC) for subdevices on source pad 2 of the
+isp (`rkisp1_isp:2`). The quantization configured on this pad is the
+quantization of the captured video frames on the mainpath and selfpath
+video nodes.
+Note that the resizer and capture entities will always report
+``V4L2_QUANTIZATION_DEFAULT`` even if the quantization is configured to full
+range on `rkisp1_isp:2`. So in order to get the configured quantization,
+applications should get it from pad `rkisp1_isp:2`.
+
diff --git a/Documentation/admin-guide/media/saa7134-cardlist.rst b/Documentation/admin-guide/media/saa7134-cardlist.rst
new file mode 100644
index 000000000000..3ef8fab6bcad
--- /dev/null
+++ b/Documentation/admin-guide/media/saa7134-cardlist.rst
@@ -0,0 +1,803 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+SAA7134 cards list
+==================
+
+.. tabularcolumns:: |p{1.4cm}|p{11.1cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - PCI subsystem IDs
+
+ * - 0
+ - UNKNOWN/GENERIC
+ -
+
+ * - 1
+ - Proteus Pro [philips reference design]
+ - 1131:2001, 1131:2001
+
+ * - 2
+ - LifeView FlyVIDEO3000
+ - 5168:0138, 4e42:0138
+
+ * - 3
+ - LifeView/Typhoon FlyVIDEO2000
+ - 5168:0138, 4e42:0138
+
+ * - 4
+ - EMPRESS
+ - 1131:6752
+
+ * - 5
+ - SKNet Monster TV
+ - 1131:4e85
+
+ * - 6
+ - Tevion MD 9717
+ -
+
+ * - 7
+ - KNC One TV-Station RDS / Typhoon TV Tuner RDS
+ - 1131:fe01, 1894:fe01
+
+ * - 8
+ - Terratec Cinergy 400 TV
+ - 153b:1142
+
+ * - 9
+ - Medion 5044
+ -
+
+ * - 10
+ - Kworld/KuroutoShikou SAA7130-TVPCI
+ -
+
+ * - 11
+ - Terratec Cinergy 600 TV
+ - 153b:1143
+
+ * - 12
+ - Medion 7134
+ - 16be:0003, 16be:5000
+
+ * - 13
+ - Typhoon TV+Radio 90031
+ -
+
+ * - 14
+ - ELSA EX-VISION 300TV
+ - 1048:226b
+
+ * - 15
+ - ELSA EX-VISION 500TV
+ - 1048:226a
+
+ * - 16
+ - ASUS TV-FM 7134
+ - 1043:4842, 1043:4830, 1043:4840
+
+ * - 17
+ - AOPEN VA1000 POWER
+ - 1131:7133
+
+ * - 18
+ - BMK MPEX No Tuner
+ -
+
+ * - 19
+ - Compro VideoMate TV
+ - 185b:c100
+
+ * - 20
+ - Matrox CronosPlus
+ - 102B:48d0
+
+ * - 21
+ - 10MOONS PCI TV CAPTURE CARD
+ - 1131:2001
+
+ * - 22
+ - AverMedia M156 / Medion 2819
+ - 1461:a70b
+
+ * - 23
+ - BMK MPEX Tuner
+ -
+
+ * - 24
+ - KNC One TV-Station DVR
+ - 1894:a006
+
+ * - 25
+ - ASUS TV-FM 7133
+ - 1043:4843
+
+ * - 26
+ - Pinnacle PCTV Stereo (saa7134)
+ - 11bd:002b
+
+ * - 27
+ - Manli MuchTV M-TV002
+ -
+
+ * - 28
+ - Manli MuchTV M-TV001
+ -
+
+ * - 29
+ - Nagase Sangyo TransGear 3000TV
+ - 1461:050c
+
+ * - 30
+ - Elitegroup ECS TVP3XP FM1216 Tuner Card(PAL-BG,FM)
+ - 1019:4cb4
+
+ * - 31
+ - Elitegroup ECS TVP3XP FM1236 Tuner Card (NTSC,FM)
+ - 1019:4cb5
+
+ * - 32
+ - AVACS SmartTV
+ -
+
+ * - 33
+ - AVerMedia DVD EZMaker
+ - 1461:10ff
+
+ * - 34
+ - Noval Prime TV 7133
+ -
+
+ * - 35
+ - AverMedia AverTV Studio 305
+ - 1461:2115
+
+ * - 36
+ - UPMOST PURPLE TV
+ - 12ab:0800
+
+ * - 37
+ - Items MuchTV Plus / IT-005
+ -
+
+ * - 38
+ - Terratec Cinergy 200 TV
+ - 153b:1152
+
+ * - 39
+ - LifeView FlyTV Platinum Mini
+ - 5168:0212, 4e42:0212, 5169:1502
+
+ * - 40
+ - Compro VideoMate TV PVR/FM
+ - 185b:c100
+
+ * - 41
+ - Compro VideoMate TV Gold+
+ - 185b:c100
+
+ * - 42
+ - Sabrent SBT-TVFM (saa7130)
+ -
+
+ * - 43
+ - :Zolid Xpert TV7134
+ -
+
+ * - 44
+ - Empire PCI TV-Radio LE
+ -
+
+ * - 45
+ - Avermedia AVerTV Studio 307
+ - 1461:9715
+
+ * - 46
+ - AVerMedia Cardbus TV/Radio (E500)
+ - 1461:d6ee
+
+ * - 47
+ - Terratec Cinergy 400 mobile
+ - 153b:1162
+
+ * - 48
+ - Terratec Cinergy 600 TV MK3
+ - 153b:1158
+
+ * - 49
+ - Compro VideoMate Gold+ Pal
+ - 185b:c200
+
+ * - 50
+ - Pinnacle PCTV 300i DVB-T + PAL
+ - 11bd:002d
+
+ * - 51
+ - ProVideo PV952
+ - 1540:9524
+
+ * - 52
+ - AverMedia AverTV/305
+ - 1461:2108
+
+ * - 53
+ - ASUS TV-FM 7135
+ - 1043:4845
+
+ * - 54
+ - LifeView FlyTV Platinum FM / Gold
+ - 5168:0214, 5168:5214, 1489:0214, 5168:0304
+
+ * - 55
+ - LifeView FlyDVB-T DUO / MSI TV@nywhere Duo
+ - 5168:0306, 4E42:0306
+
+ * - 56
+ - Avermedia AVerTV 307
+ - 1461:a70a
+
+ * - 57
+ - Avermedia AVerTV GO 007 FM
+ - 1461:f31f
+
+ * - 58
+ - ADS Tech Instant TV (saa7135)
+ - 1421:0350, 1421:0351, 1421:0370, 1421:1370
+
+ * - 59
+ - Kworld/Tevion V-Stream Xpert TV PVR7134
+ -
+
+ * - 60
+ - LifeView/Typhoon/Genius FlyDVB-T Duo Cardbus
+ - 5168:0502, 4e42:0502, 1489:0502
+
+ * - 61
+ - Philips TOUGH DVB-T reference design
+ - 1131:2004
+
+ * - 62
+ - Compro VideoMate TV Gold+II
+ -
+
+ * - 63
+ - Kworld Xpert TV PVR7134
+ -
+
+ * - 64
+ - FlyTV mini Asus Digimatrix
+ - 1043:0210
+
+ * - 65
+ - V-Stream Studio TV Terminator
+ -
+
+ * - 66
+ - Yuan TUN-900 (saa7135)
+ -
+
+ * - 67
+ - Beholder BeholdTV 409 FM
+ - 0000:4091
+
+ * - 68
+ - GoTView 7135 PCI
+ - 5456:7135
+
+ * - 69
+ - Philips EUROPA V3 reference design
+ - 1131:2004
+
+ * - 70
+ - Compro Videomate DVB-T300
+ - 185b:c900
+
+ * - 71
+ - Compro Videomate DVB-T200
+ - 185b:c901
+
+ * - 72
+ - RTD Embedded Technologies VFG7350
+ - 1435:7350
+
+ * - 73
+ - RTD Embedded Technologies VFG7330
+ - 1435:7330
+
+ * - 74
+ - LifeView FlyTV Platinum Mini2
+ - 14c0:1212
+
+ * - 75
+ - AVerMedia AVerTVHD MCE A180
+ - 1461:1044
+
+ * - 76
+ - SKNet MonsterTV Mobile
+ - 1131:4ee9
+
+ * - 77
+ - Pinnacle PCTV 40i/50i/110i (saa7133)
+ - 11bd:002e
+
+ * - 78
+ - ASUSTeK P7131 Dual
+ - 1043:4862
+
+ * - 79
+ - Sedna/MuchTV PC TV Cardbus TV/Radio (ITO25 Rev:2B)
+ -
+
+ * - 80
+ - ASUS Digimatrix TV
+ - 1043:0210
+
+ * - 81
+ - Philips Tiger reference design
+ - 1131:2018
+
+ * - 82
+ - MSI TV@Anywhere plus
+ - 1462:6231, 1462:8624
+
+ * - 83
+ - Terratec Cinergy 250 PCI TV
+ - 153b:1160
+
+ * - 84
+ - LifeView FlyDVB Trio
+ - 5168:0319
+
+ * - 85
+ - AverTV DVB-T 777
+ - 1461:2c05, 1461:2c05
+
+ * - 86
+ - LifeView FlyDVB-T / Genius VideoWonder DVB-T
+ - 5168:0301, 1489:0301
+
+ * - 87
+ - ADS Instant TV Duo Cardbus PTV331
+ - 0331:1421
+
+ * - 88
+ - Tevion/KWorld DVB-T 220RF
+ - 17de:7201
+
+ * - 89
+ - ELSA EX-VISION 700TV
+ - 1048:226c
+
+ * - 90
+ - Kworld ATSC110/115
+ - 17de:7350, 17de:7352
+
+ * - 91
+ - AVerMedia A169 B
+ - 1461:7360
+
+ * - 92
+ - AVerMedia A169 B1
+ - 1461:6360
+
+ * - 93
+ - Medion 7134 Bridge #2
+ - 16be:0005
+
+ * - 94
+ - LifeView FlyDVB-T Hybrid Cardbus/MSI TV @nywhere A/D NB
+ - 5168:3306, 5168:3502, 5168:3307, 4e42:3502
+
+ * - 95
+ - LifeView FlyVIDEO3000 (NTSC)
+ - 5169:0138
+
+ * - 96
+ - Medion Md8800 Quadro
+ - 16be:0007, 16be:0008, 16be:000d
+
+ * - 97
+ - LifeView FlyDVB-S /Acorp TV134DS
+ - 5168:0300, 4e42:0300
+
+ * - 98
+ - Proteus Pro 2309
+ - 0919:2003
+
+ * - 99
+ - AVerMedia TV Hybrid A16AR
+ - 1461:2c00
+
+ * - 100
+ - Asus Europa2 OEM
+ - 1043:4860
+
+ * - 101
+ - Pinnacle PCTV 310i
+ - 11bd:002f
+
+ * - 102
+ - Avermedia AVerTV Studio 507
+ - 1461:9715
+
+ * - 103
+ - Compro Videomate DVB-T200A
+ -
+
+ * - 104
+ - Hauppauge WinTV-HVR1110 DVB-T/Hybrid
+ - 0070:6700, 0070:6701, 0070:6702, 0070:6703, 0070:6704, 0070:6705
+
+ * - 105
+ - Terratec Cinergy HT PCMCIA
+ - 153b:1172
+
+ * - 106
+ - Encore ENLTV
+ - 1131:2342, 1131:2341, 3016:2344
+
+ * - 107
+ - Encore ENLTV-FM
+ - 1131:230f
+
+ * - 108
+ - Terratec Cinergy HT PCI
+ - 153b:1175
+
+ * - 109
+ - Philips Tiger - S Reference design
+ -
+
+ * - 110
+ - Avermedia M102
+ - 1461:f31e
+
+ * - 111
+ - ASUS P7131 4871
+ - 1043:4871
+
+ * - 112
+ - ASUSTeK P7131 Hybrid
+ - 1043:4876
+
+ * - 113
+ - Elitegroup ECS TVP3XP FM1246 Tuner Card (PAL,FM)
+ - 1019:4cb6
+
+ * - 114
+ - KWorld DVB-T 210
+ - 17de:7250
+
+ * - 115
+ - Sabrent PCMCIA TV-PCB05
+ - 0919:2003
+
+ * - 116
+ - 10MOONS TM300 TV Card
+ - 1131:2304
+
+ * - 117
+ - Avermedia Super 007
+ - 1461:f01d
+
+ * - 118
+ - Beholder BeholdTV 401
+ - 0000:4016
+
+ * - 119
+ - Beholder BeholdTV 403
+ - 0000:4036
+
+ * - 120
+ - Beholder BeholdTV 403 FM
+ - 0000:4037
+
+ * - 121
+ - Beholder BeholdTV 405
+ - 0000:4050
+
+ * - 122
+ - Beholder BeholdTV 405 FM
+ - 0000:4051
+
+ * - 123
+ - Beholder BeholdTV 407
+ - 0000:4070
+
+ * - 124
+ - Beholder BeholdTV 407 FM
+ - 0000:4071
+
+ * - 125
+ - Beholder BeholdTV 409
+ - 0000:4090
+
+ * - 126
+ - Beholder BeholdTV 505 FM
+ - 5ace:5050
+
+ * - 127
+ - Beholder BeholdTV 507 FM / BeholdTV 509 FM
+ - 5ace:5070, 5ace:5090
+
+ * - 128
+ - Beholder BeholdTV Columbus TV/FM
+ - 0000:5201
+
+ * - 129
+ - Beholder BeholdTV 607 FM
+ - 5ace:6070
+
+ * - 130
+ - Beholder BeholdTV M6
+ - 5ace:6190
+
+ * - 131
+ - Twinhan Hybrid DTV-DVB 3056 PCI
+ - 1822:0022
+
+ * - 132
+ - Genius TVGO AM11MCE
+ -
+
+ * - 133
+ - NXP Snake DVB-S reference design
+ -
+
+ * - 134
+ - Medion/Creatix CTX953 Hybrid
+ - 16be:0010
+
+ * - 135
+ - MSI TV@nywhere A/D v1.1
+ - 1462:8625
+
+ * - 136
+ - AVerMedia Cardbus TV/Radio (E506R)
+ - 1461:f436
+
+ * - 137
+ - AVerMedia Hybrid TV/Radio (A16D)
+ - 1461:f936
+
+ * - 138
+ - Avermedia M115
+ - 1461:a836
+
+ * - 139
+ - Compro VideoMate T750
+ - 185b:c900
+
+ * - 140
+ - Avermedia DVB-S Pro A700
+ - 1461:a7a1
+
+ * - 141
+ - Avermedia DVB-S Hybrid+FM A700
+ - 1461:a7a2
+
+ * - 142
+ - Beholder BeholdTV H6
+ - 5ace:6290
+
+ * - 143
+ - Beholder BeholdTV M63
+ - 5ace:6191
+
+ * - 144
+ - Beholder BeholdTV M6 Extra
+ - 5ace:6193
+
+ * - 145
+ - AVerMedia MiniPCI DVB-T Hybrid M103
+ - 1461:f636, 1461:f736
+
+ * - 146
+ - ASUSTeK P7131 Analog
+ -
+
+ * - 147
+ - Asus Tiger 3in1
+ - 1043:4878
+
+ * - 148
+ - Encore ENLTV-FM v5.3
+ - 1a7f:2008
+
+ * - 149
+ - Avermedia PCI pure analog (M135A)
+ - 1461:f11d
+
+ * - 150
+ - Zogis Real Angel 220
+ -
+
+ * - 151
+ - ADS Tech Instant HDTV
+ - 1421:0380
+
+ * - 152
+ - Asus Tiger Rev:1.00
+ - 1043:4857
+
+ * - 153
+ - Kworld Plus TV Analog Lite PCI
+ - 17de:7128
+
+ * - 154
+ - Avermedia AVerTV GO 007 FM Plus
+ - 1461:f31d
+
+ * - 155
+ - Hauppauge WinTV-HVR1150 ATSC/QAM-Hybrid
+ - 0070:6706, 0070:6708
+
+ * - 156
+ - Hauppauge WinTV-HVR1120 DVB-T/Hybrid
+ - 0070:6707, 0070:6709, 0070:670a
+
+ * - 157
+ - Avermedia AVerTV Studio 507UA
+ - 1461:a11b
+
+ * - 158
+ - AVerMedia Cardbus TV/Radio (E501R)
+ - 1461:b7e9
+
+ * - 159
+ - Beholder BeholdTV 505 RDS
+ - 0000:505B
+
+ * - 160
+ - Beholder BeholdTV 507 RDS
+ - 0000:5071
+
+ * - 161
+ - Beholder BeholdTV 507 RDS
+ - 0000:507B
+
+ * - 162
+ - Beholder BeholdTV 607 FM
+ - 5ace:6071
+
+ * - 163
+ - Beholder BeholdTV 609 FM
+ - 5ace:6090
+
+ * - 164
+ - Beholder BeholdTV 609 FM
+ - 5ace:6091
+
+ * - 165
+ - Beholder BeholdTV 607 RDS
+ - 5ace:6072
+
+ * - 166
+ - Beholder BeholdTV 607 RDS
+ - 5ace:6073
+
+ * - 167
+ - Beholder BeholdTV 609 RDS
+ - 5ace:6092
+
+ * - 168
+ - Beholder BeholdTV 609 RDS
+ - 5ace:6093
+
+ * - 169
+ - Compro VideoMate S350/S300
+ - 185b:c900
+
+ * - 170
+ - AverMedia AverTV Studio 505
+ - 1461:a115
+
+ * - 171
+ - Beholder BeholdTV X7
+ - 5ace:7595
+
+ * - 172
+ - RoverMedia TV Link Pro FM
+ - 19d1:0138
+
+ * - 173
+ - Zolid Hybrid TV Tuner PCI
+ - 1131:2004
+
+ * - 174
+ - Asus Europa Hybrid OEM
+ - 1043:4847
+
+ * - 175
+ - Leadtek Winfast DTV1000S
+ - 107d:6655
+
+ * - 176
+ - Beholder BeholdTV 505 RDS
+ - 0000:5051
+
+ * - 177
+ - Hawell HW-404M7
+ -
+
+ * - 178
+ - Beholder BeholdTV H7
+ - 5ace:7190
+
+ * - 179
+ - Beholder BeholdTV A7
+ - 5ace:7090
+
+ * - 180
+ - Avermedia PCI M733A
+ - 1461:4155, 1461:4255
+
+ * - 181
+ - TechoTrend TT-budget T-3000
+ - 13c2:2804
+
+ * - 182
+ - Kworld PCI SBTVD/ISDB-T Full-Seg Hybrid
+ - 17de:b136
+
+ * - 183
+ - Compro VideoMate Vista M1F
+ - 185b:c900
+
+ * - 184
+ - Encore ENLTV-FM 3
+ - 1a7f:2108
+
+ * - 185
+ - MagicPro ProHDTV Pro2 DMB-TH/Hybrid
+ - 17de:d136
+
+ * - 186
+ - Beholder BeholdTV 501
+ - 5ace:5010
+
+ * - 187
+ - Beholder BeholdTV 503 FM
+ - 5ace:5030
+
+ * - 188
+ - Sensoray 811/911
+ - 6000:0811, 6000:0911
+
+ * - 189
+ - Kworld PC150-U
+ - 17de:a134
+
+ * - 190
+ - Asus My Cinema PS3-100
+ - 1043:48cd
+
+ * - 191
+ - Hawell HW-9004V1
+ -
+
+ * - 192
+ - AverMedia AverTV Satellite Hybrid+FM A706
+ - 1461:2055
+
+ * - 193
+ - WIS Voyager or compatible
+ - 1905:7007
+
+ * - 194
+ - AverMedia AverTV/505
+ - 1461:a10a
+
+ * - 195
+ - Leadtek Winfast TV2100 FM
+ - 107d:6f3a
+
+ * - 196
+ - SnaZio* TVPVR PRO
+ - 1779:13cf
diff --git a/Documentation/admin-guide/media/saa7134.rst b/Documentation/admin-guide/media/saa7134.rst
new file mode 100644
index 000000000000..51eae7eb5ab7
--- /dev/null
+++ b/Documentation/admin-guide/media/saa7134.rst
@@ -0,0 +1,89 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The saa7134 driver
+==================
+
+Author Gerd Hoffmann
+
+
+This is a v4l2/oss device driver for saa7130/33/34/35 based capture / TV
+boards.
+
+
+Status
+------
+
+Almost everything is working: video, sound, tuner, radio, MPEG TS, ...
+
+As with bttv, card-specific tweaks are needed. Check
+Documentation/admin-guide/media/saa7134-cardlist.rst for a list of known
+TV cards and saa7134-cards.c for the driver's card configuration info.
+
+
+Build
+-----
+
+Once you have the kernel source, you should configure, build,
+install and boot the new kernel. You'll need at least
+these config options::
+
+ ./scripts/config -e PCI
+ ./scripts/config -e INPUT
+ ./scripts/config -m I2C
+ ./scripts/config -m MEDIA_SUPPORT
+ ./scripts/config -e MEDIA_PCI_SUPPORT
+ ./scripts/config -e MEDIA_ANALOG_TV_SUPPORT
+ ./scripts/config -e MEDIA_DIGITAL_TV_SUPPORT
+ ./scripts/config -e MEDIA_RADIO_SUPPORT
+ ./scripts/config -e RC_CORE
+ ./scripts/config -e MEDIA_SUBDRV_AUTOSELECT
+ ./scripts/config -m VIDEO_SAA7134
+ ./scripts/config -e SAA7134_ALSA
+ ./scripts/config -e VIDEO_SAA7134_RC
+ ./scripts/config -e VIDEO_SAA7134_DVB
+ ./scripts/config -e VIDEO_SAA7134_GO7007
+
+To build and install, you should run::
+
+ make && make modules_install && make install
+
+Once the new kernel is booted, the saa7134 driver should be loaded
+automatically.
+
+Depending on the card, you might have to pass ``card=<nr>`` as an insmod
+option. If so, please check
+Documentation/admin-guide/media/saa7134-cardlist.rst for valid choices.
+
+Once you have your card type number, you can set the module option
+via a configuration file (usually ``/etc/modules.conf`` or some file
+under ``/etc/modprobe.d/``, but the actual place depends on your
+distribution), with this content::
+
+ options saa7134 card=13 # Assuming that your card type is #13
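+
+If you do not know which row of the card list matches your board, the PCI
+subsystem ID reported by ``lspci`` can be compared against the "PCI subsystem
+IDs" column of the card list. The following is only a sketch: it assumes that
+``lspci`` from pciutils is installed and that its ``-vnn`` output prints a
+``Subsystem:`` line ending in ``[vendor:device]``. The ``subsystem_ids``
+helper is a hypothetical name, not part of any package:
+
+.. code-block:: sh
+
+    # Print the vendor:device pairs from "Subsystem:" lines read on stdin.
+    subsystem_ids() {
+        sed -n 's/^[[:space:]]*Subsystem:.*\[\([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\)\]$/\1/p'
+    }
+
+    # Usage on a live system (assumes the capture chip reports PCI vendor
+    # ID 1131; drop the -d filter to list every device):
+    #   lspci -vnn -d 1131: | subsystem_ids
+
+The printed ID can then be looked up in the card list to find the number to
+use with ``card=``.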
+
+
+Changes / Fixes
+---------------
+
+Please mail to linux-media AT vger.kernel.org unified diffs against
+the linux media git tree:
+
+ https://git.linuxtv.org/media_tree.git/
+
+This is done by committing a patch in a clone of the git tree and
+submitting it with ``git send-email``. Don't forget to describe in the
+commit log what the patch changes, which problem it fixes, or what it is
+good for.
+
+
+Known Problems
+--------------
+
+* The tuner for the flyvideos isn't detected automatically and the
+ default might not work for you depending on which version you have.
+ There is a ``tuner=`` insmod option to override the driver's default.
+
+Credits
+-------
+
+andrew.stevens@philips.com + werner.leeb@philips.com for providing
+saa7134 hardware specs and sample board.
diff --git a/Documentation/admin-guide/media/saa7164-cardlist.rst b/Documentation/admin-guide/media/saa7164-cardlist.rst
new file mode 100644
index 000000000000..7949c09aa900
--- /dev/null
+++ b/Documentation/admin-guide/media/saa7164-cardlist.rst
@@ -0,0 +1,71 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+SAA7164 cards list
+==================
+
+.. tabularcolumns:: |p{1.4cm}|p{11.1cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - PCI subsystem IDs
+
+ * - 0
+ - Unknown
+ -
+
+ * - 1
+ - Generic Rev2
+ -
+
+ * - 2
+ - Generic Rev3
+ -
+
+ * - 3
+ - Hauppauge WinTV-HVR2250
+ - 0070:8880, 0070:8810
+
+ * - 4
+ - Hauppauge WinTV-HVR2200
+ - 0070:8980
+
+ * - 5
+ - Hauppauge WinTV-HVR2200
+ - 0070:8900
+
+ * - 6
+ - Hauppauge WinTV-HVR2200
+ - 0070:8901
+
+ * - 7
+ - Hauppauge WinTV-HVR2250
+ - 0070:8891, 0070:8851
+
+ * - 8
+ - Hauppauge WinTV-HVR2250
+ - 0070:88A1
+
+ * - 9
+ - Hauppauge WinTV-HVR2200
+ - 0070:8940
+
+ * - 10
+ - Hauppauge WinTV-HVR2200
+ - 0070:8953
+
+ * - 11
+ - Hauppauge WinTV-HVR2255(proto)
+ - 0070:f111
+
+ * - 12
+ - Hauppauge WinTV-HVR2255
+ - 0070:f111
+
+ * - 13
+ - Hauppauge WinTV-HVR2205
+ - 0070:f123, 0070:f120
diff --git a/Documentation/admin-guide/media/si470x.rst b/Documentation/admin-guide/media/si470x.rst
new file mode 100644
index 000000000000..d53bf5f95200
--- /dev/null
+++ b/Documentation/admin-guide/media/si470x.rst
@@ -0,0 +1,167 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+The Silicon Labs Si470x FM Radio Receivers driver
+=================================================
+
+Copyright |copy| 2009 Tobias Lorenz <tobias.lorenz@gmx.net>
+
+
+Information from Silicon Labs
+-----------------------------
+
+Silicon Laboratories manufactures the radio ICs that are nowadays the most
+commonly used radio receivers in cell phones. They are usually connected via
+I2C, but SiLabs also provides a reference design that integrates this IC
+with a small C8051F321 microcontroller to form a USB radio.
+Part of this reference design is also a radio application in binary and source
+code. The software also contains an automatic firmware upgrade to the most
+current version. Information on these can be downloaded here:
+http://www.silabs.com/usbradio
+
+
+Supported ICs
+-------------
+
+The following ICs have very similar register sets, so they are, or eventually
+will be, supported by the driver:
+
+- Si4700: FM radio receiver
+- Si4701: FM radio receiver, RDS Support
+- Si4702: FM radio receiver
+- Si4703: FM radio receiver, RDS Support
+- Si4704: FM radio receiver, no external antenna required
+- Si4705: FM radio receiver, no external antenna required, RDS support, Dig I/O
+- Si4706: Enhanced FM RDS/TMC radio receiver, no external antenna required, RDS
+ Support
+- Si4707: Dedicated weather band radio receiver with SAME decoder, RDS Support
+- Si4708: Smallest FM receivers
+- Si4709: Smallest FM receivers, RDS Support
+
+More information on these can be downloaded here:
+http://www.silabs.com/products/mcu/Pages/USBFMRadioRD.aspx
+
+
+Supported USB devices
+---------------------
+
+Currently the following USB radios (vendor:product) with the Silicon Labs si470x
+chips are known to work:
+
+- 10c4:818a: Silicon Labs USB FM Radio Reference Design
+- 06e1:a155: ADS/Tech FM Radio Receiver (formerly Instant FM Music) (RDX-155-EF)
+- 1b80:d700: KWorld USB FM Radio SnapMusic Mobile 700 (FM700)
+- 10c5:819a: Sanei Electric, Inc. FM USB Radio (sold as DealExtreme.com PCear)
+
+
+Software
+--------
+
+Testing is usually done with most of the following applications under
+Debian/testing:
+
+- fmtools - Utility for managing FM tuner cards
+- gnomeradio - FM-radio tuner for the GNOME desktop
+- gradio - GTK FM radio tuner
+- kradio - Comfortable Radio Application for KDE
+- radio - ncurses-based radio application
+- mplayer - The Ultimate Movie Player For Linux
+- v4l2-ctl - Collection of command line video4linux utilities
+
+For example, you can use:
+
+.. code-block:: none
+
+ v4l2-ctl -d /dev/radio0 --set-ctrl=volume=10,mute=0 --set-freq=95.21 --all
+
+The libv4l library can also be used. It is going to have a function for
+frequency seeking, either by using hardware functionality as in radio-si470x
+or by implementing a function as each of the programs mentioned above
+currently does. Eventually the radio programs should make use of libv4l.
+
+For processing RDS information, there is a project ongoing at:
+http://rdsd.berlios.de/
+
+There is currently no project for making TMC sentences human readable.
+
+
+Audio Listening
+---------------
+
+USB Audio is provided by the ALSA snd_usb_audio module. It is recommended to
+also select SND_USB_AUDIO, as this is required to get sound from the radio. For
+listening, you have to redirect the sound, for example using one of the
+following commands. Please adjust the audio devices to your needs
+(/dev/dsp* and hw:x,x).
+
+If you just want to test audio (very poor quality):
+
+.. code-block:: none
+
+ cat /dev/dsp1 > /dev/dsp
+
+If you use sox + OSS try:
+
+.. code-block:: none
+
+ sox -2 --endian little -r 96000 -t oss /dev/dsp1 -t oss /dev/dsp
+
+or using sox + alsa:
+
+.. code-block:: none
+
+ sox --endian little -c 2 -S -r 96000 -t alsa hw:1 -t alsa -r 96000 hw:0
+
+If you use arts try:
+
+.. code-block:: none
+
+ arecord -D hw:1,0 -r96000 -c2 -f S16_LE | artsdsp aplay -B -
+
+If you use mplayer try:
+
+.. code-block:: none
+
+ mplayer -radio adevice=hw=1.0:arate=96000 \
+ -rawaudio rate=96000 \
+ radio://<frequency>/capture
+
+Module Parameters
+-----------------
+
+After loading the module, you still have access to some of the module
+parameters in sysfs under /sys/module/radio_si470x/parameters. The contents of
+read-only files (0444) are not updated, even if space, band and de are changed
+using private video controls. The others can be changed at runtime.
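+
+As a rough sketch of how to inspect these values, the shell function below
+prints each file in a parameter directory as ``name = value``. The function
+name ``show_module_params`` is made up for this example, and files that are
+not readable by the calling user will produce an error from ``cat``:
+
+.. code-block:: sh
+
+    # Print "name = value" for every regular file in the given directory.
+    show_module_params() {
+        dir=$1
+        for f in "$dir"/*; do
+            [ -f "$f" ] || continue
+            printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
+        done
+    }
+
+    # Usage:
+    #   show_module_params /sys/module/radio_si470x/parameters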
+
+
+Errors
+------
+
+Increase tune_timeout, if you often get -EIO errors.
+
+When a timeout occurs or the band limit is reached, hw_freq_seek returns -EAGAIN.
+
+If you get any errors from snd_usb_audio, please report them to the ALSA people.
+
+
+Open Issues
+-----------
+
+V4L minor device allocation and parameter setting is not perfect. A solution is
+currently under discussion.
+
+There is a USB interface for downloading/uploading new firmware images. Support
+for it can be implemented using the request_firmware interface.
+
+There is an RDS interrupt mode. The driver is already using the same interface
+for polling RDS information, but is currently not using the interrupt mode.
+
+There is an LED interface, which can be used to override the LED control
+programmed in the firmware. This can be made available using the LED support
+functions in the kernel.
+
+
+Other useful information and links
+----------------------------------
+
+http://www.silabs.com/usbradio
diff --git a/Documentation/admin-guide/media/si4713.rst b/Documentation/admin-guide/media/si4713.rst
new file mode 100644
index 000000000000..be8e6b49b7b4
--- /dev/null
+++ b/Documentation/admin-guide/media/si4713.rst
@@ -0,0 +1,192 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+The Silicon Labs Si4713 FM Radio Transmitter Driver
+===================================================
+
+Copyright |copy| 2009 Nokia Corporation
+
+Contact: Eduardo Valentin <eduardo.valentin@nokia.com>
+
+
+Information about the Device
+----------------------------
+
+This chip is a Silicon Labs product. It is an I2C device, currently at address 0x63.
+Basically, it has transmission and signal noise level measurement features.
+
+The Si4713 integrates transmit functions for FM broadcast stereo transmission.
+The chip also allows integrated receive power scanning to identify low signal
+power FM channels.
+
+The chip is programmed using commands and responses. There are also several
+properties which can change the behavior of this chip.
+
+Users must comply with local regulations on radio frequency (RF) transmission.
+
+Device driver description
+-------------------------
+
+There are two modules to handle this device. One is an I2C device driver
+and the other is a platform driver.
+
+The I2C device driver exports a v4l2-subdev interface to the kernel.
+All properties can also be accessed through the v4l2 extended controls
+interface, by using the v4l2-subdev calls (g_ext_ctrls, s_ext_ctrls).
+
+The platform device driver exports a v4l2 radio device interface to user land.
+It uses the I2C device driver as a sub device in order to send the user
+commands to the actual device. Basically it is a wrapper around the I2C device
+driver.
+
+Applications can use the v4l2 radio API to specify frequency of operation, mute
+state, etc. Most of its properties, however, are exposed through the extended
+controls.
+
+When the v4l2 mute property is set to 1 (true), the driver will turn the chip off.
+
+Properties description
+----------------------
+
+The properties can be accessed using v4l2 extended controls.
+Here is sample output from the v4l2-ctl utility:
+
+.. code-block:: none
+
+ / # v4l2-ctl -d /dev/radio0 --all -L
+ Driver Info:
+ Driver name : radio-si4713
+ Card type : Silicon Labs Si4713 Modulator
+ Bus info :
+ Driver version: 0
+ Capabilities : 0x00080800
+ RDS Output
+ Modulator
+ Audio output: 0 (FM Modulator Audio Out)
+ Frequency: 1408000 (88.000000 MHz)
+ Video Standard = 0x00000000
+ Modulator:
+ Name : FM Modulator
+ Capabilities : 62.5 Hz stereo rds
+ Frequency range : 76.0 MHz - 108.0 MHz
+ Subchannel modulation: stereo+rds
+
+ User Controls
+
+ mute (bool) : default=1 value=0
+
+ FM Radio Modulator Controls
+
+ rds_signal_deviation (int) : min=0 max=90000 step=10 default=200 value=200 flags=slider
+ rds_program_id (int) : min=0 max=65535 step=1 default=0 value=0
+ rds_program_type (int) : min=0 max=31 step=1 default=0 value=0
+ rds_ps_name (str) : min=0 max=96 step=8 value='si4713 '
+ rds_radio_text (str) : min=0 max=384 step=32 value=''
+ audio_limiter_feature_enabled (bool) : default=1 value=1
+ audio_limiter_release_time (int) : min=250 max=102390 step=50 default=5010 value=5010 flags=slider
+ audio_limiter_deviation (int) : min=0 max=90000 step=10 default=66250 value=66250 flags=slider
+ audio_compression_feature_enabl (bool) : default=1 value=1
+ audio_compression_gain (int) : min=0 max=20 step=1 default=15 value=15 flags=slider
+ audio_compression_threshold (int) : min=-40 max=0 step=1 default=-40 value=-40 flags=slider
+ audio_compression_attack_time (int) : min=0 max=5000 step=500 default=0 value=0 flags=slider
+ audio_compression_release_time (int) : min=100000 max=1000000 step=100000 default=1000000 value=1000000 flags=slider
+ pilot_tone_feature_enabled (bool) : default=1 value=1
+ pilot_tone_deviation (int) : min=0 max=90000 step=10 default=6750 value=6750 flags=slider
+ pilot_tone_frequency (int) : min=0 max=19000 step=1 default=19000 value=19000 flags=slider
+ pre_emphasis_settings (menu) : min=0 max=2 default=1 value=1
+ tune_power_level (int) : min=0 max=120 step=1 default=88 value=88 flags=slider
+ tune_antenna_capacitor (int) : min=0 max=191 step=1 default=0 value=110 flags=slider
+
+Here is a summary of them:
+
+* Pilot is an audible tone sent by the device.
+
+- pilot_frequency - Configures the frequency of the stereo pilot tone.
+- pilot_deviation - Configures pilot tone frequency deviation level.
+- pilot_enabled - Enables or disables the pilot tone feature.
+
+* The si4713 device is capable of applying audio compression to the
+ transmitted signal.
+
+- acomp_enabled - Enables or disables the audio dynamic range control feature.
+- acomp_gain - Sets the gain for audio dynamic range control.
+- acomp_threshold - Sets the threshold level for audio dynamic range control.
+- acomp_attack_time - Sets the attack time for audio dynamic range control.
+- acomp_release_time - Sets the release time for audio dynamic range control.
+
+* The limiter sets up the audio deviation limiter feature. Once an over
+  deviation occurs, it is possible to adjust the front-end gain of the audio
+  input and always prevent over deviation.
+
+- limiter_enabled - Enables or disables the limiter feature.
+- limiter_deviation - Configures audio frequency deviation level.
+- limiter_release_time - Sets the limiter release time.
+
+* Tuning power
+
+- power_level - Sets the output power level for signal transmission.
+- antenna_capacitor - Selects the value of the antenna tuning capacitor
+  manually, or automatically if set to zero.
+
+* RDS related
+
+- rds_ps_name - Sets the RDS ps name field for transmission.
+- rds_radio_text - Sets the RDS radio text for transmission.
+- rds_pi - Sets the RDS PI field for transmission.
+- rds_pty - Sets the RDS PTY field for transmission.
+
+* Region related
+
+- preemphasis - sets the preemphasis to be applied for transmission.
+
+RNL
+---
+
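+This device also has an interface to measure received noise level. To do that,
+issue an ioctl on the device node. Here is an example:
+
+.. code-block:: none
+
+    #include <errno.h>
+    #include <fcntl.h>
+    #include <stdio.h>
+    #include <unistd.h>
+    #include <sys/ioctl.h>
+    /* Assumed local copy of the si4713_rnl definitions (see below) */
+    #include "si4713.h"
+
+    int main(int argc, char *argv[])
+    {
+        struct si4713_rnl rnl = { 0 };
+        unsigned int freq;
+        int fd, rval;
+
+        if (argc < 2)
+            return EINVAL;
+
+        /* Frequency at which the noise level is measured */
+        if (sscanf(argv[1], "%u", &freq) != 1)
+            return EINVAL;
+        rnl.frequency = freq;
+
+        fd = open("/dev/radio0", O_RDWR);
+        if (fd < 0)
+            return errno;
+
+        rval = ioctl(fd, SI4713_IOC_MEASURE_RNL, &rnl);
+        if (rval < 0) {
+            rval = errno;
+            close(fd);
+            return rval;
+        }
+
+        printf("received noise level: %d\n", rnl.rnl);
+
+        close(fd);
+        return 0;
+    }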
+
+The struct si4713_rnl and SI4713_IOC_MEASURE_RNL are defined under
+include/linux/platform_data/media/si4713.h.
+
+Stereo/Mono and RDS subchannels
+-------------------------------
+
+The device can also be configured using the available sub channels for
+transmission. To do that, use the S/G_MODULATOR ioctls and configure txsubchans
+properly. Refer to the V4L2 API specification for proper use of these ioctls.
+
+Testing
+-------
+
+Testing is usually done with the v4l2-ctl utility for managing FM tuner cards.
+The tool can be found in the v4l-dvb repository under the v4l2-apps/util
+directory.
+
+Example for setting the RDS PS name:
+
+.. code-block:: none
+
+ # v4l2-ctl -d /dev/radio0 --set-ctrl=rds_ps_name="Dummy"
+
diff --git a/Documentation/admin-guide/media/si476x.rst b/Documentation/admin-guide/media/si476x.rst
new file mode 100644
index 000000000000..87062301d6a1
--- /dev/null
+++ b/Documentation/admin-guide/media/si476x.rst
@@ -0,0 +1,160 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. include:: <isonum.txt>
+
+
+The SI476x Driver
+=================
+
+Copyright |copy| 2013 Andrey Smirnov <andrew.smirnov@gmail.com>
+
+TODO for the driver
+-------------------
+
+- According to the SiLabs datasheet, it is possible to update the
+  firmware of the radio chip at run time, thus bringing it to the
+  most recent version. Unfortunately I couldn't find any mention of
+  this firmware update for the old chips that I tested the driver
+  against, so for chips like that the driver only exposes the old
+  functionality.
+
+
+Parameters exposed over debugfs
+-------------------------------
+SI476x allows the user to read multiple characteristics that can be very
+useful for EoL testing/RF performance estimation, parameters that have
+very little to do with the V4L2 subsystem. Such parameters are exposed via
+debugfs and can be accessed via regular file I/O operations.
+
+The driver exposes the following files:
+
+* /sys/kernel/debug/<device-name>/acf
+  This file contains ACF (Automatically Controlled Features) status
+  information. The file contains binary data with the following
+  layout:
+
+  .. tabularcolumns:: |p{7ex}|p{12ex}|L|
+
+  ============= ============== ====================================
+  Offset        Name           Description
+  ============= ============== ====================================
+  0x00          blend_int      Flag, set when stereo separation has
+                               crossed below the blend threshold
+  0x01          hblend_int     Flag, set when HiBlend cutoff
+                               frequency is lower than threshold
+  0x02          hicut_int      Flag, set when HiCut cutoff
+                               frequency is lower than threshold
+  0x03          chbw_int       Flag, set when channel filter
+                               bandwidth is less than threshold
+  0x04          softmute_int   Flag indicating that softmute
+                               attenuation has increased above
+                               softmute threshold
+  0x05          smute          0 - Audio is not soft muted
+                               1 - Audio is soft muted
+  0x06          smattn         Soft mute attenuation level in dB
+  0x07          chbw           Channel filter bandwidth in kHz
+  0x08          hicut          HiCut cutoff frequency in units of
+                               100 Hz
+  0x09          hiblend        HiBlend cutoff frequency in units
+                               of 100 Hz
+  0x10          pilot          0 - Stereo pilot is not present
+                               1 - Stereo pilot is present
+  0x11          stblend        Stereo blend in %
+  ============= ============== ====================================
+
+
+* /sys/kernel/debug/<device-name>/rds_blckcnt
+  This file contains statistics about RDS reception. Its binary data
+  has the following layout:
+
+  .. tabularcolumns:: |p{7ex}|p{12ex}|L|
+
+  ============= ============== ====================================
+  Offset        Name           Description
+  ============= ============== ====================================
+  0x00          expected       Number of expected RDS blocks
+  0x02          received       Number of received RDS blocks
+  0x04          uncorrectable  Number of uncorrectable RDS blocks
+  ============= ============== ====================================
+
+* /sys/kernel/debug/<device-name>/agc
+  This file contains information about parameters pertaining to
+  AGC (Automatic Gain Control).
+
+  The layout is:
+
+  .. tabularcolumns:: |p{7ex}|p{12ex}|L|
+
+  ============= ============== ====================================
+  Offset        Name           Description
+  ============= ============== ====================================
+  0x00          mxhi           0 - FM Mixer PD high threshold is
+                               not tripped
+                               1 - FM Mixer PD high threshold is
+                               tripped
+  0x01          mxlo           ditto for FM Mixer PD low
+  0x02          lnahi          ditto for FM LNA PD high
+  0x03          lnalo          ditto for FM LNA PD low
+  0x04          fmagc1         FMAGC1 attenuator resistance
+                               (see datasheet for more detail)
+  0x05          fmagc2         ditto for FMAGC2
+  0x06          pgagain        PGA gain in dB
+  0x07          fmwblang       FM/WB LNA Gain in dB
+  ============= ============== ====================================
+
+* /sys/kernel/debug/<device-name>/rsq
+ This file contains information about parameters pertaining to
+ RSQ(Received Signal Quality)
+
+ The layout is:
+
+ .. tabularcolumns:: |p{7ex}|p{12ex}|p{60ex}|
+
+ ============= ============== ====================================
+ Offset Name Description
+ ============= ============== ====================================
+ 0x00 multhint 0 - multipath value has not crossed
+ the Multipath high threshold
+ 1 - multipath value has crossed
+ the Multipath high threshold
+ 0x01 multlint ditto for Multipath low threshold
+ 0x02 snrhint 0 - received signal's SNR has not
+ crossed high threshold
+ 1 - received signal's SNR has
+ crossed high threshold
+ 0x03 snrlint ditto for low threshold
+ 0x04 rssihint ditto for RSSI high threshold
+ 0x05 rssilint ditto for RSSI low threshold
+ 0x06 bltf Flag indicating if seek command
+ reached/wrapped seek band limit
+ 0x07 snr_ready Indicates that SNR metrics is ready
+ 0x08 rssiready ditto for RSSI metrics
+ 0x09 injside 0 - Low-side injection is being used
+ 1 - High-side injection is used
+ 0x10 afcrl Flag indicating if AFC rails
+ 0x11 valid Flag indicating if channel is valid
+ 0x12 readfreq Current tuned frequency
+ 0x14 freqoff Signed frequency offset in units of
+ 2ppm
+ 0x15 rssi Signed value of RSSI in dBuV
+ 0x16 snr Signed RF SNR in dB
+ 0x17 issi Signed Image Strength Signal
+ indicator
+ 0x18 lassi Signed Low side adjacent Channel
+ Strength indicator
+ 0x19 hassi ditto for High side
+ 0x20 mult Multipath indicator
+ 0x21 dev Frequency deviation
+ 0x24 assi Adjacent channel SSI
+ 0x25 usn Ultrasonic noise indicator
+ 0x26 pilotdev Pilot deviation in units of 100 Hz
+ 0x27 rdsdev ditto for RDS
+ 0x28 assidev ditto for ASSI
+ 0x29 strongdev Frequency deviation
+ 0x30 rdspi RDS PI code
+ ============= ============== ====================================
+
+* /sys/kernel/debug/<device-name>/rsq_primary
+ This file contains parameters pertaining to
+ RSQ (Received Signal Quality) for the primary tuner only. The layout is
+ the same as the one above.
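+
+ As a rough illustration of how the ``rsq`` layout above could be consumed,
+ the sketch below decodes only the six threshold flags at offsets 0x00-0x05.
+ It assumes that each of these flags occupies exactly one byte (inferred from
+ the consecutive offsets, not stated in the table); ``RSQ_FLAG_OFFSETS`` and
+ ``decode_rsq_flags()`` are hypothetical names, and the sample bytes are
+ synthetic.

```python
# Offsets of the threshold flags, copied from the rsq table above.
# Assumption: each of these flags occupies exactly one byte.
RSQ_FLAG_OFFSETS = {
    "multhint": 0x00,
    "multlint": 0x01,
    "snrhint": 0x02,
    "snrlint": 0x03,
    "rssihint": 0x04,
    "rssilint": 0x05,
}


def decode_rsq_flags(blob):
    """Map each flag name to True (threshold crossed) or False."""
    needed = max(RSQ_FLAG_OFFSETS.values()) + 1
    if len(blob) < needed:
        raise ValueError(f"need at least {needed} bytes, got {len(blob)}")
    return {name: blob[offset] != 0 for name, offset in RSQ_FLAG_OFFSETS.items()}


if __name__ == "__main__":
    # Synthetic data; a real blob would be read (as root) from the rsq file.
    sample = bytes([1, 0, 0, 1, 0, 0])
    for name, crossed in decode_rsq_flags(sample).items():
        print(f"{name}: {'crossed' if crossed else 'not crossed'}")
```

+ Reading the real file requires debugfs to be mounted and, typically, root
+ privileges. Fields beyond offset 0x05 are left out on purpose, since their
+ widths are not given in the table.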
diff --git a/Documentation/admin-guide/media/siano-cardlist.rst b/Documentation/admin-guide/media/siano-cardlist.rst
new file mode 100644
index 000000000000..bb731a953878
--- /dev/null
+++ b/Documentation/admin-guide/media/siano-cardlist.rst
@@ -0,0 +1,56 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Siano cards list
+================
+
+.. tabularcolumns:: p{13.3cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 17 16
+ :stub-columns: 0
+
+ * - Card name
+ - USB IDs
+ * - Hauppauge Catamount
+ - 2040:1700
+ * - Hauppauge Okemo-A
+ - 2040:1800
+ * - Hauppauge Okemo-B
+ - 2040:1801
+ * - Hauppauge WinTV MiniCard
+ - 2040:2000, 2040:200a, 2040:2010, 2040:2011, 2040:2019
+ * - Hauppauge WinTV MiniCard Rev 2
+ - 2040:2009
+ * - Hauppauge WinTV MiniStick
+ - 2040:5500, 2040:5510, 2040:5520, 2040:5530, 2040:5580, 2040:5590, 2040:b900, 2040:b910, 2040:b980, 2040:b990, 2040:c000, 2040:c010, 2040:c080, 2040:c090, 2040:c0a0, 2040:f5a0
+ * - Hauppauge microStick 77e
+ - 2013:0257
+ * - ONDA Data Card Digital Receiver
+ - 19D2:0078
+ * - Siano Denver (ATSC-M/H) Digital Receiver
+ - 187f:0800
+ * - Siano Denver (TDMB) Digital Receiver
+ - 187f:0700
+ * - Siano Ming Digital Receiver
+ - 187f:0310
+ * - Siano Nice Digital Receiver
+ - 187f:0202, 187f:0202
+ * - Siano Nova A Digital Receiver
+ - 187f:0200
+ * - Siano Nova B Digital Receiver
+ - 187f:0201
+ * - Siano Pele Digital Receiver
+ - 187f:0500
+ * - Siano Rio Digital Receiver
+ - 187f:0600, 3275:0080
+ * - Siano Stellar Digital Receiver
+ - 187f:0100
+ * - Siano Stellar Digital Receiver ROM
+ - 187f:0010
+ * - Siano Vega Digital Receiver
+ - 187f:0300
+ * - Siano Venice Digital Receiver
+ - 187f:0301, 187f:0301, 187f:0302
+ * - ZTE Data Card Digital Receiver
+ - 19D2:0086
diff --git a/Documentation/admin-guide/media/technisat.rst b/Documentation/admin-guide/media/technisat.rst
new file mode 100644
index 000000000000..9eaa12366bbf
--- /dev/null
+++ b/Documentation/admin-guide/media/technisat.rst
@@ -0,0 +1,100 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+How to set up the Technisat/B2C2 Flexcop devices
+================================================
+
+.. note::
+
+ This documentation is outdated.
+
+Author: Uwe Bugla <uwe.bugla@gmx.de> August 2009
+
+Find out what device you have
+-----------------------------
+
+Important Notice: The driver does NOT support Technisat USB 2 devices!
+
+First start your Linux box with a shipped kernel:
+
+.. code-block:: none
+
+ lspci -vvv for a PCI device (lsusb -vvv for a USB device) will show you for example:
+ 02:0b.0 Network controller: Techsan Electronics Co Ltd B2C2 FlexCopII DVB chip /
+ Technisat SkyStar2 DVB card (rev 02)
+
+ dmesg | grep frontend may show you for example:
+ DVB: registering frontend 0 (Conexant CX24123/CX24109)...
+
+Kernel compilation:
+-------------------
+
+If the Flexcop / Technisat is the only DVB / TV / Radio device in your box,
+get rid of unnecessary modules and check this one:
+
+``Multimedia support`` => ``Customise analog and hybrid tuner modules to build``
+
+In this menu, uncheck every driver that is enabled
+(except ``Simple tuner support``, which is needed for ATSC 3rd generation only; see case 9).
+
+Then please activate:
+
+- Main module part:
+
+ ``Multimedia support`` => ``DVB/ATSC adapters`` => ``Technisat/B2C2 FlexcopII(b) and FlexCopIII adapters``
+
+ #) => ``Technisat/B2C2 Air/Sky/Cable2PC PCI`` (PCI card) or
+ #) => ``Technisat/B2C2 Air/Sky/Cable2PC USB`` (USB 1.1 adapter)
+ and for troubleshooting purposes:
+ #) => ``Enable debug for the B2C2 FlexCop drivers``
+
+- Frontend / Tuner / Demodulator module part:
+
+ ``Multimedia support`` => ``DVB/ATSC adapters``
+ => ``Customise the frontend modules to build`` ``Customise DVB frontends`` =>
+
+ - SkyStar DVB-S Revision 2.3:
+
+ #) => ``Zarlink VP310/MT312/ZL10313 based``
+ #) => ``Generic I2C PLL based tuners``
+
+ - SkyStar DVB-S Revision 2.6:
+
+ #) => ``ST STV0299 based``
+ #) => ``Generic I2C PLL based tuners``
+
+ - SkyStar DVB-S Revision 2.7:
+
+ #) => ``Samsung S5H1420 based``
+ #) => ``Integrant ITD1000 Zero IF tuner for DVB-S/DSS``
+ #) => ``ISL6421 SEC controller``
+
+ - SkyStar DVB-S Revision 2.8:
+
+ #) => ``Conexant CX24123 based``
+ #) => ``Conexant CX24113/CX24128 tuner for DVB-S/DSS``
+ #) => ``ISL6421 SEC controller``
+
+ - AirStar DVB-T card:
+
+ #) => ``Zarlink MT352 based``
+ #) => ``Generic I2C PLL based tuners``
+
+ - CableStar DVB-C card:
+
+ #) => ``ST STV0297 based``
+ #) => ``Generic I2C PLL based tuners``
+
+ - AirStar ATSC card 1st generation:
+
+ #) => ``Broadcom BCM3510``
+
+ - AirStar ATSC card 2nd generation:
+
+ #) => ``NxtWave Communications NXT2002/NXT2004 based``
+ #) => ``Generic I2C PLL based tuners``
+
+ - AirStar ATSC card 3rd generation:
+
+ #) => ``LG Electronics LGDT3302/LGDT3303 based``
+ #) ``Multimedia support`` => ``Customise analog and hybrid tuner modules to build`` => ``Simple tuner support``
+
diff --git a/Documentation/admin-guide/media/tm6000-cardlist.rst b/Documentation/admin-guide/media/tm6000-cardlist.rst
new file mode 100644
index 000000000000..6d2769c0f4d8
--- /dev/null
+++ b/Documentation/admin-guide/media/tm6000-cardlist.rst
@@ -0,0 +1,83 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+TM6000 cards list
+=================
+
+.. tabularcolumns:: |p{1.4cm}|p{11.1cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - USB IDs
+
+ * - 0
+ - Unknown tm6000 video grabber
+ -
+
+ * - 1
+ - Generic tm5600 board
+ - 6000:0001
+
+ * - 2
+ - Generic tm6000 board
+ -
+
+ * - 3
+ - Generic tm6010 board
+ - 6000:0002
+
+ * - 4
+ - 10Moons UT 821
+ -
+
+ * - 5
+ - 10Moons UT 330
+ -
+
+ * - 6
+ - ADSTECH Dual TV USB
+ - 06e1:f332
+
+ * - 7
+ - Freecom Hybrid Stick / Moka DVB-T Receiver Dual
+ - 14aa:0620
+
+ * - 8
+ - ADSTECH Mini Dual TV USB
+ - 06e1:b339
+
+ * - 9
+ - Hauppauge WinTV HVR-900H / WinTV USB2-Stick
+ - 2040:6600, 2040:6601, 2040:6610, 2040:6611
+
+ * - 10
+ - Beholder Wander DVB-T/TV/FM USB2.0
+ - 6000:dec0
+
+ * - 11
+ - Beholder Voyager TV/FM USB2.0
+ - 6000:dec1
+
+ * - 12
+ - Terratec Cinergy Hybrid XE / Cinergy Hybrid-Stick
+ - 0ccd:0086, 0ccd:00A5
+
+ * - 13
+ - Twinhan TU501(704D1)
+ - 13d3:3240, 13d3:3241, 13d3:3243, 13d3:3264
+
+ * - 14
+ - Beholder Wander Lite DVB-T/TV/FM USB2.0
+ - 6000:dec2
+
+ * - 15
+ - Beholder Voyager Lite TV/FM USB2.0
+ - 6000:dec3
+
+ * - 16
+ - Terratec Grabster AV 150/250 MX
+ - 0ccd:0079
diff --git a/Documentation/admin-guide/media/ttusb-dec.rst b/Documentation/admin-guide/media/ttusb-dec.rst
new file mode 100644
index 000000000000..516bbab8a872
--- /dev/null
+++ b/Documentation/admin-guide/media/ttusb-dec.rst
@@ -0,0 +1,45 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+TechnoTrend/Hauppauge DEC USB Driver
+====================================
+
+Driver Status
+-------------
+
+Supported:
+
+ - DEC2000-t
+ - DEC2450-t
+ - DEC3000-s
+ - Video Streaming
+ - Audio Streaming
+ - Section Filters
+ - Channel Zapping
+ - Hotplug firmware loader
+
+To Do:
+
+ - Tuner status information
+ - DVB network interface
+ - Streaming video PC->DEC
+ - Conax support for 2450-t
+
+Getting the Firmware
+--------------------
+To download the firmware, use the following commands:
+
+.. code-block:: none
+
+ scripts/get_dvb_firmware dec2000t
+ scripts/get_dvb_firmware dec2540t
+ scripts/get_dvb_firmware dec3000s
+
+
+Hotplug Firmware Loading
+------------------------
+
+Since the 2.6 kernels, the firmware is loaded when the driver module
+is loaded.
+
+Copy the three files downloaded above into the /usr/lib/hotplug/firmware or
+/lib/firmware directory (depending on how firmware hotplug is configured).
diff --git a/Documentation/admin-guide/media/tuner-cardlist.rst b/Documentation/admin-guide/media/tuner-cardlist.rst
new file mode 100644
index 000000000000..362617c59c5d
--- /dev/null
+++ b/Documentation/admin-guide/media/tuner-cardlist.rst
@@ -0,0 +1,100 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Tuner cards list
+================
+
+============ =====================================================
+Tuner number Card name
+============ =====================================================
+0 Temic PAL (4002 FH5)
+1 Philips PAL_I (FI1246 and compatibles)
+2 Philips NTSC (FI1236,FM1236 and compatibles)
+3 Philips (SECAM+PAL_BG) (FI1216MF, FM1216MF, FR1216MF)
+4 NoTuner
+5 Philips PAL_BG (FI1216 and compatibles)
+6 Temic NTSC (4032 FY5)
+7 Temic PAL_I (4062 FY5)
+8 Temic NTSC (4036 FY5)
+9 Alps HSBH1
+10 Alps TSBE1
+11 Alps TSBB5
+12 Alps TSBE5
+13 Alps TSBC5
+14 Temic PAL_BG (4006FH5)
+15 Alps TSCH6
+16 Temic PAL_DK (4016 FY5)
+17 Philips NTSC_M (MK2)
+18 Temic PAL_I (4066 FY5)
+19 Temic PAL* auto (4006 FN5)
+20 Temic PAL_BG (4009 FR5) or PAL_I (4069 FR5)
+21 Temic NTSC (4039 FR5)
+22 Temic PAL/SECAM multi (4046 FM5)
+23 Philips PAL_DK (FI1256 and compatibles)
+24 Philips PAL/SECAM multi (FQ1216ME)
+25 LG PAL_I+FM (TAPC-I001D)
+26 LG PAL_I (TAPC-I701D)
+27 LG NTSC+FM (TPI8NSR01F)
+28 LG PAL_BG+FM (TPI8PSB01D)
+29 LG PAL_BG (TPI8PSB11D)
+30 Temic PAL* auto + FM (4009 FN5)
+31 SHARP NTSC_JP (2U5JF5540)
+32 Samsung PAL TCPM9091PD27
+33 MT20xx universal
+34 Temic PAL_BG (4106 FH5)
+35 Temic PAL_DK/SECAM_L (4012 FY5)
+36 Temic NTSC (4136 FY5)
+37 LG PAL (newer TAPC series)
+38 Philips PAL/SECAM multi (FM1216ME MK3)
+39 LG NTSC (newer TAPC series)
+40 HITACHI V7-J180AT
+41 Philips PAL_MK (FI1216 MK)
+42 Philips FCV1236D ATSC/NTSC dual in
+43 Philips NTSC MK3 (FM1236MK3 or FM1236/F)
+44 Philips 4 in 1 (ATI TV Wonder Pro/Conexant)
+45 Microtune 4049 FM5
+46 Panasonic VP27s/ENGE4324D
+47 LG NTSC (TAPE series)
+48 Tenna TNF 8831 BGFF)
+49 Microtune 4042 FI5 ATSC/NTSC dual in
+50 TCL 2002N
+51 Philips PAL/SECAM_D (FM 1256 I-H3)
+52 Thomson DTT 7610 (ATSC/NTSC)
+53 Philips FQ1286
+54 Philips/NXP TDA 8290/8295 + 8275/8275A/18271
+55 TCL 2002MB
+56 Philips PAL/SECAM multi (FQ1216AME MK4)
+57 Philips FQ1236A MK4
+58 Ymec TVision TVF-8531MF/8831MF/8731MF
+59 Ymec TVision TVF-5533MF
+60 Thomson DTT 761X (ATSC/NTSC)
+61 Tena TNF9533-D/IF/TNF9533-B/DF
+62 Philips TEA5767HN FM Radio
+63 Philips FMD1216ME MK3 Hybrid Tuner
+64 LG TDVS-H06xF
+65 Ymec TVF66T5-B/DFF
+66 LG TALN series
+67 Philips TD1316 Hybrid Tuner
+68 Philips TUV1236D ATSC/NTSC dual in
+69 Tena TNF 5335 and similar models
+70 Samsung TCPN 2121P30A
+71 Xceive xc2028/xc3028 tuner
+72 Thomson FE6600
+73 Samsung TCPG 6121P30A
+75 Philips TEA5761 FM Radio
+76 Xceive 5000 tuner
+77 TCL tuner MF02GIP-5N-E
+78 Philips FMD1216MEX MK3 Hybrid Tuner
+79 Philips PAL/SECAM multi (FM1216 MK5)
+80 Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough
+81 Partsnic (Daewoo) PTI-5NF05
+82 Philips CU1216L
+83 NXP TDA18271
+84 Sony BTF-Pxn01Z
+85 Philips FQ1236 MK5
+86 Tena TNF5337 MFD
+87 Xceive 4000 tuner
+88 Xceive 5000C tuner
+89 Sony BTF-PG472Z PAL/SECAM
+90 Sony BTF-PK467Z NTSC-M-JP
+91 Sony BTF-PB463Z NTSC-M
+============ =====================================================
diff --git a/Documentation/admin-guide/media/usb-cardlist.rst b/Documentation/admin-guide/media/usb-cardlist.rst
new file mode 100644
index 000000000000..1e96f928e0af
--- /dev/null
+++ b/Documentation/admin-guide/media/usb-cardlist.rst
@@ -0,0 +1,156 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+USB drivers
+===========
+
+USB boards are identified by an identifier called the USB ID.
+
+The ``lsusb`` command allows identifying the USB IDs::
+
+ $ lsusb
+ ...
+ Bus 001 Device 015: ID 046d:082d Logitech, Inc. HD Pro Webcam C920
+ Bus 001 Device 074: ID 2040:b131 Hauppauge
+ Bus 001 Device 075: ID 2013:024f PCTV Systems nanoStick T2 290e
+ ...
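+
+ The IDs can also be pulled out of the ``lsusb`` output programmatically,
+ which is handy when comparing them against the card lists. A minimal sketch,
+ assuming the ``ID vvvv:pppp`` field format shown above (``parse_usb_ids()``
+ and ``SAMPLE`` are illustrative names, not part of any kernel tool):

```python
import re

# Matches the "ID vvvv:pppp" field that lsusb prints for every device.
USB_ID_RE = re.compile(r"\bID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\b")


def parse_usb_ids(lsusb_output):
    """Return a list of (vendor, product) hex strings, lowercased."""
    ids = []
    for line in lsusb_output.splitlines():
        match = USB_ID_RE.search(line)
        if match:
            ids.append((match.group(1).lower(), match.group(2).lower()))
    return ids


# Sample lines copied from the lsusb output shown above.
SAMPLE = """\
Bus 001 Device 015: ID 046d:082d Logitech, Inc. HD Pro Webcam C920
Bus 001 Device 074: ID 2040:b131 Hauppauge
Bus 001 Device 075: ID 2013:024f PCTV Systems nanoStick T2 290e
"""

if __name__ == "__main__":
    for vendor, product in parse_usb_ids(SAMPLE):
        print(f"{vendor}:{product}")
```

+ IDs are lowercased because some card lists write the hexadecimal digits in
+ upper case.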
+
+Newer camera devices use a standard way to expose themselves as such,
+via USB Video Class. Those cameras are automatically supported by the
+``uvc-driver``.
+
+Older cameras and TV USB devices use USB Vendor Classes: each vendor
+defines its own way to access the device. This section contains
+card lists for such vendor-class devices.
+
+While this is not as common as on PCI, sometimes the same USB ID is used
+by different products. So, several media drivers allow passing a ``card=``
+parameter, in order to set up a card number that matches the correct
+settings for a specific product type.
+
+The currently supported USB cards (not including staging drivers) are
+listed below\ [#]_.
+
+.. [#]
+
+ Some of the drivers have sub-drivers that are not shown in this table.
+ In particular, the gspca driver has many sub-drivers
+ for cameras not supported by the USB Video Class (UVC) driver,
+ as shown in the :doc:`gspca card list <gspca-cardlist>`.
+
+====================== =========================================================
+Driver Name
+====================== =========================================================
+airspy AirSpy
+au0828 Auvitek AU0828
+b2c2-flexcop-usb Technisat/B2C2 Air/Sky/Cable2PC USB
+cpia2 CPiA2 Video For Linux
+cx231xx Conexant cx231xx USB video capture
+dvb-as102 Abilis AS102 DVB receiver
+dvb-ttusb-budget Technotrend/Hauppauge Nova - USB devices
+dvb-usb-a800 AVerMedia AverTV DVB-T USB 2.0 (A800)
+dvb-usb-af9005 Afatech AF9005 DVB-T USB1.1
+dvb-usb-af9015 Afatech AF9015 DVB-T USB2.0
+dvb-usb-af9035 Afatech AF9035 DVB-T USB2.0
+dvb-usb-anysee Anysee DVB-T/C USB2.0
+dvb-usb-au6610 Alcor Micro AU6610 USB2.0
+dvb-usb-az6007 AzureWave 6007 and clones DVB-T/C USB2.0
+dvb-usb-az6027 Azurewave DVB-S/S2 USB2.0 AZ6027
+dvb-usb-ce6230 Intel CE6230 DVB-T USB2.0
+dvb-usb-cinergyT2 Terratec CinergyT2/qanu USB 2.0 DVB-T
+dvb-usb-cxusb Conexant USB2.0 hybrid
+dvb-usb-dib0700 DiBcom DiB0700
+dvb-usb-dibusb-common DiBcom DiB3000M-B
+dvb-usb-dibusb-mc DiBcom DiB3000M-C/P
+dvb-usb-digitv Nebula Electronics uDigiTV DVB-T USB2.0
+dvb-usb-dtt200u WideView WT-200U and WT-220U (pen) DVB-T
+dvb-usb-dtv5100 AME DTV-5100 USB2.0 DVB-T
+dvb-usb-dvbsky DVBSky USB
+dvb-usb-dw2102 DvbWorld & TeVii DVB-S/S2 USB2.0
+dvb-usb-ec168 E3C EC168 DVB-T USB2.0
+dvb-usb-gl861 Genesys Logic GL861 USB2.0
+dvb-usb-gp8psk GENPIX 8PSK->USB module
+dvb-usb-lmedm04 LME DM04/QQBOX DVB-S USB2.0
+dvb-usb-m920x Uli m920x DVB-T USB2.0
+dvb-usb-nova-t-usb2 Hauppauge WinTV-NOVA-T usb2 DVB-T USB2.0
+dvb-usb-opera Opera1 DVB-S USB2.0 receiver
+dvb-usb-pctv452e Pinnacle PCTV HDTV Pro USB device/TT Connect S2-3600
+dvb-usb-rtl28xxu Realtek RTL28xxU DVB USB
+dvb-usb-technisat-usb2 Technisat DVB-S/S2 USB2.0
+dvb-usb-ttusb2 Pinnacle 400e DVB-S USB2.0
+dvb-usb-umt-010 HanfTek UMT-010 DVB-T USB2.0
+dvb_usb_v2 Support for various USB DVB devices v2
+dvb-usb-vp702x TwinhanDTV StarBox and clones DVB-S USB2.0
+dvb-usb-vp7045 TwinhanDTV Alpha/MagicBoxII, DNTV tinyUSB2, Beetle USB2.0
+em28xx Empia EM28xx USB devices
+go7007 WIS GO7007 MPEG encoder
+gspca Drivers for several USB Cameras
+hackrf HackRF
+hdpvr Hauppauge HD PVR
+msi2500 Mirics MSi2500
+mxl111sf-tuner MxL111SF DTV USB2.0
+pvrusb2 Hauppauge WinTV-PVR USB2
+pwc USB Philips Cameras
+s2250 Sensoray 2250/2251
+s2255drv USB Sensoray 2255 video capture device
+smsusb Siano SMS1xxx based MDTV receiver
+stkwebcam USB Syntek DC1125 Camera
+tm6000-alsa TV Master TM5600/6000/6010 audio
+tm6000-dvb DVB Support for tm6000 based TV cards
+tm6000 TV Master TM5600/6000/6010 driver
+ttusb_dec Technotrend/Hauppauge USB DEC devices
+usbtv USBTV007 video capture
+uvcvideo USB Video Class (UVC)
+zd1301 ZyDAS ZD1301
+zr364xx USB ZR364XX Camera
+====================== =========================================================
+
+.. toctree::
+ :maxdepth: 1
+
+ au0828-cardlist
+ cx231xx-cardlist
+ em28xx-cardlist
+ tm6000-cardlist
+ siano-cardlist
+
+ gspca-cardlist
+
+ dvb-usb-dib0700-cardlist
+ dvb-usb-dibusb-mb-cardlist
+ dvb-usb-dibusb-mc-cardlist
+
+ dvb-usb-a800-cardlist
+ dvb-usb-af9005-cardlist
+ dvb-usb-az6027-cardlist
+ dvb-usb-cinergyT2-cardlist
+ dvb-usb-cxusb-cardlist
+ dvb-usb-digitv-cardlist
+ dvb-usb-dtt200u-cardlist
+ dvb-usb-dtv5100-cardlist
+ dvb-usb-dw2102-cardlist
+ dvb-usb-gp8psk-cardlist
+ dvb-usb-m920x-cardlist
+ dvb-usb-nova-t-usb2-cardlist
+ dvb-usb-opera1-cardlist
+ dvb-usb-pctv452e-cardlist
+ dvb-usb-technisat-usb2-cardlist
+ dvb-usb-ttusb2-cardlist
+ dvb-usb-umt-010-cardlist
+ dvb-usb-vp702x-cardlist
+ dvb-usb-vp7045-cardlist
+
+ dvb-usb-af9015-cardlist
+ dvb-usb-af9035-cardlist
+ dvb-usb-anysee-cardlist
+ dvb-usb-au6610-cardlist
+ dvb-usb-az6007-cardlist
+ dvb-usb-ce6230-cardlist
+ dvb-usb-dvbsky-cardlist
+ dvb-usb-ec168-cardlist
+ dvb-usb-gl861-cardlist
+ dvb-usb-lmedm04-cardlist
+ dvb-usb-mxl111sf-cardlist
+ dvb-usb-rtl28xxu-cardlist
+ dvb-usb-zd1301-cardlist
+
+ other-usb-cardlist
diff --git a/Documentation/admin-guide/media/v4l-drivers.rst b/Documentation/admin-guide/media/v4l-drivers.rst
new file mode 100644
index 000000000000..9c7ebe2ca3bd
--- /dev/null
+++ b/Documentation/admin-guide/media/v4l-drivers.rst
@@ -0,0 +1,34 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _uapi-v4l-drivers:
+
+===============================================
+Video4Linux (V4L) driver-specific documentation
+===============================================
+
+.. toctree::
+ :maxdepth: 2
+
+ bttv
+ cafe_ccic
+ cpia2
+ cx88
+ davinci-vpbe
+ fimc
+ imx
+ imx7
+ ipu3
+ ivtv
+ meye
+ omap3isp
+ omap4_camera
+ philips
+ qcom_camss
+ rcar-fdp1
+ rkisp1
+ saa7134
+ si470x
+ si4713
+ si476x
+ vimc
+ vivid
diff --git a/Documentation/admin-guide/media/vimc.dot b/Documentation/admin-guide/media/vimc.dot
new file mode 100644
index 000000000000..92a5bb631235
--- /dev/null
+++ b/Documentation/admin-guide/media/vimc.dot
@@ -0,0 +1,26 @@
+# SPDX-License-Identifier: GPL-2.0
+
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{} | Sensor A\n/dev/v4l-subdev0 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port0 -> n00000005:port0 [style=bold]
+ n00000001:port0 -> n0000000b [style=bold]
+ n00000001 -> n00000002
+ n00000002 [label="{{} | Lens A\n/dev/v4l-subdev5 | {<port0>}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000003 [label="{{} | Sensor B\n/dev/v4l-subdev1 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000003:port0 -> n00000008:port0 [style=bold]
+ n00000003:port0 -> n0000000f [style=bold]
+ n00000003 -> n00000004
+ n00000004 [label="{{} | Lens B\n/dev/v4l-subdev6 | {<port0>}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000005 [label="{{<port0> 0} | Debayer A\n/dev/v4l-subdev2 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000005:port1 -> n00000015:port0
+ n00000008 [label="{{<port0> 0} | Debayer B\n/dev/v4l-subdev3 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000008:port1 -> n00000015:port0 [style=dashed]
+ n0000000b [label="Raw Capture 0\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n0000000f [label="Raw Capture 1\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
+ n00000013 [label="{{} | RGB/YUV Input\n/dev/v4l-subdev4 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000013:port0 -> n00000015:port0 [style=dashed]
+ n00000015 [label="{{<port0> 0} | Scaler\n/dev/v4l-subdev5 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000015:port1 -> n00000018 [style=bold]
+ n00000018 [label="RGB/YUV Capture\n/dev/video2", shape=box, style=filled, fillcolor=yellow]
+}
diff --git a/Documentation/admin-guide/media/vimc.rst b/Documentation/admin-guide/media/vimc.rst
new file mode 100644
index 000000000000..3b4d2b36b4f3
--- /dev/null
+++ b/Documentation/admin-guide/media/vimc.rst
@@ -0,0 +1,110 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual Media Controller Driver (vimc)
+==========================================
+
+The vimc driver emulates complex video hardware using the V4L2 API and the Media
+API. It has a capture device and three subdevices: sensor, debayer and scaler.
+
+Topology
+--------
+
+The topology is hardcoded, although you could modify it in vimc-core and
+recompile the driver to achieve your own topology. This is the default topology:
+
+.. _vimc_topology_graph:
+
+.. kernel-figure:: vimc.dot
+ :alt: Diagram of the default media pipeline topology
+ :align: center
+
+ Media pipeline graph on vimc
+
+Configuring the topology
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each subdevice comes with its default configuration (pixelformat, height,
+width, ...). To stream frames through the pipeline, the configuration of each
+linked subdevice must match; if it doesn't, the stream will fail. The
+``v4l-utils`` package is a bundle of user-space applications that includes
+``media-ctl`` and ``v4l2-ctl``, which can be used to configure vimc. This
+sequence of commands works for the default topology:
+
+.. code-block:: bash
+
+ media-ctl -d platform:vimc -V '"Sensor A":0[fmt:SBGGR8_1X8/640x480]'
+ media-ctl -d platform:vimc -V '"Debayer A":0[fmt:SBGGR8_1X8/640x480]'
+ media-ctl -d platform:vimc -V '"Sensor B":0[fmt:SBGGR8_1X8/640x480]'
+ media-ctl -d platform:vimc -V '"Debayer B":0[fmt:SBGGR8_1X8/640x480]'
+ v4l2-ctl -z platform:vimc -d "RGB/YUV Capture" -v width=1920,height=1440
+ v4l2-ctl -z platform:vimc -d "Raw Capture 0" -v pixelformat=BA81
+ v4l2-ctl -z platform:vimc -d "Raw Capture 1" -v pixelformat=BA81
+
+Subdevices
+----------
+
+Subdevices define the behavior of an entity in the topology. Depending on the
+subdevice, the entity can have multiple pads of type source or sink.
+
+vimc-sensor:
+ Generates images in several formats using the video test pattern generator.
+ Exposes:
+
+ * 1 Pad source
+
+vimc-lens:
+ Ancillary lens for a sensor. Supports auto focus control. Linked to
+ a vimc-sensor using an ancillary link. The lens supports FOCUS_ABSOLUTE
+ control.
+
+.. code-block:: bash
+
+ media-ctl -p
+ ...
+ - entity 28: Lens A (0 pad, 0 link)
+ type V4L2 subdev subtype Lens flags 0
+ device node name /dev/v4l-subdev6
+ - entity 29: Lens B (0 pad, 0 link)
+ type V4L2 subdev subtype Lens flags 0
+ device node name /dev/v4l-subdev7
+ v4l2-ctl -d /dev/v4l-subdev7 -C focus_absolute
+ focus_absolute: 0
+
+
+vimc-debayer:
+ Transforms images in bayer format into a non-bayer format.
+ Exposes:
+
+ * 1 Pad sink
+ * 1 Pad source
+
+vimc-scaler:
+ Resizes the image to match the source pad resolution. E.g.: if the sink
+ pad is configured to 360x480 and the source to 1280x720, the image will
+ be stretched to fit the source resolution. Works for any resolution
+ within the vimc limitations (even shrinking the image if necessary).
+ Exposes:
+
+ * 1 Pad sink
+ * 1 Pad source
+
+vimc-capture:
+ Exposes node /dev/videoX to allow userspace to capture the stream.
+ Exposes:
+
+ * 1 Pad sink
+ * 1 Pad source
+
+Module options
+--------------
+
+Vimc has a module parameter to configure the driver.
+
+* ``allocator=<unsigned int>``
+
+ memory allocator selection, default is 0. It specifies the way buffers
+ will be allocated.
+
+ - 0: vmalloc
+ - 1: dma-contig
diff --git a/Documentation/admin-guide/media/vivid.rst b/Documentation/admin-guide/media/vivid.rst
new file mode 100644
index 000000000000..abd90ed31090
--- /dev/null
+++ b/Documentation/admin-guide/media/vivid.rst
@@ -0,0 +1,1416 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+The Virtual Video Test Driver (vivid)
+=====================================
+
+This driver emulates video4linux hardware of various types: video capture, video
+output, vbi capture and output, metadata capture and output, radio receivers and
+transmitters, touch capture and a software defined radio receiver. In addition a
+simple framebuffer device is available for testing capture and output overlays.
+
+Up to 64 vivid instances can be created, each with up to 16 inputs and 16 outputs.
+
+Each input can be a webcam, TV capture device, S-Video capture device or an HDMI
+capture device. Each output can be an S-Video output device or an HDMI output
+device.
+
+These inputs and outputs act exactly as a real hardware device would behave. This
+allows you to use this driver as a test input for application development, since
+you can test the various features without requiring special hardware.
+
+This document describes the features implemented by this driver:
+
+- Support for read()/write(), MMAP, USERPTR and DMABUF streaming I/O.
+- A large list of test patterns and variations thereof
+- Working brightness, contrast, saturation and hue controls
+- Support for the alpha color component
+- Full colorspace support, including limited/full RGB range
+- All possible control types are present
+- Support for various pixel aspect ratios and video aspect ratios
+- Error injection to test what happens if errors occur
+- Supports crop/compose/scale in any combination for both input and output
+- Can emulate up to 4K resolutions
+- All Field settings are supported for testing interlaced capturing
+- Supports all standard YUV and RGB formats, including two multiplanar YUV formats
+- Raw and Sliced VBI capture and output support
+- Radio receiver and transmitter support, including RDS support
+- Software defined radio (SDR) support
+- Capture and output overlay support
+- Metadata capture and output support
+- Touch capture support
+
+These features will be described in more detail below.
+
+Configuring the driver
+----------------------
+
+By default the driver will create a single instance that has a video capture
+device with webcam, TV, S-Video and HDMI inputs, a video output device with
+S-Video and HDMI outputs, one vbi capture device, one vbi output device, one
+radio receiver device, one radio transmitter device and one SDR device.
+
+The number of instances, devices, video inputs and outputs and their types are
+all configurable using the following module options:
+
+- n_devs:
+
+ number of driver instances to create. By default set to 1. Up to 64
+ instances can be created.
+
+- node_types:
+
+ which devices each driver instance should create. An array of
+ hexadecimal values, one for each instance. The default is 0x1d3d.
+ Each value is a bitmask with the following meaning:
+
+ - bit 0: Video Capture node
+ - bit 2-3: VBI Capture node: 0 = none, 1 = raw vbi, 2 = sliced vbi, 3 = both
+ - bit 4: Radio Receiver node
+ - bit 5: Software Defined Radio Receiver node
+ - bit 8: Video Output node
+ - bit 10-11: VBI Output node: 0 = none, 1 = raw vbi, 2 = sliced vbi, 3 = both
+ - bit 12: Radio Transmitter node
+ - bit 16: Framebuffer for testing overlays
+ - bit 17: Metadata Capture node
+ - bit 18: Metadata Output node
+ - bit 19: Touch Capture node
+
+ So to create four instances, the first two with just one video capture
+ device, the second two with just one video output device you would pass
+ these module options to vivid:
+
+ .. code-block:: none
+
+ n_devs=4 node_types=0x1,0x1,0x100,0x100
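+
+ As a rough sketch of how such a bitmask is put together, the snippet below
+ builds a ``node_types`` value from the bit positions listed above. The
+ ``NODE_BITS`` table and the ``node_types()`` helper are illustrative names
+ chosen for this example; they are not part of vivid.

```python
# Single-bit nodes, using the bit positions from the node_types list above.
NODE_BITS = {
    "vid_cap": 0,
    "radio_rx": 4,
    "sdr_cap": 5,
    "vid_out": 8,
    "radio_tx": 12,
    "framebuffer": 16,
    "meta_cap": 17,
    "meta_out": 18,
    "touch_cap": 19,
}

# Two-bit VBI fields: 0 = none, 1 = raw vbi, 2 = sliced vbi, 3 = both.
VBI_CAP_SHIFT = 2
VBI_OUT_SHIFT = 10


def node_types(nodes=(), vbi_cap=0, vbi_out=0):
    """Combine the requested nodes into a single node_types bitmask."""
    for mode in (vbi_cap, vbi_out):
        if not 0 <= mode <= 3:
            raise ValueError("VBI mode must be in the range 0-3")
    value = (vbi_cap << VBI_CAP_SHIFT) | (vbi_out << VBI_OUT_SHIFT)
    for name in nodes:
        value |= 1 << NODE_BITS[name]
    return value


if __name__ == "__main__":
    default = node_types(
        ["vid_cap", "radio_rx", "sdr_cap", "vid_out", "radio_tx"],
        vbi_cap=3,
        vbi_out=3,
    )
    print(hex(default))
```

+ With these inputs the helper reproduces the documented default of 0x1d3d,
+ and ``node_types(["vid_cap"])`` and ``node_types(["vid_out"])`` give the 0x1
+ and 0x100 values used in the example above.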
+
+- num_inputs:
+
+ the number of inputs, one for each instance. By default 4 inputs
+ are created for each video capture device. At most 16 inputs can be created,
+ and there must be at least one.
+
+- input_types:
+
+ the input types for each instance, the default is 0xe4. This defines
+ what the type of each input is when the inputs are created for each driver
+ instance. This is a hexadecimal value with up to 16 pairs of bits. Each
+ pair gives the type: bits 0-1 map to input 0, bits 2-3 map to input 1, and
+ so on up to bits 30-31, which map to input 15. Each pair of bits has the following meaning:
+
+ - 00: this is a webcam input
+ - 01: this is a TV tuner input
+ - 10: this is an S-Video input
+ - 11: this is an HDMI input
+
+ So to create a video capture device with 8 inputs where input 0 is a TV
+ tuner, inputs 1-3 are S-Video inputs and inputs 4-7 are HDMI inputs you
+ would use the following module options:
+
+ .. code-block:: none
+
+ num_inputs=8 input_types=0xffa9
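+
+ The two-bits-per-input packing can be checked with a few lines of Python.
+ This is only a sketch: ``INPUT_CODES`` and ``input_types()`` are names made
+ up for the example, and the codes are copied from the list above.

```python
# Two-bit type codes, as listed for the input_types option.
INPUT_CODES = {"webcam": 0b00, "tv": 0b01, "svideo": 0b10, "hdmi": 0b11}


def input_types(inputs):
    """Pack a list of input kinds into an input_types value.

    Input N occupies bits 2*N and 2*N + 1.
    """
    if not 1 <= len(inputs) <= 16:
        raise ValueError("between 1 and 16 inputs are allowed")
    value = 0
    for index, kind in enumerate(inputs):
        value |= INPUT_CODES[kind] << (2 * index)
    return value


if __name__ == "__main__":
    # Input 0 is a TV tuner, inputs 1-3 are S-Video, inputs 4-7 are HDMI.
    wanted = ["tv"] + ["svideo"] * 3 + ["hdmi"] * 4
    print(f"num_inputs={len(wanted)} input_types={input_types(wanted):#x}")
```

+ The same helper gives 0xe4 for the default set of webcam, TV, S-Video and
+ HDMI inputs.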
+
+- num_outputs:
+
+ the number of outputs, one for each instance. By default 2 outputs
+ are created for each video output device. At most 16 outputs can be
+ created, and there must be at least one.
+
+- output_types:
+
+ the output types for each instance, the default is 0x02. This defines
+ what the type of each output is when the outputs are created for each
+ driver instance. This is a hexadecimal value with up to 16 bits. Each bit
+ gives the type: bit 0 maps to output 0, bit 1 maps to output 1, and so on
+ up to bit 15, which maps to output 15. The meaning of each bit is as follows:
+
+ - 0: this is an S-Video output
+ - 1: this is an HDMI output
+
+ So to create a video output device with 8 outputs where outputs 0-3 are
+ S-Video outputs and outputs 4-7 are HDMI outputs you would use the
+ following module options:
+
+ .. code-block:: none
+
+ num_outputs=8 output_types=0xf0
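+
+ The one-bit-per-output encoding can be sketched the same way. The
+ ``output_types()`` helper below is a hypothetical name used only for
+ illustration.

```python
def output_types(outputs):
    """Pack a list of output kinds into an output_types value.

    Bit N is 0 for an S-Video output and 1 for an HDMI output.
    """
    if not 1 <= len(outputs) <= 16:
        raise ValueError("between 1 and 16 outputs are allowed")
    value = 0
    for index, kind in enumerate(outputs):
        if kind == "hdmi":
            value |= 1 << index
        elif kind != "svideo":
            raise ValueError(f"unknown output kind: {kind!r}")
    return value


if __name__ == "__main__":
    # Outputs 0-3 are S-Video, outputs 4-7 are HDMI.
    wanted = ["svideo"] * 4 + ["hdmi"] * 4
    print(f"num_outputs={len(wanted)} output_types={output_types(wanted):#x}")
```

+ For the default of one S-Video and one HDMI output the helper returns 0x02.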
+
+- vid_cap_nr:
+
+ give the desired videoX start number for each video capture device.
+ The default is -1 which will just take the first free number. This allows
+ you to map capture video nodes to specific videoX device nodes. Example:
+
+ .. code-block:: none
+
+ n_devs=4 vid_cap_nr=2,4,6,8
+
+ This will attempt to assign /dev/video2 to the video capture device of
+ the first vivid instance, video4 to the next, and so on up to video8 for
+ the last instance. If that fails, it will just take the next free
+ number.
+
+- vid_out_nr:
+
+ give the desired videoX start number for each video output device.
+ The default is -1 which will just take the first free number.
+
+- vbi_cap_nr:
+
+ give the desired vbiX start number for each vbi capture device.
+ The default is -1 which will just take the first free number.
+
+- vbi_out_nr:
+
+ give the desired vbiX start number for each vbi output device.
+ The default is -1 which will just take the first free number.
+
+- radio_rx_nr:
+
+ give the desired radioX start number for each radio receiver device.
+ The default is -1 which will just take the first free number.
+
+- radio_tx_nr:
+
+ give the desired radioX start number for each radio transmitter
+ device. The default is -1 which will just take the first free number.
+
+- sdr_cap_nr:
+
+ give the desired swradioX start number for each SDR capture device.
+ The default is -1 which will just take the first free number.
+
+- meta_cap_nr:
+
+ give the desired videoX start number for each metadata capture device.
+ The default is -1 which will just take the first free number.
+
+- meta_out_nr:
+
+ give the desired videoX start number for each metadata output device.
+ The default is -1 which will just take the first free number.
+
+- touch_cap_nr:
+
+ give the desired v4l-touchX start number for each touch capture device.
+ The default is -1 which will just take the first free number.
+
+- ccs_cap_mode:
+
+ specify the allowed video capture crop/compose/scaling combination
+ for each driver instance. Video capture devices can have any combination
+ of cropping, composing and scaling capabilities and this will tell the
+ vivid driver which of those it should emulate. By default the user can
+ select this through controls.
+
+ The value is either -1 (controlled by the user) or a set of three bits,
+ each enabling (1) or disabling (0) one of the features:
+
+ - bit 0:
+
+ Enable crop support. Cropping will take only part of the
+ incoming picture.
+ - bit 1:
+
+ Enable compose support. Composing will copy the incoming
+ picture into a larger buffer.
+
+ - bit 2:
+
+ Enable scaling support. Scaling can scale the incoming
+ picture. The scaler of the vivid driver can scale up or
+ down by up to a factor of four.
+ very simple and low-quality. Simplicity and speed were
+ key, not quality.
+
+ Note that this value is ignored by webcam inputs: those enumerate
+ discrete framesizes and that is incompatible with cropping, composing
+ or scaling.
+
+- ccs_out_mode:
+
+ specify the allowed video output crop/compose/scaling combination
+ for each driver instance. Video output devices can have any combination
+ of cropping, composing and scaling capabilities and this will tell the
+ vivid driver which of those it should emulate. By default the user can
+ select this through controls.
+
+ The value is either -1 (controlled by the user) or a set of three bits,
+ each enabling (1) or disabling (0) one of the features:
+
+ - bit 0:
+
+ Enable crop support. Cropping will take only part of the
+ outgoing buffer.
+
+ - bit 1:
+
+ Enable compose support. Composing will copy the incoming
+ buffer into a larger picture frame.
+
+ - bit 2:
+
+ Enable scaling support. Scaling can scale the incoming
+ buffer. The scaler of the vivid driver can scale up
+ or down by up to a factor of four. The scaler is
+ very simple and low-quality. Simplicity and speed were
+ key, not quality.
+
+- multiplanar:
+
+ select whether each device instance supports multi-planar formats,
+ and thus the V4L2 multi-planar API. By default device instances are
+ single-planar.
+
+ This module option can override that for each instance. Values are:
+
+ - 1: this is a single-planar instance.
+ - 2: this is a multi-planar instance.
+
+- vivid_debug:
+
+ enable driver debugging info
+
+- no_error_inj:
+
+ if set, disable the error injection controls. This option is
+ needed in order to run a tool like v4l2-compliance. Tools like that
+ exercise all controls including a control like 'Disconnect' which
+ emulates a USB disconnect, making the device inaccessible and so
+ all tests that v4l2-compliance is doing will fail afterwards.
+
+ There may be other situations as well where you want to disable the
+ error injection support of vivid. When this option is set, then the
+ controls that select crop, compose and scale behavior are also
+ removed. Unless overridden by ccs_cap_mode and/or ccs_out_mode the
+ driver will default to enabling crop, compose and scaling.
+
+- allocators:
+
+ memory allocator selection, default is 0. It specifies the way buffers
+ will be allocated.
+
+ - 0: vmalloc
+ - 1: dma-contig
+
+- cache_hints:
+
+ specifies if the device should set queues' user-space cache and memory
+ consistency hint capability (V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS).
+ The hints are valid only when using MMAP streaming I/O. Default is 0.
+
+ - 0: forbid hints
+ - 1: allow hints
+
+Taken together, all these module options allow you to precisely customize
+the driver behavior and test your application with all sorts of permutations.
+It is also very suitable to emulate hardware that is not yet available, e.g.
+when developing software for a new upcoming device.
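The three-bit encoding used by the ccs_cap_mode and ccs_out_mode options above can be illustrated with a small sketch. The helper name `decode_ccs_mode` is hypothetical and not part of vivid; the 0-7 range check is an assumption of this sketch, based only on the statement that the value is "a set of three bits".

```python
# Bit positions as described for ccs_cap_mode / ccs_out_mode.
CROP = 1 << 0      # bit 0: crop support
COMPOSE = 1 << 1   # bit 1: compose support
SCALE = 1 << 2     # bit 2: scaling support


def decode_ccs_mode(value):
    """Describe which features a ccs_*_mode value enables.

    Hypothetical helper for illustration; it is not part of the driver.
    """
    if value == -1:
        # -1 leaves the choice to the user (through controls).
        return "user-controlled"
    if not 0 <= value <= 7:
        # Assumption of this sketch: only the three documented bits are valid.
        raise ValueError("expected -1 or a 3-bit value (0-7)")
    return {
        "crop": bool(value & CROP),
        "compose": bool(value & COMPOSE),
        "scale": bool(value & SCALE),
    }


print(decode_ccs_mode(5))
```

A value of 5 (binary 101) therefore selects crop and scaling without composing, and 7 selects all three.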
+
+
+Video Capture
+-------------
+
+This is probably the most frequently used feature. The video capture device
+can be configured by using the module options num_inputs, input_types and
+ccs_cap_mode (see section 1 for more detailed information), but by default
+four inputs are configured: a webcam, a TV tuner, an S-Video and an HDMI
+input, one input for each input type. Those are described in more detail
+below.
+
+Special attention has been given to the rate at which new frames become
+available. The jitter will be around 1 jiffy (that depends on the HZ
+configuration of your kernel, so usually 1/100, 1/250 or 1/1000 of a second),
+but the long-term behavior is exactly following the framerate. So a
+framerate of 59.94 Hz is really different from 60 Hz. If the framerate
+exceeds your kernel's HZ value, then you will get dropped frames, but the
+frame/field sequence counting will keep track of that so the sequence
+count will skip whenever frames are dropped.
+
+
+Webcam Input
+~~~~~~~~~~~~
+
+The webcam input supports three framesizes: 320x180, 640x360 and 1280x720. It
+supports frames per second settings of 10, 15, 25, 30, 50 and 60 fps. Which ones
+are available depends on the chosen framesize: the larger the framesize, the
+lower the maximum frames per second.
+
+The initially selected colorspace when you switch to the webcam input will be
+sRGB.
+
+
+TV and S-Video Inputs
+~~~~~~~~~~~~~~~~~~~~~
+
+The only difference between the TV and S-Video input is that the TV has a
+tuner. Otherwise they behave identically.
+
+These inputs support audio inputs as well: one TV and one Line-In. They
+both support all TV standards. If the standard is queried, then the Vivid
+controls 'Standard Signal Mode' and 'Standard' determine what
+the result will be.
+
+These inputs support all combinations of the field setting. Special care has
+been taken to faithfully reproduce how fields are handled for the different
+TV standards. This is particularly noticeable when generating a horizontally
+moving image so the temporal effect of using interlaced formats becomes clearly
+visible. For 50 Hz standards the top field is the oldest and the bottom field
+is the newest in time. For 60 Hz standards that is reversed: the bottom field
+is the oldest and the top field is the newest in time.
+
+When you start capturing in V4L2_FIELD_ALTERNATE mode the first buffer will
+contain the top field for 50 Hz standards and the bottom field for 60 Hz
+standards. This is what capture hardware does as well.
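The field ordering rules above can be summarized in a short sketch. The function name `field_order` and the returned dictionary layout are hypothetical, chosen only for illustration; the mapping itself restates the two preceding paragraphs.

```python
def field_order(field_rate_hz):
    """Return the temporal field order for a 50 Hz or 60 Hz TV standard.

    Hypothetical helper for illustration; it is not part of the driver.
    """
    if field_rate_hz == 50:
        oldest, newest = "top", "bottom"
    elif field_rate_hz == 60:
        oldest, newest = "bottom", "top"
    else:
        raise ValueError("expected a 50 Hz or 60 Hz standard")
    return {
        "oldest": oldest,
        "newest": newest,
        # In V4L2_FIELD_ALTERNATE mode the first buffer holds the oldest field.
        "first_alternate_buffer": oldest,
    }


print(field_order(60))
```

An application that deinterlaces captured fields could use such a mapping to decide which field of a pair to display first.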
+
+Finally, for PAL/SECAM standards the first half of the top line contains noise.
+This simulates the Wide Screen Signal that is commonly placed there.
+
+The initially selected colorspace when you switch to the TV or S-Video input
+will be SMPTE-170M.
+
+The pixel aspect ratio will depend on the TV standard. The video aspect ratio
+can be selected through the 'Standard Aspect Ratio' Vivid control.
+Choices are '4x3', '16x9' which will give letterboxed widescreen video and
+'16x9 Anamorphic' which will give full screen squashed anamorphic widescreen
+video that will need to be scaled accordingly.
+
+The TV 'tuner' supports a frequency range of 44-958 MHz. Channels are available
+every 6 MHz, starting from 49.25 MHz. For each channel the generated image
+will be in color for the +/- 0.25 MHz around it, and in grayscale for
++/- 1 MHz around the channel. Beyond that it is just noise. The VIDIOC_G_TUNER
+ioctl will return 100% signal strength for +/- 0.25 MHz and 50% for +/- 1 MHz.
+It will also return correct afc values to show whether the frequency is too
+low or too high.
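As a rough illustration of the tuner behavior described above, the sketch below classifies a frequency by its distance to the nearest emulated channel. The function name `classify_tv_frequency` is hypothetical. Treating the +/- 0.25 MHz and +/- 1 MHz limits as inclusive is an assumption, and the signal strength outside +/- 1 MHz is not stated in the text, so the sketch returns `None` for it.

```python
BASE_KHZ = 49_250                # first channel at 49.25 MHz
STEP_KHZ = 6_000                 # one channel every 6 MHz
RANGE_KHZ = (44_000, 958_000)    # supported tuner range, 44-958 MHz


def classify_tv_frequency(freq_khz):
    """Return (image, signal_percent) for a tuner frequency in kHz.

    Hypothetical helper for illustration; it is not part of the driver.
    """
    if not RANGE_KHZ[0] <= freq_khz <= RANGE_KHZ[1]:
        raise ValueError("frequency outside the 44-958 MHz tuner range")
    # Index of the nearest channel, never below the first channel.
    index = max(0, (freq_khz - BASE_KHZ + STEP_KHZ // 2) // STEP_KHZ)
    offset = abs(freq_khz - (BASE_KHZ + index * STEP_KHZ))
    if offset <= 250:
        return ("color", 100)
    if offset <= 1_000:
        return ("grayscale", 50)
    # Beyond +/- 1 MHz the image is noise; the strength is not documented.
    return ("noise", None)


print(classify_tv_frequency(55_250))
```

Working in integer kHz avoids floating-point rounding at the band edges.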
+
+The audio subchannels that are returned are MONO for the +/- 1 MHz range around
+a valid channel frequency. When the frequency is within +/- 0.25 MHz of the
+channel it will return one of: MONO, STEREO, MONO | SAP (for NTSC;
+LANG1 | LANG2 for other standards) or STEREO | SAP.
+
+Which one is returned depends on the chosen channel: each next valid channel
+will cycle through the possible audio subchannel combinations. This allows
+you to test the various combinations by just switching channels.
+
+Finally, for these inputs the v4l2_timecode struct is filled in in the
+dequeued v4l2_buffer struct.
+
+
+HDMI Input
+~~~~~~~~~~
+
+The HDMI input supports all CEA-861 and DMT timings, both progressive and
+interlaced, for pixelclock frequencies between 25 and 600 MHz. The field
+mode for interlaced formats is always V4L2_FIELD_ALTERNATE. For HDMI the
+field order is always top field first, and when you start capturing an
+interlaced format you will receive the top field first.
+
+The initially selected colorspace when you switch to the HDMI input or
+select an HDMI timing is based on the format resolution: for resolutions
+less than or equal to 720x576 the colorspace is set to SMPTE-170M, for
+others it is set to REC-709 (CEA-861 timings) or sRGB (VESA DMT timings).
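The initial colorspace rule above can be sketched as a small function. The name `initial_hdmi_colorspace` and the string labels for the timing standard are hypothetical. Reading "less than or equal to 720x576" as a separate comparison of width and height is an assumption of this sketch; the driver may compare differently.

```python
def initial_hdmi_colorspace(width, height, timing_standard):
    """Pick the initially selected colorspace for an HDMI timing.

    Hypothetical helper for illustration; timing_standard is either
    "CEA-861" or "DMT".
    """
    # Assumption: compare width and height independently.
    if width <= 720 and height <= 576:
        return "SMPTE-170M"
    if timing_standard == "CEA-861":
        return "REC-709"
    if timing_standard == "DMT":
        return "sRGB"
    raise ValueError("expected 'CEA-861' or 'DMT'")


print(initial_hdmi_colorspace(1920, 1080, "CEA-861"))
```

Under these assumptions a 720x576 timing starts in SMPTE-170M, 1920x1080 CEA-861 in REC-709 and 1024x768 VESA DMT in sRGB.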
+
+The pixel aspect ratio will depend on the HDMI timing: for 720x480 it is
+set as for the NTSC TV standard, for 720x576 it is set as for the PAL TV
+standard, and for all others a 1:1 pixel aspect ratio is returned.
+
+The video aspect ratio can be selected through the 'DV Timings Aspect Ratio'
+Vivid control. Choices are 'Source Width x Height' (just use the
+same ratio as the chosen format), '4x3' or '16x9', either of which can
+result in pillarboxed or letterboxed video.
+
+For HDMI inputs it is possible to set the EDID. By default a simple EDID
+is provided. You can only set the EDID for HDMI inputs. Internally, however,
+the EDID is shared between all HDMI inputs.
+
+No interpretation is done of the EDID data with the exception of the
+physical address. See the CEC section for more details.
+
+There is a maximum of 15 HDMI inputs (if there are more, then they will be
+reduced to 15) since that's the limitation of the EDID physical address.
+
+
+Video Output
+------------
+
+The video output device can be configured by using the module options
+num_outputs, output_types and ccs_out_mode (see section 1 for more detailed
+information), but by default two outputs are configured: an S-Video and an
+HDMI output, one output for each output type. Those are described in more detail
+below.
+
+Like with video capture the framerate is also exact in the long term.
+
+
+S-Video Output
+~~~~~~~~~~~~~~
+
+This output supports audio outputs as well: "Line-Out 1" and "Line-Out 2".
+The S-Video output supports all TV standards.
+
+This output supports all combinations of the field setting.
+
+The initially selected colorspace when you switch to the S-Video output
+will be SMPTE-170M.
+
+
+HDMI Output
+~~~~~~~~~~~
+
+The HDMI output supports all CEA-861 and DMT timings, both progressive and
+interlaced, for pixelclock frequencies between 25 and 600 MHz. The field
+mode for interlaced formats is always V4L2_FIELD_ALTERNATE.
+
+The initially selected colorspace when you switch to the HDMI output or
+select an HDMI timing is based on the format resolution: for resolutions
+less than or equal to 720x576 the colorspace is set to SMPTE-170M, for
+others it is set to REC-709 (CEA-861 timings) or sRGB (VESA DMT timings).
+
+The pixel aspect ratio will depend on the HDMI timing: for 720x480 it is
+set as for the NTSC TV standard, for 720x576 it is set as for the PAL TV
+standard, and for all others a 1:1 pixel aspect ratio is returned.
+
+An HDMI output has a valid EDID which can be obtained through VIDIOC_G_EDID.
+
+There is a maximum of 15 HDMI outputs (if there are more, then they will be
+reduced to 15) since that's the limitation of the EDID physical address. See
+also the CEC section for more details.
+
+VBI Capture
+-----------
+
+There are three types of VBI capture devices: those that only support raw
+(undecoded) VBI, those that only support sliced (decoded) VBI and those that
+support both. This is determined by the node_types module option. In all
+cases the driver will generate valid VBI data: for 60 Hz standards it will
+generate Closed Caption and XDS data. The closed caption stream will
+alternate between "Hello world!" and "Closed captions test" every second.
+The XDS stream will give the current time once a minute. For 50 Hz standards
+it will generate the Wide Screen Signal which is based on the actual Video
+Aspect Ratio control setting and teletext pages 100-159, one page per frame.
+
+The VBI device will only work for the S-Video and TV inputs; it will give
+back an error if the current input is a webcam or HDMI.
+
+
+VBI Output
+----------
+
+There are three types of VBI output devices: those that only support raw
+(undecoded) VBI, those that only support sliced (decoded) VBI and those that
+support both. This is determined by the node_types module option.
+
+The sliced VBI output supports the Wide Screen Signal and the teletext signal
+for 50 Hz standards and Closed Captioning + XDS for 60 Hz standards.
+
+The VBI device will only work for the S-Video output; it will give
+back an error if the current output is HDMI.
+
+
+Radio Receiver
+--------------
+
+The radio receiver emulates an FM/AM/SW receiver. The FM band also supports RDS.
+The frequency ranges are:
+
+ - FM: 64 MHz - 108 MHz
+ - AM: 520 kHz - 1710 kHz
+ - SW: 2300 kHz - 26.1 MHz
+
+Valid channels are emulated every 1 MHz for FM and every 100 kHz for AM and SW.
+The signal strength decreases the further the frequency is from the valid
+frequency until it becomes 0% at +/- 50 kHz (FM) or 5 kHz (AM/SW) from the
+ideal frequency. The initial frequency when the driver is loaded is set to
+95 MHz.
+
+The FM receiver supports RDS as well, both using 'Block I/O' and 'Controls'
+modes. In the 'Controls' mode the RDS information is stored in read-only
+controls. These controls are updated every time the frequency is changed,
+or when the tuner status is requested. The Block I/O method uses the read()
+interface to pass the RDS blocks on to the application for decoding.
+
+The RDS signal is 'detected' for +/- 12.5 kHz around the channel frequency,
+and the further the frequency is away from the valid frequency the more RDS
+errors are randomly introduced into the block I/O stream, up to 50% of all
+blocks if you are +/- 12.5 kHz from the channel frequency. All four errors
+can occur in equal proportions: blocks marked 'CORRECTED', blocks marked
+'ERROR', blocks marked 'INVALID' and dropped blocks.
+
+The generated RDS stream contains all the standard fields contained in a
+0B group, and also radio text and the current time.
+
+The receiver supports HW frequency seek, either in Bounded mode, Wrap Around
+mode or both, which is configurable with the "Radio HW Seek Mode" control.
+
+
+Radio Transmitter
+-----------------
+
+The radio transmitter emulates an FM/AM/SW transmitter. The FM band also supports RDS.
+The frequency ranges are:
+
+ - FM: 64 MHz - 108 MHz
+ - AM: 520 kHz - 1710 kHz
+ - SW: 2300 kHz - 26.1 MHz
+
+The initial frequency when the driver is loaded is 95.5 MHz.
+
+The FM transmitter supports RDS as well, both using 'Block I/O' and 'Controls'
+modes. In the 'Controls' mode the transmitted RDS information is configured
+using controls, and in 'Block I/O' mode the blocks are passed to the driver
+using write().
+
+
+Software Defined Radio Receiver
+-------------------------------
+
+The SDR receiver has three frequency bands for the ADC tuner:
+
+ - 300 kHz
+ - 900 kHz - 2800 kHz
+ - 3200 kHz
+
+The RF tuner supports 50 MHz - 2000 MHz.
+
+The generated data contains the In-phase and Quadrature components of a
+1 kHz tone that has an amplitude of sqrt(2).
+
+
+Metadata Capture
+----------------
+
+The Metadata capture generates UVC format metadata. The PTS and SCR are
+transmitted based on the values set in vivid controls.
+
+The Metadata device will only work for the Webcam input; it will give
+back an error for all other inputs.
+
+
+Metadata Output
+---------------
+
+The Metadata output can be used to set brightness, contrast, saturation and hue.
+
+The Metadata device will only work for the Webcam output; it will give
+back an error for all other outputs.
+
+
+Touch Capture
+-------------
+
+The Touch capture generates touch patterns simulating single tap, double tap,
+triple tap, move from left to right, zoom in, zoom out, palm press (simulating
+a large area being pressed on a touchpad), and simulating 16 simultaneous
+touch points.
+
+Controls
+--------
+
+Different devices support different controls. The sections below will describe
+each control and which devices support them.
+
+
+User Controls - Test Controls
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The Button, Boolean, Integer 32 Bits, Integer 64 Bits, Menu, String, Bitmask and
+Integer Menu are controls that represent all possible control types. The Menu
+control and the Integer Menu control both have 'holes' in their menu list,
+meaning that one or more menu items return EINVAL when VIDIOC_QUERYMENU is called.
+Both menu controls also have a non-zero minimum control value. These features
+allow you to check if your application can handle such things correctly.
+These controls are supported for every device type.
+
+
+User Controls - Video Capture
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following controls are specific to video capture.
+
+The Brightness, Contrast, Saturation and Hue controls actually work and are
+standard. There is one special feature with the Brightness control: each
+video input has its own brightness value, so changing input will restore
+the brightness for that input. In addition, each video input uses a different
+brightness range (minimum and maximum control values). Switching inputs will
+cause a control event to be sent with the V4L2_EVENT_CTRL_CH_RANGE flag set.
+This allows you to test controls that can change their range.
+
+The 'Gain, Automatic' and Gain controls can be used to test volatile controls:
+if 'Gain, Automatic' is set, then the Gain control is volatile and changes
+constantly. If 'Gain, Automatic' is cleared, then the Gain control is a normal
+control.
+
+The 'Horizontal Flip' and 'Vertical Flip' controls can be used to flip the
+image. These combine with the 'Sensor Flipped Horizontally/Vertically' Vivid
+controls.
+
+The 'Alpha Component' control can be used to set the alpha component for
+formats containing an alpha channel.
+
+
+User Controls - Audio
+~~~~~~~~~~~~~~~~~~~~~
+
+The following controls are specific to video capture and output and radio
+receivers and transmitters.
+
+The 'Volume' and 'Mute' audio controls are typical for such devices to
+control the volume and mute the audio. They don't actually do anything in
+the vivid driver.
+
+
+Vivid Controls
+~~~~~~~~~~~~~~
+
+These vivid custom controls control the image generation, error injection, etc.
+
+
+Test Pattern Controls
+^^^^^^^^^^^^^^^^^^^^^
+
+The Test Pattern Controls are all specific to video capture.
+
+- Test Pattern:
+
+ selects which test pattern to use. Use the CSC Colorbar for
+ testing colorspace conversions: the colors used in that test pattern
+ map to valid colors in all colorspaces. The colorspace conversion
+ is disabled for the other test patterns.
+
+- OSD Text Mode:
+
+ selects whether the text superimposed on the
+ test pattern should be shown, and if so, whether only counters should
+ be displayed or the full text.
+
+- Horizontal Movement:
+
+ selects whether the test pattern should
+ move to the left or right and at what speed.
+
+- Vertical Movement:
+
+ does the same for the vertical direction.
+
+- Show Border:
+
+ show a two-pixel wide border at the edge of the actual image,
+ excluding letter or pillarboxing.
+
+- Show Square:
+
+ show a square in the middle of the image. If the image is
+ displayed with the correct pixel and image aspect ratio corrections,
+ then the width and height of the square on the monitor should be
+ the same.
+
+- Insert SAV Code in Image:
+
+ adds a SAV (Start of Active Video) code to the image.
+ This can be used to check if such codes in the image are inadvertently
+ interpreted instead of being ignored.
+
+- Insert EAV Code in Image:
+
+ does the same for the EAV (End of Active Video) code.
+
+- Insert Video Guard Band:
+
+ adds 4 columns of pixels with the HDMI Video Guard Band code at the
+ left hand side of the image. This only works with 3 or 4 byte RGB pixel
+ formats. The RGB pixel value 0xab/0x55/0xab turns out to be equivalent
+ to the HDMI Video Guard Band code that precedes each active video line
+ (see section 5.2.2.1 in the HDMI 1.3 Specification). To test if a video
+ receiver has correct HDMI Video Guard Band processing, enable this
+ control and then move the image to the left hand side of the screen.
+ That will result in video lines that start with multiple pixels that
+ have the same value as the Video Guard Band that precedes them.
+ Receivers that will just keep skipping Video Guard Band values will
+ now fail and either lose sync or these video lines will shift.
+
+
+Capture Feature Selection Controls
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These controls are all specific to video capture.
+
+- Sensor Flipped Horizontally:
+
+ the image is flipped horizontally and the
+ V4L2_IN_ST_HFLIP input status flag is set. This emulates the case where
+ a sensor is for example mounted upside down.
+
+- Sensor Flipped Vertically:
+
+ the image is flipped vertically and the
+ V4L2_IN_ST_VFLIP input status flag is set. This emulates the case where
+ a sensor is for example mounted upside down.
+
+- Standard Aspect Ratio:
+
+ selects if the image aspect ratio as used for the TV or
+ S-Video input should be 4x3, 16x9 or anamorphic widescreen. This may
+ introduce letterboxing.
+
+- DV Timings Aspect Ratio:
+
+ selects if the image aspect ratio as used for the HDMI
+ input should be the same as the source width and height ratio, or if
+ it should be 4x3 or 16x9. This may introduce letter or pillarboxing.
+
+- Timestamp Source:
+
+ selects when the timestamp for each buffer is taken.
+
+- Colorspace:
+
+ selects which colorspace should be used when generating the image.
+ This only applies if the CSC Colorbar test pattern is selected,
+ otherwise the test pattern will go through unconverted.
+ This behavior is also what you want, since a 75% Colorbar
+ should really have 75% signal intensity and should not be affected
+ by colorspace conversions.
+
+ Changing the colorspace will cause the V4L2_EVENT_SOURCE_CHANGE event
+ to be sent since it emulates a detected colorspace change.
+
+- Transfer Function:
+
+ selects which colorspace transfer function should be used when
+ generating an image. This only applies if the CSC Colorbar test pattern is
+ selected, otherwise the test pattern will go through unconverted.
+ This behavior is also what you want, since a 75% Colorbar
+ should really have 75% signal intensity and should not be affected
+ by colorspace conversions.
+
+ Changing the transfer function will cause the V4L2_EVENT_SOURCE_CHANGE
+ event to be sent since it emulates a detected colorspace change.
+
+- Y'CbCr Encoding:
+
+ selects which Y'CbCr encoding should be used when generating
+ a Y'CbCr image. This only applies if the format is set to a Y'CbCr format
+ as opposed to an RGB format.
+
+ Changing the Y'CbCr encoding will cause the V4L2_EVENT_SOURCE_CHANGE
+ event to be sent since it emulates a detected colorspace change.
+
+- Quantization:
+
+ selects which quantization should be used for the RGB or Y'CbCr
+ encoding when generating the test pattern.
+
+ Changing the quantization will cause the V4L2_EVENT_SOURCE_CHANGE event
+ to be sent since it emulates a detected colorspace change.
+
+- Limited RGB Range (16-235):
+
+ selects if the RGB range of the HDMI source should
+ be limited or full range. This combines with the Digital Video 'Rx RGB
+ Quantization Range' control and can be used to test what happens if
+ a source provides you with the wrong quantization range information.
+ See the description of that control for more details.
+
+- Apply Alpha To Red Only:
+
+ apply the alpha channel as set by the 'Alpha Component'
+ user control to the red color of the test pattern only.
+
+- Enable Capture Cropping:
+
+ enables crop support. This control is only present if
+ the ccs_cap_mode module option is set to the default value of -1 and if
+ the no_error_inj module option is set to 0 (the default).
+
+- Enable Capture Composing:
+
+ enables composing support. This control is only
+ present if the ccs_cap_mode module option is set to the default value of
+ -1 and if the no_error_inj module option is set to 0 (the default).
+
+- Enable Capture Scaler:
+
+ enables support for a scaler (maximum 4 times upscaling
+ and downscaling). This control is only present if the ccs_cap_mode
+ module option is set to the default value of -1 and if the no_error_inj
+ module option is set to 0 (the default).
+
+- Maximum EDID Blocks:
+
+ determines how many EDID blocks the driver supports.
+ Note that the vivid driver does not actually interpret new EDID
+ data, it just stores it. It allows for up to 256 EDID blocks
+ which is the maximum supported by the standard.
+
+- Fill Percentage of Frame:
+
+ can be used to draw only the top X percent
+ of the image. Since each frame has to be drawn by the driver, this
+ demands a lot of the CPU. For large resolutions this becomes
+ problematic. By drawing only part of the image this CPU load can
+ be reduced.
+
+
+Output Feature Selection Controls
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These controls are all specific to video output.
+
+- Enable Output Cropping:
+
+ enables crop support. This control is only present if
+ the ccs_out_mode module option is set to the default value of -1 and if
+ the no_error_inj module option is set to 0 (the default).
+
+- Enable Output Composing:
+
+ enables composing support. This control is only
+ present if the ccs_out_mode module option is set to the default value of
+ -1 and if the no_error_inj module option is set to 0 (the default).
+
+- Enable Output Scaler:
+
+ enables support for a scaler (maximum 4 times upscaling
+ and downscaling). This control is only present if the ccs_out_mode
+ module option is set to the default value of -1 and if the no_error_inj
+ module option is set to 0 (the default).
+
+
+Error Injection Controls
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following two controls are only valid for video and vbi capture.
+
+- Standard Signal Mode:
+
+ selects the behavior of VIDIOC_QUERYSTD: what should it return?
+
+ Changing this control will cause the V4L2_EVENT_SOURCE_CHANGE event
+ to be sent since it emulates a changed input condition (e.g. a cable
+ was plugged in or out).
+
+- Standard:
+
+ selects the standard that VIDIOC_QUERYSTD should return if the
+ previous control is set to "Selected Standard".
+
+ Changing this control will cause the V4L2_EVENT_SOURCE_CHANGE event
+ to be sent since it emulates a changed input standard.
+
+
+The following two controls are only valid for video capture.
+
+- DV Timings Signal Mode:
+
+ selects the behavior of VIDIOC_QUERY_DV_TIMINGS: what
+ should it return?
+
+ Changing this control will cause the V4L2_EVENT_SOURCE_CHANGE event
+ to be sent since it emulates a changed input condition (e.g. a cable
+ was plugged in or out).
+
+- DV Timings:
+
+ selects the timings the VIDIOC_QUERY_DV_TIMINGS should return
+ if the previous control is set to "Selected DV Timings".
+
+ Changing this control will cause the V4L2_EVENT_SOURCE_CHANGE event
+ to be sent since it emulates changed input timings.
+
+
+The following controls are only present if the no_error_inj module option
+is set to 0 (the default). These controls are valid for video and vbi
+capture and output streams and for the SDR capture device except for the
+Disconnect control which is valid for all devices.
+
+- Wrap Sequence Number:
+
+ test what happens when you wrap the sequence number in
+ struct v4l2_buffer around.
+
+- Wrap Timestamp:
+
+ test what happens when you wrap the timestamp in struct
+ v4l2_buffer around.
+
+- Percentage of Dropped Buffers:
+
+ sets the percentage of buffers that
+ are never returned by the driver (i.e., they are dropped).
+
+- Disconnect:
+
+ emulates a USB disconnect. The device will act as if it has
+ been disconnected. Only after all open filehandles to the device
+ node have been closed will the device become 'connected' again.
+
+- Inject V4L2_BUF_FLAG_ERROR:
+
+ when pressed, the next frame returned by
+ the driver will have the error flag set (i.e. the frame is marked
+ corrupt).
+
+- Inject VIDIOC_REQBUFS Error:
+
+ when pressed, the next REQBUFS or CREATE_BUFS
+ ioctl call will fail with an error. To be precise: the videobuf2
+ queue_setup() op will return -EINVAL.
+
+- Inject VIDIOC_QBUF Error:
+
+ when pressed, the next VIDIOC_QBUF or
+ VIDIOC_PREPARE_BUFFER ioctl call will fail with an error. To be
+ precise: the videobuf2 buf_prepare() op will return -EINVAL.
+
+- Inject VIDIOC_STREAMON Error:
+
+ when pressed, the next VIDIOC_STREAMON ioctl
+ call will fail with an error. To be precise: the videobuf2
+ start_streaming() op will return -EINVAL.
+
+- Inject Fatal Streaming Error:
+
+ when pressed, the streaming core will be
+ marked as having suffered a fatal error, the only way to recover
+ from that is to stop streaming. To be precise: the videobuf2
+ vb2_queue_error() function is called.
+
+
+VBI Raw Capture Controls
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+- Interlaced VBI Format:
+
+ if set, then the raw VBI data will be interlaced instead
+ of providing it grouped by field.
+
+
+Digital Video Controls
+~~~~~~~~~~~~~~~~~~~~~~
+
+- Rx RGB Quantization Range:
+
+ sets the RGB quantization detection of the HDMI
+ input. This combines with the Vivid 'Limited RGB Range (16-235)'
+ control and can be used to test what happens if a source provides
+ you with the wrong quantization range information. This can be tested
+ by selecting an HDMI input, setting this control to Full or Limited
+ range and selecting the opposite in the 'Limited RGB Range (16-235)'
+ control. The effect is easy to see if the 'Gray Ramp' test pattern
+ is selected.
+
+- Tx RGB Quantization Range:
+
+ sets the RGB quantization detection of the HDMI
+ output. It is currently not used for anything in vivid, but most HDMI
+ transmitters would typically have this control.
+
+- Transmit Mode:
+
+ sets the transmit mode of the HDMI output to HDMI or DVI-D. This
+ affects the reported colorspace since DVI-D outputs will always use
+ sRGB.
+
+- Display Present:
+
+ sets the presence of a "display" on the HDMI output. This affects
+ the tx_edid_present, tx_hotplug and tx_rxsense controls.
+
+
+FM Radio Receiver Controls
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- RDS Reception:
+
+ set if the RDS receiver should be enabled.
+
+- RDS Program Type:
+
+
+- RDS PS Name:
+
+
+- RDS Radio Text:
+
+
+- RDS Traffic Announcement:
+
+
+- RDS Traffic Program:
+
+
+- RDS Music:
+
+ these are all read-only controls. If RDS Rx I/O Mode is set to
+ "Block I/O", then they are inactive as well. If RDS Rx I/O Mode is set
+ to "Controls", then these controls report the received RDS data.
+
+.. note::
+ The vivid implementation of this is pretty basic: they are only
+ updated when you set a new frequency or when you get the tuner status
+ (VIDIOC_G_TUNER).
+
+- Radio HW Seek Mode:
+
+ can be one of "Bounded", "Wrap Around" or "Both". This
+ determines if VIDIOC_S_HW_FREQ_SEEK will be bounded by the frequency
+ range or wrap-around or if it is selectable by the user.
+
+- Radio Programmable HW Seek:
+
+ if set, then the user can provide the lower and
+ upper bound of the HW Seek. Otherwise the frequency range boundaries
+ will be used.
+
+- Generate RBDS Instead of RDS:
+
+ if set, then generate RBDS (the US variant of
+ RDS) data instead of RDS (European-style RDS). This affects only the
+ PICODE and PTY codes.
+
+- RDS Rx I/O Mode:
+
+ this can be "Block I/O" where the RDS blocks have to be read()
+ by the application, or "Controls" where the RDS data is provided by
+ the RDS controls mentioned above.
+
+
+FM Radio Modulator Controls
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- RDS Program ID:
+
+
+- RDS Program Type:
+
+
+- RDS PS Name:
+
+
+- RDS Radio Text:
+
+
+- RDS Stereo:
+
+
+- RDS Artificial Head:
+
+
+- RDS Compressed:
+
+
+- RDS Dynamic PTY:
+
+
+- RDS Traffic Announcement:
+
+
+- RDS Traffic Program:
+
+
+- RDS Music:
+
+ these are all controls that set the RDS data that is transmitted by
+ the FM modulator.
+
+- RDS Tx I/O Mode:
+
+ this can be "Block I/O" where the application has to use write()
+ to pass the RDS blocks to the driver, or "Controls" where the RDS data
+ is provided by the RDS controls mentioned above.
+
+Metadata Capture Controls
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- Generate PTS:
+
+ if set, then the generated metadata stream contains the Presentation timestamp (PTS).
+
+- Generate SCR:
+
+ if set, then the generated metadata stream contains Source Clock information.
+
+Video, VBI and RDS Looping
+--------------------------
+
+The vivid driver supports looping of video output to video input, VBI output
+to VBI input and RDS output to RDS input. For video/VBI looping this emulates
+a cable hooked up between the output and input connector. So video
+and VBI looping is only supported between S-Video and HDMI inputs and outputs.
+VBI is only valid for S-Video as it makes no sense for HDMI.
+
+Since radio is wireless this looping always happens if the radio receiver
+frequency is close to the radio transmitter frequency. In that case the radio
+transmitter will 'override' the emulated radio stations.
+
+Looping is currently supported only between devices created by the same
+vivid driver instance.
+
+
+Video and Sliced VBI looping
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The way to enable video/VBI looping is currently fairly crude. A 'Loop Video'
+control is available in the "Vivid" control class of the video
+capture and VBI capture devices. When checked the video looping will be enabled.
+Once enabled any S-Video or HDMI video input will show a static test pattern
+until the video output has started. At that time the video output will be
+looped to the video input provided that:
+
+- the input type matches the output type. So the HDMI input cannot receive
+ video from the S-Video output.
+
+- the video resolution of the video input must match that of the video output.
+ So it is not possible to loop a 50 Hz (720x576) S-Video output to a 60 Hz
+ (720x480) S-Video input, or a 720p60 HDMI output to a 1080p30 input.
+
+- the pixel formats must be identical on both sides. Otherwise the driver would
+ have to do pixel format conversion as well, and that's taking things too far.
+
+- the field settings must be identical on both sides. Same reason as above:
+ requiring the driver to convert from one field format to another complicated
+ matters too much. This also prohibits capturing with 'Field Top' or 'Field
+ Bottom' when the output video is set to 'Field Alternate'. This combination,
+ while legal, became too complicated to support. Both sides have to be 'Field
+ Alternate' for this to work. Also note that for this specific case the
+ sequence and field counting in struct v4l2_buffer on the capture side may not
+ be 100% accurate.
+
+- field settings V4L2_FIELD_SEQ_TB/BT are not supported. While it is possible to
+ implement this, it would mean a lot of work to get this right. Since these
+ field values are rarely used the decision was made not to implement this for
+ now.
+
+- on the input side the "Standard Signal Mode" for the S-Video input or the
+ "DV Timings Signal Mode" for the HDMI input should be configured so that a
+ valid signal is passed to the video input.
+
+The framerates do not have to match, although this might change in the future.
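+
+The conditions listed above can be summarised as a single predicate. The
+following is only a sketch for application authors, not the driver's code: the
+``Side`` class and the ``can_loop()`` function are hypothetical names, and the
+pixel format and field values are plain strings chosen for illustration.
+
```python
from dataclasses import dataclass

# Hypothetical description of one end of the loop (output or capture side).
@dataclass
class Side:
    connector: str     # "S-Video" or "HDMI"
    width: int
    height: int
    pixelformat: str   # e.g. a fourcc string
    field: str         # e.g. "V4L2_FIELD_NONE"
    fps: float         # informational only, see below

# Field settings that the text says are not supported for looping.
UNSUPPORTED_FIELDS = {"V4L2_FIELD_SEQ_TB", "V4L2_FIELD_SEQ_BT"}

def can_loop(out: Side, cap: Side, valid_signal: bool = True) -> bool:
    """Return True if the documented looping conditions are all met."""
    return (
        valid_signal                                   # signal mode passes a valid signal
        and out.connector == cap.connector             # HDMI cannot receive S-Video
        and (out.width, out.height) == (cap.width, cap.height)
        and out.pixelformat == cap.pixelformat         # no format conversion
        and out.field == cap.field                     # no field conversion
        and out.field not in UNSUPPORTED_FIELDS
    )                                                  # fps is deliberately not compared
```
+
+Note that the frame rate is carried along but never checked, matching the
+statement that the framerates do not have to match.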
+
+By default you will see the OSD text superimposed on top of the looped video.
+This can be turned off by changing the "OSD Text Mode" control of the video
+capture device.
+
+For VBI looping to work all of the above must be valid and in addition the VBI
+output must be configured for sliced VBI. The VBI capture side can be configured
+for either raw or sliced VBI. Note that at the moment only CC/XDS (60 Hz formats)
+and WSS (50 Hz formats) VBI data is looped. Teletext VBI data is not looped.
+
+
+Radio & RDS Looping
+~~~~~~~~~~~~~~~~~~~
+
+As mentioned in section 6 the radio receiver emulates stations at regular
+frequency intervals. Depending on the frequency of the radio receiver a
+signal strength value is calculated (this is returned by VIDIOC_G_TUNER).
+However, it will also look at the frequency set by the radio transmitter and
+if that results in a higher signal strength then the settings of the radio
+transmitter will be used as if it was a valid station. This also includes
+the RDS data (if any) that the transmitter 'transmits'. This is received
+faithfully on the receiver side. Note that when the driver is loaded the
+frequencies of the radio receiver and transmitter are not identical, so
+initially no looping takes place.
+
+
+Cropping, Composing, Scaling
+----------------------------
+
+This driver supports cropping, composing and scaling in any combination. Normally
+which features are supported can be selected through the Vivid controls,
+but it is also possible to hardcode it when the module is loaded through the
+ccs_cap_mode and ccs_out_mode module options. See section 1 for the details of
+these module options.
+
+This allows you to test your application for all these variations.
+
+Note that the webcam input never supports cropping, composing or scaling. That
+only applies to the TV/S-Video/HDMI inputs and outputs. The reason is that
+webcams, including this virtual implementation, normally use
+VIDIOC_ENUM_FRAMESIZES to list a set of discrete framesizes that they support.
+And that does not combine with cropping, composing or scaling. This is
+primarily a limitation of the V4L2 API which is carefully reproduced here.
+
+The minimum and maximum resolutions that the scaler can achieve are 16x16 and
+(4096 * 4) x (2160 * 4), but it can only scale up or down by a factor of 4 or
+less. So for a source resolution of 1280x720 the minimum the scaler can do is
+320x180 and the maximum is 5120x2880. You can play around with this using the
+qv4l2 test tool and you will see these dependencies.
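+
+The arithmetic behind these limits can be sketched as follows. This is an
+illustration built only from the numbers quoted above, not the driver's
+implementation: ``scaler_range()`` is a hypothetical helper, and any rounding
+or alignment the driver may apply is not modelled.
+
```python
# Limits quoted in the text: absolute bounds and the maximum scale factor.
MIN_W, MIN_H = 16, 16
MAX_W, MAX_H = 4096 * 4, 2160 * 4
MAX_SCALE = 4

def scaler_range(src_w: int, src_h: int):
    """Return ((min_w, min_h), (max_w, max_h)) reachable from a source size."""
    # Scaling down by at most a factor of 4 (rounded up), never below 16x16.
    min_w = max(MIN_W, -(-src_w // MAX_SCALE))
    min_h = max(MIN_H, -(-src_h // MAX_SCALE))
    # Scaling up by at most a factor of 4, never above the absolute maximum.
    max_w = min(MAX_W, src_w * MAX_SCALE)
    max_h = min(MAX_H, src_h * MAX_SCALE)
    return (min_w, min_h), (max_w, max_h)

if __name__ == "__main__":
    print(scaler_range(1280, 720))
```
+
+For a 1280x720 source this yields 320x180 and 5120x2880, the same figures as
+in the paragraph above.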
+
+This driver also supports larger 'bytesperline' settings, something that
+VIDIOC_S_FMT allows but that few drivers implement.
+
+The scaler is a simple scaler that uses the Coarse Bresenham algorithm. It's
+designed for speed and simplicity, not quality.
+
+If the combination of crop, compose and scaling allows it, then it is possible
+to change crop and compose rectangles on the fly.
+
+
+Formats
+-------
+
+The driver supports all the regular packed and planar 4:4:4, 4:2:2 and 4:2:0
+YUV formats, 8, 16, 24 and 32 bit RGB packed formats and various multiplanar
+formats.
+
+The alpha component can be set through the 'Alpha Component' User control
+for those formats that support it. If the 'Apply Alpha To Red Only' control
+is set, then the alpha component is only used for the color red and set to
+0 otherwise.
+
+The driver has to be configured to support the multiplanar formats. By default
+the driver instances are single-planar. This can be changed by setting the
+multiplanar module option, see section 1 for more details on that option.
+
+If the driver instance is using the multiplanar formats/API, then the first
+single planar format (YUYV) and the multiplanar NV16M and NV61M formats
+will have a plane that has a non-zero data_offset of 128 bytes. It is rare for
+data_offset to be non-zero, so this is a useful feature for testing applications.
+
+Video output will also honor any data_offset that the application sets.
+
+
+Capture Overlay
+---------------
+
+Note: capture overlay support is implemented primarily to test the existing
+V4L2 capture overlay API. In practice few if any GPUs support such overlays
+anymore, and neither are they generally needed anymore since modern hardware
+is so much more capable. By setting flag 0x10000 in the node_types module
+option the vivid driver will create a simple framebuffer device that can be
+used for testing this API. Whether this API should be used for new drivers is
+questionable.
+
+This driver has support for a destructive capture overlay with bitmap clipping
+and list clipping (up to 16 rectangles) capabilities. Overlays are not
+supported for multiplanar formats. It also honors the struct v4l2_window field
+setting: if it is set to FIELD_TOP or FIELD_BOTTOM and the capture setting is
+FIELD_ALTERNATE, then only the top or bottom fields will be copied to the overlay.
+
+The overlay only works if you are also capturing at that same time. This is a
+vivid limitation since it copies from a buffer to the overlay instead of
+filling the overlay directly. And if you are not capturing, then no buffers
+are available to fill.
+
+In addition, the pixelformat of the capture format and that of the framebuffer
+must be the same for the overlay to work. Otherwise VIDIOC_OVERLAY will return
+an error.
+
+In order to really see what is going on you will need to create two vivid
+instances: the first with a framebuffer enabled. You configure the capture
+overlay of the second instance to use the framebuffer of the first, then
+you start capturing in the second instance. For the first instance you set up
+the output overlay for the video output, turn on video looping and capture
+to see the blended framebuffer overlay that's being written to by the second
+instance. This setup would require the following commands:
+
+.. code-block:: none
+
+ $ sudo modprobe vivid n_devs=2 node_types=0x10101,0x1
+ $ v4l2-ctl -d1 --find-fb
+ /dev/fb1 is the framebuffer associated with base address 0x12800000
+ $ sudo v4l2-ctl -d2 --set-fbuf fb=1
+ $ v4l2-ctl -d1 --set-fbuf fb=1
+ $ v4l2-ctl -d0 --set-fmt-video=pixelformat='AR15'
+ $ v4l2-ctl -d1 --set-fmt-video-out=pixelformat='AR15'
+ $ v4l2-ctl -d2 --set-fmt-video=pixelformat='AR15'
+ $ v4l2-ctl -d0 -i2
+ $ v4l2-ctl -d2 -i2
+ $ v4l2-ctl -d2 -c horizontal_movement=4
+ $ v4l2-ctl -d1 --overlay=1
+ $ v4l2-ctl -d0 -c loop_video=1
+ $ v4l2-ctl -d2 --stream-mmap --overlay=1
+
+And from another console:
+
+.. code-block:: none
+
+ $ v4l2-ctl -d1 --stream-out-mmap
+
+And yet another console:
+
+.. code-block:: none
+
+ $ qv4l2
+
+and start streaming.
+
+As you can see, this is not for the faint of heart...
+
+
+Output Overlay
+--------------
+
+Note: output overlays are primarily implemented in order to test the existing
+V4L2 output overlay API. Whether this API should be used for new drivers is
+questionable.
+
+This driver has support for an output overlay and is capable of:
+
+ - bitmap clipping
+ - list clipping (up to 16 rectangles)
+ - chromakey
+ - source chromakey
+ - global alpha
+ - local alpha
+ - local inverse alpha
+
+Output overlays are not supported for multiplanar formats. In addition, the
+pixelformat of the capture format and that of the framebuffer must be the
+same for the overlay to work. Otherwise VIDIOC_OVERLAY will return an error.
+
+Output overlays only work if the driver has been configured to create a
+framebuffer by setting flag 0x10000 in the node_types module option. The
+created framebuffer has a size of 720x576 and supports ARGB 1:5:5:5 and
+RGB 5:6:5.
+
+In order to see the effects of the various clipping, chromakeying or alpha
+processing capabilities you need to turn on video looping and see the results
+on the capture side. The use of the clipping, chromakeying or alpha processing
+capabilities will slow down the video loop considerably as a lot of checks have
+to be done per pixel.
+
+
+CEC (Consumer Electronics Control)
+----------------------------------
+
+If there are HDMI inputs then a CEC adapter will be created that has
+the same number of input ports. This is the equivalent of e.g. a TV that
+has that number of inputs. Each HDMI output will also create a
+CEC adapter that is hooked up to the corresponding input port, or (if there
+are more outputs than inputs) is not hooked up at all. In other words,
+this is the equivalent of hooking up each output device to an input port of
+the TV. Any remaining output devices remain unconnected.
+
+The EDID that each output reads reports a unique CEC physical address that is
+based on the physical address of the EDID of the input. So if the EDID of the
+receiver has physical address A.B.0.0, then each output will see an EDID
+containing physical address A.B.C.0 where C is 1 to the number of inputs. If
+there are more outputs than inputs then the remaining outputs have a CEC adapter
+that is disabled and reports an invalid physical address.
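+
+The address assignment described above can be illustrated with a small sketch.
+It assumes physical addresses are handled as 16-bit values with one nibble per
+digit (A.B.C.D) and that the invalid address is 0xffff (f.f.f.f); the function
+names are hypothetical and do not come from the driver.
+
```python
CEC_PHYS_ADDR_INVALID = 0xFFFF  # f.f.f.f, used for unconnected outputs

def output_phys_addrs(input_pa: int, n_inputs: int, n_outputs: int):
    """Physical address seen by each output for a receiver at A.B.0.0.

    Assumes the two low nibbles of input_pa are zero and n_inputs <= 15.
    """
    addrs = []
    for i in range(n_outputs):
        if i < n_inputs:
            # Output i is hooked up to input port i + 1: A.B.C.0 with C = i + 1.
            addrs.append(input_pa | ((i + 1) << 4))
        else:
            # More outputs than inputs: this adapter stays unconnected.
            addrs.append(CEC_PHYS_ADDR_INVALID)
    return addrs

def fmt(pa: int) -> str:
    """Format a 16-bit physical address as a.b.c.d."""
    return ".".join(f"{(pa >> shift) & 0xF:x}" for shift in (12, 8, 4, 0))

if __name__ == "__main__":
    # Receiver at 1.2.0.0 with two inputs and three outputs.
    print([fmt(pa) for pa in output_phys_addrs(0x1200, 2, 3)])
```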
+
+
+Some Future Improvements
+------------------------
+
+Just as a reminder and in no particular order:
+
+- Add a virtual alsa driver to test audio
+- Add virtual sub-devices and media controller support
+- Some support for testing compressed video
+- Add support to loop raw VBI output to raw VBI input
+- Add support to loop teletext sliced VBI output to VBI input
+- Fix sequence/field numbering when looping video with alternate fields
+- Add support for V4L2_CID_BG_COLOR for video outputs
+- Add ARGB888 overlay support: better testing of the alpha channel
+- Improve pixel aspect support in the tpg code by passing a real v4l2_fract
+- Use per-queue locks and/or per-device locks to improve throughput
+- Add support to loop from a specific output to a specific input across
+ vivid instances
+- The SDR radio should use the same 'frequencies' for stations as the normal
+ radio receiver, and give back noise if the frequency doesn't match up with
+ a station frequency
+- Make a thread for the RDS generation, that would help in particular for the
+ "Controls" RDS Rx I/O Mode as the read-only RDS controls could be updated
+ in real-time.
+- Changing the EDID should cause hotplug detect emulation to happen.
diff --git a/Documentation/admin-guide/media/zoran-cardlist.rst b/Documentation/admin-guide/media/zoran-cardlist.rst
new file mode 100644
index 000000000000..d7fc8bed62ff
--- /dev/null
+++ b/Documentation/admin-guide/media/zoran-cardlist.rst
@@ -0,0 +1,51 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Zoran cards list
+================
+
+.. tabularcolumns:: |p{1.4cm}|p{11.1cm}|p{4.2cm}|
+
+.. flat-table::
+ :header-rows: 1
+ :widths: 2 19 18
+ :stub-columns: 0
+
+ * - Card number
+ - Card name
+ - PCI subsystem IDs
+
+ * - 0
+ - DC10(old)
+ - <any>
+
+ * - 1
+ - DC10(new)
+ - <any>
+
+ * - 2
+ - DC10_PLUS
+ - 1031:7efe
+
+ * - 3
+ - DC30
+ - <any>
+
+ * - 4
+ - DC30_PLUS
+ - 1031:d801
+
+ * - 5
+ - LML33
+ - <any>
+
+ * - 6
+ - LML33R10
+ - 12f8:8a02
+
+ * - 7
+ - Buz
+ - 13ca:4231
+
+ * - 8
+ - 6-Eyes
+ - <any>
diff --git a/Documentation/admin-guide/media/zr364xx.rst b/Documentation/admin-guide/media/zr364xx.rst
new file mode 100644
index 000000000000..7291e54b8be3
--- /dev/null
+++ b/Documentation/admin-guide/media/zr364xx.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Zoran 364xx based USB webcam module
+===================================
+
+site: http://royale.zerezo.com/zr364xx/
+
+mail: royale@zerezo.com
+
+
+Introduction
+------------
+
+
+This brings support under Linux for the Aiptek PocketDV 3300 and similar
+devices in webcam mode. If you just want to copy the pictures and movies
+from the camera to your PC, you should use the usb-storage module instead.
+
+The driver works with several other cameras in webcam mode (see the list
+below).
+
+Possible chipsets are: ZR36430 (ZR36430BGC) and
+maybe ZR36431, ZR36440, ZR36442...
+
+You can experiment by changing the vendor/product ID values (look
+at the source code).
+
+You can get these values by looking at /var/log/messages when you plug
+in your camera, or by typing: cat /sys/kernel/debug/usb/devices.
+
+
+Install
+-------
+
+In order to use this driver, you must compile it with your kernel,
+with the following config options::
+
+ ./scripts/config -e USB
+ ./scripts/config -m MEDIA_SUPPORT
+ ./scripts/config -e MEDIA_USB_SUPPORT
+ ./scripts/config -e MEDIA_CAMERA_SUPPORT
+ ./scripts/config -m USB_ZR364XX
+
+Usage
+-----
+
+modprobe zr364xx debug=X mode=Y
+
+- debug : set to 1 to enable verbose debug messages
+- mode : 0 = 320x240, 1 = 160x120, 2 = 640x480
+
+You can then use the camera with V4L2 compatible applications, for
+example Ekiga.
+
+To capture a single image, try this: dd if=/dev/video0 of=test.jpg bs=1M
+count=1
+
+Links
+-----
+
+- http://mxhaard.free.fr/ (support for many other cams including some Aiptek PocketDV)
+- http://www.harmwal.nl/pccam880/ (this project also supports cameras based on this chipset)
+
+Supported devices
+-----------------
+
+====== ======= ============== ====================
+Vendor Product Distributor Model
+====== ======= ============== ====================
+0x08ca 0x0109 Aiptek PocketDV 3300
+0x08ca 0x0109 Maxell Maxcam PRO DV3
+0x041e 0x4024 Creative PC-CAM 880
+0x0d64 0x0108 Aiptek Fidelity 3200
+0x0d64 0x0108 Praktica DCZ 1.3 S
+0x0d64 0x0108 Genius Digital Camera (?)
+0x0d64 0x0108 DXG Technology Fashion Cam
+0x0546 0x3187 Polaroid iON 230
+0x0d64 0x3108 Praktica Exakta DC 2200
+0x0d64 0x3108 Genius G-Shot D211
+0x0595 0x4343 Concord Eye-Q Duo 1300
+0x0595 0x4343 Concord Eye-Q Duo 2000
+0x0595 0x4343 Fujifilm EX-10
+0x0595 0x4343 Ricoh RDC-6000
+0x0595 0x4343 Digitrex DSC 1300
+0x0595 0x4343 Firstline FDC 2000
+0x0bb0 0x500d Concord EyeQ Go Wireless
+0x0feb 0x2004 CRS Electronic 3.3 Digital Camera
+0x0feb 0x2004 Packard Bell DSC-300
+0x055f 0xb500 Mustek MDC 3000
+0x08ca 0x2062 Aiptek PocketDV 5700
+0x052b 0x1a18 Chiphead Megapix V12
+0x04c8 0x0729 Konica Revio 2
+0x04f2 0xa208 Creative PC-CAM 850
+0x0784 0x0040 Traveler Slimline X5
+0x06d6 0x0034 Trust Powerc@m 750
+0x0a17 0x0062 Pentax Optio 50L
+0x06d6 0x003b Trust Powerc@m 970Z
+0x0a17 0x004e Pentax Optio 50
+0x041e 0x405d Creative DiVi CAM 516
+0x08ca 0x2102 Aiptek DV T300
+0x06d6 0x003d Trust Powerc@m 910Z
+====== ======= ============== ====================
diff --git a/Documentation/admin-guide/mm/cma_debugfs.rst b/Documentation/admin-guide/mm/cma_debugfs.rst
index 4e06ffabd78a..7367e6294ef6 100644
--- a/Documentation/admin-guide/mm/cma_debugfs.rst
+++ b/Documentation/admin-guide/mm/cma_debugfs.rst
@@ -5,10 +5,10 @@ CMA Debugfs Interface
The CMA debugfs interface is useful to retrieve basic information out of the
different CMA areas and to test allocation/release in each of the areas.
-Each CMA zone represents a directory under <debugfs>/cma/, indexed by the
-kernel's CMA index. So the first CMA zone would be:
+Each CMA area represents a directory under <debugfs>/cma/, represented by
+its CMA name like below:
- <debugfs>/cma/cma-0
+ <debugfs>/cma/<cma_name>
The structure of the files created under that directory is as follows:
@@ -18,8 +18,8 @@ The structure of the files created under that directory is as follows:
- [RO] bitmap: The bitmap of page states in the zone.
- [WO] alloc: Allocate N pages from that CMA area. For example::
- echo 5 > <debugfs>/cma/cma-2/alloc
+ echo 5 > <debugfs>/cma/<cma_name>/alloc
-would try to allocate 5 pages from the cma-2 area.
+would try to allocate 5 pages from the 'cma_name' area.
- [WO] free: Free N pages from that CMA area, similar to the above.
diff --git a/Documentation/admin-guide/mm/concepts.rst b/Documentation/admin-guide/mm/concepts.rst
index c2531b14bf46..c79f1e336222 100644
--- a/Documentation/admin-guide/mm/concepts.rst
+++ b/Documentation/admin-guide/mm/concepts.rst
@@ -35,7 +35,7 @@ physical memory (demand paging) and provides a mechanism for the
protection and controlled sharing of data between processes.
With virtual memory, each and every memory access uses a virtual
-address. When the CPU decodes the an instruction that reads (or
+address. When the CPU decodes an instruction that reads (or
writes) from (or to) the system memory, it translates the `virtual`
address encoded in that instruction to a `physical` address that the
memory controller can understand.
@@ -125,7 +125,7 @@ processor. Each bank is referred to as a `node` and for each node Linux
constructs an independent memory management subsystem. A node has its
own set of zones, lists of free and used pages and various statistics
counters. You can find more details about NUMA in
-:ref:`Documentation/vm/numa.rst <numa>` and in
+:ref:`Documentation/mm/numa.rst <numa>` and in
:ref:`Documentation/admin-guide/mm/numa_memory_policy.rst <numa_memory_policy>`.
Page cache
@@ -184,7 +184,7 @@ pages either asynchronously or synchronously, depending on the state
of the system. When the system is not loaded, most of the memory is free
and allocation requests will be satisfied immediately from the free
pages supply. As the load increases, the amount of the free pages goes
-down and when it reaches a certain threshold (high watermark), an
+down and when it reaches a certain threshold (low watermark), an
allocation request will awaken the ``kswapd`` daemon. It will
asynchronously scan memory pages and either just free them if the data
they contain is available elsewhere, or evict to the backing storage
diff --git a/Documentation/admin-guide/mm/damon/index.rst b/Documentation/admin-guide/mm/damon/index.rst
new file mode 100644
index 000000000000..33d37bb2fb4e
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/index.rst
@@ -0,0 +1,17 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+DAMON: Data Access MONitor
+==========================
+
+:doc:`DAMON </mm/damon/index>` allows light-weight data access monitoring.
+Using DAMON, users can analyze the memory access patterns of their systems and
+optimize those.
+
+.. toctree::
+ :maxdepth: 2
+
+ start
+ usage
+ reclaim
+ lru_sort
diff --git a/Documentation/admin-guide/mm/damon/lru_sort.rst b/Documentation/admin-guide/mm/damon/lru_sort.rst
new file mode 100644
index 000000000000..c09cace80651
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/lru_sort.rst
@@ -0,0 +1,294 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================
+DAMON-based LRU-lists Sorting
+=============================
+
+DAMON-based LRU-lists Sorting (DAMON_LRU_SORT) is a static kernel module that
+aims to be used for proactive and lightweight data access pattern based
+(de)prioritization of pages on their LRU-lists for making LRU-lists a more
+trustworthy data access pattern source.
+
+Where Proactive LRU-lists Sorting is Required?
+==============================================
+
+As page-granularity access checking overhead could be significant on huge
+systems, LRU lists are normally not proactively sorted but partially and
+reactively sorted for special events including specific user requests, system
+calls and memory pressure. As a result, LRU lists are sometimes not so
+perfectly prepared to be used as a trustworthy access pattern source for some
+situations including reclamation target pages selection under sudden memory
+pressure.
+
+Because DAMON can identify access patterns of best-effort accuracy while
+inducing only a user-specified range of overhead, proactively running
+DAMON_LRU_SORT could be helpful for making LRU lists a more trustworthy access
+pattern source with low and controlled overhead.
+
+How It Works?
+=============
+
+DAMON_LRU_SORT finds hot pages (pages of memory regions showing access
+rates higher than a user-specified threshold) and cold pages (pages of
+memory regions showing no access for a time longer than a
+user-specified threshold) using DAMON, and prioritizes hot pages while
+deprioritizing cold pages on their LRU-lists. To avoid it consuming too much
+CPU for the prioritizations, a CPU time usage limit can be configured. Under
+the limit, it prioritizes hotter pages and deprioritizes colder pages first,
+respectively. System administrators can also configure under what situation
+this scheme should be automatically activated and deactivated with three memory
+pressure watermarks.
+
+Its default parameters for hotness/coldness thresholds and CPU quota limit are
+conservatively chosen. That is, the module under its default parameters could
+be widely used without harm for common situations while providing a level of
+benefits for systems having clear hot/cold access patterns under memory
+pressure while consuming only a limited small portion of CPU time.
+
+Interface: Module Parameters
+============================
+
+To use this feature, you should first ensure your system is running on a kernel
+that is built with ``CONFIG_DAMON_LRU_SORT=y``.
+
+To let sysadmins enable or disable it and tune for the given system,
+DAMON_LRU_SORT utilizes module parameters. That is, you can put
+``damon_lru_sort.<parameter>=<value>`` on the kernel boot command line or write
+proper values to ``/sys/module/damon_lru_sort/parameters/<parameter>`` files.
+
+Below is the description of each parameter.
+
+enabled
+-------
+
+Enable or disable DAMON_LRU_SORT.
+
+You can enable DAMON_LRU_SORT by setting the value of this parameter as ``Y``.
+Setting it as ``N`` disables DAMON_LRU_SORT. Note that even when enabled,
+DAMON_LRU_SORT may do no real monitoring and LRU-lists sorting due to the
+watermarks-based activation condition. Refer to the descriptions of the
+watermarks parameters below for this.
+
+commit_inputs
+-------------
+
+Make DAMON_LRU_SORT read the input parameters again, except ``enabled``.
+
+Input parameters that are updated while DAMON_LRU_SORT is running are not applied
+by default. Once this parameter is set as ``Y``, DAMON_LRU_SORT reads values
+of parameters except ``enabled`` again. Once the re-reading is done, this
+parameter is set as ``N``. If invalid parameters are found during the
+re-reading, DAMON_LRU_SORT will be disabled.
+
+hot_thres_access_freq
+---------------------
+
+Access frequency threshold for hot memory regions identification in permil.
+
+If a memory region is accessed at this frequency or higher, DAMON_LRU_SORT
+identifies the region as hot, and marks it as accessed on the LRU list, so that
+it could not be reclaimed under memory pressure. 50% by default.
+
+cold_min_age
+------------
+
+Time threshold for cold memory regions identification in microseconds.
+
+If a memory region is not accessed for this or longer time, DAMON_LRU_SORT
+identifies the region as cold, and marks it as unaccessed on the LRU list, so
+that it could be reclaimed first under memory pressure. 120 seconds by
+default.
+
+quota_ms
+--------
+
+Limit of time for trying the LRU lists sorting in milliseconds.
+
+DAMON_LRU_SORT tries to use only up to this time within a time window
+(quota_reset_interval_ms) for trying LRU lists sorting. This can be used
+for limiting CPU consumption of DAMON_LRU_SORT. If the value is zero, the
+limit is disabled.
+
+10 ms by default.
+
+quota_reset_interval_ms
+-----------------------
+
+The time quota charge reset interval in milliseconds.
+
+The charge reset interval for the quota of time (quota_ms). That is,
+DAMON_LRU_SORT does not try LRU-lists sorting for more than quota_ms
+milliseconds or quota_sz bytes within quota_reset_interval_ms milliseconds.
+
+1 second by default.
+
+wmarks_interval
+---------------
+
+The watermarks check time interval in microseconds.
+
+Minimal time to wait before checking the watermarks, when DAMON_LRU_SORT is
+enabled but inactive due to its watermarks rule. 5 seconds by default.
+
+wmarks_high
+-----------
+
+Free memory rate (per thousand) for the high watermark.
+
+If free memory of the system in bytes per thousand bytes is higher than this,
+DAMON_LRU_SORT becomes inactive, so it does nothing but periodically checks the
+watermarks. 200 (20%) by default.
+
+wmarks_mid
+----------
+
+Free memory rate (per thousand) for the middle watermark.
+
+If free memory of the system in bytes per thousand bytes is between this and
+the low watermark, DAMON_LRU_SORT becomes active, so starts the monitoring and
+the LRU-lists sorting. 150 (15%) by default.
+
+wmarks_low
+----------
+
+Free memory rate (per thousand) for the low watermark.
+
+If free memory of the system in bytes per thousand bytes is lower than this,
+DAMON_LRU_SORT becomes inactive, so it does nothing but periodically checks the
+watermarks. 50 (5%) by default.
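+
+Taken together, the three watermarks above behave roughly like the sketch
+below. It is an interpretation of the descriptions in this document rather
+than the kernel's code: ``next_active()`` is a hypothetical helper, and keeping
+the previous state while the free memory rate is between the middle and high
+watermarks is an assumption made here for illustration.
+
```python
def next_active(free_permil: int, active: bool,
                high: int = 200, mid: int = 150, low: int = 50) -> bool:
    """Return whether DAMON_LRU_SORT would be active after a watermarks check.

    free_permil is the free memory rate per thousand; the default watermark
    values are the defaults stated in the text.
    """
    # Plenty of free memory, or too little: do nothing but re-check later.
    if free_permil > high or free_permil < low:
        return False
    # Assumption: between mid and high an inactive instance stays inactive.
    if free_permil > mid and not active:
        return False
    # Between low and mid (or already active below high): do the real work.
    return True
```
+
+With the defaults, a free memory rate of 100 (10%) activates the module, while
+300 (30%) or 30 (3%) leaves it inactive.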
+
+sample_interval
+---------------
+
+Sampling interval for the monitoring in microseconds.
+
+The sampling interval of DAMON for the cold memory monitoring. Please refer to
+the DAMON documentation (:doc:`usage`) for more detail. 5ms by default.
+
+aggr_interval
+-------------
+
+Aggregation interval for the monitoring in microseconds.
+
+The aggregation interval of DAMON for the cold memory monitoring. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail. 100ms by
+default.
+
+min_nr_regions
+--------------
+
+Minimum number of monitoring regions.
+
+The minimal number of monitoring regions of DAMON for the cold memory
+monitoring. This can be used to set lower-bound of the monitoring quality.
+But, setting this too high could result in increased monitoring overhead.
+Please refer to the DAMON documentation (:doc:`usage`) for more detail. 10 by
+default.
+
+max_nr_regions
+--------------
+
+Maximum number of monitoring regions.
+
+The maximum number of monitoring regions of DAMON for the cold memory
+monitoring. This can be used to set upper-bound of the monitoring overhead.
+However, setting this too low could result in bad monitoring quality. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail. 1000 by
+default.
+
+monitor_region_start
+--------------------
+
+Start of target memory region in physical address.
+
+The start physical address of memory region that DAMON_LRU_SORT will do work
+against. By default, the biggest System RAM region is used.
+
+monitor_region_end
+------------------
+
+End of target memory region in physical address.
+
+The end physical address of memory region that DAMON_LRU_SORT will do work
+against. By default, the biggest System RAM region is used.
+
+kdamond_pid
+-----------
+
+PID of the DAMON thread.
+
+If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread. Else,
+-1.
+
+nr_lru_sort_tried_hot_regions
+-----------------------------
+
+Number of hot memory regions for which LRU-sorting was tried.
+
+bytes_lru_sort_tried_hot_regions
+--------------------------------
+
+Total bytes of hot memory regions for which LRU-sorting was tried.
+
+nr_lru_sorted_hot_regions
+-------------------------
+
+Number of hot memory regions that were successfully LRU-sorted.
+
+bytes_lru_sorted_hot_regions
+----------------------------
+
+Total bytes of hot memory regions that were successfully LRU-sorted.
+
+nr_hot_quota_exceeds
+--------------------
+
+Number of times that the time quota limit for hot regions has been exceeded.
+
+nr_lru_sort_tried_cold_regions
+------------------------------
+
+Number of cold memory regions for which LRU-sorting was tried.
+
+bytes_lru_sort_tried_cold_regions
+---------------------------------
+
+Total bytes of cold memory regions for which LRU-sorting was tried.
+
+nr_lru_sorted_cold_regions
+--------------------------
+
+Number of cold memory regions that were successfully LRU-sorted.
+
+bytes_lru_sorted_cold_regions
+-----------------------------
+
+Total bytes of cold memory regions that were successfully LRU-sorted.
+
+nr_cold_quota_exceeds
+---------------------
+
+Number of times that the time quota limit for cold regions has been exceeded.
+
+Example
+=======
+
+The runtime example commands below make DAMON_LRU_SORT find memory regions
+having >=50% access frequency and LRU-prioritize them, while LRU-deprioritizing
+memory regions that are not accessed for 120 seconds. The prioritization and
+deprioritization are limited to using only up to 1% CPU time to avoid
+DAMON_LRU_SORT consuming too much CPU time for the (de)prioritization. It also
+asks DAMON_LRU_SORT to do nothing if the system's free memory rate is more than
+50%, but start the real work if it becomes lower than 40%. If DAMON_RECLAIM
+doesn't make progress and therefore the free memory rate becomes lower than
+20%, it asks DAMON_LRU_SORT to do nothing again, so that we can fall back to
+the LRU-list based page granularity reclamation. ::
+
+ # cd /sys/module/damon_lru_sort/parameters
+ # echo 500 > hot_thres_access_freq
+ # echo 120000000 > cold_min_age
+ # echo 10 > quota_ms
+ # echo 1000 > quota_reset_interval_ms
+ # echo 500 > wmarks_high
+ # echo 400 > wmarks_mid
+ # echo 200 > wmarks_low
+ # echo Y > enabled
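+
+The numbers written in the example follow from simple unit conversions, which
+the sketch below spells out. The helper names are hypothetical and exist only
+to show where 500, 120000000 and the 1% CPU time figure come from.
+
```python
def percent_to_permil(percent: float) -> int:
    """Convert a percentage to permil, the unit of hot_thres_access_freq
    and of the wmarks_* parameters."""
    return int(round(percent * 10))

def seconds_to_us(seconds: float) -> int:
    """Convert seconds to microseconds, the unit of cold_min_age."""
    return int(seconds * 1_000_000)

def cpu_share_percent(quota_ms: int, quota_reset_interval_ms: int) -> float:
    """Upper bound on CPU time share implied by the time quota."""
    return 100.0 * quota_ms / quota_reset_interval_ms

if __name__ == "__main__":
    print(percent_to_permil(50))          # hot_thres_access_freq
    print(seconds_to_us(120))             # cold_min_age
    print(cpu_share_percent(10, 1000))    # quota_ms / quota_reset_interval_ms
```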
diff --git a/Documentation/admin-guide/mm/damon/reclaim.rst b/Documentation/admin-guide/mm/damon/reclaim.rst
new file mode 100644
index 000000000000..4f1479a11e63
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/reclaim.rst
@@ -0,0 +1,265 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+DAMON-based Reclamation
+=======================
+
+DAMON-based Reclamation (DAMON_RECLAIM) is a static kernel module that aims to
+be used for proactive and lightweight reclamation under light memory pressure.
+It doesn't aim to replace the LRU-list based page granularity reclamation, but
+to be selectively used for different levels of memory pressure and requirements.
+
+Where Proactive Reclamation is Required?
+========================================
+
+On general memory over-committed systems, proactively reclaiming cold pages
+helps save memory and reduce latency spikes that are incurred by the direct
+reclaim of the process or CPU consumption of kswapd, while incurring only
+minimal performance degradation [1]_ [2]_ .
+
+Free Pages Reporting [3]_ based memory over-commit virtualization systems are
+a good example of such cases. In such systems, the guest VMs report their free
+memory to the host, and the host reallocates the reported memory to other
+guests. As a result, the memory of the systems is fully utilized. However, the
+guests could be not so memory-frugal, mainly because some kernel subsystems and
+user-space applications are designed to use as much memory as available. Then,
+guests could report only a small amount of memory as free to the host, resulting
+in a memory utilization drop of the systems. Running the proactive reclamation
+in guests could mitigate this problem.
+
+How It Works?
+=============
+
+DAMON_RECLAIM finds memory regions that have not been accessed for a specific
+time and pages them out. To avoid it consuming too much CPU for the paging out
+operation, a speed limit can be configured. Under the speed limit, it first
+pages out the regions that have gone unaccessed the longest. System
+administrators can also configure under what situation this scheme should be
+automatically activated and deactivated with three memory pressure watermarks.
+
+Interface: Module Parameters
+============================
+
+To use this feature, you should first ensure your system is running on a kernel
+that is built with ``CONFIG_DAMON_RECLAIM=y``.
+
+To let sysadmins enable or disable it and tune for the given system,
+DAMON_RECLAIM utilizes module parameters. That is, you can put
+``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write
+proper values to ``/sys/module/damon_reclaim/parameters/<parameter>`` files.
+
+Below is the description of each parameter.
+
+enabled
+-------
+
+Enable or disable DAMON_RECLAIM.
+
+You can enable DAMON_RECLAIM by setting the value of this parameter as ``Y``.
+Setting it as ``N`` disables DAMON_RECLAIM. Note that DAMON_RECLAIM could do
+no real monitoring and reclamation due to the watermarks-based activation
+condition. Refer to the descriptions of the watermarks parameters below.
+
+commit_inputs
+-------------
+
+Make DAMON_RECLAIM read the input parameters again, except ``enabled``.
+
+Input parameters that are updated while DAMON_RECLAIM is running are not
+applied by default. Once this parameter is set as ``Y``, DAMON_RECLAIM reads
+the values of parameters except ``enabled`` again. Once the re-reading is done,
+this parameter is set as ``N``. If invalid parameters are found during the
+re-reading, DAMON_RECLAIM will be disabled.
+
+min_age
+-------
+
+Time threshold for cold memory regions identification in microseconds.
+
+If a memory region is not accessed for this or longer time, DAMON_RECLAIM
+identifies the region as cold, and reclaims it.
+
+120 seconds by default.
+
+quota_ms
+--------
+
+Limit of time for the reclamation in milliseconds.
+
+DAMON_RECLAIM tries to use only up to this time within a time window
+(quota_reset_interval_ms) for trying reclamation of cold pages. This can be
+used for limiting CPU consumption of DAMON_RECLAIM. If the value is zero, the
+limit is disabled.
+
+10 ms by default.
+
+quota_sz
+--------
+
+Limit of size of memory for the reclamation in bytes.
+
+DAMON_RECLAIM charges the amount of memory that it tried to reclaim within a
+time window (quota_reset_interval_ms) and ensures that no more than this limit
+is tried. This can be used for limiting consumption of CPU and IO. If this
+value is zero, the limit is disabled.
+
+128 MiB by default.
+
+quota_reset_interval_ms
+-----------------------
+
+The time/size quota charge reset interval in milliseconds.
+
+The charge reset interval for the quota of time (quota_ms) and size
+(quota_sz). That is, DAMON_RECLAIM does not try reclamation for more than
+quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms
+milliseconds.
+
+1 second by default.
+
+wmarks_interval
+---------------
+
+Minimal time to wait before checking the watermarks, when DAMON_RECLAIM is
+enabled but inactive due to its watermarks rule.
+
+wmarks_high
+-----------
+
+Free memory rate (per thousand) for the high watermark.
+
+If free memory of the system in bytes per thousand bytes is higher than this,
+DAMON_RECLAIM becomes inactive, so it does nothing but only periodically checks
+the watermarks.
+
+wmarks_mid
+----------
+
+Free memory rate (per thousand) for the middle watermark.
+
+If free memory of the system in bytes per thousand bytes is between this and
+the low watermark, DAMON_RECLAIM becomes active, so starts the monitoring and
+the reclaiming.
+
+wmarks_low
+----------
+
+Free memory rate (per thousand) for the low watermark.
+
+If free memory of the system in bytes per thousand bytes is lower than this,
+DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks the
+watermarks. In this case, the system falls back to the LRU-list based page
+granularity reclamation logic.
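+
+As a rough illustration of the three watermarks described above, the following
+shell sketch decides whether DAMON_RECLAIM would be active for a given free
+memory rate. The function names are hypothetical and not part of DAMON. The
+rate is only approximated from ``MemFree`` and ``MemTotal`` of a
+``/proc/meminfo`` style file, and keeping the current state between the middle
+and high watermarks is an assumption of this sketch.
+

```shell
#!/bin/sh
# Approximate the free memory rate (per thousand) from a meminfo-style file.
# Hypothetical helper: the kernel computes its own value internally.
# $1: path of a /proc/meminfo-formatted file (defaults to /proc/meminfo)
free_mem_rate() {
    awk '/^MemTotal:/ { total = $2 }
         /^MemFree:/  { free = $2 }
         END { if (total > 0) printf "%d\n", free * 1000 / total }' \
        "${1:-/proc/meminfo}"
}

# Decide the next state from the metric and the three watermarks.
# $1: metric, $2: high, $3: mid, $4: low, $5: current state (active|inactive)
wmarks_next_state() {
    if [ "$1" -gt "$2" ] || [ "$1" -lt "$4" ]; then
        # Above the high or below the low watermark: do nothing.
        echo inactive
    elif [ "$1" -lt "$3" ]; then
        # Between the low and middle watermarks: monitor and reclaim.
        echo active
    else
        # Between the middle and high watermarks: assumed to keep the state.
        echo "$5"
    fi
}

# Watermarks of 500/400/200 with different free memory rates.
wmarks_next_state 450 500 400 200 inactive
wmarks_next_state 350 500 400 200 inactive
wmarks_next_state 150 500 400 200 active
```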
+
+sample_interval
+---------------
+
+Sampling interval for the monitoring in microseconds.
+
+The sampling interval of DAMON for the cold memory monitoring. Please refer to
+the DAMON documentation (:doc:`usage`) for more detail.
+
+aggr_interval
+-------------
+
+Aggregation interval for the monitoring in microseconds.
+
+The aggregation interval of DAMON for the cold memory monitoring. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+min_nr_regions
+--------------
+
+Minimum number of monitoring regions.
+
+The minimal number of monitoring regions of DAMON for the cold memory
+monitoring. This can be used to set lower-bound of the monitoring quality.
+But, setting this too high could result in increased monitoring overhead.
+Please refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+max_nr_regions
+--------------
+
+Maximum number of monitoring regions.
+
+The maximum number of monitoring regions of DAMON for the cold memory
+monitoring. This can be used to set upper-bound of the monitoring overhead.
+However, setting this too low could result in bad monitoring quality. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+monitor_region_start
+--------------------
+
+Start of target memory region in physical address.
+
+The start physical address of the memory region that DAMON_RECLAIM will work
+against. That is, DAMON_RECLAIM will find cold memory regions in this region
+and reclaim them. By default, the biggest System RAM is used as the region.
+
+monitor_region_end
+------------------
+
+End of target memory region in physical address.
+
+The end physical address of the memory region that DAMON_RECLAIM will work
+against. That is, DAMON_RECLAIM will find cold memory regions in this region
+and reclaim them. By default, the biggest System RAM is used as the region.
+
+kdamond_pid
+-----------
+
+PID of the DAMON thread.
+
+If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread. Else,
+-1.
+
+nr_reclaim_tried_regions
+------------------------
+
+Number of memory regions that DAMON_RECLAIM tried to reclaim.
+
+bytes_reclaim_tried_regions
+---------------------------
+
+Total bytes of memory regions that DAMON_RECLAIM tried to reclaim.
+
+nr_reclaimed_regions
+--------------------
+
+Number of memory regions that DAMON_RECLAIM successfully reclaimed.
+
+bytes_reclaimed_regions
+-----------------------
+
+Total bytes of memory regions that DAMON_RECLAIM successfully reclaimed.
+
+nr_quota_exceeds
+----------------
+
+Number of times that the time/space quota limits have been exceeded.
+
+Example
+=======
+
+The runtime example commands below make DAMON_RECLAIM find memory regions that
+are not accessed for 30 seconds or more and page them out. The reclamation is
+limited to up to 1 GiB per second to avoid DAMON_RECLAIM consuming too
+much CPU time for the paging out operation. It also asks DAMON_RECLAIM to do
+nothing if the system's free memory rate is more than 50%, but start the real
+work if it becomes lower than 40%. If DAMON_RECLAIM doesn't make progress and
+therefore the free memory rate becomes lower than 20%, it asks DAMON_RECLAIM to
+do nothing again, so that we can fall back to the LRU-list based page
+granularity reclamation. ::
+
 # cd /sys/module/damon_reclaim/parameters
+ # echo 30000000 > min_age
+ # echo $((1 * 1024 * 1024 * 1024)) > quota_sz
+ # echo 1000 > quota_reset_interval_ms
+ # echo 500 > wmarks_high
+ # echo 400 > wmarks_mid
+ # echo 200 > wmarks_low
+ # echo Y > enabled
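+
+The values written above follow from the parameter units: microseconds for
+``min_age``, bytes for ``quota_sz``, and per-thousand for the watermarks. A
+minimal sketch of that arithmetic, using hypothetical helper names that are
+not part of DAMON, could look like this.
+

```shell
#!/bin/sh
# Hypothetical unit-conversion helpers for computing parameter values.
sec_to_us()     { echo $(( $1 * 1000 * 1000 )); }         # seconds -> microseconds
pct_to_permil() { echo $(( $1 * 10 )); }                  # percent -> per-thousand
gib_to_bytes()  { echo $(( $1 * 1024 * 1024 * 1024 )); }  # GiB -> bytes

# Values used by the example: 30 seconds, 1 GiB, and 50%/40%/20% watermarks.
echo "min_age=$(sec_to_us 30)"
echo "quota_sz=$(gib_to_bytes 1)"
echo "wmarks_high=$(pct_to_permil 50)"
echo "wmarks_mid=$(pct_to_permil 40)"
echo "wmarks_low=$(pct_to_permil 20)"
```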
+
+.. [1] https://research.google/pubs/pub48551/
+.. [2] https://lwn.net/Articles/787611/
+.. [3] https://www.kernel.org/doc/html/latest/mm/free_page_reporting.html
diff --git a/Documentation/admin-guide/mm/damon/start.rst b/Documentation/admin-guide/mm/damon/start.rst
new file mode 100644
index 000000000000..9f88afc734da
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/start.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Getting Started
+===============
+
+This document briefly describes how you can use DAMON by demonstrating its
+default user space tool. Please note that this document describes only a part
+of its features for brevity. Please refer to the usage `doc
+<https://github.com/awslabs/damo/blob/next/USAGE.md>`_ of the tool for more
+details.
+
+
+Prerequisites
+=============
+
+Kernel
+------
+
+You should first ensure your system is running on a kernel built with
+``CONFIG_DAMON_*=y``.
+
+
+User Space Tool
+---------------
+
+For the demonstration, we will use the default user space tool for DAMON,
+called DAMON Operator (DAMO). It is available at
+https://github.com/awslabs/damo. The examples below assume that ``damo`` is on
+your ``$PATH``. It's not mandatory, though.
+
+Because DAMO is using the sysfs interface (refer to :doc:`usage` for the
+detail) of DAMON, you should ensure :doc:`sysfs </filesystems/sysfs>` is
+mounted.
+
+
+Recording Data Access Patterns
+==============================
+
+The commands below record the memory access patterns of a program and save the
+monitoring results to a file. ::
+
+ $ git clone https://github.com/sjp38/masim
+ $ cd masim; make; ./masim ./configs/zigzag.cfg &
+ $ sudo damo record -o damon.data $(pidof masim)
+
+The first two lines of the commands download an artificial memory access
+generator program and run it in the background. The generator will repeatedly
+access two 100 MiB sized memory regions one by one. You can substitute this
+with your real workload. The last line asks ``damo`` to record the access
+pattern in the ``damon.data`` file.
+
+
+Visualizing Recorded Patterns
+=============================
+
+You can visualize the pattern in a heatmap, showing which memory region
+(x-axis) got accessed when (y-axis) and how frequently (number)::
+
+ $ sudo damo report heats --heatmap stdout
+ 22222222222222222222222222222222222222211111111111111111111111111111111111111100
+ 44444444444444444444444444444444444444434444444444444444444444444444444444443200
+ 44444444444444444444444444444444444444433444444444444444444444444444444444444200
+ 33333333333333333333333333333333333333344555555555555555555555555555555555555200
+ 33333333333333333333333333333333333344444444444444444444444444444444444444444200
+ 22222222222222222222222222222222222223355555555555555555555555555555555555555200
+ 00000000000000000000000000000000000000288888888888888888888888888888888888888400
+ 00000000000000000000000000000000000000288888888888888888888888888888888888888400
+ 33333333333333333333333333333333333333355555555555555555555555555555555555555200
+ 88888888888888888888888888888888888888600000000000000000000000000000000000000000
+ 88888888888888888888888888888888888888600000000000000000000000000000000000000000
+ 33333333333333333333333333333333333333444444444444444444444444444444444444443200
+ 00000000000000000000000000000000000000288888888888888888888888888888888888888400
+ [...]
+ # access_frequency: 0 1 2 3 4 5 6 7 8 9
+ # x-axis: space (139728247021568-139728453431248: 196.848 MiB)
+ # y-axis: time (15256597248362-15326899978162: 1 m 10.303 s)
+ # resolution: 80x40 (2.461 MiB and 1.758 s for each character)
+
+You can also visualize the distribution of the working set size, sorted by the
+size::
+
+ $ sudo damo report wss --range 0 101 10
+ # <percentile> <wss>
+ # target_id 18446632103789443072
+ # avr: 107.708 MiB
+ 0 0 B | |
+ 10 95.328 MiB |**************************** |
+ 20 95.332 MiB |**************************** |
+ 30 95.340 MiB |**************************** |
+ 40 95.387 MiB |**************************** |
+ 50 95.387 MiB |**************************** |
+ 60 95.398 MiB |**************************** |
+ 70 95.398 MiB |**************************** |
+ 80 95.504 MiB |**************************** |
+ 90 190.703 MiB |********************************************************* |
+ 100 196.875 MiB |***********************************************************|
+
+Using ``--sortby`` option with the above command, you can show how the working
+set size has chronologically changed::
+
+ $ sudo damo report wss --range 0 101 10 --sortby time
+ # <percentile> <wss>
+ # target_id 18446632103789443072
+ # avr: 107.708 MiB
+ 0 3.051 MiB | |
+ 10 190.703 MiB |***********************************************************|
+ 20 95.336 MiB |***************************** |
+ 30 95.328 MiB |***************************** |
+ 40 95.387 MiB |***************************** |
+ 50 95.332 MiB |***************************** |
+ 60 95.320 MiB |***************************** |
+ 70 95.398 MiB |***************************** |
+ 80 95.398 MiB |***************************** |
+ 90 95.340 MiB |***************************** |
+ 100 95.398 MiB |***************************** |
+
+
+Data Access Pattern Aware Memory Management
+===========================================
+
+The three commands below make every memory region of size >=4K that has not
+been accessed for >=60 seconds in your workload be swapped out. ::
+
+ $ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
+ $ echo "4K max 0 0 60s max pageout" >> test_scheme
+ $ damo schemes -c test_scheme <pid of your workload>
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
new file mode 100644
index 000000000000..b47b0cbbd491
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -0,0 +1,702 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Detailed Usages
+===============
+
+DAMON provides below interfaces for different users.
+
+- *DAMON user space tool.*
+ `This <https://github.com/awslabs/damo>`_ is for privileged people such as
+ system administrators who want a just-working human-friendly interface.
+ Using this, users can use the DAMON’s major features in a human-friendly way.
+ It may not be highly tuned for special cases, though. It supports both
+ virtual and physical address spaces monitoring. For more detail, please
+ refer to its `usage document
+ <https://github.com/awslabs/damo/blob/next/USAGE.md>`_.
+- *sysfs interface.*
+ :ref:`This <sysfs_interface>` is for privileged user space programmers who
+ want more optimized use of DAMON. Using this, users can use DAMON’s major
+ features by reading from and writing to special sysfs files. Therefore,
+ you can write and use your personalized DAMON sysfs wrapper programs that
 read/write the sysfs files instead of you. The `DAMON user space tool
+ <https://github.com/awslabs/damo>`_ is one example of such programs. It
+ supports both virtual and physical address spaces monitoring. Note that this
+ interface provides only simple :ref:`statistics <damos_stats>` for the
+ monitoring results. For detailed monitoring results, DAMON provides a
+ :ref:`tracepoint <tracepoint>`.
+- *debugfs interface.*
+ :ref:`This <debugfs_interface>` is almost identical to :ref:`sysfs interface
 <sysfs_interface>`. This will be removed after the next LTS kernel is released,
+ so users should move to the :ref:`sysfs interface <sysfs_interface>`.
+- *Kernel Space Programming Interface.*
+ :doc:`This </mm/damon/api>` is for kernel space programmers. Using this,
+ users can utilize every feature of DAMON most flexibly and efficiently by
 writing kernel space DAMON application programs. You can even extend
+ DAMON for various address spaces. For detail, please refer to the interface
+ :doc:`document </mm/damon/api>`.
+
+.. _sysfs_interface:
+
+sysfs Interface
+===============
+
+DAMON sysfs interface is built when ``CONFIG_DAMON_SYSFS`` is defined. It
+creates multiple directories and files under its sysfs directory,
+``<sysfs>/kernel/mm/damon/``. You can control DAMON by writing to and reading
+from the files under the directory.
+
+For a short example, users can monitor the virtual address space of a given
+workload as below. ::
+
+ # cd /sys/kernel/mm/damon/admin/
+ # echo 1 > kdamonds/nr_kdamonds && echo 1 > kdamonds/0/contexts/nr_contexts
+ # echo vaddr > kdamonds/0/contexts/0/operations
+ # echo 1 > kdamonds/0/contexts/0/targets/nr_targets
+ # echo $(pidof <workload>) > kdamonds/0/contexts/0/targets/0/pid_target
+ # echo on > kdamonds/0/state
+
+Files Hierarchy
+---------------
+
+The files hierarchy of DAMON sysfs interface is shown below. In the below
+figure, parents-children relations are represented with indentations, each
+directory has a ``/`` suffix, and files in each directory are separated by
+commas (","). ::
+
+ /sys/kernel/mm/damon/admin
+ │ kdamonds/nr_kdamonds
+ │ │ 0/state,pid
+ │ │ │ contexts/nr_contexts
+ │ │ │ │ 0/avail_operations,operations
+ │ │ │ │ │ monitoring_attrs/
+ │ │ │ │ │ │ intervals/sample_us,aggr_us,update_us
+ │ │ │ │ │ │ nr_regions/min,max
+ │ │ │ │ │ targets/nr_targets
+ │ │ │ │ │ │ 0/pid_target
+ │ │ │ │ │ │ │ regions/nr_regions
+ │ │ │ │ │ │ │ │ 0/start,end
+ │ │ │ │ │ │ │ │ ...
+ │ │ │ │ │ │ ...
+ │ │ │ │ │ schemes/nr_schemes
+ │ │ │ │ │ │ 0/action
+ │ │ │ │ │ │ │ access_pattern/
+ │ │ │ │ │ │ │ │ sz/min,max
+ │ │ │ │ │ │ │ │ nr_accesses/min,max
+ │ │ │ │ │ │ │ │ age/min,max
+ │ │ │ │ │ │ │ quotas/ms,bytes,reset_interval_ms
+ │ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil
+ │ │ │ │ │ │ │ watermarks/metric,interval_us,high,mid,low
+ │ │ │ │ │ │ │ stats/nr_tried,sz_tried,nr_applied,sz_applied,qt_exceeds
+ │ │ │ │ │ │ ...
+ │ │ │ │ ...
+ │ │ ...
+
+Root
+----
+
+The root of the DAMON sysfs interface is ``<sysfs>/kernel/mm/damon/``, and it
+has one directory named ``admin``. The directory contains the files for
+privileged user space programs' control of DAMON. User space tools or daemons
+having the root permission could use this directory.
+
+kdamonds/
+---------
+
+The monitoring-related information, including request specifications and
+results, is called a DAMON context. DAMON executes each context with a kernel
+thread called kdamond, and multiple kdamonds could run in parallel.
+
+Under the ``admin`` directory, one directory, ``kdamonds``, which has files for
+controlling the kdamonds, exists. In the beginning, this directory has only one
+file, ``nr_kdamonds``. Writing a number (``N``) to the file creates that number
+of child directories named ``0`` to ``N-1``. Each directory represents each
+kdamond.
+
+kdamonds/<N>/
+-------------
+
+In each kdamond directory, two files (``state`` and ``pid``) and one directory
+(``contexts``) exist.
+
+Reading ``state`` returns ``on`` if the kdamond is currently running, or
+``off`` if it is not running. Writing ``on`` or ``off`` makes the kdamond be
+in the state. Writing ``commit`` to the ``state`` file makes the kdamond read
+the user inputs in the sysfs files except ``state`` file again. Writing
+``update_schemes_stats`` to ``state`` file updates the contents of stats files
+for each DAMON-based operation scheme of the kdamond. For details of the
+stats, please refer to :ref:`stats section <sysfs_schemes_stats>`.
+
+If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread.
+
+``contexts`` directory contains files for controlling the monitoring contexts
+that this kdamond will execute.
+
+kdamonds/<N>/contexts/
+----------------------
+
+In the beginning, this directory has only one file, ``nr_contexts``. Writing a
+number (``N``) to the file creates the number of child directories named as
+``0`` to ``N-1``. Each directory represents each monitoring context. At the
+moment, only one context per kdamond is supported, so only ``0`` or ``1`` can
+be written to the file.
+
+contexts/<N>/
+-------------
+
+In each context directory, two files (``avail_operations`` and ``operations``)
+and three directories (``monitoring_attrs``, ``targets``, and ``schemes``)
+exist.
+
+DAMON supports multiple types of monitoring operations, including those for
+virtual address space and the physical address space. You can get the list of
+available monitoring operations set on the currently running kernel by reading
+``avail_operations`` file. Based on the kernel configuration, the file will
+list some or all of below keywords.
+
+ - vaddr: Monitor virtual address spaces of specific processes
+ - fvaddr: Monitor fixed virtual address ranges
+ - paddr: Monitor the physical address space of the system
+
+Please refer to :ref:`regions sysfs directory <sysfs_regions>` for detailed
+differences between the operations sets in terms of the monitoring target
+regions.
+
+You can set and get what type of monitoring operations DAMON will use for the
+context by writing one of the keywords listed in ``avail_operations`` file and
+reading from the ``operations`` file.
+
+contexts/<N>/monitoring_attrs/
+------------------------------
+
+Files for specifying attributes of the monitoring including required quality
+and efficiency of the monitoring are in ``monitoring_attrs`` directory.
+Specifically, two directories, ``intervals`` and ``nr_regions`` exist in this
+directory.
+
+Under ``intervals`` directory, three files for DAMON's sampling interval
+(``sample_us``), aggregation interval (``aggr_us``), and update interval
+(``update_us``) exist. You can set and get the values in micro-seconds by
+writing to and reading from the files.
+
+Under ``nr_regions`` directory, two files for the lower-bound and upper-bound
+of DAMON's monitoring regions (``min`` and ``max``, respectively), which
+control the monitoring overhead, exist. You can set and get the values by
+writing to and reading from the files.
+
+For more details about the intervals and monitoring regions range, please refer
+to the Design document (:doc:`/mm/damon/design`).
+
+contexts/<N>/targets/
+---------------------
+
+In the beginning, this directory has only one file, ``nr_targets``. Writing a
+number (``N``) to the file creates the number of child directories named ``0``
+to ``N-1``. Each directory represents each monitoring target.
+
+targets/<N>/
+------------
+
+In each target directory, one file (``pid_target``) and one directory
+(``regions``) exist.
+
+If you wrote ``vaddr`` to the ``contexts/<N>/operations``, each target should
+be a process. You can specify the process to DAMON by writing the pid of the
+process to the ``pid_target`` file.
+
+.. _sysfs_regions:
+
+targets/<N>/regions
+-------------------
+
+When ``vaddr`` monitoring operations set is being used (``vaddr`` is written to
+the ``contexts/<N>/operations`` file), DAMON automatically sets and updates the
+monitoring target regions so that entire memory mappings of target processes
+can be covered. However, users could want to set the initial monitoring region
+to specific address ranges.
+
+In contrast, DAMON does not automatically set and update the monitoring target
+regions when ``fvaddr`` or ``paddr`` monitoring operations sets are being used
+(``fvaddr`` or ``paddr`` has been written to the ``contexts/<N>/operations``).
+Therefore, users should set the monitoring target regions by themselves in
+those cases.
+
+For such cases, users can explicitly set the initial monitoring target regions
+as they want, by writing proper values to the files under this directory.
+
+In the beginning, this directory has only one file, ``nr_regions``. Writing a
+number (``N``) to the file creates the number of child directories named ``0``
+to ``N-1``. Each directory represents each initial monitoring target region.
+
+regions/<N>/
+------------
+
+In each region directory, you will find two files (``start`` and ``end``). You
+can set and get the start and end addresses of the initial monitoring target
+region by writing to and reading from the files, respectively.
+
+contexts/<N>/schemes/
+---------------------
+
+For usual DAMON-based data access aware memory management optimizations, users
+would normally want the system to apply a memory management action to a memory
+region of a specific access pattern. DAMON receives such formalized operation
+schemes from the user and applies those to the target memory regions. Users
+can get and set the schemes by reading from and writing to files under this
+directory.
+
+In the beginning, this directory has only one file, ``nr_schemes``. Writing a
+number (``N``) to the file creates the number of child directories named ``0``
+to ``N-1``. Each directory represents each DAMON-based operation scheme.
+
+schemes/<N>/
+------------
+
+In each scheme directory, four directories (``access_pattern``, ``quotas``,
+``watermarks``, and ``stats``) and one file (``action``) exist.
+
+The ``action`` file is for setting and getting what action you want to apply to
+memory regions having the access pattern of interest. The keywords that can be
+written to and read from the file and their meaning are as below.
+
+ - ``willneed``: Call ``madvise()`` for the region with ``MADV_WILLNEED``
+ - ``cold``: Call ``madvise()`` for the region with ``MADV_COLD``
+ - ``pageout``: Call ``madvise()`` for the region with ``MADV_PAGEOUT``
+ - ``hugepage``: Call ``madvise()`` for the region with ``MADV_HUGEPAGE``
+ - ``nohugepage``: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE``
+ - ``lru_prio``: Prioritize the region on its LRU lists.
+ - ``lru_deprio``: Deprioritize the region on its LRU lists.
+ - ``stat``: Do nothing but count the statistics
+
+schemes/<N>/access_pattern/
+---------------------------
+
+The target access pattern of each DAMON-based operation scheme is constructed
+with three ranges including the size of the region in bytes, number of
+monitored accesses per aggregate interval, and number of aggregated intervals
+for the age of the region.
+
+Under the ``access_pattern`` directory, three directories (``sz``,
+``nr_accesses``, and ``age``) each having two files (``min`` and ``max``)
+exist. You can set and get the access pattern for the given scheme by writing
+to and reading from the ``min`` and ``max`` files under ``sz``,
+``nr_accesses``, and ``age`` directories, respectively.
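+
+Because ``age`` is expressed as a number of aggregation intervals rather than
+as a time, a time threshold has to be divided by the aggregation interval
+before it is written to ``age/min`` or ``age/max``. The sketch below shows
+that arithmetic. The helper name is hypothetical, and the 100 ms aggregation
+interval is only an assumed example value.
+

```shell
#!/bin/sh
# Hypothetical helper: convert a time in seconds into a number of
# aggregation intervals.
# $1: time in seconds, $2: aggregation interval in microseconds (aggr_us)
age_in_intervals() {
    echo $(( $1 * 1000000 / $2 ))
}

# With an assumed aggregation interval of 100 ms (100000 us),
# 60 seconds corresponds to this many aggregation intervals.
age_in_intervals 60 100000
```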
+
+schemes/<N>/quotas/
+-------------------
+
+Optimal ``target access pattern`` for each ``action`` is workload dependent, so
+not easy to find. Worse yet, setting a scheme of some action too aggressively
+can cause severe overhead. To avoid such overhead, users can limit time and
+size quota for each scheme. In detail, users can ask DAMON to try to use only
+up to specific time (``time quota``) for applying the action, and to apply the
+action to only up to specific amount (``size quota``) of memory regions having
+the target access pattern within a given time interval (``reset interval``).
+
+When the quota limit is expected to be exceeded, DAMON prioritizes found memory
+regions of the ``target access pattern`` based on their size, access frequency,
+and age. For personalized prioritization, users can set the weights for the
+three properties.
+
+Under ``quotas`` directory, three files (``ms``, ``bytes``,
+``reset_interval_ms``) and one directory (``weights``) having three files
+(``sz_permil``, ``nr_accesses_permil``, and ``age_permil``) in it exist.
+
+You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and
+``reset interval`` in milliseconds by writing the values to the three files,
+respectively. You can also set the prioritization weights for size, access
+frequency, and age in per-thousand unit by writing the values to the three
+files under the ``weights`` directory.
+
+schemes/<N>/watermarks/
+-----------------------
+
+To allow easy activation and deactivation of each scheme based on system
+status, DAMON provides a feature called watermarks. The feature receives five
+values called ``metric``, ``interval``, ``high``, ``mid``, and ``low``. The
+``metric`` is the system metric such as free memory ratio that can be measured.
+If the metric value of the system is higher than the value in ``high`` or lower
+than ``low`` at the moment, the scheme is deactivated. If the value is lower
+than ``mid``, the scheme is activated.
+
+Under the watermarks directory, five files (``metric``, ``interval_us``,
+``high``, ``mid``, and ``low``) for setting each value exist. You can set and
+get the five values by writing to and reading from the files, respectively.
+
+Keywords and meanings of those that can be written to the ``metric`` file are
+as below.
+
+ - none: Ignore the watermarks
+ - free_mem_rate: System's free memory rate (per thousand)
+
+The ``interval`` should be written in microseconds.
+
+.. _sysfs_schemes_stats:
+
+schemes/<N>/stats/
+------------------
+
+DAMON counts the total number and bytes of regions to which each scheme was
+tried to be applied, the same two numbers for regions to which it was
+successfully applied, and the number of times the quota limits were exceeded.
+These statistics can be used for online analysis or tuning of the schemes.
+
+The statistics can be retrieved by reading the files under ``stats`` directory
+(``nr_tried``, ``sz_tried``, ``nr_applied``, ``sz_applied``, and
+``qt_exceeds``), respectively. The files are not updated in real time, so you
+should ask DAMON sysfs interface to update the content of the files for the
+stats by writing a special keyword, ``update_schemes_stats`` to the relevant
+``kdamonds/<N>/state`` file.
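+
+Reading the five files could be scripted as in the sketch below. The function
+name is hypothetical, and the stats directory is passed as an argument so that
+the sketch does not assume any particular kdamond, context, or scheme index.
+

```shell
#!/bin/sh
# Hypothetical helper: print the five statistics of one scheme.
# $1: path to the scheme's stats directory
print_scheme_stats() {
    for f in nr_tried sz_tried nr_applied sz_applied qt_exceeds; do
        printf '%s=%s\n' "$f" "$(cat "$1/$f")"
    done
}
```
+
+The files should be refreshed first by writing ``update_schemes_stats`` to the
+relevant ``state`` file. After that, the function can be called with the path
+of the ``stats`` directory of the scheme of interest.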
+
+Example
+~~~~~~~
+
+The commands below apply a scheme saying "If a memory region of size in [4KiB,
+8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate
+interval in [10, 20], page out the region. For the paging out, use only up to
+10ms per second, and also don't page out more than 1GiB per second. Under the
+limitation, page out memory regions having longer age first. Also, check the
+free memory rate of the system every 5 seconds, start the monitoring and paging
+out when the free memory rate becomes lower than 50%, but stop it if the free
+memory rate becomes larger than 60%, or lower than 30%". ::
+
+ # cd <sysfs>/kernel/mm/damon/admin
+ # # populate directories
+ # echo 1 > kdamonds/nr_kdamonds; echo 1 > kdamonds/0/contexts/nr_contexts;
+ # echo 1 > kdamonds/0/contexts/0/schemes/nr_schemes
+ # cd kdamonds/0/contexts/0/schemes/0
+ # # set the basic access pattern and the action
+ # echo 4096 > access_pattern/sz/min
+ # echo 8192 > access_pattern/sz/max
+ # echo 0 > access_pattern/nr_accesses/min
+ # echo 5 > access_pattern/nr_accesses/max
+ # echo 10 > access_pattern/age/min
+ # echo 20 > access_pattern/age/max
+ # echo pageout > action
+ # # set quotas
+ # echo 10 > quotas/ms
+ # echo $((1024*1024*1024)) > quotas/bytes
+ # echo 1000 > quotas/reset_interval_ms
+ # # set watermark
+ # echo free_mem_rate > watermarks/metric
+ # echo 5000000 > watermarks/interval_us
+ # echo 600 > watermarks/high
+ # echo 500 > watermarks/mid
+ # echo 300 > watermarks/low
+
+Please note that it's highly recommended to use user space tools like `damo
+<https://github.com/awslabs/damo>`_ rather than manually reading and writing
+the files as above. The above is only an example.
+
+.. _debugfs_interface:
+
+debugfs Interface
+=================
+
+.. note::
+
 DAMON debugfs interface will be removed after the next LTS kernel is released, so
+ users should move to the :ref:`sysfs interface <sysfs_interface>`.
+
+DAMON exports eight files, ``attrs``, ``target_ids``, ``init_regions``,
+``schemes``, ``monitor_on``, ``kdamond_pid``, ``mk_contexts`` and
+``rm_contexts`` under its debugfs directory, ``<debugfs>/damon/``.
+
+
+Attributes
+----------
+
+Users can get and set the ``sampling interval``, ``aggregation interval``,
+``update interval``, and min/max number of monitoring target regions by
+reading from and writing to the ``attrs`` file. To know about the monitoring
+attributes in detail, please refer to the :doc:`/mm/damon/design`. For
+example, the commands below set those values to 5 ms, 100 ms, 1,000 ms, 10 and
+1000, and then read them back::
+
+ # cd <debugfs>/damon
+ # echo 5000 100000 1000000 10 1000 > attrs
+ # cat attrs
+ 5000 100000 1000000 10 1000
+
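The example above writes the three intervals in microseconds, so a small helper can keep the unit conversion in one place. The following is only a minimal sketch: the function name ``damon_attrs_line`` is hypothetical and not part of DAMON, and it merely builds the line to write.

```shell
# Hypothetical helper (not part of DAMON): build the line for the attrs file.
# Arguments: sampling, aggregation and update intervals in milliseconds,
# then the minimum and maximum number of monitoring target regions.
damon_attrs_line() {
    echo "$(( $1 * 1000 )) $(( $2 * 1000 )) $(( $3 * 1000 )) $4 $5"
}

# Same values as in the example above.
damon_attrs_line 5 100 1000 10 1000
```

The printed line could then be redirected to ``<debugfs>/damon/attrs``.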
+
+Target IDs
+----------
+
+Some types of address spaces support multiple monitoring targets. For example,
+the virtual memory address spaces monitoring can have multiple processes as the
+monitoring targets. Users can set the targets by writing relevant id values of
+the targets to, and get the ids of the current targets by reading from the
+``target_ids`` file. In case of the virtual address spaces monitoring, the
+values should be pids of the monitoring target processes. For example, the
+commands below set processes having pids 42 and 4242 as the monitoring targets
+and read them back::
+
+ # cd <debugfs>/damon
+ # echo 42 4242 > target_ids
+ # cat target_ids
+ 42 4242
+
+Users can also monitor the physical memory address space of the system by
+writing a special keyword, "``paddr\n``" to the file. Because physical address
+space monitoring doesn't support multiple targets, reading the file will show a
+fake value, ``42``, as below::
+
+ # cd <debugfs>/damon
+ # echo paddr > target_ids
+ # cat target_ids
+ 42
+
+Note that setting the target ids doesn't start the monitoring.
+
+
+Initial Monitoring Target Regions
+---------------------------------
+
+In case of the virtual address space monitoring, DAMON automatically sets and
+updates the monitoring target regions so that entire memory mappings of target
+processes can be covered. However, users may want to limit the monitoring
+region to specific address ranges, such as the heap, the stack, or a specific
+file-mapped area. Or, some users may know the initial access pattern of their
+workloads and therefore want to set optimal initial regions for the 'adaptive
+regions adjustment'.
+
+In contrast, DAMON does not automatically set and update the monitoring target
+regions in case of physical memory monitoring. Therefore, users should set the
+monitoring target regions by themselves.
+
+In such cases, users can explicitly set the initial monitoring target regions
+as they want, by writing proper values to the ``init_regions`` file. Each line
+of the input should represent one region in the form below::
+
+ <target idx> <start address> <end address>
+
+The ``target idx`` should be the index of the target in ``target_ids`` file,
+starting from ``0``, and the regions should be passed in address order. For
+example, the commands below set a couple of address ranges, ``1-100`` and
+``100-200``, as the initial monitoring target regions of pid 42, which is the
+first one (index ``0``) in ``target_ids``, and another couple of address
+ranges, ``20-40`` and ``50-100``, as those of pid 4242, which is the second one
+(index ``1``) in ``target_ids``::
+
+ # cd <debugfs>/damon
+ # cat target_ids
+ 42 4242
+ # echo "0 1 100
+ 0 100 200
+ 1 20 40
+ 1 50 100" > init_regions
+
+Note that this sets the initial monitoring target regions only. In case of
+virtual memory monitoring, DAMON will automatically update the boundary of the
+regions after one ``update interval``. Therefore, users should set the
+``update interval`` large enough in this case, if they don't want the
+update.
+
+
+Schemes
+-------
+
+For usual DAMON-based data access aware memory management optimizations, users
+would simply want the system to apply a memory management action to a memory
+region of a specific access pattern. DAMON receives such formalized operation
+schemes from the user and applies those to the target processes.
+
+Users can get and set the schemes by reading from and writing to the ``schemes``
+debugfs file. Reading the file also shows the statistics of each scheme. When
+writing to the file, each scheme should be given on its own line in the form
+below::
+
+ <target access pattern> <action> <quota> <watermarks>
+
+You can disable schemes by simply writing an empty string to the file.
+
+Target Access Pattern
+~~~~~~~~~~~~~~~~~~~~~
+
+The ``<target access pattern>`` is constructed with three ranges in below
+form::
+
+ min-size max-size min-acc max-acc min-age max-age
+
+Specifically, bytes for the size of regions (``min-size`` and ``max-size``),
+number of monitored accesses per aggregate interval for access frequency
+(``min-acc`` and ``max-acc``), number of aggregate intervals for the age of
+regions (``min-age`` and ``max-age``) are specified. Note that the ranges are
+closed intervals.
+
+Action
+~~~~~~
+
+The ``<action>`` is a predefined integer for memory management actions, which
+DAMON will apply to the regions having the target access pattern. The
+supported numbers and their meanings are as below.
+
+ - 0: Call ``madvise()`` for the region with ``MADV_WILLNEED``
+ - 1: Call ``madvise()`` for the region with ``MADV_COLD``
+ - 2: Call ``madvise()`` for the region with ``MADV_PAGEOUT``
+ - 3: Call ``madvise()`` for the region with ``MADV_HUGEPAGE``
+ - 4: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE``
+ - 5: Do nothing but count the statistics
+
+Quota
+~~~~~
+
+The optimal ``target access pattern`` for each ``action`` is workload dependent,
+so it is not easy to find. Worse yet, making a scheme too aggressive can cause
+severe overhead. To avoid such overhead, users can set time and size quotas for
+the scheme via the ``<quota>`` in the form below::
+
+ <ms> <sz> <reset interval> <priority weights>
+
+This makes DAMON try to use only up to ``<ms>`` milliseconds for applying
+the action to memory regions of the ``target access pattern`` within the
+``<reset interval>`` milliseconds, and to apply the action to only up to
+``<sz>`` bytes of memory regions within the ``<reset interval>``. Setting both
+``<ms>`` and ``<sz>`` to zero disables the quota limits.
+
+When the quota limit is expected to be exceeded, DAMON prioritizes found memory
+regions of the ``target access pattern`` based on their size, access frequency,
+and age. For personalized prioritization, users can set the weights for the
+three properties in ``<priority weights>`` in below form::
+
+ <size weight> <access frequency weight> <age weight>
+
+Watermarks
+~~~~~~~~~~
+
+Some schemes would need to run based on current value of the system's specific
+metrics like free memory ratio. For such cases, users can specify watermarks
+for the condition::
+
+ <metric> <check interval> <high mark> <middle mark> <low mark>
+
+``<metric>`` is a predefined integer for the metric to be checked. The
+supported numbers and their meanings are as below.
+
+ - 0: Ignore the watermarks
+ - 1: System's free memory rate (per thousand)
+
+The value of the metric is checked every ``<check interval>`` microseconds.
+
+If the value is higher than ``<high mark>`` or lower than ``<low mark>``, the
+scheme is deactivated. If the value is lower than ``<middle mark>``, the scheme
+is activated.
+
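The watermark rule above can be sketched as a small decision function. This is only an illustration of the documented rule, not DAMON code: the function name is hypothetical, and keeping the previous state while the value sits between the middle and high marks is an assumption, since the text above does not state what happens in that range.

```shell
# Hypothetical helper (not DAMON code): decide whether a scheme would be active.
# Arguments: previous state (active|inactive), metric value, high, middle and
# low marks, all in the same unit (per thousand for the free memory rate).
damos_wmark_state() {
    prev=$1
    value=$2
    high=$3
    mid=$4
    low=$5
    if [ "$value" -gt "$high" ] || [ "$value" -lt "$low" ]; then
        # Above the high mark or below the low mark: deactivated.
        echo inactive
    elif [ "$value" -lt "$mid" ]; then
        # Below the middle mark (and not below the low mark): activated.
        echo active
    else
        # Assumption: between the middle and high marks the state is kept.
        echo "$prev"
    fi
}

damos_wmark_state inactive 450 600 500 300
damos_wmark_state active 700 600 500 300
```

With marks of 600, 500 and 300, a free memory rate of 450 activates the scheme and a rate of 700 deactivates it.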
+.. _damos_stats:
+
+Statistics
+~~~~~~~~~~
+
+DAMON also counts the number and total bytes of the regions to which it tried
+to apply each scheme, the same two numbers for the regions to which the scheme
+was successfully applied, and the number of times the quota limit was exceeded.
+These statistics can be used for online analysis or tuning of the schemes.
+
+The statistics can be shown by reading the ``schemes`` file. Reading the file
+will show each scheme you entered in each line, and the five numbers for the
+statistics will be added at the end of each line.
+
+Example
+~~~~~~~
+
+The commands below apply a scheme saying "If a memory region of size in [4KiB,
+8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate
+interval in [10, 20], page out the region. For the paging out, use only up to
+10ms per second, and also don't page out more than 1GiB per second. Under the
+limitation, page out memory regions having longer age first. Also, check the
+free memory rate of the system every 5 seconds, start the monitoring and paging
+out when the free memory rate becomes lower than 50%, but stop it if the free
+memory rate becomes larger than 60%, or lower than 30%". ::
+
+ # cd <debugfs>/damon
+ # scheme="4096 8192 0 5 10 20 2" # target access pattern and action
+ # scheme+=" 10 $((1024*1024*1024)) 1000" # quotas
+ # scheme+=" 0 0 100" # prioritization weights
+ # scheme+=" 1 5000000 600 500 300" # watermarks
+ # echo "$scheme" > schemes
+
+
+Turning On/Off
+--------------
+
+Setting the files as described above has no effect unless you explicitly
+start the monitoring. You can start, stop, and check the current status of the
+monitoring by writing to and reading from the ``monitor_on`` file. Writing
+``on`` to the file starts the monitoring of the targets with the attributes.
+Writing ``off`` to the file stops those. DAMON also stops if every target
+process is terminated. Below example commands turn on, off, and check the
+status of DAMON::
+
+ # cd <debugfs>/damon
+ # echo on > monitor_on
+ # echo off > monitor_on
+ # cat monitor_on
+ off
+
+Please note that you cannot write to the above-mentioned debugfs files while
+the monitoring is turned on. If you write to the files while DAMON is running,
+an error code such as ``-EBUSY`` will be returned.
+
+
+Monitoring Thread PID
+---------------------
+
+DAMON does the requested monitoring with a kernel thread called ``kdamond``. You
+can get the pid of the thread by reading the ``kdamond_pid`` file. When the
+monitoring is turned off, reading the file returns ``none``. ::
+
+ # cd <debugfs>/damon
+ # cat monitor_on
+ off
+ # cat kdamond_pid
+ none
+ # echo on > monitor_on
+ # cat kdamond_pid
+ 18594
+
+
+Using Multiple Monitoring Threads
+---------------------------------
+
+One ``kdamond`` thread is created for each monitoring context. For use cases
+that require multiple ``kdamond`` threads, you can create and remove monitoring
+contexts using the ``mk_contexts`` and ``rm_contexts`` files.
+
+Writing the name of the new context to the ``mk_contexts`` file creates a
+directory of the name on the DAMON debugfs directory. The directory will have
+DAMON debugfs files for the context. ::
+
+ # cd <debugfs>/damon
+ # ls foo
+ ls: cannot access 'foo': No such file or directory
+ # echo foo > mk_contexts
+ # ls foo
+ attrs init_regions kdamond_pid schemes target_ids
+
+If the context is not needed anymore, you can remove it and the corresponding
+directory by putting the name of the context to the ``rm_contexts`` file. ::
+
+ # echo foo > rm_contexts
+ # ls foo
+ ls: cannot access 'foo': No such file or directory
+
+Note that ``mk_contexts``, ``rm_contexts``, and ``monitor_on`` files are in the
+root directory only.
+
+
+.. _tracepoint:
+
+Tracepoint for Monitoring Results
+=================================
+
+DAMON provides the monitoring results via a tracepoint,
+``damon:damon_aggregated``. While the monitoring is turned on, you could
+record the tracepoint events and show results using tracepoint supporting tools
+like ``perf``. For example::
+
+ # echo on > monitor_on
+ # perf record -e damon:damon_aggregated &
+ # sleep 5
+ # kill -INT $(pidof perf)
+ # echo off > monitor_on
+ # perf script
diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 1cc0bc78d10e..19f27c0d92e0 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -60,8 +60,12 @@ HugePages_Surp
the pool above the value in ``/proc/sys/vm/nr_hugepages``. The
maximum number of surplus huge pages is controlled by
``/proc/sys/vm/nr_overcommit_hugepages``.
+ Note: When the feature of freeing unused vmemmap pages associated
+ with each hugetlb page is enabled, the number of surplus huge pages
+ may be temporarily larger than the maximum number of surplus huge
+ pages when the system is under memory pressure.
Hugepagesize
- is the default hugepage size (in Kb).
+ is the default hugepage size (in kB).
Hugetlb
is the total amount of memory (in kB), consumed by huge
pages of all sizes.
@@ -80,6 +84,10 @@ returned to the huge page pool when freed by a task. A user with root
privileges can dynamically allocate more or free some persistent huge pages
by increasing or decreasing the value of ``nr_hugepages``.
+Note: When the feature of freeing unused vmemmap pages associated with each
+hugetlb page is enabled, freeing huge pages on user request can fail when the
+system is under memory pressure. Please try again later.
+
Pages that are used as huge pages are reserved inside the kernel and cannot
be used for other purposes. Huge pages cannot be swapped out under
memory pressure.
@@ -100,6 +108,65 @@ with a huge page size selection parameter "hugepagesz=<size>". <size> must
be specified in bytes with optional scale suffix [kKmMgG]. The default huge
page size may be selected with the "default_hugepagesz=<size>" boot parameter.
+Hugetlb boot command line parameter semantics
+
+hugepagesz
+ Specify a huge page size. Used in conjunction with hugepages
+ parameter to preallocate a number of huge pages of the specified
+ size. Hence, hugepagesz and hugepages are typically specified in
+ pairs such as::
+
+ hugepagesz=2M hugepages=512
+
+ hugepagesz can only be specified once on the command line for a
+ specific huge page size. Valid huge page sizes are architecture
+ dependent.
+hugepages
+ Specify the number of huge pages to preallocate. This typically
+ follows a valid hugepagesz or default_hugepagesz parameter. However,
+ if hugepages is the first or only hugetlb command line parameter it
+ implicitly specifies the number of huge pages of default size to
+ allocate. If the number of huge pages of default size is implicitly
+ specified, it can not be overwritten by a hugepagesz,hugepages
+ parameter pair for the default size. This parameter also has a
+ node format. The node format specifies the number of huge pages
+ to allocate on specific nodes.
+
+ For example, on an architecture with 2M default huge page size::
+
+ hugepages=256 hugepagesz=2M hugepages=512
+
+ will result in 256 2M huge pages being allocated and a warning message
+ indicating that the hugepages=512 parameter is ignored. If a hugepages
+ parameter is preceded by an invalid hugepagesz parameter, it will
+ be ignored.
+
+ Node format example::
+
+ hugepagesz=2M hugepages=0:1,1:2
+
+ It will allocate 1 2M hugepage on node0 and 2 2M hugepages on node1.
+ If the node number is invalid, the parameter will be ignored.
+
+default_hugepagesz
+ Specify the default huge page size. This parameter can
+ only be specified once on the command line. default_hugepagesz can
+ optionally be followed by the hugepages parameter to preallocate a
+ specific number of huge pages of default size. The number of default
+ sized huge pages to preallocate can also be implicitly specified as
+ mentioned in the hugepages section above. Therefore, on an
+ architecture with 2M default huge page size::
+
+ hugepages=256
+ default_hugepagesz=2M hugepages=256
+ hugepages=256 default_hugepagesz=2M
+
+ will all result in 256 2M huge pages being allocated. Valid default
+ huge page sizes are architecture dependent.
+hugetlb_free_vmemmap
+ When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HugeTLB
+ Vmemmap Optimization (HVO).
+
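To make the node format of the ``hugepages`` parameter described above concrete, the sketch below splits a value such as ``0:1,1:2`` into its ``<node>:<count>`` pairs. It is an illustrative user-space parser with a hypothetical function name, not how the kernel parses its command line.

```shell
# Illustrative parser (not kernel code): print one line per <node>:<count> pair
# of a node-format hugepages= value.
parse_hugepages_nodes() {
    echo "$1" | tr ',' '\n' | while IFS=: read -r node count; do
        echo "node$node: $count"
    done
}

# The value from the node format example above.
parse_hugepages_nodes "0:1,1:2"
```

For the example value this lists one huge page for node0 and two for node1, matching the description above.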
When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
indicates the current number of pre-allocated huge pages of the default size.
Thus, one can use the following command to dynamically allocate/deallocate
@@ -177,8 +244,12 @@ will exist, of the form::
hugepages-${size}kB
-Inside each of these directories, the same set of files will exist::
+Inside each of these directories, the set of files contained in ``/proc``
+will exist. In addition, two interfaces for demoting huge
+pages may exist::
+ demote
+ demote_size
nr_hugepages
nr_hugepages_mempolicy
nr_overcommit_hugepages
@@ -186,7 +257,29 @@ Inside each of these directories, the same set of files will exist::
resv_hugepages
surplus_hugepages
-which function as described above for the default huge page-sized case.
+The demote interfaces provide the ability to split a huge page into
+smaller huge pages. For example, the x86 architecture supports both
+1GB and 2MB huge page sizes. A 1GB huge page can be split into 512
+2MB huge pages. Demote interfaces are not available for the smallest
+huge page size. The demote interfaces are:
+
+demote_size
+ is the size of demoted pages. When a page is demoted a corresponding
+ number of huge pages of demote_size will be created. By default,
+ demote_size is set to the next smaller huge page size. If there are
+ multiple smaller huge page sizes, demote_size can be set to any of
+ these smaller sizes. Only huge page sizes less than the current huge
+ page size are allowed.
+
+demote
+ is used to demote a number of huge pages. A user with root privileges
+ can write to this file. It may not be possible to demote the
+ requested number of huge pages. To determine how many pages were
+ actually demoted, compare the value of nr_hugepages before and after
+ writing to the demote interface. demote is a write only interface.
+
+The interfaces which are the same as in ``/proc`` (all except demote and
+demote_size) function as described above for the default huge page-sized case.
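The number of smaller pages produced by demoting one huge page follows from the two sizes. The sketch below is plain arithmetic for illustration; the function name is hypothetical and the sizes are given in kB, as in the ``hugepages-${size}kB`` directory names.

```shell
# Illustrative arithmetic (not a kernel interface): how many pages of
# demote_size one huge page is split into.
# Arguments: current huge page size and demote_size, both in kB.
demoted_pages_per_page() {
    echo $(( $1 / $2 ))
}

# One 1GB page (1048576 kB) demoted to 2MB pages (2048 kB).
demoted_pages_per_page 1048576 2048
```

This reproduces the 512 pages mentioned above for the x86 1GB to 2MB case.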
.. _mem_policy_and_hp_alloc:
diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index 11db46448354..d1064e0ba34a 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -3,9 +3,9 @@ Memory Management
=================
Linux memory management subsystem is responsible, as the name implies,
-for managing the memory in the system. This includes implemnetation of
+for managing the memory in the system. This includes implementation of
virtual memory and demand paging, memory allocation both for kernel
-internal structures and user space programms, mapping of files into
+internal structures and user space programs, mapping of files into
processes address space and many other cool things.
Linux memory management is a complex system with many configurable
@@ -27,13 +27,19 @@ the Linux memory management.
concepts
cma_debugfs
+ damon/index
hugetlbpage
idle_page_tracking
ksm
memory-hotplug
+ multigen_lru
+ nommu-mmap
numa_memory_policy
numaperf
pagemap
+ shrinker_debugfs
soft-dirty
+ swap_numa
transhuge
userfaultfd
+ zswap
diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
index 874eb0c77d34..fb6ba2002a4b 100644
--- a/Documentation/admin-guide/mm/ksm.rst
+++ b/Documentation/admin-guide/mm/ksm.rst
@@ -9,7 +9,7 @@ Overview
KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y,
added to the Linux kernel in 2.6.32. See ``mm/ksm.c`` for its implementation,
-and http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/
+and http://lwn.net/Articles/306704/ and https://lwn.net/Articles/330589/
KSM was originally developed for use with KVM (where it was known as
Kernel Shared Memory), to fit more virtual machines into physical memory,
@@ -52,7 +52,7 @@ with EAGAIN, but more probably arousing the Out-Of-Memory killer.
If KSM is not configured into the running kernel, madvise MADV_MERGEABLE
and MADV_UNMERGEABLE simply fail with EINVAL. If the running kernel was
built with CONFIG_KSM=y, those calls will normally succeed: even if the
-the KSM daemon is not currently running, MADV_MERGEABLE still registers
+KSM daemon is not currently running, MADV_MERGEABLE still registers
the range for whenever the KSM daemon is started; even if the range
cannot contain any pages which KSM could actually merge; even if
MADV_UNMERGEABLE is applied to a range which was never MADV_MERGEABLE.
@@ -184,6 +184,60 @@ The maximum possible ``pages_sharing/pages_shared`` ratio is limited by the
``max_page_sharing`` tunable. To increase the ratio ``max_page_sharing`` must
be increased accordingly.
+Monitoring KSM profit
+=====================
+
+KSM can save memory by merging identical pages, but also can consume
+additional memory, because it needs to generate a number of rmap_items to
+save each scanned page's brief rmap information. Some of these pages may
+be merged, but some may not be able to be merged after being checked several
+times, and the memory consumed for those is unprofitable.
+
+1) How to determine whether KSM saves or consumes memory system-wide?
+ Here is a simple approximate calculation for reference::
+
+ general_profit =~ pages_sharing * sizeof(page) - (all_rmap_items) *
+ sizeof(rmap_item);
+
+ where all_rmap_items can be easily obtained by summing ``pages_sharing``,
+ ``pages_shared``, ``pages_unshared`` and ``pages_volatile``.
+
+2) The KSM profit within a single process can be similarly obtained by the
+ following approximate calculation::
+
+ process_profit =~ ksm_merging_pages * sizeof(page) -
+ ksm_rmap_items * sizeof(rmap_item).
+
+ where ksm_merging_pages is shown under the directory ``/proc/<pid>/``,
+ and ksm_rmap_items is shown in ``/proc/<pid>/ksm_stat``.
+
+From the perspective of an application, a high ratio of ``ksm_rmap_items`` to
+``ksm_merging_pages`` means a bad madvise-applied policy, so developers or
+administrators have to rethink how to change the madvise policy. As an example
+for reference, a page's size is usually 4K, and the rmap_item's size is
+32B on 32-bit CPU architectures and 64B on 64-bit CPU architectures, so if
+the ``ksm_rmap_items/ksm_merging_pages`` ratio exceeds 64 on a 64-bit CPU
+or exceeds 128 on a 32-bit CPU, then the app's madvise policy should be dropped,
+because the KSM profit is approximately zero or negative.
+
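The system-wide formula above can be turned into a short calculation. The following is a minimal sketch with a hypothetical function name; the default sizes of 4096 bytes per page and 64 bytes per rmap_item are assumptions taken from the 4K page and 64-bit figures mentioned above, and the counter values in the call are made up.

```shell
# Illustrative helper (not a kernel interface): approximate system-wide KSM
# profit in bytes, following the general_profit formula above.
# Arguments: pages_sharing pages_shared pages_unshared pages_volatile
#            [page size in bytes] [rmap_item size in bytes]
ksm_general_profit() {
    page_size=${5:-4096}        # assumed default: 4K pages
    rmap_item_size=${6:-64}     # assumed default: 64-bit CPU architecture
    all_rmap_items=$(( $1 + $2 + $3 + $4 ))
    echo $(( $1 * page_size - all_rmap_items * rmap_item_size ))
}

# Made-up counter values, for illustration only.
ksm_general_profit 1000 100 200 50
```

A negative result would suggest that KSM currently costs more memory than it saves. On a running system the four counters would normally be read from the KSM sysfs directory rather than typed in by hand.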
+Monitoring KSM events
+=====================
+
+There are some counters in /proc/vmstat that may be used to monitor KSM events.
+KSM might help save memory, but it is a tradeoff, as it may cause delays on KSM
+COW or when copying on swap-in. These events could help users evaluate whether
+or how to use KSM. For example, if cow_ksm increases too fast, the user may
+decrease the range of madvise(, , MADV_MERGEABLE).
+
+cow_ksm
+ is incremented every time a KSM page triggers copy on write (COW).
+ When users try to write to a KSM page, we have to make a copy.
+
+ksm_swpin_copy
+ is incremented every time a KSM page is copied when swapping in.
+ Note that a KSM page might be copied when swapping in because do_swap_page()
+ cannot do all the locking needed to reconstitute a cross-anon_vma KSM page.
+
--
Izik Eidus,
Hugh Dickins, 17 Nov 2009
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 5c4432c96c4b..a3c9e8ad8fa0 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -1,444 +1,677 @@
.. _admin_guide_memory_hotplug:
-==============
-Memory Hotplug
-==============
+==================
+Memory Hot(Un)Plug
+==================
-:Created: Jul 28 2007
-:Updated: Add some details about locking internals: Aug 20 2018
-
-This document is about memory hotplug including how-to-use and current status.
-Because Memory Hotplug is still under development, contents of this text will
-be changed often.
+This document describes generic Linux support for memory hot(un)plug with
+a focus on System RAM, including ZONE_MOVABLE support.
.. contents:: :local:
-.. note::
+Introduction
+============
- (1) x86_64's has special implementation for memory hotplug.
- This text does not describe it.
- (2) This text assumes that sysfs is mounted at ``/sys``.
+Memory hot(un)plug allows for increasing and decreasing the size of physical
+memory available to a machine at runtime. In the simplest case, it consists of
+physically plugging or unplugging a DIMM at runtime, coordinated with the
+operating system.
+Memory hot(un)plug is used for various purposes:
-Introduction
-============
+- The physical memory available to a machine can be adjusted at runtime, up- or
+ downgrading the memory capacity. This dynamic memory resizing, sometimes
+ referred to as "capacity on demand", is frequently used with virtual machines
+ and logical partitions.
+
+- Replacing hardware, such as DIMMs or whole NUMA nodes, without downtime. One
+ example is replacing failing memory modules.
-Purpose of memory hotplug
--------------------------
+- Reducing energy consumption either by physically unplugging memory modules or
+ by logically unplugging (parts of) memory modules from Linux.
-Memory Hotplug allows users to increase/decrease the amount of memory.
-Generally, there are two purposes.
+Further, the basic memory hot(un)plug infrastructure in Linux is nowadays also
+used to expose persistent memory, other performance-differentiated memory and
+reserved memory regions as ordinary system RAM to Linux.
-(A) For changing the amount of memory.
- This is to allow a feature like capacity on demand.
-(B) For installing/removing DIMMs or NUMA-nodes physically.
- This is to exchange DIMMs/NUMA-nodes, reduce power consumption, etc.
+Linux only supports memory hot(un)plug on selected 64 bit architectures, such as
+x86_64, arm64, ppc64, s390x and ia64.
-(A) is required by highly virtualized environments and (B) is required by
-hardware which supports memory power management.
+Memory Hot(Un)Plug Granularity
+------------------------------
-Linux memory hotplug is designed for both purpose.
+Memory hot(un)plug in Linux uses the SPARSEMEM memory model, which divides the
+physical memory address space into chunks of the same size: memory sections. The
+size of a memory section is architecture dependent. For example, x86_64 uses
+128 MiB and ppc64 uses 16 MiB.
-Phases of memory hotplug
+Memory sections are combined into chunks referred to as "memory blocks". The
+size of a memory block is architecture dependent and corresponds to the smallest
+granularity that can be hot(un)plugged. The default size of a memory block is
+the same as memory section size, unless an architecture specifies otherwise.
+
+All memory blocks have the same size.
+
+Phases of Memory Hotplug
------------------------
-There are 2 phases in Memory Hotplug:
+Memory hotplug consists of two phases:
- 1) Physical Memory Hotplug phase
- 2) Logical Memory Hotplug phase.
+(1) Adding the memory to Linux
+(2) Onlining memory blocks
-The First phase is to communicate hardware/firmware and make/erase
-environment for hotplugged memory. Basically, this phase is necessary
-for the purpose (B), but this is good phase for communication between
-highly virtualized environments too.
+In the first phase, metadata, such as the memory map ("memmap") and page tables
+for the direct mapping, is allocated and initialized, and memory blocks are
+created; the latter also creates sysfs files for managing newly created memory
+blocks.
-When memory is hotplugged, the kernel recognizes new memory, makes new memory
-management tables, and makes sysfs files for new memory's operation.
+In the second phase, added memory is exposed to the page allocator. After this
+phase, the memory is visible in memory statistics, such as free and total
+memory, of the system.
-If firmware supports notification of connection of new memory to OS,
-this phase is triggered automatically. ACPI can notify this event. If not,
-"probe" operation by system administration is used instead.
-(see :ref:`memory_hotplug_physical_mem`).
+Phases of Memory Hotunplug
+--------------------------
-Logical Memory Hotplug phase is to change memory state into
-available/unavailable for users. Amount of memory from user's view is
-changed by this phase. The kernel makes all memory in it as free pages
-when a memory range is available.
+Memory hotunplug consists of two phases:
-In this document, this phase is described as online/offline.
+(1) Offlining memory blocks
+(2) Removing the memory from Linux
-Logical Memory Hotplug phase is triggered by write of sysfs file by system
-administrator. For the hot-add case, it must be executed after Physical Hotplug
-phase by hand.
-(However, if you writes udev's hotplug scripts for memory hotplug, these
-phases can be execute in seamless way.)
+In the first phase, memory is "hidden" from the page allocator again, for
+example, by migrating busy memory to other memory locations and removing all
+relevant free pages from the page allocator. After this phase, the memory is no
+longer visible in memory statistics of the system.
-Unit of Memory online/offline operation
----------------------------------------
+In the second phase, the memory blocks are removed and metadata is freed.
-Memory hotplug uses SPARSEMEM memory model which allows memory to be divided
-into chunks of the same size. These chunks are called "sections". The size of
-a memory section is architecture dependent. For example, power uses 16MiB, ia64
-uses 1GiB.
+Memory Hotplug Notifications
+============================
-Memory sections are combined into chunks referred to as "memory blocks". The
-size of a memory block is architecture dependent and represents the logical
-unit upon which memory online/offline operations are to be performed. The
-default size of a memory block is the same as memory section size unless an
-architecture specifies otherwise. (see :ref:`memory_hotplug_sysfs_files`.)
+There are various ways Linux is notified about memory hotplug events such
+that it can start adding hotplugged memory. This description is limited to
+systems that support ACPI; mechanisms specific to other firmware interfaces or
+virtual machines are not described.
-To determine the size (in bytes) of a memory block please read this file::
+ACPI Notifications
+------------------
- /sys/devices/system/memory/block_size_bytes
+Platforms that support ACPI, such as x86_64, can support memory hotplug
+notifications via ACPI.
-Kernel Configuration
-====================
+In general, a firmware supporting memory hotplug defines a memory class object
+HID "PNP0C80". When notified about hotplug of a new memory device, the ACPI
+driver will hotplug the memory to Linux.
-To use memory hotplug feature, kernel must be compiled with following
-config options.
+If the firmware supports hotplug of NUMA nodes, it defines an object _HID
+"ACPI0004", "PNP0A05", or "PNP0A06". When notified about a hotplug event, all
+assigned memory devices are added to Linux by the ACPI driver.
-- For all memory hotplug:
- - Memory model -> Sparse Memory (``CONFIG_SPARSEMEM``)
- - Allow for memory hot-add (``CONFIG_MEMORY_HOTPLUG``)
+Similarly, Linux can be notified about requests to hotunplug a memory device or
+a NUMA node via ACPI. The ACPI driver will try offlining all relevant memory
+blocks, and, if successful, hotunplug the memory from Linux.
-- To enable memory removal, the following are also necessary:
- - Allow for memory hot remove (``CONFIG_MEMORY_HOTREMOVE``)
- - Page Migration (``CONFIG_MIGRATION``)
+Manual Probing
+--------------
-- For ACPI memory hotplug, the following are also necessary:
- - Memory hotplug (under ACPI Support menu) (``CONFIG_ACPI_HOTPLUG_MEMORY``)
- - This option can be kernel module.
+On some architectures, the firmware may not be able to notify the operating
+system about a memory hotplug event. Instead, the memory has to be manually
+probed from user space.
-- As a related configuration, if your box has a feature of NUMA-node hotplug
- via ACPI, then this option is necessary too.
+The probe interface is located at::
- - ACPI0004,PNP0A05 and PNP0A06 Container Driver (under ACPI Support menu)
- (``CONFIG_ACPI_CONTAINER``).
+ /sys/devices/system/memory/probe
- This option can be kernel module too.
+Only complete memory blocks can be probed. Individual memory blocks are probed
+by providing the physical start address of the memory block::
+ % echo addr > /sys/devices/system/memory/probe
-.. _memory_hotplug_sysfs_files:
+Which results in a memory block for the range [addr, addr + memory_block_size)
+being created.
-sysfs files for memory hotplug
-==============================
+.. note::
-All memory blocks have their device information in sysfs. Each memory block
-is described under ``/sys/devices/system/memory`` as::
+ Using the probe interface is discouraged as it is easy to crash the kernel,
+ because Linux cannot validate user input; this interface might be removed in
+ the future.
- /sys/devices/system/memory/memoryXXX
+Onlining and Offlining Memory Blocks
+====================================
-where XXX is the memory block id.
+After a memory block has been created, Linux has to be instructed to actually
+make use of that memory: the memory block has to be "onlined".
-For the memory block covered by the sysfs directory. It is expected that all
-memory sections in this range are present and no memory holes exist in the
-range. Currently there is no way to determine if there is a memory hole, but
-the existence of one should not affect the hotplug capabilities of the memory
-block.
+Before a memory block can be removed, Linux has to stop using any memory that is
+part of the memory block: the memory block has to be "offlined".
-For example, assume 1GiB memory block size. A device for a memory starting at
-0x100000000 is ``/sys/device/system/memory/memory4``::
+The Linux kernel can be configured to automatically online added memory blocks.
+Drivers may automatically trigger offlining of memory blocks when attempting
+hotunplug of memory; memory blocks can only be removed once offlining has
+succeeded.
- (0x100000000 / 1Gib = 4)
+Onlining Memory Blocks Manually
+-------------------------------
-This device covers address range [0x100000000 ... 0x140000000)
+If auto-onlining of memory blocks isn't enabled, user space has to manually
+trigger onlining of memory blocks. Often, udev rules are used to automate this
+task in user space.
-Under each memory block, you can see 5 files:
+Onlining of a memory block can be triggered via::
-- ``/sys/devices/system/memory/memoryXXX/phys_index``
-- ``/sys/devices/system/memory/memoryXXX/phys_device``
-- ``/sys/devices/system/memory/memoryXXX/state``
-- ``/sys/devices/system/memory/memoryXXX/removable``
-- ``/sys/devices/system/memory/memoryXXX/valid_zones``
+ % echo online > /sys/devices/system/memory/memoryXXX/state
-=================== ============================================================
-``phys_index`` read-only and contains memory block id, same as XXX.
-``state`` read-write
-
- - at read: contains online/offline state of memory.
- - at write: user can specify "online_kernel",
-
- "online_movable", "online", "offline" command
- which will be performed on all sections in the block.
-``phys_device`` read-only: designed to show the name of physical memory
- device. This is not well implemented now.
-``removable`` read-only: contains an integer value indicating
- whether the memory block is removable or not
- removable. A value of 1 indicates that the memory
- block is removable and a value of 0 indicates that
- it is not removable. A memory block is removable only if
- every section in the block is removable.
-``valid_zones`` read-only: designed to show which zones this memory block
- can be onlined to.
-
- The first column shows it`s default zone.
-
- "memory6/valid_zones: Normal Movable" shows this memoryblock
- can be onlined to ZONE_NORMAL by default and to ZONE_MOVABLE
- by online_movable.
-
- "memory7/valid_zones: Movable Normal" shows this memoryblock
- can be onlined to ZONE_MOVABLE by default and to ZONE_NORMAL
- by online_kernel.
-=================== ============================================================
+Or alternatively::
-.. note::
+ % echo 1 > /sys/devices/system/memory/memoryXXX/online
- These directories/files appear after physical memory hotplug phase.
+The kernel will select the target zone automatically, depending on the
+configured ``online_policy``.
-If CONFIG_NUMA is enabled the memoryXXX/ directories can also be accessed
-via symbolic links located in the ``/sys/devices/system/node/node*`` directories.
+One can explicitly request to associate an offline memory block with
+ZONE_MOVABLE by::
-For example::
+ % echo online_movable > /sys/devices/system/memory/memoryXXX/state
- /sys/devices/system/node/node0/memory9 -> ../../memory/memory9
+Or one can explicitly request a kernel zone (usually ZONE_NORMAL) by::
-A backlink will also be created::
+ % echo online_kernel > /sys/devices/system/memory/memoryXXX/state
- /sys/devices/system/memory/memory9/node0 -> ../../node/node0
+In any case, if onlining succeeds, the state of the memory block is changed to
+be "online". If it fails, the state of the memory block will remain unchanged
+and the above commands will fail.
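The manual onlining steps above can be wrapped in a small helper. This is a minimal sketch, not part of the kernel documentation: the function name ``online_block``, its argument order, and the choice to pass the sysfs directory as a parameter are all assumptions made for illustration.

```shell
# Sketch only: online one memory block and confirm the resulting state.
# The function name and argument order are illustrative, not a kernel interface.
online_block() {
    root=$1              # e.g. /sys/devices/system/memory
    block=$2             # e.g. memory4
    mode=${3:-online}    # online, online_kernel or online_movable

    case "$mode" in
        online|online_kernel|online_movable) ;;
        *) echo "unsupported mode: $mode" >&2; return 2 ;;
    esac

    # The write itself fails if the kernel rejects the request.
    echo "$mode" > "$root/$block/state" || return 1

    # On success the state reads "online", whichever zone was requested.
    [ "$(cat "$root/$block/state")" = "online" ]
}
```

A call such as ``online_block /sys/devices/system/memory memory4 online_movable`` would typically need root privileges; a non-zero return value means the request was invalid, rejected, or did not leave the block online.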
-.. _memory_hotplug_physical_mem:
+Onlining Memory Blocks Automatically
+------------------------------------
-Physical memory hot-add phase
-=============================
+The kernel can be configured to try auto-onlining of newly added memory blocks.
+If this feature is disabled, the memory blocks will stay offline until
+explicitly onlined from user space.
-Hardware(Firmware) Support
---------------------------
+The configured auto-online behavior can be observed via::
-On x86_64/ia64 platform, memory hotplug by ACPI is supported.
+ % cat /sys/devices/system/memory/auto_online_blocks
-In general, the firmware (ACPI) which supports memory hotplug defines
-memory class object of _HID "PNP0C80". When a notify is asserted to PNP0C80,
-Linux's ACPI handler does hot-add memory to the system and calls a hotplug udev
-script. This will be done automatically.
+Auto-onlining can be enabled by writing ``online``, ``online_kernel`` or
+``online_movable`` to that file, like::
-But scripts for memory hotplug are not contained in generic udev package(now).
-You may have to write it by yourself or online/offline memory by hand.
-Please see :ref:`memory_hotplug_how_to_online_memory` and
-:ref:`memory_hotplug_how_to_offline_memory`.
+ % echo online > /sys/devices/system/memory/auto_online_blocks
-If firmware supports NUMA-node hotplug, and defines an object _HID "ACPI0004",
-"PNP0A05", or "PNP0A06", notification is asserted to it, and ACPI handler
-calls hotplug code for all of objects which are defined in it.
-If memory device is found, memory hotplug code will be called.
+Similarly to manual onlining, with ``online`` the kernel will select the
+target zone automatically, depending on the configured ``online_policy``.
-Notify memory hot-add event by hand
------------------------------------
+Modifying the auto-online behavior will only affect subsequently added
+memory blocks.
-On some architectures, the firmware may not notify the kernel of a memory
-hotplug event. Therefore, the memory "probe" interface is supported to
-explicitly notify the kernel. This interface depends on
-CONFIG_ARCH_MEMORY_PROBE and can be configured on powerpc, sh, and x86
-if hotplug is supported, although for x86 this should be handled by ACPI
-notification.
+.. note::
-Probe interface is located at::
+ In corner cases, auto-onlining can fail. The kernel won't retry. Note that
+ auto-onlining is not expected to fail in default configurations.
- /sys/devices/system/memory/probe
+.. note::
-You can tell the physical address of new memory to the kernel by::
+ DLPAR on ppc64 ignores the ``offline`` setting and will still online added
+ memory blocks; if onlining fails, memory blocks are removed again.
- % echo start_address_of_new_memory > /sys/devices/system/memory/probe
+Offlining Memory Blocks
+-----------------------
-Then, [start_address_of_new_memory, start_address_of_new_memory +
-memory_block_size] memory range is hot-added. In this case, hotplug script is
-not called (in current implementation). You'll have to online memory by
-yourself. Please see :ref:`memory_hotplug_how_to_online_memory`.
+In the current implementation, Linux's memory offlining will try migrating all
+movable pages off the affected memory block. As most kernel allocations, such as
+page tables, are unmovable, page migration can fail and, therefore, inhibit
+memory offlining from succeeding.
-Logical Memory hot-add phase
-============================
+Having the memory provided by a memory block managed by ZONE_MOVABLE significantly
+increases memory offlining reliability; still, memory offlining can fail in
+some corner cases.
-State of memory
----------------
+Further, memory offlining might retry for a long time (or even forever), until
+aborted by the user.
-To see (online/offline) state of a memory block, read 'state' file::
+Offlining of a memory block can be triggered via::
+
+ % echo offline > /sys/devices/system/memory/memoryXXX/state
+
+Or alternatively::
+
+ % echo 0 > /sys/devices/system/memory/memoryXXX/online
+
+If offlining succeeds, the state of the memory block is changed to be "offline".
+If it fails, the state of the memory block will remain unchanged and the above
+commands will fail, for example, via::
+
+ bash: echo: write error: Device or resource busy
+
+or via::
+
+ bash: echo: write error: Invalid argument
+
+Observing the State of Memory Blocks
+------------------------------------
+
+The state (online/offline/going-offline) of a memory block can be observed
+either via::
% cat /sys/devices/system/memory/memoryXXX/state
+Or alternatively (1/0) via::
-- If the memory block is online, you'll read "online".
-- If the memory block is offline, you'll read "offline".
+ % cat /sys/devices/system/memory/memoryXXX/online
+For an online memory block, the managing zone can be observed via::
-.. _memory_hotplug_how_to_online_memory:
+ % cat /sys/devices/system/memory/memoryXXX/valid_zones
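To observe the state of every memory block at once, the per-block reads can be looped. The following is a rough sketch; the function name ``list_block_states`` is hypothetical, and the default directory is assumed to be ``/sys/devices/system/memory``.

```shell
# Sketch only: print "<block> <state>" for each memory block directory.
# The directory is a parameter so the helper can be pointed at a test tree.
list_block_states() {
    root=${1:-/sys/devices/system/memory}
    for dir in "$root"/memory[0-9]*; do
        # Skip anything without a readable state file (including an unmatched glob).
        [ -r "$dir/state" ] || continue
        printf '%s %s\n' "${dir##*/}" "$(cat "$dir/state")"
    done
}
```

Note that the shell expands the glob in lexical order, so ``memory10`` is listed before ``memory4``.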
-How to online memory
---------------------
+Configuring Memory Hot(Un)Plug
+==============================
-When the memory is hot-added, the kernel decides whether or not to "online"
-it according to the policy which can be read from "auto_online_blocks" file::
+System administrators can configure memory hot(un)plug and interact with memory
+blocks, in particular to online them, in various ways.
- % cat /sys/devices/system/memory/auto_online_blocks
+Memory Hot(Un)Plug Configuration via Sysfs
+------------------------------------------
-The default depends on the CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config
-option. If it is disabled the default is "offline" which means the newly added
-memory is not in a ready-to-use state and you have to "online" the newly added
-memory blocks manually. Automatic onlining can be requested by writing "online"
-to "auto_online_blocks" file::
+Some memory hot(un)plug properties can be configured or inspected via sysfs in::
- % echo online > /sys/devices/system/memory/auto_online_blocks
+ /sys/devices/system/memory/
-This sets a global policy and impacts all memory blocks that will subsequently
-be hotplugged. Currently offline blocks keep their state. It is possible, under
-certain circumstances, that some memory blocks will be added but will fail to
-online. User space tools can check their "state" files
-(``/sys/devices/system/memory/memoryXXX/state``) and try to online them manually.
+The following files are currently defined:
-If the automatic onlining wasn't requested, failed, or some memory block was
-offlined it is possible to change the individual block's state by writing to the
-"state" file::
+====================== =========================================================
+``auto_online_blocks`` read-write: set or get the default state of new memory
+ blocks; configure auto-onlining.
- % echo online > /sys/devices/system/memory/memoryXXX/state
+ The default value depends on the
+ CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel configuration
+ option.
-This onlining will not change the ZONE type of the target memory block,
-If the memory block doesn't belong to any zone an appropriate kernel zone
-(usually ZONE_NORMAL) will be used unless movable_node kernel command line
-option is specified when ZONE_MOVABLE will be used.
+ See the ``state`` property of memory blocks for details.
+``block_size_bytes`` read-only: the size in bytes of a memory block.
+``probe`` write-only: add (probe) selected memory blocks manually
+ from user space by supplying the physical start address.
-You can explicitly request to associate it with ZONE_MOVABLE by::
+ Availability depends on the CONFIG_ARCH_MEMORY_PROBE
+ kernel configuration option.
+``uevent`` read-write: generic udev file for device subsystems.
+====================== =========================================================
- % echo online_movable > /sys/devices/system/memory/memoryXXX/state
+.. note::
-.. note:: current limit: this memory block must be adjacent to ZONE_MOVABLE
+ When the CONFIG_MEMORY_FAILURE kernel configuration option is enabled, two
+ additional files ``hard_offline_page`` and ``soft_offline_page`` are available
+ to trigger hwpoisoning of pages, for example, for testing purposes. Note that
+ this functionality is not really related to memory hot(un)plug or actual
+ offlining of memory blocks.
-Or you can explicitly request a kernel zone (usually ZONE_NORMAL) by::
+Memory Block Configuration via Sysfs
+------------------------------------
- % echo online_kernel > /sys/devices/system/memory/memoryXXX/state
+Each memory block is represented as a memory block device that can be
+onlined or offlined. All memory blocks have their device information located in
+sysfs. Each present memory block is listed under
+``/sys/devices/system/memory`` as::
-.. note:: current limit: this memory block must be adjacent to ZONE_NORMAL
+ /sys/devices/system/memory/memoryXXX
-An explicit zone onlining can fail (e.g. when the range is already within
-and existing and incompatible zone already).
+where XXX is the memory block id; the number of digits is variable.
-After this, memory block XXX's state will be 'online' and the amount of
-available memory will be increased.
+A present memory block indicates that some memory in the range is present;
+however, a memory block might span memory holes. A memory block spanning memory
+holes cannot be offlined.
-This may be changed in future.
+For example, assume 1 GiB memory block size. A device for memory starting at
+0x100000000 is ``/sys/devices/system/memory/memory4``::
-Logical memory remove
-=====================
+ (0x100000000 / 1 GiB = 4)
-Memory offline and ZONE_MOVABLE
--------------------------------
+This device covers address range [0x100000000 ... 0x140000000)
+
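The arithmetic in the example above can be reproduced with shell integer arithmetic. This is only a sketch: the helper name ``memblock_for_addr`` is made up here, and the block size is passed in explicitly. When taking it from ``block_size_bytes`` instead, check the format of that file first; it is assumed here to be hexadecimal without a ``0x`` prefix.

```shell
# Sketch only: map a physical address to its memory block id and address range.
# Both arguments may be decimal or 0x-prefixed hexadecimal.
memblock_for_addr() {
    addr=$(( $1 ))
    block_size=$(( $2 ))
    id=$(( addr / block_size ))
    start=$(( id * block_size ))
    end=$(( start + block_size ))
    printf 'memory%d [0x%x, 0x%x)\n' "$id" "$start" "$end"
}

# 1 GiB blocks, address 0x100000000:
memblock_for_addr 0x100000000 0x40000000
# prints: memory4 [0x100000000, 0x140000000)
```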
+The following files are currently defined:
-Memory offlining is more complicated than memory online. Because memory offline
-has to make the whole memory block be unused, memory offline can fail if
-the memory block includes memory which cannot be freed.
+=================== ============================================================
+``online`` read-write: simplified interface to trigger onlining /
+ offlining and to observe the state of a memory block.
+ When onlining, the zone is selected automatically.
+``phys_device`` read-only: legacy interface only ever used on s390x to
+ expose the covered storage increment.
+``phys_index`` read-only: the memory block id (XXX).
+``removable`` read-only: legacy interface that indicated whether a memory
+ block was likely to be offlineable or not. Nowadays, the
+ kernel returns ``1`` if and only if it supports memory
+ offlining.
+``state`` read-write: advanced interface to trigger onlining /
+ offlining and to observe the state of a memory block.
+
+ When writing, ``online``, ``offline``, ``online_kernel`` and
+ ``online_movable`` are supported.
+
+ ``online_movable`` specifies onlining to ZONE_MOVABLE.
+ ``online_kernel`` specifies onlining to the default kernel
+ zone for the memory block, such as ZONE_NORMAL.
+ ``online`` lets the kernel select the zone automatically.
+
+ When reading, ``online``, ``offline`` and ``going-offline``
+ may be returned.
+``uevent`` read-write: generic uevent file for devices.
+``valid_zones`` read-only: when a block is online, shows the zone it
+ belongs to; when a block is offline, shows what zone will
+ manage it once the block is onlined.
+
+ For online memory blocks, ``DMA``, ``DMA32``, ``Normal``,
+ ``Movable`` and ``none`` may be returned. ``none`` indicates
+ that memory provided by a memory block is managed by
+ multiple zones or spans multiple nodes; such memory blocks
+ cannot be offlined. ``Movable`` indicates ZONE_MOVABLE.
+ Other values indicate a kernel zone.
+
+ For offline memory blocks, the first column shows the
+ zone the kernel would select when onlining the memory block
+ right now without further specifying a zone.
+
+ Availability depends on the CONFIG_MEMORY_HOTREMOVE
+ kernel configuration option.
+=================== ============================================================
-In general, memory offline can use 2 techniques.
+.. note::
-(1) reclaim and free all memory in the memory block.
-(2) migrate all pages in the memory block.
+ If the CONFIG_NUMA kernel configuration option is enabled, the memoryXXX/
+ directories can also be accessed via symbolic links located in the
+ ``/sys/devices/system/node/node*`` directories.
-In the current implementation, Linux's memory offline uses method (2), freeing
-all pages in the memory block by page migration. But not all pages are
-migratable. Under current Linux, migratable pages are anonymous pages and
-page caches. For offlining a memory block by migration, the kernel has to
-guarantee that the memory block contains only migratable pages.
+ For example::
-Now, a boot option for making a memory block which consists of migratable pages
-is supported. By specifying "kernelcore=" or "movablecore=" boot option, you can
-create ZONE_MOVABLE...a zone which is just used for movable pages.
-(See also Documentation/admin-guide/kernel-parameters.rst)
+ /sys/devices/system/node/node0/memory9 -> ../../memory/memory9
+
+ A backlink will also be created::
+
+ /sys/devices/system/memory/memory9/node0 -> ../../node/node0
+
+Command Line Parameters
+-----------------------
+
+Some command line parameters affect memory hot(un)plug handling. The following
+command line parameters are relevant:
+
+======================== =======================================================
+``memhp_default_state`` configure auto-onlining by essentially setting
+ ``/sys/devices/system/memory/auto_online_blocks``.
+``movable_node`` configure automatic zone selection in the kernel when
+ using the ``contig-zones`` online policy. When
+ set, the kernel will default to ZONE_MOVABLE when
+ onlining a memory block, unless other zones can be kept
+ contiguous.
+======================== =======================================================
+
+See Documentation/admin-guide/kernel-parameters.txt for a more generic
+description of these command line parameters.
+
+Module Parameters
+------------------
+
+Instead of additional command line parameters or sysfs files, the
+``memory_hotplug`` subsystem now provides a dedicated namespace for module
+parameters. Module parameters can be set via the command line by prefixing
+them with ``memory_hotplug.`` such as::
+
+ memory_hotplug.memmap_on_memory=1
+
+and they can be observed (and some even modified at runtime) via::
+
+ /sys/module/memory_hotplug/parameters/
+
+The following module parameters are currently defined:
+
+================================ ===============================================
+``memmap_on_memory`` read-write: Allocate memory for the memmap from
+ the added memory block itself. Even if enabled,
+ actual support depends on various other system
+ properties and should only be regarded as a
+ hint whether the behavior would be desired.
+
+ While allocating the memmap from the memory
+ block itself makes memory hotplug less likely
+ to fail and keeps the memmap on the same NUMA
+ node in any case, it can fragment physical
+ memory in a way that huge pages in bigger
+ granularity cannot be formed on hotplugged
+ memory.
+``online_policy`` read-write: Set the basic policy used for
+ automatic zone selection when onlining memory
+ blocks without specifying a target zone.
+ ``contig-zones`` has been the kernel default
+ before this parameter was added. After an
+ online policy was configured and memory was
+ onlined, the policy should not be changed
+ anymore.
+
+ When set to ``contig-zones``, the kernel will
+ try keeping zones contiguous. If a memory block
+ intersects multiple zones or no zone, the
+ behavior depends on the ``movable_node`` kernel
+ command line parameter: default to ZONE_MOVABLE
+ if set, default to the applicable kernel zone
+ (usually ZONE_NORMAL) if not set.
+
+ When set to ``auto-movable``, the kernel will
+ try onlining memory blocks to ZONE_MOVABLE if
+ possible according to the configuration and
+ memory device details. With this policy, one
+ can avoid zone imbalances when eventually
+ hotplugging a lot of memory later and still
+ wanting to be able to hotunplug as much as
+ possible reliably, very desirable in
+ virtualized environments. This policy ignores
+ the ``movable_node`` kernel command line
+ parameter and isn't really applicable in
+ environments that require it (e.g., bare metal
+ with hotunpluggable nodes) where hotplugged
+ memory might be exposed via the
+ firmware-provided memory map early during boot
+ to the system instead of getting detected,
+ added and onlined later during boot (such as
+ done by virtio-mem or by some hypervisors
+ implementing emulated DIMMs). As one example, a
+ hotplugged DIMM will be onlined either
+ completely to ZONE_MOVABLE or completely to
+ ZONE_NORMAL, not a mixture.
+ As another example, as many memory blocks
+ belonging to a virtio-mem device will be
+ onlined to ZONE_MOVABLE as possible,
+ special-casing units of memory blocks that can
+ only get hotunplugged together. *This policy
+ does not protect from setups that are
+ problematic with ZONE_MOVABLE and does not
+ change the zone of memory blocks dynamically
+ after they were onlined.*
+``auto_movable_ratio`` read-write: Set the maximum MOVABLE:KERNEL
+ memory ratio in % for the ``auto-movable``
+ online policy. Whether the ratio applies only
+ for the system across all NUMA nodes or also
+ per NUMA node depends on the
+ ``auto_movable_numa_aware`` configuration.
+
+ All accounting is based on present memory pages
+ in the zones combined with accounting per
+ memory device. Memory dedicated to the CMA
+ allocator is accounted as MOVABLE, although
+ residing on one of the kernel zones. The
+ possible ratio depends on the actual workload.
+ The kernel default is "301" %, for example,
+ allowing for hotplugging 24 GiB to an 8 GiB VM
+ and automatically onlining all hotplugged
+ memory to ZONE_MOVABLE in many setups. The
+ additional 1% deals with some pages being not
+ present, for example, because of some firmware
+ allocations.
+
+ Note that ZONE_NORMAL memory provided by one
+ memory device does not allow for more
+ ZONE_MOVABLE memory for a different memory
+ device. As one example, onlining memory of a
+ hotplugged DIMM to ZONE_NORMAL will not allow
+ for another hotplugged DIMM to get onlined to
+ ZONE_MOVABLE automatically. In contrast, memory
+ hotplugged by a virtio-mem device that got
+ onlined to ZONE_NORMAL will allow for more
+ ZONE_MOVABLE memory within *the same*
+ virtio-mem device.
+``auto_movable_numa_aware`` read-write: Configure whether the
+ ``auto_movable_ratio`` in the ``auto-movable``
+ online policy also applies per NUMA
+ node in addition to the whole system across all
+ NUMA nodes. The kernel default is "Y".
+
+ Disabling NUMA awareness can be helpful when
+ dealing with NUMA nodes that should be
+ completely hotunpluggable, onlining the memory
+ completely to ZONE_MOVABLE automatically if
+ possible.
+
+ Parameter availability depends on CONFIG_NUMA.
+================================ ===============================================
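The default ``auto_movable_ratio`` figure can be checked with a quick calculation. This sketch deliberately simplifies: it treats the limit as kernel memory multiplied by the ratio and ignores the per-device and per-node accounting described in the table. The helper name ``max_movable_mib`` is hypothetical.

```shell
# Sketch only: upper bound on MOVABLE memory (in MiB) for a given amount of
# KERNEL memory (in MiB) and a MOVABLE:KERNEL ratio in percent.
# Real kernel accounting is per present page and per memory device.
max_movable_mib() {
    kernel_mib=$1
    ratio_percent=$2
    echo $(( kernel_mib * ratio_percent / 100 ))
}

# 8 GiB of kernel memory with the default ratio of 301 %:
max_movable_mib 8192 301
# prints: 24657
```

The result is slightly above 24 GiB (24576 MiB), which is consistent with the 24 GiB example and with the extra 1% absorbing pages that are not present.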
+
+ZONE_MOVABLE
+============
+
+ZONE_MOVABLE is an important mechanism for more reliable memory offlining.
+Further, having system RAM managed by ZONE_MOVABLE instead of one of the
+kernel zones can increase the number of possible transparent huge pages and
+dynamically allocated huge pages.
+
+Most kernel allocations are unmovable. Important examples include the memory
+map (usually 1/64th of memory), page tables, and kmalloc(). Such allocations
+can only be served from the kernel zones.
+
+Most user space pages, such as anonymous memory, and page cache pages are
+movable. Such allocations can be served from ZONE_MOVABLE and the kernel zones.
+
+Only movable allocations are served from ZONE_MOVABLE, resulting in unmovable
+allocations being limited to the kernel zones. Without ZONE_MOVABLE, there is
+absolutely no guarantee whether a memory block can be offlined successfully.
+
+Zone Imbalances
+---------------
-Assume the system has "TOTAL" amount of memory at boot time, this boot option
-creates ZONE_MOVABLE as following.
+Having too much system RAM managed by ZONE_MOVABLE is called a zone imbalance,
+which can harm the system or degrade performance. As one example, the kernel
+might crash because it runs out of free memory for unmovable allocations,
+although there is still plenty of free memory left in ZONE_MOVABLE.
-1) When kernelcore=YYYY boot option is used,
- Size of memory not for movable pages (not for offline) is YYYY.
- Size of memory for movable pages (for offline) is TOTAL-YYYY.
+Usually, MOVABLE:KERNEL ratios of up to 3:1 or even 4:1 are fine. Ratios of 63:1
+are definitely impossible due to the overhead for the memory map.
-2) When movablecore=ZZZZ boot option is used,
- Size of memory not for movable pages (not for offline) is TOTAL - ZZZZ.
- Size of memory for movable pages (for offline) is ZZZZ.
+Actual safe zone ratios depend on the workload. Extreme cases, like excessive
+long-term pinning of pages, might not be able to deal with ZONE_MOVABLE at all.
.. note::
- Unfortunately, there is no information to show which memory block belongs
- to ZONE_MOVABLE. This is TBD.
+ CMA memory part of a kernel zone essentially behaves like memory in
+ ZONE_MOVABLE and similar considerations apply, especially when combining
+ CMA with ZONE_MOVABLE.
-.. _memory_hotplug_how_to_offline_memory:
+ZONE_MOVABLE Sizing Considerations
+----------------------------------
-How to offline memory
----------------------
+We usually expect that a large portion of available system RAM will actually
+be consumed by user space, either directly or indirectly via the page cache. In
+the normal case, ZONE_MOVABLE can be used when allocating such pages just fine.
-You can offline a memory block by using the same sysfs interface that was used
-in memory onlining::
+With that in mind, it makes sense that we can have a big portion of system RAM
+managed by ZONE_MOVABLE. However, there are some things to consider when using
+ZONE_MOVABLE, especially when fine-tuning zone ratios:
- % echo offline > /sys/devices/system/memory/memoryXXX/state
+- Having a lot of offline memory blocks. Even offline memory blocks consume
+ memory for metadata and page tables in the direct map; having a lot of offline
+ memory blocks is not a typical case, though.
+
+- Memory ballooning without balloon compaction is incompatible with
+ ZONE_MOVABLE. Only some implementations, such as virtio-balloon and
+ pseries CMM, fully support balloon compaction.
+
+ Further, the CONFIG_BALLOON_COMPACTION kernel configuration option might be
+ disabled. In that case, balloon inflation will only perform unmovable
+ allocations and silently create a zone imbalance, usually triggered by
+ inflation requests from the hypervisor.
+
+- Gigantic pages are unmovable, resulting in user space consuming a
+ lot of unmovable memory.
+
+- Huge pages are unmovable when an architecture does not support huge
+ page migration, resulting in a similar issue as with gigantic pages.
+
+- Page tables are unmovable. Excessive swapping, mapping extremely large
+ files or ZONE_DEVICE memory can be problematic, although only really relevant
+ in corner cases. When we manage a lot of user space memory that has been
+ swapped out or is served from a file/persistent memory/... we still need a lot
+ of page tables to manage that memory once user space accessed that memory.
+
+- In certain DAX configurations the memory map for the device memory will be
+ allocated from the kernel zones.
+
+- KASAN can have a significant memory overhead, for example, consuming 1/8th of
+ the total system memory size as (unmovable) tracking metadata.
+
+- Long-term pinning of pages. Techniques that rely on long-term pinnings
+ (especially, RDMA and vfio/mdev) are fundamentally problematic with
+ ZONE_MOVABLE, and therefore, memory offlining. Pinned pages cannot reside
+ on ZONE_MOVABLE as that would turn these pages unmovable. Therefore, they
+ have to be migrated off that zone while pinning. Pinning a page can fail
+ even if there is plenty of free memory in ZONE_MOVABLE.
+
+ In addition, using ZONE_MOVABLE might make page pinning more expensive,
+ because of the page migration overhead.
+
+By default, all the memory configured at boot time is managed by the kernel
+zones and ZONE_MOVABLE is not used.
+
+To enable ZONE_MOVABLE to include the memory present at boot and to control the
+ratio between movable and kernel zones there are two command line options:
+``kernelcore=`` and ``movablecore=``. See
+Documentation/admin-guide/kernel-parameters.rst for their description.
+
+Memory Offlining and ZONE_MOVABLE
+---------------------------------
+
+Even with ZONE_MOVABLE, there are some corner cases where offlining a memory
+block might fail:
+
+- Memory blocks with memory holes; this applies to memory blocks present during
+ boot and can apply to memory blocks hotplugged via the XEN balloon and the
+ Hyper-V balloon.
+
+- Mixed NUMA nodes and mixed zones within a single memory block prevent memory
+ offlining; this applies to memory blocks present during boot only.
+
+- Special memory blocks prevented by the system from getting offlined. Examples
+ include any memory available during boot on arm64 or memory blocks spanning
+ the crashkernel area on s390x; this usually applies to memory blocks present
+ during boot only.
+
+- Memory blocks overlapping with CMA areas cannot be offlined; this applies to
+ memory blocks present during boot only.
+
+- Concurrent activity that operates on the same physical memory area, such as
+ allocating gigantic pages, can result in temporary offlining failures.
+
+- Out of memory when dissolving huge pages, especially when HugeTLB Vmemmap
+ Optimization (HVO) is enabled.
+
+ Offlining code may be able to migrate huge page contents, but may not be able
+ to dissolve the source huge page because it fails allocating (unmovable) pages
+ for the vmemmap, because the system might not have free memory in the kernel
+ zones left.
+
+ Users that depend on memory offlining to succeed for movable zones should
+ carefully consider whether the memory savings gained from this feature are
+ worth the risk of possibly not being able to offline memory in certain
+ situations.
+
+Further, when running into out of memory situations while migrating pages, or
+when still encountering permanently unmovable pages within ZONE_MOVABLE
+(-> BUG), memory offlining will keep retrying until it eventually succeeds.
+
+When offlining is triggered from user space, the offlining context can be
+terminated by sending a fatal signal. Timeout-based offlining can easily be
+implemented via::
-If offline succeeds, the state of the memory block is changed to be "offline".
-If it fails, some error core (like -EBUSY) will be returned by the kernel.
-Even if a memory block does not belong to ZONE_MOVABLE, you can try to offline
-it. If it doesn't contain 'unmovable' memory, you'll get success.
-
-A memory block under ZONE_MOVABLE is considered to be able to be offlined
-easily. But under some busy state, it may return -EBUSY. Even if a memory
-block cannot be offlined due to -EBUSY, you can retry offlining it and may be
-able to offline it (or not). (For example, a page is referred to by some kernel
-internal call and released soon.)
-
-Consideration:
- Memory hotplug's design direction is to make the possibility of memory
- offlining higher and to guarantee unplugging memory under any situation. But
- it needs more work. Returning -EBUSY under some situation may be good because
- the user can decide to retry more or not by himself. Currently, memory
- offlining code does some amount of retry with 120 seconds timeout.
-
-Physical memory remove
-======================
-
-Need more implementation yet....
- - Notification completion of remove works by OS to firmware.
- - Guard from remove if not yet.
-
-
-Locking Internals
-=================
-
-When adding/removing memory that uses memory block devices (i.e. ordinary RAM),
-the device_hotplug_lock should be held to:
-
-- synchronize against online/offline requests (e.g. via sysfs). This way, memory
- block devices can only be accessed (.online/.state attributes) by user
- space once memory has been fully added. And when removing memory, we
- know nobody is in critical sections.
-- synchronize against CPU hotplug and similar (e.g. relevant for ACPI and PPC)
-
-Especially, there is a possible lock inversion that is avoided using
-device_hotplug_lock when adding memory and user space tries to online that
-memory faster than expected:
-
-- device_online() will first take the device_lock(), followed by
- mem_hotplug_lock
-- add_memory_resource() will first take the mem_hotplug_lock, followed by
- the device_lock() (while creating the devices, during bus_add_device()).
-
-As the device is visible to user space before taking the device_lock(), this
-can result in a lock inversion.
-
-onlining/offlining of memory should be done via device_online()/
-device_offline() - to make sure it is properly synchronized to actions
-via sysfs. Holding device_hotplug_lock is advised (to e.g. protect online_type)
-
-When adding/removing/onlining/offlining memory or adding/removing
-heterogeneous/device memory, we should always hold the mem_hotplug_lock in
-write mode to serialise memory hotplug (e.g. access to global/zone
-variables).
-
-In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) in read
-mode allows for a quite efficient get_online_mems/put_online_mems
-implementation, so code accessing memory can protect from that memory
-vanishing.
-
-
-Future Work
-===========
-
- - allowing memory hot-add to ZONE_MOVABLE. maybe we need some switch like
- sysctl or new control file.
- - showing memory block and physical device relationship.
- - test and make it better memory offlining.
- - support HugeTLB page migration and offlining.
- - memmap removing at memory offline.
- - physical remove memory.
+ % timeout $TIMEOUT offline_block | failure_handling
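
The timeout pattern above can also be sketched in Python. This is only an illustration, not part of the patch: ``run_with_timeout`` and ``offline_command`` are hypothetical helper names, and the sysfs path is an assumption taken from the memory block interface described elsewhere in this guide. ``subprocess.run`` kills the child with SIGKILL when the timeout expires, which is a fatal signal of the kind mentioned above.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and report "ok", "failed" or "timeout".

    On timeout, subprocess.run() kills the child (SIGKILL) before
    raising TimeoutExpired, so the offlining context is terminated.
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


def offline_command(block_id):
    """Build a shell command that requests offlining of one memory block.

    Assumption: memory blocks are exposed as
    /sys/devices/system/memory/memory<N>/state and accept "offline".
    """
    path = f"/sys/devices/system/memory/memory{block_id}/state"
    return ["sh", "-c", f"echo offline > {path}"]
```

A caller with root privileges might run ``run_with_timeout(offline_command(42), 120)`` and treat anything other than ``"ok"`` as the failure-handling path; the block number and the 120 second limit are arbitrary examples.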
diff --git a/Documentation/admin-guide/mm/multigen_lru.rst b/Documentation/admin-guide/mm/multigen_lru.rst
new file mode 100644
index 000000000000..33e068830497
--- /dev/null
+++ b/Documentation/admin-guide/mm/multigen_lru.rst
@@ -0,0 +1,162 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+Multi-Gen LRU
+=============
+The multi-gen LRU is an alternative LRU implementation that optimizes
+page reclaim and improves performance under memory pressure. Page
+reclaim decides the kernel's caching policy and ability to overcommit
+memory. It directly impacts the kswapd CPU usage and RAM efficiency.
+
+Quick start
+===========
+Build the kernel with the following configurations.
+
+* ``CONFIG_LRU_GEN=y``
+* ``CONFIG_LRU_GEN_ENABLED=y``
+
+All set!
+
+Runtime options
+===============
+``/sys/kernel/mm/lru_gen/`` contains stable ABIs described in the
+following subsections.
+
+Kill switch
+-----------
+``enabled`` accepts different values to enable or disable the
+following components. Its default value depends on
+``CONFIG_LRU_GEN_ENABLED``. All the components should be enabled
+unless some of them have unforeseen side effects. Writing to
+``enabled`` has no effect when a component is not supported by the
+hardware, and valid values will be accepted even when the main switch
+is off.
+
+====== ===============================================================
+Values Components
+====== ===============================================================
+0x0001 The main switch for the multi-gen LRU.
+0x0002 Clearing the accessed bit in leaf page table entries in large
+ batches, when MMU sets it (e.g., on x86). This behavior can
+ theoretically worsen lock contention (mmap_lock). If it is
+ disabled, the multi-gen LRU will suffer a minor performance
+ degradation for workloads that contiguously map hot pages,
+ whose accessed bits can be otherwise cleared by fewer larger
+ batches.
+0x0004 Clearing the accessed bit in non-leaf page table entries as
+ well, when MMU sets it (e.g., on x86). This behavior was not
+ verified on x86 varieties other than Intel and AMD. If it is
+ disabled, the multi-gen LRU will suffer a negligible
+ performance degradation.
+[yYnN] Apply to all the components above.
+====== ===============================================================
+
+E.g.,
+::
+
+ echo y >/sys/kernel/mm/lru_gen/enabled
+ cat /sys/kernel/mm/lru_gen/enabled
+ 0x0007
+ echo 5 >/sys/kernel/mm/lru_gen/enabled
+ cat /sys/kernel/mm/lru_gen/enabled
+ 0x0005
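
As a rough illustration of how the value read back from ``enabled`` maps onto the table above, the following sketch decodes the bitmask. The function name and the short component labels are hypothetical; only the bit values come from the table.

```python
# Bit values from the "Kill switch" table; the labels are paraphrased.
LRU_GEN_COMPONENTS = {
    0x0001: "main switch",
    0x0002: "batched clearing of the accessed bit in leaf page table entries",
    0x0004: "clearing of the accessed bit in non-leaf page table entries",
}


def decode_enabled(text):
    """Turn a string such as "0x0005" into the list of enabled components."""
    value = int(text.strip(), 16)
    return [name for bit, name in LRU_GEN_COMPONENTS.items() if value & bit]
```

For the second example above, ``decode_enabled("0x0005")`` lists the main switch and the non-leaf component, but not the leaf one.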
+
+Thrashing prevention
+--------------------
+Personal computers are more sensitive to thrashing because it can
+cause janks (lags when rendering UI) and negatively impact user
+experience. The multi-gen LRU offers thrashing prevention to the
+majority of laptop and desktop users who do not have ``oomd``.
+
+Users can write ``N`` to ``min_ttl_ms`` to prevent the working set of
+``N`` milliseconds from getting evicted. The OOM killer is triggered
+if this working set cannot be kept in memory. In other words, this
+option works as an adjustable pressure relief valve, and when open, it
+terminates applications that are hopefully not being used.
+
+Based on the average human detectable lag (~100ms), ``N=1000`` usually
+eliminates intolerable janks due to thrashing. Larger values like
+``N=3000`` make janks less noticeable at the risk of premature OOM
+kills.
+
+The default value ``0`` means disabled.
+
+Experimental features
+=====================
+``/sys/kernel/debug/lru_gen`` accepts commands described in the
+following subsections. Multiple command lines are supported, as is
+concatenation with delimiters ``,`` and ``;``.
+
+``/sys/kernel/debug/lru_gen_full`` provides additional stats for
+debugging. ``CONFIG_LRU_GEN_STATS=y`` keeps historical stats from
+evicted generations in this file.
+
+Working set estimation
+----------------------
+Working set estimation measures how much memory an application needs
+in a given time interval, and it is usually done with little impact on
+the performance of the application. E.g., data centers want to
+optimize job scheduling (bin packing) to improve memory utilizations.
+When a new job comes in, the job scheduler needs to find out whether
+each server it manages can allocate a certain amount of memory for
+this new job before it can pick a candidate. To do so, the job
+scheduler needs to estimate the working sets of the existing jobs.
+
+When it is read, ``lru_gen`` returns a histogram of numbers of pages
+accessed over different time intervals for each memcg and node.
+``MAX_NR_GENS`` decides the number of bins for each histogram. The
+histograms are noncumulative.
+::
+
+ memcg memcg_id memcg_path
+ node node_id
+ min_gen_nr age_in_ms nr_anon_pages nr_file_pages
+ ...
+ max_gen_nr age_in_ms nr_anon_pages nr_file_pages
+
+Each bin contains an estimated number of pages that have been accessed
+within ``age_in_ms``. E.g., ``min_gen_nr`` contains the coldest pages
+and ``max_gen_nr`` contains the hottest pages, since ``age_in_ms`` of
+the former is the largest and that of the latter is the smallest.
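
A minimal sketch of how a monitoring tool might parse this output follows. It assumes the layout shown above (a ``memcg`` line, a ``node`` line, then one line of four integers per generation). The sample text is made up for illustration and is not real kernel output.

```python
def parse_lru_gen(text):
    """Return {(memcg_id, node_id): [(gen_nr, age_in_ms, nr_anon, nr_file), ...]}."""
    histograms = {}
    memcg_id = node_id = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "memcg":
            memcg_id = int(fields[1])
        elif fields[0] == "node":
            node_id = int(fields[1])
            histograms[(memcg_id, node_id)] = []
        else:
            # One generation (histogram bin) per line.
            gen_nr, age_in_ms, nr_anon, nr_file = (int(f) for f in fields[:4])
            histograms[(memcg_id, node_id)].append(
                (gen_nr, age_in_ms, nr_anon, nr_file)
            )
    return histograms


# Hypothetical sample: generation 0 is the oldest (coldest), 3 the youngest.
SAMPLE = """\
memcg 1 /
 node 0
  0 22000 100 2000
  1 11000 300 400
  2 4000 50 60
  3 900 7 8
"""
```

With this sample, the first tuple for ``(1, 0)`` describes ``min_gen_nr`` and the last describes ``max_gen_nr``.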
+
+Users can write the following command to ``lru_gen`` to create a new
+generation ``max_gen_nr+1``:
+
+ ``+ memcg_id node_id max_gen_nr [can_swap [force_scan]]``
+
+``can_swap`` defaults to the swap setting and, if it is set to ``1``,
+it forces the scan of anon pages when swap is off, and vice versa.
+``force_scan`` defaults to ``1`` and, if it is set to ``0``, it
+employs heuristics to reduce the overhead, which is likely to reduce
+the coverage as well.
+
+A typical use case is that a job scheduler runs this command at a
+certain time interval to create new generations, and it ranks the
+servers it manages based on the sizes of their cold pages defined by
+this time interval.
+
+Proactive reclaim
+-----------------
+Proactive reclaim induces page reclaim when there is no memory
+pressure. It usually targets cold pages only. E.g., when a new job
+comes in, the job scheduler wants to proactively reclaim cold pages on
+the server it selected, to improve the chance of successfully landing
+this new job.
+
+Users can write the following command to ``lru_gen`` to evict
+generations less than or equal to ``min_gen_nr``.
+
+ ``- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]``
+
+``min_gen_nr`` should be less than ``max_gen_nr-1``, since
+``max_gen_nr`` and ``max_gen_nr-1`` are not fully aged (equivalent to
+the active list) and therefore cannot be evicted. ``swappiness``
+overrides the default value in ``/proc/sys/vm/swappiness``.
+``nr_to_reclaim`` limits the number of pages to evict.
+
+A typical use case is that a job scheduler runs this command before it
+tries to land a new job on a server. If it fails to materialize enough
+cold pages because of the overestimation, it retries on the next
+server according to the ranking result obtained from the working set
+estimation step. This less forceful approach limits the impacts on the
+existing jobs.
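
The aging and eviction command lines described above can be assembled with a small helper. This is a sketch under stated assumptions: the function names are hypothetical, and the code only builds the strings. Writing them to ``/sys/kernel/debug/lru_gen`` is left to the caller, who needs debugfs mounted and sufficient privileges.

```python
def _command(op, required, optional):
    """Join an operator, required fields and leading optional fields.

    Optional fields are positional, so a later one may only be given
    when every earlier one is given as well.
    """
    seen_missing = False
    given = []
    for value in optional:
        if value is None:
            seen_missing = True
        elif seen_missing:
            raise ValueError("optional arguments must be given in order")
        else:
            given.append(value)
    return " ".join(str(p) for p in [op, *required, *given])


def aging_command(memcg_id, node_id, max_gen_nr, can_swap=None, force_scan=None):
    """Build ``+ memcg_id node_id max_gen_nr [can_swap [force_scan]]``."""
    return _command("+", (memcg_id, node_id, max_gen_nr), (can_swap, force_scan))


def eviction_command(memcg_id, node_id, min_gen_nr, swappiness=None,
                     nr_to_reclaim=None):
    """Build ``- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]``."""
    return _command("-", (memcg_id, node_id, min_gen_nr),
                    (swappiness, nr_to_reclaim))
```

For example, ``aging_command(1, 0, 3)`` yields ``+ 1 0 3`` and ``eviction_command(1, 0, 0, 60, 1024)`` yields ``- 1 0 0 60 1024``; the IDs and generation numbers here are placeholders that a real caller would take from the ``lru_gen`` output.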
diff --git a/Documentation/admin-guide/mm/nommu-mmap.rst b/Documentation/admin-guide/mm/nommu-mmap.rst
new file mode 100644
index 000000000000..530fed08de2c
--- /dev/null
+++ b/Documentation/admin-guide/mm/nommu-mmap.rst
@@ -0,0 +1,283 @@
+=============================
+No-MMU memory mapping support
+=============================
+
+The kernel has limited support for memory mapping under no-MMU conditions, such
+as are used in uClinux environments. From the userspace point of view, memory
+mapping is made use of in conjunction with the mmap() system call, the shmat()
+call and the execve() system call. From the kernel's point of view, execve()
+mapping is actually performed by the binfmt drivers, which call back into the
+mmap() routines to do the actual work.
+
+Memory mapping behaviour also involves the way fork(), vfork(), clone() and
+ptrace() work. Under uClinux there is no fork(), and clone() must be supplied
+the CLONE_VM flag.
+
+The behaviour is similar between the MMU and no-MMU cases, but not identical;
+and it's also much more restricted in the latter case:
+
+ (#) Anonymous mapping, MAP_PRIVATE
+
+ In the MMU case: VM regions backed by arbitrary pages; copy-on-write
+ across fork.
+
+ In the no-MMU case: VM regions backed by arbitrary contiguous runs of
+ pages.
+
+ (#) Anonymous mapping, MAP_SHARED
+
+ These behave very much like private mappings, except that they're
+ shared across fork() or clone() without CLONE_VM in the MMU case. Since
+ the no-MMU case doesn't support these, behaviour is identical to
+ MAP_PRIVATE there.
+
+ (#) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, !PROT_WRITE
+
+ In the MMU case: VM regions backed by pages read from file; changes to
+ the underlying file are reflected in the mapping; copied across fork.
+
+ In the no-MMU case:
+
+ - If one exists, the kernel will re-use an existing mapping to the
+ same segment of the same file if that has compatible permissions,
+ even if this was created by another process.
+
+ - If possible, the file mapping will be directly on the backing device
+ if the backing device has the NOMMU_MAP_DIRECT capability and
+ appropriate mapping protection capabilities. Ramfs, romfs, cramfs
+ and mtd might all permit this.
+
+ - If the backing device can't or won't permit direct sharing,
+ but does have the NOMMU_MAP_COPY capability, then a copy of the
+ appropriate bit of the file will be read into a contiguous bit of
+ memory and any extraneous space beyond the EOF will be cleared
+
+ - Writes to the file do not affect the mapping; writes to the mapping
+ are visible in other processes (no MMU protection), but should not
+ happen.
+
+ (#) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, PROT_WRITE
+
+ In the MMU case: like the non-PROT_WRITE case, except that the pages in
+ question get copied before the write actually happens. From that point
+ on writes to the file underneath that page no longer get reflected into
+ the mapping's backing pages. The page is then backed by swap instead.
+
+ In the no-MMU case: works much like the non-PROT_WRITE case, except
+ that a copy is always taken and never shared.
+
+ (#) Regular file / blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
+
+ In the MMU case: VM regions backed by pages read from file; changes to
+ pages written back to file; writes to file reflected into pages backing
+ mapping; shared across fork.
+
+ In the no-MMU case: not supported.
+
+ (#) Memory backed regular file, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
+
+ In the MMU case: As for ordinary regular files.
+
+ In the no-MMU case: The filesystem providing the memory-backed file
+ (such as ramfs or tmpfs) may choose to honour an open, truncate, mmap
+ sequence by providing a contiguous sequence of pages to map. In that
+ case, a shared-writable memory mapping will be possible. It will work
+ as for the MMU case. If the filesystem does not provide any such
+ support, then the mapping request will be denied.
+
+ (#) Memory backed blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
+
+ In the MMU case: As for ordinary regular files.
+
+ In the no-MMU case: As for memory backed regular files, but the
+ blockdev must be able to provide a contiguous run of pages without
+ truncate being called. The ramdisk driver could do this if it allocated
+ all its memory as a contiguous array upfront.
+
+ (#) Memory backed chardev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE
+
+ In the MMU case: As for ordinary regular files.
+
+ In the no-MMU case: The character device driver may choose to honour
+ the mmap() by providing direct access to the underlying device if it
+ provides memory or quasi-memory that can be accessed directly. Examples
+ of such are frame buffers and flash devices. If the driver does not
+ provide any such support, then the mapping request will be denied.
+
+
+Further notes on no-MMU MMAP
+============================
+
+ (#) A request for a private mapping of a file may return a buffer that is not
+ page-aligned. This is because XIP may take place, and the data may not be
+ page-aligned in the backing store.
+
+ (#) A request for an anonymous mapping will always be page aligned. If
+ possible the size of the request should be a power of two otherwise some
+ of the space may be wasted as the kernel must allocate a power-of-2
+ granule but will only discard the excess if appropriately configured as
+ this has an effect on fragmentation.
+
+ (#) The memory allocated by a request for an anonymous mapping will normally
+ be cleared by the kernel before being returned in accordance with the
+ Linux man pages (ver 2.22 or later).
+
+ In the MMU case this can be achieved with reasonable performance as
+ regions are backed by virtual pages, with the contents only being mapped
+ to cleared physical pages when a write happens on that specific page
+ (prior to which, the pages are effectively mapped to the global zero page
+ from which reads can take place). This spreads out the time it takes to
+ initialize the contents of a page - depending on the write-usage of the
+ mapping.
+
+ In the no-MMU case, however, anonymous mappings are backed by physical
+ pages, and the entire map is cleared at allocation time. This can cause
+ significant delays during a userspace malloc() as the C library does an
+ anonymous mapping and the kernel then does a memset for the entire map.
+
+ However, for memory that isn't required to be precleared - such as that
+ returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
+ indicate to the kernel that it shouldn't bother clearing the memory before
+ returning it. Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
+ to permit this, otherwise the flag will be ignored.
+
+ uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
+ to allocate the brk and stack region.
+
+ (#) A list of all the private copy and anonymous mappings on the system is
+ visible through /proc/maps in no-MMU mode.
+
+ (#) A list of all the mappings in use by a process is visible through
+ /proc/<pid>/maps in no-MMU mode.
+
+ (#) Supplying MAP_FIXED or requesting a particular mapping address will
+ result in an error.
+
+ (#) Files mapped privately usually have to have a read method provided by the
+ driver or filesystem so that the contents can be read into the memory
+ allocated if mmap() chooses not to map the backing device directly. An
+ error will result if they don't. This is most likely to be encountered
+ with character device files, pipes, fifos and sockets.
+
+
+Interprocess shared memory
+==========================
+
+Both SYSV IPC SHM shared memory and POSIX shared memory are supported in NOMMU
+mode. The former through the usual mechanism, the latter through files created
+on ramfs or tmpfs mounts.
+
+
+Futexes
+=======
+
+Futexes are supported in NOMMU mode if the arch supports them. An error will
+be given if an address passed to the futex system call lies outside the
+mappings made by a process or if the mapping in which the address lies does not
+support futexes (such as an I/O chardev mapping).
+
+
+No-MMU mremap
+=============
+
+The mremap() function is partially supported. It may change the size of a
+mapping, and may move it [#]_ if MREMAP_MAYMOVE is specified and if the new size
+of the mapping exceeds the size of the slab object currently occupied by the
+memory to which the mapping refers, or if a smaller slab object could be used.
+
+MREMAP_FIXED is not supported, though it is ignored if there's no change of
+address and the object does not need to be moved.
+
+Shared mappings may not be moved. Shareable mappings may not be moved either,
+even if they are not currently shared.
+
+The mremap() function must be given an exact match for base address and size of
+a previously mapped object. It may not be used to create holes in existing
+mappings, move parts of existing mappings or resize parts of mappings. It must
+act on a complete mapping.
+
+.. [#] Not currently supported.
+
+
+Providing shareable character device support
+============================================
+
+To provide shareable character device support, a driver must provide a
+file->f_op->get_unmapped_area() operation. The mmap() routines will call this
+to get a proposed address for the mapping. This may return an error if it
+doesn't wish to honour the mapping because it's too long, at a weird offset,
+under some unsupported combination of flags or whatever.
+
+The driver should also provide backing device information with capabilities set
+to indicate the permitted types of mapping on such devices. The default is
+assumed to be readable and writable, not executable, and only shareable
+directly (can't be copied).
+
+The file->f_op->mmap() operation will be called to actually inaugurate the
+mapping. It can be rejected at that point. Returning the ENOSYS error will
+cause the mapping to be copied instead if NOMMU_MAP_COPY is specified.
+
+The vm_ops->close() routine will be invoked when the last mapping on a chardev
+is removed. An existing mapping will be shared, partially or not, if possible
+without notifying the driver.
+
+It is permitted also for the file->f_op->get_unmapped_area() operation to
+return -ENOSYS. This will be taken to mean that this operation just doesn't
+want to handle it, despite the fact it's got an operation. For instance, it
+might try directing the call to a secondary driver which turns out not to
+implement it. Such is the case for the framebuffer driver which attempts to
+direct the call to the device-specific driver. Under such circumstances, the
+mapping request will be rejected if NOMMU_MAP_COPY is not specified, and a
+copy mapped otherwise.
+
+.. important::
+
+ Some types of device may present a different appearance to anyone
+ looking at them in certain modes. Flash chips can be like this; for
+ instance if they're in programming or erase mode, you might see the
+ status reflected in the mapping, instead of the data.
+
+ In such a case, care must be taken lest userspace see a shared or a
+ private mapping showing such information when the driver is busy
+ controlling the device. Remember especially: private executable
+ mappings may still be mapped directly off the device under some
+ circumstances!
+
+
+Providing shareable memory-backed file support
+==============================================
+
+Provision of shared mappings on memory backed files is similar to the provision
+of support for shared mapped character devices. The main difference is that the
+filesystem providing the service will probably allocate a contiguous collection
+of pages and permit mappings to be made on that.
+
+It is recommended that a truncate operation applied to such a file that
+increases the file size, if that file is empty, be taken as a request to gather
+enough pages to honour a mapping. This is required to support POSIX shared
+memory.
+
+Memory backed devices are indicated by the mapping's backing device info having
+the memory_backed flag set.
+
+
+Providing shareable block device support
+========================================
+
+Provision of shared mappings on block device files is exactly the same as for
+character devices. If there isn't a real device underneath, then the driver
+should allocate sufficient contiguous memory to honour any supported mapping.
+
+
+Adjusting page trimming behaviour
+=================================
+
+NOMMU mmap automatically rounds up to the nearest power-of-2 number of pages
+when performing an allocation. This can have adverse effects on memory
+fragmentation, and as such, is left configurable. The default behaviour is to
+aggressively trim allocations and discard any excess pages back in to the page
+allocator. In order to retain finer-grained control over fragmentation, this
+behaviour can either be disabled completely, or bumped up to a higher page
+watermark where trimming begins.
+
+Page trimming behaviour is configurable via the sysctl ``vm.nr_trim_pages``.
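
The interaction between power-of-2 rounding and trimming can be modelled as follows. This is a simplified sketch, not the kernel's code: it assumes that ``vm.nr_trim_pages`` set to ``0`` disables trimming and that any other value is the minimum number of excess pages at which trimming begins (with ``1`` as the default). The 4096-byte page size is also an assumption.

```python
def nommu_alloc_pages(request_bytes, page_size=4096, nr_trim_pages=1):
    """Model a NOMMU anonymous allocation.

    Returns (needed, allocated, kept) page counts: the pages the request
    needs, the power-of-2 granule allocated, and what remains after
    trimming under the assumed nr_trim_pages semantics.
    """
    needed = -(-request_bytes // page_size)  # ceiling division
    allocated = 1
    while allocated < needed:
        allocated *= 2
    excess = allocated - needed
    if nr_trim_pages and excess >= nr_trim_pages:
        kept = needed      # excess pages go back to the page allocator
    else:
        kept = allocated   # excess stays attached to the mapping
    return needed, allocated, kept
```

A five-page request is rounded up to eight pages. With the default watermark the three excess pages are returned, whereas with trimming disabled, or with a watermark of four, all eight pages are kept.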
diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
index 8463f5538fda..5a6afecbb0d0 100644
--- a/Documentation/admin-guide/mm/numa_memory_policy.rst
+++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
@@ -245,6 +245,13 @@ MPOL_INTERLEAVED
address range or file. During system boot up, the temporary
interleaved system default policy works in this mode.
+MPOL_PREFERRED_MANY
+ This mode specifies that the allocation should be preferably
+ satisfied from the nodemask specified in the policy. If there is
+ memory pressure on all nodes in the nodemask, the allocation
+ can fall back to all existing NUMA nodes. This is effectively
+ MPOL_PREFERRED allowed for a mask rather than a single node.
+
NUMA memory policy supports the following optional mode flags:
MPOL_F_STATIC_NODES
@@ -253,10 +260,10 @@ MPOL_F_STATIC_NODES
nodes changes after the memory policy has been defined.
Without this flag, any time a mempolicy is rebound because of a
- change in the set of allowed nodes, the node (Preferred) or
- nodemask (Bind, Interleave) is remapped to the new set of
- allowed nodes. This may result in nodes being used that were
- previously undesired.
+ change in the set of allowed nodes, the preferred nodemask (Preferred
+ Many), preferred node (Preferred) or nodemask (Bind, Interleave) is
+ remapped to the new set of allowed nodes. This may result in nodes
+ being used that were previously undesired.
With this flag, if the user-specified nodes overlap with the
nodes allowed by the task's cpuset, then the memory policy is
@@ -364,19 +371,19 @@ follows:
2) for querying the policy, we do not need to take an extra reference on the
target task's task policy nor vma policies because we always acquire the
- task's mm's mmap_sem for read during the query. The set_mempolicy() and
- mbind() APIs [see below] always acquire the mmap_sem for write when
+ task's mm's mmap_lock for read during the query. The set_mempolicy() and
+ mbind() APIs [see below] always acquire the mmap_lock for write when
installing or replacing task or vma policies. Thus, there is no possibility
of a task or thread freeing a policy while another task or thread is
querying it.
3) Page allocation usage of task or vma policy occurs in the fault path where
- we hold them mmap_sem for read. Again, because replacing the task or vma
- policy requires that the mmap_sem be held for write, the policy can't be
+ we hold the mmap_lock for read. Again, because replacing the task or vma
+ policy requires that the mmap_lock be held for write, the policy can't be
freed out from under us while we're using it for page allocation.
4) Shared policies require special consideration. One task can replace a
- shared memory policy while another task, with a distinct mmap_sem, is
+ shared memory policy while another task, with a distinct mmap_lock, is
querying or allocating a page based on the policy. To resolve this
potential race, the shared policy infrastructure adds an extra reference
to the shared policy during lookup while holding a spin lock on the shared
@@ -401,7 +408,7 @@ follows:
Memory Policy APIs
==================
-Linux supports 3 system calls for controlling memory policy. These APIS
+Linux supports 4 system calls for controlling memory policy. These APIs
always affect only the calling task, the calling task's address space, or
some shared object mapped into the calling task's address space.
@@ -453,6 +460,20 @@ requested via the 'flags' argument.
See the mbind(2) man page for more details.
+Set home node for a Range of Task's Address Space::
+
+ long sys_set_mempolicy_home_node(unsigned long start, unsigned long len,
+ unsigned long home_node,
+ unsigned long flags);
+
+sys_set_mempolicy_home_node sets the home node for a VMA policy present in the
+task's address range. The system call updates the home node only for the existing
+mempolicy range. Other address ranges are ignored. A home node is the NUMA node
+closest to which page allocation will come from. Specifying the home node overrides
+the default allocation policy to allocate memory close to the local node for an
+executing CPU.
+
+
Memory Policy Command Line Interface
====================================
diff --git a/Documentation/admin-guide/mm/numaperf.rst b/Documentation/admin-guide/mm/numaperf.rst
index a80c3c37226e..166697325947 100644
--- a/Documentation/admin-guide/mm/numaperf.rst
+++ b/Documentation/admin-guide/mm/numaperf.rst
@@ -56,6 +56,11 @@ nodes' access characteristics share the same performance relative to other
linked initiator nodes. Each target within an initiator's access class,
though, do not necessarily perform the same as each other.
+The access class "1" is used to allow differentiation between initiators
+that are CPUs and hence suitable for generic task scheduling, and
+IO initiators such as GPUs and NICs. Unlike access class 0, only
+nodes containing CPUs are considered.
+
================
NUMA Performance
================
@@ -69,7 +74,7 @@ memory node's access class 0 initiators as follows::
/sys/devices/system/node/nodeY/access0/initiators/
These attributes apply only when accessed from nodes that have the
-are linked under the this access's inititiators.
+are linked under this access's initiators.
The performance characteristics the kernel provides for the local initiators
are exported are as follows::
@@ -88,6 +93,9 @@ The latency attributes are provided in nanoseconds.
The values reported here correspond to the rated latency and bandwidth
for the platform.
+Access class 1 takes the same form but only includes values for CPU to
+memory activity.
+
==========
NUMA Cache
==========
@@ -129,7 +137,7 @@ will create the following directory::
/sys/devices/system/node/nodeX/memory_side_cache/
-If that directory is not present, the system either does not not provide
+If that directory is not present, the system either does not provide
a memory-side cache, or that information is not accessible to the kernel.
The attributes for each level of cache is provided under its cache
@@ -143,7 +151,7 @@ Each cache level's directory provides its attributes. For example, the
following shows a single cache level and the attributes available for
software to query::
- # tree sys/devices/system/node/node0/memory_side_cache/
+ # tree /sys/devices/system/node/node0/memory_side_cache/
/sys/devices/system/node/node0/memory_side_cache/
|-- index1
| |-- indexing
diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
index 340a5aee9b80..6e2e416af783 100644
--- a/Documentation/admin-guide/mm/pagemap.rst
+++ b/Documentation/admin-guide/mm/pagemap.rst
@@ -21,7 +21,9 @@ There are four components to pagemap:
* Bit 55 pte is soft-dirty (see
:ref:`Documentation/admin-guide/mm/soft-dirty.rst <soft_dirty>`)
* Bit 56 page exclusively mapped (since 4.2)
- * Bits 57-60 zero
+ * Bit 57 pte is uffd-wp write-protected (since 5.13) (see
+ :ref:`Documentation/admin-guide/mm/userfaultfd.rst <userfaultfd>`)
+ * Bits 58-60 zero
* Bit 61 page is file-page or shared-anon (since 3.5)
* Bit 62 page swapped
* Bit 63 page present
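
As an illustration of the bit layout listed above, the following sketch decodes one 64-bit pagemap entry. The function name and the dictionary keys are hypothetical. The sketch also assumes that bits 0-54 hold the page frame number when the page is present, as described in the part of this file not shown in the hunk.

```python
def decode_pagemap_entry(entry):
    """Decode a single 64-bit /proc/<pid>/pagemap entry into named fields."""
    present = bool((entry >> 63) & 1)
    return {
        "present": present,                               # bit 63
        "swapped": bool((entry >> 62) & 1),               # bit 62
        "file_or_shared_anon": bool((entry >> 61) & 1),   # bit 61
        "uffd_wp": bool((entry >> 57) & 1),               # bit 57 (since 5.13)
        "exclusive": bool((entry >> 56) & 1),             # bit 56
        "soft_dirty": bool((entry >> 55) & 1),            # bit 55
        # Assumption: bits 0-54 are the PFN only when the page is present.
        "pfn": entry & ((1 << 55) - 1) if present else None,
    }
```

A reader would typically obtain ``entry`` by seeking to ``8 * (virtual_address // page_size)`` in the pagemap file and unpacking eight bytes as a native-endian unsigned integer; that step is omitted here.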
@@ -88,13 +90,14 @@ Short descriptions to the page flags
====================================
0 - LOCKED
- page is being locked for exclusive access, e.g. by undergoing read/write IO
+ The page is being locked for exclusive access, e.g. by undergoing read/write
+ IO.
7 - SLAB
- page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
+ The page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator.
When compound page is used, SLUB/SLQB will only set this flag on the head
page; SLOB will not flag it at all.
10 - BUDDY
- a free memory block managed by the buddy system allocator
+ A free memory block managed by the buddy system allocator.
The buddy system organizes free memory in blocks of various orders.
An order N block has 2^N physically contiguous pages, with the BUDDY flag
set for and _only_ for the first page.
@@ -110,65 +113,65 @@ Short descriptions to the page flags
16 - COMPOUND_TAIL
A compound page tail (see description above).
17 - HUGE
- this is an integral part of a HugeTLB page
+ This is an integral part of a HugeTLB page.
19 - HWPOISON
- hardware detected memory corruption on this page: don't touch the data!
+ Hardware detected memory corruption on this page: don't touch the data!
20 - NOPAGE
- no page frame exists at the requested address
+ No page frame exists at the requested address.
21 - KSM
- identical memory pages dynamically shared between one or more processes
+ Identical memory pages dynamically shared between one or more processes.
22 - THP
- contiguous pages which construct transparent hugepages
+ Contiguous pages which construct transparent hugepages.
23 - OFFLINE
- page is logically offline
+ The page is logically offline.
24 - ZERO_PAGE
- zero page for pfn_zero or huge_zero page
+ Zero page for pfn_zero or huge_zero page.
25 - IDLE
- page has not been accessed since it was marked idle (see
+ The page has not been accessed since it was marked idle (see
:ref:`Documentation/admin-guide/mm/idle_page_tracking.rst <idle_page_tracking>`).
Note that this flag may be stale in case the page was accessed via
a PTE. To make sure the flag is up-to-date one has to read
``/sys/kernel/mm/page_idle/bitmap`` first.
26 - PGTABLE
- page is in use as a page table
+ The page is in use as a page table.
IO related page flags
---------------------
1 - ERROR
- IO error occurred
+ IO error occurred.
3 - UPTODATE
- page has up-to-date data
+ The page has up-to-date data.
ie. for file backed page: (in-memory data revision >= on-disk one)
4 - DIRTY
- page has been written to, hence contains new data
+ The page has been written to, hence contains new data.
i.e. for file backed page: (in-memory data revision > on-disk one)
8 - WRITEBACK
- page is being synced to disk
+ The page is being synced to disk.
LRU related page flags
----------------------
5 - LRU
- page is in one of the LRU lists
+ The page is in one of the LRU lists.
6 - ACTIVE
- page is in the active LRU list
+ The page is in the active LRU list.
18 - UNEVICTABLE
- page is in the unevictable (non-)LRU list It is somehow pinned and
+ The page is in the unevictable (non-)LRU list. It is somehow pinned and
not a candidate for LRU page reclaims, e.g. ramfs pages,
- shmctl(SHM_LOCK) and mlock() memory segments
+ shmctl(SHM_LOCK) and mlock() memory segments.
2 - REFERENCED
- page has been referenced since last LRU list enqueue/requeue
+ The page has been referenced since last LRU list enqueue/requeue.
9 - RECLAIM
- page will be reclaimed soon after its pageout IO completed
+ The page will be reclaimed soon after its pageout IO completed.
11 - MMAP
- a memory mapped page
+ A memory mapped page.
12 - ANON
- a memory mapped page that is not part of a file
+ A memory mapped page that is not part of a file.
13 - SWAPCACHE
- page is mapped to swap space, i.e. has an associated swap entry
+ The page is mapped to swap space, i.e. has an associated swap entry.
14 - SWAPBACKED
- page is backed by swap/RAM
+ The page is backed by swap/RAM.
The page-types tool in the tools/vm directory can be used to query the
above flags.
@@ -194,6 +197,28 @@ you can go through every map in the process, find the PFNs, look those up
in kpagecount, and tally up the number of pages that are only referenced
once.
+Exceptions for Shared Memory
+============================
+
+Page table entries for shared pages are cleared when the pages are zapped or
+swapped out. This makes swapped out pages indistinguishable from never-allocated
+ones.
+
+In kernel space, the swap location can still be retrieved from the page cache.
+However, values stored only on the normal PTE get lost irretrievably when the
+page is swapped out (e.g. SOFT_DIRTY).
+
+In user space, whether the page is present, swapped or none can be deduced with
+the help of lseek and/or mincore system calls.
+
+lseek() can differentiate between accessed pages (present or swapped out) and
+holes (none/non-allocated) by specifying the SEEK_DATA flag on the file where
+the pages are backed. For anonymous shared pages, the file can be found in
+``/proc/pid/map_files/``.
+
+mincore() can differentiate between pages in memory (present, including swap
+cache) and out of memory (swapped out or none/non-allocated).
+
Other notes
===========
diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
new file mode 100644
index 000000000000..3887f0b294fe
--- /dev/null
+++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
@@ -0,0 +1,135 @@
+.. _shrinker_debugfs:
+
+==========================
+Shrinker Debugfs Interface
+==========================
+
+The shrinker debugfs interface provides visibility into the kernel memory
+shrinkers subsystem and allows getting information about individual shrinkers
+and interacting with them.
+
+For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
+is created. The directory's name is composed of the shrinker's name and a
+unique id: e.g. *rcu-kfree-0* or *sb-xfs:vda1-36*.
+
+Each shrinker directory contains **count** and **scan** files, which allow
+triggering the *count_objects()* and *scan_objects()* callbacks for each memcg
+and numa node (if applicable).
+
+Usage:
+------
+
+1. *List registered shrinkers*
+
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ ls
+ dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42
+ mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43
+ mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44
+ rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49
+ sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13
+ sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36
+ sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19
+ sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10
+ sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9
+ sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37
+ sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38
+ sb-dax-11 sb-proc-45 sb-tmpfs-35
+ sb-debugfs-7 sb-proc-46 sb-tmpfs-40
+
+2. *Get information about a specific shrinker*
+
+ ::
+
+ $ cd sb-btrfs\:vda2-24/
+ $ ls
+ count scan
+
+3. *Count objects*
+
+ Each line in the output has the following format::
+
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ ...
+
+ If a cgroup has no objects on any numa node, its line is omitted. If there
+ are no objects at all, the output might be empty.
+
+ If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is printed
+ as cgroup inode id. If the shrinker is not numa-aware, 0's are printed
+ for all nodes except the first one.
+ ::
+
+ $ cat count
+ 1 224 2
+ 21 98 0
+ 55 818 10
+ 2367 2 0
+ 2401 30 0
+ 225 13 0
+ 599 35 0
+ 939 124 0
+ 1041 3 0
+ 1075 1 0
+ 1109 1 0
+ 1279 60 0
+ 1313 7 0
+ 1347 39 0
+ 1381 3 0
+ 1449 14 0
+ 1483 63 0
+ 1517 53 0
+ 1551 6 0
+ 1585 1 0
+ 1619 6 0
+ 1653 40 0
+ 1687 11 0
+ 1721 8 0
+ 1755 4 0
+ 1789 52 0
+ 1823 888 0
+ 1857 1 0
+ 1925 2 0
+ 1959 32 0
+ 2027 22 0
+ 2061 9 0
+ 2469 799 0
+ 2537 861 0
+ 2639 1 0
+ 2707 70 0
+ 2775 4 0
+ 2877 84 0
+ 293 1 0
+ 735 8 0
+
+4. *Scan objects*
+
+ The expected input format::
+
+ <cgroup inode id> <numa id> <number of objects to scan>
+
+ For a non-memcg-aware shrinker or on a system with no memory
+ cgroups, **0** should be passed as cgroup id.
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ cd sb-btrfs\:vda2-24/
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 97 0
+ 55 802 5
+ 2367 2 0
+ 225 13 0
+
+ $ echo "55 0 200" > scan
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 96 0
+ 55 752 5
+ 2367 2 0
+ 225 13 0
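The **count** output format described in shrinker_debugfs.rst above is simple enough to parse mechanically. The following is a minimal sketch under that assumption; the helper names ``parse_shrinker_count`` and ``total_per_node`` are hypothetical and not part of any kernel tool.

```python
def parse_shrinker_count(text: str) -> dict:
    """Map cgroup inode id -> list of object counts, one per numa node."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        cgroup_id, *per_node = (int(field) for field in fields)
        result[cgroup_id] = per_node
    return result


def total_per_node(counts: dict) -> list:
    """Sum the object counts of all cgroups for each numa node."""
    totals = []
    for per_node in counts.values():
        for node, nr_objects in enumerate(per_node):
            if node == len(totals):
                totals.append(0)
            totals[node] += nr_objects
    return totals


if __name__ == "__main__":
    sample = "1 224 2\n21 98 0\n55 818 10\n"
    counts = parse_shrinker_count(sample)
    print(counts)
    print(total_per_node(counts))
```

A request for the **scan** file would then be the three numbers ``<cgroup inode id> <numa id> <number of objects to scan>`` written as one line, as in the ``echo "55 0 200" > scan`` example.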
diff --git a/Documentation/admin-guide/mm/swap_numa.rst b/Documentation/admin-guide/mm/swap_numa.rst
new file mode 100644
index 000000000000..e0466f2db8fa
--- /dev/null
+++ b/Documentation/admin-guide/mm/swap_numa.rst
@@ -0,0 +1,80 @@
+.. _swap_numa:
+
+===========================================
+Automatically bind swap device to numa node
+===========================================
+
+If the system has more than one swap device and the swap devices have node
+information, we can make use of this information to decide which swap
+device to use in get_swap_pages() to get better performance.
+
+
+How to use this feature
+=======================
+
+Each swap device has a priority, which decides the order in which devices are
+used. To make use of automatic binding, there is no need to manipulate priority
+settings for swap devices. E.g. on a 2 node machine, assume 2 swap devices, swapA
+and swapB, with swapA attached to node 0 and swapB attached to node 1, are going
+to be swapped on. Simply swap them on by doing::
+
+ # swapon /dev/swapA
+ # swapon /dev/swapB
+
+Then node 0 will use the two swap devices in the order of swapA then swapB and
+node 1 will use the two swap devices in the order of swapB then swapA. Note
+that the order of them being swapped on doesn't matter.
+
+Here is a more complex example on a 4 node machine. Assume 6 swap devices are
+going to be swapped on: swapA and swapB are attached to node 0, swapC is attached
+to node 1, swapD and swapE are attached to node 2 and swapF is attached to node 3.
+The way to swap them on is the same as above::
+
+ # swapon /dev/swapA
+ # swapon /dev/swapB
+ # swapon /dev/swapC
+ # swapon /dev/swapD
+ # swapon /dev/swapE
+ # swapon /dev/swapF
+
+Then node 0 will use them in the order of::
+
+ swapA/swapB -> swapC -> swapD -> swapE -> swapF
+
+swapA and swapB will be used in a round robin mode before any other swap device.
+
+node 1 will use them in the order of::
+
+ swapC -> swapA -> swapB -> swapD -> swapE -> swapF
+
+node 2 will use them in the order of::
+
+ swapD/swapE -> swapA -> swapB -> swapC -> swapF
+
+Similarly, swapD and swapE will be used in a round robin mode before any
+other swap devices.
+
+node 3 will use them in the order of::
+
+ swapF -> swapA -> swapB -> swapC -> swapD -> swapE
+
+
+Implementation details
+======================
+
+The current code uses a priority based list, swap_avail_list, to decide
+which swap device to use and if multiple swap devices share the same
+priority, they are used round robin. This change replaces the single
+global swap_avail_list with a per-numa-node list, i.e. each numa node
+sees its own priority based list of available swap devices. A swap
+device's priority can be promoted on its matching node's swap_avail_list.
+
+Currently, a swap device's priority is set as follows: the user can set a value
+>= 0, or the system will pick one starting from -1 and going downwards. The
+priority value in the swap_avail_list is the negated value of the swap device's
+priority, because plist is sorted from low to high. The new policy doesn't change
+the semantics for priority >= 0. The automatic values that previously started
+from -1 and went downwards now start from -2, and -1 is reserved
+as the promoted value. So if multiple swap devices are attached to the same
+node, they will all be promoted to priority -1 on that node's plist and will
+be used round robin before any other swap devices.
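The per-node ordering described in swap_numa.rst can be modelled in a few lines. This is a sketch of the documented policy only, not the kernel implementation: it assumes no device is given an explicit priority, that automatic priorities are handed out as -2, -3, ... in swapon order, and the function name ``node_swap_order`` is hypothetical.

```python
from itertools import groupby


def node_swap_order(devices, node):
    """Return the order in which `node` uses the swap devices.

    `devices` is a list of (name, node_id) tuples in swapon order.
    The result is a list of groups; devices inside one group share a
    priority and would be used round robin.
    """
    prioritized = []
    for index, (name, dev_node) in enumerate(devices):
        auto_prio = -2 - index  # assumed automatic priority: -2, -3, ...
        prio = -1 if dev_node == node else auto_prio  # promote local devices
        prioritized.append((prio, name))
    # Highest priority first; the sort is stable, so ties keep swapon order.
    prioritized.sort(key=lambda item: -item[0])
    return [
        [name for _, name in group]
        for _, group in groupby(prioritized, key=lambda item: item[0])
    ]


if __name__ == "__main__":
    devs = [("swapA", 0), ("swapB", 0), ("swapC", 1),
            ("swapD", 2), ("swapE", 2), ("swapF", 3)]
    for n in range(4):
        print(n, node_swap_order(devs, n))
```

For the 4 node example in the text, node 2 yields ``[swapD, swapE]`` as the first round robin group, followed by swapA, swapB, swapC and swapF.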
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index bd5714547cee..8ee78ec232eb 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -191,7 +191,14 @@ allocation failure to throttle the next allocation attempt::
/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs
-The khugepaged progress can be seen in the number of pages collapsed::
+The khugepaged progress can be seen in the number of pages collapsed (note
+that this counter may not be an exact count of the number of pages
+collapsed, since "collapsed" could mean multiple things: (1) A PTE mapping
+being replaced by a PMD mapping, or (2) All 4K physical pages replaced by
+one 2M hugepage. Each may happen independently, or together, depending on
+the type of memory and the failures that occur. As such, this value should
+be interpreted roughly as a sign of progress, and counters in /proc/vmstat
+consulted for more accurate accounting)::
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed
@@ -220,6 +227,13 @@ memory. A lower value can prevent THPs from being
collapsed, resulting fewer pages being collapsed into
THPs, and lower memory access performance.
+``max_ptes_shared`` specifies how many pages can be shared across multiple
+processes. Exceeding the number would block the collapse::
+
+ /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared
+
+A higher value may increase memory footprint for some workloads.
+
Boot parameter
==============
@@ -298,8 +312,7 @@ monitor how successfully the system is providing huge pages for use.
thp_fault_alloc
is incremented every time a huge page is successfully
- allocated to handle a page fault. This applies to both the
- first time a page is faulted and for COW faults.
+ allocated to handle a page fault.
thp_collapse_alloc
is incremented by khugepaged when it has found
@@ -310,6 +323,11 @@ thp_fault_fallback
is incremented if a page fault fails to allocate
a huge page and instead falls back to using small pages.
+thp_fault_fallback_charge
+ is incremented if a page fault fails to charge a huge page and
+ instead falls back to using small pages even though the
+ allocation was successful.
+
thp_collapse_alloc_failed
is incremented if khugepaged found a range
of pages that should be collapsed into one huge page but failed
@@ -319,6 +337,15 @@ thp_file_alloc
is incremented every time a file huge page is successfully
allocated.
+thp_file_fallback
+ is incremented if an attempt to allocate a file huge page fails
+ and the kernel instead falls back to using small pages.
+
+thp_file_fallback_charge
+ is incremented if a file huge page cannot be charged and instead
+ falls back to using small pages even though the allocation was
+ successful.
+
thp_file_mapped
is incremented every time a file huge page is mapped into
user address space.
@@ -346,10 +373,9 @@ thp_split_pmd
page table entry.
thp_zero_page_alloc
- is incremented every time a huge zero page is
- successfully allocated. It includes allocations which where
- dropped due race with other allocation. Note, it doesn't count
- every map of the huge zero page, only its allocation.
+ is incremented every time a huge zero page used for thp is
+ successfully allocated. Note, it doesn't count every map of
+ the huge zero page, only its allocation.
thp_zero_page_alloc_failed
is incremented if kernel fails to allocate
@@ -381,23 +407,8 @@ compact_fail
is incremented if the system tries to compact memory
but failed.
-compact_pages_moved
- is incremented each time a page is moved. If
- this value is increasing rapidly, it implies that the system
- is copying a lot of data to satisfy the huge page allocation.
- It is possible that the cost of copying exceeds any savings
- from reduced TLB misses.
-
-compact_pagemigrate_failed
- is incremented when the underlying mechanism
- for moving a page failed.
-
-compact_blocks_moved
- is incremented each time memory compaction examines
- a huge page aligned range of pages.
-
It is possible to establish how long the stalls were using the function
-tracer to record how long was spent in __alloc_pages_nodemask and
+tracer to record how long was spent in __alloc_pages() and
using the mm_page_alloc tracepoint to identify which allocations were
for huge pages.
diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 5048cf661a8a..83f31919ebb3 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -12,114 +12,227 @@ and more generally they allow userland to take control of various
memory page faults, something otherwise only the kernel code could do.
For example userfaults allows a proper and more optimal implementation
-of the PROT_NONE+SIGSEGV trick.
+of the ``PROT_NONE+SIGSEGV`` trick.
Design
======
-Userfaults are delivered and resolved through the userfaultfd syscall.
+Userspace creates a new userfaultfd, initializes it, and registers one or more
+regions of virtual memory with it. Then, any page faults which occur within the
+region(s) result in a message being delivered to the userfaultfd, notifying
+userspace of the fault.
-The userfaultfd (aside from registering and unregistering virtual
+The ``userfaultfd`` (aside from registering and unregistering virtual
memory ranges) provides two primary functionalities:
-1) read/POLLIN protocol to notify a userland thread of the faults
+1) ``read/POLLIN`` protocol to notify a userland thread of the faults
happening
-2) various UFFDIO_* ioctls that can manage the virtual memory regions
- registered in the userfaultfd that allows userland to efficiently
+2) various ``UFFDIO_*`` ioctls that can manage the virtual memory regions
+ registered in the ``userfaultfd`` that allows userland to efficiently
resolve the userfaults it receives via 1) or to manage the virtual
memory in the background
The real advantage of userfaults if compared to regular virtual memory
management of mremap/mprotect is that the userfaults in all their
operations never involve heavyweight structures like vmas (in fact the
-userfaultfd runtime load never takes the mmap_sem for writing).
-
+``userfaultfd`` runtime load never takes the mmap_lock for writing).
Vmas are not suitable for page- (or hugepage) granular fault tracking
when dealing with virtual address spaces that could span
Terabytes. Too many vmas would be needed for that.
-The userfaultfd once opened by invoking the syscall, can also be
+The ``userfaultfd``, once created, can also be
passed using unix domain sockets to a manager process, so the same
manager process could handle the userfaults of a multitude of
different processes without them being aware about what is going on
-(well of course unless they later try to use the userfaultfd
+(well of course unless they later try to use the ``userfaultfd``
themselves on the same region the manager is already tracking, which
-is a corner case that would currently return -EBUSY).
+is a corner case that would currently return ``-EBUSY``).
API
===
-When first opened the userfaultfd must be enabled invoking the
-UFFDIO_API ioctl specifying a uffdio_api.api value set to UFFD_API (or
-a later API version) which will specify the read/POLLIN protocol
-userland intends to speak on the UFFD and the uffdio_api.features
-userland requires. The UFFDIO_API ioctl if successful (i.e. if the
-requested uffdio_api.api is spoken also by the running kernel and the
+Creating a userfaultfd
+----------------------
+
+There are two ways to create a new userfaultfd, each of which provides ways to
+restrict access to this functionality (since historically userfaultfds which
+handle kernel page faults have been a useful tool for exploiting the kernel).
+
+The first way, supported since userfaultfd was introduced, is the
+userfaultfd(2) syscall. Access to this is controlled in several ways:
+
+- Any user can always create a userfaultfd which traps userspace page faults
+ only. Such a userfaultfd can be created using the userfaultfd(2) syscall
+ with the flag UFFD_USER_MODE_ONLY.
+
+- In order to also trap kernel page faults for the address space, either the
+ process needs the CAP_SYS_PTRACE capability, or the system must have
+ vm.unprivileged_userfaultfd set to 1. By default, vm.unprivileged_userfaultfd
+ is set to 0.
+
+The second way, added to the kernel more recently, is by opening
+/dev/userfaultfd and issuing a USERFAULTFD_IOC_NEW ioctl to it. This method
+yields equivalent userfaultfds to the userfaultfd(2) syscall.
+
+Unlike userfaultfd(2), access to /dev/userfaultfd is controlled via normal
+filesystem permissions (user/group/mode), which gives fine grained access to
+userfaultfd specifically, without also granting other unrelated privileges at
+the same time (as e.g. granting CAP_SYS_PTRACE would do). Users who have access
+to /dev/userfaultfd can always create userfaultfds that trap kernel page faults;
+vm.unprivileged_userfaultfd is not considered.
+
+Initializing a userfaultfd
+--------------------------
+
+When first opened the ``userfaultfd`` must be enabled by invoking the
+``UFFDIO_API`` ioctl specifying a ``uffdio_api.api`` value set to ``UFFD_API`` (or
+a later API version) which will specify the ``read/POLLIN`` protocol
+userland intends to speak on the ``UFFD`` and the ``uffdio_api.features``
+userland requires. The ``UFFDIO_API`` ioctl if successful (i.e. if the
+requested ``uffdio_api.api`` is spoken also by the running kernel and the
requested features are going to be enabled) will return into
-uffdio_api.features and uffdio_api.ioctls two 64bit bitmasks of
+``uffdio_api.features`` and ``uffdio_api.ioctls`` two 64bit bitmasks of
respectively all the available features of the read(2) protocol and
the generic ioctl available.
-The uffdio_api.features bitmask returned by the UFFDIO_API ioctl
-defines what memory types are supported by the userfaultfd and what
-events, except page fault notifications, may be generated.
-
-If the kernel supports registering userfaultfd ranges on hugetlbfs
-virtual memory areas, UFFD_FEATURE_MISSING_HUGETLBFS will be set in
-uffdio_api.features. Similarly, UFFD_FEATURE_MISSING_SHMEM will be
-set if the kernel supports registering userfaultfd ranges on shared
-memory (covering all shmem APIs, i.e. tmpfs, IPCSHM, /dev/zero
-MAP_SHARED, memfd_create, etc).
-
-The userland application that wants to use userfaultfd with hugetlbfs
-or shared memory need to set the corresponding flag in
-uffdio_api.features to enable those features.
-
-If the userland desires to receive notifications for events other than
-page faults, it has to verify that uffdio_api.features has appropriate
-UFFD_FEATURE_EVENT_* bits set. These events are described in more
-detail below in "Non-cooperative userfaultfd" section.
-
-Once the userfaultfd has been enabled the UFFDIO_REGISTER ioctl should
-be invoked (if present in the returned uffdio_api.ioctls bitmask) to
-register a memory range in the userfaultfd by setting the
-uffdio_register structure accordingly. The uffdio_register.mode
+The ``uffdio_api.features`` bitmask returned by the ``UFFDIO_API`` ioctl
+defines what memory types are supported by the ``userfaultfd`` and what
+events, except page fault notifications, may be generated:
+
+- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various events
+ other than page faults are supported. These events are described in more
+ detail below in the `Non-cooperative userfaultfd`_ section.
+
+- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
+ indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
+ registrations for hugetlbfs and shared memory (covering all shmem APIs,
+ i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
+ etc) virtual memory areas, respectively.
+
+- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
+ ``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
+ areas. ``UFFD_FEATURE_MINOR_SHMEM`` is the analogous feature indicating
+ support for shmem virtual memory areas.
+
+The userland application should set the feature flags it intends to use
+when invoking the ``UFFDIO_API`` ioctl, to request that those features be
+enabled if supported.
+
+Once the ``userfaultfd`` API has been enabled the ``UFFDIO_REGISTER``
+ioctl should be invoked (if present in the returned ``uffdio_api.ioctls``
+bitmask) to register a memory range in the ``userfaultfd`` by setting the
+uffdio_register structure accordingly. The ``uffdio_register.mode``
bitmask will specify to the kernel which kind of faults to track for
-the range (UFFDIO_REGISTER_MODE_MISSING would track missing
-pages). The UFFDIO_REGISTER ioctl will return the
-uffdio_register.ioctls bitmask of ioctls that are suitable to resolve
+the range. The ``UFFDIO_REGISTER`` ioctl will return the
+``uffdio_register.ioctls`` bitmask of ioctls that are suitable to resolve
userfaults on the range registered. Not all ioctls will necessarily be
-supported for all memory types depending on the underlying virtual
-memory backend (anonymous memory vs tmpfs vs real filebacked
-mappings).
+supported for all memory types (e.g. anonymous memory vs. shmem vs.
+hugetlbfs), or all types of intercepted faults.
-Userland can use the uffdio_register.ioctls to manage the virtual
+Userland can use the ``uffdio_register.ioctls`` to manage the virtual
address space in the background (to add or potentially also remove
-memory from the userfaultfd registered range). This means a userfault
+memory from the ``userfaultfd`` registered range). This means a userfault
could be triggering just before userland maps in the background the
user-faulted page.
-The primary ioctl to resolve userfaults is UFFDIO_COPY. That
-atomically copies a page into the userfault registered range and wakes
-up the blocked userfaults (unless uffdio_copy.mode &
-UFFDIO_COPY_MODE_DONTWAKE is set). Other ioctl works similarly to
-UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
-half copied page since it'll keep userfaulting until the copy has
-finished.
+Resolving Userfaults
+--------------------
+
+There are three basic ways to resolve userfaults:
+
+- ``UFFDIO_COPY`` atomically copies some existing page contents from
+ userspace.
+
+- ``UFFDIO_ZEROPAGE`` atomically zeros the new page.
+
+- ``UFFDIO_CONTINUE`` maps an existing, previously-populated page.
+
+These operations are atomic in the sense that they guarantee nothing can
+see a half-populated page, since readers will keep userfaulting until the
+operation has finished.
+
+By default, these wake up userfaults blocked on the range in question.
+They support a ``UFFDIO_*_MODE_DONTWAKE`` ``mode`` flag, which indicates
+that waking will be done separately at some later time.
+
+Which ioctl to choose depends on the kind of page fault, and what we'd
+like to do to resolve it:
+
+- For ``UFFDIO_REGISTER_MODE_MISSING`` faults, the fault needs to be
+ resolved by either providing a new page (``UFFDIO_COPY``), or mapping
+ the zero page (``UFFDIO_ZEROPAGE``). By default, the kernel would map
+ the zero page for a missing fault. With userfaultfd, userspace can
+ decide what content to provide before the faulting thread continues.
+
+- For ``UFFDIO_REGISTER_MODE_MINOR`` faults, there is an existing page (in
+ the page cache). Userspace has the option of modifying the page's
+ contents before resolving the fault. Once the contents are correct
+ (modified or not), userspace asks the kernel to map the page and let the
+ faulting thread continue with ``UFFDIO_CONTINUE``.
+
+Notes:
+
+- You can tell which kind of fault occurred by examining
+ ``pagefault.flags`` within the ``uffd_msg``, checking for the
+ ``UFFD_PAGEFAULT_FLAG_*`` flags.
+
+- None of the page-delivering ioctls default to the range that you
+ registered with. You must fill in all fields for the appropriate
+ ioctl struct including the range.
+
+- You get the address of the access that triggered the missing page
+ event out of a struct uffd_msg that you read in the thread from the
+ uffd. You can supply as many pages as you want with these IOCTLs.
+ Keep in mind that unless you used DONTWAKE then the first of any of
+ those IOCTLs wakes up the faulting thread.
+
+- Be sure to test for all errors including
+ (``pollfd[0].revents & POLLERR``). This can happen, e.g. when ranges
+ supplied were incorrect.
+
+Write Protect Notifications
+---------------------------
+
+This is equivalent to (but faster than) using mprotect and a SIGSEGV
+signal handler.
+
+Firstly you need to register a range with ``UFFDIO_REGISTER_MODE_WP``.
+Instead of using mprotect(2) you use
+``ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect)``
+while ``mode = UFFDIO_WRITEPROTECT_MODE_WP``
+in the struct passed in. The range does not default to and does not
+have to be identical to the range you registered with. You can write
+protect as many ranges as you like (inside the registered range).
+Then, in the thread reading from uffd the struct will have
+``msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP`` set. Now you send
+``ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect)``
+again while ``uffdio_writeprotect.mode`` does not have ``UFFDIO_WRITEPROTECT_MODE_WP``
+set. This wakes up the thread which will continue to run with writes. This
+allows you to do the bookkeeping about the write in the uffd reading
+thread before the ioctl.
+
+If you registered with both ``UFFDIO_REGISTER_MODE_MISSING`` and
+``UFFDIO_REGISTER_MODE_WP`` then you need to think about the sequence in
+which you supply a page and undo write protect. Note that there is a
+difference between writes into a WP area and into a !WP area. The
+former will have ``UFFD_PAGEFAULT_FLAG_WP`` set, the latter
+``UFFD_PAGEFAULT_FLAG_WRITE``. The latter did not fail on protection but
+you still need to supply a page when ``UFFDIO_REGISTER_MODE_MISSING`` was
+used.
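The choice of resolving ioctl described in the "Resolving Userfaults" and "Write Protect Notifications" sections can be summarized as a small decision function. This is a sketch for illustration only: ``resolving_ioctls`` is a hypothetical helper that returns ioctl names as strings and issues no ioctl itself, and the flag values are assumed to match ``linux/userfaultfd.h`` and should be checked against the installed header.

```python
# Assumed values of the pagefault flags from linux/userfaultfd.h.
UFFD_PAGEFAULT_FLAG_WRITE = 1 << 0  # the fault was a write access
UFFD_PAGEFAULT_FLAG_WP = 1 << 1     # write to a write-protected page
UFFD_PAGEFAULT_FLAG_MINOR = 1 << 2  # minor fault, page already in page cache


def resolving_ioctls(pagefault_flags: int) -> list:
    """Return the names of the ioctls that can resolve this fault."""
    if pagefault_flags & UFFD_PAGEFAULT_FLAG_WP:
        # Clear write protection: UFFDIO_WRITEPROTECT without MODE_WP.
        return ["UFFDIO_WRITEPROTECT"]
    if pagefault_flags & UFFD_PAGEFAULT_FLAG_MINOR:
        # The page exists; map it once its contents are correct.
        return ["UFFDIO_CONTINUE"]
    # Missing fault (read or write): supply a page or the zero page.
    return ["UFFDIO_COPY", "UFFDIO_ZEROPAGE"]


if __name__ == "__main__":
    for flags in (0, UFFD_PAGEFAULT_FLAG_WRITE,
                  UFFD_PAGEFAULT_FLAG_WRITE | UFFD_PAGEFAULT_FLAG_WP,
                  UFFD_PAGEFAULT_FLAG_MINOR):
        print(flags, resolving_ioctls(flags))
```

A write into a !WP area only carries ``UFFD_PAGEFAULT_FLAG_WRITE``, so it falls through to the missing-page case, which matches the note that a page still has to be supplied there.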
QEMU/KVM
========
-QEMU/KVM is using the userfaultfd syscall to implement postcopy live
+QEMU/KVM is using the ``userfaultfd`` syscall to implement postcopy live
migration. Postcopy live migration is one form of memory
externalization consisting of a virtual machine running with part or
all of its memory residing on a different node in the cloud. The
-userfaultfd abstraction is generic enough that not a single line of
+``userfaultfd`` abstraction is generic enough that not a single line of
KVM kernel code had to be modified in order to add postcopy live
migration to QEMU.
-Guest async page faults, FOLL_NOWAIT and all other GUP features work
+Guest async page faults, ``FOLL_NOWAIT`` and all other ``GUP*`` features work
just fine in combination with userfaults. Userfaults trigger async
page faults in the guest scheduler so those guest processes that
aren't waiting for userfaults (i.e. network bound) can keep running in
@@ -132,19 +245,19 @@ generating userfaults for readonly guest regions.
The implementation of postcopy live migration currently uses one
single bidirectional socket but in the future two different sockets
will be used (to reduce the latency of the userfaults to the minimum
-possible without having to decrease /proc/sys/net/ipv4/tcp_wmem).
+possible without having to decrease ``/proc/sys/net/ipv4/tcp_wmem``).
The QEMU in the source node writes all pages that it knows are missing
in the destination node, into the socket, and the migration thread of
-the QEMU running in the destination node runs UFFDIO_COPY|ZEROPAGE
-ioctls on the userfaultfd in order to map the received pages into the
-guest (UFFDIO_ZEROCOPY is used if the source page was a zero page).
+the QEMU running in the destination node runs ``UFFDIO_COPY|ZEROPAGE``
+ioctls on the ``userfaultfd`` in order to map the received pages into the
+guest (``UFFDIO_ZEROPAGE`` is used if the source page was a zero page).
A different postcopy thread in the destination node listens with
-poll() to the userfaultfd in parallel. When a POLLIN event is
+poll() to the ``userfaultfd`` in parallel. When a ``POLLIN`` event is
generated after a userfault triggers, the postcopy thread read() from
-the userfaultfd and receives the fault address (or -EAGAIN in case the
-userfault was already resolved and waken by a UFFDIO_COPY|ZEROPAGE run
+the ``userfaultfd`` and receives the fault address (or ``-EAGAIN`` in case the
+userfault was already resolved and woken by a ``UFFDIO_COPY|ZEROPAGE`` run
by the parallel QEMU migration thread).
After the QEMU postcopy thread (running in the destination node) gets
@@ -155,7 +268,7 @@ remaining missing pages from that new page offset. Soon after that
(just the time to flush the tcp_wmem queue through the network) the
migration thread in the QEMU running in the destination node will
receive the page that triggered the userfault and it'll map it as
-usual with the UFFDIO_COPY|ZEROPAGE (without actually knowing if it
+usual with the ``UFFDIO_COPY|ZEROPAGE`` (without actually knowing if it
was spontaneously sent by the source or if it was an urgent page
requested through a userfault).
@@ -168,74 +281,74 @@ checked to find which missing pages to send in round robin and we seek
over it when receiving incoming userfaults. After sending each page of
course the bitmap is updated accordingly. It's also useful to avoid
sending the same page twice (in case the userfault is read by the
-postcopy thread just before UFFDIO_COPY|ZEROPAGE runs in the migration
+postcopy thread just before ``UFFDIO_COPY|ZEROPAGE`` runs in the migration
thread).
Non-cooperative userfaultfd
===========================
-When the userfaultfd is monitored by an external manager, the manager
+When the ``userfaultfd`` is monitored by an external manager, the manager
must be able to track changes in the process virtual memory
layout. Userfaultfd can notify the manager about such changes using
the same read(2) protocol as for the page fault notifications. The
manager has to explicitly enable these events by setting appropriate
-bits in uffdio_api.features passed to UFFDIO_API ioctl:
+bits in ``uffdio_api.features`` passed to ``UFFDIO_API`` ioctl:
-UFFD_FEATURE_EVENT_FORK
- enable userfaultfd hooks for fork(). When this feature is
- enabled, the userfaultfd context of the parent process is
+``UFFD_FEATURE_EVENT_FORK``
+ enable ``userfaultfd`` hooks for fork(). When this feature is
+ enabled, the ``userfaultfd`` context of the parent process is
duplicated into the newly created process. The manager
- receives UFFD_EVENT_FORK with file descriptor of the new
- userfaultfd context in the uffd_msg.fork.
+ receives ``UFFD_EVENT_FORK`` with file descriptor of the new
+ ``userfaultfd`` context in the ``uffd_msg.fork``.
-UFFD_FEATURE_EVENT_REMAP
+``UFFD_FEATURE_EVENT_REMAP``
enable notifications about mremap() calls. When the
non-cooperative process moves a virtual memory area to a
different location, the manager will receive
- UFFD_EVENT_REMAP. The uffd_msg.remap will contain the old and
+ ``UFFD_EVENT_REMAP``. The ``uffd_msg.remap`` will contain the old and
new addresses of the area and its original length.
-UFFD_FEATURE_EVENT_REMOVE
+``UFFD_FEATURE_EVENT_REMOVE``
enable notifications about madvise(MADV_REMOVE) and
- madvise(MADV_DONTNEED) calls. The event UFFD_EVENT_REMOVE will
- be generated upon these calls to madvise. The uffd_msg.remove
+ madvise(MADV_DONTNEED) calls. The event ``UFFD_EVENT_REMOVE`` will
+ be generated upon these calls to madvise(). The ``uffd_msg.remove``
will contain start and end addresses of the removed area.
-UFFD_FEATURE_EVENT_UNMAP
+``UFFD_FEATURE_EVENT_UNMAP``
enable notifications about memory unmapping. The manager will
- get UFFD_EVENT_UNMAP with uffd_msg.remove containing start and
+ get ``UFFD_EVENT_UNMAP`` with ``uffd_msg.remove`` containing start and
end addresses of the unmapped area.
-Although the UFFD_FEATURE_EVENT_REMOVE and UFFD_FEATURE_EVENT_UNMAP
+Although the ``UFFD_FEATURE_EVENT_REMOVE`` and ``UFFD_FEATURE_EVENT_UNMAP``
are pretty similar, they quite differ in the action expected from the
-userfaultfd manager. In the former case, the virtual memory is
+``userfaultfd`` manager. In the former case, the virtual memory is
removed, but the area is not, the area remains monitored by the
-userfaultfd, and if a page fault occurs in that area it will be
+``userfaultfd``, and if a page fault occurs in that area it will be
delivered to the manager. The proper resolution for such page fault is
to zeromap the faulting address. However, in the latter case, when an
area is unmapped, either explicitly (with munmap() system call), or
implicitly (e.g. during mremap()), the area is removed and in turn the
-userfaultfd context for such area disappears too and the manager will
+``userfaultfd`` context for such area disappears too and the manager will
not get further userland page faults from the removed area. Still, the
notification is required in order to prevent manager from using
-UFFDIO_COPY on the unmapped area.
+``UFFDIO_COPY`` on the unmapped area.
Unlike userland page faults which have to be synchronous and require
explicit or implicit wakeup, all the events are delivered
asynchronously and the non-cooperative process resumes execution as
-soon as manager executes read(). The userfaultfd manager should
-carefully synchronize calls to UFFDIO_COPY with the events
-processing. To aid the synchronization, the UFFDIO_COPY ioctl will
-return -ENOSPC when the monitored process exits at the time of
-UFFDIO_COPY, and -ENOENT, when the non-cooperative process has changed
-its virtual memory layout simultaneously with outstanding UFFDIO_COPY
+soon as manager executes read(). The ``userfaultfd`` manager should
+carefully synchronize calls to ``UFFDIO_COPY`` with the events
+processing. To aid the synchronization, the ``UFFDIO_COPY`` ioctl will
+return ``-ENOSPC`` when the monitored process exits at the time of
+``UFFDIO_COPY``, and ``-ENOENT``, when the non-cooperative process has changed
+its virtual memory layout simultaneously with outstanding ``UFFDIO_COPY``
operation.
The current asynchronous model of the event delivery is optimal for
-single threaded non-cooperative userfaultfd manager implementations. A
+single threaded non-cooperative ``userfaultfd`` manager implementations. A
synchronous event delivery model can be added later as a new
-userfaultfd feature to facilitate multithreading enhancements of the
-non cooperative manager, for example to allow UFFDIO_COPY ioctls to
+``userfaultfd`` feature to facilitate multithreading enhancements of the
+non cooperative manager, for example to allow ``UFFDIO_COPY`` ioctls to
run in parallel to the event reception. Single threaded
implementations should continue to use the current async event
delivery model instead.
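Reviewer's illustration (not part of the patch): the non-cooperative events and the ``UFFDIO_COPY`` error codes described in the hunk above can be summarised as a lookup table. This is a minimal Python sketch of that summary only. The ``EventInfo`` type, the ``EVENTS`` table and ``classify_copy_error()`` are hypothetical names introduced here, not kernel or libc API, and the notes are a reading of the text rather than an authoritative statement of kernel behaviour.

```python
import errno
from typing import NamedTuple


class EventInfo(NamedTuple):
    feature: str   # bit to set in uffdio_api.features
    payload: str   # uffd_msg member that carries the details
    note: str      # what the documentation text says about the event


# Identifiers come from the documentation text; the table itself is a
# hypothetical helper and is not provided by the kernel or by libc.
EVENTS = {
    "UFFD_EVENT_FORK": EventInfo(
        "UFFD_FEATURE_EVENT_FORK", "uffd_msg.fork",
        "carries the file descriptor of the new userfaultfd context"),
    "UFFD_EVENT_REMAP": EventInfo(
        "UFFD_FEATURE_EVENT_REMAP", "uffd_msg.remap",
        "carries old and new addresses and the original length"),
    "UFFD_EVENT_REMOVE": EventInfo(
        "UFFD_FEATURE_EVENT_REMOVE", "uffd_msg.remove",
        "area stays monitored; later faults are resolved by zeromapping"),
    "UFFD_EVENT_UNMAP": EventInfo(
        "UFFD_FEATURE_EVENT_UNMAP", "uffd_msg.remove",
        "area is gone; the manager must not use UFFDIO_COPY on it"),
}


def classify_copy_error(err: int) -> str:
    """Map a UFFDIO_COPY error to the condition named in the text."""
    code = abs(err)  # accept both the -ENOSPC and the ENOSPC spelling
    if code == errno.ENOSPC:
        return "monitored process exited"
    if code == errno.ENOENT:
        return "memory layout changed during the copy"
    return "other error"
```

A manager loop could consult ``EVENTS[name].payload`` to decide which ``uffd_msg`` member to decode, and use ``classify_copy_error()`` to tell an exited process apart from a concurrent layout change before deciding whether to retry.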
diff --git a/Documentation/admin-guide/mm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst
new file mode 100644
index 000000000000..6e6f7b0d6562
--- /dev/null
+++ b/Documentation/admin-guide/mm/zswap.rst
@@ -0,0 +1,168 @@
+.. _zswap:
+
+=====
+zswap
+=====
+
+Overview
+========
+
+Zswap is a lightweight compressed cache for swap pages. It takes pages that are
+in the process of being swapped out and attempts to compress them into a
+dynamically allocated RAM-based memory pool. zswap basically trades CPU cycles
+for potentially reduced swap I/O. This trade-off can also result in a
+significant performance improvement if reads from the compressed cache are
+faster than reads from a swap device.
+
+.. note::
+ Zswap is a new feature as of v3.11 and interacts heavily with memory
+ reclaim. This interaction has not been fully explored on the large set of
+ potential configurations and workloads that exist. For this reason, zswap
+ is a work in progress and should be considered experimental.
+
+Some potential benefits:
+
+* Desktop/laptop users with limited RAM capacities can mitigate the
+ performance impact of swapping.
+* Overcommitted guests that share a common I/O resource can
+ dramatically reduce their swap I/O pressure, avoiding heavy handed I/O
+ throttling by the hypervisor. This allows more work to get done with less
+ impact to the guest workload and guests sharing the I/O subsystem.
+* Users with SSDs as swap devices can extend the life of the device by
+ drastically reducing life-shortening writes.
+
+Zswap evicts pages from compressed cache on an LRU basis to the backing swap
+device when the compressed pool reaches its size limit. This requirement had
+been identified in prior community discussions.
+
+Whether Zswap is enabled at boot time depends on whether
+the ``CONFIG_ZSWAP_DEFAULT_ON`` Kconfig option is enabled or not.
+This setting can then be overridden by providing the kernel command line
+``zswap.enabled=`` option, for example ``zswap.enabled=0``.
+Zswap can also be enabled and disabled at runtime using the sysfs interface.
+An example command to enable zswap at runtime, assuming sysfs is mounted
+at ``/sys``, is::
+
+ echo 1 > /sys/module/zswap/parameters/enabled
+
+When zswap is disabled at runtime it will stop storing pages that are
+being swapped out. However, it will *not* immediately write out or fault
+back into memory all of the pages stored in the compressed pool. The
+pages stored in zswap will remain in the compressed pool until they are
+either invalidated or faulted back into memory. In order to force all
+pages out of the compressed pool, a swapoff on the swap device(s) will
+fault back into memory all swapped out pages, including those in the
+compressed pool.
+
+Design
+======
+
+Zswap receives pages for compression through the Frontswap API and is able to
+evict pages from its own compressed pool on an LRU basis and write them back to
+the backing swap device in the case that the compressed pool is full.
+
+Zswap makes use of zpool for managing the compressed memory pool. Each
+allocation in zpool is not directly accessible by address. Rather, a handle is
+returned by the allocation routine and that handle must be mapped before being
+accessed. The compressed memory pool grows on demand and shrinks as compressed
+pages are freed. The pool is not preallocated. By default, a zpool
+of the type selected by the ``CONFIG_ZSWAP_ZPOOL_DEFAULT`` Kconfig option is created,
+but it can be overridden at boot time by setting the ``zpool`` attribute,
+e.g. ``zswap.zpool=zbud``. It can also be changed at runtime using the sysfs
+``zpool`` attribute, e.g.::
+
+ echo zbud > /sys/module/zswap/parameters/zpool
+
+The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
+means the compression ratio will always be 2:1 or worse (because of half-full
+zbud pages). The zsmalloc type zpool has a more complex compressed page
+storage method, and it can achieve greater storage densities. However,
+zsmalloc does not implement compressed page eviction, so once zswap fills, it
+cannot evict the oldest page; it can only reject new pages.
+
+When a swap page is passed from frontswap to zswap, zswap maintains a mapping
+of the swap entry, a combination of the swap type and swap offset, to the zpool
+handle that references that compressed swap page. This mapping is achieved
+with a red-black tree per swap type. The swap offset is the search key for the
+tree nodes.
+
+During a page fault on a PTE that is a swap entry, frontswap calls the zswap
+load function to decompress the page into the page allocated by the page fault
+handler.
+
+Once there are no PTEs referencing a swap page stored in zswap (i.e. the count
+in the swap_map goes to 0) the swap code calls the zswap invalidate function,
+via frontswap, to free the compressed entry.
+
+Zswap seeks to be simple in its policies. Sysfs attributes allow for one user
+controlled policy:
+
+* max_pool_percent - The maximum percentage of memory that the compressed
+ pool can occupy.
+
+The default compressor is selected in ``CONFIG_ZSWAP_COMPRESSOR_DEFAULT``
+Kconfig option, but it can be overridden at boot time by setting the
+``compressor`` attribute, e.g. ``zswap.compressor=lzo``.
+It can also be changed at runtime using the sysfs ``compressor``
+attribute, e.g.::
+
+ echo lzo > /sys/module/zswap/parameters/compressor
+
+When the zpool and/or compressor parameter is changed at runtime, any existing
+compressed pages are not modified; they are left in their own zpool. When a
+request is made for a page in an old zpool, it is uncompressed using its
+original compressor. Once all pages are removed from an old zpool, the zpool
+and its compressor are freed.
+
+Some of the pages in zswap are same-value filled pages (i.e. contents of the
+page have same value or repetitive pattern). These pages include zero-filled
+pages and they are handled differently. During store operation, a page is
+checked if it is a same-value filled page before compressing it. If true, the
+compressed length of the page is set to zero and the pattern or same-filled
+value is stored.
+
+Same-value filled pages identification feature is enabled by default and can be
+disabled at boot time by setting the ``same_filled_pages_enabled`` attribute
+to 0, e.g. ``zswap.same_filled_pages_enabled=0``. It can also be enabled and
+disabled at runtime using the sysfs ``same_filled_pages_enabled``
+attribute, e.g.::
+
+ echo 1 > /sys/module/zswap/parameters/same_filled_pages_enabled
+
+When zswap same-filled page identification is disabled at runtime, it will stop
+checking for the same-value filled pages during store operation.
+In other words, every page will be then considered non-same-value filled.
+However, the existing pages which are marked as same-value filled pages remain
+stored unchanged in zswap until they are either loaded or invalidated.
+
+In some circumstances it might be advantageous to make use of just the zswap
+ability to efficiently store same-filled pages without enabling the whole
+compressed page storage.
+In this case the handling of non-same-value pages by zswap (enabled by default)
+can be disabled by setting the ``non_same_filled_pages_enabled`` attribute
+to 0, e.g. ``zswap.non_same_filled_pages_enabled=0``.
+It can also be enabled and disabled at runtime using the sysfs
+``non_same_filled_pages_enabled`` attribute, e.g.::
+
+ echo 1 > /sys/module/zswap/parameters/non_same_filled_pages_enabled
+
+Disabling both ``zswap.same_filled_pages_enabled`` and
+``zswap.non_same_filled_pages_enabled`` effectively disables accepting any new
+pages by zswap.
+
+To prevent zswap from shrinking the pool when zswap is full and there's a high
+pressure on swap (this will result in flipping pages in and out of the zswap pool
+without any real benefit but with a performance drop for the system), a
+special parameter has been introduced to implement a sort of hysteresis to
+refuse taking pages into zswap pool until it has sufficient space if the limit
+has been hit. To set the threshold at which zswap would start accepting pages
+again after it became full, use the sysfs ``accept_threshold_percent``
+attribute, e.g.::
+
+ echo 80 > /sys/module/zswap/parameters/accept_threshold_percent
+
+Setting this parameter to 100 will disable the hysteresis.
+
+A debugfs interface is provided for various statistics about pool size, number
+of pages stored, same-value filled pages and various counters for the reasons
+pages are rejected.
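Reviewer's illustration (not part of the patch): the ``accept_threshold_percent`` hysteresis described above can be modelled in a few lines. This is a simplified Python sketch under stated assumptions, not the kernel implementation: the ``ZswapAdmission`` class, its field names and the page-count interface are all hypothetical, and the pool limit is passed in directly rather than being derived from ``max_pool_percent``.

```python
from dataclasses import dataclass


@dataclass
class ZswapAdmission:
    """Toy model of the 'refuse until there is room again' policy."""
    max_pool_pages: int            # pool size limit, in pages (assumed unit)
    accept_threshold_percent: int  # value of the sysfs attribute
    pool_limit_hit: bool = False   # True once the pool has filled up

    def should_accept(self, pool_pages: int) -> bool:
        # A full pool always rejects and arms the hysteresis.
        if pool_pages >= self.max_pool_pages:
            self.pool_limit_hit = True
            return False
        if self.pool_limit_hit:
            threshold = (self.max_pool_pages
                         * self.accept_threshold_percent // 100)
            # Keep rejecting until usage drops to the threshold or below.
            if pool_pages > threshold:
                return False
            self.pool_limit_hit = False
        return True
```

With a limit of 1000 pages and a threshold of 80, the model rejects at 1000 pages, keeps rejecting at 900, and accepts again once usage is back at 800. With a threshold of 100 it accepts as soon as the pool is no longer full, which matches the statement that 100 disables the hysteresis.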
diff --git a/Documentation/admin-guide/module-signing.rst b/Documentation/admin-guide/module-signing.rst
index f8b584179cff..7d7c7c8a545c 100644
--- a/Documentation/admin-guide/module-signing.rst
+++ b/Documentation/admin-guide/module-signing.rst
@@ -106,7 +106,7 @@ This has a number of options available:
certificate and a private key.
If the PEM file containing the private key is encrypted, or if the
- PKCS#11 token requries a PIN, this can be provided at build time by
+ PKCS#11 token requires a PIN, this can be provided at build time by
means of the ``KBUILD_SIGN_PIN`` variable.
diff --git a/Documentation/admin-guide/mono.rst b/Documentation/admin-guide/mono.rst
index 59e6d59f0ed9..c6dab5680065 100644
--- a/Documentation/admin-guide/mono.rst
+++ b/Documentation/admin-guide/mono.rst
@@ -12,11 +12,11 @@ other program after you have done the following:
a binary package, a source tarball or by installing from Git. Binary
packages for several distributions can be found at:
- http://www.mono-project.com/download/
+ https://www.mono-project.com/download/
Instructions for compiling Mono can be found at:
- http://www.mono-project.com/docs/compiling-mono/linux/
+ https://www.mono-project.com/docs/compiling-mono/linux/
Once the Mono CLR support has been installed, just check that
``/usr/bin/mono`` (which could be located elsewhere, for example
diff --git a/Documentation/admin-guide/nfs/fault_injection.rst b/Documentation/admin-guide/nfs/fault_injection.rst
deleted file mode 100644
index eb029c0c15ce..000000000000
--- a/Documentation/admin-guide/nfs/fault_injection.rst
+++ /dev/null
@@ -1,70 +0,0 @@
-===================
-NFS Fault Injection
-===================
-
-Fault injection is a method for forcing errors that may not normally occur, or
-may be difficult to reproduce. Forcing these errors in a controlled environment
-can help the developer find and fix bugs before their code is shipped in a
-production system. Injecting an error on the Linux NFS server will allow us to
-observe how the client reacts and if it manages to recover its state correctly.
-
-NFSD_FAULT_INJECTION must be selected when configuring the kernel to use this
-feature.
-
-
-Using Fault Injection
-=====================
-On the client, mount the fault injection server through NFS v4.0+ and do some
-work over NFS (open files, take locks, ...).
-
-On the server, mount the debugfs filesystem to <debug_dir> and ls
-<debug_dir>/nfsd. This will show a list of files that will be used for
-injecting faults on the NFS server. As root, write a number n to the file
-corresponding to the action you want the server to take. The server will then
-process the first n items it finds. So if you want to forget 5 locks, echo '5'
-to <debug_dir>/nfsd/forget_locks. A value of 0 will tell the server to forget
-all corresponding items. A log message will be created containing the number
-of items forgotten (check dmesg).
-
-Go back to work on the client and check if the client recovered from the error
-correctly.
-
-
-Available Faults
-================
-forget_clients:
- The NFS server keeps a list of clients that have placed a mount call. If
- this list is cleared, the server will have no knowledge of who the client
- is, forcing the client to reauthenticate with the server.
-
-forget_openowners:
- The NFS server keeps a list of what files are currently opened and who
- they were opened by. Clearing this list will force the client to reopen
- its files.
-
-forget_locks:
- The NFS server keeps a list of what files are currently locked in the VFS.
- Clearing this list will force the client to reclaim its locks (files are
- unlocked through the VFS as they are cleared from this list).
-
-forget_delegations:
- A delegation is used to assure the client that a file, or part of a file,
- has not changed since the delegation was awarded. Clearing this list will
- force the client to reacquire its delegation before accessing the file
- again.
-
-recall_delegations:
- Delegations can be recalled by the server when another client attempts to
- access a file. This test will notify the client that its delegation has
- been revoked, forcing the client to reacquire the delegation before using
- the file again.
-
-
-tools/nfs/inject_faults.sh script
-=================================
-This script has been created to ease the fault injection process. This script
-will detect the mounted debugfs directory and write to the files located there
-based on the arguments passed by the user. For example, running
-`inject_faults.sh forget_locks 1` as root will instruct the server to forget
-one lock. Running `inject_faults forget_locks` will instruct the server to
-forgetall locks.
diff --git a/Documentation/admin-guide/nfs/index.rst b/Documentation/admin-guide/nfs/index.rst
index 6b5a3c90fac5..3601a708f333 100644
--- a/Documentation/admin-guide/nfs/index.rst
+++ b/Documentation/admin-guide/nfs/index.rst
@@ -12,4 +12,3 @@ NFS
nfs-idmapper
pnfs-block-server
pnfs-scsi-server
- fault_injection
diff --git a/Documentation/admin-guide/nfs/nfs-client.rst b/Documentation/admin-guide/nfs/nfs-client.rst
index c4b777c7584b..36760685dd34 100644
--- a/Documentation/admin-guide/nfs/nfs-client.rst
+++ b/Documentation/admin-guide/nfs/nfs-client.rst
@@ -36,10 +36,9 @@ administrative requirements that require particular behavior that does not
work well as part of an nfs_client_id4 string.
The nfs.nfs4_unique_id boot parameter specifies a unique string that can be
-used instead of a system's node name when an NFS client identifies itself to
-a server. Thus, if the system's node name is not unique, or it changes, its
-nfs.nfs4_unique_id stays the same, preventing collision with other clients
-or loss of state during NFS reboot recovery or transparent state migration.
+used together with a system's node name when an NFS client identifies itself to
+a server. Thus, if the system's node name is not unique, its
+nfs.nfs4_unique_id can help prevent collisions with other clients.
The nfs.nfs4_unique_id string is typically a UUID, though it can contain
anything that is believed to be unique across all NFS clients. An
@@ -53,8 +52,12 @@ outstanding NFSv4 state has expired, to prevent loss of NFSv4 state.
This string can be stored in an NFS client's grub.conf, or it can be provided
via a net boot facility such as PXE. It may also be specified as an nfs.ko
-module parameter. Specifying a uniquifier string is not support for NFS
-clients running in containers.
+module parameter.
+
+This uniquifier string will be the same for all NFS clients running in
+containers unless it is overridden by a value written to
+/sys/fs/nfs/net/nfs_client/identifier which will be local to the network
+namespace of the process which writes.
The DNS resolver
@@ -65,8 +68,8 @@ migrated onto another server by means of the special "fs_locations"
attribute. See `RFC3530 Section 6: Filesystem Migration and Replication`_ and
`Implementation Guide for Referrals in NFSv4`_.
-.. _RFC3530 Section 6\: Filesystem Migration and Replication: http://tools.ietf.org/html/rfc3530#section-6
-.. _Implementation Guide for Referrals in NFSv4: http://tools.ietf.org/html/draft-ietf-nfsv4-referrals-00
+.. _RFC3530 Section 6\: Filesystem Migration and Replication: https://tools.ietf.org/html/rfc3530#section-6
+.. _Implementation Guide for Referrals in NFSv4: https://tools.ietf.org/html/draft-ietf-nfsv4-referrals-00
The fs_locations information can take the form of either an ip address and
a path, or a DNS hostname and a path. The latter requires the NFS client to
diff --git a/Documentation/admin-guide/nfs/nfs-rdma.rst b/Documentation/admin-guide/nfs/nfs-rdma.rst
index ef0f3678b1fb..f137485f8bde 100644
--- a/Documentation/admin-guide/nfs/nfs-rdma.rst
+++ b/Documentation/admin-guide/nfs/nfs-rdma.rst
@@ -65,7 +65,7 @@ use with NFS/RDMA.
If the version is less than 1.1.2 or the command does not exist,
you should install the latest version of nfs-utils.
- Download the latest package from: http://www.kernel.org/pub/linux/utils/nfs
+ Download the latest package from: https://www.kernel.org/pub/linux/utils/nfs
Uncompress the package and follow the installation instructions.
diff --git a/Documentation/admin-guide/nfs/nfsroot.rst b/Documentation/admin-guide/nfs/nfsroot.rst
index 82a4fda057f9..135218f33394 100644
--- a/Documentation/admin-guide/nfs/nfsroot.rst
+++ b/Documentation/admin-guide/nfs/nfsroot.rst
@@ -18,7 +18,7 @@ Mounting the root filesystem via NFS (nfsroot)
In order to use a diskless system, such as an X-terminal or printer server for
example, it is necessary for the root filesystem to be present on a non-disk
device. This may be an initramfs (see
-Documentation/filesystems/ramfs-rootfs-initramfs.txt), a ramdisk (see
+Documentation/filesystems/ramfs-rootfs-initramfs.rst), a ramdisk (see
Documentation/admin-guide/initrd.rst) or a filesystem mounted via NFS. The
following text describes on how to use NFS for the root filesystem. For the rest
of this text 'client' means the diskless system, and 'server' means the NFS
@@ -264,7 +264,7 @@ They depend on various facilities being available:
access to the floppy drive device, /dev/fd0
For more information on syslinux, including how to create bootdisks
- for prebuilt kernels, see http://syslinux.zytor.com/
+ for prebuilt kernels, see https://syslinux.zytor.com/
.. note::
Previously it was possible to write a kernel directly to
@@ -292,7 +292,7 @@ They depend on various facilities being available:
cdrecord dev=ATAPI:1,0,0 arch/x86/boot/image.iso
For more information on isolinux, including how to create bootdisks
- for prebuilt kernels, see http://syslinux.zytor.com/
+ for prebuilt kernels, see https://syslinux.zytor.com/
- Using LILO
@@ -346,7 +346,7 @@ They depend on various facilities being available:
see Documentation/admin-guide/serial-console.rst for more information.
For more information on isolinux, including how to create bootdisks
- for prebuilt kernels, see http://syslinux.zytor.com/
+ for prebuilt kernels, see https://syslinux.zytor.com/
diff --git a/Documentation/admin-guide/nfs/pnfs-block-server.rst b/Documentation/admin-guide/nfs/pnfs-block-server.rst
index b00a2e705cc4..20fe9f5117fe 100644
--- a/Documentation/admin-guide/nfs/pnfs-block-server.rst
+++ b/Documentation/admin-guide/nfs/pnfs-block-server.rst
@@ -8,7 +8,7 @@ to handling all the metadata access to the NFS export also hands out layouts
to the clients to directly access the underlying block devices that are
shared with the client.
-To use pNFS block layouts with with the Linux NFS server the exported file
+To use pNFS block layouts with the Linux NFS server the exported file
system needs to support the pNFS block layouts (currently just XFS), and the
file system must sit on shared storage (typically iSCSI) that is accessible
to the clients in addition to the MDS. As of now the file system needs to
diff --git a/Documentation/admin-guide/nfs/pnfs-scsi-server.rst b/Documentation/admin-guide/nfs/pnfs-scsi-server.rst
index d2f6ee558071..b2eec2288329 100644
--- a/Documentation/admin-guide/nfs/pnfs-scsi-server.rst
+++ b/Documentation/admin-guide/nfs/pnfs-scsi-server.rst
@@ -9,7 +9,7 @@ which in addition to handling all the metadata access to the NFS export,
also hands out layouts to the clients so that they can directly access the
underlying SCSI LUNs that are shared with the client.
-To use pNFS SCSI layouts with with the Linux NFS server, the exported file
+To use pNFS SCSI layouts with the Linux NFS server, the exported file
system needs to support the pNFS SCSI layouts (currently just XFS), and the
file system must sit on a SCSI LUN that is accessible to the clients in
addition to the MDS. As of now the file system needs to sit directly on the
diff --git a/Documentation/admin-guide/numastat.rst b/Documentation/admin-guide/numastat.rst
index aaf1667489f8..08ec2c2bdce3 100644
--- a/Documentation/admin-guide/numastat.rst
+++ b/Documentation/admin-guide/numastat.rst
@@ -6,6 +6,21 @@ Numa policy hit/miss statistics
All units are pages. Hugepages have separate counters.
+The numa_hit, numa_miss and numa_foreign counters reflect how well processes
+are able to allocate memory from nodes they prefer. If they succeed, numa_hit
+is incremented on the preferred node, otherwise numa_foreign is incremented on
+the preferred node and numa_miss on the node where allocation succeeded.
+
+Usually the preferred node is the one local to the CPU where the process executes,
+but restrictions such as mempolicies can change that, so there are also two
+counters based on the CPU-local node. local_node is similar to numa_hit and is
+incremented on allocation from a node by a CPU on the same node. other_node is
+similar to numa_miss and is incremented on the node where allocation succeeds
+from a CPU on a different node. Note there is no counter analogous to
+numa_foreign.
+
+In more detail:
+
=============== ============================================================
numa_hit A process wanted to allocate memory from this node,
and succeeded.
@@ -14,11 +29,13 @@ numa_miss A process wanted to allocate memory from another node,
but ended up with memory from this node.
numa_foreign A process wanted to allocate on this node,
- but ended up with memory from another one.
+ but ended up with memory from another node.
-local_node A process ran on this node and got memory from it.
+local_node A process ran on this node's CPU,
+ and got memory from this node.
-other_node A process ran on this node and got memory from another node.
+other_node A process ran on a different node's CPU
+ and got memory from this node.
interleave_hit Interleaving wanted to allocate from this node
and succeeded.
@@ -28,3 +45,11 @@ For easier reading you can use the numastat utility from the numactl package
(http://oss.sgi.com/projects/libnuma/). Note that it only works
well right now on machines with a small number of CPUs.
+Note that on systems with memoryless nodes (where a node has CPUs but no
+memory) the numa_hit, numa_miss and numa_foreign statistics can be skewed
+heavily. In the current kernel implementation, if a process prefers a
+memoryless node (i.e. because it is running on one of its local CPUs), the
+implementation actually treats one of the nearest nodes with memory as the
+preferred node. As a result, such allocation will not increase the numa_foreign
+counter on the memoryless node, and will skew the numa_hit, numa_miss and
+numa_foreign statistics of the nearest node.
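Reviewer's illustration (not part of the patch): the counter definitions above imply that, for a given node, ``numa_hit / (numa_hit + numa_foreign)`` is the share of allocations preferring that node that were actually served from it. This is a minimal Python sketch of that calculation. The function names are hypothetical, and the input is assumed to be whitespace-separated ``name value`` lines, one counter per line, as in a per-node ``numastat`` file.

```python
def parse_numastat(text: str) -> dict:
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or malformed lines rather than failing on them.
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def preferred_satisfied_ratio(counters: dict) -> float:
    """Share of allocations preferring this node that it served itself."""
    hit = counters.get("numa_hit", 0)
    foreign = counters.get("numa_foreign", 0)
    total = hit + foreign
    return hit / total if total else 0.0


# Made-up sample values, for illustration only.
SAMPLE = """\
numa_hit 900
numa_miss 25
numa_foreign 100
interleave_hit 3
local_node 890
other_node 35
"""

if __name__ == "__main__":
    print(preferred_satisfied_ratio(parse_numastat(SAMPLE)))
```

On systems with memoryless nodes the ratio inherits the skew described above, so it should be read per node with that caveat in mind.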
diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index 72effa7c23b9..34aa334320ca 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -1,6 +1,6 @@
.. _perf_security:
-Perf Events and tool security
+Perf events and tool security
=============================
Overview
@@ -42,11 +42,11 @@ categories:
Data that belong to the fourth category can potentially contain
sensitive process data. If PMUs in some monitoring modes capture values
of execution context registers or data from process memory then access
-to such monitoring capabilities requires to be ordered and secured
-properly. So, perf_events/Perf performance monitoring is the subject for
-security access control management [5]_ .
+to such monitoring modes must be ordered and secured properly.
+So, perf_events performance monitoring and observability operations are
+subject to security access control management [5]_ .
-perf_events/Perf access control
+perf_events access control
-------------------------------
To perform security checks, the Linux implementation splits processes
@@ -66,15 +66,32 @@ into distinct units, known as capabilities [6]_ , which can be
independently enabled and disabled on per-thread basis for processes and
files of unprivileged users.
-Unprivileged processes with enabled CAP_SYS_ADMIN capability are treated
+Unprivileged processes with enabled CAP_PERFMON capability are treated
as privileged processes with respect to perf_events performance
-monitoring and bypass *scope* permissions checks in the kernel.
-
-Unprivileged processes using perf_events system call API is also subject
-for PTRACE_MODE_READ_REALCREDS ptrace access mode check [7]_ , whose
-outcome determines whether monitoring is permitted. So unprivileged
-processes provided with CAP_SYS_PTRACE capability are effectively
-permitted to pass the check.
+monitoring and observability operations and thus bypass *scope* permissions
+checks in the kernel. CAP_PERFMON implements the principle of least
+privilege [13]_ (POSIX 1003.1e: 2.2.2.39) for performance monitoring and
+observability operations in the kernel and provides a secure approach to
+performance monitoring and observability in the system.
+
+For backward compatibility reasons the access to perf_events monitoring and
+observability operations is also open for CAP_SYS_ADMIN privileged
+processes but CAP_SYS_ADMIN usage for secure monitoring and observability
+use cases is discouraged in favor of the CAP_PERFMON capability.
+If system audit records [14]_ for a process using perf_events system call
+API contain denial records of acquiring both CAP_PERFMON and CAP_SYS_ADMIN
+capabilities then providing the process with CAP_PERFMON capability singly
+is recommended as the preferred secure approach to resolve double access
+denial logging related to usage of performance monitoring and observability.
+
+Prior to Linux v5.9, unprivileged processes using the perf_events system call
+are also subject to the PTRACE_MODE_READ_REALCREDS ptrace access mode check
+[7]_ , whose outcome determines whether monitoring is permitted.
+So unprivileged processes provided with CAP_SYS_PTRACE capability are
+effectively permitted to pass the check. Starting from Linux v5.9, the
+CAP_SYS_PTRACE capability is not required and CAP_PERFMON alone is
+sufficient for processes to perform performance monitoring and observability
+operations.
Other capabilities being granted to unprivileged processes can
effectively enable capturing of additional data required for later
@@ -82,14 +99,14 @@ performance analysis of monitored processes or a system. For example,
CAP_SYSLOG capability permits reading kernel space memory addresses from
/proc/kallsyms file.
-perf_events/Perf privileged users
+Privileged Perf users groups
---------------------------------
-Mechanisms of capabilities, privileged capability-dumb files [6]_ and
-file system ACLs [10]_ can be used to create a dedicated group of
-perf_events/Perf privileged users who are permitted to execute
-performance monitoring without scope limits. The following steps can be
-taken to create such a group of privileged Perf users.
+Mechanisms of capabilities, privileged capability-dumb files [6]_,
+file system ACLs [10]_ and sudo [15]_ utility can be used to create
+dedicated groups of privileged Perf users who are permitted to execute
+performance monitoring and observability without limits. The following
+steps can be taken to create such groups of privileged Perf users.
1. Create perf_users group of privileged Perf users, assign perf_users
group to Perf tool executable and limit access to the executable for
@@ -108,30 +125,105 @@ taken to create such a group of privileged Perf users.
-rwxr-x--- 2 root perf_users 11M Oct 19 15:12 perf
2. Assign the required capabilities to the Perf tool executable file and
- enable members of perf_users group with performance monitoring
+ enable members of perf_users group with monitoring and observability
privileges [6]_ :
::
- # setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
- # setcap -v "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap -v "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
perf: OK
# getcap perf
- perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep
+ perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
+
+If the libcap [16]_ installed doesn't yet support "cap_perfmon", use "38" instead,
+i.e.:
+
+::
+
+ # setcap "38,cap_ipc_lock,cap_sys_ptrace,cap_syslog=ep" perf
+
+Note that you may need to have 'cap_ipc_lock' in the mix for tools such as
+'perf top'. Alternatively, use 'perf top -m N' to reduce the memory that
+it uses for the perf ring buffer; see the memory allocation section below.
+
+Using a libcap without support for CAP_PERFMON will make cap_get_flag(caps, 38,
+CAP_EFFECTIVE, &val) fail, which will lead the default event to be 'cycles:u',
+so as a workaround explicitly ask for the 'cycles' event, i.e.:
+
+::
+
+ # perf top -e cycles
+
+This yields kernel and user samples with a perf binary that has just CAP_PERFMON.
As a result, members of perf_users group are capable of conducting
-performance monitoring by using functionality of the configured Perf
-tool executable that, when executes, passes perf_events subsystem scope
-checks.
+performance monitoring and observability by using functionality of the
+configured Perf tool executable that, when executed, passes perf_events
+subsystem scope checks.
+
+If the Perf tool executable can't be assigned the required capabilities (e.g.
+the file system is mounted with the nosuid option or extended attributes are
+not supported by the file system), a capabilities-privileged environment,
+typically a shell, can be created instead. The shell provides the
+processes it spawns with CAP_PERFMON and other required capabilities so that
+performance monitoring and observability operations are available in the
+environment without limits. Access to the environment can be opened via the
+sudo utility for members of the perf_users group only. To create such an
+environment:
+
+1. Create shell script that uses capsh utility [16]_ to assign CAP_PERFMON
+ and other required capabilities into ambient capability set of the shell
+ process, lock the process security bits after enabling SECBIT_NO_SETUID_FIXUP,
+ SECBIT_NOROOT and SECBIT_NO_CAP_AMBIENT_RAISE bits and then change
+ the process identity to sudo caller of the script who should essentially
+ be a member of perf_users group:
+
+::
+
+ # ls -alh /usr/local/bin/perf.shell
+ -rwxr-xr-x. 1 root root 83 Oct 13 23:57 /usr/local/bin/perf.shell
+ # cat /usr/local/bin/perf.shell
+ exec /usr/sbin/capsh --iab=^cap_perfmon --secbits=239 --user=$SUDO_USER -- -l
+
+2. Extend sudo policy at /etc/sudoers file with a rule for perf_users group:
+
+::
+
+ # grep perf_users /etc/sudoers
+ %perf_users ALL=/usr/local/bin/perf.shell
+
+3. Check that members of perf_users group have access to the privileged
+ shell and have CAP_PERFMON and other required capabilities enabled
+ in permitted, effective and ambient capability sets of an inherent process:
+
+::
+
+ $ id
+ uid=1003(capsh_test) gid=1004(capsh_test) groups=1004(capsh_test),1000(perf_users) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
+ $ sudo perf.shell
+ [sudo] password for capsh_test:
+ $ grep Cap /proc/self/status
+ CapInh: 0000004000000000
+ CapPrm: 0000004000000000
+ CapEff: 0000004000000000
+ CapBnd: 000000ffffffffff
+ CapAmb: 0000004000000000
+ $ capsh --decode=0000004000000000
+ 0x0000004000000000=cap_perfmon
+
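The capability mask and the ``--secbits=239`` value used in the steps above can be cross-checked by hand. The following is a minimal Python sketch, not part of the kernel documentation: the helper names (``set_bits``, ``decode_caps``, ``decode_secbits``) are hypothetical, and the bit numbers are taken from the kernel UAPI headers linux/capability.h and linux/securebits.h.

```python
# Bit numbers from the kernel UAPI headers linux/capability.h and
# linux/securebits.h. Only the capabilities mentioned in this document
# are named; any other bit is reported by its number.
CAP_NAMES = {
    14: "cap_ipc_lock",
    19: "cap_sys_ptrace",
    21: "cap_sys_admin",
    34: "cap_syslog",
    38: "cap_perfmon",
}

# Index in this list == bit position in the securebits value.
SECBIT_NAMES = [
    "SECBIT_NOROOT", "SECBIT_NOROOT_LOCKED",
    "SECBIT_NO_SETUID_FIXUP", "SECBIT_NO_SETUID_FIXUP_LOCKED",
    "SECBIT_KEEP_CAPS", "SECBIT_KEEP_CAPS_LOCKED",
    "SECBIT_NO_CAP_AMBIENT_RAISE", "SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED",
]


def set_bits(value):
    """Return the positions of all bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def decode_caps(hex_mask):
    """Decode a Cap* line from /proc/<pid>/status (hex string)."""
    return [CAP_NAMES.get(bit, str(bit)) for bit in set_bits(int(hex_mask, 16))]


def decode_secbits(value):
    """Decode a numeric securebits value such as the one passed to capsh."""
    return [
        SECBIT_NAMES[bit] if bit < len(SECBIT_NAMES) else str(bit)
        for bit in set_bits(value)
    ]


print(decode_caps("0000004000000000"))
print(decode_secbits(239))
```

Under these assumptions the mask decodes to bit 38 only (cap_perfmon), and 239 decodes to the three bits named in step 1 plus their ``_LOCKED`` variants and SECBIT_KEEP_CAPS_LOCKED, with SECBIT_KEEP_CAPS itself left clear.
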
+As a result, members of perf_users group have access to the privileged
+environment where they can use tools employing performance monitoring APIs
+governed by CAP_PERFMON Linux capability.
This specific access control management is only available to superuser
or root running processes with CAP_SETPCAP, CAP_SETFCAP [6]_
capabilities.
-perf_events/Perf unprivileged users
+Unprivileged users
-----------------------------------
-perf_events/Perf *scope* and *access* control for unprivileged processes
+perf_events *scope* and *access* control for unprivileged processes
is governed by perf_event_paranoid [2]_ setting:
-1:
@@ -166,7 +258,7 @@ is governed by perf_event_paranoid [2]_ setting:
perf_event_mlock_kb locking limit is imposed but ignored for
unprivileged processes with CAP_IPC_LOCK capability.
-perf_events/Perf resource control
+Resource control
---------------------------------
Open file descriptors
@@ -227,4 +319,7 @@ Bibliography
.. [10] `<http://man7.org/linux/man-pages/man5/acl.5.html>`_
.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
-
+.. [13] `<https://sites.google.com/site/fullycapable>`_
+.. [14] `<http://man7.org/linux/man-pages/man8/auditd.8.html>`_
+.. [15] `<https://man7.org/linux/man-pages/man8/sudo.8.html>`_
+.. [16] `<https://git.kernel.org/pub/scm/libs/libcap/libcap.git/>`_
diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst
new file mode 100644
index 000000000000..11de998bb480
--- /dev/null
+++ b/Documentation/admin-guide/perf/alibaba_pmu.rst
@@ -0,0 +1,100 @@
+=============================================================
+Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)
+=============================================================
+
+The Yitian 710, custom-built by Alibaba Group's chip development business,
+T-Head, implements uncore PMU for performance and functional debugging to
+facilitate system maintenance.
+
+DDR Sub-System Driveway (DRW) PMU Driver
+=========================================
+
+Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel
+is independent of others to service system memory requests. And one DDR5
+channel is split into two independent sub-channels. The DDR Sub-System Driveway
+implements separate PMUs for each sub-channel to monitor various performance
+metrics.
+
+The Driveway PMU devices are named as ali_drw_<sys_base_addr> with perf.
+For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two
+sub-channels of the same channel in die 0. And the PMU device of die 1 is
+prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.
+
+Each sub-channel has 36 PMU counters in total, which are classified into
+four groups:
+
+- Group 0: PMU Cycle Counter. This group has one pair of counters,
+  pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count
+  based on the DDRC core clock.
+
+- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used
+  to count the total number of accesses to either the eight bank groups in a
+  selected rank, or four ranks separately in the first 4 counters. The base
+  transfer unit is 64B.
+
+- Group 2: PMU Retry Counters. This group has 10 counters that are intended
+  to count the total number of retries for each type of uncorrectable error.
+
+- Group 3: PMU Common Counters. This group has 16 counters that are used
+  to count the common events.
+
+For now, the Driveway PMU driver only uses counters in group 0 and group 3.
+
+The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution
+for connecting an SoC application bus to DDR memory devices. The DDRCTL
+receives transactions via the Host Interface (HIF), which is custom-defined by Synopsys.
+These transactions are queued internally and scheduled for access while
+satisfying the SDRAM protocol timing requirements, transaction priorities, and
+dependencies between the transactions. The DDRCTL in turn issues commands on
+the DDR PHY Interface (DFI) to the PHY module, which launches and captures data
+to and from the SDRAM. The driveway PMUs have hardware logic to gather
+statistics and performance logging signals on HIF, DFI, etc.
+
+By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF
+interface, the bandwidth can be calculated. Example usage of counting memory
+data bandwidth::
+
+ perf stat \
+ -e ali_drw_21000/hif_wr/ \
+ -e ali_drw_21000/hif_rd/ \
+ -e ali_drw_21000/hif_rmw/ \
+ -e ali_drw_21000/cycle/ \
+ -e ali_drw_21080/hif_wr/ \
+ -e ali_drw_21080/hif_rd/ \
+ -e ali_drw_21080/hif_rmw/ \
+ -e ali_drw_21080/cycle/ \
+ -e ali_drw_23000/hif_wr/ \
+ -e ali_drw_23000/hif_rd/ \
+ -e ali_drw_23000/hif_rmw/ \
+ -e ali_drw_23000/cycle/ \
+ -e ali_drw_23080/hif_wr/ \
+ -e ali_drw_23080/hif_rd/ \
+ -e ali_drw_23080/hif_rmw/ \
+ -e ali_drw_23080/cycle/ \
+ -e ali_drw_25000/hif_wr/ \
+ -e ali_drw_25000/hif_rd/ \
+ -e ali_drw_25000/hif_rmw/ \
+ -e ali_drw_25000/cycle/ \
+ -e ali_drw_25080/hif_wr/ \
+ -e ali_drw_25080/hif_rd/ \
+ -e ali_drw_25080/hif_rmw/ \
+ -e ali_drw_25080/cycle/ \
+ -e ali_drw_27000/hif_wr/ \
+ -e ali_drw_27000/hif_rd/ \
+ -e ali_drw_27000/hif_rmw/ \
+ -e ali_drw_27000/cycle/ \
+ -e ali_drw_27080/hif_wr/ \
+ -e ali_drw_27080/hif_rd/ \
+ -e ali_drw_27080/hif_rmw/ \
+ -e ali_drw_27080/cycle/ -- sleep 10
+
+The average DRAM bandwidth can be calculated as follows:
+
+- Read Bandwidth = perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
+
+Here, DDRC_WIDTH = 64 bytes.
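
The two formulas above can be applied to the counts printed by ``perf stat`` as in the following minimal Python sketch. The function name ``dram_bandwidth`` and the sample numbers are hypothetical; in particular the DDRC frequency is an assumed value, since this document does not state it, and must be replaced with the real clock of the platform.

```python
DDRC_WIDTH = 64  # bytes per HIF command, as stated above


def dram_bandwidth(hif_rd, hif_wr, hif_rmw, cycles, ddrc_freq_hz):
    """Return (read, write) bandwidth in bytes/second for one sub-channel.

    hif_rd, hif_wr, hif_rmw and cycles are the event counts reported for
    one ali_drw_* PMU; ddrc_freq_hz is the DDRC core clock in Hz.
    """
    if cycles <= 0:
        raise ValueError("cycle count must be positive")
    read_bw = hif_rd * DDRC_WIDTH * ddrc_freq_hz / cycles
    write_bw = (hif_wr + hif_rmw) * DDRC_WIDTH * ddrc_freq_hz / cycles
    return read_bw, write_bw


# Made-up counts and an assumed 2 GHz DDRC clock, for illustration only.
read_bw, write_bw = dram_bandwidth(
    hif_rd=1000, hif_wr=500, hif_rmw=500,
    cycles=4_000_000_000, ddrc_freq_hz=2_000_000_000,
)
print(f"read: {read_bw} B/s, write: {write_bw} B/s")
```

Dividing the cycle count by the clock frequency gives the elapsed time, so the result is independent of how long ``perf stat`` ran. Summing the values over all ali_drw_* devices would give a system-wide figure.
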
+
+The current driver does not support sampling, so "perf record" is
+unsupported. Attaching to a task is also unsupported, as the events are all
+uncore.
diff --git a/Documentation/admin-guide/perf/arm-ccn.rst b/Documentation/admin-guide/perf/arm-ccn.rst
index 832b0c64023a..f62f7fe50eba 100644
--- a/Documentation/admin-guide/perf/arm-ccn.rst
+++ b/Documentation/admin-guide/perf/arm-ccn.rst
@@ -27,7 +27,7 @@ Crosspoint PMU events require "xp" (index), "bus" (bus number)
and "vc" (virtual channel ID).
Crosspoint watchpoint-based events (special "event" value 0xfe)
-require "xp" and "vc" as as above plus "port" (device port index),
+require "xp" and "vc" as above plus "port" (device port index),
"dir" (transmit/receive direction), comparator values ("cmp_l"
and "cmp_h") and "mask", being index of the comparator mask.
diff --git a/Documentation/admin-guide/perf/arm-cmn.rst b/Documentation/admin-guide/perf/arm-cmn.rst
new file mode 100644
index 000000000000..796e25b7027b
--- /dev/null
+++ b/Documentation/admin-guide/perf/arm-cmn.rst
@@ -0,0 +1,65 @@
+=============================
+Arm Coherent Mesh Network PMU
+=============================
+
+CMN-600 is a configurable mesh interconnect consisting of a rectangular
+grid of crosspoints (XPs), with each crosspoint supporting up to two
+device ports to which various AMBA CHI agents are attached.
+
+CMN implements a distributed PMU design as part of its debug and trace
+functionality. This consists of a local monitor (DTM) at every XP, which
+counts up to 4 event signals from the connected device nodes and/or the
+XP itself. Overflow from these local counters is accumulated in up to 8
+global counters implemented by the main controller (DTC), which provides
+overall PMU control and interrupts for global counter overflow.
+
+PMU events
+----------
+
+The PMU driver registers a single PMU device for the whole interconnect,
+see /sys/bus/event_source/devices/arm_cmn_0. Multi-chip systems may link
+more than one CMN together via external CCIX links - in this situation,
+each mesh counts its own events entirely independently, and additional
+PMU devices will be named arm_cmn_{1..n}.
+
+Most events are specified in a format based directly on the TRM
+definitions - "type" selects the respective node type, and "eventid" the
+event number. Some events require an additional occupancy ID, which is
+specified by "occupid".
+
+* Since RN-D nodes do not have any distinct events from RN-I nodes, they
+ are treated as the same type (0xa), and the common event templates are
+ named "rnid_*".
+
+* The cycle counter is treated as a synthetic event belonging to the DTC
+ node ("type" == 0x3, "eventid" is ignored).
+
+* XP events also encode the port and channel in the "eventid" field, to
+ match the underlying pmu_event0_id encoding for the pmu_event_sel
+ register. The event templates are named with prefixes to cover all
+ permutations.
+
+By default each event provides an aggregate count over all nodes of the
+given type. To target a specific node, "bynodeid" must be set to 1 and
+"nodeid" to the appropriate value derived from the CMN configuration
+(as defined in the "Node ID Mapping" section of the TRM).
+
+Watchpoints
+-----------
+
+The PMU can also count watchpoint events to monitor specific flit
+traffic. Watchpoints are treated as a synthetic event type, and like PMU
+events can be global or targeted with a particular XP's "nodeid" value.
+Since the watchpoint direction is otherwise implicit in the underlying
+register selection, separate events are provided for flit uploads and
+downloads.
+
+The flit match value and mask are passed in config1 and config2 ("val"
+and "mask" respectively). "wp_dev_sel", "wp_chn_sel", "wp_grp" and
+"wp_exclusive" are specified per the TRM definitions for dtm_wp_config0.
+Where a watchpoint needs to match fields from both match groups on the
+REQ or SNP channel, it can be specified as two events - one for each
+group - with the same nonzero "combine" value. The count for such a
+pair of combined events will be attributed to the primary match.
+Watchpoint events with a "combine" value of 0 are considered independent
+and will count individually.
diff --git a/Documentation/admin-guide/perf/hisi-pcie-pmu.rst b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst
new file mode 100644
index 000000000000..294ebbdb22af
--- /dev/null
+++ b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst
@@ -0,0 +1,106 @@
+================================================
+HiSilicon PCIe Performance Monitoring Unit (PMU)
+================================================
+
+On Hip09, the HiSilicon PCIe Performance Monitoring Unit (PMU) can monitor
+bandwidth, latency, bus utilization and buffer occupancy data of PCIe.
+
+Each PCIe Core has a PMU to monitor multiple Root Ports of this PCIe Core and
+all Endpoints downstream of these Root Ports.
+
+
+HiSilicon PCIe PMU driver
+=========================
+
+The PCIe PMU driver registers a perf PMU with the name of its sicl-id and PCIe
+Core id::
+
+  /sys/bus/event_source/devices/hisi_pcie<sicl>_<core>
+
+The PMU driver provides descriptions of available events and filter options in
+sysfs, see /sys/bus/event_source/devices/hisi_pcie<sicl>_<core>.
+
+The "format" directory describes all formats of the config (events) and config1
+(filter options) fields of the perf_event_attr structure. The "events" directory
+describes all documented events shown in perf list.
+
+The "identifier" sysfs file allows users to identify the version of the
+PMU hardware device.
+
+The "bus" sysfs file allows users to get the bus number of Root Ports
+monitored by PMU.
+
+Example usage of perf::
+
+ $# perf list
+ hisi_pcie0_0/rx_mwr_latency/ [kernel PMU event]
+ hisi_pcie0_0/rx_mwr_cnt/ [kernel PMU event]
+ ------------------------------------------
+
+ $# perf stat -e hisi_pcie0_0/rx_mwr_latency/
+ $# perf stat -e hisi_pcie0_0/rx_mwr_cnt/
+ $# perf stat -g -e hisi_pcie0_0/rx_mwr_latency/ -e hisi_pcie0_0/rx_mwr_cnt/
+
+The current driver does not support sampling, so "perf record" is unsupported.
+Attaching to a task is also unsupported for the PCIe PMU.
+
+Filter options
+--------------
+
+1. Target filter
+
+The PMU can monitor only the performance of traffic downstream of target Root
+Ports or downstream of a target Endpoint. The PCIe PMU driver provides the
+"port" and "bdf" interfaces for users; the two interfaces cannot be used at
+the same time.
+
+-port
+
+The "port" filter can be used in all PCIe PMU events. The target Root Port is
+selected by configuring the 16-bit bitmap "port". Multiple ports can be
+selected for AP-layer events, and only one port can be selected for
+TL/DL-layer events.
+
+For example, if target Root Port is 0000:00:00.0 (x8 lanes), bit0 of bitmap
+should be set, port=0x1; if target Root Port is 0000:00:04.0 (x4 lanes),
+bit8 is set, port=0x100; if these two Root Ports are both monitored, port=0x101.
+
+Example usage of perf::
+
+ $# perf stat -e hisi_pcie0_0/rx_mwr_latency,port=0x1/ sleep 5
+
+-bdf
+
+The "bdf" filter can only be used in bandwidth events. The target Endpoint is
+selected by configuring its BDF in "bdf". The counter only counts the bandwidth
+of messages requested by the target Endpoint.
+
+For example, "bdf=0x3900" means BDF of target Endpoint is 0000:39:00.0.
+
+Example usage of perf::
+
+ $# perf stat -e hisi_pcie0_0/rx_mrd_flux,bdf=0x3900/ sleep 5
+
+2. Trigger filter
+
+Event statistics start the first time the TLP length is greater or smaller
+than the trigger condition. You can set the trigger condition by writing
+"trig_len", and set the trigger mode by writing "trig_mode". This filter can
+only be used in bandwidth events.
+
+For example, "trig_len=4" means the trigger condition is 2^4 DW, "trig_mode=0"
+means statistics start when TLP length > trigger condition, and "trig_mode=1"
+means statistics start when TLP length < trigger condition.
+
+Example usage of perf::
+
+ $# perf stat -e hisi_pcie0_0/rx_mrd_flux,trig_len=0x4,trig_mode=1/ sleep 5
+
+3. Threshold filter
+
+The counter counts when the TLP length is within the specified range. You can
+set the threshold by writing "thr_len", and set the threshold mode by writing
+"thr_mode". This filter can only be used in bandwidth events.
+
+For example, "thr_len=4" means the threshold is 2^4 DW, "thr_mode=0" means
+the counter counts when TLP length >= threshold, and "thr_mode=1" means it
+counts when TLP length < threshold.
+
+Example usage of perf::
+
+ $# perf stat -e hisi_pcie0_0/rx_mrd_flux,thr_len=0x4,thr_mode=1/ sleep 5
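
The filter encodings described above ("port" as a bitmap, "trig_len" and "thr_len" as powers of two in DW) can be sketched in Python as follows. The helper names are hypothetical and the code only builds strings and numbers; it assumes a DW is the usual 4-byte PCIe doubleword and takes the bit positions (bit0 and bit8) from the example in the text, since this document does not give a general Root Port to bit mapping.

```python
DW_BYTES = 4  # assumption: one PCIe doubleword (DW) is 4 bytes


def port_bitmap(*bits):
    """Combine Root Port bit positions into the 16-bit "port" bitmap."""
    mask = 0
    for bit in bits:
        if not 0 <= bit < 16:
            raise ValueError("port bitmap is 16 bits wide")
        mask |= 1 << bit
    return mask


def length_in_bytes(exponent):
    """Convert a trig_len/thr_len value to bytes: 2^exponent DW."""
    return (1 << exponent) * DW_BYTES


def event(pmu, name, **filters):
    """Format an event specification in the pmu/name,key=value/ syntax."""
    options = "".join(f",{key}={value:#x}" for key, value in filters.items())
    return f"{pmu}/{name}{options}/"


# Both Root Ports from the example: bit0 and bit8 give port=0x101.
print(event("hisi_pcie0_0", "rx_mwr_latency", port=port_bitmap(0, 8)))
# thr_len=4 corresponds to 2^4 DW.
print(length_in_bytes(4), "bytes")
```

The resulting string can be passed to ``perf stat -e``. Keeping the bitmap arithmetic in one helper avoids hand-computing masks when several ports are monitored.
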
diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst
index 404a5c3d9d00..546979360513 100644
--- a/Documentation/admin-guide/perf/hisi-pmu.rst
+++ b/Documentation/admin-guide/perf/hisi-pmu.rst
@@ -53,6 +53,60 @@ Example usage of perf::
$# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5
+For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
+as PMU v1, but some new functions are added to the hardware.
+
+(a) L3C PMU supports filtering by core/thread within the cluster which can be
+specified as a bitmap::
+
+ $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5
+
+This will only count the operations from core/thread 0 and 1 in this cluster.
+
+(b) Tracetag allows the user to choose to count only read, write or atomic
+operations via the tt_req parameter in perf. The default value counts all
+operations. tt_req is 3 bits wide: 3'b100 represents read operations, 3'b101
+represents write operations, 3'b110 represents atomic store operations and
+3'b111 represents atomic non-store operations; other values are reserved::
+
+ $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5
+
+This will only count the read operations in this cluster.
+
+(c) Datasrc allows the user to check where the data comes from. It is 5 bits.
+Some important codes are as follows:
+
+- 5'b00001: comes from L3C in this die;
+- 5'b01000: comes from L3C in the cross-die;
+- 5'b01001: comes from L3C which is in another socket;
+- 5'b01110: comes from the local DDR;
+- 5'b01111: comes from the cross-die DDR;
+- 5'b10000: comes from cross-socket DDR;
+
+It is mainly helpful for finding out how close the data source is to the CPU
+cores. If datasrc_cfg is used on a multi-chip system, datasrc_skt shall be
+configured in the perf command::
+
+ $# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
+ hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5
+
+(d) Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
+contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
+clusters (SICL) containing multiple I/O clusters (ICLs). Each CCL/ICL in the
+SoC has a unique ID. Each ID is 11 bits wide and includes a 6-bit SCCL-ID and
+a 5-bit CCL/ICL-ID. For an I/O die, the ICL-ID values are as follows:
+
+- 5'b00000: I/O_MGMT_ICL;
+- 5'b00001: Network_ICL;
+- 5'b00011: HAC_ICL;
+- 5'b10000: PCIe_ICL;
+
+Users can configure IDs to count data coming from a specific CCL/ICL by setting
+srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by setting
+tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not
+check the bit when matching against the srcid_cmd/tgtid_cmd.
+
+If all of these options are disabled, the PMU works with the default values,
+which do not apply any filter condition or ID information, and returns
+the total counter values in the PMU counters.
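
The tt_core bitmap and the tt_req codes from items (a) and (b) can be assembled programmatically. The following Python sketch is illustrative only: ``core_bitmap`` and ``l3c_event`` are hypothetical helpers that format strings, and the dictionary keys are made-up labels for the four tt_req codes listed above.

```python
# tt_req codes as listed in item (b); other values are reserved.
TT_REQ = {
    "read": 0b100,
    "write": 0b101,
    "atomic_store": 0b110,
    "atomic_non_store": 0b111,
}


def core_bitmap(*cores):
    """Build the tt_core bitmap: one bit per core/thread in the cluster."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


def l3c_event(pmu, config, cores=(), request=None):
    """Format an L3C event with optional tt_core and tt_req filters."""
    options = [f"config={config:#04x}"]
    if cores:
        options.append(f"tt_core={core_bitmap(*cores):#x}")
    if request is not None:
        options.append(f"tt_req={TT_REQ[request]:#x}")
    return f"{pmu}/{','.join(options)}/"


print(l3c_event("hisi_sccl3_l3c0", 0x02, cores=(0, 1)))
print(l3c_event("hisi_sccl3_l3c0", 0x02, request="read"))
```

The two printed strings correspond to the tt_core=0x3 and tt_req=0x4 examples shown for items (a) and (b).
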
+
The current driver does not support sampling. So "perf record" is unsupported.
Also attach to a task is unsupported as the events are all uncore.
diff --git a/Documentation/admin-guide/perf/hns3-pmu.rst b/Documentation/admin-guide/perf/hns3-pmu.rst
new file mode 100644
index 000000000000..578407e487d6
--- /dev/null
+++ b/Documentation/admin-guide/perf/hns3-pmu.rst
@@ -0,0 +1,136 @@
+======================================
+HNS3 Performance Monitoring Unit (PMU)
+======================================
+
+HNS3 (HiSilicon network system 3) Performance Monitoring Unit (PMU) is an
+End Point device to collect performance statistics of HiSilicon SoC NIC.
+On Hip09, each SICL (Super I/O cluster) has one PMU device.
+
+HNS3 PMU supports collection of performance statistics such as bandwidth,
+latency, packet rate and interrupt rate.
+
+Each HNS3 PMU supports 8 hardware events.
+
+HNS3 PMU driver
+===============
+
+The HNS3 PMU driver registers a perf PMU with the name of its sicl id::
+
+ /sys/devices/hns3_pmu_sicl_<sicl_id>
+
+PMU driver provides description of available events, filter modes, format,
+identifier and cpumask in sysfs.
+
+The "events" directory describes the event code of all supported events
+shown in perf list.
+
+The "filtermode" directory describes the supported filter modes of each
+event.
+
+The "format" directory describes all formats of the config (events) and
+config1 (filter options) fields of the perf_event_attr structure.
+
+The "identifier" file shows version of PMU hardware device.
+
+The "bdf_min" and "bdf_max" files show the supported bdf range of each
+pmu device.
+
+The "hw_clk_freq" file shows the hardware clock frequency of each pmu
+device.
+
+Example usage of checking event code and subevent code::
+
+ $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_time
+ config=0x00204
+ $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_packet_num
+ config=0x10204
+
+Each performance statistic has a pair of events to get two values to
+calculate real performance data in userspace.
+
+Bits 0~15 of config (here 0x0204) are the true hardware event code. If
+two events have the same value in bits 0~15 of config, they form an
+event pair. Bit 16 of config indicates whether counter 0 or
+counter 1 of the hardware event is read.
+
+After getting the two values of an event pair in userspace, the formula
+to calculate real performance data is::
+
+ counter 0 / counter 1
+
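The config layout and the ratio described above can be illustrated with a short Python sketch. The function names are hypothetical, and the counter values in the last line are made up; only the bit layout (bits 0~15 event code, bit 16 counter select) and the ``counter 0 / counter 1`` formula come from the text.

```python
def split_config(config):
    """Split an hns3 PMU config value into (event_code, counter_index)."""
    event_code = config & 0xFFFF    # bits 0~15: hardware event code
    counter_index = (config >> 16) & 0x1  # bit 16: counter 0 or counter 1
    return event_code, counter_index


def is_event_pair(config_a, config_b):
    """Two configs form a pair if they share the event code but not the counter."""
    code_a, counter_a = split_config(config_a)
    code_b, counter_b = split_config(config_b)
    return code_a == code_b and counter_a != counter_b


def real_performance(counter0, counter1):
    """Apply the formula above: counter 0 / counter 1."""
    return counter0 / counter1


print(split_config(0x00204), split_config(0x10204))
print(is_event_pair(0x00204, 0x10204))
print(real_performance(10, 4))  # made-up counter values
```

With the two sysfs values shown earlier, 0x00204 and 0x10204 share event code 0x0204 and differ only in bit 16, so they are recognised as one event pair.
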
+Example usage of checking supported filter mode::
+
+ $# cat /sys/devices/hns3_pmu_sicl_0/filtermode/bw_ssu_rpu_byte_num
+ filter mode supported: global/port/port-tc/func/func-queue/
+
+Example usage of perf::
+
+ $# perf list
+ hns3_pmu_sicl_0/bw_ssu_rpu_byte_num/ [kernel PMU event]
+ hns3_pmu_sicl_0/bw_ssu_rpu_time/ [kernel PMU event]
+ ------------------------------------------
+
+ $# perf stat -g -e hns3_pmu_sicl_0/bw_ssu_rpu_byte_num,global=1/ -e hns3_pmu_sicl_0/bw_ssu_rpu_time,global=1/ -I 1000
+ or
+ $# perf stat -g -e hns3_pmu_sicl_0/config=0x00002,global=1/ -e hns3_pmu_sicl_0/config=0x10002,global=1/ -I 1000
+
+
+Filter modes
+--------------
+
+1. global mode
+
+PMU collects performance statistics for all HNS3 PCIe functions of IO DIE.
+Setting the "global" filter option to 1 enables this mode.
+
+Example usage of perf::
+
+ $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,global=1/ -I 1000
+
+2. port mode
+
+PMU collects performance statistics of one whole physical port. The port id
+is the same as the mac id. The "tc" filter option must be set to 0xF in this
+mode; here tc stands for traffic class.
+
+Example usage of perf::
+
+ $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0xF/ -I 1000
+
+3. port-tc mode
+
+PMU collects performance statistics of one tc of a physical port. The port id
+is the same as the mac id. The "tc" filter option must be set to 0 ~ 7 in this
+mode.
+
+Example usage of perf::
+
+ $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0/ -I 1000
+
+4. func mode
+
+PMU collects performance statistics of one PF/VF. The function id is the BDF
+of the PF/VF; its conversion formula is::
+
+  func = (bus << 8) + (device << 3) + (function)
+
+for example::
+
+  BDF       func
+  35:00.0   0x3500
+  35:00.1   0x3501
+  35:01.0   0x3508
+
+In this mode, the "queue" filter option must be set to 0xFFFF.
+
+Example usage of perf::
+
+ $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0xFFFF/ -I 1000
+
+5. func-queue mode
+
+PMU collects performance statistics of one queue of a PF/VF. The function id
+is the BDF of the PF/VF; the "queue" filter option must be set to the exact
+queue id of the function.
+
+Example usage of perf::
+
+ $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0/ -I 1000
+
+6. func-intr mode
+
+PMU collects performance statistics of one interrupt of a PF/VF. The function
+id is the BDF of the PF/VF; the "intr" filter option must be set to the exact
+interrupt id of the function.
+
+Example usage of perf::
+
+ $# perf stat -a -e hns3_pmu_sicl_0/config=0x00301,bdf=0x3500,intr=0/ -I 1000
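
The BDF to function id conversion given for func mode can be checked with a small Python sketch. ``bdf_to_func`` is a hypothetical helper; it assumes the ``bus:device.function`` notation with hexadecimal fields used in the table above and the usual PCI field widths (8-bit bus, 5-bit device, 3-bit function).

```python
def bdf_to_func(bdf):
    """Convert a "bus:device.function" string to the hns3 PMU function id.

    Implements func = (bus << 8) + (device << 3) + function.
    """
    bus_device, function = bdf.split(".")
    bus, device = bus_device.split(":")
    bus, device, function = int(bus, 16), int(device, 16), int(function, 16)
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError(f"BDF out of range: {bdf}")
    return (bus << 8) + (device << 3) + function


for bdf in ("35:00.0", "35:00.1", "35:01.0"):
    print(bdf, hex(bdf_to_func(bdf)))
```

The result is the value to pass as the ``bdf`` filter option in func, func-queue and func-intr modes.
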
diff --git a/Documentation/admin-guide/perf/imx-ddr.rst b/Documentation/admin-guide/perf/imx-ddr.rst
index 3726a10a03ba..90926d0fb8ec 100644
--- a/Documentation/admin-guide/perf/imx-ddr.rst
+++ b/Documentation/admin-guide/perf/imx-ddr.rst
@@ -4,7 +4,7 @@ Freescale i.MX8 DDR Performance Monitoring Unit (PMU)
There are no performance counters inside the DRAM controller, so performance
signals are brought out to the edge of the controller where a set of 4 x 32 bit
-counters is implemented. This is controlled by the CSV modes programed in counter
+counters is implemented. This is controlled by the CSV modes programmed in counter
control register which causes a large number of PERF signals to be generated.
Selection of the value for each counter is done via the config registers. There
@@ -43,7 +43,8 @@ value 1 for supported.
AXI_ID and AXI_MASKING are mapped on DPCR1 register in performance counter.
When non-masked bits are matching corresponding AXI_ID bits then counter is
- incremented. Perf counter is incremented if
+ incremented. Perf counter is incremented if::
+
AxID && AXI_MASKING == AXI_ID && AXI_MASKING
This filter doesn't support filter different AXI ID for axid-read and axid-write
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 47c99f40cc16..793e1970bc05 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -8,10 +8,14 @@ Performance monitor support
:maxdepth: 1
hisi-pmu
+ hisi-pcie-pmu
+ hns3-pmu
imx-ddr
qcom_l2_pmu
qcom_l3_pmu
arm-ccn
+ arm-cmn
xgene-pmu
arm_dsu_pmu
thunderx2-pmu
+ alibaba_pmu
diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
new file mode 100644
index 000000000000..8f3d30c5a0d8
--- /dev/null
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -0,0 +1,483 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===============================================
+``amd-pstate`` CPU Performance Scaling Driver
+===============================================
+
+:Copyright: |copy| 2021 Advanced Micro Devices, Inc.
+
+:Author: Huang Rui <ray.huang@amd.com>
+
+
+Introduction
+===================
+
+``amd-pstate`` is the AMD CPU performance scaling driver that introduces a
+new CPU frequency control mechanism on modern AMD APU and CPU series in the
+Linux kernel. The new mechanism is based on Collaborative Processor
+Performance Control (CPPC), which provides finer-grained frequency management
+than legacy ACPI hardware P-States. Current AMD CPU/APU platforms use
+the ACPI P-states driver to manage CPU frequency and clocks, switching
+among only 3 P-states. CPPC replaces the ACPI P-states controls and provides a
+flexible, low-latency interface for the Linux kernel to directly
+communicate performance hints to the hardware.
+
+``amd-pstate`` leverages the Linux kernel governors such as ``schedutil``,
+``ondemand``, etc. to manage the performance hints which are provided by
+CPPC hardware functionality that internally follows the hardware
+specification (for details refer to AMD64 Architecture Programmer's Manual
+Volume 2: System Programming [1]_). Currently, ``amd-pstate`` supports basic
+frequency control function according to kernel governors on some of the
+Zen2 and Zen3 processors, and we will implement more AMD specific functions
+in future after we verify them on the hardware and SBIOS.
+
+
+AMD CPPC Overview
+=======================
+
+Collaborative Processor Performance Control (CPPC) interface enumerates a
+continuous, abstract, and unit-less performance value in a scale that is
+not tied to a specific performance state / frequency. This is an ACPI
+standard [2]_ with which software can specify application performance goals
+and hints as a relative target to the infrastructure limits. AMD processors
+provide the low latency register model (MSR) instead of an AML code
+interpreter for performance adjustments. ``amd-pstate`` will initialize a
+``struct cpufreq_driver`` instance, ``amd_pstate_driver``, with the callbacks
+to manage each performance update behavior. ::
+
+ Highest Perf ------>+-----------------------+ +-----------------------+
+ | | | |
+ | | | |
+ | | Max Perf ---->| |
+ | | | |
+ | | | |
+ Nominal Perf ------>+-----------------------+ +-----------------------+
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | Desired Perf ---->| |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ Lowest non- | | | |
+ linear perf ------>+-----------------------+ +-----------------------+
+ | | | |
+ | | Lowest perf ---->| |
+ | | | |
+ Lowest perf ------>+-----------------------+ +-----------------------+
+ | | | |
+ | | | |
+ | | | |
+ 0 ------>+-----------------------+ +-----------------------+
+
+ AMD P-States Performance Scale
+
+
+.. _perf_cap:
+
+AMD CPPC Performance Capability
+--------------------------------
+
+Highest Performance (RO)
+.........................
+
+This is the absolute maximum performance an individual processor may reach,
+assuming ideal conditions. This performance level may not be sustainable
+for long durations and may only be achievable if other platform components
+are in a specific state; for example, it may require other processors to be in
+an idle state. This would be equivalent to the highest frequencies
+supported by the processor.
+
+Nominal (Guaranteed) Performance (RO)
+......................................
+
+This is the maximum sustained performance level of the processor, assuming
+ideal operating conditions. In the absence of an external constraint (power,
+thermal, etc.), this is the performance level the processor is expected to
+be able to maintain continuously. All cores/processors are expected to be
+able to sustain their nominal performance state simultaneously.
+
+Lowest non-linear Performance (RO)
+...................................
+
+This is the lowest performance level at which nonlinear power savings are
+achieved, for example, due to the combined effects of voltage and frequency
+scaling. Above this threshold, lower performance levels should generally be
+more energy efficient than higher performance levels. This register
+effectively conveys the most efficient performance level to ``amd-pstate``.
+
+Lowest Performance (RO)
+........................
+
+This is the absolute lowest performance level of the processor. Selecting a
+performance level lower than the lowest nonlinear performance level may
+cause an efficiency penalty but should reduce the instantaneous power
+consumption of the processor.
+
+AMD CPPC Performance Control
+------------------------------
+
+``amd-pstate`` passes performance goals through these registers. The
+registers drive the behavior of the desired performance target.
+
+Minimum requested performance (RW)
+...................................
+
+``amd-pstate`` specifies the minimum allowed performance level.
+
+Maximum requested performance (RW)
+...................................
+
+``amd-pstate`` specifies a limit on the maximum performance that is expected
+to be supplied by the hardware.
+
+Desired performance target (RW)
+...................................
+
+``amd-pstate`` specifies a desired target in the CPPC performance scale as
+a relative number. This can be expressed as a percentage of nominal
+performance (infrastructure max). Below the nominal sustained performance
+level, desired performance expresses the average performance level of the
+processor subject to hardware. Above the nominal performance level,
+the processor must provide at least the nominal performance requested and go
+higher if current operating conditions allow.
+
+Energy Performance Preference (EPP) (RW)
+.........................................
+
+This attribute provides a hint to the hardware about whether software wants to
+bias toward performance (0x0) or energy efficiency (0xff).
+
+
+Key Governors Support
+=======================
+
+``amd-pstate`` can be used with all the (generic) scaling governors listed
+by the ``scaling_available_governors`` policy attribute in ``sysfs``. It
+is responsible for the configuration of policy objects corresponding to
+CPUs and provides the ``CPUFreq`` core (and the scaling governors attached
+to the policy objects) with accurate information on the maximum and minimum
+operating frequencies supported by the hardware. Users can check the
+``scaling_cur_freq`` attribute, whose value comes from the ``CPUFreq`` core.
+
+``amd-pstate`` mainly supports ``schedutil`` and ``ondemand`` for dynamic
+frequency control. The driver is tuned primarily for ``schedutil`` working
+with the CFS CPU scheduler. ``amd-pstate`` registers the ``adjust_perf``
+callback to implement performance update behavior similar to CPPC. The
+callback is set up by ``sugov_start``, which populates the CPU's
+``update_util_data`` pointer to assign ``sugov_update_single_perf`` as the
+utilization update callback function in the CPU scheduler. The CPU scheduler
+calls ``cpufreq_update_util`` and assigns the target performance according
+to the ``struct sugov_cpu`` that the utilization update belongs to.
+Then, ``amd-pstate`` updates the desired performance according to the value
+assigned by the CPU scheduler.
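+
+The flow above ends with the driver turning a scheduler request into a
+desired performance value. The following Python model is only a sketch of
+that idea and is not taken from the driver source: the function names, the
+rounding, and the choice to floor the minimum at the lowest non-linear
+performance level are all assumptions made for illustration.
+
```python
def scale_to_perf(target, capacity, highest_perf):
    """Map a request in [0, capacity] onto [0, highest_perf], rounding up."""
    if target >= capacity:
        return highest_perf
    # Ceiling division written with floor division on negated operands.
    return -(-highest_perf * target // capacity)


def pick_desired_perf(target, min_request, capacity, highest_perf,
                      lowest_nonlinear_perf):
    """Return (min_perf, des_perf, max_perf) for one utilization update.

    Hypothetical model: the minimum is never allowed below the lowest
    non-linear level, and the desired value is clamped to [min, max].
    """
    des_perf = scale_to_perf(target, capacity, highest_perf)
    min_perf = max(scale_to_perf(min_request, capacity, highest_perf),
                   lowest_nonlinear_perf)
    max_perf = max(highest_perf, min_perf)
    des_perf = min(max(des_perf, min_perf), max_perf)
    return min_perf, des_perf, max_perf


if __name__ == "__main__":
    # A half-capacity request on a CPU with highest perf 166 and
    # lowest non-linear perf 39 (values borrowed from the cpupower sample).
    print(pick_desired_perf(512, 0, 1024, 166, 39))
```
+
+In this model an idle CPU (a request of 0) still gets a desired value equal
+to the lowest non-linear level, while a full-capacity request maps to the
+highest performance level.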
+
+.. _processor_support:
+
+Processor Support
+=======================
+
+The ``amd-pstate`` initialization will fail if the ``_CPC`` entry in the ACPI
+SBIOS does not exist in the detected processor. It uses ``acpi_cpc_valid``
+to check the existence of ``_CPC``. All Zen based processors support the legacy
+ACPI hardware P-States function, so when ``amd-pstate`` fails initialization,
+the kernel will fall back to initialize the ``acpi-cpufreq`` driver.
+
+There are two types of hardware implementations for ``amd-pstate``: one is
+`Full MSR Support <perf_cap_>`_ and the other is `Shared Memory Support
+<perf_cap_>`_. The :c:macro:`X86_FEATURE_CPPC` feature flag distinguishes
+the two types. (For details, refer to the Processor Programming
+Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors [3]_.)
+``amd-pstate`` registers different ``static_call`` instances for the different
+hardware implementations.
+
+Currently, some of the Zen2 and Zen3 processors support ``amd-pstate``. In the
+future, it will be supported on more and more AMD processors.
+
+Full MSR Support
+-----------------
+
+Some new Zen3 processors such as Cezanne provide the MSR registers directly
+when the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is set.
+``amd-pstate`` can handle the MSR registers to implement the fast switch
+function in ``CPUFreq``, which can reduce the latency of frequency control in
+interrupt context. The functions with a ``pstate_xxx`` prefix represent the
+operations on MSR registers.
+
+Shared Memory Support
+----------------------
+
+If the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is not set, the
+processor supports the shared memory solution. In this case, ``amd-pstate``
+uses the ``cppc_acpi`` helper methods to implement the callback functions
+that are defined on ``static_call``. The functions with the ``cppc_xxx`` prefix
+represent the operations of ACPI CPPC helpers for the shared memory solution.
+
+
+AMD P-States and ACPI hardware P-States can always both be supported in one
+processor. However, AMD P-States has the higher priority: if it is enabled
+with :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, the processor
+responds to requests from AMD P-States.
+
+
+User Space Interface in ``sysfs``
+==================================
+
+``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
+control its functionality at the system level. They are located in the
+``/sys/devices/system/cpu/cpufreq/policyX/`` directory and affect all CPUs. ::
+
+ root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf
+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq
+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq
+
+
+``amd_pstate_highest_perf / amd_pstate_max_freq``
+
+Maximum CPPC performance and CPU frequency that the driver is allowed to
+set, in percent of the maximum supported CPPC performance level (the highest
+performance supported in `AMD CPPC Performance Capability <perf_cap_>`_).
+In some ASICs, the highest CPPC performance is not the one in the ``_CPC``
+table, so we need to expose it to sysfs. If boost is not active, but
+still supported, this maximum frequency will be larger than the one in
+``cpuinfo``.
+This attribute is read-only.
+
+``amd_pstate_lowest_nonlinear_freq``
+
+The lowest non-linear CPPC CPU frequency that the driver is allowed to set,
+in percent of the maximum supported CPPC performance level. (Please see the
+lowest non-linear performance in `AMD CPPC Performance Capability
+<perf_cap_>`_.)
+This attribute is read-only.
+
+Other performance and frequency values can be read back from
+``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
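+
+The attributes listed above can also be collected from a script. The
+following is a minimal sketch, assuming each attribute file holds a single
+decimal integer; the helper name ``read_amd_pstate_attrs`` is hypothetical
+and attributes missing on a given kernel are simply skipped.
+
```python
from pathlib import Path

# Attribute names as shown in the directory listing above.
AMD_PSTATE_ATTRS = (
    "amd_pstate_highest_perf",
    "amd_pstate_lowest_nonlinear_freq",
    "amd_pstate_max_freq",
)


def read_amd_pstate_attrs(policy_dir):
    """Return {attribute: int value} for the attributes present in policy_dir."""
    values = {}
    for name in AMD_PSTATE_ATTRS:
        path = Path(policy_dir) / name
        if path.is_file():
            # Assumption: the file contains one decimal integer and a newline.
            values[name] = int(path.read_text().strip())
    return values


if __name__ == "__main__":
    policy0 = "/sys/devices/system/cpu/cpufreq/policy0"
    for name, value in read_amd_pstate_attrs(policy0).items():
        print(f"{name}: {value}")
```
+
+On a system where ``amd-pstate`` is not the active driver, the function
+returns an empty dictionary and the script prints nothing.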
+
+
+``amd-pstate`` vs ``acpi-cpufreq``
+======================================
+
+On the majority of AMD platforms supported by ``acpi-cpufreq``, the ACPI tables
+provided by the platform firmware are used for CPU performance scaling, but
+they only provide 3 P-states on AMD processors.
+However, on modern AMD APU and CPU series, hardware provides the Collaborative
+Processor Performance Control according to the ACPI protocol and customizes this
+for AMD platforms. It offers fine-grained and continuous frequency ranges
+instead of the legacy hardware P-states. ``amd-pstate`` is the kernel
+module which supports the new AMD P-States mechanism on most of the future AMD
+platforms. The AMD P-States mechanism is the more performant and
+energy-efficient frequency management method on AMD processors.
+
+Kernel Module Options for ``amd-pstate``
+=========================================
+
+.. _shared_mem:
+
+``shared_mem``
+Use the ``shared_mem`` module parameter to enable related processors manually
+with **amd_pstate.shared_mem=1**.
+Due to a performance issue on the processors with `Shared Memory Support
+<perf_cap_>`_, we disable it presently and will re-enable this by default
+once we address the performance issue with this solution.
+
+To check whether the current processor is using `Full MSR Support <perf_cap_>`_
+or `Shared Memory Support <perf_cap_>`_ : ::
+
+ ray@hr-test1:~$ lscpu | grep cppc
+ Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
+
+If the CPU flags have ``cppc``, then this processor supports `Full MSR Support
+<perf_cap_>`_. Otherwise, it supports `Shared Memory Support <perf_cap_>`_.
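+
+The same check can be scripted. The sketch below is an illustration only: it
+assumes the flags appear on a line whose key is ``flags`` (as in
+``/proc/cpuinfo``) or ``Flags`` (as in ``lscpu``), and the helper name
+``cpu_has_flag`` is hypothetical. It matches whole flag names, so a flag that
+merely contains ``cppc`` as a substring does not count.
+
```python
import os


def cpu_has_flag(cpuinfo_text, flag="cppc"):
    """Return True if any 'flags' line in cpuinfo_text lists the given flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
        if cpu_has_flag(text):
            print("cppc flag present: Full MSR Support")
        else:
            print("cppc flag absent: no Full MSR Support")
```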
+
+
+``cpupower`` tool support for ``amd-pstate``
+===============================================
+
+``amd-pstate`` is supported by the ``cpupower`` tool, which can be used to dump
+frequency information. Development is in progress to support more and more
+operations for the new ``amd-pstate`` module with this tool. ::
+
+ root@hr-test1:/home/ray# cpupower frequency-info
+ analyzing CPU 0:
+ driver: amd-pstate
+ CPUs which run at the same hardware frequency: 0
+ CPUs which need to have their frequency coordinated by software: 0
+ maximum transition latency: 131 us
+ hardware limits: 400 MHz - 4.68 GHz
+ available cpufreq governors: ondemand conservative powersave userspace performance schedutil
+ current policy: frequency should be within 400 MHz and 4.68 GHz.
+ The governor "schedutil" may decide which speed to use
+ within this range.
+ current CPU frequency: Unable to call hardware
+ current CPU frequency: 4.02 GHz (asserted by call to kernel)
+ boost state support:
+ Supported: yes
+ Active: yes
+ AMD PSTATE Highest Performance: 166. Maximum Frequency: 4.68 GHz.
+ AMD PSTATE Nominal Performance: 117. Nominal Frequency: 3.30 GHz.
+ AMD PSTATE Lowest Non-linear Performance: 39. Lowest Non-linear Frequency: 1.10 GHz.
+ AMD PSTATE Lowest Performance: 15. Lowest Frequency: 400 MHz.
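+
+The perf/frequency pairs in the output above are close to a simple linear
+scaling around the nominal point. The sketch below only illustrates that
+arithmetic and is not the driver's implementation; the function name
+``perf_to_freq_mhz`` is hypothetical and the linear relationship is an
+assumption inferred from the sample values.
+
```python
def perf_to_freq_mhz(perf, nominal_perf, nominal_freq_mhz):
    """Scale a CPPC perf value to MHz, assuming frequency is proportional to perf."""
    if nominal_perf <= 0:
        raise ValueError("nominal_perf must be positive")
    return perf * nominal_freq_mhz / nominal_perf


if __name__ == "__main__":
    # Values from the cpupower sample above: nominal perf 117 at 3.30 GHz.
    for name, perf in (("highest", 166), ("lowest non-linear", 39), ("lowest", 15)):
        print(f"{name}: {perf_to_freq_mhz(perf, 117, 3300):.0f} MHz")
```
+
+With the sample values, a perf of 166 maps to roughly 4682 MHz and 39 maps to
+1100 MHz, in line with the reported 4.68 GHz and 1.10 GHz. The lowest perf
+level (15) maps to about 423 MHz while ``cpupower`` reports 400 MHz, so the
+proportional model is only approximate at the low end.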
+
+
+Diagnostics and Tuning
+=======================
+
+Trace Events
+--------------
+
+There are two static trace events that can be used for ``amd-pstate``
+diagnostics. One of them is the ``cpu_frequency`` trace event generally used
+by ``CPUFreq``, and the other one is the ``amd_pstate_perf`` trace event
+specific to ``amd-pstate``. The following sequence of shell commands can
+be used to enable them and see their output (if the kernel is
+configured to support event tracing). ::
+
+ root@hr-test1:/home/ray# cd /sys/kernel/tracing/
+ root@hr-test1:/sys/kernel/tracing# echo 1 > events/amd_cpu/enable
+ root@hr-test1:/sys/kernel/tracing# cat trace
+ # tracer: nop
+ #
+ # entries-in-buffer/entries-written: 47827/42233061 #P:2
+ #
+ # _-----=> irqs-off
+ # / _----=> need-resched
+ # | / _---=> hardirq/softirq
+ # || / _--=> preempt-depth
+ # ||| / delay
+ # TASK-PID CPU# |||| TIMESTAMP FUNCTION
+ # | | | |||| | |
+ <idle>-0 [015] dN... 4995.979886: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=15 changed=false fast_switch=true
+ <idle>-0 [007] d.h.. 4995.979893: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
+ cat-2161 [000] d.... 4995.980841: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=0 changed=false fast_switch=true
+ sshd-2125 [004] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=4 changed=false fast_switch=true
+ <idle>-0 [007] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
+ <idle>-0 [003] d.s.. 4995.980971: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=3 changed=false fast_switch=true
+ <idle>-0 [011] d.s.. 4995.980996: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=11 changed=false fast_switch=true
+
+The ``cpu_frequency`` trace event will be triggered either by the ``schedutil`` scaling
+governor (for the policies it is attached to), or by the ``CPUFreq`` core (for the
+policies with other scaling governors).
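+
+The ``amd_pstate_perf`` lines in the trace output above consist of
+``key=value`` pairs, which makes them easy to post-process. The following is
+a minimal parsing sketch based only on the sample lines shown; the helper
+name ``parse_amd_pstate_perf`` is hypothetical and other kernel versions may
+emit different fields.
+
```python
import re

# Matches "key=value" pairs such as "amd_min_perf=85" or "changed=false".
_FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_amd_pstate_perf(line):
    """Return the fields of an amd_pstate_perf trace line as a dict, or None."""
    _, found, payload = line.partition("amd_pstate_perf:")
    if not found:
        return None
    fields = {}
    for key, raw in _FIELD_RE.findall(payload):
        if raw in ("true", "false"):
            fields[key] = raw == "true"
        elif raw.isdigit():
            fields[key] = int(raw)
        else:
            fields[key] = raw
    return fields


if __name__ == "__main__":
    sample = ("<idle>-0 [015] dN... 4995.979886: amd_pstate_perf: "
              "amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 "
              "cpu_id=15 changed=false fast_switch=true")
    print(parse_amd_pstate_perf(sample))
```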
+
+
+Tracer Tool
+-------------
+
+``amd_pstate_trace.py`` can record and parse ``amd-pstate`` trace logs, then
+generate performance plots. This utility can be used to debug and tune the
+performance of the ``amd-pstate`` driver. The tracer tool needs to import the
+intel pstate tracer.
+
+The tracer tool is located in ``linux/tools/power/x86/amd_pstate_tracer``. It
+can be used in two ways. If a trace file is available, parse it directly
+with the command ::
+
+ ./amd_pstate_trace.py [-c cpus] -t <trace_file> -n <test_name>
+
+Or generate a trace file with root privileges, then parse and plot it with the command ::
+
+ sudo ./amd_pstate_trace.py [-c cpus] -n <test_name> -i <interval> [-m kbytes]
+
+The test results can be found in ``results/test_name``. The following is an
+example of part of the output. ::
+
+ common_cpu common_secs common_usecs min_perf des_perf max_perf freq mperf apef tsc load duration_ms sample_num elapsed_time common_comm
+ CPU_005 712 116384 39 49 166 0.7565 9645075 2214891 38431470 25.1 11.646 469 2.496 kworker/5:0-40
+ CPU_006 712 116408 39 49 166 0.6769 8950227 1839034 37192089 24.06 11.272 470 2.496 kworker/6:0-1264
+
+Unit Tests for amd-pstate
+-------------------------
+
+``amd-pstate-ut`` is a test module for testing the ``amd-pstate`` driver.
+
+ * It can help all users to verify their processor support (SBIOS/Firmware or Hardware).
+
+ * The kernel gets a basic functional test to avoid regressions during updates.
+
+ * More functional or performance tests can be introduced to align the results, which will benefit power and performance scale optimization.
+
+1. Test case descriptions
+
+    +---------+--------------------------------+------------------------------------------------------------------------------------+
+    | Index   | Functions                      | Description                                                                        |
+    +=========+================================+====================================================================================+
+    | 0       | amd_pstate_ut_acpi_cpc_valid   || Check whether the _CPC object is present in SBIOS.                                |
+    |         |                                ||                                                                                   |
+    |         |                                || For details, refer to `Processor Support <processor_support_>`_.                  |
+    +---------+--------------------------------+------------------------------------------------------------------------------------+
+    | 1       | amd_pstate_ut_check_enabled    || Check whether AMD P-State is enabled.                                             |
+    |         |                                ||                                                                                   |
+    |         |                                || AMD P-States and ACPI hardware P-States can always be supported in one processor. |
+    |         |                                |  But AMD P-States has the higher priority and if it is enabled with                |
+    |         |                                |  :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, it will respond to the     |
+    |         |                                |  request from AMD P-States.                                                        |
+    +---------+--------------------------------+------------------------------------------------------------------------------------+
+    | 2       | amd_pstate_ut_check_perf       || Check whether each performance value is reasonable:                               |
+    |         |                                || highest_perf >= nominal_perf > lowest_nonlinear_perf > lowest_perf > 0.           |
+    +---------+--------------------------------+------------------------------------------------------------------------------------+
+    | 3       | amd_pstate_ut_check_freq       || Check whether each frequency value, and the max freq when boost mode is           |
+    |         |                                |  supported, is reasonable:                                                         |
+    |         |                                || max_freq >= nominal_freq > lowest_nonlinear_freq > min_freq > 0                   |
+    |         |                                || If boost is not active but supported, this maximum frequency will be larger than  |
+    |         |                                |  the one in ``cpuinfo``.                                                           |
+    +---------+--------------------------------+------------------------------------------------------------------------------------+
+
+#. How to execute the tests
+
+ The tests are implemented as a test module in the kselftest framework:
+ the amd-pstate-ut module is tied into kselftest. (For
+ details, refer to Linux Kernel Selftests [4]_.)
+
+ 1. Build
+
+ + enable the :c:macro:`CONFIG_X86_AMD_PSTATE` configuration option.
+ + set the :c:macro:`CONFIG_X86_AMD_PSTATE_UT` configuration option to M.
+ + build the kernel
+ + build the selftests ::
+
+ $ cd linux
+ $ make -C tools/testing/selftests
+
+ #. Installation & Steps ::
+
+ $ make -C tools/testing/selftests install INSTALL_PATH=~/kselftest
+ $ sudo ./kselftest/run_kselftest.sh -c amd-pstate
+ TAP version 13
+ 1..1
+ # selftests: amd-pstate: amd-pstate-ut.sh
+ # amd-pstate-ut: ok
+ ok 1 selftests: amd-pstate: amd-pstate-ut.sh
+
+ #. Results ::
+
+ $ dmesg | grep "amd_pstate_ut" | tee log.txt
+ [12977.570663] amd_pstate_ut: 1 amd_pstate_ut_acpi_cpc_valid success!
+ [12977.570673] amd_pstate_ut: 2 amd_pstate_ut_check_enabled success!
+ [12977.571207] amd_pstate_ut: 3 amd_pstate_ut_check_perf success!
+ [12977.571212] amd_pstate_ut: 4 amd_pstate_ut_check_freq success!
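+
+The ordering constraints listed for the perf and frequency test cases in the
+table above can be stated compactly. The following sketch is only an
+illustration of those inequalities, not the code of the ``amd-pstate-ut``
+module; the function name ``values_in_order`` is hypothetical.
+
```python
def values_in_order(top, nominal, lowest_nonlinear, lowest):
    """Return True if top >= nominal > lowest_nonlinear > lowest > 0.

    The same relation is documented for perf values
    (highest_perf, nominal_perf, lowest_nonlinear_perf, lowest_perf)
    and for frequencies
    (max_freq, nominal_freq, lowest_nonlinear_freq, min_freq).
    """
    return top >= nominal > lowest_nonlinear > lowest > 0


if __name__ == "__main__":
    # Perf values and frequencies (MHz) from the cpupower sample output.
    print(values_in_order(166, 117, 39, 15))
    print(values_in_order(4680, 3300, 1100, 400))
```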
+
+Reference
+===========
+
+.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming,
+ https://www.amd.com/system/files/TechDocs/24593.pdf
+
+.. [2] Advanced Configuration and Power Interface Specification,
+ https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf
+
+.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors,
+ https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip
+
+.. [4] Linux Kernel Selftests,
+ https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 0c74a7784964..6adb7988e0eb 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -1,7 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
-.. |struct cpufreq_policy| replace:: :c:type:`struct cpufreq_policy <cpufreq_policy>`
.. |intel_pstate| replace:: :doc:`intel_pstate <intel_pstate>`
=======================
@@ -92,16 +91,16 @@ control the P-state of multiple CPUs at the same time and writing to it affects
all of those CPUs simultaneously.
Sets of CPUs sharing hardware P-state control interfaces are represented by
-``CPUFreq`` as |struct cpufreq_policy| objects. For consistency,
-|struct cpufreq_policy| is also used when there is only one CPU in the given
+``CPUFreq`` as struct cpufreq_policy objects. For consistency,
+struct cpufreq_policy is also used when there is only one CPU in the given
set.
-The ``CPUFreq`` core maintains a pointer to a |struct cpufreq_policy| object for
+The ``CPUFreq`` core maintains a pointer to a struct cpufreq_policy object for
every CPU in the system, including CPUs that are currently offline. If multiple
CPUs share the same hardware P-state control interface, all of the pointers
-corresponding to them point to the same |struct cpufreq_policy| object.
+corresponding to them point to the same struct cpufreq_policy object.
-``CPUFreq`` uses |struct cpufreq_policy| as its basic data type and the design
+``CPUFreq`` uses struct cpufreq_policy as its basic data type and the design
of its user space interface is based on the policy concept.
@@ -147,9 +146,9 @@ CPUs in it.
The next major initialization step for a new policy object is to attach a
scaling governor to it (to begin with, that is the default scaling governor
-determined by the kernel configuration, but it may be changed later
-via ``sysfs``). First, a pointer to the new policy object is passed to the
-governor's ``->init()`` callback which is expected to initialize all of the
+determined by the kernel command line or configuration, but it may be changed
+later via ``sysfs``). First, a pointer to the new policy object is passed to
+the governor's ``->init()`` callback which is expected to initialize all of the
data structures necessary to handle the given policy and, possibly, to add
a governor ``sysfs`` interface to it. Next, the governor is started by
invoking its ``->start()`` callback.
diff --git a/Documentation/admin-guide/pm/cpufreq_drivers.rst b/Documentation/admin-guide/pm/cpufreq_drivers.rst
new file mode 100644
index 000000000000..9a134ae65803
--- /dev/null
+++ b/Documentation/admin-guide/pm/cpufreq_drivers.rst
@@ -0,0 +1,274 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================================================
+Legacy Documentation of CPU Performance Scaling Drivers
+=======================================================
+
+Included below are historic documents describing assorted
+:doc:`CPU performance scaling <cpufreq>` drivers. They are reproduced verbatim,
+with the original white space formatting and indentation preserved, except for
+the added leading space character in every line of text.
+
+
+AMD PowerNow! Drivers
+=====================
+
+::
+
+ PowerNow! and Cool'n'Quiet are AMD names for frequency
+ management capabilities in AMD processors. As the hardware
+ implementation changes in new generations of the processors,
+ there is a different cpu-freq driver for each generation.
+
+ Note that the driver's will not load on the "wrong" hardware,
+ so it is safe to try each driver in turn when in doubt as to
+ which is the correct driver.
+
+ Note that the functionality to change frequency (and voltage)
+ is not available in all processors. The drivers will refuse
+ to load on processors without this capability. The capability
+ is detected with the cpuid instruction.
+
+ The drivers use BIOS supplied tables to obtain frequency and
+ voltage information appropriate for a particular platform.
+ Frequency transitions will be unavailable if the BIOS does
+ not supply these tables.
+
+ 6th Generation: powernow-k6
+
+ 7th Generation: powernow-k7: Athlon, Duron, Geode.
+
+ 8th Generation: powernow-k8: Athlon, Athlon 64, Opteron, Sempron.
+ Documentation on this functionality in 8th generation processors
+ is available in the "BIOS and Kernel Developer's Guide", publication
+ 26094, in chapter 9, available for download from www.amd.com.
+
+ BIOS supplied data, for powernow-k7 and for powernow-k8, may be
+ from either the PSB table or from ACPI objects. The ACPI support
+ is only available if the kernel config sets CONFIG_ACPI_PROCESSOR.
+ The powernow-k8 driver will attempt to use ACPI if so configured,
+ and fall back to PST if that fails.
+ The powernow-k7 driver will try to use the PSB support first, and
+ fall back to ACPI if the PSB support fails. A module parameter,
+ acpi_force, is provided to force ACPI support to be used instead
+ of PSB support.
+
+
+``cpufreq-nforce2``
+===================
+
+::
+
+ The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 platforms.
+
+ This works better than on other platforms, because the FSB of the CPU
+ can be controlled independently from the PCI/AGP clock.
+
+ The module has two options:
+
+ fid: multiplier * 10 (for example 8.5 = 85)
+ min_fsb: minimum FSB
+
+ If not set, fid is calculated from the current CPU speed and the FSB.
+ min_fsb defaults to FSB at boot time - 50 MHz.
+
+ IMPORTANT: The available range is limited downwards!
+ Also the minimum available FSB can differ, for systems
+ booting with 200 MHz, 150 should always work.
+
+
+``pcc-cpufreq``
+===============
+
+::
+
+ /*
+ * pcc-cpufreq.txt - PCC interface documentation
+ *
+ * Copyright (C) 2009 Red Hat, Matthew Garrett <mjg@redhat.com>
+ * Copyright (C) 2009 Hewlett-Packard Development Company, L.P.
+ * Nagananda Chumbalkar <nagananda.chumbalkar@hp.com>
+ */
+
+
+ Processor Clocking Control Driver
+ ---------------------------------
+
+ Contents:
+ ---------
+ 1. Introduction
+ 1.1 PCC interface
+ 1.1.1 Get Average Frequency
+ 1.1.2 Set Desired Frequency
+ 1.2 Platforms affected
+ 2. Driver and /sys details
+ 2.1 scaling_available_frequencies
+ 2.2 cpuinfo_transition_latency
+ 2.3 cpuinfo_cur_freq
+ 2.4 related_cpus
+ 3. Caveats
+
+ 1. Introduction:
+ ----------------
+ Processor Clocking Control (PCC) is an interface between the platform
+ firmware and OSPM. It is a mechanism for coordinating processor
+ performance (ie: frequency) between the platform firmware and the OS.
+
+ The PCC driver (pcc-cpufreq) allows OSPM to take advantage of the PCC
+ interface.
+
+ OS utilizes the PCC interface to inform platform firmware what frequency the
+ OS wants for a logical processor. The platform firmware attempts to achieve
+ the requested frequency. If the request for the target frequency could not be
+ satisfied by platform firmware, then it usually means that power budget
+ conditions are in place, and "power capping" is taking place.
+
+ 1.1 PCC interface:
+ ------------------
+ The complete PCC specification is available here:
+ https://acpica.org/sites/acpica/files/Processor-Clocking-Control-v1p0.pdf
+
+ PCC relies on a shared memory region that provides a channel for communication
+ between the OS and platform firmware. PCC also implements a "doorbell" that
+ is used by the OS to inform the platform firmware that a command has been
+ sent.
+
+ The ACPI PCCH() method is used to discover the location of the PCC shared
+ memory region. The shared memory region header contains the "command" and
+ "status" interface. PCCH() also contains details on how to access the platform
+ doorbell.
+
+ The following commands are supported by the PCC interface:
+ * Get Average Frequency
+ * Set Desired Frequency
+
+ The ACPI PCCP() method is implemented for each logical processor and is
+ used to discover the offsets for the input and output buffers in the shared
+ memory region.
+
+ When PCC mode is enabled, the platform will not expose processor performance
+ or throttle states (_PSS, _TSS and related ACPI objects) to OSPM. Therefore,
+ the native P-state driver (such as acpi-cpufreq for Intel, powernow-k8 for
+ AMD) will not load.
+
+ However, OSPM remains in control of policy. The governor (eg: "ondemand")
+ computes the required performance for each processor based on server workload.
+ The PCC driver fills in the command interface, and the input buffer and
+ communicates the request to the platform firmware. The platform firmware is
+ responsible for delivering the requested performance.
+
+ Each PCC command is "global" in scope and can affect all the logical CPUs in
+ the system. Therefore, PCC is capable of performing "group" updates. With PCC
+ the OS is capable of getting/setting the frequency of all the logical CPUs in
+ the system with a single call to the BIOS.
+
+ 1.1.1 Get Average Frequency:
+ ----------------------------
+ This command is used by the OSPM to query the running frequency of the
+ processor since the last time this command was completed. The output buffer
+ indicates the average unhalted frequency of the logical processor expressed as
+ a percentage of the nominal (ie: maximum) CPU frequency. The output buffer
+ also signifies if the CPU frequency is limited by a power budget condition.
+
+ 1.1.2 Set Desired Frequency:
+ ----------------------------
+ This command is used by the OSPM to communicate to the platform firmware the
+ desired frequency for a logical processor. The output buffer is currently
+ ignored by OSPM. The next invocation of "Get Average Frequency" will inform
+ OSPM if the desired frequency was achieved or not.
+
+ 1.2 Platforms affected:
+ -----------------------
+ The PCC driver will load on any system where the platform firmware:
+ * supports the PCC interface, and the associated PCCH() and PCCP() methods
+ * assumes responsibility for managing the hardware clocking controls in order
+ to deliver the requested processor performance
+
+ Currently, certain HP ProLiant platforms implement the PCC interface. On those
+ platforms PCC is the "default" choice.
+
+ However, it is possible to disable this interface via a BIOS setting. In
+ such an instance, as is also the case on platforms where the PCC interface
+ is not implemented, the PCC driver will fail to load silently.
+
+ 2. Driver and /sys details:
+ ---------------------------
+ When the driver loads, it merely prints the lowest and the highest CPU
+ frequencies supported by the platform firmware.
+
+ The PCC driver loads with a message such as:
+ pcc-cpufreq: (v1.00.00) driver loaded with frequency limits: 1600 MHz, 2933
+ MHz
+
+ This means that the OPSM can request the CPU to run at any frequency in
+ between the limits (1600 MHz, and 2933 MHz) specified in the message.
+
+ Internally, there is no need for the driver to convert the "target" frequency
+ to a corresponding P-state.
+
+ The VERSION number for the driver will be of the format v.xy.ab.
+ eg: 1.00.02
+ ----- --
+ | |
+ | -- this will increase with bug fixes/enhancements to the driver
+ |-- this is the version of the PCC specification the driver adheres to
+
+
+ The following is a brief discussion on some of the fields exported via the
+ /sys filesystem and how their values are affected by the PCC driver:
+
+ 2.1 scaling_available_frequencies:
+ ----------------------------------
+ scaling_available_frequencies is not created in /sys. No intermediate
+ frequencies need to be listed because the BIOS will try to achieve any
+ frequency, within limits, requested by the governor. A frequency does not have
+ to be strictly associated with a P-state.
+
+ 2.2 cpuinfo_transition_latency:
+ -------------------------------
+ The cpuinfo_transition_latency field is 0. The PCC specification does
+ not include a field to expose this value currently.
+
+ 2.3 cpuinfo_cur_freq:
+ ---------------------
+ A) Often cpuinfo_cur_freq will show a value different than what is declared
+ in the scaling_available_frequencies or scaling_cur_freq, or scaling_max_freq.
+ This is due to "turbo boost" available on recent Intel processors. If certain
+ conditions are met the BIOS can achieve a slightly higher speed than requested
+ by OSPM. An example:
+
+ scaling_cur_freq : 2933000
+ cpuinfo_cur_freq : 3196000
+
+ B) There is a round-off error associated with the cpuinfo_cur_freq value.
+ Since the driver obtains the current frequency as a "percentage" (%) of the
+ nominal frequency from the BIOS, sometimes, the values displayed by
+ scaling_cur_freq and cpuinfo_cur_freq may not match. An example:
+
+ scaling_cur_freq : 1600000
+ cpuinfo_cur_freq : 1583000
+
+ In this example, the nominal frequency is 2933 MHz. The driver obtains the
+ current frequency, cpuinfo_cur_freq, as 54% of the nominal frequency:
+
+ 54% of 2933 MHz = 1583 MHz
+
+ Nominal frequency is the maximum frequency of the processor, and it usually
+ corresponds to the frequency of the P0 P-state.
+
+ 2.4 related_cpus:
+ -----------------
+ The related_cpus field is identical to affected_cpus.
+
+ affected_cpus : 4
+ related_cpus : 4
+
+ Currently, the PCC driver does not evaluate _PSD. The platforms that support
+ PCC do not implement SW_ALL. So OSPM doesn't need to perform any coordination
+ to ensure that the same frequency is requested of all dependent CPUs.
+
+ 3. Caveats:
+ -----------
+ The "cpufreq_stats" module in its present form cannot be loaded and
+ expected to work with the PCC driver. The "cpufreq_stats" module provides
+ information with respect to each P-state, so it is not applicable to the PCC
+ driver.
diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
index 6a06dc473dd6..19754beb5a4e 100644
--- a/Documentation/admin-guide/pm/cpuidle.rst
+++ b/Documentation/admin-guide/pm/cpuidle.rst
@@ -159,17 +159,15 @@ governor uses that information depends on what algorithm is implemented by it
and that is the primary reason for having more than one governor in the
``CPUIdle`` subsystem.
-There are three ``CPUIdle`` governors available, ``menu``, `TEO <teo-gov_>`_
-and ``ladder``. Which of them is used by default depends on the configuration
-of the kernel and in particular on whether or not the scheduler tick can be
-`stopped by the idle loop <idle-cpus-and-tick_>`_. It is possible to change the
-governor at run time if the ``cpuidle_sysfs_switch`` command line parameter has
-been passed to the kernel, but that is not safe in general, so it should not be
-done on production systems (that may change in the future, though). The name of
-the ``CPUIdle`` governor currently used by the kernel can be read from the
-:file:`current_governor_ro` (or :file:`current_governor` if
-``cpuidle_sysfs_switch`` is present in the kernel command line) file under
-:file:`/sys/devices/system/cpu/cpuidle/` in ``sysfs``.
+There are four ``CPUIdle`` governors available, ``menu``, `TEO <teo-gov_>`_,
+``ladder`` and ``haltpoll``. Which of them is used by default depends on the
+configuration of the kernel and in particular on whether or not the scheduler
+tick can be `stopped by the idle loop <idle-cpus-and-tick_>`_. The available
+governors can be read from the :file:`available_governors` file, and the
+governor can be changed at run time. The name of the ``CPUIdle`` governor currently
+used by the kernel can be read from the :file:`current_governor_ro` or
+:file:`current_governor` file under :file:`/sys/devices/system/cpu/cpuidle/`
+in ``sysfs``.
Which ``CPUIdle`` driver is used, on the other hand, usually depends on the
platform the kernel is running on, but there are platforms with more than one
@@ -349,81 +347,8 @@ for tickless systems. It follows the same basic strategy as the ``menu`` `one
<menu-gov_>`_: it always tries to find the deepest idle state suitable for the
given conditions. However, it applies a different approach to that problem.
-First, it does not use sleep length correction factors, but instead it attempts
-to correlate the observed idle duration values with the available idle states
-and use that information to pick up the idle state that is most likely to
-"match" the upcoming CPU idle interval. Second, it does not take the tasks
-that were running on the given CPU in the past and are waiting on some I/O
-operations to complete now at all (there is no guarantee that they will run on
-the same CPU when they become runnable again) and the pattern detection code in
-it avoids taking timer wakeups into account. It also only uses idle duration
-values less than the current time till the closest timer (with the scheduler
-tick excluded) for that purpose.
-
-Like in the ``menu`` governor `case <menu-gov_>`_, the first step is to obtain
-the *sleep length*, which is the time until the closest timer event with the
-assumption that the scheduler tick will be stopped (that also is the upper bound
-on the time until the next CPU wakeup). That value is then used to preselect an
-idle state on the basis of three metrics maintained for each idle state provided
-by the ``CPUIdle`` driver: ``hits``, ``misses`` and ``early_hits``.
-
-The ``hits`` and ``misses`` metrics measure the likelihood that a given idle
-state will "match" the observed (post-wakeup) idle duration if it "matches" the
-sleep length. They both are subject to decay (after a CPU wakeup) every time
-the target residency of the idle state corresponding to them is less than or
-equal to the sleep length and the target residency of the next idle state is
-greater than the sleep length (that is, when the idle state corresponding to
-them "matches" the sleep length). The ``hits`` metric is increased if the
-former condition is satisfied and the target residency of the given idle state
-is less than or equal to the observed idle duration and the target residency of
-the next idle state is greater than the observed idle duration at the same time
-(that is, it is increased when the given idle state "matches" both the sleep
-length and the observed idle duration). In turn, the ``misses`` metric is
-increased when the given idle state "matches" the sleep length only and the
-observed idle duration is too short for its target residency.
-
-The ``early_hits`` metric measures the likelihood that a given idle state will
-"match" the observed (post-wakeup) idle duration if it does not "match" the
-sleep length. It is subject to decay on every CPU wakeup and it is increased
-when the idle state corresponding to it "matches" the observed (post-wakeup)
-idle duration and the target residency of the next idle state is less than or
-equal to the sleep length (i.e. the idle state "matching" the sleep length is
-deeper than the given one).
-
-The governor walks the list of idle states provided by the ``CPUIdle`` driver
-and finds the last (deepest) one with the target residency less than or equal
-to the sleep length. Then, the ``hits`` and ``misses`` metrics of that idle
-state are compared with each other and it is preselected if the ``hits`` one is
-greater (which means that that idle state is likely to "match" the observed idle
-duration after CPU wakeup). If the ``misses`` one is greater, the governor
-preselects the shallower idle state with the maximum ``early_hits`` metric
-(or if there are multiple shallower idle states with equal ``early_hits``
-metric which also is the maximum, the shallowest of them will be preselected).
-[If there is a wakeup latency constraint coming from the `PM QoS framework
-<cpu-pm-qos_>`_ which is hit before reaching the deepest idle state with the
-target residency within the sleep length, the deepest idle state with the exit
-latency within the constraint is preselected without consulting the ``hits``,
-``misses`` and ``early_hits`` metrics.]
-
-Next, the governor takes several idle duration values observed most recently
-into consideration and if at least a half of them are greater than or equal to
-the target residency of the preselected idle state, that idle state becomes the
-final candidate to ask for. Otherwise, the average of the most recent idle
-duration values below the target residency of the preselected idle state is
-computed and the governor walks the idle states shallower than the preselected
-one and finds the deepest of them with the target residency within that average.
-That idle state is then taken as the final candidate to ask for.
-
-Still, at this point the governor may need to refine the idle state selection if
-it has not decided to `stop the scheduler tick <idle-cpus-and-tick_>`_. That
-generally happens if the target residency of the idle state selected so far is
-less than the tick period and the tick has not been stopped already (in a
-previous iteration of the idle loop). Then, like in the ``menu`` governor
-`case <menu-gov_>`_, the sleep length used in the previous computations may not
-reflect the real time until the closest timer event and if it really is greater
-than that time, a shallower state with a suitable target residency may need to
-be selected.
-
+.. kernel-doc:: drivers/cpuidle/governors/teo.c
+ :doc: teo-description
.. _idle-states-representation:
@@ -480,7 +405,7 @@ order to ask the hardware to enter that state. Also, for each
statistics of the given idle state. That information is exposed by the kernel
via ``sysfs``.
-For each CPU in the system, there is a :file:`/sys/devices/system/cpu<N>/cpuidle/`
+For each CPU in the system, there is a :file:`/sys/devices/system/cpu/cpu<N>/cpuidle/`
directory in ``sysfs``, where the number ``<N>`` is assigned to the given
CPU at the initialization time. That directory contains a set of subdirectories
called :file:`state0`, :file:`state1` and so on, up to the number of idle state
@@ -496,7 +421,7 @@ object corresponding to it, as follows:
residency.
``below``
- Total number of times this idle state had been asked for, but cerainly
+ Total number of times this idle state had been asked for, but certainly
a deeper idle state would have been a better match for the observed idle
duration.
@@ -530,6 +455,10 @@ object corresponding to it, as follows:
Total number of times the hardware has been asked by the given CPU to
enter this idle state.
+``rejected``
+ Total number of times a request to enter this idle state on the given
+ CPU was rejected.
+
The :file:`desc` and :file:`name` files both contain strings. The difference
between them is that the name is expected to be more concise, while the
description may be longer and it may contain white space or special characters.
@@ -574,6 +503,11 @@ particular case. For these reasons, the only reliable way to find out how
much time has been spent by the hardware in different idle states supported by
it is to use idle state residency counters in the hardware, if available.
+Generally, an interrupt received when trying to enter an idle state causes the
+idle state entry request to be rejected, in which case the ``CPUIdle`` driver
+may return an error code to indicate that this was the case. The :file:`usage`
+and :file:`rejected` files report the number of times the given idle state
+was entered successfully or rejected, respectively.
.. _cpu-pm-qos:
@@ -583,20 +517,17 @@ Power Management Quality of Service for CPUs
The power management quality of service (PM QoS) framework in the Linux kernel
allows kernel code and user space processes to set constraints on various
energy-efficiency features of the kernel to prevent performance from dropping
-below a required level. The PM QoS constraints can be set globally, in
-predefined categories referred to as PM QoS classes, or against individual
-devices.
+below a required level.
CPU idle time management can be affected by PM QoS in two ways, through the
-global constraint in the ``PM_QOS_CPU_DMA_LATENCY`` class and through the
-resume latency constraints for individual CPUs. Kernel code (e.g. device
-drivers) can set both of them with the help of special internal interfaces
-provided by the PM QoS framework. User space can modify the former by opening
-the :file:`cpu_dma_latency` special device file under :file:`/dev/` and writing
-a binary value (interpreted as a signed 32-bit integer) to it. In turn, the
-resume latency constraint for a CPU can be modified by user space by writing a
-string (representing a signed 32-bit integer) to the
-:file:`power/pm_qos_resume_latency_us` file under
+global CPU latency limit and through the resume latency constraints for
+individual CPUs. Kernel code (e.g. device drivers) can set both of them with
+the help of special internal interfaces provided by the PM QoS framework. User
+space can modify the former by opening the :file:`cpu_dma_latency` special
+device file under :file:`/dev/` and writing a binary value (interpreted as a
+signed 32-bit integer) to it. In turn, the resume latency constraint for a CPU
+can be modified from user space by writing a string (representing a signed
+32-bit integer) to the :file:`power/pm_qos_resume_latency_us` file under
:file:`/sys/devices/system/cpu/cpu<N>/` in ``sysfs``, where the CPU number
``<N>`` is allocated at the system initialization time. Negative values
will be rejected in both cases and, also in both cases, the written integer
@@ -605,32 +536,34 @@ number will be interpreted as a requested PM QoS constraint in microseconds.
The requested value is not automatically applied as a new constraint, however,
as it may be less restrictive (greater in this particular case) than another
constraint previously requested by someone else. For this reason, the PM QoS
-framework maintains a list of requests that have been made so far in each
-global class and for each device, aggregates them and applies the effective
-(minimum in this particular case) value as the new constraint.
+framework maintains a list of requests that have been made so far for the
+global CPU latency limit and for each individual CPU, aggregates them and
+applies the effective (minimum in this particular case) value as the new
+constraint.
In fact, opening the :file:`cpu_dma_latency` special device file causes a new
-PM QoS request to be created and added to the priority list of requests in the
-``PM_QOS_CPU_DMA_LATENCY`` class and the file descriptor coming from the
-"open" operation represents that request. If that file descriptor is then
-used for writing, the number written to it will be associated with the PM QoS
-request represented by it as a new requested constraint value. Next, the
-priority list mechanism will be used to determine the new effective value of
-the entire list of requests and that effective value will be set as a new
-constraint. Thus setting a new requested constraint value will only change the
-real constraint if the effective "list" value is affected by it. In particular,
-for the ``PM_QOS_CPU_DMA_LATENCY`` class it only affects the real constraint if
-it is the minimum of the requested constraints in the list. The process holding
-a file descriptor obtained by opening the :file:`cpu_dma_latency` special device
-file controls the PM QoS request associated with that file descriptor, but it
-controls this particular PM QoS request only.
+PM QoS request to be created and added to a global priority list of CPU latency
+limit requests and the file descriptor coming from the "open" operation
+represents that request. If that file descriptor is then used for writing, the
+number written to it will be associated with the PM QoS request represented by
+it as a new requested limit value. Next, the priority list mechanism will be
+used to determine the new effective value of the entire list of requests and
+that effective value will be set as a new CPU latency limit. Thus requesting a
+new limit value will only change the real limit if the effective "list" value is
+affected by it, which is the case if it is the minimum of the requested values
+in the list.
+
+The process holding a file descriptor obtained by opening the
+:file:`cpu_dma_latency` special device file controls the PM QoS request
+associated with that file descriptor, but it controls this particular PM QoS
+request only.
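
The aggregation described above can be modelled in a few lines of Python. This is a toy sketch, not the kernel's implementation: the class and function names are hypothetical, and the only behaviour taken from the text is that each open file descriptor owns one request, that values are signed 32-bit integers in microseconds, and that the effective limit is the minimum of all requests.

```python
import struct


def encode_latency_us(value_us):
    """Pack a limit as the signed 32-bit binary value written to the device file."""
    if value_us < 0:
        raise ValueError("negative values are rejected")
    return struct.pack("i", value_us)  # native byte order, 4 bytes


class LatencyRequestList:
    """Toy model of the global request list; the effective limit is the minimum."""

    def __init__(self):
        self._requests = {}
        self._next_id = 0

    def add(self, value_us):
        """Create a request (models open() followed by a write) and return its id."""
        request_id = self._next_id
        self._next_id += 1
        self._requests[request_id] = value_us
        return request_id

    def update(self, request_id, value_us):
        """Change the value of an existing request (models another write)."""
        self._requests[request_id] = value_us

    def remove(self, request_id):
        """Drop a request (models closing the file descriptor)."""
        del self._requests[request_id]

    def effective(self):
        """Return the effective limit, or None when no request is present."""
        return min(self._requests.values()) if self._requests else None
```

In this model, adding requests of 100 and 20 gives an effective limit of 20. Raising the second request to 500 moves the limit to 100, because only the minimum of the list matters.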
Closing the :file:`cpu_dma_latency` special device file or, more precisely, the
file descriptor obtained while opening it, causes the PM QoS request associated
-with that file descriptor to be removed from the ``PM_QOS_CPU_DMA_LATENCY``
-class priority list and destroyed. If that happens, the priority list mechanism
-will be used, again, to determine the new effective value for the whole list
-and that value will become the new real constraint.
+with that file descriptor to be removed from the global priority list of CPU
+latency limit requests and destroyed. If that happens, the priority list
+mechanism will be used again to determine the new effective value for the whole
+list and that value will become the new limit.
In turn, for each CPU there is one resume latency PM QoS request associated with
the :file:`power/pm_qos_resume_latency_us` file under
@@ -647,10 +580,10 @@ CPU in question every time the list of requests is updated this way or another
(there may be other requests coming from kernel code in that list).
CPU idle time governors are expected to regard the minimum of the global
-effective ``PM_QOS_CPU_DMA_LATENCY`` class constraint and the effective
-resume latency constraint for the given CPU as the upper limit for the exit
-latency of the idle states they can select for that CPU. They should never
-select any idle states with exit latency beyond that limit.
+(effective) CPU latency limit and the effective resume latency constraint for
+the given CPU as the upper limit for the exit latency of the idle states that
+they are allowed to select for that CPU. They should never select any idle
+states with exit latency beyond that limit.
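
The rule can be illustrated with a short Python sketch. The state names and exit latencies below are made up for the example, and the function is hypothetical; it only shows the minimum of the two limits acting as the upper bound on exit latency.

```python
def allowed_states(states, global_limit_us, cpu_resume_limit_us):
    """Return the names of idle states whose exit latency fits the limit.

    `states` is a list of (name, exit_latency_us) tuples, shallowest first.
    """
    limit = min(global_limit_us, cpu_resume_limit_us)
    return [name for name, exit_latency in states if exit_latency <= limit]


# Hypothetical idle states; real names and latencies come from the driver.
example_states = [("POLL", 0), ("C1", 2), ("C6", 133)]
print(allowed_states(example_states, global_limit_us=100, cpu_resume_limit_us=50))
```

Here the tighter of the two limits (50 us) applies, so the hypothetical "C6" state with an exit latency of 133 us is not eligible.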
Idle States Control Via Kernel Command Line
@@ -679,8 +612,8 @@ the ``menu`` governor to be used on the systems that use the ``ladder`` governor
by default this way, for example.
The other kernel command line parameters controlling CPU idle time management
-described below are only relevant for the *x86* architecture and some of
-them affect Intel processors only.
+described below are only relevant for the *x86* architecture and references
+to ``intel_idle`` affect Intel processors only.
The *x86* architecture support code recognizes three kernel command line
options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
@@ -693,7 +626,7 @@ which of the two parameters is added to the kernel command line. In the
instruction of the CPUs (which, as a rule, suspends the execution of the program
and causes the hardware to attempt to enter the shallowest available idle state)
for this purpose, and if ``idle=poll`` is used, idle CPUs will execute a
-more or less ``lightweight'' sequence of instructions in a tight loop. [Note
+more or less "lightweight" sequence of instructions in a tight loop. [Note
that using ``idle=poll`` is somewhat drastic in many cases, as preventing idle
CPUs from saving almost any energy at all may not be the only effect of it.
For example, on Intel hardware it effectively prevents CPUs from using
@@ -702,10 +635,13 @@ idle, so it very well may hurt single-thread computations performance as well as
energy-efficiency. Thus using it for performance reasons may not be a good idea
at all.]
-The ``idle=nomwait`` option disables the ``intel_idle`` driver and causes
-``acpi_idle`` to be used (as long as all of the information needed by it is
-there in the system's ACPI tables), but it is not allowed to use the
-``MWAIT`` instruction of the CPUs to ask the hardware to enter idle states.
+The ``idle=nomwait`` option prevents the use of the ``MWAIT`` instruction of
+the CPU to enter idle states. When this option is used, the ``acpi_idle``
+driver will use the ``HLT`` instruction instead of ``MWAIT``. On systems
+running Intel processors, this option disables the ``intel_idle`` driver
+and forces the use of the ``acpi_idle`` driver instead. Note that in either
+case, the ``acpi_idle`` driver will function only if all the information needed
+by it is in the system's ACPI tables.
In addition to the architecture-level kernel command line options affecting CPU
idle time management, there are parameters affecting individual ``CPUIdle``
diff --git a/Documentation/admin-guide/pm/intel-speed-select.rst b/Documentation/admin-guide/pm/intel-speed-select.rst
new file mode 100644
index 000000000000..a2bfb971654f
--- /dev/null
+++ b/Documentation/admin-guide/pm/intel-speed-select.rst
@@ -0,0 +1,939 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================================================
+Intel(R) Speed Select Technology User Guide
+============================================================
+
+The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
+collection of features that give more granular control over CPU performance.
+With Intel(R) SST, one server can be configured for power and performance for a
+variety of diverse workload requirements.
+
+Refer to the links below for an overview of the technology:
+
+- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
+- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf
+
+These capabilities are further enhanced in some of the newer generations of
+server platforms where these features can be enumerated and controlled
+dynamically without pre-configuring via BIOS setup options. This dynamic
+configuration is done via mailbox commands to the hardware. One way to enumerate
+and configure these features is by using the Intel Speed Select utility.
+
+This document explains how to use the Intel Speed Select tool to enumerate and
+control Intel(R) SST features. This document gives example commands and explains
+how these commands change the power and performance profile of the system under
+test. Using this tool as an example, customers can replicate the messaging
+implemented in the tool in their production software.
+
+intel-speed-select configuration tool
+======================================
+
+Most Linux distributions may include the "intel-speed-select" tool in their
+packages. If not, it can be built by downloading the Linux kernel tree from
+kernel.org. Once downloaded, the tool can be built without building the full
+kernel.
+
+From the kernel tree, run the following commands::
+
+# cd tools/power/x86/intel-speed-select/
+# make
+# make install
+
+Getting Help
+------------
+
+To get help with the tool, execute the command below::
+
+# intel-speed-select --help
+
+The top-level help describes arguments and features. Notice that there is a
+multi-level help structure in the tool. For example, to get help for the feature "perf-profile"::
+
+# intel-speed-select perf-profile --help
+
+To get help on a command, another level of help is provided. For example, for the "info" command::
+
+# intel-speed-select perf-profile info --help
+
+Summary of platform capability
+------------------------------
+To check the current platform and driver capabilities, execute::
+
+# intel-speed-select --info
+
+For example on a test system::
+
+ # intel-speed-select --info
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Platform: API version : 1
+ Platform: Driver version : 1
+ Platform: mbox supported : 1
+ Platform: mmio supported : 1
+ Intel(R) SST-PP (feature perf-profile) is supported
+ TDP level change control is unlocked, max level: 4
+ Intel(R) SST-TF (feature turbo-freq) is supported
+ Intel(R) SST-BF (feature base-freq) is not supported
+ Intel(R) SST-CP (feature core-power) is supported
+
+Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
+------------------------------------------------------------------------
+
+This feature allows configuration of a server dynamically based on workload
+performance requirements. This helps users during deployment as they do not have
+to choose a specific server configuration statically. This Intel(R) Speed Select
+Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
+that allows multiple optimized performance profiles per system. Each profile
+defines the set of CPUs that may be online (the rest must be offline) to sustain
+a guaranteed base frequency. Once the user issues a command to use a specific
+performance profile and meets the CPU online/offline requirement, the base
+frequency changes dynamically. This feature is called "perf-profile" when using
+the Intel Speed Select tool.
+
+Number of performance levels
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There can be multiple performance profiles on a system. To get the number of
+profiles, execute the command below::
+
+ # intel-speed-select perf-profile get-config-levels
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ get-config-levels:4
+ package-1
+ die-0
+ cpu-14
+ get-config-levels:4
+
+On this system under test, there are 4 performance profiles in addition to the
+base performance profile (which is performance level 0).
+
+Lock/Unlock status
+~~~~~~~~~~~~~~~~~~
+
+Even if there are multiple performance profiles, it is possible that they
+are locked. If they are locked, users cannot issue a command to change the
+performance state. There may be a BIOS setup option to unlock them; otherwise,
+check with your system vendor.
+
+To check if the system is locked, execute the following command::
+
+ # intel-speed-select perf-profile get-lock-status
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ get-lock-status:0
+ package-1
+ die-0
+ cpu-14
+ get-lock-status:0
+
+In this case, lock status is 0, which means that the system is unlocked.
+
+Properties of a performance level
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get the properties of a specific performance level (level 0 in the example below), execute::
+
+ # intel-speed-select perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ cpu-count:28
+ enable-cpu-mask:000003ff,f0003fff
+ enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
+ thermal-design-power-ratio:26
+ base-frequency(MHz):2600
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:disabled
+ ...
+ ...
+
+Here the -l option is used to specify a performance level.
+
+If the option -l is omitted, then this command will print information about all
+the performance levels. The above command is printing properties of the
+performance level 0.
+
+For this performance profile, at most the CPUs listed in
+"enable-cpu-mask/enable-cpu-list" can be online. When that condition is met,
+the base frequency of 2600 MHz can be maintained. To understand more, execute
+"intel-speed-select perf-profile info" for performance level 4::
+
+ # intel-speed-select perf-profile info -l 4
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-4
+ cpu-count:28
+ enable-cpu-mask:000000fa,f0000faf
+ enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
+ thermal-design-power-ratio:28
+ base-frequency(MHz):2800
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:unsupported
+ ...
+ ...
+
+There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
+the user only keeps these CPUs online and the rest "offline," then the base
+frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.
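
The relationship between "enable-cpu-mask" and "enable-cpu-list" in the two outputs above can be checked with a small Python sketch. It assumes the mask is a comma-separated list of 32-bit hexadecimal groups with the most significant group first, which is consistent with the sample output; the function name is hypothetical.

```python
def cpus_from_mask(mask):
    """Decode a comma-separated hex CPU mask into a sorted list of CPU numbers.

    Assumes every group after the first is exactly 8 hex digits (32 bits),
    with the most significant group first.
    """
    value = int(mask.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Masks copied from the level 0 and level 4 outputs above.
print(cpus_from_mask("000003ff,f0003fff"))
print(cpus_from_mask("000000fa,f0000faf"))
```

Decoding the level 0 mask gives CPUs 0-13 and 28-41 (28 CPUs), and decoding the level 4 mask gives the 20 CPUs shown in its "enable-cpu-list".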
+
+Get current performance level
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get the current performance level, execute::
+
+ # intel-speed-select perf-profile get-config-current-level
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ get-config-current_level:0
+
+First verify that the base_frequency displayed by the cpufreq sysfs is correct::
+
+ # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
+ 2600000
+
+This matches the base-frequency (MHz) field value displayed from the
+"perf-profile info" command for performance level 0 (the cpufreq frequency is
+in kHz).
+
+To check if the average frequency is equal to the base frequency for a 100% busy
+workload, disable turbo::
+
+# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
+
+Then run a busy workload on all CPUs, for example::
+
+# stress -c 64
+
+To verify the base frequency, run turbostat::
+
+ # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+
+ Package Core CPU Bzy_MHz
+ - - 2600
+ 0 0 0 2600
+ 0 1 1 2600
+ 0 2 2 2600
+ 0 3 3 2600
+ 0 4 4 2600
+ . . . .
+
+
+Changing performance level
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To change the performance level to 4, execute::
+
+ # intel-speed-select -d perf-profile set-config-level -l 4 -o
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile
+ set_tdp_level:success
+
+In the command above, "-o" is optional. If it is specified, then it will also
+offline CPUs which are not present in the enable_cpu_mask for this performance
+level.
+
+Now if the base_frequency is checked::
+
+ # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
+ 2800000
+
+This shows that the base frequency has increased from 2600 MHz at performance
+level 0 to 2800 MHz at performance level 4. As a result, any workload that can
+use fewer CPUs can see a boost of 200 MHz compared to performance level 0.
+
+Changing performance level via BMC Interface
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It is possible to change the SST-PP level using an out-of-band (OOB) agent
+(via some remote management console, through the BMC "Baseboard Management
+Controller" interface). This mode is supported from the Sapphire Rapids
+processor generation. The kernel and tool changes to support this mode were
+added in Linux kernel version 5.18. To enable this feature, the kernel config
+option "CONFIG_INTEL_HFI_THERMAL" is required. The minimum version of the tool
+that supports this feature is "v1.12", which is part of Linux kernel version
+5.18.
+
+To support such a configuration, this tool can be used as a daemon. Add
+the command line option --oob::
+
+ # intel-speed-select --oob
+ Intel(R) Speed Select Technology
+ Executing on CPU model:143[0x8f]
+ OOB mode is enabled and will run as daemon
+
+In this mode the tool will online/offline CPUs based on the new performance
+level.
+
+Check presence of other Intel(R) SST features
+---------------------------------------------
+
+Each of the performance profiles also specifies whether the other two
+Intel(R) SST features are supported: Intel(R) Speed Select Technology - Base
+Frequency (Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo
+Frequency (Intel(R) SST-TF).
+
+For example, from the output of "perf-profile info" above, for level 0 and level
+4:
+
+For level 0::
+
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:disabled
+
+For level 4::
+
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:unsupported
+
+Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
+changed from "disabled" to "unsupported" compared to performance level 0.
+
+This means that at performance level 4, the "speed-select-base-freq" feature is
+not supported. However, at performance level 0, this feature is "supported", but
+currently "disabled", meaning the user has not activated this feature. In
+contrast, "speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both
+performance levels, but currently not activated by the user.
+
+The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
+technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
+The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
+is supported on a platform.
+
+Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)
+---------------------------------------------------------------
+
+Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that
+allows users to define per-core priority. It provides a mechanism to distribute
+power among cores in a power-constrained scenario, based on a class of service
+(CLOS) configuration.
+
+The user can configure up to 4 class of service configurations. Each CLOS group
+configuration allows the definition of parameters that affect how the frequency
+can be limited and how power is distributed. Each CPU core can be tied to a
+class of service and hence an associated priority. The granularity is per core,
+not per CPU.
+
+Enable CLOS based prioritization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To use the CLOS-based prioritization feature, firmware must be informed to
+enable it and to use a priority type. There is a default per-platform priority
+type, which can be changed with an optional command line parameter.
+
+To enable and check the options, execute::
+
+ # intel-speed-select core-power enable --help
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Enable core-power for a package/die
+ Clos Enable: Specify priority type with [--priority|-p]
+ 0: Proportional, 1: Ordered
+
+There are two priority types:
+
+- Ordered
+
+Priority for ordered throttling is defined based on the index of the assigned
+CLOS group, where CLOS0 gets the highest priority (throttled last).
+
+Priority order is:
+CLOS0 > CLOS1 > CLOS2 > CLOS3.
+
+- Proportional
+
+When proportional priority is used, there is an additional parameter called
+frequency_weight, which can be specified per CLOS group. The goal of
+proportional priority is to provide each core with the requested min., then
+distribute all remaining (excess/deficit) budgets in proportion to a defined
+weight. This proportional priority can be configured using the "core-power
+config" command.
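+
As a rough illustration of the proportional scheme described above, the following Python sketch grants each CLOS group its requested minimum and then splits the remaining budget, whether an excess or a deficit, in proportion to a per-group weight. It is a simplified model, not the algorithm used by the hardware: the function name, the abstract budget units, the example numbers, the even split used when all weights are zero, and the assumption that a larger weight receives a larger share of the remainder are all illustrative and not taken from this document.

```python
def proportional_share(budget, clos_min, weights):
    """Illustrative model only: grant each CLOS group its minimum, then split
    the remaining budget (positive or negative) in proportion to the weights."""
    remaining = budget - sum(clos_min)
    total_weight = sum(weights)
    if total_weight == 0:
        # No weights defined: split the remainder evenly (an assumption).
        shares = [remaining / len(weights)] * len(weights)
    else:
        shares = [remaining * w / total_weight for w in weights]
    return [m + s for m, s in zip(clos_min, shares)]


# Hypothetical numbers: four CLOS groups sharing an abstract budget of 10000.
allocation = proportional_share(10000, [2000, 2000, 1000, 1000], [4, 2, 1, 1])
print(allocation)
```

With a budget smaller than the sum of the minimums, the same function spreads the deficit across the groups using the same weights.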
+
+To enable with the platform default priority type, execute::
+
+ # intel-speed-select core-power enable
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ core-power
+ enable:success
+ package-1
+ die-0
+ cpu-6
+ core-power
+ enable:success
+
+The scope of this enable is per package, or per die when a package contains
+multiple dies. To check if CLOS is enabled and get the priority type, the
+"core-power info" command can be used. For example, to check the status of the
+core-power feature on CPU 0, execute::
+
+ # intel-speed-select -c 0 core-power info
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ core-power
+ support-status:supported
+ enable-status:enabled
+ clos-enable-status:enabled
+ priority-type:proportional
+ package-1
+ die-0
+ cpu-24
+ core-power
+ support-status:supported
+ enable-status:enabled
+ clos-enable-status:enabled
+ priority-type:proportional
+
+Configuring CLOS groups
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Each CLOS group has its own attributes including min, max, freq_weight and
+desired. These parameters can be configured with the "core-power config"
+command. Defaults will be used if the user skips setting a parameter, except
+the clos id, which is mandatory. To check core-power config options, execute::
+
+ # intel-speed-select core-power config --help
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Set core-power configuration for one of the four clos ids
+ Specify targeted clos id with [--clos|-c]
+ Specify clos Proportional Priority [--weight|-w]
+ Specify clos min in MHz with [--min|-n]
+ Specify clos max in MHz with [--max|-m]
+
+For example::
+
+ # intel-speed-select core-power config -c 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ clos epp is not specified, default: 0
+ clos frequency weight is not specified, default: 0
+ clos min is not specified, default: 0 MHz
+ clos max is not specified, default: 25500 MHz
+ clos desired is not specified, default: 0
+ package-0
+ die-0
+ cpu-0
+ core-power
+ config:success
+ package-1
+ die-0
+ cpu-6
+ core-power
+ config:success
+
+The user has the option to change defaults. For example, the user can set
+"min" to the base frequency to always get the guaranteed base frequency.
+
+Get the current CLOS configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To check the current configuration, "core-power get-config" can be used. For
+example, to get the configuration of CLOS 0::
+
+ # intel-speed-select core-power get-config -c 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ core-power
+ clos:0
+ epp:0
+ clos-proportional-priority:0
+ clos-min:0 MHz
+ clos-max:Max Turbo frequency
+ clos-desired:0 MHz
+ package-1
+ die-0
+ cpu-24
+ core-power
+ clos:0
+ epp:0
+ clos-proportional-priority:0
+ clos-min:0 MHz
+ clos-max:Max Turbo frequency
+ clos-desired:0 MHz
+
+Associating a CPU with a CLOS group
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To associate a CPU with a CLOS group, the "core-power assoc" command can be used::
+
+ # intel-speed-select core-power assoc --help
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Associate a clos id to a CPU
+ Specify targeted clos id with [--clos|-c]
+
+
+For example, to associate CPU 10 with CLOS group 3, execute::
+
+ # intel-speed-select -c 10 core-power assoc -c 3
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-10
+ core-power
+ assoc:success
+
+Once a CPU is associated, its sibling CPUs are also associated with the same
+CLOS group. Once associated, avoid changing the Linux "cpufreq" subsystem
+scaling frequency limits.
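+
Because the association applies to a whole core, it can help to know which logical CPUs share a core before assigning CLOS groups. The sketch below is one possible way to find the siblings of a CPU by reading the standard ``thread_siblings_list`` topology file in sysfs. The helper names and the sample string "10,34" are illustrative assumptions; the actual sibling numbering depends on the system.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list string such as "0-3,10,34" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def thread_siblings(cpu):
    """Read the sibling threads of a CPU from sysfs (Linux only)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())


# Hypothetical sysfs content for CPU 10 on a system with two threads per core.
print(parse_cpu_list("10,34"))
```

On a live system, ``thread_siblings(10)`` would return every logical CPU that is associated together with CPU 10.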
+
+To check the existing association for a CPU, the "core-power get-assoc" command
+can be used. For example, to get the association of CPU 10, execute::
+
+ # intel-speed-select -c 10 core-power get-assoc
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-1
+ die-0
+ cpu-10
+ get-assoc
+ clos:3
+
+This shows that CPU 10 is part of CLOS group 3.
+
+
+Disable CLOS based prioritization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To disable, execute::
+
+ # intel-speed-select core-power disable
+
+Some features, like Intel(R) SST-TF, can only be enabled when CLOS based
+prioritization is enabled. Disabling it while Intel(R) SST-TF is enabled could
+cause Intel(R) SST-TF to fail, so the "disable" command displays an error if
+Intel(R) SST-TF is already enabled. In that case, the Intel(R) SST-TF feature
+must be disabled first.
+
+Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
+-------------------------------------------------------------------
+
+The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
+the user control base frequency. If some critical workload threads demand
+constant high guaranteed performance, then this feature can be used to execute
+the thread at higher base frequency on specific sets of CPUs (high priority
+CPUs) at the cost of lower base frequency (low priority CPUs) on other CPUs.
+This feature does not require taking the low priority CPUs offline.
+
+The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
+Performance Profile (Intel(R) SST-PP) performance level configuration. It is
+possible that only certain performance levels support Intel(R) SST-BF. It is also
+possible that only the base performance level (level = 0) supports Intel(R)
+SST-BF. Consequently, first select the desired performance level to enable this
+feature.
+
+In the system under test here, Intel(R) SST-BF is supported at the base
+performance level 0, but currently disabled. For example, for level 0::
+
+ # intel-speed-select -c 0 perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ ...
+
+ speed-select-base-freq:disabled
+ ...
+
+Before enabling Intel(R) SST-BF and measuring its impact on workload
+performance, run the workload and measure its performance to get a baseline to
+compare against.
+
+Here the user wants more guaranteed performance. For this reason, it is likely
+that turbo is disabled. To disable turbo, execute::
+
+ # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
+
+Based on the output of "intel-speed-select perf-profile info -l 0", the
+guaranteed base frequency is 2600 MHz.
+
+
+Measure baseline performance for comparison
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To compare, pick a multi-threaded workload where each thread can be scheduled on
+a separate CPU. The "Hackbench pipe" test is a good example of how to improve
+performance using Intel(R) SST-BF.
+
+Below, the workload is measuring average scheduler wakeup latency, so a lower
+number means better performance::
+
+ # taskset -c 3,4 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 6.102 [sec]
+ 6.102445 usecs/op
+ 163868 ops/sec
+
+While the above test is running, turbostat output shows that 2 of the CPUs are
+busy and reaching max. frequency (which is the base frequency as turbo is
+disabled). The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ 0 0 0 1000
+ 0 1 1 1005
+ 0 2 2 1000
+ 0 3 3 2600
+ 0 4 4 2600
+ 0 5 5 1000
+ 0 6 6 1000
+ 0 7 7 1005
+ 0 8 8 1005
+ 0 9 9 1000
+ 0 10 10 1000
+ 0 11 11 995
+ 0 12 12 1000
+ 0 13 13 1000
+
+From the above turbostat output, both CPU 3 and 4 are very busy and reaching
+full guaranteed frequency of 2600 MHz.
+
+Intel(R) SST-BF Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get capabilities of Intel(R) SST-BF for the current performance level 0,
+execute::
+
+ # intel-speed-select base-freq info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ speed-select-base-freq
+ high-priority-base-frequency(MHz):3000
+ high-priority-cpu-mask:00000216,00002160
+ high-priority-cpu-list:5,6,8,13,33,34,36,41
+ low-priority-base-frequency(MHz):2400
+ tjunction-temperature(C):125
+ thermal-design-power(W):205
+
+The above capabilities show that there are some CPUs on this system that can
+offer a base frequency of 3000 MHz, compared to the standard base frequency at
+this performance level. These CPUs are fixed, and they are presented via
+high-priority-cpu-list/high-priority-cpu-mask. If the Intel(R) SST-BF feature
+is selected, the low priority CPUs (which are not in high-priority-cpu-list)
+can only offer up to 2400 MHz. If this clipping of low priority CPUs is
+acceptable, then the user can enable the Intel(R) SST-BF feature. It suits the
+above "sched pipe" workload in particular: since only two CPUs are used, they
+can be scheduled on high priority CPUs and get a boost of 400 MHz.
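+
The relationship between high-priority-cpu-mask and high-priority-cpu-list in the output above can be checked with a short Python sketch. It assumes the mask uses the usual kernel bitmap text format, that is comma-separated 32-bit hexadecimal words with the most significant word first; the function name is illustrative.

```python
def decode_cpu_mask(mask):
    """Convert a kernel-style hex CPU mask into a sorted list of CPU numbers.

    The mask is assumed to be comma-separated 32-bit hex words, most
    significant word first, e.g. "00000216,00002160".
    """
    value = int(mask.replace(",", ""), 16)
    # Bit N set in the combined value means CPU N is in the mask.
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


print(decode_cpu_mask("00000216,00002160"))
# → [5, 6, 8, 13, 33, 34, 36, 41]
```

The decoded list matches high-priority-cpu-list, so either field can be used when choosing CPUs to pin a workload to.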
+
+Enable Intel(R) SST-BF
+~~~~~~~~~~~~~~~~~~~~~~
+
+To enable Intel(R) SST-BF feature, execute::
+
+ # intel-speed-select base-freq enable -a
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ base-freq
+ enable:success
+ package-1
+ die-0
+ cpu-14
+ base-freq
+ enable:success
+
+In this case, the -a option is optional. This not only enables Intel(R) SST-BF, but it
+also adjusts the priority of cores using Intel(R) Speed Select Technology Core
+Power (Intel(R) SST-CP) features. This option sets the minimum performance of each
+Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) class to
+maximum performance so that the hardware will give maximum performance possible
+for each CPU.
+
+If the -a option is not used, then the following steps are required before enabling
+Intel(R) SST-BF:
+
+- Discover Intel(R) SST-BF and note low and high priority base frequency
+- Note the high priority CPU list
+- Enable CLOS using core-power feature set
+- Configure CLOS parameters. Use CLOS.min to set to minimum performance
+- Subscribe desired CPUs to CLOS groups
+
+With this configuration, execute the same workload pinned to high priority CPUs
+(CPU 5 and 6 in this case)::
+
+ #taskset -c 5,6 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 5.627 [sec]
+ 5.627922 usecs/op
+ 177685 ops/sec
+
+This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
+improved (latency reduced) by about 7.8%. From the turbostat output, it can be
+observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
+The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ 0 0 0 2151
+ 0 1 1 2166
+ 0 2 2 2175
+ 0 3 3 2175
+ 0 4 4 2175
+ 0 5 5 3000
+ 0 6 6 3000
+ 0 7 7 2180
+ 0 8 8 2662
+ 0 9 9 2176
+ 0 10 10 2175
+ 0 11 11 2176
+ 0 12 12 2176
+ 0 13 13 2661
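+
The latency improvement quoted for this benchmark can be reproduced from the usecs/op values of the two "sched pipe" runs shown in this section. The following Python sketch does that arithmetic; the function name is illustrative, and the result agrees with the quoted figure to within rounding.

```python
def latency_reduction_percent(before_usecs, after_usecs):
    """Relative reduction in per-operation latency, in percent."""
    return (before_usecs - after_usecs) / before_usecs * 100.0


# usecs/op values taken from the baseline run and the Intel(R) SST-BF run.
reduction = latency_reduction_percent(6.102445, 5.627922)
print(f"{reduction:.2f}%")
```

The same function can be applied to any pair of runs, provided both use the same workload and CPU count.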
+
+Disable Intel(R) SST-BF
+~~~~~~~~~~~~~~~~~~~~~~~
+
+To disable the Intel(R) SST-BF feature, execute::
+
+ # intel-speed-select base-freq disable -a
+
+
+Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
+--------------------------------------------------------------------
+
+This feature enables the ability to set different "All core turbo ratio limits"
+to cores based on the priority. By using this feature, some cores can be
+configured to get higher turbo frequency by designating them as high priority at
+the cost of lower or no turbo frequency on the low priority cores.
+
+For this reason, this feature is only useful when the system is busy utilizing all
+CPUs, but the user wants some configurable option to get high performance on
+some CPUs.
+
+The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
+depends on the Intel(R) Speed Select Technology - Performance Profile (Intel(R)
+SST-PP) performance level configuration. It is possible that only a certain
+performance level supports Intel(R) SST-TF. It is also possible that only the base
+performance level (level = 0) has the support of Intel(R) SST-TF. Hence, first
+select the desired performance level to enable this feature.
+
+In the system under test here, Intel(R) SST-TF is supported at the base
+performance level 0, but currently disabled::
+
+ # intel-speed-select -c 0 perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ ...
+ ...
+ speed-select-turbo-freq:disabled
+ ...
+ ...
+
+
+To check if performance can be improved using the Intel(R) SST-TF feature, get the
+turbo frequency properties with Intel(R) SST-TF enabled and compare them to the base
+turbo capability of this system.
+
+Get Base turbo capability
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get the base turbo capability of performance level 0, execute::
+
+ # intel-speed-select perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ ...
+ ...
+ turbo-ratio-limits-sse
+ bucket-0
+ core-count:2
+ max-turbo-frequency(MHz):3200
+ bucket-1
+ core-count:4
+ max-turbo-frequency(MHz):3100
+ bucket-2
+ core-count:6
+ max-turbo-frequency(MHz):3100
+ bucket-3
+ core-count:8
+ max-turbo-frequency(MHz):3100
+ bucket-4
+ core-count:10
+ max-turbo-frequency(MHz):3100
+ bucket-5
+ core-count:12
+ max-turbo-frequency(MHz):3100
+ bucket-6
+ core-count:14
+ max-turbo-frequency(MHz):3100
+ bucket-7
+ core-count:16
+ max-turbo-frequency(MHz):3100
+
+Based on the data above, when all the CPUs are busy, a max. frequency of 3100
+MHz can be achieved. With some busy workload (e.g. stress) running on CPUs
+0 - 11, execute the "hackbench pipe" workload on CPU 12 and 13::
+
+ # taskset -c 12,13 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 5.705 [sec]
+ 5.705488 usecs/op
+ 175269 ops/sec
+
+The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ 0 0 0 3000
+ 0 1 1 3000
+ 0 2 2 3000
+ 0 3 3 3000
+ 0 4 4 3000
+ 0 5 5 3100
+ 0 6 6 3100
+ 0 7 7 3000
+ 0 8 8 3100
+ 0 9 9 3000
+ 0 10 10 3000
+ 0 11 11 3000
+ 0 12 12 3100
+ 0 13 13 3100
+
+Based on turbostat output, the performance is limited by frequency cap of 3100
+MHz. To check if the hackbench performance can be improved for CPU 12 and CPU
+13, first check the capability of the Intel(R) SST-TF feature for this performance
+level.
+
+Get Intel(R) SST-TF Capability
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get the capability, the "turbo-freq info" command can be used::
+
+ # intel-speed-select turbo-freq info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ speed-select-turbo-freq
+ bucket-0
+ high-priority-cores-count:2
+ high-priority-max-frequency(MHz):3200
+ high-priority-max-avx2-frequency(MHz):3200
+ high-priority-max-avx512-frequency(MHz):3100
+ bucket-1
+ high-priority-cores-count:4
+ high-priority-max-frequency(MHz):3100
+ high-priority-max-avx2-frequency(MHz):3000
+ high-priority-max-avx512-frequency(MHz):2900
+ bucket-2
+ high-priority-cores-count:6
+ high-priority-max-frequency(MHz):3100
+ high-priority-max-avx2-frequency(MHz):3000
+ high-priority-max-avx512-frequency(MHz):2900
+ speed-select-turbo-freq-clip-frequencies
+ low-priority-max-frequency(MHz):2600
+ low-priority-max-avx2-frequency(MHz):2400
+ low-priority-max-avx512-frequency(MHz):2100
+
+Based on the output above, there is an Intel(R) SST-TF bucket for which there are
+two high priority cores. If only two high priority cores are set, then max.
+turbo frequency on those cores can be increased to 3200 MHz. This is 100 MHz
+more than the base turbo capability for all cores.
+
+In turn, for the hackbench workload, two CPUs can be set as high priority and
+the rest as low priority. One side effect is that once enabled, the low priority
+cores will be clipped to a lower frequency of 2600 MHz.
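+
The bucket reasoning above can be expressed as a small lookup. The Python sketch below uses the high-priority-cores-count and high-priority-max-frequency values from the "turbo-freq info" output and returns the turbo limit for a given number of high priority cores. It assumes that a bucket applies whenever the number of high priority cores does not exceed its core count, which is an interpretation of the output rather than something this document states; the function name is illustrative.

```python
def max_high_priority_frequency(buckets, high_priority_cores):
    """Return the turbo limit (MHz) of the smallest bucket that can hold the
    requested number of high priority cores, or None if no bucket fits."""
    for core_count, max_mhz in sorted(buckets):
        if high_priority_cores <= core_count:
            return max_mhz
    return None


# (high-priority-cores-count, high-priority-max-frequency) from the output above.
buckets = [(2, 3200), (4, 3100), (6, 3100)]
for cores in (2, 4, 8):
    print(cores, max_high_priority_frequency(buckets, cores))
```

Under that assumption, only the two-core bucket offers more than the 3100 MHz all-core turbo limit, which is why two CPUs are chosen for the hackbench workload.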
+
+Enable Intel(R) SST-TF
+~~~~~~~~~~~~~~~~~~~~~~
+
+To enable Intel(R) SST-TF, execute::
+
+ # intel-speed-select -c 12,13 turbo-freq enable -a
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-12
+ turbo-freq
+ enable:success
+ package-0
+ die-0
+ cpu-13
+ turbo-freq
+ enable:success
+ package--1
+ die-0
+ cpu-63
+ turbo-freq --auto
+ enable:success
+
+In this case, the option "-a" is optional. If set, it enables the Intel(R) SST-TF
+feature and also sets the CPUs to high and low priority using Intel(R) Speed
+Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers passed
+with "-c" arguments are marked as high priority, including their siblings.
+
+If the -a option is not used, then the following steps are required before enabling
+Intel(R) SST-TF:
+
+- Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency
+
+- Enable CLOS using core-power feature set
+
+- Configure CLOS parameters
+
+- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency
+
+If the same hackbench workload is executed, schedule hackbench threads on high
+priority CPUs::
+
+ #taskset -c 12,13 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 5.510 [sec]
+ 5.510165 usecs/op
+ 180826 ops/sec
+
+This improves performance by around 3.3% on a busy system. Here the turbostat
+output will show that CPU 12 and CPU 13 are getting a 100 MHz boost.
+The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ ...
+ 0 12 12 3200
+ 0 13 13 3200
diff --git a/Documentation/admin-guide/pm/intel_idle.rst b/Documentation/admin-guide/pm/intel_idle.rst
index 89309e1b0e48..b799a43da62e 100644
--- a/Documentation/admin-guide/pm/intel_idle.rst
+++ b/Documentation/admin-guide/pm/intel_idle.rst
@@ -20,8 +20,8 @@ Nehalem and later generations of Intel processors, but the level of support for
a particular processor model in it depends on whether or not it recognizes that
processor model and may also depend on information coming from the platform
firmware. [To understand ``intel_idle`` it is necessary to know how ``CPUIdle``
-works in general, so this is the time to get familiar with :doc:`cpuidle` if you
-have not done that yet.]
+works in general, so this is the time to get familiar with
+Documentation/admin-guide/pm/cpuidle.rst if you have not done that yet.]
``intel_idle`` uses the ``MWAIT`` instruction to inform the processor that the
logical CPU executing it is idle and so it may be possible to put some of the
@@ -53,7 +53,8 @@ processor) corresponding to them depends on the processor model and it may also
depend on the configuration of the platform.
In order to create a list of available idle states required by the ``CPUIdle``
-subsystem (see :ref:`idle-states-representation` in :doc:`cpuidle`),
+subsystem (see :ref:`idle-states-representation` in
+Documentation/admin-guide/pm/cpuidle.rst),
``intel_idle`` can use two sources of information: static tables of idle states
for different processor models included in the driver itself and the ACPI tables
of the system. The former are always used if the processor model at hand is
@@ -98,7 +99,8 @@ states may not be enabled by default if there are no matching entries in the
preliminary list of idle states coming from the ACPI tables. In that case user
space still can enable them later (on a per-CPU basis) with the help of
the ``disable`` idle state attribute in ``sysfs`` (see
-:ref:`idle-states-representation` in :doc:`cpuidle`). This basically means that
+:ref:`idle-states-representation` in
+Documentation/admin-guide/pm/cpuidle.rst). This basically means that
the idle states "known" to the driver may not be enabled by default if they have
not been exposed by the platform firmware (through the ACPI tables).
@@ -186,7 +188,8 @@ be desirable. In practice, it is only really necessary to do that if the idle
states in question cannot be enabled during system startup, because in the
working state of the system the CPU power management quality of service (PM
QoS) feature can be used to prevent ``CPUIdle`` from touching those idle states
-even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
+even if they have been enumerated (see :ref:`cpu-pm-qos` in
+Documentation/admin-guide/pm/cpuidle.rst).
Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
The ``no_acpi`` and ``use_acpi`` module parameters (recognized by ``intel_idle``
@@ -202,7 +205,8 @@ Namely, the positions of the bits that are set in the ``states_off`` value are
the indices of idle states to be disabled by default (as reflected by the names
of the corresponding idle state directories in ``sysfs``, :file:`state0`,
:file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the index of the given
-idle state; see :ref:`idle-states-representation` in :doc:`cpuidle`).
+idle state; see :ref:`idle-states-representation` in
+Documentation/admin-guide/pm/cpuidle.rst).
For example, if ``states_off`` is equal to 3, the driver will disable idle
states 0 and 1 by default, and if it is equal to 8, idle state 3 will be
diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
index 67e414e34f37..d5043cd8d2f5 100644
--- a/Documentation/admin-guide/pm/intel_pstate.rst
+++ b/Documentation/admin-guide/pm/intel_pstate.rst
@@ -18,8 +18,8 @@ General Information
(``CPUFreq``). It is a scaling driver for the Sandy Bridge and later
generations of Intel processors. Note, however, that some of those processors
may not be supported. [To understand ``intel_pstate`` it is necessary to know
-how ``CPUFreq`` works in general, so this is the time to read :doc:`cpufreq` if
-you have not done that yet.]
+how ``CPUFreq`` works in general, so this is the time to read
+Documentation/admin-guide/pm/cpufreq.rst if you have not done that yet.]
For the processors supported by ``intel_pstate``, the P-state concept is broader
than just an operating frequency or an operating performance point (see the
@@ -54,17 +54,21 @@ registered (see `below <status_attr_>`_).
Operation Modes
===============
-``intel_pstate`` can operate in three different modes: in the active mode with
-or without hardware-managed P-states support and in the passive mode. Which of
-them will be in effect depends on what kernel command line options are used and
-on the capabilities of the processor.
+``intel_pstate`` can operate in two different modes, active or passive. In the
+active mode, it uses its own internal performance scaling governor algorithm or
+allows the hardware to do performance scaling by itself, while in the passive
+mode it responds to requests made by a generic ``CPUFreq`` governor implementing
+a certain performance scaling algorithm. Which of them will be in effect
+depends on what kernel command line options are used and on the capabilities of
+the processor.
Active Mode
-----------
-This is the default operation mode of ``intel_pstate``. If it works in this
-mode, the ``scaling_driver`` policy attribute in ``sysfs`` for all ``CPUFreq``
-policies contains the string "intel_pstate".
+This is the default operation mode of ``intel_pstate`` for processors with
+hardware-managed P-states (HWP) support. If it works in this mode, the
+``scaling_driver`` policy attribute in ``sysfs`` for all ``CPUFreq`` policies
+contains the string "intel_pstate".
In this mode the driver bypasses the scaling governors layer of ``CPUFreq`` and
provides its own scaling algorithms for P-state selection. Those algorithms
@@ -119,7 +123,9 @@ Energy-Performance Bias (EPB) knob (otherwise), which means that the processor's
internal P-state selection logic is expected to focus entirely on performance.
This will override the EPP/EPB setting coming from the ``sysfs`` interface
-(see `Energy vs Performance Hints`_ below).
+(see `Energy vs Performance Hints`_ below). Moreover, any attempts to change
+the EPP/EPB to a value different from 0 ("performance") via ``sysfs`` in this
+configuration will be rejected.
Also, in this configuration the range of P-states available to the processor's
internal P-state selection logic is always restricted to the upper boundary
@@ -138,12 +144,13 @@ internal P-state selection logic to be less performance-focused.
Active Mode Without HWP
~~~~~~~~~~~~~~~~~~~~~~~
-This is the default operation mode for processors that do not support the HWP
-feature. It also is used by default with the ``intel_pstate=no_hwp`` argument
-in the kernel command line. However, in this mode ``intel_pstate`` may refuse
-to work with the given processor if it does not recognize it. [Note that
-``intel_pstate`` will never refuse to work with any processor with the HWP
-feature enabled.]
+This operation mode is optional for processors that do not support the HWP
+feature or when the ``intel_pstate=no_hwp`` argument is passed to the kernel in
+the command line. The active mode is used in those cases if the
+``intel_pstate=active`` argument is passed to the kernel in the command line.
+In this mode ``intel_pstate`` may refuse to work with processors that are not
+recognized by it. [Note that ``intel_pstate`` will never refuse to work with
+any processor with the HWP feature enabled.]
In this mode ``intel_pstate`` registers utilization update callbacks with the
CPU scheduler in order to run a P-state selection algorithm, either
@@ -188,10 +195,15 @@ is not set.
Passive Mode
------------
-This mode is used if the ``intel_pstate=passive`` argument is passed to the
-kernel in the command line (it implies the ``intel_pstate=no_hwp`` setting too).
-Like in the active mode without HWP support, in this mode ``intel_pstate`` may
-refuse to work with the given processor if it does not recognize it.
+This is the default operation mode of ``intel_pstate`` for processors without
+hardware-managed P-states (HWP) support. It is always used if the
+``intel_pstate=passive`` argument is passed to the kernel in the command line
+regardless of whether or not the given processor supports HWP. [Note that the
+``intel_pstate=no_hwp`` setting causes the driver to start in the passive mode
+if it is not combined with ``intel_pstate=active``.] Like in the active mode
+without HWP support, in this mode ``intel_pstate`` may refuse to work with
+processors that are not recognized by it if HWP is prevented from being enabled
+through the kernel command line.
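The mode selection rules described in the added text above can be summarized as a small decision function. The Python sketch below is only a model of that prose, not the driver's code: the function name and argument representation are illustrative, other ``intel_pstate=`` values are ignored, and the behavior when both ``passive`` and ``active`` are passed is an assumption (``passive`` wins here).

```python
def default_mode(hwp_supported, args=()):
    """Model of the mode selection described in the text; `args` holds the
    values passed as intel_pstate=<value> on the kernel command line."""
    if "passive" in args:
        return "passive"   # always used, regardless of HWP support
    if "active" in args:
        return "active"    # explicit request, with or without HWP
    if "no_hwp" in args or not hwp_supported:
        return "passive"   # default when HWP is unavailable or disabled
    return "active"        # default for processors with HWP


print(default_mode(True))                        # HWP processor, no arguments
print(default_mode(True, ["no_hwp"]))            # HWP disabled on the command line
print(default_mode(True, ["no_hwp", "active"]))  # active mode without HWP
```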
If the driver works in this mode, the ``scaling_driver`` policy attribute in
``sysfs`` for all ``CPUFreq`` policies contains the string "intel_cpufreq".
@@ -312,10 +324,9 @@ manuals need to be consulted to get to it too.
For this reason, there is a list of supported processors in ``intel_pstate`` and
the driver initialization will fail if the detected processor is not in that
-list, unless it supports the `HWP feature <Active Mode_>`_. [The interface to
-obtain all of the information listed above is the same for all of the processors
-supporting the HWP feature, which is why they all are supported by
-``intel_pstate``.]
+list, unless it supports the HWP feature. [The interface to obtain all of the
+information listed above is the same for all of the processors supporting the
+HWP feature, which is why ``intel_pstate`` works with all of them.]
User Space Interface in ``sysfs``
@@ -354,6 +365,9 @@ argument is passed to the kernel in the command line.
inclusive) including both turbo and non-turbo P-states (see
`Turbo P-states Support`_).
+ This attribute is present only if the value exposed by it is the same
+ for all of the CPUs in the system.
+
The value of this attribute is not affected by the ``no_turbo``
setting described `below <no_turbo_attr_>`_.
@@ -363,19 +377,22 @@ argument is passed to the kernel in the command line.
Ratio of the `turbo range <turbo_>`_ size to the size of the entire
range of supported P-states, in percent.
+ This attribute is present only if the value exposed by it is the same
+ for all of the CPUs in the system.
+
This attribute is read-only.
.. _no_turbo_attr:
``no_turbo``
If set (equal to 1), the driver is not allowed to set any turbo P-states
- (see `Turbo P-states Support`_). If unset (equalt to 0, which is the
+ (see `Turbo P-states Support`_). If unset (equal to 0, which is the
default), turbo P-states can be set by the driver.
[Note that ``intel_pstate`` does not support the general ``boost``
attribute (supported by some other scaling drivers) which is replaced
by this one.]
- This attrubute does not affect the maximum supported frequency value
+ This attribute does not affect the maximum supported frequency value
supplied to the ``CPUFreq`` core and exposed via the policy interface,
but it affects the maximum possible value of per-policy P-state limits
(see `Interpretation of Policy Attributes`_ below for details).
@@ -419,18 +436,24 @@ argument is passed to the kernel in the command line.
as well as the per-policy ones) are then reset to their default
values, possibly depending on the target operation mode.]
- That only is supported in some configurations, though (for example, if
- the `HWP feature is enabled in the processor <Active Mode With HWP_>`_,
- the operation mode of the driver cannot be changed), and if it is not
- supported in the current configuration, writes to this attribute will
- fail with an appropriate error.
+``energy_efficiency``
+ This attribute is only present on platforms with CPUs matching the Kaby
+ Lake or Coffee Lake desktop CPU model. By default, energy-efficiency
+ optimizations are disabled on these CPU models if HWP is enabled.
+ Enabling energy-efficiency optimizations may limit maximum operating
+ frequency with or without the HWP feature. With HWP enabled, the
+ optimizations are done only in the turbo frequency range. Without it,
+ they are done in the entire available frequency range. Setting this
+ attribute to "1" enables the energy-efficiency optimizations and setting
+ to "0" disables them.
Interpretation of Policy Attributes
-----------------------------------
The interpretation of some ``CPUFreq`` policy attributes described in
-:doc:`cpufreq` is special with ``intel_pstate`` as the current scaling driver
-and it generally depends on the driver's `operation mode <Operation Modes_>`_.
+Documentation/admin-guide/pm/cpufreq.rst is special with ``intel_pstate``
+as the current scaling driver and it generally depends on the driver's
+`operation mode <Operation Modes_>`_.
First of all, the values of the ``cpuinfo_max_freq``, ``cpuinfo_min_freq`` and
``scaling_cur_freq`` attributes are produced by applying a processor-specific
@@ -467,8 +490,8 @@ Next, the following policy attributes have special meaning if
policy for the time interval between the last two invocations of the
driver's utilization update callback by the CPU scheduler for that CPU.
-One more policy attribute is present if the `HWP feature is enabled in the
-processor <Active Mode With HWP_>`_:
+One more policy attribute is present if the HWP feature is enabled in the
+processor:
``base_frequency``
Shows the base frequency of the CPU. Any frequency above this will be
@@ -509,11 +532,11 @@ on the following rules, regardless of the current operation mode of the driver:
3. The global and per-policy limits can be set independently.
-If the `HWP feature is enabled in the processor <Active Mode With HWP_>`_, the
-resulting effective values are written into its registers whenever the limits
-change in order to request its internal P-state selection logic to always set
-P-states within these limits. Otherwise, the limits are taken into account by
-scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
+In the `active mode with the HWP feature enabled <Active Mode With HWP_>`_, the
+resulting effective values are written into hardware registers whenever the
+limits change in order to request the processor's internal P-state selection
+logic to always set P-states within these limits. Otherwise, the limits are
+taken into account
+by scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
every time before setting a new P-state for a CPU.
Additionally, if the ``intel_pstate=per_cpu_perf_limits`` command line argument
@@ -524,12 +547,11 @@ at all and the only way to set the limits is by using the policy attributes.
Energy vs Performance Hints
---------------------------
-If ``intel_pstate`` works in the `active mode with the HWP feature enabled
-<Active Mode With HWP_>`_ in the processor, additional attributes are present
-in every ``CPUFreq`` policy directory in ``sysfs``. They are intended to allow
-user space to help ``intel_pstate`` to adjust the processor's internal P-state
-selection logic by focusing it on performance or on energy-efficiency, or
-somewhere between the two extremes:
+If the hardware-managed P-states (HWP) feature is enabled in the processor,
+additional attributes are present in every ``CPUFreq`` policy directory in
+``sysfs``. They are intended to allow user space to help ``intel_pstate`` to
+adjust the processor's internal P-state selection logic by focusing it on
+performance or on energy-efficiency, or somewhere between the two extremes:
``energy_performance_preference``
Current value of the energy vs performance hint for the given policy
@@ -548,7 +570,11 @@ somewhere between the two extremes:
Strings written to the ``energy_performance_preference`` attribute are
internally translated to integer values written to the processor's
Energy-Performance Preference (EPP) knob (if supported) or its
-Energy-Performance Bias (EPB) knob.
+Energy-Performance Bias (EPB) knob. It is also possible to write an integer
+value between 0 and 255 if the EPP feature is present. If the EPP feature is
+not present, writing an integer value to this attribute is not supported. In
+that case, the user can use the
+"/sys/devices/system/cpu/cpu*/power/energy_perf_bias" interface.
[Note that tasks may be migrated from one CPU to another by the scheduler's
load-balancing algorithm and if different energy vs performance hints are
@@ -629,12 +655,14 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
Do not register ``intel_pstate`` as the scaling driver even if the
processor is supported by it.
+``active``
+ Register ``intel_pstate`` in the `active mode <Active Mode_>`_ to start
+ with.
+
``passive``
Register ``intel_pstate`` in the `passive mode <Passive Mode_>`_ to
start with.
- This option implies the ``no_hwp`` one described below.
-
``force``
Register ``intel_pstate`` as the scaling driver instead of
``acpi-cpufreq`` even if the latter is preferred on the given system.
@@ -649,13 +677,12 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
driver is used instead of ``acpi-cpufreq``.
``no_hwp``
- Do not enable the `hardware-managed P-states (HWP) feature
- <Active Mode With HWP_>`_ even if it is supported by the processor.
+ Do not enable the hardware-managed P-states (HWP) feature even if it is
+ supported by the processor.
``hwp_only``
Register ``intel_pstate`` as the scaling driver only if the
- `hardware-managed P-states (HWP) feature <Active Mode With HWP_>`_ is
- supported by the processor.
+ hardware-managed P-states (HWP) feature is supported by the processor.
``support_acpi_ppc``
Take ACPI ``_PPC`` performance limits into account.
@@ -702,7 +729,7 @@ core (for the policies with other scaling governors).
The ``ftrace`` interface can be used for low-level diagnostics of
``intel_pstate``. For example, to check how often the function to set a
-P-state is called, the ``ftrace`` filter can be set to to
+P-state is called, the ``ftrace`` filter can be set to
:c:func:`intel_pstate_set_pstate`::
# cd /sys/kernel/debug/tracing/
@@ -734,10 +761,10 @@ References
==========
.. [1] Kristen Accardi, *Balancing Power and Performance in the Linux Kernel*,
- http://events.linuxfoundation.org/sites/events/files/slides/LinuxConEurope_2015.pdf
+ https://events.static.linuxfound.org/sites/events/files/slides/LinuxConEurope_2015.pdf
.. [2] *Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3: System Programming Guide*,
- http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
+ https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
.. [3] *Advanced Configuration and Power Interface Specification*,
https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
diff --git a/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
new file mode 100644
index 000000000000..09169d935835
--- /dev/null
+++ b/Documentation/admin-guide/pm/intel_uncore_frequency_scaling.rst
@@ -0,0 +1,60 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+==============================
+Intel Uncore Frequency Scaling
+==============================
+
+:Copyright: |copy| 2022 Intel Corporation
+
+:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
+
+Introduction
+------------
+
+The uncore can consume a significant amount of power in Intel's Xeon servers,
+depending on the workload characteristics. To optimize the total power and
+improve overall performance, SoCs have internal algorithms for scaling uncore
+frequency. These algorithms monitor workload usage of the uncore and set a
+desirable frequency.
+
+It is possible that users have different expectations of uncore performance and
+want to have control over it. The objective is similar to allowing users to set
+the scaling min/max frequencies via cpufreq sysfs to improve CPU performance.
+Users may have some latency sensitive workloads where they do not want any
+change to uncore frequency. Also, users may have workloads which require
+different core and uncore performance at distinct phases and they may want to
+use both cpufreq and the uncore scaling interface to distribute power and
+improve overall performance.
+
+Sysfs Interface
+---------------
+
+To control uncore frequency, a sysfs interface is provided in the directory:
+`/sys/devices/system/cpu/intel_uncore_frequency/`.
+
+There is one directory for each package and die combination as the scope of
+uncore scaling control is per die in multiple die/package SoCs or per
+package for single die per package SoCs. The name represents the
+scope of control. For example: 'package_00_die_00' is for package id 0 and
+die 0.
+
+Each package_*_die_* contains the following attributes:
+
+``initial_max_freq_khz``
+ Out of reset, this attribute represents the maximum possible frequency.
+ This is a read-only attribute. If users adjust max_freq_khz,
+ they can always go back to maximum using the value from this attribute.
+
+``initial_min_freq_khz``
+ Out of reset, this attribute represents the minimum possible frequency.
+ This is a read-only attribute. If users adjust min_freq_khz,
+ they can always go back to minimum using the value from this attribute.
+
+``max_freq_khz``
+ This attribute is used to set the maximum uncore frequency.
+
+``min_freq_khz``
+ This attribute is used to set the minimum uncore frequency.
+
+``current_freq_khz``
+ This attribute is used to get the current uncore frequency.
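
The attributes above can be read with ordinary file I/O. The following is a minimal sketch, not part of the driver: the function names ``read_uncore_frequencies`` and ``restore_initial_limits`` are hypothetical, and the code assumes that every attribute holds a single decimal integer in kHz, as the attribute names suggest.

```python
from pathlib import Path

# Directory named in the text above; one sub-directory per package/die pair.
UNCORE_ROOT = Path("/sys/devices/system/cpu/intel_uncore_frequency")

ATTRIBUTES = (
    "initial_min_freq_khz",
    "initial_max_freq_khz",
    "min_freq_khz",
    "max_freq_khz",
    "current_freq_khz",
)


def read_uncore_frequencies(root=UNCORE_ROOT):
    """Return {directory name: {attribute: value}} for each package_*_die_*."""
    result = {}
    for die_dir in sorted(Path(root).glob("package_*_die_*")):
        values = {}
        for name in ATTRIBUTES:
            attr = die_dir / name
            if attr.is_file():  # tolerate attributes that are not present
                # Assumption: the file holds one decimal integer (kHz).
                values[name] = int(attr.read_text().strip())
        result[die_dir.name] = values
    return result


def restore_initial_limits(die_dir):
    """Copy the initial_* values back into max_freq_khz and min_freq_khz."""
    die_dir = Path(die_dir)
    # Raise the maximum first so that min <= max holds at every step.
    for initial, current in (
        ("initial_max_freq_khz", "max_freq_khz"),
        ("initial_min_freq_khz", "min_freq_khz"),
    ):
        value = (die_dir / initial).read_text().strip()
        (die_dir / current).write_text(value)
```

Writing ``max_freq_khz`` or ``min_freq_khz`` on a real system normally requires root privileges. The ``root`` parameter exists only so that the sketch can be tried against a scratch directory.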
diff --git a/Documentation/admin-guide/pm/suspend-flows.rst b/Documentation/admin-guide/pm/suspend-flows.rst
new file mode 100644
index 000000000000..c479d7462647
--- /dev/null
+++ b/Documentation/admin-guide/pm/suspend-flows.rst
@@ -0,0 +1,270 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+=========================
+System Suspend Code Flows
+=========================
+
+:Copyright: |copy| 2020 Intel Corporation
+
+:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+At least one global system-wide transition needs to be carried out for the
+system to get from the working state into one of the supported
+:doc:`sleep states <sleep-states>`. Hibernation requires more than one
+transition to occur for this purpose, but the other sleep states, commonly
+referred to as *system-wide suspend* (or simply *system suspend*) states, need
+only one.
+
+For those sleep states, the transition from the working state of the system into
+the target sleep state is referred to as *system suspend* too (in the majority
+of cases, whether this means a transition or a sleep state of the system should
+be clear from the context) and the transition back from the sleep state into the
+working state is referred to as *system resume*.
+
+The kernel code flows associated with the suspend and resume transitions for
+different sleep states of the system are quite similar, but there are some
+significant differences between the :ref:`suspend-to-idle <s2idle>` code flows
+and the code flows related to the :ref:`suspend-to-RAM <s2ram>` and
+:ref:`standby <standby>` sleep states.
+
+The :ref:`suspend-to-RAM <s2ram>` and :ref:`standby <standby>` sleep states
+cannot be implemented without platform support and the difference between them
+boils down to the platform-specific actions carried out by the suspend and
+resume hooks that need to be provided by the platform driver to make them
+available. Apart from that, the suspend and resume code flows for these sleep
+states are mostly identical, so they both together will be referred to as
+*platform-dependent suspend* states in what follows.
+
+
+.. _s2idle_suspend:
+
+Suspend-to-idle Suspend Code Flow
+=================================
+
+The following steps are taken in order to transition the system from the working
+state to the :ref:`suspend-to-idle <s2idle>` sleep state:
+
+ 1. Invoking system-wide suspend notifiers.
+
+ Kernel subsystems can register callbacks to be invoked when the suspend
+ transition is about to occur and when the resume transition has finished.
+
+ That allows them to prepare for the change of the system state and to clean
+ up after getting back to the working state.
+
+ 2. Freezing tasks.
+
+ Tasks are frozen primarily in order to avoid unchecked hardware accesses
+ from user space through MMIO regions or I/O registers exposed directly to
+ it and to prevent user space from entering the kernel while the next step
+ of the transition is in progress (which might have been problematic for
+ various reasons).
+
+ All user space tasks are intercepted as though they were sent a signal and
+ put into uninterruptible sleep until the end of the subsequent system resume
+ transition.
+
+ The kernel threads that choose to be frozen during system suspend for
+ specific reasons are frozen subsequently, but they are not intercepted.
+ Instead, they are expected to periodically check whether or not they need
+ to be frozen and to put themselves into uninterruptible sleep if so. [Note,
+ however, that kernel threads can use locking and other concurrency controls
+ available in kernel space to synchronize themselves with system suspend and
+ resume, which can be much more precise than the freezing, so the latter is
+ not a recommended option for kernel threads.]
+
+ 3. Suspending devices and reconfiguring IRQs.
+
+ Devices are suspended in four phases called *prepare*, *suspend*,
+ *late suspend* and *noirq suspend* (see :ref:`driverapi_pm_devices` for more
+ information on what exactly happens in each phase).
+
+ Every device is visited in each phase, but typically it is not physically
+ accessed in more than two of them.
+
+ The runtime PM API is disabled for every device during the *late* suspend
+ phase and high-level ("action") interrupt handlers are prevented from being
+ invoked before the *noirq* suspend phase.
+
+ Interrupts are still handled after that, but they are only acknowledged to
+ interrupt controllers without performing any device-specific actions that
+ would be triggered in the working state of the system (those actions are
+ deferred till the subsequent system resume transition as described
+ `below <s2idle_resume_>`_).
+
+ IRQs associated with system wakeup devices are "armed" so that the resume
+ transition of the system is started when one of them signals an event.
+
+ 4. Freezing the scheduler tick and suspending timekeeping.
+
+ When all devices have been suspended, CPUs enter the idle loop and are put
+ into the deepest available idle state. While doing that, each of them
+ "freezes" its own scheduler tick so that the timer events associated with
+ the tick do not occur until the CPU is woken up by another interrupt source.
+
+ The last CPU to enter the idle state also stops the timekeeping which
+ (among other things) prevents high resolution timers from triggering going
+ forward until the first CPU that is woken up restarts the timekeeping.
+ That allows the CPUs to stay in the deep idle state relatively long in one
+ go.
+
+ From this point on, the CPUs can only be woken up by non-timer hardware
+ interrupts. If that happens, they go back to the idle state unless the
+ interrupt that woke up one of them comes from an IRQ that has been armed for
+ system wakeup, in which case the system resume transition is started.
+
+
+.. _s2idle_resume:
+
+Suspend-to-idle Resume Code Flow
+================================
+
+The following steps are taken in order to transition the system from the
+:ref:`suspend-to-idle <s2idle>` sleep state into the working state:
+
+ 1. Resuming timekeeping and unfreezing the scheduler tick.
+
+ When one of the CPUs is woken up (by a non-timer hardware interrupt), it
+ leaves the idle state entered in the last step of the preceding suspend
+ transition, restarts the timekeeping (unless it has been restarted already
+ by another CPU that woke up earlier) and the scheduler tick on that CPU is
+ unfrozen.
+
+ If the interrupt that has woken up the CPU was armed for system wakeup,
+ the system resume transition begins.
+
+ 2. Resuming devices and restoring the working-state configuration of IRQs.
+
+ Devices are resumed in four phases called *noirq resume*, *early resume*,
+ *resume* and *complete* (see :ref:`driverapi_pm_devices` for more
+ information on what exactly happens in each phase).
+
+ Every device is visited in each phase, but typically it is not physically
+ accessed in more than two of them.
+
+ The working-state configuration of IRQs is restored after the *noirq* resume
+ phase and the runtime PM API is re-enabled for every device whose driver
+ supports it during the *early* resume phase.
+
+ 3. Thawing tasks.
+
+ Tasks frozen in step 2 of the preceding `suspend <s2idle_suspend_>`_
+ transition are "thawed", which means that they are woken up from the
+ uninterruptible sleep that they went into at that time and user space tasks
+ are allowed to exit the kernel.
+
+ 4. Invoking system-wide resume notifiers.
+
+ This is analogous to step 1 of the `suspend <s2idle_suspend_>`_ transition
+ and the same set of callbacks is invoked at this point, but a different
+ "notification type" parameter value is passed to them.
+
+
+Platform-dependent Suspend Code Flow
+====================================
+
+The following steps are taken in order to transition the system from the working
+state to a platform-dependent suspend state:
+
+ 1. Invoking system-wide suspend notifiers.
+
+ This step is the same as step 1 of the suspend-to-idle suspend transition
+ described `above <s2idle_suspend_>`_.
+
+ 2. Freezing tasks.
+
+ This step is the same as step 2 of the suspend-to-idle suspend transition
+ described `above <s2idle_suspend_>`_.
+
+ 3. Suspending devices and reconfiguring IRQs.
+
+ This step is analogous to step 3 of the suspend-to-idle suspend transition
+ described `above <s2idle_suspend_>`_, but the arming of IRQs for system
+ wakeup generally does not have any effect on the platform.
+
+ There are platforms that can go into a very deep low-power state internally
+ when all CPUs in them are in sufficiently deep idle states and all I/O
+ devices have been put into low-power states. On those platforms,
+ suspend-to-idle can reduce system power very effectively.
+
+ On the other platforms, however, low-level components (like interrupt
+ controllers) need to be turned off in a platform-specific way (implemented
+ in the hooks provided by the platform driver) to achieve comparable power
+ reduction.
+
+ That usually prevents in-band hardware interrupts from waking up the system,
+ which must be done in a special platform-dependent way. Then, the
+ configuration of system wakeup sources usually starts when system wakeup
+ devices are suspended and is finalized by the platform suspend hooks later
+ on.
+
+ 4. Disabling non-boot CPUs.
+
+ On some platforms the suspend hooks mentioned above must run in a one-CPU
+ configuration of the system (in particular, the hardware cannot be accessed
+ by any code running in parallel with the platform suspend hooks that may,
+ and often do, trap into the platform firmware in order to finalize the
+ suspend transition).
+
+ For this reason, the CPU offline/online (CPU hotplug) framework is used
+ to take all of the CPUs in the system, except for one (the boot CPU),
+ offline (typically, the CPUs that have been taken offline go into deep idle
+ states).
+
+ This means that all tasks are migrated away from those CPUs and all IRQs are
+ rerouted to the only CPU that remains online.
+
+ 5. Suspending core system components.
+
+ This prepares the core system components for (possibly) losing power going
+ forward and suspends the timekeeping.
+
+ 6. Platform-specific power removal.
+
+ This is expected to remove power from all of the system components except
+ for the memory controller and RAM (in order to preserve the contents of the
+ latter) and some devices designated for system wakeup.
+
+ In many cases control is passed to the platform firmware which is expected
+ to finalize the suspend transition as needed.
+
+
+Platform-dependent Resume Code Flow
+===================================
+
+The following steps are taken in order to transition the system from a
+platform-dependent suspend state into the working state:
+
+ 1. Platform-specific system wakeup.
+
+ The platform is woken up by a signal from one of the designated system
+ wakeup devices (which need not be an in-band hardware interrupt) and
+ control is passed back to the kernel (the working configuration of the
+ platform may need to be restored by the platform firmware before the
+ kernel gets control again).
+
+ 2. Resuming core system components.
+
+ The suspend-time configuration of the core system components is restored and
+ the timekeeping is resumed.
+
+ 3. Re-enabling non-boot CPUs.
+
+ The CPUs disabled in step 4 of the preceding suspend transition are taken
+ back online and their suspend-time configuration is restored.
+
+ 4. Resuming devices and restoring the working-state configuration of IRQs.
+
+ This step is the same as step 2 of the suspend-to-idle resume transition
+ described `above <s2idle_resume_>`_.
+
+ 5. Thawing tasks.
+
+ This step is the same as step 3 of the suspend-to-idle resume transition
+ described `above <s2idle_resume_>`_.
+
+ 6. Invoking system-wide resume notifiers.
+
+ This step is the same as step 4 of the suspend-to-idle resume transition
+ described `above <s2idle_resume_>`_.
diff --git a/Documentation/admin-guide/pm/system-wide.rst b/Documentation/admin-guide/pm/system-wide.rst
index 2b1f987b34f0..1a1924d71006 100644
--- a/Documentation/admin-guide/pm/system-wide.rst
+++ b/Documentation/admin-guide/pm/system-wide.rst
@@ -8,3 +8,4 @@ System-Wide Power Management
:maxdepth: 2
sleep-states
+ suspend-flows
diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst
index 88f717e59a42..ee45887811ff 100644
--- a/Documentation/admin-guide/pm/working-state.rst
+++ b/Documentation/admin-guide/pm/working-state.rst
@@ -11,4 +11,8 @@ Working-State Power Management
intel_idle
cpufreq
intel_pstate
+ amd-pstate
+ cpufreq_drivers
intel_epb
+ intel-speed-select
+ intel_uncore_frequency_scaling
diff --git a/Documentation/admin-guide/pnp.rst b/Documentation/admin-guide/pnp.rst
index bab2d10631f0..3eda08191d13 100644
--- a/Documentation/admin-guide/pnp.rst
+++ b/Documentation/admin-guide/pnp.rst
@@ -281,10 +281,6 @@ ISAPNP drivers. They should serve as a temporary solution only.
They are as follows::
- struct pnp_card *pnp_find_card(unsigned short vendor,
- unsigned short device,
- struct pnp_card *from)
-
struct pnp_dev *pnp_find_dev(struct pnp_card *card,
unsigned short vendor,
unsigned short function,
diff --git a/Documentation/admin-guide/pstore-blk.rst b/Documentation/admin-guide/pstore-blk.rst
new file mode 100644
index 000000000000..2d22ead9520e
--- /dev/null
+++ b/Documentation/admin-guide/pstore-blk.rst
@@ -0,0 +1,234 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+pstore block oops/panic logger
+==============================
+
+Introduction
+------------
+
+pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
+block device or a non-block device before the system crashes. You can get
+these log files by mounting the pstore filesystem like this::
+
+ mount -t pstore pstore /sys/fs/pstore
+
+
+pstore block concepts
+---------------------
+
+pstore/blk provides an efficient configuration method, which divides all
+configurations into two parts: configurations for the user and
+configurations for the driver.
+
+Configurations for user determine how pstore/blk works, such as pmsg_size,
+kmsg_size and so on. All of them support both Kconfig and module parameters,
+but module parameters have priority over Kconfig.
+
+Configurations for driver are all about the block device or non-block device,
+such as the total_size of the block device and its read/write operations.
+
+Configurations for user
+-----------------------
+
+All of these configurations support both Kconfig and module parameters, but
+module parameters have priority over Kconfig.
+
+Here is an example for module parameters::
+
+ pstore_blk.blkdev=/dev/mmcblk0p7 pstore_blk.kmsg_size=64 best_effort=y
+
+The details of each configuration are described below.
+
+blkdev
+~~~~~~
+
+The block device to use. Most of the time, it is a partition of a block device.
+It is required for pstore/blk. It is also used for an MTD device.
+
+When pstore/blk is built as a module, "blkdev" accepts the following variants:
+
+1. /dev/<disk_name> represents the device number of disk
+#. /dev/<disk_name><decimal> represents the device number of partition - device
+ number of disk plus the partition number
+#. /dev/<disk_name>p<decimal> - same as the above; this form is used when disk
+ name of partitioned disk ends with a digit.
+
+When pstore/blk is built into the kernel, "blkdev" accepts the following variants:
+
+#. <hex_major><hex_minor> device number in hexadecimal representation,
+ with no leading 0x, for example b302.
+#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of
+ a partition if the partition table provides it. The UUID may be either an
+ EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
+ where SSSSSSSS is a zero-filled hex representation of the 32-bit
+ "NT disk signature", and PP is a zero-filled hex representation of the
+ 1-based partition number.
+#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
+ partition with a known unique id.
+#. <major>:<minor> major and minor number of the device separated by a colon.
+
+It accepts the following variants for MTD device:
+
+1. <device name> MTD device name. "pstore" is recommended.
+#. <device number> MTD device number.
+
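Some of the built-in "blkdev" forms listed above are plain string encodings of numbers, so they can be produced mechanically. The sketch below is illustrative only: the helper names are hypothetical, the lowercase hex output is a choice of this example, and ``hex_devnum`` assumes that the minor number fits in two hex digits (which is how the ``b302`` example reads as major 0xb3, minor 0x02).

```python
def hex_devnum(major, minor):
    """<hex_major><hex_minor> form with no leading 0x, e.g. b302.

    Assumption: the minor number fits in two hex digits (0..255).
    """
    if not 0 <= minor <= 0xFF:
        raise ValueError("this sketch only handles minor numbers 0..255")
    if major < 0:
        raise ValueError("major number must not be negative")
    return f"{major:x}{minor:02x}"


def colon_devnum(major, minor):
    """<major>:<minor> form, both numbers in decimal."""
    return f"{major}:{minor}"


def msdos_partuuid(disk_signature, partition_number):
    """PARTUUID=SSSSSSSS-PP form for an MSDOS partition table.

    SSSSSSSS is the zero-filled hex 32-bit "NT disk signature" and PP is
    the zero-filled hex 1-based partition number, as described above.
    """
    if not 0 <= disk_signature <= 0xFFFFFFFF:
        raise ValueError("disk signature must fit in 32 bits")
    if not 1 <= partition_number <= 0xFF:
        raise ValueError("partition number is 1-based and two hex digits")
    return f"PARTUUID={disk_signature:08x}-{partition_number:02x}"
```

For instance, ``hex_devnum(179, 2)`` yields the ``b302`` string used as the example in the list above.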
+kmsg_size
+~~~~~~~~~
+
+The chunk size in KB for the oops/panic front-end. It **MUST** be a multiple
+of 4. It is optional if you do not care about the oops/panic log.
+
+The number of chunks for the oops/panic front-end depends on the space that
+remains after the other pstore front-ends have been accounted for.
+
+pstore/blk will log to the oops/panic chunks one by one, and always overwrite
+the oldest chunk when no free chunk is left.
+
+pmsg_size
+~~~~~~~~~
+
+The chunk size in KB for the pmsg front-end. It **MUST** be a multiple of 4.
+It is optional if you do not care about the pmsg log.
+
+Unlike the oops/panic front-end, there is only one chunk for the pmsg front-end.
+
+Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
+appended to the chunk. On reboot the contents are available in
+*/sys/fs/pstore/pmsg-pstore-blk-0*.
+
+console_size
+~~~~~~~~~~~~
+
+The chunk size in KB for the console front-end. It **MUST** be a multiple of 4.
+It is optional if you do not care about the console log.
+
+Similar to the pmsg front-end, there is only one chunk for the console
+front-end.
+
+All console logs are appended to the chunk. On reboot the contents are
+available in */sys/fs/pstore/console-pstore-blk-0*.
+
+ftrace_size
+~~~~~~~~~~~
+
+The chunk size in KB for the ftrace front-end. It **MUST** be a multiple of 4.
+It is optional if you do not care about the ftrace log.
+
+Similar to the oops front-end, there are multiple chunks for the ftrace
+front-end, depending on the number of CPUs. Each chunk size is equal to
+ftrace_size / processors_count.
+
+All ftrace logs are appended to the chunks. On reboot the contents are
+combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*.
+
+Persistent function tracing might be useful for debugging software or hardware
+related hangs. Here is an example of usage::
+
+ # mount -t pstore pstore /sys/fs/pstore
+ # mount -t debugfs debugfs /sys/kernel/debug/
+ # echo 1 > /sys/kernel/debug/pstore/record_ftrace
+ # reboot -f
+ [...]
+ # mount -t pstore pstore /sys/fs/pstore
+ # tail /sys/fs/pstore/ftrace-pstore-blk-0
+ CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0
+ CPU:0 ts:5914678 c039ecdc c006385c cpuidle_enter_state <- call_cpuidle+0x44/0x48
+ CPU:0 ts:5914680 c039e9a0 c039ecf0 cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314
+ CPU:0 ts:5914681 c0063870 c039ea30 sched_idle_set_state <- cpuidle_enter_state+0x44/0x314
+ CPU:1 ts:5916720 c0160f59 c015ee04 kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204
+ CPU:1 ts:5916721 c05ca625 c015ee0c __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204
+ CPU:1 ts:5916723 c05c813d c05ca630 yield_to <- __mutex_lock_slowpath+0x314/0x358
+ CPU:1 ts:5916724 c05ca2d1 c05ca638 __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358
+
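The size rules stated for ``kmsg_size``, ``pmsg_size``, ``console_size`` and ``ftrace_size`` can be checked before the values are passed to the module. This is a minimal sketch with hypothetical helper names. Integer division is used for the per-CPU ftrace chunk, which is an assumption of this example, not something the text specifies.

```python
def check_chunk_size_kb(name, size_kb):
    """Validate a front-end chunk size given in KB.

    The text requires each size to be a multiple of 4; a negative size
    makes no sense, so it is rejected here as well.
    """
    if size_kb < 0 or size_kb % 4 != 0:
        raise ValueError(f"{name}={size_kb}: must be a multiple of 4 (KB)")
    return size_kb


def ftrace_chunk_kb(ftrace_size_kb, processors_count):
    """Per-CPU ftrace chunk size: ftrace_size / processors_count.

    Assumption: the result is truncated to a whole number of KB.
    """
    check_chunk_size_kb("ftrace_size", ftrace_size_kb)
    if processors_count < 1:
        raise ValueError("processors_count must be at least 1")
    return ftrace_size_kb // processors_count
```

With ``ftrace_size=64`` on a four-CPU system, each CPU would get a 16 KB chunk under these assumptions.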
+max_reason
+~~~~~~~~~~
+
+Limiting which kinds of kmsg dumps are stored can be controlled via
+the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
+``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
+``max_reason`` should be set to 2 (KMSG_DUMP_OOPS); to store only Panics,
+``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
+(KMSG_DUMP_UNDEF) means the reason filtering will be controlled by the
+``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
+otherwise KMSG_DUMP_MAX.
+
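The ``max_reason`` rules above amount to a small decision function. The sketch below models them in Python for illustration. It is not kernel code, the function names are hypothetical, and it assumes that a dump is stored when its reason value does not exceed the effective limit. ``None`` stands in for KMSG_DUMP_MAX, whose numeric value is not given in the text.

```python
# Values quoted in the text above (from enum kmsg_dump_reason).
KMSG_DUMP_UNDEF = 0
KMSG_DUMP_PANIC = 1
KMSG_DUMP_OOPS = 2


def effective_max_reason(max_reason, always_kmsg_dump):
    """Return the effective limit, or None to mean KMSG_DUMP_MAX."""
    if max_reason != KMSG_DUMP_UNDEF:
        return max_reason
    # max_reason == 0: printk.always_kmsg_dump decides.
    return None if always_kmsg_dump else KMSG_DUMP_OOPS


def is_stored(reason, max_reason, always_kmsg_dump=False):
    """Assumption: a dump is kept when reason <= the effective limit."""
    limit = effective_max_reason(max_reason, always_kmsg_dump)
    return limit is None or reason <= limit
```

For example, ``is_stored(KMSG_DUMP_OOPS, 1)`` is false because only panics pass a limit of 1, while ``is_stored(KMSG_DUMP_OOPS, 0)`` is true because the default limit is KMSG_DUMP_OOPS.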
+Configurations for driver
+-------------------------
+
+A device driver uses ``register_pstore_device`` with
+``struct pstore_device_info`` to register to pstore/blk.
+
+.. kernel-doc:: fs/pstore/blk.c
+ :export:
+
+Compression and header
+----------------------
+
+A block device is large enough for uncompressed oops data. In fact, we do not
+recommend data compression because pstore/blk will insert some information into
+the first line of oops/panic data. For example::
+
+ Panic: Total 16 times
+
+It means that this is the 16th oops or panic since the first boot. The number
+of oopses and panics since the first boot is sometimes important for judging
+whether the system is stable.
+
+The following line is inserted by pstore filesystem. For example::
+
+ Oops#2 Part1
+
+It means that this is the 2nd oops of the last boot.
+
+Reading the data
+----------------
+
+The dump data can be read from the pstore filesystem. The format for these
+files is ``dmesg-pstore-blk-[N]`` for oops/panic front-end,
+``pmsg-pstore-blk-0`` for pmsg front-end and so on. The timestamp of the
+dump file records the trigger time. To delete a stored record from block
+device, simply unlink the respective pstore file.
+
+Attentions in panic read/write APIs
+-----------------------------------
+
+On panic, the kernel is not going to run for much longer: tasks will not be
+scheduled and most kernel resources will be out of service. The system
+behaves like a single-threaded program running on a single-core computer.
+
+The following points require special attention for panic read/write APIs:
+
+1. Can **NOT** allocate any memory.
+ If you need memory, just allocate while the block driver is initializing
+ rather than waiting until the panic.
+#. Must be polled, **NOT** interrupt driven.
+ Tasks are no longer scheduled. The block driver should delay to ensure that
+ the write succeeds, but must NOT sleep.
+#. Can **NOT** take any lock.
+ There is no other task, nor any shared resource; you are safe to break all
+ locks.
+#. Just use CPU to transfer.
+ Do not use DMA to transfer unless you are sure that DMA will not hold a lock.
+#. Control registers directly.
+ Please control registers directly rather than use Linux kernel resources.
+ Do I/O map while initializing rather than wait until a panic occurs.
+#. Reset your block device and controller if necessary.
+ If you are not sure of the state of your block device and controller when
+ a panic occurs, you are safe to stop and reset them.
+
+pstore/blk supports psblk_blkdev_info(), which is defined in
+*linux/pstore_blk.h*, to get information about the block device in use, such
+as the device number, sector count and start sector of the whole disk.
+
+pstore block internals
+----------------------
+
+For developer reference, here are all the important structures and APIs:
+
+.. kernel-doc:: fs/pstore/zone.c
+ :internal:
+
+.. kernel-doc:: include/linux/pstore_zone.h
+ :internal:
+
+.. kernel-doc:: include/linux/pstore_blk.h
+ :internal:
diff --git a/Documentation/admin-guide/ramoops.rst b/Documentation/admin-guide/ramoops.rst
index 6dbcc5481000..e9f85142182d 100644
--- a/Documentation/admin-guide/ramoops.rst
+++ b/Documentation/admin-guide/ramoops.rst
@@ -3,7 +3,7 @@ Ramoops oops/panic logger
Sergiu Iordache <sergiu@chromium.org>
-Updated: 17 November 2011
+Updated: 10 Feb 2021
Introduction
------------
@@ -22,7 +22,7 @@ and type of the memory area are set using three variables:
* ``mem_address`` for the start
* ``mem_size`` for the size. The memory size will be rounded down to a
power of two.
- * ``mem_type`` to specifiy if the memory type (default is pgprot_writecombine).
+ * ``mem_type`` to specify the memory type (default is pgprot_writecombine).
Typically the default value of ``mem_type=0`` should be used as that sets the pstore
mapping to pgprot_writecombine. Setting ``mem_type=1`` attempts to use
@@ -30,13 +30,21 @@ mapping to pgprot_writecombine. Setting ``mem_type=1`` attempts to use
depends on atomic operations. At least on ARM, pgprot_noncached causes the
memory to be mapped strongly ordered, and atomic operations on strongly ordered
memory are implementation defined, and won't work on many ARMs such as omaps.
+Setting ``mem_type=2`` attempts to treat the memory region as normal memory,
+which enables full cache on it. This can improve the performance.
The memory area is divided into ``record_size`` chunks (also rounded down to
-power of two) and each oops/panic writes a ``record_size`` chunk of
+power of two) and each kmsg dump writes a ``record_size`` chunk of
information.
-Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
-variable while setting 0 in that variable dumps only the panics.
+Limiting which kinds of kmsg dumps are stored can be controlled via
+the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
+``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
+``max_reason`` should be set to 2 (KMSG_DUMP_OOPS); to store only Panics,
+``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
+(KMSG_DUMP_UNDEF) means the reason filtering will be controlled by the
+``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
+otherwise KMSG_DUMP_MAX.
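
As an illustration only (not part of this patch), the ``max_reason`` rules described in the added text can be sketched in Python. The values 0, 1 and 2 come from the text above; the numeric value used for ``KMSG_DUMP_MAX`` and the "store a dump when its reason is at most ``max_reason``" comparison are assumptions made for this sketch, and the helper names are hypothetical.

```python
# Values 0..2 are the ones named in the documentation text above.
KMSG_DUMP_UNDEF = 0
KMSG_DUMP_PANIC = 1
KMSG_DUMP_OOPS = 2
# Assumption: stand-in upper bound for KMSG_DUMP_MAX; check
# include/linux/kmsg_dump.h in your tree for the real value.
KMSG_DUMP_MAX = 5


def effective_max_reason(max_reason, always_kmsg_dump=False):
    """Resolve max_reason=0 the way the text describes."""
    if max_reason == KMSG_DUMP_UNDEF:
        # Filtering is delegated to printk.always_kmsg_dump.
        return KMSG_DUMP_MAX if always_kmsg_dump else KMSG_DUMP_OOPS
    return max_reason


def is_stored(reason, max_reason, always_kmsg_dump=False):
    """Assumed filter: keep dumps whose reason does not exceed the limit."""
    limit = effective_max_reason(max_reason, always_kmsg_dump)
    return KMSG_DUMP_UNDEF < reason <= limit


if __name__ == "__main__":
    for max_reason in (0, 1, 2):
        stored = [r for r in (KMSG_DUMP_PANIC, KMSG_DUMP_OOPS)
                  if is_stored(r, max_reason)]
        print("max_reason=%d stores reasons %s" % (max_reason, stored))
```

Under these assumptions, ``max_reason=1`` keeps only panics, while ``max_reason=2`` and the default ``max_reason=0`` (with ``printk.always_kmsg_dump`` unset) keep both panics and oopses.
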
The module uses a counter to record multiple dumps but the counter gets reset
on restart (i.e. new dumps after the restart will overwrite old ones).
@@ -61,7 +69,7 @@ Setting the ramoops parameters can be done in several different manners:
mem=128M ramoops.mem_address=0x8000000 ramoops.ecc=1
B. Use Device Tree bindings, as described in
- ``Documentation/devicetree/bindings/reserved-memory/ramoops.txt``.
+ ``Documentation/devicetree/bindings/reserved-memory/ramoops.yaml``.
For example::
reserved-memory {
@@ -90,7 +98,7 @@ Setting the ramoops parameters can be done in several different manners:
.mem_address = <...>,
.mem_type = <...>,
.record_size = <...>,
- .dump_oops = <...>,
+ .max_reason = <...>,
.ecc = <...>,
};
diff --git a/Documentation/admin-guide/ras.rst b/Documentation/admin-guide/ras.rst
index 0310db624964..7b481b2a368e 100644
--- a/Documentation/admin-guide/ras.rst
+++ b/Documentation/admin-guide/ras.rst
@@ -156,11 +156,11 @@ the labels provided by the BIOS won't match the real ones.
ECC memory
----------
-As mentioned on the previous section, ECC memory has extra bits to be
-used for error correction. So, on 64 bit systems, a memory module
-has 64 bits of *data width*, and 74 bits of *total width*. So, there are
-8 bits extra bits to be used for the error detection and correction
-mechanisms. Those extra bits are called *syndrome*\ [#f1]_\ [#f2]_.
+As mentioned in the previous section, ECC memory has extra bits to be
+used for error correction. In the above example, a memory module has
+64 bits of *data width*, and 72 bits of *total width*. The extra 8
+bits which are used for the error detection and correction mechanisms
+are referred to as the *syndrome*\ [#f1]_\ [#f2]_.
So, when the cpu requests the memory controller to write a word with
*data width*, the memory controller calculates the *syndrome* in real time,
@@ -212,7 +212,7 @@ EDAC - Error Detection And Correction
purposes.
When the subsystem was pushed upstream for the first time, on
- Kernel 2.6.16, for the first time, it was renamed to ``EDAC``.
+ Kernel 2.6.16, it was renamed to ``EDAC``.
Purpose
-------
@@ -351,15 +351,17 @@ controllers. The following example will assume 2 channels:
+------------+-----------+-----------+
| | ``ch0`` | ``ch1`` |
+============+===========+===========+
- | ``csrow0`` | DIMM_A0 | DIMM_B0 |
- | | rank0 | rank0 |
- +------------+ - | - |
+ | |**DIMM_A0**|**DIMM_B0**|
+ +------------+-----------+-----------+
+ | ``csrow0`` | rank0 | rank0 |
+ +------------+-----------+-----------+
| ``csrow1`` | rank1 | rank1 |
+------------+-----------+-----------+
- | ``csrow2`` | DIMM_A1 | DIMM_B1 |
- | | rank0 | rank0 |
- +------------+ - | - |
- | ``csrow3`` | rank1 | rank1 |
+ | |**DIMM_A1**|**DIMM_B1**|
+ +------------+-----------+-----------+
+ | ``csrow2`` | rank0 | rank0 |
+ +------------+-----------+-----------+
+ | ``csrow3`` | rank1 | rank1 |
+------------+-----------+-----------+
In the above example, there are 4 physical slots on the motherboard
diff --git a/Documentation/admin-guide/reporting-bugs.rst b/Documentation/admin-guide/reporting-bugs.rst
deleted file mode 100644
index 49ac8dc3594d..000000000000
--- a/Documentation/admin-guide/reporting-bugs.rst
+++ /dev/null
@@ -1,182 +0,0 @@
-.. _reportingbugs:
-
-Reporting bugs
-++++++++++++++
-
-Background
-==========
-
-The upstream Linux kernel maintainers only fix bugs for specific kernel
-versions. Those versions include the current "release candidate" (or -rc)
-kernel, any "stable" kernel versions, and any "long term" kernels.
-
-Please see https://www.kernel.org/ for a list of supported kernels. Any
-kernel marked with [EOL] is "end of life" and will not have any fixes
-backported to it.
-
-If you've found a bug on a kernel version that isn't listed on kernel.org,
-contact your Linux distribution or embedded vendor for support.
-Alternatively, you can attempt to run one of the supported stable or -rc
-kernels, and see if you can reproduce the bug on that. It's preferable
-to reproduce the bug on the latest -rc kernel.
-
-
-How to report Linux kernel bugs
-===============================
-
-
-Identify the problematic subsystem
-----------------------------------
-
-Identifying which part of the Linux kernel might be causing your issue
-increases your chances of getting your bug fixed. Simply posting to the
-generic linux-kernel mailing list (LKML) may cause your bug report to be
-lost in the noise of a mailing list that gets 1000+ emails a day.
-
-Instead, try to figure out which kernel subsystem is causing the issue,
-and email that subsystem's maintainer and mailing list. If the subsystem
-maintainer doesn't answer, then expand your scope to mailing lists like
-LKML.
-
-
-Identify who to notify
-----------------------
-
-Once you know the subsystem that is causing the issue, you should send a
-bug report. Some maintainers prefer bugs to be reported via bugzilla
-(https://bugzilla.kernel.org), while others prefer that bugs be reported
-via the subsystem mailing list.
-
-To find out where to send an emailed bug report, find your subsystem or
-device driver in the MAINTAINERS file. Search in the file for relevant
-entries, and send your bug report to the person(s) listed in the "M:"
-lines, making sure to Cc the mailing list(s) in the "L:" lines. When the
-maintainer replies to you, make sure to 'Reply-all' in order to keep the
-public mailing list(s) in the email thread.
-
-If you know which driver is causing issues, you can pass one of the driver
-files to the get_maintainer.pl script::
-
- perl scripts/get_maintainer.pl -f <filename>
-
-If it is a security bug, please copy the Security Contact listed in the
-MAINTAINERS file. They can help coordinate bugfix and disclosure. See
-:ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>` for more information.
-
-If you can't figure out which subsystem caused the issue, you should file
-a bug in kernel.org bugzilla and send email to
-linux-kernel@vger.kernel.org, referencing the bugzilla URL. (For more
-information on the linux-kernel mailing list see
-http://vger.kernel.org/lkml/).
-
-
-Tips for reporting bugs
------------------------
-
-If you haven't reported a bug before, please read:
-
- http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
-
- http://www.catb.org/esr/faqs/smart-questions.html
-
-It's REALLY important to report bugs that seem unrelated as separate email
-threads or separate bugzilla entries. If you report several unrelated
-bugs at once, it's difficult for maintainers to tease apart the relevant
-data.
-
-
-Gather information
-------------------
-
-The most important information in a bug report is how to reproduce the
-bug. This includes system information, and (most importantly)
-step-by-step instructions for how a user can trigger the bug.
-
-If the failure includes an "OOPS:", take a picture of the screen, capture
-a netconsole trace, or type the message from your screen into the bug
-report. Please read "Documentation/admin-guide/bug-hunting.rst" before posting your
-bug report. This explains what you should do with the "Oops" information
-to make it useful to the recipient.
-
-This is a suggested format for a bug report sent via email or bugzilla.
-Having a standardized bug report form makes it easier for you not to
-overlook things, and easier for the developers to find the pieces of
-information they're really interested in. If some information is not
-relevant to your bug, feel free to exclude it.
-
-First run the ver_linux script included as scripts/ver_linux, which
-reports the version of some important subsystems. Run this script with
-the command ``awk -f scripts/ver_linux``.
-
-Use that information to fill in all fields of the bug report form, and
-post it to the mailing list with a subject of "PROBLEM: <one line
-summary from [1.]>" for easy identification by the developers::
-
- [1.] One line summary of the problem:
- [2.] Full description of the problem/report:
- [3.] Keywords (i.e., modules, networking, kernel):
- [4.] Kernel information
- [4.1.] Kernel version (from /proc/version):
- [4.2.] Kernel .config file:
- [5.] Most recent kernel version which did not have the bug:
- [6.] Output of Oops.. message (if applicable) with symbolic information
- resolved (see Documentation/admin-guide/bug-hunting.rst)
- [7.] A small shell script or example program which triggers the
- problem (if possible)
- [8.] Environment
- [8.1.] Software (add the output of the ver_linux script here)
- [8.2.] Processor information (from /proc/cpuinfo):
- [8.3.] Module information (from /proc/modules):
- [8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
- [8.5.] PCI information ('lspci -vvv' as root)
- [8.6.] SCSI information (from /proc/scsi/scsi)
- [8.7.] Other information that might be relevant to the problem
- (please look in /proc and include all information that you
- think to be relevant):
- [X.] Other notes, patches, fixes, workarounds:
-
-
-Follow up
-=========
-
-Expectations for bug reporters
-------------------------------
-
-Linux kernel maintainers expect bug reporters to be able to follow up on
-bug reports. That may include running new tests, applying patches,
-recompiling your kernel, and/or re-triggering your bug. The most
-frustrating thing for maintainers is for someone to report a bug, and then
-never follow up on a request to try out a fix.
-
-That said, it's still useful for a kernel maintainer to know a bug exists
-on a supported kernel, even if you can't follow up with retests. Follow
-up reports, such as replying to the email thread with "I tried the latest
-kernel and I can't reproduce my bug anymore" are also helpful, because
-maintainers have to assume silence means things are still broken.
-
-Expectations for kernel maintainers
------------------------------------
-
-Linux kernel maintainers are busy, overworked human beings. Some times
-they may not be able to address your bug in a day, a week, or two weeks.
-If they don't answer your email, they may be on vacation, or at a Linux
-conference. Check the conference schedule at https://LWN.net for more info:
-
- https://lwn.net/Calendar/
-
-In general, kernel maintainers take 1 to 5 business days to respond to
-bugs. The majority of kernel maintainers are employed to work on the
-kernel, and they may not work on the weekends. Maintainers are scattered
-around the world, and they may not work in your time zone. Unless you
-have a high priority bug, please wait at least a week after the first bug
-report before sending the maintainer a reminder email.
-
-The exceptions to this rule are regressions, kernel crashes, security holes,
-or userspace breakage caused by new kernel behavior. Those bugs should be
-addressed by the maintainers ASAP. If you suspect a maintainer is not
-responding to these types of bugs in a timely manner (especially during a
-merge window), escalate the bug to LKML and Linus Torvalds.
-
-Thank you!
-
-[Some of this is taken from Frohwalt Egerer's original linux-kernel FAQ]
diff --git a/Documentation/admin-guide/reporting-issues.rst b/Documentation/admin-guide/reporting-issues.rst
new file mode 100644
index 000000000000..ec62151fe672
--- /dev/null
+++ b/Documentation/admin-guide/reporting-issues.rst
@@ -0,0 +1,1764 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
+.. See the bottom of this file for additional redistribution information.
+
+Reporting issues
+++++++++++++++++
+
+
+The short guide (aka TL;DR)
+===========================
+
+Are you facing a regression with vanilla kernels from the same stable or
+longterm series? One still supported? Then search the `LKML
+<https://lore.kernel.org/lkml/>`_ and the `Linux stable mailing list
+<https://lore.kernel.org/stable/>`_ archives for matching reports to join. If
+you don't find any, install `the latest release from that series
+<https://kernel.org/>`_. If it still shows the issue, report it to the stable
+mailing list (stable@vger.kernel.org) and CC the regressions list
+(regressions@lists.linux.dev); ideally also CC the maintainer and the mailing
+list for the subsystem in question.
+
+In all other cases try your best guess which kernel part might be causing the
+issue. Check the :ref:`MAINTAINERS <maintainers>` file for how its developers
+expect to be told about problems, which most of the time will be by email with a
+mailing list in CC. Check the destination's archives for matching reports;
+search the `LKML <https://lore.kernel.org/lkml/>`_ and the web, too. If you
+don't find any to join, install `the latest mainline kernel
+<https://kernel.org/>`_. If the issue is present there, send a report.
+
+The issue was fixed there, but you would like to see it resolved in a still
+supported stable or longterm series as well? Then install its latest release.
+If it shows the problem, search for the change that fixed it in mainline and
+check if backporting is in the works or was discarded; if it's neither, ask
+those who handled the change for it.
+
+**General remarks**: When installing and testing a kernel as outlined above,
+ensure it's vanilla (IOW: not patched and not using add-on modules). Also make
+sure it's built and running in a healthy environment and not already tainted
+before the issue occurs.
+
+If you are facing multiple issues with the Linux kernel at once, report each
+separately. While writing your report, include all information relevant to the
+issue, like the kernel and the distro used. In case of a regression, CC the
+regressions mailing list (regressions@lists.linux.dev) to your report. Also try
+to pin-point the culprit with a bisection; if you succeed, include its
+commit-id and CC everyone in the sign-off-by chain.
+
+Once the report is out, answer any questions that come up and help where you
+can. That includes keeping the ball rolling by occasionally retesting with newer
+releases and sending a status update afterwards.
+
+Step-by-step guide how to report issues to the kernel maintainers
+=================================================================
+
+The above TL;DR outlines roughly how to report issues to the Linux kernel
+developers. It might be all that's needed for people already familiar with
+reporting issues to Free/Libre & Open Source Software (FLOSS) projects. For
+everyone else there is this section. It is more detailed and uses a
+step-by-step approach. It still tries to be brief for readability and leaves
+out a lot of details; those are described below the step-by-step guide in a
+reference section, which explains each of the steps in more detail.
+
+Note: this section covers a few more aspects than the TL;DR and does things in
+a slightly different order. That's in your interest, to make sure you notice
+early if an issue that looks like a Linux kernel problem is actually caused by
+something else. These steps thus help to ensure the time you invest in this
+process won't feel wasted in the end:
+
+ * Are you facing an issue with a Linux kernel a hardware or software vendor
+ provided? Then in almost all cases you had better stop reading this
+ document and report the issue to your vendor instead, unless you are
+ willing to install the latest Linux version yourself. Be aware the latter
+ will often be needed anyway to hunt down and fix issues.
+
+ * Perform a rough search for existing reports with your favorite internet
+ search engine; additionally, check the archives of the `Linux Kernel Mailing
+ List (LKML) <https://lore.kernel.org/lkml/>`_. If you find matching reports,
+ join the discussion instead of sending a new one.
+
+ * See if the issue you are dealing with qualifies as regression, security
+ issue, or a really severe problem: those are 'issues of high priority' that
+ need special handling in some steps that are about to follow.
+
+ * Make sure it's not the kernel's surroundings that are causing the issue
+ you face.
+
+ * Create a fresh backup and put system repair and restore tools at hand.
+
+ * Ensure your system does not enhance its kernels by building additional
+ kernel modules on-the-fly, which solutions like DKMS might be doing locally
+ without your knowledge.
+
+ * Check if your kernel was 'tainted' when the issue occurred, as the event
+ that made the kernel set this flag might be causing the issue you face.
+
+ * Write down coarsely how to reproduce the issue. If you deal with multiple
+ issues at once, create separate notes for each of them and make sure they
+ work independently on a freshly booted system. That's needed, as each issue
+ needs to get reported to the kernel developers separately, unless they are
+ strongly entangled.
+
+ * If you are facing a regression within a stable or longterm version line
+ (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
+ 'Reporting regressions within a stable and longterm kernel line'.
+
+ * Locate the driver or kernel subsystem that seems to be causing the issue.
+ Find out how and where its developers expect reports. Note: most of the
+ time this won't be bugzilla.kernel.org, as issues typically need to be sent
+ by mail to a maintainer and a public mailing list.
+
+ * Search the archives of the bug tracker or mailing list in question
+ thoroughly for reports that might match your issue. If you find anything,
+ join the discussion instead of sending a new report.
+
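
As an illustration only (not part of this patch), the taint check from the list above can be sketched in Python. It assumes the running kernel exposes its taint state as a decimal number in ``/proc/sys/kernel/tainted``, where ``0`` means "not tainted"; the helper names are hypothetical. The meaning of the individual bits is left to ``Documentation/admin-guide/tainted-kernels.rst``.

```python
from pathlib import Path
from typing import List

# Assumption: the taint state is readable here on the running system.
TAINT_FILE = Path("/proc/sys/kernel/tainted")


def is_tainted(raw: str) -> bool:
    """Return True if the taint value read from the kernel is non-zero."""
    return int(raw.strip()) != 0


def set_bits(raw: str) -> List[int]:
    """List the bit positions that are set in the taint value."""
    value = int(raw.strip())
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    if TAINT_FILE.exists():
        raw = TAINT_FILE.read_text()
        if is_tainted(raw):
            print("kernel is tainted, bits set:", set_bits(raw))
        else:
            print("kernel is not tainted")
    else:
        print("no taint file found; is this a Linux system?")
```

Run it once right after booting and again after the issue occurred: if the value changed in between, the event that set the flag is worth mentioning in the report.
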
+After these preparations you'll now enter the main part:
+
+ * Unless you are already running the latest 'mainline' Linux kernel, better
+ go and install it for the reporting process. Testing and reporting with
+ the latest 'stable' Linux can be an acceptable alternative in some
+ situations; during the merge window that actually might be even the best
+ approach, but in that development phase it can be an even better idea to
+ suspend your efforts for a few days anyway. Whatever version you choose,
+ ideally use a 'vanilla' build. Ignoring this advice will dramatically
+ increase the risk your report will be rejected or ignored.
+
+ * Ensure the kernel you just installed does not 'taint' itself when
+ running.
+
+ * Reproduce the issue with the kernel you just installed. If it doesn't show
+ up there, scroll down to the instructions for issues only happening with
+ stable and longterm kernels.
+
+ * Optimize your notes: try to find and write the most straightforward way to
+ reproduce your issue. Make sure the end result has all the important
+ details, and at the same time is easy to read and understand for others
+ that hear about it for the first time. And if you learned something in this
+ process, consider searching again for existing reports about the issue.
+
+ * If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
+ decoding the kernel log to find the line of code that triggered the error.
+
+ * If your problem is a regression, try to narrow down when the issue was
+ introduced as much as possible.
+
+ * Start to compile the report by writing a detailed description about the
+ issue. Always mention a few things: the latest kernel version you installed
+ for reproducing, the Linux Distribution used, and your notes on how to
+ reproduce the issue. Ideally, make the kernel's build configuration
+ (.config) and the output from ``dmesg`` available somewhere on the net and
+ link to it. Include or upload all other information that might be relevant,
+ like the output/screenshot of an Oops or the output from ``lspci``. Once
+ you wrote this main part, insert a normal length paragraph on top of it
+ outlining the issue and the impact quickly. On top of this add one sentence
+ that briefly describes the problem and gets people to read on. Now give the
+ thing a descriptive title or subject that yet again is shorter. Then you're
+ ready to send or file the report like the MAINTAINERS file told you, unless
+ you are dealing with one of those 'issues of high priority': they need
+ special care which is explained in 'Special handling for high priority
+ issues' below.
+
+ * Wait for reactions and keep the thing rolling until you can accept the
+ outcome in one way or the other. Thus react publicly and in a timely manner
+ to any inquiries. Test proposed fixes. Do proactive testing: retest with at
+ least every first release candidate (RC) of a new mainline version and
+ report your results. Send friendly reminders if things stall. And try to
+ help yourself, if you don't get any help or if it's unsatisfying.
+
+
+Reporting regressions within a stable and longterm kernel line
+--------------------------------------------------------------
+
+This subsection is for you, if you followed the above process and got sent here
+at the point about regressions within a stable or longterm kernel version line. You
+face one of those if something breaks when updating from 5.10.4 to 5.10.5 (a
+switch from 5.9.15 to 5.10.5 does not qualify). The developers want to fix such
+regressions as quickly as possible, hence there is a streamlined process to
+report them:
+
+ * Check if the kernel developers still maintain the Linux kernel version
+ line you care about: go to the `front page of kernel.org
+ <https://kernel.org/>`_ and make sure it mentions
+ the latest release of the particular version line without an '[EOL]' tag.
+
+ * Check the archives of the `Linux stable mailing list
+ <https://lore.kernel.org/stable/>`_ for existing reports.
+
+ * Install the latest release from the particular version line as a vanilla
+ kernel. Ensure this kernel is not tainted and still shows the problem, as
+ the issue might have already been fixed there. If you first noticed the
+ problem with a vendor kernel, check that a vanilla build of the last version
+ known to work performs fine as well.
+
+ * Send a short problem report to the Linux stable mailing list
+ (stable@vger.kernel.org) and CC the Linux regressions mailing list
+ (regressions@lists.linux.dev); if you suspect the cause in a particular
+ subsystem, CC its maintainer and its mailing list. Roughly describe the
+ issue and ideally explain how to reproduce it. Mention the first version
+ that shows the problem and the last version that's working fine. Then
+ wait for further instructions.
+
+The reference section below explains each of these steps in more detail.
+
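
As an illustration only (not part of this patch), the rule for what counts as a regression within a stable or longterm line can be sketched in Python. It assumes release names of the form ``major.minor.patch`` and treats the first two components as the version line, matching the 5.10.4/5.10.5 example above; the helper names are hypothetical.

```python
def version_line(release):
    """Return the (major, minor) pair identifying the version line."""
    major, minor = release.split(".")[:2]
    return int(major), int(minor)


def is_within_same_line(last_good, first_bad):
    """True if both releases belong to the same stable/longterm line."""
    return version_line(last_good) == version_line(first_bad)


if __name__ == "__main__":
    for good, bad in (("5.10.4", "5.10.5"), ("5.9.15", "5.10.5")):
        verdict = "qualifies" if is_within_same_line(good, bad) else "does not qualify"
        print("%s -> %s %s" % (good, bad, verdict))
```

The first pair stays within the 5.10 line and therefore qualifies for the streamlined process; the second crosses from 5.9 to 5.10 and does not.
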
+
+Reporting issues only occurring in older kernel version lines
+-------------------------------------------------------------
+
+This subsection is for you, if you tried the latest mainline kernel as outlined
+above, but failed to reproduce your issue there; at the same time you want to
+see the issue fixed in a still supported stable or longterm series or vendor
+kernels regularly rebased on those. If that is the case, follow these steps:
+
+ * Prepare yourself for the possibility that going through the next few steps
+ might not get the issue solved in older releases: the fix might be too big
+ or risky to get backported there.
+
+ * Perform the first three steps in the section "Reporting regressions
+ within a stable and longterm kernel line" above.
+
+ * Search the Linux kernel version control system for the change that fixed
+ the issue in mainline, as its commit message might tell you if the fix is
+ scheduled for backporting already. If you don't find anything that way,
+ search the appropriate mailing lists for posts that discuss such an issue
+ or peer-review possible fixes; then check the discussions if the fix was
+ deemed unsuitable for backporting. If backporting was not considered at
+ all, join the newest discussion, asking if it's in the cards.
+
+ * One of the former steps should lead to a solution. If that doesn't work
+ out, ask the maintainers for the subsystem that seems to be causing the
+ issue for advice; CC the mailing list for the particular subsystem as well
+ as the stable mailing list.
+
+The reference section below explains each of these steps in more detail.
+
+
+Reference section: Reporting issues to the kernel maintainers
+=============================================================
+
+The detailed guides above outline all the major steps in brief fashion, which
+should be enough for most people. But sometimes there are situations where even
+experienced users might wonder how to actually do one of those steps. That's
+what this section is for, as it will provide a lot more details on each of the
+above steps. Consider this as reference documentation: it's possible to read it
+from top to bottom. But it's mainly meant to be skimmed and to serve as a place
+to look up details on how to actually perform those steps.
+
+A few words of general advice before digging into the details:
+
+ * The Linux kernel developers are well aware this process is complicated and
+ demands more than other FLOSS projects. We'd love to make it simpler. But
+ that would require work in various places as well as some infrastructure,
+ which would need constant maintenance; nobody has stepped up to do that
+ work, so that's just how things are for now.
+
+ * A warranty or support contract with some vendor doesn't entitle you to
+ request fixes from developers in the upstream Linux kernel community: such
+ contracts are completely outside the scope of the Linux kernel, its
+ development community, and this document. That's why you can't demand
+ anything such a contract guarantees in this context, not even if the
+ developer handling the issue works for the vendor in question. If you want
+ to claim your rights, use the vendor's support channel instead. When doing
+ so, you might want to mention you'd like to see the issue fixed in the
+ upstream Linux kernel; motivate them by saying it's the only way to ensure
+ the fix in the end will get incorporated in all Linux distributions.
+
+ * If you have never reported an issue to a FLOSS project before, you should consider
+ reading `How to Report Bugs Effectively
+ <https://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_, `How To Ask
+ Questions The Smart Way
+ <http://www.catb.org/esr/faqs/smart-questions.html>`_, and `How to ask good
+ questions <https://jvns.ca/blog/good-questions/>`_.
+
+With that off the table, find below the details on how to properly report
+issues to the Linux kernel developers.
+
+
+Make sure you're using the upstream Linux kernel
+------------------------------------------------
+
+ *Are you facing an issue with a Linux kernel a hardware or software vendor
+ provided? Then in almost all cases you had better stop reading this
+ document and report the issue to your vendor instead, unless you are
+ willing to install the latest Linux version yourself. Be aware the latter
+ will often be needed anyway to hunt down and fix issues.*
+
+Like most programmers, Linux kernel developers don't like to spend time dealing
+with reports for issues that don't even happen with their current code. It's
+just a waste of everybody's time, especially yours. Unfortunately such situations
+easily happen when it comes to the kernel and often lead to frustration on both
+sides. That's because almost all Linux-based kernels pre-installed on devices
+(Computers, Laptops, Smartphones, Routers, …) and most shipped by Linux
+distributors are quite distant from the official Linux kernel as distributed by
+kernel.org: these kernels from these vendors are often ancient from the point of
+view of Linux development or heavily modified, often both.
+
+Most of these vendor kernels are quite unsuitable for reporting issues to the
+Linux kernel developers: an issue you face with one of them might have been
+fixed by the Linux kernel developers months or years ago already; additionally,
+the modifications and enhancements by the vendor might be causing the issue you
+face, even if they look small or totally unrelated. That's why you should report
+issues with these kernels to the vendor. Its developers should look into the
+report and, in case it turns out to be an upstream issue, fix it directly
+upstream or forward the report there. In practice that often does not work out
+or might not be what you want. You thus might want to consider circumventing the
+vendor by installing the very latest Linux kernel core yourself. If that's an
+option for you, move ahead in this process, as a later step in this guide will
+explain how to do that once it rules out other potential causes for your issue.
+
+Note, the previous paragraph is starting with the word 'most', as sometimes
+developers in fact are willing to handle reports about issues occurring with
+vendor kernels. Whether they do in the end highly depends on the developers and the
+issue in question. Your chances are quite good if the distributor applied only
+small modifications to a kernel based on a recent Linux version; that for
+example often holds true for the mainline kernels shipped by Debian GNU/Linux
+Sid or Fedora Rawhide. Some developers will also accept reports about issues
+with kernels from distributions shipping the latest stable kernel, as long as
+it's only slightly modified; that for example is often the case for Arch Linux,
+regular Fedora releases, and openSUSE Tumbleweed. But keep in mind, you are better
+off using a mainline Linux and avoiding a stable kernel for this
+process, as outlined in the section 'Install a fresh kernel for testing' in more
+detail.
+
+Obviously you are free to ignore all this advice and report problems with an old
+or heavily modified vendor kernel to the upstream Linux developers. But note,
+those often get rejected or ignored, so consider yourself warned. But it's still
+better than not reporting the issue at all: sometimes such reports directly or
+indirectly will help to get the issue fixed over time.
+
+
+Search for existing reports, first run
+--------------------------------------
+
+ *Perform a rough search for existing reports with your favorite internet
+ search engine; additionally, check the archives of the Linux Kernel Mailing
+ List (LKML). If you find matching reports, join the discussion instead of
+ sending a new one.*
+
+Reporting an issue that someone else already brought forward is often a waste of
+time for everyone involved, especially you as the reporter. So it's in your own
+interest to thoroughly check if somebody reported the issue already. At this
+step of the process it's okay to just perform a rough search: a later step will
+tell you to perform a more detailed search once you know where your issue needs
+to be reported to. Nevertheless, do not hurry with this step of the reporting
+process, it can save you time and trouble.
+
+Simply search the internet with your favorite search engine first. Afterwards,
+search the `Linux Kernel Mailing List (LKML) archives
+<https://lore.kernel.org/lkml/>`_.
+
+If you get flooded with results, consider telling your search engine to limit
+the search timeframe to the past month or year. And wherever you search, make sure
+to use good search terms; vary them a few times, too. While doing so try to
+look at the issue from the perspective of someone else: that will help you to
+come up with other words to use as search terms. Also make sure not to use too
+many search terms at once. Remember to search with and without information like
+the name of the kernel driver or the name of the affected hardware component.
+But its exact brand name (say 'ASUS Red Devil Radeon RX 5700 XT Gaming OC')
+is often not very helpful, as it is too specific. Instead try search terms like
+the model line (Radeon 5700 or Radeon 5000) and the code name of the main chip
+('Navi' or 'Navi10') with and without its manufacturer ('AMD').
+
+In case you find an existing report about your issue, join the discussion, as
+you might be able to provide valuable additional information. That can be
+important even when a fix is prepared or in its final stages already, as
+developers might look for people that can provide additional information or
+test a proposed fix. Jump to the section 'Duties after the report went out' for
+details on how to get properly involved.
+
+Note, searching `bugzilla.kernel.org <https://bugzilla.kernel.org/>`_ might also
+be a good idea, as that might provide valuable insights or turn up matching
+reports. If you find the latter, just keep in mind: most subsystems expect
+reports in different places, as described below in the section "Check where you
+need to report your issue". The developers that should take care of the issue
+thus might not even be aware of the bugzilla ticket. Hence, check in the ticket
+whether the issue was already reported as outlined in this document; if not,
+consider doing so.
+
+
+Issue of high priority?
+-----------------------
+
+ *See if the issue you are dealing with qualifies as a regression, security
+ issue, or a really severe problem: those are 'issues of high priority' that
+ need special handling in some steps that are about to follow.*
+
+Linus Torvalds and the leading Linux kernel developers want to see some issues
+fixed as soon as possible, hence there are 'issues of high priority' that get
+handled slightly differently in the reporting process. Three types of cases
+qualify: regressions, security issues, and really severe problems.
+
+You deal with a regression if some application or practical use case running
+fine with one Linux kernel works worse or not at all with a newer version
+compiled using a similar configuration. The document
+Documentation/admin-guide/reporting-regressions.rst explains this in more
+detail. It also provides a good deal of other information about regressions you
+might want to be aware of; it for example explains how to add your issue to the
+list of tracked regressions, to ensure it won't fall through the cracks.
+
+What qualifies as a security issue is left to your judgment. Consider reading
+Documentation/admin-guide/security-bugs.rst before proceeding, as it
+provides additional details on how to best handle security issues.
+
+An issue is a 'really severe problem' when something totally unacceptably bad
+happens. That's for example the case when a Linux kernel corrupts the data it's
+handling or damages hardware it's running on. You're also dealing with a severe
+issue when the kernel suddenly stops working with an error message ('kernel
+panic') or without any farewell note at all. Note: do not confuse a 'panic' (a
+fatal error where the kernel stops itself) with an 'Oops' (a recoverable error),
+as the kernel remains running after the latter.
+
+
+Ensure a healthy environment
+----------------------------
+
+ *Make sure it's not the kernel's surroundings that are causing the issue
+ you face.*
+
+Problems that look a lot like a kernel issue are sometimes caused by the build
+or runtime environment. It's hard to rule out that problem completely, but you
+should minimize it:
+
+ * Use proven tools when building your kernel, as bugs in the compiler or the
+ binutils can cause the resulting kernel to misbehave.
+
+ * Ensure your computer components run within their design specifications;
+ that's especially important for the main processor, the main memory, and the
+ motherboard. Therefore, stop undervolting or overclocking when facing a
+ potential kernel issue.
+
+ * Try to make sure it's not faulty hardware that is causing your issue. Bad
+ main memory for example can result in a multitude of issues that will
+ manifest themselves in problems looking like kernel issues.
+
+ * If you're dealing with a filesystem issue, you might want to check the file
+ system in question with ``fsck``, as it might be damaged in a way that leads
+ to unexpected kernel behavior.
+
+ * When dealing with a regression, make sure it's not something else that
+ changed in parallel to updating the kernel. The problem for example might be
+ caused by other software that was updated at the same time. It can also
+ happen that a hardware component coincidentally just broke when you rebooted
+ into a new kernel for the first time. Updating the system's BIOS or changing
+ something in the BIOS Setup can also lead to problems that look a lot
+ like a kernel regression.
+
+
+Prepare for emergencies
+-----------------------
+
+ *Create a fresh backup and put system repair and restore tools at hand.*
+
+Reminder, you are dealing with computers, which sometimes do unexpected things,
+especially if you fiddle with crucial parts like the kernel of their operating
+system. That's what you are about to do in this process. Thus, make sure to
+create a fresh backup; also ensure you have all tools at hand to repair or
+reinstall the operating system as well as everything you need to restore the
+backup.
+
+
+Make sure your kernel doesn't get enhanced
+------------------------------------------
+
+ *Ensure your system does not enhance its kernels by building additional
+ kernel modules on-the-fly, which solutions like DKMS might be doing locally
+ without your knowledge.*
+
+The risk your issue report gets ignored or rejected dramatically increases if
+your kernel gets enhanced in any way. That's why you should remove or disable
+mechanisms like akmods and DKMS: those build add-on kernel modules
+automatically, for example when you install a new Linux kernel or boot it for
+the first time. Also remove any modules they might have installed. Then reboot
+before proceeding.
+
+Note, you might not be aware that your system is using one of these solutions:
+they often get set up silently when you install Nvidia's proprietary graphics
+driver, VirtualBox, or other software that requires some support from a
+module that is not part of the Linux kernel. That's why you might need to
+uninstall the packages with such software to get rid of any 3rd party kernel
+module.
+
+
+Check 'taint' flag
+------------------
+
+ *Check if your kernel was 'tainted' when the issue occurred, as the event
+ that made the kernel set this flag might be causing the issue you face.*
+
+The kernel marks itself with a 'taint' flag when something happens that might
+lead to follow-up errors that look totally unrelated. The issue you face might
+be such an error if your kernel is tainted. That's why it's in your interest to
+rule this out early before investing more time into this process. This is the
+only reason why this step is here, as this process later will tell you to
+install the latest mainline kernel; you will need to check the taint flag again
+then, as that's when it matters because it's the kernel the report will focus
+on.
+
+On a running system it is easy to check if the kernel tainted itself: if ``cat
+/proc/sys/kernel/tainted`` returns '0' then the kernel is not tainted and
+everything is fine. Checking that file is impossible in some situations; that's
+why the kernel also mentions the taint status when it reports an internal
+problem (a 'kernel bug'), a recoverable error (a 'kernel Oops') or a
+non-recoverable error before halting operation (a 'kernel panic'). Look near
+the top of the error messages printed when one of these occurs and search for a
+line starting with 'CPU:'. It should end with 'Not tainted' if the kernel was
+not tainted when it noticed the problem; it was tainted if you see 'Tainted:'
+followed by a few spaces and some letters.
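+
+The value in that file is a bitmask, so a non-zero number can encode several
+taint reasons at once. The following is a minimal sketch and not part of the
+kernel sources: a hypothetical shell helper named ``taint_bits`` that lists
+which bit numbers are set, so you can look them up in the table found in
+Documentation/admin-guide/tainted-kernels.rst. The value 4097 is made up for
+the example:
+
```shell
# Hypothetical helper: print the numbers of the bits set in a taint value.
# The argument is the decimal number read from /proc/sys/kernel/tainted.
taint_bits() {
    value=$1
    bit=0
    bits=""
    while [ "$value" -gt 0 ]; do
        if [ $((value % 2)) -eq 1 ]; then
            bits="$bits $bit"
        fi
        value=$((value / 2))
        bit=$((bit + 1))
    done
    # Strip the leading space; prints an empty line for an untainted kernel.
    echo "${bits# }"
}

# Example with a made-up value: 4097 = 2^0 + 2^12, so bits 0 and 12 are set.
taint_bits 4097
```
+
+On a running system you would call it as ``taint_bits "$(cat
+/proc/sys/kernel/tainted)"``. The kernel sources also contain the script
+``tools/debugging/kernel-chktaint``, which decodes the value into readable
+descriptions and is likely the more convenient choice if you have the sources
+at hand.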
+
+If your kernel is tainted, study Documentation/admin-guide/tainted-kernels.rst
+to find out why. Try to eliminate the reason. Often it's caused by one of these
+three things:
+
+ 1. A recoverable error (a 'kernel Oops') occurred and the kernel tainted
+ itself, as the kernel knows it might misbehave in strange ways after that
+ point. In that case check your kernel or system log and look for a section
+ that starts with this::
+
+ Oops: 0000 [#1] SMP
+
+ That's the first Oops since boot-up, as the '#1' between the brackets shows.
+ Every Oops and any other problem that happens after that point might be a
+ follow-up problem to that first Oops, even if both look totally unrelated.
+ Rule this out by getting rid of the cause for the first Oops and reproducing
+ the issue afterwards. Sometimes simply restarting will be enough, sometimes
+ a change to the configuration followed by a reboot can eliminate the Oops.
+ But don't invest too much time into this at this point of the process, as
+ the cause for the Oops might already be fixed in the newer Linux kernel
+ version you are going to install later in this process.
+
+ 2. Your system uses software that installs its own kernel modules, for
+ example Nvidia's proprietary graphics driver or VirtualBox. The kernel
+ taints itself when it loads such modules from external sources (even if
+ they are Open Source): they sometimes cause errors in unrelated kernel
+ areas and thus might be causing the issue you face. You therefore have to
+ prevent those modules from loading when you want to report an issue to the
+ Linux kernel developers. Most of the time the easiest way to do that is:
+ temporarily uninstall such software including any modules they might have
+ installed. Afterwards reboot.
+
+ 3. The kernel also taints itself when it's loading a module that resides in
+ the staging tree of the Linux kernel source. That's a special area for
+ code (mostly drivers) that does not yet fulfill the normal Linux kernel
+ quality standards. When you report an issue with such a module it's
+ obviously okay if the kernel is tainted; just make sure the module in
+ question is the only reason for the taint. If the issue happens in an
+ unrelated area reboot and temporarily block the module from being loaded
+ by specifying ``foo.blacklist=1`` as kernel parameter (replace 'foo' with
+ the name of the module in question).
+
+
+Document how to reproduce issue
+-------------------------------
+
+ *Write down coarsely how to reproduce the issue. If you deal with multiple
+ issues at once, create separate notes for each of them and make sure they
+ work independently on a freshly booted system. That's needed, as each issue
+ needs to get reported to the kernel developers separately, unless they are
+ strongly entangled.*
+
+If you deal with multiple issues at once, you'll have to report each of them
+separately, as they might be handled by different developers. Describing
+various issues in one report also makes it quite difficult for others to tear
+it apart. Hence, only combine issues in one report if they are very strongly
+entangled.
+
+Additionally, during the reporting process you will have to test if the issue
+happens with other kernel versions. Therefore, it will make your work easier if
+you know exactly how to reproduce an issue quickly on a freshly booted system.
+
+Note: it's often fruitless to report issues that only happened once, as they
+might be caused by a bit flip due to cosmic radiation. That's why you should
+try to rule that out by reproducing the issue before going further. Feel free
+to ignore this advice if you are experienced enough to tell a one-time error
+due to faulty hardware apart from a kernel issue that rarely happens and thus
+is hard to reproduce.
+
+
+Regression in stable or longterm kernel?
+----------------------------------------
+
+ *If you are facing a regression within a stable or longterm version line
+ (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
+ 'Dealing with regressions within a stable and longterm kernel line'.*
+
+Regressions within a stable or longterm kernel version line are something the
+Linux developers want to fix badly, as such issues are even more unwanted than
+regressions in the main development branch, as they can quickly affect a lot of
+people. The developers thus want to learn about such issues as quickly as
+possible, hence there is a streamlined process to report them. Note,
+regressions with a newer kernel version line (say something broke when switching
+from 5.9.15 to 5.10.5) do not qualify.
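+
+To make the distinction concrete, here is a minimal sketch in shell. The helper
+name ``same_version_line`` is made up for this illustration, and it assumes
+both versions are given in the three-part form used by stable and longterm
+releases (like 5.10.4):
+
```shell
# Hypothetical helper: succeed if both versions belong to the same stable or
# longterm version line, i.e. they only differ in the last number.
# Assumes three-part version strings like 5.10.4.
same_version_line() {
    [ "${1%.*}" = "${2%.*}" ]
}

if same_version_line 5.10.4 5.10.5; then
    echo "5.10.4 -> 5.10.5: same version line"
fi
if ! same_version_line 5.9.15 5.10.5; then
    echo "5.9.15 -> 5.10.5: different version lines"
fi
```
+
+Only the first case is a regression within a version line in the sense of this
+step.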
+
+
+Check where you need to report your issue
+-----------------------------------------
+
+ *Locate the driver or kernel subsystem that seems to be causing the issue.
+ Find out how and where its developers expect reports. Note: most of the
+ time this won't be bugzilla.kernel.org, as issues typically need to be sent
+ by mail to a maintainer and a public mailing list.*
+
+It's crucial to send your report to the right people, as the Linux kernel is a
+big project and most of its developers are only familiar with a small subset of
+it. Quite a few programmers only care about one driver, for example one for a
+WiFi chip; such a developer likely will have little or no knowledge about the
+internals of remote or unrelated "subsystems", like the TCP stack, the PCIe/PCI
+subsystem, memory management or file systems.
+
+Problem is: the Linux kernel lacks a central bug tracker where you can simply
+file your issue and make it reach the developers that need to know about it.
+That's why you have to find the right place and way to report issues yourself.
+You can do that with the help of a script (see below), but it mainly targets
+kernel developers and experts. For everybody else the MAINTAINERS file is the
+better place.
+
+How to read the MAINTAINERS file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To illustrate how to use the :ref:`MAINTAINERS <maintainers>` file, let's assume
+the WiFi in your laptop suddenly misbehaves after updating the kernel. In that
+case it's likely an issue in the WiFi driver. Obviously it could also be some
+code it builds upon, but unless you suspect something like that stick to the
+driver. If it's really something else, the driver's developers will get the
+right people involved.
+
+Sadly, there is no way to check which code is driving a particular hardware
+component that is both universal and easy.
+
+In case of a problem with the WiFi driver you for example might want to look at
+the output of ``lspci -k``, as it lists devices on the PCI/PCIe bus and the
+kernel modules driving them::
+
+ [user@something ~]$ lspci -k
+ [...]
+ 3a:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
+ Subsystem: Bigfoot Networks, Inc. Device 1535
+ Kernel driver in use: ath10k_pci
+ Kernel modules: ath10k_pci
+ [...]
+
+But this approach won't work if your WiFi chip is connected over USB or some
+other internal bus. In those cases you might want to check your WiFi manager or
+the output of ``ip link``. Look for the name of the problematic network
+interface, which might be something like 'wlp58s0'. This name can be used like
+this to find the module driving it::
+
+ [user@something ~]$ realpath --relative-to=/sys/module/ /sys/class/net/wlp58s0/device/driver/module
+ ath10k_pci
+
+In case tricks like these don't bring you any further, try to search the
+internet on how to narrow down the driver or subsystem in question. And if you
+are unsure which it is: just try your best guess, somebody will help you if you
+guessed poorly.
+
+Once you know the driver or subsystem, you want to search for it in the
+MAINTAINERS file. In the case of 'ath10k_pci' you won't find anything, as the
+name is too specific. Sometimes you will need to search on the net for help;
+but before doing so, try a somewhat shortened or modified name when searching the
+MAINTAINERS file, as then you might find something like this::
+
+ QUALCOMM ATHEROS ATH10K WIRELESS DRIVER
+ Mail: A. Some Human <shuman@example.com>
+ Mailing list: ath10k@lists.infradead.org
+ Status: Supported
+ Web-page: https://wireless.wiki.kernel.org/en/users/Drivers/ath10k
+ SCM: git git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
+ Files: drivers/net/wireless/ath/ath10k/
+
+Note: the line descriptions will be abbreviated if you read the plain
+MAINTAINERS file found in the root of the Linux source tree. 'Mail:' for
+example will be 'M:', 'Mailing list:' will be 'L:', and 'Status:' will be 'S:'.
+A section near the top of the file explains these and other abbreviations.
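+
+If you have the Linux sources at hand, you can also search the plain
+MAINTAINERS file from the command line. The following is a minimal sketch,
+assuming a POSIX-compatible awk and that the sections in the file are
+separated by blank lines; the helper name ``maintainers_section`` is made up
+for this example:
+
```shell
# Hypothetical helper: print every section of a MAINTAINERS file that
# contains the given search term, ignoring case.
# Usage: maintainers_section <term> [path to MAINTAINERS]
maintainers_section() {
    term=$1
    file=${2:-MAINTAINERS}
    # RS = "" switches awk to paragraph mode: each block separated by blank
    # lines (one MAINTAINERS section) is handled as one record.
    awk -v term="$term" '
        BEGIN { RS = "" }
        index(tolower($0), tolower(term)) { print; print "" }
    ' "$file"
}
```
+
+Run from the root of the source tree, ``maintainers_section ath10k`` should
+print the section shown above in its abbreviated form, together with any other
+section that mentions the term.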
+
+First look at the line 'Status'. Ideally it should be 'Supported' or
+'Maintained'. If it states 'Obsolete' then you are using some outdated approach
+that was replaced by a newer solution you need to switch to. Sometimes the code
+only has someone who provides 'Odd Fixes' when feeling motivated. And with
+'Orphan' you are totally out of luck, as nobody takes care of the code anymore.
+That only leaves these options: learn to live with the issue, fix it
+yourself, or find a programmer somewhere willing to fix it.
+
+After checking the status, look for a line starting with 'bugs:': it will tell
+you where to find a subsystem specific bug tracker to file your issue. The
+example above does not have such a line. That is the case for most sections, as
+Linux kernel development is completely driven by mail. Very few subsystems use
+a bug tracker, and only some of those rely on bugzilla.kernel.org.
+
+In this and many other cases you thus have to look for lines starting with
+'Mail:' instead. Those mention the name and the email addresses for the
+maintainers of the particular code. Also look for a line starting with 'Mailing
+list:', which tells you the public mailing list where the code is developed.
+Your report later needs to go by mail to those addresses. Additionally, for all
+issue reports sent by email, make sure to add the Linux Kernel Mailing List
+(LKML) <linux-kernel@vger.kernel.org> to CC. Don't omit either of the mailing
+lists when sending your issue report by mail later! Maintainers are busy people
+and might leave some work for other developers on the subsystem specific list;
+and LKML is important to have one place where all issue reports can be found.
+
+
+Finding the maintainers with the help of a script
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For people that have the Linux sources at hand there is a second option to find
+the proper place to report: the script 'scripts/get_maintainer.pl' which tries
+to find all people to contact. It queries the MAINTAINERS file and needs to be
+called with a path to the source code in question. For drivers compiled as
+module it often can be found with a command like this::
+
+ $ modinfo ath10k_pci | grep filename | sed 's!/lib/modules/.*/kernel/!!; s!filename:!!; s!\.ko\(\|\.xz\)!!'
+ drivers/net/wireless/ath/ath10k/ath10k_pci.ko
+
+Pass parts of this to the script::
+
+ $ ./scripts/get_maintainer.pl -f drivers/net/wireless/ath/ath10k*
+ Some Human <shuman@example.com> (supporter:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER)
+ Another S. Human <asomehuman@example.com> (maintainer:NETWORKING DRIVERS)
+ ath10k@lists.infradead.org (open list:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER)
+ linux-wireless@vger.kernel.org (open list:NETWORKING DRIVERS (WIRELESS))
+ netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
+ linux-kernel@vger.kernel.org (open list)
+
+Don't send your report to all of them. Send it to the maintainers, which the
+script calls "supporter:"; additionally CC the most specific mailing list for
+the code as well as the Linux Kernel Mailing List (LKML). In this case you thus
+would need to send the report to 'Some Human <shuman@example.com>' with
+'ath10k@lists.infradead.org' and 'linux-kernel@vger.kernel.org' in CC.
+
+Note: in case you cloned the Linux sources with git you might want to call
+``get_maintainer.pl`` a second time with ``--git``. The script then will look
+at the commit history to find which people recently worked on the code in
+question, as they might be able to help. But use these results with care, as it
+can easily send you in a wrong direction. That for example happens quickly in
+areas rarely changed (like old or unmaintained drivers): sometimes such code is
+modified during tree-wide cleanups by developers that do not care about the
+particular driver at all.
+
+
+Search for existing reports, second run
+---------------------------------------
+
+ *Search the archives of the bug tracker or mailing list in question
+ thoroughly for reports that might match your issue. If you find anything,
+ join the discussion instead of sending a new report.*
+
+As mentioned earlier already: reporting an issue that someone else already
+brought forward is often a waste of time for everyone involved, especially you
+as the reporter. That's why you should search for existing reports again, now
+that you know where they need to be reported to. If it's a mailing list, you will
+often find its archives on `lore.kernel.org <https://lore.kernel.org/>`_.
+
+But some lists are hosted in different places. That for example is the case for
+the ath10k WiFi driver used as example in the previous step. But you'll often
+find the archives for these lists easily on the net. Searching for 'archive
+ath10k@lists.infradead.org' for example will lead you to the `Info page for the
+ath10k mailing list <https://lists.infradead.org/mailman/listinfo/ath10k>`_,
+which at the top links to its
+`list archives <https://lists.infradead.org/pipermail/ath10k/>`_. Sadly this and
+quite a few other lists lack a way to search the archives. In those cases use a
+regular internet search engine and add something like
+'site:lists.infradead.org/pipermail/ath10k/' to your search terms, which limits
+the results to the archives at that URL.
+
+It's also wise to check the internet, LKML and maybe bugzilla.kernel.org again
+at this point. If your report needs to be filed in a bug tracker, you may want
+to check the mailing list archives for the subsystem as well, as someone might
+have reported it only there.
+
+For details on how to search and what to do if you find matching reports see
+"Search for existing reports, first run" above.
+
+Do not hurry with this step of the reporting process: spending 30 to 60 minutes
+or even more time can save you and others quite a lot of time and trouble.
+
+
+Install a fresh kernel for testing
+----------------------------------
+
+ *Unless you are already running the latest 'mainline' Linux kernel, better
+ go and install it for the reporting process. Testing and reporting with
+ the latest 'stable' Linux can be an acceptable alternative in some
+ situations; during the merge window that actually might be even the best
+ approach, but in that development phase it can be an even better idea to
+ suspend your efforts for a few days anyway. Whatever version you choose,
+ ideally use a 'vanilla' build. Ignoring this advice will dramatically
+ increase the risk your report will be rejected or ignored.*
+
+As mentioned in the detailed explanation for the first step already: Like most
+programmers, Linux kernel developers don't like to spend time dealing with
+reports for issues that don't even happen with the current code. It's just a
+waste of everybody's time, especially yours. That's why it's in everybody's
+interest that you confirm the issue still exists with the latest upstream code
+before reporting it. You are free to ignore this advice, but as outlined
+earlier: doing so dramatically increases the risk that your issue report might
+get rejected or simply ignored.
+
+In the scope of the kernel "latest upstream" normally means:
+
+ * Install a mainline kernel; the latest stable kernel can be an option, but
+ most of the time is better avoided. Longterm kernels (sometimes called 'LTS
+ kernels') are unsuitable at this point of the process. The next subsection
+ explains all of this in more detail.
+
+ * The subsection after that describes ways to obtain and install such a
+ kernel. It also outlines that using a pre-compiled kernel is fine, but it
+ should ideally be vanilla, which means: it was built using Linux sources
+ taken straight `from kernel.org <https://kernel.org/>`_ and not modified or
+ enhanced in any way.
+
+Choosing the right version for testing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Head over to `kernel.org <https://kernel.org/>`_ to find out which version you
+want to use for testing. Ignore the big yellow button that says 'Latest release'
+and look a little lower at the table. At its top you'll see a line starting with
+mainline, which most of the time will point to a pre-release with a version
+number like '5.8-rc2'. If that's the case, you'll want to use this mainline
+kernel for testing, as that's where all fixes have to be applied first. Do not let
+that 'rc' scare you, these 'development kernels' are pretty reliable, and you
+made a backup, as you were instructed above, didn't you?
+
+In about two out of every nine to ten weeks, mainline might point you to a
+proper release with a version number like '5.7'. If that happens, consider
+suspending the reporting process until the first pre-release of the next
+version (5.8-rc1) shows up on kernel.org. That's because the Linux development
+cycle then is in its two-week long 'merge window'. The bulk of the changes and
+all intrusive ones get merged for the next release during this time. It's a bit
+more risky to use mainline during this period. Kernel developers are also often
+quite busy then and might have no spare time to deal with issue reports. It's
+also quite possible that one of the many changes applied during the merge
+window fixes the issue you face; that's why you soon would have to retest with
+a newer kernel version anyway, as outlined below in the section 'Duties after
+the report went out'.
+
+That's why it might make sense to wait till the merge window is over. But don't
+do that if you're dealing with something that shouldn't wait. In that case
+consider obtaining the latest mainline kernel via git (see below) or use the
+latest stable version offered on kernel.org. Using that is also acceptable in
+case mainline for some reason does currently not work for you. And in general:
+using it for reproducing the issue is also better than not reporting the issue
+at all.
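+
+The rule of thumb from the paragraphs above can be sketched like this in shell.
+The helper name ``mainline_phase`` is made up, and the check relies only on the
+'-rc' suffix in the version string shown in the mainline line on kernel.org:
+
```shell
# Hypothetical helper: classify the version shown in the 'mainline' line
# on kernel.org by checking for a pre-release ('-rc') suffix.
mainline_phase() {
    case "$1" in
        *-rc[0-9]*) echo "pre-release" ;;
        *)          echo "proper release" ;;
    esac
}

mainline_phase 5.8-rc2   # pre-release: use this kernel for testing
mainline_phase 5.7       # proper release: the merge window is likely open
```
+
+In the second case, waiting for the next '-rc1' is the default choice, unless
+your issue should not wait.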
+
+Better avoid using the latest stable kernel outside merge windows, as all fixes
+must be applied to mainline first. That's why checking the latest mainline
+kernel is so important: any issue you want to see fixed in older version lines
+needs to be fixed in mainline first before it can get backported, which can
+take a few days or weeks. Another reason: the fix you hope for might be too
+hard or risky for backporting; reporting the issue again hence is unlikely to
+change anything.
+
+These aspects are also why longterm kernels (sometimes called "LTS kernels")
+are unsuitable for this part of the reporting process: they are too distant from
+the current code. Hence go and test mainline first and follow the process
+further: if the issue doesn't occur with mainline it will guide you how to get
+it fixed in older version lines, if that's in the cards for the fix in question.
+
+How to obtain a fresh Linux kernel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Using a pre-compiled kernel**: This is often the quickest, easiest, and safest
+way for testing, especially if you are unfamiliar with the Linux kernel. The
+problem: most of those shipped by distributors or add-on repositories are built
+from modified Linux sources. They are thus not vanilla and therefore often
+unsuitable for testing and issue reporting: the changes might cause the issue
+you face or influence it somehow.
+
+But you are in luck if you are using a popular Linux distribution: for quite a
+few of them you'll find repositories on the net that contain packages with the
+latest mainline or stable Linux built as vanilla kernel. It's totally okay to
+use these, just make sure from the repository's description they are vanilla or
+at least close to it. Additionally ensure the packages contain the latest
+versions as offered on kernel.org. The packages are likely unsuitable if they
+are older than a week, as new mainline and stable kernels typically get released
+at least once a week.
+
+Please note that you might need to build your own kernel manually later: that's
+sometimes needed for debugging or testing fixes, as described later in this
+document. Also be aware that pre-compiled kernels might lack debug symbols that
+are needed to decode messages the kernel prints when a panic, Oops, warning, or
+BUG occurs; if you plan to decode those, you might be better off compiling a
+kernel yourself (see the end of this subsection and the section titled 'Decode
+failure messages' for details).
+
+**Using git**: Developers and experienced Linux users familiar with git are
+often best served by obtaining the latest Linux kernel sources straight from the
+`official development repository on kernel.org
+<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_.
+Those are likely a bit ahead of the latest mainline pre-release. Don't worry
+about it: they are as reliable as a proper pre-release, unless the kernel's
+development cycle is currently in the middle of a merge window. But even then
+they are quite reliable.
+
+**Conventional**: People unfamiliar with git are often best served by
+downloading the sources as tarball from `kernel.org <https://kernel.org/>`_.
+
+How to actually build a kernel is not described here, as many websites explain
+the necessary steps already. If you are new to it, consider following one of
+those how-to's that suggest using ``make localmodconfig``, as that tries to
+pick up the configuration of your current kernel and then tries to adjust it
+somewhat for your system. That does not make the resulting kernel any better,
+but quicker to compile.
+
+Note: If you are dealing with a panic, Oops, warning, or BUG from the kernel,
+please try to enable CONFIG_KALLSYMS when configuring your kernel.
+Additionally, enable CONFIG_DEBUG_KERNEL and CONFIG_DEBUG_INFO, too; the
+latter is the relevant one of those two, but can only be reached if you enable
+the former. Be aware CONFIG_DEBUG_INFO increases the storage space required to
+build a kernel by quite a bit. But that's worth it, as these options will allow
+you later to pinpoint the exact line of code that triggers your issue. The
+section 'Decode failure messages' below explains this in more detail.
+
+But keep in mind: Always keep a record of the issue encountered in case it is
+hard to reproduce. Sending an undecoded report is better than not reporting
+the issue at all.
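+
+Before starting a lengthy build you might want to verify that the options
+mentioned in the note above ended up in your configuration. The following is a
+minimal sketch, assuming the configuration lives in a file called ``.config``
+in the current directory (as is the case in the root of a configured source
+tree); the helper name ``check_debug_options`` is made up:
+
```shell
# Hypothetical helper: report whether the options needed for decoding
# failure messages are enabled in a kernel configuration file.
# Usage: check_debug_options [path to .config]
check_debug_options() {
    config=${1:-.config}
    for opt in CONFIG_KALLSYMS CONFIG_DEBUG_KERNEL CONFIG_DEBUG_INFO; do
        # Enabled built-in options appear as 'CONFIG_FOO=y' in the file.
        if grep -q "^${opt}=y\$" "$config"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
        fi
    done
}
```
+
+Calling ``check_debug_options`` in the root of your source tree prints one line
+per option; anything reported as missing can be enabled with your preferred
+configuration front end before you build.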
+
+
+Check 'taint' flag
+------------------
+
+ *Ensure the kernel you just installed does not 'taint' itself when
+ running.*
+
+As outlined above in more detail already: the kernel sets a 'taint' flag when
+something happens that can lead to follow-up errors that look totally
+unrelated. That's why you need to check if the kernel you just installed does
+not set this flag. And if it does, you in almost all cases need to
+eliminate the reason for it before reporting issues that occur with it. See
+the section above for details on how to do that.
+
+
+Reproduce issue with the fresh kernel
+-------------------------------------
+
+ *Reproduce the issue with the kernel you just installed. If it doesn't show
+ up there, scroll down to the instructions for issues only happening with
+ stable and longterm kernels.*
+
+Check if the issue occurs with the fresh Linux kernel version you just
+installed. If it was fixed there already, consider sticking with this version
+line and abandoning your plan to report the issue. But keep in mind that other
+users might still be plagued by it, as long as it's not fixed in the stable
+and longterm versions from kernel.org (and thus vendor kernels derived from
+those). If you prefer to use one of those or just want to help their users,
+head over to the section "Details about reporting issues only occurring in
+older kernel version lines" below.
+
+
+Optimize description to reproduce issue
+---------------------------------------
+
+ *Optimize your notes: try to find and write the most straightforward way to
+ reproduce your issue. Make sure the end result has all the important
+ details, and at the same time is easy to read and understand for others
+ that hear about it for the first time. And if you learned something in this
+ process, consider searching again for existing reports about the issue.*
+
+An unnecessarily complex report will make it hard for others to understand the
+issue. Thus try to find a reproducer that's straightforward to describe and
+easy to understand in written form. Include all important details, but at
+the same time try to keep it as short as possible.
+
+In this and the previous steps you have likely learned a thing or two about
+the issue you face. Use this knowledge and search again for existing reports
+that you can join instead.
+
+
+Decode failure messages
+-----------------------
+
+ *If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
+ decoding the kernel log to find the line of code that triggered the error.*
+
+When the kernel detects an internal problem, it will log some information about
+the executed code. This makes it possible to pinpoint the exact line in the
+source code that triggered the issue and shows how it was called. But that only
+works if you enabled CONFIG_DEBUG_INFO and CONFIG_KALLSYMS when configuring
+your kernel. If you did so, consider decoding the information from the
+kernel's log. That will make it a lot easier to understand what led to the
+'panic', 'Oops', 'warning', or 'BUG', which increases the chances that someone
+can provide a fix.
+
+Decoding can be done with a script you find in the Linux source tree. If you
+are running a kernel you compiled yourself earlier, call it like this::
+
+ [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh ./linux-5.10.5/vmlinux
+
+If you are running a packaged vanilla kernel, you will likely have to install
+the corresponding packages with debug symbols. Then call the script (which you
+might need to get from the Linux sources if your distro does not package it)
+like this::
+
+ [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh \
+ /usr/lib/debug/lib/modules/5.10.10-4.1.x86_64/vmlinux /usr/src/kernels/5.10.10-4.1.x86_64/
+
+The script will work on log lines like the following, which show the address of
+the code the kernel was executing when the error occurred::
+
+ [ 68.387301] RIP: 0010:test_module_init+0x5/0xffa [test_module]
+
+Once decoded, these lines will look like this::
+
+ [ 68.387301] RIP: 0010:test_module_init (/home/username/linux-5.10.5/test-module/test-module.c:16) test_module
+
+In this case the executed code was built from the file
+'~/linux-5.10.5/test-module/test-module.c' and the error was triggered by the
+instructions found in line '16'.
+
+The script will similarly decode the addresses mentioned in the section
+starting with 'Call trace', which show the path to the function where the
+problem occurred. Additionally, the script will show the assembler output for
+the code section the kernel was executing.
+
+Note, if you can't get this to work, simply skip this step and mention the
+reason for it in the report. If you're lucky, it might not be needed. And if it
+is, someone might help you to get things going. Also be aware this is just one
+of several ways to decode kernel stack traces. Sometimes different steps will
+be required to retrieve the relevant details. Don't worry about that, if that's
+needed in your case, developers will tell you what to do.
+
+
+Special care for regressions
+----------------------------
+
+ *If your problem is a regression, try to narrow down when the issue was
+ introduced as much as possible.*
+
+Linux lead developer Linus Torvalds insists that the Linux kernel never
+worsens; that's why he deems regressions unacceptable and wants to see them
+fixed quickly. That's why changes that introduced a regression are often
+promptly reverted if the issue they cause can't get solved quickly any other
+way. Reporting a regression is thus a bit like playing a kind of trump card to
+get something quickly fixed. But for that to happen the change that's causing
+the regression needs to be known. Normally it's up to the reporter to track
+down the culprit, as maintainers often won't have the time or setup at hand to
+reproduce it themselves.
+
+To find the change there is a process called 'bisection' which the document
+Documentation/admin-guide/bug-bisect.rst describes in detail. That process
+will often require you to build about ten to twenty kernel images, trying to
+reproduce the issue with each of them before building the next. Yes, that takes
+some time, but don't worry, it works a lot quicker than most people assume.
+Thanks to a 'binary search' this will lead you to the one commit in the source
+code management system that's causing the regression. Once you find it, search
+the net for the subject of the change, its commit id and the shortened commit id
+(the first 12 characters of the commit id). This will lead you to existing
+reports about it, if there are any.
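+
+The 'ten to twenty' estimate follows from the binary search: every kernel you
+test roughly halves the number of commits that remain as candidates. A minimal
+sketch of that arithmetic, using a hypothetical helper called
+``bisect_steps`` and made-up commit counts:
+

```shell
# Hypothetical helper: estimate how many kernels a bisection needs to test.
# Each step halves the remaining range, so the result is roughly the base-2
# logarithm of the number of commits between the good and the bad version.
bisect_steps() {
    commits="$1"
    steps=0
    while [ "$commits" -gt 1 ]; do
        commits=$(( (commits + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# The real number of commits between two releases can be counted in a git
# checkout of the kernel sources, for example with:
#   git rev-list --count v5.7..v5.8
```

+
+With this helper ``bisect_steps 1000`` prints 10 and ``bisect_steps 16000``
+prints 14; the real number of builds can differ slightly, for example when a
+commit cannot be built or tested and has to be skipped.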
+
+Note, a bisection needs a bit of know-how, which not everyone has, and quite a
+bit of effort, which not everyone is willing to invest. Nevertheless, it's
+highly recommended to perform a bisection yourself. If you really can't or
+don't want to go down that route, at least find out which mainline kernel
+introduced the regression. If something for example breaks when switching from
+5.5.15 to 5.8.4, then try at least all the mainline releases in that area (5.6,
+5.7 and 5.8) to check when it first showed up. Unless you're trying to find a
+regression in a stable or longterm kernel, avoid testing versions whose number
+has three sections (5.6.12, 5.7.8), as that makes the outcome hard to
+interpret, which might render your testing useless. Once you found the major
+version which introduced the regression, feel free to move on in the reporting
+process. But keep in mind: it depends on the issue at hand if the developers
+will be able to help without knowing the culprit. Sometimes they might
+recognize from the report what went wrong and can fix it; other times they will
+be unable to help unless you perform a bisection.
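+
+The list of mainline releases to try in the example above can be derived
+mechanically. The following sketch uses a hypothetical helper called
+``mainline_candidates`` and assumes that both versions share the same first
+number (here '5'):
+

```shell
# Hypothetical helper: list the mainline releases worth testing between the
# last version known to work and the first version known to fail.
# Assumption: both versions have the same leading number.
mainline_candidates() {
    good="$1"   # last version known to work, e.g. 5.5.15
    bad="$2"    # first version known to fail, e.g. 5.8.4
    major=${good%%.*}
    rest=${good#*.}; good_minor=${rest%%.*}
    rest=${bad#*.};  bad_minor=${rest%%.*}
    minor=$(( good_minor + 1 ))
    while [ "$minor" -le "$bad_minor" ]; do
        printf '%s.%s\n' "$major" "$minor"
        minor=$(( minor + 1 ))
    done
}

# Usage:
#   mainline_candidates 5.5.15 5.8.4
```

+
+For the versions from the example this prints 5.6, 5.7 and 5.8, one per line.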
+
+When dealing with regressions make sure the issue you face is really caused by
+the kernel and not by something else, as outlined above already.
+
+In the whole process keep in mind: an issue only qualifies as regression if the
+older and the newer kernel got built with a similar configuration. This can be
+achieved by using ``make olddefconfig``, as explained in more detail by
+Documentation/admin-guide/reporting-regressions.rst; that document also
+provides a good deal of other information about regressions you might want to be
+aware of.
+
+
+Write and send the report
+-------------------------
+
+ *Start to compile the report by writing a detailed description about the
+ issue. Always mention a few things: the latest kernel version you installed
+ for reproducing, the Linux Distribution used, and your notes on how to
+ reproduce the issue. Ideally, make the kernel's build configuration
+ (.config) and the output from ``dmesg`` available somewhere on the net and
+ link to it. Include or upload all other information that might be relevant,
+ like the output/screenshot of an Oops or the output from ``lspci``. Once
+ you wrote this main part, insert a normal length paragraph on top of it
+ outlining the issue and the impact quickly. On top of this add one sentence
+ that briefly describes the problem and gets people to read on. Now give the
+ thing a descriptive title or subject that yet again is shorter. Then you're
+ ready to send or file the report like the MAINTAINERS file told you, unless
+ you are dealing with one of those 'issues of high priority': they need
+ special care which is explained in 'Special handling for high priority
+ issues' below.*
+
+Now that you have prepared everything it's time to write your report. How to do
+that is partly explained by the three documents linked to in the preface above.
+That's why this text will only mention a few of the essentials as well as
+things specific to the Linux kernel.
+
+There is one thing that fits both categories: the most crucial parts of your
+report are the title/subject, the first sentence, and the first paragraph.
+Developers often get quite a lot of mail. They thus often just take a few
+seconds to skim a mail before deciding to move on or look closer. Thus: the
+better the top section of your report, the higher are the chances that someone
+will look into it and help you. And that is why you should ignore them for now
+and write the detailed report first. ;-)
+
+Things each report should mention
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Describe in detail how your issue happens with the fresh vanilla kernel you
+installed. Try to include the step-by-step instructions you wrote and optimized
+earlier that outline how you and ideally others can reproduce the issue; in
+those rare cases where that's impossible try to describe what you did to
+trigger it.
+
+Also include all the relevant information others might need to understand the
+issue and its environment. What's actually needed depends a lot on the issue,
+but there are some things you should include always:
+
+ * the output from ``cat /proc/version``, which contains the Linux kernel
+ version number and the compiler it was built with.
+
+ * the Linux distribution the machine is running (``hostnamectl | grep
+ "Operating System"``)
+
+ * the architecture of the CPU and the operating system (``uname -mi``)
+
+ * if you are dealing with a regression and performed a bisection, mention the
+ subject and the commit-id of the change that is causing it.
+
+In a lot of cases it's also wise to make two more things available to those
+that read your report:
+
+ * the configuration used for building your Linux kernel (the '.config' file)
+
+ * the kernel's messages that you get from ``dmesg`` written to a file. Make
+ sure that it starts with a line like 'Linux version 5.8-1
+ (foobar@example.com) (gcc (GCC) 10.2.1, GNU ld version 2.34) #1 SMP Mon Aug
+ 3 14:54:37 UTC 2020' If it's missing, then important messages from the first
+ boot phase already got discarded. In this case instead consider using
+ ``journalctl -b 0 -k``; alternatively you can also reboot, reproduce the
+ issue and call ``dmesg`` right afterwards.
+
+These two files are big, that's why it's a bad idea to put them directly into
+your report. If you are filing the issue in a bug tracker then attach them to
+the ticket. If you report the issue by mail do not attach them, as that makes
+the mail too large; instead do one of these things:
+
+ * Upload the files somewhere public (your website, a public file paste
+ service, a ticket created just for this purpose on `bugzilla.kernel.org
+ <https://bugzilla.kernel.org/>`_, ...) and include a link to them in your
+ report. Ideally use something where the files stay available for years, as
+ they could be useful to someone many years from now; this for example can
+ happen if five or ten years from now a developer works on some code that was
+ changed just to fix your issue.
+
+ * Put the files aside and mention you will send them later in individual
+ replies to your own mail. Just remember to actually do that once the report
+ went out. ;-)
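+
+Whether a saved kernel log still contains the start of the boot, as described
+above, can be checked with a one-line test. This is only a sketch: the helper
+name ``log_has_boot_start`` and the file name 'dmesg.txt' are made up, and it
+assumes the 'Linux version' line is the very first line of the log:
+

```shell
# Hypothetical helper: succeed if the first line of a saved kernel log
# contains the 'Linux version' banner printed at the start of the boot.
log_has_boot_start() {
    head -n 1 "$1" | grep -q 'Linux version '
}

# Usage (reading the kernel log might require root on some systems):
#   dmesg > dmesg.txt
#   log_has_boot_start dmesg.txt || journalctl -b 0 -k > dmesg.txt
```

+
+If the check fails, the first messages were already discarded and one of the
+alternatives mentioned above is needed.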
+
+Things that might be wise to provide
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Depending on the issue you might need to add more background data. Here are a
+few suggestions on what is often good to provide:
+
+ * If you are dealing with a 'warning', an 'OOPS' or a 'panic' from the kernel,
+ include it. If you can't copy'n'paste it, try to capture a netconsole trace
+ or at least take a picture of the screen.
+
+ * If the issue might be related to your computer hardware, mention what kind
+ of system you use. If you for example have problems with your graphics card,
+ mention its manufacturer, the card's model, and what chip it uses. If it's a
+ laptop mention its name, but try to make sure it's meaningful. 'Dell XPS 13'
+ for example is not, because it might be the one from 2012; that one looks
+ not that different from the one sold today, but apart from that the two have
+ nothing in common. Hence, in such cases add the exact model number, which
+ for example are '9380' or '7390' for XPS 13 models introduced during 2019.
+ Names like 'Lenovo Thinkpad T590' are also somewhat ambiguous: there are
+ variants of this laptop with and without a dedicated graphics chip, so try
+ to find the exact model name or specify the main components.
+
+ * Mention the relevant software in use. If you have problems with loading
+ modules, you want to mention the versions of kmod, systemd, and udev in use.
+ If one of the DRM drivers misbehaves, you want to state the versions of
+ libdrm and Mesa; also specify your Wayland compositor or the X-Server and
+ its driver. If you have a filesystem issue, mention the version of
+ corresponding filesystem utilities (e2fsprogs, btrfs-progs, xfsprogs, ...).
+
+ * Gather additional information from the kernel that might be of interest. The
+ output from ``lspci -nn`` will for example help others to identify what
+ hardware you use. If you have a problem with hardware you even might want to
+ make the output from ``sudo lspci -vvv`` available, as that provides
+ insights how the components were configured. For some issues it might be
+ good to include the contents of files like ``/proc/cpuinfo``,
+ ``/proc/ioports``, ``/proc/iomem``, ``/proc/modules``, or
+ ``/proc/scsi/scsi``. Some subsystems also offer tools to collect relevant
+ information. One such tool is ``alsa-info.sh`` `which the audio/sound
+ subsystem developers provide <https://www.alsa-project.org/wiki/AlsaInfo>`_.
+
+Those examples should give you some ideas of what data might be wise to
+attach, but you have to think yourself what will be helpful for others to know.
+Don't worry too much about forgetting something, as developers will ask for
+additional details they need. But making everything important available from
+the start increases the chance someone will take a closer look.
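+
+Collecting the files from /proc mentioned above can be scripted. The
+following is a minimal sketch, not an official tool: the helper name
+``collect_proc_info`` and the output layout are assumptions, and files that
+do not exist or are not readable on your system are silently skipped:
+

```shell
# Hypothetical helper: copy the /proc files named above into a directory so
# they can be uploaded or attached together.
collect_proc_info() {
    out_dir="$1"
    mkdir -p "$out_dir" || return 1
    for f in /proc/cpuinfo /proc/ioports /proc/iomem /proc/modules \
             /proc/scsi/scsi; do
        if [ -r "$f" ]; then
            # Flatten the path into a file name, e.g. proc_scsi_scsi.txt
            name=$(echo "${f#/}" | tr '/' '_')
            cat "$f" > "$out_dir/$name.txt"
        fi
    done
}

# Usage:
#   collect_proc_info ./report-data
```

+
+Look through the collected files before publishing them, as some of them
+might contain details you consider private.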
+
+
+The important part: the head of your report
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now that you have the detailed part of the report prepared let's get to the
+most important section: the first few sentences. Thus go to the top, add
+something like 'The detailed description:' before the part you just wrote and
+insert two newlines at the top. Now write one normal length paragraph that
+describes the issue roughly. Leave out all boring details and focus on the
+crucial parts readers need to know to understand what this is all about; if you
+think this bug affects a lot of users, mention this to get people interested.
+
+Once you did that insert two more lines at the top and write a one sentence
+summary that explains quickly what the report is about. After that you have to
+get even more abstract and write an even shorter subject/title for the report.
+
+Now that you have written this part take some time to optimize it, as it is the
+most important part of your report: a lot of people will only read this before
+they decide if reading the rest is time well spent.
+
+Now send or file the report like the :ref:`MAINTAINERS <maintainers>` file told
+you, unless it's one of those 'issues of high priority' outlined earlier: in
+that case please read the next subsection first before sending the report on
+its way.
+
+Special handling for high priority issues
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Reports for high priority issues need special handling.
+
+**Severe issues**: make sure the subject or ticket title as well as the first
+paragraph make the severity obvious.
+
+**Regressions**: make the report's subject start with '[REGRESSION]'.
+
+In case you performed a successful bisection, use the title of the change that
+introduced the regression as the second part of your subject. Make the report
+also mention the commit id of the culprit. In case of an unsuccessful bisection,
+make your report mention the latest tested version that's working fine (say 5.7)
+and the oldest where the issue occurs (say 5.8-rc1).
+
+When sending the report by mail, CC the Linux regressions mailing list
+(regressions@lists.linux.dev). In case the report needs to be filed to some web
+tracker, proceed to do so. Once filed, forward the report by mail to the
+regressions list; CC the maintainer and the mailing list for the subsystem in
+question. Make sure to inline the forwarded report, hence do not attach it.
+Also add a short note at the top where you mention the URL to the ticket.
+
+When mailing or forwarding the report, in case of a successful bisection add the
+author of the culprit to the recipients; also CC everyone in the signed-off-by
+chain, which you find at the end of its commit message.
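+
+The addresses from the signed-off-by chain can be extracted from the commit
+message with a short filter. This is a sketch under the assumption that each
+tag has the usual 'Signed-off-by: Name <address>' form; the helper name
+``signoff_addresses`` is made up:
+

```shell
# Hypothetical helper: read a commit message on standard input and print the
# mail address from every Signed-off-by line, one per line.
signoff_addresses() {
    sed -n 's/^[Ss]igned-off-by:.*<\([^>]*\)>.*$/\1/p'
}

# Usage in a git checkout of the kernel sources, with the commit id of the
# culprit in place of <commit-id>:
#   git log -1 --format=%B <commit-id> | signoff_addresses
# The author's address is available separately:
#   git log -1 --format=%ae <commit-id>
```

+
+Compare the output with the commit message before sending, as tags that are
+formatted unusually will not be picked up by this filter.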
+
+**Security issues**: for these issues you will have to evaluate if a
+short-term risk to other users would arise if details were publicly disclosed.
+If that's not the case simply proceed with reporting the issue as described.
+For issues that bear such a risk you will need to adjust the reporting process
+slightly:
+
+ * If the MAINTAINERS file instructed you to report the issue by mail, do not
+ CC any public mailing lists.
+
+ * If you were supposed to file the issue in a bug tracker make sure to mark
+ the ticket as 'private' or 'security issue'. If the bug tracker does not
+ offer a way to keep reports private, forget about it and send your report as
+ a private mail to the maintainers instead.
+
+In both cases make sure to also mail your report to the addresses the
+MAINTAINERS file lists in the section 'security contact'. Ideally directly CC
+them when sending the report by mail. If you filed it in a bug tracker, forward
+the report's text to these addresses; but on top of it put a small note where
+you mention that you filed it with a link to the ticket.
+
+See Documentation/admin-guide/security-bugs.rst for more information.
+
+
+Duties after the report went out
+--------------------------------
+
+ *Wait for reactions and keep the thing rolling until you can accept the
+ outcome in one way or the other. Thus react publicly and in a timely manner
+ to any inquiries. Test proposed fixes. Do proactive testing: retest with at
+ least every first release candidate (RC) of a new mainline version and
+ report your results. Send friendly reminders if things stall. And try to
+ help yourself, if you don't get any help or if it's unsatisfying.*
+
+If your report was good and you are really lucky then one of the developers
+might immediately spot what's causing the issue; they then might write a patch
+to fix it, test it, and send it straight for integration in mainline while
+tagging it for later backport to stable and longterm kernels that need it. Then
+all you need to do is reply with a 'Thank you very much' and switch to a version
+with the fix once it gets released.
+
+But this ideal scenario rarely happens. That's why the job is only starting
+once you got the report out. What you'll have to do depends on the situation,
+but often it will be the things listed below. But before digging into the
+details, here are a few important things you need to keep in mind for this part
+of the process.
+
+
+General advice for further interactions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Always reply in public**: When you filed the issue in a bug tracker, always
+reply there and do not contact any of the developers privately about it. For
+mailed reports always use the 'Reply-all' function when replying to any mails
+you receive. That includes mails with any additional data you might want to add
+to your report: go to your mail application's 'Sent' folder and use 'reply-all'
+on your mail with the report. This approach will make sure the public mailing
+list(s) and everyone else that gets involved over time stays in the loop; it
+also keeps the mail thread intact, which among other things is really important for
+mailing lists to group all related mails together.
+
+There are just two situations where a comment in a bug tracker or a 'Reply-all'
+is unsuitable:
+
+ * Someone tells you to send something privately.
+
+ * You were told to send something, but noticed it contains sensitive
+ information that needs to be kept private. In that case it's okay to send it
+ in private to the developer that asked for it. But note in the ticket or a
+ mail that you did that, so everyone else knows you honored the request.
+
+**Do research before asking for clarifications or help**: In this part of the
+process someone might tell you to do something that requires a skill you might
+not have mastered yet. For example, you might be asked to use some test tools
+you never have heard of yet; or you might be asked to apply a patch to the
+Linux kernel sources to test if it helps. In some cases it will be fine sending
+a reply asking for instructions how to do that. But before going that route try
+to find the answer on your own by searching the internet; alternatively
+consider asking in other places for advice. For example ask a friend or post
+about it to a chatroom or forum you normally hang out in.
+
+**Be patient**: If you are really lucky you might get a reply to your report
+within a few hours. But most of the time it will take longer, as maintainers
+are scattered around the globe and thus might be in a different time zone – one
+where they already enjoy their night away from keyboard.
+
+In general, kernel developers will take one to five business days to respond to
+reports. Sometimes it will take longer, as they might be busy with the merge
+windows, other work, visiting developer conferences, or simply enjoying a long
+summer holiday.
+
+The 'issues of high priority' (see above for an explanation) are an exception
+here: maintainers should address them as soon as possible; that's why you
+should wait a week at maximum (or just two days if it's something urgent)
+before sending a friendly reminder.
+
+Sometimes the maintainer might not be responding in a timely manner; other
+times there might be disagreements, for example if an issue qualifies as
+regression or not. In such cases raise your concerns on the mailing list and
+ask others for public or private replies how to move on. If that fails, it
+might be appropriate to get a higher authority involved. In case of a WiFi
+driver that would be the wireless maintainers; if there are no higher level
+maintainers or all else fails, it might be one of those rare situations where
+it's okay to get Linus Torvalds involved.
+
+**Proactive testing**: Every time the first pre-release (the 'rc1') of a new
+mainline kernel version gets released, go and check if the issue is fixed there
+or if anything of importance changed. Mention the outcome in the ticket or in a
+mail you sent as reply to your report (make sure it has all those in the CC
+that up to that point participated in the discussion). This will show your
+commitment and that you are willing to help. It also tells developers if the
+issue persists and makes sure they do not forget about it. A few other
+occasional retests (for example with rc3, rc5 and the final) are also a good
+idea, but only report your results if something relevant changed or if you are
+writing something anyway.
+
+With all these general things off the table let's get into the details of how
+to help to get issues resolved once they were reported.
+
+Inquires and testing request
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Here are your duties in case you got replies to your report:
+
+**Check who you deal with**: Most of the time it will be the maintainer or a
+developer of the particular code area that will respond to your report. But as
+issues are normally reported in public it could be anyone that's replying —
+including people that want to help, but in the end might guide you totally off
+track with their questions or requests. That rarely happens, but it's one of
+many reasons why it's wise to quickly run an internet search to see who you're
+interacting with. By doing this you also become aware if your report was heard by
+the right people, as a reminder to the maintainer (see below) might be in order
+later if discussion fades out without leading to a satisfying solution for the
+issue.
+
+**Inquiries for data**: Often you will be asked to test something or provide
+additional details. Try to provide the requested information soon, as you have
+the attention of someone that might help and risk losing it the longer you
+wait; that outcome is even likely if you do not provide the information within
+a few business days.
+
+**Requests for testing**: When you are asked to test a diagnostic patch or a
+possible fix, try to test it in a timely manner, too. But do it properly and make
+sure to not rush it: mixing things up can happen easily and can lead to a lot
+of confusion for everyone involved. A common mistake for example is thinking a
+proposed patch with a fix was applied, but in fact wasn't. Things like that
+happen even to experienced testers occasionally, but they most of the time will
+notice when the kernel with the fix behaves just as one without it.
+
+What to do when nothing of substance happens
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some reports will not get any reaction from the responsible Linux kernel
+developers; or a discussion around the issue evolved, but faded out with
+nothing of substance coming out of it.
+
+In these cases wait two (better: three) weeks before sending a friendly
+reminder: maybe the maintainer was just away from keyboard for a while when
+your report arrived or had something more important to take care of. When
+writing the reminder, kindly ask if anything else from your side is needed to
+get the ball running somehow. If the report got out by mail, do that in the
+first lines of a mail that is a reply to your initial mail (see above) which
+includes a full quote of the original report below: that's one of those few
+situations where such a 'TOFU' (Text Over, Fullquote Under) is the right
+approach, as then all the recipients will have the details at hand immediately
+in the proper order.
+
+After the reminder wait three more weeks for replies. If you still don't get a
+proper reaction, you first should reconsider your approach. Did you maybe try
+to reach out to the wrong people? Was the report maybe offensive or so
+confusing that people decided to completely stay away from it? The best way to
+rule out such factors: show the report to one or two people familiar with FLOSS
+issue reporting and ask for their opinion. Also ask them for their advice how
+to move forward. That might mean: prepare a better report and make those people
+review it before you send it out. Such an approach is totally fine; just
+mention that this is the second and improved report on the issue and include a
+link to the first report.
+
+If the report was proper you can send a second reminder; in it ask for advice
+why the report did not get any replies. A good moment for this second reminder
+mail is shortly after the first pre-release (the 'rc1') of a new Linux kernel
+version got published, as you should retest and provide a status update at that
+point anyway (see above).
+
+If the second reminder again results in no reaction within a week, try to
+contact a higher-level maintainer asking for advice: even busy maintainers by
+then should at least have sent some kind of acknowledgment.
+
+Remember to prepare yourself for a disappointment: maintainers ideally should
+react somehow to every issue report, but they are only obliged to fix those
+'issues of high priority' outlined earlier. So don't be too devastated if you
+get a reply along the lines of 'thanks for the report, I have more important
+issues to deal with currently and won't have time to look into this for the
+foreseeable future'.
+
+It's also possible that after some discussion in the bug tracker or on a list
+nothing happens anymore and reminders don't help to motivate anyone to work out
+a fix. Such situations can be devastating, but are in the cards when it
+comes to Linux kernel development. This and several other reasons for not
+getting help are explained in 'Why some issues won't get any reaction or remain
+unfixed after being reported' near the end of this document.
+
+Don't get devastated if you don't find any help or if the issue in the end does
+not get solved: the Linux kernel is FLOSS and thus you can still help yourself.
+You for example could try to find others that are affected and team up with
+them to get the issue resolved. Such a team could prepare a fresh report
+together that mentions how many you are and why this is something that in your
+opinion should get fixed. Maybe together you can also narrow down the root cause
+or the change that introduced a regression, which often makes developing a fix
+easier. And with a bit of luck there might be someone in the team that knows a
+bit about programming and might be able to write a fix.
+
+
+Reference for "Reporting regressions within a stable and longterm kernel line"
+------------------------------------------------------------------------------
+
+This subsection provides details for the steps you need to perform if you face
+a regression within a stable or longterm kernel line.
+
+Make sure the particular version line still gets support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ *Check if the kernel developers still maintain the Linux kernel version
+ line you care about: go to the front page of kernel.org and make sure it
+ mentions the latest release of the particular version line without an
+ '[EOL]' tag.*
+
+Most kernel version lines only get supported for about three months, as
+maintaining them longer is quite a lot of work. Hence, only one per year is
+chosen and gets supported for at least two years (often six). That's why you
+need to check if the kernel developers still support the version line you care
+for.
+
+Note, if kernel.org lists two stable version lines on the front page, you
+should consider switching to the newer one and forget about the older one:
+support for it is likely to be abandoned soon. Then it will get an "end-of-life"
+(EOL) stamp. Version lines that reached that point still get mentioned on the
+kernel.org front page for a week or two, but are unsuitable for testing and
+reporting.
+
+Search stable mailing list
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ *Check the archives of the Linux stable mailing list for existing reports.*
+
+Maybe the issue you face is already known and was fixed or is about to be. Hence,
+`search the archives of the Linux stable mailing list
+<https://lore.kernel.org/stable/>`_ for reports about an issue like yours. If
+you find any matches, consider joining the discussion, unless the fix is
+already finished and scheduled to get applied soon.
+
+Reproduce issue with the newest release
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ *Install the latest release from the particular version line as a vanilla
+ kernel. Ensure this kernel is not tainted and still shows the problem, as
+ the issue might have already been fixed there. If you first noticed the
+ problem with a vendor kernel, check a vanilla build of the last version
+ known to work performs fine as well.*
+
+Before investing any more time in this process you want to check if the issue
+was already fixed in the latest release of the version line you're interested
+in. This kernel needs to be vanilla and shouldn't be tainted before the issue
+happens, as outlined in detail above in the section "Install a fresh kernel
+for testing".
+
+Did you first notice the regression with a vendor kernel? Then changes the
+vendor applied might be interfering. You need to rule that out by performing
+a recheck. Say something broke when you updated from 5.10.4-vendor.42 to
+5.10.5-vendor.43. Then, after testing the latest 5.10 release as outlined in
+the previous paragraph, check if a vanilla build of Linux 5.10.4 works fine as
+well. If things are broken there, the issue does not qualify as an upstream
+regression and you need to switch back to the main step-by-step guide to report
+the issue.
+
+Report the regression
+~~~~~~~~~~~~~~~~~~~~~
+
+ *Send a short problem report to the Linux stable mailing list
+ (stable@vger.kernel.org) and CC the Linux regressions mailing list
+ (regressions@lists.linux.dev); if you suspect the cause in a particular
+ subsystem, CC its maintainer and its mailing list. Roughly describe the
+ issue and ideally explain how to reproduce it. Mention the first version
+ that shows the problem and the last version that's working fine. Then
+ wait for further instructions.*
+
+When reporting a regression that happens within a stable or longterm kernel
+line (say when updating from 5.10.4 to 5.10.5) a brief report is enough for
+the start to get the issue reported quickly. Hence a rough description to the
+stable and regressions mailing list is all it takes; but in case you suspect
+the cause in a particular subsystem, CC its maintainers and its mailing list
+as well, because that will speed things up.
+
+And note, it helps developers a great deal if you can specify the exact version
+that introduced the problem. Hence if possible within a reasonable time frame,
+try to find that version using vanilla kernels. Let's assume something broke when
+your distributor released an update from Linux kernel 5.10.5 to 5.10.8. Then as
+instructed above go and check the latest kernel from that version line, say
+5.10.9. If it shows the problem, try a vanilla 5.10.5 to ensure that no patches
+the distributor applied interfere. If the issue doesn't manifest itself there,
+try 5.10.7 and then (depending on the outcome) 5.10.8 or 5.10.6 to find the
+first version where things broke. Mention it in the report and state that 5.10.9
+is still broken.
+
+What the previous paragraph outlines is basically a rough manual 'bisection'.
+Once your report is out you might get asked to do a proper one, as it allows
+pinpointing the exact change that causes the issue (which then can easily get
+reverted to fix the issue quickly). Hence consider doing a proper bisection
+right away if time permits. See the section 'Special care for regressions' and
+the document Documentation/admin-guide/bug-bisect.rst for details on how to
+perform one. In case of a successful bisection add the author of the culprit to
+the recipients; also CC everyone in the signed-off-by chain, which you find at
+the end of its commit message.
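+
+The rough manual narrowing described above is essentially a binary search over
+the releases of one version line. The shell function below is only a sketch of
+that idea and not part of any kernel tooling: ``first_bad`` and the check
+command passed to it are hypothetical names, and in practice "running the
+check" means installing the vanilla kernel in question, booting it, and testing
+by hand.
+

```shell
# first_bad CHECK VERSION...
#
# VERSIONs must be ordered from oldest to newest; the first one is assumed to
# be known good, the last one known bad. CHECK is a command (or shell function)
# that receives one version as argument and returns 0 if that version works.
# Prints the first version for which CHECK fails.
first_bad() {
    check=$1
    shift
    [ "$#" -ge 2 ] || return 2

    lo=1      # position of the newest version known to be good
    hi=$#     # position of the oldest version known to be bad
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$check" "$candidate"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    eval "printf '%s\n' \"\${$hi}\""
}
```

+
+Called with the versions 5.10.5 to 5.10.9 from the example, the function tries
+5.10.7 first and then, depending on the outcome, 5.10.8 or 5.10.6.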
+
+
+Reference for "Reporting issues only occurring in older kernel version lines"
+-----------------------------------------------------------------------------
+
+This section provides details for the steps you need to take if you could not
+reproduce your issue with a mainline kernel, but want to see it fixed in older
+version lines (aka stable and longterm kernels).
+
+Some fixes are too complex
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ *Prepare yourself for the possibility that going through the next few steps
+ might not get the issue solved in older releases: the fix might be too big
+ or risky to get backported there.*
+
+Even small and seemingly obvious code-changes sometimes introduce new and
+totally unexpected problems. The maintainers of the stable and longterm kernels
+are very aware of that and thus only apply changes to these kernels that are
+within rules outlined in Documentation/process/stable-kernel-rules.rst.
+
+Complex or risky changes for example do not qualify and thus only get applied
+to mainline. Other fixes are easy to get backported to the newest stable and
+longterm kernels, but too risky to integrate into older ones. So be aware the
+fix you are hoping for might be one of those that won't be backported to the
+version line you care about. In that case you'll have no other choice than to
+live with the issue or switch to a newer Linux version, unless you want to
+patch the fix into your kernels yourself.
+
+Common preparations
+~~~~~~~~~~~~~~~~~~~
+
+ *Perform the first three steps in the section "Reporting issues only
+ occurring in older kernel version lines" above.*
+
+You need to carry out a few steps already described in another section of this
+guide. Those steps will let you:
+
+ * Check if the kernel developers still maintain the Linux kernel version line
+ you care about.
+
+ * Search the Linux stable mailing list for existing reports.
+
+ * Check with the latest release.
+
+
+Check code history and search for existing discussions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ *Search the Linux kernel version control system for the change that fixed
+ the issue in mainline, as its commit message might tell you if the fix is
+ scheduled for backporting already. If you don't find anything that way,
+ search the appropriate mailing lists for posts that discuss such an issue
+ or peer-review possible fixes; then check the discussions if the fix was
+ deemed unsuitable for backporting. If backporting was not considered at
+ all, join the newest discussion, asking if it's in the cards.*
+
+In a lot of cases the issue you deal with will have happened with mainline, but
+got fixed there. The commit that fixed it would need to get backported as well
+to get the issue solved. That's why you want to search for it or any
+discussions about it.
+
+ * First try to find the fix in the Git repository that holds the Linux kernel
+ sources. You can do this with the web interfaces `on kernel.org
+ <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_
+ or its mirror `on GitHub <https://github.com/torvalds/linux>`_; if you have
+ a local clone you alternatively can search on the command line with ``git
+ log --grep=<pattern>``.
+
+ If you find the fix, look if the commit message near the end contains a
+ 'stable tag' that looks like this:
+
+ Cc: <stable@vger.kernel.org> # 5.4+
+
+ If that's the case the developer marked the fix safe for backporting to version
+ line 5.4 and later. Most of the time it's getting applied there within two
+ weeks, but sometimes it takes a bit longer.
+
+ * If the commit doesn't tell you anything or if you can't find the fix, look
+ again for discussions about the issue. Search the net with your favorite
+ internet search engine as well as the archives for the `Linux kernel
+ developers mailing list <https://lore.kernel.org/lkml/>`_. Also read the
+ section `Locate kernel area that causes the issue` above and follow the
+ instructions to find the subsystem in question: its bug tracker or mailing
+ list archive might have the answer you are looking for.
+
+ * If you see a proposed fix, search for it in the version control system as
+ outlined above, as the commit might tell you if a backport can be expected.
+
+ * Check the discussions for any indicators the fix might be too risky to get
+ backported to the version line you care about. If that's the case you have
+ to live with the issue or switch to the kernel version line where the fix
+ got applied.
+
+ * If the fix doesn't contain a stable tag and backporting was not discussed,
+ join the discussion: mention the version where you face the issue and that
+ you would like to see it fixed, if suitable.
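+
+Checking a commit message for a stable tag can also be done on the command
+line. The helper below is a sketch under the assumption that the tag follows
+the usual ``Cc: <stable@vger.kernel.org>`` form shown above; ``stable_tags`` is
+a made-up name, and the function simply filters whatever commit message it
+reads from standard input.
+

```shell
# Print every stable tag found in a commit message read from stdin.
# The exit status is non-zero if the message contains none.
stable_tags() {
    grep -i -E '^[[:space:]]*Cc:[[:space:]]*<?stable@vger\.kernel\.org>?'
}

# Possible use inside a local clone ("<commit>" is a placeholder for the ID of
# the fix you found):
#   git log -1 --format=%B <commit> | stable_tags
```

+
+If the function prints nothing, the commit carries no stable tag and you need
+to check the discussions as described in the list above.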
+
+
+Ask for advice
+~~~~~~~~~~~~~~
+
+ *One of the former steps should lead to a solution. If that doesn't work
+ out, ask the maintainers for the subsystem that seems to be causing the
+ issue for advice; CC the mailing list for the particular subsystem as well
+ as the stable mailing list.*
+
+If the previous three steps didn't get you closer to a solution there is only
+one option left: ask for advice. Do that in a mail you send to the maintainers
+for the subsystem where the issue seems to have its roots; CC the mailing list
+for the subsystem as well as the stable mailing list (stable@vger.kernel.org).
+
+
+Why some issues won't get any reaction or remain unfixed after being reported
+=============================================================================
+
+When reporting a problem to the Linux developers, be aware only 'issues of high
+priority' (regressions, security issues, severe problems) are definitely going
+to get resolved. The maintainers or if all else fails Linus Torvalds himself
+will make sure of that. They and the other kernel developers will fix a lot of
+other issues as well. But be aware that sometimes they can't or won't help; and
+sometimes there isn't even anyone to send a report to.
+
+This is best explained with kernel developers that contribute to the Linux
+kernel in their spare time. Quite a few of the drivers in the kernel were
+written by such programmers, often because they simply wanted to make their
+hardware usable on their favorite operating system.
+
+These programmers most of the time will happily fix problems other people
+report. But nobody can force them to do so, as they are contributing voluntarily.
+
+Then there are situations where such developers really want to fix an issue,
+but can't: sometimes they lack hardware programming documentation to do so.
+This often happens when the publicly available docs are superficial or the
+driver was written with the help of reverse engineering.
+
+Sooner or later spare time developers will also stop caring for the driver.
+Maybe their test hardware broke, got replaced by something more fancy, or is so
+old that it's something you don't find much outside of computer museums
+anymore. Sometimes developers stop caring for their code and Linux altogether, as
+something different in their life became way more important. In some cases
+nobody is willing to take over the job as maintainer – and nobody can be forced
+to, as contributing to the Linux kernel is done on a voluntary basis. Abandoned
+drivers nevertheless remain in the kernel: they are still useful for people and
+removing them would be a regression.
+
+The situation is not that different with developers that are paid for their
+work on the Linux kernel. Those contribute most changes these days. But their
+employers sooner or later also stop caring for their code or make its
+programmers focus on other things. Hardware vendors for example earn their money
+mainly by selling new hardware; quite a few of them hence are not investing
+much time and energy in maintaining a Linux kernel driver for something they
+stopped selling years ago. Enterprise Linux distributors often care for a
+longer time period, but in new versions often leave support for old and rare
+hardware aside to limit the scope. Often spare time contributors take over once
+a company orphans some code, but as mentioned above: sooner or later they will
+leave the code behind, too.
+
+Priorities are another reason why some issues are not fixed, as maintainers
+quite often are forced to set those, as time to work on Linux is limited.
+That's true for spare time or the time employers grant their developers to
+spend on maintenance work on the upstream kernel. Sometimes maintainers also
+get overwhelmed with reports, even if a driver is working nearly perfectly. To
+not get completely stuck, the programmer thus might have no other choice than
+to prioritize issue reports and reject some of them.
+
+But don't worry too much about all of this: a lot of drivers have active
+maintainers who are quite interested in fixing as many issues as possible.
+
+
+Closing words
+=============
+
+Compared with other Free/Libre & Open Source Software it's hard to report
+issues to the Linux kernel developers: the length and complexity of this
+document and the implications between the lines illustrate that. But that's how
+it is for now. The main author of this text hopes documenting the state of the
+art will lay some groundwork to improve the situation over time.
+
+
+..
+ end-of-content
+..
+ This document is maintained by Thorsten Leemhuis <linux@leemhuis.info>. If
+ you spot a typo or small mistake, feel free to let him know directly and
+ he'll fix it. You are free to do the same in a mostly informal way if you
+ want to contribute changes to the text, but for copyright reasons please CC
+ linux-doc@vger.kernel.org and "sign-off" your contribution as
+ Documentation/process/submitting-patches.rst outlines in the section "Sign
+ your work - the Developer's Certificate of Origin".
+..
+ This text is available under GPL-2.0+ or CC-BY-4.0, as stated at the top
+ of the file. If you want to distribute this text under CC-BY-4.0 only,
+ please use "The Linux kernel developers" for author attribution and link
+ this as source:
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/admin-guide/reporting-issues.rst
+..
+ Note: Only the content of this RST file as found in the Linux kernel sources
+ is available under CC-BY-4.0, as versions of this text that were processed
+ (for example by the kernel's build system) might contain content taken from
+ files which use a more restrictive license.
diff --git a/Documentation/admin-guide/reporting-regressions.rst b/Documentation/admin-guide/reporting-regressions.rst
new file mode 100644
index 000000000000..d8adccdae23f
--- /dev/null
+++ b/Documentation/admin-guide/reporting-regressions.rst
@@ -0,0 +1,451 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
+.. [see the bottom of this file for redistribution information]
+
+Reporting regressions
++++++++++++++++++++++
+
+"*We don't cause regressions*" is the first rule of Linux kernel development;
+Linux founder and lead developer Linus Torvalds established it himself and
+ensures it's obeyed.
+
+This document describes what the rule means for users and how the Linux kernel's
+development model ensures all reported regressions are addressed; aspects relevant
+for kernel developers are left to Documentation/process/handling-regressions.rst.
+
+
+The important bits (aka "TL;DR")
+================================
+
+#. It's a regression if something running fine with one Linux kernel works worse
+ or not at all with a newer version. Note, the newer kernel has to be compiled
+ using a similar configuration; the detailed explanations below describe this
+ and other fine print.
+
+#. Report your issue as outlined in Documentation/admin-guide/reporting-issues.rst;
+ it already covers all aspects important for regressions, which are repeated
+ below for convenience. Two of them are especially important: start your report's subject
+ with "[REGRESSION]" and CC or forward it to `the regression mailing list
+ <https://lore.kernel.org/regressions/>`_ (regressions@lists.linux.dev).
+
+#. Optional, but recommended: when sending or forwarding your report, make the
+ Linux kernel regression tracking bot "regzbot" track the issue by specifying
+ when the regression started like this::
+
+ #regzbot introduced: v5.13..v5.14-rc1
+
+
+All the details on Linux kernel regressions relevant for users
+==============================================================
+
+
+The important basics
+--------------------
+
+
+What is a "regression" and what is the "no regressions rule"?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It's a regression if some application or practical use case running fine with
+one Linux kernel works worse or not at all with a newer version compiled using a
+similar configuration. The "no regressions rule" forbids this to take place; if
+it happens by accident, developers that caused it are expected to quickly fix
+the issue.
+
+It thus is a regression when a WiFi driver from Linux 5.13 works fine, but with
+5.14 doesn't work at all, works significantly slower, or misbehaves somehow.
+It's also a regression if a perfectly working application suddenly shows erratic
+behavior with a newer kernel version; such issues can be caused by changes in
+procfs, sysfs, or one of the many other interfaces Linux provides to userland
+software. But keep in mind, as mentioned earlier: 5.14 in this example needs to
+be built from a configuration similar to the one from 5.13. This can be achieved
+using ``make olddefconfig``, as explained in more detail below.
+
+Note the "practical use case" in the first sentence of this section: developers
+despite the "no regressions" rule are free to change any aspect of the kernel
+and even APIs or ABIs to userland, as long as no existing application or use
+case breaks.
+
+Also be aware the "no regressions" rule covers only interfaces the kernel
+provides to the userland. It thus does not apply to kernel-internal interfaces
+like the module API, which some externally developed drivers use to hook into
+the kernel.
+
+How do I report a regression?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Just report the issue as outlined in
+Documentation/admin-guide/reporting-issues.rst, it already describes the
+important points. The following aspects outlined there are especially relevant
+for regressions:
+
+ * When checking for existing reports to join, also search the `archives of the
+ Linux regressions mailing list <https://lore.kernel.org/regressions/>`_ and
+ `regzbot's web-interface <https://linux-regtracking.leemhuis.info/regzbot/>`_.
+
+ * Start your report's subject with "[REGRESSION]".
+
+ * In your report, clearly mention the last kernel version that worked fine and
+ the first broken one. Ideally try to find the exact change causing the
+ regression using a bisection, as explained below in more detail.
+
+ * Remember to let the Linux regressions mailing list
+ (regressions@lists.linux.dev) know about your report:
+
+ * If you report the regression by mail, CC the regressions list.
+
+ * If you report your regression to some bug tracker, forward the submitted
+ report by mail to the regressions list while CCing the maintainer and the
+ mailing list for the subsystem in question.
+
+ If it's a regression within a stable or longterm series (e.g.
+ v5.15.3..v5.15.5), remember to CC the `Linux stable mailing list
+ <https://lore.kernel.org/stable/>`_ (stable@vger.kernel.org).
+
+ In case you performed a successful bisection, add everyone to the CC the
+ culprit's commit message mentions in lines starting with "Signed-off-by:".
+
+When CCing or forwarding your report to the list, consider directly telling the
+aforementioned Linux kernel regression tracking bot about your report. To do
+that, include a paragraph like this in your mail::
+
+ #regzbot introduced: v5.13..v5.14-rc1
+
+Regzbot will then consider your mail a report for a regression introduced in the
+specified version range. In the above case Linux v5.13 still worked fine and Linux
+v5.14-rc1 was the first version where you encountered the issue. If you
+performed a bisection to find the commit that caused the regression, specify the
+culprit's commit-id instead::
+
+ #regzbot introduced: 1f2e3d4c5d
+
+Placing such a "regzbot command" is in your interest, as it will ensure the
+report won't fall through the cracks unnoticed. If you omit this, the Linux
+kernel's regressions tracker will take care of telling regzbot about your
+regression, as long as you send a copy to the regressions mailing list. But the
+regression tracker is just one human who sometimes has to rest or occasionally
+might even enjoy some time away from computers (as crazy as that might sound).
+Relying on this person thus will result in an unnecessary delay before the
+regression is mentioned `on the list of tracked and unresolved Linux
+kernel regressions <https://linux-regtracking.leemhuis.info/regzbot/>`_ and the
+weekly regression reports sent by regzbot. Such delays can result in Linus
+Torvalds being unaware of important regressions when deciding between "continue
+development or call this finished and release the final?".
+
+Are really all regressions fixed?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Nearly all of them are, as long as the change causing the regression (the
+"culprit commit") is reliably identified. Some regressions can be fixed without
+this, but often it's required.
+
+Who needs to find the root cause of a regression?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Developers of the affected code area should try to locate the culprit on their
+own. But for them that's often impossible to do with reasonable effort, as quite
+a lot of issues only occur in a particular environment outside the developer's
+reach -- for example, a specific hardware platform, firmware, Linux distro,
+system's configuration, or application. That's why in the end it's often up to
+the reporter to locate the culprit commit; sometimes users might even need to
+run additional tests afterwards to pinpoint the exact root cause. Developers
+should offer advice and reasonably help where they can, to make this process
+relatively easy and achievable for typical users.
+
+How can I find the culprit?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Perform a bisection, as roughly outlined in
+Documentation/admin-guide/reporting-issues.rst and described in more detail by
+Documentation/admin-guide/bug-bisect.rst. It might sound like a lot of work, but
+in many cases finds the culprit relatively quickly. If it's hard or
+time-consuming to reliably reproduce the issue, consider teaming up with other
+affected users to narrow down the search range together.
+
+Who can I ask for advice when it comes to regressions?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Send a mail to the regressions mailing list (regressions@lists.linux.dev) while
+CCing the Linux kernel's regression tracker (regressions@leemhuis.info); if the
+issue might better be dealt with in private, feel free to omit the list.
+
+
+Additional details about regressions
+------------------------------------
+
+
+What is the goal of the "no regressions rule"?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Users should feel safe when updating kernel versions and not have to worry
+something might break. It's in the interest of the kernel developers to make
+updating attractive: they don't want users to stay on stable or longterm Linux
+series that are either abandoned or more than one and a half years old. That's
+in everybody's interest, as `those series might have known bugs, security
+issues, or other problematic aspects already fixed in later versions
+<http://www.kroah.com/log/blog/2018/08/24/what-stable-kernel-should-i-use/>`_.
+Additionally, the kernel developers want to make it simple and appealing for
+users to test the latest pre-release or regular release. That's also in
+everybody's interest, as it's a lot easier to track down and fix problems, if
+they are reported shortly after being introduced.
+
+Is the "no regressions" rule really adhered in practice?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It's taken really seriously, as can be seen by many mailing list posts from
+Linux creator and lead developer Linus Torvalds, some of which are quoted in
+Documentation/process/handling-regressions.rst.
+
+Exceptions to this rule are extremely rare; in the past developers almost always
+turned out to be wrong when they assumed a particular situation was warranting
+an exception.
+
+Who ensures the "no regressions" is actually followed?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The subsystem maintainers should take care of that; they are watched and
+supported by the tree maintainers -- e.g. Linus Torvalds for mainline and
+Greg Kroah-Hartman et al. for various stable/longterm series.
+
+All of them are helped by people trying to ensure no regression report falls
+through the cracks. One of them is Thorsten Leemhuis, who's currently acting as
+the Linux kernel's "regressions tracker"; to facilitate this work he relies on
+regzbot, the Linux kernel regression tracking bot. That's why you want to bring
+your report on the radar of these people by CCing or forwarding each report to
+the regressions mailing list, ideally with a "regzbot command" in your mail to
+get it tracked immediately.
+
+How quickly are regressions normally fixed?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Developers should fix any reported regression as quickly as possible, to provide
+affected users with a solution in a timely manner and prevent more users from
+running into the issue; nevertheless developers need to take enough time and
+care to ensure regression fixes do not cause additional damage.
+
+The answer thus depends on various factors like the impact of a regression, its
+age, or the Linux series in which it occurs. In the end though, most regressions
+should be fixed within two weeks.
+
+Is it a regression, if the issue can be avoided by updating some software?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Almost always: yes. If a developer tells you otherwise, ask the regression
+tracker for advice as outlined above.
+
+Is it a regression, if a newer kernel works slower or consumes more energy?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Yes, but the difference has to be significant. A five percent slow-down in a
+micro-benchmark thus is unlikely to qualify as regression, unless it also
+influences the results of a broad benchmark by more than one percent. If in
+doubt, ask for advice.
+
+Is it a regression, if an external kernel module breaks when updating Linux?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+No, as the "no regression" rule is about interfaces and services the Linux
+kernel provides to the userland. It thus does not cover building or running
+externally developed kernel modules, as they run in kernel-space and hook into
+the kernel using internal interfaces that are occasionally changed.
+
+How are regressions handled that are caused by security fixes?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In extremely rare situations security issues can't be fixed without causing
+regressions; such regressions are then accepted, as they are the lesser evil in
+the end. Luckily this dilemma can almost always be avoided, as key developers
+for the affected area and often Linus Torvalds himself try very hard to fix security
+issues without causing regressions.
+
+If you nevertheless face such a case, check the mailing list archives if people
+tried their best to avoid the regression. If not, report it; if in doubt, ask
+for advice as outlined above.
+
+What happens if fixing a regression is impossible without causing another?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sadly these things happen, but luckily not very often; if they occur, expert
+developers of the affected code area should look into the issue to find a fix
+that avoids regressions or at least minimizes their impact. If you run into such a
+situation, do what was outlined already for regressions caused by security
+fixes: check earlier discussions if people already tried their best and ask for
+advice if in doubt.
+
+A quick note while at it: these situations could be avoided, if people would
+regularly give mainline pre-releases (say v5.15-rc1 or -rc3) from each
+development cycle a test run. This is best explained by imagining a change
+integrated between Linux v5.14 and v5.15-rc1 which causes a regression, but at
+the same time is a hard requirement for some other improvement applied for
+5.15-rc1. All these changes often can simply be reverted and the regression thus
+solved, if someone finds and reports it before 5.15 is released. A few days or
+weeks later this solution can become impossible, as some software might have
+started to rely on aspects introduced by one of the follow-up changes: reverting
+all changes would then cause a regression for users of said software and thus is
+out of the question.
+
+Is it a regression, if some feature I relied on was removed months ago?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It is, but often it's hard to fix such regressions due to the aspects outlined
+in the previous section. It hence needs to be dealt with on a case-by-case
+basis. This is another reason why it's in everybody's interest to regularly test
+mainline pre-releases.
+
+Does the "no regression" rule apply if I seem to be the only affected person?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It does, but only for practical usage: the Linux developers want to be free to
+remove support for hardware that can only be found in attics and museums these days.
+
+Note, sometimes regressions can't be avoided to make progress -- and the latter
+is needed to prevent Linux from stagnation. Hence, if only very few users seem
+to be affected by a regression, it might for the greater good be in their and
+everyone else's interest to let things pass. Especially if there is an
+easy way to circumvent the regression somehow, for example by updating some
+software or using a kernel parameter created just for this purpose.
+
+Does the regression rule apply for code in the staging tree as well?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Not according to the `help text for the configuration option covering all
+staging code <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/Kconfig>`_,
+which since its early days states::
+
+ Please note that these drivers are under heavy development, may or
+ may not work, and may contain userspace interfaces that most likely
+ will be changed in the near future.
+
+The staging developers nevertheless often adhere to the "no regressions" rule,
+but sometimes bend it to make progress. That's for example why some users had to
+deal with (often negligible) regressions when a WiFi driver from the staging
+tree was replaced by a totally different one written from scratch.
+
+Why do later versions have to be "compiled with a similar configuration"?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Because the Linux kernel developers sometimes integrate changes known to cause
+regressions, but make them optional and disable them in the kernel's default
+configuration. This trick allows progress, as the "no regressions" rule
+otherwise would lead to stagnation.
+
+Consider for example a new security feature blocking access to some kernel
+interfaces often abused by malware, which at the same time are required to run a
+few rarely used applications. The outlined approach makes both camps happy:
+people using these applications can leave the new security feature off, while
+everyone else can enable it without running into trouble.
+
+How to create a configuration similar to the one of an older kernel?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Start your machine with a known-good kernel and configure the newer Linux
+version with ``make olddefconfig``. This makes the kernel's build scripts pick
+up the configuration file (the ".config" file) from the running kernel as base
+for the new one you are about to compile; afterwards they set all new
+configuration options to their default value, which should disable new features
+that might cause regressions.
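+
+If you prefer to make the base configuration explicit, you can copy the
+known-good kernel's configuration into the source tree yourself before running
+``make olddefconfig``. The sketch below assumes your distribution stores that
+file as ``/boot/config-$(uname -r)``, which is common but not universal;
+``seed_config`` is a hypothetical helper name.
+

```shell
# seed_config TREE [CONFIG]
#
# Copy CONFIG to TREE/.config. If CONFIG is omitted, fall back to the running
# kernel's configuration in /boot (an assumption that holds on many, but not
# all, distributions).
seed_config() {
    tree=$1
    config=${2:-/boot/config-$(uname -r)}
    if [ ! -r "$config" ]; then
        echo "seed_config: cannot read $config" >&2
        return 1
    fi
    cp -- "$config" "$tree/.config"
}

# Afterwards, in the source tree of the newer kernel:
#   make olddefconfig
```

+
+Doing the copy by hand has the side effect that you know exactly which file
+served as base, which can be worth mentioning in your report.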
+
+Can I report a regression I found with pre-compiled vanilla kernels?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You need to ensure the newer kernel was compiled with a similar configuration
+file as the older one (see above), as those that built them might have enabled
+some known-to-be incompatible feature for the newer kernel. If in doubt, report
+the matter to the kernel's provider and ask for advice.
+
+
+More about regression tracking with "regzbot"
+---------------------------------------------
+
+What is regression tracking and why should I care about it?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Rules like "no regressions" need someone to ensure they are followed, otherwise
+they are broken either accidentally or on purpose. History has shown this to be
+true for Linux kernel development as well. That's why Thorsten Leemhuis, the
+Linux kernel's regression tracker, and some other people try to ensure all
+regressions are fixed by keeping an eye on them until they are resolved. None of
+them is paid for this, which is why the work is done on a best-effort basis.
+
+Why and how are Linux kernel regressions tracked using a bot?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Tracking regressions completely manually has proven to be quite hard due to the
+distributed and loosely structured nature of the Linux kernel development
+process. That's why the Linux kernel's regression tracker developed regzbot to
+facilitate the work, with the long term goal to automate regression tracking as
+much as possible for everyone involved.
+
+Regzbot works by watching for replies to reports of tracked regressions.
+Additionally, it's looking out for posted or committed patches referencing such
+reports with "Link:" tags; replies to such patch postings are tracked as well.
+Combined, this data provides good insights into the current state of the fixing
+process.
+
+How to see which regressions regzbot tracks currently?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Check out `regzbot's web-interface <https://linux-regtracking.leemhuis.info/regzbot/>`_.
+
+What kind of issues are supposed to be tracked by regzbot?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The bot is meant to track regressions, hence please don't involve regzbot for
+regular issues. But it's okay for the Linux kernel's regression tracker if you
+involve regzbot to track severe issues, like reports about hangs, corrupted
+data, or internal errors (Panic, Oops, BUG(), warning, ...).
+
+How to change aspects of a tracked regression?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By using a 'regzbot command' in a direct or indirect reply to the mail with the
+report. The easiest way to do that: find the report in your "Sent" folder or the
+mailing list archive and reply to it using your mailer's "Reply-all" function.
+In that mail, use one of the following commands in a stand-alone paragraph (IOW:
+use blank lines to separate one or multiple of these commands from the rest of
+the mail's text).
+
+ * Update when the regression started to happen, for example after performing a
+ bisection::
+
+ #regzbot introduced: 1f2e3d4c5d
+
+ * Set or update the title::
+
+ #regzbot title: foo
+
+ * Monitor a discussion or bugzilla.kernel.org ticket where additional aspects of
+ the issue or a fix are discussed::
+
+ #regzbot monitor: https://lore.kernel.org/r/30th.anniversary.repost@klaava.Helsinki.FI/
+ #regzbot monitor: https://bugzilla.kernel.org/show_bug.cgi?id=123456789
+
+ * Point to a place with further details of interest, like a mailing list post
+ or a ticket in a bug tracker that are slightly related, but about a different
+ topic::
+
+ #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=123456789
+
+ * Mark a regression as invalid::
+
+ #regzbot invalid: wasn't a regression, problem has always existed
+
+Regzbot supports a few other commands primarily used by developers or people
+tracking regressions. They and more details about the aforementioned regzbot
+commands can be found in the `getting started guide
+<https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md>`_ and
+the `reference documentation <https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md>`_
+for regzbot.
+
+..
+ end-of-content
+..
+ This text is available under GPL-2.0+ or CC-BY-4.0, as stated at the top
+ of the file. If you want to distribute this text under CC-BY-4.0 only,
+ please use "The Linux kernel developers" for author attribution and link
+ this as source:
+ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/admin-guide/reporting-regressions.rst
+..
+ Note: Only the content of this RST file as found in the Linux kernel sources
+ is available under CC-BY-4.0, as versions of this text that were processed
+ (for example by the kernel's build system) might contain content taken from
+ files which use a more restrictive license.
diff --git a/Documentation/admin-guide/security-bugs.rst b/Documentation/admin-guide/security-bugs.rst
index dcd6c93c7aac..82e29837d589 100644
--- a/Documentation/admin-guide/security-bugs.rst
+++ b/Documentation/admin-guide/security-bugs.rst
@@ -21,11 +21,18 @@ understand and fix the security vulnerability.
As it is with any bug, the more information provided the easier it
will be to diagnose and fix. Please review the procedure outlined in
-admin-guide/reporting-bugs.rst if you are unclear about what
+'Documentation/admin-guide/reporting-issues.rst' if you are unclear about what
information is helpful. Any exploit code is very helpful and will not
be released without consent from the reporter unless it has already been
made public.
+Please send plain text emails without attachments where possible.
+It is much harder to have a context-quoted discussion about a complex
+issue if all the details are hidden away in attachments. Think of it like a
+:doc:`regular patch submission <../process/submitting-patches>`
+(even if you don't have a patch yet): describe the problem and impact, list
+reproduction steps, and follow it with a proposed fix, all in plain text.
+
Disclosure and embargoed information
------------------------------------
diff --git a/Documentation/admin-guide/serial-console.rst b/Documentation/admin-guide/serial-console.rst
index a8d1e36b627a..58b32832e50a 100644
--- a/Documentation/admin-guide/serial-console.rst
+++ b/Documentation/admin-guide/serial-console.rst
@@ -54,7 +54,7 @@ You will need to create a new device to use ``/dev/console``. The official
``/dev/console`` is now character device 5,1.
(You can also use a network device as a console. See
-``Documentation/networking/netconsole.txt`` for information on that.)
+``Documentation/networking/netconsole.rst`` for information on that.)
Here's an example that will use ``/dev/ttyS1`` (COM2) as the console.
Replace the sample values as needed.
diff --git a/Documentation/admin-guide/spkguide.txt b/Documentation/admin-guide/spkguide.txt
new file mode 100644
index 000000000000..1265c1eab31c
--- /dev/null
+++ b/Documentation/admin-guide/spkguide.txt
@@ -0,0 +1,1620 @@
+
+The Speakup User's Guide
+For Speakup 3.1.2 and Later
+By Gene Collins
+Updated by others
+Last modified on Mon Sep 27 14:26:31 2010
+Document version 1.3
+
+Copyright (c) 2005 Gene Collins
+Copyright (c) 2008 Samuel Thibault
+Copyright (c) 2009, 2010 the Speakup Team
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no
+Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
+copy of the license is included in the section entitled "GNU Free
+Documentation License".
+
+Preface
+
+The purpose of this document is to familiarize users with the user
+interface to Speakup, a Linux Screen Reader. If you need instructions
+for installing or obtaining Speakup, visit the web site at
+http://linux-speakup.org/. Speakup is a set of patches to the standard
+Linux kernel source tree. It can be built as a series of modules, or as
+a part of a monolithic kernel. These details are beyond the scope of
+this manual, but the user may need to be aware of the module
+capabilities, depending on how your system administrator has installed
+Speakup. If Speakup is built as a part of a monolithic kernel, and the
+user is using a hardware synthesizer, then Speakup will be able to
+provide speech access from the time the kernel is loaded, until the time
+the system is shut down. This means that if you have obtained Linux
+installation media for a distribution which includes Speakup as a part
+of its kernel, you will be able, as a blind person, to install Linux
+with speech access unaided by a sighted person. Again, these details
+are beyond the scope of this manual, but the user should be aware of
+them. See the web site mentioned above for further details.
+
+1. Starting Speakup
+
+If your system administrator has installed Speakup to work with your
+specific synthesizer by default, then all you need to do to use Speakup
+is to boot your system, and Speakup should come up talking. This
+assumes of course that your synthesizer is a supported hardware
+synthesizer, and that it is either installed in or connected to your
+system, and is, if necessary, powered on.
+
+It is possible, however, that Speakup may have been compiled into the
+kernel with no default synthesizer. It is even possible that your
+kernel has been compiled with support for some of the supported
+synthesizers and not others. If you find that this is the case, and
+your synthesizer is supported but not available, complain to the person
+who compiled and installed your kernel. Or better yet, go to the web
+site, and learn how to patch Speakup into your own kernel source, and
+build and install your own kernel.
+
+If your kernel has been compiled with Speakup, and has no default
+synthesizer set, or you would like to use a different synthesizer than
+the default one, then you may issue the following command at the boot
+prompt of your boot loader.
+
+linux speakup.synth=ltlk
+
+This command would tell Speakup to look for and use a LiteTalk or
+DoubleTalk LT at boot up. You may replace the ltlk synthesizer keyword
+with the keyword for whatever synthesizer you wish to use. The
+speakup.synth parameter will accept the following keywords, provided
+that support for the related synthesizers has been built into the
+kernel.
+
+acntsa -- Accent SA
+acntpc -- Accent PC
+apollo -- Apollo
+audptr -- Audapter
+bns -- Braille 'n Speak
+dectlk -- DecTalk Express (old and new, db9 serial only)
+decext -- DecTalk (old) External
+dtlk -- DoubleTalk PC
+keypc -- Keynote Gold PC
+ltlk -- DoubleTalk LT, LiteTalk, or external Tripletalk (db9 serial only)
+spkout -- Speak Out
+txprt -- Transport
+dummy -- Plain text terminal
+
+Note: Speakup does * NOT * support usb connections! Speakup also does *
+NOT * support the internal Tripletalk!
+
+Speakup does support two other synthesizers, but because they work in
+conjunction with other software, they must be loaded as modules after
+their related software is loaded, and so are not available at boot up.
+These are as follows:
+
+decpc -- DecTalk PC (not available at boot up)
+soft -- One of several software synthesizers (not available at boot up)
+
+See the sections on loading modules and software synthesizers later in
+this manual for further details. It should be noted here that the
+speakup.synth boot parameter will have no effect if Speakup has been
+compiled as modules. In order for Speakup modules to be loaded during
+the boot process, such action must be configured by your system
+administrator. This will mean that you will hear some, but not all, of
+the bootup messages.
+
+2. Basic operation
+
+Once you have booted the system, and if necessary, have supplied the
+proper bootup parameter for your synthesizer, Speakup will begin
+talking as soon as the kernel is loaded. In fact, it will talk a lot!
+It will speak all the boot up messages that the kernel prints on the
+screen during the boot process. This is because Speakup is not a
+separate screen reader, but is actually built into the operating
+system. Since almost all console applications must print text on the
+screen using the kernel, and must get their keyboard input through the
+kernel, they are automatically handled properly by Speakup. There are a
+few exceptions, but we'll come to those later.
+
+Note: In this guide I will refer to the numeric keypad as the keypad.
+This is done because the speakupmap.map file referred to later in this
+manual uses the term keypad instead of numeric keypad. Also I'm lazy
+and would rather only type one word. So keypad it is. Got it? Good.
+
+Most of the Speakup review keys are located on the keypad at the far
+right of the keyboard. The numlock key should be off, in order for these
+to work. If you toggle the numlock on, the keypad will produce numbers,
+which is exactly what you want for spreadsheets and such. For the
+purposes of this guide, you should have the numlock turned off, which is
+its default state at bootup.
+
+You probably won't want to listen to all the bootup messages every time
+you start your system, though it's a good idea to listen to them at
+least once, just so you'll know what kind of information is available to
+you during the boot process. You can always review these messages after
+bootup with the command:
+
+dmesg | more
+
+In order to speed the boot process, and to silence the speaking of the
+bootup messages, just press the keypad enter key. This key is located
+in the bottom right corner of the keypad. Speakup will shut up and stay
+that way, until you press another key.
+
+You can check to see if the boot process has completed by pressing the 8
+key on the keypad, which reads the current line. This also has the
+effect of starting Speakup talking again, so you can press keypad enter
+to silence it again if the boot process has not completed.
+
+When the boot process is complete, you will arrive at a "login" prompt.
+At this point, you'll need to type in your user id and password, as
+provided by your system administrator. You will hear Speakup speak the
+letters of your user id as you type it, but not the password. This is
+because the password is not displayed on the screen for security
+reasons. This has nothing to do with Speakup; it's a Linux security
+feature.
+
+Once you've logged in, you can run any Linux command or program which is
+allowed by your user id. Normal users will not be able to run programs
+which require root privileges.
+
+When you are running a program or command, Speakup will automatically
+speak new text as it arrives on the screen. You can at any time silence
+the speech with keypad enter, or use any of the Speakup review keys.
+
+Here are some basic Speakup review keys, and a short description of what
+they do.
+
+keypad 1 -- read previous character
+keypad 2 -- read current character (pressing keypad 2 twice rapidly will speak
+ the current character phonetically)
+keypad 3 -- read next character
+keypad 4 -- read previous word
+keypad 5 -- read current word (press twice rapidly to spell the current word)
+keypad 6 -- read next word
+keypad 7 -- read previous line
+keypad 8 -- read current line (press twice rapidly to hear how much the
+ text on the current line is indented)
+keypad 9 -- read next line
+keypad period -- speak current cursor position and announce current
+ virtual console
+
+It's also worth noting that the insert key on the keypad is mapped
+as the speakup key. Instead of pressing and releasing this key, as you
+do under DOS or Windows, you hold it like a shift key, and press other
+keys in combination with it. For example, holding keypad insert, from
+now on called speakup, and pressing keypad enter repeatedly will toggle
+the speaking of new text on the screen on and off. This is not the same as
+just pressing keypad enter by itself, which just silences the speech
+until you hit another key. When you hit speakup plus keypad enter,
+Speakup will say, "You turned me off.", or "Hey, that's better." When
+Speakup is turned off, no new text on the screen will be spoken. You
+can still use the reading controls to review the screen however.
+
+3. Using the Speakup Help System
+
+In order to enter the Speakup help system, press and hold the speakup
+key (remember that this is the keypad insert key), and press the f1 key.
+You will hear the message:
+
+"Press space to leave help, cursor up or down to scroll, or a letter to
+go to commands in list."
+
+When you press the spacebar to leave the help system, you will hear:
+
+"Leaving help."
+
+While you are in the Speakup help system, you can scroll up or down
+through the list of available commands using the cursor keys. The list
+of commands is arranged in alphabetical order. If you wish to jump to
+commands in a specific part of the alphabet, you may press the letter of
+the alphabet you wish to jump to.
+
+You can also just explore by typing keyboard keys. Pressing keys will
+cause Speakup to speak the command associated with that key. For
+example, if you press the keypad 8 key, you will hear:
+
+"Keypad 8 is line, say current."
+
+You'll notice that some commands do not have keys assigned to them.
+This is because they are very infrequently used commands, and are also
+accessible through the sys system. We'll discuss the sys system later
+in this manual.
+
+You'll also notice that some commands have two keys assigned to them.
+This is because Speakup has a built in set of alternative key bindings
+for laptop users. The alternate speakup key is the caps lock key. You
+can press and hold the caps lock key, while pressing an alternate
+speakup command key to activate the command. On most laptops, the
+numeric keypad is defined as the keys in the j k l area of the keyboard.
+
+There is usually a function key which turns this keypad function on and
+off, and some other key which controls the numlock state. Toggling the
+keypad functionality on and off can become a royal pain. So, Speakup
+gives you a simple way to get at an alternative set of key mappings for
+your laptop. These are also available by default on desktop systems,
+because Speakup does not know whether it is running on a desktop or
+laptop. So you may choose which set of Speakup keys to use. Some
+system administrators may have chosen to compile Speakup for a desktop
+system without this set of alternate key bindings, but these details are
+beyond the scope of this manual. To use the caps lock for its normal
+purpose, hold the shift key while toggling the caps lock on and off. We
+should note here, that holding the caps lock key and pressing the z key
+will toggle the alternate j k l keypad on and off.
+
+4. Keys and Their Assigned Commands
+
+In this section, we'll go through a list of all the speakup keys and
+commands. You can also get a list of commands and assigned keys from
+the help system.
+
+The following list was taken from the speakupmap.map file. Key
+assignments are on the left of the equal sign, and the associated
+Speakup commands are on the right. The designation "spk" means to press
+and hold the speakup key, a.k.a. keypad insert, a.k.a. caps lock, while
+pressing the other specified key.
+
+spk key_f9 = punc_level_dec
+spk key_f10 = punc_level_inc
+spk key_f11 = reading_punc_dec
+spk key_f12 = reading_punc_inc
+spk key_1 = vol_dec
+spk key_2 = vol_inc
+spk key_3 = pitch_dec
+spk key_4 = pitch_inc
+spk key_5 = rate_dec
+spk key_6 = rate_inc
+key_kpasterisk = toggle_cursoring
+spk key_kpasterisk = speakup_goto
+spk key_f1 = speakup_help
+spk key_f2 = set_win
+spk key_f3 = clear_win
+spk key_f4 = enable_win
+spk key_f5 = edit_some
+spk key_f6 = edit_most
+spk key_f7 = edit_delim
+spk key_f8 = edit_repeat
+shift spk key_f9 = edit_exnum
+ key_kp7 = say_prev_line
+spk key_kp7 = left_edge
+ key_kp8 = say_line
+double key_kp8 = say_line_indent
+spk key_kp8 = say_from_top
+ key_kp9 = say_next_line
+spk key_kp9 = top_edge
+ key_kpminus = speakup_parked
+spk key_kpminus = say_char_num
+ key_kp4 = say_prev_word
+spk key_kp4 = say_from_left
+ key_kp5 = say_word
+double key_kp5 = spell_word
+spk key_kp5 = spell_phonetic
+ key_kp6 = say_next_word
+spk key_kp6 = say_to_right
+ key_kpplus = say_screen
+spk key_kpplus = say_win
+ key_kp1 = say_prev_char
+spk key_kp1 = right_edge
+ key_kp2 = say_char
+spk key_kp2 = say_to_bottom
+double key_kp2 = say_phonetic_char
+ key_kp3 = say_next_char
+spk key_kp3 = bottom_edge
+ key_kp0 = spk_key
+ key_kpdot = say_position
+spk key_kpdot = say_attributes
+key_kpenter = speakup_quiet
+spk key_kpenter = speakup_off
+key_sysrq = speech_kill
+ key_kpslash = speakup_cut
+spk key_kpslash = speakup_paste
+spk key_pageup = say_first_char
+spk key_pagedown = say_last_char
+key_capslock = spk_key
+ spk key_z = spk_lock
+key_leftmeta = spk_key
+ctrl spk key_0 = speakup_goto
+spk key_u = say_prev_line
+spk key_i = say_line
+double spk key_i = say_line_indent
+spk key_o = say_next_line
+spk key_minus = speakup_parked
+shift spk key_minus = say_char_num
+spk key_j = say_prev_word
+spk key_k = say_word
+double spk key_k = spell_word
+spk key_l = say_next_word
+spk key_m = say_prev_char
+spk key_comma = say_char
+double spk key_comma = say_phonetic_char
+spk key_dot = say_next_char
+spk key_n = say_position
+ ctrl spk key_m = left_edge
+ ctrl spk key_y = top_edge
+ ctrl spk key_dot = right_edge
+ctrl spk key_p = bottom_edge
+spk key_apostrophe = say_screen
+spk key_h = say_from_left
+spk key_y = say_from_top
+spk key_semicolon = say_to_right
+spk key_p = say_to_bottom
+spk key_slash = say_attributes
+ spk key_enter = speakup_quiet
+ ctrl spk key_enter = speakup_off
+ spk key_9 = speakup_cut
+spk key_8 = speakup_paste
+shift spk key_m = say_first_char
+ ctrl spk key_semicolon = say_last_char
+spk key_r = read_all_doc
+
+5. The Speakup Sys System
+
+The Speakup screen reader also creates a speakup subdirectory as a part
+of the sys system.
+
+As a convenience, run as root
+
+ln -s /sys/accessibility/speakup /speakup
+
+to directly access speakup parameters from /speakup.
+You can see these entries by typing the command:
+
+ls -1 /speakup/*
+
+If you issue the above ls command, you will get back something like
+this:
+
+/speakup/attrib_bleep
+/speakup/bell_pos
+/speakup/bleep_time
+/speakup/bleeps
+/speakup/cursor_time
+/speakup/delimiters
+/speakup/ex_num
+/speakup/key_echo
+/speakup/keymap
+/speakup/no_interrupt
+/speakup/punc_all
+/speakup/punc_level
+/speakup/punc_most
+/speakup/punc_some
+/speakup/reading_punc
+/speakup/repeats
+/speakup/say_control
+/speakup/say_word_ctl
+/speakup/silent
+/speakup/spell_delay
+/speakup/synth
+/speakup/synth_direct
+/speakup/version
+
+/speakup/i18n:
+announcements
+characters
+chartab
+colors
+ctl_keys
+formatted
+function_names
+key_names
+states
+
+/speakup/soft:
+caps_start
+caps_stop
+delay_time
+direct
+freq
+full_time
+jiffy_delta
+pitch
+inflection
+punct
+rate
+tone
+trigger_time
+voice
+vol
+
+Notice the two subdirectories of /speakup: /speakup/i18n and
+/speakup/soft.
+The i18n subdirectory is described in a later section.
+The files under /speakup/soft represent settings that are specific to the
+driver for the software synthesizer. If you use the LiteTalk, your
+synthesizer-specific settings would be found in /speakup/ltlk. In other words,
+a subdirectory named /speakup/KWD is created to hold parameters specific
+to the device whose keyword is KWD.
+These parameters include volume, rate, pitch, and others.
+
+In addition to using the Speakup hot keys to change such things as
+volume, pitch, and rate, you can also echo values to the appropriate
+entry in the /speakup directory. This is very useful, since it
+lets you control Speakup parameters from within a script. How you
+would write such scripts is somewhat beyond the scope of this manual,
+but I will include a couple of simple examples here to give you a
+general idea of what such scripts can do.
+
+Suppose for example, that you wanted to control both the punctuation
+level and the reading punctuation level at the same time. For
+simplicity, we'll call them punc0, punc1, punc2, and punc3. The scripts
+might look something like this:
+
+#!/bin/bash
+# punc0
+# set punc and reading punc levels to 0
+echo 0 >/speakup/punc_level
+echo 0 >/speakup/reading_punc
+echo Punctuation level set to 0.
+
+#!/bin/bash
+# punc1
+# set punc and reading punc levels to 1
+echo 1 >/speakup/punc_level
+echo 1 >/speakup/reading_punc
+echo Punctuation level set to 1.
+
+#!/bin/bash
+# punc2
+# set punc and reading punc levels to 2
+echo 2 >/speakup/punc_level
+echo 2 >/speakup/reading_punc
+echo Punctuation level set to 2.
+
+#!/bin/bash
+# punc3
+# set punc and reading punc levels to 3
+echo 3 >/speakup/punc_level
+echo 3 >/speakup/reading_punc
+echo Punctuation level set to 3.
+
+If you were to store these four small scripts in a directory in your
+path, perhaps /usr/local/bin, and set the permissions to 755 with the
+chmod command, then you could change the default reading punc and
+punctuation levels at the same time by issuing just one command. For
+example, if you were to execute the punc3 command at your shell prompt,
+then the reading punc and punc level would both get set to 3.
+
+I should note that the above scripts were written to work with bash, but
+regardless of which shell you use, you should be able to do something
+similar.
+
+The Speakup sys system also has another interesting use. You can echo
+Speakup parameters into the sys system in a script during system
+startup, and speakup will return to your preferred parameters every time
+the system is rebooted.
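+
+A minimal sketch of such a startup script follows. The function name
+restore_speakup, the SPEAKUP_DIR variable, and the values written are
+assumptions made for illustration. The /speakup link from earlier in
+this section is assumed to exist, and how the script gets run at startup
+depends on your distribution.
+
```shell
#!/bin/bash
# restore_speakup: write preferred values back into the Speakup sys system.
# SPEAKUP_DIR is an assumption of this sketch; it defaults to /speakup.
SPEAKUP_DIR=${SPEAKUP_DIR:-/speakup}

restore_speakup() {
    local entry value
    # Each line of the list below is "entry value"; the values are examples.
    while read -r entry value; do
        echo "$value" >"$SPEAKUP_DIR/$entry" || return 1
    done <<EOF
punc_level 1
reading_punc 1
EOF
}
```
+
+Calling restore_speakup as the last line of the script applies the list.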
+
+Most of the Speakup sys parameters can be manipulated by a regular user
+on the system. However, there are a few parameters that are dangerous
+enough that they should only be manipulated by the root user on your
+system. There are even some parameters that are read only, and cannot
+be written to at all. For example, the version entry in the Speakup
+sys system is read only. This is because there is no reason for a user
+to tamper with the version number which is reported by Speakup. Doing
+an ls -l on /speakup/version will return this:
+
+-r--r--r-- 1 root root 0 Mar 21 13:46 /speakup/version
+
+As you can see, the version entry in the Speakup sys system is read
+only, is owned by root, and belongs to the root group. Doing a cat of
+/speakup/version will display the Speakup version number, like
+this:
+
+cat /speakup/version
+Speakup v-2.00 CVS: Thu Oct 21 10:38:21 EDT 2004
+synth dtlk version 1.1
+
+The display shows the Speakup version number, along with the version
+number of the driver for the current synthesizer.
+
+Looking at entries in the Speakup sys system can be useful in many
+ways. For example, you might wish to know what level your volume is set
+at. You could type:
+
+cat /speakup/KWD/vol
+# Replace KWD with the keyword for your synthesizer, E.G., ltlk for LiteTalk.
+5
+
+The number five which comes back is the level at which the synthesizer
+volume is set.
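+
+To look at several synthesizer settings at once, a small loop over the
+entries can help. This is only a sketch: show_synth_settings and
+SPEAKUP_DIR are names invented here, and the entries vol, pitch, and
+rate are taken from the listing earlier in this section.
+
```shell
#!/bin/bash
# SPEAKUP_DIR is an assumption of this sketch; it defaults to /speakup.
SPEAKUP_DIR=${SPEAKUP_DIR:-/speakup}

# show_synth_settings KWD: print vol, pitch, and rate for synthesizer KWD
show_synth_settings() {
    local kwd=$1 name
    for name in vol pitch rate; do
        printf '%s: %s\n' "$name" "$(cat "$SPEAKUP_DIR/$kwd/$name")"
    done
}
```
+
+With the LiteTalk, for example, you would call show_synth_settings ltlk.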
+
+All the entries in the Speakup sys system are readable, some are
+writable by root only, and some are writable by everyone. Unless you
+know what you are doing, you should probably leave the ones that are
+writable by root only alone. Most of the names are self explanatory.
+Vol for controlling volume, pitch for pitch, inflection for pitch range, rate
+for controlling speaking rate, etc. If you find one you aren't sure about, you
+can post a query on the Speakup list.
+
+6. Changing Synthesizers
+
+It is possible to change to a different synthesizer while speakup is
+running. In other words, it is not necessary to reboot the system
+in order to use a different synthesizer. You can simply echo the
+synthesizer keyword to the /speakup/synth sys entry.
+Depending on your situation, you may wish to echo none to the synth
+sys entry, to disable speech while one synthesizer is disconnected and
+a second one is connected in its place. Then echo the keyword for the
+new synthesizer into the synth sys entry in order to start speech
+with the newly connected synthesizer. See the list of synthesizer
+keywords in section 1 to find the keyword which matches your synth.
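+
+The sequence just described can be sketched as a small shell function.
+The name switch_synth and the SPEAKUP_DIR variable are invented for this
+example. The function waits for the enter key while the hardware is
+swapped.
+
```shell
#!/bin/bash
# SPEAKUP_DIR is an assumption of this sketch; it defaults to /speakup.
SPEAKUP_DIR=${SPEAKUP_DIR:-/speakup}

# switch_synth KWD: disable speech, wait for enter, then select synth KWD
switch_synth() {
    local kwd=$1 reply
    echo none >"$SPEAKUP_DIR/synth" || return 1
    # Speech is off now: swap the synthesizers, then press enter.
    read -r reply || true
    echo "$kwd" >"$SPEAKUP_DIR/synth"
}
```
+
+For example, switch_synth ltlk would end with the DoubleTalk LT driver
+selected.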
+
+7. Loading modules
+
+As mentioned earlier, Speakup can either be completely compiled into the
+kernel, with the exception of the help module, or it can be compiled as
+a series of modules. When compiled as modules, Speakup will only be
+able to speak some of the bootup messages if your system administrator
+has configured the system to load the modules at boot time. The modules
+can be loaded after the file systems have been checked and mounted, or
+from an initrd. There is a third possibility. Speakup can be compiled
+with some components built into the kernel, and others as modules. As
+we'll see in the next section, this is particularly useful when you are
+working with software synthesizers.
+
+If Speakup is completely compiled as modules, then you must use the
+modprobe command to load Speakup. You do this by loading the module for
+the synthesizer driver you wish to use. The driver modules are all
+named speakup_<keyword>, where <keyword> is the keyword for the
+synthesizer you want. So, in order to load the driver for the DecTalk
+Express, you would type the following command:
+
+modprobe speakup_dectlk
+
+Issuing this command would load the DecTalk Express driver and all other
+related Speakup modules necessary to get Speakup up and running.
+
+To completely unload Speakup, again presuming that it is entirely built
+as modules, you would give the command:
+
+modprobe -r speakup_dectlk
+
+The above command assumes you were running a DecTalk Express. If you
+were using a different synth, then you would substitute its keyword in
+place of dectlk.
+
+If you have multiple drivers loaded, you need to unload all of them, in
+order to completely unload Speakup.
+For example, if you have loaded both the dectlk and ltlk drivers, use the
+command:
+modprobe -r speakup_dectlk speakup_ltlk
+
+You cannot unload the driver for software synthesizers when a user-space
+daemon is using /dev/softsynth. First, kill the daemon. Next, remove
+the driver with the command:
+modprobe -r speakup_soft
+
+Now, suppose we have a situation where the main Speakup component
+is built into the kernel, and some or all of the drivers are built as
+modules. Since the main part of Speakup is compiled into the kernel, a
+partial Speakup sys system has been created which we can take advantage
+of by simply echoing the synthesizer keyword into the
+/speakup/synth sys entry. This will cause the kernel to
+automatically load the appropriate driver module, and start Speakup
+talking. To switch to another synth, just echo a new keyword to the
+synth sys entry. For example, to load the DoubleTalk LT driver,
+you would type:
+
+echo ltlk >/speakup/synth
+
+You can use the modprobe -r command to unload driver modules, regardless
+of whether the main part of Speakup has been built into the kernel or
+not.
+
+8. Using Software Synthesizers
+
+Using a software synthesizer requires that some other software be
+installed and running on your system. For this reason, software
+synthesizers are not available for use at bootup, or during a system
+installation process.
+There are two freely-available solutions for software speech: Espeakup and
+Speech Dispatcher.
+These are described in subsections 8.1 and 8.2, respectively.
+
+During the rest of this section, we assume that speakup_soft is either
+built in to your kernel, or loaded as a module.
+
+If your system does not have udev installed, before you can use a
+software synthesizer, you must have created the /dev/softsynth device.
+If you have not already done so, issue the following commands as root:
+
+cd /dev
+mknod softsynth c 10 26
+
+While we are at it, we might just as well create the /dev/synth device,
+which can be used to let user space programs send information to your
+synthesizer. To create /dev/synth, change to the /dev directory, and
+issue the following command as root:
+
+mknod synth c 10 25
+
+of both.
+
+8.1. Espeakup
+
+Espeakup is a connector between Speakup and the eSpeak software synthesizer.
+Espeakup may already be available as a package for your distribution
+of Linux. If it is not packaged, you need to install it manually.
+You can find it in the contrib/ subdirectory of the Speakup sources.
+The filename is espeakup-$VERSION.tar.bz2, where $VERSION
+depends on the current release of Espeakup. The Speakup 3.1.2 source
+ships with version 0.71 of Espeakup.
+The README file included with the Espeakup sources describes the process
+of manual installation.
+
+Assuming that Espeakup is installed, either by the user or by the distributor,
+follow these steps to use it.
+
+Tell Speakup to use the "soft" driver:
+echo soft > /speakup/synth
+
+Finally, start the espeakup program. There are two ways to do it.
+Both require root privileges.
+
+If Espeakup was installed as a package for your Linux distribution,
+you probably have a distribution-specific script that controls the operation
+of the daemon. Look for a file named espeakup under /etc/init.d or
+/etc/rc.d. Execute the following command with root privileges:
+/etc/init.d/espeakup start
+Replace init.d with rc.d, if your distribution uses scripts located under
+/etc/rc.d.
+Your distribution will also have a procedure for starting daemons at
+boot-time, so it is possible to have software speech as soon as user-space
+daemons are started by the bootup scripts.
+These procedures are not described in this document.
+
+If you built Espeakup manually, the "make install" step placed the binary
+under /usr/bin.
+Run the following command as root:
+/usr/bin/espeakup
+Espeakup should start speaking.
+
+8.2. Speech Dispatcher
+
+For this option, you must have a package called
+Speech Dispatcher running on your system, and it must be configured to
+work with one of its supported software synthesizers.
+
+Two open source synthesizers you might use are Flite and Festival. You
+might also choose to purchase the Software DecTalk from Fonix Sales Inc.
+If you run a Google search for Fonix, you'll find their web site.
+
+You can obtain a copy of Speech Dispatcher from free(b)soft at
+http://www.freebsoft.org/. Follow the installation instructions that
+come with Speech Dispatcher in order to install and configure Speech
+Dispatcher. You can check out the web site for your Linux distribution
+in order to get a copy of either Flite or Festival. Your Linux
+distribution may also have a precompiled Speech Dispatcher package.
+
+Once you've installed, configured, and tested Speech Dispatcher with your
+chosen software synthesizer, you still need one more piece of software
+in order to make things work. You need a package called speechd-up.
+You get it from the free(b)soft web site mentioned above. After you've
+compiled and installed speechd-up, you are almost ready to begin using
+your software synthesizer.
+
+Now you can begin using your software synthesizer. In order to do so,
+echo the soft keyword to the synth sys entry like this:
+
+echo soft >/speakup/synth
+
+Next run the speechd_up command like this:
+
+speechd_up &
+
+Your synth should now start talking, and you should be able to adjust
+the pitch, rate, etc.
+
+9. Using The DecTalk PC Card
+
+The DecTalk PC card is an ISA card that is inserted into one of the ISA
+slots in your computer. It requires that the DecTalk PC software be
+installed on your computer, and that the software be loaded onto the
+DecTalk PC card before it can be used.
+
+You can get the dec_pc.tgz file from the linux-speakup.org site. The
+dec_pc.tgz file is in the ~ftp/pub/linux/speakup directory.
+
+After you have downloaded the dec_pc.tgz file, untar it in your home
+directory, and read the Readme file in the newly created dec_pc
+directory.
+
+The easiest way to get the software working is to copy the entire dec_pc
+directory into /usr/local/lib. To do this, su to root in your home
+directory, and issue the command:
+
+cp -r dec_pc /usr/local/lib
+
+You will need to copy the dtload command from the dec_pc directory to a
+directory in your path. Either /usr/bin or /usr/local/bin is a good
+choice.
+
+You can now run the dtload command in order to load the DecTalk PC
+software onto the card. After you have done this, echo the decpc
+keyword to the synth entry in the sys system like this:
+
+echo decpc >/speakup/synth
+
+Your DecTalk PC should start talking, and then you can adjust the pitch,
+rate, volume, voice, etc. The voice entry in the Speakup sys system
+will accept a number from 0 through 7 for the DecTalk PC synthesizer,
+which will give you access to some of the DecTalk voices.
+
+10. Using Cursor Tracking
+
+In Speakup version 2.0 and later, cursor tracking is turned on by
+default. This means that when you are using an editor, Speakup will
+automatically speak characters as you move left and right with the
+cursor keys, and lines as you move up and down with the cursor keys.
+This is the traditional sort of cursor tracking.
+Recent versions of Speakup provide two additional ways to control the
+text that is spoken when the cursor is moved:
+"highlight tracking" and "read window."
+They are described later in this section.
+Sometimes, these modes get in your way, so you can disable cursor tracking
+altogether.
+
+You may select among the various forms of cursor tracking using the keypad
+asterisk key.
+Each time you press this key, a new mode is selected, and Speakup speaks
+the name of the new mode. The names for the four possible states of cursor
+tracking are: "cursoring on", "highlight tracking", "read window",
+and "cursoring off." The keypad asterisk key moves through the list of
+modes in a circular fashion.
+
+If highlight tracking is enabled, Speakup tracks highlighted text,
+rather than the cursor itself. When you move the cursor with the arrow keys,
+Speakup speaks the currently highlighted information.
+This is useful when moving through various menus and dialog boxes.
+If cursor tracking isn't helping you while navigating a menu,
+try highlight tracking.
+
+With the "read window" variety of cursor tracking, you can limit the text
+that Speakup speaks by specifying a window of interest on the screen.
+See section 15 for a description of the process of defining windows.
+When you move the cursor via the arrow keys, Speakup only speaks
+the contents of the window. This is especially helpful when you are hearing
+superfluous speech. Consider the following example.
+
+Suppose that you are at a shell prompt. You use bash, and you want to
+explore your command history using the up and down arrow keys. If you
+have enabled cursor tracking, you will hear two pieces of information.
+Speakup speaks both your shell prompt and the current entry from the
+command history. You may not want to hear the prompt repeated
+each time you move, so you can silence it by specifying a window. Find
+the last line of text on the screen. Clear the current window by pressing
+the key combination speakup + f3. Use the review cursor to find the first
+character that follows your shell prompt. Press speakup + f2 twice, to
+define a one-line window. The boundaries of the window are the
+character following the shell prompt and the end of the line. Now, cycle
+through the cursor tracking modes using keypad asterisk, until Speakup
+says "read window." Move through your history using your arrow keys.
+You will notice that Speakup no longer speaks the redundant prompt.
+
+Some folks like to turn cursor tracking off while they are using the
+lynx web browser. You definitely want to turn cursor tracking off when
+you are using the alsamixer application. Otherwise, you won't be able
+to hear your mixer settings while you are using the arrow keys.
+
+11. Cut and Paste
+
+One of Speakup's more useful features is the ability to cut and paste
+text on the screen. This means that you can capture information from a
+program, and paste that captured text into a different place in the
+program, or into an entirely different program, which may even be
+running on a different console.
+
+For example, in this manual, we have made references to several web
+sites. It would be nice if you could cut and paste these urls into your
+web browser. Speakup does this quite nicely. Suppose you wanted to
+paste the following url into your browser:
+
+http://linux-speakup.org/
+
+Use the speakup review keys to position the reading cursor on the first
+character of the above url. When the reading cursor is in position,
+press the keypad slash key once. Speakup will say, "mark". Next,
+position the reading cursor on the rightmost character of the above
+url. Press the keypad slash key once again to actually cut the text
+from the screen. Speakup will say, "cut". Although we call this
+cutting, Speakup does not actually delete the cut text from the screen.
+It makes a copy of the text in a special buffer for later pasting.
+
+Now that you have the url cut from the screen, you can paste it into
+your browser, or even paste the url on a command line as an argument to
+your browser.
+
+Suppose you want to start lynx and go to the Speakup site.
+
+You can switch to a different console with the alt left and right
+arrows, or you can switch to a specific console by typing alt and a
+function key. These are not Speakup commands, just standard Linux
+console capabilities.
+
+Once you've changed to an appropriate console, and are at a shell prompt,
+type the word lynx, followed by a space. Now press and hold the speakup
+key, while you type the keypad slash character. The url will be pasted
+onto the command line, just as though you had typed it in. Press the
+enter key to execute the command.
+
+The paste buffer will continue to hold the cut information, until a new
+mark and cut operation is carried out. This means you can paste the cut
+information as many times as you like before doing another cut
+operation.
+
+You are not limited to cutting and pasting only one line on the screen.
+You can also cut and paste rectangular regions of the screen. Just
+position the reading cursor at the top left corner of the text to be
+cut, mark it with the keypad slash key, then position the reading cursor
+at the bottom right corner of the region to be cut, and cut it with the
+keypad slash key.
+
+12. Changing the Pronunciation of Characters
+
+Through the /speakup/i18n/characters sys entry, Speakup gives you the
+ability to change how Speakup pronounces a given character. You could,
+for example, change how some punctuation characters are spoken. You can
+even change how Speakup will pronounce certain letters.
+
+You may, for example, wish to change how Speakup pronounces the z
+character. The author of Speakup, Kirk Reiser, is Canadian, and thus
+believes that the z should be pronounced zed. If you are an American,
+you might wish to use the zee pronunciation instead of zed. You can
+change the pronunciation of both the upper and lower case z with the
+following two commands:
+
+echo 90 zee >/speakup/i18n/characters
+echo 122 zee >/speakup/i18n/characters
+
+Let's examine the parts of the two previous commands. They are issued
+at the shell prompt, and could be placed in a startup script.
+
+The word echo tells the shell that you want to have it display the
+string of characters that follow the word echo. If you were to just
+type:
+
+echo hello
+
+you would get the word hello printed on your screen as soon as you
+pressed the enter key. In this case, we are echoing strings that we
+want to be redirected into the sys system.
+
+The numbers 90 and 122 in the above echo commands are the ASCII numeric
+values for the upper and lower case z, the characters we wish to change.
+
+The string zee is the pronunciation that we want Speakup to use for the
+upper and lower case z.
+
+The > symbol redirects the output of the echo command to a file, just
+like in DOS, or at the Windows command prompt.
+
+And finally, /speakup/i18n/characters is the file entry in the sys system
+where we want the output to be directed. Speakup looks at the numeric
+value of the character we want to change, and inserts the pronunciation
+string into an internal table.
+
+You can look at the whole table with the following command:
+
+cat /speakup/i18n/characters
+
+Speakup will then print out the entire character pronunciation table. I
+won't display it here, but leave you to look at it at your convenience.
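+
+If you do not remember the numeric value of a character, the shell can
+work it out for you. The following is a minimal sketch; char_code and
+set_pronunciation are hypothetical helper names, not Speakup commands.
+It relies on the POSIX printf convention that an argument beginning with
+a quote is converted to the numeric value of the character that follows.
+

```shell
# Hypothetical helpers, not part of Speakup.
char_code() {
    # Print the numeric value of a single character.
    # The leading quote tells printf to output the character's code.
    printf '%d\n' "'$1"
}

set_pronunciation() {
    # $1 = character, $2 = pronunciation, $3 = characters file
    # (default: the /speakup/i18n/characters entry used in this guide)
    echo "$(char_code "$1") $2" > "${3:-/speakup/i18n/characters}"
}
```

+
+With these definitions, set_pronunciation z zee writes the same string
+as the second echo command shown earlier in this section.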
+
+13. Mapping Keys
+
+Speakup has the capability of allowing you to assign or "map" keys to
+internal Speakup commands. This section necessarily assumes you have a
+Linux kernel source tree installed, and that it has been patched and
+configured with Speakup. How you do this is beyond the scope of this
+manual. For this information, visit the Speakup web site at
+http://linux-speakup.org/. The reason you'll need the kernel source
+tree patched with Speakup is that the genmap utility you'll need for
+processing keymaps is in the
+/usr/src/linux-<version_number>/drivers/char/speakup directory. The
+<version_number> in the above directory path is the version number of
+the Linux source tree you are working with.
+
+So ok, you've gone off and gotten your kernel source tree, and patched
+and configured it. Now you can start manipulating keymaps.
+
+You can either use the
+/usr/src/linux-<version_number>/drivers/char/speakup/speakupmap.map file
+included with the Speakup source, or you can cut and paste the copy in
+section 4 into a separate file. If you use the one in the Speakup
+source tree, make sure you make a backup of it before you start making
+changes. You have been warned!
+
+Suppose that you want to swap the key assignments for the Speakup
+say_last_char and the Speakup say_first_char commands. The
+speakupmap.map lists the key mappings for these two commands as follows:
+
+spk key_pageup = say_first_char
+spk key_pagedown = say_last_char
+
+You can edit your copy of the speakupmap.map file and swap the command
+names on the right side of the = (equals) sign. You did make a backup,
+right? The new keymap lines would look like this:
+
+spk key_pageup = say_last_char
+spk key_pagedown = say_first_char
+
+After you edit your copy of the speakupmap.map file, save it under a new
+file name, perhaps newmap.map. Then exit your editor and return to the
+shell prompt.
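+
+The same swap can be done without an editor. The sketch below uses awk
+to exchange two command names on the right side of the equals sign and
+writes the result to a new file, leaving the original map untouched.
+The function name swap_commands is an assumption made for this example,
+and the sketch assumes the command name is the last word on its line.
+

```shell
# Hypothetical helper: copy a keymap, swapping two command names.
swap_commands() {
    # $1 = source map, $2 = output map, $3 and $4 = commands to swap
    awk -v a="$3" -v b="$4" '
        $NF == a { $NF = b; print; next }   # line ends with first command
        $NF == b { $NF = a; print; next }   # line ends with second command
        { print }                           # every other line is unchanged
    ' "$1" > "$2"
}
```

+
+For instance, swap_commands speakupmap.map newmap.map say_first_char
+say_last_char (typed on one line) produces a newmap.map with the two
+assignments exchanged.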
+
+You are now ready to load your keymap with your swapped key assignments.
+Assuming that you saved your new keymap as the file newmap.map, you
+would load your keymap into the sys system like this:
+
+/usr/src/linux-<version_number>/drivers/char/speakup/genmap newmap.map
+>/speakup/keymap
+
+Remember to substitute your kernel version number for the
+<version_number> in the above command. Also note that although the
+above command wrapped onto two lines in this document, you should type
+it all on one line.
+
+Your say first and say last characters should now be swapped. Pressing
+speakup pagedown should read you the first non-whitespace character on
+the line your reading cursor is in, and pressing speakup pageup should
+read you the last character on the line your reading cursor is in.
+
+You should note that these new mappings will only stay in effect until
+you reboot, or until you load another keymap.
+
+One final warning. If you try to load a partial map, you will quickly
+find that all the mappings you didn't include in your file got deleted
+from the working map. Be extremely careful, and always make a backup!
+You have been warned!
+
+14. Internationalizing Speakup
+
+Speakup indicates various conditions to the user by speaking messages.
+For instance, when you move to the left edge of the screen with the
+review keys, Speakup says, "left."
+Prior to version 3.1.0 of Speakup, all of these messages were in English,
+and they could not be changed. If you used a non-English synthesizer,
+you still heard English messages, such as "left" and "cursoring on."
+In version 3.1.0 or higher, one may load translations for the various
+messages via the /sys filesystem.
+
+The directory /speakup/i18n contains several collections of messages.
+Each group of messages is stored in its own file.
+The following section lists all of these files, along with a brief description
+of each.
+
+14.1. Files Under the i18n Subdirectory
+
+* announcements:
+This file contains various general announcements, most of which cannot
+be categorized. You will find messages such as "You killed Speakup",
+"I'm alive", "leaving help", "parked", "unparked", and others.
+You will also find the names of the screen edges and cursor tracking modes
+here.
+
+* characters:
+See section 12 for a description of this file.
+
+* chartab:
+See section 12. Unlike the rest of the files in the i18n subdirectory,
+this one does not contain messages to be spoken.
+
+* colors:
+When you use the "say attributes" function, Speakup says the name of the
+foreground and background colors. These names come from the i18n/colors
+file.
+
+* ctl_keys:
+Here, you will find names of control keys. These are used with Speakup's
+say_control feature.
+
+* formatted:
+This group of messages contains embedded formatting codes, to specify
+the type and width of displayed data. If you change these, you must
+preserve all of the formatting codes, and they must appear in the order
+used by the default messages.
+
+* function_names:
+Here, you will find a list of names for Speakup functions. These are used
+by the help system. For example, suppose that you have activated help mode,
+and you pressed keypad 3. Speakup says:
+"keypad 3 is character, say next."
+The message "character, say next" names a Speakup function, and it
+comes from this function_names file.
+
+* key_names:
+Again, key_names is used by Speakup's help system. In the previous
+example, Speakup said that you pressed "keypad 3."
+This name came from the key_names file.
+
+* states:
+This file contains names for key states.
+Again, these are part of the help system. For instance, if you had pressed
+speakup + keypad 3, you would hear:
+"speakup keypad 3 is go to bottom edge."
+The speakup key is depressed, so the name of the key state is speakup.
+This part of the message comes from the states collection.
+
+14.2. Changing language
+
+14.2.1. Loading Your Own Messages
+
+The files under the i18n subdirectory all follow the same format.
+They consist of lines, with one message per line.
+Each message is represented by a number, followed by the text of the message.
+The number is the position of the message in the given collection.
+For example, if you view the file /speakup/i18n/colors, you will see the
+following list:
+
+0 black
+1 blue
+2 green
+3 cyan
+4 red
+5 magenta
+6 yellow
+7 white
+8 grey
+
+You can change one message, or you can change a whole group.
+To load a whole collection of messages from a new source, simply use
+the cp command:
+cp ~/my_colors /speakup/i18n/colors
+You can change an individual message with the echo command,
+as shown in the following example.
+
+The Spanish name for the color blue is azul.
+Looking at the colors file, we see that the name "blue" is at position 1
+within the colors group. Let's change blue to azul:
+echo '1 azul' > /speakup/i18n/colors
+The next time that Speakup says message 1 from the colors group, it will
+say "azul", rather than "blue."
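+
+If you change individual messages often, a small wrapper can guard
+against typing mistakes. This is a sketch under stated assumptions:
+set_message is a hypothetical name, and the only check it performs is
+that the position is a non-negative whole number.
+

```shell
# Hypothetical helper, not part of Speakup.
set_message() {
    # $1 = group (e.g. colors), $2 = position, $3 = new text,
    # $4 = i18n directory (default: /speakup/i18n as used in this guide)
    case "$2" in
        ''|*[!0-9]*)
            echo "position must be a number: $2" >&2
            return 1
            ;;
    esac
    echo "$2 $3" > "${4:-/speakup/i18n}/$1"
}
```

+
+Running set_message colors 1 azul writes the same string as the echo
+command shown above.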
+
+14.2.2. Choose a language
+
+In the future, translations into various languages will be made available,
+and most users will just load the files necessary for their language. So far,
+French is the only language available besides the native Canadian English.
+
+French is only available after you are logged in.
+
+Canadian English is the default language. To switch to another language,
+download the source of Speakup and untar it in your home directory. The
+following command should let you do this:
+
+tar xvjf speakup-<version>.tar.bz2
+
+where <version> is the version number of the application.
+
+Next, change to the newly created directory, then into the tools/ directory, and
+run the script speakup_setlocale. You are asked which language you want to
+use. Type the code for your language (e.g. fr for French), then press
+Enter. The needed files are copied into the i18n directory.
+
+Note: speakupconf must be installed on your system so that settings are saved.
+Otherwise, you will get an error: your language will be loaded, but you will
+have to run the script again every time Speakup restarts.
+See section 16.1 for information about speakupconf.
+
+You will have to repeat these steps for any change of locale, i.e. if you wish
+to change Speakup's language or charset (iso-8859-15 or UTF-8).
+
+If you wish to store the settings, note that at your next login, you will need
+to run:
+
+speakupconf load
+
+Alternatively, you can add the above line to your file
+~/.bashrc or ~/.bash_profile.
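+
+When adding a line to a startup file, it is easy to add it twice by
+accident. The following sketch appends a line only if the file does not
+already contain it. The name add_line_once is hypothetical; the helper
+uses only grep and echo.
+

```shell
# Hypothetical helper: append a line to a file unless an identical
# line is already present.
add_line_once() {
    # $1 = file, $2 = line to add
    # -x matches whole lines, -F treats the line as a fixed string
    grep -qxF -- "$2" "$1" 2>/dev/null || echo "$2" >> "$1"
}
```

+
+For example, add_line_once ~/.bashrc 'speakupconf load' can be run any
+number of times and leaves a single copy of the line in ~/.bashrc.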
+
+If your system administrator ran the script himself, all users will be able
+to change from English to the language chosen by root by running
+speakupconf load directly (or by adding this to the ~/.bashrc or
+~/.bash_profile file). If there are several languages to handle, the
+administrator (or each user) will have to repeat the first steps, up to
+speakupconf save, choosing the appropriate language, in each user's home
+directory. Each user will then be able to run speakupconf load, and Speakup
+will load his own settings.
+
+14.3. No Support for Non-Western-European Languages
+
+As of the current release, Speakup only supports Western European languages.
+Support for the extended characters used by languages outside of the Western
+European family of languages is a work in progress.
+
+15. Using Speakup's Windowing Capability
+
+Speakup has the capability of defining and manipulating windows on the
+screen. Speakup uses the term "window" to mean a user defined area of
+the screen. The key strokes for defining and manipulating Speakup
+windows are as follows:
+
+speakup + f2 -- Set the bounds of the window.
+speakup + f3 -- Clear the current window definition.
+speakup + f4 -- Toggle window silence on and off.
+speakup + keypad plus -- Say the currently defined window.
+
+These capabilities are useful for tracking a certain part of the screen
+without rereading the whole screen, or for silencing a part of the
+screen that is constantly changing, such as a clock or status line.
+
+There is no way to save these window settings, and you can only have one
+window defined for each virtual console. There is also no way to have
+windows automatically defined for specific applications.
+
+In order to define a window, use the review keys to move your reading
+cursor to the beginning of the area you want to define. Then press
+speakup + f2. Speakup will tell you that the window starts at the
+indicated row and column position. Then move the reading cursor to the
+end of the area to be defined as a window, and press speakup + f2 again.
+If there is more than one line in the window, Speakup will tell you
+that the window ends at the indicated row and column position. If there
+is only one line in the window, then Speakup will tell you that the
+window is the specified line on the screen. If you are only defining a
+one line window, you can just press speakup + f2 twice after placing the
+reading cursor on the line you want to define as a window. It is not
+necessary to position the reading cursor at the end of the line in order
+to define the whole line as a window.
+
+16. Tools for Controlling Speakup
+
+The speakup distribution includes extra tools (in the tools directory)
+which were written to make speakup easier to use. This section will
+briefly describe the use of these tools.
+
+16.1. Speakupconf
+
+speakupconf began life as a contribution from Steve Holmes, a member of
+the speakup community. We would like to thank him for his work on the
+early versions of this project.
+
+This script may be installed as part of your Linux distribution, but if
+it isn't, the recommended places to put it are /usr/local/bin or
+/usr/bin. This script can be run by any user, so it does not require
+root privileges.
+
+Speakupconf allows you to save and load your Speakup settings. It works
+by reading and writing the /sys files described above.
+
+The directory that speakupconf uses to store your settings depends on
+whether it is run from the root account. If you execute speakupconf as
+root, it uses the directory /etc/speakup. Otherwise, it uses the directory
+~/.speakup, where ~ is your home directory.
+Anyone who needs to use Speakup from your console can load his own custom
+settings with this script.
+
+speakupconf takes one required argument: load or save.
+Use the command
+speakupconf save
+to save your Speakup settings, and
+speakupconf load
+to load them into Speakup.
+A second argument may be specified to use an alternate directory to
+load or save the speakup parameters.
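+
+The directory rule described above can be expressed in a few lines of
+shell. This is an illustration of the rule only, not the actual code of
+speakupconf, and the function name settings_dir is hypothetical.
+

```shell
# Hypothetical illustration of the directory rule described above.
settings_dir() {
    # $1 = numeric user id, $2 = home directory
    if [ "$1" -eq 0 ]; then
        echo "/etc/speakup"      # root uses the system-wide directory
    else
        echo "$2/.speakup"       # other users get a per-user directory
    fi
}
```

+
+Calling settings_dir "$(id -u)" "$HOME" prints the directory that would
+apply to the current user under this rule.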
+
+16.2. Talkwith
+
+Charles Hallenbeck, another member of the speakup community, wrote the
+initial versions of this script, and we would also like to thank him for
+his work on it.
+
+This script needs root privileges to run, so if it is not installed as
+part of your Linux distribution, the recommended places to install it
+are /usr/local/sbin or /usr/sbin.
+
+Talkwith allows you to switch synthesizers on the fly. It takes a synthesizer
+name as an argument. For instance,
+talkwith dectlk
+causes Speakup to use the DecTalk Express. If you wish to switch to a
+software synthesizer, you must also indicate which daemon you wish to
+use. There are two possible choices:
+spd and espeakup. spd is an abbreviation for speechd-up.
+If you wish to use espeakup for software synthesis, give the command
+talkwith soft espeakup
+To use speechd-up, type:
+talkwith soft spd
+Any arguments that follow the name of the daemon are passed to the daemon
+when it is invoked. For instance:
+talkwith soft espeakup --default-voice=fr
+causes espeakup to use the French voice.
+Note that talkwith must always be executed with root privileges.
+
+Talkwith does not attempt to load your settings after the new
+synthesizer is activated. You can use speakupconf to load your settings
+if desired.
+
+ GNU Free Documentation License
+ Version 1.2, November 2002
+
+
+ Copyright (C) 2000,2001,2002 Free Software Foundation, Inc.
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+
+0. PREAMBLE
+
+The purpose of this License is to make a manual, textbook, or other
+functional and useful document "free" in the sense of freedom: to
+assure everyone the effective freedom to copy and redistribute it,
+with or without modifying it, either commercially or noncommercially.
+Secondarily, this License preserves for the author and publisher a way
+to get credit for their work, while not being considered responsible
+for modifications made by others.
+
+This License is a kind of "copyleft", which means that derivative
+works of the document must themselves be free in the same sense. It
+complements the GNU General Public License, which is a copyleft
+license designed for free software.
+
+We have designed this License in order to use it for manuals for free
+software, because free software needs free documentation: a free
+program should come with manuals providing the same freedoms that the
+software does. But this License is not limited to software manuals;
+it can be used for any textual work, regardless of subject matter or
+whether it is published as a printed book. We recommend this License
+principally for works whose purpose is instruction or reference.
+
+
+1. APPLICABILITY AND DEFINITIONS
+
+This License applies to any manual or other work, in any medium, that
+contains a notice placed by the copyright holder saying it can be
+distributed under the terms of this License. Such a notice grants a
+world-wide, royalty-free license, unlimited in duration, to use that
+work under the conditions stated herein. The "Document", below,
+refers to any such manual or work. Any member of the public is a
+licensee, and is addressed as "you". You accept the license if you
+copy, modify or distribute the work in a way requiring permission
+under copyright law.
+
+A "Modified Version" of the Document means any work containing the
+Document or a portion of it, either copied verbatim, or with
+modifications and/or translated into another language.
+
+A "Secondary Section" is a named appendix or a front-matter section of
+the Document that deals exclusively with the relationship of the
+publishers or authors of the Document to the Document's overall subject
+(or to related matters) and contains nothing that could fall directly
+within that overall subject. (Thus, if the Document is in part a
+textbook of mathematics, a Secondary Section may not explain any
+mathematics.) The relationship could be a matter of historical
+connection with the subject or with related matters, or of legal,
+commercial, philosophical, ethical or political position regarding
+them.
+
+The "Invariant Sections" are certain Secondary Sections whose titles
+are designated, as being those of Invariant Sections, in the notice
+that says that the Document is released under this License. If a
+section does not fit the above definition of Secondary then it is not
+allowed to be designated as Invariant. The Document may contain zero
+Invariant Sections. If the Document does not identify any Invariant
+Sections then there are none.
+
+The "Cover Texts" are certain short passages of text that are listed,
+as Front-Cover Texts or Back-Cover Texts, in the notice that says that
+the Document is released under this License. A Front-Cover Text may
+be at most 5 words, and a Back-Cover Text may be at most 25 words.
+
+A "Transparent" copy of the Document means a machine-readable copy,
+represented in a format whose specification is available to the
+general public, that is suitable for revising the document
+straightforwardly with generic text editors or (for images composed of
+pixels) generic paint programs or (for drawings) some widely available
+drawing editor, and that is suitable for input to text formatters or
+for automatic translation to a variety of formats suitable for input
+to text formatters. A copy made in an otherwise Transparent file
+format whose markup, or absence of markup, has been arranged to thwart
+or discourage subsequent modification by readers is not Transparent.
+An image format is not Transparent if used for any substantial amount
+of text. A copy that is not "Transparent" is called "Opaque".
+
+Examples of suitable formats for Transparent copies include plain
+ASCII without markup, Texinfo input format, LaTeX input format, SGML
+or XML using a publicly available DTD, and standard-conforming simple
+HTML, PostScript or PDF designed for human modification. Examples of
+transparent image formats include PNG, XCF and JPG. Opaque formats
+include proprietary formats that can be read and edited only by
+proprietary word processors, SGML or XML for which the DTD and/or
+processing tools are not generally available, and the
+machine-generated HTML, PostScript or PDF produced by some word
+processors for output purposes only.
+
+The "Title Page" means, for a printed book, the title page itself,
+plus such following pages as are needed to hold, legibly, the material
+this License requires to appear in the title page. For works in
+formats which do not have any title page as such, "Title Page" means
+the text near the most prominent appearance of the work's title,
+preceding the beginning of the body of the text.
+
+A section "Entitled XYZ" means a named subunit of the Document whose
+title either is precisely XYZ or contains XYZ in parentheses following
+text that translates XYZ in another language. (Here XYZ stands for a
+specific section name mentioned below, such as "Acknowledgements",
+"Dedications", "Endorsements", or "History".) To "Preserve the Title"
+of such a section when you modify the Document means that it remains a
+section "Entitled XYZ" according to this definition.
+
+The Document may include Warranty Disclaimers next to the notice which
+states that this License applies to the Document. These Warranty
+Disclaimers are considered to be included by reference in this
+License, but only as regards disclaiming warranties: any other
+implication that these Warranty Disclaimers may have is void and has
+no effect on the meaning of this License.
+
+
+2. VERBATIM COPYING
+
+You may copy and distribute the Document in any medium, either
+commercially or noncommercially, provided that this License, the
+copyright notices, and the license notice saying this License applies
+to the Document are reproduced in all copies, and that you add no other
+conditions whatsoever to those of this License. You may not use
+technical measures to obstruct or control the reading or further
+copying of the copies you make or distribute. However, you may accept
+compensation in exchange for copies. If you distribute a large enough
+number of copies you must also follow the conditions in section 3.
+
+You may also lend copies, under the same conditions stated above, and
+you may publicly display copies.
+
+
+3. COPYING IN QUANTITY
+
+If you publish printed copies (or copies in media that commonly have
+printed covers) of the Document, numbering more than 100, and the
+Document's license notice requires Cover Texts, you must enclose the
+copies in covers that carry, clearly and legibly, all these Cover
+Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
+the back cover. Both covers must also clearly and legibly identify
+you as the publisher of these copies. The front cover must present
+the full title with all words of the title equally prominent and
+visible. You may add other material on the covers in addition.
+Copying with changes limited to the covers, as long as they preserve
+the title of the Document and satisfy these conditions, can be treated
+as verbatim copying in other respects.
+
+If the required texts for either cover are too voluminous to fit
+legibly, you should put the first ones listed (as many as fit
+reasonably) on the actual cover, and continue the rest onto adjacent
+pages.
+
+If you publish or distribute Opaque copies of the Document numbering
+more than 100, you must either include a machine-readable Transparent
+copy along with each Opaque copy, or state in or with each Opaque copy
+a computer-network location from which the general network-using
+public has access to download using public-standard network protocols
+a complete Transparent copy of the Document, free of added material.
+If you use the latter option, you must take reasonably prudent steps,
+when you begin distribution of Opaque copies in quantity, to ensure
+that this Transparent copy will remain thus accessible at the stated
+location until at least one year after the last time you distribute an
+Opaque copy (directly or through your agents or retailers) of that
+edition to the public.
+
+It is requested, but not required, that you contact the authors of the
+Document well before redistributing any large number of copies, to give
+them a chance to provide you with an updated version of the Document.
+
+
+4. MODIFICATIONS
+
+You may copy and distribute a Modified Version of the Document under
+the conditions of sections 2 and 3 above, provided that you release
+the Modified Version under precisely this License, with the Modified
+Version filling the role of the Document, thus licensing distribution
+and modification of the Modified Version to whoever possesses a copy
+of it. In addition, you must do these things in the Modified Version:
+
+A. Use in the Title Page (and on the covers, if any) a title distinct
+ from that of the Document, and from those of previous versions
+ (which should, if there were any, be listed in the History section
+ of the Document). You may use the same title as a previous version
+ if the original publisher of that version gives permission.
+B. List on the Title Page, as authors, one or more persons or entities
+ responsible for authorship of the modifications in the Modified
+ Version, together with at least five of the principal authors of the
+ Document (all of its principal authors, if it has fewer than five),
+ unless they release you from this requirement.
+C. State on the Title page the name of the publisher of the
+ Modified Version, as the publisher.
+D. Preserve all the copyright notices of the Document.
+E. Add an appropriate copyright notice for your modifications
+ adjacent to the other copyright notices.
+F. Include, immediately after the copyright notices, a license notice
+ giving the public permission to use the Modified Version under the
+ terms of this License, in the form shown in the Addendum below.
+G. Preserve in that license notice the full lists of Invariant Sections
+ and required Cover Texts given in the Document's license notice.
+H. Include an unaltered copy of this License.
+I. Preserve the section Entitled "History", Preserve its Title, and add
+ to it an item stating at least the title, year, new authors, and
+ publisher of the Modified Version as given on the Title Page. If
+ there is no section Entitled "History" in the Document, create one
+ stating the title, year, authors, and publisher of the Document as
+ given on its Title Page, then add an item describing the Modified
+ Version as stated in the previous sentence.
+J. Preserve the network location, if any, given in the Document for
+ public access to a Transparent copy of the Document, and likewise
+ the network locations given in the Document for previous versions
+ it was based on. These may be placed in the "History" section.
+ You may omit a network location for a work that was published at
+ least four years before the Document itself, or if the original
+ publisher of the version it refers to gives permission.
+K. For any section Entitled "Acknowledgements" or "Dedications",
+ Preserve the Title of the section, and preserve in the section all
+ the substance and tone of each of the contributor acknowledgements
+ and/or dedications given therein.
+L. Preserve all the Invariant Sections of the Document,
+ unaltered in their text and in their titles. Section numbers
+ or the equivalent are not considered part of the section titles.
+M. Delete any section Entitled "Endorsements". Such a section
+ may not be included in the Modified Version.
+N. Do not retitle any existing section to be Entitled "Endorsements"
+ or to conflict in title with any Invariant Section.
+O. Preserve any Warranty Disclaimers.
+
+If the Modified Version includes new front-matter sections or
+appendices that qualify as Secondary Sections and contain no material
+copied from the Document, you may at your option designate some or all
+of these sections as invariant. To do this, add their titles to the
+list of Invariant Sections in the Modified Version's license notice.
+These titles must be distinct from any other section titles.
+
+You may add a section Entitled "Endorsements", provided it contains
+nothing but endorsements of your Modified Version by various
+parties--for example, statements of peer review or that the text has
+been approved by an organization as the authoritative definition of a
+standard.
+
+You may add a passage of up to five words as a Front-Cover Text, and a
+passage of up to 25 words as a Back-Cover Text, to the end of the list
+of Cover Texts in the Modified Version. Only one passage of
+Front-Cover Text and one of Back-Cover Text may be added by (or
+through arrangements made by) any one entity. If the Document already
+includes a cover text for the same cover, previously added by you or
+by arrangement made by the same entity you are acting on behalf of,
+you may not add another; but you may replace the old one, on explicit
+permission from the previous publisher that added the old one.
+
+The author(s) and publisher(s) of the Document do not by this License
+give permission to use their names for publicity for or to assert or
+imply endorsement of any Modified Version.
+
+
+5. COMBINING DOCUMENTS
+
+You may combine the Document with other documents released under this
+License, under the terms defined in section 4 above for modified
+versions, provided that you include in the combination all of the
+Invariant Sections of all of the original documents, unmodified, and
+list them all as Invariant Sections of your combined work in its
+license notice, and that you preserve all their Warranty Disclaimers.
+
+The combined work need only contain one copy of this License, and
+multiple identical Invariant Sections may be replaced with a single
+copy. If there are multiple Invariant Sections with the same name but
+different contents, make the title of each such section unique by
+adding at the end of it, in parentheses, the name of the original
+author or publisher of that section if known, or else a unique number.
+Make the same adjustment to the section titles in the list of
+Invariant Sections in the license notice of the combined work.
+
+In the combination, you must combine any sections Entitled "History"
+in the various original documents, forming one section Entitled
+"History"; likewise combine any sections Entitled "Acknowledgements",
+and any sections Entitled "Dedications". You must delete all sections
+Entitled "Endorsements".
+
+
+6. COLLECTIONS OF DOCUMENTS
+
+You may make a collection consisting of the Document and other documents
+released under this License, and replace the individual copies of this
+License in the various documents with a single copy that is included in
+the collection, provided that you follow the rules of this License for
+verbatim copying of each of the documents in all other respects.
+
+You may extract a single document from such a collection, and distribute
+it individually under this License, provided you insert a copy of this
+License into the extracted document, and follow this License in all
+other respects regarding verbatim copying of that document.
+
+
+7. AGGREGATION WITH INDEPENDENT WORKS
+
+A compilation of the Document or its derivatives with other separate
+and independent documents or works, in or on a volume of a storage or
+distribution medium, is called an "aggregate" if the copyright
+resulting from the compilation is not used to limit the legal rights
+of the compilation's users beyond what the individual works permit.
+When the Document is included in an aggregate, this License does not
+apply to the other works in the aggregate which are not themselves
+derivative works of the Document.
+
+If the Cover Text requirement of section 3 is applicable to these
+copies of the Document, then if the Document is less than one half of
+the entire aggregate, the Document's Cover Texts may be placed on
+covers that bracket the Document within the aggregate, or the
+electronic equivalent of covers if the Document is in electronic form.
+Otherwise they must appear on printed covers that bracket the whole
+aggregate.
+
+
+8. TRANSLATION
+
+Translation is considered a kind of modification, so you may
+distribute translations of the Document under the terms of section 4.
+Replacing Invariant Sections with translations requires special
+permission from their copyright holders, but you may include
+translations of some or all Invariant Sections in addition to the
+original versions of these Invariant Sections. You may include a
+translation of this License, and all the license notices in the
+Document, and any Warranty Disclaimers, provided that you also include
+the original English version of this License and the original versions
+of those notices and disclaimers. In case of a disagreement between
+the translation and the original version of this License or a notice
+or disclaimer, the original version will prevail.
+
+If a section in the Document is Entitled "Acknowledgements",
+"Dedications", or "History", the requirement (section 4) to Preserve
+its Title (section 1) will typically require changing the actual
+title.
+
+
+9. TERMINATION
+
+You may not copy, modify, sublicense, or distribute the Document except
+as expressly provided for under this License. Any other attempt to
+copy, modify, sublicense or distribute the Document is void, and will
+automatically terminate your rights under this License. However,
+parties who have received copies, or rights, from you under this
+License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+
+10. FUTURE REVISIONS OF THIS LICENSE
+
+The Free Software Foundation may publish new, revised versions
+of the GNU Free Documentation License from time to time. Such new
+versions will be similar in spirit to the present version, but may
+differ in detail to address new problems or concerns. See
+https://www.gnu.org/copyleft/.
+
+Each version of the License is given a distinguishing version number.
+If the Document specifies that a particular numbered version of this
+License "or any later version" applies to it, you have the option of
+following the terms and conditions either of that specified version or
+of any later version that has been published (not as a draft) by the
+Free Software Foundation. If the Document does not specify a version
+number of this License, you may choose any version ever published (not
+as a draft) by the Free Software Foundation.
+
+
+ADDENDUM: How to use this License for your documents
+
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and
+license notices just after the title page:
+
+ Copyright (c) YEAR YOUR NAME.
+ Permission is granted to copy, distribute and/or modify this document
+ under the terms of the GNU Free Documentation License, Version 1.2
+ or any later version published by the Free Software Foundation;
+ with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+ A copy of the license is included in the section entitled "GNU
+ Free Documentation License".
+
+If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
+replace the "with...Texts." line with this:
+
+ with the Invariant Sections being LIST THEIR TITLES, with the
+ Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
+
+If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of
+free software license, such as the GNU General Public License,
+to permit their use in free software.
+
+The End.
diff --git a/Documentation/admin-guide/svga.rst b/Documentation/admin-guide/svga.rst
index b6c2f9acca92..9eb1e0738e84 100644
--- a/Documentation/admin-guide/svga.rst
+++ b/Documentation/admin-guide/svga.rst
@@ -12,7 +12,8 @@ Intro
This small document describes the "Video Mode Selection" feature which
allows the use of various special video modes supported by the video BIOS. Due
to usage of the BIOS, the selection is limited to boot time (before the
-kernel decompression starts) and works only on 80X86 machines.
+kernel decompression starts) and works only on 80X86 machines that are
+booted through BIOS firmware (as opposed to through UEFI, kexec, etc.).
.. note::
@@ -23,7 +24,7 @@ kernel decompression starts) and works only on 80X86 machines.
The video mode to be used is selected by a kernel parameter which can be
specified in the kernel Makefile (the SVGA_MODE=... line) or by the "vga=..."
-option of LILO (or some other boot loader you use) or by the "vidmode" utility
+option of LILO (or some other boot loader you use) or by the "xrandr" utility
(present in standard Linux utility packages). You can use the following values
of this parameter::
@@ -41,7 +42,7 @@ of this parameter::
better to use absolute mode numbers instead.
0x.... - Hexadecimal video mode ID (also displayed on the menu, see below
- for exact meaning of the ID). Warning: rdev and LILO don't support
+ for exact meaning of the ID). Warning: LILO doesn't support
hexadecimal numbers -- you have to convert it to decimal manually.
Menu
diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
new file mode 100644
index 000000000000..60314953c728
--- /dev/null
+++ b/Documentation/admin-guide/syscall-user-dispatch.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+Syscall User Dispatch
+=====================
+
+Background
+----------
+
+Compatibility layers like Wine need a way to efficiently emulate system
+calls of only a part of their process - the part that has the
+incompatible code - while being able to execute native syscalls without
+a high performance penalty on the native part of the process. Seccomp
+falls short on this task, since it has limited support to efficiently
+filter syscalls based on memory regions, and it doesn't support removing
+filters. Therefore a new mechanism is necessary.
+
+Syscall User Dispatch brings the filtering of the syscall dispatcher
+address back to userspace. The application is in control of a flip
+switch, indicating the current personality of the process. A
+multiple-personality application can then flip the switch without
+invoking the kernel, when crossing the compatibility layer API
+boundaries, to enable/disable the syscall redirection and execute
+syscalls directly (disabled) or send them to be emulated in userspace
+through a SIGSYS.
+
+The goal of this design is to provide very quick compatibility layer
+boundary crosses, which is achieved by not executing a syscall to change
+personality every time the compatibility layer executes. Instead, a
+userspace memory region exposed to the kernel indicates the current
+personality, and the application simply modifies that variable to
+configure the mechanism.
+
+There is a relatively high cost associated with handling signals on most
+architectures, like x86, but at least for Wine, syscalls issued by
+native Windows code are currently not known to be a performance problem,
+since they are quite rare, at least for modern gaming applications.
+
+Since this mechanism is designed to capture syscalls issued by
+non-native applications, it must function on syscalls whose invocation
+ABI is completely unexpected to Linux. Syscall User Dispatch therefore
+doesn't rely on any of the syscall ABI to do the filtering. It uses
+only the syscall dispatcher address and the userspace key.
+
+As the ABI of these intercepted syscalls is unknown to Linux, these
+syscalls are not instrumentable via ptrace or the syscall tracepoints.
+
+Interface
+---------
+
+A thread can set up this mechanism on supported kernels by executing the
+following prctl:
+
+ prctl(PR_SET_SYSCALL_USER_DISPATCH, <op>, <offset>, <length>, [selector])
+
+<op> is either PR_SYS_DISPATCH_ON or PR_SYS_DISPATCH_OFF, to enable and
+disable the mechanism globally for that thread. When
+PR_SYS_DISPATCH_OFF is used, the other fields must be zero.
+
+[<offset>, <offset>+<length>) delimit a memory region interval
+from which syscalls are always executed directly, regardless of the
+userspace selector. This provides a fast path for the C library, which
+includes the most common syscall dispatchers in the native code
+applications, and also provides a way for the signal handler to return
+without triggering a nested SIGSYS on (rt\_)sigreturn. Users of this
+interface should make sure that at least the signal trampoline code is
+included in this region. In addition, for architectures that implement the
+trampoline code on the vDSO, that trampoline is never intercepted.
+
+[selector] is a pointer to a char-sized region in the process memory
+that provides a quick way to enable or disable syscall redirection
+thread-wide, without the need to invoke the kernel directly. selector
+can be set to SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK.
+Any other value should terminate the program with a SIGSYS.
+
+Security Notes
+--------------
+
+Syscall User Dispatch provides functionality for compatibility layers to
+quickly capture system calls issued by a non-native part of the
+application, while not impacting the Linux native regions of the
+process. It is not a mechanism for sandboxing system calls, and it
+should not be seen as a security mechanism, since it is trivial for a
+malicious application to subvert the mechanism by jumping to an allowed
+dispatcher region prior to executing the syscall, or to discover the
+address and modify the selector value. If the use case requires any
+kind of security sandboxing, Seccomp should be used instead.
+
+Any fork or exec of the existing process resets the mechanism to
+PR_SYS_DISPATCH_OFF.
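The prctl() call described in the Interface section above can be tried from Python through ctypes. This is only a minimal sketch under stated assumptions: the numeric constants are copied from include/uapi/linux/prctl.h as of Linux 5.11, the helper name ``try_toggle_dispatch`` is hypothetical, and the selector is kept at SYSCALL_DISPATCH_FILTER_ALLOW the whole time so that no SIGSYS handler is needed.

```python
import ctypes

# Assumed values, taken from include/uapi/linux/prctl.h (Linux 5.11+).
PR_SET_SYSCALL_USER_DISPATCH = 59
PR_SYS_DISPATCH_OFF = 0
PR_SYS_DISPATCH_ON = 1
SYSCALL_DISPATCH_FILTER_ALLOW = 0
SYSCALL_DISPATCH_FILTER_BLOCK = 1

# The selector byte must stay alive for as long as dispatch is enabled,
# so it is kept at module level rather than in a local variable.
_selector = ctypes.c_ubyte(SYSCALL_DISPATCH_FILTER_ALLOW)


def try_toggle_dispatch():
    """Enable dispatch for the calling thread, then disable it again.

    Returns True if both prctl() calls succeeded and False if the
    platform or kernel does not support the mechanism.
    """
    try:
        prctl = ctypes.CDLL(None, use_errno=True).prctl
    except (AttributeError, OSError):
        return False  # no prctl symbol available (not Linux)

    def call(op, offset, length, selector_addr):
        return prctl(
            ctypes.c_int(PR_SET_SYSCALL_USER_DISPATCH),
            ctypes.c_ulong(op),
            ctypes.c_ulong(offset),
            ctypes.c_ulong(length),
            ctypes.c_ulong(selector_addr),
        )

    # Empty always-allowed region (offset 0, length 0); the selector alone
    # decides, and it says ALLOW, so every syscall still runs natively.
    if call(PR_SYS_DISPATCH_ON, 0, 0, ctypes.addressof(_selector)) != 0:
        return False
    # When switching off, all other fields must be zero.
    return call(PR_SYS_DISPATCH_OFF, 0, 0, 0) == 0


if __name__ == "__main__":
    print("syscall user dispatch usable:", try_toggle_dispatch())
```

A real compatibility layer would instead install a SIGSYS handler, pass a region covering its signal trampoline, and flip the selector byte to SYSCALL_DISPATCH_FILTER_BLOCK around non-native code; that part is omitted here because it is architecture-specific.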
diff --git a/Documentation/admin-guide/sysctl/abi.rst b/Documentation/admin-guide/sysctl/abi.rst
index 599bcde7f0b7..4e6db0a2a4c0 100644
--- a/Documentation/admin-guide/sysctl/abi.rst
+++ b/Documentation/admin-guide/sysctl/abi.rst
@@ -1,67 +1,34 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
================================
Documentation for /proc/sys/abi/
================================
-kernel version 2.6.0.test2
+.. See scripts/check-sysctl-docs to keep this up to date:
+.. scripts/check-sysctl-docs -vtable="abi" \
+.. Documentation/admin-guide/sysctl/abi.rst \
+.. $(git grep -l register_sysctl_)
-Copyright (c) 2003, Fabian Frederick <ffrederick@users.sourceforge.net>
+Copyright (c) 2020, Stephen Kitt
-For general info: index.rst.
+For general info, see Documentation/admin-guide/sysctl/index.rst.
------------------------------------------------------------------------------
-This path is binary emulation relevant aka personality types aka abi.
-When a process is executed, it's linked to an exec_domain whose
-personality is defined using values available from /proc/sys/abi.
-You can find further details about abi in include/linux/personality.h.
-
-Here are the files featuring in 2.6 kernel:
-
-- defhandler_coff
-- defhandler_elf
-- defhandler_lcall7
-- defhandler_libcso
-- fake_utsname
-- trace
-
-defhandler_coff
----------------
-
-defined value:
- PER_SCOSVR3::
-
- 0x0003 | STICKY_TIMEOUTS | WHOLE_SECONDS | SHORT_INODE
-
-defhandler_elf
---------------
-
-defined value:
- PER_LINUX::
-
- 0
-
-defhandler_lcall7
------------------
-
-defined value :
- PER_SVR4::
-
- 0x0001 | STICKY_TIMEOUTS | MMAP_PAGE_ZERO,
-
-defhandler_libsco
------------------
-
-defined value:
- PER_SVR4::
+The files in ``/proc/sys/abi`` can be used to see and modify
+ABI-related settings.
- 0x0001 | STICKY_TIMEOUTS | MMAP_PAGE_ZERO,
+Currently, these files might (depending on your configuration)
+show up in ``/proc/sys/abi``:
-fake_utsname
-------------
+.. contents:: :local:
-Unused
+vsyscall32 (x86)
+================
-trace
------
+Determines whether the kernel maps a vDSO page into 32-bit processes;
+can be set to 1 to enable, or 0 to disable. Defaults to enabled if
+``CONFIG_COMPAT_VDSO`` is set, disabled otherwise.
-Unused
+This controls the same setting as the ``vdso32`` kernel boot
+parameter.
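As a small illustration of the vsyscall32 entry described above, the current setting could be read as follows. This is a sketch, not part of the patch: the helper name ``read_vsyscall32`` is hypothetical, and the file is assumed to be absent on kernels or architectures that do not expose it.

```python
from pathlib import Path

# Location implied by the document title (/proc/sys/abi/) and the entry name.
VSYSCALL32 = Path("/proc/sys/abi/vsyscall32")


def read_vsyscall32(path=VSYSCALL32):
    """Return the current setting as an int, or None if it is unavailable."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None


if __name__ == "__main__":
    value = read_vsyscall32()
    if value is None:
        print("vsyscall32 is not available on this system")
    else:
        print("32-bit vDSO mapping enabled" if value else "32-bit vDSO mapping disabled")
```

Changing the value means writing 0 or 1 to the same file, which normally requires root privileges.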
diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
index 2a45119e3331..2a501c9ddc55 100644
--- a/Documentation/admin-guide/sysctl/fs.rst
+++ b/Documentation/admin-guide/sysctl/fs.rst
@@ -261,7 +261,7 @@ directories like /tmp. The common method of exploitation of this flaw
is to cross privilege boundaries when following a given symlink (i.e. a
root process follows a symlink belonging to another user). For a likely
incomplete list of hundreds of examples across the years, please see:
-http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
+https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
When set to "0", symlink following behavior is unrestricted.
@@ -380,5 +380,5 @@ This configuration option sets the maximum number of "watches" that are
allowed for each user.
Each "watch" costs roughly 90 bytes on a 32bit kernel, and roughly 160 bytes
on a 64bit one.
-The current default value for max_user_watches is the 1/32 of the available
-low memory, divided for the "watch" cost in bytes.
+The current default value for max_user_watches is 1/25 (4%) of the
+available low memory, divided by the "watch" cost in bytes.
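The rule for the max_user_watches default can be written out as a short calculation. The sketch below is an approximation under stated assumptions: it works in bytes rather than pages, it uses the rounded per-watch costs quoted in the text (90 and 160 bytes), and the function name is hypothetical.

```python
# Rounded per-watch costs quoted in the documentation text.
WATCH_COST_32BIT = 90
WATCH_COST_64BIT = 160


def default_max_user_watches(low_memory_bytes, watch_cost=WATCH_COST_64BIT):
    """Approximate default: 1/25 (4%) of low memory divided by the watch cost."""
    budget = low_memory_bytes // 25  # 4% of the available low memory
    return budget // watch_cost


if __name__ == "__main__":
    eight_gib = 8 * 1024 ** 3
    print(default_max_user_watches(eight_gib))
```

The value the kernel actually picks can differ slightly, because the real per-watch cost depends on internal structure sizes rather than on these rounded figures.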
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index def074807cee..98d1b198b2b4 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -2,262 +2,228 @@
Documentation for /proc/sys/kernel/
===================================
-kernel version 2.2.10
+.. See scripts/check-sysctl-docs to keep this up to date
+
Copyright (c) 1998, 1999, Rik van Riel <riel@nl.linux.org>
Copyright (c) 2009, Shen Feng<shen@cn.fujitsu.com>
-For general info and legal blurb, please look in index.rst.
+For general info and legal blurb, please look in
+Documentation/admin-guide/sysctl/index.rst.
------------------------------------------------------------------------------
This file contains documentation for the sysctl files in
-/proc/sys/kernel/ and is valid for Linux kernel version 2.2.
+``/proc/sys/kernel/``.
The files in this directory can be used to tune and monitor
miscellaneous and general things in the operation of the Linux
-kernel. Since some of the files _can_ be used to screw up your
+kernel. Since some of the files *can* be used to screw up your
system, it is advisable to read both documentation and source
before actually making adjustments.
Currently, these files might (depending on your configuration)
-show up in /proc/sys/kernel:
-
-- acct
-- acpi_video_flags
-- auto_msgmni
-- bootloader_type [ X86 only ]
-- bootloader_version [ X86 only ]
-- cap_last_cap
-- core_pattern
-- core_pipe_limit
-- core_uses_pid
-- ctrl-alt-del
-- dmesg_restrict
-- domainname
-- hostname
-- hotplug
-- hardlockup_all_cpu_backtrace
-- hardlockup_panic
-- hung_task_panic
-- hung_task_check_count
-- hung_task_timeout_secs
-- hung_task_check_interval_secs
-- hung_task_warnings
-- hyperv_record_panic_msg
-- kexec_load_disabled
-- kptr_restrict
-- l2cr [ PPC only ]
-- modprobe ==> Documentation/debugging-modules.txt
-- modules_disabled
-- msg_next_id [ sysv ipc ]
-- msgmax
-- msgmnb
-- msgmni
-- nmi_watchdog
-- osrelease
-- ostype
-- overflowgid
-- overflowuid
-- panic
-- panic_on_oops
-- panic_on_stackoverflow
-- panic_on_unrecovered_nmi
-- panic_on_warn
-- panic_print
-- panic_on_rcu_stall
-- perf_cpu_time_max_percent
-- perf_event_paranoid
-- perf_event_max_stack
-- perf_event_mlock_kb
-- perf_event_max_contexts_per_stack
-- pid_max
-- powersave-nap [ PPC only ]
-- printk
-- printk_delay
-- printk_ratelimit
-- printk_ratelimit_burst
-- pty ==> Documentation/filesystems/devpts.txt
-- randomize_va_space
-- real-root-dev ==> Documentation/admin-guide/initrd.rst
-- reboot-cmd [ SPARC only ]
-- rtsig-max
-- rtsig-nr
-- sched_energy_aware
-- seccomp/ ==> Documentation/userspace-api/seccomp_filter.rst
-- sem
-- sem_next_id [ sysv ipc ]
-- sg-big-buff [ generic SCSI device (sg) ]
-- shm_next_id [ sysv ipc ]
-- shm_rmid_forced
-- shmall
-- shmmax [ sysv ipc ]
-- shmmni
-- softlockup_all_cpu_backtrace
-- soft_watchdog
-- stack_erasing
-- stop-a [ SPARC only ]
-- sysrq ==> Documentation/admin-guide/sysrq.rst
-- sysctl_writes_strict
-- tainted ==> Documentation/admin-guide/tainted-kernels.rst
-- threads-max
-- unknown_nmi_panic
-- watchdog
-- watchdog_thresh
-- version
-
-
-acct:
-=====
+show up in ``/proc/sys/kernel``:
+
+.. contents:: :local:
+
+
+acct
+====
-highwater lowwater frequency
+::
+
+ highwater lowwater frequency
If BSD-style process accounting is enabled these values control
its behaviour. If free space on filesystem where the log lives
-goes below <lowwater>% accounting suspends. If free space gets
-above <highwater>% accounting resumes. <Frequency> determines
+goes below ``lowwater``\ % accounting suspends. If free space gets
+above ``highwater``\ % accounting resumes. ``frequency`` determines
how often do we check the amount of free space (value is in
seconds). Default:
-4 2 30
-That is, suspend accounting if there left <= 2% free; resume it
-if we got >=4%; consider information about amount of free space
-valid for 30 seconds.
+::
-acpi_video_flags:
-=================
+ 4 2 30
+
+That is, suspend accounting if free space drops below 2%; resume it
+if it increases to at least 4%; consider information about amount of
+free space valid for 30 seconds.
-flags
-See Doc*/kernel/power/video.txt, it allows mode of video boot to be
-set during run time.
+acpi_video_flags
+================
+See Documentation/power/video.rst. This allows the video resume mode to be set,
+in a similar fashion to the ``acpi_sleep`` kernel parameter, by
+combining the following values:
-auto_msgmni:
-============
+= =======
+1 s3_bios
+2 s3_mode
+4 s3_beep
+= =======
+
+arch
+====
+
+The machine hardware name, the same output as ``uname -m``
+(e.g. ``x86_64`` or ``aarch64``).
+
+auto_msgmni
+===========
This variable has no effect and may be removed in future kernel
releases. Reading it always returns 0.
-Up to Linux 3.17, it enabled/disabled automatic recomputing of msgmni
-upon memory add/remove or upon ipc namespace creation/removal.
+Up to Linux 3.17, it enabled/disabled automatic recomputing of
+`msgmni`_
+upon memory add/remove or upon IPC namespace creation/removal.
Echoing "1" into this file enabled msgmni automatic recomputing.
-Echoing "0" turned it off. auto_msgmni default value was 1.
+Echoing "0" turned it off. The default value was 1.
-bootloader_type:
-================
-
-x86 bootloader identification
+bootloader_type (x86 only)
+==========================
This gives the bootloader type number as indicated by the bootloader,
shifted left by 4, and OR'd with the low four bits of the bootloader
version. The reason for this encoding is that this used to match the
-type_of_loader field in the kernel header; the encoding is kept for
+``type_of_loader`` field in the kernel header; the encoding is kept for
backwards compatibility. That is, if the full bootloader type number
is 0x15 and the full version number is 0x234, this file will contain
the value 340 = 0x154.
-See the type_of_loader and ext_loader_type fields in
+See the ``type_of_loader`` and ``ext_loader_type`` fields in
Documentation/x86/boot.rst for additional information.
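The encoding described for bootloader_type can be checked with a few lines of arithmetic. This is an illustrative sketch only; the function name is hypothetical and the inputs are the example values from the text.

```python
def encode_bootloader_type(type_number, version):
    """Type shifted left by 4, OR'd with the low four bits of the version."""
    return (type_number << 4) | (version & 0xF)


if __name__ == "__main__":
    value = encode_bootloader_type(0x15, 0x234)
    print(value, hex(value))
```

With type 0x15 and version 0x234, only the low nibble of the version (0x4) survives, which is why the file shows 0x154 rather than anything derived from the full version number.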
-bootloader_version:
-===================
-
-x86 bootloader version
+bootloader_version (x86 only)
+=============================
The complete bootloader version number. In the example above, this
file will contain the value 564 = 0x234.
-See the type_of_loader and ext_loader_ver fields in
+See the ``type_of_loader`` and ``ext_loader_ver`` fields in
Documentation/x86/boot.rst for additional information.
-cap_last_cap:
-=============
+bpf_stats_enabled
+=================
+
+Controls whether the kernel should collect statistics on BPF programs
+(total time spent running, number of times run...). Enabling
+statistics causes a slight reduction in performance on each program
+run. The statistics can be seen using ``bpftool``.
+
+= ===================================
+0 Don't collect statistics (default).
+1 Collect statistics.
+= ===================================
+
+
+cad_pid
+=======
+
+This is the pid which will be signalled on reboot (notably, by
+Ctrl-Alt-Delete). Writing a value to this file which doesn't
+correspond to a running process will result in ``-ESRCH``.
+
+See also `ctrl-alt-del`_.
+
+
+cap_last_cap
+============
Highest valid capability of the running kernel. Exports
-CAP_LAST_CAP from the kernel.
+``CAP_LAST_CAP`` from the kernel.
-core_pattern:
-=============
+core_pattern
+============
-core_pattern is used to specify a core dumpfile pattern name.
+``core_pattern`` is used to specify a core dumpfile pattern name.
* max length 127 characters; default value is "core"
-* core_pattern is used as a pattern template for the output filename;
- certain string patterns (beginning with '%') are substituted with
- their actual values.
-* backward compatibility with core_uses_pid:
+* ``core_pattern`` is used as a pattern template for the output
+ filename; certain string patterns (beginning with '%') are
+ substituted with their actual values.
+* backward compatibility with ``core_uses_pid``:
- If core_pattern does not include "%p" (default does not)
- and core_uses_pid is set, then .PID will be appended to
+ If ``core_pattern`` does not include "%p" (default does not)
+ and ``core_uses_pid`` is set, then .PID will be appended to
the filename.
-* corename format specifiers::
-
- %<NUL> '%' is dropped
- %% output one '%'
- %p pid
- %P global pid (init PID namespace)
- %i tid
- %I global tid (init PID namespace)
- %u uid (in initial user namespace)
- %g gid (in initial user namespace)
- %d dump mode, matches PR_SET_DUMPABLE and
- /proc/sys/fs/suid_dumpable
- %s signal number
- %t UNIX time of dump
- %h hostname
- %e executable filename (may be shortened)
- %E executable path
- %<OTHER> both are dropped
+* corename format specifiers
+
+ ======== ==========================================
+ %<NUL> '%' is dropped
+ %% output one '%'
+ %p pid
+ %P global pid (init PID namespace)
+ %i tid
+ %I global tid (init PID namespace)
+ %u uid (in initial user namespace)
+ %g gid (in initial user namespace)
+ %d dump mode, matches ``PR_SET_DUMPABLE`` and
+ ``/proc/sys/fs/suid_dumpable``
+ %s signal number
+ %t UNIX time of dump
+ %h hostname
+ %e executable filename (may be shortened, could be changed by prctl etc)
+ %f executable filename
+ %E executable path
+ %c maximum size of core file by resource limit RLIMIT_CORE
+ %<OTHER> both are dropped
+ ======== ==========================================
* If the first character of the pattern is a '|', the kernel will treat
the rest of the pattern as a command to run. The core dump will be
written to the standard input of that program instead of to a file.
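The substitution rules in the table above can be modelled with a short sketch. This is only an illustration of the documented behaviour, not the kernel's implementation: the function name ``expand_core_pattern`` and the ``values`` mapping are hypothetical, and details such as the 127-character limit and the leading '|' pipe form are not modelled.

```python
def expand_core_pattern(pattern, values):
    """Expand '%' specifiers in a core_pattern-style template.

    `values` maps a specifier letter (e.g. "p", "u", "e") to its value.
    Illustrative model only; this is not the kernel's code.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        # A '%' at the very end of the pattern is dropped.
        if i + 1 == len(pattern):
            break
        spec = pattern[i + 1]
        if spec == "%":
            # '%%' outputs a single '%'.
            out.append("%")
        elif spec in values:
            out.append(str(values[spec]))
        # Any other specifier: both characters are dropped.
        i += 2
    return "".join(out)


# Hypothetical values for a crashing process.
example = {"p": 1234, "u": 1000, "e": "myapp", "h": "darkstar", "s": 11}
print(expand_core_pattern("core.%e.%p.%h", example))
```

With the hypothetical values shown, the template ``core.%e.%p.%h`` expands to ``core.myapp.1234.darkstar``.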
-core_pipe_limit:
-================
+core_pipe_limit
+===============
-This sysctl is only applicable when core_pattern is configured to pipe
-core files to a user space helper (when the first character of
-core_pattern is a '|', see above). When collecting cores via a pipe
-to an application, it is occasionally useful for the collecting
-application to gather data about the crashing process from its
-/proc/pid directory. In order to do this safely, the kernel must wait
-for the collecting process to exit, so as not to remove the crashing
-processes proc files prematurely. This in turn creates the
-possibility that a misbehaving userspace collecting process can block
-the reaping of a crashed process simply by never exiting. This sysctl
-defends against that. It defines how many concurrent crashing
-processes may be piped to user space applications in parallel. If
-this value is exceeded, then those crashing processes above that value
-are noted via the kernel log and their cores are skipped. 0 is a
-special value, indicating that unlimited processes may be captured in
-parallel, but that no waiting will take place (i.e. the collecting
-process is not guaranteed access to /proc/<crashing pid>/). This
-value defaults to 0.
-
-
-core_uses_pid:
-==============
+This sysctl is only applicable when `core_pattern`_ is configured to
+pipe core files to a user space helper (when the first character of
+``core_pattern`` is a '|', see above).
+When collecting cores via a pipe to an application, it is occasionally
+useful for the collecting application to gather data about the
+crashing process from its ``/proc/pid`` directory.
+In order to do this safely, the kernel must wait for the collecting
+process to exit, so as not to remove the crashing process's proc files
+prematurely.
+This in turn creates the possibility that a misbehaving userspace
+collecting process can block the reaping of a crashed process simply
+by never exiting.
+This sysctl defends against that.
+It defines how many concurrent crashing processes may be piped to user
+space applications in parallel.
+If this value is exceeded, then those crashing processes above that
+value are noted via the kernel log and their cores are skipped.
+0 is a special value, indicating that unlimited processes may be
+captured in parallel, but that no waiting will take place (i.e. the
+collecting process is not guaranteed access to ``/proc/<crashing
+pid>/``).
+This value defaults to 0.
+
+
+core_uses_pid
+=============
The default coredump filename is "core". By setting
-core_uses_pid to 1, the coredump filename becomes core.PID.
-If core_pattern does not include "%p" (default does not)
-and core_uses_pid is set, then .PID will be appended to
+``core_uses_pid`` to 1, the coredump filename becomes core.PID.
+If `core_pattern`_ does not include "%p" (default does not)
+and ``core_uses_pid`` is set, then .PID will be appended to
the filename.
-ctrl-alt-del:
-=============
+ctrl-alt-del
+============
When the value in this file is 0, ctrl-alt-del is trapped and
-sent to the init(1) program to handle a graceful restart.
+sent to the ``init(1)`` program to handle a graceful restart.
When, however, the value is > 0, Linux's reaction to a Vulcan
Nerve Pinch (tm) will be an immediate reboot, without even
syncing its dirty buffers.
@@ -269,21 +235,22 @@ Note:
to decide what to do with it.
-dmesg_restrict:
-===============
+dmesg_restrict
+==============
This toggle indicates whether unprivileged users are prevented
-from using dmesg(8) to view messages from the kernel's log buffer.
-When dmesg_restrict is set to (0) there are no restrictions. When
-dmesg_restrict is set set to (1), users must have CAP_SYSLOG to use
-dmesg(8).
+from using ``dmesg(8)`` to view messages from the kernel's log
+buffer.
+When ``dmesg_restrict`` is set to 0 there are no restrictions.
+When ``dmesg_restrict`` is set to 1, users must have
+``CAP_SYSLOG`` to use ``dmesg(8)``.
-The kernel config option CONFIG_SECURITY_DMESG_RESTRICT sets the
-default value of dmesg_restrict.
+The kernel config option ``CONFIG_SECURITY_DMESG_RESTRICT`` sets the
+default value of ``dmesg_restrict``.
-domainname & hostname:
-======================
+domainname & hostname
+=====================
These files can be used to set the NIS/YP domainname and the
hostname of your box in exactly the same way as the commands
@@ -302,167 +269,292 @@ hostname "darkstar" and DNS (Internet Domain Name Server)
domainname "frop.org", not to be confused with the NIS (Network
Information Service) or YP (Yellow Pages) domainname. These two
domain names are in general different. For a detailed discussion
-see the hostname(1) man page.
+see the ``hostname(1)`` man page.
-hardlockup_all_cpu_backtrace:
-=============================
+firmware_config
+===============
+
+See Documentation/driver-api/firmware/fallback-mechanisms.rst.
+
+The entries in this directory allow the firmware loader helper
+fallback to be controlled:
+
+* ``force_sysfs_fallback``, when set to 1, forces the use of the
+ fallback;
+* ``ignore_sysfs_fallback``, when set to 1, ignores any fallback.
+
+
+ftrace_dump_on_oops
+===================
+
+Determines whether ``ftrace_dump()`` should be called on an oops (or
+kernel panic). This will output the contents of the ftrace buffers to
+the console. This is very useful for capturing traces that lead to
+crashes and outputting them to a serial console.
+
+= ===================================================
+0 Disabled (default).
+1 Dump buffers of all CPUs.
+2 Dump the buffer of the CPU that triggered the oops.
+= ===================================================
+
+
+ftrace_enabled, stack_tracer_enabled
+====================================
+
+See Documentation/trace/ftrace.rst.
+
+
+hardlockup_all_cpu_backtrace
+============================
This value controls the hard lockup detector behavior when a hard
lockup condition is detected as to whether or not to gather further
debug information. If enabled, arch-specific all-CPU stack dumping
will be initiated.
-0: do nothing. This is the default behavior.
-
-1: on detection capture more debug information.
+= ============================================
+0 Do nothing. This is the default behavior.
+1 On detection capture more debug information.
+= ============================================
-hardlockup_panic:
-=================
+hardlockup_panic
+================
This parameter can be used to control whether the kernel panics
when a hard lockup is detected.
- 0 - don't panic on hard lockup
- 1 - panic on hard lockup
+= ===========================
+0 Don't panic on hard lockup.
+1 Panic on hard lockup.
+= ===========================
-See Documentation/admin-guide/lockup-watchdogs.rst for more information. This can
-also be set using the nmi_watchdog kernel parameter.
+See Documentation/admin-guide/lockup-watchdogs.rst for more information.
+This can also be set using the nmi_watchdog kernel parameter.
-hotplug:
-========
+hotplug
+=======
Path for the hotplug policy agent.
-Default value is "/sbin/hotplug".
+Default value is ``CONFIG_UEVENT_HELPER_PATH``, which in turn defaults
+to the empty string.
+This file only exists when ``CONFIG_UEVENT_HELPER`` is enabled. Most
+modern systems rely exclusively on the netlink-based uevent source and
+don't need this.
-hung_task_panic:
-================
-Controls the kernel's behavior when a hung task is detected.
-This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
+hung_task_all_cpu_backtrace
+===========================
-0: continue operation. This is the default behavior.
+If this option is set, the kernel will send an NMI to all CPUs to dump
+their backtraces when a hung task is detected. This file shows up if
+CONFIG_DETECT_HUNG_TASK and CONFIG_SMP are enabled.
-1: panic immediately.
+0: Won't show all CPUs backtraces when a hung task is detected.
+This is the default behavior.
+1: Will non-maskably interrupt all CPUs and dump their backtraces when
+a hung task is detected.
-hung_task_check_count:
-======================
+
+hung_task_panic
+===============
+
+Controls the kernel's behavior when a hung task is detected.
+This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
+
+= =================================================
+0 Continue operation. This is the default behavior.
+1 Panic immediately.
+= =================================================
+
+
+hung_task_check_count
+=====================
The upper bound on the number of tasks that are checked.
-This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
+This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
-hung_task_timeout_secs:
-=======================
+hung_task_timeout_secs
+======================
When a task in D state has not been scheduled
for more than this many seconds, report a warning.
-This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
+This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
-0: means infinite timeout - no checking done.
+0 means infinite timeout, no checking is done.
-Possible values to set are in range {0..LONG_MAX/HZ}.
+Possible values to set are in range {0:``LONG_MAX``/``HZ``}.
-hung_task_check_interval_secs:
-==============================
+hung_task_check_interval_secs
+=============================
Hung task check interval. If hung task checking is enabled
-(see hung_task_timeout_secs), the check is done every
-hung_task_check_interval_secs seconds.
-This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
+(see `hung_task_timeout_secs`_), the check is done every
+``hung_task_check_interval_secs`` seconds.
+This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
-0 (default): means use hung_task_timeout_secs as checking interval.
-Possible values to set are in range {0..LONG_MAX/HZ}.
+0 (default) means use ``hung_task_timeout_secs`` as checking
+interval.
+Possible values to set are in range {0:``LONG_MAX``/``HZ``}.
-hung_task_warnings:
-===================
+
+hung_task_warnings
+==================
The maximum number of warnings to report. During a check interval
if a hung task is detected, this value is decreased by 1.
When this value reaches 0, no more warnings will be reported.
-This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
+This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
-1: report an infinite number of warnings.
-hyperv_record_panic_msg:
-========================
+hyperv_record_panic_msg
+=======================
Controls whether the panic kmsg data should be reported to Hyper-V.
-0: do not report panic kmsg data.
+= =========================================================
+0 Do not report panic kmsg data.
+1 Report the panic kmsg data. This is the default behavior.
+= =========================================================
-1: report the panic kmsg data. This is the default behavior.
+ignore-unaligned-usertrap
+=========================
-kexec_load_disabled:
-====================
+On architectures where unaligned accesses cause traps, and where this
+feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN``;
+currently, ``arc`` and ``ia64``), controls whether all unaligned traps
+are logged.
-A toggle indicating if the kexec_load syscall has been disabled. This
-value defaults to 0 (false: kexec_load enabled), but can be set to 1
-(true: kexec_load disabled). Once true, kexec can no longer be used, and
-the toggle cannot be set back to false. This allows a kexec image to be
-loaded before disabling the syscall, allowing a system to set up (and
-later use) an image without it being altered. Generally used together
-with the "modules_disabled" sysctl.
+= =============================================================
+0 Log all unaligned accesses.
+1 Only warn the first time a process traps. This is the default
+ setting.
+= =============================================================
+See also `unaligned-trap`_ and `unaligned-dump-stack`_. On ``ia64``,
+this allows system administrators to override the
+``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
-kptr_restrict:
-==============
-This toggle indicates whether restrictions are placed on
-exposing kernel addresses via /proc and other interfaces.
+kexec_load_disabled
+===================
-When kptr_restrict is set to 0 (the default) the address is hashed before
-printing. (This is the equivalent to %p.)
+A toggle indicating if the ``kexec_load`` syscall has been disabled.
+This value defaults to 0 (false: ``kexec_load`` enabled), but can be
+set to 1 (true: ``kexec_load`` disabled).
+Once true, kexec can no longer be used, and the toggle cannot be set
+back to false.
+This allows a kexec image to be loaded before disabling the syscall,
+allowing a system to set up (and later use) an image without it being
+altered.
+Generally used together with the `modules_disabled`_ sysctl.
-When kptr_restrict is set to (1), kernel pointers printed using the %pK
-format specifier will be replaced with 0's unless the user has CAP_SYSLOG
-and effective user and group ids are equal to the real ids. This is
-because %pK checks are done at read() time rather than open() time, so
-if permissions are elevated between the open() and the read() (e.g via
-a setuid binary) then %pK will not leak kernel pointers to unprivileged
-users. Note, this is a temporary solution only. The correct long-term
-solution is to do the permission checks at open() time. Consider removing
-world read permissions from files that use %pK, and using dmesg_restrict
-to protect against uses of %pK in dmesg(8) if leaking kernel pointer
-values to unprivileged users is a concern.
-When kptr_restrict is set to (2), kernel pointers printed using
-%pK will be replaced with 0's regardless of privileges.
+kptr_restrict
+=============
+This toggle indicates whether restrictions are placed on
+exposing kernel addresses via ``/proc`` and other interfaces.
+
+When ``kptr_restrict`` is set to 0 (the default) the address is hashed
+before printing.
+(This is the equivalent to %p.)
+
+When ``kptr_restrict`` is set to 1, kernel pointers printed using the
+%pK format specifier will be replaced with 0s unless the user has
+``CAP_SYSLOG`` and effective user and group ids are equal to the real
+ids.
+This is because %pK checks are done at read() time rather than open()
+time, so if permissions are elevated between the open() and the read()
+(e.g. via a setuid binary) then %pK will not leak kernel pointers to
+unprivileged users.
+Note, this is a temporary solution only.
+The correct long-term solution is to do the permission checks at
+open() time.
+Consider removing world read permissions from files that use %pK, and
+using `dmesg_restrict`_ to protect against uses of %pK in ``dmesg(8)``
+if leaking kernel pointer values to unprivileged users is a concern.
+
+When ``kptr_restrict`` is set to 2, kernel pointers printed using
+%pK will be replaced with 0s regardless of privileges.
+
+
+modprobe
+========
-l2cr: (PPC only)
-================
+The full path to the usermode helper for autoloading kernel modules,
+by default ``CONFIG_MODPROBE_PATH``, which in turn defaults to
+"/sbin/modprobe". This binary is executed when the kernel requests a
+module. For example, if userspace passes an unknown filesystem type
+to mount(), then the kernel will automatically request the
+corresponding filesystem module by executing this usermode helper.
+This usermode helper should insert the needed module into the kernel.
-This flag controls the L2 cache of G3 processor boards. If
-0, the cache is disabled. Enabled if nonzero.
+This sysctl only affects module autoloading. It has no effect on the
+ability to explicitly insert modules.
+This sysctl can be used to debug module loading requests::
-modules_disabled:
-=================
+ echo '#! /bin/sh' > /tmp/modprobe
+ echo 'echo "$@" >> /tmp/modprobe.log' >> /tmp/modprobe
+ echo 'exec /sbin/modprobe "$@"' >> /tmp/modprobe
+ chmod a+x /tmp/modprobe
+ echo /tmp/modprobe > /proc/sys/kernel/modprobe
+
+Alternatively, if this sysctl is set to the empty string, then module
+autoloading is completely disabled. The kernel will not try to
+execute a usermode helper at all, nor will it call the
+kernel_module_request LSM hook.
+
+If CONFIG_STATIC_USERMODEHELPER=y is set in the kernel configuration,
+then the configured static usermode helper overrides this sysctl,
+except that the empty string is still accepted to completely disable
+module autoloading as described above.
+
+modules_disabled
+================
A toggle value indicating if modules are allowed to be loaded
in an otherwise modular kernel. This toggle defaults to off
(0), but can be set true (1). Once true, modules can be
neither loaded nor unloaded, and the toggle cannot be set back
-to false. Generally used with the "kexec_load_disabled" toggle.
+to false. Generally used with the `kexec_load_disabled`_ toggle.
-msg_next_id, sem_next_id, and shm_next_id:
-==========================================
+.. _msgmni:
+
+msgmax, msgmnb, and msgmni
+==========================
+
+``msgmax`` is the maximum size of an IPC message, in bytes. 8192 by
+default (``MSGMAX``).
+
+``msgmnb`` is the maximum size of an IPC queue, in bytes. 16384 by
+default (``MSGMNB``).
+
+``msgmni`` is the maximum number of IPC queues. 32000 by default
+(``MSGMNI``).
+
+
+msg_next_id, sem_next_id, and shm_next_id (System V IPC)
+========================================================
These three toggles allow the desired id to be specified for the next
allocated IPC object: message, semaphore or shared memory respectively.
By default they are equal to -1, which means generic allocation logic.
-Possible values to set are in range {0..INT_MAX}.
+Possible values to set are in range {0:``INT_MAX``}.
Notes:
1) kernel doesn't guarantee, that new object will have desired id. So,
@@ -472,15 +564,24 @@ Notes:
fails, it is undefined if the value remains unmodified or is reset to -1.
-nmi_watchdog:
-=============
+ngroups_max
+===========
+
+Maximum number of supplementary groups, i.e. the maximum size which
+``setgroups`` will accept. Exports ``NGROUPS_MAX`` from the kernel.
+
+
+
+nmi_watchdog
+============
This parameter can be used to control the NMI watchdog
(i.e. the hard lockup detector) on x86 systems.
-0 - disable the hard lockup detector
-
-1 - enable the hard lockup detector
+= =================================
+0 Disable the hard lockup detector.
+1 Enable the hard lockup detector.
+= =================================
The hard lockup detector monitors each CPU for its ability to respond to
timer interrupts. The mechanism utilizes CPU performance counter registers
@@ -492,73 +593,82 @@ in a KVM virtual machine. This default can be overridden by adding::
nmi_watchdog=1
-to the guest kernel command line (see Documentation/admin-guide/kernel-parameters.rst).
+to the guest kernel command line (see
+Documentation/admin-guide/kernel-parameters.rst).
-numa_balancing:
-===============
+nmi_wd_lpm_factor (PPC only)
+============================
+
+Factor to apply to the NMI watchdog timeout (only when ``nmi_watchdog`` is
+set to 1). This factor represents the percentage added to
+``watchdog_thresh`` when calculating the NMI watchdog timeout during an
+LPM. The soft lockup timeout is not impacted.
+
+A value of 0 means no change. The default value is 200 meaning the NMI
+watchdog is set to 30s (based on ``watchdog_thresh`` equal to 10).
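The arithmetic behind the default can be sketched as follows. This assumes the factor is applied exactly as described above, as a percentage of ``watchdog_thresh`` added on top of it. The helper name ``lpm_nmi_timeout`` is hypothetical, and the kernel may round differently.

```python
def lpm_nmi_timeout(watchdog_thresh, lpm_factor):
    """NMI watchdog timeout in seconds during an LPM.

    Modelled as watchdog_thresh plus lpm_factor percent of
    watchdog_thresh, following the description in the text.
    """
    return watchdog_thresh + watchdog_thresh * lpm_factor / 100


print(lpm_nmi_timeout(10, 200))  # 30.0
print(lpm_nmi_timeout(10, 0))    # 10.0 (a factor of 0 means no change)
```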
+
-Enables/disables automatic page fault based NUMA memory
-balancing. Memory is moved automatically to nodes
-that access it often.
+numa_balancing
+==============
+
+Enables/disables and configures automatic page fault based NUMA memory
+balancing. Memory is moved automatically to nodes that access it often.
+The value to set can be the result of ORing the following:
+
+= =================================
+0 NUMA_BALANCING_DISABLED
+1 NUMA_BALANCING_NORMAL
+2 NUMA_BALANCING_MEMORY_TIERING
+= =================================
-Enables/disables automatic NUMA memory balancing. On NUMA machines, there
-is a performance penalty if remote memory is accessed by a CPU. When this
-feature is enabled the kernel samples what task thread is accessing memory
-by periodically unmapping pages and later trapping a page fault. At the
-time of the page fault, it is determined if the data being accessed should
-be migrated to a local memory node.
+Or NUMA_BALANCING_NORMAL to optimize page placement among different
+NUMA nodes to reduce remote accessing. On NUMA machines, there is a
+performance penalty if remote memory is accessed by a CPU. When this
+feature is enabled the kernel samples what task thread is accessing
+memory by periodically unmapping pages and later trapping a page
+fault. At the time of the page fault, it is determined if the data
+being accessed should be migrated to a local memory node.
The unmapping of pages and trapping faults incur additional overhead that
ideally is offset by improved memory locality but there is no universal
guarantee. If the target workload is already bound to NUMA nodes then this
-feature should be disabled. Otherwise, if the system overhead from the
-feature is too high then the rate the kernel samples for NUMA hinting
-faults may be controlled by the numa_balancing_scan_period_min_ms,
-numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
-numa_balancing_scan_size_mb, and numa_balancing_settle_count sysctls.
+feature should be disabled.
-numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb
-===============================================================================================================================
+Or NUMA_BALANCING_MEMORY_TIERING to optimize page placement among
+different types of memory (represented as different NUMA nodes) to
+place the hot pages in the fast memory. This is implemented based on
+unmapping and page fault too.
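As a small sketch of how the values in the table combine, the flags can be ORed together to obtain the number to write to the sysctl. The ``NumaBalancing`` class below is a hypothetical helper that mirrors the table; it is not a kernel or library API.

```python
from enum import IntFlag


class NumaBalancing(IntFlag):
    # Values mirror the table above; the class itself is illustrative.
    DISABLED = 0
    NORMAL = 1
    MEMORY_TIERING = 2


# Enable both normal balancing and memory tiering.
value = NumaBalancing.NORMAL | NumaBalancing.MEMORY_TIERING
print(int(value))  # 3
```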
+numa_balancing_promote_rate_limit_MBps
+======================================
-Automatic NUMA balancing scans tasks address space and unmaps pages to
-detect if pages are properly placed or if the data should be migrated to a
-memory node local to where the task is running. Every "scan delay" the task
-scans the next "scan size" number of pages in its address space. When the
-end of the address space is reached the scanner restarts from the beginning.
+Too high promotion/demotion throughput between different memory types
+may hurt application latency. This can be used to rate limit the
+promotion throughput. The per-node max promotion throughput in MB/s
+will be limited to be no more than the set value.
-In combination, the "scan delay" and "scan size" determine the scan rate.
-When "scan delay" decreases, the scan rate increases. The scan delay and
-hence the scan rate of every task is adaptive and depends on historical
-behaviour. If pages are properly placed then the scan delay increases,
-otherwise the scan delay decreases. The "scan size" is not adaptive but
-the higher the "scan size", the higher the scan rate.
+A rule of thumb is to set this to less than 1/10 of the PMEM node
+write bandwidth.
-Higher scan rates incur higher system overhead as page faults must be
-trapped and potentially data must be migrated. However, the higher the scan
-rate, the more quickly a tasks memory is migrated to a local node if the
-workload pattern changes and minimises performance impact due to remote
-memory accesses. These sysctls control the thresholds for scan delays and
-the number of pages scanned.
-
-numa_balancing_scan_period_min_ms is the minimum time in milliseconds to
-scan a tasks virtual memory. It effectively controls the maximum scanning
-rate for each task.
+oops_all_cpu_backtrace
+======================
-numa_balancing_scan_delay_ms is the starting "scan delay" used for a task
-when it initially forks.
+If this option is set, the kernel will send an NMI to all CPUs to dump
+their backtraces when an oops event occurs. It should be used as a last
+resort in case a panic cannot be triggered (to protect VMs running, for
+example) or kdump can't be collected. This file shows up if CONFIG_SMP
+is enabled.
-numa_balancing_scan_period_max_ms is the maximum time in milliseconds to
-scan a tasks virtual memory. It effectively controls the minimum scanning
-rate for each task.
+0: Won't show all CPUs backtraces when an oops is detected.
+This is the default behavior.
-numa_balancing_scan_size_mb is how many megabytes worth of pages are
-scanned for a given scan.
+1: Will non-maskably interrupt all CPUs and dump their backtraces when
+an oops event is detected.
-osrelease, ostype & version:
-============================
+osrelease, ostype & version
+===========================
::
@@ -569,15 +679,16 @@ osrelease, ostype & version:
# cat version
#5 Wed Feb 25 21:49:24 MET 1998
-The files osrelease and ostype should be clear enough. Version
+The files ``osrelease`` and ``ostype`` should be clear enough.
+``version``
needs a little more clarification however. The '#5' means that
this is the fifth kernel built from this source base and the
date behind it indicates the time the kernel was built.
The only way to tune these values is to rebuild the kernel :-)
-overflowgid & overflowuid:
-==========================
+overflowgid & overflowuid
+=========================
If your architecture did not always support 32-bit UIDs (i.e. arm,
i386, m68k, sh, and sparc32), a fixed UID and GID will be returned to
@@ -588,108 +699,128 @@ These sysctls allow you to change the value of the fixed UID and GID.
The default is 65534.
+panic
+=====
+
+The value in this file determines the behaviour of the kernel on a
panic:
-======
-The value in this file represents the number of seconds the kernel
-waits before rebooting on a panic. When you use the software watchdog,
-the recommended setting is 60.
+* if zero, the kernel will loop forever;
+* if negative, the kernel will reboot immediately;
+* if positive, the kernel will reboot after the corresponding number
+ of seconds.
+When you use the software watchdog, the recommended setting is 60.
-panic_on_io_nmi:
-================
+
+panic_on_io_nmi
+===============
Controls the kernel's behavior when a CPU receives an NMI caused by
an IO error.
-0: try to continue operation (default)
-
-1: panic immediately. The IO error triggered an NMI. This indicates a
- serious system condition which could result in IO data corruption.
- Rather than continuing, panicking might be a better choice. Some
- servers issue this sort of NMI when the dump button is pushed,
- and you can use this option to take a crash dump.
+= ==================================================================
+0 Try to continue operation (default).
+1 Panic immediately. The IO error triggered an NMI. This indicates a
+ serious system condition which could result in IO data corruption.
+ Rather than continuing, panicking might be a better choice. Some
+ servers issue this sort of NMI when the dump button is pushed,
+ and you can use this option to take a crash dump.
+= ==================================================================
-panic_on_oops:
-==============
+panic_on_oops
+=============
Controls the kernel's behaviour when an oops or BUG is encountered.
-0: try to continue operation
-
-1: panic immediately. If the `panic` sysctl is also non-zero then the
- machine will be rebooted.
+= ===================================================================
+0 Try to continue operation.
+1 Panic immediately. If the `panic` sysctl is also non-zero then the
+ machine will be rebooted.
+= ===================================================================
-panic_on_stackoverflow:
-=======================
+panic_on_stackoverflow
+======================
Controls the kernel's behavior when an overflow of a kernel, IRQ or
exception stack (but not a user stack) is detected.
-This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
+This file shows up if ``CONFIG_DEBUG_STACKOVERFLOW`` is enabled.
-0: try to continue operation.
+= ==========================
+0 Try to continue operation.
+1 Panic immediately.
+= ==========================
-1: panic immediately.
-
-panic_on_unrecovered_nmi:
-=========================
+panic_on_unrecovered_nmi
+========================
The default Linux behaviour on an NMI of either memory or unknown is
to continue operation. For many environments such as scientific
computing it is preferable that the box is taken out and the error
dealt with than an uncorrected parity/ECC error get propagated.
-A small number of systems do generate NMI's for bizarre random reasons
+A small number of systems do generate NMIs for bizarre random reasons
such as power management so the default is off. That sysctl works like
the existing panic controls already in that directory.
-panic_on_warn:
-==============
+panic_on_warn
+=============
Calls panic() in the WARN() path when set to 1. This is useful to avoid
a kernel rebuild when attempting to kdump at the location of a WARN().
-0: only WARN(), default behaviour.
+= ================================================
+0 Only WARN(), default behaviour.
+1 Call panic() after printing out WARN() location.
+= ================================================
-1: call panic() after printing out WARN() location.
-
-panic_print:
-============
+panic_print
+===========
Bitmask for printing system info when a panic happens. The user can choose
a combination of the following bits:
-===== ========================================
+===== ============================================
bit 0 print all tasks info
bit 1 print system memory info
bit 2 print timer info
-bit 3 print locks info if CONFIG_LOCKDEP is on
+bit 3 print locks info if ``CONFIG_LOCKDEP`` is on
bit 4 print ftrace buffer
-===== ========================================
+bit 5 print all printk messages in buffer
+bit 6 print all CPUs backtrace (if available in the arch)
+===== ============================================
So for example to print tasks and memory info on panic, user can::
echo 3 > /proc/sys/kernel/panic_print
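The value 3 in the example above is simply bit 0 ORed with bit 1. A minimal sketch of computing such a mask from the bit table follows. The short names in ``PANIC_PRINT_BITS`` are hypothetical labels chosen for this illustration; only the bit positions come from the table.

```python
# Bit positions from the panic_print table; the key names are illustrative.
PANIC_PRINT_BITS = {
    "tasks": 0,
    "memory": 1,
    "timers": 2,
    "locks": 3,
    "ftrace": 4,
    "printk": 5,
    "all_cpu_backtrace": 6,
}


def panic_print_mask(*names):
    """Return the integer bitmask with the bit for each given name set."""
    mask = 0
    for name in names:
        mask |= 1 << PANIC_PRINT_BITS[name]
    return mask


print(panic_print_mask("tasks", "memory"))  # 3
```

The resulting integer is what would be written to ``/proc/sys/kernel/panic_print``.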
-panic_on_rcu_stall:
-===================
+panic_on_rcu_stall
+==================
When set to 1, calls panic() after RCU stall detection messages. This
is useful to define the root cause of RCU stalls using a vmcore.
-0: do not panic() when RCU stall takes place, default behavior.
+= ============================================================
+0 Do not panic() when RCU stall takes place, default behavior.
+1 panic() after printing RCU stall messages.
+= ============================================================
-1: panic() after printing RCU stall messages.
+max_rcu_stall_to_panic
+======================
+When ``panic_on_rcu_stall`` is set to 1, this value determines the
+number of times that RCU can stall before panic() is called.
-perf_cpu_time_max_percent:
-==========================
+When ``panic_on_rcu_stall`` is set to 0, this value has no effect.
+
+perf_cpu_time_max_percent
+=========================
Hints to the kernel how much CPU time it should be allowed to
use to handle perf sampling events. If the perf subsystem
@@ -702,171 +833,222 @@ unexpectedly take too long to execute, the NMIs can become
stacked up next to each other so much that nothing else is
allowed to execute.
-0:
- disable the mechanism. Do not monitor or correct perf's
- sampling rate no matter how CPU time it takes.
+===== ========================================================
+0 Disable the mechanism. Do not monitor or correct perf's
+ sampling rate no matter how much CPU time it takes.
-1-100:
- attempt to throttle perf's sample rate to this
- percentage of CPU. Note: the kernel calculates an
- "expected" length of each sample event. 100 here means
- 100% of that expected length. Even if this is set to
- 100, you may still see sample throttling if this
- length is exceeded. Set to 0 if you truly do not care
- how much CPU is consumed.
+1-100 Attempt to throttle perf's sample rate to this
+ percentage of CPU. Note: the kernel calculates an
+ "expected" length of each sample event. 100 here means
+ 100% of that expected length. Even if this is set to
+ 100, you may still see sample throttling if this
+ length is exceeded. Set to 0 if you truly do not care
+ how much CPU is consumed.
+===== ========================================================
-perf_event_paranoid:
-====================
+perf_event_paranoid
+===================
Controls use of the performance events system by unprivileged
-users (without CAP_SYS_ADMIN). The default value is 2.
+users (without CAP_PERFMON). The default value is 2.
+
+For backward compatibility, access to system performance monitoring
+and observability remains open to processes with CAP_SYS_ADMIN, but
+using CAP_SYS_ADMIN for secure system performance monitoring and
+observability operations is discouraged in favor of the narrower
+CAP_PERFMON capability.
=== ==================================================================
- -1 Allow use of (almost) all events by all users
+ -1 Allow use of (almost) all events by all users.
- Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
+ Ignore mlock limit after perf_event_mlock_kb without
+ ``CAP_IPC_LOCK``.
->=0 Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
+>=0 Disallow ftrace function tracepoint by users without
+ ``CAP_PERFMON``.
- Disallow raw tracepoint access by users without CAP_SYS_ADMIN
+ Disallow raw tracepoint access by users without ``CAP_PERFMON``.
->=1 Disallow CPU event access by users without CAP_SYS_ADMIN
+>=1 Disallow CPU event access by users without ``CAP_PERFMON``.
->=2 Disallow kernel profiling by users without CAP_SYS_ADMIN
+>=2 Disallow kernel profiling by users without ``CAP_PERFMON``.
=== ==================================================================
-perf_event_max_stack:
-=====================
+perf_event_max_stack
+====================
-Controls maximum number of stack frames to copy for (attr.sample_type &
-PERF_SAMPLE_CALLCHAIN) configured events, for instance, when using
-'perf record -g' or 'perf trace --call-graph fp'.
+Controls maximum number of stack frames to copy for (``attr.sample_type &
+PERF_SAMPLE_CALLCHAIN``) configured events, for instance, when using
+'``perf record -g``' or '``perf trace --call-graph fp``'.
This can only be done when no events are in use that have callchains
-enabled, otherwise writing to this file will return -EBUSY.
+enabled, otherwise writing to this file will return ``-EBUSY``.
The default value is 127.
-perf_event_mlock_kb:
-====================
+perf_event_mlock_kb
+===================
-Control size of per-cpu ring buffer not counted agains mlock limit.
+Control size of per-cpu ring buffer not counted against mlock limit.
The default value is 512 + 1 page
-perf_event_max_contexts_per_stack:
-==================================
+perf_event_max_contexts_per_stack
+=================================
Controls maximum number of stack frame context entries for
-(attr.sample_type & PERF_SAMPLE_CALLCHAIN) configured events, for
-instance, when using 'perf record -g' or 'perf trace --call-graph fp'.
+(``attr.sample_type & PERF_SAMPLE_CALLCHAIN``) configured events, for
+instance, when using '``perf record -g``' or '``perf trace --call-graph fp``'.
This can only be done when no events are in use that have callchains
-enabled, otherwise writing to this file will return -EBUSY.
+enabled, otherwise writing to this file will return ``-EBUSY``.
The default value is 8.
-pid_max:
-========
+perf_user_access (arm64 only)
+=============================
+
+Controls user space access for reading perf event counters. When set to 1,
+user space can read performance monitor counter registers directly.
+
+The default value is 0 (access disabled).
+
+See Documentation/arm64/perf.rst for more information.
+
+
+pid_max
+=======
PID allocation wrap value. When the kernel's next PID value
reaches this value, it wraps back to a minimum PID value.
-PIDs of value pid_max or larger are not allocated.
+PIDs of value ``pid_max`` or larger are not allocated.
-ns_last_pid:
-============
+ns_last_pid
+===========
The last pid allocated in the current (the one task using this sysctl
lives in) pid namespace. When selecting a pid for a next task on fork
kernel tries to allocate a number starting from this one.
-powersave-nap: (PPC only)
-=========================
+powersave-nap (PPC only)
+========================
If set, Linux-PPC will use the 'nap' mode of powersaving,
otherwise the 'doze' mode will be used.
+
==============================================================
-printk:
-=======
+printk
+======
-The four values in printk denote: console_loglevel,
-default_message_loglevel, minimum_console_loglevel and
-default_console_loglevel respectively.
+The four values in printk denote: ``console_loglevel``,
+``default_message_loglevel``, ``minimum_console_loglevel`` and
+``default_console_loglevel`` respectively.
These values influence printk() behavior when printing or
-logging error messages. See 'man 2 syslog' for more info on
+logging error messages. See '``man 2 syslog``' for more info on
the different loglevels.
-- console_loglevel:
- messages with a higher priority than
- this will be printed to the console
-- default_message_loglevel:
- messages without an explicit priority
- will be printed with this priority
-- minimum_console_loglevel:
- minimum (highest) value to which
- console_loglevel can be set
-- default_console_loglevel:
- default value for console_loglevel
+======================== =====================================
+console_loglevel messages with a higher priority than
+ this will be printed to the console
+default_message_loglevel messages without an explicit priority
+ will be printed with this priority
+minimum_console_loglevel minimum (highest) value to which
+ console_loglevel can be set
+default_console_loglevel default value for console_loglevel
+======================== =====================================
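To make the four fields concrete, here is a minimal sketch that parses the contents of ``/proc/sys/kernel/printk`` into named values. The field names come from the table above; the helper names ``parse_printk`` and ``reaches_console`` are hypothetical. The comparison relies on the fact that a numerically lower loglevel means a higher priority.

```python
from collections import namedtuple

# Field order follows the description above.
PrintkLevels = namedtuple(
    "PrintkLevels",
    [
        "console_loglevel",
        "default_message_loglevel",
        "minimum_console_loglevel",
        "default_console_loglevel",
    ],
)


def parse_printk(text):
    """Parse the whitespace-separated content of the printk sysctl."""
    fields = [int(field) for field in text.split()]
    if len(fields) != 4:
        raise ValueError("expected four integers, got %d" % len(fields))
    return PrintkLevels(*fields)


def reaches_console(levels, message_level):
    """True if a message at message_level would be printed to the console.

    Lower numbers are higher priority, so a message is printed only when
    its level is strictly below console_loglevel.
    """
    return message_level < levels.console_loglevel
```

On a live system the input string could come from reading ``/proc/sys/kernel/printk``; the sketch only handles the parsing.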
-printk_delay:
-=============
+printk_delay
+============
-Delay each printk message in printk_delay milliseconds
+Delay each printk message in ``printk_delay`` milliseconds
Value from 0 - 10000 is allowed.
-printk_ratelimit:
-=================
+printk_ratelimit
+================
-Some warning messages are rate limited. printk_ratelimit specifies
+Some warning messages are rate limited. ``printk_ratelimit`` specifies
the minimum length of time between these messages (in seconds).
The default value is 5 seconds.
A value of 0 will disable rate limiting.
-printk_ratelimit_burst:
-=======================
+printk_ratelimit_burst
+======================
-While long term we enforce one message per printk_ratelimit
+While long term we enforce one message per `printk_ratelimit`_
seconds, we do allow a burst of messages to pass through.
-printk_ratelimit_burst specifies the number of messages we can
+``printk_ratelimit_burst`` specifies the number of messages we can
send before ratelimiting kicks in.
The default value is 10 messages.
-printk_devkmsg:
-===============
-
-Control the logging to /dev/kmsg from userspace:
-
-ratelimit:
- default, ratelimited
+printk_devkmsg
+==============
-on: unlimited logging to /dev/kmsg from userspace
+Control the logging to ``/dev/kmsg`` from userspace:
-off: logging to /dev/kmsg disabled
+========= =============================================
+ratelimit default, ratelimited
+on unlimited logging to /dev/kmsg from userspace
+off logging to /dev/kmsg disabled
+========= =============================================
-The kernel command line parameter printk.devkmsg= overrides this and is
+The kernel command line parameter ``printk.devkmsg=`` overrides this and is
a one-time setting until next reboot: once set, it cannot be changed by
this sysctl interface anymore.
+==============================================================
+
-randomize_va_space:
-===================
+pty
+===
+
+See Documentation/filesystems/devpts.rst.
+
+
+random
+======
+
+This is a directory, with the following entries:
+
+* ``boot_id``: a UUID generated the first time this is retrieved, and
+ unvarying after that;
+
+* ``uuid``: a UUID generated every time this is retrieved (this can
+ thus be used to generate UUIDs at will);
+
+* ``entropy_avail``: the pool's entropy count, in bits;
+
+* ``poolsize``: the entropy pool size, in bits;
+
+* ``urandom_min_reseed_secs``: obsolete (used to determine the minimum
+ number of seconds between urandom pool reseeding). This file is
+ writable for compatibility purposes, but writing to it has no effect
+ on any RNG behavior;
+
+* ``write_wakeup_threshold``: when the entropy count drops below this
+ (as a number of bits), processes waiting to write to ``/dev/random``
+ are woken up. This file is writable for compatibility purposes, but
+ writing to it has no effect on any RNG behavior.
+
+
+randomize_va_space
+==================
This option can be used to select the type of process address
space randomization that is used in the system, for architectures
@@ -881,10 +1063,10 @@ that support this feature.
This, among other things, implies that shared libraries will be
loaded to random addresses. Also for PIE-linked binaries, the
location of code start is randomized. This is the default if the
- CONFIG_COMPAT_BRK option is enabled.
+ ``CONFIG_COMPAT_BRK`` option is enabled.
2 Additionally enable heap randomization. This is the default if
- CONFIG_COMPAT_BRK is disabled.
+ ``CONFIG_COMPAT_BRK`` is disabled.
There are a few legacy applications out there (such as some ancient
versions of libc.so.5 from 1996) that assume that brk area starts
@@ -894,31 +1076,27 @@ that support this feature.
systems it is safe to choose full randomization.
Systems with ancient and/or broken binaries should be configured
- with CONFIG_COMPAT_BRK enabled, which excludes the heap from process
+ with ``CONFIG_COMPAT_BRK`` enabled, which excludes the heap from process
address space randomization.
== ===========================================================================
-reboot-cmd: (Sparc only)
-========================
-
-??? This seems to be a way to give an argument to the Sparc
-ROM/Flash boot loader. Maybe to tell it what to do after
-rebooting. ???
+real-root-dev
+=============
+See Documentation/admin-guide/initrd.rst.
-rtsig-max & rtsig-nr:
-=====================
-The file rtsig-max can be used to tune the maximum number
-of POSIX realtime (queued) signals that can be outstanding
-in the system.
+reboot-cmd (SPARC only)
+=======================
-rtsig-nr shows the number of RT signals currently queued.
+??? This seems to be a way to give an argument to the Sparc
+ROM/Flash boot loader. Maybe to tell it what to do after
+rebooting. ???
-sched_energy_aware:
-===================
+sched_energy_aware
+==================
Enables/disables Energy Aware Scheduling (EAS). EAS starts
automatically on platforms where it can run (that is,
@@ -927,76 +1105,150 @@ Model available). If your platform happens to meet the
requirements for EAS but you do not want to use it, change
this value to 0.
+task_delayacct
+==============
+
+Enables/disables task delay accounting (see
+Documentation/accounting/delay-accounting.rst). Enabling this feature incurs
+a small amount of overhead in the scheduler but is useful for debugging
+and performance tuning. It is required by some tools such as iotop.
-sched_schedstats:
-=================
+sched_schedstats
+================
Enables/disables scheduler statistics. Enabling this feature
incurs a small amount of overhead in the scheduler but is
useful for debugging and performance tuning.
+sched_util_clamp_min
+====================
-sg-big-buff:
-============
+Max allowed *minimum* utilization.
+
+Default value is 1024, which is the maximum possible value.
+
+It means that any requested uclamp.min value cannot be greater than
+sched_util_clamp_min, i.e., it is restricted to the range
+[0:sched_util_clamp_min].
+
+sched_util_clamp_max
+====================
+
+Max allowed *maximum* utilization.
+
+Default value is 1024, which is the maximum possible value.
+
+It means that any requested uclamp.max value cannot be greater than
+sched_util_clamp_max, i.e., it is restricted to the range
+[0:sched_util_clamp_max].
+
+sched_util_clamp_min_rt_default
+===============================
+
+By default Linux is tuned for performance, which means that RT tasks always run
+at the highest frequency and on the most capable (highest capacity) CPU (in
+heterogeneous systems).
+
+Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
+1024 by default, which effectively boosts the tasks to run at the highest
+frequency and biases them to run on the biggest CPU.
+
+This knob allows admins to change the default behavior when uclamp is being
+used. In battery powered devices particularly, running at the maximum
+capacity and frequency will increase energy consumption and shorten the battery
+life.
+
+This knob is only effective for RT tasks whose requested uclamp.min value
+the user hasn't modified via the sched_setattr() syscall.
+
+This knob will not escape the range constraint imposed by sched_util_clamp_min
+defined above.
+
+For example if
+
+ sched_util_clamp_min_rt_default = 800
+ sched_util_clamp_min = 600
+
+Then the boost will be clamped to 600 because 800 is outside of the permissible
+range of [0:600]. This could happen for instance if a powersave mode
+temporarily restricts all boosts by modifying sched_util_clamp_min. As soon as
+this restriction is lifted, the requested sched_util_clamp_min_rt_default
+will take effect.
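The range constraint in the example above can be sketched as a simple minimum. This is a simplified model of the rule as described in the text, not the scheduler's actual implementation, and the function name ``effective_rt_uclamp_min`` is hypothetical.

```python
def effective_rt_uclamp_min(rt_default, clamp_min):
    """Model the boost applied to an RT task that set no uclamp.min itself.

    rt_default stands for sched_util_clamp_min_rt_default and clamp_min for
    sched_util_clamp_min. The requested default is limited to the
    permissible range [0:clamp_min].
    """
    return min(rt_default, clamp_min)


# The values from the example above: 800 is outside [0:600].
print(effective_rt_uclamp_min(800, 600))  # 600
```

Once ``sched_util_clamp_min`` is raised back to at least 800, the same call returns the requested 800.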
+
+seccomp
+=======
+
+See Documentation/userspace-api/seccomp_filter.rst.
+
+
+sg-big-buff
+===========
This file shows the size of the generic SCSI (sg) buffer.
You can't tune it just yet, but you could change it on
-compile time by editing include/scsi/sg.h and changing
-the value of SG_BIG_BUFF.
+compile time by editing ``include/scsi/sg.h`` and changing
+the value of ``SG_BIG_BUFF``.
There shouldn't be any reason to change this value. If
you can come up with one, you probably know what you
are doing anyway :)
-shmall:
-=======
+shmall
+======
This parameter sets the total amount of shared memory pages that
-can be used system wide. Hence, SHMALL should always be at least
-ceil(shmmax/PAGE_SIZE).
+can be used system wide. Hence, ``shmall`` should always be at least
+``ceil(shmmax/PAGE_SIZE)``.
-If you are not sure what the default PAGE_SIZE is on your Linux
-system, you can run the following command:
+If you are not sure what the default ``PAGE_SIZE`` is on your Linux
+system, you can run the following command::
# getconf PAGE_SIZE
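The lower bound ``ceil(shmmax/PAGE_SIZE)`` can be worked out as follows. This is a minimal sketch; the helper name ``min_shmall_pages`` is hypothetical, and when no page size is passed it falls back to ``os.sysconf("SC_PAGE_SIZE")``, which reports the same value as the ``getconf`` command above on Linux.

```python
import os


def min_shmall_pages(shmmax_bytes, page_size=None):
    """Smallest shmall (in pages) able to hold one segment of shmmax bytes."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    # Integer ceiling division avoids floating point rounding on large values.
    return (shmmax_bytes + page_size - 1) // page_size


# A 1 GiB shmmax with 4 KiB pages needs at least 2**18 pages.
print(min_shmall_pages(1 << 30, 4096))  # 262144
```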
-shmmax:
-=======
+shmmax
+======
This value can be used to query and set the run time limit
on the maximum shared memory segment size that can be created.
Shared memory segments up to 1Gb are now supported in the
-kernel. This value defaults to SHMMAX.
+kernel. This value defaults to ``SHMMAX``.
-shm_rmid_forced:
-================
+shmmni
+======
+
+This value determines the maximum number of shared memory segments.
+4096 by default (``SHMMNI``).
+
+
+shm_rmid_forced
+===============
Linux lets you set resource limits, including how much memory one
-process can consume, via setrlimit(2). Unfortunately, shared memory
+process can consume, via ``setrlimit(2)``. Unfortunately, shared memory
segments are allowed to exist without association with any process, and
thus might not be counted against any resource limits. If enabled,
shared memory segments are automatically destroyed when their attach
count becomes zero after a detach or a process termination. It will
also destroy segments that were created, but never attached to, on exit
-from the process. The only use left for IPC_RMID is to immediately
+from the process. The only use left for ``IPC_RMID`` is to immediately
destroy an unattached segment. Of course, this breaks the way things are
defined, so some applications might stop working. Note that this
feature will do you no good unless you also configure your resource
-limits (in particular, RLIMIT_AS and RLIMIT_NPROC). Most systems don't
+limits (in particular, ``RLIMIT_AS`` and ``RLIMIT_NPROC``). Most systems don't
need this.
Note that if you change this from 0 to 1, already created segments
without users and with a dead originative process will be destroyed.
-sysctl_writes_strict:
-=====================
+sysctl_writes_strict
+====================
Control how file position affects the behavior of updating sysctl values
-via the /proc/sys interface:
+via the ``/proc/sys`` interface:
== ======================================================================
-1 Legacy per-write sysctl value handling, with no printk warnings.
@@ -1013,8 +1265,8 @@ via the /proc/sys interface:
== ======================================================================
-softlockup_all_cpu_backtrace:
-=============================
+softlockup_all_cpu_backtrace
+============================
This value controls the soft lockup detector thread's behavior
when a soft lockup condition is detected as to whether or not
@@ -1024,43 +1276,80 @@ be issued an NMI and instructed to capture stack trace.
This feature is only applicable for architectures which support
NMI.
-0: do nothing. This is the default behavior.
+= ============================================
+0 Do nothing. This is the default behavior.
+1 On detection capture more debug information.
+= ============================================
-1: on detection capture more debug information.
+softlockup_panic
+================
-soft_watchdog:
-==============
+This parameter can be used to control whether the kernel panics
+when a soft lockup is detected.
-This parameter can be used to control the soft lockup detector.
+= ============================================
+0 Don't panic on soft lockup.
+1 Panic on soft lockup.
+= ============================================
- 0 - disable the soft lockup detector
+This can also be set using the softlockup_panic kernel parameter.
- 1 - enable the soft lockup detector
+
+soft_watchdog
+=============
+
+This parameter can be used to control the soft lockup detector.
+
+= =================================
+0 Disable the soft lockup detector.
+1 Enable the soft lockup detector.
+= =================================
The soft lockup detector monitors CPUs for threads that are hogging the CPUs
-without rescheduling voluntarily, and thus prevent the 'watchdog/N' threads
-from running. The mechanism depends on the CPUs ability to respond to timer
-interrupts which are needed for the 'watchdog/N' threads to be woken up by
-the watchdog timer function, otherwise the NMI watchdog - if enabled - can
-detect a hard lockup condition.
+without rescheduling voluntarily, and thus prevent the 'migration/N' threads
+from running, causing the watchdog work to fail to execute. The mechanism depends
+on the CPUs' ability to respond to timer interrupts which are needed for the
+watchdog work to be queued by the watchdog timer function, otherwise the NMI
+watchdog - if enabled - can detect a hard lockup condition.
-stack_erasing:
-==============
+stack_erasing
+=============
This parameter can be used to control kernel stack erasing at the end
-of syscalls for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
+of syscalls for kernels built with ``CONFIG_GCC_PLUGIN_STACKLEAK``.
That erasing reduces the information which kernel stack leak bugs
can reveal and blocks some uninitialized stack variable attacks.
The tradeoff is the performance impact: on a single CPU system kernel
compilation sees a 1% slowdown, other systems and workloads may vary.
- 0: kernel stack erasing is disabled, STACKLEAK_METRICS are not updated.
+= ====================================================================
+0 Kernel stack erasing is disabled, STACKLEAK_METRICS are not updated.
+1 Kernel stack erasing is enabled (default), it is performed before
+ returning to the userspace at the end of syscalls.
+= ====================================================================
+
+
+stop-a (SPARC only)
+===================
- 1: kernel stack erasing is enabled (default), it is performed before
- returning to the userspace at the end of syscalls.
+Controls Stop-A:
+
+= ====================================
+0 Stop-A has no effect.
+1 Stop-A breaks to the PROM (default).
+= ====================================
+
+Stop-A is always enabled on a panic, so that the user can return to
+the boot PROM.
+
+
+sysrq
+=====
+
+See Documentation/admin-guide/sysrq.rst.
tainted
@@ -1072,7 +1361,7 @@ ORed together. The letters are seen in "Tainted" line of Oops reports.
====== ===== ==============================================================
1 `(P)` proprietary module was loaded
2 `(F)` module was force loaded
- 4 `(S)` SMP kernel oops on an officially SMP incapable processor
+ 4 `(S)` kernel running on an out of specification system
8 `(R)` module was force unloaded
16 `(M)` processor reported a Machine Check Exception (MCE)
32 `(B)` bad page referenced or some unexpected page flags
@@ -1092,28 +1381,95 @@ ORed together. The letters are seen in "Tainted" line of Oops reports.
See Documentation/admin-guide/tainted-kernels.rst for more information.
+Note:
+ writes to this sysctl interface will fail with ``EINVAL`` if the kernel is
+ booted with the command line option ``panic_on_taint=<bitmask>,nousertaint``
+ and any of the ORed together values being written to ``tainted`` match with
+ the bitmask declared on panic_on_taint.
+ See Documentation/admin-guide/kernel-parameters.rst for more details on
+ that particular kernel command line option and its optional
+ ``nousertaint`` switch.
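Because the ``tainted`` value is a set of flags ORed together, it can be decoded bit by bit. The sketch below covers only the six flags visible in this excerpt of the table (values 1 through 32); the full table in the file defines more, which this example deliberately leaves out. The helper name ``decode_tainted`` is hypothetical.

```python
# Only the flags shown in the excerpt above; higher bits are ignored here.
TAINT_FLAGS = [
    (1, "P"),   # proprietary module was loaded
    (2, "F"),   # module was force loaded
    (4, "S"),   # kernel running on an out of specification system
    (8, "R"),   # module was force unloaded
    (16, "M"),  # processor reported a Machine Check Exception (MCE)
    (32, "B"),  # bad page referenced or some unexpected page flags
]


def decode_tainted(value):
    """Return the letters for every known flag set in value."""
    return [letter for bit, letter in TAINT_FLAGS if value & bit]


# 17 = 1 + 16: proprietary module plus a machine check exception.
print(decode_tainted(17))  # ['P', 'M']
```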
-threads-max:
-============
+threads-max
+===========
This value controls the maximum number of threads that can be created
-using fork().
+using ``fork()``.
During initialization the kernel sets this value such that even if the
maximum number of threads is created, the thread structures occupy only
a part (1/8th) of the available RAM pages.
-The minimum value that can be written to threads-max is 1.
+The minimum value that can be written to ``threads-max`` is 1.
-The maximum value that can be written to threads-max is given by the
-constant FUTEX_TID_MASK (0x3fffffff).
+The maximum value that can be written to ``threads-max`` is given by the
+constant ``FUTEX_TID_MASK`` (0x3fffffff).
-If a value outside of this range is written to threads-max an error
-EINVAL occurs.
+If a value outside of this range is written to ``threads-max`` an
+``EINVAL`` error occurs.
-unknown_nmi_panic:
-==================
+traceoff_on_warning
+===================
+
+When set, disables tracing (see Documentation/trace/ftrace.rst) when a
+``WARN()`` is hit.
+
+
+tracepoint_printk
+=================
+
+When tracepoints are sent to printk() (enabled by the ``tp_printk``
+boot parameter), this entry provides runtime control::
+
+ echo 0 > /proc/sys/kernel/tracepoint_printk
+
+will stop tracepoints from being sent to printk(), and::
+
+ echo 1 > /proc/sys/kernel/tracepoint_printk
+
+will send them to printk() again.
+
+This only works if the kernel was booted with ``tp_printk`` enabled.
+
+See Documentation/admin-guide/kernel-parameters.rst and
+Documentation/trace/boottime-trace.rst.
+
+
+.. _unaligned-dump-stack:
+
+unaligned-dump-stack (ia64)
+===========================
+
+When logging unaligned accesses, controls whether the stack is
+dumped.
+
+= ===================================================
+0 Do not dump the stack. This is the default setting.
+1 Dump the stack.
+= ===================================================
+
+See also `ignore-unaligned-usertrap`_.
+
+
+unaligned-trap
+==============
+
+On architectures where unaligned accesses cause traps, and where this
+feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW``; currently,
+``arc`` and ``parisc``), controls whether unaligned traps are caught
+and emulated (instead of failing).
+
+= ========================================================
+0 Do not emulate unaligned accesses.
+1 Emulate unaligned accesses. This is the default setting.
+= ========================================================
+
+See also `ignore-unaligned-usertrap`_.
+
+
+unknown_nmi_panic
+=================
The value in this file affects behavior of handling NMI. When the
value is non-zero, unknown NMI is trapped and then panic occurs. At
@@ -1123,37 +1479,60 @@ NMI switch that most IA32 servers have fires unknown NMI up, for
example. If a system hangs up, try pressing the NMI switch.
-watchdog:
-=========
+unprivileged_bpf_disabled
+=========================
-This parameter can be used to disable or enable the soft lockup detector
-_and_ the NMI watchdog (i.e. the hard lockup detector) at the same time.
+Writing 1 to this entry will disable unprivileged calls to ``bpf()``;
+once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` or ``CAP_BPF``
+will return ``-EPERM``. Once set to 1, this can't be cleared from the
+running kernel anymore.
+
+Writing 2 to this entry will also disable unprivileged calls to ``bpf()``,
+however, an admin can still change this setting later on, if needed, by
+writing 0 or 1 to this entry.
- 0 - disable both lockup detectors
+If ``BPF_UNPRIV_DEFAULT_OFF`` is enabled in the kernel config, then this
+entry will default to 2 instead of 0.
- 1 - enable both lockup detectors
+= =============================================================
+0 Unprivileged calls to ``bpf()`` are enabled
+1 Unprivileged calls to ``bpf()`` are disabled without recovery
+2 Unprivileged calls to ``bpf()`` are disabled
+= =============================================================
+
+watchdog
+========
+
+This parameter can be used to disable or enable the soft lockup detector
+*and* the NMI watchdog (i.e. the hard lockup detector) at the same time.
+
+= ==============================
+0 Disable both lockup detectors.
+1 Enable both lockup detectors.
+= ==============================
The soft lockup detector and the NMI watchdog can also be disabled or
-enabled individually, using the soft_watchdog and nmi_watchdog parameters.
-If the watchdog parameter is read, for example by executing::
+enabled individually, using the ``soft_watchdog`` and ``nmi_watchdog``
+parameters.
+If the ``watchdog`` parameter is read, for example by executing::
cat /proc/sys/kernel/watchdog
-the output of this command (0 or 1) shows the logical OR of soft_watchdog
-and nmi_watchdog.
+the output of this command (0 or 1) shows the logical OR of
+``soft_watchdog`` and ``nmi_watchdog``.
-watchdog_cpumask:
-=================
+watchdog_cpumask
+================
This value can be used to control on which cpus the watchdog may run.
-The default cpumask is all possible cores, but if NO_HZ_FULL is
+The default cpumask is all possible cores, but if ``NO_HZ_FULL`` is
enabled in the kernel config, and cores are specified with the
-nohz_full= boot argument, those cores are excluded by default.
+``nohz_full=`` boot argument, those cores are excluded by default.
Offline cores can be included in this mask, and if the core is later
brought online, the watchdog will be started based on the mask value.
-Typically this value would only be touched in the nohz_full case
+Typically this value would only be touched in the ``nohz_full`` case
to re-enable cores that by default were not running the watchdog,
if a kernel lockup was suspected on those cores.
@@ -1164,12 +1543,12 @@ might say::
echo 0,2-4 > /proc/sys/kernel/watchdog_cpumask
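The list written above mixes single CPUs and ranges. A minimal sketch of how such a string expands into individual CPU numbers is shown here; it handles only plain numbers and ``a-b`` ranges, which is an assumption made for this example rather than a full description of what the kernel accepts. The helper name ``parse_cpulist`` is hypothetical.

```python
def parse_cpulist(text):
    """Expand a list such as "0,2-4" into a sorted list of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# The value from the echo above selects cores 0, 2, 3 and 4.
print(parse_cpulist("0,2-4"))  # [0, 2, 3, 4]
```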
-watchdog_thresh:
-================
+watchdog_thresh
+===============
This value can be used to control the frequency of hrtimer and NMI
events and the soft and hard lockup thresholds. The default threshold
is 10 seconds.
-The softlockup threshold is (2 * watchdog_thresh). Setting this
+The softlockup threshold is (``2 * watchdog_thresh``). Setting this
tunable to zero will disable lockup detection altogether.
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 287b98708a40..6394f5dc2303 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -31,17 +31,18 @@ see only some of them, depending on your kernel's configuration.
Table : Subdirectories in /proc/sys/net
- ========= =================== = ========== ==================
+ ========= =================== = ========== ===================
Directory Content Directory Content
- ========= =================== = ========== ==================
- core General parameter appletalk Appletalk protocol
- unix Unix domain sockets netrom NET/ROM
- 802 E802 protocol ax25 AX25
- ethernet Ethernet protocol rose X.25 PLP layer
+ ========= =================== = ========== ===================
+ 802 E802 protocol mptcp Multipath TCP
+ appletalk Appletalk protocol netfilter Network Filter
+ ax25 AX25 netrom NET/ROM
+ bridge Bridging rose X.25 PLP layer
+ core General parameter tipc TIPC
+ ethernet Ethernet protocol unix Unix domain sockets
ipv4 IP version 4 x25 X.25 protocol
- bridge Bridging decnet DEC net
- ipv6 IP version 6 tipc TIPC
- ========= =================== = ========== ==================
+ ipv6 IP version 6
+ ========= =================== = ========== ===================
1. /proc/sys/net/core - Network core options
============================================
@@ -64,15 +65,16 @@ two flavors of JITs, the newer eBPF JIT currently supported on:
- arm64
- arm32
- ppc64
+ - ppc32
- sparc64
- mips64
- s390x
- - riscv
+ - riscv64
+ - riscv32
And the older cBPF JIT supported on the following archs:
- mips
- - ppc
- sparc
eBPF JITs are a superset of cBPF JITs, meaning the kernel will
@@ -100,6 +102,9 @@ Values:
- 1 - enable JIT hardening for unprivileged users only
- 2 - enable JIT hardening for all users
+where "privileged user" in this context means a process having
+CAP_BPF or CAP_SYS_ADMIN in the root user name space.
+
bpf_jit_kallsyms
----------------
@@ -270,7 +275,7 @@ poll cycle or the number of packets processed reaches netdev_budget.
netdev_max_backlog
------------------
-Maximum number of packets, queued on the INPUT side, when the interface
+Maximum number of packets, queued on the INPUT side, when the interface
receives packets faster than kernel can process them.
netdev_rss_key
@@ -310,6 +315,25 @@ permit to distribute the load on several cpus.
If set to 1 (default), timestamps are sampled as soon as possible, before
queueing.
+netdev_unregister_timeout_secs
+------------------------------
+
+Unregister network device timeout in seconds.
+This option controls the timeout (in seconds) used to issue a warning while
+waiting for a network device refcount to drop to 0 during device
+unregistration. A lower value may be useful during bisection to detect
+a leaked reference faster. A larger value may be useful to prevent false
+warnings on slow/loaded systems.
+Default value is 10, minimum 1, maximum 3600.
+
+skb_defer_max
+-------------
+
+Max size (in skbs) of the per-cpu list of skbs being freed
+by the cpu which allocated them. Used by TCP stack so far.
+
+Default: 64
+
optmem_max
----------
@@ -320,11 +344,20 @@ fb_tunnels_only_for_init_net
----------------------------
Controls if fallback tunnels (like tunl0, gre0, gretap0, erspan0,
-sit0, ip6tnl0, ip6gre0) are automatically created when a new
-network namespace is created, if corresponding tunnel is present
-in initial network namespace.
-If set to 1, these devices are not automatically created, and
-user space is responsible for creating them if needed.
+sit0, ip6tnl0, ip6gre0) are automatically created. There are 3 possibilities:
+(a) value = 0; respective fallback tunnels are created when module is
+loaded in every net namespace (backward compatible behavior).
+(b) value = 1; [kcmd value: initns] respective fallback tunnels are
+created only in init net namespace and every other net namespace will
+not have them.
+(c) value = 2; [kcmd value: none] fallback tunnels are not created
+when a module is loaded in any net namespace. Setting the value to
+"2" is pointless after boot if these modules are built-in, so there is
+a kernel command-line option that can change this default. Please refer to
+Documentation/admin-guide/kernel-parameters.txt for additional details.
+
+Not creating fallback tunnels lets userspace create only the devices it
+needs and avoids creating redundant ones.
Default : 0 (for compatibility reasons)
@@ -338,10 +371,42 @@ settings from init_net and for IPv6 we reset all settings to default.
If set to 1, both IPv4 and IPv6 settings are forced to inherit from
current ones in init_net. If set to 2, both IPv4 and IPv6 settings are
-forced to reset to their default values.
+forced to reset to their default values. If set to 3, both IPv4 and IPv6
+settings are forced to inherit from current ones in the netns where this
+new netns has been created.
Default : 0 (for compatibility reasons)
+txrehash
+--------
+
+Controls default hash rethink behaviour on listening socket when SO_TXREHASH
+option is set to SOCK_TXREHASH_DEFAULT (i.e. not overridden by setsockopt).
+
+If set to 1 (default), hash rethink is performed on listening socket.
+If set to 0, hash rethink is not performed.
+
+gro_normal_batch
+----------------
+
+Maximum number of segments to batch up on output of GRO. When a packet
+exits GRO, either as a coalesced superframe or as an original packet which
+GRO has decided not to coalesce, it is placed on a per-NAPI list. This
+list is then passed to the stack when the number of segments reaches the
+gro_normal_batch limit.
+
+high_order_alloc_disable
+------------------------
+
+By default the allocator for page frags tries to use high order pages (order-3
+on x86). While the default behavior gives good results in most cases, some users
+might hit contention in page allocation/freeing. This was especially
+true on older kernels (< 5.14) when high-order pages were not stored on per-cpu
+lists. This sysctl allows opting in to order-0 allocation instead, but is now
+mostly of historical importance.
+
+Default: 0
+
2. /proc/sys/net/unix - Parameters for Unix domain sockets
----------------------------------------------------------
@@ -352,8 +417,8 @@ socket's buffer. It will not take effect unless PF_UNIX flag is specified.
3. /proc/sys/net/ipv4 - IPV4 settings
-------------------------------------
-Please see: Documentation/networking/ip-sysctl.txt and ipvs-sysctl.txt for
-descriptions of these entries.
+Please see: Documentation/networking/ip-sysctl.rst and
+Documentation/admin-guide/sysctl/net.rst for descriptions of these entries.
4. Appletalk
diff --git a/Documentation/admin-guide/sysctl/user.rst b/Documentation/admin-guide/sysctl/user.rst
index 650eaa03f15e..c45824589339 100644
--- a/Documentation/admin-guide/sysctl/user.rst
+++ b/Documentation/admin-guide/sysctl/user.rst
@@ -65,6 +65,12 @@ max_pid_namespaces
The maximum number of pid namespaces that any user in the current
user namespace may create.
+max_time_namespaces
+===================
+
+ The maximum number of time namespaces that any user in the current
+ user namespace may create.
+
max_user_namespaces
===================
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 64aeee1009ca..988f6a4c8084 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -25,8 +25,8 @@ files can be found in mm/swap.c.
Currently, these files are in /proc/sys/vm:
- admin_reserve_kbytes
-- block_dump
- compact_memory
+- compaction_proactiveness
- compact_unevictable_allowed
- dirty_background_bytes
- dirty_background_ratio
@@ -37,6 +37,7 @@ Currently, these files are in /proc/sys/vm:
- dirty_writeback_centisecs
- drop_caches
- extfrag_threshold
+- highmem_is_dirtyable
- hugetlb_shm_group
- laptop_mode
- legacy_va_layout
@@ -61,8 +62,9 @@ Currently, these files are in /proc/sys/vm:
- overcommit_memory
- overcommit_ratio
- page-cluster
+- page_lock_unfairness
- panic_on_oom
-- percpu_pagelist_fraction
+- percpu_pagelist_high_fraction
- stat_interval
- stat_refresh
- numa_stat
@@ -104,13 +106,6 @@ On x86_64 this is about 128MB.
Changing this takes effect whenever an application requests memory.
-block_dump
-==========
-
-block_dump enables block I/O debugging when set to a nonzero value. More
-information on block I/O debugging is in Documentation/admin-guide/laptops/laptop-mode.rst.
-
-
compact_memory
==============
@@ -119,6 +114,22 @@ all zones are compacted such that free memory is available in contiguous
blocks where possible. This can be important for example in the allocation of
huge pages although processes will also directly compact memory as required.
+compaction_proactiveness
+========================
+
+This tunable takes a value in the range [0, 100] with a default value of
+20. This tunable determines how aggressively compaction is done in the
+background. Writing a nonzero value to this tunable immediately
+triggers proactive compaction. Setting it to 0 disables proactive compaction.
+
+Note that compaction has a non-trivial system-wide impact as pages
+belonging to different processes are moved around, which could also lead
+to latency spikes in unsuspecting applications. The kernel employs
+various heuristics to avoid wasting CPU cycles if it detects that
+proactive compaction is not being effective.
+
+Be careful when setting it to extreme values like 100, as that may
+cause excessive background compaction activity.
compact_unevictable_allowed
===========================
@@ -128,6 +139,9 @@ allowed to examine the unevictable lru (mlocked pages) for pages to compact.
This should be used on systems where stalls for minor page faults are an
acceptable trade for large contiguous free memory. Set to 0 to prevent
compaction from moving pages that are unevictable. Default value is 1.
+On CONFIG_PREEMPT_RT the default value is 0 in order to avoid a page fault, due
+to compaction, which would block the task from becoming active until the fault
+is resolved.
dirty_background_bytes
@@ -408,7 +422,7 @@ While most applications need less than a thousand maps, certain
programs, particularly malloc debuggers, may consume lots of them,
e.g., up to one or two maps per allocation.
-The default value is 65536.
+The default value is 65530.
memory_failure_early_kill:
@@ -548,6 +562,43 @@ Change the minimum size of the hugepage pool.
See Documentation/admin-guide/mm/hugetlbpage.rst
+hugetlb_optimize_vmemmap
+========================
+
+This knob is not available when the size of 'struct page' (a structure defined
+in include/linux/mm_types.h) is not a power of two (an unusual system config could
+result in this).
+
+Enable (set to 1) or disable (set to 0) HugeTLB Vmemmap Optimization (HVO).
+
+Once enabled, the vmemmap pages of subsequently allocated HugeTLB pages from
+the buddy allocator will be optimized (7 pages per 2MB HugeTLB page and 4095
+pages per 1GB HugeTLB page), whereas already allocated HugeTLB pages will not
+be optimized. When those optimized HugeTLB pages are freed from the HugeTLB pool
+to the buddy allocator, the vmemmap pages representing that range need to be
+remapped again and the vmemmap pages discarded earlier need to be reallocated.
+If your use case is that HugeTLB pages are allocated 'on the fly' (e.g.
+never explicitly allocating HugeTLB pages with 'nr_hugepages' but only setting
+'nr_overcommit_hugepages', so that those overcommitted HugeTLB pages are
+allocated 'on the fly') instead of being pulled from the HugeTLB pool, you
+should weigh the benefits of memory savings against the additional overhead
+(~2x slower than before) of allocating or freeing HugeTLB pages between the
+HugeTLB pool and the buddy allocator. Another behavior to note is that if the
+system is under heavy memory pressure, it could prevent the user from freeing
+HugeTLB pages from the HugeTLB pool to the buddy allocator since the allocation
+of vmemmap pages could fail; you have to retry later if your system encounters
+this situation.
+
+Once disabled, the vmemmap pages of subsequent allocation of HugeTLB pages from
+buddy allocator will not be optimized meaning the extra overhead at allocation
+time from buddy allocator disappears, whereas already optimized HugeTLB pages
+will not be affected. If you want to make sure there are no optimized HugeTLB
+pages, you can set "nr_hugepages" to 0 first and then disable this. Note that
+writing 0 to nr_hugepages will make any "in use" HugeTLB pages become surplus
+pages. So, those surplus pages are still optimized until they are no longer
+in use. You would need to wait for those surplus pages to be released before
+there are no optimized pages in the system.
+
+
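The "7 pages per 2MB HugeTLB page and 4095 pages per 1GB HugeTLB page" figures quoted for hugetlb_optimize_vmemmap can be reproduced with a little arithmetic. The sketch below is only an illustration and rests on assumptions that the text does not state: a 4 KiB base page, a 64-byte 'struct page', and one vmemmap page retained per HugeTLB page. The function name is hypothetical.

```python
# Assumptions (not stated in the documentation text): 4 KiB base pages,
# a 64-byte struct page, and one vmemmap page kept per HugeTLB page.
BASE_PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
RETAINED_VMEMMAP_PAGES = 1


def vmemmap_pages_freed(hugepage_size: int) -> int:
    """Return how many vmemmap pages could be freed for one HugeTLB page."""
    # Number of base pages (and therefore struct page entries) covered.
    base_pages = hugepage_size // BASE_PAGE_SIZE
    # Bytes of struct page metadata describing those base pages.
    vmemmap_bytes = base_pages * STRUCT_PAGE_SIZE
    # Whole pages needed to hold that metadata.
    vmemmap_pages = vmemmap_bytes // BASE_PAGE_SIZE
    return vmemmap_pages - RETAINED_VMEMMAP_PAGES


if __name__ == "__main__":
    for label, size in (("2MB", 2 << 20), ("1GB", 1 << 30)):
        print(f"{label}: {vmemmap_pages_freed(size)} vmemmap pages freed")
```

Under these assumptions the savings are roughly 28 KiB per 2MB HugeTLB page and just under 16 MiB per 1GB HugeTLB page, which is the memory that has to be reallocated when such a page is returned to the buddy allocator.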
nr_hugepages_mempolicy
======================
@@ -580,7 +631,7 @@ trimming of allocations is initiated.
The default value is 1.
-See Documentation/nommu-mmap.txt for more information.
+See Documentation/admin-guide/mm/nommu-mmap.rst for more information.
numa_zonelist_order
@@ -707,7 +758,7 @@ and don't use much of it.
The default value is 0.
-See Documentation/vm/overcommit-accounting.rst and
+See Documentation/mm/overcommit-accounting.rst and
mm/util.c::__vm_enough_memory() for more information.
@@ -741,6 +792,14 @@ extra faults and I/O delays for following faults if they would have been part of
that consecutive pages readahead would have brought in.
+page_lock_unfairness
+====================
+
+This value determines the number of times that the page lock can be
+stolen from under a waiter. After the lock is stolen the number of times
+specified in this file (default is 5), the "fair lock handoff" semantics
+will apply, and the waiter will only be awakened if the lock can be taken.
+
panic_on_oom
============
@@ -770,22 +829,24 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.
-percpu_pagelist_fraction
-========================
+percpu_pagelist_high_fraction
+=============================
-This is the fraction of pages at most (high mark pcp->high) in each zone that
-are allocated for each per cpu page list. The min value for this is 8. It
-means that we don't allow more than 1/8th of pages in each zone to be
-allocated in any single per_cpu_pagelist. This entry only changes the value
-of hot per cpu pagelists. User can specify a number like 100 to allocate
-1/100th of each zone to each per cpu page list.
+This is the fraction of pages in each zone that can be stored on
+per-cpu page lists. It is an upper boundary that is divided depending
+on the number of online CPUs. The min value for this is 8 which means
+that we do not allow more than 1/8th of pages in each zone to be stored
+on per-cpu page lists. This entry only changes the value of hot per-cpu
+page lists. A user can specify a number like 100 to allocate 1/100th of
+each zone between per-cpu lists.
-The batch value of each per cpu pagelist is also updated as a result. It is
-set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8)
+The batch value of each per-cpu page list remains the same regardless of
+the value of the high fraction so allocation latencies are unaffected.
-The initial value is zero. Kernel does not use this value at boot time to set
-the high water marks for each per cpu page list. If the user writes '0' to this
-sysctl, it will revert to this default behavior.
+The initial value is zero. In that case the kernel sets the pcp->high
+mark based on the low watermark for the zone and the number of local
+online CPUs. If the user writes '0' to this sysctl, it will revert to
+this default behavior.
stat_interval
@@ -828,25 +889,46 @@ tooling to work, you can do::
swappiness
==========
-This control is used to define how aggressive the kernel will swap
-memory pages. Higher values will increase aggressiveness, lower values
-decrease the amount of swap. A value of 0 instructs the kernel not to
-initiate swap until the amount of free and file-backed pages is less
-than the high water mark in a zone.
+This control is used to define the rough relative IO cost of swapping
+and filesystem paging, as a value between 0 and 200. At 100, the VM
+assumes equal IO cost and will thus apply memory pressure to the page
+cache and swap-backed pages equally; lower values signify more
+expensive swap IO, higher values indicate cheaper swap IO.
+
+Keep in mind that filesystem IO patterns under memory pressure tend to
+be more efficient than swap's random IO. An optimal value will require
+experimentation and will also be workload-dependent.
The default value is 60.
+For in-memory swap, like zram or zswap, as well as hybrid setups that
+have swap on faster devices than the filesystem, values beyond 100 can
+be considered. For example, if the random IO against the swap device
+is on average 2x faster than IO from the filesystem, swappiness should
+be 133 (x + 2x = 200, 2x = 133.33).
+
+At 0, the kernel will not initiate swap until the amount of free and
+file-backed pages is less than the high watermark in a zone.
+
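The "x + 2x = 200" example in the swappiness text generalizes to any speed ratio. The following is a minimal sketch of that arithmetic, not a tuning recommendation; the function name and the ``swap_speedup`` parameter are hypothetical, and the measured ratio itself has to come from your own benchmarking.

```python
def swappiness_for_speed_ratio(swap_speedup: float) -> float:
    """Swappiness implied when swap IO is `swap_speedup` times faster than
    filesystem IO, following the x + r*x = 200 reasoning in the text."""
    if swap_speedup <= 0:
        raise ValueError("swap_speedup must be positive")
    # x: the share left for the filesystem side of the 0..200 scale.
    file_share = 200.0 / (1.0 + swap_speedup)
    # r * x: the share given to swap, which is the swappiness value.
    return swap_speedup * file_share


if __name__ == "__main__":
    for ratio in (1.0, 2.0):
        print(f"swap {ratio}x faster -> swappiness "
              f"{swappiness_for_speed_ratio(ratio):.2f}")
```

A ratio of 1 gives 100 (equal IO cost), and a ratio of 2 gives about 133, matching the figure quoted in the text.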
unprivileged_userfaultfd
========================
-This flag controls whether unprivileged users can use the userfaultfd
-system calls. Set this to 1 to allow unprivileged users to use the
-userfaultfd system calls, or set this to 0 to restrict userfaultfd to only
-privileged users (with SYS_CAP_PTRACE capability).
+This flag controls the mode in which unprivileged users can use the
+userfaultfd system calls. Set this to 0 to restrict unprivileged users
+to handle page faults in user mode only. In this case, users without
+CAP_SYS_PTRACE must pass UFFD_USER_MODE_ONLY in order for userfaultfd to
+succeed. Prohibiting use of userfaultfd for handling faults from kernel
+mode may make certain vulnerabilities more difficult to exploit.
-The default value is 1.
+Set this to 1 to allow unprivileged users to use the userfaultfd system
+calls without any restrictions.
+
+The default value is 0.
+Another way to control permissions for userfaultfd is to use
+/dev/userfaultfd instead of userfaultfd(2). See
+Documentation/admin-guide/mm/userfaultfd.rst.
user_reserve_kbytes
===================
@@ -898,12 +980,12 @@ allocations, THP and hugetlbfs pages.
To make it sensible with respect to the watermark_scale_factor
parameter, the unit is in fractions of 10,000. The default value of
-15,000 on !DISCONTIGMEM configurations means that up to 150% of the high
-watermark will be reclaimed in the event of a pageblock being mixed due
-to fragmentation. The level of reclaim is determined by the number of
-fragmentation events that occurred in the recent past. If this value is
-smaller than a pageblock then a pageblocks worth of pages will be reclaimed
-(e.g. 2MB on 64-bit x86). A boost factor of 0 will disable the feature.
+15,000 means that up to 150% of the high watermark will be reclaimed in the
+event of a pageblock being mixed due to fragmentation. The level of reclaim
+is determined by the number of fragmentation events that occurred in the
+recent past. If this value is smaller than a pageblock then a pageblock's
+worth of pages will be reclaimed (e.g. 2MB on 64-bit x86). A boost factor
+of 0 will disable the feature.
watermark_scale_factor
@@ -915,7 +997,7 @@ how much memory needs to be free before kswapd goes back to sleep.
The unit is in fractions of 10,000. The default value of 10 means the
distances between watermarks are 0.1% of the available memory in the
-node/system. The maximum value is 1000, or 10% of memory.
+node/system. The maximum value is 3000, or 30% of memory.
A high rate of threads entering direct reclaim (allocstall) or kswapd
going to sleep prematurely (kswapd_low_wmark_hit_quickly) can indicate
@@ -945,11 +1027,11 @@ that benefit from having their data cached, zone_reclaim_mode should be
left disabled as the caching effect is likely to be more important than
data locality.
-zone_reclaim may be enabled if it's known that the workload is partitioned
-such that each partition fits within a NUMA node and that accessing remote
-memory would cause a measurable performance reduction. The page allocator
-will then reclaim easily reusable pages (those page cache pages that are
-currently not used) before allocating off node pages.
+Consider enabling one or more zone_reclaim mode bits if it's known that the
+workload is partitioned such that each partition fits within a NUMA node
+and that accessing remote memory would cause a measurable performance
+reduction. The page allocator will take additional actions before
+allocating off node pages.
Allowing zone reclaim to write out pages stops processes that are
writing large amounts of data from dirtying pages on other nodes. Zone
diff --git a/Documentation/admin-guide/sysrq.rst b/Documentation/admin-guide/sysrq.rst
index 72b2cfb066f4..0a178ef0111d 100644
--- a/Documentation/admin-guide/sysrq.rst
+++ b/Documentation/admin-guide/sysrq.rst
@@ -48,9 +48,10 @@ always allowed (by a user with admin privileges).
How do I use the magic SysRq key?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-On x86 - You press the key combo :kbd:`ALT-SysRq-<command key>`.
+On x86
+ You press the key combo :kbd:`ALT-SysRq-<command key>`.
-.. note::
+ .. note::
Some
keyboards may not have a key labeled 'SysRq'. The 'SysRq' key is
also known as the 'Print Screen' key. Also some keyboards cannot
@@ -58,25 +59,28 @@ On x86 - You press the key combo :kbd:`ALT-SysRq-<command key>`.
have better luck with press :kbd:`Alt`, press :kbd:`SysRq`,
release :kbd:`SysRq`, press :kbd:`<command key>`, release everything.
-On SPARC - You press :kbd:`ALT-STOP-<command key>`, I believe.
+On SPARC
+ You press :kbd:`ALT-STOP-<command key>`, I believe.
On the serial console (PC style standard serial ports only)
You send a ``BREAK``, then within 5 seconds a command key. Sending
``BREAK`` twice is interpreted as a normal BREAK.
On PowerPC
- Press :kbd:`ALT - Print Screen` (or :kbd:`F13`) - :kbd:`<command key>`,
+ Press :kbd:`ALT - Print Screen` (or :kbd:`F13`) - :kbd:`<command key>`.
:kbd:`Print Screen` (or :kbd:`F13`) - :kbd:`<command key>` may suffice.
On other
If you know of the key combos for other architectures, please
- let me know so I can add them to this section.
+ submit a patch to be included in this section.
On all
- write a character to /proc/sysrq-trigger. e.g.::
+ Write a character to /proc/sysrq-trigger. e.g.::
echo t > /proc/sysrq-trigger
+The :kbd:`<command key>` is case sensitive.
+
What are the 'command' keys?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -86,8 +90,8 @@ Command Function
``b`` Will immediately reboot the system without syncing or unmounting
your disks.
-``c`` Will perform a system crash by a NULL pointer dereference.
- A crashdump will be taken if configured.
+``c`` Will perform a system crash and a crashdump will be taken
+ if configured.
``d`` Shows all locks that are held.
@@ -201,10 +205,12 @@ frozen (probably root) filesystem via the FIFREEZE ioctl.
Sometimes SysRq seems to get 'stuck' after using it, what can I do?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-That happens to me, also. I've found that tapping shift, alt, and control
-on both sides of the keyboard, and hitting an invalid sysrq sequence again
-will fix the problem. (i.e., something like :kbd:`alt-sysrq-z`). Switching to
-another virtual console (:kbd:`ALT+Fn`) and then back again should also help.
+When this happens, try tapping shift, alt and control on both sides of the
+keyboard, and hitting an invalid sysrq sequence again (e.g.
+:kbd:`alt-sysrq-z`).
+
+Switching to another virtual console (:kbd:`ALT+Fn`) and then back again
+should also help.
I hit SysRq, but nothing seems to happen, what's wrong?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -229,13 +235,13 @@ prints help, and C) an action_msg string, that will print right before your
handler is called. Your handler must conform to the prototype in 'sysrq.h'.
After the ``sysrq_key_op`` is created, you can call the kernel function
-``register_sysrq_key(int key, struct sysrq_key_op *op_p);`` this will
+``register_sysrq_key(int key, const struct sysrq_key_op *op_p);`` this will
register the operation pointed to by ``op_p`` at table key 'key',
if that slot in the table is blank. At module unload time, you must call
-the function ``unregister_sysrq_key(int key, struct sysrq_key_op *op_p)``, which
-will remove the key op pointed to by 'op_p' from the key 'key', if and only if
-it is currently registered in that slot. This is in case the slot has been
-overwritten since you registered it.
+the function ``unregister_sysrq_key(int key, const struct sysrq_key_op *op_p)``,
+which will remove the key op pointed to by 'op_p' from the key 'key', if and
+only if it is currently registered in that slot. This is in case the slot has
+been overwritten since you registered it.
The Magic SysRQ system works by registering key operations against a key op
lookup table, which is defined in 'drivers/tty/sysrq.c'. This key table has
@@ -282,7 +288,7 @@ Just ask them on the linux-kernel mailing list:
Credits
~~~~~~~
-Written by Mydraal <vulpyne@vulpyne.net>
-Updated by Adam Sulmicki <adam@cfar.umd.edu>
-Updated by Jeremy M. Dolan <jmd@turbogeek.org> 2001/01/28 10:15:59
-Added to by Crutcher Dunnavant <crutcher+kernel@datastacks.com>
+- Written by Mydraal <vulpyne@vulpyne.net>
+- Updated by Adam Sulmicki <adam@cfar.umd.edu>
+- Updated by Jeremy M. Dolan <jmd@turbogeek.org> 2001/01/28 10:15:59
+- Added to by Crutcher Dunnavant <crutcher+kernel@datastacks.com>
diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
index 71e9184a9079..92a8a07f5c43 100644
--- a/Documentation/admin-guide/tainted-kernels.rst
+++ b/Documentation/admin-guide/tainted-kernels.rst
@@ -38,7 +38,7 @@ either letters or blanks. In above example it looks like this::
Tainted: P W O
-The meaning of those characters is explained in the table below. In tis case
+The meaning of those characters is explained in the table below. In this case
the kernel got tainted earlier because a proprietary Module (``P``) was loaded,
a warning occurred (``W``), and an externally-built module was loaded (``O``).
To decode other letters use the table below.
@@ -61,7 +61,7 @@ this on the machine that had the statements in the logs that were quoted earlier
* Proprietary module was loaded (#0)
* Kernel issued warning (#9)
* Externally-built ('out-of-tree') module was loaded (#12)
- See Documentation/admin-guide/tainted-kernels.rst in the the Linux kernel or
+ See Documentation/admin-guide/tainted-kernels.rst in the Linux kernel or
https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html for
a more details explanation of the various taint flags.
Raw taint value as int/string: 4609/'P W O '
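The raw taint value is a bit field, so the example value 4609 can be split into the bit numbers listed by the script (#0, #9 and #12). The snippet below is a small illustrative sketch of that decoding; the function name is hypothetical and it is not how the kernel's own tooling is implemented.

```python
def set_taint_bits(taint_value: int) -> list[int]:
    """Return the bit numbers that are set in a raw taint value."""
    return [bit for bit in range(taint_value.bit_length())
            if taint_value & (1 << bit)]


if __name__ == "__main__":
    # 4609 is the raw value from the example output above: 1 + 512 + 4096.
    print(set_taint_bits(4609))  # → [0, 9, 12]
```

Each returned bit number can then be looked up in the "Bit" column of the taint table to find the corresponding letter and reason.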
@@ -84,7 +84,7 @@ Bit Log Number Reason that got the kernel tainted
=== === ====== ========================================================
0 G/P 1 proprietary module was loaded
1 _/F 2 module was force loaded
- 2 _/S 4 SMP kernel oops on an officially SMP incapable processor
+ 2 _/S 4 kernel running on an out of specification system
3 _/R 8 module was force unloaded
4 _/M 16 processor reported a Machine Check Exception (MCE)
5 _/B 32 bad page referenced or some unexpected page flags
@@ -100,6 +100,7 @@ Bit Log Number Reason that got the kernel tainted
15 _/K 32768 kernel has been live patched
16 _/X 65536 auxiliary taint, defined for and used by distros
17 _/T 131072 kernel was built with the struct randomization plugin
+ 18 _/N 262144 an in-kernel test has been run
=== === ====== ========================================================
Note: The character ``_`` is representing a blank in this table to make reading
@@ -116,10 +117,29 @@ More detailed explanation for tainting
1) ``F`` if any module was force loaded by ``insmod -f``, ``' '`` if all
modules were loaded normally.
- 2) ``S`` if the oops occurred on an SMP kernel running on hardware that
- hasn't been certified as safe to run multiprocessor.
- Currently this occurs only on various Athlons that are not
- SMP capable.
+ 2) ``S`` if the kernel is running on a processor or system that is out of
+ specification: hardware has been put into an unsupported configuration,
+ therefore proper execution cannot be guaranteed.
+ Kernel will be tainted if, for example:
+
+ - on x86: PAE is forced through forcepae on Intel CPUs (such as Pentium M)
+ which do not report PAE but may have a functional implementation, an SMP
+ kernel is running on Athlon CPUs that are not officially SMP capable, MSRs
+ are being poked at from userspace.
+ - on arm: kernel running on certain CPUs (such as Keystone 2) without
+ having certain kernel features enabled.
+ - on arm64: there are mismatched hardware features between CPUs, the
+ bootloader has booted CPUs in different modes.
+ - certain drivers are being used on unsupported architectures (such as
+ scsi/snic on anything other than x86_64, scsi/ips on non
+ x86/x86_64/itanium, broken firmware settings for the
+ irqchip/irq-gic on arm64 ...).
+ - x86/x86_64: Microcode late loading is dangerous and will result in
+ tainting the kernel. It requires that all CPUs rendezvous to make sure
+ the update happens when the system is as quiescent as possible. However,
+ a higher priority MCE/SMI/NMI can move control flow away from that
+ rendezvous and interrupt the update, which can be detrimental to the
+ machine.
3) ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all
modules were unloaded normally.
@@ -130,7 +150,7 @@ More detailed explanation for tainting
5) ``B`` If a page-release function has found a bad page reference or some
unexpected page flags. This indicates a hardware problem or a kernel bug;
there should be other information in the log indicating why this tainting
- occured.
+ occurred.
6) ``U`` if a user or user application specifically requested that the
Tainted flag be set, ``' '`` otherwise.
diff --git a/Documentation/admin-guide/thunderbolt.rst b/Documentation/admin-guide/thunderbolt.rst
index 10c4f0ce2ad0..2ed79f41a411 100644
--- a/Documentation/admin-guide/thunderbolt.rst
+++ b/Documentation/admin-guide/thunderbolt.rst
@@ -47,6 +47,9 @@ be DMA masters and thus read contents of the host memory without CPU and OS
knowing about it. There are ways to prevent this by setting up an IOMMU but
it is not always available for various reasons.
+Some USB4 systems have a BIOS setting to disable PCIe tunneling. This is
+treated as another security level (nopcie).
+
The security levels are as follows:
none
@@ -77,6 +80,10 @@ The security levels are as follows:
Display Port in a dock. All PCIe links downstream of the dock are
removed.
+ nopcie
+ PCIe tunneling is disabled/forbidden from the BIOS. Available in some
+ USB4 systems.
+
The current security level can be read from
``/sys/bus/thunderbolt/devices/domainX/security`` where ``domainX`` is
the Thunderbolt domain the host controller manages. There is typically
@@ -153,6 +160,22 @@ If the user still wants to connect the device they can either approve
the device without a key or write a new key and write 1 to the
``authorized`` file to get the new key stored on the device NVM.
+De-authorizing devices
+----------------------
+It is possible to de-authorize devices by writing ``0`` to their
+``authorized`` attribute. This requires support from the connection
+manager implementation and can be checked by reading the domain
+``deauthorization`` attribute. If it reads ``1`` then the feature is
+supported.
+
+When a device is de-authorized the PCIe tunnel from the parent device
+PCIe downstream (or root) port to the device PCIe upstream port is torn
+down. This is essentially the same thing as PCIe hot-remove and the PCIe
+topology in question will not be accessible anymore until the device is
+authorized again. If there is storage such as NVMe or similar involved,
+there is a risk of data loss if the filesystem on that storage is not
+properly shut down. You have been warned!
+
DMA protection utilizing IOMMU
------------------------------
Recent systems from 2018 and forward with Thunderbolt ports may natively
@@ -173,8 +196,8 @@ following ``udev`` rule::
ACTION=="add", SUBSYSTEM=="thunderbolt", ATTRS{iommu_dma_protection}=="1", ATTR{authorized}=="0", ATTR{authorized}="1"
-Upgrading NVM on Thunderbolt device or host
--------------------------------------------
+Upgrading NVM on Thunderbolt device, host or retimer
+----------------------------------------------------
Since most of the functionality is handled in firmware running on a
host controller or a device, it is important that the firmware can be
upgraded to the latest where possible bugs in it have been fixed.
@@ -185,9 +208,10 @@ for some machines:
`Thunderbolt Updates <https://thunderbolttechnology.net/updates>`_
-Before you upgrade firmware on a device or host, please make sure it is a
-suitable upgrade. Failing to do that may render the device (or host) in a
-state where it cannot be used properly anymore without special tools!
+Before you upgrade firmware on a device, host or retimer, please make
+sure it is a suitable upgrade. Failing to do that may render the device
+in a state where it cannot be used properly anymore without special
+tools!
Host NVM upgrade on Apple Macs is not supported.
@@ -232,6 +256,35 @@ Note names of the NVMem devices ``nvm_activeN`` and ``nvm_non_activeN``
depend on the order they are registered in the NVMem subsystem. N in
the name is the identifier added by the NVMem subsystem.
+Upgrading on-board retimer NVM when there is no cable connected
+---------------------------------------------------------------
+If the platform supports it, it may be possible to upgrade the retimer NVM
+firmware even when there is nothing connected to the USB4
+ports. When this is the case the ``usb4_portX`` devices have two special
+attributes: ``offline`` and ``rescan``. The way to upgrade the firmware
+is to first put the USB4 port into offline mode::
+
+ # echo 1 > /sys/bus/thunderbolt/devices/0-0/usb4_port1/offline
+
+This step makes sure the port does not respond to any hotplug events,
+and also ensures the retimers are powered on. The next step is to scan
+for the retimers::
+
+ # echo 1 > /sys/bus/thunderbolt/devices/0-0/usb4_port1/rescan
+
+This enumerates and adds the on-board retimers. Now the retimer NVM can
+be upgraded in the same way as with a cable connected (see previous
+section). However, the retimer is not disconnected (as we are in
+offline mode), so after writing ``1`` to ``nvm_authenticate`` one should
+wait for 5 or more seconds before running rescan again::
+
+ # echo 1 > /sys/bus/thunderbolt/devices/0-0/usb4_port1/rescan
+
+At this point, if everything went fine, the port can be put back into
+a functional state::
+
+ # echo 0 > /sys/bus/thunderbolt/devices/0-0/usb4_port1/offline
+
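The offline sequence added above can be sketched as a pair of shell helpers (an illustration only, not part of the patch). The ``PORT`` default, the ``SETTLE`` variable and the function names are assumptions made for this sketch. The NVM write and the ``nvm_authenticate`` step described in the previous section of the document go between the two calls.

```shell
#!/bin/sh
# Hypothetical defaults; adjust PORT to the USB4 port being upgraded.
PORT=${PORT:-/sys/bus/thunderbolt/devices/0-0/usb4_port1}
# Seconds to wait after nvm_authenticate (the text asks for 5 or more).
SETTLE=${SETTLE:-5}

# Put the port into offline mode and enumerate the on-board retimers.
begin_offline() {
    # Both attributes exist only when the platform supports this mode.
    if [ ! -e "$PORT/offline" ] || [ ! -e "$PORT/rescan" ]; then
        return 1
    fi
    echo 1 > "$PORT/offline" && echo 1 > "$PORT/rescan"
}

# Wait, rescan the retimers, then return the port to normal operation.
end_offline() {
    sleep "$SETTLE"
    echo 1 > "$PORT/rescan" && echo 0 > "$PORT/offline"
}
```

Run as root: call ``begin_offline``, upgrade and authenticate the retimer NVM, then call ``end_offline``.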
Upgrading NVM when host controller is in safe mode
--------------------------------------------------
If the existing NVM is not properly authenticated (or is missing) the
diff --git a/Documentation/admin-guide/unicode.rst b/Documentation/admin-guide/unicode.rst
index 7425a3351321..290fe83ebe82 100644
--- a/Documentation/admin-guide/unicode.rst
+++ b/Documentation/admin-guide/unicode.rst
@@ -114,7 +114,7 @@ Unicode practice.
This range is now officially managed by the ConScript Unicode
Registry. The normative reference is at:
- http://www.evertype.com/standards/csur/klingon.html
+ https://www.evertype.com/standards/csur/klingon.html
Klingon has an alphabet of 26 characters, a positional numeric writing
system with 10 digits, and is written left-to-right, top-to-bottom.
@@ -178,7 +178,7 @@ fictional and artificial scripts has been established by John Cowan
<jcowan@reutershealth.com> and Michael Everson <everson@evertype.com>.
The ConScript Unicode Registry is accessible at:
- http://www.evertype.com/standards/csur/
+ https://www.evertype.com/standards/csur/
The ranges used fall at the low end of the End User Zone and can hence
not be normatively assigned, but it is recommended that people who
diff --git a/Documentation/admin-guide/wimax/i2400m.rst b/Documentation/admin-guide/wimax/i2400m.rst
deleted file mode 100644
index 194388c0c351..000000000000
--- a/Documentation/admin-guide/wimax/i2400m.rst
+++ /dev/null
@@ -1,283 +0,0 @@
-.. include:: <isonum.txt>
-
-====================================================
-Driver for the Intel Wireless Wimax Connection 2400m
-====================================================
-
-:Copyright: |copy| 2008 Intel Corporation < linux-wimax@intel.com >
-
- This provides a driver for the Intel Wireless WiMAX Connection 2400m
- and a basic Linux kernel WiMAX stack.
-
-1. Requirements
-===============
-
- * Linux installation with Linux kernel 2.6.22 or newer (if building
- from a separate tree)
- * Intel i2400m Echo Peak or Baxter Peak; this includes the Intel
- Wireless WiMAX/WiFi Link 5x50 series.
- * build tools:
-
- + Linux kernel development package for the target kernel; to
- build against your currently running kernel, you need to have
- the kernel development package corresponding to the running
- image installed (usually if your kernel is named
- linux-VERSION, the development package is called
- linux-dev-VERSION or linux-headers-VERSION).
- + GNU C Compiler, make
-
-2. Compilation and installation
-===============================
-
-2.1. Compilation of the drivers included in the kernel
-------------------------------------------------------
-
- Configure the kernel; to enable the WiMAX drivers select Drivers >
- Networking Drivers > WiMAX device support. Enable all of them as
- modules (easier).
-
- If USB or SDIO are not enabled in the kernel configuration, the options
- to build the i2400m USB or SDIO drivers will not show. Enable said
- subsystems and go back to the WiMAX menu to enable the drivers.
-
- Compile and install your kernel as usual.
-
-2.2. Compilation of the drivers distributed as a standalone module
--------------------------------------------------------------------
-
- To compile::
-
- $ cd source/directory
- $ make
-
- Once built you can load and unload using the provided load.sh script;
- load.sh will load the modules, load.sh u will unload them.
-
- To install in the default kernel directories (and enable auto loading
- when the device is plugged)::
-
- $ make install
- $ depmod -a
-
- If your kernel development files are located in a non standard
- directory or if you want to build for a kernel that is not the
- currently running one, set KDIR to the right location::
-
- $ make KDIR=/path/to/kernel/dev/tree
-
- For more information, please contact linux-wimax@intel.com.
-
-3. Installing the firmware
---------------------------
-
- The firmware can be obtained from http://linuxwimax.org or might have
- been supplied with your hardware.
-
- It has to be installed in the target system::
-
- $ cp FIRMWAREFILE.sbcf /lib/firmware/i2400m-fw-BUSTYPE-1.3.sbcf
-
- * NOTE: if your firmware came in an .rpm or .deb file, just install
- it as normal, with the rpm (rpm -i FIRMWARE.rpm) or dpkg
- (dpkg -i FIRMWARE.deb) commands. No further action is needed.
- * BUSTYPE will be usb or sdio, depending on the hardware you have.
- Each hardware type comes with its own firmware and will not work
- with other types.
-
-4. Design
-=========
-
- This package contains two major parts: a WiMAX kernel stack and a
- driver for the Intel i2400m.
-
- The WiMAX stack is designed to provide for common WiMAX control
- services to current and future WiMAX devices from any vendor; please
- see README.wimax for details.
-
- The i2400m kernel driver is broken up in two main parts: the bus
- generic driver and the bus-specific drivers. The bus generic driver
- forms the driver core and contains no knowledge of the actual method we
- use to connect to the device. The bus specific drivers are just the
- glue to connect the bus-generic driver and the device. Currently only
- USB and SDIO are supported. See drivers/net/wimax/i2400m/i2400m.h for
- more information.
-
- The bus generic driver is logically broken up in two parts: OS-glue and
- hardware-glue. The OS-glue interfaces with Linux. The hardware-glue
- interfaces with the device using an interface provided by the
- bus-specific driver. The reason for this breakup is to be able to
- easily reuse the hardware-glue to write drivers for other OSes; note
- the hardware glue part is written as a native Linux driver; no
- abstraction layers are used, so to port to another OS, the Linux kernel
- API calls should be replaced with the target OS's.
-
-5. Usage
-========
-
- To load the driver, follow the instructions in the install section;
- once the driver is loaded, plug in the device (unless it is permanently
- plugged in). The driver will enumerate the device, upload the firmware
- and output messages in the kernel log (dmesg, /var/log/messages or
- /var/log/kern.log) such as::
-
- ...
- i2400m_usb 5-4:1.0: firmware interface version 8.0.0
- i2400m_usb 5-4:1.0: WiMAX interface wmx0 (00:1d:e1:01:94:2c) ready
-
- At this point the device is ready to work.
-
- Current versions require the Intel WiMAX Network Service in userspace
- to make things work. See the network service's README for instructions
- on how to scan, connect and disconnect.
-
-5.1. Module parameters
-----------------------
-
- Module parameters can be set at kernel or module load time or by
- echoing values::
-
- $ echo VALUE > /sys/module/MODULENAME/parameters/PARAMETERNAME
-
- To make changes permanent, for example, for the i2400m module, you can
- also create a file named /etc/modprobe.d/i2400m containing::
-
- options i2400m idle_mode_disabled=1
-
- To find which parameters are supported by a module, run::
-
- $ modinfo path/to/module.ko
-
- During kernel bootup (if the driver is linked in the kernel), specify
- the following to the kernel command line::
-
- i2400m.PARAMETER=VALUE
-
-5.1.1. i2400m: idle_mode_disabled
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- The i2400m module supports a parameter to disable idle mode. This
- parameter, once set, will take effect only when the device is
- reinitialized by the driver (eg: following a reset or a reconnect).
-
-5.2. Debug operations: debugfs entries
---------------------------------------
-
- The driver will register debugfs entries that allow the user to tweak
- debug settings. There are three main container directories where
- entries are placed, which correspond to the three blocks a i2400m WiMAX
- driver has:
-
- * /sys/kernel/debug/wimax:DEVNAME/ for the generic WiMAX stack
- controls
- * /sys/kernel/debug/wimax:DEVNAME/i2400m for the i2400m generic
- driver controls
- * /sys/kernel/debug/wimax:DEVNAME/i2400m-usb (or -sdio) for the
- bus-specific i2400m-usb or i2400m-sdio controls.
-
- Of course, if debugfs is mounted in a directory other than
- /sys/kernel/debug, those paths will change.
-
-5.2.1. Increasing debug output
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- The files named *dl_* indicate knobs for controlling the debug output
- of different submodules::
-
- # find /sys/kernel/debug/wimax\:wmx0 -name \*dl_\*
- /sys/kernel/debug/wimax:wmx0/i2400m-usb/dl_tx
- /sys/kernel/debug/wimax:wmx0/i2400m-usb/dl_rx
- /sys/kernel/debug/wimax:wmx0/i2400m-usb/dl_notif
- /sys/kernel/debug/wimax:wmx0/i2400m-usb/dl_fw
- /sys/kernel/debug/wimax:wmx0/i2400m-usb/dl_usb
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_tx
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_rx
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_rfkill
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_netdev
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_fw
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_debugfs
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_driver
- /sys/kernel/debug/wimax:wmx0/i2400m/dl_control
- /sys/kernel/debug/wimax:wmx0/wimax_dl_stack
- /sys/kernel/debug/wimax:wmx0/wimax_dl_op_rfkill
- /sys/kernel/debug/wimax:wmx0/wimax_dl_op_reset
- /sys/kernel/debug/wimax:wmx0/wimax_dl_op_msg
- /sys/kernel/debug/wimax:wmx0/wimax_dl_id_table
- /sys/kernel/debug/wimax:wmx0/wimax_dl_debugfs
-
- By reading the file you can obtain the current value of said debug
- level; by writing to it, you can set it.
-
- To increase the debug level of, for example, the i2400m's generic TX
- engine, just write::
-
- $ echo 3 > /sys/kernel/debug/wimax:wmx0/i2400m/dl_tx
-
- Increasing numbers yield increasing debug information; for details of
- what is printed and the available levels, check the source. The code
- uses 0 for disabled and increasing values until 8.
-
-5.2.2. RX and TX statistics
-^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- The i2400m/rx_stats and i2400m/tx_stats provide statistics about the
- data reception/delivery from the device::
-
- $ cat /sys/kernel/debug/wimax:wmx0/i2400m/rx_stats
- 45 1 3 34 3104 48 480
-
- The numbers reported are:
-
- * packets/RX-buffer: total, min, max
- * RX-buffers: total RX buffers received, accumulated RX buffer size
- in bytes, min size received, max size received
-
- Thus, to find the average buffer size received, divide accumulated
- RX-buffer / total RX-buffers.
-
- To clear the statistics back to 0, write anything to the rx_stats file::
-
- $ echo 1 > /sys/kernel/debug/wimax:wmx0/i2400m/rx_stats
-
- Likewise for TX.
-
- Note the packets this debug file refers to are not network packets, but
- packets in the sense of the device-specific protocol for communication
- to the host. See drivers/net/wimax/i2400m/tx.c.
-
-5.2.3. Tracing messages received from user space
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- To echo messages received from user space into the trace pipe that the
- i2400m driver creates, set the debug file i2400m/trace_msg_from_user to
- 1::
-
- $ echo 1 > /sys/kernel/debug/wimax:wmx0/i2400m/trace_msg_from_user
-
-5.2.4. Performing a device reset
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- By writing a 0, a 1 or a 2 to the file
- /sys/kernel/debug/wimax:wmx0/reset, the driver performs a warm (without
- disconnecting from the bus), cold (disconnecting from the bus) or bus
- (bus specific) reset on the device.
-
-5.2.5. Asking the device to enter power saving mode
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- By writing any value to the /sys/kernel/debug/wimax:wmx0 file, the
- device will attempt to enter power saving mode.
-
-6. Troubleshooting
-==================
-
-6.1. Driver complains about ``i2400m-fw-usb-1.3.sbcf: request failed``
----------------------------------------------------------------------
-
- If upon connecting the device, the following is output in the kernel
- log::
-
- i2400m_usb 5-4:1.0: fw i2400m-fw-usb-1.3.sbcf: request failed: -2
-
- This means that the driver cannot locate the firmware file named
- /lib/firmware/i2400m-fw-usb-1.3.sbcf. Check that the file is present in
- the right location.
diff --git a/Documentation/admin-guide/wimax/index.rst b/Documentation/admin-guide/wimax/index.rst
deleted file mode 100644
index fdf7c1f99ff5..000000000000
--- a/Documentation/admin-guide/wimax/index.rst
+++ /dev/null
@@ -1,19 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===============
-WiMAX subsystem
-===============
-
-.. toctree::
- :maxdepth: 2
-
- wimax
-
- i2400m
-
-.. only:: subproject and html
-
- Indices
- =======
-
- * :ref:`genindex`
diff --git a/Documentation/admin-guide/wimax/wimax.rst b/Documentation/admin-guide/wimax/wimax.rst
deleted file mode 100644
index 817ee8ba2732..000000000000
--- a/Documentation/admin-guide/wimax/wimax.rst
+++ /dev/null
@@ -1,89 +0,0 @@
-.. include:: <isonum.txt>
-
-========================
-Linux kernel WiMAX stack
-========================
-
-:Copyright: |copy| 2008 Intel Corporation < linux-wimax@intel.com >
-
- This provides a basic Linux kernel WiMAX stack to provide a common
- control API for WiMAX devices, usable from kernel and user space.
-
-1. Design
-=========
-
- The WiMAX stack is designed to provide for common WiMAX control
- services to current and future WiMAX devices from any vendor.
-
- Because currently there is only one and we don't know what would be the
- common services, the APIs it currently provides are very minimal.
- However, it is done in such a way that it is easily extensible to
- accommodate future requirements.
-
- The stack works by embedding a struct wimax_dev in your device's
- control structures. This provides a set of callbacks that the WiMAX
- stack will call in order to implement control operations requested by
- the user. As well, the stack provides API functions that the driver
- calls to notify about changes of state in the device.
-
- The stack exports the API calls needed to control the device to user
- space using generic netlink as a marshalling mechanism. You can access
- them using your own code or use the wrappers provided for your
- convenience in libwimax (in the wimax-tools package).
-
- For detailed information on the stack, please see
- include/linux/wimax.h.
-
-2. Usage
-========
-
- For usage in a driver (registration, API, etc) please refer to the
- instructions in the header file include/linux/wimax.h.
-
- When a device is registered with the WiMAX stack, a set of debugfs
- files will appear in /sys/kernel/debug/wimax:wmxX that can be tweaked
- for control.
-
-2.1. Obtaining debug information: debugfs entries
--------------------------------------------------
-
- The WiMAX stack is compiled, by default, with debug messages that can
- be used to diagnose issues. By default, said messages are disabled.
-
- The drivers will register debugfs entries that allow the user to tweak
- debug settings.
-
- Each driver, when registering with the stack, will cause a debugfs
- directory named wimax:DEVICENAME to be created; optionally, it might
- create more subentries below it.
-
-2.1.1. Increasing debug output
-------------------------------
-
- The files named *dl_* indicate knobs for controlling the debug output
- of different submodules of the WiMAX stack::
-
- # find /sys/kernel/debug/wimax\:wmx0 -name \*dl_\*
- /sys/kernel/debug/wimax:wmx0/wimax_dl_stack
- /sys/kernel/debug/wimax:wmx0/wimax_dl_op_rfkill
- /sys/kernel/debug/wimax:wmx0/wimax_dl_op_reset
- /sys/kernel/debug/wimax:wmx0/wimax_dl_op_msg
- /sys/kernel/debug/wimax:wmx0/wimax_dl_id_table
- /sys/kernel/debug/wimax:wmx0/wimax_dl_debugfs
- /sys/kernel/debug/wimax:wmx0/.... # other driver specific files
-
- NOTE:
- Of course, if debugfs is mounted in a directory other than
- /sys/kernel/debug, those paths will change.
-
- By reading the file you can obtain the current value of said debug
- level; by writing to it, you can set it.
-
- To increase the debug level of, for example, the id-table submodule,
- just write::
-
- $ echo 3 > /sys/kernel/debug/wimax:wmx0/wimax_dl_id_table
-
- Increasing numbers yield increasing debug information; for details of
- what is printed and the available levels, check the source. The code
- uses 0 for disabled and increasing values until 8.
diff --git a/Documentation/admin-guide/xfs.rst b/Documentation/admin-guide/xfs.rst
index ad911be5b5e9..8de008c0c5ad 100644
--- a/Documentation/admin-guide/xfs.rst
+++ b/Documentation/admin-guide/xfs.rst
@@ -133,7 +133,7 @@ When mounting an XFS filesystem, the following options are accepted.
logbsize must be an integer multiple of the log
stripe unit configured at **mkfs(8)** time.
- The default value for for version 1 logs is 32768, while the
+ The default value for version 1 logs is 32768, while the
default value for version 2 logs is MAX(32768, log_sunit).
logdev=device and rtdev=device
@@ -210,6 +210,28 @@ When mounting an XFS filesystem, the following options are accepted.
inconsistent namespace presentation during or after a
failover event.
+Deprecation of V4 Format
+========================
+
+The V4 filesystem format lacks certain features that are supported by
+the V5 format, such as metadata checksumming, strengthened metadata
+verification, and the ability to store timestamps past the year 2038.
+Because of this, the V4 format is deprecated. All users should upgrade
+by backing up their files, reformatting, and restoring from the backup.
+
+Administrators and users can detect a V4 filesystem by running xfs_info
+against a filesystem mountpoint and checking for a string containing
+"crc=". If no such string is found, please upgrade xfsprogs to the
+latest version and try again.
+
+The deprecation will take place in two parts. Support for mounting V4
+filesystems can now be disabled at kernel build time via a Kconfig option.
+The option will default to yes until September 2025, at which time it
+will be changed to default to no. In September 2030, support will be
+removed from the codebase entirely.
+
+Note: Distributors may choose to withdraw V4 format support earlier than
+the dates listed above.
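A minimal sketch of the detection step described in the added section (not part of the patch). It assumes that ``xfs_info`` prints ``crc=0`` for a V4 filesystem and ``crc=1`` for a V5 filesystem. The function name ``classify_xfs_format`` is hypothetical. It reads ``xfs_info`` output on standard input, so it can be tried without a mounted filesystem.

```shell
#!/bin/sh
# Classify an XFS filesystem from xfs_info output read on stdin.
# Typical use (the mountpoint is an example): xfs_info /mnt | classify_xfs_format
classify_xfs_format() {
    out=$(cat)
    case $out in
        *crc=1*) echo "V5" ;;
        *crc=0*) echo "V4" ;;
        # No "crc=" string at all: xfsprogs is probably too old to report it.
        *) echo "unknown" ;;
    esac
}
```

An ``unknown`` result corresponds to the case where the text advises upgrading xfsprogs and trying again.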
Deprecated Mount Options
========================
@@ -217,6 +239,9 @@ Deprecated Mount Options
 ===========================  ================
 Name                         Removal Schedule
 ===========================  ================
+Mounting with V4 filesystem  September 2030
+ikeep/noikeep                September 2025
+attr2/noattr2                September 2025
 ===========================  ================
@@ -259,6 +284,9 @@ The following sysctls are available for the XFS filesystem:
removes unused preallocation from clean inodes and releases
the unused space back to the free pool.
+ fs.xfs.speculative_cow_prealloc_lifetime
+ This is an alias for speculative_prealloc_lifetime.
+
fs.xfs.error_level (Min: 0 Default: 3 Max: 11)
A volume knob for error reporting when internal errors occur.
This will generate detailed messages & backtraces for filesystem
@@ -331,7 +359,13 @@ The following sysctls are available for the XFS filesystem:
Deprecated Sysctls
==================
-None at present.
+===========================================  ================
+Name                                         Removal Schedule
+===========================================  ================
+fs.xfs.irix_sgid_inherit                     September 2025
+fs.xfs.irix_symlink_mode                     September 2025
+fs.xfs.speculative_cow_prealloc_lifetime     September 2025
+===========================================  ================
Removed Sysctls
@@ -465,3 +499,45 @@ the class and error context. For example, the default values for
"metadata/ENODEV" are "0" rather than "-1" so that this error handler defaults
to "fail immediately" behaviour. This is done because ENODEV is a fatal,
unrecoverable error no matter how many times the metadata IO is retried.
+
+Workqueue Concurrency
+=====================
+
+XFS uses kernel workqueues to parallelize metadata update processes. This
+enables it to take advantage of storage hardware that can service many IO
+operations simultaneously. This interface exposes internal implementation
+details of XFS, and as such is explicitly not part of any userspace API/ABI
+guarantee the kernel may give userspace. These are undocumented features of
+the generic workqueue implementation XFS uses for concurrency, and they are
+provided here purely for diagnostic and tuning purposes and may change at any
+time in the future.
+
+The control knobs for a filesystem's workqueues are organized by task at hand
+and the short name of the data device. They all can be found in:
+
+ /sys/bus/workqueue/devices/${task}!${device}
+
+================  ===========
+  Task            Description
+================  ===========
+  xfs_iwalk-$pid  Inode scans of the entire filesystem. Currently limited to
+                  mount time quotacheck.
+  xfs-gc          Background garbage collection of disk space that has been
+                  speculatively allocated beyond EOF or for staging copy on
+                  write operations.
+================  ===========
+
+For example, the knobs for the quotacheck workqueue for /dev/nvme0n1 would be
+found in /sys/bus/workqueue/devices/xfs_iwalk-1111!nvme0n1/.
+
+The interesting knobs for XFS workqueues are as follows:
+
+============  ===========
+  Knob        Description
+============  ===========
+  max_active  Maximum number of background threads that can be started to
+              run the work.
+  cpumask     CPUs upon which the threads are allowed to run.
+  nice        Relative priority of scheduling the threads. These are the
+              same nice levels that can be applied to userspace processes.
+============  ===========
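As a rough illustration of how the workqueue knobs described above could be inspected (not part of the patch), the sketch below prints the three listed knobs for every XFS workqueue of one data device. The function name ``show_xfs_wq_knobs`` and the ``WQ_ROOT`` override are assumptions made for this sketch. Only the ``/sys/bus/workqueue/devices/${task}!${device}`` layout comes from the text.

```shell
#!/bin/sh
# Directory holding the workqueue devices; overridable for dry runs.
WQ_ROOT=${WQ_ROOT:-/sys/bus/workqueue/devices}

# Print max_active, cpumask and nice for each "<task>!<device>" workqueue.
show_xfs_wq_knobs() {
    dev=$1    # short name of the data device, e.g. nvme0n1
    for dir in "$WQ_ROOT"/*'!'"$dev"; do
        # An unmatched glob stays literal; skip it.
        [ -d "$dir" ] || continue
        for knob in max_active cpumask nice; do
            if [ -r "$dir/$knob" ]; then
                printf '%s/%s: %s\n' "${dir##*/}" "$knob" "$(cat "$dir/$knob")"
            fi
        done
    done
}
```

For example, ``show_xfs_wq_knobs nvme0n1`` lists the knobs of whichever workqueues exist for that device at the time. As the text notes, these files are not a stable interface and may change.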