aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/x86
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/x86')
-rw-r--r--Documentation/x86/00-INDEX20
-rw-r--r--Documentation/x86/boot.txt32
-rw-r--r--Documentation/x86/earlyprintk.txt25
-rw-r--r--Documentation/x86/intel_rdt_ui.txt22
-rw-r--r--Documentation/x86/pat.txt4
-rw-r--r--Documentation/x86/x86_64/00-INDEX16
-rw-r--r--Documentation/x86/x86_64/mm.txt174
7 files changed, 185 insertions, 108 deletions
diff --git a/Documentation/x86/00-INDEX b/Documentation/x86/00-INDEX
deleted file mode 100644
index 3bb2ee3edcd1..000000000000
--- a/Documentation/x86/00-INDEX
+++ /dev/null
@@ -1,20 +0,0 @@
-00-INDEX
- - this file
-boot.txt
- - List of boot protocol versions
-earlyprintk.txt
- - Using earlyprintk with a USB2 debug port key.
-entry_64.txt
- - Describe (some of the) kernel entry points for x86.
-exception-tables.txt
- - why and how Linux kernel uses exception tables on x86
-microcode.txt
- - How to load microcode from an initrd-CPIO archive early to fix CPU issues.
-mtrr.txt
- - how to use x86 Memory Type Range Registers to increase performance
-pat.txt
- - Page Attribute Table intro and API
-usb-legacy-support.txt
- - how to fix/avoid quirks when using emulated PS/2 mouse/keyboard.
-zero-page.txt
- - layout of the first page of memory.
diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index 5e9b826b5f62..7727db8f94bc 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -61,6 +61,18 @@ Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
to struct boot_params for loading bzImage and ramdisk
above 4G in 64bit.
+Protocol 2.13: (Kernel 3.14) Support 32- and 64-bit flags being set in
+ xloadflags to support booting a 64-bit kernel from 32-bit
+ EFI
+
+Protocol 2.14: (Kernel 4.20) Added acpi_rsdp_addr holding the physical
+ address of the ACPI RSDP table.
+ The bootloader updates version with:
+ 0x8000 | min(kernel-version, bootloader-version)
+ kernel-version being the protocol version supported by
+ the kernel and bootloader-version the protocol version
+ supported by the bootloader.
+
**** MEMORY LAYOUT
The traditional memory map for the kernel loader, used for Image or
@@ -197,6 +209,7 @@ Offset Proto Name Meaning
0258/8 2.10+ pref_address Preferred loading address
0260/4 2.10+ init_size Linear memory required during initialization
0264/4 2.11+ handover_offset Offset of handover entry point
+0268/8 2.14+ acpi_rsdp_addr Physical address of RSDP table
(1) For backwards compatibility, if the setup_sects field contains 0, the
real value is 4.
@@ -309,7 +322,7 @@ Protocol: 2.00+
Contains the magic number "HdrS" (0x53726448).
Field name: version
-Type: read
+Type: modify
Offset/size: 0x206/2
Protocol: 2.00+
@@ -317,6 +330,12 @@ Protocol: 2.00+
e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
10.17.
+ Up to protocol version 2.13 this information is only read by the
+ bootloader. From protocol version 2.14 onwards the bootloader will
+ write the used protocol version or-ed with 0x8000 to the field. The
+ used protocol version will be the minimum of the supported protocol
+ versions of the bootloader and the kernel.
+
Field name: realmode_swtch
Type: modify (optional)
Offset/size: 0x208/4
@@ -744,6 +763,17 @@ Offset/size: 0x264/4
See EFI HANDOVER PROTOCOL below for more details.
+Field name: acpi_rsdp_addr
+Type: write
+Offset/size: 0x268/8
+Protocol: 2.14+
+
+ This field can be set by the boot loader to tell the kernel the
+ physical address of the ACPI RSDP table.
+
+ A value of 0 indicates the kernel should fall back to the standard
+ methods to locate the RSDP.
+
**** THE IMAGE CHECKSUM
diff --git a/Documentation/x86/earlyprintk.txt b/Documentation/x86/earlyprintk.txt
index 688e3eeed21d..46933e06c972 100644
--- a/Documentation/x86/earlyprintk.txt
+++ b/Documentation/x86/earlyprintk.txt
@@ -35,25 +35,25 @@ and two USB cables, connected like this:
( If your system does not list a debug port capability then you probably
won't be able to use the USB debug key. )
- b.) You also need a Netchip USB debug cable/key:
+ b.) You also need a NetChip USB debug cable/key:
http://www.plxtech.com/products/NET2000/NET20DC/default.asp
- This is a small blue plastic connector with two USB connections,
+ This is a small blue plastic connector with two USB connections;
it draws power from its USB connections.
c.) You need a second client/console system with a high speed USB 2.0
port.
- d.) The Netchip device must be plugged directly into the physical
+ d.) The NetChip device must be plugged directly into the physical
debug port on the "host/target" system. You cannot use a USB hub in
between the physical debug port and the "host/target" system.
The EHCI debug controller is bound to a specific physical USB
- port and the Netchip device will only work as an early printk
+ port and the NetChip device will only work as an early printk
device in this port. The EHCI host controllers are electrically
wired such that the EHCI debug controller is hooked up to the
- first physical and there is no way to change this via software.
+ first physical port and there is no way to change this via software.
You can find the physical port through experimentation by trying
each physical port on the system and rebooting. Or you can try
and use lsusb or look at the kernel info messages emitted by the
@@ -65,9 +65,9 @@ and two USB cables, connected like this:
to the hardware vendor, because there is no reason not to wire
this port into one of the physically accessible ports.
- e.) It is also important to note, that many versions of the Netchip
+ e.) It is also important to note, that many versions of the NetChip
device require the "client/console" system to be plugged into the
- right and side of the device (with the product logo facing up and
+ right hand side of the device (with the product logo facing up and
readable left to right). The reason being is that the 5 volt
power supply is taken from only one side of the device and it
must be the side that does not get rebooted.
@@ -81,13 +81,18 @@ and two USB cables, connected like this:
CONFIG_EARLY_PRINTK_DBGP=y
And you need to add the boot command line: "earlyprintk=dbgp".
+
(If you are using Grub, append it to the 'kernel' line in
- /etc/grub.conf)
+ /etc/grub.conf. If you are using Grub2 on a BIOS firmware system,
+ append it to the 'linux' line in /boot/grub2/grub.cfg. If you are
+ using Grub2 on an EFI firmware system, append it to the 'linux'
+ or 'linuxefi' line in /boot/grub2/grub.cfg or
+ /boot/efi/EFI/<distro>/grub.cfg.)
On systems with more than one EHCI debug controller you must
specify the correct EHCI debug controller number. The ordering
comes from the PCI bus enumeration of the EHCI controllers. The
- default with no number argument is "0" the first EHCI debug
+ default with no number argument is "0" or the first EHCI debug
controller. To use the second EHCI debug controller, you would
use the command line: "earlyprintk=dbgp1"
@@ -111,7 +116,7 @@ and two USB cables, connected like this:
see the raw output.
c.) On Nvidia Southbridge based systems: the kernel will try to probe
- and find out which port has debug device connected.
+ and find out which port has a debug device connected.
3. Testing that it works fine:
diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index f662d3c530e5..52b10945ff75 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -520,18 +520,24 @@ the pseudo-locked region:
2) Cache hit and miss measurements using model specific precision counters if
available. Depending on the levels of cache on the system the pseudo_lock_l2
and pseudo_lock_l3 tracepoints are available.
- WARNING: triggering this measurement uses from two (for just L2
- measurements) to four (for L2 and L3 measurements) precision counters on
- the system, if any other measurements are in progress the counters and
- their corresponding event registers will be clobbered.
When a pseudo-locked region is created a new debugfs directory is created for
it in debugfs as /sys/kernel/debug/resctrl/<newdir>. A single
write-only file, pseudo_lock_measure, is present in this directory. The
-measurement on the pseudo-locked region depends on the number, 1 or 2,
-written to this debugfs file. Since the measurements are recorded with the
-tracing infrastructure the relevant tracepoints need to be enabled before the
-measurement is triggered.
+measurement of the pseudo-locked region depends on the number written to this
+debugfs file:
+1 - writing "1" to the pseudo_lock_measure file will trigger the latency
+ measurement captured in the pseudo_lock_mem_latency tracepoint. See
+ example below.
+2 - writing "2" to the pseudo_lock_measure file will trigger the L2 cache
+ residency (cache hits and misses) measurement captured in the
+ pseudo_lock_l2 tracepoint. See example below.
+3 - writing "3" to the pseudo_lock_measure file will trigger the L3 cache
+ residency (cache hits and misses) measurement captured in the
+ pseudo_lock_l3 tracepoint.
+
+All measurements are recorded with the tracing infrastructure. This requires
+the relevant tracepoints to be enabled before the measurement is triggered.
Example of latency debugging interface:
In this example a pseudo-locked region named "newlock" was created. Here is
diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
index 2a4ee6302122..481d8d8536ac 100644
--- a/Documentation/x86/pat.txt
+++ b/Documentation/x86/pat.txt
@@ -90,12 +90,12 @@ pci proc | -- | -- | WC |
Advanced APIs for drivers
-------------------------
A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
-vm_insert_pfn
+vmf_insert_pfn
Drivers wanting to export some pages to userspace do it by using mmap
interface and a combination of
1) pgprot_noncached()
-2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn()
+2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()
With PAT support, a new API pgprot_writecombine is being added. So, drivers can
continue to use the above sequence, with either pgprot_noncached() or
diff --git a/Documentation/x86/x86_64/00-INDEX b/Documentation/x86/x86_64/00-INDEX
deleted file mode 100644
index 92fc20ab5f0e..000000000000
--- a/Documentation/x86/x86_64/00-INDEX
+++ /dev/null
@@ -1,16 +0,0 @@
-00-INDEX
- - This file
-boot-options.txt
- - AMD64-specific boot options.
-cpu-hotplug-spec
- - Firmware support for CPU hotplug under Linux/x86-64
-fake-numa-for-cpusets
- - Using numa=fake and CPUSets for Resource Management
-kernel-stacks
- - Context-specific per-processor interrupt stacks.
-machinecheck
- - Configurable sysfs parameters for the x86-64 machine check code.
-mm.txt
- - Memory layout of x86-64 (4 level page tables, 46 bits physical).
-uefi.txt
- - Booting Linux via Unified Extensible Firmware Interface.
diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 5432a96d31ff..73aaaa3da436 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -1,55 +1,124 @@
+====================================================
+Complete virtual memory map with 4-level page tables
+====================================================
-Virtual memory map with 4 level page tables:
-
-0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
-hole caused by [47:63] sign extension
-ffff800000000000 - ffff87ffffffffff (=43 bits) guard hole, reserved for hypervisor
-ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
-ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
-ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
-ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
-ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
-... unused hole ...
-ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
-... unused hole ...
- vaddr_end for KASLR
-fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
-fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
-ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
-... unused hole ...
-ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
-... unused hole ...
-ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0
-ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
-[fixmap start] - ffffffffff5fffff kernel-internal fixmap range
-ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
-ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
-
-Virtual memory map with 5 level page tables:
-
-0000000000000000 - 00ffffffffffffff (=56 bits) user space, different per mm
-hole caused by [56:63] sign extension
-ff00000000000000 - ff0fffffffffffff (=52 bits) guard hole, reserved for hypervisor
-ff10000000000000 - ff8fffffffffffff (=55 bits) direct mapping of all phys. memory
-ff90000000000000 - ff9fffffffffffff (=52 bits) LDT remap for PTI
-ffa0000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space (12800 TB)
-ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole
-ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
-... unused hole ...
-ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
-... unused hole ...
- vaddr_end for KASLR
-fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
-... unused hole ...
-ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
-... unused hole ...
-ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
-... unused hole ...
-ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0
-ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
-[fixmap start] - ffffffffff5fffff kernel-internal fixmap range
-ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
-ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
+Notes:
+
+ - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down
+ from the top of the 64-bit address space. It's easier to understand the layout
+ when seen both in absolute addresses and in distance-from-top notation.
+
+ For example 0xffffe90000000000 == -23 TB, it's 23 TB lower than the top of the
+ 64-bit address space (ffffffffffffffff).
+
+ Note that as we get closer to the top of the address space, the notation changes
+ from TB to GB and then MB/KB.
+
+ - "16M TB" might look weird at first sight, but it's an easier to visualize size
+ notation than "16 EB", which few will recognize at first sight as 16 exabytes.
+ It also shows it nicely how incredibly large 64-bit address space is.
+
+========================================================================================================================
+ Start addr | Offset | End addr | Size | VM area description
+========================================================================================================================
+ | | | |
+ 0000000000000000 | 0 | 00007fffffffffff | 128 TB | user-space virtual memory, different per mm
+__________________|____________|__________________|_________|___________________________________________________________
+ | | | |
+ 0000800000000000 | +128 TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
+ | | | | virtual memory addresses up to the -128 TB
+ | | | | starting offset of kernel mappings.
+__________________|____________|__________________|_________|___________________________________________________________
+ |
+ | Kernel-space virtual memory, shared between all processes:
+____________________________________________________________|___________________________________________________________
+ | | | |
+ ffff800000000000 | -128 TB | ffff87ffffffffff | 8 TB | ... guard hole, also reserved for hypervisor
+ ffff880000000000 | -120 TB | ffffc7ffffffffff | 64 TB | direct mapping of all physical memory (page_offset_base)
+ ffffc80000000000 | -56 TB | ffffc8ffffffffff | 1 TB | ... unused hole
+ ffffc90000000000 | -55 TB | ffffe8ffffffffff | 32 TB | vmalloc/ioremap space (vmalloc_base)
+ ffffe90000000000 | -23 TB | ffffe9ffffffffff | 1 TB | ... unused hole
+ ffffea0000000000 | -22 TB | ffffeaffffffffff | 1 TB | virtual memory map (vmemmap_base)
+ ffffeb0000000000 | -21 TB | ffffebffffffffff | 1 TB | ... unused hole
+ ffffec0000000000 | -20 TB | fffffbffffffffff | 16 TB | KASAN shadow memory
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | LDT remap for PTI
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
+__________________|____________|__________________|_________|____________________________________________________________
+ |
+ | Identical layout to the 47-bit one from here on:
+____________________________________________________________|____________________________________________________________
+ | | | |
+ ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
+ ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
+ ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
+ ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
+ ffffffff80000000 |-2048 MB | | |
+ ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
+ ffffffffff000000 | -16 MB | | |
+ FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
+ ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
+ ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
+__________________|____________|__________________|_________|___________________________________________________________
+
+
+====================================================
+Complete virtual memory map with 5-level page tables
+====================================================
+
+Notes:
+
+ - With 56-bit addresses, user-space memory gets expanded by a factor of 512x,
+ from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PT starting
+ offset and many of the regions expand to support the much larger physical
+ memory supported.
+
+========================================================================================================================
+ Start addr | Offset | End addr | Size | VM area description
+========================================================================================================================
+ | | | |
+ 0000000000000000 | 0 | 00ffffffffffffff | 64 PB | user-space virtual memory, different per mm
+__________________|____________|__________________|_________|___________________________________________________________
+ | | | |
+ 0000800000000000 | +64 PB | ffff7fffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical
+ | | | | virtual memory addresses up to the -128 TB
+ | | | | starting offset of kernel mappings.
+__________________|____________|__________________|_________|___________________________________________________________
+ |
+ | Kernel-space virtual memory, shared between all processes:
+____________________________________________________________|___________________________________________________________
+ | | | |
+ ff00000000000000 | -64 PB | ff0fffffffffffff | 4 PB | ... guard hole, also reserved for hypervisor
+ ff10000000000000 | -60 PB | ff8fffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base)
+ ff90000000000000 | -28 PB | ff9fffffffffffff | 4 PB | LDT remap for PTI
+ ffa0000000000000 | -24 PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
+ ffd2000000000000 | -11.5 PB | ffd3ffffffffffff | 0.5 PB | ... unused hole
+ ffd4000000000000 | -11 PB | ffd5ffffffffffff | 0.5 PB | virtual memory map (vmemmap_base)
+ ffd6000000000000 | -10.5 PB | ffdeffffffffffff | 2.25 PB | ... unused hole
+ ffdf000000000000 | -8.25 PB | fffffdffffffffff | ~8 PB | KASAN shadow memory
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
+__________________|____________|__________________|_________|____________________________________________________________
+ |
+ | Identical layout to the 47-bit one from here on:
+____________________________________________________________|____________________________________________________________
+ | | | |
+ ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
+ ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
+ ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
+ ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
+ ffffffff80000000 |-2048 MB | | |
+ ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
+ ffffffffff000000 | -16 MB | | |
+ FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
+ ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
+ ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
+__________________|____________|__________________|_________|___________________________________________________________
Architecture defines a 64-bit virtual address. Implementations can support
less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
@@ -77,3 +146,6 @@ Their order is preserved but their base will be offset early at boot time.
Be very careful vs. KASLR when changing anything here. The KASLR address
range must not overlap with anything except the KASAN shadow area, which is
correct as KASAN disables KASLR.
+
+For both 4- and 5-level layouts, the STACKLEAK_POISON value in the last 2MB
+hole: ffffffffffff4111