aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/x86 (follow)
AgeCommit message (Collapse)AuthorFilesLines
2021-11-14Merge tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+9
Pull x86 fixes from Borislav Petkov: - Add the model number of a new, Raptor Lake CPU, to intel-family.h - Do not log spurious corrected MCEs on SKL too, due to an erratum - Clarify the path of paravirt ops patches upstream - Add an optimization to avoid writing out AMX components to sigframes when former are in init state * tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add Raptor Lake to Intel family x86/mce: Add errata workaround for Skylake SKX37 MAINTAINERS: Add some information to PARAVIRT_OPS entry x86/fpu: Optimize out sigframe xfeatures when in init state
2021-11-04Merge tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds1-54/+2
Pull driver core updates from Greg KH: "Here is the big set of driver core changes for 5.16-rc1. All of these have been in linux-next for a while now with no reported problems. Included in here are: - big update and cleanup of the sysfs abi documentation files and scripts from Mauro. We are almost at the place where we can properly check that the running kernel's sysfs abi is documented fully. - firmware loader updates - dyndbg updates - kernfs cleanups and fixes from Christoph - device property updates - component fix - other minor driver core cleanups and fixes" * tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (122 commits) device property: Drop redundant NULL checks x86/build: Tuck away built-in firmware under FW_LOADER vmlinux.lds.h: wrap built-in firmware support under FW_LOADER firmware_loader: move struct builtin_fw to the only place used x86/microcode: Use the firmware_loader built-in API firmware_loader: remove old DECLARE_BUILTIN_FIRMWARE() firmware_loader: formalize built-in firmware API component: do not leave master devres group open after bind dyndbg: refine verbosity 1-4 summary-detail gpiolib: acpi: Replace custom code with device_match_acpi_handle() i2c: acpi: Replace custom function with device_match_acpi_handle() driver core: Provide device_match_acpi_handle() helper dyndbg: fix spurious vNpr_info change dyndbg: no vpr-info on empty queries dyndbg: vpr-info on remove-module complete, not starting device property: Add missed header in fwnode.h Documentation: dyndbg: Improve cli param examples dyndbg: Remove support for ddebug_query param dyndbg: make dyndbg a known cli param dyndbg: show module in vpr-info in dd-exec-queries ...
2021-11-03x86/fpu: Optimize out sigframe xfeatures when in init stateDave Hansen1-0/+9
tl;dr: AMX state is ~8k. Signal frames can have space for this ~8k and each signal entry writes out all 8k even if it is zeros. Skip writing zeros for AMX to speed up signal delivery by about 4% overall when AMX is in its init state. This is a user-visible change to the sigframe ABI. == Hardware XSAVE Background == XSAVE state components may be tracked by the processor as being in their initial configuration. Software can detect which features are in this configuration by looking at the XSTATE_BV field in an XSAVE buffer or with the XGETBV(1) instruction. Both the XSAVE and XSAVEOPT instructions enumerate features s being in the initial configuration via the XSTATE_BV field in the XSAVE header, However, XSAVEOPT declines to actually write features in their initial configuration to the buffer. XSAVE writes the feature unconditionally, regardless of whether it is in the initial configuration or not. Basically, XSAVE users never need to inspect XSTATE_BV to determine if the feature has been written to the buffer. XSAVEOPT users *do* need to inspect XSTATE_BV. They might also need to clear out the buffer if they want to make an isolated change to the state, like modifying one register. == Software Signal / XSAVE Background == Signal frames have historically been written with XSAVE itself. Each state is written in its entirety, regardless of being in its initial configuration. In other words, the signal frame ABI uses the XSAVE behavior, not the XSAVEOPT behavior. == Problem == This means that any application which has acquired permission to use AMX via ARCH_REQ_XCOMP_PERM will write 8k of state to the signal frame. This 8k write will occur even when AMX was in its initial configuration and software *knows* this because of XSTATE_BV. This problem also exists to a lesser degree with AVX-512 and its 2k of state. However, AVX-512 use does not require ARCH_REQ_XCOMP_PERM and is more likely to have existing users which would be impacted by any change in behavior. == Solution == Stop writing out AMX xfeatures which are in their initial state to the signal frame. This effectively makes the signal frame XSAVE buffer look as if it were written with a combination of XSAVEOPT and XSAVE behavior. Userspace which handles XSAVEOPT- style buffers should be able to handle this naturally. For now, include only the AMX xfeatures: XTILE and XTILEDATA in this new behavior. These require new ABI to use anyway, which makes their users very unlikely to be broken. This XSAVEOPT-like behavior should be expected for all future dynamic xfeatures. It may also be extended to legacy features like AVX-512 in the future. Only attempt this optimization on systems with dynamic features. Disable dynamic feature support (XFD) if XGETBV1 is unavailable by adding a CPUID dependency. This has been measured to reduce the *overall* cycle cost of signal delivery by about 4%. Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: "Chang S. Bae" <chang.seok.bae@intel.com> Link: https://lore.kernel.org/r/20211102224750.FA412E26@davehans-spike.ostc.intel.com
2021-11-02Merge tag 'docs-5.16' of git://git.lwn.net/linuxLinus Torvalds2-3/+3
Pull documentation updates from Jonathan Corbet: "This is a relatively unexciting cycle for documentation. - Some small scripts/kerneldoc fixes - More Chinese translation work, but at a much reduced rate. - The tip-tree maintainer's handbook ...plus the usual array of build fixes, typo fixes, etc" * tag 'docs-5.16' of git://git.lwn.net/linux: (53 commits) kernel-doc: support DECLARE_PHY_INTERFACE_MASK() docs/zh_CN: add core-api xarray translation docs/zh_CN: add core-api assoc_array translation speakup: Fix typo in documentation "boo" -> "boot" docs: submitting-patches: make section about the Link: tag more explicit docs: deprecated.rst: Clarify open-coded arithmetic with literals scripts: documentation-file-ref-check: fix bpf selftests path scripts: documentation-file-ref-check: ignore hidden files coding-style.rst: trivial: fix location of driver model macros docs: f2fs: fix text alignment docs/zh_CN add PCI pci.rst translation docs/zh_CN add PCI index.rst translation docs: translations: zh_CN: memory-hotplug.rst: fix a typo docs: translations: zn_CN: irq-affinity.rst: add a missing extension block: add documentation for inflight scripts: kernel-doc: Ignore __alloc_size() attribute docs: pdfdocs: Adjust \headheight for fancyhdr docs: UML: user_mode_linux_howto_v2 edits docs: use the lore redirector everywhere docs: proc.rst: mountinfo: align columns ...
2021-11-01Merge tag 'x86_sgx_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+35
Pull x86 SGX updates from Borislav Petkov: "Add a SGX_IOC_VEPC_REMOVE ioctl to the /dev/sgx_vepc virt interface with which EPC pages can be put back into their uninitialized state without having to reopen /dev/sgx_vepc, which could not be possible anymore after startup due to security policies" * tag 'x86_sgx_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sgx/virt: implement SGX_IOC_VEPC_REMOVE ioctl x86/sgx/virt: extract sgx_vepc_remove_page
2021-10-28Documentation/x86: Add documentation for using dynamic XSTATE featuresChang S. Bae2-0/+66
Explain how dynamic XSTATE features can be enabled via the architecture-specific prctl() along with dynamic sigframe size and first use trap handling. Fix: Documentation/x86/xstate.rst:15: WARNING: Title underline too short. as reported by Stephen Rothwell <sfr@canb.auug.org.au> Originally-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211026091157.16711-1-chang.seok.bae@intel.com
2021-10-22x86/sgx/virt: implement SGX_IOC_VEPC_REMOVE ioctlPaolo Bonzini1-0/+35
For bare-metal SGX on real hardware, the hardware provides guarantees SGX state at reboot. For instance, all pages start out uninitialized. The vepc driver provides a similar guarantee today for freshly-opened vepc instances, but guests such as Windows expect all pages to be in uninitialized state on startup, including after every guest reboot. Some userspace implementations of virtual SGX would rather avoid having to close and reopen the /dev/sgx_vepc file descriptor and re-mmap the virtual EPC. For example, they could sandbox themselves after the guest starts and forbid further calls to open(), in order to mitigate exploits from untrusted guests. Therefore, add a ioctl that does this with EREMOVE. Userspace can invoke the ioctl to bring its vEPC pages back to uninitialized state. There is a possibility that some pages fail to be removed if they are SECS pages, and the child and SECS pages could be in separate vEPC regions. Therefore, the ioctl returns the number of EREMOVE failures, telling userspace to try the ioctl again after it's done with all vEPC regions. A more verbose description of the correct usage and the possible error conditions is documented in sgx.rst. Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20211021201155.1523989-3-pbonzini@redhat.com
2021-10-12docs: use the lore redirector everywhereThorsten Leemhuis2-3/+3
Change all links from using the lkml redirector to the lore redirector, as the kernel.org admin recently indicated: we shouldn't be using lkml.kernel.org anymore because the domain can create confusion, as it indicates it is only valid for messages sent to the LKML; the convention has been to use https://lore.kernel.org/r/msgid for this reason. In this process also change three links from using http to https. Link: https://lore.kernel.org/r/20211006170025.qw3glxvocczfuhar@meerkat.local CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: Borislav Petkov <bp@alien8.de> CC: Hu Haowen <src.res@email.cn> CC: Alex Shi <alexs@kernel.org> CC: Federico Vaga <federico.vaga@vaga.pv.it> Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info> Reviewed-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Link: https://lore.kernel.org/r/5bb55bac6ba10fafab19bf2b21572dd0e2f8cea2.1633593385.git.linux@leemhuis.info Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-10-05ABI: sysfs-mce: add a new ABI fileMauro Carvalho Chehab1-54/+2
Reduce the gap of missing ABIs for Intel servers with MCE by adding a new ABI file. The contents of this file comes from: Documentation/x86/x86_64/machinecheck.rst Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/801a26985e32589eb78ba4b728d3e19fdea18f04.1632994837.git.mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-08Merge tag 'docs-5.15-2' of git://git.lwn.net/linuxLinus Torvalds1-4/+0
Pull more documentation updates from Jonathan Corbet: "Another collection of documentation patches, mostly fixes but also includes another set of traditional Chinese translations" * tag 'docs-5.15-2' of git://git.lwn.net/linux: docs: pdfdocs: Fix typo in CJK-language specific font settings docs: kernel-hacking: Remove inappropriate text docs/zh_TW: add translations for zh_TW/filesystems docs/zh_TW: add translations for zh_TW/cpu-freq docs/zh_TW: add translations for zh_TW/arm64 docs/zh_CN: Modify the translator tag and fix the wrong word Documentation/features/vm: correct huge-vmap APIs Documentation: block: blk-mq: Fix small typo in multi-queue docs Documentation: in_irq() cleanup Documentation: arm: marvell: Add 88F6825 model into list Documentation/process/maintainer-pgp-guide: Replace broken link to PGP path finder Documentation: locking: fix references Documentation: Update details of The Linux Kernel Module Programming Guide docs: x86: Remove obsolete information about x86_64 vmalloc() faulting Documentation/process/applying-patches: Activate linux-next man hyperlink
2021-08-20docs: x86: Remove obsolete information about x86_64 vmalloc() faultingPeilin Ye1-4/+0
x86_64 vmalloc() mappings are no longer "synchronized" among page tables via faulting since commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area"), since the corresponding P4D or PUD pages are now preallocated at boot, by preallocate_vmalloc_pages(). Drop the "lazily synchronized" description for less confusion. While this file is x86_64-specific, it is worth noting that things are different for x86_32, where vmalloc()-related changes to `init_mm.pgd` are synchronized to all page tables in the system during runtime, via arch_sync_kernel_mappings(). Unfortunately, this synchronization is subject to race condition, which is further handled via faulting, see vmalloc_fault(). See commit 4819e15f740e ("x86/mm/32: Bring back vmalloc faulting on x86_32") for more details. Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20210818220123.2623-1-yepeilin.cs@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-08-12x86/reboot: Document how to override DMI platform quirksPaul Gortmaker1-0/+7
commit 5955633e91bf ("x86/reboot: Skip DMI checks if reboot set by user") made it so that it's not required to recompile the kernel in order to bypass broken reboot quirks compiled into an image: * This variable is used privately to keep track of whether or not * reboot_type is still set to its default value (i.e., reboot= hasn't * been set on the command line). This is needed so that we can * suppress DMI scanning for reboot quirks. Without it, it's * impossible to override a faulty reboot quirk without recompiling. However, at the time it was not eally documented outside the source code, and so this information isn't really available to the average user out there. The change is a little white lie and invented "reboot=default" since it is easy to remember, and documents well. The truth is that any random string that is *not* a currently accepted string will work. Since that doesn't document well for non-coders, and since it's unknown what the future additions might be, lay claim on "default" since that is exactly what it achieves. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210530162447.996461-3-paul.gortmaker@windriver.com
2021-08-12x86/reboot: Document the "reboot=pci" optionPaul Gortmaker1-1/+3
It is mentioned in the top level non-arch specific file but it was overlooked here for x86. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210530162447.996461-2-paul.gortmaker@windriver.com
2021-07-07Merge tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-0/+54
Pull x86 fpu updates from Thomas Gleixner: "Fixes and improvements for FPU handling on x86: - Prevent sigaltstack out of bounds writes. The kernel unconditionally writes the FPU state to the alternate stack without checking whether the stack is large enough to accomodate it. Check the alternate stack size before doing so and in case it's too small force a SIGSEGV instead of silently corrupting user space data. - MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never been updated despite the fact that the FPU state which is stored on the signal stack has grown over time which causes trouble in the field when AVX512 is available on a CPU. The kernel does not expose the minimum requirements for the alternate stack size depending on the available and enabled CPU features. ARM already added an aux vector AT_MINSIGSTKSZ for the same reason. Add it to x86 as well. - A major cleanup of the x86 FPU code. The recent discoveries of XSTATE related issues unearthed quite some inconsistencies, duplicated code and other issues. The fine granular overhaul addresses this, makes the code more robust and maintainable, which allows to integrate upcoming XSTATE related features in sane ways" * tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits) x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again x86/fpu/signal: Let xrstor handle the features to init x86/fpu/signal: Handle #PF in the direct restore path x86/fpu: Return proper error codes from user access functions x86/fpu/signal: Split out the direct restore code x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing() x86/fpu/signal: Sanitize the xstate check on sigframe x86/fpu/signal: Remove the legacy alignment check x86/fpu/signal: Move initial checks into fpu__restore_sig() x86/fpu: Mark init_fpstate __ro_after_init x86/pkru: Remove xstate fiddling from write_pkru() x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate() x86/fpu: Remove PKRU handling from switch_fpu_finish() x86/fpu: Mask PKRU from kernel XRSTOR[S] operations x86/fpu: Hook up PKRU into ptrace() x86/fpu: Add PKRU storage outside of task XSAVE buffer x86/fpu: Dont restore PKRU in fpregs_restore_userspace() x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() ...
2021-06-28Merge tag 'docs-5.14' of git://git.lwn.net/linuxLinus Torvalds2-3/+3
Pull documentation updates from Jonathan Corbet: "This was a reasonably active cycle for documentation; this includes: - Some kernel-doc cleanups. That script is still regex onslaught from hell, but it has gotten a little better. - Improvements to the checkpatch docs, which are also used by the tool itself. - A major update to the pathname lookup documentation. - Elimination of :doc: markup, since our automarkup magic can create references from filenames without all the extra noise. - The flurry of Chinese translation activity continues. Plus, of course, the usual collection of updates, typo fixes, and warning fixes" * tag 'docs-5.14' of git://git.lwn.net/linux: (115 commits) docs: path-lookup: use bare function() rather than literals docs: path-lookup: update symlink description docs: path-lookup: update get_link() ->follow_link description docs: path-lookup: update WALK_GET, WALK_PUT desc docs: path-lookup: no get_link() docs: path-lookup: update i_op->put_link and cookie description docs: path-lookup: i_op->follow_link replaced with i_op->get_link docs: path-lookup: Add macro name to symlink limit description docs: path-lookup: remove filename_mountpoint docs: path-lookup: update do_last() part docs: path-lookup: update path_mountpoint() part docs: path-lookup: update path_to_nameidata() part docs: path-lookup: update follow_managed() part docs: Makefile: Use CONFIG_SHELL not SHELL docs: Take a little noise out of the build process docs: x86: avoid using ReST :doc:`foo` markup docs: virt: kvm: s390-pv-boot.rst: avoid using ReST :doc:`foo` markup docs: userspace-api: landlock.rst: avoid using ReST :doc:`foo` markup docs: trace: ftrace.rst: avoid using ReST :doc:`foo` markup docs: trace: coresight: coresight.rst: avoid using ReST :doc:`foo` markup ...
2021-06-28Merge tag 'x86-splitlock-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-0/+127
Pull x86 splitlock updates from Ingo Molnar: - Add the "ratelimit:N" parameter to the split_lock_detect= boot option, to rate-limit the generation of bus-lock exceptions. This is both easier on system resources and kinder to offending applications than the current policy of outright killing them. - Document the split-lock detection feature and its parameters. * tag 'x86-splitlock-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/x86: Add ratelimit in buslock.rst Documentation/admin-guide: Add bus lock ratelimit x86/bus_lock: Set rate limit for bus lock Documentation/x86: Add buslock.rst
2021-06-17docs: x86: avoid using ReST :doc:`foo` markupMauro Carvalho Chehab2-3/+3
The :doc:`foo` tag is auto-generated via automarkup.py. So, use the filename at the sources, instead of :doc:`foo`. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/17c68b5f1d72488431c77c1de9f13683fe9f536c.1623824363.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-10doc: Remove references to IBM CalgaryHubert Jasudowicz1-30/+1
The Calgary IOMMU driver has been removed in 90dc392fc445 ("x86: Remove the calgary IOMMU driver") Clean up stale docs that refer to it. Signed-off-by: Hubert Jasudowicz <hubert.jasudowicz@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/1bd2b57dd1db53df09e520b8170ff61418805de4.1623274832.git.hubert.jasudowicz@gmail.com
2021-05-19x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZChang S. Bae2-0/+54
Historically, signal.h defines MINSIGSTKSZ (2KB) and SIGSTKSZ (8KB), for use by all architectures with sigaltstack(2). Over time, the hardware state size grew, but these constants did not evolve. Today, literal use of these constants on several architectures may result in signal stack overflow, and thus user data corruption. A few years ago, the ARM team addressed this issue by establishing getauxval(AT_MINSIGSTKSZ). This enables the kernel to supply a value at runtime that is an appropriate replacement on current and future hardware. Add getauxval(AT_MINSIGSTKSZ) support to x86, analogous to the support added for ARM in 94b07c1f8c39 ("arm64: signal: Report signal frame size to userspace via auxv"). Also, include a documentation to describe x86-specific auxiliary vectors. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Len Brown <len.brown@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20210518200320.17239-4-chang.seok.bae@intel.com
2021-05-18Documentation/x86: Add ratelimit in buslock.rstFenghua Yu1-0/+22
ratelimit is a new command line option for bus lock handling. Add proper documentation. [ tglx: Massaged documentation ] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210419214958.4035512-5-fenghua.yu@intel.com
2021-05-18Documentation/x86: Add buslock.rstFenghua Yu2-0/+105
Add buslock.rst to explain bus lock problem and how to detect and handle it. [ tglx: Included it into index.rst and added the missing include ... ] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210419214958.4035512-2-fenghua.yu@intel.com
2021-05-10x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFGBrijesh Singh1-3/+3
The SYSCFG MSR continued being updated beyond the K8 family; drop the K8 name from it. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20210427111636.1207-4-brijesh.singh@amd.com
2021-05-06Merge tag 'docs-5.13-2' of git://git.lwn.net/linuxLinus Torvalds1-2/+2
Pull documentation fixes from Jonathan Corbet: "A few late-arriving documentation fixes, including some oprofile cleanup, a kernel-doc fix, some regression-reporting updates, and the usual minor fixes" * tag 'docs-5.13-2' of git://git.lwn.net/linux: Enlisted oprofile version line removed oprofiled version output line removed from the list Removed the oprofiled version option docs: reporting-issues.rst: CC subsystem and maintainers on regressions docs: correct URL to bios and kernel developer's guide docs/core-api: Consistent code style docs/zh_CN: Adjust order and content of zh_CN/index.rst Documentation: input: joydev file corrections docs: Fix typo in Documentation/x86/x86_64/5level-paging.rst kernel-doc: Add support for __deprecated
2021-04-27docs: Fix typo in Documentation/x86/x86_64/5level-paging.rstbilbao@vt.edu1-2/+2
fix two typos in the documentation (Documentation/x86/x86_64/5level-paging.rst), changing 'paing' for 'paging' and using the right verbal form for plural on 'some vendors offer'. Signed-off-by: Carlos Bilbao <bilbao@vt.edu> Link: https://lore.kernel.org/r/2599991.mvXUDI8C0e@iron-maiden Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-04-06x86/sgx: Introduce virtual EPC for use by KVM guestsSean Christopherson1-0/+16
Add a misc device /dev/sgx_vepc to allow userspace to allocate "raw" Enclave Page Cache (EPC) without an associated enclave. The intended and only known use case for raw EPC allocation is to expose EPC to a KVM guest, hence the 'vepc' moniker, virt.{c,h} files and X86_SGX_KVM Kconfig. The SGX driver uses the misc device /dev/sgx_enclave to support userspace in creating an enclave. Each file descriptor returned from opening /dev/sgx_enclave represents an enclave. Unlike the SGX driver, KVM doesn't control how the guest uses the EPC, therefore EPC allocated to a KVM guest is not associated with an enclave, and /dev/sgx_enclave is not suitable for allocating EPC for a KVM guest. Having separate device nodes for the SGX driver and KVM virtual EPC also allows separate permission control for running host SGX enclaves and KVM SGX guests. To use /dev/sgx_vepc to allocate a virtual EPC instance with particular size, the hypervisor opens /dev/sgx_vepc, and uses mmap() with the intended size to get an address range of virtual EPC. Then it may use the address range to create one KVM memory slot as virtual EPC for a guest. Implement the "raw" EPC allocation in the x86 core-SGX subsystem via /dev/sgx_vepc rather than in KVM. Doing so has two major advantages: - Does not require changes to KVM's uAPI, e.g. EPC gets handled as just another memory backend for guests. - EPC management is wholly contained in the SGX subsystem, e.g. SGX does not have to export any symbols, changes to reclaim flows don't need to be routed through KVM, SGX's dirty laundry doesn't have to get aired out for the world to see, and so on and so forth. The virtual EPC pages allocated to guests are currently not reclaimable. Reclaiming an EPC page used by enclave requires a special reclaim mechanism separate from normal page reclaim, and that mechanism is not supported for virutal EPC pages. Due to the complications of handling reclaim conflicts between guest and host, reclaiming virtual EPC pages is significantly more complex than basic support for SGX virtualization. [ bp: - Massage commit message and comments - use cpu_feature_enabled() - vertically align struct members init - massage Virtual EPC clarification text - move Kconfig prompt to Virtualization ] Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Co-developed-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/0c38ced8c8e5a69872db4d6a1c0dabd01e07cad7.1616136308.git.kai.huang@intel.com
2021-03-26x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()Kai Huang1-0/+25
EREMOVE takes a page and removes any association between that page and an enclave. It must be run on a page before it can be added into another enclave. Currently, EREMOVE is run as part of pages being freed into the SGX page allocator. It is not expected to fail, as it would indicate a use-after-free of EPC pages. Rather than add the page back to the pool of available EPC pages, the kernel intentionally leaks the page to avoid additional errors in the future. However, KVM does not track how guest pages are used, which means that SGX virtualization use of EREMOVE might fail. Specifically, it is legitimate that EREMOVE returns SGX_CHILD_PRESENT for EPC assigned to KVM guest, because KVM/kernel doesn't track SECS pages. To allow SGX/KVM to introduce a more permissive EREMOVE helper and to let the SGX virtualization code use the allocator directly, break out the EREMOVE call from the SGX page allocator. Rename the original sgx_free_epc_page() to sgx_encl_free_epc_page(), indicating that it is used to free an EPC page assigned to a host enclave. Replace sgx_free_epc_page() with sgx_encl_free_epc_page() in all call sites so there's no functional change. At the same time, improve the error message when EREMOVE fails, and add documentation to explain to the user what that failure means and to suggest to the user what to do when this bug happens in the case it happens. [ bp: Massage commit message, fix typos and sanitize text, simplify. ] Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/20210325093057.122834-1-kai.huang@intel.com
2021-01-28Documentation/x86/boot.rst: Correct the example of SETUP_INDIRECTCao jin1-1/+1
struct setup_data.len is the length of data field. In case of SETUP_INDIRECT, it should be sizeof(setup_indirect). Signed-off-by: Cao jin <jojing64@gmail.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Link: https://lore.kernel.org/r/20210127084911.63438-1-jojing64@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-12-14Merge tag 'docs-5.11' of git://git.lwn.net/linuxLinus Torvalds2-0/+4
Pull documentation updates from Jonathan Corbet: "A much quieter cycle for documentation (happily), with, one hopes, the bulk of the churn behind us. Significant stuff in this pull includes: - A set of new Chinese translations - Italian translation updates - A mechanism from Mauro to automatically format Documentation/features for the built docs - Automatic cross references without explicit :ref: markup - A new reset-controller document - An extensive new document on reporting problems from Thorsten That last patch also adds the CC-BY-4.0 license to LICENSES/dual; there was some discussion on this, but we seem to have consensus and an ack from Greg for that addition" * tag 'docs-5.11' of git://git.lwn.net/linux: (50 commits) docs: fix broken cross reference in translations/zh_CN docs: Note that sphinx 1.7 will be required soon docs: update requirements to install six module docs: reporting-issues: move 'outdated, need help' note to proper place docs: Update documentation to reflect what TAINT_CPU_OUT_OF_SPEC means docs: add a reset controller chapter to the driver API docs docs: make reporting-bugs.rst obsolete docs: Add a new text describing how to report bugs LICENSES: Add the CC-BY-4.0 license Documentation: fix multiple typos found in the admin-guide subdirectory Documentation: fix typos found in admin-guide subdirectory kernel-doc: Fix example in Nested structs/unions docs: clean up sysctl/kernel: titles, version docs: trace: fix event state structure name docs: nios2: add missing ReST file scripts: get_feat.pl: reduce table width for all features output scripts: get_feat.pl: change the group by order scripts: get_feat.pl: make complete table more coincise scripts: kernel-doc: fix parsing function-like typedefs Documentation: fix typos found in process, dev-tools, and doc-guide subdirectories ...
2020-12-14Merge tag 'x86_cache_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-1/+94
Pull x86 cache resource control updates from Borislav Petkov: - add logic to correct MBM total and local values fixing errata SKX99 and BDF102 (Fenghua Yu) - cleanups * tag 'x86_cache_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Clean up unused function parameter in rmdir path x86/resctrl: Constify kernfs_ops x86/resctrl: Correct MBM total and local values Documentation/x86: Rename resctrl_ui.rst and add two errata to the file
2020-12-14Merge tag 'x86_cpu_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+9
Pull x86 cpuid updates from Borislav Petkov: "Only AMD-specific changes this time: - Save the AMD physical die ID into cpuinfo_x86.cpu_die_id and convert all code to use it (Yazen Ghannam) - Remove a dead and unused TSEG region remapping workaround on AMD (Arvind Sankar)" * tag 'x86_cpu_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/amd: Remove dead code for TSEG region remapping x86/topology: Set cpu_die_id only if DIE_TYPE found EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId x86/CPU/AMD: Remove amd_get_nb_id() x86/CPU/AMD: Save AMD NodeId as cpu_die_id
2020-12-03docs: archis: add a per-architecture features listMauro Carvalho Chehab2-0/+4
Add a feature list matrix for each architecture to their respective Kernel books. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/9c39d4dd93e05c0008205527d2c3450912f029ed.1606748711.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-11-19x86/CPU/AMD: Save AMD NodeId as cpu_die_idYazen Ghannam1-0/+9
AMD systems provide a "NodeId" value that represents a global ID indicating to which "Node" a logical CPU belongs. The "Node" is a physical structure equivalent to a Die, and it should not be confused with logical structures like NUMA nodes. Logical nodes can be adjusted based on firmware or other settings whereas the physical nodes/dies are fixed based on hardware topology. The NodeId value can be used when a physical ID is needed by software. Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value from CPUID or MSR as appropriate. Default to phys_proc_id otherwise. Do so for both AMD and Hygon systems. Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no longer needed. Update the x86 topology documentation. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-2-Yazen.Ghannam@amd.com
2020-11-18Documentation/x86: Document SGX kernel architectureJarkko Sakkinen2-0/+212
Document the Intel SGX kernel architecture. The fine-grained architecture details can be looked up from Intel SDM Volume 3D. Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jethro Beekman <jethro@fortanix.com> Cc: linux-doc@vger.kernel.org Link: https://lkml.kernel.org/r/20201112220135.165028-24-jarkko@kernel.org
2020-10-27Documentation/x86: Rename resctrl_ui.rst and add two errata to the fileFenghua Yu2-1/+94
Intel Memory Bandwidth Monitoring (MBM) counters may report system memory bandwidth incorrectly on some Intel processors. This is reported in documented in erratum SKX99, erratum BDF102 and in the RDT reference manual, see Documentation/x86/index.rst. To work around the errata, MBM total and local readings are corrected using a correction factor table. Since the correction factor table is not publicly documented anywhere, document the table and the errata in Documentation/x86/resctrl.rst for future reference. [ bp: Move web links to the doc, massage. ] Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20201014004927.1839452-2-fenghua.yu@intel.com
2020-10-23Merge tag 'docs-5.10-2' of git://git.lwn.net/linuxLinus Torvalds1-1/+1
Pull documentation fixes from Jonathan Corbet: "A handful of late-arriving documentation fixes" * tag 'docs-5.10-2' of git://git.lwn.net/linux: docs: Add two missing entries in vm sysctl index docs/vm: trivial fixes to several spelling mistakes docs: submitting-patches: describe preserving review/test tags Documentation: Chinese translation of Documentation/arm64/hugetlbpage.rst Documentation: x86: fix a missing word in x86_64/mm.rst. docs: driver-api: remove a duplicated index entry docs: lkdtm: Modernize and improve details docs: deprecated.rst: Expand str*cpy() replacement notes docs/cpu-load: format the example code.
2020-10-21Documentation: x86: fix a missing word in x86_64/mm.rst.Wei Lin Chang1-1/+1
This patch adds a missing word in x86/x86_64/mm.rst, without which the note reads awkwardly. Signed-off-by: Wei Lin Chang <r09922117@csie.ntu.edu.tw> Link: https://lore.kernel.org/r/20201015062242.26296-1-r09922117@csie.ntu.edu.tw Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-14Merge tag 'devicetree-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linuxLinus Torvalds2-0/+22
Pull devicetree updates from Rob Herring: - Update dtc to upstream version v1.6.0-31-gcbca977ea121 - dtx_diff help text reformatting - Speed-up validation time for binding and dtb checks using json for intermediate files - Add support for running yamllint on DT schema files - Remove old booting-without-of.rst - Extend the example schema to address common issues - Cleanup handling of additionalProperties/unevaluatedProperties - Ensure all DSI controller schemas reference dsi-controller.yaml - Vendor prefixes for Zealz, Wandbord/Technexion, Embest RIoT, Rex, DFI, and Cisco Meraki - Convert at25, SPMI bus, TI hwlock, HiSilicon Hi3660 USB3 PHY, Arm SP805 watchdog, Arm SP804, and Samsung 11-pin USB connector to DT schema - Convert HiSilicon SoC and syscon bindings to DT schema - Convert SiFive Risc-V L2 cache, PLIC, PRCI, and PWM to DT schema - Convert i.MX bindings for w1, crypto, rng, SIM, PM, DDR, SATA, vf610 GPIO, and UART to DT schema - Add i.MX 8M compatible strings - Add LM81 and DS1780 as trivial devices - Various missing properties added to fix dtb validation warnings * tag 'devicetree-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (111 commits) dt-bindings: misc: explicitly add #address-cells for slave mode spi: dt-bindings: spi-controller: explicitly require #address-cells=<0> for slave mode dt: Remove booting-without-of.rst dt-bindings: update usb-c-connector example dt-bindings: arm: hisilicon: add missing properties into cpuctrl.yaml dt-bindings: arm: hisilicon: add missing properties into sysctrl.yaml dt-bindings: pwm: imx: document i.MX compatibles scripts/dtc: Update to upstream version v1.6.0-31-gcbca977ea121 dt-bindings: Add running yamllint to dt_binding_check dt-bindings: powerpc: Add a schema for the 'sleep' property dt-bindings: pinctrl: sirf: Fix typo abitrary dt-bindings: pinctrl: qcom: Fix typo abitrary dt-bindings: Explicitly allow additional properties in common schemas dt-bindings: Use 'additionalProperties' instead of 'unevaluatedProperties' dt-bindings: Add missing 'unevaluatedProperties' Docs: Fixing spelling errors in Documentation/devicetree/bindings/ dt-bindings: arm: hisilicon: convert Hi6220 domain controller bindings to json-schema dt-bindings: riscv: convert pwm bindings to json-schema dt-bindings: riscv: convert plic bindings to json-schema dt-bindings: fu540: prci: convert PRCI bindings to json-schema ...
2020-10-13x86/numa: add 'nohmat' optionDan Williams1-0/+4
Disable parsing of the HMAT for debug, to workaround broken platform instances, or cases where it is otherwise not wanted. [rdunlap@infradead.org: fix build when CONFIG_ACPI is not set] Link: https://lkml.kernel.org/r/70e5ee34-9809-a997-7b49-499e4be61307@infradead.org Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Jiang <dave.jiang@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Hulk Robot <hulkci@huawei.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Link: https://lkml.kernel.org/r/159643095540.4062302.732962081968036212.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13dt: Remove booting-without-of.rstRob Herring2-0/+22
booting-without-of.rst is an ancient document that first outlined Flattened DeviceTree on PowerPC initially. The DT world has evolved a lot in the 15 years since and booting-without-of.rst is pretty stale. The name of the document itself is confusing if you don't understand the evolution from real 'OpenFirmware'. Most of what booting-without-of.rst contains is now in the DT specification (which evolved out of the ePAPR). The few things that weren't documented in the DT specification are now. All that remains is the boot entry details, so let's move these to arch specific documents. The exception is arm which already has the same details documented. Cc: Frank Rowand <frowand.list@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-mips@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-sh@vger.kernel.org Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Rob Herring <robh@kernel.org>
2020-10-12Merge tag 'x86_cache_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-2/+16
Pull x86 cache resource control updates from Borislav Petkov: - Misc cleanups to the resctrl code in preparation for the ARM side (James Morse) - Add support for controlling per-thread memory bandwidth throttling delay values on hw which supports it (Fenghua Yu) * tag 'x86_cache_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Enable user to view thread or core throttling mode x86/resctrl: Enumerate per-thread MBA controls cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps x86/resctrl: Merge AMD/Intel parse_bw() calls x86/resctrl: Add struct rdt_membw::arch_needs_linear to explain AMD/Intel MBA difference x86/resctrl: Use is_closid_match() in more places x86/resctrl: Include pid.h x86/resctrl: Use container_of() in delayed_work handlers x86/resctrl: Fix stale comment x86/resctrl: Remove struct rdt_membw::max_delay x86/resctrl: Remove unused struct mbm_state::chunks_bw
2020-10-12Merge tag 'x86_misc_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-0/+156
Pull misc x86 fixes fromm Borislav Petkov: - Ratelimit the message about writes to unrecognized MSRs so that they don't spam the console log (Chris Down) - Document how the /proc/cpuinfo machinery works for future reference (Kyung Min Park, Ricardo Neri and Dave Hansen) - Correct the current NMI's duration calculation (Libing Zhou) * tag 'x86_misc_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/nmi: Fix nmi_handle() duration miscalculation Documentation/x86: Add documentation for /proc/cpuinfo feature flags x86/msr: Make source of unrecognised MSR writes unambiguous x86/msr: Prevent userspace MSR access from dominating the console
2020-10-12Merge tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-0/+258
Pull x86 PASID updates from Borislav Petkov: "Initial support for sharing virtual addresses between the CPU and devices which doesn't need pinning of pages for DMA anymore. Add support for the command submission to devices using new x86 instructions like ENQCMD{,S} and MOVDIR64B. In addition, add support for process address space identifiers (PASIDs) which are referenced by those command submission instructions along with the handling of the PASID state on context switch as another extended state. Work by Fenghua Yu, Ashok Raj, Yu-cheng Yu and Dave Jiang" * tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction x86/asm: Carve out a generic movdir64b() helper for general usage x86/mmu: Allocate/free a PASID x86/cpufeatures: Mark ENQCMD as disabled when configured out mm: Add a pasid member to struct mm_struct x86/msr-index: Define an IA32_PASID MSR x86/fpu/xstate: Add supervisor PASID state for ENQCMD x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions Documentation/x86: Add documentation for SVA (Shared Virtual Addressing) iommu/vt-d: Change flags type to unsigned int in binding mm drm, iommu: Change type of pasid to u32
2020-10-02Documentation/x86: Fix incorrect references to zero-page.txtHeinrich Schuchardt1-3/+3
The file zero-page.txt does not exit. Add links to zero-page.rst instead. [ bp: Massage a bit. ] Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201002190623.7489-1-xypron.glpk@gmx.de
2020-09-17Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)Ashok Raj2-0/+258
ENQCMD and Data Streaming Accelerator (DSA) and all of their associated features are a complicated stack with lots of interconnected pieces. This documentation provides a big picture overview for all of the features. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-4-git-send-email-fenghua.yu@intel.com
2020-09-01Documentation/x86: Add documentation for /proc/cpuinfo feature flagsKyung Min Park2-0/+156
/proc/cpuinfo shows features which the kernel supports. Some of these flags are derived from CPUID, and others are derived from other sources, including some that are entirely software-based. Currently, there is not any documentation in the kernel about how /proc/cpuinfo flags are generated and what it means when they are missing. Add documentation for /proc/cpuinfo feature flags enumeration. Document how and when x86 feature flags are used. Also discuss what their presence or absence mean for the kernel and users. Suggested-by: Dave Hansen <dave.hansen@intel.com> Co-developed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200831183500.15481-1-kyung.min.park@intel.com
2020-08-26x86/resctrl: Enable user to view thread or core throttling modeFenghua Yu1-2/+16
Early Intel hardware implementations of Memory Bandwidth Allocation (MBA) could only control bandwidth at the processor core level. This meant that when two processes with different bandwidth allocations ran simultaneously on the same core the hardware had to resolve this difference. It did so by applying the higher throttling value (lower bandwidth) to both processes. Newer implementations can apply different throttling values to each thread on a core. Introduce a new resctrl file, "thread_throttle_mode", on Intel systems that shows to the user how throttling values are allocated, per-core or per-thread. On systems that support per-core throttling, the file will display "max". On newer systems that support per-thread throttling, the file will display "per-thread". AMD confirmed in [1] that AMD bandwidth allocation is already at thread level but that the AMD implementation does not use a memory delay throttle mode. So to avoid confusion the thread throttling mode would be UNDEFINED on AMD systems and the "thread_throttle_mode" file will not be visible. Originally-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/1598296281-127595-3-git-send-email-fenghua.yu@intel.com Link: [1] https://lore.kernel.org/lkml/18d277fd-6523-319c-d560-66b63ff606b8@amd.com
2020-08-04Merge tag 'docs-5.9' of git://git.lwn.net/linuxLinus Torvalds2-2/+2
Pull documentation updates from Jonathan Corbet: "It's been a busy cycle for documentation - hopefully the busiest for a while to come. Changes include: - Some new Chinese translations - Progress on the battle against double words words and non-HTTPS URLs - Some block-mq documentation - More RST conversions from Mauro. At this point, that task is essentially complete, so we shouldn't see this kind of churn again for a while. Unless we decide to switch to asciidoc or something...:) - Lots of typo fixes, warning fixes, and more" * tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits) scripts/kernel-doc: optionally treat warnings as errors docs: ia64: correct typo mailmap: add entry for <alobakin@marvell.com> doc/zh_CN: add cpu-load Chinese version Documentation/admin-guide: tainted-kernels: fix spelling mistake MAINTAINERS: adjust kprobes.rst entry to new location devices.txt: document rfkill allocation PCI: correct flag name docs: filesystems: vfs: correct flag name docs: filesystems: vfs: correct sync_mode flag names docs: path-lookup: markup fixes for emphasis docs: path-lookup: more markup fixes docs: path-lookup: fix HTML entity mojibake CREDITS: Replace HTTP links with HTTPS ones docs: process: Add an example for creating a fixes tag doc/zh_CN: add Chinese translation prefer section doc/zh_CN: add clearing-warn-once Chinese version doc/zh_CN: add admin-guide index doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label futex: MAINTAINERS: Re-add selftests directory ...
2020-08-04Merge tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-0/+200
Pull x86 fsgsbase from Thomas Gleixner: "Support for FSGSBASE. Almost 5 years after the first RFC to support it, this has been brought into a shape which is maintainable and actually works. This final version was done by Sasha Levin who took it up after Intel dropped the ball. Sasha discovered that the SGX (sic!) offerings out there ship rogue kernel modules enabling FSGSBASE behind the kernels back which opens an instantanious unpriviledged root hole. The FSGSBASE instructions provide a considerable speedup of the context switch path and enable user space to write GSBASE without kernel interaction. This enablement requires careful handling of the exception entries which go through the paranoid entry path as they can no longer rely on the assumption that user GSBASE is positive (as enforced via prctl() on non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and exceptions) can still just utilize SWAPGS unconditionally when the entry comes from user space. Converting these entries to use FSGSBASE has no benefit as SWAPGS is only marginally slower than WRGSBASE and locating and retrieving the kernel GSBASE value is not a free operation either. The real benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes. The changes come with appropriate selftests and have held up in field testing against the (sanitized) Graphene-SGX driver" * tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/fsgsbase: Fix Xen PV support x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase selftests/x86/fsgsbase: Add a missing memory constraint selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Documentation/x86/64: Add documentation for GS/FS addressing mode x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit x86/entry/64: Introduce the FIND_PERCPU_BASE macro x86/entry/64: Switch CR3 before SWAPGS in paranoid entry x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation x86/process/64: Use FSGSBASE instructions on thread copy and ptrace x86/process/64: Use FSBSBASE in switch_to() if available x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE ...
2020-07-31x86: Add support for ZSTD compressed kernelNick Terrell1-3/+3
- Add support for zstd compressed kernel - Define __DISABLE_EXPORTS in Makefile - Remove __DISABLE_EXPORTS definition from kaslr.c - Bump the heap size for zstd. - Update the documentation. Integrates the ZSTD decompression code to the x86 pre-boot code. Zstandard requires slightly more memory during the kernel decompression on x86 (192 KB vs 64 KB), and the memory usage is independent of the window size. __DISABLE_EXPORTS is now defined in the Makefile, which covers both the existing use in kaslr.c, and the use needed by the zstd decompressor in misc.c. This patch has been boot tested with both a zstd and gzip compressed kernel on i386 and x86_64 using buildroot and QEMU. Additionally, this has been tested in production on x86_64 devices. We saw a 2 second boot time reduction by switching kernel compression from xz to zstd. Signed-off-by: Nick Terrell <terrelln@fb.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200730190841.2071656-7-nickrterrell@gmail.com
2020-07-13Documentation: x86: earlyprintk: drop doubled wordsRandy Dunlap1-1/+1
Drop the doubled word "and". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20200703213107.30758-2-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>