path: root/sys/arch/amd64/include

Commit log (each entry: subject, author, date, files changed, lines -deleted/+added):
* Remove a 15 year old XXX comment (mlarkin, 2019-05-28, 1 file, -2/+1)
* Oops, forgot to include a copyright year when originally added (guenther, 2019-05-17, 1 file, -2/+2)
* Mitigate Intel's Microarchitectural Data Sampling vulnerability. (guenther, 2019-05-17, 4 files, -4/+16)
  If the CPU has the new VERW behavior then that is used; otherwise the proper sequence from Intel's "Deep Dive" doc is used in the return-to-userspace and enter-VMM-guest paths. The enter-C3-idle path is not mitigated because it's only a problem when SMT/HT is enabled: mitigating everything when that's enabled would be a _huge_ set of changes that we see no point in doing. Update vmm(4) to pass through the MSR bits so that guests can apply the optimal mitigation.
  VMM help and specific feedback from mlarkin@; vendor-portability help from jsg@ and kettenis@. ok kettenis@ mlarkin@ deraadt@ jsg@
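  A minimal sketch of the VERW flush idea, assuming a CPU whose microcode gives VERW the new MD_CLEAR side effect; the selector value and function name are illustrative, not the actual OpenBSD code:

      #include <stdint.h>

      static inline void
      mds_flush_verw(void)
      {
          /* Any valid, writable data selector works; 0x10 is assumed. */
          uint16_t sel = 0x10;

          __asm volatile("verw %0" :: "m" (sel) : "cc");
      }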
* Add support to the BIOS bootloader for random kernel base VA (mlarkin, 2019-05-15, 1 file, -2/+2)
  This diff adds support for loading a kernel linked at a random VA (subject to some range restrictions). This change has been in snaps for a few days without any fallout. ok deraadt@
* vmm: add host side pvclock (pd, 2019-05-13, 1 file, -1/+5)
  Emulate kvm pvclock in vmm(4). Compatible with pvclock(4) in OpenBSD. Linux does not attach to this (yet). Fixes by reyk@ and tested extensively by reyk@, tb@ and phessler@. ok mlarkin@ phessler@ reyk@
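  For context, the kvm pvclock structure the host fills in looks roughly like this (layout per KVM's public ABI; the overflow-safe 128-bit multiply of real implementations is simplified to 64-bit here):

      #include <stdint.h>

      struct pvclock_time_info {
          uint32_t version;        /* odd while the host is updating */
          uint32_t pad0;
          uint64_t tsc_timestamp;  /* TSC value at last update */
          uint64_t system_time;    /* nanoseconds at tsc_timestamp */
          uint32_t tsc_to_system_mul;
          int8_t   tsc_shift;
          uint8_t  flags;
          uint8_t  pad[2];
      };

      /* ns = system_time + (((tsc - tsc_timestamp) << shift) * mul >> 32) */
      static uint64_t
      pvclock_ns(const struct pvclock_time_info *ti, uint64_t tsc)
      {
          uint64_t delta = tsc - ti->tsc_timestamp;

          if (ti->tsc_shift >= 0)
              delta <<= ti->tsc_shift;
          else
              delta >>= -ti->tsc_shift;
          return ti->system_time + ((delta * ti->tsc_to_system_mul) >> 32);
      }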
* Delete cpu_idle_{enter,leave}_fcn() as unused. Add RETGUARD checks to cpu_idle_cycle() (guenther, 2019-05-12, 1 file, -3/+1)
  ok mpi@ kettenis@
* s/availible/available/ (guenther, 2019-05-12, 1 file, -2/+2)
* vmm: add an x86 page table walker (pd, 2019-05-12, 1 file, -1/+2)
  Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc. With help from Mike Larkin. ok mlarkin@
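  The core of such a walker is small; a sketch of a 4-level walk, where gpa_to_hva() is a hypothetical helper mapping guest-physical to host-virtual addresses (the real vmm(4)/vmd(8) interfaces differ):

      #include <stdint.h>

      #define PG_V     0x001ULL                 /* present */
      #define PG_PS    0x080ULL                 /* large page */
      #define PG_FRAME 0x000ffffffffff000ULL

      uint64_t *gpa_to_hva(uint64_t gpa);       /* hypothetical */

      /* Translate va through cr3; returns 0 on a non-present entry. */
      static uint64_t
      pt_walk(uint64_t cr3, uint64_t va)
      {
          uint64_t pa = cr3 & PG_FRAME;

          for (int shift = 39; shift >= 12; shift -= 9) {
              uint64_t pte = gpa_to_hva(pa)[(va >> shift) & 0x1ff];

              if ((pte & PG_V) == 0)
                  return 0;
              pa = pte & PG_FRAME;
              /* 1GB/2MB mappings end the walk early. */
              if (shift > 12 && (pte & PG_PS))
                  return pa | (va & ((1ULL << shift) - 1));
          }
          return pa | (va & 0xfff);
      }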
* Improve the interaction between efifb(4), inteldrm(4) and radeondrm(4) when we have a serial console, by introducing the notion of a "primary" graphics device (kettenis, 2019-05-04, 1 file, -3/+4)
  The primary graphics device is the one set up and used by firmware (BIOS, UEFI). The goal is to make sure that wsdisplay0 and drm0 reliably attach to the primary graphics device such that X works out of the box even if you have multiple cards or if you are using a serial console. This also fixes the situation where inteldrm(4) or radeondrm(4) would take over the console on UEFI systems even if the kernel was booted with a serial console. ok jsg@
* Fix vmm_support.S compilation error with gcc 8.3 (mlarkin, 2019-05-02, 1 file, -2/+2)
  ok deraadt
* change marks[] array to uint64_t, so the code can track full 64-bit details from the ELF header instead of faking it (deraadt, 2019-04-10, 1 file, -2/+2)
  Proposal from mlarkin, tested on most architectures already.
* Add variable length trap padding between the retguard epilogue and the following return (mortimer, 2019-04-02, 1 file, -1/+2)
  This change adds a constraint that the name passed to the RETGUARD_* macros must correspond to the name in the corresponding ENTRY which starts the function (or a function which appears beforehand in the same file). Since we use the distance from the ENTRY definition to calculate how much padding to insert, the ENTRY symbol must be in scope at assembly time. This is almost always the case already, since it is the natural way to name the retguard symbols so they remain unique. ok deraadt@
* vmm(4): flush EPT when uvm removes mappings from a nested page table (mlarkin, 2019-04-01, 1 file, -1/+2)
  Ensure the TLB is flushed to avoid stale entries when uvm removes entries from a guest VM's EPT. This is done on VM teardown and when uvm pages out pages in low memory situations. Prompted by a conversation with Maxime from NetBSD a few months back. ok deraadt
* vmm(4): Don't advertise support for SSBD and related speculative exec control features on AMD (mlarkin, 2019-04-01, 1 file, -1/+6)
  Linux tries to use them, and since these are not fully implemented yet, it results in an OOPS during boot on recent hardware. When these are properly passed through, we can restore advertising support for this feature. ok deraadt@
* vmm(4): Don't advertise support for MCE/MCA since we don't implement the MSRs to support them (mlarkin, 2019-04-01, 1 file, -2/+3)
  Fixes an OOPS during Linux guest VM boot on Ryzen. ok deraadt
* vmm(4): On VMX, use sgdt/sidt to reset the GDT/IDT limits after exiting the guest VM (mlarkin, 2019-03-26, 1 file, -1/+25)
  By default, VMX sets the limits to 0xFFFF on exit, which is larger than what we want and can lead to security issues. While here, reset the LDT as well. We don't use this in OpenBSD, and VMX loads a null LDT selector on exit anyway, but resetting it here prevents any future surprises. Pointed out by Maxime Villard from NetBSD - thanks! ok deraadt@
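  The reset boils down to re-reading the pseudo-descriptor (whose base is valid but whose limit was clobbered), fixing the limit, and re-loading it; a sketch with illustrative limit values, not vmm(4)'s actual code:

      #include <stdint.h>

      struct region_descriptor {
          uint16_t rd_limit;
          uint64_t rd_base;
      } __attribute__((packed));

      static inline void
      reset_gdt_idt_limits(void)
      {
          struct region_descriptor r;

          __asm volatile("sgdt %0" : "=m" (r)); /* base ok, limit 0xFFFF */
          r.rd_limit = 16 * 8 - 1;              /* assumed GDT entry count */
          __asm volatile("lgdt %0" :: "m" (r));

          __asm volatile("sidt %0" : "=m" (r));
          r.rd_limit = 256 * 16 - 1;            /* 256 16-byte gates on amd64 */
          __asm volatile("lidt %0" :: "m" (r));
      }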
* Fix pctr(4) issues with MP and suspend (guenther, 2019-03-25, 2 files, -18/+5)
  - use an IPI to notify other CPUs to update CR4 and the MSRs
  - use the cpu(4) resume callback to restore the pctr(4) settings after suspend/hibernate
  ok kettenis@ deraadt@
* X86_IPI_NAMES's only use was #if 0'ed out; delete both (guenther, 2019-03-25, 1 file, -6/+1)
  ok kettenis@ deraadt@
* Use the debugger mutex for `ddb_mp_mutex' (visa, 2019-03-23, 1 file, -3/+1)
  This should prevent a race that could leave `ddb_mp_mutex' locked if one CPU incremented `db_active' while another CPU was in the critical section. When the race hit, the debugger was unable to resume execution or switch between CPUs. Race analyzed by patrick@. OK mpi@ patrick@
* Bump VMM_MAX_NAME_LEN to 64 to allow for longer vm names (ajacoutot, 2019-03-02, 1 file, -2/+2)
  ok mlarkin@
* vmm(4): allow preservation and restoration of guest debug registers (mlarkin, 2019-02-20, 1 file, -2/+22)
  Allow save/restore of %drX registers during VM exit and entry. Discussed with deraadt@.
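  Reading and writing %drX from C is just a mov through inline asm; a sketch of the idea (dr1-dr3 analogous, and on VMX %dr7 typically lives in the VMCS; names are illustrative, not vmm(4)'s actual code):

      #include <stdint.h>

      struct dr_state { uint64_t dr0, dr6; };

      static inline void
      save_guest_drs(struct dr_state *d)
      {
          __asm volatile("movq %%dr0, %0" : "=r" (d->dr0));
          __asm volatile("movq %%dr6, %0" : "=r" (d->dr6));
      }

      static inline void
      load_guest_drs(const struct dr_state *d)
      {
          __asm volatile("movq %0, %%dr0" :: "r" (d->dr0));
          __asm volatile("movq %0, %%dr6" :: "r" (d->dr6));
      }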
* Remove PTPpaddr and use proc0.p_addr->u_pcb.pcb_cr3 instead (yasuoka, 2019-02-18, 1 file, -4/+1)
  This also fixes kernel core dumps so they are readable by savecore. From fukaumi at soum.co.jp. ok mlarkin
* the KERN*_{HI,LO} variables are not needed; it is easier to calculate the parts on the fly (deraadt, 2019-01-24, 1 file, -6/+1)
* flense more trailing whitespace (phessler, 2019-01-22, 1 file, -4/+4)
* remove trailing whitespace in the Laptop Package part of the license text (phessler, 2019-01-22, 1 file, -4/+4)
  no words or punctuation were modified.
* Support 2TB phys mem (mlarkin, 2019-01-21, 1 file, -5/+8)
  This change expands the direct map to 4 slots (512GB each), to support machines with up to 2TB physical memory. Should further expansion be required, this change provides the means to do that with a single #define change. With help from and ok guenther.
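  The arithmetic, with made-up names (the real constants live in the amd64 pmap headers): each PML4 slot spans 2^39 bytes = 512GB, so four slots cover 2TB, and growing the map is a one-define change:

      #define L4_SLOT_SPAN         (1ULL << 39)  /* 512GB per PML4 slot */
      #define NUM_L4_SLOTS_DIRECT  4             /* bump to grow the map */
      #define DIRECTMAP_SIZE       (NUM_L4_SLOTS_DIRECT * L4_SLOT_SPAN) /* 2TB */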
* vmm: better handling of two SMM related MSRs (mlarkin, 2019-01-21, 1 file, -1/+3)
  We currently ignore MSR_SMBASE and MSR_SMM_MONITOR_CTL, but the SDM says accessing the former for read and the latter for write while not in SMM mode should produce a #GP. This change detects those operations and injects a #GP as the documentation says. The previous behaviour was harmless, just not correct. ok pd
* Implement rdmsr_safe (mlarkin, 2019-01-20, 1 file, -1/+3)
  rdmsr_safe is used when reading potentially missing MSRs, to avoid triggering #GPs in the kernel. ok guenther
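  The concept in simplified pseudo-C: arm a fault-recovery path before the rdmsr so a #GP from a missing MSR becomes an error return rather than a kernel trap (the onfault plumbing here is hypothetical, not the kernel's actual mechanism):

      #include <stdint.h>

      int fault_recovery_armed(void);   /* hypothetical onfault helper */

      int
      rdmsr_safe_sketch(uint32_t msr, uint64_t *v)
      {
          uint32_t lo, hi;

          if (fault_recovery_armed())   /* taken if the rdmsr faults */
              return -1;                /* MSR doesn't exist */

          __asm volatile("rdmsr" : "=a" (lo), "=d" (hi) : "c" (msr));
          *v = (uint64_t)hi << 32 | lo;
          return 0;
      }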
* Add a pwraction sysctl that controls what the power button does on acpi (tedu, 2019-01-19, 1 file, -2/+4)
  By default, nothing changes -- shutdown is initiated. But it allows turning the power button into a sleep button if desired. (grudging) ok from a few parties
* Finish randomizing remaining layers of pmap_kernel (mlarkin, 2019-01-19, 1 file, -1/+2)
  An earlier diff moved the top level page; this diff finishes the lower layers. New pages are allocated for the existing hierarchy (which thus benefits from random placement from pmemrange/etc). Existing managed pages are returned to uvm (a small number of bootstrap pages are not returned as they are allocated in locore0 and thus aren't managed). ok deraadt
* Move the placement of pmap_kernel's toplevel PML4 page (mlarkin, 2019-01-11, 1 file, -1/+3)
  This change moves the PML4 for pmap_kernel elsewhere during early boot. Lower levels of pmap_kernel will be moved in subsequent changes, but there are other pmap changes coming that need to be integrated first. In snaps for 3 days, no fallout seen. ok deraadt, and substantial input and help from guenther@
* add efifb_stolen() to get the size of the efifb framebuffer (jsg, 2019-01-10, 1 file, -1/+3)
  suggested by and ok kettenis@
* Increase L2 PTE reservation for the kernel (mlarkin, 2019-01-06, 1 file, -2/+2)
  Bump the number of L2 page table entries reserved for the kernel from 16 to 64, to allow for larger kernels. This diff was in snaps for 21 days without any reported fallout. ok deraadt
* Crank MAXTSIZ to next pow2 (256MB) because a few piggy binaries compiled with retpoline enabled are even piggier now (deraadt, 2019-01-03, 1 file, -2/+2)
  diagnosed with robert, kettenis and drahn
* remove the intr_find_mpmapping prototype; the function was removed in intr.c rev 1.31 in 2011 (jsg, 2018-12-21, 1 file, -2/+1)
* Include srp.h where struct cpu_info uses srp (jsg, 2018-12-05, 1 file, -1/+2)
  This avoids erroring out when including cpu.h, machine/intr.h, etc. without first including param.h when MULTIPROCESSOR is defined. ok visa@
* Add i386 relocations. Needed for 32-bit UEFI bootloader. (kettenis, 2018-10-20, 1 file, -1/+14)
  ok patrick@, naddy@
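  For flavor, the classic self-relocation loop such defines make possible, using standard ELF32 types and the usual R_386_RELATIVE value (a sketch, not the bootloader's actual code):

      #include <stdint.h>

      #define R_386_RELATIVE  8
      #define ELF32_R_TYPE(i) ((i) & 0xff)

      typedef struct {
          uint32_t r_offset;
          uint32_t r_info;
      } Elf32_Rel;

      /* Add the load bias to every word an R_386_RELATIVE points at. */
      static void
      apply_relative(Elf32_Rel *rel, int n, uint32_t bias)
      {
          for (int i = 0; i < n; i++)
              if (ELF32_R_TYPE(rel[i].r_info) == R_386_RELATIVE)
                  *(uint32_t *)(uintptr_t)(rel[i].r_offset + bias) += bias;
      }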
* In vmm, handle xsetbv like xrstor: instead of trying to prevalidate the values, just try it and handle the #GP if it faults (guenther, 2018-10-07, 1 file, -1/+2)
  Problem reported by Maxime Villard (max(at)m00nbsd.net). ok mlarkin@
* Use PCIDs where they and the INVPCID instruction are available (guenther, 2018-10-04, 4 files, -7/+42)
  This uses one PCID for kernel threads, one for the U+K tables of normal processes, one for the matching U-K tables (when meltdown is in effect), and one for temporary mappings when poking other processes. Some further tweaks are envisioned but this is good enough to provide more separation and has (finally) been stable under ports testing. Lots of ports testing and valid complaints from naddy@ and sthen@; feedback from mlarkin@ and sf@.
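  A sketch of a single-context INVPCID, flushing every TLB entry tagged with one PCID; the descriptor is the architectural 16-byte operand (PCID in the low 12 bits of the first quadword, then a linear address), though the function and macro names here are illustrative:

      #include <stdint.h>

      #define INVPCID_SINGLE_CTX 1ULL    /* invalidate one PCID's entries */

      struct invpcid_desc {
          uint64_t pcid;                 /* low 12 bits used */
          uint64_t addr;
      };

      static inline void
      invpcid_pcid(uint64_t pcid)
      {
          struct invpcid_desc d = { pcid & 0xfff, 0 };

          __asm volatile("invpcid %0, %1"
              :: "m" (d), "r" (INVPCID_SINGLE_CTX) : "memory");
      }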
* Unify the MD byteswapping code as much as possible across architectures (naddy, 2018-10-02, 1 file, -22/+22)
  Use inline functions instead of GNU C statement expressions, and make them available to userland. With clues from guenther@. ok guenther@ kettenis@
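  The shape of the change, illustrated: a plain inline function is valid in userland headers, unlike a GNU statement expression (the function name here is illustrative; each arch's <machine/endian.h> has its own):

      #include <stdint.h>

      static inline uint32_t
      swap32_inline(uint32_t x)
      {
          return (x << 24) | ((x << 8) & 0x00ff0000) |
              ((x >> 8) & 0x0000ff00) | (x >> 24);
      }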
* Delete the reserve_dumppages() declaration, missed in its 2010 removal (guenther, 2018-09-30, 1 file, -3/+1)
  ok deraadt@
* Remap the UEFI buffer early such that we can use a write combining mapping, which speeds things up considerably compared to an uncached mapping (kettenis, 2018-09-22, 1 file, -1/+2)
  ok deraadt@
* vmm(4): Clear the guest MWAITX/MONITORX extended CPUID feature bit, like we already do for MWAIT/MONITOR (brynet, 2018-09-20, 1 file, -3/+4)
  Also match Intel here by not exposing the SVM capability to AMD guests. Allows Linux guests to boot in vmd(8) on Ryzen CPUs. ok mlarkin@
* Whitespace fixes (guenther, 2018-09-12, 1 file, -3/+3)
* Add defines for amd microcode msrs which appear to be present since k8 (jsg, 2018-09-11, 1 file, -1/+3)
  though amd only provides public redistributable updates for >= family 10h.
* Add defines for dealing with PCID support in cr3 (guenther, 2018-09-05, 1 file, -1/+4)
  ok mlarkin@
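  The architectural layout such defines describe (names here are illustrative): with CR4.PCIDE set, %cr3 carries the PCID in bits 0-11, the page table base above that, and bit 63 set on a write tells the CPU not to flush that PCID's cached translations:

      #define CR3_PCID_MASK   0x0000000000000fffULL
      #define CR3_ADDR_MASK   0x000ffffffffff000ULL
      #define CR3_REUSE_PCID  0x8000000000000000ULL  /* no-flush on load */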
* Define __HAVE_ACPI. (kettenis, 2018-08-25, 1 file, -1/+3)
  ok deraadt@, krw@, jca@
* Perform mitigations for Intel L1TF screwup (deraadt, 2018-08-21, 3 files, -4/+8)
  There are three options: (1) future CPUs which don't have the bug, (2) CPUs with microcode containing an L1D flush operation, (3) stuffing the L1D cache with fresh data and expiring old content. This stuffing loop is complicated and interesting; no details on the mitigation have been released by Intel, so Mike and I studied other systems for inspiration. The replacement algorithm for the L1D is described in the tlbleed paper. We use a 64K PA-linear region filled with trapsleds (in case there is L1D->L1I data movement). The TLBs covering the region are loaded first, because TLB loading apparently flows through the D cache. Before performing vmlaunch or vmresume, the cachelines covering the guest registers are also flushed. With mlarkin; additional testing by pd; handy comments from the kettenis and guenther peanuts.
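  The shape of the stuffing loop (option 3), heavily simplified from the real assembly and with an illustrative region pointer: prime the TLB entries first, then touch every cacheline to displace the old L1D contents:

      #include <stddef.h>
      #include <stdint.h>

      #define REGION_SIZE (64 * 1024)
      #define PAGE_SIZE   4096
      #define CACHELINE   64

      static void
      l1d_stuff(volatile uint8_t *region)   /* the 64K PA-linear region */
      {
          size_t i;

          /* Pass 1: load the TLB entries covering the region. */
          for (i = 0; i < REGION_SIZE; i += PAGE_SIZE)
              (void)region[i];

          /* Pass 2: pull every line of the region into the L1D. */
          for (i = 0; i < REGION_SIZE; i += CACHELINE)
              (void)region[i];
      }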
* Remove unused spllock() (visa, 2018-08-20, 1 file, -2/+1)
  OK deraadt@ mpi@
* Add support for multiple PCI segments (kettenis, 2018-08-19, 1 file, -2/+3)
  Only really implemented for arm64 for now as amd64/i386 firmware still caters for legacy OSes that only support a single PCI segment. ok patrick@