Per Intel SDM (Vol 3D, App. A.10) bit 0 should be read as a 1 if enabled.
From Adam Steen. ok mlarkin@
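
A hedged sketch of what that check looks like, assuming this refers to the
IA32_VMX_EPT_VPID_CAP MSR (0x48c) that App. A.10 documents; the helper name
is hypothetical and rdmsr() is the usual kernel accessor:

    #include <stdint.h>

    #define IA32_VMX_EPT_VPID_CAP	0x48c	/* SDM Vol 3D, App. A.10 */

    /* bit 0 must read as 1 when the corresponding support is enabled */
    static int
    ept_vpid_cap_bit0_ok(void)
    {
        return (rdmsr(IA32_VMX_EPT_VPID_CAP) & 1) != 0;
    }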
>= zen 2 based amd processors return a value of 9 for l3 cache assoc
via cpuid 0x80000006. As that is a reserved value, we end up incorrectly
claiming the l3 cache is disabled. While it is possible to get l3 cache
information via cpuid 0x8000001d when TOPEXT is advertised, that will
instead give information about the l3 cache available to the core
complex (CCX) the cpu belongs to, whereas previously the amount of l3
available to all core complexes was shown.
As we don't detail topology in dmesg or show the mapping of cores to
core complexes, just stop displaying l3 information. It already isn't
shown on intel.
ok gkoehler@
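
For reference, a sketch of where the value comes from; the bit layout
follows AMD's CPUID Fn8000_0006 documentation, and the decoder shape is
illustrative rather than the actual identcpu.c code:

    #include <stdint.h>

    /*
     * EDX of cpuid 0x80000006: bits 31:18 are the L3 size in 512KB
     * units, bits 15:12 the associativity encoding.  0x9 is reserved,
     * which is what zen 2 and newer parts return, so table-driven
     * decoding made the cache look "disabled".
     */
    static const char *
    l3_assoc_str(uint32_t edx)
    {
        switch ((edx >> 12) & 0xf) {	/* bits 15:12 */
        case 0x0: return "disabled";
        case 0x9: return "reserved";	/* seen on >= zen 2 */
        case 0xf: return "fully associative";
        default:  return "n-way";
        }
    }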
while TC_TSS and TC_FLAGMASK have _never_ been used
ok kettenis@
slots but rather go directly from the iretq frame to an intrframe.
This saves 22 bytes in each of the 148 interrupt entry points.
ok mpi@
No effect on object code, just symbol table accuracy
ok mpi@
The Meltdown mitigation work ran right across the previous abstractions;
draw slightly different lines and use separate macros for interrupts
vs traps vs syscall.
The generated ASM for traps and general interrupts is completely
unchanged; the ASM for the four directly routed interrupts is brought
into line with the general interrupts; the ASM for syscalls is
changed to delay reenabling interrupts until after all registers
are saved and cleared.
ok mpi@
it shouldn't optimise across them.
ok kettenis@
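
Assuming the statements in question are inline assembly, the usual way to
keep the compiler from moving memory accesses across them is a "memory"
clobber, e.g.:

    /* emits no instructions, but the compiler must assume all of
     * memory is read and written here, so loads and stores cannot
     * be reordered or cached across this point */
    __asm volatile("" ::: "memory");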
This creates separate domains for each PCI device and can provide protection
against invalid memory access. Needed for Passthrough PCI from vmd.
ok deraadt@, kettenis@
floating-point control modes are properly restored by longjmp(3).
ok guenther@
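
A small illustration of the guarantee (fesetround(3) is in libm, so link
with -lm; this program should exit 0 once the modes are saved and
restored):

    #include <fenv.h>
    #include <setjmp.h>

    static jmp_buf env;

    int
    main(void)
    {
        fesetround(FE_TOWARDZERO);
        if (setjmp(env) == 0) {
            fesetround(FE_UPWARD);	/* changed after setjmp() */
            longjmp(env, 1);
        }
        /* the rounding mode from setjmp() time is back in effect */
        return fegetround() != FE_TOWARDZERO;
    }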
ok kettenis@ deraadt@
* We don't need TC_LAST
* Make internal functions static to avoid namespace pollution in libc.a
* Use a switch statement to harmonize with architectures providing
multiple timecounters
ok deraadt@, pirofti@
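
A sketch of the switch shape on an architecture exposing more than one
counter; the ids and field names are illustrative, not the actual
usertc.c contents:

    static int
    tc_get_timecount(struct timekeep *tk, u_int *tc)
    {
        switch (tk->tk_user) {	/* which counter the kernel exposed */
        case TC_TSC:
            *tc = rdtsc();	/* as in machine/cpufunc.h */
            return 0;
        }
        return -1;		/* unknown: fall back to the syscall */
    }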
This diff exposes parts of clock_gettime(2) and gettimeofday(2) to
userland via libc, liberating processes from the need for a context
switch every time they want to count the passage of time.
If a timecounter clock can be exposed to userland, then it needs to set
its tc_user member to a non-zero value. Tested with one or multiple
counters per architecture.
The timing data is shared through a pointer found in the new ELF
auxiliary vector AUX_openbsd_timekeep containing timehands information
that is frequently updated by the kernel.
Timing differences between the last kernel update and the current time
are adjusted in userland by the tc_get_timecount() function inside the
MD usertc.c file.
This permits a much more responsive environment, quite visible in
browsers, office programs and gaming (apparently one is able to fly
in Minecraft now).
Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others!
OK from at least kettenis@, cheloha@, naddy@, sthen@
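
The userland side boils down to ordinary timecounter arithmetic against
the shared timehands; a simplified sketch with hypothetical struct and
field names (th_scale being the kernel's 64.64 fixed-point
ticks-to-fraction factor, tc_get_timecount() the MD routine sketched
earlier):

    #include <sys/types.h>
    #include <stdint.h>

    struct bintime { uint64_t sec; uint64_t frac; };

    struct timekeep {			/* hypothetical layout */
        u_int		th_offset_count;	/* counter @ last update */
        u_int		th_counter_mask;
        uint64_t	th_scale;	/* 64.64: ticks -> frac seconds */
        struct bintime	th_offset;	/* uptime @ last update */
        int		tk_user;	/* exposed counter id, 0 = none */
    };

    int tc_get_timecount(struct timekeep *, u_int *);	/* MD usertc.c */

    static int
    user_binuptime(struct bintime *bt, struct timekeep *tk)
    {
        uint64_t delta, ofrac;
        u_int count;

        if (tc_get_timecount(tk, &count) != 0)
            return -1;		/* let libc take the real syscall */

        /* ticks since the kernel last wrote the timehands */
        delta = (count - tk->th_offset_count) & tk->th_counter_mask;

        *bt = tk->th_offset;
        ofrac = bt->frac;
        bt->frac += tk->th_scale * delta;	/* low 64 bits of product */
        if (bt->frac < ofrac)
            bt->sec++;			/* propagate the carry */
        return 0;
    }

The real shared page also needs a generation scheme so userland can
detect the kernel updating the timehands mid-read.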
doing some sort of time measurement. This is necessary since RDTSC
is not a serializing instruction. We can use LFENCE as the serializing
instruction instead of CPUID since all amd64 machines have SSE.
This considerably reduces the jitter in TSC skew measurements.
ok deraadt@, cheloha@, phessler@
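
The resulting read routine looks like this sketch (OpenBSD keeps an
equivalent inline in cpufunc.h):

    static inline uint64_t
    rdtsc_lfence(void)
    {
        uint32_t hi, lo;

        /* LFENCE waits for all prior instructions to complete, so
         * speculation cannot sample the TSC early */
        __asm volatile("lfence; rdtsc" : "=d" (hi), "=a" (lo));
        return ((uint64_t)hi << 32) | lo;
    }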
functionality is provided by <sys/stdarg.h> using compiler builtins.
Tested in a ports bulk build on amd64 by naddy@
OK naddy@ mpi@
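
The builtin-based definitions amount to this sketch of the
<sys/stdarg.h> approach:

    typedef __builtin_va_list va_list;

    #define va_start(ap, last)	__builtin_va_start((ap), (last))
    #define va_arg(ap, type)	__builtin_va_arg((ap), type)
    #define va_copy(dst, src)	__builtin_va_copy((dst), (src))
    #define va_end(ap)		__builtin_va_end((ap))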
the cpu is specified by a struct cpu_info *, which should generally
come from an intrmap.
this is adapted from a diff that patrick@ sent round a few years
ago for a pci_intr_map_msix_cpuid, where you asked for an msi vector
on a specific cpu, and then called pci_intr_establish with the
handle you get. kettenis pointed out that it's hard on some archs
to carry cpu on a pci interrupt handle, so i tweaked it to turn it
into a pci_intr_establish_cpu instead.
jmatthew@ and i (but mostly jmatthew@ to be honest) have been
experimenting with this api on multiple archs and it is working out
well. i'm putting this diff in now on amd64 so people can kick the
tyres a bit.
tested with hacked up vmx(4), ix(4), and mcx(4)
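
a hedged usage sketch for a multi-queue driver; the driver-side names are
made up and the signature is inferred from the description above:

    for (i = 0; i < sc->sc_nqueues; i++) {
        pci_intr_handle_t ih;

        if (pci_intr_map_msix(pa, i, &ih) != 0)
            break;
        /* ask for this vector on the cpu the intrmap picked */
        sc->sc_q[i].q_ih = pci_intr_establish_cpu(pa->pa_pc, ih,
            IPL_NET | IPL_MPSAFE, intrmap_cpu(sc->sc_intrmap, i),
            mydrv_queue_intr, &sc->sc_q[i], DEVNAME(sc));
    }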
rnd.c uses nanotime to get access to some bits that change quickly
between events that it can mix into the entropy pool. it doesn't
use nanotime to get a monotonically increasing set of ordered and
accurate timestamps, it just wants something with bits that change.
there's been discussions for years about letting rnd use a clock
that's super fast to read, but not necessarily accurate, but it
wasn't until recently that i figured out it wasn't interested in
time at all, so things like keeping a fast clock coherent between
cpu cores or correct according to ntp is unnecessary. this means we
can just let rnd read the cycle counters on cpus and things will
be fine. cpus with cycle counters that vary in their speed and
aren't kept consistent between cores may even be desirable in this
context.
so this is the first step in converting rnd.c to reading cycle
counters. it copies the nanotime backend to each arch, and they can
replace it with something MD as a second step later on.
djm@ suggested rnd_messybytes, but we landed on cpu_rnd_messybits.
thanks to visa for his eyes.
ok deraadt@ visa@
deraadt@ says he will help handle any MD fallout that occurs.
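
on amd64 the MD second step can be as small as reading the tsc; a sketch
of what such a replacement might look like:

    static inline u_int
    cpu_rnd_messybits(void)
    {
        /* not a clock: only the churn in the low bits matters, so
         * no fencing or cross-cpu coherence is needed */
        return rdtsc();
    }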
While here use the kqfilter equivalent to `seltrue' to ensure both
interfaces are coherent.
ok visa@
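
The kqfilter equivalent is a filter that always reports ready; a sketch of
the shape, using the existing filt_seltrue() backend:

    int
    seltrue_kqfilter(dev_t dev, struct knote *kn)
    {
        switch (kn->kn_filter) {
        case EVFILT_READ:
        case EVFILT_WRITE:
            kn->kn_fop = &seltrue_filtops;	/* filt_seltrue: always 1 */
            break;
        default:
            return (EINVAL);
        }
        return (0);
    }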
and move it to the end of machdep.c. Rework the actual implementation
for the MC146818 compatible RTC into something that can be used as a todr_handle.
ok mpi@
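
A sketch of the todr_handle shape, with hypothetical function names;
todr_chip_handle is the interface from <dev/clock_subr.h>:

    static struct todr_chip_handle rtc_todr = {
        .todr_gettime = rtc_gettime,	/* read the MC146818 registers */
        .todr_settime = rtc_settime,	/* and write them back */
    };

    todr_handle = &rtc_todr;	/* inittodr()/resettodr() go through this */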
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical
pages. Currently, vmd just terminates the vm if it later gets a protection
fault.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
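
A hedged userland sketch of driving the new ioctl; the parameter struct
and its field names are assumptions based on the description above:

    struct vm_mprotect_ept_params vmep;

    memset(&vmep, 0, sizeof(vmep));
    vmep.vmep_vm_id = vm_id;
    vmep.vmep_sgpa = gpa;	/* guest-physical start, page aligned */
    vmep.vmep_size = len;
    vmep.vmep_prot = PROT_READ;	/* drop write/exec to lock the pages */
    if (ioctl(vmm_fd, VMM_IOC_MPROTECT_EPT, &vmep) == -1)
        err(1, "VMM_IOC_MPROTECT_EPT");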
* Switch to using vcpu->vc_vmx_cr0_fixed[1|0] to check the must-be-0|1 bits,
rather than the cpu capabilities.
* Add the checks on the new values as per the SDM 2.5 CONTROL REGISTERS.
2.1 Bits 63:32 of CR0 and CR4 are reserved and must be written with zeros.
Writing a nonzero value to any of the upper 32 bits results in a
general-protection exception, #GP(0).
2.2 Setting the PG flag when the PE flag is clear causes a general-protection
exception (#GP).
11.5.1 Cache Control Registers and Bits, Table 11-5. Cache Operating Modes
2.3 CD: 0, NW: 1, Invalid setting. Generates a general-protection exception
(#GP) with an error code of 0.
* Don't always assume that if the guest is not disabling paging, it is
enabling it; check that the guest is actually enabling paging. Also, only
read cr4 when we actually need it, not right at the start.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
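
Taken together, the new checks reduce to something like this sketch
(CR0_* constants as in <machine/specialreg.h>; the real handler injects
#GP(0) into the guest rather than returning an error):

    if (cr0 & 0xffffffff00000000ULL)	/* 2.1: upper 32 bits reserved */
        goto inject_gp;
    if ((cr0 & CR0_PG) && !(cr0 & CR0_PE))	/* 2.2: PG requires PE */
        goto inject_gp;
    if (!(cr0 & CR0_CD) && (cr0 & CR0_NW))	/* 2.3: CD=0/NW=1 invalid */
        goto inject_gp;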
kernel by using lfence in place of stac/clac on pre-SMAP CPUs.
To quote from https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection
"If the OS makes use of Supervisor Mode Access Prevention (SMAP)
on processors with SMAP enabled, then LVI on kernel load from
user pages will be mitigated. This is because the CLAC and STAC
instructions have LFENCE semantics on processors affected by LVI,
and this serves as a speculation fence around kernel loads from
user pages."
ok deraadt@
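
Conceptually, each patch site gets one of two bodies at boot. A sketch,
where the byte sequences are the real instruction encodings but the
patching helper and tag names are stand-ins for the kernel's codepatch
machinery:

    /* stac = 0f 01 cb, clac = 0f 01 ca, lfence = 0f ae e8 */
    if (cpu_has_smap) {
        patch_all(TAG_STAC, "\x0f\x01\xcb", 3);	/* stac */
        patch_all(TAG_CLAC, "\x0f\x01\xca", 3);	/* clac */
    } else {
        patch_all(TAG_STAC, "\x0f\xae\xe8", 3);	/* lfence, not nops */
        patch_all(TAG_CLAC, "\x0f\xae\xe8", 3);
    }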
On these machines we can't use the direct map since early on during boot
the direct map only covers the first 4GB of memory. Instead, use a
special (and temporary) mapping until we remap the framebuffer near the
start of autoconf. With lots of help from mlarkin@
tested by yasuoka@
ok mlarkin@
during kernel startup before syslogd(8) can receive it. Increase
message buffer size from 94k to 128k on amd64.
reported by Hrvoje Popovski; OK deraadt@
Even with the latest microcode this is not set on all CPUs with TSX, but
is set on CPUs which don't need MDS mitigations.
MDS mitigations also mitigate TSX Asynchronous Abort (TAA) but aren't
done if the CPU claims to not be affected by MDS (MDS_NO).
According to "Deep Dive: Intel Transactional Synchronization Extensions
(Intel TSX) Asynchronous Abort" CPUs requiring additional mitigations
for this are:
06-8e-0c Whiskey Lake (ULT refresh)
06-55-0{6,7} 2nd Gen Xeon Scalable Processors based on Cascade Lake
06-9e-0d Coffee Lake R
Currently TSX is disabled unconditionally when possible even if TAA_NO
is set.
ok bluhm@ guenther@ deraadt@
tested by bluhm@ on i5-8365U (06-8e-0c).
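
A sketch of the decision, with the IA32_ARCH_CAPABILITIES bit positions
as given in Intel's deep-dive (the MSR itself is only present when
enumerated via CPUID):

    #define MSR_ARCH_CAPABILITIES	0x10a
    #define ARCH_CAP_MDS_NO		(1ULL << 5)
    #define ARCH_CAP_TAA_NO		(1ULL << 8)

    uint64_t cap = rdmsr(MSR_ARCH_CAPABILITIES);

    /* MDS mitigations cover TAA as well, so extra TAA work is only
     * needed when the CPU claims MDS_NO but not TAA_NO */
    if ((cap & ARCH_CAP_MDS_NO) && !(cap & ARCH_CAP_TAA_NO))
        mitigate_taa();		/* hypothetical: e.g. disable TSX */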
the kernel.
ok mlarkin@, visa@
ok guenther@ kettenis@
ok deraadt@
This is legacy code and was probably used instead of the desired
inline'd function in cpufunc.h.
OK deraadt@, kettenis@.
CPU0 is the reference clock and all others are skewed. During CPU
initialization the clocks synchronize by keeping a registry of each CPU
clock skew and adapting the TSC read routine accordingly.
This commit also re-enables TSC as the default time source.
Future work includes MSR-based synchronization via IA32_TSC_ADJUST
and perhaps adding a task that is executed periodically to keep the
clocks in sync in case they drift apart.
Inspired by NetBSD.
Tested by many and thoroughly reviewed by kettenis@, thank you!
OK kettenis@, deraadt@
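
A sketch of the skew-adjusted read path, where ci_tsc_skew holds the
per-CPU offset measured against CPU0 during boot:

    u_int
    tsc_get_timecount(struct timecounter *tc)
    {
        /* the skew puts every CPU's counter on CPU0's timeline */
        return rdtsc() + curcpu()->ci_tsc_skew;
    }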
or mis-take swapgs in interrupt path and in trap/fault/exception path. The
latter is improved to have no conditionals around this when Meltdown mitigation
is in effect. Codepatch out the fences based on the description of CPU bugs
in the (well written) Linux commit message.
feedback from kettenis@
ok deraadt@
tweaked based on feedback from kettenis@
ok deraadt@
(stirng -> string)
ok kettenis@ who pointed out I should fix the new arm64 smbiosvar.h too
differences between the i386 and amd64 versions of the code and
switch to using the standard C exact-width integer types.
ok deraadt@
"yes please" guenther@
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and
write pvclock state.
reads ok mlarkin@
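
A heavily hedged sketch of the read side; the parameter struct, field,
and mask names here are guesses from the ioctl names:

    struct vm_rwvmparams_params vpp;	/* hypothetical name */

    memset(&vpp, 0, sizeof(vpp));
    vpp.vpp_vm_id = vm_id;
    vpp.vpp_vcpu_id = 0;
    vpp.vpp_mask = VM_RWVMPARAMS_PVCLOCK;	/* which state to fetch */
    if (ioctl(vmm_fd, VMM_IOC_READVMPARAMS, &vpp) == -1)
        err(1, "VMM_IOC_READVMPARAMS");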
longer required to be layout compatible with struct trapframe
noted by Benjamin Baier (programmer (at) netzbasis.de)
like Intel does in their patches on github. Also add a compiler
level memory barrier to the wbinvd instruction like Linux does.
OK mlarkin@ guenther@ kettenis@
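
The resulting helper, sketched:

    static inline void
    wbinvd(void)
    {
        /* the "memory" clobber is the compiler-level barrier: no
         * cached values may stay live across the cache flush */
        __asm volatile("wbinvd" ::: "memory");
    }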
an earlier diff from sf@.
ok jmatthew@, also ok mlarkin@, sf@ for a slightly different earlier version
ok deraadt@, mlarkin@