| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
|
| |
reduce it to a single one. Not only should this be more performant, it
also solves a kqueue related issue found by visa@ who also requested
this change: if you attach an EVFILT_WRITE filter to a pipe fd, the
knote gets added to the peer's klist. This is a problem for kqueue
because if you close the peer's fd, the knote is left in the list whose
head is about to be freed. knote_fdclose() is not able to clear the
knote because it is not registered with the peer's fd.
FreeBSD also takes a similar approach to pipe allocations.
ok mpi@ visa@
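A minimal sketch of the single-allocation idea described above, not the kernel code. All names here (toy_pipe, toy_pipe_pair, toy_pair_create, toy_pipe_close) are hypothetical. It assumes the benefit is that both ends live in one block, so closing one end cannot free memory the other end's list head still lives in.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one end of a pipe. */
struct toy_pipe {
	struct toy_pipe *peer;	/* the other end, inside the same pair */
	int open;		/* 1 while this end's fd is open */
};

/* Both ends are carved out of a single allocation. */
struct toy_pipe_pair {
	struct toy_pipe rpipe;
	struct toy_pipe wpipe;
};

/* One allocation instead of two; returns NULL on failure. */
struct toy_pipe_pair *
toy_pair_create(void)
{
	struct toy_pipe_pair *pp = calloc(1, sizeof(*pp));

	if (pp == NULL)
		return NULL;
	pp->rpipe.peer = &pp->wpipe;
	pp->wpipe.peer = &pp->rpipe;
	pp->rpipe.open = pp->wpipe.open = 1;
	return pp;
}

/*
 * Close one end.  The shared block is only freed once both ends are
 * closed, so state hanging off the peer stays valid until then.
 * Returns 1 if the pair was freed, 0 otherwise.
 */
int
toy_pipe_close(struct toy_pipe_pair *pp, struct toy_pipe *p)
{
	p->open = 0;
	if (!pp->rpipe.open && !pp->wpipe.open) {
		free(pp);
		return 1;
	}
	return 0;
}
```

In this toy model, closing the read end leaves the write end's memory (and anything linked from it) intact until the write end is closed too.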
|
| |
|
|
|
|
| |
Pointed out by Martin Vahlensieck, thanks!
|
| |
|
|
|
|
|
|
| |
seed is explicitly set.
OK millert@
|
|
|
|
|
|
|
| |
of SMR lists in userspace-visible parts of system headers. In addition,
the macros allow libkvm to examine SMR data structures.
Initial diff by and OK claudio@
|
|
|
|
|
|
| |
prefix, and show how to use 0t for decimal (slight duplication from
the table in EXPRESSIONS but easier for the reader than sending them
off to look in a different part of the manual). ok mpi claudio jmc
|
|
|
|
|
|
| |
sc_ih value of struct rl_softc. This fixes a crash in re(4) because
intr_barrier(9) is called with the rl_softc sc_ih which was NULL.
OK kettenis@
|
|
|
|
|
| |
it's only available on amd64 (and i386), so don't really want to
encourage its use just yet.
|
| |
The firmware will notify the driver when it decides to change Tx rate.
Based on those notifications the driver updates the value displayed by
ifconfig. This is similar to how bwfm(4) and urtwn(4) handle this.
Offloading Tx rate selection should allow us to eventually delete AMRR/MiRA
support code from iwx(4). That code is disabled for now, not yet deleted.
For now, the driver restricts firmware Tx rate selection to 11n/20MHz mode
because that's what net80211 can support.
|
| |
spin up the secondary CPUs, explicitly probe and attach that driver
before we attach the CPUs.
This should help with distributing interrupts across CPUs on arm64.
ok patrick@, deraadt@, dlg@
|
| |
|
|
|
|
|
| |
i feel like i've used the word vector too much, but hopefully someone
who is good with english will check this and fix it.
|
| |
|
|
|
|
| |
another sniped commit from jmatthew@
|
|
|
|
|
| |
i've been wanting to do this for a while, and now that we've got
stoeplitz and it gives us 16 bits, it seems like the right time.
|
| |
stoeplitz_cache and bring them into a form more suitable for mathematical
reasoning. Add a comment explaining the full construction which will also
help justifying upcoming diffs.
The observations for the code changes are the following:
First, scache->bytes[val] is a uint16_t, and we only need the lower
16 bits of res in the second nested pair of for loops. The values of
key[b] are only xored together to compute res, so we only need the lower
16 bits of those, too.
Second, looking at the first nested for loop, we see that the values 0..15
of j only touch the top 16 bits of key[b], so we can skip them. For b = 0,
the inner loop for j in 16..31 scans backwards through skey and sets the
corresponding bits of key[b], so key[0] = skey. A bit of pondering then
leads to key[b] = skey << b | skey >> (NBSK - b).
The key array is renamed into column since it stores columns of the
Toeplitz matrix.
It's not very expensive to brute-force verify that scache->bytes[val]
remains the same for all values of val and all values of skey. I did
this on amd64, sparc64 and powerpc.
ok dlg
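A rough userland sketch of the column formula above, for readers who want to poke at it. The toy_ names are hypothetical, NBSK is assumed to be 16 (the key width in bits), and the mapping of the most significant bit of val to column 0 is an assumption of this sketch rather than something stated in the message. toy_verify_columns() brute-forces the claim that key[b] is a 16-bit left rotation of skey.

```c
#include <stdint.h>

#define NBBY	8	/* bits per byte */
#define NBSK	16	/* bits in the key (assumed) */

struct toy_stoeplitz_cache {
	uint16_t bytes[256];
};

/* Column b as given in the message: skey << b | skey >> (NBSK - b). */
static uint16_t
toy_column(uint16_t skey, unsigned int b)
{
	/* skey is promoted to int, so the shift by 16 for b = 0 yields 0 */
	return (uint16_t)(skey << b | skey >> (NBSK - b));
}

/* Reference: rotate left one bit at a time. */
static uint16_t
toy_rotl16_ref(uint16_t v, unsigned int n)
{
	while (n-- > 0)
		v = (uint16_t)((v << 1) | (v >> 15));
	return v;
}

/* Precompute the xor of the columns selected by each byte value. */
static void
toy_stoeplitz_cache_init(struct toy_stoeplitz_cache *sc, uint16_t skey)
{
	uint16_t column[NBBY];
	unsigned int b, val;

	for (b = 0; b < NBBY; b++)
		column[b] = toy_column(skey, b);

	for (val = 0; val < 256; val++) {
		uint16_t res = 0;

		for (b = 0; b < NBBY; b++) {
			/* assumed ordering: top bit of val picks column 0 */
			if (val & (1U << (NBBY - b - 1)))
				res ^= column[b];
		}
		sc->bytes[val] = res;
	}
}

/* Returns 1 if the formula matches the rotation for every key. */
static int
toy_verify_columns(void)
{
	unsigned int skey, b;

	for (skey = 0; skey <= 0xffff; skey++)
		for (b = 0; b < NBBY; b++)
			if (toy_column((uint16_t)skey, b) !=
			    toy_rotl16_ref((uint16_t)skey, b))
				return 0;
	return 1;
}
```

This only checks the rotation identity, not equivalence with the loops that were replaced; that comparison needs the previous version of the function.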
|
| |
i'm doing this with vmx(4) because it only exists on two archs (well,
one and a half archs really) so any impact is localised. most other
drivers i'm working on are enabled on 3 or 4 archs, and we're still
working on the interrupt code on those archs.
in the meantime vmx(4) can be used as a reference driver on how to
implement multiq. it shows the use of rss, toeplitz, intrmap, and
interrupts on multiple cpus. it's also a relatively simple device,
which makes it easier to understand the above features.
note that vmx(4) seems to advertise 25 msi-x vectors. it appears
that the intention is that 16 of these vectors are supposed to be
used for rx, 8 for tx, and 1 for events (eg, link up and down).
we're keeping things simple for now and using a maximum of 8 vectors
for both tx and rx, and one for events.
this is mostly based on work that jmatthew@ did, but it's simplified
now cos intrmap makes things easier.
|
| |
|
|
|
|
|
|
|
|
|
| |
i386 doesn't support msix, and the interrupt code assumes that it
only ties stuff to cpu0. this mostly exists so the api exists for
multiq drivers to compile against, but fail when they try to
use it.
tested with a hacked up vmx(4).
|
| |
the cpu is specified by a struct cpu_info *, which should generally
come from an intrmap.
this is adapted from a diff that patrick@ sent round a few years
ago for a pci_intr_map_msix_cpuid, where you asked for an msi vector
on a specific cpu, and then called pci_intr_establish with the
handle you get. kettenis pointed out that it's hard on some archs
to carry cpu on a pci interrupt handle, so i tweaked it to turn it
into a pci_intr_establish_cpu instead.
jmatthew@ and i (but mostly jmatthew@ to be honest) have been
experimenting with this api on multiple archs and it is working out
well. i'm putting this diff in now on amd64 so people can kick the
tyres a bit.
tested with hacked up vmx(4), ix(4), and mcx(4)
|
| |
|
| |
|
|
|
|
|
| |
requested by kettenis@
discussed with jmatthew@
|
| |
|
| |
|
| |
this basically tells us the number of interrupt vectors a pci device
is able to support. it relies on the arch having __HAVE_PCI_MSIX
defined. without that define it always returns 0.
i think this originally came from haesbart via patrick@ as amd64
md code in the middle of a diff from 2018(!), but i've tweaked it
to make it MI.
tested on sparc64 and amd64 with various drivers.
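For context, a small sketch of how a vector count can be read out of the MSI-X capability. In the PCI specification the low 11 bits of the MSI-X Message Control word hold the table size encoded as N - 1. The names below are hypothetical, and whether the function added by this commit decodes the register exactly this way is an assumption.

```c
#include <stdint.h>

/* Table Size field: bits 10:0 of MSI-X Message Control, stored as N - 1. */
#define TOY_MSIX_MC_TBLSZ_MASK	0x07ffU

/*
 * Return the number of MSI-X vectors a device advertises, or 0 when
 * the device has no MSI-X capability (has_msix_cap == 0).
 */
static unsigned int
toy_msix_count(int has_msix_cap, uint16_t msgctl)
{
	if (!has_msix_cap)
		return 0;
	return (msgctl & TOY_MSIX_MC_TBLSZ_MASK) + 1;
}
```

A device whose table size field reads 24 advertises 25 vectors under this encoding; the other bits of the control word (such as the enable bit) do not affect the count.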
|
| |
|
|
|
|
| |
ok kettenis@
|
| |
|
| |
there have been discussions for years (and even some diffs!) about how we
should let drivers establish interrupts on multiple cpus.
the simple approach is to let every driver look at the number of
cpus in a box and just pin an interrupt on it, which is what pretty
much everyone else started with, but we have never seemed to get
past bikeshedding about. from what i can tell, the principal
objections to this are:
1. interrupts will tend to land on low numbered cpus.
ie, if drivers try to establish n interrupts on m cpus, they'll
start at cpu 0 and go to cpu n, which means cpu 0 will end up with more
interrupts than cpu m-1.
2. some cpus shouldn't be used for interrupts.
why a cpu should or shouldn't be used for interrupts can be pretty
arbitrary, but in practical terms i'm going to borrow from the
scheduler and say that we shouldn't run work on hyperthreads.
3. making all the drivers make the same decisions about the above is
a lot of maintenance overhead.
either we will have a bunch of inconsistencies, or we'll have a lot
of untested commits to keep everything the same.
my proposed solution to the above is this diff to provide the intrmap
api. drivers that want to establish multiple interrupts ask the api for
a set of cpus they can use, and the api considers the above issues when
generating a set of cpus for the driver to use. drivers then establish
interrupts on cpus with the info provided by the map.
it is based on the if_ringmap api in dragonflybsd, but generalised so it
could be used by something like nvme(4) in the future.
this version provides numeric ids for CPUs to drivers, but as
kettenis@ has been pointing out for a very long time, it makes more
sense to use cpu_info pointers. i'll be updating the code to address
that shortly.
discussed with deraadt@ and jmatthew@
ok claudio@ patrick@ kettenis@
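To illustrate the three objections and how a central helper can address them, here is a toy userland sketch. It is not the intrmap implementation: toy_cpu, toy_intrmap_pick and the start offset are hypothetical, and the policy (skip SMT siblings, begin at a per-caller offset so cpu 0 is not always first) is only one plausible reading of the description above.

```c
#include <stddef.h>

/* Hypothetical per-cpu description. */
struct toy_cpu {
	unsigned int id;	/* numeric cpu id handed to drivers */
	int is_smt_sibling;	/* non-zero: hyperthread, do not use */
};

/*
 * Fill out[] with up to "want" cpu ids.  Scanning begins at index
 * "start" and wraps around, so different callers can be spread over
 * the cpus instead of all piling onto the low numbered ones.  SMT
 * siblings are skipped.  Returns the number of ids written, which
 * may be less than "want" if there are not enough usable cpus.
 */
static size_t
toy_intrmap_pick(const struct toy_cpu *cpus, size_t ncpus, size_t start,
    size_t want, unsigned int *out)
{
	size_t i, n = 0;

	if (ncpus == 0)
		return 0;

	for (i = 0; i < ncpus && n < want; i++) {
		const struct toy_cpu *ci = &cpus[(start + i) % ncpus];

		if (ci->is_smt_sibling)
			continue;
		out[n++] = ci->id;
	}
	return n;
}
```

Because the policy lives in one function, every driver that calls it gets the same decisions, which is the maintenance argument made in point 3.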
|
|
|
|
|
|
| |
Prompted by warning from clang 10.
ok patrick@
|
|
|
|
|
|
| |
intr_barrier passed NULL to sched_barrier before this, which ends
up being the primary cpu. that's been mostly right until this point,
but is set to change.
|
|
|
|
|
|
| |
clang-10 complains about the misleading indentation.
ok patrick@
|
| |
|
| |
|
|
|
|
|
| |
While here, add proper bounds checking for the partial match case
in refldbld() too and check strlcpy() return values throughout.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
and rework the man text to reflect this;
guenther supplied the details on the various modes;
deraadt suggested __progname be banished from usage();
|
| |
|
|
|
|
|
|
|
|
| |
The node here is always ic_bss, for which the reference count isn't
actually used (it's always freed when the interface detaches), so
not releasing it in this case wasn't really a problem.
ok stsp@
|
| |
|
| |
|
|
|
|
|
|
|
| |
vm would get stuck if disconnected from console and get unstuck once console is
attached.
Spotted by tb@
|
| |
|
|
|
|
| |
Derry Jing.
|