| Commit message | Author | Age | Files | Lines |
| |
from Matt Dunwoodie and Jason A. Donenfeld
|
| |
reduce it to a single one. Not only should this be more performant, it
also solves a kqueue related issue found by visa@ who also requested
this change: if you attach an EVFILT_WRITE filter to a pipe fd, the
knote gets added to the peer's klist. This is a problem for kqueue
because if you close the peer's fd, the knote is left in the list whose
head is about to be freed. knote_fdclose() is not able to clear the
knote because it is not registered with the peer's fd.
FreeBSD also takes a similar approach to pipe allocations.
ok mpi@ visa@
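As a rough userspace illustration of the sequence described above (this
sketch is not part of the commit):

	#include <sys/types.h>
	#include <sys/event.h>
	#include <sys/time.h>
	#include <unistd.h>
	#include <err.h>

	int
	main(void)
	{
		struct kevent kev;
		int fds[2], kq;

		if (pipe(fds) == -1 || (kq = kqueue()) == -1)
			err(1, "setup");

		/* watch the write end for writability; the old code hung
		 * this knote off the peer's (read end's) klist */
		EV_SET(&kev, fds[1], EVFILT_WRITE, EV_ADD, 0, 0, NULL);
		if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
			err(1, "kevent");

		/* closing the peer used to leave that knote in a klist
		 * whose head was about to be freed */
		close(fds[0]);
		return 0;
	}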
|
| |
of SMR lists in userspace-visible parts of system headers. In addition,
the macros allow libkvm to examine SMR data structures.
Initial diff by and OK claudio@
|
| |
i've been wanting to do this for a while, and now that we've got
stoeplitz and it gives us 16 bits, it seems like the right time.
|
| |
requested by kettenis@
discussed with jmatthew@
|
| |
there's been discussions for years (and even some diffs!) about how we
should let drivers establish interrupts on multiple cpus.
the simple approach is to let every driver look at the number of
cpus in a box and just pin an interrupt on each of them, which is
what pretty much everyone else started with, but which we have never
managed to get past the bikeshedding stage on. from what i can tell,
the principal objections to this are:
1. interrupts will tend to land on low numbered cpus.
ie, if every driver establishing n interrupts starts at cpu 0 and
works upward, cpu 0 will end up with more interrupts than cpu m-1
on an m cpu box.
2. some cpus shouldn't be used for interrupts.
why a cpu should or shouldn't be used for interrupts can be pretty
arbitrary, but in practical terms i'm going to borrow from the
scheduler and say that we shouldn't run work on hyperthreads.
3. making all the drivers make the same decisions about the above is
a lot of maintenance overhead.
either we will have a bunch of inconsistencies, or we'll have a lot
of untested commits to keep everything the same.
my proposed solution to the above is this diff to provide the intrmap
api. drivers that want to establish multiple interrupts ask the api for
a set of cpus it can use, and the api considers the above issues when
generating a set of cpus for the driver to use. drivers then establish
interrupts on cpus with the info provided by the map.
it is based on the if_ringmap api in dragonflybsd, but generalised so it
could be used by something like nvme(4) in the future.
this version provides numeric ids for CPUs to drivers, but as
kettenis@ has been pointing out for a very long time, it makes more
sense to use cpu_info pointers. i'll be updating the code to address
that shortly.
discussed with deraadt@ and jmatthew@
ok claudio@ patrick@ kettenis@
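As a hedged sketch of how a driver might consume such an api (the
intrmap_create/intrmap_count/intrmap_cpu names follow the later
intrmap(9) manual; the softc, MYDRV_MAXQ and the queue setup helper
are hypothetical, and this version of the api handed out numeric cpu
ids rather than cpu_info pointers):

	int
	mydrv_setup_intrs(struct mydrv_softc *sc)
	{
		struct intrmap *im;
		unsigned int i, nqueues;

		/* ask the api for a set of cpus this driver may use */
		im = intrmap_create(&sc->sc_dev, sc->sc_nmsix, MYDRV_MAXQ,
		    INTRMAP_POWEROF2);
		if (im == NULL)
			return (ENOMEM);

		nqueues = intrmap_count(im);
		for (i = 0; i < nqueues; i++) {
			/* establish queue i's interrupt on the cpu the
			 * map picked for it */
			mydrv_establish_queue_intr(sc, i, intrmap_cpu(im, i));
		}
		return (0);
	}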
|
| |
ok visa@, millert@
|
| |
This is only done in poll-compatibility mode, when __EV_POLL is set.
ok visa@, millert@
|
| |
Port breakages reported by naddy@
|
| |
While here prefix kernel-only EV flags with two underbars.
Suggested by kettenis@, ok visa@
|
| |
Adapt FS kqfilters to always return true when the flag is set and bypass
the polling mechanism of the NFS thread.
While here implement a write filter for NFS.
ok visa@
|
| |
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which is already included
by all enqueue_randomness() users.
|
| |
Since our last concurrency mistake, only the ioctl(2) and sysctl(2) code
paths take the reader lock. This is mostly for documentation purposes
until the softnet thread is converted back to use a read lock.
dlg@ said that comments should be good enough.
ok sashan@
|
| |
Missed in previous.
|
| |
sthen@ has reported that the patch might be causing hangs with X.
|
| |
confidence 'a great seed' was loaded, otherwise the kernel should assume at
best an 'ok seed' or 'weak seed'. This mechanism is being kept vague and
simple intentionally.
Existing bootloaders won't set it, of course.
discussed with kettenis
|
| |
OK millert@
|
| |
naptime is now a member of the timehands, th_naptime.
|
| |
When we resume from a suspend we use the time from the RTC to advance
the system offset. This changes the UTC to match what the RTC has given
us while increasing the system uptime to account for the time we were
suspended.
Currently we decide whether to change to the RTC time in tc_setclock()
by comparing the new offset with the th_offset member. This is wrong.
th_offset is the *minimum* possible value for the offset, not the "real
offset". We need to perform the comparison within tc_windup() after
updating th_offset, otherwise we might rewind said offset.
Because we're now doing the comparison within tc_windup() we ought to
move naptime into the timehands. This means we now need a way to safely
read the naptime to compute the value of CLOCK_UPTIME for userspace.
Enter nanoruntime(9); it increases monotonically from boot but does not
jump forward after a resume like nanouptime(9).
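A minimal sketch of the relationship this introduces (an illustration,
not the committed code):

	struct timespec up, run;

	nanouptime(&up);	/* jumps forward across a resume */
	nanoruntime(&run);	/* keeps counting only running time */
	/* up is roughly run plus the naptime accumulated by suspends */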
|
| |
The struct keeps track of the end point of an event queue scan by
persisting the end marker. This will be needed when kqueue_scan() is
called repeatedly to complete a scan in a piecewise fashion. The end
marker has to be preserved between calls because otherwise the scan
might collect an event more than once. If a collected event gets
reactivated during scanning, it will be added at the tail of the queue,
out of reach because of the end marker.
OK mpi@
|
| |
This ensures spec_kqfilter() won't return an error when spec_poll()
returns success for a given device.
ok visa@
|
| |
that have arguments. Document this requirement/recommendation in style(9).
Prompted by mpi@
ok deraadt@
|
| |
Silences an uninitialized warning in net/art.c
"reasonable" jmatthew@, ok mpi@
|
| |
From Vitaliy Makkoveev, ok visa@
|
| |
poll functions shouldn't return errnos; selfalse() and seltrue() exist
for this reason :)
While here fix some comments.
ok visa@
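A minimal sketch of the convention, with a hypothetical device poll
routine that is always ready:

	int
	mydevpoll(dev_t dev, int events, struct proc *p)
	{
		/* report readiness instead of returning an errno */
		return (seltrue(dev, events, p));
	}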
|
| |
Upgrade stacktrace_save() to stacktrace_save_at() on architectures where
the latter is missing. Define stacktrace_save() as an inline function
in header <sys/stacktrace.h> to reduce duplication of code.
OK mpi@
|
| |
Prevent generating events that do not correspond to how the fifo has been
opened.
ok visa@, millert@
|
| |
for example, with locking assertions.
OK mpi@, anton@
|
| |
in SMR read-side critical sections are SMR_TAILQ_FOREACH(), SMR_TAILQ_FIRST()
and SMR_TAILQ_NEXT(). Most notably, the last element cannot be accessed
in a read-side critical section.
OK visa@
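A small sketch of a read-side traversal under these rules (struct foo
and its list are hypothetical):

	struct foo *f;

	smr_read_enter();
	SMR_TAILQ_FOREACH(f, &foo_list, f_entry) {
		if (f->f_key == key)
			break;
	}
	smr_read_leave();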
|
| |
suspend (SINGLE_SUSPEND or SINGLE_PTRACE) it needs to do this in
sleep_setup_signal(). This way the case where single_thread_clear() is
called before the sleep gets its wakeup call can be correctly handled and
the thread is put back to sleep in sleep_finish(). If the wakeup happens
before unsuspend then p_wchan is 0 and the thread will not go to sleep again.
In the case of an unwind an error is returned, causing the thread to
return immediately with that error.
With and OK mpi@ kettenis@
|
| |
OK deraadt@
|
| |
Use two underbars to start the locally defined variable, as suggested by
guenther@. The other option to avoid namespace conflict would be to start
the identifier with an underbar and a capital.
ok beck@, guenther@
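A minimal illustration of the convention with a hypothetical statement
macro; the temporary starts with two underbars so it cannot clash with
an identifier at the expansion site:

	#define SWAP_INT(a, b) do {					\
		int __tmp = (a);					\
		(a) = (b);						\
		(b) = __tmp;						\
	} while (0)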
|
| |
ok jca@, jsg@
|
| |
panic message shows the actual code location of the assert. Do this by
moving the assert logic inside the macros.
Prompted by and OK claudio@
OK mpi@
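As a generic illustration of the idea (a hypothetical MY_ASSERT(), not
the committed macros): with the check expanded at the call site,
__FILE__ and __LINE__ name the caller rather than a shared helper.

	#define MY_ASSERT(cond) do {					\
		if (!(cond))						\
			panic("assertion \"%s\" failed: file %s, line %d", \
			    #cond, __FILE__, __LINE__);			\
	} while (0)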
|
| |
This variant of stacktrace_save() takes an additional argument to skip
an arbitrary number of frames. This makes it possible to skip recording
the frames used to execute the profiling code and produces output that
is easier to understand.
Inputs from and ok visa@
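A hedged usage sketch (the st_count/st_pc field names are assumptions
taken from <sys/stacktrace.h>, and skipping two frames is just an
example):

	struct stacktrace st;
	size_t i;

	stacktrace_save_at(&st, 2);	/* skip e.g. two profiling frames */
	for (i = 0; i < st.st_count; i++)
		printf("#%zu %p\n", i, (void *)st.st_pc[i]);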
|
| |
single_thread_check() safe to be called without KERNEL_LOCK().
single_thread_wait() needs to use sleep_setup() and sleep_finish()
instead of tsleep() to make sure no wakeup() is lost.
Input kettenis@, with and OK visa@
|
| |
This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).
A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.
Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.
The code in such a case might look something like this:
	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);
The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:
	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;
		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;
		obj.timeout = nsecs;
		break;
	}
Idea suggested by visa@.
ok visa@
|
| |
implementation file. Pushing the assignment of ps_uvpcwd down to
unveil_add() is required but it doesn't introduce any functional change.
ok mpi@ semarie@
|
| |
This ensures that the conditions checked are still in force. The sleep
breaks atomicity, allowing another thread to alter the state.
single_thread_set() should return immediately after sleep when called
from dowait4() because there is no guarantee that the process pr still
exists. When called from single_thread_set(), the process is that of
the calling thread, which prevents process pr from disappearing.
OK anton@, mpi@, claudio@
|
| |
reparented to a debugger process.
Also re-parent exiting traced processes to their original parent, if it
is still alive, after the debugger has seen the exit status.
Logic comes from FreeBSD, pointed out by guenther@.
While here rename proc_reparent() into process_reparent() and get rid of
superfluous checks.
ok visa@
|
| |
file atomic. This also gets rid of the last kernel lock protected field
in the scope of struct file.
ok mpi@ visa@
|
| |
This shows that atomic_* operations should not be necessary to write
to this field, unlike with the process one.
The advantage of using a somewhat-unique prefix for struct members is
moot when multiple definitions use the same prefix :o)
From Amit Kulkarni, ok claudio@
|
| |
kern_sig.c where they are currently added by the include. While doing
that, mark the sigprop array as const.
OK mpi@ anton@ millert@
|