suspend (SINGLE_SUSPEND or SINGLE_PTRACE) it needs to do this in
sleep_setup_signal(). This way the case where single_thread_clear() is
called before the sleep gets its wakeup call can be correctly handled and
the thread is put back to sleep in sleep_finish(). If the wakeup happens
before unsuspend then p_wchan is 0 and the thread will not go to sleep again.
In case of an unwind an error is returned, causing the thread to return
immediately with that error.
With and OK mpi@ kettenis@
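In sketch form, the flow described above might look like this (simplified;
single_thread_check() is the real helper, but the surrounding code is an
illustration, not the actual kern_synch.c):

    /* In sleep_setup_signal(): suspend here if requested. */
    error = single_thread_check(p, 1);
    if (error)
        return (error);    /* unwind: fail immediately with the error */
    /*
     * Back from suspension.  sleep_finish() now looks at p_wchan:
     * p->p_wchan != 0 means no wakeup arrived, go back to sleep;
     * p->p_wchan == 0 means wakeup() ran meanwhile, do not sleep again.
     */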
|
sleep_setup/finish related functions are.
OK kettenis@
|
comes to setting a process into single thread mode. It is still wrong, but
first the interaction with single_thread_set() must be corrected.
|
when called during execve(2). This was caused by initializing sls_sig
with value 0 in r1.164 of kern_synch.c. Previously, tsleep(9) returned
immediately with EINTR in similar circumstances.
The immediate return without error can cause a system hang. For example,
vwaitforio() could end up spinning if called during execve(2) because
the thread did not enter sleep and other threads were not able to finish
the I/O.
    tsleep
    vwaitforio
    nfs_flush
    nfs_close
    VOP_CLOSE
    vn_closefile
    fdrop
    closef
    fdcloseexec
    sys_execve
Fix the issue by checking (p->p_flag & P_SUSPSINGLE) instead of
(p->p_p->ps_single != NULL) in sleep_setup_signal(). The former is more
selective than the latter and allows the thread that invokes execve(2) to
enter sleep normally.
Bug report, change bisecting and testing help by Pavel Korovin
OK claudio@ mpi@
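The change, in sketch form (illustrative, not the verbatim diff):

    /* was: if (p->p_p->ps_single != NULL) */
    if (p->p_flag & P_SUSPSINGLE) {
        /* only a thread that itself must suspend is held back here,
         * so the thread performing execve(2) can sleep normally */
    }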
|
tsleep_nsec(9) will not set a timeout if the nsecs parameter is
equal to INFSLP (UINT64_MAX). We need to limit the duration to
MAXTSLP (UINT64_MAX - 1) to ensure a timeout is set.
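A caller that wants "as long as possible, but with a timeout" can therefore
clamp the duration, along these lines (sketch; the surrounding context is
assumed):

    #include <sys/param.h>    /* MIN() */

    nsecs = MIN(nsecs, MAXTSLP);    /* MAXTSLP == UINT64_MAX - 1 */
    error = tsleep_nsec(ident, PWAIT, "pause", nsecs);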
|
An absolute timeout T elapses when the clock has reached time T, i.e.
when T is less than or equal to the clock's current time.
But the current code thinks T elapses only when the clock is strictly
greater than T.
For example, if my absolute timeout is 1.00000000, the current code will
not return EWOULDBLOCK until the clock reaches 1.00000001. This is wrong:
my absolute timeout elapses a nanosecond prior to that point.
So the timespeccmp(3) here should be
timespeccmp(tsp, &now, <=)
and not
timespeccmp(tsp, &now, <)
as it is currently.
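In context, the corrected check reads like this (sketch; the fix itself is
as stated above, the clock read via getnanouptime(9) is an assumed detail):

    struct timespec now;

    getnanouptime(&now);
    if (timespeccmp(tsp, &now, <=))    /* T <= now: timeout elapsed */
        return (EWOULDBLOCK);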
|
possible signal that was caught during sleep setup. It does not make sense
to have a default of 1 (SIGHUP) for this.
OK visa@ mpi@
|
sleep. If sleep_setup_signal() detects that the process has been
stopped, it calls mi_switch() instead of sleeping. Then the lock
was not released and other processes got stuck. Move the mtx_leave()
and rw_exit() before sleep_setup_signal() to prevent a stopped process
from holding a short-term kernel lock.
input kettenis@; OK visa@ tedu@
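A sketch of the corrected ordering inside msleep(9), using the
sleep_setup/sleep_finish family named elsewhere in this log (the exact
shapes are assumed, not the verbatim code):

    sleep_setup(&sls, ident, priority, wmesg);
    sleep_setup_timeout(&sls, timo);
    mtx_leave(mtx);                      /* drop the short-term lock first */
    sleep_setup_signal(&sls, priority);  /* may mi_switch() if stopped */
    sleep_finish(&sls, 1);
    error1 = sleep_finish_timeout(&sls);
    error = sleep_finish_signal(&sls);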
|
Using different fields to remember which runqueue or sleepqueue a thread
is currently on will make it easier to split the SCHED_LOCK().
With this change, the (potentially boosted) sleeping priority no longer
overwrites the thread priority. This lets us get rid of the logic
required to synchronize `p_priority' with `p_usrpri'.
Tested by many, ok visa@
|
We included DIAGNOSTIC checks in *sleep_nsec(9) when they were first committed
to help us sniff out division-to-zero bugs when converting *sleep(9)
callers to the new interfaces.
Recently we exposed the new interface to userland callers. This has
yielded some warnings.
This diff adds a process name and pid to the warnings to help determine
the source of the zero-length sleeps.
ok mpi@
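The added context presumably looks something like this (sketch; the struct
process field names and the message format are assumptions):

    #ifdef DIAGNOSTIC
        if (nsecs == 0)
            printf("%s: %s[%d]: zero-length sleep\n", __func__,
                curproc->p_p->ps_comm, curproc->p_p->ps_pid);
    #endif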
|
The design is fairly simple: events, in the form of descriptors on a
ring, are being produced in any kernel context and being consumed by
a userland process reading /dev/dt.
Code and hooks are all guarded under '#if NDT > 0' so this commit
shouldn't introduce any change as long as dt(4) is disabled in GENERIC.
ok kettenis@, visa@, jasper@, deraadt@
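A hypothetical consumer, to illustrate the producer/consumer design (the
record layout and the probe-setup ioctls are assumptions, not the actual
dt(4) API):

    int fd = open("/dev/dt", O_RDONLY);
    char buf[4096];
    ssize_t n;

    /* ... enable the desired probes here (API not shown) ... */
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        /* parse the event descriptors taken off the kernel ring */
    }
    close(fd);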
|
Threads in __thrsleep(2) are tracked using queues, one queue per
process for synchronization between threads of a process, and one
system-wide queue for the special ident -1 handling. Each of these
queues has an associated rwlock that serializes access.
The queue lock is released when calling copyin() and copyout() in
thrsleep(). This preserves the existing behaviour where a blocked copy
operation does not prevent other threads from making progress.
Tested by anton@, claudio@
OK anton@, claudio@, tedu@, mpi@
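The copyin()/copyout() pattern from the last paragraph, sketched (the queue
and lock names are assumptions):

    rw_exit_write(&tq->tq_lock);     /* unblock other threads */
    error = copyout(&ts, remainder, sizeof(ts));
    rw_enter_write(&tq->tq_lock);    /* re-serialize queue access */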
|
This moves most of the SCHED_LOCK() related to protecting the sleepqueue
and its states to kern/kern_sync.c
Name suggestion from jsg@, ok kettenis@, visa@
|
tsleep(9) to tsleep_nsec(9).
ok bluhm@
|
The *sleep(9) interfaces are challenging to use when one needs to sleep
for a given minimum duration: the programmer needs to account for both
the current tick and any integer division when converting an interval
to a count of ticks. This sort of input conversion is complicated and
ugly at best and error-prone at worst.
This patch consolidates this conversion logic into the *sleep_nsec(9)
functions themselves. This will allow us to use the functions at the
syscall layer and elsewhere in the kernel where guaranteeing a minimum
sleep duration is of vital importance.
With input from bluhm@, guenther@, ratchov@, tedu@, and kettenis@.
Requested by mpi@ and kettenis@.
Conversion algorithm from mpi@.
ok mpi@, kettenis@, deraadt@
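The consolidated conversion plausibly has this shape (sketch: a floor
division plus one extra tick, so truncation and the in-progress tick cannot
shorten the sleep; not the verbatim code):

    uint64_t to_ticks;

    if (nsecs == INFSLP)
        return tsleep(ident, priority, wmesg, 0);
    to_ticks = nsecs / (tick * 1000) + 1;    /* tick is in microseconds */
    if (to_ticks > INT_MAX)
        to_ticks = INT_MAX;
    return tsleep(ident, priority, wmesg, (int)to_ticks);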
|
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.
Tested by anton@, cheloha@, chris@
OK anton@, cheloha@
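Usage sketch of the now-permitted pattern (names are placeholders):

    rw_enter_write(&sc->sc_lock);
    while (!sc->sc_ready) {
        error = rwsleep(&sc->sc_ready, &sc->sc_lock,
            PWAIT | PCATCH, "ready", 0);
        if (error)    /* EINTR or ERESTART from a signal */
            break;
    }
    rw_exit_write(&sc->sc_lock);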
|
the timeout cancellation in sleep_finish_timeout() would acquire the
kernel lock every time in the no-timeout case, as noticed by mpi@.
This also reduces contention on timeout_mutex.
OK mpi@, feedback guenther@
|
This refactoring will help future scheduler locking, in particular to
shrink the SCHED_LOCK().
No intended behavior change.
ok visa@
|
This allows us to enforce that sleeping priorities will now always be
< PUSER.
ok visa@, ratchov@
|
Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero), indicates that a timeout should not be set.
For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.
To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.
Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.
Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.
Partly inspired by FreeBSD r247787.
positive feedback from deraadt@, ok mpi@
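For example, with the sys/time.h helpers (MSEC_TO_NSEC() and friends):

    /* sleep for at least 10ms */
    tsleep_nsec(&sc->sc_var, PWAIT, "pause", MSEC_TO_NSEC(10));

    /* sleep until wakeup(); INFSLP, not 0, means "no timeout" */
    tsleep_nsec(&sc->sc_var, PWAIT, "wait", INFSLP);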
|
This is necessary when invoking sleep_finish_timeout() without the
kernel lock. If not cancelled properly, an already running endtsleep()
might cause a spurious wakeup on the thread if the thread re-enters
a sleep queue very quickly before the handler completes.
The flag P_TIMEOUT should stay cleared across the timeout cancellation.
Add an assertion for that.
OK mpi@
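A sketch of the cancellation (the p_sleep_to member name is an assumption):

    if (p->p_flag & P_TIMEOUT) {
        /* endtsleep() ran and woke us: consume the flag */
        atomic_clearbits_int(&p->p_flag, P_TIMEOUT);
        error = EWOULDBLOCK;
    } else {
        if (timeout_del(&p->p_sleep_to) == 0) {
            /* handler already fired or is running: synchronize
             * with it before re-entering a sleep queue */
        }
        /* P_TIMEOUT must stay cleared across the cancellation */
        KASSERT((p->p_flag & P_TIMEOUT) == 0);
    }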
|
Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .
This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.
Discussed with and OK dlg@, OK mpi@
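So a mutex implementation now invokes the checker hooks itself, roughly
like this (the lock-object member and the argument lists are assumptions):

    WITNESS_CHECKORDER(&mtx->mtx_lock_obj, LOP_EXCLUSIVE, NULL);
    /* ... acquire the mutex ... */
    WITNESS_LOCK(&mtx->mtx_lock_obj, LOP_EXCLUSIVE);

    WITNESS_UNLOCK(&mtx->mtx_lock_obj, LOP_EXCLUSIVE);
    /* ... release the mutex ... */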
|
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.
ok mpi@
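Presumably something of this shape (sketch; preferring a signal error over
a timeout is one plausible preference, not confirmed by this log):

    int
    sleep_finish_all(struct sleep_state *sls, int do_sleep)
    {
        int error, error1;

        sleep_finish(sls, do_sleep);
        error1 = sleep_finish_timeout(sls);
        error = sleep_finish_signal(sls);

        if (error == 0 && error1 != 0)
            error = error1;    /* fall back to the timeout error */
        return (error);
    }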
|
Wanted for tentative clock_nanosleep(2) diff, but maybe useful
elsewhere in the future.
ok mpi@
|
Discussing with mpi@ and guenther@, we decided to first fix the existing
semaphore implementation with regard to SA_RESTART and POSIX-compliant
returns in the case where we deal with restartable signals.
Currently we return EINTR everywhere, which is mostly incorrect as the
user cannot know whether the syscall needs to be restarted. Return
ECANCELED to signal that SA_RESTART was set and EINTR otherwise.
Regression tests pass and so does the posixsuite. Timespec validation
bits are needed to pass the latter.
OK mpi@, guenther@
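Seen from the library side, the distinction enables a transparent restart,
roughly like this (hypothetical wrapper, not actual librthread code):

    for (;;) {
        error = ksem_wait(sem);    /* hypothetical syscall shim */
        if (error != ECANCELED)
            return (error);        /* 0, EINTR, ETIMEDOUT, ... */
        /* ECANCELED: the interrupting handler had SA_RESTART, retry */
    }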
|
this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.
|
ok visa@
|
If the rwlock passed to rwsleep(9) is contended, the CPU will call wakeup()
between sleep_setup() and sleep_finish(). At this moment curproc is on the
sleep queue but marked as SONPROC. Avoid panicking in this case.
Problem reported by sthen@
ok kettenis@, visa@
|
Loosely based on a diff from Christian Ludwig
|
tsleep(9) & friends seem to only produce false positives and cannot
be easily disabled.
|
allocation that can sleep while holding the NET_LOCK().
To be removed once we're confident the remaining code paths are safe.
Discussed with deraadt@
|
struct proc to struct process.
ok deraadt@ kettenis@
|
by a write lock.
ok guenther@, vgross@
|
OK guenther@ mpi@ tedu@
|
it's not enough to assign to an unsigned type because if the arithmetic
overflows the compiler may decide to do anything. so change all the
long long casts to uint64_t so that we start with the right type.
reported by Tim Newsham of NCC.
ok deraadt
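Illustrated on a timeval-to-ticks conversion of the kind tsleep(9) callers
do (sketch; the exact context is assumed):

    uint64_t to_ticks;

    /* start in uint64_t: a signed intermediate could overflow, which
     * is undefined behaviour */
    to_ticks = (uint64_t)hz * tv.tv_sec + tv.tv_usec / tick;
    if (to_ticks > INT_MAX)
        to_ticks = INT_MAX;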
|
into negative values, which later causes a panic.
reported by Tim Newsham at NCC.
ok guenther
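The class of fix, sketched: reject out-of-range input before any
conversion arithmetic runs.

    if (ts.tv_sec < 0 || ts.tv_nsec < 0 || ts.tv_nsec >= 1000000000L)
        return (EINVAL);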
|
| |
behind all other threads in the process by temporarily lowering its priority.
This isn't optimal but it is the easiest way to guarantee that we make
progress when we're waiting on another thread to release a lock. This
results in significant improvements for processes that suffer from lock
contention, most notably firefox. Unfortunately this means that sched_yield(2)
needs to grab the kernel lock again.
All the hard work was done by mpi@, based on observations of the behaviour
of the BFS scheduler diff by Michal Mazurek.
ok deraadt@
|
ok mpi@
|
ok mpi@ bluhm@
|
Prevent lazy developers, like David and me, from using atomic operations
without including <sys/atomic.h>.
ok dlg@
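That is, the header now has to be pulled in explicitly wherever the
primitives are used, e.g.:

    #include <sys/atomic.h>

    atomic_inc_int(&sc->sc_refs);
    if (atomic_dec_int_nv(&sc->sc_refs) == 0)
        wakeup(&sc->sc_refs);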
|
| |
spit out a ddb trace to console. This should allow us to find suspend
or resume routines which break the rules. It depends on the console
output function being non-sleeping... but that's another codepath which
should try to be safe when cold is set.
ok kettenis
|
it's basically atomic inc/dec, but it includes magical sleep code
in refcnt_finalize that is better written once than many times.
refcnt_finalize sleeps until all references are released and does
so with sleep_setup and sleep_finish, which is fairly subtle.
putting this in now so we can get on with work in the stack; a
proper discussion about visibility and how available intrinsics
should be in the kernel can happen after next week.
with help from guenther@
ok guenther@ deraadt@ mpi@
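Usage sketch (the refcnt API from sys/refcnt.h, here spelled
refcnt_finalize; the surrounding object is invented):

    struct refcnt r;

    refcnt_init(&r);         /* starts with one reference */
    refcnt_take(&r);         /* extra reference for another context */
    /* ... hand the object off ... */
    refcnt_rele_wake(&r);    /* drop it, waking a finalizer if any */

    /* tear-down path: sleeps until all other references are gone */
    refcnt_finalize(&r, "exdtor");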
|
and doing VOP_WRITE() from inside tsleep/msleep makes the locking too
complicated, making it harder to move forward on MP changes.
ok deraadt@ kettenis@
|
portions of msleep and tsleep to give interrupts a chance to run
on other CPUs.
Tweak and OK kettenis
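Presumably along these lines (sketch; whether the kernel lock is dropped
exactly this way here is an assumption):

    int hold_count;

    /* let other CPUs service interrupts while this CPU sleeps */
    hold_count = __mp_release_all(&kernel_lock);
    mi_switch();
    __mp_acquire_count(&kernel_lock, hold_count);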