path: root/sys/sys
...
* add mq_push. it's like mq_enqueue, but drops from the head, not the tail.
  from Matt Dunwoodie and Jason A. Donenfeld
  (dlg, 2020-06-21; 1 file changed, -1/+2)
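The behavioural difference can be sketched with a toy bounded queue in plain C. This is a hypothetical userland model for illustration only, not the mbuf_queue code in sys/mbuf.h; `toy_enqueue`, `toy_push`, and `struct toyq` are invented names:

```c
#include <assert.h>
#include <string.h>

#define QLEN 4

struct toyq {
	int slot[QLEN];		/* slot[0] is the head (oldest) */
	int used;
};

/* mq_enqueue-style: when full, refuse (drop) the new element */
static int
toy_enqueue(struct toyq *q, int v)
{
	if (q->used == QLEN)
		return 1;		/* newest element dropped */
	q->slot[q->used++] = v;
	return 0;
}

/* mq_push-style: when full, evict the oldest element from the head */
static int
toy_push(struct toyq *q, int v)
{
	int dropped = 0;

	if (q->used == QLEN) {
		memmove(&q->slot[0], &q->slot[1],
		    (QLEN - 1) * sizeof(int));
		q->used--;
		dropped = 1;		/* oldest element dropped */
	}
	q->slot[q->used++] = v;
	return dropped;
}
```

Both report whether something was dropped; they differ only in *which* end loses an element when the queue is full.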
* backout pipe change, it crashes some arch
  (deraadt, 2020-06-19; 1 file changed, -4/+1)
* Instead of performing three distinct allocations per created pipe,
  reduce it to a single one. Not only should this be more performant, it
  also solves a kqueue related issue found by visa@ who also requested
  this change: if you attach an EVFILT_WRITE filter to a pipe fd, the
  knote gets added to the peer's klist. This is a problem for kqueue
  because if you close the peer's fd, the knote is left in the list whose
  head is about to be freed. knote_fdclose() is not able to clear the
  knote because it is not registered with the peer's fd.

  FreeBSD also takes a similar approach to pipe allocations.

  ok mpi@ visa@
  (anton, 2020-06-17; 1 file changed, -1/+4)
* Expose SMR list and pointer macros to userspace. This enables the use
  of SMR lists in userspace-visible parts of system headers. In
  addition, the macros allow libkvm to examine SMR data structures.

  Initial diff by and OK claudio@
  (visa, 2020-06-17; 1 file changed, -3/+3)
* make ph_flowid in mbufs 16bits by storing whether it's set in csum_flags.
  i've been wanting to do this for a while, and now that we've got
  stoeplitz and it gives us 16 bits, it seems like the right time.
  (dlg, 2020-06-17; 1 file changed, -6/+3)
* make intrmap_cpu return a struct cpu_info *, not a "cpuid number" thing.
  requested by kettenis@
  discussed with jmatthew@
  (dlg, 2020-06-17; 1 file changed, -2/+2)
* add intrmap, an api that picks cpus for devices to attach interrupts to.
  there's been discussions for years (and even some diffs!) about how we
  should let drivers establish interrupts on multiple cpus. the simple
  approach is to let every driver look at the number of cpus in a box
  and just pin an interrupt on it, which is what pretty much everyone
  else started with, but we have never seemed to get past bikeshedding
  about. from what i can tell, the principal objections to this are:

  1. interrupts will tend to land on low numbered cpus. ie, if drivers
     try to establish n interrupts on m cpus, they'll start at cpu 0 and
     go to cpu n, which means cpu 0 will end up with more interrupts
     than cpu m-1.
  2. some cpus shouldn't be used for interrupts. why a cpu should or
     shouldn't be used for interrupts can be pretty arbitrary, but in
     practical terms i'm going to borrow from the scheduler and say that
     we shouldn't run work on hyperthreads.
  3. making all the drivers make the same decisions about the above is a
     lot of maintenance overhead. either we will have a bunch of
     inconsistencies, or we'll have a lot of untested commits to keep
     everything the same.

  my proposed solution to the above is this diff to provide the intrmap
  api. drivers that want to establish multiple interrupts ask the api
  for a set of cpus it can use, and the api considers the above issues
  when generating a set of cpus for the driver to use. drivers then
  establish interrupts on cpus with the info provided by the map.

  it is based on the if_ringmap api in dragonflybsd, but generalised so
  it could be used by something like nvme(4) in the future.

  this version provides numeric ids for CPUs to drivers, but as
  kettenis@ has been pointing out for a very long time, it makes more
  sense to use cpu_info pointers. i'll be updating the code to address
  that shortly.

  discussed with deraadt@ and jmatthew@
  ok claudio@ patrick@ kettenis@
  (dlg, 2020-06-17; 1 file changed, -0/+38)
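The policy the message describes can be modelled in a few lines of plain C. This is a toy sketch of the *idea* (rotate each device's interrupts across the eligible, non-hyperthread cpus), not the actual sys/intrmap.h code; `toy_intrmap` and its globals are invented names:

```c
#include <assert.h>

/* advances per created map, so successive devices start on different cpus */
static unsigned int toy_intrmap_offset;

/*
 * Fill map[0..nintr-1] with cpu ids drawn round-robin from the
 * eligible set (the caller has already excluded hyperthreads),
 * starting at a rotating offset so cpu 0 doesn't collect every
 * device's first interrupt.
 */
static void
toy_intrmap(const unsigned int *eligible, unsigned int neligible,
    unsigned int *map, unsigned int nintr)
{
	unsigned int start = toy_intrmap_offset % neligible;
	unsigned int i;

	for (i = 0; i < nintr; i++)
		map[i] = eligible[(start + i) % neligible];
	toy_intrmap_offset += nintr;
}
```

With two devices asking for two interrupts each out of four eligible cpus, the first device lands on the first two cpus and the second device on the remaining two, addressing objection 1 above.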
* Implement a simple kqfilter for deadfs matching its poll handler.
  ok visa@, millert@
  (mpi, 2020-06-15; 1 file changed, -1/+2)
* Set __EV_HUP when the conditions matching poll(2)'s POLLHUP are found.
  This is only done in poll-compatibility mode, when __EV_POLL is set.
  ok visa@, millert@
  (mpi, 2020-06-15; 1 file changed, -2/+5)
* Revert addition of double underbars for filter-specific flag.
  Port breakages reported by naddy@
  (mpi, 2020-06-12; 1 file changed, -2/+2)
* Rename poll-compatibility flag to better reflect what it is.
  While here prefix kernel-only EV flags with two underbars.
  Suggested by kettenis@, ok visa@
  (mpi, 2020-06-11; 1 file changed, -3/+3)
* Use a new EV_OLDAPI flag to match the behavior of poll(2) and select(2).
  Adapt FS kqfilters to always return true when the flag is set and
  bypass the polling mechanism of the NFS thread.
  While here implement a write filter for NFS.
  ok visa@
  (mpi, 2020-06-08; 1 file changed, -1/+2)
* visa points out we don't use or need the TASK_BARRIER flag anymore.
  (dlg, 2020-06-08; 1 file changed, -2/+1)
* add missing forward declaration of struct proc
  (anton, 2020-06-04; 1 file changed, -1/+3)
* dev/rndvar.h no longer has statistical interfaces (removed during
  various conversion steps). it only contains kernel prototypes for 4
  interfaces, all of which legitimately belong in sys/systm.h, which are
  already included by all enqueue_randomness() users.
  (deraadt, 2020-05-29; 1 file changed, -1/+6)
* Document the various flavors of NET_LOCK() and rename the reader version.
  Since our last concurrency mistake only the ioctl(2) and sysctl(2)
  code paths take the reader lock. This is mostly for documentation
  purposes as long as the softnet thread is converted back to use a read
  lock. dlg@ said that comments should be good enough.
  ok sashan@
  (mpi, 2020-05-27; 1 file changed, -16/+25)
* Make cdev_{audio,video}_init() expose a kqfilter handler.
  Missed in previous.
  (mpi, 2020-05-26; 1 file changed, -4/+4)
* Revert "Add kqueue_scan_state struct"
  sthen@ has reported that the patch might be causing hangs with X.
  (visa, 2020-05-25; 1 file changed, -12/+1)
* Add RB_GOODRANDOM passed from bootloader to kernel in boothowto.
  Indicates confidence that 'a great seed' was loaded, otherwise the
  kernel should assume at best an 'ok seed' or 'weak seed'. This
  mechanism is being kept vague and simple intentionally. Existing
  bootloaders won't set it, of course.
  discussed with kettenis
  (deraadt, 2020-05-23; 1 file changed, -17/+18)
* Update comment to reflect current headers.
  OK millert@
  (visa, 2020-05-21; 1 file changed, -6/+5)
* kernel.h: remove global declaration for naptime
  naptime is now a member of the timehands, th_naptime.
  (cheloha, 2020-05-20; 1 file changed, -3/+1)
* timecounting: decide whether to advance offset within tc_windup()
  When we resume from a suspend we use the time from the RTC to advance
  the system offset. This changes the UTC to match what the RTC has
  given us while increasing the system uptime to account for the time we
  were suspended.

  Currently we decide whether to change to the RTC time in tc_setclock()
  by comparing the new offset with the th_offset member. This is wrong.
  th_offset is the *minimum* possible value for the offset, not the
  "real offset". We need to perform the comparison within tc_windup()
  after updating th_offset, otherwise we might rewind said offset.

  Because we're now doing the comparison within tc_windup() we ought to
  move naptime into the timehands. This means we now need a way to
  safely read the naptime to compute the value of CLOCK_UPTIME for
  userspace. Enter nanoruntime(9); it increases monotonically from boot
  but does not jump forward after a resume like nanouptime(9).
  (cheloha, 2020-05-20; 1 file changed, -1/+4)
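The uptime/runtime relationship described in the last paragraph can be modelled with a few lines of C. This is a hypothetical illustration of the arithmetic only, not the kernel's timecounter code; `struct toy_th`, `toy_resume`, and `toy_nanoruntime` are invented names:

```c
#include <assert.h>
#include <stdint.h>

struct toy_th {
	uint64_t th_uptime;	/* nanouptime-like: jumps at resume */
	uint64_t th_naptime;	/* total nanoseconds spent suspended */
};

/* on resume, tc_windup() advances the offset by the slept duration */
static void
toy_resume(struct toy_th *th, uint64_t slept)
{
	th->th_uptime += slept;
	th->th_naptime += slept;
}

/* nanoruntime(9)-like: excludes suspended time, so it never jumps */
static uint64_t
toy_nanoruntime(const struct toy_th *th)
{
	return th->th_uptime - th->th_naptime;
}
```

After a resume the uptime value jumps forward by the suspended duration while the runtime value is unchanged, which is exactly why CLOCK_UPTIME can be computed from it safely.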
* Add kqueue_scan_state struct
  The struct keeps track of the end point of an event queue scan by
  persisting the end marker. This will be needed when kqueue_scan() is
  called repeatedly to complete a scan in a piecewise fashion. The end
  marker has to be preserved between calls because otherwise the scan
  might collect an event more than once. If a collected event gets
  reactivated during scanning, it will be added at the tail of the
  queue, out of reach because of the end marker.
  OK mpi@
  (visa, 2020-05-17; 1 file changed, -1/+12)
* Match direct `seltrue' usages with a corresponding `seltrue_kqfilter'.
  This ensures spec_kqfilter() won't return an error when spec_poll()
  returns success for a given device.
  ok visa@
  (mpi, 2020-05-13; 1 file changed, -2/+2)
* Use a double-underscore prefix for local variables declared in macros
  that have arguments. Document this requirement/recommendation in
  style(9).
  prompted by mpi@
  ok deraadt@
  (guenther, 2020-05-10; 5 files changed, -40/+40)
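A minimal example of why the convention matters. `CLAMPED_SUM` is a hypothetical macro invented for illustration, not one of the macros this commit touched; the point is that a plainly-named local like `max` would shadow a caller's argument of the same name, while the double-underscore names (reserved for the implementation, i.e. for system headers like these) cannot collide with application identifiers:

```c
#include <assert.h>

/*
 * Hypothetical statement-expression macro.  If the locals were named
 * "max" and "sum", CLAMPED_SUM(max, 1, 10) would silently evaluate
 * the macro's own "max" (the limit) instead of the caller's variable.
 */
#define CLAMPED_SUM(a, b, limit) __extension__ ({			\
	int __max = (limit);						\
	int __sum = (a) + (b);						\
	__sum > __max ? __max : __sum;					\
})
```

A caller can now safely pass variables named `max` or `sum` as arguments.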
* Initialize the srp_ref in the non-MP version of srp_enter
  Silences an uninitialized warning in net/art.c
  "reasonable" jmatthew@, ok mpi@
  (jca, 2020-05-09; 1 file changed, -3/+10)
* Document that thread credentials are owned by curproc.
  From Vitaliy Makkoveev, ok visa@
  (mpi, 2020-04-28; 1 file changed, -2/+2)
* Correct cdev_ipmi_init()'s poll stub to return 0 instead of ENODEV.
  poll functions shouldn't return errnos, selfalse() and seltrue() exist
  for this reason :)
  While here fix some comments.
  ok visa@
  (mpi, 2020-04-21; 1 file changed, -3/+4)
* Sync existing stacktrace_save() implementations
  Upgrade stacktrace_save() to stacktrace_save_at() on architectures
  where the latter is missing. Define stacktrace_save() as an inline
  function in header <sys/stacktrace.h> to reduce duplication of code.
  OK mpi@
  (visa, 2020-04-18; 1 file changed, -2/+7)
* Mention tail queue in comments.
  (visa, 2020-04-12; 1 file changed, -3/+11)
* Make fifo_kqfilter() honor FREAD|FWRITE just like fifo_poll() does.
  Prevent generating events that do not correspond to how the fifo has
  been opened.
  ok visa@, millert@
  (mpi, 2020-04-08; 1 file changed, -2/+3)
* Abstract the head of knote lists. This allows extending the lists,
  for example, with locking assertions.
  OK mpi@, anton@
  (visa, 2020-04-07; 2 files changed, -8/+15)
* Implement an SMR TAILQ. The only operations which can be used in SMR
  read-side critical sections are SMR_TAILQ_FOREACH(),
  SMR_TAILQ_FIRST() and SMR_TAILQ_NEXT(). Most notably, the last
  element can not be accessed in a read-side critical section.
  OK visa@
  (claudio, 2020-04-07; 1 file changed, -1/+110)
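The forward-only restriction can be illustrated with a toy list in plain C (a hypothetical model, not the sys/smr.h macros; `struct node` and `sum_forward` are invented names). A lock-free reader may only follow pointers that writers publish in a safe order: the head's first pointer and each element's next pointer. Reaching the *last* element the way TAILQ_LAST() does requires the queue's backward tail pointer, which can point into an element being concurrently removed, so it cannot be offered to SMR readers:

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int v;
	struct node *next;	/* the only pointer a reader follows */
};

/* the SMR_TAILQ_FOREACH() pattern: first pointer, then next links */
static int
sum_forward(const struct node *head)
{
	int sum = 0;
	const struct node *n;

	for (n = head; n != NULL; n = n->next)
		sum += n->v;
	return sum;
}
```

A writer inserting at the head would first set the new node's `next`, then publish it as the new head, so a concurrent forward traversal always sees a consistent chain.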
* Fix single thread behaviour in sleep_setup_signal(). If a thread needs
  to suspend (SINGLE_SUSPEND or SINGLE_PTRACE) it needs to do this in
  sleep_setup_signal(). This way the case where single_thread_clear() is
  called before the sleep gets its wakeup call can be correctly handled
  and the thread is put back to sleep in sleep_finish(). If the wakeup
  happens before unsuspend then p_wchan is 0 and the thread will not go
  to sleep again. In case of an unwind an error is returned causing the
  thread to return immediately with that error.
  With and OK mpi@ kettenis@
  (claudio, 2020-04-06; 1 file changed, -1/+2)
* Declare pledgenames[] as const.
  OK deraadt@
  (visa, 2020-04-05; 1 file changed, -3/+3)
* crank to 6.7-beta
  (deraadt, 2020-04-05; 1 file changed, -3/+3)
* Prevent shadowing of local variable by the EV_SET() macro.
  Use two underbars to start the locally defined variable, as suggested
  by guenther@. The other option to avoid namespace conflict would be to
  start the identifier with an underbar and a capital.
  ok beck@, guenther@
  (mpi, 2020-04-04; 1 file changed, -9/+9)
* Kill unused cdev_mousewr_init().
  ok jca@, jsg@
  (mpi, 2020-04-03; 1 file changed, -8/+1)
* Adjust SMR_ASSERT_CRITICAL() and SMR_ASSERT_NONCRITICAL() so that the
  panic message shows the actual code location of the assert. Do this by
  moving the assert logic inside the macros.
  Prompted by and OK claudio@
  OK mpi@
  (visa, 2020-04-03; 1 file changed, -5/+9)
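The reason moving the logic into the macro changes the reported location is that `__FILE__`/`__LINE__` expand where they are written: inside a helper function they always name the helper's own source line, while inside a macro body they name the caller's line. A hypothetical sketch of the pattern (`TOY_ASSERT` and `record` are invented names, not the smr code):

```c
#include <assert.h>

/* stand-ins for the panic path: remember where the failure was seen */
static const char *last_file;
static int last_line;

static void
record(const char *file, int line)
{
	last_file = file;
	last_line = line;
}

/*
 * Assert logic inside the macro body: __FILE__ and __LINE__ expand at
 * the call site, so a failure reports the caller's location.
 */
#define TOY_ASSERT(cond) do {						\
	if (!(cond))							\
		record(__FILE__, __LINE__);				\
} while (0)
```

Had the `if (!(cond))` check lived inside a function that the macro merely called with no location arguments, every failure would point at that function's file and line instead.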
* double ARG_MAX
  (deraadt, 2020-04-02; 1 file changed, -2/+2)
* Introduce stacktrace_save_at() and make use of it in dt(4).
  This variant of stacktrace_save() takes an additional argument to skip
  an arbitrary number of frames. This allows skipping the frames used to
  execute the profiling code and produces outputs that are easier to
  understand.
  Inputs from and ok visa@
  (mpi, 2020-03-25; 1 file changed, -1/+2)
* Use atomic operations to update ps_singlecount. This makes
  single_thread_check() safe to be called without KERNEL_LOCK().
  single_thread_wait() needs to use sleep_setup() and sleep_finish()
  instead of tsleep() to make sure no wakeup() is lost.
  Input kettenis@, with and OK visa@
  (claudio, 2020-03-20; 1 file changed, -2/+2)
* tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration
  This macro will be useful for truncating durations below INFSLP
  (UINT64_MAX) when converting from a timespec or timeval to a count of
  nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
  rwsleep_nsec(9). A relative timespec can hold many more nanoseconds
  than a uint64_t can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check
  for overflow, returning UINT64_MAX if the conversion would overflow a
  uint64_t. Thus, MAXTSLP will make it easy to avoid inadvertently
  passing INFSLP to tsleep_nsec(9) et al. when the caller intended to
  set a timeout. The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

  The macro may also be useful for rejecting intervals that are "too
  large", e.g. for sockets with timeouts, if the timeout duration is to
  be stored as a uint64_t in an object in the kernel. The code in such a
  case might look something like this:

	case SIOCTIMEOUT: {
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;
		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;
		obj.timeout = nsecs;
		break;
	}

  Idea suggested by visa@.
  ok visa@
  (cheloha, 2020-03-20; 1 file changed, -1/+2)
* Move unveil data structures away from the proc.h header into the
  implementation file. Pushing the assignment of ps_uvpcwd down to
  unveil_add() is required but it doesn't introduce any functional
  change.
  ok mpi@ semarie@
  (anton, 2020-03-19; 2 files changed, -20/+4)
* regen
  (anton, 2020-03-18; 2 files changed, -4/+4)
* Restart child process scan in dowait4() if single_thread_wait() sleeps.
  This ensures that the conditions checked are still in force. The sleep
  breaks atomicity, allowing another thread to alter the state.

  single_thread_set() should return immediately after sleep when called
  from dowait4() because there is no guarantee that the process pr still
  exists. When called from single_thread_set(), the process is that of
  the calling thread, which prevents process pr from disappearing.
  OK anton@, mpi@, claudio@
  (visa, 2020-03-18; 1 file changed, -2/+2)
* Keep track of traced children under a list of orphans while they are
  being reparented to a debugger process.

  Also re-parent exiting traced processes to their original parent, if
  it is still alive, after the debugger has seen the exit status. Logic
  comes from FreeBSD, pointed out by guenther@.

  While here rename proc_reparent() into process_reparent() and get rid
  of superfluous checks.
  ok visa@
  (mpi, 2020-03-16; 2 files changed, -4/+16)
* In order to unlock flock(2), make writes to the f_iflags field of
  struct file atomic. This also gets rid of the last kernel lock
  protected field in the scope of struct file.
  ok mpi@ visa@
  (anton, 2020-03-13; 2 files changed, -5/+4)
* Rename "sigacts" flag field to avoid conflict with the "process" one.
  This shows that atomic_* operations should not be necessary to write
  to this field, unlike with the process one.

  The advantage of using a somewhat-unique prefix for struct members is
  moot when multiple definitions use the same prefix :o)
  From Amit Kulkarni, ok claudio@
  (mpi, 2020-03-13; 1 file changed, -2/+2)
* Move the sigprop definition and the other bits under SIGPROP into
  kern_sig.c where they are currently added by the include. While doing
  that mark the sigprop array as const.
  OK mpi@ anton@ millert@
  (claudio, 2020-03-11; 1 file changed, -44/+1)