path: root/sys/kern/sys_generic.c
Commit message [Author, Age, Files, Lines]
* Revert "Implement select(2) and pselect(2) on top of kqueue." [visa, 2021-01-08, 1 file, -148/+58]
  The use of kqueue as backend has introduced a significant regression in the performance of select(2), so go back to using the original code.
  Some additional management overhead is to be expected when using kqueue. However, the overhead of the current implementation is too high.
  Reported by bluhm@ on bugs@
* Simplify parameters of pselregister(). [visa, 2020-12-26, 1 file, -8/+5]
  OK mpi@
* Implement select(2) and pselect(2) on top of kqueue. [mpi, 2020-12-22, 1 file, -55/+148]
  The given set of fds is converted to equivalent kevents using EV_SET(2) and passed to the scanning internals of kevent(2): kqueue_scan().
  ktrace(1) will now output the converted kevents on top of the usual set bits, to make it possible to find errors in the conversion.
  This switch implies that select(2) and pselect(2) will now query the underlying kqfilters instead of the *_poll() routines.
  Based on similar work done on DragonFlyBSD, with inputs from visa@, millert@, anton@, cheloha@, thanks!
  ok visa@
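The conversion step described in the 2020-12-22 commit (each bit set in an fd_set becomes one event) can be illustrated with a small userland sketch. This is not the kernel code: `struct sketch_event`, `enum sketch_filter` and `fdsets_to_events()` are hypothetical names made up for this example, and the real change builds `struct kevent` entries with EV_SET and hands them to kqueue_scan().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical stand-ins for kqueue filter types; not the real EVFILT_* values. */
enum sketch_filter { SKETCH_READ, SKETCH_WRITE };

/* Hypothetical, simplified event record: one (descriptor, filter) pair. */
struct sketch_event {
	int ident;
	enum sketch_filter filter;
};

/*
 * Walk descriptors 0..nfds-1 and emit one event for every bit set in
 * the read and write sets. Returns the number of events stored in
 * out, or -1 if out has room for fewer than the events required.
 */
int
fdsets_to_events(int nfds, fd_set *rd, fd_set *wr,
    struct sketch_event *out, size_t max)
{
	size_t n = 0;
	int fd;

	assert(nfds >= 0 && nfds <= FD_SETSIZE);
	for (fd = 0; fd < nfds; fd++) {
		if (rd != NULL && FD_ISSET(fd, rd)) {
			if (n == max)
				return -1;
			out[n].ident = fd;
			out[n].filter = SKETCH_READ;
			n++;
		}
		if (wr != NULL && FD_ISSET(fd, wr)) {
			if (n == max)
				return -1;
			out[n].ident = fd;
			out[n].filter = SKETCH_WRITE;
			n++;
		}
	}
	return (int)n;
}
```

A descriptor present in both sets yields two events, one per filter, which is why the event array can be larger than the number of descriptors.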
* expose timeval/timespec from system calls into ktrace, before determining [deraadt, 2020-10-02, 1 file, -7/+8]
  if they are out of range, making it easier to isolate reason for EINVAL
  ok cheloha
* poll(2), ppoll(2), pselect(2), select(2): tsleep(9) -> tsleep_nsec(9) [cheloha, 2020-03-20, 1 file, -9/+15]
  With input from visa@.
  ok visa@
* Push the KERNEL_LOCK() inside pgsigio() and selwakeup(). [mpi, 2020-02-14, 1 file, -2/+15]
  The 3 subsystems: signal, poll/select and kqueue can now be addressed separately.
  Note that bpf(4) and audio(4) currently delay the wakeups to a separate context in order to respect the KERNEL_LOCK() requirement. Sockets (UDP, TCP) and pipes spin to grab the lock for the same reasons.
  ok anton@, visa@
* Make writes to the f_flag field of `struct file' MP-safe using atomic [anton, 2020-02-01, 1 file, -5/+5]
  operations. Since the type of f_flag must change in order to use the atomic(9) API, reorder the struct in order to avoid padding; as pointed out by tedu@.
  ok mpi@ visa@
* Introduce wakeup_proc() a function to un-SSTOP/SSLEEP a thread. [mpi, 2020-01-16, 1 file, -9/+3]
  This moves most of the SCHED_LOCK() related to protecting the sleepqueue and its states to kern/kern_sync.c
  Name suggestion from jsg@, ok kettenis@, visa@
* Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and [visa, 2020-01-08, 1 file, -25/+1]
  FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of the ID parameter inside the sigio code. Also add cases for FIOSETOWN and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before. These changes allow removing the ID translation from sys_fcntl() and sys_ioctl().
  Idea from NetBSD
  OK mpi@, claudio@
* poll(2), ppoll(2), select(2), pselect(2): always set P_SELECT before tsleep [cheloha, 2019-10-03, 1 file, -9/+5]
  When I introduced the tsleep loops in r1.23 I screwed it up and introduced a bug: on EWOULDBLOCK we loop but fail to reset P_SELECT, so the thread will continue to sleep but miss all relevant descriptor activity after INT_MAX ticks have elapsed.
  Spotted by mpi@ back in July.
  ok mpi@
* push the KERNEL_LOCK deeper on read(2) and write(2) [semarie, 2019-06-22, 1 file, -2/+5]
  unlock the read(2) and write(2) syscall families, and push the KERNEL_LOCK deeper in the code path. KERNEL_LOCK is managed per file type in fileops handlers (fo_read, fo_write, and fo_close).
  read(2) and write(2) on sockets are KERNEL_LOCK-free.
  initial work from mpi@ and ians@
  ok mpi@ kettenis@ visa@ ians@
* Make resource limit access MP-safe. So far, the copy-on-write sharing [visa, 2019-06-21, 1 file, -2/+2]
  of resource limit structs has been done between processes. By applying copy-on-write also between threads, threads can read rlimits in a nearly lock-free manner.
  Inspired by code in DragonFly BSD and FreeBSD.
  OK mpi@, agreement from jmatthew@ and anton@
* select(2), pselect(2), poll(2), ppoll(2): Support full timeout range. [cheloha, 2019-01-21, 1 file, -63/+58]
  Remove the arbitrary and undocumented 24hr limits for timeouts from these interfaces. To do so, loop tsleep(9) to chip away at timeouts larger than what tsleep(9) can handle in one call. Use timerisvalid(3)/timespecisvalid() for input validation instead of itimerfix()/timespecfix() to avoid the 100 million second upper bounds those functions introduce.
  POSIX requires support for timeouts of at least 31 days for select(2) and pselect(2), so these changes make our implementation more compliant.
  Other improvements here include better variable names for the time stuff and more consolidated timeout logic with less backwards goto jumping, all of which made dopselect() and doppoll() a bear to read.
  Naming improvements prompted by tedu@ in a prior patch for nanosleep(2). With input from deraadt@. Validation bug spotted by matthew@ in an earlier version.
  ok visa@
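The "loop to chip away at large timeouts" idea from the 2019-01-21 commit can be sketched in userland C as follows. This only shows the chunking arithmetic under stated assumptions: `SKETCH_MAX_SLEEP_NSEC`, `sketch_sleep()` and `chunked_sleep()` are hypothetical names, the 24-hour cap is an illustrative value, and the stand-in sleep function does not block. The kernel code additionally has to deal with wakeups, signals, and re-reading the clock, none of which is modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-call cap on a single sleep: 24 hours in nanoseconds. */
#define SKETCH_MAX_SLEEP_NSEC (UINT64_C(24) * 60 * 60 * 1000000000)

/*
 * Stand-in for a sleep primitive that only accepts bounded durations.
 * It does not actually block; it only checks the bound.
 */
static void
sketch_sleep(uint64_t nsec)
{
	assert(nsec <= SKETCH_MAX_SLEEP_NSEC);
}

/*
 * Sleep for total_nsec by issuing as many bounded sleeps as needed.
 * Returns the number of sleep calls that were made.
 */
unsigned int
chunked_sleep(uint64_t total_nsec)
{
	unsigned int calls = 0;

	while (total_nsec > 0) {
		uint64_t chunk = total_nsec;

		/* Never hand the primitive more than it can take. */
		if (chunk > SKETCH_MAX_SLEEP_NSEC)
			chunk = SKETCH_MAX_SLEEP_NSEC;
		sketch_sleep(chunk);
		total_nsec -= chunk;
		calls++;
	}
	return calls;
}
```

With the assumed 24-hour cap, a 31-day timeout (the minimum the commit message says POSIX requires) takes 31 bounded sleeps instead of being rejected.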
* Reorder checks in the read/write(2) family of syscalls to prepare making [mpi, 2018-08-20, 1 file, -152/+162]
  file operations mp-safe.
  This change makes it clear that `f_offset' is only accessed in vn_read() and vn_write(), which will help taking it out of the KERNEL_LOCK().
  This refactoring uncovered a race in vn_read() which is now documented and will be addressed in a later diff.
  ok visa@
* Don't pass an uninitialised size value to free(9). Pointer argument is [jsg, 2018-07-14, 1 file, -2/+2]
  NULL in this path so free will return early without accessing it.
  ok jca@ tb@
* Move socket & pipe specific logic into their ioctl handlers. [mpi, 2018-07-10, 1 file, -17/+5]
  ok visa@, tb@
* Protect per-file counters and document which lock is used to protect [mpi, 2018-05-08, 1 file, -1/+5]
  the other fields.
  Once we no longer have any [k] (kernel lock) protections, we'll be able to unlock almost all network related syscalls.
  Inputs from and ok bluhm@, visa@
* Move FREF() inside fd_getfile(). [mpi, 2018-04-27, 1 file, -10/+1]
  ok visa@
* Call FREF() right after fd_getfile_mode() in sys_ioctl(). [mpi, 2018-04-09, 1 file, -16/+16]
  ok visa@, bluhm@
* Stop assuming <sys/file.h> will pull in fcntl.h when _KERNEL is defined. [guenther, 2018-01-02, 1 file, -1/+2]
  ok millert@ sthen@
* Assert that the corresponding socket is locked when manipulating socket [mpi, 2017-06-26, 1 file, -2/+2]
  buffers.
  This is one step towards unlocking TCP input path. Note that all the functions asserting for the socket lock are not necessarily MP-safe. All the fields of 'struct socket' aren't protected.
  Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to tell when a filter needs to lock the underlying data structures. Logic and name taken from NetBSD.
  Tested by Hrvoje Popovski.
  ok claudio@, bluhm@, mikeb@
* Rename pfind(9) into tfind(9) to reflect that it deals with threads. [mpi, 2017-01-24, 1 file, -3/+3]
  While here document prfind(9).
  with and ok guenther@
* Split PID from TID, giving processes a PID unrelated to the TID of their [guenther, 2016-11-07, 1 file, -9/+9]
  initial thread
  ok jsing@ kettenis@
* remove some casts that aren't necessary. [tedu, 2016-07-05, 1 file, -6/+6]
* ktrace support for pollfd[] arrays [deraadt, 2016-06-07, 1 file, -3/+7]
  ok guenther
* remove stale lint annotations [tedu, 2015-12-05, 1 file, -5/+1]
* refactor pledge_*_check and pledge_fail functions [semarie, 2015-11-01, 1 file, -2/+2]
  - rename _check function without suffix: a "pledge" function called from anywhere is a "check" function.
  - make the pledge_fail call the responsibility of the _check function; remove it from the caller.
  - make proper use of the (potential) returned error of _check() functions.
  - adds pledge_kill() and pledge_protexec()
  with and OK deraadt@
* move SS_DNS socket check from kern_pledge.c to sys_generic.c [semarie, 2015-10-18, 1 file, -4/+11]
  this check has nothing to do with pledge(2). make it live in the sys_ioctl() call.
  while here, move the (fp == NULL) check early and remove the duplicate check from pledge_ioctl_check().
  ok guenther@ deraadt@
* pledge_ioctl_check() will do the killing if necessary; if it returns, [deraadt, 2015-10-11, 1 file, -2/+2]
  that is an errno to pass up to the calling system call instead. test case is "who < /dev/null", via ttyname().
* another stray ) [deraadt, 2015-10-09, 1 file, -2/+2]
* shortcircuit TIOCGETA to directly return ENOTTY for non-ttys. It could [deraadt, 2015-10-09, 1 file, -2/+3]
  be called against a non-tty fd, so as to test "is this a tty". Discovered by sthen and rob pierce at the same time.
* Rename tame() to pledge(). This fairly interface has evolved to be more [deraadt, 2015-10-09, 1 file, -4/+4]
  strict than anticipated. It allows a programmer to pledge/promise/covenant that their program will operate within an easily defined subset of the Unix environment, or it pays the price.
* Convert _TM_ flags to TAME_ flags, collapsing the entire mapping [deraadt, 2015-09-11, 1 file, -2/+2]
  layer because the strings select the right options. Mechanical conversion.
  ok guenther
* Only include <sys/tame.h> in the .c files that need it [guenther, 2015-09-11, 1 file, -1/+2]
  ok deraadt@ miod@
* Move to tame(int flags, char *paths[]) API/ABI. [deraadt, 2015-08-22, 1 file, -2/+1]
  The pathlist is a whitelist of dirs and files; anything else returns ENOENT. Recommendation is to use a narrowly defined list. Also add TAME_FATTR, which permits explicit change operations against "struct stat" fields. Some other TAME_ flags are refined slightly.
  Not cranking libc now, since nothing committed in base uses this and the timing is uncomfortable for others. Discussed with many; thanks for a few bug fixes from semarie, doug, guenther.
  ok guenther
* Add ktracing of structs iovec, msghdr, and cmsghdr for {,p}{read,write}v(), [guenther, 2015-07-28, 1 file, -1/+9]
  sendmsg(), and recvmsg(). For cmsghdr, the len, level, and type are always shown, and for SOL_SOCKET,SCM_RIGHTS the fd numbers being passed are shown.
  ok millert@ deraadt@
* tame(2) is a subsystem which restricts programs into a "reduced feature [deraadt, 2015-07-19, 1 file, -4/+10]
  operating model". This is the kernel component; various changes should proceed in-tree for a while before userland programs start using it.
  ok miod, discussions and help from many
* Set POLLHUP even if no valid events were specified as per POSIX. [millert, 2015-05-10, 1 file, -3/+5]
  Since we use the poll backend for select(2), care must be taken not to set the fd's bit in writefds in this case. A kernel-only flag, POLLNOHUP, is used by selscan() to tell the poll backend not to return POLLHUP on EOF. This is currently only used by fifo_poll(). The fifofs regress now passes.
  OK guenther@
* Introduce fd_getfile_mode() and use it where fd_getfile() is directly [mpi, 2015-04-30, 1 file, -19/+6]
  followed by a mode check.
  This will simplify the ref/unref dance as soon as fd_getfile() will increment fp's reference counter.
  Idea from and ok guenther@, ok millert@
* Remove useless extern definitions of nselcoll and selwait. [millert, 2015-02-12, 1 file, -2/+1]
  OK guenther@
* convert bcopy to memcpy. ok millert [tedu, 2014-12-10, 1 file, -3/+3]
* pass size argument to free() [deraadt, 2014-11-03, 1 file, -8/+8]
  ok doug tedu
* use mallocarray to get the array of pollfd structs. [dlg, 2014-10-13, 1 file, -5/+9]
  tweaks and ok millert@ deraadt@
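The point of allocating an array through a dedicated helper, as in the 2014-10-13 pollfd change, is that the element count times the element size is checked for overflow before anything is allocated. A minimal userland sketch of that check follows; `sketch_allocarray()` is a hypothetical name for this example and is not the kernel's mallocarray(9), which takes further arguments.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for nmemb elements of size bytes each.
 * Returns NULL instead of a too-small buffer when nmemb * size
 * would overflow size_t.
 */
void *
sketch_allocarray(size_t nmemb, size_t size)
{
	/* nmemb * size overflows exactly when nmemb > SIZE_MAX / size. */
	if (size != 0 && nmemb > SIZE_MAX / size)
		return NULL;
	return malloc(nmemb * size);
}
```

Without the check, a caller-controlled count such as the nfds argument of poll(2) could wrap the multiplication and produce a buffer much smaller than the code that fills it expects.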
* trim some casts [tedu, 2014-07-13, 1 file, -2/+2]
* use mallocarray where arguments are multiplied. ok deraadt [tedu, 2014-07-13, 1 file, -2/+2]
* Refactor out dosigsuspend() function [matthew, 2014-07-12, 1 file, -11/+5]
  Discussed with guenther and kettenis
* add a size argument to free. will be used soon, but for now default to 0. [tedu, 2014-07-12, 1 file, -8/+8]
  after discussions with beck deraadt kettenis.
* Repair compilability after the recent uvmexp changes, especially for [miod, 2014-07-08, 1 file, -1/+3]
  not compile-time-known page size platforms.
* decouple struct uvmexp into a new file, so that uvm_extern.h and sysctl.h [deraadt, 2014-07-08, 1 file, -3/+1]
  don't need to be married.
  ok guenther miod beck jsing kettenis
* Eliminates struct pcred by moving the real and saved ugids into [guenther, 2014-03-30, 1 file, -2/+2]
  struct ucred; struct process then directly links to the ucred
  Based on a discussion at c2k10 or so before noting that FreeBSD and NetBSD did this too.
  ok matthew@