path: root/sys/kern/kern_descrip.c
Commit message [author; date; files changed; lines removed/added]
* Move FRELE() outside fdplock in dup*(2) code. This avoids a potential lock order issue with the file close path. The FRELE() can trigger the close path during dup*(2) if another thread manages to close the file descriptor simultaneously. This race is possible because the file reference is taken before the file descriptor table is locked for write access.
  Vitaliy Makkoveev agrees
  OK anton@ mpi@
  [author: visa; date: 2020-06-11; files: 1; lines: -4/+7]
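  The ordering described in this entry can be modelled in a few lines of userland C. This is only a sketch: `toy_file`, `toy_frele()`, `toy_dup()` and the boolean standing in for fdplock are hypothetical names, not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are the kernel's. */
struct toy_file {
	int	refs;		/* reference count */
	bool	closed;		/* set once the close path has run */
};

static bool toy_fdplock_held;	/* stands in for the descriptor table lock */

/* Drop one reference; the last release runs the close path. */
static void
toy_frele(struct toy_file *fp)
{
	if (--fp->refs == 0) {
		/* The close path may take other locks: never nest it in fdplock. */
		assert(!toy_fdplock_held);
		fp->closed = true;
	}
}

/* Model of a dup-like operation racing with a close of the same descriptor. */
static int
toy_dup(struct toy_file *fp, bool closed_by_other_thread)
{
	int error = 0;

	fp->refs++;			/* reference taken before locking */
	toy_fdplock_held = true;
	if (closed_by_other_thread) {
		fp->refs--;		/* the other thread dropped the table's reference */
		error = -1;
	}
	toy_fdplock_held = false;	/* unlock first ... */
	toy_frele(fp);			/* ... then release our reference */
	return error;
}
```

  With a file that starts at one reference, `toy_dup(&f, true)` ends up running the close path from `toy_frele()`, and it does so only after the model lock has been dropped.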
* In order to unlock flock(2), make writes to the f_iflags field of struct file atomic. This also gets rid of the last kernel lock protected field in the scope of struct file.
  ok mpi@ visa@
  [author: anton; date: 2020-03-13; files: 1; lines: -4/+4]
* Release the file descriptor table lock before calling closef() in finishdup(). This makes the order of operations similar to that of fdrelease() and removes a case where lock ordering might cause problems.
  OK anton@, mpi@
  [author: visa; date: 2020-02-26; files: 1; lines: -21/+23]
* Move setting of UF_EXCLOSE file descriptor flag inside finishdup(). This makes it easier to release fdplock before calling closef().
  OK mpi@, anton@
  [author: visa; date: 2020-02-18; files: 1; lines: -11/+21]
* Move kernel locking inside knote_fdclose() from finishdup() and fdrelease(). This makes the upper layer of file descriptor closing free of KERNEL_LOCK() when the process does not use kqueue.
  The kernel locking around fdremove() and knote_fdclose() is no longer needed because kqueue_register() checks if there has been a race with file descriptor close. Moreover, the locking became ineffective against these races when filterops callbacks were allowed to sleep.
  OK anton@, mpi@
  [author: visa; date: 2020-02-05; files: 1; lines: -12/+2]
* Make writes to the f_flag field of `struct file' MP-safe using atomic operations. Since the type of f_flag must change in order to use the atomic(9) API, reorder the struct in order to avoid padding; as pointed out by tedu@.
  ok mpi@ visa@
  [author: anton; date: 2020-02-01; files: 1; lines: -5/+8]
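  As a rough illustration of atomic flag updates, the sketch below uses C11 `<stdatomic.h>` in userland rather than the kernel's atomic(9) API. The struct, the flag values and the helper names are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag values, chosen only for this example. */
#define TOY_FNONBLOCK	0x0004u
#define TOY_FASYNC	0x0040u

struct toy_file {
	atomic_uint	f_flag;		/* unsigned int so atomic ops apply */
};

/* Set bits without losing a concurrent update to other bits. */
static void
toy_set_flag(struct toy_file *fp, unsigned int bits)
{
	atomic_fetch_or(&fp->f_flag, bits);
}

/* Clear bits with a single atomic read-modify-write. */
static void
toy_clear_flag(struct toy_file *fp, unsigned int bits)
{
	atomic_fetch_and(&fp->f_flag, ~bits);
}
```

  A plain `fp->f_flag |= bits` would be a separate load and store, so two threads updating different bits could overwrite each other; the read-modify-write calls above avoid that.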
* Skip fdplock when freeing a file descriptor table. The lock is not necessary because other threads cannot access the data structure. This fixes the following lock order issue:

  witness: lock order reversal:
   1st 0xfffffd81d821d248 fdlock (&newfdp->fd_fd.fd_lock)
   2nd 0xffff800000fe45b8 primlk (&prime_fpriv->lock)
  lock order "&prime_fpriv->lock"(rwlock) -> "&newfdp->fd_fd.fd_lock"(rwlock) first seen at:
  #0  witness_checkorder+0x449
  #1  rw_enter_write+0x43
  #2  dma_buf_fd+0x8c
  #3  drm_gem_prime_handle_to_fd+0xed
  #4  drmioctl+0xdc
  #5  VOP_IOCTL+0x55
  #6  vn_ioctl+0x64
  #7  sys_ioctl+0x2f6
  #8  syscall+0x389
  #9  Xsyscall+0x128
  lock order "&newfdp->fd_fd.fd_lock"(rwlock) -> "&prime_fpriv->lock"(rwlock) first seen at:
  #0  witness_checkorder+0x449
  #1  rw_enter_write+0x43
  #2  drm_gem_object_release_handle+0x5e
  #3  idr_for_each+0xee
  #4  drm_gem_release+0x1f
  #5  drmclose+0x144
  #6  spec_close+0x213
  #7  VOP_CLOSE+0x49
  #8  vn_closefile+0x9b
  #9  fdrop+0x8b
  #10 closef+0xaf
  #11 fdfree+0xd4
  #12 exit1+0x1cf
  #13 sys_exit+0x16
  #14 syscall+0x389
  #15 Xsyscall+0x128

  OK mpi@
  [author: visa; date: 2020-01-08; files: 1; lines: -3/+1]
* Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of the ID parameter inside the sigio code. Also add cases for FIOSETOWN and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before. These changes allow removing the ID translation from sys_fcntl() and sys_ioctl().
  Idea from NetBSD
  OK mpi@, claudio@
  [author: visa; date: 2020-01-08; files: 1; lines: -16/+4]
* Make kqlist part of filedesc and serialize access to it using fdplock. This choice of locking is guided by knote_fdclose().
  OK mpi@, anton@
  [author: visa; date: 2020-01-06; files: 1; lines: -1/+4]
* Fix a file descriptor close race in kqueue_register()
  After inserting a knote, check that the associated file descriptor still references the same file. Remove the knote if the descriptor has changed because otherwise the kqueue becomes inconsistent with the file descriptor table.
  There is an analogous race in fcntl(F_SETLK). It is already handled, but the code can be simplified by using the same check as in kqueue_register().
  Fix inspired by DragonFly BSD
  OK mpi@, anton@
  [author: visa; date: 2020-01-03; files: 1; lines: -6/+15]
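  The "insert, then re-check" pattern from this entry can be sketched as a single-threaded model. The table, `toy_knote` and the `simulate_close` switch are hypothetical; the real code works on kernel data structures under proper locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFILES 4

struct toy_knote {
	int	 fd;
	void	*fp;		/* file the note was registered for */
	bool	 attached;
};

/*
 * Register a note for fd, then verify that the descriptor still names the
 * same file. simulate_close models another thread closing fd in between.
 */
static int
toy_register(void *table[TOY_NFILES], struct toy_knote *kn, int fd,
    bool simulate_close)
{
	void *fp = table[fd];

	if (fp == NULL)
		return -1;
	kn->fd = fd;
	kn->fp = fp;
	kn->attached = true;		/* note inserted */
	if (simulate_close)
		table[fd] = NULL;	/* the racing close */
	if (table[fd] != fp) {		/* descriptor changed under us */
		kn->attached = false;	/* remove the note again */
		return -1;
	}
	return 0;
}
```

  Without the final comparison, the note would stay attached to a descriptor that no longer refers to the file it was registered for.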
* Allow concurrent reads of the f_offset field of struct file by serializing both read/write operations using the existing file mutex. The vnode lock still grants exclusive write access to the offset; the mutex is only used to make the actual write atomic and prevent any concurrent reader from observing intermediate values.
  ok mpi@ visa@
  [author: anton; date: 2019-08-05; files: 1; lines: -5/+9]
* Do not relock fdp in fdrelease(). This prevents unnecessary locking in the common case.
  OK mpi@
  [author: visa; date: 2019-07-15; files: 1; lines: -7/+14]
* Revert anton@ changes about read/write unlocking
  https://marc.info/?l=openbsd-cvs&m=156277704122293&w=2
  ok anton@
  [author: solene; date: 2019-07-12; files: 1; lines: -76/+5]
* Make read/write of the f_offset field belonging to struct file MP-safe; as part of the effort to unlock the kernel. Instead of relying on the vnode lock, introduce a dedicated lock per file. Exclusive write access is granted using the new foffset_enter and foffset_leave API. A convenience function foffset_get is also available for threads that only need to read the current offset.
  The lock acquisition order in vn_write has been changed to match the one in vn_read in order to avoid a potential deadlock. This change also gets rid of a documented race in vn_read().
  Inspired by the FreeBSD implementation.
  With help and ok mpi@ visa@
  [author: anton; date: 2019-07-10; files: 1; lines: -5/+76]
* Lock the kernel when removing file descriptors from the descriptor table. This should prevent a race with kevent when unlocked code closes file descriptors that are fully set up.
  OK mpi@
  [author: visa; date: 2019-07-03; files: 1; lines: -1/+9]
* Return EINVAL, not EBADF for fcntl(fd, F_GETLK) of a non-vnode. Matches the recent F_SETLK change, POSIX and the man page.
  [author: millert; date: 2019-06-26; files: 1; lines: -2/+2]
* Return EINVAL not EBADF when trying to lock a non-vnode. This behavior matches POSIX and our own fcntl(2) man page.
  OK anton@ deraadt@
  [author: millert; date: 2019-06-25; files: 1; lines: -2/+2]
* Make resource limit access MP-safe. So far, the copy-on-write sharing of resource limit structs has been done between processes. By applying copy-on-write also between threads, threads can read rlimits in a nearly lock-free manner.
  Inspired by code in DragonFly BSD and FreeBSD.
  OK mpi@, agreement from jmatthew@ and anton@
  [author: visa; date: 2019-06-21; files: 1; lines: -4/+4]
* dup2(n,n) would rlimit check before handling the n==n shortcut, and incorrectly return EBADF when n>curlim.
  ok millert guenther tedu
  [author: deraadt; date: 2019-05-13; files: 1; lines: -6/+6]
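  The order of checks matters here, as the following toy model tries to show. `toy_dup2()`, its boolean table and the `curlim` parameter are hypothetical stand-ins and make no claim about how sys_dup2() is actually structured.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_NFILES 16

/* Returns the new descriptor, or -1 with errno set. */
static int
toy_dup2(bool open_fds[TOY_NFILES], int oldfd, int newfd, int curlim)
{
	if (oldfd < 0 || oldfd >= TOY_NFILES || !open_fds[oldfd]) {
		errno = EBADF;
		return -1;
	}
	/* Shortcut first: dup2(n, n) on an open descriptor changes nothing. */
	if (oldfd == newfd)
		return newfd;
	/* Only now apply the limit check to the target descriptor. */
	if (newfd < 0 || newfd >= curlim || newfd >= TOY_NFILES) {
		errno = EBADF;
		return -1;
	}
	open_fds[newfd] = true;
	return newfd;
}
```

  With descriptor 10 open and a limit of 5, `toy_dup2(t, 10, 10, 5)` returns 10. If the limit check came first, the same call would fail with EBADF, which is the bug this entry describes.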
* trace struct flock; ok visa@
  [author: anton; date: 2018-11-05; files: 1; lines: -1/+9]
* Remove all knotes from a file descriptor before closing the file in fdfree(). This fixes a resource leak with cyclic kqueue references and prevents a kernel stack exhaustion scenario with long kqueue chains.
  OK mpi@
  [author: visa; date: 2018-08-24; files: 1; lines: -1/+2]
* Use explicit fd indexing to access fd_ofiles, to clarify the code.
  OK mpi@
  [author: visa; date: 2018-08-21; files: 1; lines: -7/+6]
* Make fnew() return a new file with only one reference. This makes the API more logical.
  OK kettenis@ mpi@
  [author: visa; date: 2018-08-20; files: 1; lines: -3/+2]
* Remove a stale/obvious comment.
  OK mpi@
  [author: visa; date: 2018-08-19; files: 1; lines: -7/+1]
* Update fd_freefile when filtering/closing kqueue descriptors in fdcopy().
  Prior to r1.153 of kern_descrip.c, the kqueue descriptors were removed using fdremove(), which reset fd_freefile as appropriate. The new code simply avoids adding the descriptor to the new table, however this means that fd_freefile can be left with an incorrect value, resulting in a file descriptor allocation "hole". Restore the previous behaviour by lowering fd_freefile as appropriate when dropping descriptors.
  Issue found via golang regress tests.
  ok deraadt@ mpi@ visa@
  [author: jsing; date: 2018-08-10; files: 1; lines: -2/+5]
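  A minimal model of lowering a "first free slot" hint while filtering a copy is given below. The struct layout, the `skip[]` array and `toy_fdcopy()` are hypothetical and much simpler than the kernel's fdcopy().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NFILES 8

struct toy_fdtable {
	void	*ofiles[TOY_NFILES];	/* NULL means the slot is free */
	int	 freefile;		/* hint: no free slot below this index */
};

/* Copy src into dst, dropping every open slot for which skip[] is nonzero. */
static void
toy_fdcopy(struct toy_fdtable *dst, const struct toy_fdtable *src,
    const int skip[TOY_NFILES])
{
	int i;

	dst->freefile = src->freefile;
	for (i = 0; i < TOY_NFILES; i++) {
		if (src->ofiles[i] != NULL && skip[i]) {
			dst->ofiles[i] = NULL;
			/* A dropped slot is free: lower the hint if needed. */
			if (i < dst->freefile)
				dst->freefile = i;
			continue;
		}
		dst->ofiles[i] = src->ofiles[i];
	}
}
```

  If the hint were copied unchanged, an allocator that starts searching at `freefile` would never hand out the dropped slot again, leaving the allocation "hole" mentioned above.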
* Move socket & pipe specific logic into their ioctl handlers.
  ok visa@, tb@
  [author: mpi; date: 2018-07-10; files: 1; lines: -24/+4]
* Fix an argument type error that happens when translating fcntl(F_SETOWN) to ioctl(TIOCSPGRP). The ioctl handlers expect a pointer to an int, so read the argument into a local int variable and pass the variable's address to the handler instead of referencing SCARG(uap, arg) directly.
  OK guenther@, mpi@
  [author: visa; date: 2018-07-07; files: 1; lines: -8/+9]
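  The pattern of narrowing into a local int before taking its address can be sketched as follows. Both function names are hypothetical, and the `long` parameter merely assumes that the syscall argument slot is wider than an int.

```c
#include <assert.h>

/* Hypothetical handler that, like the ioctl handlers, expects an int *. */
static int
toy_setown_handler(void *data)
{
	return *(int *)data;
}

/* The argument arrives in a wider slot (assumed to be a long here). */
static int
toy_fcntl_setown(long arg)
{
	int tmp = (int)arg;	/* read the argument into a local int ... */

	/* ... and pass its address, so the handler reads a real int. */
	return toy_setown_handler(&tmp);
}
```

  Passing the address of the wide slot itself would make the handler read only part of it, and on a big-endian 64-bit machine that part would not be the low-order half.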
* Update the file reference count field `f_count' using atomic operations instead of using a mutex for update serialization. Use a per-fdp mutex to manage updating of file instance pointers in the `fd_ofiles' array to let fd_getfile() acquire file references safely with concurrent file reference releases.
  OK mpi@
  [author: visa; date: 2018-07-02; files: 1; lines: -31/+63]
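  For readers unfamiliar with atomic reference counting, here is a small userland sketch using C11 `<stdatomic.h>`. It is not the kernel's implementation; `toy_fref()` and `toy_frele()` are hypothetical names, and the per-fdp mutex mentioned in this entry is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_file {
	atomic_uint	f_count;	/* reference count */
};

/* Take one more reference. */
static void
toy_fref(struct toy_file *fp)
{
	atomic_fetch_add(&fp->f_count, 1);
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool
toy_frele(struct toy_file *fp)
{
	/* atomic_fetch_sub returns the value from before the decrement. */
	return atomic_fetch_sub(&fp->f_count, 1) == 1;
}
```

  Exactly one caller sees the count go from 1 to 0, so exactly one caller is responsible for tearing the object down, without a mutex around the counter.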
* Assert that fdp is locked in fdalloc().
  OK mpi@
  [author: visa; date: 2018-07-02; files: 1; lines: -1/+3]
* Lock the file descriptor table when accessing the `fd_ofileflags' array. This prevents the array from being freed too early. In the function unp_internalize(), the locking also ensures the per-fdp flags stay coherent with the file instance.
  OK mpi@
  [author: visa; date: 2018-07-01; files: 1; lines: -1/+3]
* Raise file_pool's IPL to prevent deadlocks with the newly unlocked system calls.
  OK mpi@
  [author: visa; date: 2018-06-27; files: 1; lines: -2/+2]
* Remove a duplicate fd_used() call. The new file descriptor passed to dupfdopen() has already been registered with fd_used() in fdalloc(). The duplicate call distorted the number of open file descriptors returned by getdtablecount(2) if a file was opened via /dev/fd/.
  While there, assert that the file instance should already be in the file list.
  OK mpi@
  [author: visa; date: 2018-06-26; files: 1; lines: -2/+3]
* Implement DRI3/prime support. This allows graphics buffers to be passed between processes using file descriptors. This provides an alternative to exporting them with guessable 32-bit IDs. This implementation does not (yet) allow sharing of graphics buffers between GPUs.
  ok mpi@, visa@
  [author: kettenis; date: 2018-06-25; files: 1; lines: -2/+6]
* Introduce fnew(), a function to initialize a `struct file'. Committing now to help refactoring of DRI3 and diskmap rewrite.
  ok visa@, kettenis@ as part of a larger diff.
  [author: mpi; date: 2018-06-25; files: 1; lines: -12/+31]
* Use atomic operations for updating `numfiles'. This makes the file count tracking work without locks.
  OK kettenis@, deraadt@
  [author: visa; date: 2018-06-24; files: 1; lines: -5/+6]
* Unlock sendmsg(2) and sendto(2).
  These syscalls can now be executed w/o the KERNEL_LOCK() depending on the kind of socket.
  The current solution uses a single global mutex to serialize access to, and reference count, 'struct file'.
  ok visa@, kettenis@
  [author: mpi; date: 2018-06-20; files: 1; lines: -18/+42]
* Put file descriptors on shared data structures when they are completely set up, take 3.
  LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]' or the global linked list. This allows us to simplify a lot of code grabbing new references to fds.
  All of this is now possible because dup2(2) refuses to clone LARVAL fds.
  Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2).
  With inputs from Mathieu Masson, visa@, guenther@ and art@
  Previous version ok bluhm@, ok visa@, sthen@
  [author: mpi; date: 2018-06-18; files: 1; lines: -46/+50]
* Move kqueue related fields from struct filedesc to struct kqueue. Solves a panic in knote_processexit() that can occur when the filedesc belonging to the process already has been freed.
  Similar work has been done in:
  - FreeBSD (commit bc1805c6e871c178d0b6516c3baa774ffd77224a)
  - DragonFlyBSD (commit ccafe911a3aa55fd5262850ecfc5765cd31a56a2)
  Thanks to tb@ for testing.
  ok kettenis@ mpi@ visa@
  [author: anton; date: 2018-06-17; files: 1; lines: -8/+3]
* Revert introduction of fdinsert(), a sanity check triggers when closing a LARVAL file.
  Found the hard way by sthen@.
  [author: mpi; date: 2018-06-05; files: 1; lines: -48/+38]
* Add an assert that makes explicit that finishdup() should receive an inserted fp.
  OK mpi@
  [author: visa; date: 2018-06-02; files: 1; lines: -1/+3]
* Put file descriptors on shared data structures when they are completely set up.
  LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]'. This allows us to simplify a lot of code grabbing new references to fds.
  All of this is now possible because dup2(2) refuses to clone LARVAL fds.
  Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2).
  With inputs from Mathieu -, visa@, guenther@ and art@
  ok visa@, bluhm@
  [author: mpi; date: 2018-06-02; files: 1; lines: -38/+46]
* Use IPL_MPFLOOR for mutexes that can be taken w/ and w/o the KERNEL_LOCK().
  From Mathieu <naabed at poolp.org>, ok visa@, tb@
  [author: mpi; date: 2018-05-31; files: 1; lines: -2/+2]
* `f_mtx' must block interrupts as long as it is taken w/ and w/o the KERNEL_LOCK().
  Otherwise a deadlock can occur, as found the hard way by tb@.
  ok tb@, kettenis@, visa@
  [author: mpi; date: 2018-05-29; files: 1; lines: -2/+6]
* Return EBUSY if dup2(2) is called for a LARVAL file.
  This prevents a panic due to a double free if a program exits after having called accept(2) and dup2(2) on the same fd but without the corresponding connect(2).
  It will also allow us to simplify file descriptor locking. The error code has been chosen to match Linux's behavior.
  Pointed out by Mathieu on tech@ after a discussion with guenther@.
  ok visa@
  [author: mpi; date: 2018-05-28; files: 1; lines: -6/+7]
* Change fd_iterfile() to not return immature fps instead of skipping them later.
  ok bluhm@, visa@
  [author: mpi; date: 2018-05-08; files: 1; lines: -2/+2]
* Protect per-file counters and document which lock is used to protect the other fields.
  Once we no longer have any [k] (kernel lock) protections, we'll be able to unlock almost all network related syscalls.
  Inputs from and ok bluhm@, visa@
  [author: mpi; date: 2018-05-08; files: 1; lines: -1/+2]
* Remove proc from the parameters of vn_lock(). The parameter is unnecessary because curproc always does the locking.
  OK mpi@
  [author: visa; date: 2018-05-02; files: 1; lines: -2/+2]
* Clean up the parameters of VOP_LOCK() and VOP_UNLOCK(). It is always curproc that does the locking or unlocking, so the proc parameter is pointless and can be dropped.
  OK mpi@, deraadt@
  [author: visa; date: 2018-04-28; files: 1; lines: -2/+2]
* Move FREF() inside fd_getfile().
  ok visa@
  [author: mpi; date: 2018-04-27; files: 1; lines: -12/+12]
* Rewrite fdcopy() to avoid memcpy()s.
  With and ok visa@
  [author: mpi; date: 2018-04-26; files: 1; lines: -52/+38]