path: root/sys/nfs
Commit message | Author | Age | Files | Lines
* When killing a process, the signal is handled by any thread that does not block the signal. If all threads block the signal, we delivered it to the main thread. This does not conform to POSIX. If any thread unblocks the signal, it should be delivered immediately to this thread. Mark such signals pending at the process instead of a single thread. Then any thread can handle it later. OK kettenis@ guenther@
  Author: bluhm | Age: 2019-05-13 | Files: 1 | Lines: -4/+3
* The kernel interpreted bogus lengths in RPC calls during NFS boot. A malicious rpc.bootparamd could corrupt memory, but the kernel has to trust the local network anyway in a diskless environment. Now in case of an RPC error, the kernel will stop booting with a specific panic. OK claudio@ beck@
  Author: bluhm | Age: 2019-01-22 | Files: 1 | Lines: -8/+31
* Introduce a dedicated entry point data structure for file locks. This new data structure allows for better tracking of pending lock operations which is essential in order to prevent a use-after-free once the underlying vnode is gone. Inspired by the lockf implementation in FreeBSD. ok visa@ Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
  Author: anton | Age: 2019-01-21 | Files: 1 | Lines: -2/+2
* Move boottime into the timehands.

  To protect the timehands we first need to protect the basis for all UTC time in the kernel: the boottime. Because the boottime can be changed at any time it needs to be versioned along with the other members of the timehands to enable safe lockless reads when using it for anything. So the global boottime timespec goes away and the static boottimebin becomes a member of the timehands. Instead of reading the global boottime you use one of two interfaces: binboottime(9) or microboottime(9). nanoboottime(9) can trivially be added later, though there are no consumers for it at the moment.

  This introduces one small change in behavior. We used to advance the reported boottime just before launching kernel threads from main(). This makes it look to userland like we "booted" moments before those threads were launched. Because there is no longer a boottime global we can no longer trivially do this from main(), so the boottime we report to userspace via e.g. kern.boottime will now reflect whatever the time was when we bootstrapped the timehands via inittodr(9). This is usually no more than a minute before the kernel threads are launched from main(). The prior behavior can be restored by adding a new interface to the timecounter layer in a future commit.

  Based on FreeBSD r303387. Discussed with mpi@ and visa@. ok visa@
  Author: cheloha | Age: 2019-01-19 | Files: 1 | Lines: -3/+7
* Check for negative length in NFS strings. This affects both the client and server. OK beck@
  Author: bluhm | Age: 2019-01-18 | Files: 1 | Lines: -2/+2
* Check for negative length integers in NFS server. A malicious client could crash the server. OK tedu@
  Author: bluhm | Age: 2019-01-18 | Files: 1 | Lines: -10/+11
* Check for negative length integers in NFS client. A malicious server could confuse the client file system code. OK beck@
  Author: bluhm | Age: 2019-01-18 | Files: 1 | Lines: -3/+4
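
  Note: the three "negative length" commits above share one pattern: a length decoded from the wire is a signed integer and must be rejected before it is used as a size. A minimal sketch of such a check follows. `toy_check_len()` and its parameters are hypothetical and are not taken from the NFS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Validate a length field decoded from an untrusted peer.
 * wire_len:  the signed 32-bit value as decoded from the message
 * remaining: bytes still available in the received message
 * max_len:   protocol limit for this field (hypothetical parameter)
 * Returns 0 and stores the usable size in *out, or -1 if the length
 * is negative, above the limit, or larger than the remaining data.
 */
static int
toy_check_len(int32_t wire_len, size_t remaining, size_t max_len, size_t *out)
{
	assert(out != NULL);

	if (wire_len < 0)
		return -1;	/* would become a huge size_t if cast blindly */
	if ((size_t)wire_len > max_len || (size_t)wire_len > remaining)
		return -1;

	*out = (size_t)wire_len;
	return 0;
}
```

  The sign test has to come before any conversion to an unsigned type, since a negative value converted to `size_t` would become a very large size.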
* Switch MH_ALIGN to m_align which is the same. OK bluhm@
  Author: claudio | Age: 2018-11-30 | Files: 1 | Lines: -2/+2
* M_LEADINGSPACE() and M_TRAILINGSPACE() are just wrappers for m_leadingspace() and m_trailingspace(). Convert all callers to call the functions directly and remove the defines. OK krw@, mpi@
  Author: claudio | Age: 2018-11-09 | Files: 3 | Lines: -9/+9
* Instead of calculating the mbuf packet header length here and there, put the algorithm into a new function m_calchdrlen(). Also set an uninitialized m_len to 0 in NFS code. OK claudio@
  Author: bluhm | Age: 2018-09-10 | Files: 2 | Lines: -18/+5
* Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode for sockets is non-blocking. This allows us to G/C SS_NBIO. Having to keep the two flags in sync in a mp-safe way is complicated. This change introduces a behavior change in sosplice(): it can now always block. However this should not matter much due to the socket lock being taken beforehand. ok bluhm@, benno@, visa@
  Author: mpi | Age: 2018-07-30 | Files: 3 | Lines: -6/+6
* Nuke unused define 'nfsm_writereply()'. ok beck@ deraadt@ guenther@ mpi@
  Author: krw | Age: 2018-07-09 | Files: 1 | Lines: -11/+1
* Use more list macros for v_dirtyblkhd. OK mpi@
  Author: bluhm | Age: 2018-07-02 | Files: 3 | Lines: -15/+10
* Drop redundant "node == parent node" checks from VOP_RMDIR() implementations. Rely on the VFS layer to do the checking. OK mpi@, helg@
  Author: visa | Age: 2018-06-21 | Files: 1 | Lines: -8/+1
* Make the VFS layer responsible for preventing the deletion of mounted on directories. OK guenther@, mpi@
  Author: visa | Age: 2018-06-13 | Files: 1 | Lines: -1/+8
* Make callers of VOP_CREATE(9) and VOP_MKNOD(9) responsible for unlocking the directory vnode. OK mpi@, helg@
  Author: visa | Age: 2018-06-07 | Files: 2 | Lines: -10/+10
* Pass the socket to sounlock(), this prepares the terrain for per-socket locking. ok visa@, bluhm@
  Author: mpi | Age: 2018-06-06 | Files: 3 | Lines: -12/+12
* Drop unnecessary `p' parameter from vget(9). OK mpi@
  Author: visa | Age: 2018-05-27 | Files: 2 | Lines: -5/+4
* Implement proper locking for NFS nodes. Tested in bulks by many. ok visa@, beck@
  Author: mpi | Age: 2018-05-05 | Files: 3 | Lines: -15/+69
* After unmount nfs_timer() could access the freed memory of struct nfsmount. Delay the free(9) of the nfs mount point data until pending or sleeping timeouts have finished by running it on the softclock thread. OK visa@
  Author: bluhm | Age: 2018-05-04 | Files: 1 | Lines: -2/+15
* Remove proc from the parameters of vn_lock(). The parameter is unnecessary because curproc always does the locking. OK mpi@
  Author: visa | Age: 2018-05-02 | Files: 3 | Lines: -10/+9
* Clean up the parameters of VOP_LOCK() and VOP_UNLOCK(). It is always curproc that does the locking or unlocking, so the proc parameter is pointless and can be dropped. OK mpi@, deraadt@
  Author: visa | Age: 2018-04-28 | Files: 5 | Lines: -17/+16
* Fix use of unreferenced vnode by decrementing the vnode's reference count after unlocking. To improve consistency, use vput() instead of VOP_UNLOCK() + vrele(). OK guenther@, mpi@, tedu@
  Author: visa | Age: 2018-04-25 | Files: 1 | Lines: -3/+4
* Prepare vnops to be locked.

  - Use vput(9) instead of vrele(9) when a "locked" node is returned by nfs_nget().
  - Make sure VN_KNOTE() is always called with a valid reference.
  - Add a missing PDIRUNLOCK in nfs_lookup()

  These changes are mostly noops as long as nfs_lock()/unlock() do nothing.

  Tested by bluhm@, visa@ and myself. ok visa@
  Author: mpi | Age: 2018-04-17 | Files: 1 | Lines: -36/+48
* Change the representation of an NFS mount point by caching the root nodes.

  nfs_root() now returns a "locked" vnode, so vput(9) must be called to release it. Note that this has currently no effect as nfs_lock/unlock are still stubs. This will prevent some lock ordering problems with upcoming NFSnode locking.

  Tested by landry@, sthen@, visa@, naddy@ and myself. From NetBSD with some tweaks, ok visa@
  Author: mpi | Age: 2018-04-09 | Files: 3 | Lines: -45/+80
* Check for possible race after sleeping instead of using a rwlock to protect insertions in `nm_ntree'. This will prevent a future lock ordering problem with NFSnode's lock. ok tedu@, visa@
  Author: mpi | Age: 2018-03-28 | Files: 1 | Lines: -15/+7
* Remove almost unused `flags' argument of suser(). The account flag `ASU' will no longer be set but that makes suser() mpsafe since it no longer messes with a per-process field. No objection from millert@, ok tedu@, bluhm@
  Author: mpi | Age: 2018-02-19 | Files: 1 | Lines: -2/+2
* Synchronize filesystems to disk when suspending. Each mountpoint's vnodes are pushed to disk. Dangling vnodes (unlinked files still in use) and vnodes undergoing change by long-running syscalls are identified -- and such filesystems are marked dirty on-disk while we are suspended (in case power is lost, a fsck will be required). Filesystems without dangling or busy vnodes are marked clean, resulting in faster boots following "battery died" circumstances. Tested by numerous developers, thanks for the feedback.
  Author: deraadt | Age: 2018-02-10 | Files: 1 | Lines: -3/+3
* Use FREF() instead of rolling our own. ok deraadt@, bluhm@
  Author: mpi | Age: 2018-01-31 | Files: 1 | Lines: -2/+2
* Delete unnecessary <sys/file.h> includes. ok millert@ krw@
  Author: guenther | Age: 2017-12-30 | Files: 1 | Lines: -2/+1
* In uvm Chuck decided backing store would not be allocated proactively for blocks re-fetchable from the filesystem. However at reboot time, filesystems are unmounted, and since processes lack backing store they are killed. Since the scheduler is still running, in some cases init is killed... which drops us to ddb [noted by bluhm]. Solution is to convert filesystems to read-only [proposed by kettenis].

  The tale follows: sys_reboot() should pass proc * to MD boot() to vfs_shutdown() which completes current IO with vfs_busy VB_WRITE|VB_WAIT, then calls VFS_MOUNT() with MNT_UPDATE | MNT_RDONLY, soon teaching us that *fs_mount() calls a copyin() late... so store the sizes in vfsconflist[] and move the copyin() to sys_mount()... and notice nfs_mount copyin() is size-variant, so kill legacy struct nfs_args3. Next we learn ffs_mount()'s MNT_UPDATE code is sharp and rusty especially wrt softdep, so fix some bugs and add ~MNT_SOFTDEP to the downgrade. Some vnodes need a little more help, so tie them to &dead_vnops. ffs_mount calling DIOCCACHESYNC is causing a bit of grief still but this issue is separate and will be dealt with in time.

  couple hundred reboots by bluhm and myself, advice from guenther and others at the hut
  Author: deraadt | Age: 2017-12-11 | Files: 1 | Lines: -26/+15
* Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get(). In particular, this allows SIOCGIF* requests to run in parallel. lots of help & ok mpi, ok visa, sashan
  Author: tb | Age: 2017-11-14 | Files: 1 | Lines: -7/+1
* nfs_connect() returns EINVAL at the beginning if nm_sotype is invalid. But the compiler cannot know whether it has changed in the meantime, so in the else case a bunch of variables would not be initialized. Add a panic() there to change the compiler's assumptions, the code should not be reached anyway. found by clang -Wuninitialized; OK deraadt@
  Author: bluhm | Age: 2017-09-07 | Files: 1 | Lines: -1/+3
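
  Note: the shape of that fix can be shown in userland. This is a sketch under stated assumptions: `struct toy_mount`, the `TOY_SOCK_*` values, `toy_pick_timeout()` and the timeout numbers are all hypothetical, and `abort()` stands in for the kernel's panic(). The point is that the final `else` never returns normally, so no path reaches the use of `timeout` without an assignment.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical socket types and mount structure, for illustration only. */
enum { TOY_SOCK_STREAM = 1, TOY_SOCK_DGRAM = 2 };

struct toy_mount {
	int sotype;
};

static int
toy_pick_timeout(const struct toy_mount *nmp, int *timeout_out)
{
	int timeout;

	assert(nmp != NULL && timeout_out != NULL);

	/* Early validation, as in "returns EINVAL at the beginning". */
	if (nmp->sotype != TOY_SOCK_STREAM && nmp->sotype != TOY_SOCK_DGRAM)
		return EINVAL;

	if (nmp->sotype == TOY_SOCK_DGRAM)
		timeout = 5;
	else if (nmp->sotype == TOY_SOCK_STREAM)
		timeout = 60;
	else
		abort();	/* not reached; keeps `timeout' initialized on every path */

	*timeout_out = timeout;
	return 0;
}
```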
* Preallocate option mbufs in order to reduce solock()/sounlock() dances. Finally protect the last `so_rcv' and `so_snd' accesses with the socket lock. ok visa@, bluhm@
  Author: mpi | Age: 2017-09-05 | Files: 1 | Lines: -43/+35
* Change sosetopt() to no longer free the mbuf it receives and change all the callers to call m_freem(9). Support from deraadt@ and tedu@, ok visa@, bluhm@
  Author: mpi | Age: 2017-09-01 | Files: 3 | Lines: -3/+13
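
  Note: the ownership rule described above (the callee only borrows the buffer, the caller frees it on every path) can be sketched as follows. All names here (`struct toy_opt`, `toy_setopt()`, `toy_set_int_option()`) are hypothetical, and plain malloc()/free() stand in for mbuf allocation and m_freem(9).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option buffer; a stand-in for the mbuf carrying the option. */
struct toy_opt {
	size_t len;
	unsigned char data[sizeof(int)];
};

/* Callee: reads the option but never frees it, on success or on error. */
static int
toy_setopt(int *stored, const struct toy_opt *opt)
{
	if (opt == NULL || opt->len != sizeof(int))
		return EINVAL;
	memcpy(stored, opt->data, sizeof(int));
	return 0;
}

/* Caller: allocates the buffer, lends it out, and frees it exactly once. */
static int
toy_set_int_option(int *stored, int value)
{
	struct toy_opt *opt;
	int error;

	assert(stored != NULL);
	opt = malloc(sizeof(*opt));
	if (opt == NULL)
		return ENOMEM;
	opt->len = sizeof(value);
	memcpy(opt->data, &value, sizeof(value));

	error = toy_setopt(stored, opt);
	free(opt);	/* single release point, regardless of `error' */
	return error;
}
```

  With a single release point in the caller, neither the success path nor the error path can free the buffer twice or leak it.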
* Remove old deactivated pledge path code. A replacement mechanism is being brewed. ok beck
  Author: deraadt | Age: 2017-08-29 | Files: 1 | Lines: -7/+1
* knf to fix tab/space mismatches that make it hard to tell what's inside an if vs the condition itself. weird contortions because of course the lines want to be like 900 columns wide, but i think it's better now.
  Author: tedu | Age: 2017-08-14 | Files: 1 | Lines: -150/+153
* drop seriously lacking support for SEQPACKET. also move checks up sooner to prevent a (root) panic. ok bluhm
  Author: tedu | Age: 2017-08-14 | Files: 1 | Lines: -12/+8
* Remove NET_LOCK()'s argument. Tested by Hrvoje Popovski, ok bluhm@
  Author: mpi | Age: 2017-08-11 | Files: 2 | Lines: -12/+12
* Move the solock()/sounlock() dance outside of sobind(). ok phessler@, visa@, bluhm@
  Author: mpi | Age: 2017-08-10 | Files: 2 | Lines: -2/+6
* Move the socket lock "above" sosetopt(), sogetopt() and sosplice(). Protect the fields modified by sosetopt() and simplify the dance with the stars. ok bluhm@
  Author: mpi | Age: 2017-08-09 | Files: 3 | Lines: -10/+24
* Extend the scope of the socket lock to protect `so_state' in connect(2). As a side effect, soconnect() and soconnect2() now expect a locked socket, so update all the callers. ok bluhm@
  Author: mpi | Age: 2017-07-24 | Files: 1 | Lines: -3/+5
* If the second xdr_string_encode() fails in bp_getfile(), m_freem() m since this mbuf was allocated by the first call. Fixes possible memory leak. Found by Ilja Van Sprundel. OK bluhm@ deraadt@
  Author: claudio | Age: 2017-07-19 | Files: 1 | Lines: -2/+4
* Add missing solock()/sounlock() dances around sbreserve(). While here document an abuse of parent socket's lock. Problem reported by krw@, analysis and ok bluhm@
  Author: mpi | Age: 2017-06-27 | Files: 2 | Lines: -3/+7
* Assert that the corresponding socket is locked when manipulating socket buffers.

  This is one step towards unlocking TCP input path. Note that all the functions asserting for the socket lock are not necessarily MP-safe. All the fields of 'struct socket' aren't protected.

  Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to tell when a filter needs to lock the underlying data structures. Logic and name taken from NetBSD.

  Tested by Hrvoje Popovski. ok claudio@, bluhm@, mikeb@
  Author: mpi | Age: 2017-06-26 | Files: 1 | Lines: -2/+2
* When dealing with mbuf pointers passed down as function parameters, bugs could easily result in use-after-free or double free. Introduce m_freemp() which automatically resets the pointer before freeing it. So we have less dangling pointers in the kernel. OK krw@ mpi@ claudio@
  Author: bluhm | Age: 2017-06-19 | Files: 1 | Lines: -8/+5
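
  Note: the "reset the pointer before freeing" idea can be sketched in userland as below. This is an illustration only, not the kernel's m_freemp(): `struct toy_buf`, `toy_free_chain()` and `toy_freep()` are hypothetical names, and a singly linked chain freed with free() stands in for an mbuf chain.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical chained buffer, standing in for an mbuf chain. */
struct toy_buf {
	struct toy_buf *next;
};

/* Free every buffer in the chain; a NULL chain is a no-op. */
static void
toy_free_chain(struct toy_buf *b)
{
	while (b != NULL) {
		struct toy_buf *n = b->next;

		free(b);
		b = n;
	}
}

/*
 * Free the chain that *bp points to and clear the caller's pointer first,
 * so the caller is not left holding a dangling pointer and a second
 * call on the same variable is harmless.
 */
static void
toy_freep(struct toy_buf **bp)
{
	struct toy_buf *b;

	assert(bp != NULL);
	b = *bp;
	*bp = NULL;
	toy_free_chain(b);
}
```

  Taking a pointer to the caller's variable is what lets the helper clear it; a helper that receives only the buffer pointer by value cannot do that.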
* Remove useless splsoftnet().

  Outside of USB, no code is executed in a softnet interrupt context. So what's protecting NFS data structures is the KERNEL_LOCK(). But more importantly, since r1.114 of nfs_socket.c, the 'softnet' thread is no longer executing NFS code.

  ok visa@
  Author: mpi | Age: 2017-05-17 | Files: 3 | Lines: -30/+8
* Sync nfs_connect() w/ sys_connect(). ok bluhm@
  Author: mpi | Age: 2017-05-08 | Files: 1 | Lines: -7/+6
* Prevent a recursion in the socket layer.

  Always defer soreceive() to an nfsd(8) process instead of doing it in the 'softnet' thread. Avoiding this recursion ensures that we do not introduce a new sleeping point by releasing and grabbing the netlock.

  Tested by many, committing now in order to find possible performance regression.
  Author: mpi | Age: 2017-03-03 | Files: 1 | Lines: -17/+6
* Keep local definitions local. "good work" deraadt@, ok visa@
  Author: mpi | Age: 2017-02-22 | Files: 10 | Lines: -99/+102