path: root/sys/kern/vfs_subr.c
Commit message | Author | Age | Files | Lines
* Add a temporary workaround to make removal of giant files better | beck | 2019-06-09 | 1 | -1/+18
  mlarkin@ noticed we would freeze while removing enormous files because of the amount of work done to invalidate buffers on unlink. This adds a temporary workaround to ensure we give up the lock and yield while doing this. The longer term answer will be to move these buffers to another list and not do the work here. ok deraadt@
* Add a subsystem lock for vfs_lockf.c. This enables calling lf_advlock() | visa | 2019-04-19 | 1 | -2/+2
  and lf_purgelocks() without the kernel lock. OK anton@ mpi@
* Restrict which filesystems are available for swap. This rules out | visa | 2019-04-02 | 1 | -2/+2
  obvious misconfigurations that cannot work. OK mpi@ tedu@
* if a write fails, we mark the buffer invalid and throw it away. this can | tedu | 2019-02-17 | 1 | -1/+2
  lead to lost errors, where a later fsync will return success. to fix this, set a flag on the vnode indicating a past error has occurred, and return an error for future fsync calls. ok bluhm deraadt visa
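
The sticky-error idea in the entry above can be sketched in userland C. This is only an illustration under stated assumptions: `struct toy_vnode`, `toy_write_done()` and `toy_fsync()` are hypothetical names, not the kernel's, and the choice of EIO as the reported error is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode; the real struct vnode differs. */
struct toy_vnode {
	bool write_error;	/* set once any buffer write has failed */
};

/* Called when a buffer write completes; err is 0 on success. */
static void
toy_write_done(struct toy_vnode *vp, int err)
{
	assert(vp != NULL);
	if (err != 0)
		vp->write_error = true;	/* remember it; the buffer is discarded */
}

/* Reports EIO (an assumed errno) if any earlier write failed, else 0. */
static int
toy_fsync(struct toy_vnode *vp)
{
	assert(vp != NULL);
	return vp->write_error ? EIO : 0;
}
```

In this sketch a later successful write does not clear the flag, so every fsync after the failure keeps reporting it.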
* Introduce a dedicated entry point data structure for file locks. This new data | anton | 2019-01-21 | 1 | -1/+3
  structure allows for better tracking of pending lock operations which is essential in order to prevent a use-after-free once the underlying vnode is gone. Inspired by the lockf implementation in FreeBSD. ok visa@
  Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
* Rectify some issues with the noperm mount flag; the root vnode was not | natano | 2018-12-23 | 1 | -1/+10
  protected properly and files without any x bit set were accidentally considered executable when checked with access(2). Issues found and reported by deraadt, halex, reyk, tb
  ok deraadt
* free(9) sizes for netcred. | mpi | 2018-12-07 | 1 | -4/+6
  ok visa@
* Use atomic operations to update vfc_refcount. Change the field's type | visa | 2018-09-29 | 1 | -4/+5
  to unsigned int. OK deraadt@
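
A minimal sketch of an atomically updated unsigned reference count, written with C11 `<stdatomic.h>` rather than the kernel's own atomic primitives (which this log does not show). `struct toy_vfsconf`, `toy_ref()` and `toy_unref()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical miniature of a filesystem-type record. */
struct toy_vfsconf {
	atomic_uint refcount;	/* number of users of this filesystem type */
};

/* Take a reference. */
static void
toy_ref(struct toy_vfsconf *vfc)
{
	atomic_fetch_add(&vfc->refcount, 1);
}

/* Drop a reference and return the count that remains. */
static unsigned int
toy_unref(struct toy_vfsconf *vfc)
{
	unsigned int old = atomic_fetch_sub(&vfc->refcount, 1);

	assert(old > 0);	/* dropping below zero would wrap the unsigned count */
	return old - 1;
}
```

Because each update is a single atomic read-modify-write, concurrent callers cannot lose increments the way a plain `refcount++` could.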
* Move the allocating and freeing of mount points into | visa | 2018-09-26 | 1 | -15/+39
  dedicated functions. OK deraadt@ mpi@
* Harmonize spacing after ellipses in displayed messages. | fcambus | 2018-09-22 | 1 | -4/+4
  We were using spacing after ellipses in an inconsistent way in the installer. Standardize on using "... " everywhere and take into account the cursor position while we are waiting for the task to complete: the cursor is now always positioned after the last dot, and the space is added when displaying completion confirmation.
  While there, also take cursor position into account in vfs_shutdown(), and remove the extra leading space before ticks in dhclient. OK deraadt@
* Simplify VFS initialization. | visa | 2018-09-17 | 1 | -68/+1
  Because loadable kernel modules are no longer, there is no need to register or unregister filesystem implementations at runtime. Remove vfs_register() and vfs_unregister(), and make vfsinit() call vfs_init routines directly. Replace the linked list of vfsconf structs with the vfsconflist[] array. OK mpi@ bluhm@
* Move vfsconf lookup code into dedicated functions. | visa | 2018-09-16 | 1 | -12/+4
  OK bluhm@
* Unveiling unveil(2). | beck | 2018-07-13 | 1 | -1/+4
  This brings unveil into the tree, disabled by default - Currently this will return EPERM on all attempts to use it until we are fully certain it is ready for people to start using, but this now allows for others to do more tweaking and experimentation. Still needs to send the unveil's across forks and execs before fully enabling. Many thanks to robert@ and deraadt@ for extensive testing. ok deraadt@
* Use more list macros for v_dirtyblkhd. | bluhm | 2018-07-02 | 1 | -3/+3
  OK mpi@
* The function dounmount() traverses the mnt_list in forward direction | bluhm | 2018-06-06 | 1 | -3/+4
  to call vfs_busy() for all nested mount points. vfs_stall() called vfs_busy() in reverse order for all mount points. Change the direction of the latter to resolve the lock order conflict. OK visa@
* Add VB_DUPOK to suppress witness(4) warning of concurrent mount locks. | guenther | 2018-06-04 | 1 | -2/+7
  Use that in three places:
  - vfs_stall()
  - sys_mount()
  - dounmount()'s MNT_FORCE-does-recursive-unmounts case
  ok deraadt@ visa@
* Drop unnecessary `p' parameter from vget(9). | visa | 2018-05-27 | 1 | -3/+3
  OK mpi@
* When looping over mount points, the FOREACH_SAFE macro is not safe. | bluhm | 2018-05-08 | 1 | -3/+7
  The loop variable mp is protected by vfs_busy() so that it cannot be unmounted. But the next mount point nmp could be unmounted while VFS_SYNC() sleeps. As the loop in vfs_stall() does not destroy the mount point, TAILQ_FOREACH_REVERSE without _SAFE is the correct macro to use. OK deraadt@ visa@
* Move the vfs stall "barrier" logic to a function. FREF() will soon | mpi | 2018-05-08 | 1 | -1/+8
  change and this has nothing to do with it. ok visa@, bluhm@
* Print the vp pointer in the vinvalbuf() panic strings. | bluhm | 2018-05-07 | 1 | -4/+4
  OK mpi@
* Remove proc from the parameters of vn_lock(). The parameter is | visa | 2018-05-02 | 1 | -3/+3
  unnecessary because curproc always does the locking. OK mpi@
* Clean up the parameters of VOP_LOCK() and VOP_UNLOCK(). It is always | visa | 2018-04-28 | 1 | -5/+5
  curproc that does the locking or unlocking, so the proc parameter is pointless and can be dropped. OK mpi@, deraadt@
* Remounting file systems read-only does not work reliably. There | bluhm | 2018-03-07 | 1 | -40/+27
  are corner cases where ffs may leak blocks. So better revert and unmount all file systems at reboot. The "init died" panic will be fixed in a different way. OK deraadt@
* Synchronize filesystems to disk when suspending. Each mountpoint's vnodes | deraadt | 2018-02-10 | 1 | -8/+49
  are pushed to disk. Dangling vnodes (unlinked files still in use) and vnodes undergoing change by long-running syscalls are identified -- and such filesystems are marked dirty on-disk while we are suspended (in case power is lost, a fsck will be required). Filesystems without dangling or busy vnodes are marked clean, resulting in faster boots following "battery died" circumstances. Tested by numerous developers, thanks for the feedback.
* Don't bother using DETACH_FORCE for the softraid luns at reboot | deraadt | 2017-12-14 | 1 | -3/+3
  time; the aggressive mountpoint destruction seems to hit insane use-after-frees when we are already far on the way down.
* Give vflush_vnode() a hint about vnodes we don't need to account as "busy". | deraadt | 2017-12-14 | 1 | -5/+9
  Change mountpoint to RDONLY a little later. Seems to improve the rw->ro transition a bit.
* Format the vnode lists of ddb show mount properly in columns. | bluhm | 2017-12-11 | 1 | -14/+20
  OK krw@
* In uvm Chuck decided backing store would not be allocated proactively | deraadt | 2017-12-11 | 1 | -38/+52
  for blocks re-fetchable from the filesystem. However at reboot time, filesystems are unmounted, and since processes lack backing store they are killed. Since the scheduler is still running, in some cases init is killed... which drops us to ddb [noted by bluhm]. Solution is to convert filesystems to read-only [proposed by kettenis].
  The tale follows: sys_reboot() should pass proc * to MD boot() to vfs_shutdown() which completes current IO with vfs_busy VB_WRITE|VB_WAIT, then calls VFS_MOUNT() with MNT_UPDATE | MNT_RDONLY, soon teaching us that *fs_mount() calls a copyin() late... so store the sizes in vfsconflist[] and move the copyin() to sys_mount()... and notice nfs_mount copyin() is size-variant, so kill legacy struct nfs_args3. Next we learn ffs_mount()'s MNT_UPDATE code is sharp and rusty especially wrt softdep, so fix some bugs and add ~MNT_SOFTDEP to the downgrade. Some vnodes need a little more help, so tie them to &dead_vnops. ffs_mount calling DIOCCACHESYNC is causing a bit of grief still but this issue is separate and will be dealt with in time.
  couple hundred reboots by bluhm and myself, advice from guenther and others at the hut
* Use _kernel_lock_held() instead of __mp_lock_held(&kernel_lock). | mpi | 2017-12-04 | 1 | -2/+2
  ok visa@
* Give back some space to the ramdisk by compiling net/radix.c only | florian | 2017-07-31 | 1 | -2/+14
  if we compile pf, ipsec, pipex or nfsserver. Suggested by mpi some time ago. Tweak & OK bluhm
  deraadt assumes it's fair
* Tweak lock inits to make the system runnable with witness(4) | visa | 2017-04-20 | 1 | -2/+2
  on amd64 and i386.
* struct vfsconf is tightly packed, but let's M_ZERO it in case that ever | deraadt | 2017-04-04 | 1 | -2/+2
  changes to avoid exposing userland memory.
* When traversing the mount list, the current mount point is locked | bluhm | 2017-01-15 | 1 | -4/+5
  with vfs_busy(). If the FOREACH_SAFE macro is used, the next pointer is not locked and could be freed by another process. Unless necessary, do not use _SAFE as it is unsafe. In vfs_unmountall() the current pointer is actually freed. Add a comment that this race has to be fixed later. OK krw@
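
The hazard described in the entry above can be sketched without the queue(3) macros. The following is a simplified, single-threaded illustration under stated assumptions: `struct toy_mount` and the `toy_*` functions are hypothetical, the list is a hand-written singly linked list, and a "concurrent unmount" is simulated by unlinking the neighbour inside the loop body and marking it removed instead of freeing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mount list element. */
struct toy_mount {
	struct toy_mount *next;
	int removed;	/* set when unlinked; real code would free it */
};

/* Unlink the element after mp, standing in for another process unmounting it. */
static void
toy_unmount_next(struct toy_mount *mp)
{
	struct toy_mount *gone = mp->next;

	if (gone != NULL) {
		mp->next = gone->next;
		gone->removed = 1;
	}
}

/* _SAFE-style walk: returns how many already-removed elements were visited. */
static int
toy_walk_saved_next(struct toy_mount *head)
{
	struct toy_mount *mp, *nmp;
	int stale = 0;

	assert(head != NULL);
	for (mp = head; mp != NULL; mp = nmp) {
		nmp = mp->next;			/* next pointer saved before the body */
		stale += mp->removed;
		if (mp == head)
			toy_unmount_next(mp);	/* neighbour vanishes while we "sleep" */
	}
	return stale;
}

/* Plain walk: re-reads mp->next after the body has run. */
static int
toy_walk_plain(struct toy_mount *head)
{
	struct toy_mount *mp;
	int stale = 0;

	assert(head != NULL);
	for (mp = head; mp != NULL; mp = mp->next) {
		stale += mp->removed;
		if (mp == head)
			toy_unmount_next(mp);
	}
	return stale;
}
```

On a three-element list the saved-next walk steps onto the element that was just unlinked, while the plain walk skips it. In the kernel that stale element could already have been freed, which is the race the commit message points at.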
* Replace manual for() loops with FOREACH() macro. | bluhm | 2017-01-10 | 1 | -7/+4
  OK millert@
* Remove the unused olddp parameter from function dounmount(). | bluhm | 2017-01-10 | 1 | -2/+2
  OK mpi@ millert@
* Cast enum to u_int when doing a bounds check to avoid a clang warning that | kettenis | 2016-09-28 | 1 | -3/+4
  the comparison is always true. ok jca@, tedu@
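
A rough sketch of the kind of bounds check the entry above describes. The exact expression changed by the commit is not shown in this log, so `enum toy_vtype`, `toy_typename[]` and `toy_vtype_name()` are hypothetical, and the code only illustrates casting an enum to an unsigned type before comparing it against a table size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vnode-type enum and matching name table. */
enum toy_vtype { TOY_VNON, TOY_VREG, TOY_VDIR, TOY_VBAD };

static const char *const toy_typename[] = { "VNON", "VREG", "VDIR", "VBAD" };

#define TOY_NITEMS(a)	(sizeof(a) / sizeof((a)[0]))

/* Returns the name for type, or NULL when type is outside the table. */
static const char *
toy_vtype_name(enum toy_vtype type)
{
	/*
	 * Comparing as unsigned int makes a single test reject both values
	 * that are too large and values that would be negative as an int.
	 */
	if ((unsigned int)type >= TOY_NITEMS(toy_typename))
		return NULL;
	assert(toy_typename[type] != NULL);
	return toy_typename[type];
}
```

The explicit cast makes it clear that the comparison is on the raw integer value rather than on the set of declared enumerators.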
* move the namecache_rb_tree from RB macros to RBT functions. | dlg | 2016-09-16 | 1 | -2/+2
  i had to shuffle the includes a bit. all the knowledge of the RB tree is now inside vfs_cache.c, and all accesses are via cache_* functions.
* move buf_rb_bufs from RB macros to RBT functions | dlg | 2016-09-16 | 1 | -6/+6
  i had to shuffle the order of some header bits cos RBT_PROTOTYPE needs to see what RBT_HEAD produces.
* all pools have their ipl set via pool_setipl, so fold it into pool_init. | dlg | 2016-09-15 | 1 | -7/+5
  the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
  most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
  the manpage and subr_pool.c bits i did myself.
  ok tedu@ jmatthew@

      @ipl@
      expression pp;
      expression ipl;
      expression s, a, o, f, m, p;
      @@
      -pool_init(pp, s, a, o, f, m, p);
      -pool_setipl(pp, ipl);
      +pool_init(pp, s, a, ipl, f, m, p);
* pool_setipl | dlg | 2016-08-25 | 1 | -1/+3
  ok kettenis@
* Prevent NULL-pointer call for filesystems that don't provide vfs_sysctl | kettenis | 2016-07-22 | 1 | -2/+2
  in their vfsops. Issue reported by Tim Newsham.
  ok claudio@, natano@
* Remove the lockmgr() API. It is only used by filesystems, where it is a | natano | 2016-06-19 | 1 | -2/+2
  trivial change to use rrw locks instead. All it needs is LK_* defines for the RW_* flags.
  tested by naddy and sthen on package building infrastructure
  input and ok jmc mpi tedu
* The doforce variable isn't modified anywhere. Also, the only filesystem | natano | 2016-05-26 | 1 | -2/+1
  left using it is fuse. It has been removed from all other filesystems.
  ok millert deraadt
* copy_statfs_info() is not only used by ufs, but by other filesystems too, | natano | 2016-04-26 | 1 | -3/+3
  so make sure that all members of mp->mnt_stat.mount_info are copied.
  ok stefan
* fix off by one in vfs_vnode_print - found by miod | beck | 2016-04-26 | 1 | -3/+3
  ok deraadt@, krw@
* Share clone bitmap between aliased vnodes. This prevents duplicate clone | natano | 2016-04-07 | 1 | -8/+11
  instance numbers being handed out for the same minor device.
  ok mikeb
* Increase size of the clone bitmap (revised diff after revert). I have | natano | 2016-04-05 | 1 | -2/+13
  tested this with fuse _and_ drm on amd64 and macppc. Also tested with cloning bpf (not in the tree) on macppc.
  ok mikeb
  "looks correct to me" millert
  The original commit message is as follows:
  Increase size of the clone bitmap. A limit of only 64 device clones turned out to be too low for the upcoming work on cloning bpf. The new limit is 1024 device clones. As part of the size increase, the bitmap has been changed to be allocated separately to avoid bloating all device nodes, as suggested by guenther, millert and deraadt.
  ok millert mikeb
* Revert the clone bitmap enlargement change | mikeb | 2016-04-01 | 1 | -13/+2
* Increase size of the clone bitmap. A limit of only 64 device clones | natano | 2016-03-31 | 1 | -2/+13
  turned out to be too low for the upcoming work on cloning bpf. The new limit is 1024 device clones. As part of the size increase, the bitmap has been changed to be allocated separately to avoid bloating all device nodes, as suggested by guenther, millert and deraadt.
  ok millert mikeb
* Remove the unused flags argument from VOP_UNLOCK(). | natano | 2016-03-19 | 1 | -4/+4
  torture tested on amd64, i386 and macppc
  ok beck mpi stefan
  "the change looks right" deraadt