aboutsummaryrefslogtreecommitdiffstats
path: root/fs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2012-09-26export fget_lightAl Viro1-0/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26ext4: close struct file leak on EXT4_IOC_MOVE_EXTAl Viro1-1/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26switch fchmod(2) to fget_light()Al Viro1-7/+5
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26switch fallocate(2) to fget_light()Al Viro1-3/+3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26switch ftruncate(2) to fget_lightAl Viro1-5/+5
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26namei.c: fix BS commentAl Viro1-1/+1
get_write_access() is needed for nfsd, not binfmt_aout (the latter has no business doing anything of that kind, of course) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26don't leak O_CLOEXEC into ->f_flagsAl Viro2-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26procfs: Convert /proc/pid/fdinfo/ handling routines to seq-file v2Cyrill Gorcunov1-48/+64
This patch converts /proc/pid/fdinfo/ handling routines to seq-file which is needed to extend seq operations and plug in auxiliary fdinfo provides from subsystems like eventfd/eventpoll/fsnotify. Note the proc_fd_link no longer call for proc_fd_info, simply because the guts of proc_fd_info() got merged into ->show() of that seq_file Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26procfs: Move /proc/pid/fd[info] handling code to fd.[ch]Cyrill Gorcunov5-387/+416
This patch prepares the ground for further extension of /proc/pid/fd[info] handling code by moving fdinfo handling code into fs/proc/fd.c. I think such move makes both fs/proc/base.c and fs/proc/fd.c easier to read. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> CC: Al Viro <viro@ZenIV.linux.org.uk> CC: Alexey Dobriyan <adobriyan@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: James Bottomley <jbottomley@parallels.com> CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> CC: Alexey Dobriyan <adobriyan@gmail.com> CC: Matthew Helsley <matt.helsley@gmail.com> CC: "J. Bruce Fields" <bfields@fieldses.org> CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26new helper: daemonize_descriptors()Al Viro1-0/+6
descriptor-related parts of daemonize, done right. As the result we simplify the locking rules for ->files - we hold task_lock in *all* cases when we modify ->files. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26do_coredump(): make sure that descriptor table isn't sharedAl Viro1-0/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26new helper: iterate_fd()Al Viro1-0/+21
iterates through the opened files in given descriptor table, calling a supplied function; we stop once non-zero is returned. Callback gets struct file *, descriptor number and const void * argument passed to iterator. It is called with files->file_lock held, so it is not allowed to block. tty_io, netprio_cgroup and selinux flush_unauthorized_files() converted to its use. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26make expand_files() and alloc_fd() staticAl Viro1-2/+2
no callers outside of fs/file.c left Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26take __{set,clear}_{open_fd,close_on_exec}() into fs/file.cAl Viro1-0/+20
nobody uses those outside anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26new helper: replace_fd()Al Viro2-39/+63
analog of dup2(), except that it takes struct file * as source. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26take purely descriptor-related stuff from fcntl.c to file.cAl Viro2-128/+135
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26take close-on-exec logics to fs/file.c, clean it up a bitAl Viro2-35/+43
... and add cond_resched() there, while we are at it. We can get large latencies as is... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26take descriptor-related part of close() to file.cAl Viro2-21/+27
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26take fget() and friends to fs/file.cAl Viro2-106/+106
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26expose a low-level variant of fd_install() for binderAl Viro1-2/+14
Similar situation to that of __alloc_fd(); do not use unless you really have to. You should not touch any descriptor table other than your own; it's a sure sign of a really bad API design. As with __alloc_fd(), you *must* use a first-class reference to struct files_struct; something obtained by get_files_struct(some task) (let alone direct task->files) will not do. It must be either current->files, or obtained by get_files_struct(current) by the owner of that sucker and given to you. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26move put_unused_fd() and fd_install() to fs/file.cAl Viro2-44/+44
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26trim free_fdtable_rcu()Al Viro1-15/+2
embedded case isn't hit anymore Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26don't bother with call_rcu() in put_files_struct()Al Viro1-9/+5
At that point nobody can see us anyway; everything that looks at files_fdtable(files) is separated from the guts of put_files_struct(files) - either since files is current->files or because we fetched it under task_lock() and hadn't dropped that yet, or because we'd bumped files->count while holding task_lock()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26move files_struct-related bits from kernel/exit.c to fs/file.cAl Viro1-1/+99
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26new helper: __alloc_fd()Al Viro1-4/+8
Essentially, alloc_fd() in a files_struct we own a reference to. Most of the time wanting to use it is a sign of lousy API design (such as android/binder). It's *not* a general-purpose interface; better that than open-coding its guts, but again, playing with other process' descriptor table is a sign of bad design. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26take rlimit check to callers of expand_files()Al Viro2-7/+12
... except for one in android, where the check is different and already done in caller. No need to recalculate rlimit many times in alloc_fd() either. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26fanotify: sanitize failure exits in copy_event_to_user()Al Viro1-39/+20
* do copy_to_user() before prepare_for_access_response(); that kills the need in remove_access_response(). * don't do fd_install() until we are past the last possible failure exit. Don't use sys_close() on cleanup side - just put_unused_fd() and fput(). Less racy that way... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26pipe(2) - race-free error recoveryAl Viro1-9/+22
don't mess with sys_close() if copy_to_user() fails; just postpone fd_install() until we know it hasn't. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26autofs4: don't open-code fd_install()Al Viro1-16/+2
The only difference between autofs_dev_ioctl_fd_install() and fd_install() is __set_close_on_exec() done by the latter. Just use get_unused_fd_flags(O_CLOEXEC) to allocate the descriptor and be done with that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26make get_unused_fd_flags() a functionAl Viro1-3/+3
... and get_unused_fd() a macro around it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26Merge remote branch 'origin' into for-nextAl Viro81-672/+804
2012-09-22close the race in nlmsvc_free_block()Al Viro1-2/+1
we need to grab mutex before the reference counter reaches 0 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-22do_add_mount()/umount -l racesAl Viro1-2/+8
normally we deal with lock_mount()/umount races by checking that mountpoint to be is still in our namespace after lock_mount() has been done. However, do_add_mount() skips that check when called with MNT_SHRINKABLE in flags (i.e. from finish_automount()). The reason is that ->mnt_ns may be a temporary namespace created exactly to contain automounts a-la NFS4 referral handling. It's not the namespace of the caller, though, so check_mnt() would fail here. We still need to check that ->mnt_ns is non-NULL in that case, though. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-22Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds1-1/+1
Pull cifs fix from Steve French. * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix return value in cifsConvertToUTF16
2012-09-21Merge tag 'for-linus-v3.6-rc7' of git://oss.sgi.com/xfs/xfsLinus Torvalds3-18/+29
Pull xfs bugfixes from Ben Myers: - fix a regression related to xfs_sync_worker racing with unmount. - fix a race while discarding xfs buffers. * tag 'for-linus-v3.6-rc7' of git://oss.sgi.com/xfs/xfs: xfs: stop the sync worker before xfs_unmountfs xfs: fix race while discarding buffers [V4]
2012-09-21debugfs: fix u32_array race in format_array_allocLinus Torvalds1-34/+23
The format_array_alloc() function is fundamentally racy, in that it prints the array twice: once to figure out how much space to allocate for the buffer, and the second time to actually print out the data. If any of the array contents changes in between, the allocation size may be wrong, and the end result may be truncated in odd ways. Just don't do it. Allocate a maximum-sized array up-front, and just format the array contents once. The only user of the u32_array interfaces is the Xen spinlock statistics code, and it has 31 entries in the arrays, so the maximum size really isn't that big, and the end result is much simpler code without the bug. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-21debugfs: fix race in u32_array_read and allocate array at openDavid Rientjes1-22/+11
u32_array_open() is racy when multiple threads read from a file with a seek position of zero, i.e. when two or more simultaneous reads are occurring after the non-seekable files are created. It is possible that file->private_data is double-freed because the threads races between kfree(file->private-data); and file->private_data = NULL; The fix is to only do format_array_alloc() when the file is opened and free it when it is closed. Note that because the file has always been non-seekable, you can't open it and read it multiple times anyway, so the data has always been generated just once. The difference is that now it is generated at open time rather than at the time of the first read, and that avoids the race. Reported-by: Dave Jones <davej@redhat.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Raghavendra <raghavendra.kt@linux.vnet.ibm.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-18xfs: stop the sync worker before xfs_unmountfsBen Myers1-0/+1
Cancel work of the xfs_sync_worker before teardown of the log in xfs_unmountfs. This prevents occasional crashes on unmount like so: PID: 21602 TASK: ee9df060 CPU: 0 COMMAND: "kworker/0:3" #0 [c5377d28] crash_kexec at c0292c94 #1 [c5377d80] oops_end at c07090c2 #2 [c5377d98] no_context at c06f614e #3 [c5377dbc] __bad_area_nosemaphore at c06f6281 #4 [c5377df4] bad_area_nosemaphore at c06f629b #5 [c5377e00] do_page_fault at c070b0cb #6 [c5377e7c] error_code (via page_fault) at c070892c EAX: f300c6a8 EBX: f300c6a8 ECX: 000000c0 EDX: 000000c0 EBP: c5377ed0 DS: 007b ESI: 00000000 ES: 007b EDI: 00000001 GS: ffffad20 CS: 0060 EIP: c0481ad0 ERR: ffffffff EFLAGS: 00010246 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs] #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs] #10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs] #11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs] #12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs] #13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs] #14 [c5377f58] process_one_work at c024ee4c #15 [c5377f98] worker_thread at c024f43d #16 [c5377fbc] kthread at c025326b #17 [c5377fe8] kernel_thread_helper at c070e834 PID: 26653 TASK: e79143b0 CPU: 3 COMMAND: "umount" #0 [cde0fda0] __schedule at c0706595 #1 [cde0fe28] schedule at c0706b89 #2 [cde0fe30] schedule_timeout at c0705600 #3 [cde0fe94] __down_common at c0706098 #4 [cde0fec8] __down at c0706122 #5 [cde0fed0] down at c025936f #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs] #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs] #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs] #9 [cde0ff1c] generic_shutdown_super at c0333d7a #10 [cde0ff38] kill_block_super at c0333e0f #11 [cde0ff48] deactivate_locked_super at c0334218 #12 [cde0ff58] deactivate_super at c033495d #13 [cde0ff68] mntput_no_expire at c034bc13 #14 [cde0ff7c] sys_umount at c034cc69 #15 [cde0ffa0] sys_oldumount at c034ccd4 #16 [cde0ffb0] system_call at c0707e66 commit 11159a05 added this to xfs_log_unmount and needs to be cleaned up at a later date. Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com>
2012-09-18cifs: fix return value in cifsConvertToUTF16Jeff Layton1-1/+1
This function returns the wrong value, which causes the callers to get the length of the resulting pathname wrong when it contains non-ASCII characters. This seems to fix https://bugzilla.samba.org/show_bug.cgi?id=6767 Cc: <stable@vger.kernel.org> Reported-by: Baldvin Kovacs <baldvin.kovacs@gmail.com> Reported-and-Tested-by: Nicolas Lefebvre <nico.lefebvre@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-18vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill()Miklos Szeredi1-2/+2
IBM reported a soft lockup after applying the fix for the rename_lock deadlock. Commit c83ce989cb5f ("VFS: Fix the nfs sillyrename regression in kernel 2.6.38") was found to be the culprit. The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the dentry was killed. This flag can be set on non-killed dentries too, which results in infinite retries when trying to traverse the dentry tree. This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is only set in d_kill() and makes try_to_ascend() test only this flag. IBM reported successful test results with this patch. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-17fs/proc: fix potential unregister_sysctl_table hangFrancesco Ruggeri1-3/+2
The unregister_sysctl_table() function hangs if all references to its ctl_table_header structure are not dropped. This can happen sometimes because of a leak in proc_sys_lookup(): proc_sys_lookup() gets a reference to the table via lookup_entry(), but it does not release it when a subsequent call to sysctl_follow_link() fails. This patch fixes this leak by making sure the reference is always dropped on return. See also commit 076c3eed2c31 ("sysctl: Rewrite proc_sys_lookup introducing find_entry and lookup_entry") which reorganized this code in 3.4. Tested in Linux 3.4.4. Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-16Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds1-6/+2
Pull a btrfs revert from Chris Mason: "My for-linus branch has one revert in the new quota code. We're building up more fixes at etc for the next merge window, but I'm keeping them out unless they are bigger regressions or have a huge impact." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()"
2012-09-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixesLinus Torvalds3-44/+61
Pull GFS2 fixes from Steven Whitehouse: "Here are three GFS2 fixes for the current kernel tree. These are all related to the block reservation code which was added at the merge window. That code will be getting an update at the forthcoming merge window too. In the mean time though there are a few smaller issues which should be fixed. The first patch resolves an issue with write sizes of greater than 32 bits with the size hinting code. The second ensures that the allocation data structure is initialised when using xattrs and the third takes into account allocations which may have been made by other nodes which affect a reservation on the local node." * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: Take account of blockages when using reserved blocks GFS2: Fix missing allocation data for set/remove xattr GFS2: Make write size hinting code common
2012-09-14Merge tag 'ecryptfs-3.6-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfsLinus Torvalds3-2/+14
Pull ecryptfs fixes from Tyler Hicks: - Fixes a regression, introduced in 3.6-rc1, when a file is closed before its shared memory mapping is dirtied and unmapped. The lower file was being released when the eCryptfs file was closed and the dirtied pages could not be written out. - Adds a call to the lower filesystem's ->flush() from ecryptfs_flush(). - Fixes a regression, introduced in 2.6.39, when a file is renamed on top of another file. The target file's inode was not being evicted and the space taken by the file was not reclaimed until eCryptfs was unmounted. * tag 'ecryptfs-3.6-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: eCryptfs: Copy up attributes of the lower target inode after rename eCryptfs: Call lower ->flush() from ecryptfs_flush() eCryptfs: Write out all dirty pages just before releasing the lower file
2012-09-14Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()"Chris Mason1-6/+2
This reverts commit 5986802c2fcc754040bb7ed95f30bb16c4a843b7. Both paths are not error paths but regular cases where non-qgroup subvols are involved. Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-09-14vfs: make O_PATH file descriptors usable for 'fstat()'Linus Torvalds1-1/+1
We already use them for openat() and friends, but fstat() also wants to be able to use O_PATH file descriptors. This should make it more directly comparable to the O_SEARCH of Solaris. Note that you could already do the same thing with "fstatat()" and an empty path, but just doing "fstat()" directly is simpler and faster, so there is no reason not to just allow it directly. See also commit 332a2e1244bd, which did the same thing for fchdir, for the same reasons. Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org # O_PATH introduced in 3.0+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-14eCryptfs: Copy up attributes of the lower target inode after renameTyler Hicks1-0/+5
After calling into the lower filesystem to do a rename, the lower target inode's attributes were not copied up to the eCryptfs target inode. This resulted in the eCryptfs target inode staying around, rather than being evicted, because i_nlink was not updated for the eCryptfs inode. This also meant that eCryptfs didn't do the final iput() on the lower target inode so it stayed around, as well. This would result in a failure to free up space occupied by the target file in the rename() operation. Both target inodes would eventually be evicted when the eCryptfs filesystem was unmounted. This patch calls fsstack_copy_attr_all() after the lower filesystem does its ->rename() so that important inode attributes, such as i_nlink, are updated at the eCryptfs layer. ecryptfs_evict_inode() is now called and eCryptfs can drop its final reference on the lower inode. http://launchpad.net/bugs/561129 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Tested-by: Colin Ian King <colin.king@canonical.com> Cc: <stable@vger.kernel.org> [2.6.39+]
2012-09-14eCryptfs: Call lower ->flush() from ecryptfs_flush()Tyler Hicks1-2/+8
Since eCryptfs only calls fput() on the lower file in ecryptfs_release(), eCryptfs should call the lower filesystem's ->flush() from ecryptfs_flush(). If the lower filesystem implements ->flush(), then eCryptfs should try to flush out any dirty pages prior to calling the lower ->flush(). If the lower filesystem does not implement ->flush(), then eCryptfs has no need to do anything in ecryptfs_flush() since dirty pages are now written out to the lower filesystem in ecryptfs_release(). Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2012-09-14eCryptfs: Write out all dirty pages just before releasing the lower fileTyler Hicks1-0/+1
Fixes a regression caused by: 821f749 eCryptfs: Revert to a writethrough cache model That patch reverted some code (specifically, 32001d6f) that was necessary to properly handle open() -> mmap() -> close() -> dirty pages -> munmap(), because the lower file could be closed before the dirty pages are written out. Rather than reapplying 32001d6f, this approach is a better way of ensuring that the lower file is still open in order to handle writing out the dirty pages. It is called from ecryptfs_release(), while we have a lock on the lower file pointer, just before the lower file gets the final fput() and we overwrite the pointer. https://launchpad.net/bugs/1047261 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Reported-by: Artemy Tregubenko <me@arty.name> Tested-by: Artemy Tregubenko <me@arty.name> Tested-by: Colin Ian King <colin.king@canonical.com>
2012-09-13GFS2: Take account of blockages when using reserved blocksSteven Whitehouse1-38/+28
The claim_reserved_blks() function was not taking account of the possibility of "blockages" while performing allocation. This can be caused by another node allocating something in the same extent which has been reserved locally. This patch tests for this condition and then skips the remainder of the reservation in this case. This is a relatively rare event, so that it should not affect the general performance improvement which the block reservations provide. The claim_reserved_blks() function also appears not to be able to deal with reservations which cross bitmap boundaries, but that can be dealt with in a future patch since we don't generate boundary crossing reservations currently. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: David Teigland <teigland@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>