aboutsummaryrefslogtreecommitdiffstats
path: root/fs/overlayfs/dir.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-12-16ovl: check for emptiness of redirect dirAmir Goldstein1-9/+22
Before introducing redirect_dir feature, the condition !ovl_lower_positive(dentry) for a directory, implied that it is a pure upper directory, which may be removed if empty. Now that directory can be redirect, it is possible that upper does not cover any lower (i.e. !ovl_lower_positive(dentry)), but the directory is a merge (with redirected path) and maybe non empty. Check for this case in ovl_remove_upper(). This change fixes the following test case from rename-pop-dir.py of unionmount-testsuite: """Remove dir and rename old name""" d = ctx.non_empty_dir() d2 = ctx.no_dir() ctx.rmdir(d, err=ENOTEMPTY) ctx.rename(d, d2) ctx.rmdir(d, err=ENOENT) ctx.rmdir(d2, err=ENOTEMPTY) ./run --ov rename-pop-dir /mnt/a/no_dir103: Expected error (Directory not empty) was not produced Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: redirect on rename-dirMiklos Szeredi1-14/+124
Current code returns EXDEV when a directory would need to be copied up to move. We could copy up the directory tree in this case, but there's another, simpler solution: point to old lower directory from moved upper directory. This is achieved with a "trusted.overlay.redirect" xattr storing the path relative to the root of the overlay. After such attribute has been set, the directory can be moved without further actions required. This is a backward incompatible feature, old kernels won't be able to correctly mount an overlay containing redirected directories. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: check lower existence of rename targetMiklos Szeredi1-52/+11
Check if something exists on the lower layer(s) under the target or rename to decide if directory needs to be marked "opaque". Marking opaque is done before the rename, and on failure the marking was undone. Also the opaque xattr was removed if the target didn't cover anything. This patch changes behavior so that removal of "opaque" is not done in either of the above cases. This means that directory may have the opaque flag even if it doesn't cover anything. However this shouldn't affect the performance or semantics of the overalay, while simplifying the code. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: rename: simplify handling of lower/merged directoryMiklos Szeredi1-18/+12
d_is_dir() is safe to call on a negative dentry. Use this fact to simplify handling of the lower or merged directories. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: get rid of PURE typeMiklos Szeredi1-6/+3
The remainging uses of __OVL_PATH_PURE can be replaced by ovl_dentry_is_opaque(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: check lower existence when removingMiklos Szeredi1-2/+2
Currently ovl_lookup() checks existence of lower file even if there's a non-directory on upper (which is always opaque). This is done so that remove can decide whether a whiteout is needed or not. It would be better to defer this check to unlink, since most of the time the gathered information about opaqueness will be unused. This adds a helper ovl_lower_positive() that checks if there's anything on the lower layer(s). The following patches also introduce changes to how the "opaque" attribute is updated on directories: this attribute is added when the directory is creted or moved over a whiteout or object covering something on the lower layer. However following changes will allow the attribute to remain on the directory after being moved, even if the new location doesn't cover anything. Because of this, we need to check lower layers even for opaque directories, so that whiteout is only created when necessary. This function will later be also used to decide about marking a directory opaque, so deal with negative dentries as well. When dealing with negative, it's enough to check for being a whiteout If the dentry is positive but not upper then it also obviously needs whiteout/opaque. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: add ovl_dentry_is_whiteout()Miklos Szeredi1-3/+3
And use it instead of ovl_dentry_is_opaque() where appropriate. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: don't check stickyMiklos Szeredi1-24/+0
Since commit 07a2daab49c5 ("ovl: Copy up underlying inode's ->i_mode to overlay inode") sticky checking on overlay inode is performed by the vfs, so checking against sticky on underlying inode is not needed. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: don't check rename to selfMiklos Szeredi1-12/+3
This is redundant, the vfs already performed this check (and was broken, see commit 9409e22acdfc ("vfs: rename: check backing inode being equal")). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: treat special files like a regular fsMiklos Szeredi1-1/+1
No sense in opening special files on the underlying layers, they work just as well if opened on the overlay. Side effect is that it's no longer possible to connect one side of a pipe opened on overlayfs with the other side opened on the underlying layer. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-16ovl: rename ovl_rename2() to ovl_rename()Miklos Szeredi1-4/+4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-10-14Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfsLinus Torvalds1-1/+4
Pull overlayfs updates from Miklos Szeredi: "This update contains fixes to the "use mounter's permission to access underlying layers" area, and miscellaneous other fixes and cleanups. No new features this time" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: use vfs_get_link() vfs: add vfs_get_link() helper ovl: use generic_readlink ovl: explain error values when removing acl from workdir ovl: Fix info leak in ovl_lookup_temp() ovl: during copy up, switch to mounter's creds early ovl: lookup: do getxattr with mounter's permission ovl: copy_up_xattr(): use strnlen
2016-10-10Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-1/+1
Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning
2016-10-10Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-3/+0
Pull vfs xattr updates from Al Viro: "xattr stuff from Andreas This completes the switch to xattr_handler ->get()/->set() from ->getxattr/->setxattr/->removexattr" * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Remove {get,set,remove}xattr inode operations xattr: Stop calling {get,set,remove}xattr inode operations vfs: Check for the IOP_XATTR flag in listxattr xattr: Add __vfs_{get,set,remove}xattr helpers libfs: Use IOP_XATTR flag for empty directory handling vfs: Use IOP_XATTR flag for bad-inode handling vfs: Add IOP_XATTR inode operations flag vfs: Move xattr_resolve_name to the front of fs/xattr.c ecryptfs: Switch to generic xattr handlers sockfs: Get rid of getxattr iop sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names kernfs: Switch to generic xattr handlers hfs: Switch to generic xattr handlers jffs2: Remove jffs2_{get,set,remove}xattr macros xattr: Remove unnecessary NULL attribute name check
2016-10-07vfs: Remove {get,set,remove}xattr inode operationsAndreas Gruenbacher1-3/+0
These inode operations are no longer used; remove them. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-04Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds1-0/+10
Pull security subsystem updates from James Morris: SELinux/LSM: - overlayfs support, necessary for container filesystems LSM: - finally remove the kernel_module_from_file hook Smack: - treat signal delivery as an 'append' operation TPM: - lots of bugfixes & updates Audit: - new audit data type: LSM_AUDIT_DATA_FILE * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (47 commits) Revert "tpm/tpm_crb: implement tpm crb idle state" Revert "tmp/tpm_crb: fix Intel PTT hw bug during idle state" Revert "tpm/tpm_crb: open code the crb_init into acpi_add" Revert "tmp/tpm_crb: implement runtime pm for tpm_crb" lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE tmp/tpm_crb: implement runtime pm for tpm_crb tpm/tpm_crb: open code the crb_init into acpi_add tmp/tpm_crb: fix Intel PTT hw bug during idle state tpm/tpm_crb: implement tpm crb idle state tpm: add check for minimum buffer size in tpm_transmit() tpm: constify TPM 1.x header structures tpm/tpm_crb: fix the over 80 characters checkpatch warring tpm/tpm_crb: drop useless cpu_to_le32 when writing to registers tpm/tpm_crb: cache cmd_size register value. tmp/tpm_crb: drop include to platform_device tpm/tpm_tis: remove unused itpm variable tpm_crb: fix incorrect values of cmdReady and goIdle bits tpm_crb: refine the naming of constants tpm_crb: remove wmb()'s tpm_crb: fix crb_req_canceled behavior ...
2016-09-27fs: rename "rename2" i_op to "rename"Miklos Szeredi1-1/+1
Generated patch: sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2` sed -i "s/\brename2\b/rename/g" `git grep -wl rename2` Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-21ovl: Fix info leak in ovl_lookup_temp()Richard Weinberger1-1/+4
The function uses the memory address of a struct dentry as unique id. While the address-based directory entry is only visible to root it is IMHO still worth fixing since the temporary name does not have to be a kernel address. It can be any unique number. Replace it by an atomic integer which is allowed to wrap around. Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # v3.18+ Fixes: e9be9d5e76e3 ("overlay filesystem")
2016-09-01ovl: Switch to generic_getxattrAndreas Gruenbacher1-1/+1
Now that overlayfs has xattr handlers for iop->{set,remove}xattr, use those same handlers for iop->getxattr as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01ovl: Switch to generic_removexattrAndreas Gruenbacher1-1/+1
Commit d837a49bd57f ("ovl: fix POSIX ACL setting") switches from iop->setxattr from ovl_setxattr to generic_setxattr, so switch from ovl_removexattr to generic_removexattr as well. As far as permission checking goes, the same rules should apply in either case. While doing that, rename ovl_setxattr to ovl_xattr_set to indicate that this is not an iop->setxattr implementation and remove the unused inode argument. Move ovl_other_xattr_set above ovl_own_xattr_set so that they match the order of handlers in ovl_xattr_handlers. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01ovl: handle umask and posix_acl_default correctly on creationMiklos Szeredi1-0/+54
Setting MS_POSIXACL in sb->s_flags has the side effect of passing mode to create functions without masking against umask. Another problem when creating over a whiteout is that the default posix acl is not inherited from the parent dir (because the real parent dir at the time of creation is the work directory). Fix these problems by: a) If upper fs does not have MS_POSIXACL, then mask mode with umask. b) If creating over a whiteout, call posix_acl_create() to get the inherited acls. After creation (but before moving to the final destination) set these acls on the created file. posix_acl_create() also updates the file creation mode as appropriate. Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-08-08security, overlayfs: Provide hook to correctly label newly created filesVivek Goyal1-0/+10
During a new file creation we need to make sure new file is created with the right label. New file is created in upper/ so effectively file should get label as if task had created file in upper/. We switched to mounter's creds for actual file creation. Also if there is a whiteout present, then file will be created in work/ dir first and then renamed in upper. In none of the cases file will be labeled as we want it to be. This patch introduces a new hook dentry_create_files_as(), which determines the label/context dentry will get if it had been created by task in upper and modify passed set of creds appropriately. Caller makes use of these new creds for file creation. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: fix whitespace issues found with checkpatch.pl] [PM: changes to use stat->mode in ovl_create_or_link()] Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-07-29ovl: simplify empty checkingMiklos Szeredi1-29/+21
The empty checking logic is duplicated in ovl_check_empty_and_clear() and ovl_remove_and_whiteout(), except the condition for clearing whiteouts is different: ovl_check_empty_and_clear() checked for being upper ovl_remove_and_whiteout() checked for merge OR lower Move the intersection of those checks (upper AND merge) into ovl_check_empty_and_clear() and simplify ovl_remove_and_whiteout(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: clear nlink on rmdirMiklos Szeredi1-2/+6
To make delete notification work on fa/inotify. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: fix POSIX ACL settingMiklos Szeredi1-1/+1
Setting POSIX ACL needs special handling: 1) Some permission checks are done by ->setxattr() which now uses mounter's creds ("ovl: do operations on underlying file system in mounter's context"). These permission checks need to be done with current cred as well. 2) Setting ACL can fail for various reasons. We do not need to copy up in these cases. In the mean time switch to using generic_setxattr. [Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr() doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is disabled, and I could not come up with an obvious way to do it. This instead avoids the link error by defining two sets of ACL operations and letting the compiler drop one of the two at compile time depending on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code, also leading to smaller code. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: share inode for hard linkMiklos Szeredi1-35/+49
Inode attributes are copied up to overlay inode (uid, gid, mode, atime, mtime, ctime) so generic code using these fields works correcty. If a hard link is created in overlayfs separate inodes are allocated for each link. If chmod/chown/etc. is performed on one of the links then the inode belonging to the other ones won't be updated. This patch attempts to fix this by sharing inodes for hard links. Use inode hash (with real inode pointer as a key) to make sure overlay inodes are shared for hard links on upper. Hard links on lower are still split (which is not user observable until the copy-up happens, see Documentation/filesystems/overlayfs.txt under "Non-standard behavior"). The inode is only inserted in the hash if it is non-directoy and upper. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: store real inode pointer in ->i_privateMiklos Szeredi1-1/+2
To get from overlay inode to real inode we currently use 'struct ovl_entry', which has lifetime connected to overlay dentry. This is okay, since each overlay dentry had a new overlay inode allocated. Following patch will break that assumption, so need to leave out ovl_entry. This patch stores the real inode directly in i_private, with the lowest bit used to indicate whether the inode is upper or lower. Lifetime rules remain, using ovl_inode_real() must only be done while caller holds ref on overlay dentry (and hence on real dentry), or within RCU protected regions. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: update atime on upperMiklos Szeredi1-0/+1
Fix atime update logic in overlayfs. This patch adds an i_op->update_time() handler to overlayfs inodes. This forwards atime updates to the upper layer only. No atime updates are done on lower layers. Remove implicit atime updates to underlying files and directories with O_NOATIME. Remove explicit atime update in ovl_readlink(). Clear atime related mnt flags from cloned upper mount. This means atime updates are controlled purely by overlayfs mount options. Reported-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: fix sgid on directoryMiklos Szeredi1-4/+27
When creating directory in workdir, the group/sgid inheritance from the parent dir was omitted completely. Fix this by calling inode_init_owner() on overlay inode and using the resulting uid/gid/mode to create the file. Unfortunately the sgid bit can be stripped off due to umask, so need to reset the mode in this case in workdir before moving the directory in place. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: do operations on underlying file system in mounter's contextVivek Goyal1-31/+29
Given we are now doing checks both on overlay inode as well underlying inode, we should be able to do checks and operations on underlying file system using mounter's context. So modify all operations to do checks/operations on underlying dentry/inode in the context of mounter. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: define ->get_acl() for overlay inodesVivek Goyal1-0/+1
Now we are planning to do DAC permission checks on overlay inode itself. And to make it work, we will need to make sure we can get acls from underlying inode. So define ->get_acl() for overlay inodes and this in turn calls into underlying filesystem to get acls, if any. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: move some common code in a functionVivek Goyal1-8/+12
ovl_create_upper() and ovl_create_over_whiteout() seem to be sharing some common code which can be moved into a separate function. No functionality change. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-22ovl: verify upper dentry in ovl_remove_and_whiteout()Maxim Patlasov1-30/+24
The upper dentry may become stale before we call ovl_lock_rename_workdir. For example, someone could (mistakenly or maliciously) manually unlink(2) it directly from upperdir. To ensure it is not stale, let's lookup it after ovl_lock_rename_workdir and and check if it matches the upper dentry. Essentially, it is the same problem and similar solution as in commit 11f3710417d0 ("ovl: verify upper dentry before unlink and rename"). Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-06-15ovl: fix uid/gid when creating over whiteoutMiklos Szeredi1-2/+11
Fix a regression when creating a file over a whiteout. The new file/directory needs to use the current fsuid/fsgid, not the ones from the mounter's credentials. The refcounting is a bit tricky: prepare_creds() sets an original refcount, override_creds() gets one more, which revert_cred() drops. So 1) we need to expicitly put the mounter's credentials when overriding with the updated one 2) we need to put the original ref to the updated creds (and this can safely be done before revert_creds(), since we'll still have the ref from override_creds()). Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Fixes: 3fe6e52f0626 ("ovl: override creds with the ones from the superblock mounter") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-05-27ovl: override creds with the ones from the superblock mounterAntonio Murdaca1-62/+5
In user namespace the whiteout creation fails with -EPERM because the current process isn't capable(CAP_SYS_ADMIN) when setting xattr. A simple reproducer: $ mkdir upper lower work merged lower/dir $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged $ unshare -m -p -f -U -r bash Now as root in the user namespace: \# touch merged/dir/{1,2,3} # this will force a copy up of lower/dir \# rm -fR merged/* This ends up failing with -EPERM after the files in dir has been correctly deleted: unlinkat(4, "2", 0) = 0 unlinkat(4, "1", 0) = 0 unlinkat(4, "3", 0) = 0 close(4) = 0 unlinkat(AT_FDCWD, "merged/dir", AT_REMOVEDIR) = -1 EPERM (Operation not permitted) Interestingly, if you don't place files in merged/dir you can remove it, meaning if upper/dir does not exist, creating the char device file works properly in that same location. This patch uses ovl_sb_creator_cred() to get the cred struct from the superblock mounter and override the old cred with these new ones so that the whiteout creation is possible because overlay is wrong in assuming that the creds it will get with prepare_creds will be in the initial user namespace. The old cap_raise game is removed in favor of just overriding the old cred struct. This patch also drops from ovl_copy_up_one() the following two lines: override_cred->fsuid = stat->uid; override_cred->fsgid = stat->gid; This is because the correct uid and gid are taken directly with the stat struct and correctly set with ovl_set_attr(). Signed-off-by: Antonio Murdaca <runcom@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21ovl: cleanup unused var in rename2Miklos Szeredi1-2/+0
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21ovl: verify upper dentry before unlink and renameMiklos Szeredi1-21/+38
Unlink and rename in overlayfs checked the upper dentry for staleness by verifying upper->d_parent against upperdir. However the dentry can go stale also by being unhashed, for example. Expand the verification to actually look up the name again (under parent lock) and check if it matches the upper dentry. This matches what the VFS does before passing the dentry to filesytem's unlink/rename methods, which excludes any inconsistency caused by overlayfs. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-03ovl: ignore lower entries when checking purity of non-directory entriesKonstantin Khlebnikov1-0/+7
After rename file dentry still holds reference to lower dentry from previous location. This doesn't matter for data access because data comes from upper dentry. But this stale lower dentry taints dentry at new location and turns it into non-pure upper. Such file leaves visible whiteout entry after remove in directory which shouldn't have whiteouts at all. Overlayfs already tracks pureness of file location in oe->opaque. This patch just uses that for detecting actual path type. Comment from Vivek Goyal's patch: Here are the details of the problem. Do following. $ mkdir upper lower work merged upper/dir/ $ touch lower/test $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir= work merged $ mv merged/test merged/dir/ $ rm merged/dir/test $ ls -l merged/dir/ /usr/bin/ls: cannot access merged/dir/test: No such file or directory total 0 c????????? ? ? ? ? ? test Basic problem seems to be that once a file has been unlinked, a whiteout has been left behind which was not needed and hence it becomes visible. Whiteout is visible because parent dir is of not type MERGE, hence od->is_real is set during ovl_dir_open(). And that means ovl_iterate() passes on iterate handling directly to underlying fs. Underlying fs does not know/filter whiteouts so it becomes visible to user. Why did we leave a whiteout to begin with when we should not have. ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave whiteout if file is pure upper. In this case file is not found to be pure upper hence whiteout is left. So why file was not PURE_UPPER in this case? I think because dentry is still carrying some leftover state which was valid before rename. For example, od->numlower was set to 1 as it was a lower file. After rename, this state is not valid anymore as there is no such file in lower. Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Reported-by: Viktor Stanchev <me@viktorstanchev.com> Suggested-by: Vivek Goyal <vgoyal@redhat.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611 Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org>
2016-03-03ovl: fix getcwd() failure after unsuccessful rmdirRui Wang1-1/+2
ovl_remove_upper() should do d_drop() only after it successfully removes the dir, otherwise a subsequent getcwd() system call will fail, breaking userspace programs. This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491 Signed-off-by: Rui Wang <rui.y.wang@intel.com> Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org>
2016-01-22wrappers for ->i_mutex accessAl Viro1-6/+6
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-19ovl: mount read-only if workdir can't be createdMiklos Szeredi1-0/+9
OpenWRT folks reported that overlayfs fails to mount if upper fs is full, because workdir can't be created. Wordir creation can fail for various other reasons too. There's no reason that the mount itself should fail, overlayfs can work fine without a workdir, as long as the overlay isn't modified. So mount it read-only and don't allow remounting read-write. Add a couple of WARN_ON()s for the impossible case of workdir being used despite being read-only. Reported-by: Bastian Bittorf <bittorf@bluebottle.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: <stable@vger.kernel.org> # v3.18+
2015-05-14ovl: don't remove non-empty opaque directoryMiklos Szeredi1-5/+19
When removing an opaque directory we can't just call rmdir() to check for emptiness, because the directory will need to be replaced with a whiteout. The replacement is done with RENAME_EXCHANGE, which doesn't check emptiness. Solution is just to check emptiness by reading the directory. In the future we could add a new rename flag to check for emptiness even for RENAME_EXCHANGE to optimize this case. Reported-by: Vincent Batts <vbatts@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Jordi Pujol Palomer <jordipujolp@gmail.com> Fixes: 263b4a0fee43 ("ovl: dont replace opaque dir") Cc: <stable@vger.kernel.org> # v4.0+
2015-02-22VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)David Howells1-3/+3
Convert the following where appropriate: (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry). (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry). (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more complicated than it appears as some calls should be converted to d_can_lookup() instead. The difference is whether the directory in question is a real dir with a ->lookup op or whether it's a fake dir with a ->d_automount op. In some circumstances, we can subsume checks for dentry->d_inode not being NULL into this, provided we the code isn't in a filesystem that expects d_inode to be NULL if the dirent really *is* negative (ie. if we're going to use d_inode() rather than d_backing_inode() to get the inode pointer). Note that the dentry type field may be set to something other than DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS manages the fall-through from a negative dentry to a lower layer. In such a case, the dentry type of the negative union dentry is set to the same as the type of the lower dentry. However, if you know d_inode is not NULL at the call site, then you can use the d_is_xxx() functions even in a filesystem. There is one further complication: a 0,0 chardev dentry may be labelled DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was intended for special directory entry types that don't have attached inodes. The following perl+coccinelle script was used: use strict; my @callers; open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') || die "Can't grep for S_ISDIR and co. callers"; @callers = <$fd>; close($fd); unless (@callers) { print "No matches\n"; exit(0); } my @cocci = ( '@@', 'expression E;', '@@', '', '- S_ISLNK(E->d_inode->i_mode)', '+ d_is_symlink(E)', '', '@@', 'expression E;', '@@', '', '- S_ISDIR(E->d_inode->i_mode)', '+ d_is_dir(E)', '', '@@', 'expression E;', '@@', '', '- S_ISREG(E->d_inode->i_mode)', '+ d_is_reg(E)' ); my $coccifile = "tmp.sp.cocci"; open($fd, ">$coccifile") || die $coccifile; print($fd "$_\n") || die $coccifile foreach (@cocci); close($fd); foreach my $file (@callers) { chomp $file; print "Processing ", $file, "\n"; system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 || die "spatch failed"; } [AV: overlayfs parts skipped] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-13ovl: Use macros to present ovl_xattrhujianyang1-2/+2
This patch adds two macros: OVL_XATTR_PRE_NAME and OVL_XATTR_PRE_LEN to present ovl_xattr name prefix and its length. Also, a new macro OVL_XATTR_OPAQUE is introduced to replace old *ovl_opaque_xattr*. Fix the length of "trusted.overlay." to *16*. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13ovl: dont replace opaque dirMiklos Szeredi1-1/+1
When removing an empty opaque directory, then it makes no sense to replace it with an exact replica of itself before removal. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13ovl: make path-type a bitmapMiklos Szeredi1-11/+11
OVL_PATH_PURE_UPPER -> __OVL_PATH_UPPER | __OVL_PATH_PURE OVL_PATH_UPPER -> __OVL_PATH_UPPER OVL_PATH_MERGE -> __OVL_PATH_UPPER | __OVL_PATH_MERGE OVL_PATH_LOWER -> 0 Multiple R/O layers will allow __OVL_PATH_MERGE without __OVL_PATH_UPPER. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-11-20ovl: fix remove/copy-up raceMiklos Szeredi1-12/+19
ovl_remove_and_whiteout() needs to check if upper dentry exists or not after having locked upper parent directory. Previously we used a "type" value computed before locking the upper parent directory, which is susceptible to racing with copy-up. There's a similar check in ovl_check_empty_and_clear(). This one is not actually racy, since copy-up doesn't change the "emptyness" property of a directory. Add a comment to this effect, and check the existence of upper dentry locally to make the code cleaner. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-10-24overlay filesystemMiklos Szeredi1-0/+921
Overlayfs allows one, usually read-write, directory tree to be overlaid onto another, read-only directory tree. All modifications go to the upper, writable layer. This type of mechanism is most often used for live CDs but there's a wide variety of other uses. The implementation differs from other "union filesystem" implementations in that after a file is opened all operations go directly to the underlying, lower or upper, filesystems. This simplifies the implementation and allows native performance in these cases. The dentry tree is duplicated from the underlying filesystems, this enables fast cached lookups without adding special support into the VFS. This uses slightly more memory than union mounts, but dentries are relatively small. Currently inodes are duplicated as well, but it is a possible optimization to share inodes for non-directories. Opening non directories results in the open forwarded to the underlying filesystem. This makes the behavior very similar to union mounts (with the same limitations vs. fchmod/fchown on O_RDONLY file descriptors). Usage: mount -t overlayfs overlayfs -olowerdir=/lower,upperdir=/upper/upper,workdir=/upper/work /overlay The following cotributions have been folded into this patch: Neil Brown <neilb@suse.de>: - minimal remount support - use correct seek function for directories - initialise is_real before use - rename ovl_fill_cache to ovl_dir_read Felix Fietkau <nbd@openwrt.org>: - fix a deadlock in ovl_dir_read_merged - fix a deadlock in ovl_remove_whiteouts Erez Zadok <ezk@fsl.cs.sunysb.edu> - fix cleanup after WARN_ON Sedat Dilek <sedat.dilek@googlemail.com> - fix up permission to confirm to new API Robin Dong <hao.bigrat@gmail.com> - fix possible leak in ovl_new_inode - create new inode in ovl_link Andy Whitcroft <apw@canonical.com> - switch to __inode_permission() - copy up i_uid/i_gid from the underlying inode AV: - ovl_copy_up_locked() - dput(ERR_PTR(...)) on two failure exits - ovl_clear_empty() - one failure exit forgetting to do unlock_rename(), lack of check for udir being the parent of upper, dropping and regaining the lock on udir (which would require _another_ check for parent being right). - bogus d_drop() in copyup and rename [fix from your mail] - copyup/remove and copyup/rename races [fix from your mail] - ovl_dir_fsync() leaving ERR_PTR() in ->realfile - ovl_entry_free() is pointless - it's just a kfree_rcu() - fold ovl_do_lookup() into ovl_lookup() - manually assigning ->d_op is wrong. Just use ->s_d_op. [patches picked from Miklos]: * copyup/remove and copyup/rename races * bogus d_drop() in copyup and rename Also thanks to the following people for testing and reporting bugs: Jordi Pujol <jordipujolp@gmail.com> Andy Whitcroft <apw@canonical.com> Michal Suchanek <hramrach@centrum.cz> Felix Fietkau <nbd@openwrt.org> Erez Zadok <ezk@fsl.cs.sunysb.edu> Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>