linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2015-07-01	sysfs: Add support for permanently empty directories to serve as mount points.	Eric W. Biederman	2	-0/+49
	Add two functions sysfs_create_mount_point and sysfs_remove_mount_point that hang a permanently empty directory off of a kobject or remove a permanently emptpy directory hanging from a kobject. Export these new functions so modular filesystems can use them. Cc: stable@vger.kernel.org Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-01	kernfs: Add support for always empty directories.	Eric W. Biederman	3	-1/+42
	Add a new function kernfs_create_empty_dir that can be used to create directory that can not be modified. Update the code to use make_empty_dir_inode when reporting a permanently empty directory to the vfs. Update the code to not allow adding to permanently empty directories. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-01	proc: Allow creating permanently empty directories that serve as mount points	Eric W. Biederman	4	-2/+35
	Add a new function proc_create_mount_point that when used to creates a directory that can not be added to. Add a new function is_empty_pde to test if a function is a mount point. Update the code to use make_empty_dir_inode when reporting a permanently empty directory to the vfs. Update the code to not allow adding to permanently empty directories. Update /proc/openprom and /proc/fs/nfsd to be permanently empty directories. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-01	sysctl: Allow creating permanently empty directories that serve as mountpoints.	Eric W. Biederman	3	-7/+41
	Add a magic sysctl table sysctl_mount_point that when used to create a directory forces that directory to be permanently empty. Update the code to use make_empty_dir_inode when accessing permanently empty directories. Update the code to not allow adding to permanently empty directories. Update /proc/sys/fs/binfmt_misc to be a permanently empty directory. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-01	fs: Add helper functions for permanently empty directories.	Eric W. Biederman	2	-0/+98
	To ensure it is safe to mount proc and sysfs I need to check if filesystems that are mounted on top of them are mounted on truly empty directories. Given that some directories can gain entries over time, knowing that a directory is empty right now is insufficient. Therefore add supporting infrastructure for permantently empty directories that proc and sysfs can use when they create mount points for filesystems and fs_fully_visible can use to test for permanently empty directories to ensure that nothing will be gained by mounting a fresh copy of proc or sysfs. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-07-01	vfs: Ignore unlocked mounts in fs_fully_visible	Eric W. Biederman	1	-2/+6
	Limit the mounts fs_fully_visible considers to locked mounts. Unlocked can always be unmounted so considering them adds hassle but no security benefit. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-06-04	mnt: Modify fs_fully_visible to deal with locked ro nodev and atime	Eric W. Biederman	1	-3/+21
	Ignore an existing mount if the locked readonly, nodev or atime attributes are less permissive than the desired attributes of the new mount. On success ensure the new mount locks all of the same readonly, nodev and atime attributes as the old mount. The nosuid and noexec attributes are not checked here as this change is destined for stable and enforcing those attributes causes a regression in lxc and libvirt-lxc where those applications will not start and there are no known executables on sysfs or proc and no known way to create exectuables without code modifications Cc: stable@vger.kernel.org Fixes: e51db73532955 ("userns: Better restrictions on when proc and sysfs can be mounted") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-05-13	mnt: Refactor the logic for mounting sysfs and proc in a user namespace	Eric W. Biederman	4	-10/+10
	Fresh mounts of proc and sysfs are a very special case that works very much like a bind mount. Unfortunately the current structure can not preserve the MNT_LOCK... mount flags. Therefore refactor the logic into a form that can be modified to preserve those lock bits. Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount of the filesystem be fully visible in the current mount namespace, before the filesystem may be mounted. Move the logic for calling fs_fully_visible from proc and sysfs into fs/namespace.c where it has greater access to mount namespace state. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-05-09	mnt: Fix fs_fully_visible to verify the root directory is visible	Eric W. Biederman	1	-0/+6
	This fixes a dumb bug in fs_fully_visible that allows proc or sys to be mounted if there is a bind mount of part of /proc/ or /sys/ visible. Cc: stable@vger.kernel.org Reported-by: Eric Windisch <ewindisch@docker.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-04-26	Linux 4.1-rc1	Linus Torvalds	1	-2/+2

2015-04-26	x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue	Andy Lutomirski	5	-0/+48
	AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET with SS == 0 results in an invalid usermode state in which SS is apparently equal to __USER_DS but causes #SS if used. Work around the issue by setting SS to __KERNEL_DS __switch_to, thus ensuring that SYSRET never happens with SS set to NULL. This was exposed by a recent vDSO cleanup. Fixes: e7d6eefaaa44 x86/vdso32/syscall.S: Do not load __USER32_DS to %ss Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Brian Gerst <brgerst@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-26	v4l: xilinx: fix for include file movement	Stephen Rothwell	1	-1/+1
	Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-24	platform/chrome: chromeos_laptop - instantiate Atmel at primary address	Dmitry Torokhov	1	-9/+26
	The new Atmel MXT driver expects i2c client's address contain the primary (main address) of the chip, and calculates the expected bootloader address form the primary address. Unfortunately chrome_laptop does probe the devices and if touchpad (or touchscreen, or both) comes up in bootloader mode the i2c device gets instantiated with the bootloader address which confuses the driver. To work around this issue let's probe the primary address first. If the device is not detected at the primary address we'll probe alternative addresses as "dummy" devices. If any of them are found, destroy the dummy client and instantiate client with proper name at primary address still. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-04-24	RCU pathwalk breakage when running into a symlink overmounting something	Al Viro	1	-2/+4
	Calling unlazy_walk() in walk_component() and do_last() when we find a symlink that needs to be followed doesn't acquire a reference to vfsmount. That's fine when the symlink is on the same vfsmount as the parent directory (which is almost always the case), but it's not always true - one _can_ manage to bind a symlink on top of something. And in such cases we end up with excessive mntput(). Cc: stable@vger.kernel.org # since 2.6.39 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24	fix I_DIO_WAKEUP definition	Eric Sandeen	1	-1/+1
	I_DIO_WAKEUP is never directly used, but fix it up anyway. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24	direct-io: only inc/dec inode->i_dio_count for file systems	Jens Axboe	9	-33/+50
	do_blockdev_direct_IO() increments and decrements the inode ->i_dio_count for each IO operation. It does this to protect against truncate of a file. Block devices don't need this sort of protection. For a capable multiqueue setup, this atomic int is the only shared state between applications accessing the device for O_DIRECT, and it presents a scaling wall for that. In my testing, as much as 30% of system time is spent incrementing and decrementing this value. A mixed read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with better latencies too. Before: clat percentiles (usec): \| 1.00th=[ 33], 5.00th=[ 34], 10.00th=[ 34], 20.00th=[ 34], \| 30.00th=[ 34], 40.00th=[ 34], 50.00th=[ 35], 60.00th=[ 35], \| 70.00th=[ 35], 80.00th=[ 35], 90.00th=[ 37], 95.00th=[ 80], \| 99.00th=[ 98], 99.50th=[ 151], 99.90th=[ 155], 99.95th=[ 155], \| 99.99th=[ 165] After: clat percentiles (usec): \| 1.00th=[ 95], 5.00th=[ 108], 10.00th=[ 129], 20.00th=[ 149], \| 30.00th=[ 155], 40.00th=[ 161], 50.00th=[ 167], 60.00th=[ 171], \| 70.00th=[ 177], 80.00th=[ 185], 90.00th=[ 201], 95.00th=[ 270], \| 99.00th=[ 390], 99.50th=[ 398], 99.90th=[ 418], 99.95th=[ 422], \| 99.99th=[ 438] In other setups, Robert Elliott reported seeing good performance improvements: https://lkml.org/lkml/2015/4/3/557 The more applications accessing the device, the worse it gets. Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells do_blockdev_direct_IO() that it need not worry about incrementing or decrementing the inode i_dio_count for this caller. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Elliott, Robert (Server Storage) <elliott@hp.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24	fs/9p: fix readdir()	Johannes Berg	1	-0/+2
	Al Viro's IOV changes broke 9p readdir() because the new code didn't abort the read when it returned nothing. The original code checked if the combined error/length was <= 0 but in the new code that accidentally got changed to just an error check. Add back the return from the function when nothing is read. Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: e1200fe68f20 ("9p: switch p9_client_read() to passing struct iov_iter *") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24	Btrfs: prevent list corruption during free space cache processing	Chris Mason	1	-14/+18
	__btrfs_write_out_cache is holding the ctl->tree_lock while it prepares a list of bitmaps to record in the free space cache. It was dropping the lock while it worked on other components, which made a window for free_bitmap() to free the bitmap struct without removing it from the list. This changes things to hold the lock the whole time, and also makes sure we hold the lock during enospc cleanup. Reported-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-04-24	toshiba_acpi: Do not register vendor backlight when acpi_video bl is available	Hans de Goede	2	-0/+25
	commit a39f46df33c6 ("toshiba_acpi: Fix regression caused by backlight extra check code") causes the backlight to no longer work on the Toshiba Z30, reverting that commit fixes this but restores the original issue fixed by that commit. Looking at the toshiba_acpi backlight code for a fix for this I noticed that the toshiba code is the only code under platform/x86 which unconditionally registers a vendor acpi backlight interface, without checking for acpi_video backlight support first. This commit adds the necessary checks bringing toshiba_acpi in line with the other drivers, and fixing the Z30 regression without needing to revert the commit causing it. Chances are that there will be some Toshiba models which have a non working acpi-video implementation while the toshiba vendor backlight interface does work, this commit adds an empty dmi_id table where such systems can be added, this is identical to how other drivers handle such systems. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1206036 BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=86521 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-and-tested-by: Azael Avalos <coproscefalo@gmail.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
2015-04-24	x86: fix special __probe_kernel_write() tail zeroing case	Linus Torvalds	1	-1/+1
	Commit cae2a173fe94 ("x86: clean up/fix 'copy_in_user()' tail zeroing") fixed the failure case tail zeroing of one special case of the x86-64 generic user-copy routine, namely when used for the user-to-user case ("copy_in_user()"). But in the process it broke an even more unusual case: using the user copy routine for kernel-to-kernel copying. Now, normally kernel-kernel copies are obviously done using memcpy(), but we have a couple of special cases when we use the user-copy functions. One is when we pass a kernel buffer to a regular user-buffer routine, using set_fs(KERNEL_DS). That's a "normal" case, and continued to work fine, because it never takes any faults (with the possible exception of a silent and successful vmalloc fault). But Jan Beulich pointed out another, very unusual, special case: when we use the user-copy routines not because it's a path that expects a user pointer, but for a couple of ftrace/kgdb cases that want to do a kernel copy, but do so using "unsafe" buffers, and use the user-copy routine to gracefully handle faults. IOW, for probe_kernel_write(). And that broke for the case of a faulting kernel destination, because we saw the kernel destination and wanted to try to clear the tail of the buffer. Which doesn't work, since that's what faults. This only triggers for things like kgdb and ftrace users (eg trying setting a breakpoint on read-only memory), but it's definitely a bug. The fix is to not compare against the kernel address start (TASK_SIZE), but instead use the same limits "access_ok()" uses. Reported-and-tested-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org # 4.0 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-24	crypto: img-hash - CRYPTO_DEV_IMGTEC_HASH should depend on HAS_DMA	Geert Uytterhoeven	1	-1/+2
	If NO_DMA=y: drivers/built-in.o: In function `img_hash_write_via_dma_stop': img-hash.c:(.text+0xa2b822): undefined reference to `dma_unmap_sg' drivers/built-in.o: In function `img_hash_xmit_dma': img-hash.c:(.text+0xa2b8d8): undefined reference to `dma_map_sg' img-hash.c:(.text+0xa2b948): undefined reference to `dma_unmap_sg' Also move the "depends" section below the "tristate" line while we're at it. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-24	crypto: x86/sha512_ssse3 - fixup for asm function prototype change	Ard Biesheuvel	1	-1/+1
	Patch e68410ebf626 ("crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer") changed the prototypes of the core asm SHA-512 implementations so that they are compatible with the prototype used by the base layer. However, in one instance, the register that was used for passing the input buffer was reused as a scratch register later on in the code, and since the input buffer param changed places with the digest param -which needs to be written back before the function returns- this resulted in the scratch register to be dereferenced in a memory write operation, causing a GPF. Fix this by changing the scratch register to use the same register as the input buffer param again. Fixes: e68410ebf626 ("crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer") Reported-By: Bobby Powers <bobbypowers@gmail.com> Tested-By: Bobby Powers <bobbypowers@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-24	nios2: rework cache	Ley Foon Tan	3	-17/+57
	- flush dcache before flush instruction cache - remork update_mmu_cache and flush_dcache_page - add shmparam.h Signed-off-by: Ley Foon Tan <lftan@altera.com>
2015-04-24	nios2: Add types.h header required for __u32 type	Ezequiel Garcia	1	-0/+2
	Reported by the header checker (CONFIG_HEADERS_CHECK=y): CHECK usr/include/asm/ (31 files) ./usr/include/asm/ptrace.h:77: found __[us]{8,16,32,64} type without #include <linux/types.h> Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Acked-by: Ley Foon Tan <lftan@altera.com>
2015-04-24	ALSA: hda - fix headset mic detection problem for one more machine	Hui Wang	1	-9/+15
	We have two machines with alc256 codec in the pin quirk table, so moving the common pins to ALC256_STANDARD_PINS. Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1447909 Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-04-24	eth: bf609 eth clock: add pclk clock for stmmac driver probe	Steven Miao	1	-0/+7
	Signed-off-by: Steven Miao <realmz6@gmail.com>
2015-04-24	blackfin: Wire up missing syscalls	Chen Gang	2	-1/+21
	The related syscalls are below which may cause samples/kdbus building break in next-20150401 tree, the related information and error: CALL scripts/checksyscalls.sh <stdin>:1223:2: warning: #warning syscall kcmp not implemented [-Wcpp] <stdin>:1226:2: warning: #warning syscall finit_module not implemented [-Wcpp] <stdin>:1229:2: warning: #warning syscall sched_setattr not implemented [-Wcpp] <stdin>:1232:2: warning: #warning syscall sched_getattr not implemented [-Wcpp] <stdin>:1235:2: warning: #warning syscall renameat2 not implemented [-Wcpp] <stdin>:1238:2: warning: #warning syscall seccomp not implemented [-Wcpp] <stdin>:1241:2: warning: #warning syscall getrandom not implemented [-Wcpp] <stdin>:1244:2: warning: #warning syscall memfd_create not implemented [-Wcpp] <stdin>:1247:2: warning: #warning syscall bpf not implemented [-Wcpp] <stdin>:1250:2: warning: #warning syscall execveat not implemented [-Wcpp] [...] HOSTCC samples/kdbus/kdbus-workers samples/kdbus/kdbus-workers.c: In function ‘prime_new’: samples/kdbus/kdbus-workers.c:930:18: error: ‘__NR_memfd_create’ undeclared (first use in this function) p->fd = syscall(__NR_memfd_create, "prime-area", MFD_CLOEXEC); ^ samples/kdbus/kdbus-workers.c:930:18: note: each undeclared identifier is reported only once for each function it appears in Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
2015-04-23	Btrfs: fix inode cache writeout	Chris Mason	1	-3/+7
	The code to fix stalls during free spache cache IO wasn't using the correct root when waiting on the IO for inode caches. This is only a problem when the inode cache is enabled with mount -o inode_cache This fixes the inode cache writeout to preserve any error values and makes sure not to override the root when inode cache writeout is done. Reported-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-04-24	ACPI / scan: Add a scan handler for PRP0001	Rafael J. Wysocki	1	-5/+28
	If the special PRP0001 device ID is present in the given device's list of ACPI/PNP IDs and the device has a valid "compatible" property in the _DSD, it should be enumerated using the default mechanism, unless some scan handlers match the IDs preceding PRP0001 in the device's list of ACPI/PNP IDs. In addition to that, no scan handlers matching the IDs following PRP0001 in that list should be attached to the device. To make that happen, define a scan handler that will match PRP0001 and trigger the default enumeration for the matching devices if the "compatible" property is present for them. Since that requires the check for platform_id and device->handler to be removed from acpi_default_enumeration(), move the fallback invocation of acpi_default_enumeration() to acpi_bus_attach() (after it's checked if there's a matching ACPI driver for the device), which is a better place to call it, and do the platform_id check in there too (device->handler is guaranteed to be unset at the point where the function is looking for a matching ACPI driver). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Darren Hart <dvhart@linux.intel.com>
2015-04-24	ACPI / scan: Annotate physical_node_lock in acpi_scan_is_offline()	Rafael J. Wysocki	1	-1/+5
	acpi_scan_is_offline() may be called under the physical_node_lock lock of the given device object's parent, so prevent lockdep from complaining about that by annotating that instance with SINGLE_DEPTH_NESTING. Fixes: caa73ea158de (ACPI / hotplug / driver core: Handle containers in a special way) Reported-and-tested-by: Xie XiuQi <xiexiuqi@huawei.com> Reviewed-by: Toshi Kani <toshi.kani@hp.com> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-24	drm/i915: vlv: fix save/restore of GFX_MAX_REQ_COUNT reg	Imre Deak	1	-2/+2
	Due this typo we don't save/restore the GFX_MAX_REQ_COUNT register across suspend/resume, so fix this. This was introduced in commit ddeea5b0c36f3665446518c609be91f9336ef674 Author: Imre Deak <imre.deak@intel.com> Date: Mon May 5 15:19:56 2014 +0300 drm/i915: vlv: add runtime PM support I noticed this only by reading the code. To my knowledge it shouldn't cause any real problems at the moment, since the power well backing this register remains on across a runtime s/r. This may change once system-wide s0ix functionality is enabled in the kernel. v2: - resend after a missing git add -u :/ Cc: stable@vger.kernel.org Signed-off-by: Imre Deak <imre.deak@intel.com> Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-04-23	drm/i915: Workaround to avoid lite restore with HEAD==TAIL	Michel Thierry	2	-2/+36
	WaIdleLiteRestore is an execlists-only workaround, and requires the driver to ensure that any context always has HEAD!=TAIL when attempting lite restore. Add two extra MI_NOOP instructions at the end of each request, but keep the requests tail pointing before the MI_NOOPs. We may not need to executed them, and this is why request->tail is sampled before adding these extra instructions. If we submit a context to the ELSP which has previously been submitted, move the tail pointer past the MI_NOOPs. This ensures HEAD!=TAIL. v2: Move overallocation to gen8_emit_request, and added note about sampling request->tail in commit message (Chris). v3: Remove redundant request->tail assignment in __i915_add_request, in lrc mode this is already set in execlists_context_queue. Do not add wa implementation details inside gem (Chris). v4: Apply the wa whenever the req has been resubmitted and update comment (Chris). Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-04-23	drm/i915: cope with large i2c transfers	Dmitry Torokhov	2	-10/+57
	The hardware, according to the specs, is limited to 256 byte transfers, and current driver has no protections in case users attempt to do larger transfers. The code will just stomp over status register and mayhem ensues. Let's split larger transfers into digestable chunks. Doing this allows Atmel MXT driver on Pixel 1 function properly (it hasn't since commit 9d8dc3e529a19e427fd379118acd132520935c5d "Input: atmel_mxt_ts - implement T44 message handling" which tries to consume multiple touchscreen/touchpad reports in a single transaction). Cc: stable@vger.kernel.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-04-23	fs/nfs: fix new compiler warning about boolean in switch	Andre Przywara	1	-7/+4
	The brand new GCC 5.1.0 warns by default on using a boolean in the switch condition. This results in the following warning: fs/nfs/nfs4proc.c: In function 'nfs4_proc_get_rootfh': fs/nfs/nfs4proc.c:3100:10: warning: switch condition has boolean value [-Wswitch-bool] switch (auth_probe) { ^ This code was obviously using switch to make use of the fall-through semantics (without the usual comment, though). Rewrite that code using if statements to avoid the warning and make the code a bit more readable on the way. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	nfs: Remove unneeded casts in nfs	Firo Yang	1	-1/+1
	Don't unnecessarily cast allocation return value in fs/nfs/inode.c::nfs_alloc_inode(). Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	NFS: Don't attempt to decode missing directory entries	Benjamin Coddington	1	-0/+4
	If a READDIR reply comes back without any page data, avoid a NULL pointer dereference in xdr_copy_to_scratch(). BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 IP: [<ffffffff813a378d>] memcpy+0xd/0x110 ... Call Trace: ? xdr_inline_decode+0x7a/0xb0 [sunrpc] nfs3_decode_dirent+0x73/0x320 [nfsv3] nfs_readdir_page_filler+0xd5/0x4e0 [nfs] ? nfs3_rpc_wrapper.constprop.9+0x42/0xc0 [nfsv3] nfs_readdir_xdr_to_array+0x1fa/0x330 [nfs] ? mem_cgroup_commit_charge+0xac/0x160 ? nfs_readdir_xdr_to_array+0x330/0x330 [nfs] nfs_readdir_filler+0x22/0x90 [nfs] do_read_cache_page+0x7e/0x1a0 read_cache_page+0x1c/0x20 nfs_readdir+0x18e/0x660 [nfs] ? nfs3_xdr_dec_getattr3res+0x80/0x80 [nfsv3] iterate_dir+0x97/0x130 SyS_getdents+0x94/0x120 ? fillonedir+0xd0/0xd0 system_call_fastpath+0x12/0x17 Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one"	Nicolas Iooss	2	-2/+2
	This reverts commit 5a254d08b086d80cbead2ebcee6d2a4b3a15587a. Since commit 5a254d08b086 ("nfs: replace nfs_add_stats with nfs_inc_stats when add one"), nfs_readpage and nfs_do_writepage use nfs_inc_stats to increment NFSIOS_READPAGES and NFSIOS_WRITEPAGES instead of nfs_add_stats. However nfs_inc_stats does not do the same thing as nfs_add_stats with value 1 because these functions work on distinct stats: nfs_inc_stats increments stats from "enum nfs_stat_eventcounters" (in server->io_stats->events) and nfs_add_stats those from "enum nfs_stat_bytecounters" (in server->io_stats->bytes). Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Fixes: 5a254d08b086 ("nfs: replace nfs_add_stats with nfs_inc_stats...") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	NFS: Rename idmap.c to nfs4idmap.c	Anna Schumaker	2	-1/+1
	I added the nfs4 prefix to make it obvious that this file is built into the NFS v4 module, and not the generic client. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	NFS: Move nfs_idmap.h into fs/nfs/	Anna Schumaker	10	-10/+10
	This file is only used internally to the NFS v4 module, so it doesn't need to be in the global include path. I also renamed it from nfs_idmap.h to nfs4idmap.h to emphasize that it's an NFSv4-only include file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h	Anna Schumaker	3	-13/+0
	The idmapper is completely internal to the NFS v4 module, so this macro will always evaluate to true. This patch also removes unnecessary includes of this file from the generic NFS client. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	NFS: Add a stub for GETDEVICELIST	Anna Schumaker	1	-0/+6
	d4b18c3e (pnfs: remove GETDEVICELIST implementation) removed the GETDEVICELIST operation from the NFS client, but left a "hole" in the nfs4_procedures array. This caused /proc/self/mountstats to report an operation named "51" where GETDEVICELIST used to be. This patch adds a stub to fix mountstats. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Fixes: d4b18c3e (pnfs: remove GETDEVICELIST implementation) Cc: stable@vger.kernel.org # 3.17+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes	Peng Tao	1	-2/+0
	For flexfiles driver, we might choose to read from mirror index other than 0 while mirror_count is always 1 for read. Reported-by: Jean Spector <jean@primarydata.com> Cc: <stable@vger.kernel.org> # v3.19+ Cc: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	nfs: fix DIO good bytes calculation	Peng Tao	1	-12/+17
	For direct read that has IO size larger than rsize, we'll split it into several READ requests and nfs_direct_good_bytes() would count completed bytes incorrectly by eating last zero count reply. Fix it by handling mirror and non-mirror cases differently such that we only count mirrored writes differently. This fixes 5fadeb47("nfs: count DIO good bytes correctly with mirroring"). Reported-by: Jean Spector <jean@primarydata.com> Cc: <stable@vger.kernel.org> # v3.19+ Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	nfs: Fetch MOUNTED_ON_FILEID when updating an inode	Anna Schumaker	2	-2/+16
	2ef47eb1 (NFS: Fix use of nfs_attr_use_mounted_on_fileid()) was a good start to fixing a circular directory structure warning for NFS v4 "junctioned" mountpoints. Unfortunately, further testing continued to generate this error. My server is configured like this: anna@nfsd ~ % df Filesystem Size Used Avail Use% Mounted on /dev/vda1 9.1G 2.0G 6.5G 24% / /dev/vdc1 1014M 33M 982M 4% /exports /dev/vdc2 1014M 33M 982M 4% /exports/vol1 /dev/vdc3 1014M 33M 982M 4% /exports/vol1/vol2 anna@nfsd ~ % cat /etc/exports /exports/ (rw,async,no_subtree_check,no_root_squash) /exports/vol1/ (rw,async,no_subtree_check,no_root_squash) /exports/vol1/vol2 *(rw,async,no_subtree_check,no_root_squash) I've been running chown across the entire mountpoint twice in a row to hit this problem. The first run succeeds, but the second one fails with the circular directory warning along with: anna@client ~ % dmesg [Apr 3 14:28] NFS: server 192.168.100.204 error: fileid changed fsid 0:39: expected fileid 0x100080, got 0x80 WHere 0x80 is the mountpoint's fileid and 0x100080 is the mounted-on fileid. This patch fixes the issue by requesting an updated mounted-on fileid from the server during nfs_update_inode(), and then checking that the fileid stored in the nfs_inode matches either the fileid or mounted-on fileid returned by the server. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	sunrpc: make debugfs file creation failure non-fatal	Jeff Layton	5	-47/+41
	v2: gracefully handle the case where some dentry pointers end up NULL and be more dilligent about zeroing out dentry pointers We currently have a problem that SELinux policy is being enforced when creating debugfs files. If a debugfs file is created as a side effect of doing some syscall, then that creation can fail if the SELinux policy for that process prevents it. This seems wrong. We don't do that for files under /proc, for instance, so Bruce has proposed a patch to fix that. While discussing that patch however, Greg K.H. stated: "No kernel code should care / fail if a debugfs function fails, so please fix up the sunrpc code first." This patch converts all of the sunrpc debugfs setup code to be void return functins, and the callers to not look for errors from those functions. This should allow rpc_clnt and rpc_xprt creation to work, even if the kernel fails to create debugfs files for some reason. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	nfs: fix high load average due to callback thread sleeping	Jeff Layton	1	-3/+3
	Chuck pointed out a problem that crept in with commit 6ffa30d3f734 (nfs: don't call blocking operations while !TASK_RUNNING). Linux counts tasks in uninterruptible sleep against the load average, so this caused the system's load average to be pinned at at least 1 when there was a NFSv4.1+ mount active. Not a huge problem, but it's probably worth fixing before we get too many complaints about it. This patch converts the code back to use TASK_INTERRUPTIBLE sleep, simply has it flush any signals on each loop iteration. In practice no one should really be signalling this thread at all, so I think this is reasonably safe. With this change, there's also no need to game the hung task watchdog so we can also convert the schedule_timeout call back to a normal schedule. Cc: <stable@vger.kernel.org> Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Fixes: commit 6ffa30d3f734 (“nfs: don't call blocking . . .”) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	NFS: Reduce time spent holding the i_mutex during fallocate()	Anna Schumaker	2	-7/+10
	At the very least, we should not be taking the i_mutex until after checking if the server even supports ALLOCATE or DEALLOCATE, allowing v4.0 or v4.1 to exit without potentially waiting on a lock. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	NFS: Don't zap caches on fallocate()	Anna Schumaker	5	-10/+39
	This patch adds a GETATTR to the end of ALLOCATE and DEALLOCATE operations so we can set the updated inode size and change attribute directly. DEALLOCATE will still need to release pagecache pages, so nfs42_proc_deallocate() now calls truncate_pagecache_range() before contacting the server. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23	i2c: st: add include for pinctrl	Wolfram Sang	1	-6/+7
	The driver uses pinctrl directly and thus should include the appropriate header. Sort the headers while we are here to have a better view what is included and what is not. Reported-by: Pascal Huerst <pascal.huerst@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-04-23	i2c: mux: use proper dev when removing "channel-X" symlinks	Wolfram Sang	1	-3/+5
	Those symlinks are created for the mux_dev, so we need to remove it from there. Currently, it breaks for muxes where the mux_dev is not the device of the parent adapter like this: [ 78.234644] WARNING: CPU: 0 PID: 365 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5c/0x78() [ 78.242438] sysfs: cannot create duplicate filename '/devices/platform/i2cbus@8/channel-0' Remove confusing comments while we are here. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Fixes: c9449affad2ae0 Cc: stable@kernel.org