aboutsummaryrefslogtreecommitdiffstats
path: root/kernel (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2016-10-11ipc/msg: implement lockless pipelined wakeupsSebastian Andrzej Siewior1-93/+40
This patch moves the wakeup_process() invocation so it is not done under the ipc global lock by making use of a lockless wake_q. With this change, the waiter is woken up once the message has been assigned and it does not need to loop on SMP if the message points to NULL. In the signal case we still need to check the pointer under the lock to verify the state. This change should also avoid the introduction of preempt_disable() in -RT which avoids a busy-loop which pools for the NULL -> !NULL change if the waiter has a higher priority compared to the waker. By making use of wake_qs, the logic of sysv msg queues is greatly simplified (and very well suited as we can batch lockless wakeups), particularly around the lockless receive algorithm. This has been tested with Manred's pmsg-shared tool on a "AMD A10-7800 Radeon R7, 12 Compute Cores 4C+8G": test | before | after | diff -----------------|------------|------------|---------- pmsg-shared 8 60 | 19,347,422 | 30,442,191 | + ~57.34 % pmsg-shared 4 60 | 21,367,197 | 35,743,458 | + ~67.28 % pmsg-shared 2 60 | 22,884,224 | 24,278,200 | + ~6.09 % Link: http://lkml.kernel.org/r/1469748819-19484-2-git-send-email-dave@stgolabs.net Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11ipc/sem.c: fix complex_count vs. simple op raceManfred Spraul2-55/+84
Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a race: sem_lock has a fast path that allows parallel simple operations. There are two reasons why a simple operation cannot run in parallel: - a non-simple operations is ongoing (sma->sem_perm.lock held) - a complex operation is sleeping (sma->complex_count != 0) As both facts are stored independently, a thread can bypass the current checks by sleeping in the right positions. See below for more details (or kernel bugzilla 105651). The patch fixes that by creating one variable (complex_mode) that tracks both reasons why parallel operations are not possible. The patch also updates stale documentation regarding the locking. With regards to stable kernels: The patch is required for all kernels that include the commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") (3.10?) The alternative is to revert the patch that introduced the race. The patch is safe for backporting, i.e. it makes no assumptions about memory barriers in spin_unlock_wait(). Background: Here is the race of the current implementation: Thread A: (simple op) - does the first "sma->complex_count == 0" test Thread B: (complex op) - does sem_lock(): This includes an array scan. But the scan can't find Thread A, because Thread A does not own sem->lock yet. - the thread does the operation, increases complex_count, drops sem_lock, sleeps Thread A: - spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock) - sleeps before the complex_count test Thread C: (complex op) - does sem_lock (no array scan, complex_count==1) - wakes up Thread B. - decrements complex_count Thread A: - does the complex_count test Bug: Now both thread A and thread C operate on the same array, without any synchronization. Fixes: 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") Link: http://lkml.kernel.org/r/1469123695-5661-1-git-send-email-manfred@colorfullife.com Reported-by: <felixh@informatik.uni-bremen.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: <1vier1@web.de> Cc: <stable@vger.kernel.org> [3.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11kcov: do not instrument lib/stackdepot.cAlexander Potapenko1-0/+1
There's no point in collecting coverage from lib/stackdepot.c, as it is not a function of syscall inputs. Disabling kcov instrumentation for that file will reduce the coverage noise level. Link: http://lkml.kernel.org/r/1474640972-104131-1-git-send-email-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11config: android: enable CONFIG_SECCOMPRob Herring1-0/+1
As of Android N, SECCOMP is required. Without it, we will get mediaextractor error: E /system/bin/mediaextractor: libminijail: prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER): Invalid argument Link: http://lkml.kernel.org/r/20160908185934.18098-3-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Dmitry Shmidt <dimitrysh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11config: android: set SELinux as default security modeRob Herring1-0/+1
Android won't boot without SELinux enabled, so make it the default. Link: http://lkml.kernel.org/r/20160908185934.18098-2-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11config: android: move device mapper options to recommendedRob Herring2-4/+4
CONFIG_MD is in recommended, but other dependent options like DM_CRYPT and DM_VERITY options are in base. The result is the options in base don't get enabled when applying both base and recommended fragments. Move all the options to recommended. Link: http://lkml.kernel.org/r/20160908185934.18098-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Dmitry Shmidt <dimitrysh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11config/android: Remove CONFIG_IPV6_PRIVACYBorislav Petkov1-1/+0
Option is long gone, see commit 5d9efa7ee99e ("ipv6: Remove privacy config option.") Link: http://lkml.kernel.org/r/20160811170340.9859-1-bp@alien8.de Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11relay: Use irq_work instead of plain timer for deferred wakeupPeter Zijlstra3-11/+17
Relay avoids calling wake_up_interruptible() for doing the wakeup of readers/consumers, waiting for the generation of new data, from the context of a process which produced the data. This is apparently done to prevent the possibility of a deadlock in case Scheduler itself is is generating data for the relay, after acquiring rq->lock. The following patch used a timer (to be scheduled at next jiffy), for delegating the wakeup to another context. commit 7c9cb38302e78d24e37f7d8a2ea7eed4ae5f2fa7 Author: Tom Zanussi <zanussi@comcast.net> Date: Wed May 9 02:34:01 2007 -0700 relay: use plain timer instead of delayed work relay doesn't need to use schedule_delayed_work() for waking readers when a simple timer will do. Scheduling a plain timer, at next jiffies boundary, to do the wakeup causes a significant wakeup latency for the Userspace client, which makes relay less suitable for the high-frequency low-payload use cases where the data gets generated at a very high rate, like multiple sub buffers getting filled within a milli second. Moreover the timer is re-scheduled on every newly produced sub buffer so the timer keeps getting pushed out if sub buffers are filled in a very quick succession (less than a jiffy gap between filling of 2 sub buffers). As a result relay runs out of sub buffers to store the new data. By using irq_work it is ensured that wakeup of userspace client, blocked in the poll call, is done at earliest (through self IPI or next timer tick) enabling it to always consume the data in time. Also this makes relay consistent with printk & ring buffers (trace), as they too use irq_work for deferred wake up of readers. [arnd@arndb.de: select CONFIG_IRQ_WORK] Link: http://lkml.kernel.org/r/20160912154035.3222156-1-arnd@arndb.de [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1472906487-1559-1-git-send-email-akash.goel@intel.com Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Akash Goel <akash.goel@intel.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pps: kc: fix non-tickless system config dependencyMaciej S. Szmigiero1-1/+1
CONFIG_NO_HZ currently only sets the default value of dynticks config so if PPS kernel consumer needs periodic timer ticks it should depend on !CONFIG_NO_HZ_COMMON instead of !CONFIG_NO_HZ. Otherwise it is possible to enable it even on tickless system which has CONFIG_NO_HZ not set and CONFIG_NO_HZ_IDLE (or CONFIG_NO_HZ_FULL) set. Link: http://lkml.kernel.org/r/57E2B769.50202@maciej.szmigiero.name Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11mips/panic: replace smp_send_stop() with kdump friendly version in panic pathHidehiro Kawai4-1/+33
Daniel Walker reported problems which happens when crash_kexec_post_notifiers kernel option is enabled (https://lkml.org/lkml/2015/6/24/44). In that case, smp_send_stop() is called before entering kdump routines which assume other CPUs are still online. As the result, kdump routines fail to save other CPUs' registers. Additionally for MIPS OCTEON, it misses to stop the watchdog timer. To fix this problem, call a new kdump friendly function, crash_smp_send_stop(), instead of the smp_send_stop() when crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a weak function, and it just call smp_send_stop(). Architecture codes should override it so that kdump can work appropriately. This patch provides MIPS version. Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) Link: http://lkml.kernel.org/r/20160810080950.11028.28000.stgit@sysi4-13.yrl.intra.hitachi.co.jp Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Reported-by: Daniel Walker <dwalker@fifo99.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Walker <dwalker@fifo99.com> Cc: Xunlei Pang <xpang@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: "Steven J. Hill" <steven.hill@cavium.com> Cc: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11x86/panic: replace smp_send_stop() with kdump friendly version in panic pathHidehiro Kawai5-10/+66
Daniel Walker reported problems which happens when crash_kexec_post_notifiers kernel option is enabled (https://lkml.org/lkml/2015/6/24/44). In that case, smp_send_stop() is called before entering kdump routines which assume other CPUs are still online. As the result, for x86, kdump routines fail to save other CPUs' registers and disable virtualization extensions. To fix this problem, call a new kdump friendly function, crash_smp_send_stop(), instead of the smp_send_stop() when crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a weak function, and it just call smp_send_stop(). Architecture codes should override it so that kdump can work appropriately. This patch only provides x86-specific version. For Xen's PV kernel, just keep the current behavior. NOTES: - Right solution would be to place crash_smp_send_stop() before __crash_kexec() invocation in all cases and remove smp_send_stop(), but we can't do that until all architectures implement own crash_smp_send_stop() - crash_smp_send_stop()-like work is still needed by machine_crash_shutdown() because crash_kexec() can be called without entering panic() Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) Link: http://lkml.kernel.org/r/20160810080948.11028.15344.stgit@sysi4-13.yrl.intra.hitachi.co.jp Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Reported-by: Daniel Walker <dwalker@fifo99.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Walker <dwalker@fifo99.com> Cc: Xunlei Pang <xpang@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: "Steven J. Hill" <steven.hill@cavium.com> Cc: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11nvme: use the DMA_ATTR_NO_WARN attributeMauricio Faria de Oliveira1-1/+2
Use the DMA_ATTR_NO_WARN attribute for the dma_map_sg() call of the nvme driver that returns BLK_MQ_RQ_QUEUE_BUSY (not for BLK_MQ_RQ_QUEUE_ERROR). Link: http://lkml.kernel.org/r/1470092390-25451-4-git-send-email-mauricfo@linux.vnet.ibm.com Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Jens Axboe <axboe@fb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11powerpc: implement the DMA_ATTR_NO_WARN attributeMauricio Faria de Oliveira1-2/+4
Add support for the DMA_ATTR_NO_WARN attribute on powerpc iommu code. Link: http://lkml.kernel.org/r/1470092390-25451-3-git-send-email-mauricfo@linux.vnet.ibm.com Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Jens Axboe <axboe@fb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11dma-mapping: introduce the DMA_ATTR_NO_WARN attributeMauricio Faria de Oliveira2-0/+22
Introduce the DMA_ATTR_NO_WARN attribute, and document it. Link: http://lkml.kernel.org/r/1470092390-25451-2-git-send-email-mauricfo@linux.vnet.ibm.com Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Jens Axboe <axboe@fb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11random: remove unused randomize_range()Jason Cooper2-20/+0
All call sites for randomize_range have been updated to use the much simpler and more robust randomize_addr(). Remove the now unnecessary code. Link: http://lkml.kernel.org/r/20160803233913.32511-8-jason@lakedaemon.net Signed-off-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11unicore32: use simpler API for random address requestsJason Cooper1-2/+1
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Link: http://lkml.kernel.org/r/20160803233913.32511-7-jason@lakedaemon.net Signed-off-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11tile: use simpler API for random address requestsJason Cooper1-2/+1
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Link: http://lkml.kernel.org/r/20160803233913.32511-6-jason@lakedaemon.net Signed-off-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11arm64: use simpler API for random address requestsJason Cooper1-6/+2
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Link: http://lkml.kernel.org/r/20160803233913.32511-5-jason@lakedaemon.net Signed-off-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Russell King - ARM Linux" <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11ARM: use simpler API for random address requestsJason Cooper1-2/+1
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Link: http://lkml.kernel.org/r/20160803233913.32511-4-jason@lakedaemon.net Signed-off-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Russell King - ARM Linux" <linux@arm.linux.org.uk> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11x86: use simpler API for random address requestsJason Cooper2-6/+2
Currently, all callers to randomize_range() set the length to 0 and calculate end by adding a constant to the start address. We can simplify the API to remove a bunch of needless checks and variables. Use the new randomize_addr(start, range) call to set the requested address. Link: http://lkml.kernel.org/r/20160803233913.32511-3-jason@lakedaemon.net Signed-off-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11random: simplify API for random address requestsJason Cooper2-0/+34
To date, all callers of randomize_range() have set the length to 0, and check for a zero return value. For the current callers, the only way to get zero returned is if end <= start. Since they are all adding a constant to the start address, this is unnecessary. We can remove a bunch of needless checks by simplifying the API to do just what everyone wants, return an address between [start, start + range). While we're here, s/get_random_int/get_random_long/. No current call site is adversely affected by get_random_int(), since all current range requests are < UINT_MAX. However, we should match caller expectations to avoid coming up short (ha!) in the future. All current callers to randomize_range() chose to use the start address if randomize_range() failed. Therefore, we simplify things by just returning the start address on error. randomize_range() will be removed once all callers have been converted over to randomize_addr(). Link: http://lkml.kernel.org/r/20160803233913.32511-2-jason@lakedaemon.net Signed-off-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Roberts, William C" <william.c.roberts@intel.com> Cc: Yann Droneaud <ydroneaud@opteya.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Nick Kralevich <nnk@google.com> Cc: Jeffrey Vander Stoep <jeffv@google.com> Cc: Daniel Cashman <dcashman@android.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11rapidio/rio_cm: use memdup_user() instead of duplicating codeAlexandre Bounine1-10/+5
Fix coccinelle warning about duplicating existing memdup_user function. Link: http://lkml.kernel.org/r/20160811151737.20140-1-alexandre.bounine@idt.com Link: https://lkml.org/lkml/2016/8/11/29 Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com> Cc: Barry Wood <barry.wood@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11ptrace: clear TIF_SYSCALL_TRACE on ptrace detachAles Novak1-1/+2
On __ptrace_detach(), called from do_exit()->exit_notify()-> forget_original_parent()->exit_ptrace(), the TIF_SYSCALL_TRACE in thread->flags of the tracee is not cleared up. This results in the tracehook_report_syscall_* being called (though there's no longer a tracer listening to that) upon its further syscalls. Example scenario - attach "strace" to a running process and kill it (the strace) with SIGKILL. You'll see that the syscall trace hooks are still being called. The clearing of this flag should be moved from ptrace_detach() to __ptrace_detach(). Link: http://lkml.kernel.org/r/1472759493-20554-1-git-send-email-alnovak@suse.cz Signed-off-by: Ales Novak <alnovak@suse.cz> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: cap initial pipe capacity according to pipe-max-size limitMichael Kerrisk (man-pages)1-0/+3
This is a patch that provides behavior that is more consistent, and probably less surprising to users. I consider the change optional, and welcome opinions about whether it should be applied. By default, pipes are created with a capacity of 64 kiB. However, /proc/sys/fs/pipe-max-size may be set smaller than this value. In this scenario, an unprivileged user could thus create a pipe whose initial capacity exceeds the limit. Therefore, it seems logical to cap the initial pipe capacity according to the value of pipe-max-size. The test program shown earlier in this patch series can be used to demonstrate the effect of the change brought about with this patch: # cat /proc/sys/fs/pipe-max-size 1048576 # sudo -u mtk ./test_F_SETPIPE_SZ 1 Initial pipe capacity: 65536 # echo 10000 > /proc/sys/fs/pipe-max-size # cat /proc/sys/fs/pipe-max-size 16384 # sudo -u mtk ./test_F_SETPIPE_SZ 1 Initial pipe capacity: 16384 # ./test_F_SETPIPE_SZ 1 Initial pipe capacity: 65536 The last two executions of 'test_F_SETPIPE_SZ' show that pipe-max-size caps the initial allocation for a new pipe for unprivileged users, but not for privileged users. Link: http://lkml.kernel.org/r/31dc7064-2a17-9c5b-1df1-4e3012ee992c@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: make account_pipe_buffers() return a value, and use itMichael Kerrisk (man-pages)1-18/+18
This is an optional patch, to provide a small performance improvement. Alter account_pipe_buffers() so that it returns the new value in user->pipe_bufs. This means that we can refactor too_many_pipe_buffers_soft() and too_many_pipe_buffers_hard() to avoid the costs of repeated use of atomic_long_read() to get the value user->pipe_bufs. Link: http://lkml.kernel.org/r/93e5f193-1e5e-3e1f-3a20-eae79b7e1310@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: fix limit checking in alloc_pipe_info()Michael Kerrisk (man-pages)1-7/+13
The limit checking in alloc_pipe_info() (used by pipe(2) and when opening a FIFO) has the following problems: (1) When checking capacity required for the new pipe, the checks against the limit in /proc/sys/fs/pipe-user-pages-{soft,hard} are made against existing consumption, and exclude the memory required for the new pipe capacity. As a consequence: (1) the memory allocation throttling provided by the soft limit does not kick in quite as early as it should, and (2) the user can overrun the hard limit. (2) As currently implemented, accounting and checking against the limits is done as follows: (a) Test whether the user has exceeded the limit. (b) Make new pipe buffer allocation. (c) Account new allocation against the limits. This is racey. Multiple processes may pass point (a) simultaneously, and then allocate pipe buffers that are accounted for only in step (c). The race means that the user's pipe buffer allocation could be pushed over the limit (by an arbitrary amount, depending on how unlucky we were in the race). [Thanks to Vegard Nossum for spotting this point, which I had missed.] This patch addresses the above problems as follows: * Alter the checks against limits to include the memory required for the new pipe. * Re-order the accounting step so that it precedes the buffer allocation. If the accounting step determines that a limit has been reached, revert the accounting and cause the operation to fail. Link: http://lkml.kernel.org/r/8ff3e9f9-23f6-510c-644f-8e70cd1c0bd9@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: simplify logic in alloc_pipe_info()Michael Kerrisk (man-pages)1-22/+23
Replace an 'if' block that covers most of the code in this function with a 'goto'. This makes the code a little simpler to read, and also simplifies the next patch (fix limit checking in alloc_pipe_info()) Link: http://lkml.kernel.org/r/aef030c1-0257-98a9-4988-186efa48530c@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: fix limit checking in pipe_set_size()Michael Kerrisk (man-pages)1-10/+31
The limit checking in pipe_set_size() (used by fcntl(F_SETPIPE_SZ)) has the following problems: (1) When increasing the pipe capacity, the checks against the limits in /proc/sys/fs/pipe-user-pages-{soft,hard} are made against existing consumption, and exclude the memory required for the increased pipe capacity. The new increase in pipe capacity can then push the total memory used by the user for pipes (possibly far) over a limit. This can also trigger the problem described next. (2) The limit checks are performed even when the new pipe capacity is less than the existing pipe capacity. This can lead to problems if a user sets a large pipe capacity, and then the limits are lowered, with the result that the user will no longer be able to decrease the pipe capacity. (3) As currently implemented, accounting and checking against the limits is done as follows: (a) Test whether the user has exceeded the limit. (b) Make new pipe buffer allocation. (c) Account new allocation against the limits. This is racey. Multiple processes may pass point (a) simultaneously, and then allocate pipe buffers that are accounted for only in step (c). The race means that the user's pipe buffer allocation could be pushed over the limit (by an arbitrary amount, depending on how unlucky we were in the race). [Thanks to Vegard Nossum for spotting this point, which I had missed.] This patch addresses the above problems as follows: * Perform checks against the limits only when increasing a pipe's capacity; an unprivileged user can always decrease a pipe's capacity. * Alter the checks against limits to include the memory required for the new pipe capacity. * Re-order the accounting step so that it precedes the buffer allocation. If the accounting step determines that a limit has been reached, revert the accounting and cause the operation to fail. The program below can be used to demonstrate problems 1 and 2, and the effect of the fix. The program takes one or more command-line arguments. The first argument specifies the number of pipes that the program should create. The remaining arguments are, alternately, pipe capacities that should be set using fcntl(F_SETPIPE_SZ), and sleep intervals (in seconds) between the fcntl() operations. (The sleep intervals allow the possibility to change the limits between fcntl() operations.) Problem 1 ========= Using the test program on an unpatched kernel, we first set some limits: # echo 0 > /proc/sys/fs/pipe-user-pages-soft # echo 1000000000 > /proc/sys/fs/pipe-max-size # echo 10000 > /proc/sys/fs/pipe-user-pages-hard # 40.96 MB Then show that we can set a pipe with capacity (100MB) that is over the hard limit # sudo -u mtk ./test_F_SETPIPE_SZ 1 100000000 Initial pipe capacity: 65536 Loop 1: set pipe capacity to 100000000 bytes F_SETPIPE_SZ returned 134217728 Now set the capacity to 100MB twice. The second call fails (which is probably surprising to most users, since it seems like a no-op): # sudo -u mtk ./test_F_SETPIPE_SZ 1 100000000 0 100000000 Initial pipe capacity: 65536 Loop 1: set pipe capacity to 100000000 bytes F_SETPIPE_SZ returned 134217728 Loop 2: set pipe capacity to 100000000 bytes Loop 2, pipe 0: F_SETPIPE_SZ failed: fcntl: Operation not permitted With a patched kernel, setting a capacity over the limit fails at the first attempt: # echo 0 > /proc/sys/fs/pipe-user-pages-soft # echo 1000000000 > /proc/sys/fs/pipe-max-size # echo 10000 > /proc/sys/fs/pipe-user-pages-hard # sudo -u mtk ./test_F_SETPIPE_SZ 1 100000000 Initial pipe capacity: 65536 Loop 1: set pipe capacity to 100000000 bytes Loop 1, pipe 0: F_SETPIPE_SZ failed: fcntl: Operation not permitted There is a small chance that the change to fix this problem could break user-space, since there are cases where fcntl(F_SETPIPE_SZ) calls that previously succeeded might fail. However, the chances are small, since (a) the pipe-user-pages-{soft,hard} limits are new (in 4.5), and the default soft/hard limits are high/unlimited. Therefore, it seems warranted to make these limits operate more precisely (and behave more like what users probably expect). Problem 2 ========= Running the test program on an unpatched kernel, we first set some limits: # getconf PAGESIZE 4096 # echo 0 > /proc/sys/fs/pipe-user-pages-soft # echo 1000000000 > /proc/sys/fs/pipe-max-size # echo 10000 > /proc/sys/fs/pipe-user-pages-hard # 40.96 MB Now perform two fcntl(F_SETPIPE_SZ) operations on a single pipe, first setting a pipe capacity (10MB), sleeping for a few seconds, during which time the hard limit is lowered, and then set pipe capacity to a smaller amount (5MB): # sudo -u mtk ./test_F_SETPIPE_SZ 1 10000000 15 5000000 & [1] 748 # Initial pipe capacity: 65536 Loop 1: set pipe capacity to 10000000 bytes F_SETPIPE_SZ returned 16777216 Sleeping 15 seconds # echo 1000 > /proc/sys/fs/pipe-user-pages-hard # 4.096 MB # Loop 2: set pipe capacity to 5000000 bytes Loop 2, pipe 0: F_SETPIPE_SZ failed: fcntl: Operation not permitted In this case, the user should be able to lower the limit. With a kernel that has the patch below, the second fcntl() succeeds: # echo 0 > /proc/sys/fs/pipe-user-pages-soft # echo 1000000000 > /proc/sys/fs/pipe-max-size # echo 10000 > /proc/sys/fs/pipe-user-pages-hard # sudo -u mtk ./test_F_SETPIPE_SZ 1 10000000 15 5000000 & [1] 3215 # Initial pipe capacity: 65536 # Loop 1: set pipe capacity to 10000000 bytes F_SETPIPE_SZ returned 16777216 Sleeping 15 seconds # echo 1000 > /proc/sys/fs/pipe-user-pages-hard # Loop 2: set pipe capacity to 5000000 bytes F_SETPIPE_SZ returned 8388608 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x--- /* test_F_SETPIPE_SZ.c (C) 2016, Michael Kerrisk; licensed under GNU GPL version 2 or later Test operation of fcntl(F_SETPIPE_SZ) for setting pipe capacity and interactions with limits defined by /proc/sys/fs/pipe-* files. */ #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <unistd.h> int main(int argc, char *argv[]) { int (*pfd)[2]; int npipes; int pcap, rcap; int j, p, s, stime, loop; if (argc < 2) { fprintf(stderr, "Usage: %s num-pipes " "[pipe-capacity sleep-time]...\n", argv[0]); exit(EXIT_FAILURE); } npipes = atoi(argv[1]); pfd = calloc(npipes, sizeof (int [2])); if (pfd == NULL) { perror("calloc"); exit(EXIT_FAILURE); } for (j = 0; j < npipes; j++) { if (pipe(pfd[j]) == -1) { fprintf(stderr, "Loop %d: pipe() failed: ", j); perror("pipe"); exit(EXIT_FAILURE); } } printf("Initial pipe capacity: %d\n", fcntl(pfd[0][0], F_GETPIPE_SZ)); for (j = 2; j < argc; j += 2 ) { loop = j / 2; pcap = atoi(argv[j]); printf(" Loop %d: set pipe capacity to %d bytes\n", loop, pcap); for (p = 0; p < npipes; p++) { s = fcntl(pfd[p][0], F_SETPIPE_SZ, pcap); if (s == -1) { fprintf(stderr, " Loop %d, pipe %d: F_SETPIPE_SZ " "failed: ", loop, p); perror("fcntl"); exit(EXIT_FAILURE); } if (p == 0) { printf(" F_SETPIPE_SZ returned %d\n", s); rcap = s; } else { if (s != rcap) { fprintf(stderr, " Loop %d, pipe %d: F_SETPIPE_SZ " "unexpected return: %d\n", loop, p, s); exit(EXIT_FAILURE); } } stime = (j + 1 < argc) ? atoi(argv[j + 1]) : 0; if (stime > 0) { printf(" Sleeping %d seconds\n", stime); sleep(stime); } } } exit(EXIT_SUCCESS); } 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x--- Patch history: v2 * Switch order of test in 'if' statement to avoid function call (to capability()) in normal path. [This is a fix to a preexisting wart in the code. Thanks to Willy Tarreau] * Perform (size > pipe_max_size) check before calling account_pipe_buffers(). [Thanks to Vegard Nossum] Quoting Vegard: The potential problem happens if the user passes a very large number which will overflow pipe->user->pipe_bufs. On 32-bit, sizeof(int) == sizeof(long), so if they pass arg = INT_MAX then round_pipe_size() returns INT_MAX. Although it's true that the accounting is done in terms of pages and not bytes, so you'd need on the order of (1 << 13) = 8192 processes hitting the limit at the same time in order to make it overflow, which seems a bit unlikely. (See https://lkml.org/lkml/2016/8/12/215 for another discussion on the limit checking) Link: http://lkml.kernel.org/r/1e464945-536b-2420-798b-e77b9c7e8593@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: refactor argument for account_pipe_buffers()Michael Kerrisk (man-pages)1-5/+5
This is a preparatory patch for following work. account_pipe_buffers() performs accounting in the 'user_struct'. There is no need to pass a pointer to a 'pipe_inode_info' struct (which is then dereferenced to obtain a pointer to the 'user' field). Instead, pass a pointer directly to the 'user_struct'. This change is needed in preparation for a subsequent patch that the fixes the limit checking in alloc_pipe_info() (and the resulting code is a little more logical). Link: http://lkml.kernel.org/r/7277bf8c-a6fc-4a7d-659c-f5b145c981ab@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: move limit checking logic into pipe_set_size()Michael Kerrisk (man-pages)1-23/+18
This is a preparatory patch for following work. Move the F_SETPIPE_SZ limit-checking logic from pipe_fcntl() into pipe_set_size(). This simplifies the code a little, and allows for reworking required in a later patch that fixes the limit checking in pipe_set_size() Link: http://lkml.kernel.org/r/3701b2c5-2c52-2c3e-226d-29b9deb29b50@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11pipe: relocate round_pipe_size() above pipe_set_size()Michael Kerrisk (man-pages)1-12/+12
Patch series "pipe: fix limit handling", v2. When changing a pipe's capacity with fcntl(F_SETPIPE_SZ), various limits defined by /proc/sys/fs/pipe-* files are checked to see if unprivileged users are exceeding limits on memory consumption. While documenting and testing the operation of these limits I noticed that, as currently implemented, these checks have a number of problems: (1) When increasing the pipe capacity, the checks against the limits in /proc/sys/fs/pipe-user-pages-{soft,hard} are made against existing consumption, and exclude the memory required for the increased pipe capacity. The new increase in pipe capacity can then push the total memory used by the user for pipes (possibly far) over a limit. This can also trigger the problem described next. (2) The limit checks are performed even when the new pipe capacity is less than the existing pipe capacity. This can lead to problems if a user sets a large pipe capacity, and then the limits are lowered, with the result that the user will no longer be able to decrease the pipe capacity. (3) As currently implemented, accounting and checking against the limits is done as follows: (a) Test whether the user has exceeded the limit. (b) Make new pipe buffer allocation. (c) Account new allocation against the limits. This is racey. Multiple processes may pass point (a) simultaneously, and then allocate pipe buffers that are accounted for only in step (c). The race means that the user's pipe buffer allocation could be pushed over the limit (by an arbitrary amount, depending on how unlucky we were in the race). [Thanks to Vegard Nossum for spotting this point, which I had missed.] This patch series addresses these three problems. This patch (of 8): This is a minor preparatory patch. After subsequent patches, round_pipe_size() will be called from pipe_set_size(), so place round_pipe_size() above pipe_set_size(). Link: http://lkml.kernel.org/r/91a91fdb-a959-ba7f-b551-b62477cc98a1@gmail.com Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: <socketpair@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jens Axboe <axboe@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: refactor ioctl fn vector in iookup_dev_ioctl()Tomohiro Kusumi1-33/+16
cmd part of this struct is the same as an index of itself within _ioctls[]. In fact this cmd is unused, so we can drop this part. Link: http://lkml.kernel.org/r/20160831033414.9910.66697.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: remove possibly misleading /* #define DEBUG */Tomohiro Kusumi1-2/+0
Having this in autofs_i.h gives illusion that uncommenting this enables pr_debug(), but it doesn't enable all the pr_debug() in autofs because inclusion order matters. XFS has the same DEBUG macro in its core header fs/xfs/xfs.h, however XFS seems to have a rule to include this prior to other XFS headers as well as kernel headers. This is not the case with autofs, and DEBUG could be enabled via Makefile, so autofs should just get rid of this comment to make the code less confusing. It's a comment, so there is literally no functional difference. Link: http://lkml.kernel.org/r/20160831033409.9910.77067.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs4: move linux/auto_dev-ioctl.h to uapi/linuxIan Kent2-208/+222
Since linux/auto_dev-ioctl.h wasn't included in include/linux/Kbuild it wasn't moved to uapi/linux as part of the uapi series. Link: http://lkml.kernel.org/r/20160812024901.12352.10984.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: move inclusion of linux/limits.h to uapiTomohiro Kusumi2-1/+1
linux/limits.h should be included by uapi instead of linux/auto_fs.h so as not to cause compile error in userspace. # cat << EOF > ./test1.c > #include <stdio.h> > #include <linux/auto_fs.h> > int main(void) { > return 0; > } > EOF # gcc -Wall -g ./test1.c In file included from ./test1.c:2:0: /usr/include/linux/auto_fs.h:54:12: error: 'NAME_MAX' undeclared here (not in a function) char name[NAME_MAX+1]; ^ Link: http://lkml.kernel.org/r/20160812024856.12352.24092.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: fix print format for ioctl warning messageTomohiro Kusumi1-1/+1
All other warnings use "cmd(0x%08x)" and this is the only one with "cmd(%d)". (below comes from my userspace debug program, but not automount daemon) [ 1139.905676] autofs4:pid:1640:check_dev_ioctl_version: ioctl control interface version mismatch: kernel(1.0), user(0.0), cmd(-1072131215) Link: http://lkml.kernel.org/r/20160812024851.12352.75458.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: add autofs_dev_ioctl_version() for AUTOFS_DEV_IOCTL_VERSION_CMDIan Kent1-8/+17
No functional changes, based on the following justification. 1. Make the code more consistent using the ioctl vector _ioctls[], rather than assigning NULL only for this ioctl command. 2. Remove goto done; for better maintainability in the long run. 3. The existing code is based on the fact that validate_dev_ioctl() sets ioctl version for any command, but AUTOFS_DEV_IOCTL_VERSION_CMD should explicitly set it regardless of the default behavior. Link: http://lkml.kernel.org/r/20160812024846.12352.9885.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: fix dev ioctl number range checkIan Kent2-2/+3
The count of miscellaneous device ioctls in fs/autofs4/autofs_i.h is wrong. The number of ioctls is the difference between AUTOFS_DEV_IOCTL_VERSION_CMD and AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD (14) not the difference between AUTOFS_IOC_COUNT and 11 (21). [kusumi.tomohiro@gmail.com: fix typo that made the count macro negative] Link: http://lkml.kernel.org/r/20160831033420.9910.16809.stgit@pluto.themaw.net Link: http://lkml.kernel.org/r/20160812024841.12352.11975.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: fix pr_debug() messageTomohiro Kusumi1-1/+1
This isn't a return value, so change the message to indicate the status is the result of may_umount(). (or locate pr_debug() after put_user() with the same message) Link: http://lkml.kernel.org/r/20160812024836.12352.74628.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: update struct autofs_dev_ioctl in DocumentationTomohiro Kusumi1-28/+42
Sync with changes made by commit 730c9eeca980 ("autofs4: improve parameter usage") which introduced an union for various ioctl commands instead of having statically named arg1,2. This commit simply replaces arg1,2 with the corresponding fields without changing semantics. Link: http://lkml.kernel.org/r/20160812024831.12352.24667.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: fix Documentation regarding devid on ioctlTomohiro Kusumi1-3/+2
The explanation on how ioctl handles devid seems incorrect. Userspace who calls this ioctl has no input regarding devid, and ioctl implementation retrieves devid via superblock. Link: http://lkml.kernel.org/r/20160812024825.12352.13486.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: remove AUTOFS_DEVID_LENTomohiro Kusumi1-2/+0
This macro was never used by neither kernel nor userspace, and also doesn't represent "devid length" in bytes. (unless it was added to mean something else). Link: http://lkml.kernel.org/r/20160812024820.12352.21210.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: don't fail to free_dev_ioctl(param)Tomohiro Kusumi1-1/+2
Returning -ENOTTY here fails to free dynamically allocated param. Link: http://lkml.kernel.org/r/20160812024815.12352.69153.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: remove obsolete sb fieldsTomohiro Kusumi1-2/+0
These two were left from commit aa55ddf340c9 ("autofs4: remove unused ioctls") which removed unused ioctls. Link: http://lkml.kernel.org/r/20160812024810.12352.96377.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: use autofs4_free_ino() to kfree dentry dataTomohiro Kusumi1-1/+1
kfree dentry data allocated by autofs4_new_ino() with autofs4_free_ino() instead of raw kfree. (since we have the interface to free autofs_info*) This patch was modified to remove the need to set the dentry info field to NULL dew to a change in the previous patch. Link: http://lkml.kernel.org/r/20160812024805.12352.43650.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: remove ino free in autofs4_dir_symlink()Ian Kent1-2/+0
The inode allocation failure case in autofs4_dir_symlink() frees the autofs dentry info of the dentry without setting ->d_fsdata to NULL. That could lead to a double free so just get rid of the free and leave it to ->d_release(). Link: http://lkml.kernel.org/r/20160812024759.12352.10653.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: add WARN_ON(1) for non dir/link inode caseTomohiro Kusumi1-1/+2
It's invalid if the given mode is neither dir nor link, so warn on else case. Link: http://lkml.kernel.org/r/20160812024754.12352.8536.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: fix autofs4_fill_super() error exit handlingIan Kent1-3/+3
Somewhere along the line the error handling gotos have become incorrect. Link: http://lkml.kernel.org/r/20160812024749.12352.15100.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: test autofs versions first on sb initializationTomohiro Kusumi1-17/+17
This patch does what the below comment says. It could be and it's considered better to do this first before various functions get called during initialization. /* Couldn't this be tested earlier? */ Link: http://lkml.kernel.org/r/20160812024744.12352.43075.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11autofs: drop unnecessary extern in autofs_i.hTomohiro Kusumi1-1/+1
autofs4_kill_sb() doesn't need to be declared as extern, and no other functions in .h are explicitly declared as extern. Link: http://lkml.kernel.org/r/20160812024739.12352.99354.stgit@pluto.themaw.net Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>