linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2012-03-24	USB: sa1111: add hcd .reset method	Russell King	1	-9/+11
	Add the .reset method to the HCD, and update the .start method accordingly for this change. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	USB: sa1111: add OHCI shutdown methods	Russell King	1	-0/+12
	Add OHCI shutdown methods to cleanly shut down the OHCI controller on system shutdowns and reboots. This avoids the controller continuing to run should we soft-reboot the platform, potentially scribbling over system memory. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	USB: sa1111: reorganize ohci-sa1111.c	Russell King	1	-134/+93
	Combine usb_hcd_sa1111_probe() and ohci_hcd_sa1111_drv_probe(), doing the same for the remove methods. Move sa1111_start_hc and sa1111_stop_hc to be located next to the probe/release functions, as they're only called from them. Get rid of the /----/ breaker lines. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	USB: sa1111: get rid of nasty printk(KERN_DEBUG "%s: ...", __FILE__)	Russell King	1	-4/+2
	Use dev_dbg() instead, it's more friendly. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	USB: sa1111: sparse and checkpatch cleanups	Russell King	1	-16/+17
	Clean up the ohci-sa1111 driver formatting to be more compliant with current standards, and add 'static' to various function definitions to avoid sparse complaints about undeclared functions. Remove the unnecessary local declaration of 'usb_disabled', which can be found instead in linux/usb.h. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa11x0: don't static map sa1111	Russell King	3	-15/+0
	The sa1111 support will ioremap() the device; there is no need for platforms to setup a static mapping for this. Remove the static mapping for this device from badge4, jornada720 and neponset. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: use dev_err() rather than printk()	Russell King	1	-1/+1
	Use dev_err() to report device specific errors rather than printk(). Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: cleanup sub-device registration and unregistration	Russell King	1	-11/+16
	Move the releasing of resources out of the release function - this allows a cleaner and more conventional arrangement of the registration failure paths and a saner unregistration process for these devices. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: only setup DMA for DMA capable devices	Russell King	1	-3/+6
	It's pointless registering the PS/2 interfaces with the dmabounce code when there's no DMA support for these in hardware, so only setup the DMA masks for two subdevices which support DMA - the OHCI and SAC. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: register sa1111 devices with dmabounce in bus notifier	Russell King	1	-55/+74
	Use the bus notifier to register sa1111 devices with dmabounce, rather than after the device has been registered, potentially racing with driver binding. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: move USB interface register definitions to ohci-sa1111.c	Russell King	2	-30/+25
	Move the USB interface register definitions into the driver, rather than keeping them in a common place. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: move PCMCIA interface register definitions to sa1111_generic.c	Russell King	2	-48/+41
	Move the PCMCIA interface register definitions into the driver, rather than keeping them in a common place. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: move PS/2 interface register definitions to sa1111p2.c	Russell King	2	-54/+37
	Move the PS/2 interface register definitions into the driver, rather than keeping them in a common location. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: delete unused physical GPIO register definitions	Russell King	1	-16/+0
	Get rid of the unused GPIO register definitions - we access GPIO registers through the base + offset method, and having the phys address definitions is unnecessary duplication. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: sa1111: provide a generic way to prevent devices from registering	Russell King	6	-5/+7
	Some platforms don't want certain devices to be registered, because, eg, the interface is not wired. Provide a way for platforms to prevent various devices from being registered via a devid bitmask in the platform data. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: ecard: get rid of NO_IRQ madness	Russell King	4	-10/+6
	Get rid of the NO_IRQ madness from Acorn expansion card handling code. Thankfully, there are relatively few users of this here, and so it's easy to audit. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: use DEFINE_RES_xxx()	Russell King	1	-29/+7
	Use DEFINE_RES_xxx() to define device resources. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: remove expansion card irq mask register	Russell King	1	-101/+2
	This register is only present on older platforms, and not on RiscPC, so let's remove this unused support. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: convert ecard to use irq_alloc_descs()	Russell King	1	-9/+13
	Use irq_alloc_descs() to allocate IRQs for expansion cards. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: use irq chip data in ecard.c	Russell King	1	-2/+3
	Use irq chip data to store the expansion card data pointer, rather than converting from the interrupt number to a slot number. This allows the interrupt chip methods to avoid knowing about interrupt numbering. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: move ecard.c to arch/arm/mach-rpc	Russell King	4	-2/+1
	RiscPC is the only platform using the Acorn expansion card support, so move it into its mach-* directory. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: remove IRQ_TIMER	Russell King	2	-3/+1
	Use IRQ_TIMER0 instead, which is the same thing. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: use definition for serial port interrupt	Russell King	1	-1/+1
	Rather than using a plain integer, use the definition already provided. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: pass IRQ resources into keyboard driver	Russell King	2	-7/+44
	Rather than including asm/irq.h into the keyboard driver, pass the IRQ numbers via the platform device instead. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24	ARM: riscpc: move time-acorn.c to mach-rpc	Russell King	5	-6/+1
	Nothing but RiscPC makes use of the Acorn timekeeping code, so move it into mach-rpc. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-23	seq_file: add seq_set_overflow(), seq_overflow()	KAMEZAWA Hiroyuki	1	-10/+26
	It is undocumented but a seq_file's overflow state is indicated by m->count == m->size. Add seq_set_overflow() and seq_overflow() to set/check overflow status explicitly. Based on an idea from Eric Dumazet. [akpm@linux-foundation.org: tweak code comment] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
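	The overflow convention described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: struct demo_seq and the demo_* functions are hypothetical stand-ins for struct seq_file and the new helpers, and only the rule "overflow means count == size" is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct seq_file. */
struct demo_seq {
    char  *buf;    /* output buffer */
    size_t size;   /* capacity of buf */
    size_t count;  /* bytes written so far */
};

/* Overflow is encoded as count == size, as the commit message states. */
static bool demo_seq_overflow(const struct demo_seq *m)
{
    return m->count == m->size;
}

/* Mark the buffer as overflowed explicitly. */
static void demo_seq_set_overflow(struct demo_seq *m)
{
    m->count = m->size;
}

/* Append s; returns 0 on success, -1 (and marks overflow) if it does not fit. */
static int demo_seq_puts(struct demo_seq *m, const char *s)
{
    size_t len = strlen(s);

    assert(m->count <= m->size);
    /* Strictly less than the free space, so a completely full buffer
     * is always reported as overflowed. */
    if (len < m->size - m->count) {
        memcpy(m->buf + m->count, s, len);
        m->count += len;
        return 0;
    }
    demo_seq_set_overflow(m);
    return -1;
}
```

	Giving the state a name means a caller checks demo_seq_overflow(m) instead of open-coding the count == size comparison.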
2012-03-23	proc-ns: use d_set_d_op() API to set dentry ops in proc_ns_instantiate().	Pravin B Shelar	1	-1/+1
	The namespace cleanup path leaks a dentry which holds a reference count on a network namespace, keeping that network namespace from being freed when the last user goes away and leaving things like vlan devices in the leaked network namespace. If you use ip netns add for much real work this problem becomes apparent pretty quickly. In light testing the problem hides because frequently you simply don't notice the leak. Use d_set_d_op() so that DCACHE_OP_* flags are set correctly. This issue exists back to 3.0. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Reported-by: Justin Pettit <jpettit@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Cc: David Miller <davem@davemloft.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	procfs: speed up /proc/pid/stat, statm	KAMEZAWA Hiroyuki	3	-56/+86
	Process accounting applications such as top and ps visit some files under /proc/<pid>. With seq_put_decimal_ull(), we can optimize /proc/<pid>/stat and /proc/<pid>/statm files. This patch adds - seq_put_decimal_ll() for signed values. - allow delimiter == 0. - convert seq_printf() to seq_put_decimal_ull/ll in /proc/stat, statm. Test result on a system with 2000+ procs. Before patch: [kamezawa@bluextal test]$ top -b -n 1 | wc -l 2223 [kamezawa@bluextal test]$ time top -b -n 1 > /dev/null real 0m0.675s user 0m0.044s sys 0m0.121s [kamezawa@bluextal test]$ time ps -elf > /dev/null real 0m0.236s user 0m0.056s sys 0m0.176s After patch: [kamezawa@bluextal ~]$ time top -b -n 1 > /dev/null real 0m0.657s user 0m0.052s sys 0m0.100s [kamezawa@bluextal ~]$ time ps -elf > /dev/null real 0m0.198s user 0m0.050s sys 0m0.145s Considering that top and ps tend to scan /proc periodically, this will reduce cpu consumption by top/ps to some extent. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	procfs: add num_to_str() to speed up /proc/stat	KAMEZAWA Hiroyuki	5	-29/+84
	== stat_check.py num = 0 with open("/proc/stat") as f: while num < 1000 : data = f.read() f.seek(0, 0) num = num + 1 == perf shows 20.39% stat_check.py [kernel.kallsyms] [k] format_decode 13.41% stat_check.py [kernel.kallsyms] [k] number 12.61% stat_check.py [kernel.kallsyms] [k] vsnprintf 10.85% stat_check.py [kernel.kallsyms] [k] memcpy 4.85% stat_check.py [kernel.kallsyms] [k] radix_tree_lookup 4.43% stat_check.py [kernel.kallsyms] [k] seq_printf This patch removes most calls to vsnprintf() by adding num_to_str() and seq_put_decimal_ull(), which print decimal numbers without the rich functions provided by printf(). On my 8cpu box. == Before patch == [root@bluextal test]# time ./stat_check.py real 0m0.150s user 0m0.026s sys 0m0.121s == After patch == [root@bluextal test]# time ./stat_check.py real 0m0.055s user 0m0.022s sys 0m0.030s [akpm@linux-foundation.org: remove incorrect comment, use less stack in num_to_str(), move comment from .h to .c, simplify seq_put_decimal_ull()] [andrea@betterlinux.com: avoid breaking the ABI in /proc/stat] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrea Righi <andrea@betterlinux.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Turner <pjt@google.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
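	To illustrate why a dedicated decimal printer is cheaper than the vsnprintf() path, here is a rough userspace sketch of the idea. It is not the kernel's num_to_str(): the name demo_num_to_str, its signature, and the "return 0 when the buffer is too small" convention are assumptions made for this example, and it assumes a 64-bit unsigned long long.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: write the decimal digits of num into buf without a
 * NUL terminator. Returns the number of characters written, or 0 if the
 * digits do not fit in size bytes. No format string is parsed at all.
 */
static size_t demo_num_to_str(char *buf, size_t size, unsigned long long num)
{
    char tmp[20];   /* 2^64 - 1 has 20 decimal digits */
    size_t len = 0;

    /* Generate digits least-significant first. */
    do {
        tmp[len++] = (char)('0' + num % 10);
        num /= 10;
    } while (num != 0);

    if (len > size)
        return 0;

    /* Reverse into the caller's buffer. */
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[len - 1 - i];
    return len;
}
```

	The whole conversion is a divide loop plus a short copy, so the format_decode, number and vsnprintf entries in the perf profile above have no counterpart on this path.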
2012-03-23	proc: speed up /proc/stat handling	Eric Dumazet	1	-2/+5
	On a typical 16 cpus machine, "cat /proc/stat" gives more than 4096 bytes, and is slow : # strace -T -o /tmp/STRACE cat /proc/stat | wc -c 5826 # grep "cpu " /tmp/STRACE read(0, "cpu 1949310 19 2144714 12117253"..., 32768) = 5826 <0.001504> That's partly because show_stat() must be called twice since the initial buffer size is too small (4096 bytes for less than 32 possible cpus). Fix this by : 1) Taking into account nr_irqs in the initial buffer sizing. 2) Using ksize() to allow better filling of initial buffer. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	fs/proc/kcore.c: make get_sparsemem_vmemmap_info() static	Djalal Harouni	1	-2/+4
	get_sparsemem_vmemmap_info() is only used inside fs/proc/kcore.c Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP	Jason Baron	8	-0/+32
	Since we no longer need the VM_ALWAYSDUMP flag, let's use the freed bit for 'VM_NODUMP' flag. The idea is to add a new madvise() flag: MADV_DONTDUMP, which can be set by applications to specifically request memory regions which should not dump core. The specific application I have in mind is qemu: we can add a flag there that wouldn't dump all of guest memory when qemu dumps core. This flag might also be useful for security sensitive apps that want to absolutely make sure that parts of memory are not dumped. To clear the flag use: MADV_DODUMP. [akpm@linux-foundation.org: s/MADV_NODUMP/MADV_DONTDUMP/, s/MADV_CLEAR_NODUMP/MADV_DODUMP/, per Roland] [akpm@linux-foundation.org: fix up the architectures which broke] Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Avi Kivity <avi@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
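	From userspace, the new flags would be used roughly as follows. This is a sketch assuming a Linux system whose kernel and C library headers already provide MADV_DONTDUMP and MADV_DODUMP; the helper names demo_alloc_nodump and demo_allow_dump are hypothetical and not part of any API.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory and ask the kernel
 * to leave the region out of core dumps. Returns NULL on failure.
 */
static void *demo_alloc_nodump(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Exclude the region from core dumps. */
    if (madvise(p, len, MADV_DONTDUMP) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/* Hypothetical helper: clear the flag so the region is dumped again. */
static int demo_allow_dump(void *p, size_t len)
{
    return madvise(p, len, MADV_DODUMP);
}
```

	A program in the position the message describes for qemu could allocate its large guest-memory region this way, while a security sensitive program could do the same for buffers holding secrets.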
2012-03-23	coredump: remove VM_ALWAYSDUMP flag	Jason Baron	15	-69/+40
	The motivation for this patchset was that I was looking at a way for a qemu-kvm process, to exclude the guest memory from its core dump, which can be quite large. There are already a number of filter flags in /proc/<pid>/coredump_filter, however, these allow one to specify 'types' of kernel memory, not specific address ranges (which is needed in this case). Since there are no more vma flags available, the first patch eliminates the need for the 'VM_ALWAYSDUMP' flag. The flag is used internally by the kernel to mark vdso and vsyscall pages. However, it is simple enough to check if a vma covers a vdso or vsyscall page without the need for this flag. The second patch then replaces the 'VM_ALWAYSDUMP' flag with a new 'VM_NODUMP' flag, which can be set by userspace using new madvise flags: 'MADV_DONTDUMP', and unset via 'MADV_DODUMP'. The core dump filters continue to work the same as before unless 'MADV_DONTDUMP' is set on the region. The qemu code which implements this feature is at: http://people.redhat.com/~jbaron/qemu-dump/qemu-dump.patch In my testing the qemu core dump shrunk from 383MB -> 13MB with this patch. I also believe that the 'MADV_DONTDUMP' flag might be useful for security sensitive apps, which might want to select which areas are dumped. This patch: The VM_ALWAYSDUMP flag is currently used by the coredump code to indicate that a vma is part of a vsyscall or vdso section. However, we can determine if a vma is in one of these sections by checking it against the gate_vma and checking for a non-NULL return value from arch_vma_name(). Thus, freeing a valuable vma bit. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	kmod: make __request_module() killable	Oleg Nesterov	1	-2/+24
	As Tetsuo Handa pointed out, request_module() can stress the system while the oom-killed caller sleeps in TASK_UNINTERRUPTIBLE. The task T uses "almost all" memory, then it does something which triggers request_module(). Say, it can simply call sys_socket(). This in turn needs more memory and leads to OOM. oom-killer correctly chooses T and kills it, but this can't help because it sleeps in TASK_UNINTERRUPTIBLE and after that oom-killer becomes "disabled" by the TIF_MEMDIE task T. Make __request_module() killable. The only necessary change is that call_modprobe() should kmalloc argv and module_name, they can't live in the stack if we use UMH_KILLABLE. This memory is freed via call_usermodehelper_freeinfo()->cleanup. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	kmod: introduce call_modprobe() helper	Oleg Nesterov	1	-8/+16
	No functional changes. Move the call_usermodehelper code from __request_module() into the new simple helper, call_modprobe(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	usermodehelper: ____call_usermodehelper() doesn't need do_exit()	Oleg Nesterov	1	-1/+1
	Minor cleanup. ____call_usermodehelper() can simply return, no need to call do_exit() explicitly. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	usermodehelper: kill umh_wait, renumber UMH_* constants	Oleg Nesterov	3	-18/+10
	No functional changes. It is not sane to use UMH_KILLABLE with enum umh_wait, but obviously we do not want another argument in call_usermodehelper_* helpers. Kill this enum, use the plain int. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	usermodehelper: implement UMH_KILLABLE	Oleg Nesterov	2	-2/+27
	Implement UMH_KILLABLE, should be used along with UMH_WAIT_EXEC/PROC. The caller must ensure that subprocess_info->path/etc can not go away until call_usermodehelper_freeinfo(). call_usermodehelper_exec(UMH_KILLABLE) does wait_for_completion_killable. If it fails, it uses xchg(&sub_info->complete, NULL) to serialize with umh_complete() which does the same xchg() to access sub_info->complete. If call_usermodehelper_exec wins, it can safely return. umh_complete() should get NULL and call call_usermodehelper_freeinfo(). Otherwise we know that umh_complete() was already called, in this case call_usermodehelper_exec() falls back to wait_for_completion() which should succeed "very soon". Note: UMH_NO_WAIT == -1 but it obviously should not be used with UMH_KILLABLE. We delay the necessary cleanup to simplify the back porting. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
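	The hand-off between the killed waiter and the helper can be modelled with C11 atomics. This is a simplified userspace sketch of the race resolution described above, not the kernel implementation: the demo_* types and functions are hypothetical, atomic_exchange() stands in for xchg(), and a plain flag stands in for call_usermodehelper_freeinfo().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel completion. */
struct demo_completion {
    int done;
};

/* Hypothetical stand-in for struct subprocess_info. */
struct demo_sub_info {
    _Atomic(struct demo_completion *) complete;
    int freed;   /* set when the helper side ran the cleanup */
};

/*
 * Waiter side, called after a killable wait was interrupted.
 * Returns 1 if the waiter won the race and may return right away,
 * 0 if the helper already completed and an ordinary wait must follow.
 */
static int demo_waiter_abandon(struct demo_sub_info *info)
{
    return atomic_exchange(&info->complete, NULL) != NULL;
}

/* Helper side: whoever exchanges the pointer first decides the outcome. */
static void demo_umh_complete(struct demo_sub_info *info)
{
    struct demo_completion *comp = atomic_exchange(&info->complete, NULL);

    if (comp)
        comp->done = 1;    /* waiter still present: signal completion */
    else
        info->freed = 1;   /* waiter already gone: helper cleans up */
}
```

	Because both sides exchange the same pointer with NULL, exactly one of them sees the non-NULL value, so the completion is never touched after the waiter has returned.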
2012-03-23	usermodehelper: introduce umh_complete(sub_info)	Oleg Nesterov	1	-2/+7
	Preparation. Add the new trivial helper, umh_complete(). Currently it simply does complete(sub_info->complete). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	usermodehelper: use UMH_WAIT_PROC consistently	Oleg Nesterov	5	-6/+6
	A few call_usermodehelper() callers use the hardcoded constant instead of the proper UMH_WAIT_PROC, fix them. Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Januszewski <spock@gentoo.org> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	signal: zap_pid_ns_processes: s/SEND_SIG_NOINFO/SEND_SIG_FORCED/	Oleg Nesterov	1	-6/+2
	Change zap_pid_ns_processes() to use SEND_SIG_FORCED, it looks more clear compared to SEND_SIG_NOINFO which relies on the from_ancestor_ns logic in send_signal(). It is also more efficient if we need to kill a lot of tasks because it doesn't alloc sigqueue. While at it, add the __fatal_signal_pending(task) check as a minor optimization. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	signal: oom_kill_task: use SEND_SIG_FORCED instead of force_sig()	Oleg Nesterov	1	-2/+2
	Change oom_kill_task() to use do_send_sig_info(SEND_SIG_FORCED) instead of force_sig(SIGKILL). With the recent changes we do not need force_ to kill the CLONE_NEWPID tasks. And this is more correct. force_sig() can race with the exiting thread even if oom_kill_task() checks p->mm != NULL, while do_send_sig_info(group => true) kills the whole process. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	signal: cosmetic, s/from_ancestor_ns/force/ in prepare_signal() paths	Oleg Nesterov	1	-8/+7
	Cosmetic, rename the from_ancestor_ns argument in prepare_signal() paths. After the previous change it doesn't match the reality. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	signal: give SEND_SIG_FORCED more power to beat SIGNAL_UNKILLABLE	Oleg Nesterov	1	-1/+2
	force_sig_info() and friends have special semantics for synchronous signals; this interface should not be used if the target is not current. And it needs fixes, in particular the clearing of SIGNAL_UNKILLABLE is not exactly right. However there are callers which have to use force_ exactly because it clears SIGNAL_UNKILLABLE and thus it can kill the CLONE_NEWPID tasks, although this is almost always wrong for various reasons. With this patch SEND_SIG_FORCED ignores SIGNAL_UNKILLABLE, like we do if the signal comes from the ancestor namespace. This makes the naming in prepare_signal() paths insane, fixed by the next cleanup. Note: this only affects SIGKILL/SIGSTOP, but this is enough for force_sig() abusers. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	Hexagon: use set_current_blocked() and block_sigmask()	Matt Fleming	1	-10/+2
	As described in e6fa16ab9c1e ("signal: sigprocmask() should do retarget_shared_pending()") the modification of current->blocked is incorrect as we need to check whether the signal we're about to block is pending in the shared queue. Also, use the new helper function introduced in commit 5e6292c0f28f ("signal: add block_sigmask() for adding sigmask to current->blocked") which centralises the code for updating current->blocked after successfully delivering a signal and reduces the amount of duplicate code across architectures. In the past some architectures got this code wrong, so using this helper function should stop that from happening again. Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Richard Kuo <rkuo@codeaurora.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	ptrace: remove PTRACE_SEIZE_DEVEL bit	Denys Vlasenko	2	-19/+1
	PTRACE_SEIZE code is tested and ready for production use, remove the code which requires special bit in data argument to make PTRACE_SEIZE work. Strace team prepares for a new release of strace, and we would like to ship the code which uses PTRACE_SEIZE, preferably after this change goes into released kernel. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Alves <palves@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	ptrace: renumber PTRACE_EVENT_STOP so that future new options and events can match	Denys Vlasenko	1	-1/+2
	PTRACE_EVENT_foo and PTRACE_O_TRACEfoo used to match. New PTRACE_EVENT_STOP is the first event which has no corresponding PTRACE_O_TRACE option. If we will ever want to add another such option, its PTRACE_EVENT's value will collide with PTRACE_EVENT_STOP's value. This patch changes PTRACE_EVENT_STOP value to prevent this. While at it, added a comment - the one atop PTRACE_EVENT block, saying "Wait extended result codes for the above trace options", is not true for PTRACE_EVENT_STOP. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Cc: Tejun Heo <tj@kernel.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Alves <palves@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	ptrace: make PTRACE_SEIZE set ptrace options specified in 'data' parameter	Denys Vlasenko	1	-10/+21
	This can be used to close a few corner cases in strace where we get unwanted racy behavior after attach, but before we have a chance to set options (the notorious post-execve SIGTRAP comes to mind), and removes the need to track "did we set opts for this task" state in strace internals. While we are at it: Make it possible to extend SEIZE in the future with more functionality by passing non-zero 'addr' parameter. To that end, error out if 'addr' is non-zero. PTRACE_ATTACH did not (and still does not) have such check, and users (strace) do pass garbage there... let's avoid repeating this mistake with SEIZE. Set all task->ptrace bits in one operation - before this change, we were adding PT_SEIZED and PT_PTRACE_CAP with task->ptrace |= BIT ops. This was probably ok (not a bug), but let's be on a safer side. Changes since v2: use (unsigned long) casts instead of (long) ones, move PTRACE_SEIZE_DEVEL-related code to separate lines of code. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Pedro Alves <palves@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23	ptrace: simplify PTRACE_foo constants and PTRACE_SETOPTIONS code	Denys Vlasenko	2	-41/+23
	Exchange PT_TRACESYSGOOD and PT_PTRACE_CAP bit positions, which makes PT_option bits contiguous and therefore makes code in ptrace_setoptions() much simpler. Every PTRACE_O_TRACEevent is defined to (1 << PTRACE_EVENT_event) instead of using explicit numeric constants, to ensure we don't mess up relationship between bit positions and event ids. PT_EVENT_FLAG_SHIFT was not particularly useful, PT_OPT_FLAG_SHIFT with value of PT_EVENT_FLAG_SHIFT-1 is easier to use. PT_TRACE_MASK constant is nuked, its only use is replaced by (PTRACE_O_MASK << PT_OPT_FLAG_SHIFT). Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Alves <palves@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
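	The relationship between event ids and option bits can be written down as a small C sketch. The EX_ names below are hypothetical and chosen so they do not clash with the real ptrace headers; the event numbers 1 to 6 are the long-standing PTRACE_EVENT_* values, and this is an illustration of the scheme rather than a copy of the kernel header.

```c
#include <assert.h>

/* Event ids, mirroring the PTRACE_EVENT_* numbering. */
enum {
    EX_EVENT_FORK       = 1,
    EX_EVENT_VFORK      = 2,
    EX_EVENT_CLONE      = 3,
    EX_EVENT_EXEC       = 4,
    EX_EVENT_VFORK_DONE = 5,
    EX_EVENT_EXIT       = 6,
};

/* Each option bit is derived from its event id instead of a literal. */
#define EX_O_TRACESYSGOOD   1u
#define EX_O_TRACEFORK      (1u << EX_EVENT_FORK)
#define EX_O_TRACEVFORK     (1u << EX_EVENT_VFORK)
#define EX_O_TRACECLONE     (1u << EX_EVENT_CLONE)
#define EX_O_TRACEEXEC      (1u << EX_EVENT_EXEC)
#define EX_O_TRACEVFORKDONE (1u << EX_EVENT_VFORK_DONE)
#define EX_O_TRACEEXIT      (1u << EX_EVENT_EXIT)

/* All known option bits; contiguous because the event ids are. */
#define EX_O_MASK (EX_O_TRACESYSGOOD | EX_O_TRACEFORK | EX_O_TRACEVFORK | \
                   EX_O_TRACECLONE | EX_O_TRACEEXEC | \
                   EX_O_TRACEVFORKDONE | EX_O_TRACEEXIT)

/* Map an event id to its option bit. */
static unsigned int ex_option_for_event(int event)
{
    assert(event >= EX_EVENT_FORK && event <= EX_EVENT_EXIT);
    return 1u << event;
}
```

	With this scheme a new event cannot end up with a mismatched option bit, and the contiguous mask can be moved into an internal flags word with a single shift, as the message describes for PT_OPT_FLAG_SHIFT.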
2012-03-23	ptrace: don't modify flags on PTRACE_SETOPTIONS failure	Denys Vlasenko	1	-1/+4
	On ptrace(PTRACE_SETOPTIONS, pid, 0, <opts>), we used to set those option bits which are known, and then fail with -EINVAL if there are some unknown bits in <opts>. This is inconsistent with typical error handling, which does not change any state if input is invalid. This patch changes PTRACE_SETOPTIONS behavior so that in this case, we return -EINVAL and don't change any bits in task->ptrace. It's very unlikely that there is userspace code in the wild which will be affected by this change: it should have the form ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_BOGUSOPT) where PTRACE_O_BOGUSOPT is a constant unknown to the kernel. But kernel headers, naturally, don't contain any PTRACE_O_BOGUSOPTs, thus the only way userspace can use one is if it defines one itself. I can't see why anyone would do such a thing deliberately. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Alves <palves@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
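	The validate-before-modify behavior can be sketched in a few lines of C. This is an illustration only: struct ex_task, ex_setoptions() and the EX_KNOWN_OPTS mask value are hypothetical, and the real kernel code operates on task->ptrace rather than a separate field.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical set of option bits the "kernel" knows about. */
#define EX_KNOWN_OPTS 0x7fu

/* Hypothetical stand-in for the traced task's option state. */
struct ex_task {
    unsigned int options;
};

/*
 * Reject unknown bits before touching any state, so a failed call
 * leaves the previously configured options intact.
 */
static int ex_setoptions(struct ex_task *t, unsigned long data)
{
    if (data & ~(unsigned long)EX_KNOWN_OPTS)
        return -EINVAL;

    /* Input is valid: replace the option bits in one step. */
    t->options = (unsigned int)data;
    return 0;
}
```

	Under the old behavior described above, a call mixing known and unknown bits would have changed the options and still returned -EINVAL; here the same call returns -EINVAL and changes nothing.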