linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2015-02-25	perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag	Adrian Hunter	1	-3/+15
	Commit f6edb53c4993ffe92ce521fb449d1c146cea6ec2 converted the probe to a CPU wide event first (pid == -1). For kernels that do not support the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this errno is not handled pid is not reset to 0 and the subsequent use of pid = -1 as an argument brings in an additional failure path if perf_event_paranoid > 0: $ perf record -- sleep 1 perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied) [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ] Also, ensure the fd of the confirmation check is closed and comment why pid = -1 is used. Needs to go to 3.18 stable tree as well. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Based-on-patch-by: David Ahern <david.ahern@oracle.com> Acked-by: David Ahern <david.ahern@oracle.com> Cc: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/54EC610C.8000403@intel.com Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-25	perf tools: Fix pthread_attr_setaffinity_np build error	Adrian Hunter	1	-1/+2
	Feature detection for pthread_attr_setaffinity_np was failing, producing this error: In file included from bench/futex-hash.c:17:0: bench/futex.h:73:19: error: conflicting types for ‘pthread_attr_setaffinity_np’ static inline int pthread_attr_setaffinity_np(pthread_attr_t attr, ^ In file included from bench/futex.h:72:0, from bench/futex-hash.c:17: /usr/include/pthread.h:407:12: note: previous declaration of ‘pthread_attr_setaffinity_np’ was here extern int pthread_attr_setaffinity_np (pthread_attr_t __attr, ^ make[3]: * [bench/futex-hash.o] Error 1 make[2]: * [bench] Error 2 make[2]: *** Waiting for unfinished jobs.... This was because compiling test-pthread-attr-setaffinity-np.c failed due to the function arguments: test-pthread-attr-setaffinity-np.c: In function ‘main’: test-pthread-attr-setaffinity-np.c:11:2: warning: null argument where non-null required (argument 3) [-Wnonnull] ret = pthread_attr_setaffinity_np(&thread_attr, 0, NULL); ^ So fix the arguments. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Stephane Eranian <eranian@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1424774766-24194-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-25	perf tools: Define _GNU_SOURCE on pthread_attr_setaffinity_np feature check	Josh Boyer	1	-1/+1
	The man page for pthread_attr_set_affinity_np states that _GNU_SOURCE must be defined before pthread.h is included in order to get the proper function declaration. Define this in the Makefile. Without this defined, the feature check fails on a Fedora system with gcc5 and then the perf build later fails with conflicting prototypes for the function. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: http://lkml.kernel.org/r/20150211162404.GA15522@hansolo.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-22	perf bench: Fix order of arguments to memcpy_alloc_mem	Bruce Merry	1	-2/+2
	This was causing the destination instead of the source to be filled. As a result, the source was typically all mapped to one zero page, and hence very cacheable. Signed-off-by: Bruce Merry <bmerry@ska.ac.za> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150115092022.GA11292@kryton Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-21	kprobes/x86: Check for invalid ftrace location in __recover_probed_insn()	Petr Mladek	2	-0/+14
	__recover_probed_insn() should always be called from an address where an instructions starts. The check for ftrace_location() might help to discover a potential inconsistency. This patch adds WARN_ON() when the inconsistency is detected. Also it adds handling of the situation when the original code can not get recovered. Suggested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Ananth NMavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1424441250-27146-3-git-send-email-pmladek@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-21	kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace	Petr Mladek	1	-14/+28
	can_probe() checks if the given address points to the beginning of an instruction. It analyzes all the instructions from the beginning of the function until the given address. The code might be modified by another Kprobe. In this case, the current code is read into a buffer, int3 breakpoint is replaced by the saved opcode in the buffer, and can_probe() analyzes the buffer instead. There is a bug that __recover_probed_insn() tries to restore the original code even for Kprobes using the ftrace framework. But in this case, the opcode is not stored. See the difference between arch_prepare_kprobe() and arch_prepare_kprobe_ftrace(). The opcode is stored by arch_copy_kprobe() only from arch_prepare_kprobe(). This patch makes Kprobe to use the ideal 5-byte NOP when the code can be modified by ftrace. It is the original instruction, see ftrace_make_nop() and ftrace_nop_replace(). Note that we always need to use the NOP for ftrace locations. Kprobes do not block ftrace and the instruction might get modified at anytime. It might even be in an inconsistent state because it is modified step by step using the int3 breakpoint. The patch also fixes indentation of the touched comment. Note that I found this problem when playing with Kprobes. I did it on x86_64 with gcc-4.8.3 that supported -mfentry. I modified samples/kprobes/kprobe_example.c and added offset 5 to put the probe right after the fentry area: static struct kprobe kp = { .symbol_name = "do_fork", + .offset = 5, }; Then I was able to load kprobe_example before jprobe_example but not the other way around: $> modprobe jprobe_example $> modprobe kprobe_example modprobe: ERROR: could not insert 'kprobe_example': Invalid or incomplete multibyte or wide character It did not make much sense and debugging pointed to the bug described above. Signed-off-by: Petr Mladek <pmladek@suse.cz> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth NMavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1424441250-27146-2-git-send-email-pmladek@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18	kprobes/x86: Mark 2 bytes NOP as boostable	Wang Nan	1	-1/+1
	Currently, x86 kprobes is unable to boost 2 bytes nop like: nopl 0x0(%rax,%rax,1) which is 0x0f 0x1f 0x44 0x00 0x00. Such nops have exactly 5 bytes to hold a relative jmp instruction. Boosting them should be obviously safe. This patch enable boosting such nops by simply updating twobyte_is_boostable[] array. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1423532045-41049-1-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18	uprobes/x86: Fix 2-byte opcode table	Denys Vlasenko	1	-35/+17
	Enabled probing of lar, lsl, popcnt, lddqu, prefetch insns. They should be safe to probe, they throw no exceptions. Enabled probing of 3-byte opcodes 0f 38-3f xx - these are vector isns, so should be safe. Enabled probing of many currently undefined 0f xx insns. At the rate new vector instructions are getting added, we don't want to constantly enable more bits. We want to only occasionally disable ones which for some reason can't be probed. This includes 0f 24,26 opcodes, which are undefined since Pentium. On 486, they were "mov to/from test register". Explained more fully what 0f 78,79 opcodes are. Explained what 0f ae opcode is. (It's unclear why we don't allow probing it, but let's not change it for now). Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1423768732-32194-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18	uprobes/x86: Fix 1-byte opcode tables	Denys Vlasenko	1	-48/+18
	This change fixes 1-byte opcode tables so that only insns for which we have real reasons to disallow probing are marked with unset bits. To that end: Set bits for all prefix bytes. Their setting is ignored anyway - we check the bitmap against OPCODE1(insn), not against first byte. Keeping them set to 0 only confuses code reader with "why we don't support that opcode" question. Thus: enable bytes c4,c5 in 64-bit mode (VEX prefixes). Byte 62 (EVEX prefix) is not yet enabled since insn decoder does not support that yet. For 32-bit mode, enable probing of opcodes 63 (arpl) and d6 (salc). They don't require any special handling. For 64-bit mode, disable 9a and ea - these undefined opcodes were mistakenly left enabled. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1423768732-32194-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18	uprobes/x86: Add comment with insn opcodes, mnemonics and why we dont support them	Denys Vlasenko	1	-19/+134
	After adding these, it's clear we have some awkward choices there. Some valid instructions are prohibited from uprobing while several invalid ones are allowed. Hopefully future edits to the good-opcode tables will fix wrong bits or explain why those bits are not wrong. No actual code changes. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1423768732-32194-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-09	random: Fix fast_mix() function	George Spelvin	1	-4/+4
	There was a bad typo in commit 43759d4f429c ("random: use an improved fast_mix() function") and I didn't notice because it "looked right", so I saw what I expected to see when I reviewed it. Only months later did I look and notice it's not the Threefish-inspired mix function that I had designed and optimized. Mea Culpa. Each input bit still has a chance to affect each output bit, and the fast pool is spilled long before it fills, so it's not a total disaster, but it's definitely not the intended great improvement. I'm still working on finding better rotation constants. These are good enough, but since it's unrolled twice, it's possible to get better mixing for free by using eight different constants rather than repeating the same four. Signed-off-by: George Spelvin <linux@horizon.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-08	Linux 3.19	Linus Torvalds	1	-1/+1

2015-02-09	nios2: fix unhandled signals	Chung-Ling Tang	1	-3/+5
	Follow other architectures for user fault handling. Signed-off-by: Chung-Ling Tang <cltang@codesourcery.com> Acked-by: Ley Foon Tan <lftan@altera.com>
2015-02-07	x86/tlb/trace: Do not trace on CPU that is offline	Steven Rostedt (Red Hat)	1	-1/+3
	When taking a CPU down for suspend and resume, a tracepoint may be called when the CPU has been designated offline. As tracepoints require RCU for protection, they must not be called if the current CPU is offline. Unfortunately, trace_tlb_flush() is called in this scenario as was noted by LOCKDEP: ... Disabling non-boot CPUs ... intel_pstate CPU 1 exiting =============================== smpboot: CPU 1 didn't die... [ INFO: suspicious RCU usage. ] 3.19.0-rc7-next-20150204.1-iniza-small #1 Not tainted ------------------------------- include/trace/events/tlb.h:35 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 0 no locks held by swapper/1/0. stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc7-next-20150204.1-iniza-small #1 Hardware name: SAMSUNG ELECTRONICS CO., LTD. 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013 0000000000000001 ffff88011a44fe18 ffffffff817e370d 0000000000000011 ffff88011a448290 ffff88011a44fe48 ffffffff810d6847 ffff8800c66b9600 0000000000000001 ffff88011a44c000 ffffffff81cb3900 ffff88011a44fe78 Call Trace: [<ffffffff817e370d>] dump_stack+0x4c/0x65 [<ffffffff810d6847>] lockdep_rcu_suspicious+0xe7/0x120 [<ffffffff810b71a5>] idle_task_exit+0x205/0x2c0 [<ffffffff81054c4e>] play_dead_common+0xe/0x50 [<ffffffff81054ca5>] native_play_dead+0x15/0x140 [<ffffffff8102963f>] arch_cpu_idle_dead+0xf/0x20 [<ffffffff810cd89e>] cpu_startup_entry+0x37e/0x580 [<ffffffff81053e20>] start_secondary+0x140/0x150 intel_pstate CPU 2 exiting ... By converting the tlb_flush tracepoint to a TRACE_EVENT_CONDITION where the condition is cpu_online(smp_processor_id()), we can avoid calling RCU protected code when the CPU is offline. Link: http://lkml.kernel.org/r/CA+icZUUGiGDoL5NU8RuxKzFjoLjEKRtUWx=JB8B9a0EQv-eGzQ@mail.gmail.com Cc: stable@vger.kernel.org # 3.17+ Fixes: d17d8f9dedb9 "x86/mm: Add tracepoints for TLB flushes" Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Hansen <dave@sr71.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-02-07	tracing: Add condition check to RCU lockdep checks	Steven Rostedt (Red Hat)	1	-1/+1
	The trace_tlb_flush() tracepoint can be called when a CPU is going offline. When a CPU is offline, RCU is no longer watching that CPU and since the tracepoint is protected by RCU, it must not be called. To prevent the tlb_flush tracepoint from being called when the CPU is offline, it was converted to a TRACE_EVENT_CONDITION where the condition checks if the CPU is online before calling the tracepoint. Unfortunately, this was not enough to stop lockdep from complaining about it. Even though the RCU protected code of the tracepoint will never be called, the condition is hidden within the tracepoint, and even though the condition prevents RCU code from being called, the lockdep checks are outside the tracepoint (this is to test tracepoints even when they are not enabled). Even though tracepoints should be checked to be RCU safe when they are not enabled, the condition should still be considered when checking RCU. Link: http://lkml.kernel.org/r/CA+icZUUGiGDoL5NU8RuxKzFjoLjEKRtUWx=JB8B9a0EQv-eGzQ@mail.gmail.com Fixes: 3a630178fd5f "tracing: generate RCU warnings even when tracepoints are disabled" Cc: stable@vger.kernel.org # 3.18+ Acked-by: Dave Hansen <dave@sr71.net> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-02-06	Revert "IB/core: Add support for extended query device caps"	Yann Droneaud	5	-130/+42
	While commit 7e36ef8205ff ("IB/core: Temporarily disable ex_query_device uverb") is correct as it makes the extended QUERY_DEVICE uverb (which came as part of commit 5a77abf9a97a ("IB/core: Add support for extended query device caps") and commit 860f10a799c8 ("IB/core: Add flags for on demand paging support")) not available to userspace, it doesn't address the initial issue regarding ib_copy_to_udata() [1][2]. Additionally, further discussions around this new uverb seems to conclude it would require a different data structure than the one currently described in <rdma/ib_user_verbs.h> [3]. Both of these issues require a revert of the changes, so this patch partially reverts commit 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb") and commit 860f10a799c8 ("IB/core: Add flags for on demand paging support") and fully reverts commit 5a77abf9a97a ("IB/core: Add support for extended query device caps"). [1] "Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps" http://mid.gmane.org/1418733236.2779.26.camel@opteya.com [2] "Re: [PATCH] IB/core: Temporarily disable ex_query_device uverb" http://mid.gmane.org/1423067503.3030.83.camel@opteya.com [3] "RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask" http://mid.gmane.org/2807E5FD2F6FDA4886F6618EAC48510E0CC12C30@CRSMSX101.amr.corp.intel.com Cc: Eli Cohen <eli@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-05	mm/debug_pagealloc: fix build failure on ppc and some other archs	Joonsoo Kim	4	-23/+0
	Kim Phillips reported following build failure. LD init/built-in.o mm/built-in.o: In function `free_pages_prepare': mm/page_alloc.c:770: undefined reference to `.kernel_map_pages' mm/built-in.o: In function `prep_new_page': mm/page_alloc.c:933: undefined reference to `.kernel_map_pages' mm/built-in.o: In function `map_pages': mm/compaction.c:61: undefined reference to `.kernel_map_pages' make: *** [vmlinux] Error 1 Reason for this problem is that commit 031bc5743f15 ("mm/debug-pagealloc: make debug-pagealloc boottime configurable") forgot to remove the old declaration of kernel_map_pages() for some architectures. This patch removes them to fix build failure. Reported-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: David Miller <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-05	nilfs2: fix deadlock of segment constructor over I_SYNC flag	Ryusuke Konishi	3	-7/+44
	Nilfs2 eventually hangs in a stress test with fsstress program. This issue was caused by the following deadlock over I_SYNC flag between nilfs_segctor_thread() and writeback_sb_inodes(): nilfs_segctor_thread() nilfs_segctor_thread_construct() nilfs_segctor_unlock() nilfs_dispose_list() iput() iput_final() evict() inode_wait_for_writeback() * wait for I_SYNC flag writeback_sb_inodes() * set I_SYNC flag on inode->i_state __writeback_single_inode() do_writepages() nilfs_writepages() nilfs_construct_dsync_segment() nilfs_segctor_sync() * wait for completion of segment constructor inode_sync_complete() * clear I_SYNC flag after __writeback_single_inode() completed writeback_sb_inodes() calls do_writepages() for dirty inodes after setting I_SYNC flag on inode->i_state. do_writepages() in turn calls nilfs_writepages(), which can run segment constructor and wait for its completion. On the other hand, segment constructor calls iput(), which can call evict() and wait for the I_SYNC flag on inode_wait_for_writeback(). Since segment constructor doesn't know when I_SYNC will be set, it cannot know whether iput() will block or not unless inode->i_nlink has a non-zero count. We can prevent evict() from being called in iput() by implementing sop->drop_inode(), but it's not preferable to leave inodes with i_nlink == 0 for long periods because it even defers file truncation and inode deallocation. So, this instead resolves the deadlock by calling iput() asynchronously with a workqueue for inodes with i_nlink == 0. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-05	MAINTAINERS: remove SUPERH website	Sudip Mukherjee	1	-1/+0
	The mentioned website only displays information about buying and selling domains. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-05	memcg, shmem: fix shmem migration to use lrucare	Michal Hocko	2	-2/+2
	It has been reported that 965GM might trigger VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage) in mem_cgroup_migrate when shmem wants to replace a swap cache page because of shmem_should_replace_page (the page is allocated from an inappropriate zone). shmem_replace_page expects that the oldpage is not on LRU list and calls mem_cgroup_migrate without lrucare. This is obviously incorrect because swapcache pages might be on the LRU list (e.g. swapin readahead page). Fix this by enabling lrucare for the migration in shmem_replace_page. Also clarify that lrucare should be used even if one of the pages might be on LRU list. The BUG_ON will trigger only when CONFIG_DEBUG_VM is enabled but even without that the migration code might leave the old page on an inappropriate memcg' LRU which is not that critical because the page would get removed with its last reference but it is still confusing. Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: Dave Airlie <airlied@gmail.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.17+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-05	mm: export "high_memory" symbol on !MMU	Arnd Bergmann	1	-0/+1
	The symbol 'high_memory' is provided on both MMU- and NOMMU-kernels, but only one of them is exported, which leads to module build errors in drivers that work fine built-in: ERROR: "high_memory" [drivers/net/virtio_net.ko] undefined! ERROR: "high_memory" [drivers/net/ppp/ppp_mppe.ko] undefined! ERROR: "high_memory" [drivers/mtd/nand/nand.ko] undefined! ERROR: "high_memory" [crypto/tcrypt.ko] undefined! ERROR: "high_memory" [crypto/cts.ko] undefined! This exports the symbol to get these to work on NOMMU as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Greg Ungerer <gerg@uclinux.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-05	.mailmap: update Konstantin Khlebnikov's email address	Kim Phillips	1	-0/+1
	get_maintainer.pl returns k.khlebnikov@samsung.com via git history, for which emails get rejected: RCPT TO:<k.khlebnikov@samsung.com> 550 5.1.1 Recipient address rejected: User unknown Use his other address that passes vger's mxverify: RCPT TO:<koct9i@gmail.com> 250 2.1.5 OK ir10si13843754pbc.62 - gsmtp and add his old email address in the wrong email address field. Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Acked-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-05	mm: pagewalk: call pte_hole() for VM_PFNMAP during walk_page_range	Shiraz Hashim	1	-1/+4
	walk_page_range() silently skips vma having VM_PFNMAP set, which leads to undesirable behaviour at client end (who called walk_page_range). Userspace applications get the wrong data, so the effect is like just confusing users (if the applications just display the data) or sometimes killing the processes (if the applications do something with misunderstanding virtual addresses due to the wrong data.) For example for pagemap_read, when no callbacks are called against VM_PFNMAP vma, pagemap_read may prepare pagemap data for next virtual address range at wrong index. Eventually userspace may get wrong pagemap data for a task. Corresponding to a VM_PFNMAP marked vma region, kernel may report mappings from subsequent vma regions. User space in turn may account more pages (than really are) to the task. In my case I was using procmem, procrack (Android utility) which uses pagemap interface to account RSS pages of a task. Due to this bug it was giving a wrong picture for vmas (with VM_PFNMAP set). Fixes: a9ff785e4437 ("mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas") Signed-off-by: Shiraz Hashim <shashim@codeaurora.org> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> [3.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-05	ASoC: Intel: fix sst firmware path for cht-bsw-rt5672	Kevin Strasser	1	-1/+1
	All sst firmware is provided under the intel directory of the linux-firmware tree. By default this directory structure is kept when installing on a target system. Change the path to expect a default linux-firmware installation. Signed-off-by: Kevin Strasser <kevin.strasser@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-05	ARM: dts: Fix I2S1, I2S2 compatible for exynos4 SoCs	Sylwester Nawrocki	1	-2/+2
	I2S1, I2S2 on Exynos4 SoC series have limited functionality compared to I2S0, "samsung,s3c6410-i2s" compatible should be used for them. Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2015-02-05	spi: mxs: cleanup wait_for_completion return handling	Nicholas Mc Guire	1	-3/+2
	return type of wait_for_completion_timeout is unsigned long not int, this patch uses the return value of wait_for_completion_timeout in the condition directly rather than adding a additional appropriately typed variable. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-05	spi: ti-qspi: cleanup wait_for_completion return handling	Nicholas Mc Guire	1	-8/+6
	return type of wait_for_completion_timeout is unsigned long not int, this patch uses the return value of wait_for_completion_timeout in the condition directly rather than assigning it to an incorrect type variable. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-05	regulator: max77843: Add max77843 regulator driver	Jaewon Kim	3	-0/+236
	This patch adds new regulator driver to support max77843 MFD(Multi Function Device) chip`s regulators. The Max77843 has two voltage regulators for USB safeout. Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com> Signed-off-by: Beomho Seo <beomho.seo@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-05	IRQCHIP: mips-gic: Avoid rerouting timer IRQs for smp-cmp	James Hogan	1	-0/+27
	Commit e9de688dac65 ("irqchip: mips-gic: Support local interrupts") changed the GIC irqchip driver so that all local interrupts were routed to the same CPU pin used for external interrupts. Unfortunately this causes a regression when smp-cmp is used. The CPUs are started by the bootloader and put in a timer based waiting poll loop, but when their timer interrupts are rerouted to a different IRQ pin which is not unmasked they never wake up. Since smp-cmp support is deprecated and everybody who was using it should be switching to smp-cps which brings up the secondary CPUs without bootloader assistance, I've gone for the simple fix which can be easily removed once smp-cmp is removed, rather than a fully generic fix. In __gic_init() the local GIC_VPE_TIMER_MAP register is read to find the boot-time routing of the local timer interrupt, and a chained handler is added to that CPU pin as well as the normal one. Signed-off-by: James Hogan <james.hogan@imgtec.com> Fixes: e9de688dac65 ("irqchip: mips-gic: Support local interrupts") Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Reviewed-by: Andrew Bresticker <abrestic@chromium.org> Patchwork: https://patchwork.linux-mips.org/patch/9081/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-05	sit: fix some __be16/u16 mismatches	Eric Dumazet	1	-4/+4
	Fixes following sparse warnings : net/ipv6/sit.c:1509:32: warning: incorrect type in assignment (different base types) net/ipv6/sit.c:1509:32: expected restricted __be16 [usertype] sport net/ipv6/sit.c:1509:32: got unsigned short net/ipv6/sit.c:1514:32: warning: incorrect type in assignment (different base types) net/ipv6/sit.c:1514:32: expected restricted __be16 [usertype] dport net/ipv6/sit.c:1514:32: got unsigned short net/ipv6/sit.c:1711:38: warning: incorrect type in argument 3 (different base types) net/ipv6/sit.c:1711:38: expected unsigned short [unsigned] [usertype] value net/ipv6/sit.c:1711:38: got restricted __be16 [usertype] sport net/ipv6/sit.c:1713:38: warning: incorrect type in argument 3 (different base types) net/ipv6/sit.c:1713:38: expected unsigned short [unsigned] [usertype] value net/ipv6/sit.c:1713:38: got restricted __be16 [usertype] dport Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	ipv6: fix sparse errors in ip6_make_flowlabel()	Eric Dumazet	1	-2/+2
	include/net/ipv6.h:713:22: warning: incorrect type in assignment (different base types) include/net/ipv6.h:713:22: expected restricted __be32 [usertype] hash include/net/ipv6.h:713:22: got unsigned int include/net/ipv6.h:719:25: warning: restricted __be32 degrades to integer include/net/ipv6.h:719:22: warning: invalid assignment: ^= include/net/ipv6.h:719:22: left side has type restricted __be32 include/net/ipv6.h:719:22: right side has type unsigned int Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	net: remove some sparse warnings	Eric Dumazet	1	-3/+3
	netdev_adjacent_add_links() and netdev_adjacent_del_links() are static. queue->qdisc has __rcu annotation, need to use RCU_INIT_POINTER() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	flow_keys: n_proto type should be __be16	Eric Dumazet	1	-3/+3
	(struct flow_keys)->n_proto is in network order, use proper type for this. Fixes following sparse errors : net/core/flow_dissector.c:139:39: warning: incorrect type in assignment (different base types) net/core/flow_dissector.c:139:39: expected unsigned short [unsigned] [usertype] n_proto net/core/flow_dissector.c:139:39: got restricted __be16 [assigned] [usertype] proto net/core/flow_dissector.c:237:23: warning: incorrect type in assignment (different base types) net/core/flow_dissector.c:237:23: expected unsigned short [unsigned] [usertype] n_proto net/core/flow_dissector.c:237:23: got restricted __be16 [assigned] [usertype] proto Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	ip6_gre: fix endianness errors in ip6gre_err	Sabrina Dubroca	1	-2/+2
	info is in network byte order, change it back to host byte order before use. In particular, the current code sets the MTU of the tunnel to a wrong (too big) value. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	qlcnic: Fix NAPI poll routine for Tx completion	Shahed Shaikh	1	-3/+24
	After d75b1ade567f ("net: less interrupt masking in NAPI") driver's NAPI poll routine is expected to return exact budget value if it wants to be re-called. Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Fixes: d75b1ade567f ("net: less interrupt masking in NAPI") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05	hrtimer: Fix incorrect tai offset calculation for non high-res timer systems	John Stultz	1	-1/+1
	I noticed some CLOCK_TAI timer test failures on one of my less-frequently used configurations. And after digging in I found in 76f4108892d9 (Cleanup hrtimer accessors to the timekepeing state), the hrtimer_get_softirq_time tai offset calucation was incorrectly rewritten, as the tai offset we return shold be from CLOCK_MONOTONIC, and not CLOCK_REALTIME. This results in CLOCK_TAI timers expiring early on non-highres capable machines. This patch fixes the issue, calculating the tai time properly from the monotonic base. Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable <stable@vger.kernel.org> # 3.17+ Link: http://lkml.kernel.org/r/1423097126-10236-1-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04	amd-xgbe: Set RSS enablement based on hardware features	Lendacky, Thomas	1	-0/+1
	The RSS support requires enablement based on the features reported by the hardware. The setting of this flag is missing. Add support to set the RSS enablement flag based on the reported hardware features. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	amd-xgbe: Adjust for zero-based traffic class count	Lendacky, Thomas	1	-1/+2
	The number of traffic classes reported by the hardware is zero-based so increment the value returned to get an actual count. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	cls_api.c: Fix dumping of non-existing actions' stats.	Ignacy Gawędzki	1	-3/+4
	In tcf_exts_dump_stats(), ensure that exts->actions is not empty before accessing the first element of that list and calling tcf_action_copy_stats() on it. This fixes some random segvs when adding filters of type "basic" with no particular action. This also fixes the dumping of those "no-action" filters, which more often than not made calls to tcf_action_copy_stats() fail and consequently netlink attributes added by the caller to be removed by a call to nla_nest_cancel(). Fixes: 33be62715991 ("net_sched: act: use standard struct list_head") Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	pkt_sched: fq: avoid hang when quantum 0	Kenneth Klette Jonassen	1	-2/+8
	Configuring fq with quantum 0 hangs the system, presumably because of a non-interruptible infinite loop. Either way quantum 0 does not make sense. Reproduce with: sudo tc qdisc add dev lo root fq quantum 0 initial_quantum 0 ping 127.0.0.1 Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	Btrfs: add missing blk_finish_plug in btrfs_sync_log()	Forrest Liu	1	-0/+1
	Add missing blk_finish_plug in btrfs_sync_log() Signed-off-by: Forrest Liu <forrestl@synology.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-02-05	drm/cirrus: Limit modes depending on bpp option	Takashi Iwai	4	-2/+15
	The commit [8975626ea35a: drm/cirrus: allow 32bpp framebuffers for cirrus drm] broke X modesetting driver because cirrus driver still provides the full list of modes up to 1280x1024 while the 32bpp can support only up to 800x600. We might be able to filter out the invalid modes in mode_valid callback, but unfortunately the bpp in question can't be referred there for now (let me know if there is a better way to retrieve the bpp for the probed fb). So, instead, this patch adds the bpp module option to specify the maximal bpp explicitly and limits the resolutions in get_modes depending on its value. The default value is set to 24 so that the existing stuff keeps working. If you need a new 32bpp feature, specify cirrus.bpp=32 option explicitly. Fixes: 8975626ea35a ('drm/cirrus: allow 32bpp framebuffers for cirrus drm') Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-04	net: rds: use correct size for max unacked packets and bytes	Sasha Levin	1	-2/+2
	Max unacked packets/bytes is an int while sizeof(long) was used in the sysctl table. This means that when they were getting read we'd also leak kernel memory to userspace along with the timeout values. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	vhost/net: fix up num_buffers endian-ness	Michael S. Tsirkin	1	-1/+3
	In virtio 1.0 mode, when mergeable buffers are enabled on a big-endian host, num_buffers wasn't byte-swapped correctly, so large incoming packets got corrupted. To fix, fill it in within hdr - this also makes sure it gets the correct type. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	gianfar: correct the bad expression while writing bit-pattern	Sanjeev Sharma	1	-1/+1
	This patch correct the bad expression while writing the bit-pattern from software's buffer to hardware registers. Signed-off-by: Sanjeev Sharma <Sanjeev_Sharma@mentor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	net: usb: sr9700: Use 'SR_' prefix for the common register macros	Chen Gang	2	-51/+51
	The commone register macors (e.g. RSR) is too commont to drivers, it may be conflict with the architectures (e.g. xtensa, sh). The related warnings (with allmodconfig under xtensa): CC [M] drivers/net/usb/sr9700.o In file included from drivers/net/usb/sr9700.c:24:0: drivers/net/usb/sr9700.h:65:0: warning: "RSR" redefined #define RSR 0x06 ^ In file included from ./arch/xtensa/include/asm/bitops.h:22:0, from include/linux/bitops.h:36, from include/linux/kernel.h:10, from include/linux/list.h:8, from include/linux/module.h:9, from drivers/net/usb/sr9700.c:13: ./arch/xtensa/include/asm/processor.h:190:0: note: this is the location of the previous definition #define RSR(v,sr) __asm__ __volatile__ ("rsr %0,"__stringify(sr) : "=a"(v)); ^ Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04	spi: spi-imx: cleanup wait_for_completion handling	Nicholas Mc Guire	1	-7/+8
	return type of wait_for_completion_timeout is unsigned long not int and always returns >=0 , this patch adds a suitable return variable and simplifies the return value checking as there is no < 0 case. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-04	spi: sh-msiof: cleanup wait_for_completion return handling	Nicholas Mc Guire	1	-4/+2
	return type of wait_for_completion_timeout is unsigned long not int, this patch uses the return value of wait_for_completion_timeout in the condition directly rather than assigning it to an incorrect type variable. Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-04	spi: match var type to return type of wait_for_completion	Nicholas Mc Guire	1	-1/+1
	return type of wait_for_completion_timeout is unsigned long not int, this patch changes the type of m from int to unsigned long. Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-02-04	regmap: Fix i2c word access when using SMBus access functions	Guenter Roeck	1	-1/+45
	SMBus access functions assume that 16-bit values are formatted as little endian numbers. The direct i2c access functions in regmap, however, assume that 16-bit values are formatted as big endian numbers. As a result, the current code returns different values if an i2c chip's 16-bit registers are accessed through i2c access functions vs. SMBus access functions. Use regmap_smbus_read_word_swapped and regmap_smbus_write_word_swapped for 16-bit SMBus accesses if a chip is configured as REGMAP_ENDIAN_BIG. If the chip is configured as REGMAP_ENDIAN_LITTLE, keep using regmap_smbus_write_word_data and regmap_smbus_read_word_data. Otherwise reject registration if the controller does not support direct i2c accesses. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mark Brown <broonie@kernel.org>