linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-03-10	dmaengine: at_xdmac: fix residue computation	Ludovic Desroches	1	-3/+39
	When computing the residue we need two pieces of information: the current descriptor and the remaining data of the current descriptor. To get that information, we need to read consecutively two registers but we can't do it in an atomic way. For that reason, we have to check manually that current descriptor has not changed. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Suggested-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Reported-by: David Engraf <david.engraf@sysgo.com> Tested-by: David Engraf <david.engraf@sysgo.com> Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Cc: stable@vger.kernel.org #4.1 and later Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-03-10	KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0	Paolo Bonzini	1	-1/+3
	KVM has special logic to handle pages with pte.u=1 and pte.w=0 when CR0.WP=1. These pages' SPTEs flip continuously between two states: U=1/W=0 (user and supervisor reads allowed, supervisor writes not allowed) and U=0/W=1 (supervisor reads and writes allowed, user writes not allowed). When SMEP is in effect, however, U=0 will enable kernel execution of this page. To avoid this, KVM also sets NX=1 in the shadow PTE together with U=0, making the two states U=1/W=0/NX=gpte.NX and U=0/W=1/NX=1. When guest EFER has the NX bit cleared, the reserved bit check thinks that the latter state is invalid; teach it that the smep_andnot_wp case will also use the NX bit of SPTEs. Cc: stable@vger.kernel.org Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.inel.com> Fixes: c258b62b264fdc469b6d3610a907708068145e3b Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-10	KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo	Paolo Bonzini	2	-14/+25
	Yes, all of these are needed. :) This is admittedly a bit odd, but kvm-unit-tests access.flat tests this if you run it with "-cpu host" and of course ept=0. KVM runs the guest with CR0.WP=1, so it must handle supervisor writes specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0. When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and restarts execution. This will still cause a user write to fault, while supervisor writes will succeed. User reads will fault spuriously now, and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads will be enabled and supervisor writes disabled, going back to the originary situation where supervisor writes fault spuriously. When SMEP is in effect, however, U=0 will enable kernel execution of this page. To avoid this, KVM also sets NX=1 in the shadow PTE together with U=0. If the guest has not enabled NX, the result is a continuous stream of page faults due to the NX bit being reserved. The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry control, so they do not use user-return notifiers for EFER---if they did, EFER.NX would be forced to the same value as the host). There is another bug in the reserved bit check, which I've split to a separate patch for easier application to stable kernels. Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-10	s390/mm: four page table levels vs. fork	Martin Schwidefsky	2	-10/+30
	The fork of a process with four page table levels is broken since git commit 6252d702c5311ce9 "[S390] dynamic page tables." All new mm contexts are created with three page table levels and an asce limit of 4TB. If the parent has four levels dup_mmap will add vmas to the new context which are outside of the asce limit. The subsequent call to copy_page_range will walk the three level page table structure of the new process with non-zero pgd and pud indexes. This leads to memory clobbers as the pgd_index and the pud_index is added to the mm->pgd pointer without a pgd_deref in between. The init_new_context() function is selecting the number of page table levels for a new context. The function is used by mm_init() which in turn is called by dup_mm() and mm_alloc(). These two are used by fork() and exec(). The init_new_context() function can distinguish the two cases by looking at mm->context.asce_limit, for fork() the mm struct has been copied and the number of page table levels may not change. For exec() the mm_alloc() function set the new mm structure to zero, in this case a three-level page table is created as the temporary stack space is located at STACK_TOP_MAX = 4TB. This fixes CVE-2016-2143. Reported-by: Marcin Kościelnicki <koriakin@0x04.net> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-09	ext4: iterate over buffer heads correctly in move_extent_per_page()	Eryu Guan	1	-0/+1
	In commit bcff24887d00 ("ext4: don't read blocks from disk after extents being swapped") bh is not updated correctly in the for loop and wrong data has been written to disk. generic/324 catches this on sub-page block size ext4. Fixes: bcff24887d00 ("ext4: don't read blocks from disk after extentsbeing swapped") Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09	dma-mapping: avoid oops when parameter cpu_addr is null	Zhen Lei	1	-1/+1
	To keep consistent with kfree, which tolerate ptr is NULL. We do this because sometimes we may use goto statement, so that success and failure case can share parts of the code. But unfortunately, dma_free_coherent called with parameter cpu_addr is null will cause oops, such as showed below: Unable to handle kernel paging request at virtual address ffffffc020d3b2b8 pgd = ffffffc083a61000 [ffffffc020d3b2b8] pgd=0000000000000000, pud=0000000000000000 CPU: 4 PID: 1489 Comm: malloc_dma_1 Tainted: G O 4.1.12 #1 Hardware name: ARM64 (DT) PC is at __dma_free_coherent.isra.10+0x74/0xc8 LR is at __dma_free+0x9c/0xb0 Process malloc_dma_1 (pid: 1489, stack limit = 0xffffffc0837fc020) [...] Call trace: __dma_free_coherent.isra.10+0x74/0xc8 __dma_free+0x9c/0xb0 malloc_dma+0x104/0x158 [dma_alloc_coherent_mtmalloc] kthread+0xec/0xfc Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	mm/hugetlb: use EOPNOTSUPP in hugetlb sysctl handlers	Jan Stancek	1	-2/+2
	Replace ENOTSUPP with EOPNOTSUPP. If hugepages are not supported, this value is propagated to userspace. EOPNOTSUPP is part of uapi and is widely supported by libc libraries. It gives nicer message to user, rather than: # cat /proc/sys/vm/nr_hugepages cat: /proc/sys/vm/nr_hugepages: Unknown error 524 And also LTP's proc01 test was failing because this ret code (524) was unexpected: proc01 1 TFAIL : proc01.c:396: read failed: /proc/sys/vm/nr_hugepages: errno=???(524): Unknown error 524 proc01 2 TFAIL : proc01.c:396: read failed: /proc/sys/vm/nr_hugepages_mempolicy: errno=???(524): Unknown error 524 proc01 3 TFAIL : proc01.c:396: read failed: /proc/sys/vm/nr_overcommit_hugepages: errno=???(524): Unknown error 524 Signed-off-by: Jan Stancek <jstancek@redhat.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	memremap: check pfn validity before passing to pfn_to_page()	Ard Biesheuvel	1	-2/+2
	In memremap's helper function try_ram_remap(), we dereference a struct page pointer that was derived from a PFN that is known to be covered by a 'System RAM' iomem region, and is thus assumed to be a 'valid' PFN, i.e., a PFN that has a struct page associated with it and is covered by the kernel direct mapping. However, the assumption that there is a 1:1 relation between the System RAM iomem region and the kernel direct mapping is not universally valid on all architectures, and on ARM and arm64, 'System RAM' may include regions for which pfn_valid() returns false. Generally speaking, both __va() and pfn_to_page() should only ever be called on PFNs/physical addresses for which pfn_valid() returns true, so add that check to try_ram_remap(). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	mm, thp: fix migration of PTE-mapped transparent huge pages	Kirill A. Shutemov	1	-1/+1
	We don't have native support of THP migration, so we have to split huge page into small pages in order to migrate it to different node. This includes PTE-mapped huge pages. I made mistake in refcounting patchset: we don't actually split PTE-mapped huge page in queue_pages_pte_range(), if we step on head page. The result is that the head page is queued for migration, but none of tail pages: putting head page on queue takes pin on the page and any subsequent attempts of split_huge_pages() would fail and we skip queuing tail pages. unmap_and_move_huge_page() will eventually split the huge pages, but only one of 512 pages would get migrated. Let's fix the situation. Fixes: 248db92da13f2507 ("migrate_pages: try to split pages on queuing") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	dax: check return value of dax_radix_entry()	Ross Zwisler	1	-1/+8
	dax_pfn_mkwrite() previously wasn't checking the return value of the call to dax_radix_entry(), which was a mistake. Instead, capture this return value and return the appropriate VM_FAULT_ value. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	ocfs2: fix return value from ocfs2_page_mkwrite()	Jan Kara	1	-0/+4
	ocfs2_page_mkwrite() could mistakenly return error code instead of mkwrite status value. Fix it. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	arm64: kasan: clear stale stack poison	Mark Rutland	1	-0/+4
	Functions which the compiler has instrumented for KASAN place poison on the stack shadow upon entry and remove this poison prior to returning. In the case of cpuidle, CPUs exit the kernel a number of levels deep in C code. Any instrumented functions on this critical path will leave portions of the stack shadow poisoned. If CPUs lose context and return to the kernel via a cold path, we restore a prior context saved in __cpu_suspend_enter are forgotten, and we never remove the poison they placed in the stack shadow area by functions calls between this and the actual exit of the kernel. Thus, (depending on stackframe layout) subsequent calls to instrumented functions may hit this stale poison, resulting in (spurious) KASAN splats to the console. To avoid this, clear any stale poison from the idle thread for a CPU prior to bringing a CPU online. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	sched/kasan: remove stale KASAN poison after hotplug	Mark Rutland	1	-0/+3
	Functions which the compiler has instrumented for KASAN place poison on the stack shadow upon entry and remove this poision prior to returning. In the case of CPU hotplug, CPUs exit the kernel a number of levels deep in C code. Any instrumented functions on this critical path will leave portions of the stack shadow poisoned. When a CPU is subsequently brought back into the kernel via a different path, depending on stackframe, layout calls to instrumented functions may hit this stale poison, resulting in (spurious) KASAN splats to the console. To avoid this, clear any stale poison from the idle thread for a CPU prior to bringing a CPU online. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	kasan: add functions to clear stack poison	Mark Rutland	2	-1/+25
	Functions which the compiler has instrumented for ASAN place poison on the stack shadow upon entry and remove this poison prior to returning. In some cases (e.g. hotplug and idle), CPUs may exit the kernel a number of levels deep in C code. If there are any instrumented functions on this critical path, these will leave portions of the idle thread stack shadow poisoned. If a CPU returns to the kernel via a different path (e.g. a cold entry), then depending on stack frame layout subsequent calls to instrumented functions may use regions of the stack with stale poison, resulting in (spurious) KASAN splats to the console. Contemporary GCCs always add stack shadow poisoning when ASAN is enabled, even when asked to not instrument a function [1], so we can't simply annotate functions on the critical path to avoid poisoning. Instead, this series explicitly removes any stale poison before it can be hit. In the common hotplug case we clear the entire stack shadow in common code, before a CPU is brought online. On architectures which perform a cold return as part of cpu idle may retain an architecture-specific amount of stack contents. To retain the poison for this retained context, the arch code must call the core KASAN code, passing a "watermark" stack pointer value beyond which shadow will be cleared. Architectures which don't perform a cold return as part of idle do not need any additional code. This patch (of 3): Functions which the compiler has instrumented for KASAN place poison on the stack shadow upon entry and remove this poision prior to returning. In some cases (e.g. hotplug and idle), CPUs may exit the kernel a number of levels deep in C code. If there are any instrumented functions on this critical path, these will leave portions of the stack shadow poisoned. If a CPU returns to the kernel via a different path (e.g. a cold entry), then depending on stack frame layout subsequent calls to instrumented functions may use regions of the stack with stale poison, resulting in (spurious) KASAN splats to the console. To avoid this, we must clear stale poison from the stack prior to instrumented functions being called. This patch adds functions to the KASAN core for removing poison from (portions of) a task's stack. These will be used by subsequent patches to avoid problems with hotplug and idle. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	mm: fix mixed zone detection in devm_memremap_pages	Dan Williams	1	-5/+6
	The check for whether we overlap "System RAM" needs to be done at section granularity. For example a system with the following mapping: 100000000-37bffffff : System RAM 37c000000-837ffffff : Persistent Memory ...is unable to use devm_memremap_pages() as it would result in two zones colliding within a given section. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	list: kill list_force_poison()	Dan Williams	3	-22/+7
	Given we have uninitialized list_heads being passed to list_add() it will always be the case that those uninitialized values randomly trigger the poison value. Especially since a list_add() operation will seed the stack with the poison value for later stack allocations to trip over. For example, see these two false positive reports: list_add attempted on force-poisoned entry WARNING: at lib/list_debug.c:34 [..] NIP [c00000000043c390] __list_add+0xb0/0x150 LR [c00000000043c38c] __list_add+0xac/0x150 Call Trace: __list_add+0xac/0x150 (unreliable) __down+0x4c/0xf8 down+0x68/0x70 xfs_buf_lock+0x4c/0x150 [xfs] list_add attempted on force-poisoned entry(0000000000000500), new->next == d0000000059ecdb0, new->prev == 0000000000000500 WARNING: at lib/list_debug.c:33 [..] NIP [c00000000042db78] __list_add+0xa8/0x140 LR [c00000000042db74] __list_add+0xa4/0x140 Call Trace: __list_add+0xa4/0x140 (unreliable) rwsem_down_read_failed+0x6c/0x1a0 down_read+0x58/0x60 xfs_log_commit_cil+0x7c/0x600 [xfs] Fixes: commit 5c2c2587b132 ("mm, dax, pmem: introduce {get\|put}_dev_pagemap() for dax-gup") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Eryu Guan <eguan@redhat.com> Tested-by: Eryu Guan <eguan@redhat.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	mm: __delete_from_page_cache show Bad page if mapped	Hugh Dickins	1	-1/+24
	Commit e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount() for compound pages") changed the famous BUG_ON(page_mapped(page)) in __delete_from_page_cache() to VM_BUG_ON_PAGE(page_mapped(page)): which gives us more info when CONFIG_DEBUG_VM=y, but nothing at all when not. Although it has not usually been very helpul, being hit long after the error in question, we do need to know if it actually happens on users' systems; but reinstating a crash there is likely to be opposed :) In the non-debug case, pr_alert("BUG: Bad page cache") plus dump_page(), dump_stack(), add_taint() - I don't really believe LOCKDEP_NOW_UNRELIABLE, but that seems to be the standard procedure now. Move that, or the VM_BUG_ON_PAGE(), up before the deletion from tree: so that the unNULLified page->mapping gives a little more information. If the inode is being evicted (rather than truncated), it won't have any vmas left, so it's safe(ish) to assume that the raised mapcount is erroneous, and we can discount it from page_count to avoid leaking the page (I'm less worried by leaking the occasional 4kB, than losing a potential 2MB page with each 4kB page leaked). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	mm/hugetlb: hugetlb_no_page: rate-limit warning message	Geoffrey Thomas	1	-1/+1
	The warning message "killed due to inadequate hugepage pool" simply indicates that SIGBUS was sent, not that the process was forcibly killed. If the process has a signal handler installed does not fix the problem, this message can rapidly spam the kernel log. On my amd64 dev machine that does not have hugepages configured, I can reproduce the repeated warnings easily by setting vm.nr_hugepages=2 (i.e., 4 megabytes of huge pages) and running something that sets a signal handler and forks, like #include <sys/mman.h> #include <signal.h> #include <stdlib.h> #include <unistd.h> sig_atomic_t counter = 10; void handler(int signal) { if (counter-- == 0) exit(0); } int main(void) { int status; char addr = mmap(NULL, 4 1048576, PROT_READ \| PROT_WRITE, MAP_PRIVATE \| MAP_ANONYMOUS \| MAP_HUGETLB, -1, 0); if (addr == MAP_FAILED) {perror("mmap"); return 1;} addr = 'x'; switch (fork()) { case -1: perror("fork"); return 1; case 0: signal(SIGBUS, handler); addr = 'x'; break; default: *addr = 'x'; wait(&status); if (WIFSIGNALED(status)) { psignal(WTERMSIG(status), "child"); } break; } } Signed-off-by: Geoffrey Thomas <geofft@ldpreload.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-09	tracing: Fix check for cpu online when event is disabled	Steven Rostedt (Red Hat)	1	-8/+9
	Commit f37755490fe9b ("tracepoints: Do not trace when cpu is offline") added a check to make sure that tracepoints only get called when the cpu is online, as it uses rcu_read_lock_sched() for protection. Commit 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") added lockdep checks (including rcu checks) for events that are not enabled to catch possible RCU issues that would only be triggered if a trace event was enabled. Commit f37755490fe9b only stopped the warnings when the trace event was enabled but did not prevent warnings if the trace event was called when disabled. To fix this, the cpu online check is moved to where the condition is added to the trace event. This will place the cpu online check in all places that it may be used now and in the future. Cc: stable@vger.kernel.org # v3.18+ Fixes: f37755490fe9b ("tracepoints: Do not trace when cpu is offline") Fixes: 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-09	arm64: hugetlb: partial revert of 66b3923a1a0f	Will Deacon	1	-14/+0
	Commit 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit") introduced support for huge pages using the contiguous bit in the PTE as opposed to block mappings, which may be slightly unwieldy (512M) in 64k page configurations. Unfortunately, this support has resulted in some late regressions when running the libhugetlbfs test suite with 64k pages and CONFIG_DEBUG_VM as a result of a BUG: \| readback (2M: 64): ------------[ cut here ]------------ \| kernel BUG at fs/hugetlbfs/inode.c:446! \| Internal error: Oops - BUG: 0 [#1] SMP \| Modules linked in: \| CPU: 7 PID: 1448 Comm: readback Not tainted 4.5.0-rc7 #148 \| Hardware name: linux,dummy-virt (DT) \| task: fffffe0040964b00 ti: fffffe00c2668000 task.ti: fffffe00c2668000 \| PC is at remove_inode_hugepages+0x44c/0x480 \| LR is at remove_inode_hugepages+0x264/0x480 Rather than revert the entire patch, simply avoid advertising the contiguous huge page sizes for now while people are actively working on a fix. This patch can then be reverted once things have been sorted out. Cc: David Woods <dwoods@ezchip.com> Reported-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-09	arm64: account for sparsemem section alignment when choosing vmemmap offset	Ard Biesheuvel	1	-2/+3
	Commit dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear region") fixed an issue where the struct page array would overflow into the adjacent virtual memory region if system RAM was placed so high up in physical memory that its addresses were not representable in the build time configured virtual address size. However, the fix failed to take into account that the vmemmap region needs to be relatively aligned with respect to the sparsemem section size, so that a sequence of page structs corresponding with a sparsemem section in the linear region appears naturally aligned in the vmemmap region. So round up vmemmap to sparsemem section size. Since this essentially moves the projection of the linear region up in memory, also revert the reduction of the size of the vmemmap region. Cc: <stable@vger.kernel.org> Fixes: dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear region") Tested-by: Mark Langsdorf <mlangsdo@redhat.com> Tested-by: David Daney <david.daney@cavium.com> Tested-by: Robert Richter <rrichter@cavium.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-09	kvm: cap halt polling at exactly halt_poll_ns	David Matlack	1	-0/+3
	When growing halt-polling, there is no check that the poll time exceeds the limit. It's possible for vcpu->halt_poll_ns grow once past halt_poll_ns, and stay there until a halt which takes longer than vcpu->halt_poll_ns. For example, booting a Linux guest with halt_poll_ns=11000: ... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 0 (shrink 10000) ... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (grow 0) ... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000) Signed-off-by: David Matlack <dmatlack@google.com> Fixes: aca6ff29c4063a8d467cdee241e6b3bf7dc4a171 Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-09	dmaengine: fsldma: fix memory leak	Xuelin Shi	1	-0/+2
	adding unmap of sources and destinations while doing dequeue. Signed-off-by: Xuelin Shi <xuelin.shi@nxp.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-03-09	device property: fwnode->secondary may contain ERR_PTR(-ENODEV)	Heikki Krogerus	1	-4/+4
	This fixes BUG triggered when fwnode->secondary is not NULL, but has ERR_PTR(-ENODEV) instead. BUG: unable to handle kernel paging request at ffffffffffffffed IP: [<ffffffff81677b86>] __fwnode_property_read_string+0x26/0x160 PGD 200e067 PUD 2010067 PMD 0 Oops: 0000 [#1] SMP KASAN Modules linked in: dwc3_pci(+) dwc3 CPU: 0 PID: 1138 Comm: modprobe Not tainted 4.5.0-rc5+ #61 task: ffff88015aaf5b00 ti: ffff88007b958000 task.ti: ffff88007b958000 RIP: 0010:[<ffffffff81677b86>] [<ffffffff81677b86>] __fwnode_property_read_string+0x26/0x160 RSP: 0018:ffff88007b95eff8 EFLAGS: 00010246 RAX: fffffbfffffffffd RBX: ffffffffffffffed RCX: ffff88015999cd37 RDX: dffffc0000000000 RSI: ffffffff81e11bc0 RDI: ffffffffffffffed RBP: ffff88007b95f020 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88007b90f7cf R11: 0000000000000000 R12: ffff88007b95f0a0 R13: 00000000fffffffa R14: ffffffff81e11bc0 R15: ffff880159ea37a0 FS: 00007ff35f46c700(0000) GS:ffff88015b800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffffffffffffed CR3: 000000007b8be000 CR4: 00000000001006f0 Stack: ffff88015999cd20 ffffffff81e11bc0 ffff88007b95f0a0 ffff88007b383dd8 ffff880159ea37a0 ffff88007b95f048 ffffffff81677d03 ffff88007b952460 ffffffff81e11bc0 ffff88007b95f0a0 ffff88007b95f070 ffffffff81677d40 Call Trace: [<ffffffff81677d03>] fwnode_property_read_string+0x43/0x50 [<ffffffff81677d40>] device_property_read_string+0x30/0x40 ... Fixes: 362c0b30249e (device property: Fallback to secondary fwnode if primary misses the property) Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-08	ACPICA: Revert "Parser: Fix for SuperName method invocation"	Bob Moore	1	-5/+4
	ACPICA commit eade8f78f2aa21e8eabc3380a5728db47273bcf1 Revert commit ae90fbf562d7 (ACPICA: Parser: Fix for SuperName method invocation). Support for method invocations as part of super_name will be removed from the ACPI specification, since no AML interpreter supports it. Fixes: ae90fbf562d7 (ACPICA: Parser: Fix for SuperName method invocation) Link: https://github.com/acpica/acpica/commit/eade8f78 Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-08	Revert "drm/radeon/pm: adjust display configuration after powerstate"	Alex Deucher	1	-3/+2
	This reverts commit 39d4275058baf53e89203407bf3841ff2c74fa32. This caused a regression on some older hardware. bug: https://bugzilla.kernel.org/show_bug.cgi?id=113891 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2016-03-08	drm/amdgpu/dp: add back special handling for NUTMEG	Alex Deucher	1	-4/+16
	When I fixed the dp rate selection in: 3b73b168cffd9c392584d3f665021fa2190f8612 drm/amdgpu: fix dp link rate selection (v2) I accidently dropped the special handling for NUTMEG DP bridge chips. They require a fixed link rate. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2016-03-08	drm/radeon/dp: add back special handling for NUTMEG	Alex Deucher	1	-4/+16
	When I fixed the dp rate selection in: 092c96a8ab9d1bd60ada2ed385cc364ce084180e drm/radeon: fix dp link rate selection (v2) I accidently dropped the special handling for NUTMEG DP bridge chips. They require a fixed link rate. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Tested-by: Ken Moffat <zarniwhoop@ntlworld.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2016-03-08	KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS	David Hildenbrand	1	-1/+1
	With MACHINE_HAS_VX, we convert the floating point registers from the vector registeres when storing the status. For other VCPUs, these are stored to vcpu->run->s.regs.vrs, but we are using current->thread.fpu.vxrs, which resolves to the currently loaded VCPU. So kvm_s390_store_status_unloaded() currently writes the wrong floating point registers (converted from the vector registers) when called from another VCPU on a z13. This is only the case for old user space not handling SIGP STORE STATUS and SIGP STOP AND STORE STATUS, but relying on the kernel implementation. All other calls come from the loaded VCPU via kvm_s390_store_status(). Fixes: 9abc2a08a7d6 (KVM: s390: fix memory overwrites when vx is disabled) Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-08	KVM: VMX: disable PEBS before a guest entry	Radim Krčmář	1	-0/+7
	Linux guests on Haswell (and also SandyBridge and Broadwell, at least) would crash if you decided to run a host command that uses PEBS, like perf record -e 'cpu/mem-stores/pp' -a This happens because KVM is using VMX MSR switching to disable PEBS, but SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it isn't safe: When software needs to reconfigure PEBS facilities, it should allow a quiescent period between stopping the prior event counting and setting up a new PEBS event. The quiescent period is to allow any latent residual PEBS records to complete its capture at their previously specified buffer address (provided by IA32_DS_AREA). There might not be a quiescent period after the MSR switch, so a CPU ends up using host's MSR_IA32_DS_AREA to access an area in guest's memory. (Or MSR switching is just buggy on some models.) The guest can learn something about the host this way: If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results in #PF where we leak host's MSR_IA32_DS_AREA through CR2. After that, a malicious guest can map and configure memory where MSR_IA32_DS_AREA is pointing and can therefore get an output from host's tracing. This is not a critical leak as the host must initiate with PEBS tracing and I have not been able to get a record from more than one instruction before vmentry in vmx_vcpu_run() (that place has most registers already overwritten with guest's). We could disable PEBS just few instructions before vmentry, but disabling it earlier shouldn't affect host tracing too much. We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that optimization isn't worth its code, IMO. (If you are implementing PEBS for guests, be sure to handle the case where both host and guest enable PEBS, because this patch doesn't.) Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.") Cc: <stable@vger.kernel.org> Reported-by: Jiří Olša <jolsa@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-08	s390/cpumf: Fix lpp detection	Christian Borntraeger	1	-1/+1
	we have to check bit 40 of the facility list before issuing LPP and not bit 48. Otherwise a guest running on a system with "The decimal-floating-point zoned-conversion facility" and without the "The set-program-parameters facility" might crash on an lpp instruction. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-07	jffs2: reduce the breakage on recovery from halfway failed rename()	Al Viro	1	-3/+8
	d_instantiate(new_dentry, old_inode) is absolutely wrong thing to do - it will oops if new_dentry used to be positive, for starters. What we need is d_invalidate() the target and be done with that. Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-07	ncpfs: fix a braino in OOM handling in ncp_fill_cache()	Al Viro	1	-1/+1
	Failing to allocate an inode for child means that cache for parent is incompletely populated. So it's parent directory inode ('dir') that needs NCPI_DIR_CACHE flag removed, not the child inode ('inode', which is what we'd failed to allocate in the first place). Fucked-up-in: commit 5e993e25 ("ncpfs: get rid of d_validate() nonsense") Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-08	KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit	Paul Mackerras	1	-0/+14
	Thomas Huth discovered that a guest could cause a hard hang of a host CPU by setting the Instruction Authority Mask Register (IAMR) to a suitable value. It turns out that this is because when the code was added to context-switch the new special-purpose registers (SPRs) that were added in POWER8, we forgot to add code to ensure that they were restored to a sane value on guest exit. This adds code to set those registers where a bad value could compromise the execution of the host kernel to a suitable neutral value on guest exit. Cc: stable@vger.kernel.org # v3.14+ Fixes: b005255e12a3 Reported-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2016-03-08	drm/i2c: tda998x: Choose between atomic or non atomic dpms helper	Jyri Sarha	1	-1/+9
	Choose between atomic or non atomic connector dpms helper. If tda998x is connected to a drm driver that does not support atomic modeset calling drm_atomic_helper_connector_dpms() causes a crash when the connectors atomic state is not initialized. The patch implements a driver specific connector dpms helper that calls drm_atomic_helper_connector_dpms() if driver supports DRIVER_ATOMIC and otherwise it calls the legacy drm_helper_connector_dpms(). Fixes commit 9736e988d328 ("drm/i2c: tda998x: Add support for atomic modesetting"). Signed-off-by: Jyri Sarha <jsarha@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-08	drm/vmwgfx: Add back ->detect() and ->fill_modes()	Thierry Reding	1	-0/+2
	This partially reverts commit d56f57ac969c ("drm/gma500: Move to private save/restore hooks") which removed these lines by mistake. Reported-by: Sebastian Herbszt <herbszt@gmx.de> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Thomas Hellstrom <thellstrom@vmware.com> Tested-by: Sebastian Herbszt <herbszt@gmx.de> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-07	Revert "drm/radeon: call hpd_irq_event on resume"	Linus Torvalds	1	-1/+0
	This reverts commit dbb17a21c131eca94eb31136eee9a7fe5aff00d9. It turns out that commit can cause problems for systems with multiple GPUs, and causes X to hang on at least a HP Pavilion dv7 with hybrid graphics. This got noticed originally in 4.4.4, where this patch had already gotten back-ported, but 4.5-rc7 was verified to have the same problem. Alexander Deucher says: "It looks like you have a muxed system so I suspect what's happening is that one of the display is being reported as connected for both the IGP and the dGPU and then the desktop environment gets confused or there some sort problem in the detect functions since the mux is not switched to the dGPU. I don't see an easy fix unless Dave has any ideas. I'd say just revert for now" Reported-by: Jörg-Volker Peetz <jvpeetz@web.de> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Dave Airlie <airlied@gmail.com> Cc: stable@kernel.org # wherever dbb17a21c131 got back-ported Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-07	ppp: release rtnl mutex when interface creation fails	Guillaume Nault	1	-0/+1
	Add missing rtnl_unlock() in the error path of ppp_create_interface(). Fixes: 58a89ecaca53 ("ppp: fix lockdep splat in ppp_dev_uninit()") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind	Bjørn Mork	1	-15/+5
	usbnet_link_change will call schedule_work and should be avoided if bind is failing. Otherwise we will end up with scheduled work referring to a netdev which has gone away. Instead of making the call conditional, we can just defer it to usbnet_probe, using the driver_info flag made for this purpose. Fixes: 8a34b0ae8778 ("usbnet: cdc_ncm: apply usbnet_link_change") Reported-by: Andrey Konovalov <andreyknvl@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	tcp: fix tcpi_segs_in after connection establishment	Eric Dumazet	1	-1/+2
	If final packet (ACK) of 3WHS is lost, it appears we do not properly account the following incoming segment into tcpi_segs_in While we are at it, starts segs_in with one, to count the SYN packet. We do not yet count number of SYN we received for a request sock, we might add this someday. packetdrill script showing proper behavior after fix : // Tests tcpi_segs_in when 3rd packet (ACK) of 3WHS is lost 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK> +.020 < P. 1:1001(1000) ack 1 win 32792 +0 accept(3, ..., ...) = 4 +.000 %{ assert tcpi_segs_in == 2, 'tcpi_segs_in=%d' % tcpi_segs_in }% Fixes: 2efd055c53c06 ("tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	net: hns: fix the bug about loopback	yankejian	5	-1/+65
	It will always be passed if the soc is tested the loopback cases. This patch will fix this bug. Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	jme: Fix device PM wakeup API usage	Guo-Fu Tseng	1	-4/+2
	According to Documentation/power/devices.txt The driver should not use device_set_wakeup_enable() which is the policy for user to decide. Using device_init_wakeup() to initialize dev->power.should_wakeup and dev->power.can_wakeup on driver initialization. And use device_may_wakeup() on suspend to decide if WoL function should be enabled on NIC. Reported-by: Diego Viola <diego.viola@gmail.com> Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	jme: Do not enable NIC WoL functions on S0	Guo-Fu Tseng	1	-6/+11
	Otherwise it might be back on resume right after going to suspend in some hardware. Reported-by: Diego Viola <diego.viola@gmail.com> Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr()	Krzysztof =?utf-8?Q?Ha=C5=82asa?=	1	-1/+3
	pci_create_root_bus() passes a "parent" pointer to pci_bus_assign_domain_nr(). When CONFIG_PCI_DOMAINS_GENERIC is defined, pci_bus_assign_domain_nr() dereferences that pointer. Many callers of pci_create_root_bus() supply a NULL "parent" pointer, which leads to a NULL pointer dereference error. 7c674700098c ("PCI: Move domain assignment from arm64 to generic code") moved the "parent" dereference from arm64 to generic code. Only arm64 used that code (because only arm64 defined CONFIG_PCI_DOMAINS_GENERIC), and it always supplied a valid "parent" pointer. Other arches supplied NULL "parent" pointers but didn't defined CONFIG_PCI_DOMAINS_GENERIC, so they used a no-op version of pci_bus_assign_domain_nr(). 8c7d14746abc ("ARM/PCI: Move to generic PCI domains") defined CONFIG_PCI_DOMAINS_GENERIC on ARM, and many ARM platforms use pci_common_init(), which supplies a NULL "parent" pointer. These platforms (cns3xxx, dove, footbridge, iop13xx, etc.) crash with a NULL pointer dereference like this while probing PCI: Unable to handle kernel NULL pointer dereference at virtual address 000000a4 PC is at pci_bus_assign_domain_nr+0x10/0x84 LR is at pci_create_root_bus+0x48/0x2e4 Kernel panic - not syncing: Attempted to kill init! [bhelgaas: changelog, add "Reported:" and "Fixes:" tags] Reported: http://forum.doozan.com/read.php?2,17868,22070,quote=1 Fixes: 8c7d14746abc ("ARM/PCI: Move to generic PCI domains") Fixes: 7c674700098c ("PCI: Move domain assignment from arm64 to generic code") Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> CC: stable@vger.kernel.org # v4.0+
2016-03-07	udp6: fix UDP/IPv6 encap resubmit path	Bill Sommerfeld	1	-4/+2
	IPv4 interprets a negative return value from a protocol handler as a request to redispatch to a new protocol. In contrast, IPv6 interprets a negative value as an error, and interprets a positive value as a request for redispatch. UDP for IPv6 was unaware of this difference. Change __udp6_lib_rcv() to return a positive value for redispatch. Note that the socket's encap_rcv hook still needs to return a negative value to request dispatch, and in the case of IPv6 packets, adjust IP6CB(skb)->nhoff to identify the byte containing the next protocol. Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	be2net: Don't leak iomapped memory on removal.	Douglas Miller	2	-0/+5
	The adapter->pcicfg resource is either mapped via pci_iomap() or derived from adapter->db. During be_remove() this resource was ignored and so could remain mapped after remove. Add a flag to track whether adapter->pcicfg was mapped or not, then use that flag in be_unmap_pci_bars() to unmap if required. Fixes: 25848c901 ("use PCI MMIO read instead of config read for errors") Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	vmxnet3: avoid calling pskb_may_pull with interrupts disabled	Neil Horman	1	-27/+46
	vmxnet3 has a function vmxnet3_parse_and_copy_hdr which, among other operations, uses pskb_may_pull to linearize the header portion of an skb. That operation eventually uses local_bh_disable/enable to ensure that it doesn't race with the drivers bottom half handler. Unfortunately, vmxnet3 preforms this parse_and_copy operation with a spinlock held and interrupts disabled. This causes us to run afoul of the WARN_ON_ONCE(irqs_disabled()) warning in local_bh_enable, resulting in this: WARNING: at kernel/softirq.c:159 local_bh_enable+0x59/0x90() (Not tainted) Hardware name: VMware Virtual Platform Modules linked in: ipv6 ppdev parport_pc parport microcode e1000 vmware_balloon vmxnet3 i2c_piix4 sg ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix vmwgfx ttm drm_kms_helper drm i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf] Pid: 6229, comm: sshd Not tainted 2.6.32-616.el6.i686 #1 Call Trace: [<c04624d9>] ? warn_slowpath_common+0x89/0xe0 [<c0469e99>] ? local_bh_enable+0x59/0x90 [<c046254b>] ? warn_slowpath_null+0x1b/0x20 [<c0469e99>] ? local_bh_enable+0x59/0x90 [<c07bb936>] ? skb_copy_bits+0x126/0x210 [<f8d1d9fe>] ? ext4_ext_find_extent+0x24e/0x2d0 [ext4] [<c07bc49e>] ? __pskb_pull_tail+0x6e/0x2b0 [<f95a6164>] ? vmxnet3_xmit_frame+0xba4/0xef0 [vmxnet3] [<c05d15a6>] ? selinux_ip_postroute+0x56/0x320 [<c0615988>] ? cfq_add_rq_rb+0x98/0x110 [<c0852df8>] ? packet_rcv+0x48/0x350 [<c07c5839>] ? dev_queue_xmit_nit+0xc9/0x140 ... Fix it by splitting vmxnet3_parse_and_copy_hdr into two functions: vmxnet3_parse_hdr, which sets up the internal/on stack ctx datastructure, and pulls the skb (both of which can be done without holding the spinlock with irqs disabled and vmxnet3_copy_header, which just copies the skb to the tx ring under the lock safely. tested and shown to correct the described problem. Applies cleanly to the head of the net tree Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: "David S. Miller" <davem@davemloft.net> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM	Krzysztof Kozlowski	1	-0/+1
	The MFD_SYSCON depends on HAS_IOMEM so when selecting it avoid unmet direct dependencies. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	ibmveth: check return of skb_linearize in ibmveth_start_xmit	Thomas Falcon	1	-1/+4
	If skb_linearize fails, the driver should drop the packet instead of trying to copy it into the bounce buffer. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07	cdc_ncm: toggle altsetting to force reset before setup	Bjørn Mork	1	-1/+5
	Some devices will silently fail setup unless they are reset first. This is necessary even if the data interface is already in altsetting 0, which it will be when the device is probed for the first time. Briefly toggling the altsetting forces a function reset regardless of the initial state. This fixes a setup problem observed on a number of Huawei devices, appearing to operate in NTB-32 mode even if we explicitly set them to NTB-16 mode. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>