linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2014-03-04	at86rf230: add help and 212 to Kconfig menu entry	Alexander Aring	1	-1/+7
	Since commit 8fad346f366a72978ea942abd06bd501ebd39c22 (ieee802154: add basic support for RF212 to at86rf230 driver) we support at86rf212 as well. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-04	be2net: dma_sync each RX frag before passing it to the stack	Sathya Perla	2	-11/+24
	The driver currently maps a page for DMA, divides the page into multiple frags and posts them to the HW. It un-maps the page after data is received on all the frags of the page. This scheme doesn't work when bounce buffers are used for DMA (swiotlb=force kernel param). This patch fixes this problem by calling dma_sync_single_for_cpu() for each frag (excepting the last one) so that the data is copied from the bounce buffers. The page is un-mapped only when DMA finishes on the last frag of the page. (Thanks Ben H. for suggesting the dma_sync API!) This patch also renames the "last_page_user" field of be_rx_page_info{} struct to "last_frag" to improve readability of the fixed code. Reported-by: Li Fengmao <li.fengmao@zte.com.cn> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-04	cfg80211: add MPLS and 802.21 classification	Simon Wunderlich	1	-0/+16
	MPLS labels may contain traffic control information, which should be evaluated and used by the wireless subsystem if present. Also check for IEEE 802.21 which is always network control traffic. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-04	UAPI: add MPLS label stack definition	Simon Wunderlich	2	-0/+40
	Labels for the Multiprotocol Label Switching are defined in RFC 3032 which was superseded by RFC 5462. Add the definition to UAPI and a stub header for include/linux. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-04	if_ether.h: add IEEE 802.21 Ethertype	Simon Wunderlich	1	-0/+1
	Add the Ethertype for IEEE Std 802.21 - Media Independent Handover Protocol. This Ethertype is used for network control messages. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-04	mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness	Johannes Weiner	1	-4/+22
	Jan Stancek reports manual page migration encountering allocation failures after some pages when there is still plenty of memory free, and bisected the problem down to commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy"). The problem is that GFP_THISNODE obeys the zone fairness allocation batches on one hand, but doesn't reset them and wake kswapd on the other hand. After a few of those allocations, the batches are exhausted and the allocations fail. Fixing this means either having GFP_THISNODE wake up kswapd, or GFP_THISNODE not participating in zone fairness at all. The latter seems safer as an acute bugfix, we can clean up later. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: <stable@kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS	Liu Ping Fan	1	-2/+2
	When doing some numa tests on powerpc, I triggered an oops bug. I find it is caused by using page->_last_cpupid. It should be initialized as "-1 & LAST_CPUPID_MASK", but not "-1". Otherwise, in task_numa_fault(), we will miss the checking (last_cpupid == (-1 & LAST_CPUPID_MASK)). And finally cause an oops bug in task_numa_group(), since the online cpu is less than possible cpu. This happen with CONFIG_SPARSE_VMEMMAP disabled Call trace: SMP NR_CPUS=64 NUMA PowerNV Modules linked in: CPU: 24 PID: 804 Comm: systemd-udevd Not tainted3.13.0-rc1+ #32 task: c000001e2746aa80 ti: c000001e32c50000 task.ti:c000001e32c50000 REGS: c000001e32c53510 TRAP: 0300 Not tainted(3.13.0-rc1+) MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR:28024424 XER: 20000000 CFAR: c000000000009324 DAR: 7265717569726857 DSISR:40000000 SOFTE: 1 NIP .task_numa_fault+0x1470/0x2370 LR .task_numa_fault+0x1468/0x2370 Call Trace: .task_numa_fault+0x1468/0x2370 (unreliable) .do_numa_page+0x480/0x4a0 .handle_mm_fault+0x4ec/0xc90 .do_page_fault+0x3a8/0x890 handle_page_fault+0x10/0x30 Instruction dump: 3c82fefb 3884b138 48d9cff1 60000000 48000574 3c62fefb3863af78 3c82fefb 3884b138 48d9cfd5 60000000 e93f0100 <812902e4> 7d2907b45529063e 7d2a07b4 ---[ end trace 15f2510da5ae07cf ]--- Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	MAINTAINERS: add and correct types of some "T:" entries	Joe Perches	1	-8/+9
	Tree location entries should start with the appropriate type. Add git to some, hg to another. Neaten tree type description. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	MAINTAINERS: use tab for separator	Joe Perches	1	-44/+44
	Convert whitespace to single tab for separators. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	rapidio/tsi721: fix tasklet termination in dma channel release	Alexandre Bounine	2	-9/+19
	This patch is a modification of the patch originally proposed by Xiaotian Feng <xtfeng@gmail.com>: https://lkml.org/lkml/2012/11/5/413 This new version disables DMA channel interrupts and ensures that the tasklet wil not be scheduled again before calling tasklet_kill(). Unfortunately the updated patch was not released at that time due to planned rework of Tsi721 mport driver to use threaded interrupts (which has yet to happen). Recently the issue was reported again: https://lkml.org/lkml/2014/2/19/762. Description from the original Xiaotian's patch: "Some drivers use tasklet_disable in device remove/release process, tasklet_disable will inc tasklet->count and return. If the tasklet is not handled yet under some softirq pressure, the tasklet will be placed on the tasklet_vec, never have a chance to be excuted. This might lead to a heavy loaded ksoftirqd, wakeup with pending_softirq, but tasklet is disabled. tasklet_kill should be used in this case." This patch is applicable to kernel versions starting from v3.5. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Xiaotian Feng <xtfeng@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <bitbucket@online.de> Cc: <stable@vger.kernel.org> [3.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	hfsplus: fix remount issue	Vyacheslav Dubeyko	1	-1/+1
	Current implementation of HFS+ driver has small issue with remount option. Namely, for example, you are unable to remount from RO mode into RW mode by means of command "mount -o remount,rw /dev/loop0 /mnt/hfsplus". Trying to execute sequence of commands results in an error message: mount /dev/loop0 /mnt/hfsplus mount -o remount,ro /dev/loop0 /mnt/hfsplus mount -o remount,rw /dev/loop0 /mnt/hfsplus mount: you must specify the filesystem type mount -t hfsplus -o remount,rw /dev/loop0 /mnt/hfsplus mount: /mnt/hfsplus not mounted or bad option The reason of such issue is failure of mount syscall: mount("/dev/loop0", "/mnt/hfsplus", 0x2282a60, MS_MGC_VAL\|MS_REMOUNT, NULL) = -1 EINVAL (Invalid argument) Namely, hfsplus_parse_options_remount() method receives empty "input" argument and return false in such case. As a result, hfsplus_remount() returns -EINVAL error code. This patch fixes the issue by means of return true for the case of empty "input" argument in hfsplus_parse_options_remount() method. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	zram: avoid null access when fail to alloc meta	Minchan Kim	1	-0/+2
	zram_meta_alloc could fail so caller should check it. Otherwise, your system will hang. Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	sh: prefix sh-specific "CCR" and "CCR2" by "SH_"	Geert Uytterhoeven	11	-18/+20
	Commit bcf24e1daa94 ("mmc: omap_hsmmc: use the generic config for omap2plus devices"), enabled the build for other platforms for compile testing. sh-allmodconfig now fails with: include/linux/omap-dma.h:171:8: error: expected identifier before numeric constant make[4]: *** [drivers/mmc/host/omap_hsmmc.o] Error 1 This happens because SuperH #defines "CCR", which is one of the enum values in include/linux/omap-dma.h. There's a similar issue with "CCR2" on sh2a. As "CCR" and "CCR2" are too generic names for global #defines, prefix them with "SH_" to fix this. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	ocfs2: fix quota file corruption	Jan Kara	2	-14/+17
	Global quota files are accessed from different nodes. Thus we cannot cache offset of quota structure in the quota file after we drop our node reference count to it because after that moment quota structure may be freed and reallocated elsewhere by a different node resulting in corruption of quota file. Fix the problem by clearing dq_off when we are releasing dquot structure. We also remove the DB_READ_B handling because it is useless - DQ_ACTIVE_B is set iff DQ_READ_B is set. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX	Vikas Sajjan	1	-5/+12
	On exynos5250, exynos5420 and exynos5260 it was observed that, after 1 cycle of S2R, the rtc-tick occurs at a very fast rate as compared to the rtc-tick occuring before S2R. This patch fixes the above issue by correcting the wrong way of save/restore of S3C2410_TICNT for TYPE_S3C64XX. Signed-off-by: Vikas Sajjan <vikas.sajjan@samsung.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	kallsyms: fix absolute addresses for kASLR	Andy Honig	1	-2/+1
	Currently symbols that are absolute addresses are incorrectly displayed in /proc/kallsyms if the kernel is loaded with kASLR. The problem was that the scripts/kallsyms.c file which generates the array of symbol names and addresses uses an relocatable value for all symbols, even absolute symbols. This patch fixes that. Several kallsyms output in different boot states for comparison: $ egrep '_(stext\|_per_cpu_(start\|end))' /root/kallsyms.nokaslr 0000000000000000 D __per_cpu_start 0000000000014280 D __per_cpu_end ffffffff810001c8 T _stext $ egrep '_(stext\|_per_cpu_(start\|end))' /root/kallsyms.kaslr1 000000001f200000 D __per_cpu_start 000000001f214280 D __per_cpu_end ffffffffa02001c8 T _stext $ egrep '_(stext\|_per_cpu_(start\|end))' /root/kallsyms.kaslr2 000000000d400000 D __per_cpu_start 000000000d414280 D __per_cpu_end ffffffff8e4001c8 T _stext $ egrep '_(stext\|_per_cpu_(start\|end))' /root/kallsyms.kaslr-fixed 0000000000000000 D __per_cpu_start 0000000000014280 D __per_cpu_end ffffffffadc001c8 T _stext Signed-off-by: Andy Honig <ahonig@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression	Daniel M. Weeks	1	-1/+1
	LZ4 as implemented in the kernel differs from the default method now used by the reference implementation of LZ4. Until the in-kernel method is updated to support the new default, passing the legacy flag (-l) to the compressor is necessary. Without this flag the kernel-generated, LZ4-compressed initramfs is junk. Kyungsik said: : It seems that lz4 supports legacy format with the same option as lz4c : does. Just looking at the first few bytes of lz4 compressed image, we can : see whether it is new format or not. : : It shows new format magic number without this patch. New format magic : number is 0x184d2204. : : $ hexdump -C ./initramfs_data.cpio.lz4 \|more : 00000000 04 22 4d 18 64 70 b9 69 (Little Endian) : ... : : Currently kernel supports legacy format only. Signed-off-by: Daniel M. Weeks <dan@danweeks.net> Cc: Michal Marek <mmarek@suse.cz> Acked-by: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking	Vlastimil Babka	2	-2/+2
	Daniel Borkmann reported a VM_BUG_ON assertion failing: ------------[ cut here ]------------ kernel BUG at mm/mlock.c:528! invalid opcode: 0000 [#1] SMP Modules linked in: ccm arc4 iwldvm [...] video CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8 Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012 task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000 RIP: 0010:[<ffffffff81171ad0>] [<ffffffff81171ad0>] munlock_vma_pages_range+0x2e0/0x2f0 Call Trace: do_munmap+0x18f/0x3b0 vm_munmap+0x41/0x60 SyS_munmap+0x22/0x30 system_call_fastpath+0x1a/0x1f RIP munlock_vma_pages_range+0x2e0/0x2f0 ---[ end trace a0088dcf07ae10f2 ]--- because munlock_vma_pages_range() thinks it's unexpectedly in the middle of a THP page. This can be reproduced with default config since 3.11 kernels. A reproducer can be found in the kernel's selftest directory for networking by running ./psock_tpacket. The problem is that an order=2 compound page (allocated by alloc_one_pg_vec_page() is part of the munlocked VM_MIXEDMAP vma (mapped by packet_mmap()) and mistaken for a THP page and assumed to be order=9. The checks for THP in munlock came with commit ff6a6da60b89 ("mm: accelerate munlock() treatment of THP pages"), i.e. since 3.9, but did not trigger a bug. It just makes munlock_vma_pages_range() skip such compound pages until the next 512-pages-aligned page, when it encounters a head page. This is however not a problem for vma's where mlocking has no effect anyway, but it can distort the accounting. Since commit 7225522bb429 ("mm: munlock: batch non-THP page isolation and munlock+putback using pagevec") this can trigger a VM_BUG_ON in PageTransHuge() check. This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a list of flags that make vma's non-mlockable and non-mergeable. The reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is already on the VM_SPECIAL list, and both are intended for non-LRU pages where mlocking makes no sense anyway. Related Lkml discussion can be found in [2]. [1] tools/testing/selftests/net/psock_tpacket [2] https://lkml.org/lkml/2014/1/10/427 Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Reported-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Daniel Borkmann <dborkman@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: John David Anglin <dave.anglin@bell.net> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Jared Hulbert <jaredeh@gmail.com> Tested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [3.11.x+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	memcg: reparent charges of children before processing parent	Filipe Brandenburger	1	-1/+9
	Sometimes the cleanup after memcg hierarchy testing gets stuck in mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0. There may turn out to be several causes, but a major cause is this: the workitem to offline parent can get run before workitem to offline child; parent's mem_cgroup_reparent_charges() circles around waiting for the child's pages to be reparented to its lrus, but it's holding cgroup_mutex which prevents the child from reaching its mem_cgroup_reparent_charges(). Further testing showed that an ordered workqueue for cgroup_destroy_wq is not always good enough: percpu_ref_kill_and_confirm's call_rcu_sched stage on the way can mess up the order before reaching the workqueue. Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on all its children (and grandchildren, in the correct order) to have their charges reparented first. Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction") Signed-off-by: Filipe Brandenburger <filbranden@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Tejun Heo <tj@kernel.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [v3.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	memcg: fix endless loop in __mem_cgroup_iter_next()	Hugh Dickins	1	-2/+2
	Commit 0eef615665ed ("memcg: fix css reference leak and endless loop in mem_cgroup_iter") got the interaction with the commit a few before it d8ad30559715 ("mm/memcg: iteration skip memcgs not yet fully initialized") slightly wrong, and we didn't notice at the time. It's elusive, and harder to get than the original, but for a couple of days before rc1, I several times saw a endless loop similar to that supposedly being fixed. This time it was a tighter loop in __mem_cgroup_iter_next(): because we can get here when our root has already been offlined, and the ordering of conditions was such that we then just cycled around forever. Fixes: 0eef615665ed ("memcg: fix css reference leak and endless loop in mem_cgroup_iter"). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock	Hugh Dickins	1	-1/+3
	Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of BUG: sleeping function called from invalid context at kernel/fork.c:606 in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff 1 lock held by swapoff/1394: #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6 followed by ================================================ [ BUG: lock held when returning to user space! ] 3.14.0-rc1 #3 Not tainted ------------------------------------------------ swapoff/1394 is leaving the kernel with locks still held! 1 lock held by swapoff/1394: #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6 after which the system recovered nicely. Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch. Fixes e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	dma debug: account for cachelines and read-only mappings in overlap tracking	Dan Williams	1	-46/+85
	While debug_dma_assert_idle() checks if a given page is actively undergoing dma the valid granularity of a dma mapping is a cacheline. Sander's testing shows that the warning message "DMA-API: exceeded 7 overlapping mappings of pfn..." is falsely triggering. The test is simply mapping multiple cachelines in a given page. Ultimately we want overlap tracking to be valid as it is a real api violation, so we need to track active mappings by cachelines. Update the active dma tracking to use the page-frame-relative cacheline of the mapping as the key, and update debug_dma_assert_idle() to check for all possible mapped cachelines for a given page. However, the need to track active mappings is only relevant when the dma-mapping is writable by the device. In fact it is fairly standard for read-only mappings to have hundreds or thousands of overlapping mappings at once. Limiting the overlap tracking to writable (!DMA_TO_DEVICE) eliminates this class of false-positive overlap reports. Note, the radix gang lookup is sub-optimal. It would be best if it stopped fetching entries once the search passed a page boundary. Nevertheless, this implementation does not perturb the original net_dma failing case. That is to say the extra overhead does not show up in terms of making the failing case pass due to a timing change. References: http://marc.info/?l=linux-netdev&m=139232263419315&w=2 http://marc.info/?l=linux-netdev&m=139217088107122&w=2 Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	mm: close PageTail race	David Rientjes	9	-55/+25
	Commit bf6bddf1924e ("mm: introduce compaction and migration for ballooned pages") introduces page_count(page) into memory compaction which dereferences page->first_page if PageTail(page). This results in a very rare NULL pointer dereference on the aforementioned page_count(page). Indeed, anything that does compound_head(), including page_count() is susceptible to racing with prep_compound_page() and seeing a NULL or dangling page->first_page pointer. This patch uses Andrea's implementation of compound_trans_head() that deals with such a race and makes it the default compound_head() implementation. This includes a read memory barrier that ensures that if PageTail(head) is true that we return a head page that is neither NULL nor dangling. The patch then adds a store memory barrier to prep_compound_page() to ensure page->first_page is set. This is the safest way to ensure we see the head page that we are expecting, PageTail(page) is already in the unlikely() path and the memory barriers are unfortunately required. Hugetlbfs is the exception, we don't enforce a store memory barrier during init since no race is possible. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Holger Kiehl <Holger.Kiehl@dwd.de> Cc: Christoph Lameter <cl@linux.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04	MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors	Borislav Petkov	1	-0/+2
	We're more or less collecting EDAC patches already anyway so let's hold it down so that get_maintainer sees it too. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-03	macvlan: Add support for 'always_on' offload features	Vlad Yasevich	1	-2/+5
	Macvlan currently inherits all of its features from the lower device. When lower device disables offload support, this causes macvlan to disable offload support as well. This causes performance regression when using macvlan/macvtap in bridge mode. It can be easily demonstrated by creating 2 namespaces using macvlan in bridge mode and running netperf between them: MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.00 1204.61 To restore the performance, we add software offload features to the list of "always_on" features for macvlan. This way when a namespace or a guest using macvtap initially sends a packet, this packet will not be segmented at macvlan level. It will only be segmented when macvlan sends the packet to the lower device. MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.00 5507.35 Fixes: 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 (macvtap: Add support of packet capture on macvtap device.) Fixes: 797f87f83b60685ff8a13fa0572d2f10393c50d3 (macvlan: fix netdev feature propagation from lower device) CC: Florian Westphal <fw@strlen.de> CC: Christian Borntraeger <borntraeger@de.ibm.com> CC: Jason Wang <jasowang@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable	Daniel Borkmann	1	-0/+7
	RFC4895 introduced AUTH chunks for SCTP; during the SCTP handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS being optional though): ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- A special case is when an endpoint requires COOKIE-ECHO chunks to be authenticated: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- ------------------ AUTH; COOKIE-ECHO ----------------> <-------------------- COOKIE-ACK --------------------- RFC4895, section 6.3. Receiving Authenticated Chunks says: The receiver MUST use the HMAC algorithm indicated in the HMAC Identifier field. If this algorithm was not specified by the receiver in the HMAC-ALGO parameter in the INIT or INIT-ACK chunk during association setup, the AUTH chunk and all the chunks after it MUST be discarded and an ERROR chunk SHOULD be sent with the error cause defined in Section 4.1. [...] If no endpoint pair shared key has been configured for that Shared Key Identifier, all authenticated chunks MUST be silently discarded. [...] When an endpoint requires COOKIE-ECHO chunks to be authenticated, some special procedures have to be followed because the reception of a COOKIE-ECHO chunk might result in the creation of an SCTP association. If a packet arrives containing an AUTH chunk as a first chunk, a COOKIE-ECHO chunk as the second chunk, and possibly more chunks after them, and the receiver does not have an STCB for that packet, then authentication is based on the contents of the COOKIE-ECHO chunk. In this situation, the receiver MUST authenticate the chunks in the packet by using the RANDOM parameters, CHUNKS parameters and HMAC_ALGO parameters obtained from the COOKIE-ECHO chunk, and possibly a local shared secret as inputs to the authentication procedure specified in Section 6.3. If authentication fails, then the packet is discarded. If the authentication is successful, the COOKIE-ECHO and all the chunks after the COOKIE-ECHO MUST be processed. If the receiver has an STCB, it MUST process the AUTH chunk as described above using the STCB from the existing association to authenticate the COOKIE-ECHO chunk and all the chunks after it. [...] Commit bbd0d59809f9 introduced the possibility to receive and verification of AUTH chunk, including the edge case for authenticated COOKIE-ECHO. On reception of COOKIE-ECHO, the function sctp_sf_do_5_1D_ce() handles processing, unpacks and creates a new association if it passed sanity checks and also tests for authentication chunks being present. After a new association has been processed, it invokes sctp_process_init() on the new association and walks through the parameter list it received from the INIT chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO and SCTP_PARAM_CHUNKS, and copies them into asoc->peer meta data (peer_random, peer_hmacs, peer_chunks) in case sysctl -w net.sctp.auth_enable=1 is set. If in INIT's SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set, peer_random != NULL and peer_hmacs != NULL the peer is to be assumed asoc->peer.auth_capable=1, in any other case asoc->peer.auth_capable=0. Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is available, we set up a fake auth chunk and pass that on to sctp_sf_authenticate(), which at latest in sctp_auth_calculate_hmac() reliably dereferences a NULL pointer at position 0..0008 when setting up the crypto key in crypto_hash_setkey() by using asoc->asoc_shared_key that is NULL as condition key_id == asoc->active_key_id is true if the AUTH chunk was injected correctly from remote. This happens no matter what net.sctp.auth_enable sysctl says. The fix is to check for net->sctp.auth_enable and for asoc->peer.auth_capable before doing any operations like sctp_sf_authenticate() as no key is activated in sctp_auth_asoc_init_active_key() for each case. Now as RFC4895 section 6.3 states that if the used HMAC-ALGO passed from the INIT chunk was not used in the AUTH chunk, we SHOULD send an error; however in this case it would be better to just silently discard such a maliciously prepared handshake as we didn't even receive a parameter at all. Also, as our endpoint has no shared key configured, section 6.3 says that MUST silently discard, which we are doing from now onwards. Before calling sctp_sf_pdiscard(), we need not only to free the association, but also the chunk->auth_chunk skb, as commit bbd0d59809f9 created a skb clone in that case. I have tested this locally by using netfilter's nfqueue and re-injecting packets into the local stack after maliciously modifying the INIT chunk (removing RANDOM; HMAC-ALGO param) and the SCTP packet containing the COOKIE_ECHO (injecting AUTH chunk before COOKIE_ECHO). Fixed with this patch applied. Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <yasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	tcp: snmp stats for Fast Open, SYN rtx, and data pkts	Yuchung Cheng	7	-4/+24
	Add the following snmp stats: TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed beacuse the remote does not accept it or the attempts timed out. TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down retransmissions into SYN, fast-retransmits, timeout retransmits, etc. TCPOrigDataSent: number of outgoing packets with original data (excluding retransmission but including data-in-SYN). This counter is different from TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is more useful to track the TCP retransmission rate. Change TCPFastOpenActive to track only successful Fast Opens to be symmetric to TCPFastOpenPassive. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: Lawrence Brakmo <brakmo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer	Xin Long	1	-1/+0
	when ip_tunnel process multicast packets, it may check if the packet is looped back packet though 'rt_is_output_route(skb_rtable(skb))' in ip_tunnel_rcv(), but before that , skb->_skb_refdst has been dropped in iptunnel_pull_header(), so which leads to a panic. fix the bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681 Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	net: cpsw: fix cpdma rx descriptor leak on down interface	Schuyler Patton	1	-0/+6
	This patch fixes a CPDMA RX Descriptor leak that occurs after taking the interface down when the CPSW is in Dual MAC mode. Previously the CPSW_ALE port was left open up which causes packets to be received and processed by the RX interrupt handler and were passed to the non active network interface where they were ignored. The fix is for the slave_stop function of the selected interface to disable the respective CPSW_ALE Port from forwarding packets. This blocks traffic from being received on the inactive interface. Signed-off-by: Schuyler Patton <spatton@ti.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	be2net: isolate TX workarounds not applicable to Skyhawk-R	Vasundhara Volam	1	-13/+26
	Some of TX workarounds in be_xmit_workarounds() routine are not applicable (and result in HW errors) to Skyhawk-R chip. Isolate BE3-R/Lancer specific workarounds to a separate routine. Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	be2net: Fix skb double free in be_xmit_wrokarounds() failure path	Vasundhara Volam	1	-3/+4
	skb_padto(), skb_share_check() and __vlan_put_tag() routines free skb when they return an error. This patch fixes be_xmit_workarounds() to not free skb again in such cases. Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode	Somnath kotur	1	-3/+9
	We should clear promiscuous bits in adapter->flags while disabling promiscuous mode. Else we will not put interface back into VLAN promisc mode if the vlans already added exceeds the maximum limit. Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	be2net: Fix to reset transparent vlan tagging	Somnath Kotur	2	-20/+12
	For disabling transparent tagging issue SET_HSW_CONFIG with pvid_valid=1 and pvid=0xFFFF and not with the default pvid as this case would fail in Lancer. Hence removing the get_hsw_config call from be_vf_setup() as it's only use of getting default pvid is no longer needed. Also do proper housekeeping only if the FW command succeeds. Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	sch_tbf: Remove holes in struct tbf_sched_data.	Hiroaki SHIMODA	1	-11/+12
	On x86_64 we have 3 holes in struct tbf_sched_data. The member peak_present can be replaced with peak.rate_bytes_ps, because peak.rate_bytes_ps is set only when peak is specified in tbf_change(). tbf_peak_present() is introduced to test peak.rate_bytes_ps. The member max_size is moved to fill 32bit hole. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	ieee802154: fix at86rf212_set_txpower() exit path	Jean Sacren	1	-6/+1
	The commit 9b2777d6089bc ("ieee802154: add TX power control to wpan_phy") introduced the new function at86rf212_set_txpower() with the questionable check of the return of __at86rf230_write() in the exit path: 1) Both at86rf212_set_txpower() and __at86rf230_write() have the same return type. 2) Whatever __at86rf230_write() returns becomes the return value of at86rf212_set_txpower(). Thus, fix the exit path by getting rid of that check entirely. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	connector: remove duplicated code in cn_call_callback()	Alexey Khoroshilov	1	-1/+0
	There were a couple of patches fixing the same bug that results in duplicated err = 0; assignment. The patch removes one of them. Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	qlcnic: dcb: a couple off by one bugs	Dan Carpenter	1	-2/+2
	The ->tc_cfg[] array has QLC_DCB_MAX_TC (8) elements so the check is off by one. These functions are always called with valid values though so it doesn't affect how the code works. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	tcp: fix bogus RTT on special retransmission	Yuchung Cheng	2	-4/+10
	RTT may be bogus with tall loss probe (TLP) when a packet is retransmitted and latter (s)acked without TCPCB_SACKED_RETRANS flag. For example, TLP calls __tcp_retransmit_skb() instead of tcp_retransmit_skb(). The skb timestamps are updated but the sacked flag is not marked with TCPCB_SACKED_RETRANS. As a result we'll get bogus RTT in tcp_clean_rtx_queue() or in tcp_sacktag_one() on spurious retransmission. The fix is to apply the sticky flag TCP_EVER_RETRANS to enforce Karn's check on RTT sampling. However this will disable F-RTO if timeout occurs after TLP, by resetting undo_marker in tcp_enter_loss(). We relax this check to only if any pending retransmists are still in-flight. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	hsr: off by one sanity check in hsr_register_frame_in()	Dan Carpenter	1	-1/+1
	This is a sanity check and we never pass invalid values so this patch doesn't change anything. However the node->time_in[] array has HSR_MAX_SLAVE (2) elements and not HSR_MAX_DEV (3). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03	can: remove CAN FD compatibility for CAN 2.0 sockets	Oliver Hartkopp	2	-27/+5
	In commit e2d265d3b587 (canfd: add support for CAN FD in CAN_RAW sockets) CAN FD frames with a payload length up to 8 byte are passed to legacy sockets where the CAN FD support was not enabled by the application. After some discussions with developers at a fair this well meant feature leads to confusion as no clean switch for CAN / CAN FD is provided to the application programmer. Additionally a compatibility like this for legacy CAN_RAW sockets requires some compatibility handling for the sending, e.g. make CAN2.0 frames a CAN FD frame with BRS at transmission time (?!?). This will become a mess when people start to develop applications with real CAN FD hardware. This patch reverts the bad compatibility code together with the documentation describing the removed feature. Acked-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03	can: flexcan: factor out soft reset into seperate funtion	Marc Kleine-Budde	1	-9/+17
	This patch moves the soft reset into a seperate function. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03	can: flexcan: flexcan_remove(): add missing netif_napi_del()	Marc Kleine-Budde	1	-1/+2
	This patch adds the missing netif_napi_del() to the flexcan_remove() function. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03	can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze	Marc Kleine-Budde	1	-11/+49
	This patch factors out freeze and unfreeze of the CAN core into seperate functions. Experiments have shown that the transition from and to freeze mode may take several microseconds, especially the time entering the freeze mode depends on the current bitrate. This patch adds a while loop which polls the Freeze Mode ACK bit (FRZ_ACK) that indicates a successfull mode change. If the function runs into a timeout a error value is returned. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03	can: flexcan: factor out transceiver {en,dis}able into seperate functions	Marc Kleine-Budde	1	-7/+20
	This patch moves the transceiver enable and disable into seperate functions, where the NULL pointer check is hidden. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03	can: flexcan: fix transition from and to low power mode in chip_{en,dis}able	Marc Kleine-Budde	1	-12/+38
	In flexcan_chip_enable() and flexcan_chip_disable() fixed delays are used. Experiments have shown that the transition from and to low power mode may take several microseconds. This patch adds a while loop which polls the Low Power Mode ACK bit (LPM_ACK) that indicates a successfull mode change. If the function runs into a timeout a error value is returned. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03	can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails	Marc Kleine-Budde	1	-1/+3
	If flexcan_chip_start() in flexcan_open() fails, the interrupt is not freed, this patch adds the missing cleanup. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03	can: flexcan: fix shutdown: first disable chip, then all interrupts	Marc Kleine-Budde	1	-3/+5
	When shutting down the CAN interface (ifconfig canX down) during high CAN bus loads, the CAN core might hang and freeze the whole CPU. This patch fixes the shutdown sequence by first disabling the CAN core then disabling all interrupts. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-02	Linux 3.14-rc5	Linus Torvalds	1	-1/+1

2014-03-02	USB AX88179/178A: Support D-Link DUB-1312	Gerry Demaret	1	-0/+17
	Add the USB device ID for the D-Link DUB-1312 USB 3.0 to Gigabit Ethernet Adapter to the AX88179/178A driver. Signed-off-by: Gerry Demaret <gerry@tigron.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02	net/mlx4_en: Change Connect-X description in kconfig	Eyal Perry	1	-1/+1
	The mlx4_en driver support also 1Gbit and 40Gbit Ethernet devices, changed the driver description in the menuconfig to reflect that. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>