linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2014-04-20	Linux 3.15-rc2	Linus Torvalds	1	-1/+1

2014-04-20	perf tools: Improve error reporting	Adrien BAK	2	-2/+9
	In the current version, when using perf record, if something goes wrong in tools/perf/builtin-record.c:375 session = perf_session__new(file, false, NULL); The error message: "Not enough memory for reading per file header" is issued. This error message seems to be outdated and is not very helpful. This patch proposes to replace this error message by "Perf session creation failed" I believe this issue has been brought to lkml: https://lkml.org/lkml/2014/2/24/458 although this patch only tackles a (small) part of the issue. Additionnaly, this patch improves error reporting in tools/perf/util/data.c open_file_write. Currently, if the call to open fails, the user is unaware of it. This patch logs the error, before returning the error code to the caller. Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Adrien BAK <adrien.bak@metascale.org> Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast [ Reorganize the changelog into paragraphs ] [ Added empty line after fd declaration in open_file_write ] Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-20	perf tools: Adjust symbols in VDSO	Vladimir Nikulichev	1	-0/+2
	pert-report doesn't resolve function names in VDSO: $ perf report --stdio -g flat,0.0,15,callee --sort pid ... 8.76% 0x7fff6b1fe861 __gettimeofday ACE_OS::gettimeofday() ... In this case symbol values should be adjusted the same way as for executables, relocatable objects and prelinked libraries. After fix: $ perf report --stdio -g flat,0.0,15,callee --sort pid ... 8.76% __vdso_gettimeofday __gettimeofday ACE_OS::gettimeofday() Signed-off-by: Vladimir Nikulichev <nvs@tbricks.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvs Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-20	perf kvm: Fix 'Min time' counting in report command	Alexander Yarygin	1	-0/+1
	Every event in the perf-kvm has a 'stats' structure, which contains max/min/average/etc times of handling this event. The problem is that the 'perf-kvm stat report' command always shows that 'min time' is 0us for every event. Example: # perf kvm stat report Analyze events for all VCPUs: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time [..] 0xB2 MSCH 12 0.07% 0.00% 0us 8us 7.31us ( +- 2.11% ) 0xB2 CHSC 12 0.07% 0.00% 0us 18us 9.39us ( +- 9.49% ) 0xB2 STPX 8 0.05% 0.00% 0us 2us 1.88us ( +- 7.18% ) 0xB2 STSI 7 0.04% 0.00% 0us 44us 16.49us ( +- 38.20% ) [..] This happens because the 'stats' structure is not initialized and stats->min equals to 0. Lets initialize the structure for every event after its allocation using init_stats() function. This initializes stats->min to -1 and makes 'Min time' statistics counting work: # perf kvm stat report Analyze events for all VCPUs: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time [..] 0xB2 MSCH 12 0.07% 0.00% 6us 8us 7.31us ( +- 2.11% ) 0xB2 CHSC 12 0.07% 0.00% 7us 18us 9.39us ( +- 9.49% ) 0xB2 STPX 8 0.05% 0.00% 1us 2us 1.88us ( +- 7.18% ) 0xB2 STSI 7 0.04% 0.00% 1us 44us 16.49us ( +- 38.20% ) [..] Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1397053319-2130-3-git-send-email-borntraeger@de.ibm.com [ Fixing the perf examples changelog output ] Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-19	coredump: fix va_list corruption	Eric Dumazet	1	-1/+6
	A va_list needs to be copied in case it needs to be used twice. Thanks to Hugh for debugging this issue, leading to various panics. Tested: lpq84:~# echo "\|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern 'produce_core' is simply : main() { (int )0 = 1;} lpq84:~# ./produce_core Segmentation fault (core dumped) lpq84:~# dmesg \| tail -1 [ 614.352947] Core dump to \|/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed Notice the last argument was replaced by a NULL (we were lucky enough to not crash, but do not try this on your production machine !) After fix : lpq83:~# echo "\|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern lpq83:~# ./produce_core Segmentation fault lpq83:~# dmesg \| tail -1 [ 740.800441] Core dump to \|/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed Fixes: 5fe9d8ca21cc ("coredump: cn_vprintf() has no reason to call vsnprintf() twice") Signed-off-by: Eric Dumazet <edumazet@google.com> Diagnosed-by: Hugh Dickins <hughd@google.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org # 3.11+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	thp: close race between split and zap huge pages	Kirill A. Shutemov	1	-3/+10
	Sasha Levin has reported two THP BUGs[1][2]. I believe both of them have the same root cause. Let's look to them one by one. The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!". It's BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page(). From my testing I see that page_mapcount() is higher than mapcount here. I think it happens due to race between zap_huge_pmd() and page_check_address_pmd(). page_check_address_pmd() misses PMD which is under zap: CPU0 CPU1 zap_huge_pmd() pmdp_get_and_clear() __split_huge_page() anon_vma_interval_tree_foreach() __split_huge_page_splitting() page_check_address_pmd() mm_find_pmd() /* * We check if PMD present without taking ptl: no * serialization against zap_huge_pmd(). We miss this PMD, * it's not accounted to 'mapcount' in __split_huge_page(). / pmd_present(pmd) == 0 BUG_ON(mapcount != page_mapcount(page)) // CRASH!!! page_remove_rmap(page) atomic_add_negative(-1, &page->_mapcount) The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!". It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd(). This happens in similar way: CPU0 CPU1 zap_huge_pmd() pmdp_get_and_clear() page_remove_rmap(page) atomic_add_negative(-1, &page->_mapcount) __split_huge_page() anon_vma_interval_tree_foreach() __split_huge_page_splitting() page_check_address_pmd() mm_find_pmd() pmd_present(pmd) == 0 / The same comment as above / / * No crash this time since we already decremented page->_mapcount in * zap_huge_pmd(). / BUG_ON(mapcount != page_mapcount(page)) / * We split the compound page here into small pages without * serialization against zap_huge_pmd() */ __split_huge_page_refcount() VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!! So my understanding the problem is pmd_present() check in mm_find_pmd() without taking page table lock. The bug was introduced by me commit with commit 117b0791ac42. Sorry for that. :( Let's open code mm_find_pmd() in page_check_address_pmd() and do the check under page table lock. Note that __page_check_address() does the same for PTE entires if sync != 0. I've stress tested split and zap code paths for 36+ hours by now and don't see crashes with the patch applied. Before it took <20 min to trigger the first bug and few hours for second one (if we ignore first). [1] https://lkml.kernel.org/g/<53440991.9090001@oracle.com> [2] https://lkml.kernel.org/g/<5310C56C.60709@oracle.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Cc: Bob Liu <lliubbo@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michel Lespinasse <walken@google.com> Cc: Dave Jones <davej@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [3.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	mm: fix new kernel-doc warning in filemap.c	Randy Dunlap	1	-1/+0
	Fix new kernel-doc warning in mm/filemap.c: Warning(mm/filemap.c:2600): Excess function parameter 'ppos' description in '__generic_file_aio_write' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	mm: fix CONFIG_DEBUG_VM_RB description	Davidlohr Bueso	1	-2/+1
	This appears to be a copy/paste error. Update the description to reflect extra rbtree debug and checks for the config option instead of duplicating CONFIG_DEBUG_VM. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	mm: use paravirt friendly ops for NUMA hinting ptes	Mel Gorman	1	-8/+23
	David Vrabel identified a regression when using automatic NUMA balancing under Xen whereby page table entries were getting corrupted due to the use of native PTE operations. Quoting him Xen PV guest page tables require that their entries use machine addresses if the preset bit (_PAGE_PRESENT) is set, and (for successful migration) non-present PTEs must use pseudo-physical addresses. This is because on migration MFNs in present PTEs are translated to PFNs (canonicalised) so they may be translated back to the new MFN in the destination domain (uncanonicalised). pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma() set and clear the _PAGE_PRESENT bit using pte_set_flags(), pte_clear_flags(), etc. In a Xen PV guest, these functions must translate MFNs to PFNs when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting _PAGE_PRESENT. His suggested fix converted p[te\|md]_[set\|clear]_flags to using paravirt-friendly ops but this is overkill. He suggested an alternative of using p[te\|md]_modify in the NUMA page table operations but this is does more work than necessary and would require looking up a VMA for protections. This patch modifies the NUMA page table operations to use paravirt friendly operations to set/clear the flags of interest. Unfortunately this will take a performance hit when updating the PTEs on CONFIG_PARAVIRT but I do not see a way around it that does not break Xen. Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David Vrabel <david.vrabel@citrix.com> Tested-by: David Vrabel <david.vrabel@citrix.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Noonan <steven@uplinklabs.net> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	mips: export flush_icache_range	Kees Cook	1	-2/+2
	The lkdtm module performs tests against executable memory ranges, so it needs to flush the icache for proper behaviors. Other architectures already export this, so do the same for MIPS. [akpm@linux-foundation.org: relocate export sites] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: John Crispin <blogic@openwrt.org> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()	Mizuma, Masayoshi	1	-0/+1
	soft lockup in freeing gigantic hugepage fixed in commit 55f67141a892 "mm: hugetlb: fix softlockup when a large number of hugepages are freed." can happen in return_unused_surplus_pages(), so let's fix it. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	wait: explain the shadowing and type inconsistencies	Peter Zijlstra	1	-1/+13
	Stick in a comment before someone else tries to fix the sparse warning this generates. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-o2ro6f3vkxklni0bc8f7m68s@git.kernel.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	Shiraz has moved	Viresh Kumar	11	-13/+14
	shiraz.hashim@st.com email-id doesn't exist anymore as he has left the company. Replace ST's id with shiraz.linux.kernel@gmail.com. It also updates .mailmap file to fix address for 'git shortlog'. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Shiraz Hashim <shiraz.linux.kernel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt	Tang Chen	1	-3/+2
	In document numa_memory_policy.txt, the following examples for flag MPOL_F_RELATIVE_NODES are incorrect. For example, consider a task that is attached to a cpuset with mems 2-5 that sets an Interleave policy over the same set with MPOL_F_RELATIVE_NODES. If the cpuset's mems change to 3-7, the interleave now occurs over nodes 3,5-6. If the cpuset's mems then change to 0,2-3,5, then the interleave occurs over nodes 0,3,5. According to the comment of the patch adding flag MPOL_F_RELATIVE_NODES, the nodemasks the user specifies should be considered relative to the current task's mems_allowed. (https://lkml.org/lkml/2008/2/29/428) And according to numa_memory_policy.txt, if the user's nodemask includes nodes that are outside the range of the new set of allowed nodes, then the remap wraps around to the beginning of the nodemask and, if not already set, sets the node in the mempolicy nodemask. So in the example, if the user specifies 2-5, for a task whose mems_allowed is 3-7, the nodemasks should be remapped the third, fourth, fifth, sixth node in mems_allowed. like the following: mems_allowed: 3 4 5 6 7 relative index: 0 1 2 3 4 5 So the nodemasks should be remapped to 3,5-7, but not 3,5-6. And for a task whose mems_allowed is 0,2-3,5, the nodemasks should be remapped to 0,2-3,5, but not 0,3,5. mems_allowed: 0 2 3 5 relative index: 0 1 2 3 4 5 Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: Randy Dunlap <rdunlap@infradead.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	powerpc/mm: fix ".__node_distance" undefined	Mike Qiu	1	-0/+1
	CHK include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h ... Building modules, stage 2. WARNING: 1 bad relocations c0000000013d6a30 R_PPC64_ADDR64 uprobes_fetch_type_table WRAP arch/powerpc/boot/zImage.pseries WRAP arch/powerpc/boot/zImage.epapr MODPOST 1849 modules ERROR: ".__node_distance" [drivers/block/nvme.ko] undefined! make[1]: * [__modpost] Error 1 make: * [modules] Error 2 make: *** Waiting for unfinished jobs.... The reason is symbol "__node_distance" not been exported in powerpc. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Jesse Larrew <jlarrew@linux.vnet.ibm.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Alistair Popple <alistair@popple.id.au> Cc: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()	Andrew Morton	1	-1/+5
	Fix: BUG: using __this_cpu_write() in preemptible [00000000] code: systemd-udevd/497 caller is __this_cpu_preempt_check+0x13/0x20 CPU: 3 PID: 497 Comm: systemd-udevd Tainted: G W 3.15.0-rc1 #9 Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.02 04/27/2012 Call Trace: check_preemption_disabled+0xe1/0xf0 __this_cpu_preempt_check+0x13/0x20 touch_nmi_watchdog+0x28/0x40 Reported-by: Luis Henriques <luis.henriques@canonical.com> Tested-by: Luis Henriques <luis.henriques@canonical.com> Cc: Eric Piel <eric.piel@tremplin-utc.net> Cc: Robert Moore <robert.moore@intel.com> Cc: Lv Zheng <lv.zheng@intel.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	init/Kconfig: move the trusted keyring config option to general setup	Peter Foley	1	-12/+12
	The SYSTEM_TRUSTED_KEYRING config option is not in any menu, causing it to show up in the toplevel of the kernel configuration. Fix this by moving it under the General Setup menu. Signed-off-by: Peter Foley <pefoley2@pefoley.com> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()	Christoph Lameter	1	-1/+1
	Seems to be called with preemption enabled. Therefore it must use mod_zone_page_state instead. Signed-off-by: Christoph Lameter <cl@linux.com> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Tejun Heo <tj@kernel.org> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	net: sctp: cache auth_enable per endpoint	Vlad Yasevich	7	-60/+92
	Currently, it is possible to create an SCTP socket, then switch auth_enable via sysctl setting to 1 and crash the system on connect: Oops[#1]: CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1 task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000 [...] Call Trace: [<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80 [<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4 [<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c [<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8 [<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214 [<ffffffff8043af68>] sctp_rcv+0x588/0x630 [<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24 [<ffffffff803acb50>] ip6_input+0x2c0/0x440 [<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564 [<ffffffff80310650>] process_backlog+0xb4/0x18c [<ffffffff80313cbc>] net_rx_action+0x12c/0x210 [<ffffffff80034254>] __do_softirq+0x17c/0x2ac [<ffffffff800345e0>] irq_exit+0x54/0xb0 [<ffffffff800075a4>] ret_from_irq+0x0/0x4 [<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48 [<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148 [<ffffffff805a88b0>] start_kernel+0x37c/0x398 Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0 03e00008 00000000 ---[ end trace b530b0551467f2fd ]--- Kernel panic - not syncing: Fatal exception in interrupt What happens while auth_enable=0 in that case is, that ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs() when endpoint is being created. After that point, if an admin switches over to auth_enable=1, the machine can crash due to NULL pointer dereference during reception of an INIT chunk. When we enter sctp_process_init() via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk, the INIT verification succeeds and while we walk and process all INIT params via sctp_process_param() we find that net->sctp.auth_enable is set, therefore do not fall through, but invoke sctp_auth_asoc_set_default_hmac() instead, and thus, dereference what we have set to NULL during endpoint initialization phase. The fix is to make auth_enable immutable by caching its value during endpoint initialization, so that its original value is being carried along until destruction. The bug seems to originate from the very first days. Fix in joint work with Daniel Borkmann. Reported-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Tested-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-18	tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled	Ivan Vecera	1	-1/+3
	The patch fixes a problem with dropped jumbo frames after usage of 'ethtool -G ... rx'. Scenario: 1. ip link set eth0 up 2. ethtool -G eth0 rx N # <- This zeroes rx-jumbo 3. ip link set mtu 9000 dev eth0 The ethtool command set rx_jumbo_pending to zero so any received jumbo packets are dropped and you need to use 'ethtool -G eth0 rx-jumbo N' to workaround the issue. The patch changes the logic so rx_jumbo_pending value is changed only if jumbo frames are enabled (MTU > 1500). Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-18	vlan: Fix lockdep warning when vlan dev handle notification	dingtianhong	2	-5/+42
	When I open the LOCKDEP config and run these steps: modprobe 8021q vconfig add eth2 20 vconfig add eth2.20 30 ifconfig eth2 xx.xx.xx.xx then the Call Trace happened: [32524.386288] ============================================= [32524.386293] [ INFO: possible recursive locking detected ] [32524.386298] 3.14.0-rc2-0.7-default+ #35 Tainted: G O [32524.386302] --------------------------------------------- [32524.386306] ifconfig/3103 is trying to acquire lock: [32524.386310] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386326] [32524.386326] but task is already holding lock: [32524.386330] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386341] [32524.386341] other info that might help us debug this: [32524.386345] Possible unsafe locking scenario: [32524.386345] [32524.386350] CPU0 [32524.386352] ---- [32524.386354] lock(&vlan_netdev_addr_lock_key/1); [32524.386359] lock(&vlan_netdev_addr_lock_key/1); [32524.386364] [32524.386364] * DEADLOCK * [32524.386364] [32524.386368] May be due to missing lock nesting notation [32524.386368] [32524.386373] 2 locks held by ifconfig/3103: [32524.386376] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81431d42>] rtnl_lock+0x12/0x20 [32524.386387] #1: (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386398] [32524.386398] stack backtrace: [32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G O 3.14.0-rc2-0.7-default+ #35 [32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [32524.386414] ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8 [32524.386421] ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0 [32524.386428] 000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000 [32524.386435] Call Trace: [32524.386441] [<ffffffff814f68a2>] dump_stack+0x6a/0x78 [32524.386448] [<ffffffff810a35fb>] __lock_acquire+0x7ab/0x1940 [32524.386454] [<ffffffff810a323a>] ? __lock_acquire+0x3ea/0x1940 [32524.386459] [<ffffffff810a4874>] lock_acquire+0xe4/0x110 [32524.386464] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386471] [<ffffffff814fc07a>] _raw_spin_lock_nested+0x2a/0x40 [32524.386476] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386481] [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386489] [<ffffffffa0500cab>] vlan_dev_set_rx_mode+0x2b/0x50 [8021q] [32524.386495] [<ffffffff8141addf>] __dev_set_rx_mode+0x5f/0xb0 [32524.386500] [<ffffffff8141af8b>] dev_set_rx_mode+0x2b/0x40 [32524.386506] [<ffffffff8141b3cf>] __dev_open+0xef/0x150 [32524.386511] [<ffffffff8141b177>] __dev_change_flags+0xa7/0x190 [32524.386516] [<ffffffff8141b292>] dev_change_flags+0x32/0x80 [32524.386524] [<ffffffff8149ca56>] devinet_ioctl+0x7d6/0x830 [32524.386532] [<ffffffff81437b0b>] ? dev_ioctl+0x34b/0x660 [32524.386540] [<ffffffff814a05b0>] inet_ioctl+0x80/0xa0 [32524.386550] [<ffffffff8140199d>] sock_do_ioctl+0x2d/0x60 [32524.386558] [<ffffffff81401a52>] sock_ioctl+0x82/0x2a0 [32524.386568] [<ffffffff811a7123>] do_vfs_ioctl+0x93/0x590 [32524.386578] [<ffffffff811b2705>] ? rcu_read_lock_held+0x45/0x50 [32524.386586] [<ffffffff811b39e5>] ? __fget_light+0x105/0x110 [32524.386594] [<ffffffff811a76b1>] SyS_ioctl+0x91/0xb0 [32524.386604] [<ffffffff815057e2>] system_call_fastpath+0x16/0x1b ======================================================================== The reason is that all of the addr_lock_key for vlan dev have the same class, so if we change the status for vlan dev, the vlan dev and its real dev will hold the same class of addr_lock_key together, so the warning happened. we should distinguish the lock depth for vlan dev and its real dev. v1->v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which could support to add 8 vlan id on a same vlan dev, I think it is enough for current scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id, and the vlan dev would not meet the same class key with its real dev. The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan dev could get a suitable class key. v2->v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev and its real dev, but it make no sense, because the difference for subclass in the lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth to distinguish the different depth for every vlan dev, the same depth of the vlan dev could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h, I think it is enough here, the lockdep should never exceed that value. v3->v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method, we could use _nested() variants to fix the problem, calculate the depth for every vlan dev, and use the depth as the subclass for addr_lock_key. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-18	ARC: Delete stale barrier.h	Vineet Gupta	1	-37/+0
	Commit 93ea02bb8435 ("arch: Clean up asm/barrier.h implementations") wired generic barrier.h for ARC, but failed to delete the existing file. In 3.15, due to rcupdate.h updates, this causes a build breakage on ARC: CC arch/arc/kernel/asm-offsets.s In file included from include/linux/sched.h:45:0, from arch/arc/kernel/asm-offsets.c:9: include/linux/rculist.h: In function __list_add_rcu: include/linux/rculist.h:54:2: error: implicit declaration of function smp_store_release [-Werror=implicit-function-declaration] rcu_assign_pointer(list_next_rcu(prev), new); ^ Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18	of: add empty of_find_node_by_path() for !OF	Alexander Shiyan	1	-0/+5
	Add an empty version of of_find_node_by_path(). This fixes following build error for asoc tree: sound/soc/fsl/fsl_ssi.c: In function 'fsl_ssi_probe': sound/soc/fsl/fsl_ssi.c:1471:2: error: implicit declaration of function 'of_find_node_by_path' [-Werror=implicit-function-declaration] sprop = of_get_property(of_find_node_by_path("/"), "compatible", NULL); Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Rob Herring <robh@kernel.org>
2014-04-18	perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU	Venkatesh Srinivas	1	-3/+9
	CPUs which should support the RAPL counters according to Family/Model/Stepping may still issue #GP when attempting to access the RAPL MSRs. This may happen when Linux is running under KVM and we are passing-through host F/M/S data, for example. Use rdmsrl_safe to first access the RAPL_POWER_UNIT MSR; if this fails, do not attempt to use this PMU. Signed-off-by: Venkatesh Srinivas <venkateshs@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1394739386-22260-1-git-send-email-venkateshs@google.com Cc: zheng.z.yan@intel.com Cc: eranian@google.com Cc: ak@linux.intel.com Cc: linux-kernel@vger.kernel.org [ The patch also silently fixes another bug: rapl_pmu_init() didn't handle the memory alloc failure case previously. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18	drm: bochs: drop unused struct fields	Gerd Hoffmann	2	-3/+0
	Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-18	drm: bochs: add power management support	Gerd Hoffmann	2	-0/+45
	bochs kms driver lacks power management support, thus the vga display doesn't work any more after S3 resume. Fix this by adding suspend and resume functions. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-18	drm: cirrus: add power management support	Gerd Hoffmann	2	-0/+45
	cirrus kms driver lacks power management support, thus the vga display doesn't work any more after S3 resume. Fix this by adding suspend and resume functions. Also make the mode_set function unblank the screen. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-18	drm: Split out drm_probe_helper.c from drm_crtc_helper.c	Daniel Vetter	5	-372/+437
	This is leftover stuff from my previous doc round which I kinda wanted to do but didn't yet due to rebase hell. The modeset helpers and the probing helpers a independent and e.g. i915 uses the probing stuff but has its own modeset infrastructure. It hence makes to split this up. While at it add a DOC: comment for the probing libraray. It would be rather neat to pull some of the DocBook documenting these two helpers into in-line DOC: comments. But unfortunately kerneldoc doesn't support markdown or something similar to make nice-looking documentation, so the current state is better. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-18	drm/plane-helper: Don't fake-implement primary plane disabling	Daniel Vetter	1	-28/+5
	After thinking about this topic a bit more I've reached the conclusion that implementing this doesn't make sense: - The locking is all wrong: set_config(NULL) will also unlink encoders and connectors, but those links are protected with the mode_config mutex. In the ->disable_plane callback we only hold all modeset locks, but eventually we want to switch to just grabbing the per-crtc (and maybe per-plane) locks as needed, maybe based on ww_mutexes. Having a callback which absolutely needs all modeset locks is bad for this conversion. Note that the same isn't true for the provided ->update_plane since we've audited the crtc helpers to make sure that not encoder or connector links are changed. - There's no way to re-enable the plane with an ->update_plane: The connectors/encoder links are lost and so we can't re-enable the CRTC. Even without that issue the driver might have reassigned some shared resources (as opposed to e.g. DPMS off, where drivers are not allowed to do that to make sure the CRTC can be enabled again). - The semantics don't make much sense: Userspace asked to scan out black (or some other color if the driver supports a background color), not that the screen be disabled. - Implementing proper primary plane support (i.e. actually disabling the primary plane without disabling the CRTC) is really simple, at least if all the hw needs is flipping a bit. The big task is auditing all the interactions with other ioctls when the CRTC is on but there's no primary plane (e.g. pageflips). And some of that work still needs to be done. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-18	drm/ast: fix value check in cbr_scan2	Dave Airlie	1	-1/+1
	this is a typo vs the ums driver, fix to check correct value. Found initially by Coverity. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-18	drm/nouveau/bios: fix a bit shift error introduced by 457e77b	Sergei Antonov	1	-1/+1
	Commit 457e77b26428ab4a24998eecfb99f27fa4195397 added two checks applied to a value received from nv_rd32(bios, 0x619f04). But after this new piece of code is executed, the addr local variable does not hold the same value it used to hold before the commit. Here is what is was assigned in the original code: (u64)(nv_rd32(bios, 0x619f04) & 0xffffff00) << 8 in the committed code it ends up with this value: (u64)(nv_rd32(bios, 0x619f04) >> 8) << 8 These expressions are obviously not equivalent. My Nvidia video card does not show anything on the display when I boot a kernel containing this commit. The patch fixes the code so that the new checks are still done, but the side effect of an incorrect addr value is gone. Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sergei Antonov <saproj@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-17	ipmi: boolify some things	Corey Minyard	3	-40/+38
	Convert some ints to bools. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17	ipmi: Turn off all activity on an idle ipmi interface	Corey Minyard	4	-97/+182
	The IPMI driver would wake up periodically looking for events and watchdog pretimeouts. If there is nothing waiting for these events, it's really kind of pointless to be checking for them. So modify the driver so the message handler can pass down if it needs the lower layer to be waiting for these. Modify the system interface lower layer to turn off all timer and thread activity if the upper layer doesn't need anything and it is not currently handling messages. And modify the message handler to not restart the timer if its timer is not needed. The timers and kthread will still be enabled if: - the SI interface is handling a message. - a user has enabled watching for events. - the IPMI watchdog timer is in use (since it uses pretimeouts). - the message handler is waiting on a remote response. - a user has registered to receive commands. This mostly affects interfaces without interrupts. Interfaces with interrupts already don't use CPU in the system interface when the interface is idle. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17	ipmi: Turn off default probing of interfaces	Corey Minyard	2	-1/+13
	The default probing can cause problems with some system, slow booting, extra CPU usages, etc. Turn it off by default and give a config option to enable it. From: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17	ipmi: Reset the KCS timeout when starting error recovery	Corey Minyard	1	-2/+3
	The OBF timer in KCS was not reset in one situation when error recovery was started, resulting in an immediate timeout. Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17	ipmi: Fix a race restarting the timer	Bodo Stroesser	1	-18/+28
	With recent changes it is possible for the timer handler to detect an idle interface and not start the timer, but the thread to start an operation at the same time. The thread will not start the timer in that instance, resulting in the timer not running. Instead, move all timer operations under the lock and start the timer in the thread if it detect non-idle and the timer is not already running. Moving under locks allows the last timeout to be set in both the thread and the timer. 'Timer is not running' means that the timer is not pending and smi_timeout() is not running. So we need a flag to detect this correctly. Also fix a few other timeout bugs: setting the last timeout when the interrupt has to be disabled and the timer started, and setting the last timeout in check_start_timer_thread possibly racing with the timer Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17	Char: ipmi_bt_sm, fix infinite loop	Jiri Slaby	1	-1/+1
	In read_all_bytes, we do unsigned char i; ... bt->read_data[0] = BMC2HOST; bt->read_count = bt->read_data[0]; ... for (i = 1; i <= bt->read_count; i++) bt->read_data[i] = BMC2HOST; If bt->read_data[0] == bt->read_count == 255, we loop infinitely in the 'for' loop. Make 'i' an 'int' instead of 'char' to get rid of the overflow and finish the loop after 255 iterations every time. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-and-debugged-by: Rui Hui Dian <rhdian@novell.com> Cc: Tomas Cech <tcech@suse.cz> Cc: Corey Minyard <minyard@acm.org> Cc: <openipmi-developer@lists.sourceforge.net> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-17	Revert "serial: 8250, disable "too much work" messages"	Greg Kroah-Hartman	1	-1/+2
	This reverts commit f4f653e9875e573860e783fecbebde284a8626f5. Jiri writes: No, please drop this one. We need a better solution as it turned out that some boxes need 16k loops and it will increase with new processors :(. Cc: Jiri Slaby <jslaby@suse.cz> Cc: Martin Pluskal <mpluskal@suse.com> Cc: Takashi Iwai <tiwai@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-17	tracing/uprobes: Fix uprobe_cpu_buffer memory leak	zhangwei(Jovi)	1	-0/+6
	Forgot to free uprobe_cpu_buffer percpu page in uprobe_buffer_disable(). Link: http://lkml.kernel.org/p/534F8B3F.1090407@huawei.com Cc: stable@vger.kernel.org # v3.14+ Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-17	drm/radeon/ci: make sure mc ucode is loaded before checking the size	Alex Deucher	1	-1/+3
	Avoid a possible segfault. Noticed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2014-04-17	drm/radeon/si: make sure mc ucode is loaded before checking the size	Alex Deucher	1	-1/+3
	Avoid a possible segfault. Noticed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2014-04-17	drm/radeon: improve PLL params if we don't match exactly v2	Christian König	1	-6/+7
	Otherwise we might be quite off on older chipsets. v2: keep ref_div minimum Signed-off-by: Christian König <christian.koenig@amd.com>
2014-04-17	drm/radeon: memory leak on bo reservation failure. v2	Quentin Casasnovas	1	-2/+7
	On bo reservation failure, we end up leaking fpriv. v2 (chk): rebased and added missing free on vm failure as well Fixes: 5e386b574cf7e1 ("drm/radeon: fix missing bo reservation") Cc: stable@vger.kernel.org Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2014-04-17	drm/radeon: fix VCE fence command	Christoph Jaeger	1	-1/+1
	Due to a type mismatch that causes an implicit type conversion, the upper 32 bits of the GPU address have been zeroed out when adding to the command buffer. Picked up by Coverity - CID 1198624. Signed-off-by: Christoph Jaeger <christophjaeger@linux.com>
2014-04-17	drm/radeon: re-enable mclk dpm on R7 260X asics	Alex Deucher	1	-2/+6
	If the new mc ucode is available. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2014-04-17	drm/radeon: add support for newer mc ucode on CI (v2)	Alex Deucher	2	-10/+20
	Fixes mclk stability on certain asics. v2: print out mc firmware version used and size bug: https://bugs.freedesktop.org/show_bug.cgi?id=75992 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2014-04-17	drm/radeon: add support for newer mc ucode on SI (v2)	Alex Deucher	2	-13/+25
	May fix stability issues with some newer cards. v2: print out mc firmware version used and size Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2014-04-17	drm/radeon: apply more strict limits for PLL params v2	Christian König	1	-0/+3
	Letting post and refernce divider get to big is bad for signal stability. v2: increase the limit to 210 Signed-off-by: Christian König <christian.koenig@amd.com>
2014-04-17	drm/radeon: update CI DPM powertune settings	Alex Deucher	1	-12/+13
	As per internal recommendations. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-04-17	drm/radeon: fix runpm handling on APUs (v4)	Alex Deucher	6	-34/+27
	Don't try and runtime suspend the APU in PX systems. We only want to power down the dGPU. v2: fix harder v3: fix stupid typo v4: consolidate runpm enablement to a single flag bugs: https://bugs.freedesktop.org/show_bug.cgi?id=75127 https://bugzilla.kernel.org/show_bug.cgi?id=72701 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org