aboutsummaryrefslogtreecommitdiffstats
path: root/arch (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-02-17bpf: make jited programs visible in tracesDaniel Borkmann4-48/+1
Long standing issue with JITed programs is that stack traces from function tracing check whether a given address is kernel code through {__,}kernel_text_address(), which checks for code in core kernel, modules and dynamically allocated ftrace trampolines. But what is still missing is BPF JITed programs (interpreted programs are not an issue as __bpf_prog_run() will be attributed to them), thus when a stack trace is triggered, the code walking the stack won't see any of the JITed ones. The same for address correlation done from user space via reading /proc/kallsyms. This is read by tools like perf, but the latter is also useful for permanent live tracing with eBPF itself in combination with stack maps when other eBPF types are part of the callchain. See offwaketime example on dumping stack from a map. This work tries to tackle that issue by making the addresses and symbols known to the kernel. The lookup from *kernel_text_address() is implemented through a latched RB tree that can be read under RCU in fast-path that is also shared for symbol/size/offset lookup for a specific given address in kallsyms. The slow-path iteration through all symbols in the seq file done via RCU list, which holds a tiny fraction of all exported ksyms, usually below 0.1 percent. Function symbols are exported as bpf_prog_<tag>, in order to aide debugging and attribution. This facility is currently enabled for root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening is active in any mode. The rationale behind this is that still a lot of systems ship with world read permissions on kallsyms thus addresses should not get suddenly exposed for them. If that situation gets much better in future, we always have the option to change the default on this. Likewise, unprivileged programs are not allowed to add entries there either, but that is less of a concern as most such programs types relevant in this context are for root-only anyway. If enabled, call graphs and stack traces will then show a correct attribution; one example is illustrated below, where the trace is now visible in tooling such as perf script --kallsyms=/proc/kallsyms and friends. Before: 7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so) After: 7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux) [...] 7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17bpf: remove stubs for cBPF from arch codeDaniel Borkmann4-21/+2
Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make that a single __weak function in the core that can be overridden similarly to the eBPF one. Also remove stale pr_err() mentions of bpf_jit_compile. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller7-15/+38
2017-02-11Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds7-15/+38
Pull x86 fixes from Ingo Molnar: "Last minute x86 fixes: - Fix a softlockup detector warning and long delays if using ptdump with KASAN enabled. - Two more TSC-adjust fixes for interesting firmware interactions. - Two commits to fix an AMD CPU topology enumeration bug that caused a measurable gaming performance regression" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/ptdump: Fix soft lockup in page table walker x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST x86/CPU/AMD: Fix Zen SMT topology x86/CPU/AMD: Bring back Compute Unit ID
2017-02-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller34-40/+219
2017-02-10Merge tag 'powerpc-4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds6-26/+52
Pull powerpc fixes friom Michael Ellerman: "Apologies for the late pull request, but Ben has been busy finding bugs. - Userspace was semi-randomly segfaulting on radix due to us incorrectly handling a fault triggered by autonuma, caused by a patch we merged earlier in v4.10 to prevent the kernel executing userspace. - We weren't marking host IPIs properly for KVM in the OPAL ICP backend. - The ERAT flushing on radix was missing an isync and was incorrectly marked as DD1 only. - The powernv CPU hotplug code was missing a wakeup type and failing to flush the interrupt correctly when using OPAL ICP Thanks to Benjamin Herrenschmidt" * tag 'powerpc-4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/powernv: Properly set "host-ipi" on IPIs powerpc/powernv: Fix CPU hotplug to handle waking on HVI powerpc/mm/radix: Update ERAT flushes when invalidating TLB powerpc/mm: Fix spurrious segfaults on radix with autonuma
2017-02-10MIPS: Octeon: Remove unnecessary MODULE_*()Russell King1-4/+0
octeon-platform.c can not be built as a module for two reasons: (a) the Makefile doesn't allow it: obj-y := cpu.o setup.o octeon-platform.o octeon-irq.o csrc-octeon.o (b) the multiple *_initcall() statements, each of which are translated to a module_init() call when attempting a module build, become aliases to init_module(). Having more than one alias will cause a build error. Hence, rather than adding a linux/module.h include, remove the redundant MODULE_*() from this file. Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-10x86/mm/ptdump: Fix soft lockup in page table walkerAndrey Ryabinin1-0/+2
CONFIG_KASAN=y needs a lot of virtual memory mapped for its shadow. In that case ptdump_walk_pgd_level_core() takes a lot of time to walk across all page tables and doing this without a rescheduling causes soft lockups: NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/0:1] ... Call Trace: ptdump_walk_pgd_level_core+0x40c/0x550 ptdump_walk_pgd_level_checkwx+0x17/0x20 mark_rodata_ro+0x13b/0x150 kernel_init+0x2f/0x120 ret_from_fork+0x2c/0x40 I guess that this issue might arise even without KASAN on huge machines with several terabytes of RAM. Stick cond_resched() in pgd loop to fix this. Reported-by: Tobias Regnery <tobias.regnery@gmail.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Alexander Potapenko <glider@google.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170210095405.31802-1-aryabinin@virtuozzo.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-10x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliableThomas Gleixner1-9/+7
When the TSC is marked reliable then the synchronization check is skipped, but that also skips the TSC ADJUST sanitizing code. So on a machine with a wreckaged BIOS the TSC deviation between CPUs might go unnoticed. Let the TSC adjust sanitizing code run unconditionally and just skip the expensive synchronization checks when TSC is marked reliable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Olof Johansson <olof@lixom.net> Link: http://lkml.kernel.org/r/20170209151231.491189912@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-10x86/tsc: Avoid the large time jump when sanitizing TSC ADJUSTThomas Gleixner1-2/+3
Olof reported that on a machine which has a BIOS wreckaged TSC the timestamps in dmesg are making a large jump because the TSC value is jumping forward after resetting the TSC ADJUST register to a sane value. This can be avoided by calling the TSC ADJUST saniziting function before initializing the per cpu sched clock machinery. That takes the offset into account and avoid the time jump. What cannot be avoided is that the 'Firmware Bug' warnings on the secondary CPUs are printed with the large time offsets because it would be too much effort and ugly hackery to print those warnings into a buffer and emit them after the adjustemt on the starting CPUs. It's a firmware bug and should be fixed in firmware. The weird timestamps are collateral damage and just illustrate the sillyness of the BIOS folks: [ 0.397445] smp: Bringing up secondary CPUs ... [ 0.402100] x86: Booting SMP configuration: [ 0.406343] .... node #0, CPUs: #1 [1265776479.930667] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU1: -2978888639183101 [1265776479.944664] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2978888639183101 [ 0.508119] #2 [1265776480.032346] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU2: -2978888639183677 [1265776480.044192] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -2978888639183677 [ 0.607643] #3 [1265776480.131874] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU3: -2978888639184530 [1265776480.143720] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -2978888639184530 [ 0.707108] smp: Brought up 1 node, 4 CPUs [ 0.711271] smpboot: Total of 4 processors activated (21698.88 BogoMIPS) Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170209151231.411460506@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-09ARM: orion: remove unused wnr854t_switch_plat_dataArnd Bergmann1-5/+0
The other instances of this structure got removed along with the MDIO device change, but this one was left behind and needs to be removed as well: arch/arm/mach-orion5x/wnr854t-setup.c:109:44: error: 'wnr854t_switch_plat_data' defined but not used [-Werror=unused-variable] static struct dsa_platform_data __initdata wnr854t_switch_plat_data = { Fixes: 575e93f7b5e6 ("ARM: orion: Register DSA switch as a MDIO device") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds3-3/+7
Pull ARM fixes from Russell King: "A couple more fixes for 4.10: - fix addressing the short regset write issue (Dave Martin) - fix for LPAE systems which leave a pending imprecise data abort before entering the kernel (Alexander Sverdlin)" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8643/3: arm/ptrace: Preserve previous registers for short regset write ARM: 8642/1: LPAE: catch pending imprecise abort on unmask
2017-02-09powerpc/powernv: Properly set "host-ipi" on IPIsBenjamin Herrenschmidt1-2/+4
Otherwise KVM will fail to pass them through to the host Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09powerpc/powernv: Fix CPU hotplug to handle waking on HVIBenjamin Herrenschmidt4-3/+42
The IPIs come in as HVI not EE, so we need to test the appropriate SRR1 bits. The encoding is such that it won't have false positives on P7 and P8 so we can just test it like that. We also need to handle the icp-opal variant of the flush. Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09powerpc/mm/radix: Update ERAT flushes when invalidating TLBBenjamin Herrenschmidt1-5/+1
Three tiny changes to the ERAT flushing logic: First don't make it depend on DD1. It hasn't been decided yet but we might run DD2 in a mode that also requires explicit flushes for performance reasons so make it unconditional. We also add a missing isync, and finally remove the flush from _tlbiel_va as it is only necessary for congruence-class invalidations (PID, LPID and full TLB), not targetted invalidations. Fixes: 96ed1fe511a8 ("powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-08Revert "x86/ioapic: Restore IO-APIC irq_chip retrigger callback"Linus Torvalds1-2/+0
This reverts commit 020eb3daaba2857b32c4cf4c82f503d6a00a67de. Gabriel C reports that it causes his machine to not boot, and we haven't tracked down the reason for it yet. Since the bug it fixes has been around for a longish time, we're better off reverting the fix for now. Gabriel says: "It hangs early and freezes with a lot RCU warnings. I bisected it down to : > Ruslan Ruslichenko (1): > x86/ioapic: Restore IO-APIC irq_chip retrigger callback Reverting this one fixes the problem for me.. The box is a PRIMERGY TX200 S5 , 2 socket , 2 x E5520 CPU(s) installed" and Ruslan and Thomas are currently stumped. Reported-and-bisected-by: Gabriel C <nix.or.die@gmail.com> Cc: Ruslan Ruslichenko <rruslich@cisco.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org # for the backport of the original commit Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-08Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds24-9/+160
Pull ARM SoC fixes from Arnd Bergmann: - A relatively large patch restores booting on i.MX platforms that failed to boot after a cleanup was merged for v4.10. - A quirk for USB needs to be enabled on the STi platform - On the Meson platform, we saw memory corruption with part of the memory used by the secure monitor, so we have to stay out of that area. - The same platform also has a problem with ethernet under load, which is fixed by disabling EEE negotiation. - imx6dl has an incorrect pin configuration, which prevents SPI from working. - Two maintainers have lost their access to their email addresses, so we should update the MAINTAINERS file before the release - Renaming one of the orion5x linkstation models to help simplify the debian install. - A couple of fixes for build warnings that were introduced during v4.10-rc. * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: defconfigs: make NF_CT_PROTO_SCTP and NF_CT_PROTO_UDPLITE built-in MAINTAINERS: socfpga: update email for Dinh Nguyen ARM: orion5x: fix Makefile for linkstation-lschl.dtb ARM: dts: orion5x-lschl: More consistent naming on linkstation series ARM: dts: orion5x-lschl: Fix model name MAINTAINERS: change email address from atmel to microchip MAINTAINERS: at91: change email address ARM64: dts: meson-gx: Add firmware reserved memory zones ARM64: dts: meson-gxbb-odroidc2: fix GbE tx link breakage ARM: dts: STiH407-family: set snps,dis_u3_susphy_quirk ARM: dts: imx: Pass 'chosen' and 'memory' nodes ARM: dts: imx6dl: fix GPIO4 range ARM: imx: hide unused variable in #ifdef
2017-02-08powerpc/mm: Fix spurrious segfaults on radix with autonumaBenjamin Herrenschmidt1-16/+5
When autonuma (Automatic NUMA balancing) marks a PTE inaccessible it clears all the protection bits but leave the PTE valid. With the Radix MMU, an attempt at executing from such a PTE will take a fault with bit 35 of SRR1 set "SRR1_ISI_N_OR_G". It is thus incorrect to treat all such faults as errors. We should pass them to handle_mm_fault() for autonuma to deal with. The case of pages that are really not executable is handled by the existing test for VM_EXEC further down. That leaves us with catching the kernel attempts at executing user pages. We can catch that earlier, even before we do find_vma. It is never valid on powerpc for the kernel to take an exec fault to begin with. So fold that test with the existing test for the kernel faulting on kernel addresses to bail out early. Fixes: 1d18ad026844 ("powerpc/mm: Detect instruction fetch denied and report") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller28-290/+192
The conflict was an interaction between a bug fix in the netvsc driver in 'net' and an optimization of the RX path in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07ARC: [arcompact] brown paper bag bug in unaligned access delay slot fixupVineet Gupta1-1/+1
Reported-by: Jo-Philipp Wich <jo@mein.io> Fixes: 9aed02feae57bf7 ("ARC: [arcompact] handle unaligned access delay slot") Cc: linux-kernel@vger.kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-07ARM: orion: Register DSA switch as a MDIO deviceFlorian Fainelli9-36/+29
Utilize the ability to pass board specific MDIO bus information towards a particular MDIO device thus allowing us to provide the per-port switch layout to the Marvell 88E6XXX switch driver. Since we would end-up with conflicting registration paths, do not register the "dsa" platform device anymore. Note that the MDIO devices registered by code in net/dsa/dsa2.c does not parse a dsa_platform_data, but directly take a dsa_chip_data (specific to a single switch chip), so we update the different call sites to pass this structure down to orion_ge00_switch_init(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds1-4/+4
Pull crypto fixes from Herbert Xu: - use-after-free in algif_aead - modular aesni regression when pcbc is modular but absent - bug causing IO page faults in ccp - double list add in ccp - NULL pointer dereference in qat (two patches) - panic in chcr - NULL pointer dereference in chcr - out-of-bound access in chcr * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: chcr - Fix key length for RFC4106 crypto: algif_aead - Fix kernel panic on list_del crypto: aesni - Fix failure when pcbc module is absent crypto: ccp - Fix double add when creating new DMA command crypto: ccp - Fix DMA operations when IOMMU is enabled crypto: chcr - Check device is allocated before use crypto: chcr - Fix panic on dma_unmap_sg crypto: qat - zero esram only for DH85x devices crypto: qat - fix bar discovery for c62x
2017-02-06ARM: defconfigs: make NF_CT_PROTO_SCTP and NF_CT_PROTO_UDPLITE built-inArnd Bergmann2-4/+4
The symbols can no longer be used as loadable modules, leading to a harmless Kconfig warning: arch/arm/configs/imote2_defconfig:60:warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE arch/arm/configs/imote2_defconfig:59:warning: symbol value 'm' invalid for NF_CT_PROTO_SCTP arch/arm/configs/ezx_defconfig:68:warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE arch/arm/configs/ezx_defconfig:67:warning: symbol value 'm' invalid for NF_CT_PROTO_SCTP Let's make them built-in. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-02-06Merge tag 'mvebu-fixes-4.10-1' of git://git.infradead.org/linux-mvebu into fixesArnd Bergmann2-3/+3
Pull "mvebu fixes for 4.10 (part 1)" from Gregory CLEMENT: More consistent naming for some orion5x based boards helping the switch to device tree for debian users. * tag 'mvebu-fixes-4.10-1' of git://git.infradead.org/linux-mvebu: ARM: orion5x: fix Makefile for linkstation-lschl.dtb ARM: dts: orion5x-lschl: More consistent naming on linkstation series ARM: dts: orion5x-lschl: Fix model name
2017-02-05x86/CPU/AMD: Fix Zen SMT topologyYazen Ghannam1-0/+7
After: a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology") our SMT scheduling topology for Fam17h systems is broken, because the ThreadId is included in the ApicId when SMT is enabled. So, without further decoding cpu_core_id is unique for each thread rather than the same for threads on the same core. This didn't affect systems with SMT disabled. Make cpu_core_id be what it is defined to be. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.9 Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170205105022.8705-2-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-05x86/CPU/AMD: Bring back Compute Unit IDBorislav Petkov4-4/+19
Commit: a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology") restored the initial approach we had with the Fam15h topology of enumerating CU (Compute Unit) threads as cores. And this is still correct - they're beefier than HT threads but still have some shared functionality. Our current approach has a problem with the Mad Max Steam game, for example. Yves Dionne reported a certain "choppiness" while playing on v4.9.5. That problem stems most likely from the fact that the CU threads share resources within one CU and when we schedule to a thread of a different compute unit, this incurs latency due to migrating the working set to a different CU through the caches. When the thread siblings mask mirrors that aspect of the CUs and threads, the scheduler pays attention to it and tries to schedule within one CU first. Which takes care of the latency, of course. Reported-by: Yves Dionne <yves.dionne@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.9 Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20170205105022.8705-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-04Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-0/+3
Pull irq fixes from Thomas Gleixner: - Prevent double activation of interrupt lines, which causes problems on certain interrupt controllers - Handle the fallout of the above because x86 (ab)uses the activation function to reconfigure interrupts under the hood. * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Make irq activate operations symmetric irqdomain: Avoid activating interrupts more than once
2017-02-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+1
Pull KVM fix from Radim Krčmář: "Fix a regression that prevented migration between hosts with different XSAVE features even if the missing features were not used by the guest (for stable)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: do not save guest-unsupported XSAVE state
2017-02-03Merge tag 'powerpc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds11-62/+11
Pull powerpc fixes from Michael Ellerman: "The main change is we're reverting the initial stack protector support we merged this cycle. It turns out to not work on toolchains built with libc support, and fixing it will be need to wait for another release. And the rest are all fairly minor: - Some pasemi machines were not booting due to a missing error check in prom_find_boot_cpu() - In EEH we were checking a pointer rather than the bool it pointed to - The clang build was broken by a BUILD_BUG_ON() we added. - The radix (Power9 only) version of map_kernel_page() was broken if our memory size was a multiple of 2MB, which it generally isn't Thanks to: Darren Stevens, Gavin Shan, Reza Arbab" * tag 'powerpc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Use the correct pointer when setting a 2MB pte powerpc: Fix build failure with clang due to BUILD_BUG_ON() powerpc: Revert the initial stack protector support powerpc/eeh: Fix wrong flag passed to eeh_unfreeze_pe() powerpc: Add missing error check to prom_find_boot_cpu()
2017-02-03KVM: x86: do not save guest-unsupported XSAVE stateRadim Krčmář1-0/+1
Saving unsupported state prevents migration when the new host does not support a XSAVE feature of the original host, even if the feature is not exposed to the guest. We've masked host features with guest-visible features before, with 4344ee981e21 ("KVM: x86: only copy XSAVE state for the supported features") and dropped it when implementing XSAVES. Do it again. Fixes: df1daba7d1cb ("KVM: x86: support XSAVES usage in the host") Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-02-03modversions: treat symbol CRCs as 32 bit quantitiesArd Biesheuvel3-12/+1
The modversion symbol CRCs are emitted as ELF symbols, which allows us to easily populate the kcrctab sections by relying on the linker to associate each kcrctab slot with the correct value. This has a couple of downsides: - Given that the CRCs are treated as memory addresses, we waste 4 bytes for each CRC on 64 bit architectures, - On architectures that support runtime relocation, a R_<arch>_RELATIVE relocation entry is emitted for each CRC value, which identifies it as a quantity that requires fixing up based on the actual runtime load offset of the kernel. This results in corrupted CRCs unless we explicitly undo the fixup (and this is currently being handled in the core module code) - Such runtime relocation entries take up 24 bytes of __init space each, resulting in a x8 overhead in [uncompressed] kernel size for CRCs. Switching to explicit 32 bit values on 64 bit architectures fixes most of these issues, given that 32 bit values are not treated as quantities that require fixing up based on the actual runtime load offset. Note that on some ELF64 architectures [such as PPC64], these 32-bit values are still emitted as [absolute] runtime relocatable quantities, even if the value resolves to a build time constant. Since relative relocations are always resolved at build time, this patch enables MODULE_REL_CRCS on powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC references into relative references into .rodata where the actual CRC value is stored. So redefine all CRC fields and variables as u32, and redefine the __CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using inline assembler (which is necessary since 64-bit C code cannot use 32-bit types to hold memory addresses, even if they are ultimately resolved using values that do not exceed 0xffffffff). To avoid potential problems with legacy 32-bit architectures using legacy toolchains, the equivalent C definition of the kcrctab entry is retained for 32-bit architectures. Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y") Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03crypto: aesni - Fix failure when pcbc module is absentHerbert Xu1-4/+4
When aesni is built as a module together with pcbc, the pcbc module must be present for aesni to load. However, the pcbc module may not be present for reasons such as its absence on initramfs. This patch allows the aesni to function even if the pcbc module is enabled but not present. Reported-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-02Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds6-35/+37
Pull x86 fixes from Ingo Molnar: "Misc fixes: - two microcode loader fixes - two FPU xstate handling fixes - an MCE timer handling related crash fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Make timer handling more robust x86/microcode: Do not access the initrd after it has been freed x86/fpu/xstate: Fix xcomp_bv in XSAVES header x86/fpu: Set the xcomp_bv when we fake up a XSAVES area x86/microcode/intel: Drop stashed AP patch pointer optimization
2017-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller14-106/+195
All merge conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-175/+117
Pull perf fixes from Ingo Molnar: "Five kernel fixes: - an mmap tracing ABI fix for certain mappings - a use-after-free fix, found via KASAN - three CPU hotplug related x86 PMU driver fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Make package handling more robust perf/x86/intel/uncore: Clean up hotplug conversion fallout perf/x86/intel/rapl: Make package handling more robust perf/core: Fix PERF_RECORD_MMAP2 prot/flags for anonymous memory perf/core: Fix use-after-free bug
2017-02-02Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+16
Pull EFI fixes from Ingo Molnar: "Two EFI boot fixes, one for arm64 and one for x86 systems with certain firmware versions" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/fdt: Avoid FDT manipulation after ExitBootServices() x86/efi: Always map the first physical page into the EFI pagetables
2017-02-02Merge tag 'xtensa-20170202' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds1-1/+1
Pull Xtensa fix from Max Filippov: "A for an Xtensa build error introduced in reset code refactoring series in v4.9: - fix noMMU build on cores with MMU" * tag 'xtensa-20170202' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: fix noMMU build on cores with MMU
2017-02-02ARM: orion5x: fix Makefile for linkstation-lschl.dtbArnd Bergmann1-1/+1
The rename of orion5x-lschl.dts needs to be reflected in the Makefile: make[3]: *** No rule to make target 'arch/arm/boot/dts/orion5x-lschl.dtb', needed by '__build'. Fixes: 6cfd3cd8d836 ("ARM: dts: orion5x-lschl: More consistent naming on linkstation series") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-02-01Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds1-46/+42
Pull crypto fixes from Herbert Xu: "This fixes a bug in CBC/CTR on ARM64 that breaks chaining as well as a bug in the core API that causes registration failures when a driver unloads and then reloads an algorithm" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: arm64/aes-blk - honour iv_out requirement in CBC and CTR modes crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg
2017-02-01perf/x86/intel/uncore: Make package handling more robustThomas Gleixner1-105/+91
The package management code in uncore relies on package mapping being available before a CPU is started. This changed with: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") because the ACPI/BIOS information turned out to be unreliable, but that left uncore in broken state. This was not noticed because on a regular boot all CPUs are online before uncore is initialized. Move the allocation to the CPU online callback and simplify the hotplug handling. At this point the package mapping is established and correct. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") Link: http://lkml.kernel.org/r/20170131230141.377156255@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01perf/x86/intel/uncore: Clean up hotplug conversion falloutThomas Gleixner1-40/+4
The recent conversion to the hotplug state machine kept two mechanisms from the original code: 1) The first_init logic which adds the number of online CPUs in a package to the refcount. That's wrong because the callbacks are executed for all online CPUs. Remove it so the refcounting is correct. 2) The on_each_cpu() call to undo box->init() in the error handling path. That's bogus because when the prepare callback fails no box has been initialized yet. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Fixes: 1a246b9f58c6 ("perf/x86/intel/uncore: Convert to hotplug state machine") Link: http://lkml.kernel.org/r/20170131230141.298032324@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01perf/x86/intel/rapl: Make package handling more robustThomas Gleixner1-34/+26
The package management code in RAPL relies on package mapping being available before a CPU is started. This changed with: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") because the ACPI/BIOS information turned out to be unreliable, but that left RAPL in broken state. This was not noticed because on a regular boot all CPUs are online before RAPL is initialized. A possible fix would be to reintroduce the mess which allocates a package data structure in CPU prepare and when it turns out to already exist in starting throw it away later in the CPU online callback. But that's a horrible hack and not required at all because RAPL becomes functional for perf only in the CPU online callback. That's correct because user space is not yet informed about the CPU being onlined, so nothing caan rely on RAPL being available on that particular CPU. Move the allocation to the CPU online callback and simplify the hotplug handling. At this point the package mapping is established and correct. This also adds a missing check for available package data in the event_init() function. Reported-by: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") Link: http://lkml.kernel.org/r/20170131230141.212593966@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-31xtensa: fix noMMU build on cores with MMUMax Filippov1-1/+1
Commit bf15f86b343ed8 ("xtensa: initialize MMU before jumping to reset vector") calls MMU management functions even when CONFIG_MMU is not selected. That breaks noMMU build on cores with MMU. Don't manage MMU when CONFIG_MMU is not selected. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-01-31x86/mce: Make timer handling more robustThomas Gleixner1-19/+12
Erik reported that on a preproduction hardware a CMCI storm triggers the BUG_ON in add_timer_on(). The reason is that the per CPU MCE timer is started by the CMCI logic before the MCE CPU hotplug callback starts the timer with add_timer_on(). So the timer is already queued which triggers the BUG. Using add_timer_on() is pretty pointless in this code because the timer is strictlty per CPU, initialized as pinned and all operations which arm the timer happen on the CPU to which the timer belongs. Simplify the whole machinery by using mod_timer() instead of add_timer_on() which avoids the problem because mod_timer() can handle already queued timers. Use __start_timer() everywhere so the earliest armed expiry time is preserved. Reported-by: Erik Veijola <erik.veijola@intel.com> Tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701310936080.3457@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-31x86/irq: Make irq activate operations symmetricThomas Gleixner2-0/+3
The recent commit which prevents double activation of interrupts unearthed interesting code in x86. The code (ab)uses irq_domain_activate_irq() to reconfigure an already activated interrupt. That trips over the prevention code now. Fix it by deactivating the interrupt before activating the new configuration. Fixes: 08d85f3ea99f1 "irqdomain: Avoid activating interrupts more than once" Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701311901580.3457@nanos
2017-01-31ARM: dts: orion5x-lschl: More consistent naming on linkstation seriesRoger Shimizu1-0/+0
DTS files, which includes orion5x-linkstation.dtsi, are named: orion5x-linkstation-*.dts So we rename the file below: arch/arm/boot/dts/orion5x-lschl.dts to the new name: arch/arm/boot/dts/orion5x-linkstation-lschl.dts Because DTS conversion of this device was just introduced in 4.9, Debian is still using legacy device support, other distros are the same, so here we won't expect any impact actually. Fixes: f94f268979a2 ("ARM: dts: orion5x: convert ls-chl to FDT") Cc: Ashley Hughes <ashley.hughes@blueyonder.co.uk> Signed-off-by: Roger Shimizu <rogershimizu@gmail.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-01-31ARM: dts: orion5x-lschl: Fix model nameRoger Shimizu1-2/+2
Model name should be consistent with legacy device file, so that user can migrate their system from legacy device support to device-tree safely. Legacy device file is currently removed, but it can be found on 4.8 or previous version of linux: arch/arm/mach-orion5x/ls-chl-setup.c Fixes: f94f268979a2 ("ARM: dts: orion5x: convert ls-chl to FDT") Cc: Ashley Hughes <ashley.hughes@blueyonder.co.uk> Signed-off-by: Roger Shimizu <rogershimizu@gmail.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-01-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds4-8/+81
Pull sparc fixes from David Miller: "Several small bug fixes and tidies, along with a fix for non-resumable memory errors triggered by userspace" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Handle PIO & MEM non-resumable errors. sparc64: Zero pages on allocation for mondo and error queues. sparc: Fixed typo in sstate.c. Replaced panicing with panicking sparc: use symbolic names for tsb indexing
2017-01-30sparc64: Handle PIO & MEM non-resumable errors.Liam R. Howlett1-0/+73
User processes trying to access an invalid memory address via PIO will receive a SIGBUS signal instead of causing a panic. Memory errors will receive a SIGKILL since a SIGBUS may result in a coredump which may attempt to repeat the faulting access. Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-30sparc64: Zero pages on allocation for mondo and error queues.Liam R. Howlett1-1/+1
Error queues use a non-zero first word to detect if the queues are full. Using pages that have not been zeroed may result in false positive overflow events. These queues are set up once during boot so zeroing all mondo and error queue pages is safe. Note that the false positive overflow does not always occur because the page allocation for these queues is so early in the boot cycle that higher number CPUs get fresh pages. It is only when traps are serviced with lower number CPUs who were given already used pages that this issue is exposed. Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>