aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86 (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-03-05x86: Deprecate a.out supportBorislav Petkov2-2/+1
Linux supports ELF binaries for ~25 years now. a.out coredumping has bitrotten quite significantly and would need some fixing to get it into shape again but considering how even the toolchains cannot create a.out executables in its default configuration, let's deprecate a.out support and remove it a couple of releases later, instead. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Richard Weinberger <richard@nod.at> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Jann Horn <jannh@google.com> Cc: <linux-api@vger.kernel.org> Cc: <linux-fsdevel@vger.kernel.org> Cc: lkml <linux-kernel@vger.kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <x86@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05a.out: remove core dumping supportLinus Torvalds2-226/+0
We're (finally) phasing out a.out support for good. As Borislav Petkov points out, we've supported ELF binaries for about 25 years by now, and coredumping in particular has bitrotted over the years. None of the tool chains even support generating a.out binaries any more, and the plan is to deprecate a.out support entirely for the kernel. But I want to start with just removing the core dumping code, because I can still imagine that somebody actually might want to support a.out as a simpler biinary format. Particularly if you generate some random binaries on the fly, ELF is a much more complicated format (admittedly ELF also does have a lot of toolchain support, mitigating that complexity a lot and you really should have moved over in the last 25 years). So it's at least somewhat possible that somebody out there has some workflow that still involves generating and running a.out executables. In contrast, it's very unlikely that anybody depends on debugging any legacy a.out core files. But regardless, I want this phase-out to be done in two steps, so that we can resurrect a.out support (if needed) without having to resurrect the core file dumping that is almost certainly not needed. Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial cut at this had missed. And Alan Cox points out that the a.out binary loader _could_ be done in user space if somebody wants to, but we might keep just the loader in the kernel if somebody really wants it, since the loader isn't that big and has no really odd special cases like the core dumping does. Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Jann Horn <jannh@google.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds9-702/+336
Pull crypto update from Herbert Xu: "API: - Add helper for simple skcipher modes. - Add helper to register multiple templates. - Set CRYPTO_TFM_NEED_KEY when setkey fails. - Require neither or both of export/import in shash. - AEAD decryption test vectors are now generated from encryption ones. - New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random fuzzing. Algorithms: - Conversions to skcipher and helper for many templates. - Add more test vectors for nhpoly1305 and adiantum. Drivers: - Add crypto4xx prng support. - Add xcbc/cmac/ecb support in caam. - Add AES support for Exynos5433 in s5p. - Remove sha384/sha512 from artpec7 as hardware cannot do partial hash" [ There is a merge of the Freescale SoC tree in order to pull in changes required by patches to the caam/qi2 driver. ] * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits) crypto: s5p - add AES support for Exynos5433 dt-bindings: crypto: document Exynos5433 SlimSSS crypto: crypto4xx - add missing of_node_put after of_device_is_available crypto: cavium/zip - fix collision with generic cra_driver_name crypto: af_alg - use struct_size() in sock_kfree_s() crypto: caam - remove redundant likely/unlikely annotation crypto: s5p - update iv after AES-CBC op end crypto: x86/poly1305 - Clear key material from stack in SSE2 variant crypto: caam - generate hash keys in-place crypto: caam - fix DMA mapping xcbc key twice crypto: caam - fix hash context DMA unmap size hwrng: bcm2835 - fix probe as platform device crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number crypto: stm32 - drop pointless static qualifier in stm32_hash_remove() crypto: chelsio - Fixed Traffic Stall crypto: marvell - Remove set but not used variable 'ivsize' crypto: ccp - Update driver messages to remove some confusion crypto: adiantum - add 1536 and 4096-byte test vectors crypto: nhpoly1305 - add a test vector with len % 16 != 0 crypto: arm/aes-ce - update IV after partial final CTR block ...
2019-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds4-43/+126
Pull networking updates from David Miller: "Here we go, another merge window full of networking and #ebpf changes: 1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP range without dealing with floods of ARP traffic, from Linus Lüssing. 2) Throttle buffered multicast packet transmission in mt76, from Felix Fietkau. 3) Support adaptive interrupt moderation in ice, from Brett Creeley. 4) A lot of struct_size conversions, from Gustavo A. R. Silva. 5) Add peek/push/pop commands to bpftool, as well as bash completion, from Stanislav Fomichev. 6) Optimize sk_msg_clone(), from Vakul Garg. 7) Add SO_BINDTOIFINDEX, from David Herrmann. 8) Be more conservative with local resends due to local congestion, from Yuchung Cheng. 9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata. 10) Add health buffer support to devlink, from Eran Ben Elisha. 11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen. 12) Add statistics to basic packet scheduler filter, from Cong Wang. 13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan. 14) Lots of new IP tunneling forwarding tests, also from Nir Dotan. 15) Add 3ad stats to bonding, from Nikolay Aleksandrov. 16) Lots of probing improvements for bpftool, from Quentin Monnet. 17) Various nfp drive #ebpf JIT improvements from Jakub Kicinski. 18) Allow #ebpf programs to access gso_segs from skb shared info, from Eric Dumazet. 19) Add sock_diag support for AF_XDP sockets, from Björn Töpel. 20) Support 22260 iwlwifi devices, from Luca Coelho. 21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov. 22) Add JMP32 instruction class support to #ebpf, from Jiong Wang. 23) Add spinlock support to #ebpf, from Alexei Starovoitov. 24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson. 25) Add device infomation API to devlink, from Jakub Kicinski. 26) Add new timestamping socket options which are y2038 safe, from Deepa Dinamani. 27) Add RX checksum offloading for various sh_eth chips, from Sergei Shtylyov. 28) Flow offload infrastructure, from Pablo Neira Ayuso. 29) Numerous cleanups, improvements, and bug fixes to the PHY layer and many drivers from Heiner Kallweit. 30) Lots of changes to try and make packet scheduler classifiers run lockless as much as possible, from Vlad Buslov. 31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows. 32) Add concurrency tests to tc-tests infrastructure, from Vlad Buslov. 33) Add hwmon support to aquantia, from Heiner Kallweit. 34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet. And I would be remiss if I didn't thank the various major networking subsystem maintainers for integrating much of this work before I even saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso, Johannes Berg, Kalle Valo, and many others. Thank you!" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits) net/sched: avoid unused-label warning net: ignore sysctl_devconf_inherit_init_net without SYSCTL phy: mdio-mux: fix Kconfig dependencies net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework selftest/net: Remove duplicate header sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79 net/mlx5e: Update tx reporter status in case channels were successfully opened devlink: Add support for direct reporter health state update devlink: Update reporter state to error even if recover aborted sctp: call iov_iter_revert() after sending ABORT team: Free BPF filter when unregistering netdev ip6mr: Do not call __IP6_INC_STATS() from preemptible context isdn: mISDN: Fix potential NULL pointer dereference of kzalloc net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs cxgb4/chtls: Prefix adapter flags with CXGB4 net-sysfs: Switch to bitmap_zalloc() mellanox: Switch to bitmap_zalloc() bpf: add test cases for non-pointer sanitiation logic mlxsw: i2c: Extend initialization by querying resources data ...
2019-03-04Merge tag 'regulator-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulatorLinus Torvalds1-1/+0
Pull regulator updates from Mark Brown: "The bulk of the standout changes in this release are cleanups, with the core work being a combination of factoring out common code into helpers and the completion of the conversion of the core to use GPIO descriptors. Summary: - Addition of helper functions for current limits and conversion of drivers to use them by Axel Lin. - Lots and lots of cleanups from Axel Lin. - Conversion of the core to use GPIO descriptors rather than numbers by Linus Walleij. - New drivers for Maxim MAX77650 and ROHM BD70528" * tag 'regulator-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (131 commits) regulator: mc13xxx: Constify regulator_ops variables regulator: palmas: Constify palmas_smps_ramp_delay array regulator: wm831x-dcdc: Convert to use regulator_set/get_current_limit_regmap regulator: pv88090: Convert to use regulator_set/get_current_limit_regmap regulator: pv88080: Convert to use regulator_set/get_current_limit_regmap regulator: pv88060: Convert to use regulator_set/get_current_limit_regmap regulator: max77650: Convert to use regulator_set/get_current_limit_regmap regulator: lp873x: Convert to use regulator_set/get_current_limit_regmap regulator: lp872x: Convert to use regulator_set/get_current_limit_regmap regulator: da9210: Convert to use regulator_set/get_current_limit_regmap regulator: da9055: Convert to use regulator_set/get_current_limit_regmap regulator: core: Add set/get_current_limit helpers for regmap users regulator: Fix comment for csel_reg and csel_mask regulator: stm32-vrefbuf: add power management support regulator: 88pm8607: Remove unused fields from struct pm8607_regulator_info regulator: 88pm8607: Simplify pm8607_list_voltage implementation regulator: cpcap: Constify omap4_regulators and xoom_regulators regulator: cpcap: Remove unused vsel_shift from struct cpcap_regulator dt-bindings: regulator: tps65218: rectify units of LS3 dt-bindings: regulator: add LS2 load switch documentation ...
2019-03-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-61/+5
2019-03-04get rid of legacy 'get_ds()' functionLinus Torvalds1-1/+0
Every in-kernel use of this function defined it to KERNEL_DS (either as an actual define, or as an inline function). It's an entirely historical artifact, and long long long ago used to actually read the segment selector valueof '%ds' on x86. Which in the kernel is always KERNEL_DS. Inspired by a patch from Jann Horn that just did this for a very small subset of users (the ones in fs/), along with Al who suggested a script. I then just took it to the logical extreme and removed all the remaining gunk. Roughly scripted with git grep -l '(get_ds())' -- :^tools/ | xargs sed -i 's/(get_ds())/(KERNEL_DS)/' git grep -lw 'get_ds' -- :^tools/ | xargs sed -i '/^#define get_ds()/d' plus manual fixups to remove a few unusual usage patterns, the couple of inline function cases and to fix up a comment that had become stale. The 'get_ds()' function remains in an x86 kvm selftest, since in user space it actually does something relevant. Inspired-by: Jann Horn <jannh@google.com> Inspired-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-04x86-64: add warning for non-canonical user access address dereferencesLinus Torvalds1-0/+1
This adds a warning (once) for any kernel dereference that has a user exception handler, but accesses a non-canonical address. It basically is a simpler - and more limited - version of commit 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses") that got reverted. Note that unlike that original commit, this only causes a warning, because there are real situations where we currently can do this (notably speculative argument fetching for uprobes etc). Also, unlike that original commit, this _only_ triggers for #GP accesses, so the cases of valid kernel pointers that cross into a non-mapped page aren't affected. The intent of this is two-fold: - the uprobe/tracing accesses really do need to be more careful. In particular, from a portability standpoint it's just wrong to think that "a pointer is a pointer", and use the same logic for any random pointer value you find on the stack. It may _work_ on x86-64, but it doesn't necessarily work on other architectures (where the same pointer value can be either a kernel pointer _or_ a user pointer, and you really need to be much more careful in how you try to access it) The warning can hopefully end up being a reminder that just any random pointer access won't do. - Kees in particular wanted a way to actually report invalid uses of wild pointers to user space accessors, instead of just silently failing them. Automated fuzzers want a way to get reports if the kernel ever uses invalid values that the fuzzer fed it. The non-canonical address range is a fair chunk of the address space, and with this you can teach syzkaller to feed in invalid pointer values and find cases where we do not properly validate user addresses (possibly due to bad uses of "set_fs()"). Acked-by: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-04Merge branch 'regulator-5.1' into regulator-nextMark Brown1-1/+0
2019-03-02Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-3/+5
Pull x86 fixes from Thomas Gleixner: "Two last minute fixes: - Prevent value evaluation via functions happening in the user access enabled region of __put_user() (put another way: make sure to evaluate the value to be stored in user space _before_ enabling user space accesses) - Correct the definition of a Hyper-V hypercall constant" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hyper-v: Fix definition of HV_MAX_FLUSH_REP_COUNT x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
2019-02-28x86/hyper-v: Fix definition of HV_MAX_FLUSH_REP_COUNTLan Tianyu1-1/+1
The max flush rep count of HvFlushGuestPhysicalAddressList hypercall is equal with how many entries of union hv_gpa_page_range can be populated into the input parameter page. The code lacks parenthesis around PAGE_SIZE - 2 * sizeof(u64) which results in bogus computations. Add them. Fixes: cc4edae4b924 ("x86/hyper-v: Add HvFlushGuestAddressList hypercall support") Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: kys@microsoft.com Cc: haiyangz@microsoft.com Cc: sthemmin@microsoft.com Cc: sashal@kernel.org Cc: bp@alien8.de Cc: hpa@zytor.com Cc: gregkh@linuxfoundation.org Cc: devel@linuxdriverproject.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190225143114.5149-1-Tianyu.Lan@microsoft.com
2019-02-28crypto: x86/poly1305 - Clear key material from stack in SSE2 variantTommi Hirvola1-0/+4
1-block SSE2 variant of poly1305 stores variables s1..s4 containing key material on the stack. This commit adds missing zeroing of the stack memory. Benchmarks show negligible performance hit (tested on i7-3770). Signed-off-by: Tommi Hirvola <tommi@hirvola.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-25x86/uaccess: Don't leak the AC flag into __put_user() value evaluationAndy Lutomirski1-2/+4
When calling __put_user(foo(), ptr), the __put_user() macro would call foo() in between __uaccess_begin() and __uaccess_end(). If that code were buggy, then those bugs would be run without SMAP protection. Fortunately, there seem to be few instances of the problem in the kernel. Nevertheless, __put_user() should be fixed to avoid doing this. Therefore, evaluate __put_user()'s argument before setting AC. This issue was noticed when an objtool hack by Peter Zijlstra complained about genregs_get() and I compared the assembly output to the C source. [ bp: Massage commit message and fixed up whitespace. ] Fixes: 11f1a4b9755f ("x86: reorganize SMAP handling in user space accesses") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190225125231.845656645@infradead.org
2019-02-25Revert "x86/fault: BUG() when uaccess helpers fault on kernel addresses"Linus Torvalds1-58/+0
This reverts commit 9da3f2b74054406f87dff7101a569217ffceb29b. It was well-intentioned, but wrong. Overriding the exception tables for instructions for random reasons is just wrong, and that is what the new code did. It caused problems for tracing, and it caused problems for strncpy_from_user(), because the new checks made perfectly valid use cases break, rather than catch things that did bad things. Unchecked user space accesses are a problem, but that's not a reason to add invalid checks that then people have to work around with silly flags (in this case, that 'kernel_uaccess_faults_ok' flag, which is just an odd way to say "this commit was wrong" and was sprinked into random places to hide the wrongness). The real fix to unchecked user space accesses is to get rid of the special "let's not check __get_user() and __put_user() at all" logic. Make __{get|put}_user() be just aliases to the regular {get|put}_user() functions, and make it impossible to access user space without having the proper checks in places. The raison d'être of the special double-underscore versions used to be that the range check was expensive, and if you did multiple user accesses, you'd do the range check up front (like the signal frame handling code, for example). But SMAP (on x86) and PAN (on ARM) have made that optimization pointless, because the _real_ expense is the "set CPU flag to allow user space access". Do let's not break the valid cases to catch invalid cases that shouldn't even exist. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Tobin C. Harding <tobin@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-4/+20
Three conflicts, one of which, for marvell10g.c is non-trivial and requires some follow-up from Heiner or someone else. The issue is that Heiner converted the marvell10g driver over to use the generic c45 code as much as possible. However, in 'net' a bug fix appeared which makes sure that a new local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0 is cleared. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22KVM: MMU: record maximum physical address width in kvm_mmu_extended_roleYu Zhang2-0/+2
Previously, commit 7dcd57552008 ("x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed") offered some optimization to avoid the unnecessary reconfiguration. Yet one scenario is broken - when cpuid changes VM's maximum physical address width, reconfiguration is needed to reset the reserved bits. Also, the TDP may need to reset its shadow_root_level when this value is changed. To fix this, a new field, maxphyaddr, is introduced in the extended role structure to keep track of the configured guest physical address width. Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-22kvm: x86: Return LA57 feature based on hardware capabilityYu Zhang1-0/+4
Previously, 'commit 372fddf70904 ("x86/mm: Introduce the 'no5lvl' kernel parameter")' cleared X86_FEATURE_LA57 in boot_cpu_data, if Linux chooses to not run in 5-level paging mode. Yet boot_cpu_data is queried by do_cpuid_ent() as the host capability later when creating vcpus, and Qemu will not be able to detect this feature and create VMs with LA57 feature. As discussed earlier, VMs can still benefit from extended linear address width, e.g. to enhance features like ASLR. So we would like to fix this, by return the true hardware capability when Qemu queries. Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-22x86/kvm/mmu: fix switch between root and guest MMUsVitaly Kuznetsov2-4/+14
Commit 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu") brought one subtle change: previously, when switching back from L2 to L1, we were resetting MMU hooks (like mmu->get_cr3()) in kvm_init_mmu() called from nested_vmx_load_cr3() and now we do that in nested_ept_uninit_mmu_context() when we re-target vcpu->arch.mmu pointer. The change itself looks logical: if nested_ept_init_mmu_context() changes something than nested_ept_uninit_mmu_context() restores it back. There is, however, one thing: the following call chain: nested_vmx_load_cr3() kvm_mmu_new_cr3() __kvm_mmu_new_cr3() fast_cr3_switch() cached_root_available() now happens with MMU hooks pointing to the new MMU (root MMU in our case) while previously it was happening with the old one. cached_root_available() tries to stash current root but it is incorrect to read current CR3 with mmu->get_cr3(), we need to use old_mmu->get_cr3() which in case we're switching from L2 to L1 is guest_mmu. (BTW, in shadow page tables case this is a non-issue because we don't switch MMU). While we could've tried to guess that we're switching between MMUs and call the right ->get_cr3() from cached_root_available() this seems to be overly complicated. Instead, just stash the corresponding CR3 when setting root_hpa and make cached_root_available() use the stashed value. Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller11-32/+99
Two easily resolvable overlapping change conflicts, one in TCP and one in the eBPF verifier. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-17Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds4-5/+34
Pull x86 fixes from Ingo Molnar: "Three changes: - An UV fix/quirk to pull UV BIOS calls into the efi_runtime_lock locking regime. (This done by aliasing __efi_uv_runtime_lock to efi_runtime_lock, which should make the quirk nature obvious and maintain the general policy that the EFI lock (name...) isn't exposed to drivers.) - Our version of MAGA: Make a.out Great Again. - Add a new Intel model name enumerator to an upstream header to help reduce dependencies going forward" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls x86/CPU: Add Icelake model number x86/a.out: Clear the dump structure initially
2019-02-17Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds3-2/+37
Pull perf fixes from Ingo Molnar: "Two fixes on the kernel side: fix an over-eager condition that failed larger perf ring-buffer sizes, plus fix crashes in the Intel BTS code for a corner case, found by fuzzing" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix impossible ring-buffer sizes warning perf/x86: Add check_period PMU callback
2019-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller6-32/+43
The netfilter conflicts were rather simple overlapping changes. However, the cls_tcindex.c stuff was a bit more complex. On the 'net' side, Cong is fixing several races and memory leaks. Whilst on the 'net-next' side we have Vlad adding the rtnl-ness support. What I've decided to do, in order to resolve this, is revert the conversion over to using a workqueue that Cong did, bringing us back to pure RCU. I did it this way because I believe that either Cong's races don't apply with have Vlad did things, or Cong will have to implement the race fix slightly differently. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15x86/platform/UV: Use efi_runtime_lock to serialise BIOS callsHedi Berriche2-3/+28
Calls into UV firmware must be protected against concurrency, expose the efi_runtime_lock to the UV platform, and use it to serialise UV BIOS calls. Signed-off-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Russ Anderson <rja@hpe.com> Reviewed-by: Dimitri Sivanich <sivanich@hpe.com> Reviewed-by: Mike Travis <mike.travis@hpe.com> Cc: Andy Shevchenko <andy@infradead.org> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Darren Hart <dvhart@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-efi <linux-efi@vger.kernel.org> Cc: platform-driver-x86@vger.kernel.org Cc: stable@vger.kernel.org # v4.9+ Cc: Steve Wahl <steve.wahl@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190213193413.25560-5-hedi.berriche@hpe.com
2019-02-14kvm: vmx: Fix entry number check for add_atomic_switch_msr()Xiaoyao Li1-1/+2
Commit ca83b4a7f2d068da79a0 ("x86/KVM/VMX: Add find_msr() helper function") introduces the helper function find_msr(), which returns -ENOENT when not find the msr in vmx->msr_autoload.guest/host. Correct checking contion of no more available entry in vmx->msr_autoload. Fixes: ca83b4a7f2d0 ("x86/KVM/VMX: Add find_msr() helper function") Cc: stable@vger.kernel.org Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-14KVM: x86: Recompute PID.ON when clearing PID.SNLuwei Kang3-21/+17
Some Posted-Interrupts from passthrough devices may be lost or overwritten when the vCPU is in runnable state. The SN (Suppress Notification) of PID (Posted Interrupt Descriptor) will be set when the vCPU is preempted (vCPU in KVM_MP_STATE_RUNNABLE state but not running on physical CPU). If a posted interrupt comes at this time, the irq remapping facility will set the bit of PIR (Posted Interrupt Requests) but not ON (Outstanding Notification). Then, the interrupt will not be seen by KVM, which always expects PID.ON=1 if PID.PIR=1 as documented in the Intel processor SDM but not in the VT-d specification. To fix this, restore the invariant after PID.SN is cleared. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-14x86/CPU: Add Icelake model numberRajneesh Bhardwaj1-0/+2
Add the CPUID model number of Icelake (ICL) mobile processors to the Intel family list. Icelake U/Y series uses model number 0x7E. Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David E. Box" <david.e.box@intel.com> Cc: dvhart@infradead.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: platform-driver-x86@vger.kernel.org Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190214115712.19642-2-rajneesh.bhardwaj@linux.intel.com
2019-02-13KVM: nVMX: Restore a preemption timer consistency checkSean Christopherson1-0/+4
A recently added preemption timer consistency check was unintentionally dropped when the consistency checks were being reorganized to match the SDM's ordering. Fixes: 461b4ba4c7ad ("KVM: nVMX: Move the checks for VM-Execution Control Fields to a separate helper function") Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-13x86/a.out: Clear the dump structure initiallyBorislav Petkov1-2/+4
dump_thread32() in aout_core_dump() does not clear the user32 structure allocated on the stack as the first thing on function entry. As a result, the dump.u_comm, dump.u_ar0 and dump.signal which get assigned before the clearing, get overwritten. Rename that function to fill_dump() to make it clear what it does and call it first thing. This was caught while staring at a patch by Derek Robson <robsonde@gmail.com>. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Derek Robson <robsonde@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Matz <matz@suse.de> Cc: x86@kernel.org Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190202005512.3144-1-robsonde@gmail.com
2019-02-12x86/kvm/nVMX: read from MSR_IA32_VMX_PROCBASED_CTLS2 only when it is availableVitaly Kuznetsov1-3/+5
SDM says MSR_IA32_VMX_PROCBASED_CTLS2 is only available "If (CPUID.01H:ECX.[5] && IA32_VMX_PROCBASED_CTLS[63])". It was found that some old cpus (namely "Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz (family: 0x6, model: 0xf, stepping: 0x6") don't have it. Add the missing check. Reported-by: Zdenek Kaspar <zkaspar82@gmail.com> Tested-by: Zdenek Kaspar <zkaspar82@gmail.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-11perf/x86: Add check_period PMU callbackJiri Olsa3-2/+37
Vince (and later on Ravi) reported crashes in the BTS code during fuzzing with the following backtrace: general protection fault: 0000 [#1] SMP PTI ... RIP: 0010:perf_prepare_sample+0x8f/0x510 ... Call Trace: <IRQ> ? intel_pmu_drain_bts_buffer+0x194/0x230 intel_pmu_drain_bts_buffer+0x160/0x230 ? tick_nohz_irq_exit+0x31/0x40 ? smp_call_function_single_interrupt+0x48/0xe0 ? call_function_single_interrupt+0xf/0x20 ? call_function_single_interrupt+0xa/0x20 ? x86_schedule_events+0x1a0/0x2f0 ? x86_pmu_commit_txn+0xb4/0x100 ? find_busiest_group+0x47/0x5d0 ? perf_event_set_state.part.42+0x12/0x50 ? perf_mux_hrtimer_restart+0x40/0xb0 intel_pmu_disable_event+0xae/0x100 ? intel_pmu_disable_event+0xae/0x100 x86_pmu_stop+0x7a/0xb0 x86_pmu_del+0x57/0x120 event_sched_out.isra.101+0x83/0x180 group_sched_out.part.103+0x57/0xe0 ctx_sched_out+0x188/0x240 ctx_resched+0xa8/0xd0 __perf_event_enable+0x193/0x1e0 event_function+0x8e/0xc0 remote_function+0x41/0x50 flush_smp_call_function_queue+0x68/0x100 generic_smp_call_function_single_interrupt+0x13/0x30 smp_call_function_single_interrupt+0x3e/0xe0 call_function_single_interrupt+0xf/0x20 </IRQ> The reason is that while event init code does several checks for BTS events and prevents several unwanted config bits for BTS event (like precise_ip), the PERF_EVENT_IOC_PERIOD allows to create BTS event without those checks being done. Following sequence will cause the crash: If we create an 'almost' BTS event with precise_ip and callchains, and it into a BTS event it will crash the perf_prepare_sample() function because precise_ip events are expected to come in with callchain data initialized, but that's not the case for intel_pmu_drain_bts_buffer() caller. Adding a check_period callback to be called before the period is changed via PERF_EVENT_IOC_PERIOD. It will deny the change if the event would become BTS. Plus adding also the limit_period check as well. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20190204123532.GA4794@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-10Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds4-26/+29
Pull x86 fixes from Ingo Molnar: "A handful of fixes: - Fix an MCE corner case bug/crash found via MCE injection testing - Fix 5-level paging boot crash - Fix MCE recovery cache invalidation bug - Fix regression on Xen guests caused by a recent PMD level mremap speedup optimization" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Make set_pmd_at() paravirt aware x86/mm/cpa: Fix set_mce_nospec() x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()
2019-02-10Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-6/+14
Pull perf fixes from Ingo Molnar: "A couple of kernel side fixes: - Fix the Intel uncore driver on certain hardware configurations - Fix a CPU hotplug related memory allocation bug - Remove a spurious WARN() ... plus also a handful of perf tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf script python: Add Python3 support to tests/attr.py perf trace: Support multiple "vfs_getname" probes perf symbols: Filter out hidden symbols from labels perf symbols: Add fallback definitions for GELF_ST_VISIBILITY() tools headers uapi: Sync linux/in.h copy from the kernel sources perf clang: Do not use 'return std::move(something)' perf mem/c2c: Fix perf_mem_events to support powerpc perf tests evsel-tp-sched: Fix bitwise operator perf/core: Don't WARN() for impossible ring-buffer sizes perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() perf/x86/intel/uncore: Add Node ID mask
2019-02-10x86/mm: Make set_pmd_at() paravirt awareJuergen Gross1-1/+1
set_pmd_at() calls native_set_pmd() unconditionally on x86. This was fine as long as only huge page entries were written via set_pmd_at(), as Xen pv guests don't support those. Commit 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") introduced a usage of set_pmd_at() possible on pv guests, leading to failures like: BUG: unable to handle kernel paging request at ffff888023e26778 #PF error: [PROT] [WRITE] RIP: e030:move_page_tables+0x7c1/0xae0 move_vma.isra.3+0xd1/0x2d0 __se_sys_mremap+0x3c6/0x5b0 do_syscall_64+0x49/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Make set_pmd_at() paravirt aware by just letting it use set_pmd(). Fixes: 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Cc: sstabellini@kernel.org Cc: hpa@zytor.com Cc: bp@alien8.de Cc: torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20190210074056.11842-1-jgross@suse.com
2019-02-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller16-17/+69
An ipvlan bug fix in 'net' conflicted with the abstraction away of the IPV6 specific support in 'net-next'. Similarly, a bug fix for mlx5 in 'net' conflicted with the flow action conversion in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08x86/mm/cpa: Fix set_mce_nospec()Peter Zijlstra1-25/+25
The recent commit fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") accidentally made the call to make_addr_canonical_again() go away, which breaks set_mce_nospec(). Re-instate the call to convert the address back into canonical form right before invoking either CLFLUSH or INVLPG. Rename the function while at it to be shorter (and less MAGA). Fixes: fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Rik van Riel <riel@surriel.com> Link: https://lkml.kernel.org/r/20190208120859.GH32511@hirez.programming.kicks-ass.net
2019-02-08crypto: x86/aesni-gcm - fix crash on empty plaintextEric Biggers1-5/+8
gcmaes_crypt_by_sg() dereferences the NULL pointer returned by scatterwalk_ffwd() when encrypting an empty plaintext and the source scatterlist ends immediately after the associated data. Fix it by only fast-forwarding to the src/dst data scatterlists if the data length is nonzero. This bug is reproduced by the "rfc4543(gcm(aes))" test vectors when run with the new AEAD test manager. Fixes: e845520707f8 ("crypto: aesni - Update aesni-intel_glue to use scatter/gather") Cc: <stable@vger.kernel.org> # v4.17+ Cc: Dave Watson <davejwatson@fb.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-08crypto: x86/morus - fix handling chunked inputs and MAY_SLEEPEric Biggers2-48/+31
The x86 MORUS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end. Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can incorrectly sleep in the skcipher_walk_*() functions while preemption has been disabled by kernel_fpu_begin(). Fix these bugs. Fixes: 56e8e57fc3a7 ("crypto: morus - Add common SIMD glue code for MORUS") Cc: <stable@vger.kernel.org> # v4.18+ Cc: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-08crypto: x86/aegis - fix handling chunked inputs and MAY_SLEEPEric Biggers3-69/+45
The x86 AEGIS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end. Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can incorrectly sleep in the skcipher_walk_*() functions while preemption has been disabled by kernel_fpu_begin(). Fix these bugs. Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations") Cc: <stable@vger.kernel.org> # v4.18+ Cc: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-08crypto: x86/crct10dif-pcl - cleanup and optimizationsEric Biggers2-559/+235
The x86, arm, and arm64 asm implementations of crct10dif are very difficult to understand partly because many of the comments, labels, and macros are named incorrectly: the lengths mentioned are usually off by a factor of two from the actual code. Many other things are unnecessarily convoluted as well, e.g. there are many more fold constants than actually needed and some aren't fully reduced. This series therefore cleans up all these implementations to be much more maintainable. I also made some small optimizations where I saw opportunities, resulting in slightly better performance. This patch cleans up the x86 version. As part of this, I removed support for len < 16 from the x86 assembly; now the glue code falls back to the generic table-based implementation in this case. Due to the overhead of kernel_fpu_begin(), this actually significantly improves performance on these lengths. (And even if kernel_fpu_begin() were free, the generic code is still faster for about len < 11.) This removal also eliminates error-prone special cases and makes the x86, arm32, and arm64 ports of the code match more closely. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-07KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)Peter Shier1-0/+1
Bugzilla: 1671904 There are multiple code paths where an hrtimer may have been started to emulate an L1 VMX preemption timer that can result in a call to free_nested without an intervening L2 exit where the hrtimer is normally cancelled. Unconditionally cancel in free_nested to cover all cases. Embargoed until Feb 7th 2019. Signed-off-by: Peter Shier <pshier@google.com> Reported-by: Jim Mattson <jmattson@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Message-Id: <20181011184646.154065-1-pshier@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)Paolo Bonzini1-0/+7
Bugzilla: 1671930 Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with memory operand, INVEPT, INVVPID) can incorrectly inject a page fault when passed an operand that points to an MMIO address. The page fault will use uninitialized kernel stack memory as the CR2 and error code. The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR exit to userspace; however, it is not an easy fix, so for now just ensure that the error code and CR2 are zero. Embargoed until Feb 7th 2019. Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-06x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 settingKirill A. Shutemov1-0/+2
RDMSR in the trampoline code overwrites EDX but that register is used to indicate whether 5-level paging has to be enabled and if clobbered, leads to failure to boot on a 5-level paging machine. Preserve EDX on the stack while we are dealing with EFER. Fixes: b677dfae5aa1 ("x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode") Reported-by: Kyle D Pelton <kyle.d.pelton@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: dave.hansen@linux.intel.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190206115253.1907-1-kirill.shutemov@linux.intel.com
2019-02-06regulator: fixed/gpio: Pull inversion/OD into gpiolibLinus Walleij1-1/+0
This pushes the handling of inversion semantics and open drain settings to the GPIO descriptor and gpiolib. All affected board files are also augmented. This is especially nice since we don't have to have any confusing flags passed around to the left and right littering the fixed and GPIO regulator drivers and the regulator core. It is all just very straight-forward: the core asks the GPIO line to be asserted or deasserted and gpiolib deals with the rest depending on how the platform is configured: if the line is active low, it deals with that, if the line is open drain, it deals with that too. Cc: Alexander Shiyan <shc_work@mail.ru> # i.MX boards user Cc: Haojian Zhuang <haojian.zhuang@gmail.com> # MMP2 maintainer Cc: Aaro Koskinen <aaro.koskinen@iki.fi> # OMAP1 maintainer Cc: Tony Lindgren <tony@atomide.com> # OMAP1,2,3 maintainer Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> # EM-X270 maintainer Cc: Robert Jarzmik <robert.jarzmik@free.fr> # EZX maintainer Cc: Philipp Zabel <philipp.zabel@gmail.com> # Magician maintainer Cc: Petr Cvek <petr.cvek@tul.cz> # Magician Cc: Robert Jarzmik <robert.jarzmik@free.fr> # PXA Cc: Paul Parsons <lost.distance@yahoo.com> # hx4700 Cc: Daniel Mack <zonque@gmail.com> # Raumfeld maintainer Cc: Marc Zyngier <marc.zyngier@arm.com> # Zeus maintainer Cc: Geert Uytterhoeven <geert+renesas@glider.be> # SuperH pinctrl/GPIO maintainer Cc: Russell King <rmk+kernel@armlinux.org.uk> # SA1100 Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> #OMAP1 Amstrad Delta Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-02-04perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()Peter Zijlstra1-5/+11
intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other members of struct cpu_hw_events. This memory is released in intel_pmu_cpu_dying() which is wrong. The counterpart of the intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu(). Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but allocate new memory on the next attempt to online the CPU (leaking the old memory). Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and CPUHP_PERF_X86_PREPARE then the CPU will go back online but never allocate the memory that was released in x86_pmu_dying_cpu(). Make the memory allocation/free symmetrical in regard to the CPU hotplug notifier by moving the deallocation to intel_pmu_cpu_dead(). This started in commit: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). In principle the bug was introduced in v2.6.39 (!), but it will almost certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15... [ bigeasy: Added patch description. ] [ mingo: Added backporting guidance. ] Reported-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jolsa@kernel.org Cc: kan.liang@linux.intel.com Cc: namhyung@kernel.org Cc: <stable@vger.kernel.org> Fixes: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04perf/x86/intel/uncore: Add Node ID maskKan Liang1-1/+3
Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE Superdome Flex). To understand which Socket the PCI uncore PMUs belongs to, perf retrieves the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI configuration space, and the mapping between Socket ID and Node ID from GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly. The local Node ID is only available at bit 2:0, but current code doesn't mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID will be fetched. Filter the Node ID by adding a mask. Reported-by: Song Liu <songliubraving@fb.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> # v3.7+ Fixes: 7c94ee2e0917 ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support") Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-03arch: Use asm-generic/socket.h when possibleDeepa Dinamani2-1/+1
Many architectures maintain an arch specific copy of the file even though there are no differences with the asm-generic one. Allow these architectures to use the generic one instead. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Cc: chris@zankel.net Cc: fenghua.yu@intel.com Cc: tglx@linutronix.de Cc: schwidefsky@de.ibm.com Cc: linux-ia64@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-s390@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds10-12/+24
Pull x86 fixes from Thomas Gleixner: "A few updates for x86: - Fix an unintended sign extension issue in the fault handling code - Rename the new resource control config switch so it's less confusing - Avoid setting up EFI info in kexec when the EFI runtime is disabled. - Fix the microcode version check in the AMD microcode loader so it only loads higher version numbers and never downgrades - Set EFER.LME in the 32bit trampoline before returning to long mode to handle older AMD/KVM behaviour properly. - Add Darren and Andy as x86/platform reviewers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Avoid confusion over the new X86_RESCTRL config x86/kexec: Don't setup EFI info if EFI runtime is not enabled x86/microcode/amd: Don't falsely trick the late loading mechanism MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewers x86/fault: Fix sign-extend unintended sign extension x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode x86/cpu: Add Atom Tremont (Jacobsville)
2019-02-03Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-2/+3
Pull cpu hotplug fixes from Thomas Gleixner: "Two fixes for the cpu hotplug machinery: - Replace the overly clever 'SMT disabled by BIOS' detection logic as it breaks KVM scenarios and prevents speculation control updates when the Hyperthreads are brought online late after boot. - Remove a redundant invocation of the speculation control update function" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM x86/speculation: Remove redundant arch_smt_update() invocation
2019-02-03x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()Tony Luck1-0/+1
Internal injection testing crashed with a console log that said: mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134 This caused a lot of head scratching because the MCACOD (bits 15:0) of that status is a signature from an L1 data cache error. But Linux says that it found it in "Bank 0", which on this model CPU only reports L1 instruction cache errors. The answer was that Linux doesn't initialize "m->bank" in the case that it finds a fatal error in the mce_no_way_out() pre-scan of banks. If this was a local machine check, then this partially initialized struct mce is being passed to mce_panic(). Fix is simple: just initialize m->bank in the case of a fatal error. Fixes: 40c36e2741d7 ("x86/mce: Fix incorrect "Machine check from unknown source" message") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: x86-ml <x86@kernel.org> Cc: stable@vger.kernel.org # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@intel.com
2019-02-02x86/resctrl: Avoid confusion over the new X86_RESCTRL configJohannes Weiner4-8/+8
"Resource Control" is a very broad term for this CPU feature, and a term that is also associated with containers, cgroups etc. This can easily cause confusion. Make the user prompt more specific. Match the config symbol name. [ bp: In the future, the corresponding ARM arch-specific code will be under ARM_CPU_RESCTRL and the arch-agnostic bits will be carved out under the CPU_RESCTRL umbrella symbol. ] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Babu Moger <Babu.Moger@amd.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morse <james.morse@arm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: linux-doc@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190130195621.GA30653@cmpxchg.org