aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-12-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds4-8/+19
Pull networking updates from David Miller: 1) New ipset extensions for matching on destination MAC addresses, from Stefano Brivio. 2) Add ipv4 ttl and tos, plus ipv6 flow label and hop limit offloads to nfp driver. From Stefano Brivio. 3) Implement GRO for plain UDP sockets, from Paolo Abeni. 4) Lots of work from Michał Mirosław to eliminate the VLAN_TAG_PRESENT bit so that we could support the entire vlan_tci value. 5) Rework the IPSEC policy lookups to better optimize more usecases, from Florian Westphal. 6) Infrastructure changes eliminating direct manipulation of SKB lists wherever possible, and to always use the appropriate SKB list helpers. This work is still ongoing... 7) Lots of PHY driver and state machine improvements and simplifications, from Heiner Kallweit. 8) Various TSO deferral refinements, from Eric Dumazet. 9) Add ntuple filter support to aquantia driver, from Dmitry Bogdanov. 10) Batch dropping of XDP packets in tuntap, from Jason Wang. 11) Lots of cleanups and improvements to the r8169 driver from Heiner Kallweit, including support for ->xmit_more. This driver has been getting some much needed love since he started working on it. 12) Lots of new forwarding selftests from Petr Machata. 13) Enable VXLAN learning in mlxsw driver, from Ido Schimmel. 14) Packed ring support for virtio, from Tiwei Bie. 15) Add new Aquantia AQtion USB driver, from Dmitry Bezrukov. 16) Add XDP support to dpaa2-eth driver, from Ioana Ciocoi Radulescu. 17) Implement coalescing on TCP backlog queue, from Eric Dumazet. 18) Implement carrier change in tun driver, from Nicolas Dichtel. 19) Support msg_zerocopy in UDP, from Willem de Bruijn. 20) Significantly improve garbage collection of neighbor objects when the table has many PERMANENT entries, from David Ahern. 21) Remove egdev usage from nfp and mlx5, and remove the facility completely from the tree as it no longer has any users. From Oz Shlomo and others. 22) Add a NETDEV_PRE_CHANGEADDR so that drivers can veto the change and therefore abort the operation before the commit phase (which is the NETDEV_CHANGEADDR event). From Petr Machata. 23) Add indirect call wrappers to avoid retpoline overhead, and use them in the GRO code paths. From Paolo Abeni. 24) Add support for netlink FDB get operations, from Roopa Prabhu. 25) Support bloom filter in mlxsw driver, from Nir Dotan. 26) Add SKB extension infrastructure. This consolidates the handling of the auxiliary SKB data used by IPSEC and bridge netfilter, and is designed to support the needs to MPTCP which could be integrated in the future. 27) Lots of XDP TX optimizations in mlx5 from Tariq Toukan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1845 commits) net: dccp: fix kernel crash on module load drivers/net: appletalk/cops: remove redundant if statement and mask bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw net/net_namespace: Check the return value of register_pernet_subsys() net/netlink_compat: Fix a missing check of nla_parse_nested ieee802154: lowpan_header_create check must check daddr net/mlx4_core: drop useless LIST_HEAD mlxsw: spectrum: drop useless LIST_HEAD net/mlx5e: drop useless LIST_HEAD iptunnel: Set tun_flags in the iptunnel_metadata_reply from src net/mlx5e: fix semicolon.cocci warnings staging: octeon: fix build failure with XFRM enabled net: Revert recent Spectre-v1 patches. can: af_can: Fix Spectre v1 vulnerability packet: validate address length if non-zero nfc: af_nfc: Fix Spectre v1 vulnerability phonet: af_phonet: Fix Spectre v1 vulnerability net: core: Fix Spectre v1 vulnerability net: minor cleanup in skb_ext_add() net: drop the unused helper skb_ext_get() ...
2018-12-27Merge tag 'pstore-v4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds1-2/+0
Pull pstore updates from Kees Cook: "Improvements and refactorings: - Improve compression handling - Refactor argument handling during initialization - Avoid needless locking for saner EFI backend handling - Add more kern-doc and improve debugging output" * tag 'pstore-v4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ram: Avoid NULL deref in ftrace merging failure path pstore: Convert buf_lock to semaphore pstore: Fix bool initialization/comparison pstore/ram: Do not treat empty buffers as valid pstore/ram: Simplify ramoops_get_next_prz() arguments pstore: Map PSTORE_TYPE_* to strings pstore: Replace open-coded << with BIT() pstore: Improve and update some comments and status output pstore/ram: Add kern-doc for struct persistent_ram_zone pstore/ram: Report backend assignments with finer granularity pstore/ram: Standardize module name in ramoops pstore: Avoid duplicate call of persistent_ram_zap() pstore: Remove needless lock during console writes pstore: Do not use crash buffer for decompression
2018-12-27Merge tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds247-3265/+3684
Pull powerpc updates from Michael Ellerman: "Notable changes: - Mitigations for Spectre v2 on some Freescale (NXP) CPUs. - A large series adding support for pass-through of Nvidia V100 GPUs to guests on Power9. - Another large series to enable hardware assistance for TLB table walk on MPC8xx CPUs. - Some preparatory changes to our DMA code, to make way for further cleanups from Christoph. - Several fixes for our Transactional Memory handling discovered by fuzzing the signal return path. - Support for generating our system call table(s) from a text file like other architectures. - A fix to our page fault handler so that instead of generating a WARN_ON_ONCE, user accesses of kernel addresses instead print a ratelimited and appropriately scary warning. - A cosmetic change to make our unhandled page fault messages more similar to other arches and also more compact and informative. - Freescale updates from Scott: "Highlights include elimination of legacy clock bindings use from dts files, an 83xx watchdog handler, fixes to old dts interrupt errors, and some minor cleanup." And many clean-ups, reworks and minor fixes etc. Thanks to: Alexandre Belloni, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Benjamin Herrenschmidt, Breno Leitao, Christian Lamparter, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Darren Stevens, David Gibson, Diana Craciun, Dmitry V. Levin, Firoz Khan, Geert Uytterhoeven, Greg Kurz, Gustavo Romero, Hari Bathini, Joel Stanley, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Michal Suchánek, Naveen N. Rao, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Ram Pai, Ravi Bangoria, Rob Herring, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Satheesh Rajendran, Scott Wood, Segher Boessenkool, Stephen Rothwell, Tang Yuantian, Thiago Jung Bauermann, Yangtao Li, Yuantian Tang, Yue Haibing" * tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (201 commits) Revert "powerpc/fsl_pci: simplify fsl_pci_dma_set_mask" powerpc/zImage: Also check for stdout-path powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=y macintosh: Use of_node_name_{eq, prefix} for node name comparisons ide: Use of_node_name_eq for node name comparisons powerpc: Use of_node_name_eq for node name comparisons powerpc/pseries/pmem: Convert to %pOFn instead of device_node.name powerpc/mm: Remove very old comment in hash-4k.h powerpc/pseries: Fix node leak in update_lmb_associativity_index() powerpc/configs/85xx: Enable CONFIG_DEBUG_KERNEL powerpc/dts/fsl: Fix dtc-flagged interrupt errors clk: qoriq: add more compatibles strings powerpc/fsl: Use new clockgen binding powerpc/83xx: handle machine check caused by watchdog timer powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved" powerpc/fsl_pci: simplify fsl_pci_dma_set_mask arch/powerpc/fsl_rmu: Use dma_zalloc_coherent vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver vfio_pci: Allow regions to add own capabilities vfio_pci: Allow mapping extra regions ...
2018-12-26Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-4/+0
Pull x86 build updates from Ingo Molnar: - Resolve LLVM build bug by removing redundant GNU specific flag - Remove obsolete -funit-at-a-time and -fno-unit-at-a-time use from x86 PowerPC and UM. The UML change was seen and acked by UML maintainer Richard Weinberger. * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/um/vdso: Drop implicit common-page-size linker flag x86, powerpc: Remove -funit-at-a-time compiler option entirely x86/um: Remove -fno-unit-at-a-time workaround for pre-4.0 GCC
2018-12-26Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull RCU updates from Ingo Molnar: "The biggest RCU changes in this cycle were: - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar. - Replace calls of RCU-bh and RCU-sched update-side functions to their vanilla RCU counterparts. This series is a step towards complete removal of the RCU-bh and RCU-sched update-side functions. ( Note that some of these conversions are going upstream via their respective maintainers. ) - Documentation updates, including a number of flavor-consolidation updates from Joel Fernandes. - Miscellaneous fixes. - Automate generation of the initrd filesystem used for rcutorture testing. - Convert spin_is_locked() assertions to instead use lockdep. ( Note that some of these conversions are going upstream via their respective maintainers. ) - SRCU updates, especially including a fix from Dennis Krein for a bag-on-head-class bug. - RCU torture-test updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits) rcutorture: Don't do busted forward-progress testing rcutorture: Use 100ms buckets for forward-progress callback histograms rcutorture: Recover from OOM during forward-progress tests rcutorture: Print forward-progress test age upon failure rcutorture: Print time since GP end upon forward-progress failure rcutorture: Print histogram of CB invocation at OOM time rcutorture: Print GP age upon forward-progress failure rcu: Print per-CPU callback counts for forward-progress failures rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings rcutorture: Dump grace-period diagnostics upon forward-progress OOM rcutorture: Prepare for asynchronous access to rcu_fwd_startat torture: Remove unnecessary "ret" variables rcutorture: Affinity forward-progress test to avoid housekeeping CPUs rcutorture: Break up too-long rcu_torture_fwd_prog() function rcutorture: Remove cbflood facility torture: Bring any extra CPUs online during kernel startup rcutorture: Add call_rcu() flooding forward-progress tests rcutorture/formal: Replace synchronize_sched() with synchronize_rcu() tools/kernel.h: Replace synchronize_sched() with synchronize_rcu() net/decnet: Replace rcu_barrier_bh() with rcu_barrier() ...
2018-12-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds19-97/+518
Pull KVM updates from Paolo Bonzini: "ARM: - selftests improvements - large PUD support for HugeTLB - single-stepping fixes - improved tracing - various timer and vGIC fixes x86: - Processor Tracing virtualization - STIBP support - some correctness fixes - refactorings and splitting of vmx.c - use the Hyper-V range TLB flush hypercall - reduce order of vcpu struct - WBNOINVD support - do not use -ftrace for __noclone functions - nested guest support for PAUSE filtering on AMD - more Hyper-V enlightenments (direct mode for synthetic timers) PPC: - nested VFIO s390: - bugfixes only this time" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: x86: Add CPUID support for new instruction WBNOINVD kvm: selftests: ucall: fix exit mmio address guessing Revert "compiler-gcc: disable -ftracer for __noclone functions" KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range() KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp() KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte() KVM: Make kvm_set_spte_hva() return int KVM: Replace old tlb flush function with new one to flush a specified range. KVM/MMU: Add tlb flush with range helper function KVM/VMX: Add hv tlb range flush support x86/hyper-v: Add HvFlushGuestAddressList hypercall support KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops KVM: x86: Disable Intel PT when VMXON in L1 guest KVM: x86: Set intercept for Intel PT MSRs read/write KVM: x86: Implement Intel PT MSRs read/write emulation ...
2018-12-25Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds1-54/+0
Pull arm64 festive updates from Will Deacon: "In the end, we ended up with quite a lot more than I expected: - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and kernel-side support to come later) - Support for per-thread stack canaries, pending an update to GCC that is currently undergoing review - Support for kexec_file_load(), which permits secure boot of a kexec payload but also happens to improve the performance of kexec dramatically because we can avoid the sucky purgatory code from userspace. Kdump will come later (requires updates to libfdt). - Optimisation of our dynamic CPU feature framework, so that all detected features are enabled via a single stop_machine() invocation - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that they can benefit from global TLB entries when KASLR is not in use - 52-bit virtual addressing for userspace (kernel remains 48-bit) - Patch in LSE atomics for per-cpu atomic operations - Custom preempt.h implementation to avoid unconditional calls to preempt_schedule() from preempt_enable() - Support for the new 'SB' Speculation Barrier instruction - Vectorised implementation of XOR checksumming and CRC32 optimisations - Workaround for Cortex-A76 erratum #1165522 - Improved compatibility with Clang/LLD - Support for TX2 system PMUS for profiling the L3 cache and DMC - Reflect read-only permissions in the linear map by default - Ensure MMIO reads are ordered with subsequent calls to Xdelay() - Initial support for memory hotplug - Tweak the threshold when we invalidate the TLB by-ASID, so that mremap() performance is improved for ranges spanning multiple PMDs. - Minor refactoring and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits) arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset() arm64: sysreg: Use _BITUL() when defining register bits arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4 arm64: docs: document pointer authentication arm64: ptr auth: Move per-thread keys from thread_info to thread_struct arm64: enable pointer authentication arm64: add prctl control for resetting ptrauth keys arm64: perf: strip PAC when unwinding userspace arm64: expose user PAC bit positions via ptrace arm64: add basic pointer authentication support arm64/cpufeature: detect pointer authentication arm64: Don't trap host pointer auth use to EL2 arm64/kvm: hide ptrauth from guests arm64/kvm: consistently handle host HCR_EL2 flags arm64: add pointer authentication register bits arm64: add comments about EC exception levels arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field arm64: enable per-task stack canaries ...
2018-12-24Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into nextMichael Ellerman33-550/+210
Freescale updates from Scott: "Highlights include elimination of legacy clock bindings use from dts files, an 83xx watchdog handler, fixes to old dts interrupt errors, and some minor cleanup."
2018-12-23Revert "powerpc/fsl_pci: simplify fsl_pci_dma_set_mask"Scott Wood1-1/+5
This reverts commit c6e5485e0cb509292a14e880e1944143f99758c7 due to failures such as: e1000e 2000:01:00.0: Tx DMA map failed Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-22powerpc/zImage: Also check for stdout-pathOliver O'Halloran1-1/+2
The /chosen/linux,stdout-path is "deprecated" in favour of /chosen/stdout-path so we should be checking for both. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-22powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=yBenjamin Herrenschmidt1-1/+1
HMIs will crash the kernel due to BRANCH_LINK_TO_FAR(hmi_exception_realmode) Calling into the OPD instead of the actual code. Fixes: 2337d207288f ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Use DOTSYM() rather than #ifdef] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-22powerpc: Use of_node_name_eq for node name comparisonsRob Herring15-64/+39
Convert string compares of DT node names to use of_node_name_eq helper instead. This removes direct access to the node name pointer. A couple of open coded iterating thru the child node names are converted to use for_each_child_of_node() instead. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-22powerpc/pseries/pmem: Convert to %pOFn instead of device_node.nameRob Herring1-4/+4
In preparation to remove the node name pointer from struct device_node, convert printf users to use the %pOFn format specifier. pmem.c was recently added and missed the initial conversion. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-22powerpc/mm: Remove very old comment in hash-4k.hMichael Ellerman1-5/+1
This comment talks about PTEs being 64-bits and PMD/PGD being 32-bits, but that hasn't been true since 2005 when David Gibson implemented 4-level page tables in the commit titled "Four level pagetables for ppc64". Remove it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-22powerpc/pseries: Fix node leak in update_lmb_associativity_index()Michael Ellerman1-0/+1
In update_lmb_associativity_index() we lookup dr_node using of_find_node_by_path() which takes a reference for us. In the non-error case we forget to drop the reference. Note that find_aa_index() does modify properties of the node, but doesn't need an extra reference held once it's returned. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/configs/85xx: Enable CONFIG_DEBUG_KERNELScott Wood1-0/+1
This is required for CONFIG_DEBUG_INFO to work. Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-21powerpc/dts/fsl: Fix dtc-flagged interrupt errorsScott Wood5-134/+132
mpc8641_hpcn was updated to 4-cell interrupt specifiers, but PCI interrupt-map was not updated. It was also missing #interrupt-cells on the outer PCI buses. p1020rdb-pc was updated to 4-cell interrupt specifiers, but the ethernet-phy nodes weren't updated. mpc832x_rdb had an invalid "interrupts = <0>" on the ethernet-phy nodes. Besides being the wrong number of cells, 0 is not a valid IPIC interrupt according to ipic.c. Presumably it was meant to indicate that these PHYs are not connected to an interrupt. Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-21powerpc/fsl: Use new clockgen bindingScott Wood22-409/+50
The driver retains compatibility with old device trees, but we don't want the old nodes lying around to be copied, or used as a reference (some of the mux options are incorrect), or even just being clutter. Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Tang Yuantian <andy.tang@nxp.com> [scottwood: removed sysclk node added by Andy] Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-21powerpc/83xx: handle machine check caused by watchdog timerChristophe Leroy4-4/+26
When the watchdog timer is set in interrupt mode, it causes a machine check when it times out. The purpose of this mode is to ease debugging, not to crash the kernel and reboot the machine. This patch implements a special handling for that, in order to not crash the kernel if the watchdog times out while in interrupt or within the idle task. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [scottwood: added missing #include] Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-21powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved"Alexandre Belloni1-1/+1
Fix a spelling mistake in a register description. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-21powerpc/fsl_pci: simplify fsl_pci_dma_set_maskChristoph Hellwig1-5/+1
swiotlb will only bounce buffer when the effective dma address for the device is smaller than the actual DMA range. Instead of flipping between the swiotlb and nommu ops for FSL SOCs that have the second outbound window just don't set the bus dma_mask in this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-21arch/powerpc/fsl_rmu: Use dma_zalloc_coherentSabyasachi Gupta1-3/+1
Replaced dma_alloc_coherent + memset with dma_zalloc_coherent Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>
2018-12-21Merge tag 'kvm-ppc-next-4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-nextPaolo Bonzini4-12/+93
Second PPC KVM update for 4.21 This has 5 commits that fix page dirty tracking when running nested HV KVM guests, from Suraj Jitindar Singh.
2018-12-21KVM: Make kvm_set_spte_hva() return intLan Tianyu3-3/+5
The patch is to make kvm_set_spte_hva() return int and caller can check return value to determine flush tlb or not. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21powerpc/powernv/npu: Fault user page into the hypervisor's pagetableAlexey Kardashevskiy1-6/+7
When a page fault happens in a GPU, the GPU signals the OS and the GPU driver calls the fault handler which populated a page table; this allows the GPU to complete an ATS request. On the bare metal get_user_pages() is enough as it adds a pte to the kernel page table but under KVM the partition scope tree does not get updated so ATS will still fail. This reads a byte from an effective address which causes HV storage interrupt and KVM updates the partition scope tree. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/npu: Check mmio_atsd array bounds when populatingAlexey Kardashevskiy1-2/+3
A broken device tree might contain more than 8 values and introduce hard to debug memory corruption bug. This adds the boundary check. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/npu: Add release_ownership hookAlexey Kardashevskiy1-0/+51
In order to make ATS work and translate addresses for arbitrary LPID and PID, we need to program an NPU with LPID and allow PID wildcard matching with a specific MSR mask. This implements a helper to assign a GPU to LPAR and program the NPU with a wildcard for PID and a helper to do clean-up. The helper takes MSR (only DR/HV/PR/SF bits are allowed) to program them into NPU2 for ATS checkout requests support. This exports pnv_npu2_unmap_lpar_dev() as following patches will use it from the VFIO driver. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/npu: Add compound IOMMU groupsAlexey Kardashevskiy4-136/+322
At the moment the powernv platform registers an IOMMU group for each PE. There is an exception though: an NVLink bridge which is attached to the corresponding GPU's IOMMU group making it a master. Now we have POWER9 systems with GPUs connected to each other directly bypassing PCI. At the moment we do not control state of these links so we have to put such interconnected GPUs to one IOMMU group which means that the old scheme with one GPU as a master won't work - there will be up to 3 GPUs in such group. This introduces a npu_comp struct which represents a compound IOMMU group made of multiple PEs - PCI PEs (for GPUs) and NPU PEs (for NVLink bridges). This converts the existing NVLink1 code to use the new scheme. >From now on, each PE must have a valid iommu_table_group_ops which will either be called directly (for a single PE group) or indirectly from a compound group handlers. This moves IOMMU group registration for NVLink-connected GPUs to npu-dma.c. For POWER8, this stores a new compound group pointer in the PE (so a GPU is still a master); for POWER9 the new group pointer is stored in an NPU (which is allocated per a PCI host controller). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Initialise npdev to NULL in pnv_try_setup_npu_table_group()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/npu: Convert NPU IOMMU helpers to iommu_table_group_opsAlexey Kardashevskiy3-15/+34
At the moment NPU IOMMU is manipulated directly from the IODA2 PCI PE code; PCI PE acts as a master to NPU PE. Soon we will have compound IOMMU groups with several PEs from several different PHB (such as interconnected GPUs and NPUs) so there will be no single master but a one big IOMMU group. This makes a first step and converts an NPU PE with a set of extern function to a table group. This should cause no behavioral change. Note that pnv_npu_release_ownership() has never been implemented. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/npu: Move single TVE handling to NPU PEAlexey Kardashevskiy2-24/+11
Normal PCI PEs have 2 TVEs, one per a DMA window; however NPU PE has only one which points to one of two tables of the corresponding PCI PE. So whenever a new DMA window is programmed to PEs, the NPU PE needs to release old table in order to use the new one. Commit d41ce7b1bcc3e ("powerpc/powernv/npu: Do not try invalidating 32bit table when 64bit table is enabled") did just that but in pci-ioda.c while it actually belongs to npu-dma.c. This moves the single TVE handling to npu-dma.c. This does not implement restoring though as it is highly unlikely that we can set the table to PCI PE and cannot to NPU PE and if that fails, we could only set 32bit table to NPU PE and this configuration is not really supported or wanted. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv: Reference iommu_table while it is linked to a groupAlexey Kardashevskiy2-5/+2
The iommu_table pointer stored in iommu_table_group may get stale by accident, this adds referencing and removes a redundant comment about this. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/iommu_api: Move IOMMU groups setup to a single placeAlexey Kardashevskiy1-14/+68
Registering new IOMMU groups and adding devices to them are separated in code and the latter is dug in the DMA setup code which it does not really belong to. This moved IOMMU groups setup to a separate helper which registers a group and adds devices as before. This does not make a difference as IOMMU groups are not used anyway; the only dependency here is that iommu_add_device() requires a valid pointer to an iommu_table (set by set_iommu_table_base()). To keep the old behaviour, this does not add new IOMMU groups for PEs with no DMA weight and also skips NVLink bridges which do not have pci_controller_ops::setup_bridge (the normal way of adding PEs). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/pseries: Rework device adding to IOMMU groupsAlexey Kardashevskiy5-95/+74
The powernv platform registers IOMMU groups and adds devices to them from the pci_controller_ops::setup_bridge() hook except one case when virtual functions (SRIOV VFs) are added from a bus notifier. The pseries platform registers IOMMU groups from the pci_controller_ops::dma_bus_setup() hook and adds devices from the pci_controller_ops::dma_dev_setup() hook. The very same bus notifier used for powernv does not add devices for pseries though as __of_scan_bus() adds devices first, then it does the bus/dev DMA setup. Both platforms use iommu_add_device() which takes a device and expects it to have a valid IOMMU table struct with an iommu_table_group pointer which in turn points the iommu_group struct (which represents an IOMMU group). Although the helper seems easy to use, it relies on some pre-existing device configuration and associated data structures which it does not really need. This simplifies iommu_add_device() to take the table_group pointer directly. Pseries already has a table_group pointer handy and the bus notified is not used anyway. For powernv, this copies the existing bus notifier, makes it work for powernv only which means an easy way of getting to the table_group pointer. This was tested on VFs but should also support physical PCI hotplug. Since iommu_add_device() receives the table_group pointer directly, pseries does not do TCE cache invalidation (the hypervisor does) nor allow multiple groups per a VFIO container (in other words sharing an IOMMU table between partitionable endpoints), this removes iommu_table_group_link from pseries. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/pseries: Remove IOMMU API support for non-LPAR systemsAlexey Kardashevskiy1-7/+2
The pci_dma_bus_setup_pSeries and pci_dma_dev_setup_pSeries hooks are registered for the pseries platform which does not have FW_FEATURE_LPAR; these would be pre-powernv platforms which we never supported PCI pass through for anyway so remove it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/pseries/npu: Enable platform supportAlexey Kardashevskiy1-0/+22
We already changed NPU API for GPUs to not to call OPAL and the remaining bit is initializing NPU structures. This searches for POWER9 NVLinks attached to any device on a PHB and initializes an NPU structure if any found. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculationAlexey Kardashevskiy1-1/+32
We might have memory@ nodes with "linux,usable-memory" set to zero (for example, to replicate powernv's behaviour for GPU coherent memory) which means that the memory needs an extra initialization but since it can be used afterwards, the pseries platform will try mapping it for DMA so the DMA window needs to cover those memory regions too; if the window cannot cover new memory regions, the memory onlining fails. This walks through the memory nodes to find the highest RAM address to let a huge DMA window cover that too in case this memory gets onlined later. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv/npu: Move OPAL calls away from context manipulationAlexey Kardashevskiy4-54/+77
When introduced, the NPU context init/destroy helpers called OPAL which enabled/disabled PID (a userspace memory context ID) filtering in an NPU per a GPU; this was a requirement for P9 DD1.0. However newer chip revision added a PID wildcard support so there is no more need to call OPAL every time a new context is initialized. Also, since the PID wildcard support was added, skiboot does not clear wildcard entries in the NPU so these remain in the hardware till the system reboot. This moves LPID and wildcard programming to the PE setup code which executes once during the booting process so NPU2 context init/destroy won't need to do additional configuration. This replaces the check for FW_FEATURE_OPAL with a check for npu!=NULL as this is the way to tell if the NPU support is present and configured. This moves pnv_npu2_init() declaration as pseries should be able to use it. This keeps pnv_npu2_map_lpar() in powernv as pseries is not allowed to call that. This exports pnv_npu2_map_lpar_dev() as following patches will use it from the VFIO driver. While at it, replace redundant list_for_each_entry_safe() with a simpler list_for_each_entry(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/powernv: Move npu struct from pnv_phb to pci_controllerAlexey Kardashevskiy4-35/+58
The powernv PCI code stores NPU data in the pnv_phb struct. The latter is referenced by pci_controller::private_data. We are going to have NPU2 support in the pseries platform as well but it does not store any private_data in in the pci_controller struct; and even if it did, it would be a different data structure. This makes npu a pointer and stores it one level higher in the pci_controller struct. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/vfio/iommu/kvm: Do not pin device memoryAlexey Kardashevskiy5-22/+116
This new memory does not have page structs as it is not plugged to the host so gup() will fail anyway. This adds 2 helpers: - mm_iommu_newdev() to preregister the "memory device" memory so the rest of API can still be used; - mm_iommu_is_devmem() to know if the physical address is one of thise new regions which we must avoid unpinning of. This adds @mm to tce_page_is_contained() and iommu_tce_xchg() to test if the memory is device memory to avoid pfn_to_page(). This adds a check for device memory in mm_iommu_ua_mark_dirty_rm() which does delayed pages dirtying. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/mm/iommu/vfio_spapr_tce: Change mm_iommu_get to reference a regionAlexey Kardashevskiy2-12/+11
Normally mm_iommu_get() should add a reference and mm_iommu_put() should remove it. However historically mm_iommu_find() does the referencing and mm_iommu_get() is doing allocation and referencing. We are going to add another helper to preregister device memory so instead of having mm_iommu_new() (which pre-registers the normal memory and references the region), we need separate helpers for pre-registering and referencing. This renames: - mm_iommu_get to mm_iommu_new; - mm_iommu_find to mm_iommu_get. This changes mm_iommu_get() to reference the region so the name now reflects what it does. This removes the check for exact match from mm_iommu_new() as we want it to fail on existing regions; mm_iommu_get() should be used instead. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2Alexey Kardashevskiy1-0/+10
The skiboot firmware has a hot reset handler which fences the NVIDIA V100 GPU RAM on Witherspoons and makes accesses no-op instead of throwing HMIs: https://github.com/open-power/skiboot/commit/fca2b2b839a67 Now we are going to pass V100 via VFIO which most certainly involves KVM guests which are often terminated without getting a chance to offline GPU RAM so we end up with a running machine with misconfigured memory. Accessing this memory produces hardware management interrupts (HMI) which bring the host down. To suppress HMIs, this wires up this hot reset hook to vfio_pci_disable() via pci_disable_device() which switches NPU2 to a safe mode and prevents HMIs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/mm: Fix reporting of kernel execute faults on the 8xxChristophe Leroy1-1/+3
On the 8xx, no-execute is set via PPP bits in the PTE. Therefore a no-exec fault generates DSISR_PROTFAULT error bits, not DSISR_NOEXEC_OR_G. This patch adds DSISR_PROTFAULT in the test mask. Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc: generate uapi header and system call table filesFiroz Khan9-909/+26
System call table generation script must be run to gener- ate unistd_32/64.h and syscall_table_32/64/c32/spu.h files. This patch will have changes which will invokes the script. This patch will generate unistd_32/64.h and syscall_table- _32/64/c32/spu.h files by the syscall table generation script invoked by parisc/Makefile and the generated files against the removed files must be identical. The generated uapi header file will be included in uapi/- asm/unistd.h and generated system call table header file will be included by kernel/systbl.S file. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc: add system call table generation supportFiroz Khan4-0/+563
The system call tables are in different format in all architecture and it will be difficult to manually add or modify the system calls in the respective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implementation across all architectures. The system call table generation script is added in syscalls directory which contain the script to generate both uapi header file and system call table files. The syscall.tbl file will be the input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header- unistd_32/64.h and syscall_table_32/64/c32/spu.h files respectively. File syscall_table_32/64/c32/spu.h is incl- uded by syscall.S - the real system call table. Both *.sh files will parse the content syscall.tbl to generate the header and table files. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc: split compat syscall table out from native tableFiroz Khan4-13/+39
PowerPC uses a syscall table with native and compat calls interleaved, which is a slightly simpler way to define two matching tables. As we move to having the tables generated, that advantage is no longer important, but the interleaved table gets in the way of using the same scripts as on the other archit- ectures. Split out a new compat_sys_call_table symbol that contains all the compat calls, and leave the main table for the nat- ive calls, to more closely match the method we use every- where else. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc: move macro definition from asm/systbl.hFiroz Khan2-1/+1
Move the macro definition for compat_sys_sigsuspend from asm/systbl.h to the file which it is getting included. One of the patch in this patch series is generating uapi header and syscall table files. In order to come up with a common implimentation across all architecture, we need to do this change. This change will simplify the implementation of system call table generation script and help to come up a common implementation across all architecture. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc: add __NR_syscalls along with NR_syscallsFiroz Khan2-3/+5
NR_syscalls macro holds the number of system call exist in powerpc architecture. We have to change the value of NR_syscalls, if we add or delete a system call. One of the patch in this patch series has a script which will generate a uapi header based on syscall.tbl file. The syscall.tbl file contains the number of system call information. So we have two option to update NR_syscalls value. 1. Update NR_syscalls in asm/unistd.h manually by count- ing the no.of system calls. No need to update NR_sys- calls until we either add a new system call or delete existing system call. 2. We can keep this feature in above mentioned script, that will count the number of syscalls and keep it in a generated file. In this case we don't need to expli- citly update NR_syscalls in asm/unistd.h file. The 2nd option will be the recommended one. For that, I added the __NR_syscalls macro in uapi/asm/unistd.h along with NR_syscalls asm/unistd.h. The macro __NR_syscalls also added for making the name convention same across all architecture. While __NR_syscalls isn't strictly part of the uapi, having it as part of the generated header to simplifies the implementation. We also need to enclose this macro with #ifdef __KERNEL__ to avoid side effects. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/pkeys: Fix handling of pkey state across fork()Ram Pai2-6/+19
Protection key tracking information is not copied over to the mm_struct of the child during fork(). This can cause the child to erroneously allocate keys that were already allocated. Any allocated execute-only key is lost aswell. Add code; called by dup_mmap(), to copy the pkey state from parent to child explicitly. This problem was originally found by Dave Hansen on x86, which turns out to be a problem on powerpc aswell. Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/tm: Unset MSR[TS] if not recheckpointingBreno Leitao2-9/+29
There is a TM Bad Thing bug that can be caused when you return from a signal context in a suspended transaction but with ucontext MSR[TS] unset. This forces regs->msr[TS] to be set at syscall entrance (since the CPU state is transactional). It also calls treclaim() to flush the transaction state, which is done based on the live (mfmsr) MSR state. Since user context MSR[TS] is not set, then restore_tm_sigcontexts() is not called, thus, not executing recheckpoint, keeping the CPU state as not transactional. When calling rfid, SRR1 will have MSR[TS] set, but the CPU state is non transactional, causing the TM Bad Thing with the following stack: [ 33.862316] Bad kernel stack pointer 3fffd9dce3e0 at c00000000000c47c cpu 0x8: Vector: 700 (Program Check) at [c00000003ff7fd40] pc: c00000000000c47c: fast_exception_return+0xac/0xb4 lr: 00003fff865f442c sp: 3fffd9dce3e0 msr: 8000000102a03031 current = 0xc00000041f68b700 paca = 0xc00000000fb84800 softe: 0 irq_happened: 0x01 pid = 1721, comm = tm-signal-sigre Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) WARNING: exception is not recoverable, can't continue The same problem happens on 32-bits signal handler, and the fix is very similar, if tm_recheckpoint() is not executed, then regs->msr[TS] should be zeroed. This patch also fixes a sparse warning related to lack of indentation when CONFIG_PPC_TRANSACTIONAL_MEM is set. Fixes: 2b0a576d15e0e ("powerpc: Add new transactional memory state to the signal context") CC: Stable <stable@vger.kernel.org> # 3.10+ Signed-off-by: Breno Leitao <leitao@debian.org> Tested-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21powerpc/tm: Print scratch valueBreno Leitao1-1/+2
Usually a TM Bad Thing exception is raised due to three different problems. a) touching SPRs in an active transaction; b) using TM instruction with the facility disabled and c) setting a wrong MSR/SRR1 at RFID. The two initial cases are easy to identify by looking at the instructions. The latter case is harder, because the MSR is masked after RFID, so, it is very useful to look at the previous MSR (SRR1) before RFID as also the current and masked MSR. Since MSR is saved at paca just before RFID, this patch prints it if a TM Bad thing happen, helping to understand what is the invalid TM transition that is causing the exception. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>