aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/platforms/powernv/pci-ioda.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-08-26powerpc: move from strlcpy with unused retval to strscpyWolfram Sang1-1/+1
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Link: https://lore.kernel.org/r/20220818205946.6336-1-wsa+renesas@sang-engineering.com
2022-07-28powerpc/ioda/iommu/debugfs: Generate unique debugfs entriesAlexey Kardashevskiy1-0/+2
The iommu_table::it_index is a LIOBN which is not initialized on PowerNV as it is not used except IOMMU debugfs where it is used for a node name. This initializes it_index witn a unique number to avoid warnings and have a node for every iommu_table. This should not cause any behavioral change without CONFIG_IOMMU_DEBUGFS. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220714080800.3712998-1-aik@ozlabs.ru
2022-05-22powerpc: Fix all occurences of "the the"Michael Ellerman1-1/+1
Rather than waiting for the bots to fix these one-by-one, fix all occurences of "the the" throughout arch/powerpc. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220518142629.513007-1-mpe@ellerman.id.au
2022-05-19Merge branch 'topic/ppc-kvm' into nextMichael Ellerman1-28/+18
Merge our KVM topic branch.
2022-05-19KVM: PPC: Book3s: Retire H_PUT_TCE/etc real mode handlersAlexey Kardashevskiy1-28/+18
LoPAPR defines guest visible IOMMU with hypercalls to use it - H_PUT_TCE/etc. Implemented first on POWER7 where hypercalls would trap in the KVM in the real mode (with MMU off). The problem with the real mode is some memory is not available and some API usage crashed the host but enabling MMU was an expensive operation. The problems with the real mode handlers are: 1. Occasionally these cannot complete the request so the code is copied+modified to work in the virtual mode, very little is shared; 2. The real mode handlers have to be linked into vmlinux to work; 3. An exception in real mode immediately reboots the machine. If the small DMA window is used, the real mode handlers bring better performance. However since POWER8, there has always been a bigger DMA window which VMs use to map the entire VM memory to avoid calling H_PUT_TCE. Such 1:1 mapping happens once and uses H_PUT_TCE_INDIRECT (a bulk version of H_PUT_TCE) which virtual mode handler is even closer to its real mode version. On POWER9 hypercalls trap straight to the virtual mode so the real mode handlers never execute on POWER9 and later CPUs. So with the current use of the DMA windows and MMU improvements in POWER9 and later, there is no point in duplicating the code. The 32bit passed through devices may slow down but we do not have many of these in practice. For example, with this applied, a 1Gbit ethernet adapter still demostrates above 800Mbit/s of actual throughput. This removes the real mode handlers from KVM and related code from the powernv platform. This updates the list of implemented hcalls in KVM-HV as the realmode handlers are removed. This changes ABI - kvmppc_h_get_tce() moves to the KVM module and kvmppc_find_table() is static now. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220506053755.3820702-1-aik@ozlabs.ru
2022-05-08powerpc: Add missing headersChristophe Leroy1-1/+2
Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h, asm/pci.h etc... Include the needed headers, and remove asm/prom.h when it was needed exclusively for pulling necessary headers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
2022-01-14Merge tag 'powerpc-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-2/+2
Pull powerpc updates from Michael Ellerman: - Optimise radix KVM guest entry/exit by 2x on Power9/Power10. - Allow firmware to tell us whether to disable the entry and uaccess flushes on Power10 or later CPUs. - Add BPF_PROBE_MEM support for 32 and 64-bit BPF jits. - Several fixes and improvements to our hard lockup watchdog. - Activate HAVE_DYNAMIC_FTRACE_WITH_REGS on 32-bit. - Allow building the 64-bit Book3S kernel without hash MMU support, ie. Radix only. - Add KUAP (SMAP) support for 40x, 44x, 8xx, Book3E (64-bit). - Add new encodings for perf_mem_data_src.mem_hops field, and use them on Power10. - A series of small performance improvements to 64-bit interrupt entry. - Several commits fixing issues when building with the clang integrated assembler. - Many other small features and fixes. Thanks to Alan Modra, Alexey Kardashevskiy, Ammar Faizi, Anders Roxell, Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christoph Hellwig, Daniel Axtens, David Yang, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Guo Ren, Hari Bathini, Jason Wang, Joel Stanley, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mark Brown, Minghao Chi, Nageswara R Sastry, Naresh Kamboju, Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Nick Child, Oliver O'Halloran, Peiwei Hu, Randy Dunlap, Ravi Bangoria, Rob Herring, Russell Currey, Sachin Sant, Sean Christopherson, Segher Boessenkool, Thadeu Lima de Souza Cascardo, Tyrel Datwyler, Xiang wangx, and Yang Guang. * tag 'powerpc-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (240 commits) powerpc/xmon: Dump XIVE information for online-only processors. powerpc/opal: use default_groups in kobj_type powerpc/cacheinfo: use default_groups in kobj_type powerpc/sched: Remove unused TASK_SIZE_OF powerpc/xive: Add missing null check after calling kmalloc powerpc/floppy: Remove usage of the deprecated "pci-dma-compat.h" API selftests/powerpc: Add a test of sigreturning to an unaligned address powerpc/64s: Use EMIT_WARN_ENTRY for SRR debug warnings powerpc/64s: Mask NIP before checking against SRR0 powerpc/perf: Fix spelling of "its" powerpc/32: Fix boot failure with GCC latent entropy plugin powerpc/code-patching: Replace patch_instruction() by ppc_inst_write() in selftests powerpc/code-patching: Move code patching selftests in its own file powerpc/code-patching: Move instr_is_branch_{i/b}form() in code-patching.h powerpc/code-patching: Move patch_exception() outside code-patching.c powerpc/code-patching: Use test_trampoline for prefixed patch test powerpc/code-patching: Fix patch_branch() return on out-of-range failure powerpc/code-patching: Reorganise do_patch_instruction() to ease error handling powerpc/code-patching: Fix unmap_patch_area() error handling powerpc/code-patching: Fix error handling in do_patch_instruction() ...
2021-12-23powerpc/powernv: Add __init attribute to eligible functionsNick Child1-2/+2
Some functions defined in 'arch/powerpc/platforms/powernv' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-12-nick.child@ibm.com
2021-12-09genirq/msi, treewide: Use a named struct for PCI/MSI attributesThomas Gleixner1-2/+2
The unnamed struct sucks and is in the way of further cleanups. Stick the PCI related MSI data into a real data structure and cleanup all users. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211206210224.374863119@linutronix.de
2021-11-06memblock: use memblock_free for freeing virtual pointersMike Rapoport1-1/+1
Rename memblock_free_ptr() to memblock_free() and use memblock_free() when freeing a virtual pointer so that memblock_free() will be a counterpart of memblock_alloc() The callers are updated with the below semantic patch and manual addition of (void *) casting to pointers that are represented by unsigned long variables. @@ identifier vaddr; expression size; @@ ( - memblock_phys_free(__pa(vaddr), size); + memblock_free(vaddr, size); | - memblock_free_ptr(vaddr, size); + memblock_free(vaddr, size); ) [sfr@canb.auug.org.au: fixup] Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06memblock: rename memblock_free to memblock_phys_freeMike Rapoport1-1/+1
Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-13powerpc: rename powerpc_debugfs_root to arch_debugfs_dirAneesh Kumar K.V1-2/+2
No functional change in this patch. arch_debugfs_dir is the generic kernel name declared in linux/debugfs.h for arch-specific debugfs directory. Architectures like x86/s390 already use the name. Rename powerpc specific powerpc_debugfs_root to arch_debugfs_dir. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210812132831.233794-2-aneesh.kumar@linux.ibm.com
2021-08-10powerpc/powernv/pci: Rework pnv_opal_pci_msi_eoi()Cédric Le Goater1-4/+13
pnv_opal_pci_msi_eoi() is called from KVM to EOI passthrough interrupts when in real mode. Adding MSI domain broke the hack using the 'ioda.irq_chip' field to deduce the owning PHB. Fix that by using the IRQ chip data in the MSI domain. The 'ioda.irq_chip' field is now unused and could be removed from the pnv_phb struct. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-30-clg@kaod.org
2021-08-10powerpc/powernv/pci: Set the IRQ chip data for P8/CXL devicesCédric Le Goater1-3/+8
Before MSI domains, the default IRQ chip of PHB3 MSIs was patched by pnv_set_msi_irq_chip() with the custom EOI handler pnv_ioda2_msi_eoi() and the owning PHB was deduced from the 'ioda.irq_chip' field. This path has been deprecated by the MSI domains but it is still in use by the P8 CAPI 'cxl' driver. Rewriting this driver to support MSI would be a waste of time. Nevertheless, we can still remove the IRQ chip patch and set the IRQ chip data instead. This is cleaner. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-29-clg@kaod.org
2021-08-10powerpc/powernv/pci: Adapt is_pnv_opal_msi() to detect passthrough interruptCédric Le Goater1-1/+1
The pnv_ioda2_msi_eoi() chip handler is not used anymore for MSIs. Simply use the check on the PSI-MSI chip. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-27-clg@kaod.org
2021-08-10powerpc/powernv/pci: Drop unused MSI codeCédric Le Goater1-27/+0
MSIs should be fully managed by the PCI and IRQ subsystems now. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-26-clg@kaod.org
2021-08-10powerpc/pci: Drop XIVE restriction on MSI domainsCédric Le Goater1-3/+1
The PowerNV and pSeries platforms now have support for both the XICS and XIVE IRQ domains. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-23-clg@kaod.org
2021-08-10powerpc/powernv/pci: Customize the MSI EOI handler to support PHB3Cédric Le Goater1-1/+22
PHB3s need an extra OPAL call to EOI the interrupt. The call takes an OPAL HW IRQ number but it is translated into a vector number in OPAL. Here, we directly use the vector number of the in-the-middle "PNV-MSI" domain instead of grabbing the OPAL HW IRQ number in the XICS parent domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-22-clg@kaod.org
2021-08-10KVM: PPC: Book3S HV: Use the new IRQ chip to detect passthrough interruptsCédric Le Goater1-1/+3
Passthrough PCI MSI interrupts are detected in KVM with a check on a specific EOI handler (P8) or on XIVE (P9). We can now check the PCI-MSI IRQ chip which is cleaner. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-14-clg@kaod.org
2021-08-10powerpc/powernv/pci: Add MSI domainsCédric Le Goater1-0/+188
This is very similar to the MSI domains of the pSeries platform. The MSI allocator is directly handled under the Linux PHB in the in-the-middle "PNV-MSI" domain. Only the XIVE (P9/P10) parent domain is supported for now. Support for XICS will come later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-13-clg@kaod.org
2021-08-10powerpc/powernv/pci: Introduce __pnv_pci_ioda_msi_setup()Cédric Le Goater1-5/+23
It will be used as a 'compose_msg' handler of the MSI domain introduced later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-12-clg@kaod.org
2021-05-02powerpc/powernv: remove the nvlink supportChristoph Hellwig1-181/+4
This code was only used by the vfio-nvlink2 code, which itself had no proper use. Drop this huge chunk of code build into every powernv or generic build. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210326061311.1497642-3-hch@lst.de
2021-04-23powerpc/iommu: Do not immediately panic when failed IOMMU table allocationAlexey Kardashevskiy1-7/+8
Most platforms allocate IOMMU table structures (specifically it_map) at the boot time and when this fails - it is a valid reason for panic(). However the powernv platform allocates it_map after a device is returned to the host OS after being passed through and this happens long after the host OS booted. It is quite possible to trigger the it_map allocation panic() and kill the host even though it is not necessary - the host OS can still use the DMA bypass mode (requires a tiny fraction of it_map's memory) and even if that fails, the host OS is runnnable as it was without the device for which allocating it_map causes the panic. Instead of immediately crashing in a powernv/ioda2 system, this prints an error and continues. All other platforms still call panic(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Leonardo Bras <leobras.c@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210216033307.69863-3-aik@ozlabs.ru
2021-02-11powerpc/powernv/pci: Use kzalloc() for phb related allocationsMichael Ellerman1-3/+3
As part of commit fbbefb320214 ("powerpc/pci: Move PHB discovery for PCI_DN using platforms"), I switched some allocations from memblock_alloc() to kmalloc(), otherwise memblock would warn that it was being called after slab init. However I missed that the code relied on the allocations being zeroed, without which we could end up crashing: pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to ff BUG: Unable to handle kernel data access on read at 0x6b6b6b6b6b6b6af7 Faulting instruction address: 0xc0000000000dbc90 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV ... NIP pnv_ioda_get_pe_state+0xe0/0x1d0 LR pnv_ioda_get_pe_state+0xb4/0x1d0 Call Trace: pnv_ioda_get_pe_state+0xb4/0x1d0 (unreliable) pnv_pci_config_check_eeh.isra.9+0x78/0x270 pnv_pci_read_config+0xf8/0x160 pci_bus_read_config_dword+0xa4/0x120 pci_bus_generic_read_dev_vendor_id+0x54/0x270 pci_scan_single_device+0xb8/0x140 pci_scan_slot+0x80/0x1b0 pci_scan_child_bus_extend+0x94/0x490 pcibios_scan_phb+0x1f8/0x3c0 pcibios_init+0x8c/0x12c do_one_initcall+0x94/0x510 kernel_init_freeable+0x35c/0x3fc kernel_init+0x2c/0x168 ret_from_kernel_thread+0x5c/0x70 Switch them to kzalloc(). Fixes: fbbefb320214 ("powerpc/pci: Move PHB discovery for PCI_DN using platforms") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210211112749.3410771-1-mpe@ellerman.id.au
2021-02-09powerpc/pci: Move PHB discovery for PCI_DN using platformsOliver O'Halloran1-3/+7
Make powernv, pseries, powermac and maple use ppc_mc.discover_phbs. These platforms need to be done together because they all depend on pci_dn's being created from the DT. The pci_dn contains a pointer to the relevant pci_controller so they need to be created after the pci_controller structures are available, but before PCI devices are scanned. Currently this ordering is provided by initcalls and the sequence is: 1. PHBs are discovered (setup_arch) (early boot, pre-initcalls) 2. pci_dn are created from the unflattended DT (core initcall) 3. PHBs are scanned pcibios_init() (subsys initcall) The new ppc_md.discover_phbs() function is also a core_initcall so we can't guarantee ordering between the creation of pci_controllers and the creation of pci_dn's which require a pci_controller. We could use the postcore, or core_sync initcall levels, but it's cleaner to just move the pci_dn setup into the per-PHB inits which occur inside of .discover_phb() for these platforms. This brings the boot-time path in line with the PHB hotplug path that is used for pseries DLPAR operations too. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Squash powermac & maple in to avoid breakage those platforms, convert memblock allocs to use kmalloc to avoid warnings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201103043523.916109-2-oohall@gmail.com
2021-01-31powerpc/powernv/pci: Drop pnv_phb->initializedOliver O'Halloran1-17/+0
The pnv_phb->initialized flag is an odd beast. It was added back in 2012 in commit db1266c85261 ("powerpc/powernv: Skip check on PE if necessary") to allow devices to be enabled even if the device had not yet been assigned to a PE. Allowing the device to be enabled before the PE is configured may cause spurious EEH events since none of the IOMMU context has been setup. I'm not entirely sure why this was ever necessary. My best guess is that it was an workaround for a bug or some other undesireable behaviour from the PCI core. Either way, it's unnecessary now since as of commit dc3d8f85bb57 ("powerpc/powernv/pci: Re-work bus PE configuration") we can guarantee that the PE will be configured before the PCI core will allow drivers to bind to the device. It's also worth pointing out that the ->initialized flag is only set in pnv_pci_ioda_create_dbgfs(). That function has its entire body wrapped in #ifdef CONFIG_DEBUG_FS. As a result, for kernels built without debugfs (i.e. petitboot) the other checks in pnv_pci_enable_device_hook() are bypassed entirely. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200902013657.1753830-1-oohall@gmail.com
2020-12-04powernv/pci: Print an error when device enable is blockedOliver O'Halloran1-1/+3
If the platform decides to block enabling the device nothing is printed currently. This can lead to some confusion since the dmesg output will usually print an error with no context e.g. e1000e: probe of 0022:01:00.0 failed with error -22 This shouldn't be spammy since pci_enable_device() already prints a messages when it succeeds. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200409061337.9187-1-oohall@gmail.com
2020-08-25powerpc/powernv: Remove set but not used variable 'parent'zhengbin1-8/+0
Fix gcc '-Wunused-but-set-variable' warning: arch/powerpc/platforms/powernv/pci-ioda.c: In function pnv_ioda_configure_pe: arch/powerpc/platforms/powernv/pci-ioda.c:867:18: warning: variable parent set but not used [-Wunused-but-set-variable] It is not used since commit b131a8425c34 ("powerpc/powernv: Set PELTV for compound PEs") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1574144074-142032-6-git-send-email-zhengbin13@huawei.com
2020-08-20powerpc/powernv/pci: Fix possible crash when releasing DMA resourcesFrederic Barrat1-1/+1
Fix a typo introduced during recent code cleanup, which could lead to silently not freeing resources or an oops message (on PCI hotplug or CAPI reset). Only impacts ioda2, the code path for ioda1 is correct. Fixes: 01e12629af4e ("powerpc/powernv/pci: Add explicit tracking of the DMA setup state") Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200819130741.16769-1-fbarrat@linux.ibm.com
2020-07-26powerpc/powernv/pci: Refactor pnv_ioda_alloc_pe()Oliver O'Halloran1-8/+33
Rework the PE allocation logic to allow allocating blocks of PEs rather than individually. We'll use this to allocate contigious blocks of PEs for the SR-IOVs. This patch also adds code to pnv_ioda_alloc_pe() and pnv_ioda_reserve_pe() to use the existing, but unused, phb->pe_alloc_mutex. Currently these functions use atomic bit ops to release a currently allocated PE number. However, the pnv_ioda_alloc_pe() wants to have exclusive access to the bit map while scanning for hole large enough to accomodate the allocation size. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-10-oohall@gmail.com
2020-07-26powerpc/powernv/sriov: Move SR-IOV into a separate fileOliver O'Halloran1-655/+18
pci-ioda.c is getting a bit unwieldly due to the amount of stuff jammed in there. The SR-IOV support can be extracted easily enough and is mostly standalone, so move it into a separate file. This patch also moves the PowerNV SR-IOV specific fields from pci_dn and moves them into a platform specific structure. I'm not sure how they ended up in there in the first place, but leaking platform specifics into common code has proven to be a terrible idea so far so lets stop doing that. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-5-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Initialise M64 for IODA1 as a 1-1 windowOliver O'Halloran1-30/+25
We pre-configure the m64 window for IODA1 as a 1-1 segment-PE mapping, similar to PHB3. Currently the actual mapping of segments occurs in pnv_ioda_pick_m64_pe(), but we can move it into pnv_ioda1_init_m64() and drop the IODA1 specific code paths in the PE setup / teardown. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-4-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Add explicit tracking of the DMA setup stateOliver O'Halloran1-19/+29
There's an optimisation in the PE setup which skips performing DMA setup for a PE if we only have bridges in a PE. The assumption being that only "real" devices will DMA to system memory, which is probably fair. However, if we start off with only bridge devices in a PE then add a non-bridge device the new device won't be able to use DMA because we never configured it. Fix this (admittedly pretty weird) edge case by tracking whether we've done the DMA setup for the PE or not. If a non-bridge device is added to the PE (via rescan or hotplug, or whatever) we can set up DMA on demand. This also means the only remaining user of the old "DMA Weight" code is the IODA1 DMA setup code that it was originally added for, which is good. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-3-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Always tear down DMA windows on PE releaseOliver O'Halloran1-27/+3
Currently we have these two functions: pnv_pci_ioda2_release_dma_pe(), and pnv_pci_ioda2_release_pe_dma() The first is used when tearing down VF PEs and the other is used for normal devices. There's very little difference between the two though. The latter (non-VF) will skip a call to pnv_pci_ioda2_unset_window() unless CONFIG_IOMMU_API=y is set. There's no real point in doing this so fold the two together. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-2-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Add pci_bus_to_pnvhb() helperOliver O'Halloran1-64/+24
Add a helper to go from a pci_bus structure to the pnv_phb that hosts that bus. There's a lot of instances of the following pattern: struct pci_controller *hose = pci_bus_to_host(pdev->bus); struct pnv_phb *phb = hose->private_data; Without any other uses of the pci_controller inside the function. This is hard to read since it requires you to memorise the contents of the private data fields and kind of error prone since it involves blindly assigning a void pointer. Add a helper to make it more concise and explicit. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-1-oohall@gmail.com
2020-07-15powerpc/powernv: Move pnv_ioda_setup_bus_dma under CONFIG_IOMMU_APIOliver O'Halloran1-13/+13
pnv_ioda_setup_bus_dma() is only used when a passed through PE is returned to the host. If the kernel is built without IOMMU support this is dead code. Move it under the #ifdef with the rest of the IOMMU API support. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200705133557.443607-2-oohall@gmail.com
2020-07-15powerpc/powernv: Make pnv_pci_sriov_enable() and friends staticOliver O'Halloran1-4/+4
The kernel test robot noticed these are non-static which causes Clang to print some warnings. These are called via ppc_md function pointers so there's no need for them to be non-static. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200705133557.443607-1-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Sprinkle around some WARN_ON()sOliver O'Halloran1-2/+2
pnv_pci_ioda_configure_bus() should now only ever be called when a device is added to the bus so add a WARN_ON() to the empty bus check. Similarly, pnv_pci_ioda_setup_bus_PE() should only ever be called for an unconfigured PE, so add a WARN_ON() for that case too. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200417073508.30356-5-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Reserve the root bus PE during initOliver O'Halloran1-17/+9
Doing it once during boot rather than doing it on the fly and drop the janky populated logic. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200417073508.30356-4-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Re-work bus PE configurationOliver O'Halloran1-51/+30
For normal PHBs IODA PEs are handled on a per-bus basis so all the devices on that bus will share a PE. Which PE specificly is determined by the location of the MMIO BARs for the devices on the bus so we can't actually configure the bus PEs until after MMIO resources are allocated. As a result PEs are currently configured by pcibios_setup_bridge(), which is called just before the bridge windows are programmed into the bus' parent bridge. Configuring the bus PE here causes a few problems: 1. The root bus doesn't have a parent bridge so setting up the PE for the root bus requires some hacks. 2. The PELT-V isn't setup correctly because pnv_ioda_set_peltv() assumes that PEs will be configured in root-to-leaf order. This assumption is broken because resource assignment is performed depth-first so the leaf bridges are setup before their parents are. The hack mentioned in 1) results in the "correct" PELT-V for busses immediately below the root port, but not for devices below a switch. 3. It's possible to break the sysfs PCI rescan feature by removing all the devices on a bus. When the last device is removed from a PE its will be de-configured. Rescanning the devices on a bus does not cause the bridge to be reconfigured rendering the devices on that bus unusable. We can address most of these problems by moving the PE setup out of pcibios_setup_bridge() and into pcibios_bus_add_device(). This fixes 1) and 2) because pcibios_bus_add_device() is called on each device in root-to-leaf order so PEs for parent buses will always be configured before their children. It also fixes 3) by ensuring the PE is configured before initialising DMA for the device. In the event the PE was de-configured due to removing all the devices in that PE it will now be reconfigured when a new device is added since there's no dependecy on the bridge_setup() hook being called. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200417073508.30356-3-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Add helper to find ioda_pe from BDFNOliver O'Halloran1-0/+10
For each PHB we maintain a reverse-map that can be used to find the PE that a BDFN is currently mapped to. Add a helper for doing this lookup so we can check if a PE has been configured without looking at pdn->pe_number. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200417073508.30356-2-oohall@gmail.com
2020-05-28powerpc/powernv: Add a print indicating when an IODA PE is releasedOliver O'Halloran1-0/+2
Quite useful to know in some cases. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200408112213.5549-1-oohall@gmail.com
2020-05-28powerpc/powernv/npu: Move IOMMU group setup into npu-dma.cOliver O'Halloran1-53/+7
The NVlink IOMMU group setup is only relevant to NVLink devices so move it into the NPU containment zone. This let us remove some prototypes in pci.h and staticfy some function definitions. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200406030745.24595-8-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Move tce size parsing to pci-ioda-tce.cOliver O'Halloran1-30/+0
Move it in with the rest of the TCE wrangling rather than carting around a static prototype in pci-ioda.c Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200406030745.24595-7-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Delete old iommu recursive iommu setupOliver O'Halloran1-32/+0
No longer used. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200406030745.24595-6-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Add device to iommu group during dma_dev_setup()Oliver O'Halloran1-34/+13
Historically adding devices to their respective iommu group has been handled by the post-init phb fixup for most devices. This was done because: 1) The IOMMU group is tied to the PE (usually) so we can only setup the iommu groups after we've done resource allocation since BAR location determines the device's PE, and: 2) The sysfs directory for the pci_dev needs to be available since iommu_add_device() wants to add an attribute for the iommu group. However, since commit 30d87ef8b38d ("powerpc/pci: Fix pcibios_setup_device() ordering") both conditions are met when hose->ops->dma_dev_setup() is called so there's no real need to do this in the fixup. Moving the call to iommu_add_device() into pnv_pci_ioda_dma_setup_dev() is a nice cleanup since it puts all the per-device IOMMU setup into one place. It also results in all (non-nvlink) devices getting their iommu group via a common path rather than relying on the bus notifier hack in pnv_tce_iommu_bus_notifier() to handle the adding VFs and hotplugged devices to their group. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200406030745.24595-5-oohall@gmail.com
2020-05-28powerpc/powernv/pci: Register iommu group at PE DMA setupOliver O'Halloran1-10/+6
Move the registration of IOMMU groups out of the post-phb init fixup and into when we configure DMA for a PE. For most devices this doesn't result in any functional changes, but for NVLink attached GPUs it requires a bit of care. When the GPU is probed an IOMMU group would be created for the PE that contains it. We need to ensure that group is removed before we add the PE to the compound group that's used to keep the translations see by the PCIe and NVLink buses the same. No functional changes. Probably. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200406030745.24595-4-oohall@gmail.com
2020-05-28powerpc/powernv/iov: Don't add VFs to iommu group during PE configOliver O'Halloran1-1/+0
In pnv_ioda_setup_vf_PE() we register an iommu group for the VF PE then call pnv_ioda_setup_bus_iommu_group() to add devices to that group. However, this function is called before the VFs are scanned so there's no devices to add. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200406030745.24595-3-oohall@gmail.com
2020-05-11powerpc: Replace _ALIGN_UP() by ALIGN()Christophe Leroy1-4/+4
_ALIGN_UP() is specific to powerpc ALIGN() is generic and does the same Replace _ALIGN_UP() by ALIGN() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/8a6d7e45f7904c73a0af539642d3962e2a3c7268.1587407777.git.christophe.leroy@c-s.fr
2020-05-11powerpc: Replace _ALIGN_DOWN() by ALIGN_DOWN()Christophe Leroy1-1/+1
_ALIGN_DOWN() is specific to powerpc ALIGN_DOWN() is generic and does the same Replace _ALIGN_DOWN() by ALIGN_DOWN() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/3911a86d6b5bfa7ad88cd7c82416fbe6bb47e793.1587407777.git.christophe.leroy@c-s.fr