aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/pci/iov.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-10-10PCI: Restore ARI Capable Hierarchy before setting numVFsTony Nguyen1-0/+8
In the restore path, we previously read PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE before restoring PCI_SRIOV_CTRL_ARI: pci_restore_state pci_restore_iov_state sriov_restore_state pci_iov_set_numvfs pci_read_config_word(... PCI_SRIOV_VF_OFFSET, &iov->offset) pci_read_config_word(... PCI_SRIOV_VF_STRIDE, &iov->stride) pci_write_config_word(... PCI_SRIOV_CTRL, iov->ctrl) But per SR-IOV r1.1, sec 3.3.3.5, the device can use PCI_SRIOV_CTRL_ARI to determine PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE. Therefore, this path, which is used for suspend/resume and AER recovery, can corrupt iov->offset and iov->stride. Since the iov state is associated with the device, not the driver, if we reload the driver, it will use the the corrupted data, which may cause crashes like this: kernel BUG at drivers/pci/iov.c:157! RIP: 0010:pci_iov_add_virtfn+0x2eb/0x350 Call Trace: pci_enable_sriov+0x353/0x440 ixgbe_pci_sriov_configure+0xd5/0x1f0 [ixgbe] sriov_numvfs_store+0xf7/0x170 dev_attr_store+0x18/0x30 sysfs_kf_write+0x37/0x40 kernfs_fop_write+0x120/0x1b0 vfs_write+0xb5/0x1a0 SyS_write+0x55/0xc0 Restore PCI_SRIOV_CTRL_ARI before calling pci_iov_set_numvfs(), then restore the rest of PCI_SRIOV_CTRL (which may set PCI_SRIOV_CTRL_VFE) afterwards. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> [bhelgaas: changelog, add comment, also clear ARI if necessary] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> CC: Emil Tantilov <emil.s.tantilov@intel.com>
2017-10-10PCI: Create SR-IOV virtfn/physfn links before attaching driverStuart Hayes1-1/+2
When creating virtual functions, create the "virtfn%u" and "physfn" links in sysfs *before* attaching the driver instead of after. When we attach the driver to the new virtual network interface first, there is a race when the driver attaches to the new sends out an "add" udev event, and the network interface naming software (biosdevname or systemd, for example) tries to look at these links. Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-10-05PCI: Cache the VF device ID in the SR-IOV structureFilippo Sironi1-2/+3
Cache the VF device ID in the SR-IOV structure and use it instead of reading it over and over from the PF config space capability. Signed-off-by: Filippo Sironi <sironi@amazon.de> [bhelgaas: rename to "vf_device" to match pci_dev->device] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-10-05PCI: Remove reset argument from pci_iov_{add,remove}_virtfn()Jan H. Schönherr1-13/+5
The "reset" argument passed to pci_iov_add_virtfn() and pci_iov_remove_virtfn() is always zero since 46cb7b1bd86f ("PCI: Remove unused SR-IOV VF Migration support") Remove the argument together with the associated code. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Russell Currey <ruscur@russell.cc>
2017-08-29PCI: Disable VF decoding before pcibios_sriov_disable() updates resourcesGavin Shan1-3/+4
A struct resource represents the address space consumed by a device. We should not modify that resource while the device is actively using the address space. For VFs, pci_iov_update_resource() enforces this by printing a warning and doing nothing if the VFE (VF Enable) and MSE (VF Memory Space Enable) bits are set. Previously, both sriov_enable() and sriov_disable() called the pcibios_sriov_disable() arch hook, which may update the struct resource, while VFE and MSE were enabled. This effectively dropped the resource update pcibios_sriov_disable() intended to do. Disable VF memory decoding before calling pcibios_sriov_disable(). Reported-by: Carol L Soto <clsoto@us.ibm.com> Tested-by: Carol L Soto <clsoto@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: shan.gavin@gmail.com Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org>
2017-06-14PCI: Protect pci_driver->sriov_configure() usage with device_lock()Jakub Kicinski1-4/+0
Every method in struct device_driver or structures derived from it like struct pci_driver MUST provide exclusion vs the driver's ->remove() method, usually by using device_lock(). Protect use of pci_driver->sriov_configure() by holding the device lock while calling it. The PCI core sets the pci_dev->driver pointer in local_pci_probe() before calling ->probe() and only clears it after ->remove(). This means driver's ->sriov_configure() callback will happily race with probe() and remove(), most likely leading to BUGs, since drivers don't expect this. Remove the iov lock completely, since we remove the last user. [bhelgaas: changelog, thanks to Christoph for locking rule] Link: http://lkml.kernel.org/r/20170522225023.14010-1-jakub.kicinski@netronome.com Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-04-20PCI: Add sysfs sriov_drivers_autoprobe to control VF driver bindingBodong Wang1-0/+1
Sometimes it is not desirable to bind SR-IOV VFs to drivers. This can save host side resource usage by VF instances that will be assigned to VMs. Add a new PCI sysfs interface "sriov_drivers_autoprobe" to control that from the PF. To modify it, echo 0/n/N (disable probe) or 1/y/Y (enable probe) to: /sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe Note that this must be done before enabling VFs. The change will not take effect if VFs are already enabled. Simply, one can disable VFs by setting sriov_numvfs to 0, choose whether to probe or not, and then re-enable the VFs by restoring sriov_numvfs. [bhelgaas: changelog, ABI doc] Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2017-02-03PCI: Lock each enable/disable num_vfs operation in sysfsEmil Tantilov1-7/+0
Enabling/disabling SRIOV via sysfs by echo-ing multiple values simultaneously: # echo 63 > /sys/class/net/ethX/device/sriov_numvfs& # echo 63 > /sys/class/net/ethX/device/sriov_numvfs # sleep 5 # echo 0 > /sys/class/net/ethX/device/sriov_numvfs& # echo 0 > /sys/class/net/ethX/device/sriov_numvfs results in the following bug: kernel BUG at drivers/pci/iov.c:495! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 8050 Comm: bash Tainted: G W 4.9.0-rc7-net-next #2092 RIP: 0010:[<ffffffff813b1647>] [<ffffffff813b1647>] pci_iov_release+0x57/0x60 Call Trace: [<ffffffff81391726>] pci_release_dev+0x26/0x70 [<ffffffff8155be6e>] device_release+0x3e/0xb0 [<ffffffff81365ee7>] kobject_cleanup+0x67/0x180 [<ffffffff81365d9d>] kobject_put+0x2d/0x60 [<ffffffff8155bc27>] put_device+0x17/0x20 [<ffffffff8139c08a>] pci_dev_put+0x1a/0x20 [<ffffffff8139cb6b>] pci_get_dev_by_id+0x5b/0x90 [<ffffffff8139cca5>] pci_get_subsys+0x35/0x40 [<ffffffff8139ccc8>] pci_get_device+0x18/0x20 [<ffffffff8139ccfb>] pci_get_domain_bus_and_slot+0x2b/0x60 [<ffffffff813b09e7>] pci_iov_remove_virtfn+0x57/0x180 [<ffffffff813b0b95>] pci_disable_sriov+0x65/0x140 [<ffffffffa00a1af7>] ixgbe_disable_sriov+0xc7/0x1d0 [ixgbe] [<ffffffffa00a1e9d>] ixgbe_pci_sriov_configure+0x3d/0x170 [ixgbe] [<ffffffff8139d28c>] sriov_numvfs_store+0xdc/0x130 ... RIP [<ffffffff813b1647>] pci_iov_release+0x57/0x60 Use the existing mutex lock to protect each enable/disable operation. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> CC: Alexander Duyck <alexander.h.duyck@intel.com>
2016-11-29PCI: Remove pci_resource_bar() and pci_iov_resource_bar()Bjorn Helgaas1-18/+0
pci_std_update_resource() only deals with standard BARs, so we don't have to worry about the complications of VF BARs in an SR-IOV capability. Compute the BAR address inline and remove pci_resource_bar(). That makes pci_iov_resource_bar() unused, so remove that as well. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29PCI: Don't update VF BARs while VF memory space is enabledBjorn Helgaas1-0/+8
If we update a VF BAR while it's enabled, there are two potential problems: 1) Any driver that's using the VF has a cached BAR value that is stale after the update, and 2) We can't update 64-bit BARs atomically, so the intermediate state (new lower dword with old upper dword) may conflict with another device, and an access by a driver unrelated to the VF may cause a bus error. Warn about attempts to update VF BARs while they are enabled. This is a programming error, so use dev_WARN() to get a backtrace. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29PCI: Separate VF BAR updates from standard BAR updatesBjorn Helgaas1-0/+50
Previously pci_update_resource() used the same code path for updating standard BARs and VF BARs in SR-IOV capabilities. Split the VF BAR update into a new pci_iov_update_resource() internal interface, which makes it simpler to compute the BAR address (we can get rid of pci_resource_bar() and pci_iov_resource_bar()). This patch: - Renames pci_update_resource() to pci_std_update_resource(), - Adds pci_iov_update_resource(), - Makes pci_update_resource() a wrapper that calls the appropriate one, No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-23PCI: Do any VF BAR updates before enabling the BARsGavin Shan1-7/+7
Previously we enabled VFs and enable their memory space before calling pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs: for example, on PPC PowerNV we may change them to manage the association of VFs to PEs. Because 64-bit BARs cannot be updated atomically, it's unsafe to update them while they're enabled. The half-updated state may conflict with other devices in the system. Call pcibios_sriov_enable() before enabling the VFs so any BAR updates happen while the VF BARs are disabled. [bhelgaas: changelog] Tested-by: Carol Soto <clsoto@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-09-12PCI: Check for pci_setup_device() failure in pci_iov_add_virtfn()Po Liu1-1/+4
If pci_setup_device() returns failure, we must return failure from pci_iov_add_virtfn(). If we ignore the failure and continue with an uninitialized pci_dev for virtfn, we crash later when we try to use those uninitialized parts. Signed-off-by: Po Liu <po.liu@nxp.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-19Merge tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-5/+5
Pull powerpc updates from Michael Ellerman: "This was delayed a day or two by some build-breakage on old toolchains which we've now fixed. There's two PCI commits both acked by Bjorn. There's one commit to mm/hugepage.c which is (co)authored by Kirill. Highlights: - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V - Add POWER9 cputable entry from Michael Neuling - FPU/Altivec/VSX save/restore optimisations from Cyril Bur - Add support for new ftrace ABI on ppc64le from Torsten Duwe Various cleanups & minor fixes from: - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh. General: - atomics: Allow architectures to define their own __atomic_op_* helpers from Boqun Feng - Implement atomic{, 64}_*_return_* variants and acquire/release/ relaxed variants for (cmp)xchg from Boqun Feng - Add powernv_defconfig from Jeremy Kerr - Fix BUG_ON() reporting in real mode from Balbir Singh - Add xmon command to dump OPAL msglog from Andrew Donnellan - Add xmon command to dump process/task similar to ps(1) from Douglas Miller - Clean up memory hotplug failure paths from David Gibson pci/eeh: - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei Yang. - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan. - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang - PCI: Add pcibios_bus_add_device() weak function from Wei Yang - MAINTAINERS: Update EEH details and maintainership from Russell Currey cxl: - Support added to the CXL driver for running on both bare-metal and hypervisor systems, from Christophe Lombard and Frederic Barrat. - Ignore probes for virtual afu pci devices from Vaibhav Jain perf: - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu - hv-24x7: Fix usage with chip events, display change in counter values, display domain indices in sysfs, eliminate domain suffix in event names, from Sukadev Bhattiprolu Freescale: - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup" * tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits) powerpc: Fix unrecoverable SLB miss during restore_math() powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers powerpc/rcpm: Fix build break when SMP=n powerpc/book3e-64: Use hardcoded mttmr opcode powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible powerpc/T104xRDB: add tdm riser card node to device tree powerpc32: PAGE_EXEC required for inittext powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s) powerpc/86xx: Introduce and use common dtsi powerpc/86xx: Update device tree powerpc/86xx: Move dts files to fsl directory powerpc/86xx: Switch to kconfig fragments approach powerpc/86xx: Update defconfigs powerpc/86xx: Consolidate common platform code powerpc32: Remove one insn in mulhdu powerpc32: small optimisation in flush_icache_range() powerpc: Simplify test in __dma_sync() powerpc32: move xxxxx_dcache_range() functions inline powerpc32: Remove clear_pages() and define clear_page() inline ...
2016-03-09PCI/IOV: Rename and export virtfn_{add, remove}Wei Yang1-5/+5
During EEH recovery, hotplug is applied to the devices which don't have drivers or their drivers don't support EEH. However, the hotplug, which was implemented based on PCI bus, can't be applied to VF directly. Instead, we unplug and plug individual PCI devices (VFs). This renames virtn_{add,remove}() and exports them so they can be used in PCI hotplug during EEH recovery. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29PCI: Support SR-IOV on any function typeKelly Zytaruk1-4/+0
Previously, we only supported SR-IOV on PCI Express Endpoints and Root Complex Integrated Endpoints. This restriction has been present since d1b054da8f59 ("PCI: initialize and release SR-IOV capability") added SR-IOV support, but the spec does not require it. In fact, the SR-IOV spec r1.1, sec 3.3, says the SR-IOV extended capability may be present for any Type 0 function. Remove the function type test, so we can support SR-IOV on any function. Some AMD GPUs have display outputs, use the VGA class code, are Legacy Endpoints, and support SR-IOV. This change allows Linux to enable SR-IOV on these devices. [bhelgaas: changelog] Link: https://bugzilla.kernel.org/show_bug.cgi?id=112221 Signed-off-by: Kelly Zytaruk <kelly.zytaruk@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-11-02Merge branches 'pci/aer', 'pci/hotplug', 'pci/misc', 'pci/msi', 'pci/resource' and 'pci/virtualization' into nextBjorn Helgaas1-49/+52
* pci/aer: PCI/AER: Clear error status registers during enumeration and restore * pci/hotplug: PCI: pciehp: Queue power work requests in dedicated function * pci/misc: PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum x86/PCI: Make pci_subsys_init() static PCI: Add builtin_pci_driver() to avoid registration boilerplate PCI: Remove unnecessary "if" statement * pci/msi: x86/PCI: Don't alloc pcibios-irq when MSI is enabled PCI/MSI: Export all remapped MSIs to sysfs attributes PCI: Disable MSI on SiS 761 * pci/resource: sparc/PCI: Add mem64 resource parsing for root bus PCI: Expand Enhanced Allocation BAR output PCI: Make Enhanced Allocation bitmasks more obvious PCI: Handle Enhanced Allocation capability for SR-IOV devices PCI: Add support for Enhanced Allocation devices PCI: Add Enhanced Allocation register entries PCI: Handle IORESOURCE_PCI_FIXED when assigning resources PCI: Handle IORESOURCE_PCI_FIXED when sizing resources PCI: Clear IORESOURCE_UNSET when reverting to firmware-assigned address * pci/virtualization: PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures PCI: Wait 1 second between disabling VFs and clearing NumVFs PCI: Reorder pcibios_sriov_disable() PCI: Remove VFs in reverse order if virtfn_add() fails PCI: Remove redundant validation of SR-IOV offset/stride registers PCI: Set SR-IOV NumVFs to zero after enumeration PCI: Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs PCI: Don't try to restore VF BARs
2015-10-30PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failuresAlexander Duyck1-5/+6
Disable VFs if pcibios_enable_sriov() fails, just like we do for other errors in sriov_enable(). Call pcibios_sriov_disable() if virtfn_add() fails. [bhelgaas: changelog, split to separate patch for reviewability] Fixes: 995df527f399 ("PCI: Add pcibios_sriov_enable() and pcibios_sriov_disable()") Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
2015-10-30PCI: Wait 1 second between disabling VFs and clearing NumVFsAlexander Duyck1-1/+1
Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after clearing VF Enable before reading any field in the SR-IOV Extended Capability. Wait 1 second before calling pci_iov_set_numvfs(), which reads PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets PCI_SRIOV_NUM_VF. [bhelgaas: split to separate patch for reviewability, add spec reference] Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-10-30PCI: Reorder pcibios_sriov_disable()Alexander Duyck1-6/+6
Move pcibios_sriov_disable() up so it's defined before a future use. [bhelgaas: split to separate patch for reviewability] Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
2015-10-30PCI: Remove VFs in reverse order if virtfn_add() failsAlexander Duyck1-3/+3
If virtfn_add() fails, we call virtfn_remove() for any previously added devices. Remove the devices in reverse order (first-added is last-removed), which is more natural and doesn't require an additional variable. [bhelgaas: changelog, split to separate patch for reviewability] Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
2015-10-29PCI: Handle Enhanced Allocation capability for SR-IOV devicesDavid Daney1-2/+9
SR-IOV BARs can be specified via EA entries. Extend the EA parser to extract the SRIOV BAR resources, and modify sriov_init() to use resources previously obtained via EA. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Sean O. Stalley <sean.stalley@intel.com>
2015-10-29PCI: Remove redundant validation of SR-IOV offset/stride registersAlexander Duyck1-9/+1
Previously, we read, validated, and cached PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE in sriov_enable(). But sriov_init() now does that via compute_max_vf_buses(), so we don't need to do it again. Remove the PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE config reads from sriov_enable(). The pci_sriov structure already contains the offset and stride corresponding to the current NumVFs. [bhelgaas: split to separate patch for reviewability] Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
2015-10-29PCI: Set SR-IOV NumVFs to zero after enumerationAlexander Duyck1-19/+22
The enumeration path should leave NumVFs set to zero. But after 4449f079722c ("PCI: Calculate maximum number of buses required for VFs"), we call virtfn_max_buses() in the enumeration path, which changes NumVFs. This NumVFs change is visible via lspci and sysfs until a driver enables SR-IOV. Iterate from TotalVFs down to zero so NumVFs is zero when we're finished computing the maximum number of buses. Validate offset and stride in the loop, so we can test it at every possible NumVFs setting. Rename virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a side effect of updating iov->max_VF_buses. [bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop, reverse sense of error path] Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for VFs") Based-on-patch-by: Ethan Zhao <ethan.zhao@oracle.com> Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-10-29PCI: Enable SR-IOV ARI Capable Hierarchy before reading TotalVFsBen Shelton1-4/+4
For some SR-IOV devices, the number of available virtual functions, i.e., TotalVFs, increases after setting the ARI Capable Hierarchy bit in the SR-IOV Control register. This violates the SR-IOV spec, r1.1, sec 3.3.6, which says TotalVFs is HwInit, but we don't need TotalVFs before setting the ARI Capable bit anyway. Set the ARI Capable Hierarchy bit (if ARI is enabled in the upstream bridge) before reading TotalVFs. [bhelgaas: changelog] Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-03-31PCI: Add pcibios_iov_resource_alignment() interfaceWei Yang1-1/+7
Per the SR-IOV spec r1.1, sec 3.3.14, the required alignment of a PF's IOV BAR is the size of an individual VF BAR, and the size consumed is the individual VF BAR size times NumVFs. The PowerNV platform has additional alignment requirements to help support its Partitionable Endpoint device isolation feature (see Documentation/powerpc/pci_iov_resource_on_powernv.txt). Add a pcibios_iov_resource_alignment() interface to allow platforms to request additional alignment. [bhelgaas: changelog, adapt to reworked pci_sriov_resource_alignment(), drop "align" parameter] Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Add pcibios_sriov_enable() and pcibios_sriov_disable()Wei Yang1-0/+19
VFs are dynamically created when a driver enables them. On some platforms, like PowerNV, special resources are necessary to enable VFs. Add platform hooks for enabling and disabling VFs. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Export pci_iov_virtfn_bus() and pci_iov_virtfn_devfn()Wei Yang1-12/+16
On PowerNV, some resource reservation is needed for SR-IOV VFs that don't exist at the bootup stage. To do the match between resources and VFs, the code need to get the VF's BDF in advance. Rename virtfn_bus() and virtfn_devfn() to pci_iov_virtfn_bus() and pci_iov_virtfn_devfn() and export them. [bhelgaas: changelog, make "busnr" int] Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Calculate maximum number of buses required for VFsWei Yang1-4/+27
An SR-IOV device can change its First VF Offset and VF Stride based on the values of ARI Capable Hierarchy and NumVFs. The number of buses required for all VFs is determined by NumVFs, First VF Offset, and VF Stride (see SR-IOV spec r1.1, sec 2.1.2). Previously pci_iov_bus_range() computed how many buses would be required by TotalVFs, but this was based on a single NumVFs value and may not have been the maximum for all NumVFs configurations. Iterate over all valid NumVFs and calculate the maximum number of bus numbers that could ever be required for VFs of this device. [bhelgaas: changelog, compute busnr of NumVFs, not TotalVFs, remove kerenl-doc comment marker] Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Refresh First VF Offset and VF Stride when updating NumVFsWei Yang1-4/+19
The First VF Offset and VF Stride fields depend on the NumVFs setting, so refresh the cached fields in struct pci_sriov when updating NumVFs. See the SR-IOV spec r1.1, sec 3.3.9 and 3.3.10. [bhelgaas: changelog, remove kernel-doc comment marker] Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Index IOV resources in the conventional styleBjorn Helgaas1-4/+4
Most of PCI uses "res = &dev->resource[i]", not "res = dev->resource + i". Use that style in iov.c also. No functional change. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Keep individual VF BAR size in struct pci_sriovWei Yang1-19/+20
Currently we don't store the individual VF BAR size. We calculate it when needed by dividing the PF's IOV resource size (which contains space for *all* the VFs) by total_VFs or by reading the BAR in the SR-IOV capability again. Keep the individual VF BAR size in struct pci_sriov.barsz[], add pci_iov_resource_size() to retrieve it, and use that instead of doing the division or reading the SR-IOV capability BAR. [bhelgaas: rename to "barsz[]", simplify barsz[] index computation, remove SR-IOV capability BAR sizing] Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Print PF SR-IOV resource that contains all VF(n) BAR spaceWei Yang1-0/+2
When we size VF BAR0, VF BAR1, etc., from the SR-IOV Capability of a PF, we learn the alignment requirement and amount of space consumed by a single VF. But when VFs are enabled, *each* of the NumVFs consumes that amount of space, so the total size of the PF resource is "VF BAR size * NumVFs". Add a printk of the total space consumed by the VFs corresponding to what we already do for normal non-IOV BARs. No functional change; new message only. [bhelgaas: split out into its own patch] Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31PCI: Print more info in sriov_enable() error messageBjorn Helgaas1-2/+5
If we don't have space for all the bus numbers required to enable VFs, print the largest bus number required and the range available. No functional change; improved error message only. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-11-19PCI: Remove fixed parameter in pci_iov_resource_bar()Myron Stowe1-8/+3
pci_iov_resource_bar() always sets its 'pci_bar_type' parameter to 'pci_bar_unknown'. Drop the parameter and just use 'pci_bar_unknown' directly in the callers. No functional change intended. Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Chris Wright <chrisw@sous-sol.org> CC: Yu Zhao <yuzhao@google.com>
2014-09-16PCI: Use device flag helper functionsEthan Zhao1-1/+1
Use PCI device flag helper functions when checking whether a device is assigned. No functional change. Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-05-30PCI: Make pci_bus_add_device() voidYijing Wang1-1/+1
pci_bus_add_device() always returns 0, so there's no point in returning anything at all. Make it a void function and remove the tests of the return value from the callers. [bhelgaas: changelog, remove unused "err" from i82875p_setup_overfl_dev()] Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-02-19PCI: Remove unused SR-IOV VF Migration supportBjorn Helgaas1-119/+0
This reverts commit 74bb1bcc7dbb ("PCI: handle SR-IOV Virtual Function Migration"), removing this exported interface: pci_sriov_migration() Since pci_sriov_migration() is unused, it is impossible to schedule sriov_migration_task() or use any of the other migration infrastructure. This is based on Stephen Hemminger's patch (see link below), but goes a bit further. Link: http://lkml.kernel.org/r/20131227132710.7190647c@nehalam.linuxnetplumber.net Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Stephen Hemminger <stephen@networkplumber.org>
2014-01-10PCI: Never treat a VF as a multifunction deviceAlex Williamson1-0/+1
Per the SR-IOV spec rev 1.1: 3.4.1.9 Header Type (Offset 0Eh) "... For VFs, this register must be RO Zero." Unfortunately some devices get this wrong, ex. Emulex OneConnect 10Gb NIC. When they do it makes us handle ACS testing and therefore IOMMU groups as if they were actual multifunction devices and require ACS capabilities to make sure there's no peer-to-peer between functions. VFs are never traditional multifunction devices, so simply clear this bit before we get any further into setup. Link: https://bugzilla.kernel.org/show_bug.cgi?id=68431 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-11-22PCI: Clear NumVFs when disabling SR-IOV in sriov_init()ethan.zhao1-0/+1
When SR-IOV is disabled (VF Enable is cleared), NumVFs is not very useful, so this patch clears it out to prevent confusing lspci output like that below. We already clear NumVFs in sriov_disable(), and this does the same when we disable SR-IOV as part of parsing the SR-IOV capability. $ lspci -vvv -s 13:00.0 13:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01) Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV) IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+ Initial VFs: 64, Total VFs: 64, Number of VFs: 64, ... [bhelgaas: changelog] Signed-off-by: ethan.zhao <ethan.kernel@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-11-14PCI: Fix whitespace, capitalization, and spelling errorsBjorn Helgaas1-1/+1
Fix whitespace, capitalization, and spelling errors. No functional change. I know "busses" is not an error, but "buses" was more common, so I used it consistently. Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus()) Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-31PCI: Return -ENOSYS for SR-IOV operations on non-SR-IOV devicesStefan Assmann1-7/+10
Change the return value to -ENOSYS if a device is not an SR-IOV PF. Previously we returned either -ENODEV or -EINVAL. Also have pci_sriov_get_totalvfs() return 0 in the error case to make the behaviour consistent whether CONFIG_PCI_IOV is enabled or not. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-07-30PCI: Update NumVFs register when disabling SR-IOVYijing Wang1-1/+3
Currently, we only update NumVFs register during sriov_enable(). This register should also be updated during sriov_disable() and when sriov_enable() fails. Otherwise, we will get the stale "Number of VFs" info from lspci. [bhelgaas: changelog] Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-07-25PCI: Fix comment typo in iov.cJonghwan Choi1-1/+1
"Devic3" should be "device." Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-06-14Merge branch 'pci/jiang-bus-lock-v3' into nextBjorn Helgaas1-35/+25
* pci/jiang-bus-lock-v3: PCI: Return early on allocation failures to unindent mainline code PCI: Simplify IOV implementation and fix reference count races PCI: Drop redundant setting of bus->is_added in virtfn_add_bus() unicore32/PCI: Remove redundant call of pci_bus_add_devices() m68k/PCI: Remove redundant call of pci_bus_add_devices() PCI: Rename pci_release_bus_bridge_dev() to pci_release_host_bridge_dev() PCI: Fix refcount issue in pci_create_root_bus() error recovery path ia64/PCI: Clean up pci_scan_root_bus() usage PCI: Convert alloc_pci_dev(void) to pci_alloc_dev(bus) PCI: Introduce pci_alloc_dev(struct pci_bus*) to replace alloc_pci_dev() PCI: Introduce pci_bus_{get|put}() to manage PCI bus reference count Conflicts: drivers/pci/probe.c
2013-06-14PCI: Simplify IOV implementation and fix reference count racesJiang Liu1-35/+24
Trivial changes to IOV: 1) use new PCI interfaces to simplify IOV implementation 2) fix some reference count related race windows [bhelgaas: fix virtfn_add() add bus/alloc dev error paths] Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Donald Dutile <ddutile@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ram Pai <linuxram@us.ibm.com>
2013-06-14PCI: Drop redundant setting of bus->is_added in virtfn_add_bus()Jiang Liu1-1/+0
The flag pci_bus->is_added is used to guard invocation of pcibios_fixup_bus(pci_bus). When virtfn_add_bus() is called, the pci_bus->is_added flag has already been set, so remove the redundant bus->is_added = 1; Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Donald Dutile <ddutile@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ram Pai <linuxram@us.ibm.com>
2013-06-05PCI: Convert alloc_pci_dev(void) to pci_alloc_dev(bus)Gu Zheng1-3/+5
Use the new pci_alloc_dev(bus) to replace the existing using of alloc_pci_dev(void). [bhelgaas: drop pci_bus ref later in pci_release_dev()] Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Airlie <airlied@linux.ie> Cc: Neela Syam Kolli <megaraidlinux@lsi.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andrew Morton <akpm@linux-foundation.org>
2013-05-31PCI: Finish SR-IOV VF setup before adding the deviceXudong Hao1-3/+2
Commit 4f535093cf "PCI: Put pci_dev in device tree as early as possible" moves device registering from pci_bus_add_devices() to pci_device_add(). That causes problems for virtual functions because device_add(&virtfn->dev) is called before setting the virtfn->is_virtfn flag, which then causes Xen to report PCI virtual functions as PCI physical functions. Fix it by setting virtfn->is_virtfn before calling pci_device_add(). [Jiang Liu]: Move the setting of virtfn->is_virtfn ahead further for better readability and modify changelog. Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v3.9+
2013-04-24pci: Add SRIOV helper function to determine if VFs are assigned to guestAlexander Duyck1-0/+41
This function is meant to add a helper function that will determine if a PF has any VFs that are currently assigned to a guest. We currently have been implementing this function per driver, and going forward I would like to avoid that by making this function generic and using this helper. v2: Removed extern from declaration of pci_vfs_assigned in pci.h and return 0 if SR-IOV is disabled with is inline with other PCI SRIOV functions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>