aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/include/asm/eeh.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-08-31powerpc/eeh: Remove unnecessary config_addr from eeh_devAlexey Kardashevskiy1-1/+0
The eeh_dev struct hold a config space address of an associated node and the very same address is also stored in the pci_dn struct which is always present during the eeh_dev lifetime. This uses bus:devfn directly from pci_dn instead of cached and packed config_addr. Since config_addr is made from device's bus:dev.fn, there is no point in keeping it in the debugfs either so remove that too. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/eeh: Remove unnecessary pointer to phb from eeh_devAlexey Kardashevskiy1-1/+0
The eeh_dev struct already holds a pointer to pci_dn which it does not exist without and pci_dn itself holds the very same pointer so just use it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-31powerpc/eeh: Reduce to one the number of places where edev is allocatedAlexey Kardashevskiy1-1/+2
arch/powerpc/kernel/eeh_dev.c:57 is the only legit place where edev is allocated; other 2 places allocate it on stack and in the heap for a very short period of time to use eeh_pe_get() as takes edev. This changes eeh_pe_get() to receive required parameters explicitly. This removes unnecessary temporary allocation of edev. This uses the "pe_no" name instead of the "pe_config_addr" name as it actually is a PE number and not a config space address as it seemed. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-21powerpc/pci: Delay populating pdnGavin Shan1-1/+1
The pdn (struct pci_dn) instances are allocated from memblock or bootmem when creating PCI controller (hoses) in setup_arch(). PCI hotplug, which will be supported by proceeding patches, releases PCI device nodes and their corresponding pdn on unplugging event. The memory chunks for pdn instances allocated from memblock or bootmem are hard to reused after being released. This delays creating pdn by pci_devs_phb_init() from setup_arch() to core_initcall() so that they are allocated from slab. The memory consumed by pdn can be released to system without problem during PCI unplugging time. It indicates that pci_dn is unavailable in setup_arch() and the the fixup on pdn (like AGP's) can't be carried out that time. We have to do that in pcibios_root_bridge_prepare() on maple/pasemi/powermac platforms where/when the pdn is available. pcibios_root_bridge_prepare is called from subsys_initcall() which is executed after core_initcall() so the code flow does not change. At the mean while, the EEH device is created when pdn is populated, meaning pdn and EEH device have same life cycle. In turn, we needn't call eeh_dev_init() to create EEH device explicitly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14powerpc: Various typo fixesMichael Ellerman1-1/+1
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: powerpc/eeh: Support error recovery for VF PEWei Yang1-0/+2
PFs are enumerated on PCI bus, while VFs are created by PF's driver. In EEH recovery, it has two cases: 1. Device and driver is EEH aware, error handlers are called. 2. Device and driver is not EEH aware, un-plug the device and plug it again by enumerating it. The special thing happens on the second case. For a PF, we could use the original pci core to enumerate the bus, while for VF we need to record the VFs which aer un-plugged then plug it again. Also The patch caches the VF index in pci_dn, which can be used to calculate VF's bus, device and function number. Those information helps to locate the VF's PCI device instance when doing hotplug during EEH recovery if necessary. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/powernv: Support EEH reset for VF PEWei Yang1-0/+1
PEs for VFs don't have primary bus. So they have to have their own reset backend, which is used during EEH recovery. The patch implements the reset backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained in the PE. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Create PE for VFsWei Yang1-0/+1
This creates PEs for VFs in the weak function pcibios_bus_add_device(). Those PEs for VFs are identified with newly introduced flag EEH_PE_VF so that we treat them differently during EEH recovery. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: EEH device for VFWei Yang1-0/+1
VFs and their corresponding pdn are created and released dynamically when their PF's SRIOV capability is enabled and disabled. This creates and releases EEH devices for VFs when creating and releasing their pdn instances, which means EEH devices and pdn instances have same life cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15powerpc/eeh: Fix stale cached primary busGavin Shan1-0/+1
When PE is created, its primary bus is cached to pe->bus. At later point, the cached primary bus is returned from eeh_pe_bus_get(). However, we could get stale cached primary bus and run into kernel crash in one case: full hotplug as part of fenced PHB error recovery releases all PCI busses under the PHB at unplugging time and recreate them at plugging time. pe->bus is still dereferencing the PCI bus that was released. This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity of pe->bus. pe->bus is updated when its first child EEH device is online and the flag is set. Before unplugging in full hotplug for error recovery, the flag is cleared. Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE") Cc: stable@vger.kernel.org #v3.11+ Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Pradipta Ghosh <pradghos@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-12powerpc/eeh: Introduce eeh_pe_inject_err()Gavin Shan1-0/+2
The patch defines PCI error types and functions in uapi/asm/eeh.h and exports function eeh_pe_inject_err(), which will be called by VFIO driver to inject the specified PCI error to the indicated PE for testing purpose. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-12powerpc/eeh: Move PE state constants aroundGavin Shan1-5/+2
There are two equivalent sets of PE state constants, defined in arch/powerpc/include/asm/eeh.h and include/uapi/linux/vfio.h. Though the names are different, their corresponding values are exactly same. The former is used by EEH core and the latter is used by userspace. The patch moves those constants from arch/powerpc/include/asm/eeh.h to arch/powerpc/include/uapi/asm/eeh.h, which are expected to be used by userspace from now on. We can't delete those constants in vfio.h as it's uncertain that those constants have been or will be used by userspace. Suggested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-24powerpc/eeh: Remove device_node dependencyGavin Shan1-7/+0
The patch removes struct eeh_dev::dn and the corresponding helper functions: eeh_dev_to_of_node() and of_node_to_eeh_dev(). Instead, eeh_dev_to_pdn() and pdn_to_eeh_dev() should be used to get the pdn, which might contain device_node on PowerNV platform. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24powerpc/eeh: Replace device_node with pci_dn in eeh_opsGavin Shan1-3/+3
There are 3 EEH operations whose arguments contain device_node: read_config(), write_config() and restore_config(). The patch replaces device_node with pci_dn. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24powerpc/eeh: Do probe on pci_dnGavin Shan1-6/+5
Originally, EEH core probes on device_node or pci_dev to populate EEH devices and PEs, which conflicts with the fact: SRIOV VFs are usually enabled and created by PF's driver and they don't have the corresponding device_nodes. Instead, SRIOV VFs have dynamically created pci_dn, which can be used for EEH probe. The patch reworks EEH probe for PowerNV and pSeries platforms to do probing based on pci_dn, instead of pci_dev or device_node any more. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24powerpc/eeh: Create eeh_dev from pci_dn instead of device_nodeGavin Shan1-2/+9
The patch adds function traverse_pci_dn(), which is similar to traverse_pci_devices() except it takes pci_dn, not device_node as parameter. The pci_dev.c has been reworked to create eeh_dev from pci_dn, instead of device_node. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-01-23powerpc/eeh: Allow to set maximal frozen timesGavin Shan1-6/+1
When PE's frozen count hits maximal allowed frozen times, which is 5 currently, it will be forced to be offline permanently. Once the PE is removed permanently, rebooting machine is required to bring the PE back. It's not convienent when testing EEH functionality. The patch exports the maximal allowed frozen times through debugfs entry (/sys/kernel/debug/powerpc/eeh_max_freezes). Requested-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-23powerpc/eeh: Introduce flag EEH_PE_REMOVEDGavin Shan1-0/+1
The conditions that one specific PE's frozen count exceeds the maximal allowed times (EEH_MAX_ALLOWED_FREEZES) and it's in isolated or recovery state indicate the PE was removed permanently implicitly. The patch introduces flag EEH_PE_REMOVED to indicate that explicitly so that we don't depend on the fixed maximal allowed times, which can be varied as we do in subsequent patch. Flag EEH_PE_REMOVED is expected to be marked for the PE whose frozen count exceeds the maximal allowed times, or just failed from recovery. Requested-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-23powerpc/eeh: Fix missed PE#0 on P7IOCGavin Shan1-2/+3
PE#0 should be regarded as valid for P7IOC, while it's invalid for PHB3. The patch adds flag EEH_VALID_PE_ZERO to differentiate those two cases. Without the patch, we possibly see frozen PE#0 state is cleared without EEH recovery taken on P7IOC as following kernel logs indicate: [root@ltcfbl8eb ~]# dmesg : pci 0000:00 : [PE# 000] Secondary bus 0 associated with PE#0 pci 0000:01 : [PE# 001] Secondary bus 1 associated with PE#1 pci 0001:00 : [PE# 000] Secondary bus 0 associated with PE#0 pci 0001:01 : [PE# 001] Secondary bus 1 associated with PE#1 pci 0002:00 : [PE# 000] Secondary bus 0 associated with PE#0 pci 0002:01 : [PE# 001] Secondary bus 1 associated with PE#1 pci 0003:00 : [PE# 000] Secondary bus 0 associated with PE#0 pci 0003:01 : [PE# 001] Secondary bus 1 associated with PE#1 pci 0003:20 : [PE# 002] Secondary bus 32..63 associated with PE#2 : EEH: Clear non-existing PHB#3-PE#0 EEH: PHB location: U78AE.001.WZS00M9-P1-002 Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-02powerpc/eeh: Dump PHB diag-data earlyGavin Shan1-0/+1
On PowerNV platform, PHB diag-data is dumped after stopping device drivers. In case of recursive EEH errors, the kernel is usually crashed before dumping PHB diag-data for the second EEH error. It's hard to locate the root cause of the second EEH error without PHB diag-data. The patch adds one more EEH option "eeh=early_log", which helps dumping PHB diag-data immediately once frozen PE is detected, in order to get the PHB diag-data for the second EEH error. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02powerpc/eeh: Set EEH_PE_RESET on PE resetGavin Shan1-0/+1
The patch introduces additional flag EEH_PE_RESET to indicate the corresponding PE is under reset. In turn, the PE retrieval bakcend on PowerNV platform can return unfrozen state for the EEH core to moving forward. Flag EEH_PE_CFG_BLOCKED isn't the correct one for the purpose. In PCI passthrou case, the problem is more worse: Guest doesn't recover 6th EEH error. The PE is left in isolated (frozen) and config blocked state on Broadcom adapters. We can't retrieve the PE's state correctly any more, even from the host side via sysfs /sys/bus/pci/devices/xxx/eeh_pe_state. Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-10-15powerpc/eeh: Block PCI config access upon frozen PEGavin Shan1-0/+1
The problem was found when I tried to inject PCI config error by PHB3 PAPR error injection registers into Broadcom Austin 4-ports NIC adapter. The frozen PE was reported successfully and EEH core started to recover it. However, I run into fenced PHB when dumping PCI config space as EEH logs. I was told that PCI config requests should not be progagated to the adapter until PE reset is done successfully. Otherise, we would run out of PHB internal credits and trigger PCT (PCIE Completion Timeout), which leads to the fenced PHB. The patch introduces another PE flag EEH_PE_CFG_RESTRICTED, which is set during PE initialization time if the PE includes the specific PCI devices that need block PCI config access until PE reset is done. When the PE becomes frozen for the first time, EEH_PE_CFG_BLOCKED is set if the PE has flag EEH_PE_CFG_RESTRICTED. Then the PCI config access to the PE will be dropped by platform PCI accessors until PE reset is done successfully. The mechanism is shared by PowerNV platform owned PE or userland owned ones. It's not used on pSeries platform yet. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKEDGavin Shan1-1/+1
The flag EEH_PE_RESET indicates blocking config space of the PE during reset time. We potentially need block PE's config space other than reset time. So it's reasonable to replace it with EEH_PE_CFG_BLOCKED to indicate its usage. There are no substantial code or logic changes in this patch. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30powerpc/eeh: Emulate EEH recovery for VFIO devicesGavin Shan1-0/+1
When enabling EEH functionality on passed through devices (PE) with VFIO, the devices in the PE would be removed permanently from guest side. In that case, the PE remains frozen state. When returning PE to host, or restarting the guest again, we had mechanism unfreezing the PE by clearing PESTA/B frozen bits. However, that's not enough for some adapters, which are indicated as following "lspci" shows. Those adapters require hot reset on the parent bus to bring their firmware back to workable state. Otherwise, those adaptrs won't be operative and the host (for returning case) or the guest will fail to load the drivers for those adapters without exception. 0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \ 10Gb NIC (be3) (rev 02) 0000:01:00.0 0200: 19a2:0710 (rev 02) 0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \ NIC (Lancer) (rev 10) 0001:03:00.0 0200: 10df:e220 (rev 10) The patch adds mechanism to emulate EEH recovery (for hot reset on parent PCI bus) on 3 gates to fix the issue: open/release one adapter of the PE, enable EEH functionality on one adapter of the PE. Reported-by: Murilo Fossa Vicentini <muvic@br.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30powerpc/eeh: Unfreeze PE on enabling EEH functionalityGavin Shan1-0/+1
When passing through PE to guest, that's possibly in frozen state. The driver for the pass-through devices on guest side can't be loaded successfully as reported. We already had one gate in eeh_dev_open() to clear PE frozen state accordingly, but that's not enough because the function is only called at QEMU startup for once. The patch adds another gate in eeh_pe_set_option() so that the PE frozen state can be cleared at QEMU restart time. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30powerpc/eeh: Introduce eeh_ops::err_injectGavin Shan1-0/+2
The patch introduces eeh_ops::err_inject(), which allows to inject specified errors to indicated PE for testing purpose. The functionality isn't support on pSeries platform. On PowerNV, the functionality relies on OPAL API opal_pci_err_inject(). Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30powerpc/eeh: Freeze PE before PE resetGavin Shan1-0/+1
The patch adds one more option (EEH_OPT_FREEZE_PE) to set_option() method to proactively freeze PE, which will be issued before resetting pass-throughed PE to drop MMIO access during reset because it's always contributing to recursive EEH error. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30powerpc/eeh: Drop unused argument in eeh_check_failure()Gavin Shan1-15/+14
eeh_check_failure() is used to check frozen state of the PE which owns the indicated I/O address. The argument "val" of the function isn't used. The patch drops it and return the frozen state of the PE as expected. Cc: Vishal Mansur <vmansur@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25powerpc/eeh: Fix kernel crash when passing through VFWei Yang1-0/+5
When doing vfio passthrough a VF, the kernel will crash with following message: [ 442.656459] Unable to handle kernel paging request for data at address 0x00000060 [ 442.656593] Faulting instruction address: 0xc000000000038b88 [ 442.656706] Oops: Kernel access of bad area, sig: 11 [#1] [ 442.656798] SMP NR_CPUS=1024 NUMA PowerNV [ 442.656890] Modules linked in: vfio_pci mlx4_core nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw tg3 nfsd be2net nfs_acl ses lockd ptp enclosure pps_core kvm_hv kvm_pr shpchp binfmt_misc kvm sunrpc uinput lpfc scsi_transport_fc ipr scsi_tgt [last unloaded: mlx4_core] [ 442.658152] CPU: 40 PID: 14948 Comm: qemu-system-ppc Not tainted 3.10.42yw-pkvm+ #37 [ 442.658219] task: c000000f7e2a9a00 ti: c000000f6dc3c000 task.ti: c000000f6dc3c000 [ 442.658287] NIP: c000000000038b88 LR: c0000000004435a8 CTR: c000000000455bc0 [ 442.658352] REGS: c000000f6dc3f580 TRAP: 0300 Not tainted (3.10.42yw-pkvm+) [ 442.658419] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 28004882 XER: 20000000 [ 442.658577] CFAR: c00000000000908c DAR: 0000000000000060 DSISR: 40000000 SOFTE: 1 GPR00: c0000000004435a8 c000000f6dc3f800 c0000000012b1c10 c00000000da24000 GPR04: 0000000000000003 0000000000001004 00000000000015b3 000000000000ffff GPR08: c00000000127f5d8 0000000000000000 000000000000ffff 0000000000000000 GPR12: c000000000068078 c00000000fdd6800 000001003c320c80 000001003c3607f0 GPR16: 0000000000000001 00000000105480c8 000000001055aaa8 000001003c31ab18 GPR20: 000001003c10fb40 000001003c360ae8 000000001063bcf0 000000001063bdb0 GPR24: 000001003c15ed70 0000000010548f40 c000001fe5514c88 c000001fe5514cb0 GPR28: c00000000da24000 0000000000000000 c00000000da24000 0000000000000003 [ 442.659471] NIP [c000000000038b88] .pcibios_set_pcie_reset_state+0x28/0x130 [ 442.659530] LR [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40 [ 442.659585] Call Trace: [ 442.659610] [c000000f6dc3f800] [00000000000719e0] 0x719e0 (unreliable) [ 442.659677] [c000000f6dc3f880] [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40 [ 442.659757] [c000000f6dc3f900] [c000000000455bf8] .reset_fundamental+0x38/0x80 [ 442.659835] [c000000f6dc3f980] [c0000000004562a8] .pci_dev_specific_reset+0xa8/0xf0 [ 442.659913] [c000000f6dc3fa00] [c0000000004448c4] .__pci_dev_reset+0x44/0x430 [ 442.659980] [c000000f6dc3fab0] [c000000000444d5c] .pci_reset_function+0x7c/0xc0 [ 442.660059] [c000000f6dc3fb30] [d00000001c141ab8] .vfio_pci_open+0xe8/0x2b0 [vfio_pci] [ 442.660139] [c000000f6dc3fbd0] [c000000000586c30] .vfio_group_fops_unl_ioctl+0x3a0/0x630 [ 442.660219] [c000000f6dc3fc90] [c000000000255fbc] .do_vfs_ioctl+0x4ec/0x7c0 [ 442.660286] [c000000f6dc3fd80] [c000000000256364] .SyS_ioctl+0xd4/0xf0 [ 442.660354] [c000000f6dc3fe30] [c000000000009e54] syscall_exit+0x0/0x98 [ 442.660420] Instruction dump: [ 442.660454] 4bfffce9 4bfffee4 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7e1b78 [ 442.660566] 7c9f2378 60000000 60000000 e93e02c8 <e8690060> 2fa30000 41de00c4 2b9f0002 [ 442.660679] ---[ end trace a64ac9546bcf0328 ]--- [ 442.660724] The reason is current VF is not EEH enabled. This patch introduces a macro to convert eeh_dev to eeh_pe. By doing so, it will prevent converting with NULL pointer. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> CC: Michael Ellerman <mpe@ellerman.id.au> V3 -> V4: 1. move the macro definition from include/linux/pci.h to arch/powerpc/include/asm/eeh.h V2 -> V3: 1. rebased on 3.17-rc4 2. introduce a macro 3. use this macro in several other places V1 -> V2: 1. code style and patch subject adjustment Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-08-05powerpc/eeh: Aux PE data for error logGavin Shan1-0/+2
The patch allows PE (struct eeh_pe) instance to have auxillary data, whose size is configurable on basis of platform. For PowerNV, the auxillary data will be used to cache PHB diag-data for that PE (frozen PE or fenced PHB). In turn, we can retrieve the diag-data at any later points. It's useful for the case of VFIO PCI devices where the error log should be cached, and then be retrieved by the guest at later point. Also, it can avoid PHB diag-data overwritting if another frozen PE reported and the previous diag-data isn't fetched by guest. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Selectively enable IO for error logGavin Shan1-4/+5
According to the experiment I did, PCI config access is blocked on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That means we always get 0xFF's while dumping PCI config space of the frozen PE on P7IOC. We don't have the problem on PHB3. So we have to enable I/O prioir to collecting error log. Otherwise, meaningless 0xFF's are always returned. The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is selectively set to indicate the case for: P7IOC on PowerNV platform, pSeries platform. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Refactor EEH flag accessorsGavin Shan1-21/+11
There are multiple global EEH flags. Almost each flag has its own accessor, which doesn't make sense. The patch refactors EEH flag accessors so that they look unified: eeh_add_flag(): Add EEH flag eeh_clear_flag(): Clear EEH flag eeh_has_flag(): Check if one specific flag has been set eeh_enabled(): Check if EEH functionality has been enabled Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: EEH support for VFIO PCI deviceGavin Shan1-0/+12
The patch exports functions to be used by new VFIO ioctl command, which will be introduced in subsequent patch, to support EEH functinality for VFIO PCI devices. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Avoid event on passed PEGavin Shan1-0/+7
We must not handle EEH error on devices which are passed to somebody else. Instead, we expect that the frozen device owner detects an EEH error and recovers from it. This avoids EEH error handling on passed through devices so the device owner gets a chance to handle them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Dump PE location codeGavin Shan1-0/+1
As Ben suggested, it's meaningful to dump PE's location code for site engineers when hitting EEH errors. The patch introduces function eeh_pe_loc_get() to retireve the location code from dev-tree so that we can output it when hitting EEH errors. If primary PE bus is root bus, the PHB's dev-node would be tried prior to root port's dev-node. Otherwise, the upstream bridge's dev-node of the primary PE bus will be check for the location code directly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-20Revert "powerpc/powernv: Fundamental reset on PLX ports"Benjamin Herrenschmidt1-1/+0
This reverts commit b2b5efcf208ddc9444aca77336627428782a39f4. This code was way too board specific, there are quirks as to how the PERST line is wired on different boards, we'll have to revisit this using/creating appropriate firmware interfaces. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Fundamental reset on PLX portsGavin Shan1-0/+1
The patch intends to support fundamental reset on PLX downstream ports. If the PCI device matches any one of the internal table, which includes PLX vendor ID, bridge device ID, register offset for fundamental reset and bit, fundamental reset will be done accordingly. Otherwise, it will fail back to hot reset. Additional flag (EEH_DEV_FRESET) is introduced to record the last reset type on the PCI bridge. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Make the delay for PE reset unifiedGavin Shan1-0/+10
Basically, we have 3 types of resets to fulfil PE reset: fundamental, hot and PHB reset. For the later 2 cases, we need PCI bus reset hold and settlement delay as specified by PCI spec. PowerNV and pSeries platforms are running on top of different firmware and some of the delays have been covered by underly firmware (PowerNV). The patch makes the delays unified to be done in backend, instead of EEH core. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: No hotplug on permanently removed devGavin Shan1-0/+1
The issue was detected in a bit complicated test case where we have multiple hierarchical PEs shown as following figure: +-----------------+ | PE#3 p2p#0 | | p2p#1 | +-----------------+ | +-----------------+ | PE#4 pdev#0 | | pdev#1 | +-----------------+ PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p bridges. We accidentally had less-known scenario: PE#4 was removed permanently from the system because of permanent failure (e.g. exceeding the max allowd failure times in last hour), then we detects EEH errors on PE#3 and tried to recover it. However, eeh_dev instances for pdev#0/1 were not detached from PE#4, which was still connected to PE#3. All of that was because of the fact that we rely on count-based pcibios_release_device(), which isn't reliable enough. When doing recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which are not valid any more. Eventually, we run into kernel crash. The patch fixes above issue from two aspects. For unplug, we simply skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED && !EEH_PE_STATE_RECOVERING) and its frozen count should be greater than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read its PCI config so that PCI core will omit them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Cleanup EEH subsystem variablesGavin Shan1-10/+19
There're 2 EEH subsystem variables: eeh_subsystem_enabled and eeh_probe_mode. We needn't maintain 2 variables and we can just have one variable and introduce different flags. The patch also introduces additional flag EEH_FORCE_DISABLE, which will be used to disable EEH subsystem via boot parameter ("eeh=off") in future. Besides, the patch also introduces flag EEH_ENABLED, which is changed to disable or enable EEH functionality on the fly through debugfs entry in future. With the patch applied, the creteria to check the enabled EEH functionality is changed to: !EEH_FORCE_DISABLED && EEH_ENABLED : Enabled Other cases : Disabled Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Use cached capability for log dumpGavin Shan1-1/+3
When calling into eeh_gather_pci_data() on pSeries platform, we possiblly don't have pci_dev instance yet, but eeh_dev is always ready. So we use cached capability from eeh_dev instead of pci_dev for log dump there. In order to keep things unified, we also cache PCI capability positions to eeh_dev for PowerNV as well. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Block PCI-CFG access during PE resetGavin Shan1-0/+1
We've observed multiple PE reset failures because of PCI-CFG access during that period. Potentially, some device drivers can't support EEH very well and they can't put the device to motionless state before PE reset. So those device drivers might produce PCI-CFG accesses during PE reset. Also, we could have PCI-CFG access from user space (e.g. "lspci"). Since access to frozen PE should return 0xFF's, we can block PCI-CFG access during the period of PE reset so that we won't get recrusive EEH errors. The patch adds flag EEH_PE_RESET, which is kept during PE reset. The PowerNV/pSeries PCI-CFG accessors reuse the flag to block PCI-CFG accordingly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Remove EEH_PE_PHB_DEADGavin Shan1-1/+0
The PE state (for eeh_pe instance) EEH_PE_PHB_DEAD is duplicate to EEH_PE_ISOLATED. Originally, those PHBs (PHB PE) with EEH_PE_PHB_DEAD would be removed from the system. However, it's safe to replace that with EEH_PE_ISOLATED. The patch also clear EEH_PE_RECOVERING after fenced PHB has been handled, either failure or success. It makes the PHB PE state consistent with: PHB functions normally NONE PHB has been removed EEH_PE_ISOLATED PHB fenced, recovery in progress EEH_PE_ISOLATED | RECOVERING Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-17powerpc/eeh: Cleanup on eeh_subsystem_enabledGavin Shan1-2/+19
The patch cleans up variable eeh_subsystem_enabled so that we needn't refer the variable directly from external. Instead, we will use function eeh_enabled() and eeh_set_enable() to operate the variable. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15powerpc/eeh: Handle multiple EEH errorsGavin Shan1-0/+10
For one PCI error relevant OPAL event, we possibly have multiple EEH errors for that. For example, multiple frozen PEs detected on different PHBs. Unfortunately, we didn't cover the case. The patch enumarates the return value from eeh_ops::next_error() and change eeh_handle_special_event() and eeh_ops::next_error() to handle all existing EEH errors. As Ben pointed out, we needn't list_for_each_entry_safe() since we are not deleting any PHB from the hose_list and the EEH serialized lock should be held while purging EEH events. The patch covers those suggestions as well. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15powerpc/eeh: Hotplug improvementGavin Shan1-1/+2
When EEH error comes to one specific PCI device before its driver is loaded, we will apply hotplug to recover the error. During the plug time, the PCI device will be probed and its driver is loaded. Then we wrongly calls to the error handlers if the driver supports EEH explicitly. The patch intends to fix by introducing flag EEH_DEV_NO_HANDLER and set it before we remove the PCI device. In turn, we can avoid wrongly calls the error handlers of the PCI device after its driver loaded. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15powerpc/eeh: Add restore_config operationGavin Shan1-0/+1
After reset on the specific PE or PHB, we never configure AER correctly on PowerNV platform. We needn't care it on pSeries platform. The patch introduces additional EEH operation eeh_ops:: restore_config() so that we have chance to configure AER correctly for PowerNV platform. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Introdce flag to protect sysfsGavin Shan1-0/+2
The patch introduces flag EEH_DEV_SYSFS to keep track that the sysfs entries for the corresponding EEH device (then PCI device) has been added or removed, in order to avoid race condition. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Don't use pci_dev during BAR restoreGavin Shan1-2/+6
While restoring BARs for one specific PCI device, the pci_dev instance should have been released. So it's not reliable to use the pci_dev instance on restoring BARs. However, we still need some information (e.g. PCIe capability position, header type) from the pci_dev instance. So we have to store those information to EEH device in advance. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/eeh: Use partial hotplug for EEH unaware driversGavin Shan1-1/+5
When EEH error happens to one specific PE, some devices with drivers supporting EEH won't except hotplug on the device. However, there might have other deivces without driver, or with driver without EEH support. For the case, we need do partial hotplug in order to make sure that the PE becomes absolutely quite during reset. Otherise, the PE reset might fail and leads to failure of error recovery. The current code doesn't handle that 'mixed' case properly, it either uses the error callbacks to the drivers, or tries hotplug, but doesn't handle a PE (EEH domain) composed of a combination of the two. The patch intends to support so-called "partial" hotplug for EEH: Before we do reset, we stop and remove those PCI devices without EEH sensitive driver. The corresponding EEH devices are not detached from its PE, but with special flag. After the reset is done, those EEH devices with the special flag will be scanned one by one. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>