Age | Commit message (Collapse) | Author | Files | Lines |
|
The aer_inject module was directly calling aer_irq(). This required the
AER driver export its private IRQ handler for no other reason than to
support error injection. A driver should not have to expose its private
interfaces, so use the IRQ subsystem to route injection to the AER driver,
and make aer_irq() a private interface.
This provides additional benefits:
First, directly calling the IRQ handler bypassed the IRQ subsytem so the
injection wasn't really synthesizing what happens if a shared AER interrupt
occurs.
The error injection had to provide the callback data directly, which may be
racing with a removal that is freeing that structure. The IRQ subsystem
can handle that race.
Finally, using the IRQ subsystem automatically reacts to threaded IRQs,
keeping the error injection abstracted from that implementation detail.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The port services driver already provides a method to find the pcie_device
for a service. Export that function, use it from the aer_inject module,
and remove the duplicate functionality.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Use the managed device resource allocations for the service data so the AER
driver doesn't need to manage it, further simplifying this driver.
Link: https://lore.kernel.org/linux-pci/20180918235848.26694-12-keith.busch@intel.com
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
'default n' is the default value for any bool or tristate Kconfig setting
so there is no need to write it explicitly.
Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO is not
set' for visible symbols") the Kconfig behavior is the same regardless of
'default n' being present or not:
...
One side effect of (and the main motivation for) this change is making
the following two definitions behave exactly the same:
config FOO
bool
config FOO
bool
default n
With this change, neither of these will generate a
'# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
That might make it clearer to people that a bare 'default n' is
redundant.
...
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The ntb_hw_switchtec driver has requirements on kernel configuration so add
these notes to the documentation and also clean up a few other sentences in
the documentation.
Signed-off-by: Wesley Yung <wesley.yung@microchip.com>
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
|
|
Currently the Switchtec quirk runs on all endpoints in the switch,
including all the upstream and downstream ports. These other functions do
not contain BARs, so the quirk fails when trying to map the BAR and prints
the error "Cannot iomap Switchtec device". The user will see a few of
these useless and scary errors, one for each port in the switch.
At most, the quirk should only run on either a management endpoint
(PCI_CLASS_MEMORY_OTHER) or an NTB endpoint (PCI_CLASS_BRIDGE_OTHER).
However, the quirk is useless except in NTB applications, so we will
only run it when the class is PCI_CLASS_BRIDGE_OTHER.
Switch to using DECLARE_PCI_FIXUP_CLASS_FINAL and only match
PCI_CLASS_BRIDGE_OTHER.
Reported-by: Stephen Bates <sbates@raithlin.com>
Fixes: ad281ecf1c7d ("PCI: Add DMA alias quirk for Microsemi Switchtec NTB")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
[bhelgaas: split SWITCHTEC_QUIRK() introduction to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Doug Meyer <dmeyer@gigaio.com>
Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
|
|
Add SWITCHTEC_QUIRK() to reduce redundancy in declaring devices that use
quirk_switchtec_ntb_dma_alias().
By itself, this is no functional change, but a subsequent patch updates
SWITCHTEC_QUIRK() to fix ad281ecf1c7d ("PCI: Add DMA alias quirk for
Microsemi Switchtec NTB").
Fixes: ad281ecf1c7d ("PCI: Add DMA alias quirk for Microsemi Switchtec NTB")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Add Device IDs to the Intel GPU "spurious interrupt" quirk table.
For these devices, unplugging the VGA cable and plugging it in again causes
spurious interrupts from the IGD. Linux eventually disables the interrupt,
but of course that disables any other devices sharing the interrupt.
The theory is that this is a VGA BIOS defect: it should have disabled the
IGD interrupt but failed to do so.
See f67fd55fa96f ("PCI: Add quirk for still enabled interrupts on Intel
Sandy Bridge GPUs") and 7c82126a94e6 ("PCI: Add new ID for Intel GPU
"spurious interrupt" quirk") for some history.
[bhelgaas: See link below for discussion about how to fix this more
generically instead of adding device IDs for every new Intel GPU. I hope
this is the last patch to add device IDs.]
Link: https://lore.kernel.org/linux-pci/1537974841-29928-1-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v3.4+
|
|
The few callers can just use dma_set_max_seg_size ()directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The two callers can just use dma_set_seg_boundary() directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Only some of these were still used by the cxgb4 driver, and that despite
the fact that the driver otherwise uses the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The threaded IRQ is naturally single threaded as desired, so use that to
simplify the AER bottom half handler. Since the root port structure has
much less to do now, remove the rpc construction helper routine.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Use the recommended kernel API for writing to a concurrently-accessed
kfifo. No functional change here.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The kernel provides a generic FIFO implementation, so no need to reinvent
that capability in a driver. Replace the AER-specific implementation with
the kernel-provided kfifo. Since the interrupt handler producer and work
queue consumer run single threaded, there is no need for additional
locking, so remove that lock, too.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The AER struct aer_rpc was carrying a copy of the error source simply as a
temperary variable. Remove that from the structure and use a stack
variable for the purpose.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The error recovery callbacks are only run on child devices. A Root Port is
never a child device, so this error resume callback was never invoked.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The PCI bus config accessors could be inlined into other accessor
functions, which makes it so they can't be traced. Force them to never be
inlined so that ftrace can hook into these functions.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Addresses-Coverity-ID: 1472052 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
After bfcb79fca19d ("PCI/ERR: Run error recovery callbacks for all affected
devices"), AER errors are always cleared by the PCI core and drivers don't
need to do it themselves.
Remove calls to pci_cleanup_aer_uncorrect_error_status() from device
driver error recovery functions.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove PCI core changes, remove unused variables]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Use kmemdup() rather than duplicating its implementation.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/pci/hotplug/cpqphp_core.c: In function 'init_SERR':
drivers/pci/hotplug/cpqphp_core.c:124:5: warning: variable 'physical_slot' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Currently, a hotplug bridge will be given hpmemsize additional memory
and hpiosize additional io if available, in order to satisfy any future
hotplug allocation requirements.
These calculations don't consider the current memory/io size of the
hotplug bridge/slot, so hotplug bridges/slots which have downstream
devices will be allocated their current allocation in addition to the
hpmemsize value.
This makes for possibly undesirable results with a mix of unoccupied and
occupied slots (ex, with hpmemsize=2M):
02:03.0 PCI bridge: <-- Occupied
Memory behind bridge: d6200000-d64fffff [size=3M]
02:04.0 PCI bridge: <-- Unoccupied
Memory behind bridge: d6500000-d66fffff [size=2M]
This change considers the current allocation size when using the
hpmemsize/hpiosize parameters to make the reservations predictable for
the mix of unoccupied and occupied slots:
02:03.0 PCI bridge: <-- Occupied
Memory behind bridge: d6200000-d63fffff [size=2M]
02:04.0 PCI bridge: <-- Unoccupied
Memory behind bridge: d6400000-d65fffff [size=2M]
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
In order to have better power management for Thunderbolt PCIe chains,
Windows enables power management for native PCIe hotplug ports if there is
the following ACPI _DSD attached to the root port:
Name (_DSD, Package () {
ToUUID ("6211e2c0-58a3-4af3-90e1-927a4e0c55a4"),
Package () {
Package () {"HotPlugSupportInD3", 1}
}
})
This is also documented in:
https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3
Do the same in Linux by introducing new firmware PM callback
(->bridge_d3()) and then implement it for ACPI based systems so that the
above property is checked.
There is one catch, though. The initial pci_dev->bridge_d3 is set before
the root port has ACPI companion bound (the device is not added to the PCI
bus either) so we need to look up the ACPI companion manually in that case
in acpi_pci_bridge_d3().
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
It is possible to have _DSD entries where the data is compatible with
device properties format but are using different GUID for various reasons.
In addition to that there can be many such _DSD entries for a single device
such as for PCIe root port used to host a Thunderbolt hierarchy:
Scope (\_SB.PCI0.RP21)
{
Name (_DSD, Package () {
ToUUID ("6211e2c0-58a3-4af3-90e1-927a4e0c55a4"),
Package () {
Package () {"HotPlugSupportInD3", 1}
},
ToUUID ("efcc06cc-73ac-4bc3-bff0-76143807c389"),
Package () {
Package () {"ExternalFacingPort", 1},
Package () {"UID", 0 }
}
})
}
More information about these new _DSD entries can be found in:
https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports
To make these available for drivers via unified device property APIs,
modify ACPI property core so that it supports multiple _DSD entries
organized in a linked list. We also store GUID of each _DSD entry in struct
acpi_device_properties in case there is need to differentiate between
entries. The supported GUIDs are then listed in prp_guids array.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
|
|
Basically we need to do the same steps than what we do when system sleep is
entered and disable PME interrupt when the root port is runtime suspended.
This prevents spurious wakeups immediately when the port is transitioned
into D3cold.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Basically we need to do the same thing when runtime suspending than with
system sleep so re-use those operations here. This makes sure hotplug
interrupt does not trigger immediately when the link goes down.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When PCIe port is runtime suspended/resumed some extra steps might be
needed to be executed from the port service driver side. For instance we
may need to disable PCIe hotplug interrupt to prevent it from triggering
immediately when PCIe link to the downstream component goes down.
To make the above possible add optional ->runtime_suspend() and
->runtime_resume() callbacks to struct pcie_port_service_driver and call
them for each port service in runtime suspend/resume callbacks of portdrv.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: adjust "slot->state" for 5790a9c78e78 ("PCI: pciehp: Unify
controller and slot structs")]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Currently we try to keep PCIe ports runtime suspended over system suspend
if possible. This mostly happens when entering suspend-to-idle because
there is no need to re-configure wake settings.
This causes problems if the parent port goes into D3cold and it gets
resumed upon exit from system suspend. This may happen for example if the
port is part of PCIe switch and the same switch is connected to a PCIe
endpoint that needs to be resumed. The way exit from D3cold works according
PCIe 4.0 spec 5.3.1.4.2 is that power is restored and cold reset is
signaled. After this the device is in D0unitialized state keeping PME
context if it supports wake from D3cold.
The problem occurs when a PCIe hotplug port is left suspended and the
parent port goes into D3cold and back to D0: the port keeps its PME context
but since everything else is reset back to defaults (D0unitialized) it is
not set to detect hotplug events anymore.
For this reason change the PCIe portdrv power management logic so that it
is fine to keep the port runtime suspended over system suspend but it needs
to be resumed upon exit to make sure it gets properly re-initialized.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
PCIe native hotplug shares MSI vector with native PME so the interrupt
handler might get called even the hotplug interrupt is masked. In that case
we should not handle any events because the interrupt was not meant for us.
Modify the PCIe hotplug interrupt handler to check this accordingly and
bail out if it finds out that the interrupt was not about hotplug.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
|
|
When PCIe hotplug port is transitioned into D3hot, the link to the
downstream component will go down. If hotplug interrupt generation is
enabled when that happens, it will trigger immediately, waking up the
system and bringing the link back up.
To prevent this, disable hotplug interrupt generation when system suspend
is entered. This does not prevent wakeup from low power states according
to PCIe 4.0 spec section 6.7.3.4:
Software enables a hot-plug event to generate a wakeup event by
enabling software notification of the event as described in Section
6.7.3.1. Note that in order for software to disable interrupt generation
while keeping wakeup generation enabled, the Hot-Plug Interrupt Enable
bit must be cleared.
So as long as we have set the slot event mask accordingly, wakeup should
work even if slot interrupt is disabled. The port should trigger wake and
then send PME to the root port when the PCIe hierarchy is brought back up.
Limit this to systems using native PME mechanism to make sure older Apple
systems depending on commit e3354628c376 ("PCI: pciehp: Support interrupts
sent from D3hot") still continue working.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We enable power management automatically for bridges where
pci_bridge_d3_possible() returns true. However, these bridges may have
ACPI methods such as _DSW that need to be called before D3 entry. For
example in Lenovo Thinkpad X1 Carbon 6th _DSW method is used to prepare
D3cold for the PCIe root port hosting Thunderbolt chain. Because wake is
not enabled _DSW method is never called and the port does not enter
D3cold properly consuming more power than necessary.
Users can work this around by writing "enabled" to "wakeup" sysfs file
under the device in question but that is not something an ordinary user
is expected to do.
Since we already automatically enable power management for PCIe ports
with ->bridge_d3 set extend that to enable wake for them as well,
assuming the port has any ACPI wakeup related objects implemented in the
namespace (adev->wakeup.flags.valid is true). This ensures the necessary
ACPI methods get called at appropriate times and allows the root port in
Thinkpad X1 Carbon 6th to go into D3cold.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit baecc470d5fd ("PCI / PM: Skip bridges in pci_enable_wake()") changed
pci_enable_wake() so that all bridges are skipped when wakeup is enabled
(or disabled) with the reasoning that bridges can only signal wakeup on
behalf of their subordinate devices.
However, there are bridges that can signal wakeup themselves. For example
PCIe downstream and root ports supporting hotplug may signal wakeup upon
hotplug event.
For this reason change pci_enable_wake() so that it skips all bridges
except those that we power manage (->bridge_d3 is set). Those are the ones
that can go into low power states and may need to signal wakeup.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The spec has timing requirements when waiting for a link to become active
after a conventional reset. Implement those hard delays when waiting for
an active link so pciehp and dpc drivers don't need to duplicate this.
For devices that don't support data link layer active reporting, wait the
fixed time recommended by the PCIe spec.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
Bring surprise removals and permanent failures together so we no longer
need separate flags. The implementation enforces that error handling will
not be able to override a surprise removal's permanent channel failure.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
A device still participates in error recovery even if it doesn't have
the error callbacks.
Always provide the status for user event watchers.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
There is no point in having a generic broadcast function if it needs to
have special cases for each callback it broadcasts.
Abstract the error broadcast to only the necessary information and removes
the now unnecessary helper to walk the bus.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
PCIe r4.0, sec 7.5.1.1.4 defines a new bit in the Status Register:
Immediate Readiness – This optional bit, when Set, indicates the Function
is guaranteed to be ready to successfully complete valid configuration
accesses at any time following any reset that the host is capable of
issuing Configuration Requests to this Function.
When this bit is Set, for accesses to this Function, software is exempt
from all requirements to delay configuration accesses following any type
of reset, including but not limited to the timing requirements defined in
Section 6.6.
This means that all delays after a Conventional or Function Reset can be
skipped.
This patch reads such bit and caches its value in a flag inside struct
pci_dev to be checked later if we should delay or can skip delays after a
reset. While at that, also move the explicit msleep(100) call from
pcie_flr() and pci_af_flr() to pci_dev_wait().
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
[bhelgaas: rename PCI_STATUS_IMMEDIATE to PCI_STATUS_IMM_READY]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
If an Endpoint reported an error with ERR_FATAL, we previously ran driver
error recovery callbacks only for the Endpoint's driver. But if we reset a
Link to recover from the error, all downstream components are affected,
including the Endpoint, any multi-function peers, and children of those
peers.
Initiate the Link reset from the deepest Downstream Port that is
reliable, and call the error recovery callbacks for all its children.
If a Downstream Port (including a Root Port) reports an error, we assume
the Port itself is reliable and we need to reset its downstream Link. In
all other cases (Switch Upstream Ports, Endpoints, Bridges, etc), we assume
the Link leading to the component needs to be reset, so we initiate the
reset at the parent Downstream Port.
This allows two other clean-ups. First, we currently only use a Link
reset, which can only be initiated using a Downstream Port, so we can
remove checks for Endpoints. Second, the Downstream Port where we initiate
the Link reset is reliable (unlike components downstream from it), so the
special cases for error detect and resume are no longer necessary.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
We don't need to be paranoid about the topology changing while handling an
error. If the device has changed in a hotplug capable slot, we can rely on
the presence detection handling to react to a changing topology.
Restore the fatal error handling behavior that existed before merging DPC
with AER with 7e9084b36740 ("PCI/AER: Handle ERR_FATAL with removal and
re-enumeration of devices").
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
It is a serious driver defect to enable MSI or MSI-X more than once. Doing
so may panic the kernel as in the stack trace below:
Call Trace:
sysfs_add_one+0xa5/0xd0
create_dir+0x7c/0xe0
sysfs_create_subdir+0x1c/0x20
internal_create_group+0x6d/0x290
sysfs_create_groups+0x4a/0xa0
populate_msi_sysfs+0x1cd/0x210
pci_enable_msix+0x31c/0x3e0
igbuio_pci_open+0x72/0x300 [igb_uio]
uio_open+0xcc/0x120 [uio]
chrdev_open+0xa1/0x1e0
[...]
do_sys_open+0xf3/0x1f0
SyS_open+0x1e/0x20
system_call_fastpath+0x16/0x1b
---[ end trace 11042e2848880209 ]---
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffa056b4fa
We want to keep the WARN_ON() and stack trace so the driver can be fixed,
but we can avoid the kernel panic by returning an error. We may still get
warnings like this:
Call Trace:
pci_enable_msix+0x3c9/0x3e0
igbuio_pci_open+0x72/0x300 [igb_uio]
uio_open+0xcc/0x120 [uio]
chrdev_open+0xa1/0x1e0
[...]
do_sys_open+0xf3/0x1f0
SyS_open+0x1e/0x20
system_call_fastpath+0x16/0x1b
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:526 sysfs_add_one+0xa5/0xd0()
sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:01:00.1/msi_irqs'
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
[bhelgaas: changelog, fix patch whitespace, remove !!]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The secondary bus reset may have link side effects that a hotplug capable
port may incorrectly react to. Use the slot specific reset for hotplug
ports, fixing the undesirable link down-up handling during error
recovering.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: fold in
https://lore.kernel.org/linux-pci/20180926152326.14821-1-keith.busch@intel.com
for issue reported by Stephen Rothwell <sfr@canb.auug.org.au>]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
The AER driver has never read the config space of an endpoint that reported
a fatal error because the link to that device is considered unreliable.
An ERR_FATAL from an upstream port almost certainly indicates an error on
its upstream link, so we can't expect to reliably read its config space for
the same reason.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
Error handling may be running in parallel with a hot removal. Reference
count the device during AER handling so the device can not be freed while
AER wants to reference it.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
This patch provides DPC save and restore capabilities. This is necessary
for the driver to observe DPC events in the event the configuration space
needs to be restored after a reset.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
The port's config space may be cleared after a link reset, which wipes out
the bridge's bus and memory windows. Restore the config space that was
saved during probe so we can access downstream devices.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|
|
The PCI port driver saves the PCI state after initializing the device with
the applicable service devices. This was, however, before the service
drivers were even registered because PCI probe happens before the
device_initcall initialized those service drivers. The config space state
that the services set up were not being saved. The end result would cause
PCI devices to not react to events that the drivers think they did if the
PCI state ever needed to be restored.
Fix this by changing the service drivers from using the init calls to
having the portdrv driver calling the services directly. This will get the
state saved as desired, while making the relationship between the port
driver and the services under it more explicit in the code.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
|