Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull timer cleanups from Thomas Gleixner:
"Another set of timer API cleanups:
- Convert init_timer*(), try_to_del_timer_sync() and
destroy_timer_on_stack() over to the canonical timer_*()
namespace convention.
There is another large conversion pending, which has not been included
because it would have caused a gazillion of merge conflicts in next.
The conversion scripts will be run towards the end of the merge window
and a pull request sent once all conflict dependencies have been
merged"
* tag 'timers-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
treewide, timers: Rename destroy_timer_on_stack() as timer_destroy_on_stack()
treewide, timers: Rename try_to_del_timer_sync() as timer_delete_sync_try()
timers: Rename init_timers() as timers_init()
timers: Rename NEXT_TIMER_MAX_DELTA as TIMER_NEXT_MAX_DELTA
timers: Rename __init_timer_on_stack() as __timer_init_on_stack()
timers: Rename __init_timer() as __timer_init()
timers: Rename init_timer_on_stack_key() as timer_init_key_on_stack()
timers: Rename init_timer_key() as timer_init_key()
|
|
Pull MSI updates from Thomas Gleixner:
"Updates for the MSI subsystem (core code and PCI):
- Switch the MSI descriptor locking to lock guards
- Replace a broken and naive implementation of PCI/MSI-X control word
updates in the PCI/TPH driver with a properly serialized variant in
the PCI/MSI core code.
- Remove the MSI descriptor abuse in the SCCI/UFS/QCOM driver by
replacing the direct access to the MSI descriptors with the proper
API function calls. People will never understand that APIs exist
for a reason...
- Provide core infrastructre for the upcoming PCI endpoint library
extensions. Currently limited to ARM GICv3+, but in theory
extensible to other architectures.
- Provide a MSI domain::teardown() callback, which allows drivers to
undo the effects of the prepare() callback.
- Move the MSI domain::prepare() callback invocation to domain
creation time to avoid redundant (and in case of ARM/GIC-V3-ITS
confusing) invocations on every allocation.
In combination with the new teardown callback this removes some
ugly hacks in the GIC-V3-ITS driver, which pretended to work around
the short comings of the core code so far. With this update the
code is correct by design and implementation.
- Make the irqchip MSI library globally available, provide a MSI
parent domain creation helper and convert a bunch of (PCI/)MSI
drivers over to the modern MSI parent mechanism. This is the first
step to get rid of at least one incarnation of the three PCI/MSI
management schemes.
- The usual small cleanups and improvements"
* tag 'irq-msi-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
PCI/MSI: Use bool for MSI enable state tracking
PCI: tegra: Convert to MSI parent infrastructure
PCI: xgene: Convert to MSI parent infrastructure
PCI: apple: Convert to MSI parent infrastructure
irqchip/msi-lib: Honour the MSI_FLAG_NO_AFFINITY flag
irqchip/mvebu: Convert to msi_create_parent_irq_domain() helper
irqchip/gic: Convert to msi_create_parent_irq_domain() helper
genirq/msi: Add helper for creating MSI-parent irq domains
irqchip: Make irq-msi-lib.h globally available
irqchip/gic-v3-its: Use allocation size from the prepare call
genirq/msi: Engage the .msi_teardown() callback on domain removal
genirq/msi: Move prepare() call to per-device allocation
irqchip/gic-v3-its: Implement .msi_teardown() callback
genirq/msi: Add .msi_teardown() callback as the reverse of .msi_prepare()
irqchip/gic-v3-its: Add support for device tree msi-map and msi-mask
dt-bindings: PCI: pci-ep: Add support for iommu-map and msi-map
irqchip/gic-v3-its: Set IRQ_DOMAIN_FLAG_MSI_IMMUTABLE for ITS
irqdomain: Add IRQ_DOMAIN_FLAG_MSI_IMMUTABLE and irq_domain_is_msi_immutable()
platform-msi: Add msi_remove_device_irq_domain() in platform_device_msi_free_irqs_all()
genirq/msi: Rename msi_[un]lock_descs()
...
|
|
Pull irq cleanups from Thomas Gleixner:
"A set of cleanups for the generic interrupt subsystem:
- Consolidate on one set of functions for the interrupt domain code
to get rid of pointlessly duplicated code with only marginal
different semantics.
- Update the documentation accordingly and consolidate the coding
style of the irqdomain header"
* tag 'irq-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
irqdomain: Consolidate coding style
irqdomain: Fix kernel-doc and add it to Documentation
Documentation: irqdomain: Update it
Documentation: irq-domain.rst: Simple improvements
Documentation: irq/concepts: Minor improvements
Documentation: irq/concepts: Add commas and reflow
irqdomain: Improve kernel-docs of functions
irqdomain: Make struct irq_domain_info variables const
irqdomain: Use irq_domain_instantiate()'s return value as initializers
irqdomain: Drop irq_linear_revmap()
pinctrl: keembay: Switch to irq_find_mapping()
irqchip/armada-370-xp: Switch to irq_find_mapping()
gpu: ipu-v3: Switch to irq_find_mapping()
gpio: idt3243x: Switch to irq_find_mapping()
sh: Switch to irq_find_mapping()
powerpc: Switch to irq_find_mapping()
irqdomain: Drop irq_domain_add_*() functions
powerpc: Switch irq_domain_add_nomap() to use fwnode
thermal: Switch to irq_domain_create_linear()
soc: Switch to irq_domain_create_*()
...
|
|
Pull irq controller updates from Thomas Gleixner:
"Update for interrupt chip drivers:
- Convert the generic interrupt chip to lock guards to remove copy &
pasta boilerplate code and gotos.
- A new driver fot the interrupt controller in the EcoNet EN751221
MIPS SoC.
- Extend the SG2042-MSI driver to support the new SG2044 SoC
- Updates and cleanups for the (ancient) VT8500 driver
- Improve the scalability of the ARM GICV4.1 ITS driver by utilizing
node local copies a VM's interrupt translation table when possible.
This results in a 12% reduction of VM IPI latency in certain
workloads.
- The usual cleanups and improvements all over the place"
* tag 'irq-drivers-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
irqchip/irq-pruss-intc: Simplify chained interrupt handler setup
irqchip/gic-v4.1: Use local 4_1 ITS to generate VSGI
irqchip/econet-en751221: Switch to of_fwnode_handle()
irqchip/irq-vt8500: Switch to irq_domain_create_*()
irqchip/econet-en751221: Switch to irq_domain_create_linear()
irqchip/irq-vt8500: Use fewer global variables and add error handling
irqchip/irq-vt8500: Use a dedicated chained handler function
irqchip/irq-vt8500: Don't require 8 interrupts from a chained controller
irqchip/irq-vt8500: Drop redundant copy of the device node pointer
irqchip/irq-vt8500: Split up ack/mask functions
irqchip/sg2042-msi: Fix wrong type cast in sg2044_msi_irq_ack()
irqchip/sg2042-msi: Add the Sophgo SG2044 MSI interrupt controller
irqchip/sg2042-msi: Introduce configurable chipinfo for SG2042
irqchip/sg2042-msi: Rename functions and data structures to be SG2042 agnostic
dt-bindings: interrupt-controller: Add Sophgo SG2044 MSI controller
genirq/generic-chip: Fix incorrect lock guard conversions
genirq/generic-chip: Remove unused lock wrappers
irqchip: Convert generic irqchip locking to guards
gpio: mvebu: Convert generic irqchip locking to guard()
ARM: orion/gpio:: Convert generic irqchip locking to guard()
...
|
|
Bad MSI implementations multiplex MSIs onto a single downstream interrupt,
meaning they have no concept of individual affinity.
The old MSI code did a reasonable job at this by honouring the
MSI_FLAG_NO_AFFINITY, but the new shiny device MSI code doesn't.
Teach it about the sad reality of existing hardware.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513172819.2216709-7-maz@kernel.org
|
|
Switch the MVEBU family of interrupt chip drivers over to the common helper
function to create the interrupt domains.
[ tglx: Moved the struct out of the function call argument and fix up
the of_node_to_fwnode() instances ]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513172819.2216709-5-maz@kernel.org
|
|
Switch the GIC family of interrupt chip drivers over to the common helper
function to create the interrupt domains.
[ tglx: Moved the struct out of the function call argument ]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513172819.2216709-4-maz@kernel.org
|
|
Move irq-msi-lib.h into include/linux/irqchip, making it available
to compilation units outside of drivers/irqchip.
This requires some churn in drivers to fetch it from the new location,
generated using this script:
git grep -l -w \"irq-msi-lib.h\" | \
xargs sed -i -e 's:"irq-msi-lib.h":\<linux/irqchip/irq-msi-lib.h\>:'
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513172819.2216709-2-maz@kernel.org
|
|
irq_linear_revmap() is deprecated, so remove all its uses and supersede
them by an identical call to irq_find_mapping().
[ tglx: Fix up subject prefix ]
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250319092951.37667-47-jirislaby@kernel.org
|
|
irq_domain_add_*() interfaces are going away as being obsolete now.
Switch to the preferred irq_domain_create_*() ones. Those differ in the
node parameter: They take more generic struct fwnode_handle instead of
struct device_node. Therefore, of_fwnode_handle() is added around the
original parameter.
Note some of the users can likely use dev->fwnode directly instead of
indirect of_fwnode_handle(dev->of_node). But dev->fwnode is not
guaranteed to be set for all, so this has to be investigated on case to
case basis (by people who can actually test with the HW).
[ tglx: Fix up subject prefix ]
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Changhuang Liang <changhuang.liang@starfivetech.com>
Link: https://lore.kernel.org/all/20250319092951.37667-22-jirislaby@kernel.org
|
|
of_node_to_fwnode() is irqdomain's reimplementation of the "officially"
defined of_fwnode_handle(). The former is in the process of being
removed, so use the latter instead.
[ tglx: Fix up subject prefix ]
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Antonio Borneo <antonio.borneo@foss.st.com>
Reviewed-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Herve Codina <herve.codina@bootlin.com> # irq-lan966x-oic
Link: https://lore.kernel.org/all/20250319092951.37667-7-jirislaby@kernel.org
|
|
The chained interrupt handler setup installs the handler and handler data
with two function call.s
irq_set_chained_handler_and_data() can set both in one operation. Replace
the two calls with one.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250515083450.3811411-1-nichen@iscas.ac.cn
|
|
On multi-node GICv4.1 system, VSGI senders always use one certain 4_1 ITS,
because find_4_1_its() returns the first its_node in the list, regardless of
which node the VSGI sender is on. This brings guest VSGI performance drop
when VM is not running on the same node as this returned ITS.
On a 2-socket environment, each with one ITS and 32 cpu, GICv4.1 enabled,
4U8G guest, 4 vcpu is running on same socket.
When the VM is on socket0, kvm-unit-tests ipi_hw result is 850ns.
When the VM is on socket1, it is 750ns.
The reason is that the VSGI sender always uses the last reported ITS (that
on socket1) to inject VSGI. The access from a CPU to a other-socket ITS
will cost 100ns more compared to an access to the local ITS.
Using the local ITS results in a 12% reduction in IPI latency.
Modify find_4_1_its() to return the first per-CPU local_4_1_its, which is
initialized when the VPE table is inherited from the ITS or from another
CPU. If it fails to find a local 4_1 ITS, it returns any 4_1 ITS like
before.
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Nianyao Tang <tangnianyao@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250515145359.2795959-1-tangnianyao@huawei.com
|
|
When starting the local sync timer to synchronize the state of a remote
CPU it should be added on the CPU to be synchronized, not the initiating
CPU. This results in interrupt delivery being delayed until the timer
eventually runs (due to another mask/unmask/migrate operation) on the
target CPU.
Fixes: 0f67911e821c ("irqchip/riscv-imsic: Separate next and previous pointers in IMSIC vector")
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/all/20250514171320.3494917-1-abrestic@rivosinc.com
|
|
Commit 1c000dcaad2be ("irqchip/irq-msi-lib: Optionally set default
irq_eoi()/irq_ack()") added blanket MSI_CHIP_FLAG_SET_ACK flags,
irrespective of whether the underlying irqchip required it or not.
Drop it from a number of drivers that do not require it.
Fixes: 1c000dcaad2be ("irqchip/irq-msi-lib: Optionally set default irq_eoi()/irq_ack()")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513172819.2216709-6-maz@kernel.org
|
|
Now that .msi_prepare() gets called at the right time and not
with semi-random parameters, remove the ugly hack that tried
to fix up the number of allocated vectors.
It is now correct by construction.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513163144.2215824-6-maz@kernel.org
|
|
The ITS driver currently nukes the structure representing an endpoint
device translating via an ITS on freeing the last LPI allocated for it.
That's an unfortunate state of affair, as it is pretty common for a driver
to allocate a single MSI, do something clever, teardown this MSI, and
reallocate a whole bunch of them. The NVME driver does exactly that,
amongst others.
What happens in that case is that the core code is accidentaly issuing
another .msi_prepare() call, even if it shouldn't. This luckily cancels
the above behaviour and hides the problem.
In order to fix the core code, start by implementing the new
.msi_teardown() callback. Nothing calls it yet, so a side effect is that
the its_dev structure will not be freed and that the DID will stay
mapped. Not a big deal, and this will be solved in following patches.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513163144.2215824-3-maz@kernel.org
|
|
of_node_to_fwnode() is an irqdomain's reimplementation of the
"officially" defined of_fwnode_handle(). The former is in the process of
being removed, so use the latter instead.
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250513084739.2611747-1-jirislaby@kernel.org
|
|
Move this API to the canonical timer_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-9-mingo@kernel.org
|
|
Some platform devices create child devices dynamically and require the
parent device's msi-map to map device IDs to actual sideband information.
A typical use case is using ITS as a PCIe Endpoint Controller(EPC)'s
doorbell function, where PCI hosts send TLP memory writes to the EP
controller. The EP controller converts these writes to AXI transactions
and appends platform-specific sideband information.
EPC's DTS will provide such information by msi-map and msi-mask. A
simplified dts as
pcie-ep@10000000 {
...
msi-map = <0 &its 0xc 8>;
^^^ 0xc is implement defined sideband information,
which append to AXI write transaction.
^ 0 is function index.
msi-mask = <0x7>
}
Check msi-map if msi-parent missed to keep compatility with existing systems.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250414-ep-msi-v18-5-f69b49917464@nxp.com
|
|
Set the IRQ_DOMAIN_FLAG_MSI_IMMUTABLE flag for ITS, as it does not change
the address/data pair after setup.
Ensure compatibility with MSI users, such as PCIe Endpoint Doorbell, which
require the address/data pair to remain unchanged. Enable PCIe endpoints to
use ITS for triggering doorbells from the PCIe Root Complex (RC) side.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250414-ep-msi-v18-3-f69b49917464@nxp.com
|
|
irq_domain_add_*() interfaces are going away as being obsolete now.
Switch to the preferred irq_domain_create_*() ones. Those differ in the
node parameter: They take more generic struct fwnode_handle instead of
struct device_node. Therefore, of_fwnode_handle() is added around the
original parameter.
Note some of the users can likely use dev->fwnode directly instead of
indirect of_fwnode_handle(dev->of_node). But dev->fwnode is not
guaranteed to be set for all, so this has to be investigated on case to
case basis (by people who can actually test with the HW).
[ tglx: Split out from combo patch to avoid merge conflicts ]
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250319092951.37667-22-jirislaby@kernel.org
|
|
irq_domain_add_linear() is about to be removed. Switch to
irq_domain_create_linear().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Caleb James DeLisle <cjd@cjdns.fr>
Link: https://lore.kernel.org/all/877c2top39.ffs@tglx
|
|
Controller private data doesn't really need to be in a global
statically allocated array - kzalloc it per controller instead,
keeping only one pointer to the primary controller global.
While at that, also add proper error return statuses in the init
path and respective cleanup of resources on errors.
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250506-vt8500-intc-updates-v2-5-a3a0606cf92d@gmail.com
|
|
Current code for the chained interrupt controller maps its interrupts on
the parent but doesn't register a separate chained handler, instead
needlessly calling enable_irq() on an unactivated parent interrupt, causing
a boot time WARN_ON from the common code.
The common handler meanwhile loops through all registered interrupt
controllers in an arbitrary order and tries to handle active interrupts
in each of them, which is fragile.
Use common infrastructure for handling chained interrupts instead.
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250506-vt8500-intc-updates-v2-4-a3a0606cf92d@gmail.com
|
|
VT8500 chained controller can route its interrupts to either or all
of its 8 interrupt outputs. Current code actually routes all of them
to the first output, so there is no need to create mappings for all
eight.
Drop redundant checks, and only map as many chained controller
interrupts as are defined in the device tree.
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250506-vt8500-intc-updates-v2-3-a3a0606cf92d@gmail.com
|
|
Inside vt8500_irq_init(), np is the same as node. Drop it.
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250506-vt8500-intc-updates-v2-2-a3a0606cf92d@gmail.com
|
|
vt8500_irq_mask() really does the ACK for edge triggered interrupts and the
MASK for level triggered interrupts. Edge triggered interrupts never
really are masked as a result, and there is unnecessary reading of the
status register before the ACK even though it's write-one-to-clear.
Split it up into a proper standalone vt8500_irq_ack() and an unconditional
vt8500_irq_mask().
No Fixes tag added, as it has survived this way for 15 years and nobody
complained, so apparently nothing really used edge triggered interrupts
anyway.
[ tglx: Tabularize the irqchip struct initializer ]
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250506-vt8500-intc-updates-v2-1-a3a0606cf92d@gmail.com
|
|
On Qualcomm chipsets not all GPIOs are wakeup capable. Those GPIOs do not
have a corresponding MPM pin and should not be handled inside the MPM
driver. The IRQ domain hierarchy is always applied, so it's required to
explicitly disconnect the hierarchy for those. The pinctrl-msm driver marks
these with GPIO_NO_WAKE_IRQ. qcom-pdc has a check for this, but
irq-qcom-mpm is currently missing the check. This is causing crashes when
setting up interrupts for non-wake GPIOs:
root@rb1:~# gpiomon -c gpiochip1 10
irq: IRQ159: trimming hierarchy from :soc@0:interrupt-controller@f200000-1
Unable to handle kernel paging request at virtual address ffff8000a1dc3820
Hardware name: Qualcomm Technologies, Inc. Robotics RB1 (DT)
pc : mpm_set_type+0x80/0xcc
lr : mpm_set_type+0x5c/0xcc
Call trace:
mpm_set_type+0x80/0xcc (P)
qcom_mpm_set_type+0x64/0x158
irq_chip_set_type_parent+0x20/0x38
msm_gpio_irq_set_type+0x50/0x530
__irq_set_trigger+0x60/0x184
__setup_irq+0x304/0x6bc
request_threaded_irq+0xc8/0x19c
edge_detector_setup+0x260/0x364
linereq_create+0x420/0x5a8
gpio_ioctl+0x2d4/0x6c0
Fix this by copying the check for GPIO_NO_WAKE_IRQ from qcom-pdc.c, so that
MPM is removed entirely from the hierarchy for non-wake GPIOs.
Fixes: a6199bb514d8 ("irqchip: Add Qualcomm MPM controller driver")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexey Klimov <alexey.klimov@linaro.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250502-irq-qcom-mpm-fix-no-wake-v1-1-8a1eafcd28d4@linaro.org
|
|
With ACPI in place, gicv2m_get_fwnode() is registered with the pci
subsystem as pci_msi_get_fwnode_cb(), which may get invoked at runtime
during a PCI host bridge probe. But, the call back is wrongly marked as
__init, causing it to be freed, while being registered with the PCI
subsystem and could trigger:
Unable to handle kernel paging request at virtual address ffff8000816c0400
gicv2m_get_fwnode+0x0/0x58 (P)
pci_set_bus_msi_domain+0x74/0x88
pci_register_host_bridge+0x194/0x548
This is easily reproducible on a Juno board with ACPI boot.
Retain the function for later use.
Fixes: 0644b3daca28 ("irqchip/gic-v2m: acpi: Introducing GICv2m ACPI support")
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
|
|
The type cast in sg2044_msi_irq_ack() lost the __iomem attribute, which
makes the pointer type incorrect.
Add it back.
Fixes: e96b93a97c90 ("irqchip/sg2042-msi: Add the Sophgo SG2044 MSI interrupt controller")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/all/20250422003804.214264-1-inochiama@gmail.com
Closes: https://lore.kernel.org/oe-kbuild-all/202504211251.B3aesulq-lkp@intel.com/
|
|
The BCM2712 MIP driver is required for Raspberry PI5, but it's not
automatically enabled when ARCH_BCM2835 is enabled and depends on
ARCH_BRCMSTB.
ARCH_BCM2835 shares drivers with ARCH_BRCMSTB platforms, but Raspberry PI5
does not require the BRCMSTB specific drivers, which are selected via
ARCH_BRCMSTB.
Enable the interrupt controller for both ARCH_BRCMSTB and ARCH_BCM2835.
[ tglx: Massage changelog ]
Fixes: 32c6c054661a ("irqchip: Add Broadcom BCM2712 MSI-X interrupt controller")
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250416082523.179507-1-pbrobinson@gmail.com
|
|
A spurious TINT interrupt is seen during boot on RZ/G3E SMARC EVK.
A glitch in the edge detection circuit can cause a spurious interrupt.
Clear the status flag after setting the ICU_TSSRk registers, which is
recommended in the hardware manual as a countermeasure.
Fixes: 0d7605e75ac2 ("irqchip: Add RZ/V2H(P) Interrupt Control Unit (ICU) driver")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
|
|
Add support for Sophgo SG2044 MSI interrupt controller.
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Wang <wangchen20@iscas.ac.cn> # SG2042
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/all/20250413224922.69719-5-inochiama@gmail.com
|
|
As the controller on SG2044 uses different msi_parent_ops and a difffernt
irq_chip, it is necessary to provide that information to the probe function.
Add a new chipinfo structure to hold that information, implement the
necessary logic and make SG2042 use it.
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Wang <wangchen20@iscas.ac.cn> # SG2042
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/all/20250413224922.69719-4-inochiama@gmail.com
|
|
As the driver logic can be used in both SG2042 and SG2044 SoCs, rename
functions and data structures, which are not SG2042 specific, to SG204x*.
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Wang <wangchen20@iscas.ac.cn> # SG2042
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/all/20250413224922.69719-3-inochiama@gmail.com
|
|
The recently introduced msi_parent_ops::chip_flags sets irq_eoi()/irq_ack()
conditionally, but MIP driver has not been updated. Populate chip_flags
with EOI | ACK flags.
Fixes: 32c6c054661a ("irqchip: Add Broadcom BCM2712 MSI-X interrupt controller")
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250407125918.3021454-1-svarbanov@suse.de
|
|
Conversion was done with Coccinelle and a few manual fixups.
In a few interrupt chip callbacks this changes replaces
raw_spin_lock_irqsave() with a guard(raw_spinlock). That's intended and
correct because those interrupt chip callbacks are invoked with the
interrupt descriptor lock held and interrupts disabled. No point in using
the irqsave variant.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/all/20250313142524.325627746@linutronix.de
|
|
Add a driver for the interrupt controller in the EcoNet EN751221 MIPS SoC.
Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250330170306.2584136-4-cjd@cjdns.fr
|
|
The sg2042-msi driver uses the fallback callbacks set by
msi_lib_init_dev_msi_info(). commit 1c000dcaad2b ("irqchip/irq-msi-lib:
Optionally set default irq_eoi()/irq_ack()") changed the behavior of the
fallback mechanism by making it opt-in.
The sg2042-msi was not fixed up for this, which causes a NULL pointer
dereference due to the missing irq_ack() callback.
Add the missing chip flag to msi_parent_ops.
Fixes: c66741549424 ("irqchip: Add the Sophgo SG2042 MSI interrupt controller")
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250217085657.789309-3-apatel@ventanamicro.com
Link: https://lore.kernel.org/all/20250405055625.1530180-1-inochiama@gmail.com
|
|
Naming interrupt domains host is confusing at best and the irqdomain code
uses both domain and host inconsistently.
Therefore rename irq_get_default_host() to irq_get_default_domain().
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250319092951.37667-4-jirislaby@kernel.org
|
|
Naming interrupt domains host is confusing at best and the irqdomain code
uses both domain and host inconsistently.
Therefore rename irq_set_default_host() to irq_set_default_domain().
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250319092951.37667-3-jirislaby@kernel.org
|
|
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Enable Configuration RRS SV, which makes device readiness visible,
early instead of during child bus scanning (Bjorn Helgaas)
- Log debug messages about reset methods being used (Bjorn Helgaas)
- Avoid reset when it has been disabled via sysfs (Nishanth
Aravamudan)
- Add common pci-ep-bus.yaml schema for exporting several peripherals
of a single PCI function via devicetree (Andrea della Porta)
- Create DT nodes for PCI host bridges to enable loading device tree
overlays to create platform devices for PCI devices that have
several features that require multiple drivers (Herve Codina)
Resource management:
- Enlarge devres table[] to accommodate bridge windows, ROM, IOV
BARs, etc., and validate BAR index in devres interfaces (Philipp
Stanner)
- Fix typo that repeatedly distributed resources to a bridge instead
of iterating over subordinate bridges, which resulted in too little
space to assign some BARs (Kai-Heng Feng)
- Relax bridge window tail sizing for optional resources, e.g., IOV
BARs, to avoid failures when removing and re-adding devices (Ilpo
Järvinen)
- Allow drivers to enable devices even if we haven't assigned
optional IOV resources to them (Ilpo Järvinen)
- Rework handling of optional resources (IOV BARs, ROMs) to reduce
failures if we can't allocate them (Ilpo Järvinen)
- Fix a NULL dereference in the SR-IOV VF creation error path (Shay
Drory)
- Fix s390 mmio_read/write syscalls, which didn't cause page faults
in some cases, which broke vfio-pci lazy mapping on first access
(Niklas Schnelle)
- Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which
was disabled only for s390 (Niklas Schnelle)
- Support mmap of PCI resources on s390 except for ISM devices
(Niklas Schnelle)
ASPM:
- Delay pcie_link_state deallocation to avoid dangling pointers that
cause invalid references during hot-unplug (Daniel Stodden)
Power management:
- Allow PCI bridges to go to D3Hot when suspending on all non-x86
systems (Manivannan Sadhasivam)
Power control:
- Create pwrctrl devices in pci_scan_device() to make it more
symmetric with pci_pwrctrl_unregister() and make pwrctrl devices
for PCI bridges possible (Manivannan Sadhasivam)
- Unregister pwrctrl devices in pci_destroy_dev() so DOE, ASPM, etc.
can still access devices after pci_stop_dev() (Manivannan
Sadhasivam)
- If there's a pwrctrl device for a PCI device, skip scanning it
because the pwrctrl core will rescan the bus after the device is
powered on (Manivannan Sadhasivam)
- Add a pwrctrl driver for PCI slots based on voltage regulators
described via devicetree (Manivannan Sadhasivam)
Bandwidth control:
- Add set_pcie_speed.sh to TEST_PROGS to fix issue when executing the
set_pcie_cooling_state.sh test case (Yi Lai)
- Avoid a NULL pointer dereference when we run out of bus numbers to
assign for a bridge secondary bus (Lukas Wunner)
Hotplug:
- Drop superfluous pci_hotplug_slot_list, try_module_get() calls, and
NULL pointer checks (Lukas Wunner)
- Drop shpchp module init/exit logging, replace shpchp dbg() with
ctrl_dbg(), and remove unused dbg(), err(), info(), warn() wrappers
(Ilpo Järvinen)
- Drop 'shpchp_debug' module parameter in favor of standard dynamic
debugging (Ilpo Järvinen)
- Drop unused cpcihp .get_power(), .set_power() function pointers
(Guilherme Giacomo Simoes)
- Disable hotplug interrupts in portdrv only when pciehp is not
enabled to avoid issuing two hotplug commands too close together
(Feng Tang)
- Skip pciehp 'device replaced' check if the device has been removed
to address a deadlock when resuming after a device was removed
during system sleep (Lukas Wunner)
- Don't enable pciehp hotplug interupt when resuming in poll mode
(Ilpo Järvinen)
Virtualization:
- Fix bugs in 'pci=config_acs=' kernel command line parameter (Tushar
Dave)
DOE:
- Expose supported DOE features via sysfs (Alistair Francis)
- Allow DOE support to be enabled even if CXL isn't enabled (Alistair
Francis)
Endpoint framework:
- Convert PCI device data so pci-epf-test works correctly on
big-endian endpoint systems (Niklas Cassel)
- Add BAR_RESIZABLE type to endpoint framework and add DWC core
support for EPF drivers to set BAR_RESIZABLE type and size (Niklas
Cassel)
- Fix pci-epf-test double free that causes an oops if the host
reboots and PERST# deassertion restarts endpoint BAR allocation
(Christian Bruel)
- Fix endpoint BAR testing so tests can skip disabled BARs instead of
reporting them as failures (Niklas Cassel)
- Widen endpoint test BAR size variable to accommodate BARs larger
than INT_MAX (Niklas Cassel)
- Remove unused tools 'pci' build target left over after moving tests
to tools/testing/selftests/pci_endpoint (Jianfeng Liu)
Altera PCIe controller driver:
- Add DT binding and driver support for Agilex family (P-Tile,
F-Tile, R-Tile) (Matthew Gerlach and D M, Sharath Kumar)
AMD MDB PCIe controller driver:
- Add DT binding and driver for AMD MDB (Multimedia DMA Bridge)
(Thippeswamy Havalige)
Broadcom STB PCIe controller driver:
- Add BCM2712 MSI-X DT binding and interrupt controller drivers and
add softdep on irq_bcm2712_mip driver to ensure that it is loaded
first (Stanimir Varbanov)
- Expand inbound window map to 64GB so it can accommodate BCM2712
(Stanimir Varbanov)
- Add BCM2712 support and DT updates (Stanimir Varbanov)
- Apply link speed restriction before bringing link up, not after
(Jim Quinlan)
- Update Max Link Speed in Link Capabilities via the internal
writable register, not the read-only config register (Jim Quinlan)
- Handle regulator_bulk_get() error to avoid panic when we call
regulator_bulk_free() later (Jim Quinlan)
- Disable regulators only when removing the bus immediately below a
Root Port because we don't support regulators deeper in the
hierarchy (Jim Quinlan)
- Make const read-only arrays static (Colin Ian King)
Cadence PCIe endpoint driver:
- Correct MSG TLP generation so endpoints can generate INTx messages
(Hans Zhang)
Freescale i.MX6 PCIe controller driver:
- Identify the second controller on i.MX8MQ based on devicetree
'linux,pci-domain' instead of DBI 'reg' address (Richard Zhu)
- Remove imx_pcie_cpu_addr_fixup() since dwc core can now derive the
ATU input address (using parent_bus_offset) from devicetree (Frank
Li)
Freescale Layerscape PCIe controller driver:
- Drop deprecated 'num-ib-windows' and 'num-ob-windows' and
unnecessary 'status' from example (Krzysztof Kozlowski)
- Correct the syscon_regmap_lookup_by_phandle_args("fsl,pcie-scfg")
arg_count to fix probe failure on LS1043A (Ioana Ciornei)
HiSilicon STB PCIe controller driver:
- Call phy_exit() to clean up if histb_pcie_probe() fails (Christophe
JAILLET)
Intel Gateway PCIe controller driver:
- Remove intel_pcie_cpu_addr() since dwc core can now derive the ATU
input address (using parent_bus_offset) from devicetree (Frank Li)
Intel VMD host bridge driver:
- Convert vmd_dev.cfg_lock from spinlock_t to raw_spinlock_t so
pci_ops.read() will never sleep, even on PREEMPT_RT where
spinlock_t becomes a sleepable lock, to avoid calling a sleeping
function from invalid context (Ryo Takakura)
MediaTek PCIe Gen3 controller driver:
- Remove leftover mac_reset assert for Airoha EN7581 SoC (Lorenzo
Bianconi)
- Add EN7581 PBUS controller 'mediatek,pbus-csr' DT property and
program host bridge memory aperture to this syscon node (Lorenzo
Bianconi)
Qualcomm PCIe controller driver:
- Add qcom,pcie-ipq5332 binding (Varadarajan Narayanan)
- Add qcom i.MX8QM and i.MX8QXP/DXP optional DMA interrupt (Alexander
Stein)
- Add optional dma-coherent DT property for Qualcomm SA8775P (Dmitry
Baryshkov)
- Make DT iommu property required for SA8775P and prohibited for
SDX55 (Dmitry Baryshkov)
- Add DT IOMMU and DMA-related properties for Qualcomm SM8450 (Dmitry
Baryshkov)
- Add endpoint DT properties for SAR2130P and enable endpoint mode in
driver (Dmitry Baryshkov)
- Describe endpoint BAR0 and BAR2 as 64-bit only and BAR1 and BAR3 as
RESERVED (Manivannan Sadhasivam)
Rockchip DesignWare PCIe controller driver:
- Describe rk3568 and rk3588 BARs as Resizable, not Fixed (Niklas
Cassel)
Synopsys DesignWare PCIe controller driver:
- Add debugfs-based Silicon Debug, Error Injection, Statistical
Counter support for DWC (Shradha Todi)
- Add debugfs property to expose LTSSM status of DWC PCIe link (Hans
Zhang)
- Add Rockchip support for DWC debugfs features (Niklas Cassel)
- Add dw_pcie_parent_bus_offset() to look up the parent bus address
of a specified 'reg' property and return the offset from the CPU
physical address (Frank Li)
- Use dw_pcie_parent_bus_offset() to derive CPU -> ATU addr offset
via 'reg[config]' for host controllers and 'reg[addr_space]' for
endpoint controllers (Frank Li)
- Apply struct dw_pcie.parent_bus_offset in ATU users to remove use
of .cpu_addr_fixup() when programming ATU (Frank Li)
TI J721E PCIe driver:
- Correct the 'link down' interrupt bit for J784S4 (Siddharth
Vadapalli)
TI Keystone PCIe controller driver:
- Describe AM65x BARs 2 and 5 as Resizable (not Fixed) and reduce
alignment requirement from 1MB to 64KB (Niklas Cassel)
Xilinx Versal CPM PCIe controller driver:
- Free IRQ domain in probe error path to avoid leaking it
(Thippeswamy Havalige)
- Add DT .compatible "xlnx,versal-cpm5nc-host" and driver support for
Versal Net CPM5NC Root Port controller (Thippeswamy Havalige)
- Add driver support for CPM5_HOST1 (Thippeswamy Havalige)
Miscellaneous:
- Convert fsl,mpc83xx-pcie binding to YAML (J. Neuschäfer)
- Use for_each_available_child_of_node_scoped() to simplify apple,
kirin, mediatek, mt7621, tegra drivers (Zhang Zekun)"
* tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (197 commits)
PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args()
PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4
misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO
PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register
PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts
PCI: endpoint: Add intx_capable to epc_features struct
dt-bindings: PCI: Add common schema for devices accessible through PCI BARs
PCI: intel-gw: Remove intel_pcie_cpu_addr()
PCI: imx6: Remove imx_pcie_cpu_addr_fixup()
PCI: dwc: Use parent_bus_offset to remove need for .cpu_addr_fixup()
PCI: dwc: ep: Ensure proper iteration over outbound map windows
PCI: dwc: ep: Use devicetree 'reg[addr_space]' to derive CPU -> ATU addr offset
PCI: dwc: ep: Consolidate devicetree handling in dw_pcie_ep_get_resources()
PCI: dwc: ep: Call epc_create() early in dw_pcie_ep_init()
PCI: dwc: Use devicetree 'reg[config]' to derive CPU -> ATU addr offset
PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug
PCI: dwc: Add dw_pcie_parent_bus_offset()
PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion
PCI: xilinx-cpm: Add cpm_csr register mapping for CPM5_HOST1 variant
PCI: brcmstb: Make const read-only arrays static
...
|
|
Pull SoC driver updates from Arnd Bergmann:
"These are the updates for SoC specific drivers and related subsystems:
- Firmware driver updates for SCMI, FF-A and SMCCC firmware
interfaces, adding support for additional firmware features
including SoC identification and FF-A SRI callbacks as well as
various bugfixes
- Memory controller updates for Nvidia and Mediatek
- Reset controller support for microchip sam9x7 and imx8qxp/imx8qm
- New hardware support for multiple Mediatek, Renesas and Samsung
Exynos chips
- Minor updates on Zynq, Qualcomm, Amlogic, TI, Samsung, Nvidia and
Apple chips
There will be a follow up with a few more driver updates that are
still causing build regressions at the moment"
* tag 'soc-drivers-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (97 commits)
irqchip: Add support for Amlogic A4 and A5 SoCs
dt-bindings: interrupt-controller: Add support for Amlogic A4 and A5 SoCs
reset: imx: fix incorrect module device table
dt-bindings: power: qcom,kpss-acc-v2: add qcom,msm8916-acc compatible
bus: qcom-ssc-block-bus: Fix the error handling path of qcom_ssc_block_bus_probe()
bus: qcom-ssc-block-bus: Remove some duplicated iounmap() calls
soc: qcom: pd-mapper: Add support for SDM630/636
reset: imx: Add SCU reset driver for i.MX8QXP and i.MX8QM
dt-bindings: firmware: imx: add property reset-controller
dt-bindings: reset: atmel,at91sam9260-reset: add sam9x7
memory: mtk-smi: Add ostd setting for mt8192
dt-bindings: soc: samsung: exynos-usi: Drop unnecessary status from example
firmware: tegra: bpmp: Fix typo in bpmp-abi.h
soc/tegra: pmc: Use str_enable_disable-like helpers
soc: samsung: include linux/array_size.h where needed
firmware: arm_scmi: use ioread64() instead of ioread64_hi_lo()
soc: mediatek: mtk-socinfo: Add extra entry for MT8395AV/ZA Genio 1200
soc: mediatek: mt8188-mmsys: Add support for DSC on VDO0
soc: mediatek: mmsys: Migrate all tables to MMSYS_ROUTE() macro
soc: mediatek: mt8365-mmsys: Fix routing table masks and values
...
|
|
Pull iommu updates from Joerg Roedel:
"Core iommufd dependencies from Jason:
- Change the iommufd fault handle into an always present hwpt handle
in the domain
- Give iommufd its own SW_MSI implementation along with some IRQ
layer rework
- Improvements to the handle attach API
Core fixes for probe-issues from Robin
Intel VT-d changes:
- Checking for SVA support in domain allocation and attach paths
- Move PCI ATS and PRI configuration into probe paths
- Fix a pentential hang on reboot -f
- Miscellaneous cleanups
AMD-Vi changes:
- Support for up to 2k IRQs per PCI device function
- Set of smaller fixes
ARM-SMMU changes:
- SMMUv2 devicetree binding updates for Qualcomm implementations
(QCS8300 GPU and MSM8937)
- Clean up SMMUv2 runtime PM implementation to help with wider rework
of pm_runtime_put_autosuspend()
Rockchip driver changes:
- Driver adjustments for recent DT probing changes
S390 IOMMU changes:
- Support for IOMMU passthrough
Apple Dart changes:
- Driver adjustments to meet ISP device requirements
- Null-ptr deref fix
- Disable subpage protection for DART 1"
* tag 'iommu-updates-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (54 commits)
iommu/vt-d: Fix possible circular locking dependency
iommu/vt-d: Don't clobber posted vCPU IRTE when host IRQ affinity changes
iommu/vt-d: Put IRTE back into posted MSI mode if vCPU posting is disabled
iommu: apple-dart: fix potential null pointer deref
iommu/rockchip: Retire global dma_dev workaround
iommu/rockchip: Register in a sensible order
iommu/rockchip: Allocate per-device data sensibly
iommu/mediatek-v1: Support COMPILE_TEST
iommu/amd: Enable support for up to 2K interrupts per function
iommu/amd: Rename DTE_INTTABLEN* and MAX_IRQS_PER_TABLE macro
iommu/amd: Replace slab cache allocator with page allocator
iommu/amd: Introduce generic function to set multibit feature value
iommu: Don't warn prematurely about dodgy probes
iommu/arm-smmu: Set rpm auto_suspend once during probe
dt-bindings: arm-smmu: Document QCS8300 GPU SMMU
iommu: Get DT/ACPI parsing into the proper probe path
iommu: Keep dev->iommu state consistent
iommu: Resolve ops in iommu_init_device()
iommu: Handle race with default domain setup
iommu: Unexport iommu_fwspec_free()
...
|
|
Pull kvm updates from Paolo Bonzini:
"ARM:
- Nested virtualization support for VGICv3, giving the nested
hypervisor control of the VGIC hardware when running an L2 VM
- Removal of 'late' nested virtualization feature register masking,
making the supported feature set directly visible to userspace
- Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
- Paravirtual interface for discovering the set of CPU
implementations where a VM may run, addressing a longstanding issue
of guest CPU errata awareness in big-little systems and
cross-implementation VM migration
- Userspace control of the registers responsible for identifying a
particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
allowing VMs to be migrated cross-implementation
- pKVM updates, including support for tracking stage-2 page table
allocations in the protected hypervisor in the 'SecPageTable' stat
- Fixes to vPMU, ensuring that userspace updates to the vPMU after
KVM_RUN are reflected into the backing perf events
LoongArch:
- Remove unnecessary header include path
- Assume constant PGD during VM context switch
- Add perf events support for guest VM
RISC-V:
- Disable the kernel perf counter during configure
- KVM selftests improvements for PMU
- Fix warning at the time of KVM module removal
x86:
- Add support for aging of SPTEs without holding mmu_lock.
Not taking mmu_lock allows multiple aging actions to run in
parallel, and more importantly avoids stalling vCPUs. This includes
an implementation of per-rmap-entry locking; aging the gfn is done
with only a per-rmap single-bin spinlock taken, whereas locking an
rmap for write requires taking both the per-rmap spinlock and the
mmu_lock.
Note that this decreases slightly the accuracy of accessed-page
information, because changes to the SPTE outside aging might not
use atomic operations even if they could race against a clear of
the Accessed bit.
This is deliberate because KVM and mm/ tolerate false
positives/negatives for accessed information, and testing has shown
that reducing the latency of aging is far more beneficial to
overall system performance than providing "perfect" young/old
information.
- Defer runtime CPUID updates until KVM emulates a CPUID instruction,
to coalesce updates when multiple pieces of vCPU state are
changing, e.g. as part of a nested transition
- Fix a variety of nested emulation bugs, and add VMX support for
synthesizing nested VM-Exit on interception (instead of injecting
#UD into L2)
- Drop "support" for async page faults for protected guests that do
not set SEND_ALWAYS (i.e. that only want async page faults at CPL3)
- Bring a bit of sanity to x86's VM teardown code, which has
accumulated a lot of cruft over the years. Particularly, destroy
vCPUs before the MMU, despite the latter being a VM-wide operation
- Add common secure TSC infrastructure for use within SNP and in the
future TDX
- Block KVM_CAP_SYNC_REGS if guest state is protected. It does not
make sense to use the capability if the relevant registers are not
available for reading or writing
- Don't take kvm->lock when iterating over vCPUs in the suspend
notifier to fix a largely theoretical deadlock
- Use the vCPU's actual Xen PV clock information when starting the
Xen timer, as the cached state in arch.hv_clock can be stale/bogus
- Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across
different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as
KVM's suspend notifier only accounts for kvmclock, and there's no
evidence that the flag is actually supported by Xen guests
- Clean up the per-vCPU "cache" of its reference pvclock, and instead
only track the vCPU's TSC scaling (multipler+shift) metadata (which
is moderately expensive to compute, and rarely changes for modern
setups)
- Don't write to the Xen hypercall page on MSR writes that are
initiated by the host (userspace or KVM) to fix a class of bugs
where KVM can write to guest memory at unexpected times, e.g.
during vCPU creation if userspace has set the Xen hypercall MSR
index to collide with an MSR that KVM emulates
- Restrict the Xen hypercall MSR index to the unofficial synthetic
range to reduce the set of possible collisions with MSRs that are
emulated by KVM (collisions can still happen as KVM emulates
Hyper-V MSRs, which also reside in the synthetic range)
- Clean up and optimize KVM's handling of Xen MSR writes and
xen_hvm_config
- Update Xen TSC leaves during CPUID emulation instead of modifying
the CPUID entries when updating PV clocks; there is no guarantee PV
clocks will be updated between TSC frequency changes and CPUID
emulation, and guest reads of the TSC leaves should be rare, i.e.
are not a hot path
x86 (Intel):
- Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and
thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1
- Pass XFD_ERR as the payload when injecting #NM, as a preparatory
step for upcoming FRED virtualization support
- Decouple the EPT entry RWX protection bit macros from the EPT
Violation bits, both as a general cleanup and in anticipation of
adding support for emulating Mode-Based Execution Control (MBEC)
- Reject KVM_RUN if userspace manages to gain control and stuff
invalid guest state while KVM is in the middle of emulating nested
VM-Enter
- Add a macro to handle KVM's sanity checks on entry/exit VMCS
control pairs in anticipation of adding sanity checks for secondary
exit controls (the primary field is out of bits)
x86 (AMD):
- Ensure the PSP driver is initialized when both the PSP and KVM
modules are built-in (the initcall framework doesn't handle
dependencies)
- Use long-term pins when registering encrypted memory regions, so
that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and
don't lead to excessive fragmentation
- Add macros and helpers for setting GHCB return/error codes
- Add support for Idle HLT interception, which elides interception if
the vCPU has a pending, unmasked virtual IRQ when HLT is executed
- Fix a bug in INVPCID emulation where KVM fails to check for a
non-canonical address
- Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is
invalid, e.g. because the vCPU was "destroyed" via SNP's AP
Creation hypercall
- Reject SNP AP Creation if the requested SEV features for the vCPU
don't match the VM's configured set of features
Selftests:
- Fix again the Intel PMU counters test; add a data load and do
CLFLUSH{OPT} on the data instead of executing code. The theory is
that modern Intel CPUs have learned new code prefetching tricks
that bypass the PMU counters
- Fix a flaw in the Intel PMU counters test where it asserts that an
event is counting correctly without actually knowing what the event
counts on the underlying hardware
- Fix a variety of flaws, bugs, and false failures/passes
dirty_log_test, and improve its coverage by collecting all dirty
entries on each iteration
- Fix a few minor bugs related to handling of stats FDs
- Add infrastructure to make vCPU and VM stats FDs available to tests
by default (open the FDs during VM/vCPU creation)
- Relax an assertion on the number of HLT exits in the xAPIC IPI test
when running on a CPU that supports AMD's Idle HLT (which elides
interception of HLT if a virtual IRQ is pending and unmasked)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits)
RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed
RISC-V: KVM: Teardown riscv specific bits after kvm_exit
LoongArch: KVM: Register perf callbacks for guest
LoongArch: KVM: Implement arch-specific functions for guest perf
LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel()
LoongArch: KVM: Remove PGD saving during VM context switch
LoongArch: KVM: Remove unnecessary header include path
KVM: arm64: Tear down vGIC on failed vCPU creation
KVM: arm64: PMU: Reload when resetting
KVM: arm64: PMU: Reload when user modifies registers
KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs
KVM: arm64: PMU: Assume PMU presence in pmu-emul.c
KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
KVM: arm64: Factor out pKVM hyp vcpu creation to separate function
KVM: arm64: Initialize HCRX_EL2 traps in pKVM
KVM: arm64: Factor out setting HCRX_EL2 traps into separate function
KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected
KVM: x86: Add infrastructure for secure TSC
KVM: x86: Push down setting vcpu.arch.user_set_tsc
...
|
|
Pull irq driver updates from Thomas Gleixner:
- Support for hard indices on RISC-V. The hart index identifies a hart
(core) within a specific interrupt domain in RISC-V's Priviledged
Architecture.
- Rework of the RISC-V MSI driver
This moves the driver over to the generic MSI library and solves the
affinity problem of unmaskable PCI/MSI controllers. Unmaskable
PCI/MSI controllers are prone to lose interrupts when the MSI message
is updated to change the affinity because the message write consists
of three 32-bit subsequent writes, which update address and data. As
these writes are non-atomic versus the device raising an interrupt,
the device can observe a half written update and issue an interrupt
on the wrong vector. This is mitiated by a carefully orchestrated
step by step update and the observation of an eventually pending
interrupt on the CPU which issues the update. The algorithm follows
the well established method of the X86 MSI driver.
- A new driver for the RISC-V Sophgo SG2042 MSI controller
- Overhaul of the Renesas RZQ2L driver
Simplification of the probe function by using devm_*() mechanisms,
which avoid the endless list of error prone gotos in the failure
paths.
- Expand the Renesas RZV2H driver to support RZ/G3E SoCs
- A workaround for Rockchip 3568002 erratum in the GIC-V3 driver to
ensure that the addressing is limited to the lower 32-bit of the
physical address space.
- Add support for the Allwinner AS23 NMI controller
- Expand the IMX irqsteer driver to handle up to 960 input interrupts
- The usual small updates, cleanups and device tree changes
* tag 'irq-drivers-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
irqchip/imx-irqsteer: Support up to 960 input interrupts
irqchip/sunxi-nmi: Support Allwinner A523 NMI controller
dt-bindings: irq: sun7i-nmi: Document the Allwinner A523 NMI controller
irqchip/davinci-cp-intc: Remove public header
irqchip/renesas-rzv2h: Add RZ/G3E support
irqchip/renesas-rzv2h: Update macros ICU_TSSR_TSSEL_{MASK,PREP}
irqchip/renesas-rzv2h: Update TSSR_TIEN macro
irqchip/renesas-rzv2h: Add field_width to struct rzv2h_hw_info
irqchip/renesas-rzv2h: Add max_tssel to struct rzv2h_hw_info
irqchip/renesas-rzv2h: Add struct rzv2h_hw_info with t_offs variable
irqchip/renesas-rzv2h: Use devm_pm_runtime_enable()
irqchip/renesas-rzv2h: Use devm_reset_control_get_exclusive_deasserted()
irqchip/renesas-rzv2h: Simplify rzv2h_icu_init()
irqchip/renesas-rzv2h: Drop irqchip from struct rzv2h_icu_priv
irqchip/renesas-rzv2h: Fix wrong variable usage in rzv2h_tint_set_type()
dt-bindings: interrupt-controller: renesas,rzv2h-icu: Document RZ/G3E SoC
riscv: sophgo: dts: Add msi controller for SG2042
irqchip: Add the Sophgo SG2042 MSI interrupt controller
dt-bindings: interrupt-controller: Add Sophgo SG2042 MSI
arm64: dts: rockchip: rk356x: Move PCIe MSI to use GIC ITS instead of MBI
...
|
|
|
|
Amlogic drivers changes for v6.15:
- GPIO interrupt controller support for Amlogic A4 and A5 SoCs
* tag 'amlogic-drivers-for-v6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux:
irqchip: Add support for Amlogic A4 and A5 SoCs
dt-bindings: interrupt-controller: Add support for Amlogic A4 and A5 SoCs
Link: https://lore.kernel.org/r/eeaa8d3b-4fc3-4dae-92b8-0fc590e1a070@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
* kvm-arm64/nv-vgic:
: NV VGICv3 support, courtesy of Marc Zyngier
:
: Support for emulating the GIC hypervisor controls and managing shadow
: VGICv3 state for the L1 hypervisor. As part of it, bring in support for
: taking IRQs to the L1 and UAPI to manage the VGIC maintenance interrupt.
KVM: arm64: nv: Fail KVM init if asking for NV without GICv3
KVM: arm64: nv: Allow userland to set VGIC maintenance IRQ
KVM: arm64: nv: Fold GICv3 host trapping requirements into guest setup
KVM: arm64: nv: Propagate used_lrs between L1 and L0 contexts
KVM: arm64: nv: Request vPE doorbell upon nested ERET to L2
KVM: arm64: nv: Respect virtual HCR_EL2.TWx setting
KVM: arm64: nv: Add Maintenance Interrupt emulation
KVM: arm64: nv: Handle L2->L1 transition on interrupt injection
KVM: arm64: nv: Nested GICv3 emulation
KVM: arm64: nv: Sanitise ICH_HCR_EL2 accesses
KVM: arm64: nv: Plumb handling of GICv3 EL2 accesses
KVM: arm64: nv: Add ICH_*_EL2 registers to vpcu_sysreg
KVM: arm64: nv: Load timer before the GIC
arm64: sysreg: Add layout for ICH_MISR_EL2
arm64: sysreg: Add layout for ICH_VTR_EL2
arm64: sysreg: Add layout for ICH_HCR_EL2
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|