|
A recent fix introduced viommu_capable(), but other changes
from Robin change the function signature of the callback it
is used for.
When both changes are merged, a compile error occurs because
the function pointer types do not match. Fix that by updating
the viommu_capable() signature after the merge.
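A minimal sketch of the post-merge fixup, assuming the callback gains a
struct device * argument as in Robin's series (the exact parameter list
is whatever that series defines):
  /* before: matched the old iommu_ops->capable prototype */
  static bool viommu_capable(enum iommu_cap cap);
  /* after: matches the reworked prototype, fixing the pointer-type mismatch */
  static bool viommu_capable(struct device *dev, enum iommu_cap cap);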
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20220907151154.21911-1-joro@8bytes.org
|
|
The iovad->rcaches check in iova_rcache_get() is pretty much useless
without the same check in iova_rcache_insert().
Instead of adding this symmetric check to fastpath iova_rcache_insert(),
drop the check in iova_rcache_get() in favour of making the IOVA domain
rcache init more robust to failure in future.
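A standalone sketch of the idea, with illustrative names rather than the
driver's code: once init either fully succeeds or fails the whole domain
setup, the fastpath can assume rcaches is valid.
  #include <stdlib.h>

  struct rcache { unsigned long cached_pfn; };
  struct iova_domain_sketch { struct rcache *rcaches; };

  /* robust init: on failure the domain must not be used at all */
  static int rcache_init(struct iova_domain_sketch *d, size_t n)
  {
          d->rcaches = calloc(n, sizeof(*d->rcaches));
          return d->rcaches ? 0 : -1;
  }

  /* fastpath get: the 'if (!d->rcaches) return 0;' guard is gone */
  static unsigned long rcache_get(struct iova_domain_sketch *d, size_t idx)
  {
          return d->rcaches[idx].cached_pfn;
  }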
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1662557681-145906-4-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Two of the magazine helpers have BUG_ON() checks, as follows:
- iova_magazine_pop() - here we ensure that the mag is not empty. However
we already ensure that in the only caller, __iova_rcache_get().
- iova_magazine_push() - here we ensure that the mag is not full. However
we already ensure that in the only caller, __iova_rcache_insert().
As described, the two bug checks are pointless so drop them.
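A standalone sketch (illustrative names, not the kernel code) of why the
checks are redundant: the callers test for empty/full before pop/push.
  #include <stdbool.h>

  #define MAG_SIZE 128

  struct magazine { unsigned long pfns[MAG_SIZE]; unsigned long size; };

  static bool mag_empty(const struct magazine *m) { return m->size == 0; }
  static bool mag_full(const struct magazine *m)  { return m->size == MAG_SIZE; }

  /* was: BUG_ON(mag_empty(mag)); the caller below already guarantees it */
  static unsigned long mag_pop(struct magazine *m) { return m->pfns[--m->size]; }

  /* was: BUG_ON(mag_full(mag)); likewise guaranteed by its only caller */
  static void mag_push(struct magazine *m, unsigned long pfn) { m->pfns[m->size++] = pfn; }

  static unsigned long rcache_get(struct magazine *loaded)
  {
          if (mag_empty(loaded))          /* caller-side check */
                  return 0;
          return mag_pop(loaded);
  }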
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/1662557681-145906-3-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Since commit 32e92d9f6f87 ("iommu/iova: Separate out rcache init") it
has not been possible to have NULL CPU rcache "loaded" or "prev" magazine
pointers once the IOVA domain has been properly initialized. Previously it
was only possible to have NULL pointers from failure to allocate the
magazines in the IOVA domain initialization. The only other two functions
to modify these pointers - __iova_rcache_{get, insert}() - would already
ensure that these pointers were non-NULL if initially non-NULL.
As such, the mag NULL pointer checks in iova_magazine_full(),
iova_magazine_empty(), and iova_magazine_free_pfns() may be dropped.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/1662557681-145906-2-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Now that dma-iommu.h only contains internal interfaces, make it
private to the IOMMU subsystem.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/b237e06c56a101f77af142a54b629b27aa179d22.1660668998.git.robin.murphy@arm.com
[ joro : re-add stub for iommu_dma_get_resv_regions ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Commit e8ae0e140c05 ("vfio: Require that devices support DMA cache
coherence") requires IOMMU drivers to advertise
IOMMU_CAP_CACHE_COHERENCY, in order to be used by VFIO. Since VFIO does
not provide to userspace the ability to maintain coherency through cache
invalidations, it requires hardware coherency. Advertise the capability
in order to restore VFIO support.
The meaning of IOMMU_CAP_CACHE_COHERENCY also changed from "IOMMU can
enforce cache coherent DMA transactions" to "IOMMU_CACHE is supported".
While virtio-iommu cannot enforce coherency (of PCIe no-snoop
transactions), it does support IOMMU_CACHE.
We can distinguish different cases of non-coherent DMA:
(1) When accesses from a hardware endpoint are not coherent. The host
would describe such a device using firmware methods ('dma-coherent'
in device-tree, '_CCA' in ACPI), since they are also needed without
a vIOMMU. In this case mappings are created without IOMMU_CACHE.
virtio-iommu doesn't need any additional support. It sends the same
requests as for coherent devices.
(2) When the physical IOMMU supports non-cacheable mappings. Supporting
those would require a new feature in virtio-iommu, new PROBE request
property and MAP flags. Device drivers would use a new API to
discover this since it depends on the architecture and the physical
IOMMU.
(3) When the hardware supports PCIe no-snoop. It is possible for
assigned PCIe devices to issue no-snoop transactions, and the
virtio-iommu specification is lacking any mention of this.
Arm platforms don't necessarily support no-snoop, and those that do
cannot enforce coherency of no-snoop transactions. Device drivers
must be careful about assuming that no-snoop transactions won't end
up cached; see commit e02f5c1bb228 ("drm: disable uncached DMA
optimization for ARM and arm64"). On x86 platforms, the host may or
may not enforce coherency of no-snoop transactions with the physical
IOMMU. But according to the above commit, on x86 a driver which
assumes that no-snoop DMA is compatible with uncached CPU mappings
will also work if the host enforces coherency.
Although these issues are not specific to virtio-iommu, it could be
used to facilitate discovery and configuration of no-snoop. This
would require a new feature bit, PROBE property and ATTACH/MAP
flags.
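A sketch of the shape of the resulting callback (kernel context assumed;
shown here with the device argument from the reworked ->capable()
prototype):
  static bool viommu_capable(struct device *dev, enum iommu_cap cap)
  {
          switch (cap) {
          case IOMMU_CAP_CACHE_COHERENCY:
                  return true;    /* IOMMU_CACHE mappings are supported */
          default:
                  return false;
          }
  }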
Cc: stable@vger.kernel.org
Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220825154622.86759-1-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
With CONFIG_INTEL_IOMMU_DEBUGFS enabled, the lockdep splat below is
seen when an I/O fault occurs on a machine with an Intel IOMMU in it.
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0
[fault reason 0x05] PTE Write access is not set
DMAR: Dump dmar0 table entries for IOVA 0x0
DMAR: root entry: 0x0000000127f42001
DMAR: context entry: hi 0x0000000000001502, low 0x000000012d8ab001
================================
WARNING: inconsistent lock state
5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
rngd/1006 [HC1[1]:SC0[0]:HE0:SE1] takes:
ff177021416f2d78 (&k->k_lock){?.+.}-{2:2}, at: klist_next+0x1b/0x160
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0xce/0x2d0
_raw_spin_lock+0x33/0x80
klist_add_tail+0x46/0x80
bus_add_device+0xee/0x150
device_add+0x39d/0x9a0
add_memory_block+0x108/0x1d0
memory_dev_init+0xe1/0x117
driver_init+0x43/0x4d
kernel_init_freeable+0x1c2/0x2cc
kernel_init+0x16/0x140
ret_from_fork+0x1f/0x30
irq event stamp: 7812
hardirqs last enabled at (7811): [<ffffffff85000e86>] asm_sysvec_apic_timer_interrupt+0x16/0x20
hardirqs last disabled at (7812): [<ffffffff84f16894>] irqentry_enter+0x54/0x60
softirqs last enabled at (7794): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
softirqs last disabled at (7787): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
The klist iterator functions use spin_*lock_irq*() while the klist
insertion functions use spin_*lock(). Combined with the Intel DMAR
IOMMU driver iterating over klists from atomic (hardirq) context, where
pci_get_domain_bus_and_slot() calls into bus_find_device() which iterates
over klists, this produces the inconsistent lock state above.
As there is currently no plan to make klists safe to use in atomic
context, fix the lockdep splat by avoiding the
pci_get_domain_bus_and_slot() call in hardirq context.
Fixes: 8ac0b64b9735 ("iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk()")
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
Link: https://lore.kernel.org/linux-iommu/Yvo2dfpEh%2FWC+Wrr@wantstofly.org/
Link: https://lore.kernel.org/linux-iommu/YvyBdPwrTuHHbn5X@wantstofly.org/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220819015949.4795-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The per-domain spinlock is acquired in iommu_flush_dev_iotlb(), which
can be called in interrupt context. For example, the drm-intel CI
system got completely blocked with the error below:
WARNING: inconsistent lock state
6.0.0-rc1-CI_DRM_11990-g6590d43d39b9+ #1 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff88810440d678 (&domain->lock){+.?.}-{2:2}, at: iommu_flush_dev_iotlb.part.61+0x23/0x80
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0xd3/0x310
_raw_spin_lock+0x2a/0x40
domain_update_iommu_cap+0x20b/0x2c0
intel_iommu_attach_device+0x5bd/0x860
__iommu_attach_device+0x18/0xe0
bus_iommu_probe+0x1f3/0x2d0
bus_set_iommu+0x82/0xd0
intel_iommu_init+0xe45/0x102a
pci_iommu_init+0x9/0x31
do_one_initcall+0x53/0x2f0
kernel_init_freeable+0x18f/0x1e1
kernel_init+0x11/0x120
ret_from_fork+0x1f/0x30
irq event stamp: 162354
hardirqs last enabled at (162354): [<ffffffff81b59274>] _raw_spin_unlock_irqrestore+0x54/0x70
hardirqs last disabled at (162353): [<ffffffff81b5901b>] _raw_spin_lock_irqsave+0x4b/0x50
softirqs last enabled at (162338): [<ffffffff81e00323>] __do_softirq+0x323/0x48e
softirqs last disabled at (162349): [<ffffffff810c1588>] irq_exit_rcu+0xb8/0xe0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&domain->lock);
<Interrupt>
lock(&domain->lock);
*** DEADLOCK ***
1 lock held by swapper/6/0:
This converts the spin_lock/unlock() calls into the irq save/restore
variants to fix the recursive locking issue.
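A sketch of the conversion (kernel context assumed, <linux/spinlock.h>):
  unsigned long flags;

  spin_lock_irqsave(&domain->lock, flags);      /* was: spin_lock(&domain->lock); */
  /* ... walk the device list, flush device IOTLBs ... */
  spin_unlock_irqrestore(&domain->lock, flags); /* was: spin_unlock(&domain->lock); */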
Fixes: ffd5869d93530 ("iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20220817025650.3253959-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The Intel IOMMU driver possibly selects between the first-level and the
second-level translation tables for DMA address translation. However,
the levels of page-table walks for the 4KB base page size are calculated
from the SAGAW field of the capability register, which is only valid for
the second-level page table. This causes the IOMMU driver to stop working
if the hardware (or the emulated IOMMU) advertises only first-level
translation capability and reports the SAGAW field as 0.
This solves the above problem by considering both the first level and the
second level when calculating the supported page table levels.
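A sketch of the calculation after the fix. The capability accessor names
below are approximations of the driver's helpers; the exact logic is
illustrative, not a verbatim copy:
  static unsigned long sketch_supported_sagaw(struct intel_iommu *iommu)
  {
          unsigned long fl_sagaw, sl_sagaw;

          /* first-level tables imply 4-level support, plus 5-level if FL 5LP */
          fl_sagaw = BIT(2) | (cap_5lp_support(iommu->cap) ? BIT(3) : 0);
          sl_sagaw = cap_sagaw(iommu->cap);

          if (!sm_supported(iommu) || !ecap_flts(iommu->ecap))
                  return sl_sagaw;        /* second level only */
          if (!ecap_slts(iommu->ecap))
                  return fl_sagaw;        /* first level only */

          return fl_sagaw & sl_sagaw;     /* both levels supported */
  }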
Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The translation table copying code for kdump kernels is currently based
on the extended root/context entry formats of ECS mode defined in older
VT-d v2.5, and doesn't handle the scalable mode formats. This causes
the kexec capture kernel to fail to boot with DMAR faults if the IOMMU
was enabled in scalable mode by the previous kernel.
The ECS mode has been deprecated since v3.0 of the VT-d spec, and the
Intel IOMMU driver doesn't support it as there is no real hardware
implementation. Hence, convert the ECS check in the table copying code
into a scalable mode check.
The existing copying code consumes a bit in the context entry as a mark
of a copied entry, and that mark needs to work for the old format as
well as for the extended context entries. As it's hard to find such a
common bit for both legacy and scalable mode context entries, replace
the in-entry mark with a per-IOMMU bitmap.
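A standalone sketch of the per-IOMMU bitmap (illustrative, using plain C
bit operations rather than the kernel's bitmap helpers):
  #include <limits.h>

  #define CTX_ENTRIES   (256 * 256)     /* 256 buses x 256 devfns */
  #define LONG_BITS     (sizeof(unsigned long) * CHAR_BIT)

  static unsigned long copied[CTX_ENTRIES / LONG_BITS];

  static void set_context_copied(unsigned int bus, unsigned int devfn)
  {
          unsigned int idx = bus * 256 + devfn;

          copied[idx / LONG_BITS] |= 1UL << (idx % LONG_BITS);
  }

  static int context_copied(unsigned int bus, unsigned int devfn)
  {
          unsigned int idx = bus * 256 + devfn;

          return !!(copied[idx / LONG_BITS] & (1UL << (idx % LONG_BITS)));
  }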
Fixes: 7373a8cc38197 ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: stable@vger.kernel.org
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220817011035.3250131-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The iommu-dma layer is now mostly encapsulated by iommu_dma_ops, with
only a couple more public interfaces left pertaining to MSI integration.
Since these depend on the main IOMMU API header anyway, move their
declarations there, taking the opportunity to update the half-baked
comments to proper kerneldoc along the way.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/9cd99738f52094e6bed44bfee03fa4f288d20695.1660668998.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Although iommu-dma is a per-architecture choice, that is currently
implemented in a rather haphazard way. Selecting from the arch Kconfig
was the original logical approach, but is complicated by having to
manage dependencies; conversely, selecting from drivers ends up hiding
the architecture dependency *too* well. Instead, let's just have it
enable itself automatically when IOMMU API support is enabled for the
relevant architectures. It can't get much clearer than that.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/2e33c8bc2b1bb478157b7964bfed976cb7466139.1660668998.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Clean up the remaining trivial bus_set_iommu() callsites along
with the implementation. Now drivers only have to know and care
about iommu_device instances, phew!
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> # s390
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> # s390
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ea383d5f4d74ffe200ab61248e5de6e95846180a.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and simplify
the probe failure path accordingly.
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0ff6f9166081724510e6772e43d45b317cab8c58.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and simplify
the probe failure path accordingly.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/13bb6baa6c4d74e95a12529e4eb1ddfb3885c3b5.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and simplify
the init failure path accordingly.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/b578af8e2bf8afeccb2c2ce87c1aa38b36f01331.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and simplify
the probe failure paths accordingly.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9134322ecd24030eebeac73f37ca579094cc7df0.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary. This also
leaves the custom initcall effectively doing nothing but register
the driver, which no longer needs to happen early either, so convert
it to builtin_platform_driver().
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/14377566e449950c19367f75ec1b09724bf0889f.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and simplify
the init failure path accordingly.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/d7477ef546479300217ca7bccb44da8b02715a07.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and simplify
the probe failure path accordingly.
Tested-by: Sven Peter <sven@svenpeter.dev>
Reviewed-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/afe138964196907d58147a686c1dcd6a12f9e210.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and simplify
the probe failure path accordingly.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6235f07df013776656a61bb642023ecce07f46cc.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary. With device
probes now replayed for every IOMMU instance registration, the whole
sorry ordering workaround for legacy DT bindings goes too, hooray!
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/f7aaad3e479a78623a6942ed46937249168b55bd.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Stop calling bus_set_iommu() since it's now unnecessary, and
garbage-collect the last remnants of amd_iommu_init_api().
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6bcc367e8802ae5a2b2840cbe4e9661ee024e80e.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Move the bus setup to iommu_device_register(). This should allow
bus_iommu_probe() to be correctly replayed for multiple IOMMU instances,
and leaves bus_set_iommu() as a glorified no-op to be cleaned up next.
At this point we can also handle cleanup better than just rolling back
the most-recently-touched bus upon failure - which may release devices
owned by other already-registered instances, and still leave devices on
other buses with dangling pointers to the failed instance. Now it's easy
to clean up the exact footprint of a given instance, no more, no less.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> # s390
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> # s390
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/d342b6f27efb5ef3e93aacaa3012d25386d74866.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The number of bus types that the IOMMU subsystem deals with is small and
manageable, so pull that list into core code as a first step towards
cleaning up all the boilerplate bus-awareness from drivers. Calling
iommu_probe_device() before bus->iommu_ops is set will simply return
-ENODEV and not break the notifier call chain, so there should be no
harm in proactively registering all our bus notifiers at init time.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> # s390
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> # s390
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/7462347bf938bd6eedb629a3a318434f6516e712.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
s390-iommu only supports pci_bus_type today.
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/8cb71ea1b24bd2622c1937bd9cfffe73b126eb56.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
As for the Intel driver, make sure the AMD driver can cope with seeing
.probe_device calls without having to wait for all known instances to
register first.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/a8d8ebe12b411d28972f1ab928c6db92e8913cf5.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Currently we rely on registering all our instances before initially
allowing any .probe_device calls via bus_set_iommu(). In preparation for
phasing out the latter, make sure we won't inadvertently return success
for a device associated with a known but not yet registered instance,
otherwise we'll run straight into iommu_group_get_for_dev() trying to
use NULL ops.
That also highlights an issue with intel_iommu_get_resv_regions() taking
dmar_global_lock from within a section where intel_iommu_init() already
holds it, which already exists via probe_acpi_namespace_devices() when
an ANDD device is probed, but gets more obvious with the upcoming change
to iommu_device_register(). Since they are both read locks it manages
not to deadlock in practice, and a more in-depth rework of this locking
is underway, so no attempt is made to address it here.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/579f2692291bcbfc3ac64f7456fcff0d629af131.1660572783.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The special case to allow iommu_dev==NULL in __arm_lpae_alloc_pages() is
confusing to static checkers (and possibly readers in general), since
it's not obvious that that is only intended for the selftests. However
it only serves to get around the dev_to_node() call, and we can easily
fake up enough to make that work anyway, so let's simply remove this
consideration from the normal flow and punt the responsibility over to
the test harness itself.
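A sketch of the test-harness side; whether a plain zeroed struct device
is enough for dev_to_node() here is an assumption of this sketch:
  static struct device sketch_dev;      /* dummy device for the selftests */

  cfg.iommu_dev = &sketch_dev;          /* allocator takes the normal path */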
Reported-by: Rustam Subkhankulov <subkhankulov@ispras.ru>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e2095eeda305071cb56c2cb8ac8a82dc3bd4dcab.1660580155.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Assuming that any SMMU can enforce coherency for any device is clearly
nonsense. Although technically even a single SMMU instance can be wired
up to only be capable of emitting coherent traffic for some of the
devices it translates, it's a fairly realistic approximation that if the
SMMU's pagetable walker is wired up to a coherent interconnect then all
its translation units probably are too, and conversely that lack of
coherent table walks implies a non-coherent system in general. Either
way it's still less inaccurate than what we've been claiming so far.
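A sketch of the heuristic, as a fragment of the ->capable() switch
(kernel context assumed; ARM_SMMU_FEAT_COHERENCY reflects coherent
table walks):
  case IOMMU_CAP_CACHE_COHERENCY:
          /* assume a coherent walker implies coherent translation units */
          return master->smmu->features & ARM_SMMU_FEAT_COHERENCY;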
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/106c9741415f0b6358c72d53ae9c78c553a2b45c.1660574547.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
With all callers now converted to the device-specific version, retire
the old bus-based interface, and give drivers the chance to indicate
accurate per-instance capabilities.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/d8bd8777d06929ad8f49df7fc80e1b9af32a41b5.1660574547.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
In iommu_group_alloc(), when kobject_init_and_add() fails, group->kobj
is already associated with iommu_group_ktype, so its release function
iommu_group_release() will be called by the subsequent kobject_put().
iommu_group_release() calls ida_free() with group->id, so we do not
need to do it before kobject_put().
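A sketch of the resulting error path (kernel context assumed):
  ret = kobject_init_and_add(&group->kobj, &iommu_group_ktype,
                             NULL, "%d", group->id);
  if (ret) {
          kobject_put(&group->kobj);    /* iommu_group_release() frees group->id */
          return ERR_PTR(ret);
  }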
Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20220815031423.94548-1-yuancan@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
dev_has_feat has been removed from iommu_ops in commit
309c56e84602 ("iommu: remove the unused dev_has_feat method"),
so remove its description from the struct's kerneldoc.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20220815013339.2552-1-yuancan@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
We started using a 64-bit completion value. Unfortunately, we only
stored the low 32 bits, so a very large completion value would never
be matched in iommu_completion_wait().
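A standalone illustration of the bug class (not the driver code):
  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint64_t written = 0x100000000ULL;     /* 64-bit semaphore value */
          uint32_t stored = (uint32_t)written;   /* bug: low 32 bits only */

          printf("u32 match: %d\n", (uint64_t)stored == written);  /* 0 */

          uint64_t fixed = written;              /* fix: keep all 64 bits */
          printf("u64 match: %d\n", fixed == written);             /* 1 */
          return 0;
  }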
Fixes: c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Link: https://lore.kernel.org/r/20220801192229.3358786-1-jsperbeck@google.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
In order to make the underlying API easier to change in the future,
prevent users from dereferencing fwnode from struct device.
Instead, use the specific dev_fwnode() API for that.
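The shape of the conversion (dev_fwnode() is declared in
<linux/property.h>):
  -       fwnode = dev->fwnode;
  +       fwnode = dev_fwnode(dev);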
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220801164758.20664-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
|
-Wformat was recently re-enabled for builds with clang, then quickly
re-disabled, due to concerns stemming from the frequency of default
argument promotion related warning instances.
commit 258fafcd0683 ("Makefile.extrawarn: re-enable -Wformat for clang")
commit 21f9c8a13bb2 ("Revert "Makefile.extrawarn: re-enable -Wformat for clang"")
ISO WG14 has ratified N2562 to address default argument promotion
explicitly for printf, as part of the upcoming ISO C2X standard.
The behavior of clang was changed in clang-16 to not warn for the cited
cases in all language modes.
Add a version check, so that users of clang-16 now get the full effect
of -Wformat. For older clang versions, re-enable flags under the
-Wformat group so that users still get some useful checks related to
format strings, without noisy default argument promotion warnings. I
intentionally omitted -Wformat-y2k and -Wformat-security from being
re-enabled, which are also part of -Wformat in clang-16.
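A standalone example of the warning class N2562 addresses: the argument
below has type int after default argument promotion, yet matching it
back to %hhd is well defined, so clang-16 no longer warns here.
  #include <stdio.h>

  int main(void)
  {
          char a = 1, b = 2;

          printf("%hhd\n", a + b);      /* a + b is an int; %hhd accepts it */
          return 0;
  }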
Link: https://github.com/ClangBuiltLinux/linux/issues/378
Link: https://github.com/llvm/llvm-project/issues/57102
Link: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2562.pdf
Suggested-by: Justin Stitt <jstitt007@gmail.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Youngmin Nam <youngmin.nam@samsung.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Kernel warns about mutable irq_chips:
"not an immutable chip, please consider fixing!"
Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.
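A sketch of the pattern (kernel context assumed; the mask/unmask bodies
are driver specific):
  static void sketch_irq_mask(struct irq_data *d)
  {
          struct gpio_chip *gc = irq_data_get_irq_chip_data(d);

          /* ... mask the interrupt in hardware ... */
          gpiochip_disable_irq(gc, irqd_to_hwirq(d));
  }

  static const struct irq_chip sketch_irq_chip = {
          .name           = "sketch",
          .irq_mask       = sketch_irq_mask,
          /* .irq_unmask calls gpiochip_enable_irq() before unmasking */
          .flags          = IRQCHIP_IMMUTABLE,
          GPIOCHIP_IRQ_RESOURCE_HELPERS,
  };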
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
Kernel warns about mutable irq_chips:
"not an immutable chip, please consider fixing!"
Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
Kernel warns about mutable irq_chips:
"not an immutable chip, please consider fixing!"
Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
Kernel warns about mutable irq_chips:
"not an immutable chip, please consider fixing!"
Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
The mmap lock protects the page walker from changes to the page tables
during the walk. However a read lock is insufficient to protect those
areas which don't have a VMA as munmap() detaches the VMAs before
downgrading to a read lock and actually tearing down PTEs/page tables.
For users of walk_page_range() the solution is to simply call pte_hole()
immediately without checking the actual page tables when a VMA is not
present. We now never call __walk_page_range() without a valid vma.
For walk_page_range_novma() the locking requirements are tightened to
require the mmap write lock to be taken, and the pgd is then walked
directly with 'no_vma' set.
This in turn means that all page walkers either have a valid vma, or
it's that special 'novma' case for page table debugging. As a result,
all the odd '(!walk->vma && !walk->no_vma)' tests can be removed.
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return the value of pa_to_nid() directly instead of storing it in a
redundant variable.
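The shape of the cleanup:
  -       nid = pa_to_nid(pa);
  -       return nid;
  +       return pa_to_nid(pa);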
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The kernel fails to build when CONFIG_MEMORY_HOTREMOVE is unselected,
because arch_remove_memory() is needed by mm/memory_hotplug.c but left
undefined. The build error messages look like:
  LD      vmlinux.o
  MODPOST vmlinux.symvers
  MODINFO modules.builtin.modinfo
  GEN     modules.builtin
  LD      .tmp_vmlinux.kallsyms1
loongarch64-linux-gnu-ld: mm/memory_hotplug.o: in function `.L242':
memory_hotplug.c:(.ref.text+0x930): undefined reference to `arch_remove_memory'
make: *** [Makefile:1169: vmlinux] Error 1
Remove the CONFIG_MEMORY_HOTREMOVE requirement and rearrange the file,
referring to the definitions of other architectures.
Signed-off-by: Yupeng Li <liyupeng@zbhlos.com>
Signed-off-by: Caicai <caizp2008@163.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Now acpi_os_ioremap() is marked with __init because it calls
memblock_is_memory(), which is also marked with __init in the
!ARCH_KEEP_MEMBLOCK
case. However, acpi_os_ioremap() is called by ordinary functions such
as acpi_os_{read, write}_memory() and causes section mismatch warnings:
WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_read_memory (section: .text) -> acpi_os_ioremap (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_write_memory (section: .text) -> acpi_os_ioremap (section: .init.text)
Fix these warnings by selecting ARCH_KEEP_MEMBLOCK unconditionally and
removing the __init modifier of acpi_os_ioremap(). This can also give a
chance to track "memory" and "reserved" memblocks after early boot.
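The shape of the change; the exact LoongArch signature is assumed here:
  -void __init __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size)
  +void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size)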
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
1. Use nr/nx to replace ri/xi.
2. Add the 0x prefix for hexadecimal data.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Commit 8ba62d37949e248c69 ("task_work: Call tracehook_notify_signal from
get_signal on all architectures") adjusted arch_do_signal_or_restart()
for all architectures. LoongArch wasn't upstream at that time and could
still be built successfully without the adjustment because this function
has a weak version with the correct prototype. It is obvious that we
should convert LoongArch to use the new API, otherwise some signal
handling will be lost.
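The shape of the conversion, assuming the post-8ba62d37949e prototype:
  -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal);
  +void arch_do_signal_or_restart(struct pt_regs *regs);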
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Ensure that all input sections are listed explicitly in the linker
script, and issue a warning otherwise. This ensures that the binary
image matches the PE/COFF and other image metadata exactly, which is
important for things like code signing.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add documentation for the ublk subsystem. It was supposed to be
documented when the driver was merged, but was missing at that time.
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
[axboe: correct MAINTAINERS addition]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This change fixes a mis-handling of the LANDLOCK_ACCESS_FS_REFER right
when multiple rulesets/domains are stacked. The expected behaviour was
that an additional ruleset can only restrict the set of permitted
operations, but in this particular case, it was potentially possible to
re-gain the LANDLOCK_ACCESS_FS_REFER right.
With the introduction of LANDLOCK_ACCESS_FS_REFER, we added the first
globally denied-by-default access right. Indeed, this lifted an initial
Landlock limitation on renaming and linking files, which were initially
always denied when the source and the destination were in different
directories.
This led to an inconsistent backward compatibility behavior which was
only taken into account if no domain layer was using the new
LANDLOCK_ACCESS_FS_REFER right. However, when restricting a thread with
a new ruleset handling LANDLOCK_ACCESS_FS_REFER, all inherited parent
rulesets/layers not explicitly handling LANDLOCK_ACCESS_FS_REFER would
behave as if they were handling this access right and with all their
rules allowing it. This means that renaming and linking files could
become allowed by these parent layers, but all the other required
accesses must also be granted: all layers must allow file removal or
creation, and renaming and linking operations cannot lead to privilege
escalation according to the Landlock policy. See detailed explanation
in commit b91c3e4ea756 ("landlock: Add support for file reparenting with
LANDLOCK_ACCESS_FS_REFER").
To say it another way, this bug may lift the renaming and linking
limitations of the initial Landlock version, and the same ruleset can
enforce different restrictions depending on the previously or
subsequently enforced rulesets (i.e. inconsistent behavior). The
LANDLOCK_ACCESS_FS_REFER right cannot give access to data not already
allowed, but this doesn't follow the contract of the first Landlock ABI.
This fix puts back the limitation for sandboxes that didn't opt in to
this additional right.
For instance, if a first ruleset allows LANDLOCK_ACCESS_FS_MAKE_REG on
/dst and LANDLOCK_ACCESS_FS_REMOVE_FILE on /src, renaming /src/file to
/dst/file is denied. However, without this fix, stacking a new ruleset
which allows LANDLOCK_ACCESS_FS_REFER on / would now permit the
sandboxed thread to rename /src/file to /dst/file .
This change fixes the (absolute) rule access rights, which now always
forbid LANDLOCK_ACCESS_FS_REFER except when it is explicitly allowed
when creating a rule.
Making all domains handle LANDLOCK_ACCESS_FS_REFER was an initial
approach but there are two downsides:
* it makes the code more complex because we still want to check that a
rule allowing LANDLOCK_ACCESS_FS_REFER is legitimate according to the
ruleset's handled access rights (i.e. ABI v1 != ABI v2);
* it would not allow identifying whether the user created a ruleset
explicitly handling LANDLOCK_ACCESS_FS_REFER or not, which will be an
issue for auditing Landlock.
Instead, this change adds an ACCESS_INITIALLY_DENIED list of
denied-by-default rights, which (only) contains
LANDLOCK_ACCESS_FS_REFER. All domains are treated as if they are also
handling this list, but without modifying their fs_access_masks field.
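A sketch of the mechanism (illustrative names; the real masks and
helpers live in security/landlock):
  /* denied-by-default rights, currently only FS_REFER */
  #define ACCESS_INITIALLY_DENIED       LANDLOCK_ACCESS_FS_REFER

  /* every layer is evaluated as if it also handled these rights,
   * while fs_access_masks itself stays untouched */
  handled = ruleset->fs_access_masks[layer] | ACCESS_INITIALLY_DENIED;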
A side effect is that the errno code returned by rename(2) or link(2)
*may* be changed from EXDEV to EACCES according to the enforced
restrictions. Indeed, we now have the mechanism to identify whether an
access
is denied because of a required right (e.g. LANDLOCK_ACCESS_FS_MAKE_REG,
LANDLOCK_ACCESS_FS_REMOVE_FILE) or if it is denied because of missing
LANDLOCK_ACCESS_FS_REFER rights. This may result in different errno
codes than for the initial Landlock version, but this approach is more
consistent and better for rename/link compatibility reasons, and it
wasn't possible before (hence no backport to ABI v1). The
layout1.rename_file test reflects this change.
Add 4 layout1.refer_denied_by_default* test suites to check that the
behavior of a ruleset not handling LANDLOCK_ACCESS_FS_REFER (ABI v1) is
unchanged even if another layer handles LANDLOCK_ACCESS_FS_REFER (i.e.
ABI v1 precedence). Make sure rule's absolute access rights are correct
by testing with and without a matching path. Add test_rename() and
test_exchange() helpers.
Extend layout1.inval tests to check that a denied-by-default access
right is not necessarily part of a domain's handled access rights.
Test coverage for security/landlock is 95.3% of 599 lines according to
gcc/gcov-11.
Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
Reviewed-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20220831203840.1370732-1-mic@digikod.net
Cc: stable@vger.kernel.org
[mic: Constify and slightly simplify test helpers]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|