aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/drivers/iommu/intel/svm.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-09-26iommu/vt-d: Rename cap_5lp_support to cap_fl5lp_supportYi Liu1-1/+1
This renaming better describes it is for first level page table (a.k.a first stage page table since VT-d spec 3.4). Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220916071326.2223901-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-26iommu/vt-d: Remove unnecessary SVA data accesses in page fault pathLu Baolu1-53/+7
The existing I/O page fault handling code accesses the per-PASID SVA data structures. This is unnecessary and makes the fault handling code only suitable for SVA scenarios. This removes the SVA data accesses from the I/O page fault reporting and responding code, so that the fault handling code could be generic. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220914011821.400986-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu/vt-d: Refactor iommu information of each domainLu Baolu1-1/+1
When a DMA domain is attached to a device, it needs to allocate a domain ID from its IOMMU. Currently, the domain ID information is stored in two static arrays embedded in the domain structure. This can lead to memory waste when the driver is running on a small platform. This optimizes these static arrays by replacing them with an xarray and consuming memory on demand. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20220702015610.2849494-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu/vt-d: Acquiring lock in pasid manipulation helpersLu Baolu1-3/+0
The iommu->lock is used to protect the per-IOMMU pasid directory table and pasid table. Move the spinlock acquisition/release into the helpers to make the code self-contained. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220706025524.2904370-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()Lu Baolu1-3/+3
The iommu->lock is used to protect changes in root/context/pasid tables and domain ID allocation. There's no use case to change these resources in any interrupt context. Therefore, it is unnecessary to disable the interrupts when the spinlock is held. The same thing happens on the device_domain_lock side, which protects the device domain attachment information. This replaces spin_lock/unlock_irqsave/irqrestore() calls with the normal spin_lock/unlock(). Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220706025524.2904370-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu/vt-d: Move include/linux/intel-iommu.h under iommuLu Baolu1-1/+1
This header file is private to the Intel IOMMU driver. Move it to the driver folder. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20220514014322.2927339-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15iommu/vt-d: Move trace/events/intel_iommu.h under iommuLu Baolu1-1/+1
This header file is private to the Intel IOMMU driver. Move it to the driver folder. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20220514014322.2927339-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28iommu/vt-d: Drop stop marker messagesLu Baolu1-0/+4
The page fault handling framework in the IOMMU core explicitly states that it doesn't handle PCI PASID Stop Marker and the IOMMU drivers must discard them before reporting faults. This handles Stop Marker messages in prq_event_thread() before reporting events to the core. The VT-d driver explicitly drains the pending page requests when a CPU page table (represented by a mm struct) is unbound from a PASID according to the procedures defined in the VT-d spec. The Stop Marker messages do not need a response. Hence, it is safe to drop the Stop Marker messages silently if any of them is found in the page request queue. Fixes: d5b9e4bfe0d88 ("iommu/vt-d: Report prq to io-pgfault framework") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20220421113558.3504874-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220423082330.3897867-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-24Merge tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds1-217/+3
Pull iommu updates from Joerg Roedel: - IOMMU Core changes: - Removal of aux domain related code as it is basically dead and will be replaced by iommu-fd framework - Split of iommu_ops to carry domain-specific call-backs separatly - Cleanup to remove useless ops->capable implementations - Improve 32-bit free space estimate in iova allocator - Intel VT-d updates: - Various cleanups of the driver - Support for ATS of SoC-integrated devices listed in ACPI/SATC table - ARM SMMU updates: - Fix SMMUv3 soft lockup during continuous stream of events - Fix error path for Qualcomm SMMU probe() - Rework SMMU IRQ setup to prepare the ground for PMU support - Minor cleanups and refactoring - AMD IOMMU driver: - Some minor cleanups and error-handling fixes - Rockchip IOMMU driver: - Use standard driver registration - MSM IOMMU driver: - Minor cleanup and change to standard driver registration - Mediatek IOMMU driver: - Fixes for IOTLB flushing logic * tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits) iommu/amd: Improve amd_iommu_v2_exit() iommu/amd: Remove unused struct fault.devid iommu/amd: Clean up function declarations iommu/amd: Call memunmap in error path iommu/arm-smmu: Account for PMU interrupts iommu/vt-d: Enable ATS for the devices in SATC table iommu/vt-d: Remove unused function intel_svm_capable() iommu/vt-d: Add missing "__init" for rmrr_sanity_check() iommu/vt-d: Move intel_iommu_ops to header file iommu/vt-d: Fix indentation of goto labels iommu/vt-d: Remove unnecessary prototypes iommu/vt-d: Remove unnecessary includes iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO iommu/vt-d: Remove domain and devinfo mempool iommu/vt-d: Remove iova_cache_get/put() iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info() iommu/vt-d: Remove intel_iommu::domains iommu/mediatek: Always tlb_flush_all when each PM resume iommu/mediatek: Add tlb_lock in tlb_flush_all iommu/mediatek: Remove the power status checking in tlb flush all ...
2022-03-04iommu/vt-d: Remove unused function intel_svm_capable()YueHaibing1-5/+0
This is unused since commit 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers"). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20220216113851.25004-1-yuehaibing@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220301020159.633356-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFOLu Baolu1-3/+3
Allocate and set the per-device iommu private data during iommu device probe. This makes the per-device iommu private data always available during iommu_probe_device() and iommu_release_device(). With this changed, the dummy DEFER_DEVICE_DOMAIN_INFO pointer could be removed. The wrappers for getting the private data and domain are also cleaned. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu/vt-d: Remove guest pasid related callbacksLu Baolu1-209/+0
The guest pasid related callbacks are not called in the tree. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-15iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exitFenghua Yu1-9/+0
PASIDs are process-wide. It was attempted to use refcounted PASIDs to free them when the last thread drops the refcount. This turned out to be complex and error prone. Given the fact that the PASID space is 20 bits, which allows up to 1M processes to have a PASID associated concurrently, PASID resource exhaustion is not a realistic concern. Therefore, it was decided to simplify the approach and stick with lazy on demand PASID allocation, but drop the eager free approach and make an allocated PASID's lifetime bound to the lifetime of the process. Get rid of the refcounting mechanisms and replace/rename the interfaces to reflect this new approach. [ bp: Massage commit message. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20220207230254.3342514-6-fenghua.yu@intel.com
2021-10-18iommu/vt-d: Clean up unused PASID updating functionsFenghua Yu1-23/+1
update_pasid() and its call chain are currently unused in the tree because Thomas disabled the ENQCMD feature. The feature will be re-enabled shortly using a different approach and update_pasid() and its call chain will not be used in the new approach. Remove the useless functions. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210920192349.2602141-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()Fenghua Yu1-0/+12
pasid_mutex and dev->iommu->param->lock are held while unbinding mm is flushing IO page fault workqueue and waiting for all page fault works to finish. But an in-flight page fault work also need to hold the two locks while unbinding mm are holding them and waiting for the work to finish. This may cause an ABBA deadlock issue as shown below: idxd 0000:00:0a.0: unbind PASID 2 ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc7+ #549 Not tainted [ 186.615245] ---------- dsa_test/898 is trying to acquire lock: ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at: iopf_queue_flush_dev+0x29/0x60 but task is already holding lock: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (pasid_mutex){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 intel_svm_page_response+0x8e/0x260 iommu_page_response+0x122/0x200 iopf_handle_group+0x1c2/0x240 process_one_work+0x2a5/0x5a0 worker_thread+0x55/0x400 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #1 (&param->fault_param->lock){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iommu_report_device_fault+0xc2/0x170 prq_event_thread+0x28a/0x580 irq_thread_fn+0x28/0x60 irq_thread+0xcf/0x180 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #0 (&param->lock){+.+.}-{3:3}: __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 [idxd] __fput+0x9c/0x250 ____fput+0xe/0x10 task_work_run+0x64/0xa0 exit_to_user_mode_prepare+0x227/0x230 syscall_exit_to_user_mode+0x2c/0x60 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &param->lock --> &param->fault_param->lock --> pasid_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pasid_mutex); lock(&param->fault_param->lock); lock(pasid_mutex); lock(&param->lock); *** DEADLOCK *** 2 locks held by dsa_test/898: #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at: iommu_sva_unbind_device+0x53/0x80 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 stack backtrace: CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549 Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016 Call Trace: dump_stack_lvl+0x5b/0x74 dump_stack+0x10/0x12 print_circular_bug.cold+0x13d/0x142 check_noncircular+0xf1/0x110 __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xde/0x240 __mutex_lock+0x75/0x730 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xfd/0x240 ? iopf_queue_flush_dev+0x29/0x60 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 ? intel_pasid_tear_down_entry+0x22e/0x240 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 pasid_mutex protects pasid and svm data mapping data. It's unnecessary to hold pasid_mutex while flushing the workqueue. To fix the deadlock issue, unlock pasid_pasid during flushing the workqueue to allow the works to be handled. Fixes: d5b9e4bfe0d8 ("iommu/vt-d: Report prq to io-pgfault framework") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com [joro: Removed timing information from kernel log messages] Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()Fenghua Yu1-3/+0
The mm->pasid will be used in intel_svm_free_pasid() after load_pasid() during unbinding mm. Clearing it in load_pasid() will cause PASID cannot be freed in intel_svm_free_pasid(). Additionally mm->pasid was updated already before load_pasid() during pasid allocation. No need to update it again in load_pasid() during binding mm. Don't update mm->pasid to avoid the issues in both binding mm and unbinding mm. Fixes: 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-20Merge branches 'apple/dart', 'arm/smmu', 'iommu/fixes', 'x86/amd', 'x86/vt-d' and 'core' into nextJoerg Roedel1-5/+2
2021-08-19iommu/vt-d: Allow devices to have more than 32 outstanding PRsLu Baolu1-4/+0
The minimum per-IOMMU PRQ queue size is one 4K page, this is more entries than the hardcoded limit of 32 in the current VT-d code. Some devices can support up to 512 outstanding PRQs but underutilized by this limit of 32. Although, 32 gives some rough fairness when multiple devices share the same IOMMU PRQ queue, but far from optimal for customized use case. This extends the per-IOMMU PRQ queue size to four 4K pages and let the devices have as many outstanding page requests as they can. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18iommu/vt-d: Fix PASID reference leakFenghua Yu1-1/+2
A PASID reference is increased whenever a device is bound to an mm (and its PASID) successfully (i.e. the device's sdev user count is increased). But the reference is not dropped every time the device is unbound successfully from the mm (i.e. the device's sdev user count is decreased). The reference is dropped only once by calling intel_svm_free_pasid() when there isn't any device bound to the mm. intel_svm_free_pasid() drops the reference and only frees the PASID on zero reference. Fix the issue by dropping the PASID reference and freeing the PASID when no reference on successful unbinding the device by calling intel_svm_free_pasid() . Fixes: 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210813181345.1870742-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210817124321.1517985-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Fix out-bounds-warning in intel/svm.cGustavo A. R. Silva1-10/+16
Replace a couple of calls to memcpy() with simple assignments in order to fix the following out-of-bounds warning: drivers/iommu/intel/svm.c:1198:4: warning: 'memcpy' offset [25, 32] from the object at 'desc' is out of the bounds of referenced subobject 'qw2' with type 'long long unsigned int' at offset 16 [-Warray-bounds] The problem is that the original code is trying to copy data into a couple of struct members adjacent to each other in a single call to memcpy(). This causes a legitimate compiler warning because memcpy() overruns the length of &desc.qw2 and &resp.qw2, respectively. This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). Link: https://github.com/KSPP/linux/issues/109 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210414201403.GA392764@embeddedor Link: https://lore.kernel.org/r/20210610020115.1637656-18-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Add PRQ handling latency samplingLu Baolu1-3/+13
The execution time for page fault request handling is performance critical and needs to be monitored. This adds code to sample the execution time of page fault request handling. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-17-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Add prq_report trace eventLu Baolu1-0/+7
This adds a new trace event to track the page fault request report. This event will provide almost all information defined in a page request descriptor. A sample output: | prq_report: dmar0/0000:00:0a.0 seq# 1: rid=0x50 addr=0x559ef6f97 r---- pasid=0x2 index=0x1 | prq_report: dmar0/0000:00:0a.0 seq# 2: rid=0x50 addr=0x559ef6f9c rw--l pasid=0x2 index=0x1 | prq_report: dmar0/0000:00:0a.0 seq# 3: rid=0x50 addr=0x559ef6f98 r---- pasid=0x2 index=0x1 | prq_report: dmar0/0000:00:0a.0 seq# 4: rid=0x50 addr=0x559ef6f9d rw--l pasid=0x2 index=0x1 | prq_report: dmar0/0000:00:0a.0 seq# 5: rid=0x50 addr=0x559ef6f99 r---- pasid=0x2 index=0x1 | prq_report: dmar0/0000:00:0a.0 seq# 6: rid=0x50 addr=0x559ef6f9e rw--l pasid=0x2 index=0x1 | prq_report: dmar0/0000:00:0a.0 seq# 7: rid=0x50 addr=0x559ef6f9a r---- pasid=0x2 index=0x1 | prq_report: dmar0/0000:00:0a.0 seq# 8: rid=0x50 addr=0x559ef6f9f rw--l pasid=0x2 index=0x1 This will be helpful for I/O page fault related debugging. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-13-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Report prq to io-pgfault frameworkLu Baolu1-79/+5
Let the IO page fault requests get handled through the io-pgfault framework. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Allocate/register iopf queue for sva devicesLu Baolu1-7/+30
This allocates and registers the iopf queue infrastructure for devices which want to support IO page fault for SVA. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Refactor prq_event_thread()Lu Baolu1-103/+136
Refactor prq_event_thread() by moving handling single prq event out of the main loop. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Use common helper to lookup svm devicesLu Baolu1-28/+40
It's common to iterate the svm device list and find a matched device. Add common helpers to do this and consolidate the code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpersLu Baolu1-162/+116
Align the pasid alloc/free code with the generic helpers defined in the iommu core. This also refactored the SVA binding code to improve the readability. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10iommu/vt-d: Add pasid private data helpersLu Baolu1-21/+41
We are about to use iommu_sva_alloc/free_pasid() helpers in iommu core. That means the pasid life cycle will be managed by iommu core. Use a local array to save the per pasid private data instead of attaching it the real pasid. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07iommu/vt-d: Report the right page fault addressLu Baolu1-1/+1
The Address field of the Page Request Descriptor only keeps bit [63:12] of the offending address. Convert it to a full address before reporting it to device drivers. Fixes: eb8d93ea3c1d3 ("iommu/vt-d: Report page request faults for guest SVA") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07iommu/vt-d: Remove SVM_FLAG_PRIVATE_PASIDLu Baolu1-22/+18
The SVM_FLAG_PRIVATE_PASID has never been referenced in the tree, and there's no plan to have anything to use it. So cleanup it. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210323010600.678627-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07iommu/vt-d: Remove svm_dev_opsLu Baolu1-14/+1
The svm_dev_ops has never been referenced in the tree, and there's no plan to have anything to use it. Remove it to make the code neat. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210323010600.678627-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07iommu/vt-d: Don't set then clear private data in prq_event_thread()Lu Baolu1-2/+2
The VT-d specification (section 7.6) requires that the value in the Private Data field of a Page Group Response Descriptor must match the value in the Private Data field of the respective Page Request Descriptor. The private data field of a page group response descriptor is set then immediately cleared in prq_event_thread(). This breaks the rule defined by the VT-d specification. Fix it by moving clearing code up. Fixes: 5b438f4ba315d ("iommu/vt-d: Support page request in scalable mode") Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320024156.640798-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18iommu/vt-d: Calculate and set flags for handle_mm_faultJacob Pan1-3/+6
Page requests are originated from the user page fault. Therefore, we shall set FAULT_FLAG_USER.  FAULT_FLAG_REMOTE indicates that we are walking an mm which is not guaranteed to be the same as the current->mm and should not be subject to protection key enforcement. Therefore, we should set FAULT_FLAG_REMOTE to avoid faults when both SVM and PKEY are used. References: commit 1b2ee1266ea6 ("mm/core: Do not enforce PKEY permissions on remote mm access") Reviewed-by: Raj Ashok <ashok.raj@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/1614680040-1989-5-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18iommu/vt-d: Reject unsupported page request modesJacob Pan1-1/+11
When supervisor/privilige mode SVM is used, we bind init_mm.pgd with a supervisor PASID. There should not be any page fault for init_mm. Execution request with DMA read is also not supported. This patch checks PRQ descriptor for both unsupported configurations, reject them both with invalid responses. Fixes: 1c4f88b7f1f92 ("iommu/vt-d: Shared virtual address in scalable mode") Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/1614680040-1989-4-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-29iommu/vt-d: Use INVALID response code instead of FAILURELu Baolu1-4/+1
The VT-d IOMMU response RESPONSE_FAILURE for a page request in below cases: - When it gets a Page_Request with no PASID; - When it gets a Page_Request with PASID that is not in use for this device. This is allowed by the spec, but IOMMU driver doesn't support such cases today. When the device receives RESPONSE_FAILURE, it sends the device state machine to HALT state. Now if we try to unload the driver, it hangs since the device doesn't send any outbound transactions to host when the driver is trying to clear things up. The only possible responses would be for invalidation requests. Let's use RESPONSE_INVALID instead for now, so that the device state machine doesn't enter HALT state. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210126080730.2232859-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-29iommu/vt-d: Clear PRQ overflow only when PRQ is emptyLu Baolu1-2/+11
It is incorrect to always clear PRO when it's set w/o first checking whether the overflow condition has been cleared. Current code assumes that if an overflow condition occurs it must have been cleared by earlier loop. However since the code runs in a threaded context, the overflow condition could occur even after setting the head to the tail under some extreme condition. To be sane, we should read both head/tail again when seeing a pending PRO and only clear PRO after all pending PRs have been handled. Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@MWHPR11MB1886.namprd11.prod.outlook.com/ Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28iommu/vt-d: Consolidate duplicate cache invaliation codeLu Baolu1-46/+9
The pasid based IOTLB and devTLB invalidation code is duplicate in several places. Consolidate them by using the common helpers. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210114085021.717041-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-12iommu/vt-d: Fix unaligned addresses for intel_flush_svm_range_dev()Lu Baolu1-2/+20
The VT-d hardware will ignore those Addr bits which have been masked by the AM field in the PASID-based-IOTLB invalidation descriptor. As the result, if the starting address in the descriptor is not aligned with the address mask, some IOTLB caches might not invalidate. Hence people will see below errors. [ 1093.704661] dmar_fault: 29 callbacks suppressed [ 1093.704664] DMAR: DRHD: handling fault status reg 3 [ 1093.712738] DMAR: [DMA Read] Request device [7a:02.0] PASID 2 fault addr 7f81c968d000 [fault reason 113] SM: Present bit in first-level paging entry is clear Fix this by using aligned address for PASID-based-IOTLB invalidation. Fixes: 1c4f88b7f1f9 ("iommu/vt-d: Shared virtual address in scalable mode") Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_devLiu Yi L1-4/+5
'struct intel_svm' is shared by all devices bound to a give process, but records only a single pointer to a 'struct intel_iommu'. Consequently, cache invalidations may only be applied to a single DMAR unit, and are erroneously skipped for the other devices. In preparation for fixing this, rework the structures so that the iommu pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm' to track them in its device list. Fixes: 1c4f88b7f1f9 ("iommu/vt-d: Shared virtual address in scalable mode") Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Raj Ashok <ashok.raj@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Reported-by: Guo Kaijie <Kaijie.Guo@intel.com> Reported-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Cc: stable@vger.kernel.org # v5.0+ Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07iommu/vt-d: Fix lockdep splat in sva bind()/unbind()Lu Baolu1-6/+8
Lock(&iommu->lock) without disabling irq causes lockdep warnings. ======================================================== WARNING: possible irq lock inversion dependency detected 5.11.0-rc1+ #828 Not tainted -------------------------------------------------------- kworker/0:1H/120 just changed the state of lock: ffffffffad9ea1b8 (device_domain_lock){..-.}-{2:2}, at: iommu_flush_dev_iotlb.part.0+0x32/0x120 but this lock took another, SOFTIRQ-unsafe lock in the past: (&iommu->lock){+.+.}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&iommu->lock); local_irq_disable(); lock(device_domain_lock); lock(&iommu->lock); <Interrupt> lock(device_domain_lock); *** DEADLOCK *** Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-5-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-23iommu/ioasid: Add ioasid referencesJean-Philippe Brucker1-3/+3
Let IOASID users take references to existing ioasids with ioasid_get(). ioasid_put() drops a reference and only frees the ioasid when its reference number is zero. It returns true if the ioasid was freed. For drivers that don't call ioasid_get(), ioasid_put() is the same as ioasid_free(). Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201106155048.997886-2-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-11-03iommu/vt-d: Fix a bug for PDP check in prq_event_threadLiu, Yi L1-1/+1
In prq_event_thread(), the QI_PGRP_PDP is wrongly set by 'req->pasid_present' which should be replaced to 'req->priv_data_present'. Fixes: 5b438f4ba315 ("iommu/vt-d: Support page request in scalable mode") Signed-off-by: Liu, Yi L <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1604025444-6954-3-git-send-email-yi.y.sun@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-11-03iommu/vt-d: Fix sid not set issue in intel_svm_bind_gpasid()Liu Yi L1-0/+6
Should get correct sid and set it into sdev. Because we execute 'sdev->sid != req->rid' in the loop of prq_event_thread(). Fixes: eb8d93ea3c1d ("iommu/vt-d: Report page request faults for guest SVA") Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1604025444-6954-2-git-send-email-yi.y.sun@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-14Merge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds1-3/+10
Pull iommu updates from Joerg Roedel: - ARM-SMMU Updates from Will: - Continued SVM enablement, where page-table is shared with CPU - Groundwork to support integrated SMMU with Adreno GPU - Allow disabling of MSI-based polling on the kernel command-line - Minor driver fixes and cleanups (octal permissions, error messages, ...) - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when a device tries DMA on memory owned by a guest. This needs new fault-types as well as a rewrite of the IOMMU memory semaphore for command completions. - Allow broken Intel IOMMUs (wrong address widths reported) to still be used for interrupt remapping. - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access address spaces of processes running in a VM. - Support for the MT8167 IOMMU in the Mediatek IOMMU driver. - Device-tree updates for the Renesas driver to support r8a7742. - Several smaller fixes and cleanups all over the place. * tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (57 commits) iommu/vt-d: Gracefully handle DMAR units with no supported address widths iommu/vt-d: Check UAPI data processed by IOMMU core iommu/uapi: Handle data and argsz filled by users iommu/uapi: Rename uapi functions iommu/uapi: Use named union for user data iommu/uapi: Add argsz for user filled data docs: IOMMU user API iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate() iommu/arm-smmu-v3: Add SVA device feature iommu/arm-smmu-v3: Check for SVA features iommu/arm-smmu-v3: Seize private ASID iommu/arm-smmu-v3: Share process page tables iommu/arm-smmu-v3: Move definitions to a header iommu/io-pgtable-arm: Move some definitions to a header iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer iommu/amd: Re-purpose Exclusion range registers to support SNP CWWB iommu/amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR iommu/amd: Use 4K page for completion wait write-back semaphore iommu/tegra-smmu: Allow to group clients in same swgroup iommu/tegra-smmu: Fix iova->phys translation ...
2020-10-01iommu/vt-d: Check UAPI data processed by IOMMU coreJacob Pan1-2/+9
IOMMU generic layer already does sanity checks on UAPI data for version match and argsz range based on generic information. This patch adjusts the following data checking responsibilities: - removes the redundant version check from VT-d driver - removes the check for vendor specific data size - adds check for the use of reserved/undefined flags Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/1601051567-54787-7-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-01iommu/uapi: Use named union for user dataJacob Pan1-1/+1
IOMMU UAPI data size is filled by the user space which must be validated by the kernel. To ensure backward compatibility, user data can only be extended by either re-purpose padding bytes or extend the variable sized union at the end. No size change is allowed before the union. Therefore, the minimum size is the offset of the union. To use offsetof() on the union, we must make it named. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/linux-iommu/20200611145518.0c2817d6@x1.home/ Link: https://lore.kernel.org/r/1601051567-54787-4-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-17x86/mmu: Allocate/free a PASIDFenghua Yu1-1/+27
A PASID is allocated for an "mm" the first time any thread binds to an SVA-capable device and is freed from the "mm" when the SVA is unbound by the last thread. It's possible for the "mm" to have different PASID values in different binding/unbinding SVA cycles. The mm's PASID (non-zero for valid PASID or 0 for invalid PASID) is propagated to a per-thread PASID MSR for all threads within the mm through IPI, context switch, or inherited. This is done to ensure that a running thread has the right PASID in the MSR matching the mm's PASID. [ bp: s/SVM/SVA/g; massage. ] Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-10-git-send-email-fenghua.yu@intel.com
2020-09-17iommu/vt-d: Change flags type to unsigned int in binding mmFenghua Yu1-3/+4
"flags" passed to intel_svm_bind_mm() is a bit mask and should be defined as "unsigned int" instead of "int". Change its type to "unsigned int". Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-3-git-send-email-fenghua.yu@intel.com
2020-09-17drm, iommu: Change type of pasid to u32Fenghua Yu1-6/+6
PASID is defined as a few different types in iommu including "int", "u32", and "unsigned int". To be consistent and to match with uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type change in uapi although it defines PASID as __u64 in some places. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
2020-08-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-1/+2
Merge more updates from Andrew Morton: - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction, mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util, memory-hotplug, cleanups, uaccess, migration, gup, pagemap), - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops, checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump, exec, kdump, rapidio, panic, kcov, kgdb, ipc). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits) mm/gup: remove task_struct pointer for all gup code mm: clean up the last pieces of page fault accountings mm/xtensa: use general page fault accounting mm/x86: use general page fault accounting mm/sparc64: use general page fault accounting mm/sparc32: use general page fault accounting mm/sh: use general page fault accounting mm/s390: use general page fault accounting mm/riscv: use general page fault accounting mm/powerpc: use general page fault accounting mm/parisc: use general page fault accounting mm/openrisc: use general page fault accounting mm/nios2: use general page fault accounting mm/nds32: use general page fault accounting mm/mips: use general page fault accounting mm/microblaze: use general page fault accounting mm/m68k: use general page fault accounting mm/ia64: use general page fault accounting mm/hexagon: use general page fault accounting mm/csky: use general page fault accounting ...