aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/iommu/arm-smmu-v3.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-11-12Merge branches 'iommu/fixes', 'arm/qcom', 'arm/renesas', 'arm/rockchip', 'arm/mediatek', 'arm/tegra', 'arm/smmu', 'x86/amd', 'x86/vt-d', 'virtio' and 'core' into nextJoerg Roedel1-6/+6
2019-11-11iommu/arm-smmu-v3: Don't display an error when IRQ lines are missingJean-Philippe Brucker1-4/+4
Since commit 7723f4c5ecdb ("driver core: platform: Add an error message to platform_get_irq*()"), platform_get_irq_byname() displays an error when the IRQ isn't found. Since the SMMUv3 driver uses that function to query which interrupt method is available, the message is now displayed during boot for any SMMUv3 that doesn't implement the combined interrupt, or that implements MSIs. [ 20.700337] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ combined not found [ 20.706508] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ eventq not found [ 20.712503] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ priq not found [ 20.718325] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ gerror not found Use platform_get_irq_byname_optional() to avoid displaying a spurious error. Fixes: 7723f4c5ecdb ("driver core: platform: Add an error message to platform_get_irq*()") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-11-04iommu/io-pgtable-arm: Rationalise MAIR handlingRobin Murphy1-1/+1
Between VMSAv8-64 and the various 32-bit formats, there is either one 64-bit MAIR or a pair of 32-bit MAIR0/MAIR1 or NMRR/PMRR registers. As such, keeping two 64-bit values in io_pgtable_cfg has always been overkill. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-15iommu: Add gfp parameter to iommu_ops::mapTom Murphy1-1/+1
Add a gfp_t parameter to the iommu_ops::map function. Remove the needless locking in the AMD iommu driver. The iommu_ops::map function (or the iommu_map function which calls it) was always supposed to be sleepable (according to Joerg's comment in this thread: https://lore.kernel.org/patchwork/patch/977520/ ) and so should probably have had a "might_sleep()" since it was written. However currently the dma-iommu api can call iommu_map in an atomic context, which it shouldn't do. This doesn't cause any problems because any iommu driver which uses the dma-iommu api uses gfp_atomic in it's iommu_ops::map function. But doing this wastes the memory allocators atomic pools. Signed-off-by: Tom Murphy <murphyt7@tcd.ie> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-17Merge tag 'leds-for-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-ledsLinus Torvalds1-7/+2
Pull LED updates from Jacek Anaszewski: "In this cycle we've finally managed to contribute the patch set sorting out LED naming issues. Besides that there are many changes scattered among various LED class drivers and triggers. LED naming related improvements: - add new 'function' and 'color' fwnode properties and deprecate 'label' property which has been frequently abused for conveying vendor specific names that have been available in sysfs anyway - introduce a set of standard LED_FUNCTION* definitions - introduce a set of standard LED_COLOR_ID* definitions - add a new {devm_}led_classdev_register_ext() API with the capability of automatic LED name composition basing on the properties available in the passed fwnode; the function is backwards compatible in a sense that it uses 'label' data, if present in the fwnode, for creating LED name - add tools/leds/get_led_device_info.sh script for retrieving LED vendor, product and bus names, if applicable; it also performs basic validation of an LED name - update following drivers and their DT bindings to use the new LED registration API: - leds-an30259a, leds-gpio, leds-as3645a, leds-aat1290, leds-cr0014114, leds-lm3601x, leds-lm3692x, leds-lp8860, leds-lt3593, leds-sc27xx-blt Other LED class improvements: - replace {devm_}led_classdev_register() macros with inlines - allow to call led_classdev_unregister() unconditionally - switch to use fwnode instead of be stuck with OF one LED triggers improvements: - led-triggers: - fix dereferencing of null pointer - fix a memory leak bug - ledtrig-gpio: - GPIO 0 is valid Drop superseeded apu2/3 support from leds-apu since for apu2+ a newer, more complete driver exists, based on a generic driver for the AMD SOCs gpio-controller, supporting LEDs as well other devices: - drop profile field from priv data - drop iosize field from priv data - drop enum_apu_led_platform_types - drop superseeded apu2/3 led support - add pr_fmt prefix for better log output - fix error message on probing failure Other misc fixes and improvements to existing LED class drivers: - leds-ns2, leds-max77650: - add of_node_put() before return - leds-pwm, leds-is31fl32xx: - use struct_size() helper - leds-lm3697, leds-lm36274, leds-lm3532: - switch to use fwnode_property_count_uXX() - leds-lm3532: - fix brightness control for i2c mode - change the define for the fs current register - fixes for the driver for stability - add full scale current configuration - dt: Add property for full scale current. - avoid potentially unpaired regulator calls - move static keyword to the front of declarations - fix optional led-max-microamp prop error handling - leds-max77650: - add of_node_put() before return - add MODULE_ALIAS() - Switch to fwnode property API - leds-as3645a: - fix misuse of strlcpy - leds-netxbig: - add of_node_put() in netxbig_leds_get_of_pdata() - remove legacy board-file support - leds-is31fl319x: - simplify getting the adapter of a client - leds-ti-lmu-common: - fix coccinelle issue - move static keyword to the front of declaration - leds-syscon: - use resource managed variant of device register - leds-ktd2692: - fix a typo in the name of a constant - leds-lp5562: - allow firmware files up to the maximum length - leds-an30259a: - fix typo - leds-pca953x: - include the right header" * tag 'leds-for-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: (72 commits) leds: lm3532: Fix optional led-max-microamp prop error handling led: triggers: Fix dereferencing of null pointer leds: ti-lmu-common: Move static keyword to the front of declaration leds: lm3532: Move static keyword to the front of declarations leds: trigger: gpio: GPIO 0 is valid leds: pwm: Use struct_size() helper leds: is31fl32xx: Use struct_size() helper leds: ti-lmu-common: Fix coccinelle issue in TI LMU leds: lm3532: Avoid potentially unpaired regulator calls leds: syscon: Use resource managed variant of device register leds: Replace {devm_}led_classdev_register() macros with inlines leds: Allow to call led_classdev_unregister() unconditionally leds: lm3532: Add full scale current configuration dt: lm3532: Add property for full scale current. leds: lm3532: Fixes for the driver for stability leds: lm3532: Change the define for the fs current register leds: lm3532: Fix brightness control for i2c mode leds: Switch to use fwnode instead of be stuck with OF one leds: max77650: Switch to fwnode property API led: triggers: Fix a memory leak bug ...
2019-09-03iommu/arm-smmu-v3: Fix build error without CONFIG_PCI_ATSYueHaibing1-0/+7
If CONFIG_PCI_ATS is not set, building fails: drivers/iommu/arm-smmu-v3.c: In function arm_smmu_ats_supported: drivers/iommu/arm-smmu-v3.c:2325:35: error: struct pci_dev has no member named ats_cap; did you mean msi_cap? return !pdev->untrusted && pdev->ats_cap; ^~~~~~~ ats_cap should only used when CONFIG_PCI_ATS is defined, so use #ifdef block to guard this. Fixes: bfff88ec1afe ("iommu/arm-smmu-v3: Rework enabling/disabling of ATS for PCI masters") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-23Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmuJoerg Roedel1-237/+736
2019-08-22Revert "iommu/arm-smmu-v3: Disable detection of ATS and PRI"Will Deacon1-2/+0
This reverts commit b5e86196b83fd68e065a7c811ab8925fb0dc3893. Now that ATC invalidation is performed in the correct places and without incurring a locking overhead for non-ATS systems, we can re-enable the corresponding SMMU feature detection. Signed-off-by: Will Deacon <will@kernel.org>
2019-08-22iommu/arm-smmu-v3: Avoid locking on invalidation path when not using ATSWill Deacon1-5/+32
When ATS is not in use, we can avoid taking the 'devices_lock' for the domain on the invalidation path by simply caching the number of ATS masters currently attached. The fiddly part is handling a concurrent ->attach() of an ATS-enabled master to a domain that is being invalidated, but we can handle this using an 'smp_mb()' to ensure that our check of the count is ordered after completion of our prior TLB invalidation. This also makes our ->attach() and ->detach() flows symmetric wrt ATS interactions. Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-21iommu/arm-smmu-v3: Fix ATC invalidation ordering wrt main TLBsWill Deacon1-7/+9
When invalidating the ATC for an PCIe endpoint using ATS, we must take care to complete invalidation of the main SMMU TLBs beforehand, otherwise the device could immediately repopulate its ATC with stale translations. Hooking the ATC invalidation into ->unmap() as we currently do does the exact opposite: it ensures that the ATC is invalidated *before* the main TLBs, which is bogus. Move ATC invalidation into the actual (leaf) invalidation routines so that it is always called after completing main TLB invalidation. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-21iommu/arm-smmu-v3: Rework enabling/disabling of ATS for PCI mastersWill Deacon1-19/+28
To prevent any potential issues arising from speculative Address Translation Requests from an ATS-enabled PCIe endpoint, rework our ATS enabling/disabling logic so that we enable ATS at the SMMU before we enable it at the endpoint, and disable things in the opposite order. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-21iommu/arm-smmu-v3: Don't issue CMD_SYNC for zero-length invalidationsWill Deacon1-0/+3
Calling arm_smmu_tlb_inv_range() with a size of zero, perhaps due to an empty 'iommu_iotlb_gather' structure, should be a NOP. Elide the CMD_SYNC when there is no invalidation to be performed. Signed-off-by: Will Deacon <will@kernel.org>
2019-08-21iommu/arm-smmu-v3: Remove boolean bitfield for 'ats_enabled' flagWill Deacon1-1/+1
There's really no need for this to be a bitfield, particularly as we don't have bitwise addressing on arm64. Signed-off-by: Will Deacon <will@kernel.org>
2019-08-21iommu/arm-smmu-v3: Disable detection of ATS and PRIWill Deacon1-0/+2
Detecting the ATS capability of the SMMU at probe time introduces a spinlock into the ->unmap() fast path, even when ATS is not actually in use. Furthermore, the ATC invalidation that exists is broken, as it occurs before invalidation of the main SMMU TLB which leaves a window where the ATC can be repopulated with stale entries. Given that ATS is both a new feature and a specialist sport, disable it for now whilst we fix it properly in subsequent patches. Since PRI requires ATS, disable that too. Cc: <stable@vger.kernel.org> Fixes: 9ce27afc0830 ("iommu/arm-smmu-v3: Add support for PCI ATS") Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-21iommu/arm-smmu-v3: Document ordering guarantees of command insertionWill Deacon1-0/+16
It turns out that we've always relied on some subtle ordering guarantees when inserting commands into the SMMUv3 command queue. With the recent changes to elide locking when possible, these guarantees become more subtle and even more important. Add a comment documented the barrier semantics of command insertion so that we don't have to derive the behaviour from scratch each time it comes up on the list. Signed-off-by: Will Deacon <will@kernel.org>
2019-08-08iommu/arm-smmu-v3: Defer TLB invalidation until ->iotlb_sync()Will Deacon1-29/+42
Update the iommu_iotlb_gather structure passed to ->tlb_add_page() and use this information to defer all TLB invalidation until ->iotlb_sync(). This drastically reduces contention on the command queue, since we can insert our commands in batches rather than one-by-one. Tested-by: Ganapatrao Kulkarni <gkulkarni@marvell.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-08iommu/arm-smmu-v3: Reduce contention during command-queue insertionWill Deacon1-144/+533
The SMMU command queue is a bottleneck in large systems, thanks to the spin_lock which serialises accesses from all CPUs to the single queue supported by the hardware. Attempt to improve this situation by moving to a new algorithm for inserting commands into the queue, which is lock-free on the fast-path. Tested-by: Ganapatrao Kulkarni <gkulkarni@marvell.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-06iommu/arm-smmu: Mark expected switch fall-throughAnders Roxell1-2/+2
Now that -Wimplicit-fallthrough is passed to GCC by default, the following warning shows up: ../drivers/iommu/arm-smmu-v3.c: In function ‘arm_smmu_write_strtab_ent’: ../drivers/iommu/arm-smmu-v3.c:1189:7: warning: this statement may fall through [-Wimplicit-fallthrough=] if (disable_bypass) ^ ../drivers/iommu/arm-smmu-v3.c:1191:3: note: here default: ^~~~~~~ Rework so that the compiler doesn't warn about fall-through. Make it clearer by calling 'BUG_ON()' when disable_bypass is set, and always 'break;' Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-30drivers: Introduce device lookup variants by fwnodeSuzuki K Poulose1-7/+2
Add a helper to match the firmware node handle of a device and provide wrappers for {bus/class/driver}_find_device() APIs to avoid proliferation of duplicate custom match functions. Cc: "David S. Miller" <davem@davemloft.net> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: linux-usb@vger.kernel.org Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Joe Perches <joe@perches.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20190723221838.12024-4-suzuki.poulose@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-29iommu/arm-smmu-v3: Operate directly on low-level queue where possibleWill Deacon1-27/+31
In preparation for rewriting the command queue insertion code to use a new algorithm, rework many of our queue macro accessors and manipulation functions so that they operate on the arm_smmu_ll_queue structure where possible. This will allow us to call these helpers on local variables without having to construct a full-blown arm_smmu_queue on the stack. No functional change. Tested-by: Ganapatrao Kulkarni <gkulkarni@marvell.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/arm-smmu-v3: Move low-level queue fields out of arm_smmu_queueWill Deacon1-41/+47
In preparation for rewriting the command queue insertion code to use a new algorithm, introduce a new arm_smmu_ll_queue structure which contains only the information necessary to perform queue arithmetic for a queue and will later be extended so that we can perform complex atomic manipulation on some of the fields. No functional change. Tested-by: Ganapatrao Kulkarni <gkulkarni@marvell.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/arm-smmu-v3: Drop unused 'q' argument from Q_OVF macroWill Deacon1-6/+6
The Q_OVF macro doesn't need to access the arm_smmu_queue structure, so drop the unused macro argument. No functional change. Tested-by: Ganapatrao Kulkarni <gkulkarni@marvell.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/arm-smmu-v3: Separate s/w and h/w views of prod and cons indexesWill Deacon1-14/+22
In preparation for rewriting the command queue insertion code to use a new algorithm, separate the software and hardware views of the prod and cons indexes so that manipulating the software state doesn't automatically update the hardware state at the same time. No functional change. Tested-by: Ganapatrao Kulkarni <gkulkarni@marvell.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->tlb_add_page()Will Deacon1-1/+2
With all the pieces in place, we can finally propagate the iommu_iotlb_gather structure from the call to unmap() down to the IOMMU drivers' implementation of ->tlb_add_page(). Currently everybody ignores it, but the machinery is now there to defer invalidation. Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->unmap()Will Deacon1-1/+1
Update the io-pgtable ->unmap() function to take an iommu_iotlb_gather pointer as an argument, and update the callers as appropriate. Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/io-pgtable: Remove unused ->tlb_sync() callbackWill Deacon1-8/+0
The ->tlb_sync() callback is no longer used, so it can be removed. Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/io-pgtable: Replace ->tlb_add_flush() with ->tlb_add_page()Will Deacon1-1/+7
The ->tlb_add_flush() callback in the io-pgtable API now looks a bit silly: - It takes a size and a granule, which are always the same - It takes a 'bool leaf', which is always true - It only ever flushes a single page With that in mind, replace it with an optional ->tlb_add_page() callback that drops the useless parameters. Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu/io-pgtable: Hook up ->tlb_flush_walk() and ->tlb_flush_leaf() in driversWill Deacon1-0/+22
Hook up ->tlb_flush_walk() and ->tlb_flush_leaf() in drivers using the io-pgtable API so that we can start making use of them in the page-table code. For now, they can just wrap the implementations of ->tlb_add_flush and ->tlb_sync pending future optimisation in each driver. Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync()Will Deacon1-3/+4
To allow IOMMU drivers to batch up TLB flushing operations and postpone them until ->iotlb_sync() is called, extend the prototypes for the ->unmap() and ->iotlb_sync() IOMMU ops callbacks to take a pointer to the current iommu_iotlb_gather structure. All affected IOMMU drivers are updated, but there should be no functional change since the extra parameter is ignored for now. Signed-off-by: Will Deacon <will@kernel.org>
2019-07-24iommu/io-pgtable: Rename iommu_gather_ops to iommu_flush_opsWill Deacon1-2/+2
In preparation for TLB flush gathering in the IOMMU API, rename the iommu_gather_ops structure in io-pgtable to iommu_flush_ops, which better describes its purpose and avoids the potential for confusion between different levels of the API. $ find linux/ -type f -name '*.[ch]' | xargs sed -i 's/gather_ops/flush_ops/g' Signed-off-by: Will Deacon <will@kernel.org>
2019-07-12Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds1-1/+1
Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...
2019-07-04iommu/arm-smmu-v3: Invalidate ATC when detaching a deviceJean-Philippe Brucker1-1/+4
We make the invalid assumption in arm_smmu_detach_dev() that the ATC is clear after calling pci_disable_ats(). For one thing, only enabling the PCIe ATS capability constitutes an implicit invalidation event, so the comment was wrong. More importantly, the ATS capability isn't necessarily disabled by pci_disable_ats() in a PF, if the associated VFs have ATS enabled. Explicitly invalidate all ATC entries in arm_smmu_detach_dev(). The endpoint cannot form new ATC entries because STE.EATS is clear. Fixes: 9ce27afc0830 ("iommu/arm-smmu-v3: Add support for PCI ATS") Reported-by: Manoj Kumar <Manoj.Kumar3@arm.com> Reported-by: Robin Murphy <Robin.Murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-02iommu/arm-smmu-v3: Fix compilation when CONFIG_CMA=nWill Deacon1-0/+6
When compiling a kernel without support for CMA, CONFIG_CMA_ALIGNMENT is not defined which results in the following build failure: In file included from ./include/linux/list.h:9:0 from ./include/linux/kobject.h:19, from ./include/linux/of.h:17 from ./include/linux/irqdomain.h:35, from ./include/linux/acpi.h:13, from drivers/iommu/arm-smmu-v3.c:12: drivers/iommu/arm-smmu-v3.c: In function ‘arm_smmu_device_hw_probe’: drivers/iommu/arm-smmu-v3.c:194:40: error: ‘CONFIG_CMA_ALIGNMENT’ undeclared (first use in this function) #define Q_MAX_SZ_SHIFT (PAGE_SHIFT + CONFIG_CMA_ALIGNMENT) Fix the breakage by capping the maximum queue size based on MAX_ORDER when CMA is not enabled. Reported-by: Zhangshaokun <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org> Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-25iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flagWill Deacon1-3/+1
IO_PGTABLE_QUIRK_NO_DMA is a bit of a misnomer, since it's really just an indication of whether or not the page-table walker for the IOMMU is coherent with the CPU caches. Since cache coherency is more than just a quirk, replace the flag with its own field in the io_pgtable_cfg structure. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-24driver_find_device: Unify the match function with class_find_device()Suzuki K Poulose1-1/+1
The driver_find_device() accepts a match function pointer to filter the devices for lookup, similar to bus/class_find_device(). However, there is a minor difference in the prototype for the match parameter for driver_find_device() with the now unified version accepted by {bus/class}_find_device(), where it doesn't accept a "const" qualifier for the data argument. This prevents us from reusing the generic match functions for driver_find_device(). For this reason, change the prototype of the driver_find_device() to make the "match" parameter in line with {bus/class}_find_device() and adjust its callers to use the const qualifier. Also, we could now promote the "data" parameter to const as we pass it down as a const parameter to the match functions. Cc: Corey Minyard <minyard@acm.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Sebastian Ott <sebott@linux.ibm.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Nehal Shah <nehal-bakulchandra.shah@amd.com> Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-18iommu/arm-smmu-v3: Increase maximum size of queuesWill Deacon1-16/+38
We've been artificially limiting the size of our queues to 4k so that we don't end up allocating huge amounts of physically-contiguous memory at probe time. However, 4k is only enough for 256 commands in the command queue, so instead let's try to allocate the largest queue that the SMMU supports, retrying with a smaller size if the allocation fails. The caveat here is that we have to limit our upper bound based on CONFIG_CMA_ALIGNMENT to ensure that our queue allocations remain natually aligned, which is required by the SMMU architecture. Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23iommu/arm-smmu-v3: Don't disable SMMU in kdump kernelWill Deacon1-6/+4
Disabling the SMMU when probing from within a kdump kernel so that all incoming transactions are terminated can prevent the core of the crashed kernel from being transferred off the machine if all I/O devices are behind the SMMU. Instead, continue to probe the SMMU after it is disabled so that we can reinitialise it entirely and re-attach the DMA masters as they are reset. Since the kdump kernel may not have drivers for all of the active DMA masters, we suppress fault reporting to avoid spamming the console and swamping the IRQ threads. Reported-by: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com> Tested-by: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23iommu/arm-smmu-v3: Disable tagged pointersJean-Philippe Brucker1-1/+0
The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the MMU mask out bits [63:56] of an address, allowing a userspace application to store data in its pointers. This option is incompatible with PCI ATS. If TBI is enabled in the SMMU and userspace triggers DMA transactions on tagged pointers, the endpoint might create ATC entries for addresses that include a tag. Software would then have to send ATC invalidation packets for each 255 possible alias of an address, or just wipe the whole address space. This is not a viable option, so disable TBI. The impact of this change is unclear, since there are very few users of tagged pointers, much less SVA. But the requirement introduced by this patch doesn't seem excessive: a userspace application using both tagged pointers and SVA should now sanitize addresses (clear the tag) before using them for device DMA. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23iommu/arm-smmu-v3: Add support for PCI ATSJean-Philippe Brucker1-6/+195
PCIe devices can implement their own TLB, named Address Translation Cache (ATC). Enable Address Translation Service (ATS) for devices that support it and send them invalidation requests whenever we invalidate the IOTLBs. ATC invalidation is allowed to take up to 90 seconds, according to the PCIe spec, so it is possible to get a SMMU command queue timeout during normal operations. However we expect implementations to complete invalidation in reasonable time. We only enable ATS for "trusted" devices, and currently rely on the pci_dev->untrusted bit. For ATS we have to trust that: (a) The device doesn't issue "translated" memory requests for addresses that weren't returned by the SMMU in a Translation Completion. In particular, if we give control of a device or device partition to a VM or userspace, software cannot program the device to access arbitrary "translated" addresses. (b) The device follows permissions granted by the SMMU in a Translation Completion. If the device requested read+write permission and only got read, then it doesn't write. (c) The device doesn't send Translated transactions for an address that was invalidated by an ATC invalidation. Note that the PCIe specification explicitly requires all of these, so we can assume that implementations will cleanly shield ATCs from software. All ATS translated requests still go through the SMMU, to walk the stream table and check that the device is actually allowed to send translated requests. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23iommu/arm-smmu-v3: Link domains and devicesJean-Philippe Brucker1-1/+20
When removing a mapping from a domain, we need to send an invalidation to all devices that might have stored it in their Address Translation Cache (ATC). In addition when updating the context descriptor of a live domain, we'll need to send invalidations for all devices attached to it. Maintain a list of devices in each domain, protected by a spinlock. It is updated every time we attach or detach devices to and from domains. It needs to be a spinlock because we'll invalidate ATC entries from within hardirq-safe contexts, but it may be possible to relax the read side with RCU later. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23iommu/arm-smmu-v3: Add a master->domain pointerJean-Philippe Brucker1-47/+45
As we're going to track domain-master links more closely for ATS and CD invalidation, add pointer to the attached domain in struct arm_smmu_master. As a result, arm_smmu_strtab_ent is redundant and can be removed. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23iommu/arm-smmu-v3: Store SteamIDs in masterJean-Philippe Brucker1-15/+15
Simplify the attach/detach code a bit by keeping a pointer to the stream IDs in the master structure. Although not completely obvious here, it does make the subsequent support for ATS, PRI and PASID a bit simpler. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-23iommu/arm-smmu-v3: Rename arm_smmu_master_data to arm_smmu_masterJean-Philippe Brucker1-6/+6
The arm_smmu_master_data structure already represents more than just the firmware data associated to a master, and will be used extensively to represent a device's state when implementing more SMMU features. Rename the structure to arm_smmu_master. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-02-11iommu: Allow io-pgtable to be used outside of drivers/iommu/Rob Herring1-2/+1
Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to use the page table library. Specifically, some ARM Mali GPUs use the ARM page table formats. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: iommu@lists.linux-foundation.org Cc: linux-mediatek@lists.infradead.org Cc: linux-arm-msm@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-12-20Merge branches 'iommu/fixes', 'arm/renesas', 'arm/mediatek', 'arm/tegra', 'arm/omap', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into nextJoerg Roedel1-26/+37
2018-12-17iommu/arm-smmu: Use helper functions to access dev->iommu_fwspecJoerg Roedel1-7/+9
Use the new helpers dev_iommu_fwspec_get()/set() to access the dev->iommu_fwspec pointer. This makes it easier to move that pointer later into another struct. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-12-10iommu/arm-smmu-v3: Use explicit mb() when moving cons pointerWill Deacon1-1/+7
After removing an entry from a queue (e.g. reading an event in arm_smmu_evtq_thread()) it is necessary to advance the MMIO consumer pointer to free the queue slot back to the SMMU. A memory barrier is required here so that all reads targetting the queue entry have completed before the consumer pointer is updated. The implementation of queue_inc_cons() relies on a writel() to complete the previous reads, but this is incorrect because writel() is only guaranteed to complete prior writes. This patch replaces the call to writel() with an mb(); writel_relaxed() sequence, which gives us the read->write ordering which we require. Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10iommu/arm-smmu-v3: Avoid memory corruption from Hisilicon MSI payloadsZhen Lei1-1/+5
The GITS_TRANSLATER MMIO doorbell register in the ITS hardware is architected to be 4 bytes in size, yet on hi1620 and earlier, Hisilicon have allocated the adjacent 4 bytes to carry some IMPDEF sideband information which results in an 8-byte MSI payload being delivered when signalling an interrupt: MSIAddr: |----4bytes----|----4bytes----| | MSIData | IMPDEF | This poses no problem for the ITS hardware because the adjacent 4 bytes are reserved in the memory map. However, when delivering MSIs to memory, as we do in the SMMUv3 driver for signalling the completion of a SYNC command, the extended payload will corrupt the 4 bytes adjacent to the "sync_count" member in struct arm_smmu_device. Fortunately, the current layout allocates these bytes to padding, but this is fragile and we should make this explicit. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> [will: Rewrote commit message and comment] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10iommu/arm-smmu-v3: Fix big-endian CMD_SYNC writesRobin Murphy1-1/+7
When we insert the sync sequence number into the CMD_SYNC.MSIData field, we do so in CPU-native byte order, before writing out the whole command as explicitly little-endian dwords. Thus on big-endian systems, the SMMU will receive and write back a byteswapped version of sync_nr, which would be perfect if it were targeting a similarly-little-endian ITS, but since it's actually writing back to memory being polled by the CPUs, they're going to end up seeing the wrong thing. Since the SMMU doesn't care what the MSIData actually contains, the minimal-overhead solution is to simply add an extra byteswap initially, such that it then writes back the big-endian format directly. Cc: <stable@vger.kernel.org> Fixes: 37de98f8f1cf ("iommu/arm-smmu-v3: Use CMD_SYNC completion MSI") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-03iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modularPaul Gortmaker1-16/+9
The Kconfig currently controlling compilation of this code is: drivers/iommu/Kconfig:config ARM_SMMU_V3 drivers/iommu/Kconfig: bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_platform_driver() uses the same init level priority as builtin_platform_driver() the init ordering remains unchanged with this commit. We explicitly disallow a driver unbind, since that doesn't have a sensible use case anyway, but unlike most drivers, we can't delete the function tied to the ".remove" field. This is because as of commit 7aa8619a66ae ("iommu/arm-smmu-v3: Implement shutdown method") the .remove function was given a one line wrapper and re-used to provide a .shutdown service. So we delete the wrapper and re-name the function from remove to shutdown. We add a moduleparam.h include since the file does actually declare some module parameters, and leaving them as such is the easiest way currently to remain backwards compatible with existing use cases. Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Cc: Will Deacon <will.deacon@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Nate Watterson <nwatters@codeaurora.org> Cc: linux-arm-kernel@lists.infradead.org Cc: iommu@lists.linux-foundation.org Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>