Age | Commit message (Collapse) | Author | Files | Lines |
|
It is not used outside iova.c
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1607020492-189471-3-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Function split_and_remove_iova() has not been referenced since commit
e70b081c6f37 ("iommu/vt-d: Remove IOVA handling code from the non-dma_ops
path"), so delete it.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1607020492-189471-2-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The 'level' parameter to the iopte_type() macro is unused, so remove it.
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Link: https://lore.kernel.org/r/20201207120150.1891-1-jiangkunkun@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Although handling a mapping request with no permissions is a
trivial no-op, defer the early return until after the size/range
checks so that we are consistent with other mapping requests.
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Link: https://lore.kernel.org/r/20201207115758.9400-1-zhukeqian1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently direct_mapping always use the smallest pgsize which is SZ_4K
normally to mapping. This is unnecessary. we could gather the size, and
call iommu_map then, iommu_map could decide how to map better with the
just right pgsize.
>From the original comment, we should take care overlap, otherwise,
iommu_map may return -EEXIST. In this overlap case, we should map the
previous region before overlap firstly. then map the left part.
Each a iommu device will call this direct_mapping when its iommu
initialize, This patch is effective to improve the boot/initialization
time especially while it only needs level 1 mapping.
Signed-off-by: Anan Sun <anan.sun@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Link: https://lore.kernel.org/r/20201207093553.8635-1-yong.wu@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into one critical section by calling the unlock
versions instead.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1605608734-84416-5-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
There is no reason to use GFP_ATOMIC in a 'suspend' function.
Use GFP_KERNEL instead to give more opportunities to allocate the
requested memory.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201030182630.5154-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201201013149.2466272-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/iommu/intel/iommu.c:5643:27: warning: variable 'last_pfn' set but not used [-Wunused-but-set-variable]
5643 | unsigned long start_pfn, last_pfn;
| ^~~~~~~~
This variable is never used, so remove it.
Fixes: 2a2b8eaa5b25 ("iommu: Handle freelists when using deferred flushing in iommu drivers")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201127013308.1833610-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Although iommu_group_get() in iommu_probe_device() will always succeed
thanks to __iommu_probe_device() creating the group if it's not present,
it's still worth initialising 'ret' to -ENODEV in case this path is
reachable in the future.
For now, this patch results in no functional change.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20201126133825.3643852-1-yangyingliang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Below warnings are fixed:
Documentation/ABI/testing/sysfs-kernel-iommu_groups:38: WARNING: Unexpected indentation.
Documentation/ABI/testing/sysfs-kernel-iommu_groups:38: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/ABI/testing/sysfs-kernel-iommu_groups:38: WARNING: Enumerated list ends without a blank line; unexpected unindent.
Documentation/ABI/testing/sysfs-kernel-iommu_groups:38: WARNING: Unexpected indentation.
Documentation/ABI/testing/sysfs-kernel-iommu_groups:38: WARNING: Block quote ends without a blank line; unexpected unindent.
Fixes: 63a816749d86 ("iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/linux-next/20201126174851.200e0e58@canb.auug.org.au/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201126090603.1511589-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Fix the checkpatch warning for space required before the open
parenthesis.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/0b4c3718a87992f11340a1cdd99fd746c647e485.1606287059.git.saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Use table and of_match_node() to match qcom implementation
instead of multiple of_device_compatible() calls for each
QCOM SMMU implementation.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/4e11899bc02102a6e6155db215911e8b5aaba950.1606287059.git.saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Now that we have a struct io_pgtable_domain_attr with quirks,
use that for non_strict mode as well thereby removing the need
for more members of arm_smmu_domain in the future.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Link: https://lore.kernel.org/r/c191265f3db1f6b3e136d4057ca917666680a066.1606287059.git.saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add support for domain attribute DOMAIN_ATTR_IO_PGTABLE_CFG
to get/set pagetable configuration data which initially will
be used to set quirks and later can be extended to include
other pagetable configuration data.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Link: https://lore.kernel.org/r/2ab52ced2f853115c32461259a075a2877feffa6.1606287059.git.saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add a quirk IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to override
the outer-cacheability attributes set in the TCR for a
non-coherent page table walker when using system cache.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Link: https://lore.kernel.org/r/f818676b4a2a9ad1edb92721947d47db41ed6a7c.1606287059.git.saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add a new iommu domain attribute DOMAIN_ATTR_IO_PGTABLE_CFG
for pagetable configuration which initially will be used to
set quirks like for system cache aka last level cache to be
used by client drivers like GPU to set right attributes for
caching the hardware pagetables into the system cache and
later can be extended to include other page table configuration
data.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Link: https://lore.kernel.org/r/9190aa16f378fc0a7f8e57b2b9f60b033e7eeb4f.1606287059.git.saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The default domain type of an iommu group can be changed by writing to
"/sys/kernel/iommu_groups/<grp_id>/type" file. Hence, document it's usage
and more importantly spell out its limitations.
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20201124130604.2912899-5-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
"/sys/kernel/iommu_groups/<grp_id>/type" file could be read to find out the
default domain type of an iommu group. The default domain of an iommu group
doesn't change after booting and hence could be read directly. But,
after addding support to dynamically change iommu group default domain, the
above assumption no longer stays valid.
iommu group default domain type could be changed at any time by writing to
"/sys/kernel/iommu_groups/<grp_id>/type". So, take group mutex before
reading iommu group default domain type so that the user wouldn't see stale
values or iommu_group_show_type() doesn't try to derefernce stale pointers.
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20201124130604.2912899-4-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Presently, the default domain of an iommu group is allocated during boot
time and it cannot be changed later. So, the device would typically be
either in identity (also known as pass_through) mode or the device would be
in DMA mode as long as the machine is up and running. There is no way to
change the default domain type dynamically i.e. after booting, a device
cannot switch between identity mode and DMA mode.
But, assume a use case wherein the user trusts the device and believes that
the OS is secure enough and hence wants *only* this device to bypass IOMMU
(so that it could be high performing) whereas all the other devices to go
through IOMMU (so that the system is protected). Presently, this use case
is not supported. It will be helpful if there is some way to change the
default domain of an iommu group dynamically. Hence, add such support.
A privileged user could request the kernel to change the default domain
type of a iommu group by writing to
"/sys/kernel/iommu_groups/<grp_id>/type" file. Presently, only three values
are supported
1. identity: all the DMA transactions from the device in this group are
*not* translated by the iommu
2. DMA: all the DMA transactions from the device in this group are
translated by the iommu
3. auto: change to the type the device was booted with
Note:
1. Default domain of an iommu group with two or more devices cannot be
changed.
2. The device in the iommu group shouldn't be bound to any driver.
3. The device shouldn't be assigned to user for direct access.
4. The change request will fail if any device in the group has a mandatory
default domain type and the requested one conflicts with that.
Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more
information.
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20201124130604.2912899-3-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
So that the vendor iommu drivers are no more required to provide the
def_domain_type callback to always isolate the untrusted devices.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/linux-iommu/243ce89c33fe4b9da4c56ba35acebf81@huawei.com/
Link: https://lore.kernel.org/r/20201124130604.2912899-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Some cleanups after converting the driver to use dma-iommu ops.
- Remove nobounce option;
- Cleanup and simplify the path in domain mapping.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-8-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Convert the intel iommu driver to the dma-iommu api. Remove the iova
handling and reserve region code from the intel iommu driver.
Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-7-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The iommu-dma constrains IOVA allocation based on the domain geometry
that the driver reports. Update domain geometry everytime a domain is
attached to or detached from a device.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-6-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Combining the sg segments exposes a bug in the Intel i915 driver which
causes visual artifacts and the screen to freeze. This is most likely
because of how the i915 handles the returned list. It probably doesn't
respect the returned value specifying the number of elements in the list
and instead depends on the previous behaviour of the Intel iommu driver
which would return the same number of elements in the output list as in
the input list.
[ This has been fixed in the i915 tree, but we agreed to carry this fix
temporarily in the iommu tree and revert it before 5.11 is released:
https://lore.kernel.org/linux-iommu/20201103105442.GD22888@8bytes.org/
-- Will ]
Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-5-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Allow the dma-iommu api to use bounce buffers for untrusted devices.
This is a copy of the intel bounce buffer code.
Co-developed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-4-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add a iommu_dma_free_cpu_cached_iovas function to allow drivers which
use the dma-iommu ops to free cached cpu iovas.
Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-3-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Allow the iommu_unmap_fast to return newly freed page table pages and
pass the freelist to queue_iova in the dma-iommu ops path.
This is useful for iommu drivers (in this case the intel iommu driver)
which need to wait for the ioTLB to be flushed before newly
free/unmapped page table pages can be freed. This way we can still batch
ioTLB free operations and handle the freelists.
Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This patch simply adds support for PCI devices.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20201125101013.14953-6-nicoleotsuka@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The bus_set_iommu() in tegra_smmu_probe() enumerates all clients
to call in tegra_smmu_probe_device() where each client searches
its DT node for smmu pointer and swgroup ID, so as to configure
an fwspec. But this requires a valid smmu pointer even before mc
and smmu drivers are probed. So in tegra_smmu_probe() we added a
line of code to fill mc->smmu, marking "a bit of a hack".
This works for most of clients in the DTB, however, doesn't work
for a client that doesn't exist in DTB, a PCI device for example.
Actually, if we return ERR_PTR(-ENODEV) in ->probe_device() when
it's called from bus_set_iommu(), iommu core will let everything
carry on. Then when a client gets probed, of_iommu_configure() in
iommu core will search DTB for swgroup ID and call ->of_xlate()
to prepare an fwspec, similar to tegra_smmu_probe_device() and
tegra_smmu_configure(). Then it'll call tegra_smmu_probe_device()
again, and this time we shall return smmu->iommu pointer properly.
So we can get rid of tegra_smmu_find() and tegra_smmu_configure()
along with DT polling code by letting the iommu core handle every
thing, except a problem that we search iommus property in DTB not
only for swgroup ID but also for mc node to get mc->smmu pointer
to call dev_iommu_priv_set() and return the smmu->iommu pointer.
So we'll need to find another way to get smmu pointer.
Referencing the implementation of sun50i-iommu driver, of_xlate()
has client's dev pointer, mc node and swgroup ID. This means that
we can call dev_iommu_priv_set() in of_xlate() instead, so we can
simply get smmu pointer in ->probe_device().
This patch reworks tegra_smmu_probe_device() by:
1) Removing mc->smmu hack in tegra_smmu_probe() so as to return
ERR_PTR(-ENODEV) in tegra_smmu_probe_device() during stage of
tegra_smmu_probe/tegra_mc_probe().
2) Moving dev_iommu_priv_set() to of_xlate() so we can get smmu
pointer in tegra_smmu_probe_device() to replace DTB polling.
3) Removing tegra_smmu_configure() accordingly since iommu core
takes care of it.
This also fixes a problem that previously we could add clients to
iommu groups before iommu core initializes its default domain:
ubuntu@jetson:~$ dmesg | grep iommu
platform 50000000.host1x: Adding to iommu group 1
platform 57000000.gpu: Adding to iommu group 2
iommu: Default domain type: Translated
platform 54200000.dc: Adding to iommu group 3
platform 54240000.dc: Adding to iommu group 3
platform 54340000.vic: Adding to iommu group 4
Though it works fine with IOMMU_DOMAIN_UNMANAGED, but will have
warnings if switching to IOMMU_DOMAIN_DMA:
iommu: Failed to allocate default IOMMU domain of type 0 for
group (null) - Falling back to IOMMU_DOMAIN_DMA
iommu: Failed to allocate default IOMMU domain of type 0 for
group (null) - Falling back to IOMMU_DOMAIN_DMA
Now, bypassing the first probe_device() call from bus_set_iommu()
fixes the sequence:
ubuntu@jetson:~$ dmesg | grep iommu
iommu: Default domain type: Translated
tegra-host1x 50000000.host1x: Adding to iommu group 0
tegra-dc 54200000.dc: Adding to iommu group 1
tegra-dc 54240000.dc: Adding to iommu group 1
tegra-vic 54340000.vic: Adding to iommu group 2
nouveau 57000000.gpu: Adding to iommu group 3
Note that dmesg log above is testing with IOMMU_DOMAIN_UNMANAGED.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20201125101013.14953-5-nicoleotsuka@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In tegra_smmu_(de)attach_dev() functions, we poll DTB for each
client's iommus property to get swgroup ID in order to prepare
"as" and enable smmu. Actually tegra_smmu_configure() prepared
an fwspec for each client, and added to the fwspec all swgroup
IDs of client DT node in DTB.
So this patch uses fwspec in tegra_smmu_(de)attach_dev() so as
to replace the redundant DT polling code.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20201125101013.14953-4-nicoleotsuka@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This is used to protect potential race condition at use_count.
since probes of client drivers, calling attach_dev(), may run
concurrently.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20201125101013.14953-3-nicoleotsuka@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The tegra_smmu_group_get was added to group devices in different
SWGROUPs and it'd return a NULL group pointer upon a mismatch at
tegra_smmu_find_group(), so for most of clients/devices, it very
likely would mismatch and need a fallback generic_device_group().
But now tegra_smmu_group_get handles devices in same SWGROUP too,
which means that it would allocate a group for every new SWGROUP
or would directly return an existing one upon matching a SWGROUP,
i.e. any device will go through this function.
So possibility of having a NULL group pointer in device_group()
is upon failure of either devm_kzalloc() or iommu_group_alloc().
In either case, calling generic_device_group() no longer makes a
sense. Especially for devm_kzalloc() failing case, it'd cause a
problem if it fails at devm_kzalloc() yet succeeds at a fallback
generic_device_group(), because it does not create a group->list
for other devices to match.
This patch simply unwraps the function to clean it up.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20201125101013.14953-2-nicoleotsuka@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The invalidate_range() notifier is called for any change to the address
space. Perform the required ATC invalidations.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20201106155048.997886-5-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The sva_bind() function allows devices to access process address spaces
using a PASID (aka SSID).
(1) bind() allocates or gets an existing MMU notifier tied to the
(domain, mm) pair. Each mm gets one PASID.
(2) Any change to the address space calls invalidate_range() which sends
ATC invalidations (in a subsequent patch).
(3) When the process address space dies, the release() notifier disables
the CD to allow reclaiming the page tables. Since release() has to
be light we do not instruct device drivers to stop DMA here, we just
ignore incoming page faults from this point onwards.
To avoid any event 0x0a print (C_BAD_CD) we disable translation
without clearing CD.V. PCIe Translation Requests and Page Requests
are silently denied. Don't clear the R bit because the S bit can't
be cleared when STALL_MODEL==0b10 (forced), and clearing R without
clearing S is useless. Faulting transactions will stall and will be
aborted by the IOPF handler.
(4) After stopping DMA, the device driver releases the bond by calling
unbind(). We release the MMU notifier, free the PASID and the bond.
Three structures keep track of bonds:
* arm_smmu_bond: one per {device, mm} pair, the handle returned to the
device driver for a bind() request.
* arm_smmu_mmu_notifier: one per {domain, mm} pair, deals with ATS/TLB
invalidations and clearing the context descriptor on mm exit.
* arm_smmu_ctx_desc: one per mm, holds the pinned ASID and pgd.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20201106155048.997886-4-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Let IOMMU drivers allocate a single PASID per mm. Store the mm in the
IOASID set to allow refcounting and searching mm by PASID, when handling
an I/O page fault.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201106155048.997886-3-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Let IOASID users take references to existing ioasids with ioasid_get().
ioasid_put() drops a reference and only frees the ioasid when its
reference number is zero. It returns true if the ioasid was freed.
For drivers that don't call ioasid_get(), ioasid_put() is the same as
ioasid_free().
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201106155048.997886-2-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
iommu_sva_unbind_device has no return value.
Remove the description of the return value of the function.
Signed-off-by: Chen Jun <c00424029@huawei.com>
Link: https://lore.kernel.org/r/20201023064827.74794-1-chenjun102@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When ever an iova alloc request fails we free the iova
ranges present in the percpu iova rcaches and then retry
but the global iova rcache is not freed as a result we could
still see iova alloc failure even after retry as global
rcache is holding the iova's which can cause fragmentation.
So, free the global iova rcache as well and then go for the
retry.
Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: John Garry <john.garry@huaqwei.com>
Link: https://lore.kernel.org/r/1601451864-5956-2-git-send-email-vjitta@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When ever a new iova alloc request comes iova is always searched
from the cached node and the nodes which are previous to cached
node. So, even if there is free iova space available in the nodes
which are next to the cached node iova allocation can still fail
because of this approach.
Consider the following sequence of iova alloc and frees on
1GB of iova space
1) alloc - 500MB
2) alloc - 12MB
3) alloc - 499MB
4) free - 12MB which was allocated in step 2
5) alloc - 13MB
After the above sequence we will have 12MB of free iova space and
cached node will be pointing to the iova pfn of last alloc of 13MB
which will be the lowest iova pfn of that iova space. Now if we get an
alloc request of 2MB we just search from cached node and then look
for lower iova pfn's for free iova and as they aren't any, iova alloc
fails though there is 12MB of free iova space.
To avoid such iova search failures do a retry from the last rb tree node
when iova search fails, this will search the entire tree and get an iova
if its available.
Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1601451864-5956-1-git-send-email-vjitta@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Commit 6ee1b77ba3ac ("iommu/vt-d: Add svm/sva invalidate function")
introduced intel_iommu_sva_invalidate() when CONFIG_INTEL_IOMMU_SVM.
This function uses the dedicated static variable inv_type_granu_table
and functions to_vtd_granularity() and to_vtd_size().
These parts are unused when !CONFIG_INTEL_IOMMU_SVM, and hence,
make CC=clang W=1 warns with an -Wunused-function warning.
Include these parts conditionally on CONFIG_INTEL_IOMMU_SVM.
Fixes: 6ee1b77ba3ac ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201115205951.20698-1-lukas.bulwahn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Every Qcom Adreno GPU has an embedded SMMU for its own use. These
devices depend on unique features such as split pagetables,
different stall/halt requirements and other settings. Identify them
with a compatible string so that they can be identified in the
arm-smmu implementation specific code.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20201109184728.2463097-4-jcrouse@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
pending translations are not terminated on iova fault. Otherwise
a terminated CP read could hang the GPU by returning invalid
command-stream data. Add a hook to for the implementation to modify
the sctlr value if it wishes.
Co-developed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Link: https://lore.kernel.org/r/20201109184728.2463097-3-jcrouse@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add a special implementation for the SMMU attached to most Adreno GPU
target triggered from the qcom,adreno-smmu compatible string.
The new Adreno SMMU implementation will enable split pagetables
(TTBR1) for the domain attached to the GPU device (SID 0) and
hard code it context bank 0 so the GPU hardware can implement
per-instance pagetables.
Co-developed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20201109184728.2463097-2-jcrouse@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Fix the following coccinelle warnings:
./drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:36:12-26: WARNING: Assignment of 0/1 to bool variable
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Link: https://lore.kernel.org/r/1604744439-6846-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
|
|
current->group_leader->exit_signal may change during copy_process() if
current->real_parent exits.
Move the assignment inside tasklist_lock to avoid the race.
Signed-off-by: Eddy Wu <eddy_wu@trendmicro.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It's buggy:
On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote:
> We recently discovered a slab-out-of-bounds read in fbcon in the latest
> kernel ( v5.10-rc2 for now ). The root cause of this vulnerability is that
> "fbcon_do_set_font" did not handle "vc->vc_font.data" and
> "vc->vc_font.height" correctly, and the patch
> <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this
> issue.
>
> Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and
> use KD_FONT_OP_SET again to set a large font.height for tty1. After that,
> we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data
> in "fbcon_do_set_font", while tty1 retains the original larger
> height. Obviously, this will cause an out-of-bounds read, because we can
> access a smaller vc_font.data with a larger vc_font.height.
Further there was only one user ever.
- Android's loadfont, busybox and console-tools only ever use OP_GET
and OP_SET
- fbset documentation only mentions the kernel cmdline font: option,
not anything else.
- systemd used OP_COPY before release 232 published in Nov 2016
Now unfortunately the crucial report seems to have gone down with
gmane, and the commit message doesn't say much. But the pull request
hints at OP_COPY being broken
https://github.com/systemd/systemd/pull/3651
So in other words, this never worked, and the only project which
foolishly every tried to use it, realized that rather quickly too.
Instead of trying to fix security issues here on dead code by adding
missing checks, fix the entire thing by removing the functionality.
Note that systemd code using the OP_COPY function ignored the return
value, so it doesn't matter what we're doing here really - just in
case a lone server somewhere happens to be extremely unlucky and
running an affected old version of systemd. The relevant code from
font_copy_to_all_vcs() in systemd was:
/* copy font from active VT, where the font was uploaded to */
cfo.op = KD_FONT_OP_COPY;
cfo.height = vcs.v_active-1; /* tty1 == index 0 */
(void) ioctl(vcfd, KDFONTOP, &cfo);
Note this just disables the ioctl, garbage collecting the now unused
callbacks is left for -next.
v2: Tetsuo found the old mail, which allowed me to find it on another
archive. Add the link too.
Acked-by: Peilin Ye <yepeilin.cs@gmail.com>
Reported-by: Minh Yuan <yuanmingbuaa@gmail.com>
References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html
References: https://github.com/systemd/systemd/pull/3651
Cc: Greg KH <greg@kroah.com>
Cc: Peilin Ye <yepeilin.cs@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://lore.kernel.org/r/20201108153806.3140315-1-daniel.vetter@ffwll.ch
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner().
This is one possible chain of events leading to this:
Task Prio Operation
T1 120 lock(F)
T2 120 lock(F) -> blocks (top waiter)
T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter)
XX timeout/ -> wakes T2
signal
T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set)
T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter
and the lower priority T2 cannot steal the lock.
-> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON()
The comment states that this is invalid and rt_mutex_real_owner() must
return a non NULL owner when the trylock failed, but in case of a queued
and woken up waiter rt_mutex_real_owner() == NULL is a valid transient
state. The higher priority waiter has simply not yet managed to take over
the rtmutex.
The BUG_ON() is therefore wrong and this is just another retry condition in
fixup_pi_state_owner().
Drop the locks, so that T3 can make progress, and then try the fixup again.
Gratian provided a great analysis, traces and a reproducer. The analysis is
to the point, but it confused the hell out of that tglx dude who had to
page in all the futex horrors again. Condensed version is above.
[ tglx: Wrote comment and changelog ]
Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Reported-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com
Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de
|
|
As shown through runtime testing, the "filename" allocation is not
always freed in perf_event_parse_addr_filter().
There are three possible ways that this could happen:
- It could be allocated twice on subsequent iterations through the loop,
- or leaked on the success path,
- or on the failure path.
Clean up the code flow to make it obvious that 'filename' is always
freed in the reallocation path and in the two return paths as well.
We rely on the fact that kfree(NULL) is NOP and filename is initialized
with NULL.
This fixes the leak. No other side effects expected.
[ Dan Carpenter: cleaned up the code flow & added a changelog. ]
[ Ingo Molnar: updated the changelog some more. ]
Fixes: 375637bc5249 ("perf/core: Introduce address range filtering")
Signed-off-by: "kiyin(尹亮)" <kiyin@tencent.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "Srivatsa S. Bhat" <srivatsa@csail.mit.edu>
Cc: Anthony Liguori <aliguori@amazon.com>
--
kernel/events/core.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
|
|
Testing shows a problem in that UV5 hubless systems were not being
recognized. Add them to the list of OEM IDs checked.
Fixes: 6c7794423a998 ("Add UV5 direct references")
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201105222741.157029-4-mike.travis@hpe.com
|