aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-05-04iommu/mediatek-v1: Just rename mtk_iommu to mtk_iommu_v1Yong Wu1-108/+103
No functional change. Just rename this for readable. Differentiate this from mtk_iommu.c Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-29-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Remove mtk_iommu.hYong Wu3-36/+21
Currently there is a suspend structure in the header file. It's no need to keep a header file only for this. Move these into the c file and rm this header file. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-28-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Separate mtk_iommu_data for v1 and v2Yong Wu3-86/+106
Prepare for adding the structure "mtk_iommu_bank_data". No functional change. The mtk_iommu_domain in v1 and v2 are different, we could not add current data as bank[0] in v1 simplistically. Currently we have no plan to add new SoC for v1, in order to avoid affect v1 when we add many new features for v2, I totally separate v1 and v2 in this patch, there are many structures only for v2. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-27-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Just move code position in hw_initYong Wu1-24/+24
No functional change too, prepare for mt8195 IOMMU support bank functions. Some global control settings are in bank0 while the other banks have their bank independent setting. Here only move the global control settings and the independent registers together. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-26-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Only adjust code about register baseYong Wu1-24/+27
No functional change. Use "base" instead of the data->base. This is avoid to touch too many lines in the next patches. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-25-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add mt8195 supportYong Wu2-0/+42
mt8195 has 3 IOMMU, containing 2 MM IOMMUs, one is for vdo, the other is for vpp. and 1 INFRA IOMMU. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-24-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add PCIe supportYong Wu1-1/+20
Currently the code for of_iommu_configure_dev_id is like this: static int of_iommu_configure_dev_id(struct device_node *master_np, struct device *dev, const u32 *id) { struct of_phandle_args iommu_spec = { .args_count = 1 }; err = of_map_id(master_np, *id, "iommu-map", "iommu-map-mask", &iommu_spec.np, iommu_spec.args); ... } It supports only one id output. BUT our PCIe HW has two ID(one is for writing, the other is for reading). I'm not sure if we should change of_map_id to support output MAX_PHANDLE_ARGS. Here add the solution in ourselve drivers. If it's pcie case, enable one more bit. Not all infra iommu support PCIe, thus add a PCIe support flag here. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-23-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add infra iommu supportYong Wu2-7/+31
The infra iommu enable bits in mt8195 is in the pericfg register segment, use regmap to update it. If infra iommu master translation fault, It doesn't have the larbid/portid, thus print out the whole register value. Since regmap_update_bits may fail, add return value for mtk_iommu_config. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-22-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add a PM_CLK_AO flag for infra iommuYong Wu1-3/+26
The power/clock of infra iommu is always on, and it doesn't have the device link with the master devices, then the infra iommu device's PM status is not active, thus we add A PM_CLK_AO flag for infra iommu. The tlb operation is a bit not clear here, there are 2 special cases. Comment them in the code. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-21-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Allow IOMMU_DOMAIN_UNMANAGED for PCIe VFIOYong Wu1-1/+1
Allow the type IOMMU_DOMAIN_UNMANAGED since vfio_iommu_type1.c always call iommu_domain_alloc. The PCIe EP works ok when going through vfio. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-20-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Adjust device link when it is sub-commonYong Wu1-4/+14
For MM IOMMU, We always add device link between smi-common and IOMMU HW. In mt8195, we add smi-sub-common. Thus, if the node is sub-common, we still need find again to get smi-common, then do device link. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-19-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Contain MM IOMMU flow with the MM TYPEYong Wu1-91/+122
Prepare for supporting INFRA_IOMMU, and APU_IOMMU later. For Infra IOMMU/APU IOMMU, it doesn't have the "larb""port". thus, Use the MM flag contain the MM_IOMMU special flow, Also, it moves a big chunk code about parsing the mediatek,larbs into a function, this is only needed for MM IOMMU. and all the current SoC are MM_IOMMU. The device link between iommu consumer device and smi-larb device only is needed in MM iommu case. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-18-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add IOMMU_TYPE flagYong Wu1-2/+10
Add IOMMU_TYPE definition. In the mt8195, we have another IOMMU_TYPE: infra iommu, also there will be another APU_IOMMU, thus, use 2bits for the IOMMU_TYPE. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-17-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add SUB_COMMON_3BITS flagYong Wu2-11/+17
In prevous SoC, the sub common id occupy 2 bits. the mt8195's sub common id has 3bits. Add a new flag for this. and rename the previous flag to _2BITS. For readable, I put these two flags together, then move the other flags. no functional change. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-16-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Always enable output PA over 32bits in isrYong Wu1-2/+2
Currently the output PA[32:33] is contained by the flag IOVA_34. This is not right. the iova_34 has no relation with pa[32:33], the 32bits iova still could map to pa[32:33]. Move it out from the flag. No need fix tag since currently only mt8192 use the calulation and it always has this IOVA_34 flag. Prepare for the IOMMU that still use IOVA 32bits but its dram size may be over 4GB. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-15-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Remove the granule in the tlb flushYong Wu1-4/+2
The MediaTek IOMMU doesn't care about granule when tlb flushing. Remove this variable. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-14-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add a flag STD_AXI_MODEYong Wu1-1/+3
Add a new flag STD_AXI_MODE which is prepared for infra and apu iommu which use the standard axi mode. All the current SoC don't use this flag. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-13-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add a flag DCM_DISABLEYong Wu1-1/+8
In the infra iommu, we should disable DCM. add a new flag for this. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-12-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add 12G~16G support for multi domainsYong Wu1-3/+5
In mt8192, we preassign 0-4G; 4G-8G; 8G-12G for different multimedia engines. This depends on the "dma-ranges=" in the iommu consumer's dtsi node. Adds 12G-16G region here. and reword the previous comment. we don't limit which master locate in which region. CCU still is 8G-12G. Don't change it here. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-11-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Adapt sharing and non-sharing pgtable caseYong Wu2-20/+30
In previous mt2712, Both IOMMUs are MM IOMMU, and they will share pgtable. However in the latest SoC, another is infra IOMMU, there is no reason to share pgtable between MM with INFRA IOMMU. This patch manage to implement the two case(sharing and non-sharing pgtable). Currently we use for_each_m4u to loop the 2 HWs. Add the list_head into this macro. In the sharing pgtable case, the list_head is the global "m4ulist". In the non-sharing pgtable case, the list_head is hw_list_head which is a variable in the "data". then for_each_m4u will only loop itself. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-10-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add mutex for data in the mtk_iommu_domainYong Wu1-1/+9
Same with the previous patch, add a mutex for the "data" in the mtk_iommu_domain. Just improve the safety for multi devices enter attach_device at the same time. We don't get the real issue for this. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-9-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add mutex for m4u_group and m4u_dom in dataYong Wu2-2/+13
Add a mutex to protect the data in the structure mtk_iommu_data, like ->"m4u_group" ->"m4u_dom". For the internal data, we should protect it in ourselves driver. Add a mutex for this. This could be a fix for the multi-groups support. Fixes: c3045f39244e ("iommu/mediatek: Support for multi domains") Signed-off-by: Yunfei Wang <yf.wang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-8-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Remove clk_disable in mtk_iommu_removeYong Wu1-1/+0
After the commit b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume"), the iommu clock is controlled by the runtime callback. thus remove the clk control in the mtk_iommu_remove. Otherwise, it will warning like: echo 14018000.iommu > /sys/bus/platform/drivers/mtk-iommu/unbind [ 51.413044] ------------[ cut here ]------------ [ 51.413648] vpp0_smi_iommu already disabled [ 51.414233] WARNING: CPU: 2 PID: 157 at */v5.15-rc1/kernel/mediatek/ drivers/clk/clk.c:952 clk_core_disable+0xb0/0xb8 [ 51.417174] Hardware name: MT8195V/C(ENG) (DT) [ 51.418635] pc : clk_core_disable+0xb0/0xb8 [ 51.419177] lr : clk_core_disable+0xb0/0xb8 ... [ 51.429375] Call trace: [ 51.429694] clk_core_disable+0xb0/0xb8 [ 51.430193] clk_core_disable_lock+0x24/0x40 [ 51.430745] clk_disable+0x20/0x30 [ 51.431189] mtk_iommu_remove+0x58/0x118 [ 51.431705] platform_remove+0x28/0x60 [ 51.432197] device_release_driver_internal+0x110/0x1f0 [ 51.432873] device_driver_detach+0x18/0x28 [ 51.433418] unbind_store+0xd4/0x108 [ 51.433886] drv_attr_store+0x24/0x38 [ 51.434363] sysfs_kf_write+0x40/0x58 [ 51.434843] kernfs_fop_write_iter+0x164/0x1e0 Fixes: b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume") Reported-by: Hsin-Yi Wang <hsinyi@chromium.org> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-7-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Add list_del in mtk_iommu_removeYong Wu1-2/+1
Lack the list_del in the mtk_iommu_remove, and remove bus_set_iommu(*, NULL) since there may be several iommu HWs. we can not bus_set_iommu null when one iommu driver unbind. This could be a fix for mt2712 which support 2 M4U HW and list them. Fixes: 7c3a2ec02806 ("iommu/mediatek: Merge 2 M4U HWs into one iommu domain") Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-6-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04iommu/mediatek: Fix 2 HW sharing pgtable issueYong Wu1-2/+5
In the commit 4f956c97d26b ("iommu/mediatek: Move domain_finalise into attach_device"), I overlooked the sharing pgtable case. After that commit, the "data" in the mtk_iommu_domain_finalise always is the data of the current IOMMU HW. Fix this for the sharing pgtable case. Only affect mt2712 which is the only SoC that share pgtable currently. Fixes: 4f956c97d26b ("iommu/mediatek: Move domain_finalise into attach_device") Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-5-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04dt-bindings: mediatek: mt8186: Add binding for MM iommuYong Wu2-0/+221
Add mt8186 iommu binding. "-mm" means the iommu is for Multimedia. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220503071427.2285-4-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04dt-bindings: mediatek: mt8195: Add binding for infra IOMMUYong Wu3-1/+32
In mt8195, we have a new IOMMU that is for INFRA IOMMU. its masters mainly are PCIe and USB. Different with MM IOMMU, all these masters connect with IOMMU directly, there is no mediatek,larbs property for infra IOMMU. Another thing is about PCIe ports. currently the function "of_iommu_configure_dev_id" only support the id number is 1, But our PCIe have two ports, one is for reading and the other is for writing. see more about the PCIe patch in this patchset. Thus, I only list the reading id here and add the other id in our driver. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-3-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04dt-bindings: mediatek: mt8195: Add binding for MM IOMMUYong Wu2-0/+397
This patch adds descriptions for mt8195 IOMMU which also use ARM Short-Descriptor translation table format. In mt8195, there are two smi-common HW and IOMMU, one is for vdo(video output), the other is for vpp(video processing pipe). They connects with different smi-larbs, then some setting(larbid_remap) is different. Differentiate them with the compatible string. Something like this: IOMMU(VDO) IOMMU(VPP) | | SMI_COMMON_VDO SMI_COMMON_VPP --------------- ---------------- | | ... | | ... larb0 larb2 ... larb1 larb3 ... Another change is that we have a new IOMMU that is for infra master like PCIe and USB. The infra master don't have the larb and ports, thus we rename the port header file to mt8195-memory-port.h rather than mt8195-larb-port.h. Also, the IOMMU is not only for MM, thus, we don't call it "m4u" which means "MultiMedia Memory Management UNIT". thus, use the "iommu" as the compatiable string. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20220503071427.2285-2-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-24Linux 5.18-rc4Linus Torvalds1-1/+1
2022-04-24kvmalloc: use vmalloc_huge for vmalloc allocationsLinus Torvalds1-2/+9
Since commit 559089e0a93d ("vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP"), the use of hugepage mappings for vmalloc is an opt-in strategy, because it caused a number of problems that weren't noticed until x86 enabled it too. One of the issues was fixed by Nick Piggin in commit 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split rather than compound"), but I'm still worried about page protection issues, and VM_FLUSH_RESET_PERMS in particular. However, like the hash table allocation case (commit f2edd118d02d: "page_alloc: use vmalloc_huge for large system hash"), the use of kvmalloc() should be safe from any such games, since the returned pointer might be a SLUB allocation, and as such no user should reasonably be using it in any odd ways. We also know that the allocations are fairly large, since it falls back to the vmalloc case only when a kmalloc() fails. So using a hugepage mapping seems both safe and relevant. This patch does show a weakness in the opt-in strategy: since the opt-in flag is in the 'vm_flags', not the usual gfp_t allocation flags, very few of the usual interfaces actually expose it. That's not much of an issue in this case that already used one of the fairly specialized low-level vmalloc interfaces for the allocation, but for a lot of other vmalloc() users that might want to opt in, it's going to be very inconvenient. We'll either have to fix any compatibility problems, or expose it in the gfp flags (__GFP_COMP would have made a lot of sense) to allow normal vmalloc() users to use hugepage mappings. That said, the cases that really matter were probably already taken care of by the hash tabel allocation. Link: https://lore.kernel.org/all/20220415164413.2727220-1-song@kernel.org/ Link: https://lore.kernel.org/all/CAHk-=whao=iosX1s5Z4SF-ZGa-ebAukJoAdUJFk5SPwnofV+Vg@mail.gmail.com/ Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Song Liu <songliubraving@fb.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-24page_alloc: use vmalloc_huge for large system hashSong Liu1-1/+1
Use vmalloc_huge() in alloc_large_system_hash() so that large system hash (>= PMD_SIZE) could benefit from huge pages. Note that vmalloc_huge only allocates huge pages for systems with HAVE_ARCH_HUGE_VMALLOC. Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-23sparc: cacheflush_32.h needs struct pageRandy Dunlap1-0/+1
Add a struct page forward declaration to cacheflush_32.h. Fixes this build warning: CC drivers/crypto/xilinx/zynqmp-sha.o In file included from arch/sparc/include/asm/cacheflush.h:11, from include/linux/cacheflush.h:5, from drivers/crypto/xilinx/zynqmp-sha.c:6: arch/sparc/include/asm/cacheflush_32.h:38:37: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration 38 | void sparc_flush_page_to_ram(struct page *page); Exposed by commit 0e03b8fd2936 ("crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST") but not Fixes: that commit because the underlying problem is older. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David S. Miller <davem@davemloft.net> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: sparclinux@vger.kernel.org Acked-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-22perf test: Fix error message for test case 71 on s390, where it is not supportedThomas Richter1-0/+4
Test case 71 'Convert perf time to TSC' is not supported on s390. Subtest 71.1 is skipped with the correct message, but subtest 71.2 is not skipped and fails. The root cause is function evlist__open() called from test__perf_time_to_tsc(). evlist__open() returns -ENOENT because the event cycles:u is not supported by the selected PMU, for example platform s390 on z/VM or an x86_64 virtual machine. The PMU driver returns -ENOENT in this case. This error is leads to the failure. Fix this by returning TEST_SKIP on -ENOENT. Output before: 71: Convert perf time to TSC: 71.1: TSC support: Skip (This architecture does not support) 71.2: Perf time to TSC: FAILED! Output after: 71: Convert perf time to TSC: 71.1: TSC support: Skip (This architecture does not support) 71.2: Perf time to TSC: Skip (perf_read_tsc_conversion is not supported) This also happens on an x86_64 virtual machine: # uname -m x86_64 $ ./perf test -F 71 71: Convert perf time to TSC : 71.1: TSC support : Ok 71.2: Perf time to TSC : FAILED! $ Committer testing: Continues to work on x86_64: $ perf test 71 71: Convert perf time to TSC : 71.1: TSC support : Ok 71.2: Perf time to TSC : Ok $ Fixes: 290fa68bdc458863 ("perf test tsc: Fix error message when not supported") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Chengdong Li <chengdongli@tencent.com> Cc: chengdongli@tencent.com Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20220420062921.1211825-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE eventLeo Yan1-0/+14
Since commit bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available") "perf mem report" and "perf report --mem-mode" don't report result if the PERF_SAMPLE_DATA_SRC bit is missed in sample type. The commit ffab487052054162 ("perf: arm-spe: Fix perf report --mem-mode") partially fixes the issue. It adds PERF_SAMPLE_DATA_SRC bit for Arm SPE event, this allows the perf data file generated by kernel v5.18-rc1 or later version can be reported properly. On the other hand, perf tool still fails to be backward compatibility for a data file recorded by an older version's perf which contains Arm SPE trace data. This patch is a workaround in reporting phase, when detects ARM SPE PMU event and without PERF_SAMPLE_DATA_SRC bit, it will force to set the bit in the sample type and give a warning info. Fixes: bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available") Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: German Gomez <german.gomez@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: https://lore.kernel.org/r/20220414123201.842754-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf script: Always allow field 'data_src' for auxtraceLeo Yan1-1/+1
If use command 'perf script -F,+data_src' to dump memory samples with Arm SPE trace data, it reports error: # perf script -F,+data_src Samples for 'dummy:u' event do not have DATA_SRC attribute set. Cannot print 'data_src' field. This is because the 'dummy:u' event is absent DATA_SRC bit in its sample type, so if a file contains AUX area tracing data then always allow field 'data_src' to be selected as an option for perf script. Fixes: e55ed3423c1bb29f ("perf arm-spe: Synthesize memory event") Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220417114837.839896-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf clang: Fix header include for LLVM >= 14Guilherme Amadio1-0/+4
The header TargetRegistry.h has moved in LLVM/clang 14. Committer notes: The problem as noticed when building in ubuntu:22.04: 90 98.61 ubuntu:22.04 : FAIL gcc version 11.2.0 (Ubuntu 11.2.0-19ubuntu1) util/c++/clang.cpp:23:10: fatal error: llvm/Support/TargetRegistry.h: No such file or directory 23 | #include "llvm/Support/TargetRegistry.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. Fixed after applying this patch. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Guilherme Amadio <amadio@gentoo.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://twitter.com/GuilhermeAmadio/status/1514970524232921088 Link: http://lore.kernel.org/lkml/Ylp0M/VYgHOxtcnF@gentoo.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22gpio: Request interrupts after IRQ is initializedMario Limonciello1-2/+2
Commit 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization") attempted to fix a race condition that lead to a NULL pointer, but in the process caused a regression for _AEI/_EVT declared GPIOs. This manifests in messages showing deferred probing while trying to allocate IRQs like so: amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0000 to IRQ, err -517 amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x002C to IRQ, err -517 amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003D to IRQ, err -517 [ .. more of the same .. ] The code for walking _AEI doesn't handle deferred probing and so this leads to non-functional GPIO interrupts. Fix this issue by moving the call to `acpi_gpiochip_request_interrupts` to occur after gc->irc.initialized is set. Fixes: 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization") Link: https://lore.kernel.org/linux-gpio/BL1PR12MB51577A77F000A008AA694675E2EF9@BL1PR12MB5157.namprd12.prod.outlook.com/ Link: https://bugzilla.suse.com/show_bug.cgi?id=1198697 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215850 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1979 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1976 Reported-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Shreeya Patel <shreeya.patel@collabora.com> Tested-By: Samuel Čavoj <samuel@cavoj.net> Tested-By: lukeluk498@gmail.com Link: Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-and-tested-by: Takashi Iwai <tiwai@suse.de> Cc: Shreeya Patel <shreeya.patel@collabora.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-22arm/xen: Fix some refcount leaksMiaoqian Lin1-2/+7
The of_find_compatible_node() function returns a node pointer with refcount incremented, We should use of_node_put() on it when done Add the missing of_node_put() to release the refcount. Fixes: 9b08aaa3199a ("ARM: XEN: Move xen_early_init() before efi_init()") Fixes: b2371587fe0c ("arm/xen: Read extended regions from DT and init Xen resource") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
2022-04-22XArray: Disallow sibling entries of nodesMatthew Wilcox (Oracle)1-0/+2
There is a race between xas_split() and xas_load() which can result in the wrong page being returned, and thus data corruption. Fortunately, it's hard to hit (syzbot took three months to find it) and often guarded with VM_BUG_ON(). The anatomy of this race is: thread A thread B order-9 page is stored at index 0x200 lookup of page at index 0x274 page split starts load of sibling entry at offset 9 stores nodes at offsets 8-15 load of entry at offset 8 The entry at offset 8 turns out to be a node, and so we descend into it, and load the page at index 0x234 instead of 0x274. This is hard to fix on the split side; we could replace the entire node that contains the order-9 page instead of replacing the eight entries. Fixing it on the lookup side is easier; just disallow sibling entries that point to nodes. This cannot ever be a useful thing as the descent would not know the correct offset to use within the new node. The test suite continues to pass, but I have not added a new test for this bug. Reported-by: syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com Tested-by: syzbot+cf4cf13056f85dec2c40@syzkaller.appspotmail.com Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-04-22tools: Add kmem_cache_alloc_lru()Matthew Wilcox (Oracle)2-2/+9
Turn kmem_cache_alloc() into a wrapper around kmem_cache_alloc_lru(). Fixes: 9bbdc0f32409 ("xarray: use kmem_cache_alloc_lru to allocate xa_node") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Li Wang <liwang@redhat.com>
2022-04-22mm/vmalloc: huge vmalloc backing pages should be split rather than compoundNicholas Piggin1-15/+21
Huge vmalloc higher-order backing pages were allocated with __GFP_COMP in order to allow the sub-pages to be refcounted by callers such as "remap_vmalloc_page [sic]" (remap_vmalloc_range). However a similar problem exists for other struct page fields callers use, for example fb_deferred_io_fault() takes a vmalloc'ed page and not only refcounts it but uses ->lru, ->mapping, ->index. This is not compatible with compound sub-pages, and can cause bad page state issues like BUG: Bad page state in process swapper/0 pfn:00743 page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x743 flags: 0x7ffff000000000(node=0|zone=0|lastcpupid=0x7ffff) raw: 007ffff000000000 c00c00000001d0c8 c00c00000001d0c8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: corrupted mapping in tail page Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-00082-gfc6fff4a7ce1-dirty #2810 Call Trace: dump_stack_lvl+0x74/0xa8 (unreliable) bad_page+0x12c/0x170 free_tail_pages_check+0xe8/0x190 free_pcp_prepare+0x31c/0x4e0 free_unref_page+0x40/0x1b0 __vunmap+0x1d8/0x420 ... The correct approach is to use split high-order pages for the huge vmalloc backing. These allow callers to treat them in exactly the same way as individually-allocated order-0 pages. Link: https://lore.kernel.org/all/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Song Liu <songliubraving@fb.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-22arm64: mm: fix p?d_leaf()Muchun Song1-2/+2
The pmd_leaf() is used to test a leaf mapped PMD, however, it misses the PROT_NONE mapped PMD on arm64. Fix it. A real world issue [1] caused by this was reported by Qian Cai. Also fix pud_leaf(). Link: https://patchwork.kernel.org/comment/24798260/ [1] Fixes: 8aa82df3c123 ("arm64: mm: add p?d_leaf() definitions") Reported-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Muchun Song <songmuchun@bytedance.com> Link: https://lore.kernel.org/r/20220422060033.48711-1-songmuchun@bytedance.com Signed-off-by: Will Deacon <will@kernel.org>
2022-04-21mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()Alistair Popple1-1/+13
In some cases it is possible for mmu_interval_notifier_remove() to race with mn_tree_inv_end() allowing it to return while the notifier data structure is still in use. Consider the following sequence: CPU0 - mn_tree_inv_end() CPU1 - mmu_interval_notifier_remove() ----------------------------------- ------------------------------------ spin_lock(subscriptions->lock); seq = subscriptions->invalidate_seq; spin_lock(subscriptions->lock); spin_unlock(subscriptions->lock); subscriptions->invalidate_seq++; wait_event(invalidate_seq != seq); return; interval_tree_remove(interval_sub); kfree(interval_sub); spin_unlock(subscriptions->lock); wake_up_all(); As the wait_event() condition is true it will return immediately. This can lead to use-after-free type errors if the caller frees the data structure containing the interval notifier subscription while it is still on a deferred list. Fix this by taking the appropriate lock when reading invalidate_seq to ensure proper synchronisation. I observed this whilst running stress testing during some development. You do have to be pretty unlucky, but it leads to the usual problems of use-after-free (memory corruption, kernel crash, difficult to diagnose WARN_ON, etc). Link: https://lkml.kernel.org/r/20220420043734.476348-1-apopple@nvidia.com Fixes: 99cb252f5e68 ("mm/mmu_notifier: add an interval tree notifier") Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Cc: Christian König <christian.koenig@amd.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21kcov: don't generate a warning on vm_insert_page()'s failureAleksandr Nogikh1-2/+5
vm_insert_page()'s failure is not an unexpected condition, so don't do WARN_ONCE() in such a case. Instead, print a kernel message and just return an error code. This flaw has been reported under an OOM condition by sysbot [1]. The message is mainly for the benefit of the test log, in this case the fuzzer's log so that humans inspecting the log can figure out what was going on. KCOV is a testing tool, so I think being a little more chatty when KCOV unexpectedly is about to fail will save someone debugging time. We don't want the WARN, because it's not a kernel bug that syzbot should report, and failure can happen if the fuzzer tries hard enough (as above). Link: https://lkml.kernel.org/r/Ylkr2xrVbhQYwNLf@elver.google.com [1] Link: https://lkml.kernel.org/r/20220401182512.249282-1-nogikh@google.com Fixes: b3d7fe86fbd0 ("kcov: properly handle subsequent mmap calls"), Signed-off-by: Aleksandr Nogikh <nogikh@google.com> Acked-by: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Taras Madan <tarasmadan@google.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21MAINTAINERS: add Vincenzo Frascino to KASAN reviewersVincenzo Frascino1-0/+1
Add my email address to KASAN reviewers list to make sure that I am Cc'ed in all the KASAN changes that may affect arm64 MTE. Link: https://lkml.kernel.org/r/20220419170640.21404-1-vincenzo.frascino@arm.com Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanupNico Pache2-14/+41
The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which can be targeted by the oom reaper. This mapping is used to store the futex robust list head; the kernel does not keep a copy of the robust list and instead references a userspace address to maintain the robustness during a process death. A race can occur between exit_mm and the oom reaper that allows the oom reaper to free the memory of the futex robust list before the exit path has handled the futex death: CPU1 CPU2 -------------------------------------------------------------------- page_fault do_exit "signal" wake_oom_reaper oom_reaper oom_reap_task_mm (invalidates mm) exit_mm exit_mm_release futex_exit_release futex_cleanup exit_robust_list get_user (EFAULT- can't access memory) If the get_user EFAULT's, the kernel will be unable to recover the waiters on the robust_list, leaving userspace mutexes hung indefinitely. Delay the OOM reaper, allowing more time for the exit path to perform the futex cleanup. Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer Based on a patch by Michal Hocko. Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1] Link: https://lkml.kernel.org/r/20220414144042.677008-1-npache@redhat.com Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: Joel Savitz <jsavitz@redhat.com> Signed-off-by: Nico Pache <npache@redhat.com> Co-developed-by: Joel Savitz <jsavitz@redhat.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Waiman Long <longman@redhat.com> Cc: Herton R. Krzesinski <herton@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joel Savitz <jsavitz@redhat.com> Cc: Darren Hart <dvhart@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21selftest/vm: add skip support to mremap_testSidhartha Kumar1-3/+8
Allow the mremap test to be skipped due to errors such as failing to parse the mmap_min_addr sysctl. Link: https://lkml.kernel.org/r/20220420215721.4868-4-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21selftest/vm: support xfail in mremap_testSidhartha Kumar1-1/+1
Use ksft_test_result_xfail for the tests which are expected to fail. Link: https://lkml.kernel.org/r/20220420215721.4868-3-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21selftest/vm: verify remap destination address in mremap_testSidhartha Kumar1-3/+39
Because mremap does not have a MAP_FIXED_NOREPLACE flag, it can destroy existing mappings. This causes a segfault when regions such as text are remapped and the permissions are changed. Verify the requested mremap destination address does not overlap any existing mappings by using mmap's MAP_FIXED_NOREPLACE flag. Keep incrementing the destination address until a valid mapping is found or fail the current test once the max address is reached. Link: https://lkml.kernel.org/r/20220420215721.4868-2-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21selftest/vm: verify mmap addr in mremap_testSidhartha Kumar1-1/+40
Avoid calling mmap with requested addresses that are less than the system's mmap_min_addr. When run as root, mmap returns EACCES when trying to map addresses < mmap_min_addr. This is not one of the error codes for the condition to retry the mmap in the test. Rather than arbitrarily retrying on EACCES, don't attempt an mmap until addr > vm.mmap_min_addr. Add a munmap call after an alignment check as the mappings are retained after the retry and can reach the vm.max_map_count sysctl. Link: https://lkml.kernel.org/r/20220420215721.4868-1-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>