wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2024-05-03	arm64/mm: Add uffd write-protect support	Ryan Roberts	3	-0/+53
	Let's use the newly-free PTE SW bit (58) to add support for uffd-wp. The standard handlers are implemented for set/test/clear for both pte and pmd. Additionally we must also track the uffd-wp state as a pte swp bit, so use a free swap pte bit (3). Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20240503144604.151095-5-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-03	arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG	Ryan Roberts	2	-7/+7
	PTE_PRESENT_INVALID was previously occupying bit 59, which when a PTE is valid can either be IGNORED, PBHA[0] or AttrIndex[3], depending on the HW configuration. In practice this is currently not a problem because PTE_PRESENT_INVALID can only be 1 when PTE_VALID=0 and upstream Linux always requires the bit set to 0 for a valid pte. However, if in future Linux wants to use the field (e.g. AttrIndex[3]) then we could end up with confusion when PTE_PRESENT_INVALID comes along and corrupts the field - we would ideally want to preserve it even for an invalid (but present) pte. The other problem with bit 59 is that it prevents the offset field of a swap entry within a swap pte from growing beyond 51 bits. By moving PTE_PRESENT_INVALID to a low bit we can lay the swap pte out so that the offset field could grow to 52 bits in future. So let's move PTE_PRESENT_INVALID to overlay PTE_NG (bit 11). There is no need to persist NG for a present-invalid entry; it is always set for user mappings and is not used by SW to derive any state from the pte. PTE_NS was considered instead of PTE_NG, but it is RES0 for non-secure SW, so there is a chance that future architecture may allocate the bit and we may therefore need to persist that bit for present-invalid ptes. These are both marginal benefits, but make things a bit tidier in my opinion. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20240503144604.151095-4-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-03	arm64/mm: Remove PTE_PROT_NONE bit	Ryan Roberts	2	-16/+18
	Currently the PTE_PRESENT_INVALID and PTE_PROT_NONE functionality explicitly occupy 2 bits in the PTE when PTE_VALID/PMD_SECT_VALID is clear. This has 2 significant consequences: - PTE_PROT_NONE consumes a precious SW PTE bit that could be used for other things. - The swap pte layout must reserve those same 2 bits and ensure they are both always zero for a swap pte. It would be nice to reclaim at least one of those bits. But PTE_PRESENT_INVALID, which since the previous patch, applies uniformly to page/block descriptors at any level when PTE_VALID is clear, can already give us most of what PTE_PROT_NONE requires: If it is set, then the pte is still considered present; pte_present() returns true and all the fields in the pte follow the HW interpretation (e.g. SW can safely call pte_pfn(), etc). But crucially, the HW treats the pte as invalid and will fault if it hits. So let's remove PTE_PROT_NONE entirely and instead represent PROT_NONE as a present but invalid pte (PTE_VALID=0, PTE_PRESENT_INVALID=1) with PTE_USER=0 and PTE_UXN=1. This is a unique combination that is not used anywhere else. The net result is a clearer, simpler, more generic encoding scheme that applies uniformly to all levels. Additionally we free up a PTE SW bit and a swap pte bit (bit 58 in both cases). Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20240503144604.151095-3-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-03	arm64/mm: generalize PMD_PRESENT_INVALID for all levels	Ryan Roberts	2	-13/+16
	As preparation for the next patch, which frees up the PTE_PROT_NONE present pte and swap pte bit, generalize PMD_PRESENT_INVALID to PTE_PRESENT_INVALID. This will then be used to mark PROT_NONE ptes (and entries at any other level) in the next patch. While we're at it, fix up the swap pte format comment to include PTE_PRESENT_INVALID. This is not new, it just wasn't previously documented. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20240503144604.151095-2-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-03	arm64: simplify arch_static_branch/_jump function	George Guo	1	-13/+15
	Extracted the jump table definition code from the arch_static_branch and arch_static_branch_jump functions into a macro JUMP_TABLE_ENTRY to reduce code duplication. Signed-off-by: George Guo <guodongtai@kylinos.cn> Link: https://lore.kernel.org/r/20240430085655.2798551-2-dongtai.guo@linux.dev Signed-off-by: Will Deacon <will@kernel.org>
2024-05-03	arm64: Add USER_STACKTRACE support	chenqiwu	3	-114/+125
	Currently, userstacktrace is unsupported for ftrace and uprobe tracers on arm64. This patch uses the perf_callchain_user() code as blueprint to implement the arch_stack_walk_user() which add userstacktrace support on arm64. Meanwhile, we can use arch_stack_walk_user() to simplify the implementation of perf_callchain_user(). This patch is tested pass with ftrace, uprobe and perf tracers profiling userstacktrace cases. Tested-by: chenqiwu <qiwu.chen@transsion.com> Signed-off-by: chenqiwu <qiwu.chen@transsion.com> Link: https://lore.kernel.org/r/20231219022229.10230-1-qiwu.chen@transsion.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-03	arm64: Add the arm64.no32bit_el0 command line option	Andrea della Porta	2	-0/+5
	Introducing the field 'el0' to the idreg-override for register ID_AA64PFR0_EL1. This field is also aliased to the new kernel command line option 'arm64.no32bit_el0' as a more recognizable and mnemonic name to disable the execution of 32 bit userspace applications (i.e. avoid Aarch32 execution state in EL0) from kernel command line. Link: https://lore.kernel.org/all/20240207105847.7739-1-andrea.porta@suse.com/ Signed-off-by: Andrea della Porta <andrea.porta@suse.com> Link: https://lore.kernel.org/r/20240429102833.6426-1-andrea.porta@suse.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()	Hao Chen	1	-1/+1
	pci_alloc_irq_vectors() allocates an irq vector. When devm_add_action() fails, the irq vector is not freed, which leads to a memory leak. Replace the devm_add_action with devm_add_action_or_reset to ensure the irq vector can be destroyed when it fails. Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group	Junhao He	1	-1/+13
	The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/} Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Hao Chen <chenhao418@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group	Junhao He	1	-1/+13
	The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HISI_PCIE_MAX_COUNTERS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}' Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	kselftest: arm64: Add a null pointer check	Kunwu Chan	1	-0/+4
	There is a 'malloc' call, which can be unsuccessful. This patch will add the malloc failure checking to avoid possible null dereference and give more information about test fail reasons. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240423082102.2018886-1-chentao@kylinos.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	arm64: defer clearing DAIF.D	Mark Rutland	4	-16/+16
	For historical reasons we unmask debug exceptions in __cpu_setup(), but it's not necessary to unmask debug exceptions this early in the boot/idle entry paths. It would be better to unmask debug exceptions later in C code as this simplifies the current code and will make it easier to rework exception masking logic to handle non-DAIF bits in future (e.g. PSTATE.{ALLINT,PM}). We started clearing DAIF.D in __cpu_setup() in commit: 2ce39ad15182604b ("arm64: debug: unmask PSTATE.D earlier") At the time, we needed to ensure that DAIF.D was clear on the primary CPU before scheduling and preemption were possible, and chose to do this in __cpu_setup() so that this occurred in the same place for primary and secondary CPUs. As we cannot handle debug exceptions this early, we placed an ISB between initializing MDSCR_EL1 and clearing DAIF.D so that no exceptions should be triggered. Subsequently we rewrote the return-from-{idle,suspend} paths to use __cpu_setup() in commit: cabe1c81ea5be983 ("arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va") ... which allowed for earlier use of the MMU and had the desirable property of using the same code to reset the CPU in the cold and warm boot paths. This introduced a bug: DAIF.D was clear while cpu_do_resume() restored MDSCR_EL1 and other control registers (e.g. breakpoint/watchpoint control/value registers), and so we could unexpectedly take debug exceptions. We fixed that in commit: 744c6c37cc18705d ("arm64: kernel: Fix unmasked debug exceptions when restoring mdscr_el1") ... by having cpu_do_resume() use the `disable_dbg` macro to set DAIF.D before restoring MDSCR_EL1 and other control registers. This relies on DAIF.D being subsequently cleared again in cpu_resume(). Subsequently we reworked DAIF masking in commit: 0fbeb318754860b3 ("arm64: explicitly mask all exceptions") ... where we began enforcing a policy that DAIF.D being set implies all other DAIF bits are set, and so e.g. we cannot take an IRQ while DAIF.D is set. As part of this the use of `disable_dbg` in cpu_resume() was replaced with `disable_daif` for consistency with the rest of the kernel. These days, there's no need to clear DAIF.D early within __cpu_setup(): * setup_arch() clears DAIF.DA before scheduling and preemption are possible on the primary CPU, avoiding the problem we we originally trying to work around. Note: DAIF.IF get cleared later when interrupts are enabled for the first time. * secondary_start_kernel() clears all DAIF bits before scheduling and preemption are possible on secondary CPUs. Note: with pseudo-NMI, the PMR is initialized here before any DAIF bits are cleared. Similar will be necessary for the architectural NMI. * cpu_suspend() restores all DAIF bits when returning from idle, ensuring that we don't unexpectedly leave DAIF.D clear or set. Note: with pseudo-NMI, the PMR is initialized here before DAIF is cleared. Similar will be necessary for the architectural NMI. This patch removes the unmasking of debug exceptions from __cpu_setup(), relying on the above locations to initialize DAIF. This allows some other cleanups: * It is no longer necessary for cpu_resume() to explicitly mask debug (or other) exceptions, as it is always called with all DAIF bits set. Thus we drop the use of `disable_daif`. * The `enable_dbg` macro is no longer used, and so is dropped. * It is no longer necessary to have an ISB immediately after initializing MDSCR_EL1 in __cpu_setup(), and we can revert to relying on the context synchronization that occurs when the MMU is enabled between __cpu_setup() and code which clears DAIF.D Comments are added to setup_arch() and secondary_start_kernel() to explain the initial unmasking of the DAIF bits. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240422113523.4070414-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	arm64: assembler: update stale comment for disable_step_tsk	Mark Rutland	1	-1/+1
	A comment in the disable_step_tsk macro refers to synchronising with enable_dbg, as historically the entry used enable_dbg to unmask debug exceptions after disabling single-stepping. These days the unmasking happens in entry-common.c via local_daif_restore() or local_daif_inherit(), so the comment is stale. This logic is likely to chang in future, so it would be best to avoid referring to those macros specifically. Update the comment to take this into account, and describe it in terms of clearing DAIF.D so that it doesn't macro where this logic lives nor what it is called. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240422113523.4070414-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	arm64/sysreg: Update PIE permission encodings	Shiqi Liu	2	-24/+24
	Fix left shift overflow issue when the parameter idx is greater than or equal to 8 in the calculation of perm in PIRx_ELx_PERM macro. Fix this by modifying the encoding to use a long integer type. Signed-off-by: Shiqi Liu <shiqiliu@hust.edu.cn> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20240421063328.29710-1-shiqiliu@hust.edu.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	kselftest/arm64: Remove unused parameters in abi test	xieming	1	-1/+1
	Remove unused parameter i in tpidr2.c main function. Signed-off-by: xieming <xieming@kylinos.cn> Link: https://lore.kernel.org/r/20240422015730.89805-1-xieming@kylinos.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm-spe: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-24-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm-smmuv3: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-23-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm-dsu: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-22-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm-dmc620: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-21-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm-ccn: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-20-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm-cci: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-19-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/alibaba_uncore: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-18-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm_pmu: Assign parents for event_source devices	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240412161057.14099-17-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/imx_ddr: Assign parents for event_source devices	Jonathan Cameron	1	-0/+1
	Currently all this device appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Cc: Frank Li <Frank.li@nxp.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-16-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/qcom: Assign parents for event_source devices	Jonathan Cameron	2	-0/+2
	Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the platform devices. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Cc: Andy Gross <agross@kernel.org> Cc: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-15-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	Documentation: qcom-pmu: Use /sys/bus/event_source/devices paths	Jonathan Cameron	2	-2/+2
	To allow setting an appropriate parent for the struct pmu device remove existing references to /sys/devices/ path. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-14-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/riscv: Assign parents for event_source devices	Jonathan Cameron	2	-0/+2
	Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the appropriate platform devices. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Cc: Atish Patra <atishp@atishpatra.org> CC: Anup Patel <anup@brainfault.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-13-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/thunderx2: Assign parents for event_source devices	Jonathan Cameron	1	-0/+1
	Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Cc: Robert Richter <rric@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-12-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	Documentation: thunderx2-pmu: Use /sys/bus/event_source/devices paths	Jonathan Cameron	1	-1/+1
	To allow setting an appropriate parent for the struct pmu device remove existing references to /sys/devices/ path. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-11-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/xgene: Assign parents for event_source devices	Jonathan Cameron	1	-0/+1
	Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the hardware related struct device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Cc: Khuong Dinh <khuong@os.amperecomputing.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-10-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	Documentation: xgene-pmu: Use /sys/bus/event_source/devices paths	Jonathan Cameron	1	-1/+1
	To allow setting an appropriate parent for the struct pmu device remove existing references to /sys/devices/ path. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-9-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/arm_cspmu: Assign parents for event_source devices	Jonathan Cameron	1	-0/+1
	Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-8-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/amlogic: Assign parents for event_source devices	Jonathan Cameron	1	-0/+1
	Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-7-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/hisi-hns3: Assign parents for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-6-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	Documentation: hns-pmu: Use /sys/bus/event_source/devices paths	Jonathan Cameron	1	-4/+4
	To allow setting an appropriate parent for the struct pmu device remove existing references to /sys/devices/ path. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-5-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/hisi-uncore: Assign parents for event_source devices	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-4-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	Documentation: hisi-pmu: Drop reference to /sys/devices path	Jonathan Cameron	1	-1/+0
	Having assigned a parent to the device, the suggested path is no longer valid. As /sys/bus/event_sources based path is also provided, simply drop mention of alternative. Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-3-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	perf/hisi-pcie: Assign parent for event_source device	Jonathan Cameron	1	-0/+1
	Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-2-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19	arm64: Add Neoverse-V2 part	Besar Wicaksono	1	-0/+2
	Add the part number and MIDR for Neoverse-V2 Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20240109192310.16234-2-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-18	arm64: acpi: Honour firmware_signature field of FACS, if it exists	David Woodhouse	1	-0/+10
	If the firmware_signature changes then OSPM should not attempt to resume from hibernate, but should instead perform a clean reboot. Set the global swsusp_hardware_signature to allow the generic code to include the value in the swsusp header on disk, and perform the appropriate check on resume. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240412073530.2222496-3-dwmw2@infradead.org Signed-off-by: Will Deacon <will@kernel.org>
2024-04-18	ACPICA: Detect FACS even for hardware reduced platforms	David Woodhouse	2	-23/+14
	ACPICA commit 44fc328a1a14b097d92b8be83989e4bf69b6e6cb The FACS is optional even on hardware reduced platforms, and may exist for the purpose of communicating the hardware_signature field to provoke a clean reboot instead of a resume from hibernation. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240412073530.2222496-2-dwmw2@infradead.org Signed-off-by: Will Deacon <will@kernel.org>
2024-04-12	arm64: mm: Don't remap pgtables for allocate vs populate	Ryan Roberts	2	-32/+37
	During linear map pgtable creation, each pgtable is fixmapped / fixunmapped twice; once during allocation to zero the memory, and a again during population to write the entries. This means each table has 2 TLB invalidations issued against it. Let's fix this so that each table is only fixmapped/fixunmapped once, halving the number of TLBIs, and improving performance. Achieve this by separating allocation and initialization (zeroing) of the page. The allocated page is now fixmapped directly by the walker and initialized, before being populated and finally fixunmapped. This approach keeps the change small, but has the side effect that late allocations (using __get_free_page()) must also go through the generic memory clearing routine. So let's tell __get_free_page() not to zero the memory to avoid duplication. Additionally this approach means that fixmap/fixunmap is still used for late pgtable modifications. That's not technically needed since the memory is all mapped in the linear map by that point. That's left as a possible future optimization if found to be needed. Execution time of map_mem(), which creates the kernel linear map page tables, was measured on different machines with different RAM configs: \| Apple M2 VM \| Ampere Altra\| Ampere Altra\| Ampere Altra \| VM, 16G \| VM, 64G \| VM, 256G \| Metal, 512G ---------------\|-------------\|-------------\|-------------\|------------- \| ms (%) \| ms (%) \| ms (%) \| ms (%) ---------------\|-------------\|-------------\|-------------\|------------- before \| 11 (0%) \| 161 (0%) \| 656 (0%) \| 1654 (0%) after \| 10 (-11%) \| 104 (-35%) \| 438 (-33%) \| 1223 (-26%) Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com> Tested-by: Eric Chanudet <echanude@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240412131908.433043-4-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-12	arm64: mm: Batch dsb and isb when populating pgtables	Ryan Roberts	2	-2/+16
	After removing uneccessary TLBIs, the next bottleneck when creating the page tables for the linear map is DSB and ISB, which were previously issued per-pte in __set_pte(). Since we are writing multiple ptes in a given pte table, we can elide these barriers and insert them once we have finished writing to the table. Execution time of map_mem(), which creates the kernel linear map page tables, was measured on different machines with different RAM configs: \| Apple M2 VM \| Ampere Altra\| Ampere Altra\| Ampere Altra \| VM, 16G \| VM, 64G \| VM, 256G \| Metal, 512G ---------------\|-------------\|-------------\|-------------\|------------- \| ms (%) \| ms (%) \| ms (%) \| ms (%) ---------------\|-------------\|-------------\|-------------\|------------- before \| 78 (0%) \| 435 (0%) \| 1723 (0%) \| 3779 (0%) after \| 11 (-86%) \| 161 (-63%) \| 656 (-62%) \| 1654 (-56%) Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com> Tested-by: Eric Chanudet <echanude@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240412131908.433043-3-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-12	arm64: mm: Don't remap pgtables per-cont(pte\|pmd) block	Ryan Roberts	1	-13/+14
	A large part of the kernel boot time is creating the kernel linear map page tables. When rodata=full, all memory is mapped by pte. And when there is lots of physical ram, there are lots of pte tables to populate. The primary cost associated with this is mapping and unmapping the pte table memory in the fixmap; at unmap time, the TLB entry must be invalidated and this is expensive. Previously, each pmd and pte table was fixmapped/fixunmapped for each cont(pte\|pmd) block of mappings (16 entries with 4K granule). This means we ended up issuing 32 TLBIs per (pmd\|pte) table during the population phase. Let's fix that, and fixmap/fixunmap each page once per population, for a saving of 31 TLBIs per (pmd\|pte) table. This gives a significant boot speedup. Execution time of map_mem(), which creates the kernel linear map page tables, was measured on different machines with different RAM configs: \| Apple M2 VM \| Ampere Altra\| Ampere Altra\| Ampere Altra \| VM, 16G \| VM, 64G \| VM, 256G \| Metal, 512G ---------------\|-------------\|-------------\|-------------\|------------- \| ms (%) \| ms (%) \| ms (%) \| ms (%) ---------------\|-------------\|-------------\|-------------\|------------- before \| 168 (0%) \| 2198 (0%) \| 8644 (0%) \| 17447 (0%) after \| 78 (-53%) \| 435 (-80%) \| 1723 (-80%) \| 3779 (-78%) Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com> Tested-by: Eric Chanudet <echanude@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240412131908.433043-2-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-12	arm64: boot: Support Flat Image Tree	Simon Glass	7	-3/+333
	Add a script which produces a Flat Image Tree (FIT), a single file containing the built kernel and associated devicetree files. Compression defaults to gzip which gives a good balance of size and performance. The files compress from about 86MB to 24MB using this approach. The FIT can be used by bootloaders which support it, such as U-Boot and Linuxboot. It permits automatic selection of the correct devicetree, matching the compatible string of the running board with the closest compatible string in the FIT. There is no need for filenames or other workarounds. Add a 'make image.fit' build target for arm64, as well. The FIT can be examined using 'dumpimage -l'. This uses the 'dtbs-list' file but processes only .dtb files, ignoring the overlay .dtbo files. This features requires pylibfdt (use 'pip install libfdt'). It also requires compression utilities for the algorithm being used. Supported compression options are the same as the Image.xxx files. Use FIT_COMPRESSION to select an algorithm other than gzip. While FIT supports a ramdisk / initrd, no attempt is made to support this here, since it must be built separately from the Linux build. Signed-off-by: Simon Glass <sjg@chromium.org> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240329032836.141899-3-sjg@chromium.org Signed-off-by: Will Deacon <will@kernel.org>
2024-04-12	arm64: Add BOOT_TARGETS variable	Simon Glass	1	-1/+5
	Add a new variable containing a list of possible targets. Mark them as phony. This matches the approach taken for arch/arm Signed-off-by: Simon Glass <sjg@chromium.org> Reviewed-by: Nicolas Schier <n.schier@avm.de> Link: https://lore.kernel.org/r/20240329032836.141899-2-sjg@chromium.org Signed-off-by: Will Deacon <will@kernel.org>
2024-04-12	arm64: arm_pmuv3: Correctly extract and check the PMUVer	Yicong Yang	2	-7/+9
	Currently we're using "sbfx" to extract the PMUVer from ID_AA64DFR0_EL1 and skip the init/reset if no PMU present when the extracted PMUVer is negative or is zero. However for PMUv3p8 the PMUVer will be 0b1000 and PMUVer extracted by "sbfx" will always be negative and we'll skip the init/reset in __init_el2_debug/reset_pmuserenr_el0 unexpectedly. So this patch use "ubfx" instead of "sbfx" to extract the PMUVer. If the PMUVer is implementation defined (0b1111) or not implemented(0b0000) then skip the reset/init. Previously we'll also skip the init/reset if the PMUVer is higher than the version we known (currently PMUv3p9), with this patch we'll only skip if the PMU is not implemented or implementation defined. This keeps consistence with how we probe the PMU in the driver with pmuv3_implemented(). Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240411123030.7201-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-11	arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES	Gavin Shan	1	-2/+2
	MAX_TLBI_RANGE_PAGES pages is covered by SCALE#3 and NUM#31 and it's supported now. Allow TLBI RANGE operation when the number of pages is equal to MAX_TLBI_RANGE_PAGES in __flush_tlb_range_nosync(). Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240405035852.1532010-4-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-11	arm64: tlb: Improve __TLBI_VADDR_RANGE()	Gavin Shan	1	-11/+18
	The macro returns the operand of TLBI RANGE instruction. A mask needs to be applied to each individual field upon producing the operand, to avoid the adjacent fields can interfere with each other when invalid arguments have been provided. The code looks more tidy at least with a mask and FIELD_PREP(). Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240405035852.1532010-3-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-10	arm64: tlb: Fix TLBI RANGE operand	Gavin Shan	1	-9/+11
	KVM/arm64 relies on TLBI RANGE feature to flush TLBs when the dirty pages are collected by VMM and the page table entries become write protected during live migration. Unfortunately, the operand passed to the TLBI RANGE instruction isn't correctly sorted out due to the commit 117940aa6e5f ("KVM: arm64: Define kvm_tlb_flush_vmid_range()"). It leads to crash on the destination VM after live migration because TLBs aren't flushed completely and some of the dirty pages are missed. For example, I have a VM where 8GB memory is assigned, starting from 0x40000000 (1GB). Note that the host has 4KB as the base page size. In the middile of migration, kvm_tlb_flush_vmid_range() is executed to flush TLBs. It passes MAX_TLBI_RANGE_PAGES as the argument to __kvm_tlb_flush_vmid_range() and __flush_s2_tlb_range_op(). SCALE#3 and NUM#31, corresponding to MAX_TLBI_RANGE_PAGES, isn't supported by __TLBI_RANGE_NUM(). In this specific case, -1 has been returned from __TLBI_RANGE_NUM() for SCALE#3/2/1/0 and rejected by the loop in the __flush_tlb_range_op() until the variable @scale underflows and becomes -9, 0xffff708000040000 is set as the operand. The operand is wrong since it's sorted out by __TLBI_VADDR_RANGE() according to invalid @scale and @num. Fix it by extending __TLBI_RANGE_NUM() to support the combination of SCALE#3 and NUM#31. With the changes, [-1 31] instead of [-1 30] can be returned from the macro, meaning the TLBs for 0x200000 pages in the above example can be flushed in one shoot with SCALE#3 and NUM#31. The macro TLBI_RANGE_MASK is dropped since no one uses it any more. The comments are also adjusted accordingly. Fixes: 117940aa6e5f ("KVM: arm64: Define kvm_tlb_flush_vmid_range()") Cc: stable@kernel.org # v6.6+ Reported-by: Yihuang Yu <yihyu@redhat.com> Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240405035852.1532010-2-gshan@redhat.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>