linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-09-07	usb: dwc3: Avoid unmapping USB requests if endxfer is not complete	Wesley Cheng	3	-2/+10
	If DWC3_EP_DELAYED_STOP is set during stop active transfers, then do not continue attempting to unmap request buffers during dwc3_remove_requests(). This can lead to SMMU faults, as the controller has not stopped the processing of the TRB. Defer this sequence to the EP0 out start, which ensures that there are no pending SETUP transactions before issuing the endxfer. Reviewed-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> Link: https://lore.kernel.org/r/20220901193625.8727-2-quic_wcheng@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear	Lorenzo Bianconi	1	-1/+1
	Set ib1 state to MTK_FOE_STATE_UNBIND in __mtk_foe_entry_clear routine. Fixes: 33fc42de33278 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	USB/ARM: Switch S3C2410 UDC to GPIO descriptors	Linus Walleij	10	-67/+100
	This converts the S3C2410 UDC USB device controller to use GPIO descriptor tables and modern GPIO. Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20220901081649.564348-1-linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	usb: misc: uss720: fix uninitialized variable rlen	Dongliang Mu	1	-4/+4
	Smatch reports the following error: uninitialized symbol 'rlen' drivers/usb/misc/uss720.c:514 parport_uss720_epp_write_data() error drivers/usb/misc/uss720.c:575 parport_uss720_ecp_write_data() error drivers/usb/misc/uss720.c:593 parport_uss720_ecp_read_data() error drivers/usb/misc/uss720.c:626 parport_uss720_write_compat() error The root cause is, the failure of usb_bulk_msg leads to the uninitialized variable rlen in printk function. Fix this by initializing rlen with zero. Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Link: https://lore.kernel.org/r/20220903100004.2874741-1-dzm91@hust.edu.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	usb: gadget: f_fs: stricter integer overflow checks	Dan Carpenter	1	-2/+2
	This from static analysis. The vla_item() takes a size and adds it to the total. It has a built in integer overflow check so if it encounters an integer overflow anywhere then it records the total as SIZE_MAX. However there is an issue here because the "lang_count*(needed_count+1)" multiplication can overflow. Technically the "lang_count + 1" addition could overflow too, but that would be detected and is harmless. Fix both using the new size_add() and size_mul() functions. Fixes: e6f3862fa1ec ("usb: gadget: FunctionFS: Remove VLAIS usage from gadget code") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YxDI3lMYomE7WCjn@kili Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	spi: Add capability to perform some transfer with chipselect off	Christophe Leroy	2	-3/+11
	Some components require a few clock cycles with chipselect off before or/and after the data transfer done with CS on. Typically IDT 801034 QUAD PCM CODEC datasheet states "Note *: CCLK should have one cycle before CS goes low, and two cycles after CS goes high". The cycles "before" are implicitely provided by all previous activity on the SPI bus. But the cycles "after" must be provided in order to terminate the SPI transfer. In order to use that kind of component, add a cs_off flag to spi_transfer struct. When this flag is set, the transfer is performed with chipselect off. This allows consummer to add a dummy transfer at the end of the transfer list which is performed with chipselect OFF, providing the required additional clock cycles. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/434165c46f06d802690208a11e7ea2500e8da4c7.1662558898.git.christophe.leroy@csgroup.eu Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-07	iommu/amd: Add command-line option to enable different page table	Vasant Hegde	2	-5/+20
	Enhance amd_iommu command line option to specify v1 or v2 page table. By default system will boot in V1 page table mode. Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API	Suravee Suthikulpanit	1	-0/+25
	Introduce init function for setting up DMA domain for DMA-API with the IOMMU v2 page table. Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd: Add support for Guest IO protection	Suravee Suthikulpanit	3	-0/+19
	AMD IOMMU introduces support for Guest I/O protection where the request from the I/O device without a PASID are treated as if they have PASID 0. Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd: Initial support for AMD IOMMU v2 page table	Vasant Hegde	5	-2/+423
	Introduce IO page table framework support for AMD IOMMU v2 page table. This patch implements 4 level page table within iommu amd driver and supports 4K/2M/1G page sizes. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd: Update sanity check when enable PRI/ATS for IOMMU v1 table	Suravee Suthikulpanit	1	-3/+11
	Currently, PPR/ATS can be enabled only if the domain is type identity mapping. However, when allowing the IOMMU v2 page table to be used for DMA-API, the check is no longer valid. Update the sanity check to only apply for when using AMD_IOMMU_V1 page table mode. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-6-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd: Refactor amd_iommu_domain_enable_v2 to remove locking	Suravee Suthikulpanit	1	-19/+27
	The current function to enable IOMMU v2 also lock the domain. In order to reuse the same code in different code path, in which the domain has already been locked, refactor the function to separate the locking from the enabling logic. Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-5-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd: Add map/unmap_pages() iommu_domain_ops callback support	Vasant Hegde	1	-13/+16
	Implement the map_pages() and unmap_pages() callback for the AMD IOMMU driver to allow calls from iommu core to map and unmap multiple pages. Also deprecate map/unmap callbacks. Finally gatherer is not updated by iommu_v1_unmap_pages(). Hence pass NULL instead of gather to iommu_v1_unmap_pages. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-4-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd/io-pgtable: Implement unmap_pages io_pgtable_ops callback	Vasant Hegde	1	-8/+9
	Implement the io_pgtable_ops->unmap_pages() callback for AMD driver and deprecate io_pgtable_ops->unmap callback. Also if fetch_pte() returns NULL then return from unmap_mapages() instead of trying to continue to unmap remaining pages. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/amd/io-pgtable: Implement map_pages io_pgtable_ops callback	Vasant Hegde	1	-25/+34
	Implement the io_pgtable_ops->map_pages() callback for AMD driver. Also deprecate io_pgtable->map callback. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20220825063939.8360-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	netfilter: nfnetlink_osf: fix possible bogus match in nf_osf_find()	Pablo Neira Ayuso	1	-1/+3
	nf_osf_find() incorrectly returns true on mismatch, this leads to copying uninitialized memory area in nft_osf which can be used to leak stale kernel stack data to userspace. Fixes: 22c7652cdaa8 ("netfilter: nft_osf: Add version option support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	netfilter: nf_conntrack_irc: Tighten matching on DCC message	David Leadbeater	1	-6/+28
	CTCP messages should only be at the start of an IRC message, not anywhere within it. While the helper only decodes packes in the ORIGINAL direction, its possible to make a client send a CTCP message back by empedding one into a PING request. As-is, thats enough to make the helper believe that it saw a CTCP message. Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port") Signed-off-by: David Leadbeater <dgl@dgl.cx> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	iommu/virtio: Fix interaction with VFIO	Jean-Philippe Brucker	1	-0/+11
	Commit e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence") requires IOMMU drivers to advertise IOMMU_CAP_CACHE_COHERENCY, in order to be used by VFIO. Since VFIO does not provide to userspace the ability to maintain coherency through cache invalidations, it requires hardware coherency. Advertise the capability in order to restore VFIO support. The meaning of IOMMU_CAP_CACHE_COHERENCY also changed from "IOMMU can enforce cache coherent DMA transactions" to "IOMMU_CACHE is supported". While virtio-iommu cannot enforce coherency (of PCIe no-snoop transactions), it does support IOMMU_CACHE. We can distinguish different cases of non-coherent DMA: (1) When accesses from a hardware endpoint are not coherent. The host would describe such a device using firmware methods ('dma-coherent' in device-tree, '_CCA' in ACPI), since they are also needed without a vIOMMU. In this case mappings are created without IOMMU_CACHE. virtio-iommu doesn't need any additional support. It sends the same requests as for coherent devices. (2) When the physical IOMMU supports non-cacheable mappings. Supporting those would require a new feature in virtio-iommu, new PROBE request property and MAP flags. Device drivers would use a new API to discover this since it depends on the architecture and the physical IOMMU. (3) When the hardware supports PCIe no-snoop. It is possible for assigned PCIe devices to issue no-snoop transactions, and the virtio-iommu specification is lacking any mention of this. Arm platforms don't necessarily support no-snoop, and those that do cannot enforce coherency of no-snoop transactions. Device drivers must be careful about assuming that no-snoop transactions won't end up cached; see commit e02f5c1bb228 ("drm: disable uncached DMA optimization for ARM and arm64"). On x86 platforms, the host may or may not enforce coherency of no-snoop transactions with the physical IOMMU. But according to the above commit, on x86 a driver which assumes that no-snoop DMA is compatible with uncached CPU mappings will also work if the host enforces coherency. Although these issues are not specific to virtio-iommu, it could be used to facilitate discovery and configuration of no-snoop. This would require a new feature bit, PROBE property and ATTACH/MAP flags. Cc: stable@vger.kernel.org Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220825154622.86759-1-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	netfilter: conntrack: ignore overly delayed tcp packets	Florian Westphal	1	-28/+21
	If 'nf_conntrack_tcp_loose' is off (the default), tcp packets that are outside of the current window are marked as INVALID. nf/iptables rulesets often drop such packets via 'ct state invalid' or similar checks. For overly delayed acks, this can be a nuisance if such 'invalid' packets are also logged. Since they are not invalid in a strict sense, just ignore them, i.e. conntrack won't extend timeout or change state so that they do not match invalid state rules anymore. This also avoids unwantend connection stalls in case conntrack considers retransmission (of data that did not reach the peer) as too old. The else branch of the conditional becomes obsolete. Next patch will reformant the now always-true if condition. The existing workaround for data that exceeds the calculated receive window is adjusted to use the 'ignore' state so that these packets do not refresh the timeout or change state other than updating ->td_end. Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	netfilter: conntrack: prepare tcp_in_window for ternary return value	Florian Westphal	1	-49/+87
	tcp_in_window returns true if the packet is in window and false if it is not. If its outside of window, packet will be treated as INVALID. There are corner cases where the packet should still be tracked, because rulesets may drop or log such packets, even though they can occur during normal operation, such as overly delayed acks. In extreme cases, connection may hang forever because conntrack state differs from real state. There is no retransmission for ACKs. In case of ACK loss after conntrack processing, its possible that a connection can be stuck because the actual retransmits are considered stale ("SEQ is under the lower bound (already ACKed data retransmitted)". The problem is made worse by carrier-grade-nat which can also result in stale packets from old connections to get treated as 'recent' packets in conntrack (it doesn't support tcp timestamps at this time). Prepare tcp_in_window() to return an enum that tells the desired action (in-window/accept, bogus/drop). A third action (accept the packet as in-window, but do not change state) is added in a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	ASoC: soc-dapm.c: random cleanup	Mark Brown	1	-15/+14
	Merge series from Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>: These are cleanup patches for soc-dapm.c. Each patches are not related, very random cleanup.
2022-09-07	kselftest/arm64: Enforce actual ABI for SVE syscalls	Mark Brown	1	-18/+37
	Currently syscall-abi permits the bits in Z registers not shared with the V registers as well as all of the predicate registers to be preserved on syscall but the actual implementation has always cleared them and our documentation has now been updated to make that the documented ABI so update the syscall-abi test to match. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220829162502.886816-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Correct buffer allocation for SVE Z registers	Mark Brown	1	-2/+2
	The buffer used for verifying SVE Z registers allocated enough space for 16 maximally sized registers rather than 32 due to using the macro for the number of P registers. In practice this didn't matter since for historical reasons the maximum VQ defined in the ABI is greater the architectural maximum so we will always allocate more space than is needed even with emulated platforms implementing the architectural maximum. Still, we should use the right define. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220829162502.886816-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Include larger SVE and SME VLs in signal tests	Mark Brown	4	-46/+33
	Now that the core utilities for signal testing support handling data in EXTRA_CONTEXT blocks we can test larger SVE and SME VLs which spill over the limits in the base signal context. This is done by defining storage for the context as a union with a ucontext_t and a buffer together with some helpers for getting relevant sizes and offsets like we do for fake_sigframe, this isn't the most lovely code ever but is fairly straightforward to implement and much less invasive to the somewhat unclear and indistinct layers of abstraction in the signal handling test code. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-11-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Allow larger buffers in get_signal_context()	Mark Brown	14	-15/+16
	In order to allow testing of signal contexts that overflow the base signal frame allow callers to pass the buffer size for the user context into get_signal_context(). No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-10-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Preserve any EXTRA_CONTEXT in handle_signal_copyctx()	Mark Brown	1	-2/+48
	When preserving the signal context for later verification by testcases check for and include any EXTRA_CONTEXT block if enough space has been provided. Since the EXTRA_CONTEXT block includes a pointer to the start of the additional data block we need to do at least some fixup on the copied data. For simplicity in users we do this by extending the length of the EXTRA_CONTEXT to include the following termination record, this will cause users to see the extra data as part of the linked list of contexts without needing any special handling. Care will be needed if any specific tests for EXTRA_CONTEXT are added beyond the validation done in ASSERT_GOOD_CONTEXT. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-9-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Validate contents of EXTRA_CONTEXT blocks	Mark Brown	1	-4/+21
	Currently in validate_reserved() we check the basic form and contents of an EXTRA_CONTEXT block but do not actually validate anything inside the data block it provides. Extend the validation to do so, when we get to the terminator for the main data block reset and start walking the extra data block instead. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-8-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Only validate each signal context once	Mark Brown	1	-7/+12
	Currently for the more complex signal context types we validate the context specific details the end of the parsing loop validate_reserved() if we've ever seen a context of that type. This is currently merely a bit inefficient but will get a bit awkward when we start parsing extra_context, at which point we will need to reset the head to advance into the extra space that extra_context provides. Instead only do the more detailed checks on each context type the first time we see that context type. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Remove unneeded protype for validate_extra_context()	Mark Brown	1	-2/+0
	Nothing outside testcases.c should need to use validate_extra_context(), remove the prototype to ensure nothing does. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Fix validation of EXTRA_CONTEXT signal context location	Mark Brown	1	-1/+1
	Currently in validate_extra_context() we assert both that the extra data pointed to by the EXTRA_CONTEXT is 16 byte aligned and that it immediately follows the struct _aarch64_ctx providing the terminator for the linked list of contexts in the signal frame. Since struct _aarch64_ctx is an 8 byte structure which must be 16 byte aligned these cannot both be true. As documented in sigcontext.h and implemented by the kernel the extra data should be at the next 16 byte aligned address after the terminator so fix the validation to match. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Fix validatation termination record after EXTRA_CONTEXT	Mark Brown	1	-1/+1
	When arm64 signal context data overflows the base struct sigcontext it gets placed in an extra buffer pointed to by a record of type EXTRA_CONTEXT in the base struct sigcontext which is required to be the last record in the base struct sigframe. The current validation code attempts to check this by using GET_RESV_NEXT_HEAD() to step forward from the current record to the next but that is a macro which assumes it is being provided with a struct _aarch64_ctx and uses the size there to skip forward to the next record. Instead validate_extra_context() passes it a struct extra_context which has a separate size field. This compiles but results in us trying to validate a termination record in completely the wrong place, at best failing validation and at worst just segfaulting. Fix this by passing the struct _aarch64_ctx we meant to into the macro. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Validate signal ucontext in place	Mark Brown	1	-3/+6
	In handle_input_signal_copyctx() we use ASSERT_GOOD_CONTEXT() to validate that the context we are saving meets expectations however we do this on the saved copy rather than on the actual signal context passed in. This breaks validation of EXTRA_CONTEXT since we attempt to validate the ABI requirement that the additional space supplied is immediately after the termination record in the standard context which will not be the case after it has been copied to another location. Fix this by doing the validation before we copy. Note that nothing actually looks inside the EXTRA_CONTEXT at present. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Enumerate SME rather than SVE vector lengths for za_regs	Mark Brown	1	-2/+2
	The za_regs signal test was enumerating the SVE vector lengths rather than the SME vector lengths through cut'n'paste error when determining what to test. Enumerate the SME vector lengths instead. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829160703.874492-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Add a test for signal frames with ZA disabled	Mark Brown	1	-0/+119
	When ZA is disabled there should be no register data in the ZA signal frame, add a test case which confirms that this is the case. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829155728.854947-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Tighten up validation of ZA signal context	Mark Brown	1	-1/+15
	Currently we accept any size for the ZA signal context that the shared code will accept which means we don't verify that any data is present. Since we have enabled ZA we know that there must be data so strengthen the check to only accept a signal frame with data, and while we're at it since we enabled ZA but did not set any data we know that ZA must contain zeros, confirm that. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829155728.854947-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: kselftest harness for FP stress tests	Mark Brown	3	-1/+540
	Currently the stress test programs for floating point context switching are run by hand, there are extremely simplistic harnesses which run some copies of each test individually but they are not integrated into kselftest and with SVE and SME they only run with whatever vector length the process has by default. This is hassle when running the tests and means that they're not being run at all by CI systems picking up kselftest. In order to improve our coverage and provide a more convenient interface provide a harness program which starts enough stress test programs up to cause context switching and runs them for a set period. If only FPSIMD is available in the system we start two copies of the FPSIMD stress test per CPU, otherwise we start one copy of the FPSIMD and then start the SVE, streaming SVE and ZA tests once per CPU for each available VL they have to run on. We then run for a set period monitoring for any errors reported by the test programs before cleanly terminating them. In order to provide additional coverage of signal handling and some extra noise in the scheduling we send a SIGUSR2 to the stress tests once a second, the tests will count the number of signals they get. Since kselftest is generally expected to run quickly we by default only run for ten seconds. This is enough to show if there is anything cripplingly wrong but not exactly a thorough soak test, for interactive and more focused use a command line option -t N is provided which overrides the length of time to run for (specified in seconds) and if 0 is specified then there is no timeout and the test must be manually terminated. The timeout is counted in seconds with no output, this is done to account for the potentially slow startup time for the test programs on virtual platforms which tend to struggle during startup as they are both slow and tend to support a wide range of vector lengths. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829154452.824870-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	kselftest/arm64: Install signal handlers before output in FP stress tests	Mark Brown	3	-72/+72
	To interface more robustly with other processes install the signal handers in the floating point stress tests before we produce any output, this means that a parent process can know that if it has seen any output from the test then the test is ready to handle incoming signals. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220906220056.820295-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-07	iommu/vt-d: Fix lockdep splat due to klist iteration in atomic context	Lu Baolu	1	-28/+19
	With CONFIG_INTEL_IOMMU_DEBUGFS enabled, below lockdep splat are seen when an I/O fault occurs on a machine with an Intel IOMMU in it. DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set DMAR: Dump dmar0 table entries for IOVA 0x0 DMAR: root entry: 0x0000000127f42001 DMAR: context entry: hi 0x0000000000001502, low 0x000000012d8ab001 ================================ WARNING: inconsistent lock state 5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. rngd/1006 [HC1[1]:SC0[0]:HE0:SE1] takes: ff177021416f2d78 (&k->k_lock){?.+.}-{2:2}, at: klist_next+0x1b/0x160 {HARDIRQ-ON-W} state was registered at: lock_acquire+0xce/0x2d0 _raw_spin_lock+0x33/0x80 klist_add_tail+0x46/0x80 bus_add_device+0xee/0x150 device_add+0x39d/0x9a0 add_memory_block+0x108/0x1d0 memory_dev_init+0xe1/0x117 driver_init+0x43/0x4d kernel_init_freeable+0x1c2/0x2cc kernel_init+0x16/0x140 ret_from_fork+0x1f/0x30 irq event stamp: 7812 hardirqs last enabled at (7811): [<ffffffff85000e86>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (7812): [<ffffffff84f16894>] irqentry_enter+0x54/0x60 softirqs last enabled at (7794): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170 softirqs last disabled at (7787): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170 The klist iterator functions using spin_lock_irq() but the klist insertion functions using spin_*lock(), combined with the Intel DMAR IOMMU driver iterating over klists from atomic (hardirq) context, where pci_get_domain_bus_and_slot() calls into bus_find_device() which iterates over klists. As currently there's no plan to fix the klist to make it safe to use in atomic context, this fixes the lockdep splat by avoid calling pci_get_domain_bus_and_slot() in the hardirq context. Fixes: 8ac0b64b9735 ("iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk()") Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Link: https://lore.kernel.org/linux-iommu/Yvo2dfpEh%2FWC+Wrr@wantstofly.org/ Link: https://lore.kernel.org/linux-iommu/YvyBdPwrTuHHbn5X@wantstofly.org/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220819015949.4795-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/vt-d: Fix recursive lock issue in iommu_flush_dev_iotlb()	Lu Baolu	1	-16/+23
	The per domain spinlock is acquired in iommu_flush_dev_iotlb(), which is possbile to be called in the interrupt context. For example, the drm-intel's CI system got completely blocked with below error: WARNING: inconsistent lock state 6.0.0-rc1-CI_DRM_11990-g6590d43d39b9+ #1 Not tainted -------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes: ffff88810440d678 (&domain->lock){+.?.}-{2:2}, at: iommu_flush_dev_iotlb.part.61+0x23/0x80 {SOFTIRQ-ON-W} state was registered at: lock_acquire+0xd3/0x310 _raw_spin_lock+0x2a/0x40 domain_update_iommu_cap+0x20b/0x2c0 intel_iommu_attach_device+0x5bd/0x860 __iommu_attach_device+0x18/0xe0 bus_iommu_probe+0x1f3/0x2d0 bus_set_iommu+0x82/0xd0 intel_iommu_init+0xe45/0x102a pci_iommu_init+0x9/0x31 do_one_initcall+0x53/0x2f0 kernel_init_freeable+0x18f/0x1e1 kernel_init+0x11/0x120 ret_from_fork+0x1f/0x30 irq event stamp: 162354 hardirqs last enabled at (162354): [<ffffffff81b59274>] _raw_spin_unlock_irqrestore+0x54/0x70 hardirqs last disabled at (162353): [<ffffffff81b5901b>] _raw_spin_lock_irqsave+0x4b/0x50 softirqs last enabled at (162338): [<ffffffff81e00323>] __do_softirq+0x323/0x48e softirqs last disabled at (162349): [<ffffffff810c1588>] irq_exit_rcu+0xb8/0xe0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&domain->lock); <Interrupt> lock(&domain->lock); * DEADLOCK * 1 lock held by swapper/6/0: This coverts the spin_lock/unlock() into the irq save/restore varieties to fix the recursive locking issues. Fixes: ffd5869d93530 ("iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20220817025650.3253959-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/vt-d: Correctly calculate sagaw value of IOMMU	Lu Baolu	1	-3/+25
	The Intel IOMMU driver possibly selects between the first-level and the second-level translation tables for DMA address translation. However, the levels of page-table walks for the 4KB base page size are calculated from the SAGAW field of the capability register, which is only valid for the second-level page table. This causes the IOMMU driver to stop working if the hardware (or the emulated IOMMU) advertises only first-level translation capability and reports the SAGAW field as 0. This solves the above problem by considering both the first level and the second level when calculating the supported page table levels. Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level") Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/vt-d: Fix kdump kernels boot failure with scalable mode	Lu Baolu	2	-59/+50
	The translation table copying code for kdump kernels is currently based on the extended root/context entry formats of ECS mode defined in older VT-d v2.5, and doesn't handle the scalable mode formats. This causes the kexec capture kernel boot failure with DMAR faults if the IOMMU was enabled in scalable mode by the previous kernel. The ECS mode has already been deprecated by the VT-d spec since v3.0 and Intel IOMMU driver doesn't support this mode as there's no real hardware implementation. Hence this converts ECS checking in copying table code into scalable mode. The existing copying code consumes a bit in the context entry as a mark of copied entry. It needs to work for the old format as well as for the extended context entries. As it's hard to find such a common bit for both legacy and scalable mode context entries. This replaces it with a per- IOMMU bitmap. Fixes: 7373a8cc38197 ("iommu/vt-d: Setup context and enable RID2PASID support") Cc: stable@vger.kernel.org Reported-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Wen Jin <wen.jin@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220817011035.3250131-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	MIPS: OCTEON: irq: Fix octeon_irq_force_ciu_mapping()	Alexander Sverdlin	1	-0/+10
	For irq_domain_associate() to work the virq descriptor has to be pre-allocated in advance. Otherwise the following happens: WARNING: CPU: 0 PID: 0 at .../kernel/irq/irqdomain.c:527 irq_domain_associate+0x298/0x2e8 error: virq128 is not allocated Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.78-... #1 ... Call Trace: [<ffffffff801344c4>] show_stack+0x9c/0x130 [<ffffffff80769550>] dump_stack+0x90/0xd0 [<ffffffff801576d0>] __warn+0x118/0x130 [<ffffffff80157734>] warn_slowpath_fmt+0x4c/0x70 [<ffffffff801b83c0>] irq_domain_associate+0x298/0x2e8 [<ffffffff80a43bb8>] octeon_irq_init_ciu+0x4c8/0x53c [<ffffffff80a76cbc>] of_irq_init+0x1e0/0x388 [<ffffffff80a452cc>] init_IRQ+0x4c/0xf4 [<ffffffff80a3cc00>] start_kernel+0x404/0x698 Use irq_alloc_desc_at() to avoid the above problem. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-09-07	selftests: nft_concat_range: add socat support	Florian Westphal	1	-12/+53
	There are different flavors of 'nc' around, this script fails on my test vm because 'nc' is 'nmap-ncat' which isn't 100% compatible. Add socat support and use it if available. Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	netfilter: nf_conntrack_sip: fix ct_sip_walk_headers	Igor Ryzhov	1	-2/+2
	ct_sip_next_header and ct_sip_get_header return an absolute value of matchoff, not a shift from current dataoff. So dataoff should be assigned matchoff, not incremented by it. This issue can be seen in the scenario when there are multiple Contact headers and the first one is using a hostname and other headers use IP addresses. In this case, ct_sip_walk_headers will work as follows: The first ct_sip_get_header call to will find the first Contact header but will return -1 as the header uses a hostname. But matchoff will be changed to the offset of this header. After that, dataoff should be set to matchoff, so that the next ct_sip_get_header call find the next Contact header. But instead of assigning dataoff to matchoff, it is incremented by it, which is not correct, as matchoff is an absolute value of the offset. So on the next call to the ct_sip_get_header, dataoff will be incorrect, and the next Contact header may not be found at all. Fixes: 05e3ced297fe ("[NETFILTER]: nf_conntrack_sip: introduce SIP-URI parsing helper") Signed-off-by: Igor Ryzhov <iryzhov@nfware.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	Merge branch 'macsec-offload-mlx5'	David S. Miller	28	-53/+3180
	Saeed Mahameed says: ==================== Introduce MACsec skb_metadata_dst and mlx5 macsec offload v1->v2: - attach mlx5 implementation patches. This patchset introduces MACsec skb_metadata_dst to lay the ground for MACsec HW offload. MACsec is an IEEE standard (IEEE 802.1AE) for MAC security. It defines a way to establish a protocol independent connection between two hosts with data confidentiality, authenticity and/or integrity, using GCM-AES. MACsec operates on the Ethernet layer and as such is a layer 2 protocol, which means it’s designed to secure traffic within a layer 2 network, including DHCP or ARP requests. Linux has a software implementation of the MACsec standard and HW offloading support. The offloading is re-using the logic, netlink API and data structures of the existing MACsec software implementation. For Tx: In the current MACsec offload implementation, MACsec interfaces shares the same MAC address by default. Therefore, HW can't distinguish from which MACsec interface the traffic originated from. MACsec stack will use skb_metadata_dst to store the SCI value, which is unique per MACsec interface, skb_metadat_dst will be used later by the offloading device driver to associate the SKB with the corresponding offloaded interface (SCI) to facilitate HW MACsec offload. For Rx: Like in the Tx changes, if there are more than one MACsec device with the same MAC address as in the packet's destination MAC, the packet will be forward only to one of the devices and not neccessarly to the desired one. Offloading device driver sets the MACsec skb_metadata_dst sci field with the appropriaate Rx SCI for each SKB so the MACsec rx handler will know to which port to divert those skbs, instead of wrongly solely relaying on dst MAC address comparison. 1) patch 1,2, Add support to skb_metadata_dst in MACsec code: net/macsec: Add MACsec skb_metadata_dst Tx Data path support net/macsec: Add MACsec skb_metadata_dst Rx Data path support 2) patch 3, Move some MACsec driver code for sharing with various drivers that implements offload: net/macsec: Move some code for sharing with various drivers that implements offload 3) The rest of the patches introduce mlx5 implementation for macsec offloads TX and RX via steering tables. a) TX, intercept skbs with macsec offlad mark in skb_metadata_dst and mark the descriptor for offload. b) RX, intercept offloaded frames and prepare the proper skb_metadata_dst to mark offloaded rx frames. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net/mlx5e: Add support to configure more than one macsec offload device	Lior Nahmanson	1	-46/+175
	Add the ability to add up to 16 MACsec offload interfaces over the same physical interface Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net/mlx5e: Add MACsec stats support for Rx/Tx flows	Lior Nahmanson	8	-2/+136
	Add the following statistics: RX successfully decrypted MACsec packets: macsec_rx_pkts : Number of packets decrypted successfully macsec_rx_bytes : Number of bytes decrypted successfully Rx dropped MACsec packets: macsec_rx_pkts_drop : Number of MACsec packets dropped macsec_rx_bytes_drop : Number of MACsec bytes dropped TX successfully encrypted MACsec packets: macsec_tx_pkts : Number of packets encrypted/authenticated successfully macsec_tx_bytes : Number of bytes encrypted/authenticated successfully Tx dropped MACsec packets: macsec_tx_pkts_drop : Number of MACsec packets dropped macsec_tx_bytes_drop : Number of MACsec bytes dropped The above can be seen using: ethtool -S <ifc> \|grep macsec Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net/mlx5e: Add MACsec offload SecY support	Lior Nahmanson	1	-0/+229
	Add offload support for MACsec SecY callbacks - add/update/delete. add_secy is called when need to create a new MACsec interface. upd_secy is called when source MAC address or tx SC was changed. del_secy is called when need to destroy the MACsec interface. Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net/mlx5e: Implement MACsec Rx data path using MACsec skb_metadata_dst	Lior Nahmanson	4	-3/+68
	MACsec driver need to distinguish to which offload device the MACsec is target to, in order to handle them correctly. This can be done by attaching a metadata_dst to a SKB with a SCI, when there is a match on MACsec rule. To achieve that, there is a map between fs_id to SCI, so for each RX SC, there is a unique fs_id allocated when creating RX SC. fs_id passed to device driver as metadata for packets that passed Rx MACsec offload to aid the driver to retrieve the matching SCI. Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net/mlx5e: Add MACsec RX steering rules	Lior Nahmanson	3	-94/+822
	Rx flow steering consists of two flow tables (FTs). The first FT (crypto table) have one default miss rule so non MACsec offloaded packets bypass the MACSec tables. All others flow table entries (FTEs) are divided to two equal groups size, both of them are for MACsec packets: The first group is for MACsec packets which contains SCI field in the SecTAG header. The second group is for MACsec packets which doesn't contain SCI, where need to match on the source MAC address (only if the SCI is built from default MACsec port). Destination MAC address, ethertype and some of SecTAG fields are also matched for both groups. In case of match, invoke decrypt action on the packet. For each MACsec Rx offloaded SA two rules are created: one with SCI and one without SCI. The second FT (check table) has two fixed rules: One rule is for verifying that the previous offload actions were finished successfully. In this case, need to decap the SecTAG header and forward the packet for further processing. Another default rule for dropping packets that failed in the previous decrypt actions. The MACsec FTs are created on demand when the first MACsec rule is added and destroyed when the last MACsec rule is deleted. Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>