wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2023-04-17	Makefile: use -z pack-relative-relocs	Fangrui Song	2	-3/+8
	Commit 27f2a4db76e8 ("Makefile: fix GDB warning with CONFIG_RELR") added --use-android-relr-tags to fix a GDB warning BFD: /android0/linux-next/vmlinux: unknown type [0x13] section `.relr.dyn' The GDB warning has been fixed in version 11.2. The DT_ANDROID_RELR tag was deprecated since DT_RELR was standardized. Thus, --use-android-relr-tags should be removed. While making the change, try -z pack-relative-relocs, which is supported since LLD 15. Keep supporting --pack-dyn-relocs=relr as well for older LLD versions. There is no indication of obsolescence for --pack-dyn-relocs=relr. As of today, GNU ld supports the latter option for x86 and powerpc64 ports and has no intention to support --pack-dyn-relocs=relr. In the absence of the glibc symbol version GLIBC_ABI_DT_RELR, --pack-dyn-relocs=relr and -z pack-relative-relocs are identical in ld.lld. GNU ld and newer versions of LLD report warnings (instead of errors) for unknown -z options. Only errors lead to non-zero exit codes. Therefore, we should test --pack-dyn-relocs=relr before testing -z pack-relative-relocs. Link: https://github.com/ClangBuiltLinux/linux/issues/1057 Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=a619b58721f0a03fd91c27670d3e4c2fb0d88f1e Signed-off-by: Fangrui Song <maskray@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	kbuild: clang: do not use CROSS_COMPILE for target triple	Masahiro Yamada	1	-6/+2
	The target triple is overridden by the user-supplied CROSS_COMPILE, but I do not see a good reason to support it. Users can use a new architecture without adding CLANG_TARGET_FLAGS_*, but that would be a rare case. Use the hard-coded and deterministic target triple all the time. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-04-17	kconfig: menuconfig: reorder functions to remove forward declarations	Masahiro Yamada	2	-295/+277
	Define helper functions before the callers so that forward declarations can go away. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	kconfig: menuconfig: remove unused M_EVENT macro	Masahiro Yamada	1	-11/+0
	This is not used anywhere. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	kconfig: menuconfig: remove OLD_NCURSES macro	Masahiro Yamada	3	-33/+0
	This code has been here for more than 20 years. The bug in the old days no longer matters. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	kbuild: builddeb: Eliminate debian/arch use	Bastian Germann	1	-1/+1
	In the builddeb context, the DEB_HOST_ARCH environment variable is set to the same value as debian/arch's content, so use the variable with dpkg-architecture. This is the last use of the debian/arch file during dpkg-buildpackage time. Signed-off-by: Bastian Germann <bage@linutronix.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	scripts/kallsyms: update the usage in the comment block	Masahiro Yamada	1	-1/+1
	Commit 010a0aad39fc ("kallsyms: Correctly sequence symbols when CONFIG_LTO_CLANG=y") added --lto-clang, and updated the usage() function, but not the comment. Update it in the same way. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-04-17	scripts/kallsyms: decrease expand_symbol() / cleanup_symbol_name() calls	Masahiro Yamada	1	-18/+15
	Currently, expand_symbol() is called many times to get the uncompressed symbol names for sorting, and also for adding comments. With the output order shuffled in the previous commit, the symbol data are now written in the following order: (1) kallsyms_num_syms (2) kallsyms_names <-- need compressed names (3) kallsyms_markers (4) kallsyms_token_table (5) kallsyms_token_index (6) kallsyms_addressed / kallsyms_offsets <-- need uncompressed names (for commenting) (7) kallsyms_relative_base (8) kallsyms_seq_of_names <-- need uncompressed names (for sorting) The compressed names are only needed by (2). Call expand_symbol() between (2) and (3) to restore the original symbol names. This requires just one expand_symbol() call for each symbol. Call cleanup_symbol_name() between (7) and (8) instead of during sorting. It is allowed to overwrite the ->sym field because (8) just outputs the index instead of the name of each symbol. Again, this requires just one cleanup_symbol_name() call for each symbol. This refactoring makes it ~30% faster. [Before] $ time scripts/kallsyms --all-symbols --absolute-percpu --base-relative \ .tmp_vmlinux.kallsyms2.syms >/dev/null real 0m1.027s user 0m1.010s sys 0m0.016s [After] $ time scripts/kallsyms --all-symbols --absolute-percpu --base-relative \ .tmp_vmlinux.kallsyms2.syms >/dev/null real 0m0.717s user 0m0.717s sys 0m0.000s Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	scripts/kallsyms: change the output order	Masahiro Yamada	1	-59/+59
	Currently, this tool outputs symbol data in the following order. (1) kallsyms_addressed / kallsyms_offsets (2) kallsyms_relative_base (3) kallsyms_num_syms (4) kallsyms_names (5) kallsyms_markers (6) kallsyms_seq_of_names (7) kallsyms_token_table (8) kallsyms_token_index This commit changes the order as follows: (1) kallsyms_num_syms (2) kallsyms_names (3) kallsyms_markers (4) kallsyms_token_table (5) kallsyms_token_index (6) kallsyms_addressed / kallsyms_offsets (7) kallsyms_relative_base (8) kallsyms_seq_of_names The motivation is to decrease the number of function calls to expand_symbol() and cleanup_symbol_name(). The compressed names are only required for writing 'kallsyms_names'. If you do this first, we can restore the original symbol names. You do not need to repeat the same operation over again. The actual refactoring will happen in the next commit. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	scripts/kallsyms: move compiler-generated symbol patterns to mksysmap	Masahiro Yamada	2	-60/+43
	scripts/kallsyms.c maintains compiler-generated symbols, but we end up with something similar in scripts/mksysmap to avoid the "Inconsistent kallsyms data" error. For example, commit c17a2538704f ("mksysmap: Fix the mismatch of 'L0' symbols in System.map"). They were separately maintained prior to commit 94ff2f63d6a3 ("kbuild: reuse mksysmap output for kallsyms"). Now that scripts/kallsyms.c parses the output of scripts/mksysmap, it makes more sense to collect all the ignored patterns to mksysmap. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-04-17	scripts/kallsyms: exclude symbols generated by itself dynamically	Masahiro Yamada	3	-20/+13
	Drop the symbols generated by scripts/kallsyms itself automatically instead of maintaining the symbol list manually. Pass the kallsyms object from the previous kallsyms step (if it exists) as the third parameter of scripts/mksysmap, which will weed out the generated symbols from the input to the next kallsyms step. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	scripts/mksysmap: use sed with in-line comments	Masahiro Yamada	1	-24/+37
	It is not feasible to insert comments in a multi-line shell command. Use sed, and move comments close to the code. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	scripts/mksysmap: remove comments described in nm(1)	Masahiro Yamada	1	-19/+1
	I do not think we need to repeat what is written in 'man nm'. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-17	scripts/kallsyms: remove redundant code for omitting U and N	Masahiro Yamada	1	-4/+1
	The symbol types 'U' and 'N' are already filtered out by the following line in scripts/mksysmap: -e ' [aNUw] ' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-04-17	kallsyms: expand symbol name into comment for debugging	Arnd Bergmann	1	-1/+2
	The assembler output of kallsyms.c is not meant for people to understand, and is generally not helpful when debugging "Inconsistent kallsyms data" warnings. I have previously struggled with these, but found it helpful to list which symbols changed between the first and second pass in the .tmp_vmlinux.kallsyms*.S files. As this file is preprocessed, it's possible to add a C-style multiline comment with the full type/name tuple. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-16	Linux 6.3-rc7	Linus Torvalds	1	-1/+1

2023-04-16	kbuild: do not create intermediate *.tar for tar packages	Masahiro Yamada	1	-17/+9
	Commit 05e96e96a315 ("kbuild: use git-archive for source package creation") split the compression as a separate step to factor out the common build rules. With the previous commit, we got back to the situation where source tarballs are compressed on-the-fly. There is no reason to keep the separate compression rules. Generate the comressed tar packages directly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2023-04-16	kbuild: do not create intermediate *.tar for source tarballs	Masahiro Yamada	1	-7/+17
	Since commit 05e96e96a315 ("kbuild: use git-archive for source package creation"), a source tarball is created in two steps; create .tar file then compress it. I split the compression as a separate rule because I just thought 'git archive' supported only gzip. For other compression algorithms, I could pipe the two commands: $ git archive HEAD \| xz > linux.tar.xz I read git-archive(1) carefully, and I realized GIT had provided a more elegant way: $ git -c tar.tar.xz.command=xz archive -o linux.tar.xz HEAD This commit uses 'tar.tar..command' configuration to specify the compression backend so we can compress a source tarball on-the-fly. GIT commit 767cf4579f0e ("archive: implement configurable tar filters") is more than a decade old, so it should be available on almost all build environments. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2023-04-16	kbuild: merge cmd_archive_linux and cmd_archive_perf	Masahiro Yamada	1	-10/+9
	The two commands, cmd_archive_linux and cmd_archive_perf, are similar. Merge them to make it easier to add more changes to the git-archive command. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2023-04-16	init/initramfs: Fix argument forwarding to panic() in panic_show_mem()	Benjamin Gray	1	-9/+2
	Forwarding variadic argument lists can't be done by passing a va_list to a function with signature foo(...) (as panic() has). It ends up interpreting the va_list itself as a single argument instead of iterating it. printf() happily accepts it of course, leading to corrupt output. Convert panic_show_mem() to a macro to allow forwarding the arguments. The function is trivial enough that it's easier than trying to introduce a vpanic() variant. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-16	initramfs: Check negative timestamp to prevent broken cpio archive	Benjamin Gray	1	-3/+9
	Similar to commit 4c9d410f32b3 ("initramfs: Check timestamp to prevent broken cpio archive"), except asserts that the timestamp is non-negative. This can happen when the KBUILD_BUILD_TIMESTAMP is a value before UNIX epoch, which may be set when making reproducible builds that don't want to look like they use a valid date. While support for dates before 1970 might not be supported, this is more about preventing undetected CPIO corruption. The printf's use a minimum length format specifier, and will happily make the field longer than 8 characters if they need to. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Tested-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-15	cifs: fix negotiate context parsing	David Disseldorp	1	-10/+31
	smb311_decode_neg_context() doesn't properly check against SMB packet boundaries prior to accessing individual negotiate context entries. This is due to the length check omitting the eight byte smb2_neg_context header, as well as incorrect decrementing of len_of_ctxts. Fixes: 5100d8a3fe03 ("SMB311: Improve checking of negotiate security contexts") Reported-by: Volker Lendecke <vl@samba.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-04-14	io_uring: complete request via task work in case of DEFER_TASKRUN	Ming Lei	1	-1/+1
	So far io_req_complete_post() only covers DEFER_TASKRUN by completing request via task work when the request is completed from IOWQ. However, uring command could be completed from any context, and if io uring is setup with DEFER_TASKRUN, the command is required to be completed from current context, otherwise wait on IORING_ENTER_GETEVENTS can't be wakeup, and may hang forever. The issue can be observed on removing ublk device, but turns out it is one generic issue for uring command & DEFER_TASKRUN, so solve it in io_uring core code. Fixes: e6aeb2721d3b ("io_uring: complete all requests in task context") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-block/b3fc9991-4c53-9218-a8cc-5b4dd3952108@kernel.dk/ Reported-by: Jens Axboe <axboe@kernel.dk> Cc: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-14	ALSA: hda/hdmi: disable KAE for Intel DG2	Kai Vehmanen	1	-1/+1
	Use of keep-alive (KAE) has resulted in loss of audio on some A750/770 cards as the transition from keep-alive to stream playback is not working as expected. As there is limited benefit of the new KAE mode on discrete cards, revert back to older silent-stream implementation on these systems. Cc: stable@vger.kernel.org Fixes: 15175a4f2bbb ("ALSA: hda/hdmi: add keep-alive support for ADL-P and DG2") Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8307 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20230413191153.3692049-1-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-04-14	nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD	Duy Truong	1	-0/+2
	Added a quirk to fix the TeamGroup T-Force Cardea Zero Z330 SSDs reporting duplicate NGUIDs. Signed-off-by: Duy Truong <dory@dory.moe> Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-04-13	riscv: No need to relocate the dtb as it lies in the fixmap region	Alexandre Ghiti	1	-19/+2
	We used to access the dtb via its linear mapping address but now that the dtb early mapping was moved in the fixmap region, we can keep using this address since it is present in swapper_pg_dir, and remove the dtb relocation. Note that the relocation was wrong anyway since early_memremap() is restricted to 256K whereas the maximum fdt size is 2MB. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230329081932.79831-4-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-13	riscv: Do not set initial_boot_params to the linear address of the dtb	Alexandre Ghiti	1	-4/+1
	early_init_dt_verify() is already called in parse_dtb() and since the dtb address does not change anymore (it is now in the fixmap region), no need to reset initial_boot_params by calling early_init_dt_verify() again. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230329081932.79831-3-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-13	riscv: Move early dtb mapping into the fixmap region	Alexandre Ghiti	5	-33/+51
	riscv establishes 2 virtual mappings: - early_pg_dir maps the kernel which allows to discover the system memory - swapper_pg_dir installs the final mapping (linear mapping included) We used to map the dtb in early_pg_dir using DTB_EARLY_BASE_VA, and this mapping was not carried over in swapper_pg_dir. It happens that early_init_fdt_scan_reserved_mem() must be called before swapper_pg_dir is setup otherwise we could allocate reserved memory defined in the dtb. And this function initializes reserved_mem variable with addresses that lie in the early_pg_dir dtb mapping: when those addresses are reused with swapper_pg_dir, this mapping does not exist and then we trap. The previous "fix" was incorrect as early_init_fdt_scan_reserved_mem() must be called before swapper_pg_dir is set up otherwise we could allocate in reserved memory defined in the dtb. So move the dtb mapping in the fixmap region which is established in early_pg_dir and handed over to swapper_pg_dir. Fixes: 922b0375fc93 ("riscv: Fix memblock reservation for device tree blob") Fixes: 8f3a2b4a96dc ("RISC-V: Move DT mapping outof fixmap") Fixes: 50e63dd8ed92 ("riscv: fix reserved memory setup") Reported-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/all/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230329081932.79831-2-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-13	ksmbd: avoid out of bounds access in decode_preauth_ctxt()	David Disseldorp	1	-9/+14
	Confirm that the accessed pneg_ctxt->HashAlgorithms address sits within the SMB request boundary; deassemble_neg_contexts() only checks that the eight byte smb2_neg_context header + (client controlled) DataLength are within the packet boundary, which is insufficient. Checking for sizeof(struct smb2_preauth_neg_context) is overkill given that the type currently assumes SMB311_SALT_SIZE bytes of trailing Salt. Signed-off-by: David Disseldorp <ddiss@suse.de> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-04-13	selftests/bpf: Adjust bpf_xdp_metadata_rx_hash for new arg	Jesper Dangaard Brouer	6	-12/+23
	Update BPF selftests to use the new RSS type argument for kfunc bpf_xdp_metadata_rx_hash. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132894068.340624.8914711185697163690.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	mlx4: bpf_xdp_metadata_rx_hash add xdp rss hash type	Jesper Dangaard Brouer	1	-1/+18
	Update API for bpf_xdp_metadata_rx_hash() with arg for xdp rss hash type via matching individual Completion Queue Entry (CQE) status bits. Fixes: ab46182d0dcb ("net/mlx4_en: Support RX XDP metadata") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132893562.340624.12779118462402031248.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	veth: bpf_xdp_metadata_rx_hash add xdp rss hash type	Jesper Dangaard Brouer	1	-2/+5
	Update API for bpf_xdp_metadata_rx_hash() with arg for xdp rss hash type. The veth driver currently only support XDP-hints based on SKB code path. The SKB have lost information about the RSS hash type, by compressing the information down to a single bitfield skb->l4_hash, that only knows if this was a L4 hash value. In preparation for veth, the xdp_rss_hash_type have an L4 indication bit that allow us to return a meaningful L4 indication when working with SKB based packets. Fixes: 306531f0249f ("veth: Support RX XDP metadata") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132893055.340624.16209448340644513469.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	mlx5: bpf_xdp_metadata_rx_hash add xdp rss hash type	Jesper Dangaard Brouer	3	-3/+73
	Update API for bpf_xdp_metadata_rx_hash() with arg for xdp rss hash type via mapping table. The mlx5 hardware can also identify and RSS hash IPSEC. This indicate hash includes SPI (Security Parameters Index) as part of IPSEC hash. Extend xdp core enum xdp_rss_hash_type with IPSEC hash type. Fixes: bc8d405b1ba9 ("net/mlx5e: Support RX XDP metadata") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132892548.340624.11185734579430124869.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	xdp: rss hash types representation	Jesper Dangaard Brouer	7	-6/+64
	The RSS hash type specifies what portion of packet data NIC hardware used when calculating RSS hash value. The RSS types are focused on Internet traffic protocols at OSI layers L3 and L4. L2 (e.g. ARP) often get hash value zero and no RSS type. For L3 focused on IPv4 vs. IPv6, and L4 primarily TCP vs UDP, but some hardware supports SCTP. Hardware RSS types are differently encoded for each hardware NIC. Most hardware represent RSS hash type as a number. Determining L3 vs L4 often requires a mapping table as there often isn't a pattern or sorting according to ISO layer. The patch introduce a XDP RSS hash type (enum xdp_rss_hash_type) that contains both BITs for the L3/L4 types, and combinations to be used by drivers for their mapping tables. The enum xdp_rss_type_bits get exposed to BPF via BTF, and it is up to the BPF-programmer to match using these defines. This proposal change the kfunc API bpf_xdp_metadata_rx_hash() adding a pointer value argument for provide the RSS hash type. Change signature for all xmo_rx_hash calls in drivers to make it compile. The RSS type implementations for each driver comes as separate patches. Fixes: 3d76a4d3d4e5 ("bpf: XDP metadata RX kfuncs") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132892042.340624.582563003880565460.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	selftests/bpf: xdp_hw_metadata remove bpf_printk and add counters	Jesper Dangaard Brouer	2	-16/+24
	The tool xdp_hw_metadata can be used by driver developers implementing XDP-hints metadata kfuncs. Remove all bpf_printk calls, as the tool already transfers all the XDP-hints related information via metadata area to AF_XDP userspace process. Add counters for providing remaining information about failure and skipped packet events. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132891533.340624.7313781245316405141.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-13	fbcon: set_con2fb_map needs to set con2fb_map!	Daniel Vetter	1	-1/+2
	I got really badly confused in d443d9386472 ("fbcon: move more common code into fb_open()") because we set the con2fb_map before the failure points, which didn't look good. But in trying to fix that I moved the assignment into the wrong path - we need to do it for _all_ vc we take over, not just the first one (which additionally requires the call to con2fb_acquire_newinfo). I've figured this out because of a KASAN bug report, where the fbcon_registered_fb and fbcon_display arrays went out of sync in fbcon_mode_deleted() because the con2fb_map pointed at the old fb_info, but the modes and everything was updated for the new one. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Helge Deller <deller@gmx.de> Tested-by: Xingyuan Mo <hdthky0@gmail.com> Fixes: d443d9386472 ("fbcon: move more common code into fb_open()") Reported-by: Xingyuan Mo <hdthky0@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Xingyuan Mo <hdthky0@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v5.19+
2023-04-13	fbcon: Fix error paths in set_con2fb_map	Daniel Vetter	1	-9/+8
	This is a regressoin introduced in b07db3958485 ("fbcon: Ditch error handling for con2fb_release_oldinfo"). I failed to realize what the if (!err) checks. The mentioned commit was dropping the con2fb_release_oldinfo() return value but the if (!err) was also checking whether the con2fb_acquire_newinfo() function call above failed or not. Fix this with an early return statement. Note that there's still a difference compared to the orginal state of the code, the below lines are now also skipped on error: if (!search_fb_in_map(info_idx)) info_idx = newidx; These are only needed when we've actually thrown out an old fb_info from the console mappings, which only happens later on. Also move the fbcon_add_cursor_work() call into the same if block, it's all protected by console_lock so doesn't matter when we set up the blinking cursor delayed work anyway. This further simplifies the control flow and allows us to ditch the found local variable. v2: Clarify commit message (Javier) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Helge Deller <deller@gmx.de> Tested-by: Xingyuan Mo <hdthky0@gmail.com> Fixes: b07db3958485 ("fbcon: Ditch error handling for con2fb_release_oldinfo") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Xingyuan Mo <hdthky0@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v5.19+
2023-04-13	skbuff: Fix a race between coalescing and releasing SKBs	Liang Chen	1	-8/+8
	Commit 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment recycling") allowed coalescing to proceed with non page pool page and page pool page when @from is cloned, i.e. to->pp_recycle --> false from->pp_recycle --> true skb_cloned(from) --> true However, it actually requires skb_cloned(@from) to hold true until coalescing finishes in this situation. If the other cloned SKB is released while the merging is in process, from_shinfo->nr_frags will be set to 0 toward the end of the function, causing the increment of frag page _refcount to be unexpectedly skipped resulting in inconsistent reference counts. Later when SKB(@to) is released, it frees the page directly even though the page pool page is still in use, leading to use-after-free or double-free errors. So it should be prohibited. The double-free error message below prompted us to investigate: BUG: Bad page state in process swapper/1 pfn:0e0d1 page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x2 pfn:0xe0d1 flags: 0xfffffc0000000(node=0\|zone=1\|lastcpupid=0x1fffff) raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000 raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000 page dumped because: nonzero _refcount CPU: 1 PID: 0 Comm: swapper/1 Tainted: G E 6.2.0+ Call Trace: <IRQ> dump_stack_lvl+0x32/0x50 bad_page+0x69/0xf0 free_pcp_prepare+0x260/0x2f0 free_unref_page+0x20/0x1c0 skb_release_data+0x10b/0x1a0 napi_consume_skb+0x56/0x150 net_rx_action+0xf0/0x350 ? __napi_schedule+0x79/0x90 __do_softirq+0xc8/0x2b1 __irq_exit_rcu+0xb9/0xf0 common_interrupt+0x82/0xa0 </IRQ> <TASK> asm_common_interrupt+0x22/0x40 RIP: 0010:default_idle+0xb/0x20 Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool") Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230413090353.14448-1-liangchen.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	net: macb: fix a memory corruption in extended buffer descriptor mode	Roman Gushchin	1	-0/+4
	For quite some time we were chasing a bug which looked like a sudden permanent failure of networking and mmc on some of our devices. The bug was very sensitive to any software changes and even more to any kernel debug options. Finally we got a setup where the problem was reproducible with CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma: [ 16.992082] ------------[ cut here ]------------ [ 16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes] [ 17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900 [ 17.018977] Modules linked in: xxxxx [ 17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 #28 [ 17.045345] Hardware name: xxxxx [ 17.049528] pstate: 60000005 (nZCv daif -PAN -UAO) [ 17.054322] pc : check_unmap+0x6a0/0x900 [ 17.058243] lr : check_unmap+0x6a0/0x900 [ 17.062163] sp : ffffffc010003c40 [ 17.065470] x29: ffffffc010003c40 x28: 000000004000c03c [ 17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800 [ 17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8 [ 17.081407] x23: 0000000000000000 x22: ffffffc010a08750 [ 17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000 [ 17.092032] x19: 0000000875e3e244 x18: 0000000000000010 [ 17.097343] x17: 0000000000000000 x16: 0000000000000000 [ 17.102647] x15: ffffff8879e4a988 x14: 0720072007200720 [ 17.107959] x13: 0720072007200720 x12: 0720072007200720 [ 17.113261] x11: 0720072007200720 x10: 0720072007200720 [ 17.118565] x9 : 0720072007200720 x8 : 000000000000022d [ 17.123869] x7 : 0000000000000015 x6 : 0000000000000098 [ 17.129173] x5 : 0000000000000000 x4 : 0000000000000000 [ 17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370 [ 17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000 [ 17.145082] Call trace: [ 17.147524] check_unmap+0x6a0/0x900 [ 17.151091] debug_dma_unmap_page+0x88/0x90 [ 17.155266] gem_rx+0x114/0x2f0 [ 17.158396] macb_poll+0x58/0x100 [ 17.161705] net_rx_action+0x118/0x400 [ 17.165445] __do_softirq+0x138/0x36c [ 17.169100] irq_exit+0x98/0xc0 [ 17.172234] __handle_domain_irq+0x64/0xc0 [ 17.176320] gic_handle_irq+0x5c/0xc0 [ 17.179974] el1_irq+0xb8/0x140 [ 17.183109] xiic_process+0x5c/0xe30 [ 17.186677] irq_thread_fn+0x28/0x90 [ 17.190244] irq_thread+0x208/0x2a0 [ 17.193724] kthread+0x130/0x140 [ 17.196945] ret_from_fork+0x10/0x20 [ 17.200510] ---[ end trace 7240980785f81d6f ]--- [ 237.021490] ------------[ cut here ]------------ [ 237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b [ 237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240 [ 237.041802] Modules linked in: xxxxx [ 237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.4.0 #28 [ 237.068941] Hardware name: xxxxx [ 237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 237.077900] pc : add_dma_entry+0x214/0x240 [ 237.081986] lr : add_dma_entry+0x214/0x240 [ 237.086072] sp : ffffffc010003c30 [ 237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00 [ 237.094683] x27: 0000000000000180 x26: ffffff8878e387c0 [ 237.099987] x25: 0000000000000002 x24: 0000000000000000 [ 237.105290] x23: 000000000000003b x22: ffffffc010a0fa00 [ 237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600 [ 237.115897] x19: 00000000ffffffef x18: 0000000000000010 [ 237.121201] x17: 0000000000000000 x16: 0000000000000000 [ 237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720 [ 237.131807] x13: 0720072007200720 x12: 0720072007200720 [ 237.137111] x11: 0720072007200720 x10: 0720072007200720 [ 237.142415] x9 : 0720072007200720 x8 : 0000000000000259 [ 237.147718] x7 : 0000000000000001 x6 : 0000000000000000 [ 237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001 [ 237.158325] x3 : 0000000000000006 x2 : 0000000000000007 [ 237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000 [ 237.168932] Call trace: [ 237.171373] add_dma_entry+0x214/0x240 [ 237.175115] debug_dma_map_page+0xf8/0x120 [ 237.179203] gem_rx_refill+0x190/0x280 [ 237.182942] gem_rx+0x224/0x2f0 [ 237.186075] macb_poll+0x58/0x100 [ 237.189384] net_rx_action+0x118/0x400 [ 237.193125] __do_softirq+0x138/0x36c [ 237.196780] irq_exit+0x98/0xc0 [ 237.199914] __handle_domain_irq+0x64/0xc0 [ 237.204000] gic_handle_irq+0x5c/0xc0 [ 237.207654] el1_irq+0xb8/0x140 [ 237.210789] arch_cpu_idle+0x40/0x200 [ 237.214444] default_idle_call+0x18/0x30 [ 237.218359] do_idle+0x200/0x280 [ 237.221578] cpu_startup_entry+0x20/0x30 [ 237.225493] rest_init+0xe4/0xf0 [ 237.228713] arch_call_rest_init+0xc/0x14 [ 237.232714] start_kernel+0x47c/0x4a8 [ 237.236367] ---[ end trace 7240980785f81d70 ]--- Lars was fast to find an explanation: according to the datasheet bit 2 of the rx buffer descriptor entry has a different meaning in the extended mode: Address [2] of beginning of buffer, or in extended buffer descriptor mode (DMA configuration register [28] = 1), indicates a valid timestamp in the buffer descriptor entry. The macb driver didn't mask this bit while getting an address and it eventually caused a memory corruption and a dma failure. The problem is resolved by explicitly clearing the problematic bit if hw timestamping is used. Fixes: 7b4296148066 ("net: macb: Add support for PTP timestamps in DMA descriptors") Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Co-developed-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	selftests: add the missing CONFIG_IP_SCTP in net config	Xin Long	1	-0/+1
	The selftest sctp_vrf needs CONFIG_IP_SCTP set in config when building the kernel, so add it. Fixes: a61bd7b9fef3 ("selftests: add a selftest for sctp vrf") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/61dddebc4d2dd98fe7fb145e24d4b2430e42b572.1681312386.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	udp6: fix potential access to stale information	Eric Dumazet	1	-3/+5
	lena wang reported an issue caused by udpv6_sendmsg() mangling msg->msg_name and msg->msg_namelen, which are later read from ____sys_sendmsg() : /* * If this is sendmmsg() and sending to current destination address was * successful, remember it. */ if (used_address && err >= 0) { used_address->name_len = msg_sys->msg_namelen; if (msg_sys->msg_name) memcpy(&used_address->name, msg_sys->msg_name, used_address->name_len); } udpv6_sendmsg() wants to pretend the remote address family is AF_INET in order to call udp_sendmsg(). A fix would be to modify the address in-place, instead of using a local variable, but this could have other side effects. Instead, restore initial values before we return from udpv6_sendmsg(). Fixes: c71d8ebe7a44 ("net: Fix security_socket_sendmsg() bypass problem.") Reported-by: lena wang <lena.wang@mediatek.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20230412130308.1202254-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	selftests: openvswitch: adjust datapath NL message declaration	Aaron Conole	1	-1/+1
	The netlink message for creating a new datapath takes an array of ports for the PID creation. This shouldn't cause much issue but correct it for future cases where we need to do decode of datapath information that could include the per-cpu PID map. Fixes: 25f16c873fb1 ("selftests: add openvswitch selftest suite") Signed-off-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/20230412115828.3991806-1-aconole@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	selftests: mptcp: userspace pm: uniform verify events	Matthieu Baerts	1	-0/+2
	Simply adding a "sleep" before checking something is usually not a good idea because the time that has been picked can not be enough or too much. The best is to wait for events with a timeout. In this selftest, 'sleep 0.5' is used more than 40 times. It is always used before calling a 'verify_' function except for this verify_listener_events which has been added later. At the end, using all these 'sleep 0.5' seems to work: the slow CIs don't complain so far. Also because it doesn't take too much time, we can just add two more 'sleep 0.5' to uniform what is done before calling a 'verify_' function. For the same reasons, we can also delay a bigger refactoring to replace all these 'sleep 0.5' by functions waiting for events instead of waiting for a fix time and hope for the best. Fixes: 6c73008aa301 ("selftests: mptcp: listener test for userspace PM") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	mptcp: fix NULL pointer dereference on fastopen early fallback	Paolo Abeni	1	-2/+9
	In case of early fallback to TCP, subflow_syn_recv_sock() deletes the subflow context before returning the newly allocated sock to the caller. The fastopen path does not cope with the above unconditionally dereferencing the subflow context. Fixes: 36b122baf6a8 ("mptcp: add subflow_v(4,6)_send_synack()") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	mptcp: stricter state check in mptcp_worker	Paolo Abeni	1	-1/+1
	As reported by Christoph, the mptcp protocol can run the worker when the relevant msk socket is in an unexpected state: connect() // incoming reset + fastclose // the mptcp worker is scheduled mptcp_disconnect() // msk is now CLOSED listen() mptcp_worker() Leading to the following splat: divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 21 Comm: kworker/1:0 Not tainted 6.3.0-rc1-gde5e8fd0123c #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:__tcp_select_window+0x22c/0x4b0 net/ipv4/tcp_output.c:3018 RSP: 0018:ffffc900000b3c98 EFLAGS: 00010293 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8214ce97 RDI: 0000000000000004 RBP: 000000000000ffd7 R08: 0000000000000004 R09: 0000000000010000 R10: 000000000000ffd7 R11: ffff888005afa148 R12: 000000000000ffd7 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000405270 CR3: 000000003011e006 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_select_window net/ipv4/tcp_output.c:262 [inline] __tcp_transmit_skb+0x356/0x1280 net/ipv4/tcp_output.c:1345 tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline] tcp_send_active_reset+0x13e/0x320 net/ipv4/tcp_output.c:3459 mptcp_check_fastclose net/mptcp/protocol.c:2530 [inline] mptcp_worker+0x6c7/0x800 net/mptcp/protocol.c:2705 process_one_work+0x3bd/0x950 kernel/workqueue.c:2390 worker_thread+0x5b/0x610 kernel/workqueue.c:2537 kthread+0x138/0x170 kernel/kthread.c:376 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308 </TASK> This change addresses the issue explicitly checking for bad states before running the mptcp worker. Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/374 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	mptcp: use mptcp_schedule_work instead of open-coding it	Paolo Abeni	2	-15/+8
	Beyond reducing code duplication this also avoids scheduling the mptcp_worker on a closed socket on some edge scenarios. The addressed issue is actually older than the blamed commit below, but this fix needs it as a pre-requisite. Fixes: ba8f48f7a4d7 ("mptcp: introduce mptcp_schedule_work") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13	i2c: ocores: generate stop condition after timeout in polling mode	Gregor Herburger	1	-16/+19
	In polling mode, no stop condition is generated after a timeout. This causes SCL to remain low and thereby block the bus. If this happens during a transfer it can cause slaves to misinterpret the subsequent transfer and return wrong values. To solve this, pass the ETIMEDOUT error up from ocores_process_polling() instead of setting STATE_ERROR directly. The caller is adjusted to call ocores_process_timeout() on error both in polling and in IRQ mode, which will set STATE_ERROR and generate a stop condition. Fixes: 69c8c0c0efa8 ("i2c: ocores: add polling interface") Signed-off-by: Gregor Herburger <gregor.herburger@tq-group.com> Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Acked-by: Peter Korsgaard <peter@korsgaard.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Federico Vaga <federico.vaga@cern.ch> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-04-13	RDMA/core: Fix GID entry ref leak when create_ah fails	Saravanan Vajravel	1	-0/+2
	If AH create request fails, release sgid_attr to avoid GID entry referrence leak reported while releasing GID table Fixes: 1a1f460ff151 ("RDMA: Hold the sgid_attr inside the struct ib_ah/qp") Link: https://lore.kernel.org/r/20230401063424.342204-1-saravanan.vajravel@broadcom.com Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-13	x86/rtc: Remove __init for runtime functions	Matija Glavinic Pecotic	1	-2/+2
	set_rtc_noop(), get_rtc_noop() are after booting, therefore their __init annotation is wrong. A crash was observed on an x86 platform where CMOS RTC is unused and disabled via device tree. set_rtc_noop() was invoked from ntp: sync_hw_clock(), although CONFIG_RTC_SYSTOHC=n, however sync_cmos_clock() doesn't honour that. Workqueue: events_power_efficient sync_hw_clock RIP: 0010:set_rtc_noop Call Trace: update_persistent_clock64 sync_hw_clock Fix this by dropping the __init annotation from set/get_rtc_noop(). Fixes: c311ed6183f4 ("x86/init: Allow DT configured systems to disable RTC at boot time") Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/59f7ceb1-446b-1d3d-0bc8-1f0ee94b1e18@nokia.com
2023-04-13	net: enetc: workaround for unresponsive pMAC after receiving express traffic	Vladimir Oltean	1	-0/+16
	I have observed an issue where the RX direction of the LS1028A ENETC pMAC seems unresponsive. The minimal procedure to reproduce the issue is: 1. Connect ENETC port 0 with a loopback RJ45 cable to one of the Felix switch ports (0). 2. Bring the ports up (MAC Merge layer is not enabled on either end). 3. Send a large quantity of unidirectional (express) traffic from Felix to ENETC. I tried altering frame size and frame count, and it doesn't appear to be specific to either of them, but rather, to the quantity of octets received. Lowering the frame count, the minimum quantity of packets to reproduce relatively consistently seems to be around 37000 frames at 1514 octets (w/o FCS) each. 4. Using ethtool --set-mm, enable the pMAC in the Felix and in the ENETC ports, in both RX and TX directions, and with verification on both ends. 5. Wait for verification to complete on both sides. 6. Configure a traffic class as preemptible on both ends. 7. Send some packets again. The issue is at step 5, where the verification process of ENETC ends (meaning that Felix responds with an SMD-R and ENETC sees the response), but the verification process of Felix never ends (it remains VERIFYING). If step 3 is skipped or if ENETC receives less traffic than approximately that threshold, the test runs all the way through (verification succeeds on both ends, preemptible traffic passes fine). If, between step 4 and 5, the step below is also introduced: 4.1. Disable and re-enable PM0_COMMAND_CONFIG bit RX_EN then again, the sequence of steps runs all the way through, and verification succeeds, even if there was the previous RX traffic injected into ENETC. Traffic sent by the ENETC port prior to enabling the MAC Merge layer does not seem to influence the verification result, only received traffic does. The LS1028A manual does not mention any relationship between PM0_COMMAND_CONFIG and MMCSR, and the hardware people don't seem to know for now either. The bit that is toggled to work around the issue is also toggled by enetc_mac_enable(), called from phylink's mac_link_down() and mac_link_up() methods - which is how the workaround was found: verification would work after a link down/up. Fixes: c7b9e8086902 ("net: enetc: add support for MAC Merge layer") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20230411192645.1896048-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>