linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-10-06	libbpf: Add API that copies all BTF types from one BTF object to another	Andrii Nakryiko	3	-2/+135
	Add a bulk copying api, btf__add_btf(), that speeds up and simplifies appending entire contents of one BTF object to another one, taking care of copying BTF type data, adjusting resulting BTF type IDs according to their new locations in the destination BTF object, as well as copying and deduplicating all the referenced strings and updating all the string offsets in new BTF types as appropriate. This API is intended to be used from tools that are generating and otherwise manipulating BTFs generically, such as pahole. In pahole's case, this API is useful for speeding up parallelized BTF encoding, as it allows pahole to offload all the intricacies of BTF type copying to libbpf and handle the parallelization aspects of the process. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org
2021-10-06	bpf, x64: Save bytes for DIV by reducing reg copies	Jie Meng	2	-29/+89
	Instead of unconditionally performing push/pop on %rax/%rdx in case of division/modulo, we can save a few bytes in case of destination register being either BPF r0 (%rax) or r3 (%rdx) since the result is written in there anyway. Also, we do not need to copy the source to %r11 unless the source is either %rax, %rdx or an immediate. For example, before the patch: 22: push %rax 23: push %rdx 24: mov %rsi,%r11 27: xor %edx,%edx 29: div %r11 2c: mov %rax,%r11 2f: pop %rdx 30: pop %rax 31: mov %r11,%rax After: 22: push %rdx 23: xor %edx,%edx 25: div %rsi 28: pop %rdx Signed-off-by: Jie Meng <jmeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211002035626.2041910-1-jmeng@fb.com
2021-10-05	bpf: Avoid retpoline for bpf_for_each_map_elem	Andrey Ignatov	1	-1/+10
	Similarly to 09772d92cd5a ("bpf: avoid retpoline for lookup/update/delete calls on maps") and 84430d4232c3 ("bpf, verifier: avoid retpoline for map push/pop/peek operation") avoid indirect call while calling bpf_for_each_map_elem. Before (a program fragment): ; if (rules_map) { 142: (15) if r4 == 0x0 goto pc+8 143: (bf) r3 = r10 ; bpf_for_each_map_elem(rules_map, process_each_rule, &ctx, 0); 144: (07) r3 += -24 145: (bf) r1 = r4 146: (18) r2 = subprog[+5] 148: (b7) r4 = 0 149: (85) call bpf_for_each_map_elem#143680 <-- indirect call via helper After (same program fragment): ; if (rules_map) { 142: (15) if r4 == 0x0 goto pc+8 143: (bf) r3 = r10 ; bpf_for_each_map_elem(rules_map, process_each_rule, &ctx, 0); 144: (07) r3 += -24 145: (bf) r1 = r4 146: (18) r2 = subprog[+5] 148: (b7) r4 = 0 149: (85) call bpf_for_each_array_elem#170336 <-- direct call On a benchmark that calls bpf_for_each_map_elem() once and does many other things (mostly checking fields in skb) with CONFIG_RETPOLINE=y it makes program faster. Before: ============================================================================ Benchmark.cpp time/iter iters/s ============================================================================ IngressMatchByRemoteEndpoint 80.78ns 12.38M IngressMatchByRemoteIP 80.66ns 12.40M IngressMatchByRemotePort 80.87ns 12.37M After: ============================================================================ Benchmark.cpp time/iter iters/s ============================================================================ IngressMatchByRemoteEndpoint 73.49ns 13.61M IngressMatchByRemoteIP 71.48ns 13.99M IngressMatchByRemotePort 70.39ns 14.21M Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211006001838.75607-1-rdna@fb.com
2021-10-05	bpf: selftests: Add selftests for module kfunc support	Kumar Kartikeya Dwivedi	9	-31/+132
	This adds selftests that tests the success and failure path for modules kfuncs (in presence of invalid kfunc calls) for both libbpf and gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can add module BTF ID set from bpf_testmod. This also introduces a couple of test cases to verifier selftests for validating whether we get an error or not depending on if invalid kfunc call remains after elimination of unreachable instructions. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com
2021-10-05	libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations	Kumar Kartikeya Dwivedi	3	-58/+280
	This change updates the BPF syscall loader to relocate BTF_KIND_FUNC relocations, with support for weak kfunc relocations. The general idea is to move map_fds to loader map, and also use the data for storing kfunc BTF fds. Since both reuse the fd_array parameter, they need to be kept together. For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc, we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more chances of being <= INT16_MAX than treating data map as a sparse array and adding fd as needed. When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse array model, so that as long as it does remain <= INT16_MAX, we pass an index relative to the start of fd_array. We store all ksyms in an array where we try to avoid calling the bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was already stored. This also speeds up the loading process compared to emitting calls in all cases, in later tests. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com
2021-10-05	libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0	Kumar Kartikeya Dwivedi	1	-11/+13
	Preserve these calls as it allows verifier to succeed in loading the program if they are determined to be unreachable after dead code elimination during program load. If not, the verifier will fail at runtime. This is done for ext->is_weak symbols similar to the case for variable ksyms. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com
2021-10-05	libbpf: Support kernel module function calls	Kumar Kartikeya Dwivedi	4	-24/+72
	This patch adds libbpf support for kernel module function call support. The fd_array parameter is used during BPF program load to pass module BTFs referenced by the program. insn->off is set to index into this array, but starts from 1, because insn->off as 0 is reserved for btf_vmlinux. We try to use existing insn->off for a module, since the kernel limits the maximum distinct module BTFs for kfuncs to 256, and also because index must never exceed the maximum allowed value that can fit in insn->off (INT16_MAX). In the future, if kernel interprets signed offset as unsigned for kfunc calls, this limit can be increased to UINT16_MAX. Also introduce a btf__find_by_name_kind_own helper to start searching from module BTF's start id when we know that the BTF ID is not present in vmlinux BTF (in find_ksym_btf_id). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com
2021-10-05	bpf: Enable TCP congestion control kfunc from modules	Kumar Kartikeya Dwivedi	7	-34/+85
	This commit moves BTF ID lookup into the newly added registration helper, in a way that the bbr, cubic, and dctcp implementation set up their sets in the bpf_tcp_ca kfunc_btf_set list, while the ones not dependent on modules are looked up from the wrapper function. This lifts the restriction for them to be compiled as built in objects, and can be loaded as modules if required. Also modify Makefile.modfinal to call resolve_btfids for each module. Note that since kernel kfunc_ids never overlap with module kfunc_ids, we only match the owner for module btf id sets. See following commits for background on use of: CONFIG_X86 ifdef: 569c484f9995 (bpf: Limit static tcp-cc functions in the .BTF_ids list to x86) CONFIG_DYNAMIC_FTRACE ifdef: 7aae231ac93b (bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-6-memxor@gmail.com
2021-10-05	tools: Allow specifying base BTF file in resolve_btfids	Kumar Kartikeya Dwivedi	2	-10/+20
	This commit allows specifying the base BTF for resolving btf id lists/sets during link time in the resolve_btfids tool. The base BTF is set to NULL if no path is passed. This allows resolving BTF ids for module kernel objects. Also, drop the --no-fail option, as it is only used in case .BTF_ids section is not present, instead make no-fail the default mode. The long option name is same as that of pahole. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-5-memxor@gmail.com
2021-10-05	bpf: btf: Introduce helpers for dynamic BTF set registration	Kumar Kartikeya Dwivedi	3	-0/+89
	This adds helpers for registering btf_id_set from modules and the bpf_check_mod_kfunc_call callback that can be used to look them up. With in kernel sets, the way this is supposed to work is, in kernel callback looks up within the in-kernel kfunc whitelist, and then defers to the dynamic BTF set lookup if it doesn't find the BTF id. If there is no in-kernel BTF id set, this callback can be used directly. Also fix includes for btf.h and bpfptr.h so that they can included in isolation. This is in preparation for their usage in tcp_bbr, tcp_cubic and tcp_dctcp modules in the next patch. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-4-memxor@gmail.com
2021-10-05	bpf: Be conservative while processing invalid kfunc calls	Kumar Kartikeya Dwivedi	1	-0/+18
	This patch also modifies the BPF verifier to only return error for invalid kfunc calls specially marked by userspace (with insn->imm == 0, insn->off == 0) after the verifier has eliminated dead instructions. This can be handled in the fixup stage, and skip processing during add and check stages. If such an invalid call is dropped, the fixup stage will not encounter insn->imm as 0, otherwise it bails out and returns an error. This will be exposed as weak ksym support in libbpf in later patches. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-3-memxor@gmail.com
2021-10-05	bpf: Introduce BPF support for kernel module function calls	Kumar Kartikeya Dwivedi	6	-32/+188
	This change adds support on the kernel side to allow for BPF programs to call kernel module functions. Userspace will prepare an array of module BTF fds that is passed in during BPF_PROG_LOAD using fd_array parameter. In the kernel, the module BTFs are placed in the auxilliary struct for bpf_prog, and loaded as needed. The verifier then uses insn->off to index into the fd_array. insn->off 0 is reserved for vmlinux BTF (for backwards compat), so userspace must use an fd_array index > 0 for module kfunc support. kfunc_btf_tab is sorted based on offset in an array, and each offset corresponds to one descriptor, with a max limit up to 256 such module BTFs. We also change existing kfunc_tab to distinguish each element based on imm, off pair as each such call will now be distinct. Another change is to check_kfunc_call callback, which now include a struct module * pointer, this is to be used in later patch such that the kfunc_id and module pointer are matched for dynamically registered BTF sets from loadable modules, so that same kfunc_id in two modules doesn't lead to check_kfunc_call succeeding. For the duration of the check_kfunc_call, the reference to struct module exists, as it returns the pointer stored in kfunc_btf_tab. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-2-memxor@gmail.com
2021-10-05	net: usb: use eth_hw_addr_set() for dev->addr_len cases	Jakub Kicinski	5	-5/+5
	Convert usb drivers from memcpy(... dev->addr_len) to eth_hw_addr_set(): @@ expression dev, np; @@ - memcpy(dev->dev_addr, np, dev->addr_len) + eth_hw_addr_set(dev, np) Manually checked these are either usbnet or pure etherdevs. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	ethernet: use eth_hw_addr_set() for dev->addr_len cases	Jakub Kicinski	76	-93/+90
	Convert all Ethernet drivers from memcpy(... dev->addr_len) to eth_hw_addr_set(): @@ expression dev, np; @@ - memcpy(dev->dev_addr, np, dev->addr_len) + eth_hw_addr_set(dev, np) In theory addr_len may not be ETH_ALEN, but we don't expect non-Ethernet devices to live under this directory, and only the following cases of setting addr_len exist: - cxgb4 for mgmt device, and the drivers which set it to ETH_ALEN: s2io, mlx4, vxge. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	mlx4: constify args for const dev_addr	Jakub Kicinski	3	-5/+7
	netdev->dev_addr will become const soon. Make sure all functions which pass it around mark appropriate args as const. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	mlx4: remove custom dev_addr clearing	Jakub Kicinski	1	-8/+6
	mlx4_en_u64_to_mac() takes the dev->dev_addr pointer and writes to it byte by byte. It also clears the two bytes _after_ ETH_ALEN which seems unnecessary. dev->addr_len is set to ETH_ALEN just before the call. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	mlx4: replace mlx4_u64_to_mac() with u64_to_ether_addr()	Jakub Kicinski	2	-11/+1
	mlx4_u64_to_mac() predates the common helper but doesn't make the argument constant. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	mlx4: replace mlx4_mac_to_u64() with ether_addr_to_u64()	Jakub Kicinski	6	-24/+12
	mlx4_mac_to_u64() predates and opencodes ether_addr_to_u64(). It doesn't make the argument constant so it'll be problematic when dev->dev_addr becomes a const. Convert to the generic helper. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	netlink: remove netlink_broadcast_filtered	Florian Westphal	2	-25/+2
	No users in tree since commit a3498436b3a0 ("netns: restrict uevents"), so remove this functionality. Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	net: bgmac: support MDIO described in DT	Rafał Miłecki	1	-1/+5
	Check ethernet controller DT node for "mdio" subnode and use it with of_mdiobus_register() when present. That allows specifying MDIO and its PHY devices in a standard DT based way. This is required for BCM53573 SoC support. That family is sometimes called Northstar (by marketing?) but is quite different from it. It uses different CPU(s) and many different hw blocks. One of shared blocks in BCM53573 is Ethernet controller. Switch however is not SRAB accessible (as it Northstar) but is MDIO attached. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	net: bgmac: improve handling PHY	Rafał Miłecki	1	-12/+21
	1. Use info from DT if available It allows describing for example a fixed link. It's more accurate than just guessing there may be one (depending on a chipset). 2. Verify PHY ID before trying to connect PHY PHY addr 0x1e (30) is special in Broadcom routers and means a switch connected as MDIO devices instead of a real PHY. Don't try connecting to it. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	ethernet: ehea: add missing cast	Jakub Kicinski	1	-1/+1
	We need to cast the pointer, unlike memcpy() eth_hw_addr_set() does not take void . The driver already casts &port->mac_addr to u8 in other places. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: a96d317fb1a3 ("ethernet: use eth_hw_addr_set()") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05	sparc: Fix typo.	David S. Miller	1	-1/+1
	Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	net/mlx5: Enable single IRQ for PCI Function	Shay Drory	2	-8/+19
	Prior to this patch the driver requires two IRQs to function properly, one required IRQ for control and at least one required IRQ for IO. This requirement can be relaxed to one as the driver now allows sharing of IRQs, so control and IO EQs can share the same irq. This is needed for high scale amount of VFs. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5: Shift control IRQ to the last index	Shay Drory	5	-11/+13
	Control IRQ is the first IRQ vector. This complicates handling of completion irqs as we need to offset them by one. in the next patch, there are scenarios where completion and control EQs will share the same irq. for example: functions with single IRQ. To ease such scenarios, we shift control IRQ to the end of the irq array. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5: Bridge, pop VLAN on egress table miss	Vlad Buslov	1	-2/+126
	Create lowest priority flow group in egress table with single rule that matches on special reg_c1 value that is set on ingress VLAN push with single action that pops VLAN. The flow destination is skip table that is used to skip any further processing of packet in FDB bridge priority. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5: Bridge, mark reg_c1 when pushing VLAN	Vlad Buslov	3	-1/+49
	On ingress VLAN push also assign value 0x7FE to reg_c1 tunnel id+opts bits (tunnel id 0, which is not a valid tunnel id, and option 0x7FE which was reserved by one of previous patches in the series). In following patch the reg value is matched on egress miss to restore the packet to its original state by removing the VLAN before passing it to the software data path. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5: Bridge, extract VLAN pop code to dedicated functions	Vlad Buslov	1	-12/+22
	Following patches in series need to pop VLAN when packet misses on egress. To reuse existing bridge VLAN pop handling code, extract it to dedicated helpers mlx5_esw_bridge_pkt_reformat_vlan_pop_supported() and mlx5_esw_bridge_pkt_reformat_vlan_pop_create(). Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5: Bridge, refactor eswitch instance usage	Vlad Buslov	1	-11/+14
	Several functions in bridge.c excessively obtain pointer to parent eswitch instance by dereferencing br_offloads->esw on every usage and following patches in this series add even more usages of eswitch. Introduce local variable 'esw' and use it instead. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Support accept action	Vlad Buslov	4	-4/+17
	Support TC generic 'accept' action in mlx5 by introducing MLX5_ESW_ATTR_FLAG_ACCEPT attribute flag. Flag has similar semantics to existing MLX5_ESW_ATTR_FLAG_SLOW_PATH flag, however, dedicated flag is required because existing 'slow path' flag can be flipped by tunneling subsystem when neighbor changes state. Introduce new helper function mlx5_esw_attr_flags_skip() to check whether attribute flags for 'slow path' or 'accept' action are set and use it in eswitch code instead of direct bit manipulation. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Specify out ifindex when looking up encap route	Chris Mi	3	-0/+18
	There is a use case that the local and remote VTEPs are in the same host. Currently, the out ifindex is not specified when looking up the encap route for offloads. So in this case, a local route is returned and the route dev is lo. Actual tunnel interface can be created with a parameter "dev" [1], which specifies the physical device to use for tunnel endpoint communication. Pass this parameter to driver when looking up encap route for offloads. So that a unicast route will be returned. [1] ip link add name vxlan1 type vxlan id 100 dev enp4s0f0 remote 1.1.1.1 dstport 4789 Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Reserve a value from TC tunnel options mapping	Vlad Buslov	1	-2/+4
	Reserve one more value from TC tunnel options range to be used by bridge offload in following patches. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Move parse fdb check into actions_match_supported_fdb()	Roi Dayan	1	-9/+11
	The parse fdb/nic actions funcs parse the actions and then call actions_match_supported() for final check. Move related check in parse_tc_fdb_actions() into actions_match_supported_fdb() for more organized code. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Split actions_match_supported() into a sub function	Roi Dayan	1	-26/+39
	There will probably be more checks, some for nic flows, some for fdb flows and some are shared checks. Split it for fdb and nic to avoid the function getting too big. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Move mod hdr allocation to a single place	Roi Dayan	1	-36/+51
	Move mod hdr allocation chunk from parse_tc_fdb_actions() and parse_tc_nic_actions() to a shared function. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: TC, Refactor sample offload error flow	Roi Dayan	1	-16/+5
	Refactor sample unoffload to be symmetric to sample offload. Use the existing del_post_rule() to release the post rule. Also mlx5e_tc_sample_unoffload() should not return post_rule which is NULL when post actions are supported. Sample offload works with this NULL because many places of the code use IS_ERR() instead of IS_ERR_OR_NULL() to check rule is valid and when rule is detected as sample offload the code is not using the rule. Let's be persistent and avoid returning NULL anyway and return the pre rule, like in CT case, which is not NULL. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Add TX max rate support for MQPRIO channel mode	Tariq Toukan	4	-6/+203
	Add driver max_rate support for the MQPRIO bw_rlimit shaper in channel mode. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net/mlx5e: Specify SQ stats struct for mlx5e_open_txqsq()	Tariq Toukan	3	-8/+9
	Let the caller of mlx5e_open_txqsq() directly pass the SQ stats structure pointer. This replaces logic involving the qos_queue_group_id parameter, and helps generalizing its role in the next patch. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04	net: ethernet: use phylink_set_10g_modes()	Russell King (Oracle)	3	-18/+3
	Update three drivers to use the new phylink_set_10g_modes() helper: Cadence macb, Freescale DPAA2 and Marvell PP2. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	net: phylink: add phylink_set_10g_modes() helper	Russell King (Oracle)	2	-0/+12
	Add a helper for setting 10Gigabit modes, so we have one central place that sets all appropriate 10G modes for a driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	net: ipv6: fix use after free of struct seg6_pernet_data	MichelleJin	1	-1/+1
	sdata->tun_src should be freed before sdata is freed because sdata->tun_src is allocated after sdata allocation. So, kfree(sdata) and kfree(rcu_dereference_raw(sdata->tun_src)) are changed code order. Fixes: f04ed7d277e8 ("net: ipv6: check return value of rhashtable_init") Signed-off-by: MichelleJin <shjy180909@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: fix ll2 establishment during load of RDMA driver	Manish Chopra	1	-5/+44
	If stats ID of a LL2 (light l2) queue exceeds than the total amount of statistics counters, it may cause system crash upon enabling RDMA on all PFs. This patch makes sure that the stats ID of the LL2 queue doesn't exceed the max allowed value. Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Update the TCP active termination 2 MSL timer ("TIME_WAIT")	Prabhakar Kushwaha	3	-1/+3
	Initialize 2 MSL timeout value used for the TCP TIME_WAIT state to non-zero default. This patch also removes magic number from qedi/qedi_main.c. Reviewed-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nikolay Assa <nassa@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Update TCP silly-window-syndrome timeout for iwarp, scsi	Nikolay Assa	3	-0/+4
	Update TCP silly-window-syndrome timeout, for the cases where initiator's small TCP window size prevents FW from transmitting packets on the connection. Timeout causes FW to retransmit window probes if needed, preventing I/O stall if initiator ignores first window probe. Reviewed-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nikolay Assa <nassa@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Update debug related changes	Prabhakar Kushwaha	8	-505/+1031
	qed_debug features are updated to support FW version 8.59.1.0 along with few enhancements. - Removal of _BB_K2 from register defines. - Add new condition cond14. - Add dump of new area sw-platform, epoch, iscsi_task_pages, fcoe_task_pages, roce_task_pages and eth_task_pages. - Introduced new functions qed_dbg_phy_size(). - Update in qed_mcp_nvm_rd_cmd() declaration. - Allow QED to control init/exit at pf level. - Dump partial "ILT-dump" if buffer size is not sufficient. This patch also fixes the existing checkpatch warnings and few important checks. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Add '_GTT' suffix to the IRO RAM macros	Prabhakar Kushwaha	10	-111/+143
	GTT (Global translation table) is a fast-access window in the BAR into the register space, which only maps certain register addresses. This change helps enforce that only those addresses which are indeed mapped by the GTT are being accessed through it. Adding the '_GTT' suffix to the IRO FW memory (“RAM”) macros that access GTT-able region in FW memories (“RAM”) and use GTT macros to access RAM BAR from drivers. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Update FW init functions to support FW 8.59.1.0	Omkar Kulkarni	6	-200/+372
	The qed_init_fw_func.c and qed_init_ops.c updated to support FW version 8.59.1.0. - Support 16-bit VPORT WFQ (weighted fair queueing) weights. - Support WFQ (weighted fair queueing) weight per VPORT + TC. - Support allocation of Tx PQs(physical queues) per PF,VF. - Modify Global RL (rate limiter) upper bound configuration. - Update FW operation functions. - Update iro_arr[] array. This patch also fixes the existing checkpatch warnings and few important checks. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Use enum as per FW 8.59.1.0 in qed_iro_hsi.h	Prabhakar Kushwaha	1	-257/+390
	qed_iro_hsi.h contains HSI changes related to storm memories access. Existing code is based on hard-coded index. Use enum as defined for FW HSI 8.59.1.0, instead of hard-coded index. This patch also removes unnecessary header file inclusion. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Update qed_hsi.h for fw 8.59.1.0	Prabhakar Kushwaha	12	-308/+1590
	The qed_hsi.h has been updated to support new FW version 8.59.1.0 with changes. - Updates FW HSI (Hardware Software interface) structures. - Addition/update in function declaration and defines as per HSI. - Add generic infrastructure for FW error reporting as part of common event queue handling. - Move malicious VF error reporting to FW error reporting infrastructure. - Move consolidation queue initialization from FW context to ramrod message. qed_hsi.h header file changes lead to change in many files to ensure compilation. This patch also fixes the existing checkpatch warnings and few important checks. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04	qed: Update qed_mfw_hsi.h for FW ver 8.59.1.0	Prabhakar Kushwaha	3	-263/+800
	The qed_mfw_hsi.h contains HSI (Hardware Software Interface) changes related to management firmware. It has been updated to support new FW version 8.59.1.0 with below changes. - New defines for VF bitmap. - fec_mode and extended_speed defines updated in struct eth_phy_cfg. - Updated structutres lldp_system_tlvs_buffer_s, public_global, public_port, public_func, drv_union_data, public_drv_mb with all dependent new structures. - Updates in NVM related structures and defines. - Msg defines are added in enum drv_msg_code and fw_msg_code. - Updated/added new defines. This patch also fixes the existing checkpatch warnings and few important checks. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>