linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-12-10	net/sched: Remove egdev mechanism	Oz Shlomo	3	-297/+1
	The egdev mechanism was replaced by the TC indirect block notifications platform. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Cc: John Hurley <john.hurley@netronome.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Add GRE protocol offloading	Oz Shlomo	2	-1/+89
	Add HW offloading support for TC flower filters configured on gretap/ip6gretap net devices. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net: Add netif_is_gretap()/netif_is_ip6gretap()	Oz Shlomo	4	-16/+13
	Changed the is_gretap_dev and is_ip6gretap_dev logic from structure comparison to string comparison of the rtnl_link_ops kind field. This approach aligns with the current identification methods and function names of vxlan and geneve network devices. Convert mlxsw to use these helpers and use them in downstream mlx5 patch. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Move TC tunnel offloading code to separate source file	Oz Shlomo	6	-496/+548
	Move tunnel offloading related code to a separate source file for better code maintainability. Code refactoring with no functional change. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Branch according to classified tunnel type	Oz Shlomo	2	-48/+114
	Currently the tunnel offloading encap/decap methods assumes that VXLAN is the sole tunneling protocol. Lay the infrastructure for supporting multiple tunneling protocols by branching according to the tunnel net device kind. Encap filters tunnel type is determined according to the egress/mirred net device. Decap filters classify the tunnel type according to the filter's ingress net device kind. Distinguish between the tunnel type as defined by the SW model and the FW reformat type that specifies the HW operation being made. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Refactor VXLAN tunnel decap offloading code	Oz Shlomo	1	-39/+51
	Separates the vxlan header match handling from the matching on the general fields of ipv4/6 tunnels, thus allowing the common IP tunnel match code to branch in down stream patch, to multiple IP tunnels. This patch doesn't add any functionality. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Refactor VXLAN tunnel encap offloading code	Oz Shlomo	1	-119/+70
	Separates the vxlan header encap logic from the general ipv4/6 encapsulation methods, thus allowing the common IP encap/decap code to branch in downstream patch to multiple IP tunnels. Code refactoring with no functional change. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Replace egdev with indirect block notifications	Oz Shlomo	2	-25/+36
	Use TC indirect block notifications to offload filters that are configured on higher level device interfaces (e.g. tunnel devices). This mechanism replaces the current egdev implementation. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Propagate the filter's net device to mlx5e structures	Oz Shlomo	1	-7/+19
	Propagate the filter's net_device parameter to the tc flower parsed attributes structure so that it can later be used in tunnel decap offloading sequences. Pre-step for replacing egdev logic with the indirect block notification mechanism. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Provide the TC filter netdev as parameter to flower callbacks	Oz Shlomo	4	-12/+18
	Currently the driver controls flower filters that are installed on its devices. However, with the introduction of the indirect block notifications platform the driver may receive control events for filters that are installed on higher level net devices (e.g. tunnel devices). Therefore, the driver filter control API will not be able to implicitly assume the filter's net device. Explicitly specify the filter's net device, no functional change Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Support TC indirect block notifications for eswitch uplink reprs	Oz Shlomo	4	-0/+210
	Towards using this mechanism as the means to offload tunnel decap rules set on SW tunnel devices instead of egdev, add the supporting structures and functions. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5e: Store eswitch uplink representor state on a dedicated struct	Oz Shlomo	3	-5/+12
	Currently only a single field in the representor private structure is relevant for uplink representors. As a pre-step to allow adding additional uplink representor fields, introduce uplink representor private structure. This is prepration step towards replacing egdev logic with the indirect block notification mechanism. This patch doesn't change any functionality. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	tools/bpf: rename _info_cnt to nr__info	Yonghong Song	3	-56/+56
	Rename all occurances of _info_cnt field access to nr__info in tools directory. The local variables finfo_cnt, linfo_cnt and jited_linfo_cnt in function do_dump() of tools/bpf/bpftool/prog.c are also changed to nr_finfo, nr_linfo and nr_jited_linfo to keep naming convention consistent. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-10	tools/bpf: sync kernel uapi bpf.h to tools directory	Yonghong Song	1	-3/+3
	Sync kernel uapi bpf.h "_info_cnt => nr__info" changes to tools directory. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-10	bpf: rename _info_cnt to nr__info in bpf_prog_info	Yonghong Song	2	-22/+22
	In uapi bpf.h, currently we have the following fields in the struct bpf_prog_info: __u32 func_info_cnt; __u32 line_info_cnt; __u32 jited_line_info_cnt; The above field names "func_info_cnt" and "line_info_cnt" also appear in union bpf_attr for program loading. The original intention is to keep the names the same between bpf_prog_info and bpf_attr so it will imply what we returned to user space will be the same as what the user space passed to the kernel. Such a naming convention in bpf_prog_info is not consistent with other fields like: __u32 nr_jited_ksyms; __u32 nr_jited_func_lens; This patch made this adjustment so in bpf_prog_info newly introduced _info_cnt becomes nr__info. Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-10	bpf: clean up bpf_prog_get_info_by_fd()	Song Liu	1	-2/+2
	info.nr_jited_ksyms and info.nr_jited_func_lens cannot be 0 in these two statements, so we don't need to check them. Signed-off-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-10	net/mlx5: Remove the get protocol device interface entry	Or Gerlitz	3	-32/+0
	This isn't used anywhere across the mlx5 driver stack, remove it. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: Support extended destination format in flow steering command	Eli Britstein	2	-7/+75
	Update the flow steering command formatting according to the extended destination API. Note that the FW dictates that multi destination FTEs that involve at least one encap must use the extended destination format, while single destination ones must use the legacy format. Using extended destination format requires FW support. Check for its capabilities and return error if not supported. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: E-Switch, Change vhca id valid bool field to bit flag	Eli Britstein	3	-5/+12
	Change the driver flow destination struct to use bit flags with the vhca id valid being the 1st one. The flags field is more extendable and will be used in downstream patch. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: Introduce extended destination fields	Eli Britstein	1	-3/+16
	Extended destinations provide the ability to configure different encapsulation properties per destination on a single FTE. This is needed for use-cases such as remote mirroring over tunneled networks. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: Revise gre and nvgre key formats	Oz Shlomo	3	-8/+17
	GRE RFC defines a 32 bit key field. NVGRE RFC splits the 32 bit key field to 24 bit VSID (gre_key_h) and 8 bit flow entropy (gre_key_l). Define the two key parsing alternatives in a union, thus enabling both access methods. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: Add monitor commands layout and event data	Eyal Davidovich	5	-1/+96
	Will be used in downstream patch to monitor counter changes by the HCA and report it to the driver by an event. The driver will update its counters cached data accordingly. Signed-off-by: Eyal Davidovich <eyald@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: Add support for plugged-disabled cable status in PME	Mikhael Goikhman	2	-0/+3
	Support a new hardware module status in port module events: - module_status=0x4 (Cable plugged, but disabled) Signed-off-by: Mikhael Goikhman <migo@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: Add support for PCIe power slot exceeded error in PME	Mikhael Goikhman	2	-0/+3
	Support a new hardware error type in port module events: - error_type=0xc (PCIe system power slot exceeded) Signed-off-by: Mikhael Goikhman <migo@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	net/mlx5: Rework handling of port module events	Mikhael Goikhman	3	-45/+65
	Add explicit HW defined error values. For simplicity, keep counters for all statuses starting from 0, although currently status=0 is not used. Additionally, when HW signals an unexpected cable status, it is reported now rather than ignored. And status counter is now updated on errors. Signed-off-by: Mikhael Goikhman <migo@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10	bpf: bpftool: Fix newline and p_err issue	Martin KaFai Lau	3	-3/+3
	This patch fixes a few newline issues and also replaces p_err with p_info in prog.c Fixes: b053b439b72a ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump") Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-10	tcp: handle EOR and FIN conditions the same in tcp_tso_should_defer()	Eric Dumazet	1	-5/+2
	In commit f9bfe4e6a9d0 ("tcp: lack of available data can also cause TSO defer") we moved the test in tcp_tso_should_defer() for packets with a FIN flag, and we mentioned that the same would be done later for EOR flag. Both flags should be handled at the same time, after all other heuristics have been considered. They both mean that no more bytes can be added to this skb by an application. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	net: dsa: ksz: Add reset GPIO handling	Marek Vasut	2	-0/+19
	Add code to handle optional reset GPIO in the KSZ switch driver. The switch has a reset GPIO line which can be controlled by the CPU, so make sure it is configured correctly in such setups. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Cc: Woojung Huh <woojung.huh@microchip.com> Cc: David S. Miller <davem@davemloft.net> Cc: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	net: dsa: ksz: Add optional reset GPIO to Microchip KSZ switch binding	Marek Vasut	1	-0/+4
	Add optional reset GPIO, as such a signal is available on the KSZ switches. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	bonding: convert to DEFINE_SHOW_ATTRIBUTE	Yangtao Li	1	-13/+1
	Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	fjes: convert to DEFINE_SHOW_ATTRIBUTE	Yangtao Li	1	-13/+1
	Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	net: xenbus: convert to DEFINE_SHOW_ATTRIBUTE	Yangtao Li	1	-15/+3
	Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	ieee802154: at86rf230: convert to DEFINE_SHOW_ATTRIBUTE	Yangtao Li	1	-12/+1
	Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	ipvlan: Remove a useless comparison	YueHaibing	1	-1/+1
	Fix following gcc warning: drivers/net/ipvlan/ipvlan_main.c:543:12: warning: comparison is always false due to limited range of data type [-Wtype-limits] 'mode' is a u16 variable, IPVLAN_MODE_L2 is zero, the comparison is always false Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	net: hns3: fix spelling mistake "offser" -> "offset"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a msg string, fix this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10	bpf: relax verifier restriction on BPF_MOV \| BPF_ALU	Jiong Wang	2	-4/+25
	Currently, the destination register is marked as unknown for 32-bit sub-register move (BPF_MOV \| BPF_ALU) whenever the source register type is SCALAR_VALUE. This is too conservative that some valid cases will be rejected. Especially, this may turn a constant scalar value into unknown value that could break some assumptions of verifier. For example, test_l4lb_noinline.c has the following C code: struct real_definition *dst 1: if (!get_packet_dst(&dst, &pckt, vip_info, is_ipv6)) 2: return TC_ACT_SHOT; 3: 4: if (dst->flags & F_IPV6) { get_packet_dst is responsible for initializing "dst" into valid pointer and return true (1), otherwise return false (0). The compiled instruction sequence using alu32 will be: 412: (54) (u32) r7 &= (u32) 1 413: (bc) (u32) r0 = (u32) r7 414: (95) exit insn 413, a BPF_MOV \| BPF_ALU, however will turn r0 into unknown value even r7 contains SCALAR_VALUE 1. This causes trouble when verifier is walking the code path that hasn't initialized "dst" inside get_packet_dst, for which case 0 is returned and we would then expect verifier concluding line 1 in the above C code pass the "if" check, therefore would skip fall through path starting at line 4. Now, because r0 returned from callee has became unknown value, so verifier won't skip analyzing path starting at line 4 and "dst->flags" requires dereferencing the pointer "dst" which actually hasn't be initialized for this path. This patch relaxed the code marking sub-register move destination. For a SCALAR_VALUE, it is safe to just copy the value from source then truncate it into 32-bit. A unit test also included to demonstrate this issue. This test will fail before this patch. This relaxation could let verifier skipping more paths for conditional comparison against immediate. It also let verifier recording a more accurate/strict value for one register at one state, if this state end up with going through exit without rejection and it is used for state comparison later, then it is possible an inaccurate/permissive value is better. So the real impact on verifier processed insn number is complex. But in all, without this fix, valid program could be rejected. >From real benchmarking on kernel selftests and Cilium bpf tests, there is no impact on processed instruction number when tests ares compiled with default compilation options. There is slightly improvements when they are compiled with -mattr=+alu32 after this patch. Also, test_xdp_noinline/-mattr=+alu32 now passed verification. It is rejected before this fix. Insn processed before/after this patch: default -mattr=+alu32 Kernel selftest === test_xdp.o 371/371 369/369 test_l4lb.o 6345/6345 5623/5623 test_xdp_noinline.o 2971/2971 rejected/2727 test_tcp_estates.o 429/429 430/430 Cilium bpf === bpf_lb-DLB_L3.o: 2085/2085 1685/1687 bpf_lb-DLB_L4.o: 2287/2287 1986/1982 bpf_lb-DUNKNOWN.o: 690/690 622/622 bpf_lxc.o: 95033/95033 N/A bpf_netdev.o: 7245/7245 N/A bpf_overlay.o: 2898/2898 3085/2947 NOTE: - bpf_lxc.o and bpf_netdev.o compiled by -mattr=+alu32 are rejected by verifier due to another issue inside verifier on supporting alu32 binary. - Each cilium bpf program could generate several processed insn number, above number is sum of them. v1->v2: - Restrict the change on SCALAR_VALUE. - Update benchmark numbers on Cilium bpf tests. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	net/mlx5: Move flow counters data structures from flow steering header	Saeed Mahameed	2	-23/+23
	After the following flow counters API refactoring: ("net/mlx5: Use flow counter IDs and not the wrapping cache object") flow counters private data structures mlx5_fc_cache and mlx5_fc are redundantly exposed in fs_core.h, they have nothing to do with flow steering core and they are private to fs_counter.c, this patch moves them to where they belong and reduces their exposure in the driver. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-09	IB/mlx5: Use helper to get CQE opcode	Tariq Toukan	1	-4/+4
	Use the new helper that extracts the opcode from a CQE (completion queue entry) structure. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-09	net/mlx5: Use helper to get CQE opcode	Tariq Toukan	4	-7/+12
	Introduce and use a helper that extracts the opcode from a CQE (completion queue entry) structure. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-09	net/mlx5: When fetching CQEs return CQE instead of void pointer	Daniel Jurgens	1	-1/+1
	The function is only used to retrieve CQEs, use the proper type as the return value. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-09	Linux 4.20-rc6	Linus Torvalds	1	-1/+1

2018-12-09	media: bpf: add bpf function to report mouse movement	Sean Young	7	-23/+109
	Some IR remotes have a directional pad or other pointer-like thing that can be used as a mouse. Make it possible to decode these types of IR protocols in BPF. Cc: netdev@vger.kernel.org Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	bpf: libbpf: bpftool: Print bpf_line_info during prog dump	Martin KaFai Lau	12	-25/+516
	This patch adds print bpf_line_info function in 'prog dump jitted' and 'prog dump xlated': [root@arch-fb-vm1 bpf]# ~/devshare/fb-kernel/linux/tools/bpf/bpftool/bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv [...] int test_long_fname_2(struct dummy_tracepoint_args * arg): bpf_prog_44a040bf25481309_test_long_fname_2: ; static int test_long_fname_2(struct dummy_tracepoint_args arg) 0: push %rbp 1: mov %rsp,%rbp 4: sub $0x30,%rsp b: sub $0x28,%rbp f: mov %rbx,0x0(%rbp) 13: mov %r13,0x8(%rbp) 17: mov %r14,0x10(%rbp) 1b: mov %r15,0x18(%rbp) 1f: xor %eax,%eax 21: mov %rax,0x20(%rbp) 25: xor %esi,%esi ; int key = 0; 27: mov %esi,-0x4(%rbp) ; if (!arg->sock) 2a: mov 0x8(%rdi),%rdi ; if (!arg->sock) 2e: cmp $0x0,%rdi 32: je 0x0000000000000070 34: mov %rbp,%rsi ; counts = bpf_map_lookup_elem(&btf_map, &key); 37: add $0xfffffffffffffffc,%rsi 3b: movabs $0xffff8881139d7480,%rdi 45: add $0x110,%rdi 4c: mov 0x0(%rsi),%eax 4f: cmp $0x4,%rax 53: jae 0x000000000000005e 55: shl $0x3,%rax 59: add %rdi,%rax 5c: jmp 0x0000000000000060 5e: xor %eax,%eax ; if (!counts) 60: cmp $0x0,%rax 64: je 0x0000000000000070 ; counts->v6++; 66: mov 0x4(%rax),%edi 69: add $0x1,%rdi 6d: mov %edi,0x4(%rax) 70: mov 0x0(%rbp),%rbx 74: mov 0x8(%rbp),%r13 78: mov 0x10(%rbp),%r14 7c: mov 0x18(%rbp),%r15 80: add $0x28,%rbp 84: leaveq 85: retq [...] With linum: [root@arch-fb-vm1 bpf]# ~/devshare/fb-kernel/linux/tools/bpf/bpftool/bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv linum int _dummy_tracepoint(struct dummy_tracepoint_args arg): bpf_prog_b07ccb89267cf242__dummy_tracepoint: ; return test_long_fname_1(arg); [file:/data/users/kafai/fb-kernel/linux/tools/testing/selftests/bpf/test_btf_haskv.c line_num:54 line_col:9] 0: push %rbp 1: mov %rsp,%rbp 4: sub $0x28,%rsp b: sub $0x28,%rbp f: mov %rbx,0x0(%rbp) 13: mov %r13,0x8(%rbp) 17: mov %r14,0x10(%rbp) 1b: mov %r15,0x18(%rbp) 1f: xor %eax,%eax 21: mov %rax,0x20(%rbp) 25: callq 0x000000000000851e ; return test_long_fname_1(arg); [file:/data/users/kafai/fb-kernel/linux/tools/testing/selftests/bpf/test_btf_haskv.c line_num:54 line_col:2] 2a: xor %eax,%eax 2c: mov 0x0(%rbp),%rbx 30: mov 0x8(%rbp),%r13 34: mov 0x10(%rbp),%r14 38: mov 0x18(%rbp),%r15 3c: add $0x28,%rbp 40: leaveq 41: retq [...] Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	bpf: libbpf: Add btf_line_info support to libbpf	Martin KaFai Lau	5	-89/+239
	This patch adds bpf_line_info support to libbpf: 1) Parsing the line_info sec from ".BTF.ext" 2) Relocating the line_info. If the main prog _info relocation fails, it will ignore the remaining subprog line_info and continue. If the subprog _info relocation fails, it will bail out. 3) BPF_PROG_LOAD a prog with line_info Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	bpf: libbpf: Refactor and bug fix on the bpf_func_info loading logic	Martin KaFai Lau	4	-177/+177
	This patch refactor and fix a bug in the libbpf's bpf_func_info loading logic. The bug fix and refactoring are targeting the same commit 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") which is in the bpf-next branch. 1) In bpf_load_program_xattr(), it should retry when errno == E2BIG regardless of log_buf and log_buf_sz. This patch fixes it. 2) btf_ext__reloc_init() and btf_ext__reloc() are essentially the same except btf_ext__reloc_init() always has insns_cnt == 0. Hence, btf_ext__reloc_init() is removed. btf_ext__reloc() is also renamed to btf_ext__reloc_func_info() to get ready for the line_info support in the next patch. 3) Consolidate func_info section logic from "btf_ext_parse_hdr()", "btf_ext_validate_func_info()" and "btf_ext__new()" to a new function "btf_ext_copy_func_info()" such that similar logic can be reused by the later libbpf's line_info patch. 4) The next line_info patch will store line_info_cnt instead of line_info_len in the bpf_program because the kernel is taking line_info_cnt also. It will save a few "len" to "cnt" conversions and will also save some function args. Hence, this patch also makes bpf_program to store func_info_cnt instead of func_info_len. 5) btf_ext depends on btf. e.g. the func_info's type_id in ".BTF.ext" is not useful when ".BTF" is absent. This patch only init the obj->btf_ext pointer after it has successfully init the obj->btf pointer. This can avoid always checking "obj->btf && obj->btf_ext" together for accessing ".BTF.ext". Checking "obj->btf_ext" alone will do. 6) Move "struct btf_sec_func_info" from btf.h to btf.c. There is no external usage outside btf.c. Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	bpf: Add unit tests for bpf_line_info	Martin KaFai Lau	1	-17/+580
	Add unit tests for bpf_line_info for both BPF_PROG_LOAD and BPF_OBJ_GET_INFO_BY_FD. jit enabled: [root@arch-fb-vm1 bpf]# ./test_btf -k 0 BTF prog info raw test[5] (line_info (No subprog)): OK BTF prog info raw test[6] (line_info (No subprog. insn_off >= prog->len)): OK BTF prog info raw test[7] (line_info (No subprog. zero tailing line_info): OK BTF prog info raw test[8] (line_info (No subprog. nonzero tailing line_info)): OK BTF prog info raw test[9] (line_info (subprog)): OK BTF prog info raw test[10] (line_info (subprog + func_info)): OK BTF prog info raw test[11] (line_info (subprog. missing 1st func line info)): OK BTF prog info raw test[12] (line_info (subprog. missing 2nd func line info)): OK BTF prog info raw test[13] (line_info (subprog. unordered insn offset)): OK jit disabled: BTF prog info raw test[5] (line_info (No subprog)): not jited. skipping jited_line_info check. OK BTF prog info raw test[6] (line_info (No subprog. insn_off >= prog->len)): OK BTF prog info raw test[7] (line_info (No subprog. zero tailing line_info): not jited. skipping jited_line_info check. OK BTF prog info raw test[8] (line_info (No subprog. nonzero tailing line_info)): OK BTF prog info raw test[9] (line_info (subprog)): not jited. skipping jited_line_info check. OK BTF prog info raw test[10] (line_info (subprog + func_info)): not jited. skipping jited_line_info check. OK BTF prog info raw test[11] (line_info (subprog. missing 1st func line info)): OK BTF prog info raw test[12] (line_info (subprog. missing 2nd func line info)): OK BTF prog info raw test[13] (line_info (subprog. unordered insn offset)): OK Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	bpf: Refactor and bug fix in test_func_type in test_btf.c	Martin KaFai Lau	1	-86/+125
	1) bpf_load_program_xattr() is absorbing the EBIG error which makes testing this case impossible. It is replaced with a direct syscall(__NR_bpf, BPF_PROG_LOAD,...). 2) The test_func_type() is renamed to test_info_raw() to prepare for the new line_info test in the next patch. 3) The bpf_obj_get_info_by_fd() testing for func_info is refactored to test_get_finfo(). A new test_get_linfo() will be added in the next patch for testing line_info purpose. 4) The test->func_info_cnt is checked instead of a static value "2". 5) Remove unnecessary "\n" in error message. 6) Adding back info_raw_test_num to the cmd arg such that a specific test case can be tested, like all other existing tests. 7) Fix a bug in handling expected_prog_load_failure. A test could pass even if prog_fd != -1 while expected_prog_load_failure is true. 8) The min rec_size check should be < 8 instead of < 4. Fixes: 4798c4ba3ba9 ("tools/bpf: extends test_btf to test load/retrieve func_type info") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	bpf: tools: Sync uapi bpf.h	Martin KaFai Lau	1	-0/+19
	Sync uapi bpf.h to tools/include/uapi/linux for the new bpf_line_info. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	bpf: Add bpf_line_info support	Martin KaFai Lau	10	-33/+419
	This patch adds bpf_line_info support. It accepts an array of bpf_line_info objects during BPF_PROG_LOAD. The "line_info", "line_info_cnt" and "line_info_rec_size" are added to the "union bpf_attr". The "line_info_rec_size" makes bpf_line_info extensible in the future. The new "check_btf_line()" ensures the userspace line_info is valid for the kernel to use. When the verifier is translating/patching the bpf_prog (through "bpf_patch_insn_single()"), the line_infos' insn_off is also adjusted by the newly added "bpf_adj_linfo()". If the bpf_prog is jited, this patch also provides the jited addrs (in aux->jited_linfo) for the corresponding line_info.insn_off. "bpf_prog_fill_jited_linfo()" is added to fill the aux->jited_linfo. It is currently called by the x86 jit. Other jits can also use "bpf_prog_fill_jited_linfo()" and it will be done in the followup patches. In the future, if it deemed necessary, a particular jit could also provide its own "bpf_prog_fill_jited_linfo()" implementation. A few "line_info" fields are added to the bpf_prog_info such that the user can get the xlated line_info back (i.e. the line_info with its insn_off reflecting the translated prog). The jited_line_info is available if the prog is jited. It is an array of __u64. If the prog is not jited, jited_line_info_cnt is 0. The verifier's verbose log with line_info will be done in a follow up patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09	net/sched: cls_flower: Reject duplicated rules also under skip_sw	Or Gerlitz	1	-13/+10
	Currently, duplicated rules are rejected only for skip_hw or "none", hence allowing users to push duplicates into HW for no reason. Use the flower tables to protect for that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reported-by: Chris Mi <chrism@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>