aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-01-25bpf: add selftest for tcpbpfLawrence Brakmo8-6/+480
Added a selftest for tcpbpf (sock_ops) that checks that the appropriate callbacks occured and that it can access tcp_sock fields and that their values are correct. Run with command: ./test_tcpbpf_user Adding the flag "-d" will show why it did not pass. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add BPF_SOCK_OPS_STATE_CBLawrence Brakmo2-1/+52
Adds support for calling sock_ops BPF program when there is a TCP state change. Two arguments are used; one for the old state and another for the new state. There is a new enum in include/uapi/linux/bpf.h that exports the TCP states that prepends BPF_ to the current TCP state names. If it is ever necessary to change the internal TCP state values (other than adding more to the end), then it will become necessary to convert from the internal TCP state value to the BPF value before calling the BPF sock_ops function. There are a set of compile checks added in tcp.c to detect if the internal and BPF values differ so we can make the necessary fixes. New op: BPF_SOCK_OPS_STATE_CB. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add BPF_SOCK_OPS_RETRANS_CBLawrence Brakmo2-1/+12
Adds support for calling sock_ops BPF program when there is a retransmission. Three arguments are used; one for the sequence number, another for the number of segments retransmitted, and the last one for the return value of tcp_transmit_skb (0 => success). Does not include syn-ack retransmissions. New op: BPF_SOCK_OPS_RETRANS_CB. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add sock_ops R/W access to tclassLawrence Brakmo1-2/+45
Adds direct write access to sk_txhash and access to tclass for ipv6 flows through getsockopt and setsockopt. Sample usage for tclass: bpf_getsockopt(skops, SOL_IPV6, IPV6_TCLASS, &v, sizeof(v)) where skops is a pointer to the ctx (struct bpf_sock_ops). Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add support for reading sk_state and moreLawrence Brakmo2-11/+154
Add support for reading many more tcp_sock fields state, same as sk->sk_state rtt_min same as sk->rtt_min.s[0].v (current rtt_min) snd_ssthresh rcv_nxt snd_nxt snd_una mss_cache ecn_flags rate_delivered rate_interval_us packets_out retrans_out total_retrans segs_in data_segs_in segs_out data_segs_out lost_out sacked_out sk_txhash bytes_received (__u64) bytes_acked (__u64) Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add sock_ops RTO callbackLawrence Brakmo2-1/+14
Adds an optional call to sock_ops BPF program based on whether the BPF_SOCK_OPS_RTO_CB_FLAG is set in bpf_sock_ops_flags. The BPF program is passed 2 arguments: icsk_retransmits and whether the RTO has expired. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Adds field bpf_sock_ops_cb_flags to tcp_sockLawrence Brakmo3-1/+61
Adds field bpf_sock_ops_cb_flags to tcp_sock and bpf_sock_ops. Its primary use is to determine if there should be calls to sock_ops bpf program at various points in the TCP code. The field is initialized to zero, disabling the calls. A sock_ops BPF program can set it, per connection and as necessary, when the connection is established. It also adds support for reading and writting the field within a sock_ops BPF program. Reading is done by accessing the field directly. However, writing is done through the helper function bpf_sock_ops_cb_flags_set, in order to return an error if a BPF program is trying to set a callback that is not supported in the current kernel (i.e. running an older kernel). The helper function returns 0 if it was able to set all of the bits set in the argument, a positive number containing the bits that could not be set, or -EINVAL if the socket is not a full TCP socket. Examples of where one could call the bpf program: 1) When RTO fires 2) When a packet is retransmitted 3) When the connection terminates 4) When a packet is sent 5) When a packet is received Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Support passing args to sock_ops bpf functionLawrence Brakmo6-10/+42
Adds support for passing up to 4 arguments to sock_ops bpf functions. It reusues the reply union, so the bpf_sock_ops structures are not increased in size. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add write access to tcp_sock and sock fieldsLawrence Brakmo3-1/+58
This patch adds a macro, SOCK_OPS_SET_FIELD, for writing to struct tcp_sock or struct sock fields. This required adding a new field "temp" to struct bpf_sock_ops_kern for temporary storage that is used by sock_ops_convert_ctx_access. It is used to store and recover the contents of a register, so the register can be used to store the address of the sk. Since we cannot overwrite the dst_reg because it contains the pointer to ctx, nor the src_reg since it contains the value we want to store, we need an extra register to contain the address of the sk. Also adds the macro SOCK_OPS_GET_OR_SET_FIELD that calls one of the GET or SET macros depending on the value of the TYPE field. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Make SOCK_OPS_GET_TCP struct independentLawrence Brakmo1-10/+10
Changed SOCK_OPS_GET_TCP to SOCK_OPS_GET_FIELD and added 2 arguments so now it can also work with struct sock fields. The first argument is the name of the field in the bpf_sock_ops struct, the 2nd argument is the name of the field in the OBJ struct. Previous: SOCK_OPS_GET_TCP(FIELD_NAME) New: SOCK_OPS_GET_FIELD(BPF_FIELD, OBJ_FIELD, OBJ) Where OBJ is either "struct tcp_sock" or "struct sock" (without quotation). BPF_FIELD is the name of the field in the bpf_sock_ops struct and OBJ_FIELD is the name of the field in the OBJ struct. Although the field names are currently the same, the kernel struct names could change in the future and this change makes it easier to support that. Note that adding access to tcp_sock fields in sock_ops programs does not preclude the tcp_sock fields from being removed as long as we are willing to do one of the following: 1) Return a fixed value (e.x. 0 or 0xffffffff), or 2) Make the verifier fail if that field is accessed (i.e. program fails to load) so the user will know that field is no longer supported. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Make SOCK_OPS_GET_TCP size independentLawrence Brakmo1-5/+8
Make SOCK_OPS_GET_TCP helper macro size independent (before only worked with 4-byte fields. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Only reply field should be writeableLawrence Brakmo1-2/+1
Currently, a sock_ops BPF program can write the op field and all the reply fields (reply and replylong). This is a bug. The op field should not have been writeable and there is currently no way to use replylong field for indices >= 1. This patch enforces that only the reply field (which equals replylong[0]) is writeable. Fixes: 40304b2a1567 ("bpf: BPF support for sock_ops") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-24bpf, doc: Correct one wrong value in "Register value tracking"Wang YanQing1-1/+1
If we then OR this with 0x40, then the value of 6th bit (0th is first bit) become known, so the right mask is 0xbf instead of 0xcf. Signed-off-by: Wang YanQing <udknight@gmail.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24Merge branch 'bpf-samples-sockmap-improvements'Daniel Borkmann1-52/+340
John Fastabend says: ==================== The sockmap sample is pretty simple at the moment. All it does is open a few sockets attach BPF programs/sockmaps and sends a few packets. However, for testing and debugging I wanted to have more control over the sendmsg format and data than provided by tools like iperf3/netperf, etc. The reason is for testing BPF programs and stream parser it is helpful to be able submit multiple sendmsg calls with different msg layouts. For example lots of 1B iovs or a single large MB of data, etc. Additionally, my current test setup requires an entire orchestration layer (cilium) to run. As well as lighttpd and http traffic generators or for kafka testing brokers and clients. This makes it a bit more difficult when doing performance optimizations to incrementally test small changes and come up with performance delta's and perf numbers. By adding a few more options and an additional few tests the sockmap sample program can show a more complete example and do some of the above. Because the sample program is self contained it doesn't require additional infrastructure to run either. This series, although still fairly crude, does provide some nice additions. They are - a new sendmsg tests with a sender and recv threads - a new base tests so we can get metrics/data without BPF - multiple GBps of throughput on base and sendmsg tests - automatically set rlimit and common variables That said the UI is still primitive, more features could be added, more tests might be useful, the reporting is bare bones, etc. But, IMO lets push this now rather than sit on it for weeks until I get time to do the above improvements. Additional patches can address the other limitations/issues. Another thing I am considering is moving this into selftests, after a few more fixes so we avoid false failures, so that we get more sockmap testing. v2: removed bogus file added by patch 3/7 v3: 1/7 replace goto out with returns, remove sighandler update, 2/7 free iov in error cases 3/7 fix bogus makefile change, bail out early on errors v4: add Martin's "nits" and ACKs along with fixes to 2/7 iov free also pointed out by Martin. Thanks Daniel and Martin for the reviews! ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24bpf: sockmap set rlimitJohn Fastabend1-0/+7
Avoid extra step of setting limit from cmdline and do it directly in the program. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24bpf: sockmap put client sockets in blocking modeJohn Fastabend1-1/+1
Put client sockets in blocking mode otherwise with sendmsg tests its easy to overrun the socket buffers which results in the test being aborted. The original non-blocking was added to handle listen/accept with a single thread the client/accepted sockets do not need to be non-blocking. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24bpf: sockmap sample add base test without any BPF for comparisonJohn Fastabend1-5/+21
Add a base test that does not use BPF hooks to test baseline case. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24bpf: sockmap sample, report bytes/secJohn Fastabend1-5/+42
Report bytes/sec sent as well as total bytes. Useful to get rough idea how different configurations and usage patterns perform with sockmap. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24bpf: sockmap sample, use fork() for send and recvJohn Fastabend1-16/+39
Currently for SENDMSG tests first send completes then recv runs. This does not work well for large data sizes and/or many iterations. So fork the recv and send handler so that we run both send and recv. In the future we can add a parameter to do more than a single fork of tx/rx. With this we can get many GBps of data which helps exercise the sockmap code. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24bpf: add sendmsg option for testing BPF programsJohn Fastabend1-3/+145
When testing BPF programs using sockmap I often want to have more control over how sendmsg is exercised. This becomes even more useful as new sockmap program types are added. This adds a test type option to select type of test to run. Currently, only "ping" and "sendmsg" are supported, but more can be added as needed. The new help argument gives the following, Usage: ./sockmap --cgroup <cgroup_path> options: --help -h --cgroup -c --rate -r --verbose -v --iov_count -i --length -l --test -t Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24bpf: refactor sockmap sample program update for arg parsingJohn Fastabend1-51/+114
sockmap sample program takes arguments from cmd line but it reads them in using offsets into the array. Because we want to add more arguments in the future lets do proper argument handling. Also refactor code to pull apart sock init and ping/pong test. This allows us to add new tests in the future. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-24selftests/bpf: make 'dubious pointer arithmetic' test usefulAlexei Starovoitov1-7/+23
mostly revert the previous workaround and make 'dubious pointer arithmetic' test useful again. Use (ptr - ptr) << const instead of ptr << const to generate large scalar. The rest stays as before commit 2b36047e7889. Fixes: 2b36047e7889 ("selftests/bpf: fix test_align") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23bpf: test_maps: cleanup sockmaps when test endsPrashant Bhole1-4/+12
Bug: BPF programs and maps related to sockmaps test exist in memory even after test_maps ends. This patch fixes it as a short term workaround (sockmap kernel side needs real fixing) by empyting sockmaps when test ends. Fixes: 6f6d33f3b3d0f ("bpf: selftests add sockmap tests") Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> [ daniel: Note on workaround. ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23selftests/bpf: fix test_dev_cgroupAlexei Starovoitov1-1/+1
The test incorrectly doing mkdir /mnt/cgroup-test-work-dirtest-bpf-based-device-cgroup instead of mkdir /mnt/cgroup-test-work-dir/test-bpf-based-device-cgroup somehow such mkdir succeeds and new directory appears: /mnt/cgroup-test-work-dir/cgroup-test-work-dirtest-bpf-based-device-cgroup Later cleanup via nftw("/mnt/cgroup-test-work-dir", ...); doesn't walk this directory. "rmdir /mnt/cgroup-test-work-dir" succeeds, but bpf program and dangling cgroup stays in memory. That's a separate issue on a cgroup side. For now fix the test. Fixes: 37f1ba0909df ("selftests/bpf: add a test for device cgroup controller") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23selftests/bpf: speedup test_mapsAlexei Starovoitov1-6/+10
test_hashmap_walk takes very long time on debug kernel with kasan on. Reduce the number of iterations in this test without sacrificing test coverage. Also add printfs as progress indicator. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23tools/bpf: fix a test failure in selftests prog test_verifierYonghong Song1-0/+1
Commit 111e6b45315c ("selftests/bpf: make test_verifier run most programs") enables tools/testing/selftests/bpf/test_verifier unit cases to run via bpf_prog_test_run command. With the latest code base, test_verifier had one test case failure: ... #473/p check deducing bounds from const, 2 FAIL retval 1 != 0 0: (b7) r0 = 1 1: (75) if r0 s>= 0x1 goto pc+1 R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 2: (95) exit from 1 to 3: R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 3: (d5) if r0 s<= 0x1 goto pc+1 R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 4: (95) exit from 3 to 5: R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 5: (1f) r1 -= r0 6: (95) exit processed 7 insns (limit 131072), stack depth 0 ... The test case does not set return value in the test structure and hence the return value from the prog run is assumed to be 0. However, the actual return value is 1. As a result, the test failed. The fix is to correctly set the return value in the test structure. Fixes: 111e6b45315c ("selftests/bpf: make test_verifier run most programs") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23bpf: fix incorrect kmalloc usage in lpm_trie MAP_GET_NEXT_KEY rcu regionYonghong Song1-1/+1
In commit b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map"), the implemented MAP_GET_NEXT_KEY callback function is guarded with rcu read lock. In the function body, "kmalloc(size, GFP_USER | __GFP_NOWARN)" is used which may sleep and violate rcu read lock region requirements. This patch fixed the issue by using GFP_ATOMIC instead to avoid blocking kmalloc. Tested with CONFIG_DEBUG_ATOMIC_SLEEP=y as suggested by Eric Dumazet. Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map") Signed-off-by: Yonghong Song <yhs@fb.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23net: aquantia: make symbol hw_atl_boards staticWei Yongjun1-1/+1
Fixes the following sparse warning: drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c:50:34: warning: symbol 'hw_atl_boards' was not declared. Should it be static? Fixes: 4948293ff963 ("net: aquantia: Introduce new AQC devices and capabilities") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23net: aquantia: Fix error return code in aq_pci_probe()Wei Yongjun1-1/+3
Fix to return error code -ENOMEM from the aq_ndev_alloc() error handling case instead of 0, as done elsewhere in this function. Fixes: 23ee07ad3c2f ("net: aquantia: Cleanup pci functions module") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23nfp: fix error return code in nfp_pci_probe()Wei Yongjun1-0/+1
Fix to return error code -EINVAL instead of 0 when num_vfs above limit_vfs, as done elsewhere in this function. Fixes: 0dc786219186 ("nfp: handle SR-IOV already enabled when driver is probing") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23nfp: fix fw dump handling of absolute rtsym sizeCarl Heymann1-6/+10
Fix bug that causes _absolute_ rtsym sizes of > 8 bytes (as per symbol table) to result in incorrect space used during a TLV-based debug dump. Detail: The size calculation stage calculates the correct size (size of the rtsym address field == 8), while the dump uses the size in the table to calculate the TLV size to reserve. Symbols with size <= 8 are handled OK due to aligning sizes to 8, but including any absolute symbol with listed size > 8 leads to an ENOSPC error during the dump. Fixes: da762863edd9 ("nfp: fix absolute rtsym handling in debug dump") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22tun: avoid calling xdp_rxq_info_unreg() twiceCong Wang1-2/+0
Similarly to tx ring, xdp_rxq_info is only registered when !tfile->detached, so we need to avoid calling xdp_rxq_info_unreg() twice too. The helper tun_cleanup_tx_ring() already checks for this properly, so it is correct to put xdp_rxq_info_unreg() just inside there. Reported-by: syzbot+1c788d7ce0f0888f1d7f@syzkaller.appspotmail.com Fixes: 8565d26bcb2f ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22Merge branch 'net-sched-add-extack-support-for-cls-offloads'David S. Miller10-51/+95
Jakub Kicinski says: ==================== net: sched: add extack support for cls offloads I've dropped the tests from the series because test_offloads.py changes will conflict with bpf-next patches. I will send four more patches with tests once bpf-next is merged back, hopefully still making it into 4.16 :) v4: - rebase on top of Alex's changes. --- Quentin says: This series tries to improve user experience when eBPF hardware offload hits error paths at load time. In particular, it introduces netlink extended ack support in the nfp driver. To that aim, transmission of the pointer to the extack object is piped through the `change()` operation of the existing classifiers (patch 1 to 6). Then it is used for TC offload in the nfp driver (patch 8) and in netdevsim (patch 9, selftest in patch 10). Patch 7 adds a helper to handle extack messages in the core when TC offload is disabled on the net device. For completeness extack is propagated for classifiers other than cls_bpf, but it's up to the drivers to make use of it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22nfp: bpf: use extack support to improve debuggingQuentin Monnet3-18/+39
Use the recently added extack support for eBPF offload in the driver. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22nfp: bpf: plumb extack into functions related to XDP offloadQuentin Monnet3-6/+9
Pass a pointer to an extack object to nfp_app_xdp_offload() in order to prepare for extack usage in the nfp driver. Next step will be to forward this extack pointer to nfp_net_bpf_offload(), once this function is able to use it for printing error messages. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: sched: create tc_can_offload_extack() wrapperQuentin Monnet1-0/+11
Create a wrapper around tc_can_offload() that takes an additional extack pointer argument in order to output an error message if TC offload is disabled on the device. In this way, the error message is handled by the core and can be the same for all drivers. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: sched: add extack support for offload via tc_cls_common_offloadQuentin Monnet5-12/+15
Add extack support for hardware offload of classifiers. In order to achieve this, a pointer to a struct netlink_ext_ack is added to the struct tc_cls_common_offload that is passed to the callback for setting up the classifier. Function tc_cls_common_offload_init() is updated to support initialization of this new attribute. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: sched: cls_bpf: plumb extack support in filter for hardware offloadQuentin Monnet1-6/+8
Pass the extack pointer obtained in the `->change()` filter operation to cls_bpf_offload() and then to cls_bpf_offload_cmd(). This makes it possible to use this extack pointer in drivers offloading BPF programs in a future patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: sched: cls_u32: propagate extack support for filter offloadQuentin Monnet1-5/+5
Propagate the extack pointer from the `->change()` classifier operation to the function used for filter replacement in cls_u32. This makes it possible to use netlink extack messages in the future at replacement time for this filter, although it is not used at this point. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: sched: cls_matchall: propagate extack support for filter offloadQuentin Monnet1-2/+4
Propagate the extack pointer from the `->change()` classifier operation to the function used for filter replacement in cls_matchall. This makes it possible to use netlink extack messages in the future at replacement time for this filter, although it is not used at this point. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: sched: cls_flower: propagate extack support for filter offloadQuentin Monnet1-2/+4
Propagate the extack pointer from the `->change()` classifier operation to the function used for filter replacement in cls_flower. This makes it possible to use netlink extack messages in the future at replacement time for this filter, although it is not used at this point. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22hv_netvsc: Use the num_online_cpus() for channel limitHaiyang Zhang1-9/+2
Since we no longer localize channel/CPU affiliation within one NUMA node, num_online_cpus() is used as the number of channel cap, instead of the number of processors in a NUMA node. This patch allows a bigger range for tuning the number of channels. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: hns3: converting spaces into tabs to avoid checkpatch.pl warningSalil Mehta1-2/+2
Spaces were mistakenly used instead of tabs in some of the code related to reset functionality, which caused checkpatch.pl errors. These were missed earlier so fixing them now. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22cxgb3: assign port id to net_device->dev_portArjun Vynipadath1-0/+1
T3 devices have different ports on same PCI function, so using dev_port to identify ports. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22bridge: return boolean instead of integer in br_multicast_is_routerGustavo A. R. Silva1-1/+1
Return statements in functions returning bool should use true/false instead of 1/0. This issue was detected with the help of Coccinelle. Fixes: 85b352693264 ("bridge: Fix build error when IGMP_SNOOPING is not enabled") Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: stmmac: Fix reception of Broadcom switches tagsFlorian Fainelli6-7/+39
Broadcom tags inserted by Broadcom switches put a 4 byte header after the MAC SA and before the EtherType, which may look like some sort of 0 length LLC/SNAP packet (tcpdump and wireshark do think that way). With ACS enabled in stmmac the packets were truncated to 8 bytes on reception, whereas clearing this bit allowed normal reception to occur. In order to make that possible, we need to pass a net_device argument to the different core_init() functions and we are dependent on the Broadcom tagger padding packets correctly (which it now does). To be as little invasive as possible, this is only done for gmac1000 when the network device is DSA-enabled (netdev_uses_dsa() returns true). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22Merge branch 'hns3-new-features'David S. Miller5-1/+545
Peng Li says: ==================== add some features to hns3 driver This patchset adds some features to hns3 driver, include the support for ethtool command -d, -p and support for manager table. [Patch 1/4] adds support for ethtool command -d, its ops is get_regs. driver will send command to command queue, and get regs number and regs value from command queue. [Patch 2/4] adds manager table initialization for hardware. [Patch 3/4] adds support for ethtool command -p. For fiber ports, driver sends command to command queue, and IMP will write SGPIO regs to control leds. [Patch 4/4] adds support for net status led for fiber ports. Net status include port speed, total rx/tx packets and link status. Driver send the status to command queue, and IMP will write SGPIO to control leds. --- Change log: V1 -> V2: 1, fix comments from Andrew Lunn, remove the patch "net: hns3: add ethtool -p support for phy device". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: hns3: add net status led support for fiber portJian Shen3-0/+113
Check the net status per second, include port speed, total rx/tx packets and link status. Updating the led status for fiber port. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: hns3: add ethtool -p support for fiber portJian Shen4-0/+104
Add led location support for fiber port. The led will keep blinking when locating. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22net: hns3: add manager table initialization for hardwareFuyun Liang2-0/+123
The manager table is empty by default. If it is not initialized, the management pkgs like LLDP will be dropped by hardware. Default entries need to be added to manager table. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>