aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/bpf/offload.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-11-10bpf: pass a struct with offload callbacks to bpf_offload_dev_create()Quentin Monnet6-9/+13
For passing device functions for offloaded eBPF programs, there used to be no place where to store the pointer without making the non-offloaded programs pay a memory price. As a consequence, three functions were called with ndo_bpf() through specific commands. Now that we have struct bpf_offload_dev, and since none of those operations rely on RTNL, we can turn these three commands into hooks inside the struct bpf_prog_offload_ops, and pass them as part of bpf_offload_dev_create(). This commit effectively passes a pointer to the struct to bpf_offload_dev_create(). We temporarily have two struct bpf_prog_offload_ops instances, one under offdev->ops and one under offload->dev_ops. The next patches will make the transition towards the former, so that offload->dev_ops can be removed, and callbacks relying on ndo_bpf() added to offdev->ops as well. While at it, rename "nfp_bpf_analyzer_ops" as "nfp_bpf_dev_ops" (and similarly for netdevsim). Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10nfp: bpf: move nfp_bpf_analyzer_ops from verifier.c to offload.cQuentin Monnet3-8/+12
We are about to add several new callbacks to the struct, all of them defined in offload.c. Move the struct bpf_prog_offload_ops object in that file. As a consequence, nfp_verify_insn() and nfp_finalize() can no longer be static. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-09bpf: Extend the sk_lookup() helper to XDP hookpoint.Nitin Hande2-19/+90
This patch proposes to extend the sk_lookup() BPF API to the XDP hookpoint. The sk_lookup() helper supports a lookup on incoming packet to find the corresponding socket that will receive this packet. Current support for this BPF API is at the tc hookpoint. This patch will extend this API at XDP hookpoint. A XDP program can map the incoming packet to the 5-tuple parameter and invoke the API to find the corresponding socket structure. Signed-off-by: Nitin Hande <Nitin.Hande@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09bpftool: Improve handling of ENOENT on map dumpsDavid Ahern1-4/+14
bpftool output is not user friendly when dumping a map with only a few populated entries: $ bpftool map 1: devmap name tx_devmap flags 0x0 key 4B value 4B max_entries 64 memlock 4096B 2: array name tx_idxmap flags 0x0 key 4B value 4B max_entries 64 memlock 4096B $ bpftool map dump id 1 key: 00 00 00 00 value: No such file or directory key: 01 00 00 00 value: No such file or directory key: 02 00 00 00 value: No such file or directory key: 03 00 00 00 value: 03 00 00 00 Handle ENOENT by keeping the line format sane and dumping "<no entry>" for the value $ bpftool map dump id 1 key: 00 00 00 00 value: <no entry> key: 01 00 00 00 value: <no entry> key: 02 00 00 00 value: <no entry> key: 03 00 00 00 value: 03 00 00 00 ... Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09selftests/bpf: add a test case for sock_ops perf-event notificationSowmini Varadhan4-1/+303
This patch provides a tcp_bpf based eBPF sample. The test - ncat(1) as the TCP client program to connect() to a port with the intention of triggerring SYN retransmissions: we first install an iptables DROP rule to make sure ncat SYNs are resent (instead of aborting instantly after a TCP RST) - has a bpf kernel module that sends a perf-event notification for each TCP retransmit, and also tracks the number of such notifications sent in the global_map The test passes when the number of event notifications intercepted in user-space matches the value in the global_map. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09bpf: add perf event notificaton support for sock_opsSowmini Varadhan1-0/+22
This patch allows eBPF programs that use sock_ops to send perf based event notifications using bpf_perf_event_output(). Our main use case for this is the following: We would like to monitor some subset of TCP sockets in user-space, (the monitoring application would define 4-tuples it wants to monitor) using TCP_INFO stats to analyze reported problems. The idea is to use those stats to see where the bottlenecks are likely to be ("is it application-limited?" or "is there evidence of BufferBloat in the path?" etc). Today we can do this by periodically polling for tcp_info, but this could be made more efficient if the kernel would asynchronously notify the application via tcp_info when some "interesting" thresholds (e.g., "RTT variance > X", or "total_retrans > Y" etc) are reached. And to make this effective, it is better if we could apply the threshold check *before* constructing the tcp_info netlink notification, so that we don't waste resources constructing notifications that will be discarded by the filter. This work solves the problem by adding perf event based notification support for sock_ops. The eBPF program can thus be designed to apply any desired filters to the bpf_sock_ops and trigger a perf event notification based on the evaluation from the filter. The user space component can use these perf event notifications to either read any state managed by the eBPF program, or issue a TCP_INFO netlink call if desired. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09nfp: bpf: relax prog rejection through max_pkt_offsetJiong Wang1-4/+5
NFP is refusing to offload programs whenever the MTU is set to a value larger than the max packet bytes that fits in NFP Cluster Target Memory (CTM). However, a eBPF program doesn't always need to access the whole packet data. Verifier has always calculated maximum direct packet access (DPA) offset, and kept it in max_pkt_offset inside prog auxiliar information. This patch relax prog rejection based on max_pkt_offset. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09bpf: let verifier to calculate and record max_pkt_offsetJiong Wang2-0/+13
In check_packet_access, update max_pkt_offset after the offset has passed __check_packet_access. It should be safe to use u32 for max_pkt_offset as explained in code comment. Also, when there is tail call, the max_pkt_offset of the called program is unknown, so conservatively set max_pkt_offset to MAX_PACKET_OFF for such case. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-07bpf_load: add map name to load_maps error messageShannon Nelson1-2/+2
To help when debugging bpf/xdp load issues, have the load_map() error message include the number and name of the map that failed. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-07tools: bpftool: adjust rlimit RLIMIT_MEMLOCK when loading programs, mapsQuentin Monnet4-0/+14
The limit for memory locked in the kernel by a process is usually set to 64 kbytes by default. This can be an issue when creating large BPF maps and/or loading many programs. A workaround is to raise this limit for the current process before trying to create a new BPF map. Changing the hard limit requires the CAP_SYS_RESOURCE and can usually only be done by root user (for non-root users, a call to setrlimit fails (and sets errno) and the program simply goes on with its rlimit unchanged). There is no API to get the current amount of memory locked for a user, therefore we cannot raise the limit only when required. One solution, used by bcc, is to try to create the map, and on getting a EPERM error, raising the limit to infinity before giving another try. Another approach, used in iproute2, is to raise the limit in all cases, before trying to create the map. Here we do the same as in iproute2: the rlimit is raised to infinity before trying to load programs or to create maps with bpftool. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-07selftests/bpf: enable (uncomment) all tests in test_libbpf.shQuentin Monnet2-10/+14
libbpf is now able to load successfully test_l4lb_noinline.o and samples/bpf/tracex3_kern.o. For the test_l4lb_noinline, uncomment related tests from test_libbpf.c and remove the associated "TODO". For tracex3_kern.o, instead of loading a program from samples/bpf/ that might not have been compiled at this stage, try loading a program from BPF selftests. Since this test case is about loading a program compiled without the "-target bpf" flag, change the Makefile to compile one program accordingly (instead of passing the flag for compiling all programs). Regarding test_xdp_noinline.o: in its current shape the program fails to load because it provides no version section, but the loader needs one. The test was added to make sure that libbpf could load XDP programs even if they do not provide a version number in a dedicated section. But libbpf is already capable of doing that: in our case loading fails because the loader does not know that this is an XDP program (it does not need to, since it does not attach the program). So trying to load test_xdp_noinline.o does not bring much here: just delete this subtest. For the record, the error message obtained with tracex3_kern.o was fixed by commit e3d91b0ca523 ("tools/libbpf: handle issues with bpf ELF objects containing .eh_frames") I have not been abled to reproduce the "libbpf: incorrect bpf_call opcode" error for test_l4lb_noinline.o, even with the version of libbpf present at the time when test_libbpf.sh and test_libbpf_open.c were created. RFC -> v1: - Compile test_xdp without the "-target bpf" flag, and try to load it instead of ../../samples/bpf/tracex3_kern.o. - Delete test_xdp_noinline.o subtest. Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-07net: hns3: Remove set but not used variable 'reset_level'YueHaibing1-10/+3
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c: In function 'hclge_log_and_clear_ppp_error': drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c:821:24: warning: variable 'reset_level' set but not used [-Wunused-but-set-variable] enum hnae3_reset_type reset_level = HNAE3_NONE_RESET; It never used since introduction in commit 01865a50d78f ("net: hns3: Add enable and process hw errors of TM scheduler") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07nfp: flower: use the common netdev notifierJakub Kicinski4-55/+30
Use driver's common notifier for LAG and tunnel configuration. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07nfp: register a notifier handler in a central location for the deviceJakub Kicinski2-15/+57
Code interested in networking events registers its own notifier handlers. Create one device-wide notifier instance. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07nfp: flower: make nfp_fl_lag_changels_event() voidJakub Kicinski1-8/+5
nfp_fl_lag_changels_event() never fails, and therefore we would never return NOTIFY_BAD for NETDEV_CHANGELOWERSTATE. Make this clearer by changing nfp_fl_lag_changels_event()'s return type to void. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07nfp: flower: don't try to nack device unregister eventsJakub Kicinski1-9/+12
Returning an error from a notifier means we want to veto the change. We shouldn't veto NETDEV_UNREGISTER just because we couldn't find the tracking info for given master. I can't seem to find a way to trigger this unless we have some other bug, so it's probably not fix-worthy. While at it move the checking if the netdev really is of interest into the handling functions, like we do for other events. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07nfp: flower: remove unnecessary iteration over devicesJakub Kicinski1-7/+0
For flower tunnel offloads FW has to be informed about MAC addresses of tunnel devices. We use a netdev notifier to keep track of these addresses. Remove unnecessary loop over netdevices after notifier is registered. The intention of the loop was to catch devices which already existed on the system before nfp driver got loaded, but netdev notifier will replay NETDEV_REGISTER events. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07nfp: flower: add ipv6 set flow label and hop limit offloadPieter Jansen van Vuuren2-4/+75
Add ipv6 set flow label and hop limit action offload. Since pedit sets headers per 4 byte word, we need to ensure that setting either version, priority, payload_len or nexthdr does not get offloaded. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07nfp: flower: add ipv4 set ttl and tos offloadPieter Jansen van Vuuren2-6/+73
Add ipv4 set ttl and tos action offload. Since pedit sets headers per 4 byte word, we need to ensure that setting either version, ihl, protocol, total length or checksum does not get offloaded. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: fix for cmd queue memory not freed problem during resetHuazhong Tan3-65/+91
It is not necessary to reallocate the descriptor and remap the descriptor memory in reset process, otherwise it may cause memory not freed problem. Also, this patch initializes the cmd queue's spinlocks in hclgevf_alloc_cmd_queue, and take the spinlocks when reinitializing cmd queue' registers. Fixes: fedd0c15d288 ("net: hns3: Add HNS3 VF IMP(Integrated Management Proc) cmd interface") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: add error handler for hclge_reset()Huazhong Tan2-22/+120
When hclge_reset() is called, it may fail for several reasons. For example, an higher-level reset event occurs, memory allocation failure, hardware reset timeout, etc. Therefore, it is necessary to add corresponding error handling for these situations. 1. A high-level reset is required due to a high-level reset failure. 2. For memory allocation failure, a high-level reset is initiated by the timer to recover. The reason for using the timer is to prevent this new high-level reset to interrupt the reset process of other pf/vf; 3. For the case of hardware reset timeout, reschedule the reset task to wait for the hardware to complete the reset. For memory allocation failure and reset timeouts, in order to prevent an infinite number of scheduled reset tasks, the number of error recovery needs to be limited. This patch also add some reset related debug log printing. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: call roce's reset notify callback when resettingHuazhong Tan1-0/+33
While doing resetting, roce should do its uninitailization part before nic's, and do its initialization part after nic's. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: adjust the process of PF resetHuazhong Tan1-2/+36
When doing PF reset, the driver needs to do some preparatory work before asserting PF reset. Since when hardware is resetting, it is necessary to stop tx/rx queue, clear hardware table, etc, otherwise hardware may run into unrecoverable state if there is still IO running when the hardware is resetting. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: move some reset information from hnae3_handle into hclge_dev/hclgevf_devHuazhong Tan7-34/+28
Saving reset related information in the hclge_dev/hclgevf_dev structure is more suitable than the hnae3_handle, since hardware related information is kept in these two structure. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: ignore new coming low-level reset while doing high-level resetHuazhong Tan1-11/+16
When processing a higher level reset, the pending lower level reset does not have to be processed anymore, because the higher level reset is the superset of the lower level reset. Therefore, when processing an higher level reset, the request of lower level reset needs to be cleared. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: use HNS3_NIC_STATE_RESETTING to indicate resettingHuazhong Tan4-0/+48
While hclge is going to reset, it will notify its client with HNAE3_DOWN_CLIENT, so this client should get into a resetting status from this moment, other operations from the stack need to be blocked as well. And when the reset is finished, the client will be notified with HNAE3_UP_CLIENT, so this is the end of the resetting status. This patch uses HNS3_NIC_STATE_RESETTING flag to implement that, and adds hns3_nic_resetting() to indicate which operation is not allowed. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: enable/disable ring in the enet while doing UP/DOWNHuazhong Tan4-44/+41
While hardware gets into reset status, the firmware will not respond to driver's command request, which may cause ring not disabled problem during reset process. So this patch uses register instead of command to enable/disable the ring in the enet while doing UP/DOWN operation. Also, HNS3_RING_RX_VM_REG is previously unused, so change it to the correct meaning, and add a wrapper function for readl(). Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: adjust the location of clearing the table when doing resetHuazhong Tan1-10/+10
When doing a function reset, the hardware table should be cleared before the hardware reset. In current code, this clearing is done in hns3_reset_notify_uninit_enet, but it is too late, because the hardware reset is already done, hns3_reset_notify_down_enet is more suitable to do that. Fixes: bb6b94a896d4 ("net: hns3: Add reset interface implementation in client") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: provide some interface & information for the clientHuazhong Tan5-0/+68
The client needs to know if the hardware is resetting when loading or unloading itself, because client may abort the loading process or wait for the reset process to finish when unloading if hardware is resetting. So this patch provides these interfaces to do it. 1. get_hw_reset_stat, the reset status of hardware. 2. ae_dev_resetting, whether reset task is scheduling. 3. ae_dev_reset_cnt, how many reset has been done. Also, the RoCE client needs some field in the hnae3_roce_private_info to save its state, and process_hw_error interface in the hnae3_client_ops to process hardware errors. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: add set_default_reset_request in the hnae3_ae_opsHuazhong Tan5-1/+45
Currently, when reset_event is called because of tx timeout, it will upgrade the reset level (For PF, HNAE3_FUNC_RESET -> HNAE3_CORE_RESET -> HNAE3_GLOBAL_RESET) if the time between the new reset and last reset is within 20 secs, or restore the reset level to HNAE3_FUNC_RESET if the time between the new reset and last reset is over 20 secs. There is requirement that the caller needs to decide the reset level when triggering a reset, for example, RAS recovery. So this patch adds the set_default_reset_request to meet this requirement. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: hns3: use HNS3_NIC_STATE_INITED to indicate the initialization state of enetHuazhong Tan2-1/+18
Besides of module_init and module_exit, the process of reset will also uninitialize and initialize the enet client. When reset process fails with enet client uninitialized, the module_exit does not need to uninitialize the enet client, otherwise it may cause double uninitialization problem. So we need the HNS3_NIC_STATE_INITED flag to indicate whether the enet client is initialized. Also HNS3_NIC_STATE_REINITING is previously unused, so change it to HNS3_NIC_STATE_INITED. Fixes: bb6b94a896d4 ("net: hns3: Add reset interface implementation in client") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: systemport: Unmap queues upon DSA unregister eventFlorian Fainelli1-6/+50
Binding and unbinding the switch driver which creates the DSA slave network devices for which we set-up inspection would lead to undesireable effects since we were not clearing the port/queue mapping to the SYSTEMPORT TX queue. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: systemport: Simplify queue mapping logicFlorian Fainelli2-9/+9
The use of a bitmap speeds up the finding of the first available queue to which we could start establishing the mapping for, but we still have to loop over all slave network devices to set them up. Simplify the logic to have a single loop, and use the fact that a correctly configured ring has inspect set to true. This will make things simpler to unwind during device unregistration. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: dsa: bcm_sf2: Turn on PHY to allow successful registrationFlorian Fainelli1-0/+4
We are binding to the PHY using the SF2 slave MDIO bus that we create, binding involves reading the PHY's MII_PHYSID1/2 which won't be possible if the PHY is turned off. Temporarily turn it on/off for the bus probing to succeeed. This fixes unbind/bind problems where the port connecting to that PHY would be in error since it could not connect to it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: systemport: Restore Broadcom tag match filters upon resumeFlorian Fainelli2-0/+13
Some of the system suspend states that we support wipe out entirely the HW contents. If we had a Wake-on-LAN filter programmed prior to going into suspend, but we did not actually wake-up from Wake-on-LAN and instead used a deeper suspend state, make sure we restore the CID number that we need to match against. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: dsa: bcm_sf2: Get rid of unmarshalling functionsFlorian Fainelli1-310/+0
Now that we have migrated the CFP rule handling to a list with a software copy, the delete/get operation just returns what is on the list, no need to read from the hardware which is both slow and more error prone. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: dsa: bcm_sf2: Restore CFP rules during system resumeFlorian Fainelli3-0/+41
The hardware can lose its context during system suspend, and depending on the switch generation (7445 vs. 7278), while the rules are still there, they will have their valid bit cleared (because that's the fastest way for the HW to reset things). Just make sure we re-apply them coming back from resume. The 7445 switch is an older version of the core that has some quirky RAM technology requiring a delete then re-inser to guarantee the RAM entries are properly latched. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: dsa: bcm_sf2: Split rule handling from HW operationFlorian Fainelli1-35/+54
In preparation for restoring CFP rules during system wide system suspend/resume where the hardware loses its context, split the rule validation from its actual insertion as well as the rule removal from its actual hardware deletion operation. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: dsa: bcm_sf2: Keep copy of inserted rulesFlorian Fainelli3-11/+137
We tried hard to use the hardware as a storage area, which made things needlessly complex in that we had to both marshall and unmarshall the ethtool_rx_flow_spec into what the CFP hardware understands but it did not require any driver level allocations, so that was nice. Keep a copy of the ethtool_rx_flow_spec rule we want to insert, and also make sure we don't have a duplicate rule already. This greatly speeds up the deletion time since we only need to clear the slice's valid bit and not perform a full read. This is a preparatory step for being able to restore rules upon system resumption where the hardware loses its context partially or entirely. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06rtnetlink: Add more extack messages to rtnl_newlinkDavid Ahern1-2/+4
Add extack arg to the nla_parse_nested calls in rtnl_newlink, and add messages for unknown device type and link network namespace id. In particular, it improves the failure message when the wrong link type is used. From $ ip li add bond1 type bonding RTNETLINK answers: Operation not supported to $ ip li add bond1 type bonding Error: Unknown device type. (The module name is bonding but the link type is bond.) Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: Add extack argument to ip_fib_metrics_initDavid Ahern4-11/+25
Add extack argument to ip_fib_metrics_init and add messages for invalid metrics. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: Add extack argument to rtnl_create_linkDavid Ahern7-12/+19
Add extack arg to rtnl_create_link and add messages for invalid number of Tx or Rx queues. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06ipv6: gro: do not use slow memcmp() in ipv6_gro_receive()Eric Dumazet1-3/+10
ipv6_gro_receive() compares 34 bytes using slow memcmp(), while handcoding with a couple of ipv6_addr_equal() is much faster. Before this patch, "perf top -e cycles:pp -C <cpu>" would see memcmp() using ~10% of cpu cycles on a 40Gbit NIC receiving IPv6 TCP traffic. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: skbuff.h: remove unnecessary unlikely()Yangtao Li1-3/+1
WARN_ON() already contains an unlikely(), so it's not necessary to use unlikely. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06ISDN: eicon: Remove driverOlof Johansson85-41755/+0
I started looking at the history of this driver, and last time the maintainer was active on the mailing list was when discussing how to remove it. This was in 2012: https://lore.kernel.org/lkml/4F4DE175.30002@melware.de/ It looks to me like this has in practice been an orphan for quite a while. It's throwing warnings about stack size in a function that is in dire need of refactoring, and it's probably a case of "it's time to call it". Cc: Armin Schindler <mac@melware.de> Cc: Karsten Keil <isdn@linux-pingi.de> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06ARM: 8809/1: proc-v7: fix Thumb annotation of cpu_v7_hvc_switch_mmArd Biesheuvel1-1/+1
Due to what appears to be a copy/paste error, the opening ENTRY() of cpu_v7_hvc_switch_mm() lacks a matching ENDPROC(), and instead, the one for cpu_v7_smc_switch_mm() is duplicated. Given that it is ENDPROC() that emits the Thumb annotation, the cpu_v7_hvc_switch_mm() routine will be called in ARM mode on a Thumb2 kernel, resulting in the following splat: Internal error: Oops - undefined instruction: 0 [#1] SMP THUMB2 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc1-00030-g4d28ad89189d-dirty #488 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 PC is at cpu_v7_hvc_switch_mm+0x12/0x18 LR is at flush_old_exec+0x31b/0x570 pc : [<c0316efe>] lr : [<c04117c7>] psr: 00000013 sp : ee899e50 ip : 00000000 fp : 00000001 r10: eda28f34 r9 : eda31800 r8 : c12470e0 r7 : eda1fc00 r6 : eda53000 r5 : 00000000 r4 : ee88c000 r3 : c0316eec r2 : 00000001 r1 : eda53000 r0 : 6da6c000 Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Note the 'ISA ARM' in the last line. Fix this by using the correct name in ENDPROC(). Cc: <stable@vger.kernel.org> Fixes: 10115105cb3a ("ARM: spectre-v2: add firmware based hardening") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-11-05net: alx: make alx_drv_name staticRasmus Villemoes2-2/+1
alx_drv_name is not used outside main.c, so there's no reason for it to have external linkage. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-05net: bpfilter: fix iptables failure if bpfilter_umh is disabledTaehee Yoo1-3/+3
When iptables command is executed, ip_{set/get}sockopt() try to upload bpfilter.ko if bpfilter is enabled. if it couldn't find bpfilter.ko, command is failed. bpfilter.ko is generated if CONFIG_BPFILTER_UMH is enabled. ip_{set/get}sockopt() only checks CONFIG_BPFILTER. So that if CONFIG_BPFILTER is enabled and CONFIG_BPFILTER_UMH is disabled, iptables command is always failed. test config: CONFIG_BPFILTER=y # CONFIG_BPFILTER_UMH is not set test command: %iptables -L iptables: No chain/target/match by that name. Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-05sock_diag: fix autoloading of the raw_diag moduleAndrei Vagin1-0/+1
IPPROTO_RAW isn't registred as an inet protocol, so inet_protos[protocol] is always NULL for it. Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Xin Long <lucien.xin@gmail.com> Fixes: bf2ae2e4bf93 ("sock_diag: request _diag module only when the family or proto has been registered") Signed-off-by: Andrei Vagin <avagin@gmail.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-05net: core: netpoll: Enable netconsole IPv6 link local addressMatwey V. Kornilov1-1/+2
There is no reason to discard using source link local address when remote netconsole IPv6 address is set to be link local one. The patch allows administrators to use IPv6 netconsole without explicitly configuring source address: netconsole=@/,@fe80::5054:ff:fe2f:6012/ Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru> Signed-off-by: David S. Miller <davem@davemloft.net>