Age | Commit message | Author | Files | Lines
2018-05-09 | fm10k: reduce duplicate fm10k_stat macro code | Jacob Keller | 1 | -14/+15
Share some of the code for setting up fm10k_stat macros by implementing an FM10K_STAT_FIELDS macro which we can use when setting up the type specific macros. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-09 | fm10k: setup VLANs for l2 accelerated macvlan interfaces | Jacob Keller | 1 | -2/+48
We have support for accelerating macvlan devices via the .ndo_dfwd_add_station() netdev op. These accelerated macvlan MAC addresses are stored in the l2_accel structure, separate from the unicast or multicast address lists. If a VLAN is added on top of the macvlan device by the stack, traffic will not properly flow to the macvlan. This occurs because we fail to setup the VLANs for l2_accel MAC addresses. In the non-offloaded case the MAC address is added to the unicast address list, and thus the normal setup for enabling VLANs works as expected. We also need to add VLANs marked from .ndo_vlan_rx_add_vid() into the l2_accel MAC addresses. Otherwise, VLAN traffic will not properly be received by the VLAN devices attached to the offloaded macvlan devices. Fix this by adding necessary logic to setup VLANs not only for the unicast and multicast addresses, but also the l2_accel list. We need similar logic in dfwd_add_station, dfwd_del_station, fm10k_update_vid, and fm10k_restore_rx_state. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-08 | udp: Do not copy destructor if one is not present | Alexander Duyck | 1 | -8/+14
This patch makes it so that if a destructor is not present we avoid trying to update the skb socket or any reference counting that would be associated with the NULL socket and/or descriptor. By doing this we can support traffic coming from another namespace without any issues. Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | udp: Add support for software checksum and GSO_PARTIAL with GSO offload | Alexander Duyck | 2 | -20/+20
This patch adds support for a software provided checksum and GSO_PARTIAL segmentation support. With this we can offload UDP segmentation on devices that only have partial support for tunnels. Since we are no longer needing the hardware checksum we can drop the checks in the segmentation code that were verifying if it was present. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | udp: Partially unroll handling of first segment and last segment | Alexander Duyck | 1 | -14/+19
This patch allows us to take care of unrolling the first segment and the last segment of the loop for processing the segmented skb. Part of the motivation for this is that it makes it easier to process the fact that the first frame and all of the frames in between should be mostly identical in terms of header data, and the last frame has differences in the length and partial checksum. In addition I am dropping the header length calculation since we don't really need it for anything but the last frame and it can be easily obtained by just pulling the data_len and offset of tail from the transport header. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | udp: Do not pass checksum as a parameter to GSO segmentation | Alexander Duyck | 3 | -22/+20
This patch is meant to allow us to avoid having to recompute the checksum from scratch and have it passed as a parameter. Instead of taking that approach we can take advantage of the fact that the length that was used to compute the existing checksum is included in the UDP header. Finally to avoid the need to invert the result we can just call csum16_add and csum16_sub directly. By doing this we can avoid a number of instructions in the loop that is handling segmentation. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
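The arithmetic described in this message can be sketched in standalone C. This is a minimal illustration, not the kernel code: the two helpers mirror the one's-complement behaviour of the csum16_add() and csum16_sub() functions named above, while `adjust_for_new_len()` and the plain `uint16_t` types are assumptions made for the example.

```c
#include <stdint.h>

/* One's-complement 16-bit addition: add, then fold the carry back in. */
static uint16_t csum16_add(uint16_t csum, uint16_t addend)
{
    uint16_t res = (uint16_t)(csum + addend);

    return (uint16_t)(res + (res < addend));
}

/* Subtraction is addition of the one's complement of the operand. */
static uint16_t csum16_sub(uint16_t csum, uint16_t addend)
{
    return csum16_add(csum, (uint16_t)~addend);
}

/*
 * Hypothetical helper: given a running sum that already covers old_len,
 * swap in new_len without re-summing the rest of the covered data.
 */
static uint16_t adjust_for_new_len(uint16_t check, uint16_t old_len,
                                   uint16_t new_len)
{
    return csum16_add(csum16_sub(check, old_len), new_len);
}
```

Because the old length is already part of the existing sum, replacing it costs two 16-bit operations per segment instead of a full recomputation.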
2018-05-08 | udp: Do not pass MSS as parameter to GSO segmentation | Alexander Duyck | 3 | -4/+6
There is no point in passing MSS as a parameter for the GSO segmentation call as it is already available via the shared info for the skb itself. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | udp: Record gso_segs when supporting UDP segmentation offload | Alexander Duyck | 1 | -0/+2
We need to record the number of segments that will be generated when this frame is segmented. The expectation is that if gso_size is set then gso_segs is set as well. Without this some drivers such as ixgbe get confused if they attempt to offload this as they record 0 segments for the entire packet instead of the correct value. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
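As a rough sketch of the relationship between gso_size and gso_segs, the segment count is the payload length divided by the segment size, rounded up. The helper name `udp_gso_segs()` and its parameters are hypothetical; only the rounding matches what the kernel's DIV_ROUND_UP() macro does.

```c
#include <stddef.h>

/* Same rounding the kernel's DIV_ROUND_UP() macro performs. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: number of segments produced when payload_len bytes
 * are split into chunks of at most gso_size bytes. Returns 0 when
 * gso_size is 0, i.e. when no segmentation was requested.
 */
static unsigned int udp_gso_segs(size_t payload_len, unsigned int gso_size)
{
    if (gso_size == 0)
        return 0;
    return (unsigned int)DIV_ROUND_UP(payload_len, gso_size);
}
```

A driver that sees a non-zero gso_size paired with a segment count of 0 has no way to account for the packets it will emit, which is the confusion the message describes.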
2018-05-08 | dt-bindings: dsa: Remove unnecessary #address/#size-cells | Fabio Estevam | 1 | -6/+0
If the example binding is used on a real dts file, the following DTC warning is seen with W=1: arch/arm/boot/dts/imx6q-b450v3.dtb: Warning (avoid_unnecessary_addr_size): /mdio-gpio/switch@0: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property Remove unnecessary #address-cells/#size-cells to improve the binding document examples. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | net: phy: sfp: handle cases where neither BR, min nor BR, max is given | Antoine Tenart | 1 | -0/+7
When computing the bitrate using values read from an SFP module EEPROM, we use the nominal BR plus BR,min and BR,max to determine the boundaries. But in some cases BR,min and BR,max aren't provided, which led the SFP code to end up having the nominal value for both the minimum and maximum bitrate values. When using a passive cable, the nominal value should be used as the maximum one, and there is no minimum one so we should use 0. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
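The boundary rule above can be sketched as follows. This is an illustration under stated assumptions, not the driver code: the struct and function names are hypothetical, and the units (nominal bitrate in steps of 100 Mb/s, BR,min and BR,max as percentages of nominal) are assumed for the example.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the EEPROM fields involved. */
struct sfp_rate_fields {
    unsigned int br_nominal; /* nominal bitrate, units of 100 Mb/s (assumed) */
    unsigned int br_min_pct; /* BR,min: lower margin, percent of nominal */
    unsigned int br_max_pct; /* BR,max: upper margin, percent of nominal */
    bool passive_cable;
};

struct sfp_rate_range {
    unsigned int min_mbps;
    unsigned int max_mbps;
};

static struct sfp_rate_range sfp_rate_bounds(const struct sfp_rate_fields *f)
{
    struct sfp_rate_range r;
    unsigned int nominal = f->br_nominal * 100;

    /* br_nominal (x100 Mb/s) times a percentage yields Mb/s directly. */
    r.min_mbps = nominal - f->br_nominal * f->br_min_pct;
    r.max_mbps = nominal + f->br_nominal * f->br_max_pct;

    /* Neither margin given on a passive cable: nominal is the ceiling
     * and there is no floor. */
    if (r.min_mbps == r.max_mbps && f->passive_cable)
        r.min_mbps = 0;
    return r;
}
```

With both margins at zero, an active module keeps the nominal value for both bounds, while a passive cable accepts anything from 0 up to the nominal rate.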
2018-05-08 | bnxt_en: Always forward VF MAC address to the PF. | Michael Chan | 2 | -2/+3
The current code already forwards the VF MAC address to the PF, except in one case. If the VF driver gets a valid MAC address from the firmware during probe time, it will not forward the MAC address to the PF, incorrectly assuming that the PF already knows the MAC address. This causes "ip link show" to show zero VF MAC addresses for this case. This assumption is not correct. Newer firmware remembers the VF MAC address last used by the VF and provides it to the VF driver during probe. So we need to always forward the VF MAC address to the PF. The forwarded MAC address may now be the PF assigned MAC address and so we need to make sure we approve it for this case. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | bnxt_en: Read phy eeprom A2h address only when optical diagnostics is supported. | Vasundhara Volam | 2 | -14/+9
For SFP+ modules, 0xA2 page is available only when Diagnostic Monitoring Type [Address A0h, Byte 92] is implemented. Extend bnxt_get_module_info() to read optical diagnostics support at offset 92 (0x5c) and set eeprom_len length to ETH_MODULE_SFF_8436_LEN (to exclude A2 page), if diagnostics is not supported. Also in bnxt_get_module_info(), module id is read from offset 0x5e which is not correct. It was working by accident, as offset was not effective without setting enables flag in the firmware request. SFP module id is present at location 0. Fix this by removing the offset and read it from location 0. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | bnxt_en: Check unsupported speeds in bnxt_update_link() on PF only. | Michael Chan | 1 | -0/+3
Only non-NPAR PFs need to actively check and manage unsupported link speeds. NPAR functions and VFs do not control the link speed and should skip the unsupported speed detection logic, to avoid warning messages from firmware rejecting the unsupported firmware calls. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | bnxt_en: Fix firmware message delay loop regression. | Michael Chan | 2 | -4/+15
A recent change to reduce delay granularity waiting for firmware response has caused a regression. With a tighter delay loop, the driver may see the beginning part of the response faster. The original 5 usec delay to wait for the rest of the message is not long enough and some messages are detected as invalid. Increase the maximum wait time from 5 usec to 20 usec. Also, fix the debug message that shows the total delay time for the response when the message times out. With the new logic, the delay time is not fixed per iteration of the loop, so we define a macro to show the total delay time. Fixes: 9751e8e71487 ("bnxt_en: reduce timeout on initial HWRM calls") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | net-next/hinic: add pci device ids for 25ge and 100ge card | Zhao Chen | 1 | -2/+6
This patch adds PCI device IDs to support 25GE and 100GE card: 1. Add device id 0x0201 for HINIC 100GE dual port card. 2. Add device id 0x0200 for HINIC 25GE dual port card. 3. Macro of device id 0x1822 is modified for HINIC 25GE quad port card. Signed-off-by: Zhao Chen <zhaochen6@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | flow_dissector: do not rely on implicit casts | Paolo Abeni | 2 | -3/+3
This change fixes a couple of type mismatches reported by the sparse tool, explicitly using the requested type for the offending arguments. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-08 | net: core: rework basic flow dissection helper | Paolo Abeni | 4 | -20/+28
When the core networking needs to detect the transport offset in a given packet and parse it explicitly, a full-blown flow_keys struct is used for storage. This patch introduces a smaller keys store, reworks the basic flow dissect helper to use it, and applies this new helper where possible - namely in skb_probe_transport_header(). The used flow dissector data structures are renamed to match more closely the new role. The above gives ~50% performance improvement in micro benchmarking around skb_probe_transport_header() and ~30% around eth_get_headlen(), mostly due to the smaller memset. A small but measurable improvement is also seen in macro benchmarking. v1 -> v2: use the new helper in eth_get_headlen() and skb_get_poff(), as per DaveM suggestion Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | net: ipv6/gre: Add GRO support | Eran Ben Elisha | 1 | -10/+27
Add GRO capability for IPv6 GRE tunnel and ip6erspan tap, via gro_cells infrastructure. Performance testing: 55% higher bandwidth. Measuring bandwidth of 1 thread IPv4 TCP traffic over IPv6 GRE tunnel while GRO on the physical interface is disabled. CPU: Intel Xeon E312xx (Sandy Bridge) NIC: Mellanox Technologies MT27700 Family [ConnectX-4] Before (GRO not working in tunnel) : 2.47 Gbits/sec After (GRO working in tunnel) : 3.85 Gbits/sec Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> CC: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | net: ipv6: Fix typo in ipv6_find_hdr() documentation | Tariq Toukan | 1 | -1/+1
Fix 'an' into 'and', and use a comma instead of a period. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | qed: Add support for Unified Fabric Port. | Sudarsana Reddy Kalluru | 14 | -27/+283
This patch adds driver changes for supporting the Unified Fabric Port (UFP). This is a new partitioning mode wherein the MFW provides the set of parameters to be used by the device such as traffic class, outer-vlan tag value, priority type etc. The driver receives this info via notifications from the MFW and configures the hardware accordingly. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | qed: Add support for multi function mode with 802.1ad tagging. | Sudarsana Reddy Kalluru | 2 | -20/+49
The patch adds support for new Multi function mode wherein the traffic classification is done based on the 802.1ad tagging and the outer vlan tag provided by the management firmware. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | qed: Remove unused data member 'is_mf_default'. | Sudarsana Reddy Kalluru | 2 | -3/+0
The data member 'is_mf_default' is not used by the qed/qede drivers, removing the same. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | qed*: Refactor mf_mode to consist of bits. | Sudarsana Reddy Kalluru | 8 | -46/+71
`mf_mode' field indicates the multi-partitioning mode the device is configured to. This method doesn't scale very well, adding a new MF mode requires going over all the existing conditions, and deciding whether those are needed for the new mode or not. The patch defines a set of bit-fields for modes which are derived according to the mode info shared by the MFW and all the configuration would be made according to those. To add a new mode, there would be a single place where we'll need to go and choose which bits apply and which don't. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
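The refactoring idea can be illustrated with a small sketch. Every name here is hypothetical (the driver's real bit and mode names are not shown in the message above); the point is only the shape: one function translates a mode into capability bits, and all other code tests bits.

```c
#include <stdbool.h>

/* Hypothetical feature bits; the driver's real bit names are not shown above. */
enum mf_bit {
    MF_BIT_OVLAN_CLSS   = 1u << 0, /* classify traffic by outer VLAN */
    MF_BIT_LLH_MAC_CLSS = 1u << 1, /* classify traffic by MAC address */
    MF_BIT_INTER_PF_SW  = 1u << 2, /* allow switching between PFs */
};

/* Hypothetical mode values as reported by the management firmware. */
enum mf_mode { MF_DEFAULT, MF_OVLAN, MF_NPAR };

/* The single place where a mode is translated into feature bits. */
static unsigned int mf_bits_for_mode(enum mf_mode mode)
{
    switch (mode) {
    case MF_OVLAN:
        return MF_BIT_OVLAN_CLSS | MF_BIT_INTER_PF_SW;
    case MF_NPAR:
        return MF_BIT_LLH_MAC_CLSS | MF_BIT_INTER_PF_SW;
    default:
        return 0;
    }
}

/* Callers test a capability bit instead of comparing against mode values. */
static bool mf_has(unsigned int bits, enum mf_bit bit)
{
    return (bits & bit) != 0;
}
```

Adding a new mode then means adding one `case` to the translation function, while existing `mf_has()` checks stay untouched.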
2018-05-07 | net/9p: correct the variable name in v9fs_get_trans_by_name() comment | Sun Lianwen | 1 | -1/+1
The v9fs_get_trans_by_name(char *s) variable name is not "name" but "s". Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | vlan: correct the file path in vlan_dev_change_flags() comment | Sun Lianwen | 1 | -1/+3
The vlan_flags enum is defined in the include/uapi/linux/if_vlan.h file, not in the include/linux/if_vlan.h file. Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | liquidio: support use of ethtool to set link speed of CN23XX-225 cards | Weilin Chang | 7 | -24/+425
Support setting the link speed of CN23XX-225 cards (which can do 25Gbps or 10Gbps) via ethtool_ops.set_link_ksettings. Also fix the function assigned to ethtool_ops.get_link_ksettings to use the new link_ksettings api completely (instead of partially via ethtool_convert_legacy_u32_to_link_mode). Signed-off-by: Weilin Chang <weilin.chang@cavium.com> Acked-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | net: 3com: 3c59x: irq save variant of ISR | Anna-Maria Gleixner | 1 | -14/+4
When vortex_boomerang_interrupt() is invoked from vortex_tx_timeout() or poll_vortex() interrupts must be disabled. This detaches the interrupt disable logic from locking which requires patching for PREEMPT_RT. The advantage of avoiding spin_lock_irqsave() in the interrupt handler is minimal, but converting it removes all the extra code for callers which come not from interrupt context. Cc: Steffen Klassert <klassert@mathematik.tu-chemnitz.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | net: 3com: 3c59x: Pull locking out of ISR | Anna-Maria Gleixner | 1 | -11/+9
Locking is done in the same way in _vortex_interrupt() and _boomerang_interrupt(). To prevent duplication, move the locking into the calling vortex_boomerang_interrupt() function. No functional change. Cc: Steffen Klassert <klassert@mathematik.tu-chemnitz.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | net: 3com: 3c59x: Move boomerang/vortex conditional into function | Anna-Maria Gleixner | 1 | -14/+20
If vp->full_bus_master_tx is set, vp->full_bus_master_rx is set as well (see vortex_probe1()). Therefore the conditionals for the decision if boomerang or vortex ISR is executed have the same result. Instead of repeating the explicit conditional execution of the boomerang/vortex ISR, move it into an own function. No functional change. Cc: Steffen Klassert <klassert@mathematik.tu-chemnitz.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | net: u64_stats_sync: Remove functions without user | Anna-Maria Gleixner | 1 | -14/+0
Commit 67db3e4bfbc9 ("tcp: no longer hold ehash lock while calling tcp_get_info()") removes the only users of u64_stats_update_end/begin_raw() without removing the function in header file. Remove no longer used functions. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | selftests: net: add udpgso* to TEST_GEN_FILES | Anders Roxell | 1 | -1/+1
The generated files udpgso* shouldn't be part of TEST_PROGS; they are used by udpgso.sh and udpgso_bench.sh. They should be added to TEST_GEN_FILES to get installed without being added to the main run_kselftest.sh script. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 | netfilter: nft_dynset: fix timeout updates on 32bit | Florian Westphal | 1 | -1/+1
This must now use a 64bit jiffies value, else we set a bogus timeout on 32bit. Fixes: 8e1102d5a1596 ("netfilter: nf_tables: support timeouts larger than 23 days") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-07 | netfilter: ctnetlink: export nf_conntrack_max | Florent Fourcot | 3 | -0/+5
The IPCTNL_MSG_CT_GET_STATS netlink command allows monitoring the current number of conntrack entries. However, if one wants to compare it with the maximum (and detect exhaustion), the only solution is currently to read the sysctl value. This patch adds the nf_conntrack_max value to the netlink message, and simplifies monitoring for applications built on the netlink API. Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-07 | netfilter: extract Passive OS fingerprint infrastructure from xt_osf | Fernando Fernandez Mancera | 7 | -289/+359
Add nf_osf_ttl() and nf_osf_match() into nf_osf.c to prepare for nf_tables support. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-06 | netfilter: nf_tables: Provide NFT_{RT,CT}_MAX for userspace | Phil Sutter | 1 | -0/+4
These macros allow conveniently declaring arrays which use NFT_{RT,CT}_* values as indexes. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
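A minimal sketch of the array-declaration pattern this enables. The enum, its members, and the `DEMO_RT_MAX` macro are hypothetical stand-ins following the usual "sentinel minus one" convention; they are not the real NFT_RT_* definitions.

```c
#include <stddef.h>

/* Hypothetical key enum ending in a sentinel, as uapi enums commonly do. */
enum demo_rt_keys {
    DEMO_RT_CLASSID,
    DEMO_RT_NEXTHOP4,
    DEMO_RT_NEXTHOP6,
    DEMO_RT_SENTINEL
};
#define DEMO_RT_MAX (DEMO_RT_SENTINEL - 1)

/* One slot per key: the array is sized from the macro, not a magic number. */
static const char *const demo_rt_names[DEMO_RT_MAX + 1] = {
    [DEMO_RT_CLASSID]  = "classid",
    [DEMO_RT_NEXTHOP4] = "nexthop4",
    [DEMO_RT_NEXTHOP6] = "nexthop6",
};

/* Bounds-checked lookup; returns NULL for keys outside the enum range. */
static const char *demo_rt_name(unsigned int key)
{
    return key <= (unsigned int)DEMO_RT_MAX ? demo_rt_names[key] : NULL;
}
```

When a new key is appended before the sentinel, the array grows automatically and the bounds check stays correct.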
2018-05-06 | netfilter: nf_nat: remove unused ct arg from lookup functions | Florian Westphal | 7 | -42/+22
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-06 | netfilter: ip6t_srh: extend SRH matching for previous, next and last SID | Ahmed Abdelsalam | 2 | -11/+205
The IPv6 Segment Routing Header (SRH) contains a list of SIDs to be crossed by an SR encapsulated packet. Each SID is encoded as an IPv6 prefix. When a firewall receives an SR encapsulated packet, it should be able to identify which node previously processed the packet (previous SID), which node is going to process the packet next (next SID), and which node is the last to process the packet (last SID), which represents the final destination of the packet in case of inline SR mode. An example use-case of using these features could be a SID list that includes two firewalls. When the second firewall receives a packet, it can check whether the packet has been processed by the first firewall or not. Based on that check, it decides to apply all rules, apply just a subset of the rules, or totally skip all rules and forward the packet to the next SID. This patch extends SRH match to support matching previous SID, next SID, and last SID. Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-06 | netfilter: nft_numgen: enable hashing of one element | Laura Garcia Liebana | 1 | -1/+1
The modulus in the hash function was limited to > 1 as initially there was no sense in creating a hash of just one element. Nevertheless, there are certain cases, especially for load balancing, where this case needs to be addressed. This patch fixes the following error:

    Error: Could not process rule: Numerical result out of range
    add rule ip nftlb lb01 dnat to jhash ip saddr mod 1 map { 0: 192.168.0.10 }
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The solution is to force the hash to 0 when the modulus is 1. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
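A small sketch of why a modulus of 1 is safe once it is accepted. It assumes the hash is mapped into the bucket range with a multiply-and-shift, as the kernel's reciprocal_scale() does; the wrapper `hash_to_bucket()` and its error convention are hypothetical.

```c
#include <stdint.h>

/* Same multiply-and-shift scaling as the kernel's reciprocal_scale(). */
static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
    return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/*
 * Hypothetical helper: map a hash into [offset, offset + modulus).
 * Returns -1 for modulus 0, the only value still rejected.
 */
static int64_t hash_to_bucket(uint32_t hash, uint32_t modulus, uint32_t offset)
{
    if (modulus < 1)
        return -1;
    return (int64_t)reciprocal_scale(hash, modulus) + offset;
}
```

With a modulus of 1 the scaled value is always 0, so every packet maps to the single map element, which is the behaviour the rule in the error message asks for.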
2018-05-06 | netfilter: nft_numgen: add map lookups for numgen statements | Laura Garcia Liebana | 2 | -5/+84
This patch includes a new attribute in the numgen structure to allow the lookup of an element based on the number generator as a key. For this purpose, different ops have been included to extend the current numgen inc functions. Currently, only supported for numgen incremental operations, but it will be supported for random in a follow-up patch. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-04 | net/ipv6: rename rt6_next to fib6_next | David Ahern | 3 | -22/+22
This slipped through the cracks in the followup set to the fib6_info flip. Rename rt6_next to fib6_next. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 | bpf, xskmap: fix crash in xsk_map_alloc error path handling | Daniel Borkmann | 1 | -0/+2
If bpf_map_precharge_memlock() did not fail, then we set err to zero. However, any subsequent failure from either alloc_percpu() or the bpf_map_area_alloc() will return ERR_PTR(0) which in find_and_alloc_map() will cause NULL pointer deref. In devmap we have the convention that we return -EINVAL on page count overflow, so keep the same logic here and just set err to -ENOMEM after successful bpf_map_precharge_memlock(). Fixes: fbfc504a24f5 ("bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Björn Töpel <bjorn.topel@intel.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
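The failure mode can be reproduced in a few lines of userspace C. The helpers below are stand-ins modelled on the kernel's ERR_PTR()/IS_ERR() convention, and `demo_map_alloc()` is a hypothetical allocator with the same error-path shape as described above; none of this is the actual xskmap code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. */
static void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

static bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical allocator: a precharge step that succeeded (err == 0)
 * followed by an allocation that fails.
 */
static void *demo_map_alloc(bool reset_err_before_alloc)
{
    int err = 0; /* precharge step succeeded */

    if (reset_err_before_alloc)
        err = -ENOMEM; /* the fix: later failures report -ENOMEM */

    /* Pretend the per-CPU allocation failed here. */
    return err_ptr(err);
}
```

Without the reset, the function returns `err_ptr(0)`, which is just NULL: a caller that only checks `is_err()` treats it as a valid map and dereferences it.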
2018-05-04 | bpf: fix references to free_bpf_prog_info() in comments | Jakub Kicinski | 1 | -2/+2
Comments in the verifier refer to free_bpf_prog_info() which seems to have never existed in tree. Replace it with free_used_maps(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | tools: bpftool: add simple perf event output reader | Jakub Kicinski | 8 | -19/+444
Users of BPF sooner or later discover perf_event_output() helpers and BPF_MAP_TYPE_PERF_EVENT_ARRAY. Dumping this array type is not possible, however, we can add simple reading of perf events. Create a new event_pipe subcommand for maps, this sub command will only work with BPF_MAP_TYPE_PERF_EVENT_ARRAY maps. Parts of the code from samples/bpf/trace_output_user.c. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | tools: bpftool: move get_possible_cpus() to common code | Jakub Kicinski | 3 | -58/+59
Move the get_possible_cpus() function to shared code. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | tools: bpftool: fold hex keyword in command help | Jakub Kicinski | 2 | -15/+17
Instead of spelling [hex] BYTES everywhere use DATA as keyword for generalized value. This will help us keep the messages concise when longer commands are added in the future. It will also be useful once BTF support comes. We will only have to change the definition of DATA. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | nfp: bpf: rewrite map pointers with NFP TIDs | Jakub Kicinski | 2 | -21/+32
The kernel will now replace map fds with actual pointers before calling the offload prepare. We can identify those pointers and replace them with NFP table IDs instead of loading the table ID in code generated for the CALL instruction. This allows us to support having the same CALL being used with different maps. Since we don't want to change the FW ABI we still need to move the TID from R1 to a portion of R0 before the jump. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | nfp: bpf: perf event output helpers support | Jakub Kicinski | 7 | -4/+187
Add support for the perf_event_output family of helpers. The implementation on the NFP will not match the host code exactly. The state of the host map and rings is unknown to the device, hence device can't return errors when rings are not installed. The device simply packs the data into a firmware notification message and sends it over to the host, returning success to the program. There is no notion of a host CPU on the device when packets are being processed. Device will only offload programs which set BPF_F_CURRENT_CPU. Still, if map index doesn't match CPU no error will be returned (see above). Dropped/lost firmware notification messages will not cause "lost events" event on the perf ring, they are only visible via device error counters. Firmware notification messages may also get reordered in respect to the packets which caused their generation. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | bpf: replace map pointer loads before calling into offloads | Jakub Kicinski | 1 | -5/+5
Offloads may find host map pointers more useful than map fds. Map pointers can be used to identify the map, while fds are only valid within the context of loading process. Jump to skip_full_check on error in case verifier log overflow has to be handled (replace_map_fd_with_map_ptr() prints to the log, driver prep may do that too in the future). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | bpf: export bpf_event_output() | Jakub Kicinski | 1 | -0/+1
bpf_event_output() is useful for offloads to add events to BPF event rings, export it. Note that export is placed near the stub since tracing is optional and kernel/bpf/core.c is always going to be built. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-04 | nfp: bpf: record offload neutral maps in the driver | Jakub Kicinski | 5 | -6/+168
For asynchronous events originating from the device, like perf event output, we need to be able to make sure that objects being referred to by the FW message are valid on the host. FW events can get queued and reordered. Even if we had a FW message "barrier" we should still protect ourselves from bogus FW output. Add a reverse-mapping hash table and record in it all raw map pointers FW may refer to. Only record neutral maps, i.e. perf event arrays. These are currently the only objects FW can refer to. Use RCU protection on the read side, update side is under RTNL. Since program vs map destruction order is slightly painful for offload simply take an extra reference on all the recorded maps to make sure they don't disappear. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>