aboutsummaryrefslogtreecommitdiffstats
path: root/drivers (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-03-13gro: Defer clearing of flush bit in tunnel pathsAlexander Duyck2-7/+3
This patch updates the GRO handlers for GRE, VXLAN, GENEVE, and FOU so that we do not clear the flush bit until after we have called the next level GRO handler. Previously this was being cleared before parsing through the list of frames, however this resulted in several paths where either the bit needed to be reset but wasn't as in the case of FOU, or cases where it was being set as in GENEVE. By just deferring the clearing of the bit until after the next level protocol has been parsed we can avoid any unnecessary bit twiddling and avoid bugs. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-12rocker: move ageing_time from struct rocker to struct ofdpaJiri Pirko3-7/+7
This is OF-DPA specific, used only there, similar to ofdpa_port->ageing_time. So move it to OF-DPA code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11qed: Enlrage the drain timeoutYuval Mintz1-2/+2
In the scenario where slowpath configuration isn't passing due to various pause configurations affecting the chip, the theoretical time required in worst-case-scenario to empty hw fifos sufficiently to guarantee that slowpath configuration would flow is currently insufficient. This increases such a drain request to the theoretical maximum. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11qed: Notify of transciever changesZvi Nachmani2-0/+41
Handle a new message from the MFW, one that indicate that the transciever state has changed, and log that into the system logs. Signed-off-by: Zvi Nachmani <Zvi.Nachmani@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11qed: Major changes to MB lockingTomer Tayar2-97/+167
Driver interaction with the managemnt firmware is done via mailbox commands which the management firmware periodically sample, as well as placing of additional data in set places in the shared memory. Each PF has a single designated mailbox address, and all flows that require messaging to the management should use it. This patch does 2 things: 1. It re-defines the critical section surrounding the mailbox sending - that section should include the setting of the shared memory as well as the sending of the command [otherwise a race might send a command with the data of a different command]. 2. It moves the locking scheme from using mutices into using spinlocks. This lays the groundwork for sending MFW commands from non-sleepable contexts. Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11qed: Prevent MF link notificationsSudarsana Reddy Kalluru2-1/+9
When device is configured for Multi-function mode, some older management firmware might incorrectly notify interfaces of link changes while they haven't requested the physical link configuration to be set. This can create bizzare race conditions where unloading interfaces are getting notified that the link is up. Let the driver compensate - store the logical requested state of the link and don't propagate notifications after protocol driver explicitly requires the link to be unset. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11geneve: support setting IPv6 flow labelDaniel Borkmann1-8/+27
This work adds support for setting the IPv6 flow label for geneve per device and through collect metadata (ip_tunnel_key) frontends. Also here, the geneve dst cache does not need any special considerations, for the cases where caches can be used, the label is static per cache. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11vxlan: support setting IPv6 flow labelDaniel Borkmann1-5/+21
This work adds support for setting the IPv6 flow label for vxlan per device and through collect metadata (ip_tunnel_key) frontends. The vxlan dst cache does not need any special considerations here, for the cases where caches can be used, the label is static per cache. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11ip_tunnel: add support for setting flow label via collect metadataDaniel Borkmann2-2/+2
This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label from call sites. Currently, there's no such option and it's always set to zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so that flow-based tunnels via collect metadata frontends can make use of it. vxlan and geneve will be converted to add flow label support separately. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11cisco: enic: Update logging macros and usesJoe Perches6-37/+43
Don't hide varibles used by the logging macros. Miscellanea: o Use the more common ##__VA_ARGS__ extension o Add missing newlines to formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11rocker: set FDB cleanup timer according to lowest ageing timeIdo Schimmel3-1/+7
In rocker, ageing time is a per-port attribute, so the next time the FDB cleanup timer fires should be set according to the lowest ageing time. This will later allow us to delete the BR_MIN_AGEING_TIME macro, which was added to guarantee minimum ageing time in the bridge layer, thereby breaking existing behavior. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11mlxsw: spectrum: Check requested ageing time is validIdo Schimmel2-2/+9
Commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") added a check for minimum and maximum ageing time, but this breaks existing behaviour where one can set ageing time to 0 for a non-learning bridge. Push this check down to the driver and allow the check in the bridge layer to be removed. Currently ageing time 0 is refused by the driver, but we can later add support for this functionality. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11macvtap: always pass ethernet header in linearWillem de Bruijn1-3/+6
The stack expects link layer headers in the skb linear section. Macvtap can create skbs with llheader in frags in edge cases: when (IFF_VNET_HDR is off or vnet_hdr.hdr_len < ETH_HLEN) and prepad + len > PAGE_SIZE and vnet_hdr.flags has no or bad csum. Add checks to ensure linear is always at least ETH_HLEN. At this point, len is already ensured to be >= ETH_HLEN. For backwards compatiblity, rounds up short vnet_hdr.hdr_len. This differs from tap and packet, which return an error. Fixes b9fb9ee07e67 ("macvtap: add GSO/csum offload support") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net/mlx5e: Support offload cls_flower with skbedit mark actionAmir Vadai3-0/+6
Introduce offloading of skbedit mark action. For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving to interface ens9: # tc qdisc add dev ens9 ingress # tc filter add dev ens9 protocol ip parent ffff: \ flower ip_proto 6 \ indev ens9 \ action skbedit mark 0x1234 Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net/mlx5e: Support offload cls_flower with drop actionAmir Vadai3-0/+309
Parse tc_cls_flower_offload into device specific commands and program the hardware to classify and act accordingly. For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac, src_ip, src_ip, arriving to interface ens9: # tc qdisc add dev ens9 ingress # tc filter add dev ens9 protocol ip parent ffff: \ flower ip_proto 1 \ dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \ dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \ action drop Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net/mlx5e: Introduce tc offload supportAmir Vadai5-2/+222
Extend ndo_setup_tc() to support ingress tc offloading. Will be used by later patches to offload tc flower filter. Feature is off by default and could be enabled by issuing: # ethtool -K eth0 hw-tc-offload on Offloads flow table is dynamically created when first filter is added. Rules are saved in a hash table that is maintained by the consumer (for example - the flower offload in the next patch). When last filter is removed and no filters exist in the hash table, the offload flow table is destroyed. Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net/mlx5e: Add a new priority for kernel flow tablesAmir Vadai2-4/+4
Move the vlan and main flow tables to use priority 1. This will allow the upcoming TC offload logic to use a higher priority (0) for the offload steering table. Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net/mlx5e: Relax ndo_setup_tc handle restrictionAmir Vadai1-1/+1
Restricting handle to TC_H_ROOT breaks the old instantiation of mqprio to setup a hardware qdisc. This patch relaxes the test, to only check the type. Fixes: 08fb1da ("net/mlx5e: Support DCBNL IEEE ETS") Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net/mlx5_core: Set flow steering dest only for forward rulesAmir Vadai2-19/+28
We need to handle flow table entry destinations only if the action associated with the rule is forwarding (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST). Fixes: 26a8145390b3 ('net/mlx5_core: Introduce flow steering firmware commands') Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net-next: mediatek: add Kconfig and MakefileJohn Crispin4-0/+24
This patch adds the Makefile and Kconfig required to make the driver build. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net-next: mediatek: add support for MT7623 ethernetJohn Crispin2-0/+2228
Add ethernet support for MediaTek SoCs from the MT7623 family. These have dual GMAC. Depending on the exact version, there might be a built-in Gigabit switch (MT7530). The core does not have the typical DMA ring setup. Instead there is a linked list that we add descriptors to. There is only one linked list that both MACs use together. There is a special field inside the TX descriptors called the VQID. This allows us to assign packets to different internal queues. By using a separate id for each MAC we are able to get deterministic results for BQL. Additionally we need to provide the core with a block of scratch memory that is the same size as the RX ring and data buffer. This is really needed to make the HW datapath work. Although the driver does not support this yet, we still need to assign the memory and tell the core about it for RX to work. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Michael Lee <igvtee@gmail.com> Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10qede: Fix net-next "make ARCH=x86_64"Manish Chopra1-1/+4
'commit 55482edc25f0606851de42e73618f813f310d009 ("qede: Add slowpath/fastpath support and enable hardware GRO")' introduces below error when compiling net-next with "make ARCH=x86_64" drivers/built-in.o: In function `qede_rx_int': qede_main.c:(.text+0x6101a0): undefined reference to `tcp_gro_complete' Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10qlcnic: Fix mailbox completion handling during spurious interruptRajesh Borundia3-5/+14
o While the driver is in the middle of a MB completion processing and it receives a spurious MB interrupt, it is mistaken as a good MB completion interrupt leading to premature completion of the next MB request. Fix the driver to guard against this by checking the current state of MB processing and ignore the spurious interrupt. Also added a stats counter to record this condition. Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10qlcnic: Remove unnecessary usage of atomic_tRajesh Borundia2-6/+5
o atomic_t usage is incorrect as we are not implementing any atomicity. Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10cxgb4vf: Set number of queues in pci probe onlyHariprasad Shenai1-4/+4
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10cxgb4vf: Add a couple more checks for invalid provisioning configurationsHariprasad Shenai1-0/+5
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10cxgb4vf: Configure queue based on resource and interrupt typeHariprasad Shenai1-71/+94
The Queue Set Configuration code was always reserving room for a Forwarded interrupt Queue even in the cases where we weren't using it. Figure out how many Ports and Queue Sets we can support. This depends on knowing our Virtual Function Resources and may be called a second time if we fall back from MSI-X to MSI Interrupt Mode. This change fixes that problem. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10cxgb4vf: Enable interrupts before we register our network devicesHariprasad Shenai1-25/+26
This avoids a race condition where a system that has network devices set up to be automatically configured and we get the first Port Link Status message from the firmware on the Asynchronous Firmware Event Queue before we've enabled interrupts. If that happens, we end up losing the interrupt and never realizing that the links has actually come up. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net: dsa: mv88e6xxx: avoid writing the same modeVivien Didelot1-8/+13
There is no need to change the 802.1Q port mode for the same value. Thus avoid such message: [ 401.954836] dsa dsa@0 lan0: 802.1Q Mode: Disabled (was Disabled) Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net: dsa: mv88e6xxx: read then write PVIDVivien Didelot1-4/+26
The port register 0x07 contains more options than just the default VID, even though they are not used yet. So prefer a read then write operation over a direct write. This also allows to keep track of the change through dynamic debug. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10net: dsa: mv88e6xxx: rework port state setterVivien Didelot2-22/+34
Apply a few non-functional changes on the port state setter: * add a dynamic debug message with state names to track changes * explicit states checking instead of assuming their numeric values * lock mutex only once when changing several port states * use bitmap macros to declare and access port_state_update_mask Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10sh_eth: advance 'rxdesc' later in sh_eth_ring_format()Sergei Shtylyov1-3/+4
Iff dma_map_single() fails, 'rxdesc' should point to the last filled RX descriptor, so that it can be marked as the last one, however the driver would have already advanced it by that time. In order to fix that, only fill an RX descriptor once all the data for it is ready. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10sh_eth: fix NULL pointer dereference in sh_eth_ring_format()Sergei Shtylyov1-1/+2
In a low memory situation, if netdev_alloc_skb() fails on a first RX ring loop iteration in sh_eth_ring_format(), 'rxdesc' is still NULL. Avoid kernel oops by adding the 'rxdesc' check after the loop. Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-10can: rcar_can: Add r8a7795 supportRamesh Shanmugasundaram2-1/+2
Added r8a7795 SoC support. Signed-off-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10can: ifi: Add obscure bit swap for EFF frame IDsMarek Vasut1-2/+29
In case of CAN2.0 EFF frame, the controller handles frame IDs in a rather bizzare way. The ID is split into an extended part, IDX[28:11] and standard part, ID[10:0]. In the TX path, the core first sends the top 11 bits of the IDX, followed by ID and finally the rest of IDX. In the RX path, the core stores the ID the LSbit part of IDX field, followed by the LSbit parts of real IDX. The MSbit parts of IDX are stored in ID field of the register. This patch implements the necessary bit shuffling to mitigate this obscure behavior. In case two of these controllers are connected together, the RX and TX bit swapping nullifies itself and the issue does not manifest. The issue only manifests when talking to another different CAN controller. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Wolfgang Grandegger <wg@grandegger.com> Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10can: ifi: Fix RX and TX ID maskMarek Vasut1-4/+4
The RX and TX ID mask for CAN2.0 is 11 bits wide. This patch fixes the incorrect mask, which caused the CAN IDs to miss the MSBit both on receive and transmit. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Wolfgang Grandegger <wg@grandegger.com> Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10can: ifi: Fix TX DLC configurationMarek Vasut1-8/+6
The TX DLC, the transmission length information, was not written into the transmit configuration register. When using the CAN core with different CAN controller, the receiving CAN controller will receive only the ID part of the CAN frame, but no data at all. This patch adds the TX DLC into the register to fix this issue. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Wolfgang Grandegger <wg@grandegger.com> Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-10can: ifi: Fix clock generator configurationMarek Vasut1-21/+23
The clock generation does not match reality when using the CAN IP core outside of the FPGA design. This patch fixes the computation of values which are programmed into the clock generator registers. First, there are some off-by-one errors which manifest themselves only when communicating with different controller, so those are fixed. Second, the bits in the clock generator registers have different meaning depending on whether the core is in ISO CANFD mode or any of the other modes (BOSCH CANFD or CAN2.0). Detect the ISO CANFD mode and fix handling of this special case of clock configuration. Finally, the CAN clock speed is in CANCLOCK register, not SYSCLOCK register, so fix this as well. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Wolfgang Grandegger <wg@grandegger.com> Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-03-08bnxt_en: Enable AER support.Satish Baddipadige1-0/+109
Add pci_error_handler callbacks to support for pcie advanced error recovery. Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bnxt_en: Include hardware port statistics in ethtool -S.Michael Chan1-2/+103
Include the more useful port statistics in ethtool -S for the PF device. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bnxt_en: Include some hardware port statistics in ndo_get_stats64().Michael Chan1-0/+16
Include some of the port error counters (e.g. crc) in ->ndo_get_stats64() for the PF device. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bnxt_en: Add port statistics support.Michael Chan2-1/+60
Gather periodic port statistics if the device is PF and link is up. This is triggered in bnxt_timer() every one second to request firmware to DMA the counters. Signed-off-by: Michael Chan <michael.chan@broadocm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bnxt_en: Extend autoneg to all speeds.Michael Chan2-14/+4
Allow all autoneg speeds aupported by firmware to be advertised. If the advertising parameter is 0, then all supported speeds will be advertised. Remove BNXT_ALL_COPPER_ETHTOOL_SPEED which is no longer used as all supported speeds can be advertised. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bnxt_en: Use common function to get ethtool supported flags.Michael Chan1-20/+9
The supported bits and advertising bits in ethtool have the same definitions. The same is true for the firmware bits. So use the common function to handle the conversion for both supported and advertising bits. v2: Don't use parentheses on function return. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bnxt_en: Add reporting of link partner advertisement.Michael Chan3-2/+23
And report actual pause settings to ETHTOOL_GPAUSEPARAM to let ethtool resolve the actual pause settings. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bnxt_en: Refactor bnxt_fw_to_ethtool_advertised_spds().Michael Chan1-13/+20
Include the conversion of pause bits and add one extra call layer so that the same refactored function can be reused to get the link partner advertisement bits. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08vxlan: allow setting ipv6 traffic classDaniel Borkmann1-5/+9
We can already do that for IPv4, but IPv6 support was missing. Add it for vxlan, so it can be used with collect metadata frontends. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf, vxlan, geneve, gre: fix usage of dst_cache on xmitDaniel Borkmann2-16/+14
The assumptions from commit 0c1d70af924b ("net: use dst_cache for vxlan device"), 468dfffcd762 ("geneve: add dst caching support") and 3c1cb4d2604c ("net/ipv4: add dst cache support for gre lwtunnels") on dst_cache usage when ip_tunnel_info is used is unfortunately not always valid as assumed. While it seems correct for ip_tunnel_info front-ends such as OVS, eBPF however can fill in ip_tunnel_info for consumers like vxlan, geneve or gre with different remote dsts, tos, etc, therefore they cannot be assumed as packet independent. Right now vxlan, geneve, gre would cache the dst for eBPF and every packet would reuse the same entry that was first created on the initial route lookup. eBPF doesn't store/cache the ip_tunnel_info, so each skb may have a different one. Fix it by adding a flag that checks the ip_tunnel_info. Also the !tos test in vxlan needs to be handeled differently in this context as it is currently inferred from ip_tunnel_info as well if present. ip_tunnel_dst_cache_usable() helper is added for the three tunnel cases, which checks if we can use dst cache. Fixes: 0c1d70af924b ("net: use dst_cache for vxlan device") Fixes: 468dfffcd762 ("geneve: add dst caching support") Fixes: 3c1cb4d2604c ("net/ipv4: add dst cache support for gre lwtunnels") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf: support for access to tunnel optionsDaniel Borkmann1-2/+2
After eBPF being able to programmatically access/manage tunnel key meta data via commit d3aa45ce6b94 ("bpf: add helpers to access tunnel metadata") and more recently also for IPv6 through c6c33454072f ("bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key"), this work adds two complementary helpers to generically access their auxiliary tunnel options. Geneve and vxlan support this facility. For geneve, TLVs can be pushed, and for the vxlan case its GBP extension. I.e. setting tunnel key for geneve case only makes sense, if we can also read/write TLVs into it. In the GBP case, it provides the flexibility to easily map the group policy ID in combination with other helpers or maps. I chose to model this as two separate helpers, bpf_skb_{set,get}_tunnel_opt(), for a couple of reasons. bpf_skb_{set,get}_tunnel_key() is already rather complex by itself, and there may be cases for tunnel key backends where tunnel options are not always needed. If we would have integrated this into bpf_skb_{set,get}_tunnel_key() nevertheless, we are very limited with remaining helper arguments, so keeping compatibility on structs in case of passing in a flat buffer gets more cumbersome. Separating both also allows for more flexibility and future extensibility, f.e. options could be fed directly from a map, etc. Moreover, change geneve's xmit path to test only for info->options_len instead of TUNNEL_GENEVE_OPT flag. This makes it more consistent with vxlan's xmit path and allows for avoiding to specify a protocol flag in the API on xmit, so it can be protocol agnostic. Having info->options_len is enough information that is needed. Tested with vxlan and geneve. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller185-2008/+2204
Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>