aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-05-25net/mlx5: E-switch, Create a second level FDB flow tableChris Mi3-5/+32
If firmware supports the forward action with a destination list that includes a flow table, create a second level FDB flow table. This is going to be used for flow based mirroring under the switchdev offloads mode. Signed-off-by: Chris Mi <chrism@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25net/mlx5: Add cap bits for flow table destination in FDB tableChris Mi1-1/+3
If set, the FDB table supports the forward action with a destination list that includes a flow table. Signed-off-by: Chris Mi <chrism@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25net/mlx5: E-Switch, Reorganize and rename fdb flow tablesChris Mi3-24/+25
We have several fdb flow tables for each of the legacy and switchdev modes. In the switchdev mode, there are fast path and slow path flow tables. Towards adding more flow tables in upcoming patches, reorganize and rename the various existing ones to reflect their functionality. Signed-off-by: Chris Mi <chrism@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25net: dsa: dsa_loop: Make dynamic debugging helpfulFlorian Fainelli1-14/+17
Remove redundant debug prints from phy_read/write since we can trace those calls through trace events. Enhance dynamic debug prints to print arguments which helps figuring how what is going on at the driver level with higher level configuration interfaces. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25Merge branch 'ovs-ct-zone'David S. Miller6-6/+595
Yi-Hung Wei says: ==================== openvswitch: Support conntrack zone limit Currently, nf_conntrack_max is used to limit the maximum number of conntrack entries in the conntrack table for every network namespace. For the VMs and containers that reside in the same namespace, they share the same conntrack table, and the total # of conntrack entries for all the VMs and containers are limited by nf_conntrack_max. In this case, if one of the VM/container abuses the usage the conntrack entries, it blocks the others from committing valid conntrack entries into the conntrack table. Even if we can possibly put the VM in different network namespace, the current nf_conntrack_max configuration is kind of rigid that we cannot limit different VM/container to have different # conntrack entries. To address the aforementioned issue, this patch proposes to have a fine-grained mechanism that could further limit the # of conntrack entries per-zone. For example, we can designate different zone to different VM, and set conntrack limit to each zone. By providing this isolation, a mis-behaved VM only consumes the conntrack entries in its own zone, and it will not influence other well-behaved VMs. Moreover, the users can set various conntrack limit to different zone based on their preference. The proposed implementation utilizes Netfilter's nf_conncount backend to count the number of connections in a particular zone. If the number of connection is above a configured limitation, OVS will return ENOMEM to the userspace. If userspace does not configure the zone limit, the limit defaults to zero that is no limitation, which is backward compatible to the behavior without this patch. The first patch defines the conntrack limit netlink definition, and the second patch provides the implementation. v4->v5: - Addresses comments from Parvin that include log error msg in ovs_ct_limit_init(), handle deletion for default limit, and add a common helper for get zone limit. - Rebases to master. v3->v4: - Addresses comments from Parvin that include simplify netlink API, and remove unncessary RCU lockings. - Rebases to master. v2->v3: - Addresses comments from Parvin that include using static keys to check if ovs_ct_limit features is used, only check ct_limit when a ct entry is unconfirmed, and reports rate limited warning messages when the ct limit is reached. - Rebases to master. v1->v2: - Fixes commit log typos suggested by Greg. - Fixes memory free issue that Julia found. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25openvswitch: Support conntrack zone limitYi-Hung Wei5-6/+567
Currently, nf_conntrack_max is used to limit the maximum number of conntrack entries in the conntrack table for every network namespace. For the VMs and containers that reside in the same namespace, they share the same conntrack table, and the total # of conntrack entries for all the VMs and containers are limited by nf_conntrack_max. In this case, if one of the VM/container abuses the usage the conntrack entries, it blocks the others from committing valid conntrack entries into the conntrack table. Even if we can possibly put the VM in different network namespace, the current nf_conntrack_max configuration is kind of rigid that we cannot limit different VM/container to have different # conntrack entries. To address the aforementioned issue, this patch proposes to have a fine-grained mechanism that could further limit the # of conntrack entries per-zone. For example, we can designate different zone to different VM, and set conntrack limit to each zone. By providing this isolation, a mis-behaved VM only consumes the conntrack entries in its own zone, and it will not influence other well-behaved VMs. Moreover, the users can set various conntrack limit to different zone based on their preference. The proposed implementation utilizes Netfilter's nf_conncount backend to count the number of connections in a particular zone. If the number of connection is above a configured limitation, ovs will return ENOMEM to the userspace. If userspace does not configure the zone limit, the limit defaults to zero that is no limitation, which is backward compatible to the behavior without this patch. The following high leve APIs are provided to the userspace: - OVS_CT_LIMIT_CMD_SET: * set default connection limit for all zones * set the connection limit for a particular zone - OVS_CT_LIMIT_CMD_DEL: * remove the connection limit for a particular zone - OVS_CT_LIMIT_CMD_GET: * get the default connection limit for all zones * get the connection limit for a particular zone Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25openvswitch: Add conntrack limit netlink definitionYi-Hung Wei1-0/+28
Define netlink messages and attributes to support user kernel communication that uses the conntrack limit feature. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25Merge tag 'mlx5e-updates-2018-05-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller17-80/+947
Saeed Mahameed says: ==================== mlx5e-updates-2018-05-19 This series contains updates for mlx5e netdevice driver with one subject, DSCP to priority mapping, in the first patch Huy adds the needed API in dcbnl, the second patch adds the needed mlx5 core capability bits for the feature, and all other patches are mlx5e (netdev) only changes to add support for the feature. From: Huy Nguyen Dscp to priority mapping for Ethernet packet: These patches enable differentiated services code point (dscp) to priority mapping for Ethernet packet. Once this feature is enabled, the packet is routed to the corresponding priority based on its dscp. User can combine this feature with priority flow control (pfc) feature to have priority flow control based on the dscp. Firmware interface: Mellanox firmware provides two control knobs for this feature: QPTS register allow changing the trust state between dscp and pcp mode. The default is pcp mode. Once in dscp mode, firmware will route the packet based on its dscp value if the dscp field exists. QPDPM register allow mapping a specific dscp (0 to 63) to a specific priority (0 to 7). By default, all the dscps are mapped to priority zero. Software interface: This feature is controlled via application priority TLV. IEEE specification P802.1Qcd/D2.1 defines priority selector id 5 for application priority TLV. This APP TLV selector defines DSCP to priority map. This APP TLV can be sent by the switch or can be set locally using software such as lldptool. In mlx5 drivers, we add the support for net dcb's getapp and setapp call back. Mlx5 driver only handles the selector id 5 application entry (dscp application priority application entry). If user sends multiple dscp to priority APP TLV entries on the same dscp, the last sent one will take effect. All the previous sent will be deleted. This attribute combined with pfc attribute allows advanced user to fine tune the qos setting for specific priority queue. For example, user can give dedicated buffer for one or more priorities or user can give large buffer to certain priorities. The dcb buffer configuration will be controlled by lldptool. >> lldptool -T -i eth2 -V BUFFER prio 0,2,5,7,1,2,3,6 maps priorities 0,1,2,3,4,5,6,7 to receive buffer 0,2,5,7,1,2,3,6 >> lldptool -T -i eth2 -V BUFFER size 87296,87296,0,87296,0,0,0,0 sets receive buffer size for buffer 0,1,2,3,4,5,6,7 respectively After discussion on mailing list with Jakub, Jiri, Ido and John, we agreed to choose dcbnl over devlink interface since this feature is intended to set port attributes which are governed by the netdev instance of that port, where devlink API is more suitable for global ASIC configurations. The firmware trust state (in QPTS register) is changed based on the number of dscp to priority application entries. When the first dscp to priority application entry is added by the user, the trust state is changed to dscp. When the last dscp to priority application entry is deleted by the user, the trust state is changed to pcp. When the port is in DSCP trust state, the transmit queue is selected based on the dscp of the skb. When the port is in DSCP trust state and vport inline mode is not NONE, firmware requires mlx5 driver to copy the IP header to the wqe ethernet segment inline header if the skb has it. This is done by changing the transmit queue sq's min inline mode to L3. Note that the min inline mode of sqs that belong to other features such as xdpsq, icosq are not modified. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-258139too: Remove unnecessary netif_napi_del()Bo Chen1-2/+0
The call to free_netdev() in __rtl8139_cleanup_dev() clears the network device napi list, and explicit calls to netif_napi_del() are unnecessary. Signed-off-by: Bo Chen <chenbo@pdx.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25Merge branch 'qed-ethtool-rx-flow-classification-enhancements'David S. Miller7-199/+502
Manish Chopra says: ==================== qed*: ethtool rx flow classification enhancements. This series re-structures the driver's ethtool rx flow classification flow, following that it adds other flow profiles and rx flow classification enhancements via "ethtool -N/-U" Please consider applying this to "net-next" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25qed*: Support drop action classificationManish Chopra7-15/+41
With this patch, User can configure for the supported flows to be dropped. Added a stat "gft_filter_drop" as well to be populated in ethtool for the dropped flows. For example - ethtool -N p5p1 flow-type udp4 dst-port 8000 action -1 ethtool -N p5p1 flow-type tcp4 scr-ip 192.168.8.1 action -1 Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25qede: Support flow classification to the VFs.Manish Chopra1-4/+30
With the supported classification modes [4 tuples based, udp port based, src-ip based], flows can be classified to the VFs as well. With this patch, flows can be re-directed to the requested VF provided in "action" field of command. Please note that driver doesn't really care about the queue bits in "action" field for the VFs. Since queue will be still chosen by FW using RSS hash. [I.e., the classification would be done according to vport-only] For examples - ethtool -N p5p1 flow-type udp4 dst-port 8000 action 0x100000000 ethtool -N p5p1 flow-type tcp4 src-ip 192.16.6.10 action 0x200000000 ethtool -U p5p1 flow-type tcp4 src-ip 192.168.40.100 dst-ip \ 192.168.40.200 src-port 6660 dst-port 5550 \ action 0x100000000 Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25qed*: Support other classification modes.Manish Chopra3-2/+32
Currently, driver supports flow classification to PF receive queues based on TCP/UDP 4 tuples [src_ip, dst_ip, src_port, dst_port] only. This patch enables to configure different flow profiles [For example - only UDP dest port or src_ip based] on the adapter so that classification can be done according to just those fields as well. Although, at a time just one type of flow configuration is supported due to limited number of flow profiles available on the device. For example - ethtool -N enp7s0f0 flow-type udp4 dst-port 45762 action 2 ethtool -N enp7s0f0 flow-type tcp4 src-ip 192.16.4.10 action 1 ethtool -N enp7s0f0 flow-type udp6 dst-port 45762 action 3 Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25qede: Validate unsupported configurationsManish Chopra1-0/+73
Validate and prevent some of the configurations for unsupported [by firmware] inputs [for example - mac ext, vlans, masks/prefix, tos/tclass] via ethtool -N/-U. Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25qede: Refactor ethtool rx classification flow.Manish Chopra1-182/+330
This patch simplifies the ethtool rx flow configuration [via ethtool -U/-N] flow code base by dividing it logically into various APIs based on given protocols. It also separates various validations and calculations done along the flow in their own APIs. Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25cxgb4/cxgb4vf: Notify link changes to OS-dependent codeArjun Vynipadath6-6/+64
We have a confusion of two different abstractions in the Common Code: Physical Link (Port) and Logical Network Interface (Virtual Interface), and we haven't been properly managing the state of the intersection of those two abstractions. On the one hand we have the Physical state of the Link -- up or down -- and on the other we have the logical state of the VI, enabled or not. {ethN} refers to both the Physical and Logical State. In this case, ifconfig only affects/interrogates the Logical State of a VI, and ethtool only deals with the Physical State. And these are different. So, just because we disable the VI, we don't really want to change the Physical Link Up/Down state. Thus, the previous hack to set "lc->link_ok = 0" when we disable a VI is completely incorrect. Where we get into trouble is where the Physical Link State and the Logical VI State cross swords. And that happens in t4_handle_get_port_info() where we need to manage/safe the Physical Link State, but we also need to know when the Logical VI State has changed and pass that back up to the OS-dependent Driver routine t4_os_link_changed() which is concerned about the Logical Interface. So we enable a VI and that causes Firmware to send us a new Port Information message, but if none of the Physical Link State particulars have changed, we don't call t4_os_link_changed(). This fix uses the existing OS Contract APIs for the Common Code to inform the OS-dependent portion of the Host Driver when the "Link" (really Logical Network Interface) is "up" or "down". A new API t4_enable_pi_params() is added which calls t4_enable_vi_params() and, if that is successful, then calls back to the OS Contract API t4_os_link_changed() notifying the OS-dependent layer of the potential Link State change. Original Work by : Casey Leedom <leedom@chelsio.com> Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25cxgb4: clean up init_oneGanesh Goudar2-21/+28
clean up init_one and use chip_ver consistently throughout init_one() for chip version. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25cxgb4/cxgb4vf: link management changes for new SFPGanesh Goudar3-18/+85
newer SFPs like SFP28 and QSFP28 Transceiver Modules present several new possibilities which we haven't faced before. Fix the assumptions in the code reflecting the more limited capabilities of previous Transceiver Module systems Original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25net: fec: remove stale commentYueHaibing1-6/+0
This comment is outdated as fec_ptp_ioctl has been replaced by fec_ptp_set/fec_ptp_get since commit 1d5244d0e43b ("fec: Implement the SIOCGHWTSTAMP ioctl") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25sfc: stop the TX queue before pushing new buffersMartin Habets1-8/+25
efx_enqueue_skb() can push new buffers for the xmit_more functionality. We must stops the TX queue before this or else the TX queue does not get restarted and we get a netdev watchdog. In the error handling we may now need to unwind more than 1 packet, and we may need to push the new buffers onto the partner queue. v2: In the error leg also push this queue if xmit_more is set Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Reported-by: Jarod Wilson <jarod@redhat.com> Tested-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Martin Habets <mhabets@solarflare.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25net: bridge: add support for port isolationNikolay Aleksandrov7-2/+24
This patch adds support for a new port flag - BR_ISOLATED. If it is set then isolated ports cannot communicate between each other, but they can still communicate with non-isolated ports. The same can be achieved via ACLs but they can't scale with large number of ports and also the complexity of the rules grows. This feature can be used to achieve isolated vlan functionality (similar to pvlan) as well, though currently it will be port-wide (for all vlans on the port). The new test in should_deliver uses data that is already cache hot and the new boolean is used to avoid an additional source port test in should_deliver. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24Merge branch 'nfp-offload-LAG-for-tc-flower-egress'David S. Miller14-30/+1053
Jakub Kicinski says: ==================== nfp: offload LAG for tc flower egress This series from John adds bond offload to the nfp driver. Patch 5 exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp hashing matches that of the software LAG. This may be unnecessarily conservative, let's see what LAG maintainers think :) John says: This patchset sets up the infrastructure and offloads output actions for when a TC flower rule attempts to egress a packet to a LAG port. Firstly it adds some of the infrastructure required to the flower app and to the nfp core. This includes the ability to change the MAC address of a repr, a function for combining lookup and write to a FW symbol, and the addition of private data to a repr on a per app basis. Patch 6 continues by implementing notifiers that track Linux bonds and communicates to the FW those which enslave reprs, along with the current state of reprs within the bond. Patch 7 ensures bonds are synchronised with FW by receiving and acting upon cmsgs sent to the kernel. These may request that a bond message is retransmitted when FW can process it, or may request a full sync of the bonds defined in the kernel. Patch 8 offloads a flower action when that action requires egressing to a pre-defined Linux bond. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24nfp: flower: compute link aggregation actionJohn Hurley5-28/+169
If the egress device of an offloaded rule is a LAG port, then encode the output port to the NFP with a LAG identifier and the offloaded group ID. A prelag action is also offloaded which must be the first action of the series (although may appear after other pre-actions - e.g. tunnels). This causes the FW to check that it has the necessary information to output to the requested LAG port. If it does not, the packet is sent to the kernel before any other actions are applied to it. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24nfp: flower: implement host cmsg handler for LAGJohn Hurley3-2/+105
Adds the control message handler to synchronize offloaded group config with that of the kernel. Such messages are sent from fw to driver and feature the following 3 flags: - Data: an attached cmsg could not be processed - store for retransmission - Xon: FW can accept new messages - retransmit any stored cmsgs - Sync: full sync requested so retransmit all kernel LAG group info Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24nfp: flower: monitor and offload LAG groupsJohn Hurley4-3/+646
Monitor LAG events via the NETDEV_CHANGEUPPER/NETDEV_CHANGELOWERSTATE notifiers to maintain a list of offloadable groups. Sync these groups with HW via a delayed workqueue to prevent excessive re-configuration. When the workqueue is triggered it may generate multiple control messages for different groups. These messages are linked via a batch ID and flags to indicate a new batch and the end of a batch. Update private data in each repr to track their LAG lower state flags. The state of a repr is used to determine the active netdevs that can be offloaded. For example, in active-backup mode, we only offload the netdev currently active. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24net: include hash policy in LAG changeupper infoJohn Hurley3-1/+38
LAG upper event notifiers contain the tx type used by the LAG device. Extend this to also include the hash policy used for tx types that utilize hashing. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24nfp: flower: add per repr private data for LAG offloadJohn Hurley2-0/+34
Add a bitmap to each flower repr to track its state if it is enslaved by a bond. This LAG state may be different to the port state - for example, the port may be up but LAG state may be down due to the selection in an active/backup bond. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24nfp: flower: check for/turn on LAG support in firmwareJohn Hurley4-0/+19
Check if the fw contains the _abi_flower_balance_sync_enable symbol. If it does then write a 1 to this indicating that the driver is willing to receive NIC to kernel LAG related control messages. If the write is successful, update the list of extra features supported by the fw and add a stub to accept LAG cmsgs. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24nfp: nfpcore: add rtsym writing functionJohn Hurley2-0/+45
Add an rtsym API function that combines the lookup of a symbol and the writing of a value to it. Values can be written as unsigned 32 or 64 bits. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24nfp: add ndo_set_mac_address for representorsJohn Hurley1-0/+1
Adding a netdev to a bond requires that its mac address can be modified. The default eth_mac_addr is sufficient to satisfy this requirement. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24hv_netvsc: fix bogus ifalias on network deviceStephen Hemminger1-1/+4
If the guest network adapter is not configured with DeviceNaming enabled on the host, then the query for friendly name will return success but with a zero length name. Which then leads to a garbage value (stack contents) for ifalias. Fix is simple, just don't set name if host doesn't return it. Fixes: 0fe554a46a0f ("hv_netvsc: propogate Hyper-V friendly name into interface alias") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24Merge branch 'net-Update-fib_table_lookup-tracepoints'David S. Miller6-84/+81
David Ahern says: ==================== net: Update fib_table_lookup tracepoints Update the FIB lookup tracepoints to include ip proto and port fields from the flow struct. In the process make the IPv4 tracepoint inline with IPv6 which is much easier to use and follow the lookup and result. Remove the tracepoint in fib_validate_source which does not provide value above the fib_table_lookup which immediately follows it. v2 - move CREATE_TRACE_POINTS for the v6 tracepoint to route.c to handle its need for an internal function to convert route type to error and handle IPv6 as a module or builtin. Reported by kbuild robot. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24net/ipv4: Remove tracepoint in fib_validate_sourceDavid Ahern2-37/+0
Tracepoint does not add value and the call to fib_lookup follows it which shows the same information and the fib lookup result. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24net/ipv6: Udate fib6_table_lookup tracepointDavid Ahern3-13/+29
Commit bb0ad1987e96 ("ipv6: fib6_rules: support for match on sport, dport and ip proto") added support for protocol and ports to FIB rules. Update the FIB lookup tracepoint to dump the parameters. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24net/ipv4: Udate fib_table_lookup tracepointDavid Ahern2-34/+52
Commit 4a2d73a4fb36 ("ipv4: fib_rules: support match on sport, dport and ip proto") added support for protocol and ports to FIB rules. Update the FIB lookup tracepoint to dump the parameters. In addition, make the IPv4 tracepoint similar to the IPv6 one where the lookup parameters and result are dumped in 1 event. It is much easier to use and understand the outcome of the lookup. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24net_sched: switch to rcu_workCong Wang13-221/+85
Commit 05f0fe6b74db ("RCU, workqueue: Implement rcu_work") introduces new API's for dispatching work in a RCU callback. Now we can just switch to the new API's for tc filters. This could get rid of a lot of code. Cc: Tejun Heo <tj@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24Merge branch 'Mirroring-tests-involving-VLAN'David S. Miller11-81/+754
Petr Machata says: ==================== Mirroring tests involving VLAN This patchset tests mirror-to-gretap with various underlay configurations involving VLAN netdevice in particular. Some of the tests involve bridges as well, but tests aimed specifically at testing bridges (i.e. FDB, STP) are not part of this patchset. In patches #1-#6, the codebase is adapted to support the new tests. In patch #7, a test for mirroring to VLAN is introduced. Patches #8-#10 add three tests where VLAN is part of underlay path after gretap encapsulation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: Test mirror-to-gre w/ UL 802.1d+VLANPetr Machata1-0/+109
Test for "tc action mirred egress mirror" that mirrors to GRE when the underlay route points at an 802.1d bridge and packet egresses through a VLAN device. Besides testing basic connectivity, this also tests that the traffic is properly tagged. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: Test mirror-to-gre w/ UL VLANPetr Machata1-0/+92
Test for "tc action mirred egress mirror" that mirrors to a gretap netdevice whose underlay route points at a vlan device. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: Test mirror-to-gre w/ UL VLAN+802.1qPetr Machata1-0/+140
Test for "tc action mirred egress mirror" that mirrors to GRE when the underlay route points at a vlan device on top of a bridge device with vlan filtering (802.1q). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: Test mirror-to-vlanPetr Machata1-0/+169
Test for "tc action mirred egress mirror" that mirrors to a vlan device. - test_vlan() tests that the packets get mirrored - test_tagged_vlan() tests that the mirrored packets have correct inner VLAN tag. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: lib: Extract trap_{, un}install()Petr Machata1-9/+18
A mirror-to-vlan test that's coming next needs to install the trap unconditionally. Therefore extract from slow_path_trap_{,un}install() a more generic functions trap_install() and trap_uninstall(), and covert the former two to conditional wrappers around these. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: mirror_gre_lib: Support VLANPetr Machata1-0/+34
Add full_test_span_gre_dir_vlan_ips() and full_test_span_gre_dir_vlan() to support mirror-to-gre tests that involve VLAN. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: lib: Support VLAN devicesPetr Machata1-0/+25
Add vlan_create() and vlan_destroy() to manage VLAN netdevices. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: Add $h3's clsact to mirror_topo_lib.shPetr Machata3-4/+2
Having a clsact qdisc on $h3 is useful in several tests, and will be useful in more tests to come. Move the registration from all the tests that need it into the topology file itself. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: mirror_gre_lib: Extract generic functionsPetr Machata2-31/+64
For non-GRE mirroring tests, a functions along the lines of do_test_span_gre_dir_ips() and test_span_gre_dir_ips() are necessary, but such that they don't assume tunnels are involved. Extract the code from mirror_gre_lib.sh to mirror_lib.sh and convert to just use a given device without assuming it's named "h3-$tundev". Convert the two above-mentioned functions to wrappers that pass along the correct device name. Add test_span_dir() and fail_test_span_dir() to round up the API for use by following patches. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24selftests: forwarding: Split mirror_gre_topo_lib.shPetr Machata2-44/+108
Move generic parts of mirror_gre_topo_lib.sh into a new file mirror_topo_lib.sh. Reuse the functions in GRE topo, adding the tunnel devices as necessary. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller90-812/+5213
Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-05-24 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc). 2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers. 3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit. 4) Jiong Wang adds support for indirect and arithmetic shifts to NFP 5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible. 6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions. 7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT. 8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events. 9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24Merge branch 'ibmvnic-Failover-hardening'David S. Miller2-23/+202
Thomas Falcon says: ==================== ibmvnic: Failover hardening Introduce additional transport event hardening to handle events during device reset. In the driver's current state, if a transport event is received during device reset, it can cause the device to become unresponsive as invalid operations are processed as the backing device context changes. After a transport event, the device expects a request to begin the initialization process. If the driver is still processing a previously queued device reset in this state, it is likely to fail as firmware will reject any commands other than the one to initialize the client driver's Command-Response Queue. Instead of failing and becoming dormant, the driver will make one more attempt to recover and continue operation. This is achieved by setting a state flag, which if true will direct the driver to clean up all allocated resources and perform a hard reset in an attempt to bring the driver back to an operational state. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-24ibmvnic: Introduce hard reset recoveryThomas Falcon2-4/+98
Introduce a recovery hard reset to handle reset failure as a result of change of device context following a transport event, such as a backing device failover or partition migration. These operations reset the device context to its initial state. If this occurs during a reset, any initialization commands are likely to fail with an invalid state error as backing device firmware requests reinitialization. When this happens, make one more attempt by performing a hard reset, which frees any resources currently allocated and performs device initialization. If a transport event occurs during a device reset, a flag is set which will trigger a new hard reset following the completionof the current reset event. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>