aboutsummaryrefslogtreecommitdiffstats
path: root/net/packet (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-01-15net: phy: remove parameter new_link from phy_mac_interrupt()Heiner Kallweit3-11/+8
I see two issues with parameter new_link: 1. It's not needed. See also phy_interrupt(), works w/o this parameter. phy_mac_interrupt sets the state to PHY_CHANGELINK and triggers the state machine which then calls phy_read_status. And phy_read_status updates the link state. 2. phy_mac_interrupt is used in interrupt context and getting the link state may sleep (at least when having to access the PHY registers via MDIO bus). So let's remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15tipc: fix a potental access after delete in tipc_sk_join()Jon Maloy1-0/+1
In commit d12d2e12cec2 "tipc: send out join messages as soon as new member is discovered") we added a call to the function tipc_group_join() without considering the case that the preceding tipc_sk_publish() might have failed, and the group item already deleted. We fix this by returning from tipc_sk_join() directly after the failed tipc_sk_publish. Reported-by: syzbot+e3eeae78ea88b8d6d858@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15net: dsa: lan9303: check error value from devm_gpiod_get_optional()Phil Reid1-4/+10
devm_gpiod_get_optional() can return an error in addition to a NULL ptr. Check for error and propagate that to the probe function. Check return value in probe. This will now handle EPROBE_DEFER for the reset gpio. Signed-off-by: Phil Reid <preid@electromag.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15net: dsa: lan9303: make lan9303_handle_reset() a void functionPhil Reid1-7/+3
lan9303_handle_reset never returns anything other than success. So there's not need for it to return an error code. Signed-off-by: Phil Reid <preid@electromag.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15net: phy: Have __phy_modify return 0 on successAndrew Lunn1-7/+6
__phy_modify would return the old value of the register before it was modified. Thus on success, it does not return 0, but a positive value. Thus functions using phy_modify, which is a wrapper around __phy_modify, can start returning > 0 on success, rather than 0. As a result, breakage has been noticed in various places, where 0 was assumed. Code inspection does not find any current location where the return of the old value is currently used. So have __phy_modify return 0 on success. When there is a real need for the old value, either a new accessor can be added, or an additional parameter passed. Fixes: fea23fb591cc ("net: phy: convert read-modify-write to phy_modify()") Fixes: 2b74e5be17d2 ("net: phy: add phy_modify() accessor") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14mlxsw: spectrum: qdiscs: Support stats for PRIO qdiscNogah Frankel1-0/+89
Support basic stats for PRIO qdisc, which includes tx packets and bytes count, drops count and backlog size. The rest of the stats are irrelevant for this qdisc offload. Since backlog is not only incremental but reflecting momentary value, in case of a qdisc that stops being offloaded but is not destroyed, backlog value needs to be updated about the un-offloading. For that reason an unoffload function is being added to the ops struct. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14mlxsw: spectrum: qdiscs: Support PRIO qdisc offloadNogah Frankel3-0/+86
Add support for offloading PRIO qdisc as root qdisc. The support is for up to 8 bands. Routed packets priority is determined by the DSCP field with the default translations. Bridged packets priority is determined by the PCP field, if exist, otherwise it is set to 0. Since both options have only priorities 0-7, higher priorities mapping are being ignored. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14net: sch: prio: Add offload ability to PRIO qdiscNogah Frankel3-0/+85
Add the ability to offload PRIO qdisc by using ndo_setup_tc. There are three commands for PRIO offloading: * TC_PRIO_REPLACE: handles set and tune * TC_PRIO_DESTROY: handles qdisc destroy * TC_PRIO_STATS: updates the qdiscs counters (given as reference) Like RED qdisc, the indication of whether PRIO is being offloaded is being set and updated as part of the dump function. It is so because the driver could decide to offload or not based on the qdisc parent, which could change without notifying the qdisc. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14mlxsw: spectrum_router: Configure default routing priorityYuval Mintz1-0/+24
When routing ip packets, the kernel is setting the SKB's priority based on the tos field of the packet. Imitate this behavior in the mlxsw router, having the internal switch priority of a routed packet determined according to its DS field. Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14mlxsw: reg: add rdpm registerYuval Mintz2-1/+38
Add rdpm definition - router DSCP to priority mapping register. This register will be utilized later to align the default mapping between packet DSCP and switch-priority to the kernel's mapping between packet priority and skb priority. This is the first non-bit indexed register where the entries are arranged in descending order, i.e., entry at offset 0 matches configuration for dscp[63]. As a result, the item's step is converted into a signed variable to support descending arrays [where step would be negative]. Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14net: dsa: mv88e6xxx: Decode VTU problem interruptAndrew Lunn4-2/+90
When there is a problem with the VTU, an interrupt can be generated. Trap this interrupt and decode the registers to determine what the problem was, then log the error. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14net: dsa: mv88e6xxx: Decode ATU problem interruptAndrew Lunn4-2/+103
When there is a problem with the ATU, an interrupt can be generated. Trap this interrupt and decode the registers to determine what the problem was, then log the error. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14mlxsw: spectrum_router: Add support for IPv6 non-equal-cost multipathIdo Schimmel1-5/+8
Since commit eb789980d0aa ("mlxsw: spectrum_router: Populate adjacency entries according to weights") the driver includes support for non-equal-cost multipath, but IPv4 nexthops were the only user. Now that the kernel supports weighted IPv6 nexthops, we can extend the driver to support it as well. This is done by assigning each nexthop its configured weight, so that it will be populated accordingly in the device's adjacency table. The `weight` parameter is also taken into account when comparing nexthop groups in order not to consolidate non-identical groups. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14net: netsec: use dma_addr_t for storing dma addressArnd Bergmann1-7/+7
On targets that have different sizes for phys_addr_t and dma_addr_t, we get a type mismatch error: drivers/net/ethernet/socionext/netsec.c: In function 'netsec_alloc_dring': drivers/net/ethernet/socionext/netsec.c:970:9: error: passing argument 3 of 'dma_zalloc_coherent' from incompatible pointer type [-Werror=incompatible-pointer-types] The code is otherwise correct, as the address is never actually used as a physical address but only passed into a DMA register. For consistently, I'm changing the variable name as well, to clarify that this is a DMA address. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12ixgbevf: Fix kernel-doc format warningsTony Nguyen2-5/+18
Recent checks added for formatting kernel-doc comments are causing warnings if W= is run with a non-zero value. This patch fixes function comments to resolve warnings when W=1 is used. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe: Fix kernel-doc format warningsTony Nguyen12-39/+99
Recent checks added for formatting kernel-doc comments are causing warnings if W= is run with a non-zero value. This patch fixes function comments to resolve warnings when W=1 is used. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12net: Cap number of queues even with accel_privAlexander Duyck1-2/+1
With the recent fix to ixgbe we can cap the number of queues always regardless of if accel_priv is being used or not since the actual number of queues are being reported via real_num_tx_queues. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe: Fix handling of macvlan Tx offloadAlexander Duyck2-10/+14
This update makes it so that we report the actual number of Tx queues via real_num_tx_queues but are still restricted to RSS on only the first pool by setting num_tc equal to 1. Doing this locks us into only having the ability to setup XPS on the queues in that pool, and only those queues should be used for transmitting anything other than macvlan traffic. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe: avoid bringing rings up/down as macvlans are added/removedAlexander Duyck2-58/+72
This change makes it so that instead of bringing rings up/down for various we just update the netdev pointer for the Rx ring and set or clear the MAC filter for the interface. By doing it this way we can avoid a number of races and issues in the code as things were getting messy with the macvlan clean-up racing with the interface clean-up to bring the rings down on shutdown. With this change we opt to leave the rings owned by the PF interface for both Tx and Rx and just direct the packets once they are received to the macvlan netdev. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe: Do not manipulate macvlan Tx queues when performing macvlan offloadAlexander Duyck2-108/+25
We should not be stopping/starting the upper devices Tx queues when handling a macvlan offload. Instead we should be stopping and starting traffic on our own queues. In order to prevent us from doing this I am updating the code so that we no longer change the queue configuration on the upper device, nor do we update the queue_index on our own device. Instead we can just use the queue index for our local device and not update the netdev in the case of the transmit rings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe/fm10k: Record macvlan stats instead of Rx queue for macvlan offloaded ringsAlexander Duyck2-10/+14
We shouldn't be recording the Rx queue on macvlan offloaded frames since the macvlan is normally brought up as a single queue device, and it will trigger warnings for RPS if we have recorded queue IDs larger than the "real_num_rx_queues" value recorded for the device. Instead we should be recording the macvlan statistics since we are bypassing the normal macvlan statistics that would have been generated by the receive path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe: Don't assume dev->num_tc is equal to hardware TC configAlexander Duyck6-27/+29
The code throughout ixgbe was assuming that dev->num_tc was populated and configured with the driver, when in fact this can be configured via mqprio without any hardware coordination other than restricting us to the real number of Tx queues we advertise. Instead of handling things this way we need to keep a local copy of the number of TCs in use so that we don't accidentally pull in the TC configuration from mqprio when it is configured in software mode. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe: Default to 1 pool always being allocatedAlexander Duyck2-5/+3
We might as well configure the limit to default to 1 pool always for the interface. This accounts for the fact that the PF counts as 1 pool if SR-IOV is enabled, and in general we are always running in 1 pool mode when RSS or DCB is enabled as well, though we don't need to actually evaluate any of the VMDq features in those cases. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12ixgbe: Assume provided MAC filter has been verified by macvlanAlexander Duyck1-4/+8
The macvlan driver itself will validate the MAC address that is configured for a given interface. There is no need for us to verify it again. Instead we should be checking to verify that we actually allocate the filter and have not run out of resources to configure a MAC rule in our filter table. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-12net: hns3: check for NULL function pointer in hns3_nic_set_featuresJian Shen1-2/+4
It's necessary to check hook whether being defined before calling, improve the reliability. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: add feature check when feature changedJian Shen1-12/+15
Local variable "changed" was defined to indicates features changed, but was used only for feature NETIF_F_HW_VLAN_CTAG_RX. Add checking for other features. Fixes: 052ece6dc19c ("net: hns3: add ethtool related offload command") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: add int_gl_idx setup for TX and RX queuesFuyun Liang3-0/+21
If the int_gl_idx does not be set, the default interrupt coalesce index is 0. The TX queues and the RX queues will both use the GL0 as the interrupt coalesce GL switch. But it should be GL1 for TX queues and GL0 for RX queues. This patch adds the int_gl_idx setup for TX queues and RX queues. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: change the unit of GL value macroFuyun Liang1-4/+4
Previously, driver used 2us as the GL unit. The time unit ethtool command "-c" and "-C" use is 1us, so now the GL unit driver uses actually is 1us. This patch changes the unit of GL value macro from 2us to 1us. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: remove unused GL setup functionFuyun Liang1-12/+0
Since the TX GL and the RX GL need to be set separately, hns3_set_vector_coalesc_gl() has been replaced with hns3_set_vector_coalesce_rx_gl() and hns3_set_vector_coalesce_tx_gl(). This patch removes hns3_set_vector_coalesc_gl(). Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: refactor GL update functionFuyun Liang1-19/+16
The GL update function uses the max GL value between tx_int_gl and rx_int_gl to set both new tx_int_gl and new rx_int_gl. Therefore, User can not enable TX GL self-adaptive or RX GL self-adaptive individually. This patch refactors the code to update the TX GL and the RX GL separately, making user can enable TX GL self-adaptive or RX GL self-adaptive individually. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: refactor interrupt coalescing init functionFuyun Liang1-9/+20
In the hardware, the coalesce configurable registers include GL0, GL1, GL2. In the driver, the TX queues use the register GL1 and the RX queues use the register GL0. This function initializes the configuration of the interrupt coalescing, but does not distinguish between the TX direction and the RX direction. It will cause some confusion. This patch refactors the function to initialize the TX GL and the RX GL separately. And the initialization of related variables also is added to this patch. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: add ethtool_ops.set_coalesce support to PFFuyun Liang3-4/+188
This patch adds ethtool_ops.set_coalesce support to PF. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: add ethtool_ops.get_coalesce support to PFFuyun Liang3-0/+40
This patch adds ethtool_ops.get_coalesce support to PF. Whilst our hardware supports per queue values, external interfaces support only a single shared value. As such we use the values for queue 0. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: remove TSO config command from VF driverPeng Li2-28/+0
Only main PF can config TSO MSS length according to hardware. This patch removes TSO config command from VF driver. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12net: hns3: add ethtool_ops.get_channels support for VFPeng Li2-0/+31
This patch supports the ethtool's get_channels() for VF. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11net: phy: mdio-bcm-unimac: fix potential NULL dereference in unimac_mdio_probe()Wei Yongjun1-0/+2
platform_get_resource() may fail and return NULL, so we should better check it's return value to avoid a NULL pointer dereference a bit later in the code. This is detected by Coccinelle semantic patch. @@ expression pdev, res, n, t, e, e1, e2; @@ res = platform_get_resource(pdev, t, n); + if (!res) + return -EINVAL; ... when != res == NULL e = devm_ioremap(e1, res->start, e2); Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11net: socionext: Fix error return code in netsec_netdev_open()Wei Yongjun1-0/+1
Fix to return error code -ENODEV from the of_phy_connect() error handling case instead of 0, as done elsewhere in this function. Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11net: socionext: include linux/io.h to fix buildArnd Bergmann1-0/+1
I ran into a randconfig build failure: drivers/net/ethernet/socionext/netsec.c: In function 'netsec_probe': drivers/net/ethernet/socionext/netsec.c:1583:17: error: implicit declaration of function 'devm_ioremap'; did you mean 'ioremap'? [-Werror=implicit-function-declaration] Including linux/io.h directly fixes this. Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11ibmvnic: Don't handle RX interrupts when not up.Nathan Fontenot1-0/+6
Initiating a kdump via the command line can cause a pending interrupt to be handled by the ibmvnic driver when initializing the sub-CRQ irqs during driver initialization. NIP [d000000000ca34f0] ibmvnic_interrupt_rx+0x40/0xd0 [ibmvnic] LR [c000000008132ef0] __handle_irq_event_percpu+0xa0/0x2f0 Call Trace: [c000000047fcfde0] [c000000008132ef0] __handle_irq_event_percpu+0xa0/0x2f0 [c000000047fcfea0] [c00000000813317c] handle_irq_event_percpu+0x3c/0x90 [c000000047fcfee0] [c00000000813323c] handle_irq_event+0x6c/0xd0 [c000000047fcff10] [c0000000081385e0] handle_fasteoi_irq+0xf0/0x250 [c000000047fcff40] [c0000000081320a0] generic_handle_irq+0x50/0x80 [c000000047fcff60] [c000000008014984] __do_irq+0x84/0x1d0 [c000000047fcff90] [c000000008027564] call_do_irq+0x14/0x24 [c00000003c92af00] [c000000008014b70] do_IRQ+0xa0/0x120 [c00000003c92af50] [c000000008002594] hardware_interrupt_common+0x114/0x180 Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11cxgb4: implement ndo_features_checkGanesh Goudar1-0/+19
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11cxgb4: add support for vxlan segmentation offloadGanesh Goudar3-37/+186
add changes to t4_eth_xmit to enable vxlan segmentation offload support. Original work by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11cxgb4: implement udp tunnel callbacksGanesh Goudar3-0/+251
Implement ndo_udp_tunnel_add and ndo_udp_tunnel_del to support vxlan tunnelling. Original work by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11cxgb4: add data structures to support vxlanGanesh Goudar3-0/+208
Add data structures and macros to be used in vxlan offload. Original work by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11samples/bpf: xdp2skb_meta shows transferring info from XDP to SKBJesper Dangaard Brouer3-0/+324
Creating a bpf sample that shows howto use the XDP 'data_meta' infrastructure, created by Daniel Borkmann. Very few drivers support this feature, but I wanted a functional sample to begin with, when working on adding driver support. XDP data_meta is about creating a communication channel between BPF programs. This can be XDP tail-progs, but also other SKB based BPF hooks, like in this case the TC clsact hook. In this sample I show that XDP can store info named "mark", and TC/clsact chooses to use this info and store it into the skb->mark. It is a bit annoying that XDP and TC samples uses different tools/libs when attaching their BPF hooks. As the XDP and TC programs need to cooperate and agree on a struct-layout, it is best/easiest if the two programs can be contained within the same BPF restricted-C file. As the bpf-loader, I choose to not use bpf_load.c (or libbpf), but instead wrote a bash shell scripted named xdp2skb_meta.sh, which demonstrate howto use the iproute cmdline tools 'tc' and 'ip' for loading BPF programs. To make it easy for first time users, the shell script have command line parsing, and support --verbose and --dry-run mode, if you just want to see/learn the tc+ip command syntax: # ./xdp2skb_meta.sh --dev ixgbe2 --dry-run # Dry-run mode: enable VERBOSE and don't call TC+IP tc qdisc del dev ixgbe2 clsact tc qdisc add dev ixgbe2 clsact tc filter add dev ixgbe2 ingress prio 1 handle 1 bpf da obj ./xdp2skb_meta_kern.o sec tc_mark # Flush XDP on device: ixgbe2 ip link set dev ixgbe2 xdp off ip link set dev ixgbe2 xdp obj ./xdp2skb_meta_kern.o sec xdp_mark Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-10Fix a leak in socket(2) when we fail to allocate a file descriptor.Al Viro1-1/+3
Got broken by "make sock_alloc_file() do sock_release() on failures" - cleanup after sock_map_fd() failure got pulled all the way into sock_alloc_file(), but it used to serve the case when sock_map_fd() failed *before* getting to sock_alloc_file() as well, and that got lost. Trivial to fix, fortunately. Fixes: 8e1611e23579 (make sock_alloc_file() do sock_release() on failures) Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-01-10sfc: add bits for 25/50/100G supported/advertised speedsEdward Cree1-0/+12
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10sfc: support the ethtool ksettings API properly so that 25/50/100G worksEdward Cree5-83/+92
Store and handle ethtool link mode masks within the driver instead of just a single u32. However, quite a significant amount of existing code wants to manipulate the masks directly, and thus now uses the first unsigned long (i.e. mask[0]) as though it were a legacy u32 mask. This is ok because all the bits that code is interested in are in the first 32 bits of the mask; but it might be a good idea to change them in future to use the proper bitmap API. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10sfc: basic MCDI mapping of 25/50/100G link speedsEdward Cree1-10/+16
Only handles direct speed setting, not autoneg, because the driver is still trying to pretend it uses the legacy ethtool API which doesn't have advertised/supported bits for 25/50/100G. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10mlxsw: spectrum: qdiscs: Remove qdisc before setting a new oneNogah Frankel1-0/+7
If a qdisc is being replaced by another qdisc of the same type, it can simply override over its configuration. However, if it replaces a qdisc of another type, it needs to be removed before setting the new qdisc. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10mlxsw: spectrum: qdiscs: Create a generic replace functionNogah Frankel1-38/+71
Create a generic qdisc replace function. For that goal, add three functions to the qdisc ops struct: * check_params: Checks if the given parameters are offloadable. * replace: Offload the given parameters. * clean_stats: clean the qdisc stats for the offloaded qdisc. integrate RED offloading into using the new internal replace API. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>