path: root/drivers/net/hyperv
Age | Commit message | Author | Files | Lines
2018-03-23 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller | 4 | -190/+225
Fun set of conflict resolutions here...
For the mac80211 stuff, these were fortunately just parallel adds. Trivially resolved.
In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the function phy_disable_interrupts() earlier in the file, whilst in 'net-next' the phy_error() call from this function was removed.
In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the 'rt_table_id' member of rtable collided with a bug fix in 'net' that added a new struct member "rt_mtu_locked" which needs to be copied over here.
The mlxsw driver conflict consisted of net-next separating the span code and definitions into separate files, whilst a 'net' bug fix made some changes to that moved code.
The mlx5 infiniband conflict resolution was quite non-trivial, the RDMA tree's merge commit was used as a guide here, and here are their notes:
====================
Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based.
Conflicts:
  drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch.
Updates:
  drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch
  drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 | hv_netvsc: common detach logic | Stephen Hemminger | 4 | -143/+173
Make a common function for detaching the internals of the device during changes to MTU and RSS. Make sure no more packets are transmitted and all packets have been received before doing device teardown. Change the wait logic to be common and use usleep_range(). Change the transmit enabling logic so that transmit queues are disabled while the lower device is being changed, and enabled only after subchannels are set up. This avoids an issue where a packet could be sent while a subchannel was not yet initialized. Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 | hv_netvsc: change GPAD teardown order on older versions | Stephen Hemminger | 1 | -1/+6
On older versions of Windows, the host ignores messages after the vmbus channel is closed. Work around this by doing what Windows does and sending the teardown before close on older versions of the NVSP protocol. Reported-by: Mohammed Gamal <mgamal@redhat.com> Fixes: 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
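The ordering difference described here can be sketched as a small, self-contained C model. It is only an illustration: the step names, the `legacy_host` flag, and the function are hypothetical, and the exact NVSP version cut-off is not stated in this log, so it is left to the caller. The default order (revoke, close, GPADL teardown) is taken from the "netvsc_teardown_gpadl() split" entry further down this log.

```c
#include <stdbool.h>

/* Hypothetical step names; the driver's real functions are not reproduced here. */
enum teardown_step {
	STEP_REVOKE_BUFFERS,	/* send NVSP_MSG1_TYPE_REVOKE_* messages */
	STEP_CLOSE_CHANNEL,	/* close the vmbus channel */
	STEP_TEARDOWN_GPADL,	/* tear down the send/receive buffer GPADLs */
};

#define TEARDOWN_STEPS 3

/*
 * Fill `order` with the teardown sequence.
 * `legacy_host` is an assumed flag meaning "older NVSP protocol version";
 * how the driver decides that is not shown in this log.
 */
static void teardown_order(bool legacy_host,
			   enum teardown_step order[TEARDOWN_STEPS])
{
	order[0] = STEP_REVOKE_BUFFERS;
	if (legacy_host) {
		/* Older hosts ignore messages after close: tear down first. */
		order[1] = STEP_TEARDOWN_GPADL;
		order[2] = STEP_CLOSE_CHANNEL;
	} else {
		order[1] = STEP_CLOSE_CHANNEL;
		order[2] = STEP_TEARDOWN_GPADL;
	}
}
```

Keeping the revoke step first in both branches matches the two commit messages: only the relative order of channel close and GPADL teardown changes.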
2018-03-22 | hv_netvsc: use RCU to fix concurrent rx and queue changes | Stephen Hemminger | 2 | -35/+21
The receive processing may continue to happen while the internal network device state is in RCU grace period. The internal RNDIS structure is associated with the internal netvsc_device structure; both have the same RCU lifetime. Defer freeing all associated parts until after grace period. Fixes: 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 | hv_netvsc: disable NAPI before channel close | Stephen Hemminger | 1 | -4/+4
This makes sure that no CPU is still processing packets when the channel is closed. Fixes: 76bb5db5c749 ("netvsc: fix use after free on module removal") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17 | hv_netvsc: add trace points | Stephen Hemminger | 5 | -2/+220
This adds tracepoints to the driver, which have proved useful in debugging startup and shutdown race conditions. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17 | hv_netvsc: pass netvsc_device to rndis halt | Stephen Hemminger | 1 | -4/+3
The caller has a valid pointer, pass it to rndis_filter_halt_device and avoid any possible RCU races here. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08 | hv_netvsc: fix locking during VF setup | Stephen Hemminger | 1 | -0/+4
The dev_uc/mc_sync calls need to have the device address list locked. This was spotted by running with lockdep enabled. Fixes: bee9d41b37ea ("hv_netvsc: propagate rx filters to VF") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08 | hv_netvsc: fix locking for rx_mode | Stephen Hemminger | 1 | -3/+8
The rx_mode operation handler is different from other callbacks in that it is not always called with rtnl held. Therefore use RCU to ensure that references are valid. Fixes: bee9d41b37ea ("hv_netvsc: propagate rx filters to VF") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08 | hv_netvsc: avoid repeated updates of packet filter | Stephen Hemminger | 2 | -2/+7
The netvsc driver can get repeated calls to netvsc_rx_mode during network setup; each of these calls ends up scheduling the lower layers to update the packet filter. This update requires a request/response to the host. So avoid doing this if we already know that the correct packet filter value is set. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08 | hv_netvsc: fix filter flags | Stephen Hemminger | 1 | -2/+2
The recent change to not always enable all multicast and broadcast was broken; meant to set filter, not change flags. Fixes: 009f766ca238 ("hv_netvsc: filter multicast/broadcast") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: propagate rx filters to VF | Stephen Hemminger | 1 | -4/+36
The netvsc device should propagate filters to the SR-IOV VF device (if present). The flags also need to be propagated to the VF device as well. This only really matters on local Hyper-V since Azure does not support multiple addresses. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: filter multicast/broadcast | Stephen Hemminger | 1 | -8/+12
The netvsc driver was always enabling all multicast and broadcast even if netdevice flag had not enabled it. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: defer queue selection to VF | Stephen Hemminger | 1 | -2/+13
When VF is used for accelerated networking it will likely have more queues (and different policy) than the synthetic NIC. This patch defers the queue policy to the VF so that all the queues can be used. This impacts workloads like locally generated UDP. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: use napi_schedule_irqoff | Stephen Hemminger | 1 | -1/+1
Since the netvsc_channel_cb is already called in interrupt context from vmbus, there is no need to do irqsave/restore. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: fix race in napi poll when rescheduling | Stephen Hemminger | 1 | -2/+3
There is a race between napi_reschedule and re-enabling interrupts which could lead to missed host interrupts. This occurs when interrupts are re-enabled (hv_end_read) and the vmbus irq callback (netvsc_channel_cb) has already scheduled NAPI. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: cancel subchannel setup before halting device | Stephen Hemminger | 1 | -0/+3
Block setup of multiple channels earlier in the teardown process. This avoids possible races between halt and subchannel initialization. Suggested-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: fix error unwind handling if vmbus_open fails | Stephen Hemminger | 1 | -1/+1
Need to delete NAPI association if vmbus_open fails. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: only wake transmit queue if link is up | Stephen Hemminger | 1 | -4/+3
Don't wake transmit queues if link is not up yet. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 | hv_netvsc: avoid retry on send during shutdown | Stephen Hemminger | 1 | -17/+7
Change the initialization order so that the device is ready to transmit (ie connect vsp is completed) before setting the internal reference to the device with RCU. This avoids any races on initialization and prevents retry issues on shutdown. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 | hv_netvsc: Use the num_online_cpus() for channel limit | Haiyang Zhang | 1 | -9/+2
Since we no longer localize channel/CPU affiliation within one NUMA node, num_online_cpus() is used as the cap on the number of channels, instead of the number of processors in a NUMA node. This patch allows a bigger range for tuning the number of channels. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 | hv_netvsc: empty current transmit aggregation if flow blocked | Stephen Hemminger | 4 | -10/+17
If the transmit queue is known full, then don't keep aggregating data. And the cp_partial flag which indicates that the current aggregation buffer is full can be folded in to avoid more conditionals. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 | hv_netvsc: remove open_cnt reference count | Stephen Hemminger | 3 | -10/+4
There is only ever a single instance of network device object referencing the internal rndis object. Therefore the open_cnt atomic is not necessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 | hv_netvsc: pass netvsc_device to receive callback | Stephen Hemminger | 3 | -14/+8
The netvsc_receive_callback function was using RCU to find the appropriate underlying netvsc_device. Since the calling function already had that pointer, this was unnecessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 | hv_netvsc: simplify function args in receive status path | Stephen Hemminger | 4 | -19/+7
The caller (netvsc_receive) already has the net device pointer, and should just pass that to functions rather than the hyperv device. This eliminates several impossible error paths in the process. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 | hv_netvsc: track memory allocation failures in ethtool stats | Stephen Hemminger | 2 | -2/+4
When an skb cannot be allocated, update ethtool statistics rather than rx_dropped, which is intended for netif_receive. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 | hv_netvsc: copy_to_send buf can be void | Stephen Hemminger | 1 | -14/+8
Since the only caller does not care about the return value. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 | hv_netvsc: Fix the TX/RX buffer default sizes | Haiyang Zhang | 2 | -5/+12
The values were not computed correctly. There is no significant visible impact, though. The intended size of the RX buffer is 16 MB, and the default slot size is 1728. So, NETVSC_DEFAULT_RX should be 16*1024*1024 / 1728 = 9709. The intended size of the TX buffer is 1 MB, and the slot size is 6144. So, NETVSC_DEFAULT_TX should be 1024*1024 / 6144 = 170. The patch puts the formula directly into the macro, and moves them to hyperv_net.h, together with related macros. Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
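The arithmetic in this message can be checked with a few macros that carry the formula rather than a precomputed constant. This is a minimal sketch: the buffer and slot sizes come from the commit message, but the macro names below are made up for illustration and are not the ones in hyperv_net.h.

```c
/* Sizes quoted in the commit message; macro names are hypothetical. */
#define RX_BUF_BYTES	(16u * 1024u * 1024u)	/* intended RX buffer: 16 MB */
#define RX_SLOT_BYTES	1728u			/* default RX slot size */
#define TX_BUF_BYTES	(1024u * 1024u)		/* intended TX buffer: 1 MB */
#define TX_SLOT_BYTES	6144u			/* TX slot size */

/* Integer division truncates, so a partial trailing slot is not counted. */
#define DEFAULT_RX_SLOTS	(RX_BUF_BYTES / RX_SLOT_BYTES)
#define DEFAULT_TX_SLOTS	(TX_BUF_BYTES / TX_SLOT_BYTES)
```

Writing the division into the macro keeps the slot count in step with the sizes if either of them is changed later.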
2017-12-13 | hv_netvsc: Fix the receive buffer size limit | Haiyang Zhang | 2 | -2/+9
The max should be 31 MB on hosts with NVSP version > 2. On legacy hosts (NVSP version <= 2) only a 15 MB receive buffer is allowed, otherwise the buffer request will be rejected by the host, resulting in the vNIC not coming up. The NVSP version is only available after negotiation. So, we add the limit checking for legacy hosts in netvsc_init_buf(). Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
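The version-dependent limit might look roughly like the clamp below. This is a sketch under assumptions: the function and macro names are hypothetical, and the NVSP version is modelled as a plain small integer, which may not match how the driver encodes it.

```c
#include <stdint.h>

#define MB			(1024u * 1024u)
#define RX_BUF_MAX_BYTES	(31u * MB)	/* hosts with NVSP version > 2 */
#define RX_BUF_MAX_LEGACY_BYTES	(15u * MB)	/* legacy hosts, NVSP version <= 2 */

/*
 * Clamp a requested receive buffer size to what the host will accept.
 * `nvsp_version` is assumed to be the negotiated version as a simple
 * integer (1, 2, 3, ...), known only after negotiation has completed.
 */
static uint32_t clamp_rx_buf_size(uint32_t requested, unsigned int nvsp_version)
{
	uint32_t limit = (nvsp_version <= 2) ? RX_BUF_MAX_LEGACY_BYTES
					     : RX_BUF_MAX_BYTES;

	return requested > limit ? limit : requested;
}
```

For example, a 16 MB request would be reduced to 15 MB on a version 2 host and left unchanged on a newer one.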
2017-12-03 | hv_netvsc: optimize initialization of RNDIS header | Stephen Hemminger | 1 | -31/+26
The memset of the whole maximum possible RNDIS header is unnecessary. For the main part of the header use a structure assignment. No need to memset the whole per packet info. Instead rely on caller to set what it wants. Also get rid of cast to void and signed/unsigned conversion. Now return pointer to per packet data (rather than the header) which simplifies use by code setting up the packet data. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 | hv_netvsc: use reciprocal divide to speed up percent calculation | Stephen Hemminger | 4 | -26/+21
Every packet sent checks the available ring space. The calculation can be sped up by using reciprocal divide, which is a multiplication. Since ring_size can only be configured by module parameter, it doesn't have to be passed around everywhere. Also it should be unsigned since it is a number of pages. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
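To illustrate the idea of trading a per-packet division for a multiplication, here is a user-space sketch of a multiply-and-shift reciprocal divide. It is modelled on the general technique of dividing by an invariant integer and is not the kernel header itself; all names (`struct recip`, `recip_value`, `recip_div`, `ring_avail_percent`) are illustrative. The reciprocal is computed once for the fixed ring size, and each packet then costs only a multiply and two shifts.

```c
#include <stdint.h>

/* Precomputed reciprocal of a fixed divisor (names are illustrative). */
struct recip {
	uint32_t m;
	uint8_t sh1, sh2;
};

/* Position of the highest set bit, 1-based; 0 when x == 0. */
static int fls32(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* One-time setup for divisor d; d must be at least 1. */
static struct recip recip_value(uint32_t d)
{
	struct recip r;
	int l = fls32(d - 1);
	uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l < 1 ? l : 1);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* a / d using only a multiply and shifts. */
static uint32_t recip_div(uint32_t a, struct recip r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}

/*
 * Percentage of the ring that is free. The caller must keep
 * avail_bytes * 100 within 32 bits (an assumption of this sketch).
 */
static uint32_t ring_avail_percent(uint32_t avail_bytes, struct recip ring_recip)
{
	return recip_div(avail_bytes * 100u, ring_recip);
}
```

Because the ring size is fixed after module load, `recip_value()` only needs to run once, which is what makes the per-packet path cheaper than a hardware divide.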
2017-12-03 | hv_netvsc: replace divide with mask when computing padding | Stephen Hemminger | 1 | -1/+2
Packet alignment is always a power of 2, therefore the modulus can be replaced with a faster AND operation. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
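A minimal sketch of the mask trick follows. The function name and the choice to return 0 for an already aligned length are assumptions of this example, not necessarily what the driver does.

```c
#include <stdint.h>

/*
 * Bytes of padding needed to bring `len` up to a multiple of `align`.
 * `align` must be a power of 2, so `len & (align - 1)` gives the same
 * result as `len % align` without a division.
 */
static uint32_t pad_to_align(uint32_t len, uint32_t align)
{
	uint32_t remain = len & (align - 1);

	return remain ? align - remain : 0;
}
```

For instance, `pad_to_align(13, 8)` yields 3, since 13 & 7 is 5 and 8 - 5 is 3. The equivalence with `%` only holds for power-of-2 alignments.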
2017-12-03 | hv_netvsc: don't need local xmit_more | Stephen Hemminger | 1 | -2/+1
Since skb is always non-NULL in the copy portion of netvsc_send do not need local variable. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03 | hv_netvsc: drop unused macros | Stephen Hemminger | 1 | -26/+0
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-16 | hv_netvsc: preserve hw_features on mtu/channels/ringparam changes | Vitaly Kuznetsov | 3 | -59/+83
rndis_filter_device_add() is called both from netvsc_probe() when we initially create the device and from set channels/mtu/ringparam routines where we basically remove the device and add it back. hw_features is reset in rndis_filter_device_add() and filled with host data. However, we lose all additional flags which are set outside of the driver, e.g. register_netdevice() adds NETIF_F_SOFT_FEATURES and many others. Unfortunately, calls to rndis_{query_hwcaps(), _set_offload_params()} cannot be avoided on every RNDIS reset: host expects us to set required features explicitly. Moreover, in theory hardware capabilities can change and we need to reflect the change in hw_features. Reset net->hw_features bits according to host data in rndis_netdev_set_hwcaps(), clear corresponding feature bits from net->features in case some features went missing (will never happen in real life I guess but let's be consistent). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08 | hv_netvsc: hide warnings about uninitialized/missing rndis device | Vitaly Kuznetsov | 1 | -2/+2
Hyper-V hosts are known to send RNDIS messages even after we halt the device in rndis_filter_halt_device(). Remove user visible messages as they are not really useful. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08 | hv_netvsc: netvsc_teardown_gpadl() split | Vitaly Kuznetsov | 1 | -33/+36
It was found that in some cases the host refuses to tear down GPADL for send/receive buffers (probably when some work with these buffers is scheduled or ongoing). Change the teardown logic to be: 1) Send NVSP_MSG1_TYPE_REVOKE_* messages 2) Close the channel 3) Teardown GPADLs. This seems to work reliably. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 | hv_netvsc: Set tx_table to equal weight after subchannels open | Haiyang Zhang | 1 | -0/+3
In some cases, like internal vSwitch, the host doesn't provide send indirection table updates. This patch sets the table to be equal weight after subchannels are all open. Otherwise, all workload will be on one TX channel. As tested, this patch has largely increased the throughput over internal vSwitch. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
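One way to picture "equal weight" is a round-robin fill of the send indirection table, so that every open channel owns the same number of slots. The sketch below assumes a 16-entry table and uses made-up names (`SEND_TAB_SIZE`, `tx_table_equal_weight`); it only illustrates the idea, not the driver's actual code.

```c
#include <stdint.h>

#define SEND_TAB_SIZE 16	/* assumed table length for this sketch */

/*
 * Spread table slots evenly across `num_chn` channels (num_chn >= 1).
 * A hash of the flow would then index this table to pick a TX channel.
 */
static void tx_table_equal_weight(uint16_t tx_table[SEND_TAB_SIZE],
				  uint32_t num_chn)
{
	for (unsigned int i = 0; i < SEND_TAB_SIZE; i++)
		tx_table[i] = (uint16_t)(i % num_chn);
}
```

With 4 channels, each channel index appears in 4 of the 16 slots, so a uniformly distributed hash would spread flows evenly instead of leaving them all on channel 0.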
2017-10-14 | hv_netvsc: Add initialization of tx_table in netvsc_device_add() | Haiyang Zhang | 1 | -0/+3
tx_table is part of the private data of kernel net_device. It is only zero-ed out when allocating net_device. We may recreate netvsc_device w/o recreating net_device, so the private netdev data, including tx_table, are not zeroed. It may contain channel numbers for the older netvsc_device. This patch adds initialization of tx_table each time we recreate netvsc_device. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 | hv_netvsc: Rename tx_send_table to tx_table | Haiyang Zhang | 3 | -4/+4
Simplify the variable name: tx_send_table Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 | hv_netvsc: Rename ind_table to rx_table | Haiyang Zhang | 3 | -6/+6
Rename this variable because it is the Receive indirection table. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 | hv_netvsc: Add ethtool handler to set and get TCP hash levels | Haiyang Zhang | 1 | -1/+24
The patch supports the options to switch TCP hash level between L3 and L4 by ethtool command. TCP over IPv4 and v6 can be set differently. The default hash level is L4. We currently only allow switching TX hash level from within the guests.
For example, for TCP over IPv4 on eth0:
To include TCP port numbers in hashing:
    ethtool -N eth0 rx-flow-hash tcp4 sdfn
To exclude TCP port numbers in hashing:
    ethtool -N eth0 rx-flow-hash tcp4 sd
To show TCP hash level:
    ethtool -n eth0 rx-flow-hash tcp4
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 | hv_netvsc: Change the hash level variable to bit flags | Haiyang Zhang | 2 | -25/+59
This simplifies the logic and makes it easier to add more options. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 | net: Add extack to upper device linking | David Ahern | 1 | -1/+1
Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01 | hv_netvsc: report stop_queue and wake_queue | Simon Xiao | 3 | -2/+14
Report the numbers of events for stop_queue and wake_queue in ethtool stats.
Example:
    ethtool -S eth0
    NIC statistics:
    ...
    stop_queue: 7
    wake_queue: 7
    ...
Signed-off-by: Simon Xiao <sixiao@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 | hv_netvsc: Fix the real number of queues of non-vRSS cases | Haiyang Zhang | 1 | -0/+6
For older hosts without multi-channel (vRSS) support, and some error cases, we still need to set the real number of queues to one. This patch adds this missing setting. Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 | hv_netvsc: make const array ver_list static, reduces object code size | Colin Ian King | 1 | -1/+1
Don't populate const array ver_list on the stack, instead make it static. Makes the object code smaller by over 400 bytes:
Before:
   text    data     bss     dec     hex filename
  18444    3168     320   21932    55ac drivers/net/hyperv/netvsc.o
After:
   text    data     bss     dec     hex filename
  17950    3224     320   21494    53f6 drivers/net/hyperv/netvsc.o
(gcc 6.3.0, x86-64)
Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-21 | hv_netvsc: fix send buffer failure on MTU change | Alex Ng | 3 | -5/+12
If MTU is changed the host would reject the send buffer change. This problem is the result of a recent change to allow changing the send buffer size. Every time we change the MTU, we store the previous net_device section count before destroying the buffer, but we don't store the previous section size. When we reinitialize the buffer, its size is calculated by multiplying the previous count and previous size. Since we continuously increase the MTU, the host returns us a decreasing count value while the section size is reinitialized to 1728 bytes every time. This eventually leads to a condition where the calculated buf_size is so small that the host rejects it. Fixes: 8b5327975ae1 ("netvsc: allow controlling send/recv buffer size") Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-15 | netvsc: increase default receive buffer size | Stephen Hemminger | 1 | -1/+1
The default receive buffer size was reduced by a recent change to a value which was appropriate for 10G and Windows Server 2016. But the value is too small for full performance with 40G on Azure. Increase the default back to the maximum supported by the host. Fixes: 8b5327975ae1 ("netvsc: allow controlling send/recv buffer size") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-11 | hv_netvsc: avoid unnecessary wakeups on subchannel creation | Stephen Hemminger | 1 | -2/+2
Only need to wakeup the initiator after all sub-channels are opened. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>