aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/vmxnet3 (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-07-30net: Use skb_frag_off accessorsJonathan Lemon1-1/+1
Use accessor functions for skb fragment's page_offset instead of direct references, in preparation for bvec conversion. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-22net: Use skb accessors in network driversMatthew Wilcox (Oracle)1-4/+3
In preparation for unifying the skb_frag and bio_vec, use the fine accessors which already exist and use skb_frag_t instead of struct skb_frag_struct. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-15vmxnet3: Remove call to memset after dma_alloc_coherentFuqian Huang1-1/+0
In commit 518a2f1925c3 ("dma-mapping: zero memory returned from dma_alloc_*"), dma_alloc_coherent has already zeroed the memory. So memset is not needed. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04vmxnet3: turn off lro when rxcsum is disabledRonak Doshi3-2/+16
Currently, when rx csum is disabled, vmxnet3 driver does not turn off lro, which can cause performance issues if user does not turn off lro explicitly. This patch adds fix_features support which is used to turn off LRO whenever RXCSUM is disabled. Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Rishi Mehta <rmehta@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-02net: ipv4: provide __rcu annotation for ifa_listFlorian Westphal1-6/+13
ifa_list is protected by rcu, yet code doesn't reflect this. Add the __rcu annotations and fix up all places that are now reported by sparse. I've done this in the same commit to not add intermediate patches that result in new warnings. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-08cross-tree: phase out dma_zalloc_coherent()Luis Chamberlain1-4/+4
We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-28/+52
S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17vmxnet3: Replace msleep(1) with usleep_range()YueHaibing2-4/+4
As documented in Documentation/timers/timers-howto.txt, replace msleep(1) with usleep_range(). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14vmxnet3: use DMA memory barriers where requiredhpreg@vmware.com2-2/+24
The gen bits must be read first from (resp. written last to) DMA memory. The proper way to enforce this on Linux is to call dma_rmb() (resp. dma_wmb()). Signed-off-by: Regis Duchesne <hpreg@vmware.com> Acked-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14vmxnet3: set the DMA mask before the first DMA map operationhpreg@vmware.com2-28/+30
The DMA mask must be set before, not after, the first DMA map operation, or the first DMA map operation could in theory fail on some systems. Fixes: b0eb57cb97e78 ("VMXNET3: Add support for virtual IOMMU") Signed-off-by: Regis Duchesne <hpreg@vmware.com> Acked-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19vmxnet3: fix incorrect dereference when rxvlan is disabledRonak Doshi2-6/+15
vmxnet3_get_hdr_len() is used to calculate the header length which in turn is used to calculate the gso_size for skb. When rxvlan offload is disabled, vlan tag is present in the header and the function references ip header from sizeof(ethhdr) and leads to incorrect pointer reference. This patch fixes this issue by taking sizeof(vlan_ethhdr) into account if vlan tag is present and correctly references the ip hdr. Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Guolin Yang <gyang@vmware.com> Acked-by: Louis Luo <llouis@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-20vmxnet3: remove unused flag "rxcsum" from struct vmxnet3_adapterIgor Pylypiv1-2/+0
Signed-off-by: Igor Pylypiv <ipylypiv@silver-peak.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17vmxnet3: use correct flag to indicate LRO featureRonak Doshi2-4/+4
'Commit 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)")' introduced a flag "lro" in structure vmxnet3_adapter which is used to indicate whether LRO is enabled or not. However, the patch did not set the flag and hence it was never exercised. So, when LRO is enabled, it resulted in poor TCP performance due to delayed acks. This issue is seen with packets which are larger than the mss getting a delayed ack rather than an immediate ack, thus resulting in high latency. This patch removes the lro flag and directly uses device features against NETIF_F_LRO to check if lro is enabled. Fixes: 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)") Reported-by: Rachel Lunnon <rachel_lunnon@stormagic.com> Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17vmxnet3: avoid xmit reset due to a race in vmxnet3Ronak Doshi2-7/+10
The field txNumDeferred is used by the driver to keep track of the number of packets it has pushed to the emulation. The driver increments it on pushing the packet to the emulation and the emulation resets it to 0 at the end of the transmit. There is a possibility of a race either when (a) ESX is under heavy load or (b) workload inside VM is of low packet rate. This race results in xmit hangs when network coalescing is disabled. This change creates a local copy of txNumDeferred and uses it to perform ring arithmetic. Reported-by: Noriho Tanaka <ntanaka@vmware.com> Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-01vmxnet3: remove redundant initialization of pointer 'rq'Colin Ian King1-4/+2
Pointer rq is being initialized but this value is never read, it is being updated inside a for-loop. Remove the initialization and move it into the scope of the for-loop. Cleans up clang warning: drivers/net/vmxnet3/vmxnet3_drv.c:2763:27: warning: Value stored to 'rq' during its initialization is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23vmxnet3: repair memory leakNeil Horman1-1/+1
with the introduction of commit b0eb57cb97e7837ebb746404c2c58c6f536f23fa, it appears that rq->buf_info is improperly handled. While it is heap allocated when an rx queue is setup, and freed when torn down, an old line of code in vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0] being set to NULL prior to its being freed, causing a memory leak, which eventually exhausts the system on repeated create/destroy operations (for example, when the mtu of a vmxnet3 interface is changed frequently. Fix is pretty straight forward, just move the NULL set to after the free. Tested by myself with successful results Applies to net, and should likely be queued for stable, please Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-By: boyang@redhat.com CC: boyang@redhat.com CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: David S. Miller <davem@davemloft.net> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-30vmxnet3: increase default rx ring sizesShrikrishna Khare1-4/+4
There are several reasons for increasing the receive ring sizes: 1. The original ring size of 256 was chosen about 10 years ago when vmxnet3 was first created. At that time, 10Gbps Ethernet was not prevalent and servers were dominated by 1Gbps Ethernet. Now 10Gbps is common place, and higher bandwidth links -- 25Gbps, 40Gbps, 50Gbps -- are starting to appear. 256 Rx ring entries are simply not enough to keep up with higher link speed when there is a burst of network frames coming from these high speed links. Even with full MTU size frames, they are gone in a short time. It is also more common to have a mix of frame sizes, and more likely bi-modal distribution of frame sizes so the average frame size is not close to full MTU. If we consider average frame size of 800B, 1024 frames that come in a burst takes ~0.65 ms to arrive at 10Gbps. With 256 entires, it takes ~0.16 ms to arrive at 10Gbps. At 25Gbps or 40Gbps, this time is reduced accordingly. 2. On a hypervisor where there are many VMs and CPU is over committed, i.e. the number of VCPUs is more than the number of VCPUs, each PCPU is in effect time shared between multiple VMs/VCPUs. The time granularity at which this multiplexing occurs is typically coarser than between processes on a guest OS. Trying to time slice more finely is not efficient, for example, if memory cache is barely warmed up when switching from one VM to another occurs. This CPU overcommit adds delay to when the driver in a VM can service incoming packets. Whether CPU is over committed really depends on customer workloads. For certain situations, it is very common. For example, workloads of desktop VMs and product testing setups. Consolidation and sharing is what drives efficiency of a customer setup for such workloads. In these situations, the raw network bandwidth may not be very high, but the delays between when a VM is running or not running can also be relatively long. Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Acked-by: Jin Heo <heoj@vmware.com> Acked-by: Guolin Yang <gyang@vmware.com> Acked-by: Boon Ang <bang@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-14vmxnet3: avoid format strint overflow warningArnd Bergmann1-1/+1
gcc-7 notices that "-event-%d" could be more than 11 characters long if we had larger 'vector' numbers: drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_activate_dev': drivers/net/vmxnet3/vmxnet3_drv.c:2095:40: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=] sprintf(intr->event_msi_vector_name, "%s-event-%d", ^~~~~~~~~~~~~ drivers/net/vmxnet3/vmxnet3_drv.c:2095:3: note: 'sprintf' output between 9 and 33 bytes into a destination of size 32 The current code is safe, but making the string a little longer is harmless and lets gcc see that it's ok. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-12vmxnet3: ensure that adapter is in proper state during force_closeNeil Horman1-0/+5
There are several paths in vmxnet3, where settings changes cause the adapter to be brought down and back up (vmxnet3_set_ringparam among them). Should part of the reset operation fail, these paths call vmxnet3_force_close, which enables all napi instances prior to calling dev_close (with the expectation that vmxnet3_close will then properly disable them again). However, vmxnet3_force_close neglects to clear VMXNET3_STATE_BIT_QUIESCED prior to calling dev_close. As a result vmxnet3_quiesce_dev (called from vmxnet3_close), returns early, and leaves all the napi instances in a enabled state while the device itself is closed. If a device in this state is activated again, napi_enable will be called on already enabled napi_instances, leading to a BUG halt. The fix is to simply enausre that the QUIESCED bit is cleared in vmxnet3_force_close to allow quesence to be completed properly on close. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22net: vmxnet3: use new api ethtool_{get|set}_link_ksettingsPhilippe Reynes1-11/+14
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-30drivers: net: generalize napi_complete_done()Eric Dumazet1-2/+2
napi_complete_done() allows to opt-in for gro_flush_timeout, added back in linux-3.19, commit 3b47d30396ba ("net: gro: add a per device gro flush timer") This allows for more efficient GRO aggregation without sacrifying latencies. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08net: make ndo_get_stats64 a void functionstephen hemminger2-5/+3
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdmaLinus Torvalds1-2/+1
Pull rdma updates from Doug Ledford: "This is the complete update for the rdma stack for this release cycle. Most of it is typical driver and core updates, but there is the entirely new VMWare pvrdma driver. You may have noticed that there were changes in DaveM's pull request to the bnxt Ethernet driver to support a RoCE RDMA driver. The bnxt_re driver was tentatively set to be pulled in this release cycle, but it simply wasn't ready in time and was dropped (a few review comments still to address, and some multi-arch build issues like prefetch() not working across all arches). Summary: - shared mlx5 updates with net stack (will drop out on merge if Dave's tree has already been merged) - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe - debug cleanups - new connection rejection helpers - SRP updates - various misc fixes - new paravirt driver from vmware" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits) IB: Add vmw_pvrdma driver IB/mlx4: fix improper return value IB/ocrdma: fix bad initialization infiniband: nes: return value of skb_linearize should be handled MAINTAINERS: Update Intel RDMA RNIC driver maintainers MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers IB/core: fix unmap_sg argument qede: fix general protection fault may occur on probe IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc mlx5, calc_sq_size(): Make a debug message more informative mlx5: Remove a set-but-not-used variable mlx5: Use { } instead of { 0 } to init struct IB/srp: Make writing the add_target sysfs attr interruptible IB/srp: Make mapping failures easier to debug IB/srp: Make login failures easier to debug IB/srp: Introduce a local variable in srp_add_one() IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build IB/multicast: Check ib_find_pkey() return value IPoIB: Avoid reading an uninitialized member variable IB/mad: Fix an array index check ...
2016-12-03vmxnet3: Move PCI Id to pci_ids.hAdit Ranadive1-2/+1
The VMXNet3 PCI Id will be shared with our paravirtual RDMA driver. Moved it to the shared location in pci_ids.h. Suggested-by: Leon Romanovsky <leon@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-7/+10
Mostly simple overlapping changes. For example, David Ahern's adjacency list revamp in 'net-next' conflicted with an adjacency list traversal bug fix in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20net: use core MTU range checking in virt driversJarod Wilson1-3/+4
hyperv_net: - set min/max_mtu, per Haiyang, after rndis_filter_device_add virtio_net: - set min/max_mtu - remove virtnet_change_mtu vmxnet3: - set min/max_mtu xen-netback: - min_mtu = 0, max_mtu = 65517 xen-netfront: - min_mtu = 0, max_mtu = 65535 unisys/visor: - clean up defines a little to not clash with network core or add redundat definitions CC: netdev@vger.kernel.org CC: virtualization@lists.linux-foundation.org CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Paul Durrant <paul.durrant@citrix.com> CC: David Kershner <david.kershner@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-15vmxnet3: avoid assumption about invalid dma_pa in vmxnet3_set_mc()Alexey Khoroshilov1-7/+10
vmxnet3_set_mc() checks new_table_pa returned by dma_map_single() with dma_mapping_error(), but even there it assumes zero is invalid pa (it assumes dma_mapping_error(...,0) returns true if new_table is NULL). The patch adds an explicit variable to track status of new_table_pa. Found by Linux Driver Verification project (linuxtesting.org). v2: use "bool" and "true"/"false" for boolean variables. Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-04vmxnet3: Wake queue from reset workBenjamin Poirier1-1/+1
vmxnet3_reset_work() expects tx queues to be stopped (via vmxnet3_quiesce_dev -> netif_tx_disable). However, this races with the netif_wake_queue() call in netif_tx_timeout() such that the driver's start_xmit routine may be called unexpectedly, triggering one of the BUG_ON in vmxnet3_map_pkt with a stack trace like this: RIP: 0010:[<ffffffffa00cf4bc>] vmxnet3_map_pkt+0x3ac/0x4c0 [vmxnet3] [<ffffffffa00cf7e0>] vmxnet3_tq_xmit+0x210/0x4e0 [vmxnet3] [<ffffffff813ab144>] dev_hard_start_xmit+0x2e4/0x4c0 [<ffffffff813c956e>] sch_direct_xmit+0x17e/0x1e0 [<ffffffff813c96a7>] __qdisc_run+0xd7/0x130 [<ffffffff813a6a7a>] net_tx_action+0x10a/0x200 [<ffffffff810691df>] __do_softirq+0x11f/0x260 [<ffffffff81472fdc>] call_softirq+0x1c/0x30 [<ffffffff81004695>] do_softirq+0x65/0xa0 [<ffffffff81069b89>] local_bh_enable_ip+0x99/0xa0 [<ffffffffa031ff36>] destroy_conntrack+0x96/0x110 [nf_conntrack] [<ffffffff813d65e2>] nf_conntrack_destroy+0x12/0x20 [<ffffffff8139c6d5>] skb_release_head_state+0xb5/0xf0 [<ffffffff8139d299>] skb_release_all+0x9/0x20 [<ffffffff8139cfe9>] __kfree_skb+0x9/0x90 [<ffffffffa00d0069>] vmxnet3_quiesce_dev+0x209/0x340 [vmxnet3] [<ffffffffa00d020a>] vmxnet3_reset_work+0x6a/0xa0 [vmxnet3] [<ffffffff8107d7cc>] process_one_work+0x16c/0x350 [<ffffffff810804fa>] worker_thread+0x17a/0x410 [<ffffffff810848c6>] kthread+0x96/0xa0 [<ffffffff81472ee4>] kernel_thread_helper+0x4/0x10 Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-3/+5
All three conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-25vmxnet3: fix non static symbol warningWei Yongjun1-1/+1
Fixes the following sparse warning: drivers/net/vmxnet3/vmxnet3_drv.c:1645:1: warning: symbol 'vmxnet3_rq_destroy_all_rxdataring' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-19vmxnet3: fix tx data ring copy for variable sizeShrikrishna Khare2-3/+5
'Commit 3c8b3efc061a ("vmxnet3: allow variable length transmit data ring buffer")' changed the size of the buffers in the tx data ring from a fixed size of 128 bytes to a variable size. However, while copying data to the data ring, vmxnet3_copy_hdr continues to carry the old code that assumes fixed buffer size of 128. This patch fixes it by adding correct offset based on the actual data ring buffer size. Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16vmxnet3: update to version 3Shrikrishna Khare2-3/+8
With all vmxnet3 version 3 changes incorporated in the vmxnet3 driver, the driver can configure emulation to run at vmxnet3 version 3, provided the emulation advertises support for version 3. Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16vmxnet3: introduce command to register memory regionShrikrishna Khare1-0/+17
In vmxnet3 version 3, the emulation added support for the vmxnet3 driver to communicate information about the memory regions the driver will use for rx/tx buffers. The driver can also indicate which rx/tx queue the memory region is applicable for. If this information is communicated to the emulation, the emulation will always keep these memory regions mapped, thereby avoiding the mapping/unmapping overhead for every packet. Currently, Linux vmxnet3 driver does not leverage this capability. The feasibility of using this approach for the Linux vmxnet3 driver will be investigated independently and if possible, will be part of a different patch. This patch only exposes the emulation capability to the driver (vmxnet3_defs.h is identical between the driver and the emulation). Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16vmxnet3: add support for get_coalesce, set_coalesce ethtool operationsShrikrishna Khare4-1/+253
The emulation supports a variety of coalescing modes viz. disabled (no coalescing), adaptive, static (number of packets to batch before raising an interrupt), rate based (number of interrupts per second). This patch implements get_coalesce and set_coalesce methods to allow querying and configuring different coalescing modes. Signed-off-by: Keyong Sun <sunk@vmware.com> Signed-off-by: Manoj Tammali <tammalim@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16vmxnet3: add receive data ring supportShrikrishna Khare4-45/+193
vmxnet3 driver preallocates buffers for receiving packets and posts the buffers to the emulation. In order to deliver a received packet to the guest, the emulation must map buffer(s) and copy the packet into it. To avoid this memory mapping overhead, this patch introduces the receive data ring - a set of small sized buffers that are always mapped by the emulation. If a packet fits into the receive data ring buffer, the emulation delivers the packet via the receive data ring (which must be copied by the guest driver), or else the usual receive path is used. Receive Data Ring buffer length is configurable via ethtool -G ethX rx-mini Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16vmxnet3: allow variable length transmit data ring bufferShrikrishna Khare4-19/+64
vmxnet3 driver supports transmit data ring viz. a set of fixed size buffers used by the driver to copy packet headers. Small packets that fit these buffers are copied into these buffers entirely. Currently this buffer size of fixed at 128 bytes. This patch extends transmit data ring implementation to allow variable length transmit data ring buffers. The length of the buffer is read from the emulation during initialization. Signed-off-by: Sriram Rangarajan <rangarajans@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16vmxnet3: introduce generalized command interface to configure the deviceShrikrishna Khare1-1/+21
Shared memory is used to exchange information between the vmxnet3 driver and the emulation. In order to request emulation to perform a task, the driver first populates specific fields in this shared memory and then issues corresponding command by writing to the command register(CMD). The layout of the shared memory was defined by vmxnet3 version 1 and cannot be extended for every new command without breaking backward compatibility. To address this problem, in vmxnet3 version 3, the emulation repurposed a reserved field in the shared memory to represent command information instead. For new commands, the driver first populates the command information field in the shared memory and then issues the command. The emulation interprets the data written to the command information depending on the type of the command. This patch exposes this capability to the driver. Signed-off-by: Guolin Yang <gyang@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16vmxnet3: prepare for version 3 changesShrikrishna Khare6-20/+36
vmxnet3 is currently at version 2, but some command definitions from previous vmxnet3 versions are missing. Add those definitions before moving to version 3. Also, introduce utility macros for vmxnet3 version comparison and update Copyright information and Maintained by. Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-10vmxnet3: segCnt can be 1 for LRO packetsShrikrishna Khare2-3/+3
The device emulation may send segCnt of 1 for LRO packets. Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: Jin Heo <heoj@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packetsShrikrishna Khare2-6/+10
For IPv6, if the device indicates that the checksum is correct, set CHECKSUM_UNNECESSARY. Reported-by: Subbarao Narahari <snarahari@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: Jin Heo <heoj@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14vmxnet3: fix lock imbalance in vmxnet3_tq_xmit()Arnd Bergmann1-4/+4
A recent bug fix rearranged the code in vmxnet3_tq_xmit() in a way that left the error handling for oversized headers unlock a lock that had not been taken yet. Gcc warns about the incorrect use of the 'flags' variable because of that: drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_tq_xmit.constprop': include/linux/spinlock.h:246:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized] This changes the error handling path to 'goto' the end of the function beyond the lock/unlock pair. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: cec05562fb1d ("vmxnet3: avoid calling pskb_may_pull with interrupts disabled") Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07vmxnet3: avoid calling pskb_may_pull with interrupts disabledNeil Horman1-27/+46
vmxnet3 has a function vmxnet3_parse_and_copy_hdr which, among other operations, uses pskb_may_pull to linearize the header portion of an skb. That operation eventually uses local_bh_disable/enable to ensure that it doesn't race with the drivers bottom half handler. Unfortunately, vmxnet3 preforms this parse_and_copy operation with a spinlock held and interrupts disabled. This causes us to run afoul of the WARN_ON_ONCE(irqs_disabled()) warning in local_bh_enable, resulting in this: WARNING: at kernel/softirq.c:159 local_bh_enable+0x59/0x90() (Not tainted) Hardware name: VMware Virtual Platform Modules linked in: ipv6 ppdev parport_pc parport microcode e1000 vmware_balloon vmxnet3 i2c_piix4 sg ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix vmwgfx ttm drm_kms_helper drm i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf] Pid: 6229, comm: sshd Not tainted 2.6.32-616.el6.i686 #1 Call Trace: [<c04624d9>] ? warn_slowpath_common+0x89/0xe0 [<c0469e99>] ? local_bh_enable+0x59/0x90 [<c046254b>] ? warn_slowpath_null+0x1b/0x20 [<c0469e99>] ? local_bh_enable+0x59/0x90 [<c07bb936>] ? skb_copy_bits+0x126/0x210 [<f8d1d9fe>] ? ext4_ext_find_extent+0x24e/0x2d0 [ext4] [<c07bc49e>] ? __pskb_pull_tail+0x6e/0x2b0 [<f95a6164>] ? vmxnet3_xmit_frame+0xba4/0xef0 [vmxnet3] [<c05d15a6>] ? selinux_ip_postroute+0x56/0x320 [<c0615988>] ? cfq_add_rq_rb+0x98/0x110 [<c0852df8>] ? packet_rcv+0x48/0x350 [<c07c5839>] ? dev_queue_xmit_nit+0xc9/0x140 ... Fix it by splitting vmxnet3_parse_and_copy_hdr into two functions: vmxnet3_parse_hdr, which sets up the internal/on stack ctx datastructure, and pulls the skb (both of which can be done without holding the spinlock with irqs disabled and vmxnet3_copy_header, which just copies the skb to the tx ring under the lock safely. tested and shown to correct the described problem. Applies cleanly to the head of the net tree Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: "David S. Miller" <davem@davemloft.net> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-21Driver: Vmxnet3: Update Rx ring 2 max sizeShrikrishna Khare2-3/+3
Device emulation supports max size of 4096. Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-06Driver: Vmxnet3: Fix regression caused by 5738a09Shrikrishna Khare2-6/+6
Reported-by: Bingkuo Liu <bingkuol@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01vmxnet3: fix checks for dma mapping errorsAlexey Khoroshilov1-11/+60
vmxnet3_drv does not check dma_addr with dma_mapping_error() after mapping dma memory. The patch adds the checks and tries to handle failures. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16Driver: Vmxnet3: Fix use of mfTableLen for big endian architecturesShrikrishna Khare2-5/+6
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Reported-by: Masao Uebayashi <uebayasi@gmail.com> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16drivers/net: get rid of unnecessary initializations in .get_drvinfo()Ivan Vecera1-4/+0
Many drivers initialize uselessly n_priv_flags, n_stats, testinfo_len, eedump_len & regdump_len fields in their .get_drvinfo() ethtool op. It's not necessary as these fields is filled in ethtool_get_drvinfo(). v2: removed unused variable v3: removed another unused variable Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-23Driver: Vmxnet3: Extend register dump supportShrikrishna Khare2-30/+92
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com> Acked-by: Srividya Murali <smurali@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08vmxnet3: prevent receive getting out of sequence on napi pollNeil Horman1-4/+4
vmxnet3's current napi path is built to count every rx descriptor we recieve, and use that as a count of the napi budget. That means its possible to return from a napi poll halfway through recieving a fragmented packet accross multiple dma descriptors. If that happens, the next napi poll will start with the descriptor ring in an improper state (e.g. the first descriptor we look at may have the end-of-packet bit set), which will cause a BUG halt in the driver. Fix the issue by only counting whole received packets in the napi poll and returning that value, rather than the descriptor count. Tested by the reporter and myself, successfully Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Shreyas Bhatewara <sbhatewara@vmware.com> CC: "David S. Miller" <davem@davemloft.net> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>