aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/virtio_net.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-09-26bpf: add meta pointer for direct accessDaniel Borkmann1-0/+2
This work enables generic transfer of metadata from XDP into skb. The basic idea is that we can make use of the fact that the resulting skb must be linear and already comes with a larger headroom for supporting bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work on a similar principle and introduce a small helper bpf_xdp_adjust_meta() for adjusting a new pointer called xdp->data_meta. Thus, the packet has a flexible and programmable room for meta data, followed by the actual packet data. struct xdp_buff is therefore laid out that we first point to data_hard_start, then data_meta directly prepended to data followed by data_end marking the end of packet. bpf_xdp_adjust_head() takes into account whether we have meta data already prepended and if so, memmove()s this along with the given offset provided there's enough room. xdp->data_meta is optional and programs are not required to use it. The rationale is that when we process the packet in XDP (e.g. as DoS filter), we can push further meta data along with it for the XDP_PASS case, and give the guarantee that a clsact ingress BPF program on the same device can pick this up for further post-processing. Since we work with skb there, we can also set skb->mark, skb->priority or other skb meta data out of BPF, thus having this scratch space generic and programmable allows for more flexibility than defining a direct 1:1 transfer of potentially new XDP members into skb (it's also more efficient as we don't need to initialize/handle each of such new members). The facility also works together with GRO aggregation. The scratch space at the head of the packet can be multiple of 4 byte up to 32 byte large. Drivers not yet supporting xdp->data_meta can simply be set up with xdp->data_meta as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out, such that the subsequent match against xdp->data for later access is guaranteed to fail. The verifier treats xdp->data_meta/xdp->data the same way as we treat xdp->data/xdp->data_end pointer comparisons. The requirement for doing the compare against xdp->data is that it hasn't been modified from it's original address we got from ctx access. It may have a range marking already from prior successful xdp->data/xdp->data_end pointer comparisons though. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-22virtio-net: correctly set xdp_xmit for mergeable bufferJason Wang1-1/+1
We should set xdp_xmit only when xdp_do_redirect() succeed. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-20virtio-net: support XDP_REDIRECTJason Wang1-15/+62
This patch tries to add XDP_REDIRECT for virtio-net. The changes are not complex as we could use exist XDP_TX helpers for most of the work. The rest is passing the XDP_TX to NAPI handler for implementing batching. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-20virtio-net: add packet len average only when needed during XDPJason Wang1-3/+3
There's no need to add packet len average in the case of XDP_PASS since it will be done soon after skb is created. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-20virtio-net: remove unnecessary parameter of virtnet_xdp_xmit()Jason Wang1-3/+2
CC: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Three cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-24virtio_net: be drop monitor friendlyEric Dumazet1-1/+1
This change is needed to not fool drop monitor. (perf record ... -e skb:kfree_skb ) Packets were properly sent and are consumed after TX completion. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18net: drop unused attribute argument from sysfs queue funcsstephen hemminger1-1/+1
The show and store functions don't need/use the attribute. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16virtio: put paren around sizeofstephen hemminger1-1/+1
Kernel coding style is to put paren around operand of sizeof. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-13virtio-net: make array guest_offloads staticColin Ian King1-4/+6
The array guest_offloads is local to the source and does not need to be in global scope, so make it static. Also tweak formatting. Cleans up sparse warnings: symbol 'guest_offloads' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+3
Two minor conflicts in virtio_net driver (bug fix overlapping addition of a helper) and MAINTAINERS (new driver edit overlapping revamp of PHY entry). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-3/+2
Pull networking fixes from David Miller: 1) Handle notifier registry failures properly in tun/tap driver, from Tonghao Zhang. 2) Fix bpf verifier handling of subtraction bounds and add a testcase for this, from Edward Cree. 3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt. 4) Fix use after free in prd_retire_rx_blk_timer_exired() in AF_PACKET, from Cong Wang. 5) Fix SElinux regression due to recent UDP optimizations, from Paolo Abeni. 6) We accidently increment IPSTATS_MIB_FRAGFAILS in the ipv6 code paths, fix from Stefano Brivio. 7) Fix some mem leaks in dccp, from Xin Long. 8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd Bergmann. 9) Mac address length check and buffer size fixes from Cong Wang. 10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni. 11) Fix return value when copy_from_user() fails in bpf_prog_get_info_by_fd(), from Daniel Borkmann. 12) Handle PHY_HALTED properly in phy library state machine, from Florian Fainelli. 13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel. 14) Fix truesize calculation in virtio_net which led to performance regressions, from Michael S Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) samples/bpf: fix bpf tunnel cleanup udp6: fix jumbogram reception ppp: Fix a scheduling-while-atomic bug in del_chan Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config" virtio_net: fix truesize for mergeable buffers mv643xx_eth: fix of_irq_to_resource() error check MAINTAINERS: Add more files to the PHY LIBRARY section ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev() net: phy: Correctly process PHY_HALTED in phy_stop_machine() sunhme: fix up GREG_STAT and GREG_IMASK register offsets bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len tcp: avoid bogus gcc-7 array-bounds warning net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt" bpf: don't indicate success when copy_from_user fails udp6: fix socket leak on early demux net: thunderx: Fix BGX transmit stall due to underflow Revert "vhost: cache used event for better performance" team: use a larger struct for mac address net: check dev->addr_len for dev_set_mac_address() phy: bcm-ns-usb3: fix MDIO_BUS dependency ...
2017-07-31virtio_net: fix truesize for mergeable buffersMichael S. Tsirkin1-3/+2
Seth Forshee noticed a performance degradation with some workloads. This turns out to be due to packet drops. Euan Kemp noticed that this is because we drop all packets where length exceeds the truesize, but for some packets we add in extra memory without updating the truesize. This in turn was kept around unchanged from ab7db91705e95 ("virtio-net: auto-tune mergeable rx buffer size for improved performance"). That commit had an internal reason not to account for the extra space: not enough bits to do it. No longer true so let's account for the allocated length exactly. Many thanks to Seth Forshee for the report and bisecting and Euan Kemp for debugging the issue. Fixes: 680557cf79f8 ("virtio_net: rework mergeable buffer handling") Reported-by: Euan Kemp <euan.kemp@coreos.com> Tested-by: Euan Kemp <euan.kemp@coreos.com> Reported-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-25virtio-net: mark PM functions as __maybe_unusedArnd Bergmann1-4/+2
After removing the reset function, the freeze and restore functions are now unused when CONFIG_PM_SLEEP is disabled: drivers/net/virtio_net.c:1881:12: error: 'virtnet_restore_up' defined but not used [-Werror=unused-function] static int virtnet_restore_up(struct virtio_device *vdev) drivers/net/virtio_net.c:1859:13: error: 'virtnet_freeze_down' defined but not used [-Werror=unused-function] static void virtnet_freeze_down(struct virtio_device *vdev) A more robust way to do this is to remove the #ifdef around the callers and instead mark them as __maybe_unused. The compiler will now just silently drop the unused code. Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-25virtio-net: fix module unloadingAndrew Jones1-1/+1
Unregister the driver before removing multi-instance hotplug callbacks. This order avoids the warning issued from __cpuhp_remove_state_cpuslocked when the number of remaining instances isn't yet zero. Fixes: 8017c279196a ("net/virtio-net: Convert to hotplug state machine") Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-24virtio-net: switch off offloads on demand if possible on XDP setJason Wang1-5/+65
Current XDP implementation wants guest offloads feature to be disabled on device. This is inconvenient and means guest can't benefit from offloads if XDP is not used. This patch tries to address this limitation by disabling the offloads on demand through control guest offloads. Guest offloads will be disabled and enabled on demand on XDP set. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24virtio-net: do not reset during XDP setJason Wang1-126/+106
We currently reset the device during XDP set, the main reason is that we allocate more headroom with XDP (for header adjustment). This works but causes network downtime for users. Previous patches encoded the headroom in the buffer context, this makes it possible to detect the case where a buffer with headroom insufficient for XDP is added to the queue and XDP is enabled afterwards. Upon detection, we handle this case by copying the packet (slow, but it's a temporary condition). Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24virtio-net: switch to use new ctx API for small bufferJason Wang1-5/+12
Use ctx API to store headroom for small buffers. Following patches will retrieve this info and use it for XDP. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24virtio-net: pack headroom into ctx for mergeable buffersJason Wang1-5/+24
Pack headroom into ctx - this way when we get a buffer we can figure out the actual headroom that was allocated for the buffer. Will be helpful to optimize switching between XDP and non-XDP modes which have different headroom requirements. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17virtio_net: Remove references to NETIF_F_UFO.David S. Miller1-4/+2
It is going away. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-07virtio-net: fix leaking of ctx arrayJason Wang1-0/+1
Fixes: commit d45b897b11ea ("virtio_net: allow specifying context for rx") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+1
A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29virtio-net: serialize tx routine during resetJason Wang1-0/+1
We don't hold any tx lock when trying to disable TX during reset, this would lead a use after free since ndo_start_xmit() tries to access the virtqueue which has already been freed. Fix this by using netif_tx_disable() before freeing the vqs, this could make sure no tx after vq freeing. Reported-by: Jean-Philippe Menil <jpmenil@gmail.com> Tested-by: Jean-Philippe Menil <jpmenil@gmail.com> Fixes commit f600b6905015 ("virtio_net: Add XDP support") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Robert McCabe <robert.mccabe@rockwellcollins.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16bpf: virtio_net: Report bpf_prog ID during XDP_QUERY_PROGMartin KaFai Lau1-5/+8
Add support to virtio_net to report bpf_prog ID during XDP_QUERY_PROG. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16networking: introduce and use skb_put_data()Johannes Berg1-1/+1
A common pattern with skb_put() is to just want to memcpy() some data into the new space, introduce skb_put_data() for this. An spatch similar to the one for skb_put_zero() converts many of the places using it: @@ identifier p, p2; expression len, skb, data; type t, t2; @@ ( -p = skb_put(skb, len); +p = skb_put_data(skb, data, len); | -p = (t)skb_put(skb, len); +p = skb_put_data(skb, data, len); ) ( p2 = (t2)p; -memcpy(p2, data, len); | -memcpy(p, data, len); ) @@ type t, t2; identifier p, p2; expression skb, data; @@ t *p; ... ( -p = skb_put(skb, sizeof(t)); +p = skb_put_data(skb, data, sizeof(t)); | -p = (t *)skb_put(skb, sizeof(t)); +p = skb_put_data(skb, data, sizeof(t)); ) ( p2 = (t2)p; -memcpy(p2, data, sizeof(*p)); | -memcpy(p, data, sizeof(*p)); ) @@ expression skb, len, data; @@ -memcpy(skb_put(skb, len), data, len); +skb_put_data(skb, data, len); (again, manually post-processed to retain some comments) Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-2/+3
Just some simple overlapping changes in marvell PHY driver and the DSA core code. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04virtio_net: check return value of skb_to_sgvec alwaysJason A. Donenfeld1-2/+7
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-02virtio_net: lower limit on buffer sizeMichael S. Tsirkin1-2/+3
commit d85b758f72b0 ("virtio_net: fix support for small rings") was supposed to increase the buffer size for small rings but had an unintentional side effect of decreasing it for large rings. This seems to break some setups - it's not yet clear why, but increasing buffer size back to what it was before helps. Fixes: d85b758f72b0 ("virtio_net: fix support for small rings") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: "J. Bruce Fields" <bfields@fieldses.org> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-24virtio-net: enable TSO/checksum offloads for Q-in-Q vlansVlad Yasevich1-0/+1
Since virtio does not provide it's own ndo_features_check handler, TSO, and now checksum offload, are disabled for stacked vlans. Re-enable the support and let the host take care of it. This restores/improves Guest-to-Guest performance over Q-in-Q vlans. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-10Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-64/+83
Pull virtio updates from Michael Tsirkin: "Fixes, cleanups, performance A bunch of changes to virtio, most affecting virtio net. Also ptr_ring batched zeroing - first of batching enhancements that seems ready." * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: s390/virtio: change maintainership tools/virtio: fix spelling mistake: "wakeus" -> "wakeups" virtio_net: tidy a couple debug statements ptr_ring: support testing different batching sizes ringtest: support test specific parameters ptr_ring: batch ring zeroing virtio: virtio_driver doc virtio_net: don't reset twice on XDP on/off virtio_net: fix support for small rings virtio_net: reduce alignment for buffers virtio_net: rework mergeable buffer handling virtio_net: allow specifying context for rx virtio: allow extra context per descriptor tools/virtio: fix build breakage virtio: add context flag to find vqs virtio: wrap find_vqs ringtest: fix an assert statement
2017-05-09virtio_net: tidy a couple debug statementsDan Carpenter1-2/+2
We are printing a decimal value for truesize so we shouldn't use an "0x" prefix. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-09virtio_net: don't reset twice on XDP on/offMichael S. Tsirkin1-1/+0
We already do a reset once in remove_vq_common - there appears to be no point in doing another one when we add/remove XDP. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-09virtio_net: fix support for small ringsMichael S. Tsirkin1-4/+26
When ring size is small (<32 entries) making buffers smaller means a full ring might not be able to hold enough buffers to fit a single large packet. Make sure a ring full of buffers is large enough to allow at least one packet of max size. Fixes: 2613af0ed18a ("virtio_net: migrate mergeable rx buffers to page frag allocators") Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-09virtio_net: reduce alignment for buffersMichael S. Tsirkin1-12/+1
We don't need to align length to any particular value anymore. Aligning to L1 cache size probably sill makes sense to reduce false sharing. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-09virtio_net: rework mergeable buffer handlingMichael S. Tsirkin1-46/+43
Use the new _ctx virtio API to maintain true length for each buffer. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-09virtio_net: allow specifying context for rxMichael S. Tsirkin1-1/+14
With mergeable buffers we never use s/g for rx, so allow specifying context in that case. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-03xdp: use common helper for netlink extended ack reportingDaniel Borkmann1-4/+4
Small follow-up to d74a32acd59a ("xdp: use netlink extended ACK reporting") in order to let drivers all use the same NL_SET_ERR_MSG_MOD() helper macro for reporting. This also ensures that we consistently add the driver's prefix for dumping the report in user space to indicate that the error message is driver specific and not coming from core code. Furthermore, NL_SET_ERR_MSG_MOD() now reuses NL_SET_ERR_MSG() and thus makes all macros check the pointer as suggested. References: https://www.spinics.net/lists/netdev/msg433267.html Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-02virtio: wrap find_vqsMichael S. Tsirkin1-2/+1
We are going to add more parameters to find_vqs, let's wrap the call so we don't need to tweak all drivers every time. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-01virtio_net: make use of extended ack message reportingJakub Kicinski1-4/+7
Try to carry error messages to the user via the netlink extended ack message attribute. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30virtio-net: use netif_tx_napi_add for tx napiWillem de Bruijn1-2/+2
Avoid hashing the tx napi struct into napi_hash[], which is used for busy polling receive queues. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-26virtio-net: on tx, only call napi_disable if tx napi is onWillem de Bruijn1-2/+8
As of tx napi, device down (`ip link set dev $dev down`) hangs unless tx napi is enabled. Else napi_enable is not called, so napi_disable will spin on test_and_set_bit NAPI_STATE_SCHED. Only call napi_disable if tx napi is enabled. Fixes: 5a719c2552ca ("virtio-net: transmit napi") Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24virtio-net: keep tx interrupts disabled unless kickWillem de Bruijn1-0/+3
Tx napi mode increases the rate of transmit interrupts. Suppress some by masking interrupts while more packets are expected. The interrupts will be reenabled before the last packet is sent. This optimization reduces the througput drop with tx napi for unidirectional flows such as UDP_STREAM that do not benefit from cleaning tx completions in the the receive napi handler. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24virtio-net: clean tx descriptors from rx napiWillem de Bruijn1-0/+21
Amortize the cost of virtual interrupts by doing both rx and tx work on reception of a receive interrupt if tx napi is enabled. With VIRTIO_F_EVENT_IDX, this suppresses most explicit tx completion interrupts for bidirectional workloads. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24virtio-net: move free_old_xmit_skbsWillem de Bruijn1-30/+30
An upcoming patch will call free_old_xmit_skbs indirectly from virtnet_poll. Move the function above this to avoid having to introduce a forward declaration. This is a pure move: no code changes. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24virtio-net: transmit napiWillem de Bruijn1-9/+67
Convert virtio-net to a standard napi tx completion path. This enables better TCP pacing using TCP small queues and increases single stream throughput. The virtio-net driver currently cleans tx descriptors on transmission of new packets in ndo_start_xmit. Latency depends on new traffic, so is unbounded. To avoid deadlock when a socket reaches its snd limit, packets are orphaned on tranmission. This breaks socket backpressure, including TSQ. Napi increases the number of interrupts generated compared to the current model, which keeps interrupts disabled as long as the ring has enough free descriptors. Keep tx napi optional and disabled for now. Follow-on patches will reduce the interrupt cost. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24virtio-net: napi helper functionsWillem de Bruijn1-30/+35
Prepare virtio-net for tx napi by converting existing napi code to use helper functions. This also deduplicates some logic. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-11/+34
Conflicts were simply overlapping changes. In the net/ipv4/route.c case the code had simply moved around a little bit and the same fix was made in both 'net' and 'net-next'. In the net/sched/sch_generic.c case a fix in 'net' happened at the same time that a new argument was added to qdisc_hash_add(). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07virtio_net: clear MTU when out of rangeMichael S. Tsirkin1-11/+30
virtio attempts to clear the MTU feature bit if the value is out of the supported range, but this has no real effect since FEATURES_OK has already been set. Fix this up by checking the MTU in the new validate callback. Fixes: 14de9d114a82 ("virtio-net: Add initial MTU advice feature") Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-04-07virtio_net: enable big packets for large MTU valuesMichael S. Tsirkin1-0/+4
If one enables e.g. jumbo frames without mergeable buffers, packets won't fit in 1500 byte buffers we use. Switch to big packet mode instead. TODO: make sizing more exact, possibly extend small packet mode to use larger pages. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-22net: virtio_net: use new api ethtool_{get|set}_link_ksettingsPhilippe Reynes1-19/+29
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>