aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/tun.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2014-12-16tun: drop broken IFF_VNET_LEMichael S. Tsirkin1-3/+23
Use TUNSETVNETLE/TUNGETVNETLE instead. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-72/+62
Pull networking updates from David Miller: 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu. 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe. 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau. 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca. 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet. 7) Add a variable of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet. 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel. 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov. 10) Support TSO/LSO in sunvnet driver, from David L Stevens. 11) Allow controlling ECN usage via routing metrics, from Florian Westphal. 12) Remote checksum offload, from Tom Herbert. 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky. 14) Add MPLS support to openvswitch, from Simon Horman. 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert. 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic. 17) Allow userspace to ask for the CPU upon what a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet. 18) Limit GSO packets to half the current congestion window, from Eric Dumazet. 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet. 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan. 21) Add VLAN packet scheduler action, from Jiri Pirko. 22) Support configurable RSS hash functions via ethtool, from Eyal Perry. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits) Fix race condition between vxlan_sock_add and vxlan_sock_release net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header net/mlx4: Add support for A0 steering net/mlx4: Refactor QUERY_PORT net/mlx4_core: Add explicit error message when rule doesn't meet configuration net/mlx4: Add A0 hybrid steering net/mlx4: Add mlx4_bitmap zone allocator net/mlx4: Add a check if there are too many reserved QPs net/mlx4: Change QP allocation scheme net/mlx4_core: Use tasklet for user-space CQ completion events net/mlx4_core: Mask out host side virtualization features for guests net/mlx4_en: Set csum level for encapsulated packets be2net: Export tunnel offloads only when a VxLAN tunnel is created gianfar: Fix dma check map error when DMA_API_DEBUG is enabled cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call net: fec: only enable mdio interrupt before phy device link up net: fec: clear all interrupt events to support i.MX6SX net: fec: reset fep link status in suspend function net: sock: fix access via invalid file descriptor net: introduce helper macro for_each_cmsghdr ...
2014-12-11Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-97/+71
Pull virtio updates from Michael Tsirkin: "virtio: virtio 1.0 support, misc patches This adds a lot of infrastructure for virtio 1.0 support. Notable missing pieces: virtio pci, virtio balloon (needs spec extension), vhost scsi. Plus, there are some minor fixes in a couple of places. Note: some net drivers are affected by these patches. David said he's fine with merging these patches through my tree. Rusty's on vacation, he acked using my tree for these, too" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (70 commits) virtio_ccw: finalize_features error handling virtio_ccw: future-proof finalize_features virtio_pci: rename virtio_pci -> virtio_pci_common virtio_pci: update file descriptions and copyright virtio_pci: split out legacy device support virtio_pci: setup config vector indirectly virtio_pci: setup vqs indirectly virtio_pci: delete vqs indirectly virtio_pci: use priv for vq notification virtio_pci: free up vq->priv virtio_pci: fix coding style for structs virtio_pci: add isr field virtio: drop legacy_only driver flag virtio_balloon: drop legacy_only driver flag virtio_ccw: rev 1 devices set VIRTIO_F_VERSION_1 virtio: allow finalize_features to fail virtio_ccw: legacy: don't negotiate rev 1/features virtio: add API to detect legacy devices virtio_console: fix sparse warnings vhost: remove unnecessary forward declarations in vhost.h ...
2014-12-09put iov_iter into msghdrAl Viro1-6/+2
Note that the code _using_ ->msg_iter at that point will be very unhappy with anything other than unshifted iovec-backed iov_iter. We still need to convert users to proper primitives. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09tun: TUN_VNET_LE support, fix sparse warnings for virtio headersMichael S. Tsirkin1-19/+29
Pretty straight-forward: convert all fields to/from virtio endian-ness. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>
2014-12-09tun: drop most type definesMichael S. Tsirkin1-34/+28
It's just as easy to use IFF_ flags directly, there's no point in adding our own defines. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09tun: move internal flag defines out of uapiMichael S. Tsirkin1-51/+21
TUN_ flags are internal and never exposed to userspace. Any application using it is almost certainly buggy. Move them out to tun.c. Note: we remove these completely in follow-up patches, this code movement is split out for ease of review. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-08Merge branch 'iov_iter' into for-nextAl Viro1-20/+33
2014-12-05tun/macvtap: use consume_skb() instead of kfree_skb() when neededJason Wang1-1/+4
To be more friendly with drop monitor, we should only call kfree_skb() when the packets were dropped and use consume_skb() in other cases. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02tun: Fix GSO meta-data handling in tun_get_userHerbert Xu1-1/+1
When we write the GSO meta-data in tun_get_user we end up advancing the IO vector twice, thus exhausting the user buffer before we can finish writing the packet. Fixes: f5ff53b4d97c ("{macvtap,tun}_get_user(): switch to iov_iter") Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24{macvtap,tun}_get_user(): switch to iov_iterAl Viro1-20/+24
allows to switch macvtap and tun from ->aio_write() to ->write_iter() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24switch drivers/net/tun.c to ->read_iter()Al Viro1-25/+15
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-19tun: return NET_XMIT_DROP for dropped packetsJason Wang1-1/+1
After commit 5d097109257c03a71845729f8db6b5770c4bbedc ("tun: only queue packets on device"), NETDEV_TX_OK was returned for dropped packets. This will confuse pktgen since dropped packets were counted as sent ones. Fixing this by returning NET_XMIT_DROP to let pktgen count it as error packet. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13tun: fix issues of iovec iterators using in tun_put_user()Jason Wang1-1/+3
This patch fixes two issues after using iovec iterators: - vlan_offset should be initialized to zero, otherwise unexpected offset will be used in skb_copy_datagram_iter() - advance iovec iterator when vnet_hdr_sz is greater than sizeof(gso), this is the case when mergeable rx buffer were enabled for a virt guest. Fixes e0b46d0ee9c240c7430a47e9b0365674d4a04522 ("tun: Use iovec iterators") Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07tun: Use iovec iteratorsHerbert Xu1-35/+30
This patch removes the use of skb_copy_datagram_const_iovec in favour of the iovec iterator-based skb_copy_datagram_iter. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05fs: Convert show_fdinfo functions to voidJoe Perches1-2/+2
seq_printf functions shouldn't really check the return value. Checking seq_has_overflowed() occasionally is used instead. Update vfs documentation. Link: http://lkml.kernel.org/p/e37e6e7b76acbdcc3bb4ab2a57c8f8ca1ae11b9a.1412031505.git.joe@perches.com Cc: David S. Miller <davem@davemloft.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Joe Perches <joe@perches.com> [ did a few clean ups ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-03tun: Fix TUN_PKT_STRIP settingHerbert Xu1-4/+8
We set the flag TUN_PKT_STRIP if the user buffer provided is too small to contain the entire packet plus meta-data. However, this has been broken ever since we added GSO meta-data. VLAN acceleration also has the same problem. This patch fixes this by taking both into account when setting the TUN_PKT_STRIP flag. The fact that this has been broken for six years without anyone realising means that nobody actually uses this flag. Fixes: f43798c27684 ("tun: Allow GSO using virtio_net_hdr") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03tun: Fix csum_start with VLAN accelerationHerbert Xu1-7/+9
When VLAN acceleration is in use on the xmit path, we end up setting csum_start to the wrong place. The result is that the whoever ends up doing the checksum setting will corrupt the packet instead of writing the checksum to the expected location, usually this means writing the checksum with an offset of -4. This patch fixes this by adjusting csum_start when VLAN acceleration is detected. Fixes: 6680ec68eff4 ("tuntap: hardware vlan tx support") Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packetsBen Hutchings1-1/+5
UFO is now disabled on all drivers that work with virtio net headers, but userland may try to send UFO/IPv6 packets anyway. Instead of sending with ID=0, we should select identifiers on their behalf (as we used to). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: 916e4cf46d02 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30drivers/net: Disable UFO through virtioBen Hutchings1-8/+11
IPv6 does not allow fragmentation by routers, so there is no fragmentation ID in the fixed header. UFO for IPv6 requires the ID to be passed separately, but there is no provision for this in the virtio net protocol. Until recently our software implementation of UFO/IPv6 generated a new ID, but this was a bug. Now we will use ID=0 for any UFO/IPv6 packet passed through a tap, which is even worse. Unfortunately there is no distinction between UFO/IPv4 and v6 features, so disable UFO on taps and virtio_net completely until we have a proper solution. We cannot depend on VM managers respecting the tap feature flags, so keep accepting UFO packets but log a warning the first time we do this. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: 916e4cf46d02 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-09security: make security_file_set_fowner, f_setown and __f_setown void returnJeff Layton1-3/+1
security_file_set_fowner always returns 0, so make it f_setown and __f_setown void return functions and fix up the error handling in the callers. Cc: linux-security-module@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-07-15net: set name_assign_type in alloc_netdev()Tom Gundersen1-1/+2
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert all users to pass NET_NAME_UNKNOWN. Coccinelle patch: @@ expression sizeof_priv, name, setup, txqs, rxqs, count; @@ ( -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs) +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs) | -alloc_netdev_mq(sizeof_priv, name, setup, count) +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count) | -alloc_netdev(sizeof_priv, name, setup) +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup) ) v9: move comments here from the wrong commit Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21net-tun: restructure tun_do_read for better sleep/wakeup efficiencyXi Wang1-38/+16
tun_do_read always adds current thread to wait queue, even if a packet is ready to read. This is inefficient because both sleeper and waker want to acquire the wait queue spin lock when packet rate is high. We restructure the read function and use common kernel networking routines to handle receive, sleep and wakeup. With the change available packets are checked first before the reading thread is added to the wait queue. Ran performance tests with the following configuration: - my packet generator -> tap1 -> br0 -> tap0 -> my packet consumer - sender pinned to one core and receiver pinned to another core - sender send small UDP packets (64 bytes total) as fast as it can - sandy bridge cores - throughput are receiver side goodput numbers The results are baseline: 731k pkts/sec, cpu utilization at 1.50 cpus changed: 783k pkts/sec, cpu utilization at 1.53 cpus The performance difference is largely determined by packet rate and inter-cpu communication cost. For example, if the sender and receiver are pinned to different cpu sockets, the results are baseline: 558k pkts/sec, cpu utilization at 1.71 cpus changed: 690k pkts/sec, cpu utilization at 1.67 cpus Co-authored-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Xi Wang <xii@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27drivers/net: Use RCU_INIT_POINTER(x, NULL) in tun.cMonam Agarwal1-4/+4
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-20tun: remove bogus hardware vlan acceleration flags from vlan_featuresFernando Luis Vazquez Cao1-1/+3
Even though only the outer vlan tag can be HW accelerated in the transmission path, in the TUN/TAP driver vlan_features mirrors hw_features, which happens to have the NETIF_F_HW_VLAN_?TAG_TX flags set. Because of this, during packet tranmisssion through a stacked vlan device dev_hard_start_xmit, (incorrectly) assuming that the vlan device supports hardware vlan acceleration, does not add the vlan header to the skb payload and the inner vlan tags are lost (vlan_tci contains the outer vlan tag when userspace reads the packet from the tap device). Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17netdevice: add queue selection fallback handler for ndo_select_queueDaniel Borkmann1-1/+1
Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-28tun: add device name(iff) field to proc fdinfo entryMasatake YAMATO1-1/+26
A file descriptor opened for /dev/net/tun and a tun device are connected with ioctl. Though understanding the connection is important for trouble shooting, no way is given to a user to know the connected device for a given file descriptor at userland. This patch adds a new fdinfo field for the device name connected to a file descriptor opened for /dev/net/tun. Here is an example of the field: # lsof | grep tun qemu-syst 4565 qemu 25u CHR 10,200 0t138 12921 /dev/net/tun ... # cat /proc/4565/fdinfo/25 pos: 138 flags: 0104002 iff: vnet0 # ip link show dev vnet0 8: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ... changelog: v2: indent iff just like the other fdinfo fields are. v3: remove unused variable. Both are suggested by David Miller <davem@davemloft.net>. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22tuntap: Fix for a race in accessing numqueuesDominic Curran1-4/+6
A patch for fixing a race between queue selection and changing queues was introduced in commit 92bb73ea2("tuntap: fix a possible race between queue selection and changing queues"). The fix was to prevent the driver from re-reading the tun->numqueues more than once within tun_select_queue() using ACCESS_ONCE(). We have been experiancing 'Divide-by-zero' errors in tun_net_xmit() since we moved from 3.6 to 3.10, and believe that they come from a simular source where the value of tun->numqueues changes to zero between the first and a subsequent read of tun->numqueues. The fix is a simular use of ACCESS_ONCE(), as well as a multiply instead of a divide in the if statement. Signed-off-by: Dominic Curran <dominic.curran@citrix.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Max Krasnyansky <maxk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+2
2014-01-10net: core: explicitly select a txq before doing l2 forwardingJason Wang1-1/+2
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The will cause several issues: - NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan instead of lower device which misses the necessary txq synchronization for lower device such as txq stopping or frozen required by dev watchdog or control path. - dev_hard_start_xmit() was called with NULL txq which bypasses the net device watchdog. - dev_hard_start_xmit() does not check txq everywhere which will lead a crash when tso is disabled for lower device. Fix this by explicitly introducing a new param for .ndo_select_queue() for just selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also extended to accept this parameter and dev_queue_xmit_accel() was used to do l2 forwarding transmission. With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of dev_queue_xmit() to do the transmission. In the future, it was also required for macvtap l2 forwarding support since it provides a necessary synchronization method. Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: e1000-devel@lists.sourceforge.net Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02tun, rfs: fix the incorrect hash valueZhi Yong Wu1-1/+1
The code incorrectly save the queue index as the hash, so this patch is fixing it with the hash received in the stack receive path. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-31tun: Add support for RFS on tun flowsTom Herbert1-2/+35
This patch adds support so that the rps_flow_tables (RFS) can be programmed using the tun flows which are already set up to track flows for the purposes of queue selection. On the receive path (corresponding to select_queue and tun_net_xmit) the rxhash is saved in the flow_entry. The original code only does flow lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit if num_queues == 1 (select_queue is not called from dev_queue_xmit->netdev_pick_tx in that case). The flow is recorded (processing CPU) in tun_flow_update (TX path), and reset when flow is deleted. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-7/+9
Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c drivers/net/macvtap.c Both minor merge hassles, simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net: Change skb_get_rxhash to skb_get_hashTom Herbert1-2/+2
Changing name of function as part of making the hash in skbuff to be generic property, not just for receive path. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-11tun: unbreak truncated packet signallingJason Wang1-7/+9
Commit 6680ec68eff47d36f67b4351bc9836fd6cba9532 (tuntap: hardware vlan tx support) breaks the truncated packet signal by nev return a length greater than iov length in tun_put_user(). This patch fixes by always return the length of packet plus possible vlan header. Caller can detect the truncated packet by comparing the return value and the size of io length. Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10net: Revert macvtap/tun truncation signalling changes.David S. Miller1-12/+11
Jason Wang and Michael S. Tsirkin are still discussing how to properly fix this. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10tun: unbreak truncated packet signallingJason Wang1-11/+12
Commit 6680ec68eff47d36f67b4351bc9836fd6cba9532 (tuntap: hardware vlan tx support) breaks the truncated packet signal by never return a length greater than iov length in tun_put_user(). This patch fixes this by always return the length of packet plus possible vlan header. Caller can detect the truncated packet by comparing the return value and the size of iov length. Reported-by: Vlad Yasevich <vyasevich@gmail.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10Revert "tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()"David S. Miller1-0/+5
This reverts commit 73713357ab58aacda1af715bb5a623528dbbfd79. MSG_TRUNC handling was broken and is going to be fixed in the 'net' tree, so revert this. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()Zhi Yong Wu1-5/+0
By checking related codes, it is impossible that ret > len or total_len, so we should remove some useless codes in both above functions. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+2
Merge 'net' into 'net-next' to get the AF_PACKET bug fix that Daniel's direct transmit changes depend upon. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06tun: remove unused parameter in tun_do_read()Zhi Yong Wu1-4/+3
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06tun: spelling fixesstephen hemminger1-6/+6
Fix spelling errors in tun driver. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06tun: update file current positionZhi Yong Wu1-0/+2
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-14tuntap: limit head length of skb allocatedJason Wang1-1/+9
We currently use hdr_len as a hint of head length which is advertised by guest. But when guest advertise a very big value, it can lead to an 64K+ allocating of kmalloc() which has a very high possibility of failure when host memory is fragmented or under heavy stress. The huge hdr_len also reduce the effect of zerocopy or even disable if a gso skb is linearized in guest. To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the head, which guarantees an order 0 allocation each time. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-08tun: don't look at current when non-blockingMichael S. Tsirkin1-3/+5
We play with a wait queue even if socket is non blocking. This is an obvious waste. Besides, it will prevent calling the non blocking variant when current is not valid. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-12tuntap: correctly handle error in tun_set_iff()Jason Wang1-3/+8
Commit c8d68e6be1c3b242f1c598595830890b65cea64a (tuntap: multiqueue support) only call free_netdev() on error in tun_set_iff(). This causes several issues: - memory of tun security were leaked - use after free since the flow gc timer was not deleted and the tfile were not detached This patch solves the above issues. Reported-by: Wannes Rombouts <wannes.rombouts@epitech.eu> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-05tuntap: orphan frags before trying to set tx timestampJason Wang1-3/+5
sock_tx_timestamp() will clear all zerocopy flags of skb which may lead the frags never to be orphaned. This will break guest to guest traffic when zerocopy is enabled. Fix this by orphaning the frags before trying to set tx time stamp. The issue were introduced by commit eda297729171fe16bf34fe5b0419dfb69060f623 (tun: Support software transmit time stamping). Cc: Richard Cochran <richardcochran@gmail.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-05tuntap: purge socket error queue on detachJason Wang1-3/+9
Commit eda297729171fe16bf34fe5b0419dfb69060f623 (tun: Support software transmit time stamping) will queue skbs into error queue when tx stamping is enabled. But it forgets to purge the error queue during detach. This patch fixes this. Cc: Richard Cochran <richardcochran@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-21tun: Get skfilter layoutPavel Emelyanov1-0/+10
The only thing we may have from tun device is the fprog, whic contains the number of filter elements and a pointer to (user-space) memory where the elements are. The program itself may not be available if the device is persistent and detached. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-21tun: Allow to skip filter on attachPavel Emelyanov1-5/+7
There's a small problem with sk-filters on tun devices. Consider an application doing this sequence of steps: fd = open("/dev/net/tun"); ioctl(fd, TUNSETIFF, { .ifr_name = "tun0" }); ioctl(fd, TUNATTACHFILTER, &my_filter); ioctl(fd, TUNSETPERSIST, 1); close(fd); At that point the tun0 will remain in the system and will keep in mind that there should be a socket filter at address '&my_filter'. If after that we do fd = open("/dev/net/tun"); ioctl(fd, TUNSETIFF, { .ifr_name = "tun0" }); we most likely receive the -EFAULT error, since tun_attach() would try to connect the filter back. But (!) if we provide a filter at address &my_filter, then tun0 will be created and the "new" filter would be attached, but application may not know about that. This may create certain problems to anyone using tun-s, but it's critical problem for c/r -- if we meet a persistent tun device with a filter in mind, we will not be able to attach to it to dump its state (flags, owner, address, vnethdr size, etc.). The proposal is to allow to attach to tun device (with TUNSETIFF) w/o attaching the filter to the tun-file's socket. After this attach app may e.g clean the device by dropping the filter, it doesn't want to have one, or (in case of c/r) get information about the device with tun ioctls. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>