aboutsummaryrefslogtreecommitdiffstats
path: root/src/main.go (unfollow)
Commit message (Collapse)AuthorFilesLines
2023-12-11device: do atomic 64-bit add outside of vector loopJason A. Donenfeld1-1/+4
Only bother updating the rxBytes counter once we've processed a whole vector, since additions are atomic. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-12-11device: reduce redundant per-packet overhead in RX pathJordan Whited1-6/+15
Peer.RoutineSequentialReceiver() deals with packet vectors and does not need to perform timer and endpoint operations for every packet in a given vector. Changing these per-packet operations to per-vector improves throughput by as much as 10% in some environments. Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-12-11device: change Peer.endpoint locking to reduce contentionJordan Whited6-84/+86
Access to Peer.endpoint was previously synchronized by Peer.RWMutex. This has now moved to Peer.endpoint.Mutex. Peer.SendBuffers() is now the sole caller of Endpoint.ClearSrc(), which is signaled via a new bool, Peer.endpoint.clearSrcOnTx. Previous Callers of Endpoint.ClearSrc() now set this bool, primarily via peer.markEndpointSrcForClearing(). Peer.SetEndpointFromPacket() clears Peer.endpoint.clearSrcOnTx when an updated conn.Endpoint is stored. This maintains the same event order as before, i.e. a conn.Endpoint received after peer.endpoint.clearSrcOnTx is set, but before the next Peer.SendBuffers() call results in the latest conn.Endpoint source being used for the next packet transmission. These changes result in throughput improvements for single flow, parallel (-P n) flow, and bidirectional (--bidir) flow iperf3 TCP/UDP tests as measured on both Linux and Windows. Latency under load improves especially for high throughput Linux scenarios. These improvements are likely realized on all platforms to some degree, as the changes are not platform-specific. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-12-11tun: implement UDP GSO/GRO for LinuxJordan Whited6-590/+1258
Implement UDP GSO and GRO for the Linux tun.Device, which is made possible by virtio extensions in the kernel's TUN driver starting in v6.2. secnetperf, a QUIC benchmark utility from microsoft/msquic@8e1eb1a, is used to demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. secnetperf was invoked with the following command line options: -stats:1 -exec:maxtput -test:tput -download:10000 -timed:1 -encrypt:0 The first result is from commit 2e0774f without UDP GSO/GRO on the TUN. [conn][0x55739a144980] STATS: EcnCapable=0 RTT=3973 us SendTotalPackets=55859 SendSuspectedLostPackets=61 SendSpuriousLostPackets=59 SendCongestionCount=27 SendEcnCongestionCount=0 RecvTotalPackets=2779122 RecvReorderedPackets=0 RecvDroppedPackets=0 RecvDuplicatePackets=0 RecvDecryptionFailures=0 Result: 3654977571 bytes @ 2922821 kbps (10003.972 ms). The second result is with UDP GSO/GRO on the TUN. [conn][0x56493dfd09a0] STATS: EcnCapable=0 RTT=1216 us SendTotalPackets=165033 SendSuspectedLostPackets=64 SendSpuriousLostPackets=61 SendCongestionCount=53 SendEcnCongestionCount=0 RecvTotalPackets=11845268 RecvReorderedPackets=25267 RecvDroppedPackets=0 RecvDuplicatePackets=0 RecvDecryptionFailures=0 Result: 15574671184 bytes @ 12458214 kbps (10001.222 ms). Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-12-11tun: fix Device.Read() buf length assumption on WindowsJordan Whited1-4/+3
The length of a packet read from the underlying TUN device may exceed the length of a supplied buffer when MTU exceeds device.MaxMessageSize. Reviewed-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-22device: ratchet up max segment size on androidJason A. Donenfeld1-1/+1
GRO requires big allocations to be efficient. This isn't great, as there might be Android memory usage issues. So we should revisit this commit. But at least it gets things working again. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-21conn: set unused OOB to zero lengthJason A. Donenfeld1-1/+2
Otherwise in the event that we're using GSO without sticky sockets, we pass garbage OOB buffers to sendmmsg, making a EINVAL, when GSO doesn't set its header. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-21conn: fix cmsg data padding calculation for gsoJason A. Donenfeld1-1/+1
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-21conn: separate gso and sticky controlJason A. Donenfeld6-66/+96
Android wants GSO but not sticky. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-18conn: harmonize GOOS checks between "linux" and "android"Jason A. Donenfeld1-5/+5
Otherwise GRO gets enabled on Android, but the conn doesn't use it, resulting in bundled packets being discarded. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-18conn: simplify supportsUDPOffloadJason A. Donenfeld1-8/+2
This allows a kernel to support UDP_GRO while not supporting UDP_SEGMENT. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10go.mod,tun/netstack: bump gvisorJames Tucker4-23/+23
Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10tun: fix crash when ForceMTU is called after closeJames Tucker1-0/+3
Close closes the events channel, resulting in a panic from send on closed channel. Reported-By: Brad Fitzpatrick <brad@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10device: move Queue{In,Out}boundElement Mutex to container typeJordan Whited6-111/+121
Queue{In,Out}boundElement locking can contribute to significant overhead via sync.Mutex.lockSlow() in some environments. These types are passed throughout the device package as elements in a slice, so move the per-element Mutex to a container around the slice. Reviewed-by: Maisem Ali <maisem@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10tun: reduce redundant checksumming in tcpGRO()Jordan Whited1-63/+99
IPv4 header and pseudo header checksums were being computed on every merge operation. Additionally, virtioNetHdr was being written at the same time. This delays those operations until after all coalescing has occurred. Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10tun: unwind summing loop in checksumNoFold()Jordan Whited2-12/+123
$ benchstat old.txt new.txt goos: linux goarch: amd64 pkg: golang.zx2c4.com/wireguard/tun cpu: 12th Gen Intel(R) Core(TM) i5-12400 │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ Checksum/64-12 10.670n ± 2% 4.769n ± 0% -55.30% (p=0.000 n=10) Checksum/128-12 19.665n ± 2% 8.032n ± 0% -59.16% (p=0.000 n=10) Checksum/256-12 37.68n ± 1% 16.06n ± 0% -57.37% (p=0.000 n=10) Checksum/512-12 76.61n ± 3% 32.13n ± 0% -58.06% (p=0.000 n=10) Checksum/1024-12 160.55n ± 4% 64.25n ± 0% -59.98% (p=0.000 n=10) Checksum/1500-12 231.05n ± 7% 94.12n ± 0% -59.26% (p=0.000 n=10) Checksum/2048-12 309.5n ± 3% 128.5n ± 0% -58.48% (p=0.000 n=10) Checksum/4096-12 603.8n ± 4% 257.2n ± 0% -57.41% (p=0.000 n=10) Checksum/8192-12 1185.0n ± 3% 515.5n ± 0% -56.50% (p=0.000 n=10) Checksum/9000-12 1328.5n ± 5% 564.8n ± 0% -57.49% (p=0.000 n=10) Checksum/9001-12 1340.5n ± 3% 564.8n ± 0% -57.87% (p=0.000 n=10) geomean 185.3n 77.99n -57.92% Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10device: distribute crypto work as slice of elementsJordan Whited3-55/+55
After reducing UDP stack traversal overhead via GSO and GRO, runtime.chanrecv() began to account for a high percentage (20% in one environment) of perf samples during a throughput benchmark. The individual packet channel ops with the crypto goroutines was the primary contributor to this overhead. Updating these channels to pass vectors, which the device package already handles at its ends, reduced this overhead substantially, and improved throughput. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is with UDP GSO and GRO, and with single element channels. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 12.3 GBytes 10.6 Gbits/sec 232 3.15 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.3 GBytes 10.6 Gbits/sec 232 sender [ 5] 0.00-10.04 sec 12.3 GBytes 10.6 Gbits/sec receiver The second result is with channels updated to pass a slice of elements. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 13.2 GBytes 11.3 Gbits/sec 182 3.15 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 13.2 GBytes 11.3 Gbits/sec 182 sender [ 5] 0.00-10.04 sec 13.2 GBytes 11.3 Gbits/sec receiver Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10conn, device: use UDP GSO and GRO on LinuxJordan Whited13-147/+673
StdNetBind probes for UDP GSO and GRO support at runtime. UDP GSO is dependent on checksum offload support on the egress netdev. UDP GSO will be disabled in the event sendmmsg() returns EIO, which is a strong signal that the egress netdev does not support checksum offload. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 052af4a without UDP GSO or GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 9.85 GBytes 8.46 Gbits/sec 1139 3.01 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 9.85 GBytes 8.46 Gbits/sec 1139 sender [ 5] 0.00-10.04 sec 9.85 GBytes 8.42 Gbits/sec receiver The second result is with UDP GSO and GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 12.3 GBytes 10.6 Gbits/sec 232 3.15 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.3 GBytes 10.6 Gbits/sec 232 sender [ 5] 0.00-10.04 sec 12.3 GBytes 10.6 Gbits/sec receiver Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-07-04netstack: fix typoDimitri Papadopoulos Orfanos1-1/+1
Signed-off-by: Dimitri Papadopoulos Orfanos <3234522+DimitriPapadopoulos@users.noreply.github.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-07-04all: adjust build tags for wasip1/wasmBrad Fitzpatrick4-4/+4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-06-27conn: windows: add missing return statement in DstToString AF_INETspringhack1-1/+1
Signed-off-by: SpringHack <springhack@live.cn> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-06-27conn: store IP_PKTINFO cmsg in StdNetendpoint srcJames Tucker4-98/+128
Replace the src storage inside StdNetEndpoint with a copy of the raw control message buffer, to reduce allocation and perform less work on a per-packet basis. Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-06-27device: wait for and lock ipc operations during closeJames Tucker1-0/+2
If an IPC operation is in flight while close starts, it is possible for both processes to deadlock. Prevent this by taking the IPC lock at the start of close and for the duration. Signed-off-by: James Tucker <jftucker@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-25tun: use correct IP header comparisons in tcpGRO() and tcpPacketsCanCoalesce()Jordan Whited2-16/+119
tcpGRO() was using an incorrect IPv4 more fragments bit mask. tcpPacketsCanCoalesce() was not distinguishing tcp6 from tcp4, and TTL values were not compared. TTL values should be equal at the IP layer, otherwise the packets should not coalesce. This tracks with the kernel. Reviewed-by: Denton Gentry <dgentry@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-25tun: disqualify tcp4 packets w/IP options from coalescingJordan Whited2-5/+55
IP options were not being compared prior to coalescing. They are not commonly used. Disqualification due to nonzero options is in line with the kernel. Reviewed-by: Denton Gentry <dgentry@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-24conn: move booleans to bottom of StdNetBind structJason A. Donenfeld1-9/+11
This results in a more compact structure. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-24conn: use ipv6 message pool for ipv6 receivingJason A. Donenfeld1-2/+2
Looks like a simple copy&paste error. Fixes: 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-24conn: fix StdNetEndpoint data race by dynamically allocating endpointsJordan Whited1-24/+8
In 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux"), the Linux-specific Bind implementation was collapsed into StdNetBind. This introduced a race on StdNetEndpoint from getSrcFromControl() and setSrcControl(). Remove the sync.Pool involved in the race, and simplify StdNetBind's receive path to allocate StdNetEndpoint on the heap instead, with the intent for it to be cleaned up by the GC, later. This essentially reverts ef5c587 ("conn: remove the final alloc per packet receive"), adding back that allocation, unfortunately. This does slightly increase resident memory usage in higher throughput scenarios. StdNetBind is the only Bind implementation that was using this Endpoint recycling technique prior to this commit. This is considered a stop-gap solution, and there are plans to replace the allocation with a better mechanism. Reported-by: lsc <lsc@lv6.tw> Link: https://lore.kernel.org/wireguard/ac87f86f-6837-4e0e-ec34-1df35f52540e@lv6.tw/ Fixes: 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux") Cc: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-23conn: disable sticky sockets on AndroidJason A. Donenfeld5-8/+22
We can't have the netlink listener socket, so it's not possible to support it. Plus, android networking stack complexity makes it a bit tricky anyway, so best to leave it disabled. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-23global: remove old style build tagsJason A. Donenfeld8-8/+0
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-17tun: replace ErrorBatch() with errors.Join()Jordan Whited2-51/+3
Reviewed-by: Maisem Ali <maisem@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-17go.mod: bump to Go 1.20Jordan Whited2-2/+2
Reviewed-by: Maisem Ali <maisem@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-16conn: fix getSrcFromControl() iterationJordan Whited2-1/+29
We only expect a single control message in the normal case, but this would loop infinitely if there were more. Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-16conn: use CmsgSpace() for ancillary data buf sizingJordan Whited1-5/+7
CmsgLen() does not account for data alignment. Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-13global: buff -> bufJason A. Donenfeld18-189/+189
This always struck me as kind of weird and non-standard. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn: use right cmsghdr len types on 32-bit in sticky testJason A. Donenfeld1-4/+4
Cmsghdr uses uint32 and uint64 on 32-bit and 64-bit respectively for the Len member, which makes assignments and comparisons slightly more irksome than usual. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn: make StdNetBind.BatchSize() return 1 for non-LinuxJordan Whited1-1/+4
This commit updates StdNetBind.BatchSize() to return 1 instead of IdealBatchSize for non-Linux platforms. Non-Linux platforms do not yet benefit from values > 1, which only serves to increase memory consumption. Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10tun/netstack: enable TCP Selective AcknowledgementsJordan Whited1-1/+6
Enable TCP SACK for the gVisor Stack used in tun/netstack. This can improve throughput by an order of magnitude in the presence of packet loss. Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn: ensure control message size is respected in StdNetBindJordan Whited1-2/+2
This commit re-slices received control messages in StdNetBind to the value the OS reports on a successful read. Previously, the len of this slice would always be srcControlSize, which could result in control message values leaking through a sync.Pool round trip. This is unlikely with the IP_PKTINFO socket option set successfully, but should be guarded against. Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn: fix StdNetBind fallback on WindowsJordan Whited2-64/+150
If RIO is unavailable, NewWinRingBind() falls back to StdNetBind. StdNetBind uses x/net/ipv{4,6}.PacketConn for sending and receiving datagrams, specifically via the {Read,Write}Batch methods. These methods are unimplemented on Windows and will return runtime errors as a result. Additionally, only Linux benefits from these x/net types for reading and writing, so we update StdNetBind to fall back to the standard library net package for all platforms other than Linux. Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn: inch BatchSize toward being non-dynamicJason A. Donenfeld8-23/+19
There's not really a use at the moment for making this configurable, and once bind_windows.go behaves like bind_std.go, we'll be able to use constants everywhere. So begin that simplification now. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn: set SO_{SND,RCV}BUF to 7MB on the Bind UDP socketJordan Whited4-0/+52
The conn.Bind UDP sockets' send and receive buffers are now being sized to 7MB, whereas they were previously inheriting the system defaults. The system defaults are considerably small and can result in dropped packets on high speed links. By increasing the size of these buffers we are able to achieve higher throughput in the aforementioned case. The iperf3 results below demonstrate the effect of this commit between two Linux computers with 32-core Xeon Platinum CPUs @ 2.9Ghz. There is roughly ~125us of round trip latency between them. The first result is from commit 792b49c which uses the system defaults, e.g. net.core.{r,w}mem_max = 212992. The TCP retransmits are correlated with buffer full drops on both sides. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 4.74 GBytes 4.08 Gbits/sec 2742 285 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 4.74 GBytes 4.08 Gbits/sec 2742 sender [ 5] 0.00-10.04 sec 4.74 GBytes 4.06 Gbits/sec receiver The second result is after increasing SO_{SND,RCV}BUF to 7MB, i.e. applying this commit. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec 0 3.15 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec 0 sender [ 5] 0.00-10.04 sec 6.14 GBytes 5.25 Gbits/sec receiver The specific value of 7MB is chosen as it is the max supported by a default configuration of macOS. A value greater than 7MB may further benefit throughput for environments with higher network latency and lower CPU clocks, but will also increase latency under load (bufferbloat). Some platforms will silently clamp the value to other maximums. On Linux, we use SO_{SND,RCV}BUFFORCE in case 7MB is beyond net.core.{r,w}mem_max. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10go.mod: bump depsJason A. Donenfeld2-9/+9
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn, device, tun: implement vectorized I/O on LinuxJordan Whited24-787/+1870
Implement TCP offloading via TSO and GRO for the Linux tun.Device, which is made possible by virtio extensions in the kernel's TUN driver. Delete conn.LinuxSocketEndpoint in favor of a collapsed conn.StdNetBind. conn.StdNetBind makes use of recvmmsg() and sendmmsg() on Linux. All platforms now fall under conn.StdNetBind, except for Windows, which remains in conn.WinRingBind, which still needs to be adjusted to handle multiple packets. Also refactor sticky sockets support to eventually be applicable on platforms other than just Linux. However Linux remains the sole platform that fully implements it for now. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn, device, tun: implement vectorized I/O plumbingJordan Whited25-494/+1026
Accept packet vectors for reading and writing in the tun.Device and conn.Bind interfaces, so that the internal plumbing between these interfaces now passes a vector of packets. Vectors move untouched between these interfaces, i.e. if 128 packets are received from conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is no internal buffering. Currently, existing implementations are only adjusted to have vectors of length one. Subsequent patches will improve that. Also, as a related fixup, use the unix and windows packages rather than the syscall package when possible. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-02-23version: bump snapshot0.0.20230223Jason A. Donenfeld1-1/+1
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-02-16device: uniformly check ECDH output for zerosJason A. Donenfeld5-38/+45
For some reason, this was omitted for response messages. Reported-by: z <dzm@unexpl0.red> Fixes: 8c34c4c ("First set of code review patches") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-02-09tun: guard Device.Events() against chan writesJordan Whited8-11/+11
Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-02-07global: bump copyright yearJason A. Donenfeld75-75/+75
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-02-07tun/netstack: make http examples communicate with each otherSoren L. Hansen2-9/+9
This seems like a much better demonstration as it removes the need for external components. Signed-off-by: Søren L. Hansen <sorenisanerd@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>