path: root/device/peer.go

* device: change Peer.endpoint locking to reduce contention (Jordan Whited, 2023-12-11; 1 file, -15/+35)

  Access to Peer.endpoint was previously synchronized by Peer.RWMutex. This has now moved to Peer.endpoint.Mutex.

  Peer.SendBuffers() is now the sole caller of Endpoint.ClearSrc(), which is signaled via a new bool, Peer.endpoint.clearSrcOnTx. Previous callers of Endpoint.ClearSrc() now set this bool, primarily via peer.markEndpointSrcForClearing(). Peer.SetEndpointFromPacket() clears Peer.endpoint.clearSrcOnTx when an updated conn.Endpoint is stored. This maintains the same event order as before: a conn.Endpoint received after peer.endpoint.clearSrcOnTx is set, but before the next Peer.SendBuffers() call, results in the latest conn.Endpoint source being used for the next packet transmission.

  These changes result in throughput improvements for single-flow, parallel (-P n) flow, and bidirectional (--bidir) flow iperf3 TCP/UDP tests as measured on both Linux and Windows. Latency under load improves especially for high-throughput Linux scenarios. These improvements are likely realized on all platforms to some degree, as the changes are not platform-specific.

  Co-authored-by: James Tucker <james@tailscale.com>
  Signed-off-by: James Tucker <james@tailscale.com>
  Signed-off-by: Jordan Whited <jordan@tailscale.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
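
  For readers unfamiliar with the pattern, here is a minimal sketch of the arrangement the commit describes. Field and method names are borrowed from the message, but the surrounding types are invented for illustration and are not the actual wireguard-go code:

  ```go
  package device

  import "sync"

  // Endpoint stands in for conn.Endpoint; only ClearSrc matters here.
  type Endpoint interface {
  	ClearSrc()
  }

  type Peer struct {
  	endpoint struct {
  		sync.Mutex
  		val          Endpoint
  		clearSrcOnTx bool // set when the source should be cleared before the next send
  	}
  }

  // markEndpointSrcForClearing replaces the old direct ClearSrc() calls.
  func (p *Peer) markEndpointSrcForClearing() {
  	p.endpoint.Lock()
  	defer p.endpoint.Unlock()
  	p.endpoint.clearSrcOnTx = true
  }

  // SetEndpointFromPacket stores a newer endpoint and cancels any pending clear,
  // preserving the original ordering of events.
  func (p *Peer) SetEndpointFromPacket(e Endpoint) {
  	p.endpoint.Lock()
  	defer p.endpoint.Unlock()
  	p.endpoint.clearSrcOnTx = false
  	p.endpoint.val = e
  }

  // sendBuffers is the only place ClearSrc is actually invoked.
  func (p *Peer) sendBuffers() Endpoint {
  	p.endpoint.Lock()
  	defer p.endpoint.Unlock()
  	if p.endpoint.clearSrcOnTx && p.endpoint.val != nil {
  		p.endpoint.val.ClearSrc()
  		p.endpoint.clearSrcOnTx = false
  	}
  	return p.endpoint.val
  }
  ```

  The effect is that the send and receive paths only ever take a small, short-lived lock around the endpoint itself rather than the peer-wide RWMutex.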

* device: move Queue{In,Out}boundElement Mutex to container type (Jordan Whited, 2023-10-10; 1 file, -4/+4)

  Queue{In,Out}boundElement locking can contribute to significant overhead via sync.Mutex.lockSlow() in some environments. These types are passed throughout the device package as elements in a slice, so move the per-element Mutex to a container around the slice.

  Reviewed-by: Maisem Ali <maisem@tailscale.com>
  Signed-off-by: Jordan Whited <jordan@tailscale.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
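
  A compact illustration of the container idea, using simplified element and container types rather than the real ones:

  ```go
  package device

  import "sync"

  // QueueOutboundElement no longer carries its own Mutex.
  type QueueOutboundElement struct {
  	buffer []byte
  }

  // A container locks the whole batch at once, so the hot path takes one lock
  // per slice instead of one lock per element.
  type QueueOutboundElementsContainer struct {
  	sync.Mutex
  	elems []*QueueOutboundElement
  }
  ```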

* conn, device, tun: implement vectorized I/O plumbing (Jordan Whited, 2023-03-10; 1 file, -9/+17)

  Accept packet vectors for reading and writing in the tun.Device and conn.Bind interfaces, so that the internal plumbing between these interfaces now passes a vector of packets. Vectors move untouched between these interfaces, i.e. if 128 packets are received from conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is no internal buffering.

  Currently, existing implementations are only adjusted to have vectors of length one. Subsequent patches will improve that.

  Also, as a related fixup, use the unix and windows packages rather than the syscall package when possible.

  Co-authored-by: James Tucker <james@tailscale.com>
  Signed-off-by: James Tucker <james@tailscale.com>
  Signed-off-by: Jordan Whited <jordan@tailscale.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
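
  A rough, assumed shape of what a vectorized read/write API looks like; the real conn.Bind and tun.Device signatures carry more detail (endpoints, offsets for headroom, and so on), so treat this purely as the outline of the idea:

  ```go
  package conn

  // A vectorized receive function: one call may yield many packets, with each
  // packet's length written into the corresponding entry of sizes.
  type ReceiveFunc func(bufs [][]byte, sizes []int) (n int, err error)

  // Batcher is implemented by I/O endpoints that accept packet vectors.
  type Batcher interface {
  	// Write sends len(bufs) packets in a single call; there is no internal
  	// buffering, so a batch read from one side is handed whole to the other.
  	Write(bufs [][]byte, offset int) (int, error)
  	// BatchSize reports how many packets one call may carry.
  	BatchSize() int
  }
  ```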

* device: uniformly check ECDH output for zeros (Jason A. Donenfeld, 2023-02-16; 1 file, -1/+1)

  For some reason, this was omitted for response messages.

  Reported-by: z <dzm@unexpl0.red>
  Fixes: 8c34c4c ("First set of code review patches")
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
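
  The usual form of such a check is a constant-time comparison against an all-zero value; a sketch, not the repository's exact helper:

  ```go
  package device

  import "crypto/subtle"

  // isZeroSharedSecret reports, in constant time, whether an ECDH result is
  // all zeros; handshake processing aborts when it is.
  func isZeroSharedSecret(secret [32]byte) bool {
  	var zero [32]byte
  	return subtle.ConstantTimeCompare(secret[:], zero[:]) == 1
  }
  ```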

* global: bump copyright year (Jason A. Donenfeld, 2023-02-07; 1 file, -1/+1)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* global: bump copyright year (Jason A. Donenfeld, 2022-09-20; 1 file, -1/+1)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* all: use Go 1.19 and its atomic types (Brad Fitzpatrick, 2022-09-04; 1 file, -31/+22)

  Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
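
  For context, this kind of migration generally replaces raw uint64 fields plus atomic.AddUint64/LoadUint64 calls with Go 1.19's typed atomics; a sketch on invented fields:

  ```go
  package device

  import "sync/atomic"

  // Before Go 1.19 these were plain uint64 fields that had to be kept 8-byte
  // aligned by hand and touched only via atomic.AddUint64/LoadUint64. The
  // typed atomics remove both the alignment bookkeeping and the free-function
  // call sites.
  type peerStats struct {
  	txBytes atomic.Uint64
  	rxBytes atomic.Uint64
  }

  func (s *peerStats) noteSend(n int) {
  	s.txBytes.Add(uint64(n))
  }
  ```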

* device: defer state machine transitions until configuration is complete (Jason A. Donenfeld, 2021-11-15; 1 file, -6/+3)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: optimize Peer.String even more (Jason A. Donenfeld, 2021-05-18; 1 file, -14/+16)

  This reduces the allocation, branches, and amount of base64 encoding.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: optimize Peer.String (Josh Bleecher Snyder, 2021-05-14; 1 file, -7/+20)

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: remove unusual ... in messages (Jason A. Donenfeld, 2021-05-07; 1 file, -2/+2)

  We don't use "..." in any other present-progressive messages except these.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: get rid of peers.empty boolean in timersActive (Jason A. Donenfeld, 2021-03-06; 1 file, -1/+0)

  len(peers) can never be 0 while a current peer has isRunning==false. This requires some struct reshuffling so that the uint64 pointer is aligned.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* conn: make binds replaceable (Jason A. Donenfeld, 2021-02-23; 1 file, -7/+2)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: use container/list instead of open coding it (Jason A. Donenfeld, 2021-02-10; 1 file, -10/+11)

  This linked list implementation is awful, but maybe Go 2 will help eventually, and at least we're not open coding the hlist any more.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: flush peer queues before starting device (Jason A. Donenfeld, 2021-02-10; 1 file, -0/+2)

  In case some old packets snuck in there before, this flushes before starting afresh.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: create peer queues at peer creation time (Jason A. Donenfeld, 2021-02-10; 1 file, -6/+3)

  Rather than racing with Start(), since we're never destroying these queues, we just set the variables at creation time.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: only allocate peer queues once (Josh Bleecher Snyder, 2021-02-09; 1 file, -4/+4)

  This serves two purposes. First, it makes repeatedly stopping then starting a peer cheaper. Second, it prevents a data race observed accessing the queues.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: fix comment typo and shorten state.mu.Lock to state.Lock (Jason A. Donenfeld, 2021-02-09; 1 file, -5/+5)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: do not attach finalizer to non-returned object (Jason A. Donenfeld, 2021-02-09; 1 file, -4/+4)

  Before, the code attached a finalizer to an object that wasn't returned, resulting in immediate garbage collection. Instead return the actual pointer.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
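
  The bug class in miniature, with invented types:

  ```go
  package device

  import "runtime"

  type handle struct {
  	c chan int
  }

  // Broken: the finalizer is attached to h, but the caller only keeps the
  // channel, so h becomes garbage immediately and the finalizer may run at once.
  func newHandleBroken() chan int {
  	h := &handle{c: make(chan int)}
  	runtime.SetFinalizer(h, func(h *handle) { close(h.c) })
  	return h.c
  }

  // Fixed: return the pointer the finalizer is attached to, so it stays live
  // for as long as the caller holds it.
  func newHandleFixed() *handle {
  	h := &handle{c: make(chan int)}
  	runtime.SetFinalizer(h, func(h *handle) { close(h.c) })
  	return h
  }
  ```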

* device: remove mutex from Peer send/receive (Josh Bleecher Snyder, 2021-02-08; 1 file, -10/+13)

  The immediate motivation for this change is an observed deadlock.

  1. A goroutine calls peer.Stop. That calls peer.queue.Lock().
  2. Another goroutine is in RoutineSequentialReceiver. It receives an elem from peer.queue.inbound.
  3. The peer.Stop goroutine calls close(peer.queue.inbound), close(peer.queue.outbound), and peer.stopping.Wait(). It blocks waiting for RoutineSequentialReceiver and RoutineSequentialSender to exit.
  4. The RoutineSequentialReceiver goroutine calls peer.SendStagedPackets(). SendStagedPackets attempts peer.queue.RLock(). That blocks forever because the peer.Stop goroutine holds a write lock on that mutex.

  A background motivation for this change is that it can be expensive to have a mutex in the hot code path of RoutineSequential*.

  The mutex was necessary to avoid attempting to send elems on a closed channel. This commit removes that danger by never closing the channel. Instead, we send a sentinel nil value on the channel to indicate to the receiver that it should exit.

  The only problem with this is that if the receiver exits, we could write an elem into the channel which would never get received. If it never gets received, it cannot get returned to the device pools.

  To work around this, we use a finalizer. When the channel can be GC'd, the finalizer drains any remaining elements from the channel and restores them to the device pool.

  After that change, peer.queue.RWMutex no longer makes sense where it is. It is only used to prevent concurrent calls to Start and Stop. Move it to a more sensible location and make it a plain sync.Mutex.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
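
  A condensed sketch of the never-close-the-channel pattern described above, using assumed names and a plain sync.Pool standing in for the device pools:

  ```go
  package device

  import (
  	"runtime"
  	"sync"
  )

  type QueueInboundElement struct{ buffer []byte }

  type inboundQueue struct {
  	c chan *QueueInboundElement
  }

  func newInboundQueue(pool *sync.Pool) *inboundQueue {
  	q := &inboundQueue{c: make(chan *QueueInboundElement, 1024)}
  	// The channel is never closed. Once nothing references the queue anymore,
  	// the finalizer drains whatever was stranded in it back to the pool.
  	runtime.SetFinalizer(q, func(q *inboundQueue) {
  		for {
  			select {
  			case elem := <-q.c:
  				if elem != nil {
  					pool.Put(elem)
  				}
  			default:
  				return
  			}
  		}
  	})
  	return q
  }

  // stop tells the receiver to exit by sending a nil sentinel instead of closing.
  func (q *inboundQueue) stop() { q.c <- nil }

  func sequentialReceiver(q *inboundQueue, handle func(*QueueInboundElement)) {
  	for elem := range q.c {
  		if elem == nil {
  			return
  		}
  		handle(elem)
  	}
  }
  ```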

* device: separate timersInit from timersStart (Josh Bleecher Snyder, 2021-02-08; 1 file, -1/+2)

  timersInit sets up the timers. It need only be done once per peer.

  timersStart does the work to prepare the timers for a newly running peer. It needs to be done every time a peer starts.

  Separate the two and call them in the appropriate places. This prevents data races on the peer's timers fields when starting and stopping peers.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
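
  Schematically, with invented timer fields, the split looks like this:

  ```go
  package device

  import (
  	"sync/atomic"
  	"time"
  )

  type Peer struct {
  	timers struct {
  		retransmitHandshake *time.Timer
  		handshakeAttempts   atomic.Uint32
  	}
  }

  // timersInit allocates the timers; it runs once, when the peer is created.
  func (p *Peer) timersInit() {
  	p.timers.retransmitHandshake = time.AfterFunc(time.Hour, func() { /* fire handler */ })
  	p.timers.retransmitHandshake.Stop()
  }

  // timersStart resets per-run state; it runs on every transition to running,
  // so repeated Start/Stop cycles never race on freshly (re)created timers.
  func (p *Peer) timersStart() {
  	p.timers.handshakeAttempts.Store(0)
  }
  ```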

* device: overhaul device state management (Josh Bleecher Snyder, 2021-02-08; 1 file, -4/+4)

  This commit simplifies device state management. It creates a single unified state variable and documents its semantics.

  It also makes state changes more atomic. As an example of the sort of bug that occurred due to non-atomic state changes, the following sequence of events used to occur approximately every 2.5 million test runs:

  * RoutineTUNEventReader received an EventDown event.
  * It called device.Down, which called device.setUpDown.
  * That set device.state.changing, but did not yet attempt to lock device.state.Mutex.
  * Test completion called device.Close.
  * device.Close locked device.state.Mutex.
  * device.Close blocked on a call to device.state.stopping.Wait.
  * device.setUpDown then attempted to lock device.state.Mutex and blocked.

  Deadlock results. setUpDown cannot progress because device.state.Mutex is locked. Until setUpDown returns, RoutineTUNEventReader cannot call device.state.stopping.Done. Until device.state.stopping.Done gets called, device.state.stopping.Wait is blocked. As long as device.state.stopping.Wait is blocked, device.state.Mutex cannot be unlocked. This commit fixes that deadlock by holding device.state.mu when checking that the device is not closed.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
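
  A minimal model of the unified-state idea, with assumed state names; the essential point is that the "is the device closed?" check happens under the same mutex that Close holds:

  ```go
  package device

  import (
  	"sync"
  	"sync/atomic"
  )

  type deviceState uint32

  const (
  	deviceStateDown deviceState = iota
  	deviceStateUp
  	deviceStateClosed
  )

  type Device struct {
  	state struct {
  		sync.Mutex               // serializes every state transition
  		current    atomic.Uint32 // readable without the lock
  	}
  }

  func (d *Device) isClosed() bool {
  	return deviceState(d.state.current.Load()) == deviceStateClosed
  }

  // setUpDown performs Up/Down transitions; the closed check runs while holding
  // the same mutex Close takes, so the two can no longer interleave.
  func (d *Device) setUpDown(up bool) {
  	d.state.Lock()
  	defer d.state.Unlock()
  	if d.isClosed() {
  		return
  	}
  	if up {
  		d.state.current.Store(uint32(deviceStateUp))
  	} else {
  		d.state.current.Store(uint32(deviceStateDown))
  	}
  }

  func (d *Device) Close() {
  	d.state.Lock()
  	defer d.state.Unlock()
  	d.state.current.Store(uint32(deviceStateClosed))
  }
  ```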

* device: take peer handshake when reinitializing last sent handshake (Jason A. Donenfeld, 2021-02-03; 1 file, -1/+4)

  This papers over other unrelated races, unfortunately.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: tie encryption queue lifetime to the peers that write to it (Josh Bleecher Snyder, 2021-02-03; 1 file, -0/+2)

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: simplify peer queue locking (Jason A. Donenfeld, 2021-01-29; 1 file, -44/+13)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* global: bump copyright (Jason A. Donenfeld, 2021-01-28; 1 file, -1/+1)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: get rid of nonce routine (Jason A. Donenfeld, 2021-01-27; 1 file, -22/+7)

  This moves to a simple queue with no routine processing it, to reduce scheduler pressure.

  This splits latency in half!

  benchmark                  old ns/op     new ns/op     delta
  BenchmarkThroughput-16     2394          2364          -1.25%
  BenchmarkLatency-16        259652        120810        -53.47%

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
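
  One plausible shape of a staged-packet queue with no servicing goroutine (names assumed): a buffered channel that callers push into directly, dropping the oldest entry when full:

  ```go
  package device

  type QueueOutboundElement struct{ buffer []byte }

  type Peer struct {
  	queue struct {
  		staged chan *QueueOutboundElement // buffered; no goroutine services it
  	}
  }

  // StagePacket enqueues without blocking: if the queue is full, the oldest
  // staged packet is discarded to make room for the new one.
  func (p *Peer) StagePacket(elem *QueueOutboundElement) {
  	for {
  		select {
  		case p.queue.staged <- elem:
  			return
  		default:
  		}
  		select {
  		case <-p.queue.staged:
  		default:
  		}
  	}
  }
  ```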

* device: use linked list for per-peer allowed-ip traversal (Jason A. Donenfeld, 2021-01-27; 1 file, -0/+1)

  This makes the IpcGet method much faster.

  We also refactor the traversal API to use a callback so that we don't need to allocate at all. To avoid allocations, we do self-masking on insertion, which in turn means that split intermediate nodes require a copy of the bits.

  benchmark               old ns/op     new ns/op     delta
  BenchmarkUAPIGet-16     3243          2659          -18.01%

  benchmark               old allocs     new allocs     delta
  BenchmarkUAPIGet-16     35             30             -14.29%

  benchmark               old bytes     new bytes     delta
  BenchmarkUAPIGet-16     1218          737           -39.49%

  This benchmark is good, though it's only for a pair of peers, each with only one allowed IP. As this grows, the delta expands considerably.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
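
  An illustrative callback-based traversal API of the kind described, with a simplified per-peer list standing in for the real trie:

  ```go
  package device

  import "net/netip"

  type Peer struct{}

  // allowedNode is one entry in a per-peer list of allowed prefixes; the real
  // structure is a trie whose nodes are additionally linked per peer.
  type allowedNode struct {
  	prefix netip.Prefix
  	next   *allowedNode
  }

  type AllowedIPs struct {
  	perPeer map[*Peer]*allowedNode
  }

  // EntriesForPeer hands each prefix to cb without building a result slice;
  // returning false from cb stops the walk early.
  func (t *AllowedIPs) EntriesForPeer(peer *Peer, cb func(netip.Prefix) bool) {
  	for n := t.perPeer[peer]; n != nil; n = n.next {
  		if !cb(n.prefix) {
  			return
  		}
  	}
  }
  ```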

* device: combine debug and info log levels into 'verbose' (Jason A. Donenfeld, 2021-01-26; 1 file, -2/+2)

  There are very few cases, if any, in which a user only wants one of these levels, so combine it into a single level.

  While we're at it, reduce indirection on the loggers by using an empty function rather than a nil function pointer. It's not like we have retpolines anyway, and we were always calling through a function with a branch prior, so this seems like a net gain.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: change logging interface to use functions (Josh Bleecher Snyder, 2021-01-26; 1 file, -2/+2)

  This commit overhauls wireguard-go's logging.

  The primary, motivating change is to use a function instead of a *log.Logger as the basic unit of logging. Using functions provides a lot more flexibility for people to bring their own logging system.

  It also introduces logging helper methods on Device. These reduce line noise at the call site. They also allow for log functions to be nil; when nil, instead of generating a log line and throwing it away, we don't bother generating it at all. This spares allocation and pointless work.

  This is a breaking change, although the fix required of clients is fairly straightforward.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
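
  In outline, function-based logging looks like the following sketch. Field names are illustrative (they evolved across this commit and the one above), and the nil check shows how a silenced level costs nothing:

  ```go
  package device

  type Logger struct {
  	Verbosef func(format string, args ...any)
  	Errorf   func(format string, args ...any)
  }

  type Device struct {
  	log *Logger
  }

  // verbosef only formats when a sink is present; a nil function means no work
  // at all is done for that level.
  func (d *Device) verbosef(format string, args ...any) {
  	if d.log != nil && d.log.Verbosef != nil {
  		d.log.Verbosef(format, args...)
  	}
  }
  ```

  Callers can then plug in log.Printf, a structured logger's Printf-style method, or nothing at all.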

* device: remove unnecessary zeroing (Josh Bleecher Snyder, 2021-01-07; 1 file, -1/+0)

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: call wg.Add outside the goroutine (Josh Bleecher Snyder, 2021-01-07; 1 file, -0/+2)

  One of the first rules of WaitGroups is that you call wg.Add outside of a goroutine, not inside it. Fix this embarrassing mistake.

  This prevents an extremely rare race condition (2 per 100,000 runs) which could occur when attempting to start a new peer concurrently with shutting down a device.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
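
  The rule, shown in its smallest form:

  ```go
  package device

  import "sync"

  // startWorkers launches n goroutines; each Add happens in the caller, before
  // the goroutine exists, so a concurrent Wait can never observe a zero count
  // between launch and Add.
  func startWorkers(wg *sync.WaitGroup, n int, work func()) {
  	for i := 0; i < n; i++ {
  		wg.Add(1)
  		go func() {
  			defer wg.Done()
  			work()
  		}()
  	}
  }
  ```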

* device: fix alignment of peer stats member (Jason A. Donenfeld, 2021-01-07; 1 file, -1/+2)

  This was shifted by 2 bytes when making persistent keepalive into a u32. Fix it by placing it after the aligned region.

  Fixes: e739ff7 ("device: fix persistent_keepalive_interval data races")
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: fix data race in peer.timersActive (Josh Bleecher Snyder, 2021-01-07; 1 file, -0/+1)

  Found by the race detector and existing tests.

  To avoid introducing a lock into this hot path, calculate and cache whether any peers exist.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: fix races from changing private_key (Josh Bleecher Snyder, 2021-01-07; 1 file, -3/+4)

  Access keypair.sendNonce atomically. Eliminate one unnecessary initialization to zero. Mutate handshake.lastSentHandshake with the mutex held.

  Co-authored-by: David Anderson <danderson@tailscale.com>
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: use channel close to shut down and drain outbound channel (Josh Bleecher Snyder, 2021-01-07; 1 file, -2/+1)

  This is a similar treatment to the handling of the encryption channel found a few commits ago: use the closing of the channel to manage goroutine lifetime and shutdown. It is considerably simpler because there is only a single writer.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
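
  The single-writer close-and-drain idiom, in generic form:

  ```go
  package device

  // The outbound channel has a single writer, so that writer alone may close it.
  func sender(out chan<- []byte, packets [][]byte) {
  	defer close(out)
  	for _, p := range packets {
  		out <- p
  	}
  }

  // The reader handles everything still queued and exits once the channel is
  // closed and empty; no extra stop signal or mutex is needed.
  func receiver(in <-chan []byte, handle func([]byte)) {
  	for p := range in {
  		handle(p)
  	}
  }
  ```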

* device: fix persistent_keepalive_interval data races (Josh Bleecher Snyder, 2021-01-07; 1 file, -1/+1)

  Co-authored-by: David Anderson <danderson@tailscale.com>
  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: prevent spurious errors while closing a device (Josh Bleecher Snyder, 2021-01-07; 1 file, -0/+5)

  When closing a device, packets that are in flight can make it to SendBuffer, which then returns an error. Those errors add noise but no light; they do not reflect an actual problem.

  Adding the synchronization required to prevent this from occurring is currently expensive and error-prone. Instead, quietly drop such packets rather than returning an error.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: remove starting waitgroups (Josh Bleecher Snyder, 2021-01-07; 1 file, -7/+1)

  In each case, the starting waitgroup did nothing but ensure that the goroutine has launched. Nothing downstream depends on the order in which goroutines launch, and if the Go runtime scheduler is so broken that goroutines don't get launched reasonably promptly, we have much deeper problems.

  Given all that, simplify the code.

  Passed a race-enabled stress test 25,000 times without failure.

  Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: add write queue mutex for peer (Haichao Liu, 2020-11-18; 1 file, -1/+5)

  This fixes a panic ("send on closed channel") when removing a peer.

  Signed-off-by: Haichao Liu <liuhaichao@bytedance.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: format a few things (Jason A. Donenfeld, 2020-11-06; 1 file, -1/+0)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: remove global for roaming escape hatch (Jason A. Donenfeld, 2020-10-14; 1 file, -2/+2)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* global: update header comments and modules (Jason A. Donenfeld, 2020-05-02; 1 file, -1/+1)

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: use atomic access for unlocked keypair.next (Jason A. Donenfeld, 2020-05-02; 1 file, -3/+3)

  Go's GC semantics might not always guarantee the safety of this, and the race detector gets upset too, so instead we wrap this all in atomic accessors.

  Reported-by: David Anderson <danderson@tailscale.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* conn: introduce new package that splits out the Bind and Endpoint types (David Crawshaw, 2020-05-02; 1 file, -2/+4)

  The sticky socket code stays in the device package for now, as it reaches deeply into the peer list.

  This is the first step in an effort to split some code out of the very busy device package.

  Signed-off-by: David Crawshaw <crawshaw@tailscale.com>

* device: add test to ensure Peer fields are safe for atomic access on 32-bit (David Anderson, 2020-05-02; 1 file, -1/+5)

  Adds a test that will fail consistently on 32-bit platforms if the struct ever changes again to violate the rules. This is likely not needed because unaligned access crashes reliably, but this will reliably fail even if tests accidentally pass due to lucky alignment.

  Signed-Off-By: David Anderson <danderson@tailscale.com>
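
  A sketch of such a guard test, using an invented struct in place of Peer: it asserts that the 64-bit field sits at an 8-byte-aligned offset, so misalignment fails the test deterministically instead of depending on a runtime crash:

  ```go
  package device

  import (
  	"testing"
  	"unsafe"
  )

  // examplePeer stands in for the real Peer struct; the guard is what matters.
  type examplePeer struct {
  	isRunning bool
  	_         [7]byte // padding that keeps the counter 8-byte aligned
  	txBytes   uint64  // accessed with 64-bit atomics, including on 32-bit CPUs
  }

  func TestExamplePeerAlignment(t *testing.T) {
  	var p examplePeer
  	if off := unsafe.Offsetof(p.txBytes); off%8 != 0 {
  		t.Errorf("txBytes at offset %d; 64-bit atomic fields must be 8-byte aligned", off)
  	}
  }
  ```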

* noise: unify zero checking of ecdh (Jason A. Donenfeld, 2020-03-17; 1 file, -7/+2)

* uapi: skip peers with invalid keys (Jason A. Donenfeld, 2019-08-05; 1 file, -3/+10)

* device: immediately rekey all peers after changing device private key (Jason A. Donenfeld, 2019-07-11; 1 file, -0/+19)

  Reported-by: Derrick Pallas <derrick@pallas.us>

* device: update transfer counters correctly (Jason A. Donenfeld, 2019-06-11; 1 file, -1/+6)

  The rule is to always update them to the full packet size minus UDP/IP encapsulation for all authenticated packet types.