path: root/device
...
* device: remove mutex from Peer send/receive (Josh Bleecher Snyder, 2021-02-08, 4 files, -16/+80)

The immediate motivation for this change is an observed deadlock:

1. A goroutine calls peer.Stop. That calls peer.queue.Lock().
2. Another goroutine is in RoutineSequentialReceiver. It receives an elem from peer.queue.inbound.
3. The peer.Stop goroutine calls close(peer.queue.inbound), close(peer.queue.outbound), and peer.stopping.Wait(). It blocks waiting for RoutineSequentialReceiver and RoutineSequentialSender to exit.
4. The RoutineSequentialReceiver goroutine calls peer.SendStagedPackets(). SendStagedPackets attempts peer.queue.RLock(). That blocks forever, because the peer.Stop goroutine holds a write lock on that mutex.

A background motivation for this change is that it can be expensive to have a mutex in the hot code path of RoutineSequential*.

The mutex was necessary to avoid attempting to send elems on a closed channel. This commit removes that danger by never closing the channel. Instead, we send a sentinel nil value on the channel to indicate to the receiver that it should exit.

The only problem with this is that if the receiver exits, we could write an elem into the channel that would never get received. If it never gets received, it cannot be returned to the device pools. To work around this, we use a finalizer: when the channel can be GC'd, the finalizer drains any remaining elements from the channel and restores them to the device pool.

After that change, peer.queue.RWMutex no longer makes sense where it is. It is only used to prevent concurrent calls to Start and Stop. Move it to a more sensible location and make it a plain sync.Mutex.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
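[Editor's sketch] The nil-sentinel-plus-finalizer pattern is subtle, so here is a minimal Go sketch of it. The element, pool, and queue types are illustrative stand-ins, not the actual wireguard-go definitions; note that the finalizer must go on a wrapper struct, since finalizers cannot be attached to channels directly.

    package queue

    import "runtime"

    type element struct{ buffer []byte }

    type pool struct{ c chan *element }

    func (p *pool) put(e *element) { p.c <- e }

    // queue wraps the channel so a finalizer can be attached to it.
    type queue struct {
        c chan *element
    }

    func newAutodrainingQueue(p *pool) *queue {
        q := &queue{c: make(chan *element, 128)}
        // Once q becomes unreachable, drain any stranded elements back
        // into the pool so they are not lost. The channel is never closed.
        runtime.SetFinalizer(q, func(q *queue) {
            for {
                select {
                case e := <-q.c:
                    if e != nil {
                        p.put(e)
                    }
                default:
                    return
                }
            }
        })
        return q
    }

    // The receiver treats a nil element as the signal to exit, so senders
    // never need to close the channel.
    func receive(q *queue, process func(*element)) {
        for {
            e := <-q.c
            if e == nil {
                return // sentinel: shut down
            }
            process(e)
        }
    }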
* device: create channels.go (Josh Bleecher Snyder, 2021-02-08, 2 files, -61/+69)

We have a bunch of stupid channel tricks, and I'm about to add more. Give them their own file. This commit is 100% code movement.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: print direction when ping transit fails (Josh Bleecher Snyder, 2021-02-08, 1 file, -3/+9)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: separate timersInit from timersStart (Josh Bleecher Snyder, 2021-02-08, 2 files, -5/+7)

timersInit sets up the timers. It need only be done once per peer.

timersStart does the work to prepare the timers for a newly running peer. It needs to be done every time a peer starts.

Separate the two and call them in the appropriate places. This prevents data races on the peer's timers fields when starting and stopping peers.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
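[Editor's sketch] A minimal illustration of the init/start split, with stand-in fields rather than the real timer set: allocation happens exactly once at peer creation, so repeated start/stop cycles only ever touch per-run state.

    package timers

    import "time"

    // Peer is an illustrative stand-in for the real peer type.
    type Peer struct {
        timers struct {
            retransmitHandshake *time.Timer
            sendKeepalive       *time.Timer
            handshakeAttempts   int
        }
    }

    // timersInit allocates the timers. It runs once, at peer creation, so
    // Start/Stop cycles never race to create or replace timer objects.
    func (peer *Peer) timersInit() {
        peer.timers.retransmitHandshake = time.NewTimer(time.Hour)
        peer.timers.retransmitHandshake.Stop()
        peer.timers.sendKeepalive = time.NewTimer(time.Hour)
        peer.timers.sendKeepalive.Stop()
    }

    // timersStart resets only per-run state. It runs on every start and
    // touches nothing that timersInit owns.
    func (peer *Peer) timersStart() {
        peer.timers.handshakeAttempts = 0
    }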
* device: don't track device interface state in RoutineTUNEventReader (Josh Bleecher Snyder, 2021-02-08, 1 file, -7/+4)

We already track this state elsewhere. No need to duplicate. The cost of calling changeState is negligible.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: improve MTU change handling (Josh Bleecher Snyder, 2021-02-08, 1 file, -8/+15)

The old code silently accepted negative MTUs. It also set MTUs above the maximum. It also had hard-to-follow, deeply nested conditionals.

Add more paranoid handling, and make the code more straight-line.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
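[Editor's sketch] A sketch of the straight-line, paranoid handling described above. All names here (handleMTUEvent, tunMTU, maxContentSize, logf) are illustrative, not the real wireguard-go ones.

    package mtu

    import (
        "fmt"
        "sync/atomic"
    )

    // handleMTUEvent applies an MTU update defensively: reject negative
    // values, cap values above the maximum, and log only when the stored
    // value actually changed.
    func handleMTUEvent(newMTU int, tunMTU *int32, maxContentSize int, logf func(string, ...interface{})) {
        if newMTU < 0 {
            logf("MTU not updated to negative value: %v", newMTU)
            return
        }
        var capped string
        if newMTU > maxContentSize {
            capped = fmt.Sprintf(" (too large, capped at %v)", maxContentSize)
            newMTU = maxContentSize
        }
        old := atomic.SwapInt32(tunMTU, int32(newMTU))
        if int(old) != newMTU {
            logf("MTU updated: %v%v", newMTU, capped)
        }
    }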
* device: remove device.state.stopping from RoutineTUNEventReader (Josh Bleecher Snyder, 2021-02-08, 2 files, -2/+1)

The TUN event reader does three things: change the MTU, bring the device up, and take the device down. Changing the MTU after the device is closed does no harm. Device up and device down don't make sense after the device is closed, but we can check that condition before proceeding with changeState. There's thus no reason to block device.Close on RoutineTUNEventReader exiting.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: overhaul device state management (Josh Bleecher Snyder, 2021-02-08, 8 files, -139/+188)

This commit simplifies device state management. It creates a single unified state variable and documents its semantics. It also makes state changes more atomic. As an example of the sort of bug that occurred due to non-atomic state changes, the following sequence of events used to occur approximately every 2.5 million test runs:

- RoutineTUNEventReader received an EventDown event.
- It called device.Down, which called device.setUpDown.
- That set device.state.changing, but did not yet attempt to lock device.state.Mutex.
- Test completion called device.Close.
- device.Close locked device.state.Mutex.
- device.Close blocked on a call to device.state.stopping.Wait.
- device.setUpDown then attempted to lock device.state.Mutex and blocked.

Deadlock results. setUpDown cannot progress because device.state.Mutex is locked. Until setUpDown returns, RoutineTUNEventReader cannot call device.state.stopping.Done. Until device.state.stopping.Done gets called, device.state.stopping.Wait is blocked. As long as device.state.stopping.Wait is blocked, device.state.Mutex cannot be unlocked.

This commit fixes that deadlock by holding device.state.mu when checking that the device is not closed.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
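[Editor's sketch] A sketch of what a single, documented state variable can look like, with illustrative state names: reads are atomic, writes happen only under a mutex, so check-then-act sequences performed while holding the mutex cannot interleave with a concurrent Close.

    package state

    import (
        "sync"
        "sync/atomic"
    )

    type deviceState uint32

    const (
        // deviceStateDown: the device is inactive but may be brought up.
        deviceStateDown deviceState = iota
        // deviceStateUp: the device is passing traffic.
        deviceStateUp
        // deviceStateClosed: terminal; the device can never be used again.
        deviceStateClosed
    )

    type Device struct {
        state struct {
            mu sync.Mutex // guards state transitions
            // state is read atomically anywhere, but written only with
            // mu held, so check-then-act under mu is race-free.
            state uint32
        }
    }

    func (device *Device) deviceState() deviceState {
        return deviceState(atomic.LoadUint32(&device.state.state))
    }

    // Callers that must not race with Close hold device.state.mu while
    // performing this check and then acting on its result.
    func (device *Device) isClosed() bool {
        return device.deviceState() == deviceStateClosed
    }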
* device: remove unnecessary zeroing in peer.SendKeepalive (Josh Bleecher Snyder, 2021-02-08, 1 file, -1/+0)

elem.packet is always already nil.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: remove device.state.stopping from RoutineHandshake (Josh Bleecher Snyder, 2021-02-08, 2 files, -5/+1)

It is no longer necessary.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: remove device.state.stopping from RoutineDecryption (Josh Bleecher Snyder, 2021-02-08, 2 files, -5/+3)

It is no longer necessary, as of 454de6f3e64abd2a7bf9201579cd92eea5280996 (device: use channel close to shut down and drain decryption channel).

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: take peer handshake when reinitializing last sent handshake (Jason A. Donenfeld, 2021-02-03, 1 file, -1/+4)

This papers over other unrelated races, unfortunately.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: fix goroutine leak test (Josh Bleecher Snyder, 2021-02-03, 1 file, -8/+9)

The leak test had rare flakes. If a system goroutine started at just the wrong moment, you'd get a false positive.

Instead of looping until the goroutines look good and then checking, exit completely as soon as the number of goroutines looks good. Also, check more frequently, in an attempt to complete faster.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
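[Editor's sketch] A sketch of that strategy as an illustrative test, not the actual one: record a baseline, then return the moment the goroutine count drops back to it, polling frequently instead of sleeping long and checking once at the end.

    package device

    import (
        "runtime"
        "testing"
        "time"
    )

    func TestNoGoroutineLeaks(t *testing.T) {
        baseline := runtime.NumGoroutine()

        // ... start and stop a device here ...

        for i := 0; i < 1000; i++ {
            if runtime.NumGoroutine() <= baseline {
                return // success the moment the count looks good
            }
            time.Sleep(time.Millisecond)
        }
        t.Fatalf("expected %d goroutines, got %d", baseline, runtime.NumGoroutine())
    }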
* device: add up/down stress test (Jason A. Donenfeld, 2021-02-03, 1 file, -0/+35)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: pass cfg strings around in tests instead of reader (Jason A. Donenfeld, 2021-02-03, 1 file, -9/+7)

This makes it easier to tag things onto the end manually for quick hacks.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: benchmark the waitpool to compare it to the prior channels (Jason A. Donenfeld, 2021-02-03, 1 file, -0/+23)

Here is the old implementation:

    type WaitPool struct {
        c chan interface{}
    }

    func NewWaitPool(max uint32, new func() interface{}) *WaitPool {
        p := &WaitPool{c: make(chan interface{}, max)}
        for i := uint32(0); i < max; i++ {
            p.c <- new()
        }
        return p
    }

    func (p *WaitPool) Get() interface{} {
        return <-p.c
    }

    func (p *WaitPool) Put(x interface{}) {
        p.c <- x
    }

It performs worse than the new one:

    name         old time/op  new time/op  delta
    WaitPool-16  16.4µs ± 5%  15.1µs ± 3%  -7.86%  (p=0.008 n=5+5)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: test that we do not leak goroutines (Josh Bleecher Snyder, 2021-02-03, 1 file, -0/+31)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: tie encryption queue lifetime to the peers that write to it (Josh Bleecher Snyder, 2021-02-03, 3 files, -4/+6)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: use a waiting sync.Pool instead of a channel (Jason A. Donenfeld, 2021-02-02, 4 files, -67/+116)

Channels are FIFO, which means we have guaranteed cache misses.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
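[Editor's sketch] A sketch of a waiting pool built this way; it closely follows the shape these commits describe, but treat the details as illustrative. Get blocks once max elements are checked out, and sync.Pool's per-P, roughly LIFO caching tends to hand back recently used, cache-warm buffers, unlike a FIFO channel.

    package pools

    import (
        "sync"
        "sync/atomic"
    )

    type WaitPool struct {
        pool  sync.Pool
        cond  *sync.Cond
        lock  sync.Mutex
        count uint32 // outstanding elements, accessed atomically
        max   uint32
    }

    func NewWaitPool(max uint32, new func() interface{}) *WaitPool {
        p := &WaitPool{pool: sync.Pool{New: new}, max: max}
        p.cond = sync.NewCond(&p.lock)
        return p
    }

    func (p *WaitPool) Get() interface{} {
        if p.max != 0 {
            p.lock.Lock()
            for atomic.LoadUint32(&p.count) >= p.max {
                p.cond.Wait() // wait for a Put to free a slot
            }
            atomic.AddUint32(&p.count, 1)
            p.lock.Unlock()
        }
        return p.pool.Get()
    }

    func (p *WaitPool) Put(x interface{}) {
        p.pool.Put(x)
        if p.max == 0 {
            return
        }
        atomic.AddUint32(&p.count, ^uint32(0)) // decrement
        p.cond.Signal()
    }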
* device: reduce number of append calls when padding (Jason A. Donenfeld, 2021-01-29, 1 file, -5/+2)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: use int64 instead of atomic.Value for time stamp (Jason A. Donenfeld, 2021-01-29, 2 files, -13/+27)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: use new model queues for handshakes (Jason A. Donenfeld, 2021-01-29, 2 files, -79/+52)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: simplify peer queue locking (Jason A. Donenfeld, 2021-01-29, 4 files, -147/+70)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: reduce nesting when staging packet (Jason A. Donenfeld, 2021-01-28, 1 file, -6/+6)

Suggested-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* global: bump copyright (Jason A. Donenfeld, 2021-01-28, 34 files, -34/+34)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

* device: do not allow get to run while set runs (Jason A. Donenfeld, 2021-01-28, 2 files, -3/+7)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: avoid hex allocations in IpcGet (Jason A. Donenfeld, 2021-01-28, 2 files, -15/+14)

    benchmark            old ns/op   new ns/op   delta
    BenchmarkUAPIGet-16  2872        2157        -24.90%

    benchmark            old allocs  new allocs  delta
    BenchmarkUAPIGet-16  30          18          -40.00%

    benchmark            old bytes   new bytes   delta
    BenchmarkUAPIGet-16  737         256         -65.26%

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
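[Editor's sketch] One way to get savings like these is to hex-encode keys nibble-by-nibble into a bytes.Buffer rather than going through fmt's %x, which allocates on every call. A sketch with illustrative names (keyf, KeyLen):

    package uapi

    import "bytes"

    const KeyLen = 32

    // keyf writes "prefix=<hex key>\n" without any per-call allocation
    // beyond the buffer's own growth.
    func keyf(buf *bytes.Buffer, prefix string, key *[KeyLen]byte) {
        buf.Grow(len(prefix) + 1 + 2*KeyLen + 1)
        buf.WriteString(prefix)
        buf.WriteByte('=')
        const hex = "0123456789abcdef"
        for i := 0; i < KeyLen; i++ {
            buf.WriteByte(hex[key[i]>>4])
            buf.WriteByte(hex[key[i]&0xf])
        }
        buf.WriteByte('\n')
    }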
* device: the psk is not a chapoly key (Jason A. Donenfeld, 2021-01-28, 2 files, -8/+7)

It's a separate type of key that gets hashed into the chain.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: get rid of nonce routine (Jason A. Donenfeld, 2021-01-27, 8 files, -167/+72)

This moves to a simple queue with no routine processing it, to reduce scheduler pressure. This splits latency in half!

    benchmark               old ns/op  new ns/op  delta
    BenchmarkThroughput-16  2394       2364       -1.25%
    BenchmarkLatency-16     259652     120810     -53.47%

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: avoid deadlock when changing private key and removing self peers (Jason A. Donenfeld, 2021-01-27, 1 file, -0/+2)

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: use linked list for per-peer allowed-ip traversal (Jason A. Donenfeld, 2021-01-27, 4 files, -44/+62)

This makes the IpcGet method much faster.

We also refactor the traversal API to use a callback so that we don't need to allocate at all. To avoid allocations, we self-mask on insertion, which in turn means that split intermediate nodes require a copy of the bits.

    benchmark            old ns/op   new ns/op   delta
    BenchmarkUAPIGet-16  3243        2659        -18.01%

    benchmark            old allocs  new allocs  delta
    BenchmarkUAPIGet-16  35          30          -14.29%

    benchmark            old bytes   new bytes   delta
    BenchmarkUAPIGet-16  1218        737         -39.49%

This benchmark is good, though it's only for a pair of peers, each with only one allowed IP. As this grows, the delta expands considerably.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
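[Editor's sketch] A sketch of a callback-based walk over a per-peer linked list; because results are delivered through the callback, no result slice is ever allocated. Types and names are illustrative.

    package allowedips

    import "net"

    type trieEntry struct {
        ip   net.IP
        cidr uint8
        next *trieEntry // per-peer linked list of owned entries
    }

    type Peer struct {
        firstEntry *trieEntry
    }

    // EntriesForPeer walks the peer's entries, invoking cb for each one.
    // Returning false from cb stops the walk early.
    func EntriesForPeer(peer *Peer, cb func(ip net.IP, cidr uint8) bool) {
        for e := peer.firstEntry; e != nil; e = e.next {
            if !cb(e.ip, e.cidr) {
                return
            }
        }
    }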
* device: combine debug and info log levels into 'verbose' (Jason A. Donenfeld, 2021-01-26, 10 files, -116/+96)

There are very few cases, if any, in which a user only wants one of these levels, so combine them into a single level.

While we're at it, reduce indirection on the loggers by using an empty function rather than a nil function pointer. It's not like we have retpolines anyway, and we were always calling through a function with a branch prior, so this seems like a net gain.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* device: change logging interface to use functions (Josh Bleecher Snyder, 2021-01-26, 9 files, -164/+130)

This commit overhauls wireguard-go's logging.

The primary, motivating change is to use a function instead of a *log.Logger as the basic unit of logging. Using functions provides a lot more flexibility for people to bring their own logging system.

It also introduces logging helper methods on Device. These reduce line noise at the call site. They also allow for log functions to be nil; when nil, instead of generating a log line and throwing it away, we don't bother generating it at all. This spares allocation and pointless work.

This is a breaking change, although the fix required of clients is fairly straightforward.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
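[Editor's sketch] A sketch of the function-based shape this describes. The Verbosef/Errorf field names follow the commit's direction; treat the rest as illustrative.

    package logging

    import (
        "log"
        "os"
    )

    // Logger is built from plain functions, so callers can plug in any backend.
    type Logger struct {
        Verbosef func(format string, args ...interface{})
        Errorf   func(format string, args ...interface{})
    }

    type Device struct {
        log *Logger
    }

    // errorf tolerates a nil log function: when nil, the message is never
    // even formatted, sparing the allocation entirely.
    func (device *Device) errorf(format string, args ...interface{}) {
        if device.log.Errorf != nil {
            device.log.Errorf(format, args...)
        }
    }

    // NewStdLogger adapts the standard library logger.
    func NewStdLogger(prepend string) *Logger {
        std := log.New(os.Stderr, prepend, log.Ldate|log.Ltime)
        return &Logger{Verbosef: std.Printf, Errorf: std.Printf}
    }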
* device: fix shadowing of err in IpcHandle (Josh Bleecher Snyder, 2021-01-26, 1 file, -1/+2)

The declaration of err in

    nextByte, err := buffered.ReadByte()

shadows the declaration of err in

    op, err := buffered.ReadString('\n')

above. As a result, the assignments to err in

    err = ipcErrorf(ipc.IpcErrorInvalid, "trailing character in UAPI get: %c", nextByte)

and in

    err = device.IpcGetOperation(buffered.Writer)

do not modify the correct err variable.

Found by staticcheck.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
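[Editor's sketch] A minimal illustration of this bug class, not the actual IpcHandle code: := in an inner scope declares a new err, so the outer err never sees the failure.

    package main

    import "fmt"

    func read() (byte, error) { return 0, fmt.Errorf("boom") }

    func main() {
        var err error
        {
            b, err := read() // := declares a NEW err, visible only in this block
            _, _ = b, err
        }
        // The outer err was never assigned; the failure above is invisible here.
        fmt.Println(err) // <nil>

        // The fix: declare the extra variable separately and assign with plain =.
        var b byte
        b, err = read()
        _ = b
        fmt.Println(err) // boom
    }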
* device: remove extra error arg (Josh Bleecher Snyder, 2021-01-26, 1 file, -1/+1)

Caught by go vet.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: reduce allocs in Device.IpcGetOperation (Brad Fitzpatrick, 2021-01-26, 1 file, -23/+27)

Plenty more to go, but a start:

    name       old time/op    new time/op    delta
    UAPIGet-4  6.37µs ± 2%    5.56µs ± 1%    -12.70%  (p=0.000 n=8+8)

    name       old alloc/op   new alloc/op   delta
    UAPIGet-4  1.98kB ± 0%    1.22kB ± 0%    -38.71%  (p=0.000 n=10+10)

    name       old allocs/op  new allocs/op  delta
    UAPIGet-4  42.0 ± 0%      35.0 ± 0%      -16.67%  (p=0.000 n=10+10)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>

* device: add benchmark for UAPI Device.IpcGetOperation (Josh Bleecher Snyder, 2021-01-26, 1 file, -0/+12)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: allow pipelining UAPI requests (Jason A. Donenfeld, 2021-01-25, 1 file, -30/+36)

The original spec ends requests with \n\n precisely for this reason.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
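[Editor's sketch] A sketch of a pipelined handler built on that framing: each request body is terminated by a blank line, so the handler can loop and serve multiple requests over one connection instead of hanging up after the first. Names and op handling are simplified relative to the real IpcHandle.

    package uapi

    import (
        "bufio"
        "io"
        "net"
    )

    func handle(conn net.Conn, get func(io.Writer) error, set func(io.Reader) error) {
        defer conn.Close()
        buffered := bufio.NewReadWriter(bufio.NewReader(conn), bufio.NewWriter(conn))
        for {
            op, err := buffered.ReadString('\n')
            if err != nil {
                return // EOF or I/O error ends the session
            }
            switch op {
            case "get=1\n":
                err = get(buffered.Writer)
            case "set=1\n":
                err = set(buffered.Reader) // reads until the blank line
            default:
                return
            }
            if err != nil {
                return
            }
            buffered.Flush() // ready for the next pipelined request
        }
    }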
* device: serialize access to IpcSetOperation (Josh Bleecher Snyder, 2021-01-25, 2 files, -0/+4)

Interleaved IpcSetOperations would spell trouble.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: simplify handling of IPC set endpoint (Josh Bleecher Snyder, 2021-01-25, 1 file, -12/+4)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: remove close processing fwmark (Josh Bleecher Snyder, 2021-01-25, 1 file, -11/+2)

Also, a behavior change: stop treating a blank value as 0. It's not in the spec.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: remove unnecessary comment (Josh Bleecher Snyder, 2021-01-25, 1 file, -1/+0)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: introduce new IPC error message for unknown error (Josh Bleecher Snyder, 2021-01-25, 1 file, -2/+2)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: correct IPC error number for I/O errors (Josh Bleecher Snyder, 2021-01-25, 1 file, -1/+4)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: simplify IpcHandle error handling (Josh Bleecher Snyder, 2021-01-25, 1 file, -15/+6)

Unify the handling of unexpected UAPI errors. The comment that says "should never happen" is incorrect; this could happen due to I/O errors. Correct it.

Change error message capitalization for consistency.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
* device: split IpcSetOperation into parts (Josh Bleecher Snyder, 2021-01-25, 1 file, -204/+198)

The goal of this change is to make the structure of IpcSetOperation easier to follow.

IpcSetOperation contains a small state machine: It starts by configuring the device, then shifts to configuring one peer at a time. Having the code all in one giant method obscured that structure.

Split out the parts into helper functions and encapsulate the peer state. This makes the overall structure more apparent.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
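[Editor's sketch] A sketch of that shape, with illustrative helper names and a simplified peer wrapper; the real wireguard-go types differ in detail.

    package uapi

    import "fmt"

    type Peer struct{ publicKey string }

    // ipcSetPeer carries the state for the peer currently being configured.
    type ipcSetPeer struct {
        *Peer
        created bool // newly created; may need post-configuration work
    }

    // handleDeviceLine applies a device-level key (illustrative subset).
    func handleDeviceLine(key, value string) error {
        switch key {
        case "private_key", "listen_port", "fwmark":
            return nil // apply the setting
        }
        return fmt.Errorf("invalid UAPI device key: %v", key)
    }

    // handlePeerLine applies a peer-level key. A public_key line advances
    // the state machine to a new current peer.
    func handlePeerLine(peer *ipcSetPeer, key, value string) error {
        if key == "public_key" {
            *peer = ipcSetPeer{Peer: &Peer{publicKey: value}, created: true}
            return nil
        }
        if peer.Peer == nil {
            return fmt.Errorf("peer key %q appeared before any public_key", key)
        }
        return nil // apply the setting to the current peer
    }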
* device: expand IPCError (Josh Bleecher Snyder, 2021-01-25, 1 file, -51/+43)

Expand IPCError to contain a wrapped error, and add a helper to make constructing such errors easier.

Add a defer-based "log on returned error" to IpcSetOperation. This lets us simplify all of the error return paths.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
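[Editor's sketch] A sketch of a wrapping IPC error plus the defer-based log-on-error pattern; the ipcErrorf name follows the commit text, the rest is illustrative.

    package uapi

    import (
        "fmt"
        "log"
    )

    type IPCError struct {
        code int64
        err  error // wrapped cause
    }

    func (e IPCError) Error() string    { return fmt.Sprintf("ipc error %d: %v", e.code, e.err) }
    func (e IPCError) Unwrap() error    { return e.err }
    func (e IPCError) ErrorCode() int64 { return e.code }

    // ipcErrorf builds an IPCError with a formatted, wrapped cause.
    func ipcErrorf(code int64, format string, args ...interface{}) *IPCError {
        return &IPCError{code: code, err: fmt.Errorf(format, args...)}
    }

    func IpcSetOperation() (err error) {
        // One deferred logger replaces a log call at every error return path.
        defer func() {
            if err != nil {
                log.Printf("UAPI error: %v", err)
            }
        }()
        // ... parse and apply configuration; any `return ipcErrorf(...)`
        // below is logged automatically by the deferred function ...
        return nil
    }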
* device: remove dead code (Josh Bleecher Snyder, 2021-01-25, 1 file, -6/+1)

If device.NewPeer returns a nil error, then the returned peer is always non-nil.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: return errors from ipc scanner (Josh Bleecher Snyder, 2021-01-25, 1 file, -1/+1)

The code as written will drop any read errors on the floor. Fix that.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>

* device: allow compiling with Go 1.15 (Jason A. Donenfeld, 2021-01-20, 1 file, -1/+1)

Until we depend on Go 1.16 (which isn't released yet), alias our own variable to the private member of the net package. This will allow an easy find-and-replace to make this go away when we eventually switch to 1.16.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>