wireguard-openbsd - WireGuard implementation for the OpenBSD kernel

	Commit message (Collapse)	Author	Age	Files	Lines
*	Bump keepalive timers unconditionally on sendHEAD master	Jason A. Donenfeld	2021-10-26	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The keepalive timers -- both persistent and mandatory -- are part of the internal state machine, which needs to be cranked whether or not the packet was actually sent. A packet might be dropped by the network. Or the packet might be dropped by the local network stack. The latter case gives a hint -- which is useful for the data_sent event -- but is harmful to consider for the keepalive state machine. So, crank those timers before even calling wg_send. Incidentally, doing it this way matches exactly what Linux's send.c's wg_packet_create_data_done and Go's send.go's RoutineSequentialSender do too. Suggested-by: Kyle Evans <kevans@freebsd.org> Reported-by: Ryan Roosa <ryanroosa@gmail.com>
*	Delete all peer allowed IPs at once	Matt Dunwoodie	2021-04-13	1	-43/+34
\| \| \| \| \|	This simplifies the deletion process, so we do not require a lookup of the node before deletion.
*	Merge wg_timers and wg_peer	Matt Dunwoodie	2021-04-13	1	-180/+155
\| \| \| \| \| \| \| \|	The primary motivator here is to get rid of CONTAINER_OF, which is quite an ugly macro. However, any reader should also be aware of the change from d_DISabled to p_ENabled.
*	Replace timer lock with SMR	Matt Dunwoodie	2021-04-13	1	-36/+31
\| \| \| \| \| \| \| \|	The lock was not used to protect any data structures, it was purely to ensure race-free setting of t_disabled. That is, that no other thread was halfway through any wg_timers_run_* function. With smr_* we can ensure this is still the case by calling smr_barrier() after setting t_disabled.
*	Run all timeouts in process context	Matt Dunwoodie	2021-04-13	1	-32/+20
\| \| \| \| \| \| \|	So the reason timeouts were running in interrupt context was because it was quicker. Running in process context required a `task` to be added, which we ended up doing anyway. So we might as well rely on timeout API to do it for us.
*	Use malloc instead of pool_* for infrequent allocations	Matt Dunwoodie	2021-04-13	1	-13/+6
\| \| \| \| \| \| \|	We can get rid of the pool overhead by using the malloc family of functions. This does lose us the ability to see directly how much each allocation is using, but it if we really want that, maybe we add new malloc types? Either way, not something we need at the moment.
*	Use SMR for wg_noise	Matt Dunwoodie	2021-04-13	3	-1313/+1089
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While the largest change here is to use SMR for wg_noise, this was motivated by other deficiencies in the module. Primarily, the nonce operations should be performed in serial (wg_queue_out, wg_deliver_in) and not parallel (wg_encap, wg_decap). This also brings in a lock-free encrypt and decrypt path, which is nice. I suppose other improvements are that local, remote and keypair structs are opaque, so no more reaching in and fiddling with things. Unfortunately, these changes make abuse of the API easier (such as calling noise_keypair_encrypt on a keypair retrieved with noise_keypair_lookup (instead of noise_keypair_current) as they have different checks). Additionally, we have to trust that the nonce passed to noise_keypair_encrypt is non repeating (retrieved with noise_keypair_nonce_next), and noise_keypair_nonce_check is valid on received nonces. One area that could use a little bit more adjustment is the _free functions. They are used to call a function once it is safe to free a parent datastructure (one holding struct noise_{local,remote} ). This is currently used for lifetimes in the system and allows a consumer of wg_noise to opaquely manage lifetimes based on the reference counting of noise, remote and keypair. It is fine for now, but maybe revisit later.
*	Check iter != NULL	Matt Dunwoodie	2021-04-13	1	-2/+2
\| \| \| \| \| \| \| \| \|	The problem with checking peer != NULL is that we already dereference iter to get i_value. This is what was caught in the index == 0 bug reported on bugs@. Instead, we should assert that iter != NULL. This is likely to be removed when adjusting wg_noise.c in the not to distant future.
*	Allow setting keepalive while interface is down	Matt Dunwoodie	2021-04-13	1	-3/+4
\|
*	Rework encap/decap routines	Matt Dunwoodie	2021-04-13	1	-87/+84
\| \| \| \| \| \| \|	This will make further work on in place decryption a lot easier. Additionally, it improves the readability as we can get rid of the difficult _len variables. The copy in and out of wg_pkt_data is also a cleaner solution than memcpy nonces and whatnot.
*	Replace wg_tag with wg_packet	Matt Dunwoodie	2021-04-04	1	-291/+292
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'll be the first to admit (but not the first to complain) about the wg_tag situation. It made it very difficult to manage mbufs (that may be reallocated with functions such as m_pullup). It was also not clear where allocation was occuring. This also gets rid of the ring buffers in wg_softc, which added no performance in this situation. They also used memory unnecessarily and increased the complexity. I also used this opportunity to get rid of the confusing t_mbuf/t_done situation and revert to a more understandable UNCRYPTED/CRYPTED/DEAD packet state. I don't believe there were any issues with the old style, but to improve readability is always a welcome addition. With these changes we can start encrypting packets in place (rather than copying to a new mbuf), which should increase performance. This also simplifies length calculations by using m_* functions and reading the pkthdr length.
*	Count all handshake packets	Matt Dunwoodie	2021-04-04	1	-2/+1
\|
*	Satisfy my ordering of struct elements and prototoypes	Matt Dunwoodie	2021-04-04	1	-3/+3
\|
*	Expand on key clearing message	Matt Dunwoodie	2021-04-04	1	-1/+3
\|
*	Error out if peer provider without public key	Matt Dunwoodie	2021-04-04	1	-2/+4
\|
*	Ensure a peer has a consistent PSK (if set when creating)	Matt Dunwoodie	2021-04-04	3	-12/+13
\|
*	Add noise_local_deinit to zero private keys	Matt Dunwoodie	2021-04-04	3	-0/+10
\|
*	Push kernel lock within rtable_add(9) and rework it to return 0 in the	mvs	2021-03-26	2	-10/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	case when requested table is already exists. Except initialization time, route_output() and if_createrdomain() are the only paths where we call rtable_add(9). We check requested table existence by rtable_exists(9) and it's not the error condition if the table exists. Otherwise we are trying to create requested table by rtable_add(9). Those paths are kernel locked so concurrent thread can't create requested table just after rtable_exists(9) check. Also rtable_add(9) has internal rtable_exists(9) check and in this case the table existence assumed as EEXIST error. This error path is never reached. We are going to unlock PF_ROUTE sockets. This means route_output() will not be serialized with if_createrdomain() and concurrent thread could create requested table. Table existence check and creation should be serialized and it makes sense to do this within rtable_add(9). This time kernel lock is used for this so it pushed down to rtable_add(9). The internal rtable_exists(9) check was modified and table existence is not error now. Since the external rtable_exists(9) check is useless it was removed from if_createrdomain(). It still exists in route_output() path because the logic is more complicated here. ok mpi@
*	Push kernel lock down to rt_setsource() to make `ifa' dereference safe.	mvs	2021-03-26	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \|	Netlock doesn't make sense here because ifa_ifwithaddr() holds kernel lock while performs lists walkthrough. This was made to decrease the future diff for PF_ROUTE sockets unlocking. This time kernel lock is still held while we perform rt_setsource(). ok mpi@
*	Only install route with label, fix route leak on destroy	kn	2021-03-26	3	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ifconfig mp* mplslabel N" validates the label both in ifconfig(8) and each driver's ioctl handler, but there is one case where all drivers install a route without looking at the label at all. SIOCSLIFPHYRTABLE in all three drivers just validates the rdomain and sets the label to itself (0) such that the route is (re)installed accordingly. None of the driver's helper functions dealing with labels and routes validate labels themselves but instead expect the callees, e.g. the ioctl handler to do so. That means we can install routes for the explicit NULL label in non-default routing tables but are never able to clean them up without reboot. Fix this by adding the inverse of mp*_clone_destroy()'s label check to the routines installing the MPLS route to avoid bogus ones in the first place. OK claudio
*	wg(4): fix race between tx/rx handshakes, from Matt Dunwoodie, ok mpi@	sthen	2021-03-21	1	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"There is a race between sending/receiving handshake packets. This occurs if we consume an initiation, then send an initiation prior to replying to the consumed initiation. In particular, when consuming an initiation, we don't generate the index until creating the response (which is incorrect). If we attempt to create an initiation between these processes, we drop any outstanding handshake which in this case has index 0 as set when consuming the initiation. The fix attached is to generate the index when consuming the initiation so that any spurious initiation creation can drop a valid index. The patch also consolidates setting fields on the handshake."
*	RFC 8981 allows the configuration of only temporary IPv6 addresses.	florian	2021-03-20	1	-4/+8
\| \| \| \| \|	Make the interface come up when the IFXF_AUTOCONF6TEMP is set. OK kn
*	When changing the link local address send a RTM_IFINFO message out.	claudio	2021-03-18	1	-2/+4
\| \| \| \| \|	Also prefer if (error == 0) over if (!error). OK florian@ bluhm@
*	Do not call rtm_ifchg() if IFF_UP changed. The code in if_up() and if_down()	claudio	2021-03-18	1	-3/+6
\| \| \| \| \|	already call rtm_ifchg() and so this would just result in a duplicate message. Noticed by deraadt@. OK florian@ bluhm@
*	Like in the sysctl case include the ifp_sadl as RTA_IFP address in RTM_IFINFO	claudio	2021-03-18	1	-3/+6
\| \| \| \| \| \|	messages. This way userland can detect if the lladdr of an interface was changed. OK florian@ bluhm@
*	Fix SIOCDELLABEL/"ifconfig mpe0 -mplslabel" to unset label completely	kn	2021-03-18	1	-2/+2
\| \| \| \| \| \| \| \|	While the corresponding route gets removed properly, the driver's softc kept the old label, i.e. "ifconfig mpe0" would show "mpls: label 42" instead of "mpls: label (unset)" even though it was unset. OK claudio
*	Make "ifconfig mpw0 -mplslabel" work	kn	2021-03-17	1	-1/+4
\| \| \| \| \| \| \|	Code is there, noone ever used it, I guess. This makes ifconfig(8) documentation actually hold true. OK claudio
*	Use correct rdomain when adding/deleting routes	kn	2021-03-17	2	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mpip(4) always adds and deletes routes in rdomain 0 regardless of the `tunneldomain', i.e. the `sc_rdomain' value. mpw(4) adds routes with the specified rdomain but always deletes them in rdomain 0. mpe(4) consistently uses the softc's rdomain which is tracked consistently across the various ioctls -- no fix needed. Found while reading the code and testing ifconfig(8)'s "tunneldomain" in order to document MPLS ioctls. OK claudio
*	Hide kernel internals from userland by wrapping more bits in _KERNEL blocks.	claudio	2021-03-17	1	-1/+6
\| \| \| \| \|	Especially the includes of net/rtable.h and sys/queue.h are problematic. OK florian@
*	When RFC 8981 obsoleted RFC 4941 the terminology changed from	florian	2021-03-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"privacy extensions" to "temporary address extensions" Change ifconfig(8) to output temporary after temporary addresses and add "temporary" option which is an alias for autoconfprivacy for now. Also make AUTOCONF6TEMP a positiv flag that is set by default. Previously the negative flag "INET6_NOPRIVACY" was set when privacy addresses were disabled. This makes the flags output less ugly and will allow us to disable autoconf addresses while having temporary addresses enabled in the future. More work is needed in slaacd. input benno, jmc, deraadt previous verison OK benno OK jmc, kn
*	There is no need to try to attach IPv6 to an interface when the	florian	2021-03-11	1	-2/+3
\| \| \| \| \| \| \| \|	AUTOCONF6 flag is already set. This is likely a leftover from when we sent router solicitations from the kernel. This was a way to trigger sending a solicitation from userland. OK kn
*	If the AUTOCONF4 or AUTOCONF6 flags get enabled, force the interface up.	deraadt	2021-03-11	1	-23/+34
\| \| \| \|	ok florian claudio
*	spelling	jsg	2021-03-10	25	-59/+59
\| \| \| \|	ok gnezdo@ semarie@ mpi@
*	Issuing FIOSETOWN and TIOCSPGRP ioctl commands on a tun(4) device leaks	anton	2021-03-09	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \|	device references causing a hang while trying to remove the same interface since the reference count will never reach zero. Instead of returning, break out of the switch in order to ensure that tun_put() gets called. ok deraadt@ mvs@ Reported-by: syzbot+2ca11c73711a1d0b5c6c@syzkaller.appspotmail.com
*	Shorten the if_cloners_lock name preventing it from being truncated in	anton	2021-03-09	1	-2/+2
\| \| \| \| \| \|	the top(1) wait column. ok mvs@
*	use uint64_t ethernet addresses for compares in carp.	dlg	2021-03-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	pass the uint64_t that ether_input has already converted from a real ethernet address into carp_input so it can use it without having to do its own conversion. tested by hrvoje popovski tested by me on amd64 and sparc64 ok patrick@ jmatthew@
*	ansi	jsg	2021-03-05	2	-88/+42
\|
*	pass the uint64_t dst ethernet address from ether_input to bridges.	dlg	2021-03-05	5	-27/+23
\| \| \| \|	tested on amd64 and sparc64.
*	work with 64bit ethernet addresses in ether_input().	dlg	2021-03-05	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \|	this applies the tricks with addresses from veb and etherbridge code to the normal ethernet input processing. it basically loads the destination address from the packet and the interface ethernet address into uint64_ts for comparison. tested by hrvoje popovski and chris cappuccio tested here on amd64, arm64, and sparc64 ok claudio@ jmatthew@
*	clean up span ports as span ports, not bridge ports.	dlg	2021-03-03	1	-3/+2
\| \| \| \| \| \| \| \| \|	the visible result of this is that span ports aren't made promisc like bridge ports. when cleaning up a span port, trying to take promisc off it screwed up the refs, and it makes the underlying interface not able to be promisc when it should be promisc. found by dave voutila
*	fix an assert in veb_p_ioctl() that failed when called by a span port.	dlg	2021-03-02	1	-3/+4
\| \| \| \| \| \| \| \| \|	veb_p_ioctl() is used by both veb bridge and veb span ports, but it had an assert to check that it was being called by a veb bridge port. this extends the check so using it on a span port doesnt cause a panic. found by dave voutila
*	include of netinet/in.h here is incorrect, because net/route.h will pull	deraadt	2021-03-02	1	-2/+1
\| \| \| \| \|	excessive types into scope. ok claudio
*	Refactor ip_fragment() and ip6_fragment(). Use a mbuf list to	bluhm	2021-03-01	3	-76/+51
\| \| \| \| \| \| \| \| \| \|	simplify the handling of the fragment list. Now the functions ip_fragment() and ip6_fragment() always consume the mbuf. They free the mbuf and mbuf list in case of an error and take care about the counter. Adjust the code a bit to make v4 and v6 look similar. Fixes a potential mbuf leak when pf_route6() called pf_refragment6() and it failed. Now the mbuf is always freed by ip6_fragment(). OK dlg@ mvs@
*	big numbers need suffixes on some platforms. fix LACP_ADDR_SLOW_E64.	dlg	2021-02-28	1	-2/+2
\| \| \| \|	deraadt@ says i broke hppa :(
*	Rework route_input() and rtm_sendup(). While we perform foreach loop	mvs	2021-02-27	1	-32/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	in route_input() we drop solock() after we checked socket state. We pass mbuf(9) to this socket at next loops, while it referenced as `last'. Socket's state could be changed by concurrent thread while it's not locked. Since we perform socket's checks and output in same iteration, the logic which prevents mbuf(9) chain copy for the last socket in list was removed. ok bluhm@ claudio@
*	trim some code i accidentally left into the nvgre add address function	dlg	2021-02-27	1	-4/+1
\|
*	recover scope from v6 nvgre endpoint addresses for userland to look at.	dlg	2021-02-27	1	-2/+2
\|
*	put the mac addr into a uint64_t to compare it to the ethernet slow addr.	dlg	2021-02-27	1	-5/+9
\| \| \| \|	also do the ethertype comparison before the conversion above.
*	only store the current time on address table entries if it changes.	dlg	2021-02-26	1	-3/+6
\| \| \| \| \| \| \| \|	this avoids unecessary writes to memory. it helps a little bit with a single nettq, but we get a lot more of a boost in pps when running concurrently. thanks to hrvoje for testing.
*	tpmr can use the eth64 bits too.	dlg	2021-02-26	1	-9/+5
\|