wireguard-openbsd - WireGuard implementation for the OpenBSD kernel

	Commit message (Collapse)	Author	Age	Files	Lines
*	Bump keepalive timers unconditionally on send (HEAD, master)	Jason A. Donenfeld	2021-10-26	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The keepalive timers -- both persistent and mandatory -- are part of the internal state machine, which needs to be cranked whether or not the packet was actually sent. A packet might be dropped by the network. Or the packet might be dropped by the local network stack. The latter case gives a hint -- which is useful for the data_sent event -- but is harmful to consider for the keepalive state machine. So, crank those timers before even calling wg_send. Incidentally, doing it this way matches exactly what Linux's send.c's wg_packet_create_data_done and Go's send.go's RoutineSequentialSender do too. Suggested-by: Kyle Evans <kevans@freebsd.org> Reported-by: Ryan Roosa <ryanroosa@gmail.com>
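The ordering described above can be sketched in userland C. This is only an illustration under stated assumptions: `struct peer_sketch`, `send_fn`, `send_data_sketch`, `send_ok` and `send_drop` are hypothetical stand-ins, not names from the driver, and the counters merely make the ordering observable.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-peer state; not the real struct. */
struct peer_sketch {
	unsigned int keepalive_cranks;	/* times the keepalive timers were advanced */
	unsigned int data_sent_events;	/* times the data_sent event fired */
};

/* Hypothetical send hook: returns 0 on success, nonzero on a local drop. */
typedef int (*send_fn)(const void *buf, size_t len);

static int
send_data_sketch(struct peer_sketch *p, send_fn send, const void *buf,
    size_t len)
{
	int error;

	/* Crank the keepalive state machine first, whatever the send outcome. */
	p->keepalive_cranks++;

	error = send(buf, len);

	/* Only the data_sent event takes the local-drop hint into account. */
	if (error == 0)
		p->data_sent_events++;

	return error;
}

/* Two trivial hooks for exercising the sketch. */
static int send_ok(const void *buf, size_t len) { (void)buf; (void)len; return 0; }
static int send_drop(const void *buf, size_t len) { (void)buf; (void)len; return 1; }
```

Calling `send_data_sketch` once with `send_ok` and once with `send_drop` advances the keepalive counter twice but records only one data_sent event.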
*	Delete all peer allowed IPs at once	Matt Dunwoodie	2021-04-13	1	-43/+34
\| \| \| \| \|	This simplifies the deletion process, so we do not require a lookup of the node before deletion.
*	Merge wg_timers and wg_peer	Matt Dunwoodie	2021-04-13	1	-180/+155
\| \| \| \| \| \| \| \|	The primary motivator here is to get rid of CONTAINER_OF, which is quite an ugly macro. However, any reader should also be aware of the change from d_DISabled to p_ENabled.
*	Replace timer lock with SMR	Matt Dunwoodie	2021-04-13	1	-36/+31
\| \| \| \| \| \| \| \|	The lock was not used to protect any data structures, it was purely to ensure race-free setting of t_disabled. That is, that no other thread was halfway through any wg_timers_run_* function. With smr_* we can ensure this is still the case by calling smr_barrier() after setting t_disabled.
*	Run all timeouts in process context	Matt Dunwoodie	2021-04-13	1	-32/+20
\| \| \| \| \| \| \|	So the reason timeouts were running in interrupt context was because it was quicker. Running in process context required a `task` to be added, which we ended up doing anyway. So we might as well rely on timeout API to do it for us.
*	Use malloc instead of pool_* for infrequent allocations	Matt Dunwoodie	2021-04-13	1	-13/+6
\| \| \| \| \| \| \|	We can get rid of the pool overhead by using the malloc family of functions. This does lose us the ability to see directly how much each allocation is using, but if we really want that, maybe we add new malloc types? Either way, not something we need at the moment.
*	Use SMR for wg_noise	Matt Dunwoodie	2021-04-13	3	-1313/+1089
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While the largest change here is to use SMR for wg_noise, this was motivated by other deficiencies in the module. Primarily, the nonce operations should be performed in serial (wg_queue_out, wg_deliver_in) and not parallel (wg_encap, wg_decap). This also brings in a lock-free encrypt and decrypt path, which is nice. I suppose other improvements are that local, remote and keypair structs are opaque, so no more reaching in and fiddling with things. Unfortunately, these changes make abuse of the API easier (such as calling noise_keypair_encrypt on a keypair retrieved with noise_keypair_lookup (instead of noise_keypair_current) as they have different checks). Additionally, we have to trust that the nonce passed to noise_keypair_encrypt is non-repeating (retrieved with noise_keypair_nonce_next), and noise_keypair_nonce_check is valid on received nonces. One area that could use a little bit more adjustment is the _free functions. They are used to call a function once it is safe to free a parent data structure (one holding struct noise_{local,remote}). This is currently used for lifetimes in the system and allows a consumer of wg_noise to opaquely manage lifetimes based on the reference counting of noise, remote and keypair. It is fine for now, but maybe revisit later.
*	Add refcnt_take_if_gt()	Matt Dunwoodie	2021-04-13	2	-0/+14
	This function (or one of a similar nature) is required to safely use a refcnt and smr_entry together. Such functions exist on other platforms as kref_get_unless_zero (on Linux) and refcount_acquire_if_gt (on FreeBSD).

	The diagram below details the following situation with and without refcnt_take_if_gt in 3 cases, with the first showing the "invalid" use of refcnt_take.

	Situation:
		Thread #1 is removing the global reference (o).
		Thread #2 wants to reference an object (r), using a thread pointer (t).

	Case:
		1) refcnt_take after Thread #1 has released "o"
		2) refcnt_take_if_gt after Thread #1 has released "o"
		3) refcnt_take_if_gt before Thread #1 has released "o"

	Data:
		struct obj {
			struct smr_entry smr;
			struct refcnt refcnt;
		} o, r, t1, t2;

	Thread #1                        | Thread #2
	---------------------------------+------------------------------------
	                                 | r = NULL;
	rw_enter_write(&lock);           | smr_read_enter();
	                                 |
	t1 = SMR_PTR_GET_LOCKED(&o);     | t2 = SMR_PTR_GET(&o);
	SMR_PTR_SET_LOCKED(&o, NULL);    |
	                                 |
	if (refcnt_rele(&t1->refcnt))    |
	  smr_call(&t1->smr, free, t1);  |
	                                 | if (t2 != NULL) {
	                                 |   refcnt_take(&t2->refcnt);
	                                 |   r = t2;
	                                 | }
	rw_exit_write(&lock);            | smr_read_exit();
	.....
	// called by smr_thread          |
	free(t1);                        |
	.....
	                                 | // use after free
	                                 | *r
	---------------------------------+------------------------------------
	                                 | r = NULL;
	rw_enter_write(&lock);           | smr_read_enter();
	                                 |
	t1 = SMR_PTR_GET_LOCKED(&o);     | t2 = SMR_PTR_GET(&o);
	SMR_PTR_SET_LOCKED(&o, NULL);    |
	                                 |
	if (refcnt_rele(&t1->refcnt))    |
	  smr_call(&t1->smr, free, t1);  |
	                                 | if (t2 != NULL &&
	                                 |     refcnt_take_if_gt(&t2->refcnt, 0))
	                                 |   r = t2;
	rw_exit_write(&lock);            | smr_read_exit();
	.....
	// called by smr_thread          | // we don't have a valid reference
	free(t1);                        | assert(r == NULL);
	---------------------------------+------------------------------------
	                                 | r = NULL;
	rw_enter_write(&lock);           | smr_read_enter();
	                                 |
	t1 = SMR_PTR_GET_LOCKED(&o);     | t2 = SMR_PTR_GET(&o);
	SMR_PTR_SET_LOCKED(&o, NULL);    |
	                                 |
	                                 | if (t2 != NULL &&
	                                 |     refcnt_take_if_gt(&t2->refcnt, 0))
	                                 |   r = t2;
	if (refcnt_rele(&t1->refcnt))    |
	  smr_call(&t1->smr, free, t1);  |
	rw_exit_write(&lock);            | smr_read_exit();
	.....
	                                 | // we need to put our reference
	                                 | if (refcnt_rele(&t2->refcnt))
	                                 |   smr_call(&t2->smr, free, t2);
	.....
	// called by smr_thread          |
	free(t1);                        |
	---------------------------------+------------------------------------

	Currently it uses atomic_add_int_nv to atomically read the refcnt, but I'm open to suggestions for better ways. The atomic_cas_uint is used to ensure that refcnt hasn't been modified since reading `old`.
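The take-if-greater-than idea can be sketched in portable C11. This is not the kernel code: it swaps in `<stdatomic.h>` for the atomic_add_int_nv/atomic_cas_uint primitives the message mentions, and `struct refcnt_sketch` and `refcnt_take_if_gt_sketch` are hypothetical names chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for the kernel's struct refcnt. */
struct refcnt_sketch {
	atomic_uint refs;
};

/*
 * Take a reference only if the current count is greater than n.
 * Returns true when a reference was taken, false otherwise.
 */
static bool
refcnt_take_if_gt_sketch(struct refcnt_sketch *r, unsigned int n)
{
	unsigned int old = atomic_load(&r->refs);

	while (old > n) {
		/*
		 * The compare-and-swap only succeeds if the count still
		 * equals old; on failure old is reloaded and re-checked.
		 */
		if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
			return true;
	}
	return false;
}
```

With n = 0, a reader that finds the count already at zero gets `false` and must not use the object, which corresponds to the case in the diagram where `r` stays NULL.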
*	Check iter != NULL	Matt Dunwoodie	2021-04-13	1	-2/+2
\| \| \| \| \| \| \| \| \|	The problem with checking peer != NULL is that we already dereference iter to get i_value. This is what was caught in the index == 0 bug reported on bugs@. Instead, we should assert that iter != NULL. This is likely to be removed when adjusting wg_noise.c in the not too distant future.
*	Allow setting keepalive while interface is down	Matt Dunwoodie	2021-04-13	1	-3/+4
\|
*	Rework encap/decap routines	Matt Dunwoodie	2021-04-13	1	-87/+84
\| \| \| \| \| \| \|	This will make further work on in place decryption a lot easier. Additionally, it improves the readability as we can get rid of the difficult _len variables. The copy in and out of wg_pkt_data is also a cleaner solution than memcpy nonces and whatnot.
*	Replace wg_tag with wg_packet	Matt Dunwoodie	2021-04-04	1	-291/+292
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'll be the first to admit (but not the first to complain) about the wg_tag situation. It made it very difficult to manage mbufs (that may be reallocated with functions such as m_pullup). It was also not clear where allocation was occurring. This also gets rid of the ring buffers in wg_softc, which added no performance in this situation. They also used memory unnecessarily and increased the complexity. I also used this opportunity to get rid of the confusing t_mbuf/t_done situation and revert to a more understandable UNCRYPTED/CRYPTED/DEAD packet state. I don't believe there were any issues with the old style, but improved readability is always a welcome addition. With these changes we can start encrypting packets in place (rather than copying to a new mbuf), which should increase performance. This also simplifies length calculations by using m_* functions and reading the pkthdr length.
*	Count all handshake packets	Matt Dunwoodie	2021-04-04	1	-2/+1
\|
*	Satisfy my ordering of struct elements and prototypes	Matt Dunwoodie	2021-04-04	1	-3/+3
\|
*	Expand on key clearing message	Matt Dunwoodie	2021-04-04	1	-1/+3
\|
*	Error out if peer provided without public key	Matt Dunwoodie	2021-04-04	1	-2/+4
\|
*	Ensure a peer has a consistent PSK (if set when creating)	Matt Dunwoodie	2021-04-04	3	-12/+13
\|
*	Add noise_local_deinit to zero private keys	Matt Dunwoodie	2021-04-04	3	-0/+10
\|
*	Add a guard page between I/O virtual address space allocations. The idea	patrick	2021-04-03	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \|	is that IOVA allocations always have a gap in-between which produces a fault on access. If a transfer to a given allocation runs further than expected we should be able to see it. We pre-allocate IOVA on bus DMA map creation, and as long as we don't allocate a PTE descriptor, this comes with no cost. We have plenty of address space anyway, so adding a page-sized gap does not hurt at all and can only have positive effects. Idea from kettenis@
*	Exclude the first page from I/O virtual address space, which is the NULL	patrick	2021-04-03	1	-3/+4
\| \| \| \| \| \| \| \| \|	pointer address. Not allowing this one to be allocated might help find driver bugs, where the device is programmed with a NULL pointer. We have plenty of address space anyway, so excluding this single page does not hurt at all and can only have positive effects. Idea from kettenis@
*	Fix Dale's email address	tb	2021-04-02	4	-8/+8
\| \| \| \|	ok drahn
*	Clean up nonexistent/unused properties handling	kn	2021-04-01	1	-12/+1
\| \| \| \| \| \| \| \| \| \|	Never used since import and probably just ported over from NetBSD as-is; "design-capacity" does not exist in the device tree binding. "monitor-interval-ms" defaults to 250ms as per binding and could be used in the sensor_task_register() call, but our framework only supports whole seconds and there's no advantage over our current fixed poll interval of 5s. OK patrick
*	Hardcode meaningful alert level, track apm's battery state better	kn	2021-04-01	1	-23/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code looks for the nonexistent "cellwise,alert-level" property and falls back to zero as threshold (like the original NetBSD code). It also updates the CONFIG register with that very threshold to let the hardware set a bit and thus alert us when it has been reached. Since our sensor framework is designed to poll every N seconds and this driver does not actually look at whether the hardware alerted, neither using a default threshold of zero nor updating the hardware with it makes sense. Remove the alert level code and simply map >50%, >25% and <=25% of remaining battery life to apm(4)'s "high", "low" and "critical" battery state respectively; this matches exactly what acpibat(4) does and provides more meaningful sensor readings without relying on nonexistent device tree bindings. Feedback OK patrick
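The threshold mapping described above can be sketched as a small C function. The enum and function names are hypothetical and do not use apm(4)'s real constants; only the >50%, >25% and <=25% boundaries come from the commit message.

```c
/* Hypothetical labels standing in for apm(4)'s battery states. */
enum batt_state_sketch {
	BATT_SKETCH_HIGH,
	BATT_SKETCH_LOW,
	BATT_SKETCH_CRITICAL
};

/* Map remaining battery life in percent to a coarse battery state. */
static enum batt_state_sketch
batt_state_from_percent(unsigned int percent)
{
	if (percent > 50)
		return BATT_SKETCH_HIGH;
	if (percent > 25)
		return BATT_SKETCH_LOW;
	return BATT_SKETCH_CRITICAL;
}
```

Note that the boundaries are exclusive on the upper side: exactly 50% maps to "low" and exactly 25% maps to "critical".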
*	Push kernel lock down to umb_rtrequest().	mvs	2021-04-01	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are going to unlock PF_ROUTE sockets. This means the `if_rtrequest' handler will run without the kernel lock. umb_rtrequest() calls umb_send_inet_proposal(), which touches the kernel lock protected `ipv{4,6}dns' array. Also, umb_rtrequest() is the only handler which requires the kernel lock to be held. So push the lock down to umb_rtrequest() instead of grabbing it around the `if_rtrequest' call. This hunk was committed separately to shrink the PF_ROUTE sockets unlocking diff. ok gerhard@ deraadt@
*	Make ddb's dependency on libz explicit.	visa	2021-03-31	1	-12/+12
\| \| \| \|	OK deraadt@ mpi@
*	sync	sthen	2021-03-31	2	-54/+54
\|
*	Remove redundant "HUAWEI Mobile" in usbdevs strings, mention radio	sthen	2021-03-31	1	-26/+26
\| \| \| \|	technology where known. ok deraadt
*	Introduce UAO_USES_SWHASH() and use tabs instead of spaces in #defines.	mpi	2021-03-31	1	-25/+26
\| \| \| \| \| \|	No functional change, reduce the difference with NetBSD. ok jmatthew@
*	fix typos in comments	sthen	2021-03-30	1	-3/+3
\|
*	Handle systems, such as the Dell Precision 3640, that access	kettenis	2021-03-30	1	-19/+87
\| \| \| \| \| \| \| \| \| \|	GenericSerialBus operating regions without checking whether they're really available. This needs to work on RAMDISK kernels as well. Since we don't want to pull in the i2c subsystem on those, provide a separate and much simpler dummy implementation of the GenericSerialBus access code when SMALL_KERNEL is defined. ok tb@
*	Register the PCI variant of dwiic(4) with acpi(4).	kettenis	2021-03-30	1	-2/+7
\| \| \| \|	ok tb@
*	Some cards announce support for the NTB16 format, but that support does not	patrick	2021-03-30	3	-41/+167
\| \| \| \| \| \| \| \| \| \|	work. Hence, add support for NTB32 in the transmit path. We already have support for NTB32 in the receive path. We detect the supported format on boot and can then decide on transmit which format to use. From ehrhardt@ with gerhard@ Tested by jan@ ok sthen@
*	Some umb(4) devices require the NDP pointer behind the NDP datagram.	patrick	2021-03-30	2	-36/+59
\| \| \| \| \|	From gerhard@ "broadly OK" sthen@
*	[ICMP] IP options lead to malformed reply	sashan	2021-03-30	4	-9/+53
\| \| \| \| \| \| \| \| \|	icmp_send() must update the IP header length if IP options are appended. Such a packet also has to be dispatched with the IP_RAWOUTPUT flag. Bug reported and fix co-designed by Dominik Schreilechner _at_ siemens _dot_ com OK bluhm@
*	Move tx/rx descriptors into their own structs.	kevlo	2021-03-30	2	-188/+509
\| \| \| \| \| \| \| \| \|	This is a first step toward making rge work with multiple queues and interrupts. Only one queue is currently used. While here, update the RTL8125B microcode. ok jmatthew@
*	Turns out the PCIe DARTs support a full 32-bit device virtual address space.	kettenis	2021-03-29	1	-4/+9
\| \| \| \| \| \| \| \|	Adjust the region managed by the extent accordingly but avoid the first and last page. The last page collides with the MSI address used by the PCIe controller and not using the first page helps finding bugs. ok patrick@
*	combine umb_products and umb_fccauth_devs into one umb_quirks table	sthen	2021-03-29	1	-36/+51
\| \| \| \|	ok gerhard@
*	Fix IA32_EPT_VPID_CAP_XO_TRANSLATIONS specification	dv	2021-03-29	1	-2/+2
\| \| \| \| \| \|	Per Intel SDM (Vol 3D, App. A.10) bit 0 should be read as a 1 if enabled. From Adam Steen. ok mlarkin@
*	Since ipw(4) doesn't call into net80211_newstate() the interface link state	stsp	2021-03-28	1	-1/+13
\| \| \| \| \| \| \| \| \|	must be updated by the driver in order to get packets to flow. In case of WPA the link state was updated as a side-effect of a successful WPA handshake. This commit fixes the WEP and plaintext cases. Problem reported and fix tested by Riccardo Mottola.
*	Add vid/pid table to umb(4) allowing matching to alternate config	sthen	2021-03-28	1	-3/+64
\| \| \| \| \| \| \| \| \|	Some devices present multiple configurations and the one chosen by default is not always usable - for example, some have a CDC ECM config that does not work with our cdce(4) - allow overriding to a specific config in those cases. From gerhard@ with tweaks to comments by me, ok patrick@
*	sync	sthen	2021-03-28	2	-4/+14
\|
*	add pid for Dell DW5821e and HUAWEI ME906s LTE, ok patrick@	sthen	2021-03-28	1	-1/+3
\|
*	Make sure that all CPUs end up with the same bits set in SCTLR_EL1.	kettenis	2021-03-27	2	-27/+28
\| \| \| \| \| \| \| \| \| \|	Do this by clearing all the bits marked RES0 and set all the bits marked RES1 for the ARMv8.0. Any optional features introduced in later revisions of the architecture (such as PAN) will be enabled after SCTLR_EL1 is initialized. ok patrick@
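The clear-RES0, set-RES1 step can be sketched as plain bit arithmetic. The function name and the mask parameters are hypothetical; the real masks come from the ARMv8.0 definition of SCTLR_EL1 and are not reproduced here.

```c
#include <stdint.h>

/*
 * Normalize a register value: clear every bit in the res0 mask and set
 * every bit in the res1 mask. Both masks are supplied by the caller and
 * are assumptions of this sketch, not the architectural values.
 */
static uint64_t
sctlr_normalize_sketch(uint64_t value, uint64_t res0, uint64_t res1)
{
	return (value & ~res0) | res1;
}
```

Applied with the same masks on every CPU, the reserved bits end up identical no matter what value the firmware left in the register.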
*	Add ARMv8.5 instruction set related CPU features.	kettenis	2021-03-27	2	-4/+184
\| \| \| \|	ok patrick@
*	Fix SDMMC_DEBUG build	kn	2021-03-27	2	-8/+8
\| \| \| \| \|	- Replace undefined SDMMCDEVNAME macro with usual DEVNAME from sdmmcvar.h - typofix struct member name
*	trim the FCS off Ethernet packets before sending them up the stack.	dlg	2021-03-27	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jurjen Oskam on tech@ found that ure in a veb caused these extra fcs bytes to be transmitted by other veb members. the extra bytes aren't a problem usually because our network stack ignores them if they're present, eg, the ip stack reads an ip packet length and trims bytes in an mbuf if there's more. bridge(4) masked this problem because it always parses IP packets going over the bridge and trims them like the IP stack before pushing them out another port. veb(4) generally just moves packets around based on the Ethernet header, by default it doesn't look too deeply into packets, which is why this issue popped out. it is more correct for ure to just not pass the fcs bytes up. ok jmatthew@ kevlo@
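A minimal sketch of the length adjustment, assuming the received frame length still includes the 4-byte Ethernet FCS. `FCS_LEN_SKETCH` and `frame_len_without_fcs` are hypothetical names, and returning 0 for frames shorter than the FCS is an assumption of this sketch rather than the driver's behaviour.

```c
#include <stddef.h>

#define FCS_LEN_SKETCH 4	/* Ethernet frame check sequence, in bytes */

/* Return the frame length with the trailing FCS removed. */
static size_t
frame_len_without_fcs(size_t len)
{
	/* Assumption: a frame too short to hold an FCS is treated as empty. */
	if (len < FCS_LEN_SKETCH)
		return 0;
	return len - FCS_LEN_SKETCH;
}
```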
*	Return EOPNOTSUPP for unsupported ioctls	kn	2021-03-26	1	-16/+6
\| \| \| \| \| \| \| \| \|	Match what apm(4/macppc) says and make apmd(8) log an appropriate warning when unsupported power actions are requested. Merge identical cases while here. This syncs with the apm ioctl handlers on loongson and arm64.
*	Fix "mach dtb" return code to avoid bogus boot	kn	2021-03-26	1	-6/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Bootloader command functions must return zero in case of failure; returning 1 tells the bootloader to boot the currently set kernel image. "machine dtb" is the wrong way around so using it triggers a boot. Fix this and print a brief usage (like other commands such as "hexdump" do) while here. Feedback OK patrick
*	Fix errno, merge ioctl cases	kn	2021-03-26	1	-13/+5
\| \| \| \| \| \| \| \| \|	The EBADF error is always overwritten for the standby, suspend and hibernate ioctls, only the mode ioctl has it right. Merge the now identical cases while here. OK patrick
*	Flag sensors as invalid on bogus reads	kn	2021-03-26	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Follow-up to the previous commit: This driver continues to report stale hw.sensors values when reading them fails, which can easily be observed on a Pinebook Pro after plugging in the AC cable, causing the hw.sensors.cwfg0.raw0 (battery remaining minutes) value to jump considerably one or two times before stalling and becoming incoherent with the rest. Flag sensors invalid upfront in apm's fashion and mark them OK iff they yield valid values; this is what other drivers such as rktemp(4) do, but the consequence/intention of SENSOR_FINVALID is sysctl(8) and systat(8) skipping such sensors (until AC gets plugged off again). OK patrick
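The "invalid upfront, valid only on a good read" pattern can be sketched as follows. Everything here is a hypothetical stand-in: `SKETCH_FINVALID` plays the role of SENSOR_FINVALID, and `struct sensor_sketch`, `read_fn`, `read_ok` and `read_fail` are not the driver's names.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FINVALID 0x01u	/* hypothetical stand-in for SENSOR_FINVALID */

/* Hypothetical sensor record: last value plus status flags. */
struct sensor_sketch {
	int64_t value;
	unsigned int flags;
};

/* Hypothetical read hook: stores a reading and returns true on success. */
typedef bool (*read_fn)(int64_t *out);

static void
sensor_update_sketch(struct sensor_sketch *s, read_fn read)
{
	int64_t v;

	/* Flag invalid first so a failed read never leaves stale data marked OK. */
	s->flags |= SKETCH_FINVALID;

	if (!read(&v))
		return;

	s->value = v;
	s->flags &= ~SKETCH_FINVALID;
}

/* Two trivial hooks for exercising the sketch. */
static bool read_ok(int64_t *out) { *out = 42; return true; }
static bool read_fail(int64_t *out) { (void)out; return false; }
```

After a failed read the old value is still stored, but the invalid flag tells consumers to skip it until a later read succeeds.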