path: root/sys

Commit message  Author  Age  Files  Lines
...
* Let MAIR comment catch up with reality.  kettenis  2021-03-10  1  -2/+5

* Fix typo for ATS attribute member in IORT root complex struct.  patrick  2021-03-10  1  -2/+2

* spelling  jsg  2021-03-10  68  -162/+162
  ok gnezdo@ semarie@ mpi@

* pmap_avail_setup() is the only place physmem is calculated, delete a bunch  deraadt  2021-03-10  1  -9/+2
  of code which thinks it could be done elsewhere. ok kurt

* Node without a "status" property should be considered enabled as well.  kettenis  2021-03-09  1  -3/+3
  ok patrick@
* Issuing FIOSETOWN and TIOCSPGRP ioctl commands on a tun(4) device leaks  anton  2021-03-09  1  -2/+3
  device references causing a hang while trying to remove the same
  interface since the reference count will never reach zero. Instead of
  returning, break out of the switch in order to ensure that tun_put()
  gets called. ok deraadt@ mvs@
  Reported-by: syzbot+2ca11c73711a1d0b5c6c@syzkaller.appspotmail.com
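  The fix pattern, as a minimal sketch with stand-in names (tun_get(),
  tun_put() and the command values below are illustrative, not the
  actual tun(4) code): an early return skips the reference release at
  the end of the function, while break falls through to it.

      #include <stddef.h>
      #include <errno.h>

      struct tun_softc { int refcnt; };

      static struct tun_softc tun0;

      static struct tun_softc *
      tun_get(int unit)
      {
      	if (unit != 0)
      		return NULL;
      	tun0.refcnt++;		/* take a device reference */
      	return &tun0;
      }

      static void
      tun_put(struct tun_softc *sc)
      {
      	sc->refcnt--;		/* drop it; unit can be removed at zero */
      }

      enum { CMD_FIOSETOWN = 1, CMD_TIOCSPGRP = 2 };	/* stand-in values */

      int
      tun_dev_ioctl(int unit, int cmd)
      {
      	struct tun_softc *sc;
      	int error = 0;

      	if ((sc = tun_get(unit)) == NULL)
      		return ENXIO;

      	switch (cmd) {
      	case CMD_FIOSETOWN:
      	case CMD_TIOCSPGRP:
      		/*
      		 * "return 0;" here would skip tun_put() below and
      		 * leak the reference forever; "break" reaches the
      		 * common exit path instead.
      		 */
      		break;
      	default:
      		error = ENOTTY;
      		break;
      	}

      	tun_put(sc);	/* released on every way out of the switch */
      	return error;
      }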
* Shorten the if_cloners_lock name preventing it from being truncated in  anton  2021-03-09  1  -2/+2
  the top(1) wait column. ok mvs@
* Recognize Apple Firestorm cores.  kettenis  2021-03-09  1  -1/+3

* Add support for 30-bit color modes.  kettenis  2021-03-09  1  -2/+4
* Early daemons like dhcpleased(8), slaacd(8), unwind(8), resolvd(8)  bluhm  2021-03-09  2  -31/+157
  are started before syslogd(8). This resulted in ugly sendsyslog(2)
  dropped logs and the real message was lost. Create a temporary stash
  for log messages within the kernel. It has a limited size of 100
  messages, and each message is truncated to 8192 bytes. When the stash
  is exhausted, the well-known dropped message is generated with a
  counter. After syslogd(8) has set up everything, it sends a debug line
  through libc to flush the kernel stash. Then syslogd receives all
  messages from the kernel before the usual logs. OK deraadt@ visa@
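  A minimal sketch of such a bounded stash with invented names (the
  real code lives in the kernel log handling): fixed capacity,
  per-message truncation, and a drop counter reported at flush time.

      #include <stdio.h>

      #define STASH_MAX	100	/* messages */
      #define STASH_LEN	8192	/* bytes per message, truncating */

      static char	stash[STASH_MAX][STASH_LEN];
      static size_t	stash_used;
      static unsigned	stash_dropped;

      void
      stash_log(const char *msg)
      {
      	if (stash_used >= STASH_MAX) {
      		stash_dropped++;	/* remember how many were lost */
      		return;
      	}
      	/* snprintf truncates at STASH_LEN, like the kernel stash */
      	snprintf(stash[stash_used++], STASH_LEN, "%s", msg);
      }

      void
      stash_flush(void)	/* called once syslogd is ready */
      {
      	size_t i;

      	for (i = 0; i < stash_used; i++)
      		printf("%s\n", stash[i]);	/* stand-in for delivery */
      	if (stash_dropped)
      		printf("dropped %u messages\n", stash_dropped);
      	stash_used = 0;
      	stash_dropped = 0;
      }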
* Add initial bits for Check Point UTM-1 EDGE N.  visa  2021-03-09  3  -3/+15
  From Thaison Nguyen
* ofw_read_mem_regions() can skip calculation of physmem. pmap.c  deraadt  2021-03-09  1  -5/+1
  already calculates _usable_ memory and updates physmem (if it is 0),
  whereas ofw_read_mem_regions() was counting usable+unusable memory,
  ie. 4G or more on some machines. powerpc's 32-bit pagetable cannot
  use memory beyond 4G phys addr. (On a 4G machine, physmem64 was
  calculated as 0, which caused the installer's auto-disklabel code to
  place /tmp on the b partition). ok gkoehler, works for kurt also
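  The distinction in a small sketch with invented region structures:
  counting raw region sizes includes memory above 4G, while clipping
  each region at the 32-bit physical limit counts only what the
  pagetable can actually reach.

      #include <stdint.h>
      #include <stddef.h>

      #define PAGE_SIZE	4096
      #define PHYS_LIMIT	0x100000000ULL	/* 4G: 32-bit pagetable ceiling */

      struct mem_region {
      	uint64_t start;
      	uint64_t size;
      };

      /* Count only pages the 32-bit pagetable can address. */
      uint32_t
      usable_pages(const struct mem_region *r, size_t n)
      {
      	uint64_t pages = 0, end;
      	size_t i;

      	for (i = 0; i < n; i++) {
      		if (r[i].start >= PHYS_LIMIT)
      			continue;	/* entirely above 4G: unusable */
      		end = r[i].start + r[i].size;
      		if (end > PHYS_LIMIT)
      			end = PHYS_LIMIT;	/* clip at the boundary */
      		pages += (end - r[i].start) / PAGE_SIZE;
      	}
      	return (uint32_t)pages;
      }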
* Enable ixl(4).  patrick  2021-03-08  2  -2/+4

* Revert commitid: AZrsCSWEYDm7XWuv;  claudio  2021-03-08  3  -10/+14
  Kill SINGLE_PTRACE and use SINGLE_SUSPEND which has almost the same
  semantics. This diff did not properly kill SINGLE_PTRACE and broke
  RAMDISK kernels.
* We no longer "accept" RAs in the kernel, delete misleading comment.  florian  2021-03-08  1  -2/+1

* Add another Type Cover device  jcs  2021-03-08  1  -1/+2
  from Fredrik Engberg

* regen  jcs  2021-03-08  2  -4/+9

* Add Surface Pro Type Cover  jcs  2021-03-08  1  -1/+2
  from Fredrik Engberg
* Allow uhidev child devices to claim selective report ids  jcs  2021-03-08  16  -57/+74
  There may be multiple matching devices on a single uhidev device but
  the first device that responds to UHIDEV_CLAIM_ALLREPORTID will block
  the others from attaching. Change this to UHIDEV_CLAIM_MULTIPLE_REPORTID
  and require any devices wanting some/all report ids to fill in the
  claimed array in uhidev_attach_arg with just the reports they need.
  uhidev can then run match routines for other drivers with the
  available report ids. ok anton
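  The idea in a sketch with invented names (the real structure is
  uhidev_attach_arg): each child marks only the report ids it needs,
  so uhidev can keep matching other drivers against the ids still
  available.

      #include <stdint.h>
      #include <stddef.h>

      #define MAXREPID	32	/* illustrative bound on report ids */

      struct attach_arg {
      	uint8_t claimed[MAXREPID];	/* nonzero = taken by a child */
      };

      /*
      * A child's match routine claims only the report ids it will use,
      * leaving the rest available for other drivers.
      */
      int
      child_match(struct attach_arg *uaa, const uint8_t *wanted,
          size_t nwanted)
      {
      	size_t i;

      	for (i = 0; i < nwanted; i++)
      		if (wanted[i] >= MAXREPID || uaa->claimed[wanted[i]])
      			return 0;	/* id invalid or already taken */

      	for (i = 0; i < nwanted; i++)
      		uaa->claimed[wanted[i]] = 1;
      	return 1;	/* matched; unclaimed ids stay up for grabs */
      }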
* Add support for sdhc(4) on Raspberry Pi in ACPI mode.  kettenis  2021-03-08  1  -2/+8
  ok patrick@
* Add support for rk809 as seen on the Rock Pi N10 with the rk3399pro. Add  kurt  2021-03-08  1  -57/+262
  support for multiple linear ranges for voltage regulators and use for
  all rkpmic ICs. ok kettenis@
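  A sketch of how multiple linear ranges turn a register selector into
  a voltage; the table values below are invented for illustration, not
  taken from the rk809 datasheet.

      #include <stdint.h>
      #include <stddef.h>

      struct linear_range {
      	uint32_t base_uv;	/* voltage at the first selector */
      	uint32_t step_uv;	/* increment per selector step */
      	uint8_t	 sel_min;	/* first selector in this range */
      	uint8_t	 sel_max;	/* last selector in this range */
      };

      /* Invented example: two ranges with different step sizes. */
      static const struct linear_range ranges[] = {
      	{ 500000, 12500, 0, 59 },	/* 0.5V..1.2375V, 12.5mV steps */
      	{ 1800000, 200000, 60, 63 },	/* 1.8V..2.4V, 200mV steps */
      };

      int
      sel_to_uv(uint8_t sel, uint32_t *uv)
      {
      	size_t i;

      	for (i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
      		const struct linear_range *r = &ranges[i];

      		if (sel >= r->sel_min && sel <= r->sel_max) {
      			*uv = r->base_uv +
      			    (sel - r->sel_min) * r->step_uv;
      			return 0;
      		}
      	}
      	return -1;	/* selector falls outside every range */
      }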
* Revise the ASID allocation scheme to avoid a hang when running out of free  kettenis  2021-03-08  2  -31/+120
  ASIDs. This should only happen on systems with 8-bit ASIDs, which are
  currently unsupported in OpenBSD.

  The new scheme uses "generations". Whenever we run out of ASIDs we
  bump the generation and flush the complete TLB. The pmaps of
  processes that are currently on the CPU are carried over into the new
  generation. This implementation relies on the scheduler lock to make
  sure this happens without any (known) races.

  ok patrick@, mpi@
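  A condensed sketch of the generation scheme with invented names,
  leaving out the locking and the per-CPU carry-over of the real pmap
  code: when the bitmap fills up, start a new generation, flush the
  TLB, and hand out ASIDs afresh.

      #include <stdint.h>
      #include <string.h>

      #define NASID	256		/* an 8-bit ASID space */

      static uint8_t	asid_used[NASID];	/* bitmap, bytes for simplicity */
      static uint64_t	asid_gen = 1;

      struct pmap {
      	int		asid;
      	uint64_t	gen;	/* generation this ASID belongs to */
      };

      static void
      tlb_flush_all(void)
      {
      	/* stub: full TLB invalidation goes here */
      }

      static int
      asid_find_free(void)
      {
      	int i;

      	for (i = 0; i < NASID; i++)
      		if (!asid_used[i])
      			return i;
      	return -1;
      }

      void
      pmap_asid_enter(struct pmap *pm)
      {
      	if (pm->gen == asid_gen)
      		return;		/* ASID still valid this generation */

      	if ((pm->asid = asid_find_free()) == -1) {
      		/*
      		 * Out of ASIDs: bump the generation, flush the TLB
      		 * and start over. The real code also carries over the
      		 * pmaps currently running on a CPU, under the
      		 * scheduler lock.
      		 */
      		asid_gen++;
      		memset(asid_used, 0, sizeof(asid_used));
      		tlb_flush_all();
      		pm->asid = asid_find_free();	/* cannot fail now */
      	}
      	asid_used[pm->asid] = 1;
      	pm->gen = asid_gen;
      }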
* Move a KERNEL_ASSERT_LOCKED() from single_thread_clear() to cursig().  mpi  2021-03-08  1  -6/+3
  Ze big lock is currently necessary to ensure that two sibling threads
  are not racing against each other when processing signals. However it
  is not strictly necessary to unpark sibling threads. ok claudio@
* Kill SINGLE_PTRACE and use SINGLE_SUSPEND which has almost the same semantics.  mpi  2021-03-08  3  -14/+10
  single_thread_set() is modified to explicitly indicate when waiting
  until sibling threads are parked is required. This is obviously not
  required if a traced thread is switching away from a CPU after
  handling a STOP signal. ok claudio@
* Remove the workaround which identified Go executables, and permitted them  deraadt  2021-03-08  1  -7/+2
  to do syscalls directly. Go executables now use shared libc like all
  other dynamic binaries. This makes the "where are syscalls done from"
  checker strict for all binaries, and also opens the door to change
  the underlying syscall ABI to the kernel in the future very easily
  (if we find cause). ok jsing

* Explicitly align kernel text.  mortimer  2021-03-07  2  -5/+6
  lld11 no longer quietly aligns this when given an address, so we do
  the alignment explicitly. ok kettenis@
* Fix aml_store() to work properly when the lvalue is a reference of  yasuoka  2021-03-07  1  -3/+4
  LocalX. In that case, resolving the reference must be done before
  resetting the LocalX variable. test daniel ok kettenis
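  The ordering issue in a generic sketch (invented value/reference
  types, not the actual AML interpreter): resolve the reference held in
  the local first, because resetting the local first would lose the
  store target.

      #include <stddef.h>

      struct value {
      	struct value	*ref;	/* non-NULL if this is a reference */
      	int		 ival;
      };

      /* Follow reference chains down to the real target. */
      static struct value *
      resolve(struct value *v)
      {
      	while (v->ref != NULL)
      		v = v->ref;
      	return v;
      }

      void
      store_to_local(struct value *local, int newval)
      {
      	/*
      	 * Resolve first: the lvalue may be a reference held in this
      	 * very local. Resetting the local before resolving would
      	 * make resolve() return the local itself, writing to the
      	 * wrong place.
      	 */
      	struct value *dst = resolve(local);

      	local->ref = NULL;	/* now it is safe to reset the local */
      	dst->ival = newval;
      }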
* Pass standard DMA tag to acpi(4) table drivers.  patrick  2021-03-07  1  -1/+2
  ok kettenis@

* ansi  jsg  2021-03-07  17  -636/+262

* ansi  jsg  2021-03-07  10  -279/+115

* ansi  jsg  2021-03-07  5  -61/+24

* ansi  jsg  2021-03-07  35  -816/+320
* use uint64_t ethernet addresses for compares in carp.  dlg  2021-03-07  3  -17/+16
  pass the uint64_t that ether_input has already converted from a real
  ethernet address into carp_input so it can use it without having to
  do its own conversion.

  tested by hrvoje popovski
  tested by me on amd64 and sparc64
  ok patrick@ jmatthew@
* Since with the current design there's one device per domain, and one  patrick  2021-03-06  1  -17/+11
  domain per pagetable, there's no need for a backpointer to the domain
  in the pagetable entry descriptor. There can't be any other domain.
  Also since there's no list, no list entry member is needed either.
  This reduces early allocation to half of the previous size. I think
  it's possible to reduce it even further and not need a pagetable
  entry descriptor at all, but I need to think about that a bit more.

* One major issue talked about in research papers is reducing the overhead  patrick  2021-03-06  1  -61/+103
  of the IOVA allocation. As far as I can see the current "best
  solution" is to cache IOVA ranges in percpu magazines. I don't think
  we have this issue at all thanks to bus_dmamap_create(9). The map is
  created ahead of time, and we know the maximum size of the DMA
  transfer. Since with smmu(4) we have IOVA per domain, allocating IOVA
  'early' is essentially free. But pagetable mapping also incurs a
  performance penalty, since we allocate pagetable entry descriptors
  through pools. Since we have the IOVA early, we can allocate those
  early as well. This allocation is a bit more expensive, but can be
  optimized further. All this means that there is no allocation
  overhead in hot code paths. The "only" thing remaining is assigning
  IOVA to the segments, adjusting the pagetable mappings, and flushing
  the IOTLB on unload. Maybe there's a way to do a combined flush for
  NICs, because we give a list of mbufs to the network stack and we
  could do the IOTLB invalidation only once right before we hand over
  the mbuf list to the upper layers.
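  A sketch of the "allocate early" idea with invented types: reserve
  IOVA space and pagetable entry descriptors when the map is created,
  since the maximum transfer size is already known there, leaving
  nothing to allocate on the hot load path.

      #include <stdint.h>
      #include <stdlib.h>

      #define PAGE_SIZE	4096

      struct pted { uint64_t pte; };	/* descriptor stand-in */

      struct dmamap {
      	uint64_t	 iova;		/* reserved at create time */
      	size_t		 maxsize;	/* bound for any future load */
      	struct pted	*pted;		/* descriptors, also early */
      	size_t		 npted;
      };

      static uint64_t
      iova_reserve(size_t size)		/* stand-in bump allocator */
      {
      	static uint64_t next = PAGE_SIZE;
      	uint64_t iova = next;

      	next += (size + PAGE_SIZE - 1) & ~(uint64_t)(PAGE_SIZE - 1);
      	return iova;
      }

      struct dmamap *
      dmamap_create(size_t maxsize)
      {
      	struct dmamap *m;

      	if ((m = calloc(1, sizeof(*m))) == NULL)
      		return NULL;
      	m->maxsize = maxsize;
      	m->iova = iova_reserve(maxsize);	/* paid once, not per load */
      	m->npted = (maxsize + PAGE_SIZE - 1) / PAGE_SIZE;
      	m->pted = calloc(m->npted, sizeof(*m->pted));
      	if (m->pted == NULL) {
      		free(m);
      		return NULL;
      	}
      	return m;
      }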
* ansi  jsg  2021-03-06  7  -18/+13

* ansi  jsg  2021-03-05  2  -14/+6

* ansi  jsg  2021-03-05  15  -250/+113

* ansi  jsg  2021-03-05  4  -53/+20

* ansi  jsg  2021-03-05  2  -88/+42

* deregister  jsg  2021-03-05  5  -32/+32

* ansi  jsg  2021-03-05  8  -202/+78

* pass the uint64_t dst ethernet address from ether_input to bridges.  dlg  2021-03-05  6  -29/+26
  tested on amd64 and sparc64.

* ansi  jsg  2021-03-05  1  -3/+2

* ansi  jsg  2021-03-05  1  -9/+5
* work with 64bit ethernet addresses in ether_input().  dlg  2021-03-05  1  -9/+10
  this applies the tricks with addresses from veb and etherbridge code
  to the normal ethernet input processing. it basically loads the
  destination address from the packet and the interface ethernet
  address into uint64_ts for comparison.

  tested by hrvoje popovski and chris cappuccio
  tested here on amd64, arm64, and sparc64
  ok claudio@ jmatthew@
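  The trick in a standalone sketch (the helper name follows the
  veb/etherbridge code, but everything here is illustrative): pack the
  6-byte address into the low 48 bits of a uint64_t so equality,
  broadcast and multicast checks become plain integer operations.

      #include <stdint.h>

      #define ETHER_ADDR_LEN	6

      /* Pack a 6-byte ethernet address into the low 48 bits. */
      static inline uint64_t
      ether_addr_to_e64(const uint8_t *ea)
      {
      	uint64_t e64 = 0;
      	int i;

      	for (i = 0; i < ETHER_ADDR_LEN; i++)
      		e64 = (e64 << 8) | ea[i];
      	return e64;
      }

      /* Destination checks as integer compares and bit tests. */
      int
      ether_is_for_us(const uint8_t *dst, const uint8_t *ifaddr)
      {
      	uint64_t d = ether_addr_to_e64(dst);

      	if (d == ether_addr_to_e64(ifaddr))
      		return 1;		/* unicast to our address */
      	if (d == 0xffffffffffffULL)
      		return 1;		/* broadcast */
      	if (d & 0x010000000000ULL)
      		return 1;		/* group bit set: multicast */
      	return 0;
      }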
* Improve readability of softc accesses.  patrick  2021-03-05  1  -13/+20

* Introduce an IOVA allocator instead of mapping pages 1:1. Mapping pages 1:1  patrick  2021-03-05  2  -106/+129
  obviously reduces the overhead of IOVA allocation, but instead you
  have the problem of doubly mapped pages, and making sure a page is
  only unmapped once the last user is gone.

  My initial attempt, modeled after apldart(4), calls the allocator for
  each segment. Unfortunately this introduces a performance penalty
  which reduces performance from around 700 Mbit/s to about 20 Mbit/s,
  or even less, in a simple single stream tcpbench scenario. Most mbufs
  from userland seem to have at least 3 segments. Calculating the
  needed IOVA space upfront reduces this penalty. IOVA allocation
  overhead could be reduced once and for all if it is possible to
  reserve IOVA during bus_dmamap_create(9), as it is only called upon
  creation and basically never for each DMA cycle. This needs some
  more thought.

  With this we now put the pressure on the PTED pools instead.
  Additionally, but not part of this diff, percpu pools for the PTEDs
  seem to reduce the overhead for that single stream tcpbench scenario
  to 0.3%. Right now this means we're hitting a different bottleneck,
  not related to the IOMMU. The next bottleneck will be discovered
  once forwarding is unlocked. Though it should be possible to
  benchmark the current implementation, and different designs, using a
  cycles counter.

  With IOVA allocation it's not easily possible to correlate memory
  passed to bus_dmamem_map(9) with memory passed to bus_dmamap_load(9).
  So far my code tries to use the same cacheability attributes as the
  kernel uses for its userland mappings. For the devices we support,
  there seems to be no need so far. If this ever gives us any trouble
  in the future, I'll have a look and fix it.

  While drivers should call bus_dmamap_unload(9) before
  bus_dmamap_destroy(9), the API explicitly states that
  bus_dmamap_destroy(9) should unload the map if it is still loaded.
  Hence we need to do exactly that. I actually have found one network
  driver which behaves that way, and the developer intends to change
  the network driver's behaviour.
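  A sketch of the upfront calculation with invented types: sum the
  page-spans of all segments first, take a single IOVA allocation for
  the whole transfer, then assign per-segment addresses out of that one
  window.

      #include <stdint.h>
      #include <stddef.h>

      #define PAGE_SIZE	4096
      #define trunc_page(a)	((a) & ~(uint64_t)(PAGE_SIZE - 1))
      #define round_page(a)	trunc_page((a) + PAGE_SIZE - 1)

      struct segment {
      	uint64_t paddr;	/* physical address of the segment */
      	size_t	 len;
      	uint64_t iova;	/* filled in below */
      };

      static uint64_t
      iova_alloc(size_t size)		/* stand-in bump allocator */
      {
      	static uint64_t next = PAGE_SIZE;
      	uint64_t iova = next;

      	next += size;
      	return iova;
      }

      void
      map_load(struct segment *segs, size_t nsegs)
      {
      	uint64_t iova, off;
      	size_t i, total = 0;

      	/* First pass: total page-aligned IOVA space needed. */
      	for (i = 0; i < nsegs; i++)
      		total += round_page(segs[i].paddr + segs[i].len) -
      		    trunc_page(segs[i].paddr);

      	iova = iova_alloc(total);	/* one allocation, not per segment */

      	/* Second pass: hand out addresses from the one window. */
      	for (off = 0, i = 0; i < nsegs; i++) {
      		segs[i].iova = iova + off +
      		    (segs[i].paddr & (PAGE_SIZE - 1));
      		off += round_page(segs[i].paddr + segs[i].len) -
      		    trunc_page(segs[i].paddr);
      	}
      }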
* Extend the commented code that shows which additional mappings are needed,  patrick  2021-03-05  1  -6/+24
  or which regions need to be reserved. As it turns out, a region we
  should not map is the PCIe address space. Making a PCIe device try to
  do DMA to an address in PCIe address space will obviously not make
  its way to SMMU and host memory. We'll probably have to add an API
  for that.
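  A sketch of such a reservation with invented window addresses: make
  the IOVA allocator skip the PCIe window so no DMA address can ever
  land inside it.

      #include <stdint.h>
      #include <stddef.h>

      struct iova_hole {
      	uint64_t start;
      	uint64_t end;
      };

      /* Invented PCIe window; real values come from firmware tables. */
      static const struct iova_hole pcie_window =
          { 0x600000000ULL, 0x6ffffffffULL };

      /* Hand out IOVA, jumping over the reserved hole. */
      uint64_t
      iova_alloc_avoiding(uint64_t *next, size_t size)
      {
      	uint64_t iova;

      	/* would [*next, *next+size) overlap the reserved window? */
      	if (*next <= pcie_window.end &&
      	    *next + size > pcie_window.start)
      		*next = pcie_window.end + 1;	/* skip the hole */

      	iova = *next;
      	*next += size;
      	return iova;
      }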
* Turns out the cores on Apple's M1 SoC only support 8-bit ASIDs.  kettenis  2021-03-04  1  -52/+57
  Thank you Apple (not)!

  Add an initial attempt to support such systems. This isn't good
  enough since the kernel will hang once you create more than 127
  processes. But it makes things work reasonably well until you reach
  that limit, which is good enough to build things on the machine
  itself. ok patrick@