wireguard-openbsd - WireGuard implementation for the OpenBSD kernel

	Commit message	Author	Age	Files	Lines
*	spelling	jsg	2021-03-11	3	-8/+8
*	Add support for timeconting in userland.	pirofti	2020-07-06	1	-0/+23
	This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc eliberating processes from the need for a context switch everytime they want to count the passage of time. If a timecounter clock can be exposed to userland than it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture. The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel. Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file. This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is are able to fly in Minecraft now). Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
*	Remove obsolete <machine/stdarg.h> header. Nowadays the vararg	visa	2020-06-30	1	-54/+0
	functionality is provided by <sys/stdarg.h> using compiler builtins. Tested in a ports bulk build on amd64 by naddy@ OK naddy@ mpi@
*	Implement cpu_rnd_messybits() as a read of the cycle counter register.	naddy	2020-06-14	1	-2/+12
	ok dlg@ deraadt@
*	introduce "cpu_rnd_messybits" for use instead of nanotime in dev/rnd.c.	dlg	2020-05-31	1	-1/+2
	rnd.c uses nanotime to get access to some bits that change quickly between events that it can mix into the entropy pool. it doesn't use nanotime to get a monotonically increasing set or ordered and accurate timestamps, it just wants something with bits that change. there's been discussions for years about letting rnd use a clock that's super fast to read, but not necessarily accurate, but it wasn't until recently that i figured out it wasn't interested in time at all, so things like keeping a fast clock coherent between cpu cores or correct according to ntp is unecessary. this means we can just let rnd read the cycle counters on cpus and things will be fine. cpus with cycle counters that vary in their speed and arent kept consistent between cores may even be desirable in this context. so this is the first step in converting rnd.c to reading cycle counter. it copies the nanotime backend to each arch, and they can replace it with something MD as a second step later on. djm@ suggested rnd_messybytes, but we landed on cpu_rnd_messybits. thanks to visa for his eyes. ok deraadt@ visa@ deraadt@ says he will help handle any MD fallout that occurs.
*	Retire <machine/varargs.h>.	visa	2020-05-27	1	-17/+0
	Nothing uses the header anymore. OK deraadt@ mpi@
*	Convert db_addr_t -> vaddr_t but leave the typedef for now.	mpi	2019-11-07	1	-7/+7
*	Remove file name and line number output from witness(4)	visa	2019-04-23	1	-16/+6
	Reduce code clutter by removing the file name and line number output from witness(4). Typically it is easy enough to locate offending locks using the stack traces that are shown in lock order conflict reports. Tricky cases can be tracked using sysctl kern.witness.locktrace=1 . This patch additionally removes the witness(4) wrapper for mutexes. Now each mutex implementation has to invoke the WITNESS_*() macros in order to utilize the checker. Discussed with and OK dlg@, OK mpi@
*	Include srp.h where struct cpu_info uses srp to avoid erroring out when	jsg	2018-12-05	1	-1/+2
	including cpu.h machine/intr.h etc without first including param.h when MULTIPROCESSOR is defined. ok visa@
*	Unify and bump some of the NMBCLUSTERS defines. Some archs had it set to	claudio	2018-09-14	1	-2/+2
	4MB which is far too low especially when the platform is able to run MP. New limits are, amd64 = 256M; arm64, mips64, sparc64 = 64M; alpha, arm, hppa, i386, powerpc = 32M; m88k, sh = 8M Still rather conservative numbers but much better than before. At least some hangs of arm64 build boxes was caused by this. OK kettenis@, visa@
*	Constipate all the struct lock_type's so they go into .rodata	guenther	2018-06-08	1	-3/+3
	ok visa@
*	Expose memory barriers to userland.	kettenis	2018-05-14	1	-2/+3
	ok visa@, mpi@
*	#define _MAX_PAGE_SHIFT in MD _types.h as the maximum pagesize an arch	deraadt	2018-03-05	1	-1/+2
	needs (looking at you sgi, but others required this before). This is for the circumstances we need pagesize known at compile time, not getpagesize() runtime. Use it for malloc storage sizes, for shm, and to set pthread stack default sizes. The stack sizes were a mess, and pushing them towards page-aligned is healthy move (which will also be needed by the coming stack register checker) ok guenther kettenis, discussion with stefan
*	Define and use IPL_MPFLOOR in our common mutex implementation.	mpi	2018-01-13	2	-3/+4
	ok kettenis@, visa@
*	Unify <machine/mutex.h> a bit further.	mpi	2018-01-04	1	-2/+2
	Remove `mtx_lock' from i386, add volatile before `mtx_owner' where it was missing. Inputs from kettenis@, ok visa@
*	Change __mp_lock_held() to work with an arbitrary CPU info structure and	mpi	2017-12-04	1	-2/+2
	extend ddb(4) "ps /o" output to print which CPU is currently holding the KERNEL_LOCK(). Tested by dhill@, ok visa@
*	Move mutex, condvar, and thread-specific data routes, pthread_once, and	guenther	2017-09-05	1	-5/+1
	pthread_exit from libpthread to libc, along with low-level bits to support them. Major bump to both libc and libpthread. Requested by libressl team. Ports testing by naddy@ ok kettenis@
*	Add WITNESS support	guenther	2017-07-16	1	-8/+36
	ok visa@ kettenis@
*	Unbreak profiling assembly functions in userland by defining the	mpi	2017-06-23	1	-2/+2
	correct prologue if compiled with -DPROF. ok deraadt@
*	tweak the spllower asm so it is more straightforward.	dlg	2017-05-19	1	-6/+10
	this properly identifies the registers used as input and output operands to the code running in the trap handler, and passes them to the asm statement as such. this means we dont have to do an extra copy in the asm, or an extra clobber to keep the compiler away from the registers. it also lets gcc set up and use the input register nicely before it reaches the asm. ok kettenis@
*	Implement copyin32(9).	kettenis	2017-05-18	1	-1/+3
	"your chicken scratches look fine to me" deraadt@
*	add a BUS_DMA_64BIT flag to bus_dma on all our archs.	dlg	2017-05-08	1	-1/+2
	this is so drivers can advertise that they can handle 64 dma addresses to the platform. it may choose to handle dmamaps differently based on this flag. tweaks and ok tom@ ok kettenis@
*	Hook up mutex(9) to witness(4).	visa	2017-04-20	1	-6/+20
*	Provide mips64 with kernel-facing TCB_{GET,SET} macros that store it	guenther	2017-04-13	1	-3/+1
	in struct mdproc. With that, all archs have those and the __HAVE_MD_TCB macro can be unifdef'ed as always defined. ok kettenis@ visa@ jsing@
*	In exec_elf.c: expand ELFNAME(), ELFNAME2(), and ELFNAMEEND() except	guenther	2017-02-08	1	-3/+1
	leaving out the size, so that ELFNAME2(exec,makecmds) becomes exec_elf_makecmds instead of exec_elf{32,64}_makecmds and then delete the ELFNAME2() and ELFNAMEEND() macros. Move the prototypes for functions local to exec_elf.c to there from exec_elf.h. Simplify the SMALL_KERNEL conditionals around the ELF coredump code. Change exec_conf.c to use the size-generic names and macros Remove exec_elf{32,64}.c and just build exec_elf.c; delete the _KERN_DO_ELF and _KERN_DO_ELF64 #defines. ok jca@, encouragement from deraadt@ and tom@
*	Increase the number of mbufs on most architectures. This is based	bluhm	2016-09-03	1	-2/+2
	on a guess how much memory a typical machine has. If the value is too high, users may run out of kernel memory. Then we will have to adjust this again. OK claudio@ deraadt@
*	SROP mitigation. sendsig() stores a (per-process ^ &sigcontext) cookie	deraadt	2016-05-10	1	-2/+3
	inside the sigcontext. sigreturn(2) checks syscall entry was from the exact PC addr in the (per-process ASLR) sigtramp, verifies the cookie, and clears it to prevent sigcontext reuse. not yet tested on landisk, sparc, *88k, socppc. ok kettenis
*	The hppa trapframe PC is marked (in the low two bits) to indicate a	deraadt	2016-05-10	1	-2/+2
	userland addressspace address. Those bits should be masked to callers of the PROC_PC() macro. ok kettenis
*	Initial support for MSI-X. Only supported on amd64 for now. I have diffs to	kettenis	2016-05-04	1	-1/+2
	actually use this in em(4) and xhci(4), but I'm not committing those yet because we almost certainly need to save and restore the MSI-X registers during suspend/resume. However, this allows mpi@ to play with multiple-vector support in networking hardware. Requested by mpi@ ok mlarkin@, mikeb@
*	G/C DDB_REGS.	mpi	2016-04-27	1	-2/+1
*	Rename kdb_trap() into db_ktrap().	mpi	2016-02-27	1	-2/+2
	The goal is to include it in the list of functions that must not be instrumented. All ddb(8) functions should be in this list and have their names start with 'db_'. ok visa@, deraadt@
*	make __cpu_simple_lock provide serialisation of the critical section.	dlg	2016-02-09	1	-8/+8
	that in turn makes atomic sequences actually atomic, which in turn means the refcnt api asserts wont fire erronously when if_get and if_put are actually used correctly. such embarrassment. reported by landry@ who also let me debug on the affected machines ok jmatthew@
*	Remove the definition of USRTEXT. It has no relevance outside of the non-PIE	miod	2015-11-01	1	-8/+3
	a.out world. ok deraadt@ kettenis@
*	Remove some trailing whitespace.	krw	2015-09-30	1	-2/+2
*	Use consistant whitespace/comments for #define'ing LABELSECTOR,	krw	2015-09-30	1	-4/+4
	LABELOFFSET and MAXPARTITIONS. Easier on the eye when scanning through all these files. No functional change.
*	lint is dead and C99 may be old enough to drive a car: delete LONGLONG	guenther	2015-09-26	1	-3/+1
	comments ok millert@
*	intr_barrier(9) for hppa.	kettenis	2015-09-13	1	-1/+3
*	_NLIST_DO_ELF is no longer needed: it's the only option	guenther	2015-08-29	1	-2/+1
	ok deraadt@
*	Always #include <sys/mutex.h>: need struct mutex for struct vm_page_md	guenther	2015-07-27	1	-2/+2
	problem noted by landry@ ok dlg@
*	First stab at making the hppa mpsafe. Not quite there yet though.	kettenis	2015-07-14	1	-4/+7
*	introduce srp, which according to the manpage i wrote is short for	dlg	2015-07-02	1	-1/+5
	"shared reference pointers". srp allows concurrent access to a data structure by multiple cpus while avoiding interlocking cpu opcodes. it manages its own reference counts and the garbage collection of those data structure to avoid use after frees. internally srp is a twisted version of hazard pointers, which are a relative of RCU. jmatthew wrote the bulk of a hazard pointer implementation and changed bpf to use it to allow mpsafe access to bpfilters. however, at s2k15 we were trying to apply it to other data structures but the memory overhead of every hazard pointer would have blown out significantly in several uses cases. a bulk of our time at s2k15 was spent reworking hazard pointers into srp. this diff adds the srp api and adds the necessary metadata to struct cpuinfo on our MP architectures. srp on uniprocessor platforms has alternate code that is optimised because it knows there'll be no concurrent access to data by multiple cpus. srp is made available to the system via param.h, so it should be available everywhere in the kernel. the docs likely need improvement cos im too close to the implementation. ok mpi@
*	emul_native is only used for kernel threads which can't dump core, so	guenther	2015-05-05	1	-7/+1
	delete coredump_trad(), uvm_coredump(), cpu_coredump(), struct md_coredump, and various #includes that are superfluous. This leaves compat_linux processes without a coredump callback. If that ability is desired, someone should update it to use coredump_elf32() and verify the results... ok kettenis@
*	rework hppa mutexes.	dlg	2015-05-02	1	-8/+7
	this is largely based on src/sys/arch/alpha/alpha/mutex.c r1.14 and src/sys/arch/sgi/sgi/mutex.c r1.15 always and explicitely record which cpu owns the lock (or NULL if noone owns it). improve the mutex diagnostics/asserts so they operate on the mtx_owner field rather than mtx_lock. previously the asserts would assume the lock cpu owns the lock if any of them own the lock, which blows up badly. hppa hasnt got good atomic cpu opcodes, so this still relies on ldcws to serialise access to the lock. while im here i also shuffled the code. on MULTIPROCESSOR systems instead of duplicating code between mtx_enter and mtx_enter_try, mtx_enter simply loops on mtx_enter_try until it succeeds. this also provides an alternative implementation of mutexes on !MULTIPROCESSOR systems that avoids interlocking opcodes. mutexes wont contend on UP boxes, theyre basically wrappers around spls. we can just do the splraise, stash the owner as a guard value for DIAGNOSTIC and return. similarly, mtx_enter_try on UP will never fail, so we can just call mtx_enter and return 1. tested by and ok kettenis@ jsing@
*	Remove SIZE_MAX from limits.h. It was added years ago before we	millert	2015-04-30	1	-4/+1
	had a proper stdint.h. No ports fallout. OK guenther@ miod@
*	Change pmap_remove_holes() to take a vmspace instead of a map as its argument.	miod	2015-02-15	1	-2/+2
	Use this on vax to correctly pick the end of the stack area now that the stackgap adjustment code will no longer guarantee it is a fixed location.
*	dont need lockmgr for pmap things, so we dont need sys/lock.h	dlg	2015-02-11	1	-4/+1
*	intr.c needs atomic.h for atomic_setbits_int to say softints are pending.	dlg	2015-02-11	1	-3/+1
*	make the rwlock implementation MI.	dlg	2015-02-11	1	-5/+1
	each arch used to have to provide an rw_cas operation, but now we have the rwlock code build its own version. on smp machines it uses atomic_cas_ulong. on uniproc machines it avoids interlocked instructions by using straight loads and stores. this is safe because rwlocks are only used from process context and processes are currently not preemptible in our kernel. so alpha/ppc/etc might get a benefit. ok miod@ kettenis@ deraadt@
*	increase min address to page size for all remaining min == 0 systems.	tedu	2015-02-10	1	-2/+2
	not necessary, but consistent with other platforms. ok deraadt
*	remove simplelocks	deraadt	2014-12-17	1	-3/+1
	ok kettenis