path: root/lib/libc/arch
Commit message | Author | Age | Files | Lines
* Adding a hard-trap instruction after the __threxit syscall instructionkurt2021-02-031-2/+1
| | | | broke pthreads on hppa. Reverting. Ok deraadt@
* Geode CPU does not support SSE, so MXCSR does not exist there. Asbluhm2020-12-133-9/+3
| | | | | | | | our i386 compiler does not generate SSE instructions by default, it is not strictly necessary to save MXCSR content between setjmp(3) and longjmp(3). We do not want to end supporting such old processors now. Remove the stmxcsr and ldmxcsr instructions from libc. reported by Johan Huldtgren; OK jsg@ kettenis@
* On i386 setjmp(3) should store the FPU state and longjmp(3) restorebluhm2020-12-063-3/+15
| | | | | | it. There is enough space in jmp_buf to save MXCSR and CW register. Idea taken from amd64. This fixes regress/lib/libc/setjmp-fpu . OK kettenis@
* Introduce constants to access the setjmp(3) jmp_buf fields frombluhm2020-12-063-76/+79
| | | | | | | i386 libc. The assembler code is more readable than with magic numbers. This brings i386 in line with amd64. No change in object file. OK kettenis@
* Add retguard to macppc kernel locore.S, ofwreal.S, setjmp.Sgkoehler2020-11-287-24/+24
| | | | | | | | | This changes RETGUARD_SETUP(ffs) to RETGUARD_SETUP(ffs, %r11, %r12) and RETGUARD_CHECK(ffs) to RETGUARD_CHECK(ffs, %r11, %r12) to show that r11 and r12 are in use between setup and check, and to pick registers other than r11 and r12 in some kernel functions. ok mortimer@ deraadt@
* Actually m88k assembler can not handle 'nop' mnemonic, use a macro instead.aoyama2020-11-071-2/+4
| | | | ok deraadt@
* Retguard asm macros for powerpc libc, ld.sogkoehler2020-10-269-64/+87
| | | | | | | | | | Add retguard to some, but not all, asm functions in libc. Edit SYS.h in libc to remove the PREFIX macros and add SYSENTRY (more like aarch64 and powerpc64), so we can insert RETGUARD_SETUP after SYSENTRY. Some .S files in this commit don't get retguard, but do stop using the old prefix macros. Tested by deraadt@, who put this diff in a macppc snap.
* Save and restore the MXCSR register and the FPU control word such thatkettenis2020-10-213-3/+15
| | | | | | floating-point control modes are properly restored by longjmp(3). ok guenther@
* Use a trap instruction that unconditionally terminates the process.visa2020-10-201-2/+2
| | | | OK deraadt@
* Retguard sigsetjmp on powerpc64.mortimer2020-10-191-5/+10
| | | | ok deraadt@
* replace ad-hoc illegal instruction with the architecturally defined onenaddy2020-10-192-4/+4
| | | | | ("permanently undefined") ok deraadt@ kettenis@
* add retguard prologue/epiloguederaadt2020-10-191-2/+4
| | | | ok mortimer
* Save and restore the FPCR register such that floating-point control modeskettenis2020-10-192-6/+14
| | | | are properly restored by longjmp(3).
* Add powerpc64 retguard macros for setjmp / longjmp.mortimer2020-10-181-5/+10
| | | | ok deraadt@
* SYS___threxit cannot fail, but this integration looks like a gadget.deraadt2020-10-1811-11/+24
| | | | | Put a hard-trap instruction after the syscall instruction. ok kettenis mortimer
* Adapt SYS.h to use retguard macros from asm.h, so that generated systemderaadt2020-10-168-46/+77
| | | | | | calls are guarded. Adapt the first few hand-written functions to this model (a few remain) ok kettenis mortimer
* Mark top-level frame for new thread in both CFI and with zeroguenther2020-10-012-2/+16
| | | | | | framepointer, so gdb knows to stop. Inspired by glibc ok kettenis@
* amd64: TSC timecounter: prefix RDTSC with LFENCEcheloha2020-08-231-4/+4
| Regarding RDTSC, the Intel ISA reference says (Vol 2B. 4-545):
|
| > The RDTSC instruction is not a serializing instruction.
| >
| > It does not necessarily wait until all previous instructions
| > have been executed before reading the counter.
| >
| > Similarly, subsequent instructions may begin execution before the
| > read operation is performed.
| >
| > If software requires RDTSC to be executed only after all previous
| > instructions have completed locally, it can either use RDTSCP (if
| > the processor supports that instruction) or execute the sequence
| > LFENCE;RDTSC.
|
| To mitigate this problem, Linux and DragonFly use LFENCE. FreeBSD and NetBSD take a more complex route: they selectively use MFENCE, LFENCE, or CPUID depending on whether the CPU is AMD, Intel, VIA or something else.
|
| Let's start with just LFENCE. We only use the TSC as a timecounter on SSE2 systems so there is no need to conditionally compile the LFENCE. We can explore conditionally using MFENCE later.
|
| Microbenchmarking on my machine (Core i7-8650) suggests a penalty of about 7-10% over a "naked" RDTSC. This is acceptable. It's a bit of a moot point though: the alternative is a considerably weaker monotonicity guarantee when comparing timestamps between threads, which is not acceptable.
|
| It's worth noting that kernel timecounting is not *exactly* like userspace timecounting. However, they are similar enough that we can use userspace benchmarks to make conjectures about possible impacts on kernel performance.
|
| Concerns about kernel performance, in particular the network stack, were the blocking issue for this patch. Regarding networking performance, claudio@ says a 10% slower nanotime(9) or nanouptime(9) is acceptable and that shaving off "tens of cycles" is a micro-optimization. There are bigger optimizations to chase down before such a difference would matter.
|
| There is additional work to be done here. We could experiment with conditionally using MFENCE. Also, the userspace TSC timecounter doesn't have access to the adjustment skews available to the kernel timecounter. pirofti@ has suggested a scheme involving RDTSCP and an array of skews mapped into user memory. deraadt@ has suggested a scheme where the skew would be kept in the TCB. However it is done, access to the skews will improve monotonicity, which remains a problem with the TSC.
|
| First proposed by kettenis@ and pirofti@. With input from pirofti@, deraadt@, guenther@, naddy@, kettenis@, and claudio@. Based on similar changes in Linux, FreeBSD, NetBSD, and DragonFlyBSD.
|
| ok deraadt@ pirofti@ kettenis@ naddy@ claudio@
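The LFENCE;RDTSC sequence described in this commit can be sketched in C roughly as follows. This is an illustration only, not the code from the commit: the function names `tsc_combine` and `rdtsc_lfence_sketch` are hypothetical, GCC/Clang inline assembly is assumed, and on non-x86 targets the reader simply returns 0.

```c
#include <stdint.h>

/* Combine the two 32-bit halves that RDTSC returns in EDX:EAX. */
uint64_t
tsc_combine(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Hypothetical helper: issue LFENCE so that earlier instructions have
 * completed locally, then read the TSC. Assumes an SSE2-capable CPU
 * on x86; other targets get a placeholder value of 0.
 */
uint64_t
rdtsc_lfence_sketch(void)
{
#if defined(__x86_64__) || defined(__i386__)
	uint32_t lo, hi;

	__asm__ volatile("lfence; rdtsc" : "=a" (lo), "=d" (hi));
	return tsc_combine(lo, hi);
#else
	return 0;
#endif
}
```

Keeping the fence and the read in one asm statement prevents the compiler from placing other code between them.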
* Fix two cases where we should compare/store 64-bit values instead ofkettenis2020-07-271-3/+3
| | | | | | 32-bit values. ok gkoehler@, drahn@
* Fix powerpc64's sbrk()gkoehler2020-07-271-3/+5
| | | | | | | Initialize __curbrk = &_end. It's a 64-bit pointer, so use ld/std instead of lwz/stw. ok drahn@
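Why a word-sized access is wrong for a 64-bit pointer can be shown with a small C simulation. This is a sketch under assumptions, not the libc code: the function names are hypothetical, and `memcpy` of 4 or 8 bytes stands in for the lwz and ld instructions. Which half of the value a 4-byte read returns depends on the byte order of the machine.

```c
#include <stdint.h>
#include <string.h>

/* Like lwz: read only 4 bytes at the slot's address and zero-extend. */
uint64_t
load_word_sketch(const uint64_t *slot)
{
	uint32_t word;

	memcpy(&word, slot, sizeof(word));
	return word;
}

/* Like ld: read all 8 bytes of the slot. */
uint64_t
load_doubleword_sketch(const uint64_t *slot)
{
	uint64_t dword;

	memcpy(&dword, slot, sizeof(dword));
	return dword;
}
```

With a slot holding 0x1122334455667788, the word-sized read yields only one 32-bit half, so a pointer read this way no longer refers to the original address.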
* Userland timecounter implementation for octeonvisa2020-07-181-3/+30
| | | | OK naddy@; no objections from kettenis@
* Userland timecounter for macppcgkoehler2020-07-171-2/+22
| | | | | | | Tested by cwen@ and myself. Thanks to pirofti@ for creating the userland timecounter feature. ok kettenis@ pirofti@ deraadt@ cheloha@
* Userland timecounter implementation for arm64.kettenis2020-07-151-3/+29
| | | | ok naddy@
* Fix TIB/TCB on powerpc64. Some bright soul decided that the TCB shouldkettenis2020-07-141-3/+3
| | | | | | | | | | | be 8 bytes in the 64-bit ABI just like in the 32-bit ABI. But that means there is no "spare" word in the TCB that we can use to store a pointer to our struct pthread. So we have to treat powerpc64 special. Also recognize that the thread pointer points 0x7000 bytes after the TCB. Since the TCB is 8 bytes this means that TCB_OFFSET should be 0x7008. Pointed out by guenther@; ok deraadt@
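The offset arithmetic in this message can be written out as a short C sketch. The macro and function names below are hypothetical, and the sketch assumes that "0x7000 bytes after the TCB" means 0x7000 bytes past the end of the 8-byte TCB, which is what makes the total 0x7008.

```c
#include <stdint.h>

#define TCB_SIZE_SKETCH		8	/* TCB size stated in the commit */
#define TP_BIAS_SKETCH		0x7000	/* thread pointer bias past the TCB */
#define TCB_OFFSET_SKETCH	(TP_BIAS_SKETCH + TCB_SIZE_SKETCH)

/* Recover the start of the TCB from the thread pointer. */
uintptr_t
tp_to_tcb_sketch(uintptr_t tp)
{
	return tp - TCB_OFFSET_SKETCH;
}

/* Compute the thread pointer from the start of the TCB. */
uintptr_t
tcb_to_tp_sketch(uintptr_t tcb)
{
	return tcb + TCB_OFFSET_SKETCH;
}
```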
* Add usertc.c.kettenis2020-07-111-0/+1
|
* Add missing usertc.c file.kettenis2020-07-111-0/+21
|
* Userland timecounter implementation for sparc64.kettenis2020-07-082-4/+41
| | | | ok deraadt@, pirofti@
* Clean up the amd64 userland timecounter implementation a bit:kettenis2020-07-081-10/+10
| | | | | | | | | * We don't need TC_LAST * Make internal functions static to avoid namespace pollution in libc.a * Use a switch statement to harmonize with architectures providing multiple timecounters ok deraadt@, pirofti@
* Add support for timecounting in userland.pirofti2020-07-0622-21/+273
| | | | | | | | | | | | | | | | | | | | | | | | | | This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time. If a timecounter clock can be exposed to userland then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture. The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel. Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file. This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now). Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
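The adjustment step described above (time at the last kernel update plus the counter ticks elapsed since then) can be sketched in C as follows. This is a simplified illustration, not the libc implementation: the structure layout, every field name except tc_user, and the helper names are assumptions, and the consistency checks a real reader needs against concurrent kernel updates are omitted.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the data shared by the kernel. */
struct timehands_sketch {
	uint64_t base_ns;	/* time at the last kernel update, in ns */
	uint32_t offset_count;	/* counter value at that update */
	uint32_t counter_mask;	/* valid bits of the hardware counter */
	uint64_t frequency;	/* counter ticks per second */
	int	 tc_user;	/* non-zero: counter is readable in userland */
};

/* Stand-in for the machine-dependent counter read. */
typedef uint32_t (*get_timecount_fn)(void);

/* Fake counter so the sketch can be exercised without hardware. */
uint32_t fake_counter_value;

uint32_t
fake_counter_sketch(void)
{
	return fake_counter_value;
}

/*
 * Returns 0 and stores the current time in *ns, or -1 when the
 * counter is not exposed and the caller must fall back to a syscall.
 */
int
usertc_now_sketch(const struct timehands_sketch *th, get_timecount_fn get,
    uint64_t *ns)
{
	uint32_t delta;

	if (th->tc_user == 0)
		return -1;
	/* Ticks since the last kernel update; the mask handles wraparound. */
	delta = (get() - th->offset_count) & th->counter_mask;
	*ns = th->base_ns + (uint64_t)delta * 1000000000ULL / th->frequency;
	return 0;
}
```

With a 1 MHz counter, 250 elapsed ticks add 250000 ns to the base time, and no system call is made on the fast path.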
* Use a relative branch to jump from setjmp(3) into _setjmp(4).kettenis2020-07-021-5/+4
| | | | Use correct register to reference the location where we store CR.
* Add missing comparison instruction. Load %r12 with the indirect branchkettenis2020-06-301-1/+3
| | | | address to load the correct TOC address.
* Use C versions of bcopy(3) and memmove(3) for now as the assembly versionkettenis2020-06-291-2/+2
| | | | | | of bcopy(9) doesn't work in its current state. ok deraadt@
* Use std instead of stw to store CR since we use std in sigsetjmp(3) andkettenis2020-06-281-2/+2
| | | | we use ld to load it again in longjmp(3).
* The 2nd and 3rd argument are pointers, so use the appropriate doublewordkettenis2020-06-281-5/+5
| | | | | | instructions. ok drahn@
* Add missing label.kettenis2020-06-271-2/+2
|
* Provide an optimized implementation of ffs(3) in libc onnaddy2020-06-266-6/+55
| | | | | | aarch64/powerpc/powerpc64, making use of the count leading zeros instruction. Also add a brief regression test. ok deraadt@ kettenis@
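The idea of building ffs(3) on a count-leading-zeros instruction can be sketched in portable C. This is not the assembly from the commit: the function name is hypothetical, and the GCC/Clang builtin `__builtin_clz` is assumed as a stand-in for the hardware instruction.

```c
#include <limits.h>

/*
 * ffs(3) semantics: 1-based index of the least significant set bit,
 * or 0 when no bit is set.
 */
int
ffs_clz_sketch(int mask)
{
	unsigned int u = (unsigned int)mask;

	if (u == 0)
		return 0;	/* __builtin_clz(0) is undefined */
	/* u & -u isolates the lowest set bit; clz then locates it. */
	u &= 0u - u;
	return (int)(sizeof(u) * CHAR_BIT) - __builtin_clz(u);
}
```

Isolating the lowest set bit first is what lets a leading-zero count answer a question about the trailing end of the word.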
* Fix TCB_OFFSET_ERRNO. Adjust comments to reflect that powerpc64 uses %r13kettenis2020-06-261-4/+4
| | | | | | as the per-thread register. ok patrick@, drahn@
* Avoid "bare" register numbers.kettenis2020-06-264-26/+26
|
* PowerPC64 libc powerpc sys filesdrahn2020-06-258-0/+368
| | | | | | | | Initial attempt to port powerpc code to powerpc64 Expects TOC loading in ENTRY(), ok kettenis@ (some cleanup required)
* PowerPC64 libc string/net filesdrahn2020-06-252-0/+178
| | | | | | | | | Initial attempt to port powerpc code to powerpc64 Expects TOC loading in ENTRY(), memmove.S is the powerpc 32 bit, optimization is possible for 64 bit and handle len of > 32 bits.
* *** empty log message ***drahn2020-06-251-0/+1
|
* PowerPC64 libc/arch/powerpc/gdtoa filesdrahn2020-06-253-0/+20
| | | | | | This is almost a direct copy from powerpc with 64 bit mods, with two additions present in 64 arch. NOTE: long double 128 is not supported currently.
* Committed wrong version of file, atomic_lock is 32 bit.drahn2020-06-251-6/+6
|
* PowerPC64 libc gen filesdrahn2020-06-2514-0/+812
| | | | | | | | Initial attempt to port powerpc code to powerpc64 Expects TOC loading in ENTRY(), ok kettenis@
* PowerPC64 libc (libc powerpc top)drahn2020-06-254-0/+176
| | | | | | | | | | Expects ELFv2 TOC loading in ENTRY(), build with -gdwarf-4 Split SYS.h into SYS.h and DEFS.h fix tabs after #define
* Anthony Steinhauser reports that 32-bit arm cpus have the same speculationderaadt2020-03-131-3/+3
| | | | | | | | | | | problems as 64-bit models. To resolve the syscall speculation, as a first step "nop; nop" was added after all occurrences of the syscall ("swi 0") instruction. Then the kernel was changed to jump over the 2 extra instructions. In this final step, that pair of nops is converted into the speculation-blocking sequence ("dsb nsh; isb"). Don't try to build through these multiple steps; use a snapshot instead. Packages matching the new ABI will be out in a while... ok kettenis
* Anthony Steinhauser reports that 32-bit arm cpus have the same speculationderaadt2020-03-111-3/+5
| | | | | | problems as 64-bit models. For the syscall instruction issue, add nop;nop after swi 0, in preparation for jumping over a speculation barrier here later. ok kettenis
* Now that the kernel skips the two instructions immediately followingkettenis2020-02-181-3/+3
| | | | | | | | a syscall, replace the double nop with a dsb nsh; isb; sequence which stops the CPU from speculating any further. This fix was suggested by Anthony Steinhauser. ok deraadt@
* Insert two nop instructions after each svc #0 instruction in userland.kettenis2020-01-261-2/+4
| | | | | | | | They will be replaced by a speculation barrier as soon as we teach the kernel to skip over these two instructions when returning from a system call. ok patrick@, deraadt@
* Mark as 'protected' all the routines from the quad/ and softfloat/ subdirs,guenther2019-11-101-1/+4
| | | | | | | | | as well as those in arch/arm/gen/divsi3.S. This cleans up the PLTs on the 32bit archs. luna88k testing by aoyama@ "looks good" kettenis@, testing and ok deraadt@