path: root/lib/libc/arch/amd64
Commit message | Author | Age | Files | Lines
* Save and restore the MXCSR register and the FPU control word such that | kettenis | 2020-10-21 | 3 | -3/+15
  floating-point control modes are properly restored by longjmp(3).
  ok guenther@
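One way to observe the behavior this commit describes is to change the SSE rounding mode between setjmp() and longjmp() and read it back afterwards. The sketch below is amd64-only and uses the `<xmmintrin.h>` MXCSR helpers; the function name `rounding_mode_after_longjmp` is made up for illustration. It makes no claim about which result a given libc produces: a longjmp() that restores MXCSR would be expected to yield `_MM_ROUND_NEAREST`, one that does not would leave `_MM_ROUND_UP` in place.

```c
#include <setjmp.h>
#include <xmmintrin.h>

/*
 * Returns the SSE rounding mode seen after longjmp() back to a setjmp()
 * point that was established while round-to-nearest was in effect.
 */
unsigned int rounding_mode_after_longjmp(void)
{
	jmp_buf env;
	unsigned int saved = _mm_getcsr();	/* caller's MXCSR, restored below */
	unsigned int observed;

	_MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
	if (setjmp(env) == 0) {
		/* Change the control mode, then jump back to setjmp(). */
		_MM_SET_ROUNDING_MODE(_MM_ROUND_UP);
		longjmp(env, 1);
	}
	observed = _MM_GET_ROUNDING_MODE();
	_mm_setcsr(saved);
	return observed;
}
```

Comparing the return value against `_MM_ROUND_NEAREST` and `_MM_ROUND_UP` shows whether the platform's longjmp() restores the control modes.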
* SYS___threxit cannot fail, but this integration looks like a gadget. | deraadt | 2020-10-18 | 1 | -1/+2
  Put a hard-trap instruction after the syscall instruction.
  ok kettenis mortimer
* Mark top-level frame for new thread in both CFI and with zero | guenther | 2020-10-01 | 1 | -1/+9
  framepointer, so gdb knows to stop. Inspired by glibc
  ok kettenis@
* amd64: TSC timecounter: prefix RDTSC with LFENCE | cheloha | 2020-08-23 | 1 | -4/+4
  Regarding RDTSC, the Intel ISA reference says (Vol 2B. 4-545):

  > The RDTSC instruction is not a serializing instruction.
  >
  > It does not necessarily wait until all previous instructions
  > have been executed before reading the counter.
  >
  > Similarly, subsequent instructions may begin execution before the
  > read operation is performed.
  >
  > If software requires RDTSC to be executed only after all previous
  > instructions have completed locally, it can either use RDTSCP (if
  > the processor supports that instruction) or execute the sequence
  > LFENCE;RDTSC.

  To mitigate this problem, Linux and DragonFly use LFENCE. FreeBSD and NetBSD take a more complex route: they selectively use MFENCE, LFENCE, or CPUID depending on whether the CPU is AMD, Intel, VIA or something else.

  Let's start with just LFENCE. We only use the TSC as a timecounter on SSE2 systems so there is no need to conditionally compile the LFENCE. We can explore conditionally using MFENCE later.

  Microbenchmarking on my machine (Core i7-8650) suggests a penalty of about 7-10% over a "naked" RDTSC. This is acceptable. It's a bit of a moot point though: the alternative is a considerably weaker monotonicity guarantee when comparing timestamps between threads, which is not acceptable.

  It's worth noting that kernel timecounting is not *exactly* like userspace timecounting. However, they are similar enough that we can use userspace benchmarks to make conjectures about possible impacts on kernel performance.

  Concerns about kernel performance, in particular the network stack, were the blocking issue for this patch. Regarding networking performance, claudio@ says a 10% slower nanotime(9) or nanouptime(9) is acceptable and that shaving off "tens of cycles" is a micro-optimization. There are bigger optimizations to chase down before such a difference would matter.

  There is additional work to be done here. We could experiment with conditionally using MFENCE. Also, the userspace TSC timecounter doesn't have access to the adjustment skews available to the kernel timecounter. pirofti@ has suggested a scheme involving RDTSCP and an array of skews mapped into user memory. deraadt@ has suggested a scheme where the skew would be kept in the TCB. However it is done, access to the skews will improve monotonicity, which remains a problem with the TSC.

  First proposed by kettenis@ and pirofti@.

  With input from pirofti@, deraadt@, guenther@, naddy@, kettenis@, and claudio@.

  Based on similar changes in Linux, FreeBSD, NetBSD, and DragonFlyBSD.

  ok deraadt@ pirofti@ kettenis@ naddy@ claudio@
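The LFENCE;RDTSC sequence discussed in this commit can be sketched with GCC/Clang inline assembly. This is a minimal amd64-only illustration, not the code from the commit; the function name `rdtsc_lfence` is an assumption made for the example.

```c
#include <stdint.h>

/*
 * Read the time stamp counter after an LFENCE, so that RDTSC is not
 * executed until earlier instructions have completed locally.
 * RDTSC returns the low 32 bits in EAX and the high 32 bits in EDX.
 */
static inline uint64_t rdtsc_lfence(void)
{
	uint32_t lo, hi;

	__asm__ volatile("lfence; rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}
```

Two consecutive calls on the same CPU give a rough cycle delta for the code between them. As the commit message notes, comparing values across threads still depends on per-CPU skew.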
* Clean up the amd64 userland timecounter implementation a bit: | kettenis | 2020-07-08 | 1 | -10/+10
  * We don't need TC_LAST
  * Make internal functions static to avoid namespace pollution in libc.a
  * Use a switch statement to harmonize with architectures providing multiple timecounters
  ok deraadt@, pirofti@
* Add support for timecounting in userland. | pirofti | 2020-07-06 | 2 | -2/+44
  This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time.

  If a timecounter clock can be exposed to userland then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture.

  The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel.

  Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file.

  This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now).

  Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others!
  OK from at least kettenis@, cheloha@, naddy@, sthen@
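The adjustment step described above (take the time of the last kernel update, then add the counter ticks elapsed since) can be sketched as follows. Everything here is hypothetical: `struct timekeep_sketch`, its fields, and `sketch_now_ns` are invented for illustration and do not reflect the real shared-page layout. A real reader would also have to check that the kernel did not update the data mid-read, which this sketch leaves out.

```c
#include <stdint.h>

/* Hypothetical, simplified snapshot of kernel-published timing data. */
struct timekeep_sketch {
	uint64_t base_ns;	/* time at the last kernel update */
	uint32_t offset_count;	/* counter value at that update */
	uint32_t counter_mask;	/* valid bits of the hardware counter */
	uint64_t ns_per_tick;	/* assumed fixed scale, for simplicity */
};

/* Current time = base time + (ticks since the last update) * scale. */
uint64_t sketch_now_ns(const struct timekeep_sketch *tk, uint32_t counter_now)
{
	/* Unsigned subtraction handles counter wraparound. */
	uint32_t delta = (counter_now - tk->offset_count) & tk->counter_mask;

	return tk->base_ns + (uint64_t)delta * tk->ns_per_tick;
}
```

In this sketch `counter_now` stands in for whatever the machine-dependent counter read returns, for example a TSC value on amd64.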
* Stop exporting the internal _mcount symbol as that's only referenced | guenther | 2019-10-26 | 1 | -1/+0
  by the ASM stub, which is also in libc. The compiler only generates invocations of the latter.
  ok mpi@ deraadt@ kettenis@
* Apply retpoline protection to the indirect call to the thread startfunc | guenther | 2019-05-10 | 1 | -2/+7
  ok mortimer@
* Add retguard macros to setjmp/longjmp on amd64. Knocks out some useful | mortimer | 2019-03-30 | 3 | -21/+33
  gadgets from libc.
  ok deraadt@, kettenis@
* Remove FBSDID. | kevlo | 2019-03-15 | 1 | -4/+1
  ok deraadt@
* In asm.h ensure NENTRY uses the old-school nop-sled align, but change standard | deraadt | 2018-07-10 | 1 | -1/+1
  ENTRY is a trapsled. Fix a few functions which fall-through into an ENTRY macro. amd64 binaries now are free of double+-nop sequences (except for one assembler nit in aes-586.pl). Previous changes by guenther got us here.
  ok mortimer kettenis
* Add retguard macros for libc. | mortimer | 2018-07-03 | 23 | -23/+72
  ok deraadt
* Clear the sign bit in the QNAN constants used by strtof, strtod and strtold, | jmatthew | 2018-05-28 | 1 | -4/+4
  so passing "nan" and "-nan" produces a NaN with the right sign. Bug reported and diff provided by George Koehler.
  ok kettenis@
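A small probe for the behavior this fix targets: parse a string with strtod() and report whether the resulting NaN carries the sign bit. The helper name `nan_sign_of` is made up for this sketch. With the fix applied, "nan" would be expected to yield 0 and "-nan" to yield 1.

```c
#include <math.h>
#include <stdlib.h>

/*
 * Returns 1 if s parses to a NaN with the sign bit set, 0 if it parses
 * to a NaN with the sign bit clear, and -1 if the result is not a NaN.
 */
int nan_sign_of(const char *s)
{
	double d = strtod(s, NULL);

	if (!isnan(d))
		return -1;
	return signbit(d) ? 1 : 0;
}
```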
* Instead of trying to handle ffs() with the normal rename-mark-hidden-and-alias | guenther | 2018-01-18 | 1 | -2/+3
  dance, mark it protected. This works better for both gcc and clang: gcc blocks overriding of internal calls, while clang permits inlining again.
  ok otto@
* clang doesn't propagate attributes like "asm labels" and "visibility(hidden)" | guenther | 2017-11-29 | 3 | -5/+17
  to builtins like mem{set,cpy,move} and __stack_smash_handler. So, when building with clang, instead mark those as protected visibility to get rid of the PLT relocations. We can't take the address of them then, but that's ok: it's a build-time error not a run-time error.
  ok kettenis@
* Don't need .text before ENTRY(), also minor spacing cleanups | deraadt | 2017-08-19 | 2 | -7/+6
* Put _map table into .rodata instead of .text | deraadt | 2017-08-19 | 1 | -3/+2
* Copy files from ../librthread in preparation for moving functionality | guenther | 2017-08-15 | 1 | -0/+26
  from libpthread to libc. No changes to the build yet, just making it easier to review the substantive diffs.
  ok beck@ kettenis@ tedu@
* Clang ignores a .weak directive before a function is actually defined. So | kettenis | 2016-09-10 | 2 | -4/+4
  move it from before ENTRY() to after END(). Keeps brk(2) and sbrk(2) weak when compiling libc with clang.
  ok guenther@
* Remove branch prediction hints from conditional branch instructions. These | kettenis | 2016-09-06 | 1 | -3/+3
  hints are not recognized by clang's builtin assembler and the opcode prefixes they generate have been no-ops for all CPUs after the Pentium 4.
  ok guenther@
* Switch from calling obsolete sig{block,setmask} to directly using the | guenther | 2016-05-29 | 2 | -21/+27
  sigprocmask syscall
  ok kettenis@
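At the C level, the same substitution looks like this: `sigprocmask(SIG_BLOCK, ...)` covers what sigblock() did, and `sigprocmask(SIG_SETMASK, ...)` covers sigsetmask(). The commit itself changes assembly, so this is only an illustrative analogue; the helper names `block_signal` and `restore_mask` are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Add signo to the blocked set; the previous mask is stored in *old. */
int block_signal(int signo, sigset_t *old)
{
	sigset_t set;

	sigemptyset(&set);
	sigaddset(&set, signo);
	return sigprocmask(SIG_BLOCK, &set, old);	/* was: sigblock() */
}

/* Replace the whole mask with a previously saved one. */
int restore_mask(const sigset_t *old)
{
	return sigprocmask(SIG_SETMASK, old, NULL);	/* was: sigsetmask() */
}
```

This is the save-then-restore pattern that setjmp() and longjmp() need for the signal mask.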
* Using a 3-word buffer in the openbsd.randomdata segment, XOR swizzle | deraadt | 2016-05-12 | 3 | -21/+80
  the PC/FP/SP registers in the jmpbuf. An old idea (around 1999?) but the random segment sure makes it easy. Lots of help from kettenis
  ok kettenis
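The XOR swizzle idea can be illustrated in C. Each saved register is XORed with a per-process random word before being stored, and XORed with the same word again when loaded. The types and the function `swizzle` below are hypothetical; the actual change is in assembly and takes its three words from the openbsd.randomdata segment.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the three random words and saved registers. */
struct jmp_cookies { uintptr_t pc, fp, sp; };
struct saved_regs  { uintptr_t pc, fp, sp; };

/*
 * XOR each register with its cookie. XOR is its own inverse, so the
 * same function both hides the values and recovers them.
 */
struct saved_regs swizzle(struct saved_regs r, const struct jmp_cookies *c)
{
	r.pc ^= c->pc;
	r.fp ^= c->fp;
	r.sp ^= c->sp;
	return r;
}
```

An attacker who can overwrite the buffer but does not know the cookies cannot choose the values that end up in the registers.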
* Remove sigreturn declaration and the now-unused libc syscall stub | guenther | 2016-05-09 | 1 | -57/+0
* Use a Thread Information Block in both single and multi-threaded programs. | guenther | 2016-05-07 | 10 | -224/+52
  This stores errno, the cancelation flags, and related bits for each thread and is allocated by ld.so or libc.a. This is an ABI break from 5.9-stable!

  Make libpthread dlopen'able by moving the cancelation wrappers into libc and doing locking and fork/errno handling via callbacks that libpthread registers when it first initializes. 'errno' *must* be declared via <errno.h> now!

  Clean up libpthread's symbol exports like libc.

  On powerpc, offset the TIB/TCB/TLS data from the register per the ELF spec.

  Testing by various, particularly sthen@ and patrick@
  ok kettenis@
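For application code, the practical consequence of the errno note is to include `<errno.h>` instead of writing a manual `extern int errno;` declaration, since errno may be a macro that resolves to per-thread storage. A minimal sketch, with a made-up helper name and a path that is assumed not to exist:

```c
#include <errno.h>
#include <fcntl.h>

/* Returns the errno value left by a failed open(), or 0 if it succeeded. */
int errno_from_failed_open(void)
{
	errno = 0;
	/* Hypothetical path, assumed to be absent on the test system. */
	if (open("/nonexistent/sketch-path", O_RDONLY) == -1)
		return errno;
	return 0;
}
```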
* The asm in the MD_DISABLE_KBIND macro was too fragile and broke alpha and hppa. | guenther | 2016-03-21 | 1 | -26/+0
  So instead, do the kbind disabling with syscall().
  debugging and ok deraadt@, ok kettenis@
* Rearrange C runtime bits: now that ld.so exports environ and __progname, | guenther | 2016-03-20 | 1 | -0/+26
  move their definitions and initialization in static links to libc.a

  Make crt0 always invoke a new func _csu_finish() in libc to process the auxv and to either register the ld.so cleanup function (in dynamic links) or initialize environ and __progname and do MC_DISABLE_KBIND (in static links).

  In libc, get pagesize from auxv; cache that between getpagesize() and sysconf(_SC_PAGESIZE)
  ok mpi@ "good time" deraadt@
* "the the" -> "the" in comment | mmcc | 2015-12-11 | 1 | -2/+2
* Split the non-syscall ASM bits from SYS.h into DEFS.h and use that in the | guenther | 2015-11-14 | 13 | -41/+80
  non-syscall .S source
  ok millert@ miod@
* Wrap the remaining math functions in libc: __fpclassify*(), __flt_rounds(), | guenther | 2015-10-27 | 3 | -5/+7
  and ldexp().
  ok millert@
* Merge the sigaction() and sigprocmask() overloads/wrappers from libpthread | guenther | 2015-10-23 | 2 | -4/+6
  into libc, and move pthread_sigmask() as well (just a trivial wrapper). This provides consistent handling of SIGTHR between single- and multi-threaded programs and is a step in the merge of all the libpthread overloads, providing some ASM and Makefile bits that the other wrappers will need.
  ok deraadt@ millert@
* Rename SYSEXIT() to SYSCALL_END() for consistency with most other archs. | guenther | 2015-10-17 | 5 | -25/+14
  No change in resulting object files
  ok millert@
* Wrap <stdlib.h> so that calls go direct and the symbols not in the | guenther | 2015-09-13 | 2 | -3/+2
  C standard are all weak. Apply __{BEGIN,END}_HIDDEN_DECLS to gdtoa{,imp}.h, hiding the arch-specific __strtorx, __ULtox_D2A, __strtorQ, __ULtoQ_D2A symbols.
* Do provide hidden _libc_* aliases for sig{block,setmask} and use them in | guenther | 2015-09-13 | 2 | -22/+6
  the ASM *setjmp implementations. Skip the PLT when calling them on amd64 (other archs to do this after testing)
  ok miod@
* Adds hidden _libc_FOO aliases for the system call stubs. | guenther | 2015-09-05 | 7 | -34/+38
  Stop generating _brk and _sbrk symbols: they've already been hidden. Set the ELF symbol size on the syscall stubs. Give the __{min,cur}brk symbols a size and type, and hide more jump labels.
  ok deraadt@
* Add framework for resolving (pun intended) libc namespace issues, using | guenther | 2015-08-31 | 11 | -31/+53
  wrapper .h files and asm labels to let internal calls resolve directly and not be overridable or use the PLT. Then, apply that framework to most of the functions in stdio.h, string.h, err.h, and wchar.h.

  Delete the should-have-been-hidden-all-along _v?(err|warn)[cx]? symbols while here.

  tests clean on i386, amd64, sparc64, powerpc, and mips64
  naming feedback from kettenis@ and millert@
  ok kettenis@
* Hide many (194!) symbols that nothing should be using. | guenther | 2015-08-26 | 5 | -81/+12
  Delete exect(2); it wasn't portable across archs and nothing used it.
  ports test build by naddy@
  ok deraadt@ kettenis@
* Explicitly list the symbols permitted to be exported by libc. | guenther | 2015-08-22 | 1 | -0/+18
  This is primed with the current list of exported symbols so it doesn't change the ABI yet, but will prevent unintentional additions in the future and sets the stage for reductions.
  ok deraadt@ kettenis@
* Set FUNC symbol sizes of auto-generated and hand-written syscall wrappers. | uebayasi | 2015-06-17 | 10 | -14/+40
  Original diff from guenther@, adjusted by me.
  OK guenther@
* Reuse SYSENTRY_HIDDEN() in SYSENTRY(); no functional changes. | uebayasi | 2015-06-12 | 1 | -2/+2
* Put END() matching _ENTRY() (== ENTRY() w/o prof). | uebayasi | 2015-06-01 | 1 | -1/+2
* Put END() matching ENTRY(). | uebayasi | 2015-06-01 | 1 | -1/+2
* Put END() where appropriate. | uebayasi | 2015-05-29 | 4 | -10/+12
  While here, kill redundant use of _C_LABEL() in ENTRY().
* Put obvious END() macros that match ENTRY() entries. | uebayasi | 2015-05-29 | 12 | -12/+51
* Sprinkle END() in some straightforward *.S files that have ENTRY(). The | uebayasi | 2015-05-29 | 10 | -5/+17
  resulting *.o have "FUNC" symbols with size set.
* Make index/rindex weak aliases of strchr/strrchr since they are not | millert | 2015-05-15 | 2 | -4/+4
  part of the ISO C standard and have also been dropped from POSIX.
  OK guenther@ kettenis@
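The weak-alias technique can be shown with the GCC/Clang `alias` and `weak` attributes on an ELF target. The names `sketch_strchr` and `sketch_index` are placeholders so the example does not collide with the real libc symbols; the commit does the equivalent in assembly.

```c
#include <stddef.h>

/* Strong definition: find the first occurrence of c in s. */
char *sketch_strchr(const char *s, int c)
{
	for (;; s++) {
		if (*s == (char)c)
			return (char *)s;
		if (*s == '\0')
			return NULL;
	}
}

/*
 * Weak alias: a second name for the same code. Because it is weak, a
 * program that defines its own sketch_index() overrides it without a
 * link error.
 */
char *sketch_index(const char *, int)
    __attribute__((weak, alias("sketch_strchr")));
```

Making the non-standard name weak keeps it out of the way of programs that use that identifier for something else.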
* Eliminate the last uses of *fork's second syscall return register; the pid | guenther | 2015-04-21 | 1 | -3/+3
  is zero in the child
  ok deraadt@ miod@
* Make pthread_atfork() track the DSO that called it like atexit() does, | guenther | 2015-04-07 | 2 | -7/+26
  unregistering callbacks if the DSO is unloaded. Move the callback handling from libpthread to libc, though libpthread still overrides the inner call to handle locking and thread-library reinitialization. Major version bump for both libc and libpthread.
  verification that this fixes various ports ajacoutot@
  asm assistance miod@; ok millert@ deraadt@
* Simplify fork/vfork logic: the kernel has handled returning zero in the child | guenther | 2015-03-31 | 2 | -15/+3
  for a long time, so there's no need to test the second return register here in the asm stub.
  ok and testing of many archs by krw@ miod@
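From the caller's side, the contract these stubs rely on is the familiar one: fork() returns zero in the child and the child's pid in the parent. A minimal sketch, with an invented helper name:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately with `code`. The parent waits
 * and returns the child's exit status, or -1 on any failure.
 */
int fork_and_collect(int code)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: fork() returned zero */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```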
* remove code for ancient gcc. | daniel | 2015-01-04 | 1 | -7/+1
  ok millert@, kettenis@
* Import new amd64 assembly versions of strchr/index, strrchr/rindex, | reyk | 2014-12-09 | 5 | -95/+442
  and strlen that provide a significantly faster performance than our previous .c or .S implementations. Based on NetBSD's code. Tested with different amd64 CPUs.
  ok deraadt@ mikeb@