path: root/sys/kern/exec_elf.c
Commit message, then (author, date, files changed, lines removed/added)
* spelling
  ok gnezdo@ semarie@ mpi@
  (jsg, 2021-03-10, 1 file, -2/+2)
* Remove the workaround which identified Go executables and permitted them to do syscalls directly. Go executables now use shared libc like all other dynamic binaries. This makes the "where are syscalls done from" checker strict for all binaries, and also opens the door to changing the underlying syscall ABI to the kernel very easily in the future (if we find cause).
  ok jsing
  (deraadt, 2021-03-08, 1 file, -7/+2)
* Revert the conversion of the per-process thread list into a SMR_TAILQ.
  We did not reach a consensus about using SMR to unlock single_thread_set(), so there's no point in keeping this change.
  (mpi, 2021-02-08, 1 file, -3/+2)
* Cache parent's pid as `ps_ppid' and use it instead of `ps_pptr->ps_pid'.
  This allows us to unlock getppid(2).
  ok mpi@
  (mvs, 2021-01-17, 1 file, -2/+2)
* Convert the per-process thread list into a SMR_TAILQ.
  Currently all iterations are done under KERNEL_LOCK() and therefore use the *_LOCKED() variant.
  From and ok claudio@
  (mpi, 2020-12-07, 1 file, -2/+3)
* Add support for timecounting in userland.
  This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time.
  If a timecounter clock can be exposed to userland then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture.
  The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel.
  Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file.
  This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now).
  Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others!
  OK from at least kettenis@, cheloha@, naddy@, sthen@
  (pirofti, 2020-07-06, 1 file, -2/+6)
* The ELF NOTE parser would only inspect the first NOTE for 'OpenBSD'. Furthermore the parser was unaware a NOTE could contain multiple records. The scanner has been rewritten.
  Another bonus bug: if the binary was labelled as OPENBSD ABI, NOTE parsing was completely skipped so WXNEEDED wasn't learned either...
  Now that NOTEs are scanned correctly, search for the 'Go' NOTE. (During this work found the Go linker produces slightly broken NOTEs - Go team will probably fix that).
  Work is happening for our Go dynamic-binaries to use libc syscall stubs, but the change isn't ready. Go (and reportedly free-pascal also?) binaries are the only dynamic programs which require syscalls in the main-program. Since Go binaries are now identifiable, we can disable syscalls in all other regular dynamic-main-programs, gaining the strict enforcement we want. When the Go-libc-stub change arrives we'll delete the Go NOTE scan and treat Go binaries the same as regular binaries.
  This change probably breaks free-pascal, a lower priority item to repair.
  some discussion with jsing, ok kettenis
  (deraadt, 2020-01-25, 1 file, -55/+77)
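  The scan described above (every record of every NOTE is inspected, not only the first) can be sketched roughly as follows. This is an illustration only, not the code from exec_elf.c: struct note_hdr, note_round() and note_has_owner() are hypothetical names, and the 4-byte padding between fields is the usual ELF note layout assumed here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header of one note record; layout assumed from the usual ELF note format. */
struct note_hdr {
	uint32_t namesz;	/* length of the owner name, including its NUL */
	uint32_t descsz;	/* length of the descriptor that follows the name */
	uint32_t type;		/* owner-defined record type */
};

/* Round a field length up to the 4-byte padding used inside notes. */
static size_t
note_round(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: return 1 if any record in buf[0..len) carries the
 * given owner name, 0 otherwise. The loop keeps going after the first
 * record, which is the point of the rewritten scanner.
 */
static int
note_has_owner(const unsigned char *buf, size_t len, const char *owner)
{
	size_t off = 0, want = strlen(owner) + 1;

	while (len - off >= sizeof(struct note_hdr)) {
		struct note_hdr h;
		size_t name_len, desc_len;

		memcpy(&h, buf + off, sizeof(h));
		off += sizeof(h);
		/* Give up on a record that claims more bytes than remain. */
		if (h.namesz > len - off || h.descsz > len - off)
			return 0;
		name_len = note_round(h.namesz);
		desc_len = note_round(h.descsz);
		if (name_len > len - off || desc_len > len - off - name_len)
			return 0;
		if (h.namesz == want && memcmp(buf + off, owner, want) == 0)
			return 1;
		off += name_len + desc_len;	/* advance to the next record */
	}
	return 0;
}
```

  A caller would hand note_has_owner() the bytes of one PT_NOTE segment and an owner string such as "OpenBSD"; a malformed record simply ends the scan in this sketch.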
* typo
  (deraadt, 2019-12-09, 1 file, -2/+2)
* Repurpose the "syscalls must be on a writeable page" mechanism to enforce a new policy: system calls must be in pre-registered regions. We have discussed more strict checks than this, but none satisfy the cost/benefit based upon our understanding of attack methods, anyways let's see what the next iteration looks like.
  This is intended to harden (translation: attackers must put extra effort into attacking) against a mixture of W^X failures and JIT bugs which allow syscall misinterpretation, especially in environments with polymorphic-instruction/variable-sized instructions. It fits in a bit with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash behaviour, particularly for remote problems. Less effective once on-host since the libraries can then be read.
  For static-executables the kernel registers the main program's PIE-mapped exec section valid, as well as the randomly-placed sigtramp page. For dynamic executables ELF ld.so's exec segment is also labelled valid; ld.so then has enough information to register libc's exec section as valid via call-once msyscall(2).
  For dynamic binaries, we continue to permit the main program exec segment because "go" (and potentially a few other applications) have embedded system calls in the main program. Hopefully at least go gets fixed soon.
  We declare the concept of embedded syscalls a bad idea for numerous reasons, as we notice the ecosystem has many static-syscall-in-base-binary programs which are dynamically linked against libraries which in turn use libc, which contains another set of syscall stubs. We've been concerned about adding even one additional syscall entry point... but go's approach tends to double the entry-point attack surface.
  This was started at a nano-hackathon in Bob Beck's basement 2 weeks ago during a long discussion with mortimer trying to hide from the SSL scream-conversations, and finished in more comfortable circumstances next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.
  ok guenther kettenis mortimer, lots of feedback from others; conversations about go with jsing tb sthen
  (deraadt, 2019-11-29, 1 file, -2/+15)
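  As a simplified model of the policy in this commit, "system calls must be in pre-registered regions" amounts to a range check on the address the syscall was issued from. The sketch below is an assumption-laden illustration: struct sys_region and pc_in_registered_region() are made-up names, and the real kernel keeps this state on its own memory map structures rather than in a flat array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one half-open range [start, end) registered as valid. */
struct sys_region {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return 1 if the program counter of the syscall instruction lies inside
 * any registered region (for example ld.so's or libc's exec segment, or
 * the sigtramp page), 0 otherwise.
 */
static int
pc_in_registered_region(const struct sys_region *r, size_t n, uintptr_t pc)
{
	for (size_t i = 0; i < n; i++) {
		if (pc >= r[i].start && pc < r[i].end)
			return 1;
	}
	return 0;
}
```

  In this model a syscall issued from JIT-generated or otherwise unregistered memory fails the check, which is the extra effort the message says an attacker must now spend.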
* When killing a process, the signal is handled by any thread that does not block the signal. If all threads block the signal, we delivered it to the main thread. This does not conform to POSIX. If any thread unblocks the signal, it should be delivered immediately to this thread. Mark such signals pending at the process instead of a single thread. Then any thread can handle it later.
  OK kettenis@ guenther@
  (bluhm, 2019-05-13, 1 file, -2/+2)
* wxneeded binaries on wxallowed filesystems were refused execution. We have encountered a wxneeded binary that attempts correct operation when started on a nowxallowed filesystem (it tries mprotect with RWX, notices ENOTSUP and acts in a different way). So permit execution (but of course don't allow W^X violating mappings).
  ok sthen kettenis robert
  (deraadt, 2019-05-11, 1 file, -14/+1)
* If mallocing the program header array fails, give up on coredumping instead of panicking.
  ok deraadt@, tedu@, mpi@
  (guenther, 2019-05-09, 1 file, -2/+4)
* #define ELFROUNDSIZE 4 /* XXX Should it be sizeof(Elf_Word)? */
  Now that alpha is fixed, we can use sizeof().
  (deraadt, 2019-04-20, 1 file, -2/+2)
* Core files with >65535 sections have to use PN_XNUM and a section header to pass the real count, with a minimal .shstrtab segment for consistency.
  Also, add support for PN_XNUM to readelf.
  problem reported and testing by claudio@
  ok kettenis@
  (guenther, 2018-12-06, 1 file, -8/+65)
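  The PN_XNUM convention mentioned here comes from the ELF specification: when the program header count does not fit in the 16-bit e_phnum field, e_phnum holds PN_XNUM (0xffff) and the real count is stored in the sh_info field of section header 0. A minimal sketch of both directions, with hypothetical helper names and plain integers standing in for the ELF structures:

```c
#include <stdint.h>

#define PN_XNUM_SKETCH	0xffff	/* same value as PN_XNUM in the ELF spec */

/* Reader side: recover the real program header count. */
static uint32_t
real_phnum(uint16_t e_phnum, uint32_t shdr0_sh_info)
{
	if (e_phnum == PN_XNUM_SKETCH)
		return shdr0_sh_info;	/* count lives in section header 0 */
	return e_phnum;
}

/* Writer side: the value to store in the 16-bit e_phnum field. */
static uint16_t
header_phnum(uint32_t count)
{
	if (count >= PN_XNUM_SKETCH)
		return PN_XNUM_SKETCH;	/* escape value; count goes to sh_info */
	return (uint16_t)count;
}
```

  A core dump writer following this scheme would therefore emit one otherwise empty section header whose sh_info carries the count, which is consistent with the section header and minimal .shstrtab the message describes.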
* Decouple unveil from the pledge flags, by adding dedicated unveil flags to the namei args. This fixes a bug where chmod would be allowed with only READ. This also allows some further cleanup of some awkward things like PLEDGE_STAT that will follow.
  Lots of assistance from semarie@ - thanks!
  ok semarie@
  (beck, 2018-08-05, 1 file, -1/+2)
* Remove a few leftovers from the days of emulation, which could result in a bad/corrupt binary not returning ENOEXEC but some other error.
  ok guenther kettenis bluhm
  (deraadt, 2018-07-20, 1 file, -3/+2)
* Fail if a PT_LOAD segment has a memory size of 0. This prevents a panic later on, and it makes no sense for a binary to have such a segment.
  ok bluhm@, guenther@
  (kettenis, 2018-07-20, 1 file, -3/+7)
* Move from sendsig() to its callers the initsiginfo() calls and instead of passing sendsig() the code+type+val, pass a siginfo_t* to copy from. Eliminate the indirection through struct emul for sendsig(); we no longer have a SunOS4-compat version of sendsig().
  ok deraadt@
  (guenther, 2018-07-10, 1 file, -2/+1)
* Don't pull in <sys/file.h> just to get fcntl.h
  ok deraadt@ krw@
  (guenther, 2017-12-30, 1 file, -2/+2)
* In elf_load_file() do not call free(9) with an uninitialized size even if the pointer is NULL. This is not a real bug as free(9) checks the addr pointer before the size value, but the compiler cannot know that.
  found by clang -Wuninitialized; OK deraadt@
  (bluhm, 2017-09-07, 1 file, -2/+2)
* Initialize the stack buffer used to build the auxiliary vector to zero to avoid leaking the contents of the kernel stack into userspace.
  ok guenther@, deraadt@
  (kettenis, 2017-03-20, 1 file, -1/+2)
* Generating a coredump requires walking the map twice; change uvm_coredump_walkmap() to do both with a callback in between so it can hold locks/change state across the two.
  ok stefan@
  (guenther, 2017-03-05, 1 file, -125/+118)
* Correct the entry point and base address calculations for an interpreter whose entry point isn't in its first PT_LOAD segment.
  problem report and testing by patrick@
  (guenther, 2017-02-11, 1 file, -2/+4)
* Remove support for forcing the ELF interpreter to a specific address, last used by COMPAT_SYSV which was removed in 2011.
  ok millert@
  (guenther, 2017-02-08, 1 file, -21/+12)
* In exec_elf.c: expand ELFNAME(), ELFNAME2(), and ELFNAMEEND() except leaving out the size, so that ELFNAME2(exec,makecmds) becomes exec_elf_makecmds instead of exec_elf{32,64}_makecmds and then delete the ELFNAME2() and ELFNAMEEND() macros.
  Move the prototypes for functions local to exec_elf.c to there from exec_elf.h.
  Simplify the SMALL_KERNEL conditionals around the ELF coredump code.
  Change exec_conf.c to use the size-generic names and macros.
  Remove exec_elf{32,64}.c and just build exec_elf.c; delete the _KERN_DO_ELF and _KERN_DO_ELF64 #defines.
  ok jca@, encouragement from deraadt@ and tom@
  (guenther, 2017-02-08, 1 file, -78/+77)
* Move ELF_AUX_ENTRIES from exec_elf.h to exec_elf.c; it's totally internal and not something we guarantee to userspace.
  ok jca@
  (guenther, 2017-02-08, 1 file, -1/+6)
* Change ELFNAME(read_from)'s buf parameter to be void*, eliminating a cast from all but one call.
  ok jca@
  (guenther, 2017-02-08, 1 file, -9/+8)
* elf{32,64}_check_brand() isn't used; delete it
  ok jca@
  (guenther, 2017-02-08, 1 file, -13/+1)
* Provide size-generic ELF_NO_ADDR in <sys/exec_elf.h> and use that instead of ELFDEFNNAME(NO_ADDR)
  ok jca@
  (guenther, 2017-02-08, 1 file, -11/+11)
* Since we expect to never do binary compat with other OSes again, delete the no-longer-used probe hook support.
  ok mpi@ jca@
  (guenther, 2017-02-05, 1 file, -23/+5)
* p_comm is the process's command and isn't per thread, so move it from struct proc to struct process.
  ok deraadt@ kettenis@
  (guenther, 2017-01-21, 1 file, -2/+2)
* Split PID from TID, giving processes a PID unrelated to the TID of their initial thread.
  ok jsing@ kettenis@
  (guenther, 2016-11-07, 1 file, -2/+2)
* Display/test/use the process PID, not the thread's TID, in a few places.
  ok mpi@ mikeb@
  (guenther, 2016-10-05, 1 file, -2/+2)
* When trying to run an ELF binary marked PT_OPENBSD_WXNEEDED from a file system mounted without MNT_WXALLOWED, fail with EACCES rather than with ENOEXEC, to discourage the shell from trying to run the file as a shell script.
  OK deraadt@ millert@; tedu@ and halex@ agreed with the general direction.
  (schwarze, 2016-09-12, 1 file, -5/+6)
* Since epp->ep_name is a userland pointer, use copyinstr(9) to get a copy of the string into kernel space before logging the W^X binary warning.
  ok jca@, guenther@
  (kettenis, 2016-06-11, 1 file, -2/+5)
* Enforce W^X and map W|X segments without X permission initially. The dynamic linker will make these read-only and add back X permission after relocation processing. Static executables with W|X segments will probably crash.
  ok deraadt@, guenther@
  (kettenis, 2016-06-08, 1 file, -2/+9)
* Identify W^X labelled binaries at execve() time based upon WX_OPENBSD_WXNEEDED flag set by ld -zwxneeded. Such binaries are allowed to run only on wxallowed mountpoints. They do not report mmap/mprotect problems.
  Rate limit mmap/mprotect reports from other binaries.
  These semantics are chosen to encourage progress in the ports ecosystem, without overwhelming the developers who work in the area.
  ok sthen kettenis
  (deraadt, 2016-05-30, 1 file, -1/+19)
* backout to insert correct commit message
  (deraadt, 2016-05-30, 1 file, -19/+1)
* *** empty log message ***
  (deraadt, 2016-05-30, 1 file, -1/+19)
* SROP mitigation. sendsig() stores a (per-process ^ &sigcontext) cookie inside the sigcontext. sigreturn(2) checks syscall entry was from the exact PC addr in the (per-process ASLR) sigtramp, verifies the cookie, and clears it to prevent sigcontext reuse.
  not yet tested on landisk, sparc, *88k, socppc.
  ok kettenis
  (deraadt, 2016-05-10, 1 file, -2/+3)
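  The cookie scheme in this message (a per-process value XORed with the address of the sigcontext, verified and then cleared on return) can be sketched as below. The struct and function names are hypothetical and the saved register state is omitted; this illustrates the idea only and is not the kernel's sendsig()/sigreturn(2) code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the signal context; registers omitted. */
struct sigcontext_sketch {
	uintptr_t sc_cookie;
};

/* Delivery side: bind the cookie to this exact context address. */
static void
cookie_store(struct sigcontext_sketch *sc, uintptr_t secret)
{
	sc->sc_cookie = secret ^ (uintptr_t)sc;
}

/*
 * Return side: accept the context only if the cookie matches both the
 * per-process secret and the address it is being returned from, then
 * clear it so the same context cannot be replayed.
 * Returns 1 on success, 0 on a forged, moved, or reused context.
 */
static int
cookie_check_and_clear(struct sigcontext_sketch *sc, uintptr_t secret)
{
	if (sc->sc_cookie != (secret ^ (uintptr_t)sc))
		return 0;
	sc->sc_cookie = 0;
	return 1;
}
```

  Because the address is mixed in, a context copied to another location fails the check even when the attacker has seen a valid cookie, and clearing the field makes a second return from the same context fail as well.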
* Support for running Linux binaries under emulation is going away. Remove "option COMPAT_LINUX" and everything directly tied to it from the kernel and the corresponding man page documentation.
  ok visa@ guenther@
  (naddy, 2016-02-28, 1 file, -8/+1)
* move the pledgenote annotation from `struct proc' to `struct nameidata'
  pledgenote is used to annotate the policy for a namei context. So make it track the nameidata.
  It is expected for the caller to explicitly define the policy. It is a kernel bug to not do so.
  ok deraadt@
  (semarie, 2015-11-02, 1 file, -2/+2)
* Paranoia: p_pledgenote the NAMEI for ld.so loading
  (deraadt, 2015-10-28, 1 file, -1/+3)
* Track size of an opaque allocation to pass to free() later
  ok guenther tedu
  (deraadt, 2015-09-28, 1 file, -3/+4)
* Now that we use p_filesz - 1 to test for NUL, check that p_filesz is at least two, and while here allow the upper bound to be MAXPATHLEN by changing a >= to > as suggested by krw@ in a thread on tech where Maxime Villard proposed additional PT_INTERP checks.
  tested by and ok guenther@
  (jsg, 2015-04-30, 1 file, -2/+2)
* Error out if the PT_INTERP segment isn't NUL terminated
  ok deraadt@ millert@ miod@
  (guenther, 2015-04-30, 1 file, -1/+3)
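  Taken together, the two 2015-04-30 commits describe three conditions on the PT_INTERP contents: the size is at least two, the size is at most MAXPATHLEN, and the last byte is NUL. A minimal sketch of that predicate follows; interp_ok() and MAXPATHLEN_SKETCH are hypothetical names, and 1024 is only a stand-in for the kernel's MAXPATHLEN.

```c
#include <stddef.h>

#define MAXPATHLEN_SKETCH 1024	/* assumed stand-in for MAXPATHLEN */

/*
 * Return 1 if the interpreter path bytes look acceptable, 0 otherwise.
 * interp points at p_filesz bytes read from the PT_INTERP segment.
 */
static int
interp_ok(const char *interp, size_t p_filesz)
{
	/* Need at least one path character plus the terminating NUL. */
	if (p_filesz < 2 || p_filesz > MAXPATHLEN_SKETCH)
		return 0;
	/* The final byte must be the NUL terminator. */
	return interp[p_filesz - 1] == '\0';
}
```

  The lower bound matters because the NUL test indexes p_filesz - 1: a size of zero would index before the buffer, and a size of one would leave only an empty path.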
* Require a PT_LOAD segment's p_filesz to be no larger than its p_memsz.
  test cases provided by Alejandro Hernández (nitrousenador (at) gmail.com)
  ok deraadt@ jsg@
  (guenther, 2015-04-26, 1 file, -1/+7)
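  This size rule, together with the 2018-07-20 commit that rejects a PT_LOAD segment with a memory size of 0, gives a small sanity predicate on each loadable segment. The sketch below assumes 64-bit size fields and uses a hypothetical function name; it is not the check as written in exec_elf.c.

```c
#include <stdint.h>

/*
 * Return 1 if a PT_LOAD segment's sizes are plausible, 0 otherwise.
 * p_filesz is the number of bytes backed by the file, p_memsz the
 * number of bytes the segment occupies in memory.
 */
static int
load_segment_ok(uint64_t p_filesz, uint64_t p_memsz)
{
	if (p_memsz == 0)
		return 0;		/* an empty mapping makes no sense */
	return p_filesz <= p_memsz;	/* file part must fit in memory part */
}
```

  A segment with p_filesz smaller than p_memsz is normal (the remainder is zero-filled, as for .bss), so only the opposite relation is rejected.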
* Extend uvm_map_hint() to get an address range as extra arguments, and make sure it will return an address within that range.
  Use this in uaddr_rnd_select() to make sure we will not attempt to pick an address beyond what we are allowed to map.
  In my trees for 9 months, blackmailed s2k15 attendees into agreeing now would be a good time to commit.
  (miod, 2015-03-30, 1 file, -2/+3)
* Don't use an uninitialized variable when a PT_LOAD segment with alignment 0 or 1 is encountered. The result before was just a spurious failure by execve(), though I had to manually mangle a binary to hit this case: segments are all long-aligned or better in practice.
  uninitialized variable noted by Maxime Villard (rustyBSD (at) gmx.fr)
  ok and prod jsg@
  (guenther, 2015-02-10, 1 file, -11/+10)
* Raise ELF_RANDOMIZE_LIMIT to 64K, so that programs and libraries can legitimately use random section variables without execve failures... Because this section is not demand faulted, yield() every page during the fill otherwise the costs are charged poorly.
  ok tedu matthew
  (deraadt, 2015-02-06, 1 file, -4/+1)