path: root/sys/kern/kern_exec.c
Commit message | Author | Age | Files | Lines
* handle theoretical case of sigfillsz not being pow2-sized on some | deraadt | 2021-03-16 | 1 | -4/+8
  architecture. from miod
* Kill SINGLE_PTRACE and use SINGLE_SUSPEND which has almost the same semantic | mpi | 2021-03-12 | 1 | -2/+2
  single_thread_set() is modified to explicitly indicate when waiting until sibling threads are parked is required. This is obviously not required if a traced thread is switching away from a CPU after handling a STOP signal. ok claudio@
* Revert commitid: AZrsCSWEYDm7XWuv; | claudio | 2021-03-08 | 1 | -2/+2
  Kill SINGLE_PTRACE and use SINGLE_SUSPEND which has almost the same semantic. This diff did not properly kill SINGLE_PTRACE and broke RAMDISK kernels.
* Kill SINGLE_PTRACE and use SINGLE_SUSPEND which has almost the same semantic. | mpi | 2021-03-08 | 1 | -2/+2
  single_thread_set() is modified to explicitly indicate when waiting until sibling threads are parked is required. This is obviously not required if a traced thread is switching away from a CPU after handling a STOP signal. ok claudio@
* _exit(2), execve(2): tweak per-process interval timer cancellation | cheloha | 2020-10-15 | 1 | -4/+2
  If we fold the for-loop iterating over each interval timer into the helper function, the result is slightly tidier than what we have now. Rename the helper function "cancel_all_itimers". Based on input from millert@ and kettenis@.
* _exit(2), execve(2): cancel per-process interval timers safely | cheloha | 2020-10-15 | 1 | -9/+4
  During _exit(2) and sometimes during execve(2) we need to cancel any active per-process interval timers. We don't currently do this in an MP-safe way. Both syscalls ignore the locking assumptions documented in proc.h.

  The easiest way to make them MP-safe is to use setitimer(), just like the getitimer(2) and setitimer(2) syscalls do. To make things a bit cleaner I have added a helper function, cancelitimer(), so the callers don't need to fuss with an itimerval struct.

  While we're here we can remove the splclock/splx dance from execve(2). It is no longer necessary. ok deraadt@
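
The cancellation idea above can be sketched from userland, where an all-zero itimerval passed to setitimer(2) disarms a timer. This is only an analogue under stated assumptions: the function name cancel_all_itimers_sketch is hypothetical, and the kernel helper operates on internal state rather than through the syscall interface.

```c
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/*
 * Userland analogue (hypothetical name, not the kernel helper):
 * disarm every per-process interval timer by installing an
 * all-zero itimerval. Returns 0 on success, -1 if any
 * setitimer(2) call fails.
 */
static int
cancel_all_itimers_sketch(void)
{
	static const int which[] = { ITIMER_REAL, ITIMER_VIRTUAL, ITIMER_PROF };
	struct itimerval zero;
	size_t i;

	memset(&zero, 0, sizeof(zero));	/* it_value == 0 disarms the timer */
	for (i = 0; i < sizeof(which) / sizeof(which[0]); i++) {
		if (setitimer(which[i], &zero, NULL) == -1)
			return -1;
	}
	return 0;
}
```

Folding the loop into the helper, as the later "tweak" commit describes, means callers need neither the loop nor an itimerval of their own.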
* timekeep_sz now already includes the round_page() adjustment; ok kettenis@ | naddy | 2020-07-11 | 1 | -2/+2
* small typo | deraadt | 2020-07-07 | 1 | -2/+2
* Wire down the timekeep page. If we don't do this, the pagedaemon may | kettenis | 2020-07-06 | 1 | -3/+11
  page it out and bad things will happen when we try to page it back in from within the clock interrupt handler. While there, make sure we set timekeep_object back to NULL if we fail to map the timekeep page into kernel space. ok deraadt@ (who had a very similar diff)
* Add support for timecounting in userland. | pirofti | 2020-07-06 | 1 | -1/+53
  This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time.

  If a timecounter clock can be exposed to userland then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture.

  The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel.

  Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file.

  This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now).

  Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
* Consistently perform atomic writes to the ps_flags field of struct | anton | 2020-02-15 | 1 | -3/+3
  process. ok bluhm@ claudio@ visa@
* Replace p_xstat with ps_xexit and ps_xsig | guenther | 2019-12-11 | 1 | -2/+2
  Convert those to a consolidated status when needed in wait4(), kevent(), and sysctl(). Pass exit code and signal separately to exit1(). (This also serves as prep for adding waitid(2).) ok mpi@
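
A minimal sketch of what "consolidated status" could mean: folding a separately stored exit code and signal number back into one wait(2)-style status word. The helper name make_wait_status and the bit layout (exit code in bits 8-15, signal in the low 7 bits) are assumptions based on the traditional encoding, not taken from the commit.

```c
#include <sys/wait.h>

/*
 * Hypothetical helper: combine an exit code and a terminating
 * signal into a single status word. The layout is assumed to be
 * the traditional one: exit code in bits 8-15, signal number in
 * the low 7 bits.
 */
static int
make_wait_status(int xexit, int xsig)
{
	return ((xexit & 0xff) << 8) | (xsig & 0x7f);
}
```

With this layout, the standard WIFEXITED/WEXITSTATUS and WIFSIGNALED/WTERMSIG macros can decode the result, which is why storing the two values separately loses nothing.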
* comply with POSIX and make execve() return EACCES for directories | naddy | 2019-12-01 | 1 | -5/+1
  ok millert@ deraadt@
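
The behaviour described above can be observed from userland with a small probe. The helper name exec_errno is hypothetical; the sketch assumes the path names a directory, in which case execve(2) returns and sets errno.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical probe: attempt to execute `path` and return the
 * errno value left behind by the failed execve(2). If execve()
 * succeeds, this function never returns.
 */
static int
exec_errno(const char *path)
{
	char *const argv[] = { (char *)path, NULL };
	char *const envp[] = { NULL };

	execve(path, argv, envp);	/* only returns on failure */
	return errno;
}
```

On a system that follows this policy, exec_errno("/") is expected to yield EACCES.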
* Repurpose the "syscalls must be on a writeable page" mechanism to | deraadt | 2019-11-29 | 1 | -2/+2
  enforce a new policy: system calls must be in pre-registered regions. We have discussed more strict checks than this, but none satisfy the cost/benefit based upon our understanding of attack methods; anyway, let's see what the next iteration looks like.

  This is intended to harden (translation: attackers must put extra effort into attacking) against a mixture of W^X failures and JIT bugs which allow syscall misinterpretation, especially in environments with polymorphic-instruction/variable-sized instructions. It fits in a bit with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash behaviour, particularly for remote problems. Less effective once on-host since the libraries can be read.

  For static-executables the kernel registers the main program's PIE-mapped exec section as valid, as well as the randomly-placed sigtramp page. For dynamic executables ELF ld.so's exec segment is also labelled valid; ld.so then has enough information to register libc's exec section as valid via call-once msyscall(2).

  For dynamic binaries, we continue to permit the main program exec segment because "go" (and potentially a few other applications) have embedded system calls in the main program. Hopefully at least go gets fixed soon.

  We declare the concept of embedded syscalls a bad idea for numerous reasons, as we notice the ecosystem has many static-syscall-in-base-binary programs which are dynamically linked against libraries which in turn use libc, which contains another set of syscall stubs. We've been concerned about adding even one additional syscall entry point... but go's approach tends to double the entry-point attack surface.

  This was started at a nano-hackathon in Bob Beck's basement 2 weeks ago during a long discussion with mortimer trying to hide from the SSL scream-conversations, and finished in more comfortable circumstances next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

  ok guenther kettenis mortimer, lots of feedback from others; conversations about go with jsing tb sthen
* Kill uvm_deallocate(9) and use uvm_unmap() directly. | mpi | 2019-11-05 | 1 | -3/+2
  ok kettenis@, semarie@, deraadt@
* per-process itimers: itimerval -> itimerspec | cheloha | 2019-08-02 | 1 | -3/+3
  Loongson runs at 128hz. 128 doesn't divide evenly into a million, but it does divide evenly into a billion. So if we do the per-process itimer bookkeeping with itimerspec structs we can have error-free virtual itimers on loongson just as we do on most other platforms.

  This change doesn't fix the virtual itimer error on alpha, as 1024 does not divide evenly into a billion. But this doesn't make the situation any worse, either. ok deraadt@
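
The divisibility argument above can be checked directly. A minimal sketch, with a hypothetical helper name: a tick is representable without error when the clock frequency divides the number of units per second (10^6 for microseconds, 10^9 for nanoseconds).

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true if one tick of a clock running at
 * `hz` ticks per second is a whole number of time units, given
 * `units_per_sec` units in a second.
 */
static bool
tick_is_exact(long hz, long units_per_sec)
{
	return units_per_sec % hz == 0;
}
```

Since 10^6 = 2^6 * 5^6 and 10^9 = 2^9 * 5^9, a 128 Hz (2^7) clock is exact only in nanoseconds, while a 1024 Hz (2^10) clock is exact in neither unit.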
* Do not relock fdp in fdrelease(). This prevents unnecessary locking | visa | 2019-07-15 | 1 | -2/+2
  in the common case. OK mpi@
* Make resource limit access MP-safe. So far, the copy-on-write sharing | visa | 2019-06-21 | 1 | -2/+2
  of resource limit structs has been done between processes. By applying copy-on-write also between threads, threads can read rlimits in a nearly lock-free manner. Inspired by code in DragonFly BSD and FreeBSD. OK mpi@, agreement from jmatthew@ and anton@
* Revert to using the SCHED_LOCK() to protect time accounting. | mpi | 2019-06-01 | 1 | -3/+1
  It currently creates a lock ordering problem because SCHED_LOCK() is taken by hardclock(). That means the "priorities" of a thread should be moved out of the SCHED_LOCK() first in order to make progress. Reported-by: syzbot+8e4863b3dde88eb706dc@syzkaller.appspotmail.com via anton@ as well as by kettenis@
* Use a per-process mutex to protect time accounting instead of SCHED_LOCK(). | mpi | 2019-05-31 | 1 | -1/+3
  Note that hardclock(9) still increments p_{u,s,i}ticks without holding a lock. ok visa@, cheloha@
* Fix stack info leak in execve(2). There are 2x4 bytes of padding | bluhm | 2019-02-08 | 1 | -1/+3
  in struct ps_strings. from NetBSD; OK deraadt@ guenther@ visa@
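
A sketch of the general technique behind this kind of fix: zero the whole structure before filling its fields, so compiler-inserted padding cannot carry stale stack bytes when the struct is copied out. The struct args_info and the function fill_args_info are hypothetical stand-ins, not the kernel's struct ps_strings or its code; the padding comments assume an LP64 platform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a struct with interior padding on LP64. */
struct args_info {
	char	**argv;
	int	  argc;		/* typically followed by 4 bytes of padding */
	char	**envp;
	int	  envc;		/* typically followed by 4 bytes of padding */
};

/*
 * Clear the full object first, padding included, then assign the
 * members. Assigning members alone would leave the padding bytes
 * with whatever the stack held before.
 */
static void
fill_args_info(struct args_info *ai, char **argv, int argc,
    char **envp, int envc)
{
	memset(ai, 0, sizeof(*ai));
	ai->argv = argv;
	ai->argc = argc;
	ai->envp = envp;
	ai->envc = envc;
}
```

Two int members each followed by 4 bytes of padding would account for the "2x4 bytes" the message mentions, under the LP64 assumption.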
* If we execute a #!shell binary, the shell is an integral part of the | deraadt | 2018-10-30 | 1 | -1/+3
  binary so it should bypass unveil restrictions. This is similar (but different...) to how the ELF linker (ld.so) is loaded (after unveils get dropped). Discovered in doas, due to more accurate unveil semantics. ok guenther tedu beck
* Decouple unveil from the pledge flags, by adding dedicated unveil flags | beck | 2018-08-05 | 1 | -1/+2
  to the namei args. This fixes a bug where chmod would be allowed with only READ. This also allows some further cleanup of some awkward things like PLEDGE_STAT that will follow. Lots of assistance from semarie@ - thanks! ok semarie@
* Remove a few leftovers from the days of emulation, which could result in | deraadt | 2018-07-20 | 1 | -3/+1
  a bad/corrupt binary not returning ENOEXEC but some other error. ok guenther kettenis bluhm
* Unveiling unveil(2). | beck | 2018-07-13 | 1 | -1/+9
  This brings unveil into the tree, disabled by default. Currently this will return EPERM on all attempts to use it until we are fully certain it is ready for people to start using, but this now allows for others to do more tweaking and experimentation. Still needs to send the unveils across forks and execs before fully enabling. Many thanks to robert@ and deraadt@ for extensive testing. ok deraadt@
* Put file descriptors on shared data structures when they are completely | mpi | 2018-06-18 | 1 | -5/+4
  setup, take 3. LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]' or the global linked list. This allows us to simplify a lot of code grabbing new references to fds. All of this is now possible because dup2(2) refuses to clone LARVAL fds. Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2). With inputs from Mathieu Masson, visa@, guenther@ and art@ Previous version ok bluhm@, ok visa@, sthen@
* Revert introduction of fdinsert(), a sanity check triggers when | mpi | 2018-06-05 | 1 | -4/+5
  closing a LARVAL file. Found the hard way by sthen@.
* Put file descriptors on shared data structures when they are completely | mpi | 2018-06-02 | 1 | -5/+4
  setup. LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]'. This allows us to simplify a lot of code grabbing new references to fds. All of this is now possible because dup2(2) refuses to clone LARVAL fds. Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2). With inputs from Mathieu -, visa@, guenther@ and art@ ok visa@, bluhm@
* Clean up the parameters of VOP_LOCK() and VOP_UNLOCK(). It is always | visa | 2018-04-28 | 1 | -2/+2
  curproc that does the locking or unlocking, so the proc parameter is pointless and can be dropped. OK mpi@, deraadt@
* Move FREF() inside fd_getfile(). | mpi | 2018-04-27 | 1 | -1/+3
  ok visa@
* Stop assuming <sys/file.h> will pull in fcntl.h when _KERNEL is defined. | guenther | 2018-01-02 | 1 | -1/+2
  ok millert@ sthen@
* free(9) sizes for sys_execve. | florian | 2018-01-01 | 1 | -16/+17
  Convert the hand-rolled loop to strlcpy which gives us the size for free(9). OK visa
* Remove unused ps_stackgap from process struct | stefan | 2017-12-19 | 1 | -4/+1
  Nothing uses this field since Linux compat was removed. ok mpi@ deraadt@ guenther@
* pledge()'s 2nd argument becomes char *execpromises, which becomes the | deraadt | 2017-12-12 | 1 | -2/+15
  pledge for a new execve image immediately upon start. Also introduces "error" which makes violations return -1 ENOSYS instead of killing the program ("error" may not be handed to a setuid/setgid program, which may be missing/ignoring syscall return values and would continue with inconsistent state). Discussion with many. florian has used this to improve the strictness of a daemon
* Remove old deactivated pledge path code. A replacement mechanism is | deraadt | 2017-08-29 | 1 | -2/+1
  being brewed. ok beck
* Provide mips64 with kernel-facing TCB_{GET,SET} macros that store it | guenther | 2017-04-13 | 1 | -5/+2
  in struct mdproc. With that, all archs have those and the __HAVE_MD_TCB macro can be unifdef'ed as always defined. ok kettenis@ visa@ jsing@
* Add a flags argument to falloc() that lets it optionally set the | guenther | 2017-02-11 | 1 | -2/+2
  close-on-exec flag on the newly allocated fd. Make falloc()'s return arguments non-optional: assert that they're not NULL. ok mpi@ millert@
* Delete the obsolete fork/exec/exit emulation hooks. | guenther | 2017-02-08 | 1 | -22/+1
  ok mpi@ dlg@
* p_comm is the process's command and isn't per thread, so move it from | guenther | 2017-01-21 | 1 | -3/+3
  struct proc to struct process. ok deraadt@ kettenis@
* Delete dead copy of pr->ps_vmspace; uvmspace_exec() can change it anyway | guenther | 2016-10-22 | 1 | -3/+3
  ok kettenis@ jsing@
* Reset PS_WXNEEDED in execve(2). | jca | 2016-09-03 | 1 | -1/+3
  The new process should inherit wxneeded perms from the ELF executable only, not from the former process. Solution improved by guenther@, ok guenther@ deraadt@, ok tedu@ on a similar diff.
* Cleanup some systrace leftovers. | kettenis | 2016-06-11 | 1 | -20/+3
  ok jca@, guenther@
* Identify W^X labelled binaries at execve() time based upon WX_OPENBSD_WXNEEDED | deraadt | 2016-05-30 | 1 | -1/+4
  flag set by ld -zwxneeded. Such binaries are allowed to run only on wxallowed mountpoints. They do not report mmap/mprotect problems. Rate limit mmap/mprotect reports from other binaries. These semantics are chosen to encourage progress in the ports ecosystem, without overwhelming the developers who work in the area. ok sthen kettenis
* backout to insert correct commit message | deraadt | 2016-05-30 | 1 | -4/+1
* *** empty log message *** | deraadt | 2016-05-30 | 1 | -1/+4
* Place a cpu-dependent trap/illegal instruction over the remainder of the | deraadt | 2016-05-23 | 1 | -2/+8
  sigtramp page, so that it will generate a nice kernel fault if touched. While here, move most of the sigtramps to the .rodata segment, because they are not executed in the kernel. Also some preparation for sliding the actual sigtramp forward (will need some gdb changes) ok mlarkin kettenis
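
Filling the unused part of a page with a repeating trap pattern can be sketched as below. The function name fill_with_pattern is hypothetical, and the pattern bytes are whatever the architecture defines; the loop also copes with a pattern length that does not divide the region evenly, which is the case the 2021-03-16 commit describes as theoretical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: tile `pat` (of `patlen` bytes) across
 * `dst[0..len)`. The final copy is truncated when `patlen` does
 * not divide `len`, so no write goes past the end of the region.
 */
static void
fill_with_pattern(uint8_t *dst, size_t len, const uint8_t *pat,
    size_t patlen)
{
	size_t off, n;

	if (patlen == 0)
		return;
	for (off = 0; off < len; off += n) {
		n = (len - off < patlen) ? len - off : patlen;
		memcpy(dst + off, pat, n);
	}
}
```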
* SROP mitigation. sendsig() stores a (per-process ^ &sigcontext) cookie | deraadt | 2016-05-10 | 1 | -1/+6
  inside the sigcontext. sigreturn(2) checks syscall entry was from the exact PC addr in the (per-process ASLR) sigtramp, verifies the cookie, and clears it to prevent sigcontext reuse. not yet tested on landisk, sparc, *88k, socppc. ok kettenis
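
The cookie scheme in this message can be illustrated with a small userland model. All names here (sigctx_sketch, ctx_seal, ctx_verify_and_clear) are hypothetical, and the sketch covers only the cookie, not the check of the syscall's PC against the sigtramp.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a signal context carrying a cookie. */
struct sigctx_sketch {
	uintptr_t cookie;
	/* saved registers would follow in a real context */
};

/* Bind the context to its own address using a per-process secret. */
static void
ctx_seal(struct sigctx_sketch *ctx, uintptr_t secret)
{
	ctx->cookie = secret ^ (uintptr_t)ctx;
}

/*
 * Accept the context only if the cookie matches this address and
 * secret, then clear the cookie so the same context cannot be
 * replayed.
 */
static bool
ctx_verify_and_clear(struct sigctx_sketch *ctx, uintptr_t secret)
{
	if (ctx->cookie != (secret ^ (uintptr_t)ctx))
		return false;
	ctx->cookie = 0;
	return true;
}
```

Because the address is mixed into the cookie, a context copied to another location fails verification, and clearing the cookie makes a second use of the original fail as well.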
* boom goes the dynamite | tedu | 2016-04-25 | 1 | -31/+2
* Remove the unused flags argument from VOP_UNLOCK(). | natano | 2016-03-19 | 1 | -2/+2
  torture tested on amd64, i386 and macppc ok beck mpi stefan "the change looks right" deraadt
* No more compat emulations, so remove ktrace EMUL records and the baggage | guenther | 2016-03-06 | 1 | -11/+1
  for generating and parsing them. ok mpi@ naddy@ millert@ deraadt@