path: root/sys/kern/kern_exec.c
Commit message | Author | Date | Files | Lines (-/+)
* Consistently perform atomic writes to the ps_flags field of struct process.
  ok bluhm@ claudio@ visa@
  [anton, 2020-02-15, 1 file, -3/+3]
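As a rough userland analogue of "atomic writes to a flags field", the sketch below sets and clears bits with C11 atomics. The kernel uses its own atomic primitives; the helper names `flags_set` and `flags_clear` are hypothetical and exist only for this illustration.

```c
#include <stdatomic.h>

/*
 * Hypothetical helpers, not kernel code: update a shared flags word
 * with read-modify-write operations that cannot lose concurrent updates.
 */
static void
flags_set(atomic_uint *flags, unsigned int bits)
{
	/* Atomically OR the requested bits into the flags word. */
	atomic_fetch_or(flags, bits);
}

static void
flags_clear(atomic_uint *flags, unsigned int bits)
{
	/* Atomically AND with the complement to drop the requested bits. */
	atomic_fetch_and(flags, ~bits);
}
```

The point of using one style consistently is that a plain `flags |= bit` in one place can silently overwrite an atomic update made elsewhere.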
* Replace p_xstat with ps_xexit and ps_xsig
  Convert those to a consolidated status when needed in wait4(), kevent(), and sysctl().
  Pass exit code and signal separately to exit1().
  (This also serves as prep for adding waitid(2))
  ok mpi@
  [guenther, 2019-12-11, 1 file, -2/+2]
* comply with POSIX and make execve() return EACCES for directories
  ok millert@ deraadt@
  [naddy, 2019-12-01, 1 file, -5/+1]
* Repurpose the "syscalls must be on a writeable page" mechanism to enforce a new policy: system calls must be in pre-registered regions.
  We have discussed more strict checks than this, but none satisfy the cost/benefit based upon our understanding of attack methods, anyways let's see what the next iteration looks like.

  This is intended to harden (translation: attackers must put extra effort into attacking) against a mixture of W^X failures and JIT bugs which allow syscall misinterpretation, especially in environments with polymorphic-instruction/variable-sized instructions. It fits in a bit with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash behaviour, particularly for remote problems. Less effective once on-host since the libraries can be read.

  For static-executables the kernel registers the main program's PIE-mapped exec section valid, as well as the randomly-placed sigtramp page. For dynamic executables ELF ld.so's exec segment is also labelled valid; ld.so then has enough information to register libc's exec section as valid via call-once msyscall(2).

  For dynamic binaries, we continue to permit the main program exec segment because "go" (and potentially a few other applications) have embedded system calls in the main program. Hopefully at least go gets fixed soon.

  We declare the concept of embedded syscalls a bad idea for numerous reasons, as we notice the ecosystem has many static-syscall-in-base-binary programs which are dynamically linked against libraries which in turn use libc, which contains another set of syscall stubs. We've been concerned about adding even one additional syscall entry point... but go's approach tends to double the entry-point attack surface.

  This was started at a nano-hackathon in Bob Beck's basement 2 weeks ago during a long discussion with mortimer trying to hide from the SSL scream-conversations, and finished in more comfortable circumstances next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.

  ok guenther kettenis mortimer, lots of feedback from others
  conversations about go with jsing tb sthen
  [deraadt, 2019-11-29, 1 file, -2/+2]
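The "pre-registered regions" policy described in this commit can be pictured as a range check on the address a system call was issued from. The following is a minimal sketch under that assumption only; `struct syscall_region` and `pc_in_registered_region` are hypothetical names and do not reflect the kernel's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: one address range from which system calls may
 * be issued (for example an exec segment or the sigtramp page).
 */
struct syscall_region {
	uintptr_t start;	/* first valid address (inclusive) */
	uintptr_t end;		/* one past the last valid address */
};

/* Return true if pc lies inside any of the n registered regions. */
static bool
pc_in_registered_region(const struct syscall_region *regions, size_t n,
    uintptr_t pc)
{
	for (size_t i = 0; i < n; i++) {
		if (pc >= regions[i].start && pc < regions[i].end)
			return true;
	}
	return false;
}
```

In this picture, a static binary would have two entries (main program exec section and sigtramp page), and a dynamic binary would gain entries for ld.so and, after the one-time registration, libc.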
* Kill uvm_deallocate(9) and use uvm_unmap() directly.
  ok kettenis@, semarie@, deraadt@
  [mpi, 2019-11-05, 1 file, -3/+2]
* per-process itimers: itimerval -> itimerspec
  Loongson runs at 128hz. 128 doesn't divide evenly into a million, but it does divide evenly into a billion. So if we do the per-process itimer bookkeeping with itimerspec structs we can have error-free virtual itimers on loongson just as we do on most other platforms.
  This change doesn't fix the virtual itimer error on alpha, as 1024 does not divide evenly into a billion. But this doesn't make the situation any worse, either.
  ok deraadt@
  [cheloha, 2019-08-02, 1 file, -3/+3]
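The divisibility argument in this commit can be checked with a few lines of integer arithmetic. This is only a worked illustration; the helper names `tick_is_exact` and `tick_length` are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_SEC	1000000LL	/* resolution of struct timeval */
#define NSEC_PER_SEC	1000000000LL	/* resolution of struct timespec */

/* True if one clock tick at hz ticks/second is a whole number of units. */
static bool
tick_is_exact(int64_t units_per_sec, int64_t hz)
{
	return units_per_sec % hz == 0;
}

/* Length of one tick in the given unit, truncated toward zero. */
static int64_t
tick_length(int64_t units_per_sec, int64_t hz)
{
	return units_per_sec / hz;
}
```

At 128 Hz a tick is 7812.5 microseconds, which truncates to 7812, but exactly 7812500 nanoseconds. At 1024 Hz a tick is 976562.5 nanoseconds, so the error remains, which matches the note about alpha.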
* Do not relock fdp in fdrelease(). This prevents unnecessary locking in the common case.
  OK mpi@
  [visa, 2019-07-15, 1 file, -2/+2]
* Make resource limit access MP-safe. So far, the copy-on-write sharing of resource limit structs has been done between processes. By applying copy-on-write also between threads, threads can read rlimits in a nearly lock-free manner.
  Inspired by code in DragonFly BSD and FreeBSD.
  OK mpi@, agreement from jmatthew@ and anton@
  [visa, 2019-06-21, 1 file, -2/+2]
* Revert to using the SCHED_LOCK() to protect time accounting.
  It currently creates a lock ordering problem because SCHED_LOCK() is taken by hardclock(). That means the "priorities" of a thread should be moved out of the SCHED_LOCK() first in order to make progress.
  Reported-by: syzbot+8e4863b3dde88eb706dc@syzkaller.appspotmail.com via anton@ as well as by kettenis@
  [mpi, 2019-06-01, 1 file, -3/+1]
* Use a per-process mutex to protect time accounting instead of SCHED_LOCK().
  Note that hardclock(9) still increments p_{u,s,i}ticks without holding a lock.
  ok visa@, cheloha@
  [mpi, 2019-05-31, 1 file, -1/+3]
* Fix stack info leak in execve(2). There are 2x4 bytes of padding in struct ps_strings.
  from NetBSD; OK deraadt@ guenther@ visa@
  [bluhm, 2019-02-08, 1 file, -1/+3]
* If we execute a #!shell binary, the shell is an integral part of the binary so it should bypass unveil restrictions. This is similar (but different...) to how the ELF linker (ld.so) is loaded (after unveils get dropped). Discovered in doas, due to more accurate unveil semantics.
  ok guenther tedu beck
  [deraadt, 2018-10-30, 1 file, -1/+3]
* Decouple unveil from the pledge flags, by adding dedicated unveil flags to the namei args. This fixes a bug where chmod would be allowed with only READ. This also allows some further cleanup of some awkward things like PLEDGE_STAT that will follow.
  Lots of assistance from semarie@ - thanks!
  ok semarie@
  [beck, 2018-08-05, 1 file, -1/+2]
* Remove a few leftovers from the days of emulation, which could result in a bad/corrupt binary not returning ENOEXEC but some other error.
  ok guenther kettenis bluhm
  [deraadt, 2018-07-20, 1 file, -3/+1]
* Unveiling unveil(2).
  This brings unveil into the tree, disabled by default. Currently this will return EPERM on all attempts to use it until we are fully certain it is ready for people to start using, but this now allows for others to do more tweaking and experimentation.
  Still needs to send the unveils across forks and execs before fully enabling.
  Many thanks to robert@ and deraadt@ for extensive testing.
  ok deraadt@
  [beck, 2018-07-13, 1 file, -1/+9]
* Put file descriptors on shared data structures when they are completely setup, take 3.
  LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]' or the global linked list. This allows us to simplify a lot of code grabbing new references to fds.
  All of this is now possible because dup2(2) refuses to clone LARVAL fds.
  Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2).
  With inputs from Mathieu Masson, visa@, guenther@ and art@
  Previous version ok bluhm@, ok visa@, sthen@
  [mpi, 2018-06-18, 1 file, -5/+4]
* Revert introduction of fdinsert(), a sanity check triggers when closing a LARVAL file.
  Found the hard way by sthen@.
  [mpi, 2018-06-05, 1 file, -4/+5]
* Put file descriptors on shared data structures when they are completely setup.
  LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]'. This allows us to simplify a lot of code grabbing new references to fds.
  All of this is now possible because dup2(2) refuses to clone LARVAL fds.
  Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2).
  With inputs from Mathieu -, visa@, guenther@ and art@
  ok visa@, bluhm@
  [mpi, 2018-06-02, 1 file, -5/+4]
* Clean up the parameters of VOP_LOCK() and VOP_UNLOCK(). It is always curproc that does the locking or unlocking, so the proc parameter is pointless and can be dropped.
  OK mpi@, deraadt@
  [visa, 2018-04-28, 1 file, -2/+2]
* Move FREF() inside fd_getfile().
  ok visa@
  [mpi, 2018-04-27, 1 file, -1/+3]
* Stop assuming <sys/file.h> will pull in fcntl.h when _KERNEL is defined.
  ok millert@ sthen@
  [guenther, 2018-01-02, 1 file, -1/+2]
* free(9) sizes for sys_execve.
  Convert the hand-rolled loop to strlcpy, which gives us the size for free(9).
  OK visa
  [florian, 2018-01-01, 1 file, -16/+17]
* Remove unused ps_stackgap from process struct
  Nothing uses this field since Linux compat was removed.
  ok mpi@ deraadt@ guenther@
  [stefan, 2017-12-19, 1 file, -4/+1]
* pledge()'s 2nd argument becomes char *execpromises, which becomes the pledge for a new execve image immediately upon start. Also introduces "error" which makes violations return -1 ENOSYS instead of killing the program ("error" may not be handed to a setuid/setgid program, which may be missing/ignoring syscall return values and would continue with inconsistent state)
  Discussion with many
  florian has used this to improve the strictness of a daemon
  [deraadt, 2017-12-12, 1 file, -2/+15]
* Remove old deactivated pledge path code. A replacement mechanism is being brewed.
  ok beck
  [deraadt, 2017-08-29, 1 file, -2/+1]
* Provide mips64 with kernel-facing TCB_{GET,SET} macros that store it in struct mdproc. With that, all archs have those and the __HAVE_MD_TCB macro can be unifdef'ed as always defined.
  ok kettenis@ visa@ jsing@
  [guenther, 2017-04-13, 1 file, -5/+2]
* Add a flags argument to falloc() that lets it optionally set the close-on-exec flag on the newly allocated fd. Make falloc()'s return arguments non-optional: assert that they're not NULL.
  ok mpi@ millert@
  [guenther, 2017-02-11, 1 file, -2/+2]
* Delete the obsolete fork/exec/exit emulation hooks.
  ok mpi@ dlg@
  [guenther, 2017-02-08, 1 file, -22/+1]
* p_comm is the process's command and isn't per thread, so move it from struct proc to struct process.
  ok deraadt@ kettenis@
  [guenther, 2017-01-21, 1 file, -3/+3]
* Delete dead copy of pr->ps_vmspace; uvmspace_exec() can change it anyway
  ok kettenis@ jsing@
  [guenther, 2016-10-22, 1 file, -3/+3]
* Reset PS_WXNEEDED in execve(2).
  The new process should inherit wxneeded perms from the ELF executable only, not from the former process.
  Solution improved by guenther@, ok guenther@ deraadt@, ok tedu@ on a similar diff.
  [jca, 2016-09-03, 1 file, -1/+3]
* Cleanup some systrace leftovers.
  ok jca@, guenther@
  [kettenis, 2016-06-11, 1 file, -20/+3]
* Identify W^X labelled binaries at execve() time based upon WX_OPENBSD_WXNEEDED flag set by ld -zwxneeded. Such binaries are allowed to run only on wxallowed mountpoints. They do not report mmap/mprotect problems.
  Rate limit mmap/mprotect reports from other binaries.
  These semantics are chosen to encourage progress in the ports ecosystem, without overwhelming the developers who work in the area.
  ok sthen kettenis
  [deraadt, 2016-05-30, 1 file, -1/+4]
* backout to insert correct commit message
  [deraadt, 2016-05-30, 1 file, -4/+1]
* *** empty log message ***
  [deraadt, 2016-05-30, 1 file, -1/+4]
* Place a cpu-dependent trap/illegal instruction over the remainder of the sigtramp page, so that it will generate a nice kernel fault if touched. While here, move most of the sigtramps to the .rodata segment, because they are not executed in the kernel.
  Also some preparation for sliding the actual sigtramp forward (will need some gdb changes)
  ok mlarkin kettenis
  [deraadt, 2016-05-23, 1 file, -2/+8]
* SROP mitigation. sendsig() stores a (per-process ^ &sigcontext) cookie inside the sigcontext. sigreturn(2) checks syscall entry was from the exact PC addr in the (per-process ASLR) sigtramp, verifies the cookie, and clears it to prevent sigcontext reuse.
  not yet tested on landisk, sparc, *88k, socppc.
  ok kettenis
  [deraadt, 2016-05-10, 1 file, -1/+6]
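The cookie scheme in this commit (a per-process value XORed with the address of the sigcontext) can be modelled in a few lines. This is a userland sketch under that description only: `struct sigctx`, `sigctx_seal`, and `sigctx_verify_and_clear` are hypothetical names, and the check that the call came from the exact sigtramp PC is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a signal context; only the cookie matters. */
struct sigctx {
	uintptr_t cookie;
};

/* On signal delivery: bind the context to the address it lives at. */
static void
sigctx_seal(struct sigctx *sc, uintptr_t proc_secret)
{
	sc->cookie = proc_secret ^ (uintptr_t)sc;
}

/*
 * On return from the handler: accept the context only if its cookie
 * matches, then clear the cookie so the same context cannot be replayed.
 */
static bool
sigctx_verify_and_clear(struct sigctx *sc, uintptr_t proc_secret)
{
	if (sc->cookie != (proc_secret ^ (uintptr_t)sc))
		return false;
	sc->cookie = 0;
	return true;
}
```

Because the address is mixed into the cookie, a context copied to a different location fails verification, and clearing the cookie after a successful check makes a second use of the same context fail as well.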
* boom goes the dynamite
  [tedu, 2016-04-25, 1 file, -31/+2]
* Remove the unused flags argument from VOP_UNLOCK().
  torture tested on amd64, i386 and macppc
  ok beck mpi stefan
  "the change looks right" deraadt
  [natano, 2016-03-19, 1 file, -2/+2]
* No more compat emulations, so remove ktrace EMUL records and the baggage for generating and parsing them.
  ok mpi@ naddy@ millert@ deraadt@
  [guenther, 2016-03-06, 1 file, -11/+1]
* remove stale lint annotations
  [tedu, 2015-12-05, 1 file, -2/+1]
* move the pledgenote annotation from `struct proc' to `struct nameidata'
  pledgenote is used to annotate the policy for a namei context. So make it track the nameidata.
  It is expected for the caller to explicitly define the policy. It is a kernel bug to not do so.
  ok deraadt@
  [semarie, 2015-11-02, 1 file, -2/+2]
* move p_pledgenote setting next to NDINIT()
  [deraadt, 2015-10-28, 1 file, -2/+2]
* Fold "malloc" into "stdio" and -- recognizing that no program so far has used less than "stdio" -- include all the "self" operations. Instead of different defines, use regular PLEDGE_* in the "p_pledgenote" variable (which indicates the operation subtype a system call is performing). Many checks become easier to understand. p_pledgenote can often be passed directly to ktrace, so that kdump says:
    15565 test CALL pledge(0xa9a3f804c51,0)
    15565 test STRU pledge request="stdio"
    15565 test RET pledge 0
    15565 test CALL open(0xa9a3f804c57,0x2<O_RDWR>)
    15565 test NAMI "/tmp/testfile"
    15565 test PLDG open, "wpath", errno 1 Operation not permitted
  with help from semarie, ok guenther
  [deraadt, 2015-10-25, 1 file, -2/+2]
* I forgot execve would go through the namei codepath, so a program marked "stdio rpath" would fail to execve. pre-indicate exec actions to the namei checker to allow them through.
  ok semarie
  [deraadt, 2015-10-10, 1 file, -1/+2]
* Rename tame() to pledge(). This interface has evolved to be more strict than anticipated. It allows a programmer to pledge/promise/covenant that their program will operate within an easily defined subset of the Unix environment, or it pays the price.
  [deraadt, 2015-10-09, 1 file, -4/+4]
* Add the tame "exec" request. This allows processes which request "exec" to call execve(2), potentially fork(2) beforehand if they asked for "proc". Calling execve is what "shells" (ksh, tmux, etc) have as their primary purpose. But meantime, if such a shell has a nasty bug, we want to mitigate the process from opening a socket or calling 100+ other system calls. Unfortunately silver bullets are in short supply, so if our goal is to stay in a POSIX-y environment, we have to let shells call execve(). POSIX ate the world, so what choices do we all have?
  Warning for many: silver bullets are even more rare in other OS ecosystems, so please accept this as a narrow lowering of the bar in a very raised environment.
  Committed from a machine running tame "proc exec" ksh, make, etc.
  [deraadt, 2015-10-07, 1 file, -1/+5]
* missing ) in COMPAT_LINUX block
  [deraadt, 2015-10-02, 1 file, -2/+2]
* Add ktracing of argv and envp to execve(2), with envp not traced by default
  ok tedu@ deraadt@
  [guenther, 2015-10-02, 1 file, -5/+26]
* Track size of an opaque allocation to pass to free() later
  ok guenther tedu
  [deraadt, 2015-09-28, 1 file, -6/+6]