| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
process.
ok bluhm@ claudio@ visa@
|
|
|
|
|
|
|
|
|
| |
Convert those to a consolidated status when needed in wait4(), kevent(),
and sysctl()
Pass exit code and signal separately to exit1()
(This also serves as prep for adding waitid(2))
ok mpi@
|
|
|
|
| |
ok millert@ deraadt@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods, anyways
let's see what the next iteration looks like.
This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularily for remote problems. Less effective once on-host
since someone the libraries can be read.
For static-executables the kernel registers the main program's
PIE-mapped exec section valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2)
For dynamic binaries, we continue to to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.
We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many of
static-syscall-in-base-binary which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.
This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.
ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen
|
|
|
|
| |
ok kettenis@, semarie@, deraadt@
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loongson runs at 128hz. 128 doesn't divide evenly into a million,
but it does divide evenly into a billion. So if we do the per-process
itimer bookkeeping with itimerspec structs we can have error-free
virtual itimers on loongson just as we do on most other platforms.
This change doesn't fix the virtual itimer error alpha, as 1024 does not
divide evenly into a billion. But this doesn't make the situation any
worse, either.
ok deraadt@
|
|
|
|
|
|
| |
in the common case.
OK mpi@
|
|
|
|
|
|
|
|
|
|
| |
of resource limit structs has been done between processes. By applying
copy-on-write also between threads, threads can read rlimits in
a nearly lock-free manner.
Inspired by code in DragonFly BSD and FreeBSD.
OK mpi@, agreement from jmatthew@ and anton@
|
|
|
|
|
|
|
|
|
| |
It currently creates a lock ordering problem because SCHED_LOCK() is taken
by hardclock(). That means the "priorities" of a thread should be moved
out of the SCHED_LOCK() first in order to make progress.
Reported-by: syzbot+8e4863b3dde88eb706dc@syzkaller.appspotmail.com
via anton@ as well as by kettenis@
|
|
|
|
|
|
|
| |
Note that hardclock(9) still increments p_{u,s,i}ticks without holding a
lock.
ok visa@, cheloha@
|
|
|
|
|
| |
in struct ps_strings.
from NetBSD; OK deraadt@ guenther@ visa@
|
|
|
|
|
|
|
|
| |
binary so it should bypass unveil restrictions. This is similar
(but different...) to how the ELF linker (ld.so) is loaded (after
unveils get dropped). Discovered in doas, due to more accurate unveil
semantics.
ok guenther tedu beck
|
|
|
|
|
|
|
|
| |
to the namei args. This fixes a bug where chmod would be allowed when
with only READ. This also allows some further cleanup of some awkward
things like PLEDGE_STAT that will follow
Lots of assistence from semarie@ - thanks!
ok semarie@
|
|
|
|
|
| |
a bad/corrupt binary not returning ENOEXEC but some other error.
ok guenther kettenis bluhm
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This brings unveil into the tree, disabled by default - Currently
this will return EPERM on all attempts to use it until we are
fully certain it is ready for people to start using, but this
now allows for others to do more tweaking and experimentation.
Still needs to send the unveil's across forks and execs before
fully enabling.
Many thanks to robert@ and deraadt@ for extensive testing.
ok deraadt@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
setup, take 3.
LARVAL fd still exist, but they are no longer marked with a flag and no
longer reachable via `fd_ofiles[]' or the global linked list. This allows
us to simplifies a lot code grabbing new references to fds.
All of this is now possible because dup2(2) refuses to clone LARVAL fds.
Note that the `fdplock' could now be release in all open(2)-like syscalls,
just like it is done in accept(2).
With inputs from Mathieu Masson, visa@, guenther@ and art@
Previous version ok bluhm@, ok visa@, sthen@
|
|
|
|
|
|
| |
closing a LARVAL file.
Found the hardway by sthen@.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
setup.
LARVAL fd still exist, but they are no longer marked with a flag and no
longer reachable via `fd_ofiles[]'. This allows us to simplifies a lot
code grabbing new references to fds.
All of this is now possible because dup2(2) refuses to clone LARVAL fds.
Note that the `fdplock' could now be release in all open(2)-like syscalls,
just like it is done in accept(2).
With inputs from Mathieu -, visa@, guenther@ and art@
ok visa@, bluhm@
|
|
|
|
|
|
|
| |
curproc that does the locking or unlocking, so the proc parameter
is pointless and can be dropped.
OK mpi@, deraadt@
|
|
|
|
| |
ok visa@
|
|
|
|
| |
ok millert@ sthen@
|
|
|
|
|
|
|
| |
Convert the hand rolled loop to strlcpy which gives us the size for
free(9).
OK visa
|
|
|
|
|
| |
Nothing uses this field since Linux compat was removed.
ok mpi@ deraadt@ guenther@
|
|
|
|
|
|
|
|
|
|
| |
pledge for a new execve image immediately upon start. Also introduces
"error" which makes violations return -1 ENOSYS instead of killing the
program ("error" may not be handed to a setuid/setgid program, which
may be missing/ignoring syscall return values and would continue with
inconsistant state)
Discussion with many
florian has used this to improve the strictness of a daemon
|
|
|
|
|
| |
being brewed.
ok beck
|
|
|
|
|
|
|
| |
in struct mdproc. With that, all archs have those and the __HAVE_MD_TCB
macro can be unifdef'ed as always defined.
ok kettenis@ visa@ jsing@
|
|
|
|
|
|
|
| |
close-on-exec flag on the newly allocated fd. Make falloc()'s
return arguments non-optional: assert that they're not NULL.
ok mpi@ millert@
|
|
|
|
| |
ok mpi@ dlg@
|
|
|
|
|
|
| |
struct proc to struct process.
ok deraadt@ kettenis@
|
|
|
|
| |
ok kettenis@ jsing@
|
|
|
|
|
|
|
|
| |
The new process should inherit wxneeded perms from the ELF executable only,
not from the former process.
Solution improved by guenther@, ok guenther@ deraadt@, ok tedu@ on a similar
diff.
|
|
|
|
| |
ok jca@, guenther@
|
|
|
|
|
|
|
|
|
|
|
| |
flag set by ld -zwxneeded. Such binaries are allowed to run only on wxallowed
mountpoints. They do not report mmap/mprotect problems.
Rate limit mmap/mprotect reports from other binaries.
These semantics are chosen to encourage progress in the ports ecosystem,
without overwhelming the developers who work in the area.
ok sthen kettenis
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
sigtramp page, so that it will generate a nice kernel fault if touched.
While here, move most of the sigtramps to the .rodata segment, because
they are not executed in the kernel.
Also some preparation for sliding the actual sigtramp forward (will need
some gdb changes)
ok mlarkin kettenis
|
|
|
|
|
|
|
|
| |
inside the sigcontext. sigreturn(2) checks syscall entry was from the
exact PC addr in the (per-process ASLR) sigtramp, verifies the cookie,
and clears it to prevent sigcontext reuse.
not yet tested on landisk, sparc, *88k, socppc.
ok kettenis
|
| |
|
|
|
|
|
|
| |
torture tested on amd64, i386 and macppc
ok beck mpi stefan
"the change looks right" deraadt
|
|
|
|
|
|
| |
for generating and parsing them.
ok mpi@ naddy@ millert@ deraadt@
|
| |
|
|
|
|
|
|
|
|
|
|
| |
pledgenote is used for annotate the policy for a namei context. So make it
tracking the nameidata.
It is expected for the caller to explicitly define the policy. It is a kernel
bug to not do so.
ok deraadt@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
used less than "stdio" -- include all the "self" operations. Instead of
different defines, use regular PLEDGE_* in the "p_pledgenote" variable
(which indicates the operation subtype a system call is performing). Many
checks before easier to understand. p_pledgenote can often be passed
directly to ktrace, so that kdump says:
15565 test CALL pledge(0xa9a3f804c51,0)
15565 test STRU pledge request="stdio"
15565 test RET pledge 0
15565 test CALL open(0xa9a3f804c57,0x2<O_RDWR>)
15565 test NAMI "/tmp/testfile"
15565 test PLDG open, "wpath", errno 1 Operation not permitted
with help from semarie, ok guenther
|
|
|
|
|
|
| |
"stdio rpath" this would fail to execve. pre-indicate exec actions to the
namei checker to allow them through.
ok semarie
|
|
|
|
|
|
| |
strict than anticipated. It allows a programmer to pledge/promise/covenant
that their program will operate within an easily defined subset of the
Unix environment, or it pays the price.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"exec" to call execve(2), potentially fork(2) beforehands if they
asked for "proc". Calling execve is what "shells" (ksh, tmux, etc)
have as their primary purpose. But meantime, if such a shell has a
nasty bug, we want to mitigate the process from opening a socket or
calling 100+ other system calls. Unfortunately silver bullets are in
short supply, so if our goal is to stay in a POSIX-y environment, we
have to let shells call execve(). POSIX ate the world, so choices do
we all have?
Warning for many: silver bullets are even more rare in other OS
ecosystems, so please accept this as a narrow lowering of the bar in a
very raised environment.
Commited from a machine running tame "proc exec" ksh, make, etc.
|
| |
|
|
|
|
| |
ok tedu@ deraadt@
|
|
|
|
| |
ok guenther tedu
|