| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
architecture.
from miod
|
|
|
|
|
|
|
|
| |
single_thread_set() is modified to explicitly indicate when waiting until
sibling threads are parked is required. This is obviously not required if
a traced thread is switching away from a CPU after handling a STOP signal.
ok claudio@
|
|
|
|
|
|
| |
Kill SINGLE_PTRACE and use SINGLE_SUSPEND which has almost the same semantics.
This diff did not properly kill SINGLE_PTRACE and broke RAMDISK kernels.
|
|
|
|
|
|
|
|
| |
single_thread_set() is modified to explicitly indicate when waiting until
sibling threads are parked is required. This is obviously not required if
a traced thread is switching away from a CPU after handling a STOP signal.
ok claudio@
|
|
|
|
|
|
|
|
| |
If we fold the for-loop iterating over each interval timer into the
helper function the result is slightly tidier than what we have now.
Rename the helper function "cancel_all_itimers".
Based on input from millert@ and kettenis@.
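The idea of folding the per-timer loop into one helper can be illustrated from userland. This is only an analogue and not the kernel code: it assumes the three POSIX timers ITIMER_REAL, ITIMER_VIRTUAL and ITIMER_PROF, relies on setitimer(2) disarming a timer when it_value is zero, and the name cancel_all_itimers_sketch() is hypothetical.

```c
#include <stddef.h>
#include <sys/time.h>

/*
 * Userland analogue (hypothetical, not the kernel helper): disarm
 * every per-process interval timer in one call. Passing a zeroed
 * it_value to setitimer(2) disables the timer.
 * Returns 0 on success, -1 if any setitimer(2) call failed.
 */
int
cancel_all_itimers_sketch(void)
{
	static const int which[] = { ITIMER_REAL, ITIMER_VIRTUAL, ITIMER_PROF };
	struct itimerval zero = { { 0, 0 }, { 0, 0 } };
	size_t i;
	int rv = 0;

	/* The loop lives inside the helper, so callers make one call. */
	for (i = 0; i < sizeof(which) / sizeof(which[0]); i++) {
		if (setitimer(which[i], &zero, NULL) == -1)
			rv = -1;
	}
	return rv;
}
```

Callers no longer build an itimerval themselves or iterate over the timers; a single call replaces the loop at every call site.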
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During _exit(2) and sometimes during execve(2) we need to cancel any
active per-process interval timers. We don't currently do this in an
MP-safe way. Both syscalls ignore the locking assumptions documented
in proc.h.
The easiest way to make them MP-safe is to use setitimer(), just like
the getitimer(2) and setitimer(2) syscalls do. To make things a bit
cleaner I have added a helper function, cancelitimer(), so the callers
don't need to fuss with an itimerval struct.
While we're here we can remove the splclock/splx dance from execve(2).
It is no longer necessary.
ok deraadt@
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
page it out and bad things will happen when we try to page it back in
from within the clock interrupt handler.
While there, make sure we set timekeep_object back to NULL if we fail
to map the timekeep page into kernel space.
ok deraadt@ (who had a very similar diff)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This diff exposes parts of clock_gettime(2) and gettimeofday(2) to
userland via libc, liberating processes from the need for a context
switch every time they want to count the passage of time.
If a timecounter clock can be exposed to userland then it needs to set
its tc_user member to a non-zero value. Tested with one or multiple
counters per architecture.
The timing data is shared through a pointer found in the new ELF
auxiliary vector AUX_openbsd_timekeep containing timehands information
that is frequently updated by the kernel.
Timing differences between the last kernel update and the current time
are adjusted in userland by the tc_get_timecount() function inside the
MD usertc.c file.
This permits a much more responsive environment, quite visible in
browsers, office programs and gaming (apparently one is able to fly
in Minecraft now).
Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others!
OK from at least kettenis@, cheloha@, naddy@, sthen@
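A rough sketch of how a userland reader might combine the shared timing data with a fresh counter value. Everything here is an assumption made for illustration: the struct layout, the field names, the generation-based retry loop and the fixed nanoseconds-per-tick scale are hypothetical and are not taken from the actual timekeep page or usertc.c. The counter value is passed in as a parameter in place of the MD tc_get_timecount() read.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared timekeep page. */
struct timekeep_sketch {
	_Atomic uint32_t generation;	/* 0 while an update is in progress */
	_Atomic uint64_t base_ns;	/* time at the last kernel update */
	_Atomic uint32_t offset_count;	/* counter value at that update */
	_Atomic uint32_t counter_mask;	/* valid bits of the counter */
	_Atomic uint64_t ns_per_tick;	/* simplified scale factor */
};

/*
 * Compute "time at last update + elapsed counter ticks".
 * Returns 0 and stores the result in *ns, or -1 if no consistent
 * snapshot was obtained and the caller should use the syscall.
 */
int
read_time_sketch(struct timekeep_sketch *tk, uint32_t now_count, uint64_t *ns)
{
	uint32_t gen, delta;
	uint64_t result;
	int tries;

	for (tries = 0; tries < 100; tries++) {
		gen = atomic_load_explicit(&tk->generation, memory_order_acquire);
		if (gen == 0)
			continue;	/* update in progress, retry */
		/* Masked subtraction handles counter wraparound. */
		delta = (now_count -
		    atomic_load_explicit(&tk->offset_count, memory_order_relaxed)) &
		    atomic_load_explicit(&tk->counter_mask, memory_order_relaxed);
		result = atomic_load_explicit(&tk->base_ns, memory_order_relaxed) +
		    (uint64_t)delta *
		    atomic_load_explicit(&tk->ns_per_tick, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		/* Accept the snapshot only if no update happened meanwhile. */
		if (gen == atomic_load_explicit(&tk->generation, memory_order_relaxed)) {
			*ns = result;
			return 0;
		}
	}
	return -1;
}
```

The point of the design is that the common path touches only shared memory and a hardware counter, so no context switch is needed; the fallback return keeps the syscall available when the shared data cannot be used.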
|
|
|
|
|
|
| |
process.
ok bluhm@ claudio@ visa@
|
|
|
|
|
|
|
|
|
| |
Convert those to a consolidated status when needed in wait4(), kevent(),
and sysctl()
Pass exit code and signal separately to exit1()
(This also serves as prep for adding waitid(2))
ok mpi@
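As an illustration of converting a separately stored exit code and signal into one consolidated status, here is a minimal sketch. It assumes the traditional wait(2) layout used on the BSDs and Linux (exit code in bits 8-15, terminating signal in the low 7 bits); both function names are hypothetical and do not come from the kernel sources.

```c
#include <sys/wait.h>

/*
 * Hypothetical helper: fold an exit code and a terminating signal,
 * kept as two separate values, into a traditional wait(2) status.
 * Assumes the conventional layout: code in bits 8-15, signal in
 * the low 7 bits.
 */
int
consolidated_status_sketch(int exit_code, int signo)
{
	return ((exit_code & 0xff) << 8) | (signo & 0x7f);
}

/*
 * Decode with the standard macros: returns the exit code, or the
 * negated signal number if the status reports death by signal.
 */
int
describe_status_sketch(int status)
{
	if (WIFSIGNALED(status))
		return -WTERMSIG(status);
	return WEXITSTATUS(status);
}
```

Keeping the two values apart until a consumer asks for a status word means each interface can present them in its own form.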
|
|
|
|
| |
ok millert@ deraadt@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enforce a new policy: system calls must be in pre-registered regions.
We have discussed more strict checks than this, but none satisfy the
cost/benefit based upon our understanding of attack methods. Anyway,
let's see what the next iteration looks like.
This is intended to harden (translation: attackers must put extra
effort into attacking) against a mixture of W^X failures and JIT bugs
which allow syscall misinterpretation, especially in environments with
polymorphic-instruction/variable-sized instructions. It fits in a bit
with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash
behaviour, particularly for remote problems. It is less effective once
an attacker is on-host, since the libraries can then be read.
For static executables the kernel registers the main program's
PIE-mapped exec section as valid, as well as the randomly-placed sigtramp
page. For dynamic executables ELF ld.so's exec segment is also
labelled valid; ld.so then has enough information to register libc's
exec section as valid via call-once msyscall(2).
For dynamic binaries, we continue to permit the main program exec
segment because "go" (and potentially a few other applications) have
embedded system calls in the main program. Hopefully at least go gets
fixed soon.
We declare the concept of embedded syscalls a bad idea for numerous
reasons, as we notice the ecosystem has many
static-syscall-in-base-binary programs which are dynamically linked against
libraries which in turn use libc, which contains another set of
syscall stubs. We've been concerned about adding even one additional
syscall entry point... but go's approach tends to double the entry-point
attack surface.
This was started at a nano-hackathon in Bob Beck's basement 2 weeks
ago during a long discussion with mortimer trying to hide from the SSL
scream-conversations, and finished in more comfortable circumstances
next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.
ok guenther kettenis mortimer, lots of feedback from others
conversations about go with jsing tb sthen
|
|
|
|
| |
ok kettenis@, semarie@, deraadt@
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loongson runs at 128hz. 128 doesn't divide evenly into a million,
but it does divide evenly into a billion. So if we do the per-process
itimer bookkeeping with itimerspec structs we can have error-free
virtual itimers on loongson just as we do on most other platforms.
This change doesn't fix the virtual itimer error on alpha, as 1024 does not
divide evenly into a billion. But this doesn't make the situation any
worse, either.
ok deraadt@
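The divisibility argument can be checked directly. One million is 2^6 * 5^6, so 128 (2^7) does not divide it, while one billion is 2^9 * 5^9, which 128 divides but 1024 (2^10) does not. A small sketch, with the hypothetical helper name tick_remainder():

```c
#include <stdint.h>

/*
 * Remainder left over when one second, expressed in units_per_sec
 * (1000000 for microseconds, 1000000000 for nanoseconds), is split
 * into hz equal ticks. Zero means each tick is a whole number of units.
 */
uint32_t
tick_remainder(uint32_t units_per_sec, uint32_t hz)
{
	return units_per_sec % hz;
}
```

With hz = 128 a tick is 7812.5 microseconds but exactly 7812500 nanoseconds, which is why nanosecond bookkeeping removes the error there. With hz = 1024 a tick is 976562.5 nanoseconds, so the error on alpha remains.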
|
|
|
|
|
|
| |
in the common case.
OK mpi@
|
|
|
|
|
|
|
|
|
|
| |
of resource limit structs has been done between processes. By applying
copy-on-write also between threads, threads can read rlimits in
a nearly lock-free manner.
Inspired by code in DragonFly BSD and FreeBSD.
OK mpi@, agreement from jmatthew@ and anton@
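The copy-on-write idea can be sketched as a reference-counted struct that is shared until someone needs to write. This is a single-threaded illustration only: the struct, its size and all function names are hypothetical, and a real implementation would need atomic reference counting and proper synchronization on the write path.

```c
#include <stdlib.h>
#include <string.h>

#define NLIMITS_SKETCH 4	/* hypothetical number of limits */

struct plimit_sketch {
	long	limits[NLIMITS_SKETCH];
	int	refcnt;		/* number of holders sharing this struct */
};

/* Allocate a zeroed limit struct with a single holder. */
struct plimit_sketch *
lim_alloc_sketch(void)
{
	struct plimit_sketch *l = calloc(1, sizeof(*l));

	if (l != NULL)
		l->refcnt = 1;
	return l;
}

/* Sharing is cheap: take a reference, copy nothing. */
struct plimit_sketch *
lim_share_sketch(struct plimit_sketch *l)
{
	l->refcnt++;
	return l;
}

/*
 * Call before modifying limits. If the struct is shared, the holder
 * gets a private copy and *slot is updated to point at it.
 * Returns the struct that is safe to modify, or NULL on allocation
 * failure (in which case *slot is unchanged).
 */
struct plimit_sketch *
lim_write_begin_sketch(struct plimit_sketch **slot)
{
	struct plimit_sketch *old = *slot, *copy;

	if (old->refcnt == 1)
		return old;	/* sole holder, write in place */
	copy = malloc(sizeof(*copy));
	if (copy == NULL)
		return NULL;
	memcpy(copy->limits, old->limits, sizeof(copy->limits));
	copy->refcnt = 1;
	old->refcnt--;
	*slot = copy;
	return copy;
}
```

Because a shared struct is never modified in place, readers holding a reference see stable values, which is what makes reads cheap.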
|
|
|
|
|
|
|
|
|
| |
It currently creates a lock ordering problem because SCHED_LOCK() is taken
by hardclock(). That means the "priorities" of a thread should be moved
out of the SCHED_LOCK() first in order to make progress.
Reported-by: syzbot+8e4863b3dde88eb706dc@syzkaller.appspotmail.com
via anton@ as well as by kettenis@
|
|
|
|
|
|
|
| |
Note that hardclock(9) still increments p_{u,s,i}ticks without holding a
lock.
ok visa@, cheloha@
|
|
|
|
|
| |
in struct ps_strings.
from NetBSD; OK deraadt@ guenther@ visa@
|
|
|
|
|
|
|
|
| |
binary so it should bypass unveil restrictions. This is similar
(but different...) to how the ELF linker (ld.so) is loaded (after
unveils get dropped). Discovered in doas, due to more accurate unveil
semantics.
ok guenther tedu beck
|
|
|
|
|
|
|
|
| |
to the namei args. This fixes a bug where chmod would be allowed
with only READ. This also allows some further cleanup of some awkward
things like PLEDGE_STAT that will follow.
Lots of assistance from semarie@ - thanks!
ok semarie@
|
|
|
|
|
| |
a bad/corrupt binary not returning ENOEXEC but some other error.
ok guenther kettenis bluhm
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This brings unveil into the tree, disabled by default - Currently
this will return EPERM on all attempts to use it until we are
fully certain it is ready for people to start using, but this
now allows for others to do more tweaking and experimentation.
Still needs to send the unveils across forks and execs before
fully enabling.
Many thanks to robert@ and deraadt@ for extensive testing.
ok deraadt@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
setup, take 3.
LARVAL fds still exist, but they are no longer marked with a flag and no
longer reachable via `fd_ofiles[]' or the global linked list. This allows
us to simplify a lot of code grabbing new references to fds.
All of this is now possible because dup2(2) refuses to clone LARVAL fds.
Note that the `fdplock' could now be released in all open(2)-like syscalls,
just like it is done in accept(2).
With inputs from Mathieu Masson, visa@, guenther@ and art@
Previous version ok bluhm@, ok visa@, sthen@
|
|
|
|
|
|
| |
closing a LARVAL file.
Found the hard way by sthen@.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
setup.
LARVAL fds still exist, but they are no longer marked with a flag and no
longer reachable via `fd_ofiles[]'. This allows us to simplify a lot of
code grabbing new references to fds.
All of this is now possible because dup2(2) refuses to clone LARVAL fds.
Note that the `fdplock' could now be released in all open(2)-like syscalls,
just like it is done in accept(2).
With inputs from Mathieu -, visa@, guenther@ and art@
ok visa@, bluhm@
|
|
|
|
|
|
|
| |
curproc that does the locking or unlocking, so the proc parameter
is pointless and can be dropped.
OK mpi@, deraadt@
|
|
|
|
| |
ok visa@
|
|
|
|
| |
ok millert@ sthen@
|
|
|
|
|
|
|
| |
Convert the hand-rolled loop to strlcpy which gives us the size for
free(9).
OK visa
|
|
|
|
|
| |
Nothing uses this field since Linux compat was removed.
ok mpi@ deraadt@ guenther@
|
|
|
|
|
|
|
|
|
|
| |
pledge for a new execve image immediately upon start. Also introduces
"error" which makes violations return -1 ENOSYS instead of killing the
program ("error" may not be handed to a setuid/setgid program, which
may be missing/ignoring syscall return values and would continue with
inconsistent state).
Discussion with many
florian has used this to improve the strictness of a daemon
|
|
|
|
|
| |
being brewed.
ok beck
|
|
|
|
|
|
|
| |
in struct mdproc. With that, all archs have those and the __HAVE_MD_TCB
macro can be unifdef'ed as always defined.
ok kettenis@ visa@ jsing@
|
|
|
|
|
|
|
| |
close-on-exec flag on the newly allocated fd. Make falloc()'s
return arguments non-optional: assert that they're not NULL.
ok mpi@ millert@
|
|
|
|
| |
ok mpi@ dlg@
|
|
|
|
|
|
| |
struct proc to struct process.
ok deraadt@ kettenis@
|
|
|
|
| |
ok kettenis@ jsing@
|
|
|
|
|
|
|
|
| |
The new process should inherit wxneeded perms from the ELF executable only,
not from the former process.
Solution improved by guenther@, ok guenther@ deraadt@, ok tedu@ on a similar
diff.
|
|
|
|
| |
ok jca@, guenther@
|
|
|
|
|
|
|
|
|
|
|
| |
flag set by ld -zwxneeded. Such binaries are allowed to run only on wxallowed
mountpoints. They do not report mmap/mprotect problems.
Rate limit mmap/mprotect reports from other binaries.
These semantics are chosen to encourage progress in the ports ecosystem,
without overwhelming the developers who work in the area.
ok sthen kettenis
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
sigtramp page, so that it will generate a nice kernel fault if touched.
While here, move most of the sigtramps to the .rodata segment, because
they are not executed in the kernel.
Also some preparation for sliding the actual sigtramp forward (will need
some gdb changes)
ok mlarkin kettenis
|
|
|
|
|
|
|
|
| |
inside the sigcontext. sigreturn(2) checks syscall entry was from the
exact PC addr in the (per-process ASLR) sigtramp, verifies the cookie,
and clears it to prevent sigcontext reuse.
not yet tested on landisk, sparc, *88k, socppc.
ok kettenis
|
| |
|
|
|
|
|
|
| |
torture tested on amd64, i386 and macppc
ok beck mpi stefan
"the change looks right" deraadt
|
|
|
|
|
|
| |
for generating and parsing them.
ok mpi@ naddy@ millert@ deraadt@
|