| Commit message | Author | Age | Files | Lines |
| | |
32-bit PowerPC doesn't have instructions for lock-free atomic ops on
8-byte values, and needs libcalls like __atomic_fetch_add_8(). In
code like "_Atomic long long a; a++;", clang doesn't emit a libcall.
This was causing linker errors on symbols like __sync_fetch_and_add_8.
Now that LLVM knows the max atomic size, its AtomicExpandPass changes
these 8-byte ops into libcalls.
ok mortimer@
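
A minimal C sketch of the kind of source that triggers this, assuming a C11 compiler with <stdatomic.h>. The names counter, bump_counter, bump_counter_explicit and read_counter are hypothetical and only illustrate the pattern; on 32-bit PowerPC the 8-byte read-modify-write below is what would need a libcall.

```c
#include <stdatomic.h>

/* Hypothetical 64-bit counter. On 32-bit PowerPC an object of this
 * size has no lock-free instruction sequence. */
static _Atomic long long counter;

/* Post-increment on an _Atomic object is an atomic read-modify-write.
 * Returns the value held before the increment. */
long long bump_counter(void)
{
	return counter++;
}

/* The same operation spelled with the C11 library interface. */
long long bump_counter_explicit(void)
{
	return atomic_fetch_add(&counter, 1);
}

/* Atomic 8-byte load of the current value. */
long long read_counter(void)
{
	return atomic_load(&counter);
}
```

Calling bump_counter() and then bump_counter_explicit() from a fresh start leaves the counter at 2. On targets without native 8-byte atomics the program may additionally need to link against an atomic support library.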
|
This should simplify bringup and make it easier to support Big Endian
and Little Endian with the same code.
May be reconsidered if it causes too many problems with Ports.
ok kettenis@
|
This lets the kernel detect retguard traps and send SIGABRT instead
of SIGEMT.
SIGEMT does not correctly indicate the nature of the error (stack
overflow, violation of control flow). It can mislead the user into
restarting the program without further investigation.
Prompted by and OK deraadt@
OK mortimer@
|
When the DAG truncates an ISD::ADDE node, DAGCombiner may optimize it
by making an adde with smaller operands. PowerPC has i1 registers,
and may truncate an i32 adde to i1, but an i1 adde is not legal for
PowerPC, and the legalize-ops phase can't fix it. This was causing
"fatal error: error in backend: Cannot select..."
cwen@ reported the error
ok mortimer@ kettenis@ deraadt@
|
"hard-quad-float" feature is available. Add missing replacement
instruction patterns that are needed to emit alternative code for
conditional moves of quad-precision floats.
ok mortimer@
|
ok mortimer@
|
them as COMDATs so that the linker can individually discard them, instead
of just ignoring duplicate symbols but keeping the (duplicate) space.
On amd64, this reduces the size of the kernel OPENBSD_RANDOM segment by 82%
and the libc OPENBSD_RANDOM segment by 15%. A port that tb@ is working
on experienced a 97.3% reduction...which let it actually run.
ok mortimer@ deraadt@
|
For this architecture we use separate retguard prologue and epilogue code
for static or PIC code. In the PIC case we use some additional code before
the retguard epilogue to recover the function start address and the GOT
pointer in order to get the per-function random cookie. Much thanks to
visa@ for suggestions and advice making it all work.
ok deraadt@ visa@
|
Tested in snaps and package builds
Tested on amd64 by naddy@
Tested on arm64 by patrick@
Tested on octeon by visa@
|
On arm64, arm, and ppc it is possible that a large stack frame will
cause the stack protector slot to be reallocated at the wrong end of
the frame.
Noticed by tj@. ok patrick@.
|
LOAD_STACK_GUARD pseudo without consulting the value of useLoadStackGuardNode(),
and then tries to add the return from getSDagStackGuard() as a parameter without
consulting the return from getIRStackGuard() to see if it should do that. This
means that the GlobalISel IRTranslator's implementation for
Intrinsic::stackprotector is broken for platforms that implement
getIRStackGuard() like we do, and this causes a segfault later when the
incomplete LOAD_STACK_GUARD pseudo is lowered in the back end.
Since GlobalISel is disabled on aarch64 most of the time anyway, add a bit that
disables it for OpenBSD/aarch64 all the time.
Fixes a crash when building on aarch64 without retguard, with a stack protector
and without optimizations, which manifests when building cross-tools.
ok patrick@ deraadt@
|
- In the N64 mode, properly load the whole immediate value
in the destination register even if the lower 32 bits are zero.
- Ensure correct alignment of memory operands.
- Fix the endianness of memory operands.
|
the MIPS64 mul instruction on pre-MIPS64 subtargets.
|
pieces of software that use the constraint if the compiler claims
to be compatible with GCC 4.2.1.
Note that the constraint was removed in GCC 4.4. The reason was that
'h' could generate code whose result is unpredictable. The underlying
reason is that the HI and LO registers are special, and the optimizer
has to be careful when choosing the order of HI/LO accesses. It looks
like LLVM has the needed logic.
|
as a memory operand, the assembler generates incorrect relocations in
PIC mode. As a simple fix, expand the instruction into an address load
sequence, which works, that is followed by the actual memory
instruction.
Note that the generated sequence is not always optimal. If the symbol
has a small offset, the offset could be fused with the memory
instruction. The fix does not achieve that, however. A symbol offset
adds an extra instruction.
|
Prepared with help from jsg@ and mortimer@
Tested on amd64 by bcallah@, krw@, naddy@
Tested on arm64 by patrick@
Tested on macppc by kettenis@
Tested on octeon by visa@
Tested on sparc64 by claudio@
|
It turns out MachineFrameInfo.hasCalls() is unreliable, because it is
up to the backends to update this information whenever they add calls
to a function, and this does not always happen.
ok kettenis@
|
genassym.sh on sparc64 when using clang as the compiler.
ok claudio@, deraadt@
|
the .text section in use after the file header, improving compatibility
with gcc. Without this change, module-level inline assembly blocks could
end up in the wrong section.
OK kettenis@ guenther@
|
disable it in the upcoming 6.5 release.
(phessler and mortimer have the details)
|
This adds more trap padding before the return while ensuring that the
return is still in the same cache line.
ok deraadt@
|
Makes things slightly faster and also improves security in these functions,
since the retguard cookie can't leak via the stack.
ok deraadt@
|
- Target all four kinds of return bytes (c2, c3, ca, cb)
- Fix up instructions using both ModR/M and SIB bytes
- Force alignment before instructions with return bytes in immediates
- Force alignment before instructions that have return bytes in their encoding
- Add a command line switch to toggle the functionality.
ok deraadt@
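
The four byte values listed above can be illustrated with a small scanner. This is only a sketch of what "return bytes" means in an instruction stream, not the compiler pass itself; is_return_byte, count_return_bytes and sample_code are hypothetical names. The sample encodes "mov %rax,%rbx" (48 89 c3, where c3 is the ModR/M byte) followed by a real ret (c3).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample: mov %rax,%rbx (48 89 c3) followed by ret (c3). */
const uint8_t sample_code[4] = { 0x48, 0x89, 0xc3, 0xc3 };

/* x86 return opcodes: c2 (ret imm16), c3 (ret),
 * ca (far ret imm16), cb (far ret). */
int is_return_byte(uint8_t b)
{
	return b == 0xc2 || b == 0xc3 || b == 0xca || b == 0xcb;
}

/* Count every byte that could be executed as a return if control
 * entered the stream at that offset, whether or not it was intended
 * as an opcode. */
size_t count_return_bytes(const uint8_t *code, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (is_return_byte(code[i]))
			n++;
	return n;
}
```

For sample_code the count is 2: one intended ret, and one c3 that is only a ModR/M byte but still usable as a return when jumped into directly.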
|
load and store instructions. The vast majority of PowerPC CPUs that
OpenBSD runs on don't implement those and will generate alignment
exceptions. While we do emulate lfd and stfd (to work around GCC bugs),
we don't emulate lfs and stfs. It is way more efficient to have the
compiler generate code that only uses aligned load and store instructions.
Based on a diff from Georg Koehler.
ok patrick@, visa@
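
For context, a source-level sketch of where unaligned floating point accesses typically come from: reading a float out of a byte buffer at an arbitrary offset. This is not the compiler change itself; the function names are hypothetical. Going through memcpy leaves it to the compiler to pick an access sequence that is legal for the target.

```c
#include <stddef.h>
#include <string.h>

/* Read a float from a possibly unaligned address. */
float load_float_unaligned(const unsigned char *p)
{
	float f;

	memcpy(&f, p, sizeof(f));
	return f;
}

/* Write a float to a possibly unaligned address. */
void store_float_unaligned(unsigned char *p, float f)
{
	memcpy(p, &f, sizeof(f));
}

/* Store f at byte offset off (0..3) in a scratch buffer and read it
 * back, so that odd offsets exercise the unaligned path. */
float roundtrip_at_offset(size_t off, float f)
{
	unsigned char buf[8] = { 0 };

	store_float_unaligned(buf + (off & 3), f);
	return load_float_unaligned(buf + (off & 3));
}
```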
|
to fix a regression in floating point operations. Bluhm noticed that
the bc regression test has been failing after the upgrade to 7.0.1
because setting the floating point control register was in some cases
reordered erroneously.
Found and tested by bluhm@
ok bluhm@ kettenis@
|
ok dlg@
|
and use a bool type for a boolean in C++.
ok kettenis@ deraadt@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this is a bit different to gcc as gcc likes to use movs to move
stuff on and off the stack, and directly updates the stack pointers
with add and sub instructions. llvm prefers to use push and pop
instructions, is a lot more careful about keeping track of how
much stuff is currently on the stack, and generally pops the frame
pointer rather than do maths on it.
-msave-args adds a bunch of pushes as the first thing a function
prologue does. to keep the stack aligned, if there's an odd number
of arguments to the function it pushes the first one again to put
the frame back on a 16 byte boundary.
to undo the pushes the frame pointer needs to be updated in function
epilogues. clang emits a series of pops to fix up the registers on
the way out, but popping saved arguments is a waste of time and
harmful to actual data in the function. rather than add an offset
to the stack pointer, -msave-args emits a leaveq operation to fix
up the frame again. leaveq is effectively mov rbp,rsp; pop rbp, and
is a single byte, meaning there's less potential for gadgets compared
to a direct add to rsp, or an explicit mov rbp,rsp.
the only thing missing compared to the gcc implementation is adding
the SUN_amd64_parmdump dwarf flag to affected functions. if someone
can tell me how to add that from the frame lowering code, let me
know.
when enabled in kernel builds again, this will provide useful
arguments in ddb stack traces again.
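
the alignment rule described above can be written down as a little arithmetic. this is a sketch that assumes 8-byte pushes and only models the argument pushes, nothing else in the frame; save_args_pushes and save_args_bytes are made-up names, not anything in the frame lowering code.

```c
/* Number of pushes for nargs register arguments: one per argument,
 * plus the first argument again when nargs is odd. */
unsigned save_args_pushes(unsigned nargs)
{
	return nargs + (nargs & 1);
}

/* Stack space used by those pushes, assuming 8 bytes per push.
 * The result is always a multiple of 16. */
unsigned save_args_bytes(unsigned nargs)
{
	return 8 * save_args_pushes(nargs);
}
```

for a function with 3 arguments this gives 4 pushes and 32 bytes; with 2 arguments it gives 2 pushes and 16 bytes.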
|
With fixes from mortimer@ (thanks!)
Tested by many, especially naddy@ (thanks!)
|
Without this, values get truncated to 32-bit. Makes a sparc64 kernel
actually work when compiled with clang.
ok pguenther@, visa@
|
explicitly in SMALL_KERNEL kernel builds.
tweaks from jsg@ and tb@
ok deraadt@ kettenis@
|
Upstream references:
https://reviews.llvm.org/D31557
https://reviews.llvm.org/D48515
OK kettenis@
|
ok deraadt@
|
fallthrough. Avoids unnecessary jmp instructions in the middle
of functions and makes disassembly nicer to read.
ok guenther@ mlarkin@ deraadt@
|
'.openbsd.randomdata.retguard', to make them easier to work with in the
kernel hibernate code.
ok mortimer@ deraadt@
|
Spotted by Nan Xiao.
|
random cookies to protect access to function return instructions, with the
effect that the integrity of the return address is protected, and function
return instructions are harder to use in ROP gadgets.
On function entry the return address is combined with a per-function random
cookie and stored in the stack frame. The integrity of this value is verified
before function return, and if this check fails, the program aborts. In this way
RETGUARD is an improved stack protector, since the cookies are per-function. The
verification routine is constructed such that the binary space immediately
before each ret instruction is padded with int3 instructions, which makes these
return instructions difficult to use in ROP gadgets. In the kernel, this has the
effect of removing approximately 50% of total ROP gadgets, and 15% of unique
ROP gadgets compared to the 6.3 release kernel. Function epilogues are
essentially gadget free, leaving only the polymorphic gadgets that result from
jumping into the instruction stream partway through other instructions. Work to
remove these gadgets will continue through other mechanisms.
Remaining work includes adding this mechanism to assembly routines, which must
be done by hand. Many thanks to all those who helped test and provide feedback,
especially deraadt, tb, espie and naddy.
ok deraadt@
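
The entry and exit checks described above can be modelled in plain C. This is a sketch only: it assumes XOR as the operation that combines the cookie with the return address (the text above only says "combined"), and retguard_prologue and retguard_epilogue_ok are hypothetical names. The real mechanism is emitted by the compiler as machine instructions, not as function calls.

```c
#include <stdint.h>

/* On function entry: combine the per-function cookie with the return
 * address. The result is what gets stored in the stack frame.
 * XOR is an assumption made for this sketch. */
uint64_t retguard_prologue(uint64_t cookie, uint64_t ret_addr)
{
	return cookie ^ ret_addr;
}

/* Before return: undo the combination using the return address as it
 * is now and compare against the cookie. Returns 1 when the check
 * passes and 0 when the program would abort. */
int retguard_epilogue_ok(uint64_t cookie, uint64_t saved,
    uint64_t ret_addr_now)
{
	return (saved ^ ret_addr_now) == cookie;
}
```

If the return address is unchanged between entry and exit the check passes. If either the return address or the saved value was overwritten without knowledge of the cookie, the comparison fails.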
|
friendly instructions with safe alternatives. This initial commit fixes
3 instruction forms that will lower to include a c3 (return) byte.
Additional problematic instructions can be fixed incrementally using
this framework.
ok deraadt@
|