| | Commit message | Author | Date | Files | Lines (-removed/+added) | |
---|---|---|---|---|---|---|
... | ||||||
* | curve25519: use precomp implementation instead of sandy2x | Jason A. Donenfeld | 2018-03-09 | 3 | -3437/+2070 | |
| | It's faster and doesn't use the FPU. | | | | | |
* | crypto: read only after init | Jason A. Donenfeld | 2018-03-02 | 4 | -10/+11 | |
* | blake2s: use union instead of casting | Jason A. Donenfeld | 2018-02-14 | 1 | -18/+16 | |
| | This deals with alignment more easily and also helps squelch a clang-analyzer warning. | | | | | |
* | curve25519: replace fiat64 with faster hacl64 | Jason A. Donenfeld | 2018-02-01 | 3 | -470/+883 | |
| | This reverts commit da4ff396cc5d5e0ff21f9ecbc2f951c048c63fff and adds some optimizations to hacl64. | | | | | |
* | curve25519: replace hacl64 with fiat64 | Jason A. Donenfeld | 2018-02-01 | 3 | -871/+470 | |
| | For now, it's faster (hacl64: 109782 cycles per call; fiat64: 108984 cycles per call). It's quite possible this commit will be reverted with nice changes from INRIA, though. | | | | | |
* | chacha20poly1305: better buffer alignment | Jason A. Donenfeld | 2018-01-30 | 1 | -9/+8 | |
* | chacha20poly1305: use existing rol32 function | Jason A. Donenfeld | 2018-01-30 | 1 | -9/+4 | |
* | poly1305: add poly-specific self-tests | Jason A. Donenfeld | 2018-01-19 | 2 | -0/+2 | |
* | curve25519-fiat32: uninline certain functions | Jason A. Donenfeld | 2018-01-18 | 1 | -4/+4 | |
| | While this has a negative performance impact on x86_64, it has a positive performance impact on smaller machines, which is where we're actually using this code. For example, on an A53, fiat32 takes 228605 cycles per call before and 188307 cycles per call after. | | | | | |
* | curve25519: wire up new impls and remove donna | Jason A. Donenfeld | 2018-01-18 | 3 | -1454/+3 | |
* | curve25519: resolve symbol clash between fe types | Jason A. Donenfeld | 2018-01-18 | 1 | -7/+7 | |
* | curve25519: import 64-bit hacl-star implementation | Jason A. Donenfeld | 2018-01-18 | 1 | -0/+739 | |
* | curve25519: import 32-bit fiat-crypto implementation | Jason A. Donenfeld | 2018-01-18 | 1 | -0/+838 | |
* | curve25519: modularize implementation | Jason A. Donenfeld | 2018-01-18 | 5 | -1610/+1640 | |
* | poly1305: remove indirect calls | Samuel Neves | 2018-01-18 | 1 | -79/+96 | |
| | Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | global: year bump | Jason A. Donenfeld | 2018-01-03 | 16 | -16/+16 | |
* | crypto: compile on UML | Jason A. Donenfeld | 2017-12-13 | 4 | -2/+8 | |
| | We basically just don't use the FPU in UML. | | | | | |
* | chacha20poly1305: wire up avx512vl for skylake-x | Jason A. Donenfeld | 2017-12-11 | 2 | -4/+17 | |
* | chacha20: avx512vl implementation | Samuel Neves | 2017-12-11 | 2 | -0/+571 | |
| | Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | poly1305: fix avx512f alignment bug | Samuel Neves | 2017-12-11 | 1 | -1/+1 | |
| | Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | chacha20poly1305: cleaner generic code | Jason A. Donenfeld | 2017-12-11 | 1 | -90/+49 | |
* | blake2s-x86_64: fix spacing | Jason A. Donenfeld | 2017-12-09 | 1 | -70/+70 | |
* | global: add SPDX tags to all files | Greg Kroah-Hartman | 2017-12-09 | 16 | -247/+57 | |
| | It's good to have SPDX identifiers in all files as the Linux kernel developers are working to add these identifiers to all files. Update all files with the correct SPDX license identifier based on the license text of the project or based on the license in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boilerplate text. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>; Modified-by: Jason A. Donenfeld <Jason@zx2c4.com> | | | | | |
* | chacha20-arm: fix with clang -fno-integrated-as. | David Benjamin | 2017-12-03 | 1 | -1/+3 | |
| | The __clang__-guarded #defines cause gas to complain if clang is passed -fno-integrated-as. Emitting .syntax unified when those are used fixes this. Reviewed-by: Andy Polyakov <appro@openssl.org>; Reviewed-by: Kurt Roeckx <kurt@roeckx.be> | | | | | |
* | poly1305: update x86-64 kernel to AVX512F only | Samuel Neves | 2017-12-03 | 2 | -138/+132 | |
| | Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | curve25519: explicitly depend on AS_AVX | Jason A. Donenfeld | 2017-11-28 | 1 | -3/+3 | |
* | curve25519: modularize dispatch | Jason A. Donenfeld | 2017-11-28 | 1 | -91/+82 | |
* | blake2s: tweak avx512 code | Samuel Neves | 2017-11-26 | 1 | -64/+47 | |
| | This is not as ideal as using zmm, but zmm downclocks. And it's not as fast single-threaded as using the gathers. But it is faster when multithreaded, which is what WireGuard is doing. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | chacha20: directly assign constant and initial state | Jason A. Donenfeld | 2017-11-23 | 1 | -59/+20 | |
* | blake2s: hmac space optimization | Samuel Neves | 2017-11-22 | 1 | -16/+12 | |
| | Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | blake2s: AVX512F+VL implementation | Samuel Neves | 2017-11-22 | 2 | -0/+132 | |
| | Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | poly1305-avx512: requires AVX512F+VL+BW | Samuel Neves | 2017-11-22 | 1 | -1/+6 | |
| | Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | chacha20poly1305: poly cleans up its own state | Jason A. Donenfeld | 2017-11-22 | 1 | -5/+1 | |
* | poly1305-x86_64: unclobber %rbp | Samuel Neves | 2017-11-22 | 1 | -131/+145 | |
| | OpenSSL's Poly1305 kernels use %rbp as a scratch register. However, the kernel expects %rbp to be a valid frame pointer at any given time in order to do proper unwinding. Thus we need to alter the code in order to preserve it. The most straightforward way to accomplish this was to replace $d3 in poly1305-x86_64.pl (formerly %r10) with %rdi, and to replace %rbp with %r10. Because %rdi, a pointer to the context structure, does not change and is not used by poly1305_iteration, it is safe to use it here, and the overhead of saving and restoring it should be minimal. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | poly1305: import MIPS64 primitive from OpenSSL | Jason A. Donenfeld | 2017-11-22 | 3 | -9/+401 | |
* | chacha20poly1305: import ARM primitives from OpenSSL | Jason A. Donenfeld | 2017-11-22 | 11 | -1025/+5513 | |
| | ARMv4-ARMv8, with NEON for ARMv7 and ARMv8. | | | | | |
* | chacha20poly1305: import x86_64 primitives from OpenSSL | Samuel Neves | 2017-11-22 | 9 | -2455/+5236 | |
| | x86_64 only at the moment. SSSE3, AVX, AVX2, AVX512. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> | | | | | |
* | curve25519-neon: compile in thumb mode | Jason A. Donenfeld | 2017-11-14 | 2 | -6/+6 | |
| | In Thumb mode, it's not possible to use sp as an operand of the and instruction, so we have to muck around with r3 as a scratch register. | | | | | |
* | curve25519: reject deriving from NULL private keys | Jason A. Donenfeld | 2017-11-11 | 1 | -0/+7 | |
| | These aren't actually valid 25519 points pre-normalization, and doing this is required to make unsetting private keys based on all zeros work. | | | | | |
* | receive: hoist fpu outside of receive loop | Jason A. Donenfeld | 2017-11-10 | 2 | -15/+13 | |
* | curve25519: only enable int128 if compiler support is sound | Jason A. Donenfeld | 2017-10-31 | 1 | -1/+1 | |
* | global: style nits | Jason A. Donenfeld | 2017-10-31 | 4 | -129/+198 | |
* | qemu: allow for cross compilation | Jason A. Donenfeld | 2017-10-31 | 1 | -3/+3 | |
* | crypto/avx: make sure we can actually use ymm registers | Jason A. Donenfeld | 2017-10-31 | 3 | -3/+3 | |
* | blake2: include headers for macros | Jason A. Donenfeld | 2017-10-31 | 1 | -0/+2 | |
* | blake2s: modernize API and have faster _final | Jason A. Donenfeld | 2017-10-17 | 2 | -48/+64 | |
* | crypto/x86_64: satisfy stack validation 2.0 | Jason A. Donenfeld | 2017-10-09 | 3 | -31/+29 | |
| | We change this to look like the code gcc generates, so as to keep the objtool checker somewhat happy. | | | | | |
* | global: use _WG prefix for include guards | Jason A. Donenfeld | 2017-10-03 | 3 | -9/+9 | |
| | Suggested-by: Sultan Alsawaf <sultanxda@gmail.com> | | | | | |
* | global: satisfy bitshift pedantry | Jason A. Donenfeld | 2017-10-03 | 1 | -7/+7 | |
| | Suggested-by: Sultan Alsawaf <sultanxda@gmail.com> | | | | | |
* | curve25519-neon-arm: force ARM encoding, since this is unrepresentable in Thumb | Jason A. Donenfeld | 2017-10-02 | 1 | -0/+1 | |