author Ard Biesheuvel <ard.biesheuvel@linaro.org> 2018-09-10 16:41:15 +0200
committer Herbert Xu <herbert@gondor.apana.org.au> 2018-09-21 13:24:50 +0800
commit 2e5d2f33d1dbd551f14634483c7ea81e5119d689
tree ea7b3f03c3af0d1e818d588ac2a8e99c391b0826 /arch/arm64/crypto/aes-neon.S
parent crypto: arm64/aes-blk - add support for CTS-CBC mode
crypto: arm64/aes-blk - improve XTS mask handling

The Crypto Extension instantiation of the aes-modes.S collection of
skciphers uses only 15 NEON registers for the round key array, whereas
the pure NEON flavor uses 16 NEON registers for the AES S-box.

This means we have a spare register available that we can use to hold
the XTS mask vector, removing the need to reload it at every iteration
of the inner loop.

Since the pure NEON version does not permit this optimization, tweak
the macros so we can factor out this functionality. Also, replace the
literal load with a short sequence to compose the mask vector.

On Cortex-A53, this results in a ~4% speedup.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
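
Editorial note, not part of the commit: the role of the XTS mask vector can be illustrated with a scalar C model of the per-block tweak update (multiplication by x in GF(2^128)). This is only a sketch under stated assumptions. The names `tweak128`, `xts_mask` and `next_tweak` are hypothetical, and the mask is assumed to hold 0x1 in the low 64-bit lane and 0x87 in the high lane, following the usual XTS reduction polynomial; the hunk shown here does not spell those values out.

```c
#include <stdint.h>

/* Scalar model of a 128-bit XTS tweak as two 64-bit lanes (lo = bits 0..63). */
struct tweak128 {
	uint64_t lo;
	uint64_t hi;
};

/*
 * Assumed mask contents (hypothetical stand-in for the mask register):
 * lane 0 carries bit 63 of the low lane into the high lane,
 * lane 1 folds the bit shifted out of the high lane back in as 0x87.
 */
static const struct tweak128 xts_mask = { 0x1, 0x87 };

/* Multiply the tweak by x in GF(2^128), using a mask kept by the caller. */
static struct tweak128 next_tweak(struct tweak128 in, struct tweak128 mask)
{
	/* All-ones if the lane's top bit is set, otherwise zero. */
	uint64_t sign_lo = (in.lo >> 63) ? ~UINT64_C(0) : 0;
	uint64_t sign_hi = (in.hi >> 63) ? ~UINT64_C(0) : 0;

	/* Select the correction each lane contributes. */
	uint64_t carry  = sign_lo & mask.lo;	/* goes into the high lane */
	uint64_t reduce = sign_hi & mask.hi;	/* goes into the low lane  */

	struct tweak128 out;

	/* Double each lane independently, then apply the swapped corrections. */
	out.lo = (in.lo << 1) ^ reduce;
	out.hi = (in.hi << 1) ^ carry;
	return out;
}
```

In this model the mask is an input to every tweak update, so a loop that can keep it resident (here, by passing the same `xts_mask` on each call) avoids rebuilding or reloading it per block, which is the saving the commit message describes for the flavor that has a spare register.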
Diffstat (limited to 'arch/arm64/crypto/aes-neon.S')
-rw-r--r-- arch/arm64/crypto/aes-neon.S 6
1 files changed, 6 insertions, 0 deletions
diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S
index 1c7b45b7268e..29100f692e8a 100644
--- a/arch/arm64/crypto/aes-neon.S
+++ b/arch/arm64/crypto/aes-neon.S
@@ -14,6 +14,12 @@
 #define AES_ENTRY(func)		ENTRY(neon_ ## func)
 #define AES_ENDPROC(func)	ENDPROC(neon_ ## func)
 
+	xtsmask		.req	v7
+
+	.macro		xts_reload_mask, tmp
+	xts_load_mask	\tmp
+	.endm
+
 	/* multiply by polynomial 'x' in GF(2^8) */
 	.macro		mul_by_x, out, in, temp, const
 	sshr		\temp, \in, #7