author | 2018-08-07 10:00:19 -0700 | |
---|---|---|
committer | 2019-03-22 00:50:53 -0600 | |
commit | f54839ddad64ad23a3076dd8c9084fd34c039057 (patch) | |
tree | 94b6a319c8422597c9c8b59515c52ab1e7b67fbd /arch/parisc/include/asm/Kbuild | |
parent | Merge branch 'Refactor-flower-classifier-to-remove-dependency-on-rtnl-lock' (diff) | |
asm: simd context helper API
Sometimes it's useful to amortize calls to XSAVE/XRSTOR and the related
FPU/SIMD functions over a number of calls, because FPU restoration is
quite expensive. This adds a simple header for carrying out this pattern:

	simd_context_t simd_context;

	simd_get(&simd_context);
	while ((item = get_item_from_queue()) != NULL) {
		encrypt_item(item, &simd_context);
		simd_relax(&simd_context);
	}
	simd_put(&simd_context);

The relaxation step ensures that we don't trample over preemption, and
the get/put API should be a familiar paradigm in the kernel.
On the other end, code that actually wants to use SIMD instructions can
accept this as a parameter and check it via:

	void encrypt_item(struct item *item, simd_context_t *simd_context)
	{
		if (item->len > LARGE_FOR_SIMD && simd_use(simd_context))
			wild_simd_code(item);
		else
			boring_scalar_code(item);
	}

The actual XSAVE happens during simd_use (and only on the first time),
so that if the context is never actually used, no performance penalty is
hit.
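
The lazy-save behaviour described above can be illustrated with a small user-space mock. This is only a sketch under stated assumptions, not the header this commit adds: the flag names, the mock_fpu_begin()/mock_fpu_end() counters, the resched_pending flag, the LARGE_FOR_SIMD value and the process_items() driver are all hypothetical stand-ins, chosen so the get/use/relax/put state transitions can be observed without a kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel primitives; they only count calls. */
static int fpu_begin_calls;   /* how often the expensive state save would run */
static int fpu_end_calls;     /* how often the matching restore would run */
static bool resched_pending;  /* mock of "the scheduler wants the CPU back" */

static void mock_fpu_begin(void) { fpu_begin_calls++; }
static void mock_fpu_end(void)   { fpu_end_calls++; }

/* Assumed representation: a small flag word. The real type may differ. */
typedef unsigned int simd_context_t;
enum { SIMD_AVAILABLE = 1u << 0, SIMD_IN_USE = 1u << 1 };

#define LARGE_FOR_SIMD 1024   /* hypothetical threshold */

static void simd_get(simd_context_t *ctx)
{
	/* A real version would check here whether SIMD is usable at all. */
	*ctx = SIMD_AVAILABLE;
}

static bool simd_use(simd_context_t *ctx)
{
	if (!(*ctx & SIMD_AVAILABLE))
		return false;
	if (!(*ctx & SIMD_IN_USE)) {
		mock_fpu_begin();     /* the save happens on first use only */
		*ctx |= SIMD_IN_USE;
	}
	return true;
}

static void simd_put(simd_context_t *ctx)
{
	if (*ctx & SIMD_IN_USE)
		mock_fpu_end();       /* nothing to undo if never used */
	*ctx = 0;
}

static bool simd_relax(simd_context_t *ctx)
{
	/* Release the SIMD unit only if it is held and someone is waiting. */
	if ((*ctx & SIMD_IN_USE) && resched_pending) {
		simd_put(ctx);
		resched_pending = false;  /* stands in for actually rescheduling */
		simd_get(ctx);
		return true;
	}
	return false;
}

/* Driver mirroring the loop in the commit message; returns the number of
 * items that took the SIMD path. */
static int process_items(const size_t *lens, size_t n)
{
	simd_context_t ctx;
	int simd_items = 0;

	simd_get(&ctx);
	for (size_t i = 0; i < n; i++) {
		if (lens[i] > LARGE_FOR_SIMD && simd_use(&ctx))
			simd_items++;     /* wild_simd_code() would run here */
		/* else: boring_scalar_code() would run here */
		simd_relax(&ctx);
	}
	simd_put(&ctx);
	return simd_items;
}
```

In this mock, a batch containing only small items never calls mock_fpu_begin(), and a batch with several large items calls it once unless a relax step releases the context in between, in which case the next large item triggers one more save.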
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: linux-arch@vger.kernel.org
Diffstat (limited to '')
-rw-r--r-- | arch/parisc/include/asm/Kbuild | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/arch/parisc/include/asm/Kbuild b/arch/parisc/include/asm/Kbuild
index 0b1e354c8c24..087fb8b05e5e 100644
--- a/arch/parisc/include/asm/Kbuild
+++ b/arch/parisc/include/asm/Kbuild
@@ -20,6 +20,7 @@ generic-y += percpu.h
 generic-y += preempt.h
 generic-y += seccomp.h
 generic-y += segment.h
+generic-y += simd.h
 generic-y += topology.h
 generic-y += trace_clock.h
 generic-y += user.h