author | 2014-10-24 13:22:20 +0100 | |
---|---|---|
committer | 2014-11-06 17:25:28 +0000 | |
commit | 5284e1b4bc8ae6fcc1c92c63cf6c876a53292f82 (patch) | |
tree | 3aeddebd2c9ebbb59bf7e6eec64796c97c5c5c2c /tools/perf/scripts/python/call-graph-from-postgresql.py | |
parent | arm64: optimize memcpy_{from,to}io() and memset_io() (diff) | |
arm64: xchg: Implement cmpxchg_double
The arm64 architecture can exclusively load and store a pair of
registers at an address (ldxp/stxp). The SLUB allocator can also take
advantage of a cmpxchg_double implementation to avoid taking some
locks.
This patch provides an implementation of cmpxchg_double for 64-bit
pairs, and activates the logic required for the SLUB to use these
functions (HAVE_ALIGNED_STRUCT_PAGE and HAVE_CMPXCHG_DOUBLE).
The definitions of this_cpu_cmpxchg_8 and this_cpu_cmpxchg_double_8
are also wired up to cmpxchg_local and cmpxchg_double_local (rather
than the stock implementations, which perform non-atomic operations
with interrupts disabled), as they are used by the SLUB.
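
For readers unfamiliar with the primitive, the following is a minimal
sketch of the contract a cmpxchg_double-style operation provides, and of
how an allocator fast path might use it. It is a plain C reference model,
not the patch's code: it is NOT atomic (the real arm64 implementation
would rely on an ldxp/stxp retry loop), and the names pair64,
model_cmpxchg_double, object and model_freelist_pop are hypothetical,
introduced only for illustration. The 16-byte alignment check is an
assumption reflecting the usual requirement that the two words form an
aligned pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of adjacent 64-bit words, e.g. a freelist pointer
 * plus a sequence/counter word. Aligned so the pair starts on 16 bytes. */
struct pair64 {
    _Alignas(16) uint64_t first;
    uint64_t second;
};

/*
 * Reference model of the cmpxchg_double contract (NOT atomic):
 * if both words match the expected old values, store both new values
 * and return true; otherwise leave memory untouched and return false.
 */
static bool model_cmpxchg_double(uint64_t *p1, uint64_t *p2,
                                 uint64_t old1, uint64_t old2,
                                 uint64_t new1, uint64_t new2)
{
    assert(p2 == p1 + 1);                /* the two words must be adjacent */
    assert(((uintptr_t)p1 % 16) == 0);   /* assumed pair alignment */

    if (*p1 != old1 || *p2 != old2)
        return false;
    *p1 = new1;
    *p2 = new2;
    return true;
}

/* Hypothetical free object: its first word links to the next free one. */
struct object {
    struct object *next;
};

/*
 * Pop one object from the list head without a lock. The sequence word is
 * bumped on every successful pop, so an update based on a stale snapshot
 * of (pointer, sequence) fails the comparison and is retried.
 */
static struct object *model_freelist_pop(struct pair64 *head)
{
    for (;;) {
        uint64_t old_ptr = head->first;
        uint64_t old_seq = head->second;
        struct object *obj = (struct object *)(uintptr_t)old_ptr;

        if (obj == NULL)
            return NULL;                 /* list is empty */
        if (model_cmpxchg_double(&head->first, &head->second,
                                 old_ptr, old_seq,
                                 (uint64_t)(uintptr_t)obj->next,
                                 old_seq + 1))
            return obj;
    }
}
```

Updating the pointer and the sequence word in one step is what lets a
caller detect that the list changed between taking its snapshot and
attempting the update, which a single-word compare-and-exchange on the
pointer alone cannot do.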
On a Juno platform running on only the A57s, I get a noticeable
performance improvement (about an 11% lower mean run time) over 5 runs
of hackbench on v3.17:
         Baseline | With Patch
 -----------------+-----------
 Mean    119.2312 | 106.1782
 StdDev    0.4919 |   0.4494
(times taken to complete `./hackbench 100 process 1000', in seconds)
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Diffstat (limited to 'tools/perf/scripts/python/call-graph-from-postgresql.py')
0 files changed, 0 insertions, 0 deletions