author | 2020-07-27 20:59:20 +0200 | |
---|---|---|
committer | 2022-07-25 17:45:33 +0200 | |
commit | 234fdd2815ce8fe4da6782109580f3b166aeb97b (patch) | |
tree | 600b56ede9abee0462aa39bb5f0c05c95e8e8d57 /tools/perf/scripts/python/export-to-sqlite.py | |
parent | btrfs: replace kmap() with kmap_local_page() in lzo.c (diff) | |
btrfs: remove redundant check in check_setget_bounds
There are two separate checks in the bounds checker, the first one being
a special case of the second. As this function is performance critical
due to checking access to any eb member, reducing the size can slightly
improve performance.
On a release build on x86_64 the helper is completely inlined so the
function call overhead is also gone.
There was a report of a 5% performance drop on a metadata-heavy workload
that disappeared after disabling asserts. The most significant part of
that is the bounds checker.
https://lore.kernel.org/linux-btrfs/20200724164147.39925-1-josef@toxicpanda.com/
After the analysis, the optimized code removes the worst overhead, which
is the function call, and the performance was restored.
https://lore.kernel.org/linux-btrfs/20200730110943.GE3703@twin.jikos.cz/
1. baseline, asserts on, setget check on
run time: 46s
run time with perf: 48s
2. asserts on, comment out setget check
run time: 44s
run time with perf: 47s
So this confirms the 5% difference
3. asserts on, optimized setget check
run time: 44s
run time with perf: 47s
The optimizations are reducing the number of ifs to 1 and inlining the
hot path. Low-level stuff, gets the performance back. Patch below.
4. asserts off, no setget check
run time: 44s
run time with perf: 45s
This verifies that asserts other than the setget check have negligible
impact on performance and it's not harmful to keep them on.
Analysis where the performance is lost:
* check_setget_bounds is a short function, but it's still a function call,
changing the flow of instructions, and given how many times it's
called the overhead adds up
* there are two conditions: one checks whether the range is
completely outside (member_offset > eb->len), the other whether it is
only partially inside (member_offset + size > eb->len); the first is a
special case of the second
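The redundancy of the first condition can be illustrated with a minimal, self-contained sketch. This is not the kernel code: `struct eb_sketch`, `out_of_bounds_two_checks`, `out_of_bounds_one_check` and `count_mismatches` are hypothetical names used only for illustration, and the sketch assumes unsigned arithmetic in which `member_offset + size` does not wrap around.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the extent buffer; only the length matters. */
struct eb_sketch {
	unsigned long len;
};

/* Two conditions: range starts outside, or range ends outside. */
static bool out_of_bounds_two_checks(const struct eb_sketch *eb,
				     unsigned long member_offset,
				     unsigned long size)
{
	if (member_offset > eb->len)
		return true;
	if (member_offset + size > eb->len)
		return true;
	return false;
}

/*
 * One condition: since size is never negative, member_offset > eb->len
 * implies member_offset + size > eb->len (assuming no wraparound), so the
 * second test alone covers both cases. Marked inline so the check can be
 * folded into the caller.
 */
static inline bool out_of_bounds_one_check(const struct eb_sketch *eb,
					   unsigned long member_offset,
					   unsigned long size)
{
	return member_offset + size > eb->len;
}

/* Compare both variants over a small grid; returns the number of disagreements. */
static unsigned int count_mismatches(unsigned long len)
{
	struct eb_sketch eb = { .len = len };
	unsigned int mismatches = 0;

	for (unsigned long off = 0; off <= 2 * len; off++)
		for (unsigned long size = 0; size <= 8; size++)
			if (out_of_bounds_two_checks(&eb, off, size) !=
			    out_of_bounds_one_check(&eb, off, size))
				mismatches++;
	return mismatches;
}
```

Calling `count_mismatches()` with a small length compares the two variants over every offset and size in the grid, so a result of zero means the single comparison rejected exactly the same ranges as the pair of comparisons.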
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>