aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/testing/radix-tree (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-04-03Merge tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdxLinus Torvalds1-0/+1
Pull SPDX updates from Greg KH: "Here are three SPDX patches for 5.7-rc1. One fixes up the SPDX tag for a single driver, while the other two go through the tree and add SPDX tags for all of the .gitignore files as needed. Nothing too complex, but you will get a merge conflict with your current tree, that should be trivial to handle (one file modified by two things, one file deleted.) All three of these have been in linux-next for a while, with no reported issues other than the merge conflict" * tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: ASoC: MT6660: make spdxcheck.py happy .gitignore: add SPDX License Identifier .gitignore: remove too obvious comments
2020-03-25.gitignore: add SPDX License IdentifierMasahiro Yamada1-0/+1
Add SPDX License Identifier to all .gitignore files. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-12xarray: Fix early termination of xas_for_each_markedMatthew Wilcox (Oracle)4-2/+91
xas_for_each_marked() is using entry == NULL as a termination condition of the iteration. When xas_for_each_marked() is used protected only by RCU, this can however race with xas_store(xas, NULL) in the following way: TASK1 TASK2 page_cache_delete() find_get_pages_range_tag() xas_for_each_marked() xas_find_marked() off = xas_find_chunk() xas_store(&xas, NULL) xas_init_marks(&xas); ... rcu_assign_pointer(*slot, NULL); entry = xa_entry(off); And thus xas_for_each_marked() terminates prematurely possibly leading to missed entries in the iteration (translating to missing writeback of some pages or a similar problem). If we find a NULL entry that has been marked, skip it (unless we're trying to allocate an entry). Reported-by: Jan Kara <jack@suse.cz> CC: stable@vger.kernel.org Fixes: ef8e5717db01 ("page cache: Convert delete_batch to XArray") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-02-27radix tree test suite: Support kmem_cache alignmentMatthew Wilcox (Oracle)2-15/+23
The radix tree doesn't use alignment, so the argument was ignored. The maple tree needs its nodes to be aligned, so we need to pay attention to the alignment argument. Also change the types of 'size' and 'align' to unsigned int to match commit f4957d5bd0916. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2019-07-08Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull RCU updates from Ingo Molnar: "The changes in this cycle are: - RCU flavor consolidation cleanups and optmizations - Documentation updates - Miscellaneous fixes - SRCU updates - RCU-sync flavor consolidation - Torture-test updates - Linux-kernel memory-consistency-model updates, most notably the addition of plain C-language accesses" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) tools/memory-model: Improve data-race detection tools/memory-model: Change definition of rcu-fence tools/memory-model: Expand definition of barrier tools/memory-model: Do not use "herd" to refer to "herd7" tools/memory-model: Fix comment in MP+poonceonces.litmus Documentation: atomic_t.txt: Explain ordering provided by smp_mb__{before,after}_atomic() rcu: Don't return a value from rcu_assign_pointer() rcu: Force inlining of rcu_read_lock() rcu: Fix irritating whitespace error in rcu_assign_pointer() rcu: Upgrade sync_exp_work_done() to smp_mb() rcutorture: Upper case solves the case of the vanishing NULL pointer torture: Suppress propagating trace_printk() warning rcutorture: Dump trace buffer for callback pipe drain failures torture: Add --trust-make to suppress "make clean" torture: Make --cpus override idleness calculations torture: Run kernel build in source directory torture: Add function graph-tracing cheat sheet torture: Capture qemu output rcutorture: Tweak kvm options rcutorture: Add trivial RCU implementation ...
2019-06-29Merge tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds1-0/+46
Pull XArray fixes from Matthew Wilcox: - Account XArray nodes for the page cache to the appropriate cgroup (Johannes Weiner) - Fix idr_get_next() when called under the RCU lock (Matthew Wilcox) - Add a test for xa_insert() (Matthew Wilcox) * tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-dax: XArray tests: Add check_insert idr: Fix idr_get_next race with idr_remove mm: fix page cache convergence regression
2019-06-28Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcuIngo Molnar1-1/+1
Pull rcu/next + tools/memory-model changes from Paul E. McKenney: - RCU flavor consolidation cleanups and optmizations - Documentation updates - Miscellaneous fixes - SRCU updates - RCU-sync flavor consolidation - Torture-test updates - Linux-kernel memory-consistency-model updates, most notably the addition of plain C-language accesses Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-13rcu: Don't return a value from rcu_assign_pointer()Andrea Parri1-1/+1
Quoting Paul [1]: "Given that a quick (and perhaps error-prone) search of the uses of rcu_assign_pointer() in v5.1 didn't find a single use of the return value, let's please instead change the documentation and implementation to eliminate the return value." [1] https://lkml.kernel.org/r/20190523135013.GL28207@linux.ibm.com Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: rcu@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288Thomas Gleixner4-36/+4
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 263 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-02idr: Fix idr_get_next race with idr_removeMatthew Wilcox (Oracle)1-0/+46
If the entry is deleted from the IDR between the call to radix_tree_iter_find() and rcu_dereference_raw(), idr_get_next() will return NULL, which will end the iteration prematurely. We should instead continue to the next entry in the IDR. This only happens if the iteration is protected by the RCU lock. Most IDR users use a spinlock or semaphore to exclude simultaneous modifications. It was noticed once the PID allocator was converted to use the IDR, as it uses the RCU lock, but there may be other users elsewhere in the kernel. We can't use the normal pattern of calling radix_tree_deref_retry() (which catches both a retry entry in a leaf node and a node entry in the root) as the IDR supports storing entries which are unaligned, which will trigger an infinite loop if they are encountered. Instead, we have to explicitly check whether the entry is a retry entry. Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Reported-by: Brendan Gregg <bgregg@netflix.com> Tested-by: Brendan Gregg <bgregg@netflix.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2018-12-06radix tree: Don't return retry entries from lookupMatthew Wilcox4-0/+82
Commit 66ee620f06f9 ("idr: Permit any valid kernel pointer to be stored") changed the radix tree lookup so that it stops when reaching the bottom of the tree. However, the condition was added in the wrong place, making it possible to return retry entries to the caller. Reorder the tests to check for the retry entry before checking whether we're at the bottom of the tree. The retry entry should never be found in the tree root, so it's safe to defer the check until the end of the loop. Add a regression test to the test-suite to be sure this doesn't come back. Fixes: 66ee620f06f9 ("idr: Permit any valid kernel pointer to be stored") Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree: Remove multiorder supportMatthew Wilcox1-1/+0
All users have now been converted to the XArray. Removing the support reduces code size and ensures new users will use the XArray instead. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree test: Convert multiorder tests to XArrayMatthew Wilcox1-56/+49
This is the last remaining user of the multiorder functionality of the radix tree. Test the XArray instead. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree tests: Convert item_delete_rcu to XArrayMatthew Wilcox2-3/+3
In preparation for the removal of the multiorder radix tree code, convert item_delete_rcu() to use the XArray so it can still be called for XArrays containing multi-index entries. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree tests: Convert item_kill_tree to XArrayMatthew Wilcox1-21/+9
In preparation for the removal of the multiorder radix tree code, convert item_kill_tree() to use the XArray so it can still be called for XArrays containing multi-index entries. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree tests: Move item_insert_orderMatthew Wilcox3-11/+22
The remaining tests are not suitable for moving in-kernel, so move item_insert_order() into multiorder.c, make it static and make it use the XArray. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree test suite: Remove multiorder benchmarkingMatthew Wilcox1-29/+21
The multiorder radix tree code is being removed, so remove the benchmarking of its performance. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree test suite: Remove __item_insertMatthew Wilcox2-7/+1
Inline it into its one caller Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21xarray: Move multiorder_check to in-kernel testsMatthew Wilcox1-48/+0
This version is a little less thorough in order to be a little quicker, but tests the important edge cases. Also test adding a multiorder entry at a non-canonical index, and erasing it. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21xarray: Move multiorder_shrink to kernel testsMatthew Wilcox1-173/+0
Test this functionality inside the kernel as well as in userspace. Also remove insert_bug() as there's no comparable thing to test in the XArray code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21xarray: Move multiorder account test in-kernelMatthew Wilcox1-24/+0
Move this test to the in-kernel test suite, and enhance it to test several different orders. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree test suite: Convert iteration test to XArrayMatthew Wilcox3-58/+63
With no code left in the kernel using the multiorder radix tree, convert the iteration test from the radix tree to the XArray. It's unlikely to suffer the same bug as the radix tree, but this test will prevent that bug from ever creeping into the XArray implementation. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree test suite: Convert tag_tagged_items to XArrayMatthew Wilcox8-57/+46
The tag_tagged_items() function is supposed to test the page-writeback tagging code. Since that has been converted to the XArray, there's not much point in testing the radix tree's tagging code. This requires using the pthread mutex embedded in the xarray instead of an external lock, so remove the pthread mutexes which protect xarrays/radix trees. Also remove radix_tree_iter_tag_set() as this was the last user. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree: Remove radix_tree_clear_tagsMatthew Wilcox1-29/+0
The page cache was the only user of this interface and it has now been converted to the XArray. Transform the test into a test of xas_init_marks(). Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree: Remove split/join codeMatthew Wilcox2-338/+0
radix_tree_split and radix_tree_join were never used upstream. Remove them; if they're needed in future they will be replaced by XArray equivalents. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree: Remove radix_tree_update_node_tMatthew Wilcox1-1/+1
The only user of this functionality was the workingset code, and it's now been converted to the XArray. Remove __radix_tree_delete_node() entirely as it was also only used by the workingset code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Convert find_swap_entry to XArrayMatthew Wilcox3-84/+0
This is a 1:1 conversion. The major part of this patch is converting the test framework from userspace to kernel space and mirroring the algorithm now used in find_swap_entry(). Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21radix tree test suite: Convert regression1 to XArrayMatthew Wilcox1-39/+19
Now the page cache lookup is using the XArray, let's convert this regression test from the radix tree API to the XArray so it's testing roughly the same thing it was testing before. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert find_get_pages_contig to XArrayMatthew Wilcox1-23/+0
There's no direct replacement for radix_tree_for_each_contig() in the XArray API as it's an unusual thing to do. Instead, open-code a loop using xas_next(). This removes the only user of radix_tree_for_each_contig() so delete the iterator from the API and the test suite code for it. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21ida: Convert to XArrayMatthew Wilcox1-5/+3
Use the XA_TRACK_FREE ability to track which entries have a free bit, similarly to how it uses the radix tree's IDR_FREE tag. This eliminates the per-cpu ida_bitmap preload, and fixes the memory consumption regression I introduced when making the IDR able to store any pointer. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21xarray: Add XArray unconditional store operationsMatthew Wilcox4-1/+39
xa_store() differs from radix_tree_insert() in that it will overwrite an existing element in the array rather than returning an error. This is the behaviour which most users want, and those that want more complex behaviour generally want to use the xas family of routines anyway. For memory allocation, xa_store() will first attempt to request memory from the slab allocator; if memory is not immediately available, it will drop the xa_lock and allocate memory, keeping a pointer in the xa_state. It does not use the per-CPU cache, although those will continue to exist until all radix tree users are converted to the xarray. This patch also includes xa_erase() and __xa_erase() for a streamlined way to store NULL. Since there is no need to allocate memory in order to store a NULL in the XArray, we do not need to trouble the user with deciding what memory allocation flags to use. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21xarray: Add XArray load operationMatthew Wilcox7-2/+38
The xa_load function brings with it a lot of infrastructure; xa_empty(), xa_is_err(), and large chunks of the XArray advanced API that are used to implement xa_load. As the test-suite demonstrates, it is possible to use the XArray functions on a radix tree. The radix tree functions depend on the GFP flags being stored in the root of the tree, so it's not possible to use the radix tree functions on an XArray. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21xarray: Define struct xa_nodeMatthew Wilcox1-15/+15
This is a direct replacement for struct radix_tree_node. A couple of struct members have changed name, so convert those. Use a #define so that radix tree users continue to work without change. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-10-21xarray: Add definition of struct xarrayMatthew Wilcox6-7/+19
This is a direct replacement for struct radix_tree_root. Some of the struct members have changed name; convert those, and use a #define so that radix_tree users continue to work without change. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29xarray: Change definition of sibling entriesMatthew Wilcox2-1/+2
Instead of storing a pointer to the slot containing the canonical entry, store the offset of the slot. Produces slightly more efficient code (~300 bytes) and simplifies the implementation. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29xarray: Replace exceptional entriesMatthew Wilcox4-29/+27
Introduce xarray value entries and tagged pointers to replace radix tree exceptional entries. This is a slight change in encoding to allow the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a value entry). It is also a change in emphasis; exceptional entries are intimidating and different. As the comment explains, you can choose to store values or pointers in the xarray and they are both first-class citizens. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29idr: Permit any valid kernel pointer to be storedMatthew Wilcox1-0/+63
An upcoming change to the encoding of internal entries will set the bottom two bits to 0b10. Unfortunately, m68k only aligns some data structures to 2 bytes, so the IDR will interpret them as internal entries and things will go badly wrong. Change the radix tree so that it stops either when the node indicates that it's the bottom of the tree (shift == 0) or when the entry is not an internal entry. This means we cannot insert an arbitrary kernel pointer as a multiorder entry, but the IDR does not permit multiorder entries. Annoyingly, this means the IDR can no longer take advantage of the radix tree's ability to store a single entry at offset 0 without allocating memory. A pointer which is 2-byte aligned cannot be stored directly in the root as it would be indistinguishable from a node, so we must allocate a node in order to store a 2-byte pointer at index 0. The idr_replace() function does not take a GFP flags argument, so cannot allocate memory. If a user inserts a 4-byte aligned pointer at index 0 and then replaces it with a 2-byte aligned pointer, we must be able to store it. Arbitrary pointer values are still not permitted; pointers of the form 2 + (i * 4) for values of i between 0 and 1023 are reserved for the implementation. These are not valid kernel pointers as they would point into the zero page. This change does cause a runtime memory consumption regression for the IDA. I will recover that later. Signed-off-by: Matthew Wilcox <willy@infradead.org> Tested-by: Guenter Roeck <linux@roeck-us.net>
2018-08-21test_ida: check_ida_destroy and check_ida_allocMatthew Wilcox1-66/+4
Move these tests from the userspace test-suite to the kernel test-suite. Also convert check_ida_random to the new API. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21test_ida: Convert check_ida_conv to new APIMatthew Wilcox1-46/+10
Move as much as possible to kernel space; leave the parts in user space that rely on checking memory allocation failures to detect the transition between an exceptional entry and a bitmap. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21test_ida: Move ida_check_maxMatthew Wilcox1-28/+0
Convert to new API and move to kernel space. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21test_ida: Move ida_check_leafMatthew Wilcox1-27/+0
Convert to new API and move to kernel space. Take the opportunity to test the situation a little more thoroughly (ie at different offsets). Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21idr-test: Convert ida_check_nomem to new APIMatthew Wilcox1-6/+7
We can't move this test to kernel space because there's no way to force kmalloc to fail. But we can use the new API and check this works when the test is in userspace. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21ida: Start new test_ida moduleMatthew Wilcox4-7/+22
Start transitioning the IDA tests into kernel space. Framework heavily cribbed from test_xarray.c. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21radix tree test suite: Enable ubsanMatthew Wilcox2-11/+14
Add support for the undefined behaviour sanitizer and fix the bugs that ubsan pointed out. Nothing major, and all in the test suite, not the code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21radix tree test suite: Fix compilationMatthew Wilcox1-0/+2
An include of xarray.h was added to lib/idr.c without updating the test suite. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-05-25idr: fix invalid ptr dereference on item deleteMatthew Wilcox1-0/+7
If the radix tree underlying the IDR happens to be full and we attempt to remove an id which is larger than any id in the IDR, we will call __radix_tree_delete() with an uninitialised 'slot' pointer, at which point anything could happen. This was easiest to hit with a single entry at id 0 and attempting to remove a non-0 id, but it could have happened with 64 entries and attempting to remove an id >= 64. Roman said: The syzcaller test boils down to opening /dev/kvm, creating an eventfd, and calling a couple of KVM ioctls. None of this requires superuser. And the result is dereferencing an uninitialized pointer which is likely a crash. The specific path caught by syzbot is via KVM_HYPERV_EVENTD ioctl which is new in 4.17. But I guess there are other user-triggerable paths, so cc:stable is probably justified. Matthew added: We have around 250 calls to idr_remove() in the kernel today. Many of them pass an ID which is embedded in the object they're removing, so they're safe. Picking a few likely candidates: drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl. drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar drivers/atm/nicstar.c could be taken down by a handcrafted packet Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Reported-by: <syzbot+35666cba7f0a337e2e79@syzkaller.appspotmail.com> Debugged-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18radix tree test suite: multi-order iteration raceRoss Zwisler2-0/+64
Add a test which shows a race in the multi-order iteration code. This test reliably hits the race in under a second on my machine, and is the result of a real bug report against kernel a production v4.15 based kernel (4.15.6-300.fc27.x86_64). With a real kernel this issue is hit when using order 9 PMD DAX radix tree entries. The race has to do with how we tear down multi-order sibling entries when we are removing an item from the tree. Remember that an order 2 entry looks like this: struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling] where 'entry' is in some slot in the struct radix_tree_node, and the three slots following 'entry' contain sibling pointers which point back to 'entry.' When we delete 'entry' from the tree, we call : radix_tree_delete() radix_tree_delete_item() __radix_tree_delete() replace_slot() replace_slot() first removes the siblings in order from the first to the last, then at then replaces 'entry' with NULL. This means that for a brief period of time we end up with one or more of the siblings removed, so: struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling] This causes an issue if you have a reader iterating over the slots in the tree via radix_tree_for_each_slot() while only under rcu_read_lock()/rcu_read_unlock() protection. This is a common case in mm/filemap.c. The issue is that when __radix_tree_next_slot() => skip_siblings() tries to skip over the sibling entries in the slots, it currently does so with an exact match on the slot directly preceding our current slot. Normally this works: V preceding slot struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling] ^ current slot This lets you find the first sibling, and you skip them all in order. But in the case where one of the siblings is NULL, that slot is skipped and then our sibling detection is interrupted: V preceding slot struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling] ^ current slot This means that the sibling pointers aren't recognized since they point all the way back to 'entry', so we think that they are normal internal radix tree pointers. This causes us to think we need to walk down to a struct radix_tree_node starting at the address of 'entry'. In a real running kernel this will crash the thread with a GP fault when you try and dereference the slots in your broken node starting at 'entry'. In the radix tree test suite this will be caught by the address sanitizer: ==27063==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60c0008ae400 at pc 0x00000040ce4f bp 0x7fa89b8fcad0 sp 0x7fa89b8fcac0 READ of size 8 at 0x60c0008ae400 thread T3 #0 0x40ce4e in __radix_tree_next_slot /home/rzwisler/project/linux/tools/testing/radix-tree/radix-tree.c:1660 #1 0x4022cc in radix_tree_next_slot linux/../../../../include/linux/radix-tree.h:567 #2 0x4022cc in iterator_func /home/rzwisler/project/linux/tools/testing/radix-tree/multiorder.c:655 #3 0x7fa8a088d50a in start_thread (/lib64/libpthread.so.0+0x750a) #4 0x7fa8a03bd16e in clone (/lib64/libc.so.6+0xf516e) Link: http://lkml.kernel.org/r/20180503192430.7582-5-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: CR, Sapthagirish <sapthagirish.cr@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18radix tree test suite: add item_delete_rcu()Ross Zwisler2-0/+21
Currently the lifetime of "struct item" entries in the radix tree are not controlled by RCU, but are instead deleted inline as they are removed from the tree. In the following patches we add a test which has threads iterating over items pulled from the tree and verifying them in an rcu_read_lock()/rcu_read_unlock() section. This means that though an item has been removed from the tree it could still be being worked on by other threads until the RCU grace period expires. So, we need to actually free the "struct item" structures at the end of the grace period, just as we do with "struct radix_tree_node" items. Link: http://lkml.kernel.org/r/20180503192430.7582-4-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: CR, Sapthagirish <sapthagirish.cr@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18radix tree test suite: fix mapshift build targetRoss Zwisler1-4/+2
Commit c6ce3e2fe3da ("radix tree test suite: Add config option for map shift") introduced a phony makefile target called 'mapshift' that ends up generating the file generated/map-shift.h. This phony target was then added as a dependency of the top level 'targets' build target, which is what is run when you go to tools/testing/radix-tree and just type 'make'. Unfortunately, this phony target doesn't actually work as a dependency, so you end up getting: $ make make: *** No rule to make target 'generated/map-shift.h', needed by 'main.o'. Stop. make: *** Waiting for unfinished jobs.... Fix this by making the file generated/map-shift.h our real makefile target, and add this a dependency of the top level build target. Link: http://lkml.kernel.org/r/20180503192430.7582-2-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: CR, Sapthagirish <sapthagirish.cr@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11radix tree: use GFP_ZONEMASK bits of gfp_t for flagsMatthew Wilcox1-0/+1
Patch series "XArray", v9. (First part thereof). This patchset is, I believe, appropriate for merging for 4.17. It contains the XArray implementation, to eventually replace the radix tree, and converts the page cache to use it. This conversion keeps the radix tree and XArray data structures in sync at all times. That allows us to convert the page cache one function at a time and should allow for easier bisection. Other than renaming some elements of the structures, the data structures are fundamentally unchanged; a radix tree walk and an XArray walk will touch the same number of cachelines. I have changes planned to the XArray data structure, but those will happen in future patches. Improvements the XArray has over the radix tree: - The radix tree provides operations like other trees do; 'insert' and 'delete'. But what most users really want is an automatically resizing array, and so it makes more sense to give users an API that is like an array -- 'load' and 'store'. We still have an 'insert' operation for users that really want that semantic. - The XArray considers locking as part of its API. This simplifies a lot of users who formerly had to manage their own locking just for the radix tree. It also improves code generation as we can now tell RCU that we're holding a lock and it doesn't need to generate as much fencing code. The other advantage is that tree nodes can be moved (not yet implemented). - GFP flags are now parameters to calls which may need to allocate memory. The radix tree forced users to decide what the allocation flags would be at creation time. It's much clearer to specify them at allocation time. - Memory is not preloaded; we don't tie up dozens of pages on the off chance that the slab allocator fails. Instead, we drop the lock, allocate a new node and retry the operation. We have to convert all the radix tree, IDA and IDR preload users before we can realise this benefit, but I have not yet found a user which cannot be converted. - The XArray provides a cmpxchg operation. The radix tree forces users to roll their own (and at least four have). - Iterators take a 'max' parameter. That simplifies many users and will reduce the amount of iteration done. - Iteration can proceed backwards. We only have one user for this, but since it's called as part of the pagefault readahead algorithm, that seemed worth mentioning. - RCU-protected pointers are not exposed as part of the API. There are some fun bugs where the page cache forgets to use rcu_dereference() in the current codebase. - Value entries gain an extra bit compared to radix tree exceptional entries. That gives us the extra bit we need to put huge page swap entries in the page cache. - Some iterators now take a 'filter' argument instead of having separate iterators for tagged/untagged iterations. The page cache is improved by this: - Shorter, easier to read code - More efficient iterations - Reduction in size of struct address_space - Fewer walks from the top of the data structure; the XArray API encourages staying at the leaf node and conducting operations there. This patch (of 8): None of these bits may be used for slab allocations, so we can use them as radix tree flags as long as we mask them off before passing them to the slab allocator. Move the IDR flag from the high bits to the GFP_ZONEMASK bits. Link: http://lkml.kernel.org/r/20180313132639.17387-3-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Jeff Layton <jlayton@kernel.org> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>