path: root/sys/net/art.c
Commit message | Author | Age | Files | Lines
* Document art locking. | mpi | 2020-11-12 | 1 | -2/+1
  ok denis@, jmatthew@
* Unbreak tree by removing the bits that were copied to art.h in r1.18 | tb | 2019-03-31 | 1 | -32/+1
  from here. reported by anton and otto
* Prevent a MP race in rtable_lookup(). | mpi | 2017-02-28 | 1 | -3/+1
  If an ART node is linked to multiple route entries, in the MPATH case, it is not safe to dereference ``an_dst''. This non-refcounted pointer can be changed at any time by another CPU. So get rid of the pointer and use the first destination of a route entry when comparing sockaddrs.
  This allows us to remove a pointer from 'struct art_node' and save 5Mb of memory in an IPv4 fullfeed.
  ok jmatthew@, claudio@, dlg@
* A space here, a space there. Soon we're talking real whitespace | krw | 2017-01-24 | 1 | -2/+2
  rectification.
* Make the art interface a bit more generic by not depending on sockaddr | claudio | 2017-01-23 | 1 | -6/+6
  in the functions. This way it can be used for other trees as well.
  OK mpi@ phessler@
* all pools have their ipl set via pool_setipl, so fold it into pool_init. | dlg | 2016-09-15 | 1 | -14/+9
  the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl.
  most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand.
  the manpage and subr_pool.c bits i did myself.
  ok tedu@ jmatthew@

  @ipl@
  expression pp;
  expression ipl;
  expression s, a, o, f, m, p;
  @@
  -pool_init(pp, s, a, o, f, m, p);
  -pool_setipl(pp, ipl);
  +pool_init(pp, s, a, ipl, f, m, p);
* use a per-table rwlock to serialize ART updates and walks, rather than | jmatthew | 2016-08-30 | 1 | -7/+8
  taking the kernel lock.
  ok mpi@ dlg@
* Revert use of the _SAFE version of SRPL_FOREACH() now that the offending | mpi | 2016-07-19 | 1 | -3/+3
  function has been fixed.
  Functions passed to rtable_walk() must return EAGAIN if they delete an entry from the tree, no matter if it is a leaf or not.
* Use the _SAFE_ version of SRPL_FOREACH() in rtable_walk_helper() to | mpi | 2016-07-04 | 1 | -3/+3
  prevent an off-by-one when removing entries from the mpath list.
  Fix a regression introduced by the refactoring needed to serialize rtable_walk() with create/delete.
  ok jca@
* rework art_walk so it will behave in an mpsafe world. | dlg | 2016-06-22 | 1 | -45/+71
  art_walk now explicitly takes the same lock used to serialise changes made via rtable_insert and _delete, so it can safely adjust the refcnts on tables while it recurses into them. they need to still exist when returning out of the recursion.
  it uses srps to access nodes and drops the lock before calling the callback function. this is because some callbacks sleep (eg, copyout in the sysctl code that dumps an rtable to userland), which you shouldnt hold a lock across. other callbacks attempt to modify the rtable (eg, marking routes as down when the interface theyre on goes down), which tries to take the lock again, which probably wont work in the future.
  ok jmatthew@ mpi@
* Convert the links between art data structures used during lookups into srps. | jmatthew | 2016-06-14 | 1 | -103/+170
  art_lookup and art_match now return an active srp_ref, which the caller must leave when it's done with the returned route (if any). This allows lookups to be done without holding any locks.
  The art_table and art_node garbage collectors are still responsible for freeing items removed from the routing table, so they now use srp_finalize to wait out any active references, and updates are done using srp_swap operations.
  ok dlg@ mpi@
* defer the freeing of art tables and nodes to a task. | dlg | 2016-06-03 | 1 | -13/+75
  this will allow us to sleep in srp_finalize before freeing the memory.
  the defer is done by putting the tables and nodes on a list which is serviced by a task. the task removes all the entries from the list and pool_puts them.
  the art_tables gc code uses at_parent as its list entry, and the art_node gc code uses a union with the an_dst pointer. both at_parent and an_dst are only used when theyre active as part of an art data structure, and are not used in lookups. once the art is done with them we can reuse these pointers safely.
  ok mpi@
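The deferred-free scheme described in this commit can be sketched in userland C. This is only an illustration under stated assumptions, not the code in art.c: `struct node`, `node_defer_free()` and `gc_run()` are hypothetical names, free(3) stands in for pool_put(9), and the mutex, the task scheduling and the srp_finalize wait that the kernel needs are left out. It shows how a pointer that is only meaningful while the node is live can be reused as the list entry once the node is queued.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node: the same storage serves two purposes over time. */
struct node {
	union {
		void		*dst;		/* meaningful while the node is in the tree */
		struct node	*gc_next;	/* meaningful once queued for freeing */
	} u;
};

/* Nodes waiting to be freed (a kernel version would guard this with a lock). */
static struct node *gc_list;

/* Queue a node that has already been removed from the tree. */
static void
node_defer_free(struct node *n)
{
	assert(n != NULL);
	n->u.gc_next = gc_list;		/* overwrites dst, which is no longer needed */
	gc_list = n;
}

/*
 * Body of the deferred task: detach the whole list, then free each
 * entry. Returns the number of nodes released.
 */
static unsigned int
gc_run(void)
{
	struct node *n, *next;
	unsigned int count = 0;

	n = gc_list;
	gc_list = NULL;

	for (; n != NULL; n = next) {
		next = n->u.gc_next;	/* read the link before the node goes away */
		free(n);		/* stand-in for waiting out readers, then pool_put */
		count++;
	}
	return count;
}
```

Detaching the whole list before walking it keeps the time spent touching the shared list head short, which matters if the per-entry work may sleep.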
* pool_setipl at IPL_SOFTNET for all the art structures. | dlg | 2016-06-02 | 1 | -1/+8
* always clean up the heap in art_table_delete, even for the last at_refcnt | dlg | 2016-06-02 | 1 | -5/+4
  in the future a table may also be referenced by a cpu reading it with srp as well as the art rtable, so try and make sure it is always usable.
  ok mpi@
* move all the art_node initialisation to art_get in art.c | dlg | 2016-06-01 | 1 | -1/+2
  ok mpi@
* Keep all pools in the same place. | mpi | 2016-04-13 | 1 | -2/+24
  ok jmatthew@
* Remove unneeded art_free(). | mpi | 2016-04-12 | 1 | -9/+2
  Reported by and ok jmatthew@
* Pass the address length to art_alloc() and remove the hack abusing the | mpi | 2016-01-18 | 1 | -9/+6
  offset of the address in the sockaddr to initialize the stride lengths.
* Reduce the stride length of the tables by two and use a single page | mpi | 2015-12-04 | 1 | -19/+19
  allocator for the 4K heap.
  In this configuration a fullfeed BGP server for v4 and v6 consumes 10M more than with the radix tree.
  This doubles the depth of the tree and makes the lookup slower. But the ratio speed/memory can be adjusted in the future, for now we are interested in a lock-free route lookup.
  Tested by and ok benno@
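The trade-off between stride length, tree depth and per-table memory can be made concrete with a little arithmetic. The sketch below assumes that a table with a stride of s bits has a heap of 2^(s+1) pointer-sized slots, and it reads "by two" as halving the stride, which is consistent with the depth doubling. The strides of 8 and 4 bits and the 8-byte pointer size are illustrative values chosen here, not figures taken from the commit, and `heap_bytes()` and `levels()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes used by the heap of one table with the given stride (in bits). */
static size_t
heap_bytes(unsigned int stride, size_t ptrsize)
{
	assert(stride < 31);
	return ((size_t)1 << (stride + 1)) * ptrsize;
}

/* Tables traversed, at most, to resolve an address of alen bits. */
static unsigned int
levels(unsigned int alen, unsigned int stride)
{
	assert(stride > 0);
	return (alen + stride - 1) / stride;	/* round up */
}
```

Under these assumptions an 8-bit stride with 8-byte pointers gives a 4096-byte heap and four levels for a 32-bit address, while a 4-bit stride gives a 256-byte heap and eight levels: smaller tables, but twice as many memory accesses per lookup.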
* in art_insert, if at_default on the first table is set then return the | dlg | 2015-11-24 | 1 | -1/+4
  existing route rather than overwrite it.
  ok mpi@
* Provide art_free(), a method to release unused routing table heads. | mpi | 2015-11-24 | 1 | -16/+20
  While here initialize pools in art_init().
* Allocate root tables on demand and free them like any other table. | mpi | 2015-11-12 | 1 | -35/+59
  With this change we no longer waste some precious Kb for unused routing tables like the AF_MPLS one or those with rtableid != 0.
  This will also simplify the SRP dance during lookups.
* Allocate ART table's heap independently from the structure and use | mpi | 2015-11-10 | 1 | -8/+51
  pool(9) to not waste most of the memory allocated.
  This reduces the memory overhead of our ART routing table from 80M to 70M compared to the existing radix-tree when loading ~550K IPv4 routes.
  ART can now be used for huge tables without exhausting malloc(9)'s limit.
  claudio@ agrees with the direction, inputs from and ok dlg@
* Some tweaks to build the rtable API and backends in userland. | mpi | 2015-11-04 | 1 | -1/+5
  Needed by the regression tests.
* Rewrite the logic around the dynamic array of routing tables to help | mpi | 2015-10-14 | 1 | -2/+2
  turn rtable_get(9) MP-safe.
  Use only one per-AF array, as suggested by claudio@, pointing to an array of pointers to the routing table heads.
  Routing tables are now allocated/initialized per-AF. This will let us allocate routing tables on demand instead of always having an AF_INET, AF_MPLS and AF_INET table as soon as a new rtableID is used.
  This also gets rid of the "void ***" madness.
  ok dlg@, jmatthew@
* Initialize the routing table before domains. | mpi | 2015-10-07 | 1 | -13/+11
  The routing table is not an optional component of the network stack and initializing it inside the "routing domain" requires some ugly introspection in the domain interface.
  This puts the rtable* layer at the same level as the if* layer. These two subsystems are organized around the two global data structures used in the network stack:
  - the global &ifnet list, to be used in process context only, and
  - the routing table which can be read in interrupt context.
  This change makes the rtable_* layer domain-aware and extends the "struct domain" such that INET, INET6 and MPLS can specify the length of the binary key used in lookups. This allows us to keep, or move towards, AF-free route and rtable layers.
  While here stop the madness and pass the size of the maximum key length in *byte* to rn_inithead0().
  ok claudio@, mikeb@
* Make ART internals free of 'struct sockaddr'. | mpi | 2015-08-20 | 1 | -39/+9
  Keep route entry/BSD compatibility goos in the rtable layer. The way addresses and masks (prefix-lengths) are encoded is really tied to the radix-tree implementation.
  Since we decided to no longer support non-contiguous masks, we could get rid of some extra "sockaddr" allocations and reduce the memory growth related to the use of a multibit-trie.
* In an email dated 11 Feb 2015, Yoichi Hariguchi agreed to re-license | mpi | 2015-08-20 | 1 | -35/+2
  his reference ART implementation from a BSD 4-clause to ISC.
  Thanks a lot to him!
* Import an alternative routing table backend based on Yoichi Hariguchi's | mpi | 2015-08-20 | 1 | -0/+789
  ART implementation.
  ART (Allotment Routing Table) is a multibit-trie algorithm invented by D. Knuth while reviewing Yoichi's SMART [0] (Smart Multi-Array Routing Table) paper.
  This implementation, unlike the one from the KAME project, supports variable stride lengths, which makes it easier to adapt the consumed memory/speed trade-off. It also lets you use a bigger first-level table, something other algorithms such as POPTRIE [1] need to implement separately.
  Adaptation to the OpenBSD kernel has been done with two different data structures. ART nodes and route entries are managed separately, which makes the algorithm implementation free of any MULTIPATH logic.
  This implementation does not include Path Compression.
  [0] http://www.hariguchi.org/art/smart.pdf
  [1] http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p57.pdf
  ok dlg@, reyk@
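As a rough illustration of the allotment idea behind ART, here is a minimal single-table sketch in userland C. It is not the imported code: the 4-bit `STRIDE`, the integer route ids, and the names `base_index()`, `allot()`, `insert()` and `lookup()` are assumptions made for this example, and multi-level tables, deletion and the separate route entries are left out. Every heap slot holds the best route covering it, so a lookup is a single array read at the fringe index of the address.

```c
#include <assert.h>

#define STRIDE	4			/* address bits resolved by this one table */
#define FRINGE	(1u << STRIDE)		/* first fringe (leaf) index */
#define NSLOTS	(1u << (STRIDE + 1))	/* heap slots; slot 0 is unused */

static int heap[NSLOTS];		/* route id per slot, 0 means "no route" */

/* Heap index of prefix addr/plen, where addr is a STRIDE-bit value. */
static unsigned int
base_index(unsigned int addr, unsigned int plen)
{
	assert(plen <= STRIDE && addr < FRINGE);
	return (addr >> (STRIDE - plen)) + (1u << plen);
}

/*
 * Allotment: replace 'old' by 'new' in the subtree rooted at k,
 * stopping wherever a more specific route already owns a slot.
 */
static void
allot(unsigned int k, int old, int new)
{
	if (heap[k] != old)
		return;
	heap[k] = new;
	if (k >= FRINGE)
		return;			/* fringe slots have no children */
	allot(2 * k, old, new);
	allot(2 * k + 1, old, new);
}

/* Install 'route' for addr/plen. */
static void
insert(unsigned int addr, unsigned int plen, int route)
{
	unsigned int b = base_index(addr, plen);

	allot(b, heap[b], route);
}

/* Longest-prefix match: one read at the fringe index of addr. */
static int
lookup(unsigned int addr)
{
	assert(addr < FRINGE);
	return heap[addr + FRINGE];
}
```

For example, after inserting 0/0 as route 1, 0x8/1 as route 2 and 0xC/2 as route 3, looking up 0x3, 0x9 and 0xD yields routes 1, 2 and 3 respectively. The cost of the longest-prefix match is paid at insertion time, which is what makes the lookup path short.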