path: root/sys/net/rtable.c
Commit message | Author | Age | Files | Lines
* Rework source IP address setting. | denis | 2020-11-07 | 1 | -43/+5
| - Move most of the processing out of rtable.c (reasonable tb@, ok bluhm@)
| - Remove memory allocation, store pointer to existing ifaddr
| - Fix tunnel interface handling
|
| looks fine mpi@
* Replace wrong cast with satosin. | denis | 2020-11-05 | 1 | -5/+3
| Advised by bluhm@
* Add feature to force the selection of source IP address | denis | 2020-10-29 | 1 | -1/+79
| Based/previous work on an idea from deraadt@
| Input from claudio@, djm@, deraadt@, sthen@
| OK deraadt@
* Prevent recursions by not deleting entries inside rtable_walk(9). | mpi | 2019-06-21 | 1 | -5/+11
| rtable_walk(9) now passes a routing entry back to the caller when a non zero value is returned and if it asked for it. This allows us to call rtdeletemsg()/rtrequest_delete() from the caller without creating a recursion because of rtflushclone().
|
| Multicast code hasn't been adapted and is still possibly creating recursions. However multicast route entries aren't cloned so if a recursion exists it isn't because of rtflushclone().
|
| Fix stack exhaustion triggered by the use of "-msave-args".
|
| Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.
* Make sure pointer is within bounds before dereferencing it. | anton | 2019-03-05 | 1 | -2/+2
| ok claudio@ deraadt@
|
| Reported-by: syzbot+8e29400e09a351f17884@syzkaller.appspotmail.com
* Change rtable_mpath_reprio() to take the prefixlen as argument instead of | claudio | 2018-11-23 | 1 | -6/+2
| the network mask. This saves converting the prefixlen to a mask and back.
| OK phessler@, benno@
* Make rtable_satoplen() a bit more strict when parsing netmasks. Ensure | claudio | 2018-11-20 | 1 | -19/+12
| that the mask is contiguous and that the prefixlen is not bigger than the maximum. Make the function behave a bit more like the similar netmask handling code in the old patricia codebase.
| Fixes a problem reyk@ reported regarding IPv6 masks and the fact that sin6_scope_id is after sin6_addr.
| OK mpi@
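The check described in the commit above (a contiguous mask whose length does not exceed a maximum) can be illustrated with a small userland sketch. This is not the kernel's rtable_satoplen(); the function name mask_to_plen(), its parameters, and the -1 error convention are assumptions made only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's rtable_satoplen(): convert a
 * netmask given as raw bytes into a prefix length.
 * Returns -1 if the mask is not contiguous or is longer than maxplen bits.
 */
int
mask_to_plen(const uint8_t *mask, size_t len, int maxplen)
{
	int plen = 0;
	size_t i = 0;

	/* Count the leading all-ones bytes. */
	while (i < len && mask[i] == 0xff) {
		plen += 8;
		i++;
	}

	/* At most one partial byte: its set bits must be left-aligned. */
	if (i < len && mask[i] != 0) {
		uint8_t b = mask[i];

		while (b & 0x80) {
			plen++;
			b <<= 1;
		}
		if (b != 0)
			return -1;	/* hole inside the byte */
		i++;
	}

	/* Everything after the first zero bit must be zero. */
	for (; i < len; i++) {
		if (mask[i] != 0)
			return -1;
	}

	return (plen > maxplen) ? -1 : plen;
}
```

With this sketch, the bytes ff ff ff 00 give a prefix length of 24, while ff 00 ff 00 is rejected as non-contiguous.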
* Retire dom_rtkeylen from struct domain. Nothing is using this anymore. | claudio | 2018-11-19 | 1 | -8/+4
| It was used by the original patricia tree.
| OK mpi@
* provide rtable_empty(), returns 1 if the rtable doesn't contain any routes | henning | 2018-09-09 | 1 | -1/+22
| ok bluhm
* Simplify rtable_mpath_insert(). | mpi | 2017-09-05 | 1 | -26/+15
| ok jmatthew@
* Restart the iteration when a multipath list is re-ordered to make sure | mpi | 2017-09-05 | 1 | -1/+2
| no entry is missed.
|
| While here do not re-order or send messages for route entries that are already in the expected state.
|
| Make rttest30 pass.
|
| ok gerhard@
* Enable mpath support in the Allotment Routing Table (ART) on the ramdisk. | florian | 2017-07-30 | 1 | -27/+1
| OK mpi
* Switch installer to Allotment Routing Table (ART). | florian | 2017-07-30 | 1 | -153/+1
| Prompted by a bug report by naddy that IPv6 autoconfiguration is broken in the installer.
| OK mpi, "go for it" deraadt
* No need to go through a remove/insert cycle when there's a single route | mpi | 2017-05-11 | 1 | -3/+12
| entry on the multipath list.
|
| Fix a NULL dereference triggered by a CPU doing a lookup when another one is updating the priorities of some routes.
|
| By not doing a remove/insert we ensure that ``an_rtlist'' is never empty and do not need a conditional in the fast path.
|
| Problem reported by and ok markus@
* Prevent a MP race in rtable_lookup(). | mpi | 2017-02-28 | 1 | -10/+26
| If an ART node is linked to multiple route entries, in the MPATH case, it is not safe to dereference ``an_dst''. This non-refcounted pointer can be changed at any time by another CPU.
|
| So get rid of the pointer and use the first destination of a route entry when comparing sockaddrs.
|
| This allows us to remove a pointer from 'struct art_node' and save 5Mb of memory in an IPv4 fullfeed.
|
| ok jmatthew@, claudio@, dlg@
* A space here, a space there. Soon we're talking real whitespace | krw | 2017-01-24 | 1 | -3/+3
| rectification.
* Make rtable_iterate(9) mpsafe by using the new SRPL_NEXT(9). | mpi | 2016-11-20 | 1 | -10/+6
| ok dlg@, jmatthew@
* Rename SRPL_ENTER() to SRPL_FIRST() and SRPL_NEXT() to SRPL_FOLLOW(). | mpi | 2016-11-20 | 1 | -5/+5
| This allows us to introduce SRPL_NEXT() that can be used to start iterating on an arbitrary member of an srp list, hence without calling SRPL_ENTER().
|
| ok dlg@, jmatthew@
* Automatically create a default lo(4) interface per rdomain. | mpi | 2016-11-14 | 1 | -9/+42
| In order to stop abusing lo0 for all rdomains, a new loopback interface will be created every time a rdomain is created. The unit number will be the same as the rdomain, i.e. lo1 will be attached to rdomain 1.
|
| If this loopback interface is already in use it won't be possible to create the corresponding rdomain.
|
| In order to know which lo(4) interface is attached to a rdomain, its index is stored in the rtable/rdomain map.
|
| This is long overdue since the introduction of rtable/rdomain. It also fixes a recent regression due to resetting the rdomain of an incoming packet reported by semarie@, Andreas Bartelt and Nils Frohberg.
|
| ok claudio@
* Remove radix_mpath dragons. | mpi | 2016-11-14 | 1 | -80/+2
| This code isn't used since ART is the default.
|
| ok vgross@
* Rename rtable_mpath_next() into rtable_iterate() and make it do a proper | mpi | 2016-09-07 | 1 | -19/+44
| reference count.
|
| rtable_iterate() frees the passed ``rt'' and returns the next one on the multipath list or NULL if there's none.
|
| ok dlg@
* use a per-table rwlock to serialize ART updates and walks, rather than | jmatthew | 2016-08-30 | 1 | -24/+44
| taking the kernel lock.
|
| ok mpi@ dlg@
* Revert use of the _SAFE version of SRPL_FOREACH() now that the offending | mpi | 2016-07-19 | 1 | -9/+6
| function has been fixed.
|
| Functions passed to rtable_walk() must return EAGAIN if they delete an entry from the tree, no matter if it is a leaf or not.
* Use the _SAFE_ version of SRPL_FOREACH() in rtable_walk_helper() to | mpi | 2016-07-04 | 1 | -6/+9
| prevent an off-by-one when removing entries from the mpath list.
|
| Fix a regression introduced by the refactoring needed to serialize rtable_walk() with create/delete.
|
| ok jca@
* rework art_walk so it will behave in an mpsafe world. | dlg | 2016-06-22 | 1 | -5/+5
| art_walk now explicitly takes the same lock used to serialise changes made via rtable_insert and _delete, so it can safely adjust the refcnts on tables while it recurses into them. they need to still exist when returning out of the recursion.
|
| it uses srps to access nodes and drops the lock before calling the callback function. this is because some callbacks sleep (eg, copyout in the sysctl code that dumps an rtable to userland), which you shouldn't hold a lock across. other callbacks attempt to modify the rtable (eg, marking routes as down when the interface they're on goes down), which tries to take the lock again, which probably won't work in the future.
|
| ok jmatthew@ mpi@
* Convert the links between art data structures used during lookups into srps. | jmatthew | 2016-06-14 | 1 | -35/+61
| art_lookup and art_match now return an active srp_ref, which the caller must leave when it's done with the returned route (if any). This allows lookups to be done without holding any locks.
|
| The art_table and art_node garbage collectors are still responsible for freeing items removed from the routing table, so they now use srp_finalize to wait out any active references, and updates are done using srp_swap operations.
|
| ok dlg@ mpi@
* per trending style, add continue to empty loops. | tedu | 2016-06-07 | 1 | -3/+3
| ok mglocker
* shuffle the code in rtable_insert so it inserts a populated art_node. | dlg | 2016-06-01 | 1 | -20/+21
| this makes the node usable as soon as it is in the tree, rather than after it inserts the rtentry on the node.
|
| ok mpi@
* rtref and rtfree around moving the rt in rtable_mpath_reprio so the list | dlg | 2016-06-01 | 1 | -1/+3
| operations can't drop the refcount to 0.
|
| ok mpi@
* move all the art_node initialisation to art_get in art.c | dlg | 2016-06-01 | 1 | -3/+1
| ok mpi@
* rework the srp api so it takes an srp_ref struct that the caller provides. | dlg | 2016-05-18 | 1 | -19/+20
| the srp_ref struct is used to track the location of the caller's hazard pointer so later calls to srp_follow and srp_enter already know what to clear. this in turn means most of the caveats around using srps go away. specifically, you can now:
|
| - switch cpus while holding an srp ref
|   - ie, you can sleep while holding an srp ref
| - you can take and release srp refs in any order
|
| the original intent was to simplify use of the api when dealing with complicated data structures. the caller now no longer has to track the location of the srp a value was fetched from, the srp_ref effectively does that for you.
|
| srp lists have been refactored to use srp_refs instead of srpl_iter structs.
|
| this is in preparation of using srps inside the ART code. ART is a complicated data structure, and lookups require overlapping holds of srp references.
|
| ok mpi@ jmatthew@
* Simplify life for routing table implementations by requiring that rtable_walk | jmatthew | 2016-05-02 | 1 | -3/+11
| callbacks return EAGAIN if they modify the routing table. While we're here, simplify life for rtable_walk callers by moving the loop that restarts the walk on EAGAIN into rtable_walk itself.
|
| Flushing cloned routes on interface state changes becomes a bit more inefficient, but this can be improved later.
|
| ok mpi@ dlg@
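The restart-on-EAGAIN contract described in the commit above can be sketched in userland. Everything below is a toy: struct toy_table, toy_walk(), toy_delete() and toy_drop_negative() are hypothetical names for this example, and a flat array stands in for the routing table. It is not the kernel's rtable_walk().

```c
#include <errno.h>
#include <stddef.h>

#define TOY_MAX 8

/* Toy stand-in for a routing table: a flat array of integer "routes". */
struct toy_table {
	int	entries[TOY_MAX];
	size_t	count;
};

typedef int (*toy_walk_cb)(struct toy_table *, size_t);

/* Remove the entry at idx by shifting the tail down. */
void
toy_delete(struct toy_table *t, size_t idx)
{
	for (; idx + 1 < t->count; idx++)
		t->entries[idx] = t->entries[idx + 1];
	t->count--;
}

/*
 * Walk every entry. A callback that modified the table returns EAGAIN,
 * and the walk restarts from the beginning; any other non-zero value
 * aborts the walk and is handed back to the caller.
 */
int
toy_walk(struct toy_table *t, toy_walk_cb cb)
{
	int error;
	size_t i;

	do {
		error = 0;
		for (i = 0; i < t->count; i++) {
			error = cb(t, i);
			if (error != 0)
				break;
		}
	} while (error == EAGAIN);

	return error;
}

/* Example callback: delete negative entries and ask for a restart. */
int
toy_drop_negative(struct toy_table *t, size_t idx)
{
	if (t->entries[idx] < 0) {
		toy_delete(t, idx);
		return EAGAIN;
	}
	return 0;
}
```

Keeping the restart loop inside the walker means callers never see EAGAIN. The cost is that each deletion rescans from the start, which matches the inefficiency the commit message acknowledges.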
* Keep all pools in the same place. | mpi | 2016-04-13 | 1 | -11/+5
| ok jmatthew@
* Fix ECMP routing by passing the correct destination address to the | mpi | 2016-02-24 | 1 | -3/+3
| | | | | | | | hash routine. Bug reported and fix analysed by Jean-Daniel Dupas <jddupas AT xooloo DOT net> ok deraadt@
* Pass the address length to art_alloc() and remove the hack abusing the | mpi | 2016-01-18 | 1 | -7/+8
| offset of the address in the sockaddr to initialize the stride lengths.
* Stop storing a backpointer to the corresponding ART node in each route | mpi | 2016-01-18 | 1 | -8/+2
| entry.
|
| This pointer hasn't been used for some time and without it no external reference count is needed to turn art_lookup() mpsafe.
* Pass the destination and mask to rtable_mpath_reprio() in order to not | mpi | 2015-12-21 | 1 | -8/+32
| use ``rt_node'' with ART.
* Merge rtable_mpath_select() into rtable_match(). | mpi | 2015-12-16 | 1 | -80/+64
| This allows us to get rid of one more "rt_node" usage with ART.
|
| ok jmatthew@
* Do not panic when trying to delete a non-existing route with ART. | mpi | 2015-12-15 | 1 | -16/+19
| Reported by bluhm@, ok jmatthew@
* Move the KERNEL_LOCK from rt_match() to rtable_match(). | mpi | 2015-12-04 | 1 | -7/+12
| ok claudio@
* Get rid of rt_mask() and stop allocating a "struct sockaddr" for every | mpi | 2015-12-03 | 1 | -39/+23
| route entry in ART.
|
| rt_plen() now represents the prefix length of a route entry and should be used instead.
|
| For now use a "struct sockaddr_in6" to represent the mask when needed, this should be then replaced by the prefix length and RTA_NETMASK only used for compatibility with userland.
|
| ok claudio@
* rtable_delete() does not use its prio parameter, so delete it. | bluhm | 2015-12-02 | 1 | -3/+3
| OK mpi@
* Respect priorities when inserting routes to the same destination in ART. | mpi | 2015-12-02 | 1 | -6/+14
* Move multipath Hash-Threshold selection mechanism inside rtable_match(). | mpi | 2015-12-02 | 1 | -6/+25
| This will help with unlocking the routing table and will prevent further mistakes by keeping the multipath logic inside the rtable_* API.
|
| ok dlg@, claudio@
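For readers unfamiliar with Hash-Threshold selection (analysed in RFC 2992), the idea is to split the hash space into equal regions, one per path, and pick the region that a flow's hash falls into. The sketch below assumes a 16-bit hash; select_path() is a hypothetical helper for this example and is not taken from rtable.c.

```c
#include <stdint.h>

/*
 * Hash-Threshold selection in the style of RFC 2992: split the 16-bit
 * hash space into npaths equal regions and return the index of the
 * region the hash falls into. Hypothetical helper, not the kernel code.
 */
unsigned int
select_path(uint16_t hash, unsigned int npaths)
{
	unsigned int threshold;

	if (npaths <= 1)
		return 0;

	/* Region size, rounded up so that 0xffff maps to the last path. */
	threshold = (0xffffU / npaths) + 1;

	return hash / threshold;
}
```

Because the mapping depends only on the hash and the number of paths, all packets of one flow stay on the same path for as long as the path count is unchanged.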
* Convert the simple list of multipath route entries used by ART kernels | mpi | 2015-11-29 | 1 | -28/+69
| to a SRP list.
|
| This turns the rtable_* layer mpsafe. We now only need to protect the ART implementation itself.
|
| Note that route(8) regress tests will now fail due to a supplementary reference taken by the SRPL_INIT(9) API.
|
| ok dlg@
* Document that routing table heads are never freed as suggested by dlg@ | mpi | 2015-11-27 | 1 | -141/+51
| and kill rtable_put() because we're not going to use it.
|
| The overhead of keeping a "struct art_root/radix_node_head" around is very small compared to the added complexity needed to reference count such structures.
* Protect the growth of the routing table arrays used by rtable_get() | mpi | 2015-11-27 | 1 | -87/+176
| with SRPs.
|
| This is a simplified version of the dynamically sizeable array of pointers used by if_get() because routing table heads are never freed.
|
| ok dlg@
* Provide art_free(), a method to release unused routing table heads. | mpi | 2015-11-24 | 1 | -1/+2
| While here initialize pools in art_init().
* Allocate ART table's heap independently from the structure and use | mpi | 2015-11-10 | 1 | -2/+2
| pool(9) to not waste most of the memory allocated.
|
| This reduces the memory overhead of our ART routing table from 80M to 70M compared to the existing radix-tree when loading ~550K IPv4 routes.
|
| ART can now be used for huge tables without exhausting malloc(9)'s limit.
|
| claudio@ agrees with the direction, inputs from and ok dlg@
* Do not leave dangling pointers in the ART tree in case of memory | mpi | 2015-11-09 | 1 | -20/+20
| exhaustion.
|
| Reported by benno@ and found thanks to his bgpd(8) test setup.