path: root/usr.sbin/bgpd/rde.h
Commit message | Author | Age | Files | Lines
* Remove redundant code | denis | 2020-06-05 | 1 | -2/+1
  Reported by Prof. Dr. Steffen Wendzel <wendzel @ hs-worms . de>, thanks! OK martijn@ sthen@
* Implement 'max-prefix NUM out' to limit the number of announced prefixes. | claudio | 2020-01-24 | 1 | -2/+5
  This is an easy safety switch to not leak full tables to upstreams and peers. If the limit is hit a Cease notification is sent and the session is closed. This implements most of https://tools.ietf.org/html/draft-sa-idr-maxprefix-00 OK job@
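A minimal sketch of the limit check described above, assuming a per-peer counter of announced prefixes. The struct and function names are hypothetical and are not taken from rde.h; whether the real code trips when the count reaches or exceeds the limit is also an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-peer state; not the bgpd struct layout. */
struct demo_out_limit {
	uint32_t max_out;	/* configured 'max-prefix NUM out', 0 = unlimited */
	uint32_t count_out;	/* prefixes currently announced to the peer */
	int	 ceased;	/* set once the limit has been hit */
};

/*
 * Account for one more announced prefix. Returns 0 if the announcement
 * may go out and -1 if the limit is hit, in which case the caller would
 * send a Cease notification and close the session.
 */
int
demo_out_announce(struct demo_out_limit *p)
{
	assert(p != NULL);
	if (p->ceased)
		return -1;
	if (p->max_out != 0 && p->count_out >= p->max_out) {
		p->ceased = 1;
		return -1;
	}
	p->count_out++;
	return 0;
}
```

With a limit of 2, the first two calls succeed and the third reports that the session should be torn down.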
* Instead of calling SipHash24_Update() in path_hash for each element of | claudio | 2020-01-09 | 1 | -2/+4
  struct rde_aspath define aspath_hashstart and aspath_hashend and update all values in one call. Inspired by struct process and its ps_startcopy. OK deraadt@
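The start/end marker technique can be sketched as follows. The struct layout and the demo_* names are hypothetical, and FNV-1a is used only as a stand-in for SipHash24_Update(); the point is that one call covers the whole contiguous region between the two markers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real struct rde_aspath has different members. */
struct demo_aspath {
	void		*link;		/* list linkage, not hashed */
#define demo_hashstart	med		/* first hashed member */
	uint32_t	 med;
	uint32_t	 lpref;
	uint32_t	 weight;
	uint16_t	 flags;
	uint8_t		 origin;
	uint8_t		 pad;		/* explicit, so no hidden padding is hashed */
#define demo_hashend	refcnt		/* first member after the hashed region */
	int		 refcnt;	/* not hashed */
};

/* FNV-1a over a byte range; a stand-in for SipHash24_Update(). */
static uint64_t
demo_hash_update(uint64_t h, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/* Hash everything between the two markers with a single update call. */
uint64_t
demo_path_hash(const struct demo_aspath *a)
{
	size_t start = offsetof(struct demo_aspath, demo_hashstart);
	size_t end = offsetof(struct demo_aspath, demo_hashend);

	assert(a != NULL && end > start);
	return demo_hash_update(14695981039346656037ULL,
	    (const unsigned char *)a + start, end - start);
}
```

Members outside the markers (here link and refcnt) do not influence the hash, so two paths with equal attributes hash the same regardless of how they are linked or referenced.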
* Clean up header a bit, remove peer_recv_eor and peer_send_eor prototypes | claudio | 2020-01-09 | 1 | -8/+6
  and order prototypes like the functions in rde_peer.c
* Move peer related code from rde.c to rde_peer.c. | claudio | 2020-01-09 | 1 | -3/+20
  Change peer_foreach() to just walk the peer list instead of iterating over the peer hash table. Also change peer_down() arguments so that it can be used as a peer_foreach() callback (which is then used in rde_shutdown()). OK benno@
* eye burning whitespace | deraadt | 2020-01-08 | 1 | -2/+2
* Instead of processing all imsg when reading them store peer specific | claudio | 2020-01-01 | 1 | -1/+10
  messages on a per peer queue. This queue is later processed one at a time resulting in a fairer processing of work and avoiding big table dumps to delay processing of other updates. OK denis@ benno@
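A rough sketch of the per-peer queue idea, assuming a small fixed-size FIFO of message ids per peer. All names are hypothetical; the real code queues imsgs and is driven from the RDE main loop.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QLEN 8

/* Hypothetical peer with a fixed-size FIFO of message ids. */
struct demo_peer {
	int	q[DEMO_QLEN];
	size_t	head;	/* next slot to read */
	size_t	count;	/* number of queued messages */
};

/* Store a message on the peer's own queue; returns -1 if it is full. */
int
demo_peer_enqueue(struct demo_peer *p, int msg)
{
	assert(p != NULL);
	if (p->count == DEMO_QLEN)
		return -1;
	p->q[(p->head + p->count) % DEMO_QLEN] = msg;
	p->count++;
	return 0;
}

/*
 * One pass over all peers: handle at most one queued message per peer
 * and append the handled ids to out[], which must have room for npeers
 * entries. Returns the number of messages handled.
 */
size_t
demo_peers_work(struct demo_peer *peers, size_t npeers, int *out)
{
	size_t i, n = 0;

	for (i = 0; i < npeers; i++) {
		struct demo_peer *p = &peers[i];

		if (p->count == 0)
			continue;
		out[n++] = p->q[p->head];
		p->head = (p->head + 1) % DEMO_QLEN;
		p->count--;
	}
	return n;
}
```

A peer with a long backlog gets one message handled per pass, so a single update from another peer is not stuck behind it.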
* Add PREFIX_FLAG_STALE to mark prefixes in the Adj-RIB-Out as stale during | claudio | 2019-10-30 | 1 | -4/+5
  graceful reload. At the same time extend peer_dump() to force all updates getting sent by adding every entry in the Adj-RIB-Out to the update tree unless they are PREFIX_FLAG_DEAD or PREFIX_FLAG_STALE. The latter will be removed during that stage since peer_dump() just did a full update of the Adj-RIB-Out. Also fix prefix_withdraw to check the correct prefix flags before removing a prefix from the update or withdraw tree. OK benno@
* Rework the way ribs are stored in the RDE. Instead of a flat array that | claudio | 2019-08-14 | 1 | -11/+2
  gets enlarged use an array of pointers, so pointers to struct rib entries remain valid after adding new RIBs. Also remove the global ribs pointer and rib_valid() since they are no longer used; all the code now uses rib_byid() instead. OK benno@
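Why an array of pointers keeps references stable can be sketched like this. The demo_* names and the struct contents are hypothetical: only the pointer table is reallocated, while each entry is allocated once and never moves.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical RIB entry; the real struct rib carries much more state. */
struct demo_rib {
	uint16_t	id;
	char		name[32];
};

static struct demo_rib	**demo_ribs;	/* table of pointers, may move */
static size_t		  demo_rib_cnt;

/* Allocate a new RIB and grow only the pointer table. */
struct demo_rib *
demo_rib_new(void)
{
	struct demo_rib **nr, *rib;

	assert(demo_rib_cnt < UINT16_MAX);
	if ((rib = calloc(1, sizeof(*rib))) == NULL)
		return NULL;
	nr = realloc(demo_ribs, (demo_rib_cnt + 1) * sizeof(*nr));
	if (nr == NULL) {
		free(rib);
		return NULL;
	}
	demo_ribs = nr;
	rib->id = (uint16_t)demo_rib_cnt;
	demo_ribs[demo_rib_cnt++] = rib;
	return rib;
}

/* Look up a RIB by id; returns NULL for unknown ids. */
struct demo_rib *
demo_rib_byid(uint16_t id)
{
	if (id >= demo_rib_cnt)
		return NULL;
	return demo_ribs[id];
}
```

A pointer obtained from demo_rib_new() stays valid after any number of later additions, which is not guaranteed when the entries themselves live in a realloc()ed flat array.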
* There is no longer a reason to use two structs for RIBs where one is part | claudio | 2019-08-14 | 1 | -11/+6
  of the other. Just merge struct rib_desc into struct rib. Makes code simpler. OK benno@
* Instead of passing a struct prefix pointer to rde_filter() pass the 4 values | claudio | 2019-08-13 | 1 | -10/+11
  prefix_peer, prefix_vstate and prefix/prefixlen to the function. This removes some ugly hacks in cases where the prefix was not available. Also adjust the order of arguments of rde_attr_set() to match rde_filter(). OK benno@
* Rename some of the prefix functions to make it clearer. Also rename | claudio | 2019-08-09 | 1 | -6/+7
  path_update to prefix_update since this is now more working on a prefix. OK clang
* Improve RIB reload behaviour. Especially when the rtable changes or the | claudio | 2019-08-07 | 1 | -3/+6
  route evaluation is modified. In both cases the softreconfig code will now walk the RIB and ensure that everything is in proper sync. Additionally remove 'route-collector yes|no' from the bgpd config, instead use 'rde rib Loc-RIB no evaluate' with the benefit that you can alter the setting now during runtime. Tested and OK benno@
* GC three prototypes whose functions have gone long ago. | claudio | 2019-07-22 | 1 | -4/+1
* Change the Adj-RIB-Out to a per peer set of RB trees. The way RIB data | claudio | 2019-07-17 | 1 | -15/+35
  structures are linked does not scale for the Adj-RIB-Out and so inserts and updates into the Adj-RIB-Out did not scale because of some linear list traversals in hot paths. A synthetic test with 4000 peers announcing one prefix each showed that the initial convergence time dropped from around 1 hour to around 6min.
  Note: because the Adj-RIB-Out is now per peer the order in which prefixes are dumped in 'bgpctl show rib out' changed.
  Tested and OK job@, benno@, phessler@
* Unify the way objects in the RDE are reference counted. The affected | claudio | 2019-07-01 | 1 | -23/+17
  structures are pt_entry, rde_aspath, rde_communities, and nexthop. The functions are always called *_ref and *_unref; also the behaviour when the last reference is removed is unified and now the object is removed inside of the unref function. The actual bean-counting is not modified by this diff. OK benno@
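The unified *_ref / *_unref convention can be sketched as below, with the object released inside the unref function once the last reference goes away. The demo_* names and the live-object counter are hypothetical and exist only to make the behaviour observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct demo_obj {
	int	refcnt;
};

static int demo_live;	/* number of objects currently allocated */

/* Create an object holding one reference for the caller. */
struct demo_obj *
demo_obj_new(void)
{
	struct demo_obj *o;

	if ((o = calloc(1, sizeof(*o))) == NULL)
		return NULL;
	o->refcnt = 1;
	demo_live++;
	return o;
}

/* Take an additional reference. */
struct demo_obj *
demo_obj_ref(struct demo_obj *o)
{
	assert(o != NULL && o->refcnt > 0);
	o->refcnt++;
	return o;
}

/* Drop a reference; the last unref removes the object itself. */
void
demo_obj_unref(struct demo_obj *o)
{
	if (o == NULL)
		return;
	assert(o->refcnt > 0);
	if (--o->refcnt == 0) {
		demo_live--;
		free(o);
	}
}
```

Callers never free the object directly; they only balance each ref with an unref.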
* mrt dumps lost communities after the community rewrite. | claudio | 2019-06-24 | 1 | -10/+11
  Readd them by dumping them explicitly. Tested by and OK benno@
* Add a direct pointer from struct prefix to struct pt_entry. | claudio | 2019-06-22 | 1 | -3/+6
  This change makes it possible to not use the struct rib_entry pointer which will be used to optimize the Adj-RIB-Out. Also adjust pt_ref() and pt_unref() so that the code can be written a bit more compact. Also prefix_cmp() no longer needs to go via rib_compare() and calls pt_prefix_cmp() directly. OK phessler@
* prefix_updateall() is only used internally, make it a static function. | claudio | 2019-06-20 | 1 | -3/+1
* Change nexthop_update to run the list walk over all prefixes to run | claudio | 2019-06-20 | 1 | -2/+8
  asynchronously and therefore other tasks can make progress at the same time. Additionally prefixes belonging to a RIB which does not run the decision process are no longer linked into the nexthop list. This replaces the early return in prefix_updateall() and reduces the time spent in nexthop_update(). OK benno@
* Cleanup, remove some unneeded spaces, add some others where needed. | claudio | 2019-06-17 | 1 | -5/+5
  No binary change according to clang
* Completely rewrite the community matching and handling code. All community | claudio | 2019-06-17 | 1 | -20/+67
  attributes are put into a new data structure when parsing the UPDATE. The filter code can quickly lookup and modify this data structure. When creating an UPDATE the data is put back into wire format. Setups using a lot of communities benefit a lot from this. Input and OK benno@
* Exit the attribute loop early if there are no unknown attributes left | claudio | 2019-05-31 | 1 | -1/+2
  and the loop passed all attributes known by bgpd. Saves about 80% of time in up_generate_attr(). OK phessler@
* Do a better job at cleaning up the config on shutdown. Remove bits that | claudio | 2019-03-07 | 1 | -13/+1
  were missed before (e.g. network related objects). This helps to detect memory leaks. Start using new_config() and free_config() in all places where bgpd_config structures are used. This way the struct is properly initialised and cleaned up. Introduce copy_config() to only copy the values into the other struct leaving the pointers as they were. Looks good to benno@
* Add support for '*', local-as and neighbor-as for ext-community matching | claudio | 2019-02-26 | 1 | -2/+2
  and setting. This allows rules like:
    ext-community * *     # delete any ext-community
    ext-community ovs *   # delete any ext-community of specified type
    ext-community rt 1.2.3.4:*
  and
    ext-community rt 65001:local-as
    ext-community rt local-as:11111
  Note: Sometimes the type of the ext-community is underspecified when using wildcards or expands. So 'ext-community rt *' or 'ext-community soo *' will match for any of the 3 possible types (2-byte AS, 4-byte AS and IP address). If local-as/neighbor-as is used as an expand of as-number like
    ext-community rt local-as:11111
  then bgpd will default to the 4-byte AS type to encode the community.
  OK benno@
* Implement as-override, a feature where the neighbor AS is replaced by the | claudio | 2019-02-04 | 1 | -1/+3
  local AS in AS paths. This is sometimes needed in bigger transport networks where private AS numbers are used in multiple locations. The implementation is done using a filterset which modifies the AS path - somewhat inspired by the set attribute code. Setting 'as-override yes' will add 'match from <neighbor> set { as-override }' to the start of the filter rules. Since this is done with filters the Adj-RIB-In still holds the original path and so reloads changing the setting just work. With and OK markus@
* Use Adj-RIB-Out to push UPDATE messages to peers instead of having another | claudio | 2019-01-21 | 1 | -24/+21
  set of RB trees of prefixes and attributes. Refactor most of the update code which removes some strange buffer handling. By building the output queue directly in the Adj-RIB-Out the top memory usage during startup is greatly reduced which should help busy servers. Tested by phessler@ and myself
* add support for IPv6 VPN routes | denis | 2018-12-30 | 1 | -1/+14
  The kernel bits are missing as of now. With input from claudio@ and kn@ OK claudio@
* Fold ext-communities into filter_community so that bgpd can match | claudio | 2018-12-19 | 1 | -5/+5
  multiple ext-communities at the same time as well. Additionally this fixes parsing some of the ext-community types. Now all communities are handled by one common struct. OK benno@ plus some input from denis@
* path_empty() is not a function and does not need a prototype. | claudio | 2018-12-17 | 1 | -2/+1
* Refactor aspath code a bit. Move cached source_as (for origin validation) | claudio | 2018-12-11 | 1 | -3/+3
  into struct aspath and pass that struct to aspath_match(). OK denis@
* Start reworking community handling. Merge standard communities and large | claudio | 2018-11-28 | 1 | -20/+23
  communities into one filter_community struct and allow more than one community to be used in filter rules (currently up to 3). Also rework the code handling bgpctl show rib commands. The special IMSG types for the various filters are gone and the code is in general simpler. OK job@, phessler@
* Introduce a real Adj-RIB-Out. At the same time remove the update_rib | claudio | 2018-11-04 | 1 | -3/+2
  introduced before 6.4 because it now can be replaced with the real RIB.
  Main changes are:
  - simplified 'show rib' handling since everything is now a real RIB
  - path_update() is now returning if a prefix was not modified, added or moved
  - softreconfig out case is simpler since path_update does all the magic now
  - Adjust shutdown code to work with the Adj-RIB-Out
  Tested and OK denis@, benno@
* Remove tail queues which link peer, aspath and prefix together. These | claudio | 2018-10-31 | 1 | -9/+2
  lists are no longer needed and make it possible to share rde_aspath between peers & prefixes. Instead of the lists the rde_aspath is now reference counted. With this struct prefix is now the central place where everything is connected to making the RIB a bit easier to handle. With input and OK denis@
* Replace some walkers using the aspath/prefix lists with a rib_dump walker. | claudio | 2018-10-29 | 1 | -3/+1
  network_flush() is now using rib_dump_new to walk the Adj-RIB-In and remove all dynamically added announcements. peer_flush() got generalized and is now used also in peer_down(). It also uses a walker to remove all prefixes of a peer but does it in a synchronous way for now. OK benno@
* Calculate ASPATH_HEADER_SIZE correctly by using offsetof() instead of the | claudio | 2018-10-25 | 1 | -2/+3
  sizeof calculation that did not respect possible padding bytes. OK sthen@ denis@
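The difference between the two calculations can be illustrated with a small struct that ends in a one-byte array. The layout and the demo_* names are hypothetical and much smaller than the real struct aspath; on common ABIs this struct is padded to 8 bytes, so the sizeof-based value comes out as 7 while the data actually starts at offset 6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header followed by a variable-length tail. */
struct demo_aspath {
	uint32_t	source_as;
	uint16_t	len;		/* bytes used in data[] */
	unsigned char	data[1];	/* variable-length tail */
};

/* Old style: also counts trailing padding bytes of the struct. */
#define DEMO_HEADER_OLD	(sizeof(struct demo_aspath) - sizeof(unsigned char))
/* New style: exactly the offset at which data[] begins. */
#define DEMO_HEADER_SIZE	offsetof(struct demo_aspath, data)

/* Allocate a path with room for len bytes of data and copy them in. */
struct demo_aspath *
demo_aspath_get(const void *data, uint16_t len)
{
	struct demo_aspath *a;

	assert(data != NULL || len == 0);
	if ((a = malloc(DEMO_HEADER_SIZE + len)) == NULL)
		return NULL;
	a->source_as = 0;
	a->len = len;
	if (len > 0)
		memcpy(a->data, data, len);
	return a;
}
```

Using offsetof() keeps the header size and the position of data[] in agreement whatever padding the compiler adds.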
* Major refactoring of the RIB handling code. Mainly change how the RIB is | claudio | 2018-10-24 | 1 | -53/+46
  walked. rib_dump_r() is now an internal function and instead the code gets an additional callback for throttling the rib_dump code. This removes a lot of similar code used to make sure the RDE is not walking too fast and replaces it with simpler callbacks. The other big change is the removal of struct rib pointers in other data structures. The rib pointers are not stable because of a realloc() call happening when extending the array so instead use the RIB ID as a reference. Tested and OK denis@ and benno@
* Use the up_rib tree to withdraw all prefixes of a peer which is used to | claudio | 2018-10-15 | 1 | -1/+2
  reload peers into a new RIB. Removes one additional full RIB tree walker. OK benno@
* Expose BGP Origin Validation state in bgpctl show commands | job | 2018-10-01 | 1 | -6/+1
  OK denis@ claudio@
* Implement origin validation in bgpd. This introduces two new tables, the | claudio | 2018-09-29 | 1 | -25/+15
  roa-set for RPKI based origin validation and an origin-set which allows looking up a source-as / prefix pair.
  For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
  Origin sets are similar but only match when the source-as / prefix pair is valid.
    match from any origin-set ARINDB set community local-as:44
  Committing this now so that further work can be done in tree.
  OK benno@, job@
* Introduce minimal tracking of announced prefixes. A per peer RB tree tracks | claudio | 2018-09-29 | 1 | -1/+5
  which prefixes were sent out as UPDATE. At withdraw time the RB tree can be consulted to know if the withdraw actually needs to be sent to the peer. This replaces the faulty heuristic that was used before and caused either unneeded withdraws to be sent or in the worst case a failure to send a necessary withdraw resulting in stuck routes. OK benno@
* Split up as_set into a set_table and an as_set. The first is what does | claudio | 2018-09-20 | 1 | -2/+2
  the lookup and will now also be used in roa-set tries. The as_set is glue to add the name and dirty flag. Add an accessor to get the set data so that the imsg sending and printing can be moved into the right places. This is done mainly because roa-sets need similar but slightly different versions and making the code more generic is the best way of fixing this. OK benno@
* whitespace cleanup, ok claudio@ | benno | 2018-09-20 | 1 | -2/+2
* Backend for roa-sets. This combines as_sets and prefix-set tries to do | claudio | 2018-09-18 | 1 | -2/+13
  proper ROA checking. There is a new match function trie_roa_check which does a trie traversal and looks for candidates and matches. If prefix is not covered then ROA_UNKNOWN is returned, if prefix is covered by an entry it will return ROA_INVALID unless the source-as / maxlen combo is matching (ROA_VALID). OK and input sthen@
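The three-state outcome described above can be sketched with a linear scan over IPv4 entries. This is only an illustration of the decision rule: the real trie_roa_check() walks a trie, and the demo_* names, the table layout and the host-byte-order addresses are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_roa { DEMO_ROA_UNKNOWN, DEMO_ROA_INVALID, DEMO_ROA_VALID };

/* Hypothetical ROA entry for IPv4, address in host byte order. */
struct demo_roa_entry {
	uint32_t	addr;
	uint8_t		plen;		/* prefix length of the entry */
	uint8_t		maxlen;		/* longest announcement allowed */
	uint32_t	source_as;
};

/* Netmask for a prefix length of 0..32. */
static uint32_t
demo_mask(uint8_t plen)
{
	return plen == 0 ? 0 : 0xffffffffU << (32 - plen);
}

/*
 * UNKNOWN if no entry covers addr/plen, VALID if a covering entry
 * matches both source-as and maxlen, INVALID otherwise.
 */
enum demo_roa
demo_roa_check(const struct demo_roa_entry *tbl, size_t n, uint32_t addr,
    uint8_t plen, uint32_t as)
{
	enum demo_roa state = DEMO_ROA_UNKNOWN;
	uint32_t mask;
	size_t i;

	assert(plen <= 32);
	for (i = 0; i < n; i++) {
		if (tbl[i].plen > plen)
			continue;	/* more specific, cannot cover */
		mask = demo_mask(tbl[i].plen);
		if ((addr & mask) != (tbl[i].addr & mask))
			continue;	/* different network */
		state = DEMO_ROA_INVALID;	/* covered by this entry */
		if (tbl[i].source_as == as && plen <= tbl[i].maxlen)
			return DEMO_ROA_VALID;
	}
	return state;
}
```

With an entry for 193.0.0.0/21 maxlen 24 source-as 3333, an announcement of 193.0.0.0/24 from AS 3333 is valid, the same prefix from another AS or a /25 from AS 3333 is invalid, and 10.0.0.0/8 is unknown.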
* Clean up prefix flag handling. First of all the dynamic networks no longer | claudio | 2018-09-09 | 1 | -13/+8
  need this and are now treated equally to the network statement in the config. This makes bgpctl network delete <net> also remove a network which was defined in the config. While there remove the other use of flag which was done to support Adj-RIB-Out but the direction we're taking is no longer needing that. Makes code simpler again. OK benno@
* implement or-longer filter op for prefix-sets. Allows one to write rules like | benno | 2018-09-08 | 1 | -4/+4
    deny from any prefix-set mynetworks or-longer
  ok claudio, feature discussed with job and deraadt
* Implement a fast prefix-set lookup. This magic trie is able to match a | claudio | 2018-09-07 | 1 | -1/+26
  prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a prefix including prefixlen range). Every addr/plen pair is a node in the trie and the prefixlen is added as a bitmask to those nodes. For the lookup any match is OK, there is no need to do longest or best prefix matching. Inspiration for this solution comes from the way bird implements this which was done by Ondrej Zajicek santiago (at) crfreenet.org OK benno@
* Implement as-set a fast lookup table to be used instead of long list of | claudio | 2018-09-07 | 1 | -2/+2
  AS numbers in source-as, AS and transit-as filter statements. These tables use bsearch to quickly verify if an AS is in the set or not. The filter syntax is not fully set in stone yet. OK denis@ benno@ and previously OK deraadt@
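A minimal sketch of a bsearch-based AS lookup, assuming the set is a plain array of 32-bit AS numbers that is sorted once before use. The demo_* names are hypothetical and the real set_table carries more bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way compare of two AS numbers without overflow. */
static int
demo_as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once so that lookups can use binary search. */
void
demo_as_set_prepare(uint32_t *set, size_t n)
{
	if (n > 1)
		qsort(set, n, sizeof(*set), demo_as_cmp);
}

/* Returns 1 if as is in the sorted set, 0 otherwise. */
int
demo_as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	assert(set != NULL || n == 0);
	if (n == 0)
		return 0;
	return bsearch(&as, set, n, sizeof(*set), demo_as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons instead of walking a long list of AS numbers in the filter rule.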
* Update the RIB after a config reload in the background. This moves the | claudio | 2018-08-08 | 1 | -1/+3
  heavy bits into the background and so the RDE is able to process new messages more or less instantly after a configuration reload. Not all cases are covered yet but the bulk is. While the background process is running no new config can be loaded. Tested by and OK benno@
* hide rib[] internals in new rib_valid() function | benno | 2018-08-08 | 1 | -1/+9
  ok claudio@