path: root/sys/net/if_aggr.c
Commit message | Author | Age | Files | Lines
* big numbers need suffixes on some platforms. fix LACP_ADDR_SLOW_E64. | dlg | 2021-02-28 | 1 | -2/+2
    deraadt@ says i broke hppa :(
* put the mac addr into a uint64_t to compare it to the ethernet slow addr. | dlg | 2021-02-27 | 1 | -5/+9
    also do the ethertype comparison before the conversion above.
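A minimal sketch of the comparison the two commits above describe, not the driver's actual code. The names ending in `_ex` are hypothetical; the only facts assumed are that the slow protocols multicast address is 01:80:c2:00:00:02, that its ethertype is 0x8809, and that the ethertype has already been converted to host byte order.

```c
#include <stdint.h>

#define ETHER_ADDR_LEN_EX	6

/*
 * Slow protocols multicast address 01:80:c2:00:00:02 as a 48-bit value.
 * The ULL suffix states the width explicitly, which is what the later
 * commit says some platforms need.
 */
#define SLOW_ADDR_E64_EX	0x0180c2000002ULL
#define ETHERTYPE_SLOW_EX	0x8809

/* Pack six address bytes, most significant first, into a uint64_t. */
static uint64_t
ether_addr_to_e64_ex(const uint8_t addr[ETHER_ADDR_LEN_EX])
{
	uint64_t e64 = 0;
	int i;

	for (i = 0; i < ETHER_ADDR_LEN_EX; i++)
		e64 = (e64 << 8) | addr[i];

	return (e64);
}

/* Check the cheap ethertype first, then convert and compare the address. */
static int
ether_is_slow_ex(uint16_t ether_type, const uint8_t dst[ETHER_ADDR_LEN_EX])
{
	if (ether_type != ETHERTYPE_SLOW_EX)
		return (0);

	return (ether_addr_to_e64_ex(dst) == SLOW_ADDR_E64_EX);
}
```

Comparing one integer avoids a byte-by-byte comparison, and doing the ethertype test first skips the conversion for ordinary traffic.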
* aggr(4): convert ifunit() to if_unit(9) | mvs | 2021-01-19 | 1 | -16/+21
    ok dlg@
* Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp. | jan | 2020-12-12 | 1 | -2/+2
    OK dlg@, bluhm@
    No Opinion mpi@
    Not against it claudio@
* Leave default ifq_maxlen handling to ifq_init() | kn | 2020-08-21 | 1 | -2/+1
    Most clonable interface drivers (except bridge, enc, loop, pppx, switch, trunk and vlan) initialise the send queue's length to IFQ_MAXLEN during *_clone_create() even though ifq_init(), which is eventually called through if_attach(), does the same.

    Remove all early "ifq_set_maxlen(&ifq->if_snd, IFQ_MAXLEN);" lines to leave it to ifq_init() and have clonable drivers a tad more in sync.

    OK mvs
* deprecate interface input handler lists, just use one input function. | dlg | 2020-07-22 | 1 | -12/+27
    the interface input handler lists were originally set up to help us during the initial mpsafe network stack work. at the time not all the virtual ethernet interfaces (vlan, svlan, bridge, trunk, etc) were mpsafe, so we wanted a way to avoid them by default, and only take the kernel lock hit when they were specifically enabled on the interface. since then, they have been fixed up to be mpsafe.

    i could leave the list in place, but it has some semantic problems. because virtual interfaces filter packets based on the order they were attached to the parent interface, you can get packets taken away in surprising ways, especially when you reboot and netstart does something different to what you did by hand. by hardcoding the order that things like vlan and bridge get to look at packets, we can document the behaviour and get consistency.

    it also means we can get rid of a use of SRPs which were difficult to replace with SMRs. the interface input handler list is an SRPL, which we would like to deprecate. it turns out that you can sleep during stack processing, which you're not supposed to do with SRPs or SMRs, but SRPs are a lot more forgiving and it worked.

    lastly, it turns out that this code is faster than the input list handling, so lots of winning all around.

    special thanks to hrvoje popovski and aaron bieber for testing. this has been in snaps as part of a larger diff for over a week.
* Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API. | patrick | 2020-07-10 | 1 | -2/+2
    ok dlg@ tobhe@
* make ph_flowid in mbufs 16bits by storing whether it's set in csum_flags. | dlg | 2020-06-17 | 1 | -6/+4
    i've been wanting to do this for a while, and now that we've got stoeplitz and it gives us 16 bits, it seems like the right time.
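The idea in the commit above can be sketched as follows. This is an illustration only: `struct pkthdr_ex`, `CSUM_FLOWID_EX` and the helper names are hypothetical, and the flag value is an arbitrary choice, not necessarily the bit the kernel uses.

```c
#include <stdint.h>

/* Hypothetical flag bit meaning "the flowid field holds a valid value". */
#define CSUM_FLOWID_EX	0x4000u

struct pkthdr_ex {
	uint16_t flowid;	/* all 16 bits carry the flow hash */
	uint16_t csum_flags;	/* validity of flowid lives here */
};

/* Store a flowid and mark it as set. */
static void
pkt_set_flowid_ex(struct pkthdr_ex *ph, uint16_t flowid)
{
	ph->flowid = flowid;
	ph->csum_flags |= CSUM_FLOWID_EX;
}

/* Report whether a flowid has been stored, independent of its value. */
static int
pkt_has_flowid_ex(const struct pkthdr_ex *ph)
{
	return ((ph->csum_flags & CSUM_FLOWID_EX) != 0);
}
```

Because validity is tracked outside the field, a hash of 0 is a legitimate flowid and none of the 16 bits has to be reserved as a marker.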
* When the set of ports in an aggr changes, set the aggr capabilities to | jmatthew | 2020-06-02 | 1 | -13/+14
    the intersection of the capabilities of the ports, allowing use of vlan and checksum offloads if supported by all ports. Since this works the same way as updating hardmtu, do them both at the same time.

    ok dlg@
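A minimal sketch of computing the intersection in one pass over the ports. The structures, the capability bits and the `default_hardmtu` parameter are hypothetical, and treating hardmtu as the minimum across ports is an assumption; the commit only says the two updates work the same way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits standing in for the real interface flags. */
#define CAP_CSUM_IPV4_EX	0x01u
#define CAP_CSUM_TCPV4_EX	0x02u
#define CAP_VLAN_HWTAG_EX	0x04u

struct cap_port_ex {
	uint32_t capabilities;
	uint32_t hardmtu;
};

struct cap_aggr_ex {
	uint32_t capabilities;
	uint32_t hardmtu;
};

/*
 * Recompute the aggregate's capabilities and hardmtu from its ports.
 * A capability survives only if every port has it (bitwise AND), and
 * the hardmtu is assumed to be the smallest one any port supports.
 */
static void
aggr_update_caps_ex(struct cap_aggr_ex *sc, const struct cap_port_ex *ports,
    size_t nports, uint32_t default_hardmtu)
{
	uint32_t caps = ~0u;
	uint32_t hardmtu = UINT32_MAX;
	size_t i;

	if (nports == 0) {
		/* no ports: nothing can be offloaded */
		sc->capabilities = 0;
		sc->hardmtu = default_hardmtu;
		return;
	}

	for (i = 0; i < nports; i++) {
		caps &= ports[i].capabilities;
		if (ports[i].hardmtu < hardmtu)
			hardmtu = ports[i].hardmtu;
	}

	sc->capabilities = caps;
	sc->hardmtu = hardmtu;
}
```

With one port advertising all three bits and another lacking `CAP_CSUM_TCPV4_EX`, the aggregate ends up with only the two bits both ports share.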
* take NET_LOCK in aggr_clone_destroy() before calling aggr_p_dtor() | dlg | 2020-04-12 | 1 | -1/+3
    aggr_p_dtor() calls ifpromisc(), and ifpromisc() callers need to be holding NET_LOCK to make changes to if_flags and if_pcount, and before calling the interface's ioctl to apply the flag change.

    i found this while reading code with my eyes, and was able to trigger the NET_ASSERT_LOCKED in the vlan_ioctl path.

    ok visa@
* properly limit indexing into the aggr_periodic_times array. | dlg | 2020-03-11 | 1 | -3/+2
    coverity CID 1486819
    pointed out by and ok tobhe@
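A generic sketch of limiting an index against the array's own size, not the driver's actual fix. The table contents (30 seconds for the slow periodic time, 1 second for the fast one) follow the LACP specification, while the names, the index order and the fall-back to the slow entry are assumptions made for illustration.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define NITEMS_EX(a)	(sizeof(a) / sizeof((a)[0]))

/* Hypothetical table: index 0 is the slow timeout, index 1 the fast one. */
static const int periodic_times_ex[] = {
	30,	/* slow periodic time, seconds */
	1,	/* fast periodic time, seconds */
};

/*
 * Look up the periodic time for a timeout setting.  The bound comes
 * from the array itself, so adding an entry cannot leave a stale limit.
 */
static int
periodic_time_ex(unsigned int timeout)
{
	if (timeout >= NITEMS_EX(periodic_times_ex))
		timeout = 0;	/* assumed fallback: slow */

	return (periodic_times_ex[timeout]);
}
```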
* when aggr(4) comes up, check port link state to push the rxm forward. | dlg | 2019-12-23 | 1 | -3/+2
    this lets aggr come up on boot if there's a race with it being brought up and the ports being up.

    reported by holger glaess on misc@ and debugged with hrvoje popovski. tested by hrvoje popovski too.
* Add a missing unlock. | visa | 2019-12-15 | 1 | -1/+2
    Spotted by Hrvoje Popovski using witness(4)

    OK dlg@
* use sockaddr_storage to store the address used to generate mcast entries. | dlg | 2019-12-11 | 1 | -8/+11
    this means we don't truncate sockaddr_in6, which in turn means we don't end up using garbage or zeros on the underlying ports when requesting they set up hardware filters for multicast addresses.

    vlan(4) uses sockaddr_storage like this too for the same thing.

    discovered by jmatthew@ because ipv6 on top of aggr wasn't working unless tcpdump was running.
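The truncation problem can be illustrated with a userland sketch. `struct mc_entry_ex` and `mc_entry_set_ex()` are hypothetical names; the point is only that `struct sockaddr` is smaller than `struct sockaddr_in6`, while `struct sockaddr_storage` is defined to be large enough for any address family.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical multicast entry that keeps a full copy of the address. */
struct mc_entry_ex {
	struct sockaddr_storage addr;
};

/*
 * Copy an address of any family into the entry.  Returns 0 on success,
 * or -1 if the caller claims a length larger than the storage.
 */
static int
mc_entry_set_ex(struct mc_entry_ex *mc, const struct sockaddr *sa,
    socklen_t len)
{
	if (len > sizeof(mc->addr))
		return (-1);

	memset(&mc->addr, 0, sizeof(mc->addr));
	memcpy(&mc->addr, sa, len);

	return (0);
}
```

Had the field been a plain `struct sockaddr`, copying a `struct sockaddr_in6` into it would have had to cut off the tail of the IPv6 address.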
* you still need newlines when using log(9). add some errnos while here. | dlg | 2019-11-11 | 1 | -25/+34
* whitespace fixes, no functional change. | dlg | 2019-11-09 | 1 | -23/+23
* move the port destructor calls in clone destroy back out of NET_LOCK. | dlg | 2019-11-07 | 1 | -4/+5
    it's no longer necessary to hold NET_LOCK to call interface hook adds or dels now, but it is necessary not to hold NET_LOCK when calling some barrier functions.

    found by hrvoje popovski
* turn the linkstate hooks into a task list, like the detach hooks. | dlg | 2019-11-07 | 1 | -5/+5
    this is largely mechanical, except for carp. this moves the addition of the carp link state hook after we're committed to using the new interface as a carpdev. because the add can't fail, we avoid a complicated unwind dance. also, this tweaks the carp linkstate hook so it only updates the relevant carp interface, not all of the carpdevs on the parent.

    hrvoje popovski has tested an early version of this diff and it's generally ok, but there's some splasserts that this diff fires that i'll fix in an upcoming diff.

    ok claudio@
* replace the hooks used with if_detachhooks with a task list. | dlg | 2019-11-06 | 1 | -9/+9
    the main semantic change is that things registering detach hooks have to allocate and set a task structure that then gets added to the list. this means if the task is allocated up front (eg, as part of carp's softc or bridge's port structure), it avoids the possibility that adding a hook can fail. a lot of drivers weren't checking for failure, and unwinding state in the event of failure in other parts was error prone.

    while doing this i discovered that the list operations have to be in a particular order, but drivers weren't doing that consistently either. this diff wraps the list ops up so you have to seriously go out of your way to screw them up.

    i've also sprinkled some NET_ASSERT_LOCKED around the list operations so we can make sure there's no potential for the list to be corrupted, especially while it's being run.

    hrvoje popovski has tested this a bit, and some issues he discovered have been fixed.

    ok sashan@
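The reason an up-front task cannot fail to register can be shown with a small intrusive list. This is a sketch under stated assumptions, not the kernel's task API: every type and function name here is hypothetical, and locking is left out entirely.

```c
#include <stddef.h>

/* Hypothetical task: the list linkage lives inside the caller's memory. */
struct task_ex {
	struct task_ex	*next;
	void		(*fn)(void *);
	void		*arg;
};

struct task_list_ex {
	struct task_ex	*head;
};

/* Fill in a task before it is added to any list. */
static void
task_set_ex(struct task_ex *t, void (*fn)(void *), void *arg)
{
	t->next = NULL;
	t->fn = fn;
	t->arg = arg;
}

/* Adding only links existing memory, so there is no failure path. */
static void
task_list_add_ex(struct task_list_ex *tl, struct task_ex *t)
{
	t->next = tl->head;
	tl->head = t;
}

/* Unlink a task if it is on the list. */
static void
task_list_del_ex(struct task_list_ex *tl, struct task_ex *t)
{
	struct task_ex **tp;

	for (tp = &tl->head; *tp != NULL; tp = &(*tp)->next) {
		if (*tp == t) {
			*tp = t->next;
			t->next = NULL;
			return;
		}
	}
}

/* Run every task; fetch next first in case fn unlinks its own task. */
static void
task_list_run_ex(struct task_list_ex *tl)
{
	struct task_ex *t, *next;

	for (t = tl->head; t != NULL; t = next) {
		next = t->next;
		t->fn(t->arg);
	}
}

/* Hypothetical port state with the detach task embedded in it. */
struct port_softc_ex {
	int		detached;
	struct task_ex	dtask;
};

static void
port_detach_ex(void *arg)
{
	struct port_softc_ex *p = arg;

	p->detached = 1;
}
```

A driver would call `task_set_ex(&p->dtask, port_detach_ex, p)` once when the port structure is created and `task_list_add_ex()` when it attaches; since neither allocates, there is nothing to unwind.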
* try to be more compliant with the spec by implementing marker responses. | dlg | 2019-08-05 | 1 | -2/+64
    i hope, i didn't test this that hard.
* generate the actor info per port to send to userland. | dlg | 2019-07-20 | 1 | -1/+8
    useful for debugging.
* just use LINK_STATE_IS_UP to see if a port has link. | dlg | 2019-07-20 | 1 | -8/+2
    excluding HALF_DUPLEX just seems mean.
* try to notify the partner when the port is going away or down. | dlg | 2019-07-19 | 1 | -1/+16
    by notify i mean we send an lacp packet with our collecting and distributing flags cleared, which should tell the remote system that it should no longer handle packets on their port as part of their aggregation. this is implemented by "unselecting" a port.

    if an active port is going away, ie, being removed from an aggr via "ifconfig aggr0 -trunkport port0", all that happens is software state on our side changes and we stop considering the interface as part of the aggr interface. the partner system is otherwise oblivious and can continue to send us packets until its expiry timeout fires because it doesn't know any better.

    we already intercept a port's ioctl handling, so if someone goes "ifconfig portX down" while it is attached to an aggr, we can catch that before the underlying driver actually tears the rings down, and we still have a chance to try and send a packet to the peer. this is useful because our drivers generally do not drop the physical link, so again, the partner system is oblivious to the change on our side until its expiry timer fires.

    expiry timeouts can be up to 90 seconds away, which is a lot of traffic to blackhole. sending the notification to the partner means they withdraw this link at the same time the local system is pulling the port out of the aggregation. hopefully. it is possible the packet is lost, but this is a good start.

    the only caveat to this is my implementation ignores the transmit state machine from the lacp spec, and may cause more than 3 lacp packets per second to be transmitted to the partner system. oh well.

    i should look at the marker protocol too.
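A sketch of what "collecting and distributing flags cleared" means at the bit level. The bit positions follow the LACPDU state octet as defined by the standard; the macro and function names are hypothetical, and this shows only the flag manipulation, not how the driver unselects a port or builds the packet.

```c
#include <stdint.h>

/* Bits of the actor state octet carried in an LACPDU. */
#define LACP_ST_ACTIVITY_EX		0x01
#define LACP_ST_TIMEOUT_EX		0x02
#define LACP_ST_AGGREGATION_EX		0x04
#define LACP_ST_SYNC_EX			0x08
#define LACP_ST_COLLECTING_EX		0x10
#define LACP_ST_DISTRIBUTING_EX		0x20
#define LACP_ST_DEFAULTED_EX		0x40
#define LACP_ST_EXPIRED_EX		0x80

/*
 * Return the state octet to advertise when withdrawing a port: the
 * same as before, except we no longer claim to collect or distribute.
 */
static uint8_t
lacp_state_withdraw_ex(uint8_t state)
{
	return (state & (uint8_t)~(LACP_ST_COLLECTING_EX |
	    LACP_ST_DISTRIBUTING_EX));
}
```

For example, a fully active port advertising 0x3d (activity, aggregation, sync, collecting, distributing) would advertise 0x0d after the withdrawal.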
* default (ie, reset) the partner info when a port's link goes down. | dlg | 2019-07-19 | 1 | -1/+3
    this doesn't seem to be mentioned in the spec, but is a sensible thing to do if you think about it. all the switches i've tried also do this, so there's some consensus about it being sensible.

    this is done in the link state handler rather than being added to one of the state machines. the idea is to keep the state machines as close to what's in the spec as possible.
* export all the partner info to userland, not just what ifconfig prints. | dlg | 2019-07-19 | 1 | -1/+14
* make the UCT in the rxm generate debug output | dlg | 2019-07-18 | 1 | -3/+8
    without this it looks like debug output loses info because of how the uct was shortcutted.

    no functional change, just prettier printfs.
* run the selection logic from the rxm current state if the port is unselected | dlg | 2019-07-18 | 1 | -9/+6
    previously it would only run the selection logic if the peer information changed, but it is possible to be in the current state with stale partner info. that can happen if the port becomes disabled/disconnected, which unwinds the mux machine, but doesn't clear the partner info. when the link is enabled again we re-enter the current state, but because the partner info is the same we didn't run the selection logic, which in turn didn't let the mux machine move forward again.
* bulk up the debug output around selection logic | dlg | 2019-07-18 | 1 | -7/+41
    lacp didn't come up again after i replaced some optics with dacs, and it has to be because of a problem around the selection logic. this will let me narrow it down.
* replace ether_{cmp,is_eq,is_zero} with the new ones in netinet/if_ether.h | dlg | 2019-07-18 | 1 | -13/+10
    ether_cmp goes away, ether_is_eq becomes ETHER_IS_EQ, ether_is_zero becomes ETHER_IS_ANYADDR.

    ether_is_slow is kept locally, but renamed to ETHER_IS_SLOWADDR to better match what comes from if_ether.h.
* pretend to handle setting trunkproto, but only support setting it to lacp | dlg | 2019-07-05 | 1 | -1/+14
* fix the $OpenBSD$ tag | dlg | 2019-07-05 | 1 | -1/+1
* initialise sc_lacp_timeout to AGGR_LACP_TIMEOUT_SLOW, not 0; | dlg | 2019-07-05 | 1 | -1/+1
    it's the same, but there was a misleading comment on the same line which this cleans up too.
* iterate over distributing ports when populating the tx map, not all ports | dlg | 2019-07-05 | 1 | -1/+1
    this probably explains why i've seen a box decide not to use a distributing port, even though the state machine and all the lacp state flags say it's fine. it may also explain why jmatthew@ has seen a port still transmitting after it's been removed from an aggr(4).
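A simplified sketch of why the choice of list matters when building a transmit map. The structures, the fixed map size and the modulo-based selection are assumptions for illustration; the driver's real map and hashing are not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_MAP_MAX_EX	8

/* Hypothetical port: only the fields the sketch needs. */
struct tx_port_ex {
	int id;
	int distributing;	/* nonzero once the mux allows transmit */
};

struct tx_map_ex {
	const struct tx_port_ex	*ports[TX_MAP_MAX_EX];
	size_t			 nports;
};

/* Fill the map from ports that are distributing, skipping the rest. */
static void
tx_map_build_ex(struct tx_map_ex *map, const struct tx_port_ex *ports,
    size_t nports)
{
	size_t i;

	map->nports = 0;
	for (i = 0; i < nports && map->nports < TX_MAP_MAX_EX; i++) {
		if (ports[i].distributing)
			map->ports[map->nports++] = &ports[i];
	}
}

/* Pick an output port for a flow, or NULL if nothing can transmit. */
static const struct tx_port_ex *
tx_map_pick_ex(const struct tx_map_ex *map, uint16_t flowid)
{
	if (map->nports == 0)
		return (NULL);

	return (map->ports[flowid % map->nports]);
}
```

If the map were built from every attached port instead, a flow could be steered to a port that is not allowed to transmit, which matches the symptoms the commit message describes.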
* init the log of tx times to somewhere in the past when adding a port. | dlg | 2019-07-05 | 1 | -0/+5
* move a declaration before a statement. | dlg | 2019-07-05 | 1 | -1/+2
* report a port as active to userland if it is muxed | dlg | 2019-07-05 | 1 | -0/+2
* tweak mtu handling and propagate mtu setting to trunkports | dlg | 2019-07-05 | 1 | -1/+66
    make setting a trunkport's mtu to its current mtu a nop.

    set a trunkport's mtu to the aggr mtu when the port is getting added.

    set the mtu on all trunkports when the aggr mtu is set so things look consistent.

    restore a trunkport's mtu when it is removed from an aggr.

    this is mostly cosmetic since the mtu on trunkports isn't really used anywhere.
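The four rules above can be sketched with a toy port structure. All names are hypothetical, and `driver_calls` merely counts how often the underlying driver would have been asked to change the mtu.

```c
#include <stdint.h>

/* Hypothetical trunkport state for the mtu rules. */
struct mtu_port_ex {
	uint32_t mtu;		/* current mtu */
	uint32_t saved_mtu;	/* mtu the port had before joining */
	int	 driver_calls;	/* times the driver was asked to change */
};

/* Setting the mtu to its current value is a nop. */
static void
port_set_mtu_ex(struct mtu_port_ex *p, uint32_t mtu)
{
	if (p->mtu == mtu)
		return;

	p->mtu = mtu;
	p->driver_calls++;
}

/* On join: remember the old mtu, then adopt the aggr's mtu. */
static void
port_join_ex(struct mtu_port_ex *p, uint32_t aggr_mtu)
{
	p->saved_mtu = p->mtu;
	port_set_mtu_ex(p, aggr_mtu);
}

/* On leave: restore the mtu the port had before it joined. */
static void
port_leave_ex(struct mtu_port_ex *p)
{
	port_set_mtu_ex(p, p->saved_mtu);
}
```

Changing the aggr mtu would then amount to calling `port_set_mtu_ex()` on every trunkport, with ports already at that value left untouched.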
* add aggr(4), a dedicated driver that implements 802.1AX link aggregation | dlg | 2019-07-05 | 1 | -0/+2670
    802.1AX (formerly known as 802.3ad) describes the Link Aggregation Control Protocol (LACP) and how to use it in a bunch of different state machines to control when to bundle interfaces into an aggregation.

    technically the trunk(4) driver already implements support for 802.1AX, but it had a couple of problems i struggled to deal with as part of that driver. firstly, i couldn't easily make the output path in trunk mpsafe without getting bogged down, and the state machine handling had a few hard to diagnose edge cases that i couldn't figure out.

    the new driver has an mpsafe output path, and implements ifq bypass like vlan(4) does. this means output with aggr(4) is up to twice as fast as trunk(4). the implementation of the state machines as per the standard means the driver behaves more correctly in edge cases like when a physical link looks like it is up, but is logically unidirectional.

    the code has been good enough for me to use in production, but it does need more work. that can happen in tree now instead of carrying a large diff around.

    some testing by ccardenas@, hrvoje popovski, and jmatthew@

    ok deraadt@ ccardenas@ jmatthew@