path: root/sys/net/if_trunk.c
Commit message | Author | Age | Files | Lines
* trunk(4): convert ifunit to if_unit(9) | mvs | 2021-01-28 | 1 | -9/+21
| ok bluhm@
* Keep port interface UP on removal | kn | 2020-09-12 | 1 | -5/+1
| There is no reason to change flags on member interfaces when removing them; aggr(4) does not pull its members down either.
| OK florian bluhm
* Add missing `IFXF_CLONED' flag to clone interfaces. | mvs | 2020-07-28 | 1 | -1/+2
| ok mpi@
* deprecate interface input handler lists, just use one input function. | dlg | 2020-07-22 | 1 | -18/+35
| the interface input handler lists were originally set up to help us during the initial mpsafe network stack work. at the time not all the virtual ethernet interfaces (vlan, svlan, bridge, trunk, etc) were mpsafe, so we wanted a way to avoid them by default, and only take the kernel lock hit when they were specifically enabled on the interface. since then, they have been fixed up to be mpsafe.
| i could leave the list in place, but it has some semantic problems. because virtual interfaces filter packets based on the order they were attached to the parent interface, you can get packets taken away in surprising ways, especially when you reboot and netstart does something different to what you did by hand. by hardcoding the order that things like vlan and bridge get to look at packets, we can document the behaviour and get consistency.
| it also means we can get rid of a use of SRPs which were difficult to replace with SMRs. the interface input handler list is an SRPL, which we would like to deprecate. it turns out that you can sleep during stack processing, which you're not supposed to do with SRPs or SMRs, but SRPs are a lot more forgiving and it worked.
| lastly, it turns out that this code is faster than the input list handling, so lots of winning all around.
| special thanks to hrvoje popovski and aaron bieber for testing. this has been in snaps as part of a larger diff for over a week.
* Change users of IFQ_DEQUEUE(), IFQ_ENQUEUE() and IFQ_LEN() to use the "new" API. | patrick | 2020-07-10 | 1 | -2/+2
| ok dlg@ tobhe@
* make ph_flowid in mbufs 16bits by storing whether it's set in csum_flags. | dlg | 2020-06-17 | 1 | -3/+3
| i've been wanting to do this for a while, and now that we've got stoeplitz and it gives us 16 bits, it seems like the right time.
* don't limit the output queue (ifq) length to 1 anymore. | dlg | 2020-05-21 | 1 | -3/+1
| if we use the ifq to move packet processing to another context, it's too easy to fill up the one slot and cause packet loss.
| the ifq len was set to 1 to avoid delays produced by the original implementation of tx mitigation. however, trunk now introduces latency because it isn't mpsafe yet, which causes the network stack to have to take the kernel lock for each packet, and the kernel lock can be quite contended. i want to use the ifq to move the packet to the systq thread (which already has the kernel lock) before trunk is asked to transmit it.
| tested by mark patruck and myself.
* when copying capabilities from the first port to a trunk, copy hardmtu too. | dlg | 2019-12-06 | 1 | -2/+2
| previously it copied the port's if_mtu to the trunk's if_hardmtu, which makes it hard for things like vlan(4) to work with a full frame size, or large frame size.
| tested by hrvoje popovski
* turn the linkstate hooks into a task list, like the detach hooks. | dlg | 2019-11-07 | 1 | -4/+4
| this is largely mechanical, except for carp. this moves the addition of the carp link state hook after we're committed to using the new interface as a carpdev. because the add can't fail, we avoid a complicated unwind dance. also, this tweaks the carp linkstate hook so it only updates the relevant carp interface, not all of the carpdevs on the parent.
| hrvoje popovski has tested an early version of this diff and it's generally ok, but there's some splasserts that this diff fires that i'll fix in an upcoming diff.
| ok claudio@
* replace the hooks used with if_detachhooks with a task list. | dlg | 2019-11-06 | 1 | -4/+4
| the main semantic change is that things registering detach hooks have to allocate and set a task structure that then gets added to the list. this means if the task is allocated up front (eg, as part of carp's softc or bridge's port structure), it avoids the possibility that adding a hook can fail. a lot of drivers weren't checking for failure, and unwinding state in the event of failure in other parts was error prone.
| while doing this i discovered that the list operations have to be in a particular order, but drivers weren't doing that consistently either. this diff wraps the list ops up so you have to seriously go out of your way to screw them up.
| i've also sprinkled some NET_ASSERT_LOCKED around the list operations so we can make sure there's no potential for the list to be corrupted, especially while it's being run.
| hrvoje popovski has tested this a bit, and some issues he discovered have been fixed.
| ok sashan@
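The "allocate the task up front so adding a hook cannot fail" idea can be sketched in userland C as an intrusive list whose node lives inside the owner's own structure. This is only an illustration: struct hook, hook_add(), hook_run() and struct softc are hypothetical names, not the kernel's task list API, and no locking is modelled.

```c
#include <stddef.h>

/* Hypothetical hook node; the caller owns the storage. */
struct hook {
	void		(*fn)(void *);
	void		*arg;
	struct hook	*next;
};

struct hooklist {
	struct hook	*head;
};

/* Linking a caller-provided node needs no allocation, so it cannot fail. */
static void
hook_add(struct hooklist *l, struct hook *h)
{
	h->next = l->head;
	l->head = h;
}

/* Run every registered hook, e.g. when the parent interface detaches. */
static void
hook_run(struct hooklist *l)
{
	struct hook *h;

	for (h = l->head; h != NULL; h = h->next)
		h->fn(h->arg);
}

/* Hypothetical driver state with the hook embedded in it. */
struct softc {
	int		detached;
	struct hook	dtask;
};

static void
softc_detach(void *arg)
{
	struct softc *sc = arg;

	sc->detached = 1;
}
```

Because the node is part of struct softc, registration has no error path to unwind, which is the property the commit message relies on.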
* record when trunk takes over an interface by setting ac_trunkport | dlg | 2019-07-05 | 1 | -1/+9
| this will be used to prevent trunk and the upcoming aggr driver from taking ownership of an Ethernet interface at the same time.
* A trunk(4) usually stays up when the link state of one of its members changes. | florian | 2019-05-11 | 1 | -1/+3
| While we do get RTM_IFINFO messages for the (physical) member interfaces there is no indication that something changed from the trunk(4) interface. It is helpful to get this information in userland from the trunk so that userland daemons do not need to track interface membership by themselves.
| OK phessler
* tr_unit is unused, so gc it | dlg | 2019-04-29 | 1 | -2/+1
* a first cut at converting some virtual ethernet interfaces to if_vinput | dlg | 2019-04-23 | 1 | -4/+3
| this lets input processing bypass ifiqs. there's a performance benefit from this, and it will let me tweak the backpressure detection mechanism that ifiqs use without impacting on a stack of virtual interfaces.
| i've tested all of these except mpw, which i will end up testing soon anyway.
* Add administrative options to LACP trunk implementation. | ccardenas | 2018-08-12 | 1 | -1/+92
| The trunk driver now has a new ioctl (SIOCxTRUNKOPTS), which for now only has options for LACP:
| * Mode - Active or Passive (default Active)
| * Timeout - Fast or Slow (default Slow)
| * System Priority - 1(high) to 65535(low) (default 32768/0x8000)
| * Port Priority - 1(high) to 65535(low) (default 32768/0x8000)
| * IFQ Priority - 0 to NUM_QUEUES (default 6)
| At the moment, ifconfig only has options for lacpmode and lacptimeout plumbed as those are the immediate need.
| The approach taken for the options was to make them on a "trunk" vs a "port" as what's typically seen on various NOSes (JunOS, NXOS, etc...) as it's uncommon for a host to have one link "Passive" and the other "Active" in a given trunk.
| Just like on a NOS, when applying lacpmode or lacptimeout, the settings are immediately applied to all existing ports in the trunk and to all future ports brought into the trunk.
| Tested by many on a plethora of NIC drivers and switches.
| Ok remi@
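The option set and defaults listed in this commit message can be pictured as a small per-trunk structure whose mode and timeout are pushed to every port. The sketch below is a userland illustration under assumptions: struct lacp_opts, the enum names and the helper functions are hypothetical and are not the structures used by the SIOCxTRUNKOPTS ioctl. The IFQ priority upper bound is not checked because the log does not give the value of NUM_QUEUES.

```c
#include <stdint.h>

/* Hypothetical names; only the defaults and ranges come from the log. */
enum opt_mode { OPT_MODE_ACTIVE, OPT_MODE_PASSIVE };
enum opt_timeout { OPT_TIMEOUT_SLOW, OPT_TIMEOUT_FAST };

struct lacp_opts {
	enum opt_mode		mode;		/* default Active */
	enum opt_timeout	timeout;	/* default Slow */
	uint16_t		sys_prio;	/* default 0x8000 */
	uint16_t		port_prio;	/* default 0x8000 */
	uint8_t			ifq_prio;	/* default 6 */
};

static struct lacp_opts
lacp_opts_default(void)
{
	struct lacp_opts o = {
		OPT_MODE_ACTIVE, OPT_TIMEOUT_SLOW, 0x8000, 0x8000, 6
	};

	return o;
}

/* System and port priorities range from 1 (high) to 65535 (low). */
static int
lacp_prio_valid(long prio)
{
	return prio >= 1 && prio <= 65535;
}

/* Trunk-wide mode and timeout are copied to every existing port. */
static void
lacp_apply(const struct lacp_opts *trunk, struct lacp_opts *ports, int nports)
{
	int i;

	for (i = 0; i < nports; i++) {
		ports[i].mode = trunk->mode;
		ports[i].timeout = trunk->timeout;
	}
}
```

A port added later would be initialised from the same trunk-level structure, which matches the "all future ports" behaviour the message describes.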
* Remove almost unused `flags' argument of suser(). | mpi | 2018-02-19 | 1 | -4/+4
| The account flag `ASU' will no longer be set but that makes suser() mpsafe since it no longer messes with a per-process field.
| No objection from millert@, ok tedu@, bluhm@
* Creating a cloned interface could return ENOMEM due to temporary memory shortage. | bluhm | 2018-01-09 | 1 | -4/+2
| As it is invoked from a system call, it should not fail and wait instead.
| OK visa@ mpi@
* The "ret" return value is reused and overwritten, potentially returning 0 (success) on error instead of an error number. | reyk | 2017-08-14 | 1 | -8/+4
| The caller doesn't evaluate the return value, so it is good enough to return ENOBUFS (non-0) on error and to remove "ret" in trunk_cast_start().
| Coverity CID 1453105; Severity: Minor
| OK mpi@
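The class of bug described here, a status variable overwritten inside a loop so that a later success hides an earlier failure, can be reproduced in a few lines of userland C. This is a sketch of the pattern only, not the code of trunk_cast_start(): port_send() and both broadcast functions are hypothetical stand-ins.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-port send: succeeds only when the port is usable. */
static int
port_send(int usable)
{
	return usable ? 0 : ENOBUFS;
}

/* Buggy pattern: "ret" keeps only the status of the last port. */
static int
broadcast_buggy(const int *ports, size_t n)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++)
		ret = port_send(ports[i]);
	return ret;
}

/* Fixed pattern: no shared "ret"; any failure yields ENOBUFS. */
static int
broadcast_fixed(const int *ports, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (port_send(ports[i]) != 0)
			return ENOBUFS;
	}
	return 0;
}
```

With a failing port followed by a working one, the buggy version reports success while the fixed version reports ENOBUFS.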
* Remove NET_LOCK()'s argument. | mpi | 2017-08-11 | 1 | -5/+5
| Tested by Hrvoje Popovski, ok bluhm@
* Add missing NET_UNLOCK() in error path. | mpi | 2017-05-28 | 1 | -2/+4
| Spotted by sashan@
* trunk_port_destroy() needs the NET_LOCK(). | mpi | 2017-05-28 | 1 | -2/+4
| It brings the interface down and restores the original lladdr.
| Found by Hrvoje Popovski
* Remove useless splnet()/splx() dances. | mpi | 2017-05-28 | 1 | -29/+5
| Data structures modified in the ioctl path are protected by the NET_LOCK().
| ok sashan@
* move counting if_opackets next to counting if_obytes in if_enqueue. | dlg | 2017-01-22 | 1 | -4/+2
| this means packets are consistently counted in one place, unlike the many and various ways that drivers thought they should do it.
| ok mpi@ deraadt@
* Reconfigure interface capabilities after switching trunkproto; ok mpi | mikeb | 2016-09-16 | 1 | -3/+4
* We're always ready! So send IFQ_SET_READY() to the bitbucket. | mpi | 2016-04-13 | 1 | -2/+1
* Move tr_port_destroy down; fixes 'lacp_compose_key protection fault trap' when removing a port from a lacp trunk. | sthen | 2015-12-31 | 1 | -4/+4
| Part of a larger diff from mpi, as suggested by mikeb.
| ok mpi@
* don't check IFF_OACTIVE to see if the port is busy. | dlg | 2015-11-21 | 1 | -5/+1
| don't check if it's busy at all, actually.
| fine with reyk@
* don't play with IFF_OACTIVE needlessly. | dlg | 2015-11-20 | 1 | -3/+2
| only a driver sets or clears it, and trunk never sets it. therefore it never needs to clear it.
* Prefix flowid with ph_ and print it in m_print(). | mpi | 2015-11-12 | 1 | -3/+3
| ok dlg@
* arp_ifinit() is no longer required. | mpi | 2015-10-25 | 1 | -6/+2
* Make sure that when trunk_port_ioctl is called to set a new lladdr the trunk port is already on the list. | mikeb | 2015-10-08 | 1 | -5/+5
| OK mpi
* if the mbuf has a valid flowid, use it instead of using siphash24 and a bunch of header fields we have to parse the mbuf for. | dlg | 2015-10-08 | 1 | -1/+4
| siphash24 is about 20% of the cost of sending a udp packet on a trunk interface with tcpbench on my box. if there's a flowid set we get all that back.
| ok mpi@ mikeb@ sthen@
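The shortcut described here (reuse a flow identifier already attached to the packet, and only hash the headers when none is present) can be sketched as follows. Everything in the sketch is an assumption made for illustration: struct pkt and pick_port() are hypothetical, and FNV-1a is swapped in for siphash24 purely to keep the example short and self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet summary; the kernel keeps this in the mbuf header. */
struct pkt {
	int		 has_flowid;	/* nonzero if flowid is valid */
	uint16_t	 flowid;
	const uint8_t	*hdr;		/* header bytes to hash otherwise */
	size_t		 hdrlen;
};

/* FNV-1a, a stand-in for the keyed siphash24 used by the real driver. */
static uint32_t
fnv1a(const uint8_t *p, size_t n)
{
	uint32_t h = 2166136261u;

	while (n-- > 0) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/* Pick an output port index; nports must be greater than zero. */
static unsigned int
pick_port(const struct pkt *p, unsigned int nports)
{
	uint32_t h;

	if (p->has_flowid)
		h = p->flowid;		/* cheap path: no header parsing */
	else
		h = fnv1a(p->hdr, p->hdrlen);
	return h % nports;
}
```

Either path maps packets of the same flow to the same port; the flowid path simply skips the hashing cost that the commit message measures.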
* Factor LACP frame processing out to a separate task | mikeb | 2015-10-05 | 1 | -1/+2
| This is a slightly refactored version of the diff by jmatthew@ that makes use of a single per-trunk task but retains per-port mbuf queues.
| Running LACP frame processing in a task context allows a simple way to synchronize changes to the trunk ports and trunk itself performed from the ioctl, timeout and task contexts with a kernel lock.
| OK mpi
* add sizes to some of the simpler free calls | deraadt | 2015-09-29 | 1 | -11/+9
| ok mpi
* Remove "if_tp" from the "struct ifnet". | mpi | 2015-09-28 | 1 | -6/+4
| Instead of violating a layer of abstraction by keeping per pseudo-driver information in "struct ifnet", the trunk port is now passed as a cookie to the interface input handler (ifih).
| The time of per pseudo-driver hacks in the network stack is over!
| ok mikeb@
* add a comment explaining how we serialize when switching trunkproto; requested by mpi@ | mikeb | 2015-09-24 | 1 | -1/+8
* Avoid a theoretical m_pullup(9) mishandling by delegating the mbuf reclaiming to the PDU and marker input routines. | mikeb | 2015-09-24 | 1 | -2/+2
| m_pullup may return a pointer to the newly allocated mbuf. In this case m_freem is called by trunk_input, not by the proto specific code, and the pointer to the mbuf is not passed by reference. Therefore m_freem will either be called on the middle element of the chain (when the m_pullup call succeeds) or on the stale pointer (when it frees the chain in the failure case).
| Fortunately we should never hit this case as the receive path uniformly uses contiguous chunks of memory.
| Verified with and ok blambert, ok mpi
* Serialize trunk changes with input handler insertion and removal. | mikeb | 2015-09-23 | 1 | -7/+12
| This moves around calls to if_ih_insert and if_ih_remove to ensure that we either have completed port initialization or are going to tear the port configuration down and don't want any input processes to get hold of the port.
| When trunk_port_destroy is called from the ioctl this would wait for all input processes to finish and release their references to be able to disestablish the input handler and ensure full control of the port.
| When switching trunkproto it is required for the ioctl context to be able to touch all trunk ports and the protocol (tr_psc). The easiest way to do this is to disestablish all input handlers (while making sure they all complete) and then reestablish them after the trunk reconfiguration is completed. This avoids giving trunk a separate locking protocol of its own.
| ok mpi, suggested by and ok dlg
* Keep track of an active port in the failover trunk to avoid list iterations and additional locking protection in the future. | mikeb | 2015-09-23 | 1 | -22/+51
| Suggested by and ok mpi
* Remove trunk watchdog code since it doesn't do anything useful and we want to limit the number of different places where we access trunk port pointers. | mikeb | 2015-09-23 | 1 | -41/+1
| trunk_watchdog should never be called as we don't set up its if_timer and trunk_port_watchdog just calls the if_watchdog from the underlying interface.
| It's possible that this is no longer needed due to if_slowtimo/if_watchdog changes done earlier.
| ok mpi
* pass a cookie argument to interface input handlers that can be used to pass additional context or transient data with a similar lifetime. | mikeb | 2015-09-10 | 1 | -5/+5
| ok mpi, suggestions, hand holding and ok from dlg
* move the if input handler list to an SRP list. | dlg | 2015-09-10 | 1 | -4/+3
| instead of having every driver that manipulates the ifih list understand SRPLs, this moves that processing into if_ih_insert and if_ih_remove functions.
| we rely on the kernel lock to serialise the modifications to the list.
| tested by mpi@
| ok mpi@ claudio@ mikeb@
* Drop promiscuously received packets if the trunk(4) interface is not in promiscuous mode. | mpi | 2015-07-17 | 1 | -1/+16
| The long story is that claudio@ had his ssh session reset multiple times in the hackroom because czarkoff@'s machine was sending resets. We figured out that the packet was reaching pf because of this missing check. pf would then not find any state and send a reset.
| Analyzed with and ok phessler@, claudio@
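The check this commit adds can be summarised as: a unicast frame whose destination is not the trunk's own address is only accepted when the trunk itself is in promiscuous mode. The userland sketch below illustrates that rule under assumptions; should_drop() and its parameters are hypothetical names, and the exact condition in the driver may differ.

```c
#include <stdint.h>
#include <string.h>

#define ADDR_LEN	6	/* length of an Ethernet address */

/* Group (multicast/broadcast) addresses have the low bit of byte 0 set. */
static int
is_multicast(const uint8_t *addr)
{
	return addr[0] & 1;
}

/*
 * Return 1 if a received frame should be dropped: the trunk is not
 * promiscuous, the destination is unicast, and it is not our address.
 */
static int
should_drop(const uint8_t *dst, const uint8_t *self, int promisc)
{
	if (promisc || is_multicast(dst))
		return 0;
	return memcmp(dst, self, ADDR_LEN) != 0;
}
```

Frames for other hosts can still arrive because member ports may themselves receive promiscuously; dropping them at the trunk keeps them away from pf, which is the failure the message describes.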
* Unify the check for up & running between all pseudo-drivers. | mpi | 2015-07-02 | 1 | -12/+10
* By design if_input_process() needs to hold a reference on the receiving ifp in order to access its ifih handlers. | mpi | 2015-07-02 | 1 | -11/+3
| So get rid of if_get() in the various ifih handlers; we know the ifp is live at this point.
| ok dlg@
* Rename if_output() into if_enqueue() to avoid confusion with comments talking about (*ifp->if_output)(). | mpi | 2015-06-30 | 1 | -11/+7
| ok claudio@, dlg@
* count if_ibytes in if_input like we do for if_ipackets. | dlg | 2015-06-29 | 1 | -3/+1
| tweaks and ok mpi@
* Increment if_ipackets in if_input(). | mpi | 2015-06-24 | 1 | -2/+1
| Note that pseudo-drivers not using if_input() are not affected by this conversion.
| ok mikeb@, kettenis@, claudio@, dlg@
* Store a unique ID, an interface index, rather than a pointer to the receiving interface in the packet header of every mbuf. | mpi | 2015-06-16 | 1 | -3/+8
| The interface pointer should now be retrieved when necessary with if_get(). If a NULL pointer is returned by if_get(), the interface has probably been destroyed/removed and the mbuf should be freed.
| Such a mechanism will simplify garbage collection of mbufs and limit problems with dangling ifp pointers.
| Tested by jmatthew@ and krw@, discussed with many.
| ok mikeb@, bluhm@, dlg@
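The index-instead-of-pointer scheme can be illustrated with a small lookup table: the packet carries only an integer, and the lookup returns NULL once the interface is gone, at which point the packet is dropped. This is a userland sketch under assumptions; iftable, lookup_if() and deliver() are hypothetical and do not model the reference counting that if_get() involves in the kernel.

```c
#include <stddef.h>

#define MAXIF	8

/* Hypothetical interface record and index table. */
struct iface {
	unsigned int	 idx;
	const char	*name;
};

static struct iface *iftable[MAXIF];

/* Resolve an index; NULL means the interface no longer exists. */
static struct iface *
lookup_if(unsigned int idx)
{
	return idx < MAXIF ? iftable[idx] : NULL;
}

/*
 * Process a packet that recorded its receiving interface by index.
 * Returns 1 if delivered, 0 if the caller should free the packet.
 */
static int
deliver(unsigned int rcvidx)
{
	struct iface *ifp = lookup_if(rcvidx);

	if (ifp == NULL)
		return 0;	/* interface destroyed: drop, do not touch */
	return 1;
}
```

A stale index simply fails the lookup, whereas a stale pointer stored in the packet would have been dereferenced after the interface was freed.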
* Fix a double free in the destroy path triggered when a second process, in my case dhclient(8), races with ifconfig(8) to free the descriptors of the joined multicast groups. | mpi | 2015-06-15 | 1 | -8/+10
| While here reduce the difference with carp(4).
| ok dms@