| Commit message (Collapse) | Author | Age | Files | Lines |
|
ok bluhm@
|
There is no reason to change flags on member interfaces when removing
them, aggr(4) does not pull its members down either.
OK florian bluhm
|
ok mpi@
|
the interface input handler lists were originally set up to help
us during the initial mpsafe network stack work. at the time not all
the virtual ethernet interfaces (vlan, svlan, bridge, trunk, etc)
were mpsafe, so we wanted a way to avoid them by default, and only
take the kernel lock hit when they were specifically enabled on the
interface. since then, they have been fixed up to be mpsafe.
i could leave the list in place, but it has some semantic problems.
because virtual interfaces filter packets based on the order they
were attached to the parent interface, you can get packets taken
away in surprising ways, especially when you reboot and netstart
does something different to what you did by hand. by hardcoding the
order that things like vlan and bridge get to look at packets, we
can document the behaviour and get consistency.
it also means we can get rid of a use of SRPs which were difficult
to replace with SMRs. the interface input handler list is an SRPL,
which we would like to deprecate. it turns out that you can sleep
during stack processing, which you're not supposed to do with SRPs
or SMRs, but SRPs are a lot more forgiving and it worked.
lastly, it turns out that this code is faster than the input list
handling, so lots of winning all around.
special thanks to hrvoje popovski and aaron bieber for testing.
this has been in snaps as part of a larger diff for over a week.
|
"new" API.
ok dlg@ tobhe@
|
i've been wanting to do this for a while, and now that we've got
stoeplitz and it gives us 16 bits, it seems like the right time.
|
if we use the ifq to move packet processing to another context,
it's too easy to fill up the one slot and cause packet loss.
the ifq len was set to 1 to avoid delays produced by the original
implementation of tx mitigation. however, trunk now introduces
latency because it isn't mpsafe yet, which causes the network stack
to have to take the kernel lock for each packet, and the kernel
lock can be quite contended. i want to use the ifq to move the
packet to the systq thread (which already has the kernel lock)
before trunk is asked to transmit it.
tested by mark patruck and myself.
|
previously it copied the port's if_mtu to the trunk's if_hardmtu,
which makes it hard for things like vlan(4) to work with a full
frame size, or large frame size.
tested by hrvoje popovski
|
this is largely mechanical, except for carp. this moves the addition
of the carp link state hook after we're committed to using the new
interface as a carpdev. because the add can't fail, we avoid a
complicated unwind dance. also, this tweaks the carp linkstate hook
so it only updates the relevant carp interface, not all of the
carpdevs on the parent.
hrvoje popovski has tested an early version of this diff and it's
generally ok, but there's some splasserts that this diff fires that
i'll fix in an upcoming diff.
ok claudio@
|
the main semantic change is that things registering detach hooks
have to allocate and set a task structure that then gets added to
the list. this means if the task is allocated up front (eg, as part
of carp's softc or bridge's port structure), it avoids the possibility
that adding a hook can fail. a lot of drivers weren't checking for
failure, and unwinding state in the event of failure in other parts
was error prone.
while doing this i discovered that the list operations have to be
in a particular order, but drivers weren't doing that consistently
either. this diff wraps the list ops up so you have to seriously
go out of your way to screw them up.
i've also sprinkled some NET_ASSERT_LOCKED around the list operations
so we can make sure there's no potential for the list to be corrupted,
especially while it's being run.
hrvoje popovski has tested this a bit, and some issues he discovered
have been fixed.
ok sashan@
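A minimal userland sketch of the idea described above: when the hook
structure is embedded in the caller's own state, adding it to the list
is plain pointer manipulation and has no failure path. All names here
(struct hook, hook_set, hook_add, hook_del, hook_run, struct softc) are
hypothetical and are not the kernel API, which uses task structures and
holds the net lock around these operations.

```c
#include <stddef.h>

/* Hypothetical hook type; the kernel uses its own task structure instead. */
struct hook {
	struct hook	*next;
	void		(*fn)(void *);
	void		*arg;
};

struct hook_list {
	struct hook	*head;
};

/* Fill in a caller-owned hook. No allocation happens here. */
void
hook_set(struct hook *h, void (*fn)(void *), void *arg)
{
	h->next = NULL;
	h->fn = fn;
	h->arg = arg;
}

/* Link the hook at the head of the list. This cannot fail. */
void
hook_add(struct hook_list *l, struct hook *h)
{
	h->next = l->head;
	l->head = h;
}

/* Unlink the hook if it is on the list; otherwise do nothing. */
void
hook_del(struct hook_list *l, struct hook *h)
{
	struct hook **pp;

	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == h) {
			*pp = h->next;
			h->next = NULL;
			return;
		}
	}
}

/* Run every hook; fetch next first so a hook may unlink itself. */
void
hook_run(struct hook_list *l)
{
	struct hook *h, *nh;

	for (h = l->head; h != NULL; h = nh) {
		nh = h->next;
		h->fn(h->arg);
	}
}

/* Example driver state with the hook embedded, as a softc might do. */
struct softc {
	int		detached;
	struct hook	detach_hook;
};

void
softc_detach(void *arg)
{
	struct softc *sc = arg;

	sc->detached++;
}
```

Because hook_add returns void, a caller has nothing to check and
nothing to unwind, which is the property the commit message is after.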
|
this will be used to prevent trunk and the upcoming aggr driver
from taking ownership of an Ethernet interface at the same time.
|
changes. While we do get RTM_IFINFO messages for the (physical) member
interfaces there is no indication that something changed from the
trunk(4) interface.
It is helpful to get this information in userland from the trunk so that
userland daemons do not need to track interface membership by themselves.
OK phessler
|
this lets input processing bypass ifiqs. there's a performance
benefit from this, and it will let me tweak the backpressure detection
mechanism that ifiqs use without impacting on a stack of virtual
interfaces.
i've tested all of these except mpw, which i will end up testing
soon anyway.
|
The trunk driver now has a new ioctl (SIOCxTRUNKOPTS), which for now only
has options for LACP:
* Mode - Active or Passive (default Active)
* Timeout - Fast or Slow (default Slow)
* System Priority - 1(high) to 65535(low) (default 32768/0x8000)
* Port Priority - 1(high) to 65535(low) (default 32768/0x8000)
* IFQ Priority - 0 to NUM_QUEUES (default 6)
At the moment, ifconfig only has options for lacpmode and lacptimeout
plumbed as those are the immediate need.
The options are set per "trunk" rather than per "port", as is typical
on various NOSes (JunOS, NXOS, etc...), since it's uncommon for a host
to have one link "Passive" and the other "Active" in a given trunk.
Just like on a NOS, when applying lacpmode or lacptimeout, the settings
are immediately applied to all existing ports in the trunk and to all
future ports brought into the trunk.
Tested by many on a plethora of NIC drivers and switches.
Ok remi@
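A rough sketch of the per-trunk option model described above, using the
defaults listed in the message. The type and function names (struct
lacp_opts, trunk_init, trunk_add_port, trunk_set_mode) are hypothetical
and do not reflect the driver's real structures or the SIOCxTRUNKOPTS
ioctl layout.

```c
#include <stddef.h>
#include <stdint.h>

enum lacp_mode { LACP_MODE_PASSIVE = 0, LACP_MODE_ACTIVE = 1 };
enum lacp_timeout { LACP_TIMEOUT_SLOW = 0, LACP_TIMEOUT_FAST = 1 };

/* Hypothetical option block, one per trunk and copied to each port. */
struct lacp_opts {
	enum lacp_mode		mode;
	enum lacp_timeout	timeout;
	uint16_t		sys_prio;
	uint16_t		port_prio;
	uint8_t			ifq_prio;
};

#define MAX_PORTS	8

struct port {
	struct lacp_opts	opts;
};

struct trunk {
	struct lacp_opts	opts;
	struct port		ports[MAX_PORTS];
	size_t			nports;
};

/* Defaults as listed in the commit message. */
static const struct lacp_opts lacp_defaults = {
	LACP_MODE_ACTIVE, LACP_TIMEOUT_SLOW, 0x8000, 0x8000, 6
};

void
trunk_init(struct trunk *tr)
{
	tr->opts = lacp_defaults;
	tr->nports = 0;
}

/* A new port inherits whatever the trunk is currently set to. */
int
trunk_add_port(struct trunk *tr)
{
	if (tr->nports == MAX_PORTS)
		return (-1);
	tr->ports[tr->nports++].opts = tr->opts;
	return (0);
}

/* Changing the trunk setting updates all existing ports immediately. */
void
trunk_set_mode(struct trunk *tr, enum lacp_mode mode)
{
	size_t i;

	tr->opts.mode = mode;
	for (i = 0; i < tr->nports; i++)
		tr->ports[i].opts.mode = mode;
}
```

Keeping the authoritative copy on the trunk is what lets future ports
pick up the setting without the caller reapplying it.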
|
The account flag `ASU' will no longer be set but that makes suser()
mpsafe since it no longer messes with a per-process field.
No objection from millert@, ok tedu@, bluhm@
|
memory shortage. As it is invoked from a system call, it should
not fail and wait instead.
OK visa@ mpi@
|
returning 0 (success) on error instead of an error number. The caller
doesn't evaluate the return value, so it is good enough to return
ENOBUFS (non-0) on error and to remove "ret" in trunk_cast_start().
Coverity CID 1453105; Severity: Minor
OK mpi@
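To illustrate the class of bug described above, here is a small
userland sketch of a copy helper that reports allocation failure with
ENOBUFS rather than 0. The names dup_frame, alloc_fn and
alloc_always_fails are hypothetical; this is not the trunk code itself.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Allocator signature, so a failing allocator can be injected. */
typedef void *(*alloc_fn)(size_t);

/* Stands in for a memory shortage. */
void *
alloc_always_fails(size_t len)
{
	(void)len;
	return (NULL);
}

/*
 * Copy "len" bytes of "src" into a fresh buffer stored in *dstp.
 * Returns 0 on success and ENOBUFS (non-zero) when allocation fails,
 * so an error can never be mistaken for success.
 */
int
dup_frame(alloc_fn alloc, const void *src, size_t len, void **dstp)
{
	void *copy;

	copy = alloc(len);
	if (copy == NULL) {
		*dstp = NULL;
		return (ENOBUFS);
	}
	memcpy(copy, src, len);
	*dstp = copy;
	return (0);
}
```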
|
Tested by Hrvoje Popovski, ok bluhm@
|
Spotted by sashan@
|
It brings the interface down and restores the original lladdr.
Found by Hrvoje Popovski
|
Data structures modified in the ioctl path are protected by the NET_LOCK().
ok sashan@
|
this means packets are consistently counted in one place, unlike the
many and various ways that drivers thought they should do it.
ok mpi@ deraadt@
|
when removing a port from a lacp trunk. Part of a larger diff from mpi,
as suggested by mikeb. ok mpi@
|
don't check if it's busy at all, actually.
fine with reyk@
|
only a driver sets or clears it, and trunk never sets it. therefore it
never needs to clear it.
|
ok dlg@
|
lladdr the trunk port is already on the list.
OK mpi
|
and a bunch of header fields we have to parse the mbuf for.
siphash24 is about 20% of the cost of sending a udp packet on a
trunk interface with tcpbench on my box. if there's a flowid set
we get all that back.
ok mpi@ mikeb@ sthen@
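A sketch of the shortcut described above: if the packet already carries
a flow id, use it to pick a port and skip hashing the headers. The
struct pkt layout and the function names are hypothetical, and the
fallback here is a plain FNV-1a hash used only as a stand-in for the
siphash24 the message mentions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet summary; real mbufs keep this in the packet header. */
struct pkt {
	int		 has_flowid;	/* non-zero if flowid is valid */
	uint32_t	 flowid;
	const uint8_t	*hdr;		/* header bytes to hash otherwise */
	size_t		 hdrlen;
};

/* FNV-1a over the header bytes; a stand-in for the real keyed hash. */
uint32_t
hdr_hash(const uint8_t *p, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return (h);
}

/* Prefer the flow id; only parse and hash when none is set. */
uint32_t
pkt_hash(const struct pkt *p)
{
	if (p->has_flowid)
		return (p->flowid);
	return (hdr_hash(p->hdr, p->hdrlen));
}

/* Map the hash onto one of nports ports; nports must be non-zero. */
unsigned int
pick_port(const struct pkt *p, unsigned int nports)
{
	return (pkt_hash(p) % nports);
}
```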
|
This is a slightly refactored version of the diff by jmatthew@
that makes use of a single per-trunk task but retains per-port
mbuf queues.
Running LACP frame processing in a task context allows a simple
way to synchronize changes to the trunk ports and trunk itself
performed from the ioctl, timeout and task contexts with a kernel
lock.
OK mpi
|
ok mpi
|
Instead of violating a layer of abstraction by keeping per pseudo-driver
information in "struct ifnet", the port trunk is now passed as a cookie
to the interface input handler (ifih).
The time of per pseudo-driver hacks in the network stack is over!
ok mikeb@
|
requested by mpi@
|
reclaiming to the PDU and marker input routines.
m_pullup may return a pointer to a newly allocated mbuf. In this
case m_freem is called by trunk_input, not by the proto specific
code, and the pointer to the mbuf is not passed by reference. Therefore
m_freem will either be called on the middle element of the chain
(when the m_pullup call succeeds) or on the stale pointer (when it
frees the chain in the failure case). Fortunately we should never
hit this case as the receive path uniformly uses contiguous chunks
of memory.
Verified with and ok blambert, ok mpi
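The ownership rule described above can be sketched in userland: the
routine that calls the pullup is the only one that knows which pointer
is current, so it is also the one that frees it. Everything here
(struct buf, buf_pullup, pdu_input, live_bufs) is a hypothetical
stand-in for mbufs and m_pullup, not the driver code.

```c
#include <stdlib.h>

/* Count of buffers currently allocated, to make leaks visible. */
int live_bufs;

struct buf {
	unsigned char data[64];
};

struct buf *
buf_alloc(void)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (b != NULL)
		live_bufs++;
	return (b);
}

/* Free a buffer obtained from buf_alloc; b must not be NULL. */
void
buf_free(struct buf *b)
{
	free(b);
	live_bufs--;
}

/*
 * Stand-in for a pullup: always moves the data to a new buffer and
 * frees the old one. On failure the old buffer is freed too and NULL
 * is returned. "fail" forces the failure path.
 */
struct buf *
buf_pullup(struct buf *b, int fail)
{
	struct buf *nb;

	nb = fail ? NULL : buf_alloc();
	if (nb != NULL)
		*nb = *b;
	buf_free(b);
	return (nb);
}

/* The input routine owns "b" and frees whatever pointer it ends up with. */
int
pdu_input(struct buf *b, int fail)
{
	b = buf_pullup(b, fail);
	if (b == NULL)
		return (-1);	/* already freed by buf_pullup */
	/* ... parse b->data here ... */
	buf_free(b);
	return (0);
}
```

The caller hands the buffer over and never touches its own pointer
again, so neither a replaced nor a freed buffer can be freed twice.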
|
This moves around calls to if_ih_insert and if_ih_remove to ensure
that we either have completed port initialization or are going to
tear the port configuration down and don't want any input processes
to get hold of the port.
When trunk_port_destroy is called from the ioctl this would wait for
all input processes to finish and release their references to be able
to disestablish the input handler and ensure full control of the port.
When switching trunkproto it is required for the ioctl context to
be able to touch all trunk ports and the protocol (tr_psc). The
easiest way to do this is to disestablish all input handlers (while
making sure they all complete) and then reestablish them after the
trunk reconfiguration is completed.
This avoids giving trunk a separate locking protocol of its own.
ok mpi, suggested by and ok dlg
|
iterations and additional locking protection in the future.
Suggested by and ok mpi
|
and we want to limit the number of different places where we
access trunk port pointers.
trunk_watchdog should never be called as we don't set up its
if_timer and trunk_port_watchdog just calls the if_watchdog
from the underlying interface.
It's possible that this is no longer needed due to if_slowtimo/
if_watchdog changes done earlier.
ok mpi
|
to pass additional context or transient data with a similar
lifetime.
ok mpi, suggestions, hand holding and ok from dlg
|
instead of having every driver that manipulates the ifih list
understand SRPLs, this moves that processing into if_ih_insert and
if_ih_remove functions.
we rely on the kernel lock to serialise the modifications to the
list.
tested by mpi@
ok mpi@ claudio@ mikeb@
|
in promiscuous mode.
The long story is that claudio@ had his ssh session reset multiple
times in the hackroom because czarkoff@'s machine was sending reset.
We figured out that the packet was reaching pf because of this missing
check. pf would then not find any state and send a reset.
Analyzed with and ok phessler@, claudio@
|
ifp in order to access its ifih handlers.
So get rid of if_get() in the various ifih handlers, since we know the
ifp is live at this point.
ok dlg@
|
talking about (*ifp->if_output)().
ok claudio@, dlg@
|
tweaks and ok mpi@
|
Note that pseudo-drivers not using if_input() are not affected by this
conversion.
ok mikeb@, kettenis@, claudio@, dlg@
|
receiving interface in the packet header of every mbuf.
The interface pointer should now be retrieved when necessary with
if_get(). If a NULL pointer is returned by if_get(), the interface
has probably been destroyed/removed and the mbuf should be freed.
Such a mechanism will simplify garbage collection of mbufs and limit
problems with dangling ifp pointers.
Tested by jmatthew@ and krw@, discussed with many.
ok mikeb@, bluhm@, dlg@
|
in my case dhclient(8), races with ifconfig(8) to free the descriptors
of the joined multicast groups.
While here reduce the difference with carp(4).
ok dms@
|