| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
the "ports" that nvgre provides to etherbridge are ip addresses
used in the underlay network.
ok patrick@ jmatthew@
|
|
|
|
|
|
|
|
|
| |
it's pretty straightforward since etherbridge was mostly based on
this code in the first place. the etherbridge_ops that bpe provides
to etherbridge set entries up to point at mac addresses in the
underlay network.
ok patrick@ jmatthew@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this allows for the factoring out of the learning bridge code i
wrote in bpe and nvme, and should be reusable for other drivers
needing a mac learning bridge.
the core data structures are an etherbridge struct to represent the
learning bridge, eb_entry structs for each mac address entry that
the bridge knows about, and an etherbridge_ops struct that drivers
fill in so that they can use this code.
eb_entry structs are stored in a hash table made up of SMR_TAILQs
to support lookups of entries quickly and concurrently in the
forwarding path. they are also stored in a locked red-black tree
to help manage the uniqueness of the mac address in the table.
the etherbridge_ops handlers mostly deal with comparing and testing
the "ports" associated with mac address table entries. the "port"
that a mac address entry is associated with is opaque to the
etherbridge code, which allows for this code to be used by nvgre
and bpe which map mac addresses inside the bridge to addresses in
their underlay networks. it also supports traditional bridges where
"ports" are actual interfaces.
ok patrick@ jmatthew@
|
| |
|
|
|
|
|
|
| |
if_vinput requires mpsafe interface counters, so add those in. this
factors out some more code between drivers. monitor mode will work
on these interfaces now too.
|
|
|
|
|
|
|
|
| |
using if_vinput factors out a lot of repeated code between tunnel
drivers, and it means monitor mode works on gre and mgre now too.
make the l2 gre interfaces do some things in the same order while
here.
|
|
|
|
|
|
| |
if_vinput requires mpsafe interface counters, so gif is a bit more
mpsafe now than it was before. using if_vinput means monitor mode
works on gif now too.
|
|
|
|
|
|
|
|
| |
the l3 protocol input to push the packet is based on a value in
m->m_pkthdr.ph_family, which tunnel drivers should set before calling
if_vinput.
add p2p_bpf_mtap to call bpf_mtap_af also using m->m_pkthdr.ph_family.
|
|
|
|
|
|
|
| |
tun (not tap) input packets are written from userland in the same
format that it's bpf dlt is expecting, so we can push the packet
straight into bpf with bpf_mtap. this is more correct that using
bpf_mtap_ether for tun.
|
|
|
|
|
| |
call (*ifp->if_bpf_mtap) instead of bpf_mtap_ether in ifiq_input
and if_vinput.
|
|
|
|
|
|
|
|
| |
the network stack is now responsible for calling bpf for packets
that the interface receives, and we so far got away with using
bpf_mtap_ether for everything. this doesn't work if layer 3 input
goes through the same functions, so letting drivers specify the
appropriate bpf mtap function means they will be able to cope.
|
|
|
|
|
|
|
|
|
|
|
| |
an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.
ok benno@
|
| |
|
| |
|
|
|
|
|
|
|
| |
if you have multiple links to the same destination, this will let
you use them with route-to/reply-to/dup-to.
ok claudio@
|
|
|
|
|
|
|
| |
context so we always have `curproc' Also protocol control block is not
required for soreserve() so we can do it before `rop' allocation.
ok bluhm@
|
|
|
|
|
|
|
| |
table. Hence we have to grab both the pf lock and the pf state lock.
Found by dlg@
ok bluhm@ sashan@
|
|
|
|
|
|
| |
addresses that come from pf cannot be right, so remove the code.
Coverity CID 1501718
OK dlg@ claudio@
|
|
|
|
|
|
|
|
|
| |
fully initialized because we initialize `if_groups' after linking. It's
not triggered because if_attach() and if_unit(9) are serialized by
kernel lock and `ifp' is often filled by nulls. Move `if_groups'
initialization to if_attach_common() to prevent this.
ok bluhm@ claudio@ deraadt@
|
|
|
|
|
|
|
|
|
|
|
|
| |
the kernel made the unique check before trunkating with strlcpy().
So there could be two interface groups with the same name. The kif
is created by a name lookup. The trunkated names are equal, so
there was only one kif owned by both groups. When the groups got
destroyed, the single kif was removed twice from the RB tree.
Check length of group name before doing the unique check.
The empty group name was allowed and is now invalid.
Reported-by: syzbot+f47e8296ebd559f9bbff@syzkaller.appspotmail.com
OK deraadt@ gnezdo@ anton@ mvs@ claudio@
|
|
|
|
|
|
|
|
|
|
| |
pppac_ioctl() be called on dying pppac(4) interface. But now if_detach()
makes dying `ifp' inaccessible and waits for references which are in-use
in ioctl(2) path. This logic is not required anymore. Also if_detach()
was moved before klist_invalidate() to prevent the case while
pppac_qstart() bump `sc_rsel'.
ok yasuoka@
|
|
|
|
|
|
|
|
|
| |
since the actual modification of the state table is done by a call to
pf_state_insert(), which takes the pf state lock itself. Other calls
to pfsync_state_import() also only have the pf lock.
Reported-by: syzbot+d6ea8620b43dc69ecbc6@syzkaller.appspotmail.com
ok bluhm@
|
|
|
|
|
| |
Silence from the network group
ok sashan@
|
|
|
|
|
|
|
| |
a new object that is already refcounted, so carp attach does not
reach into internal structures. Add kasserts to detect counter
overflow or underflow.
OK mvs@
|
|
|
|
|
|
|
|
| |
offloading. The checksum must be calculated in software. Use the
same condition in ether_resolve() to send the broadcast packet back
to the stack and in in_ifcap_cksum() to force software checksumming.
This fixes regress/sys/kern/sosplice/loop.
OK procter@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code delivered in this change is currently disabled. Brave souls
may enable the code by adding -DWITH_PF_LOCK when building customized
kernel. Big thanks goes to Hrvoje@ for providing test equipment and
testing.
As soon as we enter the next release cycle, the WITH_PF_LOCK will be
defined as default option for MP kernels.
OK dlg@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
before this change pf_route operated on the semantic that pf runs
when packets go over an interface, so when pf_route changed which
interface the packet was on it would run pf_test again. this change
changes (restores) the semantic that pf is only supposed to run
when packets go in or out of the network stack, even if route-to
is responsibly for short circuiting past the network stack.
just to be clear, for normal packets (ie, those not touched by
route-to/reply-to/dup-to), there isn't a difference between running
pf when packets enter or leave the stack, or having pf run when a
packet goes over an interface.
the main reason for this change is that running the same packet
through pf multiple times creates confusion for the state table.
by default, pf states are floating, meaning that packets are matched
to states regardless of which interface they're going over. if a
packet leaving on em0 is rerouted out em1, both traversals will end
up using the same state, which at best will make the accounting
look weird, or at worst fail some checks in the state and get
dropped.
another reason for this commit is is to make handling of the changes
that route-to makes consistent with other changes that are made to
packet. eg, when nat is applied to a packet, we don't run pf_test
again with the new addresses.
the main caveat with this diff is you can't have one rule that
pushes a packet out a different interface, and then have a rule on
that second interface that NATs the packet. i'm not convinced this
ever worked reliably or was used much anyway, so we don't think
it's a big concern.
discussed with many, with special thanks to bluhm@, sashan@ and
sthen@ for weathering most of that pain.
ok claudio@ sashan@ jmatthew@
|
|
|
|
|
|
|
| |
Otherwise this `pxi' can be killed by concurrent thread after context
switch caused by following netlock.
ok yasuoka@
|
|
|
|
|
|
| |
OpenBSD 6.7 npppd(8) can't work over tun(4).
ok yasuoka@
|
|
|
|
| |
ok bluhm@ dlg@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this is a significant (and breaking) reworking of the policy based
routing that pf can do. the intention is to make it as easy as
nat/rdr to use, and more robust when it's operating.
the main reasons for this change are:
- route-to, reply-to, and dup-to do not work with pfsync
this is because the information about where to route-to is stored in
rules, and it is hard to have a ruleset synced between firewalls,
and impossible to have them synced 100% of the time.
- i can make my boxes panic in certain situations using route-to
yeah...
- the configuration and syntax for route-to rules are confusing.
the argument to route-to and co is an interace name with an optional
ip address. there are several problems with this. one is that people
tend to think about routing as sending packets to peers by their
address, not by the interface they're reachable on. another is that
we currently have no way to synchronise interface topology information
between firewalls, so using an interface to say where packets go
means we can't do failover of these states with pfsync. another
is that a change in routing topology means a host may become
reachable over a different interface. tying routing policy to
interfaces gets in the way of failover and load balancing.
this change does the following:
- stores the route info in the state instead of the pf rule
this allows route-to to keep working when the ruleset changes, and
allows route-to info to be sent over pfsync. there's enough spare bits
in pfsync messages that the protocol doesnt break.
the caveat is that route-to becomes tied to pass rules that create
state, like rdr-to and nat-to.
- the argument to route-to etc is a destination ip address
it's not limited to a next-hop address (thought a next-hop can be a
destination address). this allows for the failover and load balancing
referred to above.
- deprecates the address@interface host syntax in pfctl
because routing is done entirely by IPs, the interface is derived from
the route lookup, not pf. any attempt to use the @interface syntax
will fail now in all contexts.
there's enthusiasm from proctor@ jmatthew@ and others
ok sashan@ bluhm@
|
|
|
|
| |
ok bluhm@ sashan@
|
|
|
|
| |
ok bluhm@
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pfsync may want to defer the transmission of a packet. it does this so
it can try and get a state over to a peer firewall before a host may
send a reply to the peer, which would get dropped cos there's no
matching state.
i think the once rule processing should happen before that. the state
is created from the rule, whether the packet the state is for goes out
immediately or not shouldn't matter.
ok sashan@
|
|
|
|
|
|
| |
of course this is limited to the !dup-to case.
ok sashan@ bluhm@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pf_route and pf_route6 are called to take over delivery of the
packet with route-to and reply-to instead of letting it get processed
normally. for the dup-to handling, it copies the mbuf but leaves
the original mbuf in place. pf_route takes over the packet by
clearing the mbuf pointer in the pf_pdesc struct. this diff moves
the clearing of that pointer to the start of the function, rather
than checking for dup-to again on the way out of the function.
i think this is better because it means that it's more robust in
the face of future code changes. even if that's not true, it's still
shorter code in a forwarding path.
ok sashan@ jmatthew@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dup-to is kind of like what you do with a span port, but is a bit
more fine grained. it copies packets in a connection out an interface
so that connection can be monitored. it doesnt make sense for pf
to see the copied packets and try to match or create new states for
them either. at best it needs config to stop pf seeing the copies
(eg, set skip on $dup_to_tgt_if). at worst it breaks the connections
you're monitoring because the states in pf get confused.
found while discussing larger route-to changes on tech@.
ok bluhm@ sashan@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ifs = ifunit(req->ifbr_ifsname);
if (ifs == NULL) {
error = ENOENT;
break;
}
if (ifs->if_bridgeidx != ifp->if_index) {
error = ESRCH;
break;
}
bif = bridge_getbif(ifs);
This sequence repeats 8 times. Also we don't check value returned by
bridge_getbig() before use. Newly introduced bridge_getbig() function
replaces this sequence. This not only reduces duplicated code but also
makes `bif' dereference safe.
ok bluhm@
|
|
|
|
|
|
| |
Diff from Yuichiro NAITO.
ok procter
|
|
|
|
| |
ok dlg@ kn@
|
|
|
|
| |
ok claudio@ mvs@
|
|
|
|
|
|
| |
Add a NULL check to prevent crash in pflog(4) introduced in previous
commit.
Reported-by: syzbot+c6d2f2ad34b822bce98a@syzkaller.appspotmail.com
|
|
|
|
|
|
|
|
| |
rdr-to, nat-to, af-to rules. The kernel uses the information from
the packet description and fills it into the fields in the pflog
header. While doing this, it is trival to figure out whether the
packet has been rewritten.
OK sashan@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and af-to addresses and ports applied. Therefore it created a mbuf
chain on the stack with a partial copy. This is too complicated
for IP options, extension header, NAT46 af-to, and fragmented mbuf
chains. It even caused a crash in syzkaller. Usually the length
checks in pf_setup_pdesc() rejected the faked mbuf and the goto
copy logged the packet unmodified. Remove the pflog_mtap() function
and call bpf_mtap_hdr() directly. As the old buggy code was bypassed
in most cases, tcpdump(8) output of pflog does not change.
Uncondionally log the unmodified packet.
Reported-by: syzbot+947e89e06ac3fec187d0@syzkaller.appspotmail.com
OK sashan@
|
|
|
|
| |
ok dlg@
|
|
|
|
| |
ok dlg@
|
|
|
|
| |
ok dlg@ kn@
|
|
|
|
| |
ok dlg@
|
|
|
|
| |
ok dlg@
|