path: root/sys/net
Commit message | Author | Age | Files | Lines
...
* try to do a better job of filtering 802.1 reserved group addresses. | dlg | 2021-02-26 | 1 | -3/+22
| if the bridge is supposed to carry vlan packets, assume it's an s-vlan component and should allow certain group addresses to cross between "customer" bridges. i should probably let some of these groups fall back through to the calling ether_input rather than drop them.
* use uint64_ts for ethernet addresses in the src/dst bits of rules. | dlg | 2021-02-26 | 1 | -26/+26
|
* use a uint64_t for the ethernet address in the etherbridge table. | dlg | 2021-02-26 | 5 | -58/+64
| testing has shown up to a 30% improvement in the veb forwarding rate with this change. an earlier diff was tested by hrvoje popovski. tested on amd64 and sparc64
* add some helpers for working with ethernet addresses as uint64_t | dlg | 2021-02-26 | 1 | -1/+26
| the main bits are ether_addr_to_e64 and ether_e64_to_addr for loading an ethernet address into a uint64_t and vice versa. there are also some macros for testing if an address in a uint64_t is multicast, broadcast, anyaddr, or if it's an 802.1q reserved multicast group address. the reason for this functionality is that once you have an ethernet address as a uint64_t, operations like compares, bit tests, and so on are fast and easy. tested on amd64 and sparc64
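The conversion described above can be sketched roughly as follows. This is not the code from the commit: the `sketch_` names, the byte order (first address byte most significant) and the macro definitions are assumptions made for illustration only.

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_ETHER_ADDR_LEN 6

/* Pack six address bytes into the low 48 bits, first byte most significant. */
static uint64_t
sketch_addr_to_e64(const uint8_t ea[SKETCH_ETHER_ADDR_LEN])
{
	uint64_t e64 = 0;
	size_t i;

	for (i = 0; i < SKETCH_ETHER_ADDR_LEN; i++)
		e64 = (e64 << 8) | ea[i];

	return (e64);
}

/* Unpack the low 48 bits back into six bytes, last byte first. */
static void
sketch_e64_to_addr(uint8_t ea[SKETCH_ETHER_ADDR_LEN], uint64_t e64)
{
	size_t i = SKETCH_ETHER_ADDR_LEN;

	while (i-- > 0) {
		ea[i] = (uint8_t)(e64 & 0xff);
		e64 >>= 8;
	}
}

/* The group (multicast) bit is the low bit of the first address byte. */
#define SKETCH_E64_IS_MULTICAST(e64)	(((e64) & 0x010000000000ULL) != 0)
#define SKETCH_E64_IS_BROADCAST(e64)	((e64) == 0xffffffffffffULL)
#define SKETCH_E64_IS_ANYADDR(e64)	((e64) == 0x000000000000ULL)
```

With the address in one integer, each test is a single mask or compare rather than a six-byte memcmp.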
* gcc is more strict about union decls | deraadt | 2021-02-26 | 1 | -3/+3
| ok dlg
* we don't have to cast to caddr_t when calling m_copydata anymore. | dlg | 2021-02-25 | 10 | -46/+41
| the first cut of this diff was made with coccinelle using this spatch:
|
|   @rule@
|   type caddr_t;
|   expression m, off, len, cp;
|   @@
|   -m_copydata(m, off, len, (caddr_t)cp)
|   +m_copydata(m, off, len, cp)
|
| i had to fix its opinionated idea of formatting by hand though, so i'm not sure it was worth it.
|
| ok deraadt@ bluhm@
* add support for hashing 64 and 32 bit numbers in host byte order. | dlg | 2021-02-24 | 1 | -1/+17
|
* white space tweak, no functional change | dlg | 2021-02-24 | 1 | -2/+2
|
* fix stoeplitz_n16 and stoeplitz_h16 | dlg | 2021-02-24 | 1 | -3/+3
|
* whitespace tweaks, no functional change. | dlg | 2021-02-24 | 1 | -4/+4
|
* fix the length check on arp packets when handling arp filter rules. | dlg | 2021-02-24 | 1 | -2/+2
| another bridge feature i'm not convinced people actually use. ok jmatthew@ claudio@
* add support for adding and deleting mac addr entries on nvgre. | dlg | 2021-02-24 | 1 | -1/+89
| the guts of this are in the etherbridge code which i added for veb and used in bpe. there's a bit of boilerplate to make sure that the addresses used for the endpoints will work with the tunnel addresses that have been configured, but it's not too bad. again, this is hard to use because ifconfig doesn't (yet) know how to put ethernet addresses into the "add address" ioctl. these ioctls could be used for things like evpn via bgpd. not sure if that's interesting to anyone though. it would probably be more useful on vxlan interfaces.
* add support for adding and deleting address table entries. | dlg | 2021-02-24 | 1 | -1/+52
| the guts of this are in the etherbridge code which i just added for veb, so this code is very minimal. it's hard to use though cos ifconfig doesn't (yet) know how to put ethernet addresses into the "add address" ioctl.
* add support for adding and deleting address table entries. | dlg | 2021-02-24 | 3 | -3/+149
|
* handle ifconfig veb0 flush with etherbridge_flush, like bpe and nvgre | dlg | 2021-02-23 | 1 | -1/+5
|
* Wrap by netlock the whole foreach loop which calls switch_port_detach() in | mvs | 2021-02-23 | 1 | -1/+3
| switch_clone_destroy(). This fixes a netlock assertion within the underlay ifpromisc(). The problem was reported by hrvoje@ [1]. "why not" by deraadt@
|
| [1] https://marc.info/?l=openbsd-bugs&m=161338077403538&w=2
* small adjustment of the deck chairs, no functional change. | dlg | 2021-02-23 | 1 | -2/+2
|
* Use NULL instead of 0 in `m_nextpkt' assignment. | mvs | 2021-02-23 | 1 | -2/+2
| ok deraadt@ dlg@
* make a start on transparent ipsec interception, based on bridge(4). | dlg | 2021-02-23 | 1 | -1/+287
| i found the Transparent Network Security Policy Enforcement paper by angelos and jason useful for understanding the background and why you'd want to do this.
|
| the implementation is a little bit different to the bridge one because i've tweaked the order that pf and ipsec processing happens, depending on which direction the packet is going over the bridge. bridge always runs ipsec processing before pf, no matter which direction the packet is going. for packets going into veb, pf runs first and then ipsec input processing is allowed to happen. in the outgoing direction ipsec happens first and then pf. pf runs before ipsec in the inbound direction so pf can apply policy to ipsec encapsulated packets before they hit ipsec. this allows you to apply policy to both the encrypted and unencrypted packets in both directions.
|
| the code is disabled for now. this is mostly because i want veb(4) to have a good chance at operating outside the netlock, and i'm pretty sure the ipsec stack isn't ready for that yet. the other reason why it's disabled is that getting a test setup is effort, but i want to sleep.
* use the ipv6 dst addr to look up an ipsec tdb in bridge_ipsec in. | dlg | 2021-02-23 | 1 | -2/+2
| using the ipv6 next protocol header probably doesn't work. it also probably doesn't matter cos i'm not sure anyone uses this feature in bridge. or maybe there isn't anyone who uses ipv6. both are plausible options. hahaha^Wok patrick@
* use link0 to allow vlans to cross the bridge. | dlg | 2021-02-23 | 1 | -2/+2
|
* implement support for the blocknonip port flag. | dlg | 2021-02-23 | 1 | -2/+25
|
* add support for setting and getting bridge port flags. | dlg | 2021-02-23 | 1 | -1/+48
|
* filter MAC Bridge component Reserved address | dlg | 2021-02-23 | 1 | -1/+21
| i'm considering converting ethernet addresses into uint64_ts to make comparisons (and masking) easier. i'm trialling it here, and it doesn't seem like the worst.
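As an illustration of why masking gets easier, here is a minimal sketch of a reserved-address test. It assumes the destination address is already held in a uint64_t with the first address byte most significant, and it covers the IEEE 802.1D bridge range 01:80:c2:00:00:00 through 01:80:c2:00:00:0f. The names are hypothetical and the filter in the commit may cover a different set of addresses.

```c
#include <stdint.h>

/* The sixteen reserved addresses differ only in their low 4 bits. */
#define SKETCH_8021_RSVD_BASE	0x0180c2000000ULL
#define SKETCH_8021_RSVD_MASK	0xfffffffffff0ULL

/* Returns nonzero if dst falls in 01:80:c2:00:00:00 - 01:80:c2:00:00:0f. */
static int
sketch_is_8021_reserved(uint64_t dst)
{
	return ((dst & SKETCH_8021_RSVD_MASK) == SKETCH_8021_RSVD_BASE);
}
```

One mask and one compare replace a five-byte prefix comparison plus a range check on the last byte.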
* add veb(4), a Virtual Ethernet Bridge driver. | dlg | 2021-02-23 | 1 | -0/+1742
| my intention is to replace bridge(4), but the way it works is different enough from bridge that a name change is justified to distinguish them. it also makes it easier to commit it to the tree and work on it in parallel to bridge, and allows a window of migration.
|
| the main difference between veb(4) and bridge(4) is how they use interfaces as ports. veb takes over interfaces completely and only uses them to receive and transmit ethernet packets. bridge also uses each interface as a port to the ethernet segment it's connected to, but also tries to continue supporting the use of the interface as a way to talk to the network stack on the local system. supporting the use of interfaces for both external and local communication is where most of my confusion with bridge comes from, both when i'm trying to operate it and also understand the code. changing this semantic is where most of the simplification in veb comes from compared to bridge.
|
| because veb takes over interfaces, the ethernet network set up on a veb is isolated from the host network stack. by default veb does not interact with pf or the ip (and mpls) stacks. to enable pf for ip frames going over veb ports, link1 on the veb interface must be set. to have the stack interact with a veb network, vport interfaces must be created and added as ports to a veb. the vport interface driver is provided as part of veb, and is handled specially by veb. veb usually prevents the use of ports by the stack for sending and receiving packets, but that's why vports exist, so veb has special handling for them.
|
| veb already supports a lot of the other features that bridge has, including bridge rules and protected domains, but i got tired of working out of the tree and stopped implementing them. the main outstanding features are better address table management, the blocknonip flag on ports, transparent ipsec interception, and spanning tree. i may not bother with spanning tree unless someone tells me that they actually use it.
|
| the core ethernet learning bridge functionality is provided by the etherbridge code that was factored out of nvgre and bpe.
|
| veb is already (a lot) faster than bridge, and is better prepared to operate in parallel on multiple CPUs concurrently.
|
| thanks to hrvoje popovski for testing some earlier versions of this.
|
| discussed with many
| ok patrick@ jmatthew@
* When cutting off the head of an overlapping fragment during pf | bluhm | 2021-02-22 | 1 | -1/+26
| reassembly, reinsert the fragment into the lookup table with correct index. Reported-by: syzbot+d043455a5346f726f1c4@syzkaller.appspotmail.com OK claudio@
* how about sticking to standard C. | deraadt | 2021-02-21 | 1 | -2/+2
|
* cut nvgre(4) over to use common etherbridge code. | dlg | 2021-02-21 | 1 | -315/+127
| the "ports" that nvgre provides to etherbridge are ip addresses used in the underlay network. ok patrick@ jmatthew@
* cut bpe(4) over to using the common etherbridge code. | dlg | 2021-02-21 | 1 | -290/+125
| it's pretty straightforward since etherbridge was mostly based on this code in the first place. the etherbridge_ops that bpe provides to etherbridge set entries up to point at mac addresses in the underlay network. ok patrick@ jmatthew@
* add etherbridge, the guts of a learning bridge that can be reused. | dlg | 2021-02-21 | 2 | -0/+689
| this allows for the factoring out of the learning bridge code i wrote in bpe and nvgre, and should be reusable for other drivers needing a mac learning bridge.
|
| the core data structures are an etherbridge struct to represent the learning bridge, eb_entry structs for each mac address entry that the bridge knows about, and an etherbridge_ops struct that drivers fill in so that they can use this code. eb_entry structs are stored in a hash table made up of SMR_TAILQs to support lookups of entries quickly and concurrently in the forwarding path. they are also stored in a locked red-black tree to help manage the uniqueness of the mac address in the table.
|
| the etherbridge_ops handlers mostly deal with comparing and testing the "ports" associated with mac address table entries. the "port" that a mac address entry is associated with is opaque to the etherbridge code, which allows for this code to be used by nvgre and bpe which map mac addresses inside the bridge to addresses in their underlay networks. it also supports traditional bridges where "ports" are actual interfaces.
|
| ok patrick@ jmatthew@
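The shape of that design (a table keyed by MAC address, opaque ports, and driver-supplied handlers) can be sketched as below. This is a heavily simplified illustration, not the etherbridge code: it uses plain singly linked buckets instead of SMR_TAILQs, has no red-black tree, no locking and no entry expiry, and every `sketch_` name and the hash function are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_EB_BUCKETS 64

/* Driver-supplied handlers; the table never looks inside a port. */
struct sketch_eb_ops {
	int	(*eq)(void *arg, void *port_a, void *port_b);
};

struct sketch_eb_entry {
	struct sketch_eb_entry	*next;
	uint64_t		 addr;	/* MAC address in the low 48 bits */
	void			*port;	/* opaque to the table */
};

struct sketch_eb {
	const struct sketch_eb_ops	*ops;
	void				*arg;	/* passed back to ops */
	struct sketch_eb_entry		*buckets[SKETCH_EB_BUCKETS];
};

/* Placeholder hash that folds the address bits; chosen for brevity only. */
static unsigned int
sketch_eb_hash(uint64_t addr)
{
	addr ^= addr >> 24;
	addr ^= addr >> 12;
	return ((unsigned int)(addr % SKETCH_EB_BUCKETS));
}

/* Look up the port a MAC address was last learned on, or NULL. */
static void *
sketch_eb_resolve(struct sketch_eb *eb, uint64_t addr)
{
	struct sketch_eb_entry *e;

	for (e = eb->buckets[sketch_eb_hash(addr)]; e != NULL; e = e->next) {
		if (e->addr == addr)
			return (e->port);
	}
	return (NULL);
}

/* Learn or move a source address. Returns 0, or -1 if malloc fails. */
static int
sketch_eb_learn(struct sketch_eb *eb, uint64_t addr, void *port)
{
	struct sketch_eb_entry **head = &eb->buckets[sketch_eb_hash(addr)];
	struct sketch_eb_entry *e;

	for (e = *head; e != NULL; e = e->next) {
		if (e->addr != addr)
			continue;
		if (!eb->ops->eq(eb->arg, e->port, port))
			e->port = port;	/* the address moved to a new port */
		return (0);
	}

	e = malloc(sizeof(*e));
	if (e == NULL)
		return (-1);
	e->addr = addr;
	e->port = port;
	e->next = *head;
	*head = e;
	return (0);
}

/* Example ops for a traditional bridge where ports are plain pointers. */
static int
sketch_ptr_eq(void *arg, void *a, void *b)
{
	(void)arg;
	return (a == b);
}

static const struct sketch_eb_ops sketch_ptr_ops = { sketch_ptr_eq };
```

Because the table only ever compares ports through the ops handler, a tunnel driver could store a pointer to an underlay address in `port` and supply a comparison that looks at the address contents instead of the pointer value.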
* add stoeplitz_eaddr, for getting a hash value from an ethernet address. | dlg | 2021-02-21 | 2 | -2/+16
|
* move from calling l3 protocol input handlers to using if_vinput. | dlg | 2021-02-20 | 2 | -40/+14
| if_vinput requires mpsafe interface counters, so add those in. this factors out some more code between drivers. monitor mode will work on these interfaces now too.
* move gre and mgre from calling l3 input handlers to using if_vinput. | dlg | 2021-02-20 | 1 | -46/+11
| using if_vinput factors out a lot of repeated code between tunnel drivers, and it means monitor mode works on gre and mgre now too. make the l2 gre interfaces do some things in the same order while here.
* move gif from calling l3 protocol input handlers to using if_vinput. | dlg | 2021-02-20 | 1 | -25/+5
| if_vinput requires mpsafe interface counters, so gif is a bit more mpsafe now than it was before. using if_vinput means monitor mode works on gif now too.
* add p2p_input, like ether_input but for l3 tunnel interfaces. | dlg | 2021-02-20 | 2 | -2/+44
| the l3 protocol input to push the packet to is chosen based on a value in m->m_pkthdr.ph_family, which tunnel drivers should set before calling if_vinput. add p2p_bpf_mtap to call bpf_mtap_af also using m->m_pkthdr.ph_family.
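The dispatch described above amounts to a switch on the address family carried with the packet. The following is a rough sketch only: `struct sketch_pkt` is a hypothetical stand-in for the mbuf packet header, the verdict enum replaces the real protocol input calls, and the set of families the real p2p_input accepts is not stated in the commit message.

```c
#include <sys/socket.h>	/* AF_INET, AF_INET6, AF_UNSPEC */

/* Hypothetical stand-in for the family field of a packet header. */
struct sketch_pkt {
	int	family;		/* set by the tunnel driver before input */
};

enum sketch_verdict {
	SKETCH_DROP,
	SKETCH_TO_IPV4,
	SKETCH_TO_IPV6
};

/* Pick a layer 3 input path from the family the driver recorded. */
static enum sketch_verdict
sketch_p2p_input(const struct sketch_pkt *pkt)
{
	switch (pkt->family) {
	case AF_INET:
		return (SKETCH_TO_IPV4);
	case AF_INET6:
		return (SKETCH_TO_IPV6);
	default:
		return (SKETCH_DROP);	/* unknown family */
	}
}
```

Keeping the family in the packet header means each tunnel driver only has to classify the payload once, and the shared input function does the rest.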
* let tun use bpf_mtap for handling input packets. | dlg | 2021-02-20 | 1 | -1/+4
| tun (not tap) input packets are written from userland in the same format that its bpf dlt is expecting, so we can push the packet straight into bpf with bpf_mtap. this is more correct than using bpf_mtap_ether for tun.
* default interfaces to bpf_mtap_ether for their if_bpf_mtap handler. | dlg | 2021-02-20 | 2 | -4/+8
| call (*ifp->if_bpf_mtap) instead of bpf_mtap_ether in ifiq_input and if_vinput.
* give interfaces an if_bpf_mtap handler. | dlg | 2021-02-20 | 1 | -1/+2
| the network stack is now responsible for calling bpf for packets that the interface receives, and so far we got away with using bpf_mtap_ether for everything. this doesn't work if layer 3 input goes through the same functions, so letting drivers specify the appropriate bpf mtap function means they will be able to cope.
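A per-interface handler of this kind is a function pointer with a default. The sketch below illustrates the pattern under assumed names; the taps return a marker for the framing they assume rather than doing any real bpf work, and none of the `sketch_` identifiers come from the tree.

```c
#include <stddef.h>

struct sketch_pkt;	/* opaque packet, never dereferenced here */

/* Which framing a tap assumed; a stand-in for real bpf processing. */
enum sketch_framing { SKETCH_NONE, SKETCH_ETHER, SKETCH_AF };

typedef enum sketch_framing (*sketch_mtap_fn)(const struct sketch_pkt *);

struct sketch_ifnet {
	int		has_listener;	/* nonzero if a capture is attached */
	sketch_mtap_fn	bpf_mtap;	/* per-interface tap handler */
};

/* Tap for interfaces whose packets start with an Ethernet header. */
static enum sketch_framing
sketch_mtap_ether(const struct sketch_pkt *pkt)
{
	(void)pkt;
	return (SKETCH_ETHER);
}

/* Tap for layer 3 interfaces that prepend an address family instead. */
static enum sketch_framing
sketch_mtap_af(const struct sketch_pkt *pkt)
{
	(void)pkt;
	return (SKETCH_AF);
}

/* Attach: interfaces that did not pick a handler get the Ethernet one. */
static void
sketch_if_attach(struct sketch_ifnet *ifp)
{
	if (ifp->bpf_mtap == NULL)
		ifp->bpf_mtap = sketch_mtap_ether;
}

/* Shared input path: call through the pointer, not a fixed function. */
static enum sketch_framing
sketch_if_input_tap(struct sketch_ifnet *ifp, const struct sketch_pkt *pkt)
{
	if (!ifp->has_listener)
		return (SKETCH_NONE);
	return ((*ifp->bpf_mtap)(pkt));
}
```

The default keeps existing Ethernet drivers unchanged, while a tunnel driver sets its own handler before attaching.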
* add a MONITOR flag to ifaces to say they're only used for watching packets. | dlg | 2021-02-20 | 3 | -8/+12
| an example use of this is when you have a span port on a switch and you want to be able to see the packets coming out of it with tcpdump, but do not want these packets to enter the network stack for processing. this is particularly important if the span port is pushing a copy of any packets related to the machine doing the monitoring as it will confuse pf states and the stack. ok benno@
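The effect of such a flag on the input path can be sketched in a few lines: the capture tap still sees every packet, and the flag stops the packet before it reaches the stack. The flag value, struct and counters below are hypothetical and exist only to make the ordering visible.

```c
#define SKETCH_IFXF_MONITOR	0x1	/* hypothetical flag value */

struct sketch_mon_if {
	unsigned int	xflags;
	unsigned int	tapped;		/* packets shown to capture */
	unsigned int	delivered;	/* packets passed to the stack */
};

/* Input path for one packet: tap first, then decide whether to deliver. */
static void
sketch_mon_input(struct sketch_mon_if *ifp)
{
	ifp->tapped++;			/* tcpdump always sees the packet */

	if (ifp->xflags & SKETCH_IFXF_MONITOR)
		return;			/* monitor only: drop before the stack */

	ifp->delivered++;
}
```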
* we dont need to wrap some short lines. | dlg | 2021-02-19 | 1 | -5/+3
|
* check the state for PF_ROUTE when undeferring a packet, not the rule. | dlg | 2021-02-19 | 1 | -2/+2
|
* use rtalloc_mpath in pf_route and pf_route6. | dlg | 2021-02-16 | 1 | -3/+4
| if you have multiple links to the same destination, this will let you use them with route-to/reply-to/dup-to. ok claudio@
* Simplify error path in route_attach(). We always call it in thread | mvs | 2021-02-15 | 1 | -10/+4
| context so we always have `curproc'. Also protocol control block is not required for soreserve() so we can do it before `rop' allocation. ok bluhm@
* pf_remove_divert_state() is an entry point into pf, modifying the pf state | patrick | 2021-02-12 | 1 | -1/+7
| table. Hence we have to grab both the pf lock and the pf state lock. Found by dlg@ ok bluhm@ sashan@
* Fix null pointer dereference in pf_route6(). Embedding scope into | bluhm | 2021-02-12 | 1 | -3/+1
| addresses that come from pf cannot be right, so remove the code. Coverity CID 1501718 OK dlg@ claudio@
* We link `ifp' to `if_list' before we perform if_attachsetup(). It is not | mvs | 2021-02-11 | 1 | -3/+2
| fully initialized because we initialize `if_groups' after linking. It's not triggered because if_attach() and if_unit(9) are serialized by kernel lock and `ifp' is often filled by nulls. Move `if_groups' initialization to if_attach_common() to prevent this. ok bluhm@ claudio@ deraadt@
* Interface group names must fit into IFNAMSIZ and be unique. But | bluhm | 2021-02-10 | 1 | -3/+5
| the kernel made the unique check before truncating with strlcpy(). So there could be two interface groups with the same name. The kif is created by a name lookup. The truncated names are equal, so there was only one kif owned by both groups. When the groups got destroyed, the single kif was removed twice from the RB tree. Check length of group name before doing the unique check. The empty group name was allowed and is now invalid. Reported-by: syzbot+f47e8296ebd559f9bbff@syzkaller.appspotmail.com OK deraadt@ gnezdo@ anton@ mvs@ claudio@
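The ordering fix can be illustrated with a small sketch: validate the length first, so the name compared in the uniqueness check is exactly the name that gets stored. The table layout, the `sketch_` names and the value 16 for the name buffer are assumptions for illustration; this is not the kernel's group code.

```c
#include <string.h>

#define SKETCH_IFNAMSIZ 16	/* assumed buffer size, including the NUL */

struct sketch_group {
	char	name[SKETCH_IFNAMSIZ];
};

/* Add a group name. Returns 0 if added, -1 if invalid, duplicate or full. */
static int
sketch_addgroup(struct sketch_group *tbl, size_t *n, size_t cap,
    const char *name)
{
	size_t len = strlen(name);
	size_t i;

	/* Reject empty and over-long names before the uniqueness check. */
	if (len == 0 || len >= SKETCH_IFNAMSIZ)
		return (-1);

	/* The stored name is never truncated, so this compare is exact. */
	for (i = 0; i < *n; i++) {
		if (strcmp(tbl[i].name, name) == 0)
			return (-1);
	}

	if (*n == cap)
		return (-1);

	memcpy(tbl[*n].name, name, len + 1);
	(*n)++;
	return (0);
}
```

If the length check came after the uniqueness check, two distinct long names sharing a 15-byte prefix would both pass the check and then be stored as the same truncated string.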
* Remove `sc_dead' logic from pppac(4). It was used to prevent | mvs | 2021-02-10 | 1 | -9/+3
| pppac_ioctl() from being called on a dying pppac(4) interface. But now if_detach() makes a dying `ifp' inaccessible and waits for references which are in use in the ioctl(2) path. This logic is not required anymore. Also if_detach() was moved before klist_invalidate() to prevent the case where pppac_qstart() bumps `sc_rsel'. ok yasuoka@
* pfsync_state_import() must not be called with the pf state lock held, | patrick | 2021-02-09 | 1 | -3/+1
| since the actual modification of the state table is done by a call to pf_state_insert(), which takes the pf state lock itself. Other calls to pfsync_state_import() also only have the pf lock. Reported-by: syzbot+d6ea8620b43dc69ecbc6@syzkaller.appspotmail.com ok bluhm@
* Activate use of PF_LOCK() by removing the WITH_PF_LOCK ifdefs. | patrick | 2021-02-09 | 4 | -40/+4
| Silence from the network group ok sashan@