| Commit message (Collapse) | Author | Age | Files | Lines |
| | |
|
| |
|
|
|
| |
| | |
|
| |
|
|
|
|
|
|
|
| |
this is so protocols (eg, udp) can let things (eg, kernel support
for wireguard or vxlan or geneve) look at and possibly steal packets
before they get added to a socket buffer.
i wrote the original version of this, but it was tweaked by Matt
Dunwoodie and Jason A. Donenfeld for use with wireguard.
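
The "look at and possibly steal" behaviour described above can be sketched in userland C. This is only an illustration under stated assumptions: `struct pkt`, `struct sock`, `upcall_fn`, `sock_deliver()` and `steal_tagged()` are hypothetical names invented for this sketch and are not the kernel's actual types or functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel packet buffer. */
struct pkt {
	const unsigned char *data;
	size_t len;
};

/*
 * Hypothetical hook type: return NULL if the hook consumed ("stole")
 * the packet, or the packet itself to let normal delivery continue.
 */
typedef struct pkt *(*upcall_fn)(void *arg, struct pkt *p);

/* Hypothetical socket with an optional hook and a queue counter. */
struct sock {
	upcall_fn upcall;	/* may be NULL */
	void *upcall_arg;
	size_t queued;		/* packets appended to the socket buffer */
};

/* Give the hook first refusal, then queue whatever it hands back. */
static int
sock_deliver(struct sock *so, struct pkt *p)
{
	if (so->upcall != NULL) {
		p = so->upcall(so->upcall_arg, p);
		if (p == NULL)
			return 0;	/* stolen before reaching the buffer */
	}
	so->queued++;
	return 1;
}

/* Example hook: steal packets whose first byte matches a tag. */
static struct pkt *
steal_tagged(void *arg, struct pkt *p)
{
	unsigned char tag = *(unsigned char *)arg;

	if (p->len > 0 && p->data[0] == tag)
		return NULL;
	return p;
}
```

A tunnel-like consumer would register something like `steal_tagged()` on its socket; packets it does not claim still reach the socket buffer as before.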
|
| |
|
|
|
|
|
|
|
| |
avoidance. The problem and fix are noted in RFC5681 section 3.1, page 7.
Report, diff and testing from Brian Brombacher, thanks!
Testing and a cosmetic tweak by myself.
ok claudio
|
| |
|
|
|
|
|
|
| |
Prevent a panic in syn_cache_insert() found by syzbot.
Reported-by: syzbot+aee24ad9b7bf5665912d@syzkaller.appspotmail.com
ok sashan@, anton@, millert@
|
| |
|
|
|
|
|
|
|
| |
address. In that case, the linking to the pf state must be dissolved
as the latter still contains the old address. If it is a divert
state, also remove the state as any divert state must be associated
with a matching socket. Call pf_remove_divert_state() and
pf_inp_unlink() from in_pcbconnect().
reported by Tim Kuijsten; OK sashan@ claudio@
|
| |
|
|
|
|
|
|
|
|
| |
Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.
dlg@ said that comments should be good enough.
ok sashan@
|
| |
|
|
|
|
|
| |
these packets have generally already been counted on the interface
because that's where they were sent or received from. the protocol
handling side of things already counts things like packets, which
you see with netstat -sp carp.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
this is modelled on vlan_transmit, and basically enqueues the packet
directly on the parent interface.
even though carp is generally not used to transmit packets, we run
dhcp relays on it at work and hit a situation where we unnecessarily
dropped packets because its ifq maxlen was 1. i've been running
this for a month in production.
ok jmatthew@
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.
The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.
The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.
Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.
As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.
discussed with chris@ and kn@
ok markus@, patrick@
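
The rule "if 'tdb_rdomain_post' differs from 'tdb_rdomain', move the packet" can be sketched as follows. Only the two field names come from the message above; `struct tdb_sketch`, `struct pkt_sketch` and `ipsec_move_rdomain()` are hypothetical and do not reflect the kernel's real structures.

```c
/* Hypothetical, reduced view of an SA: only the two rdomain fields. */
struct tdb_sketch {
	unsigned int tdb_rdomain;	/* rdomain the SA is installed in */
	unsigned int tdb_rdomain_post;	/* rdomain after IPsec processing */
};

/* Hypothetical packet carrying the rdomain it currently belongs to. */
struct pkt_sketch {
	unsigned int rdomain;
};

/*
 * After encryption or decryption, move the packet to the "post"
 * rdomain if it differs from the SA's own rdomain.
 * Returns 1 if the packet was moved, 0 otherwise.
 */
static int
ipsec_move_rdomain(const struct tdb_sketch *tdb, struct pkt_sketch *p)
{
	if (tdb->tdb_rdomain_post != tdb->tdb_rdomain) {
		p->rdomain = tdb->tdb_rdomain_post;
		return 1;
	}
	return 0;
}
```

With equal values the packet stays where it is, which matches the behaviour of an SA that does not use the new extension.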
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Prevent concurrency in the socket layer which is not ready for that.
Two recent data corruptions in pfsync(4) and the socket layer pointed
out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until
we find a way in software to avoid future mistakes and to make sure that
only the softnet thread and some ioctls are safe to use a read version
of the lock, put everything back to the exclusive version.
ok stsp@, visa@
|
| |
|
|
|
|
|
|
|
|
| |
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.
This fixes a panic seen by jcs@.
OK mpi@
|
| |
|
|
| |
ok bluhm@
|
| |
|
|
| |
in RFC8622; ok job@
|
| |
|
|
|
|
| |
IP forwarding is disabled. Issue reported by Daniel Jakots (danj@)
OK bluhm@
|
| |
|
|
|
|
|
| |
We only install flows for IPcomp. When processing an incoming ESP SA,
look for a bundled IPcomp SA and use that in the policy check.
ok bluhm@
|
| | |
|
| |
|
|
|
|
|
|
|
| |
where such packet is bound to. This check is enforced if and only if
IP forwarding is disabled.
Change discussed with bluhm@, claudio@, deraadt@, markus@, tobhe@
OK bluhm@, claudio@, tobhe@
|
| |
|
|
| |
ok bluhm@
|
| |
|
|
|
| |
Same fix as for the IPv6 case. Fixes a regression in ports/net/openvpn
spotted by landry@, ok bluhm@
|
| |
|
|
|
|
| |
isakmpd and iked to REQUIRE. Filter policy violations earlier.
ok sashan@ bluhm@
|
| |
|
|
|
| |
netmask in the kernel.
OK visa@
|
| |
|
|
|
|
| |
unfiltered in the future, so this prevents rresvport_af(3) from randomly
exposing a service intended for local visibility only.
ok florian
|
| |
|
|
|
|
| |
tp->snd_wnd. This can happen, for example, when the remote side
responds to a window probe by ACKing the one byte it contains.
from FreeBSD; via markus@; OK sashan@ tobhe@
|
| |
|
|
|
|
| |
ifpromisc() already refcounts, so carp doesn't have to do it
implicitly with the carpdev list. there's no functional change, the
code just gets a bit simpler.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this follows what's been done for detach and link state hooks, and
makes handling of hooks generally more robust.
address hooks are a bit different to detach/link state hooks in
that there's only a few things that register hooks (carp, pf, vxlan),
but a lot of places to run the hooks (lots of ipv4 and ipv6 address
configuration).
an address hook cookie was in struct pfi_kif, which is part of the
pf abi. rather than break pfctl -sI, this maintains the void * used
for the cookie and uses it to store a task, which is then used as
intended with the new api.
|
| |
|
|
|
|
|
|
|
|
| |
SIOCGIFADDR, SIOCGIFNETMASK, SIOCGIFDSTADDR, SIOCGIFBRDADDR,
SIOCSIFADDR, SIOCSIFNETMASK, SIOCSIFDSTADDR, and SIOCSIFBRDADDR.
Name in_ioctl_set_ifaddr() consistently. Use in_sa2sin() to validate
inet address. Combine if_addrlist loops and add comment. Although
netmask is not an inet address, length must be valid.
Reported-by: syzbot+5fc6da002fc4e8d994be@syzkaller.appspotmail.com
OK visa@
|
| |
|
|
|
|
| |
making RTM_INVALIDATE code path perform same check as RTM_DELETE does.
ok mpi@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this is largely mechanical, except for carp. this moves the addition
of the carp link state hook after we're committed to using the new
interface as a carpdev. because the add can't fail, we avoid a
complicated unwind dance. also, this tweaks the carp linkstate hook
so it only updates the relevant carp interface, not all of the
carpdevs on the parent.
hrvoje popovski has tested an early version of this diff and it's
generally ok, but there's some splasserts that this diff fires that
i'll fix in an upcoming diff.
ok claudio@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the main semantic change is that things registering detach hooks
have to allocate and set a task structure that then gets added to
the list. this means if the task is allocated up front (eg, as part
of carp's softc or bridge's port structure), it avoids the possibility
that adding a hook can fail. a lot of drivers weren't checking for
failure, and unwinding state in the event of failure in other parts
was error prone.
while doing this i discovered that the list operations have to be
in a particular order, but drivers weren't doing that consistently
either. this diff wraps the list ops up so you have to seriously
go out of your way to screw them up.
i've also sprinkled some NET_ASSERT_LOCKED around the list operations
so we can make sure there's no potential for the list to be corrupted,
especially while it's being run.
hrvoje popovski has tested this a bit, and some issues he discovered
have been fixed.
ok sashan@
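
A minimal sketch of the idea that a hook embedded in the driver's own state cannot fail to be added. All names here (`struct hook`, `hook_set()`, `hook_add()`, `hook_del()`, `hook_run()`, `struct softc_sketch`, `mark_detached()`) are hypothetical; the kernel uses its own task API and locking, which this userland illustration leaves out.

```c
#include <stddef.h>

/* Hypothetical hook: storage is owned by whoever registers it. */
struct hook {
	void (*fn)(void *);
	void *arg;
	struct hook *next;
};

struct hook_list {
	struct hook *head;
};

/* Fill in a caller-provided hook; no allocation, so no failure path. */
static void
hook_set(struct hook *h, void (*fn)(void *), void *arg)
{
	h->fn = fn;
	h->arg = arg;
	h->next = NULL;
}

/* Adding only links pointers, so it cannot fail either. */
static void
hook_add(struct hook_list *l, struct hook *h)
{
	h->next = l->head;
	l->head = h;
}

static void
hook_del(struct hook_list *l, struct hook *h)
{
	struct hook **pp;

	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == h) {
			*pp = h->next;
			h->next = NULL;
			return;
		}
	}
}

/* Run every hook; fetch next first so a hook may remove itself. */
static void
hook_run(struct hook_list *l)
{
	struct hook *h, *next;

	for (h = l->head; h != NULL; h = next) {
		next = h->next;
		h->fn(h->arg);
	}
}

/* Hypothetical driver state with the hook allocated up front. */
struct softc_sketch {
	int detached;
	struct hook dh;
};

static void
mark_detached(void *arg)
{
	struct softc_sketch *sc = arg;

	sc->detached++;
}
```

Because `dh` lives inside the driver structure, registration has no error path to unwind, which is the property the message above is after.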
|
| |
|
|
|
|
|
| |
no one seems to use it, and we should not encourage people to use
it by having it available. it's been disabled for most of the last
release and no one's asked for it in 6.6, so i'm taking that as an
ok for this removal.
|
| | |
|
| |
|
|
|
| |
please don't interpret this as an intention on my part to implement
UDP-Lite.
|
| |
|
|
|
|
| |
Fix the SIOCAIFADDR and SIOCDIFADDR ioctl(2) by implementing
in_sa2sin() to validate inet address family and address length.
OK visa@
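
The kind of check described above, validating address family and length before treating a generic sockaddr as an inet address, might look like this in portable userland C. `sa2sin_checked()` is a hypothetical helper written for this sketch; it is not the kernel's in_sa2sin(), and it takes an explicit length argument because not every platform has a length field in struct sockaddr.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: validate length and family, then hand back a
 * typed pointer. Returns 0 on success or an errno value on failure.
 * The length is checked first so sa_family is never read from a
 * buffer that is too short.
 */
static int
sa2sin_checked(const struct sockaddr *sa, socklen_t len,
    const struct sockaddr_in **sinp)
{
	if (sa == NULL || len != sizeof(struct sockaddr_in))
		return EINVAL;
	if (sa->sa_family != AF_INET)
		return EAFNOSUPPORT;
	*sinp = (const struct sockaddr_in *)sa;
	return 0;
}
```

An ioctl-style caller would run this on user-supplied addresses and bail out with the returned error before touching any address fields.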
|
| |
|
|
|
|
| |
this also brings them in line with the AF_INET equivalents.
ok visa@ bluhm@
|
| |
|
|
| |
ok cheloha@, visa@
|
| |
|
|
| |
ok jca@ deraadt@ claudio@ visa@
|
| |
|
|
|
|
| |
ip_ether.h is where netinet/ip_ipip.h got the forward declaration
for struct tdb from though, so fix that before cutting ip_ether.h
out of gif.
|
| | |
|
| |
|
|
|
|
|
|
| |
it was previously (ab)used by pflog, which has since been fixed.
apart from that nothing else used it, so we can trim the cruft.
ok kn@ claudio@ visa@
visa@ also made sure i fixed ipw(4) so i386 won't break.
|
| |
|
|
|
|
|
|
|
|
|
| |
out of the rtable_walk(). This avoids recursion to prevent stack
overflow. Also it allows freeing the route outside of the walk.
Now mrt_mcast_del() frees the route only when it is deleted from
the routing table. If that fails, it must not be freed. After the
route is returned by mfc_find(), it is reference counted. Then we
need a rtfree(), but not in the other cases.
Move rt_timer_remove_all() into rt_mcast_del().
OK mpi@
|
| |
|
|
|
|
|
|
|
|
|
| |
introduced a queue to grab the lock for multiple packets. Now we
have only netlock for both IP and protocol input. So the queue is
not necessary anymore. It just switches CPU and decreases performance.
So remove the inet and inet6 ip queue for local packets.
To get TCP running on loopback, we have to queue once between TCP
input and output of the two sockets. So use the loopback queue in
looutput() unconditionally.
OK visa@
|
| |
|
|
|
|
| |
ifconfig set/unset it.
ok deraadt@ kmos@
|
| |
|
|
| |
ok dlg@, sthen@, millert@
|
| |
|
|
|
| |
Removes a global variable and avoids MP problems.
OK mpi@ visa@
|
| |
|
|
|
| |
sack hole list length or pool limit.
OK claudio@
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There is a global tunable limit net.inet.tcp.sackholelimit, default
is 32768. If an attacker manages to attach all these sack holes
to a few TCP connections, the lists may grow long. Traversing them
might cause higher CPU consumption on the victim machine. In
practice such a situation is hard to create as the TCP retransmit
and 2*msl timer flush the list periodically. For additional
protection, enforce a per connection limit of 128 SACK holes in the
list.
reported by Reuven Plevinsky and Tal Vainshtein
discussed with claudio@ and procter@; OK deraadt@
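
A rough sketch of combining a global pool limit with a per-connection cap. The values 128 and 32768 come from the message above; `struct conn_sketch`, `struct pool_sketch` and `sackhole_alloc()` are hypothetical names, and the real code allocates list entries rather than bumping counters.

```c
/* Per-connection cap taken from the commit message. */
#define SACKHOLE_PER_CONN_LIMIT	128

/* Hypothetical per-connection state: number of holes on its list. */
struct conn_sketch {
	int nholes;
};

/* Hypothetical global pool with a tunable limit. */
struct pool_sketch {
	int inuse;
	int limit;	/* e.g. 32768, the default named above */
};

/*
 * Try to account for one more SACK hole on a connection.
 * Returns 1 on success, 0 if either limit would be exceeded.
 */
static int
sackhole_alloc(struct pool_sketch *pool, struct conn_sketch *c)
{
	if (c->nholes >= SACKHOLE_PER_CONN_LIMIT)
		return 0;	/* this connection's list is long enough */
	if (pool->inuse >= pool->limit)
		return 0;	/* global pool exhausted */
	pool->inuse++;
	c->nholes++;
	return 1;
}
```

With the per-connection check in place, a single connection can hold at most 128 entries no matter how much of the global pool is free, which bounds the cost of traversing its list.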
|
| |
|
|
| |
ok kn@
|