| Commit message | Author | Age | Files | Lines |
| |
ok gnezdo@ semarie@ mpi@
| |
The kernel uses a huge amount of processing time for sending ACKs to the sender
on the receiving interface. After receiving a data segment, we send out two
ACKs. The first one is sent in tcp_input() directly after receiving. The second
ACK is sent out after userland or the sosplice task has read some data out of
the socket buffer. Thus, we save some processing time and improve network
performance.
Longer tested by sthen@
OK claudio@
| |
avoidance. The problem and fix are noted in RFC 5681 section 3.1, page 7.
Report, diff and testing from Brian Brombacher, thanks!
Testing and a cosmetic tweak by myself.
ok claudio
| |
ok bluhm@
| |
isakmpd and iked to REQUIRE. Filter policy violations earlier.
ok sashan@ bluhm@
| |
tp->snd_wnd. This can happen, for example, when the remote side
responds to a window probe by ACKing the one byte it contains.
from FreeBSD; via markus@; OK sashan@ tobhe@
| |
sack hole list length or pool limit.
OK claudio@
| |
There is a global tunable limit net.inet.tcp.sackholelimit, default
is 32768. If an attacker manages to attach all these sack holes
to a few TCP connections, the lists may grow long. Traversing them
might cause higher CPU consumption on the victim machine. In
practice such a situation is hard to create as the TCP retransmit
and 2*msl timers flush the list periodically. For additional
protection, enforce a per connection limit of 128 SACK holes in the
list.
reported by Reuven Plevinsky and Tal Vainshtein
discussed with claudio@ and procter@; OK deraadt@
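The two-level limit described above can be sketched in userland C. This is only an illustration under stated assumptions: struct conn, struct sackhole, sackhole_add() and sackhole_flush() are hypothetical, simplified stand-ins and not the kernel code; only the numbers 32768 and 128 come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SACKHOLE_PER_CONN_LIMIT	128	/* per connection cap from the message */

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sackhole {
	unsigned int	 start, end;	/* sequence range of the hole */
	struct sackhole	*next;
};

struct conn {
	struct sackhole	*holes;		/* singly linked list of holes */
	int		 nholes;	/* current list length */
};

/* Global budget shared by all connections (default from the message). */
static int sackhole_global_limit = 32768;
static int sackhole_global_used = 0;

/*
 * Try to prepend a hole.  Returns 0 on success, -1 when the per
 * connection cap or the global budget is exhausted.
 */
int
sackhole_add(struct conn *c, unsigned int start, unsigned int end)
{
	struct sackhole *h;

	assert(c != NULL);
	if (c->nholes >= SACKHOLE_PER_CONN_LIMIT)
		return -1;
	if (sackhole_global_used >= sackhole_global_limit)
		return -1;
	h = malloc(sizeof(*h));
	if (h == NULL)
		return -1;
	h->start = start;
	h->end = end;
	h->next = c->holes;
	c->holes = h;
	c->nholes++;
	sackhole_global_used++;
	return 0;
}

/* Drop all holes of one connection, as a timer based flush would. */
void
sackhole_flush(struct conn *c)
{
	struct sackhole *h;

	while ((h = c->holes) != NULL) {
		c->holes = h->next;
		free(h);
		sackhole_global_used--;
	}
	c->nholes = 0;
}
```

With the cap in place, a single connection cannot hold more than 128 entries, so the cost of one list traversal stays bounded even if the global budget is far from exhausted.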
| |
PAWS. Otherwise we could trigger a retransmit by the opposite party with another
wrong timestamp and produce a loop. I have seen this with a buggy server which
messed up TCP timestamps.
Suggested by Prof. Jacobson for FreeBSD.
ok krw, bluhm, henning, mpi
| |
syn_cache_get() is not necessary. Also make the abort label
consistent to resetandabort and free the mbuf there.
OK mpi@
| |
in_pcbconnect() to avoid the address family maze in syn_cache_get().
input claudio@; OK mpi@
| |
was NULL and nothing was traced. So save the old tcpcb and use
that to retrieve some information. Note that otb may be freed and
must not be dereferenced. Use a heuristic for cases where the
address family is in the IP header but not provided in the PCB.
OK visa@
| |
the delack timer had a different implementation. Use the same
mechanism for all TCP timers.
OK mpi@ visa@
| |
is set, pf_find_divert() cannot fail so put an assert there.
Explicitly check all possible divert types, panic in the default
case. For raw sockets call pf_find_divert() before the socket
loop. Divert reply should not match on TCP or UDP listen sockets.
OK sashan@ visa@
| |
security check prevents the user from accidentally configuring a
redirect where a divert-to would be appropriate. Instead of spreading
the logic into tcp and udp input, check the flag during PCB listen
lookup. This also reduces parameters of in_pcblookup_listen().
OK visa@
| |
pr_input handlers without KERNEL_LOCK().
ok visa@
| |
calls in tcp_input(). When I added this code for socket splicing,
I missed that they may be called indirectly through functions.
Although not strictly necessary since we have the sosplice thread,
set that flag consistently when we want to prevent tcp_output()
from being called in the middle of tcp_input(). As soisconnected(),
soisdisconnected(), and socantrcvmore() call the wakeup functions
from tcp_input(), set the TF_BLOCKOUTPUT flag around them.
OK visa@
| |
TCP_FACK was disabled by provos@ in June 1999.
TCP_FACK is an algorithm that decides that when something is lost, all
packets not SACKed up to the most forward SACK are lost. It may be a
correct estimate if the network does not reorder packets.
OK visa@ mpi@ mikeb@
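As a rough illustration of the estimate described above, the following userland sketch counts every byte below the most forward SACKed sequence number that is not covered by a SACK block as lost. The names struct sack_block and fack_lost_estimate() are hypothetical and not taken from the removed code, and the sketch assumes non-overlapping blocks that all lie above snd_una.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One SACKed range [start, end) in sequence space (hypothetical type). */
struct sack_block {
	uint32_t start;
	uint32_t end;
};

/*
 * FACK style loss estimate: everything between snd_una and the most
 * forward SACKed byte that has not been SACKed is assumed lost.
 * Assumes the blocks do not overlap and all lie above snd_una.
 * Unsigned arithmetic keeps the result correct across sequence
 * number wraparound.
 */
uint32_t
fack_lost_estimate(uint32_t snd_una, const struct sack_block *blk, size_t n)
{
	uint32_t fack = snd_una;	/* most forward SACKed sequence */
	uint32_t sacked = 0;		/* bytes covered by SACK blocks */
	size_t i;

	assert(blk != NULL || n == 0);
	for (i = 0; i < n; i++) {
		sacked += blk[i].end - blk[i].start;
		/* Wrap-safe comparison: is blk[i].end ahead of fack? */
		if ((int32_t)(blk[i].end - fack) > 0)
			fack = blk[i].end;
	}
	return (fack - snd_una) - sacked;
}
```

If the network reorders packets, some of the bytes counted here are merely late and not lost, which is the weakness the message points out.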
| |
With input from Klemens Nanni, OK visa, mpi, bluhm
| |
OK deraadt, mpi, visa, job
| |
Tested by Hrvoje Popovski, ok bluhm@
| |
buffers.
This is one step towards unlocking the TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.
Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.
Tested by Hrvoje Popovski.
ok claudio@, bluhm@, mikeb@
| |
<netinet/tcp_debug.h>.
The IPv6 variant was always included and the IPv4 version is not
present on all systems.
Most of the offending ports are already fixed, thanks to sthen@!
| |
in ip6_input(). Do not check that again in the protocol input
functions.
OK mpi@
| |
change the pointer. Then *mp keeps the invalid pointer and it might
be used. Fix the potential use after free and also reset *mp in
other places to have fewer dangling pointers to freed mbufs.
OK mpi@ mikeb@
| |
adjust the comment to match reality (or at least rfc7323) instead.
This brings us back in line with the behavior of Net and Free.
From Lauri Tirkkonen. OK bluhm@
| |
No binary change.
OK mpi@
| |
tcp_input().
OK florian@
| |
allows simplifying code used for both IPv4 and IPv6.
OK mikeb@ deraadt@
| |
ok mpi@ bluhm@
| |
to get rid of struct ip6protosw and some wrapper functions. It is
more consistent to have fewer different structures. The divert_input
functions cannot be called anyway, so remove them.
OK visa@ mpi@
| |
make the variable parameters of the protocol input functions fixed.
Also add the proto to make it similar to IPv6.
OK mpi@ guenther@ millert@
| |
ok bluhm@, kettenis@
| |
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.
This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.
Inputs from and ok bluhm@, ok dlg@
| |
Prodded by and ok bluhm@
| |
While here keep local definitions local.
ok bluhm@
| |
This will allow a single lock/unlock dance per timer.
| |
Found by Chris Jackman, thanks!
| |
set on the listen socket.
From David Hill; OK vgross@
| |
the ioff argument to pool_init() is unused and has been for many
years, so this replaces it with an ipl argument. because the ipl
will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch
below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
| |
This is another little step towards deprecating 'struct route{,_in6}'.
ok florian@
| |
swapping between two syn caches for random reseeding anyway, this
feature can be added easily. When the cache is empty, there is an
opportunity to change the hash size. This allows an admin under
SYN flood attack to defend his machine.
Suggested by claudio@; OK jung@ claudio@ jmc@
| |
This is consistent with the IPV6_UNICAST_HOPS behavior, and is the only
way to allow applications to completely control the TTL of outgoing
packets (else an application could temporarily send packets with the
default TTL, until it sets IP_TTL again; this is harmful e.g. for GTSM).
ok bluhm@
| |
Useful to implement GTSM support in daemons such as bgpd(8). Diff from
2013 revived by renato@. Input from bluhm@, ok bluhm@ deraadt@
| |
its value for the SYN+ACK packet. This makes the IPV6_UNICAST_HOPS
socket option usable for incoming TCP connections.
tested by renato@; OK jca@
| |
was overly complicated. Simplify the code without functional change.
OK jca@
| |
attack against our hash function. In this case, switch to the
passive syn cache as soon as possible. It will start with a new
random seed for the hash.
input and OK mpi@
|