path: root/sys/netinet/tcp_var.h
Commit message | Author | Age | Files | Lines
* Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@ | bluhm | 2019-07-12 | 1 | -1/+3
* The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@ | bluhm | 2018-06-11 | 1 | -2/+3
* Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@ | bluhm | 2018-05-08 | 1 | -28/+2
* Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@ | bluhm | 2018-02-07 | 1 | -3/+2
* There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@ | bluhm | 2018-02-06 | 1 | -1/+7
* The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@ | bluhm | 2018-01-23 | 1 | -4/+2
* Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave the way for more fine grained locking. Suggested by, comments & OK mpi | florian | 2017-11-02 | 1 | -1/+2
* Remove the TCP_FACK option and associated #if{,n}def code. TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets. OK visa@ mpi@ mikeb@ | job | 2017-10-25 | 1 | -7/+1
* Refactor handling of partial TCP acknowledgements. With input from Klemens Nanni, OK visa, mpi, bluhm | mikeb | 2017-10-24 | 1 | -3/+1
* Unconditionally enable TCP selective acknowledgements (SACK). OK deraadt, mpi, visa, job | mikeb | 2017-10-22 | 1 | -14/+2
* Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@ | bluhm | 2017-04-14 | 1 | -2/+2
* Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version | claudio | 2017-03-13 | 1 | -2/+2
* percpu counters for TCP stats. ok mpi@ bluhm@ | jca | 2017-02-09 | 1 | -2/+126
* In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer. help, guidance from bluhm@ and mpi@ ok bluhm@ | dhill | 2017-02-01 | 1 | -2/+2
* Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@ | bluhm | 2017-01-29 | 1 | -5/+2
* Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@ | bluhm | 2017-01-26 | 1 | -2/+2
* Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@ | bluhm | 2017-01-25 | 1 | -2/+2
* Kill recursive splsoftnet()s. While here keep local definitions local. ok bluhm@ | mpi | 2016-11-16 | 1 | -15/+1
* Convert timeouts that need a process context to timeout_set_proc(9). The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock. ok kettenis@, bluhm@ | mpi | 2016-10-04 | 1 | -2/+2
* To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@ | bluhm | 2016-07-20 | 1 | -1/+8
* Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@ | bluhm | 2016-07-20 | 1 | -6/+11
* Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users. Ok millert@ bluhm@ | vgross | 2016-06-18 | 1 | -2/+5
* Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@ | bluhm | 2016-03-29 | 1 | -3/+20
* To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@ | bluhm | 2016-03-27 | 1 | -2/+3
* Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@ | bluhm | 2016-03-21 | 1 | -1/+2
* The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@ | bluhm | 2015-08-27 | 1 | -4/+1
* Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@ | bluhm | 2015-08-24 | 1 | -2/+1
* Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache. ok henning reyk claudio | yasuoka | 2015-02-08 | 1 | -1/+2
* To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen | deraadt | 2015-01-21 | 1 | -1/+3
* since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy | henning | 2014-01-23 | 1 | -3/+3
* remove historical #if 1 | deraadt | 2013-10-23 | 1 | -3/+1
* Sprinkle a lot more IPv6 routing domains support in the kernel. Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled. Lots of help and hints from claudio and bluhm OK claudio@, bluhm@ | phessler | 2013-10-21 | 1 | -2/+2
* Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@ | bluhm | 2013-08-12 | 1 | -1/+2
* Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@ | bluhm | 2013-06-01 | 1 | -2/+2
* Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary. ok guenther@ | mpi | 2013-04-10 | 1 | -1/+3
* Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds. In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts. Feedback from various people, ok henning@ claudio@ | sthen | 2011-07-06 | 1 | -3/+5
* Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@ | bluhm | 2011-01-07 | 1 | -1/+3
* There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@ | bluhm | 2010-10-21 | 1 | -3/+3
* TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org. Based on work by markus@ and djm@. OK dlg@, henning@, put it in deraadt@ | claudio | 2010-09-24 | 1 | -5/+12
* Add support for using IPsec in multiple rdomains. This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1. Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain. ok claudio@ naddy@ | reyk | 2010-07-09 | 1 | -2/+2
* Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0. Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped. Written by claudio@, criticized^Wcritiqued by me | guenther | 2010-07-03 | 1 | -2/+2
* Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@ | claudio | 2009-11-13 | 1 | -3/+3
* sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@ | claudio | 2009-08-10 | 1 | -4/+5
* Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@ | claudio | 2009-06-05 | 1 | -1/+2
* fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom. ok deraadt@ otto@ | dlg | 2008-11-08 | 1 | -3/+3
* Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers. OK claudio@ | thib | 2008-05-24 | 1 | -5/+1
* Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect. OK markus@, blambert@. "go ahead" deraadt@. Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too. | thib | 2008-05-23 | 1 | -2/+2
* remove tcp_drain code since it's no longer used; ok henning, feedback thib | markus | 2008-05-06 | 1 | -32/+1
* remove old unused TCP isn code; ok henning, dhartmei, mcbride | markus | 2008-02-20 | 1 | -5/+1
* when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@ | markus | 2008-02-20 | 1 | -2/+2
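
Note on the 2008-11-08 entry (the do { } while (/* CONSTCOND */ 0) idiom): the sketch below illustrates why multi-statement macros are commonly wrapped this way. It is not taken from tcp_var.h. SWAP_INT and order_pair are hypothetical names invented for this illustration. The wrapper makes the macro expand to a single statement, so a trailing semicolon after the macro call does not detach a following else.

```c
/*
 * Hypothetical multi-statement macro (not from tcp_var.h).
 * Wrapping the body in do { } while (0) turns it into exactly one
 * statement that still requires the caller's trailing semicolon.
 * The CONSTCOND comment is a lint hint that the constant condition
 * is intentional.
 */
#define SWAP_INT(a, b) do {						\
	int swap_tmp_ = (a);						\
	(a) = (b);							\
	(b) = swap_tmp_;						\
} while (/* CONSTCOND */ 0)

/*
 * Put two ints in ascending order.
 * Returns 1 if they were swapped, 0 if they were already ordered.
 * With a bare { } macro body, the ';' after SWAP_INT(...) would end
 * the if statement and the 'else' below would not compile.
 */
static int
order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);
	else
		return 0;
	return 1;
}
```

Calling order_pair() on the pair (5, 2) swaps the values and returns 1; calling it again on the now ordered pair returns 0.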