path: root/sys/netinet/tcp_input.c
Commit message | Author | Age | Files | Lines
...
* When net.inet.ip.sourceroute is enabled, store the source route | mpi | 2013-08-13 | 1 | -3/+3
  of incoming IPv4 packets with the SSRR or LSRR header option in a m_tag rather than in a single static entry. Use a new m_tag type, PACKET_TAG_SRCROUTE, for this and bump PACKET_TAG_MAXSIZE accordingly. Adapted from FreeBSD r135274 with inputs from bluhm@. ok bluhm@, mikeb@
* Move bridge_broadcast and subsequently all IPsec SPD lookup code out | mikeb | 2013-07-31 | 1 | -7/+3
  of the IPL_NET. pf_test should be no longer called under IPL_NET as well. The problem became evident after the related issue was brought up by David Hill <dhill at mindcry ! org>. With input from and OK mpi. Tested by David and me.
* The reverse parameter of in_pcblookup_listen() is a boolean and not | bluhm | 2013-07-01 | 1 | -5/+5
  a flag. Rename the variable inpl_flags in tcp_input() to inpl_reverse like in udp_input(). No binary change. OK mikeb@
* Always make sure that the temporary TCP protocol control block | mikeb | 2013-06-20 | 1 | -4/+3
  structure is zeroed out before use. From David Hill <dhill at mindcry ! org>; ok blambert claudio henning
* Increment udpstat.udps_nosec and tcpstat.tcps_rcvnosec in case packet is | yasuoka | 2013-06-09 | 1 | -1/+2
  dropped by IPsec security policy. input from and ok mikeb
* Link pf states and socket inpcbs together more tightly. The linking | bluhm | 2013-06-03 | 1 | -3/+16
  was only done when a packet traveled up the stack from pf to tcp_input(). Now also link the state and inpcb when the packet is going down from tcp_output() to pf. As a consequence, divert-reply states where the initial SYN does not get an answer, can be handled more correctly. This change is part of a larger diff that has been backed out in 2011. Bring the feature back in small steps to see when bad things start to happen. OK henning deraadt
* Merge the duplicate IPv4 and IPv6 checksum checking code in tcp_input() | bluhm | 2013-06-03 | 1 | -35/+30
  into one block. OK mpi@
* Remove various external variable declarations from source files and | mpi | 2013-04-10 | 1 | -3/+1
  move them to the corresponding header with an appropriate comment if necessary. ok guenther@
* Use macros sotoinpcb() and intotcpcb() instead of casts. Use NULL | bluhm | 2013-04-02 | 1 | -7/+7
  instead of 0 for pointers. No binary change. OK mpi@
* Declare struct pf_state_key in the mbuf and in_pcb header files to | bluhm | 2013-03-29 | 1 | -5/+4
  avoid ugly casts. OK krw@ tedu@
* code that calls timeout functions should include timeout.h | tedu | 2013-03-28 | 1 | -1/+2
  slipped by on i386, but the zaurus doesn't automagically pick it up. spotted by patrick
* tedu faith(4), suggested by todd@ some weeks ago after a submission by | mpi | 2013-03-14 | 1 | -16/+1
  dhill. ok krw@, mikeb@, tedu@ (implicit)
* After finding the socket's inp by using the pf's statekey, reset | bluhm | 2013-01-17 | 1 | -1/+3
  the pointer to the statekey in the mbuf. When a UDP socket is spliced, pf would use this key during ip_output() although the packet went through two sockets in the meantime. Reset the mbuf's statekey in tcp_input() and udp_input() to eliminate the pointer to pf lingering in the socket buffers. OK claudio@
* first or second coming, commie or not commie, one m in coming is sufficient | henning | 2013-01-17 | 1 | -2/+2
  ok claudio
* add IP_IPSECFLOWINFO option to sendmsg() and recvmsg(), so npppd(4) | markus | 2012-07-16 | 1 | -2/+2
  can use this to select the IPsec tunnel for sending L2TP packets. this fixes Windows (always binding to 1701) and Android clients (negotiating wildcard flows); feedback mpf@ and yasuoka@; ok henning@ and yasuoka@; ok jmc@ for the manpage
* Increase TCP's initial window to 10 * MSS or 14600 bytes as proposed in | claudio | 2012-03-10 | 1 | -1/+4
  draft-ietf-tcpm-initcwnd. net.inet.tcp.rfc3390 defaults to 2 now which uses the 10*MSS, setting it back to 1 brings back the old default of 4*MSS. OK sperreault@, henning@, sthen@, markus@
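The initial window rule in the entry above can be sketched in C. This is a minimal illustration and not the code from tcp_input.c: the function name `initial_window()` and its helpers are hypothetical, the byte caps follow the formulas published in draft-ietf-tcpm-initcwnd and RFC 3390, and the fallback of one MSS for any other setting is an assumption made only to keep the example total.

```c
#include <stdint.h>

/* Hypothetical helpers; not taken from the OpenBSD sources. */
static uint32_t
u32_min(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

static uint32_t
u32_max(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * Initial congestion window in bytes for a given MSS.
 * mode 2: min(10 * MSS, max(2 * MSS, 14600))  (draft-ietf-tcpm-initcwnd)
 * mode 1: min(4 * MSS, max(2 * MSS, 4380))    (RFC 3390)
 * other:  one MSS (assumed fallback, for illustration only)
 */
uint32_t
initial_window(uint32_t mss, int mode)
{
	switch (mode) {
	case 2:
		return u32_min(10 * mss, u32_max(2 * mss, 14600));
	case 1:
		return u32_min(4 * mss, u32_max(2 * mss, 4380));
	default:
		return mss;
	}
}
```

With a typical Ethernet MSS of 1460 bytes, mode 2 gives 14600 bytes and mode 1 gives 4380 bytes, which matches the "10 * MSS or 14600 bytes" wording of the commit.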
* Respect the ToS setting in tcp syn+ack for IPv4, still need to fix for | haesbaert | 2011-10-15 | 1 | -2/+3
  IPv6. ok claudio@
* Revert the pf->socket linking diff. | oga | 2011-05-13 | 1 | -16/+2
  at least krw@, pirofti@ and todd@ have been seeing panics (todd and krw with xxxterm not sure about pirofti) involving pool corruption while using this commit. krw and todd confirm that this backout fixes the problem. ok blambert@ krw@, todd@ henning@ and kettenis@
  Double link between pf states and sockets. Henning has already implemented half of it. The additional part is:
  - The pf state lookup for outgoing packets is optimized by using mbuf->inp->state.
  - For incoming tcp, udp, raw, raw6 packets the socket lookup always is optimized by using mbuf->state->inp.
  - All protocols establish the link for incoming packets.
  - All protocols set the inp in the mbuf for outgoing packets. This allows the linkage beginning with the first packet for outgoing connections.
  - In case of divert states, delete the state when the socket closes. Otherwise new connections could match on old states instead of being diverted to the listen socket.
  ok henning@
* Clean up gotos for listening sockets to make it obvious when packets | blambert | 2011-05-04 | 1 | -12/+12
  are dropped and when normal program flow occurs. Change error return value of syn_cache_add() from 0 to -1 in order to clearly communicate intent. ok claudio@
* In certain failure cases, a RST would be sent out on rdomain 0, | blambert | 2011-04-29 | 1 | -3/+3
  regardless of the rdomain the packet was received on. Explicitly pass the rdomain to the tcp_respond() monstrosity to compensate for said monstricism which led to this behavior. ok claudio@
* Make in_broadcast() rdomain aware. Mostly mechanical change. | claudio | 2011-04-28 | 1 | -2/+3
  This fixes the problem of binding sockets to broadcast IPs in other rdomains. OK henning@
* Double link between pf states and sockets. Henning has already | bluhm | 2011-04-24 | 1 | -2/+16
  implemented half of it. The additional part is:
  - The pf state lookup for outgoing packets is optimized by using mbuf->inp->state.
  - For incoming tcp, udp, raw, raw6 packets the socket lookup always is optimized by using mbuf->state->inp.
  - All protocols establish the link for incoming packets.
  - All protocols set the inp in the mbuf for outgoing packets. This allows the linkage beginning with the first packet for outgoing connections.
  - In case of divert states, delete the state when the socket closes. Otherwise new connections could match on old states instead of being diverted to the listen socket.
  ok henning@
* put the accepted socket of a diverted connection into the routing domain | mikeb | 2011-04-12 | 1 | -1/+10
  of a connection originator. this allows one to query the source rdomain with a SO_RTABLE socket option. figured out with reyk, ok claudio.
* Replace if/else ladder with much more legible switch statement for | blambert | 2011-04-05 | 1 | -50/+57
  testing tcp flags. ok henning@ claudio@
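The kind of refactoring described in the entry above can be illustrated with a switch over the masked TCP flag bits. This is a sketch under stated assumptions: the commit message does not say which flag combinations the real code distinguishes or what it does with each, so the `listen_action` outcomes and the `classify_listen_segment()` function are hypothetical. Only the flag bit values (SYN 0x02, RST 0x04, ACK 0x10) are the standard TCP header bits.

```c
#include <stdint.h>

/* Standard TCP header flag bits. */
#define TH_SYN	0x02
#define TH_RST	0x04
#define TH_ACK	0x10

/* Hypothetical outcomes, for illustration only. */
enum listen_action {
	LA_DROP,
	LA_DROP_WITH_RESET,
	LA_COMPLETE_HANDSHAKE,
	LA_NEW_CONNECTION
};

/*
 * Classify a segment arriving on a listening socket by masking the
 * three relevant flags and switching on the result, so that each
 * combination appears exactly once instead of in nested if/else tests.
 */
enum listen_action
classify_listen_segment(uint8_t flags)
{
	switch (flags & (TH_RST | TH_SYN | TH_ACK)) {
	case TH_RST:
	case TH_RST | TH_SYN:
	case TH_RST | TH_ACK:
	case TH_RST | TH_SYN | TH_ACK:
		return LA_DROP;			/* any RST: nothing to answer */
	case TH_SYN | TH_ACK:
		return LA_DROP_WITH_RESET;	/* SYN+ACK makes no sense here */
	case TH_ACK:
		return LA_COMPLETE_HANDSHAKE;	/* final ACK of a handshake */
	case TH_SYN:
		return LA_NEW_CONNECTION;	/* fresh connection request */
	default:
		return LA_DROP;			/* none of the three flags set */
	}
}
```

Masking first means unrelated bits such as PSH or URG cannot change which case is taken.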
* turn some macros into functions; saves 1400+ bytes from the kernel | blambert | 2011-04-04 | 1 | -37/+42
  on amd64 ok claudio@
* Instead of calling tcp_reass (tcp reassembly) with magic arguments | blambert | 2011-04-04 | 1 | -15/+18
  in order to skip most of the reassembly logic and try to flush available tcp segments to the socket, just split it off into its own function and use it where appropriate. ok claudio@ henning@
* change an if statement to a switch to reduce eye bleedage | blambert | 2011-04-04 | 1 | -5/+4
  no change in .o md5 "ok gcc" claudio@
* Add socket option SO_SPLICE to splice together two TCP sockets. | bluhm | 2011-01-07 | 1 | -12/+24
  The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
* Initialize the ts_recent (received timestamp) field in the newly created | claudio | 2010-09-29 | 1 | -1/+3
  socket from the information we have in the syncache. Also bzero() the tcpcb that is passed to tcp_dooptions() just to be sure.
* It is not allowed to recalculate the window scale after the initial SYN. | claudio | 2010-09-29 | 1 | -6/+1
  A session must stick to the rscale factor sent out in the SYN packet. Remove the bogus tcp_rscale() call which is done after a full established session is returned from the syncache.
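To make the constraint in the entry above concrete, here is a minimal sketch of how a receive window scale factor is commonly derived from a buffer size. The function name `recv_window_scale()` is hypothetical and this is not the tcp_rscale() code; the constants are the protocol limits of the window scale option (a 16-bit window of at most 65535 bytes and a maximum shift of 14). The point of the commit is that such a value is computed once, advertised in the SYN, and then kept unchanged for the lifetime of the session.

```c
#include <stdint.h>

#define MAX_UNSCALED_WINDOW	65535u	/* largest value of the 16-bit window field */
#define MAX_WINDOW_SHIFT	14	/* largest shift the window scale option allows */

/*
 * Smallest shift that lets the 16-bit window field cover bufsize
 * bytes, capped at the protocol maximum. Hypothetical helper.
 */
int
recv_window_scale(uint64_t bufsize)
{
	int scale = 0;

	while (scale < MAX_WINDOW_SHIFT &&
	    ((uint64_t)MAX_UNSCALED_WINDOW << scale) < bufsize)
		scale++;
	return scale;
}
```

Because the peer only learns the shift from the SYN, recomputing it later (for example after the buffer has grown) would make both sides disagree about the real window size.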
* Do not delay ACKs on connections using loopback interfaces. There is no | claudio | 2010-09-29 | 1 | -6/+10
  reason to reduce the amount of ACKs sent and delayed ACKs have a very bad interaction with the large MTU of lo(4) and the fairly small socketbuffer size. In collaboration with andre@freebsd. OK deraadt@
* TCP send and recv buffer scaling. | claudio | 2010-09-24 | 1 | -3/+39
  Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org. Based on work by markus@ and djm@. OK dlg@, henning@, put it in deraadt@
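The two quantities named in the entry above, the delay * bandwidth product and the 80% back-pressure threshold, can be written down as small C helpers. Both `bdp_bytes()` and `under_memory_pressure()` are hypothetical names, and the units (bytes per second, milliseconds, abstract pool units) are assumptions chosen for the example; the commit message does not describe how the kernel measures or stores these values.

```c
#include <stdint.h>

/*
 * Bandwidth-delay product: the number of bytes that can be in flight
 * during one round trip. A receive buffer smaller than this limits
 * throughput. Hypothetical helper, not kernel code.
 */
uint64_t
bdp_bytes(uint64_t bytes_per_sec, uint32_t rtt_ms)
{
	return bytes_per_sec * rtt_ms / 1000;
}

/*
 * Back pressure check: returns 1 once at least 80% of the pool is in
 * use, 0 otherwise. Integer arithmetic avoids floating point.
 */
int
under_memory_pressure(uint64_t used, uint64_t total)
{
	return used * 100 >= total * 80;
}
```

For example, a path that delivers 10,000,000 bytes per second with a 50 ms round trip needs roughly 500,000 bytes of receive buffer to stay full, which is why a coarse (too large) RTT estimate makes the computed buffer overshoot.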
* Switch some obvious network stack MAC comparisons from bcmp() to | matthew | 2010-07-20 | 1 | -3/+3
  timingsafe_bcmp(). ok deraadt@; committed over WPA.
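The motivation for the change above is that an ordinary comparison may return as soon as it finds the first differing byte, so its running time can leak how much of a MAC an attacker guessed correctly. A constant-time comparison can be sketched as follows. The name `ct_bcmp()` is hypothetical and this is not the libc or kernel implementation of timingsafe_bcmp(); it only shows the general technique of accumulating differences without branching on the data.

```c
#include <stddef.h>

/*
 * Compare n bytes without an early exit. Returns 0 if the buffers are
 * equal and 1 otherwise. Every byte is always examined, so the loop
 * count depends only on n and not on where the buffers differ.
 */
int
ct_bcmp(const void *b1, const void *b2, size_t n)
{
	const unsigned char *p1 = b1;
	const unsigned char *p2 = b2;
	unsigned char diff = 0;

	for (; n > 0; n--)
		diff |= *p1++ ^ *p2++;	/* accumulate any differing bits */
	return diff != 0;
}
```

Like bcmp(), the sketch only reports equal or not equal; it gives no ordering information, which is all a MAC check needs.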
* Add support for using IPsec in multiple rdomains. | reyk | 2010-07-09 | 1 | -7/+13
  This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1. Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain. ok claudio@ naddy@
* Fix the naming of interfaces and variables for rdomains and rtables | guenther | 2010-07-03 | 1 | -15/+15
  and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0. Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped. Written by claudio@, criticized^Wcritiqued by me
* unbreak the build with a custom kernel config including "pseudo-device | sthen | 2010-03-11 | 1 | -1/+4
  faith 1", noticed by Andris Kadar. ok kettenis@ beck@
* Replace pool_get() + bzero() with pool_get(..., PR_ZERO). | chl | 2010-01-15 | 1 | -4/+2
  With input from oga@ and krw@ ok oga@ krw@ thib@ markus@ mk@
* Extend the protosw pr_ctlinput function to include the rdomain. This is | claudio | 2009-11-13 | 1 | -4/+5
  needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
* rtables are stacked on rdomains (it is possible to have multiple routing | claudio | 2009-11-03 | 1 | -6/+6
  tables on top of a rdomain) but until now our code was a crazy mix so that it was impossible to correctly use rtables in that case. Additionally pf(4) only knows about rtables and not about rdomains. This is especially bad when tracking (possibly conflicting) states in various domains. This diff fixes all or most of these issues. It adds a lookup function to get the rdomain id based on a rtable id. Makes pf understand rdomains and allows pf to move packets between rdomains (it is similar to NAT). Because pf states now track the rdomain id as well it is necessary to modify the pfsync wire format. So old and new systems will not sync up. A lot of help by dlg@, tested by sthen@, jsg@ and probably more OK dlg@, mpf@, deraadt@
* fix indentation | bluhm | 2009-08-20 | 1 | -6/+5
  no binary change; ok grunk@
* sockets created via a listening socket lose the rdomain and fail to work | claudio | 2009-08-10 | 1 | -10/+22
  therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
* Initial support for routing domains. This allows to bind interfaces to | claudio | 2009-06-05 | 1 | -4/+6
  alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
* add the basic infrastructure to take advantage of TCP and UDP receive | naddy | 2009-06-03 | 1 | -4/+15
  checksum offload over IPv6; ok deraadt@
* Remove the M_ANYCAST6 mbuf flag by doing the detection all in ip6_input(). | claudio | 2008-11-02 | 1 | -16/+1
  M_ANYCAST6 was only used to signal tcp6_input() that it should drop the packet and send back icmp error. This can be done in ip6_input() without the need for a mbuf flag. Gives us back one slot in m_flags for possible future need. Looked at and some input by naddy@ and henning@. OK dlg@
* back out previous change. Another panic, not as frequent, and | dhill | 2008-10-10 | 1 | -3/+1
  definitely not at will.
* Comment out statekey code to stop 'panic: soreceive 3', which | dhill | 2008-10-10 | 1 | -1/+3
  happens with IPv6 TCP traffic, until a better fix is found. patch from henning@ prodded by deraadt@
* The pf state to pcb linking code change didn't account for the | mpf | 2008-09-09 | 1 | -2/+3
  TIME_WAIT socket recycling code to redo the pcb lookup w/out resetting the inp pointer. Therefore we used the stale pcb, which leads us to reply with a RST to SYNs received on TIME_WAIT sockets. Also move the findpcb label below the pf pcb cache lookup, to avoid using a stale pcb when the caching code gets activated. OK markus@, henning@
* link pf state keys to tcp pcbs and vice versa. | henning | 2008-07-03 | 1 | -12/+38
  when we first do a pcb lookup and we have a pointer to a pf state key in the mbuf header, store the state key pointer in the pcb and a pointer to the pcb we just found in the state key. when either the state key or the pcb is removed, clear the pointers. on subsequent packets inbound we can skip the pcb lookup and just use the pointer from the state key. on subsequent packets outbound we can skip the state key lookup and use the pointer from the pcb. about 8% speedup with 100 concurrent tcp sessions, should help much more with more tcp sessions. ok markus ryan
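The cross-linking scheme described in the entry above can be sketched with two small structures that point at each other. Everything here is hypothetical: `struct state_key`, `struct pcb`, `pcb_for_packet()` and the unlink helpers are stand-ins for illustration and do not reflect the layout of the real pf_state_key or inpcb structures. The sketch only shows the idea of caching the lookup result in both directions and clearing both pointers when either side goes away.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the pf state key and the protocol control block. */
struct pcb;

struct state_key {
	struct pcb *inp;	/* cached pcb, or NULL if not linked yet */
};

struct pcb {
	struct state_key *sk;	/* cached state key, or NULL */
};

/*
 * Inbound path: use the cached pcb if the state key already has one.
 * Otherwise fall back to the result of the regular (slow) lookup,
 * passed in as 'looked_up', and remember it in both directions.
 */
struct pcb *
pcb_for_packet(struct state_key *sk, struct pcb *looked_up)
{
	if (sk->inp != NULL)
		return sk->inp;		/* fast path: skip the lookup */
	if (looked_up != NULL) {
		sk->inp = looked_up;
		looked_up->sk = sk;
	}
	return looked_up;
}

/* Called when the pcb is removed: clear both pointers. */
void
unlink_pcb(struct pcb *inp)
{
	if (inp->sk != NULL)
		inp->sk->inp = NULL;
	inp->sk = NULL;
}

/* Called when the state key is removed: clear both pointers. */
void
unlink_state_key(struct state_key *sk)
{
	if (sk->inp != NULL)
		sk->inp->sk = NULL;
	sk->inp = NULL;
}
```

Clearing both sides on removal is what keeps a cached pointer from outliving the object it refers to; later entries in this log (the stale pcb fix and the backouts) show how easily that invariant is broken.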
* Include "faith.h" in order to get NFAITH. Also clean up NFAITH conditionals | jsing | 2008-06-14 | 1 | -2/+4
  whilst we're here. ok henning@ deraadt@
* Remove some crazy #if mess. | jsing | 2008-06-12 | 1 | -8/+2
  ok markus@ henning@