path: root/sys/netinet/tcp_output.c
Commit message [Author, Age, Files, Lines]
* Kill yet another argument to functions in IPv6. This time ip6_output's ifpp - XXX: just for statistics. ifpp is always NULL in all callers, so that statistic confirms ifpp is dying. OK mpi@
  [claudio, 2015-09-11, 1 file, -2/+2]
* Avoid a situation where we do not set the tcp persist timer after a zero window condition. If you send a 0-length packet, but there is data in the socket buffer, and neither the rexmt nor the persist timer is already set, then activate the persist timer. From FreeBSD revision 284941; OK deraadt@ markus@ mikeb@ claudio@
  [bluhm, 2015-07-13, 1 file, -1/+27]
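The condition described in that message can be sketched in plain C. This is only an illustration under stated assumptions: `struct conn_state`, its fields, and `needs_persist_timer()` are hypothetical names made up for this sketch, not the kernel's structures and not the code from the commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the state the commit message mentions. */
struct conn_state {
	size_t	sndbuf_bytes;	/* data still queued in the socket send buffer */
	bool	rexmt_armed;	/* retransmit timer is already set */
	bool	persist_armed;	/* persist timer is already set */
};

/*
 * Return true when the persist timer should be activated: a 0-length
 * packet was sent, data is still waiting in the send buffer, and neither
 * the retransmit nor the persist timer is set.
 */
bool
needs_persist_timer(const struct conn_state *cs, size_t sent_len)
{
	return sent_len == 0 && cs->sndbuf_bytes > 0 &&
	    !cs->rexmt_armed && !cs->persist_armed;
}
```

A caller would evaluate this after building a segment and arm its persist timer when the result is true. Without that, a connection that hit a zero window could sit with queued data and no timer left to wake it up.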
* Get rid of the undocumented & temporary* m_copy() macro added for compatibility with 4.3BSD in September 1989. *Pick your own definition for "temporary". ok bluhm@, claudio@, dlg@
  [mpi, 2015-06-30, 1 file, -2/+3]
* Store a unique ID, an interface index, rather than a pointer to the receiving interface in the packet header of every mbuf. The interface pointer should now be retrieved when necessary with if_get(). If a NULL pointer is returned by if_get(), the interface has probably been destroyed/removed and the mbuf should be freed. Such a mechanism will simplify garbage collection of mbufs and limit problems with dangling ifp pointers. Tested by jmatthew@ and krw@, discussed with many. ok mikeb@, bluhm@, dlg@
  [mpi, 2015-06-16, 1 file, -2/+2]
* Replace a bunch of == 0 with == NULL in pointer tests. Nuke some annoying trailing, leading and embedded whitespace. No change to .o files. ok deraadt@
  [krw, 2015-06-07, 1 file, -13/+13]
* Remove some includes include-what-you-use claims don't have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels. ok tedu@ deraadt@
  [jsg, 2015-03-14, 1 file, -2/+1]
* unifdef INET in net code as a precursor to removing the pretend option. long live the one true internet. ok henning mikeb
  [tedu, 2014-12-19, 1 file, -7/+1]
* Fewer <netinet/in_systm.h> !
  [mpi, 2014-07-22, 1 file, -2/+1]
* ip_output() using varargs always struck me as bizarre, esp since it's only ever used to pass on uint32 (for ipsec). stop that madness and just pass the uint32, 0 in all cases but the two that pass the ipsec flowinfo. ok deraadt reyk guenther
  [henning, 2014-04-21, 1 file, -2/+2]
* "struct pkthdr" holds a routing table ID, not a routing domain one. Avoid the confusion by using an appropriate name for the variable. Note that since routing domain IDs are a subset of the set of routing table IDs, the following idiom is correct:
      rtableid = rdomain
  But to get the routing domain ID corresponding to a given routing table ID, you must call rtable_l2(9). claudio@ likes it, ok mikeb@
  [mpi, 2014-04-14, 1 file, -3/+3]
* Retire kernel support for SO_DONTROUTE, this time without breaking localhost connections. The plan is to always use the routing table for address and route resolution, so there is no future for an option that wants to bypass it. This option has never been implemented for IPv6 anyway, so let's just remove the IPv4 bits that you weren't aware of. Tested at least by lteo@, guenther@ and chrisz@, ok mikeb@, benno@
  [mpi, 2014-04-07, 1 file, -6/+3]
* revert "Retire kernel support for SO_DONTROUTE" diff, which does bad things for localhost connections. discussed with deraadt@
  [sthen, 2014-03-28, 1 file, -3/+6]
* Retire kernel support for SO_DONTROUTE, since the plan is to always use the routing table there's no future for an option that wants to bypass it. This option has never been implemented for IPv6 anyway, so let's just remove the IPv4 bits that you weren't aware of. Tested by florian@, man pages inputs from jmc@, ok benno@
  [mpi, 2014-03-27, 1 file, -6/+3]
* Reduce the number of in6_var.h inclusions by moving some functions and global variables to in6.h. ok deraadt@
  [mpi, 2013-10-24, 1 file, -6/+1]
* make in_proto_cksum_out not rely on the pseudo header checksum to be already there, just compute it - it's dirt cheap. since that happens very late in ip_output, the rest of the stack doesn't have to care about checksums at all any more, if something needs to be checksummed, just set the flag on the pkthdr mbuf to indicate so. stop pre-computing the pseudo header checksum and incrementally updating it in the tcp and udp stacks. ok lteo florian
  [henning, 2013-10-19, 1 file, -23/+3]
* Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
  [bluhm, 2013-08-12, 1 file, -3/+4]
* Link pf states and socket inpcbs together more tightly. The linking was only done when a packet traveled up the stack from pf to tcp_input(). Now also link the state and inpcb when the packet is going down from tcp_output() to pf. As a consequence, divert-reply states where the initial SYN does not get an answer can be handled more correctly. This change is part of a larger diff that was backed out in 2011. Bring the feature back in small steps to see when bad things start to happen. OK henning deraadt
  [bluhm, 2013-06-03, 1 file, -1/+7]
* spltdb() was really just #define'd to be splsoftnet(); replace the former with the latter. no change in md5 checksum of generated files. ok claudio@ henning@
  [blambert, 2012-09-20, 1 file, -3/+1]
* Revert the pf->socket linking diff. at least krw@, pirofti@ and todd@ have been seeing panics (todd and krw with xxxterm, not sure about pirofti) involving pool corruption while using this commit. krw and todd confirm that this backout fixes the problem. ok blambert@ krw@, todd@ henning@ and kettenis@
  Double link between pf states and sockets. Henning has already implemented half of it. The additional part is:
  - The pf state lookup for outgoing packets is optimized by using mbuf->inp->state.
  - For incoming tcp, udp, raw, raw6 packets the socket lookup always is optimized by using mbuf->state->inp.
  - All protocols establish the link for incoming packets.
  - All protocols set the inp in the mbuf for outgoing packets. This allows the linkage beginning with the first packet for outgoing connections.
  - In case of divert states, delete the state when the socket closes. Otherwise new connections could match on old states instead of being diverted to the listen socket.
  ok henning@
  [oga, 2011-05-13, 1 file, -7/+1]
* Double link between pf states and sockets. Henning has already implemented half of it. The additional part is:
  - The pf state lookup for outgoing packets is optimized by using mbuf->inp->state.
  - For incoming tcp, udp, raw, raw6 packets the socket lookup always is optimized by using mbuf->state->inp.
  - All protocols establish the link for incoming packets.
  - All protocols set the inp in the mbuf for outgoing packets. This allows the linkage beginning with the first packet for outgoing connections.
  - In case of divert states, delete the state when the socket closes. Otherwise new connections could match on old states instead of being diverted to the listen socket.
  ok henning@
  [bluhm, 2011-04-24, 1 file, -1/+7]
* mechanic rename M_{TCP|UDP}V4_CSUM_OUT -> M_{TCP|UDP}_CSUM_OUT. ok claudio krw
  [henning, 2011-04-05, 1 file, -2/+2]
* Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows writing relay daemons with zero data copy. ok markus@
  [bluhm, 2011-01-07, 1 file, -1/+7]
* TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org. Based on work by markus@ and djm@. OK dlg@, henning@, put it in deraadt@
  [claudio, 2010-09-24, 1 file, -1/+8]
* Return EACCES when pf_test() blocks a packet in ip_output(). This allows ip_forward() to know the difference between blocked packets and those that can't be forwarded (EHOSTUNREACH). Only in the latter case an ICMP should be sent. In the other callers of ip_output() change the error back to EHOSTUNREACH since userland may not expect EACCES on a sendto(). OK henning@, markus@
  [claudio, 2010-09-08, 1 file, -1/+3]
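The error translation that message describes for callers other than ip_forward() might look roughly like the following. This is a standalone sketch, not the kernel code: `map_output_error()` is a hypothetical helper name, and the real change is made inline in the affected callers.

```c
#include <errno.h>

/*
 * Hypothetical helper: a caller that must not leak a packet-filter block
 * to userland reports EHOSTUNREACH instead of EACCES. Every other error
 * value is passed through unchanged.
 */
int
map_output_error(int error)
{
	if (error == EACCES)
		return EHOSTUNREACH;
	return error;
}
```

The forwarding path would skip this mapping, since it needs to see EACCES to decide that no ICMP should be sent.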
* Add support for using IPsec in multiple rdomains. This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1. Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain. ok claudio@ naddy@
  [reyk, 2010-07-09, 1 file, -2/+3]
* Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0. Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped. Written by claudio@, criticized^Wcritiqued by me
  [guenther, 2010-07-03, 1 file, -2/+2]
* Make sure the temporary buffer used to generate tcp options is properly aligned, otherwise we lose on strict alignment architectures. Should fix problems with gcc4 compiled bsd.rd's that people see on sparc64. ok millert@, beck@, jsing@
  [kettenis, 2010-05-28, 1 file, -2/+3]
* Initial support for routing domains. This allows binding interfaces to alternate routing tables and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
  [claudio, 2009-06-05, 1 file, -1/+4]
* do not set the pkthdr mbuf state key pointer to the state key saved in the pcb. the state key ptr in the pcb is the one that had to be used by pf outbound. but by convention the state key pointer in the pkthdr is the one used INbound, so pf follows its reverse pointer to find the sk to use, and since a reverse doesn't exist for locally terminated connections the reverse pointer is null and thus the whole game a noop. note that this only affects packets FROM local udp/tcp sockets, for the other direction everything works as expected.
  [henning, 2008-09-03, 1 file, -2/+1]
* link pf state keys to tcp pcbs and vice versa. when we first do a pcb lookup and we have a pointer to a pf state key in the mbuf header, store the state key pointer in the pcb and a pointer to the pcb we just found in the state key. when either the state key or the pcb is removed, clear the pointers. on subsequent packets inbound we can skip the pcb lookup and just use the pointer from the state key. on subsequent packets outbound we can skip the state key lookup and use the pointer from the pcb. about 8% speedup with 100 concurrent tcp sessions, should help much more with more tcp sessions. ok markus ryan
  [henning, 2008-07-03, 1 file, -1/+2]
* no EOL between tcpsig and sack headers; ok jsing, frantzen
  [markus, 2008-06-28, 1 file, -2/+2]
* Remove some crazy #if mess. ok markus@ henning@
  [jsing, 2008-06-12, 1 file, -5/+1]
* ANSIfy function definitions. ok markus@ mcbride@ henning@ deraadt@
  [jsing, 2008-06-12, 1 file, -3/+2]
* some spelling fixes from Martynas Venckus
  [jmc, 2007-11-24, 1 file, -2/+2]
* apply the "skip ipsec if there are no flows" speedup diff to IPv6 too. we need a pointer to the inpcb to decide, which was not previously passed to ip6_output, so this diff is a little bigger. from itojun, ok ryan
  [henning, 2007-06-01, 1 file, -2/+3]
* implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
  [markus, 2005-06-30, 1 file, -2/+8]
* Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
  [fgont, 2005-05-24, 1 file, -2/+6]
* csum -> csum_flags. ok krw@ canacar@
  [brad, 2005-04-25, 1 file, -2/+2]
* add tcp sack stats, similar to freebsd; ok deraadt
  [markus, 2005-04-05, 1 file, -1/+4]
* 1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
  2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
  3. add missing shifts when t_srtt/t_rttvar are used.
  4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
  5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
  6. make sure rttmin is not > TCPTV_REXMTMAX
  parts from netbsd, ok mcbride, henning
  [markus, 2005-02-27, 1 file, -2/+2]
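Items 3 and 4 are about fixed-point scaling, which a few lines of C can make concrete. The sketch below only illustrates what "5 bits of fraction" and "4 bits" mean for a stored value. The macro and function names are hypothetical, and the unit is simply whatever time unit the stack counts in; the kernel's own macros and update arithmetic are not reproduced here.

```c
/* Fraction widths quoted in the commit message (item 4). */
#define SRTT_FRAC_BITS		5
#define RTTVAR_FRAC_BITS	4

/* Hypothetical helpers converting whole time units to fixed point and back. */
unsigned int
srtt_encode(unsigned int units)
{
	return units << SRTT_FRAC_BITS;
}

unsigned int
srtt_decode(unsigned int srtt)
{
	return srtt >> SRTT_FRAC_BITS;	/* drops the fractional part */
}

unsigned int
rttvar_encode(unsigned int units)
{
	return units << RTTVAR_FRAC_BITS;
}

unsigned int
rttvar_decode(unsigned int rttvar)
{
	return rttvar >> RTTVAR_FRAC_BITS;	/* drops the fractional part */
}
```

Forgetting one of these shifts when reading a stored value makes it look 32 (or 16) times larger than it is, which is presumably the class of mistake item 3 addresses.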
* Modulate tcp_now by a random amount on a per-connection basis. ok markus@ frantzen@
  [mcbride, 2004-10-28, 1 file, -2/+2]
* set the congestion window to two segments (instead of only one), this matches the window size we have when entering the established state. ok deraadt@
  [markus, 2004-10-06, 1 file, -2/+2]
* don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
  [markus, 2004-09-16, 1 file, -4/+12]
* remove #ifdef TUBA
  [itojun, 2004-06-20, 1 file, -12/+1]
* factor out md5 code; ok+tests henning@, djm@, hshoexer@
  [markus, 2004-06-08, 1 file, -68/+7]
* set m_pkthdr.len early; ok mcbride, deraadt
  [markus, 2004-06-05, 1 file, -3/+2]
* work around an LP64 problem where we report an excessively large window due to incorrect mixing of types. From NetBSD. ok cedric@ markus@
  [brad, 2004-05-31, 1 file, -3/+3]
* Replace RSA-derived md5 code with code derived from Colin Plumb's PD version. This moves md5.c out of libkern and into sys/crypto where it belongs (as requested by markus@). Note that md5.c is still mandatory (dev/rnd.c uses it). Verified with IPsec + hmac-md5 and tcp md5sig. OK henning@ and hshoexer@
  [millert, 2004-05-07, 1 file, -2/+2]
* - allow the user to force the TCP mss below the fail-safe 216 with a low interface MTU.
  - break a tcp_output() -> tcp_mtudisc() -> tcp_output() infinite recursion when the TCP mss ends up larger than the interface MTU (when the if_mtu is smaller than the tcp header). connections will still stall
  feedback from itojun@, claudio@ and provos and testing from beck@
  [frantzen, 2004-04-26, 1 file, -2/+2]
* don't allocate a cluster if the header fits into a mbuf; ok itojun@, henning@, mcbride@
  [markus, 2004-02-16, 1 file, -4/+4]