aboutsummaryrefslogtreecommitdiffstats
path: root/net/netfilter/nf_conntrack_proto_tcp.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-04-27netfilter: Fix handling simultaneous open in TCP conntrackJozsef Kadlecsik1-0/+11
Dominique Martinet reported a TCP hang problem when simultaneous open was used. The problem is that the tcp_conntracks state table is not smart enough to handle the case. The state table could be fixed by introducing a new state, but that would require more lines of code compared to this patch, due to the required backward compatibility with ctnetlink. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Reported-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-08netfilter: nf_conntrack: add IPS_OFFLOAD status bitPablo Neira Ayuso1-0/+3
This new bit tells us that the conntrack entry is owned by the flow table offload infrastructure. # cat /proc/net/nf_conntrack ipv4 2 tcp 6 src=10.141.10.2 dst=147.75.205.195 sport=36392 dport=443 src=147.75.205.195 dst=192.168.2.195 sport=443 dport=36392 [OFFLOAD] mark=0 zone=0 use=2 Note the [OFFLOAD] tag in the listing. The timer of such conntrack entries look like stopped from userspace. In practise, to make sure the conntrack entry does not go away, the conntrack timer is periodically set to an arbitrary large value that gets refreshed on every iteration from the garbage collector, so it never expires- and they display no internal state in the case of TCP flows. This allows us to save a bitcheck from the packet path via nf_ct_is_expired(). Conntrack entries that have been offloaded to the flow table infrastructure cannot be deleted/flushed via ctnetlink. The flow table infrastructure is also responsible for releasing this conntrack entry. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-08netfilter: conntrack: timeouts can be constFlorian Westphal1-1/+1
Nowadays this is just the default template that is used when setting up the net namespace, so nothing writes to these locations. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-08netfilter: conntrack: l4 protocol trackers can be constFlorian Westphal1-2/+2
previous patches removed all writes to these structs so we can now mark them as const. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-08netfilter: conntrack: remove nlattr_size pointer from l4proto trackersFlorian Westphal1-8/+8
similar to previous commit, but instead compute this at compile time and turn nlattr_size into an u16. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-11-20netfilter: conntrack: lower timeout to RETRANS seconds if window is 0Florian Westphal1-0/+3
When zero window is announced we can get into a situation where connection stays around forever: 1. One side announces zero window. 2. Other side closes. In this case, no FIN is sent (stuck in send queue). Unless other side opens the window up again conntrack stays in ESTABLISHED state for a very long time. Lets alleviate this by lowering the timeout to RETRANS (5 minutes), the other end should be sending zero window probes to keep the connection established as long as a socket still exists. Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-11-06netfilter: conntrack: don't cache nlattr_tuple_size result in nla_sizeFlorian Westphal1-2/+7
We currently call ->nlattr_tuple_size() once at register time and cache result in l4proto->nla_size. nla_size is the only member that is written to, avoiding this would allow to make l4proto trackers const. We can use ->nlattr_tuple_size() at run time, and cache result in the individual trackers instead. This is an intermediate step, next patch removes nlattr_size() callback and computes size at compile time, then removes nla_size. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-10-24netfilter: conntrack: remove pf argument from l4 packet functionsFlorian Westphal1-4/+2
not needed/used anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-10-24netfilter: conntrack: add and use nf_ct_l4proto_log_invalidFlorian Westphal1-16/+9
We currently pass down the l4 protocol to the conntrack ->packet() function, but the only user of this is the debug info decision. Same information can be derived from struct nf_conn. Add a wrapper for the previous patch that extracs the information from nf_conn and passes it to nf_l4proto_log_invalid(). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-10-24netfilter: conntrack: add and use nf_l4proto_log_invalidFlorian Westphal1-12/+10
We currently pass down the l4 protocol to the conntrack ->packet() function, but the only user of this is the debug info decision. Same information can be derived from struct nf_conn. As a first step, add and use a new log function for this, similar to nf_ct_helper_log(). Add __cold annotation -- invalid packets should be infrequent so gcc can consider all call paths that lead to such a function as unlikely. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-04netfilter: remove unused hooknum arg from packet functionsFlorian Westphal1-1/+0
tested with allmodconfig build. Signed-off-by: Florian Westphal <fw@strlen.de>
2017-08-24netfilter: conntrack: print_conntrack only needed if CONFIG_NF_CONNTRACK_PROCFSFlorian Westphal1-0/+6
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-24netfilter: conntrack: place print_tuple in procfs partFlorian Westphal1-11/+0
CONFIG_NF_CONNTRACK_PROCFS is deprecated, no need to use a function pointer in the trackers for this. Place the printf formatting in the one place that uses it. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-08-24netfilter: conntrack: remove protocol name from l4proto structFlorian Westphal1-2/+0
no need to waste storage for something that is only needed in one place and can be deduced from protocol number. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-4/+21
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for your net-next tree. A large bunch of code cleanups, simplify the conntrack extension codebase, get rid of the fake conntrack object, speed up netns by selective synchronize_net() calls. More specifically, they are: 1) Check for ct->status bit instead of using nfct_nat() from IPVS and Netfilter codebase, patch from Florian Westphal. 2) Use kcalloc() wherever possible in the IPVS code, from Varsha Rao. 3) Simplify FTP IPVS helper module registration path, from Arushi Singhal. 4) Introduce nft_is_base_chain() helper function. 5) Enforce expectation limit from userspace conntrack helper, from Gao Feng. 6) Add nf_ct_remove_expect() helper function, from Gao Feng. 7) NAT mangle helper function return boolean, from Gao Feng. 8) ctnetlink_alloc_expect() should only work for conntrack with helpers, from Gao Feng. 9) Add nfnl_msg_type() helper function to nfnetlink to build the netlink message type. 10) Get rid of unnecessary cast on void, from simran singhal. 11) Use seq_puts()/seq_putc() instead of seq_printf() where possible, also from simran singhal. 12) Use list_prev_entry() from nf_tables, from simran signhal. 13) Remove unnecessary & on pointer function in the Netfilter and IPVS code. 14) Remove obsolete comment on set of rules per CPU in ip6_tables, no longer true. From Arushi Singhal. 15) Remove duplicated nf_conntrack_l4proto_udplite4, from Gao Feng. 16) Remove unnecessary nested rcu_read_lock() in __nf_nat_decode_session(). Code running from hooks are already guaranteed to run under RCU read side. 17) Remove deadcode in nf_tables_getobj(), from Aaron Conole. 18) Remove double assignment in nf_ct_l4proto_pernet_unregister_one(), also from Aaron. 19) Get rid of unsed __ip_set_get_netlink(), from Aaron Conole. 20) Don't propagate NF_DROP error to userspace via ctnetlink in __nf_nat_alloc_null_binding() function, from Gao Feng. 21) Revisit nf_ct_deliver_cached_events() to remove unnecessary checks, from Gao Feng. 22) Kill the fake untracked conntrack objects, use ctinfo instead to annotate a conntrack object is untracked, from Florian Westphal. 23) Remove nf_ct_is_untracked(), now obsolete since we have no conntrack template anymore, from Florian. 24) Add event mask support to nft_ct, also from Florian. 25) Move nf_conn_help structure to include/net/netfilter/nf_conntrack_helper.h. 26) Add a fixed 32 bytes scratchpad area for conntrack helpers. Thus, we don't deal with variable conntrack extensions anymore. Make sure userspace conntrack helper doesn't go over that size. Remove variable size ct extension infrastructure now this code got no more clients. From Florian Westphal. 27) Restore offset and length of nf_ct_ext structure to 8 bytes now that wraparound is not possible any longer, also from Florian. 28) Allow to get rid of unassured flows under stress in conntrack, this applies to DCCP, SCTP and TCP protocols, from Florian. 29) Shrink size of nf_conntrack_ecache structure, from Florian. 30) Use TCP_MAX_WSCALE instead of hardcoded 14 in TCP tracker, from Gao Feng. 31) Register SYNPROXY hooks on demand, from Florian Westphal. 32) Use pernet hook whenever possible, instead of global hook registration, from Florian Westphal. 33) Pass hook structure to ebt_register_table() to consolidate some infrastructure code, from Florian Westphal. 34) Use consume_skb() and return NF_STOLEN, instead of NF_DROP in the SYNPROXY code, to make sure device stats are not fooled, patch from Gao Feng. 35) Remove NF_CT_EXT_F_PREALLOC this kills quite some code that we don't need anymore if we just select a fixed size instead of expensive runtime time calculation of this. From Florian. 36) Constify nf_ct_extend_register() and nf_ct_extend_unregister(), from Florian. 37) Simplify nf_ct_ext_add(), this kills nf_ct_ext_create(), from Florian. 38) Attach NAT extension on-demand from masquerade and pptp helper path, from Florian. 39) Get rid of useless ip_vs_set_state_timeout(), from Aaron Conole. 40) Speed up netns by selective calls of synchronize_net(), from Florian Westphal. 41) Silence stack size warning gcc in 32-bit arch in snmp helper, from Florian. 42) Inconditionally call nf_ct_ext_destroy(), even if we have no extensions, to deal with the NF_NAT_MANIP_SRC case. Patch from Liping Zhang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-19netfilter: tcp: Use TCP_MAX_WSCALE instead of literal 14Gao Feng1-4/+3
The window scale may be enlarged from 14 to 15 according to the itef draft https://tools.ietf.org/html/draft-nishida-tcpm-maxwin-03. Use the macro TCP_MAX_WSCALE to support it easily with TCP stack in the future. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19netfilter: allow early drop of assured conntracksFlorian Westphal1-0/+18
If insertion of a new conntrack fails because the table is full, the kernel searches the next buckets of the hash slot where the new connection was supposed to be inserted at for an entry that hasn't seen traffic in reply direction (non-assured), if it finds one, that entry is is dropped and the new connection entry is allocated. Allow the conntrack gc worker to also remove *assured* conntracks if resources are low. Do this by querying the l4 tracker, e.g. tcp connections are now dropped if they are no longer established (e.g. in finwait). This could be refined further, e.g. by adding 'soft' established timeout (i.e., a timeout that is only used once we get close to resource exhaustion). Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-13netlink: pass extended ACK struct to parsing functionsJohannes Berg1-1/+2
Pass the new extended ACK reporting struct to all of the generic netlink parsing functions. For now, pass NULL in almost all callers (except for some in the core.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02netfilter: conntrack: no need to pass ctinfo to error handlerFlorian Westphal1-1/+0
It is never accessed for reading and the only places that write to it are the icmp(6) handlers, which also set skb->nfct (and skb->nfctinfo). The conntrack core specifically checks for attached skb->nfct after ->error() invocation and returns early in this case. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-13netfilter: remove ip_conntrack* sysctl compat codePablo Neira Ayuso1-126/+1
This backward compatibility has been around for more than ten years, since Yasuyuki Kozakai introduced IPv6 in conntrack. These days, we have alternate /proc/net/nf_conntrack* entries, the ctnetlink interface and the conntrack utility got adopted by many people in the user community according to what I observed on the netfilter user mailing list. So let's get rid of this. Note that nf_conntrack_htable_size and unsigned int nf_conntrack_max do not need to be exported as symbol anymore. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12netfilter: conntrack: Only need first 4 bytes to get l4proto portsGao Feng1-2/+2
We only need first 4 bytes instead of 8 bytes to get the ports of tcp/udp/dccp/sctp/udplite in their pkt_to_tuple function. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-7/+1
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, mostly from Florian Westphal to sort out the lack of sufficient validation in x_tables and connlabel preparation patches to add nf_tables support. They are: 1) Ensure we don't go over the ruleset blob boundaries in mark_source_chains(). 2) Validate that target jumps land on an existing xt_entry. This extra sanitization comes with a performance penalty when loading the ruleset. 3) Introduce xt_check_entry_offsets() and use it from {arp,ip,ip6}tables. 4) Get rid of the smallish check_entry() functions in {arp,ip,ip6}tables. 5) Make sure the minimal possible target size in x_tables. 6) Similar to #3, add xt_compat_check_entry_offsets() for compat code. 7) Check that standard target size is valid. 8) More sanitization to ensure that the target_offset field is correct. 9) Add xt_check_entry_match() to validate that matches are well-formed. 10-12) Three patch to reduce the number of parameters in translate_compat_table() for {arp,ip,ip6}tables by using a container structure. 13) No need to return value from xt_compat_match_from_user(), so make it void. 14) Consolidate translate_table() so it can be used by compat code too. 15) Remove obsolete check for compat code, so we keep consistent with what was already removed in the native layout code (back in 2007). 16) Get rid of target jump validation from mark_source_chains(), obsoleted by #2. 17) Introduce xt_copy_counters_from_user() to consolidate counter copying, and use it from {arp,ip,ip6}tables. 18,22) Get rid of unnecessary explicit inlining in ctnetlink for dump functions. 19) Move nf_connlabel_match() to xt_connlabel. 20) Skip event notification if connlabel did not change. 21) Update of nf_connlabels_get() to make the upcoming nft connlabel support easier. 23) Remove spinlock to read protocol state field in conntrack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-19netfilter: conntrack: don't acquire lock during seq_printfFlorian Westphal1-7/+1
read access doesn't need any lock here. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-07netfilter: nf_conntrack_tcp: Fix stack out of bounds when parsing TCP optionsJozsef Kadlecsik1-0/+4
Baozeng Ding reported a KASAN stack out of bounds issue - it uncovered that the TCP option parsing routines in netfilter TCP connection tracking could read one byte out of the buffer of the TCP options. Therefore in the patch we check that the available data length is large enough to parse both TCP option code and size. Reported-by: Baozeng Ding <sploving1@gmail.com> Tested-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: nf_conntrack: Add a struct net parameter to l4_pkt_to_tupleEric W. Biederman1-1/+1
As gre does not have the srckey in the packet gre_pkt_to_tuple needs to perform a lookup in it's per network namespace tables. Pass in the proper network namespace to all pkt_to_tuple implementations to ensure gre (and any similar protocols) can get this right. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-15conntrack: RFC5961 challenge ACK confuse conntrack LAST-ACK transitionJesper Dangaard Brouer1-3/+32
In compliance with RFC5961, the network stack send challenge ACK in response to spurious SYN packets, since commit 0c228e833c88 ("tcp: Restore RFC5961-compliant behavior for SYN packets"). This pose a problem for netfilter conntrack in state LAST_ACK, because this challenge ACK is (falsely) seen as ACKing last FIN, causing a false state transition (into TIME_WAIT). The challenge ACK is hard to distinguish from real last ACK. Thus, solution introduce a flag that tracks the potential for seeing a challenge ACK, in case a SYN packet is let through and current state is LAST_ACK. When conntrack transition LAST_ACK to TIME_WAIT happens, this flag is used for determining if we are expecting a challenge ACK. Scapy based reproducer script avail here: https://github.com/netoptimizer/network-testing/blob/master/scapy/tcp_hacks_3WHS_LAST_ACK.py Fixes: 0c228e833c88 ("tcp: Restore RFC5961-compliant behavior for SYN packets") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-12-08Merge branch 'iov_iter' into for-nextAl Viro1-2/+2
2014-11-05netfilter: Convert print_tuple functions to return voidJoe Perches1-5/+5
Since adding a new function to seq_file (seq_has_overflowed()) there isn't any value for functions called from seq_show to return anything. Remove the int returns of the various print_tuple/<foo>_print_tuple functions. Link: http://lkml.kernel.org/p/f2e8cf8df433a197daa62cbaf124c900c708edc7.1412031505.git.joe@perches.com Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-05netfilter: Remove return values for print_conntrack callbacksSteven Rostedt (Red Hat)1-2/+2
The seq_printf() and friends are having their return values removed. The print_conntrack() returns the result of seq_printf(), which is meaningless when seq_printf() returns void. Might as well remove the return values of print_conntrack() as well. Link: http://lkml.kernel.org/r/20141029220107.465008329@goodmis.org Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-10-22netfilter: nf_conntrack: allow server to become a client in TW handlingMarcelo Leitner1-2/+2
When a port that was used to listen for inbound connections gets closed and reused for outgoing connections (like rsh ends up doing for stderr flow), current we may reject the SYN/ACK packet for the new connection because tcp_conntracks states forbirds a port to become a client while there is still a TIME_WAIT entry in there for it. As TCP may expire the TIME_WAIT socket in 60s and conntrack's timeout for it is 120s, there is a ~60s window that the application can end up opening a port that conntrack will end up blocking. This patch fixes this by simply allowing such state transition: if we see a SYN, in TIME_WAIT state, on REPLY direction, move it to sSS. Note that the rest of the code already handles this situation, more specificly in tcp_packet(), first switch clause. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-08-28netfilter: add SYNPROXY core/targetPatrick McHardy1-0/+16
Add a SYNPROXY for netfilter. The code is split into two parts, the synproxy core with common functions and an address family specific target. The SYNPROXY receives the connection request from the client, responds with a SYN/ACK containing a SYN cookie and announcing a zero window and checks whether the final ACK from the client contains a valid cookie. It then establishes a connection to the original destination and, if successful, sends a window update to the client with the window size announced by the server. Support for timestamps, SACK, window scaling and MSS options can be statically configured as target parameters if the features of the server are known. If timestamps are used, the timestamp value sent back to the client in the SYN/ACK will be different from the real timestamp of the server. In order to now break PAWS, the timestamps are translated in the direction server->client. Signed-off-by: Patrick McHardy <kaber@trash.net> Tested-by: Martin Topholm <mph@one.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-08-28netfilter: nf_conntrack: make sequence number adjustments usuable without NATPatrick McHardy1-16/+2
Split out sequence number adjustments from NAT and move them to the conntrack core to make them usable for SYN proxying. The sequence number adjustment information is moved to a seperate extend. The extend is added to new conntracks when a NAT mapping is set up for a connection using a helper. As a side effect, this saves 24 bytes per connection with NAT in the common case that a connection does not have a helper assigned. Signed-off-by: Patrick McHardy <kaber@trash.net> Tested-by: Martin Topholm <mph@one.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-08-20Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-2/+2
Conflicts: net/netfilter/nf_conntrack_proto_tcp.c The conflict had to do with overlapping changes dealing with fixing the use of an "s32" to hold the value returned by NAT_OFFSET(). Pablo Neira Ayuso says: ==================== The following batch contains Netfilter/IPVS updates for your net-next tree. More specifically, they are: * Trivial typo fix in xt_addrtype, from Phil Oester. * Remove net_ratelimit in the conntrack logging for consistency with other logging subsystem, from Patrick McHardy. * Remove unneeded includes from the recently added xt_connlabel support, from Florian Westphal. * Allow to update conntracks via nfqueue, don't need NFQA_CFG_F_CONNTRACK for this, from Florian Westphal. * Remove tproxy core, now that we have socket early demux, from Florian Westphal. * A couple of patches to refactor conntrack event reporting to save a good bunch of lines, from Florian Westphal. * Fix missing locking in NAT sequence adjustment, it did not manifested in any known bug so far, from Patrick McHardy. * Change sequence number adjustment variable to 32 bits, to delay the possible early overflow in long standing connections, also from Patrick. * Comestic cleanups for IPVS, from Dragos Foianu. * Fix possible null dereference in IPVS in the SH scheduler, from Daniel Borkmann. * Allow to attach conntrack expectations via nfqueue. Before this patch, you had to use ctnetlink instead, thus, we save the conntrack lookup. * Export xt_rpfilter and xt_HMARK header files, from Nicolas Dichtel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-10netfilter: nf_conntrack: fix tcp_in_window for Fast OpenYuchung Cheng1-4/+8
Currently the conntrack checks if the ending sequence of a packet falls within the observed receive window. However it does so even if it has not observe any packet from the remote yet and uses an uninitialized receive window (td_maxwin). If a connection uses Fast Open to send a SYN-data packet which is dropped afterward in the network. The subsequent SYNs retransmits will all fail this check and be discarded, leading to a connection timeout. This is because the SYN retransmit does not contain data payload so end == initial sequence number (isn) + 1 sender->td_end == isn + syn_data_len receiver->td_maxwin == 0 The fix is to only apply this check after td_maxwin is initialized. Reported-by: Michael Chan <mcfchan@stanford.edu> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-07-31netfilter: nf_nat: change sequence number adjustments to 32 bitsPatrick McHardy1-2/+2
Using 16 bits is too small, when many adjustments happen the offsets might overflow and break the connection. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-20netfilter: nf_conntrack: avoid large timeout for mid-stream pickupFlorian Westphal1-0/+6
When loose tracking is enabled (default), non-syn packets cause creation of new conntracks in established state with default timeout for established state (5 days). This causes the table to fill up with UNREPLIED when the 'new ack' packet happened to be the last-ack of a previous, already timed-out connection. Consider: A 192.168.x.52792 > 10.184.y.80: F, 426:426(0) ack 9237 win 255 B 10.184.y.80 > 192.168.x.52792: ., ack 427 win 123 <61 second pause> C 10.184.y.80 > 192.168.x.52792: F, 9237:9237(0) ack 427 win 123 D 192.168.x.52792 > 10.184.y.80: ., ack 9238 win 255 B moves conntrack to CLOSE_WAIT and will kill it after 60 second timeout, C is ignored (FIN set), but last packet (D) causes new ct with 5-days timeout. Use UNACK timeout (5 minutes) instead to get rid of these entries sooner when in ESTABLISHED state without having seen traffic in both directions. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-18netfilter: add my copyright statementsPatrick McHardy1-0/+2
Add copyright statements to all netfilter files which have had significant changes done by myself in the past. Some notes: - nf_conntrack_ecache.c was incorrectly attributed to Rusty and Netfilter Core Team when it got split out of nf_conntrack_core.c. The copyrights even state a date which lies six years before it was written. It was written in 2005 by Harald and myself. - net/ipv{4,6}/netfilter.c, net/netfitler/nf_queue.c were missing copyright statements. I've added the copyright statement from net/netfilter/core.c, where this code originated - for nf_conntrack_proto_tcp.c I've also added Jozsef, since I didn't want it to give the wrong impression Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-05netfilter: nf_log: prepare net namespace support for loggersGao feng1-9/+9
This patch adds netns support to nf_log and it prepares netns support for existing loggers. It is composed of four major changes. 1) nf_log_register has been split to two functions: nf_log_register and nf_log_set. The new nf_log_register is used to globally register the nf_logger and nf_log_set is used for enabling pernet support from nf_loggers. Per netns is not yet complete after this patch, it comes in separate follow up patches. 2) Add net as a parameter of nf_log_bind_pf. Per netns is not yet complete after this patch, it only allows to bind the nf_logger to the protocol family from init_net and it skips other cases. 3) Adapt all nf_log_packet callers to pass netns as parameter. After this patch, this function only works for init_net. 4) Make the sysctl net/netfilter/nf_log pernet. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-12-03netfilter: ctnetlink: nla_policy updatesFlorian Westphal1-0/+2
Add stricter checking for a few attributes. Note that these changes don't fix any bug in the current code base. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-19/+10
Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-09netfilter: Validate the sequence number of dataless ACK packets as wellJozsef Kadlecsik1-8/+2
We spare nothing by not validating the sequence number of dataless ACK packets and enabling it makes harder off-path attacks. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-09netfilter: Mark SYN/ACK packets as invalid from original directionJozsef Kadlecsik1-11/+8
Clients should not send such packets. By accepting them, we open up a hole by wich ephemeral ports can be discovered in an off-path attack. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30netfilter: add protocol independent NAT corePatrick McHardy1-4/+4
Convert the IPv4 NAT implementation to a protocol independent core and address family specific modules. Signed-off-by: Patrick McHardy <kaber@trash.net>
2012-07-04netfilter: nf_ct_tcp: missing per-net support for cttimeoutPablo Neira Ayuso1-1/+1
This patch adds missing per-net support for the cttimeout infrastructure to TCP. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-07-04netfilter: nf_conntrack: generalize nf_ct_l4proto_netPablo Neira Ayuso1-0/+7
This patch generalizes nf_ct_l4proto_net by splitting it into chunks and moving the corresponding protocol part to where it really belongs to. To clarify, note that we follow two different approaches to support per-net depending if it's built-in or run-time loadable protocol tracker. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-06-27netfilter: nf_ct_tcp: merge tcpv[4,6]_net_init into tcp_net_initGao feng1-50/+21
Merge tcpv4_net_init and tcpv6_net_init into tcp_net_init to remove redundant code now that we have the u_int16_t proto parameter. And use nf_proto_net.users to identify if it's the first time we use the nf_proto_net, in that case, we initialize it. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-27netfilter: nf_conntrack: prepare l4proto->init_net cleanupGao feng1-2/+2
l4proto->init contain quite redundant code. We can simplify this by adding a new parameter l3proto. This patch prepares that code simplification. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-11netfilter: nf_ct_tcp, udp: fix compilation with sysctl disabledPablo Neira Ayuso1-2/+2
This patch fixes the compilation of the TCP and UDP trackers with sysctl compilation disabled: net/netfilter/nf_conntrack_proto_udp.c: In function ‘udp_init_net_data’: net/netfilter/nf_conntrack_proto_udp.c:279:13: error: ‘struct nf_proto_net’ has no member named ‘user’ net/netfilter/nf_conntrack_proto_tcp.c:1606:9: error: ‘struct nf_proto_net’ has no member named ‘user’ net/netfilter/nf_conntrack_proto_tcp.c:1643:9: error: ‘struct nf_proto_net’ has no member named ‘user’ Reported-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-07netfilter: nf_conntrack: add namespace support for cttimeoutGao feng1-2/+4
This patch adds namespace support for cttimeout. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-07netfilter: nf_conntrack: remove now unused sysctl for nf_conntrack_l[3|4]protoPablo Neira Ayuso1-15/+0
Since the sysctl data for l[3|4]proto now resides in pernet nf_proto_net. We can now remove this unused fields from struct nf_contrack_l[3,4]proto. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>