aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/net/netfilter/nf_tables_api.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-05-15netfilter: nf_tables: revisit chain/object refcounting from elementsPablo Neira Ayuso1-10/+72
Andreas reports that the following incremental update using our commit protocol doesn't work. # nft -f incremental-update.nft delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 } delete chain ip filter CIn_1 ... Error: Could not process rule: Device or resource busy The existing code is not well-integrated into the commit phase protocol, since element deletions do not result in refcount decrement from the preparation phase. This results in bogus EBUSY errors like the one above. Two new functions come with this patch: * nft_set_elem_activate() function is used from the abort path, to restore the set element refcounting on objects that occurred from the preparation phase. * nft_set_elem_deactivate() that is called from nft_del_setelem() to decrement set element refcounting on objects from the preparation phase in the commit protocol. The nft_data_uninit() has been renamed to nft_data_release() since this function does not uninitialize any data store in the data register, instead just releases the references to objects. Moreover, a new function nft_data_hold() has been introduced to be used from nft_set_elem_activate(). Reported-by: Andreas Schultz <aschultz@tpip.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-15netfilter: nf_tables: can't assume lock is acquired when dumping set elemsLiping Zhang1-22/+56
When dumping the elements related to a specified set, we may invoke the nf_tables_dump_set with the NFNL_SUBSYS_NFTABLES lock not acquired. So we should use the proper rcu operation to avoid race condition, just like other nft dump operations. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-0/+5
Pablo Neira Ayuso says: ==================== Netfilter/IPVS/OVS fixes for net The following patchset contains a rather large batch of Netfilter, IPVS and OVS fixes for your net tree. This includes fixes for ctnetlink, the userspace conntrack helper infrastructure, conntrack OVS support, ebtables DNAT target, several leaks in error path among other. More specifically, they are: 1) Fix reference count leak in the CT target error path, from Gao Feng. 2) Remove conntrack entry clashing with a matching expectation, patch from Jarno Rajahalme. 3) Fix bogus EEXIST when registering two different userspace helpers, from Liping Zhang. 4) Don't leak dummy elements in the new bitmap set type in nf_tables, from Liping Zhang. 5) Get rid of module autoload from conntrack update path in ctnetlink, we don't need autoload at this late stage and it is happening with rcu read lock held which is not good. From Liping Zhang. 6) Fix deadlock due to double-acquire of the expect_lock from conntrack update path, this fixes a bug that was introduced when the central spinlock got removed. Again from Liping Zhang. 7) Safe ct->status update from ctnetlink path, from Liping. The expect_lock protection that was selected when the central spinlock was removed was not really protecting anything at all. 8) Protect sequence adjustment under ct->lock. 9) Missing socket match with IPv6, from Peter Tirsek. 10) Adjust skb->pkt_type of DNAT'ed frames from ebtables, from Linus Luessing. 11) Don't give up on evaluating the expression on new entries added via dynset expression in nf_tables, from Liping Zhang. 12) Use skb_checksum() when mangling icmpv6 in IPv6 NAT as this deals with non-linear skbuffs. 13) Don't allow IPv6 service in IPVS if no IPv6 support is available, from Paolo Abeni. 14) Missing mutex release in error path of xt_find_table_lock(), from Dan Carpenter. 15) Update maintainers files, Netfilter section. Add Florian to the file, refer to nftables.org and change project status from Supported to Maintained. 16) Bail out on mismatching extensions in element updates in nf_tables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03netfilter: nf_tables: check if same extensions are set when adding elementsPablo Neira Ayuso1-0/+5
If no NLM_F_EXCL is set and the element already exists in the set, make sure that both elements have the same extensions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-29/+25
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for your net-next tree. A large bunch of code cleanups, simplify the conntrack extension codebase, get rid of the fake conntrack object, speed up netns by selective synchronize_net() calls. More specifically, they are: 1) Check for ct->status bit instead of using nfct_nat() from IPVS and Netfilter codebase, patch from Florian Westphal. 2) Use kcalloc() wherever possible in the IPVS code, from Varsha Rao. 3) Simplify FTP IPVS helper module registration path, from Arushi Singhal. 4) Introduce nft_is_base_chain() helper function. 5) Enforce expectation limit from userspace conntrack helper, from Gao Feng. 6) Add nf_ct_remove_expect() helper function, from Gao Feng. 7) NAT mangle helper function return boolean, from Gao Feng. 8) ctnetlink_alloc_expect() should only work for conntrack with helpers, from Gao Feng. 9) Add nfnl_msg_type() helper function to nfnetlink to build the netlink message type. 10) Get rid of unnecessary cast on void, from simran singhal. 11) Use seq_puts()/seq_putc() instead of seq_printf() where possible, also from simran singhal. 12) Use list_prev_entry() from nf_tables, from simran signhal. 13) Remove unnecessary & on pointer function in the Netfilter and IPVS code. 14) Remove obsolete comment on set of rules per CPU in ip6_tables, no longer true. From Arushi Singhal. 15) Remove duplicated nf_conntrack_l4proto_udplite4, from Gao Feng. 16) Remove unnecessary nested rcu_read_lock() in __nf_nat_decode_session(). Code running from hooks are already guaranteed to run under RCU read side. 17) Remove deadcode in nf_tables_getobj(), from Aaron Conole. 18) Remove double assignment in nf_ct_l4proto_pernet_unregister_one(), also from Aaron. 19) Get rid of unsed __ip_set_get_netlink(), from Aaron Conole. 20) Don't propagate NF_DROP error to userspace via ctnetlink in __nf_nat_alloc_null_binding() function, from Gao Feng. 21) Revisit nf_ct_deliver_cached_events() to remove unnecessary checks, from Gao Feng. 22) Kill the fake untracked conntrack objects, use ctinfo instead to annotate a conntrack object is untracked, from Florian Westphal. 23) Remove nf_ct_is_untracked(), now obsolete since we have no conntrack template anymore, from Florian. 24) Add event mask support to nft_ct, also from Florian. 25) Move nf_conn_help structure to include/net/netfilter/nf_conntrack_helper.h. 26) Add a fixed 32 bytes scratchpad area for conntrack helpers. Thus, we don't deal with variable conntrack extensions anymore. Make sure userspace conntrack helper doesn't go over that size. Remove variable size ct extension infrastructure now this code got no more clients. From Florian Westphal. 27) Restore offset and length of nf_ct_ext structure to 8 bytes now that wraparound is not possible any longer, also from Florian. 28) Allow to get rid of unassured flows under stress in conntrack, this applies to DCCP, SCTP and TCP protocols, from Florian. 29) Shrink size of nf_conntrack_ecache structure, from Florian. 30) Use TCP_MAX_WSCALE instead of hardcoded 14 in TCP tracker, from Gao Feng. 31) Register SYNPROXY hooks on demand, from Florian Westphal. 32) Use pernet hook whenever possible, instead of global hook registration, from Florian Westphal. 33) Pass hook structure to ebt_register_table() to consolidate some infrastructure code, from Florian Westphal. 34) Use consume_skb() and return NF_STOLEN, instead of NF_DROP in the SYNPROXY code, to make sure device stats are not fooled, patch from Gao Feng. 35) Remove NF_CT_EXT_F_PREALLOC this kills quite some code that we don't need anymore if we just select a fixed size instead of expensive runtime time calculation of this. From Florian. 36) Constify nf_ct_extend_register() and nf_ct_extend_unregister(), from Florian. 37) Simplify nf_ct_ext_add(), this kills nf_ct_ext_create(), from Florian. 38) Attach NAT extension on-demand from masquerade and pptp helper path, from Florian. 39) Get rid of useless ip_vs_set_state_timeout(), from Aaron Conole. 40) Speed up netns by selective calls of synchronize_net(), from Florian Westphal. 41) Silence stack size warning gcc in 32-bit arch in snmp helper, from Florian. 42) Inconditionally call nf_ct_ext_destroy(), even if we have no extensions, to deal with the NF_NAT_MANIP_SRC case. Patch from Liping Zhang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-14netfilter: nf_tables: remove double return statementAaron Conole1-2/+0
Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-13netlink: pass extended ACK struct to parsing functionsJohannes Berg1-11/+16
Pass the new extended ACK reporting struct to all of the generic netlink parsing functions. For now, pass NULL in almost all callers (except for some in the core.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07net: netfilter: Use list_{next/prev}_entry instead of list_entrysimran singhal1-1/+1
This patch replace list_entry with list_prev_entry as it makes the code more clear to read. Signed-off-by: simran singhal <singhalsimran0@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-07netfilter: Add nfnl_msg_type() helper functionPablo Neira Ayuso1-11/+9
Add and use nfnl_msg_type() function to replace opencoded nfnetlink message type. I suggested this change, Arushi Singhal made an initial patch to address this but was missing several spots. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-06netfilter: nf_tables: add nft_is_base_chain() helperPablo Neira Ayuso1-15/+15
This new helper function allows us to check if this is a basechain. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+0
Conflicts: drivers/net/ethernet/broadcom/genet/bcmmii.c drivers/net/hyperv/netvsc.c kernel/bpf/hashtab.c Almost entirely overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-13Revert "netfilter: nf_tables: add flush field to struct nft_set_iter"Pablo Neira Ayuso1-4/+0
This reverts commit 1f48ff6c5393aa7fe290faf5d633164f105b0aa7. This patch is not required anymore now that we keep a dummy list of set elements in the bitmap set implementation, so revert this before we forget this code has no clients. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-13netfilter: provide nft_ctx in object init functionFlorian Westphal1-3/+4
this is needed by the upcoming ct helper object type -- we'd like to be able use the table family (ip, ip6, inet) to figure out which helper has to be requested. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-06netfilter: nf_tables: add nft_set_lookup()Pablo Neira Ayuso1-7/+24
This new function consolidates set lookup via either name or ID by introducing a new nft_set_lookup() function. Replace existing spots where we can use this too. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-06netfilter: nf_tables: validate the expr explicitly after init successfullyLiping Zhang1-0/+11
When we want to validate the expr's dependency or hooks, we must do two things to accomplish it. First, write a X_validate callback function and point ->validate to it. Second, call X_validate in init routine. This is very common, such as fib, nat, reject expr and so on ... It is a little ugly, since we will call X_validate in the expr's init routine, it's better to do it in nf_tables_newexpr. So we can avoid to do this again and again. After doing this, the second step listed above is not useful anymore, remove them now. Patch was tested by nftables/tests/py/nft-test.py and nftables/tests/shell/run-tests.sh. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-03netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() failsPablo Neira Ayuso1-78/+55
The underlying nlmsg_multicast() already sets sk->sk_err for us to notify socket overruns, so we should not do anything with this return value. So we just call nfnetlink_set_err() if: 1) We fail to allocate the netlink message. or 2) We don't have enough space in the netlink message to place attributes, which means that we likely need to allocate a larger message. Before this patch, the internal ESRCH netlink error code was propagated to userspace, which is quite misleading. Netlink semantics mandate that listeners just hit ENOBUFS if the socket buffer overruns. Reported-by: Alexander Alemayhu <alexander@alemayhu.com> Tested-by: Alexander Alemayhu <alexander@alemayhu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-12netfilter: nf_tables: honor NFT_SET_OBJECT in set backend selectionPablo Neira Ayuso1-1/+2
Check for NFT_SET_OBJECT feature flag, otherwise we may end up selecting the wrong set backend. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-12netfilter: nf_tables: add NFTA_RULE_ID attributePablo Neira Ayuso1-0/+26
This new attribute allows us to uniquely identify a rule in transaction. Robots may trigger an insertion followed by deletion in a batch, in that scenario we still don't have a public rule handle that we can use to delete the rule. This is similar to the NFTA_SET_ID attribute that allows us to refer to an anonymous set from a batch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-12netfilter: nf_tables: add check_genid to the nfnetlink subsystemPablo Neira Ayuso1-0/+6
This patch implements the check generation id as provided by nfnetlink. This allows us to reject ruleset updates against stale baseline, so userspace can retry update with a fresh ruleset cache. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: add space notation to setsPablo Neira Ayuso1-5/+17
The space notation allows us to classify the set backend implementation based on the amount of required memory. This provides an order of the set representation scalability in terms of memory. The size field is still left in place so use this if the userspace provides no explicit number of elements, so we cannot calculate the real memory that this set needs. This also helps us break ties in the set backend selection routine, eg. two backend implementations provide the same performance. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: rename struct nft_set_estimate class fieldPablo Neira Ayuso1-6/+6
Use lookup as field name instead, to prepare the introduction of the memory class in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: add flush field to struct nft_set_iterPablo Neira Ayuso1-0/+4
This provides context to walk callback iterator, thus, we know if the walk happens from the set flush path. This is required by the new bitmap set type coming in a follow up patch which has no real struct nft_set_ext, so it has to allocate it based on the two bit compact element representation. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: rename deactivate_one() to flush()Pablo Neira Ayuso1-1/+1
Although semantics are similar to deactivate() with no implicit element lookup, this is only called from the set flush path, so better rename this to flush(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: use struct nft_set_iter in set element flushPablo Neira Ayuso1-7/+5
Instead of struct nft_set_dump_args, remove unnecessary wrapper structure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: pass netns to set->ops->remove()Pablo Neira Ayuso1-3/+3
This new parameter is required by the new bitmap set type that comes in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-35/+28
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Stash ctinfo 3-bit field into pointer to nf_conntrack object from sk_buff so we only access one single cacheline in the conntrack hotpath. Patchset from Florian Westphal. 2) Don't leak pointer to internal structures when exporting x_tables ruleset back to userspace, from Willem DeBruijn. This includes new helper functions to copy data to userspace such as xt_data_to_user() as well as conversions of our ip_tables, ip6_tables and arp_tables clients to use it. Not surprinsingly, ebtables requires an ad-hoc update. There is also a new field in x_tables extensions to indicate the amount of bytes that we copy to userspace. 3) Add nf_log_all_netns sysctl: This new knob allows you to enable logging via nf_log infrastructure for all existing netnamespaces. Given the effort to provide pernet syslog has been discontinued, let's provide a way to restore logging using netfilter kernel logging facilities in trusted environments. Patch from Michal Kubecek. 4) Validate SCTP checksum from conntrack helper, from Davide Caratti. 5) Merge UDPlite conntrack and NAT helpers into UDP, this was mostly a copy&paste from the original helper, from Florian Westphal. 6) Reset netfilter state when duplicating packets, also from Florian. 7) Remove unnecessary check for broadcast in IPv6 in pkttype match and nft_meta, from Liping Zhang. 8) Add missing code to deal with loopback packets from nft_meta when used by the netdev family, also from Liping. 9) Several cleanups on nf_tables, one to remove unnecessary check from the netlink control plane path to add table, set and stateful objects and code consolidation when unregister chain hooks, from Gao Feng. 10) Fix harmless reference counter underflow in IPVS that, however, results in problems with the introduction of the new refcount_t type, from David Windsor. 11) Enable LIBCRC32C from nf_ct_sctp instead of nf_nat_sctp, from Davide Caratti. 12) Missing documentation on nf_tables uapi header, from Liping Zhang. 13) Use rb_entry() helper in xt_connlimit, from Geliang Tang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02netfilter: nf_tables: Eliminate duplicated code in nf_tables_table_enable()Feng1-23/+25
If something fails in nf_tables_table_enable(), it unregisters the chains. But the rollback code is the same as nf_tables_table_disable() almostly, except there is one counter check. Now create one wrapper function to eliminate the duplicated codes. Signed-off-by: Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-24netfilter: nf_tables: bump set->ndeact on set flushPablo Neira Ayuso1-0/+1
Add missing set->ndeact update on each deactivated element from the set flush path. Otherwise, sets with fixed size break after flush since accounting breaks. # nft add set x y { type ipv4_addr\; size 2\; } # nft add element x y { 1.1.1.1 } # nft add element x y { 1.1.1.2 } # nft flush set x y # nft add element x y { 1.1.1.1 } <cmdline>:1:1-28: Error: Could not process rule: Too many open files in system Fixes: 8411b6442e59 ("netfilter: nf_tables: support for set flushing") Reported-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-24netfilter: nf_tables: deconstify walk callback functionPablo Neira Ayuso1-12/+12
The flush operation needs to modify set and element objects, so let's deconstify this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-24netfilter: nf_tables: fix set->nelems counting with no NLM_F_EXCLPablo Neira Ayuso1-7/+9
If the element exists and no NLM_F_EXCL is specified, do not bump set->nelems, otherwise we leak one set element slot. This problem amplifies if the set is full since the abort path always decrements the counter for the -ENFILE case too, giving one spare extra slot. Fix this by moving set->nelems update to nft_add_set_elem() after successful element insertion. Moreover, remove the element if the set is full so there is no need to rely on the abort path to undo things anymore. Fixes: c016c7e45ddf ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-23netfilter: nf_tables: validate the name size when possibleLiping Zhang1-7/+14
Currently, if the user add a stateful object with the name size exceed NFT_OBJ_MAXNAMELEN - 1 (i.e. 31), we truncate it down to 31 silently. This is not friendly, furthermore, this will cause duplicated stateful objects when the first 31 characters of the name is same. So limit the stateful object's name size to NFT_OBJ_MAXNAMELEN - 1. After apply this patch, error message will be printed out like this: # name_32=$(printf "%0.sQ" {1..32}) # nft add counter filter $name_32 <cmdline>:1:1-52: Error: Could not process rule: Numerical result out of range add counter filter QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Also this patch cleans up the codes which missing the name size limit validation in nftables. Fixes: e50092404c1b ("netfilter: nf_tables: add stateful objects") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-18netfilter: nf_tables: eliminate useless condition checksGao Feng1-12/+3
The return value of nf_tables_table_lookup() is valid pointer or one pointer error. There are two cases: 1) IS_ERR(table) is true, it would return the error or reset the table as NULL, it is unnecessary to perform the latter check "table != NULL". 2) IS_ERR(obj) is false, the table is one valid pointer. It is also unnecessary to perform that check. The nf_tables_newset() and nf_tables_newobj() have same logic codes. In summary, we could move the block of condition check "table != NULL" in the else block to eliminate the original condition checks. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-16netfilter: nf_tables: fix possible oops when dumping stateful objectsLiping Zhang1-2/+3
When dumping nft stateful objects, if NFTA_OBJ_TABLE and NFTA_OBJ_TYPE attributes are not specified either, filter will become NULL, so oops will happen(actually nft utility will always set NFTA_OBJ_TABLE attr, so I write a test program to make this happen): BUG: unable to handle kernel NULL pointer dereference at (null) IP: nf_tables_dump_obj+0x17c/0x330 [nf_tables] [...] Call Trace: ? nf_tables_dump_obj+0x5/0x330 [nf_tables] ? __kmalloc_reserve.isra.35+0x31/0x90 ? __alloc_skb+0x5b/0x1e0 netlink_dump+0x124/0x2a0 __netlink_dump_start+0x161/0x190 nf_tables_getobj+0xe8/0x280 [nf_tables] Fixes: a9fea2a3c3cf ("netfilter: nf_tables: allow to filter stateful object dumps by type") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-14netfilter: nf_tables: fix oob accessFlorian Westphal1-1/+1
BUG: KASAN: slab-out-of-bounds in nf_tables_rule_destroy+0xf1/0x130 at addr ffff88006a4c35c8 Read of size 8 by task nft/1607 When we've destroyed last valid expr, nft_expr_next() returns an invalid expr. We must not dereference it unless it passes != nft_expr_last() check. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nf_tables: support for set flushingPablo Neira Ayuso1-6/+49
This patch adds support for set flushing, that consists of walking over the set elements if the NFTA_SET_ELEM_LIST_ELEMENTS attribute is set. This patch requires the following changes: 1) Add set->ops->deactivate_one() operation: This allows us to deactivate an element from the set element walk path, given we can skip the lookup that happens in ->deactivate(). 2) Add a new nft_trans_alloc_gfp() function since we need to allocate transactions using GFP_ATOMIC given the set walk path happens with held rcu_read_lock. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nf_tables: constify struct nft_ctx * parameter in nft_trans_alloc()Pablo Neira Ayuso1-2/+2
Context is not modified by nft_trans_alloc(), so constify it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nf_tables: allow to filter stateful object dumps by typePablo Neira Ayuso1-0/+50
This patch adds the netlink code to filter out dump of stateful objects, through the NFTA_OBJ_TYPE netlink attribute. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nft_objref: support for stateful object mapsPablo Neira Ayuso1-0/+4
This patch allows us to refer to stateful object dictionaries, the source register indicates the key data to be used to look up for the corresponding state object. We can refer to these maps through names or, alternatively, the map transaction id. This allows us to refer to both anonymous and named maps. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nf_tables: add stateful object reference to set elementsPablo Neira Ayuso1-10/+62
This patch allows you to refer to stateful objects from set elements. This provides the infrastructure to create maps where the right hand side of the mapping is a stateful object. This allows us to build dictionaries of stateful objects, that you can use to perform fast lookups using any arbitrary key combination. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nft_quota: add depleted flag for objectsPablo Neira Ayuso1-0/+1
Notify on depleted quota objects. The NFT_QUOTA_F_DEPLETED flag indicates we have reached overquota. Add pointer to table from nft_object, so we can use it when sending the depletion notification to userspace. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nf_tables: notify internal updates of stateful objectsPablo Neira Ayuso1-12/+19
Introduce nf_tables_obj_notify() to notify internal state changes in stateful objects. This is used by the quota object to report depletion in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-07netfilter: nf_tables: atomic dump and reset for stateful objectsPablo Neira Ayuso1-8/+21
This patch adds a new NFT_MSG_GETOBJ_RESET command perform an atomic dump-and-reset of the stateful object. This also comes with add support for atomic dump and reset for counter and quota objects. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-06netfilter: nf_tables: add stateful objectsPablo Neira Ayuso1-0/+516
This patch augments nf_tables to support stateful objects. This new infrastructure allows you to create, dump and delete stateful objects, that are identified by a user-defined name. This patch adds the generic infrastructure, follow up patches add support for two stateful objects: counters and quotas. This patch provides a native infrastructure for nf_tables to replace nfacct, the extended accounting infrastructure for iptables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-24netfilter: nf_tables: fix inconsistent element expiration calculationAnders K. Pedersen1-5/+9
As Liping Zhang reports, after commit a8b1e36d0d1d ("netfilter: nft_dynset: fix element timeout for HZ != 1000"), priv->timeout was stored in jiffies, while set->timeout was stored in milliseconds. This is inconsistent and incorrect. Firstly, we already call msecs_to_jiffies in nft_set_elem_init, so priv->timeout will be converted to jiffies twice. Secondly, if the user did not specify the NFTA_DYNSET_TIMEOUT attr, set->timeout will be used, but we forget to call msecs_to_jiffies when do update elements. Fix this by using jiffies internally for traditional sets and doing the conversions to/from msec when interacting with userspace - as dynset already does. This is preferable to doing the conversions, when elements are inserted or updated, because this can happen very frequently on busy dynsets. Fixes: a8b1e36d0d1d ("netfilter: nft_dynset: fix element timeout for HZ != 1000") Reported-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Acked-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-08netfilter: nf_tables: fix oops when inserting an element into a verdict mapLiping Zhang1-0/+1
Dalegaard says: The following ruleset, when loaded with 'nft -f bad.txt' ----snip---- flush ruleset table ip inlinenat { map sourcemap { type ipv4_addr : verdict; } chain postrouting { ip saddr vmap @sourcemap accept } } add chain inlinenat test add element inlinenat sourcemap { 100.123.10.2 : jump test } ----snip---- results in a kernel oops: BUG: unable to handle kernel paging request at 0000000000001344 IP: [<ffffffffa07bf704>] nf_tables_check_loops+0x114/0x1f0 [nf_tables] [...] Call Trace: [<ffffffffa07c2aae>] ? nft_data_init+0x13e/0x1a0 [nf_tables] [<ffffffffa07c1950>] nft_validate_register_store+0x60/0xb0 [nf_tables] [<ffffffffa07c74b5>] nft_add_set_elem+0x545/0x5e0 [nf_tables] [<ffffffffa07bfdd0>] ? nft_table_lookup+0x30/0x60 [nf_tables] [<ffffffff8132c630>] ? nla_strcmp+0x40/0x50 [<ffffffffa07c766e>] nf_tables_newsetelem+0x11e/0x210 [nf_tables] [<ffffffff8132c400>] ? nla_validate+0x60/0x80 [<ffffffffa030d9b4>] nfnetlink_rcv+0x354/0x5a7 [nfnetlink] Because we forget to fill the net pointer in bind_ctx, so dereferencing it may cause kernel crash. Reported-by: Dalegaard <dalegaard@gmail.com> Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-10-31netfilter: nf_tables: destroy the set if fail to add transactionLiping Zhang1-1/+3
When the memory is exhausted, then we will fail to add the NFT_MSG_NEWSET transaction. In such case, we should destroy the set before we free it. Fixes: 958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure to handle sets") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-10-27netfilter: nf_tables: fix type mismatch with error return from nft_parse_u32_checkJohn W. Linville1-1/+1
Commit 36b701fae12ac ("netfilter: nf_tables: validate maximum value of u32 netlink attributes") introduced nft_parse_u32_check with a return value of "unsigned int", yet on error it returns "-ERANGE". This patch corrects the mismatch by changing the return value to "int", which happens to match the actual users of nft_parse_u32_check already. Found by Coverity, CID 1373930. Note that commit 21a9e0f1568ea ("netfilter: nft_exthdr: fix error handling in nft_exthdr_init()) attempted to address the issue, but did not address the return type of nft_parse_u32_check. Signed-off-by: John W. Linville <linville@tuxdriver.com> Cc: Laura Garcia Liebana <nevola@gmail.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 36b701fae12ac ("netfilter: nf_tables: validate maximum value...") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-10-27netfilter: nf_tables: fix *leak* when expr clone failLiping Zhang1-5/+6
When nft_expr_clone failed, a series of problems will happen: 1. module refcnt will leak, we call __module_get at the beginning but we forget to put it back if ops->clone returns fail 2. memory will be leaked, if clone fail, we just return NULL and forget to free the alloced element 3. set->nelems will become incorrect when set->size is specified. If clone fail, we should decrease the set->nelems Now this patch fixes these problems. And fortunately, clone fail will only happen on counter expression when memory is exhausted. Fixes: 086f332167d6 ("netfilter: nf_tables: add clone interface to expression operations") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-10-17netfilter: nf_tables: underflow in nft_parse_u32_check()Dan Carpenter1-1/+1
We don't want to allow negatives here. Fixes: 36b701fae12a ('netfilter: nf_tables: validate maximum value of u32 netlink attributes') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-23netfilter: nf_tables: validate maximum value of u32 netlink attributesLaura Garcia Liebana1-0/+25
Fetch value and validate u32 netlink attribute. This validation is usually required when the u32 netlink attributes are being stored in a field whose size is smaller. This patch revisits 4da449ae1df9 ("netfilter: nft_exthdr: Add size check on u8 nft_exthdr attributes"). Fixes: 96518518cc41 ("netfilter: add nftables") Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>