aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/netfilter/nfnetlink.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-01-24netfilter: nf_tables: autoload modules from the abort pathPablo Neira Ayuso1-1/+1
This patch introduces a list of pending module requests. This new module list is composed of nft_module_request objects that contain the module name and one status field that tells if the module has been already loaded (the 'done' field). In the first pass, from the preparation phase, the netlink command finds that a module is missing on this list. Then, a module request is allocated and added to this list and nft_request_module() returns -EAGAIN. This triggers the abort path with the autoload parameter set on from nfnetlink, request_module() is called and the module request enters the 'done' state. Since the mutex is released when loading modules from the abort phase, the module list is zapped so this is iteration occurs over a local list. Therefore, the request_module() calls happen when object lists are in consistent state (after fulling aborting the transaction) and the commit list is empty. On the second pass, the netlink command will find that it already tried to load the module, so it does not request it again and nft_request_module() returns 0. Then, there is a look up to find the object that the command was missing. If the module was successfully loaded, the command proceeds normally since it finds the missing object in place, otherwise -ENOENT is reported to userspace. This patch also updates nfnetlink to include the reason to enter the abort phase, which is required for this new autoload module rationale. Fixes: ec7470b834fe ("netfilter: nf_tables: store transaction list locally while requesting module") Reported-by: syzbot+29125d208b3dae9a7019@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-04netfilter: nf_tables: fix suspicious RCU usage in nft_chain_stats_replace()Taehee Yoo1-12/+0
basechain->stats is rcu protected data which is updated from nft_chain_stats_replace(). This function is executed from the commit phase which holds the pernet nf_tables commit mutex - not the global nfnetlink subsystem mutex. Test commands to reproduce the problem are: %iptables-nft -I INPUT %iptables-nft -Z %iptables-nft -Z This patch uses RCU calls to handle basechain->stats updates to fix a splat that looks like: [89279.358755] ============================= [89279.363656] WARNING: suspicious RCU usage [89279.368458] 4.20.0-rc2+ #44 Tainted: G W L [89279.374661] ----------------------------- [89279.379542] net/netfilter/nf_tables_api.c:1404 suspicious rcu_dereference_protected() usage! [...] [89279.406556] 1 lock held by iptables-nft/5225: [89279.411728] #0: 00000000bf45a000 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1f/0x70 [nf_tables] [89279.424022] stack backtrace: [89279.429236] CPU: 0 PID: 5225 Comm: iptables-nft Tainted: G W L 4.20.0-rc2+ #44 [89279.430135] Call Trace: [89279.430135] dump_stack+0xc9/0x16b [89279.430135] ? show_regs_print_info+0x5/0x5 [89279.430135] ? lockdep_rcu_suspicious+0x117/0x160 [89279.430135] nft_chain_commit_update+0x4ea/0x640 [nf_tables] [89279.430135] ? sched_clock_local+0xd4/0x140 [89279.430135] ? check_flags.part.35+0x440/0x440 [89279.430135] ? __rhashtable_remove_fast.constprop.67+0xec0/0xec0 [nf_tables] [89279.430135] ? sched_clock_cpu+0x126/0x170 [89279.430135] ? find_held_lock+0x39/0x1c0 [89279.430135] ? hlock_class+0x140/0x140 [89279.430135] ? is_bpf_text_address+0x5/0xf0 [89279.430135] ? check_flags.part.35+0x440/0x440 [89279.430135] ? __lock_is_held+0xb4/0x140 [89279.430135] nf_tables_commit+0x2555/0x39c0 [nf_tables] Fixes: f102d66b335a4 ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-18netfilter: nf_tables: take module reference when starting a batchFlorian Westphal1-0/+1
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-01netfilter: nf_tables: fix chain dependency validationPablo Neira Ayuso1-0/+1
The following ruleset: add table ip filter add chain ip filter input { type filter hook input priority 4; } add chain ip filter ap add rule ip filter input jump ap add rule ip filter ap masquerade results in a panic, because the masquerade extension should be rejected from the filter chain. The existing validation is missing a chain dependency check when the rule is added to the non-base chain. This patch fixes the problem by walking down the rules from the basechains, searching for either immediate or lookup expressions, then jumping to non-base chains and again walking down the rules to perform the expression validation, so we make sure the full ruleset graph is validated. This is done only once from the commit phase, in case of problem, we abort the transaction and perform fine grain validation for error reporting. This patch requires 003087911af2 ("netfilter: nfnetlink: allow commit to fail") to achieve this behaviour. This patch also adds a cleanup callback to nfnl batch interface to reset the validate state from the exit path. As a result of this patch, nf_tables_check_loops() doesn't use ->validate to check for loops, instead it just checks for immediate expressions. Reported-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-12-05netlink: Remove smp_read_barrier_depends() from commentPaul E. McKenney1-2/+1
Now that smp_read_barrier_depends() has been de-emphasized, the less said about it, the better. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: Florian Westphal <fw@strlen.de> Cc: <netfilter-devel@vger.kernel.org> Cc: <coreteam@netfilter.org>
2017-11-07Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar1-0/+1
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-25locking/atomics, net/netlink/netfilter: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE()Mark Rutland1-1/+1
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't currently harmful. However, for some features it is necessary to instrument reads and writes separately, which is not possible with ACCESS_ONCE(). This distinction is critical to correct operation. It's possible to transform the bulk of kernel code using the Coccinelle script below. However, this doesn't handle comments, leaving references to ACCESS_ONCE() instances which have been removed. As a preparatory step, this patch converts netlink and netfilter code and comments to use {READ,WRITE}_ONCE() consistently. ---- virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Florian Westphal <fw@strlen.de> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-7-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-19netfilter: nfnetlink: extended ACK reportingPablo Neira Ayuso1-4/+6
Pass down struct netlink_ext_ack as parameter to all of our nfnetlink subsystem callbacks, so we can work on follow up patches to provide finer grain error reporting using the new infrastructure that 2d4bc93368f5 ("netlink: extended ACK reporting") provides. No functional change, just pass down this new object to callbacks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-07netfilter: Add nfnl_msg_type() helper functionPablo Neira Ayuso1-0/+5
Add and use nfnl_msg_type() function to replace opencoded nfnetlink message type. I suggested this change, Arushi Singhal made an initial patch to address this but was missing several spots. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-12netfilter: nfnetlink: allow to check for generation IDPablo Neira Ayuso1-0/+1
This patch allows userspace to specify the generation ID that has been used to build an incremental batch update. If userspace specifies the generation ID in the batch message as attribute, then nfnetlink compares it to the current generation ID so you make sure that you work against the right baseline. Otherwise, bail out with ERESTART so userspace knows that its changeset is stale and needs to respin. Userspace can do this transparently at the cost of taking slightly more time to refresh caches and rework the changeset. This check is optional, if there is no NFNL_BATCH_GENID attribute in the batch begin message, then no check is performed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-02-18nfnetlink: remove nfnetlink_alloc_skbFlorian Westphal1-2/+0
Following mmapped netlink removal this code can be simplified by removing the alloc wrapper. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-28netfilter: nfnetlink: pass down netns pointer to commit() and abort() callbacksPablo Neira Ayuso1-2/+2
Adapt callsites to avoid recurrent lookup of the netns pointer. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-28netfilter: nfnetlink: pass down netns pointer to call() and call_rcu()Pablo Neira Ayuso1-4/+4
Adapt callsites to avoid recurrent lookup of the netns pointer. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-10netfilter: nfnetlink: avoid recurrent netns lookups in call_batchPablo Neira Ayuso1-1/+1
Pass the net pointer to the call_batch callback functions so we can skip recurrent lookups. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
2015-10-09net/nfnetlink: lockdep_nfnl_is_held can be booleanYaowei Bai1-3/+3
This patch makes lockdep_nfnl_is_held return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25netfilter: nfnetlink: add rcu_dereference_protected() helpersPatrick McHardy1-0/+21
Add a lockdep_nfnl_is_held() function and a nfnl_dereference() macro for RCU dereferences protected by a NFNL subsystem mutex. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14netfilter: nfnetlink: add batch support and use it from nf_tablesPablo Neira Ayuso1-0/+5
This patch adds a batch support to nfnetlink. Basically, it adds two new control messages: * NFNL_MSG_BATCH_BEGIN, that indicates the beginning of a batch, the nfgenmsg->res_id indicates the nfnetlink subsystem ID. * NFNL_MSG_BATCH_END, that results in the invocation of the ss->commit callback function. If not specified or an error ocurred in the batch, the ss->abort function is invoked instead. The end message represents the commit operation in nftables, the lack of end message results in an abort. This patch also adds the .call_batch function that is only called from the batch receival path. This patch adds atomic rule updates and dumps based on bitmask generations. This allows to atomically commit a set of rule-set updates incrementally without altering the internal state of existing nf_tables expressions/matches/targets. The idea consists of using a generation cursor of 1 bit and a bitmask of 2 bits per rule. Assuming the gencursor is 0, then the genmask (expressed as a bitmask) can be interpreted as: 00 active in the present, will be active in the next generation. 01 inactive in the present, will be active in the next generation. 10 active in the present, will be deleted in the next generation. ^ gencursor Once you invoke the transition to the next generation, the global gencursor is updated: 00 active in the present, will be active in the next generation. 01 active in the present, needs to zero its future, it becomes 00. 10 inactive in the present, delete now. ^ gencursor If a dump is in progress and nf_tables enters a new generation, the dump will stop and return -EBUSY to let userspace know that it has to retry again. In order to invalidate dumps, a global genctr counter is increased everytime nf_tables enters a new generation. This new operation can be used from the user-space utility that controls the firewall, eg. nft -f restore The rule updates contained in `file' will be applied atomically. cat file ----- add filter INPUT ip saddr 1.1.1.1 counter accept #1 del filter INPUT ip daddr 2.2.2.2 counter drop #2 -EOF- Note that the rule 1 will be inactive until the transition to the next generation, the rule 2 will be evicted in the next generation. There is a penalty during the rule update due to the branch misprediction in the packet matching framework. But that should be quickly resolved once the iteration over the commit list that contain rules that require updates is finished. Event notification happens once the rule-set update has been committed. So we skip notifications is case the rule-set update is aborted, which can happen in case that the rule-set is tested to apply correctly. This patch squashed the following patches from Pablo: * nf_tables: atomic rule updates and dumps * nf_tables: get rid of per rule list_head for commits * nf_tables: use per netns commit list * nfnetlink: add batch support and use it from nf_tables * nf_tables: all rule updates are transactional * nf_tables: attach replacement rule after stale one * nf_tables: do not allow deletion/replacement of stale rules * nf_tables: remove unused NFTA_RULE_FLAGS Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-09-26netfilter: Remove extern from function prototypesJoe Perches1-14/+14
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Signed-off-by: Joe Perches <joe@perches.com>
2013-04-19nfnetlink: add support for memory mapped netlinkPatrick McHardy1-0/+2
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netfilter: rename netlink related "pid" variables to "portid"Patrick McHardy1-4/+5
Get rid of the confusing mix of pid and portid and use portid consistently for all netlink related socket identities. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05netfilter: nfnetlink: add mutex per subsystemPablo Neira Ayuso1-2/+2
This patch replaces the global lock to one lock per subsystem. The per-subsystem lock avoids that processes operating with different subsystems are synchronized. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-10-09UAPI: (Scripted) Disintegrate include/linux/netfilterDavid Howells1-54/+1
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
2012-06-16netfilter: add user-space connection tracking helper infrastructurePablo Neira Ayuso1-1/+2
There are good reasons to supports helpers in user-space instead: * Rapid connection tracking helper development, as developing code in user-space is usually faster. * Reliability: A buggy helper does not crash the kernel. Moreover, we can monitor the helper process and restart it in case of problems. * Security: Avoid complex string matching and mangling in kernel-space running in privileged mode. Going further, we can even think about running user-space helpers as a non-root process. * Extensibility: It allows the development of very specific helpers (most likely non-standard proprietary protocols) that are very likely not to be accepted for mainline inclusion in the form of kernel-space connection tracking helpers. This patch adds the infrastructure to allow the implementation of user-space conntrack helpers by means of the new nfnetlink subsystem `nfnetlink_cthelper' and the existing queueing infrastructure (nfnetlink_queue). I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into two pieces. This change is required not to break NAT sequence adjustment and conntrack confirmation for traffic that is enqueued to our user-space conntrack helpers. Basic operation, in a few steps: 1) Register user-space helper by means of `nfct': nfct helper add ftp inet tcp [ It must be a valid existing helper supported by conntrack-tools ] 2) Add rules to enable the FTP user-space helper which is used to track traffic going to TCP port 21. For locally generated packets: iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp For non-locally generated packets: iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp 3) Run the test conntrackd in helper mode (see example files under doc/helper/conntrackd.conf conntrackd 4) Generate FTP traffic going, if everything is OK, then conntrackd should create expectations (you can check that with `conntrack': conntrack -E expect [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp This confirms that our test helper is receiving packets including the conntrack information, and adding expectations in kernel-space. The user-space helper can also store its private tracking information in the conntrack structure in the kernel via the CTA_HELP_INFO. The kernel will consider this a binary blob whose layout is unknown. This information will be included in the information that is transfered to user-space via glue code that integrates nfnetlink_queue and ctnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-15net: cleanup unsigned to unsigned intEric Dumazet1-1/+1
Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-07netfilter: add cttimeout infrastructure for fine timeout tuningPablo Neira Ayuso1-1/+2
This patch adds the infrastructure to add fine timeout tuning over nfnetlink. Now you can use the NFNL_SUBSYS_CTNETLINK_TIMEOUT subsystem to create/delete/dump timeout objects that contain some specific timeout policy for one flow. The follow up patches will allow you attach timeout policy object to conntrack via the CT target and the conntrack extension infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-25netfilter: add extended accounting infrastructure over nfnetlinkPablo Neira Ayuso1-1/+2
We currently have two ways to account traffic in netfilter: - iptables chain and rule counters: # iptables -L -n -v Chain INPUT (policy DROP 3 packets, 867 bytes) pkts bytes target prot opt in out source destination 8 1104 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0 - use flow-based accounting provided by ctnetlink: # conntrack -L tcp 6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1 While trying to display real-time accounting statistics, we require to pool the kernel periodically to obtain this information. This is OK if the number of flows is relatively low. However, in case that the number of flows is huge, we can spend a considerable amount of cycles to iterate over the list of flows that have been obtained. Moreover, if we want to obtain the sum of the flow accounting results that match some criteria, we have to iterate over the whole list of existing flows, look for matchings and update the counters. This patch adds the extended accounting infrastructure for nfnetlink which aims to allow displaying real-time traffic accounting without the need of complicated and resource-consuming implementation in user-space. Basically, this new infrastructure allows you to create accounting objects. One accounting object is composed of packet and byte counters. In order to manipulate create accounting objects, you require the new libnetfilter_acct library. It contains several examples of use: libnetfilter_acct/examples# ./nfacct-add http-traffic libnetfilter_acct/examples# ./nfacct-get http-traffic = { pkts = 000000000000, bytes = 000000000000 }; Then, you can use one of this accounting objects in several iptables rules using the new nfacct match (which comes in a follow-up patch): # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic The idea is simple: if one packet matches the rule, the nfacct match updates the counters. Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and providing feedback for this contribution. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-07-18netfilter: nfnetlink: add RCU in nfnetlink_rcv_msg()Eric Dumazet1-0/+3
Goal of this patch is to permit nfnetlink providers not mandate nfnl_mutex being held while nfnetlink_rcv_msg() calls them. If struct nfnl_callback contains a non NULL call_rcu(), then nfnetlink_rcv_msg() will use it instead of call() field, holding rcu_read_lock instead of nfnl_mutex Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Florian Westphal <fw@strlen.de> CC: Eric Leblond <eric@regit.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macrosJozsef Kadlecsik1-1/+2
The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the vanilla kernel. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-20netfilter: ctnetlink: fix reliable event delivery if message building failsPablo Neira Ayuso1-1/+1
This patch fixes a bug that allows to lose events when reliable event delivery mode is used, ie. if NETLINK_BROADCAST_SEND_ERROR and NETLINK_RECV_NO_ENOBUFS socket options are set. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-13netfilter: nfnetlink: netns supportAlexey Dobriyan1-4/+4
Make nfnl socket per-petns. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-04net: cleanup include/linuxEric Dumazet1-4/+2
This cleanup patch puts struct/union/enum opening braces, in first line to ease grep games. struct something { becomes : struct something { Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-25netfilter: nfnetlink: constify message attributes and headersPatrick McHardy1-1/+2
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-08netfilter: passive OS fingerprint xtables matchEvgeniy Polyakov1-1/+2
Passive OS fingerprinting netfilter module allows to passively detect remote OS and perform various netfilter actions based on that knowledge. This module compares some data (WS, MSS, options and it's order, ttl, df and others) from packets with SYN bit set with dynamically loaded OS fingerprints. Fingerprint matching rules can be downloaded from OpenBSD source tree or found in archive and loaded via netfilter netlink subsystem into the kernel via special util found in archive. Archive contains library file (also attached), which was shipped with iptables extensions some time ago (at least when ipt_osf existed in patch-o-matic). Following changes were made in this release: * added NLM_F_CREATE/NLM_F_EXCL checks * dropped _rcu list traversing helpers in the protected add/remove calls * dropped unneded structures, debug prints, obscure comment and check Fingerprints can be downloaded from http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os or can be found in archive Example usage: -d switch removes fingerprints Please consider for inclusion. Thank you. Passive OS fingerprint homepage (archives, examples): http://www.ioremap.net/projects/osf Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-03netfilter: conntrack: replace notify chain by function pointerPablo Neira Ayuso1-1/+1
This patch removes the notify chain infrastructure and replace it by a simple function pointer. This issue has been mentioned in the mailing list several times: the use of the notify chain adds too much overhead for something that is only used by ctnetlink. This patch also changes nfnetlink_send(). It seems that gfp_any() returns GFP_KERNEL for user-context request, like those via ctnetlink, inside the RCU read-side section which is not valid. Using GFP_KERNEL is also evil since netlink may schedule(), this leads to "scheduling while atomic" bug reports. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2009-03-26Merge branch 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds1-2/+2
* 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits) x86: headers cleanup - setup.h emu101k1.h: fix duplicate include of <linux/types.h> compiler-gcc4: conditionalize #error on __KERNEL__ remove __KERNEL_STRICT_NAMES make netfilter use strict integer types make drm headers use strict integer types make MTD headers use strict integer types make most exported headers use strict integer types make exported headers use strict posix types unconditionally include asm/types.h from linux/types.h make linux/types.h as assembly safe Neither asm/types.h nor linux/types.h is required for arch/ia64/include/asm/fpu.h headers_check fix cleanup: linux/reiserfs_fs.h headers_check fix cleanup: linux/nubus.h headers_check fix cleanup: linux/coda_psdev.h headers_check fix: x86, setup.h headers_check fix: x86, prctl.h headers_check fix: linux/reinserfs_fs.h headers_check fix: linux/socket.h headers_check fix: linux/nubus.h ... Manually fix trivial conflicts in: include/linux/netfilter/xt_limit.h include/linux/netfilter/xt_statistic.h
2009-03-26make netfilter use strict integer typesArnd Bergmann1-2/+2
Netfilter traditionally uses BSD integer types in its interface headers. This changes it to use the Linux strict integer types, like everyone else. Cc: netfilter-devel@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-23nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlinkPablo Neira Ayuso1-0/+1
This patch adds nfnetlink_set_err() to propagate the error to netlink broadcast listener in case of memory allocation errors in the message building. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-14netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_natPablo Neira Ayuso1-0/+3
This patch removes the module dependency between ctnetlink and nf_nat by means of an indirect call that is initialized when nf_nat is loaded. Now, nf_conntrack_netlink only requires nf_conntrack and nfnetlink. This patch puts nfnetlink_parse_nat_setup_hook into the nf_conntrack_core to avoid dependencies between ctnetlink, nf_conntrack_ipv4 and nf_conntrack_ipv6. This patch also introduces the function ctnetlink_change_nat that is only invoked from the creation path. Actually, the nat handling cannot be invoked from the update path since this is not allowed. By introducing this function, we remove the useless nat handling in the update path and we avoid deadlock-prone code. This patch also adds the required EAGAIN logic for nfnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NETFILTER]: nfnetlink: kill nlattr_bad_sizePatrick McHardy1-13/+0
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NETFILTER]: nfnetlink: support attribute policiesPatrick McHardy1-1/+2
Add support for automatic checking of per-callback attribute policies. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NETFILTER]: nfnetlink: rename functions containing 'nfattr'Patrick McHardy1-1/+1
There is no struct nfattr anymore, rename functions to 'nlattr'. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NETFILTER]: nfnetlink: convert to generic netlink attribute functionsPatrick McHardy1-72/+6
Get rid of the duplicated rtnetlink macros and use the generic netlink attribute functions. The old duplicated stuff is moved to a new header file that exists just for userspace. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NETFILTER]: nfnetlink: make subsystem and callbacks constPatrick McHardy1-5/+5
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NETLINK]: Remove error pointer from netlink message handlerThomas Graf1-1/+1
The error pointer argument in netlink message handlers is used to signal the special case where processing has to be interrupted because a dump was started but no error happened. Instead it is simpler and more clear to return -EINTR and have netlink_run_queue() deal with getting the queue right. nfnetlink passed on this error pointer to its subsystem handlers but only uses it to signal the start of a netlink dump. Therefore it can be removed there as well. This patch also cleans up the error handling in the affected message handlers to be consistent since it had to be touched anyway. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Convert skb->tail to sk_buff_data_tArnaldo Carvalho de Melo1-2/+2
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes on 64bit architectures, allowing us to combine the 4 bytes hole left by the layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4 64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN... :-) Many calculations that previously required that skb->{transport,network, mac}_header be first converted to a pointer now can be done directly, being meaningful as offsets or pointers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NETFILTER]: nfnetlink: use mutex instead of semaphorePatrick McHardy1-13/+0
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NETFILTER]: trivial annotationsAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NETFILTER]: nfnetlink: remove unnecessary packed attributesPatrick McHardy1-2/+2
Remove unnecessary packed attributes in nfnetlink structures. Unfortunately in a few cases they have to stay to avoid changing structure sizes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20[NETFILTER]: ctnetlink: avoid unneccessary event message generationPatrick McHardy1-0/+1
Avoid unneccessary event message generation by checking for netlink listeners before building a message. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>