aboutsummaryrefslogtreecommitdiffstats
path: root/net/netfilter/ipvs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-10-01netfilter: drop bridge nf reset from nf_resetFlorian Westphal1-1/+1
commit 174e23810cd31 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-26net: Fix Kconfig indentationKrzysztof Kozlowski1-3/+3
Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextJakub Kicinski4-38/+39
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next: 1) Rename mss field to mss_option field in synproxy, from Fernando Mancera. 2) Use SYSCTL_{ZERO,ONE} definitions in conntrack, from Matteo Croce. 3) More strict validation of IPVS sysctl values, from Junwei Hu. 4) Remove unnecessary spaces after on the right hand side of assignments, from yangxingwu. 5) Add offload support for bitwise operation. 6) Extend the nft_offload_reg structure to store immediate date. 7) Collapse several ip_set header files into ip_set.h, from Jeremy Sowden. 8) Make netfilter headers compile with CONFIG_KERNEL_HEADER_TEST=y, from Jeremy Sowden. 9) Fix several sparse warnings due to missing prototypes, from Valdis Kletnieks. 10) Use static lock initialiser to ensure connlabel spinlock is initialized on boot time to fix sched/act_ct.c, patch from Florian Westphal. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-08-13netfilter: remove unnecessary spacesyangxingwu3-4/+4
This patch removes extra spaces. Signed-off-by: yangxingwu <xingwu.yang@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-08net: delete "register" keywordAlexey Dobriyan2-4/+4
Delete long obsoleted "register" keyword. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-03ipvs: Improve robustness to the ipvs sysctlJunwei Hu1-34/+35
The ipvs module parse the user buffer and save it to sysctl, then check if the value is valid. invalid value occurs over a period of time. Here, I add a variable, struct ctl_table tmp, used to read the value from the user buffer, and save only when it is valid. I delete proc_do_sync_mode and use extra1/2 in table for the proc_dointvec_minmax call. Fixes: f73181c8288f ("ipvs: add support for sync threads") Signed-off-by: Junwei Hu <hujunwei4@huawei.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-1/+1
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix a deadlock when module is requested via netlink_bind() in nfnetlink, from Florian Westphal. 2) Fix ipt_rpfilter and ip6t_rpfilter with VRF, from Miaohe Lin. 3) Skip master comparison in SIP helper to fix expectation clash under two valid scenarios, from xiao ruizhu. 4) Remove obsolete comments in nf_conntrack codebase, from Yonatan Goldschmidt. 5) Fix redirect extension module autoload, from Christian Hesse. 6) Fix incorrect mssg option sent to client in synproxy, from Fernando Fernandez. 7) Fix incorrect window calculations in TCP conntrack, from Florian Westphal. 8) Don't bail out when updating basechain policy due to recent offload works, also from Florian. 9) Allow symhash to use modulus 1 as other hash extensions do, from Laura.Garcia. 10) Missing NAT chain module autoload for the inet family, from Phil Sutter. 11) Fix missing adjustment of TCP RST packet in synproxy, from Fernando Fernandez. 12) Skip EAGAIN path when nft_meta_bridge is built-in or not selected. 13) Conntrack bridge does not depend on nf_tables_bridge. 14) Turn NF_TABLES_BRIDGE into tristate to fix possible link break of nft_meta_bridge, from Arnd Bergmann. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-18proc/sysctl: add shared variables for range checkMatteo Croce1-2/+1
In the sysctl code the proc_dointvec_minmax() function is often used to validate the user supplied value between an allowed range. This function uses the extra1 and extra2 members from struct ctl_table as minimum and maximum allowed value. On sysctl handler declaration, in every source file there are some readonly variables containing just an integer which address is assigned to the extra1 and extra2 members, so the sysctl range is enforced. The special values 0, 1 and INT_MAX are very often used as range boundary, leading duplication of variables like zero=0, one=1, int_max=INT_MAX in different source files: $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l 248 Add a const int array containing the most commonly used values, some macros to refer more easily to the correct array member, and use them instead of creating a local one for every object file. This is the bloat-o-meter output comparing the old and new binary compiled with the default Fedora config: # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164) Data old new delta sysctl_vals - 12 +12 __kstrtab_sysctl_vals - 12 +12 max 14 10 -4 int_max 16 - -16 one 68 - -68 zero 128 28 -100 Total: Before=20583249, After=20583085, chg -0.00% [mcroce@redhat.com: tipc: remove two unused variables] Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com [akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c] [arnd@arndb.de: proc/sysctl: make firmware loader table conditional] Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de [akpm@linux-foundation.org: fix fs/eventpoll.c] Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16netfilter: nf_conntrack_sip: fix expectation clashxiao ruizhu1-1/+1
When conntracks change during a dialog, SDP messages may be sent from different conntracks to establish expects with identical tuples. In this case expects conflict may be detected for the 2nd SDP message and end up with a process failure. The fixing here is to reuse an existing expect who has the same tuple for a different conntrack if any. Here are two scenarios for the case. 1) SERVER CPE | INVITE SDP | 5060 |<----------------------|5060 | 100 Trying | 5060 |---------------------->|5060 | 183 SDP | 5060 |---------------------->|5060 ===> Conntrack 1 | PRACK | 50601 |<----------------------|5060 | 200 OK (PRACK) | 50601 |---------------------->|5060 | 200 OK (INVITE) | 5060 |---------------------->|5060 | ACK | 50601 |<----------------------|5060 | | |<--- RTP stream ------>| | | | INVITE SDP (t38) | 50601 |---------------------->|5060 ===> Conntrack 2 With a certain configuration in the CPE, SIP messages "183 with SDP" and "re-INVITE with SDP t38" will go through the sip helper to create expects for RTP and RTCP. It is okay to create RTP and RTCP expects for "183", whose master connection source port is 5060, and destination port is 5060. In the "183" message, port in Contact header changes to 50601 (from the original 5060). So the following requests e.g. PRACK and ACK are sent to port 50601. It is a different conntrack (let call Conntrack 2) from the original INVITE (let call Conntrack 1) due to the port difference. In this example, after the call is established, there is RTP stream but no RTCP stream for Conntrack 1, so the RTP expect created upon "183" is cleared, and RTCP expect created for Conntrack 1 retains. When "re-INVITE with SDP t38" arrives to create RTP&RTCP expects, current ALG implementation will call nf_ct_expect_related() for RTP and RTCP. The expects tuples are identical to those for Conntrack 1. RTP expect for Conntrack 2 succeeds in creation as the one for Conntrack 1 has been removed. RTCP expect for Conntrack 2 fails in creation because it has idential tuples and 'conflict' with the one retained for Conntrack 1. And then result in a failure in processing of the re-INVITE. 2) SERVER A CPE | REGISTER | 5060 |<------------------| 5060 ==> CT1 | 200 | 5060 |------------------>| 5060 | | | INVITE SDP(1) | 5060 |<------------------| 5060 | 300(multi choice) | 5060 |------------------>| 5060 SERVER B | ACK | 5060 |<------------------| 5060 | INVITE SDP(2) | 5060 |-------------------->| 5060 ==> CT2 | 100 | 5060 |<--------------------| 5060 | 200(contact changes)| 5060 |<--------------------| 5060 | ACK | 5060 |-------------------->| 50601 ==> CT3 | | |<--- RTP stream ---->| | | | BYE | 5060 |<--------------------| 50601 | 200 | 5060 |-------------------->| 50601 | INVITE SDP(3) | 5060 |<------------------| 5060 ==> CT1 CPE sends an INVITE request(1) to Server A, and creates a RTP&RTCP expect pair for this Conntrack 1 (CT1). Server A responds 300 to redirect to Server B. The RTP&RTCP expect pairs created on CT1 are removed upon 300 response. CPE sends the INVITE request(2) to Server B, and creates an expect pair for the new conntrack (due to destination address difference), let call CT2. Server B changes the port to 50601 in 200 OK response, and the following requests ACK and BYE from CPE are sent to 50601. The call is established. There is RTP stream and no RTCP stream. So RTP expect is removed and RTCP expect for CT2 retains. As BYE request is sent from port 50601, it is another conntrack, let call CT3, different from CT2 due to the port difference. So the BYE request will not remove the RTCP expect for CT2. Then another outgoing call is made, with the same RTP port being used (not definitely but possibly). CPE firstly sends the INVITE request(3) to Server A, and tries to create a RTP&RTCP expect pairs for this CT1. In current ALG implementation, the RTCP expect for CT1 fails in creation because it 'conflicts' with the residual one for CT2. As a result the INVITE request fails to send. Signed-off-by: xiao ruizhu <katrina.xiaorz@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-73/+86
Two cases of overlapping changes, nothing fancy. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05ipvs: strip gre tunnel headers from icmp errorsJulian Anastasov1-4/+42
Recognize GRE tunnels in received ICMP errors and properly strip the tunnel headers. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-04ipvs: allow tunneling with gre encapsulationVadim Fedorenko2-3/+64
windows real servers can handle gre tunnels, this patch allows gre encapsulation with the tunneling method, thereby letting ipvs be load balancer for windows-based services Signed-off-by: Vadim Fedorenko <vfedorenko@yandex-team.ru> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller3-73/+86
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter fixes for net: 1) Fix memleak reported by syzkaller when registering IPVS hooks, patch from Julian Anastasov. 2) Fix memory leak in start_sync_thread, also from Julian. 3) Fix conntrack deletion via ctnetlink, from Felix Kaechele. 4) Fix reject for ICMP due to incorrect checksum handling, from He Zhe. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-25ipvs: fix tinfo memory leak in start_sync_threadJulian Anastasov2-66/+72
syzkaller reports for memory leak in start_sync_thread [1] As Eric points out, kthread may start and stop before the threadfn function is called, so there is no chance the data (tinfo in our case) to be released in thread. Fix this by releasing tinfo in the controlling code instead. [1] BUG: memory leak unreferenced object 0xffff8881206bf700 (size 32): comm "syz-executor761", pid 7268, jiffies 4294943441 (age 20.470s) hex dump (first 32 bytes): 00 40 7c 09 81 88 ff ff 80 45 b8 21 81 88 ff ff .@|......E.!.... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000057619e23>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000057619e23>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000057619e23>] slab_alloc mm/slab.c:3326 [inline] [<0000000057619e23>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<0000000086ce5479>] kmalloc include/linux/slab.h:547 [inline] [<0000000086ce5479>] start_sync_thread+0x5d2/0xe10 net/netfilter/ipvs/ip_vs_sync.c:1862 [<000000001a9229cc>] do_ip_vs_set_ctl+0x4c5/0x780 net/netfilter/ipvs/ip_vs_ctl.c:2402 [<00000000ece457c8>] nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] [<00000000ece457c8>] nf_setsockopt+0x4c/0x80 net/netfilter/nf_sockopt.c:115 [<00000000942f62d4>] ip_setsockopt net/ipv4/ip_sockglue.c:1258 [inline] [<00000000942f62d4>] ip_setsockopt+0x9b/0xb0 net/ipv4/ip_sockglue.c:1238 [<00000000a56a8ffd>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<00000000fa895401>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130 [<0000000095eef4cf>] __sys_setsockopt+0x98/0x120 net/socket.c:2078 [<000000009747cf88>] __do_sys_setsockopt net/socket.c:2089 [inline] [<000000009747cf88>] __se_sys_setsockopt net/socket.c:2086 [inline] [<000000009747cf88>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<00000000ded8ba80>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<00000000893b4ac8>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: syzbot+7e2e50c8adfccd2e5041@syzkaller.appspotmail.com Suggested-by: Eric Biggers <ebiggers@kernel.org> Fixes: 998e7a76804b ("ipvs: Use kthread_run() instead of doing a double-fork via kernel_thread()") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-5/+1
Minor SPDX change conflict. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-21ipvs: defer hook registration to avoid leaksJulian Anastasov1-7/+14
syzkaller reports for memory leak when registering hooks [1] As we moved the nf_unregister_net_hooks() call into __ip_vs_dev_cleanup(), defer the nf_register_net_hooks() call, so that hooks are allocated and freed from same pernet_operations (ipvs_core_dev_ops). [1] BUG: memory leak unreferenced object 0xffff88810acd8a80 (size 96): comm "syz-executor073", pid 7254, jiffies 4294950560 (age 22.250s) hex dump (first 32 bytes): 02 00 00 00 00 00 00 00 50 8b bb 82 ff ff ff ff ........P....... 00 00 00 00 00 00 00 00 00 77 bb 82 ff ff ff ff .........w...... backtrace: [<0000000013db61f1>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000013db61f1>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000013db61f1>] slab_alloc_node mm/slab.c:3269 [inline] [<0000000013db61f1>] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597 [<000000001a27307d>] __do_kmalloc_node mm/slab.c:3619 [inline] [<000000001a27307d>] __kmalloc_node+0x38/0x50 mm/slab.c:3627 [<0000000025054add>] kmalloc_node include/linux/slab.h:590 [inline] [<0000000025054add>] kvmalloc_node+0x4a/0xd0 mm/util.c:431 [<0000000050d1bc00>] kvmalloc include/linux/mm.h:637 [inline] [<0000000050d1bc00>] kvzalloc include/linux/mm.h:645 [inline] [<0000000050d1bc00>] allocate_hook_entries_size+0x3b/0x60 net/netfilter/core.c:61 [<00000000e8abe142>] nf_hook_entries_grow+0xae/0x270 net/netfilter/core.c:128 [<000000004b94797c>] __nf_register_net_hook+0x9a/0x170 net/netfilter/core.c:337 [<00000000d1545cbc>] nf_register_net_hook+0x34/0xc0 net/netfilter/core.c:464 [<00000000876c9b55>] nf_register_net_hooks+0x53/0xc0 net/netfilter/core.c:480 [<000000002ea868e0>] __ip_vs_init+0xe8/0x170 net/netfilter/ipvs/ip_vs_core.c:2280 [<000000002eb2d451>] ops_init+0x4c/0x140 net/core/net_namespace.c:130 [<000000000284ec48>] setup_net+0xde/0x230 net/core/net_namespace.c:316 [<00000000a70600fa>] copy_net_ns+0xf0/0x1e0 net/core/net_namespace.c:439 [<00000000ff26c15e>] create_new_namespaces+0x141/0x2a0 kernel/nsproxy.c:107 [<00000000b103dc79>] copy_namespaces+0xa1/0xe0 kernel/nsproxy.c:165 [<000000007cc008a2>] copy_process.part.0+0x11fd/0x2150 kernel/fork.c:2035 [<00000000c344af7c>] copy_process kernel/fork.c:1800 [inline] [<00000000c344af7c>] _do_fork+0x121/0x4f0 kernel/fork.c:2369 Reported-by: syzbot+722da59ccb264bc19910@syzkaller.appspotmail.com Fixes: 719c7d563c17 ("ipvs: Fix use-after-free in ip_vs_in") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner1-5/+1
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller23-136/+23
Some ISDN files that got removed in net-next had some changes done in mainline, take the removals. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller8-43/+287
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset container Netfilter/IPVS update for net-next: 1) Add UDP tunnel support for ICMP errors in IPVS. Julian Anastasov says: This patchset is a followup to the commit that adds UDP/GUE tunnel: "ipvs: allow tunneling with gue encapsulation". What we do is to put tunnel real servers in hash table (patch 1), add function to lookup tunnels (patch 2) and use it to strip the embedded tunnel headers from ICMP errors (patch 3). 2) Extend xt_owner to match for supplementary groups, from Lukasz Pawelczyk. 3) Remove unused oif field in flow_offload_tuple object, from Taehee Yoo. 4) Release basechain counters from workqueue to skip synchronize_rcu() call. From Florian Westphal. 5) Replace skb_make_writable() by skb_ensure_writable(). Patchset from Florian Westphal. 6) Checksum support for gue encapsulation in IPVS, from Jacky Hu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-31ipvs: add checksum support for gue encapsulationJacky Hu2-17/+137
Add checksum support for gue encapsulation with the tun_flags parameter, which could be one of the values below: IP_VS_TUNNEL_ENCAP_FLAG_NOCSUM IP_VS_TUNNEL_ENCAP_FLAG_CSUM IP_VS_TUNNEL_ENCAP_FLAG_REMCSUM Signed-off-by: Jacky Hu <hengqing.hu@gmail.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-31netfilter: ipvs: prefer skb_ensure_writableFlorian Westphal7-18/+18
It does the same thing, use it instead so we can remove skb_make_writable. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-31ipvs: strip udp tunnel headers from icmp errorsJulian Anastasov1-0/+60
Recognize UDP tunnels in received ICMP errors and properly strip the tunnel headers. GUE is what we have for now. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-31ipvs: add function to find tunnelsJulian Anastasov2-0/+37
Add ip_vs_find_tunnel() to match tunnel headers by family, address and optional port. Use it to properly find the tunnel real server used in received ICMP errors. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-31ipvs: allow rs_table to contain different real server typesJulian Anastasov1-8/+35
Before now rs_table was used only for NAT real servers. Change it to allow TUN real severs from different types, possibly hashed with different port key. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-31Merge tag 'spdx-5.2-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds23-136/+23
Pull yet more SPDX updates from Greg KH: "Here is another set of reviewed patches that adds SPDX tags to different kernel files, based on a set of rules that are being used to parse the comments to try to determine that the license of the file is "GPL-2.0-or-later" or "GPL-2.0-only". Only the "obvious" versions of these matches are included here, a number of "non-obvious" variants of text have been found but those have been postponed for later review and analysis. There is also a patch in here to add the proper SPDX header to a bunch of Kbuild files that we have missed in the past due to new files being added and forgetting that Kbuild uses two different file names for Makefiles. This issue was reported by the Kbuild maintainer. These patches have been out for review on the linux-spdx@vger mailing list, and while they were created by automatic tools, they were hand-verified by a bunch of different people, all whom names are on the patches are reviewers" * tag 'spdx-5.2-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (82 commits) treewide: Add SPDX license identifier - Kbuild treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 225 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 224 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 223 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 222 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 221 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 220 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 218 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 217 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 216 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 215 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 214 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 213 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 211 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 210 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 207 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 203 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 ...
2019-05-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-1/+1
Pull networking fixes from David Miller: 1) Fix OOPS during nf_tables rule dump, from Florian Westphal. 2) Use after free in ip_vs_in, from Yue Haibing. 3) Fix various kTLS bugs (NULL deref during device removal resync, netdev notification ignoring, etc.) From Jakub Kicinski. 4) Fix ipv6 redirects with VRF, from David Ahern. 5) Memory leak fix in igmpv3_del_delrec(), from Eric Dumazet. 6) Missing memory allocation failure check in ip6_ra_control(), from Gen Zhang. And likewise fix ip_ra_control(). 7) TX clean budget logic error in aquantia, from Igor Russkikh. 8) SKB leak in llc_build_and_send_ui_pkt(), from Eric Dumazet. 9) Double frees in mlx5, from Parav Pandit. 10) Fix lost MAC address in r8169 during PCI D3, from Heiner Kallweit. 11) Fix botched register access in mvpp2, from Antoine Tenart. 12) Use after free in napi_gro_frags(), from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (89 commits) net: correct zerocopy refcnt with udp MSG_MORE ethtool: Check for vlan etype or vlan tci when parsing flow_rule net: don't clear sock->sk early to avoid trouble in strparser net-gro: fix use-after-free read in napi_gro_frags() net: dsa: tag_8021q: Create a stable binary format net: dsa: tag_8021q: Change order of rx_vid setup net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value ipv4: tcp_input: fix stack out of bounds when parsing TCP options. mlxsw: spectrum: Prevent force of 56G mlxsw: spectrum_acl: Avoid warning after identical rules insertion net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT r8169: fix MAC address being lost in PCI D3 net: core: support XDP generic on stacked devices. netvsc: unshare skb in VF rx handler udp: Avoid post-GRO UDP checksum recalculation net: phy: dp83867: Set up RGMII TX delay net: phy: dp83867: do not call config_init twice net: phy: dp83867: increase SGMII autoneg timer duration net: phy: dp83867: fix speed 10 in sgmii mode net: phy: marvell10g: report if the PHY fails to boot firmware ...
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner23-136/+23
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 93Thomas Gleixner1-17/+1
Based on 1 normalized pattern(s): this code is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520075212.233647300@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-1/+1
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree: 1) Fix crash when dumping rules after conversion to RCU, from Florian Westphal. 2) Fix incorrect hook reinjection from nf_queue in case NF_REPEAT, from Jagdish Motwani. 3) Fix check for route existence in fib extension, from Phil Sutter. 4) Fix use after free in ip_vs_in() hook, from YueHaibing. 5) Check for veth existence from netfilter selftests, from Jeffrin Jose T. 6) Checksum corruption in UDP NAT helpers due to typo, from Florian Westphal. 7) Pass up packets to classic forwarding path regardless of IPv4 DF bit, patch for the flowtable infrastructure from Florian. 8) Set liberal TCP tracking for flows that are placed in the flowtable, in case they need to go back to classic forwarding path, also from Florian. 9) Don't add flow with sequence adjustment to flowtable, from Florian. 10) Skip IPv4 options from IPv6 datapath in flowtable, from Florian. 11) Add selftest for the flowtable infrastructure, from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-21ipvs: Fix use-after-free in ip_vs_inYueHaibing1-1/+1
BUG: KASAN: use-after-free in ip_vs_in.part.29+0xe8/0xd20 [ip_vs] Read of size 4 at addr ffff8881e9b26e2c by task sshd/5603 CPU: 0 PID: 5603 Comm: sshd Not tainted 4.19.39+ #30 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: dump_stack+0x71/0xab print_address_description+0x6a/0x270 kasan_report+0x179/0x2c0 ip_vs_in.part.29+0xe8/0xd20 [ip_vs] ip_vs_in+0xd8/0x170 [ip_vs] nf_hook_slow+0x5f/0xe0 __ip_local_out+0x1d5/0x250 ip_local_out+0x19/0x60 __tcp_transmit_skb+0xba1/0x14f0 tcp_write_xmit+0x41f/0x1ed0 ? _copy_from_iter_full+0xca/0x340 __tcp_push_pending_frames+0x52/0x140 tcp_sendmsg_locked+0x787/0x1600 ? tcp_sendpage+0x60/0x60 ? inet_sk_set_state+0xb0/0xb0 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x6d/0x80 sock_write_iter+0x121/0x1c0 ? sock_sendmsg+0x80/0x80 __vfs_write+0x23e/0x370 vfs_write+0xe7/0x230 ksys_write+0xa1/0x120 ? __ia32_sys_read+0x50/0x50 ? __audit_syscall_exit+0x3ce/0x450 do_syscall_64+0x73/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7ff6f6147c60 Code: 73 01 c3 48 8b 0d 28 12 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 5d 73 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 RSP: 002b:00007ffd772ead18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000034 RCX: 00007ff6f6147c60 RDX: 0000000000000034 RSI: 000055df30a31270 RDI: 0000000000000003 RBP: 000055df30a31270 R08: 0000000000000000 R09: 0000000000000000 R10: 00007ffd772ead70 R11: 0000000000000246 R12: 00007ffd772ead74 R13: 00007ffd772eae20 R14: 00007ffd772eae24 R15: 000055df2f12ddc0 Allocated by task 6052: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x10a/0x220 ops_init+0x97/0x190 register_pernet_operations+0x1ac/0x360 register_pernet_subsys+0x24/0x40 0xffffffffc0ea016d do_one_initcall+0x8b/0x253 do_init_module+0xe3/0x335 load_module+0x2fc0/0x3890 __do_sys_finit_module+0x192/0x1c0 do_syscall_64+0x73/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 6067: __kasan_slab_free+0x130/0x180 kfree+0x90/0x1a0 ops_free_list.part.7+0xa6/0xc0 unregister_pernet_operations+0x18b/0x1f0 unregister_pernet_subsys+0x1d/0x30 ip_vs_cleanup+0x1d/0xd2f [ip_vs] __x64_sys_delete_module+0x20c/0x300 do_syscall_64+0x73/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff8881e9b26600 which belongs to the cache kmalloc-4096 of size 4096 The buggy address is located 2092 bytes inside of 4096-byte region [ffff8881e9b26600, ffff8881e9b27600) The buggy address belongs to the page: page:ffffea0007a6c800 count:1 mapcount:0 mapping:ffff888107c0e600 index:0x0 compound_mapcount: 0 flags: 0x17ffffc0008100(slab|head) raw: 0017ffffc0008100 dead000000000100 dead000000000200 ffff888107c0e600 raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected while unregistering ipvs module, ops_free_list calls __ip_vs_cleanup, then nf_unregister_net_hooks be called to do remove nf hook entries. It need a RCU period to finish, however net->ipvs is set to NULL immediately, which will trigger NULL pointer dereference when a packet is hooked and handled by ip_vs_in where net->ipvs is dereferenced. Another scene is ops_free_list call ops_free to free the net_generic directly while __ip_vs_cleanup finished, then calling ip_vs_in will triggers use-after-free. This patch moves nf_unregister_net_hooks from __ip_vs_cleanup() to __ip_vs_dev_cleanup(), where rcu_barrier() is called by unregister_pernet_device -> unregister_pernet_operations, that will do the needed grace period. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: efe41606184e ("ipvs: convert to use pernet nf_hook api") Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-21treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner1-0/+1
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Add SPDX license identifier for more missed filesThomas Gleixner1-0/+1
Add SPDX license identifiers to all files which: - Have no license information of any form - Have MODULE_LICENCE("GPL*") inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Add SPDX license identifier for missed filesThomas Gleixner1-0/+1
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27genetlink: optionally validate strictly/dumpsJohannes Berg1-0/+16
Add options to strictly validate messages and dump messages, sometimes perhaps validating dump messages non-strictly may be required, so add an option for that as well. Since none of this can really be applied to existing commands, set the options everwhere using the following spatch: @@ identifier ops; expression X; @@ struct genl_ops ops[] = { ..., { .cmd = X, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, ... }, ... }; For new commands one should just not copy the .validate 'opt-out' flags and thus get strict validation. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27netlink: make validation more configurable for future strictnessJohannes Berg1-9/+4
We currently have two levels of strict validation: 1) liberal (default) - undefined (type >= max) & NLA_UNSPEC attributes accepted - attribute length >= expected accepted - garbage at end of message accepted 2) strict (opt-in) - NLA_UNSPEC attributes accepted - attribute length >= expected accepted Split out parsing strictness into four different options: * TRAILING - check that there's no trailing data after parsing attributes (in message or nested) * MAXTYPE - reject attrs > max known type * UNSPEC - reject attributes with NLA_UNSPEC policy entries * STRICT_ATTRS - strictly validate attribute size The default for future things should be *everything*. The current *_strict() is a combination of TRAILING and MAXTYPE, and is renamed to _deprecated_strict(). The current regular parsing has none of this, and is renamed to *_parse_deprecated(). Additionally it allows us to selectively set one of the new flags even on old policies. Notably, the UNSPEC flag could be useful in this case, since it can be arranged (by filling in the policy) to not be an incompatible userspace ABI change, but would then going forward prevent forgetting attribute entries. Similar can apply to the POLICY flag. We end up with the following renames: * nla_parse -> nla_parse_deprecated * nla_parse_strict -> nla_parse_deprecated_strict * nlmsg_parse -> nlmsg_parse_deprecated * nlmsg_parse_strict -> nlmsg_parse_deprecated_strict * nla_parse_nested -> nla_parse_nested_deprecated * nla_validate_nested -> nla_validate_nested_deprecated Using spatch, of course: @@ expression TB, MAX, HEAD, LEN, POL, EXT; @@ -nla_parse(TB, MAX, HEAD, LEN, POL, EXT) +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT) @@ expression NLH, HDRLEN, TB, MAX, POL, EXT; @@ -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT) +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT) @@ expression NLH, HDRLEN, TB, MAX, POL, EXT; @@ -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT) +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT) @@ expression TB, MAX, NLA, POL, EXT; @@ -nla_parse_nested(TB, MAX, NLA, POL, EXT) +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT) @@ expression START, MAX, POL, EXT; @@ -nla_validate_nested(START, MAX, POL, EXT) +nla_validate_nested_deprecated(START, MAX, POL, EXT) @@ expression NLH, HDRLEN, MAX, POL, EXT; @@ -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT) +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT) For this patch, don't actually add the strict, non-renamed versions yet so that it breaks compile if I get it wrong. Also, while at it, make nla_validate and nla_parse go down to a common __nla_validate_parse() function to avoid code duplication. Ultimately, this allows us to have very strict validation for every new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the next patch, while existing things will continue to work as is. In effect then, this adds fully strict validation for any new command. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27netlink: make nla_nest_start() add NLA_F_NESTED flagMichal Kubecek1-5/+5
Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most netlink based interfaces (including recently added ones) are still not setting it in kernel generated messages. Without the flag, message parsers not aware of attribute semantics (e.g. wireshark dissector or libmnl's mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display the structure of their contents. Unfortunately we cannot just add the flag everywhere as there may be userspace applications which check nlattr::nla_type directly rather than through a helper masking out the flags. Therefore the patch renames nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start() as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually are rewritten to use nla_nest_start(). Except for changes in include/net/netlink.h, the patch was generated using this semantic patch: @@ expression E1, E2; @@ -nla_nest_start(E1, E2) +nla_nest_start_noflag(E1, E2) @@ expression E1, E2; @@ -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED) +nla_nest_start(E1, E2) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Two easy cases of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-13ipvs: do not schedule icmp errors from tunnelsJulian Anastasov1-1/+1
We can receive ICMP errors from client or from tunneling real server. While the former can be scheduled to real server, the latter should not be scheduled, they are decapsulated only when existing connection is found. Fixes: 6044eeffafbe ("ipvs: attempt to schedule icmp packets") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-08ipvs: allow tunneling with gue encapsulationJacky Hu2-5/+114
ipip packets are blocked in some public cloud environments, this patch allows gue encapsulation with the tunneling method, which would make tunneling working in those environments. Signed-off-by: Jacky Hu <hengqing.hu@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-03-22genetlink: make policy common to familyJohannes Berg1-12/+1
Since maxattr is common, the policy can't really differ sanely, so make it common as well. The only user that did in fact manage to make a non-common policy is taskstats, which has to be really careful about it (since it's still using a common maxattr!). This is no longer supported, but we can fake it using pre_doit. This reduces the size of e.g. nl80211.o (which has lots of commands): text data bss dec hex filename 398745 14323 2240 415308 6564c net/wireless/nl80211.o (before) 397913 14331 2240 414484 65314 net/wireless/nl80211.o (after) -------------------------------- -832 +8 0 -824 Which is obviously just 8 bytes for each command, and an added 8 bytes for the new policy pointer. I'm not sure why the ops list is counted as .text though. Most of the code transformations were done using the following spatch: @ops@ identifier OPS; expression POLICY; @@ struct genl_ops OPS[] = { ..., { - .policy = POLICY, }, ... }; @@ identifier ops.OPS; expression ops.POLICY; identifier fam; expression M; @@ struct genl_family fam = { .ops = OPS, .maxattr = M, + .policy = POLICY, ... }; This also gets rid of devlink_nl_cmd_region_read_dumpit() accessing the cb->data as ops, which we want to change in a later genl patch. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01ipvs: get sctphdr by sctphoff in sctp_csum_checkXin Long1-5/+2
sctp_csum_check() is called by sctp_s/dnat_handler() where it calls skb_make_writable() to ensure sctphdr to be linearized. So there's no need to get sctphdr by calling skb_header_pointer() in sctp_csum_check(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-03-01ipvs: change some data types from int to boolAndrea Claudi5-18/+18
Change the data type of the following variables from int to bool across ipvs code: - found - loop - need_full_dest - need_full_svc - payload_csum Also change the following functions to use bool full_entry param instead of int: - ip_vs_genl_parse_dest() - ip_vs_genl_parse_service() This patch does not change any functionality but makes the source code slightly easier to read. Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+2
Two easily resolvable overlapping change conflicts, one in TCP and one in the eBPF verifier. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-4/+2
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for you net-next tree: 1) Missing NFTA_RULE_POSITION_ID netlink attribute validation, from Phil Sutter. 2) Restrict matching on tunnel metadata to rx/tx path, from wenxu. 3) Avoid indirect calls for IPV6=y, from Florian Westphal. 4) Add two indirections to prepare merger of IPV4 and IPV6 nat modules, from Florian Westphal. 5) Broken indentation in ctnetlink, from Colin Ian King. 6) Patches to use struct_size() from netfilter and IPVS, from Gustavo A. R. Silva. 7) Display kernel splat only once in case of racing to confirm conntrack from bridge plus nfqueue setups, from Chieh-Min Wang. 8) Skip checksum validation for layer 4 protocols that don't need it, patch from Alin Nastac. 9) Sparse warning due to symbol that should be static in CLUSTERIP, from Wei Yongjun. 10) Add new toggle to disable SDP payload translation when media endpoint is reachable though the same interface as the signalling peer, from Alin Nastac. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16ipvs: fix warning on unused variableAndrea Claudi1-1/+2
When CONFIG_IP_VS_IPV6 is not defined, build produced this warning: net/netfilter/ipvs/ip_vs_ctl.c:899:6: warning: unused variable ‘ret’ [-Wunused-variable] int ret = 0; ^~~ Fix this by moving the declaration of 'ret' in the CONFIG_IP_VS_IPV6 section in the same function. While at it, drop its unneeded initialisation. Fixes: 098e13f5b21d ("ipvs: fix dependency on nf_defrag_ipv6") Reported-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-6/+15
The netfilter conflicts were rather simple overlapping changes. However, the cls_tcindex.c stuff was a bit more complex. On the 'net' side, Cong is fixing several races and memory leaks. Whilst on the 'net-next' side we have Vlad adding the rtnl-ness support. What I've decided to do, in order to resolve this, is revert the conversion over to using a workqueue that Cong did, bringing us back to pure RCU. I did it this way because I believe that either Cong's races don't apply with have Vlad did things, or Cong will have to implement the race fix slightly differently. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12ipvs: fix dependency on nf_defrag_ipv6Andrea Claudi3-6/+15
ipvs relies on nf_defrag_ipv6 module to manage IPv6 fragmentation, but lacks proper Kconfig dependencies and does not explicitly request defrag features. As a result, if netfilter hooks are not loaded, when IPv6 fragmented packet are handled by ipvs only the first fragment makes through. Fix it properly declaring the dependency on Kconfig and registering netfilter hooks on ip_vs_add_service() and ip_vs_new_dest(). Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-12ipvs: Use struct_size() helperGustavo A. R. Silva1-4/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; size = sizeof(struct foo) + count * sizeof(struct boo); instance = alloc(size, GFP_KERNEL) Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: size = struct_size(instance, entry, count); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-01-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+12
2019-01-28ipvs: use indirect call wrappersMatteo Croce3-10/+45
Use the new indirect call wrappers in IPVS when calling the TCP or UDP protocol specific functions. This avoids an indirect calls in IPVS, and reduces the performance impact of the Spectre mitigation. Signed-off-by: Matteo Croce <mcroce@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>