aboutsummaryrefslogtreecommitdiffstats
path: root/net/netfilter (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller5-10/+45
Fun set of conflict resolutions here... For the mac80211 stuff, these were fortunately just parallel adds. Trivially resolved. In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the function phy_disable_interrupts() earlier in the file, whilst in 'net-next' the phy_error() call from this function was removed. In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the 'rt_table_id' member of rtable collided with a bug fix in 'net' that added a new struct member "rt_mtu_locked" which needs to be copied over here. The mlxsw driver conflict consisted of net-next separating the span code and definitions into separate files, whilst a 'net' bug fix made some changes to that moved code. The mlx5 infiniband conflict resolution was quite non-trivial, the RDMA tree's merge commit was used as a guide here, and here are their notes: ==================== Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17net: Convert ip_vs_ftp_opsKirill Tkhai1-0/+1
These pernet_operations register and unregister ipvs app. register_ip_vs_app(), unregister_ip_vs_app() and register_ip_vs_app_inc() modify per-net structures, and there are no global structures touched. So, this looks safe to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17net: Convert ipvs_core_dev_opsKirill Tkhai1-0/+1
Exit method stops two per-net threads and cancels delayed work. Everything looks nicely per-net divided. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17net: Convert ipvs_core_opsKirill Tkhai1-0/+1
These pernet_operations register and unregister nf hooks, /proc entries, sysctl, percpu statistics. There are several global lists, and the only list modified without exclusive locks is ip_vs_conn_tab in ip_vs_conn_flush(). We iterate the list and force the timers expire at the moment. Since there were possible several timer expirations before this patch, and since they are safe, the patch does not invent new parallelism of their destruction. These pernet_operations look safe to be converted. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-15net: drivers/net: Remove unnecessary skb_copy_expand OOM messagesJoe Perches1-4/+1
skb_copy_expand without __GFP_NOWARN already does a dump_stack on OOM so these messages are redundant. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11netfilter: nf_tables: release flowtable hooksPablo Neira Ayuso1-0/+1
Otherwise we leak this array. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-11netfilter: x_tables: add and use xt_check_proc_nameFlorian Westphal3-9/+43
recent and hashlimit both create /proc files, but only check that name is 0 terminated. This can trigger WARN() from procfs when name is "" or "/". Add helper for this and then use it for both. Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: <syzbot+0502b00edac2a0680b61@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-08net: Convert nfnl_queue_net_opsKirill Tkhai1-0/+1
These pernet_operations register and unregister net::nf::queue_handler and /proc entry. The handler is accessed only under RCU, so this looks safe to convert them. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_log_net_opsKirill Tkhai1-0/+1
These pernet_operations create and destroy /proc entries. Also, exit method unsets nfulnl_logger. The logger is not set by default, and it becomes bound via userspace request. So, they look safe to be made async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert cttimeout_opsKirill Tkhai1-0/+1
These pernet_operations also look closed in themself. Exit method touch only per-net structures, so it's safe to execute them for several net namespaces in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_acct_opsKirill Tkhai1-0/+1
These pernet_operations look closed in themself, and there are no other users of net::nfnl_acct_list outside. They are safe to be executed for several net namespaces in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnetlink_net_opsKirill Tkhai1-0/+1
These pernet_operations create and destroy net::nfnl socket of NETLINK_NETFILTER code. There are no other places, where such type the socket is created, except these pernet_operations. It seem other pernet_operations depending on CONFIG_NETFILTER_NETLINK send messages to this socket. So, we mark it async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nf_tables_net_opsKirill Tkhai1-0/+1
These pernet_operations looks nicely separated per-net. Exit method unregisters net's nf tables objects. We allow them be executed in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-06netfilter: nft_set_hash: skip fixed hash if timeout is specifiedPablo Neira Ayuso1-1/+1
Fixed hash supports to timeouts, so skip it. Otherwise, userspace hits EOPNOTSUPP. Fixes: 6c03ae210ce3 ("netfilter: nft_set_hash: add non-resizable hashtable implementation") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-3/+24
All of the conflicts were cases of overlapping changes. In net/core/devlink.c, we have to make care that the resouce size_params have become a struct member rather than a pointer to such an object. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05net: Convert proto_gre_net_opsKirill Tkhai1-0/+1
These pernet_operations register and unregister sysctl. nf_conntrack_l4proto_gre4->init_net is simple memory initializer. Also, exit method removes gre keymap_list, which is per-net. This looks safe to be executed in parallel with other pernet_operations. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05net: Convert ctnetlink_net_opsKirill Tkhai1-0/+1
These pernet_operations register and unregister two conntrack notifiers, and they seem to be safe to be executed in parallel. General/not related to async pernet_operations JFI: ctnetlink_net_exit_batch() actions are grouped in batch, and this could look like there is synchronize_rcu() is forgotten. But there is synchronize_rcu() on module exit patch (in ctnetlink_exit()), so this batch may be reworked as simple .exit method. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05net: Convert nf_conntrack_net_opsKirill Tkhai1-0/+1
These pernet_operations register and unregister sysctl and /proc entries. Exit batch method also waits till all per-net conntracks are dead. Thus, they are safe to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05net: Convert ip_set_net_opsKirill Tkhai1-1/+2
These pernet_operations initialize and destroy net_generic(net, ip_set_net_id)-related data. Since ip_set is under CONFIG_IP_SET, it's easy to watch drivers, which depend on this config. All of them are in net/netfilter/ipset directory, except of net/netfilter/xt_set.c. There are no more drivers, which use ip_set, and all of the above don't register another pernet_operations. Also, there are is no indirect users, as header file include/linux/netfilter/ipset/ip_set.h does not define indirect users by something like this: #ifdef CONFIG_IP_SET extern func(void); #else static inline func(void); #endif So, there are no more pernet operations, dereferencing net_generic(net, ip_set_net_id). ip_set_net_ops are OK to be executed in parallel for several net, so we mark them as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05net: Convert log pernet_operationsKirill Tkhai1-0/+1
These pernet_operations use nf_log_set() and nf_log_unset() in their methods: nf_log_bridge_net_ops nf_log_arp_net_ops nf_log_ipv4_net_ops nf_log_ipv6_net_ops nf_log_netdev_net_ops Nobody can send such a packet to a net before it's became registered, nobody can send a packet after all netdevices are unregistered. So, these pernet_operations are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28ipvs: remove IPS_NAT_MASK check to fix passive FTPJulian Anastasov1-1/+1
The IPS_NAT_MASK check in 4.12 replaced previous check for nfct_nat() which was needed to fix a crash in 2.6.36-rc, see commit 7bcbf81a2296 ("ipvs: avoid oops for passive FTP"). But as IPVS does not set the IPS_SRC_NAT and IPS_DST_NAT bits, checking for IPS_NAT_MASK prevents PASV response to be properly mangled and blocks the transfer. Remove the check as it is not needed after 3.12 commit 41d73ec053d2 ("netfilter: nf_conntrack: make sequence number adjustments usuable without NAT") which changes nfct_nat() with nfct_seqadj() and especially after 3.13 commit b25adce16064 ("ipvs: correct usage/allocation of seqadj ext in ipvs"). Thanks to Li Shuang and Florian Westphal for reporting the problem! Reported-by: Li Shuang <shuali@redhat.com> Fixes: be7be6e161a2 ("netfilter: ipvs: fix incorrect conflict resolution") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-27netfilter: nf_tables: use the right index from flowtable error pathPablo Neira Ayuso1-1/+1
Use the right loop index, not the number of devices in the array that we need to remove, the following message uncovered the problem: [ 5437.044119] hook not found, pf 5 num 0 [ 5437.044140] WARNING: CPU: 2 PID: 24983 at net/netfilter/core.c:376 __nf_unregister_net_hook+0x250/0x280 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-27net: Convert sysctl creating and destroying pernet_operationsKirill Tkhai2-0/+2
These pernet_operations create and destroy sysctl tables, and they are able to be executed in parallel with any others: ip_vs_lblc_ops ip_vs_lblcr_ops Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27net: Convert synproxy_net_opsKirill Tkhai1-0/+1
These pernet_operations create and destroy /proc entries and allocate extents to template ct, which depend on global nf_ct_ext_types[] array. So, we are able to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27net: Convert hashlimit_net_ops and recent_net_opsKirill Tkhai2-0/+2
These pernet_operations just create and destroy /proc entries. Also, new /proc entries also may come after new nf rules are added, but this is not possible, when net isn't alive. So, they are safe to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27netfilter: nf_tables: missing attribute validation in nf_tables_delflowtable()Pablo Neira Ayuso1-0/+5
Return -EINVAL is mandatory attributes are missing. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-27netfilter: nf_tables: return EBUSY if device already belongs to flowtablePablo Neira Ayuso1-1/+17
If the netdevice is already part of a flowtable, return EBUSY. I cannot find a valid usecase for having two flowtables bound to the same netdevice. We can still have two flowtable where the device set is disjoint. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller40-251/+255
2018-02-19net: Convert ip_tables_net_ops, udplite6_net_ops and xt_net_opsKirill Tkhai1-0/+1
ip_tables_net_ops and udplite6_net_ops create and destroy /proc entries. xt_net_ops does nothing. So, we are able to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19netfilter: IDLETIMER: be syzkaller friendlyEric Dumazet1-3/+6
We had one report from syzkaller [1] First issue is that INIT_WORK() should be done before mod_timer() or we risk timer being fired too soon, even with a 1 second timer. Second issue is that we need to reject too big info->timeout to avoid overflows in msecs_to_jiffies(info->timeout * 1000), or risk looping, if result after overflow is 0. [1] WARNING: CPU: 1 PID: 5129 at kernel/workqueue.c:1444 __queue_work+0xdf4/0x1230 kernel/workqueue.c:1444 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 5129 Comm: syzkaller159866 Not tainted 4.16.0-rc1+ #230 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x211/0x2d0 lib/bug.c:184 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:988 RIP: 0010:__queue_work+0xdf4/0x1230 kernel/workqueue.c:1444 RSP: 0018:ffff8801db507538 EFLAGS: 00010006 RAX: ffff8801aeb46080 RBX: ffff8801db530200 RCX: ffffffff81481404 RDX: 0000000000000100 RSI: ffffffff86b42640 RDI: 0000000000000082 RBP: ffff8801db507758 R08: 1ffff1003b6a0de5 R09: 000000000000000c R10: ffff8801db5073f0 R11: 0000000000000020 R12: 1ffff1003b6a0eb6 R13: ffff8801b1067ae0 R14: 00000000000001f8 R15: dffffc0000000000 queue_work_on+0x16a/0x1c0 kernel/workqueue.c:1488 queue_work include/linux/workqueue.h:488 [inline] schedule_work include/linux/workqueue.h:546 [inline] idletimer_tg_expired+0x44/0x60 net/netfilter/xt_IDLETIMER.c:116 call_timer_fn+0x228/0x820 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2d7/0xb85 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:541 [inline] smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:829 </IRQ> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:777 [inline] RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline] RIP: 0010:_raw_spin_unlock_irqrestore+0x5e/0xba kernel/locking/spinlock.c:184 RSP: 0018:ffff8801c20173c8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff12 RAX: dffffc0000000000 RBX: 0000000000000282 RCX: 0000000000000006 RDX: 1ffffffff0d592cd RSI: 1ffff10035d68d23 RDI: 0000000000000282 RBP: ffff8801c20173d8 R08: 1ffff10038402e47 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8820e5c8 R13: ffff8801b1067ad8 R14: ffff8801aea7c268 R15: ffff8801aea7c278 __debug_object_init+0x235/0x1040 lib/debugobjects.c:378 debug_object_init+0x17/0x20 lib/debugobjects.c:391 __init_work+0x2b/0x60 kernel/workqueue.c:506 idletimer_tg_create net/netfilter/xt_IDLETIMER.c:152 [inline] idletimer_tg_checkentry+0x691/0xb00 net/netfilter/xt_IDLETIMER.c:213 xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:850 check_target net/ipv6/netfilter/ip6_tables.c:533 [inline] find_check_entry.isra.7+0x935/0xcf0 net/ipv6/netfilter/ip6_tables.c:575 translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:744 do_replace net/ipv6/netfilter/ip6_tables.c:1160 [inline] do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1686 nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115 ipv6_setsockopt+0x10b/0x130 net/ipv6/ipv6_sockglue.c:927 udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2976 SYSC_setsockopt net/socket.c:1850 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1829 do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287 Fixes: 0902b469bd25 ("netfilter: xtables: idletimer target implementation") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-16netfilter: xt_hashlimit: fix lock imbalanceEric Dumazet1-1/+1
syszkaller found that rcu was not held in hashlimit_mt_common() We only need to enable BH at this point. Fixes: bea74641e378 ("netfilter: xt_hashlimit: add rate match mode") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: nat: cope with negative port rangePaolo Abeni1-2/+5
syzbot reported a division by 0 bug in the netfilter nat code: divide error: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 4168 Comm: syzkaller034710 Not tainted 4.16.0-rc1+ #309 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:nf_nat_l4proto_unique_tuple+0x291/0x530 net/netfilter/nf_nat_proto_common.c:88 RSP: 0018:ffff8801b2466778 EFLAGS: 00010246 RAX: 000000000000f153 RBX: ffff8801b2466dd8 RCX: ffff8801b2466c7c RDX: 0000000000000000 RSI: ffff8801b2466c58 RDI: ffff8801db5293ac RBP: ffff8801b24667d8 R08: ffff8801b8ba6dc0 R09: ffffffff88af5900 R10: ffff8801b24666f0 R11: 0000000000000000 R12: 000000002990f153 R13: 0000000000000001 R14: 0000000000000000 R15: ffff8801b2466c7c FS: 00000000017e3880(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000208fdfe4 CR3: 00000001b5340002 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: dccp_unique_tuple+0x40/0x50 net/netfilter/nf_nat_proto_dccp.c:30 get_unique_tuple+0xc28/0x1c10 net/netfilter/nf_nat_core.c:362 nf_nat_setup_info+0x1c2/0xe00 net/netfilter/nf_nat_core.c:406 nf_nat_redirect_ipv6+0x306/0x730 net/netfilter/nf_nat_redirect.c:124 redirect_tg6+0x7f/0xb0 net/netfilter/xt_REDIRECT.c:34 ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365 ip6table_nat_do_chain+0x65/0x80 net/ipv6/netfilter/ip6table_nat.c:41 nf_nat_ipv6_fn+0x594/0xa80 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:302 nf_nat_ipv6_local_fn+0x33/0x5d0 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:407 ip6table_nat_local_fn+0x2c/0x40 net/ipv6/netfilter/ip6table_nat.c:69 nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline] nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483 nf_hook include/linux/netfilter.h:243 [inline] NF_HOOK include/linux/netfilter.h:286 [inline] ip6_xmit+0x10ec/0x2260 net/ipv6/ip6_output.c:277 inet6_csk_xmit+0x2fc/0x580 net/ipv6/inet6_connection_sock.c:139 dccp_transmit_skb+0x9ac/0x10f0 net/dccp/output.c:142 dccp_connect+0x369/0x670 net/dccp/output.c:564 dccp_v6_connect+0xe17/0x1bf0 net/dccp/ipv6.c:946 __inet_stream_connect+0x2d4/0xf00 net/ipv4/af_inet.c:620 inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:684 SYSC_connect+0x213/0x4a0 net/socket.c:1639 SyS_connect+0x24/0x30 net/socket.c:1620 do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x26/0x9b RIP: 0033:0x441c69 RSP: 002b:00007ffe50cc0be8 EFLAGS: 00000217 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 0000000000441c69 RDX: 000000000000001c RSI: 00000000208fdfe4 RDI: 0000000000000003 RBP: 00000000006cc018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000538 R11: 0000000000000217 R12: 0000000000403590 R13: 0000000000403620 R14: 0000000000000000 R15: 0000000000000000 Code: 48 89 f0 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 46 02 00 00 48 8b 45 c8 44 0f b7 20 e8 88 97 04 fd 31 d2 41 0f b7 c4 4c 89 f9 <41> f7 f6 48 c1 e9 03 48 b8 00 00 00 00 00 fc ff df 0f b6 0c 01 RIP: nf_nat_l4proto_unique_tuple+0x291/0x530 net/netfilter/nf_nat_proto_common.c:88 RSP: ffff8801b2466778 The problem is that currently we don't have any check on the configured port range. A port range == -1 triggers the bug, while other negative values may require a very long time to complete the following loop. This commit addresses the issue swapping the two ends on negative ranges. The check is performed in nf_nat_l4proto_unique_tuple() since the nft nat loads the port values from nft registers at runtime. v1 -> v2: use the correct 'Fixes' tag v2 -> v3: update commit message, drop unneeded READ_ONCE() Fixes: 5b1158e909ec ("[NETFILTER]: Add NAT support for nf_conntrack") Reported-by: syzbot+8012e198bd037f4871e5@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: x_tables: fix missing timer initialization in xt_LEDPaolo Abeni1-5/+5
syzbot reported that xt_LED may try to use the ledinternal->timer without previously initializing it: ------------[ cut here ]------------ kernel BUG at kernel/time/timer.c:958! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 1826 Comm: kworker/1:2 Not tainted 4.15.0+ #306 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:__mod_timer kernel/time/timer.c:958 [inline] RIP: 0010:mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102 RSP: 0018:ffff8801d24fe9f8 EFLAGS: 00010293 RAX: ffff8801d25246c0 RBX: ffff8801aec6cb50 RCX: ffffffff816052c6 RDX: 0000000000000000 RSI: 00000000fffbd14b RDI: ffff8801aec6cb68 RBP: ffff8801d24fec98 R08: 0000000000000000 R09: 1ffff1003a49fd6c R10: ffff8801d24feb28 R11: 0000000000000005 R12: dffffc0000000000 R13: ffff8801d24fec70 R14: 00000000fffbd14b R15: ffff8801af608f90 FS: 0000000000000000(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000206d6fd0 CR3: 0000000006a22001 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: led_tg+0x1db/0x2e0 net/netfilter/xt_LED.c:75 ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365 ip6table_raw_hook+0x65/0x80 net/ipv6/netfilter/ip6table_raw.c:42 nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline] nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483 nf_hook.constprop.27+0x3f6/0x830 include/linux/netfilter.h:243 NF_HOOK include/linux/netfilter.h:286 [inline] ndisc_send_skb+0xa51/0x1370 net/ipv6/ndisc.c:491 ndisc_send_ns+0x38a/0x870 net/ipv6/ndisc.c:633 addrconf_dad_work+0xb9e/0x1320 net/ipv6/addrconf.c:4008 process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113 worker_thread+0x223/0x1990 kernel/workqueue.c:2247 kthread+0x33c/0x400 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429 Code: 85 2a 0b 00 00 4d 8b 3c 24 4d 85 ff 75 9f 4c 8b bd 60 fd ff ff e8 bb 57 10 00 65 ff 0d 94 9a a1 7e e9 d9 fc ff ff e8 aa 57 10 00 <0f> 0b e8 a3 57 10 00 e9 14 fb ff ff e8 99 57 10 00 4c 89 bd 70 RIP: __mod_timer kernel/time/timer.c:958 [inline] RSP: ffff8801d24fe9f8 RIP: mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102 RSP: ffff8801d24fe9f8 ---[ end trace f661ab06f5dd8b3d ]--- The ledinternal struct can be shared between several different xt_LED targets, but the related timer is currently initialized only if the first target requires it. Fix it by unconditionally initializing the timer struct. v1 -> v2: call del_timer_sync() unconditionally, too. Fixes: 268cb38e1802 ("netfilter: x_tables: add LED trigger target") Reported-by: syzbot+10c98dc5725c6c8fc7fb@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: x_tables: use pr ratelimiting in all remaining spotsFlorian Westphal28-95/+105
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: x_tables: use pr ratelimiting in matches/targetsFlorian Westphal3-33/+40
all of these print simple error message - use single pr_ratelimit call. checkpatch complains about lines > 80 but this would require splitting several "literals" over multiple lines which is worse. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: x_tables: rate-limit table mismatch warningsFlorian Westphal2-4/+4
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: xt_set: use pr ratelimitingFlorian Westphal1-25/+25
also convert this to info for consistency. These errors are informational message to user, given iptables doesn't have netlink extack equivalent. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: xt_NFQUEUE: use pr ratelimitingFlorian Westphal1-3/+5
switch this to info, since these aren't really errors. We only use printk because we cannot report meaningful errors in the xtables framework. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: xt_CT: use pr ratelimitingFlorian Westphal1-12/+13
checkpatch complains about line > 80 but this would require splitting "literal" over two lines which is worse. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: x_tables: use pr ratelimiting in xt coreFlorian Westphal1-36/+34
most messages are converted to info, since they occur in response to wrong usage. Size mismatch however is a real error (xtables ABI bug) that should not occur. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: x_tables: remove pr_info where possibleFlorian Westphal6-28/+12
remove several pr_info messages that cannot be triggered with iptables, the check is only to ensure input is sane. iptables(8) already prints error messages in these cases. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-13net: Convert nf_log_net_opsKirill Tkhai1-0/+1
The pernet_operations would have had a problem in parallel execution with others, if init_net had been able to released. But it's not, and the rest is safe for that. There is memory allocation, which nobody else interested in, and sysctl registration. So, we make them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13net: Convert netfilter_net_opsKirill Tkhai1-0/+1
Methods netfilter_net_init() and netfilter_net_exit() initialize net::nf::hooks and change net-related proc directory of net. Another pernet_operations are not interested in forein net::nf::hooks or proc entries, so it's safe to make them executed in parallel with methods of other pernet operations. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08netfilter: x_tables: remove size checkMichal Hocko1-4/+0
Back in 2002 vmalloc used to BUG on too large sizes. We are much better behaved these days and vmalloc simply returns NULL for those. Remove the check as it simply not needed and the comment is even misleading. Link: http://lkml.kernel.org/r/20180131081916.GO21609@dhcp22.suse.cz Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Florian Westphal <fw@strlen.de> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-07netfilter: nf_flow_offload: fix use-after-free and a resource leakFelix Fietkau1-20/+7
flow_offload_del frees the flow, so all associated resource must be freed before. Since the ct entry in struct flow_offload_entry was allocated by flow_offload_alloc, it should be freed by flow_offload_free to take care of the error handling path when flow_offload_add fails. While at it, make flow_offload_del static, since it should never be called directly, only from the gc step Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-07netfilter: nf_tables: fix flowtable freePablo Neira Ayuso3-13/+22
Every flow_offload entry is added into the table twice. Because of this, rhashtable_free_and_destroy can't be used, since it would call kfree for each flow_offload object twice. This patch cleans up the flowtable via nf_flow_table_iterate() to schedule removal of entries by setting on the dying bit, then there is an explicitly invocation of the garbage collector to release resources. Based on patch from Felix Fietkau. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-07netfilter: nft_flow_offload: move flowtable cleanup routines to nf_flow_tablePablo Neira Ayuso2-18/+25
Move the flowtable cleanup routines to nf_flow_table and expose the nf_flow_table_cleanup() helper function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-07netfilter: xt_RATEEST: acquire xt_rateest_mutex for hash insertCong Wang1-5/+17
rateest_hash is supposed to be protected by xt_rateest_mutex, and, as suggested by Eric, lookup and insert should be atomic, so we should acquire the xt_rateest_mutex once for both. So introduce a non-locking helper for internal use and keep the locking one for external. Reported-by: <syzbot+5cb189720978275e4c75@syzkaller.appspotmail.com> Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target") Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-02netfilter: nft_flow_offload: no need to flush entries on module removalPablo Neira Ayuso1-6/+0
nft_flow_offload module removal does not require to flush existing flowtables, it is valid to remove this module while keeping flowtables around. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-02netfilter: nft_flow_offload: wait for garbage collector to run after cleanupPablo Neira Ayuso2-4/+5
If netdevice goes down, then flowtable entries are scheduled to be removed. Wait for garbage collector to have a chance to run so it can delete them from the hashtable. The flush call might sleep, so hold the nfnl mutex from nft_flow_table_iterate() instead of rcu read side lock. The use of the nfnl mutex is also implicitly fixing races between updates via nfnetlink and netdevice event. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>