linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-06-20	bpfilter: fix build error	Matteo Croce	1	-2/+4
	bpfilter Makefile assumes that the system locale is en_US, and the parsing of objdump output fails. Set LC_ALL=C and, while at it, rewrite the objdump parsing so it spawns only 2 processes instead of 7. Fixes: d2ba09c17a064 ("net: add skeleton of bpfilter kernel module") Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/usb/drivers: Remove useless hrtimer_active check	Daniel Lezcano	1	-2/+1
	The code does: if (hrtimer_active(&t)) hrtimer_cancel(&t); However, hrtimer_cancel() checks if the timer is active, so the test above is pointless. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/sched: act_ife: preserve the action control in case of error	Davide Caratti	1	-2/+1
	in the following script # tc actions add action ife encode allow prio pass index 42 # tc actions replace action ife encode allow tcindex drop index 42 the action control should remain equal to 'pass', if the kernel failed to replace the TC action. Pospone the assignment of the action control, to ensure it is not overwritten in the error path of tcf_ife_init(). Fixes: ef6980b6becb ("introduce IFE action") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/sched: act_ife: fix recursive lock and idr leak	Davide Caratti	1	-5/+4
	a recursive lock warning [1] can be observed with the following script, # $TC actions add action ife encode allow prio pass index 42 IFE type 0xED3E # $TC actions replace action ife encode allow tcindex pass index 42 in case the kernel was unable to run the last command (e.g. because of the impossibility to load 'act_meta_skbtcindex'). For a similar reason, the kernel can leak idr in the error path of tcf_ife_init(), because tcf_idr_release() is not called after successful idr reservation: # $TC actions add action ife encode allow tcindex index 47 IFE type 0xED3E RTNETLINK answers: No such file or directory We have an error talking to the kernel # $TC actions add action ife encode allow tcindex index 47 IFE type 0xED3E RTNETLINK answers: No space left on device We have an error talking to the kernel # $TC actions add action ife encode use mark 7 type 0xfefe pass index 47 IFE type 0xFEFE RTNETLINK answers: No space left on device We have an error talking to the kernel Since tcfa_lock is already taken when the action is being edited, a call to tcf_idr_release() wrongly makes tcf_idr_cleanup() take the same lock again. On the other hand, tcf_idr_release() needs to be called in the error path of tcf_ife_init(), to undo the last tcf_idr_create() invocation. Fix both problems in tcf_ife_init(). Since the cleanup() routine can now be called when ife->params is NULL, also add a NULL pointer check to avoid calling kfree_rcu(NULL, rcu). [1] ============================================ WARNING: possible recursive locking detected 4.17.0-rc4.kasan+ #417 Tainted: G E -------------------------------------------- tc/3932 is trying to acquire lock: 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_cleanup+0x19/0x80 [act_ife] but task is already holding lock: 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&p->tcfa_lock)->rlock); lock(&(&p->tcfa_lock)->rlock); * DEADLOCK * May be due to missing lock nesting notation 2 locks held by tc/3932: #0: 000000007ca8e990 (rtnl_mutex){+.+.}, at: tcf_ife_init+0xf61/0x13c0 [act_ife] #1: 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife] stack backtrace: CPU: 3 PID: 3932 Comm: tc Tainted: G E 4.17.0-rc4.kasan+ #417 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: dump_stack+0x9a/0xeb __lock_acquire+0xf43/0x34a0 ? debug_check_no_locks_freed+0x2b0/0x2b0 ? debug_check_no_locks_freed+0x2b0/0x2b0 ? debug_check_no_locks_freed+0x2b0/0x2b0 ? __mutex_lock+0x62f/0x1240 ? kvm_sched_clock_read+0x1a/0x30 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x170 ? find_held_lock+0x39/0x1d0 ? lock_acquire+0x10b/0x330 lock_acquire+0x10b/0x330 ? tcf_ife_cleanup+0x19/0x80 [act_ife] _raw_spin_lock_bh+0x38/0x70 ? tcf_ife_cleanup+0x19/0x80 [act_ife] tcf_ife_cleanup+0x19/0x80 [act_ife] __tcf_idr_release+0xff/0x350 tcf_ife_init+0xdde/0x13c0 [act_ife] ? ife_exit_net+0x290/0x290 [act_ife] ? __lock_is_held+0xb4/0x140 tcf_action_init_1+0x67b/0xad0 ? tcf_action_dump_old+0xa0/0xa0 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x170 ? kvm_sched_clock_read+0x1a/0x30 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x170 ? memset+0x1f/0x40 tcf_action_init+0x30f/0x590 ? tcf_action_init_1+0xad0/0xad0 ? memset+0x1f/0x40 tc_ctl_action+0x48e/0x5e0 ? mutex_lock_io_nested+0x1160/0x1160 ? tca_action_gd+0x990/0x990 ? sched_clock+0x5/0x10 ? find_held_lock+0x39/0x1d0 rtnetlink_rcv_msg+0x4da/0x990 ? validate_linkmsg+0x680/0x680 ? sched_clock_cpu+0x18/0x170 ? find_held_lock+0x39/0x1d0 netlink_rcv_skb+0x127/0x350 ? validate_linkmsg+0x680/0x680 ? netlink_ack+0x970/0x970 ? __kmalloc_node_track_caller+0x304/0x3a0 netlink_unicast+0x40f/0x5d0 ? netlink_attachskb+0x580/0x580 ? _copy_from_iter_full+0x187/0x760 ? import_iovec+0x90/0x390 netlink_sendmsg+0x67f/0xb50 ? netlink_unicast+0x5d0/0x5d0 ? copy_msghdr_from_user+0x206/0x340 ? netlink_unicast+0x5d0/0x5d0 sock_sendmsg+0xb3/0xf0 ___sys_sendmsg+0x60a/0x8b0 ? copy_msghdr_from_user+0x340/0x340 ? lock_downgrade+0x5e0/0x5e0 ? tty_write_lock+0x18/0x50 ? kvm_sched_clock_read+0x1a/0x30 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x170 ? find_held_lock+0x39/0x1d0 ? lock_downgrade+0x5e0/0x5e0 ? lock_acquire+0x10b/0x330 ? __audit_syscall_entry+0x316/0x690 ? current_kernel_time64+0x6b/0xd0 ? __fget_light+0x55/0x1f0 ? __sys_sendmsg+0xd2/0x170 __sys_sendmsg+0xd2/0x170 ? __ia32_sys_shutdown+0x70/0x70 ? syscall_trace_enter+0x57a/0xd60 ? rcu_read_lock_sched_held+0xdc/0x110 ? __bpf_trace_sys_enter+0x10/0x10 ? do_syscall_64+0x22/0x480 do_syscall_64+0xa5/0x480 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fd646988ba0 RSP: 002b:00007fffc9fab3c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fffc9fab4f0 RCX: 00007fd646988ba0 RDX: 0000000000000000 RSI: 00007fffc9fab440 RDI: 0000000000000003 RBP: 000000005b28c8b3 R08: 0000000000000002 R09: 0000000000000000 R10: 00007fffc9faae20 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffc9fab504 R14: 0000000000000001 R15: 000000000066c100 Fixes: 4e8c86155010 ("net sched: net sched: ife action fix late binding") Fixes: ef6980b6becb ("introduce IFE action") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net: ethernet: fix suspend/resume in davinci_emac	Bartosz Golaszewski	1	-2/+13
	This patch reverts commit 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching") and adds a comment which should stop anyone from reintroducing the same "fix" in the future. We can't use bus_find_device_by_name() here because the device name is not guaranteed to be 'davinci_mdio'. On some systems it can be 'davinci_mdio.0' so we need to use strncmp() against the first part of the string to correctly match it. Fixes: 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching") Cc: stable@vger.kernel.org Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net: propagate dev_get_valid_name return code	Li RongQing	1	-2/+2
	if dev_get_valid_name failed, propagate its return code and remove the setting err to ENODEV, it will be set to 0 again before dev_change_net_namespace exits. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	enic: do not overwrite error code	Govindarajulu Varadarajan	1	-5/+4
	In failure path, we overwrite err to what vnic_rq_disable() returns. In case it returns 0, enic_open() returns success in case of error. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Fixes: e8588e268509 ("enic: enable rq before updating rq descriptors") Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/tcp: Fix socket lookups with SO_BINDTODEVICE	David Ahern	2	-4/+4
	Similar to 69678bcd4d2d ("udp: fix SO_BINDTODEVICE"), TCP socket lookups need to fail if dev_match is not true. Currently, a packet to a given port can match a socket bound to device when it should not. In the VRF case, this causes the lookup to hit a VRF socket and not a global socket resulting in a response trying to go through the VRF when it should not. Fixes: 3fa6f616a7a4d ("net: ipv4: add second dif to inet socket lookups") Fixes: 4297a0ef08572 ("net: ipv6: add second dif to inet6 socket lookups") Reported-by: Lou Berger <lberger@labn.net> Diagnosed-by: Renato Westphal <renato@opensourcerouting.org> Tested-by: Renato Westphal <renato@opensourcerouting.org> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	ptp: replace getnstimeofday64() with ktime_get_real_ts64()	Arnd Bergmann	2	-3/+3
	getnstimeofday64() is deprecated and getting replaced throughout the kernel with ktime_get_*() based helpers for a more consistent interface. The two functions do the exact same thing, so this is just a cosmetic change. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/ipv6: respect rcu grace period before freeing fib6_info	Eric Dumazet	2	-4/+6
	syzbot reported use after free that is caused by fib6_info being freed without a proper RCU grace period. CPU: 0 PID: 1407 Comm: udevd Not tainted 4.17.0+ #39 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __read_once_size include/linux/compiler.h:188 [inline] find_rr_leaf net/ipv6/route.c:705 [inline] rt6_select net/ipv6/route.c:761 [inline] fib6_table_lookup+0x12b7/0x14d0 net/ipv6/route.c:1823 ip6_pol_route+0x1c2/0x1020 net/ipv6/route.c:1856 ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2082 fib6_rule_lookup+0x211/0x6d0 net/ipv6/fib6_rules.c:122 ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2110 ip6_route_output include/net/ip6_route.h:82 [inline] icmpv6_xrlim_allow net/ipv6/icmp.c:211 [inline] icmp6_send+0x147c/0x2da0 net/ipv6/icmp.c:535 icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43 ip6_link_failure+0xa5/0x790 net/ipv6/route.c:2244 dst_link_failure include/net/dst.h:427 [inline] ndisc_error_report+0xd1/0x1c0 net/ipv6/ndisc.c:695 neigh_invalidate+0x246/0x550 net/core/neighbour.c:892 neigh_timer_handler+0xaf9/0xde0 net/core/neighbour.c:978 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x79e/0xc50 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 [inline] irq_exit+0x1d1/0x200 kernel/softirq.c:404 exiting_irq arch/x86/include/asm/apic.h:527 [inline] smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 </IRQ> RIP: 0010:strlen+0x5e/0xa0 lib/string.c:482 Code: 24 00 74 3b 48 bb 00 00 00 00 00 fc ff df 4c 89 e0 48 83 c0 01 48 89 c2 48 89 c1 48 c1 ea 03 83 e1 07 0f b6 14 1a 38 ca 7f 04 <84> d2 75 23 80 38 00 75 de 48 83 c4 08 4c 29 e0 5b 41 5c 5d c3 48 RSP: 0018:ffff8801af117850 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffff880197f53bd0 RBX: dffffc0000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff81c5b06c RDI: ffff880197f53bc0 RBP: ffff8801af117868 R08: ffff88019a976540 R09: 0000000000000000 R10: ffff88019a976540 R11: 0000000000000000 R12: ffff880197f53bc0 R13: ffff880197f53bc0 R14: ffffffff899e4e90 R15: ffff8801d91c6a00 strlen include/linux/string.h:267 [inline] getname_kernel+0x24/0x370 fs/namei.c:218 open_exec+0x17/0x70 fs/exec.c:882 load_elf_binary+0x968/0x5610 fs/binfmt_elf.c:780 search_binary_handler+0x17d/0x570 fs/exec.c:1653 exec_binprm fs/exec.c:1695 [inline] __do_execve_file.isra.35+0x16fe/0x2710 fs/exec.c:1819 do_execveat_common fs/exec.c:1866 [inline] do_execve fs/exec.c:1883 [inline] __do_sys_execve fs/exec.c:1964 [inline] __se_sys_execve fs/exec.c:1959 [inline] __x64_sys_execve+0x8f/0xc0 fs/exec.c:1959 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f1576a46207 Code: 77 19 f4 48 89 d7 44 89 c0 0f 05 48 3d 00 f0 ff ff 76 e0 f7 d8 64 41 89 01 eb d8 f7 d8 64 41 89 01 eb df b8 3b 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 00 8c 2d 00 f7 d8 64 89 02 RSP: 002b:00007ffff2784568 EFLAGS: 00000202 ORIG_RAX: 000000000000003b RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f1576a46207 RDX: 0000000001215b10 RSI: 00007ffff2784660 RDI: 00007ffff2785670 RBP: 0000000000625500 R08: 000000000000589c R09: 000000000000589c R10: 0000000000000000 R11: 0000000000000202 R12: 0000000001215b10 R13: 0000000000000007 R14: 0000000001204250 R15: 0000000000000005 Allocated by task 12188: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553 kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620 kmalloc include/linux/slab.h:513 [inline] kzalloc include/linux/slab.h:706 [inline] fib6_info_alloc+0xbb/0x280 net/ipv6/ip6_fib.c:152 ip6_route_info_create+0x782/0x2b50 net/ipv6/route.c:3013 ip6_route_add+0x23/0xb0 net/ipv6/route.c:3154 ipv6_route_ioctl+0x5a5/0x760 net/ipv6/route.c:3660 inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546 sock_do_ioctl+0xe4/0x3e0 net/socket.c:973 sock_ioctl+0x30d/0x680 net/socket.c:1097 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 1402: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xd9/0x260 mm/slab.c:3813 fib6_info_destroy+0x29b/0x350 net/ipv6/ip6_fib.c:207 fib6_info_release include/net/ip6_fib.h:286 [inline] __ip6_del_rt_siblings net/ipv6/route.c:3235 [inline] ip6_route_del+0x11c4/0x13b0 net/ipv6/route.c:3316 ipv6_route_ioctl+0x616/0x760 net/ipv6/route.c:3663 inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546 sock_do_ioctl+0xe4/0x3e0 net/socket.c:973 sock_ioctl+0x30d/0x680 net/socket.c:1097 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8801b5df2580 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 8 bytes inside of 256-byte region [ffff8801b5df2580, ffff8801b5df2680) The buggy address belongs to the page: page:ffffea0006d77c80 count:1 mapcount:0 mapping:ffff8801da8007c0 index:0xffff8801b5df2e40 flags: 0x2fffc0000000100(slab) raw: 02fffc0000000100 ffffea0006c5cc48 ffffea0007363308 ffff8801da8007c0 raw: ffff8801b5df2e40 ffff8801b5df2080 0000000100000006 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801b5df2480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801b5df2500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc > ffff8801b5df2580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8801b5df2600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801b5df2680: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb Fixes: a64efe142f5e ("net/ipv6: introduce fib6_info struct and helpers") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Reported-by: syzbot+9e6d75e3edef427ee888@syzkaller.appspotmail.com Acked-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net: net_failover: fix typo in net_failover_slave_register()	Liran Alon	1	-1/+1
	Sync both unicast and multicast lists instead of unicast twice. Fixes: cfc80d9a116 ("net: Introduce net_failover driver") Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	ipvlan: use ETH_MAX_MTU as max mtu	Xin Long	1	-0/+1
	Similar to the fixes on team and bonding, this restores the ability to set an ipvlan device's mtu to anything higher than 1500. Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net: hamradio: use eth_broadcast_addr	Stefan Agner	1	-6/+2
	The array bpq_eth_addr is only used to get the size of an address, whereas the bcast_addr is used to set the broadcast address. This leads to a warning when using clang: drivers/net/hamradio/bpqether.c:94:13: warning: variable 'bpq_eth_addr' is not needed and will not be emitted [-Wunneeded-internal-declaration] static char bpq_eth_addr[6]; ^ Remove both variables and use the common eth_broadcast_addr to set the broadcast address. Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	enic: initialize enic->rfs_h.lock in enic_probe	Govindarajulu Varadarajan	2	-3/+3
	lockdep spotted that we are using rfs_h.lock in enic_get_rxnfc() without initializing. rfs_h.lock is initialized in enic_open(). But ethtool_ops can be called when interface is down. Move enic_rfs_flw_tbl_init to enic_probe. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 18 PID: 1189 Comm: ethtool Not tainted 4.17.0-rc7-devel+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: dump_stack+0x85/0xc0 register_lock_class+0x550/0x560 ? __handle_mm_fault+0xa8b/0x1100 __lock_acquire+0x81/0x670 lock_acquire+0xb9/0x1e0 ? enic_get_rxnfc+0x139/0x2b0 [enic] _raw_spin_lock_bh+0x38/0x80 ? enic_get_rxnfc+0x139/0x2b0 [enic] enic_get_rxnfc+0x139/0x2b0 [enic] ethtool_get_rxnfc+0x8d/0x1c0 dev_ethtool+0x16c8/0x2400 ? __mutex_lock+0x64d/0xa00 ? dev_load+0x6a/0x150 dev_ioctl+0x253/0x4b0 sock_do_ioctl+0x9a/0x130 sock_ioctl+0x1af/0x350 do_vfs_ioctl+0x8e/0x670 ? syscall_trace_enter+0x1e2/0x380 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x5a/0x170 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	MAINTAINERS: Add Sam as the maintainer for NCSI	Joel Stanley	1	-0/+5
	Sam has been handing the maintenance of NCSI for a number release cycles now. Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/ncsi: Use netdev_dbg for debug messages	Joel Stanley	2	-21/+18
	This moves all of the netdev_printk(KERN_DEBUG, ...) messages over to netdev_dbg. As Joe explains: > netdev_dbg is not included in object code unless > DEBUG is defined or CONFIG_DYNAMIC_DEBUG is set. > And then, it is not emitted into the log unless > DEBUG is set or this specific netdev_dbg is enabled > via the dynamic debug control file. Which is what we're after in this case. Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/ncsi: Drop no more channels message	Joel Stanley	1	-2/+0
	This does not provide useful information. As the ncsi maintainer said: > either we get a channel or broadcom has gone out to lunch Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	net/ncsi: Silence debug messages	Joel Stanley	3	-11/+11
	In normal operation we see this series of messages as the host drives the network device: ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state down ftgmac100 1e660000.ethernet eth0: NCSI: suspending channel 0 ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0 ftgmac100 1e660000.ethernet eth0: NCSI: channel 0 link down after config ftgmac100 1e660000.ethernet eth0: NCSI interface down ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state up ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0 ftgmac100 1e660000.ethernet eth0: NCSI interface up ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state down ftgmac100 1e660000.ethernet eth0: NCSI: suspending channel 0 ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0 ftgmac100 1e660000.ethernet eth0: NCSI: channel 0 link down after config ftgmac100 1e660000.ethernet eth0: NCSI interface down ftgmac100 1e660000.ethernet eth0: NCSI: LSC AEN - channel 0 state up ftgmac100 1e660000.ethernet eth0: NCSI: configuring channel 0 ftgmac100 1e660000.ethernet eth0: NCSI interface up This makes all of these messages netdev_dbg. They are still useful to debug eg. misbehaving network device firmware, but we do not need them filling up the kernel logs in normal operation. Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	bpf, xdp, i40e: fix i40e_build_skb skb reserve and truesize	Daniel Borkmann	1	-4/+3
	Using skb_reserve(skb, I40E_SKB_PAD + (xdp->data - xdp->data_hard_start)) is clearly wrong since I40E_SKB_PAD already points to the offset where the original xdp->data was sitting since xdp->data_hard_start is defined as xdp->data - i40e_rx_offset(rx_ring) where latter offsets to I40E_SKB_PAD when build skb is used. However, also before cc5b114dcf98 ("bpf, i40e: add meta data support") this seems broken since bpf_xdp_adjust_head() helper could have been used to alter headroom and enlarge / shrink the frame and with that the assumption that the xdp->data remains unchanged does not hold and would push a bogus packet to upper stack. ixgbe got this right in 924708081629 ("ixgbe: add XDP support for pass and drop actions"). In any case, fix it by removing the I40E_SKB_PAD from both skb_reserve() and truesize calculation. Fixes: cc5b114dcf98 ("bpf, i40e: add meta data support") Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Reported-by: Keith Busch <keith.busch@linux.intel.com> Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: John Fastabend <john.fastabend@gmail.com> Tested-by: Keith Busch <keith.busch@linux.intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	qed: Do not advertise DCBX_LLD_MANAGED capability.	Sudarsana Reddy Kalluru	1	-7/+4
	Do not advertise DCBX_LLD_MANAGED capability i.e., do not allow external agent to manage the dcbx/lldp negotiation. MFW acts as lldp agent for qed* devices, and no other lldp agent is allowed to coexist with mfw. Also updated a debug print, to not to display the redundant info. Fixes: a1d8d8a51 ("qed: Add dcbnl support.") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	qed: Add sanity check for SIMD fastpath handler.	Sudarsana Reddy Kalluru	1	-2/+10
	Avoid calling a SIMD fastpath handler if it is NULL. The check is needed to handle an unlikely scenario where unsolicited interrupt is destined to a PF in INTa mode. Fixes: fe56b9e6a ("qed: Add module with basic common support") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20	qed: Fix possible memory leak in Rx error path handling.	Sudarsana Reddy Kalluru	1	-2/+9
	Memory for packet buffers need to be freed in the error paths as there is no consumer (e.g., upper layer) for such packets and that memory will never get freed. The issue was uncovered when port was attacked with flood of isatap packets, these are multicast packets hence were directed at all the PFs. For foce PF, this meant they were routed to the ll2 module which in turn drops such packets. Fixes: 0a7fb11c ("qed: Add Light L2 support") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-17	net_sched: blackhole: tell upper qdisc about dropped packets	Konstantin Khlebnikov	1	-1/+1
	When blackhole is used on top of classful qdisc like hfsc it breaks qlen and backlog counters because packets are disappear without notice. In HFSC non-zero qlen while all classes are inactive triggers warning: WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc] and schedules watchdog work endlessly. This patch return __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS, this flag tells upper layer: this packet is gone and isn't queued. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-17	bluetooth: hci_nokia: Don't include linux/unaligned/le_struct.h directly.	David S. Miller	1	-1/+1
	This breaks the build as this header is not meant to be used in this way. ./include/linux/unaligned/access_ok.h:8:28: error: redefinition of ‘get_unaligned_le16’ static __always_inline u16 get_unaligned_le16(const void p) ^~~~~~~~~~~~~~~~~~ In file included from drivers/bluetooth/hci_nokia.c:32: ./include/linux/unaligned/le_struct.h:7:19: note: previous definition of ‘get_unaligned_le16’ was here static inline u16 get_unaligned_le16(const void p) Use asm/unaligned.h instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-17	atm: Preserve value of skb->truesize when accounting to vcc	David Woodhouse	8	-14/+23
	ATM accounts for in-flight TX packets in sk_wmem_alloc of the VCC on which they are to be sent. But it doesn't take ownership of those packets from the sock (if any) which originally owned them. They should remain owned by their actual sender until they've left the box. There's a hack in pskb_expand_head() to avoid adjusting skb->truesize for certain skbs, precisely to avoid messing up sk_wmem_alloc accounting. Ideally that hack would cover the ATM use case too, but it doesn't — skbs which aren't owned by any sock, for example PPP control frames, still get their truesize adjusted when the low-level ATM driver adds headroom. This has always been an issue, it seems. The truesize of a packet increases, and sk_wmem_alloc on the VCC goes negative. But this wasn't for normal traffic, only for control frames. So I think we just got away with it, and we probably needed to send 2GiB of LCP echo frames before the misaccounting would ever have caused a problem and caused atm_may_send() to start refusing packets. Commit 14afee4b609 ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t") did exactly what it was intended to do, and turned this mostly-theoretical problem into a real one, causing PPPoATM to fail immediately as sk_wmem_alloc underflows and atm_may_send() immediately starts refusing to allow new packets. The least intrusive solution to this problem is to stash the value of skb->truesize that was accounted to the VCC, in a new member of the ATM_SKB(skb) structure. Then in atm_pop_raw() subtract precisely that value instead of the then-current value of skb->truesize. Fixes: 158f323b9868 ("net: adjust skb->truesize in pskb_expand_head()") Signed-off-by: David Woodhouse <dwmw2@infradead.org> Tested-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	xdp: Fix handling of devmap in generic XDP	Toshiaki Makita	4	-17/+46
	Commit 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue") changed the return value type of __devmap_lookup_elem() from struct net_device * to struct bpf_dtab_netdev * but forgot to modify generic XDP code accordingly. Thus generic XDP incorrectly used struct bpf_dtab_netdev where struct net_device is expected, then skb->dev was set to invalid value. v2: - Fix compiler warning without CONFIG_BPF_SYSCALL. Fixes: 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue") Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-15	bpf: reject any prog that failed read-only lock	Daniel Borkmann	3	-32/+87
	We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro() as well as the BPF image as read-only through bpf_prog_lock_ro(). In the case any of these would fail we throw a WARN_ON_ONCE() in order to yell loudly to the log. Perhaps, to some extend, this may be comparable to an allocation where __GFP_NOWARN is explicitly not set. Added via 65869a47f348 ("bpf: improve read-only handling"), this behavior is slightly different compared to any of the other in-kernel set_memory_ro() users who do not check the return code of set_memory_ro() and friends /at all/ (e.g. in the case of module_enable_ro() / module_disable_ro()). Given in BPF this is mandatory hardening step, we want to know whether there are any issues that would leave both BPF data writable. So it happens that syzkaller enabled fault injection and it triggered memory allocation failure deep inside x86's change_page_attr_set_clr() which was triggered from set_memory_ro(). Now, there are two options: i) leaving everything as is, and ii) reworking the image locking code in order to have a final checkpoint out of the central bpf_prog_select_runtime() which probes whether any of the calls during prog setup weren't successful, and then bailing out with an error. Option ii) is a better approach since this additional paranoia avoids altogether leaving any potential W+X pages from BPF side in the system. Therefore, lets be strict about it, and reject programs in such unlikely occasion. While testing I noticed also that one bpf_prog_lock_ro() call was missing on the outer dummy prog in case of calls, e.g. in the destructor we call bpf_prog_free_deferred() on the main prog where we try to bpf_prog_unlock_free() the program, and since we go via bpf_prog_select_runtime() do that as well. Reported-by: syzbot+3b889862e65a98317058@syzkaller.appspotmail.com Reported-by: syzbot+9e762b52dd17e616a7a5@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-15	bpf: fix panic in prog load calls cleanup	Daniel Borkmann	3	-6/+19
	While testing I found that when hitting error path in bpf_prog_load() where we jump to free_used_maps and prog contained BPF to BPF calls that were JITed earlier, then we never clean up the bpf_prog_kallsyms_add() done under jit_subprogs(). Add proper API to make BPF kallsyms deletion more clear and fix that. Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-15	net: stmmac: Run HWIF Quirks after getting HW caps	Jose Abreu	3	-7/+10
	Currently we were running HWIF quirks before getting HW capabilities. This is not right because some HWIF callbacks depend on HW caps. Lets save the quirks callback and use it in a later stage. This fixes Altera socfpga. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Fixes: 5f0456b43140 ("net: stmmac: Implement logic to automatically select HW Interface") Reported-by: Dinh Nguyen <dinh.linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Vitor Soares <soares@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Dinh Nguyen <dinh.linux@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	neighbour: skip NTF_EXT_LEARNED entries during forced gc	Roopa Prabhu	1	-4/+6
	Commit 9ce33e46531d ("neighbour: support for NTF_EXT_LEARNED flag") added support for NTF_EXT_LEARNED for neighbour entries. NTF_EXT_LEARNED entries are neigh entries managed by control plane (eg: Ethernet VPN implementation in FRR routing suite). Periodic gc already excludes these entries. This patch extends it to forced gc which the earlier patch missed. Fixes: 9ce33e46531d ("neighbour: support for NTF_EXT_LEARNED flag") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	net: cxgb3: add error handling for sysfs_create_group	Zhouyang Jia	1	-0/+7
	When sysfs_create_group fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling sysfs_create_group. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	tls: fix waitall behavior in tls_sw_recvmsg	Daniel Borkmann	1	-1/+5
	Current behavior in tls_sw_recvmsg() is to wait for incoming tls messages and copy up to exactly len bytes of data that the user provided. This is problematic in the sense that i) if no packet is currently queued in strparser we keep waiting until one has been processed and pushed into tls receive layer for tls_wait_data() to wake up and push the decrypted bits to user space. Given after tls decryption, we're back at streaming data, use sock_rcvlowat() hint from tcp socket instead. Retain current behavior with MSG_WAITALL flag and otherwise use the hint target for breaking the loop and returning to application. This is done if currently no ctx->recv_pkt is ready, otherwise continue to process it from our strparser backlog. Fixes: c46234ebb4d1 ("tls: RX path for ktls") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	tls: fix use-after-free in tls_push_record	Daniel Borkmann	1	-13/+13
	syzkaller managed to trigger a use-after-free in tls like the following: BUG: KASAN: use-after-free in tls_push_record.constprop.15+0x6a2/0x810 [tls] Write of size 1 at addr ffff88037aa08000 by task a.out/2317 CPU: 3 PID: 2317 Comm: a.out Not tainted 4.17.0+ #144 Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016 Call Trace: dump_stack+0x71/0xab print_address_description+0x6a/0x280 kasan_report+0x258/0x380 ? tls_push_record.constprop.15+0x6a2/0x810 [tls] tls_push_record.constprop.15+0x6a2/0x810 [tls] tls_sw_push_pending_record+0x2e/0x40 [tls] tls_sk_proto_close+0x3fe/0x710 [tls] ? tcp_check_oom+0x4c0/0x4c0 ? tls_write_space+0x260/0x260 [tls] ? kmem_cache_free+0x88/0x1f0 inet_release+0xd6/0x1b0 __sock_release+0xc0/0x240 sock_close+0x11/0x20 __fput+0x22d/0x660 task_work_run+0x114/0x1a0 do_exit+0x71a/0x2780 ? mm_update_next_owner+0x650/0x650 ? handle_mm_fault+0x2f5/0x5f0 ? __do_page_fault+0x44f/0xa50 ? mm_fault_error+0x2d0/0x2d0 do_group_exit+0xde/0x300 __x64_sys_exit_group+0x3a/0x50 do_syscall_64+0x9a/0x300 ? page_fault+0x8/0x30 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This happened through fault injection where aead_req allocation in tls_do_encryption() eventually failed and we returned -ENOMEM from the function. Turns out that the use-after-free is triggered from tls_sw_sendmsg() in the second tls_push_record(). The error then triggers a jump to waiting for memory in sk_stream_wait_memory() resp. returning immediately in case of MSG_DONTWAIT. What follows is the trim_both_sgl(sk, orig_size), which drops elements from the sg list added via tls_sw_sendmsg(). Now the use-after-free gets triggered when the socket is being closed, where tls_sk_proto_close() callback is invoked. The tls_complete_pending_work() will figure that there's a pending closed tls record to be flushed and thus calls into the tls_push_pending_closed_record() from there. ctx->push_pending_record() is called from the latter, which is the tls_sw_push_pending_record() from sw path. This again calls into tls_push_record(). And here the tls_fill_prepend() will panic since the buffer address has been freed earlier via trim_both_sgl(). One way to fix it is to move the aead request allocation out of tls_do_encryption() early into tls_push_record(). This means we don't prep the tls header and advance state to the TLS_PENDING_CLOSED_RECORD before allocation which could potentially fail happened. That fixes the issue on my side. Fixes: 3c4d7559159b ("tls: kernel TLS support") Reported-by: syzbot+5c74af81c547738e1684@syzkaller.appspotmail.com Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()	Guillaume Nault	1	-1/+1
	pppol2tp_tunnel_ioctl() can act on an L2TPv3 tunnel, in which case 'session' may be an Ethernet pseudo-wire. However, pppol2tp_session_ioctl() expects a PPP pseudo-wire, as it assumes l2tp_session_priv() points to a pppol2tp_session structure. For an Ethernet pseudo-wire l2tp_session_priv() points to an l2tp_eth_sess structure instead, making pppol2tp_session_ioctl() access invalid memory. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels	Guillaume Nault	1	-0/+6
	The /proc/net/pppol2tp handlers (pppol2tp_seq_*()) iterate over all L2TPv2 tunnels, and rightfully expect that only PPP sessions can be found there. However, l2tp_netlink accepts creating Ethernet sessions regardless of the underlying tunnel version. This confuses pppol2tp_seq_session_show(), which expects that l2tp_session_priv() returns a pppol2tp_session structure. When the session is an Ethernet pseudo-wire, a struct l2tp_eth_sess is returned instead. This leads to invalid memory access when pppol2tp_session_get_sock() later tries to dereference ps->sk. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	mlxsw: spectrum_switchdev: Fix port_vlan refcounting	Petr Machata	1	-1/+3
	Switchdev notifications for addition of SWITCHDEV_OBJ_ID_PORT_VLAN are distributed not only on clean addition, but also when flags on an existing VLAN are changed. mlxsw_sp_bridge_port_vlan_add() calls mlxsw_sp_port_vlan_get() to get at the port_vlan in question, which implicitly references the object. This then leads to discrepancies in reference counting when the VLAN is removed. spectrum.c warns about the problem when the module is removed: [13578.493090] WARNING: CPU: 0 PID: 2454 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2973 mlxsw_sp_port_remove+0xfd/0x110 [mlxsw_spectrum] [...] [13578.627106] Call Trace: [13578.629617] mlxsw_sp_fini+0x2a/0xe0 [mlxsw_spectrum] [13578.634748] mlxsw_core_bus_device_unregister+0x3e/0x130 [mlxsw_core] [13578.641290] mlxsw_pci_remove+0x13/0x40 [mlxsw_pci] [13578.646238] pci_device_remove+0x31/0xb0 [13578.650244] device_release_driver_internal+0x14f/0x220 [13578.655562] driver_detach+0x32/0x70 [13578.659183] bus_remove_driver+0x47/0xa0 [13578.663134] pci_unregister_driver+0x1e/0x80 [13578.667486] mlxsw_sp_module_exit+0xc/0x3fa [mlxsw_spectrum] [13578.673207] __x64_sys_delete_module+0x13b/0x1e0 [13578.677888] ? exit_to_usermode_loop+0x78/0x80 [13578.682374] do_syscall_64+0x39/0xe0 [13578.685976] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix by putting the port_vlan when mlxsw_sp_port_vlan_bridge_join() determines it's a flag-only change. Fixes: b3529af6bb0d ("spectrum: Reference count VLAN entries") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	mlxsw: spectrum_router: Align with new route replace logic	Ido Schimmel	1	-16/+5
	Commit f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route") changed the IPv6 route replace logic so that the first matching route (i.e., same metric) is replaced. Have mlxsw replace the first matching route as well. Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	mlxsw: spectrum_router: Allow appending to dev-only routes	Ido Schimmel	1	-8/+19
	Commit f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route") changed the IPv6 route append logic so that dev-only routes can be appended and not only gatewayed routes. Align mlxsw with the new behaviour. Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	ipv6: Only emit append events for appended routes	Ido Schimmel	1	-3/+2
	Current code will emit an append event in the FIB notification chain for any route added with NLM_F_APPEND set, even if the route was not appended to any existing route. This is inconsistent with IPv4 where such an event is only emitted when the new route is appended after an existing one. Align IPv6 behavior with IPv4, thereby allowing listeners to more easily handle these events. Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	stmmac: added support for 802.1ad vlan stripping	Elad Nachman	1	-8/+13
	stmmac reception handler calls stmmac_rx_vlan() to strip the vlan before calling napi_gro_receive(). The function assumes VLAN tagged frames are always tagged with 802.1Q protocol, and assigns ETH_P_8021Q to the skb by hard-coding the parameter on call to __vlan_hwaccel_put_tag() . This causes packets not to be passed to the VLAN slave if it was created with 802.1AD protocol (ip link add link eth0 eth0.100 type vlan proto 802.1ad id 100). This fix passes the protocol from the VLAN header into __vlan_hwaccel_put_tag() instead of using the hard-coded value of ETH_P_8021Q. NETIF_F_HW_VLAN_STAG_RX check was added and the strip action is now dependent on the correct combination of features and the detected vlan tag. NETIF_F_HW_VLAN_STAG_RX feature was added to be in line with the driver actual abilities. Signed-off-by: Elad Nachman <eladn@gilat.com> Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15	afs: Optimise callback breaking by not repeating volume lookup	David Howells	3	-20/+107
	At the moment, afs_break_callbacks calls afs_break_one_callback() for each separate FID it was given, and the latter looks up the volume individually for each one. However, this is inefficient if two or more FIDs have the same vid as we could reuse the volume. This is complicated by cell aliasing whereby we may have multiple cells sharing a volume and can therefore have multiple callback interests for any particular volume ID. At the moment afs_break_one_callback() scans the entire list of volumes we're getting from a server and breaks the appropriate callback in every matching volume, regardless of cell. This scan is done for every FID. Optimise callback breaking by the following means: (1) Sort the FID list by vid so that all FIDs belonging to the same volume are clumped together. This is done through the use of an indirection table as we cannot do an insertion sort on the afs_callback_break array as we decode FIDs into it as we subsequently also have to decode callback info into it that corresponds by array index only. We also don't really want to bubblesort afterwards if we can avoid it. (2) Sort the server->cb_interests array by vid so that all the matching volumes are grouped together. This permits the scan to stop after finding a record that has a higher vid. (3) When breaking FIDs, we try to keep server->cb_break_lock as long as possible, caching the start point in the array for that volume group as long as possible. It might make sense to add another layer in that list and have a refcounted volume ID anchor that has the matching interests attached to it rather than being in the list. This would allow the lock to be dropped without losing the cursor. Signed-off-by: David Howells <dhowells@redhat.com>
2018-06-15	afs: Display manually added cells in dynamic root mount	David Howells	7	-27/+200
	Alter the dynroot mount so that cells created by manipulation of /proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root cell as a module parameter will cause directories for those cells to be created in the dynamic root superblock for the network namespace[]. To this end: (1) Only one dynamic root superblock is now created per network namespace and this is shared between all attempts to mount it. This makes it easier to find the superblock to modify. (2) When a dynamic root superblock is created, the list of cells is walked and directories created for each cell already defined. (3) When a new cell is added, if a dynamic root superblock exists, a directory is created for it. (4) When a cell is destroyed, the directory is removed. (5) These directories are created by calling lookup_one_len() on the root dir which automatically creates them if they don't exist. [] Inasmuch as network namespaces are currently supported here. Signed-off-by: David Howells <dhowells@redhat.com>
2018-06-15	afs: Enable IPv6 DNS lookups	David Howells	2	-2/+2
	Remove the restriction on DNS lookup upcalls that prevents ipv6 addresses from being looked up. Signed-off-by: David Howells <dhowells@redhat.com>
2018-06-15	cfg80211: fix rcu in cfg80211_unregister_wdev	Dedy Lansky	1	-0/+1
	Callers of cfg80211_unregister_wdev can free the wdev object immediately after this function returns. This may crash the kernel because this wdev object is still in use by other threads. Add synchronize_rcu() after list_del_rcu to make sure wdev object can be safely freed. Signed-off-by: Dedy Lansky <dlansky@codeaurora.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15	mac80211: Move up init of TXQs	Toke Høiland-Jørgensen	1	-6/+6
	On init, ieee80211_if_add() dumps the interface. Since that now includes a dump of the TXQ state, we need to initialise that before the dump happens. So move up the TXQ initialisation to to before the call to ieee80211_if_add(). Fixes: 52539ca89f36 ("cfg80211: Expose TXQ stats and parameters to userspace") Reported-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Tested-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15	mac80211_hwsim: fix module init error paths	Johannes Berg	1	-2/+9
	We didn't free the workqueue on any errors, nor did we correctly check for rhashtable allocation errors, nor did we free the hashtable on error. Reported-by: Colin King <colin.king@canonical.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15	cfg80211: initialize sinfo in cfg80211_get_station	Sven Eckelmann	1	-0/+2
	Most of the implementations behind cfg80211_get_station will not initialize sinfo to zero before manipulating it. For example, the member "filled", which indicates the filled in parts of this struct, is often only modified by enabling certain bits in the bitfield while keeping the remaining bits in their original state. A caller without a preinitialized sinfo.filled can then no longer decide which parts of sinfo were filled in by cfg80211_get_station (or actually the underlying implementations). cfg80211_get_station must therefore take care that sinfo is initialized to zero. Otherwise, the caller may tries to read information which was not filled in and which must therefore also be considered uninitialized. In batadv_v_elp_get_throughput's case, an invalid "random" expected throughput may be stored for this neighbor and thus the B.A.T.M.A.N V algorithm may switch to non-optimal neighbors for certain destinations. Fixes: 7406353d43c8 ("cfg80211: implement cfg80211_get_station cfg80211 API") Reported-by: Thomas Lauer <holminateur@gmail.com> Reported-by: Marcel Schmidt <ff.z-casparistrasse@mailbox.org> Cc: b.a.t.m.a.n@lists.open-mesh.org Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15	nl80211: fix some kernel doc tag mistakes	Luca Coelho	1	-14/+14
	There is a bunch of tags marking constants with &, which means struct or enum name. Replace them with %, which is the correct tag for constants. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15	afs: Show all of a server's addresses in /proc/fs/afs/servers	David Howells	1	-2/+8
	Show all of a server's addresses in /proc/fs/afs/servers, placing the second plus addresses on padded lines of their own. The current address is marked with a star. Signed-off-by: David Howells <dhowells@redhat.com>
2018-06-15	afs: Handle CONFIG_PROC_FS=n	David Howells	2	-2/+10
	The AFS filesystem depends at the moment on /proc for configuration and also presents information that way - however, this causes a compilation failure if procfs is disabled. Fix it so that the procfs bits aren't compiled in if procfs is disabled. This means that you can't configure the AFS filesystem directly, but it is still usable provided that an up-to-date keyutils is installed to look up cells by SRV or AFSDB DNS records. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com>