linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-04-05	net: fool proof dev_valid_name()	Eric Dumazet	1	-1/+1
	We want to use dev_valid_name() to validate tunnel names, so better use strnlen(name, IFNAMSIZ) than strlen(name) to make sure to not upset KASAN. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	netdevsim: remove incorrect __net_initdata annotations	Arnd Bergmann	2	-2/+2
	The __net_initdata section cannot currently be used for structures that get cleaned up in an exitcall using unregister_pernet_operations: WARNING: vmlinux.o(.text+0x868c34): Section mismatch in reference from the function nsim_devlink_exit() to the (unknown reference) .init.data:(unknown) The function nsim_devlink_exit() references the (unknown reference) __initdata (unknown). This is often because nsim_devlink_exit lacks a __initdata annotation or the annotation of (unknown) is wrong. WARNING: vmlinux.o(.text+0x868c64): Section mismatch in reference from the function nsim_devlink_init() to the (unknown reference) .init.data:(unknown) WARNING: vmlinux.o(.text+0x8692bc): Section mismatch in reference from the function nsim_fib_exit() to the (unknown reference) .init.data:(unknown) WARNING: vmlinux.o(.text+0x869300): Section mismatch in reference from the function nsim_fib_init() to the (unknown reference) .init.data:(unknown) As that warning tells us, discarding the structure after a module is loaded would lead to a undefined behavior when that module is removed. It might be possible to change that annotation so it has no effect for loadable modules, but I have not figured out exactly how to do that, and we want this to be fixed in -rc1. This just removes the annotations, just like we do for all other such modules. Fixes: 37923ed6b8ce ("netdevsim: Add simple FIB resource controller via devlink") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	sfc: remove ctpio_dmabuf_start from stats	Bert Kenward	2	-3/+0
	The ctpio_dmabuf_start entry is not actually a stat and shouldn't be exposed to ethtool. Fixes: 2c0b6ee837db ("sfc: expose CTPIO stats on NICs that support them") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	inet: frags: fix ip6frag_low_thresh boundary	Eric Dumazet	4	-9/+2
	Giving an integer to proc_doulongvec_minmax() is dangerous on 64bit arches, since linker might place next to it a non zero value preventing a change to ip6frag_low_thresh. ip6frag_low_thresh is not used anymore in the kernel, but we do not want to prematuraly break user scripts wanting to change it. Since specifying a minimal value of 0 for proc_doulongvec_minmax() is moot, let's remove these zero values in all defrag units. Fixes: 6e00f7dd5e4e ("ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	tipc: Fix namespace violation in tipc_sk_fill_sock_diag	GhantaKrishnamurthy MohanKrishna	1	-1/+2
	To fetch UID info for socket diagnostics, we determine the namespace of user context using tipc socket instance. This may cause namespace violation, as the kernel will remap based on UID. We fix this by fetching namespace info using the calling userspace netlink socket. Fixes: c30b70deb5f4 (tipc: implement socket diagnostics for AF_TIPC) Reported-by: syzbot+326e587eff1074657718@syzkaller.appspotmail.com Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	net: avoid unneeded atomic operation in ip*_append_data()	Paolo Abeni	2	-2/+4
	After commit 694aba690de0 ("ipv4: factorize sk_wmem_alloc updates done by __ip_append_data()") and commit 1f4c6eb24029 ("ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data()"), when transmitting sub MTU datagram, an addtional, unneeded atomic operation is performed in ip*_append_data() to update wmem_alloc: in the above condition the delta is 0. The above cause small but measurable performance regression in UDP xmit tput test with packet size below MTU. This change avoids such overhead updating wmem_alloc only if wmem_alloc_delta is non zero. The error path is left intentionally unmodified: it's a slow path and simplicity is preferred to performances. Fixes: 694aba690de0 ("ipv4: factorize sk_wmem_alloc updates done by __ip_append_data()") Fixes: 1f4c6eb24029 ("ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data()") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	nvmem: disallow modular CONFIG_NVMEM	Arnd Bergmann	1	-1/+1
	The new of_get_nvmem_mac_address() helper function causes a link error with CONFIG_NVMEM=m: drivers/of/of_net.o: In function `of_get_nvmem_mac_address': of_net.c:(.text+0x168): undefined reference to `of_nvmem_cell_get' of_net.c:(.text+0x19c): undefined reference to `nvmem_cell_read' of_net.c:(.text+0x1a8): undefined reference to `nvmem_cell_put' I could not come up with a good solution for this, as the code is always built-in. Using an #if IS_REACHABLE() check around it would solve the link time issue but then stop it from working in that configuration. Making of_nvmem_cell_get() an inline function could also solve that, but seems a bit ugly since it's somewhat larger than most inline functions, and it would just bring that problem into the callers. Splitting the function into a separate file might be an alternative. This uses the big hammer by making CONFIG_NVMEM itself a 'bool' symbol, which avoids the problem entirely but makes the vmlinux larger for anyone that might use NVMEM support but doesn't need it built-in otherwise. Fixes: 9217e566bdee ("of_net: Implement of_get_nvmem_mac_address helper") Cc: Mike Looijmans <mike.looijmans@topic.nl> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mike Looijmans Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	net: hns3: fix length overflow when CONFIG_ARM64_64K_PAGES	Tan Xiaojun	1	-1/+1
	When enable the config item "CONFIG_ARM64_64K_PAGES", the size of PAGE_SIZE is 65536(64K). But the type of length is u16, it will overflow. So change it to u32. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	nfp: use full 40 bits of the NSP buffer address	Dirk van der Merwe	1	-4/+5
	The NSP default buffer is a piece of NFP memory where additional command data can be placed. Its format has been copied from host buffer, but the PCIe selection bits do not make sense in this case. If those get masked out from a NFP address - writes to random place in the chip memory may be issued and crash the device. Even in the general NSP buffer case, it doesn't make sense to have the PCIe selection bits there anymore. These are unused at the moment, and when it becomes necessary, the PCIe selection bits should rather be moved to another register to utilise more bits for the buffer address. This has never been an issue because the buffer used to be allocated in memory with less-than-38-bit-long address but that is about to change. Fixes: 1a64821c6af7 ("nfp: add support for service processor access") Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	lan78xx: Connect phy early	Alexander Graf	1	-16/+18
	When using wicked with a lan78xx device attached to the system, we end up with ethtool commands issued on the device before an ifup got issued. That lead to the following crash: Unable to handle kernel NULL pointer dereference at virtual address 0000039c pgd = ffff800035b30000 [0000039c] *pgd=0000000000000000 Internal error: Oops: 96000004 [#1] SMP Modules linked in: [...] Supported: Yes CPU: 3 PID: 638 Comm: wickedd Tainted: G E 4.12.14-0-default #1 Hardware name: raspberrypi rpi/rpi, BIOS 2018.03-rc2 02/21/2018 task: ffff800035e74180 task.stack: ffff800036718000 PC is at phy_ethtool_ksettings_get+0x20/0x98 LR is at lan78xx_get_link_ksettings+0x44/0x60 [lan78xx] pc : [<ffff0000086f7f30>] lr : [<ffff000000dcca84>] pstate: 20000005 sp : ffff80003671bb20 x29: ffff80003671bb20 x28: ffff800035e74180 x27: ffff000008912000 x26: 000000000000001d x25: 0000000000000124 x24: ffff000008f74d00 x23: 0000004000114809 x22: 0000000000000000 x21: ffff80003671bbd0 x20: 0000000000000000 x19: ffff80003671bbd0 x18: 000000000000040d x17: 0000000000000001 x16: 0000000000000000 x15: 0000000000000000 x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000020 x11: 0101010101010101 x10: fefefefefefefeff x9 : 7f7f7f7f7f7f7f7f x8 : fefefeff31677364 x7 : 0000000080808080 x6 : ffff80003671bc9c x5 : ffff80003671b9f8 x4 : ffff80002c296190 x3 : 0000000000000000 x2 : 0000000000000000 x1 : ffff80003671bbd0 x0 : ffff80003671bc00 Process wickedd (pid: 638, stack limit = 0xffff800036718000) Call trace: Exception stack(0xffff80003671b9e0 to 0xffff80003671bb20) b9e0: ffff80003671bc00 ffff80003671bbd0 0000000000000000 0000000000000000 ba00: ffff80002c296190 ffff80003671b9f8 ffff80003671bc9c 0000000080808080 ba20: fefefeff31677364 7f7f7f7f7f7f7f7f fefefefefefefeff 0101010101010101 ba40: 0000000000000020 0000000000000000 ffffffffffffffff 0000000000000000 ba60: 0000000000000000 0000000000000001 000000000000040d ffff80003671bbd0 ba80: 0000000000000000 ffff80003671bbd0 0000000000000000 0000004000114809 baa0: ffff000008f74d00 0000000000000124 000000000000001d ffff000008912000 bac0: ffff800035e74180 ffff80003671bb20 ffff000000dcca84 ffff80003671bb20 bae0: ffff0000086f7f30 0000000020000005 ffff80002c296000 ffff800035223900 bb00: 0000ffffffffffff 0000000000000000 ffff80003671bb20 ffff0000086f7f30 [<ffff0000086f7f30>] phy_ethtool_ksettings_get+0x20/0x98 [<ffff000000dcca84>] lan78xx_get_link_ksettings+0x44/0x60 [lan78xx] [<ffff0000087cbc40>] ethtool_get_settings+0x68/0x210 [<ffff0000087cc0d4>] dev_ethtool+0x214/0x2180 [<ffff0000087e5008>] dev_ioctl+0x400/0x630 [<ffff00000879dd00>] sock_do_ioctl+0x70/0x88 [<ffff00000879f5f8>] sock_ioctl+0x208/0x368 [<ffff0000082cde10>] do_vfs_ioctl+0xb0/0x848 [<ffff0000082ce634>] SyS_ioctl+0x8c/0xa8 Exception stack(0xffff80003671bec0 to 0xffff80003671c000) bec0: 0000000000000009 0000000000008946 0000fffff4e841d0 0000aa0032687465 bee0: 0000aaaafa2319d4 0000fffff4e841d4 0000000032687465 0000000032687465 bf00: 000000000000001d 7f7fff7f7f7f7f7f 72606b622e71ff4c 7f7f7f7f7f7f7f7f bf20: 0101010101010101 0000000000000020 ffffffffffffffff 0000ffff7f510c68 bf40: 0000ffff7f6a9d18 0000ffff7f44ce30 000000000000040d 0000ffff7f6f98f0 bf60: 0000fffff4e842c0 0000000000000001 0000aaaafa2c2e00 0000ffff7f6ab000 bf80: 0000fffff4e842c0 0000ffff7f62a000 0000aaaafa2b9f20 0000aaaafa2c2e00 bfa0: 0000fffff4e84818 0000fffff4e841a0 0000ffff7f5ad0cc 0000fffff4e841a0 bfc0: 0000ffff7f44ce3c 0000000080000000 0000000000000009 000000000000001d bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 The culprit is quite simple: The driver tries to access the phy left and right, but only actually has a working reference to it when the device is up. The fix thus is quite simple too: Get a reference to the phy on probe already and keep it even when the device is going down. With this patch applied, I can successfully run wicked on my system and bring the interface up and down as many times as I want, without getting NULL pointer dereferences in between. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	nfp: add a separate counter for packets with CHECKSUM_COMPLETE	Jakub Kicinski	3	-9/+13
	We are currently counting packets with CHECKSUM_COMPLETE as "hw_rx_csum_ok". This is confusing. Add a new counter. To make sure it fits in the same cacheline move the less used error counter to a different location. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	tipc: Fix missing list initializations in struct tipc_subscription	Jon Maloy	1	-0/+2
	When an item of struct tipc_subscription is created, we fail to initialize the two lists aggregated into the struct. This has so far never been a problem, since the items are just added to a root object by list_add(), which does not require the addee list to be pre-initialized. However, syzbot is provoking situations where this addition fails, whereupon the attempted removal if the item from the list causes a crash. This problem seems to always have been around, despite that the code for creating this object was rewritten in commit 242e82cc95f6 ("tipc: collapse subscription creation functions"), which is still in net-next. We fix this for that commit by initializing the two lists properly. Fixes: 242e82cc95f6 ("tipc: collapse subscription creation functions") Reported-by: syzbot+0bb443b74ce09197e970@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	ipv6: udp: set dst cache for a connected sk if current not valid	Alexey Kodanev	1	-19/+2
	A new RTF_CACHE route can be created between ip6_sk_dst_lookup_flow() and ip6_dst_store() calls in udpv6_sendmsg(), when datagram sending results to ICMPV6_PKT_TOOBIG error: udp_v6_send_skb(), for example with vti6 tunnel: vti6_xmit(), get ICMPV6_PKT_TOOBIG error skb_dst_update_pmtu(), can create a RTF_CACHE clone icmpv6_send() ... udpv6_err() ip6_sk_update_pmtu() ip6_update_pmtu(), can create a RTF_CACHE clone ... ip6_datagram_dst_update() ip6_dst_store() And after commit 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update"), the UDPv6 error handler can update socket's dst cache, but it can happen before the update in the end of udpv6_sendmsg(), preventing getting the new dst cache on the next udpv6_sendmsg() calls. In order to fix it, save dst in a connected socket only if the current socket's dst cache is invalid. The previous patch prepared ip6_sk_dst_lookup_flow() to do that with the new argument, and this patch enables it in udpv6_sendmsg(). Fixes: 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update") Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	ipv6: udp: convert 'connected' to bool type in udpv6_sendmsg()	Alexey Kodanev	1	-5/+5
	This should make it consistent with ip6_sk_dst_lookup_flow() that is accepting the new 'connected' parameter of type bool. Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	ipv6: allow to cache dst for a connected sk in ip6_sk_dst_lookup_flow()	Alexey Kodanev	4	-6/+16
	Add 'connected' parameter to ip6_sk_dst_lookup_flow() and update the cache only if ip6_sk_dst_check() returns NULL and a socket is connected. The function is used as before, the new behavior for UDP sockets in udpv6_sendmsg() will be enabled in the next patch. Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	ipv6: add a wrapper for ip6_dst_store() with flowi6 checks	Alexey Kodanev	3	-8/+21
	Move commonly used pattern of ip6_dst_store() usage to a separate function - ip6_sk_dst_store_flow(), which will check the addresses for equality using the flow information, before saving them. There is no functional changes in this patch. In addition, it will be used in the next patch, in ip6_sk_dst_lookup_flow(). Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	net: phy: marvell10g: add thermal hwmon device	Russell King	1	-2/+182
	Add a thermal monitoring device for the Marvell 88x3310, which updates once a second. We also need to hook into the suspend/resume mechanism to ensure that the thermal monitoring is reconfigured when we resume. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	pptp: remove a buggy dst release in pptp_connect()	Eric Dumazet	1	-1/+0
	Once dst has been cached in socket via sk_setup_caps(), it is illegal to call ip_rt_put() (or dst_release()), since sk_setup_caps() did not change dst refcount. We can still dereference it since we hold socket lock. Caugth by syzbot : BUG: KASAN: use-after-free in atomic_dec_return include/asm-generic/atomic-instrumented.h:198 [inline] BUG: KASAN: use-after-free in dst_release+0x27/0xa0 net/core/dst.c:185 Write of size 4 at addr ffff8801c54dc040 by task syz-executor4/20088 CPU: 1 PID: 20088 Comm: syz-executor4 Not tainted 4.16.0+ #376 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x1a7/0x27d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23c/0x360 mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x137/0x190 mm/kasan/kasan.c:267 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278 atomic_dec_return include/asm-generic/atomic-instrumented.h:198 [inline] dst_release+0x27/0xa0 net/core/dst.c:185 sk_dst_set include/net/sock.h:1812 [inline] sk_dst_reset include/net/sock.h:1824 [inline] sock_setbindtodevice net/core/sock.c:610 [inline] sock_setsockopt+0x431/0x1b20 net/core/sock.c:707 SYSC_setsockopt net/socket.c:1845 [inline] SyS_setsockopt+0x2ff/0x360 net/socket.c:1828 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x4552d9 RSP: 002b:00007f4878126c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00007f48781276d4 RCX: 00000000004552d9 RDX: 0000000000000019 RSI: 0000000000000001 RDI: 0000000000000013 RBP: 000000000072bea0 R08: 0000000000000010 R09: 0000000000000000 R10: 00000000200010c0 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000526 R14: 00000000006fac30 R15: 0000000000000000 Allocated by task 20088: save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:552 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489 kmem_cache_alloc+0x12e/0x760 mm/slab.c:3542 dst_alloc+0x11f/0x1a0 net/core/dst.c:104 rt_dst_alloc+0xe9/0x540 net/ipv4/route.c:1520 __mkroute_output net/ipv4/route.c:2265 [inline] ip_route_output_key_hash_rcu+0xa49/0x2c60 net/ipv4/route.c:2493 ip_route_output_key_hash+0x20b/0x370 net/ipv4/route.c:2322 __ip_route_output_key include/net/route.h:126 [inline] ip_route_output_flow+0x26/0xa0 net/ipv4/route.c:2577 ip_route_output_ports include/net/route.h:163 [inline] pptp_connect+0xa84/0x1170 drivers/net/ppp/pptp.c:453 SYSC_connect+0x213/0x4a0 net/socket.c:1639 SyS_connect+0x24/0x30 net/socket.c:1620 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Freed by task 20082: save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:520 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:527 __cache_free mm/slab.c:3486 [inline] kmem_cache_free+0x83/0x2a0 mm/slab.c:3744 dst_destroy+0x266/0x380 net/core/dst.c:140 dst_destroy_rcu+0x16/0x20 net/core/dst.c:153 __rcu_reclaim kernel/rcu/rcu.h:178 [inline] rcu_do_batch kernel/rcu/tree.c:2675 [inline] invoke_rcu_callbacks kernel/rcu/tree.c:2930 [inline] __rcu_process_callbacks kernel/rcu/tree.c:2897 [inline] rcu_process_callbacks+0xd6c/0x17b0 kernel/rcu/tree.c:2914 __do_softirq+0x2d7/0xb85 kernel/softirq.c:285 The buggy address belongs to the object at ffff8801c54dc000 which belongs to the cache ip_dst_cache of size 168 The buggy address is located 64 bytes inside of 168-byte region [ffff8801c54dc000, ffff8801c54dc0a8) The buggy address belongs to the page: page:ffffea0007153700 count:1 mapcount:0 mapping:ffff8801c54dc000 index:0x0 flags: 0x2fffc0000000100(slab) raw: 02fffc0000000100 ffff8801c54dc000 0000000000000000 0000000100000010 raw: ffffea0006b34b20 ffffea0006b6c1e0 ffff8801d674a1c0 0000000000000000 page dumped because: kasan: bad access detected Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	net: dsa: mt7530: Use NULL instead of plain integer	Florian Fainelli	1	-3/+3
	We would be passing 0 instead of NULL as the rsp argument to mt7530_fdb_cmd(), fix that. Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	net: dsa: b53: Fix sparse warnings in b53_mmap.c	Florian Fainelli	1	-9/+24
	sparse complains about the following warnings: drivers/net/dsa/b53/b53_mmap.c:33:31: warning: incorrect type in initializer (different address spaces) drivers/net/dsa/b53/b53_mmap.c:33:31: expected unsigned char [noderef] [usertype] <asn:2>regs drivers/net/dsa/b53/b53_mmap.c:33:31: got void priv and indeed, while what we are doing is functional, we are dereferencing a void * pointer into a void __iomem * which is not great. Just use the defined b53_mmap_priv structure which holds our register base and use that. Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	af_unix: remove redundant lockdep class	Cong Wang	1	-10/+0
	After commit 581319c58600 ("net/socket: use per af lockdep classes for sk queues") sock queue locks now have per-af lockdep classes, including unix socket. It is no longer necessary to workaround it. I noticed this while looking at a syzbot deadlock report, this patch itself doesn't fix it (this is why I don't add Reported-by). Fixes: 581319c58600 ("net/socket: use per af lockdep classes for sk queues") Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	net: systemport: Fix sparse warnings in bcm_sysport_insert_tsb()	Florian Fainelli	1	-5/+6
	skb->protocol is a __be16 which we would be calling htons() against, while this is not wrong per-se as it correctly results in swapping the value on LE hosts, this still upsets sparse. Adopt a similar pattern to what other drivers do and just assign ip_ver to skb->protocol, and then use htons() against the different constants such that the compiler can resolve the values at build time. Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	net: bcmgenet: Fix sparse warnings in bcmgenet_put_tx_csum()	Florian Fainelli	1	-5/+6
	skb->protocol is a __be16 which we would be calling htons() against, while this is not wrong per-se as it correctly results in swapping the value on LE hosts, this still upsets sparse. Adopt a similar pattern to what other drivers do and just assign ip_ver to skb->protocol, and then use htons() against the different constants such that the compiler can resolve the values at build time. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	rxrpc: Fix undefined packet handling	David Howells	2	-0/+12
	By analogy with other Rx implementations, RxRPC packet types 9, 10 and 11 should just be discarded rather than being aborted like other undefined packet types. Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	openrisc: Set CONFIG_MULTI_IRQ_HANDLER	Palmer Dabbelt	1	-0/+4
	arm has an optional MULTI_IRQ_HANDLER, which openrisc copied but didn't make optional. The multi irq handler infrastructure has been copied to generic code selectable with a new config symbol. That symbol can be selected by randconfig builds and can cause build breakage. Introduce CONFIG_MULTI_IRQ_HANDLER as an intermediate step which prevents the core config symbol from being selected. The openrisc local config symbol will be removed once openrisc gets converted to the generic code. Signed-off-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/20180404043130.31277-3-palmer@sifive.com
2018-04-04	arm64: Set CONFIG_MULTI_IRQ_HANDLER	Palmer Dabbelt	1	-0/+4
	arm has an optional MULTI_IRQ_HANDLER, which arm64 copied but didn't make optional. The multi irq handler infrastructure has been copied to generic code selectable with a new config symbol. That symbol can be selected by randconfig builds and can cause build breakage. Introduce CONFIG_MULTI_IRQ_HANDLER as an intermediate step which prevents the core config symbol from being selected. The arm64 local config symbol will be removed once arm64 gets converted to the generic code. Signed-off-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/20180404043130.31277-2-palmer@sifive.com
2018-04-04	genirq: Make GENERIC_IRQ_MULTI_HANDLER depend on !MULTI_IRQ_HANDLER	Palmer Dabbelt	1	-0/+1
	These config switches enable the same code in the core and the not yet converted architecture code. They can be selected both by randconfig builds and cause linker error because the same symbols are defined twice. Make the new GENERIC_IRQ_MULTI_HANDLER depend on !MULTI_IRQ_HANDLER to prevent that. The dependency will be removed once all architectures are converted over. Signed-off-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/20180404043130.31277-4-palmer@sifive.com
2018-04-03	RISC-V: Rename CONFIG_CMDLINE_OVERRIDE to CONFIG_CMDLINE_FORCE	Palmer Dabbelt	1	-2/+2
	The device tree code looks for CONFIG_CMDLINE_FORCE, but we were using CONFIG_CMDLINE_OVERRIDE. It looks like this was just a hold over from before our device tree conversion -- in fact, we'd already removed the support for CONFIG_CMDLINE_OVERRIDE from our arch-specific code so it didn't even work any more. Thanks to Mortiz and Trung for finding the original bug, and for Michael for suggeting a better fix. CC: Trung Tran <trung.tran@ettus.com> CC: Michael J Clark <mjc@sifive.com> Reviewed-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-03	sparc64: Make atomic_xchg() an inline function rather than a macro.	David S. Miller	1	-1/+5
	This avoids a lot of -Wunused warnings such as: ==================== kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’: ./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value] #define xchg(ptr,x) ((__typeof__((ptr)))__xchg((unsigned long)(x),(ptr),sizeof((ptr)))) ./arch/sparc/include/asm/atomic_64.h:86:30: note: in expansion of macro ‘xchg’ #define atomic_xchg(v, new) (xchg(&((v)->counter), new)) ^~~~ kernel/debug/debug_core.c:508:4: note: in expansion of macro ‘atomic_xchg’ atomic_xchg(&kgdb_active, cpu); ^~~~~~~~~~~ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02	bitmap: fix memset optimization on big-endian systems	Omar Sandoval	1	-5/+17
	Commit 2a98dc028f91 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible") introduced an optimization to bitmap_{set,clear}() which uses memset() when the start and length are constants aligned to a byte. This is wrong on big-endian systems; our bitmaps are arrays of unsigned long, so bit n is not at byte n / 8 in memory. This was caught by the Btrfs selftests, but the bitmap selftests also fail when run on a big-endian machine. We can still use memset if the start and length are aligned to an unsigned long, so do that on big-endian. The same problem applies to the memcmp in bitmap_equal(), so fix it there, too. Fixes: 2a98dc028f91 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible") Fixes: 2c6deb01525a ("bitmap: use memcmp optimisation in more situations") Cc: stable@kernel.org Reported-by: "Erhard F." <erhard_f@mailbox.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-02	RISC-V: Add definition of relocation types	Zong Li	1	-0/+7
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Enable module support in defconfig	Zong Li	1	-0/+2
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support SUB32 relocation type in kernel module	Zong Li	1	-0/+8
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support ADD32 relocation type in kernel module	Zong Li	1	-0/+8
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support ALIGN relocation type in kernel module	Zong Li	1	-0/+10
	Just fail on align type. Kernel modules loader didn't do relax like linker, it is difficult to remove or migrate the code, but the remnant nop instructions harm the performaace of module. We expect the building module with the no-relax option. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support RVC_BRANCH/JUMP relocation type in kernel modulewq	Zong Li	1	-0/+35
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support HI20/LO12_I/LO12_S relocation type in kernel module	Zong Li	1	-0/+42
	HI20 and LO12_I/LO12_S relocate the absolute address, the range of offset must in 32-bit. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support CALL relocation type in kernel module	Zong Li	1	-0/+22
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support GOT_HI20/CALL_PLT relocation type in kernel module	Zong Li	1	-10/+52
	For CALL_PLT, emit the plt entry only when offset is more than 32-bit. For PCREL_LO12, it uses the location of corresponding HI20 to get the address of external symbol. It should check the HI20 type is the PCREL_HI20 or GOT_HI20, because sometime the location will have two or more relocation types. For example: 0: 00000797 auipc a5,0x0 0: R_RISCV_ALIGN ABS 0: R_RISCV_GOT_HI20 SYMBOL 4: 0007b783 ld a5,0(a5) # 0 <SYMBOL> 4: R_RISCV_PCREL_LO12_I .L0 4: R_RISCV_RELAX ABS Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Add section of GOT.PLT for kernel module	Zong Li	3	-17/+45
	Separate the function symbol address from .plt to .got.plt section. The original plt entry has trampoline code with symbol address, there is a 32-bit padding bwtween jar instruction and symbol address. Extract the symbol address to .got.plt to reduce the module size. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Add sections of PLT and GOT for kernel module	Zong Li	6	-0/+260
	The address of external symbols will locate more than 32-bit offset in 64-bit kernel with sv39 or sv48 virtual addressing. Module loader emits the GOT and PLT entries for data symbols and function symbols respectively. The PLT entry is a trampoline code for jumping to the 64-bit real address. The GOT entry is just the data symbol address. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/atomic: Strengthen implementations with fences	Andrea Parri	2	-220/+588
	Atomics present the same issue with locking: release and acquire variants need to be strengthened to meet the constraints defined by the Linux-kernel memory consistency model [1]. Atomics present a further issue: implementations of atomics such as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs, which do not give full-ordering with .aqrl; for example, current implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test below to end up with the state indicated in the "exists" clause. In order to "synchronize" LKMM and RISC-V's implementation, this commit strengthens the implementations of the atomics operations by replacing .rl and .aq with the use of ("lightweigth") fences, and by replacing .aqrl LR/SC pairs in sequences such as: 0: lr.w.aqrl %0, %addr bne %0, %old, 1f ... sc.w.aqrl %1, %new, %addr bnez %1, 0b 1: with sequences of the form: 0: lr.w %0, %addr bne %0, %old, 1f ... sc.w.rl %1, %new, %addr /* SC-release / bnez %1, 0b fence rw, rw / "full" fence / 1: following Daniel's suggestion. These modifications were validated with simulation of the RISC-V memory consistency model. C lr-sc-aqrl-pair-vs-full-barrier {} P0(int x, int y, atomic_t u) { int r0; int r1; WRITE_ONCE(x, 1); r0 = atomic_cmpxchg(u, 0, 1); r1 = READ_ONCE(y); } P1(int x, int y, atomic_t v) { int r0; int r1; WRITE_ONCE(y, 1); r0 = atomic_cmpxchg(v, 0, 1); r1 = READ_ONCE(*x); } exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0) [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2 https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM https://marc.info/?l=linux-kernel&m=151633436614259&w=2 Suggested-by: Daniel Lustig <dlustig@nvidia.com> Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <albert@sifive.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-riscv@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/spinlock: Strengthen implementations with fences	Andrea Parri	2	-14/+27
	Current implementations map locking operations using .rl and .aq annotations. However, this mapping is unsound w.r.t. the kernel memory consistency model (LKMM) [1]: Referring to the "unlock-lock-read-ordering" test reported below, Daniel wrote: "I think an RCpc interpretation of .aq and .rl would in fact allow the two normal loads in P1 to be reordered [...] The intuition would be that the amoswap.w.aq can forward from the amoswap.w.rl while that's still in the store buffer, and then the lw x3,0(x4) can also perform while the amoswap.w.rl is still in the store buffer, all before the l1 x1,0(x2) executes. That's not forbidden unless the amoswaps are RCsc, unless I'm missing something. Likewise even if the unlock()/lock() is between two stores. A control dependency might originate from the load part of the amoswap.w.aq, but there still would have to be something to ensure that this load part in fact performs after the store part of the amoswap.w.rl performs globally, and that's not automatic under RCpc." Simulation of the RISC-V memory consistency model confirmed this expectation. In order to "synchronize" LKMM and RISC-V's implementation, this commit strengthens the implementations of the locking operations by replacing .rl and .aq with the use of ("lightweigth") fences, resp., "fence rw, w" and "fence r , rw". C unlock-lock-read-ordering {} /* s initially owned by P1 / P0(int x, int y) { WRITE_ONCE(x, 1); smp_wmb(); WRITE_ONCE(y, 1); } P1(int x, int y, spinlock_t s) { int r0; int r1; r0 = READ_ONCE(y); spin_unlock(s); spin_lock(s); r1 = READ_ONCE(x); } exists (1:r0=1 /\ 1:r1=0) [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2 https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM https://marc.info/?l=linux-kernel&m=151633436614259&w=2 Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <albert@sifive.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-riscv@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/barrier: Define __smp_{store_release,load_acquire}	Andrea Parri	1	-0/+15
	Introduce __smp_{store_release,load_acquire}, and rely on the generic definitions for smp_{store_release,load_acquire}. This avoids the use of full ("rw,rw") fences on SMP. Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support	Alan Kao	2	-1/+7
	In walk_stackframe, the pc now receives the address from calling ftrace_graph_ret_addr instead of manual calculation. Note that the original calculation, pc = frame->ra - 4 is buggy when the instruction at the return address happened to be a compressed inst. But since it is not a critical part of ftrace, it is ignored for now to ease the review process. Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support	Alan Kao	4	-0/+141
	Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS support	Alan Kao	2	-0/+4
	Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add dynamic function graph tracer support	Alan Kao	2	-1/+118
	Once the function_graph tracer is enabled, a filtered function has the following call sequence: * ftracer_caller ==> on/off by ftrace_make_call/ftrace_make_nop * ftrace_graph_caller * ftrace_graph_call ==> on/off by ftrace_en/disable_ftrace_graph_caller * prepare_ftrace_return Considering the following DYNAMIC_FTRACE_WITH_REGS feature, it would be more extendable to have a ftrace_graph_caller function, instead of calling prepare_ftrace_return directly in ftrace_caller. Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add dynamic function tracer support	Alan Kao	6	-12/+223
	We now have dynamic ftrace with the following added items: * ftrace_make_call, ftrace_make_nop (in kernel/ftrace.c) The two functions turn each recorded call site of filtered functions into a call to ftrace_caller or nops * ftracce_update_ftrace_func (in kernel/ftrace.c) turns the nops at ftrace_call into a call to a generic entry for function tracers. * ftrace_caller (in kernel/mcount-dyn.S) The entry where each _mcount call sites calls to once they are filtered to be traced. Also, this patch fixes the semantic problems in mcount.S, which will be treated as only a reference implementation once we have the dynamic ftrace. Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add RECORD_MCOUNT support	Alan Kao	3	-0/+9
	Now recordmcount.pl recognizes RISC-V object files. For the mechanism to work, we have to disable the linker relaxation. Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>