aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/exported-sql-viewer.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2019-01-24net: dev_is_mac_header_xmit() true for ARPHRD_RAWIPMaciej Żenczykowski1-0/+1
__bpf_redirect() and act_mirred checks this boolean to determine whether to prefix an ethernet header. Signed-off-by: Maciej Żenczykowski <maze@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24net: usb: asix: ax88772_bind return error when hw_reset failZhang Run1-2/+7
The ax88772_bind() should return error code immediately when the PHY was not reset properly through ax88772a_hw_reset(). Otherwise, The asix_get_phyid() will block when get the PHY Identifier from the PHYSID1 MII registers through asix_mdio_read() due to the PHY isn't ready. Furthermore, it will produce a lot of error message cause system crash.As follows: asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write reg index 0x0000: -71 asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to send software reset: ffffffb9 asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write reg index 0x0000: -71 asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable software MII access asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read reg index 0x0000: -71 asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write reg index 0x0000: -71 asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable software MII access asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read reg index 0x0000: -71 ... Signed-off-by: Zhang Run <zhang.run@zte.com.cn> Reviewed-by: Yang Wei <yang.wei9@zte.com.cn> Tested-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24MAINTAINERS: Update cavium networking driversSudarsana Reddy Kalluru1-21/+21
Following Marvell's acquisition of Cavium, we need to update all the Cavium drivers maintainer's entries to point to our new e-mail addresses. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ameen Rahman <Ameen.Rahman@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24net/mlx4_core: Fix error handling when initializing CQ bufs in the driverJack Morgenstein1-2/+4
Procedure mlx4_init_user_cqes() handles returns by copy_to_user incorrectly. copy_to_user() returns the number of bytes not copied. Thus, a non-zero return should be treated as a -EFAULT error (as is done elsewhere in the kernel). However, mlx4_init_user_cqes() error handling simply returns the number of bytes not copied (instead of -EFAULT). Note, though, that this is a harmless bug: procedure mlx4_alloc_cq() (which is the only caller of mlx4_init_user_cqes()) treats any non-zero return as an error, but that returned error value is processed internally, and not passed further up the call stack. In addition, fixes the following sparse warning: warning: incorrect type in argument 1 (different address spaces) expected void [noderef] <asn:1>*to got void *buf Fixes: e45678973dcb ("{net, IB}/mlx4: Initialize CQ buffers in the driver when possible") Reported by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24net/mlx4_core: Add masking for a few queries on HCA capsAya Levin1-29/+46
Driver reads the query HCA capabilities without the corresponding masks. Without the correct masks, the base addresses of the queues are unaligned. In addition some reserved bits were wrongly read. Using the correct masks, ensures alignment of the base addresses and allows future firmware versions safe use of the reserved bits. Fixes: ab9c17a009ee ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet") Fixes: 0ff1fb654bec ("{NET, IB}/mlx4: Add device managed flow steering firmware API") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24sctp: set flow sport from saddr only when it's 0Xin Long2-2/+4
Now sctp_transport_pmtu() passes transport->saddr into .get_dst() to set flow sport from 'saddr'. However, transport->saddr is set only when transport->dst exists in sctp_transport_route(). If sctp_transport_pmtu() is called without transport->saddr set, like when transport->dst doesn't exists, the flow sport will be set to 0 from transport->saddr, which will cause a wrong route to be got. Commit 6e91b578bf3f ("sctp: re-use sctp_transport_pmtu in sctp_transport_route") made the issue be triggered more easily since sctp_transport_pmtu() would be called in sctp_transport_route() after that. In gerneral, fl4->fl4_sport should always be set to htons(asoc->base.bind_addr.port), unless transport->asoc doesn't exist in sctp_v4/6_get_dst(), which is the case: sctp_ootb_pkt_new() -> sctp_transport_route() For that, we can simply handle it by setting flow sport from saddr only when it's 0 in sctp_v4/6_get_dst(). Fixes: 6e91b578bf3f ("sctp: re-use sctp_transport_pmtu in sctp_transport_route") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24sctp: set chunk transport correctly when it's a new asocXin Long1-3/+8
In the paths: sctp_sf_do_unexpected_init() -> sctp_make_init_ack() sctp_sf_do_dupcook_a/b()() -> sctp_sf_do_5_1D_ce() The new chunk 'retval' transport is set from the incoming chunk 'chunk' transport. However, 'retval' transport belong to the new asoc, which is a different one from 'chunk' transport's asoc. It will cause that the 'retval' chunk gets set with a wrong transport. Later when sending it and because of Commit b9fd683982c9 ("sctp: add sctp_packet_singleton"), sctp_packet_singleton() will set some fields, like vtag to 'retval' chunk from that wrong transport's asoc. This patch is to fix it by setting 'retval' transport correctly which belongs to the right asoc in sctp_make_init_ack() and sctp_sf_do_5_1D_ce(). Fixes: b9fd683982c9 ("sctp: add sctp_packet_singleton") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24sctp: improve the events for sctp stream addingXin Long1-11/+8
This patch is to improve sctp stream adding events in 2 places: 1. In sctp_process_strreset_addstrm_out(), move up SCTP_MAX_STREAM and in stream allocation failure checks, as the adding has to succeed after reconf_timer stops for the in stream adding request retransmission. 3. In sctp_process_strreset_addstrm_in(), no event should be sent, as no in or out stream is added here. Fixes: 50a41591f110 ("sctp: implement receiver-side procedures for the Add Outgoing Streams Request Parameter") Fixes: c5c4ebb3ab87 ("sctp: implement receiver-side procedures for the Add Incoming Streams Request Parameter") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24sctp: improve the events for sctp stream resetXin Long1-22/+17
This patch is to improve sctp stream reset events in 4 places: 1. In sctp_process_strreset_outreq(), the flag should always be set with SCTP_STREAM_RESET_INCOMING_SSN instead of OUTGOING, as receiver's in stream is reset here. 2. In sctp_process_strreset_outreq(), move up SCTP_STRRESET_ERR_WRONG_SSN check, as the reset has to succeed after reconf_timer stops for the in stream reset request retransmission. 3. In sctp_process_strreset_inreq(), no event should be sent, as no in or out stream is reset here. 4. In sctp_process_strreset_resp(), SCTP_STREAM_RESET_INCOMING_SSN or OUTGOING event should always be sent for stream reset requests, no matter it fails or succeeds to process the request. Fixes: 810544764536 ("sctp: implement receiver-side procedures for the Outgoing SSN Reset Request Parameter") Fixes: 16e1a91965b0 ("sctp: implement receiver-side procedures for the Incoming SSN Reset Request Parameter") Fixes: 11ae76e67a17 ("sctp: implement receiver-side procedures for the Reconf Response Parameter") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnelwenxu1-1/+7
ip l add dev tun type gretap key 1000 ip a a dev tun 10.0.0.1/24 Packets with tun-id 1000 can be recived by tun dev. But packet can't be sent through dev tun for non-tunnel-dst With this patch: tunnel-dst can be get through lwtunnel like beflow: ip r a 10.0.0.7 encap ip dst 172.168.0.11 dev tun Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23ax25: fix possible use-after-freeEric Dumazet3-13/+22
syzbot found that ax25 routes where not properly protected against concurrent use [1]. In this particular report the bug happened while copying ax25->digipeat. Fix this problem by making sure we call ax25_get_route() while ax25_route_lock is held, so that no modification could happen while using the route. The current two ax25_get_route() callers do not sleep, so this change should be fine. Once we do that, ax25_get_route() no longer needs to grab a reference on the found route. [1] ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline] BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113 Read of size 66 at addr ffff888066641a80 by task syz-executor2/531 ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x24/0x50 mm/kasan/common.c:130 memcpy include/linux/string.h:352 [inline] kmemdup+0x42/0x60 mm/util.c:113 kmemdup include/linux/string.h:425 [inline] ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224 __sys_connect+0x357/0x490 net/socket.c:1664 __do_sys_connect net/socket.c:1675 [inline] __se_sys_connect net/socket.c:1672 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1672 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458099 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099 RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4 R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff Allocated by task 526: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc mm/kasan/common.c:496 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504 ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609 kmalloc include/linux/slab.h:545 [inline] ax25_rt_add net/ax25/ax25_route.c:95 [inline] ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de Freed by task 550: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466 __cache_free mm/slab.c:3487 [inline] kfree+0xcf/0x230 mm/slab.c:3806 ax25_rt_add net/ax25/ax25_route.c:92 [inline] ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888066641a80 which belongs to the cache kmalloc-96 of size 96 The buggy address is located 0 bytes inside of 96-byte region [ffff888066641a80, ffff888066641ae0) The buggy address belongs to the page: page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0 flags: 0x1fffc0000000200(slab) ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0 raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ralf Baechle <ralf@linux-mips.org> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probeEdward Cree1-8/+21
Use a bitmap to keep track of which partition types we've already seen; for duplicates, return -EEXIST from efx_ef10_mtd_probe_partition() and thus skip adding that partition. Duplicate partitions occur because of the A/B backup scheme used by newer sfc NICs. Prior to this patch they cause sysfs_warn_dup errors because they have the same name, causing us not to expose any MTDs at all. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23hv_netvsc: fix typos in code commentsAdrian Vladu4-6/+6
Fix all typos from hyperv netvsc code comments. Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Alessandro Pilotti" <apilotti@cloudbasesolutions.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23hv_netvsc: Fix hash key value reset after other opsHaiyang Zhang4-7/+19
Changing mtu, channels, or buffer sizes ops call to netvsc_attach(), rndis_set_subchannel(), which always reset the hash key to default value. That will override hash key changed previously. This patch fixes the problem by save the hash key, then restore it when we re- add the netvsc device. Fixes: ff4a44199012 ("netvsc: allow get/set of RSS indirection table") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> [sl: fix up subject line] Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23hv_netvsc: Refactor assignments of struct netvsc_device_infoHaiyang Zhang1-49/+85
These assignments occur in multiple places. The patch refactor them to a function for simplicity. It also puts the struct to heap area for future expension. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> [sl: fix up subject line] Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23hv_netvsc: Fix ethtool change hash key errorHaiyang Zhang1-6/+19
Hyper-V hosts require us to disable RSS before changing RSS key, otherwise the changing request will fail. This patch fixes the coding error. Fixes: ff4a44199012 ("netvsc: allow get/set of RSS indirection table") Reported-by: Wei Hu <weh@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> [sl: fix up subject line] Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23ravb: expand rx descriptor data to accommodate hw checksumSimon Horman1-5/+7
EtherAVB may provide a checksum of packet data appended to packet data. In order to allow this checksum to be received by the host descriptor data needs to be enlarged by 2 bytes to accommodate the checksum. In the case of MTU-sized packets without a VLAN tag the checksum were already accommodated by virtue of the space reserved for the VLAN tag. However, a packet of MTU-size with a VLAN tag consumed all packet data space provided by a descriptor leaving no space for the trailing checksum. This was not detected by the driver which incorrectly used the last two bytes of packet data as the checksum and truncate the packet by two bytes. This resulted all such packets being dropped. A work around is to disable RX checksum offload # ethtool -K eth0 rx off This patch resolves this problem by increasing the size available for packet data in RX descriptors by two bytes. Tested on R-Car E3 (r8a77990) ES1.0 based Ebisu-4D board v2 * Use sizeof(__sum16) directly rather than adding a driver-local #define for the size of the checksum provided by the hw (2 bytes). Fixes: 4d86d3818627 ("ravb: RX checksum offload") Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net: phy: Fixup GPLv2+ SPDX tags based on license textAndrew Lunn3-28/+5
A few PHY drivers have the GPLv2+ license text. They then either have a MODULE_LICENSE() of GPLv2 only, or an SPDX tag of GPLv2 only. Since the license text is much easier to understand than either the SPDX tag or the MODULE_LICENSE, use it as the definitive source of the licence, and fixup the others when there are contradictions. Cc: David Wu <david.wu@rock-chips.com> Cc: Dongpo Li <lidongpo@hisilicon.com> Cc: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net: fec: get regulator optionalStefan Agner1-1/+1
According to the device tree binding the phy-supply property is optional. Use the regulator_get_optional API accordingly. The code already handles NULL just fine. This gets rid of the following warning: fec 2188000.ethernet: 2188000.ethernet supply phy not found, using dummy regulator Signed-off-by: Stefan Agner <stefan@agner.ch> Reviewed-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net/ipv6: lower the level of "link is not ready" messagesLubomir Rintel1-2/+2
This message gets logged far too often for how interesting is it. Most distributions nowadays configure NetworkManager to use randomly generated MAC addresses for Wi-Fi network scans. The interfaces end up being periodically brought down for the address change. When they're subsequently brought back up, the message is logged, eventually flooding the log. Perhaps the message is not all that helpful: it seems to be more interesting to hear when the addrconf actually start, not when it does not. Let's lower its level. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-By: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net: altera_tse: fix connect_local_phy error pathAtsushi Nemoto1-1/+3
The connect_local_phy should return NULL (not negative errno) on error, since its caller expects it. Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp> Acked-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net: dpaa2: improve PTP Kconfig optionYangbo Lu1-2/+3
Converted to use "imply" instead of "select" for PTP_1588_CLOCK driver selecting. This could break the hard dependency between the PTP clock subsystem and ethernet drivers. This patch also set "default y" for dpaa2 ptp driver building to provide user an available ptp clock in default. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22qede: Error recovery processTomer Tayar4-74/+314
This patch adds the error recovery process in the qede driver. The process includes a partial/customized driver unload and load, which allows it to look like a short suspend period to the kernel while preserving the net devices' state. Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22qed: Add infrastructure for error detection and recoveryTomer Tayar11-16/+251
This patch adds the detection and handling of a parity error ("process kill event"), including the update of the protocol drivers, and the prevention of any HW access that will lead to device access towards the host while recovery is in progress. It also provides the means for the protocol drivers to trigger a recovery process on their decision. Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22qed: Revise load sequence to avoid PCI errorsTomer Tayar7-112/+178
Initiating final cleanup after an ungraceful driver unload can lead to bad PCI accesses towards the host. This patch revises the load sequence so final cleanup is sent while the internal master enable is cleared, to prevent the host accesses, and clears the internal error indications just before enabling the internal master enable. Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net/ipv6: don't return positive numbers when nothing was dumpedJakub Kicinski1-0/+2
in6_dump_addrs() returns a positive 1 if there was nothing to dump. This return value can not be passed as return from inet6_dump_addr() as is, because it will confuse rtnetlink, resulting in NLMSG_DONE never getting set: $ ip addr list dev lo EOF on netlink Dump terminated v2: flip condition to avoid a new goto (DaveA) Fixes: 7c1e8a3817c5 ("netlink: fixup regression in RTM_GETADDR") Reported-by: Brendan Galloway <brendan.galloway@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net: ip_gre: use erspan key field for tunnel lookupLorenzo Bianconi3-13/+17
Use ERSPAN key header field as tunnel key in gre_parse_header routine since ERSPAN protocol sets the key field of the external GRE header to 0 resulting in a tunnel lookup fail in ip6gre_err. In addition remove key field parsing and pskb_may_pull check in erspan_rcv and ip6erspan_rcv Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22net: sun: cassini: Cleanup license conflictThomas Gleixner2-28/+2
The recent addition of SPDX license identifiers to the files in drivers/net/ethernet/sun created a licensing conflict. The cassini driver files contain a proper license notice: * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License as * published by the Free Software Foundation; either version 2 of the * License, or (at your option) any later version. but the SPDX change added: SPDX-License-Identifier: GPL-2.0 So the file got tagged GPL v2 only while in fact it is licensed under GPL v2 or later. It's nice that people care about the SPDX tags, but they need to be more careful about it. Not everything under (the) sun belongs to ... Fix up the SPDX identifier and remove the boiler plate text as it is redundant. Fixes: c861ef83d771 ("sun: Add SPDX license tags to Sun network drivers") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Shannon Nelson <shannon.nelson@oracle.com> Cc: Zhu Yanjun <yanjun.zhu@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22can: flexcan: fix NULL pointer exception during bringupUwe Kleine-König1-1/+1
Commit cbffaf7aa09e ("can: flexcan: Always use last mailbox for TX") introduced a loop letting i run up to (including) ARRAY_SIZE(regs->mb) and in the body accessed regs->mb[i] which is an out-of-bounds array access that then resulted in an access to an reserved register area. Later this was changed by commit 0517961ccdf1 ("can: flexcan: Add provision for variable payload size") to iterate a bit differently but still runs one iteration too much resulting to call flexcan_get_mb(priv, priv->mb_count) which results in a WARN_ON and then a NULL pointer exception. This only affects devices compatible with "fsl,p1010-flexcan", "fsl,imx53-flexcan", "fsl,imx35-flexcan", "fsl,imx25-flexcan", "fsl,imx28-flexcan", so newer i.MX SoCs are not affected. Fixes: cbffaf7aa09e ("can: flexcan: Always use last mailbox for TX") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: linux-stable <stable@vger.kernel.org> # >= 4.20 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-01-22can: flexcan: fix 'passing zero to ERR_PTR()' warningYueHaibing1-1/+1
Fix a static code checker warning: drivers/net/can/flexcan.c:1435 flexcan_setup_stop_mode() warn: passing zero to 'PTR_ERR' Fixes: de3578c198c6 ("can: flexcan: add self wakeup support") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-01-22can: bcm: check timer values before ktime conversionOliver Hartkopp1-0/+27
Kyungtae Kim detected a potential integer overflow in bcm_[rx|tx]_setup() when the conversion into ktime multiplies the given value with NSEC_PER_USEC (1000). Reference: https://marc.info/?l=linux-can&m=154732118819828&w=2 Add a check for the given tv_usec, so that the value stays below one second. Additionally limit the tv_sec value to a reasonable value for CAN related use-cases of 400 days and ensure all values to be positive. Reported-by: Kyungtae Kim <kt0755@gmail.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> # >= 2.6.26 Tested-by: Kyungtae Kim <kt0755@gmail.com> Acked-by: Andre Naujoks <nautsch2@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-01-22can: dev: __can_get_echo_skb(): fix bogous check for non-existing skb by removing itManfred Schlaegl1-14/+13
This patch revert commit 7da11ba5c506 ("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb") After introduction of this change we encountered following new error message on various i.MX plattforms (flexcan): | flexcan 53fc8000.can can0: __can_get_echo_skb: BUG! Trying to echo non | existing skb: can_priv::echo_skb[0] The introduction of the message was a mistake because priv->echo_skb[idx] = NULL is a perfectly valid in following case: If CAN_RAW_LOOPBACK is disabled (setsockopt) in applications, the pkt_type of the tx skb's given to can_put_echo_skb is set to PACKET_LOOPBACK. In this case can_put_echo_skb will not set priv->echo_skb[idx]. It is therefore kept NULL. As additional argument for revert: The order of check and usage of idx was changed. idx is used to access an array element before checking it's boundaries. Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com> Fixes: 7da11ba5c506 ("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb") Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-01-21Linux 5.0-rc3Linus Torvalds1-1/+1
2019-01-20pstore/ram: Avoid allocation and leak of platform dataKees Cook1-6/+3
Yue Hu noticed that when parsing device tree the allocated platform data was never freed. Since it's not used beyond the function scope, this switches to using a stack variable instead. Reported-by: Yue Hu <huyue2@yulong.com> Fixes: 35da60941e44 ("pstore/ram: add Device Tree bindings") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2019-01-20gcc-plugins: arm_ssp_per_task_plugin: fix for GCC 9+Ard Biesheuvel1-0/+18
GCC 9 reworks the way the references to the stack canary are emitted, to prevent the value from being spilled to the stack before the final comparison in the epilogue, defeating the purpose, given that the spill slot is under control of the attacker that we are protecting ourselves from. Since our canary value address is obtained without accessing memory (as opposed to pre-v7 code that will obtain it from a literal pool), it is unlikely (although not guaranteed) that the compiler will spill the canary value in the same way, so let's just disable this improvement when building with GCC9+. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2019-01-20gcc-plugins: arm_ssp_per_task_plugin: sign extend the SP maskArd Biesheuvel1-2/+3
The ARM per-task stack protector GCC plugin hits an assert in the compiler in some case, due to the fact the the SP mask expression is not sign-extended as it should be. So fix that. Suggested-by: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2019-01-21fix int_sqrt64() for very large numbersFlorian La Roche1-1/+1
If an input number x for int_sqrt64() has the highest bit set, then fls64(x) is 64. (1UL << 64) is an overflow and breaks the algorithm. Subtracting 1 is a better guess for the initial value of m anyway and that's what also done in int_sqrt() implicitly [*]. [*] Note how int_sqrt() uses __fls() with two underscores, which already returns the proper raw bit number. In contrast, int_sqrt64() used fls64(), and that returns bit numbers illogically starting at 1, because of error handling for the "no bits set" case. Will points out that he bug probably is due to a copy-and-paste error from the regular int_sqrt() case. Signed-off-by: Florian La Roche <Florian.LaRoche@googlemail.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-20x86: uaccess: Inhibit speculation past access_ok() in user_access_begin()Will Deacon1-1/+1
Commit 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'") makes the access_ok() check part of the user_access_begin() preceding a series of 'unsafe' accesses. This has the desirable effect of ensuring that all 'unsafe' accesses have been range-checked, without having to pick through all of the callsites to verify whether the appropriate checking has been made. However, the consolidated range check does not inhibit speculation, so it is still up to the caller to ensure that they are not susceptible to any speculative side-channel attacks for user addresses that ultimately fail the access_ok() check. This is an oversight, so use __uaccess_begin_nospec() to ensure that speculation is inhibited until the access_ok() check has passed. Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-20bpf: in __bpf_redirect_no_mac pull mac only if presentWillem de Bruijn2-10/+12
Syzkaller was able to construct a packet of negative length by redirecting from bpf_prog_test_run_skb with BPF_PROG_TYPE_LWT_XMIT: BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:345 [inline] BUG: KASAN: slab-out-of-bounds in skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline] BUG: KASAN: slab-out-of-bounds in __pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395 Read of size 4294967282 at addr ffff8801d798009c by task syz-executor2/12942 kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 memcpy+0x23/0x50 mm/kasan/kasan.c:302 memcpy include/linux/string.h:345 [inline] skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline] __pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395 __pskb_copy include/linux/skbuff.h:1053 [inline] pskb_copy include/linux/skbuff.h:2904 [inline] skb_realloc_headroom+0xe7/0x120 net/core/skbuff.c:1539 ipip6_tunnel_xmit net/ipv6/sit.c:965 [inline] sit_tunnel_xmit+0xe1b/0x30d0 net/ipv6/sit.c:1029 __netdev_start_xmit include/linux/netdevice.h:4325 [inline] netdev_start_xmit include/linux/netdevice.h:4334 [inline] xmit_one net/core/dev.c:3219 [inline] dev_hard_start_xmit+0x295/0xc90 net/core/dev.c:3235 __dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805 dev_queue_xmit+0x17/0x20 net/core/dev.c:3838 __bpf_tx_skb net/core/filter.c:2016 [inline] __bpf_redirect_common net/core/filter.c:2054 [inline] __bpf_redirect+0x5cf/0xb20 net/core/filter.c:2061 ____bpf_clone_redirect net/core/filter.c:2094 [inline] bpf_clone_redirect+0x2f6/0x490 net/core/filter.c:2066 bpf_prog_41f2bcae09cd4ac3+0xb25/0x1000 The generated test constructs a packet with mac header, network header, skb->data pointing to network header and skb->len 0. Redirecting to a sit0 through __bpf_redirect_no_mac pulls the mac length, even though skb->data already is at skb->network_header. bpf_prog_test_run_skb has already pulled it as LWT_XMIT !is_l2. Update the offset calculation to pull only if skb->data differs from skb->network_header, which is not true in this case. The test itself can be run only from commit 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command"), but the same type of packets with skb at network header could already be built from lwt xmit hooks, so this fix is more relevant to that commit. Also set the mac header on redirect from LWT_XMIT, as even after this change to __bpf_redirect_no_mac that field is expected to be set, but is not yet in ip_finish_output2. Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-19virtio_net: bulk free tx skbsMichael S. Tsirkin1-6/+6
Use napi_consume_skb() to get bulk free. Note that napi_consume_skb is safe to call in a non-napi context as long as the napi_budget flag is correct. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-19clang-format: Update .clang-format with the latest for_each macro listJason Gunthorpe1-1/+42
Re-run the shell fragment that generated the original list. In particular this adds the missing xarray related functions. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-01-19net: phy: phy driver features are mandatoryCamelia Groza2-2/+7
Since phy driver features became a link_mode bitmap, phy drivers that don't have a list of features configured will cause the kernel to crash when probed. Prevent the phy driver from registering if the features field is missing. Fixes: 719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap") Reported-by: Scott Wood <oss@buserror.net> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-19isdn: avm: Fix string plus integer warning from ClangNathan Chancellor1-1/+1
A recent commit in Clang expanded the -Wstring-plus-int warning, showing some odd behavior in this file. drivers/isdn/hardware/avm/b1.c:426:30: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int] cinfo->version[j] = "\0\0" + 1; ~~~~~~~^~~ drivers/isdn/hardware/avm/b1.c:426:30: note: use array indexing to silence this warning cinfo->version[j] = "\0\0" + 1; ^ & [ ] 1 warning generated. This is equivalent to just "\0". Nick pointed out that it is smarter to use "" instead of "\0" because "" is used elsewhere in the kernel and can be deduplicated at the linking stage. Link: https://github.com/ClangBuiltLinux/linux/issues/309 Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-19powerpc: chrp: Use of_node_is_type to access device_typeRob Herring1-2/+1
Commit 8ce5f8415753 ("of: Remove struct device_node.type pointer") removed struct device_node.type pointer, but the conversion to use of_node_is_type() accessor was missed in chrp_init_IRQ(). Fixes: 8ce5f8415753 ("of: Remove struct device_node.type pointer") Reported-by: kbuild test robot <lkp@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Rob Herring <robh@kernel.org>
2019-01-18net/mlx5e: Fix cb_ident duplicate in indirect block registerEli Britstein1-13/+16
Previously the identifier used for indirect block callback registry and for block rule cb registry (when done via indirect blocks) was the pointer to the tunnel netdev we were interested in receiving updates on. This worked fine if a single PF existed that registered one callback for the tunnel netdev of interest. However, if multiple PFs are in place then the 2nd PF tries to register with the same tunnel netdev identifier. This leads to EEXIST errors and/or incorrect cb deletions. Prevent this conflict by using the rpriv pointer as the identifier for netdev indirect block cb registry, allowing each PF to register a unique callback per tunnel netdev. For block cb registry, the same PF may register multiple cbs to the same block if using TC shared blocks. Instead of the rpriv, use the pointer to the allocated indr_priv data as the identifier here. This means that there can be a unique block callback for each PF/tunnel netdev combo. Fixes: f5bc2c5de101 ("net/mlx5e: Support TC indirect block notifications for eswitch uplink reprs") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-18net/mlx5e: Fix wrong (zero) TX drop counter indication for representorTariq Toukan1-0/+1
For representors, the TX dropped counter is not folded from the per-ring counters. Fix it. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-18net/mlx5e: Fix wrong error code return on FEC query failureShay Agroskin1-1/+4
Advertised and configured FEC query failure resulted in printing wrong error code. Fixes: 6cfa94605091 ("net/mlx5e: Ethtool driver callback for query/set FEC policy") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-18net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet framesCong Wang1-0/+13
When an ethernet frame is padded to meet the minimum ethernet frame size, the padding octets are not covered by the hardware checksum. Fortunately the padding octets are usually zero's, which don't affect checksum. However, we have a switch which pads non-zero octets, this causes kernel hardware checksum fault repeatedly. Prior to: commit '88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE ...")' skb checksum was forced to be CHECKSUM_NONE when padding is detected. After it, we need to keep skb->csum updated, like what we do for RXFCS. However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP headers, it is not worthy the effort as the packets are so small that CHECKSUM_COMPLETE can't save anything. Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), Cc: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Nikola Ciprich <nikola.ciprich@linuxbox.cz> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-18tools: bpftool: Cleanup license messThomas Gleixner2-11/+1
Precise and non-ambiguous license information is important. The recent relicensing of the bpftools introduced a license conflict. The files have now: SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause and * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later version Amazingly about 20 people acked that change and neither they nor the committer noticed. Oh well. Digging deeper: The files were imported from the iproute2 repository with the GPL V2 or later boiler plate text in commit b66e907cfee2 ("tools: bpftool: copy JSON writer from iproute2 repository") Looking at the iproute2 repository at git://git.kernel.org/pub/scm/network/iproute2/iproute2.git the following commit is the equivivalent: commit d9d8c839 ("json_writer: add SPDX Identifier (GPL-2/BSD-2)") That commit explicitly removes the boiler plate and relicenses the code uner GPL-2.0-only and BSD-2-Clause. As Steven wrote the original code and also the relicensing commit, it's assumed that the relicensing was intended to do exaclty that. Just the kernel side update failed to remove the boiler plate. Do so now. Fixes: 907b22365115 ("tools: bpftool: dual license all files") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Roman Gushchin <guro@fb.com> Cc: YueHaibing <yuehaibing@huawei.com> Cc: Yonghong Song <yhs@fb.com> Cc: Stanislav Fomichev <sdf@google.com> Cc: Sean Young <sean@mess.org> Cc: Jiri Benc <jbenc@redhat.com> Cc: David Calavera <david.calavera@gmail.com> Cc: Andrey Ignatov <rdna@fb.com> Cc: Joe Stringer <joe@wand.net.nz> Cc: David Ahern <dsahern@gmail.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Petar Penkov <ppenkov@stanford.edu> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Quentin Monnet <quentin.monnet@netronome.com> CC: okash.khawaja@gmail.com Cc: netdev@vger.kernel.org Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-18bpf: fix inner map masking to prevent oob under speculationDaniel Borkmann1-2/+15
During review I noticed that inner meta map setup for map in map is buggy in that it does not propagate all needed data from the reference map which the verifier is later accessing. In particular one such case is index masking to prevent out of bounds access under speculative execution due to missing the map's unpriv_array/index_mask field propagation. Fix this such that the verifier is generating the correct code for inlined lookups in case of unpriviledged use. Before patch (test_verifier's 'map in map access' dump): # bpftool prog dump xla id 3 0: (62) *(u32 *)(r10 -4) = 0 1: (bf) r2 = r10 2: (07) r2 += -4 3: (18) r1 = map[id:4] 5: (07) r1 += 272 | 6: (61) r0 = *(u32 *)(r2 +0) | 7: (35) if r0 >= 0x1 goto pc+6 | Inlined map in map lookup 8: (54) (u32) r0 &= (u32) 0 | with index masking for 9: (67) r0 <<= 3 | map->unpriv_array. 10: (0f) r0 += r1 | 11: (79) r0 = *(u64 *)(r0 +0) | 12: (15) if r0 == 0x0 goto pc+1 | 13: (05) goto pc+1 | 14: (b7) r0 = 0 | 15: (15) if r0 == 0x0 goto pc+11 16: (62) *(u32 *)(r10 -4) = 0 17: (bf) r2 = r10 18: (07) r2 += -4 19: (bf) r1 = r0 20: (07) r1 += 272 | 21: (61) r0 = *(u32 *)(r2 +0) | Index masking missing (!) 22: (35) if r0 >= 0x1 goto pc+3 | for inner map despite 23: (67) r0 <<= 3 | map->unpriv_array set. 24: (0f) r0 += r1 | 25: (05) goto pc+1 | 26: (b7) r0 = 0 | 27: (b7) r0 = 0 28: (95) exit After patch: # bpftool prog dump xla id 1 0: (62) *(u32 *)(r10 -4) = 0 1: (bf) r2 = r10 2: (07) r2 += -4 3: (18) r1 = map[id:2] 5: (07) r1 += 272 | 6: (61) r0 = *(u32 *)(r2 +0) | 7: (35) if r0 >= 0x1 goto pc+6 | Same inlined map in map lookup 8: (54) (u32) r0 &= (u32) 0 | with index masking due to 9: (67) r0 <<= 3 | map->unpriv_array. 10: (0f) r0 += r1 | 11: (79) r0 = *(u64 *)(r0 +0) | 12: (15) if r0 == 0x0 goto pc+1 | 13: (05) goto pc+1 | 14: (b7) r0 = 0 | 15: (15) if r0 == 0x0 goto pc+12 16: (62) *(u32 *)(r10 -4) = 0 17: (bf) r2 = r10 18: (07) r2 += -4 19: (bf) r1 = r0 20: (07) r1 += 272 | 21: (61) r0 = *(u32 *)(r2 +0) | 22: (35) if r0 >= 0x1 goto pc+4 | Now fixed inlined inner map 23: (54) (u32) r0 &= (u32) 0 | lookup with proper index masking 24: (67) r0 <<= 3 | for map->unpriv_array. 25: (0f) r0 += r1 | 26: (05) goto pc+1 | 27: (b7) r0 = 0 | 28: (b7) r0 = 0 29: (95) exit Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>