aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-12-08Merge tag 'asm-generic-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-genericLinus Torvalds1-0/+4
Pull asm-generic fix from Arnd Bergmann: "Multiple people reported a bug I introduced in asm-generic/unistd.h in 4.20, this is the obvious bugfix to get glibc and others to correctly build again on new architectures that no longer provide the old fstatat64() family of system calls" * tag 'asm-generic-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: unistd.h: fixup broken macro include.
2018-12-08Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds6-4/+47
Pull clk fixes from Stephen Boyd: "A few clk driver fixes this time: - Introduce protected-clock DT binding to fix breakage on qcom sdm845-mtp boards where the qspi clks introduced this merge window cause the firmware on those boards to take down the system if we try to read the clk registers - Fix a couple off-by-one errors found by Dan Carpenter - Handle failure in zynq fixed factor clk driver to avoid using uninitialized data" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: zynqmp: Off by one in zynqmp_is_valid_clock() clk: mmp: Off by one in mmp_clk_add() clk: mvebu: Off by one bugs in cp110_of_clk_get() arm64: dts: qcom: sdm845-mtp: Mark protected gcc clocks clk: qcom: Support 'protected-clocks' property dt-bindings: clk: Introduce 'protected-clocks' property clk: zynqmp: handle fixed factor param query error
2018-12-08Merge tag 'xfs-4.20-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds6-15/+11
Pull xfs fixes from Darrick Wong: "Here are hopefully the last set of fixes for 4.20. There's a fix for a longstanding statfs reporting problem with project quotas, a correction for page cache invalidation behaviors when fallocating near EOF, and a fix for a broken metadata verifier return code. Finally, the most important fix is to the pipe splicing code (aka the generic copy_file_range fallback) to avoid pointless short directio reads by only asking the filesystem for as much data as there are available pages in the pipe buffer. Our previous fix (simulated short directio reads because the number of pages didn't match the length of the read requested) caused subtle problems on overlayfs, so that part is reverted. Anyhow, this series passes fstests -g all on xfs and overlay+xfs, and has passed 17 billion fsx operations problem-free since I started testing Summary: - Fix broken project quota inode counts - Fix incorrect PAGE_MASK/PAGE_SIZE usage - Fix incorrect return value in btree verifier - Fix WARN_ON remap flags false positive - Fix splice read overflows" * tag 'xfs-4.20-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: partially revert 4721a601099 (simulated directio short read on EFAULT) splice: don't read more than available pipe space vfs: allow some remap flags to be passed to vfs_clone_file_range xfs: fix inverted return from xfs_btree_sblock_verify_crc xfs: fix PAGE_MASK usage in xfs_free_file_space fs/xfs: fix f_ffree value for statfs when project quota is set
2018-12-08Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"David Rientjes4-22/+51
This reverts commit 89c83fb539f95491be80cdd5158e6f0ce329e317. This should have been done as part of 2f0799a0ffc0 ("mm, thp: restore node-local hugepage allocations"). The movement of the thp allocation policy from alloc_pages_vma() to alloc_hugepage_direct_gfpmask() was intended to only set __GFP_THISNODE for mempolicies that are not MPOL_BIND whereas the revert could set this regardless of mempolicy. While the check for MPOL_BIND between alloc_hugepage_direct_gfpmask() and alloc_pages_vma() was racy, that has since been removed since the revert. What is left is the possibility to use __GFP_THISNODE in policy_node() when it is unexpected because the special handling for hugepages in alloc_pages_vma() was removed as part of the consolidation. Secondly, prior to 89c83fb539f9, alloc_pages_vma() implemented a somewhat different policy for hugepage allocations, which were allocated through alloc_hugepage_vma(). For hugepage allocations, if the allocating process's node is in the set of allowed nodes, allocate with __GFP_THISNODE for that node (for MPOL_PREFERRED, use that node with __GFP_THISNODE instead). This was changed for shmem_alloc_hugepage() to allow fallback to other nodes in 89c83fb539f9 as it did for new_page() in mm/mempolicy.c which is functionally different behavior and removes the requirement to only allocate hugepages locally. So this commit does a full revert of 89c83fb539f9 instead of the partial revert that was done in 2f0799a0ffc0. The result is the same thp allocation policy for 4.20 that was in 4.19. Fixes: 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") Fixes: 2f0799a0ffc0 ("mm, thp: restore node-local hugepage allocations") Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-07Revert "net/ibm/emac: wrong bit is used for STA control"Benjamin Herrenschmidt1-1/+1
This reverts commit 624ca9c33c8a853a4a589836e310d776620f4ab9. This commit is completely bogus. The STACR register has two formats, old and new, depending on the version of the IP block used. There's a pair of device-tree properties that can be used to specify the format used: has-inverted-stacr-oc has-new-stacr-staopc What this commit did was to change the bit definition used with the old parts to match the new parts. This of course breaks the driver on all the old ones. Instead, the author should have set the appropriate properties in the device-tree for the variant used on his board. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07Merge branch 'tc-testing-next'David S. Miller6-56/+239
Lucas Bates says: ==================== tc-testing: implement command timeouts and better results tracking Patch 1 adds a timeout feature for any command tdc launches in a subshell. This prevents tdc from hanging indefinitely. Patches 2-4 introduce a new method for tracking and generating test case results, and implements it across the core script and all applicable plugins. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07tc-testing: gitignore, ignore generated test resultsLucas Bates1-0/+3
Ignore any .tap or .xml test result files generated by tdc. Additionally, ignore plugin symlinks. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07tc-testing: Implement the TdcResults module in tdcLucas Bates3-51/+91
In tdc and the valgrind plugin, begin using the TdcResults module to track executed test cases. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07tc-testing: Add new TdcResults moduleLucas Bates1-0/+132
This module includes new classes for tdc to use in keeping track of test case results, instead of generating and tracking a lengthy string. The new module can be extended to support multiple formal test result formats to be friendlier to automation. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07tc-testing: Add command timeout feature to tdcLucas Bates2-5/+13
Using an attribute set in the tdc_config.py file, limit the amount of time tdc will wait for an executed command to complete and prevent the script from hanging entirely. This timeout will be applied to all executed commands. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07Merge branch 'skb-headroom-slab-out-of-bounds'David S. Miller2-26/+44
Stefano Brivio says: ==================== Fix slab out-of-bounds on insufficient headroom for IPv6 packets Patch 1/2 fixes a slab out-of-bounds occurring with short SCTP packets over IPv4 over L2TP over IPv6 on a configuration with relatively low HEADER_MAX. Patch 2/2 makes sure we avoid writing before the allocated buffer in neigh_hh_output() in case the headroom is enough for the unaligned hardware header size, but not enough for the aligned one, and that we warn if we hit this condition. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07neighbour: Avoid writing before skb->head in neigh_hh_output()Stefano Brivio1-5/+23
While skb_push() makes the kernel panic if the skb headroom is less than the unaligned hardware header size, it will proceed normally in case we copy more than that because of alignment, and we'll silently corrupt adjacent slabs. In the case fixed by the previous patch, "ipv6: Check available headroom in ip6_xmit() even without options", we end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware header and write 16 bytes, starting 2 bytes before the allocated buffer. Always check we're not writing before skb->head and, if the headroom is not enough, warn and drop the packet. v2: - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet (Eric Dumazet) - if we avoid the panic, though, we need to explicitly check the headroom before the memcpy(), otherwise we'll have corrupted slabs on a running kernel, after we warn - use __skb_push() instead of skb_push(), as the headroom check is already implemented here explicitly (Eric Dumazet) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07ipv6: Check available headroom in ip6_xmit() even without optionsStefano Brivio1-21/+21
Even if we send an IPv6 packet without options, MAX_HEADER might not be enough to account for the additional headroom required by alignment of hardware headers. On a configuration without HYPERV_NET, WLAN, AX25, and with IPV6_TUNNEL, sending short SCTP packets over IPv4 over L2TP over IPv6, we start with 100 bytes of allocated headroom in sctp_packet_transmit(), end up with 54 bytes after l2tp_xmit_skb(), and 14 bytes in ip6_finish_output2(). Those would be enough to append our 14 bytes header, but we're going to align that to 16 bytes, and write 2 bytes out of the allocated slab in neigh_hh_output(). KASan says: [ 264.967848] ================================================================== [ 264.967861] BUG: KASAN: slab-out-of-bounds in ip6_finish_output2+0x1aec/0x1c70 [ 264.967866] Write of size 16 at addr 000000006af1c7fe by task netperf/6201 [ 264.967870] [ 264.967876] CPU: 0 PID: 6201 Comm: netperf Not tainted 4.20.0-rc4+ #1 [ 264.967881] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) [ 264.967887] Call Trace: [ 264.967896] ([<00000000001347d6>] show_stack+0x56/0xa0) [ 264.967903] [<00000000017e379c>] dump_stack+0x23c/0x290 [ 264.967912] [<00000000007bc594>] print_address_description+0xf4/0x290 [ 264.967919] [<00000000007bc8fc>] kasan_report+0x13c/0x240 [ 264.967927] [<000000000162f5e4>] ip6_finish_output2+0x1aec/0x1c70 [ 264.967935] [<000000000163f890>] ip6_finish_output+0x430/0x7f0 [ 264.967943] [<000000000163fe44>] ip6_output+0x1f4/0x580 [ 264.967953] [<000000000163882a>] ip6_xmit+0xfea/0x1ce8 [ 264.967963] [<00000000017396e2>] inet6_csk_xmit+0x282/0x3f8 [ 264.968033] [<000003ff805fb0ba>] l2tp_xmit_skb+0xe02/0x13e0 [l2tp_core] [ 264.968037] [<000003ff80631192>] l2tp_eth_dev_xmit+0xda/0x150 [l2tp_eth] [ 264.968041] [<0000000001220020>] dev_hard_start_xmit+0x268/0x928 [ 264.968069] [<0000000001330e8e>] sch_direct_xmit+0x7ae/0x1350 [ 264.968071] [<000000000122359c>] __dev_queue_xmit+0x2b7c/0x3478 [ 264.968075] [<00000000013d2862>] ip_finish_output2+0xce2/0x11a0 [ 264.968078] [<00000000013d9b14>] ip_finish_output+0x56c/0x8c8 [ 264.968081] [<00000000013ddd1e>] ip_output+0x226/0x4c0 [ 264.968083] [<00000000013dbd6c>] __ip_queue_xmit+0x894/0x1938 [ 264.968100] [<000003ff80bc3a5c>] sctp_packet_transmit+0x29d4/0x3648 [sctp] [ 264.968116] [<000003ff80b7bf68>] sctp_outq_flush_ctrl.constprop.5+0x8d0/0xe50 [sctp] [ 264.968131] [<000003ff80b7c716>] sctp_outq_flush+0x22e/0x7d8 [sctp] [ 264.968146] [<000003ff80b35c68>] sctp_cmd_interpreter.isra.16+0x530/0x6800 [sctp] [ 264.968161] [<000003ff80b3410a>] sctp_do_sm+0x222/0x648 [sctp] [ 264.968177] [<000003ff80bbddac>] sctp_primitive_ASSOCIATE+0xbc/0xf8 [sctp] [ 264.968192] [<000003ff80b93328>] __sctp_connect+0x830/0xc20 [sctp] [ 264.968208] [<000003ff80bb11ce>] sctp_inet_connect+0x2e6/0x378 [sctp] [ 264.968212] [<0000000001197942>] __sys_connect+0x21a/0x450 [ 264.968215] [<000000000119aff8>] sys_socketcall+0x3d0/0xb08 [ 264.968218] [<000000000184ea7a>] system_call+0x2a2/0x2c0 [...] Just like ip_finish_output2() does for IPv4, check that we have enough headroom in ip6_xmit(), and reallocate it if we don't. This issue is older than git history. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07tcp: lack of available data can also cause TSO deferEric Dumazet1-11/+24
tcp_tso_should_defer() can return true in three different cases : 1) We are cwnd-limited 2) We are rwnd-limited 3) We are application limited. Neal pointed out that my recent fix went too far, since it assumed that if we were not in 1) case, we must be rwnd-limited Fix this by properly populating the is_cwnd_limited and is_rwnd_limited booleans. After this change, we can finally move the silly check for FIN flag only for the application-limited case. The same move for EOR bit will be handled in net-next, since commit 1c09f7d073b1 ("tcp: do not try to defer skbs with eor mark (MSG_EOR)") is scheduled for linux-4.21 Tested by running 200 concurrent netperf -t TCP_RR -- -r 60000,100 and checking none of them was rwnd_limited in the chrono_stat output from "ss -ti" command. Fixes: 41727549de3e ("tcp: Do not underestimate rwnd_limited") Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: call sk_dst_reset when set SO_DONTROUTEyupeng1-0/+1
after set SO_DONTROUTE to 1, the IP layer should not route packets if the dest IP address is not in link scope. But if the socket has cached the dst_entry, such packets would be routed until the sk_dst_cache expires. So we should clean the sk_dst_cache when a user set SO_DONTROUTE option. Below are server/client python scripts which could reprodue this issue: server side code: ========================================================================== import socket import struct import time s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.bind(('0.0.0.0', 9000)) s.listen(1) sock, addr = s.accept() sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1)) while True: sock.send(b'foo') time.sleep(1) ========================================================================== client side code: ========================================================================== import socket import time s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(('server_address', 9000)) while True: data = s.recv(1024) print(data) ========================================================================== Signed-off-by: yupeng <yupeng0921@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07neighbor: Improve garbage collectionDavid Ahern3-36/+90
The existing garbage collection algorithm has a number of problems: 1. The gc algorithm will not evict PERMANENT entries as those entries are managed by userspace, yet the existing algorithm walks the entire hash table which means it always considers PERMANENT entries when looking for entries to evict. In some use cases (e.g., EVPN) there can be tens of thousands of PERMANENT entries leading to wasted CPU cycles when gc kicks in. As an example, with 32k permanent entries, neigh_alloc has been observed taking more than 4 msec per invocation. 2. Currently, when the number of neighbor entries hits gc_thresh2 and the last flush for the table was more than 5 seconds ago gc kicks in walks the entire hash table evicting *all* entries not in PERMANENT or REACHABLE state and not marked as externally learned. There is no discriminator on when the neigh entry was created or if it just moved from REACHABLE to another NUD_VALID state (e.g., NUD_STALE). It is possible for entries to be created or for established neighbor entries to be moved to STALE (e.g., an external node sends an ARP request) right before the 5 second window lapses: -----|---------x|----------|----- t-5 t t+5 If that happens those entries are evicted during gc causing unnecessary thrashing on neighbor entries and userspace caches trying to track them. Further, this contradicts the description of gc_thresh2 which says "Entries older than 5 seconds will be cleared". One workaround is to make gc_thresh2 == gc_thresh3 but that negates the whole point of having separate thresholds. 3. Clearing *all* neigh non-PERMANENT/REACHABLE/externally learned entries when gc_thresh2 is exceeded is over kill and contributes to trashing especially during startup. This patch addresses these problems as follows: 1. Use of a separate list_head to track entries that can be garbage collected along with a separate counter. PERMANENT entries are not added to this list. The gc_thresh parameters are only compared to the new counter, not the total entries in the table. The forced_gc function is updated to only walk this new gc_list looking for entries to evict. 2. Entries are added to the list head at the tail and removed from the front. 3. Entries are only evicted if they were last updated more than 5 seconds ago, adhering to the original intent of gc_thresh2. 4. Forced gc is stopped once the number of gc_entries drops below gc_thresh2. 5. Since gc checks do not apply to PERMANENT entries, gc levels are skipped when allocating a new neighbor for a PERMANENT entry. By extension this means there are no explicit limits on the number of PERMANENT entries that can be created, but this is no different than FIB entries or FDB entries. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07Merge branch 'hns3-error-handling'David S. Miller7-656/+1066
Salil Mehta says: ==================== net: hns3: Additions/optimizations related to HNS3 H/W err handling This patch set primarily does following addtions and optimizations related to error handling in HNS3 Ethernet driver: 1. Name changes for enable and process functions and minor loop optimizations. [PATCH 1-6] 2. Modify query and clearing of RAS errors using new set of commands because modules specific commands for clearing RCB PPP PF, SSU are obselete. [PATCH 7] 3. Deletes logging 1-bit errors for RAS in HNS3 driver as these never get reported to the driver. [PATCH 8] 4. Add handling of NIC hw errors reported through MSIx rather than PCIe AER channel. [PATCH 9] 5. Add handling for the HW RAS and MSIx errors in the modules MAC, PPP PF, MSIx SRAM, RCB and SSU. [PATCH 10-13] 6. Add handling of RoCEE RAS errors. [PATCH 14] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of RDMA RAS errorsShiju Jose3-1/+199
This patch handles the RDMA RAS errors. 1. Enable RAS interrupt, print error detail info and clear error status. 2. Do CORE reset to recovery when these non-fatal errors happened. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: handle hw errors of SSUShiju Jose3-0/+203
This patch enables and handles hw errors of the Storage Switch Unit(SSU). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: handle hw errors of PPU(RCB)Shiju Jose3-0/+180
This patch enables and handles hw RAS and MSIx errors of PPU(RCB). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: handle hw errors of PPP PFShiju Jose1-2/+13
This patch handles PF hw errors of PPP(Programmable Packet Processor). Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of hw errors of MACShiju Jose3-0/+51
This patch adds enable and handling of hw errors of the MAC block. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of hw errors reported through MSIXSalil Mehta6-2/+139
This patch adds handling for HNS3 hardware errors(non-standard) which are reported through MSIX interrupts and not through PCIe AER channel. These MSIX reported hardware errors are handled using common misc. interrupt handler. Hardware error related registers cannot be cleared in context to the interrupt received as they require *heavy* access to hardware using IMP(Integrated Mangement Processor) commands. Hence, we defer the clearing of such error events till later time. Since, we have defered exact identification of errors we will have to defer the level of receovery/reset which might be required. Hence, a new reset type UNKNOWN reset has been introduced which effectively defers the assertion of the reset till we get hold of kind of errors at later time. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: deleted logging 1 bit errorsShiju Jose1-37/+0
This patch deletes logging 1 bit errors for the following reasons. 1. AER does not notify 1 bit errors to the device drivers. However AER reports 1 bit errors to the userspace through the trace_aer_event for logging in the rasdaemon. 2. Firmware clears the status of 1 bit errors in the hw registers. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add handling of hw ras errors using new set of commandsShiju Jose3-170/+331
1. This patch adds handling of hw ras errors using new set of common commands. 2. Updated the error message tables to match the register's name and error status returned by the commands. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: add optimization in the hclge_hw_error_set_stateShiju Jose1-9/+7
1. This patch adds minor loop optimization in the hclge_hw_error_set_state function. 2. Adds logging module's name if it fails to configure the error interrupts. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: rename process_hw_error functionShiju Jose5-6/+6
This patch renames process_hw_error function to handle_hw_ras_error function to match the purpose of the function. This is because hw errors reported through ras and msix interrupts will be handled separately. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: deletes unnecessary settings of the descriptor dataShiju Jose1-22/+5
This patch deletes unnecessary setting of the descriptor data to 0 for disabling error interrupts because it is already done by the hclge_cmd_setup_basic_desc function. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: re-enable error interrupts on hw resetShiju Jose3-7/+10
This patch adds calling hclge_hw_error_set_state function to re-enable the error interrupts those will be disabled on the hw reset. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: rename enable error interrupt functionsShiju Jose3-55/+34
This patch - renames the enable error interrupt functions. The reason is that these functions are used for both enable and disable error interrupts. - removes redundant logs from the enable error interrupt functions. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07net: hns3: remove existing process error functions and reorder hw_blk tableShiju Jose3-475/+18
1.The command interface for queryng and clearing hw errors is changed, which requires the new process error functions to be added. This patch removes all the current process error functions and associated definitions. The new functions to handle ras errors would be added in this patch set. 2. Fixed order issue of the hw_blk table. Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-34/+62
Pull vhost/virtio fixes from Michael Tsirkin: "A couple of last-minute fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost/vsock: fix use-after-free in network stack callers virtio/s390: fix race in ccw_io_helper() virtio/s390: avoid race on vcdev->config vhost/vsock: fix reset orphans race with close timeout
2018-12-07Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds1-1/+1
Pull arm64 fix from Catalin Marinas: "Avoid sending IPIs with interrupts disabled" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: hibernate: Avoid sending cross-calling with interrupts disabled
2018-12-07Merge branch 'support-alu32_arsh'Alexei Starovoitov14-35/+137
Jiong Wang says: ==================== BPF_ALU | BPF_ARSH | BPF_* were rejected by commit: 7891a87efc71 ("bpf: arsh is not supported in 32 bit alu thus reject it"). As explained in the commit message, this is due to there is no complete support for them on interpreter and various JIT compilation back-ends. This patch set is a follow-up which completes the missing bits. This also pave the way for running bpf program compiled with ALU32 instruction enabled by specifing -mattr=+alu32 to LLVM for which case there is likely to have more BPF_ALU | BPF_ARSH insns that will trigger the rejection code. test_verifier.c is updated accordingly. I have tested this patch set on x86-64 and NFP, I need help of review and test on the arch changes (mips/ppc/s390). Note, there might be merge confict on mips change which is better to be applied on top of: commit: 20b880a05f06 ("mips: bpf: fix encoding bug for mm_srlv32_op"), which is on mips-fixes branch at the moment. Thanks. v1->v2: - Fix ppc implementation bug. Should zero high bits explicitly. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07selftests: bpf: update testcases for BPF_ALU | BPF_ARSHJiong Wang1-4/+25
"arsh32 on imm" and "arsh32 on reg" now are accepted. Also added two new testcases to make sure arsh32 won't be treated as arsh64 during interpretation or JIT code-gen for which case the high bits will be moved into low halve that the testcases could catch them. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07bpf: verifier remove the rejection on BPF_ALU | BPF_ARSHJiong Wang1-5/+0
This patch remove the rejection on BPF_ALU | BPF_ARSH as we have supported them on interpreter and all JIT back-ends Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07bpf: interpreter support BPF_ALU | BPF_ARSHJiong Wang1-22/+30
This patch implements interpreting BPF_ALU | BPF_ARSH. Do arithmetic right shift on low 32-bit sub-register, and zero the high 32 bits. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*Jiong Wang1-0/+45
BPF_X support needs indirect shift mode, please see code comments for details. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*Jiong Wang1-0/+12
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*. Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*Jiong Wang3-0/+12
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*. Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_XJiong Wang6-4/+13
Jitting of BPF_K is supported already, but not BPF_X. This patch complete the support for the latter on both MIPS and microMIPS. Cc: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Acked-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07mips: bpf: fix encoding bug for mm_srlv32_opJiong Wang1-1/+1
For micro-mips, srlv inside POOL32A encoding space should use 0x50 sub-opcode, NOT 0x90. Some early version ISA doc describes the encoding as 0x90 for both srlv and srav, this looks to me was a typo. I checked Binutils libopcode implementation which is using 0x50 for srlv and 0x90 for srav. v1->v2: - Keep mm_srlv32_op sorted by value. Fixes: f31318fdf324 ("MIPS: uasm: Add srlv uasm instruction") Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07Merge tag 'gcc-plugins-v4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds2-4/+6
Pull gcc stackleak plugin fixes from Kees Cook: - Remove tracing for inserted stack depth marking function (Anders Roxell) - Move gcc-plugin pass location to avoid objtool warnings (Alexander Popov) * tag 'gcc-plugins-v4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: stackleak: Register the 'stackleak_cleanup' pass before the '*free_cfg' pass stackleak: Mark stackleak_track_stack() as notrace
2018-12-07Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds4-7/+13
Pull crypto fixes from Herbert Xu: - Disable the new crypto stats interface as it's still being changed - Fix potential uses-after-free in cbc/cfb/pcbc. * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: user - Disable statistics interface crypto: do not free algorithm before using
2018-12-07Merge branch 'mlxsw-Un-offload-FDB-on-NVE-detach-attach'David S. Miller12-31/+495
Ido Schimmel says: ==================== mlxsw: Un/offload FDB on NVE detach/attach Petr says: When a VXLAN device is attached to a bridge of a driver capable of offloading such, or upped, the FDB entries already present at the device need to be offloaded. Similarly when an offloaded VXLAN device ceases being interesting (it is downed, or detached, or a front-panel port netdevice is detached from the bridge that the VXLAN device is attached to), any offloaded FDB entries need to be unoffloaded and unmarked. This attach / detach processing is implemented in this patchset. In patch #1, a code pattern is extracted into a named function for easier reuse. In patch #2, vxlan_fdb_replay() is added to send SWITCHDEV_VXLAN_FDB_ADD_TO_DEVICE for each FDB entry with a given VNI. The intention is that the offloading driver will interpret these events like any other and thus offload the FDB entries that existed prior to VXLAN attach. In patches #3 and #4, the functions vxlan_fdb_clear_offload() resp. br_fdb_clear_offload() are added. These clear the offloaded flag at matching FDB entries. In patches #5-#9, we introduce FID-type-specific and NVE-type-specific ops necessary to properly abstract invocations of the replay/clear functions. Finally patch #10 implements the FDB management. In patch #11, the mlxsw-specific test case is extended to check that the management of offload marks under the newly-supported situations is correct. Patch #12, from Ido, exercises the new code paths in actual functional test. v2: - Patch #1: - Modify vxlan_fdb_switchdev_notifier_info() to initialize the structure through a passed-in pointer argument, instead of returning it as a value. - Patch #2: - Adapt to API change in vxlan_fdb_switchdev_notifier_info() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07selftests: forwarding: Add PVID test case for VXLAN with VLAN-aware bridgesIdo Schimmel1-0/+70
When using VLAN-aware bridges with VXLAN, the VLAN that is mapped to the VNI of the VXLAN device is that which is configured as "pvid untagged" on the corresponding bridge port. When these flags are toggled or when the VLAN is deleted entirely, remote hosts should not be able to receive packets from the VTEP. Add a test case for above mentioned scenarios. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07selftests: mlxsw: vxlan: Test FDB un/marking on VXLAN join/leavePetr Machata1-0/+177
When a VXLAN device is attached to an offloaded bridge, or when a front-panel port is attached to a bridge that already has a VXLAN device, mlxsw should offload the existing offloadable FDB entries. Similarly when VXLAN device is downed, the FDB entries are unoffloaded, and the marks thus need to be cleared. Similarly when a front-panel port device is attached to a bridge with a VXLAN device, or when VLAN flags are tweaked on a VXLAN port attached to a VLAN-aware bridge. Test that the replaying / clearing logic works by observing transitions in presence of offload marks under different scenarios. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07mlxsw: spectrum_nve: Un/offload FDB on nve_fid_disable/enablePetr Machata1-0/+41
Any existing NVE FDB entries need to be offloaded when NVE is enabled for a given FID. Recent patches have added fdb_replay op for this, so just invoke it from mlxsw_sp_nve_fid_enable(). When NVE is disabled on a FID, any existing FDB offloaded marks need to be cleared on NVE device as well as on its bridge master. An op to handle this, fdb_clear_offload, has been added to FID ops and NVE ops in previous patches. Add code to resolve the NVE device, NVE type, and dispatch to both fdb_clear_offload ops. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07mlxsw: spectrum: Add mlxsw_sp_fid_ops.fdb_clear_offloadPetr Machata2-0/+30
If there are any offloaded FDB entries at bridge master of an NVE device at the time that it's un-offloaded, their offloaded marks need to be cleared. How that is done depends on whether the bridge in question is vlan aware. Therefore add a per-FID-type operation. Implement the operation for the 802.1q and 802.1d bridges. Add and publish a function mlxsw_sp_fid_fdb_clear_offload() to dispatch to the new operation according to FID type. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07mlxsw: spectrum_nve: Add mlxsw_sp_nve_ops.fdb_clear_offloadPetr Machata2-0/+11
If there are any offloaded FDB entries at an NVE device at the time that it's un-offloaded, their offloaded marks need to be cleared. How that is done depends on NVE device type, and therefore add a per-NVE-type operation. Implement the operation for the sole NVE device type currently supported by mlxsw, VXLAN. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>