aboutsummaryrefslogtreecommitdiffstats
path: root/include (follow)
AgeCommit message (Collapse)AuthorFilesLines
2021-07-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller4-20/+53
Daniel Borkmann says: ==================== pull-request: bpf 2021-07-29 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 14 day(s) which contain a total of 20 files changed, 446 insertions(+), 138 deletions(-). The main changes are: 1) Fix UBSAN out-of-bounds splat for showing XDP link fdinfo, from Lorenz Bauer. 2) Fix insufficient Spectre v4 mitigation in BPF runtime, from Daniel Borkmann, Piotr Krysiuk and Benedict Schlueter. 3) Batch of fixes for BPF sockmap found under stress testing, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-29bpf: Fix leakage due to insufficient speculative store bypass mitigationDaniel Borkmann1-1/+1
Spectre v4 gadgets make use of memory disambiguation, which is a set of techniques that execute memory access instructions, that is, loads and stores, out of program order; Intel's optimization manual, section 2.4.4.5: A load instruction micro-op may depend on a preceding store. Many microarchitectures block loads until all preceding store addresses are known. The memory disambiguator predicts which loads will not depend on any previous stores. When the disambiguator predicts that a load does not have such a dependency, the load takes its data from the L1 data cache. Eventually, the prediction is verified. If an actual conflict is detected, the load and all succeeding instructions are re-executed. af86ca4e3088 ("bpf: Prevent memory disambiguation attack") tried to mitigate this attack by sanitizing the memory locations through preemptive "fast" (low latency) stores of zero prior to the actual "slow" (high latency) store of a pointer value such that upon dependency misprediction the CPU then speculatively executes the load of the pointer value and retrieves the zero value instead of the attacker controlled scalar value previously stored at that location, meaning, subsequent access in the speculative domain is then redirected to the "zero page". The sanitized preemptive store of zero prior to the actual "slow" store is done through a simple ST instruction based on r10 (frame pointer) with relative offset to the stack location that the verifier has been tracking on the original used register for STX, which does not have to be r10. Thus, there are no memory dependencies for this store, since it's only using r10 and immediate constant of zero; hence af86ca4e3088 /assumed/ a low latency operation. However, a recent attack demonstrated that this mitigation is not sufficient since the preemptive store of zero could also be turned into a "slow" store and is thus bypassed as well: [...] // r2 = oob address (e.g. scalar) // r7 = pointer to map value 31: (7b) *(u64 *)(r10 -16) = r2 // r9 will remain "fast" register, r10 will become "slow" register below 32: (bf) r9 = r10 // JIT maps BPF reg to x86 reg: // r9 -> r15 (callee saved) // r10 -> rbp // train store forward prediction to break dependency link between both r9 // and r10 by evicting them from the predictor's LRU table. 33: (61) r0 = *(u32 *)(r7 +24576) 34: (63) *(u32 *)(r7 +29696) = r0 35: (61) r0 = *(u32 *)(r7 +24580) 36: (63) *(u32 *)(r7 +29700) = r0 37: (61) r0 = *(u32 *)(r7 +24584) 38: (63) *(u32 *)(r7 +29704) = r0 39: (61) r0 = *(u32 *)(r7 +24588) 40: (63) *(u32 *)(r7 +29708) = r0 [...] 543: (61) r0 = *(u32 *)(r7 +25596) 544: (63) *(u32 *)(r7 +30716) = r0 // prepare call to bpf_ringbuf_output() helper. the latter will cause rbp // to spill to stack memory while r13/r14/r15 (all callee saved regs) remain // in hardware registers. rbp becomes slow due to push/pop latency. below is // disasm of bpf_ringbuf_output() helper for better visual context: // // ffffffff8117ee20: 41 54 push r12 // ffffffff8117ee22: 55 push rbp // ffffffff8117ee23: 53 push rbx // ffffffff8117ee24: 48 f7 c1 fc ff ff ff test rcx,0xfffffffffffffffc // ffffffff8117ee2b: 0f 85 af 00 00 00 jne ffffffff8117eee0 <-- jump taken // [...] // ffffffff8117eee0: 49 c7 c4 ea ff ff ff mov r12,0xffffffffffffffea // ffffffff8117eee7: 5b pop rbx // ffffffff8117eee8: 5d pop rbp // ffffffff8117eee9: 4c 89 e0 mov rax,r12 // ffffffff8117eeec: 41 5c pop r12 // ffffffff8117eeee: c3 ret 545: (18) r1 = map[id:4] 547: (bf) r2 = r7 548: (b7) r3 = 0 549: (b7) r4 = 4 550: (85) call bpf_ringbuf_output#194288 // instruction 551 inserted by verifier \ 551: (7a) *(u64 *)(r10 -16) = 0 | /both/ are now slow stores here // storing map value pointer r7 at fp-16 | since value of r10 is "slow". 552: (7b) *(u64 *)(r10 -16) = r7 / // following "fast" read to the same memory location, but due to dependency // misprediction it will speculatively execute before insn 551/552 completes. 553: (79) r2 = *(u64 *)(r9 -16) // in speculative domain contains attacker controlled r2. in non-speculative // domain this contains r7, and thus accesses r7 +0 below. 554: (71) r3 = *(u8 *)(r2 +0) // leak r3 As can be seen, the current speculative store bypass mitigation which the verifier inserts at line 551 is insufficient since /both/, the write of the zero sanitation as well as the map value pointer are a high latency instruction due to prior memory access via push/pop of r10 (rbp) in contrast to the low latency read in line 553 as r9 (r15) which stays in hardware registers. Thus, architecturally, fp-16 is r7, however, microarchitecturally, fp-16 can still be r2. Initial thoughts to address this issue was to track spilled pointer loads from stack and enforce their load via LDX through r10 as well so that /both/ the preemptive store of zero /as well as/ the load use the /same/ register such that a dependency is created between the store and load. However, this option is not sufficient either since it can be bypassed as well under speculation. An updated attack with pointer spill/fills now _all_ based on r10 would look as follows: [...] // r2 = oob address (e.g. scalar) // r7 = pointer to map value [...] // longer store forward prediction training sequence than before. 2062: (61) r0 = *(u32 *)(r7 +25588) 2063: (63) *(u32 *)(r7 +30708) = r0 2064: (61) r0 = *(u32 *)(r7 +25592) 2065: (63) *(u32 *)(r7 +30712) = r0 2066: (61) r0 = *(u32 *)(r7 +25596) 2067: (63) *(u32 *)(r7 +30716) = r0 // store the speculative load address (scalar) this time after the store // forward prediction training. 2068: (7b) *(u64 *)(r10 -16) = r2 // preoccupy the CPU store port by running sequence of dummy stores. 2069: (63) *(u32 *)(r7 +29696) = r0 2070: (63) *(u32 *)(r7 +29700) = r0 2071: (63) *(u32 *)(r7 +29704) = r0 2072: (63) *(u32 *)(r7 +29708) = r0 2073: (63) *(u32 *)(r7 +29712) = r0 2074: (63) *(u32 *)(r7 +29716) = r0 2075: (63) *(u32 *)(r7 +29720) = r0 2076: (63) *(u32 *)(r7 +29724) = r0 2077: (63) *(u32 *)(r7 +29728) = r0 2078: (63) *(u32 *)(r7 +29732) = r0 2079: (63) *(u32 *)(r7 +29736) = r0 2080: (63) *(u32 *)(r7 +29740) = r0 2081: (63) *(u32 *)(r7 +29744) = r0 2082: (63) *(u32 *)(r7 +29748) = r0 2083: (63) *(u32 *)(r7 +29752) = r0 2084: (63) *(u32 *)(r7 +29756) = r0 2085: (63) *(u32 *)(r7 +29760) = r0 2086: (63) *(u32 *)(r7 +29764) = r0 2087: (63) *(u32 *)(r7 +29768) = r0 2088: (63) *(u32 *)(r7 +29772) = r0 2089: (63) *(u32 *)(r7 +29776) = r0 2090: (63) *(u32 *)(r7 +29780) = r0 2091: (63) *(u32 *)(r7 +29784) = r0 2092: (63) *(u32 *)(r7 +29788) = r0 2093: (63) *(u32 *)(r7 +29792) = r0 2094: (63) *(u32 *)(r7 +29796) = r0 2095: (63) *(u32 *)(r7 +29800) = r0 2096: (63) *(u32 *)(r7 +29804) = r0 2097: (63) *(u32 *)(r7 +29808) = r0 2098: (63) *(u32 *)(r7 +29812) = r0 // overwrite scalar with dummy pointer; same as before, also including the // sanitation store with 0 from the current mitigation by the verifier. 2099: (7a) *(u64 *)(r10 -16) = 0 | /both/ are now slow stores here 2100: (7b) *(u64 *)(r10 -16) = r7 | since store unit is still busy. // load from stack intended to bypass stores. 2101: (79) r2 = *(u64 *)(r10 -16) 2102: (71) r3 = *(u8 *)(r2 +0) // leak r3 [...] Looking at the CPU microarchitecture, the scheduler might issue loads (such as seen in line 2101) before stores (line 2099,2100) because the load execution units become available while the store execution unit is still busy with the sequence of dummy stores (line 2069-2098). And so the load may use the prior stored scalar from r2 at address r10 -16 for speculation. The updated attack may work less reliable on CPU microarchitectures where loads and stores share execution resources. This concludes that the sanitizing with zero stores from af86ca4e3088 ("bpf: Prevent memory disambiguation attack") is insufficient. Moreover, the detection of stack reuse from af86ca4e3088 where previously data (STACK_MISC) has been written to a given stack slot where a pointer value is now to be stored does not have sufficient coverage as precondition for the mitigation either; for several reasons outlined as follows: 1) Stack content from prior program runs could still be preserved and is therefore not "random", best example is to split a speculative store bypass attack between tail calls, program A would prepare and store the oob address at a given stack slot and then tail call into program B which does the "slow" store of a pointer to the stack with subsequent "fast" read. From program B PoV such stack slot type is STACK_INVALID, and therefore also must be subject to mitigation. 2) The STACK_SPILL must not be coupled to register_is_const(&stack->spilled_ptr) condition, for example, the previous content of that memory location could also be a pointer to map or map value. Without the fix, a speculative store bypass is not mitigated in such precondition and can then lead to a type confusion in the speculative domain leaking kernel memory near these pointer types. While brainstorming on various alternative mitigation possibilities, we also stumbled upon a retrospective from Chrome developers [0]: [...] For variant 4, we implemented a mitigation to zero the unused memory of the heap prior to allocation, which cost about 1% when done concurrently and 4% for scavenging. Variant 4 defeats everything we could think of. We explored more mitigations for variant 4 but the threat proved to be more pervasive and dangerous than we anticipated. For example, stack slots used by the register allocator in the optimizing compiler could be subject to type confusion, leading to pointer crafting. Mitigating type confusion for stack slots alone would have required a complete redesign of the backend of the optimizing compiler, perhaps man years of work, without a guarantee of completeness. [...] From BPF side, the problem space is reduced, however, options are rather limited. One idea that has been explored was to xor-obfuscate pointer spills to the BPF stack: [...] // preoccupy the CPU store port by running sequence of dummy stores. [...] 2106: (63) *(u32 *)(r7 +29796) = r0 2107: (63) *(u32 *)(r7 +29800) = r0 2108: (63) *(u32 *)(r7 +29804) = r0 2109: (63) *(u32 *)(r7 +29808) = r0 2110: (63) *(u32 *)(r7 +29812) = r0 // overwrite scalar with dummy pointer; xored with random 'secret' value // of 943576462 before store ... 2111: (b4) w11 = 943576462 2112: (af) r11 ^= r7 2113: (7b) *(u64 *)(r10 -16) = r11 2114: (79) r11 = *(u64 *)(r10 -16) 2115: (b4) w2 = 943576462 2116: (af) r2 ^= r11 // ... and restored with the same 'secret' value with the help of AX reg. 2117: (71) r3 = *(u8 *)(r2 +0) [...] While the above would not prevent speculation, it would make data leakage infeasible by directing it to random locations. In order to be effective and prevent type confusion under speculation, such random secret would have to be regenerated for each store. The additional complexity involved for a tracking mechanism that prevents jumps such that restoring spilled pointers would not get corrupted is not worth the gain for unprivileged. Hence, the fix in here eventually opted for emitting a non-public BPF_ST | BPF_NOSPEC instruction which the x86 JIT translates into a lfence opcode. Inserting the latter in between the store and load instruction is one of the mitigations options [1]. The x86 instruction manual notes: [...] An LFENCE that follows an instruction that stores to memory might complete before the data being stored have become globally visible. [...] The latter meaning that the preceding store instruction finished execution and the store is at minimum guaranteed to be in the CPU's store queue, but it's not guaranteed to be in that CPU's L1 cache at that point (globally visible). The latter would only be guaranteed via sfence. So the load which is guaranteed to execute after the lfence for that local CPU would have to rely on store-to-load forwarding. [2], in section 2.3 on store buffers says: [...] For every store operation that is added to the ROB, an entry is allocated in the store buffer. This entry requires both the virtual and physical address of the target. Only if there is no free entry in the store buffer, the frontend stalls until there is an empty slot available in the store buffer again. Otherwise, the CPU can immediately continue adding subsequent instructions to the ROB and execute them out of order. On Intel CPUs, the store buffer has up to 56 entries. [...] One small upside on the fix is that it lifts constraints from af86ca4e3088 where the sanitize_stack_off relative to r10 must be the same when coming from different paths. The BPF_ST | BPF_NOSPEC gets emitted after a BPF_STX or BPF_ST instruction. This happens either when we store a pointer or data value to the BPF stack for the first time, or upon later pointer spills. The former needs to be enforced since otherwise stale stack data could be leaked under speculation as outlined earlier. For non-x86 JITs the BPF_ST | BPF_NOSPEC mapping is currently optimized away, but others could emit a speculation barrier as well if necessary. For real-world unprivileged programs e.g. generated by LLVM, pointer spill/fill is only generated upon register pressure and LLVM only tries to do that for pointers which are not used often. The program main impact will be the initial BPF_ST | BPF_NOSPEC sanitation for the STACK_INVALID case when the first write to a stack slot occurs e.g. upon map lookup. In future we might refine ways to mitigate the latter cost. [0] https://arxiv.org/pdf/1902.05178.pdf [1] https://msrc-blog.microsoft.com/2018/05/21/analysis-and-mitigation-of-speculative-store-bypass-cve-2018-3639/ [2] https://arxiv.org/pdf/1905.05725.pdf Fixes: af86ca4e3088 ("bpf: Prevent memory disambiguation attack") Fixes: f7cf25b2026d ("bpf: track spill/fill of constants") Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-29bpf: Introduce BPF nospec instruction for mitigating Spectre v4Daniel Borkmann1-0/+15
In case of JITs, each of the JIT backends compiles the BPF nospec instruction /either/ to a machine instruction which emits a speculation barrier /or/ to /no/ machine instruction in case the underlying architecture is not affected by Speculative Store Bypass or has different mitigations in place already. This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence' instruction for mitigation. In case of arm64, we rely on the firmware mitigation as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled, it works for all of the kernel code with no need to provide any additional instructions here (hence only comment in arm64 JIT). Other archs can follow as needed. The BPF nospec instruction is specifically targeting Spectre v4 since i) we don't use a serialization barrier for the Spectre v1 case, and ii) mitigation instructions for v1 and v4 might be different on some archs. The BPF nospec is required for a future commit, where the BPF verifier does annotate intermediate BPF programs with speculation barriers. Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-27bpf, sockmap: Fix memleak on ingress msg enqueueJohn Fastabend1-19/+35
If backlog handler is running during a tear down operation we may enqueue data on the ingress msg queue while tear down is trying to free it. sk_psock_backlog() sk_psock_handle_skb() skb_psock_skb_ingress() sk_psock_skb_ingress_enqueue() sk_psock_queue_msg(psock,msg) spin_lock(ingress_lock) sk_psock_zap_ingress() _sk_psock_purge_ingerss_msg() _sk_psock_purge_ingress_msg() -- free ingress_msg list -- spin_unlock(ingress_lock) spin_lock(ingress_lock) list_add_tail(msg,ingress_msg) <- entry on list with no one left to free it. spin_unlock(ingress_lock) To fix we only enqueue from backlog if the ENABLED bit is set. The tear down logic clears the bit with ingress_lock set so we wont enqueue the msg in the last step. Fixes: 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210727160500.1713554-4-john.fastabend@gmail.com
2021-07-27net: llc: fix skb_over_panicPavel Skripkin1-8/+23
Syzbot reported skb_over_panic() in llc_pdu_init_as_xid_cmd(). The problem was in wrong LCC header manipulations. Syzbot's reproducer tries to send XID packet. llc_ui_sendmsg() is doing following steps: 1. skb allocation with size = len + header size len is passed from userpace and header size is 3 since addr->sllc_xid is set. 2. skb_reserve() for header_len = 3 3. filling all other space with memcpy_from_msg() Ok, at this moment we have fully loaded skb, only headers needs to be filled. Then code comes to llc_sap_action_send_xid_c(). This function pushes 3 bytes for LLC PDU header and initializes it. Then comes llc_pdu_init_as_xid_cmd(). It initalizes next 3 bytes *AFTER* LLC PDU header and call skb_push(skb, 3). This looks wrong for 2 reasons: 1. Bytes rigth after LLC header are user data, so this function was overwriting payload. 2. skb_push(skb, 3) call can cause skb_over_panic() since all free space was filled in llc_ui_sendmsg(). (This can happen is user passed 686 len: 686 + 14 (eth header) + 3 (LLC header) = 703. SKB_DATA_ALIGN(703) = 704) So, in this patch I added 2 new private constansts: LLC_PDU_TYPE_U_XID and LLC_PDU_LEN_U_XID. LLC_PDU_LEN_U_XID is used to correctly reserve header size to handle LLC + XID case. LLC_PDU_TYPE_U_XID is used by llc_pdu_header_init() function to push 6 bytes instead of 3. And finally I removed skb_push() call from llc_pdu_init_as_xid_cmd(). This changes should not affect other parts of LLC, since after all steps we just transmit buffer. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+5e5a981ad7cc54c4b2b4@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25sctp: send pmtu probe only if packet loss in Search Complete stateXin Long1-0/+1
This patch is to introduce last_rtx_chunks into sctp_transport to detect if there's any packet retransmission/loss happened by checking against asoc's rtx_data_chunks in sctp_transport_pl_send(). If there is, namely, transport->last_rtx_chunks != asoc->rtx_data_chunks, the pmtu probe will be sent out. Otherwise, increment the pl.raise_count and return when it's in Search Complete state. With this patch, if in Search Complete state, which is a long period, it doesn't need to keep probing the current pmtu unless there's data packet loss. This will save quite some traffic. v1->v2: - add the missing Fixes tag. Fixes: 0dac127c0557 ("sctp: do black hole detection in search complete state") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25sctp: improve the code for pmtu probe send and recv updateXin Long1-2/+2
This patch does 3 things: - make sctp_transport_pl_send() and sctp_transport_pl_recv() return bool type to decide if more probe is needed to send. - pr_debug() only when probe is really needed to send. - count pl.raise_count in sctp_transport_pl_send() instead of sctp_transport_pl_recv(), and it's only incremented for the 1st probe for the same size. These are preparations for the next patch to make probes happen only when there's packet loss in Search Complete state. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-22Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds1-25/+1
Pull arm64 fixes from Will Deacon: "A pair of arm64 fixes for -rc3. The straightforward one is a fix to our firmware calling stub, which accidentally started corrupting the link register on machines with SVE. Since these machines don't really exist yet, it wasn't spotted in -next. The other fix is a revert-and-a-bit of a patch originally intended to allow PTE-level huge mappings for the VMAP area on 32-bit PPC 8xx. A side-effect of this change was that our pXd_set_huge() implementations could be replaced with generic dummy functions depending on the levels of page-table being used, which in turn broke the boot if we fail to create the linear mapping as a result of using these functions to operate on the pgd. Huge thanks to Michael Ellerman for modifying the revert so as not to regress PPC 8xx in terms of functionality. Anyway, that's the background and it's also available in the commit message along with Link tags pointing at all of the fun. Summary: - Fix hang when issuing SMC on SVE-capable system due to clobbered LR - Fix boot failure due to missing block mappings with folded page-table" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge" arm64: smccc: Save lr before calling __arm_smccc_sve_check()
2021-07-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds3-3/+28
Pull networking fixes from David Miller: 1) Fix type of bind option flag in af_xdp, from Baruch Siach. 2) Fix use after free in bpf_xdp_link_release(), from Xuan Zhao. 3) PM refcnt imbakance in r8152, from Takashi Iwai. 4) Sign extension ug in liquidio, from Colin Ian King. 5) Mising range check in s390 bpf jit, from Colin Ian King. 6) Uninit value in caif_seqpkt_sendmsg(), from Ziyong Xuan. 7) Fix skb page recycling race, from Ilias Apalodimas. 8) Fix memory leak in tcindex_partial_destroy_work, from Pave Skripkin. 9) netrom timer sk refcnt issues, from Nguyen Dinh Phi. 10) Fix data races aroun tcp's tfo_active_disable_stamp, from Eric Dumazet. 11) act_skbmod should only operate on ethernet packets, from Peilin Ye. 12) Fix slab out-of-bpunds in fib6_nh_flush_exceptions(),, from Psolo Abeni. 13) Fix sparx5 dependencies, from Yajun Deng. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits) dpaa2-switch: seed the buffer pool after allocating the swp net: sched: cls_api: Fix the the wrong parameter net: sparx5: fix unmet dependencies warning net: dsa: tag_ksz: dont let the hardware process the layer 4 checksum net: dsa: ensure linearized SKBs in case of tail taggers ravb: Remove extra TAB ravb: Fix a typo in comment net: dsa: sja1105: make VID 4095 a bridge VLAN too tcp: disable TFO blackhole logic by default sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set net: ixp46x: fix ptp build failure ibmvnic: Remove the proper scrq flush selftests: net: add ESP-in-UDP PMTU test udp: check encap socket in __udp_lib_err sctp: update active_key for asoc when old key is being replaced r8169: Avoid duplicate sysfs entry creation error ixgbe: Fix packet corruption due to missing DMA sync Revert "qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()" ipv6: fix another slab-out-of-bounds in fib6_nh_flush_exceptions fsl/fman: Add fibre support ...
2021-07-21Merge tag 'regulator-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulatorLinus Torvalds1-2/+2
Pull regulator fixes from Mark Brown: "A few driver specific fixes that came in since the merge window, plus a change to mark the regulator-fixed-domain DT binding as deprecated in order to try to to discourage any new users while a better solution is put in place" * tag 'regulator-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: hi6421: Fix getting wrong drvdata regulator: mtk-dvfsrc: Fix wrong dev pointer for devm_regulator_register regulator: fixed: Mark regulator-fixed-domain as deprecated regulator: bd9576: Fix testing wrong flag in check_temp_flag_mismatch regulator: hi6421v600: Fix getting wrong drvdata that causes boot failure regulator: rt5033: Fix n_voltages settings for BUCK and LDO regulator: rtmv20: Fix wrong mask for strobe-polarity-high
2021-07-21Merge tag 'afs-fixes-20210721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fsLinus Torvalds1-5/+62
Pull AFS fixes from David Howells: - Fix a tracepoint that causes one of the tracing subsystem query files to crash if the module is loaded - Fix afs_writepages() to take account of whether the storage rpc actually succeeded when updating the cyclic writeback counter - Fix some error code propagation/handling - Fix place where afs_writepages() was setting writeback_index to a file position rather than a page index * tag 'afs-fixes-20210721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Remove redundant assignment to ret afs: Fix setting of writeback_index afs: check function return afs: Fix tracepoint string placement with built-in AFS
2021-07-21afs: Fix tracepoint string placement with built-in AFSDavid Howells1-5/+62
To quote Alexey[1]: I was adding custom tracepoint to the kernel, grabbed full F34 kernel .config, disabled modules and booted whole shebang as VM kernel. Then did perf record -a -e ... It crashed: general protection fault, probably for non-canonical address 0x435f5346592e4243: 0000 [#1] SMP PTI CPU: 1 PID: 842 Comm: cat Not tainted 5.12.6+ #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:t_show+0x22/0xd0 Then reproducer was narrowed to # cat /sys/kernel/tracing/printk_formats Original F34 kernel with modules didn't crash. So I started to disable options and after disabling AFS everything started working again. The root cause is that AFS was placing char arrays content into a section full of _pointers_ to strings with predictable consequences. Non canonical address 435f5346592e4243 is "CB.YFS_" which came from CM_NAME macro. Steps to reproduce: CONFIG_AFS=y CONFIG_TRACING=y # cat /sys/kernel/tracing/printk_formats Fix this by the following means: (1) Add enum->string translation tables in the event header with the AFS and YFS cache/callback manager operations listed by RPC operation ID. (2) Modify the afs_cb_call tracepoint to print the string from the translation table rather than using the string at the afs_call name pointer. (3) Switch translation table depending on the service we're being accessed as (AFS or YFS) in the tracepoint print clause. Will this cause problems to userspace utilities? Note that the symbolic representation of the YFS service ID isn't available to this header, so I've put it in as a number. I'm not sure if this is the best way to do this. (4) Remove the name wrangling (CM_NAME) macro and put the names directly into the afs_call_type structs in cmservice.c. Fixes: 8e8d7f13b6d5a9 ("afs: Add some tracepoints") Reported-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/YLAXfvZ+rObEOdc%2F@localhost.localdomain/ [1] Link: https://lore.kernel.org/r/643721.1623754699@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162430903582.2896199.6098150063997983353.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162609463957.3133237.15916579353149746363.stgit@warthog.procyon.org.uk/ # v1 (repost) Link: https://lore.kernel.org/r/162610726860.3408253.445207609466288531.stgit@warthog.procyon.org.uk/ # v2
2021-07-21Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"Jonathan Marek1-25/+1
This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf. c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") breaks arm64 in at least two ways for configurations where PUD or PMD folding occur: 1. We no longer install huge-vmap mappings and silently fall back to page-granular entries, despite being able to install block entries at what is effectively the PGD level. 2. If the linear map is backed with block mappings, these will now silently fail to be created in alloc_init_pud(), causing a panic early during boot. The pgtable selftests caught this, although a fix has not been forthcoming and Christophe is AWOL at the moment, so just revert the change for now to get a working -rc3 on which we can queue patches for 5.15. A simple revert breaks the build for 32-bit PowerPC 8xx machines, which rely on the default function definitions when the corresponding page-table levels are folded, since commit a6a8f7c4aa7e ("powerpc/8xx: add support for huge pages on VMAP and VMALLOC"), eg: powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range': linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge' To avoid that, add stubs for pud_clear_huge() and pmd_clear_huge() in arch/powerpc/mm/nohash/8xx.c as suggested by Christophe. Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> [mpe: Fold in 8xx.c changes from Christophe and mention in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/ Link: https://lore.kernel.org/r/20210717160118.9855-1-jonathan@marek.ca Link: https://lore.kernel.org/r/87r1fs1762.fsf@mpe.ellerman.id.au Signed-off-by: Will Deacon <will@kernel.org>
2021-07-20net/tcp_fastopen: remove obsolete externEric Dumazet1-1/+0
After cited commit, sysctl_tcp_fastopen_blackhole_timeout is no longer a global variable. Fixes: 3733be14a32b ("ipv4: Namespaceify tcp_fastopen_blackhole_timeout knob") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Wei Wang <weiwan@google.com> Link: https://lore.kernel.org/r/20210719092028.3016745-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-19bpf: Fix OOB read when printing XDP link fdinfoLorenz Bauer1-0/+1
We got the following UBSAN report on one of our testing machines: ================================================================================ UBSAN: array-index-out-of-bounds in kernel/bpf/syscall.c:2389:24 index 6 is out of range for type 'char *[6]' CPU: 43 PID: 930921 Comm: systemd-coredum Tainted: G O 5.10.48-cloudflare-kasan-2021.7.0 #1 Hardware name: <snip> Call Trace: dump_stack+0x7d/0xa3 ubsan_epilogue+0x5/0x40 __ubsan_handle_out_of_bounds.cold+0x43/0x48 ? seq_printf+0x17d/0x250 bpf_link_show_fdinfo+0x329/0x380 ? bpf_map_value_size+0xe0/0xe0 ? put_files_struct+0x20/0x2d0 ? __kasan_kmalloc.constprop.0+0xc2/0xd0 seq_show+0x3f7/0x540 seq_read_iter+0x3f8/0x1040 seq_read+0x329/0x500 ? seq_read_iter+0x1040/0x1040 ? __fsnotify_parent+0x80/0x820 ? __fsnotify_update_child_dentry_flags+0x380/0x380 vfs_read+0x123/0x460 ksys_read+0xed/0x1c0 ? __x64_sys_pwrite64+0x1f0/0x1f0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 <snip> ================================================================================ ================================================================================ UBSAN: object-size-mismatch in kernel/bpf/syscall.c:2384:2 From the report, we can infer that some array access in bpf_link_show_fdinfo at index 6 is out of bounds. The obvious candidate is bpf_link_type_strs[BPF_LINK_TYPE_XDP] with BPF_LINK_TYPE_XDP == 6. It turns out that BPF_LINK_TYPE_XDP is missing from bpf_types.h and therefore doesn't have an entry in bpf_link_type_strs: pos: 0 flags: 02000000 mnt_id: 13 link_type: (null) link_id: 4 prog_tag: bcf7977d3b93787c prog_id: 4 ifindex: 1 Fixes: aa8d3a716b59 ("bpf, xdp: Add bpf_link-based XDP attachment API") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210719085134.43325-2-lmb@cloudflare.com
2021-07-17Merge tag 'soc-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds4-60/+207
Pull ARM SoC fixes from Arnd Bergmann: "Here are the patches for this week that came as the fallout of the merge window: - Two fixes for the NVidia memory controller driver - multiple defconfig files get patched to turn CONFIG_FB back on after that is no longer selected by CONFIG_DRM - ffa and scmpi firmware drivers fixes, mostly addressing compiler and documentation warnings - Platform specific fixes for device tree files on ASpeed, Renesas and NVidia SoC, mostly for recent regressions. - A workaround for a regression on the USB PHY with devlink when the usb-nop-xceiv driver is not available until the rootfs is mounted. - Device tree compiler warnings in Arm Versatile-AB" * tag 'soc-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits) ARM: dts: versatile: Fix up interrupt controller node names ARM: multi_v7_defconfig: Make NOP_USB_XCEIV driver built-in ARM: configs: Update u8500_defconfig ARM: configs: Update Vexpress defconfig ARM: configs: Update Versatile defconfig ARM: configs: Update RealView defconfig ARM: configs: Update Integrator defconfig arm: Typo s/PCI_IXP4XX_LEGACY/IXP4XX_PCI_LEGACY/ firmware: arm_scmi: Fix range check for the maximum number of pending messages firmware: arm_scmi: Avoid padding in sensor message structure firmware: arm_scmi: Fix kernel doc warnings about return values firmware: arm_scpi: Fix kernel doc warnings firmware: arm_scmi: Fix kernel doc warnings ARM: shmobile: defconfig: Restore graphical consoles firmware: arm_ffa: Fix a possible ffa_linux_errmap buffer overflow firmware: arm_ffa: Fix the comment style firmware: arm_ffa: Simplify probe function firmware: arm_ffa: Ensure drivers provide a probe function firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow firmware: arm_scmi: Ensure drivers provide a probe function ...
2021-07-16Merge tag 'scmi-fixes-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixesArnd Bergmann2-5/+17
ARM SCMI fixes for v5.14 A small set of fixes: - adding check for presence of probe while registering the driver to prevent NULL pointer access - dropping the duplicate check as the driver core already takes care of it - fix for possible scmi_linux_errmap buffer overflow - fix to avoid sensor message structure padding - fix the range check for the maximum number of pending SCMI messages - fix for various kernel-doc warnings * tag 'scmi-fixes-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Fix range check for the maximum number of pending messages firmware: arm_scmi: Avoid padding in sensor message structure firmware: arm_scmi: Fix kernel doc warnings about return values firmware: arm_scpi: Fix kernel doc warnings firmware: arm_scmi: Fix kernel doc warnings firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow firmware: arm_scmi: Ensure drivers provide a probe function firmware: arm_scmi: Simplify device probe function on the bus Link: https://lore.kernel.org/r/20210714165831.2617437-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-07-16Merge tag 'renesas-fixes-for-v5.14-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixesArnd Bergmann1-53/+183
Renesas fixes for v5.14 - Fix a clock/reset handling design issue on the new RZ/G2L SoC, requiring an atomic change to DT binding definitions, clock driver, and DTS, - Restore graphical consoles in the shmobile_defconfig. * tag 'renesas-fixes-for-v5.14-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: ARM: shmobile: defconfig: Restore graphical consoles dt-bindings: clock: r9a07g044-cpg: Update clock/reset definitions clk: renesas: r9a07g044: Add P2 Clock support clk: renesas: r9a07g044: Fix P1 Clock clk: renesas: r9a07g044: Rename divider table clk: renesas: rzg2l: Add multi clock PM support Link: https://lore.kernel.org/r/cover.1626253929.git.geert+renesas@glider.be Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-07-16Merge tag 'memory-controller-drv-tegra-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into arm/fixesArnd Bergmann1-2/+7
Memory controller drivers for v5.14 - Tegra SoC, late fixes Two fixes for recent series of changes in Tegra SoC memory controller drivers: 1. Add a stub for tegra_mc_probe_device() to fix compile testing of arm-smmu without TEGRA_MC. 2. Fix arm-smmu dtschema syntax. * tag 'memory-controller-drv-tegra-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl: dt-bindings: arm-smmu: Fix json-schema syntax memory: tegra: Add compile-test stub for tegra_mc_probe_device() Link: https://lore.kernel.org/r/20210625073604.13562-1-krzysztof.kozlowski@canonical.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-07-16bpf: Fix pointer arithmetic mask tightening under state pruningDaniel Borkmann1-0/+1
In 7fedb63a8307 ("bpf: Tighten speculative pointer arithmetic mask") we narrowed the offset mask for unprivileged pointer arithmetic in order to mitigate a corner case where in the speculative domain it is possible to advance, for example, the map value pointer by up to value_size-1 out-of- bounds in order to leak kernel memory via side-channel to user space. The verifier's state pruning for scalars leaves one corner case open where in the first verification path R_x holds an unknown scalar with an aux->alu_limit of e.g. 7, and in a second verification path that same register R_x, here denoted as R_x', holds an unknown scalar which has tighter bounds and would thus satisfy range_within(R_x, R_x') as well as tnum_in(R_x, R_x') for state pruning, yielding an aux->alu_limit of 3: Given the second path fits the register constraints for pruning, the final generated mask from aux->alu_limit will remain at 7. While technically not wrong for the non-speculative domain, it would however be possible to craft similar cases where the mask would be too wide as in 7fedb63a8307. One way to fix it is to detect the presence of unknown scalar map pointer arithmetic and force a deeper search on unknown scalars to ensure that we do not run into a masking mismatch. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-15Merge tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linuxLinus Torvalds1-1/+1
Pull fallthrough fixes from Gustavo Silva: "This fixes many fall-through warnings when building with Clang and -Wimplicit-fallthrough, and also enables -Wimplicit-fallthrough for Clang, globally. It's also important to notice that since we have adopted the use of the pseudo-keyword macro fallthrough, we also want to avoid having more /* fall through */ comments being introduced. Contrary to GCC, Clang doesn't recognize any comments as implicit fall-through markings when the -Wimplicit-fallthrough option is enabled. So, in order to avoid having more comments being introduced, we use the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang, will cause a warning in case a code comment is intended to be used as a fall-through marking. The patch for Makefile also enforces this. We had almost 4,000 of these issues for Clang in the beginning, and there might be a couple more out there when building some architectures with certain configurations. However, with the recent fixes I think we are in good shape and it is now possible to enable the warning for Clang" * tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits) Makefile: Enable -Wimplicit-fallthrough for Clang powerpc/smp: Fix fall-through warning for Clang dmaengine: mpc512x: Fix fall-through warning for Clang usb: gadget: fsl_qe_udc: Fix fall-through warning for Clang powerpc/powernv: Fix fall-through warning for Clang MIPS: Fix unreachable code issue MIPS: Fix fall-through warnings for Clang ASoC: Mediatek: MT8183: Fix fall-through warning for Clang power: supply: Fix fall-through warnings for Clang dmaengine: ti: k3-udma: Fix fall-through warning for Clang s390: Fix fall-through warnings for Clang dmaengine: ipu: Fix fall-through warning for Clang iommu/arm-smmu-v3: Fix fall-through warning for Clang mmc: jz4740: Fix fall-through warning for Clang PCI: Fix fall-through warning for Clang scsi: libsas: Fix fall-through warning for Clang video: fbdev: Fix fall-through warning for Clang math-emu: Fix fall-through warning cpufreq: Fix fall-through warning for Clang drm/msm: Fix fall-through warning in msm_gem_new_impl() ...
2021-07-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-1/+4
Merge misc fixes from Andrew Morton: "13 patches. Subsystems affected by this patch series: mm (kasan, pagealloc, rmap, hmm, and hugetlb), and hfs" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/hugetlb: fix refs calculation from unaligned @vaddr hfs: add lock nesting notation to hfs_find_init hfs: fix high memory mapping in hfs_bnode_read hfs: add missing clean-up in hfs_fill_super lib/test_hmm: remove set but unused page variable mm: fix the try_to_unmap prototype for !CONFIG_MMU mm/page_alloc: further fix __alloc_pages_bulk() return value mm/page_alloc: correct return value when failing at preparing mm/page_alloc: avoid page allocator recursion with pagesets.lock held Revert "mm/page_alloc: make should_fail_alloc_page() static" kasan: fix build by including kernel.h kasan: add memzero init for unaligned size at DEBUG mm: move helper to check slub_debug_enabled
2021-07-15net_sched: introduce tracepoint trace_qdisc_enqueue()Qitao Xu1-0/+26
Tracepoint trace_qdisc_enqueue() is introduced to trace skb at the entrance of TC layer on TX side. This is similar to trace_qdisc_dequeue(): 1. For both we only trace successful cases. The failure cases can be traced via trace_kfree_skb(). 2. They are called at entrance or exit of TC layer, not for each ->enqueue() or ->dequeue(). This is intentional, because we want to make trace_qdisc_enqueue() symmetric to trace_qdisc_dequeue(), which is easier to use. The return value of qdisc_enqueue() is not interesting here, we have Qdisc's drop packets in ->dequeue(), it is impossible to trace them even if we have the return value, the only way to trace them is tracing kfree_skb(). We only add information we need to trace ring buffer. If any other information is needed, it is easy to extend it without breaking ABI, see commit 3dd344ea84e1 ("net: tracepoint: exposing sk_family in all tcp:tracepoints"). Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Qitao Xu <qitao.xu@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15net_sched: use %px to print skb address in trace_qdisc_dequeue()Qitao Xu1-1/+1
Print format of skbaddr is changed to %px from %p, because we want to use skb address as a quick way to identify a packet. Note, trace ring buffer is only accessible to privileged users, it is safe to use a real kernel address here. Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Qitao Xu <qitao.xu@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15net: use %px to print skb address in trace_netif_receive_skbQitao Xu1-1/+1
The print format of skb adress in tracepoint class net_dev_template is changed to %px from %p, because we want to use skb address as a quick way to identify a packet. Note, trace ring buffer is only accessible to privileged users, it is safe to use a real kernel address here. Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Qitao Xu <qitao.xu@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15mm: fix the try_to_unmap prototype for !CONFIG_MMUChristoph Hellwig1-1/+3
Adjust the nommu stub of try_to_unmap to match the changed protype for the full version. Turn it into an inline instead of a macro to generally improve the type checking. Link: https://lkml.kernel.org/r/20210705053944.885828-1-hch@lst.de Fixes: 1fb08ac63bee ("mm: rmap: make try_to_unmap() void function") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-15kasan: fix build by including kernel.hMarco Elver1-0/+1
The <linux/kasan.h> header relies on _RET_IP_ being defined, and had been receiving that definition via inclusion of bug.h which includes kernel.h. However, since f39650de687e ("kernel.h: split out panic and oops helpers") that is no longer the case and get the following build error when building CONFIG_KASAN_HW_TAGS on arm64: In file included from arch/arm64/mm/kasan_init.c:10: include/linux/kasan.h: In function 'kasan_slab_free': include/linux/kasan.h:230:39: error: '_RET_IP_' undeclared (first use in this function) 230 | return __kasan_slab_free(s, object, _RET_IP_, init); Fix it by including kernel.h from kasan.h. Link: https://lkml.kernel.org/r/20210705072716.2125074-1-elver@google.com Fixes: f39650de687e ("kernel.h: split out panic and oops helpers") Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-14Merge tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds20-223/+105
Pull networking fixes from Jakub Kicinski. "Including fixes from bpf and netfilter. Current release - regressions: - sock: fix parameter order in sock_setsockopt() Current release - new code bugs: - netfilter: nft_last: - fix incorrect arithmetic when restoring last used - honor NFTA_LAST_SET on restoration Previous releases - regressions: - udp: properly flush normal packet at GRO time - sfc: ensure correct number of XDP queues; don't allow enabling the feature if there isn't sufficient resources to Tx from any CPU - dsa: sja1105: fix address learning getting disabled on the CPU port - mptcp: addresses a rmem accounting issue that could keep packets in subflow receive buffers longer than necessary, delaying MPTCP-level ACKs - ip_tunnel: fix mtu calculation for ETHER tunnel devices - do not reuse skbs allocated from skbuff_fclone_cache in the napi skb cache, we'd try to return them to the wrong slab cache - tcp: consistently disable header prediction for mptcp Previous releases - always broken: - bpf: fix subprog poke descriptor tracking use-after-free - ipv6: - allocate enough headroom in ip6_finish_output2() in case iptables TEE is used - tcp: drop silly ICMPv6 packet too big messages to avoid expensive and pointless lookups (which may serve as a DDOS vector) - make sure fwmark is copied in SYNACK packets - fix 'disable_policy' for forwarded packets (align with IPv4) - netfilter: conntrack: - do not renew entry stuck in tcp SYN_SENT state - do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry - mptcp: cleanly handle error conditions with MP_JOIN and syncookies - mptcp: fix double free when rejecting a join due to port mismatch - validate lwtstate->data before returning from skb_tunnel_info() - tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path - mt76: mt7921: continue to probe driver when fw already downloaded - bonding: fix multiple issues with offloading IPsec to (thru?) bond - stmmac: ptp: fix issues around Qbv support and setting time back - bcmgenet: always clear wake-up based on energy detection Misc: - sctp: move 198 addresses from unusable to private scope - ptp: support virtual clocks and timestamping - openvswitch: optimize operation for key comparison" * tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (158 commits) net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave() sfc: add logs explaining XDP_TX/REDIRECT is not available sfc: ensure correct number of XDP queues sfc: fix lack of XDP TX queues - error XDP TX failed (-22) net: fddi: fix UAF in fza_probe net: dsa: sja1105: fix address learning getting disabled on the CPU port net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload net: Use nlmsg_unicast() instead of netlink_unicast() octeontx2-pf: Fix uninitialized boolean variable pps ipv6: allocate enough headroom in ip6_finish_output2() net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific net: bridge: multicast: fix MRD advertisement router port marking race net: bridge: multicast: fix PIM hello router port marking race net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340 dsa: fix for_each_child.cocci warnings virtio_net: check virtqueue_add_sgs() return value mptcp: properly account bulk freed memory selftests: mptcp: fix case multiple subflows limited by server mptcp: avoid processing packet if a subflow reset mptcp: fix syncookie process if mptcp can not_accept new subflow ...
2021-07-14fs: add vfs_parse_fs_param_source() helperChristian Brauner1-0/+2
Add a simple helper that filesystems can use in their parameter parser to parse the "source" parameter. A few places open-coded this function and that already caused a bug in the cgroup v1 parser that we fixed. Let's make it harder to get this wrong by introducing a helper which performs all necessary checks. Link: https://syzkaller.appspot.com/bug?id=6312526aba5beae046fdae8f00399f87aab48b12 Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-13math-emu: Fix fall-through warningGustavo A. R. Silva1-1/+1
Fix the following fallthrough warning (nds32-randconfig with GCC): include/math-emu/op-common.h:332:8: warning: this statement may fall through [-Wimplicit-fallthrough=] Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-13firmware: arm_scpi: Fix kernel doc warningsSudeep Holla1-0/+8
Kernel doc validation script is unhappy and complains with the below set of warnings. | Function parameter or member 'device_domain_id' not described in 'scpi_ops' | Function parameter or member 'get_transition_latency' not described in 'scpi_ops' | Function parameter or member 'add_opps_to_device' not described in 'scpi_ops' | Function parameter or member 'sensor_get_capability' not described in 'scpi_ops' | Function parameter or member 'sensor_get_info' not described in 'scpi_ops' | Function parameter or member 'sensor_get_value' not described in 'scpi_ops' | Function parameter or member 'device_get_power_state' not described in 'scpi_ops' | Function parameter or member 'device_set_power_state' not described in 'scpi_ops' Fix them adding appropriate documents or missing keywords. Link: https://lore.kernel.org/r/20210712130801.2436492-1-sudeep.holla@arm.com Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-07-13firmware: arm_scmi: Fix kernel doc warningsSudeep Holla1-5/+9
Kernel doc validation script is unhappy and complains with the below set of warnings. | Function parameter or member 'fast_switch_possible' not described in 'scmi_perf_proto_ops' | Function parameter or member 'power_scale_mw_get' not described in 'scmi_perf_proto_ops' | cannot understand function prototype: 'struct scmi_sensor_reading ' | cannot understand function prototype: 'struct scmi_range_attrs ' | cannot understand function prototype: 'struct scmi_sensor_axis_info ' | cannot understand function prototype: 'struct scmi_sensor_intervals_info ' Fix them adding appropriate documents or missing keywords. Link: https://lore.kernel.org/r/20210712130801.2436492-2-sudeep.holla@arm.com Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-07-12mm: Make copy_huge_page() always availableMatthew Wilcox (Oracle)2-5/+1
Rewrite copy_huge_page() and move it into mm/util.c so it's always available. Fixes an exposure of uninitialised memory on configurations with HUGETLB and UFFD enabled and MIGRATION disabled. Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-12dt-bindings: clock: r9a07g044-cpg: Update clock/reset definitionsBiju Das1-53/+183
Update clock and reset definitions as per RZ/G2L_clock_list_r02_02.xlsx and RZ/G2L HW(Rev.0.50) manual. Update {GIC,IA55,SCIF} clock and reset entries in the CPG driver, and separate reset from module clocks in order to handle them efficiently. Update the SCIF0 clock and reset index in the SoC DTSI. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20210626081344.5783-6-biju.das.jz@bp.renesas.com Link: https://lore.kernel.org/r/20210626081344.5783-7-biju.das.jz@bp.renesas.com Link: https://lore.kernel.org/r/20210626081344.5783-8-biju.das.jz@bp.renesas.com [geert: Squashed 3 commits] Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2021-07-11net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340Marek Behún1-5/+1
It seems that we cannot differentiate 88X3310 from 88X3340 by simply looking at bit 3 of revision ID. This only works on revisions A0 and A1. On revision B0, this bit is always 1. Instead use the 3.d00d register for differentiation, since this register contains information about number of ports on the device. Fixes: 9885d016ffa9 ("net: phy: marvell10g: add separate structure for 88X3340") Signed-off-by: Marek Behún <kabel@kernel.org> Reported-by: Matteo Croce <mcroce@linux.microsoft.com> Tested-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-10Merge tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds1-2/+4
Pull Kbuild updates from Masahiro Yamada: - Increase the -falign-functions alignment for the debug option. - Remove ugly libelf checks from the top Makefile. - Make the silent build (-s) more silent. - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified. - Various script cleanups * tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (27 commits) scripts: add generic syscallnr.sh scripts: check duplicated syscall number in syscall table sparc: syscalls: use pattern rules to generate syscall headers parisc: syscalls: use pattern rules to generate syscall headers nds32: add arch/nds32/boot/.gitignore kbuild: mkcompile_h: consider timestamp if KBUILD_BUILD_TIMESTAMP is set kbuild: modpost: Explicitly warn about unprototyped symbols kbuild: remove trailing slashes from $(KBUILD_EXTMOD) kconfig.h: explain IS_MODULE(), IS_ENABLED() kconfig: constify long_opts scripts/setlocalversion: simplify the short version part scripts/setlocalversion: factor out 12-chars hash construction scripts/setlocalversion: add more comments to -dirty flag detection scripts/setlocalversion: remove workaround for old make-kpkg scripts/setlocalversion: remove mercurial, svn and git-svn supports kbuild: clean up ${quiet} checks in shell scripts kbuild: sink stdout from cmd for silent build init: use $(call cmd,) for generating include/generated/compile.h kbuild: merge scripts/mkmakefile to top Makefile sh: move core-y in arch/sh/Makefile to arch/sh/Kbuild ...
2021-07-10Merge tag 's390-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds1-1/+0
Pull more s390 updates from Vasily Gorbik: - Fix preempt_count initialization. - Rework call_on_stack() macro to add proper type handling and avoid possible register corruption. - More error prone "register asm" removal and fixes. - Fix syscall restarting when multiple signals are coming in. This adds minimalistic trampolines to vdso so we can return from signal without using the stack which requires pgm check handler hacks when NX is enabled. - Remove HAVE_IRQ_EXIT_ON_IRQ_STACK since this is no longer true after switch to generic entry. - Fix protected virtualization secure storage access exception handling. - Make machine check C handler always enter with DAT enabled and move register validation to C code. - Fix tinyconfig boot problem by avoiding MONITOR CALL without CONFIG_BUG. - Increase asm symbols alignment to 16 to make it consistent with compilers. - Enable concurrent access to the CPU Measurement Counter Facility. - Add support for dynamic AP bus size limit and rework ap_dqap to deal with messages greater than recv buffer. * tag 's390-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits) s390: preempt: Fix preempt_count initialization s390/linkage: increase asm symbols alignment to 16 s390: rename CALL_ON_STACK_NORETURN() to call_on_stack_noreturn() s390: add type checking to CALL_ON_STACK_NORETURN() macro s390: remove old CALL_ON_STACK() macro s390/softirq: use call_on_stack() macro s390/lib: use call_on_stack() macro s390/smp: use call_on_stack() macro s390/kexec: use call_on_stack() macro s390/irq: use call_on_stack() macro s390/mm: use call_on_stack() macro s390: introduce proper type handling call_on_stack() macro s390/irq: simplify on_async_stack() s390/irq: inline do_softirq_own_stack() s390/irq: simplify do_softirq_own_stack() s390/ap: get rid of register asm in ap_dqap() s390: rename PIF_SYSCALL_RESTART to PIF_EXECVE_PGSTE_RESTART s390: move restart of execve() syscall s390/signal: remove sigreturn on stack s390/signal: switch to using vdso for sigreturn and syscall restart ...
2021-07-10Merge tag 'arm-drivers-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds8-21/+429
Pull ARM driver updates from Olof Johansson: - Reset controllers: Adding support for Microchip Sparx5 Switch. - Memory controllers: ARM Primecell PL35x SMC memory controller driver cleanups and improvements. - i.MX SoC drivers: Power domain support for i.MX8MM and i.MX8MN. - Rockchip: RK3568 power domains support + DT binding updates, cleanups. - Qualcomm SoC drivers: Amend socinfo with more SoC/PMIC details, including support for MSM8226, MDM9607, SM6125 and SC8180X. - ARM FFA driver: "Firmware Framework for ARMv8-A", defining management interfaces and communication (including bus model) between partitions both in Normal and Secure Worlds. - Tegra Memory controller changes, including major rework to deal with identity mappings at boot and integration with ARM SMMU pieces. * tag 'arm-drivers-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (120 commits) firmware: turris-mox-rwtm: add marvell,armada-3700-rwtm-firmware compatible string firmware: turris-mox-rwtm: show message about HWRNG registration firmware: turris-mox-rwtm: fail probing when firmware does not support hwrng firmware: turris-mox-rwtm: report failures better firmware: turris-mox-rwtm: fix reply status decoding function soc: imx: gpcv2: add support for i.MX8MN power domains dt-bindings: add defines for i.MX8MN power domains firmware: tegra: bpmp: Fix Tegra234-only builds iommu/arm-smmu: Use Tegra implementation on Tegra186 iommu/arm-smmu: tegra: Implement SID override programming iommu/arm-smmu: tegra: Detect number of instances at runtime dt-bindings: arm-smmu: Add Tegra186 compatible string firmware: qcom_scm: Add MDM9607 compatible soc: qcom: rpmpd: Add MDM9607 RPM Power Domains soc: renesas: Add support to read LSI DEVID register of RZ/G2{L,LC} SoC's soc: renesas: Add ARCH_R9A07G044 for the new RZ/G2L SoC's dt-bindings: soc: rockchip: drop unnecessary #phy-cells from grf.yaml memory: emif: remove unused frequency and voltage notifiers memory: fsl_ifc: fix leak of private memory on probe failure memory: fsl_ifc: fix leak of IO mapping on probe failure ...
2021-07-10Merge tag 'arm-dt-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds2-1/+2
Pull ARM devicetree updates from Olof Johansson: "Like always, the DT branch is sizable. There are numerous additions and fixes to existing platforms, but also a handful of new ones introduced. Less than some other releases, but there's been significant work on cleanups, refactorings and device enabling on existing platforms. A non-exhaustive list of new material: - Refactoring of BCM2711 dtsi structure to add support for the Raspberry Pi 400 - Rockchip: RK3568 SoC and EVB, video codecs for rk3036/3066/3188/322x - Qualcomm: SA8155p Automotive platform (SM8150 derivative), SM8150/8250 enhancements and support for Sony Xperia 1/1II and 5/5II - TI K3: PCI/USB3 support on AM64-sk boards, R5 remoteproc definitions - TI OMAP: Various cleanups - Tegra: Audio support for Jetson Xavier NX, SMMU support on Tegra194 - Qualcomm: lots of additions for peripherals across several SoCs, and new support for Microsoft Surface Duo (SM8150-based), Huawei Ascend G7. - i.MX: Numerous additions of features across SoCs and boards. - Allwinner: More device bindings for V3s, Forlinx OKA40i-C and NanoPi R1S H5 boards - MediaTek: More device bindings for mt8167, new Chromebook system variants for mt8183 - Renesas: RZ/G2L SoC and EVK added - Amlogic: BananaPi BPI-M5 board added" * tag 'arm-dt-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (511 commits) arm64: dts: rockchip: add basic dts for RK3568 EVB arm64: dts: rockchip: add core dtsi for RK3568 SoC arm64: dts: rockchip: add generic pinconfig settings used by most Rockchip socs ARM: dts: rockchip: add vpu and vdec node for RK322x ARM: dts: rockchip: add vpu nodes for RK3066 and RK3188 ARM: dts: rockchip: add vpu node for RK3036 arm64: dts: ipq8074: Add QUP6 I2C node arm64: dts: rockchip: Re-add regulator-always-on for vcc_sdio for rk3399-roc-pc arm64: dts: rockchip: Re-add regulator-boot-on, regulator-always-on for vdd_gpu on rk3399-roc-pc arm64: dts: rockchip: add ir-receiver for rk3399-roc-pc arm64: dts: rockchip: Add USB-C port details for rk3399 Firefly arm64: dts: rockchip: Sort rk3399 firefly pinmux entries arm64: dts: rockchip: add infrared receiver node to RK3399 Firefly arm64: dts: rockchip: add SPDIF node for rk3399-firefly arm64: dts: rockchip: Add Rotation Property for OGA Panel arm64: dts: qcom: sc7180: bus votes for eMMC and SD card arm64: dts: qcom: sm8250-edo: Add Samsung touchscreen arm64: dts: qcom: sm8250-edo: Enable GPI DMA arm64: dts: qcom: sm8250-edo: Enable ADSP/CDSP/SLPI arm64: dts: qcom: sm8250-edo: Enable PCIe ...
2021-07-10Merge tag 'arm-soc-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds7-9/+183
Pull ARM SoC updates from Olof Johansson: "A few SoC (code) changes have queued up this cycle, mostly for minor changes and some refactoring and cleanup of legacy platforms. This branch also contains a few of the fixes that weren't sent in by the end of the release (all fairly minor). - Adding an additional maintainer for the TEE subsystem (Sumit Garg) - Quite a significant modernization of the IXP4xx platforms by Linus Walleij, revisiting with a new PCI host driver/binding, removing legacy mach/* include dependencies and moving platform detection/config to drivers/soc. Also some updates/cleanup of platform data. - Core power domain support for Tegra platforms, and some improvements in build test coverage by adding stubs for compile test targets. - A handful of updates to i.MX platforms, adding legacy (non-PSCI) SMP support on i.MX7D, SoC ID setup for i.MX50, removal of platform data and board fixups for iMX6/7. ... and a few smaller changes and fixes for Samsung, OMAP, Allwinner, Rockchip" * tag 'arm-soc-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (53 commits) MAINTAINERS: Add myself as TEE subsystem reviewer ixp4xx: fix spelling mistake in Kconfig "Devce" -> "Device" hw_random: ixp4xx: Add OF support hw_random: ixp4xx: Add DT bindings hw_random: ixp4xx: Turn into a module hw_random: ixp4xx: Use SPDX license tag hw_random: ixp4xx: enable compile-testing pata: ixp4xx: split platform data to its own header soc: ixp4xx: move cpu detection to linux/soc/ixp4xx/cpu.h PCI: ixp4xx: Add a new driver for IXP4xx PCI: ixp4xx: Add device tree bindings for IXP4xx ARM/ixp4xx: Make NEED_MACH_IO_H optional ARM/ixp4xx: Move the virtual IObases MAINTAINERS: ARM/MStar/Sigmastar SoCs: Add a link to the MStar tree ARM: debug: add UART early console support for MSTAR SoCs ARM: dts: ux500: Fix LED probing ARM: imx: add smp support for imx7d ARM: imx6q: drop of_platform_default_populate() from init_machine arm64: dts: rockchip: Update RK3399 PCI host bridge window to 32-bit address memory soc/tegra: fuse: Fix Tegra234-only builds ...
2021-07-09mptcp: avoid processing packet if a subflow resetJianguo Wu1-2/+3
If check_fully_established() causes a subflow reset, it should not continue to process the packet in tcp_data_queue(). Add a return value to mptcp_incoming_options(), and return false if a subflow has been reset, else return true. Then drop the packet in tcp_data_queue()/tcp_rcv_state_process() if mptcp_incoming_options() return false. Fixes: d582484726c4 ("mptcp: fix fallback for MP_JOIN subflows") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+1
Daniel Borkmann says: ==================== pull-request: bpf 2021-07-09 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 9 day(s) which contain a total of 13 files changed, 118 insertions(+), 62 deletions(-). The main changes are: 1) Fix runqslower task->state access from BPF, from SanjayKumar Jeyakumar. 2) Fix subprog poke descriptor tracking use-after-free, from John Fastabend. 3) Fix sparse complaint from prior devmap RCU conversion, from Toke Høiland-Jørgensen. 4) Fix missing va_end in bpftool JIT json dump's error path, from Gu Shengxian. 5) Fix tools/bpf install target from missing runqslower install, from Wei Li. 6) Fix xdpsock BPF sample to unload program on shared umem option, from Wang Hai. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09net: validate lwtstate->data before returning from skb_tunnel_info()Taehee Yoo1-1/+3
skb_tunnel_info() returns pointer of lwtstate->data as ip_tunnel_info type without validation. lwtstate->data can have various types such as mpls_iptunnel_encap, etc and these are not compatible. So skb_tunnel_info() should validate before returning that pointer. Splat looks like: BUG: KASAN: slab-out-of-bounds in vxlan_get_route+0x418/0x4b0 [vxlan] Read of size 2 at addr ffff888106ec2698 by task ping/811 CPU: 1 PID: 811 Comm: ping Not tainted 5.13.0+ #1195 Call Trace: dump_stack_lvl+0x56/0x7b print_address_description.constprop.8.cold.13+0x13/0x2ee ? vxlan_get_route+0x418/0x4b0 [vxlan] ? vxlan_get_route+0x418/0x4b0 [vxlan] kasan_report.cold.14+0x83/0xdf ? vxlan_get_route+0x418/0x4b0 [vxlan] vxlan_get_route+0x418/0x4b0 [vxlan] [ ... ] vxlan_xmit_one+0x148b/0x32b0 [vxlan] [ ... ] vxlan_xmit+0x25c5/0x4780 [vxlan] [ ... ] dev_hard_start_xmit+0x1ae/0x6e0 __dev_queue_xmit+0x1f39/0x31a0 [ ... ] neigh_xmit+0x2f9/0x940 mpls_xmit+0x911/0x1600 [mpls_iptunnel] lwtunnel_xmit+0x18f/0x450 ip_finish_output2+0x867/0x2040 [ ... ] Fixes: 61adedf3e3f1 ("route: move lwtunnel state to dst_entry") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09Merge tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-blockLinus Torvalds4-36/+7
Pull more block updates from Jens Axboe: "A combination of changes that ended up depending on both the driver and core branch (and/or the IDE removal), and a few late arriving fixes. In detail: - Fix io ticks wrap-around issue (Chunguang) - nvme-tcp sock locking fix (Maurizio) - s390-dasd fixes (Kees, Christoph) - blk_execute_rq polling support (Keith) - blk-cgroup RCU iteration fix (Yu) - nbd backend ID addition (Prasanna) - Partition deletion fix (Yufen) - Use blk_mq_alloc_disk for mmc, mtip32xx, ubd (Christoph) - Removal of now dead block request types due to IDE removal (Christoph) - Loop probing and control device cleanups (Christoph) - Device uevent fix (Christoph) - Misc cleanups/fixes (Tetsuo, Christoph)" * tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block: (34 commits) blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs block: fix the problem of io_ticks becoming smaller nvme-tcp: can't set sk_user_data without write_lock loop: remove unused variable in loop_set_status() block: remove the bdgrab in blk_drop_partitions block: grab a device refcount in disk_uevent s390/dasd: Avoid field over-reading memcpy() dasd: unexport dasd_set_target_state block: check disk exist before trying to add partition ubd: remove dead code in ubd_setup_common nvme: use return value from blk_execute_rq() block: return errors from blk_execute_rq() nvme: use blk_execute_rq() for passthrough commands block: support polling through blk_execute_rq block: remove REQ_OP_SCSI_{IN,OUT} block: mark blk_mq_init_queue_data static loop: rewrite loop_exit using idr_for_each_entry loop: split loop_lookup loop: don't allow deleting an unspecified loop device loop: move loop_ctl_mutex locking into loop_add ...
2021-07-09Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds4-3/+39
Pull virtio,vhost,vdpa updates from Michael Tsirkin: - Doorbell remapping for ifcvf, mlx5 - virtio_vdpa support for mlx5 - Validate device input in several drivers (for SEV and friends) - ZONE_MOVABLE aware handling in virtio-mem - Misc fixes, cleanups * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (48 commits) virtio-mem: prioritize unplug from ZONE_MOVABLE in Big Block Mode virtio-mem: simplify high-level unplug handling in Big Block Mode virtio-mem: prioritize unplug from ZONE_MOVABLE in Sub Block Mode virtio-mem: simplify high-level unplug handling in Sub Block Mode virtio-mem: simplify high-level plug handling in Sub Block Mode virtio-mem: use page_zonenum() in virtio_mem_fake_offline() virtio-mem: don't read big block size in Sub Block Mode virtio/vdpa: clear the virtqueue state during probe vp_vdpa: allow set vq state to initial state after reset virtio-pci library: introduce vp_modern_get_driver_features() vdpa: support packed virtqueue for set/get_vq_state() virtio-ring: store DMA metadata in desc_extra for split virtqueue virtio: use err label in __vring_new_virtqueue() virtio_ring: introduce virtqueue_desc_add_split() virtio_ring: secure handling of mapping errors virtio-ring: factor out desc_extra allocation virtio_ring: rename vring_desc_extra_packed virtio-ring: maintain next in extra state for packed virtqueue vdpa/mlx5: Clear vq ready indication upon device reset vdpa/mlx5: Add support for doorbell bypassing ...
2021-07-09Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds1-6/+1
Pull crypto fixes from Herbert Xu: - Regression fix in drbg due to missing self-test for new default algorithm - Add ratelimit on user-triggerable message in qat - Fix build failure due to missing dependency in sl3516 - Remove obsolete PageSlab checks - Fix bogus hardware register writes on Kunpeng920 in hisilicon/sec * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: hisilicon/sec - fix the process of disabling sva prefetching crypto: sl3516 - Add dependency on ARCH_GEMINI crypto: sl3516 - Typo s/Stormlink/Storlink/ crypto: drbg - self test for HMAC(SHA-512) crypto: omap - Drop obsolete PageSlab check crypto: scatterwalk - Remove obsolete PageSlab check crypto: qat - ratelimit invalid ioctl message and print the invalid cmd
2021-07-09Merge tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/umlLinus Torvalds3-0/+204
Pull UML updates from Richard Weinberger: - Support for optimized routines based on the host CPU - Support for PCI via virtio - Various fixes * tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: remove unneeded semicolon in um_arch.c um: Remove the repeated declaration um: fix error return code in winch_tramp() um: fix error return code in slip_open() um: Fix stack pointer alignment um: implement flush_cache_vmap/flush_cache_vunmap um: add a UML specific futex implementation um: enable the use of optimized xor routines in UML um: Add support for host CPU flags and alignment um: allow not setting extra rpaths in the linux binary um: virtio/pci: enable suspend/resume um: add PCI over virtio emulation driver um: irqs: allow invoking time-travel handler multiple times um: time-travel/signals: fix ndelay() in interrupt um: expose time-travel mode to userspace side um: export signals_enabled directly um: remove unused smp_sigio_handler() declaration lib: add iomem emulation (logic_iomem) um: allow disabling NO_IOMEM
2021-07-09Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds1-4/+2
Pull ext4 updates from Ted Ts'o: "Ext4 regression and bug fixes" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: inline jbd2_journal_[un]register_shrinker() ext4: fix flags validity checking for EXT4_IOC_CHECKPOINT ext4: fix possible UAF when remounting r/o a mmp-protected file system ext4: use ext4_grp_locked_error in mb_find_extent ext4: fix WARN_ON_ONCE(!buffer_uptodate) after an error writing the superblock Revert "ext4: consolidate checks for resize of bigalloc into ext4_resize_begin"
2021-07-09Merge tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds7-0/+23
Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - Multiple patches to add support for fcntl() leases over NFSv4. - A sysfs interface to display more information about the various transport connections used by the RPC client - A sysfs interface to allow a suitably privileged user to offline a transport that may no longer point to a valid server - A sysfs interface to allow a suitably privileged user to change the server IP address used by the RPC client Stable fixes: - Two sunrpc fixes for deadlocks involving privileged rpc_wait_queues Bugfixes: - SUNRPC: Avoid a KASAN slab-out-of-bounds bug in xdr_set_page_base() - SUNRPC: prevent port reuse on transports which don't request it. - NFSv3: Fix memory leak in posix_acl_create() - NFS: Various fixes to attribute revalidation timeouts - NFSv4: Fix handling of non-atomic change attribute updates - NFSv4: If a server is down, don't cause mounts to other servers to hang as well - pNFS: Fix an Oops in pnfs_mark_request_commit() when doing O_DIRECT - NFS: Fix mount failures due to incorrect setting of the has_sec_mnt_opts filesystem flag - NFS: Ensure nfs_readpage returns promptly when an internal error occurs - NFS: Fix fscache read from NFS after cache error - pNFS: Various bugfixes around the LAYOUTGET operation" * tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (46 commits) NFSv4/pNFS: Return an error if _nfs4_pnfs_v3_ds_connect can't load NFSv3 NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times NFSv4/pnfs: Clean up layout get on open NFSv4/pnfs: Fix layoutget behaviour after invalidation NFSv4/pnfs: Fix the layout barrier update NFS: Fix fscache read from NFS after cache error NFS: Ensure nfs_readpage returns promptly when internal error occurs sunrpc: remove an offlined xprt using sysfs sunrpc: provide showing transport's state info in the sysfs directory sunrpc: display xprt's queuelen of assigned tasks via sysfs sunrpc: provide multipath info in the sysfs directory NFSv4.1 identify and mark RPC tasks that can move between transports sunrpc: provide transport info in the sysfs directory SUNRPC: take a xprt offline using sysfs sunrpc: add dst_attr attributes to the sysfs xprt directory SUNRPC for TCP display xprt's source port in sysfs xprt_info SUNRPC query transport's source port SUNRPC display xprt's main value in sysfs's xprt_info SUNRPC mark the first transport sunrpc: add add sysfs directory per xprt under each xprt_switch ...
2021-07-09Merge tag 'f2fs-for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fsLinus Torvalds1-0/+2
Pull f2fs updates from Jaegeuk Kim: "In this round, we've improved the compression support especially for Android such as allowing compression for mmap files, replacing the immutable bit with internal bit to prohibits data writes explicitly, and adding a mount option, "compress_cache", to improve random reads. And, we added "readonly" feature to compact the partition w/ compression enabled, which will be useful for Android RO partitions. Enhancements: - support compression for mmap file - use an f2fs flag instead of IMMUTABLE bit for compression - support RO feature w/ extent_cache - fully support swapfile with file pinning - improve atgc tunability - add nocompress extensions to unselect files for compression Bug fixes: - fix false alaram on iget failure during GC - fix race condition on global pointers when there are multiple f2fs instances - add MODULE_SOFTDEP for initramfs As usual, we've also cleaned up some places for better code readability (e.g., sysfs/feature, debugging messages, slab cache name, and docs)" * tag 'f2fs-for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits) f2fs: drop dirty node pages when cp is in error status f2fs: initialize page->private when using for our internal use f2fs: compress: add nocompress extensions support MAINTAINERS: f2fs: update my email address f2fs: remove false alarm on iget failure during GC f2fs: enable extent cache for compression files in read-only f2fs: fix to avoid adding tab before doc section f2fs: introduce f2fs_casefolded_name slab cache f2fs: swap: support migrating swapfile in aligned write mode f2fs: swap: remove dead codes f2fs: compress: add compress_inode to cache compressed blocks f2fs: clean up /sys/fs/f2fs/<disk>/features f2fs: add pin_file in feature list f2fs: Advertise encrypted casefolding in sysfs f2fs: Show casefolding support only when supported f2fs: support RO feature f2fs: logging neatening f2fs: introduce FI_COMPRESS_RELEASED instead of using IMMUTABLE bit f2fs: compress: remove unneeded preallocation f2fs: atgc: export entries for better tunability via sysfs ...