aboutsummaryrefslogtreecommitdiffstats
path: root/include (follow)
AgeCommit message (Collapse)AuthorFilesLines
2021-10-20NFS: Save some space in the inodeTrond Myklebust1-18/+24
Save some space in the nfs_inode by setting up an anonymous union with the fields that are peculiar to a specific type of filesystem object. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-20NFS: Fix WARN_ON due to unionization of nfs_inode.nrequestsDave Wysochanski1-1/+3
Fixes the following WARN_ON WARNING: CPU: 2 PID: 18678 at fs/nfs/inode.c:123 nfs_clear_inode+0x3b/0x50 [nfs] ... Call Trace: nfs4_evict_inode+0x57/0x70 [nfsv4] evict+0xd1/0x180 dispose_list+0x48/0x60 evict_inodes+0x156/0x190 generic_shutdown_super+0x37/0x110 nfs_kill_super+0x1d/0x40 [nfs] deactivate_locked_super+0x36/0xa0 Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-20NFS: Fix up commit deadlocksTrond Myklebust1-0/+1
If O_DIRECT bumps the commit_info rpcs_out field, then that could lead to fsync() hangs. The fix is to ensure that O_DIRECT calls nfs_commit_end(). Fixes: 723c921e7dfc ("sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-10SUNRPC: Per-rpc_clnt task PIDsChuck Lever1-0/+1
The current range of RPC task PIDs is 0..65535. That's not adequate for distinguishing tasks across multiple rpc_clnts running high throughput workloads. To help relieve this situation and to reduce the bottleneck of having a single atomic for assigning all RPC task PIDs, assign task PIDs per rpc_clnt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-03NFS: Further optimisations for 'ls -l'Trond Myklebust1-3/+2
If a user is doing 'ls -l', we have a heuristic in GETATTR that tells the readdir code to try to use READDIRPLUS in order to refresh the inode attributes. In certain cirumstances, we also try to invalidate the remaining directory entries in order to ensure this refresh. If there are multiple readers of the directory, we probably should avoid invalidating the page cache, since the heuristic breaks down in that situation anyway. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
2021-10-03SUNRPC: Remove unnecessary memory barriersTrond Myklebust1-14/+2
The only check for RPC_TASK_RUNNING is the one in rpc_make_runnable(), which happens under the same spin lock held when we call rpc_clear_running(). Ditto, the last check for RPC_TASK_QUEUED in rpc_execute() is performed under the same lock as the one held when we call rpc_clear_queued(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-03NFS: Fix up nfs_ctx_key_to_expire()Trond Myklebust1-1/+1
If the cached credential exists but doesn't have any expiration callback then exit early. Fix up atomicity issues when replacing the credential with a new one since the existing code could lead to refcount leaks. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-03Merge tag 'driver-core-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds2-6/+12
Pull driver core fixes from Greg KH: "Here are some driver core and kernfs fixes for reported issues for 5.15-rc4. These fixes include: - kernfs positive dentry bugfix - debugfs_create_file_size error path fix - cpumask sysfs file bugfix to preserve the user/kernel abi (has been reported multiple times.) - devlink fixes for mdiobus devices as reported by the subsystem maintainers. Also included in here are some devlink debugging changes to make it easier for people to report problems when asked. They have already helped with the mdiobus and other subsystems reporting issues. All of these have been linux-next for a while with no reported issues" * tag 'driver-core-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kernfs: also call kernfs_set_rev() for positive dentry driver core: Add debug logs when fwnode links are added/deleted driver core: Create __fwnode_link_del() helper function driver core: Set deferred probe reason when deferred by driver core net: mdiobus: Set FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD for mdiobus parents driver core: fw_devlink: Add support for FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD driver core: fw_devlink: Improve handling of cyclic dependencies cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf debugfs: debugfs_create_file_size(): use IS_ERR to check for error
2021-10-03Merge tag 'sched_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull scheduler fixes from Borislav Petkov: - Tell the compiler to always inline is_percpu_thread() - Make sure tunable_scaling buffer is null-terminated after an update in sysfs - Fix LTP named regression due to cgroup list ordering * tag 'sched_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Always inline is_percpu_thread() sched/fair: Null terminate buffer when updating tunable_scaling sched/fair: Add ancestors of unthrottled undecayed cfs_rq
2021-10-03Merge tag 'perf_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+3
Pull perf fixes from Borislav Petkov: - Make sure the destroy callback is reset when a event initialization fails - Update the event constraints for Icelake - Make sure the active time of an event is updated even for inactive events * tag 'perf_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: fix userpage->time_enabled of inactive events perf/x86/intel: Update event constraints for ICX perf/x86: Reset destroy callback on event init failure
2021-10-02cachefiles: Fix oops in trace_cachefiles_mark_buried due to NULL objectDave Wysochanski1-1/+1
In cachefiles_mark_object_buried, the dentry in question may not have an owner, and thus our cachefiles_object pointer may be NULL when calling the tracepoint, in which case we will also not have a valid debug_id to print in the tracepoint. Check for NULL object in the tracepoint and if so, just set debug_id to MAX_UINT as was done in 2908f5e101e3 ("fscache: Add a cookie debug ID and use that in traces"). This fixes the following oops: FS-Cache: Cache "mycache" added (type cachefiles) CacheFiles: File cache on vdc registered ... Workqueue: fscache_object fscache_object_work_func [fscache] RIP: 0010:trace_event_raw_event_cachefiles_mark_buried+0x4e/0xa0 [cachefiles] .... Call Trace: cachefiles_mark_object_buried+0xa5/0xb0 [cachefiles] cachefiles_bury_object+0x270/0x430 [cachefiles] cachefiles_walk_to_object+0x195/0x9c0 [cachefiles] cachefiles_lookup_object+0x5a/0xc0 [cachefiles] fscache_look_up_object+0xd7/0x160 [fscache] fscache_object_work_func+0xb2/0x340 [fscache] process_one_work+0x1f1/0x390 worker_thread+0x53/0x3e0 kthread+0x127/0x150 Fixes: 2908f5e101e3 ("fscache: Add a cookie debug ID and use that in traces") Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cachefs@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-01sched: Always inline is_percpu_thread()Peter Zijlstra1-1/+1
vmlinux.o: warning: objtool: check_preemption_disabled()+0x81: call to is_percpu_thread() leaves .noinstr.text section Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210928084218.063371959@infradead.org
2021-10-01perf/core: fix userpage->time_enabled of inactive eventsSong Liu1-1/+3
Users of rdpmc rely on the mmapped user page to calculate accurate time_enabled. Currently, userpage->time_enabled is only updated when the event is added to the pmu. As a result, inactive event (due to counter multiplexing) does not have accurate userpage->time_enabled. This can be reproduced with something like: /* open 20 task perf_event "cycles", to create multiplexing */ fd = perf_event_open(); /* open task perf_event "cycles" */ userpage = mmap(fd); /* use mmap and rdmpc */ while (true) { time_enabled_mmap = xxx; /* use logic in perf_event_mmap_page */ time_enabled_read = read(fd).time_enabled; if (time_enabled_mmap > time_enabled_read) BUG(); } Fix this by updating userpage for inactive events in merge_sched_in. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-and-tested-by: Lucian Grijincu <lucian@fb.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210929194313.2398474-1-songliubraving@fb.com
2021-09-30Merge tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds6-8/+41
Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from mac80211, netfilter and bpf. Current release - regressions: - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from interrupt - mdio: revert mechanical patches which broke handling of optional resources - dev_addr_list: prevent address duplication Previous releases - regressions: - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb (NULL deref) - Revert "mac80211: do not use low data rates for data frames with no ack flag", fixing broadcast transmissions - mac80211: fix use-after-free in CCMP/GCMP RX - netfilter: include zone id in tuple hash again, minimize collisions - netfilter: nf_tables: unlink table before deleting it (race -> UAF) - netfilter: log: work around missing softdep backend module - mptcp: don't return sockets in foreign netns - sched: flower: protect fl_walk() with rcu (race -> UAF) - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup - smsc95xx: fix stalled rx after link change - enetc: fix the incorrect clearing of IF_MODE bits - ipv4: fix rtnexthop len when RTA_FLOW is present - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this SKU - e100: fix length calculation & buffer overrun in ethtool::get_regs Previous releases - always broken: - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx - mac80211: drop frames from invalid MAC address in ad-hoc mode - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race -> UAF) - bpf, x86: Fix bpf mapping of atomic fetch implementation - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog - netfilter: ip6_tables: zero-initialize fragment offset - mhi: fix error path in mhi_net_newlink - af_unix: return errno instead of NULL in unix_create1() when over the fs.file-max limit Misc: - bpf: exempt CAP_BPF from checks against bpf_jit_limit - netfilter: conntrack: make max chain length random, prevent guessing buckets by attackers - netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic, defer conntrack walk to work queue (prevent hogging RTNL lock)" * tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits) af_unix: fix races in sk_peer_pid and sk_peer_cred accesses net: stmmac: fix EEE init issue when paired with EEE capable PHYs net: dev_addr_list: handle first address in __hw_addr_add_ex net: sched: flower: protect fl_walk() with rcu net: introduce and use lock_sock_fast_nested() net: phy: bcm7xxx: Fixed indirect MMD operations net: hns3: disable firmware compatible features when uninstall PF net: hns3: fix always enable rx vlan filter problem after selftest net: hns3: PF enable promisc for VF when mac table is overflow net: hns3: fix show wrong state when add existing uc mac address net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE net: hns3: don't rollback when destroy mqprio fail net: hns3: remove tc enable checking net: hns3: do not allow call hns3_nic_net_open repeatedly ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup net: bridge: mcast: Associate the seqcount with its protecting lock. net: mdio-ipq4019: Fix the error for an optional regs resource net: hns3: fix hclge_dbg_dump_tm_pg() stack usage net: mdio: mscc-miim: Fix the mdio controller af_unix: Return errno instead of NULL in unix_create1(). ...
2021-09-30af_unix: fix races in sk_peer_pid and sk_peer_cred accessesEric Dumazet1-0/+2
Jann Horn reported that SO_PEERCRED and SO_PEERGROUPS implementations are racy, as af_unix can concurrently change sk_peer_pid and sk_peer_cred. In order to fix this issue, this patch adds a new spinlock that needs to be used whenever these fields are read or written. Jann also pointed out that l2cap_sock_get_peer_pid_cb() is currently reading sk->sk_peer_pid which makes no sense, as this field is only possibly set by AF_UNIX sockets. We will have to clean this in a separate patch. This could be done by reverting b48596d1dc25 "Bluetooth: L2CAP: Add get_peer_pid callback" or implementing what was truly expected. Fixes: 109f6e39fa07 ("af_unix: Allow SO_PEERCRED to work across namespaces.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jann Horn <jannh@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-30net: introduce and use lock_sock_fast_nested()Paolo Abeni1-1/+30
Syzkaller reported a false positive deadlock involving the nl socket lock and the subflow socket lock: MPTCP: kernel_bind error, err=-98 ============================================ WARNING: possible recursive locking detected 5.15.0-rc1-syzkaller #0 Not tainted -------------------------------------------- syz-executor998/6520 is trying to acquire lock: ffff8880795718a0 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close+0x267/0x7b0 net/mptcp/protocol.c:2738 but task is already holding lock: ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline] ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close+0x23/0x7b0 net/mptcp/protocol.c:2720 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(k-sk_lock-AF_INET); lock(k-sk_lock-AF_INET); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by syz-executor998/6520: #0: ffffffff8d176c50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40 net/netlink/genetlink.c:802 #1: ffffffff8d176d08 (genl_mutex){+.+.}-{3:3}, at: genl_lock net/netlink/genetlink.c:33 [inline] #1: ffffffff8d176d08 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x3e0/0x580 net/netlink/genetlink.c:790 #2: ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline] #2: ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close+0x23/0x7b0 net/mptcp/protocol.c:2720 stack backtrace: CPU: 1 PID: 6520 Comm: syz-executor998 Not tainted 5.15.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_deadlock_bug kernel/locking/lockdep.c:2944 [inline] check_deadlock kernel/locking/lockdep.c:2987 [inline] validate_chain kernel/locking/lockdep.c:3776 [inline] __lock_acquire.cold+0x149/0x3ab kernel/locking/lockdep.c:5015 lock_acquire kernel/locking/lockdep.c:5625 [inline] lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590 lock_sock_fast+0x36/0x100 net/core/sock.c:3229 mptcp_close+0x267/0x7b0 net/mptcp/protocol.c:2738 inet_release+0x12e/0x280 net/ipv4/af_inet.c:431 __sock_release net/socket.c:649 [inline] sock_release+0x87/0x1b0 net/socket.c:677 mptcp_pm_nl_create_listen_socket+0x238/0x2c0 net/mptcp/pm_netlink.c:900 mptcp_nl_cmd_add_addr+0x359/0x930 net/mptcp/pm_netlink.c:1170 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731 genl_family_rcv_msg net/netlink/genetlink.c:775 [inline] genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 genl_rcv+0x24/0x40 net/netlink/genetlink.c:803 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 sock_no_sendpage+0x101/0x150 net/core/sock.c:2980 kernel_sendpage.part.0+0x1a0/0x340 net/socket.c:3504 kernel_sendpage net/socket.c:3501 [inline] sock_sendpage+0xe5/0x140 net/socket.c:1003 pipe_to_sendpage+0x2ad/0x380 fs/splice.c:364 splice_from_pipe_feed fs/splice.c:418 [inline] __splice_from_pipe+0x43e/0x8a0 fs/splice.c:562 splice_from_pipe fs/splice.c:597 [inline] generic_splice_sendpage+0xd4/0x140 fs/splice.c:746 do_splice_from fs/splice.c:767 [inline] direct_splice_actor+0x110/0x180 fs/splice.c:936 splice_direct_to_actor+0x34b/0x8c0 fs/splice.c:891 do_splice_direct+0x1b3/0x280 fs/splice.c:979 do_sendfile+0xae9/0x1240 fs/read_write.c:1249 __do_sys_sendfile64 fs/read_write.c:1314 [inline] __se_sys_sendfile64 fs/read_write.c:1300 [inline] __x64_sys_sendfile64+0x1cc/0x210 fs/read_write.c:1300 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f215cb69969 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc96bb3868 EFLAGS: 00000246 ORIG_RAX: 0000000000000028 RAX: ffffffffffffffda RBX: 00007f215cbad072 RCX: 00007f215cb69969 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000005 RBP: 0000000000000000 R08: 00007ffc96bb3a08 R09: 00007ffc96bb3a08 R10: 0000000100000002 R11: 0000000000000246 R12: 00007ffc96bb387c R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000 the problem originates from uncorrect lock annotation in the mptcp code and is only visible since commit 2dcb96bacce3 ("net: core: Correct the sock::sk_lock.owned lockdep annotations"), but is present since the port-based endpoint support initial implementation. This patch addresses the issue introducing a nested variant of lock_sock_fast() and using it in the relevant code path. Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Fixes: 2dcb96bacce3 ("net: core: Correct the sock::sk_lock.owned lockdep annotations") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Reported-and-tested-by: syzbot+1dd53f7a89b299d59eaf@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29Merge tag 'sound-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds2-0/+2
Pull sound fixes from Takashi Iwai: "This became a slightly large collection of changes, partly because I've been off in the last weeks. Most of changes are small and scattered while a bit big change is found in HD-audio Realtek codec driver; it's a very device-specific fix that has been long wanted, so I decided to pick up although it's in the middle RC. Some highlights: - A new guard ioctl for ALSA rawmidi API to avoid the misuse of the new timestamp framing mode; it's for a regression fix - HD-audio: a revert of the 5.15 change that might work badly, new quirks for Lenovo Legion & co, a follow-up fix for CS8409 - ASoC: lots of SOF-related fixes, fsl component fixes, corrections of mediatek drivers - USB-audio: fix for the PM resume - FireWire: oxfw and motu fixes" * tag 'sound-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (25 commits) ALSA: pcsp: Make hrtimer forwarding more robust ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSION ALSA: firewire-motu: fix truncated bytes in message tracepoints ASoC: SOF: trace: Omit error print when waking up trace sleepers ASoC: mediatek: mt8195: remove wrong fixup assignment on HDMITX ASoC: SOF: loader: Re-phrase the missing firmware error to avoid duplication ASoC: SOF: loader: release_firmware() on load failure to avoid batching ALSA: hda/cs8409: Setup Dolphin Headset Mic as Phantom Jack ALSA: pcxhr: "fix" PCXHR_REG_TO_PORT definition ASoC: SOF: imx: imx8m: Bar index is only valid for IRAM and SRAM types ASoC: SOF: imx: imx8: Bar index is only valid for IRAM and SRAM types ASoC: SOF: Fix DSP oops stack dump output contents ALSA: hda/realtek: Quirks to enable speaker output for Lenovo Legion 7i 15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops. ALSA: usb-audio: Unify mixer resume and reset_resume procedure Revert "ALSA: hda: Drop workaround for a hang at shutdown again" ALSA: oxfw: fix transmission method for Loud models based on OXFW971 ASoC: mediatek: common: handle NULL case in suspend/resume function ASoC: fsl_xcvr: register platform component before registering cpu dai ASoC: fsl_spdif: register platform component before registering cpu dai ASoC: fsl_micfil: register platform component before registering cpu dai ...
2021-09-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-1/+2
Daniel Borkmann says: ==================== pull-request: bpf 2021-09-28 The following pull-request contains BPF updates for your *net* tree. We've added 10 non-merge commits during the last 14 day(s) which contain a total of 11 files changed, 139 insertions(+), 53 deletions(-). The main changes are: 1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk. 2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh. 3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann. 4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi. 5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer. 6) Fix return value handling for struct_ops BPF programs, from Hou Tao. 7) Various fixes to BPF selftests, from Jiri Benc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> ,
2021-09-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-9/+6
Pull kvm fixes from Paolo Bonzini: "A bit late... I got sidetracked by back-from-vacation routines and conferences. But most of these patches are already a few weeks old and things look more calm on the mailing list than what this pull request would suggest. x86: - missing TLB flush - nested virtualization fixes for SMM (secure boot on nested hypervisor) and other nested SVM fixes - syscall fuzzing fixes - live migration fix for AMD SEV - mirror VMs now work for SEV-ES too - fixes for reset - possible out-of-bounds access in IOAPIC emulation - fix enlightened VMCS on Windows 2022 ARM: - Add missing FORCE target when building the EL2 object - Fix a PMU probe regression on some platforms Generic: - KCSAN fixes selftests: - random fixes, mostly for clang compilation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits) selftests: KVM: Explicitly use movq to read xmm registers selftests: KVM: Call ucall_init when setting up in rseq_test KVM: Remove tlbs_dirty KVM: X86: Synchronize the shadow pagetable before link it KVM: X86: Fix missed remote tlb flush in rmap_write_protect() KVM: x86: nSVM: don't copy virt_ext from vmcb12 KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0 KVM: x86: nSVM: restore int_vector in svm_clear_vintr kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[] KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode KVM: x86: reset pdptrs_from_userspace when exiting smm KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit KVM: nVMX: Filter out all unsupported controls when eVMCS was activated KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs KVM: Clean up benign vcpu->cpu data races when kicking vCPUs ...
2021-09-27Merge tag 'mac80211-for-net-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211David S. Miller1-4/+4
Johannes berg says: ==================== Some fixes: * potential use-after-free in CCMP/GCMP RX processing * potential use-after-free in TX A-MSDU processing * revert to low data rates for no-ack as the commit broke other things * limit VHT MCS/NSS in radiotap injection * drop frames with invalid addresses in IBSS mode * check rhashtable_init() return value in mesh * fix potentially unaligned access in mesh * fix late beacon hrtimer handling in hwsim (syzbot) * fix documentation for PTK0 rekeying ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27mac80211: Fix Ptk0 rekey documentationAlexander Wetzel1-4/+4
@IEEE80211_KEY_FLAG_GENERATE_IV setting is irrelevant for RX. Move the requirement to the correct section in the PTK0 rekey documentation. Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Link: https://lore.kernel.org/r/20210924200514.7936-1-alexander@wetzel-home.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-09-26Merge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+2
Pull x86 fixes from Thomas Gleixner: "A set of fixes for X86: - Prevent sending the wrong signal when protection keys are enabled and the kernel handles a fault in the vsyscall emulation. - Invoke early_reserve_memory() before invoking e820_memory_setup() which is required to make the Xen dom0 e820 hooks work correctly. - Use the correct data type for the SETZ operand in the EMQCMDS instruction wrapper. - Prevent undefined behaviour to the potential unaligned accesss in the instruction decoder library" * tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses x86/asm: Fix SETZ size enqcmds() build failure x86/setup: Call early_reserve_memory() earlier x86/fault: Fix wrong signal when vsyscall fails with pkey
2021-09-26Merge tag 'irq-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull irq fixes from Thomas Gleixner: "A set of fixes for interrupt chip drivers: - Work around a bad GIC integration on a Renesas platform which can't handle byte-sized MMIO access - Plug a potential memory leak in the GICv4 driver - Fix a regression in the Armada 370-XP IPI code which was caused by issuing EOI instack of ACK. - A couple of small fixes here and there" * tag 'irq-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic: Work around broken Renesas integration irqchip/renesas-rza1: Use semicolons instead of commas irqchip/gic-v3-its: Fix potential VPE leak on error irqchip/goldfish-pic: Select GENERIC_IRQ_CHIP to fix build irqchip/mbigen: Repair non-kernel-doc notation irqdomain: Change the type of 'size' in __irq_domain_add() to be consistent irqchip/armada-370-xp: Fix ack/eoi breakage Documentation: Fix irq-domain.rst build warning
2021-09-26net: prevent user from passing illegal stab size王贇1-0/+1
We observed below report when playing with netlink sock: UBSAN: shift-out-of-bounds in net/sched/sch_api.c:580:10 shift exponent 249 is too large for 32-bit type CPU: 0 PID: 685 Comm: a.out Not tainted Call Trace: dump_stack_lvl+0x8d/0xcf ubsan_epilogue+0xa/0x4e __ubsan_handle_shift_out_of_bounds+0x161/0x182 __qdisc_calculate_pkt_len+0xf0/0x190 __dev_queue_xmit+0x2ed/0x15b0 it seems like kernel won't check the stab log value passing from user, and will use the insane value later to calculate pkt_len. This patch just add a check on the size/cell_log to avoid insane calculation. Reported-by: Abaci <abaci@linux.alibaba.com> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-25Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-3/+7
Merge misc fixes from Andrew Morton: "16 patches. Subsystems affected by this patch series: xtensa, sh, ocfs2, scripts, lib, and mm (memory-failure, kasan, damon, shmem, tools, pagecache, debug, and pagemap)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: fix uninitialized use in overcommit_policy_handler mm/memory_failure: fix the missing pte_unmap() call kasan: always respect CONFIG_KASAN_STACK sh: pgtable-3level: fix cast to pointer from integer of different size mm/debug: sync up latest migrate_reason to migrate_reason_names mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN mm: fs: invalidate bh_lrus for only cold path lib/zlib_inflate/inffast: check config in C to avoid unused function warning tools/vm/page-types: remove dependency on opt_file for idle page tracking scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error ocfs2: drop acl cache for directories too mm/shmem.c: fix judgment error in shmem_is_huge() xtensa: increase size of gcc stack frame check mm/damon: don't use strnlen() with known-bogus source length kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-25Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-1/+0
Pull SCSI fixes from James Bottomley: "Thirty-three fixes, I'm afraid. Essentially the build up from the last couple of weeks while I've been dealling with Linux Plumbers conference infrastructure issues. It's mostly the usual assortment of spelling fixes and minor corrections. The only core relevant changes are to the sd driver to reduce the spin up message spew and fix a small memory leak on the freeing path" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (33 commits) scsi: ses: Retry failed Send/Receive Diagnostic commands scsi: target: Fix spelling mistake "CONFLIFT" -> "CONFLICT" scsi: lpfc: Fix gcc -Wstringop-overread warning, again scsi: lpfc: Use correct scnprintf() limit scsi: lpfc: Fix sprintf() overflow in lpfc_display_fpin_wwpn() scsi: core: Remove 'current_tag' scsi: acornscsi: Remove tagged queuing vestiges scsi: fas216: Kill scmd->tag scsi: qla2xxx: Restore initiator in dual mode scsi: ufs: core: Unbreak the reset handler scsi: sd_zbc: Support disks with more than 2**32 logical blocks scsi: ufs: core: Revert "scsi: ufs: Synchronize SCSI and UFS error handling" scsi: bsg: Fix device unregistration scsi: sd: Make sd_spinup_disk() less noisy scsi: ufs: ufs-pci: Fix Intel LKF link stability scsi: mpt3sas: Clean up some inconsistent indenting scsi: megaraid: Clean up some inconsistent indenting scsi: sr: Fix spelling mistake "does'nt" -> "doesn't" scsi: Remove SCSI CDROM MAINTAINERS entry scsi: megaraid: Fix Coccinelle warning ...
2021-09-25Merge tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds1-12/+0
Pull xen fixes from Juergen Gross: "Some minor cleanups and fixes of some theoretical bugs, as well as a fix of a bug introduced in 5.15-rc1" * tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/x86: fix PV trap handling on secondary processors xen/balloon: fix balloon kthread freezing swiotlb-xen: this is PV-only on x86 xen/pci-swiotlb: reduce visibility of symbols PCI: only build xen-pcifront in PV-enabled environments swiotlb-xen: ensure to issue well-formed XENMEM_exchange requests Xen/gntdev: don't ignore kernel unmapping error xen/x86: drop redundant zeroing from cpu_initialize_context()
2021-09-25Merge tag 'erofs-for-5.15-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofsLinus Torvalds1-3/+3
Pull erofs fixes from Gao Xiang: "Two bugfixes to fix the 4KiB blockmap chunk format availability and a dangling pointer usage. There is also a trivial cleanup to clarify compacted_2b if compacted_4b_initial > totalidx. Summary: - fix the dangling pointer use in erofs_lookup tracepoint - fix unsupported chunk format check - zero out compacted_2b if compacted_4b_initial > totalidx" * tag 'erofs-for-5.15-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: clear compacted_2b if compacted_4b_initial > totalidx erofs: fix misbehavior of unsupported chunk format check erofs: fix up erofs_lookup tracepoint
2021-09-25Merge tag 'char-misc-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds2-0/+21
Pull char/misc driver fixes from Greg KH: "Here are some small char and misc driver fixes for 5.15-rc3. Nothing huge in here, just fixes for a number of small issues that have been reported. These include: - habanalabs race conditions and other bugs fixed - binder driver fixes - fpga driver fixes - coresight build warning fix - nvmem driver fix - comedi memory leak fix - bcm-vk tty race fix - other tiny driver fixes All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) comedi: Fix memory leak in compat_insnlist() nvmem: NVMEM_NINTENDO_OTP should depend on WII misc: bcm-vk: fix tty registration race fpga: dfl: Avoid reads to AFU CSRs during enumeration fpga: machxo2-spi: Fix missing error code in machxo2_write_complete() fpga: machxo2-spi: Return an error on failure habanalabs: expose a single cs seq in staged submissions habanalabs: fix wait offset handling habanalabs: rate limit multi CS completion errors habanalabs/gaudi: fix LBW RR configuration habanalabs: Fix spelling mistake "FEADBACK" -> "FEEDBACK" habanalabs: fail collective wait when not supported habanalabs/gaudi: use direct MSI in single mode habanalabs: fix kernel OOPs related to staged cs habanalabs: fix potential race in interrupt wait ioctl mcb: fix error handling in mcb_alloc_bus() misc: genwqe: Fixes DMA mask setting coresight: syscfg: Fix compiler warning nvmem: core: Add stubs for nvmem_cell_read_variable_le_u32/64 if !CONFIG_NVMEM binder: make sure fd closes complete ...
2021-09-25Merge tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds1-0/+2
Pull USB driver fixes from Greg KH: "Here are some USB driver fixes and new device ids for 5.15-rc3. They include: - usb-storage quirk additions - usb-serial new device ids - usb-serial driver fixes - USB roothub registration bugfix to resolve a long-reported issue - usb gadget driver fixes for a large number of small things - dwc2 driver fixes All of these have been in linux-next for a while with no reported issues" * tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits) USB: serial: option: add device id for Foxconn T99W265 USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter USB: serial: cp210x: add part-number debug printk USB: serial: cp210x: fix dropped characters with CP2102 MAINTAINERS: usb, update Peter Korsgaard's entries usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned() usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c Re-enable UAS for LaCie Rugged USB3-FW with fk quirk USB: serial: option: remove duplicate USB device ID USB: serial: mos7840: remove duplicated 0xac24 device ID arm64: dts: qcom: ipq8074: remove USB tx-fifo-resize property usb: gadget: f_uac2: Populate SS descriptors' wBytesPerInterval usb: gadget: f_uac2: Add missing companion descriptor for feedback EP usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA usb: core: hcd: Modularize HCD stop configuration in usb_stop_hcd() xhci: Set HCD flag to defer primary roothub registration usb: core: hcd: Add support for deferring roothub registration usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave usb: dwc3: core: balance phy init and exit Revert "USB: bcma: Add a check for devm_gpiod_get" ...
2021-09-24mm/debug: sync up latest migrate_reason to migrate_reason_namesWeizhao Ouyang1-1/+5
Sync up MR_DEMOTION to migrate_reason_names and add a synch prompt. Link: https://lkml.kernel.org/r/20210921064553.293905-3-o451686892@gmail.com Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim") Signed-off-by: Weizhao Ouyang <o451686892@gmail.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mina Almasry <almasrymina@google.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Xu <weixugc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24mm: fs: invalidate bh_lrus for only cold pathMinchan Kim1-2/+2
The kernel test robot reported the regression of fio.write_iops[1] with commit 8cc621d2f45d ("mm: fs: invalidate BH LRU during page migration"). Since lru_add_drain is called frequently, invalidate bh_lrus there could increase bh_lrus cache miss ratio, which needs more IO in the end. This patch moves the bh_lrus invalidation from the hot path( e.g., zap_page_range, pagevec_release) to cold path(i.e., lru_add_drain_all, lru_cache_disable). Zhengjun Xing confirmed "I test the patch, the regression reduced to -2.9%" [1] https://lore.kernel.org/lkml/20210520083144.GD14190@xsang-OptiPlex-9020/ [2] 8cc621d2f45d, mm: fs: invalidate BH LRU during page migration Link: https://lkml.kernel.org/r/20210907212347.1977686-1-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Tested-by: "Xing, Zhengjun" <zhengjun.xing@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24Merge tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds1-8/+0
Pull ACPI fix from Rafael Wysocki: "Revert a recent commit related to memory management that turned out to be problematic (Jia He)" * tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
2021-09-24net: ipv4: Fix rtnexthop len when RTA_FLOW is presentXiao Liang2-2/+2
Multipath RTA_FLOW is embedded in nexthop. Dump it in fib_add_nexthop() to get the length of rtnexthop correct. Fixes: b0f60193632e ("ipv4: Refactor nexthop attributes in fib_dump_info") Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24Merge tag 'irqchip-fixes-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgentThomas Gleixner1-1/+1
Pull irqchip fixes from Marc Zyngier: - Work around a bad GIC integration on a Renesas platform, where the interconnect cannot deal with byte-sized MMIO accesses - Cleanup another Renesas driver abusing the comma operator - Fix a potential GICv4 memory leak on an error path - Make the type of 'size' consistent with the rest of the code in __irq_domain_add() - Fix a regression in the Armada 370-XP IPI path - Fix the build for the obviously unloved goldfish-pic - Some documentation fixes Link: https://lore.kernel.org/r/20210924090933.2766857-1-maz@kernel.org
2021-09-24Merge tag 'kvmarm-fixes-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-masterPaolo Bonzini2-3/+6
KVM/arm64 fixes for 5.15, take #1 - Add missing FORCE target when building the EL2 object - Fix a PMU probe regression on some platforms
2021-09-23Revert "ACPI: Add memory semantics to acpi_os_map_memory()"Jia He1-8/+0
This reverts commit 437b38c51162f8b87beb28a833c4d5dc85fa864e. The memory semantics added in commit 437b38c51162 causes SystemMemory Operation region, whose address range is not described in the EFI memory map to be mapped as NormalNC memory on arm64 platforms (through acpi_os_map_memory() in acpi_ex_system_memory_space_handler()). This triggers the following abort on an ARM64 Ampere eMAG machine, because presumably the physical address range area backing the Opregion does not support NormalNC memory attributes driven on the bus. Internal error: synchronous external abort: 96000410 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0+ #462 Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 0.14 02/22/2019 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [...snip...] Call trace: acpi_ex_system_memory_space_handler+0x26c/0x2c8 acpi_ev_address_space_dispatch+0x228/0x2c4 acpi_ex_access_region+0x114/0x268 acpi_ex_field_datum_io+0x128/0x1b8 acpi_ex_extract_from_field+0x14c/0x2ac acpi_ex_read_data_from_field+0x190/0x1b8 acpi_ex_resolve_node_to_value+0x1ec/0x288 acpi_ex_resolve_to_value+0x250/0x274 acpi_ds_evaluate_name_path+0xac/0x124 acpi_ds_exec_end_op+0x90/0x410 acpi_ps_parse_loop+0x4ac/0x5d8 acpi_ps_parse_aml+0xe0/0x2c8 acpi_ps_execute_method+0x19c/0x1ac acpi_ns_evaluate+0x1f8/0x26c acpi_ns_init_one_device+0x104/0x140 acpi_ns_walk_namespace+0x158/0x1d0 acpi_ns_initialize_devices+0x194/0x218 acpi_initialize_objects+0x48/0x50 acpi_init+0xe0/0x498 If the Opregion address range is not present in the EFI memory map there is no way for us to determine the memory attributes to use to map it - defaulting to NormalNC does not work (and it is not correct on a memory region that may have read side-effects) and therefore commit 437b38c51162 should be reverted, which means reverting back to the original behavior whereby address ranges that are mapped using acpi_os_map_memory() default to the safe devicenGnRnE attributes on ARM64 if the mapped address range is not defined in the EFI memory map. Fixes: 437b38c51162 ("ACPI: Add memory semantics to acpi_os_map_memory()") Signed-off-by: Jia He <justin.he@arm.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-09-23Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+2
Pull rseq fixes from Paolo Bonzini: "A fix for a bug with restartable sequences and KVM. KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without informing rseq and leads to stale data in userspace's rseq struct" * tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: selftests: Remove __NR_userfaultfd syscall fallback KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs tools: Move x86 syscall number fallbacks to .../uapi/ entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume() KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
2021-09-23Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds5-2/+15
Pull networking fixes from Jakub Kicinski: "Current release - regressions: - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports() Previous releases - regressions: - introduce a shutdown method to mdio device drivers, and make DSA switch drivers compatible with masters disappearing on shutdown; preventing infinite reference wait - fix issues in mdiobus users related to ->shutdown vs ->remove - virtio-net: fix pages leaking when building skb in big mode - xen-netback: correct success/error reporting for the SKB-with-fraglist - dsa: tear down devlink port regions when tearing down the devlink port on error - nexthop: fix division by zero while replacing a resilient group - hns3: check queue, vf, vlan ids range before using Previous releases - always broken: - napi: fix race against netpoll causing NAPI getting stuck - mlx4_en: ensure link operstate is updated even if link comes up before netdev registration - bnxt_en: fix TX timeout when TX ring size is set to the smallest - enetc: fix illegal access when reading affinity_hint; prevent oops on sysfs access - mtk_eth_soc: avoid creating duplicate offload entries Misc: - core: correct the sock::sk_lock.owned lockdep annotations" * tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits) atlantic: Fix issue in the pm resume flow. net/mlx4_en: Don't allow aRFS for encapsulated packets net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries nfc: st-nci: Add SPI ID matching DT compatible MAINTAINERS: remove Guvenc Gulce as net/smc maintainer nexthop: Fix memory leaks in nexthop notification chain listeners mptcp: ensure tx skbs always have the MPTCP ext qed: rdma - don't wait for resources under hw error recovery flow s390/qeth: fix deadlock during failing recovery s390/qeth: Fix deadlock in remove_discipline s390/qeth: fix NULL deref in qeth_clear_working_pool_list() net: dsa: realtek: register the MDIO bus under devres net: dsa: don't allocate the slave_mii_bus using devres Doc: networking: Fox a typo in ice.rst net: dsa: fix dsa_tree_setup error path net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work net/smc: add missing error check in smc_clc_prfx_set() net: hns3: fix a return value error in hclge_get_reset_status() net: hns3: check vlan id before using it ...
2021-09-23driver core: fw_devlink: Add support for FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADDSaravana Kannan1-3/+8
If a parent device is also a supplier to a child device, fw_devlink=on by design delays the probe() of the child device until the probe() of the parent finishes successfully. However, some drivers of such parent devices (where parent is also a supplier) expect the child device to finish probing successfully as soon as they are added using device_add() and before the probe() of the parent device has completed successfully. One example of such a case is discussed in the link mentioned below. Add a flag to make fw_devlink=on not enforce these supplier-consumer relationships, so these drivers can continue working. Link: https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com/ Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by default""") Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210915170940.617415-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-23erofs: fix up erofs_lookup tracepointGao Xiang1-3/+3
Fix up a misuse that the filename pointer isn't always valid in the ring buffer, and we should copy the content instead. Link: https://lore.kernel.org/r/20210921143531.81356-1-hsiangkao@linux.alibaba.com Fixes: 13f06f48f7bf ("staging: erofs: support tracepoint") Cc: stable@vger.kernel.org # 4.19+ Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-09-23KVM: Remove tlbs_dirtyLai Jiangshan1-1/+0
There is no user of tlbs_dirty. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210918005636.3675-4-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSIONJaroslav Kysela2-0/+2
The new framing mode causes the user space regression, because the alsa-lib code does not initialize the reserved space in the params structure when the device is opened. This change adds SNDRV_RAWMIDI_IOCTL_USER_PVERSION like we do for the PCM interface for the protocol acknowledgment. Cc: David Henningsson <coding@diwic.se> Cc: <stable@vger.kernel.org> Fixes: 08fdced60ca0 ("ALSA: rawmidi: Add framing mode") BugLink: https://github.com/alsa-project/alsa-lib/issues/178 Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20210920171850.154186-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-09-22KVM: x86: Query vcpu->vcpu_idx directly and drop its accessorSean Christopherson1-5/+0
Read vcpu->vcpu_idx directly instead of bouncing through the one-line wrapper, kvm_vcpu_get_idx(), and drop the wrapper. The wrapper is a remnant of the original implementation and serves no purpose; remove it before it gains more users. Back when kvm_vcpu_get_idx() was added by commit 497d72d80a78 ("KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus"), the implementation was more than just a simple wrapper as vcpu->vcpu_idx did not exist and retrieving the index meant walking over the vCPU array to find the given vCPU. When vcpu_idx was introduced by commit 8750e72a79dd ("KVM: remember position in kvm->vcpus array"), the helper was left behind, likely to avoid extra thrash (but even then there were only two users, the original arm usage having been removed at some point in the past). No functional change intended. Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210910183220.2397812-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()Sean Christopherson1-0/+2
Invoke rseq_handle_notify_resume() from tracehook_notify_resume() now that the two function are always called back-to-back by architectures that have rseq. The rseq helper is stubbed out for architectures that don't support rseq, i.e. this is a nop across the board. Note, tracehook_notify_resume() is horribly named and arguably does not belong in tracehook.h as literally every line of code in it has nothing to do with tracing. But, that's been true since commit a42c6ded827d ("move key_repace_session_keyring() into tracehook_notify_resume()") first usurped tracehook_notify_resume() back in 2012. Punt cleaning that mess up to future patches. No functional change intended. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210901203030.1292304-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22irqdomain: Change the type of 'size' in __irq_domain_add() to be consistentBixuan Cui1-1/+1
The 'size' is used in struct_size(domain, revmap, size) and its input parameter type is 'size_t'(unsigned int). Changing the size to 'unsigned int' to make the type consistent. Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210916025203.44841-1-cuibixuan@huawei.com
2021-09-22scsi: core: Remove 'current_tag'Hannes Reinecke1-1/+0
The 'current_tag' field in struct scsi_device is unused now; remove it. Link: https://lore.kernel.org/r/1631696835-136198-4-git-send-email-john.garry@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-21cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_bufTobias Klauser1-3/+4
The changes in the patch series [1] introduced a terminating null byte when reading from cpulist or cpumap sysfs files, for example: $ xxd /sys/devices/system/node/node0/cpulist 00000000: 302d 310a 00 0-1.. Before this change, the output looked as follows: $ xxd /sys/devices/system/node/node0/cpulist 00000000: 302d 310a 0-1. Fix this regression by excluding the terminating null byte from the returned length in cpumap_print_list_to_buf and cpumap_print_bitmask_to_buf. [1] https://lore.kernel.org/all/20210806110251.560-1-song.bao.hua@hisilicon.com/ Fixes: 1fae562983ca ("cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list") Acked-by: Barry Song <song.bao.hua@hisilicon.com> Acked-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Link: https://lore.kernel.org/r/20210916222705.13554-1-tklauser@distanz.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-20Merge tag 'afs-fixes-20210913' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fsLinus Torvalds1-2/+6
Pull AFS fixes from David Howells: "Fixes for AFS problems that can cause data corruption due to interaction with another client modifying data cached locally: - When d_revalidating a dentry, don't look at the inode to which it points. Only check the directory to which the dentry belongs. This was confusing things and causing the silly-rename cleanup code to remove the file now at the dentry of a file that got deleted. - Fix mmap data coherency. When a callback break is received that relates to a file that we have cached, the data content may have been changed (there are other reasons, such as the user's rights having been changed). However, we're checking it lazily, only on entry to the kernel, which doesn't happen if we have a writeable shared mapped page on that file. We make the kernel keep track of mmapped files and clear all PTEs mapping to that file as soon as the callback comes in by calling unmap_mapping_pages() (we don't necessarily want to zap the pagecache). This causes the kernel to be reentered when userspace tries to access the mmapped address range again - and at that point we can query the server and, if we need to, zap the page cache. Ideally, I would check each file at the point of notification, but that involves poking the server[*] - which is holding an exclusive lock on the vnode it is changing, waiting for all the clients it notified to reply. This could then deadlock against the server. Further, invalidating the pagecache might call ->launder_page(), which would try to write to the file, which would definitely deadlock. (AFS doesn't lease file access). [*] Checking to see if the file content has changed is a matter of comparing the current data version number, but we have to ask the server for that. We also need to get a new callback promise and we need to poke the server for that too. - Add some more points at which the inode is validated, since we're doing it lazily, notably in ->read_iter() and ->page_mkwrite(), but also when performing some directory operations. Ideally, checking in ->read_iter() would be done in some derivation of filemap_read(). If we're going to call the server to read the file, then we get the file status fetch as part of that. - The above is now causing us to make a lot more calls to afs_validate() to check the inode - and afs_validate() takes the RCU read lock each time to make a quick check (ie. afs_check_validity()). This is entirely for the purpose of checking cb_s_break to see if the server we're using reinitialised its list of callbacks - however this isn't a very common event, so most of the time we're taking this needlessly. Add a new cell-wide counter to count the number of reinitialisations done by any server and check that - and only if that changes, take the RCU read lock and check the server list (the server list may change, but the cell a file is part of won't). - Don't update vnode->cb_s_break and ->cb_v_break inside the validity checking loop. The cb_lock is done with read_seqretry, so we might go round the loop a second time after resetting those values - and that could cause someone else checking validity to miss something (I think). Also included are patches for fixes for some bugs encountered whilst debugging this: - Fix a leak of afs_read objects and fix a leak of keys hidden by that. - Fix a leak of pages that couldn't be added to extend a writeback. - Fix the maintenance of i_blocks when i_size is changed by a local write or a local dir edit" Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217 [1] Link: https://lore.kernel.org/r/163111665183.283156.17200205573146438918.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/ # i_blocks patch * tag 'afs-fixes-20210913' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix updating of i_blocks on file/dir extension afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server afs: Try to avoid taking RCU read lock when checking vnode validity afs: Fix mmap coherency vs 3rd-party changes afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation afs: Add missing vnode validation checks afs: Fix page leak afs: Fix missing put on afs_read objects and missing get on the key therein
2021-09-20Merge tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds1-1/+0
Pull cifs client fixes from Steve French: - two deferred close fixes (for bugs found with xfstests 478 and 461) - a deferred close improvement in rename - two trivial fixes for incorrect Linux comment formatting of multiple cifs files (pointed out by automated kernel test robot and checkpatch) * tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6: cifs: Not to defer close on file when lock is set cifs: Fix soft lockup during fsstress cifs: Deferred close performance improvements cifs: fix incorrect kernel doc comments cifs: remove pathname for file from SPDX header