aboutsummaryrefslogtreecommitdiffstats
path: root/fs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-05-30Merge branch 'guilt/xfs-5.19-larp-cleanups' into xfs-5.19-for-nextDave Chinner14-81/+126
This series contains a two key cleanups for the new LARP code. Most of it is refactoring and tweaking the code that creates kernel log messages about enabling and disabling features -- we should be warning about LARP being turned on once per mount, instead of once per insmod cycle; we shouldn't be spamming the logs so aggressively about turning *off* log incompat features. The second part of the series refactors the LARP code responsible for getting (and releasing) permission to use xattr log items. The implementation code doesn't belong in xfs_log.c, and calls to logging functions don't belong in libxfs -- they really should be done by the VFS implementation functions before they start calling into libraries. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-30Merge branch 'guilt/xfs-5.19-recovery-buf-cancel' into xfs-5.19-for-nextDave Chinner4-32/+85
As part of solving the memory leaks and UAF problems in the new LARP code, kmemleak also reported that log recovery will leak the table used to hash buffer cancellations if the recovery fails. Fix this problem by creating alloc/free helpers that initialize and free the hashtable contents correctly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-30xfs: fix xfs_ifree() error handling to not leak perag refBrian Foster1-1/+1
For some reason commit 9a5280b312e2e ("xfs: reorder iunlink remove operation in xfs_ifree") replaced a jump to the exit path in the event of an xfs_difree() error with a direct return, which skips releasing the perag reference acquired at the top of the function. Restore the original code to drop the reference on error. Fixes: 9a5280b312e2e ("xfs: reorder iunlink remove operation in xfs_ifree") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-29erofs: fix crash when enable tracepoint cachefiles_prep_readXin Yin1-0/+1
RIP: 0010:trace_event_raw_event_cachefiles_prep_read+0x88/0xe0 [cachefiles] Call Trace: <TASK> cachefiles_prepare_read+0x1d7/0x3a0 [cachefiles] erofs_fscache_read_folios+0x188/0x220 [erofs] erofs_fscache_meta_readpage+0x106/0x160 [erofs] do_read_cache_folio+0x42a/0x590 ? bdi_register_va.part.14+0x1a7/0x210 ? super_setup_bdi_name+0x76/0xe0 erofs_bread+0x5b/0x170 [erofs] erofs_fc_fill_super+0x12b/0xc50 [erofs] This tracepoint uses rreq->inode, should set it when allocating. Fixes: d435d53228dd ("erofs: change to use asynchronous io for fscache readpage/readahead") Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220527101800.22360-1-yinxin.x@bytedance.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-29erofs: leave compressed inodes unsupported in fscache mode for nowJeffle Xu1-1/+4
erofs over fscache doesn't support the compressed layout yet. It will cause NULL crash if there are compressed inodes contained when working in fscache mode. So far in the erofs based container image distribution scenarios (RAFS v6), the compressed RAFS v6 images are downloaded and then decompressed on demand as an uncompressed erofs image. Then the erofs image is mounted in fscache mode for containers to use. IOWs, currently compressed data is decompressed on the userspace side instead and uncompressed erofs images will be finally cached. The fscache support for the compressed layout is still under development and it will be used for runtime decompression feature. Anyway, to avoid the potential crash, let's leave the compressed inodes unsupported in fscache mode until we support it later. Fixes: 1442b02b66ad ("erofs: implement fscache-based data read for non-inline layout") Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220526010344.118493-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-28Merge tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-7/+16
Pull powerpc updates from Michael Ellerman: - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later) - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later - Drop support for system call instruction emulation - Many other small features and fixes Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng. * tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/64: Include cache.h directly in paca.h powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set powerpc/xics: Include missing header powerpc/powernv/pci: Drop VF MPS fixup powerpc/fsl_book3e: Don't set rodata RO too early powerpc/microwatt: Add mmu bits to device tree powerpc/powernv/flash: Check OPAL flash calls exist before using powerpc/powermac: constify device_node in of_irq_parse_oldworld() powerpc/powermac: add missing g5_phy_disable_cpu1() declaration selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" powerpc: Enable the DAWR on POWER9 DD2.3 and above powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask powerpc: Fix all occurences of "the the" selftests/powerpc/pmu/ebb: remove fixed_instruction.S powerpc/platforms/83xx: Use of_device_get_match_data() powerpc/eeh: Drop redundant spinlock initialization powerpc/iommu: Add missing of_node_put in iommu_init_early_dart powerpc/pseries/vas: Call misc_deregister if sysfs init fails powerpc/papr_scm: Fix leaking nvdimm_events_map elements ...
2022-05-27ksmbd: smbd: relax the count of sges requiredHyunchul Lee1-8/+5
Remove the condition that the count of sges must be greater than or equal to SMB_DIRECT_MAX_SEND_SGES(8). Because ksmbd needs sges only for SMB direct header, SMB2 transform header, SMB2 response, and optional payload. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Tom Talpey <tom@talpey.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-27Merge tag '5.19-rc-smb3-client-fixes-updated' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds24-189/+559
Pull cifs client updates from Steve French: - multichannel fixes to improve reconnect after network failure - improved caching of root directory contents (extending benefit of directory leases) - two DFS fixes - three fixes for improved debugging - an NTLMSSP fix for mounts t0 older servers - new mount parm to allow disabling creating sparse files - various cleanup fixes and minor fixes pointed out by coverity * tag '5.19-rc-smb3-client-fixes-updated' of git://git.samba.org/sfrench/cifs-2.6: (24 commits) smb3: remove unneeded null check in cifs_readdir cifs: fix ntlmssp on old servers cifs: cache the dirents for entries in a cached directory cifs: avoid parallel session setups on same channel cifs: use new enum for ses_status cifs: do not use tcpStatus after negotiate completes smb3: add mount parm nosparse smb3: don't set rc when used and unneeded in query_info_compound smb3: check for null tcon cifs: fix minor compile warning Add various fsctl structs Add defines for various newer FSCTLs smb3: add trace point for oplock not found cifs: return the more nuanced writeback error on close() smb3: add trace point for lease not found issue cifs: smbd: fix typo in comment cifs: set the CREATE_NOT_FILE when opening the directory in use_cached_dir() cifs: check for smb1 in open_cached_dir() cifs: move definition of cifs_fattr earlier in cifsglob.h cifs: print TIDs as hex ...
2022-05-27Merge tag 'jfs-5.19' of https://github.com/kleikamp/linux-shaggyLinus Torvalds10-1652/+3
Pull jfs updates from David Kleikamp: "One bug fix and some code cleanup" * tag 'jfs-5.19' of https://github.com/kleikamp/linux-shaggy: fs/jfs: Remove dead code fs: jfs: fix possible NULL pointer dereference in dbFree()
2022-05-27Merge tag 'libnvdimm-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds3-9/+23
Pull libnvdimm and DAX updates from Dan Williams: "New support for clearing memory errors when a file is in DAX mode, alongside with some other fixes and cleanups. Previously it was only possible to clear these errors using a truncate or hole-punch operation to trigger the filesystem to reallocate the block, now, any page aligned write can opportunistically clear errors as well. This change spans x86/mm, nvdimm, and fs/dax, and has received the appropriate sign-offs. Thanks to Jane for her work on this. Summary: - Add support for clearing memory error via pwrite(2) on DAX - Fix 'security overwrite' support in the presence of media errors - Miscellaneous cleanups and fixes for nfit_test (nvdimm unit tests)" * tag 'libnvdimm-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: pmem: implement pmem_recovery_write() pmem: refactor pmem_clear_poison() dax: add .recovery_write dax_operation dax: introduce DAX_RECOVERY_WRITE dax access mode mce: fix set_mce_nospec to always unmap the whole page x86/mce: relocate set{clear}_mce_nospec() functions acpi/nfit: rely on mce->misc to determine poison granularity testing: nvdimm: asm/mce.h is not needed in nfit.c testing: nvdimm: iomap: make __nfit_test_ioremap a macro nvdimm: Allow overwrite in the presence of disabled dimms tools/testing/nvdimm: remove unneeded flush_workqueue
2022-05-27f2fs: fix to tag gcing flag on page during file defragmentChao Yu1-0/+1
In order to garantee migrated data be persisted during checkpoint, otherwise out-of-order persistency between data and node may cause data corruption after SPOR. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-05-27f2fs: replace F2FS_I(inode) and sbi by the local variableYufen Yu3-18/+18
We have define 'fi' at the begin of the functions, just use it, rather than use F2FS_I(inode) again. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> [Jaegeuk Kim: replace sbi] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-05-27Merge tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds25-260/+346
Pull misc updates from Andrew Morton: "The non-MM patch queue for this merge window. Not a lot of material this cycle. Many singleton patches against various subsystems. Most notably some maintenance work in ocfs2 and initramfs" * tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (65 commits) kcov: update pos before writing pc in trace function ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lock fs/ntfs: remove redundant variable idx fat: remove time truncations in vfat_create/vfat_mkdir fat: report creation time in statx fat: ignore ctime updates, and keep ctime identical to mtime in memory fat: split fat_truncate_time() into separate functions MAINTAINERS: add Muchun as a memcg reviewer proc/sysctl: make protected_* world readable ia64: mca: drop redundant spinlock initialization tty: fix deadlock caused by calling printk() under tty_port->lock relay: remove redundant assignment to pointer buf fs/ntfs3: validate BOOT sectors_per_clusters lib/string_helpers: fix not adding strarray to device's resource list kernel/crash_core.c: remove redundant check of ck_cmdline ELF, uapi: fixup ELF_ST_TYPE definition ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() ipc: update semtimedop() to use hrtimer ipc/sem: remove redundant assignments ...
2022-05-27pipe: Fix missing lock in pipe_resize_ring()David Howells1-13/+18
pipe_resize_ring() needs to take the pipe->rd_wait.lock spinlock to prevent post_one_notification() from trying to insert into the ring whilst the ring is being replaced. The occupancy check must be done after the lock is taken, and the lock must be taken after the new ring is allocated. The bug can lead to an oops looking something like: BUG: KASAN: use-after-free in post_one_notification.isra.0+0x62e/0x840 Read of size 4 at addr ffff88801cc72a70 by task poc/27196 ... Call Trace: post_one_notification.isra.0+0x62e/0x840 __post_watch_notification+0x3b7/0x650 key_create_or_update+0xb8b/0xd20 __do_sys_add_key+0x175/0x340 __x64_sys_add_key+0xbe/0x140 do_syscall_64+0x5c/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Reported by Selim Enes Karaduman @Enesdex working with Trend Micro Zero Day Initiative. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17291 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-27smb3: remove unneeded null check in cifs_readdirSteve French2-4/+3
Coverity pointed out an unneeded check. Addresses-Coverity: 1518030 ("Null pointer dereferences") Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-27ubifs: Use NULL instead of using plain integer as pointerHaowen Bai1-1/+1
This fixes the following sparse warnings: fs/ubifs/xattr.c:680:58: warning: Using plain integer as NULL pointer Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27ubifs: Simplify the return expression of run_gc()Minghao Chi1-5/+2
Simplify the return expression. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27jffs2: fix memory leak in jffs2_do_fill_superBaokun Li1-0/+1
If jffs2_iget() or d_make_root() in jffs2_do_fill_super() returns an error, we can observe the following kmemleak report: -------------------------------------------- unreferenced object 0xffff888105a65340 (size 64): comm "mount", pid 710, jiffies 4302851558 (age 58.239s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff859c45e5>] kmem_cache_alloc_trace+0x475/0x8a0 [<ffffffff86160146>] jffs2_sum_init+0x96/0x1a0 [<ffffffff86140e25>] jffs2_do_mount_fs+0x745/0x2120 [<ffffffff86149fec>] jffs2_do_fill_super+0x35c/0x810 [<ffffffff8614aae9>] jffs2_fill_super+0x2b9/0x3b0 [...] unreferenced object 0xffff8881bd7f0000 (size 65536): comm "mount", pid 710, jiffies 4302851558 (age 58.239s) hex dump (first 32 bytes): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ backtrace: [<ffffffff858579ba>] kmalloc_order+0xda/0x110 [<ffffffff85857a11>] kmalloc_order_trace+0x21/0x130 [<ffffffff859c2ed1>] __kmalloc+0x711/0x8a0 [<ffffffff86160189>] jffs2_sum_init+0xd9/0x1a0 [<ffffffff86140e25>] jffs2_do_mount_fs+0x745/0x2120 [<ffffffff86149fec>] jffs2_do_fill_super+0x35c/0x810 [<ffffffff8614aae9>] jffs2_fill_super+0x2b9/0x3b0 [...] -------------------------------------------- This is because the resources allocated in jffs2_sum_init() are not released. Call jffs2_sum_exit() to release these resources to solve the problem. Fixes: e631ddba5887 ("[JFFS2] Add erase block summary support (mount time improvement)") Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27jffs2: Use kzalloc instead of kmalloc/memsetHaowen Bai1-4/+2
Use kzalloc rather than duplicating its implementation, which makes code simple and easy to understand. Signed-off-by: Haowen Bai <baihaowen@meizu.com> [rw: Fixed printk string] Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-1/+1
Pull rdma updates from Jason Gunthorpe: "Small collection of incremental improvement patches: - Minor code cleanup patches, comment improvements, etc from static tools - Clean the some of the kernel caps, reducing the historical stealth uAPI leftovers - Bug fixes and minor changes for rdmavt, hns, rxe, irdma - Remove unimplemented cruft from rxe - Reorganize UMR QP code in mlx5 to avoid going through the IB verbs layer - flush_workqueue(system_unbound_wq) removal - Ensure rxe waits for objects to be unused before allowing the core to free them - Several rc quality bug fixes for hfi1" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (67 commits) RDMA/rtrs-clt: Fix one kernel-doc comment RDMA/hfi1: Remove all traces of diagpkt support RDMA/hfi1: Consolidate software versions RDMA/hfi1: Remove pointless driver version RDMA/hfi1: Fix potential integer multiplication overflow errors RDMA/hfi1: Prevent panic when SDMA is disabled RDMA/hfi1: Prevent use of lock before it is initialized RDMA/rxe: Fix an error handling path in rxe_get_mcg() IB/core: Fix typo in comment RDMA/core: Fix typo in comment IB/hf1: Fix typo in comment IB/qib: Fix typo in comment IB/iser: Fix typo in comment RDMA/mlx4: Avoid flush_scheduled_work() usage IB/isert: Avoid flush_scheduled_work() usage RDMA/mlx5: Remove duplicate pointer assignment in mlx5_ib_alloc_implicit_mr() RDMA/qedr: Remove unnecessary synchronize_irq() before free_irq() RDMA/hns: Use hr_reg_read() instead of remaining roce_get_xxx() RDMA/hns: Use hr_reg_xxx() instead of remaining roce_set_xxx() RDMA/irdma: Add SW mechanism to generate completions on error ...
2022-05-26Merge tag 'nfsd-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds16-371/+910
Pull nfsd updates from Chuck Lever: "We introduce 'courteous server' in this release. Previously NFSD would purge open and lock state for an unresponsive client after one lease period (typically 90 seconds). Now, after one lease period, another client can open and lock those files and the unresponsive client's lease is purged; otherwise if the unresponsive client's open and lock state is uncontended, the server retains that open and lock state for up to 24 hours, allowing the client's workload to resume after a lengthy network partition. A longstanding issue with NFSv4 file creation is also addressed. Previously a file creation can fail internally, returning an error to the client, but leave the newly created file in place as an artifact. The file creation code path has been reorganized so that internal failures and race conditions are less likely to result in an unwanted file creation. A fault injector has been added to help exercise paths that are run during kernel metadata cache invalidation. These caches contain information maintained by user space about exported filesystems. Many of our test workloads do not trigger cache invalidation. There is one patch that is needed to support PREEMPT_RT and a fix for an ancient 'sleep while spin-locked' splat that seems to have become easier to hit since v5.18-rc3" * tag 'nfsd-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (36 commits) NFSD: nfsd_file_put() can sleep NFSD: Add documenting comment for nfsd4_release_lockowner() NFSD: Modernize nfsd4_release_lockowner() NFSD: Fix possible sleep during nfsd4_release_lockowner() nfsd: destroy percpu stats counters after reply cache shutdown nfsd: Fix null-ptr-deref in nfsd_fill_super() nfsd: Unregister the cld notifier when laundry_wq create failed SUNRPC: Use RMW bitops in single-threaded hot paths NFSD: Clean up the show_nf_flags() macro NFSD: Trace filecache opens NFSD: Move documenting comment for nfsd4_process_open2() NFSD: Fix whitespace NFSD: Remove dprintk call sites from tail of nfsd4_open() NFSD: Instantiate a struct file when creating a regular NFSv4 file NFSD: Clean up nfsd_open_verified() NFSD: Remove do_nfsd_create() NFSD: Refactor NFSv4 OPEN(CREATE) NFSD: Refactor NFSv3 CREATE NFSD: Refactor nfsd_create_setattr() NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() ...
2022-05-27xfs: move xfs_attr_use_log_assist usage out of libxfsDarrick J. Wong6-19/+39
The LARP patchset added an awkward coupling point between libxfs and what would be libxlog, if the XFS log were actually its own library. Move the code that sets up logged xattr updates out of libxfs and into xfs_xattr.c so that libxfs no longer has to know about xlog_* functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: move xfs_attr_use_log_assist out of xfs_log.cDarrick J. Wong6-45/+67
The LARP patchset added an awkward coupling point between libxfs and what would be libxlog, if the XFS log were actually its own library. Move the code that enables logged xattr updates out of "lib"xlog and into xfs_xattr.c so that it no longer has to know about xlog_* functions. While we're at it, give xfs_xattr.c its own header file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: warn about LARP once per mountDarrick J. Wong2-3/+6
Since LARP is an experimental debug-only feature, we should try to warn about it being in use once per mount, not once per reboot. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: implement per-mount warnings for scrub and shrink usageDarrick J. Wong4-22/+23
Currently, we don't have a consistent story around logging when an EXPERIMENTAL feature gets turned on at runtime -- online fsck and shrink log a message once per day across all mounts, and the recently merged LARP mode only ever does it once per insmod cycle or reboot. Because EXPERIMENTAL tags are supposed to go away eventually, convert the existing daily warnings into state flags that travel with the mount, and warn once per mount. Making this an opstate flag means that we'll be able to capture the experimental usage in the ftrace output too. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: don't log every time we clear the log incompat flagsDarrick J. Wong1-1/+0
There's no need to spam the logs every time we clear the log incompat flags -- if someone is periodically using one of these features, they'll be cleared every time the log tries to clean itself, which can get pretty chatty: $ dmesg | grep -i clear [ 5363.894711] XFS (sdd): Clearing log incompat feature flags. [ 5365.157516] XFS (sdd): Clearing log incompat feature flags. [ 5369.388543] XFS (sdd): Clearing log incompat feature flags. [ 5371.281246] XFS (sdd): Clearing log incompat feature flags. These aren't high value messages either -- nothing's gone wrong, and nobody's trying anything tricky. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: convert buf_cancel_table allocation to kmalloc_arrayDarrick J. Wong3-6/+14
While we're messing around with how recovery allocates and frees the buffer cancellation table, convert the allocation to use kmalloc_array instead of the old kmem_alloc APIs, and make it handle a null return, even though that's not likely. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: don't leak xfs_buf_cancel structures when recovery failsDarrick J. Wong1-0/+13
If log recovery fails, we free the memory used by the buffer cancellation buckets, but we don't actually traverse each bucket list to free the individual xfs_buf_cancel objects. This leads to a memory leak, as reported by kmemleak in xfs/051: unreferenced object 0xffff888103629560 (size 32): comm "mount", pid 687045, jiffies 4296935916 (age 10.752s) hex dump (first 32 bytes): 08 d3 0a 01 00 00 00 00 08 00 00 00 01 00 00 00 ................ d0 f5 0b 92 81 88 ff ff 80 64 64 25 81 88 ff ff .........dd%.... backtrace: [<ffffffffa0317c83>] kmem_alloc+0x73/0x140 [xfs] [<ffffffffa03234a9>] xlog_recover_buf_commit_pass1+0x139/0x200 [xfs] [<ffffffffa032dc27>] xlog_recover_commit_trans+0x307/0x350 [xfs] [<ffffffffa032df15>] xlog_recovery_process_trans+0xa5/0xe0 [xfs] [<ffffffffa032e12d>] xlog_recover_process_data+0x8d/0x140 [xfs] [<ffffffffa032e49d>] xlog_do_recovery_pass+0x19d/0x740 [xfs] [<ffffffffa032f22d>] xlog_do_log_recovery+0x6d/0x150 [xfs] [<ffffffffa032f343>] xlog_do_recover+0x33/0x1d0 [xfs] [<ffffffffa032faba>] xlog_recover+0xda/0x190 [xfs] [<ffffffffa03194bc>] xfs_log_mount+0x14c/0x360 [xfs] [<ffffffffa030bfed>] xfs_mountfs+0x50d/0xa60 [xfs] [<ffffffffa03124b5>] xfs_fs_fill_super+0x6a5/0x950 [xfs] [<ffffffff812b92a5>] get_tree_bdev+0x175/0x280 [<ffffffff812b7c3a>] vfs_get_tree+0x1a/0x80 [<ffffffff812e366f>] path_mount+0x6ff/0xaa0 [<ffffffff812e3b13>] __x64_sys_mount+0x103/0x140 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: refactor buffer cancellation table allocationDarrick J. Wong4-32/+64
Move the code that allocates and frees the buffer cancellation tables used by log recovery into the file that actually uses the tables. This is a precursor to some cleanups and a memory leak fix. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: don't leak btree cursor when insrec fails after a splitDarrick J. Wong1-3/+5
The recent patch to improve btree cycle checking caused a regression when I rebased the in-memory btree branch atop the 5.19 for-next branch, because in-memory short-pointer btrees do not have AG numbers. This produced the following complaint from kmemleak: unreferenced object 0xffff88803d47dde8 (size 264): comm "xfs_io", pid 4889, jiffies 4294906764 (age 24.072s) hex dump (first 32 bytes): 90 4d 0b 0f 80 88 ff ff 00 a0 bd 05 80 88 ff ff .M.............. e0 44 3a a0 ff ff ff ff 00 df 08 06 80 88 ff ff .D:............. backtrace: [<ffffffffa0388059>] xfbtree_dup_cursor+0x49/0xc0 [xfs] [<ffffffffa029887b>] xfs_btree_dup_cursor+0x3b/0x200 [xfs] [<ffffffffa029af5d>] __xfs_btree_split+0x6ad/0x820 [xfs] [<ffffffffa029b130>] xfs_btree_split+0x60/0x110 [xfs] [<ffffffffa029f6da>] xfs_btree_make_block_unfull+0x19a/0x1f0 [xfs] [<ffffffffa029fada>] xfs_btree_insrec+0x3aa/0x810 [xfs] [<ffffffffa029fff3>] xfs_btree_insert+0xb3/0x240 [xfs] [<ffffffffa02cb729>] xfs_rmap_insert+0x99/0x200 [xfs] [<ffffffffa02cf142>] xfs_rmap_map_shared+0x192/0x5f0 [xfs] [<ffffffffa02cf60b>] xfs_rmap_map_raw+0x6b/0x90 [xfs] [<ffffffffa0384a85>] xrep_rmap_stash+0xd5/0x1d0 [xfs] [<ffffffffa0384dc0>] xrep_rmap_visit_bmbt+0xa0/0xf0 [xfs] [<ffffffffa0384fb6>] xrep_rmap_scan_iext+0x56/0xa0 [xfs] [<ffffffffa03850d8>] xrep_rmap_scan_ifork+0xd8/0x160 [xfs] [<ffffffffa0385195>] xrep_rmap_scan_inode+0x35/0x80 [xfs] [<ffffffffa03852ee>] xrep_rmap_find_rmaps+0x10e/0x270 [xfs] I noticed that xfs_btree_insrec has a bunch of debug code that return out of the function immediately, without freeing the "new" btree cursor that can be returned when _make_block_unfull calls xfs_btree_split. Fix the error return in this function to free the btree cursor. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: purge dquots after inode walk fails during quotacheckDarrick J. Wong1-1/+8
xfs/434 and xfs/436 have been reporting occasional memory leaks of xfs_dquot objects. These tests themselves were the messenger, not the culprit, since they unload the xfs module, which trips the slub debugging code while tearing down all the xfs slab caches: ============================================================================= BUG xfs_dquot (Tainted: G W ): Objects remaining in xfs_dquot on __kmem_cache_shutdown() ----------------------------------------------------------------------------- Slab 0xffffea000606de00 objects=30 used=5 fp=0xffff888181b78a78 flags=0x17ff80000010200(slab|head|node=0|zone=2|lastcpupid=0xfff) CPU: 0 PID: 3953166 Comm: modprobe Tainted: G W 5.18.0-rc6-djwx #rc6 d5824be9e46a2393677bda868f9b154d917ca6a7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014 Since we don't generally rmmod the xfs module between fstests, this means that xfs/434 is really just the canary in the coal mine -- something leaked a dquot, but we don't know who. After days of pounding on fstests with kmemleak enabled, I finally got it to spit this out: unreferenced object 0xffff8880465654c0 (size 536): comm "u10:4", pid 88, jiffies 4294935810 (age 29.512s) hex dump (first 32 bytes): 60 4a 56 46 80 88 ff ff 58 ea e4 5c 80 88 ff ff `JVF....X..\.... 00 e0 52 49 80 88 ff ff 01 00 01 00 00 00 00 00 ..RI............ backtrace: [<ffffffffa0740f6c>] xfs_dquot_alloc+0x2c/0x530 [xfs] [<ffffffffa07443df>] xfs_qm_dqread+0x6f/0x330 [xfs] [<ffffffffa07462a2>] xfs_qm_dqget+0x132/0x4e0 [xfs] [<ffffffffa0756bb0>] xfs_qm_quotacheck_dqadjust+0xa0/0x3e0 [xfs] [<ffffffffa075724d>] xfs_qm_dqusage_adjust+0x35d/0x4f0 [xfs] [<ffffffffa06c9068>] xfs_iwalk_ag_recs+0x348/0x5d0 [xfs] [<ffffffffa06c95d3>] xfs_iwalk_run_callbacks+0x273/0x540 [xfs] [<ffffffffa06c9e8d>] xfs_iwalk_ag+0x5ed/0x890 [xfs] [<ffffffffa06ca22f>] xfs_iwalk_ag_work+0xff/0x170 [xfs] [<ffffffffa06d22c9>] xfs_pwork_work+0x79/0x130 [xfs] [<ffffffff81170bb2>] process_one_work+0x672/0x1040 [<ffffffff81171b1b>] worker_thread+0x59b/0xec0 [<ffffffff8118711e>] kthread+0x29e/0x340 [<ffffffff810032bf>] ret_from_fork+0x1f/0x30 Now we know that quotacheck is at fault, but even this report was canaryish -- it was triggered by xfs/494, which doesn't actually mount any filesystems. (kmemleak can be a little slow to notice leaks, even with fstests repeatedly whacking it to look for them.) Looking at the *previous* fstest, however, showed that the test run before xfs/494 was xfs/117. The tipoff to the problem is in this excerpt from dmesg: XFS (sda4): Quotacheck needed: Please wait. XFS (sda4): Metadata corruption detected at xfs_dinode_verify.part.0+0xdb/0x7b0 [xfs], inode 0x119 dinode XFS (sda4): Unmount and run xfs_repair XFS (sda4): First 128 bytes of corrupted metadata buffer: 00000000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. 00000010: 00 00 00 01 00 00 00 00 00 90 57 54 54 1a 4c 68 ..........WTT.Lh 00000020: 81 f9 7d e1 6d ee 16 00 34 bd 7d e1 6d ee 16 00 ..}.m...4.}.m... 00000030: 34 bd 7d e1 6d ee 16 00 00 00 00 00 00 00 00 00 4.}.m........... 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 96 80 f3 ab ................ 00000060: ff ff ff ff da 57 7b 11 00 00 00 00 00 00 00 03 .....W{......... 00000070: 00 00 00 01 00 00 00 10 00 00 00 00 00 00 00 08 ................ XFS (sda4): Quotacheck: Unsuccessful (Error -117): Disabling quotas. The dinode verifier decided that the inode was corrupt, which causes iget to return with EFSCORRUPTED. Since this happened during quotacheck, it is obvious that the kernel aborted the inode walk on account of the corruption error and disabled quotas. Unfortunately, we neglect to purge the dquot cache before doing that, which is how the dquots leaked. The problems started 10 years ago in commit b84a3a, when the dquot lists were converted to a radix tree, but the error handling behavior was not correctly preserved -- in that commit, if the bulkstat failed and usrquota was enabled, the bulkstat failure code would be overwritten by the result of flushing all the dquots to disk. As long as that succeeds, we'd continue the quota mount as if everything were ok, but instead we're now operating with a corrupt inode and incorrect quota usage counts. I didn't notice this bug in 2019 when I wrote commit ebd126a, which changed quotacheck to skip the dqflush when the scan doesn't complete due to inode walk failures. Introduced-by: b84a3a96751f ("xfs: remove the per-filesystem list of dquots") Fixes: ebd126a651f8 ("xfs: convert quotacheck to use the new iwalk functions") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: assert in xfs_btree_del_cursor should take into account errorDave Chinner1-1/+7
xfs/538 on a 1kB block filesystem failed with this assert: XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448 The problem was that an allocation failed unexpectedly in xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error injections, resulting in an EFSCORRUPTED error being returned to xfs_bmapi_write(). The error occurred on extent-to-btree format conversion allocating the new root block: RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210 Call Trace: <TASK> xfs_btree_new_iroot+0xdf/0x520 xfs_btree_make_block_unfull+0x10d/0x1c0 xfs_btree_insrec+0x364/0x790 xfs_btree_insert+0xaa/0x210 xfs_bmap_add_extent_hole_real+0x1fe/0x9a0 xfs_bmapi_allocate+0x34c/0x420 xfs_bmapi_write+0x53c/0x9c0 xfs_alloc_file_space+0xee/0x320 xfs_file_fallocate+0x36b/0x450 vfs_fallocate+0x148/0x340 __x64_sys_fallocate+0x3c/0x70 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa Why the allocation failed at this point is unknown, but is likely that we ran the transaction out of reserved space and filesystem out of space with bmbt blocks because of all the minlen allocations being done causing worst case fragmentation of a large allocation. Regardless of the cause, we've then called xfs_bmapi_finish() which calls xfs_btree_del_cursor(cur, error) to tear down the cursor. So we have a failed operation, error != 0, cur->bc_ino.allocated > 0 and the filesystem is still up. The assert fails to take into account that allocation can fail with an error and the transaction teardown will shut the filesystem down if necessary. i.e. the assert needs to check "|| error != 0" as well, because at this point shutdown is pending because the current transaction is dirty.... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: don't assert fail on perag references on teardownDave Chinner1-2/+1
Not fatal, the assert is there to catch developer attention. I'm seeing this occasionally during recoveryloop testing after a shutdown, and I don't want this to stop an overnight recoveryloop run as it is currently doing. Convert the ASSERT to a XFS_IS_CORRUPT() check so it will dump a corruption report into the log and cause a test failure that way, but it won't stop the machine dead. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27xfs: avoid unnecessary runtime sibling pointer endian conversionsDave Chinner1-14/+33
Commit dc04db2aa7c9 has caused a small aim7 regression, showing a small increase in CPU usage in __xfs_btree_check_sblock() as a result of the extra checking. This is likely due to the endian conversion of the sibling poitners being unconditional instead of relying on the compiler to endian convert the NULL pointer at compile time and avoiding the runtime conversion for this common case. Rework the checks so that endian conversion of the sibling pointers is only done if they are not null as the original code did. .... and these need to be "inline" because the compiler completely fails to inline them automatically like it should be doing. $ size fs/xfs/libxfs/xfs_btree.o* text data bss dec hex filename 51874 240 0 52114 cb92 fs/xfs/libxfs/xfs_btree.o.orig 51562 240 0 51802 ca5a fs/xfs/libxfs/xfs_btree.o.inline Just when you think the tools have advanced sufficiently we don't have to care about stuff like this anymore, along comes a reminder that *our tools still suck*. Fixes: dc04db2aa7c9 ("xfs: detect self referencing btree sibling pointers") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-26Merge tag 'sysctl-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linuxLinus Torvalds1-39/+50
Pull sysctl updates from Luis Chamberlain: "For two kernel releases now kernel/sysctl.c has been being cleaned up slowly, since the tables were grossly long, sprinkled with tons of #ifdefs and all this caused merge conflicts with one susbystem or another. This tree was put together to help try to avoid conflicts with these cleanups going on different trees at time. So nothing exciting on this pull request, just cleanups. Thanks a lot to the Uniontech and Huawei folks for doing some of this nasty work" * tag 'sysctl-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (28 commits) sched: Fix build warning without CONFIG_SYSCTL reboot: Fix build warning without CONFIG_SYSCTL kernel/kexec_core: move kexec_core sysctls into its own file sysctl: minor cleanup in new_dir() ftrace: fix building with SYSCTL=y but DYNAMIC_FTRACE=n fs/proc: Introduce list_for_each_table_entry for proc sysctl mm: fix unused variable kernel warning when SYSCTL=n latencytop: move sysctl to its own file ftrace: fix building with SYSCTL=n but DYNAMIC_FTRACE=y ftrace: Fix build warning ftrace: move sysctl_ftrace_enabled to ftrace.c kernel/do_mount_initrd: move real_root_dev sysctls to its own file kernel/delayacct: move delayacct sysctls to its own file kernel/acct: move acct sysctls to its own file kernel/panic: move panic sysctls to its own file kernel/lockdep: move lockdep sysctls to its own file mm: move page-writeback sysctls to their own file mm: move oom_kill sysctls to their own file kernel/reboot: move reboot sysctls to its own file sched: Move energy_aware sysctls to topology.c ...
2022-05-26Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds14-162/+132
Pull MM updates from Andrew Morton: "Almost all of MM here. A few things are still getting finished off, reviewed, etc. - Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. - Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. - Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. - Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. - Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. - David Vernet has done some fixup work on the memcg selftests. - Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. - More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. - Nadav Amit has done some optimization of TLB flushing during mprotect(). - Neil Brown continues to labor away at improving our swap-over-NFS support. - David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). - Peng Liu fixed some errors in the core hugetlb code. - Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. - Some cleanups of the arch-specific pagemap code from Anshuman Khandual. - Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. - Roman Gushchin has done more work on the memcg selftests. ... and, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin" * tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits) mm: kfence: use PAGE_ALIGNED helper selftests: vm: add the "settings" file with timeout variable selftests: vm: add "test_hmm.sh" to TEST_FILES selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests selftests: vm: add migration to the .gitignore selftests/vm/pkeys: fix typo in comment ksm: fix typo in comment selftests: vm: add process_mrelease tests Revert "mm/vmscan: never demote for memcg reclaim" mm/kfence: print disabling or re-enabling message include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion" mm: fix a potential infinite loop in start_isolate_page_range() MAINTAINERS: add Muchun as co-maintainer for HugeTLB zram: fix Kconfig dependency warning mm/shmem: fix shmem folio swapoff hang cgroup: fix an error handling path in alloc_pagecache_max_30M() mm: damon: use HPAGE_PMD_SIZE tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate nodemask.h: fix compilation error with GCC12 ...
2022-05-26NFSD: nfsd_file_put() can sleepChuck Lever1-0/+2
Now that there are no more callers of nfsd_file_put() that might hold a spin lock, ensure the lockdep infrastructure can catch newly introduced calls to nfsd_file_put() made while a spinlock is held. Link: https://lore.kernel.org/linux-nfs/ece7fd1d-5fb3-5155-54ba-347cfc19bd9a@oracle.com/T/#mf1855552570cf9a9c80d1e49d91438cd9085aada Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-05-26NFSD: Add documenting comment for nfsd4_release_lockowner()Chuck Lever1-3/+20
And return explicit nfserr values that match what is documented in the new comment / API contract. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-26NFSD: Modernize nfsd4_release_lockowner()Chuck Lever1-25/+11
Refactor: Use existing helpers that other lock operations use. This change removes several automatic variables, so re-organize the variable declarations for readability. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-26NFSD: Fix possible sleep during nfsd4_release_lockowner()Chuck Lever1-8/+4
nfsd4_release_lockowner() holds clp->cl_lock when it calls check_for_locks(). However, check_for_locks() calls nfsd_file_get() / nfsd_file_put() to access the backing inode's flc_posix list, and nfsd_file_put() can sleep if the inode was recently removed. Let's instead rely on the stateowner's reference count to gate whether the release is permitted. This should be a reliable indication of locks-in-use since file lock operations and ->lm_get_owner take appropriate references, which are released appropriately when file locks are removed. Reported-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org
2022-05-25Merge tag 'xfs-5.19-for-linus' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds104-2675/+4689
Pull xfs updates from Dave Chinner: "This is a big update with lots of new code. The summary below them all, so I'll just touch on teh higlights. The two main new features are Large Extent Counts and Logged Attribute Replay - these are two new foundational features that we are building more complex future features on top of. For upcoming functionality, we need to be able to store hundreds of millions of xattrs per inode. The Large Extent Count feature removes the limits that prevent this scale of xattr storage, and while we were modifying the on disk extent count format we also increased the number of data extents we support per inode from 2^32 to 2^47. We also need to be able to modify xattrs as part of larger atomic transactions rather than as standalone transactions. The Logged Attribute Replay feature introduces the infrastructure that allows us to use intents to record the attribute modifications in the journal before we start them, hence allowing other atomic transactions to log attribute modification intents and then defer the actual modification to later. If we then crash, log recovery then guarantees that the attribute is replayed in the context of the atomic transaction that logged the intent. A significant chunk of the commits in this merge are for the base attribute replay functionality along with fixes, improvements and cleanups related to this new functioanlity. Allison deserves a big round of thanks for her ongoing work to get this functionality into XFS. There are also many other smaller changes and improvements, so overall this is one of the bigger XFS merge requests in some time. I will be following up next week with another smaller pull request - we already have another round of fixes and improvements to the logged attribute replay functionality just about ready to go. They'll soak and test over the next week, and I'll send a pull request for them near the end of the merge window. Summary: - support for printk message indexing. - large extent counts to provide support for up to 2^47 data extents and 2^32 attribute extents, allowing us to scale beyond 4 billion data extents to billions of xattrs per inode. - conversion of various flags fields to be consistently declared as unsigned bit fields. - improvements to realtime extent accounting and converts them to per-cpu counters to match all the other block and inode accounting. - reworks core log formatting code to reduce iterations, have a shorter, cleaner fast path and generally be easier to understand and maintain. - improvements to rmap btree searches that reduce overhead by up to 30% resulting in xfs_scrub runtime reductions of 15%. - improvements to reflink that remove the size limitations in remapping operations and greatly reduce the size of transaction reservations. - reworks the minimum log size calculations to allow us to change transaction reservations without changing the minimum supported log size. - removal of quota warning support as it has never been used on Linux. - intent whiteouts to allow us to cancel intents that are completed entirely in memory rather than having use CPU and disk bandwidth formatting and writing them into the journal when it is not necessary. This makes rmap, reflink and extent freeing slightly more efficient, but provides massive improvements for.... - Logged Attribute Replay feature support. This is a fundamental change to the way we modify attributes, laying the foundation for future integration of attribute modifications as part of other atomic transactional operations the filesystem performs. - Lots of cleanups and fixes for the logged attribute replay functionality" * tag 'xfs-5.19-for-linus' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (124 commits) xfs: can't use kmem_zalloc() for attribute buffers xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify xfs: ATTR_REPLACE algorithm with LARP enabled needs rework xfs: use XFS_DA_OP flags in deferred attr ops xfs: remove xfs_attri_remove_iter xfs: switch attr remove to xfs_attri_set_iter xfs: introduce attr remove initial states into xfs_attr_set_iter xfs: xfs_attr_set_iter() does not need to return EAGAIN xfs: clean up final attr removal in xfs_attr_set_iter xfs: remote xattr removal in xfs_attr_set_iter() is conditional xfs: XFS_DAS_LEAF_REPLACE state only needed if !LARP xfs: split remote attr setting out from replace path xfs: consolidate leaf/node states in xfs_attr_set_iter xfs: kill XFS_DAC_LEAF_ADDNAME_INIT xfs: separate out initial attr_set states xfs: don't set quota warning values xfs: remove warning counters from struct xfs_dquot_res xfs: remove quota warning limit from struct xfs_quota_limits xfs: rework deferred attribute operation setup xfs: make xattri_leaf_bp more useful ...
2022-05-25Merge tag 'fsnotify_for_v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fsLinus Torvalds12-192/+297
Pull fsnotify updates from Jan Kara: "The biggest part of this is support for fsnotify inode marks that don't pin inodes in memory but rather get evicted together with the inode (they are useful if userspace needs to exclude receipt of events from potentially large subtrees using fanotify ignore marks). There is also a fix for more consistent handling of events sent to parent and a fix of sparse(1) complaints" * tag 'fsnotify_for_v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fanotify: fix incorrect fmode_t casts fsnotify: consistent behavior for parent not watching children fsnotify: introduce mark type iterator fanotify: enable "evictable" inode marks fanotify: use fsnotify group lock helpers fanotify: implement "evictable" inode marks fanotify: factor out helper fanotify_mark_update_flags() fanotify: create helper fanotify_mark_user_flags() fsnotify: allow adding an inode mark without pinning inode dnotify: use fsnotify group lock helpers nfsd: use fsnotify group lock helpers audit: use fsnotify group lock helpers inotify: use fsnotify group lock helpers fsnotify: create helpers for group mark_mutex lock fsnotify: make allow_dups a property of the group fsnotify: pass flags argument to fsnotify_alloc_group() fsnotify: fix wrong lockdep annotations inotify: move control flags from mask to mark flags inotify: show inotify mask flags in proc fdinfo
2022-05-25Merge tag 'fs_for_v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fsLinus Torvalds2-2/+1
Pull writeback and ext2 cleanups from Jan Kara: "One small ext2 cleanup and one writeback spelling fix" * tag 'fs_for_v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: fix typo in comment fs: ext2: Fix duplicate included linux/dax.h
2022-05-25Merge tag 'size_t-saturating-helpers-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linuxLinus Torvalds1-2/+1
Pull misc hardening updates from Gustavo Silva: "Replace a few open-coded instances with size_t saturating arithmetic helpers" * tag 'size_t-saturating-helpers-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: virt: acrn: Prefer array_size and struct_size over open coded arithmetic afs: Prefer struct_size over open coded arithmetic
2022-05-25ocfs2: dlmfs: fix error handling of user_dlm_destroy_lockJunxiao Bi via Ocfs2-devel1-1/+15
When user_dlm_destroy_lock failed, it didn't clean up the flags it set before exit. For USER_LOCK_IN_TEARDOWN, if this function fails because of lock is still in used, next time when unlink invokes this function, it will return succeed, and then unlink will remove inode and dentry if lock is not in used(file closed), but the dlm lock is still linked in dlm lock resource, then when bast come in, it will trigger a panic due to user-after-free. See the following panic call trace. To fix this, USER_LOCK_IN_TEARDOWN should be reverted if fail. And also error should be returned if USER_LOCK_IN_TEARDOWN is set to let user know that unlink fail. For the case of ocfs2_dlm_unlock failure, besides USER_LOCK_IN_TEARDOWN, USER_LOCK_BUSY is also required to be cleared. Even though spin lock is released in between, but USER_LOCK_IN_TEARDOWN is still set, for USER_LOCK_BUSY, if before every place that waits on this flag, USER_LOCK_IN_TEARDOWN is checked to bail out, that will make sure no flow waits on the busy flag set by user_dlm_destroy_lock(), then we can simplely revert USER_LOCK_BUSY when ocfs2_dlm_unlock fails. Fix user_dlm_cluster_lock() which is the only function not following this. [ 941.336392] (python,26174,16):dlmfs_unlink:562 ERROR: unlink 004fb0000060000b5a90b8c847b72e1, error -16 from destroy [ 989.757536] ------------[ cut here ]------------ [ 989.757709] kernel BUG at fs/ocfs2/dlmfs/userdlm.c:173! [ 989.757876] invalid opcode: 0000 [#1] SMP [ 989.758027] Modules linked in: ksplice_2zhuk2jr_ib_ipoib_new(O) ksplice_2zhuk2jr(O) mptctl mptbase xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn cdc_ether usbnet mii ocfs2 jbd2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfsv3 nfs_acl nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc fcoe libfcoe libfc scsi_transport_fc sunrpc ipmi_devintf bridge stp llc rds_rdma rds bonding ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm falcon_lsm_serviceable(PE) falcon_nf_netcontain(PE) mlx4_vnic falcon_kal(E) falcon_lsm_pinned_13402(E) mlx4_ib ib_sa ib_mad ib_core ib_addr xenfs xen_privcmd dm_multipath iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 lpc_ich mfd_core ipmi_ssif i2c_core ipmi_si ipmi_msghandler [ 989.760686] ioatdma sg ext3 jbd mbcache sd_mod ahci libahci ixgbe dca ptp pps_core vxlan udp_tunnel ip6_udp_tunnel megaraid_sas mlx4_core crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ksplice_2zhuk2jr_ib_ipoib_old] [ 989.761987] CPU: 10 PID: 19102 Comm: dlm_thread Tainted: P OE 4.1.12-124.57.1.el6uek.x86_64 #2 [ 989.762290] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30350100 06/17/2021 [ 989.762599] task: ffff880178af6200 ti: ffff88017f7c8000 task.ti: ffff88017f7c8000 [ 989.762848] RIP: e030:[<ffffffffc07d4316>] [<ffffffffc07d4316>] __user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs] [ 989.763185] RSP: e02b:ffff88017f7cbcb8 EFLAGS: 00010246 [ 989.763353] RAX: 0000000000000000 RBX: ffff880174d48008 RCX: 0000000000000003 [ 989.763565] RDX: 0000000000120012 RSI: 0000000000000003 RDI: ffff880174d48170 [ 989.763778] RBP: ffff88017f7cbcc8 R08: ffff88021f4293b0 R09: 0000000000000000 [ 989.763991] R10: ffff880179c8c000 R11: 0000000000000003 R12: ffff880174d48008 [ 989.764204] R13: 0000000000000003 R14: ffff880179c8c000 R15: ffff88021db7a000 [ 989.764422] FS: 0000000000000000(0000) GS:ffff880247480000(0000) knlGS:ffff880247480000 [ 989.764685] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 989.764865] CR2: ffff8000007f6800 CR3: 0000000001ae0000 CR4: 0000000000042660 [ 989.765081] Stack: [ 989.765167] 0000000000000003 ffff880174d48040 ffff88017f7cbd18 ffffffffc07d455f [ 989.765442] ffff88017f7cbd88 ffffffff816fb639 ffff88017f7cbd38 ffff8800361b5600 [ 989.765717] ffff88021db7a000 ffff88021f429380 0000000000000003 ffffffffc0453020 [ 989.765991] Call Trace: [ 989.766093] [<ffffffffc07d455f>] user_bast+0x5f/0xf0 [ocfs2_dlmfs] [ 989.766287] [<ffffffff816fb639>] ? schedule_timeout+0x169/0x2d0 [ 989.766475] [<ffffffffc0453020>] ? o2dlm_lock_ast_wrapper+0x20/0x20 [ocfs2_stack_o2cb] [ 989.766738] [<ffffffffc045303a>] o2dlm_blocking_ast_wrapper+0x1a/0x20 [ocfs2_stack_o2cb] [ 989.767010] [<ffffffffc0864ec6>] dlm_do_local_bast+0x46/0xe0 [ocfs2_dlm] [ 989.767217] [<ffffffffc084f5cc>] ? dlm_lockres_calc_usage+0x4c/0x60 [ocfs2_dlm] [ 989.767466] [<ffffffffc08501f1>] dlm_thread+0xa31/0x1140 [ocfs2_dlm] [ 989.767662] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.767834] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.768006] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.768178] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.768349] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.768521] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.768693] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.768893] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.769067] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.769241] [<ffffffff810ce4d0>] ? wait_woken+0x90/0x90 [ 989.769411] [<ffffffffc084f7c0>] ? dlm_kick_thread+0x80/0x80 [ocfs2_dlm] [ 989.769617] [<ffffffff810a8bbb>] kthread+0xcb/0xf0 [ 989.769774] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.769945] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.770117] [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180 [ 989.770321] [<ffffffff816fdaa1>] ret_from_fork+0x61/0x90 [ 989.770492] [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180 [ 989.770689] Code: d0 00 00 00 f0 45 7d c0 bf 00 20 00 00 48 89 83 c0 00 00 00 48 89 83 c8 00 00 00 e8 55 c1 8c c0 83 4b 04 10 48 83 c4 08 5b 5d c3 <0f> 0b 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 83 [ 989.771892] RIP [<ffffffffc07d4316>] __user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs] [ 989.772174] RSP <ffff88017f7cbcb8> [ 989.772704] ---[ end trace ebd1e38cebcc93a8 ]--- [ 989.772907] Kernel panic - not syncing: Fatal exception [ 989.773173] Kernel Offset: disabled Link: https://lkml.kernel.org/r/20220518235224.87100-2-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-25ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lockJunxiao Bi1-1/+0
The following function is the only place that checks USER_LOCK_ATTACHED. This flag is set when lock request is granted through user_ast() and only the following function will clear it. Checking of this flag here is to make sure ocfs2_dlm_unlock is not issued if this lock is never granted. For example, lock file is created and then get removed, open file never happens. Clearing the flag here is not necessary because this is the only function that checks it, if another flow is executing user_dlm_destroy_lock(), it will bail out at the beginning because of USER_LOCK_IN_TEARDOWN and never check USER_LOCK_ATTACHED. Drop the clear, so we don't need take care of it for the following error handling patch. int user_dlm_destroy_lock(struct user_lock_res *lockres) { ... status = 0; if (!(lockres->l_flags & USER_LOCK_ATTACHED)) { spin_unlock(&lockres->l_lock); goto bail; } lockres->l_flags &= ~USER_LOCK_ATTACHED; lockres->l_flags |= USER_LOCK_BUSY; spin_unlock(&lockres->l_lock); status = ocfs2_dlm_unlock(conn, &lockres->l_lksb, DLM_LKF_VALBLK); if (status) { user_log_dlm_error("ocfs2_dlm_unlock", status, lockres); goto bail; } ... } V1 discussion with Joseph: https://lore.kernel.org/all/7b620c53-0c45-da2c-829e-26195cbe7d4e@linux.alibaba.com/T/ Link: https://lkml.kernel.org/r/20220518235224.87100-1-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-25Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds6-6/+48
Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ...
2022-05-25ceph: fix decoding of client session messages flagsLuís Henriques1-5/+9
The cephfs kernel client started to show the message: ceph: mds0 session blocklisted when mounting a filesystem. This is due to the fact that the session messages are being incorrectly decoded: the skip needs to take into account the 'len'. While there, fixed some whitespaces too. Cc: stable@vger.kernel.org Fixes: e1c9788cb397 ("ceph: don't rely on error_string to validate blocklisted session.") Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-25ceph: switch TASK_INTERRUPTIBLE to TASK_KILLABLEXiubo Li1-1/+1
If the task is placed in the TASK_INTERRUPTIBLE state it will sleep until either something explicitly wakes it up, or a non-masked signal is received. Switch to TASK_KILLABLE to avoid the noises. Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-05-25ceph: remove redundant variable inoColin Ian King1-2/+2
Variable ino is being assigned a value that is never read. The variable and assignment are redundant, remove it. Cleans up clang scan build warning: warning: Although the value stored to 'ino' is used in the enclosing expression, the value is never actually read from 'ino' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>