aboutsummaryrefslogtreecommitdiffstats
path: root/fs/nfs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-08-02mm/migrate: Convert migrate_page() to migrate_folio()Matthew Wilcox (Oracle)1-1/+1
Convert all callers to pass a folio. Most have the folio already available. Switch all users from aops->migratepage to aops->migrate_folio. Also turn the documentation into kerneldoc. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: David Sterba <dsterba@suse.com>
2022-08-02nfs: Convert to migrate_folioMatthew Wilcox (Oracle)3-13/+13
Use a folio throughout this function. migrate_page() will be converted later. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-07-27NFSv4.1: RECLAIM_COMPLETE must handle EACCESZhang Xianwei1-0/+3
A client should be able to handle getting an EACCES error while doing a mount operation to reclaim state due to NFS4CLNT_RECLAIM_REBOOT being set. If the server returns RPC_AUTH_BADCRED because authentication failed when we execute "exportfs -au", then RECLAIM_COMPLETE will go a wrong way. After mount succeeds, all OPEN call will fail due to an NFS4ERR_GRACE error being returned. This patch is to fix it by resending a RPC request. Signed-off-by: Zhang Xianwei <zhang.xianwei8@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Fixes: aa5190d0ed7d ("NFSv4: Kill nfs4_async_handle_error() abuses by NFSv4.1") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-25NFSv4.1 probe offline transports for trunking on session creationOlga Kornievskaia1-0/+8
Once the session is established call into the SUNRPC layer to check if any offlined trunking connections should be re-enabled. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-25NFSv4.1 remove xprt from xprt_switch if session trunking test failsOlga Kornievskaia1-0/+3
If we are doing a session trunking test and it fails for the transport, then remove this transport from the xprt_switch group. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-25NFSv4.1 offline trunkable transports on DESTROY_SESSIONOlga Kornievskaia1-0/+1
When session is destroy, some of the transports might no longer be valid trunks for the new session. Offline existing transports. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-23NFS: Replace the READ_PLUS decoding codeAnna Schumaker1-82/+88
We now take a 2-step process that allows us to place data and hole segments directly at their final position in the xdr_stream without needing to do a bunch of redundant copies to expand holes. Due to the variable lengths of each segment, the xdr metadata might cross page boundaries which I account for by setting a small scratch buffer so xdr_inline_decode() won't fail. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-23NFS: Replace fs_context-related dprintk() call sites with tracepointsChuck Lever2-10/+73
Contributed as part of the long patch series that converts NFS from using dprintk to tracepoints for observability. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-23nfs: only issue commit in DIO codepath if we have uncommitted dataJeff Layton2-19/+31
Currently, we try to determine whether to issue a commit based on nfs_write_need_commit which looks at the current verifier. In the case where we got a short write and then tried to follow it up with one that failed, the verifier can't be trusted. What we really want to know is whether the pgio request had any successful writes that came back as UNSTABLE. Add a new flag to the pgio request, and use that to indicate that we've had a successful unstable write. Only issue a commit if that flag is set. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-23nfs: always check dreq->error after a commitJeff Layton1-1/+2
When the client gets back a short DIO write, it will then attempt to issue another write to finish the DIO request. If that write then fails (as is often the case in an -ENOSPC situation), then we still may need to issue a COMMIT if the earlier short write was unstable. If that COMMIT then succeeds, then we don't want the client to reschedule the write requests, and to instead just return a short write. Otherwise, we can end up looping over the same DIO write forever. Always consult dreq->error after a successful RPC, even when the flag state is not NFS_ODIRECT_DONE. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2028370 Reported-by: Boyang Xue <bxue@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-23nfs: add new nfs_direct_req tracepoint eventsJeff Layton3-33/+114
Add some new tracepoints to the DIO write code. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-14fs/nfs: Use enum req_op where appropriateBart Van Assche1-7/+6
Improve static type checking by using enum req_op for request operations. Rename an 'rw' argument into 'op' since that name is typically used for request operations. This patch does not change any functionality. Note: REQ_OP_READ = READ = 0 and REQ_OP_WRITE = WRITE = 1. Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-58-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-13NFSv4: Fix races in the legacy idmapper upcallTrond Myklebust1-22/+24
nfs_idmap_instantiate() will cause the process that is waiting in request_key_with_auxdata() to wake up and exit. If there is a second process waiting for the idmap->idmap_mutex, then it may wake up and start a new call to request_key_with_auxdata(). If the call to idmap_pipe_downcall() from the first process has not yet finished calling nfs_idmap_complete_pipe_upcall_locked(), then we may end up triggering the WARN_ON_ONCE() in nfs_idmap_prepare_pipe_upcall(). The fix is to ensure that we clear idmap->idmap_upcall_data before calling nfs_idmap_instantiate(). Fixes: e9ab41b620e4 ("NFSv4: Clean up the legacy idmapper upcall") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-12NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZEAnna Schumaker4-10/+31
Previously, we required this to value to be a power of 2 for UDP related reasons. This patch keeps the power of 2 rule for UDP but allows more flexibility for TCP and RDMA. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-12nfs: fix port value parsingIan Kent1-1/+1
The valid values of nfs options port and mountport are 0 to USHRT_MAX. The fs parser will return a fail for port values that are negative and the sloppy option handling then returns success. But the sloppy option handling is meant to return success for invalid options not valid options with invalid values. Restricting the sloppy option override to handle failure returns for invalid options only is sufficient to resolve this problem. Changes: v2: utilize the return value from fs_parse() to resolve this problem instead of changing the parameter definitions. Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-12nfs: Replace kmap() with kmap_local_page()Fabio M. De Francesco1-2/+2
The use of kmap() is being deprecated in favor of kmap_local_page(). With kmap_local_page(), the mapping is per thread, CPU local and not globally visible. Furthermore, the mapping can be acquired from any context (including interrupts). Therefore, use kmap_local_page() in nfs_do_filldir() because this mapping is per thread, CPU local, and not globally visible. Suggested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-12NFS: remove redundant code in nfs_file_write()ChenXiaoSong1-2/+0
filemap_fdatawait_range() will always return 0, after patch 6c984083ec24 ("NFS: Use of mapping_set_error() results in spurious errors"), it will not save the wb err in struct address_space->flags: result = filemap_fdatawait_range(file->f_mapping, ...) = 0 filemap_check_errors(mapping) = 0 test_bit(..., &mapping->flags) // flags is 0 Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-12nfs/blocklayout: refactor block device openingChristoph Hellwig1-31/+11
Deduplicate the helpers to open a device node by passing a name prefix argument and using the same helper for both kinds of paths. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-12NFSv4.1: Handle NFS4ERR_DELAY replies to OP_SEQUENCE correctlyTrond Myklebust1-1/+0
Don't assume that the NFS4ERR_DELAY means that the server is processing this slot id. Fixes: 3453d5708b33 ("NFSv4.1: Avoid false retries when RPC calls are interrupted") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-12NFSv4.1: Don't decrease the value of seq_nr_highest_sentTrond Myklebust1-3/+2
When we're trying to figure out what the server may or may not have seen in terms of request numbers, do not assume that requests with a larger number were missed, just because we saw a reply to a request with a smaller number. Fixes: 3453d5708b33 ("NFSv4.1: Avoid false retries when RPC calls are interrupted") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-10NFS: Fix case insensitive renamesTrond Myklebust1-0/+4
For filesystems that are case insensitive and case preserving, we need to be able to rename from one case folded variant of the filename to another. Currently, if we have looked up the target filename before the call to rename, then we may have a hashed dentry with that target name in the dcache, causing the vfs to optimise away the rename. To avoid that, let's drop the target dentry, and leave it to the server to optimise away the rename if that is the correct thing to do. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-10pNFS/files: Handle RDMA connection errors correctlyTrond Myklebust1-0/+2
The RPC/RDMA driver will return -EPROTO and -ENODEV as connection errors under certain circumstances. Make sure that we handle them correctly and avoid cycling forever in a LAYOUTGET/LAYOUTRETURN loop. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-10pNFS/flexfiles: Report RDMA connection errors to the serverTrond Myklebust1-0/+4
The RPC/RDMA driver will return -EPROTO and -ENODEV as connection errors under certain circumstances. Make sure that we handle them and report them to the server. If not, we can end up cycling forever in a LAYOUTGET/LAYOUTRETURN loop. Fixes: a12f996d3413 ("NFSv4/pNFS: Use connections to a DS that are all of the same protocol family") Cc: stable@vger.kernel.org # 5.11.x Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-10Revert "pNFS: nfs3_set_ds_client should set NFS_CS_NOPING"Trond Myklebust1-1/+0
This reverts commit c6eb58435b98bd843d3179664a0195ff25adb2c3. If a transport is down, then we want to fail over to other transports if they are listed in the GETDEVICEINFO reply. Fixes: c6eb58435b98 ("pNFS: nfs3_set_ds_client should set NFS_CS_NOPING") Cc: stable@vger.kernel.org # 5.11.x Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-07-03mm: shrinkers: provide shrinkers with namesRoman Gushchin2-4/+5
Currently shrinkers are anonymous objects. For debugging purposes they can be identified by count/scan function names, but it's not always useful: e.g. for superblock's shrinkers it's nice to have at least an idea of to which superblock the shrinker belongs. This commit adds names to shrinkers. register_shrinker() and prealloc_shrinker() functions are extended to take a format and arguments to master a name. In some cases it's not possible to determine a good name at the time when a shrinker is allocated. For such cases shrinker_debugfs_rename() is provided. The expected format is: <subsystem>-<shrinker_type>[:<instance>]-<id> For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair. After this change the shrinker debugfs directory looks like: $ cd /sys/kernel/debug/shrinker/ $ ls dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42 mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43 mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44 rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49 sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13 sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36 sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19 sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10 sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9 sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37 sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38 sb-dax-11 sb-proc-45 sb-tmpfs-35 sb-debugfs-7 sb-proc-46 sb-tmpfs-40 [roman.gushchin@linux.dev: fix build warnings] Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Dave Chinner <dchinner@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-30NFSv4: Add an fattr allocation to _nfs4_discover_trunking()Scott Mayhew1-6/+13
This was missed in c3ed222745d9 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.") and causes a panic when mounting with '-o trunkdiscovery': PID: 1604 TASK: ffff93dac3520000 CPU: 3 COMMAND: "mount.nfs" #0 [ffffb79140f738f8] machine_kexec at ffffffffaec64bee #1 [ffffb79140f73950] __crash_kexec at ffffffffaeda67fd #2 [ffffb79140f73a18] crash_kexec at ffffffffaeda76ed #3 [ffffb79140f73a30] oops_end at ffffffffaec2658d #4 [ffffb79140f73a50] general_protection at ffffffffaf60111e [exception RIP: nfs_fattr_init+0x5] RIP: ffffffffc0c18265 RSP: ffffb79140f73b08 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff93dac304a800 RCX: 0000000000000000 RDX: ffffb79140f73bb0 RSI: ffff93dadc8cbb40 RDI: d03ee11cfaf6bd50 RBP: ffffb79140f73be8 R8: ffffffffc0691560 R9: 0000000000000006 R10: ffff93db3ffd3df8 R11: 0000000000000000 R12: ffff93dac4040000 R13: ffff93dac2848e00 R14: ffffb79140f73b60 R15: ffffb79140f73b30 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffb79140f73b08] _nfs41_proc_get_locations at ffffffffc0c73d53 [nfsv4] #6 [ffffb79140f73bf0] nfs4_proc_get_locations at ffffffffc0c83e90 [nfsv4] #7 [ffffb79140f73c60] nfs4_discover_trunking at ffffffffc0c83fb7 [nfsv4] #8 [ffffb79140f73cd8] nfs_probe_fsinfo at ffffffffc0c0f95f [nfs] #9 [ffffb79140f73da0] nfs_probe_server at ffffffffc0c1026a [nfs] RIP: 00007f6254fce26e RSP: 00007ffc69496ac8 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f6254fce26e RDX: 00005600220a82a0 RSI: 00005600220a64d0 RDI: 00005600220a6520 RBP: 00007ffc69496c50 R8: 00005600220a8710 R9: 003035322e323231 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc69496c50 R13: 00005600220a8440 R14: 0000000000000010 R15: 0000560020650ef9 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b Fixes: c3ed222745d9 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-30NFS: restore module put when manager exits.NeilBrown1-0/+1
Commit f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module") removed calls to module_put_and_kthread_exit() from threads that acted as SUNRPC servers and had a related svc_serv_ops structure. This was correct. It ALSO removed the module_put_and_kthread_exit() call from nfs4_run_state_manager() which is NOT a SUNRPC service. Consequently every time the NFSv4 state manager runs the module count increments and won't be decremented. So the nfsv4 module cannot be unloaded. So restore the module_put_and_kthread_exit() call. Fixes: f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-29nfs: Leave pages in the pagecache if readpage failedMatthew Wilcox (Oracle)1-4/+0
The pagecache handles readpage failing by itself; it doesn't want filesystems to remove pages from under it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-15NFSv4: Add FMODE_CAN_ODIRECT after successful open of a NFS4.x fileDave Wysochanski2-0/+2
Commit a2ad63daa88b ("VFS: add FMODE_CAN_ODIRECT file flag") added the FMODE_CAN_ODIRECT flag for NFSv3 but neglected to add it for NFSv4.x. This causes direct io on NFSv4.x to fail open with EINVAL: mount -o vers=4.2 127.0.0.1:/export /mnt/nfs4 dd if=/dev/zero of=/mnt/nfs4/file.bin bs=128k count=1 oflag=direct dd: failed to open '/mnt/nfs4/file.bin': Invalid argument dd of=/dev/null if=/mnt/nfs4/file.bin bs=128k count=1 iflag=direct dd: failed to open '/mnt/dir1/file1.bin': Invalid argument Fixes: a2ad63daa88b ("VFS: add FMODE_CAN_ODIRECT file flag") Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-06pNFS: Avoid a live lock condition in pnfs_update_layout()Trond Myklebust3-6/+11
If we're about to send the first layoutget for an empty layout, we want to make sure that we drain out the existing pending layoutget calls first. The reason is that these layouts may have been already implicitly returned to the server by a recall to which the client gave a NFS4ERR_NOMATCHING_LAYOUT response. The problem is that wait_var_event_killable() could in principle see the plh_outstanding count go back to '1' when the first process to wake up starts sending a new layoutget. If it fails to get a layout, then this loop can continue ad infinitum... Fixes: 0b77f97a7e42 ("NFSv4/pnfs: Fix layoutget behaviour after invalidation") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-06pNFS: Don't keep retrying if the server replied NFS4ERR_LAYOUTUNAVAILABLETrond Myklebust1-0/+6
If the server tells us that a pNFS layout is not available for a specific file, then we should not keep pounding it with further layoutget requests. Fixes: 183d9e7b112a ("pnfs: rework LAYOUTGET retry handling") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-04Merge tag 'pull-18-rc1-work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-2/+2
Pull mount handling updates from Al Viro: "Cleanups (and one fix) around struct mount handling. The fix is usermode_driver.c one - once you've done kern_mount(), you must kern_unmount(); simple mntput() will end up with a leak. Several failure exits in there messed up that way... In practice you won't hit those particular failure exits without fault injection, though" * tag 'pull-18-rc1-work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: move mount-related externs from fs.h to mount.h blob_to_mnt(): kern_unmount() is needed to undo kern_mount() m->mnt_root->d_inode->i_sb is a weird way to spell m->mnt_sb... linux/mount.h: trim includes uninline may_mount() and don't opencode it in fspick(2)/fsopen(2)
2022-05-31Merge tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds12-149/+302
Pull NFS client updates from Anna Schumaker: "New Features: - Add support for 'dacl' and 'sacl' attributes Bugfixes and Cleanups: - Fixes for reporting mapping errors - Fixes for memory allocation errors - Improve warning message when locks are lost - Update documentation for the nfs4_unique_id parameter - Add an explanation of NFSv4 client identifiers - Ensure the i_size attribute is written to the fscache storage - Fix freeing uninitialized nfs4_labels - Better handling when xprtrdma bc_serv is NULL - Mark qualified async operations as MOVEABLE tasks" * tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4.1 mark qualified async operations as MOVEABLE tasks xprtrdma: treat all calls not a bcall when bc_serv is NULL NFSv4: Fix free of uninitialized nfs4_label on referral lookup. NFS: Pass i_size to fscache_unuse_cookie() when a file is released Documentation: Add an explanation of NFSv4 client identifiers NFS: update documentation for the nfs4_unique_id parameter NFS: Improve warning message when locks are lost. NFSv4.1: Enable access to the NFSv4.1 'dacl' and 'sacl' attributes NFSv4: Add encoders/decoders for the NFSv4.1 dacl and sacl attributes NFSv4: Specify the type of ACL to cache NFSv4: Don't hold the layoutget locks across multiple RPC calls pNFS/files: Fall back to I/O through the MDS on non-fatal layout errors NFS: Further fixes to the writeback error handling NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout NFS: Memory allocation failures are not server fatal errors NFS: Don't report errors from nfs_pageio_complete() more than once NFS: Do not report flush errors in nfs_write_end() NFS: Don't report ENOSPC write errors twice NFS: fsync() should report filesystem errors over EINTR/ERESTARTSYS NFS: Do not report EINTR/ERESTARTSYS as mapping errors
2022-05-31NFSv4.1 mark qualified async operations as MOVEABLE tasksOlga Kornievskaia4-12/+29
Mark async operations such as RENAME, REMOVE, COMMIT MOVEABLE for the nfsv4.1+ sessions. Fixes: 85e39feead948 ("NFSv4.1 identify and mark RPC tasks that can move between transports") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-31NFSv4: Fix free of uninitialized nfs4_label on referral lookup.Benjamin Coddington4-13/+24
Send along the already-allocated fattr along with nfs4_fs_locations, and drop the memcpy of fattr. We end up growing two more allocations, but this fixes up a crash as: PID: 790 TASK: ffff88811b43c000 CPU: 0 COMMAND: "ls" #0 [ffffc90000857920] panic at ffffffff81b9bfde #1 [ffffc900008579c0] do_trap at ffffffff81023a9b #2 [ffffc90000857a10] do_error_trap at ffffffff81023b78 #3 [ffffc90000857a58] exc_stack_segment at ffffffff81be1f45 #4 [ffffc90000857a80] asm_exc_stack_segment at ffffffff81c009de #5 [ffffc90000857b08] nfs_lookup at ffffffffa0302322 [nfs] #6 [ffffc90000857b70] __lookup_slow at ffffffff813a4a5f #7 [ffffc90000857c60] walk_component at ffffffff813a86c4 #8 [ffffc90000857cb8] path_lookupat at ffffffff813a9553 #9 [ffffc90000857cf0] filename_lookup at ffffffff813ab86b Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Fixes: 9558a007dbc3 ("NFS: Remove the label from the nfs4_lookup_res struct") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-26Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds2-16/+25
Pull MM updates from Andrew Morton: "Almost all of MM here. A few things are still getting finished off, reviewed, etc. - Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. - Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. - Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. - Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. - Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. - David Vernet has done some fixup work on the memcg selftests. - Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. - More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. - Nadav Amit has done some optimization of TLB flushing during mprotect(). - Neil Brown continues to labor away at improving our swap-over-NFS support. - David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). - Peng Liu fixed some errors in the core hugetlb code. - Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. - Some cleanups of the arch-specific pagemap code from Anshuman Khandual. - Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. - Roman Gushchin has done more work on the memcg selftests. ... and, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin" * tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits) mm: kfence: use PAGE_ALIGNED helper selftests: vm: add the "settings" file with timeout variable selftests: vm: add "test_hmm.sh" to TEST_FILES selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests selftests: vm: add migration to the .gitignore selftests/vm/pkeys: fix typo in comment ksm: fix typo in comment selftests: vm: add process_mrelease tests Revert "mm/vmscan: never demote for memcg reclaim" mm/kfence: print disabling or re-enabling message include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion" mm: fix a potential infinite loop in start_isolate_page_range() MAINTAINERS: add Muchun as co-maintainer for HugeTLB zram: fix Kconfig dependency warning mm/shmem: fix shmem folio swapoff hang cgroup: fix an error handling path in alloc_pagecache_max_30M() mm: damon: use HPAGE_PMD_SIZE tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate nodemask.h: fix compilation error with GCC12 ...
2022-05-24Merge tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds5-45/+48
Pull page cache updates from Matthew Wilcox: - Appoint myself page cache maintainer - Fix how scsicam uses the page cache - Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS - Remove the AOP flags entirely - Remove pagecache_write_begin() and pagecache_write_end() - Documentation updates - Convert several address_space operations to use folios: - is_dirty_writeback - readpage becomes read_folio - releasepage becomes release_folio - freepage becomes free_folio - Change filler_t to require a struct file pointer be the first argument like ->read_folio * tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache: (107 commits) nilfs2: Fix some kernel-doc comments Appoint myself page cache maintainer fs: Remove aops->freepage secretmem: Convert to free_folio nfs: Convert to free_folio orangefs: Convert to free_folio fs: Add free_folio address space operation fs: Convert drop_buffers() to use a folio fs: Change try_to_free_buffers() to take a folio jbd2: Convert release_buffer_page() to use a folio jbd2: Convert jbd2_journal_try_to_free_buffers to take a folio reiserfs: Convert release_buffer_page() to use a folio fs: Remove last vestiges of releasepage ubifs: Convert to release_folio reiserfs: Convert to release_folio orangefs: Convert to release_folio ocfs2: Convert to release_folio nilfs2: Remove comment about releasepage nfs: Convert to release_folio jfs: Convert to release_folio ...
2022-05-19m->mnt_root->d_inode->i_sb is a weird way to spell m->mnt_sb...Al Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-17NFS: Pass i_size to fscache_unuse_cookie() when a file is releasedDave Wysochanski1-4/+3
Pass updated i_size in fscache_unuse_cookie() when called from nfs_fscache_release_file(), which ensures the size of an fscache object gets written to the cache storage. Failing to do so results in unnessary reads from the NFS server, even when the data is cached, due to a cachefiles object coherency check failing with a trace similar to the following: cachefiles_coherency: o=0000000e BAD osiz B=afbb3 c=0 This problem can be reproduced as follows: #!/bin/bash v=4.2; NFS_SERVER=127.0.0.1 set -e; trap cleanup EXIT; rc=1 function cleanup { umount /mnt/nfs > /dev/null 2>&1 RC_STR="TEST PASS" [ $rc -eq 1 ] && RC_STR="TEST FAIL" echo "$RC_STR on $(uname -r) with NFSv$v and server $NFS_SERVER" } mount -o vers=$v,fsc $NFS_SERVER:/export /mnt/nfs rm -f /mnt/nfs/file1.bin > /dev/null 2>&1 dd if=/dev/zero of=/mnt/nfs/file1.bin bs=4096 count=1 > /dev/null 2>&1 echo 3 > /proc/sys/vm/drop_caches echo Read file 1st time from NFS server into fscache dd if=/mnt/nfs/file1.bin of=/dev/null > /dev/null 2>&1 umount /mnt/nfs && mount -o vers=$v,fsc $NFS_SERVER:/export /mnt/nfs echo 3 > /proc/sys/vm/drop_caches echo Read file 2nd time from fscache dd if=/mnt/nfs/file1.bin of=/dev/null > /dev/null 2>&1 echo Check mountstats for NFS read grep -q "READ: 0" /proc/self/mountstats # (1st number) == 0 [ $? -eq 0 ] && rc=0 Fixes: a6b5a28eb56c "nfs: Convert to new fscache volume/cookie API" Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Tested-by: Daire Byrne <daire@dneg.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFS: Improve warning message when locks are lost.NeilBrown1-5/+15
NFSv4 can lose locks if, for example there is a network partition for longer than the lease period. When this happens a warning message NFS: __nfs4_reclaim_open_state: Lock reclaim failed! is generated, possibly once for each lock (though rate limited). This is potentially misleading as is can be read as suggesting that lock reclaim was attempted. However the default behaviour is to not attempt to recover locks (except due to server report). This patch changes the reporting to produce at most one message for each attempt to recover all state from a given server. The message reports the server name and the number of locks lost if that number is non-zero. It reports that locks were lost and give no suggestion as to whether there was an attempt or not. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFSv4.1: Enable access to the NFSv4.1 'dacl' and 'sacl' attributesTrond Myklebust1-0/+69
Enable access to the NFSv4 acl via the NFSv4.1 'dacl' and 'sacl' attributes. This allows the server to authenticate the DACL and the SACL operations separately, since reading and/or editing the SACL is usually considered to be a privileged operation. It also allows the propagation of automatic inheritance information that was not supported by the NFSv4.0 'acl' attribute. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFSv4: Add encoders/decoders for the NFSv4.1 dacl and sacl attributesTrond Myklebust2-36/+68
Add the ability to set or retrieve the acl using the NFSv4.1 'dacl' and 'sacl' attributes to the NFSv4 xdr encoders/decoders. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFSv4: Specify the type of ACL to cacheTrond Myklebust1-19/+40
When caching a NFSv4 ACL, we want to specify whether we are caching an NFSv4.0 type acl, the NFSv4.1 dacl or the NFSv4.1 sacl. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFSv4: Don't hold the layoutget locks across multiple RPC callsTrond Myklebust1-0/+4
When doing layoutget as part of the open() compound, we have to be careful to release the layout locks before we can call any further RPC calls, such as setattr(). The reason is that those calls could trigger a recall, which could deadlock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17pNFS/files: Fall back to I/O through the MDS on non-fatal layout errorsTrond Myklebust1-1/+6
Only report the error when the server is returning a fatal error, such as ESTALE, EIO, etc... Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFS: Further fixes to the writeback error handlingTrond Myklebust1-21/+18
When we handle an error by redirtying the page, we're not corrupting the mapping, so we don't want the error to be recorded in the mapping. If the caller has specified a sync_mode of WB_SYNC_NONE, we can just return AOP_WRITEPAGE_ACTIVATE. However if we're dealing with WB_SYNC_ALL, we need to ensure that retries happen when the errors are non-fatal. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 8fc75bed96bb ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layoutTrond Myklebust1-0/+2
Commit 587f03deb69b caused pnfs_update_layout() to stop returning ENOMEM when the memory allocation fails, and hence causes it to fall back to trying to do I/O through the MDS. There is no guarantee that this will fare any better. If we're failing the pNFS layout allocation, then we should just redirty the page and retry later. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 587f03deb69b ("pnfs: refactor send_layoutget") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFS: Memory allocation failures are not server fatal errorsTrond Myklebust1-0/+1
We need to filter out ENOMEM in nfs_error_is_fatal_on_server(), because running out of memory on our client is not a server error. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 2dc23afffbca ("NFS: ENOMEM should also be a fatal error.") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFS: Don't report errors from nfs_pageio_complete() more than onceTrond Myklebust1-8/+1
Since errors from nfs_pageio_complete() are already being reported through nfs_async_write_error(), we should not be returning them to the callers of do_writepages() as well. They will end up being reported through the generic mechanism instead. Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-17NFS: Do not report flush errors in nfs_write_end()Trond Myklebust1-5/+2
If we do flush cached writebacks in nfs_write_end() due to the imminent expiration of an RPCSEC_GSS session, then we should defer reporting any resulting errors until the calls to file_check_and_advance_wb_err() in nfs_file_write() and nfs_file_fsync(). Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>