aboutsummaryrefslogtreecommitdiff
path: root/fs/nfs
AgeCommit message (Collapse)Author
2023-03-10NFS: fix disabling of swapNeilBrown
[ Upstream commit 5bab56fff53ce161ed859d9559a10361d4f79578 ] When swap is activated to a file on an NFSv4 mount we arrange that the state manager thread is always present as starting a new thread requires memory allocations that might block waiting for swap. Unfortunately the code for allowing the state manager thread to exit when swap is disabled was not tested properly and does not work. This can be seen by examining /proc/fs/nfsfs/servers after disabling swap and unmounting the filesystem. The servers file will still list one entry. Also a "ps" listing will show the state manager thread is still present. There are two problems. 1/ rpc_clnt_swap_deactivate() doesn't walk up the ->cl_parent list to find the primary client on which the state manager runs. 2/ The thread is not woken up properly and it immediately goes back to sleep without checking whether it is really needed. Using nfs4_schedule_state_manager() ensures a proper wake-up. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 4dc73c679114 ("NFSv4: keep state manager thread active if swap is enabled") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10nfs4trace: fix state manager flag printingBenjamin Coddington
[ Upstream commit b46d80bd2d6e7e063c625a20de54248afe8d4889 ] __print_flags wants a mask, not the enum value. Add two more flags. Fixes: 511ba52e4c01 ("NFS4: Trace state recovery operation") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09use less confusing names for iov_iter direction initializersAl Viro
[ Upstream commit de4eda9de2d957ef2d6a8365a01e26a435e958cb ] READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way. Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Stable-dep-of: 6dd88fd59da8 ("vhost-scsi: unbreak any layout for response") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-24pNFS/filelayout: Fix coalescing test for single DSOlga Kornievskaia
[ Upstream commit a6b9d2fa0024e7e399c26facd0fb466b7396e2b9 ] When there is a single DS no striping constraints need to be placed on the IO. When such constraint is applied then buffered reads don't coalesce to the DS's rsize. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31nfs: fix possible null-ptr-deref when parsing paramHawkins Jiawei
[ Upstream commit 5559405df652008e56eee88872126fe4c451da67 ] According to commit "vfs: parse: deal with zero length string value", kernel will set the param->string to null pointer in vfs_parse_fs_string() if fs string has zero length. Yet the problem is that, nfs_fs_context_parse_param() will dereferences the param->string, without checking whether it is a null pointer, which may trigger a null-ptr-deref bug. This patch solves it by adding sanity check on param->string in nfs_fs_context_parse_param(). Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4.x: Fail client initialisation if state manager thread can't runTrond Myklebust
[ Upstream commit b4e4f66901658fae0614dea5bf91062a5387eda7 ] If the state manager thread fails to start, then we should just mark the client initialisation as failed so that other processes or threads don't get stuck in nfs_wait_client_init_complete(). Reported-by: ChenXiaoSong <chenxiaosong2@huawei.com> Fixes: 4697bd5e9419 ("NFSv4: Fix a race in the net namespace mount notification") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFS: Allow very small rsize & wsize againAnna Schumaker
[ Upstream commit a60214c2465493aac0b014d87ee19327b6204c42 ] 940261a19508 introduced nfs_io_size() to clamp the iosize to a multiple of PAGE_SIZE. This had the unintended side effect of no longer allowing iosizes less than a page, which could be useful in some situations. UDP already has an exception that causes it to fall back on the power-of-two style sizes instead. This patch adds an additional exception for very small iosizes. Reported-by: Jeff Layton <jlayton@kernel.org> Fixes: 940261a19508 ("NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZE") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4.2: Set the correct size scratch buffer for decoding READ_PLUSAnna Schumaker
[ Upstream commit 36357fe74ef736524a29fbd3952948768510a8b9 ] The scratch_buf array is 16 bytes, but I was passing 32 to the xdr_set_scratch_buffer() function. Fix this by using sizeof(), which is what I probably should have been doing this whole time. Fixes: d3b00a802c84 ("NFS: Replace the READ_PLUS decoding code") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFS: Fix an Oops in nfs_d_automount()Trond Myklebust
[ Upstream commit 35e3b6ae84935d0d7ff76cbdaa83411b0ad5e471 ] When mounting from a NFSv4 referral, path->dentry can end up being a negative dentry, so derive the struct nfs_server from the dentry itself instead. Fixes: 2b0143b5c986 ("VFS: normal filesystems (and lustre): d_inode() annotations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4: Fix a deadlock between nfs4_open_recover_helper() and delegreturnTrond Myklebust
[ Upstream commit 51069e4aef6257b0454057359faed0ab0c9af083 ] If we're asked to recover open state while a delegation return is outstanding, then the state manager thread cannot use a cached open, so if the server returns a delegation, we can end up deadlocked behind the pending delegreturn. To avoid this problem, let's just ask the server not to give us a delegation unless we're explicitly reclaiming one. Fixes: be36e185bd26 ("NFSv4: nfs4_open_recover_helper() must set share access") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4: Fix a credential leak in _nfs4_discover_trunking()Trond Myklebust
[ Upstream commit e83458fce080dc23c25353a1af90bfecf79c7369 ] Fixes: 4f40a5b55446 ("NFSv4: Add an fattr allocation to _nfs4_discover_trunking()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4.2: Fix initialisation of struct nfs4_labelTrond Myklebust
[ Upstream commit c528f70f504434eaff993a5ddd52203a2010d51f ] The call to nfs4_label_init_security() should return a fully initialised label. Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4.2: Fix a memory stomp in decode_attr_security_labelTrond Myklebust
[ Upstream commit 43c1031f7110967c240cb6e922adcfc4b8899183 ] We must not change the value of label->len if it is zero, since that indicates we stored a label. Fixes: b4487b935452 ("nfs: Fix getxattr kernel panic and memory overflow") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4.2: Always decode the security labelTrond Myklebust
[ Upstream commit c8a62f440229ae7a10874776344dfcc17d860336 ] If the server returns a reply that includes a security label, then we must decode it whether or not we can store the results. Fixes: 1e2f67da8931 ("NFS: Remove the nfs4_label argument from decode_getattr_*() functions") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31NFSv4.2: Clear FATTR4_WORD2_SECURITY_LABEL when done decodingTrond Myklebust
[ Upstream commit eef7314caf2d73a94b68ba293cd105154d3a664e ] We need to clear the FATTR4_WORD2_SECURITY_LABEL bitmap flag irrespective of whether or not the label is too long. Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-27nfs4: Fix kmemleak when allocate slot failedZhang Xiaoxu
If one of the slot allocate failed, should cleanup all the other allocated slots, otherwise, the allocated slots will leak: unreferenced object 0xffff8881115aa100 (size 64): comm ""mount.nfs"", pid 679, jiffies 4294744957 (age 115.037s) hex dump (first 32 bytes): 00 cc 19 73 81 88 ff ff 00 a0 5a 11 81 88 ff ff ...s......Z..... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000007a4c434a>] nfs4_find_or_create_slot+0x8e/0x130 [<000000005472a39c>] nfs4_realloc_slot_table+0x23f/0x270 [<00000000cd8ca0eb>] nfs40_init_client+0x4a/0x90 [<00000000128486db>] nfs4_init_client+0xce/0x270 [<000000008d2cacad>] nfs4_set_client+0x1a2/0x2b0 [<000000000e593b52>] nfs4_create_server+0x300/0x5f0 [<00000000e4425dd2>] nfs4_try_get_tree+0x65/0x110 [<00000000d3a6176f>] vfs_get_tree+0x41/0xf0 [<0000000016b5ad4c>] path_mount+0x9b3/0xdd0 [<00000000494cae71>] __x64_sys_mount+0x190/0x1d0 [<000000005d56bdec>] do_syscall_64+0x35/0x80 [<00000000687c9ae4>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: abf79bb341bf ("NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking") Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27NFSv4.2: Fixup CLONE dest file size for zero-length countBenjamin Coddington
When holding a delegation, the NFS client optimizes away setting the attributes of a file from the GETATTR in the compound after CLONE, and for a zero-length CLONE we will end up setting the inode's size to zero in nfs42_copy_dest_done(). Handle this case by computing the resulting count from the server's reported size after CLONE's GETATTR. Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 94d202d5ca39 ("NFSv42: Copy offload should update the file size when appropriate") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27NFSv4: Retry LOCK on OLD_STATEID during delegation returnBenjamin Coddington
There's a small window where a LOCK sent during a delegation return can race with another OPEN on client, but the open stateid has not yet been updated. In this case, the client doesn't handle the OLD_STATEID error from the server and will lose this lock, emitting: "NFS: nfs4_handle_delegation_recall_error: unhandled error -10024". Fix this by sending the task through the nfs4 error handling in nfs4_lock_done() when we may have to reconcile our stateid with what the server believes it to be. For this case, the result is a retry of the LOCK operation with the updated stateid. Reported-by: Gonzalo Siero Humet <gsierohu@redhat.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27NFSv4.1: We must always send RECLAIM_COMPLETE after a rebootTrond Myklebust
Currently, we are only guaranteed to send RECLAIM_COMPLETE if we have open state to recover. Fix the client to always send RECLAIM_COMPLETE after setting up the lease. Fixes: fce5c838e133 ("nfs41: RECLAIM_COMPLETE functionality") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27NFSv4.1: Handle RECLAIM_COMPLETE trunking errorsTrond Myklebust
If RECLAIM_COMPLETE sets the NFS4CLNT_BIND_CONN_TO_SESSION flag, then we need to loop back in order to handle it. Fixes: 0048fdd06614 ("NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27NFSv4: Fix a potential state reclaim deadlockTrond Myklebust
If the server reboots while we are engaged in a delegation return, and there is a pNFS layout with return-on-close set, then the current code can end up deadlocking in pnfs_roc() when nfs_inode_set_delegation() tries to return the old delegation. Now that delegreturn actually uses its own copy of the stateid, it should be safe to just always update the delegation stateid in place. Fixes: 078000d02d57 ("pNFS: We want return-on-close to complete when evicting the inode") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27NFS: Avoid memcpy() run-time warning for struct sockaddr overflowsKees Cook
The 'nfs_server' and 'mount_server' structures include a union of 'struct sockaddr' (with the older 16 bytes max address size) and 'struct sockaddr_storage' which is large enough to hold all the supported sa_family types (128 bytes max size). The runtime memcpy() buffer overflow checker is seeing attempts to write beyond the 16 bytes as an overflow, but the actual expected size is that of 'struct sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more completely through-out NFS, which results in adjusting the memcpy() buffers to the correct union members. Avoids this false positive run-time warning under CONFIG_FORTIFY_SOURCE: memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16) Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna@kernel.org> Cc: linux-nfs@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27nfs: Remove redundant null checks before kfreeYushan Zhou
Fix the following coccicheck warning: fs/nfs/dir.c:2494:2-7: WARNING: NULL check before some freeing functions is not needed. Signed-off-by: Yushan Zhou <katrinzhou@tencent.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-13Merge tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client updates from Anna Schumaker: "New Features: - Add NFSv4.2 xattr tracepoints - Replace xprtiod WQ in rpcrdma - Flexfiles cancels I/O on layout recall or revoke Bugfixes and Cleanups: - Directly use ida_alloc() / ida_free() - Don't open-code max_t() - Prefer using strscpy over strlcpy - Remove unused forward declarations - Always return layout states on flexfiles layout return - Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of error - Allow more xprtrdma memory allocations to fail without triggering a reclaim - Various other xprtrdma clean ups - Fix rpc_killall_tasks() races" * tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits) NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked SUNRPC: Add API to force the client to disconnect SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls SUNRPC: Fix races with rpc_killall_tasks() xprtrdma: Fix uninitialized variable xprtrdma: Prevent memory allocations from driving a reclaim xprtrdma: Memory allocation should be allowed to fail during connect xprtrdma: MR-related memory allocation should be allowed to fail xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc() xprtrdma: Clean up synopsis of rpcrdma_req_create() svcrdma: Clean up RPCRDMA_DEF_GFP SUNRPC: Replace the use of the xprtiod WQ in rpcrdma NFSv4.2: Add a tracepoint for listxattr NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2 NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration NFSv4: remove nfs4_renewd_prepare_shutdown() declaration fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment NFSv4/pNFS: Always return layout stats on layout return for flexfiles ...
2022-10-10Merge tag 'sched-core-2022-10-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Debuggability: - Change most occurances of BUG_ON() to WARN_ON_ONCE() - Reorganize & fix TASK_ state comparisons, turn it into a bitmap - Update/fix misc scheduler debugging facilities Load-balancing & regular scheduling: - Improve the behavior of the scheduler in presence of lot of SCHED_IDLE tasks - in particular they should not impact other scheduling classes. - Optimize task load tracking, cleanups & fixes - Clean up & simplify misc load-balancing code Freezer: - Rewrite the core freezer to behave better wrt thawing and be simpler in general, by replacing PF_FROZEN with TASK_FROZEN & fixing/adjusting all the fallout. Deadline scheduler: - Fix the DL capacity-aware code - Factor out dl_task_is_earliest_deadline() & replenish_dl_new_period() - Relax/optimize locking in task_non_contending() Cleanups: - Factor out the update_current_exec_runtime() helper - Various cleanups, simplifications" * tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) sched: Fix more TASK_state comparisons sched: Fix TASK_state comparisons sched/fair: Move call to list_last_entry() in detach_tasks sched/fair: Cleanup loop_max and loop_break sched/fair: Make sure to try to detach at least one movable task sched: Show PF_flag holes freezer,sched: Rewrite core freezer logic sched: Widen TAKS_state literals sched/wait: Add wait_event_state() sched/completion: Add wait_for_completion_state() sched: Add TASK_ANY for wait_task_inactive() sched: Change wait_task_inactive()s match_state freezer,umh: Clean up freezer/initrd interaction freezer: Have {,un}lock_system_sleep() save/restore flags sched: Rename task_running() to task_on_cpu() sched/fair: Cleanup for SIS_PROP sched/fair: Default to false in test_idle_cores() sched/fair: Remove useless check in select_idle_core() sched/fair: Avoid double search on same cpu sched/fair: Remove redundant check in select_idle_smt() ...
2022-10-06Merge tag 'pull-file_inode' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull file_inode() updates from Al Vrio: "whack-a-mole: cropped up open-coded file_inode() uses..." * tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: orangefs: use ->f_mapping _nfs42_proc_copy(): use ->f_mapping instead of file_inode()->i_mapping dma_buf: no need to bother with file_inode()->i_mapping nfs_finish_open(): don't open-code file_inode() bprm_fill_uid(): don't open-code file_inode() sgx: use ->f_mapping... exfat_iterate(): don't open-code file_inode(file) ibmvmc: don't open-code file_inode()
2022-10-06NFSv4/flexfiles: Cancel I/O if the layout is recalled or revokedTrond Myklebust
If the layout is recalled or revoked, we want to cancel I/O as quickly as possible so that we can return the layout. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05NFSv4.2: Add a tracepoint for listxattrAnna Schumaker
This can be defined as simply an NFS4_INODE_EVENT() since we don't have the name of a specific xattr to list. This roughly matches readdir, which also uses an NFS4_INODE_EVENT() tracepoint. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattrAnna Schumaker
These functions take similar arguments, and can share a tracepoint class for common formatting. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2Anna Schumaker
NFS4_CONTENT_DATA and NFS4_CONTENT_HOLE both only exist under NFS v4.2. Move their corresponding TRACE_DEFINE_ENUM calls under this Kconfig option. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTRAnna Schumaker
We can translate this into an empty response list instead of passing an error up to userspace. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declarationGaosheng Cui
nfs_write_prepare() has been removed since commit a4cdda59111f ("NFS: Create a common pgio_rpc_prepare function"), so remove it. nfs_wait_atomic_killable() has been removed since commit 723c921e7dfc ("sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API"), so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05NFSv4: remove nfs4_renewd_prepare_shutdown() declarationGaosheng Cui
nfs4_renewd_prepare_shutdown() has been removed since commit 3050141bae57 ("NFSv4: Kill nfs4_renewd_prepare_shutdown()"), so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in commentJiangshan Yi
Fix spelling typo and syntax error in comment. Suggested-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03NFSv4/pNFS: Always return layout stats on layout return for flexfilesTrond Myklebust
We want to ensure that the server never misses the layout stats when we're closing the file, so that it knows whether or not to update its internal state. Otherwise, if we were racing with a layout stat, we might cause the server to invalidate its layout before the layout stat got processed. Fixes: 06946c6a3d8b ("pNFS/flexfiles: Only send layoutstats updates for mirrors that were updated") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03NFS: move from strlcpy with unused retval to strscpyWolfram Sang
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03NFS: clean up a needless assignment in nfs_file_write()Lukas Bulwahn
Commit 064109db53ec ("NFS: remove redundant code in nfs_file_write()") identifies that filemap_fdatawait_range() will always return 0 and removes a dead error-handling case in nfs_file_write(). With this change however, assigning the return of filemap_fdatawait_range() to the result variable is a dead store. Remove this needless assignment. No functional change. No change in object code. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03nfs: remove unnecessary (void*) conversions.yuzhe
remove unnecessary void* type castings. Signed-off-by: yuzhe <yuzhe@nfschina.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03NFSv4: Directly use ida_alloc()/free()Bo Liu
Use ida_alloc()/ida_free() instead of ida_simple_get()/ida_simple_remove(). The latter is deprecated and more verbose. Signed-off-by: Bo Liu <liubo03@inspur.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-09-26SUNRPC: Parametrize how much of argsize should be zeroedChuck Lever
Currently, SUNRPC clears the whole of .pc_argsize before processing each incoming RPC transaction. Add an extra parameter to struct svc_procedure to enable upper layers to reduce the amount of each operation's argument structure that is zeroed by SUNRPC. The size of struct nfsd4_compoundargs, in particular, is a lot to clear on each incoming RPC Call. A subsequent patch will cut this down to something closer to what NFSv2 and NFSv3 uses. This patch should cause no behavior changes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-12Merge tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: - Fix SUNRPC call completion races with call_decode() that trigger a WARN_ON() - NFSv4.0 cannot support open-by-filehandle and NFS re-export - Revert "SUNRPC: Remove unreachable error condition" to allow handling of error conditions - Update suid/sgid mode bits after ALLOCATE and DEALLOCATE * tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "SUNRPC: Remove unreachable error condition" NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE NFSv4: Turn off open-by-filehandle and NFS re-export for NFSv4.0 SUNRPC: Fix call completion races with call_decode()
2022-09-08NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATEAnna Schumaker
The fallocate call invalidates suid and sgid bits as part of normal operation. We need to mark the mode bits as invalid when using fallocate with an suid so these will be updated the next time the user looks at them. This fixes xfstests generic/683 and generic/684. Reported-by: Yue Cui <cuiyue-fnst@fujitsu.com> Fixes: 913eca1aea87 ("NFS: Fallocate should use the nfs4_fattr_bitmap") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-09-07freezer,sched: Rewrite core freezer logicPeter Zijlstra
Rewrite the core freezer to behave better wrt thawing and be simpler in general. By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is ensured frozen tasks stay frozen until thawed and don't randomly wake up early, as is currently possible. As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up two PF_flags (yay!). Specifically; the current scheme works a little like: freezer_do_not_count(); schedule(); freezer_count(); And either the task is blocked, or it lands in try_to_freezer() through freezer_count(). Now, when it is blocked, the freezer considers it frozen and continues. However, on thawing, once pm_freezing is cleared, freezer_count() stops working, and any random/spurious wakeup will let a task run before its time. That is, thawing tries to thaw things in explicit order; kernel threads and workqueues before doing bringing SMP back before userspace etc.. However due to the above mentioned races it is entirely possible for userspace tasks to thaw (by accident) before SMP is back. This can be a fatal problem in asymmetric ISA architectures (eg ARMv9) where the userspace task requires a special CPU to run. As said; replace this with a special task state TASK_FROZEN and add the following state transitions: TASK_FREEZABLE -> TASK_FROZEN __TASK_STOPPED -> TASK_FROZEN __TASK_TRACED -> TASK_FROZEN The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL (IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state is already required to deal with spurious wakeups and the freezer causes one such when thawing the task (since the original state is lost). The special __TASK_{STOPPED,TRACED} states *can* be restored since their canonical state is in ->jobctl. With this, frozen tasks need an explicit TASK_FROZEN wakeup and are free of undue (early / spurious) wakeups. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org
2022-09-01_nfs42_proc_copy(): use ->f_mapping instead of file_inode()->i_mappingAl Viro
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01nfs_finish_open(): don't open-code file_inode()Al Viro
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01NFSv4: Turn off open-by-filehandle and NFS re-export for NFSv4.0Trond Myklebust
The NFSv4.0 protocol only supports open() by name. It cannot therefore be used with open_by_handle() and friends, nor can it be re-exported by knfsd. Reported-by: Chuck Lever III <chuck.lever@oracle.com> Fixes: 20fa19027286 ("nfs: add export operations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-08-22Merge tag 'nfs-for-5.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client fixes from Trond Myklebust: "Stable fixes: - NFS: Fix another fsync() issue after a server reboot Bugfixes: - NFS: unlink/rmdir shouldn't call d_delete() twice on ENOENT - NFS: Fix missing unlock in nfs_unlink() - Add sanity checking of the file type used by __nfs42_ssc_open - Fix a case where we're failing to set task->tk_rpc_status Cleanups: - Remove the NFS_CONTEXT_RESEND_WRITES flag that got obsoleted by the fsync() fix" * tag 'nfs-for-5.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: RPC level errors should set task->tk_rpc_status NFSv4.2 fix problems with __nfs42_ssc_open NFS: unlink/rmdir shouldn't call d_delete() twice on ENOENT NFS: Cleanup to remove unused flag NFS_CONTEXT_RESEND_WRITES NFS: Remove a bogus flag setting in pnfs_write_done_resend_to_mds NFS: Fix another fsync() issue after a server reboot NFS: Fix missing unlock in nfs_unlink()
2022-08-19NFSv4.2 fix problems with __nfs42_ssc_openOlga Kornievskaia
A destination server while doing a COPY shouldn't accept using the passed in filehandle if its not a regular filehandle. If alloc_file_pseudo() has failed, we need to decrement a reference on the newly created inode, otherwise it leaks. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: ec4b092508982 ("NFS: inter ssc open") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-08-19NFS: unlink/rmdir shouldn't call d_delete() twice on ENOENTNeilBrown
nfs_unlink() calls d_delete() twice if it receives ENOENT from the server - once in nfs_dentry_handle_enoent() from nfs_safe_remove and once in nfs_dentry_remove_handle_error(). nfs_rmddir() also calls it twice - the nfs_dentry_handle_enoent() call is direct and inside a region locked with ->rmdir_sem It is safe to call d_delete() twice if the refcount > 1 as the dentry is simply unhashed. If the refcount is 1, the first call sets d_inode to NULL and the second call crashes. This patch guards the d_delete() call from nfs_dentry_handle_enoent() leaving the one under ->remdir_sem in case that is important. In mainline it would be safe to remove the d_delete() call. However in older kernels to which this might be backported, that would change the behaviour of nfs_unlink(). nfs_unlink() used to unhash the dentry which resulted in nfs_dentry_handle_enoent() not calling d_delete(). So in older kernels we need the d_delete() in nfs_dentry_remove_handle_error() when called from nfs_unlink() but not when called from nfs_rmdir(). To make the code work correctly for old and new kernels, and from both nfs_unlink() and nfs_rmdir(), we protect the d_delete() call with simple_positive(). This ensures it is never called in a circumstance where it could crash. Fixes: 3c59366c207e ("NFS: don't unhash dentry during unlink/rename") Fixes: 9019fb391de0 ("NFS: Label the dentry with a verifier in nfs_rmdir() and nfs_unlink()") Signed-off-by: NeilBrown <neilb@suse.de> Tested-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-08-13NFS: Remove a bogus flag setting in pnfs_write_done_resend_to_mdsTrond Myklebust
Since pnfs_write_done_resend_to_mds() does not actually call end_page_writeback() on the pages that are being redirected to the metadata server, callers of fsync() do not see the I/O as complete until the writeback to the MDS finishes. We therefore do not need to set NFS_CONTEXT_RESEND_WRITES, since there is nothing to redrive. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>