aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2022-06-12Merge tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs client fixes from Steve French: "Three reconnect fixes, all for stable as well. One of these three reconnect fixes does address a problem with multichannel reconnect, but this does not include the additional fix (still being tested) for dynamically detecting multichannel adapter changes which will improve those reconnect scenarios even more" * tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: populate empty hostnames for extra channels cifs: return errors during session setup during reconnects cifs: fix reconnect on smb3 mount types
2022-06-10Merge tag 'nfsd-5.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Notable changes: - There is now a backup maintainer for NFSD Notable fixes: - Prevent array overruns in svc_rdma_build_writes() - Prevent buffer overruns when encoding NFSv3 READDIR results - Fix a potential UAF in nfsd_file_put()" * tag 'nfsd-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer() SUNRPC: Clean up xdr_get_next_encode_buffer() SUNRPC: Clean up xdr_commit_encode() SUNRPC: Optimize xdr_reserve_space() SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() SUNRPC: Trap RDMA segment overflows NFSD: Fix potential use-after-free in nfsd_file_put() MAINTAINERS: reciprocal co-maintainership for file locking and nfsd
2022-06-10cifs: populate empty hostnames for extra channelsShyam Prasad N
Currently, the secondary channels of a multichannel session also get hostname populated based on the info in primary channel. However, this will end up with a wrong resolution of hostname to IP address during reconnect. This change fixes this by not populating hostname info for all secondary channels. Fixes: 5112d80c162f ("cifs: populate server_hostname for extra channels") Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-10netfs: Rename the netfs_io_request cleanup op and give it an op pointerDavid Howells
The netfs_io_request cleanup op is now always in a position to be given a pointer to a netfs_io_request struct, so this can be passed in instead of the mapping and private data arguments (both of which are included in the struct). So rename the ->cleanup op to ->free_request (to match ->init_request) and pass in the I/O pointer. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com
2022-06-10netfs: Further cleanups after struct netfs_inode wrapper introducedLinus Torvalds
Change the signature of netfs helper functions to take a struct netfs_inode pointer rather than a struct inode pointer where appropriate, thereby relieving the need for the network filesystem to convert its internal inode format down to the VFS inode only for netfslib to bounce it back up. For type safety, it's better not to do that (and it's less typing too). Give netfs_write_begin() an extra argument to pass in a pointer to the netfs_inode struct rather than deriving it internally from the file pointer. Note that the ->write_begin() and ->write_end() ops are intended to be replaced in the future by netfslib code that manages this without the need to call in twice for each page. netfs_readpage() and similar are intended to be pointed at directly by the address_space_operations table, so must stick to the signature dictated by the function pointers there. Changes ======= - Updated the kerneldoc comments and documentation [DH]. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/
2022-06-10afs: Fix some checker issuesDavid Howells
Remove an unused global variable and make another static as reported by make C=1. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org
2022-06-10Merge tag 'zonefs-5.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fixes from Damien Le Moal: - Fix handling of the explicit-open mount option, and in particular the conditions under which this option can be ignored. - Fix a problem with zonefs iomap_begin method, causing a hang in iomap_readahead() when a readahead request reaches the end of a file. * tag 'zonefs-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: fix zonefs_iomap_begin() for reads zonefs: Do not ignore explicit_open with active zone limit zonefs: fix handling of explicit_open option on mount
2022-06-09netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_contextDavid Howells
While randstruct was satisfied with using an open-coded "void *" offset cast for the netfs_i_context <-> inode casting, __builtin_object_size() as used by FORTIFY_SOURCE was not as easily fooled. This was causing the following complaint[1] from gcc v12: In file included from include/linux/string.h:253, from include/linux/ceph/ceph_debug.h:7, from fs/ceph/inode.c:2: In function 'fortify_memset_chk', inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2, inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2: include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning] 242 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by embedding a struct inode into struct netfs_i_context (which should perhaps be renamed to struct netfs_inode). The struct inode vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode structs and vfs_inode is then simply changed to "netfs.inode" in those filesystems. Further, rename netfs_i_context to netfs_inode, get rid of the netfs_inode() function that converted a netfs_i_context pointer to an inode pointer (that can now be done with &ctx->inode) and rename the netfs_i_context() function to netfs_inode() (which is now a wrapper around container_of()). Most of the changes were done with: perl -p -i -e 's/vfs_inode/netfs.inode/'g \ `git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]` Kees suggested doing it with a pair structure[2] and a special declarator to insert that into the network filesystem's inode wrapper[3], but I think it's cleaner to embed it - and then it doesn't matter if struct randomisation reorders things. Dave Chinner suggested using a filesystem-specific VFS_I() function in each filesystem to convert that filesystem's own inode wrapper struct into the VFS inode struct[4]. Version #2: - Fix a couple of missed name changes due to a disabled cifs option. - Rename nfs_i_context to nfs_inode - Use "netfs" instead of "nic" as the member name in per-fs inode wrapper structs. [ This also undoes commit 507160f46c55 ("netfs: gcc-12: temporarily disable '-Wattribute-warning' for now") that is no longer needed ] Fixes: bc899ee1c898 ("netfs: Add a netfs inode context") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> cc: Jonathan Corbet <corbet@lwn.net> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Steve French <smfrench@gmail.com> cc: William Kucharski <william.kucharski@oracle.com> cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> cc: Dave Chinner <david@fromorbit.com> cc: linux-doc@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: samba-technical@lists.samba.org cc: linux-fsdevel@vger.kernel.org cc: linux-hardening@vger.kernel.org Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1] Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2] Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3] Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4] Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-09Merge tag 'fs_for_v5.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, writeback, and quota fixes and cleanups from Jan Kara: "A fix for race in writeback code and two cleanups in quota and ext2" * tag 'fs_for_v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Prevent memory allocation recursion while holding dq_lock writeback: Fix inode->i_io_list not be protected by inode->i_lock error fs: Fix syntax errors in comments
2022-06-09netfs: gcc-12: temporarily disable '-Wattribute-warning' for nowLinus Torvalds
This is a pure band-aid so that I can continue merging stuff from people while some of the gcc-12 fallout gets sorted out. In particular, gcc-12 is very unhappy about the kinds of pointer arithmetic tricks that netfs does, and that makes the fortify checks trigger in afs and ceph: In function ‘fortify_memset_chk’, inlined from ‘netfs_i_context_init’ at include/linux/netfs.h:327:2, inlined from ‘afs_set_netfs_context’ at fs/afs/inode.c:61:2, inlined from ‘afs_root_iget’ at fs/afs/inode.c:543:2: include/linux/fortify-string.h:258:25: warning: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning] 258 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ and the reason is that netfs_i_context_init() is passed a 'struct inode' pointer, and then it does struct netfs_i_context *ctx = netfs_i_context(inode); memset(ctx, 0, sizeof(*ctx)); where that netfs_i_context() function just does pointer arithmetic on the inode pointer, knowing that the netfs_i_context is laid out immediately after it in memory. This is all truly disgusting, since the whole "netfs_i_context is laid out immediately after it in memory" is not actually remotely true in general, but is just made to be that way for afs and ceph. See for example fs/cifs/cifsglob.h: struct cifsInodeInfo { struct { /* These must be contiguous */ struct inode vfs_inode; /* the VFS's inode record */ struct netfs_i_context netfs_ctx; /* Netfslib context */ }; [...] and realize that this is all entirely wrong, and the pointer arithmetic that netfs_i_context() is doing is also very very wrong and wouldn't give the right answer if netfs_ctx had different alignment rules from a 'struct inode', for example). Anyway, that's just a long-winded way to say "the gcc-12 warning is actually quite reasonable, and our code happens to work but is pretty disgusting". This is getting fixed properly, but for now I made the mistake of thinking "the week right after the merge window tends to be calm for me as people take a breather" and I did a sustem upgrade. And I got gcc-12 as a result, so to continue merging fixes from people and not have the end result drown in warnings, I am fixing all these gcc-12 issues I hit. Including with these kinds of temporary fixes. Cc: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/all/AEEBCF5D-8402-441D-940B-105AA718C71F@chromium.org/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-08zonefs: fix zonefs_iomap_begin() for readsDamien Le Moal
If a readahead is issued to a sequential zone file with an offset exactly equal to the current file size, the iomap type is set to IOMAP_UNWRITTEN, which will prevent an IO, but the iomap length is calculated as 0. This causes a WARN_ON() in iomap_iter(): [17309.548939] WARNING: CPU: 3 PID: 2137 at fs/iomap/iter.c:34 iomap_iter+0x9cf/0xe80 [...] [17309.650907] RIP: 0010:iomap_iter+0x9cf/0xe80 [...] [17309.754560] Call Trace: [17309.757078] <TASK> [17309.759240] ? lock_is_held_type+0xd8/0x130 [17309.763531] iomap_readahead+0x1a8/0x870 [17309.767550] ? iomap_read_folio+0x4c0/0x4c0 [17309.771817] ? lockdep_hardirqs_on_prepare+0x400/0x400 [17309.778848] ? lock_release+0x370/0x750 [17309.784462] ? folio_add_lru+0x217/0x3f0 [17309.790220] ? reacquire_held_locks+0x4e0/0x4e0 [17309.796543] read_pages+0x17d/0xb60 [17309.801854] ? folio_add_lru+0x238/0x3f0 [17309.807573] ? readahead_expand+0x5f0/0x5f0 [17309.813554] ? policy_node+0xb5/0x140 [17309.819018] page_cache_ra_unbounded+0x27d/0x450 [17309.825439] filemap_get_pages+0x500/0x1450 [17309.831444] ? filemap_add_folio+0x140/0x140 [17309.837519] ? lock_is_held_type+0xd8/0x130 [17309.843509] filemap_read+0x28c/0x9f0 [17309.848953] ? zonefs_file_read_iter+0x1ea/0x4d0 [zonefs] [17309.856162] ? trace_contention_end+0xd6/0x130 [17309.862416] ? __mutex_lock+0x221/0x1480 [17309.868151] ? zonefs_file_read_iter+0x166/0x4d0 [zonefs] [17309.875364] ? filemap_get_pages+0x1450/0x1450 [17309.881647] ? __mutex_unlock_slowpath+0x15e/0x620 [17309.888248] ? wait_for_completion_io_timeout+0x20/0x20 [17309.895231] ? lock_is_held_type+0xd8/0x130 [17309.901115] ? lock_is_held_type+0xd8/0x130 [17309.906934] zonefs_file_read_iter+0x356/0x4d0 [zonefs] [17309.913750] new_sync_read+0x2d8/0x520 [17309.919035] ? __x64_sys_lseek+0x1d0/0x1d0 Furthermore, this causes iomap_readahead() to loop forever as iomap_readahead_iter() always returns 0, making no progress. Fix this by treating reads after the file size as access to holes, setting the iomap type to IOMAP_HOLE, the iomap addr to IOMAP_NULL_ADDR and using the length argument as is for the iomap length. To simplify the code with this change, zonefs_iomap_begin() is split into the read variant, zonefs_read_iomap_begin() and zonefs_read_iomap_ops, and the write variant, zonefs_write_iomap_begin() and zonefs_write_iomap_ops. Reported-by: Jorgen Hansen <Jorgen.Hansen@wdc.com> Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Jorgen Hansen <Jorgen.Hansen@wdc.com>
2022-06-08zonefs: Do not ignore explicit_open with active zone limitDamien Le Moal
A zoned device may have no limit on the number of open zones but may have a limit on the number of active zones it can support. In such case, the explicit_open mount option should not be ignored to ensure that the open() system call activates the zone with an explicit zone open command, thus guaranteeing that the zone can be written. Enforce this by ignoring the explicit_open mount option only for devices that have both the open and active zone limits equal to 0. Fixes: 87c9ce3ffec9 ("zonefs: Add active seq file accounting") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-06-08zonefs: fix handling of explicit_open option on mountDamien Le Moal
Ignoring the explicit_open mount option on mount for devices that do not have a limit on the number of open zones must be done after the mount options are parsed and set in s_mount_opts. Move the check to ignore the explicit_open option after the call to zonefs_parse_options() in zonefs_fill_super(). Fixes: b5c00e975779 ("zonefs: open/close zone on file open/close") Cc: <stable@vger.kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
2022-06-06cifs: return errors during session setup during reconnectsShyam Prasad N
During reconnects, we check the return value from cifs_negotiate_protocol, and have handlers for both success and failures. But if that passes, and cifs_setup_session returns any errors other than -EACCES, we do not handle that. This fix adds a handler for that, so that we don't go ahead and try a tree_connect on a failed session. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-06quota: Prevent memory allocation recursion while holding dq_lockMatthew Wilcox (Oracle)
As described in commit 02117b8ae9c0 ("f2fs: Set GF_NOFS in read_cache_page_gfp while doing f2fs_quota_read"), we must not enter filesystem reclaim while holding the dq_lock. Prevent this more generally by using memalloc_nofs_save() while holding the lock. Link: https://lore.kernel.org/r/20220605143815.2330891-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-06writeback: Fix inode->i_io_list not be protected by inode->i_lock errorJchao Sun
Commit b35250c0816c ("writeback: Protect inode->i_io_list with inode->i_lock") made inode->i_io_list not only protected by wb->list_lock but also inode->i_lock, but inode_io_list_move_locked() was missed. Add lock there and also update comment describing things protected by inode->i_lock. This also fixes a race where __mark_inode_dirty() could move inode under flush worker's hands and thus sync(2) could miss writing some inodes. Fixes: b35250c0816c ("writeback: Protect inode->i_io_list with inode->i_lock") Link: https://lore.kernel.org/r/20220524150540.12552-1-sunjunchao2870@gmail.com CC: stable@vger.kernel.org Signed-off-by: Jchao Sun <sunjunchao2870@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-06fs: Fix syntax errors in commentsXiang wangx
Delete the redundant word 'not'. Link: https://lore.kernel.org/r/20220605125509.14837-1-wangxiang@cdjrlc.com Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com> Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-06cifs: fix reconnect on smb3 mount typesPaulo Alcantara
cifs.ko defines two file system types: cifs & smb3, and __cifs_get_super() was not including smb3 file system type when looking up superblocks, therefore failing to reconnect tcons in cifs_tree_connect(). Fix this by calling iterate_supers_type() on both file system types. Link: https://lore.kernel.org/r/CAFrh3J9soC36+BVuwHB=g9z_KB5Og2+p2_W+BBoBOZveErz14w@mail.gmail.com Cc: stable@vger.kernel.org Tested-by: Satadru Pramanik <satadru@gmail.com> Reported-by: Satadru Pramanik <satadru@gmail.com> Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-05Merge tag 'pull-work.fd-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull file descriptor fix from Al Viro: "Fix for breakage in #work.fd this window" * tag 'pull-work.fd-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix the breakage in close_fd_get_file() calling conventions change
2022-06-05fix the breakage in close_fd_get_file() calling conventions changeAl Viro
It used to grab an extra reference to struct file rather than just transferring to caller the one it had removed from descriptor table. New variant doesn't, and callers need to be adjusted. Reported-and-tested-by: syzbot+47dd250f527cb7bebf24@syzkaller.appspotmail.com Fixes: 6319194ec57b ("Unify the primitives for file descriptor closing") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-04Merge tag 'pull-18-rc1-work.namei' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pathname updates from Al Viro: "Several cleanups in fs/namei.c" * tag 'pull-18-rc1-work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: namei: cleanup double word in comment get rid of dead code in legitimize_root() fs/namei.c:reserve_stack(): tidy up the call of try_to_unlazy()
2022-06-04Merge tag 'pull-18-rc1-work.mount' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull mount handling updates from Al Viro: "Cleanups (and one fix) around struct mount handling. The fix is usermode_driver.c one - once you've done kern_mount(), you must kern_unmount(); simple mntput() will end up with a leak. Several failure exits in there messed up that way... In practice you won't hit those particular failure exits without fault injection, though" * tag 'pull-18-rc1-work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: move mount-related externs from fs.h to mount.h blob_to_mnt(): kern_unmount() is needed to undo kern_mount() m->mnt_root->d_inode->i_sb is a weird way to spell m->mnt_sb... linux/mount.h: trim includes uninline may_mount() and don't opencode it in fspick(2)/fsopen(2)
2022-06-04Merge tag 'pull-18-rc1-work.fd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull file descriptor updates from Al Viro. - Descriptor handling cleanups * tag 'pull-18-rc1-work.fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Unify the primitives for file descriptor closing fs: remove fget_many and fput_many interface io_uring_enter(): don't leave f.flags uninitialized
2022-06-04Merge tag '5.19-rc-smb3-client-fixes-part2' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull cifs client fixes from Steve French: "Nine cifs/smb3 client fixes. Includes DFS fixes, some cleanup of leagcy SMB1 code, duplicated message cleanup and a double free and deadlock fix" * tag '5.19-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix uninitialized pointer in error case in dfs_cache_get_tgt_share cifs: skip trailing separators of prefix paths cifs: update internal module number cifs: version operations for smb20 unneeded when legacy support disabled cifs: do not build smb1ops if legacy support is disabled cifs: fix potential deadlock in direct reclaim cifs: when extending a file with falloc we should make files not-sparse cifs: remove repeated debug message on cifs_put_smb_ses() cifs: fix potential double free during failed mount
2022-06-04cifs: fix uninitialized pointer in error case in dfs_cache_get_tgt_shareSteve French
Set default value of ppath to null. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-03Merge tag 'ntfs3_for_5.19' of ↵Linus Torvalds
https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 updates from Konstantin Komarov: - fix some memory leaks and panic - fixed xfstests (tested on x86_64): generic/092 generic/099 generic/228 generic/240 generic/307 generic/444 - fix some typos, dead code, etc * tag 'ntfs3_for_5.19' of https://github.com/Paragon-Software-Group/linux-ntfs3: fs/ntfs3: provide block_invalidate_folio to fix memory leak fs/ntfs3: Fix invalid free in log_replay fs/ntfs3: Update valid size if -EIOCBQUEUED fs/ntfs3: Check new size for limits fs/ntfs3: Fix fiemap + fix shrink file size (to remove preallocated space) fs/ntfs3: In function ntfs_set_acl_ex do not change inode->i_mode if called from function ntfs_init_acl fs/ntfs3: Optimize locking in ntfs_save_wsl_perm fs/ntfs3: Update i_ctime when xattr is added fs/ntfs3: Restore ntfs_xattr_get_acl and ntfs_xattr_set_acl functions fs/ntfs3: Keep preallocated only if option prealloc enabled fs/ntfs3: Fix some memory leaks in an error handling path of 'log_replay()'
2022-06-03Merge tag 'kthread-cleanups-for-v5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull kthread updates from Eric Biederman: "This updates init and user mode helper tasks to be ordinary user mode tasks. Commit 40966e316f86 ("kthread: Ensure struct kthread is present for all kthreads") caused init and the user mode helper threads that call kernel_execve to have struct kthread allocated for them. This struct kthread going away during execve in turned made a use after free of struct kthread possible. Here, commit 343f4c49f243 ("kthread: Don't allocate kthread_struct for init and umh") is enough to fix the use after free and is simple enough to be backportable. The rest of the changes pass struct kernel_clone_args to clean things up and cause the code to make sense. In making init and the user mode helpers tasks purely user mode tasks I ran into two complications. The function task_tick_numa was detecting tasks without an mm by testing for the presence of PF_KTHREAD. The initramfs code in populate_initrd_image was using flush_delayed_fput to ensuere the closing of all it's file descriptors was complete, and flush_delayed_fput does not work in a userspace thread. I have looked and looked and more complications and in my code review I have not found any, and neither has anyone else with the code sitting in linux-next" * tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: sched: Update task_tick_numa to ignore tasks without an mm fork: Stop allowing kthreads to call execve fork: Explicitly set PF_KTHREAD init: Deal with the init process being a user mode process fork: Generalize PF_IO_WORKER handling fork: Explicity test for idle tasks in copy_thread fork: Pass struct kernel_clone_args into copy_thread kthread: Don't allocate kthread_struct for init and umh
2022-06-03Merge tag 'for-linus-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull JFFS2, UBI and UBIFS updates from Richard Weinberger: "JFFS2: - Fixes for a memory leak UBI: - Fixes for fastmap (UAF, high CPU usage) UBIFS: - Minor cleanups" * tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: ubi_create_volume: Fix use-after-free when volume creation failed ubi: fastmap: Check wl_pool for free peb before wear leveling ubi: fastmap: Fix high cpu usage of ubi_bgt by making sure wl_pool not empty ubifs: Use NULL instead of using plain integer as pointer ubifs: Simplify the return expression of run_gc() jffs2: fix memory leak in jffs2_do_fill_super jffs2: Use kzalloc instead of kmalloc/memset
2022-06-03cifs: skip trailing separators of prefix pathsPaulo Alcantara
During DFS failover, prefix paths may change, so make sure to not leave trailing separators when parsing thew in dfs_cache_get_tgt_share(). The separators of prefix paths are already handled by build_path_from_dentry_optional_prefix(). Consider the following DFS link: //dom/dfs/link: [\srv1\share\dir1, \srv2\share\dir1] Before commit: mount.cifs //dom/dfs/link tree connect to \\srv1\share; prefix_path=dir1 disconnect srv1; failover to srv2 tree connect to \\srv2\share; prefix_path=dir1\ mv foo bar ... SMB2 430 Create Request File: dir1\\foo;GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO;Close Request SMB2 582 Create Response File: dir1\\foo;GetInfo Response;Close Response SMB2 430 Create Request File: dir1\\bar;GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO;Close Request SMB2 286 Create Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;GetInfo Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;Close Response, Error: STATUS_OBJECT_NAME_NOT_FOUND SMB2 462 Create Request File: dir1\\foo;SetInfo Request FILE_INFO/SMB2_FILE_RENAME_INFO NewName:dir1\\bar;Close Request SMB2 478 Create Response File: dir1\\foo;SetInfo Response, Error: STATUS_OBJECT_NAME_INVALID;Close Response After commit: mount.cifs //dom/dfs/link tree connect to \\srv1\share; prefix_path=dir1 disconnect srv1; failover to srv2 tree connect to \\srv2\share; prefix_path=dir1 mv foo bar ... SMB2 430 Create Request File: dir1\foo;GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO;Close Request SMB2 582 Create Response File: dir1\foo;GetInfo Response;Close Response SMB2 430 Create Request File: dir1\bar;GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO;Close Request SMB2 286 Create Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;GetInfo Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;Close Response, Error: STATUS_OBJECT_NAME_NOT_FOUND SMB2 462 Create Request File: dir1\foo;SetInfo Request FILE_INFO/SMB2_FILE_RENAME_INFO NewName:dir1\bar;Close Request SMB2 478 Create Response File: dir1\foo;SetInfo Response;Close Response Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-03Merge tag 'driver-core-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core changes for 5.19-rc1. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things are: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes include: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time-outs" * tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) driver core: fix deadlock in __device_attach kernfs: Separate kernfs_pr_cont_buf and rename_lock. topology: Remove unused cpu_cluster_mask() driver core: Extend deferred probe timeout on driver registration MAINTAINERS: add Russ Weight as a firmware loader maintainer driver: base: fix UAF when driver_attach failed test_firmware: fix end of loop test in upload_read_show() driver core: location: Add "back" as a possible output for panel driver core: location: Free struct acpi_pld_info *pld driver core: Add "*" wildcard support to driver_async_probe cmdline param driver core: location: Check for allocations failure arch_topology: Trace the update thermal pressure kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file. export: fix string handling of namespace in EXPORT_SYMBOL_NS rpmsg: use local 'dev' variable rpmsg: Fix calling device_lock() on non-initialized device firmware_loader: describe 'module' parameter of firmware_upload_register() firmware_loader: Move definitions from sysfs_upload.h to sysfs.h firmware_loader: Fix configs for sysfs split selftests: firmware: Add firmware upload selftests ...
2022-06-03Merge tag 'spdx-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX updates from Greg KH: "Here are some SPDX license marker changes. The SPDX-labeling effort has started to pick up again, so here are some changes for various parts of the tree that are related to this effort. Included in here are: - freevxfs license updates - spihash.c license cleanups - spdxcheck script updates to make things easier to work with going forward All of the license updates came from the original authors/copyright holders of the code involved. All of these have been in linux-next for weeks with no reported issues" * tag 'spdx-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: siphash: add SPDX tags as sole licensing authority scripts/spdxcheck: Exclude top-level README scripts/spdxcheck: Exclude MAINTAINERS/CREDITS scripts/spdxcheck: Exclude config directories scripts/spdxcheck: Put excluded files and directories into a separate file scripts/spdxcheck: Add option to display files without SPDX scripts/spdxcheck: Add [sub]directory statistics scripts/spdxcheck: Add directory statistics scripts/spdxcheck: Add percentage to statistics freevxfs: relicense to GPLv2 only
2022-06-03Merge tag 'io_uring-5.19-2022-06-02' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull more io_uring updates from Jens Axboe: - A small series with some prep patches for the upcoming 5.20 split of the io_uring.c file. No functional changes here, just minor bits that are nice to get out of the way now (me) - Fix for a memory leak in high numbered provided buffer groups, introduced in the merge window (me) - Wire up the new socket opcode for allocated direct descriptors, making it consistent with the other opcodes that can instantiate a descriptor (me) - Fix for the inflight tracking, should go into 5.18-stable as well (me) - Fix for a deadlock for io-wq offloaded file slot allocations (Pavel) - Direct descriptor failure fput leak fix (Xiaoguang) - Fix for the direct descriptor allocation hinting in case of unsuccessful install (Xiaoguang) * tag 'io_uring-5.19-2022-06-02' of git://git.kernel.dk/linux-block: io_uring: reinstate the inflight tracking io_uring: fix deadlock on iowq file slot alloc io_uring: let IORING_OP_FILES_UPDATE support choosing fixed file slots io_uring: defer alloc_hint update to io_file_bitmap_set() io_uring: ensure fput() called correspondingly when direct install fails io_uring: wire up allocated direct descriptors for socket io_uring: fix a memory leak of buffer group list on exit io_uring: move shutdown under the general net section io_uring: unify calling convention for async prep handling io_uring: add io_op_defs 'def' pointer in req init and issue io_uring: make prep and issue side of req handlers named consistently io_uring: make timeout prep handlers consistent with other prep handlers
2022-06-02NFSD: Fix potential use-after-free in nfsd_file_put()Chuck Lever
nfsd_file_put_noref() can free @nf, so don't dereference @nf immediately upon return from nfsd_file_put_noref(). Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Fixes: 999397926ab3 ("nfsd: Clean up nfsd_file_put()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-06-02Merge tag 'ceph-for-5.19-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "A big pile of assorted fixes and improvements for the filesystem with nothing in particular standing out, except perhaps that the fact that the MDS never really maintained atime was made official and thus it's no longer updated on the client either. We also have a MAINTAINERS update: Jeff is transitioning his filesystem maintainership duties to Xiubo" * tag 'ceph-for-5.19-rc1' of https://github.com/ceph/ceph-client: (23 commits) MAINTAINERS: move myself from ceph "Maintainer" to "Reviewer" ceph: fix decoding of client session messages flags ceph: switch TASK_INTERRUPTIBLE to TASK_KILLABLE ceph: remove redundant variable ino ceph: try to queue a writeback if revoking fails ceph: fix statfs for subdir mounts ceph: fix possible deadlock when holding Fwb to get inline_data ceph: redirty the page for writepage on failure ceph: try to choose the auth MDS if possible for getattr ceph: disable updating the atime since cephfs won't maintain it ceph: flush the mdlog for filesystem sync ceph: rename unsafe_request_wait() libceph: use swap() macro instead of taking tmp variable ceph: fix statx AT_STATX_DONT_SYNC vs AT_STATX_FORCE_SYNC check ceph: no need to invalidate the fscache twice ceph: replace usage of found with dedicated list iterator variable ceph: use dedicated list iterator variable ceph: update the dlease for the hashed dentry when removing ceph: stop retrying the request when exceeding 256 times ceph: stop forwarding the request when exceeding 256 times ...
2022-06-01io_uring: reinstate the inflight trackingJens Axboe
After some debugging, it was realized that we really do still need the old inflight tracking for any file type that has io_uring_fops assigned. If we don't, then trivial circular references will mean that we never get the ctx cleaned up and hence it'll leak. Just bring back the inflight tracking, which then also means we can eliminate the conditional dropping of the file when task_work is queued. Fixes: d5361233e9ab ("io_uring: drop the old style inflight file tracking") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-01cifs: update internal module numberSteve French
To 2.37 Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-01cifs: version operations for smb20 unneeded when legacy support disabledSteve French
We should not be including unused smb20 specific code when legacy support is disabled (CONFIG_CIFS_ALLOW_INSECURE_LEGACY turned off). For example smb2_operations and smb2_values aren't used in that case. Over time we can move more and more SMB1/CIFS and SMB2.0 code into the insecure legacy ifdefs Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-01cifs: do not build smb1ops if legacy support is disabledSteve French
We should not be including unused SMB1/CIFS functions when legacy support is disabled (CONFIG_CIFS_ALLOW_INSECURE_LEGACY turned off), but especially obvious is not needing to build smb1ops.c at all when legacy support is disabled. Over time we can move more SMB1/CIFS and SMB2.0 legacy functions into ifdefs but this is a good start (and shrinks the module size a few percent). Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-01Merge tag 'xfs-5.19-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull more xfs updates from Dave Chinner: "This update is largely bug fixes and cleanups for all the code merged in the first pull request. The majority of them are to the new logged attribute code, but there are also a couple of fixes for other log recovery and memory leaks that have recently been found. Summary: - fix refcount leak in xfs_ifree() - fix xfs_buf_cancel structure leaks in log recovery - fix dquot leak after failed quota check - fix a couple of problematic ASSERTS - fix small aim7 perf regression in from new btree sibling validation - clean up log incompat feature marking for new logged attribute feature - disallow logged attributes on legacy V4 filesystem formats. - fix da state leak when freeing attr intents - improve validation of the attr log items in recovery - use slab caches for commonly used attr structures - fix leaks of attr name/value buffer and reduce copying overhead during intent logging - remove some dead debug code from log recovery" * tag 'xfs-5.19-for-linus-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (33 commits) xfs: fix xfs_ifree() error handling to not leak perag ref xfs: move xfs_attr_use_log_assist usage out of libxfs xfs: move xfs_attr_use_log_assist out of xfs_log.c xfs: warn about LARP once per mount xfs: implement per-mount warnings for scrub and shrink usage xfs: don't log every time we clear the log incompat flags xfs: convert buf_cancel_table allocation to kmalloc_array xfs: don't leak xfs_buf_cancel structures when recovery fails xfs: refactor buffer cancellation table allocation xfs: don't leak btree cursor when insrec fails after a split xfs: purge dquots after inode walk fails during quotacheck xfs: assert in xfs_btree_del_cursor should take into account error xfs: don't assert fail on perag references on teardown xfs: avoid unnecessary runtime sibling pointer endian conversions xfs: share xattr name and value buffers when logging xattr updates xfs: do not use logged xattr updates on V4 filesystems xfs: Remove duplicate include xfs: reduce IOCB_NOWAIT judgment for retry exclusive unaligned DIO xfs: Remove dead code xfs: fix typo in comment ...
2022-06-01Merge tag 'erofs-for-5.19-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull more erofs updates from Gao Xiang: "This is a follow-up to the main updates, including some fixes of fscache mode related to compressed inodes and a cachefiles tracepoint. There is also a patch to fix an unexpected decompression strategy change due to a cleanup in the past. All the fixes are quite small. Apart from these, documentation is also updated for a better description of recent new features. In addition, this has some trivial cleanups without actual code logic changes, so I could have a more recent codebase to work on folios and avoiding the PG_error page flag for the next cycle. Summary: - Leave compressed inodes unsupported in fscache mode for now - Avoid crash when using tracepoint cachefiles_prep_read - Fix `backmost' behavior due to a recent cleanup - Update documentation for better description of recent new features - Several decompression cleanups w/o logical change" * tag 'erofs-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix 'backmost' member of z_erofs_decompress_frontend erofs: simplify z_erofs_pcluster_readmore() erofs: get rid of label `restart_now' erofs: get rid of `struct z_erofs_collection' erofs: update documentation erofs: fix crash when enable tracepoint cachefiles_prep_read erofs: leave compressed inodes unsupported in fscache mode for now
2022-06-01Merge tag '5.19-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull ksmbd server updates from Steve French: - rdma (smbdirect) fixes, cleanup and optimizations - crediting (flow control) fix for mounts from Windows client - ACL fix - Windows client query dir fix - write validation fix - cleanups * tag '5.19-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: smbd: relax the count of sges required ksmbd: fix outstanding credits related bugs ksmbd: smbd: fix connection dropped issue ksmbd: Fix some kernel-doc comments ksmbd: fix wrong smbd max read/write size check ksmbd: add smbd max io size parameter ksmbd: handle smb2 query dir request for OutputBufferLength that is too small ksmbd: smbd: handle multiple Buffer descriptors ksmbd: smbd: change the return value of get_sg_list ksmbd: smbd: simplify tracking pending packets ksmbd: smbd: introduce read/write credits for RDMA read/write ksmbd: smbd: change prototypes of RDMA read/write related functions ksmbd: validate length in smb2_write() ksmbd: fix reference count leak in smb_check_perm_dacl()
2022-06-01afs: Fix infinite loop found by xfstest generic/676David Howells
In AFS, a directory is handled as a file that the client downloads and parses locally for the purposes of performing lookup and getdents operations. The in-kernel afs filesystem has a number of functions that do this. A directory file is arranged as a series of 2K blocks divided into 32-byte slots, where a directory entry occupies one or more slots, plus each block starts with one or more metadata blocks. When parsing a block, if the last slots are occupied by a dirent that occupies more than a single slot and the file position points at a slot that's not the initial one, the logic in afs_dir_iterate_block() that skips over it won't advance the file pointer to the end of it. This will cause an infinite loop in getdents() as it will keep retrying that block and failing to advance beyond the final entry. Fix this by advancing the file pointer if the next entry will be beyond it when we skip a block. This was found by the generic/676 xfstest but can also be triggered with something like: ~/xfstests-dev/src/t_readdir_3 /xfstest.test/z 4000 1 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lore.kernel.org/r/165391973497.110268.2939296942213894166.stgit@warthog.procyon.org.uk/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-01io_uring: fix deadlock on iowq file slot allocPavel Begunkov
io_fixed_fd_install() can grab uring_lock in the slot allocation path when called from io-wq, and then call into io_install_fixed_file(), which will lock it again. Pull all locking out of io_install_fixed_file() into io_fixed_fd_install(). Fixes: 1339f24b336db ("io_uring: allow allocated fixed files for openat/openat2") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/64116172a9d0b85b85300346bb280f3657aafc26.1654087283.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-01fs/ntfs3: provide block_invalidate_folio to fix memory leakMikulas Patocka
The ntfs3 filesystem lacks the 'invalidate_folio' method and it causes memory leak. If you write to the filesystem and then unmount it, the cached written data are not freed and they are permanently leaked. Fixes: 7ba13abbd31e ("fs: Turn block_invalidatepage into block_invalidate_folio") Reported-by: José Luis Lara Carrascal <manualinux@yahoo.es> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: stable@vger.kernel.org # v5.18
2022-06-01cifs: fix potential deadlock in direct reclaimVincent Whitchurch
The srv_mutex is used during writeback so cifs should ensure that allocations done when that mutex is held are done with GFP_NOFS, to avoid having direct reclaim ending up waiting for the same mutex and causing a deadlock. This is detected by lockdep with the splat below: ====================================================== WARNING: possible circular locking dependency detected 5.18.0 #70 Not tainted ------------------------------------------------------ kswapd0/49 is trying to acquire lock: ffff8880195782e0 (&tcp_ses->srv_mutex){+.+.}-{3:3}, at: compound_send_recv but task is already holding lock: ffffffffa98e66c0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire kmem_cache_alloc_trace __request_module crypto_alg_mod_lookup crypto_alloc_tfm_node crypto_alloc_shash cifs_alloc_hash smb311_crypto_shash_allocate smb311_update_preauth_hash compound_send_recv cifs_send_recv SMB2_negotiate smb2_negotiate cifs_negotiate_protocol cifs_get_smb_ses cifs_mount cifs_smb3_do_mount smb3_get_tree vfs_get_tree path_mount __x64_sys_mount do_syscall_64 entry_SYSCALL_64_after_hwframe -> #0 (&tcp_ses->srv_mutex){+.+.}-{3:3}: __lock_acquire lock_acquire __mutex_lock mutex_lock_nested compound_send_recv cifs_send_recv SMB2_write smb2_sync_write cifs_write cifs_writepage_locked cifs_writepage shrink_page_list shrink_lruvec shrink_node balance_pgdat kswapd kthread ret_from_fork other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(&tcp_ses->srv_mutex); lock(fs_reclaim); lock(&tcp_ses->srv_mutex); *** DEADLOCK *** 1 lock held by kswapd0/49: #0: ffffffffa98e66c0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat stack backtrace: CPU: 2 PID: 49 Comm: kswapd0 Not tainted 5.18.0 #70 Call Trace: <TASK> dump_stack_lvl dump_stack print_circular_bug.cold check_noncircular __lock_acquire lock_acquire __mutex_lock mutex_lock_nested compound_send_recv cifs_send_recv SMB2_write smb2_sync_write cifs_write cifs_writepage_locked cifs_writepage shrink_page_list shrink_lruvec shrink_node balance_pgdat kswapd kthread ret_from_fork </TASK> Fix this by using the memalloc_nofs_save/restore APIs around the places where the srv_mutex is held. Do this in a wrapper function for the lock/unlock of the srv_mutex, and rename the srv_mutex to avoid missing call sites in the conversion. Note that there is another lockdep warning involving internal crypto locks, which was masked by this problem and is visible after this fix, see the discussion in this thread: https://lore.kernel.org/all/20220523123755.GA13668@axis.com/ Link: https://lore.kernel.org/r/CANT5p=rqcYfYMVHirqvdnnca4Mo+JQSw5Qu12v=kPfpk5yhhmg@mail.gmail.com/ Reported-by: Shyam Prasad N <nspmangalore@gmail.com> Suggested-by: Lars Persson <larper@axis.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-31Merge tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client updates from Anna Schumaker: "New Features: - Add support for 'dacl' and 'sacl' attributes Bugfixes and Cleanups: - Fixes for reporting mapping errors - Fixes for memory allocation errors - Improve warning message when locks are lost - Update documentation for the nfs4_unique_id parameter - Add an explanation of NFSv4 client identifiers - Ensure the i_size attribute is written to the fscache storage - Fix freeing uninitialized nfs4_labels - Better handling when xprtrdma bc_serv is NULL - Mark qualified async operations as MOVEABLE tasks" * tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4.1 mark qualified async operations as MOVEABLE tasks xprtrdma: treat all calls not a bcall when bc_serv is NULL NFSv4: Fix free of uninitialized nfs4_label on referral lookup. NFS: Pass i_size to fscache_unuse_cookie() when a file is released Documentation: Add an explanation of NFSv4 client identifiers NFS: update documentation for the nfs4_unique_id parameter NFS: Improve warning message when locks are lost. NFSv4.1: Enable access to the NFSv4.1 'dacl' and 'sacl' attributes NFSv4: Add encoders/decoders for the NFSv4.1 dacl and sacl attributes NFSv4: Specify the type of ACL to cache NFSv4: Don't hold the layoutget locks across multiple RPC calls pNFS/files: Fall back to I/O through the MDS on non-fatal layout errors NFS: Further fixes to the writeback error handling NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout NFS: Memory allocation failures are not server fatal errors NFS: Don't report errors from nfs_pageio_complete() more than once NFS: Do not report flush errors in nfs_write_end() NFS: Don't report ENOSPC write errors twice NFS: fsync() should report filesystem errors over EINTR/ERESTARTSYS NFS: Do not report EINTR/ERESTARTSYS as mapping errors
2022-05-31Merge tag 'f2fs-for-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've refactored the existing atomic write support implemented by in-memory operations to have storing data in disk temporarily, which can give us a benefit to accept more atomic writes. At the same time, we removed the existing volatile write support. We've also revisited the file pinning and GC flows and found some corner cases which contributeed abnormal system behaviours. As usual, there're several minor code refactoring for readability, sanity check, and clean ups. Enhancements: - allow compression for mmap files in compress_mode=user - kill volatile write support - change the current atomic write way - give priority to select unpinned section for foreground GC - introduce data read/write showing path info - remove unnecessary f2fs_lock_op in f2fs_new_inode Bug fixes: - fix the file pinning flow during checkpoint=disable and GCs - fix foreground and background GCs to select the right victims and get free sections on time - fix GC flags on defragmenting pages - avoid an infinite loop to flush node pages - fix fallocate to use file_modified to update permissions consistently" * tag 'f2fs-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits) f2fs: fix to tag gcing flag on page during file defragment f2fs: replace F2FS_I(inode) and sbi by the local variable f2fs: add f2fs_init_write_merge_io function f2fs: avoid unneeded error handling for revoke_entry_slab allocation f2fs: allow compression for mmap files in compress_mode=user f2fs: fix typo in comment f2fs: make f2fs_read_inline_data() more readable f2fs: fix to do sanity check for inline inode f2fs: fix fallocate to use file_modified to update permissions consistently f2fs: don't use casefolded comparison for "." and ".." f2fs: do not stop GC when requiring a free section f2fs: keep wait_ms if EAGAIN happens f2fs: introduce f2fs_gc_control to consolidate f2fs_gc parameters f2fs: reject test_dummy_encryption when !CONFIG_FS_ENCRYPTION f2fs: kill volatile write support f2fs: change the current atomic write way f2fs: don't need inode lock for system hidden quota f2fs: stop allocating pinned sections if EAGAIN happens f2fs: skip GC if possible when checkpoint disabling f2fs: give priority to select unpinned section for foreground GC ...
2022-05-31cifs: when extending a file with falloc we should make files not-sparseRonnie Sahlberg
as this is the only way to make sure the region is allocated. Fix the conditional that was wrong and only tried to make already non-sparse files non-sparse. Cc: stable@vger.kernel.org Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-31Merge tag 'riscv-for-linus-5.19-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the Svpbmt extension, which allows memory attributes to be encoded in pages - Support for the Allwinner D1's implementation of page-based memory attributes - Support for running rv32 binaries on rv64 systems, via the compat subsystem - Support for kexec_file() - Support for the new generic ticket-based spinlocks, which allows us to also move to qrwlock. These should have already gone in through the asm-geneic tree as well - A handful of cleanups and fixes, include some larger ones around atomics and XIP * tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits) RISC-V: Prepare dropping week attribute from arch_kexec_apply_relocations[_add] riscv: compat: Using seperated vdso_maps for compat_vdso_info RISC-V: Fix the XIP build RISC-V: Split out the XIP fixups into their own file RISC-V: ignore xipImage RISC-V: Avoid empty create_*_mapping definitions riscv: Don't output a bogus mmu-type on a no MMU kernel riscv: atomic: Add custom conditional atomic operation implementation riscv: atomic: Optimize dec_if_positive functions riscv: atomic: Cleanup unnecessary definition RISC-V: Load purgatory in kexec_file RISC-V: Add purgatory RISC-V: Support for kexec_file on panic RISC-V: Add kexec_file support RISC-V: use memcpy for kexec_file mode kexec_file: Fix kexec_file.c build error for riscv platform riscv: compat: Add COMPAT Kbuild skeletal support riscv: compat: ptrace: Add compat_arch_ptrace implement riscv: compat: signal: Add rt_frame implementation riscv: add memory-type errata for T-Head ...
2022-05-31NFSv4.1 mark qualified async operations as MOVEABLE tasksOlga Kornievskaia
Mark async operations such as RENAME, REMOVE, COMMIT MOVEABLE for the nfsv4.1+ sessions. Fixes: 85e39feead948 ("NFSv4.1 identify and mark RPC tasks that can move between transports") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>