aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2015-11-13Merge branch 'for-linus-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr cleanups from Al Viro. * 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: f2fs: xattr simplifications squashfs: xattr simplifications 9p: xattr simplifications xattr handlers: Pass handler to operations instead of flags jffs2: Add missing capability check for listing trusted xattrs hfsplus: Remove unused xattr handler list operations ubifs: Remove unused security xattr handler vfs: Fix the posix_acl_xattr_list return value vfs: Check attribute names in posix acl xattr handers
2015-11-13Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: - three fixes tagged for -stable including a crash fix, simple performance tweak, and an invalid i/o error. - build regression fix for the nvdimm unit tests - nvdimm documentation update * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: fix __dax_pmd_fault crash libnvdimm: documentation clarifications libnvdimm, pmem: fix size trim in pmem_direct_access() libnvdimm, e820: fix numa node for e820-type-12 pmem ranges tools/testing/nvdimm, acpica: fix flag rename build breakage
2015-11-13f2fs: xattr simplificationsAndreas Gruenbacher
Now that the xattr handler is passed to the xattr handler operations, we have access to the attribute name prefix, so simplify f2fs_xattr_generic_list. Also, f2fs_xattr_advise_list is only ever called for f2fs_xattr_advise_handler; there is no need to double check for that. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Changman Lee <cm224.lee@samsung.com> Cc: Chao Yu <chao2.yu@samsung.com> Cc: linux-f2fs-devel@lists.sourceforge.net Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13squashfs: xattr simplificationsAndreas Gruenbacher
Now that the xattr handler is passed to the xattr handler operations, we have access to the attribute name prefix, so simplify the squashfs xattr handlers a bit. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-139p: xattr simplificationsAndreas Gruenbacher
Now that the xattr handler is passed to the xattr handler operations, we can use the same get and set operations for the user, trusted, and security xattr namespaces. In those namespaces, we can access the full attribute name by "reattaching" the name prefix the vfs has skipped for us. Add a xattr_full_name helper to make this obvious in the code. For the "system.posix_acl_access" and "system.posix_acl_default" attributes, handler->prefix is the full attribute name; the suffix is the empty string. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: v9fs-developer@lists.sourceforge.net Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13xattr handlers: Pass handler to operations instead of flagsAndreas Gruenbacher
The xattr_handler operations are currently all passed a file system specific flags value which the operations can use to disambiguate between different handlers; some file systems use that to distinguish the xattr namespace, for example. In some oprations, it would be useful to also have access to the handler prefix. To allow that, pass a pointer to the handler to operations instead of the flags value alone. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13jffs2: Add missing capability check for listing trusted xattrsAndreas Gruenbacher
The vfs checks if a task has the appropriate access for get and set operations, but it cannot do that for the list operation; the file system must check for that itself. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: linux-mtd@lists.infradead.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13hfsplus: Remove unused xattr handler list operationsAndreas Gruenbacher
The list operations can never be called; they are even documented to be unused. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13ubifs: Remove unused security xattr handlerAndreas Gruenbacher
Ubifs installs a security xattr handler in sb->s_xattr but doesn't use the generic_{get,set,list,remove}xattr inode operations needed for processing this list of attribute handlers; the handler is never called. Instead, ubifs uses its own xattr handlers which also process security xattrs. Remove the dead code. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Richard Weinberger <richard@nod.at> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: linux-mtd@lists.infradead.org Cc: Subodh Nijsure <snijsure@grid-net.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13vfs: Fix the posix_acl_xattr_list return valueAndreas Gruenbacher
When a filesystem that contains POSIX ACLs is mounted without ACL support (-o noacl), the appropriate behavior is not to list any existing POSIX ACL xattrs. The return value for list xattr handlers in this case is 0, not an error code: several filesystems that use the POSIX ACL xattr handlers do not expect the list operation to fail. Symlinks cannot have ACLs, so posix_acl_xattr_list will never be called for symlinks in the first place. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13vfs: Check attribute names in posix acl xattr handersAndreas Gruenbacher
The get and set operations of the POSIX ACL xattr handlers failed to check the attribute names, so all names with "system.posix_acl_access" or "system.posix_acl_default" as a prefix were accepted. Reject invalid names from now on. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull SMB3 updates from Steve French: "A collection of SMB3 patches adding some reliability features (persistent and resilient handles) and improving SMB3 copy offload. I will have some additional patches for SMB3 encryption and SMB3.1.1 signing (important security features), and also for improving SMB3 persistent handle reconnection (setting ChannelSequence number e.g.) that I am still working on but wanted to get this set in since they can stand alone" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: Allow copy offload (CopyChunk) across shares Add resilienthandles mount parm [SMB3] Send durable handle v2 contexts when use of persistent handles required [SMB3] Display persistenthandles in /proc/mounts for SMB3 shares if enabled [SMB3] Enable checking for continuous availability and persistent handle support [SMB3] Add parsing for new mount option controlling persistent handles Allow duplicate extents in SMB3 not just SMB3.1.1
2015-11-13Merge branch 'for-linus-4.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes and cleanups from Chris Mason: "Some of this got cherry-picked from a github repo this week, but I verified the patches. We have three small scrub cleanups and a collection of fixes" * 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: Use fs_info directly in btrfs_delete_unused_bgs btrfs: Fix lost-data-profile caused by balance bg btrfs: Fix lost-data-profile caused by auto removing bg btrfs: Remove len argument from scrub_find_csum btrfs: Reduce unnecessary arguments in scrub_recheck_block btrfs: Use scrub_checksum_data and scrub_checksum_tree_block for scrub_recheck_block_checksum btrfs: Reset sblock->xxx_error stats before calling scrub_recheck_block_checksum btrfs: scrub: setup all fields for sblock_to_check btrfs: scrub: set error stats when tree block spanning stripes Btrfs: fix race when listing an inode's xattrs Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow Btrfs: fix race leading to incorrect item deletion when dropping extents Btrfs: fix sleeping inside atomic context in qgroup rescan worker Btrfs: fix race waiting for qgroup rescan worker btrfs: qgroup: exit the rescan worker during umount Btrfs: fix extent accounting for partial direct IO writes
2015-11-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There are several patches from Ilya fixing RBD allocation lifecycle issues, a series adding a nocephx_sign_messages option (and associated bug fixes/cleanups), several patches from Zheng improving the (directory) fsync behavior, a big improvement in IO for direct-io requests when striping is enabled from Caifeng, and several other small fixes and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: clear msg->con in ceph_msg_release() only libceph: add nocephx_sign_messages option libceph: stop duplicating client fields in messenger libceph: drop authorizer check from cephx msg signing routines libceph: msg signing callouts don't need con argument libceph: evaluate osd_req_op_data() arguments only once ceph: make fsync() wait unsafe requests that created/modified inode ceph: add request to i_unsafe_dirops when getting unsafe reply libceph: introduce ceph_x_authorizer_cleanup() ceph: don't invalidate page cache when inode is no longer used rbd: remove duplicate calls to rbd_dev_mapping_clear() rbd: set device_type::release instead of device::release rbd: don't free rbd_dev outside of the release callback rbd: return -ENOMEM instead of pool id if rbd_dev_create() fails libceph: use local variable cursor instead of &msg->cursor libceph: remove con argument in handle_reply() ceph: combine as many iovec as possile into one OSD request ceph: fix message length computation ceph: fix a comment typo rbd: drop null test before destroy functions
2015-11-12dax: fix __dax_pmd_fault crashDan Williams
Since 4.3 introduced devm_memremap_pages() the pfns handled by DAX may optionally have a struct page backing. When a mapped pfn reaches vmf_insert_pfn_pmd() it fails with a crash signature like the following: kernel BUG at mm/huge_memory.c:905! [..] Call Trace: [<ffffffff812a73ba>] __dax_pmd_fault+0x2ea/0x5b0 [<ffffffffa01a4182>] xfs_filemap_pmd_fault+0x92/0x150 [xfs] [<ffffffff811fbe02>] handle_mm_fault+0x312/0x1b50 Fix this by falling back to 4K mappings in the pfn_valid() case. Longer term, vmf_insert_pfn_pmd() needs to grow support for architectures that can provide a 'pmd_special' capability. Cc: <stable@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-11-12Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull misc block fixes from Jens Axboe: "Stuff that got collected after the merge window opened. This contains: - NVMe: - Fix for non-striped transfer size setting for NVMe from Sathyavathi. - (Some) support for the weird Apple nvme controller in the macbooks. From Stephan Günther. - The error value leak for dax from Al. - A few minor blk-mq tweaks from me. - Add the new linux-block@vger.kernel.org mailing list to the MAINTAINERS file. - Discard fix for brd, from Jan. - A kerneldoc warning for block core from Randy. - An older fix from Vivek, converting a WARN_ON() to a rate limited printk when a device is hot removed with dirty inodes" * 'for-linus' of git://git.kernel.dk/linux-block: block: don't hardcode blk_qc_t -> tag mask dax_io(): don't let non-error value escape via retval instead of EFAULT block: fix blk-core.c kernel-doc warning fs/block_dev.c: Remove WARN_ON() when inode writeback fails NVMe: add support for Apple NVMe controller NVMe: use split lo_hi_{read,write}q blk-mq: mark __blk_mq_complete_request() static MAINTAINERS: add reference to new linux-block list NVMe: Increase the max transfer size when mdts is 0 brd: Refuse improperly aligned discard requests
2015-11-11Merge tag 'xfs-for-linus-4.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs Pull xfs updates from Dave Chinner: "There is nothing really major here - the only significant addition is the per-mount operation statistics infrastructure. Otherwises there's various ACL, xattr, DAX, AIO and logging fixes, and a smattering of small cleanups and fixes elsewhere. Summary: - per-mount operational statistics in sysfs - fixes for concurrent aio append write submission - various logging fixes - detection of zeroed logs and invalid log sequence numbers on v5 filesystems - memory allocation failure message improvements - a bunch of xattr/ACL fixes - fdatasync optimisation - miscellaneous other fixes and cleanups" * tag 'xfs-for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (39 commits) xfs: give all workqueues rescuer threads xfs: fix log recovery op header validation assert xfs: Fix error path in xfs_get_acl xfs: optimise away log forces on timestamp updates for fdatasync xfs: don't leak uuid table on rmmod xfs: invalidate cached acl if set via ioctl xfs: Plug memory leak in xfs_attrmulti_attr_set xfs: Validate the length of on-disk ACLs xfs: invalidate cached acl if set directly via xattr xfs: xfs_filemap_pmd_fault treats read faults as write faults xfs: add ->pfn_mkwrite support for DAX xfs: DAX does not use IO completion callbacks xfs: Don't use unwritten extents for DAX xfs: introduce BMAPI_ZERO for allocating zeroed extents xfs: fix inode size update overflow in xfs_map_direct() xfs: clear PF_NOFREEZE for xfsaild kthread xfs: fix an error code in xfs_fs_fill_super() xfs: stats are no longer dependent on CONFIG_PROC_FS xfs: simplify /proc teardown & error handling xfs: per-filesystem stats counter implementation ...
2015-11-11Merge tag 'nfsd-4.4' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Apologies for coming a little late in the merge window. Fortunately this is another fairly quiet one: Mainly smaller bugfixes and cleanup. We're still finding some bugs from the breakup of the big NFSv4 state lock in 3.17 -- thanks especially to Andrew Elble and Jeff Layton for tracking down some of the remaining races" * tag 'nfsd-4.4' of git://linux-nfs.org/~bfields/linux: svcrpc: document lack of some memory barriers nfsd: fix race with open / open upgrade stateids nfsd: eliminate sending duplicate and repeated delegations nfsd: remove recurring workqueue job to clean DRC SUNRPC: drop stale comment in svc_setup_socket() nfsd: ensure that seqid morphing operations are atomic wrt to copies nfsd: serialize layout stateid morphing operations nfsd: improve client_has_state to check for unused openowners nfsd: fix clid_inuse on mount with security change sunrpc/cache: make cache flushing more reliable. nfsd: move include of state.h from trace.c to trace.h sunrpc: avoid warning in gss_key_timeout lockd: get rid of reference-counted NSM RPC clients SUNRPC: Use MSG_SENDPAGE_NOTLAST when calling sendpage() lockd: create NSM handles per net namespace nfsd: switch unsigned char flags in svc_fh to bools nfsd: move svc_fh->fh_maxsize to just after fh_handle nfsd: drop null test before destroy functions nfsd: serialize state seqid morphing operations
2015-11-11Merge branch 'for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: - misc stable fixes - trivial kernel-doc and comment fixups - remove never-used block_page_mkwrite() wrapper function, and rename the function that is _actually_ used to not have double underscores. * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: 9p: cache.h: Add #define of include guard vfs: remove stale comment in inode_operations vfs: remove unused wrapper block_page_mkwrite() binfmt_elf: Correct `arch_check_elf's description fs: fix writeback.c kernel-doc warnings fs: fix inode.c kernel-doc warning fs/pipe.c: return error code rather than 0 in pipe_write() fs/pipe.c: preserve alloc_file() error code binfmt_elf: Don't clobber passed executable's file header FS-Cache: Handle a write to the page immediately beyond the EOF marker cachefiles: perform test on s_blocksize when opening cache file. FS-Cache: Don't override netfs's primary_index if registering failed FS-Cache: Increase reference of parent after registering, netfs success debugfs: fix refcount imbalance in start_creating
2015-11-11dax_io(): don't let non-error value escape via retval instead of EFAULTAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-11fs/block_dev.c: Remove WARN_ON() when inode writeback failsVivek Goyal
If a block device is hot removed and later last reference to device is put, we try to writeback the dirty inode. But device is gone and that writeback fails. Currently we do a WARN_ON() which does not seem to be the right thing. Convert it to a ratelimited kernel warning. Reported-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> [jmoyer@redhat.com: get rid of unnecessary name initialization, 80 cols] Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-11fs: 9p: cache.h: Add #define of include guardTzvetelin Katchov
The include file was intended to have an include guard, but the #define part is missing. Signed-off-by: Tzvetelin Katchov <katchov@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11vfs: remove unused wrapper block_page_mkwrite()Ross Zwisler
The function currently called "__block_page_mkwrite()" used to be called "block_page_mkwrite()" until a wrapper for this function was added by: commit 24da4fab5a61 ("vfs: Create __block_page_mkwrite() helper passing error values back") This wrapper, the current "block_page_mkwrite()", is currently unused. __block_page_mkwrite() is used directly by ext4, nilfs2 and xfs. Remove the unused wrapper, rename __block_page_mkwrite() back to block_page_mkwrite() and update the comment above block_page_mkwrite(). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.com> Cc: Jan Kara <jack@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11binfmt_elf: Correct `arch_check_elf's descriptionMaciej W. Rozycki
Correct `arch_check_elf's description, mistakenly copied and pasted from `arch_elf_pt_proc'. Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11fs: fix writeback.c kernel-doc warningsRandy Dunlap
Fix kernel-doc warnings in fs/fs-writeback.c by moving a #define macro to after the function's opening brace. Also #undef this macro at the end of the function. ..//fs/fs-writeback.c:1984: warning: Excess function parameter 'inode' description in 'I_DIRTY_INODE' ..//fs/fs-writeback.c:1984: warning: Excess function parameter 'flags' description in 'I_DIRTY_INODE' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11fs: fix inode.c kernel-doc warningRandy Dunlap
Fix kernel-doc warning in fs/inode.c: ..//fs/inode.c:1606: warning: No description found for parameter 'inode' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11fs/pipe.c: return error code rather than 0 in pipe_write()Eric Biggers
pipe_write() would return 0 if it failed to merge the beginning of the data to write with the last, partially filled pipe buffer. It should return an error code instead. Userspace programs could be confused by write() returning 0 when called with a nonzero 'count'. The EFAULT error case was a regression from f0d1bec9d5 ("new helper: copy_page_from_iter()"), while the ops->confirm() error case was a much older bug. Test program: #include <assert.h> #include <errno.h> #include <unistd.h> int main(void) { int fd[2]; char data[1] = {0}; assert(0 == pipe(fd)); assert(1 == write(fd[1], data, 1)); /* prior to this patch, write() returned 0 here */ assert(-1 == write(fd[1], NULL, 1)); assert(errno == EFAULT); } Cc: stable@vger.kernel.org # at least v3.15+ Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11fs/pipe.c: preserve alloc_file() error codeEric Biggers
If sys_pipe() was unable to allocate a 'struct file', it always failed with ENFILE, which means "The number of simultaneously open files in the system would exceed a system-imposed limit." However, alloc_file() actually returns an ERR_PTR value and might fail with other error codes. Currently, in addition to ENFILE, it can fail with ENOMEM, potentially when there are few open files in the system. Update sys_pipe() to preserve this error code. In a prior submission of a similar patch (1) some concern was raised about introducing a new error code for sys_pipe(). However, for most system calls, programs cannot assume that new error codes will never be introduced. In addition, ENOMEM was, in fact, already a possible error code for sys_pipe(), in the case where the file descriptor table could not be expanded due to insufficient memory. (1) http://comments.gmane.org/gmane.linux.kernel/1357942 Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11binfmt_elf: Don't clobber passed executable's file headerMaciej W. Rozycki
Do not clobber the buffer space passed from `search_binary_handler' and originally preloaded by `prepare_binprm' with the executable's file header by overwriting it with its interpreter's file header. Instead keep the buffer space intact and directly use the data structure locally allocated for the interpreter's file header, fixing a bug introduced in 2.1.14 with loadable module support (linux-mips.org commit beb11695 [Import of Linux/MIPS 2.1.14], predating kernel.org repo's history). Adjust the amount of data read from the interpreter's file accordingly. This was not an issue before loadable module support, because back then `load_elf_binary' was executed only once for a given ELF executable, whether the function succeeded or failed. With loadable module support supported and enabled, upon a failure of `load_elf_binary' -- which may for example be caused by architecture code rejecting an executable due to a missing hardware feature requested in the file header -- a module load is attempted and then the function reexecuted by `search_binary_handler'. With the executable's file header replaced with its interpreter's file header the executable can then be erroneously accepted in this subsequent attempt. Cc: stable@vger.kernel.org # all the way back Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11FS-Cache: Handle a write to the page immediately beyond the EOF markerDavid Howells
Handle a write being requested to the page immediately beyond the EOF marker on a cache object. Currently this gets an assertion failure in CacheFiles because the EOF marker is used there to encode information about a partial page at the EOF - which could lead to an unknown blank spot in the file if we extend the file over it. The problem is actually in fscache where we check the index of the page being written against store_limit. store_limit is set to the number of pages that we're allowed to store by fscache_set_store_limit() - which means it's one more than the index of the last page we're allowed to store. The problem is that we permit writing to a page with an index _equal_ to the store limit - when we should reject that case. Whilst we're at it, change the triggered assertion in CacheFiles to just return -ENOBUFS instead. The assertion failure looks something like this: CacheFiles: Assertion failed 1000 < 7b1 is false ------------[ cut here ]------------ kernel BUG at fs/cachefiles/rdwr.c:962! ... RIP: 0010:[<ffffffffa02c9e83>] [<ffffffffa02c9e83>] cachefiles_write_page+0x273/0x2d0 [cachefiles] Cc: stable@vger.kernel.org # v2.6.31+; earlier - that + backport of a17754f (at least) Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11cachefiles: perform test on s_blocksize when opening cache file.NeilBrown
cachefiles requires that s_blocksize in the cache is not greater than PAGE_SIZE, and performs the check every time a block is accessed. Move the test to the place where the file is "opened", where other file-validity tests are performed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11FS-Cache: Don't override netfs's primary_index if registering failedKinglong Mee
Only override netfs->primary_index when registering success. Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11FS-Cache: Increase reference of parent after registering, netfs successKinglong Mee
If netfs exist, fscache should not increase the reference of parent's usage and n_children, otherwise, never be decreased. v2: thanks David's suggest, move increasing reference of parent if success use kmem_cache_free() freeing primary_index directly v3: don't move "netfs->primary_index->parent = &fscache_fsdef_index;" Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-11debugfs: fix refcount imbalance in start_creatingDaniel Borkmann
In debugfs' start_creating(), we pin the file system to safely access its root. When we failed to create a file, we unpin the file system via failed_creating() to release the mount count and eventually the reference of the vfsmount. However, when we run into an error during lookup_one_len() when still in start_creating(), we only release the parent's mutex but not so the reference on the mount. Looks like it was done in the past, but after splitting portions of __create_file() into start_creating() and end_creating() via 190afd81e4a5 ("debugfs: split the beginning and the end of __create_file() off"), this seemed missed. Noticed during code review. Fixes: 190afd81e4a5 ("debugfs: split the beginning and the end of __create_file() off") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-10btrfs: Use fs_info directly in btrfs_delete_unused_bgsZhao Lei
No need to use root->fs_info in btrfs_delete_unused_bgs(), use fs_info directly instead. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: Fix lost-data-profile caused by balance bgZhao Lei
Reproduce: (In integration-4.3 branch) TEST_DEV=(/dev/vdg /dev/vdh) TEST_DIR=/mnt/tmp umount "$TEST_DEV" >/dev/null mkfs.btrfs -f -d raid1 "${TEST_DEV[@]}" mount -o nospace_cache "$TEST_DEV" "$TEST_DIR" btrfs balance start -dusage=0 $TEST_DIR btrfs filesystem usage $TEST_DIR dd if=/dev/zero of="$TEST_DIR"/file count=100 btrfs filesystem usage $TEST_DIR Result: We can see "no data chunk" in first "btrfs filesystem usage": # btrfs filesystem usage $TEST_DIR Overall: ... Metadata,single: Size:8.00MiB, Used:0.00B /dev/vdg 8.00MiB Metadata,RAID1: Size:122.88MiB, Used:112.00KiB /dev/vdg 122.88MiB /dev/vdh 122.88MiB System,single: Size:4.00MiB, Used:0.00B /dev/vdg 4.00MiB System,RAID1: Size:8.00MiB, Used:16.00KiB /dev/vdg 8.00MiB /dev/vdh 8.00MiB Unallocated: /dev/vdg 1.06GiB /dev/vdh 1.07GiB And "data chunks changed from raid1 to single" in second "btrfs filesystem usage": # btrfs filesystem usage $TEST_DIR Overall: ... Data,single: Size:256.00MiB, Used:0.00B /dev/vdh 256.00MiB Metadata,single: Size:8.00MiB, Used:0.00B /dev/vdg 8.00MiB Metadata,RAID1: Size:122.88MiB, Used:112.00KiB /dev/vdg 122.88MiB /dev/vdh 122.88MiB System,single: Size:4.00MiB, Used:0.00B /dev/vdg 4.00MiB System,RAID1: Size:8.00MiB, Used:16.00KiB /dev/vdg 8.00MiB /dev/vdh 8.00MiB Unallocated: /dev/vdg 1.06GiB /dev/vdh 841.92MiB Reason: btrfs balance delete last data chunk in case of no data in the filesystem, then we can see "no data chunk" by "fi usage" command. And when we do write operation to fs, the only available data profile is 0x0, result is all new chunks are allocated single type. Fix: Allocate a data chunk explicitly to ensure we don't lose the raid profile for data. Test: Test by above script, and confirmed the logic by debug output. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: Fix lost-data-profile caused by auto removing bgZhao Lei
Reproduce: (In integration-4.3 branch) TEST_DEV=(/dev/vdg /dev/vdh) TEST_DIR=/mnt/tmp umount "$TEST_DEV" >/dev/null mkfs.btrfs -f -d raid1 "${TEST_DEV[@]}" mount -o nospace_cache "$TEST_DEV" "$TEST_DIR" umount "$TEST_DEV" mount -o nospace_cache "$TEST_DEV" "$TEST_DIR" btrfs filesystem usage $TEST_DIR We can see the data chunk changed from raid1 to single: # btrfs filesystem usage $TEST_DIR Data,single: Size:8.00MiB, Used:0.00B /dev/vdg 8.00MiB # Reason: When a empty filesystem mount with -o nospace_cache, the last data blockgroup will be auto-removed in umount. Then if we mount it again, there is no data chunk in the filesystem, so the only available data profile is 0x0, result is all new chunks are created as single type. Fix: Don't auto-delete last blockgroup for a raid type. Test: Test by above script, and confirmed the logic by debug output. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: Remove len argument from scrub_find_csumZhao Lei
It is useless. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: Reduce unnecessary arguments in scrub_recheck_blockZhao Lei
We don't need pass so many arguments for recheck sblock now, this patch cleans them. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: Use scrub_checksum_data and scrub_checksum_tree_block for ↵Zhao Lei
scrub_recheck_block_checksum We can use existing scrub_checksum_data() and scrub_checksum_tree_block() for scrub_recheck_block_checksum(), instead of write duplicated code. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: Reset sblock->xxx_error stats before calling scrub_recheck_block_checksumZhao Lei
We should reset sblock->xxx_error stats before calling scrub_recheck_block_checksum(). Current code run correctly because all sblock are allocated by k[cz]alloc(), and the error stats are not got changed. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: scrub: setup all fields for sblock_to_checkZhao Lei
scrub_setup_recheck_block() isn't setup all necessary fields for sblock_to_check because history reason. So current code need more arguments in severial functions, and more local variables, just to passing these lacked values to necessary place. This patch setup above fields to sblock_to_check in scrub_setup_recheck_block(), for: 1: more cleanup for function arg, local variable 2: to make sblock_to_check complete, then we can use sblock_to_check without concern about some uninitialized member. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10btrfs: scrub: set error stats when tree block spanning stripesZhao Lei
It is better to show error stats to user when we found tree block spanning stripes. On a btrfs created by old version of btrfs-convert: Before patch: # btrfs scrub start -B /dev/vdh scrub done for 8b342d35-2904-41ab-b3cb-2f929709cf47 scrub started at Tue Aug 25 21:19:09 2015 and finished after 00:00:00 total bytes scrubbed: 53.54MiB with 0 errors # dmesg ... [ 128.711434] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27000832 [ 128.712744] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27066368 ... After patch: # btrfs scrub start -B /dev/vdh scrub done for ff7f844b-7a4e-4b1a-88a9-8252ab25be1b scrub started at Tue Aug 25 21:42:29 2015 and finished after 00:00:00 total bytes scrubbed: 53.60MiB with 2 errors error details: corrected errors: 0, uncorrectable errors: 2, unverified errors: 0 ERROR: There are uncorrectable errors. # dmesg ...omit... # Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-11-10Merge branch 'for-4.4/io-poll' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block IO poll support from Jens Axboe: "Various groups have been doing experimentation around IO polling for (really) fast devices. The code has been reviewed and has been sitting on the side for a few releases, but this is now good enough for coordinated benchmarking and further experimentation. Currently O_DIRECT sync read/write are supported. A framework is in the works that allows scalable stats tracking so we can auto-tune this. And we'll add libaio support as well soon. Fow now, it's an opt-in feature for test purposes" * 'for-4.4/io-poll' of git://git.kernel.dk/linux-block: direct-io: be sure to assign dio->bio_bdev for both paths directio: add block polling support NVMe: add blk polling support block: add block polling support blk-mq: return tag/queue combo in the make_request_fn handlers block: change ->make_request_fn() and users to return a queue cookie
2015-11-10Merge tag 'upstream-4.4-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds
Pull UBI/UBIFS updates from Richard Weinberger: - access time support for UBIFS by Dongsheng Yang - random cleanups and bug fixes all over the place * tag 'upstream-4.4-rc1' of git://git.infradead.org/linux-ubifs: ubifs: introduce UBIFS_ATIME_SUPPORT to ubifs ubifs: make ubifs_[get|set]xattr atomic UBIFS: Delete unnecessary checks before the function call "iput" UBI: Remove in vain semicolon UBI: Fastmap: Fix PEB array type UBIFS: Fix possible memory leak in ubifs_readdir() fs/ubifs: remove unnecessary new_valid_dev check ubi: fastmap: Implement produce_free_peb() UBIFS: print verbose message when rescanning a corrupted node UBIFS: call dbg_is_power_cut() instead of reading c->dbg->pc_happened UBI: drop null test before destroy functions UBI: Update comments to reflect UBI_METAONLY flag UBI: Fix debug message UBI: Fix typo in comment UBI: Fastmap: Simplify expression UBIFS: fix a typo in comment of ubifs_budget_req UBIFS: use kmemdup rather than duplicating its implementation
2015-11-10Merge tag 'libnvdimm-for-4.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "Outside of the new ACPI-NFIT hot-add support this pull request is more notable for what it does not contain, than what it does. There were a handful of development topics this cycle, dax get_user_pages, dax fsync, and raw block dax, that need more more iteration and will wait for 4.5. The patches to make devm and the pmem driver NUMA aware have been in -next for several weeks. The hot-add support has not, but is contained to the NFIT driver and is passing unit tests. The coredump support is straightforward and was looked over by Jeff. All of it has received a 0day build success notification across 107 configs. Summary: - Add support for the ACPI 6.0 NFIT hot add mechanism to process updates of the NFIT at runtime. - Teach the coredump implementation how to filter out DAX mappings. - Introduce NUMA hints for allocations made by the pmem driver, and as a side effect all devm allocations now hint their NUMA node by default" * tag 'libnvdimm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: coredump: add DAX filtering for FDPIC ELF coredumps coredump: add DAX filtering for ELF coredumps acpi: nfit: Add support for hot-add nfit: in acpi_nfit_init, break on a 0-length table pmem, memremap: convert to numa aware allocations devm_memremap_pages: use numa_mem_id devm: make allocations numa aware by default devm_memremap: convert to return ERR_PTR devm_memunmap: use devres_release() pmem: kill memremap_pmem() x86, mm: quiet arch_add_memory()
2015-11-10Merge branch 'i2c/for-4.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: - New drivers: UniPhier (with and without FIFO) - some drivers got some bigger rework: ismt, designware, img-scb (rcar had to be reverted because issues were showing up just lately) - ACPI: reworked the device scanning and added support for muxes ... and quite a lot of driver bugfixes and cleanups this time. All files touched outside of the i2c realm have proper acks. * 'i2c/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (70 commits) i2c: rcar: Revert the latest refactoring series i2c: pnx: remove superfluous assignment MAINTAINERS: i2c: drop i2c-pnx maintainer MAINTAINERS: i2c: mark also subdirectories as maintained i2c: cadence: enable driver for ARM64 i2c: i801: Document Intel DNV and Broxton i2c: at91: manage unexpected RXRDY flag when starting a transfer i2c: pnx: Use setup_timer instead of open coding it i2c: add ACPI support for I2C mux ports acpi: add acpi_preset_companion() stub i2c: pxa: Add support for pxa910/988 & new configuration features i2c: au1550: Convert to devm_kzalloc and devm_ioremap_resource i2c-dev: Fix I2C_SLAVE ioctl comment i2c-dev: Fix typo in ioctl name reference i2c: sirf: tune the divider to make i2c bus freq more accurate i2c: imx: Use -ENXIO as error in the NACK case i2c: i801: Add support for Intel Broxton i2c: i801: Add support for Intel DNV i2c: mediatek: add i2c resume support i2c: imx: implement bus recovery ...
2015-11-10direct-io: be sure to assign dio->bio_bdev for both pathsJens Axboe
btrfs sets ->submit_io(), and we failed to set the block dev for that path. That resulted in a potential NULL dereference when we later wait for IO in dio_await_one(). Reported-by: kernel test robot <ying.huang@linux.intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-10nfsd: fix race with open / open upgrade stateidsAndrew Elble
We observed multiple open stateids on the server for files that seemingly should have been closed. nfsd4_process_open2() tests for the existence of a preexisting stateid. If one is not found, the locks are dropped and a new one is created. The problem is that init_open_stateid(), which is also responsible for hashing the newly initialized stateid, doesn't check to see if another open has raced in and created a matching stateid. This fix is to enable init_open_stateid() to return the matching stateid and have nfsd4_process_open2() swap to that stateid and switch to the open upgrade path. In testing this patch, coverage to the newly created path indicates that the race was indeed happening. Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-11-10nfsd: eliminate sending duplicate and repeated delegationsAndrew Elble
We've observed the nfsd server in a state where there are multiple delegations on the same nfs4_file for the same client. The nfs client does attempt to DELEGRETURN these when they are presented to it - but apparently under some (unknown) circumstances the client does not manage to return all of them. This leads to the eventual attempt to CB_RECALL more than one delegation with the same nfs filehandle to the same client. The first recall will succeed, but the next recall will fail with NFS4ERR_BADHANDLE. This leads to the server having delegations on cl_revoked that the client has no way to FREE or DELEGRETURN, with resulting inability to recover. The state manager on the server will continually assert SEQ4_STATUS_RECALLABLE_STATE_REVOKED, and the state manager on the client will be looping unable to satisfy the server. List discussion also reports a race between OPEN and DELEGRETURN that will be avoided by only sending the delegation once to the client. This is also logically in accordance with RFC5561 9.1.1 and 10.2. So, let's: 1.) Not hand out duplicate delegations. 2.) Only send them to the client once. RFC 5561: 9.1.1: "Delegations and layouts, on the other hand, are not associated with a specific owner but are associated with the client as a whole (identified by a client ID)." 10.2: "...the stateid for a delegation is associated with a client ID and may be used on behalf of all the open-owners for the given client. A delegation is made to the client as a whole and not to any specific process or thread of control within it." Reported-by: Eric Meddaugh <etmsys@rit.edu> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Andrew Elble <aweits@rit.edu> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>