aboutsummaryrefslogtreecommitdiff
path: root/fs/debugfs/inode.c
AgeCommit message (Collapse)Author
2022-09-24debugfs: Only clobber mode/uid/gid on remount if askedBrian Norris
Users may have explicitly configured their debugfs permissions; we shouldn't overwrite those just because a second mount appeared. Only clobber if the options were provided at mount time. Existing behavior: ## Pre-existing status: debugfs is 0755. # chmod 755 /sys/kernel/debug/ # stat -c '%A' /sys/kernel/debug/ drwxr-xr-x ## New mount sets kernel-default permissions: # mount -t debugfs none /mnt/foo # stat -c '%A' /mnt/foo drwx------ ## Unexpected: the original mount changed permissions: # stat -c '%A' /sys/kernel/debug drwx------ New behavior: ## Pre-existing status: debugfs is 0755. # chmod 755 /sys/kernel/debug/ # stat -c '%A' /sys/kernel/debug/ drwxr-xr-x ## New mount inherits existing permissions: # mount -t debugfs none /mnt/foo # stat -c '%A' /mnt/foo drwxr-xr-x ## Expected: old mount is unchanged: # stat -c '%A' /sys/kernel/debug drwxr-xr-x Full test cases are being submitted to LTP. Signed-off-by: Brian Norris <briannorris@chromium.org> Link: https://lore.kernel.org/r/20220912163042.v3.1.Icbd40fce59f55ad74b80e5d435ea233579348a78@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-05debugfs: add debugfs_lookup_and_remove()Greg Kroah-Hartman
There is a very common pattern of using debugfs_remove(debufs_lookup(..)) which results in a dentry leak of the dentry that was looked up. Instead of having to open-code the correct pattern of calling dput() on the dentry, create debugfs_lookup_and_remove() to handle this pattern automatically and properly without any memory leaks. Cc: stable <stable@kernel.org> Reported-by: Kuyo Chang <kuyo.chang@mediatek.com> Tested-by: Kuyo Chang <kuyo.chang@mediatek.com> Link: https://lore.kernel.org/r/YxIaQ8cSinDR881k@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-25debugfs: Document that debugfs_create functions need not be error checkedDouglas Anderson
As talked about in commit b792e64021ec ("drm: no need to check return value of debugfs_create functions"), in many cases we can get away with totally skipping checking the errors of debugfs functions. Let's document that so people don't add new code that needlessly checks these errors. Probably this note could be added to a boatload of functions, but that's a lot of duplication. Let's just add it to the two most frequent ones and hope people will get the idea. Suggested-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20220222154555.1.I26d364db7a007f8995e8f0dac978673bc8e9f5e2@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-21debugfs: debugfs_create_file_size(): use IS_ERR to check for errorNirmoy Das
debugfs_create_file() returns encoded error so use IS_ERR for checking return value. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Fixes: ff9fb72bc077 ("debugfs: return error values, not NULL") Cc: stable <stable@vger.kernel.org> References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686 Link: https://lore.kernel.org/r/20210902102917.2233-1-nirmoy.das@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-18debugfs: fix security_locked_down() call for SELinuxOndrej Mosnacek
When (ia->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) is zero, then the SELinux implementation of the locked_down hook might report a denial even though the operation would actually be allowed. To fix this, make sure that security_locked_down() is called only when the return value will be taken into account (i.e. when changing one of the problematic attributes). Note: this was introduced by commit 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down"), but it didn't matter at that time, as the SELinux support came in later. Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") Cc: stable <stable@vger.kernel.org> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Link: https://lore.kernel.org/r/20210507125304.144394-1-omosnace@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-09debugfs: Make debugfs_allow RO after initKees Cook
Since debugfs_allow is only set at boot time during __init, make it read-only after being set. Fixes: a24c6f7bc923 ("debugfs: Add access restriction option") Cc: Peter Enderborg <peter.enderborg@sony.com> Reviewed-by: Peter Enderborg <peter.enderborg@sony.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210405213959.3079432-1-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-24Merge tag 'driver-core-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core / debugfs update from Greg KH: "Here is the "big" driver core and debugfs update for 5.12-rc1 This set of driver core patches caused a bunch of problems in linux-next for the past few weeks, when Saravana tried to set fw_devlink=on as the default functionality. This caused a number of systems to stop booting, and lots of bugs were fixed in this area for almost all of the reported systems, but this option is not ready to be turned on just yet for the default operation based on this testing, so I've reverted that change at the very end so we don't have to worry about regressions in 5.12 We will try to turn this on for 5.13 if testing goes better over the next few months. Other than the fixes caused by the fw_devlink testing in here, there's not much more: - debugfs fixes for invalid input into debugfs_lookup() - kerneldoc cleanups - warn message if platform drivers return an error on their remove callback (a futile effort, but good to catch). All of these have been in linux-next for a while now, and the regressions have gone away with the revert of the fw_devlink change" * tag 'driver-core-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits) Revert "driver core: Set fw_devlink=on by default" of: property: fw_devlink: Ignore interrupts property for some configs debugfs: do not attempt to create a new file before the filesystem is initalized debugfs: be more robust at handling improper input in debugfs_lookup() driver core: auxiliary bus: Fix calling stage for auxiliary bus init of: irq: Fix the return value for of_irq_parse_one() stub of: irq: make a stub for of_irq_parse_one() clk: Mark fwnodes when their clock provider is added/removed PM: domains: Mark fwnodes when their powerdomain is added/removed irqdomain: Mark fwnodes when their irqdomain is added/removed driver core: fw_devlink: Handle suppliers that don't use driver core of: property: Add fw_devlink support for optional properties driver core: Add fw_devlink.strict kernel param of: property: Don't add links to absent suppliers driver core: fw_devlink: Detect supplier devices that will never be added driver core: platform: Emit a warning if a remove callback returned non-zero of: property: Fix fw_devlink handling of interrupts/interrupts-extended gpiolib: Don't probe gpio_device if it's not the primary device device.h: Remove bogus "the" in kerneldoc gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default ...
2021-02-18debugfs: do not attempt to create a new file before the filesystem is initalizedGreg Kroah-Hartman
Some subsystems want to add debugfs files at early boot, way before debugfs is initialized. This seems to work somehow as the vfs layer will not allow it to happen, but let's be explicit and test to ensure we are properly up and running before allowing files to be created. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: stable <stable@vger.kernel.org> Reported-by: Michael Walle <michael@walle.cc> Reported-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210218100818.3622317-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-18debugfs: be more robust at handling improper input in debugfs_lookup()Greg Kroah-Hartman
debugfs_lookup() doesn't like it if it is passed an illegal name pointer, or if the filesystem isn't even initialized yet. If either of these happen, it will crash the system, so fix it up by properly testing for valid input and that we are up and running before trying to find a file in the filesystem. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: stable <stable@vger.kernel.org> Reported-by: Michael Walle <michael@walle.cc> Tested-by: Michael Walle <michael@walle.cc> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210218100818.3622317-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-24fs: make helpers idmap mount awareChristian Brauner
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches. As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods. Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-23debugfs: Add access restriction optionPeter Enderborg
Since debugfs include sensitive information it need to be treated carefully. But it also has many very useful debug functions for userspace. With this option we can have same configuration for system with need of debugfs and a way to turn it off. This gives a extra protection for exposure on systems where user-space services with system access are attacked. It is controlled by a configurable default value that can be override with a kernel command line parameter. (debugfs=) It can be on or off, but also internally on but not seen from user-space. This no-mount mode do not register a debugfs as filesystem, but client can register their parts in the internal structures. This data can be readed with a debugger or saved with a crashkernel. When it is off clients get EPERM error when accessing the functions for registering their components. Signed-off-by: Peter Enderborg <peter.enderborg@sony.com> Link: https://lore.kernel.org/r/20200716071511.26864-3-peter.enderborg@sony.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18debugfs: remove return value of debugfs_create_file_size()Greg Kroah-Hartman
No one checks the return value of debugfs_create_file_size, as it's not needed, so make the return value void, so that no one tries to do so in the future. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20200309163640.237984-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-05Merge branch 'work.recursive_removal' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs recursive removal updates from Al Viro: "We have quite a few places where synthetic filesystems do an equivalent of 'rm -rf', with varying amounts of code duplication, wrong locking, etc. That really ought to be a library helper. Only debugfs (and very similar tracefs) are converted here - I have more conversions, but they'd never been in -next, so they'll have to wait" * 'work.recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystems
2020-01-06debugfs: Fix warnings when building documentationDaniel W. S. Almeida
Fix the following warnings: fs/debugfs/inode.c:423: WARNING: Inline literal start-string without end-string. fs/debugfs/inode.c:502: WARNING: Inline literal start-string without end-string. fs/debugfs/inode.c:534: WARNING: Inline literal start-string without end-string. fs/debugfs/inode.c:627: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:496: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:502: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:581: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:587: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:846: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:852: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:899: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:905: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:1091: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:1097: WARNING: Inline literal start-string without end-string By replacing %ERR_PTR with ERR_PTR. Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com> Link: https://lore.kernel.org/r/20191227010035.854913-1-dwlsalmeida@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-10simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystemsAl Viro
two requirements: no file creations in IS_DEADDIR and no cross-directory renames whatsoever. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-11-15new helper: lookup_positive_unlocked()Al Viro
Most of the callers of lookup_one_len_unlocked() treat negatives are ERR_PTR(-ENOENT). Provide a helper that would do just that. Note that a pinned positive dentry remains positive - it's ->d_inode is stable, etc.; a pinned _negative_ dentry can become positive at any point as long as you are not holding its parent at least shared. So using lookup_one_len_unlocked() needs to be careful; lookup_positive_unlocked() is safer and that's what the callers end up open-coding anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-28Merge branch 'next-lockdown' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
2019-08-19debugfs: Restrict debugfs when the kernel is locked downDavid Howells
Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-07-12Merge tag 'driver-core-5.3-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...
2019-07-08debugfs: make error message a bit more verboseGreg Kroah-Hartman
When a file/directory is already present in debugfs, and it is attempted to be created again, be more specific about what file/directory is being created and where it is trying to be created to give a bit more help to developers to figure out the problem. Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mark Brown <broonie@kernel.org> Cc: Takashi Iwai <tiwai@suse.de> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20190706154256.GA2683@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-03debugfs: provide pr_fmt() macroGreg Kroah-Hartman
Use a common "debugfs: " prefix for all pr_* calls in a single place. Cc: Mark Brown <broonie@kernel.org> Reviewed-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20190703071653.2799-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-03debugfs: log errors when something goes wrongGreg Kroah-Hartman
As it is not recommended that debugfs calls be checked, it was pointed out that major errors should still be logged somewhere so that developers and users have a chance to figure out what went wrong. To help with this, error logging has been added to the debugfs core so that it is not needed to be present in every individual file that calls debugfs. Reported-by: Mark Brown <broonie@kernel.org> Reported-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20190703071653.2799-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-20debugfs: call fsnotify_{unlink,rmdir}() hooksAmir Goldstein
This will allow generating fsnotify delete events after the fsnotify_nameremove() hook is removed from d_delete(). Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20debugfs: simplify __debugfs_remove_file()Amir Goldstein
Move simple_unlink()+d_delete() from __debugfs_remove_file() into caller __debugfs_remove() and rename helper for post remove file to __debugfs_file_removed(). This will simplify adding fsnotify_unlink() hook. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-05-07Merge branch 'work.dcache' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc dcache updates from Al Viro: "Most of this pile is putting name length into struct name_snapshot and making use of it. The beginning of this series ("ovl_lookup_real_one(): don't bother with strlen()") ought to have been split in two (separate switch of name_snapshot to struct qstr from overlayfs reaping the trivial benefits of that), but I wanted to avoid a rebase - by the time I'd spotted that it was (a) in -next and (b) close to 5.1-final ;-/" * 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: audit_compare_dname_path(): switch to const struct qstr * audit_update_watch(): switch to const struct qstr * inotify_handle_event(): don't bother with strlen() fsnotify: switch send_to_group() and ->handle_event to const struct qstr * fsnotify(): switch to passing const struct qstr * for file_name switch fsnotify_move() to passing const struct qstr * for old_name ovl_lookup_real_one(): don't bother with strlen() sysv: bury the broken "quietly truncate the long filenames" logics nsfs: unobfuscate unexport d_alloc_pseudo()
2019-05-01debugfs: switch to ->free_inode()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-26switch fsnotify_move() to passing const struct qstr * for old_nameAl Viro
note that in the second (RENAME_EXCHANGE) call of fsnotify_move() in vfs_rename() the old_dentry->d_name is guaranteed to be unchanged throughout the evaluation of fsnotify_move() (by the fact that the parent directory is locked exclusive), so we don't need to fetch old_dentry->d_name.name in the caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-26ovl_lookup_real_one(): don't bother with strlen()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-01debugfs: fix use-after-free on symlink traversalAl Viro
symlink body shouldn't be freed without an RCU delay. Switch debugfs to ->destroy_inode() and use of call_rcu(); free both the inode and symlink body in the callback. Similar to solution for bpf, only here it's even more obvious that ->evict_inode() can be dropped. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-11Merge 5.0-rc6 into driver-core-nextGreg Kroah-Hartman
We need the debugfs fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30debugfs: debugfs_lookup() should return NULL if not foundGreg Kroah-Hartman
Lots of callers of debugfs_lookup() were just checking NULL to see if the file/directory was found or not. By changing this in ff9fb72bc077 ("debugfs: return error values, not NULL") we caused some subsystems to easily crash. Fixes: ff9fb72bc077 ("debugfs: return error values, not NULL") Reported-by: syzbot+b382ba6a802a3d242790@syzkaller.appspotmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Omar Sandoval <osandov@fb.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-29debugfs: return error values, not NULLGreg Kroah-Hartman
When an error happens, debugfs should return an error pointer value, not NULL. This will prevent the totally theoretical error where a debugfs call fails due to lack of memory, returning NULL, and that dentry value is then passed to another debugfs call, which would end up succeeding, creating a file at the root of the debugfs tree, but would then be impossible to remove (because you can not remove the directory NULL). So, to make everyone happy, always return errors, this makes the users of debugfs much simpler (they do not have to ever check the return value), and everyone can rest easy. Reported-by: Gary R Hook <ghook@amd.com> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Michal Hocko <mhocko@kernel.org> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reported-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-25debugfs: fix debugfs_rename parameter checkingGreg Kroah-Hartman
debugfs_rename() needs to check that the dentries passed into it really are valid, as sometimes they are not (i.e. if the return value of another debugfs call is passed into this one.) So fix this up by properly checking if the two parent directories are errors (they are allowed to be NULL), and if the dentry to rename is not NULL or an error. Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-22debugfs: debugfs_use_start/finish do not exist anymoreSergey Senozhatsky
debugfs_use_file_start() and debugfs_use_file_finish() do not exist since commit c9afbec27089 ("debugfs: purge obsolete SRCU based removal protection"); tweak debugfs_create_file_unsafe() comment. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-12Revert "debugfs: inode: debugfs_create_dir uses mode permission from parent"Linus Torvalds
This reverts commit 95cde3c59966f6371b6bcd9e4e2da2ba64ee9775. The commit had good intentions, but it breaks kvm-tool and qemu-kvm. With it in place, "lkvm run" just fails with Error: KVM_CREATE_VM ioctl Warning: Failed init: kvm__init which isn't a wonderful error message, but bisection pinpointed the problematic commit. The problem is almost certainly due to the special kvm debugfs entries created dynamically by kvm under /sys/kernel/debug/kvm/. See kvm_create_vm_debugfs() Bisected-and-reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-14debugfs: inode: debugfs_create_dir uses mode permission from parentThomas Richter
Currently function debugfs_create_dir() creates a new directory in the debugfs (usually mounted /sys/kernel/debug) with permission rwxr-xr-x. This is hard coded. Change this to use the parent directory permission. Output before the patch: root@s8360047 ~]# tree -dp -L 1 /sys/kernel/debug/ /sys/kernel/debug/ ├── [drwxr-xr-x] bdi ├── [drwxr-xr-x] block ├── [drwxr-xr-x] dasd ├── [drwxr-xr-x] device_component ├── [drwxr-xr-x] extfrag ├── [drwxr-xr-x] hid ├── [drwxr-xr-x] kprobes ├── [drwxr-xr-x] kvm ├── [drwxr-xr-x] memblock ├── [drwxr-xr-x] pm_qos ├── [drwxr-xr-x] qdio ├── [drwxr-xr-x] s390 ├── [drwxr-xr-x] s390dbf └── [drwx------] tracing 14 directories [root@s8360047 linux]# Output after the patch: [root@s8360047 ~]# tree -dp -L 1 /sys/kernel/debug/ sys/kernel/debug/ ├── [drwx------] bdi ├── [drwx------] block ├── [drwx------] dasd ├── [drwx------] device_component ├── [drwx------] extfrag ├── [drwx------] hid ├── [drwx------] kprobes ├── [drwx------] kvm ├── [drwx------] memblock ├── [drwx------] pm_qos ├── [drwx------] qdio ├── [drwx------] s390 ├── [drwx------] s390dbf └── [drwx------] tracing 14 directories [root@s8360047 linux]# Here is the full diff output done with: [root@s8360047 ~]# diff -u treefull.before treefull.after | sed 's-^- # -' > treefull.diff # --- treefull.before 2018-04-27 13:22:04.532824564 +0200 # +++ treefull.after 2018-04-27 13:24:12.106182062 +0200 # @@ -1,55 +1,55 @@ # /sys/kernel/debug/ # -├── [drwxr-xr-x] bdi # -│   ├── [drwxr-xr-x] 1:0 # -│   ├── [drwxr-xr-x] 1:1 # -│   ├── [drwxr-xr-x] 1:10 # -│   ├── [drwxr-xr-x] 1:11 # -│   ├── [drwxr-xr-x] 1:12 # -│   ├── [drwxr-xr-x] 1:13 # -│   ├── [drwxr-xr-x] 1:14 # -│   ├── [drwxr-xr-x] 1:15 # -│   ├── [drwxr-xr-x] 1:2 # -│   ├── [drwxr-xr-x] 1:3 # -│   ├── [drwxr-xr-x] 1:4 # -│   ├── [drwxr-xr-x] 1:5 # -│   ├── [drwxr-xr-x] 1:6 # -│   ├── [drwxr-xr-x] 1:7 # -│   ├── [drwxr-xr-x] 1:8 # -│   ├── [drwxr-xr-x] 1:9 # -│   └── [drwxr-xr-x] 94:0 # -├── [drwxr-xr-x] block # -├── [drwxr-xr-x] dasd # -│   ├── [drwxr-xr-x] 0.0.e18a # -│   ├── [drwxr-xr-x] dasda # -│   └── [drwxr-xr-x] global # -├── [drwxr-xr-x] device_component # -├── [drwxr-xr-x] extfrag # -├── [drwxr-xr-x] hid # -├── [drwxr-xr-x] kprobes # -├── [drwxr-xr-x] kvm # -├── [drwxr-xr-x] memblock # -├── [drwxr-xr-x] pm_qos # -├── [drwxr-xr-x] qdio # -│   └── [drwxr-xr-x] 0.0.f5f2 # -├── [drwxr-xr-x] s390 # -│   └── [drwxr-xr-x] stsi # -├── [drwxr-xr-x] s390dbf # -│   ├── [drwxr-xr-x] 0.0.e18a # -│   ├── [drwxr-xr-x] cio_crw # -│   ├── [drwxr-xr-x] cio_msg # -│   ├── [drwxr-xr-x] cio_trace # -│   ├── [drwxr-xr-x] dasd # -│   ├── [drwxr-xr-x] kvm-trace # -│   ├── [drwxr-xr-x] lgr # -│   ├── [drwxr-xr-x] qdio_0.0.f5f2 # -│   ├── [drwxr-xr-x] qdio_error # -│   ├── [drwxr-xr-x] qdio_setup # -│   ├── [drwxr-xr-x] qeth_card_0.0.f5f0 # -│   ├── [drwxr-xr-x] qeth_control # -│   ├── [drwxr-xr-x] qeth_msg # -│   ├── [drwxr-xr-x] qeth_setup # -│   ├── [drwxr-xr-x] vmcp # -│   └── [drwxr-xr-x] vmur # +├── [drwx------] bdi # +│   ├── [drwx------] 1:0 # +│   ├── [drwx------] 1:1 # +│   ├── [drwx------] 1:10 # +│   ├── [drwx------] 1:11 # +│   ├── [drwx------] 1:12 # +│   ├── [drwx------] 1:13 # +│   ├── [drwx------] 1:14 # +│   ├── [drwx------] 1:15 # +│   ├── [drwx------] 1:2 # +│   ├── [drwx------] 1:3 # +│   ├── [drwx------] 1:4 # +│   ├── [drwx------] 1:5 # +│   ├── [drwx------] 1:6 # +│   ├── [drwx------] 1:7 # +│   ├── [drwx------] 1:8 # +│   ├── [drwx------] 1:9 # +│   └── [drwx------] 94:0 # +├── [drwx------] block # +├── [drwx------] dasd # +│   ├── [drwx------] 0.0.e18a # +│   ├── [drwx------] dasda # +│   └── [drwx------] global # +├── [drwx------] device_component # +├── [drwx------] extfrag # +├── [drwx------] hid # +├── [drwx------] kprobes # +├── [drwx------] kvm # +├── [drwx------] memblock # +├── [drwx------] pm_qos # +├── [drwx------] qdio # +│   └── [drwx------] 0.0.f5f2 # +├── [drwx------] s390 # +│   └── [drwx------] stsi # +├── [drwx------] s390dbf # +│   ├── [drwx------] 0.0.e18a # +│   ├── [drwx------] cio_crw # +│   ├── [drwx------] cio_msg # +│   ├── [drwx------] cio_trace # +│   ├── [drwx------] dasd # +│   ├── [drwx------] kvm-trace # +│   ├── [drwx------] lgr # +│   ├── [drwx------] qdio_0.0.f5f2 # +│   ├── [drwx------] qdio_error # +│   ├── [drwx------] qdio_setup # +│   ├── [drwx------] qeth_card_0.0.f5f0 # +│   ├── [drwx------] qeth_control # +│   ├── [drwx------] qeth_msg # +│   ├── [drwx------] qeth_setup # +│   ├── [drwx------] vmcp # +│   └── [drwx------] vmur # └── [drwx------] tracing # ├── [drwxr-xr-x] events # │   ├── [drwxr-xr-x] alarmtimer Fixes: edac65eaf8d5c ("debugfs: take mode-dependent parts of debugfs_get_inode() into callers") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-29debugfs_lookup(): switch to lookup_one_len_unlocked()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-07debugfs: Remove redundant license textGreg Kroah-Hartman
Now that the SPDX tag is in all debugfs files, that identifies the license in a specific and legally-defined manner. So the extra GPL text wording can be removed as it is no longer needed at all. This is done on a quest to remove the 700+ different ways that files in the kernel describe the GPL license text. And there's unneeded stuff like the address (sometimes incorrect) for the FSF which is never needed. No copyright headers or other non-license-description text was removed. Cc: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-07debugfs: add SPDX identifiers to all debugfs filesGreg Kroah-Hartman
It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses. Update the debugfs files files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-07debugfs: defer debugfs_fsdata allocation to first usageNicolai Stange
Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-07debugfs: purge obsolete SRCU based removal protectionNicolai Stange
Purge the SRCU based file removal race protection in favour of the new, refcount based debugfs_file_get()/debugfs_file_put() API. Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private data") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-07debugfs: implement per-file removal protectionNicolai Stange
Since commit 49d200deaa68 ("debugfs: prevent access to removed files' private data"), accesses to a file's private data are protected from concurrent removal by covering all file_operations with a SRCU read section and sychronizing with those before returning from debugfs_remove() by means of synchronize_srcu(). As pointed out by Johannes Berg, there are debugfs files with forever blocking file_operations. Their corresponding SRCU read side sections would block any debugfs_remove() forever as well, even unrelated ones. This results in a livelock. Because a remover can't cancel any indefinite blocking within foreign files, this is a problem. Resolve this by introducing support for more granular protection on a per-file basis. This is implemented by introducing an 'active_users' refcount_t to the per-file struct debugfs_fsdata state. At file creation time, it is set to one and a debugfs_remove() will drop that initial reference. The new debugfs_file_get() and debugfs_file_put(), intended to be used in place of former debugfs_use_file_start() and debugfs_use_file_finish(), increment and decrement it respectively. Once the count drops to zero, debugfs_file_put() will signal a completion which is possibly being waited for from debugfs_remove(). Thus, as long as there is a debugfs_file_get() not yet matched by a corresponding debugfs_file_put() around, debugfs_remove() will block. Actual users of debugfs_use_file_start() and -finish() will get converted to the new debugfs_file_get() and debugfs_file_put() by followup patches. Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private data") Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-07debugfs: add support for more elaborate ->d_fsdataNicolai Stange
Currently, the user provided fops, "real_fops", are stored directly into ->d_fsdata. In order to be able to store more per-file state and thus prepare for more granular file removal protection, wrap the real_fops into a dynamically allocated container struct, debugfs_fsdata. A struct debugfs_fsdata gets allocated at file creation and freed from the newly intoduced ->d_release(). Finally, move the implementation of debugfs_real_fops() out of the public debugfs header such that struct debugfs_fsdata's declaration can be kept private. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-15Merge branch 'work.mount' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ->s_options removal from Al Viro: "Preparations for fsmount/fsopen stuff (coming next cycle). Everything gets moved to explicit ->show_options(), killing ->s_options off + some cosmetic bits around fs/namespace.c and friends. Basically, the stuff needed to work with fsmount series with minimum of conflicts with other work. It's not strictly required for this merge window, but it would reduce the PITA during the coming cycle, so it would be nice to have those bits and pieces out of the way" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: isofs: Fix isofs_show_options() VFS: Kill off s_options and helpers orangefs: Implement show_options 9p: Implement show_options isofs: Implement show_options afs: Implement show_options affs: Implement show_options befs: Implement show_options spufs: Implement show_options bpf: Implement show_options ramfs: Implement show_options pstore: Implement show_options omfs: Implement show_options hugetlbfs: Implement show_options VFS: Don't use save/replace_mount_options if not using generic_show_options VFS: Provide empty name qstr VFS: Make get_filesystem() return the affected filesystem VFS: Clean up whitespace in fs/namespace.c and fs/super.c Provide a function to create a NUL-terminated string from unterminated data
2017-07-08Merge branch 'work.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc filesystem updates from Al Viro: "Assorted normal VFS / filesystems stuff..." * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: dentry name snapshots Make statfs properly return read-only state after emergency remount fs/dcache: init in_lookup_hashtable minix: Deinline get_block, save 2691 bytes fs: Reorder inode_owner_or_capable() to avoid needless fs: warn in case userspace lied about modprobe return
2017-07-07dentry name snapshotsAl Viro
take_dentry_name_snapshot() takes a safe snapshot of dentry name; if the name is a short one, it gets copied into caller-supplied structure, otherwise an extra reference to external name is grabbed (those are never modified). In either case the pointer to stable string is stored into the same structure. dentry must be held by the caller of take_dentry_name_snapshot(), but may be freely dropped afterwards - the snapshot will stay until destroyed by release_dentry_name_snapshot(). Intended use: struct name_snapshot s; take_dentry_name_snapshot(&s, dentry); ... access s.name ... release_dentry_name_snapshot(&s); Replaces fsnotify_oldname_...(), gets used in fsnotify to obtain the name to pass down with event. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-06VFS: Don't use save/replace_mount_options if not using generic_show_optionsDavid Howells
btrfs, debugfs, reiserfs and tracefs call save_mount_options() and reiserfs calls replace_mount_options(), but they then implement their own ->show_options() methods and don't touch s_options, rendering the saved options unnecessary. I'm trying to eliminate s_options to make it easier to implement a context-based mount where the mount options can be passed individually over a file descriptor. Remove the calls to save/replace_mount_options() call in these cases. Signed-off-by: David Howells <dhowells@redhat.com> cc: Chris Mason <clm@fb.com> cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> cc: Steven Rostedt <rostedt@goodmis.org> cc: linux-btrfs@vger.kernel.org cc: reiserfs-devel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-16fs: fix the location of the kernel-api bookMauro Carvalho Chehab
The kernel-api book is now part of the core-api. Update its location. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-04-26fs: constify tree_descr arrays passed to simple_fill_super()Eric Biggers
simple_fill_super() is passed an array of tree_descr structures which describe the files to create in the filesystem's root directory. Since these arrays are never modified intentionally, they should be 'const' so that they are placed in .rodata and benefit from memory protection. This patch updates the function signature and all users, and also constifies tree_descr.name. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-02-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "There is a lot here. A lot of these changes result in subtle user visible differences in kernel behavior. I don't expect anything will care but I will revert/fix things immediately if any regressions show up. From Seth Forshee there is a continuation of the work to make the vfs ready for unpriviled mounts. We had thought the previous changes prevented the creation of files outside of s_user_ns of a filesystem, but it turns we missed the O_CREAT path. Ooops. Pavel Tikhomirov and Oleg Nesterov worked together to fix a long standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only children that are forked after the prctl are considered and not children forked before the prctl. The only known user of this prctl systemd forks all children after the prctl. So no userspace regressions will occur. Holding earlier forked children to the same rules as later forked children creates a semantic that is sane enough to allow checkpoing of processes that use this feature. There is a long delayed change by Nikolay Borisov to limit inotify instances inside a user namespace. Michael Kerrisk extends the API for files used to maniuplate namespaces with two new trivial ioctls to allow discovery of the hierachy and properties of namespaces. Konstantin Khlebnikov with the help of Al Viro adds code that when a network namespace exits purges it's sysctl entries from the dcache. As in some circumstances this could use a lot of memory. Vivek Goyal fixed a bug with stacked filesystems where the permissions on the wrong inode were being checked. I continue previous work on ptracing across exec. Allowing a file to be setuid across exec while being ptraced if the tracer has enough credentials in the user namespace, and if the process has CAP_SETUID in it's own namespace. Proc files for setuid or otherwise undumpable executables are now owned by the root in the user namespace of their mm. Allowing debugging of setuid applications in containers to work better. A bug I introduced with permission checking and automount is now fixed. The big change is to mark the mounts that the kernel initiates as a result of an automount. This allows the permission checks in sget to be safely suppressed for this kind of mount. As the permission check happened when the original filesystem was mounted. Finally a special case in the mount namespace is removed preventing unbounded chains in the mount hash table, and making the semantics simpler which benefits CRIU. The vfs fix along with related work in ima and evm I believe makes us ready to finish developing and merge fully unprivileged mounts of the fuse filesystem. The cleanups of the mount namespace makes discussing how to fix the worst case complexity of umount. The stacked filesystem fixes pave the way for adding multiple mappings for the filesystem uids so that efficient and safer containers can be implemented" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc/sysctl: Don't grab i_lock under sysctl_lock. vfs: Use upper filesystem inode in bprm_fill_uid() proc/sysctl: prune stale dentries during unregistering mnt: Tuck mounts under others instead of creating shadow/side mounts. prctl: propagate has_child_subreaper flag to every descendant introduce the walk_process_tree() helper nsfs: Add an ioctl() to return owner UID of a userns fs: Better permission checking for submounts exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction vfs: open() with O_CREAT should not create inodes with unknown ids nsfs: Add an ioctl() to return the namespace type proc: Better ownership of files for non-dumpable tasks in user namespaces exec: Remove LSM_UNSAFE_PTRACE_CAP exec: Test the ptracer's saved cred to see if the tracee can gain caps exec: Don't reset euid and egid when the tracee has CAP_SETUID inotify: Convert to using per-namespace limits