aboutsummaryrefslogtreecommitdiff
path: root/fs/namei.c
AgeCommit message (Collapse)Author
2011-06-07vfs: make unlink() and rmdir() return ENOENT in preference to EROFSTheodore Ts'o
If user space attempts to remove a non-existent file or directory, and the file system is mounted read-only, return ENOENT instead of EROFS. Either error code is arguably valid/correct, but ENOENT is a more specific error message. Reported-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-30vfs: shrink_dcache_parent before rmdir, dir renameSage Weil
The dentry_unhash push-down series missed that shink_dcache_parent needs to be called prior to rmdir or dir rename to clear DCACHE_REFERENCED and allow efficient dentry reclaim. Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27Lift the check for automount points into do_lookup()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27Trim excessive arguments of follow_mount_rcu()Al Viro
... and kill a useless local variable in follow_dotdot_rcu(), while we are at it - follow_mount_rcu(nd, path, inode) *always* assigned value to *inode, and always it had been path->dentry->d_inode (aka nd->path.dentry->d_inode, since it always got &nd->path as the second argument). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27split __follow_mount_rcu() into normal and .. casesAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (25 commits) cifs: remove unnecessary dentry_unhash on rmdir/rename_dir ocfs2: remove unnecessary dentry_unhash on rmdir/rename_dir exofs: remove unnecessary dentry_unhash on rmdir/rename_dir nfs: remove unnecessary dentry_unhash on rmdir/rename_dir ext2: remove unnecessary dentry_unhash on rmdir/rename_dir ext3: remove unnecessary dentry_unhash on rmdir/rename_dir ext4: remove unnecessary dentry_unhash on rmdir/rename_dir btrfs: remove unnecessary dentry_unhash in rmdir/rename_dir ceph: remove unnecessary dentry_unhash calls vfs: clean up vfs_rename_other vfs: clean up vfs_rename_dir vfs: clean up vfs_rmdir vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems libfs: drop unneeded dentry_unhash vfs: update dentry_unhash() comment vfs: push dentry_unhash on rename_dir into file systems vfs: push dentry_unhash on rmdir into file systems vfs: remove dget() from dentry_unhash() vfs: dentry_unhash immediately prior to rmdir vfs: Block mmapped writes while the fs is frozen ...
2011-05-26vfs: clean up vfs_rename_otherSage Weil
Simplify control flow to match vfs_rename_dir. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: clean up vfs_rename_dirSage Weil
Simplify control flow through vfs_rename_dir. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: clean up vfs_rmdirSage Weil
Simplify the control flow with an out label. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystemsMiklos Szeredi
vfs_rename_dir() doesn't properly account for filesystems with FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it unhashes the new_dentry prior to the rename() iop and rehashes it after, but doesn't account for the possibility that rename() may have swapped {old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it rehashes new_dentry (now the old renamed-from name, which d_move() expected to go away), such that a subsequent lookup will find it. Currently all FS_RENAME_DOES_D_MOVE filesystems compensate for this by failing in d_revalidate. The bug was introduced by: commit 349457ccf2592c14bdf13b6706170ae2e94931b1 "[PATCH] Allow file systems to manually d_move() inside of ->rename()" Fix by not rehashing the new dentry. Rehashing used to be needed by d_move() but isn't anymore. Reported-by: Sage Weil <sage@newdream.net> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: update dentry_unhash() commentSage Weil
The helper is now only called by file systems, not the VFS. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: push dentry_unhash on rename_dir into file systemsSage Weil
Only a few file systems need this. Start by pushing it down into each rename method (except gfs2 and xfs) so that it can be dealt with on a per-fs basis. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: push dentry_unhash on rmdir into file systemsSage Weil
Only a few file systems need this. Start by pushing it down into each fs rmdir method (except gfs2 and xfs) so it can be dealt with on a per-fs basis. This does not change behavior for any in-tree file systems. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: remove dget() from dentry_unhash()Sage Weil
This serves no useful purpose that I can discern. All callers (rename, rmdir) hold their own reference to the dentry. A quick audit of all file systems showed no relevant checks on the value of d_count in vfs_rmdir/vfs_rename_dir paths. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26vfs: dentry_unhash immediately prior to rmdirSage Weil
This presumes that there is no reason to unhash a dentry if we fail because it is a mountpoint or the LSM check fails, and that the LSM checks do not depend on the dentry being unhashed. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26merge handle_reval_dot and nameidata_drop_rcu_lastAl Viro
new helper: complete_walk(). Done on successful completion of walk, drops out of RCU mode, does d_revalidate of final result if that hadn't been done already. handle_reval_dot() and nameidata_drop_rcu_last() subsumed into that one; callers converted to use of complete_walk(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26consolidate nameidata_..._drop_rcu()Al Viro
Merge these into a single function (unlazy_walk(nd, dentry)), kill ..._maybe variants Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-21VFS: move BUG_ON test for symlink nd->depth after current->link_count testErez Zadok
This solves a serious VFS-level bug in nested_symlink (which was rewritten from do_follow_link), and follows the order of depth tests that existed before. The bug triggers a BUG_ON in fs/namei.c:1381, when running racer with symlink and rename ops. Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Acked-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13vfs: micro-optimize acl_permission_check()Linus Torvalds
It's a hot function, and we're better off not mixing types in the mask calculations. The compiler just ends up mixing 16-bit and 32-bit operations, for no good reason. So do everything in 'unsigned int' rather than mixing 'unsigned int' masking with a 'umode_t' (16-bit) mode variable. This, together with the parent commit (47a150edc2ae: "Cache user_ns in struct cred") makes acl_permission_check() much nicer. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15vfs: Fix absolute RCU path walk failures due to uninitialized seq numberTim Chen
During RCU walk in path_lookupat and path_openat, the rcu lookup frequently failed if looking up an absolute path, because when root directory was looked up, seq number was not properly set in nameidata. We dropped out of RCU walk in nameidata_drop_rcu due to mismatch in directory entry's seq number. We reverted to slow path walk that need to take references. With the following patch, I saw a 50% increase in an exim mail server benchmark throughput on a 4-socket Nehalem-EX system. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: stable@kernel.org (v2.6.38) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-24vfs - check non-mountpoint dentry might block in __follow_mount_rcu()Ian Kent
When following a mount in rcu-walk mode we must check if the incoming dentry is telling us it may need to block, even if it isn't actually a mountpoint. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: deal with races in /proc/*/{syscall,stack,personality} proc: enable writing to /proc/pid/mem proc: make check_mem_permission() return an mm_struct on success proc: hold cred_guard_mutex in check_mem_permission() proc: disable mem_write after exec mm: implement access_remote_vm mm: factor out main logic of access_process_vm mm: use mm_struct to resolve gate vma's in __get_user_pages mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm mm: arch: make in_gate_area take an mm_struct instead of a task_struct mm: arch: make get_gate_vma take an mm_struct instead of a task_struct x86: mark associated mm when running a task in 32 bit compatibility mode x86: add context tag to mark mm when running a task in 32-bit compatibility mode auxv: require the target to be tracable (or yourself) close race in /proc/*/environ report errors in /proc/*/*map* sanely pagemap: close races with suid execve make sessionid permissions in /proc/*/task/* match those in /proc/* fix leaks in path_lookupat() Fix up trivial conflicts in fs/proc/base.c
2011-03-23userns: rename is_owner_or_cap to inode_owner_or_capableSerge E. Hallyn
And give it a kernel-doc comment. [akpm@linux-foundation.org: btrfs changed in linux-next] Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23userns: userns: check user namespace for task->file uid equivalence checksSerge E. Hallyn
Cheat for now and say all files belong to init_user_ns. Next step will be to let superblocks belong to a user_ns, and derive inode_userns(inode) from inode->i_sb->s_user_ns. Finally we'll introduce more flexible arrangements. Changelog: Feb 15: make is_owner_or_cap take const struct inode Feb 23: make is_owner_or_cap bool [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23fix leaks in path_lookupat()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-18lose 'mounting_here' argument in ->d_manage()Al Viro
it's always false... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-18don't pass 'mounting_here' flag to follow_down()Al Viro
it's always false now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-16fix follow_link() breakageAl Viro
commit 574197e0de46a8a4db5c54ef7b65e43ffa8873a7 had a missing piece, breaking the loop detection ;-/ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15tidy the trailing symlinks traversal upAl Viro
* pull the handling of current->total_link_count into __do_follow_link() * put the common "do ->put_link() if needed and path_put() the link" stuff into a helper (put_link(nd, link, cookie)) * rename __do_follow_link() to follow_link(), while we are at it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15Turn resolution of trailing symlinks iterative everywhereAl Viro
The last remaining place (resolution of nested symlink) converted to the loop of the same kind we have in path_lookupat() and path_openat(). Note that we still *do* have a recursion in pathname resolution; can't avoid it, really. However, it's strictly for nested symlinks now - i.e. ones in the middle of a pathname. link_path_walk() has lost the tail now - it always walks everything except the last component. do_follow_link() renamed to nested_symlink() and moved down. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15simplify link_path_walk() tailAl Viro
Now that link_path_walk() is called without LOOKUP_PARENT only from do_follow_link(), we can simplify the checks in last component handling. First of all, checking if we'd arrived to a directory is not needed - the caller will check it anyway. And LOOKUP_FOLLOW is guaranteed to be there, since we only get to that place with nd->depth > 0. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15Make trailing symlink resolution in path_lookupat() iterativeAl Viro
Now the only caller of link_path_walk() that does *not* pass LOOKUP_PARENT is do_follow_link() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15update nd->inode in __do_follow_link() instead of after do_follow_link()Al Viro
... and note that we only need to do it for LAST_BIND symlinks Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15pull handling of one pathname component into a helperAl Viro
new helper: walk_component(). Handles everything except symlinks; returns negative on error, 0 on success and 1 on symlinks we decided to follow. Drops out of RCU mode on such symlinks. link_path_walk() and do_last() switched to using that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCHAneesh Kumar K.V
We don't want to allow creation of private hardlinks by different application using the fd passed to them via SCM_RIGHTS. So limit the null relative name usage in linkat syscall to CAP_DAC_READ_SEARCH Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2011-03-15Allow O_PATH for symlinksAl Viro
At that point we can't do almost nothing with them. They can be opened with O_PATH, we can manipulate such descriptors with dup(), etc. and we can see them in /proc/*/{fd,fdinfo}/*. We can't (and won't be able to) follow /proc/*/fd/* symlinks for those; there's simply not enough information for pathname resolution to go on from such point - to resolve a symlink we need to know which directory does it live in. We will be able to do useful things with them after the next commit, though - readlinkat() and fchownat() will be possible to use with dfd being an O_PATH-opened symlink and empty relative pathname. Combined with open_by_handle() it'll give us a way to do realink-by-handle and lchown-by-handle without messing with more redundant syscalls. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15New kind of open files - "location only".Al Viro
New flag for open(2) - O_PATH. Semantics: * pathname is resolved, but the file itself is _NOT_ opened as far as filesystem is concerned. * almost all operations on the resulting descriptors shall fail with -EBADF. Exceptions are: 1) operations on descriptors themselves (i.e. close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD), fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD), fcntl(fd, F_SETFD, ...)) 2) fcntl(fd, F_GETFL), for a common non-destructive way to check if descriptor is open 3) "dfd" arguments of ...at(2) syscalls, i.e. the starting points of pathname resolution * closing such descriptor does *NOT* affect dnotify or posix locks. * permissions are checked as usual along the way to file; no permission checks are applied to the file itself. Of course, giving such thing to syscall will result in permission checks (at the moment it means checking that starting point of ....at() is a directory and caller has exec permissions on it). fget() and fget_light() return NULL on such descriptors; use of fget_raw() and fget_raw_light() is needed to get them. That protects existing code from dealing with those things. There are two things still missing (they come in the next commits): one is handling of symlinks (right now we refuse to open them that way; see the next commit for semantics related to those) and another is descriptor passing via SCM_RIGHTS datagrams. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15fs: Don't allow to create hardlink for deleted fileAneesh Kumar K.V
Add inode->i_nlink == 0 check in VFS. Some of the file systems do this internally. A followup patch will remove those instance. This is needed to ensure that with link by handle we don't allow to create hardlink of an unlinked file. The check also prevent a race between unlink and link Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14New AT_... flag: AT_EMPTY_PATHAl Viro
For name_to_handle_at(2) we'll want both ...at()-style syscall that would be usable for non-directory descriptors (with empty relative pathname). Introduce new flag (AT_EMPTY_PATH) to deal with that and corresponding LOOKUP_EMPTY; teach user_path_at() and path_init() to deal with the latter. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14open-style analog of vfs_path_lookup()Al Viro
new function: file_open_root(dentry, mnt, name, flags) opens the file vfs_path_lookup would arrive to. Note that name can be empty; in that case the usual requirement that dentry should be a directory is lifted. open-coded equivalents switched to it, may_open() got down exactly one caller and became static. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14reduce vfs_path_lookup() to do_path_lookup()Al Viro
New lookup flag: LOOKUP_ROOT. nd->root is set (and held) by caller, path_init() starts walking from that place and all pathname resolution machinery never drops nd->root if that flag is set. That turns vfs_path_lookup() into a special case of do_path_lookup() *and* gets us down to 3 callers of link_path_walk(), making it finally feasible to rip the handling of trailing symlink out of link_path_walk(). That will not only simply the living hell out of it, but make life much simpler for unionfs merge. Trailing symlink handling will become iterative, which is a good thing for stack footprint in a lot of situations as well. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14untangle do_lookup()Al Viro
That thing has devolved into rats nest of gotos; sane use of unlikely() gets rid of that horror and gives much more readable structure: * make a fast attempt to find a dentry; false negatives are OK. In RCU mode if everything went fine, we are done, otherwise just drop out of RCU. If we'd done (RCU) ->d_revalidate() and it had not refused outright (i.e. didn't give us -ECHILD), remember its result. * now we are not in RCU mode and hopefully have a dentry. If we do not, lock parent, do full d_lookup() and if that has not found anything, allocate and call ->lookup(). If we'd done that ->lookup(), remember that dentry is good and we don't need to revalidate it. * now we have a dentry. If it has ->d_revalidate() and we can't skip it, call it. * hopefully dentry is good; if not, either fail (in case of error) or try to invalidate it. If d_invalidate() has succeeded, drop it and retry everything as if original attempt had not found a dentry. * now we can finish it up - deal with mountpoint crossing and automount. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14path_openat: clean ELOOP handling a bitAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14do_last: kill a rudiment of old ->d_revalidate() workaroundAl Viro
There used to be time when ->d_revalidate() couldn't return an error. So intents code had lookup_instantiate_filp() stash ERR_PTR(error) in nd->intent.open.filp and had it checked after lookup_hash(), to catch the otherwise silent failures. That had been introduced by commit 4af4c52f34606bdaab6930a845550c6fb02078a4. These days ->d_revalidate() can and does propagate errors back to callers explicitly, so this check isn't needed anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14fold __open_namei_create() and open_will_truncate() into do_last()Al Viro
... and clean up a bit more Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14do_last: unify may_open() call and everyting after itAl Viro
We have a bunch of diverging codepaths in do_last(); some of them converge, but the case of having to create a new file duplicates large part of common tail of the rest and exits separately. Massage them so that they could be merged. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14move may_open() from __open_name_create() to do_last()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14expand finish_open() in its only callerAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14sanitize pathname component hash calculationAl Viro
Lift it to lookup_one_len() and link_path_walk() resp. into the same place where we calculated default hash function of the same name. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>