aboutsummaryrefslogtreecommitdiff
path: root/fs/dcache.c
AgeCommit message (Collapse)Author
2011-01-07fs: change d_hash for rcu-walkNick Piggin
Change d_hash so it may be called from lock-free RCU lookups. See similar patch for d_compare for details. For in-tree filesystems, this is just a mechanical change. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: change d_compare for rcu-walkNick Piggin
Change d_compare so it may be called from lock-free RCU lookups. This does put significant restrictions on what may be done from the callback, however there don't seem to have been any problems with in-tree fses. If some strange use case pops up that _really_ cannot cope with the rcu-walk rules, we can just add new rcu-unaware callbacks, which would cause name lookup to drop out of rcu-walk mode. For in-tree filesystems, this is just a mechanical change. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: name case update methodNick Piggin
smpfs and ncpfs want to update a live dentry name in-place. Rather than have them open code the locking, provide a documented dcache API. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: change d_delete semanticsNick Piggin
Change d_delete from a dentry deletion notification to a dentry caching advise, more like ->drop_inode. Require it to be constant and idempotent, and not take d_lock. This is how all existing filesystems use the callback anyway. This makes fine grained dentry locking of dput and dentry lru scanning much simpler. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: use fast counters for vfs cachesNick Piggin
percpu_counter library generates quite nasty code, so unless you need to dynamically allocate counters or take fast approximate value, a simple per cpu set of counters is much better. The percpu_counter can never be made to work as well, because it has an indirection from pointer to percpu memory, and it can't use direct this_cpu_inc interfaces because it doesn't use static PER_CPU data, so code will always be worse. In the fastpath, it is the difference between this: incl %gs:nr_dentry # nr_dentry and this: movl percpu_counter_batch(%rip), %edx # percpu_counter_batch, movl $1, %esi #, movq $nr_dentry, %rdi #, call __percpu_counter_add # (plus I clobber registers) __percpu_counter_add: pushq %rbp # movq %rsp, %rbp #, subq $32, %rsp #, movq %rbx, -24(%rbp) #, movq %r12, -16(%rbp) #, movq %r13, -8(%rbp) #, movq %rdi, %rbx # fbc, fbc #APP # 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1 movq %gs:kernel_stack,%rax #, pfo_ret__ # 0 "" 2 #NO_APP incl -8124(%rax) # <variable>.preempt_count movq 32(%rdi), %r12 # <variable>.counters, tcp_ptr__ #APP # 78 "lib/percpu_counter.c" 1 add %gs:this_cpu_off, %r12 # this_cpu_off, tcp_ptr__ # 0 "" 2 #NO_APP movslq (%r12),%r13 #* tcp_ptr__, tmp73 movslq %edx,%rax # batch, batch addq %rsi, %r13 # amount, count cmpq %rax, %r13 # batch, count jge .L27 #, negl %edx # tmp76 movslq %edx,%rdx # tmp76, tmp77 cmpq %rdx, %r13 # tmp77, count jg .L28 #, .L27: movq %rbx, %rdi # fbc, call _raw_spin_lock # addq %r13, 8(%rbx) # count, <variable>.count movq %rbx, %rdi # fbc, movl $0, (%r12) #,* tcp_ptr__ call _raw_spin_unlock # .L29: #APP # 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1 movq %gs:kernel_stack,%rax #, pfo_ret__ # 0 "" 2 #NO_APP decl -8124(%rax) # <variable>.preempt_count movq -8136(%rax), %rax #, D.14625 testb $8, %al #, D.14625 jne .L32 #, .L31: movq -24(%rbp), %rbx #, movq -16(%rbp), %r12 #, movq -8(%rbp), %r13 #, leave ret .p2align 4,,10 .p2align 3 .L28: movl %r13d, (%r12) # count,* jmp .L29 # .L32: call preempt_schedule # .p2align 4,,6 jmp .L31 # .size __percpu_counter_add, .-__percpu_counter_add .p2align 4,,15 Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07vfs: revert per-cpu nr_unused counters for dentry and inodesNick Piggin
The nr_unused counters count the number of objects on an LRU, and as such they are synchronized with LRU object insertion and removal and scanning, and protected under the LRU lock. Making it per-cpu does not actually get any concurrency improvements because of this lock, and summing the counter is much slower, and incrementing/decrementing it costs more code size and is slower too. These counters should stay per-LRU, which currently means global. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: d_validate fixesNick Piggin
d_validate has been broken for a long time. kmem_ptr_validate does not guarantee that a pointer can be dereferenced if it can go away at any time. Even rcu_read_lock doesn't help, because the pointer might be queued in RCU callbacks but not executed yet. So the parent cannot be checked, nor the name hashed. The dentry pointer can not be touched until it can be verified under lock. Hashing simply cannot be used. Instead, verify the parent/child relationship by traversing parent's d_child list. It's slow, but only ncpfs and the destaged smbfs care about it, at this point. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-05Revert "fs: use RCU read side protection in d_validate"Nick Piggin
This reverts commit 3825bdb7ed920845961f32f364454bee5f469abb. You cannot dget() a dentry without having a reference, or holding a lock that guarantees it remains valid. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2010-10-25fs: use RCU read side protection in d_validateChristoph Hellwig
d_validate does a purely read lookup in the dentry hash, so use RCU read side locking instead of dcache_lock. Split out from a larget patch by Nick Piggin <npiggin@suse.de>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25fs: clean up dentry lru modificationChristoph Hellwig
Always do a list_del_init on the LRU to make sure the list_empty invariant for not beeing on the LRU always holds true, and fold dentry_lru_del_init into dentry_lru_del. Replace the dentry_lru_add_tail primitive with a dentry_lru_move_tail operations that simpler when the dentry already is one the list, which is always is. Move the list_empty into dentry_lru_add to fit the scheme of the other lru helpers, and simplify locking once we move to a separate LRU lock. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25fs: split __shrink_dcache_sbChristoph Hellwig
Currently __shrink_dcache_sb has an extremly awkward calling convention because it tries to please very different callers. Split out the main loop into a shrink_dentry_list helper, which gets called directly from shrink_dcache_sb for the cases where all dentries need to be pruned, or from __shrink_dcache_sb for pruning only a certain number of dentries. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25fs: improve DCACHE_REFERENCED usageNick Piggin
dentry referenced bit is only set when installing the dentry back onto the LRU. However with lazy LRU, the dentry can already be on the LRU list at dput time, thus missing out on setting the referenced bit. Fix this. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25fs: use percpu counter for nr_dentry and nr_dentry_unusedChristoph Hellwig
The nr_dentry stat is a globally touched cacheline and atomic operation twice over the lifetime of a dentry. It is used for the benfit of userspace only. Turn it into a per-cpu counter and always decrement it in d_free instead of doing various batching operations to reduce lock hold times in the callers. Based on an earlier patch from Nick Piggin <npiggin@suse.de>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25fs: simplify __d_freeChristoph Hellwig
Remove d_callback and always call __d_free with a RCU head. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25fs: take dcache_lock inside __d_pathChristoph Hellwig
All callers take dcache_lock just around the call to __d_path, so take the lock into it in preparation of getting rid of dcache_lock. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-18fs: brlock vfsmount_lockNick Piggin
fs: brlock vfsmount_lock Use a brlock for the vfsmount lock. It must be taken for write whenever modifying the mount hash or associated fields, and may be taken for read when performing mount hash lookups. A new lock is added for the mnt-id allocator, so it doesn't need to take the heavy vfsmount write-lock. The number of atomics should remain the same for fastpath rlock cases, though code would be slightly slower due to per-cpu access. Scalability is not not be much improved in common cases yet, due to other locks (ie. dcache_lock) getting in the way. However path lookups crossing mountpoints should be one case where scalability is improved (currently requiring the global lock). The slowpath is slower due to use of brlock. On a 64 core, 64 socket, 32 node Altix system (high latency to remote nodes), a simple umount microbenchmark (mount --bind mnt mnt2 ; umount mnt2 loop 1000 times), before this patch it took 6.8s, afterwards took 7.1s, about 5% slower. Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-18fs: remove extra lookup in __lookup_hashNick Piggin
fs: remove extra lookup in __lookup_hash Optimize lookup for create operations, where no dentry should often be common-case. In cases where it is not, such as unlink, the added overhead is much smaller than the removed. Also, move comments about __d_lookup racyness to the __d_lookup call site. d_lookup is intuitive; __d_lookup is what needs commenting. So in that same vein, add kerneldoc comments to __d_lookup and clean up some of the comments: - We are interested in how the RCU lookup works here, particularly with renames. Make that explicit, and point to the document where it is explained in more detail. - RCU is pretty standard now, and macros make implementations pretty mindless. If we want to know about RCU barrier details, we look in RCU code. - Delete some boring legacy comments because we don't care much about how the code used to work, more about the interesting parts of how it works now. So comments about lazy LRU may be interesting, but would better be done in the LRU or refcount management code. Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-14fs/dcache: fix function param name in kernel-docRandy Dunlap
Fix parameter name in kernel-doc notation (causes a warning). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11vfs: show unreachable paths in getcwd and procMiklos Szeredi
Prepend "(unreachable)" to path strings if the path is not reachable from the current root. Two places updated are - the return string from getcwd() - and symlinks under /proc/$PID. Other uses of d_path() are left unchanged (we know that some old software crashes if /proc/mounts is changed). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-11vfs: only add " (deleted)" where necessaryMiklos Szeredi
__d_path() has 4 callers: d_path() sys_getcwd() seq_path_root() tomoyo_realpath_from_path2() Of these the only one which needs the " (deleted)" ending is d_path(). sys_getcwd() checks for existence before calling __d_path(). seq_path_root() is used to show the mountpoint path in /proc/PID/mountinfo, which is always a positive. And tomoyo doesn't want the deleted ending. Create a helper "path_with_deleted()" as subsequent patches will need this in multiple places. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-11vfs: add prepend_path() helperMiklos Szeredi
Split off prepend_path() from __d_path(). This new helper takes an end-of-buffer pointer and buffer-length pointer just like the other prepend_* functions. Move the " (deleted)" postfix out to __d_path(). This patch doesn't change any functionality but paves the way for the following patches. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-11vfs: __d_path: dont prepend the name of the root dentryMiklos Szeredi
In the old times pseudo-filesystems set the name of theroot dentry to some prefix like "pipe:" and the name of the child dentry to "[123]" and relied on a hack in __d_path() to replace the preceding slash with the root's name to get "pipe:[123]". Then the d_dname() dentry operation was introduced which solved the same problem without having to pre-fill the name in each dentry. Currently the following pseudo filesystems exist in the kernel: perfmon mtd anon_inode bdev pipe socket Of these only perfmon, anon_inode, pipe and socket create sub-dentries, all of which have now been switched to using d_dname(). bdev and mtd only create inodes. This means that now the hack to overwrite the slash can be removed, so for unreachable paths (e.g. within a detached mount) the path string won't be polluted with garbage. For these cases a subsequent patch will add a prefix, indicating that the path is unreachable. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-11vfs: add helpers to get root and pwdMiklos Szeredi
Add three helpers that retrieve a refcounted copy of the root and cwd from the supplied fs_struct. get_fs_root() get_fs_pwd() get_fs_root_and_pwd() Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-09no need for list_for_each_entry_safe()/resetting with superblock listAl Viro
just delay __put_super() a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-09new helper: __dentry_path()Al Viro
builds path relative to fs root, called under dcache_lock, doesn't append any nonsense to unlinked ones. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-07-19mm: add context argument to shrinker callbackDave Chinner
The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-06-29fs: fix superblock iteration racenpiggin@suse.de
list_for_each_entry_safe is not suitable to protect against concurrent modification of the list. 6754af6 introduced a race in sb walking. list_for_each_entry can use the trick of pinning the current entry in the list before we drop and retake the lock because it subsequently follows cur->next. However list_for_each_entry_safe saves n=cur->next for following before entering the loop body, so when the lock is dropped, n may be deleted. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: John Stultz <johnstul@us.ibm.com> Cc: Frank Mayhar <fmayhar@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-21fix prune_dcache()/umount() raceAl Viro
... and get rid of the last __put_super_and_need_restart() caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21Leave superblocks on s_list until the endAl Viro
We used to remove from s_list and s_instances at the same time. So let's *not* do the former and skip superblocks that have empty s_instances in the loops over s_list. The next step, of course, will be to get rid of rescan logics in those loops. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21clean DCACHE_CANT_MOUNT in d_delete()Al Viro
We set the "it's dead, don't mount on it" flag _and_ do not remove it if we turn the damn thing negative and leave it around. And if it goes positive afterwards, well... Fortunately, there's only one place where that needs to be caught: only d_delete() can turn the sucker negative without immediately freeing it; all other places that can lead to ->d_iput() call are followed by unconditionally freeing struct dentry in question. So the fix is obvious: Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16014 Reported-by: Adam Tkac <vonsch@gmail.com> Tested-by: Adam Tkac <vonsch@gmail.com> Cc: <stable@kernel.org> [2.6.34.x] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03fix race in d_splice_alias()Al Viro
rehashing the negative placeholder opens a race with d_lookup(); we unhash it almost immediately (by d_move()), but the race window is there. Since d_move() doesn't rely on target being hashed, we don't need that d_rehash() at all. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03New helper: path_is_under(path1, path2)Al Viro
Analog of is_subdir for vfsmount,dentry pairs, moved from audit_tree.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03fs/dcache.c: CodingStyle cleanupH Hartley Sweeten
Cleanup EXPORT* macros according to Documantation/CodingStyle. Move EXPORT* macros to the line immediately after the closing function brace. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16libfs: move EXPORT_SYMBOL for d_alloc_nameH Hartley Sweeten
The EXPORT_SYMBOL for d_alloc_name is in fs/libfs.c but the function is in fs/dcache.c. Move the EXPORT_SYMBOL to the line immediately after the closing function brace line in fs/dcache.c as mentioned in Documentation/CodingStyle. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-07-18sched: Pull up the might_sleep() check into cond_resched()Frederic Weisbecker
might_sleep() is called late-ish in cond_resched(), after the need_resched()/preempt enabled/system running tests are checked. It's better to check the sleeps while atomic earlier and not depend on some environment datas that reduce the chances to detect a problem. Also define cond_resched_*() helpers as macros, so that the FILE/LINE reported in the sleeping while atomic warning displays the real origin and not sched.h Changes in v2: - Call __might_sleep() directly instead of might_sleep() which may call cond_resched() - Turn cond_resched() into a macro so that the file:line couple reported refers to the caller of cond_resched() and not __cond_resched() itself. Changes in v3: - Also propagate this __might_sleep() pull up to cond_resched_lock() and cond_resched_softirq() Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1247725694-6082-6-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11dcache: extrace and use d_unlinked()Alexey Dobriyan
d_unlinked() will be used in middle-term to ban checkpointing when opened but unlinked file is detected, and in long term, to detect such situation and special case on it. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09fs: dcache fix LRU orderingnpiggin@suse.de
Fix ordering of LRU when moving referenced dentries to the head of the list (they should go to the head of the list in the same order as they were found from the tail, rather than reverse order). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-20No need for crossing to mountpoint in audit_tag_tree()Al Viro
is_under() will DTRT anyway. And yes, is_subdir() behaviour is intentional. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-31Trim includes of fdtable.hAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-31Get rid of indirect include of fs_struct.hAl Viro
Don't pull it in sched.h; very few files actually need it and those can include directly. sched.h itself only needs forward declaration of struct fs_struct; Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-27cleanup d_add_ciChristoph Hellwig
Make sure that comments describe what's going on and not how, and always use __d_instantiate instead of two separate branches, one with d_instantiate and one with __d_instantiate. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-02-27EXPORT_SYMBOL(d_obtain_alias) rather than EXPORT_SYMBOL_GPLBenny Halevy
Commit 4ea3ada2955e4519befa98ff55dd62d6dfbd1705 declares d_obtain_alias() as EXPORT_SYMBOL_GPL where it's supposed to replace d_alloc_anon which was previously declared as EXPORT_SYMBOL and thus available to any loadable module. This patch reverts that. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-14[CVE-2009-0029] System call wrappers part 20Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-08generic swap(): dcache: use swap() instead of private do_switch()Wu Fengguang
Use the new generic implementation. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-31filp_cachep can be static in fs/file_table.cEric Dumazet
Instead of creating the "filp" kmem_cache in vfs_caches_init(), we can do it a litle be later in files_init(), so that filp_cachep is static to fs/file_table.c Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31expand some comments (d_path / seq_path)Arjan van de Ven
Explain that you really need to use the return value of d_path rather than the buffer you passed into it. Also fix the comment for seq_path(), the function arguments changed recently but the comment hadn't been updated in sync. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31correct wrong function name of d_put in kernel document and source commentZhaolei
no function named d_put(), it should be dput(). Impact: fix document and comment, no functionality changed Signed-off-by: Zhao Lei <zhaolei@cn.fuijtsu.com> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31fix switch_names() breakage in short-to-short caseAl Viro
We want ->name.len to match the resulting name on *both* source and target Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31shrink struct dentryNick Piggin
struct dentry is one of the most critical structures in the kernel. So it's sad to see it going neglected. With CONFIG_PROFILING turned on (which is probably the common case at least for distros and kernel developers), sizeof(struct dcache) == 208 here (64-bit). This gives 19 objects per slab. I packed d_mounted into a hole, and took another 4 bytes off the inline name length to take the padding out from the end of the structure. This shinks it to 200 bytes. I could have gone the other way and increased the length to 40, but I'm aiming for a magic number, read on... I then got rid of the d_cookie pointer. This shrinks it to 192 bytes. Rant: why was this ever a good idea? The cookie system should increase its hash size or use a tree or something if lookups are a problem. Also the "fast dcookie lookups" in oprofile should be moved into the dcookie code -- how can oprofile possibly care about the dcookie_mutex? It gets dropped after get_dcookie() returns so it can't be providing any sort of protection. At 192 bytes, 21 objects fit into a 4K page, saving about 3MB on my system with ~140 000 entries allocated. 192 is also a multiple of 64, so we get nice cacheline alignment on 64 and 32 byte line systems -- any given dentry will now require 3 cachelines to touch all fields wheras previously it would require 4. I know the inline name size was chosen quite carefully, however with the reduction in cacheline footprint, it should actually be just about as fast to do a name lookup for a 36 character name as it was before the patch (and faster for other sizes). The memory footprint savings for names which are <= 32 or > 36 bytes long should more than make up for the memory cost for 33-36 byte names. Performance is a feature... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23[PATCH] fs: add a sanity check in d_freeArjan van de Ven
Hi Al, remember that debug session we did at KS? You suggested this patch back then.... From 7751eaf30474b8cbfaea64795805a17eab05ac53 Mon Sep 17 00:00:00 2001 From: Arjan van de Ven <arjan@linux.intel.com> Date: Tue, 16 Sep 2008 16:51:17 -0700 Subject: [PATCH] fs: add a sanity check in d_free we're seeing some corruption in the dentry->d_alias list that appears like a free of an entry still on the list; this patch adds a WARN_ON() to catch this scenario, as suggested by Al Viro Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>