aboutsummaryrefslogtreecommitdiff
path: root/fs/ceph/inode.c
AgeCommit message (Collapse)Author
2010-10-20ceph: factor out libceph from Ceph file systemYehuda Sadeh
This factors out protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well: - ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces. - Mount option parsing and debugfs setup is correspondingly broken into two pieces. - The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case). - The basic supported/required feature bits can be expanded (and are by ceph_fs_client). No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process. Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-13ceph: fix dn offset during readdir_prepopulateSage Weil
When adding the readdir results to the cache, ceph_set_dentry_offset was clobbered our just-set offset. This can cause the readdir result offsets to get out of sync with the server. Add an argument to the helper so that it does not. This bug was introduced by 1cd3935bedccf592d44343890251452a6dd74fc4. Signed-off-by: Sage Weil <sage@newdream.net>
2010-08-25ceph: ceph_get_inode() returns an ERR_PTRDan Carpenter
ceph_get_inode() returns an ERR_PTR and it doesn't return a NULL. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-08-22ceph: don't improperly set dir complete when holding EXCL capSage Weil
If we hold the EXCL cap, we cannot trust the dir stats from the MDS (num files, subdirs) and must not incorrectly conclude that the directory is empty. If we do, we get can bad results from lookup (bad ENOENT) and bad readdir results. Signed-off-by: Sage Weil <sage@newdream.net>
2010-08-01ceph: perform lazy reads when file mode and caps permitSage Weil
If the file mode is marked as "lazy," perform cached/buffered reads when the caps permit it. Adjust the rdcache_gen and invalidation logic accordingly so that we manage our cache based on the FILE_CACHE -or- FILE_LAZYIO cap bits. Signed-off-by: Sage Weil <sage@newdream.net>
2010-07-27ceph: use complete_all and wake_up_allYehuda Sadeh
This fixes an issue triggered by running concurrent syncs. One of the syncs would go through while the other would just hang indefinitely. In any case, we never actually want to wake a single waiter, so the *_all functions should be used. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-07-23ceph: fix leak of dentry in ceph_init_dentry() error pathSage Weil
If we fail to allocate a ceph_dentry_info, don't leak the dn reference. Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-21ceph: handle splice_dentry/d_materialize_unique error in readdir_prepopulateSage Weil
Handle a splice_dentry failure (due to a d_materialize_unique error) without crashing. (Also, report the error code.) Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-01ceph: fix d_subdirs ordering problemHenry C Chang
We misused list_move_tail() to order the dentry in d_subdirs. This will screw up the d_subdirs order. This bug can be reliably reproduced by: 1. mount ceph fs. 2. on ceph fs, git clone git://ceph.newdream.net/git/ceph.git 3. Run autogen.sh in ceph directory. (Note: Errors only occur at the first time you run autogen.sh.) Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-29fs/ceph: Use ERR_CASTJulia Lawall
Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes more clear what is the purpose of the operation, which otherwise looks like a no-op. In the case of fs/ceph/inode.c, ERR_CAST is not needed, because the type of the returned value is the same as the type of the enclosing function. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ type T; T x; identifier f; @@ T f (...) { <+... - ERR_PTR(PTR_ERR(x)) + x ...+> } @@ expression x; @@ - ERR_PTR(PTR_ERR(x)) + ERR_CAST(x) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: use common helper for aborted dir request invalidationSage Weil
We invalidate I_COMPLETE and dentry leases in two places: on aborted mds request and on request replay. Use common helper to avoid duplicate code. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: set dn offset when splicedSage Weil
We want to assign an offset when the dentry goes from null to linked, which is always done by splice_dentry(). Notably, we should NOT assign an offset when a dentry is first created and is still null. BUG if we try to splice a non-null dentry (we shouldn't). Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: don't clobber i_max_offset on already complete dirSage Weil
This can screw up offsets assigned to new dentries and break dcache readdir results. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: skip set_dentry_offset work if directory not I_COMPLETESage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: fix xattr dangling pointer / double freeSage Weil
If we use the xattr_blob, clear the pointer so we don't release the memory at the bottom of the fuction. Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: use ceph_sb_to_client instead of ceph_clientCheng Renquan
ceph_sb_to_client and ceph_client are really identical, we need to dump one; while function ceph_client is confusing with "struct ceph_client", ceph_sb_to_client's definition is more clear; so we'd better switch all call to ceph_sb_to_client. -static inline struct ceph_client *ceph_client(struct super_block *sb) -{ - return sb->s_fs_info; -} Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: invalidate affected dentry leases on aborted requestsSage Weil
If we abort a request, we return to caller, but the request may still complete. And if we hold the dir FILE_EXCL bit, we may not release a lease when sending a request. A simple un-tar, control-c, un-tar again will reproduce the bug (manifested as a 'Cannot open: File exists'). Ensure we invalidate affected dentry leases (as well dir I_COMPLETE) so we don't have valid (but incorrect) leases. Do the same, consistently, at other sites where I_COMPLETE is similarly cleared. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: fix open file counting on snapped inodes when mds returns no capsSage Weil
It's possible the MDS will not issue caps on a snapped inode, in which case an open request may not __ceph_get_fmode(), botching the open file counting. (This is actually a server bug, but the client shouldn't BUG out in this case.) Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-03ceph: clear dir complete on d_moveSage Weil
d_move() reorders the d_subdirs list, breaking the readdir result caching. Unless/until d_move preserves that ordering, clear CEPH_I_COMPLETE on rename. Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-30ceph: fix dentry rehashing on virtual .snap dirSage Weil
If a lookup fails on the magic .snap directory, we bind it to a magic snap directory inode in ceph_lookup_finish(). That code assumes the dentry is unhashed, but a recent server-side change started returning NULL leases on lookup failure, causing the .snap dentry to be hashed and NULL by ceph_fill_trace(). This causes dentry hash chain corruption, or a dies when d_rehash() includes BUG_ON(!d_unhashed(entry)); So, avoid processing the NULL dentry lease if it the dentry matches the snapdir name in ceph_fill_trace(). That allows the lookup completion to properly bind it to the snapdir inode. BUG there if dentry is hashed to be sure. Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-20ceph: fix inode removal from snap realm when racing with migrationSage Weil
When an inode was dropped while being migrated between two MDSs, i_cap_exporting_issued was non-zero such that issue caps were non-zero and __ceph_is_any_caps(ci) was true. This prevented the inode from being removed from the snap realm, even as it was dropped from the cache. Fix this by dropping any residual i_snap_realm ref in destroy_inode. Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-19ceph: don't truncate dirty pages in invalidate work threadYehuda Sadeh
Instead of truncating the whole range of pages, we skip those pages that are dirty or in the middle of writeback. Those pages will be cleared later when the writeback completes. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-17ceph: fix typo in ceph_queue_writeback debug outputSage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-11ceph: cleanup async writeback, truncation, invalidate helpersSage Weil
Grab inode ref in helper. Make work functions static, with consistent naming. Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-11ceph: fix truncation when not holding capsYehuda Sadeh
A truncation should occur when either we have the specified caps for the file, or (in cases where we are not the only ones referencing the file) when it is mapped or when it is opened. The latter two cases were not handled. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-01-29ceph: remove unreachable codeYehuda Sadeh
We never truncate to a smaller size without contacting the MDS. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-01-25ceph: properly handle aborted mds requestsSage Weil
Previously, if the MDS request was interrupted, we would unregister the request and ignore any reply. This could cause the caps or other cache state to become out of sync. (For instance, aborting dbench and doing rm -r on clients would complain about a non-empty directory because the client didn't realize it's aborted file create request completed.) Even we don't unregister, we still can't process the reply normally because we are no longer holding the caller's locks (like the dir i_mutex). So, mark aborted operations with r_aborted, and in the reply handler, be sure to process all the caps. Do not process the namespace changes, though, since we no longer will hold the dir i_mutex. The dentry lease state can also be ignored as it's more forgiving. Signed-off-by: Sage Weil <sage@newdream.net>
2010-01-14ceph: change dentry offset and position after splice_dentryYehuda Sadeh
This fixes a bug, where we had the parent list have dentries with offsets that are not monotonically increasing, which caused the ceph dcache_readdir to skip entries. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-21ceph: ensure rename target dentry fails revalidationSage Weil
This works around a bug in vfs_rename_dir() that rehashes the target dentry. Ensure such dentries always fail revalidation by timing out the dentry lease and kicking it out of the current directory lease gen. This can be reverted when the vfs bug is fixed. Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-07ceph: simplify ceph_buffer interfaceSage Weil
We never allocate the ceph_buffer and buffer separtely, so use a single constructor. Disallow put on NULL buffer; make the caller check. Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-11ceph: initialize i_size/i_rbytes on snapdirSage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-21ceph: move directory size logic to ceph_getattrSage Weil
We can't fill i_size with rbytes at the fill_file_size stage without adding additional checks for directories. Notably, we want st_blocks to remain 0 on directories so that 'du' still works. Fill in i_blocks, i_size specially in ceph_getattr instead. Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-06ceph: inode operationsSage Weil
Inode cache and inode operations. We also include routines to incorporate metadata structures returned by the MDS into the client cache, and some helpers to deal with file capabilities and metadata leases. The bulk of that work is done by fill_inode() and fill_trace(). Signed-off-by: Sage Weil <sage@newdream.net>