author     Linus Torvalds  2021-04-27 13:08:12 -0700
committer  Linus Torvalds  2021-04-27 13:08:12 -0700
commit     820c4bae40cb56466cfed6409e00d0f5165a990c (patch)
tree       0b0b49ae9b61e4dbb04f08ad91987e91d062a401 /mm/filemap.c
parent     34a456eb1fe26303d0661693d01a50e83a551da3 (diff)
parent     53b776c77aca99b663a5512a04abc27670d61058 (diff)
Merge tag 'netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull network filesystem helper library updates from David Howells:
"Here's a set of patches for 5.13 to begin the process of overhauling
the local caching API for network filesystems. This set consists of
two parts:
(1) Add a helper library to handle the new VM readahead interface.
This is intended to be used unconditionally by the filesystem
(whether or not caching is enabled) and provides a common
framework for doing caching, transparent huge pages and, in the
future, possibly fscrypt and read bandwidth maximisation. It also
allows the netfs and the cache to align, expand and slice up a
read request from the VM in various ways; the netfs need only
provide a function to read a stretch of data into the pagecache,
and the helper takes care of the rest.
(2) Add an alternative fscache/cachefiles I/O API that uses the kiocb
facility to do async DIO to transfer data to/from the netfs's
pages, rather than using readpage with wait queue snooping on one
side and vfs_write() on the other. It also uses less memory, since
it doesn't do buffered I/O on the backing file.
Note that this uses SEEK_HOLE/SEEK_DATA to locate the data
available to be read from the cache. Whilst this is an improvement
over the bmap interface, it still has a problem with regard to a
modern extent-based filesystem inserting or removing bridging
blocks of zeros. Fixing that requires a much greater overhaul.
This is a step towards overhauling the fscache API. The change is
opt-in on the part of the network filesystem. A netfs should not try
to mix the old and the new APIs: they handle pages and the PG_fscache
page flag in conflicting ways, and mixing them would also mean mixing
DIO with buffered I/O. Further, the helper library can't be used with
the old API.
This does not change any of the fscache cookie handling APIs or the
way invalidation is done at this time.
In the near term, I intend to deprecate and remove the old I/O API
(fscache_allocate_page{,s}(), fscache_read_or_alloc_page{,s}(),
fscache_write_page() and fscache_uncache_page()) and eventually
replace most of fscache/cachefiles with something simpler and easier
to follow.
This patchset contains the following parts:
- Some helper patches, including provision of an ITER_XARRAY iov
iterator and a function to do readahead expansion.
- Patches to add the netfs helper library (the netfs-side hook is
sketched after this list).
- A patch to add the fscache/cachefiles kiocb API.
- A pair of patches to fix some review issues in the ITER_XARRAY and
read helpers as spotted by Al and Willy.
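To give a feel for the helper-library contract described in part (1), here is a minimal sketch of the netfs-side hook, based on the netfs_read_request_ops / netfs_subreq_terminated() interface this series adds; the my_fs_* names are hypothetical.

	#include <linux/netfs.h>

	/* Hypothetical transport call into the filesystem's own RPC
	 * layer; returns bytes read or a negative error. */
	static ssize_t my_fs_fetch_from_server(struct netfs_read_subrequest *subreq);

	/* The central hook a netfs must provide: read the stretch of data
	 * described by @subreq (subreq->start, subreq->len) into the
	 * pagecache pages the helper has prepared, then report the
	 * outcome. */
	static void my_fs_issue_op(struct netfs_read_subrequest *subreq)
	{
		ssize_t ret = my_fs_fetch_from_server(subreq);

		netfs_subreq_terminated(subreq, ret, false);
	}

	static const struct netfs_read_request_ops my_fs_req_ops = {
		.issue_op = my_fs_issue_op,
	};

The filesystem's ->readahead() and ->readpage() address_space operations then call the library's netfs_readahead() and netfs_readpage() helpers with these ops, and the library handles alignment, slicing and cache interaction.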
Jeff Layton has patches to add Ceph support for this, which he
intends to land in this merge window. I have a set of patches to
support AFS that I will post as a separate pull request.
With this, AFS without a cache passes all expected xfstests; with a
cache, there's an extra failure, but that's also there before these
patches. Fixing that probably requires a greater overhaul. Ceph also
passes the expected tests.
I also have patches in a separate branch to tidy up the handling of
PG_fscache/PG_private_2 and their contribution to page refcounting in
the core kernel here, but I haven't included them in this set and will
route them separately"
Link: https://lore.kernel.org/lkml/3779937.1619478404@warthog.procyon.org.uk/
* tag 'netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
netfs: Miscellaneous fixes
iov_iter: Four fixes for ITER_XARRAY
fscache, cachefiles: Add alternate API to use kiocb for read/write to cache
netfs: Add a tracepoint to log failures that would be otherwise unseen
netfs: Define an interface to talk to a cache
netfs: Add write_begin helper
netfs: Gather stats
netfs: Add tracepoints
netfs: Provide readahead and readpage netfs helpers
netfs, mm: Add set/end/wait_on_page_fscache() aliases
netfs, mm: Move PG_fscache helper funcs to linux/netfs.h
netfs: Documentation for helper library
netfs: Make a netfs helper module
mm: Implement readahead_control pageset expansion
mm/readahead: Handle ractl nr_pages being modified
fs: Document file_ra_state
mm/filemap: Pass the file_ra_state in the ractl
mm: Add set/end/wait functions for PG_private_2
iov_iter: Add ITER_XARRAY
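Two of these commits lend themselves to short illustrations. First, ITER_XARRAY lets an iov_iter describe pages held in an xarray (such as a mapping's i_pages), which is what allows kiocb-based DIO to target pagecache pages directly. A minimal sketch, assuming the iov_iter_xarray() initialiser from the last commit above:

	#include <linux/uio.h>
	#include <linux/pagemap.h>

	/* Sketch: describe the pagecache pages covering [start, start +
	 * len) as the destination of a transfer.  READ means data will be
	 * written into the iterator's pages, as in a read from the
	 * cache. */
	static void describe_read_dest(struct iov_iter *iter,
				       struct address_space *mapping,
				       loff_t start, size_t len)
	{
		iov_iter_xarray(iter, READ, &mapping->i_pages, start, len);
	}

Second, readahead_control pageset expansion gives a netfs a way to widen a readahead to cache-friendly boundaries. A sketch, assuming readahead_expand() from that commit and a hypothetical 256KiB cache granule:

	/* Sketch: round a readahead out to (hypothetical) 256KiB granules
	 * so that whole cache blocks can be read or written in one go. */
	static void my_fs_expand_readahead(struct readahead_control *ractl)
	{
		loff_t start = round_down(readahead_pos(ractl), 256 * 1024);
		size_t len = round_up(readahead_pos(ractl) +
				      readahead_length(ractl),
				      256 * 1024) - start;

		readahead_expand(ractl, start, len);
	}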
Diffstat (limited to 'mm/filemap.c')
-rw-r--r--  mm/filemap.c  65
1 file changed, 63 insertions(+), 2 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 6ce832dc59e7..151090fdcf29 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1433,6 +1433,67 @@ void unlock_page(struct page *page)
 EXPORT_SYMBOL(unlock_page);
 
 /**
+ * end_page_private_2 - Clear PG_private_2 and release any waiters
+ * @page: The page
+ *
+ * Clear the PG_private_2 bit on a page and wake up any sleepers waiting for
+ * this. The page ref held for PG_private_2 being set is released.
+ *
+ * This is, for example, used when a netfs page is being written to a local
+ * disk cache, thereby allowing writes to the cache for the same page to be
+ * serialised.
+ */
+void end_page_private_2(struct page *page)
+{
+	page = compound_head(page);
+	VM_BUG_ON_PAGE(!PagePrivate2(page), page);
+	clear_bit_unlock(PG_private_2, &page->flags);
+	wake_up_page_bit(page, PG_private_2);
+	put_page(page);
+}
+EXPORT_SYMBOL(end_page_private_2);
+
+/**
+ * wait_on_page_private_2 - Wait for PG_private_2 to be cleared on a page
+ * @page: The page to wait on
+ *
+ * Wait for PG_private_2 (aka PG_fscache) to be cleared on a page.
+ */
+void wait_on_page_private_2(struct page *page)
+{
+	page = compound_head(page);
+	while (PagePrivate2(page))
+		wait_on_page_bit(page, PG_private_2);
+}
+EXPORT_SYMBOL(wait_on_page_private_2);
+
+/**
+ * wait_on_page_private_2_killable - Wait for PG_private_2 to be cleared on a page
+ * @page: The page to wait on
+ *
+ * Wait for PG_private_2 (aka PG_fscache) to be cleared on a page or until a
+ * fatal signal is received by the calling task.
+ *
+ * Return:
+ * - 0 if successful.
+ * - -EINTR if a fatal signal was encountered.
+ */
+int wait_on_page_private_2_killable(struct page *page)
+{
+	int ret = 0;
+
+	page = compound_head(page);
+	while (PagePrivate2(page)) {
+		ret = wait_on_page_bit_killable(page, PG_private_2);
+		if (ret < 0)
+			break;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(wait_on_page_private_2_killable);
+
+/**
  * end_page_writeback - end writeback against a page
  * @page: the page
  */
@@ -2778,7 +2839,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
 	struct address_space *mapping = file->f_mapping;
-	DEFINE_READAHEAD(ractl, file, mapping, vmf->pgoff);
+	DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
 	struct file *fpin = NULL;
 	unsigned int mmap_miss;
 
@@ -2790,7 +2851,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 
 	if (vmf->vma->vm_flags & VM_SEQ_READ) {
 		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
-		page_cache_sync_ra(&ractl, ra, ra->ra_pages);
+		page_cache_sync_ra(&ractl, ra->ra_pages);
 		return fpin;
 	}
 
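For orientation, here is how a netfs might use the helpers added above. This is a sketch only: set_page_private_2() comes from the companion commit "mm: Add set/end/wait functions for PG_private_2", and the my_fs_* wrappers are hypothetical.

	#include <linux/pagemap.h>

	/* Mark a locked page as being written to the cache.  This takes a
	 * page ref and sets PG_private_2 (aka PG_fscache). */
	static void my_fs_start_cache_write(struct page *page)
	{
		set_page_private_2(page);
	}

	/* Called from the cache's write-completion path: clears the bit,
	 * wakes any waiters and drops the ref taken above. */
	static void my_fs_cache_write_done(struct page *page)
	{
		end_page_private_2(page);
	}

	/* Anything that must not overlap the cache write (e.g. a second
	 * write of the same page to the cache) waits here; interruptible
	 * by fatal signals. */
	static int my_fs_wait_for_cache_write(struct page *page)
	{
		return wait_on_page_private_2_killable(page);
	}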