Age | Commit message (Collapse) | Author |
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
'BTRFS_ORDERED_REGULAR' was introduced for the cow case in patch
'Btrfs: specify a new ordered extent type for create_io_em',
but it missed the directIO cow case.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
If we are deduping two ranges of the same file we need to make sure that
we lock all pages in ascending order, that is, lock first the pages from
the range with lower offset and then the pages from the other range, as
otherwise we can deadlock with a concurrent task that is starting delalloc
(writeback). Example trace:
[74073.052218] INFO: task kworker/u32:10:17997 blocked for more than 120 seconds.
[74073.053889] Tainted: G W 4.9.0-rc7-btrfs-next-36+ #1
[74073.055071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[74073.056696] kworker/u32:10 D 0 17997 2 0x00000000
[74073.058606] Workqueue: writeback wb_workfn (flush-btrfs-53176)
[74073.061370] ffff880031e79858 ffff8802159d2580 ffff880237004580 ffff880031e79240
[74073.064784] ffff88023f4978c0 ffffc9000817b638 ffffffff814c15e1 0000000000000000
[74073.068386] ffff88023f4978d8 ffff88023f4978c0 000000000017b620 ffff880031e79240
[74073.071712] Call Trace:
[74073.072884] [<ffffffff814c15e1>] ? __schedule+0x48f/0x6f4
[74073.075395] [<ffffffff814c1c8b>] ? bit_wait+0x2f/0x2f
[74073.077511] [<ffffffff814c18d2>] schedule+0x8c/0xa0
[74073.079440] [<ffffffff814c4b36>] schedule_timeout+0x43/0xff
[74073.081637] [<ffffffff8110953e>] ? time_hardirqs_on+0x9/0x14
[74073.083809] [<ffffffff81095c67>] ? trace_hardirqs_on_caller+0x16/0x197
[74073.086314] [<ffffffff810bde98>] ? timekeeping_get_ns+0x1e/0x32
[74073.100654] [<ffffffff810be048>] ? ktime_get+0x41/0x52
[74073.102619] [<ffffffff814c10f0>] io_schedule_timeout+0xa0/0x102
[74073.104771] [<ffffffff814c10f0>] ? io_schedule_timeout+0xa0/0x102
[74073.106969] [<ffffffff814c1ca6>] bit_wait_io+0x1b/0x39
[74073.108954] [<ffffffff814c1fb8>] __wait_on_bit_lock+0x4f/0x99
[74073.110981] [<ffffffff8112b692>] __lock_page+0x6b/0x6d
[74073.112833] [<ffffffff8108ceb4>] ? autoremove_wake_function+0x3a/0x3a
[74073.115010] [<ffffffffa031178b>] lock_page+0x2f/0x32 [btrfs]
[74073.116999] [<ffffffffa0311d9f>] lock_delalloc_pages+0xc7/0x1a0 [btrfs]
[74073.119243] [<ffffffffa0313d15>] find_lock_delalloc_range+0xc3/0x1a4 [btrfs]
[74073.121636] [<ffffffffa0313e81>] writepage_delalloc.isra.31+0x8b/0x134 [btrfs]
[74073.124229] [<ffffffffa0315d69>] __extent_writepage+0x1c1/0x2bf [btrfs]
[74073.126372] [<ffffffffa03160f2>] extent_write_cache_pages.isra.30.constprop.49+0x28b/0x36c [btrfs]
[74073.129371] [<ffffffffa03165b9>] extent_writepages+0x4b/0x5c [btrfs]
[74073.131440] [<ffffffffa02fcb59>] ? insert_reserved_file_extent.constprop.42+0x261/0x261 [btrfs]
[74073.134303] [<ffffffff811b4ce4>] ? writeback_sb_inodes+0xe0/0x4a1
[74073.136298] [<ffffffffa02fab7f>] btrfs_writepages+0x28/0x2a [btrfs]
[74073.138248] [<ffffffff81138200>] do_writepages+0x23/0x2c
[74073.139910] [<ffffffff811b3cab>] __writeback_single_inode+0x105/0x6d2
[74073.142003] [<ffffffff811b4e96>] writeback_sb_inodes+0x292/0x4a1
[74073.136298] [<ffffffffa02fab7f>] btrfs_writepages+0x28/0x2a [btrfs]
[74073.138248] [<ffffffff81138200>] do_writepages+0x23/0x2c
[74073.139910] [<ffffffff811b3cab>] __writeback_single_inode+0x105/0x6d2
[74073.142003] [<ffffffff811b4e96>] writeback_sb_inodes+0x292/0x4a1
[74073.143911] [<ffffffff811b511b>] __writeback_inodes_wb+0x76/0xae
[74073.145787] [<ffffffff811b53ca>] wb_writeback+0x1cc/0x4d7
[74073.147452] [<ffffffff811b60cd>] wb_workfn+0x194/0x37d
[74073.149084] [<ffffffff811b60cd>] ? wb_workfn+0x194/0x37d
[74073.150726] [<ffffffff8106ce77>] ? process_one_work+0x154/0x4e4
[74073.152694] [<ffffffff8106cf96>] process_one_work+0x273/0x4e4
[74073.154452] [<ffffffff8106d6db>] worker_thread+0x1eb/0x2ca
[74073.156138] [<ffffffff8106d4f0>] ? rescuer_thread+0x2b6/0x2b6
[74073.157837] [<ffffffff81072a81>] kthread+0xd5/0xdd
[74073.159339] [<ffffffff810729ac>] ? __kthread_unpark+0x5a/0x5a
[74073.161088] [<ffffffff814c6257>] ret_from_fork+0x27/0x40
[74073.162680] INFO: lockdep is turned off.
[74073.163855] INFO: task do-dedup:30264 blocked for more than 120 seconds.
[74073.181180] Tainted: G W 4.9.0-rc7-btrfs-next-36+ #1
[74073.181180] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[74073.185296] fdm-stress D 0 30264 29974 0x00000000
[74073.186810] ffff880089595118 ffff880211b8eac0 ffff880237030380 ffff880089594b00
[74073.188998] ffff88023f2978c0 ffffc900063abb68 ffffffff814c15e1 0000000000000000
[74073.191070] ffff88023f2978d8 ffff88023f2978c0 00000000003abb50 ffff880089594b00
[74073.193286] Call Trace:
[74073.193990] [<ffffffff814c15e1>] ? __schedule+0x48f/0x6f4
[74073.195418] [<ffffffff814c1c8b>] ? bit_wait+0x2f/0x2f
[74073.196796] [<ffffffff814c18d2>] schedule+0x8c/0xa0
[74073.198163] [<ffffffff814c4b36>] schedule_timeout+0x43/0xff
[74073.199621] [<ffffffff81095df5>] ? trace_hardirqs_on+0xd/0xf
[74073.201100] [<ffffffff810bde98>] ? timekeeping_get_ns+0x1e/0x32
[74073.202686] [<ffffffff810be048>] ? ktime_get+0x41/0x52
[74073.204051] [<ffffffff814c10f0>] io_schedule_timeout+0xa0/0x102
[74073.205585] [<ffffffff814c10f0>] ? io_schedule_timeout+0xa0/0x102
[74073.207123] [<ffffffff814c1ca6>] bit_wait_io+0x1b/0x39
[74073.208238] [<ffffffff814c1fb8>] __wait_on_bit_lock+0x4f/0x99
[74073.208871] [<ffffffff8112b692>] __lock_page+0x6b/0x6d
[74073.209430] [<ffffffff8108ceb4>] ? autoremove_wake_function+0x3a/0x3a
[74073.210101] [<ffffffff8112b800>] lock_page+0x2f/0x32
[74073.210636] [<ffffffff8112c502>] pagecache_get_page+0x5e/0x153
[74073.211270] [<ffffffffa03257eb>] gather_extent_pages+0x4e/0x109 [btrfs]
[74073.212166] [<ffffffffa032a04c>] btrfs_dedupe_file_range+0x1e1/0x4dd [btrfs]
[74073.213257] [<ffffffff8118d9b5>] vfs_dedupe_file_range+0x1c1/0x221
[74073.214086] [<ffffffff8119e0c4>] do_vfs_ioctl+0x442/0x600
[74073.214767] [<ffffffff811a7874>] ? rcu_read_unlock+0x5b/0x5d
[74073.215619] [<ffffffff811a7953>] ? __fget+0x6b/0x77
[74073.216338] [<ffffffff8119e2d9>] SyS_ioctl+0x57/0x79
[74073.217149] [<ffffffff814c5fea>] entry_SYSCALL_64_fastpath+0x18/0xad
[74073.218102] [<ffffffff81109552>] ? time_hardirqs_off+0x9/0x14
[74073.218968] [<ffffffff810938ce>] ? trace_hardirqs_off_caller+0x1f/0xaa
[74073.219938] INFO: lockdep is turned off.
What happened was the following:
CPU 1 CPU 2
btrfs_dedupe_file_range()
--> using same inode as source
and target
--> src range is [768K, 1Mb[
--> dst range is [0, 256K[
btrfs_cmp_data_prepare()
--> calls gather_extent_pages()
for range [768K, 1Mb[ and
locks all pages in that range
do_writepages()
btrfs_writepages()
extent_writepages()
extent_write_cache_pages()
__extent_writepage()
writepage_delalloc()
find_lock_delalloc_range()
--> finds range [0, 1Mb[
lock_delalloc_pages()
--> locks all pages in the
range [0, 768K[
--> tries to lock page at
offset 768K
--> deadlock
--> calls gather_extent_pages()
to lock pages in the range
[0, 256K[
--> deadlock, task at CPU 1
already locked that
range and it's trying
to lock the range we
locked previously
So fix this by making sure that during a dedup we always lock first the
pages from the range with lower offset.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Commit e5d6b12fe14 (Btrfs: don't WARN() in btrfs_transaction_abort() for
IO errors) added a pr_debug call to be printed when a transaction is
aborted with -EIO instead of WARN. btrfs_debug prints which file system
the message is associated with so let's use that instead.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_truncate_free_space_cache always allocates a btrfs_path structure
but only uses it when the caller passes a block group. Let's move the
allocation and free into the conditional.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The free space cache APIs accept a root but always use the tree root.
Also, btrfs_truncate_free_space_cache accepts a root AND an inode but
the inode always points to the root anyway, so let's just pass the inode.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_inc_block_group_ro is either passed the extent root or the dev
root, but it doesn't do anything with the dev tree. Let's convert
to passing an fs_info and using the extent root.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We don't need to pass a root to flush_space since it always uses
the fs_root.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Outside of interactions with qgroups, the roots passed in extent-tree.c
are usually passed to ensure that we don't do refcounts on log trees or
to get the allocation profile for an allocation request. Otherwise, it
operates on the extent root. This patch converts some more routines in
extent-tree.c that are always called with the extent root to accept
an fs_info instead.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Just as Filipe pointed out, the most time consuming parts of qgroup are
btrfs_qgroup_account_extents() and
btrfs_qgroup_prepare_account_extents().
Which both call btrfs_find_all_roots() to get old_roots and new_roots
ulist.
What makes things worse is, we're calling that expensive
btrfs_find_all_roots() at transaction committing time with
TRANS_STATE_COMMIT_DOING, which will blocks all incoming transaction.
Such behavior is necessary for @new_roots search as current
btrfs_find_all_roots() can't do it correctly so we do call it just
before switch commit roots.
However for @old_roots search, it's not necessary as such search is
based on commit_root, so it will always be correct and we can move it
out of transaction committing.
This patch moves the @old_roots search part out of
commit_transaction(), so in theory we can half the time qgroup time
consumption at commit_transaction().
But please note that, this won't speedup qgroup overall, the total time
consumption is still the same, just reduce the performance stall.
Cc: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Never used.
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Both unused after the call to update_cache_item has been moved to
__btrfs_wait_cache_io.
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
bitmap_list is unused since the io_ctl framework.
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Unused since the helper has been split, eb used in the caller.
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Never used.
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|