From 2d077d4b59924acd1f5180c6fb73b57f4771fde6 Mon Sep 17 00:00:00 2001
From: Hugh Dickins
Date: Fri, 1 Jun 2018 16:50:45 -0700
Subject: mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()

Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
eager, but I'm not sure that is relevant) soon hung uninterruptibly,
waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
often when "cp -a" was trying to write to a smallish file.  Debug showed
that the page in question was not locked, and page->mapping NULL by now,
but page->index consistent with having been in a huge page before.

Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764
("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
added in; but took hours to reproduce on a 4.17 kernel (no idea why).

The culprit proved to be the __ClearPageDirty() on tails beyond i_size
in __split_huge_page(): the non-atomic __bit operation may have been
safe when 4.8's baa355fd3314 ("thp: file pages support for
split_huge_page()") introduced it, but is liable to erase PageWaiters
after 4.10's 62906027091f ("mm: add PageWaiters indicating tasks are
waiting for a page bit").

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils
Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Cc: Konstantin Khlebnikov
Cc: Nicholas Piggin
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a3a1815f8e11..b9f3dbd885bd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2431,7 +2431,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
                 __split_huge_page_tail(head, i, lruvec, list);
                 /* Some pages can be beyond i_size: drop them from page cache */
                 if (head[i].index >= end) {
-                        __ClearPageDirty(head + i);
+                        ClearPageDirty(head + i);
                         __delete_from_page_cache(head + i, NULL);
                         if (IS_ENABLED(CONFIG_SHMEM) && PageSwapBacked(head))
                                 shmem_uncharge(head->mapping->host, 1);
--
cgit v1.2.3
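
Illustration, not part of the patch above: a minimal userspace sketch, with
made-up bit values and a single-threaded replay of the interleaving, of the
lost-update hazard the change guards against.  A plain read-modify-write such
as __ClearPageDirty() can overwrite a PG_waiters bit that another CPU sets
atomically in the same flags word in between; the atomic ClearPageDirty()
(clear_bit()) cannot.

/*
 * Illustration only -- not kernel code.  Replays, step by step on one
 * thread, the interleaving in which a plain (non-atomic) clear of
 * PG_dirty wipes out a PG_waiters bit set concurrently by another CPU.
 * The bit numbers are invented for the demo.
 */
#include <stdio.h>

#define PG_dirty   (1UL << 0)
#define PG_waiters (1UL << 1)

int main(void)
{
        unsigned long flags = PG_dirty;  /* tail page dirty, no waiters yet */

        /* CPU 0: __ClearPageDirty() begins a plain read-modify-write */
        unsigned long old = flags;       /* sees PG_dirty set, PG_waiters clear */

        /* CPU 1: a lock_page() waiter sets PG_waiters atomically meanwhile */
        flags |= PG_waiters;

        /* CPU 0: writes back its stale value, erasing PG_waiters */
        flags = old & ~PG_dirty;

        printf("PG_waiters %s\n", (flags & PG_waiters) ?
               "survived" : "lost: unlock_page() will never wake the waiter");
        return 0;
}
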
From 145e1a71e090575c74969e3daa8136d1e5b99fc8 Mon Sep 17 00:00:00 2001
From: Hugh Dickins
Date: Fri, 1 Jun 2018 16:50:50 -0700
Subject: mm: fix the NULL mapping case in __isolate_lru_page()

George Boole would have noticed a slight error in 4.16 commit
69d763fc6d3a ("mm: pin address_space before dereferencing it while
isolating an LRU page").  Fix it, to match both the comment above it,
and the original behaviour.

Although anonymous pages are not marked PageDirty at first, we have an
old habit of calling SetPageDirty when a page is removed from swap
cache: so there's a category of ex-swap pages that are easily
migratable, but were inadvertently excluded from compaction's async
migration in 4.16.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805302014001.12558@eggly.anvils
Fixes: 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page")
Signed-off-by: Hugh Dickins
Acked-by: Minchan Kim
Acked-by: Mel Gorman
Reported-by: Ivan Kalvachev
Cc: "Huang, Ying"
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9b697323a88c..9270a4370d54 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1418,7 +1418,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode)
                                 return ret;
 
                         mapping = page_mapping(page);
-                        migrate_dirty = mapping && mapping->a_ops->migratepage;
+                        migrate_dirty = !mapping || mapping->a_ops->migratepage;
                         unlock_page(page);
                         if (!migrate_dirty)
                                 return ret;
--
cgit v1.2.3
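
Illustration, not part of the patch above: a small stand-alone sketch of the
truth table behind the one-line fix, using stand-in types rather than the
kernel's.  The 4.16 condition excluded pages with a NULL mapping (anonymous
pages, including the ex-swap pages mentioned above) from async migration; the
corrected condition treats them as migratable without blocking.

/*
 * Illustration only -- not kernel code; the types are stand-ins and the
 * bool field stands in for the ->migratepage function pointer.  Compares
 * the 4.16 condition from __isolate_lru_page() with the fixed one for the
 * three relevant cases.
 */
#include <stdbool.h>
#include <stdio.h>

struct address_space_ops { bool has_migratepage; };
struct address_space { struct address_space_ops a_ops; };

static const char *verdict(bool ok)
{
        return ok ? "may migrate" : "excluded";
}

int main(void)
{
        struct address_space with_cb = { { true } };
        struct address_space without_cb = { { false } };
        struct address_space *cases[] = { NULL, &with_cb, &without_cb };
        const char *names[] = {
                "NULL mapping (anon / ex-swap page)",
                "mapping with ->migratepage",
                "mapping without ->migratepage",
        };

        for (int i = 0; i < 3; i++) {
                struct address_space *mapping = cases[i];
                /* 4.16 bug: a NULL mapping wrongly evaluates to false */
                bool buggy = mapping && mapping->a_ops.has_migratepage;
                /* fix: a NULL mapping can be migrated without blocking */
                bool fixed = !mapping || mapping->a_ops.has_migratepage;
                printf("%-36s buggy: %-12s fixed: %s\n",
                       names[i], verdict(buggy), verdict(fixed));
        }
        return 0;
}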