aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-11-15arc: handle pgtable_page_ctor() failKirill A. Shutemov
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15alpha: handle pgtable_page_ctor() failKirill A. Shutemov
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15openrisc: add missing pgtable_page_ctor/dtor callsKirill A. Shutemov
It will fix NR_PAGETABLE accounting. It's also required if the arch is going ever support split ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Jonas Bonn <jonas@southpole.se> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mn10300: add missing pgtable_page_ctor/dtor callsKirill A. Shutemov
It will fix NR_PAGETABLE accounting. It's also required if the arch is going ever support split ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15microblaze: add missing pgtable_page_ctor/dtor callsKirill A. Shutemov
It will fix NR_PAGETABLE accounting. It's also required if the arch is going ever support split ptl. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: allow pgtable_page_ctor() to failKirill A. Shutemov
Change pgtable_page_ctor() return type from void to bool. Returns true, if initialization is successful and false otherwise. Current implementation never fails, but it will change later. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15xtensa: fix potential NULL-pointer dereferenceKirill A. Shutemov
Add missing check for memory allocation fail. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15m32r: fix potential NULL-pointer dereferenceKirill A. Shutemov
Add missing check for memory allocation fail. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15cris: fix potential NULL-pointer dereferenceKirill A. Shutemov
Add missing check for memory allocation fail. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mikael Starvik <starvik@axis.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmdsKirill A. Shutemov
In split page table lock case, we embed spinlock_t into struct page. For obvious reason, we don't want to increase size of struct page if spinlock_t is too big, like with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or on -rt kernel. So we disable split page table lock, if spinlock_t is too big. This patchset allows to allocate the lock dynamically if spinlock_t is big. In this page->ptl is used to store pointer to spinlock instead of spinlock itself. It costs additional cache line for indirect access, but fix page fault scalability for multi-threaded applications. LOCK_STAT depends on DEBUG_SPINLOCK, so on current kernel enabling LOCK_STAT to analyse scalability issues breaks scalability. ;) The patchset mostly fixes this. Results for ./thp_memscale -c 80 -b 512M on 4-socket machine: baseline, no CONFIG_LOCK_STAT: 9.115460703 seconds time elapsed baseline, CONFIG_LOCK_STAT=y: 53.890567123 seconds time elapsed patched, no CONFIG_LOCK_STAT: 8.852250368 seconds time elapsed patched, CONFIG_LOCK_STAT=y: 11.069770759 seconds time elapsed Patch count is scary, but most of them trivial. Overview: Patches 1-4 Few bug fixes. No dependencies to other patches. Probably should applied as soon as possible. Patch 5 Changes signature of pgtable_page_ctor(). We will use it for dynamic lock allocation, so it can fail. Patches 6-8 Add missing constructor/destructor calls on few archs. It's fixes NR_PAGETABLE accounting and prepare to use split ptl. Patches 9-33 Add pgtable_page_ctor() fail handling to all archs. Patches 34 Finally adds support of dynamically-allocated page->pte. Also contains documentation for split page table lock. This patch (of 34): I've missed that we preallocate few pmds on pgd_alloc() if X86_PAE enabled. Let's add missed constructor/destructor calls. I haven't noticed it during testing since prep_new_page() clears page->mapping and therefore page->ptl. It's effectively equal to spin_lock_init(&page->ptl). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Chris Zankel <chris@zankel.net> Cc: Christoph Lameter <cl@linux.com> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15x86, mm: enable split page table lock for PMD levelKirill A. Shutemov
Enable PMD split page table lock for X86_64 and PAE. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: implement split page table lock for PMD levelKirill A. Shutemov
The basic idea is the same as with PTE level: the lock is embedded into struct page of table's page. We can't use mm->pmd_huge_pte to store pgtables for THP, since we don't take mm->page_table_lock anymore. Let's reuse page->lru of table's page for that. pgtable_pmd_page_ctor() returns true, if initialization is successful and false otherwise. Current implementation never fails, but assumption that constructor can fail will help to port it to -rt where spinlock_t is rather huge and cannot be embedded into struct page -- dynamic allocation is required. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: convert the rest to new page table lock apiKirill A. Shutemov
Only trivial cases left. Let's convert them altogether. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm, hugetlb: convert hugetlbfs to use split pmd lockKirill A. Shutemov
Hugetlb supports multiple page sizes. We use split lock only for PMD level, but not for PUD. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm, thp: do not access mm->pmd_huge_pte directlyKirill A. Shutemov
Currently mm->pmd_huge_pte protected by page table lock. It will not work with split lock. We have to have per-pmd pmd_huge_pte for proper access serialization. For now, let's just introduce wrapper to access mm->pmd_huge_pte. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm, thp: move ptl taking inside page_check_address_pmd()Kirill A. Shutemov
With split page table lock we can't know which lock we need to take before we find the relevant pmd. Let's move lock taking inside the function. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm, thp: change pmd_trans_huge_lock() to return taken lockKirill A. Shutemov
With split ptlock it's important to know which lock pmd_trans_huge_lock() took. This patch adds one more parameter to the function to return the lock. In most places migration to new api is trivial. Exception is move_huge_pmd(): we need to take two locks if pmd tables are different. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: introduce api for split page table lock for PMD levelKirill A. Shutemov
Basic api, backed by mm->page_table_lock for now. Actual implementation will be added later. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: convert mm->nr_ptes to atomic_long_tKirill A. Shutemov
With split page table lock for PMD level we can't hold mm->page_table_lock while updating nr_ptes. Let's convert it to atomic_long_t to avoid races. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKSKirill A. Shutemov
We're going to introduce split page table lock for PMD level. Let's rename existing split ptlock for PTE level to avoid confusion. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: avoid increase sizeof(struct page) due to split page table lockKirill A. Shutemov
Alex Thorlton noticed that some massively threaded workloads work poorly, if THP enabled. This patchset fixes this by introducing split page table lock for PMD tables. hugetlbfs is not covered yet. This patchset is based on work by Naoya Horiguchi. : akpm result summary: : : THP off, v3.12-rc2: 18.059261877 seconds time elapsed : THP off, patched: 16.768027318 seconds time elapsed : : THP on, v3.12-rc2: 42.162306788 seconds time elapsed : THP on, patched: 8.397885779 seconds time elapsed : : HUGETLB, v3.12-rc2: 47.574936948 seconds time elapsed : HUGETLB, patched: 19.447481153 seconds time elapsed THP off, v3.12-rc2: ------------------- Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs): 1037072.835207 task-clock # 57.426 CPUs utilized ( +- 3.59% ) 95,093 context-switches # 0.092 K/sec ( +- 3.93% ) 140 cpu-migrations # 0.000 K/sec ( +- 5.28% ) 10,000,550 page-faults # 0.010 M/sec ( +- 0.00% ) 2,455,210,400,261 cycles # 2.367 GHz ( +- 3.62% ) [83.33%] 2,429,281,882,056 stalled-cycles-frontend # 98.94% frontend cycles idle ( +- 3.67% ) [83.33%] 1,975,960,019,659 stalled-cycles-backend # 80.48% backend cycles idle ( +- 3.88% ) [66.68%] 46,503,296,013 instructions # 0.02 insns per cycle # 52.24 stalled cycles per insn ( +- 3.21% ) [83.34%] 9,278,997,542 branches # 8.947 M/sec ( +- 4.00% ) [83.34%] 89,881,640 branch-misses # 0.97% of all branches ( +- 1.17% ) [83.33%] 18.059261877 seconds time elapsed ( +- 2.65% ) THP on, v3.12-rc2: ------------------ Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs): 3114745.395974 task-clock # 73.875 CPUs utilized ( +- 1.84% ) 267,356 context-switches # 0.086 K/sec ( +- 1.84% ) 99 cpu-migrations # 0.000 K/sec ( +- 1.40% ) 58,313 page-faults # 0.019 K/sec ( +- 0.28% ) 7,416,635,817,510 cycles # 2.381 GHz ( +- 1.83% ) [83.33%] 7,342,619,196,993 stalled-cycles-frontend # 99.00% frontend cycles idle ( +- 1.88% ) [83.33%] 6,267,671,641,967 stalled-cycles-backend # 84.51% backend cycles idle ( +- 2.03% ) [66.67%] 117,819,935,165 instructions # 0.02 insns per cycle # 62.32 stalled cycles per insn ( +- 4.39% ) [83.34%] 28,899,314,777 branches # 9.278 M/sec ( +- 4.48% ) [83.34%] 71,787,032 branch-misses # 0.25% of all branches ( +- 1.03% ) [83.33%] 42.162306788 seconds time elapsed ( +- 1.73% ) HUGETLB, v3.12-rc2: ------------------- Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs): 2588052.787264 task-clock # 54.400 CPUs utilized ( +- 3.69% ) 246,831 context-switches # 0.095 K/sec ( +- 4.15% ) 138 cpu-migrations # 0.000 K/sec ( +- 5.30% ) 21,027 page-faults # 0.008 K/sec ( +- 0.01% ) 6,166,666,307,263 cycles # 2.383 GHz ( +- 3.68% ) [83.33%] 6,086,008,929,407 stalled-cycles-frontend # 98.69% frontend cycles idle ( +- 3.77% ) [83.33%] 5,087,874,435,481 stalled-cycles-backend # 82.51% backend cycles idle ( +- 4.41% ) [66.67%] 133,782,831,249 instructions # 0.02 insns per cycle # 45.49 stalled cycles per insn ( +- 4.30% ) [83.34%] 34,026,870,541 branches # 13.148 M/sec ( +- 4.24% ) [83.34%] 68,670,942 branch-misses # 0.20% of all branches ( +- 3.26% ) [83.33%] 47.574936948 seconds time elapsed ( +- 2.09% ) THP off, patched: ----------------- Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs): 943301.957892 task-clock # 56.256 CPUs utilized ( +- 3.01% ) 86,218 context-switches # 0.091 K/sec ( +- 3.17% ) 121 cpu-migrations # 0.000 K/sec ( +- 6.64% ) 10,000,551 page-faults # 0.011 M/sec ( +- 0.00% ) 2,230,462,457,654 cycles # 2.365 GHz ( +- 3.04% ) [83.32%] 2,204,616,385,805 stalled-cycles-frontend # 98.84% frontend cycles idle ( +- 3.09% ) [83.32%] 1,778,640,046,926 stalled-cycles-backend # 79.74% backend cycles idle ( +- 3.47% ) [66.69%] 45,995,472,617 instructions # 0.02 insns per cycle # 47.93 stalled cycles per insn ( +- 2.51% ) [83.34%] 9,179,700,174 branches # 9.731 M/sec ( +- 3.04% ) [83.35%] 89,166,529 branch-misses # 0.97% of all branches ( +- 1.45% ) [83.33%] 16.768027318 seconds time elapsed ( +- 2.47% ) THP on, patched: ---------------- Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs): 458793.837905 task-clock # 54.632 CPUs utilized ( +- 0.79% ) 41,831 context-switches # 0.091 K/sec ( +- 0.97% ) 98 cpu-migrations # 0.000 K/sec ( +- 1.66% ) 57,829 page-faults # 0.126 K/sec ( +- 0.62% ) 1,077,543,336,716 cycles # 2.349 GHz ( +- 0.81% ) [83.33%] 1,067,403,802,964 stalled-cycles-frontend # 99.06% frontend cycles idle ( +- 0.87% ) [83.33%] 864,764,616,143 stalled-cycles-backend # 80.25% backend cycles idle ( +- 0.73% ) [66.68%] 16,129,177,440 instructions # 0.01 insns per cycle # 66.18 stalled cycles per insn ( +- 7.94% ) [83.35%] 3,618,938,569 branches # 7.888 M/sec ( +- 8.46% ) [83.36%] 33,242,032 branch-misses # 0.92% of all branches ( +- 2.02% ) [83.32%] 8.397885779 seconds time elapsed ( +- 0.18% ) HUGETLB, patched: ----------------- Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs): 395353.076837 task-clock # 20.329 CPUs utilized ( +- 8.16% ) 55,730 context-switches # 0.141 K/sec ( +- 5.31% ) 138 cpu-migrations # 0.000 K/sec ( +- 4.24% ) 21,027 page-faults # 0.053 K/sec ( +- 0.00% ) 930,219,717,244 cycles # 2.353 GHz ( +- 8.21% ) [83.32%] 914,295,694,103 stalled-cycles-frontend # 98.29% frontend cycles idle ( +- 8.35% ) [83.33%] 704,137,950,187 stalled-cycles-backend # 75.70% backend cycles idle ( +- 9.16% ) [66.69%] 30,541,538,385 instructions # 0.03 insns per cycle # 29.94 stalled cycles per insn ( +- 3.98% ) [83.35%] 8,415,376,631 branches # 21.286 M/sec ( +- 3.61% ) [83.36%] 32,645,478 branch-misses # 0.39% of all branches ( +- 3.41% ) [83.32%] 19.447481153 seconds time elapsed ( +- 2.00% ) This patch (of 11): CONFIG_GENERIC_LOCKBREAK increases sizeof(spinlock_t) to 8 bytes. It leads to increase sizeof(struct page) by 4 bytes on 32-bit system if split page table lock is in use, since page->ptl shares space in union with longs and pointers. Let's disable split page table lock on 32-bit systems with GENERIC_LOCKBREAK enabled. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15mm: drop actor argument of do_generic_file_read()Kirill A. Shutemov
There's only one caller of do_generic_file_read() and the only actor is file_read_actor(). No reason to have a callback parameter. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15drivers/memstick/core/ms_block.c: fix spelling of MSB_RP_RECIVE_STATUS_REGAndrew Morton
Cc: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-14Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 changes from Ted Ts'o: "Ext4 updates for 3.13. Mostly bug fixes and cleanups" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: add prototypes for macro-generated functions ext4: return non-zero st_blocks for inline data ext4: use prandom_u32() instead of get_random_bytes() ext4: remove unreachable code after ext4_can_extents_be_merged() ext4: remove unreachable code in ext4_can_extents_be_merged() ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea() ext4: don't count free clusters from a corrupt block group ext4: fix FITRIM in no journal mode ext4: drop set but otherwise unused variable from ext4_add_dirent_to_inline() ext4: change ext4_read_inline_dir() to return 0 on success ext4: pair trace_ext4_writepages & trace_ext4_writepages_result ext4: add ratelimiting to ext4 messages ext4: fix performance regression in ext4_writepages ext4: fixup kerndoc annotation of mpage_map_and_submit_extent() ext4: fix assertion in ext4_add_complete_io()
2013-11-14Merge tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfsLinus Torvalds
Pull xfs update from Ben Myers: "For 3.13-rc1 we have an eclectic assortment of bugfixes, cleanups, and refactoring. Bugfixes that stand out are the fix for the AGF/AGI deadlock, incore extent list fixes, verifier fixes for v4 superblocks and growfs, and memory leaks. There are some asserts, warnings, and strings that were cleaned up. There was further rearrangement of code to make libxfs and the kernel sync up more easily, differences between v2 and v3 directory code were abstracted using an ops vector, xfs_inactive was reworked, and the preallocation/hole punching code was refactored. - simplify kmem_zone_zalloc - add traces for AGF/AGI read ops - add additional AIL traces - fix xfs_remove AGF vs AGI deadlock - fix the extent count of new incore extent page in the indirection array - don't fail bad secondary superblocks verification on v4 filesystems due to unzeroed bits after v4 fields - fix possible NULL dereference in xlog_verify_iclog - remove redundant assert in xfs_dir2_leafn_split - prevent stack overflows from page cache allocation - fix some sparse warnings - fix directory block format verifier to check the leaf entry count - abstract the differences in dir2/dir3 via an ops vector - continue process of reorganization to make libxfs/kernel code merges easier - refactor the preallocation and hole punching code - fix for growfs and verifiers - remove unnecessary scary corruption error when probing non-xfs filesystems - remove extra newlines from strings passed to printk - prevent deadlock trying to cover an active log - rework xfs_inactive() - add the inode directory type support to XFS_IOC_FSGEOM - cleanup (remove) usage of is_bad_inode - fix miscalculation in xfs_iext_realloc_direct which results in oversized direct extent list - remove unnecessary count arg to xfs_iomap_write_allocate - fix memory leak in xlog_recover_add_to_trans - check superblock instead of block magic to determine if dtype field is present - fix lockdep annotation due to project quotas - fix regression in xfs_node_toosmall which can lead to incorrect directory btree node collapse - make log recovery verify filesystem uuid of recovering blocks - fix XFS_IOC_FREE_EOFBLOCKS definition - remove invalid assert in xfs_inode_free - fix for AIL lock regression" * tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfs: (49 commits) xfs: simplify kmem_{zone_}zalloc xfs: add tracepoints to AGF/AGI read operations xfs: trace AIL manipulations xfs: xfs_remove deadlocks due to inverted AGF vs AGI lock ordering xfs: fix the extent count when allocating an new indirection array entry xfs: be more forgiving of a v4 secondary sb w/ junk in v5 fields xfs: fix possible NULL dereference in xlog_verify_iclog xfs:xfs_dir2_node.c: pointer use before check for null xfs: prevent stack overflows from page cache allocation xfs: fix static and extern sparse warnings xfs: validity check the directory block leaf entry count xfs: make dir2 ftype offset pointers explicit xfs: convert directory vector functions to constants xfs: convert directory vector functions to constants xfs: vectorise encoding/decoding directory headers xfs: vectorise DA btree operations xfs: vectorise directory leaf operations xfs: vectorise directory data operations part 2 xfs: vectorise directory data operations xfs: vectorise remaining shortform dir2 ops ...
2013-11-14Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "A number of fixes: - Fix segfault on perf trace -i perf.data, from Namhyung Kim. - Fix segfault with --no-mmap-pages, from David Ahern. - Don't force a refresh during progress update in the TUI, greatly reducing startup costs, fix from Patrick Palka. - Fix sw clock event period test wrt not checking if using > max_sample_freq. - Handle throttle events in 'object code reading' test, fix from Adrian Hunter. - Prevent condition that all sort keys are elided, fix from Namhyung Kim. - Round mmap pages to power 2, from David Ahern. And a number of late arrival changes: - Add summary only option to 'perf trace', suppressing the decoding of events, from David Ahern - 'perf trace --summary' formatting simplifications, from Pekka Enberg. - Beautify fifth argument of mmap() as fd, in 'perf trace', from Namhyung Kim. - Add direct access to dynamic arrays in libtraceevent, from Steven Rostedt. - Synthesize non-exec MMAP records when --data used, allowing the resolution of data addresses to symbols (global variables, etc), by Arnaldo Carvalho de Melo. - Code cleanups by David Ahern and Adrian Hunter" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) tools lib traceevent: Add direct access to dynamic arrays perf target: Shorten perf_target__ to target__ perf tests: Handle throttle events in 'object code reading' test perf evlist: Refactor mmap_pages parsing perf evlist: Round mmap pages to power 2 - v2 perf record: Fix segfault with --no-mmap-pages perf trace: Add summary only option perf trace: Simplify '--summary' output perf trace: Change syscall summary duration order perf tests: Compensate lower sample freq with longer test loop perf trace: Fix segfault on perf trace -i perf.data perf trace: Separate tp syscall field caching into init routine to be reused perf trace: Beautify fifth argument of mmap() as fd perf tests: Use lower sample_freq in sw clock event period test perf tests: Check return of perf_evlist__open sw clock event period test perf record: Move existing write_output into helper function perf record: Use correct return type for write() perf tools: Prevent condition that all sort keys are elided perf machine: Simplify synthesize_threads method perf machine: Introduce synthesize_threads method out of open coded equivalent ...
2013-11-14Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull two x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/amd: Tone down printk(), don't treat a missing firmware file as an error x86/dumpstack: Fix printk_address for direct addresses
2013-11-14Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Four bugfixes and one performance fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Avoid integer overflow sched: Optimize task_sched_runtime() sched/numa: Cure update_numa_stats() vs. hotplug sched/numa: Fix NULL pointer dereference in task_numa_migrate() sched: Fix endless sync_sched/rcu() loop inside _cpu_down()
2013-11-14Merge branch 'core-locking-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking changes from Ingo Molnar: "The biggest changes: - add lockdep support for seqcount/seqlocks structures, this unearthed both bugs and required extra annotation. - move the various kernel locking primitives to the new kernel/locking/ directory" * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) block: Use u64_stats_init() to initialize seqcounts locking/lockdep: Mark __lockdep_count_forward_deps() as static lockdep/proc: Fix lock-time avg computation locking/doc: Update references to kernel/mutex.c ipv6: Fix possible ipv6 seqlock deadlock cpuset: Fix potential deadlock w/ set_mems_allowed seqcount: Add lockdep functionality to seqcount/seqlock structures net: Explicitly initialize u64_stats_sync structures for lockdep locking: Move the percpu-rwsem code to kernel/locking/ locking: Move the lglocks code to kernel/locking/ locking: Move the rwsem code to kernel/locking/ locking: Move the rtmutex code to kernel/locking/ locking: Move the semaphore core to kernel/locking/ locking: Move the spinlock code to kernel/locking/ locking: Move the lockdep code to kernel/locking/ locking: Move the mutex code to kernel/locking/ hung_task debugging: Add tracepoint to report the hang x86/locking/kconfig: Update paravirt spinlock Kconfig description lockstat: Report avg wait and hold times lockdep, x86/alternatives: Drop ancient lockdep fixup message ...
2013-11-14Merge branch 'x86-trace-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/trace changes from Ingo Molnar: "This adds page fault tracepoints which have zero runtime cost in the disabled case via IDT trickery (no NOPs in the page fault hotpath)" * 'x86-trace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, trace: Change user|kernel_page_fault to page_fault_user|kernel x86, trace: Add page fault tracepoints x86, trace: Delete __trace_alloc_intr_gate() x86, trace: Register exception handler to trace IDT x86, trace: Remove __alloc_intr_gate()
2013-11-14Merge tag 'fbdev-3.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux Pull fbdev changes from Tomi Valkeinen: "Nothing particularly stands out in this pull request. The biggest part of the changes are cleanups. Maybe one fix to mention is the "fb: reorder the lock sequence to fix potential dead lock" which hopefully fixes the fb locking issues reported by multiple persons. There are also a few commits that have changes to arch/arm/mach-at91 and arch/avr32, which have been acked by the maintainers" * tag 'fbdev-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (143 commits) fb: reorder the lock sequence to fix potential dead lock fbdev: shmobile-lcdcfb: Convert to clk_prepare/unprepare fbdev: shmobile-hdmi: Convert to clk_prepare/unprepare omapdss: Add new panel driver for Topolly td028ttec1 LCD. video: exynos_mipi_dsi: Unlock the mutex before returning video: da8xx-fb: remove unwanted define video: Remove unnecessary semicolons simplefb: use write-combined remapping simplefb: fix unmapping fb during destruction OMAPDSS: connector-dvi: fix releasing i2c_adapter OMAPDSS: DSI: fix perf measuring ifdefs framebuffer: Use fb_<level> framebuffer: Add fb_<level> convenience logging macros efifb: prevent null-deref when iterating dmi_list fbdev: fix error return code in metronomefb_probe() video: xilinxfb: Fix for "Use standard variable name convention" OMAPDSS: Fix de_level in videomode_to_omap_video_timings() video: xilinxfb: Simplify error path video: xilinxfb: Use devm_kzalloc instead of kzalloc video: xilinxfb: Use standard variable name convention ...
2013-11-14Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: "This time we only have a few changes as there are no soc thermal changes from Eduardo. The only big change is the introduction of TMON, a tool to help visualize, tune, and test the thermal subsystem. The rest is mostly cleanups and fixes all over. Specifics: - introduce TMON, a tool base on thermal sysfs I/F. It can be used to visualize, tune and test the thermal subsystem. - fix a zone/cooling device binding problem, when both thermal zone bind parameters and .bind() callback are available" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: tools/thermal: Introduce tmon, a tool for thermal subsystem thermal: Fix binding problem when there is thermal zone params thermal: cpu_cooling: fix return value check in cpufreq_cooling_register() Thermal: Check for validity before doing kfree thermal/intel_powerclamp: Add newer CPU models Thermal: Tidy up error handling in powerclamp_init thermal: Kconfig: cosmetic fixes ACPI/thermal : Remove zone disabled warning typo in drivers/thermal/Kconfig: lpatform instead of platform
2013-11-14Merge tag 'pci-v3.13-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Resource management - Fix host bridge window coalescing (Alexey Neyman) - Pass type, width, and prefetchability for window alignment (Wei Yang) PCI device hotplug - Convert acpiphp, acpiphp_ibm to dynamic debug (Lan Tianyu) Power management - Remove pci_pm_complete() (Liu Chuansheng) MSI - Fail initialization if device is not in PCI_D0 (Yijing Wang) MPS (Max Payload Size) - Use pcie_get_mps() and pcie_set_mps() to simplify code (Yijing Wang) - Use pcie_set_readrq() to simplify code (Yijing Wang) - Use cached pci_dev->pcie_mpss to simplify code (Yijing Wang) SR-IOV - Enable upstream bridges even for VFs on virtual buses (Bjorn Helgaas) - Use pci_is_root_bus() to avoid catching virtual buses (Wei Yang) Virtualization - Add x86 MSI masking ops (Konrad Rzeszutek Wilk) Freescale i.MX6 - Support i.MX6 PCIe controller (Sean Cross) - Increase link startup timeout (Marek Vasut) - Probe PCIe in fs_initcall() (Marek Vasut) - Fix imprecise abort handler (Tim Harvey) - Remove redundant of_match_ptr (Sachin Kamat) Renesas R-Car - Support Gen2 internal PCIe controller (Valentine Barshak) Samsung Exynos - Add MSI support (Jingoo Han) - Turn off power when link fails (Jingoo Han) - Add Jingoo Han as maintainer (Jingoo Han) - Add clk_disable_unprepare() on error path (Wei Yongjun) - Remove redundant of_match_ptr (Sachin Kamat) Synopsys DesignWare - Add irq_create_mapping() (Pratyush Anand) - Add header guards (Seungwon Jeon) Miscellaneous - Enable native PCIe services by default on non-ACPI (Andrew Murray) - Cleanup _OSC usage and messages (Bjorn Helgaas) - Remove pcibios_last_bus boot option on non-x86 (Bjorn Helgaas) - Convert bus code to use bus_, drv_, and dev_groups (Greg Kroah-Hartman) - Remove unused pci_mem_start (Myron Stowe) - Make sysfs functions static (Sachin Kamat) - Warn on invalid return from driver probe (Stephen M. Cameron) - Remove Intel Haswell D3 delays (Todd E Brandt) - Call pci_set_master() in core if driver doesn't do it (Yinghai Lu) - Use pci_is_pcie() to simplify code (Yijing Wang) - Use PCIe capability accessors to simplify code (Yijing Wang) - Use cached pci_dev->pcie_cap to simplify code (Yijing Wang) - Removed unused "is_pcie" from struct pci_dev (Yijing Wang) - Simplify sysfs CPU affinity implementation (Yijing Wang)" * tag 'pci-v3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (79 commits) PCI: Enable upstream bridges even for VFs on virtual buses PCI: Add pci_upstream_bridge() PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq() PCI: Warn on driver probe return value greater than zero PCI: Drop warning about drivers that don't use pci_set_master() PCI: Workaround missing pci_set_master in pci drivers powerpc/pci: Use pci_is_pcie() to simplify code [fix] PCI: Update pcie_ports 'auto' behavior for non-ACPI platforms PCI: imx6: Probe the PCIe in fs_initcall() PCI: Add R-Car Gen2 internal PCI support PCI: imx6: Remove redundant of_match_ptr PCI: Report pci_pme_active() kmalloc failure mn10300/PCI: Remove useless pcibios_last_bus frv/PCI: Remove pcibios_last_bus PCI: imx6: Increase link startup timeout PCI: exynos: Remove redundant of_match_ptr PCI: imx6: Fix imprecise abort handler PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0 PCI: imx6: Remove redundant dev_err() in imx6_pcie_probe() x86/PCI: Coalesce multiple overlapping host bridge windows ...
2013-11-14Merge tag 'pm+acpi-3.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael J Wysocki: - New power capping framework and the the Intel Running Average Power Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan. - Addition of the in-kernel switching feature to the arm_big_little cpufreq driver from Viresh Kumar and Nicolas Pitre. - cpufreq support for iMac G5 from Aaro Koskinen. - Baytrail processors support for intel_pstate from Dirk Brandewie. - cpufreq support for Midway/ECX-2000 from Mark Langsdorf. - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha. - ACPI power management support for the I2C and SPI bus types from Mika Westerberg and Lv Zheng. - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat, Stratos Karafotis, Xiaoguang Chen, Lan Tianyu. - cpufreq drivers updates (mostly fixes and cleanups) from Viresh Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev. - intel_pstate updates from Dirk Brandewie and Adrian Huang. - ACPICA update to version 20130927 includig fixes and cleanups and some reduction of divergences between the ACPICA code in the kernel and ACPICA upstream in order to improve the automatic ACPICA patch generation process. From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh Bhat, Bjorn Helgaas, David E Box. - ACPI IPMI driver fixes and cleanups from Lv Zheng. - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang Yanfei, Rafael J Wysocki. - Conversion of the ACPI AC driver to the platform bus type and multiple driver fixes and cleanups related to ACPI from Zhang Rui. - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu, Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki. - Fixes and cleanups and new blacklist entries related to the ACPI video support from Aaron Lu, Felipe Contreras, Lennart Poettering, Kirill Tkhai. - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi. - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han, Bartlomiej Zolnierkiewicz, Prarit Bhargava. - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe. - Operation Performance Points (OPP) core updates from Nishanth Menon. - Runtime power management core fix from Rafael J Wysocki and update from Ulf Hansson. - Hibernation fixes from Aaron Lu and Rafael J Wysocki. - Device suspend/resume lockup detection mechanism from Benoit Goby. - Removal of unused proc directories created for various ACPI drivers from Lan Tianyu. - ACPI LPSS driver fix and new device IDs for the ACPI platform scan handler from Heikki Krogerus and Jarkko Nikula. - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa. - Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter, Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause, Liu Chuansheng. - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding, Jean-Christophe Plagniol-Villard. * tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits) cpufreq: conservative: fix requested_freq reduction issue ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines PM / runtime: Use pm_runtime_put_sync() in __device_release_driver() ACPI / event: remove unneeded NULL pointer check Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1" ACPI / video: Quirk initial backlight level 0 ACPI / video: Fix initial level validity test intel_pstate: skip the driver if ACPI has power mgmt option PM / hibernate: Avoid overflow in hibernate_preallocate_memory() ACPI / hotplug: Do not execute "insert in progress" _OST ACPI / hotplug: Carry out PCI root eject directly ACPI / hotplug: Merge device hot-removal routines ACPI / hotplug: Make acpi_bus_hot_remove_device() internal ACPI / hotplug: Simplify device ejection routines ACPI / hotplug: Fix handle_root_bridge_removal() ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug ACPI / scan: Start matching drivers after trying scan handlers ACPI: Remove acpi_pci_slot_init() headers from internal.h ACPI / blacklist: fix name of ThinkPad Edge E530 PowerCap: Fix build error with option -Werror=format-security ... Conflicts: arch/arm/mach-omap2/opp.c drivers/Kconfig drivers/spi/spi.c
2013-11-14Merge tag 'dm-3.13-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper changes from Mike Snitzer: "A set of device-mapper changes for 3.13. Improve reliability of buffer allocations for dm messages with a small number of arguments, a couple path group initialization fixes for dm multipath, a fix for resizing a dm array, various fixes and optimizations for dm cache, a fix for device mapper's Kconfig menu indentation. Features added include: - dm crypt support for activating legacy CBC TrueCrypt containers (useful for forensics of these old TCRYPT containers) - reduced dm-cache memory requirements for each block in the cache - basic support for shrinking a dm-cache's cache (fast) device - most notably, dm-cache support for managing cache coherency when deploying dm-cache with sophisticated origin volumes (that support hardware snapshots and/or clustering): these changes come in the form of a new passthrough operation mode and a cache block invalidation interface" * tag 'dm-3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (32 commits) dm cache: resolve small nits and improve Documentation dm cache: add cache block invalidation support dm cache: add remove_cblock method to policy interface dm cache policy mq: reduce memory requirements dm cache metadata: check the metadata version when reading the superblock dm cache: add passthrough mode dm cache: cache shrinking support dm cache: promotion optimisation for writes dm cache: be much more aggressive about promoting writes to discarded blocks dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty() dm cache: optimize commit_if_needed dm space map disk: optimise sm_disk_dec_block MAINTAINERS: add reference to device-mapper's linux-dm.git tree dm: fix Kconfig menu indentation dm: allow remove to be deferred dm table: print error on preresume failure dm crypt: add TCW IV mode for old CBC TCRYPT containers dm crypt: properly handle extra key string in initialization dm cache: log error message if dm_kcopyd_copy() fails dm cache: use cell_defer() boolean argument consistently ...
2013-11-14Merge tag 'for-linus-20131112' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull MTD changes from Brian Norris: - Unify some compile-time differences so that we have fewer uses of #ifdef CONFIG_OF in atmel_nand - Other general cleanups (removing unused functions, options, variables, fields; use correct interfaces) - Fix BUG() for new odd-sized NAND, which report non-power-of-2 dimensions via ONFI - Miscellaneous driver fixes (SPI NOR flash; BCM47xx NAND flash; etc.) - Improve differentiation between SLC and MLC NAND -- this clarifies an ABI issue regarding the MTD "type" (in sysfs and in the MEMGETINFO ioctl), where the MTD_MLCNANDFLASH type was present but inconsistently used - Extend GPMI NAND to support multi-chip-select NAND for some platforms - Many improvements to the OMAP2/3 NAND driver, including an expanded DT binding to bring us closer to mainline support for some OMAP systems - Fix a deadlock in the error path of the Atmel NAND driver probe - Correct the error codes from MTD mmap() to conform to POSIX and the Linux Programmer's Manual. This is an acknowledged change in the MTD ABI, but I can't imagine somebody relying on the non-standard -ENOSYS error code specifically. Am I just being unimaginative? :) - Fix a few important GPMI NAND bugs (one regression from 3.12 and one long-standing race condition) - More? Read the log! * tag 'for-linus-20131112' of git://git.infradead.org/linux-mtd: (98 commits) mtd: gpmi: fix the NULL pointer mtd: gpmi: fix kernel BUG due to racing DMA operations mtd: mtdchar: return expected errors on mmap() call mtd: gpmi: only scan two chips for imx6 mtd: gpmi: Use devm_kzalloc() mtd: atmel_nand: fix bug driver will in a dead lock if no nand detected mtd: nand: use a local variable to simplify the nand_scan_tail mtd: nand: remove deprecated IRQF_DISABLED mtd: dataflash: Say if we find a device we don't support mtd: nand: omap: fix error return code in omap_nand_probe() mtd: nand_bbt: kill NAND_BBT_SCANALLPAGES mtd: m25p80: fixup device removal failure path mtd: mxc_nand: Include linux/of.h header mtd: remove duplicated include from mtdcore.c mtd: m25p80: add support for Macronix mx25l3255e mtd: nand: omap: remove selection of BCH ecc-scheme via KConfig mtd: nand: omap: updated devm_xx for all resource allocation and free calls mtd: nand: omap: use drivers/mtd/nand/nand_bch.c wrapper for BCH ECC instead of lib/bch.c mtd: nand: omap: clean-up ecc layout for BCH ecc schemes mtd: nand: omap2: clean-up BCHx_HW and BCHx_SW ECC configurations in device_probe ...
2013-11-14Merge tag 'scsi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull first round of SCSI updates from James Bottomley: "This patch set is driver updates for qla4xxx, scsi_debug, pm80xx, fcoe/libfc, eas2r, lpfc, be2iscsi and megaraid_sas plus some assorted bug fixes and cleanups" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (106 commits) [SCSI] scsi_error: Escalate to LUN reset if abort fails [SCSI] Add 'eh_deadline' to limit SCSI EH runtime [SCSI] remove check for 'resetting' [SCSI] dc395: Move 'last_reset' into internal host structure [SCSI] tmscsim: Move 'last_reset' into host structure [SCSI] advansys: Remove 'last_reset' references [SCSI] dpt_i2o: return SCSI_MLQUEUE_HOST_BUSY when in reset [SCSI] dpt_i2o: Remove DPTI_STATE_IOCTL [SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path [SCSI] lpfc: Fix typo on NULL assignment [SCSI] scsi_dh_alua: ALUA handler attach should succeed while TPG is transitioning [SCSI] scsi_dh_alua: ALUA check sense should retry device internal reset unit attention [SCSI] esas2r: Cleanup snprinf formatting of firmware version [SCSI] esas2r: Remove superfluous mask of pcie_cap_reg [SCSI] esas2r: Fixes for big-endian platforms [SCSI] esas2r: Directly call kernel functions for atomic bit operations [SCSI] lpfc 8.3.43: Update lpfc version to driver version 8.3.43 [SCSI] lpfc 8.3.43: Fixed not processing task management IOCB response status [SCSI] lpfc 8.3.43: Fixed spinlock hang. [SCSI] lpfc 8.3.43: Fixed invalid Total_Data_Placed value received for els and ct command responses ...
2013-11-14Merge branch 'for-3.13/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver updates from Jens Axboe: "This is the block driver pull request for 3.13. As with the core pull request just sent out, this was rebased on top of the core branch again after the immutable series was pulled. This also means that bcache gets to sit the initial pull over. I will send a second driver pull request in the merge window to get those fixes in, once they have been rebased and tested on top of the non-immutable stack. This pull request contains: - Add support for the sTec Kronos pci-e flash card from sTec. Also has various cleanups for this driver, from myself, Bart, Mike Snizter, and Wei Yongjun. - Add surprise removal support for the micron mtip32xx driver from Micron. - Floppy documentation fix from Ben Harris. - debugfs bug fix for pktcdvd from Dan Carpenter. - Fix for the mtip32xx driver stack usage in the debugfs path, dynamically allocating those buffers instead. From David Milburn. - Disable cpqarray in Kconfig. The plan is to remove it on request of HP, but lets disable it for a few revisions just to see if anyone yells. - drbd fixes from Lars Ellenberg and Philipp Reisner. - Elevator switch fix for the s390 block driver from Heiko Carstens. - loop crash fix on IO to unassigned device from Mikulas Patocka. - A series of bug fixes for the IBM rsxx pci-e flash driver from Philip J Kelleher. - cciss probe fix from Stephen Cameron. - Xen block front/back fixes from Roger Pau Monne and Vegard Nossum" * 'for-3.13/drivers' of git://git.kernel.dk/linux-block: (41 commits) floppy: Correct documentation of driver options when used as a module. pktcdvd: debugfs functions return NULL on error xen-blkfront: restore the non-persistent data path skd: fix formatting in skd_s1120.h skd: reorder construct/destruct code skd: cleanup skd_do_inq_page_da() skd: remove SKD_OMIT_FROM_SRC_DIST ifdefs skd: remove redundant skdev->pdev assignment from skd_pci_probe() skd: use <asm/unaligned.h> skd: remove SCSI subsystem specific includes skd: register block device only if some devices are present skd: fix error messages in skd_init() skd: fix error paths in skd_init() skd: fix unregister_blkdev() placement skd: more removal of bio-based code skd: cleanup the skd_*() function block wrapping skd: rip out bio path skd: fix error return code in skd_pci_probe() s390/dasd: hold request queue sysfs lock when calling elevator_init() cciss: return 0 from driver probe function on success, not 1 ...
2013-11-14Merge branch 'for-3.13/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block IO core updates from Jens Axboe: "This is the pull request for the core changes in the block layer for 3.13. It contains: - The new blk-mq request interface. This is a new and more scalable queueing model that marries the best part of the request based interface we currently have (which is fully featured, but scales poorly) and the bio based "interface" which the new drivers for high IOPS devices end up using because it's much faster than the request based one. The bio interface has no block layer support, since it taps into the stack much earlier. This means that drivers end up having to implement a lot of functionality on their own, like tagging, timeout handling, requeue, etc. The blk-mq interface provides all these. Some drivers even provide a switch to select bio or rq and has code to handle both, since things like merging only works in the rq model and hence is faster for some workloads. This is a huge mess. Conversion of these drivers nets us a substantial code reduction. Initial results on converting SCSI to this model even shows an 8x improvement on single queue devices. So while the model was intended to work on the newer multiqueue devices, it has substantial improvements for "classic" hardware as well. This code has gone through extensive testing and development, it's now ready to go. A pull request is coming to convert virtio-blk to this model will be will be coming as well, with more drivers scheduled for 3.14 conversion. - Two blktrace fixes from Jan and Chen Gang. - A plug merge fix from Alireza Haghdoost. - Conversion of __get_cpu_var() from Christoph Lameter. - Fix for sector_div() with 64-bit divider from Geert Uytterhoeven. - A fix for a race between request completion and the timeout handling from Jeff Moyer. This is what caused the merge conflict with blk-mq/core, in case you are looking at that. - A dm stacking fix from Mike Snitzer. - A code consolidation fix and duplicated code removal from Kent Overstreet. - A handful of block bug fixes from Mikulas Patocka, fixing a loop crash and memory corruption on blk cg. - Elevator switch bug fix from Tomoki Sekiyama. A heads-up that I had to rebase this branch. Initially the immutable bio_vecs had been queued up for inclusion, but a week later, it became clear that it wasn't fully cooked yet. So the decision was made to pull this out and postpone it until 3.14. It was a straight forward rebase, just pruning out the immutable series and the later fixes of problems with it. The rest of the patches applied directly and no further changes were made" * 'for-3.13/core' of git://git.kernel.dk/linux-block: (31 commits) block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO block: Do not call sector_div() with a 64-bit divisor kernel: trace: blktrace: remove redundent memcpy() in compat_blk_trace_setup() block: Consolidate duplicated bio_trim() implementations block: Use rw_copy_check_uvector() block: Enable sysfs nomerge control for I/O requests in the plug list block: properly stack underlying max_segment_size to DM device elevator: acquire q->sysfs_lock in elevator_change() elevator: Fix a race in elevator switching and md device initialization block: Replace __get_cpu_var uses bdi: test bdi_init failure block: fix a probe argument to blk_register_region loop: fix crash if blk_alloc_queue fails blk-core: Fix memory corruption if blkcg_init_queue fails block: fix race between request completion and timeout handling blktrace: Send BLK_TN_PROCESS events to all running traces blk-mq: don't disallow request merges for req->special being set blk-mq: mq plug list breakage blk-mq: fix for flush deadlock ...
2013-11-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS fixes from Al Viro: "Several fixes, mostly for regressions in the last pile. Howeover, prepend_path() forgetting to reininitalize dentry/vfsmount is in 3.12 as well and qib_fs had been leaking all along..." The unpaired RCU lock issue was also independently reported by Dave Jones with his fuzzer tool.. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: qib_fs: fix (some) dcache abuses prepend_path() needs to reinitialize dentry/vfsmount/mnt on restarts fix unpaired rcu lock in prepend_path() locks: missing unlock on error in generic_add_lease() aio: checking for NULL instead of IS_ERR
2013-11-14Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Included in this series are: 1. BE8 (modern big endian) changes for ARM from Ben Dooks 2. big.Little support from Nicolas Pitre and Dave Martin 3. support for LPAE systems with all system memory above 4GB 4. Perf updates from Will Deacon 5. Additional prefetching and other performance improvements from Will. 6. Neon-optimised AES implementation fro Ard. 7. A number of smaller fixes scattered around the place. There is a rather horrid merge conflict in tools/perf - I was never notified of the conflict because it originally occurred between Will's tree and other stuff. Consequently I have a resolution which Will forwarded me, which I'll forward on immediately after sending this mail. The other notable thing is I'm expecting some build breakage in the crypto stuff on ARM only with Ard's AES patches. These were merged into a stable git branch which others had already pulled, so there's little I can do about this. The problem is caused because these patches have a dependency on some code in the crypto git tree - I tried requesting a branch I can pull to resolve these, and all I got each time from the crypto people was "we'll revert our patches then" which would only make things worse since I still don't have the dependent patches. I've no idea what's going on there or how to resolve that, and since I can't split these patches from the rest of this pull request, I'm rather stuck with pushing this as-is or reverting Ard's patches. Since it should "come out in the wash" I've left them in - the only build problems they seem to cause at the moment are with randconfigs, and since it's a new feature anyway. However, if by -rc1 the dependencies aren't in, I think it'd be best to revert Ard's patches" I resolved the perf conflict roughly as per the patch sent by Russell, but there may be some differences. Any errors are likely mine. Let's see how the crypto issues work out.. * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (110 commits) ARM: 7868/1: arm/arm64: remove atomic_clear_mask() in "include/asm/atomic.h" ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg(). ARM: 7866/1: include: asm: use 'long long' instead of 'u64' within atomic.h ARM: 7871/1: amba: Extend number of IRQS ARM: 7887/1: Don't smp_cross_call() on UP devices in arch_irq_work_raise() ARM: 7872/1: Support arch_irq_work_raise() via self IPIs ARM: 7880/1: Clear the IT state independent of the Thumb-2 mode ARM: 7878/1: nommu: Implement dummy early_paging_init() ARM: 7876/1: clear Thumb-2 IT state on exception handling ARM: 7874/2: bL_switcher: Remove cpu_hotplug_driver_{lock,unlock}() ARM: footbridge: fix build warnings for netwinder ARM: 7873/1: vfp: clear vfp_current_hw_state for dying cpu ARM: fix misplaced arch_virt_to_idmap() ARM: 7848/1: mcpm: Implement cpu_kill() to synchronise on powerdown ARM: 7847/1: mcpm: Factor out logical-to-physical CPU translation ARM: 7869/1: remove unused XSCALE_PMU Kconfig param ARM: 7864/1: Handle 64-bit memory in case of 32-bit phys_addr_t ARM: 7863/1: Let arm_add_memory() always use 64-bit arguments ARM: 7862/1: pcpu: replace __get_cpu_var_uses ARM: 7861/1: cacheflush: consolidate single-CPU ARMv7 cache disabling code ...
2013-11-14Merge branch 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull DMA mask updates from Russell King: "This series cleans up the handling of DMA masks in a lot of drivers, fixing some bugs as we go. Some of the more serious errors include: - drivers which only set their coherent DMA mask if the attempt to set the streaming mask fails. - drivers which test for a NULL dma mask pointer, and then set the dma mask pointer to a location in their module .data section - which will cause problems if the module is reloaded. To counter these, I have introduced two helper functions: - dma_set_mask_and_coherent() takes care of setting both the streaming and coherent masks at the same time, with the correct error handling as specified by the API. - dma_coerce_mask_and_coherent() which resolves the problem of drivers forcefully setting DMA masks. This is more a marker for future work to further clean these locations up - the code which creates the devices really should be initialising these, but to fix that in one go along with this change could potentially be very disruptive. The last thing this series does is prise away some of Linux's addition to "DMA addresses are physical addresses and RAM always starts at zero". We have ARM LPAE systems where all system memory is above 4GB physical, hence having DMA masks interpreted by (eg) the block layers as describing physical addresses in the range 0..DMAMASK fails on these platforms. Santosh Shilimkar addresses this in this series; the patches were copied to the appropriate people multiple times but were ignored. Fixing this also gets rid of some ARM weirdness in the setup of the max*pfn variables, and brings ARM into line with every other Linux architecture as far as those go" * 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm: (52 commits) ARM: 7805/1: mm: change max*pfn to include the physical offset of memory ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations ARM: 7795/1: mm: dma-mapping: Add dma_max_pfn(dev) helper function ARM: 7794/1: block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit() ARM: DMA-API: better handing of DMA masks for coherent allocations ARM: 7857/1: dma: imx-sdma: setup dma mask DMA-API: firmware/google/gsmi.c: avoid direct access to DMA masks DMA-API: dcdbas: update DMA mask handing DMA-API: dma: edma.c: no need to explicitly initialize DMA masks DMA-API: usb: musb: use platform_device_register_full() to avoid directly messing with dma masks DMA-API: crypto: remove last references to 'static struct device *dev' DMA-API: crypto: fix ixp4xx crypto platform device support DMA-API: others: use dma_set_coherent_mask() DMA-API: staging: use dma_set_coherent_mask() DMA-API: usb: use new dma_coerce_mask_and_coherent() DMA-API: usb: use dma_set_coherent_mask() DMA-API: parport: parport_pc.c: use dma_coerce_mask_and_coherent() DMA-API: net: octeon: use dma_coerce_mask_and_coherent() DMA-API: net: nxp/lpc_eth: use dma_coerce_mask_and_coherent() ...
2013-11-13qib_fs: fix (some) dcache abusesAl Viro
* lookup_one_len() really wants i_mutex held on directory. * leaks galore - just mount ipathfs, then cd /sys/bus/pci/drivers/qib_ib; echo *:*:*.* >unbind on a box with that card present and try to umount ipathfs... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-13block: Use u64_stats_init() to initialize seqcountsPeter Zijlstra
Now that seqcounts are lockdep enabled objects, we need to explicitly initialize runtime allocated seqcounts so that lockdep can track them. Without this patch, Fengguang was seeing: [ 4.127282] INFO: trying to register non-static key. [ 4.128027] the code is fine but needs lockdep annotation. [ 4.128027] turning off the locking correctness validator. [ 4.128027] CPU: 0 PID: 96 Comm: kworker/u4:1 Not tainted 3.12.0-next-20131108-10601-gbad570d #2 [ 4.128027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ ... ] [ 4.128027] Call Trace: [ 4.128027] [<7908e744>] ? console_unlock+0x353/0x380 [ 4.128027] [<79dc7cf2>] dump_stack+0x48/0x60 [ 4.128027] [<7908953e>] __lock_acquire.isra.26+0x7e3/0xceb [ 4.128027] [<7908a1c5>] lock_acquire+0x71/0x9a [ 4.128027] [<794079aa>] ? blk_throtl_bio+0x1c3/0x485 [ 4.128027] [<7940658b>] throtl_update_dispatch_stats+0x7c/0x153 [ 4.128027] [<794079aa>] ? blk_throtl_bio+0x1c3/0x485 [ 4.128027] [<794079aa>] blk_throtl_bio+0x1c3/0x485 ... Use u64_stats_init() for all affected data structures, which initializes the seqcount. Reported-and-Tested-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Peter Zijlstra <peterz@infradead.org> [ Folded in another fix from the mailing list as well as a fix to that fix. Tweaked commit message. ] Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1384314134-6895-1-git-send-email-john.stultz@linaro.org [ So I actually think that the two SOBs from PeterZ are the right depiction of the patch route. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-13locking/lockdep: Mark __lockdep_count_forward_deps() as staticFengguang Wu
There are new Sparse warnings: >> kernel/locking/lockdep.c:1235:15: sparse: symbol '__lockdep_count_forward_deps' was not declared. Should it be static? >> kernel/locking/lockdep.c:1261:15: sparse: symbol '__lockdep_count_backward_deps' was not declared. Should it be static? Please consider folding the attached diff :-) Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/527d1787.ThzXGoUspZWehFDl\%fengguang.wu@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-13prepend_path() needs to reinitialize dentry/vfsmount/mnt on restartsAl Viro
... and equivalent is needed in 3.12; it's broken there as well Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-13fix unpaired rcu lock in prepend_path()Li Zhong
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-13sched/fair: Avoid integer overflowMichal Nazarewicz
sa->runnable_avg_sum is of type u32 but after shifting it by NICE_0_SHIFT bits it is promoted to u64. This of course makes no sense, since the result will never be more then 32-bit long. Casting sa->runnable_avg_sum to u64 before it is shifted, fixes this problem. Reviewed-by: Ben Segall <bsegall@google.com> Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1384112521-25177-1-git-send-email-mpn@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-13sched: Optimize task_sched_runtime()Peter Zijlstra
Large multi-threaded apps like to hit this using do_sys_times() and then queue up on the rq->lock. Avoid when possible. Larry reported ~20% performance increase his test case. Reported-by: Larry Woodman <lwoodman@redhat.com> Suggested-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131111172925.GG26898@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-13sched/numa: Cure update_numa_stats() vs. hotplugPeter Zijlstra
Because we're completely unserialized against hotplug its well possible to try and generate numa stats for an offlined node. Bail out early (and avoid a /0) in this case. The resulting stats are all 0 which should result in an undesirable balance target -- not to mention that actually trying to migrate to an offline CPU will fail. Reported-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/n/tip-orja0qylcvyhxfsuebcyL5sI@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>