aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-05-21EDAC, ghes: Make platform-based whitelisting x86-onlyBorislav Petkov
ARM machines all have DMI tables so if they request hw error reporting through GHES, then the driver should be able to detect DIMMs and report errors successfully (famous last words :)). Make the platform-based list x86-specific so that ghes_edac can load on ARM. Reported-by: Qiang Zheng <zhengqiang10@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Tested-by: Qiang Zheng <zhengqiang10@huawei.com> Link: https://lkml.kernel.org/r/1526039543-180996-1-git-send-email-zhengqiang10@huawei.com
2018-05-15EDAC, altera: Fix ARM64 build warningThor Thayer
The kbuild test robot reported the following warning: drivers/edac/altera_edac.c: In function 'ocram_free_mem': drivers/edac/altera_edac.c:1410:42: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] gen_pool_free((struct gen_pool *)other, (u32)p, size); ^ After adding support for ARM64 architectures, the unsigned long parameter is 64 bits and causes a build warning on 64-bit configs. Fix by casting to the correct size (unsigned long) instead of u32. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: c3eea1942a16 ("EDAC, altera: Add Altera L2 cache and OCRAM support") Link: http://lkml.kernel.org/r/1526317441-4996-1-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-14EDAC, skx: Fix skx_edac build error when ACPI_NFIT=mRandy Dunlap
Prevent build error when CONFIG_ACPI_NFIT=m and CONFIG_EDAC_SKX=y by limiting EDAC_SKX based on how ACPI_NFIT is set. Fixes this build error: drivers/edac/skx_edac.o: In function `get_nvdimm_info': ../drivers/edac/skx_edac.c:399: undefined reference to `nfit_get_smbios_id' Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: 58ca9ac1463d ("EDAC, skx_edac: Detect non-volatile DIMMs") Link: http://lkml.kernel.org/r/3af91354-8e19-d2af-1bba-ced8dce053f1@infradead.org Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12EDAC, ghes: Use BIT() macroBorislav Petkov
... for improved readability. Also, add a local mask variable for the same reason. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
2018-05-12EDAC, ghes: Add DDR4 and NVDIMM memory typesToshi Kani
The ghes_edac driver obtains memory type from SMBIOS type 17, but it does not recognize DDR4 and NVDIMM types. Add support of DDR4 and NVDIMM types. NVDIMM type is denoted by memory type DDR3/4 and non-volatile. Reported-by: Robert Elliott <elliott@hpe.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180509222030.9299-1-toshi.kani@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12EDAC, altera: Handle SDRAM Uncorrectable Errors on Stratix10Thor Thayer
On Stratix10, uncorrectable errors are routed to the SError exception instead of the IRQ exceptions. In Stratix10, uncorrectable SErrors must be treated as fatal and will cause a panic. Older Altera/Intel parts printed out a message for UE so do that here using the notifier framework. Record the UE in sticky registers that retain the state through a reset. Check these registers on probe and printout the error on startup. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: mchehab@kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1526079610-5527-1-git-send-email-thor.thayer@linux.intel.com [ Remove unused var in s10_edac_dberr_handler(), reorder args. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12Documentation: dt: edac: Move Altera SOCFPGA EDAC fileThor Thayer
Move the Altera SOCFPGA EDAC file from bindings/arm/altera to bindings/edac. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Requested-by: Rob Herring <robh@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1525203036-21774-1-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12EDAC, altera: Add support for Stratix10 SDRAM EDACThor Thayer
Support for Stratix10 SDRAM ECC requires the use of SMC calls to Secure Monitor for accessing registers. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: catalin.marinas@arm.com Cc: devicetree@vger.kernel.org Cc: dinguyen@kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: robh+dt@kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1524854238-19394-3-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12Documentation: dt: socfpga: Add Stratix10 ECC Manager bindingThor Thayer
Add the device tree bindings needed to support the Stratix10 ECC Manager and SDRAM ECC. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: catalin.marinas@arm.com Cc: devicetree@vger.kernel.org Cc: dinguyen@kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1524854238-19394-2-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12Merge tag 'socfpga_updates_for_v4.18_part2' into edac-for-4.18Borislav Petkov
Pick up dependent socfpga_stratix10.dtsi changes from Dinh's tree to avoid merge conflicts with that same file in his tree. Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12EDAC, ghes: Remove unused argument to ghes_edac_report_mem_error()Alexandru Gagniuc
The use of the @ghes argument was removed in a previous commit, but function signature was not updated to reflect this. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Cc: linux-acpi@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180430213358.8319-1-mr.nuke.me@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-08arm64: dts: stratix10: add sdram eccThor Thayer
Add the Stratix10 ECC Manager and SDRAM EDAC nodes to the device tree. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2018-05-04EDAC, i7core: Fix spelling mistake: "redundacy" -> "redundancy"Colin Ian King
Trivial fix to spelling mistake in err string. Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: kernel-janitors@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180504113804.17103-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-02EDAC, ghes: Add a null pointer check in ghes_edac_unregister()Sughosh Ganu
Add a null check for ghes_pvt, before dereferencing it. The pointer could still be null in case the return path is taken before initialising ghes_pvt in the registration function. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Sughosh Ganu <sughosh.ganu@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lkml <linux-kernel@vger.kernel.org> Link: http://lkml.kernel.org/r/1524737809-24475-1-git-send-email-sughosh.ganu@arm.com Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-02ghes, EDAC: Fix ghes_edac registrationBorislav Petkov
Tony reported seeing "Internal error: Can't find EDAC structure" when injecting correctable errors due to the fact that ghes_edac would still load even if the whitelist won't hit. Drop the pr_err() in ghes_edac_report_mem_error() for now due to the hacky way how ghes_edac depends on ghes.c. While at it, make ghes_edac_register() return an error if it doesn't hit in the whitelist as it is the only sensible thing to do in that situation. Furthermore, move the call to it to happen last in ghes_probe() so that GHES initializing properly does not depend on ghes_edac init at all as latter is only reporting errors and not required for GHES's proper functioning. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Tested-by: Sughosh Ganu <sughosh.ganu@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20180420182015.zao3olss4tvvlxki@agluck-desk
2018-04-24arm64: dts: stratix10: Change pad skew values for EMAC0 PHY driverOoi, Joyce
The HPS EMAC0 drive strength is changed to 4mA because the initial 8mA drive strength has caused CE test to fail. This requires changes on the pad skew for EMAC0 PHY driver. Based on several measurements done, Tx clock does not require the extra 0.96ns delay. Signed-off-by: Ooi, Joyce <joyce.ooi@intel.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2018-04-24ARM: dts: consistently use 'atmel' as at24 manufacturer in cyclone5Bartosz Golaszewski
Using 'at' or 'at24' as the <manufacturer> part of the compatible string is now deprecated. Use a correct value: 'atmel,<model>'. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2018-04-22Linux 4.17-rc2Linus Torvalds
2018-04-22Merge tag 'drm-fixes-for-v4.17-rc2' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Exynos, i915, vc4, amdgpu fixes. i915: - an oops fix - two race fixes - some gvt fixes amdgpu: - dark screen fix - clk/voltage fix - vega12 smu fix vc4: - memory leak fix exynos just drops some code" * tag 'drm-fixes-for-v4.17-rc2' of git://people.freedesktop.org/~airlied/linux: (23 commits) drm/amd/powerplay: header file interface to SMU update drm/amd/pp: Fix bug voltage can't be OD separately on VI drm/amd/display: Don't program bypass on linear regamma LUT drm/i915: Fix LSPCON TMDS output buffer enabling from low-power state drm/i915/audio: Fix audio detection issue on GLK drm/i915: Call i915_perf_fini() on init_hw error unwind drm/i915/bios: filter out invalid DDC pins from VBT child devices drm/i915/pmu: Inspect runtime PM state more carefully while estimating RC6 drm/i915: Do no use kfree() to free a kmem_cache_alloc() return value drm/exynos: exynos_drm_fb -> drm_framebuffer drm/exynos: Move dma_addr out of exynos_drm_fb drm/exynos: Move GEM BOs to drm_framebuffer drm: Fix HDCP downstream dev count read drm/vc4: Fix memory leak during BO teardown drm/i915/execlists: Clear user-active flag on preemption completion drm/i915/gvt: Add drm_format_mod update drm/i915/gvt: Disable primary/sprite/cursor plane at virtual display initialization drm/i915/gvt: Delete redundant error message in fb_decode.c drm/i915/gvt: Cancel dma map when resetting ggtt entries drm/i915/gvt: Missed to cancel dma map for ggtt entries ...
2018-04-23Merge branch 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next - Fix a dark screen issue in DC - Fix clk/voltage dependency tracking for wattman - Update SMU interface for vega12 * 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux: drm/amd/powerplay: header file interface to SMU update drm/amd/pp: Fix bug voltage can't be OD separately on VI drm/amd/display: Don't program bypass on linear regamma LUT
2018-04-23Merge tag 'exynos-drm-fixes-for-v4.17-rc2' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next Remove Exynos specific framebuffer structure and relevant functions. - it removes exynos_drm_fb structure which is a wrapper of drm_framebuffer and unnecessary two exynos specific callback functions, exynos_drm_destory() and exynos_drm_fb_create_handle() because we can reuse existing drm common callback ones instead. * tag 'exynos-drm-fixes-for-v4.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos: drm/exynos: exynos_drm_fb -> drm_framebuffer drm/exynos: Move dma_addr out of exynos_drm_fb drm/exynos: Move GEM BOs to drm_framebuffer drm/amdkfd: Deallocate SDMA queues correctly drm/amdkfd: Fix scratch memory with HWS enabled
2018-04-23Merge tag 'drm-intel-next-fixes-2018-04-19' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next - Fix for FDO #105549: Avoid OOPS on bad VBT (Jani) - Fix rare pre-emption race (Chris) - Fix RC6 race against PM transitions (Tvrtko) * tag 'drm-intel-next-fixes-2018-04-19' of git://anongit.freedesktop.org/drm/drm-intel: drm/i915/audio: Fix audio detection issue on GLK drm/i915: Call i915_perf_fini() on init_hw error unwind drm/i915/bios: filter out invalid DDC pins from VBT child devices drm/i915/pmu: Inspect runtime PM state more carefully while estimating RC6 drm/i915: Do no use kfree() to free a kmem_cache_alloc() return value drm/i915/execlists: Clear user-active flag on preemption completion drm/i915/gvt: Add drm_format_mod update drm/i915/gvt: Disable primary/sprite/cursor plane at virtual display initialization drm/i915/gvt: Delete redundant error message in fb_decode.c drm/i915/gvt: Cancel dma map when resetting ggtt entries drm/i915/gvt: Missed to cancel dma map for ggtt entries drm/i915/gvt: Make MI_USER_INTERRUPT nop in cmd parser drm/i915/gvt: Mark expected switch fall-through in handle_g2v_notification drm/i915/gvt: throw error on unhandled vfio ioctls
2018-04-23Merge tag 'drm-misc-fixes-2018-04-18-1' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-fixes: stable: vc4: Fix memory leak during BO teardown (Daniel) dp: Add i2c retry for LSPCON adapters (Imre) hdcp: Fix device count mask (Ramalingam) Cc: Daniel J Blueman <daniel@quora.org Cc: Imre Deak <imre.deak@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> * tag 'drm-misc-fixes-2018-04-18-1' of git://anongit.freedesktop.org/drm/drm-misc: drm/i915: Fix LSPCON TMDS output buffer enabling from low-power state drm: Fix HDCP downstream dev count read drm/vc4: Fix memory leak during BO teardown
2018-04-22Merge tag '4.17-rc1-SMB3-CIFS' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Various SMB3/CIFS fixes. There are three more security related fixes in progress that are not included in this set but they are still being tested and reviewed, so sending this unrelated set of smaller fixes now" * tag '4.17-rc1-SMB3-CIFS' of git://git.samba.org/sfrench/cifs-2.6: CIFS: fix typo in cifs_dbg cifs: do not allow creating sockets except with SMB1 posix exensions cifs: smbd: Dump SMB packet when configured cifs: smbd: Check for iov length on sending the last iov fs: cifs: Adding new return type vm_fault_t cifs: smb2ops: Fix NULL check in smb2_query_symlink
2018-04-22Merge tag 'for-4.17-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "This contains a few fixups to the qgroup patches that were merged this dev cycle, unaligned access fix, blockgroup removal corner case fix and a small debugging output tweak" * tag 'for-4.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: print-tree: debugging output enhancement btrfs: Fix race condition between delayed refs and blockgroup removal btrfs: fix unaligned access in readdir btrfs: Fix wrong btrfs_delalloc_release_extents parameter btrfs: delayed-inode: Remove wrong qgroup meta reservation calls btrfs: qgroup: Use independent and accurate per inode qgroup rsv btrfs: qgroup: Commit transaction in advance to reduce early EDQUOT
2018-04-22Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of fixes for x86: - Prevent X2APIC ID 0xFFFFFFFF from being treated as valid, which causes the possible CPU count to be wrong. - Prevent 32bit truncation in calc_hpet_ref() which causes the TSC calibration to fail - Fix the page table setup for temporary text mappings in the resume code which causes resume failures - Make the page table dump code handle HIGHPTE correctly instead of oopsing - Support for topologies where NUMA nodes share an LLC to prevent a invalid topology warning and further malfunction on such systems. - Remove the now unused pci-nommu code - Remove stale function declarations" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/power/64: Fix page-table setup for temporary text mapping x86/mm: Prevent kernel Oops in PTDUMP code with HIGHPTE=y x86,sched: Allow topologies where NUMA nodes share an LLC x86/processor: Remove two unused function declarations x86/acpi: Prevent X2APIC id 0xffffffff from being accounted x86/tsc: Prevent 32bit truncation in calc_hpet_ref() x86: Remove pci-nommu.c
2018-04-22Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A small set of timer fixes: - Evaluate the -ETIME condition correctly in the imx tpm driver - Fix the evaluation order of a condition in posix cpu timers - Use pr_cont() in the clockevents code to prevent ugly message splitting - Remove __current_kernel_time() which is now unused to prevent that new users show up. - Remove a stale forward declaration" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/imx-tpm: Correct -ETIME return condition check posix-cpu-timers: Ensure set_process_cpu_timer is always evaluated timekeeping: Remove __current_kernel_time() timers: Remove stale struct tvec_base forward declaration clockevents: Fix kernel messages split across multiple lines
2018-04-22Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A larger set of updates for perf. Kernel: - Handle the SBOX uncore monitoring correctly on Broadwell CPUs which do not have SBOX. - Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE]. The percentage of preempting and non-preempting context switches help understanding the nature of workloads (CPU or IO bound) that are running on a machine. This adds the kernel facility and userspace changes needed to show this information in 'perf script' and 'perf report -D' (Alexey Budankov) - Remove a WARN_ON() in the trace/kprobes code which is pointless because the return error code is already telling the caller what's wrong. - Revert a fugly workaround for clang BPF targets. - Fix sample_max_stack maximum check and do not proceed when an error has been detect, return them to avoid misidentifying errors (Jiri Olsa) - Add SPDX idenitifiers and get rid of GPL boilderplate. Tools: - Synchronize kernel ABI headers, v4.17-rc1 (Ingo Molnar) - Support MAP_FIXED_NOREPLACE, noticed when updating the tools/include/ copies (Arnaldo Carvalho de Melo) - Add '\n' at the end of parse-options error messages (Ravi Bangoria) - Add s390 support for detailed/verbose PMU event description (Thomas Richter) - perf annotate fixes and improvements: * Allow showing offsets in more than just jump targets, use the new 'O' hotkey in the TUI, config ~/.perfconfig annotate.offset_level for it and for --stdio2 (Arnaldo Carvalho de Melo) * Use the resolved variable names from objdump disassembled lines to make them more compact, just like was already done for some instructions, like "mov", this eventually will be done more generally, but lets now add some more to the existing mechanism (Arnaldo Carvalho de Melo) - perf record fixes: * Change warning for missing topology sysfs entry to debug, as not all architectures have those files, s390 being one of those (Thomas Richter) * Remove old error messages about things that unlikely to be the root cause in modern systems (Andi Kleen) - perf sched fixes: * Fix -g/--call-graph documentation (Takuya Yamamoto) - perf stat: * Enable 1ms interval for printing event counters values in (Alexey Budankov) - perf test fixes: * Run dwarf unwind on arm32 (Kim Phillips) * Remove unused ptrace.h include from LLVM test, sidesteping older clang's lack of support for some asm constructs (Arnaldo Carvalho de Melo) * Fixup BPF test using epoll_pwait syscall function probe, to cope with the syscall routines renames performed in this development cycle (Arnaldo Carvalho de Melo) - perf version fixes: * Do not print info about HAVE_LIBAUDIT_SUPPORT in 'perf version --build-options' when HAVE_SYSCALL_TABLE_SUPPORT is true, as libaudit won't be used in that case, print info about syscall_table support instead (Jin Yao) - Build system fixes: * Use HAVE_..._SUPPORT used consistently (Jin Yao) * Restore READ_ONCE() C++ compatibility in tools/include (Mark Rutland) * Give hints about package names needed to build jvmti (Arnaldo Carvalho de Melo)" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs perf/x86/intel/uncore: Revert "Remove SBOX support for Broadwell server" coresight: Move to SPDX identifier perf test BPF: Fixup BPF test using epoll_pwait syscall function probe perf tests mmap: Show which tracepoint is failing perf tools: Add '\n' at the end of parse-options error messages perf record: Remove suggestion to enable APIC perf record: Remove misleading error suggestion perf hists browser: Clarify top/report browser help perf mem: Allow all record/report options perf trace: Support MAP_FIXED_NOREPLACE perf: Remove superfluous allocation error check perf: Fix sample_max_stack maximum check perf: Return proper values for user stack errors perf list: Add s390 support for detailed/verbose PMU event description perf script: Extend misc field decoding with switch out event type perf report: Extend raw dump (-D) out with switch out event type perf/core: Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE] tools/headers: Synchronize kernel ABI headers, v4.17-rc1 trace_kprobe: Remove warning message "Could not insert probe at..." ...
2018-04-22Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Thomas Gleixner: "A single fix for objtool so it uses the host C and LD flags and not the target ones" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Support HOSTCFLAGS and HOSTLDFLAGS
2018-04-21Merge tag 'random_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull /dev/random fixes from Ted Ts'o: "Fix some bugs in the /dev/random driver which causes getrandom(2) to unblock earlier than designed. Thanks to Jann Horn from Google's Project Zero for pointing this out to me" * tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: add new ioctl RNDRESEEDCRNG random: crng_reseed() should lock the crng instance that it is modifying random: set up the NUMA crng instances after the CRNG is fully initialized random: use a different mixing algorithm for add_device_randomness() random: fix crng_ready() test
2018-04-21Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A regression fix, new unit test infrastructure and a build fix: - Regression fix addressing support for the new NVDIMM label storage area access commands (_LSI, _LSR, and _LSW). The Intel specific version of these commands communicated the "Device Locked" status on the label-storage-information command. However, these new commands (standardized in ACPI 6.2) communicate the "Device Locked" status on the label-storage-read command, and the driver was missing the indication. Reading from locked persistent memory is similar to reading unmapped PCI memory space, returns all 1's. - Unit test infrastructure is added to regression test the "Device Locked" detection failure. - A build fix is included to allow the "of_pmem" driver to be built as a module and translate an Open Firmware described device to its local numa node" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: MAINTAINERS: Add backup maintainers for libnvdimm and DAX device-dax: allow MAP_SYNC to succeed Revert "libnvdimm, of_pmem: workaround OF_NUMA=n build error" libnvdimm, of_pmem: use dev_to_node() instead of of_node_to_nid() tools/testing/nvdimm: enable labels for nfit_test.1 dimms tools/testing/nvdimm: fix missing newline in nfit_test_dimm 'handle' attribute tools/testing/nvdimm: support nfit_test_dimm attributes under nfit_test.1 tools/testing/nvdimm: allow custom error code injection libnvdimm, dimm: handle EACCES failures from label reads
2018-04-21Merge tag 'sound-4.17-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A few small fixes: - a fix for the NULL-dereference in rawmidi compat ioctls, triggered by fuzzer - HD-audio Realtek codec quirks, a VIA controller fixup - a long-standing bug fix in LINE6 MIDI" * tag 'sound-4.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: rawmidi: Fix missing input substream checks in compat ioctls ALSA: hda/realtek - adjust the location of one mic ALSA: hda/realtek - set PINCFG_HEADSET_MIC to parse_flags ALSA: hda - New VIA controller suppor no-snoop path ALSA: line6: Use correct endpoint type for midi output
2018-04-21Merge tag 'linux-watchdog-4.17-rc2' of ↵Linus Torvalds
git://www.linux-watchdog.org/linux-watchdog Pull watchdog fixes from Wim Van Sebroeck: - fall-through fixes - MAINTAINER change for hpwdt - renesas-wdt: Add support for WDIOF_CARDRESET - aspeed: set bootstatus during probe * tag 'linux-watchdog-4.17-rc2' of git://www.linux-watchdog.org/linux-watchdog: aspeed: watchdog: Set bootstatus during probe watchdog: renesas-wdt: Add support for WDIOF_CARDRESET watchdog: wafer5823wdt: Mark expected switch fall-through watchdog: w83977f_wdt: Mark expected switch fall-through watchdog: sch311x_wdt: Mark expected switch fall-through watchdog: hpwdt: change maintainer.
2018-04-21Merge tag 'linux-kselftest-4.17-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fix from Shuah Khan: "A fix from Michael Ellerman to not run dnotify_test by default to prevent Kselftest running forever" * tag 'linux-kselftest-4.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/filesystems: Don't run dnotify_test by default
2018-04-21Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - kasan: avoid pfn_to_nid() before the page array is initialised - Fix typo causing the "upgrade" of known signals to SIGKILL * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: signal: don't force known signals to SIGKILL arm64: kasan: avoid pfn_to_nid() before page array is initialized
2018-04-21Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: - "fork: unconditionally clear stack on fork" is a non-bugfix which got lost during the merge window - performance concerns appear to have been adequately addressed. - and a bunch of fixes * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/filemap.c: fix NULL pointer in page_cache_tree_insert() mm: memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create() fs, elf: don't complain MAP_FIXED_NOREPLACE unless -EEXIST error kexec_file: do not add extra alignment to efi memmap proc: fix /proc/loadavg regression proc: revalidate kernel thread inodes to root:root autofs: mount point create should honour passed in mode MAINTAINERS: add personal addresses for Sascha and Uwe kasan: add no_sanitize attribute for clang builds rapidio: fix rio_dma_transfer error handling mm: enable thp migration for shmem thp writeback: safer lock nesting mm, pagemap: fix swap offset value for PMD migration entry mm: fix do_pages_move status handling fork: unconditionally clear stack on fork
2018-04-21Merge tag 'perf-urgent-for-mingo-4.17-20180420' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes and improvements from Arnaldo Carvalho de Melo: - Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE]. The percentage of preempting and non-preempting context switches help understanding the nature of workloads (CPU or IO bound) that are running on a machine. This adds the kernel facility and userspace changes needed to show this information in 'perf script' and 'perf report -D' (Alexey Budankov) - Remove old error messages about things that unlikely to be the root cause in modern systems (Andi Kleen) - Synchronize kernel ABI headers, v4.17-rc1 (Ingo Molnar) - Support MAP_FIXED_NOREPLACE, noticed when updating the tools/include/ copies (Arnaldo Carvalho de Melo) - Fixup BPF test using epoll_pwait syscall function probe, to cope with the syscall routines renames performed in this development cycle (Arnaldo Carvalho de Melo) - Fix sample_max_stack maximum check and do not proceed when an error has been detect, return them to avoid misidentifying errors (Jiri Olsa) - Add '\n' at the end of parse-options error messages (Ravi Bangoria) - Add s390 support for detailed/verbose PMU event description (Thomas Richter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-04-20mm/filemap.c: fix NULL pointer in page_cache_tree_insert()Matthew Wilcox
f2fs specifies the __GFP_ZERO flag for allocating some of its pages. Unfortunately, the page cache also uses the mapping's GFP flags for allocating radix tree nodes. It always masked off the __GFP_HIGHMEM flag, and masks off __GFP_ZERO in some paths, but not all. That causes radix tree nodes to be allocated with a NULL list_head, which causes backtraces like: __list_del_entry+0x30/0xd0 list_lru_del+0xac/0x1ac page_cache_tree_insert+0xd8/0x110 The __GFP_DMA and __GFP_DMA32 flags would also be able to sneak through if they are ever used. Fix them all by using GFP_RECLAIM_MASK at the innermost location, and remove it from earlier in the callchain. Link: http://lkml.kernel.org/r/20180411060320.14458-2-willy@infradead.org Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check") Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reported-by: Chris Fries <cfries@google.com> Debugged-by: Minchan Kim <minchan@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20mm: memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create()Minchan Kim
If there is heavy memory pressure, page allocation with __GFP_NOWAIT fails easily although it's order-0 request. I got below warning 9 times for normal boot. <snip >: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK) .. snip .. Call trace: dump_backtrace+0x0/0x4 dump_stack+0xa4/0xc0 warn_alloc+0xd4/0x15c __alloc_pages_nodemask+0xf88/0x10fc alloc_slab_page+0x40/0x18c new_slab+0x2b8/0x2e0 ___slab_alloc+0x25c/0x464 __kmalloc+0x394/0x498 memcg_kmem_get_cache+0x114/0x2b8 kmem_cache_alloc+0x98/0x3e8 mmap_region+0x3bc/0x8c0 do_mmap+0x40c/0x43c vm_mmap_pgoff+0x15c/0x1e4 sys_mmap+0xb0/0xc8 el0_svc_naked+0x24/0x28 Mem-Info: active_anon:17124 inactive_anon:193 isolated_anon:0 active_file:7898 inactive_file:712955 isolated_file:55 unevictable:0 dirty:27 writeback:18 unstable:0 slab_reclaimable:12250 slab_unreclaimable:23334 mapped:19310 shmem:212 pagetables:816 bounce:0 free:36561 free_pcp:1205 free_cma:35615 Node 0 active_anon:68496kB inactive_anon:772kB active_file:31592kB inactive_file:2851820kB unevictable:0kB isolated(anon):0kB isolated(file):220kB mapped:77240kB dirty:108kB writeback:72kB shmem:848kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no DMA free:142188kB min:3056kB low:3820kB high:4584kB active_anon:10052kB inactive_anon:12kB active_file:312kB inactive_file:1412620kB unevictable:0kB writepending:0kB present:1781412kB managed:1604728kB mlocked:0kB slab_reclaimable:3592kB slab_unreclaimable:876kB kernel_stack:400kB pagetables:52kB bounce:0kB free_pcp:1436kB local_pcp:124kB free_cma:142492kB lowmem_reserve[]: 0 1842 1842 Normal free:4056kB min:4172kB low:5212kB high:6252kB active_anon:58376kB inactive_anon:760kB active_file:31348kB inactive_file:1439040kB unevictable:0kB writepending:180kB present:2000636kB managed:1923688kB mlocked:0kB slab_reclaimable:45408kB slab_unreclaimable:92460kB kernel_stack:9680kB pagetables:3212kB bounce:0kB free_pcp:3392kB local_pcp:688kB free_cma:0kB lowmem_reserve[]: 0 0 0 DMA: 0*4kB 0*8kB 1*16kB (C) 0*32kB 0*64kB 0*128kB 1*256kB (C) 1*512kB (C) 0*1024kB 1*2048kB (C) 34*4096kB (C) = 142096kB Normal: 228*4kB (UMEH) 172*8kB (UMH) 23*16kB (UH) 24*32kB (H) 5*64kB (H) 1*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3872kB 721350 total pagecache pages 0 pages in swap cache Swap cache stats: add 0, delete 0, find 0/0 Free swap = 0kB Total swap = 0kB 945512 pages RAM 0 pages HighMem/MovableOnly 63408 pages reserved 51200 pages cma reserved __memcg_schedule_kmem_cache_create() tries to create a shadow slab cache and the worker allocation failure is not really critical because we will retry on the next kmem charge. We might miss some charges but that shouldn't be critical. The excessive allocation failure report is not very helpful. [mhocko@kernel.org: changelog update] Link: http://lkml.kernel.org/r/20180418022912.248417-1-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20fs, elf: don't complain MAP_FIXED_NOREPLACE unless -EEXIST errorTetsuo Handa
Commit 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") is printing spurious messages under memory pressure due to map_addr == -ENOMEM. 9794 (a.out): Uhuuh, elf segment at 00007f2e34738000(fffffffffffffff4) requested but the memory is mapped already 14104 (a.out): Uhuuh, elf segment at 00007f34fd76c000(fffffffffffffff4) requested but the memory is mapped already 16843 (a.out): Uhuuh, elf segment at 00007f930ecc7000(fffffffffffffff4) requested but the memory is mapped already Complain only if -EEXIST, and use %px for printing the address. Link: http://lkml.kernel.org/r/201804182307.FAC17665.SFMOFJVFtHOLOQ@I-love.SAKURA.ne.jp Fixes: 4ed28639519c7bad ("fs, elf: drop MAP_FIXED usage from elf_map") is Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andrei Vagin <avagin@openvz.org> Cc: Khalid Aziz <khalid.aziz@oracle.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Kees Cook <keescook@chromium.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Joel Stanley <joel@jms.id.au> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20kexec_file: do not add extra alignment to efi memmapDave Young
Chun-Yi reported a kernel warning message below: WARNING: CPU: 0 PID: 0 at ../mm/early_ioremap.c:182 early_iounmap+0x4f/0x12c() early_iounmap(ffffffffff200180, 00000118) [0] size not consistent 00000120 The problem is x86 kexec_file_load adds extra alignment to the efi memmap: in bzImage64_load(): efi_map_sz = efi_get_runtime_map_size(); efi_map_sz = ALIGN(efi_map_sz, 16); And __efi_memmap_init maps with the size including the alignment bytes but efi_memmap_unmap use nr_maps * desc_size which does not include the extra bytes. The alignment in kexec code is only needed for the kexec buffer internal use Actually kexec should pass exact size of the efi memmap to 2nd kernel. Link: http://lkml.kernel.org/r/20180417083600.GA1972@dhcp-128-65.nay.redhat.com Signed-off-by: Dave Young <dyoung@redhat.com> Reported-by: joeyli <jlee@suse.com> Tested-by: Randy Wright <rwright@hpe.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20proc: fix /proc/loadavg regressionAlexey Dobriyan
Commit 95846ecf9dac ("pid: replace pid bitmap implementation with IDR API") changed last field of /proc/loadavg (last pid allocated) to be off by one: # unshare -p -f --mount-proc cat /proc/loadavg 0.00 0.00 0.00 1/60 2 <=== It should be 1 after first fork into pid namespace. This is formally a regression but given how useless this field is I don't think anyone is affected. Bug was found by /proc testsuite! Link: http://lkml.kernel.org/r/20180413175408.GA27246@avx2 Fixes: 95846ecf9dac508 ("pid: replace pid bitmap implementation with IDR API") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gargi Sharma <gs051095@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20proc: revalidate kernel thread inodes to root:rootAlexey Dobriyan
task_dump_owner() has the following code: mm = task->mm; if (mm) { if (get_dumpable(mm) != SUID_DUMP_USER) { uid = ... } } Check for ->mm is buggy -- kernel thread might be borrowing mm and inode will go to some random uid:gid pair. Link: http://lkml.kernel.org/r/20180412220109.GA20978@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20autofs: mount point create should honour passed in modeIan Kent
The autofs file system mkdir inode operation blindly sets the created directory mode to S_IFDIR | 0555, ingoring the passed in mode, which can cause selinux dac_override denials. But the function also checks if the caller is the daemon (as no-one else should be able to do anything here) so there's no point in not honouring the passed in mode, allowing the daemon to set appropriate mode when required. Link: http://lkml.kernel.org/r/152361593601.8051.14014139124905996173.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20MAINTAINERS: add personal addresses for Sascha and UweUwe Kleine-König
The idea behind using kernel@pengutronix.de (i.e. the mail alias for the kernel people at Pengutronix) as email address was to have a backup when a given developer is on vacation or run over by a bus. Make this more explicit by adding the alias as reviewer and use the personal address for Sascha and me. Link: http://lkml.kernel.org/r/20180413083312.11213-1-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20kasan: add no_sanitize attribute for clang buildsAndrey Konovalov
KASAN uses the __no_sanitize_address macro to disable instrumentation of particular functions. Right now it's defined only for GCC build, which causes false positives when clang is used. This patch adds a definition for clang. Note, that clang's revision 329612 or higher is required. [andreyknvl@google.com: remove redundant #ifdef CONFIG_KASAN check] Link: http://lkml.kernel.org/r/c79aa31a2a2790f6131ed607c58b0dd45dd62a6c.1523967959.git.andreyknvl@google.com Link: http://lkml.kernel.org/r/4ad725cc903f8534f8c8a60f0daade5e3d674f8d.1523554166.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Paul Lawrence <paullawrence@google.com> Cc: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20rapidio: fix rio_dma_transfer error handlingIoan Nicu
Some of the mport_dma_req structure members were initialized late inside the do_dma_request() function, just before submitting the request to the dma engine. But we have some error branches before that. In case of such an error, the code would return on the error path and trigger the calling of dma_req_free() with a req structure which is not completely initialized. This causes a NULL pointer dereference in dma_req_free(). This patch fixes these error branches by making sure that all necessary mport_dma_req structure members are initialized in rio_dma_transfer() immediately after the request structure gets allocated. Link: http://lkml.kernel.org/r/20180412150605.GA31409@nokia.com Fixes: bbd876adb8c72 ("rapidio: use a reference count for struct mport_dma_req") Signed-off-by: Ioan Nicu <ioan.nicu.ext@nokia.com> Tested-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Acked-by: Alexandre Bounine <alex.bou9@gmail.com> Cc: Barry Wood <barry.wood@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Frank Kunz <frank.kunz@nokia.com> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20mm: enable thp migration for shmem thpNaoya Horiguchi
My testing for the latest kernel supporting thp migration showed an infinite loop in offlining the memory block that is filled with shmem thps. We can get out of the loop with a signal, but kernel should return with failure in this case. What happens in the loop is that scan_movable_pages() repeats returning the same pfn without any progress. That's because page migration always fails for shmem thps. In memory offline code, memory blocks containing unmovable pages should be prevented from being offline targets by has_unmovable_pages() inside start_isolate_page_range(). So it's possible to change migratability for non-anonymous thps to avoid the issue, but it introduces more complex and thp-specific handling in migration code, so it might not good. So this patch is suggesting to fix the issue by enabling thp migration for shmem thp. Both of anon/shmem thp are migratable so we don't need precheck about the type of thps. Link: http://lkml.kernel.org/r/20180406030706.GA2434@hori1.linux.bs1.fc.nec.co.jp Fixes: commit 72b39cfc4d75 ("mm, memory_hotplug: do not fail offlining too early") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Zi Yan <zi.yan@sent.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20writeback: safer lock nestingGreg Thelen
lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if the page's memcg is undergoing move accounting, which occurs when a process leaves its memcg for a new one that has memory.move_charge_at_immigrate set. unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if the given inode is switching writeback domains. Switches occur when enough writes are issued from a new domain. This existing pattern is thus suspicious: lock_page_memcg(page); unlocked_inode_to_wb_begin(inode, &locked); ... unlocked_inode_to_wb_end(inode, locked); unlock_page_memcg(page); If both inode switch and process memcg migration are both in-flight then unlocked_inode_to_wb_end() will unconditionally enable interrupts while still holding the lock_page_memcg() irq spinlock. This suggests the possibility of deadlock if an interrupt occurs before unlock_page_memcg(). truncate __cancel_dirty_page lock_page_memcg unlocked_inode_to_wb_begin unlocked_inode_to_wb_end <interrupts mistakenly enabled> <interrupt> end_page_writeback test_clear_page_writeback lock_page_memcg <deadlock> unlock_page_memcg Due to configuration limitations this deadlock is not currently possible because we don't mix cgroup writeback (a cgroupv2 feature) and memory.move_charge_at_immigrate (a cgroupv1 feature). If the kernel is hacked to always claim inode switching and memcg moving_account, then this script triggers lockup in less than a minute: cd /mnt/cgroup/memory mkdir a b echo 1 > a/memory.move_charge_at_immigrate echo 1 > b/memory.move_charge_at_immigrate ( echo $BASHPID > a/cgroup.procs while true; do dd if=/dev/zero of=/mnt/big bs=1M count=256 done ) & while true; do sync done & sleep 1h & SLEEP=$! while true; do echo $SLEEP > a/cgroup.procs echo $SLEEP > b/cgroup.procs done The deadlock does not seem possible, so it's debatable if there's any reason to modify the kernel. I suggest we should to prevent future surprises. And Wang Long said "this deadlock occurs three times in our environment", so there's more reason to apply this, even to stable. Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch see "[PATCH for-4.4] writeback: safer lock nesting" https://lkml.org/lkml/2018/4/11/146 Wang Long said "this deadlock occurs three times in our environment" [gthelen@google.com: v4] Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com [akpm@linux-foundation.org: comment tweaks, struct initialization simplification] Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613 Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com Fixes: 682aa8e1a6a1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates") Signed-off-by: Greg Thelen <gthelen@google.com> Reported-by: Wang Long <wanglong19@meituan.com> Acked-by: Wang Long <wanglong19@meituan.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> [v4.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20mm, pagemap: fix swap offset value for PMD migration entryHuang Ying
The swap offset reported by /proc/<pid>/pagemap may be not correct for PMD migration entries. If addr passed into pagemap_pmd_range() isn't aligned with PMD start address, the swap offset reported doesn't reflect this. And in the loop to report information of each sub-page, the swap offset isn't increased accordingly as that for PFN. This may happen after opening /proc/<pid>/pagemap and seeking to a page whose address doesn't align with a PMD start address. I have verified this with a simple test program. BTW: migration swap entries have PFN information, do we need to restrict whether to show them? [akpm@linux-foundation.org: fix typo, per Huang, Ying] Link: http://lkml.kernel.org/r/20180408033737.10897-1-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrei Vagin <avagin@openvz.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Jerome Glisse" <jglisse@redhat.com> Cc: Daniel Colascione <dancol@google.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>