aboutsummaryrefslogtreecommitdiff
path: root/arch/s390/include
AgeCommit message (Collapse)Author
2024-01-01s390/vx: fix save/restore of fpu kernel contextHeiko Carstens
[ Upstream commit e6b2dab41888332bf83f592131e7ea07756770a4 ] The KERNEL_FPR mask only contains a flag for the first eight vector registers. However floating point registers overlay parts of the first sixteen vector registers. This could lead to vector register corruption if a kernel fpu context uses any of the vector registers 8 to 15 and is interrupted or calls a KERNEL_FPR context. If that context uses also vector registers 8 to 15, their contents will be corrupted on return. Luckily this is currently not a real bug, since the kernel has only one KERNEL_FPR user with s390_adjust_jiffies() and it is only using floating point registers 0 to 2. Fix this by using the correct bits for KERNEL_FPR. Fixes: 7f79695cc1b6 ("s390/fpu: improve kernel_fpu_[begin|end]") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13s390/pkey: fix PKEY_TYPE_EP11_AES handling for sysfs attributesHolger Dengler
[ Upstream commit b9352e4b9b9eff949bcc6907b8569b3a1d992f1e ] Commit 'fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC private keys")' introduced a new PKEY_TYPE_EP11_AES securekey type as a supplement to the existing PKEY_TYPE_EP11 (which won't work in environments with session-bound keys). The pkey EP11 securekey attributes use PKEY_TYPE_EP11_AES (instead of PKEY_TYPE_EP11) keyblobs, to make the generated keyblobs usable also in environments, where session-bound keys are required. There should be no negative impacts to userspace because the internal structure of the keyblobs is opaque. The increased size of the generated keyblobs is reflected by the changed size of the attributes. Fixes: fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC private keys") Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10s390/ap: fix status returned by ap_qact()Halil Pasic
[ Upstream commit a2522c80f074c35254974fec39fffe8b8d75befe ] Since commit 159491f3b509 ("s390/ap: rework assembler functions to use unions for in/out register variables") the function ap_qact() tries to grab the status from the wrong part of the register. Thus we always end up with zeros. Which is wrong, among others, because we detect failures via status.response_code. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reported-by: Harald Freudenberger <freude@linux.ibm.com> Fixes: 159491f3b509 ("s390/ap: rework assembler functions to use unions for in/out register variables") Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10s390/ap: fix status returned by ap_aqic()Halil Pasic
[ Upstream commit 394740d7645ea767795074287769dd26dbd4d782 ] There function ap_aqic() tries to grab the status from the wrong part of the register. Thus we always end up with zeros. Which is wrong, among others, because we detect failures via status.response_code. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reported-by: Janosch Frank <frankja@linux.ibm.com> Fixes: 159491f3b509 ("s390/ap: rework assembler functions to use unions for in/out register variables") Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01s390/debug: add _ASM_S390_ prefix to header guardNiklas Schnelle
[ Upstream commit 0d4d52361b6c29bf771acd4fa461f06d78fb2fac ] Using DEBUG_H without a prefix is very generic and inconsistent with other header guards in arch/s390/include/asm. In fact it collides with the same name in the ath9k wireless driver though that depends on !S390 via disabled wireless support. Let's just use a consistent header guard name and prevent possible future trouble. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple()Heiko Carstens
commit e3f360db08d55a14112bd27454e616a24296a8b0 upstream. Make sure that *ptr__ within arch_this_cpu_to_op_simple() is only dereferenced once by using READ_ONCE(). Otherwise the compiler could generate incorrect code. Cc: <stable@vger.kernel.org> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18s390/cpum_sf: add READ_ONCE() semantics to compare and swap loopsHeiko Carstens
commit 82d3edb50a11bf3c5ef63294d5358ba230181413 upstream. The current cmpxchg_double() loops within the perf hw sampling code do not have READ_ONCE() semantics to read the old value from memory. This allows the compiler to generate code which reads the "old" value several times from memory, which again allows for inconsistencies. For example: /* Reset trailer (using compare-double-and-swap) */ do { te_flags = te->flags & ~SDB_TE_BUFFER_FULL_MASK; te_flags |= SDB_TE_ALERT_REQ_MASK; } while (!cmpxchg_double(&te->flags, &te->overflow, te->flags, te->overflow, te_flags, 0ULL)); The compiler could generate code where te->flags used within the cmpxchg_double() call may be refetched from memory and which is not necessarily identical to the previous read version which was used to generate te_flags. Which in turn means that an incorrect update could happen. Fix this by adding READ_ONCE() semantics to all cmpxchg_double() loops. Given that READ_ONCE() cannot generate code on s390 which atomically reads 16 bytes, use a private compare-and-swap-double implementation to achieve that. Also replace cmpxchg_double() with the private implementation to be able to re-use the old value within the loops. As a side effect this converts the whole code to only use bit fields to read and modify bits within the hws trailer header. Reported-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.ibm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/linux-s390/Y71QJBhNTIatvxUT@osiris/T/#ma14e2a5f7aa8ed4b94b6f9576799b3ad9c60f333 Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-02Merge tag 'mm-hotfixes-stable-2022-12-02' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "15 hotfixes, 11 marked cc:stable. Only three or four of the latter address post-6.0 issues, which is hopefully a sign that things are converging" * tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible" Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths mm/khugepaged: fix GUP-fast interaction by sending IPI mm/khugepaged: take the right locks for page table retraction mm: migrate: fix THP's mapcount on isolation mm: introduce arch_has_hw_nonleaf_pmd_young() mm: add dummy pmd_young() for architectures not having it mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes() tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep" nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing madvise: use zap_page_range_single for madvise dontneed mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE
2022-11-30mm: add dummy pmd_young() for architectures not having itJuergen Gross
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a fallback. This is required for the later patch "mm: introduce arch_has_hw_nonleaf_pmd_young()". Link: https://lkml.kernel.org/r/fd3ac3cd-7349-6bbd-890a-71a9454ca0b3@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-16s390: avoid using global register for current_stack_pointerVasily Gorbik
Commit 30de14b1884b ("s390: current_stack_pointer shouldn't be a function") made current_stack_pointer a global register variable like on many other architectures. Unfortunately on s390 it uncovers old gcc bug which is fixed only since gcc-9.1 [gcc commit 3ad7fed1cc87 ("S/390: Fix PR89775. Stackpointer save/restore instructions removed")] and backported to gcc-8.4 and later. Due to this bug gcc versions prior to 8.4 generate broken code which leads to stack corruptions. Current minimal gcc version required to build the kernel is declared as 5.1. It is not possible to fix all old gcc versions, so work around this problem by avoiding using global register variable for current_stack_pointer. Fixes: 30de14b1884b ("s390: current_stack_pointer shouldn't be a function") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-10-26s390/futex: add missing EX_TABLE entry to __futex_atomic_op()Heiko Carstens
For some exception types the instruction address points behind the instruction that caused the exception. Take that into account and add the missing exception table entry. Cc: <stable@vger.kernel.org> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-12Merge tag 'mm-nonmm-stable-2022-10-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - hfs and hfsplus kmap API modernization (Fabio Francesco) - make crash-kexec work properly when invoked from an NMI-time panic (Valentin Schneider) - ntfs bugfixes (Hawkins Jiawei) - improve IPC msg scalability by replacing atomic_t's with percpu counters (Jiebin Sun) - nilfs2 cleanups (Minghao Chi) - lots of other single patches all over the tree! * tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype proc: test how it holds up with mapping'less process mailmap: update Frank Rowand email address ia64: mca: use strscpy() is more robust and safer init/Kconfig: fix unmet direct dependencies ia64: update config files nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure fork: remove duplicate included header files init/main.c: remove unnecessary (void*) conversions proc: mark more files as permanent nilfs2: remove the unneeded result variable nilfs2: delete unnecessary checks before brelse() checkpatch: warn for non-standard fixes tag style usr/gen_init_cpio.c: remove unnecessary -1 values from int file ipc/msg: mitigate the lock contention with percpu counter percpu: add percpu_counter_add_local and percpu_counter_sub_local fs/ocfs2: fix repeated words in comments relay: use kvcalloc to alloc page array in relay_alloc_page_array proc: make config PROC_CHILDREN depend on PROC_FS fs: uninline inode_maybe_inc_iversion() ...
2022-10-09Merge tag 's390-6.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Make use of the IBM z16 processor activity instrumentation facility extension to count neural network processor assist operations: add a new PMU device driver so that perf can make use of this. - Rework memcpy_real() to avoid DAT-off mode. - Rework absolute lowcore access code. - Various small fixes and improvements all over the code. * tag 's390-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: remove unused bus_next field from struct zpci_dev s390/cio: remove unused ccw_device_force_console() declaration s390/pai: Add support for PAI Extension 1 NNPA counters s390/mm: fix no previous prototype warnings in maccess.c s390/mm: uninline copy_oldmem_kernel() function s390/mm,ptdump: add real memory copy page markers s390/mm: rework memcpy_real() to avoid DAT-off mode s390/dump: save IPL CPU registers once DAT is available s390/pci: convert high_memory to physical address s390/smp,ptdump: add absolute lowcore markers s390/smp: rework absolute lowcore access s390/smp: call smp_reinit_ipl_cpu() before scheduler is available s390/ptdump: add missing amode31 markers s390/mm: split lowcore pages with set_memory_4k() s390/mm: remove unused access parameter from do_fault_error() s390/delay: sync comment within __delay() with reality s390: move from strlcpy with unused retval to strscpy
2022-10-07Merge tag 'tty-6.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here is the big set of TTY and Serial driver updates for 6.1-rc1. Lots of cleanups in here, no real new functionality this time around, with the diffstat being that we removed more lines than we added! Included in here are: - termios unification cleanups from Al Viro, it's nice to finally get this work done - tty serial transmit cleanups in various drivers in preparation for more cleanup and unification in future releases (that work was not ready for this release) - n_gsm fixes and updates - ktermios cleanups and code reductions - dt bindings json conversions and updates for new devices - some serial driver updates for new devices - lots of other tiny cleanups and janitorial stuff. Full details in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'tty-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (102 commits) serial: cpm_uart: Don't request IRQ too early for console port tty: serial: do unlock on a common path in altera_jtaguart_console_putc() tty: serial: unify TX space reads under altera_jtaguart_tx_space() tty: serial: use FIELD_GET() in lqasc_tx_ready() tty: serial: extend lqasc_tx_ready() to lqasc_console_putchar() tty: serial: allow pxa.c to be COMPILE_TESTed serial: stm32: Fix unused-variable warning tty: serial: atmel: Add COMMON_CLK dependency to SERIAL_ATMEL serial: 8250: Fix restoring termios speed after suspend serial: Deassert Transmit Enable on probe in driver-specific way serial: 8250_dma: Convert to use uart_xmit_advance() serial: 8250_omap: Convert to use uart_xmit_advance() MAINTAINERS: Solve warning regarding inexistent atmel-usart binding serial: stm32: Deassert Transmit Enable on ->rs485_config() serial: ar933x: Deassert Transmit Enable on ->rs485_config() tty: serial: atmel: Use FIELD_PREP/FIELD_GET tty: serial: atmel: Make the driver aware of the existence of GCLK tty: serial: atmel: Only divide Clock Divisor if the IP is USART tty: serial: atmel: Separate mode clearing between UART and USART dt-bindings: serial: atmel,at91-usart: Add gclk as a possible USART clock ...
2022-10-07Merge tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linuxLinus Torvalds
Pull block updates from Jens Axboe: - NVMe pull requests via Christoph: - handle number of queue changes in the TCP and RDMA drivers (Daniel Wagner) - allow changing the number of queues in nvmet (Daniel Wagner) - also consider host_iface when checking ip options (Daniel Wagner) - don't map pages which can't come from HIGHMEM (Fabio M. De Francesco) - avoid unnecessary flush bios in nvmet (Guixin Liu) - shrink and better pack the nvme_iod structure (Keith Busch) - add comment for unaligned "fake" nqn (Linjun Bao) - print actual source IP address through sysfs "address" attr (Martin Belanger) - various cleanups (Jackie Liu, Wolfram Sang, Genjian Zhang) - handle effects after freeing the request (Keith Busch) - copy firmware_rev on each init (Keith Busch) - restrict management ioctls to admin (Keith Busch) - ensure subsystem reset is single threaded (Keith Busch) - report the actual number of tagset maps in nvme-pci (Keith Busch) - small fabrics authentication fixups (Christoph Hellwig) - add common code for tagset allocation and freeing (Christoph Hellwig) - stop using the request_queue in nvmet (Christoph Hellwig) - set min_align_mask before calculating max_hw_sectors (Rishabh Bhatnagar) - send a rediscover uevent when a persistent discovery controller reconnects (Sagi Grimberg) - misc nvmet-tcp fixes (Varun Prakash, zhenwei pi) - MD pull request via Song: - Various raid5 fix and clean up, by Logan Gunthorpe and David Sloan. - Raid10 performance optimization, by Yu Kuai. - sbitmap wakeup hang fixes (Hugh, Keith, Jan, Yu) - IO scheduler switching quisce fix (Keith) - s390/dasd block driver updates (Stefan) - support for recovery for the ublk driver (ZiyangZhang) - rnbd drivers fixes and updates (Guoqing, Santosh, ye, Christoph) - blk-mq and null_blk map fixes (Bart) - various bcache fixes (Coly, Jilin, Jules) - nbd signal hang fix (Shigeru) - block writeback throttling fix (Yu) - optimize the passthrough mapping handling (me) - prepare block cgroups to being gendisk based (Christoph) - get rid of an old PSI hack in the block layer, moving it to the callers instead where it belongs (Christoph) - blk-throttle fixes and cleanups (Yu) - misc fixes and cleanups (Liu Shixin, Liu Song, Miaohe, Pankaj, Ping-Xiang, Wolfram, Saurabh, Li Jinlin, Li Lei, Lin, Li zeming, Miaohe, Bart, Coly, Gaosheng * tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linux: (162 commits) sbitmap: fix lockup while swapping block: add rationale for not using blk_mq_plug() when applicable block: adapt blk_mq_plug() to not plug for writes that require a zone lock s390/dasd: use blk_mq_alloc_disk blk-cgroup: don't update the blkg lookup hint in blkg_conf_prep nvmet: don't look at the request_queue in nvmet_bdev_set_limits nvmet: don't look at the request_queue in nvmet_bdev_zone_mgmt_emulate_all blk-mq: use quiesced elevator switch when reinitializing queues block: replace blk_queue_nowait with bdev_nowait nvme: remove nvme_ctrl_init_connect_q nvme-loop: use the tagset alloc/free helpers nvme-loop: store the generic nvme_ctrl in set->driver_data nvme-loop: initialize sqsize later nvme-fc: use the tagset alloc/free helpers nvme-fc: store the generic nvme_ctrl in set->driver_data nvme-fc: keep ctrl->sqsize in sync with opts->queue_size nvme-rdma: use the tagset alloc/free helpers nvme-rdma: store the generic nvme_ctrl in set->driver_data nvme-tcp: use the tagset alloc/free helpers nvme-tcp: store the generic nvme_ctrl in set->driver_data ...
2022-09-28s390/pci: remove unused bus_next field from struct zpci_devNiklas Schnelle
This field was added in commit 44510d6fa0c0 ("s390/pci: Handling multifunctions") but is an unused remnant of an earlier version where the devices on the virtual bus were connected in a linked list instead of a fixed 256 entry array of pointers. It is also not used for the list of busses as that is threaded through struct zpci_bus not through struct zpci_dev. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-28s390/cio: remove unused ccw_device_force_console() declarationGaosheng Cui
ccw_device_force_console() has been removed by commit 8cc0dcfdc1c0 ("s390/cio: remove pm support from ccw bus driver"), so remove the declaration, too. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Acked-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-25Merge 7e2cd21e02b3 ("Merge tag 'tty-6.0-rc7' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty") into tty-next We need the tty fixes and api additions in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-21s390/dasd: suppress generic error messages for PPRC secondary devicesStefan Haberland
Suppress generic command reject messages and dump of sense data for Peer-To-Peer-Remote-Copy (PPRC) secondary errors. If IO is issued on a PPRC secondary device, a specific error message is printed instead. Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20220920192616.808070-7-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21s390/dasd: add ioctl to perform a swap of the drivers copy pairStefan Haberland
The newly defined ioctl BIODASDCOPYPAIRSWAP takes a structure that specifies a copy pair that should be swapped. It will call the device discipline function to perform the swap operation. The structure looks as followed: struct dasd_copypair_swap_data_t { char primary[20]; char secondary[20]; __u8 reserved[64]; }; where primary is the old primary device that will be replaced by the secondary device. The old primary will become a secondary device afterwards. Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20220920192616.808070-6-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-16s390/pai: Add support for PAI Extension 1 NNPA countersThomas Richter
PMU device driver perf_paiext supports Processor Activity Instrumentation Extension (PAIE1), available with IBM z16: - maps a 512 byte block to lowcore address 0x1508 called PAIE1 control block. - maps a 1024 byte block at PAIE1 control block entry with index 2. - uses control register bit 14 to enable PAIE1 control block lookup. - turn PAIE1 nnpa counting on and off by setting bit 63 in PAIE1 control block entry with index 2. - creates a sample with raw data on each context switch out when at context switch some mapped counters have a value of nonzero. This device driver only supports CPU wide context, no task context is allowed. Support for counting: - one or more counters can be specified using perf stat -e pai_ext/xxx/ where xxx stands for the counter event name. Multiple invocation of this command is possible. The counter names are listed in /sys/devices/pai_ext/events directory. - one special counters can be specified using perf stat -e pai_ext/NNPA_ALL/ which returns the sum of all incremented nnpa counters. - multiple counting events can run in parallel. Support for Sampling: - one event pai_ext/NNPA_ALL/ is reserved for sampling. The event collects data at context switch out and saves them in the ring buffer. - no multiple invocations are possible. The PAIE1 nnpa counter events are system wide. No task context is supported. Therefore some restrictions documented in function paiext_busy() apply. Extend qpaci assembly instruction to query supported memory mapped nnpa counters. It returns the number of counters (no holes allowed in that range). PAIE1 nnpa counter events can not be created when a CPU hot plug add is processed. This means a CPU hot plug add does not get the necessary PAIE1 event to record PAIE1 nnpa counter increments on the newly added CPU. CPU hot plug remove removes the event and terminates the counting of PAIE1 counters immediately. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/mm: uninline copy_oldmem_kernel() functionAlexander Gordeev
Uninline copy_oldmem_kernel() function and make it consistent with a very similar memcpy_real() implementation, by moving to code to crash_dump.c, where it actually belongs. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/mm: rework memcpy_real() to avoid DAT-off modeAlexander Gordeev
Function memcpy_real() is an univeral data mover that does not require DAT mode to be able reading from a physical address. Its advantage is an ability to read from any address, even those for which no kernel virtual mapping exists. Although memcpy_real() is interrupt-safe, there are no handlers that make use of this function. The compiler instrumentation have to be disabled and separate no-DAT stack used to allow execution of the function once DAT mode is disabled. Rework memcpy_real() to overcome these shortcomings. As result, data copying (which is primarily reading out a crashed system memory by a user process) is executed on a regular stack with enabled interrupts. Also, use of memcpy_real_buf swap buffer becomes unnecessary and the swapping is eliminated. The above is achieved by using a fixed virtual address range that spans a single page and remaps that page repeatedly when memcpy_real() is called for a particular physical address. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/dump: save IPL CPU registers once DAT is availableAlexander Gordeev
Function smp_save_dump_cpus() collects CPU state of a crashed system for secondary CPUs and for the IPL CPU very differently. The Signal Processor stop-and-store-status orders are used for the former while Hardware System Area requests and memcpy_real() routine are called for the latter. In addition a system reset is triggered, which pins smp_save_dump_cpus() function call before CPU and device initialization. Move the collection of IPL CPU state to a later stage when DAT becomes available. That is needed to allow a follow-up rework of memcpy_real() routine. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/smp: rework absolute lowcore accessAlexander Gordeev
Temporary unsetting of the prefix page in memcpy_absolute() routine poses a risk of executing code path with unexpectedly disabled prefix page. This rework avoids the prefix page uninstalling and disabling of normal and machine check interrupts when accessing the absolute zero memory. Although memcpy_absolute() routine can access the whole memory, it is only used to update the absolute zero lowcore. This rework therefore introduces a new mechanism for the absolute zero lowcore access and scraps memcpy_absolute() routine for good. Instead, an area is reserved in the virtual memory that is used for the absolute lowcore access only. That area holds an array of 8KB virtual mappings - one per CPU. Whenever a CPU is brought online, the corresponding item is mapped to the real address of the previously installed prefix page. The absolute zero lowcore access works like this: a CPU calls the new primitive get_abs_lowcore() to obtain its 8KB mapping as a pointer to the struct lowcore. Virtual address references to that pointer get translated to the real addresses of the prefix page, which in turn gets swapped with the absolute zero memory addresses due to prefixing. Once the pointer is not needed it must be released with put_abs_lowcore() primitive: struct lowcore *abs_lc; unsigned long flags; abs_lc = get_abs_lowcore(&flags); abs_lc->... = ...; put_abs_lowcore(abs_lc, flags); To ensure the described mechanism works large segment- and region- table entries must be avoided for the 8KB mappings. Failure to do so results in usage of Region-Frame Absolute Address (RFAA) or Segment-Frame Absolute Address (SFAA) large page fields. In that case absolute addresses would be used to address the prefix page instead of the real ones and the prefixing would get bypassed. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/smp: call smp_reinit_ipl_cpu() before scheduler is availableAlexander Gordeev
Currently smp_reinit_ipl_cpu() is a pre-SMP early initcall. That ensures no CPU is running in parallel, but still not enough to assume the code is exclusive, since the scheduling is already available. Move the function call to arch_call_rest_init() callback to ensure no thread could be preempted and allow lockless allocation of the kernel page tables. That is needed to allow a follow-up rework of the absolute lowcore access mechanism. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-11kernel: exit: cleanup release_thread()Kefeng Wang
Only x86 has own release_thread(), introduce a new weak release_thread() function to clean empty definitions in other ARCHs. Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Brian Cain <bcain@quicinc.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Huacai Chen <chenhuacai@kernel.org> [LoongArch] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> [csky] Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-09Merge tag 'asm-generic-fixes-6.0-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull SOFTIRQ_ON_OWN_STACK rework from Arnd Bergmann: "Just one fixup patch, reworking the softirq_on_own_stack logic for preempt-rt kernels as discussed in https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/" * tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
2022-09-09termios: kill uapi termios.h that are identical to generic oneAl Viro
mandatory-y will have the generic picked for architectures that don't have uapi/asm/termios.h of their own. ia64, parisc and s390 ones are identical to generic, so... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/YxGVXpS2dWoTwoa0@ZenIV Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-09termios: get rid of non-UAPI asm/termios.hAl Viro
All non-UAPI asm/termios.h consist of include of UAPI counterpart and, possibly, include of linux/uaccess.h The latter can't be simply removed, even though nothing in linux/termios.h doesn't depend upon it anymore - there are several places that rely upon that indirect chain of includes to pull linux/uaccess.h. So the include needs to be lifted out of there - we lift into tty_driver.h, serdev.h and places that pull asm/termios.h, but none of * linux/uaccess.h (obvious) * net/sock.h (pulls uaccess.h) * linux/{tty,tty_driver,serdev}.h (tty.h pulls tty_driver.h) That leaves us just with the include of UAPI asm/termios.h, which is what <asm/termios.h> will resolve to if we simply remove non-UAPI header. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/YxDnKvYCHn/ogBUv@ZenIV Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-09termios: start unifying non-UAPI parts of asm/termios.hAl Viro
* new header (linut/termios_internal.h), pulled by the users of those suckers * defaults for INIT_C_CC and externs for conversion helpers moved over there * remove termios-base.h (empty now) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/YxDmptU7dNGZ+/Hn@ZenIV Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-05asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.Sebastian Andrzej Siewior
Remove the CONFIG_PREEMPT_RT symbol from the ifdef around do_softirq_own_stack() and move it to Kconfig instead. Enable softirq stacks based on SOFTIRQ_ON_OWN_STACK which depends on HAVE_SOFTIRQ_ON_OWN_STACK and its default value is set to !PREEMPT_RT. This ensures that softirq stacks are not used on PREEMPT_RT and avoids a 'select' statement on an option which has a 'depends' statement. Link: https://lore.kernel.org/YvN5E%2FPrHfUhggr7@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-09-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "s390: - PCI interpretation compile fixes RISC-V: - fix unused variable warnings in vcpu_timer.c - move extern sbi_ext declarations to a header x86: - check validity of argument to KVM_SET_MP_STATE - use guest's global_ctrl to completely disable guest PEBS - fix a memory leak on memory allocation failure - mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES - fix build failure with Clang integrated assembler - fix MSR interception - always flush TLBs when enabling dirty logging" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: check validity of argument to KVM_SET_MP_STATE perf/x86/core: Completely disable guest PEBS via guest's global_ctrl KVM: x86: fix memoryleak in kvm_arch_vcpu_create() KVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES KVM: s390: pci: Hook to access KVM lowlevel from VFIO riscv: kvm: move extern sbi_ext declarations to a header riscv: kvm: vcpu_timer: fix unused variable warnings KVM: selftests: Fix ambiguous mov in KVM_ASM_SAFE() KVM: selftests: Fix KVM_EXCEPTION_MAGIC build with Clang KVM: VMX: Heed the 'msr' argument in msr_write_intercepted() kvm: x86: mmu: Always flush TLBs when enabling dirty logging kvm: x86: mmu: Drop the need_remote_flush() function
2022-08-30s390/hugetlb: fix prepare_hugepage_range() check for 2 GB hugepagesGerald Schaefer
The alignment check in prepare_hugepage_range() is wrong for 2 GB hugepages, it only checks for 1 MB hugepage alignment. This can result in kernel crash in __unmap_hugepage_range() at the BUG_ON(start & ~huge_page_mask(h)) alignment check, for mappings created with MAP_FIXED at unaligned address. Fix this by correctly handling multiple hugepage sizes, similar to the generic version of prepare_hugepage_range(). Fixes: d08de8e2d867 ("s390/mm: add support for 2GB hugepages") Cc: <stable@vger.kernel.org> # 4.8+ Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-29KVM: s390: pci: Hook to access KVM lowlevel from VFIOPierre Morel
We have a cross dependency between KVM and VFIO when using s390 vfio_pci_zdev extensions for PCI passthrough To be able to keep both subsystem modular we add a registering hook inside the S390 core code. This fixes a build problem when VFIO is built-in and KVM is built as a module. Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Fixes: 09340b2fca007 ("KVM: s390: pci: add routines to start/stop interpretive execution") Cc: <stable@vger.kernel.org> Acked-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20220819122945.9309-1-pmorel@linux.ibm.com Message-Id: <20220819122945.9309-1-pmorel@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-08-27provide arch_test_bit_acquire for architectures that define test_bitMikulas Patocka
Some architectures define their own arch_test_bit and they also need arch_test_bit_acquire, otherwise they won't compile. We also clean up the code by using the generic test_bit if that is equivalent to the arch-specific version. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-07Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo) - optimize out non-atomic bitops on compile-time constants (Alexander Lobakin) - cleanup bitmap-related headers (Yury Norov) - x86/olpc: fix 'logical not is only applied to the left hand side' (Alexander Lobakin) - lib/nodemask: inline wrappers around bitmap (Yury Norov) * tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits) lib/nodemask: inline next_node_in() and node_random() powerpc: drop dependency on <asm/machdep.h> in archrandom.h x86/olpc: fix 'logical not is only applied to the left hand side' lib/cpumask: move some one-line wrappers to header file headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h> headers/deps: mm: Optimize <linux/gfp.h> header dependencies lib/cpumask: move trivial wrappers around find_bit to the header lib/cpumask: change return types to unsigned where appropriate cpumask: change return types to bool where appropriate lib/bitmap: change type of bitmap_weight to unsigned long lib/bitmap: change return types to bool where appropriate arm: align find_bit declarations with generic kernel iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) lib/test_bitmap: test the tail after bitmap_to_arr64() lib/bitmap: fix off-by-one in bitmap_to_arr64() lib: test_bitmap: add compile-time optimization/evaluations assertions bitmap: don't assume compiler evaluates small mem*() builtins calls net/ice: fix initializing the bitmap in the switch code bitops: let optimize out non-atomic bitops on compile-time constants ...
2022-08-06Merge tag 's390-5.20-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Rework copy_oldmem_page() callback to take an iov_iter. This includes a few prerequisite updates and fixes to the oldmem reading code. - Rework cpufeature implementation to allow for various CPU feature indications, which is not only limited to hardware capabilities, but also allows CPU facilities. - Use the cpufeature rework to autoload Ultravisor module when CPU facility 158 is available. - Add ELF note type for encrypted CPU state of a protected virtual CPU. The zgetdump tool from s390-tools package will decrypt the CPU state using a Customer Communication Key and overwrite respective notes to make the data accessible for crash and other debugging tools. - Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto test. - Fix incorrect recovery of kretprobe modified return address in stacktrace. - Switch the NMI handler to use generic irqentry_nmi_enter() and irqentry_nmi_exit() helper functions. - Rework the cryptographic Adjunct Processors (AP) pass-through design to support dynamic changes to the AP matrix of a running guest as well as to implement more of the AP architecture. - Minor boot code cleanups. - Grammar and typo fixes to hmcdrv and tape drivers. * tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits) Revert "s390/smp: enforce lowcore protection on CPU restart" Revert "s390/smp: rework absolute lowcore access" Revert "s390/smp,ptdump: add absolute lowcore markers" s390/unwind: fix fgraph return address recovery s390/nmi: use irqentry_nmi_enter()/irqentry_nmi_exit() s390: add ELF note type for encrypted CPU state of a PV VCPU s390/smp,ptdump: add absolute lowcore markers s390/smp: rework absolute lowcore access s390/setup: rearrange absolute lowcore initialization s390/boot: cleanup adjust_to_uv_max() function s390/smp: enforce lowcore protection on CPU restart s390/tape: fix comment typo s390/hmcdrv: fix Kconfig "its" grammar s390/docs: fix warnings for vfio_ap driver doc s390/docs: fix warnings for vfio_ap driver lock usage doc s390/crash: support multi-segment iterators s390/crash: use static swap buffer for copy_to_user_real() s390/crash: move copy_to_user_real() to crash_dump.c s390/zcore: fix race when reading from hardware system area s390/crash: fix incorrect number of bytes to copy to user space ...
2022-08-06Merge tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Cleanup use of extern in function prototypes (Alex Williamson) - Simplify bus_type usage and convert to device IOMMU interfaces (Robin Murphy) - Check missed return value and fix comment typos (Bo Liu) - Split migration ops from device ops and fix races in mlx5 migration support (Yishai Hadas) - Fix missed return value check in noiommu support (Liam Ni) - Hardening to clear buffer pointer to avoid use-after-free (Schspa Shi) - Remove requirement that only the same mm can unmap a previously mapped range (Li Zhe) - Adjust semaphore release vs device open counter (Yi Liu) - Remove unused arg from SPAPR support code (Deming Wang) - Rework vfio-ccw driver to better fit new mdev framework (Eric Farman, Michael Kawano) - Replace DMA unmap notifier with callbacks (Jason Gunthorpe) - Clarify SPAPR support comment relative to iommu_ops (Alexey Kardashevskiy) - Revise page pinning API towards compatibility with future iommufd support (Nicolin Chen) - Resolve issues in vfio-ccw, including use of DMA unmap callback (Eric Farman) * tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfio: (40 commits) vfio/pci: fix the wrong word vfio/ccw: Check return code from subchannel quiesce vfio/ccw: Remove FSM Close from remove handlers vfio/ccw: Add length to DMA_UNMAP checks vfio: Replace phys_pfn with pages for vfio_pin_pages() vfio/ccw: Add kmap_local_page() for memcpy vfio: Rename user_iova of vfio_dma_rw() vfio/ccw: Change pa_pfn list to pa_iova list vfio/ap: Change saved_pfn to saved_iova vfio: Pass in starting IOVA to vfio_pin/unpin_pages API vfio/ccw: Only pass in contiguous pages vfio/ap: Pass in physical address of ind to ap_aqic() drm/i915/gvt: Replace roundup with DIV_ROUND_UP vfio: Make vfio_unpin_pages() return void vfio/spapr_tce: Fix the comment vfio: Replace the iommu notifier with a device list vfio: Replace the DMA unmapping notifier with a callback vfio/ccw: Move FSM open/close to MDEV open/close vfio/ccw: Refactor vfio_ccw_mdev_reset vfio/ccw: Create a CLOSE FSM event ...
2022-08-06Revert "s390/smp: rework absolute lowcore access"Alexander Gordeev
This reverts commit 7d06fed77b7d8fc9f6cc41b4e3f2823d32532ad8. This introduced vmem_mutex locking from vmem_map_4k_page() function called from smp_reinit_ipl_cpu() with interrupts disabled. While it is a pre-SMP early initcall no other CPUs running in parallel nor other code taking vmem_mutex on this boot stage - it still needs to be fixed. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-05Merge tag 'mm-stable-2022-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Most of the MM queue. A few things are still pending. Liam's maple tree rework didn't make it. This has resulted in a few other minor patch series being held over for next time. Multi-gen LRU still isn't merged as we were waiting for mapletree to stabilize. The current plan is to merge MGLRU into -mm soon and to later reintroduce mapletree, with a view to hopefully getting both into 6.1-rc1. Summary: - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe Lin, Yang Shi, Anshuman Khandual and Mike Rapoport - Some kmemleak fixes from Patrick Wang and Waiman Long - DAMON updates from SeongJae Park - memcg debug/visibility work from Roman Gushchin - vmalloc speedup from Uladzislau Rezki - more folio conversion work from Matthew Wilcox - enhancements for coherent device memory mapping from Alex Sierra - addition of shared pages tracking and CoW support for fsdax, from Shiyang Ruan - hugetlb optimizations from Mike Kravetz - Mel Gorman has contributed some pagealloc changes to improve latency and realtime behaviour. - mprotect soft-dirty checking has been improved by Peter Xu - Many other singleton patches all over the place" [ XFS merge from hell as per Darrick Wong in https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ] * tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits) tools/testing/selftests/vm/hmm-tests.c: fix build mm: Kconfig: fix typo mm: memory-failure: convert to pr_fmt() mm: use is_zone_movable_page() helper hugetlbfs: fix inaccurate comment in hugetlbfs_statfs() hugetlbfs: cleanup some comments in inode.c hugetlbfs: remove unneeded header file hugetlbfs: remove unneeded hugetlbfs_ops forward declaration hugetlbfs: use helper macro SZ_1{K,M} mm: cleanup is_highmem() mm/hmm: add a test for cross device private faults selftests: add soft-dirty into run_vmtests.sh selftests: soft-dirty: add test for mprotect mm/mprotect: fix soft-dirty check in can_change_pte_writable() mm: memcontrol: fix potential oom_lock recursion deadlock mm/gup.c: fix formatting in check_and_migrate_movable_page() xfs: fail dax mount if reflink is enabled on a partition mm/memcontrol.c: remove the redundant updating of stats_flush_threshold userfaultfd: don't fail on unrecognized features hugetlb_cgroup: fix wrong hugetlb cgroup numa stat ...
2022-08-05Merge tag 'asm-generic-6.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three independent sets of changes: - Sai Prakash Ranjan adds tracing support to the asm-generic version of the MMIO accessors, which is intended to help understand problems with device drivers and has been part of Qualcomm's vendor kernels for many years - A patch from Sebastian Siewior to rework the handling of IRQ stacks in softirqs across architectures, which is needed for enabling PREEMPT_RT - The last patch to remove the CONFIG_VIRT_TO_BUS option and some of the code behind that, after the last users of this old interface made it in through the netdev, scsi, media and staging trees" * tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: uapi: asm-generic: fcntl: Fix typo 'the the' in comment arch/*/: remove CONFIG_VIRT_TO_BUS soc: qcom: geni: Disable MMIO tracing for GENI SE serial: qcom_geni_serial: Disable MMIO tracing for geni serial asm-generic/io: Add logging support for MMIO accessors KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM lib: Add register read/write tracing support drm/meson: Fix overflow implicit truncation warnings irqchip/tegra: Fix overflow implicit truncation warnings coresight: etm4x: Use asm-generic IO memory barriers arm64: io: Use asm-generic high level MMIO accessors arch/*: Disable softirq stacks on PREEMPT_RT.
2022-08-04Merge tag 'pci-v5.20-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Consolidate duplicated 'next function' scanning and extend to allow 'isolated functions' on s390, similar to existing hypervisors (Niklas Schnelle) Resource management: - Implement pci_iobar_pfn() for sparc, which allows us to remove the sparc-specific pci_mmap_page_range() and pci_mmap_resource_range(). This removes the ability to map the entire PCI I/O space using /proc/bus/pci, but we believe that's already been broken since v2.6.28 (Arnd Bergmann) - Move common PCI definitions to asm-generic/pci.h and rework others to be be more specific and more encapsulated in arches that need them (Stafford Horne) Power management: - Convert drivers to new *_PM_OPS macros to avoid need for '#ifdef CONFIG_PM_SLEEP' or '__maybe_unused' (Bjorn Helgaas) Virtualization: - Add ACS quirk for Broadcom BCM5750x multifunction NICs that isolate the functions but don't advertise an ACS capability (Pavan Chebbi) Error handling: - Clear PCI Status register during enumeration in case firmware left errors logged (Kai-Heng Feng) - When we have native control of AER, enable error reporting for all devices that support AER. Previously only a few drivers enabled this (Stefan Roese) - Keep AER error reporting enabled for switches. Previously we enabled this during enumeration but immediately disabled it (Stefan Roese) - Iterate over error counters instead of error strings to avoid printing junk in AER sysfs counters (Mohamed Khalfella) ASPM: - Remove pcie_aspm_pm_state_change() so ASPM config changes, e.g., via sysfs, are not lost across power state changes (Kai-Heng Feng) Endpoint framework: - Don't stop an EPC when unbinding an EPF from it (Shunsuke Mie) Endpoint embedded DMA controller driver: - Simplify and clean up support for the DesignWare embedded DMA (eDMA) controller (Frank Li, Serge Semin) Broadcom STB PCIe controller driver: - Avoid config space accesses when link is down because we can't recover from the CPU aborts these cause (Jim Quinlan) - Look for power regulators described under Root Ports in DT and enable them before scanning the secondary bus (Jim Quinlan) - Disable/enable regulators in suspend/resume (Jim Quinlan) Freescale i.MX6 PCIe controller driver: - Simplify and clean up clock and PHY management (Richard Zhu) - Disable/enable regulators in suspend/resume (Richard Zhu) - Set PCIE_DBI_RO_WR_EN before writing DBI registers (Richard Zhu) - Allow speeds faster than Gen2 (Richard Zhu) - Make link being down a non-fatal error so controller probe doesn't fail if there are no Endpoints connected (Richard Zhu) Loongson PCIe controller driver: - Add ACPI and MCFG support for Loongson LS7A (Huacai Chen) - Avoid config reads to non-existent LS2K/LS7A devices because a hardware defect causes machine hangs (Huacai Chen) - Work around LS7A integrated devices that report incorrect Interrupt Pin values (Jianmin Lv) Marvell Aardvark PCIe controller driver: - Add support for AER and Slot capability on emulated bridge (Pali Rohár) MediaTek PCIe controller driver: - Add Airoha EN7532 to DT binding (John Crispin) - Allow building of driver for ARCH_AIROHA (Felix Fietkau) MediaTek PCIe Gen3 controller driver: - Print decoded LTSSM state when the link doesn't come up (Jianjun Wang) NVIDIA Tegra194 PCIe controller driver: - Convert DT binding to json-schema (Vidya Sagar) - Add DT bindings and driver support for Tegra234 Root Port and Endpoint mode (Vidya Sagar) - Fix some Root Port interrupt handling issues (Vidya Sagar) - Set default Max Payload Size to 256 bytes (Vidya Sagar) - Fix Data Link Feature capability programming (Vidya Sagar) - Extend Endpoint mode support to devices beyond Controller-5 (Vidya Sagar) Qualcomm PCIe controller driver: - Rework clock, reset, PHY power-on ordering to avoid hangs and improve consistency (Robert Marko, Christian Marangi) - Move pipe_clk handling to PHY drivers (Dmitry Baryshkov) - Add IPQ60xx support (Selvam Sathappan Periakaruppan) - Allow ASPM L1 and substates for 2.7.0 (Krishna chaitanya chundru) - Add support for more than 32 MSI interrupts (Dmitry Baryshkov) Renesas R-Car PCIe controller driver: - Convert DT binding to json-schema (Herve Codina) - Add Renesas RZ/N1D (R9A06G032) to rcar-gen2 DT binding and driver (Herve Codina) Samsung Exynos PCIe controller driver: - Fix phy-exynos-pcie driver so it follows the 'phy_init() before phy_power_on()' PHY programming model (Marek Szyprowski) Synopsys DesignWare PCIe controller driver: - Simplify and clean up the DWC core extensively (Serge Semin) - Fix an issue with programming the ATU for regions that cross a 4GB boundary (Serge Semin) - Enable the CDM check if 'snps,enable-cdm-check' exists; previously we skipped it if 'num-lanes' was absent (Serge Semin) - Allocate a 32-bit DMA-able page to be MSI target instead of using a driver data structure that may not be addressable with 32-bit address (Will McVicker) - Add DWC core support for more than 32 MSI interrupts (Dmitry Baryshkov) Xilinx Versal CPM PCIe controller driver: - Add DT binding and driver support for Versal CPM5 Gen5 Root Port (Bharat Kumar Gogada)" * tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (150 commits) PCI: imx6: Support more than Gen2 speed link mode PCI: imx6: Set PCIE_DBI_RO_WR_EN before writing DBI registers PCI: imx6: Reformat suspend callback to keep symmetric with resume PCI: imx6: Move the imx6_pcie_ltssm_disable() earlier PCI: imx6: Disable clocks in reverse order of enable PCI: imx6: Do not hide PHY driver callbacks and refine the error handling PCI: imx6: Reduce resume time by only starting link if it was up before suspend PCI: imx6: Mark the link down as non-fatal error PCI: imx6: Move regulator enable out of imx6_pcie_deassert_core_reset() PCI: imx6: Turn off regulator when system is in suspend mode PCI: imx6: Call host init function directly in resume PCI: imx6: Disable i.MX6QDL clock when disabling ref clocks PCI: imx6: Propagate .host_init() errors to caller PCI: imx6: Collect clock enables in imx6_pcie_clk_enable() PCI: imx6: Factor out ref clock disable to match enable PCI: imx6: Move imx6_pcie_clk_disable() earlier PCI: imx6: Move imx6_pcie_enable_ref_clk() earlier PCI: imx6: Move PHY management functions together PCI: imx6: Move imx6_pcie_grp_offset(), imx6_pcie_configure_type() earlier PCI: imx6: Convert to NOIRQ_SYSTEM_SLEEP_PM_OPS() ...
2022-08-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "Quite a large pull request due to a selftest API overhaul and some patches that had come in too late for 5.19. ARM: - Unwinder implementations for both nVHE modes (classic and protected), complete with an overflow stack - Rework of the sysreg access from userspace, with a complete rewrite of the vgic-v3 view to allign with the rest of the infrastructure - Disagregation of the vcpu flags in separate sets to better track their use model. - A fix for the GICv2-on-v3 selftest - A small set of cosmetic fixes RISC-V: - Track ISA extensions used by Guest using bitmap - Added system instruction emulation framework - Added CSR emulation framework - Added gfp_custom flag in struct kvm_mmu_memory_cache - Added G-stage ioremap() and iounmap() functions - Added support for Svpbmt inside Guest s390: - add an interface to provide a hypervisor dump for secure guests - improve selftests to use TAP interface - enable interpretive execution of zPCI instructions (for PCI passthrough) - First part of deferred teardown - CPU Topology - PV attestation - Minor fixes x86: - Permit guests to ignore single-bit ECC errors - Intel IPI virtualization - Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS - PEBS virtualization - Simplify PMU emulation by just using PERF_TYPE_RAW events - More accurate event reinjection on SVM (avoid retrying instructions) - Allow getting/setting the state of the speaker port data bit - Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent - "Notify" VM exit (detect microarchitectural hangs) for Intel - Use try_cmpxchg64 instead of cmpxchg64 - Ignore benign host accesses to PMU MSRs when PMU is disabled - Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior - Allow NX huge page mitigation to be disabled on a per-vm basis - Port eager page splitting to shadow MMU as well - Enable CMCI capability by default and handle injected UCNA errors - Expose pid of vcpu threads in debugfs - x2AVIC support for AMD - cleanup PIO emulation - Fixes for LLDT/LTR emulation - Don't require refcounted "struct page" to create huge SPTEs - Miscellaneous cleanups: - MCE MSR emulation - Use separate namespaces for guest PTEs and shadow PTEs bitmasks - PIO emulation - Reorganize rmap API, mostly around rmap destruction - Do not workaround very old KVM bugs for L0 that runs with nesting enabled - new selftests API for CPUID Generic: - Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache - new selftests API using struct kvm_vcpu instead of a (vm, id) tuple" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits) selftests: kvm: set rax before vmcall selftests: KVM: Add exponent check for boolean stats selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test selftests: KVM: Check stat name before other fields KVM: x86/mmu: remove unused variable RISC-V: KVM: Add support for Svpbmt inside Guest/VM RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap() RISC-V: KVM: Add G-stage ioremap() and iounmap() functions KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache RISC-V: KVM: Add extensible CSR emulation framework RISC-V: KVM: Add extensible system instruction emulation framework RISC-V: KVM: Factor-out instruction emulation into separate sources RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function RISC-V: KVM: Fix variable spelling mistake RISC-V: KVM: Improve ISA extension by using a bitmap KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs() KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT KVM: x86: Do not block APIC write for non ICR registers ...
2022-08-03Merge tag 'pull-work.iov_iter-base' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs iov_iter updates from Al Viro: "Part 1 - isolated cleanups and optimizations. One of the goals is to reduce the overhead of using ->read_iter() and ->write_iter() instead of ->read()/->write(). new_sync_{read,write}() has a surprising amount of overhead, in particular inside iocb_flags(). That's the explanation for the beginning of the series is in this pile; it's not directly iov_iter-related, but it's a part of the same work..." * tag 'pull-work.iov_iter-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: first_iovec_segment(): just return address iov_iter: massage calling conventions for first_{iovec,bvec}_segment() iov_iter: first_{iovec,bvec}_segment() - simplify a bit iov_iter: lift dealing with maxpages out of first_{iovec,bvec}_segment() iov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT iov_iter_bvec_advance(): don't bother with bvec_iter copy_page_{to,from}_iter(): switch iovec variants to generic keep iocb_flags() result cached in struct file iocb: delay evaluation of IS_SYNC(...) until we want to check IOCB_DSYNC struct file: use anonymous union member for rcuhead and llist btrfs: use IOMAP_DIO_NOSYNC teach iomap_dio_rw() to suppress dsync No need of likely/unlikely on calls of check_copy_size()
2022-08-02Merge tag 'flexible-array-transformations-UAPI-6.0-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull uapi flexible array update from Gustavo Silva: "A treewide patch that replaces zero-length arrays with flexible-array members in UAPI. This has been baking in linux-next for 5 weeks now. '-fstrict-flex-arrays=3' is coming and we need to land these changes to prevent issues like these in the short future: fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source] strcpy(de3->name, "."); ^ Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If this breaks anything, we can use a union with a new member name" Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 * tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: treewide: uapi: Replace zero-length arrays with flexible-array members
2022-08-02Merge tag 'random-6.0-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "Though there's been a decent amount of RNG-related development during this last cycle, not all of it is coming through this tree, as this cycle saw a shift toward tackling early boot time seeding issues, which took place in other trees as well. Here's a summary of the various patches: - The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot option have been removed, as they overlapped with the more widely supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu". This change allowed simplifying a bit of arch code. - x86's RDRAND boot time test has been made a bit more robust, with RDRAND disabled if it's clearly producing bogus results. This would be a tip.git commit, technically, but I took it through random.git to avoid a large merge conflict. - The RNG has long since mixed in a timestamp very early in boot, on the premise that a computer that does the same things, but does so starting at different points in wall time, could be made to still produce a different RNG state. Unfortunately, the clock isn't set early in boot on all systems, so now we mix in that timestamp when the time is actually set. - User Mode Linux now uses the host OS's getrandom() syscall to generate a bootloader RNG seed and later on treats getrandom() as the platform's RDRAND-like faculty. - The arch_get_random_{seed_,}_long() family of functions is now arch_get_random_{seed_,}_longs(), which enables certain platforms, such as s390, to exploit considerable performance advantages from requesting multiple CPU random numbers at once, while at the same time compiling down to the same code as before on platforms like x86. - A small cleanup changing a cmpxchg() into a try_cmpxchg(), from Uros. - A comment spelling fix" More info about other random number changes that come in through various architecture trees in the full commentary in the pull request: https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/ * tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: correct spelling of "overwrites" random: handle archrandom with multiple longs um: seed rng using host OS rng random: use try_cmpxchg in _credit_init_bits timekeeping: contribute wall clock to rng on time change x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu" random: remove CONFIG_ARCH_RANDOM
2022-08-02Merge tag 'integrity-v6.0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Aside from the one EVM cleanup patch, all the other changes are kexec related. On different architectures different keyrings are used to verify the kexec'ed kernel image signature. Here are a number of preparatory cleanup patches and the patches themselves for making the keyrings - builtin_trusted_keyring, .machine, .secondary_trusted_keyring, and .platform - consistent across the different architectures" * tag 'integrity-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: kexec, KEYS, s390: Make use of built-in and secondary keyring for signature verification arm64: kexec_file: use more system keyrings to verify kernel image signature kexec, KEYS: make the code in bzImage64_verify_sig generic kexec: clean up arch_kexec_kernel_verify_sig kexec: drop weak attribute from functions kexec_file: drop weak attribute from functions evm: Use IS_ENABLED to initialize .enabled
2022-08-01Merge tag 'locking-core-2022-08-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "This was a fairly quiet cycle for the locking subsystem: - lockdep: Fix a handful of the more complex lockdep_init_map_*() primitives that can lose the lock_type & cause false reports. No such mishap was observed in the wild. - jump_label improvements: simplify the cross-arch support of initial NOP patching by making it arch-specific code (used on MIPS only), and remove the s390 initial NOP patching that was superfluous" * tag 'locking-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/lockdep: Fix lockdep_init_map_*() confusion jump_label: make initial NOP patching the special case jump_label: mips: move module NOP patching into arch code jump_label: s390: avoid pointless initial NOP patching
2022-08-01Merge remote-tracking branch 'kvm/next' into kvm-next-5.20Paolo Bonzini
KVM/s390, KVM/x86 and common infrastructure changes for 5.20 x86: * Permit guests to ignore single-bit ECC errors * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache * Intel IPI virtualization * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS * PEBS virtualization * Simplify PMU emulation by just using PERF_TYPE_RAW events * More accurate event reinjection on SVM (avoid retrying instructions) * Allow getting/setting the state of the speaker port data bit * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent * "Notify" VM exit (detect microarchitectural hangs) for Intel * Cleanups for MCE MSR emulation s390: * add an interface to provide a hypervisor dump for secure guests * improve selftests to use TAP interface * enable interpretive execution of zPCI instructions (for PCI passthrough) * First part of deferred teardown * CPU Topology * PV attestation * Minor fixes Generic: * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple x86: * Use try_cmpxchg64 instead of cmpxchg64 * Bugfixes * Ignore benign host accesses to PMU MSRs when PMU is disabled * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior * x86/MMU: Allow NX huge pages to be disabled on a per-vm basis * Port eager page splitting to shadow MMU as well * Enable CMCI capability by default and handle injected UCNA errors * Expose pid of vcpu threads in debugfs * x2AVIC support for AMD * cleanup PIO emulation * Fixes for LLDT/LTR emulation * Don't require refcounted "struct page" to create huge SPTEs x86 cleanups: * Use separate namespaces for guest PTEs and shadow PTEs bitmasks * PIO emulation * Reorganize rmap API, mostly around rmap destruction * Do not workaround very old KVM bugs for L0 that runs with nesting enabled * new selftests API for CPUID