aboutsummaryrefslogtreecommitdiff
path: root/arch/riscv
AgeCommit message (Collapse)Author
2023-04-20riscv: add icache flush for nommu sigreturn trampolineMathis Salmen
commit 8d736482749f6d350892ef83a7a11d43cd49981e upstream. In a NOMMU kernel, sigreturn trampolines are generated on the user stack by setup_rt_frame. Currently, these trampolines are not instruction fenced, thus their visibility to ifetch is not guaranteed. This patch adds a flush_icache_range in setup_rt_frame to fix this problem. Signed-off-by: Mathis Salmen <mathis.salmen@matsal.de> Fixes: 6bd33e1ece52 ("riscv: add nommu support") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230406101130.82304-1-mathis.salmen@matsal.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-06riscv/kvm: Fix VM hang in case of timer delta being zero.Rajnesh Kanwal
[ Upstream commit 6eff38048944cadc3cddcf117acfa5199ec32490 ] In case when VCPU is blocked due to WFI, we schedule the timer from `kvm_riscv_vcpu_timer_blocking()` to keep timer interrupt ticking. But in case when delta_ns comes to be zero, we never schedule the timer and VCPU keeps sleeping indefinitely until any activity is done with VM console. This is easily reproduce-able using kvmtool. ./lkvm-static run -c1 --console virtio -p "earlycon root=/dev/vda" \ -k ./Image -d rootfs.ext4 Also, just add a print in kvm_riscv_vcpu_vstimer_expired() to check the interrupt delivery and run `top` or similar auto-upating cmd from guest. Within sometime one can notice that print from timer expiry routine stops and the `top` cmd output will stop updating. This change fixes this by making sure we schedule the timer even with delta_ns being zero to bring the VCPU out of sleep immediately. Fixes: 8f5cb44b1bae ("RISC-V: KVM: Support sstc extension") Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06riscv: ftrace: Fixup panic by disabling preemptionAndy Chiu
[ Upstream commit 8547649981e6631328cd64f583667501ae385531 ] In RISCV, we must use an AUIPC + JALR pair to encode an immediate, forming a jump that jumps to an address over 4K. This may cause errors if we want to enable kernel preemption and remove dependency from patching code with stop_machine(). For example, if a task was switched out on auipc. And, if we changed the ftrace function before it was switched back, then it would jump to an address that has updated 11:0 bits mixing with previous XLEN:12 part. p: patched area performed by dynamic ftrace ftrace_prologue: p| REG_S ra, -SZREG(sp) p| auipc ra, 0x? ------------> preempted ... change ftrace function ... p| jalr -?(ra) <------------- switched back p| REG_L ra, -SZREG(sp) func: xxx ret Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT") Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230112090603.1295340-2-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30riscv: Handle zicsr/zifencei issues between clang and binutilsNathan Chancellor
commit e89c2e815e76471cb507bd95728bf26da7976430 upstream. There are two related issues that appear in certain combinations with clang and GNU binutils. The first occurs when a version of clang that supports zicsr or zifencei via '-march=' [1] (i.e, >= 17.x) is used in combination with a version of GNU binutils that do not recognize zicsr and zifencei in the '-march=' value (i.e., < 2.36): riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicsr2p0_zifencei2p0: Invalid or unknown z ISA extension: 'zifencei' riscv64-linux-gnu-ld: failed to merge target specific data of file fs/efivarfs/file.o riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicsr2p0_zifencei2p0: Invalid or unknown z ISA extension: 'zifencei' riscv64-linux-gnu-ld: failed to merge target specific data of file fs/efivarfs/super.o The second occurs when a version of clang that does not support zicsr or zifencei via '-march=' (i.e., <= 16.x) is used in combination with a version of GNU as that defaults to a newer ISA base spec, which requires specifying zicsr and zifencei in the '-march=' value explicitly (i.e, >= 2.38): ../arch/riscv/kernel/kexec_relocate.S: Assembler messages: ../arch/riscv/kernel/kexec_relocate.S:147: Error: unrecognized opcode `fence.i', extension `zifencei' required clang-12: error: assembler command failed with exit code 1 (use -v to see invocation) This is the same issue addressed by commit 6df2a016c0c8 ("riscv: fix build with binutils 2.38") (see [2] for additional information) but older versions of clang miss out on it because the cc-option check fails: clang-12: error: invalid arch name 'rv64imac_zicsr_zifencei', unsupported standard user-level extension 'zicsr' clang-12: error: invalid arch name 'rv64imac_zicsr_zifencei', unsupported standard user-level extension 'zicsr' To resolve the first issue, only attempt to add zicsr and zifencei to the march string when using the GNU assembler 2.38 or newer, which is when the default ISA spec was updated, requiring these extensions to be specified explicitly. LLVM implements an older version of the base specification for all currently released versions, so these instructions are available as part of the 'i' extension. If LLVM's implementation is updated in the future, a CONFIG_AS_IS_LLVM condition can be added to CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI. To resolve the second issue, use version 2.2 of the base ISA spec when using an older version of clang that does not support zicsr or zifencei via '-march=', as that is the spec version most compatible with the one clang/LLVM implements and avoids the need to specify zicsr and zifencei explicitly due to still being a part of 'i'. [1]: https://github.com/llvm/llvm-project/commit/22e199e6afb1263c943c0c0d4498694e15bf8a16 [2]: https://lore.kernel.org/ZAxT7T9Xy1Fo3d5W@aurel32.net/ Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/1808 Co-developed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230313-riscv-zicsr-zifencei-fiasco-v1-1-dd1b7840a551@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-30riscv: mm: Fix incorrect ASID argument when flushing TLBDylan Jhong
commit 9a801afd3eb95e1a89aba17321062df06fb49d98 upstream. Currently, we pass the CONTEXTID instead of the ASID to the TLB flush function. We should only take the ASID field to prevent from touching the reserved bit field. Fixes: 3f1e782998cd ("riscv: add ASID-based tlbflushing methods") Signed-off-by: Dylan Jhong <dylan@andestech.com> Reviewed-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Link: https://lore.kernel.org/r/20230313034906.2401730-1-dylan@andestech.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-30riscv: Bump COMMAND_LINE_SIZE value to 1024Alexandre Ghiti
[ Upstream commit 61fc1ee8be26bc192d691932b0a67eabee45d12f ] Increase COMMAND_LINE_SIZE as the current default value is too low for syzbot kernel command line. There has been considerable discussion on this patch that has led to a larger patch set removing COMMAND_LINE_SIZE from the uapi headers on all ports. That's not quite done yet, but it's gotten far enough we're confident this is not a uABI change so this is safe. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Link: https://lore.kernel.org/r/20210316193420.904-1-alex@ghiti.fr [Palmer: it's not uabi] Link: https://lore.kernel.org/linux-riscv/874b8076-b0d1-4aaa-bcd8-05d523060152@app.fastmail.com/#t Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22riscv: asid: Fixup stale TLB entry cause application crashGuo Ren
commit 82dd33fde0268cc622d3d1ac64971f3f61634142 upstream. After use_asid_allocator is enabled, the userspace application will crash by stale TLB entries. Because only using cpumask_clear_cpu without local_flush_tlb_all couldn't guarantee CPU's TLB entries were fresh. Then set_mm_asid would cause the user space application to get a stale value by stale TLB entry, but set_mm_noasid is okay. Here is the symptom of the bug: unhandled signal 11 code 0x1 (coredump) 0x0000003fd6d22524 <+4>: auipc s0,0x70 0x0000003fd6d22528 <+8>: ld s0,-148(s0) # 0x3fd6d92490 => 0x0000003fd6d2252c <+12>: ld a5,0(s0) (gdb) i r s0 s0 0x8082ed1cc3198b21 0x8082ed1cc3198b21 (gdb) x /2x 0x3fd6d92490 0x3fd6d92490: 0xd80ac8a8 0x0000003f The core dump file shows that register s0 is wrong, but the value in memory is correct. Because 'ld s0, -148(s0)' used a stale mapping entry in TLB and got a wrong result from an incorrect physical address. When the task ran on CPU0, which loaded/speculative-loaded the value of address(0x3fd6d92490), then the first version of the mapping entry was PTWed into CPU0's TLB. When the task switched from CPU0 to CPU1 (No local_tlb_flush_all here by asid), it happened to write a value on the address (0x3fd6d92490). It caused do_page_fault -> wp_page_copy -> ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page. The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need TLB flush, but CPU0 had cleared the CPU0's mm_cpumask in the previous switch_mm. So we only flushed the CPU1 TLB and set the second version mapping of the PTE. When the task switched from CPU1 to CPU0 again, CPU0 still used a stale TLB mapping entry which contained a wrong target physical address. It raised a bug when the task happened to read that value. CPU0 CPU1 - switch 'task' in - read addr (Fill stale mapping entry into TLB) - switch 'task' out (no tlb_flush) - switch 'task' in (no tlb_flush) - write addr cause pagefault do_page_fault() (change to new addr mapping) wp_page_copy() ptep_clear_flush() ptep_get_and_clear() & flush_tlb_page() write new value into addr - switch 'task' out (no tlb_flush) - switch 'task' in (no tlb_flush) - read addr again (Use stale mapping entry in TLB) get wrong value from old phyical addr, BUG! The solution is to keep all CPUs' footmarks of cpumask(mm) in switch_mm, which could guarantee to invalidate all stale TLB entries during TLB flush. Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Zong Li <zong.li@sifive.com> Tested-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Cc: Anup Patel <apatel@ventanamicro.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: stable@vger.kernel.org Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230226150137.1919750-3-geomatsi@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22Revert "riscv: mm: notify remote harts about mmu cache updates"Sergey Matyukevich
commit e921050022f1f12d5029d1487a7dfc46cde15523 upstream. This reverts the remaining bits of commit 4bd1d80efb5a ("riscv: mm: notify remote harts harts about mmu cache updates"). According to bug reports, suggested approach to fix stale TLB entries is not sufficient. It needs to be replaced by a more robust solution. Fixes: 4bd1d80efb5a ("riscv: mm: notify remote harts about mmu cache updates") Reported-by: Zong Li <zong.li@sifive.com> Reported-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Cc: stable@vger.kernel.org Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230226150137.1919750-2-geomatsi@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17RISC-V: Don't check text_mutex during stop_machineConor Dooley
[ Upstream commit 2a8db5ec4a28a0fce822d10224db9471a44b6925 ] We're currently using stop_machine() to update ftrace & kprobes, which means that the thread that takes text_mutex during may not be the same as the thread that eventually patches the code. This isn't actually a race because the lock is still held (preventing any other concurrent accesses) and there is only one thread running during stop_machine(), but it does trigger a lockdep failure. This patch just elides the lockdep check during stop_machine. Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230303143754.4005217-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack modeAlexandre Ghiti
[ Upstream commit 76950340cf03b149412fe0d5f0810e52ac1df8cb ] When CONFIG_FRAME_POINTER is unset, the stack unwinding function walk_stackframe randomly reads the stack and then, when KASAN is enabled, it can lead to the following backtrace: [ 0.000000] ================================================================== [ 0.000000] BUG: KASAN: stack-out-of-bounds in walk_stackframe+0xa6/0x11a [ 0.000000] Read of size 8 at addr ffffffff81807c40 by task swapper/0 [ 0.000000] [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-12919-g24203e6db61f #43 [ 0.000000] Hardware name: riscv-virtio,qemu (DT) [ 0.000000] Call Trace: [ 0.000000] [<ffffffff80007ba8>] walk_stackframe+0x0/0x11a [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80c49c80>] dump_stack_lvl+0x22/0x36 [ 0.000000] [<ffffffff80c3783e>] print_report+0x198/0x4a8 [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8015f68a>] kasan_report+0x9a/0xc8 [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8006e99c>] desc_make_final+0x80/0x84 [ 0.000000] [<ffffffff8009a04e>] stack_trace_save+0x88/0xa6 [ 0.000000] [<ffffffff80099fc2>] filter_irq_stacks+0x72/0x76 [ 0.000000] [<ffffffff8006b95e>] devkmsg_read+0x32a/0x32e [ 0.000000] [<ffffffff8015ec16>] kasan_save_stack+0x28/0x52 [ 0.000000] [<ffffffff8006e998>] desc_make_final+0x7c/0x84 [ 0.000000] [<ffffffff8009a04a>] stack_trace_save+0x84/0xa6 [ 0.000000] [<ffffffff8015ec52>] kasan_set_track+0x12/0x20 [ 0.000000] [<ffffffff8015f22e>] __kasan_slab_alloc+0x58/0x5e [ 0.000000] [<ffffffff8015e7ea>] __kmem_cache_create+0x21e/0x39a [ 0.000000] [<ffffffff80e133ac>] create_boot_cache+0x70/0x9c [ 0.000000] [<ffffffff80e17ab2>] kmem_cache_init+0x6c/0x11e [ 0.000000] [<ffffffff80e00fd6>] mm_init+0xd8/0xfe [ 0.000000] [<ffffffff80e011d8>] start_kernel+0x190/0x3ca [ 0.000000] [ 0.000000] The buggy address belongs to stack of task swapper/0 [ 0.000000] and is located at offset 0 in frame: [ 0.000000] stack_trace_save+0x0/0xa6 [ 0.000000] [ 0.000000] This frame has 1 object: [ 0.000000] [32, 56) 'c' [ 0.000000] [ 0.000000] The buggy address belongs to the physical page: [ 0.000000] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x81a07 [ 0.000000] flags: 0x1000(reserved|zone=0) [ 0.000000] raw: 0000000000001000 ff600003f1e3d150 ff600003f1e3d150 0000000000000000 [ 0.000000] raw: 0000000000000000 0000000000000000 00000001ffffffff [ 0.000000] page dumped because: kasan: bad access detected [ 0.000000] [ 0.000000] Memory state around the buggy address: [ 0.000000] ffffffff81807b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] >ffffffff81807c00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f3 [ 0.000000] ^ [ 0.000000] ffffffff81807c80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ================================================================== Fix that by using READ_ONCE_NOCHECK when reading the stack in imprecise mode. Fixes: 5d8544e2d007 ("RISC-V: Generic library routines and assembly") Reported-by: Chathura Rajapaksha <chathura.abeyrathne.lk@gmail.com> Link: https://lore.kernel.org/all/CAD7mqryDQCYyJ1gAmtMm8SASMWAQ4i103ptTb0f6Oda=tPY2=A@mail.gmail.com/ Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230308091639.602024-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17riscv: Add header include guards to insn.hLiao Chang
[ Upstream commit 8ac6e619d9d51b3eb5bae817db8aa94e780a0db4 ] Add header include guards to insn.h to prevent repeating declaration of any identifiers in insn.h. Fixes: edde5584c7ab ("riscv: Add SW single-step support for KDB") Signed-off-by: Liao Chang <liaochang1@huawei.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Fixes: c9c1af3f186a ("RISC-V: rename parse_asm.h to insn.h") Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230129094242.282620-1-liaochang1@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17RISC-V: Stop emitting attributesPalmer Dabbelt
commit e18048da9bc3f87acef4eb67a11b4fc55fe15424 upstream. The RISC-V ELF attributes don't contain any useful information. New toolchains ignore them, but they frequently trip up various older/mixed toolchains. So just turn them off. Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230223224605.6995-1-palmer@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10riscv: ftrace: Reduce the detour code size to halfGuo Ren
commit 6724a76cff85ee271bbbff42ac527e4643b2ec52 upstream. Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10riscv: ftrace: Remove wasted nops for !RISCV_ISA_CGuo Ren
commit 409c8fb20c66df7150e592747412438c04aeb11f upstream. When CONFIG_RISCV_ISA_C=n, -fpatchable-function-entry=8 would generate more nops than we expect. Because it treat nop opcode as 0x00000013 instead of 0x0001. Dump of assembler code for function dw_pcie_free_msi: 0xffffffff806fce94 <+0>: sd ra,-8(sp) 0xffffffff806fce98 <+4>: auipc ra,0xff90f 0xffffffff806fce9c <+8>: jalr -684(ra) # 0xffffffff8000bbec <ftrace_caller> 0xffffffff806fcea0 <+12>: ld ra,-8(sp) 0xffffffff806fcea4 <+16>: nop /* wasted */ 0xffffffff806fcea8 <+20>: nop /* wasted */ 0xffffffff806fceac <+24>: nop /* wasted */ 0xffffffff806fceb0 <+28>: nop /* wasted */ 0xffffffff806fceb4 <+0>: addi sp,sp,-48 0xffffffff806fceb8 <+4>: sd s0,32(sp) 0xffffffff806fcebc <+8>: sd s1,24(sp) 0xffffffff806fcec0 <+12>: sd s2,16(sp) 0xffffffff806fcec4 <+16>: sd s3,8(sp) 0xffffffff806fcec8 <+20>: sd ra,40(sp) 0xffffffff806fcecc <+24>: addi s0,sp,48 Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230112090603.1295340-3-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10riscv, mm: Perform BPF exhandler fixup on page faultBjörn Töpel
commit 416721ff05fddc58ca531b6f069de250301de6e5 upstream. Commit 21855cac82d3 ("riscv/mm: Prevent kernel module to access user memory without uaccess routines") added early exits/deaths for page faults stemming from accesses to user-space without using proper uaccess routines (where sstatus.SUM is set). Unfortunatly, this is too strict for some BPF programs, which relies on BPF exhandler fixups. These BPF programs loads "BTF pointers". A BTF pointers could either be a valid kernel pointer or NULL, but not a userspace address. Resolve the problem by calling the fixup handler in the early exit path. Fixes: 21855cac82d3 ("riscv/mm: Prevent kernel module to access user memory without uaccess routines") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20230214162515.184827-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10riscv: jump_label: Fixup unaligned arch_static_branch functionAndy Chiu
commit 9ddfc3cd806081ce1f6c9c2f988cbb031f35d28f upstream. Runtime code patching must be done at a naturally aligned address, or we may execute on a partial instruction. We have encountered problems traced back to static jump functions during the test. We switched the tracer randomly for every 1~5 seconds on a dual-core QEMU setup and found the kernel sucking at a static branch where it jumps to itself. The reason is that the static branch was 2-byte but not 4-byte aligned. Then, the kernel would patch the instruction, either J or NOP, with two half-word stores if the machine does not have efficient unaligned accesses. Thus, moments exist where half of the NOP mixes with the other half of the J when transitioning the branch. In our particular case, on a little-endian machine, the upper half of the NOP was mixed with the lower part of the J when enabling the branch, resulting in a jump that jumped to itself. Conversely, it would result in a HINT instruction when disabling the branch, but it might not be observable. ARM64 does not have this problem since all instructions must be 4-byte aligned. Fixes: ebc00dde8a97 ("riscv: Add jump-label implementation") Link: https://lore.kernel.org/linux-riscv/20220913094252.3555240-6-andy.chiu@sifive.com/ Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230206090440.1255001-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10riscv: mm: fix regression due to update_mmu_cache changeSergey Matyukevich
commit b49f700668fff7565b945dce823def79bff59bb0 upstream. This is a partial revert of the commit 4bd1d80efb5a ("riscv: mm: notify remote harts about mmu cache updates"). Original commit included two loosely related changes serving the same purpose of fixing stale TLB entries causing user-space application crash: - introduce deferred per-ASID TLB flush for CPUs not running the task - switch to per-ASID TLB flush on all CPUs running the task in update_mmu_cache According to report and discussion in [1], the second part caused a regression on Renesas RZ/Five SoC. For now restore the old behavior of the update_mmu_cache. [1] https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/ Fixes: 4bd1d80efb5a ("riscv: mm: notify remote harts about mmu cache updates") Reported-by: "Lad, Prabhakar" <prabhakar.csengg@gmail.com> Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Link: trailer, so that it can be parsed with git's trailer functionality? Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230129211818.686557-1-geomatsi@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10riscv: Avoid enabling interrupts in die()Mattias Nissler
commit 130aee3fd9981297ff9354e5d5609cd59aafbbea upstream. While working on something else, I noticed that the kernel would start accepting interrupts again after crashing in an interrupt handler. Since the kernel is already in inconsistent state, enabling interrupts is dangerous and opens up risk of kernel state deteriorating further. Interrupts do get enabled via what looks like an unintended side effect of spin_unlock_irq, so switch to the more cautious spin_lock_irqsave/spin_unlock_irqrestore instead. Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Mattias Nissler <mnissler@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20230215144828.3370316-1-mnissler@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10RISC-V: add a spin_shadow_stack declarationConor Dooley
commit eb9be8310c58c166f9fae3b71c0ad9d6741b4897 upstream. The patchwork automation reported a sparse complaint that spin_shadow_stack was not declared and should be static: ../arch/riscv/kernel/traps.c:335:15: warning: symbol 'spin_shadow_stack' was not declared. Should it be static? However, this is used in entry.S and therefore shouldn't be static. The same applies to the shadow_stack that this pseudo spinlock is trying to protect, so do like its charge and add a declaration to thread_info.h Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Fixes: 7e1864332fbc ("riscv: fix race when vmap stack overflow") Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230210185945.915806-1-conor@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10RISC-V: time: initialize hrtimer based broadcast clock event deviceConor Dooley
[ Upstream commit 8b3b8fbb4896984b5564789a42240e4b3caddb61 ] Similarly to commit 022eb8ae8b5e ("ARM: 8938/1: kernel: initialize broadcast hrtimer based clock event device"), RISC-V needs to initiate hrtimer based broadcast clock event device before C3STOP can be used. Otherwise, the introduction of C3STOP for the RISC-V arch timer in commit 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") leaves us without any broadcast timer registered. This prevents the kernel from entering oneshot mode, which breaks timer behaviour, for example clock_nanosleep(). A test app that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & C3STOP enabled, the sleep times are rounded up to the next jiffy: == CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 == Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179 Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193 Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000 Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000 Samples: 521 Samples: 521 Samples: 521 Samples: 521 Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/ Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") Suggested-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Samuel Holland <samuel@sholland.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230103141102.772228-2-apatel@ventanamicro.com Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14riscv: kprobe: Fixup misaligned load textGuo Ren
commit eb7423273cc9922ee2d05bf660c034d7d515bb91 upstream. The current kprobe would cause a misaligned load for the probe point. This patch fixup it with two half-word loads instead. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/linux-riscv/878rhig9zj.fsf@all.your.base.are.belong.to.us/ Reported-by: Bjorn Topel <bjorn.topel@gmail.com> Reviewed-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20230204063531.740220-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14riscv: Fixup race condition on PG_dcache_clean in flush_icache_pteGuo Ren
commit 950b879b7f0251317d26bae0687e72592d607532 upstream. In commit 588a513d3425 ("arm64: Fix race condition on PG_dcache_clean in __sync_icache_dcache()"), we found RISC-V has the same issue as the previous arm64. The previous implementation didn't guarantee the correct sequence of operations, which means flush_icache_all() hasn't been called when the PG_dcache_clean was set. That would cause a risk of page synchronization. Fixes: 08f051eda33b ("RISC-V: Flush I$ when making a dirty page executable") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230127035306.1819561-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14riscv: stacktrace: Fix missing the first frameLiu Shixin
[ Upstream commit cb80242cc679d6397e77d8a964deeb3ff218d2b5 ] When running kfence_test, I found some testcases failed like this: # test_out_of_bounds_read: EXPECTATION FAILED at mm/kfence/kfence_test.c:346 Expected report_matches(&expect) to be true, but is false not ok 1 - test_out_of_bounds_read The corresponding call-trace is: BUG: KFENCE: out-of-bounds read in kunit_try_run_case+0x38/0x84 Out-of-bounds read at 0x(____ptrval____) (32B right of kfence-#10): kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc The kfence_test using the first frame of call trace to check whether the testcase is succeed or not. Commit 6a00ef449370 ("riscv: eliminate unreliable __builtin_frame_address(1)") skip first frame for all case, which results the kfence_test failed. Indeed, we only need to skip the first frame for case (task==NULL || task==current). With this patch, the call-trace will be: BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0x88/0x19e Out-of-bounds read at 0x(____ptrval____) (1B left of kfence-#7): test_out_of_bounds_read+0x88/0x19e kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc Fixes: 6a00ef449370 ("riscv: eliminate unreliable __builtin_frame_address(1)") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Tested-by: Samuel Holland <samuel@sholland.org> Link: https://lore.kernel.org/r/20221207025038.1022045-1-liushixin2@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09riscv: disable generation of unwind tablesAndreas Schwab
commit 2f394c0e7d1129a35156e492bc8f445fb20f43ac upstream. GCC 13 will enable -fasynchronous-unwind-tables by default on riscv. In the kernel, we don't have any use for unwind tables yet, so disable them. More importantly, the .eh_frame section brings relocations (R_RISC_32_PCREL, R_RISCV_SET{6,8,16}, R_RISCV_SUB{6,8,16}) into modules that we are not prepared to handle. Signed-off-by: Andreas Schwab <schwab@suse.de> Link: https://lore.kernel.org/r/mvmzg9xybqu.fsf@suse.de Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09riscv: kprobe: Fixup kernel panic when probing an illegal positionGuo Ren
[ Upstream commit 87f48c7ccc73afc78630530d9af51f458f58cab8 ] The kernel would panic when probed for an illegal position. eg: (CONFIG_RISCV_ISA_C=n) echo 'p:hello kernel_clone+0x16 a0=%a0' >> kprobe_events echo 1 > events/kprobes/hello/enable cat trace Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __do_sys_newfstatat+0xb8/0xb8 CPU: 0 PID: 111 Comm: sh Not tainted 6.2.0-rc1-00027-g2d398fe49a4d #490 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff80007268>] dump_backtrace+0x38/0x48 [<ffffffff80c5e83c>] show_stack+0x50/0x68 [<ffffffff80c6da28>] dump_stack_lvl+0x60/0x84 [<ffffffff80c6da6c>] dump_stack+0x20/0x30 [<ffffffff80c5ecf4>] panic+0x160/0x374 [<ffffffff80c6db94>] generic_handle_arch_irq+0x0/0xa8 [<ffffffff802deeb0>] sys_newstat+0x0/0x30 [<ffffffff800158c0>] sys_clone+0x20/0x30 [<ffffffff800039e8>] ret_from_syscall+0x0/0x4 ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __do_sys_newfstatat+0xb8/0xb8 ]--- That is because the kprobe's ebreak instruction broke the kernel's original code. The user should guarantee the correction of the probe position, but it couldn't make the kernel panic. This patch adds arch_check_kprobe in arch_prepare_kprobe to prevent an illegal position (Such as the middle of an instruction). Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20230201040604.3390509-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01riscv: Move call to init_cpu_topology() to later initialization stageLey Foon Tan
[ Upstream commit c1d6105869464635d8a2bcf87a43c05f4c0cfca4 ] If "capacity-dmips-mhz" is present in a CPU DT node, topology_parse_cpu_capacity() will fail to allocate memory. arm64, with which this code path is shared, does not call topology_parse_cpu_capacity() until later in boot where memory allocation is available. While "capacity-dmips-mhz" is not yet a valid property on RISC-V, invalid properties should be ignored rather than cause issues. Move init_cpu_topology(), which calls topology_parse_cpu_capacity(), to a later initialization stage, to match arm64. As a side effect of this change, RISC-V is "protected" from changes to core topology code that would work on arm64 where memory allocation is safe but on RISC-V isn't. Fixes: 03f11f03dbfe ("RISC-V: Parse cpu topology during boot.") Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Link: https://lore.kernel.org/r/20230105033705.3946130-1-leyfoon.tan@starfivetech.com [Palmer: use Conor's commit text] Link: https://lore.kernel.org/linux-riscv/20230104183033.755668-1-pierre.gondois@arm.com/T/#me592d4c8b9508642954839f0077288a353b0b9b2 Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01riscv/kprobe: Fix instruction simulation of JALRLiao Chang
[ Upstream commit ca0254998be4d74cf6add70ccfab0d2dbd362a10 ] Set kprobe at 'jalr 1140(ra)' of vfs_write results in the following crash: [ 32.092235] Unable to handle kernel access to user memory without uaccess routines at virtual address 00aaaaaad77b1170 [ 32.093115] Oops [#1] [ 32.093251] Modules linked in: [ 32.093626] CPU: 0 PID: 135 Comm: ftracetest Not tainted 6.2.0-rc2-00013-gb0aa5e5df0cb-dirty #16 [ 32.093985] Hardware name: riscv-virtio,qemu (DT) [ 32.094280] epc : ksys_read+0x88/0xd6 [ 32.094855] ra : ksys_read+0xc0/0xd6 [ 32.095016] epc : ffffffff801cda80 ra : ffffffff801cdab8 sp : ff20000000d7bdc0 [ 32.095227] gp : ffffffff80f14000 tp : ff60000080f9cb40 t0 : ffffffff80f13e80 [ 32.095500] t1 : ffffffff8000c29c t2 : ffffffff800dbc54 s0 : ff20000000d7be60 [ 32.095716] s1 : 0000000000000000 a0 : ffffffff805a64ae a1 : ffffffff80a83708 [ 32.095921] a2 : ffffffff80f160a0 a3 : 0000000000000000 a4 : f229b0afdb165300 [ 32.096171] a5 : f229b0afdb165300 a6 : ffffffff80eeebd0 a7 : 00000000000003ff [ 32.096411] s2 : ff6000007ff76800 s3 : fffffffffffffff7 s4 : 00aaaaaad77b1170 [ 32.096638] s5 : ffffffff80f160a0 s6 : ff6000007ff76800 s7 : 0000000000000030 [ 32.096865] s8 : 00ffffffc3d97be0 s9 : 0000000000000007 s10: 00aaaaaad77c9410 [ 32.097092] s11: 0000000000000000 t3 : ffffffff80f13e48 t4 : ffffffff8000c29c [ 32.097317] t5 : ffffffff8000c29c t6 : ffffffff800dbc54 [ 32.097505] status: 0000000200000120 badaddr: 00aaaaaad77b1170 cause: 000000000000000d [ 32.098011] [<ffffffff801cdb72>] ksys_write+0x6c/0xd6 [ 32.098222] [<ffffffff801cdc06>] sys_write+0x2a/0x38 [ 32.098405] [<ffffffff80003c76>] ret_from_syscall+0x0/0x2 Since the rs1 and rd might be the same one, such as 'jalr 1140(ra)', hence it requires obtaining the target address from rs1 followed by updating rd. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Liao Chang <liaochang1@huawei.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230116064342.2092136-1-liaochang1@huawei.com [Palmer: Pick Guo's cleanup] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01riscv: fix -Wundef warning for CONFIG_RISCV_BOOT_SPINWAITMasahiro Yamada
commit 5b89c6f9b2df2b7cf6da8e0b2b87c8995b378cad upstream. Since commit 80b6093b55e3 ("kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds"), building with W=1 detects misuse of #if. $ make W=1 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- arch/riscv/kernel/ [snip] AS arch/riscv/kernel/head.o arch/riscv/kernel/head.S:329:5: warning: "CONFIG_RISCV_BOOT_SPINWAIT" is not defined, evaluates to 0 [-Wundef] 329 | #if CONFIG_RISCV_BOOT_SPINWAIT | ^~~~~~~~~~~~~~~~~~~~~~~~~~ CONFIG_RISCV_BOOT_SPINWAIT is a bool option. #ifdef should be used. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Fixes: 2ffc48fc7071 ("RISC-V: Move spinwait booting method to its own config") Link: https://lore.kernel.org/r/20230106161213.2374093-1-masahiroy@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24riscv: dts: sifive: fu740: fix size of pcie 32bit memoryBen Dooks
commit 43d5f5d63699724d47f0d9e0eae516a260d232b4 upstream. The 32-bit memory resource is needed for non-prefetchable memory allocations on the PCIe bus, however with some cards (such as the SM768) the system fails to allocate memory from this. Checking the allocation against the datasheet, it looks like there has been a mis-calcualation of the resource for the first memory region (0x0060090000..0x0070ffffff) which in the data-sheet for the fu740 (v1p2) is from 0x0060000000..0x007fffffff. Changing this to allocate from 0x0060090000..0x007fffffff fixes the probing issues. Fixes: ae80d5148085 ("riscv: dts: Add PCIe support for the SiFive FU740-C000 SoC") Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Greentime Hu <greentime.hu@sifive.com> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Cc: stable@vger.kernel.org Tested-by: Ron Economos <re@w6rz.net> # from IRC Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-12riscv, kprobes: Stricter c.jr/c.jalr decodingBjörn Töpel
commit b2d473a6019ef9a54b0156ecdb2e0398c9fa6a24 upstream. In the compressed instruction extension, c.jr, c.jalr, c.mv, and c.add is encoded the following way (each instruction is 16b): ---+-+-----------+-----------+-- 100 0 rs1[4:0]!=0 00000 10 : c.jr 100 1 rs1[4:0]!=0 00000 10 : c.jalr 100 0 rd[4:0]!=0 rs2[4:0]!=0 10 : c.mv 100 1 rd[4:0]!=0 rs2[4:0]!=0 10 : c.add The following logic is used to decode c.jr and c.jalr: insn & 0xf007 == 0x8002 => instruction is an c.jr insn & 0xf007 == 0x9002 => instruction is an c.jalr When 0xf007 is used to mask the instruction, c.mv can be incorrectly decoded as c.jr, and c.add as c.jalr. Correct the decoding by changing the mask from 0xf007 to 0xf07f. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230102160748.1307289-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-12riscv: uaccess: fix type of 0 variable on error in get_user()Ben Dooks
commit b9b916aee6715cd7f3318af6dc360c4729417b94 upstream. If the get_user(x, ptr) has x as a pointer, then the setting of (x) = 0 is going to produce the following sparse warning, so fix this by forcing the type of 'x' when access_ok() fails. fs/aio.c:2073:21: warning: Using plain integer as NULL pointer Signed-off-by: Ben Dooks <ben-linux@fluff.org> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20221229170545.718264-1-ben-linux@fluff.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07riscv: mm: notify remote harts about mmu cache updatesSergey Matyukevich
commit 4bd1d80efb5af640f99157f39b50fb11326ce641 upstream. Current implementation of update_mmu_cache function performs local TLB flush. It does not take into account ASID information. Besides, it does not take into account other harts currently running the same mm context or possible migration of the running context to other harts. Meanwhile TLB flush is not performed for every context switch if ASID support is enabled. Patch [1] proposed to add ASID support to update_mmu_cache to avoid flushing local TLB entirely. This patch takes into account other harts currently running the same mm context as well as possible migration of this context to other harts. For this purpose the approach from flush_icache_mm is reused. Remote harts currently running the same mm context are informed via SBI calls that they need to flush their local TLBs. All the other harts are marked as needing a deferred TLB flush when this mm context runs on them. [1] https://lore.kernel.org/linux-riscv/20220821013926.8968-1-tjytimi@163.com/ Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/#t Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07riscv: stacktrace: Fixup ftrace_graph_ret_addr retp argumentGuo Ren
commit 5c3022e4a616d800cf5f4c3a981d7992179e44a1 upstream. The 'retp' is a pointer to the return address on the stack, so we must pass the current return address pointer as the 'retp' argument to ftrace_push_return_trace(). Not parent function's return address on the stack. Fixes: b785ec129bd9 ("riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20221109064937.3643993-2-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07RISC-V: kexec: Fix memory leak of elf header bufferLi Huafei
commit cbc32023ddbdf4baa3d9dc513a2184a84080a5a2 upstream. This is reported by kmemleak detector: unreferenced object 0xff2000000403d000 (size 4096): comm "kexec", pid 146, jiffies 4294900633 (age 64.792s) hex dump (first 32 bytes): 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 .ELF............ 04 00 f3 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000566ca97c>] kmemleak_vmalloc+0x3c/0xbe [<00000000979283d8>] __vmalloc_node_range+0x3ac/0x560 [<00000000b4b3712a>] __vmalloc_node+0x56/0x62 [<00000000854f75e2>] vzalloc+0x2c/0x34 [<00000000e9a00db9>] crash_prepare_elf64_headers+0x80/0x30c [<0000000067e8bf48>] elf_kexec_load+0x3e8/0x4ec [<0000000036548e09>] kexec_image_load_default+0x40/0x4c [<0000000079fbe1b4>] sys_kexec_file_load+0x1c4/0x322 [<0000000040c62c03>] ret_from_syscall+0x0/0x2 In elf_kexec_load(), a buffer is allocated via vzalloc() to store elf headers. While it's not freed back to system when kdump kernel is reloaded or unloaded, or when image->elf_header is successfully set and then fails to load kdump kernel for some reason. Fix it by freeing the buffer in arch_kimage_file_post_load_cleanup(). Fixes: 8acea455fafa ("RISC-V: Support for kexec_file on panic") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221104095658.141222-2-lihuafei1@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07riscv: Fixup compile error with !MMUGuo Ren
commit c528ef0888b75f673f7d48022de8d31d5b451e8c upstream. Current nommu_virt_defconfig can't compile: In file included from arch/riscv/kernel/crash_core.c:3: arch/riscv/kernel/crash_core.c: In function 'arch_crash_save_vmcoreinfo': arch/riscv/kernel/crash_core.c:8:27: error: 'VA_BITS' undeclared (first use in this function) 8 | VMCOREINFO_NUMBER(VA_BITS); | ^~~~~~~ Add MMU dependency for KEXEC_FILE. Fixes: 6261586e0c91 ("RISC-V: Add kexec_file support") Reported-by: Conor Dooley <conor.dooley@microchip.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221207091112.2258674-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07RISC-V: kexec: Fix memory leak of fdt bufferLi Huafei
commit 96df59b1ae23f5c11698c3c2159aeb2ecd4944a4 upstream. This is reported by kmemleak detector: unreferenced object 0xff60000082864000 (size 9588): comm "kexec", pid 146, jiffies 4294900634 (age 64.788s) hex dump (first 32 bytes): d0 0d fe ed 00 00 12 ed 00 00 00 48 00 00 11 40 ...........H...@ 00 00 00 28 00 00 00 11 00 00 00 02 00 00 00 00 ...(............ backtrace: [<00000000f95b17c4>] kmemleak_alloc+0x34/0x3e [<00000000b9ec8e3e>] kmalloc_order+0x9c/0xc4 [<00000000a95cf02e>] kmalloc_order_trace+0x34/0xb6 [<00000000f01e68b4>] __kmalloc+0x5c2/0x62a [<000000002bd497b2>] kvmalloc_node+0x66/0xd6 [<00000000906542fa>] of_kexec_alloc_and_setup_fdt+0xa6/0x6ea [<00000000e1166bde>] elf_kexec_load+0x206/0x4ec [<0000000036548e09>] kexec_image_load_default+0x40/0x4c [<0000000079fbe1b4>] sys_kexec_file_load+0x1c4/0x322 [<0000000040c62c03>] ret_from_syscall+0x0/0x2 In elf_kexec_load(), a buffer is allocated via kvmalloc() to store fdt. While it's not freed back to system when kexec kernel is reloaded or unloaded. Then memory leak is caused. Fix it by introducing riscv specific function arch_kimage_file_post_load_cleanup(), and freeing the buffer there. Fixes: 6261586e0c91 ("RISC-V: Add kexec_file support") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Liao Chang <liaochang1@huawei.com> Link: https://lore.kernel.org/r/20221104095658.141222-1-lihuafei1@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-31RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config()Anup Patel
[ Upstream commit e482d9e33d5b0f222cbef7341dcd52cead6b9edc ] The reg_val check in kvm_riscv_vcpu_set_reg_config() should only be done for isa config register. Fixes: 9bfd900beeec ("RISC-V: KVM: Improve ISA extension by using a bitmap") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv: Fix P4D_SHIFT definition for 3-level page table modeAlexandre Ghiti
[ Upstream commit 71fc3621efc38ace9640ee6a0db3300900689592 ] RISC-V kernels support 3,4,5-level page tables at runtime by folding upper levels. In case of a 3-level page table, PGDIR is folded into P4D which in turn is folded into PUD: PGDIR_SHIFT value is correctly set to the same value as PUD_SHIFT, but P4D_SHIFT is not, then any use of P4D_SHIFT will access invalid address bits (all set to 1). Fix this by dynamically defining P4D_SHIFT value, like we already do for PGDIR_SHIFT. Fixes: d10efa21a937 ("riscv: mm: Control p4d's folding by pgtable_l5_enabled") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20221201135128.1482189-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RISC-V: Align the shadow stackPalmer Dabbelt
[ Upstream commit b003b3b77d65133a0011ae3b7b255347438c12f6 ] The standard RISC-V ABIs all require 16-byte stack alignment. We're only calling that one function on the shadow stack so I doubt it'd result in a real issue, but might as well keep this lined up. Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221130023515.20217-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv: Fix crash during early errata patchingSamuel Holland
[ Upstream commit 0c49688174f5347c3f8012e84c0ffa0d2b2890c8 ] The patch function for the T-Head PBMT errata calls __pa_symbol() before relocation. This crashes when CONFIG_DEBUG_VIRTUAL is enabled, because __pa_symbol() forwards to __phys_addr_symbol(), and __phys_addr_symbol() checks against the absolute kernel start/end address. Fix this by checking against the kernel map instead of a symbol address. Fixes: a35707c3d850 ("riscv: add memory-type errata for T-Head") Reviewed-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Samuel Holland <samuel@sholland.org> Link: https://lore.kernel.org/r/20221126060920.65009-1-samuel@sholland.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RISC-V: Fix MEMREMAP_WB for systems with SvpbmtAnup Patel
[ Upstream commit b91676fc16cd384a81e3af52c641aa61985cc231 ] Currently, the memremap() called with MEMREMAP_WB maps memory using the generic ioremap() function which breaks on system with Svpbmt because memory mapped using _PAGE_IOREMAP page attributes is treated as strongly-ordered non-cacheable IO memory. To address this, we implement RISC-V specific arch_memremap_wb() which maps memory using _PAGE_KERNEL page attributes resulting in write-back cacheable mapping on systems with Svpbmt. Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support") Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com> Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221114090536.1662624-2-apatel@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RISC-V: Fix unannoted hardirqs-on in return to userspace slow-pathAndrew Bresticker
[ Upstream commit b0f4c74eadbf69a3298f38566bfaa2e202541f2f ] The return to userspace path in entry.S may enable interrupts without the corresponding lockdep annotation, producing a splat[0] when DEBUG_LOCKDEP is enabled. Simply calling __trace_hardirqs_on() here gets a bit messy due to the use of RA to point back to ret_from_exception, so just move the whole slow-path loop into C. It's more readable and it lets us use local_irq_{enable,disable}(), avoiding the need for manual annotations altogether. [0]: ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled()) WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5512 check_flags+0x10a/0x1e0 Modules linked in: CPU: 2 PID: 1 Comm: init Not tainted 6.1.0-rc4-00160-gb56b6e2b4f31 #53 Hardware name: riscv-virtio,qemu (DT) epc : check_flags+0x10a/0x1e0 ra : check_flags+0x10a/0x1e0 <snip> status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff808edb90>] lock_is_held_type+0x78/0x14e [<ffffffff8003dae2>] __might_resched+0x26/0x22c [<ffffffff8003dd24>] __might_sleep+0x3c/0x66 [<ffffffff80022c60>] get_signal+0x9e/0xa70 [<ffffffff800054a2>] do_notify_resume+0x6e/0x422 [<ffffffff80003c68>] ret_from_exception+0x0/0x10 irq event stamp: 44512 hardirqs last enabled at (44511): [<ffffffff808f901c>] _raw_spin_unlock_irqrestore+0x54/0x62 hardirqs last disabled at (44512): [<ffffffff80008200>] __trace_hardirqs_off+0xc/0x14 softirqs last enabled at (44472): [<ffffffff808f9fbe>] __do_softirq+0x3de/0x51e softirqs last disabled at (44467): [<ffffffff80017760>] irq_exit+0xd6/0x104 ---[ end trace 0000000000000000 ]--- possible reason: unannotated irqs-on. Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com> Fixes: 3c4697982982 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") Link: https://lore.kernel.org/r/20221111223108.1976562-1-abrestic@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv/mm: add arch hook arch_clear_hugepage_flagsTong Tiangen
[ Upstream commit d8bf77a1dc3079692f54be3087a5fd16d90027b0 ] With the PG_arch_1 we keep track if the page's data cache is clean, architecture rely on this property to treat new pages as dirty with respect to the data cache and perform the flushing before mapping the pages into userspace. This patch adds a new architecture hook, arch_clear_hugepage_flags,so that architectures which rely on the page flags being in a particular state for fresh allocations can adjust the flags accordingly when a page is freed into the pool. Fixes: 9e953cda5cdf ("riscv: Introduce huge page support for 32/64bit kernel") Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Link: https://lore.kernel.org/r/20221024094725.3054311-3-tongtiangen@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNCPu Lehui
[ Upstream commit b54b6003612a376e7be32cbc5c1af3754bbbbb3d ] For BPF_PSEUDO_FUNC instruction, verifier will refill imm with correct addresses of bpf_calls and then run last pass of JIT. Since the emit_imm of RV64 is variable-length, which will emit appropriate length instructions accorroding to the imm, it may broke ctx->offset, and lead to unpredictable problem, such as inaccurate jump. So let's fix it with fixed-length instructions. Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper") Suggested-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Björn Töpel <bjorn@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20221206091410.1584784-1-pulehui@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv: dts: microchip: remove pcie node from the sev kitConor Dooley
[ Upstream commit 1150f4cff831e1d7db673417bcb81833d6544cf8 ] The SEV kit reference design does not hook up the PCIe root port to the core complex including it is misleading. The entry is a re-use mistake - I was not aware of this when I moved the PCIe node out of mpfs.dtsi so that individual bistreams could connect it to different fics etc. The node is disabled, so there should be no functional change here. Fixes: 978a17d1a688 ("riscv: dts: microchip: add sevkit device tree") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv: dts: microchip: fix the icicle's #pwm-cellsConor Dooley
[ Upstream commit bdd28ab35c163553a2a686fdc5ea3cf247aad69b ] \#pwm-cells for the Icicle kit's fabric PWM was incorrectly set to 2 & blindly overridden by the (out of tree) driver anyway. The core can support inverted operation, so update the entry to correctly report its capabilities. Fixes: 72560c6559b8 ("riscv: dts: microchip: add fpga fabric section to icicle kit") Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv: dts: microchip: fix memory node unit address for icicleConor Dooley
[ Upstream commit d6105a8b7c160a73ae04054c8921eba80a294146 ] Evidently I forgot to update the unit address for the 38-bit cached memory node when I changed the address in the reg property.. Update it to match. Fixes: 6c1193301791 ("riscv: dts: microchip: update memory configuration for v2022.10") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-02Merge tag 'riscv-for-linus-6.1-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - build fix for the NR_CPUS Kconfig SBI version dependency - fixes to early memory initialization, to fix page permissions in EFI and post-initmem-free - build fix for the VDSO, to avoid trying to profile the VDSO functions - fixes for kexec crash handling, to fix multi-core and interrupt related initialization inside the crash kernel - fix for a race condition when handling multiple concurrect kernel stack overflows * tag 'riscv-for-linus-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: kexec: Fixup crash_smp_send_stop without multi cores riscv: kexec: Fixup irq controller broken in kexec crash path riscv: mm: Proper page permissions after initmem free riscv: vdso: fix section overlapping under some conditions riscv: fix race when vmap stack overflow riscv: Sync efi page table's kernel mappings before switching riscv: Fix NR_CPUS range conditions
2022-12-02Merge tag 'mm-hotfixes-stable-2022-12-02' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "15 hotfixes, 11 marked cc:stable. Only three or four of the latter address post-6.0 issues, which is hopefully a sign that things are converging" * tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible" Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths mm/khugepaged: fix GUP-fast interaction by sending IPI mm/khugepaged: take the right locks for page table retraction mm: migrate: fix THP's mapcount on isolation mm: introduce arch_has_hw_nonleaf_pmd_young() mm: add dummy pmd_young() for architectures not having it mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes() tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep" nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing madvise: use zap_page_range_single for madvise dontneed mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE
2022-12-01RISC-V: Fix a race condition during kernel stack overflowPalmer Dabbelt
This fixes a concrete bug but is also the basis for some cleanup work, so I'm merging it based on the offending commit in order to minimize future conflicts. * commit '7e1864332fbc1b993659eab7974da9fe8bf8c128': riscv: fix race when vmap stack overflow