aboutsummaryrefslogtreecommitdiff
path: root/drivers/virtio
AgeCommit message (Collapse)Author
2023-11-02virtio_pci: fix the common cfg map sizeXuan Zhuo
commit 061b39fdfe7fd98946e67637213bcbb10a318cca upstream. The function vp_modern_map_capability() takes the size parameter, which corresponds to the size of virtio_pci_common_cfg. As a result, this indicates the size of memory area to map. Now the size is the size of virtio_pci_common_cfg, but some feature(such as the _F_RING_RESET) needs the virtio_pci_modern_common_cfg, so this commit changes the size to the size of virtio_pci_modern_common_cfg. Cc: stable@vger.kernel.org Fixes: 0b50cece0b78 ("virtio_pci: introduce helper to get/set queue reset") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20231010031120.81272-3-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-02virtio-mmio: fix memory leak of vm_devMaximilian Heyne
commit fab7f259227b8f70aa6d54e1de1a1f5f4729041c upstream. With the recent removal of vm_dev from devres its memory is only freed via the callback virtio_mmio_release_dev. However, this only takes effect after device_add is called by register_virtio_device. Until then it's an unmanaged resource and must be explicitly freed on error exit. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Cc: stable@vger.kernel.org Fixes: 55c91fedd03d ("virtio-mmio: don't break lifecycle of vm_dev") Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20230911090328.40538-1-mheyne@amazon.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2023-11-02virtio_balloon: Fix endless deflation and inflation on arm64Gavin Shan
commit 07622bd415639e9709579f400afd19e7e9866e5e upstream. The deflation request to the target, which isn't unaligned to the guest page size causes endless deflation and inflation actions. For example, we receive the flooding QMP events for the changes on memory balloon's size after a deflation request to the unaligned target is sent for the ARM64 guest, where we have 64KB base page size. /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \ -accel kvm -machine virt,gic-version=host -cpu host \ -smp maxcpus=8,cpus=8,sockets=2,clusters=2,cores=2,threads=1 \ -m 1024M,slots=16,maxmem=64G \ -object memory-backend-ram,id=mem0,size=512M \ -object memory-backend-ram,id=mem1,size=512M \ -numa node,nodeid=0,memdev=mem0,cpus=0-3 \ -numa node,nodeid=1,memdev=mem1,cpus=4-7 \ : \ -device virtio-balloon-pci,id=balloon0,bus=pcie.10 { "execute" : "balloon", "arguments": { "value" : 1073672192 } } {"return": {}} {"timestamp": {"seconds": 1693272173, "microseconds": 88667}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272174, "microseconds": 89704}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272175, "microseconds": 90819}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272176, "microseconds": 91961}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272177, "microseconds": 93040}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}} {"timestamp": {"seconds": 1693272178, "microseconds": 94117}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}} {"timestamp": {"seconds": 1693272179, "microseconds": 95337}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272180, "microseconds": 96615}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}} {"timestamp": {"seconds": 1693272181, "microseconds": 97626}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272182, "microseconds": 98693}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}} {"timestamp": {"seconds": 1693272183, "microseconds": 99698}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272184, "microseconds": 100727}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272185, "microseconds": 90430}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} {"timestamp": {"seconds": 1693272186, "microseconds": 102999}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}} : <The similar QMP events repeat> Fix it by aligning the target up to the guest page size, 64KB in this specific case. With this applied, no flooding QMP events are observed and the memory balloon's size can be stablizied to 0x3ffe0000 soon after the deflation request is sent. { "execute" : "balloon", "arguments": { "value" : 1073672192 } } {"return": {}} {"timestamp": {"seconds": 1693273328, "microseconds": 793075}, \ "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}} { "execute" : "query-balloon" } {"return": {"actual": 1073610752}} Cc: stable@vger.kernel.org Signed-off-by: Gavin Shan <gshan@redhat.com> Tested-by: Zhenyu Zhang <zhenyzha@redhat.com> Message-Id: <20230831011007.1032822-1-gshan@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13virtio_ring: fix avail_wrap_counter in virtqueue_add_packedYuan Yao
[ Upstream commit 1acfe2c1225899eab5ab724c91b7e1eb2881b9ab ] In current packed virtqueue implementation, the avail_wrap_counter won't flip, in the case when the driver supplies a descriptor chain with a length equals to the queue size; total_sg == vq->packed.vring.num. Let’s assume the following situation: vq->packed.vring.num=4 vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 0 Then the driver adds a descriptor chain containing 4 descriptors. We expect the following result with avail_wrap_counter flipped: vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 1 But, the current implementation gives the following result: vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 0 To reproduce the bug, you can set a packed queue size as small as possible, so that the driver is more likely to provide a descriptor chain with a length equal to the packed queue size. For example, in qemu run following commands: sudo qemu-system-x86_64 \ -enable-kvm \ -nographic \ -kernel "path/to/kernel_image" \ -m 1G \ -drive file="path/to/rootfs",if=none,id=disk \ -device virtio-blk,drive=disk \ -drive file="path/to/disk_image",if=none,id=rwdisk \ -device virtio-blk,drive=rwdisk,packed=on,queue-size=4,\ indirect_desc=off \ -append "console=ttyS0 root=/dev/vda rw init=/bin/bash" Inside the VM, create a directory and mount the rwdisk device on it. The rwdisk will hang and mount operation will not complete. This commit fixes the wrap counter error by flipping the packed.avail_wrap_counter, when start of descriptor chain equals to the end of descriptor chain (head == i). Fixes: 1ce9e6055fa0 ("virtio_ring: introduce packed ring support") Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org> Message-Id: <20230808051110.3492693-1-yuanyaogoog@chromium.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-23virtio-mmio: don't break lifecycle of vm_devWolfram Sang
[ Upstream commit 55c91fedd03d7b9cf0c5199b2eb12b9b8e95281a ] vm_dev has a separate lifecycle because it has a 'struct device' embedded. Thus, having a release callback for it is correct. Allocating the vm_dev struct with devres totally breaks this protection, though. Instead of waiting for the vm_dev release callback, the memory is freed when the platform_device is removed. Resulting in a use-after-free when finally the callback is to be called. To easily see the problem, compile the kernel with CONFIG_DEBUG_KOBJECT_RELEASE and unbind with sysfs. The fix is easy, don't use devres in this case. Found during my research about object lifetime problems. Fixes: 7eb781b1bbb7 ("virtio_mmio: add cleanup for virtio_mmio_probe") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Message-Id: <20230629120526.7184-1-wsa+renesas@sang-engineering.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11virtio_ring: don't update event idx on get_bufAlbert Huang
[ Upstream commit 6c0b057cec5eade4c3afec3908821176931a9997 ] In virtio_net, if we disable napi_tx, when we trigger a tx interrupt, the vq->event_triggered will be set to true. It is then never reset until we explicitly call virtqueue_enable_cb_delayed or virtqueue_enable_cb_prepare. If we disable the napi_tx, virtqueue_enable_cb* will only be called when the tx ring is getting relatively empty. Since event_triggered is true, VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE will not be set. As a result we update vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap every time we call virtqueue_get_buf_ctx. This causes more interrupts. To summarize: 1) event_triggered was set to true in vring_interrupt() 2) after this nothing will happen in virtqueue_disable_cb() so VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled and then it will publish a new event index To fix: update VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE in the vq when we call virtqueue_disable_cb even when event_triggered is true. Tested with iperf: iperf3 tcp stream: vm1 -----------------> vm2 vm2 just receives tcp data stream from vm1, and sends acks to vm1, there are many tx interrupts in vm2. with the patch applied there are just a few tx interrupts. v2->v3: -update the interrupt disable flag even with the event_triggered is set, -instead of checking whether event_triggered is set in -virtqueue_get_buf_ctx_{packed/split}, will cause the drivers which have -not called virtqueue_{enable/disable}_cb to miss notifications. v3->v4: -remove change for -"if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE)" -in virtqueue_disable_cb_packed Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb") Signed-off-by: Albert Huang <huangjie.albert@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230329102300.61000-1-huangjie.albert@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-24virtio_pci: modify ENOENT to EINVALAngus Chen
[ Upstream commit b66ead2d0ecac00c3a06a6218af5411cb5fcb5d5 ] Virtio_crypto use max_data_queues+1 to setup vqs, we use vp_modern_get_num_queues to protect the vq range in setup_vq. We could enter index >= vp_modern_get_num_queues(mdev) in setup_vq if common->num_queues is not set well,and it return -ENOENT. It is better to use -EINVAL instead. Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Message-Id: <20221101111655.1947-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-13virtio_pci: use irq to detect interrupt supportMichael S. Tsirkin
commit 71491c54eafa ("virtio_pci: don't try to use intxif pin is zero") breaks virtio_pci on powerpc, when running as a qemu guest. vp_find_vqs() bails out because pci_dev->pin == 0. But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would succeed if we called it - which is what the code used to do. This seems to happen because pci_dev->pin is not populated in pci_assign_irq(). A PCI core bug? Maybe. However Linus said: I really think that that is basically the only time you should use that 'pci_dev->pin' thing: it basically exists not for "does this device have an IRQ", but for "what is the routing of this irq on this device". and The correct way to check for "no irq" doesn't use NO_IRQ at all, it just does if (dev->irq) ... so let's just check irq and be done with it. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: 71491c54eafa ("virtio_pci: don't try to use intxif pin is zero") Cc: "Angus Chen" <angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221012220312.308522-1-mst@redhat.com>
2022-10-10Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-10-07virtio_pci: don't try to use intxif pin is zeroAngus Chen
The background is that we use dpu in cloud computing,the arch is x86,80 cores. We will have a lots of virtio devices,like 512 or more. When we probe about 200 virtio_blk devices,it will fail and the stack is printed as follows: [25338.485128] virtio-pci 0000:b3:00.0: virtio_pci: leaving for legacy driver [25338.496174] genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer) [25338.503822] CPU: 20 PID: 5431 Comm: kworker/20:0 Kdump: loaded Tainted: G OE --------- - - 4.18.0-305.30.1.el8.x86_64 [25338.516403] Hardware name: Inspur NF5280M5/YZMB-00882-10E, BIOS 4.1.21 08/25/2021 [25338.523881] Workqueue: events work_for_cpu_fn [25338.528235] Call Trace: [25338.530687] dump_stack+0x5c/0x80 [25338.534000] __setup_irq.cold.53+0x7c/0xd3 [25338.538098] request_threaded_irq+0xf5/0x160 [25338.542371] vp_find_vqs+0xc7/0x190 [25338.545866] init_vq+0x17c/0x2e0 [virtio_blk] [25338.550223] ? ncpus_cmp_func+0x10/0x10 [25338.554061] virtblk_probe+0xe6/0x8a0 [virtio_blk] [25338.558846] virtio_dev_probe+0x158/0x1f0 [25338.562861] really_probe+0x255/0x4a0 [25338.566524] ? __driver_attach_async_helper+0x90/0x90 [25338.571567] driver_probe_device+0x49/0xc0 [25338.575660] bus_for_each_drv+0x79/0xc0 [25338.579499] __device_attach+0xdc/0x160 [25338.583337] bus_probe_device+0x9d/0xb0 [25338.587167] device_add+0x418/0x780 [25338.590654] register_virtio_device+0x9e/0xe0 [25338.595011] virtio_pci_probe+0xb3/0x140 [25338.598941] local_pci_probe+0x41/0x90 [25338.602689] work_for_cpu_fn+0x16/0x20 [25338.606443] process_one_work+0x1a7/0x360 [25338.610456] ? create_worker+0x1a0/0x1a0 [25338.614381] worker_thread+0x1cf/0x390 [25338.618132] ? create_worker+0x1a0/0x1a0 [25338.622051] kthread+0x116/0x130 [25338.625283] ? kthread_flush_work_fn+0x10/0x10 [25338.629731] ret_from_fork+0x1f/0x40 [25338.633395] virtio_blk: probe of virtio418 failed with error -16 The log : "genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer)" was printed because of the irq 0 is used by timer exclusive,and when vp_find_vqs call vp_find_vqs_msix and returns false twice (for whatever reason), then it will call vp_find_vqs_intx as a fallback. Because vp_dev->pci_dev->irq is zero, we request irq 0 with flag IRQF_SHARED, and get a backtrace like above. According to PCI spec about "Interrupt Pin" Register (Offset 3Dh): "The Interrupt Pin register is a read-only register that identifies the legacy interrupt Message(s) the Function uses. Valid values are 01h, 02h, 03h, and 04h that map to legacy interrupt Messages for INTA, INTB, INTC, and INTD respectively. A value of 00h indicates that the Function uses no legacy interrupt Message(s)." So if vp_dev->pci_dev->pin is zero, we should not request legacy interrupt. Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220930000915.548-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07virtio_ring: make vring_alloc_queue_packed prettierDeming Wang
Add some spaces to vring_alloc_queue(make it look prettier). Signed-off-by: Deming Wang <wangdeming@inspur.com> Message-Id: <20220926183306.4535-1-wangdeming@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07virtio_ring: split: Operators use unified styleDeming Wang
The operators of vring_alloc_queue_split should use the unified style.Add space for the '|' ,make it be looked more pretty. Signed-off-by: Deming Wang <wangdeming@inspur.com> Message-Id: <20220926022202.1516-1-wangdeming@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-03virtio: kmsan: check/unpoison scatterlist in vring_map_one_sg()Alexander Potapenko
If vring doesn't use the DMA API, KMSAN is unable to tell whether the memory is initialized by hardware. Explicitly call kmsan_handle_dma() from vring_map_one_sg() in this case to prevent false positives. Link: https://lkml.kernel.org/r/20220915150417.722975-23-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-16virtio: kerneldocs fixes and enhancementsRicardo Cañuelo
Fix variable names in some kerneldocs, naming in others. Add kerneldocs for struct vring_desc and vring_interrupt. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Message-Id: <20220810094004.1250-2-ricardo.canuelo@collabora.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2022-08-16virtio: Revert "virtio: find_vqs() add arg sizes"Michael S. Tsirkin
This reverts commit a10fba0377145fccefea4dc4dd5915b7ed87e546: the proposed API isn't supported on all transports but no effort was made to address this. It might not be hard to fix if we want to: maybe just rename size to size_hint and make sure legacy transports ignore the hint. But it's not sure what the benefit is in any case, so let's drop it. Fixes: a10fba037714 ("virtio: find_vqs() add arg sizes") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-8-mst@redhat.com>
2022-08-16virtio_vdpa: Revert "virtio_vdpa: support the arg sizes of find_vqs()"Michael S. Tsirkin
This reverts commit 99e8927d8a4da8eb8a8a5904dc13a3156be8e7c0: proposed API isn't supported on all transports but no effort was made to address this. It might not be hard to fix if we want to: maybe just rename size to size_hint and make sure legacy transports ignore the hint. But it's not sure what the benefit is in any case, so let's drop it. Fixes: 99e8927d8a4d ("virtio_vdpa: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-6-mst@redhat.com>
2022-08-16virtio_pci: Revert "virtio_pci: support the arg sizes of find_vqs()"Michael S. Tsirkin
This reverts commit cdb44806fca2d0ad29ca644cbf1505433902ee0c: the legacy path is wrong and in fact can not support the proposed API since for a legacy device we never communicate the vq size to the hypervisor. Reported-by: Andres Freund <andres@anarazel.de> Fixes: cdb44806fca2 ("virtio_pci: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-5-mst@redhat.com>
2022-08-16virtio-mmio: Revert "virtio_mmio: support the arg sizes of find_vqs()"Michael S. Tsirkin
This reverts commit fbed86abba6e0472d98079790e58060e4332608a. The API is now unused, let's not carry dead code around. Fixes: fbed86abba6e ("virtio_mmio: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-4-mst@redhat.com>
2022-08-12Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - A huge patchset supporting vq resize using the new vq reset capability - Features, fixes, and cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (88 commits) vdpa/mlx5: Fix possible uninitialized return value vdpa_sim_blk: add support for discard and write-zeroes vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSH vdpa_sim_blk: make vdpasim_blk_check_range usable by other requests vdpa_sim_blk: check if sector is 0 for commands other than read or write vdpa_sim: Implement suspend vdpa op vhost-vdpa: uAPI to suspend the device vhost-vdpa: introduce SUSPEND backend feature bit vdpa: Add suspend operation virtio-blk: Avoid use-after-free on suspend/resume virtio_vdpa: support the arg sizes of find_vqs() vhost-vdpa: Call ida_simple_remove() when failed vDPA: fix 'cast to restricted le16' warnings in vdpa.c vDPA: !FEATURES_OK should not block querying device config space vDPA/ifcvf: support userspace to query features and MQ of a management device vDPA/ifcvf: get_config_size should return a value no greater than dev implementation vhost scsi: Allow user to control num virtqueues vhost-scsi: Fix max number of virtqueues vdpa/mlx5: Support different address spaces for control and data vdpa/mlx5: Implement susupend virtqueue callback ...
2022-08-11virtio_vdpa: support the arg sizes of find_vqs()Bo Liu
Virtio vdpa support the new parameter sizes of find_vqs(). Signed-off-by: Bo Liu <liubo03@inspur.com> Message-Id: <20220810085151.7251-1-liubo03@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio: Check dev_set_name() return valueBo Liu
It's possible that dev_set_name() returns -ENOMEM, catch and handle this. Signed-off-by: Bo Liu <liubo03@inspur.com> Message-Id: <20220707031751.4802-1-liubo03@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-08-11virtio_mmio: support the arg sizes of find_vqs()Xuan Zhuo
Virtio MMIO support the new parameter sizes of find_vqs(). Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-36-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_pci: support the arg sizes of find_vqs()Xuan Zhuo
Virtio PCI supports new parameter sizes of find_vqs(). Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-35-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio: find_vqs() add arg sizesXuan Zhuo
find_vqs() adds a new parameter sizes to specify the size of each vq vring. NULL as sizes means that all queues in find_vqs() use the maximum size. A value in the array is 0, which means that the corresponding queue uses the maximum size. In the split scenario, the meaning of size is the largest size, because it may be limited by memory, the virtio core will try a smaller size. And the size is power of 2. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-34-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_pci: support VIRTIO_F_RING_RESETXuan Zhuo
This patch implements virtio pci support for QUEUE RESET. Performing reset on a queue is divided into these steps: 1. notify the device to reset the queue 2. recycle the buffer submitted 3. reset the vring (may re-alloc) 4. mmap vring to device, and enable the queue This patch implements virtio_reset_vq(), virtio_enable_resetq() in the pci scenario. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-33-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_pci: extract the logic of active vq for modern pciXuan Zhuo
Introduce vp_active_vq() to configure vring to backend after vq attach vring. And configure vq vector if necessary. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-32-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_pci: introduce helper to get/set queue resetXuan Zhuo
Introduce new helpers to implement queue reset and get queue reset status. https://github.com/oasis-tcs/virtio-spec/issues/124 https://github.com/oasis-tcs/virtio-spec/issues/139 Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-31-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: struct virtqueue introduce resetXuan Zhuo
Introduce a new member reset to the structure virtqueue to determine whether the current vq is in the reset state. Subsequent patches will use it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-29-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio: allow to unbreak/break virtqueue individuallyXuan Zhuo
This patch allows the new introduced __virtqueue_break()/__virtqueue_unbreak() to break/unbreak the virtqueue. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-27-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: introduce virtqueue_resize()Xuan Zhuo
Introduce virtqueue_resize() to implement the resize of vring. Based on these, the driver can dynamically adjust the size of the vring. For example: ethtool -G. virtqueue_resize() implements resize based on the vq reset function. In case of failure to allocate a new vring, it will give up resize and use the original vring. During this process, if the re-enable reset vq fails, the vq can no longer be used. Although the probability of this situation is not high. The parameter recycle is used to recycle the buffer that is no longer used. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-25-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: packed: introduce virtqueue_resize_packed()Xuan Zhuo
virtio ring packed supports resize. Only after the new vring is successfully allocated based on the new num, we will release the old vring. In any case, an error is returned, indicating that the vring still points to the old vring. In the case of an error, re-initialize(by virtqueue_reinit_packed()) the virtqueue to ensure that the vring can be used. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-24-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: packed: introduce virtqueue_reinit_packed()Xuan Zhuo
Introduce a function to initialize vq without allocating new ring, desc_state, desc_extra. Subsequent patches will call this function after reset vq to reinitialize vq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-23-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: packed: extract the logic of attach vringXuan Zhuo
Separate the logic of attach vring, the subsequent patch will call it separately. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-22-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: packed: extract the logic of vring initXuan Zhuo
Separate the logic of initializing vring, and subsequent patches will call it separately. This function completes the variable initialization of packed vring. It together with the logic of atatch constitutes the initialization of vring. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-21-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: packed: extract the logic of alloc state and extraXuan Zhuo
Separate the logic for alloc desc_state and desc_extra, which will be called separately by subsequent patches. Use struct vring_packed to pass desc_state, desc_extra. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-20-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: packed: extract the logic of alloc queueXuan Zhuo
Separate the logic of packed to create vring queue. This feature is required for subsequent virtuqueue reset vring. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-19-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: packed: introduce vring_free_packedXuan Zhuo
Free the structure struct vring_vritqueue_packed. Subsequent patches require it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-18-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: introduce virtqueue_resize_split()Xuan Zhuo
virtio ring split supports resize. Only after the new vring is successfully allocated based on the new num, we will release the old vring. In any case, an error is returned, indicating that the vring still points to the old vring. In the case of an error, re-initialize(virtqueue_reinit_split()) the virtqueue to ensure that the vring can be used. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-17-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: reserve vring_align, may_reduce_numXuan Zhuo
In vring_alloc_queue_split() save vring_align, may_reduce_num to structure vring_virtqueue_split. Used to create a new vring when implementing resize. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-16-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: introduce virtqueue_reinit_split()Xuan Zhuo
Introduce a function to initialize vq without allocating new ring, desc_state, desc_extra. Subsequent patches will call this function after reset vq to reinitialize vq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-15-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: extract the logic of attach vringXuan Zhuo
Separate the logic of attach vring, subsequent patches will call it separately. virtqueue_vring_init_split() completes the initialization of other variables of vring split. We can directly use vq->split = *vring_split to complete attach. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-14-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: extract the logic of vring initXuan Zhuo
Separate the logic of initializing vring, and subsequent patches will call it separately. This function completes the variable initialization of split vring. It together with the logic of atatch constitutes the initialization of vring. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-13-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: extract the logic of alloc state and extraXuan Zhuo
Separate the logic of creating desc_state, desc_extra, and subsequent patches will call it independently. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-12-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: extract the logic of alloc queueXuan Zhuo
Separate the logic of split to create vring queue. This feature is required for subsequent virtuqueue reset vring. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-11-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: introduce vring_free_split()Xuan Zhuo
Free the structure struct vring_vritqueue_split. Subsequent patches require it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-10-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: __vring_new_virtqueue() accept struct vring_virtqueue_splitXuan Zhuo
__vring_new_virtqueue() instead accepts struct vring_virtqueue_split. The purpose of this is to pass more information into __vring_new_virtqueue() to make the code simpler and the structure cleaner. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-9-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split: stop __vring_new_virtqueue as export symbolXuan Zhuo
There is currently only one place to reference __vring_new_virtqueue() directly from the outside of virtio core. And here vring_new_virtqueue() can be used instead. Subsequent patches will modify __vring_new_virtqueue, so stop it as an export symbol for now. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-8-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: introduce virtqueue_init()Xuan Zhuo
Separate the logic of virtqueue initialization. These variables should be reset during reset. This logic can be called independently when implementing resize/reset later. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-7-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: split vring_virtqueueXuan Zhuo
Separate the two inline structures(split and packed) from the structure vring_virtqueue. In this way, we can use these two structures later to pass parameters and retain temporary variables. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-6-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11virtio_ring: extract the logic of freeing vringXuan Zhuo
Introduce vring_free() to free the vring of vq. Subsequent patches will use vring_free() alone. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220801063902.129329-5-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>