aboutsummaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2023-09-19scsi: qla2xxx: Flush mailbox commands on chip resetQuinn Tran
commit 6d0b65569c0a10b27c49bacd8d25bcd406003533 upstream. Fix race condition between Interrupt thread and Chip reset thread in trying to flush the same mailbox. With the race condition, the "ha->mbx_intr_comp" will get an extra complete() call. The extra complete call create erroneous mailbox timeout condition when the next mailbox is sent where the mailbox call does not wait for interrupt to arrive. Instead, it advances without waiting. Add lock protection around the check for mailbox completion. Cc: stable@vger.kernel.org Fixes: b2000805a975 ("scsi: qla2xxx: Flush mailbox commands on chip reset") Signed-off-by: Quinn Tran <quinn.tran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Remove unsupported ql2xenabledif optionManish Rangankar
commit e9105c4b7a9208a21a9bda133707624f12ddabc2 upstream. User accidently passed module parameter ql2xenabledif=1 which is unsupported. However, driver still initialized which lead to guard tag errors during device discovery. Remove unsupported ql2xenabledif=1 option and validate the user input. Cc: stable@vger.kernel.org Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Fix TMF leak throughQuinn Tran
commit 5d3148d8e8b05f084e607ac3bd55a4c317a9f934 upstream. Task management can retry up to 5 times when FW resource becomes bottle neck. Between the retries, there is a short sleep. Current code assumes the chip has not reset or session has not changed. Check for chip reset or session change before sending Task management. Cc: stable@vger.kernel.org Fixes: 9803fb5d2759 ("scsi: qla2xxx: Fix task management cmd failure") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Fix session hang in gnlQuinn Tran
commit 39d22740712c7563a2e18c08f033deeacdaf66e7 upstream. Connection does not resume after a host reset / chip reset. The cause of the blockage is due to the FCF_ASYNC_ACTIVE left on. The gnl command was interrupted by the chip reset. On exiting the command, this flag should be turn off to allow relogin to reoccur. Clear this flag to prevent blockage. Cc: stable@vger.kernel.org Fixes: 17e64648aa47 ("scsi: qla2xxx: Correct fcport flags handling") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Turn off noisy message logQuinn Tran
commit 8ebaa45163a3fedc885c1dc7d43ea987a2f00a06 upstream. Some consider noisy log as test failure. Turn off noisy message log. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Fix erroneous link up failureQuinn Tran
commit 5b51f35d127e7bef55fa869d2465e2bca4636454 upstream. Link up failure occurred where driver failed to see certain events from FW indicating link up (AEN 8011) and fabric login completion (AEN 8014). Without these 2 events, driver would not proceed forward to scan the fabric. The cause of this is due to delay in the receive of interrupt for Mailbox 60 that causes qla to set the fw_started flag late. The late setting of this flag causes other interrupts to be dropped. These dropped interrupts happen to be the link up (AEN 8011) and fabric login completion (AEN 8014). Set fw_started flag early to prevent interrupts being dropped. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Fix command flush during TMFQuinn Tran
commit da7c21b72aa86e990af5f73bce6590b8d8d148d0 upstream. For each TMF request, driver iterates through each qpair and flushes commands associated to the TMF. At the end of the qpair flush, a Marker is used to complete the flush transaction. This process was repeated for each qpair. The multiple flush and marker for this TMF request seems to cause confusion for FW. Instead, 1 flush is sent to FW. Driver would wait for FW to go through all the I/Os on each qpair to be read then return. Driver then closes out the transaction with a Marker. Cc: stable@vger.kernel.org Fixes: d90171dd0da5 ("scsi: qla2xxx: Multi-que support for TMF") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: fix inconsistent TMF timeoutQuinn Tran
commit 009e7fe4a1ed52276b332842a6b6e23b07200f2d upstream. Different behavior were experienced of session being torn down vs not when TMF is timed out. When FW detects the time out, the session is torn down. When driver detects the time out, the session is not torn down. Allow TMF error to return to upper layer without session tear down. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Fix deletion race conditionQuinn Tran
commit 6dfe4344c168c6ca20fe7640649aacfcefcccb26 upstream. System crash when using debug kernel due to link list corruption. The cause of the link list corruption is due to session deletion was allowed to queue up twice. Here's the internal trace that show the same port was allowed to double queue for deletion on different cpu. 20808683956 015 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1 20808683957 027 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1 Move the clearing/setting of deleted flag lock. Cc: stable@vger.kernel.org Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Limit TMF to 8 per functionQuinn Tran
commit a8ec192427e0516436e61f9ca9eb49c54eadfe0a upstream. Per FW recommendation, 8 TMF's can be outstanding for each function. Previously, it allowed 8 per target. Limit TMF to 8 per function. Cc: stable@vger.kernel.org Fixes: 6a87679626b5 ("scsi: qla2xxx: Fix task management cmd fail due to unavailable resource") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19scsi: qla2xxx: Adjust IOCB resource on qpair createQuinn Tran
commit efa74a62aaa2429c04fe6cb277b3bf6739747d86 upstream. During NVMe queue creation, a new qpair is created. FW resource limit needs to be re-adjusted to take into account the new qpair. Otherwise, NVMe command can not go through. This issue was discovered while testing/forcing FW execution to fail at load time. Add call to readjust IOCB and exchange limit. In addition, get FW state command and require FW to be running. Otherwise, error is generated. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19drm/virtio: Conditionally allocate virtio_gpu_fenceGurchetan Singh
commit 70d1ace56db6c79d39dbe9c0d5244452b67e2fde upstream. We don't want to create a fence for every command submission. It's only necessary when userspace provides a waitable token for submission. This could be: 1) bo_handles, to be used with VIRTGPU_WAIT 2) out_fence_fd, to be used with dma_fence apis 3) a ring_idx provided with VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK + DRM event API 4) syncobjs in the future The use case for just submitting a command to the host, and expecting no response. For example, gfxstream has GFXSTREAM_CONTEXT_PING that just wakes up the host side worker threads. There's also CROSS_DOMAIN_CMD_SEND which just sends data to the Wayland server. This prevents the need to signal the automatically created virtio_gpu_fence. In addition, VIRTGPU_EXECBUF_RING_IDX is checked when creating a DRM event object. VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK is already defined in terms of per-context rings. It was theoretically possible to create a DRM event on the global timeline (ring_idx == 0), if the context enabled DRM event polling. However, that wouldn't work and userspace (Sommelier). Explicitly disallow it for clarity. Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> # edited coding style Link: https://patchwork.freedesktop.org/patch/msgid/20230707213124.494-1-gurchetansingh@chromium.org Signed-off-by: Alyssa Ross <hi@alyssa.is> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13treewide: Fix probing of devices in DT overlaysGeert Uytterhoeven
commit 1a50d9403fb90cbe4dea0ec9fd0351d2ecbd8924 upstream. When loading a DT overlay that creates a device, the device is not probed, unless the DT overlay is unloaded and reloaded again. After the recent refactoring to improve fw_devlink, it no longer depends on the "compatible" property to identify which device tree nodes will become struct devices. fw_devlink now picks up dangling consumers (consumers pointing to descendent device tree nodes of a device that aren't converted to child devices) when a device is successfully bound to a driver. See __fw_devlink_pickup_dangling_consumers(). However, during DT overlay, a device's device tree node can have sub-nodes added/removed without unbinding/rebinding the driver. This difference in behavior between the normal device instantiation and probing flow vs. the DT overlay flow has a bunch of implications that are pointed out elsewhere[1]. One of them is that the fw_devlink logic to pick up dangling consumers is never exercised. This patch solves the fw_devlink issue by marking all DT nodes added by DT overlays with FWNODE_FLAG_NOT_DEVICE (fwnode that won't become device), and by clearing the flag when a struct device is actually created for the DT node. This way, fw_devlink knows not to have consumers waiting on these newly added DT nodes, and to propagate the dependency to an ancestor DT node that has the corresponding struct device. Based on a patch by Saravana Kannan, which covered only platform and spi devices. [1] https://lore.kernel.org/r/CAGETcx_bkuFaLCiPrAWCPQz+w79ccDp6=9e881qmK=vx3hBMyg@mail.gmail.com Fixes: 4a032827daa89350 ("of: property: Simplify of_link_to_phandle()") Link: https://lore.kernel.org/r/CAGETcx_+rhHvaC_HJXGrr5_WAd2+k5f=rWYnkCZ6z5bGX-wj4w@mail.gmail.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Acked-by: Shawn Guo <shawnguo@kernel.org> Acked-by: Saravana Kannan <saravanak@google.com> Tested-by: Ivan Bornyakov <i.bornyakov@metrotek.ru> Link: https://lore.kernel.org/r/e1fa546682ea4c8474ff997ab6244c5e11b6f8bc.1680182615.git.geert+renesas@glider.be Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13md: fix regression for null-ptr-deference in __md_stop()Yu Kuai
commit 433279beba1d4872da10b7b60a539e0cb828b32b upstream. Commit 3e453522593d ("md: Free resources in __md_stop") tried to fix null-ptr-deference for 'active_io' by moving percpu_ref_exit() to __md_stop(), however, the commit also moving 'writes_pending' to __md_stop(), and this will cause mdadm tests broken: BUG: kernel NULL pointer dereference, address: 0000000000000038 Oops: 0000 [#1] PREEMPT SMP CPU: 15 PID: 17830 Comm: mdadm Not tainted 6.3.0-rc3-next-20230324-00009-g520d37 RIP: 0010:free_percpu+0x465/0x670 Call Trace: <TASK> __percpu_ref_exit+0x48/0x70 percpu_ref_exit+0x1a/0x90 __md_stop+0xe9/0x170 do_md_stop+0x1e1/0x7b0 md_ioctl+0x90c/0x1aa0 blkdev_ioctl+0x19b/0x400 vfs_ioctl+0x20/0x50 __x64_sys_ioctl+0xba/0xe0 do_syscall_64+0x6c/0xe0 entry_SYSCALL_64_after_hwframe+0x63/0xcd And the problem can be reporduced 100% by following test: mdadm -CR /dev/md0 -l1 -n1 /dev/sda --force echo inactive > /sys/block/md0/md/array_state echo read-auto > /sys/block/md0/md/array_state echo inactive > /sys/block/md0/md/array_state Root cause: // start raid raid1_run mddev_init_writes_pending percpu_ref_init // inactive raid array_state_store do_md_stop __md_stop percpu_ref_exit // start raid again array_state_store do_md_run raid1_run mddev_init_writes_pending if (mddev->writes_pending.percpu_count_ptr) // won't reinit // inactive raid again ... percpu_ref_exit -> null-ptr-deference Before the commit, 'writes_pending' is exited when mddev is freed, and it's safe to restart raid because mddev_init_writes_pending() already make sure that 'writes_pending' will only be initialized once. Fix the prblem by moving 'writes_pending' back, it's a litter hard to find the relationship between alloc memory and free memory, however, code changes is much less and we lived with this for a long time already. Fixes: 3e453522593d ("md: Free resources in __md_stop") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230328094400.1448955-1-yukuai1@huaweicloud.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13md: Free resources in __md_stopXiao Ni
commit 3e453522593d74a87cf68a38e14aa36ebca1dbcd upstream. If md_run() fails after ->active_io is initialized, then percpu_ref_exit is called in error path. However, later md_free_disk will call percpu_ref_exit again which leads to a panic because of null pointer dereference. It can also trigger this bug when resources are initialized but are freed in error path, then will be freed again in md_free_disk. BUG: kernel NULL pointer dereference, address: 0000000000000038 Oops: 0000 [#1] PREEMPT SMP Workqueue: md_misc mddev_delayed_delete RIP: 0010:free_percpu+0x110/0x630 Call Trace: <TASK> __percpu_ref_exit+0x44/0x70 percpu_ref_exit+0x16/0x90 md_free_disk+0x2f/0x80 disk_release+0x101/0x180 device_release+0x84/0x110 kobject_put+0x12a/0x380 kobject_put+0x160/0x380 mddev_delayed_delete+0x19/0x30 process_one_work+0x269/0x680 worker_thread+0x266/0x640 kthread+0x151/0x1b0 ret_from_fork+0x1f/0x30 For creating raid device, md raid calls do_md_run->md_run, dm raid calls md_run. We alloc those memory in md_run. For stopping raid device, md raid calls do_md_stop->__md_stop, dm raid calls md_stop->__md_stop. So we can free those memory resources in __md_stop. Fixes: 72adae23a72c ("md: Change active_io to percpu") Reported-and-tested-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13Revert "drm/amd/display: Do not set drr on pipe commit"Michel Dänzer
commit 360930985ec9f394c82ba0b235403b4a366d1560 upstream. This reverts commit e101bf95ea87ccc03ac2f48dfc0757c6364ff3c7. Caused a regression: Samsung Odyssey Neo G9, running at 5120x1440@240/VRR, connected to Navi 21 via DisplayPort, blanks and the GPU hangs while starting the Steam game Assetto Corsa Competizione (via Proton 7.0). Example dmesg excerpt: amdgpu 0000:0c:00.0: [drm] ERROR [CRTC:82:crtc-0] flip_done timed out NMI watchdog: Watchdog detected hard LOCKUP on cpu 6 [...] RIP: 0010:amdgpu_device_rreg.part.0+0x2f/0xf0 [amdgpu] Code: 41 54 44 8d 24 b5 00 00 00 00 55 89 f5 53 48 89 fb 4c 3b a7 60 0b 00 00 73 6a 83 e2 02 74 29 4c 03 a3 68 0b 00 00 45 8b 24 24 <48> 8b 43 08 0f b7 70 3e 66 90 44 89 e0 5b 5d 41 5c 31 d2 31 c9 31 RSP: 0000:ffffb39a119dfb88 EFLAGS: 00000086 RAX: ffffffffc0eb96a0 RBX: ffff9e7963dc0000 RCX: 0000000000007fff RDX: 0000000000000000 RSI: 0000000000004ff6 RDI: ffff9e7963dc0000 RBP: 0000000000004ff6 R08: ffffb39a119dfc40 R09: 0000000000000010 R10: ffffb39a119dfc40 R11: ffffb39a119dfc44 R12: 00000000000e05ae R13: 0000000000000000 R14: ffff9e7963dc0010 R15: 0000000000000000 FS: 000000001012f6c0(0000) GS:ffff9e805eb80000(0000) knlGS:000000007fd40000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000461ca000 CR3: 00000002a8a20000 CR4: 0000000000350ee0 Call Trace: <TASK> dm_read_reg_func+0x37/0xc0 [amdgpu] generic_reg_get2+0x22/0x60 [amdgpu] optc1_get_crtc_scanoutpos+0x6a/0xc0 [amdgpu] dc_stream_get_scanoutpos+0x74/0x90 [amdgpu] dm_crtc_get_scanoutpos+0x82/0xf0 [amdgpu] amdgpu_display_get_crtc_scanoutpos+0x91/0x190 [amdgpu] ? dm_read_reg_func+0x37/0xc0 [amdgpu] amdgpu_get_vblank_counter_kms+0xb4/0x1a0 [amdgpu] dm_pflip_high_irq+0x213/0x2f0 [amdgpu] amdgpu_dm_irq_handler+0x8a/0x200 [amdgpu] amdgpu_irq_dispatch+0xd4/0x220 [amdgpu] amdgpu_ih_process+0x7f/0x110 [amdgpu] amdgpu_irq_handler+0x1f/0x70 [amdgpu] __handle_irq_event_percpu+0x46/0x1b0 handle_irq_event+0x34/0x80 handle_edge_irq+0x9f/0x240 __common_interrupt+0x66/0x110 common_interrupt+0x5c/0xd0 asm_common_interrupt+0x22/0x40 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13serial: sc16is7xx: fix regression with GPIO configurationHugo Villeneuve
[ Upstream commit 0499942928341d572a42199580433c2b0725211e ] Commit 679875d1d880 ("sc16is7xx: Separate GPIOs from modem control lines") and commit 21144bab4f11 ("sc16is7xx: Handle modem status lines") changed the function of the GPIOs pins to act as modem control lines without any possibility of selecting GPIO function. As a consequence, applications that depends on GPIO lines configured by default as GPIO pins no longer work as expected. Also, the change to select modem control lines function was done only for channel A of dual UART variants (752/762). This was not documented in the log message. Allow to specify GPIO or modem control line function in the device tree, and for each of the ports (A or B). Do so by using the new device-tree property named "nxp,modem-control-line-ports" (property added in separate patch). When registering GPIO chip controller, mask-out GPIO pins declared as modem control lines according to this new DT property. Fixes: 679875d1d880 ("sc16is7xx: Separate GPIOs from modem control lines") Fixes: 21144bab4f11 ("sc16is7xx: Handle modem status lines") Cc: stable@vger.kernel.org Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Lech Perczak <lech.perczak@camlingroup.com> Tested-by: Lech Perczak <lech.perczak@camlingroup.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230807214556.540627-5-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13serial: sc16is7xx: remove obsolete out_thread labelHugo Villeneuve
[ Upstream commit dabc54a45711fe77674a6c0348231e00e66bd567 ] Commit c8f71b49ee4d ("serial: sc16is7xx: setup GPIO controller later in probe") moved GPIO setup code later in probe function. Doing so also required to move ports cleanup code (out_ports label) after the GPIO cleanup code. After these moves, the out_thread label becomes misplaced and makes part of the cleanup code illogical. This patch remove the now obsolete out_thread label and make GPIO setup code jump to out_ports label if it fails. Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Lech Perczak <lech.perczak@camlingroup.com> Tested-by: Lech Perczak <lech.perczak@camlingroup.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20230807214556.540627-3-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 049994292834 ("serial: sc16is7xx: fix regression with GPIO configuration") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13USB: core: Fix oversight in SuperSpeed initializationAlan Stern
commit 59cf445754566984fd55af19ba7146c76e6627bc upstream. Commit 85d07c556216 ("USB: core: Unite old scheme and new scheme descriptor reads") altered the way USB devices are enumerated following detection, and in the process it messed up the initialization of SuperSpeed (or faster) devices: [ 31.650759] usb 2-1: new SuperSpeed Plus Gen 2x1 USB device number 2 using xhci_hcd [ 31.663107] usb 2-1: device descriptor read/8, error -71 [ 31.952697] usb 2-1: new SuperSpeed Plus Gen 2x1 USB device number 3 using xhci_hcd [ 31.965122] usb 2-1: device descriptor read/8, error -71 [ 32.080991] usb usb2-port1: attempt power cycle ... The problem was caused by the commit forgetting that in SuperSpeed or faster devices, the device descriptor uses a logarithmic encoding of the bMaxPacketSize0 value. (For some reason I thought the 255 case in the switch statement was meant for these devices, but it isn't -- it was meant for Wireless USB and is no longer needed.) We can fix the oversight by testing for buf->bMaxPacketSize0 = 9 (meaning 512, the actual maxpacket size for ep0 on all SuperSpeed devices) and straightening out the logic that checks and adjusts our initial guesses of the maxpacket value. Reported-and-tested-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Closes: https://lore.kernel.org/linux-usb/20230810002257.nadxmfmrobkaxgnz@synopsys.com/ Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Fixes: 85d07c556216 ("USB: core: Unite old scheme and new scheme descriptor reads") Link: https://lore.kernel.org/r/8809e6c5-59d5-4d2d-ac8f-6d106658ad73@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13USB: core: Fix race by not overwriting udev->descriptor in hub_port_init()Alan Stern
commit ff33299ec8bb80cdcc073ad9c506bd79bb2ed20b upstream. Syzbot reported an out-of-bounds read in sysfs.c:read_descriptors(): BUG: KASAN: slab-out-of-bounds in read_descriptors+0x263/0x280 drivers/usb/core/sysfs.c:883 Read of size 8 at addr ffff88801e78b8c8 by task udevd/5011 CPU: 0 PID: 5011 Comm: udevd Not tainted 6.4.0-rc6-syzkaller-00195-g40f71e7cd3c6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:351 print_report mm/kasan/report.c:462 [inline] kasan_report+0x11c/0x130 mm/kasan/report.c:572 read_descriptors+0x263/0x280 drivers/usb/core/sysfs.c:883 ... Allocated by task 758: ... __do_kmalloc_node mm/slab_common.c:966 [inline] __kmalloc+0x5e/0x190 mm/slab_common.c:979 kmalloc include/linux/slab.h:563 [inline] kzalloc include/linux/slab.h:680 [inline] usb_get_configuration+0x1f7/0x5170 drivers/usb/core/config.c:887 usb_enumerate_device drivers/usb/core/hub.c:2407 [inline] usb_new_device+0x12b0/0x19d0 drivers/usb/core/hub.c:2545 As analyzed by Khazhy Kumykov, the cause of this bug is a race between read_descriptors() and hub_port_init(): The first routine uses a field in udev->descriptor, not expecting it to change, while the second overwrites it. Prior to commit 45bf39f8df7f ("USB: core: Don't hold device lock while reading the "descriptors" sysfs file") this race couldn't occur, because the routines were mutually exclusive thanks to the device locking. Removing that locking from read_descriptors() exposed it to the race. The best way to fix the bug is to keep hub_port_init() from changing udev->descriptor once udev has been initialized and registered. Drivers expect the descriptors stored in the kernel to be immutable; we should not undermine this expectation. In fact, this change should have been made long ago. So now hub_port_init() will take an additional argument, specifying a buffer in which to store the device descriptor it reads. (If udev has not yet been initialized, the buffer pointer will be NULL and then hub_port_init() will store the device descriptor in udev as before.) This eliminates the data race responsible for the out-of-bounds read. The changes to hub_port_init() appear more extensive than they really are, because of indentation changes resulting from an attempt to avoid writing to other parts of the usb_device structure after it has been initialized. Similar changes should be made to the code that reads the BOS descriptor, but that can be handled in a separate patch later on. This patch is sufficient to fix the bug found by syzbot. Reported-and-tested-by: syzbot+18996170f8096c6174d0@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/000000000000c0ffe505fe86c9ca@google.com/#r Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: Khazhy Kumykov <khazhy@google.com> Fixes: 45bf39f8df7f ("USB: core: Don't hold device lock while reading the "descriptors" sysfs file") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/b958b47a-9a46-4c22-a9f9-e42e42c31251@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13USB: core: Change usb_get_device_descriptor() APIAlan Stern
commit de28e469da75359a2bb8cd8778b78aa64b1be1f4 upstream. The usb_get_device_descriptor() routine reads the device descriptor from the udev device and stores it directly in udev->descriptor. This interface is error prone, because the USB subsystem expects in-memory copies of a device's descriptors to be immutable once the device has been initialized. The interface is changed so that the device descriptor is left in a kmalloc-ed buffer, not copied into the usb_device structure. A pointer to the buffer is returned to the caller, who is then responsible for kfree-ing it. The corresponding changes needed in the various callers are fairly small. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/d0111bb6-56c1-4f90-adf2-6cfe152f6561@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13USB: core: Unite old scheme and new scheme descriptor readsAlan Stern
commit 85d07c55621676d47d873d2749b88f783cd4d5a1 upstream. In preparation for reworking the usb_get_device_descriptor() routine, it is desirable to unite the two different code paths responsible for initially determining endpoint 0's maximum packet size in a newly discovered USB device. Making this determination presents a chicken-and-egg sort of problem, in that the only way to learn the maxpacket value is to get it from the device descriptor retrieved from the device, but communicating with the device to retrieve a descriptor requires us to know beforehand the ep0 maxpacket size. In practice this problem is solved in two different ways, referred to in hub.c as the "old scheme" and the "new scheme". The old scheme (which is the approach recommended by the USB-2 spec) involves asking the device to send just the first eight bytes of its device descriptor. Such a transfer uses packets containing no more than eight bytes each, and every USB device must have an ep0 maxpacket size >= 8, so this should succeed. Since the bMaxPacketSize0 field of the device descriptor lies within the first eight bytes, this is all we need. The new scheme is an imitation of the technique used in an early Windows USB implementation, giving it the happy advantage of working with a wide variety of devices (some of them at the time would not work with the old scheme, although that's probably less true now). It involves making an initial guess of the ep0 maxpacket size, asking the device to send up to 64 bytes worth of its device descriptor (which is only 18 bytes long), and then resetting the device to clear any error condition that might have resulted from the guess being wrong. The initial guess is determined by the connection speed; it should be correct in all cases other than full speed, for which the allowed values are 8, 16, 32, and 64 (in this case the initial guess is 64). The reason for this patch is that the old- and new-scheme parts of hub_port_init() use different code paths, one involving usb_get_device_descriptor() and one not, for their initial reads of the device descriptor. Since these reads have essentially the same purpose and are made under essentially the same circumstances, this is illogical. It makes more sense to have both of them use a common subroutine. This subroutine does basically what the new scheme's code did, because that approach is more general than the one used by the old scheme. It only needs to know how many bytes to transfer and whether or not it is being called for the first iteration of a retry loop (in case of certain time-out errors). There are two main differences from the former code: We initialize the bDescriptorType field of the transfer buffer to 0 before performing the transfer, to avoid possibly accessing an uninitialized value afterward. We read the device descriptor into a temporary buffer rather than storing it directly into udev->descriptor, which the old scheme implementation used to do. Since the whole point of this first read of the device descriptor is to determine the bMaxPacketSize0 value, that is what the new routine returns (or an error code). The value is stored in a local variable rather than in udev->descriptor. As a side effect, this necessitates moving a section of code that checks the bcdUSB field for SuperSpeed devices until after the full device descriptor has been retrieved. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: Oliver Neukum <oneukum@suse.com> Link: https://lore.kernel.org/r/495cb5d4-f956-4f4a-a875-1e67e9489510@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13usb: typec: bus: verify partner exists in typec_altmode_attentionRD Babiera
commit f23643306430f86e2f413ee2b986e0773e79da31 upstream. Some usb hubs will negotiate DisplayPort Alt mode with the device but will then negotiate a data role swap after entering the alt mode. The data role swap causes the device to unregister all alt modes, however the usb hub will still send Attention messages even after failing to reregister the Alt Mode. type_altmode_attention currently does not verify whether or not a device's altmode partner exists, which results in a NULL pointer error when dereferencing the typec_altmode and typec_altmode_ops belonging to the altmode partner. Verify the presence of a device's altmode partner before sending the Attention message to the Alt Mode driver. Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230814180559.923475-1-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13usb: typec: tcpm: set initial svdm version based on pd revisionRD Babiera
commit c97cd0b4b54eb42aed7f6c3c295a2d137f6d2416 upstream. When sending Discover Identity messages to a Port Partner that uses Power Delivery v2 and SVDM v1, we currently send PD v2 messages with SVDM v2.0, expecting the port partner to respond with its highest supported SVDM version as stated in Section 6.4.4.2.3 in the Power Delivery v3 specification. However, sending SVDM v2 to some Power Delivery v2 port partners results in a NAK whereas sending SVDM v1 does not. NAK messages can be handled by the initiator (PD v3 section 6.4.4.2.5.1), and one solution could be to resend Discover Identity on a lower SVDM version if possible. But, Section 6.4.4.3 of PD v2 states that "A NAK response Should be taken as an indication not to retry that particular Command." Instead, we can set the SVDM version to the maximum one supported by the negotiated PD revision. When operating in PD v2, this obeys Section 6.4.4.2.3, which states the SVDM field "Shall be set to zero to indicate Version 1.0." In PD v3, the SVDM field "Shall be set to 01b to indicate Version 2.0." Fixes: c34e85fa69b9 ("usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20230731165926.1815338-1-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13of: property: fw_devlink: Add a devlink for panel followersDouglas Anderson
commit fbf0ea2da3c7cd0b33ed7ae53a67ab1c24838cba upstream. Inform fw_devlink of the fact that a panel follower (like a touchscreen) is effectively a consumer of the panel from the purposes of fw_devlink. NOTE: this patch isn't required for correctness but instead optimizes probe order / helps avoid deferrals. Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230727101636.v4.4.Ibf8e1342b5b7906279db2365aca45e6253857bb3@changeid Cc: Adam Ford <aford173@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13cpufreq: brcmstb-avs-cpufreq: Fix -Warray-bounds bugGustavo A. R. Silva
commit e520d0b6be950ce3738cf4b9bd3b392be818f1dc upstream. Allocate extra space for terminating element at: drivers/cpufreq/brcmstb-avs-cpufreq.c: 449 table[i].frequency = CPUFREQ_TABLE_END; and add code comment to make this clear. This fixes the following -Warray-bounds warning seen after building ARM with multi_v7_defconfig (GCC 13): In function 'brcm_avs_get_freq_table', inlined from 'brcm_avs_cpufreq_init' at drivers/cpufreq/brcmstb-avs-cpufreq.c:623:15: drivers/cpufreq/brcmstb-avs-cpufreq.c:449:28: warning: array subscript 5 is outside array bounds of 'void[60]' [-Warray-bounds=] 449 | table[i].frequency = CPUFREQ_TABLE_END; In file included from include/linux/node.h:18, from include/linux/cpu.h:17, from include/linux/cpufreq.h:12, from drivers/cpufreq/brcmstb-avs-cpufreq.c:44: In function 'devm_kmalloc_array', inlined from 'devm_kcalloc' at include/linux/device.h:328:9, inlined from 'brcm_avs_get_freq_table' at drivers/cpufreq/brcmstb-avs-cpufreq.c:437:10, inlined from 'brcm_avs_cpufreq_init' at drivers/cpufreq/brcmstb-avs-cpufreq.c:623:15: include/linux/device.h:323:16: note: at offset 60 into object of size 60 allocated by 'devm_kmalloc' 323 | return devm_kmalloc(dev, bytes, flags); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -Warray-bounds. Link: https://github.com/KSPP/linux/issues/324 Fixes: de322e085995 ("cpufreq: brcmstb-avs-cpufreq: AVS CPUfreq driver for Broadcom STB SoCs") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13crypto: stm32 - fix loop iterating through scatterlist for DMAThomas Bourgoin
commit d9c83f71eeceed2cb54bb78be84f2d4055fd9a1f upstream. We were reading the length of the scatterlist sg after copying value of tsg inside. So we are using the size of the previous scatterlist and for the first one we are using an unitialised value. Fix this by copying tsg in sg[0] before reading the size. Fixes : 8a1012d3f2ab ("crypto: stm32 - Support for STM32 HASH module") Cc: stable@vger.kernel.org Signed-off-by: Thomas Bourgoin <thomas.bourgoin@foss.st.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13s390/dasd: fix string length handlingHeiko Carstens
commit f7cf22424665043787a96a66a048ff6b2cfd473c upstream. Building dasd_eckd.o with latest clang reveals this bug: CC drivers/s390/block/dasd_eckd.o drivers/s390/block/dasd_eckd.c:1082:3: warning: 'snprintf' will always be truncated; specified size is 1, but format string expands to at least 11 [-Wfortify-source] 1082 | snprintf(print_uid, sizeof(*print_uid), | ^ drivers/s390/block/dasd_eckd.c:1087:3: warning: 'snprintf' will always be truncated; specified size is 1, but format string expands to at least 10 [-Wfortify-source] 1087 | snprintf(print_uid, sizeof(*print_uid), | ^ Fix this by moving and using the existing UID_STRLEN for the arrays that are being written to. Also rename UID_STRLEN to DASD_UID_STRLEN to clarify its scope. Fixes: 23596961b437 ("s390/dasd: split up dasd_eckd_read_conf") Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> # build Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://github.com/ClangBuiltLinux/linux/issues/1923 Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20230828153142.2843753-2-hca@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13s390/dcssblk: fix kernel crash with list_add corruptionGerald Schaefer
commit c8f40a0bccefd613748d080147469a4652d6e74c upstream. Commit fb08a1908cb1 ("dax: simplify the dax_device <-> gendisk association") introduced new logic for gendisk association, requiring drivers to explicitly call dax_add_host() and dax_remove_host(). For dcssblk driver, some dax_remove_host() calls were missing, e.g. in device remove path. The commit also broke error handling for out_dax case in device add path, resulting in an extra put_device() w/o the previous get_device() in that case. This lead to stale xarray entries after device add / remove cycles. In the case when a previously used struct gendisk pointer (xarray index) would be used again, because blk_alloc_disk() happened to return such a pointer, the xa_insert() in dax_add_host() would fail and go to out_dax, doing the extra put_device() in the error path. In combination with an already flawed error handling in dcssblk (device_register() cleanup), which needs to be addressed in a separate patch, this resulted in a missing device_del() / klist_del(), and eventually in the kernel crash with list_add corruption on a subsequent device_add() / klist_add(). Fix this by adding the missing dax_remove_host() calls, and also move the put_device() in the error path to restore the previous logic. Fixes: fb08a1908cb1 ("dax: simplify the dax_device <-> gendisk association") Cc: <stable@vger.kernel.org> # 5.17+ Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13arm64: sdei: abort running SDEI handlers during crashD Scott Phillips
commit 5cd474e57368f0957c343bb21e309cf82826b1ef upstream. Interrupts are blocked in SDEI context, per the SDEI spec: "The client interrupts cannot preempt the event handler." If we crashed in the SDEI handler-running context (as with ACPI's AGDI) then we need to clean up the SDEI state before proceeding to the crash kernel so that the crash kernel can have working interrupts. Track the active SDEI handler per-cpu so that we can COMPLETE_AND_RESUME the handler, discarding the interrupted context. Fixes: f5df26961853 ("arm64: kernel: Add arch-specific SDEI entry code and CPU masking") Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> Cc: stable@vger.kernel.org Reviewed-by: James Morse <james.morse@arm.com> Tested-by: Mihai Carabas <mihai.carabas@oracle.com> Link: https://lore.kernel.org/r/20230627002939.2758-1-scott@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13mmc: renesas_sdhi: register irqs before registering controllerWolfram Sang
commit 74f45de394d979cc7770271f92fafa53e1ed3119 upstream. IRQs should be ready to serve when we call mmc_add_host() via tmio_mmc_host_probe(). To achieve that, ensure that all irqs are masked before registering the handlers. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230712140011.18602-1-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13platform/chrome: chromeos_acpi: print hex string for ACPI_TYPE_BUFFERTzung-Bi Shih
commit 0820debb7d489e9eb1f68b7bb69e6ae210699b3f upstream. `element->buffer.pointer` should be binary blob. `%s` doesn't work perfect for them. Print hex string for ACPI_TYPE_BUFFER. Also update the documentation to reflect this. Fixes: 0a4cad9c11ad ("platform/chrome: Add ChromeOS ACPI device driver") Cc: stable@vger.kernel.org Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230803011245.3773756-1-tzungbi@kernel.org Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13r8169: fix ASPM-related issues on a number of systems with NIC version from ↵Heiner Kallweit
RTL8168h commit 90ca51e8c654699b672ba61aeaa418dfb3252e5e upstream. This effectively reverts 4b5f82f6aaef. On a number of systems ASPM L1 causes tx timeouts with RTL8168h, see referenced bug report. Fixes: 4b5f82f6aaef ("r8169: enable ASPM L1/L1.1 from RTL8168h") Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217814 Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13drm/amd/display: Add smu write msg id fail retry processFudong Wang
commit 72105dcfa3d12b5af49311f857e3490baa225135 upstream. A benchmark stress test (12-40 machines x 48hours) found that DCN315 has cases where DC writes to an indirect register to set the smu clock msg id, but when we go to read the same indirect register the returned msg id doesn't match with what we just set it to. So, to fix this retry the write until the register's value matches with the requested value. Cc: stable@vger.kernel.org # 6.1+ Fixes: f94903996140 ("drm/amd/display: Add DCN315 CLK_MGR") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Fudong Wang <fudong.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset"Bjorn Helgaas
commit 5260bd6d36c83c5b269c33baaaf8c78e520908b0 upstream. This reverts commit d5af729dc2071273f14cbb94abbc60608142fd83. d5af729dc207 ("PCI: Mark NVIDIA T4 GPUs to avoid bus reset") avoided Secondary Bus Reset on the T4 because the reset seemed to not work when the T4 was directly attached to a Root Port. But NVIDIA thinks the issue is probably related to some issue with the Root Port, not with the T4. The T4 provides neither PM nor FLR reset, so masking bus reset compromises this device for assignment scenarios. Revert d5af729dc207 as requested by Wu Zongyong. This will leave SBR broken in the specific configuration Wu tested, as it was in v6.5, so Wu will debug that further. Link: https://lore.kernel.org/r/ZPqMCDWvITlOLHgJ@wuzongyong-alibaba Link: https://lore.kernel.org/r/20230908201104.GA305023@bhelgaas Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13ntb: Fix calculation ntb_transport_tx_free_entry()Dave Jiang
commit 5a7693e6bbf19b22fd6c1d2c4b7beb0a03969e2c upstream. ntb_transport_tx_free_entry() never returns 0 with the current calculation. If head == tail, then it would return qp->tx_max_entry. Change compare to tail >= head and when they are equal, a 0 would be returned. Fixes: e74bfeedad08 ("NTB: Add flow control to the ntb_netdev") Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: renlonglong <ren.longlong@h3c.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13ntb: Clean up tx tail index on link downDave Jiang
commit cc79bd2738c2d40aba58b2be6ce47dc0e471df0e upstream. The tx tail index is not reset when the link goes down. This causes the tail index to go out of sync when the link goes down and comes back up. Refactor the ntb_qp_link_down_reset() and reset the tail index as well. Fixes: 2849b5d70641 ("NTB: Reset transport QP link stats on down") Reported-by: Yuan Y Lu <yuan.y.lu@intel.com> Tested-by: Yuan Y Lu <yuan.y.lu@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13ntb: Drop packets when qp link is downDave Jiang
commit f195a1a6fe416882984f8bd6c61afc1383171860 upstream. Currently when the transport receive packets after netdev has closed the transport returns error and triggers tx errors to be incremented and carrier to be stopped. There is no reason to return error if the device is already closed. Drop the packet and return 0. Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers") Reported-by: Yuan Y Lu <yuan.y.lu@intel.com> Tested-by: Yuan Y Lu <yuan.y.lu@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI/PM: Only read PCI_PM_CTRL register when availableFeiyang Chen
commit 5694ba13b004eea683c6d4faeb6d6e7a9636bda0 upstream. For a device with no Power Management Capability, pci_power_up() previously returned 0 (success) if the platform was able to put the device in D0, which led to pci_set_full_power_state() trying to read PCI_PM_CTRL, even though it doesn't exist. Since dev->pm_cap == 0 in this case, pci_set_full_power_state() actually read the wrong register, interpreted it as PCI_PM_CTRL, and corrupted dev->current_state. This led to messages like this in some cases: pci 0000:01:00.0: Refused to change power state from D3hot to D0 To prevent this, make pci_power_up() always return a negative failure code if the device lacks a Power Management Capability, even if non-PCI platform power management has been able to put the device in D0. The failure will prevent pci_set_full_power_state() from trying to access PCI_PM_CTRL. Fixes: e200904b275c ("PCI/PM: Split pci_power_up()") Link: https://lore.kernel.org/r/20230824013738.1894965-1-chenfeiyang@loongson.cn Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org> Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI: hv: Fix a crash in hv_pci_restore_msi_msg() during hibernationDexuan Cui
commit 04bbe863241a9be7d57fb4cf217ee4a72f480e70 upstream. When a Linux VM with an assigned PCI device runs on Hyper-V, if the PCI device driver is not loaded yet (i.e. MSI-X/MSI is not enabled on the device yet), doing a VM hibernation triggers a panic in hv_pci_restore_msi_msg() -> msi_lock_descs(&pdev->dev), because pdev->dev.msi.data is still NULL. Avoid the panic by checking if MSI-X/MSI is enabled. Link: https://lore.kernel.org/r/20230816175939.21566-1-decui@microsoft.com Fixes: dc2b453290c4 ("PCI: hv: Rework MSI handling") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: sathyanarayanan.kuppuswamy@linux.intel.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI: Free released resource after coalescingRoss Lagerwall
commit 8ec9c1d5d0a5a4744516adb483b97a238892f9d5 upstream. release_resource() doesn't actually free the resource or resource list entry so free the resource list entry to avoid a leak. Closes: https://lore.kernel.org/r/878r9sga1t.fsf@kernel.org/ Fixes: e54223275ba1 ("PCI: Release resource invalidated by coalescing") Link: https://lore.kernel.org/r/20230906110846.225369-1-ross.lagerwall@citrix.com Reported-by: Kalle Valo <kvalo@kernel.org> Tested-by: Kalle Valo <kvalo@kernel.org> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13scsi: mpt3sas: Perform additional retries if doorbell read returns 0Ranjan Kumar
commit 4ca10f3e31745d35249a727ecd108eb58f0a8c5e upstream. The driver retries certain register reads 3 times if the returned value is 0. This was done because the controller could return 0 for certain registers if other registers were being accessed concurrently by the BMC. In certain systems with increased BMC interactions, the register values returned can be 0 for longer than 3 retries. Change the retry count from 3 to 30 for the affected registers to prevent problems with out-of-band management. Fixes: b899202901a8 ("scsi: mpt3sas: Add separate function for aero doorbell reads") Cc: stable@vger.kernel.org Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20230829090020.5417-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13Revert "scsi: qla2xxx: Fix buffer overrun"Nilesh Javali
commit 641671d97b9199f1ba35ccc2222d4b189a6a5de5 upstream. Revert due to Get PLOGI Template failed. This reverts commit b68710a8094fdffe8dd4f7a82c82649f479bb453. Cc: stable@vger.kernel.org Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13media: venus: hfi_venus: Write to VIDC_CTRL_INIT after unmasking interruptsKonrad Dybcio
commit d74e481609808330b4625b3691cf01e1f56e255e upstream. The startup procedure shouldn't be started with interrupts masked, as that may entail silent failures. Kick off initialization only after the interrupts are unmasked. Cc: stable@vger.kernel.org # v4.12+ Fixes: d96d3f30c0f2 ("[media] media: venus: hfi: add Venus HFI files") Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13media: dvb: symbol fixup for dvb_attach()Greg Kroah-Hartman
commit 86495af1171e1feec79faa9b64c05c89f46e41d1 upstream. In commit 9011e49d54dc ("modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules") the use of symbol_get is properly restricted to GPL-only marked symbols. This interacts oddly with the DVB logic which only uses dvb_attach() to load the dvb driver which then uses symbol_get(). Fix this up by properly marking all of the dvb_attach attach symbols as EXPORT_SYMBOL_GPL(). Fixes: 9011e49d54dc ("modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules") Cc: stable <stable@kernel.org> Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-media@vger.kernel.org Cc: linux-modules@vger.kernel.org Acked-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Link: https://lore.kernel.org/r/20230908092035.3815268-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13i3c: master: svc: fix probe failure when no i3c device existFrank Li
commit 6e13d6528be2f7e801af63c8153b87293f25d736 upstream. I3C masters are expected to support hot-join. This means at initialization time we might not yet discover any device and this should not be treated as a fatal error. During the DAA procedure which happens at probe time, if no device has joined, all CCC will be NACKed (from a bus perspective). This leads to an early return with an error code which fails the probe of the master. Let's avoid this by just telling the core through an I3C_ERROR_M2 return command code that no device was discovered, which is a valid situation. This way the master will no longer bail out and fail to probe for a wrong reason. Cc: stable@vger.kernel.org Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20230831141324.2841525-1-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13drm/amd/display: register edp_backlight_control() for DCN301Hamza Mahfooz
commit 1611917f39bee1abfc01501238db8ac19649042d upstream. As made mention of in commit 099303e9a9bd ("drm/amd/display: eDP intermittent black screen during PnP"), we need to turn off the display's backlight before powering off an eDP display. Not doing so will result in undefined behaviour according to the eDP spec. So, set DCN301's edp_backlight_control() function pointer to dce110_edp_backlight_control(). Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2765 Fixes: 9c75891feef0 ("drm/amd/display: rework recent update PHY state commit") Suggested-by: Swapnil Patel <swapnil.patel@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13backlight/lv5207lp: Compare against struct fb_info.deviceThomas Zimmermann
commit 1ca8819320fd84e7d95b04e7668efc5f9fe9fa5c upstream. Struct lv5207lp_platform_data refers to a platform device within the Linux device hierarchy. The test in lv5207lp_backlight_check_fb() compares it against the fbdev device in struct fb_info.dev, which is different. Fix the test by comparing to struct fb_info.device. Fixes a bug in the backlight driver and prepares fbdev for making struct fb_info.dev optional. v2: * move renames into separate patch (Javier, Sam, Michael) Fixes: 82e5c40d88f9 ("backlight: Add Sanyo LV5207LP backlight driver") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Lee Jones <lee@kernel.org> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: linux-sh@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.12+ Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230613110953.24176-6-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13backlight/bd6107: Compare against struct fb_info.deviceThomas Zimmermann
commit 992bdddaabfba19bdc77c1c7a4977b2aa41ec891 upstream. Struct bd6107_platform_data refers to a platform device within the Linux device hierarchy. The test in bd6107_backlight_check_fb() compares it against the fbdev device in struct fb_info.dev, which is different. Fix the test by comparing to struct fb_info.device. Fixes a bug in the backlight driver and prepares fbdev for making struct fb_info.dev optional. v2: * move renames into separate patch (Javier, Sam, Michael) Fixes: 67b43e590415 ("backlight: Add ROHM BD6107 backlight driver") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Cc: Lee Jones <lee@kernel.org> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.12+ Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230613110953.24176-2-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13backlight/gpio_backlight: Compare against struct fb_info.deviceThomas Zimmermann
commit 7b91d017f77c1bda56f27c2f4bbb70de7c6eca08 upstream. Struct gpio_backlight_platform_data refers to a platform device within the Linux device hierarchy. The test in gpio_backlight_check_fb() compares it against the fbdev device in struct fb_info.dev, which is different. Fix the test by comparing to struct fb_info.device. Fixes a bug in the backlight driver and prepares fbdev for making struct fb_info.dev optional. v2: * move renames into separate patch (Javier, Sam, Michael) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 8b770e3c9824 ("backlight: Add GPIO-based backlight driver") Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Lee Jones <lee@kernel.org> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: linux-sh@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.12+ Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230613110953.24176-4-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>