aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-01-01keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiryDavid Howells
[ Upstream commit 39299bdd2546688d92ed9db4948f6219ca1b9542 ] If a key has an expiration time, then when that time passes, the key is left around for a certain amount of time before being collected (5 mins by default) so that EKEYEXPIRED can be returned instead of ENOKEY. This is a problem for DNS keys because we want to redo the DNS lookup immediately at that point. Fix this by allowing key types to be marked such that keys of that type don't have this extra period, but are reclaimed as soon as they expire and turn this on for dns_resolver-type keys. To make this easier to handle, key->expiry is changed to be permanent if TIME64_MAX rather than 0. Furthermore, give such new-style negative DNS results a 1s default expiry if no other expiry time is set rather than allowing it to stick around indefinitely. This shouldn't be zero as ls will follow a failing stat call immediately with a second with AT_SYMLINK_NOFOLLOW added. Fixes: 1a4240f4764a ("DNS: Separate out CIFS DNS Resolver code") Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: Wang Lei <wang840925@gmail.com> cc: Jeff Layton <jlayton@redhat.com> cc: Steve French <smfrench@gmail.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jarkko Sakkinen <jarkko@kernel.org> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: keyrings@vger.kernel.org cc: netdev@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net: check dev->gso_max_size in gso_features_check()Eric Dumazet
[ Upstream commit 24ab059d2ebd62fdccc43794796f6ffbabe49ebc ] Some drivers might misbehave if TSO packets get too big. GVE for instance uses a 16bit field in its TX descriptor, and will do bad things if a packet is bigger than 2^16 bytes. Linux TCP stack honors dev->gso_max_size, but there are other ways for too big packets to reach an ndo_start_xmit() handler : virtio_net, af_packet, GRO... Add a generic check in gso_features_check() and fallback to GSO when needed. gso_max_size was added in the blamed commit. Fixes: 82cc1a7a5687 ("[NET]: Add per-connection option to set max TSO frame size") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231219125331.4127498-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01afs: Fix dynamic root lookup DNS checkDavid Howells
[ Upstream commit 74cef6872ceaefb5b6c5c60641371ea28702d358 ] In the afs dynamic root directory, the ->lookup() function does a DNS check on the cell being asked for and if the DNS upcall reports an error it will report an error back to userspace (typically ENOENT). However, if a failed DNS upcall returns a new-style result, it will return a valid result, with the status field set appropriately to indicate the type of failure - and in that case, dns_query() doesn't return an error and we let stat() complete with no error - which can cause confusion in userspace as subsequent calls that trigger d_automount then fail with ENOENT. Fix this by checking the status result from a valid dns_query() and returning an error if it indicates a failure. Fixes: bbb4c4323a4d ("dns: Allow the dns resolver to retrieve a server set") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216637 Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01afs: Fix the dynamic root's d_delete to always delete unused dentriesDavid Howells
[ Upstream commit 71f8b55bc30e82d6355e07811213d847981a32e2 ] Fix the afs dynamic root's d_delete function to always delete unused dentries rather than only deleting them if they're positive. With things as they stand upstream, negative dentries stemming from failed DNS lookups stick around preventing retries. Fixes: 66c7e1d319a5 ("afs: Split the dynroot stuff out and give it its own ops tables") Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net: check vlan filter feature in vlan_vids_add_by_dev() and ↵Liu Jian
vlan_vids_del_by_dev() [ Upstream commit 01a564bab4876007ce35f312e16797dfe40e4823 ] I got the below warning trace: WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0 Call Trace: rtnl_dellink rtnetlink_rcv_msg netlink_rcv_skb netlink_unicast netlink_sendmsg __sock_sendmsg ____sys_sendmsg ___sys_sendmsg __sys_sendmsg do_syscall_64 entry_SYSCALL_64_after_hwframe It can be repoduced via: ip netns add ns1 ip netns exec ns1 ip link add bond0 type bond mode 0 ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2 ip netns exec ns1 ip link set bond_slave_1 master bond0 [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0 [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0 [4] ip netns exec ns1 ip link set bond_slave_1 nomaster [5] ip netns exec ns1 ip link del veth2 ip netns del ns1 This is all caused by command [1] turning off the rx-vlan-filter function of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands [2] [3] add the same vid to slave and master respectively, causing command [4] to empty slave->vlan_info. The following command [5] triggers this problem. To fix this problem, we should add VLAN_FILTER feature checks in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect addition or deletion of vlan_vid information. Fixes: 348a1443cc43 ("vlan: introduce functions to do mass addition/deletion of vids by another device") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net: mana: select PAGE_POOLYury Norov
[ Upstream commit 340943fbff3d8faa44d2223ca04917df28786a07 ] Mana uses PAGE_POOL API. x86_64 defconfig doesn't select it: ld: vmlinux.o: in function `mana_create_page_pool.isra.0': mana_en.c:(.text+0x9ae36f): undefined reference to `page_pool_create' ld: vmlinux.o: in function `mana_get_rxfrag': mana_en.c:(.text+0x9afed1): undefined reference to `page_pool_alloc_pages' make[3]: *** [/home/yury/work/linux/scripts/Makefile.vmlinux:37: vmlinux] Error 1 make[2]: *** [/home/yury/work/linux/Makefile:1154: vmlinux] Error 2 make[1]: *** [/home/yury/work/linux/Makefile:234: __sub-make] Error 2 make[1]: Leaving directory '/home/yury/work/build-linux-x86_64' make: *** [Makefile:234: __sub-make] Error 2 So we need to select it explicitly. Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter") Link: https://lore.kernel.org/r/20231215203353.635379-1-yury.norov@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01Bluetooth: hci_event: shut up a false-positive warningArnd Bergmann
[ Upstream commit a5812c68d849505ea657f653446512b85887f813 ] Turning on -Wstringop-overflow globally exposed a misleading compiler warning in bluetooth: net/bluetooth/hci_event.c: In function 'hci_cc_read_class_of_dev': net/bluetooth/hci_event.c:524:9: error: 'memcpy' writing 3 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=] 524 | memcpy(hdev->dev_class, rp->dev_class, 3); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem here is the check for hdev being NULL in bt_dev_dbg() that leads the compiler to conclude that hdev->dev_class might be an invalid pointer access. Add another explicit check for the same condition to make sure gcc sees this cannot happen. Fixes: a9de9248064b ("[Bluetooth] Switch from OGF+OCF to using only opcodes") Fixes: 1b56c90018f0 ("Makefile: Enable -Wstringop-overflow globally") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01Bluetooth: Fix deadlock in vhci_send_frameYing Hsu
[ Upstream commit 769bf60e17ee1a56a81e7c031192c3928312c52e ] syzbot found a potential circular dependency leading to a deadlock: -> #3 (&hdev->req_lock){+.+.}-{3:3}: __mutex_lock_common+0x1b6/0x1bc2 kernel/locking/mutex.c:599 __mutex_lock kernel/locking/mutex.c:732 [inline] mutex_lock_nested+0x17/0x1c kernel/locking/mutex.c:784 hci_dev_do_close+0x3f/0x9f net/bluetooth/hci_core.c:551 hci_rfkill_set_block+0x130/0x1ac net/bluetooth/hci_core.c:935 rfkill_set_block+0x1e6/0x3b8 net/rfkill/core.c:345 rfkill_fop_write+0x2d8/0x672 net/rfkill/core.c:1274 vfs_write+0x277/0xcf5 fs/read_write.c:594 ksys_write+0x19b/0x2bd fs/read_write.c:650 do_syscall_x64 arch/x86/entry/common.c:55 [inline] do_syscall_64+0x51/0xba arch/x86/entry/common.c:93 entry_SYSCALL_64_after_hwframe+0x61/0xcb -> #2 (rfkill_global_mutex){+.+.}-{3:3}: __mutex_lock_common+0x1b6/0x1bc2 kernel/locking/mutex.c:599 __mutex_lock kernel/locking/mutex.c:732 [inline] mutex_lock_nested+0x17/0x1c kernel/locking/mutex.c:784 rfkill_register+0x30/0x7e3 net/rfkill/core.c:1045 hci_register_dev+0x48f/0x96d net/bluetooth/hci_core.c:2622 __vhci_create_device drivers/bluetooth/hci_vhci.c:341 [inline] vhci_create_device+0x3ad/0x68f drivers/bluetooth/hci_vhci.c:374 vhci_get_user drivers/bluetooth/hci_vhci.c:431 [inline] vhci_write+0x37b/0x429 drivers/bluetooth/hci_vhci.c:511 call_write_iter include/linux/fs.h:2109 [inline] new_sync_write fs/read_write.c:509 [inline] vfs_write+0xaa8/0xcf5 fs/read_write.c:596 ksys_write+0x19b/0x2bd fs/read_write.c:650 do_syscall_x64 arch/x86/entry/common.c:55 [inline] do_syscall_64+0x51/0xba arch/x86/entry/common.c:93 entry_SYSCALL_64_after_hwframe+0x61/0xcb -> #1 (&data->open_mutex){+.+.}-{3:3}: __mutex_lock_common+0x1b6/0x1bc2 kernel/locking/mutex.c:599 __mutex_lock kernel/locking/mutex.c:732 [inline] mutex_lock_nested+0x17/0x1c kernel/locking/mutex.c:784 vhci_send_frame+0x68/0x9c drivers/bluetooth/hci_vhci.c:75 hci_send_frame+0x1cc/0x2ff net/bluetooth/hci_core.c:2989 hci_sched_acl_pkt net/bluetooth/hci_core.c:3498 [inline] hci_sched_acl net/bluetooth/hci_core.c:3583 [inline] hci_tx_work+0xb94/0x1a60 net/bluetooth/hci_core.c:3654 process_one_work+0x901/0xfb8 kernel/workqueue.c:2310 worker_thread+0xa67/0x1003 kernel/workqueue.c:2457 kthread+0x36a/0x430 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298 -> #0 ((work_completion)(&hdev->tx_work)){+.+.}-{0:0}: check_prev_add kernel/locking/lockdep.c:3053 [inline] check_prevs_add kernel/locking/lockdep.c:3172 [inline] validate_chain kernel/locking/lockdep.c:3787 [inline] __lock_acquire+0x2d32/0x77fa kernel/locking/lockdep.c:5011 lock_acquire+0x273/0x4d5 kernel/locking/lockdep.c:5622 __flush_work+0xee/0x19f kernel/workqueue.c:3090 hci_dev_close_sync+0x32f/0x1113 net/bluetooth/hci_sync.c:4352 hci_dev_do_close+0x47/0x9f net/bluetooth/hci_core.c:553 hci_rfkill_set_block+0x130/0x1ac net/bluetooth/hci_core.c:935 rfkill_set_block+0x1e6/0x3b8 net/rfkill/core.c:345 rfkill_fop_write+0x2d8/0x672 net/rfkill/core.c:1274 vfs_write+0x277/0xcf5 fs/read_write.c:594 ksys_write+0x19b/0x2bd fs/read_write.c:650 do_syscall_x64 arch/x86/entry/common.c:55 [inline] do_syscall_64+0x51/0xba arch/x86/entry/common.c:93 entry_SYSCALL_64_after_hwframe+0x61/0xcb This change removes the need for acquiring the open_mutex in vhci_send_frame, thus eliminating the potential deadlock while maintaining the required packet ordering. Fixes: 92d4abd66f70 ("Bluetooth: vhci: Fix race when opening vhci device") Signed-off-by: Ying Hsu <yinghsu@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/rose: fix races in rose_kill_by_device()Eric Dumazet
[ Upstream commit 64b8bc7d5f1434c636a40bdcfcd42b278d1714be ] syzbot found an interesting netdev refcounting issue in net/rose/af_rose.c, thanks to CONFIG_NET_DEV_REFCNT_TRACKER=y [1] Problem is that rose_kill_by_device() can change rose->device while other threads do not expect the pointer to be changed. We have to first collect sockets in a temporary array, then perform the changes while holding the socket lock and rose_list_lock spinlock (in this order) Change rose_release() to also acquire rose_list_lock before releasing the netdev refcount. [1] [ 1185.055088][ T7889] ref_tracker: reference already released. [ 1185.061476][ T7889] ref_tracker: allocated in: [ 1185.066081][ T7889] rose_bind+0x4ab/0xd10 [ 1185.070446][ T7889] __sys_bind+0x1ec/0x220 [ 1185.074818][ T7889] __x64_sys_bind+0x72/0xb0 [ 1185.079356][ T7889] do_syscall_64+0x40/0x110 [ 1185.083897][ T7889] entry_SYSCALL_64_after_hwframe+0x63/0x6b [ 1185.089835][ T7889] ref_tracker: freed in: [ 1185.094088][ T7889] rose_release+0x2f5/0x570 [ 1185.098629][ T7889] __sock_release+0xae/0x260 [ 1185.103262][ T7889] sock_close+0x1c/0x20 [ 1185.107453][ T7889] __fput+0x270/0xbb0 [ 1185.111467][ T7889] task_work_run+0x14d/0x240 [ 1185.116085][ T7889] get_signal+0x106f/0x2790 [ 1185.120622][ T7889] arch_do_signal_or_restart+0x90/0x7f0 [ 1185.126205][ T7889] exit_to_user_mode_prepare+0x121/0x240 [ 1185.131846][ T7889] syscall_exit_to_user_mode+0x1e/0x60 [ 1185.137293][ T7889] do_syscall_64+0x4d/0x110 [ 1185.141783][ T7889] entry_SYSCALL_64_after_hwframe+0x63/0x6b [ 1185.148085][ T7889] ------------[ cut here ]------------ WARNING: CPU: 1 PID: 7889 at lib/ref_tracker.c:255 ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255 Modules linked in: CPU: 1 PID: 7889 Comm: syz-executor.2 Not tainted 6.7.0-rc4-syzkaller-00162-g65c95f78917e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 RIP: 0010:ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255 Code: 00 44 8b 6b 18 31 ff 44 89 ee e8 21 62 f5 fc 45 85 ed 0f 85 a6 00 00 00 e8 a3 66 f5 fc 48 8b 34 24 48 89 ef e8 27 5f f1 05 90 <0f> 0b 90 bb ea ff ff ff e9 52 fd ff ff e8 84 66 f5 fc 4c 8d 6d 44 RSP: 0018:ffffc90004917850 EFLAGS: 00010202 RAX: 0000000000000201 RBX: ffff88802618f4c0 RCX: 0000000000000000 RDX: 0000000000000202 RSI: ffffffff8accb920 RDI: 0000000000000001 RBP: ffff8880269ea5b8 R08: 0000000000000001 R09: fffffbfff23e35f6 R10: ffffffff91f1afb7 R11: 0000000000000001 R12: 1ffff92000922f0c R13: 0000000005a2039b R14: ffff88802618f4d8 R15: 00000000ffffffff FS: 00007f0a720ef6c0(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f43a819d988 CR3: 0000000076c64000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> netdev_tracker_free include/linux/netdevice.h:4127 [inline] netdev_put include/linux/netdevice.h:4144 [inline] netdev_put include/linux/netdevice.h:4140 [inline] rose_kill_by_device net/rose/af_rose.c:195 [inline] rose_device_event+0x25d/0x330 net/rose/af_rose.c:218 notifier_call_chain+0xb6/0x3b0 kernel/notifier.c:93 call_netdevice_notifiers_info+0xbe/0x130 net/core/dev.c:1967 call_netdevice_notifiers_extack net/core/dev.c:2005 [inline] call_netdevice_notifiers net/core/dev.c:2019 [inline] __dev_notify_flags+0x1f5/0x2e0 net/core/dev.c:8646 dev_change_flags+0x122/0x170 net/core/dev.c:8682 dev_ifsioc+0x9ad/0x1090 net/core/dev_ioctl.c:529 dev_ioctl+0x224/0x1090 net/core/dev_ioctl.c:786 sock_do_ioctl+0x198/0x270 net/socket.c:1234 sock_ioctl+0x22e/0x6b0 net/socket.c:1339 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl fs/ioctl.c:857 [inline] __x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7f0a7147cba9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f0a720ef0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f0a7159bf80 RCX: 00007f0a7147cba9 RDX: 0000000020000040 RSI: 0000000000008914 RDI: 0000000000000004 RBP: 00007f0a714c847a R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007f0a7159bf80 R15: 00007ffc8bb3a5f8 </TASK> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01ethernet: atheros: fix a memleak in atl1e_setup_ring_resourcesZhipeng Lu
[ Upstream commit 309fdb1c33fe726d92d0030481346f24e1b01f07 ] In the error handling of 'offset > adapter->ring_size', the tx_ring->tx_buffer allocated by kzalloc should be freed, instead of 'goto failed' instantly. Fixes: a6a5325239c2 ("atl1e: Atheros L1E Gigabit Ethernet driver") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Reviewed-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net: sched: ife: fix potential use-after-freeEric Dumazet
[ Upstream commit 19391a2ca98baa7b80279306cdf7dd43f81fa595 ] ife_decode() calls pskb_may_pull() two times, we need to reload ifehdr after the second one, or risk use-after-free as reported by syzbot: BUG: KASAN: slab-use-after-free in __ife_tlv_meta_valid net/ife/ife.c:108 [inline] BUG: KASAN: slab-use-after-free in ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131 Read of size 2 at addr ffff88802d7300a4 by task syz-executor.5/22323 CPU: 0 PID: 22323 Comm: syz-executor.5 Not tainted 6.7.0-rc3-syzkaller-00804-g074ac38d5b95 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc4/0x620 mm/kasan/report.c:475 kasan_report+0xda/0x110 mm/kasan/report.c:588 __ife_tlv_meta_valid net/ife/ife.c:108 [inline] ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131 tcf_ife_decode net/sched/act_ife.c:739 [inline] tcf_ife_act+0x4e3/0x1cd0 net/sched/act_ife.c:879 tc_act include/net/tc_wrapper.h:221 [inline] tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079 tcf_exts_exec include/net/pkt_cls.h:344 [inline] mall_classify+0x201/0x310 net/sched/cls_matchall.c:42 tc_classify include/net/tc_wrapper.h:227 [inline] __tcf_classify net/sched/cls_api.c:1703 [inline] tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800 hfsc_classify net/sched/sch_hfsc.c:1147 [inline] hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546 dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739 __dev_xmit_skb net/core/dev.c:3828 [inline] __dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311 dev_queue_xmit include/linux/netdevice.h:3165 [inline] packet_xmit+0x237/0x350 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3081 [inline] packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 __do_sys_sendto net/socket.c:2202 [inline] __se_sys_sendto net/socket.c:2198 [inline] __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7fe9acc7cae9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fe9ada450c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fe9acd9bf80 RCX: 00007fe9acc7cae9 RDX: 000000000000fce0 RSI: 00000000200002c0 RDI: 0000000000000003 RBP: 00007fe9accc847a R08: 0000000020000140 R09: 0000000000000014 R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fe9acd9bf80 R15: 00007ffd5427ae78 </TASK> Allocated by task 22323: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:374 [inline] __kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383 kasan_kmalloc include/linux/kasan.h:198 [inline] __do_kmalloc_node mm/slab_common.c:1007 [inline] __kmalloc_node_track_caller+0x5a/0x90 mm/slab_common.c:1027 kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582 __alloc_skb+0x12b/0x330 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1298 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 packet_alloc_skb net/packet/af_packet.c:2930 [inline] packet_snd net/packet/af_packet.c:3024 [inline] packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 __do_sys_sendto net/socket.c:2202 [inline] __se_sys_sendto net/socket.c:2198 [inline] __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Freed by task 22323: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200 kasan_slab_free include/linux/kasan.h:164 [inline] slab_free_hook mm/slub.c:1800 [inline] slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826 slab_free mm/slub.c:3809 [inline] __kmem_cache_free+0xc0/0x180 mm/slub.c:3822 skb_kfree_head net/core/skbuff.c:950 [inline] skb_free_head+0x110/0x1b0 net/core/skbuff.c:962 pskb_expand_head+0x3c5/0x1170 net/core/skbuff.c:2130 __pskb_pull_tail+0xe1/0x1830 net/core/skbuff.c:2655 pskb_may_pull_reason include/linux/skbuff.h:2685 [inline] pskb_may_pull include/linux/skbuff.h:2693 [inline] ife_decode+0x394/0x4f0 net/ife/ife.c:82 tcf_ife_decode net/sched/act_ife.c:727 [inline] tcf_ife_act+0x43b/0x1cd0 net/sched/act_ife.c:879 tc_act include/net/tc_wrapper.h:221 [inline] tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079 tcf_exts_exec include/net/pkt_cls.h:344 [inline] mall_classify+0x201/0x310 net/sched/cls_matchall.c:42 tc_classify include/net/tc_wrapper.h:227 [inline] __tcf_classify net/sched/cls_api.c:1703 [inline] tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800 hfsc_classify net/sched/sch_hfsc.c:1147 [inline] hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546 dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739 __dev_xmit_skb net/core/dev.c:3828 [inline] __dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311 dev_queue_xmit include/linux/netdevice.h:3165 [inline] packet_xmit+0x237/0x350 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3081 [inline] packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 __do_sys_sendto net/socket.c:2202 [inline] __se_sys_sendto net/socket.c:2198 [inline] __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b The buggy address belongs to the object at ffff88802d730000 which belongs to the cache kmalloc-8k of size 8192 The buggy address is located 164 bytes inside of freed 8192-byte region [ffff88802d730000, ffff88802d732000) The buggy address belongs to the physical page: page:ffffea0000b5cc00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2d730 head:ffffea0000b5cc00 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff) page_type: 0xffffffff() raw: 00fff00000000840 ffff888013042280 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 22323, tgid 22320 (syz-executor.5), ts 950317230369, free_ts 950233467461 set_page_owner include/linux/page_owner.h:31 [inline] post_alloc_hook+0x2d0/0x350 mm/page_alloc.c:1544 prep_new_page mm/page_alloc.c:1551 [inline] get_page_from_freelist+0xa28/0x3730 mm/page_alloc.c:3319 __alloc_pages+0x22e/0x2420 mm/page_alloc.c:4575 alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133 alloc_slab_page mm/slub.c:1870 [inline] allocate_slab mm/slub.c:2017 [inline] new_slab+0x283/0x3c0 mm/slub.c:2070 ___slab_alloc+0x979/0x1500 mm/slub.c:3223 __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322 __slab_alloc_node mm/slub.c:3375 [inline] slab_alloc_node mm/slub.c:3468 [inline] __kmem_cache_alloc_node+0x131/0x310 mm/slub.c:3517 __do_kmalloc_node mm/slab_common.c:1006 [inline] __kmalloc_node_track_caller+0x4a/0x90 mm/slab_common.c:1027 kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582 __alloc_skb+0x12b/0x330 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1298 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 packet_alloc_skb net/packet/af_packet.c:2930 [inline] packet_snd net/packet/af_packet.c:3024 [inline] packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1144 [inline] free_unref_page_prepare+0x53c/0xb80 mm/page_alloc.c:2354 free_unref_page+0x33/0x3b0 mm/page_alloc.c:2494 __unfreeze_partials+0x226/0x240 mm/slub.c:2655 qlink_free mm/kasan/quarantine.c:168 [inline] qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187 kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294 __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:763 [inline] slab_alloc_node mm/slub.c:3478 [inline] slab_alloc mm/slub.c:3486 [inline] __kmem_cache_alloc_lru mm/slub.c:3493 [inline] kmem_cache_alloc_lru+0x219/0x6f0 mm/slub.c:3509 alloc_inode_sb include/linux/fs.h:2937 [inline] ext4_alloc_inode+0x28/0x650 fs/ext4/super.c:1408 alloc_inode+0x5d/0x220 fs/inode.c:261 new_inode_pseudo fs/inode.c:1006 [inline] new_inode+0x22/0x260 fs/inode.c:1032 __ext4_new_inode+0x333/0x5200 fs/ext4/ialloc.c:958 ext4_symlink+0x5d7/0xa20 fs/ext4/namei.c:3398 vfs_symlink fs/namei.c:4464 [inline] vfs_symlink+0x3e5/0x620 fs/namei.c:4448 do_symlinkat+0x25f/0x310 fs/namei.c:4490 __do_sys_symlinkat fs/namei.c:4506 [inline] __se_sys_symlinkat fs/namei.c:4503 [inline] __x64_sys_symlinkat+0x97/0xc0 fs/namei.c:4503 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 Fixes: d57493d6d1be ("net: sched: ife: check on metadata length") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Alexander Aring <aahringo@redhat.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net: Return error from sk_stream_wait_connect() if sk_wait_event() failsShigeru Yoshida
[ Upstream commit cac23b7d7627915d967ce25436d7aae26e88ed06 ] The following NULL pointer dereference issue occurred: BUG: kernel NULL pointer dereference, address: 0000000000000000 <...> RIP: 0010:ccid_hc_tx_send_packet net/dccp/ccid.h:166 [inline] RIP: 0010:dccp_write_xmit+0x49/0x140 net/dccp/output.c:356 <...> Call Trace: <TASK> dccp_sendmsg+0x642/0x7e0 net/dccp/proto.c:801 inet_sendmsg+0x63/0x90 net/ipv4/af_inet.c:846 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x83/0xe0 net/socket.c:745 ____sys_sendmsg+0x443/0x510 net/socket.c:2558 ___sys_sendmsg+0xe5/0x150 net/socket.c:2612 __sys_sendmsg+0xa6/0x120 net/socket.c:2641 __do_sys_sendmsg net/socket.c:2650 [inline] __se_sys_sendmsg net/socket.c:2648 [inline] __x64_sys_sendmsg+0x45/0x50 net/socket.c:2648 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x43/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b sk_wait_event() returns an error (-EPIPE) if disconnect() is called on the socket waiting for the event. However, sk_stream_wait_connect() returns success, i.e. zero, even if sk_wait_event() returns -EPIPE, so a function that waits for a connection with sk_stream_wait_connect() may misbehave. In the case of the above DCCP issue, dccp_sendmsg() is waiting for the connection. If disconnect() is called in concurrently, the above issue occurs. This patch fixes the issue by returning error from sk_stream_wait_connect() if sk_wait_event() fails. Fixes: 419ce133ab92 ("tcp: allow again tcp_disconnect() when threads are waiting") Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reported-by: syzbot+c71bc336c5061153b502@syzkaller.appspotmail.com Reviewed-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01octeontx2-pf: Fix graceful exit during PFC configuration failureSuman Ghosh
[ Upstream commit 8c97ab5448f2096daba11edf8d18a44e1eb6f31d ] During PFC configuration failure the code was not handling a graceful exit. This patch fixes the same and add proper code for a graceful exit. Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support") Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net: mscc: ocelot: fix eMAC TX RMON stats for bucket 256-511 and aboveVladimir Oltean
[ Upstream commit 52eda4641d041667fa059f4855c5f88dcebd8afe ] There is a typo in the driver due to which we report incorrect TX RMON counters for the 256-511 octet bucket and all the other buckets larger than that. Bug found with the selftest at https://patchwork.kernel.org/project/netdevbpf/patch/20231211223346.2497157-9-tobias@waldekranz.com/ Fixes: e32036e1ae7b ("net: mscc: ocelot: add support for all sorts of standardized counters present in DSA") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20231214000902.545625-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5e: Correct snprintf truncation handling for fw_version buffer used ↵Rahul Rameshbabu
by representors [ Upstream commit b13559b76157de9d74f04d3ca0e49d69de3b5675 ] snprintf returns the length of the formatted string, excluding the trailing null, without accounting for truncation. This means that is the return value is greater than or equal to the size parameter, the fw_version string was truncated. Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf Fixes: 1b2bd0c0264f ("net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5e: Correct snprintf truncation handling for fw_version bufferRahul Rameshbabu
[ Upstream commit ad436b9c1270c40554e274f067f1b78fcc06a004 ] snprintf returns the length of the formatted string, excluding the trailing null, without accounting for truncation. This means that is the return value is greater than or equal to the size parameter, the fw_version string was truncated. Reported-by: David Laight <David.Laight@ACULAB.COM> Closes: https://lore.kernel.org/netdev/81cae734ee1b4cde9b380a9a31006c1a@AcuMS.aculab.com/ Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf Fixes: 41e63c2baa11 ("net/mlx5e: Check return value of snprintf writing to fw_version buffer") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5: Fix fw tracer first block checkMoshe Shemesh
[ Upstream commit 4261edf11cb7c9224af713a102e5616329306932 ] While handling new traces, to verify it is not the first block being written, last_timestamp is checked. But instead of checking it is non zero it is verified to be zero. Fix to verify last_timestamp is not zero. Fixes: c71ad41ccb0c ("net/mlx5: FW tracer, events handling") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Feras Daoud <ferasda@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5e: fix a potential double-free in fs_udp_create_groupsDinghao Liu
[ Upstream commit e75efc6466ae289e599fb12a5a86545dff245c65 ] When kcalloc() for ft->g succeeds but kvzalloc() for in fails, fs_udp_create_groups() will free ft->g. However, its caller fs_udp_create_table() will free ft->g again through calling mlx5e_destroy_flow_table(), which will lead to a double-free. Fix this by setting ft->g to NULL in fs_udp_create_groups(). Fixes: 1c80bd684388 ("net/mlx5e: Introduce Flow Steering UDP API") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5e: Fix a race in command alloc flowShifeng Li
[ Upstream commit 8f5100da56b3980276234e812ce98d8f075194cd ] Fix a cmd->ent use after free due to a race on command entry. Such race occurs when one of the commands releases its last refcount and frees its index and entry while another process running command flush flow takes refcount to this command entry. The process which handles commands flush may see this command as needed to be flushed if the other process allocated a ent->idx but didn't set ent to cmd->ent_arr in cmd_work_handler(). Fix it by moving the assignment of cmd->ent_arr into the spin lock. [70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core] [70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361 [70013.081968] [70013.082028] Workqueue: events aer_isr [70013.082053] Call Trace: [70013.082067] dump_stack+0x8b/0xbb [70013.082086] print_address_description+0x6a/0x270 [70013.082102] kasan_report+0x179/0x2c0 [70013.082173] mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core] [70013.082267] mlx5_cmd_flush+0x80/0x180 [mlx5_core] [70013.082304] mlx5_enter_error_state+0x106/0x1d0 [mlx5_core] [70013.082338] mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core] [70013.082377] remove_one+0x200/0x2b0 [mlx5_core] [70013.082409] pci_device_remove+0xf3/0x280 [70013.082439] device_release_driver_internal+0x1c3/0x470 [70013.082453] pci_stop_bus_device+0x109/0x160 [70013.082468] pci_stop_and_remove_bus_device+0xe/0x20 [70013.082485] pcie_do_fatal_recovery+0x167/0x550 [70013.082493] aer_isr+0x7d2/0x960 [70013.082543] process_one_work+0x65f/0x12d0 [70013.082556] worker_thread+0x87/0xb50 [70013.082571] kthread+0x2e9/0x3a0 [70013.082592] ret_from_fork+0x1f/0x40 The logical relationship of this error is as follows: aer_recover_work | ent->work -------------------------------------------+------------------------------ aer_recover_work_func | |- pcie_do_recovery | |- report_error_detected | |- mlx5_pci_err_detected |cmd_work_handler |- mlx5_enter_error_state | |- cmd_alloc_index |- enter_error_state | |- lock cmd->alloc_lock |- mlx5_cmd_flush | |- clear_bit |- mlx5_cmd_trigger_completions| |- unlock cmd->alloc_lock |- lock cmd->alloc_lock | |- vector = ~dev->cmd.vars.bitmask |- for_each_set_bit | |- cmd_ent_get(cmd->ent_arr[i]) (UAF) |- unlock cmd->alloc_lock | |- cmd->ent_arr[ent->idx]=ent The cmd->ent_arr[ent->idx] assignment and the bit clearing are not protected by the cmd->alloc_lock in cmd_work_handler(). Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler") Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5: Re-organize mlx5_cmd structShay Drory
[ Upstream commit 58db72869a9f8e01910844ca145efc2ea91bbbf9 ] Downstream patch will split mlx5_cmd_init() to probe and reload routines. As a preparation, organize mlx5_cmd struct so that any field that will be used in the reload routine are grouped at new nested struct. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Stable-dep-of: 8f5100da56b3 ("net/mlx5e: Fix a race in command alloc flow") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5: Prevent high-rate FW commands from populating all slotsTariq Toukan
[ Upstream commit 63fbae0a74c3e1df7c20c81e04353ced050d9887 ] Certain connection-based device-offload protocols (like TLS) use per-connection HW objects to track the state, maintain the context, and perform the offload properly. Some of these objects are created, modified, and destroyed via FW commands. Under high connection rate, this type of FW commands might continuously populate all slots of the FW command interface and throttle it, while starving other critical control FW commands. Limit these throttle commands to using only up to a portion (half) of the FW command interface slots. FW commands maximal rate is not hit, and the same high rate is still reached when applying this limitation. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Stable-dep-of: 8f5100da56b3 ("net/mlx5e: Fix a race in command alloc flow") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5: Introduce and use opcode getter in command interfaceTariq Toukan
[ Upstream commit 7cb5eb937231663d11f7817e366f6f86a142d6d3 ] Introduce an opcode getter in the FW command interface, and use it. Initialize the entry's opcode field early in cmd_alloc_ent() and use it when possible. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Stable-dep-of: 8f5100da56b3 ("net/mlx5e: Fix a race in command alloc flow") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01net/mlx5e: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list()Shifeng Li
[ Upstream commit ddb38ddff9c71026bad481b791a94d446ee37603 ] Out_sz that the size of out buffer is calculated using query_nic_vport _context_in structure when driver query the MAC list. However query_nic _vport_context_in structure is smaller than query_nic_vport_context_out. When allowed_list_size is greater than 96, calling ether_addr_copy() will trigger an slab-out-of-bounds. [ 1170.055866] BUG: KASAN: slab-out-of-bounds in mlx5_query_nic_vport_mac_list+0x481/0x4d0 [mlx5_core] [ 1170.055869] Read of size 4 at addr ffff88bdbc57d912 by task kworker/u128:1/461 [ 1170.055870] [ 1170.055932] Workqueue: mlx5_esw_wq esw_vport_change_handler [mlx5_core] [ 1170.055936] Call Trace: [ 1170.055949] dump_stack+0x8b/0xbb [ 1170.055958] print_address_description+0x6a/0x270 [ 1170.055961] kasan_report+0x179/0x2c0 [ 1170.056061] mlx5_query_nic_vport_mac_list+0x481/0x4d0 [mlx5_core] [ 1170.056162] esw_update_vport_addr_list+0x2c5/0xcd0 [mlx5_core] [ 1170.056257] esw_vport_change_handle_locked+0xd08/0x1a20 [mlx5_core] [ 1170.056377] esw_vport_change_handler+0x6b/0x90 [mlx5_core] [ 1170.056381] process_one_work+0x65f/0x12d0 [ 1170.056383] worker_thread+0x87/0xb50 [ 1170.056390] kthread+0x2e9/0x3a0 [ 1170.056394] ret_from_fork+0x1f/0x40 Fixes: e16aea2744ab ("net/mlx5: Introduce access functions to modify/query vport mac lists") Cc: Ding Hui <dinghui@sangfor.com.cn> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01Revert "net/mlx5e: fix double free of encap_header"Vlad Buslov
[ Upstream commit 5d089684dc434a31e08d32f0530066d0025c52e4 ] This reverts commit 6f9b1a0731662648949a1c0587f6acb3b7f8acf1. This patch is causing a null ptr issue, the proper fix is in the next patch. Fixes: 6f9b1a073166 ("net/mlx5e: fix double free of encap_header") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01Revert "net/mlx5e: fix double free of encap_header in update funcs"Vlad Buslov
[ Upstream commit 66ca8d4deca09bce3fc7bcf8ea7997fa1a51c33c ] This reverts commit 3a4aa3cb83563df942be49d145ee3b7ddf17d6bb. This patch is causing a null ptr issue, the proper fix is in the next patch. Fixes: 3a4aa3cb8356 ("net/mlx5e: fix double free of encap_header in update funcs") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01wifi: mac80211: mesh_plink: fix matches_local logicJohannes Berg
[ Upstream commit 8c386b166e2517cf3a123018e77941ec22625d0f ] During refactoring the "else" here got lost, add it back. Fixes: c99a89edb106 ("mac80211: factor out plink event gathering") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.795480fa0e0b.I017d501196a5bbdcd9afd33338d342d6fe1edd79@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01wifi: mac80211: mesh: check element parsing succeededJohannes Berg
[ Upstream commit 1fc4a3eec50d726f4663ad3c0bb0158354d6647a ] ieee802_11_parse_elems() can return NULL, so we must check for the return value. Fixes: 5d24828d05f3 ("mac80211: always allocate struct ieee802_11_elems") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.93dea364f3d3.Ie87781c6c48979fb25a744b90af4a33dc2d83a28@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01wifi: mac80211: check if the existing link config remains unchangedEdward Adam Davis
[ Upstream commit c1393c132b906fbdf91f6d1c9eb2ef7a00cce64e ] [Syz report] WARNING: CPU: 1 PID: 5067 at net/mac80211/rate.c:48 rate_control_rate_init+0x540/0x690 net/mac80211/rate.c:48 Modules linked in: CPU: 1 PID: 5067 Comm: syz-executor413 Not tainted 6.7.0-rc3-syzkaller-00014-gdf60cee26a2e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 RIP: 0010:rate_control_rate_init+0x540/0x690 net/mac80211/rate.c:48 Code: 48 c7 c2 00 46 0c 8c be 08 03 00 00 48 c7 c7 c0 45 0c 8c c6 05 70 79 0b 05 01 e8 1b a0 6f f7 e9 e0 fd ff ff e8 61 b3 8f f7 90 <0f> 0b 90 e9 36 ff ff ff e8 53 b3 8f f7 e8 5e 0b 78 f7 31 ff 89 c3 RSP: 0018:ffffc90003c57248 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff888016bc4000 RCX: ffffffff89f7d519 RDX: ffff888076d43b80 RSI: ffffffff89f7d6df RDI: 0000000000000005 RBP: ffff88801daaae20 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000001 R13: 0000000000000000 R14: ffff888020030e20 R15: ffff888078f08000 FS: 0000555556b94380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000005fdeb8 CR3: 0000000076d22000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> sta_apply_auth_flags.constprop.0+0x4b7/0x510 net/mac80211/cfg.c:1674 sta_apply_parameters+0xaf1/0x16c0 net/mac80211/cfg.c:2002 ieee80211_add_station+0x3fa/0x6c0 net/mac80211/cfg.c:2068 rdev_add_station net/wireless/rdev-ops.h:201 [inline] nl80211_new_station+0x13ba/0x1a70 net/wireless/nl80211.c:7603 genl_family_rcv_msg_doit+0x1fc/0x2e0 net/netlink/genetlink.c:972 genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline] genl_rcv_msg+0x561/0x800 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2545 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0x53b/0x810 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x93c/0xe40 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b [Analysis] It is inappropriate to make a link configuration change judgment on an non-existent and non new link. [Fix] Quickly exit when there is a existent link and the link configuration has not changed. Fixes: b303835dabe0 ("wifi: mac80211: accept STA changes without link changes") Reported-and-tested-by: syzbot+62d7eef57b09bfebcd84@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Link: https://msgid.link/tencent_DE67FF86DB92ED465489A36ECD2EDDCC8C06@qq.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01wifi: iwlwifi: pcie: add another missing bh-disable for rxq->lockJohannes Berg
[ Upstream commit a4754182dc936b97ec7e9f6b08cdf7ed97ef9069 ] Evidently I had only looked at all the ones in rx.c, and missed this. Add bh-disable to this use of the rxq->lock as well. Fixes: 25edc8f259c7 ("iwlwifi: pcie: properly implement NAPI") Reported-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231208183100.e79ad3dae649.I8f19713c4383707f8be7fc20ff5cc1ecf12429bb@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01s390/vx: fix save/restore of fpu kernel contextHeiko Carstens
[ Upstream commit e6b2dab41888332bf83f592131e7ea07756770a4 ] The KERNEL_FPR mask only contains a flag for the first eight vector registers. However floating point registers overlay parts of the first sixteen vector registers. This could lead to vector register corruption if a kernel fpu context uses any of the vector registers 8 to 15 and is interrupted or calls a KERNEL_FPR context. If that context uses also vector registers 8 to 15, their contents will be corrupted on return. Luckily this is currently not a real bug, since the kernel has only one KERNEL_FPR user with s390_adjust_jiffies() and it is only using floating point registers 0 to 2. Fix this by using the correct bits for KERNEL_FPR. Fixes: 7f79695cc1b6 ("s390/fpu: improve kernel_fpu_[begin|end]") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01reset: Fix crash when freeing non-existent optional resetsGeert Uytterhoeven
[ Upstream commit 4a6756f56bcf8e64c87144a626ce53aea4899c0e ] When obtaining one or more optional resets, non-existent resets are stored as NULL pointers, and all related error and cleanup paths need to take this into account. Currently only reset_control_put() and reset_control_bulk_put() get this right. All of __reset_control_bulk_get(), of_reset_control_array_get(), and reset_control_array_put() lack the proper checking, causing NULL pointer dereferences on failure or release. Fix this by moving the existing check from reset_control_bulk_put() to __reset_control_put_internal(), so it applies to all callers. The double check in reset_control_put() doesn't hurt. Fixes: 17c82e206d2a3cd8 ("reset: Add APIs to manage array of resets") Fixes: 48d71395896d54ee ("reset: Add reset_control_bulk API") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/2440edae7ca8534628cdbaf559ded288f2998178.1701276806.git.geert+renesas@glider.be Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01ARM: OMAP2+: Fix null pointer dereference and memory leak in ↵Kunwu Chan
omap_soc_device_init [ Upstream commit c72b9c33ef9695ad7ce7a6eb39a9df8a01b70796 ] kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. When 'soc_dev_attr->family' is NULL,it'll trigger the null pointer dereference issue, such as in 'soc_info_show'. And when 'soc_device_register' fails, it's necessary to release 'soc_dev_attr->family' to avoid memory leaks. Fixes: 6770b2114325 ("ARM: OMAP2+: Export SoC information to userspace") Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Message-ID: <20231123145237.609442-1-chentao@kylinos.cn> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01ARM: dts: dra7: Fix DRA7 L3 NoC node register sizeAndrew Davis
[ Upstream commit 1e5caee2ba8f1426e8098afb4ca38dc40a0ca71b ] This node can access any part of the L3 configuration registers space, including CLK1 and CLK2 which are 0x800000 offset. Restore this area size to include these areas. Fixes: 7f2659ce657e ("ARM: dts: Move dra7 l3 noc to a separate node") Signed-off-by: Andrew Davis <afd@ti.com> Message-ID: <20231113181604.546444-1-afd@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01drm/amd/display: fix hw rotated modes when PSR-SU is enabledHamza Mahfooz
[ Upstream commit f528ee145bd0076cd0ed7e7b2d435893e6329e98 ] We currently don't support dirty rectangles on hardware rotated modes. So, if a user is using hardware rotated modes with PSR-SU enabled, use PSR-SU FFU for all rotated planes (including cursor planes). Cc: stable@vger.kernel.org Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support") Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2952 Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Bin Li <binli@gnome.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01HID: i2c-hid: Add IDEA5002 to i2c_hid_acpi_blacklist[]Mario Limonciello
[ Upstream commit a9f68ffe1170ca4bc17ab29067d806a354a026e0 ] Users have reported problems with recent Lenovo laptops that contain an IDEA5002 I2C HID device. Reports include fans turning on and running even at idle and spurious wakeups from suspend. Presumably in the Windows ecosystem there is an application that uses the HID device. Maybe that puts it into a lower power state so it doesn't cause spurious events. This device doesn't serve any functional purpose in Linux as nothing interacts with it so blacklist it from being probed. This will prevent the GPIO driver from setting up the GPIO and the spurious interrupts and wake events will not occur. Cc: stable@vger.kernel.org # 6.1 Reported-and-tested-by: Marcus Aram <marcus+oss@oxar.nl> Reported-and-tested-by: Mark Herbert <mark.herbert42@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2812 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01HID: i2c-hid: acpi: Unify ACPI ID tables formatAndy Shevchenko
[ Upstream commit 4122abfed2193e752485282370abf5c419f05cad ] Unify ACPI ID tables format by: - surrounding HID by spaces - dropping unnecessary driver_data assignment to 0 - dropping comma at the terminator entry Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Stable-dep-of: a9f68ffe1170 ("HID: i2c-hid: Add IDEA5002 to i2c_hid_acpi_blacklist[]") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-01bpf: Fix prog_array_map_poke_run map poke updateJiri Olsa
commit 4b7de801606e504e69689df71475d27e35336fb3 upstream. Lee pointed out issue found by syscaller [0] hitting BUG in prog array map poke update in prog_array_map_poke_run function due to error value returned from bpf_arch_text_poke function. There's race window where bpf_arch_text_poke can fail due to missing bpf program kallsym symbols, which is accounted for with check for -EINVAL in that BUG_ON call. The problem is that in such case we won't update the tail call jump and cause imbalance for the next tail call update check which will fail with -EBUSY in bpf_arch_text_poke. I'm hitting following race during the program load: CPU 0 CPU 1 bpf_prog_load bpf_check do_misc_fixups prog_array_map_poke_track map_update_elem bpf_fd_array_map_update_elem prog_array_map_poke_run bpf_arch_text_poke returns -EINVAL bpf_prog_kallsyms_add After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next poke update fails on expected jump instruction check in bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run. Similar race exists on the program unload. Fixing this by moving the update to bpf_arch_poke_desc_update function which makes sure we call __bpf_arch_text_poke that skips the bpf address check. Each architecture has slightly different approach wrt looking up bpf address in bpf_arch_text_poke, so instead of splitting the function or adding new 'checkip' argument in previous version, it seems best to move the whole map_poke_run update as arch specific code. [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810 Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Cc: Lee Jones <lee@kernel.org> Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-01kasan: disable kasan_non_canonical_hook() for HW tagsArnd Bergmann
commit 17c17567fe510857b18fe01b7a88027600e76ac6 upstream. On arm64, building with CONFIG_KASAN_HW_TAGS now causes a compile-time error: mm/kasan/report.c: In function 'kasan_non_canonical_hook': mm/kasan/report.c:637:20: error: 'KASAN_SHADOW_OFFSET' undeclared (first use in this function) 637 | if (addr < KASAN_SHADOW_OFFSET) | ^~~~~~~~~~~~~~~~~~~ mm/kasan/report.c:637:20: note: each undeclared identifier is reported only once for each function it appears in mm/kasan/report.c:640:77: error: expected expression before ';' token 640 | orig_addr = (addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT; This was caused by removing the dependency on CONFIG_KASAN_INLINE that used to prevent this from happening. Use the more specific dependency on KASAN_SW_TAGS || KASAN_GENERIC to only ignore the function for hwasan mode. Link: https://lkml.kernel.org/r/20231016200925.984439-1-arnd@kernel.org Fixes: 12ec6a919b0f ("kasan: print the original fault addr when access invalid shadow") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Haibo Li <haibo.li@mediatek.com> Cc: Kees Cook <keescook@chromium.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20Linux 6.1.69Greg Kroah-Hartman
Link: https://lore.kernel.org/r/20231218135055.005497074@linuxfoundation.org Tested-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Pavel Machek (CIP) <pavel@denx.de> = Tested-by: SeongJae Park <sj@kernel.org> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: kernelci.org bot <bot@kernelci.org> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Yann Sionneau <ysionneau@kalrayinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20r8152: fix the autosuspend doesn't workHayes Wang
commit 0fbd79c01a9a657348f7032df70c57a406468c86 upstream. Set supports_autosuspend = 1 for the rtl8152_cfgselector_driver. Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20r8152: remove rtl_vendor_mode functionHayes Wang
commit 95a4c1d617b92cdc4522297741b56e8f6cd01a1e upstream. After commit ec51fbd1b8a2 ("r8152: add USB device driver for config selection"), the code about changing USB configuration in rtl_vendor_mode() wouldn't be run anymore. Therefore, the function could be removed. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20r8152: avoid to change cfg for all devicesHayes Wang
commit 0d4cda805a183bbe523f2407edb5c14ade50b841 upstream. The rtl8152_cfgselector_probe() should set the USB configuration to the vendor mode only for the devices which the driver (r8152) supports. Otherwise, no driver would be used for such devices. Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20net: tls, update curr on splice as wellJohn Fastabend
commit c5a595000e2677e865a39f249c056bc05d6e55fd upstream. The curr pointer must also be updated on the splice similar to how we do this for other copy types. Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Reported-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20231206232706.374377-2-john.fastabend@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20ring-buffer: Have rb_time_cmpxchg() set the msb counter tooSteven Rostedt (Google)
commit 0aa0e5289cfe984a8a9fdd79ccf46ccf080151f7 upstream. The rb_time_cmpxchg() on 32-bit architectures requires setting three 32-bit words to represent the 64-bit timestamp, with some salt for synchronization. Those are: msb, top, and bottom The issue is, the rb_time_cmpxchg() did not properly salt the msb portion, and the msb that was written was stale. Link: https://lore.kernel.org/linux-trace-kernel/20231215084114.20899342@rorschach.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: f03f2abce4f39 ("ring-buffer: Have 32 bit time stamps use all 64 bits") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20ring-buffer: Do not try to put back write_stampSteven Rostedt (Google)
commit dd939425707898da992e59ab0fcfae4652546910 upstream. If an update to an event is interrupted by another event between the time the initial event allocated its buffer and where it wrote to the write_stamp, the code try to reset the write stamp back to the what it had just overwritten. It knows that it was overwritten via checking the before_stamp, and if it didn't match what it wrote to the before_stamp before it allocated its space, it knows it was overwritten. To put back the write_stamp, it uses the before_stamp it read. The problem here is that by writing the before_stamp to the write_stamp it makes the two equal again, which means that the write_stamp can be considered valid as the last timestamp written to the ring buffer. But this is not necessarily true. The event that interrupted the event could have been interrupted in a way that it was interrupted as well, and can end up leaving with an invalid write_stamp. But if this happens and returns to this context that uses the before_stamp to update the write_stamp again, it can possibly incorrectly make it valid, causing later events to have in correct time stamps. As it is OK to leave this function with an invalid write_stamp (one that doesn't match the before_stamp), there's no reason to try to make it valid again in this case. If this race happens, then just leave with the invalid write_stamp and the next event to come along will just add a absolute timestamp and validate everything again. Bonus points: This gets rid of another cmpxchg64! Link: https://lore.kernel.org/linux-trace-kernel/20231214222921.193037a7@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Vincent Donnefort <vdonnefort@google.com> Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archsSteven Rostedt (Google)
commit fff88fa0fbc7067ba46dde570912d63da42c59a9 upstream. Mathieu Desnoyers pointed out an issue in the rb_time_cmpxchg() for 32 bit architectures. That is: static bool rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set) { unsigned long cnt, top, bottom, msb; unsigned long cnt2, top2, bottom2, msb2; u64 val; /* The cmpxchg always fails if it interrupted an update */ if (!__rb_time_read(t, &val, &cnt2)) return false; if (val != expect) return false; <<<< interrupted here! cnt = local_read(&t->cnt); The problem is that the synchronization counter in the rb_time_t is read *after* the value of the timestamp is read. That means if an interrupt were to come in between the value being read and the counter being read, it can change the value and the counter and the interrupted process would be clueless about it! The counter needs to be read first and then the value. That way it is easy to tell if the value is stale or not. If the counter hasn't been updated, then the value is still good. Link: https://lore.kernel.org/linux-trace-kernel/20231211201324.652870-1-mathieu.desnoyers@efficios.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231212115301.7a9c9a64@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 10464b4aa605e ("ring-buffer: Add rb_time_t 64 bit operations for speeding up 32 bit") Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20ring-buffer: Fix writing to the buffer with max_data_sizeSteven Rostedt (Google)
commit b3ae7b67b87fed771fa5bf95389df06b0433603e upstream. The maximum ring buffer data size is the maximum size of data that can be recorded on the ring buffer. Events must be smaller than the sub buffer data size minus any meta data. This size is checked before trying to allocate from the ring buffer because the allocation assumes that the size will fit on the sub buffer. The maximum size was calculated as the size of a sub buffer page (which is currently PAGE_SIZE minus the sub buffer header) minus the size of the meta data of an individual event. But it missed the possible adding of a time stamp for events that are added long enough apart that the event meta data can't hold the time delta. When an event is added that is greater than the current BUF_MAX_DATA_SIZE minus the size of a time stamp, but still less than or equal to BUF_MAX_DATA_SIZE, the ring buffer would go into an infinite loop, looking for a page that can hold the event. Luckily, there's a check for this loop and after 1000 iterations and a warning is emitted and the ring buffer is disabled. But this should never happen. This can happen when a large event is added first, or after a long period where an absolute timestamp is prefixed to the event, increasing its size by 8 bytes. This passes the check and then goes into the algorithm that causes the infinite loop. For events that are the first event on the sub-buffer, it does not need to add a timestamp, because the sub-buffer itself contains an absolute timestamp, and adding one is redundant. The fix is to check if the event is to be the first event on the sub-buffer, and if it is, then do not add a timestamp. This also fixes 32 bit adding a timestamp when a read of before_stamp or write_stamp is interrupted. There's still no need to add that timestamp if the event is going to be the first event on the sub buffer. Also, if the buffer has "time_stamp_abs" set, then also check if the length plus the timestamp is greater than the BUF_MAX_DATA_SIZE. Link: https://lore.kernel.org/all/20231212104549.58863438@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20231212071837.5fdd6c13@gandalf.local.home Link: https://lore.kernel.org/linux-trace-kernel/20231212111617.39e02849@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: a4543a2fa9ef3 ("ring-buffer: Get timestamp after event is allocated") Fixes: 58fbc3c63275c ("ring-buffer: Consolidate add_timestamp to remove some branches") Reported-by: Kent Overstreet <kent.overstreet@linux.dev> # (on IRC) Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20ring-buffer: Have saved event hold the entire eventSteven Rostedt (Google)
commit b049525855fdd0024881c9b14b8fbec61c3f53d3 upstream. For the ring buffer iterator (non-consuming read), the event needs to be copied into the iterator buffer to make sure that a writer does not overwrite it while the user is reading it. If a write happens during the copy, the buffer is simply discarded. But the temp buffer itself was not big enough. The allocation of the buffer was only BUF_MAX_DATA_SIZE, which is the maximum data size that can be passed into the ring buffer and saved. But the temp buffer needs to hold the meta data as well. That would be BUF_PAGE_SIZE and not BUF_MAX_DATA_SIZE. Link: https://lore.kernel.org/linux-trace-kernel/20231212072558.61f76493@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 785888c544e04 ("ring-buffer: Have rb_iter_head_event() handle concurrent writer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20ring-buffer: Do not update before stamp when switching sub-buffersSteven Rostedt (Google)
commit 9e45e39dc249c970d99d2681f6bcb55736fd725c upstream. The ring buffer timestamps are synchronized by two timestamp placeholders. One is the "before_stamp" and the other is the "write_stamp" (sometimes referred to as the "after stamp" but only in the comments. These two stamps are key to knowing how to handle nested events coming in with a lockless system. When moving across sub-buffers, the before stamp is updated but the write stamp is not. There's an effort to put back the before stamp to something that seems logical in case there's nested events. But as the current event is about to cross sub-buffers, and so will any new nested event that happens, updating the before stamp is useless, and could even introduce new race conditions. The first event on a sub-buffer simply uses the sub-buffer's timestamp and keeps a "delta" of zero. The "before_stamp" and "write_stamp" are not used in the algorithm in this case. There's no reason to try to fix the before_stamp when this happens. As a bonus, it removes a cmpxchg() when crossing sub-buffers! Link: https://lore.kernel.org/linux-trace-kernel/20231211114420.36dde01b@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20tracing: Update snapshot buffer on resize if it is allocatedSteven Rostedt (Google)
commit d06aff1cb13d2a0d52b48e605462518149c98c81 upstream. The snapshot buffer is to mimic the main buffer so that when a snapshot is needed, the snapshot and main buffer are swapped. When the snapshot buffer is allocated, it is set to the minimal size that the ring buffer may be at and still functional. When it is allocated it becomes the same size as the main ring buffer, and when the main ring buffer changes in size, it should do. Currently, the resize only updates the snapshot buffer if it's used by the current tracer (ie. the preemptirqsoff tracer). But it needs to be updated anytime it is allocated. When changing the size of the main buffer, instead of looking to see if the current tracer is utilizing the snapshot buffer, just check if it is allocated to know if it should be updated or not. Also fix typo in comment just above the code change. Link: https://lore.kernel.org/linux-trace-kernel/20231210225447.48476a6a@rorschach.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: ad909e21bbe69 ("tracing: Add internal tracing_snapshot() functions") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>