Age | Commit message (Collapse) | Author |
|
[ Upstream commit 39299bdd2546688d92ed9db4948f6219ca1b9542 ]
If a key has an expiration time, then when that time passes, the key is
left around for a certain amount of time before being collected (5 mins by
default) so that EKEYEXPIRED can be returned instead of ENOKEY. This is a
problem for DNS keys because we want to redo the DNS lookup immediately at
that point.
Fix this by allowing key types to be marked such that keys of that type
don't have this extra period, but are reclaimed as soon as they expire and
turn this on for dns_resolver-type keys. To make this easier to handle,
key->expiry is changed to be permanent if TIME64_MAX rather than 0.
Furthermore, give such new-style negative DNS results a 1s default expiry
if no other expiry time is set rather than allowing it to stick around
indefinitely. This shouldn't be zero as ls will follow a failing stat call
immediately with a second with AT_SYMLINK_NOFOLLOW added.
Fixes: 1a4240f4764a ("DNS: Separate out CIFS DNS Resolver code")
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: Wang Lei <wang840925@gmail.com>
cc: Jeff Layton <jlayton@redhat.com>
cc: Steve French <smfrench@gmail.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jarkko Sakkinen <jarkko@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-afs@lists.infradead.org
cc: linux-cifs@vger.kernel.org
cc: linux-nfs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: keyrings@vger.kernel.org
cc: netdev@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 24ab059d2ebd62fdccc43794796f6ffbabe49ebc ]
Some drivers might misbehave if TSO packets get too big.
GVE for instance uses a 16bit field in its TX descriptor,
and will do bad things if a packet is bigger than 2^16 bytes.
Linux TCP stack honors dev->gso_max_size, but there are
other ways for too big packets to reach an ndo_start_xmit()
handler : virtio_net, af_packet, GRO...
Add a generic check in gso_features_check() and fallback
to GSO when needed.
gso_max_size was added in the blamed commit.
Fixes: 82cc1a7a5687 ("[NET]: Add per-connection option to set max TSO frame size")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231219125331.4127498-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 74cef6872ceaefb5b6c5c60641371ea28702d358 ]
In the afs dynamic root directory, the ->lookup() function does a DNS check
on the cell being asked for and if the DNS upcall reports an error it will
report an error back to userspace (typically ENOENT).
However, if a failed DNS upcall returns a new-style result, it will return
a valid result, with the status field set appropriately to indicate the
type of failure - and in that case, dns_query() doesn't return an error and
we let stat() complete with no error - which can cause confusion in
userspace as subsequent calls that trigger d_automount then fail with
ENOENT.
Fix this by checking the status result from a valid dns_query() and
returning an error if it indicates a failure.
Fixes: bbb4c4323a4d ("dns: Allow the dns resolver to retrieve a server set")
Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216637
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 71f8b55bc30e82d6355e07811213d847981a32e2 ]
Fix the afs dynamic root's d_delete function to always delete unused
dentries rather than only deleting them if they're positive. With things
as they stand upstream, negative dentries stemming from failed DNS lookups
stick around preventing retries.
Fixes: 66c7e1d319a5 ("afs: Split the dynroot stuff out and give it its own ops tables")
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
vlan_vids_del_by_dev()
[ Upstream commit 01a564bab4876007ce35f312e16797dfe40e4823 ]
I got the below warning trace:
WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
Call Trace:
rtnl_dellink
rtnetlink_rcv_msg
netlink_rcv_skb
netlink_unicast
netlink_sendmsg
__sock_sendmsg
____sys_sendmsg
___sys_sendmsg
__sys_sendmsg
do_syscall_64
entry_SYSCALL_64_after_hwframe
It can be repoduced via:
ip netns add ns1
ip netns exec ns1 ip link add bond0 type bond mode 0
ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
ip netns exec ns1 ip link set bond_slave_1 master bond0
[1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
[2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
[3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
[4] ip netns exec ns1 ip link set bond_slave_1 nomaster
[5] ip netns exec ns1 ip link del veth2
ip netns del ns1
This is all caused by command [1] turning off the rx-vlan-filter function
of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
[2] [3] add the same vid to slave and master respectively, causing
command [4] to empty slave->vlan_info. The following command [5] triggers
this problem.
To fix this problem, we should add VLAN_FILTER feature checks in
vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
addition or deletion of vlan_vid information.
Fixes: 348a1443cc43 ("vlan: introduce functions to do mass addition/deletion of vids by another device")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 340943fbff3d8faa44d2223ca04917df28786a07 ]
Mana uses PAGE_POOL API. x86_64 defconfig doesn't select it:
ld: vmlinux.o: in function `mana_create_page_pool.isra.0':
mana_en.c:(.text+0x9ae36f): undefined reference to `page_pool_create'
ld: vmlinux.o: in function `mana_get_rxfrag':
mana_en.c:(.text+0x9afed1): undefined reference to `page_pool_alloc_pages'
make[3]: *** [/home/yury/work/linux/scripts/Makefile.vmlinux:37: vmlinux] Error 1
make[2]: *** [/home/yury/work/linux/Makefile:1154: vmlinux] Error 2
make[1]: *** [/home/yury/work/linux/Makefile:234: __sub-make] Error 2
make[1]: Leaving directory '/home/yury/work/build-linux-x86_64'
make: *** [Makefile:234: __sub-make] Error 2
So we need to select it explicitly.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter")
Link: https://lore.kernel.org/r/20231215203353.635379-1-yury.norov@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a5812c68d849505ea657f653446512b85887f813 ]
Turning on -Wstringop-overflow globally exposed a misleading compiler
warning in bluetooth:
net/bluetooth/hci_event.c: In function 'hci_cc_read_class_of_dev':
net/bluetooth/hci_event.c:524:9: error: 'memcpy' writing 3 bytes into a
region of size 0 overflows the destination [-Werror=stringop-overflow=]
524 | memcpy(hdev->dev_class, rp->dev_class, 3);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The problem here is the check for hdev being NULL in bt_dev_dbg() that
leads the compiler to conclude that hdev->dev_class might be an invalid
pointer access.
Add another explicit check for the same condition to make sure gcc sees
this cannot happen.
Fixes: a9de9248064b ("[Bluetooth] Switch from OGF+OCF to using only opcodes")
Fixes: 1b56c90018f0 ("Makefile: Enable -Wstringop-overflow globally")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 769bf60e17ee1a56a81e7c031192c3928312c52e ]
syzbot found a potential circular dependency leading to a deadlock:
-> #3 (&hdev->req_lock){+.+.}-{3:3}:
__mutex_lock_common+0x1b6/0x1bc2 kernel/locking/mutex.c:599
__mutex_lock kernel/locking/mutex.c:732 [inline]
mutex_lock_nested+0x17/0x1c kernel/locking/mutex.c:784
hci_dev_do_close+0x3f/0x9f net/bluetooth/hci_core.c:551
hci_rfkill_set_block+0x130/0x1ac net/bluetooth/hci_core.c:935
rfkill_set_block+0x1e6/0x3b8 net/rfkill/core.c:345
rfkill_fop_write+0x2d8/0x672 net/rfkill/core.c:1274
vfs_write+0x277/0xcf5 fs/read_write.c:594
ksys_write+0x19b/0x2bd fs/read_write.c:650
do_syscall_x64 arch/x86/entry/common.c:55 [inline]
do_syscall_64+0x51/0xba arch/x86/entry/common.c:93
entry_SYSCALL_64_after_hwframe+0x61/0xcb
-> #2 (rfkill_global_mutex){+.+.}-{3:3}:
__mutex_lock_common+0x1b6/0x1bc2 kernel/locking/mutex.c:599
__mutex_lock kernel/locking/mutex.c:732 [inline]
mutex_lock_nested+0x17/0x1c kernel/locking/mutex.c:784
rfkill_register+0x30/0x7e3 net/rfkill/core.c:1045
hci_register_dev+0x48f/0x96d net/bluetooth/hci_core.c:2622
__vhci_create_device drivers/bluetooth/hci_vhci.c:341 [inline]
vhci_create_device+0x3ad/0x68f drivers/bluetooth/hci_vhci.c:374
vhci_get_user drivers/bluetooth/hci_vhci.c:431 [inline]
vhci_write+0x37b/0x429 drivers/bluetooth/hci_vhci.c:511
call_write_iter include/linux/fs.h:2109 [inline]
new_sync_write fs/read_write.c:509 [inline]
vfs_write+0xaa8/0xcf5 fs/read_write.c:596
ksys_write+0x19b/0x2bd fs/read_write.c:650
do_syscall_x64 arch/x86/entry/common.c:55 [inline]
do_syscall_64+0x51/0xba arch/x86/entry/common.c:93
entry_SYSCALL_64_after_hwframe+0x61/0xcb
-> #1 (&data->open_mutex){+.+.}-{3:3}:
__mutex_lock_common+0x1b6/0x1bc2 kernel/locking/mutex.c:599
__mutex_lock kernel/locking/mutex.c:732 [inline]
mutex_lock_nested+0x17/0x1c kernel/locking/mutex.c:784
vhci_send_frame+0x68/0x9c drivers/bluetooth/hci_vhci.c:75
hci_send_frame+0x1cc/0x2ff net/bluetooth/hci_core.c:2989
hci_sched_acl_pkt net/bluetooth/hci_core.c:3498 [inline]
hci_sched_acl net/bluetooth/hci_core.c:3583 [inline]
hci_tx_work+0xb94/0x1a60 net/bluetooth/hci_core.c:3654
process_one_work+0x901/0xfb8 kernel/workqueue.c:2310
worker_thread+0xa67/0x1003 kernel/workqueue.c:2457
kthread+0x36a/0x430 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
-> #0 ((work_completion)(&hdev->tx_work)){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3053 [inline]
check_prevs_add kernel/locking/lockdep.c:3172 [inline]
validate_chain kernel/locking/lockdep.c:3787 [inline]
__lock_acquire+0x2d32/0x77fa kernel/locking/lockdep.c:5011
lock_acquire+0x273/0x4d5 kernel/locking/lockdep.c:5622
__flush_work+0xee/0x19f kernel/workqueue.c:3090
hci_dev_close_sync+0x32f/0x1113 net/bluetooth/hci_sync.c:4352
hci_dev_do_close+0x47/0x9f net/bluetooth/hci_core.c:553
hci_rfkill_set_block+0x130/0x1ac net/bluetooth/hci_core.c:935
rfkill_set_block+0x1e6/0x3b8 net/rfkill/core.c:345
rfkill_fop_write+0x2d8/0x672 net/rfkill/core.c:1274
vfs_write+0x277/0xcf5 fs/read_write.c:594
ksys_write+0x19b/0x2bd fs/read_write.c:650
do_syscall_x64 arch/x86/entry/common.c:55 [inline]
do_syscall_64+0x51/0xba arch/x86/entry/common.c:93
entry_SYSCALL_64_after_hwframe+0x61/0xcb
This change removes the need for acquiring the open_mutex in
vhci_send_frame, thus eliminating the potential deadlock while
maintaining the required packet ordering.
Fixes: 92d4abd66f70 ("Bluetooth: vhci: Fix race when opening vhci device")
Signed-off-by: Ying Hsu <yinghsu@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 64b8bc7d5f1434c636a40bdcfcd42b278d1714be ]
syzbot found an interesting netdev refcounting issue in
net/rose/af_rose.c, thanks to CONFIG_NET_DEV_REFCNT_TRACKER=y [1]
Problem is that rose_kill_by_device() can change rose->device
while other threads do not expect the pointer to be changed.
We have to first collect sockets in a temporary array,
then perform the changes while holding the socket
lock and rose_list_lock spinlock (in this order)
Change rose_release() to also acquire rose_list_lock
before releasing the netdev refcount.
[1]
[ 1185.055088][ T7889] ref_tracker: reference already released.
[ 1185.061476][ T7889] ref_tracker: allocated in:
[ 1185.066081][ T7889] rose_bind+0x4ab/0xd10
[ 1185.070446][ T7889] __sys_bind+0x1ec/0x220
[ 1185.074818][ T7889] __x64_sys_bind+0x72/0xb0
[ 1185.079356][ T7889] do_syscall_64+0x40/0x110
[ 1185.083897][ T7889] entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 1185.089835][ T7889] ref_tracker: freed in:
[ 1185.094088][ T7889] rose_release+0x2f5/0x570
[ 1185.098629][ T7889] __sock_release+0xae/0x260
[ 1185.103262][ T7889] sock_close+0x1c/0x20
[ 1185.107453][ T7889] __fput+0x270/0xbb0
[ 1185.111467][ T7889] task_work_run+0x14d/0x240
[ 1185.116085][ T7889] get_signal+0x106f/0x2790
[ 1185.120622][ T7889] arch_do_signal_or_restart+0x90/0x7f0
[ 1185.126205][ T7889] exit_to_user_mode_prepare+0x121/0x240
[ 1185.131846][ T7889] syscall_exit_to_user_mode+0x1e/0x60
[ 1185.137293][ T7889] do_syscall_64+0x4d/0x110
[ 1185.141783][ T7889] entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 1185.148085][ T7889] ------------[ cut here ]------------
WARNING: CPU: 1 PID: 7889 at lib/ref_tracker.c:255 ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255
Modules linked in:
CPU: 1 PID: 7889 Comm: syz-executor.2 Not tainted 6.7.0-rc4-syzkaller-00162-g65c95f78917e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255
Code: 00 44 8b 6b 18 31 ff 44 89 ee e8 21 62 f5 fc 45 85 ed 0f 85 a6 00 00 00 e8 a3 66 f5 fc 48 8b 34 24 48 89 ef e8 27 5f f1 05 90 <0f> 0b 90 bb ea ff ff ff e9 52 fd ff ff e8 84 66 f5 fc 4c 8d 6d 44
RSP: 0018:ffffc90004917850 EFLAGS: 00010202
RAX: 0000000000000201 RBX: ffff88802618f4c0 RCX: 0000000000000000
RDX: 0000000000000202 RSI: ffffffff8accb920 RDI: 0000000000000001
RBP: ffff8880269ea5b8 R08: 0000000000000001 R09: fffffbfff23e35f6
R10: ffffffff91f1afb7 R11: 0000000000000001 R12: 1ffff92000922f0c
R13: 0000000005a2039b R14: ffff88802618f4d8 R15: 00000000ffffffff
FS: 00007f0a720ef6c0(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f43a819d988 CR3: 0000000076c64000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
netdev_tracker_free include/linux/netdevice.h:4127 [inline]
netdev_put include/linux/netdevice.h:4144 [inline]
netdev_put include/linux/netdevice.h:4140 [inline]
rose_kill_by_device net/rose/af_rose.c:195 [inline]
rose_device_event+0x25d/0x330 net/rose/af_rose.c:218
notifier_call_chain+0xb6/0x3b0 kernel/notifier.c:93
call_netdevice_notifiers_info+0xbe/0x130 net/core/dev.c:1967
call_netdevice_notifiers_extack net/core/dev.c:2005 [inline]
call_netdevice_notifiers net/core/dev.c:2019 [inline]
__dev_notify_flags+0x1f5/0x2e0 net/core/dev.c:8646
dev_change_flags+0x122/0x170 net/core/dev.c:8682
dev_ifsioc+0x9ad/0x1090 net/core/dev_ioctl.c:529
dev_ioctl+0x224/0x1090 net/core/dev_ioctl.c:786
sock_do_ioctl+0x198/0x270 net/socket.c:1234
sock_ioctl+0x22e/0x6b0 net/socket.c:1339
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:871 [inline]
__se_sys_ioctl fs/ioctl.c:857 [inline]
__x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f0a7147cba9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f0a720ef0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f0a7159bf80 RCX: 00007f0a7147cba9
RDX: 0000000020000040 RSI: 0000000000008914 RDI: 0000000000000004
RBP: 00007f0a714c847a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f0a7159bf80 R15: 00007ffc8bb3a5f8
</TASK>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 309fdb1c33fe726d92d0030481346f24e1b01f07 ]
In the error handling of 'offset > adapter->ring_size', the
tx_ring->tx_buffer allocated by kzalloc should be freed,
instead of 'goto failed' instantly.
Fixes: a6a5325239c2 ("atl1e: Atheros L1E Gigabit Ethernet driver")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Reviewed-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 19391a2ca98baa7b80279306cdf7dd43f81fa595 ]
ife_decode() calls pskb_may_pull() two times, we need to reload
ifehdr after the second one, or risk use-after-free as reported
by syzbot:
BUG: KASAN: slab-use-after-free in __ife_tlv_meta_valid net/ife/ife.c:108 [inline]
BUG: KASAN: slab-use-after-free in ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131
Read of size 2 at addr ffff88802d7300a4 by task syz-executor.5/22323
CPU: 0 PID: 22323 Comm: syz-executor.5 Not tainted 6.7.0-rc3-syzkaller-00804-g074ac38d5b95 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:364 [inline]
print_report+0xc4/0x620 mm/kasan/report.c:475
kasan_report+0xda/0x110 mm/kasan/report.c:588
__ife_tlv_meta_valid net/ife/ife.c:108 [inline]
ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131
tcf_ife_decode net/sched/act_ife.c:739 [inline]
tcf_ife_act+0x4e3/0x1cd0 net/sched/act_ife.c:879
tc_act include/net/tc_wrapper.h:221 [inline]
tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079
tcf_exts_exec include/net/pkt_cls.h:344 [inline]
mall_classify+0x201/0x310 net/sched/cls_matchall.c:42
tc_classify include/net/tc_wrapper.h:227 [inline]
__tcf_classify net/sched/cls_api.c:1703 [inline]
tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800
hfsc_classify net/sched/sch_hfsc.c:1147 [inline]
hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546
dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739
__dev_xmit_skb net/core/dev.c:3828 [inline]
__dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311
dev_queue_xmit include/linux/netdevice.h:3165 [inline]
packet_xmit+0x237/0x350 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3081 [inline]
packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7fe9acc7cae9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fe9ada450c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007fe9acd9bf80 RCX: 00007fe9acc7cae9
RDX: 000000000000fce0 RSI: 00000000200002c0 RDI: 0000000000000003
RBP: 00007fe9accc847a R08: 0000000020000140 R09: 0000000000000014
R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007fe9acd9bf80 R15: 00007ffd5427ae78
</TASK>
Allocated by task 22323:
kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
____kasan_kmalloc mm/kasan/common.c:374 [inline]
__kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383
kasan_kmalloc include/linux/kasan.h:198 [inline]
__do_kmalloc_node mm/slab_common.c:1007 [inline]
__kmalloc_node_track_caller+0x5a/0x90 mm/slab_common.c:1027
kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582
__alloc_skb+0x12b/0x330 net/core/skbuff.c:651
alloc_skb include/linux/skbuff.h:1298 [inline]
alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
packet_alloc_skb net/packet/af_packet.c:2930 [inline]
packet_snd net/packet/af_packet.c:3024 [inline]
packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Freed by task 22323:
kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200
kasan_slab_free include/linux/kasan.h:164 [inline]
slab_free_hook mm/slub.c:1800 [inline]
slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826
slab_free mm/slub.c:3809 [inline]
__kmem_cache_free+0xc0/0x180 mm/slub.c:3822
skb_kfree_head net/core/skbuff.c:950 [inline]
skb_free_head+0x110/0x1b0 net/core/skbuff.c:962
pskb_expand_head+0x3c5/0x1170 net/core/skbuff.c:2130
__pskb_pull_tail+0xe1/0x1830 net/core/skbuff.c:2655
pskb_may_pull_reason include/linux/skbuff.h:2685 [inline]
pskb_may_pull include/linux/skbuff.h:2693 [inline]
ife_decode+0x394/0x4f0 net/ife/ife.c:82
tcf_ife_decode net/sched/act_ife.c:727 [inline]
tcf_ife_act+0x43b/0x1cd0 net/sched/act_ife.c:879
tc_act include/net/tc_wrapper.h:221 [inline]
tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079
tcf_exts_exec include/net/pkt_cls.h:344 [inline]
mall_classify+0x201/0x310 net/sched/cls_matchall.c:42
tc_classify include/net/tc_wrapper.h:227 [inline]
__tcf_classify net/sched/cls_api.c:1703 [inline]
tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800
hfsc_classify net/sched/sch_hfsc.c:1147 [inline]
hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546
dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739
__dev_xmit_skb net/core/dev.c:3828 [inline]
__dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311
dev_queue_xmit include/linux/netdevice.h:3165 [inline]
packet_xmit+0x237/0x350 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3081 [inline]
packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
The buggy address belongs to the object at ffff88802d730000
which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 164 bytes inside of
freed 8192-byte region [ffff88802d730000, ffff88802d732000)
The buggy address belongs to the physical page:
page:ffffea0000b5cc00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2d730
head:ffffea0000b5cc00 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000840 ffff888013042280 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 22323, tgid 22320 (syz-executor.5), ts 950317230369, free_ts 950233467461
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x2d0/0x350 mm/page_alloc.c:1544
prep_new_page mm/page_alloc.c:1551 [inline]
get_page_from_freelist+0xa28/0x3730 mm/page_alloc.c:3319
__alloc_pages+0x22e/0x2420 mm/page_alloc.c:4575
alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133
alloc_slab_page mm/slub.c:1870 [inline]
allocate_slab mm/slub.c:2017 [inline]
new_slab+0x283/0x3c0 mm/slub.c:2070
___slab_alloc+0x979/0x1500 mm/slub.c:3223
__slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322
__slab_alloc_node mm/slub.c:3375 [inline]
slab_alloc_node mm/slub.c:3468 [inline]
__kmem_cache_alloc_node+0x131/0x310 mm/slub.c:3517
__do_kmalloc_node mm/slab_common.c:1006 [inline]
__kmalloc_node_track_caller+0x4a/0x90 mm/slab_common.c:1027
kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582
__alloc_skb+0x12b/0x330 net/core/skbuff.c:651
alloc_skb include/linux/skbuff.h:1298 [inline]
alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
packet_alloc_skb net/packet/af_packet.c:2930 [inline]
packet_snd net/packet/af_packet.c:3024 [inline]
packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1144 [inline]
free_unref_page_prepare+0x53c/0xb80 mm/page_alloc.c:2354
free_unref_page+0x33/0x3b0 mm/page_alloc.c:2494
__unfreeze_partials+0x226/0x240 mm/slub.c:2655
qlink_free mm/kasan/quarantine.c:168 [inline]
qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294
__kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305
kasan_slab_alloc include/linux/kasan.h:188 [inline]
slab_post_alloc_hook mm/slab.h:763 [inline]
slab_alloc_node mm/slub.c:3478 [inline]
slab_alloc mm/slub.c:3486 [inline]
__kmem_cache_alloc_lru mm/slub.c:3493 [inline]
kmem_cache_alloc_lru+0x219/0x6f0 mm/slub.c:3509
alloc_inode_sb include/linux/fs.h:2937 [inline]
ext4_alloc_inode+0x28/0x650 fs/ext4/super.c:1408
alloc_inode+0x5d/0x220 fs/inode.c:261
new_inode_pseudo fs/inode.c:1006 [inline]
new_inode+0x22/0x260 fs/inode.c:1032
__ext4_new_inode+0x333/0x5200 fs/ext4/ialloc.c:958
ext4_symlink+0x5d7/0xa20 fs/ext4/namei.c:3398
vfs_symlink fs/namei.c:4464 [inline]
vfs_symlink+0x3e5/0x620 fs/namei.c:4448
do_symlinkat+0x25f/0x310 fs/namei.c:4490
__do_sys_symlinkat fs/namei.c:4506 [inline]
__se_sys_symlinkat fs/namei.c:4503 [inline]
__x64_sys_symlinkat+0x97/0xc0 fs/namei.c:4503
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
Fixes: d57493d6d1be ("net: sched: ife: check on metadata length")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Alexander Aring <aahringo@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cac23b7d7627915d967ce25436d7aae26e88ed06 ]
The following NULL pointer dereference issue occurred:
BUG: kernel NULL pointer dereference, address: 0000000000000000
<...>
RIP: 0010:ccid_hc_tx_send_packet net/dccp/ccid.h:166 [inline]
RIP: 0010:dccp_write_xmit+0x49/0x140 net/dccp/output.c:356
<...>
Call Trace:
<TASK>
dccp_sendmsg+0x642/0x7e0 net/dccp/proto.c:801
inet_sendmsg+0x63/0x90 net/ipv4/af_inet.c:846
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x83/0xe0 net/socket.c:745
____sys_sendmsg+0x443/0x510 net/socket.c:2558
___sys_sendmsg+0xe5/0x150 net/socket.c:2612
__sys_sendmsg+0xa6/0x120 net/socket.c:2641
__do_sys_sendmsg net/socket.c:2650 [inline]
__se_sys_sendmsg net/socket.c:2648 [inline]
__x64_sys_sendmsg+0x45/0x50 net/socket.c:2648
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x43/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
sk_wait_event() returns an error (-EPIPE) if disconnect() is called on the
socket waiting for the event. However, sk_stream_wait_connect() returns
success, i.e. zero, even if sk_wait_event() returns -EPIPE, so a function
that waits for a connection with sk_stream_wait_connect() may misbehave.
In the case of the above DCCP issue, dccp_sendmsg() is waiting for the
connection. If disconnect() is called in concurrently, the above issue
occurs.
This patch fixes the issue by returning error from sk_stream_wait_connect()
if sk_wait_event() fails.
Fixes: 419ce133ab92 ("tcp: allow again tcp_disconnect() when threads are waiting")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reported-by: syzbot+c71bc336c5061153b502@syzkaller.appspotmail.com
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8c97ab5448f2096daba11edf8d18a44e1eb6f31d ]
During PFC configuration failure the code was not handling a graceful
exit. This patch fixes the same and add proper code for a graceful exit.
Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 52eda4641d041667fa059f4855c5f88dcebd8afe ]
There is a typo in the driver due to which we report incorrect TX RMON
counters for the 256-511 octet bucket and all the other buckets larger
than that.
Bug found with the selftest at
https://patchwork.kernel.org/project/netdevbpf/patch/20231211223346.2497157-9-tobias@waldekranz.com/
Fixes: e32036e1ae7b ("net: mscc: ocelot: add support for all sorts of standardized counters present in DSA")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231214000902.545625-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
by representors
[ Upstream commit b13559b76157de9d74f04d3ca0e49d69de3b5675 ]
snprintf returns the length of the formatted string, excluding the trailing
null, without accounting for truncation. This means that is the return
value is greater than or equal to the size parameter, the fw_version string
was truncated.
Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf
Fixes: 1b2bd0c0264f ("net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors")
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ad436b9c1270c40554e274f067f1b78fcc06a004 ]
snprintf returns the length of the formatted string, excluding the trailing
null, without accounting for truncation. This means that is the return
value is greater than or equal to the size parameter, the fw_version string
was truncated.
Reported-by: David Laight <David.Laight@ACULAB.COM>
Closes: https://lore.kernel.org/netdev/81cae734ee1b4cde9b380a9a31006c1a@AcuMS.aculab.com/
Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf
Fixes: 41e63c2baa11 ("net/mlx5e: Check return value of snprintf writing to fw_version buffer")
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4261edf11cb7c9224af713a102e5616329306932 ]
While handling new traces, to verify it is not the first block being
written, last_timestamp is checked. But instead of checking it is non
zero it is verified to be zero. Fix to verify last_timestamp is not
zero.
Fixes: c71ad41ccb0c ("net/mlx5: FW tracer, events handling")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Feras Daoud <ferasda@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e75efc6466ae289e599fb12a5a86545dff245c65 ]
When kcalloc() for ft->g succeeds but kvzalloc() for in fails,
fs_udp_create_groups() will free ft->g. However, its caller
fs_udp_create_table() will free ft->g again through calling
mlx5e_destroy_flow_table(), which will lead to a double-free.
Fix this by setting ft->g to NULL in fs_udp_create_groups().
Fixes: 1c80bd684388 ("net/mlx5e: Introduce Flow Steering UDP API")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8f5100da56b3980276234e812ce98d8f075194cd ]
Fix a cmd->ent use after free due to a race on command entry.
Such race occurs when one of the commands releases its last refcount and
frees its index and entry while another process running command flush
flow takes refcount to this command entry. The process which handles
commands flush may see this command as needed to be flushed if the other
process allocated a ent->idx but didn't set ent to cmd->ent_arr in
cmd_work_handler(). Fix it by moving the assignment of cmd->ent_arr into
the spin lock.
[70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361
[70013.081968]
[70013.082028] Workqueue: events aer_isr
[70013.082053] Call Trace:
[70013.082067] dump_stack+0x8b/0xbb
[70013.082086] print_address_description+0x6a/0x270
[70013.082102] kasan_report+0x179/0x2c0
[70013.082173] mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.082267] mlx5_cmd_flush+0x80/0x180 [mlx5_core]
[70013.082304] mlx5_enter_error_state+0x106/0x1d0 [mlx5_core]
[70013.082338] mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core]
[70013.082377] remove_one+0x200/0x2b0 [mlx5_core]
[70013.082409] pci_device_remove+0xf3/0x280
[70013.082439] device_release_driver_internal+0x1c3/0x470
[70013.082453] pci_stop_bus_device+0x109/0x160
[70013.082468] pci_stop_and_remove_bus_device+0xe/0x20
[70013.082485] pcie_do_fatal_recovery+0x167/0x550
[70013.082493] aer_isr+0x7d2/0x960
[70013.082543] process_one_work+0x65f/0x12d0
[70013.082556] worker_thread+0x87/0xb50
[70013.082571] kthread+0x2e9/0x3a0
[70013.082592] ret_from_fork+0x1f/0x40
The logical relationship of this error is as follows:
aer_recover_work | ent->work
-------------------------------------------+------------------------------
aer_recover_work_func |
|- pcie_do_recovery |
|- report_error_detected |
|- mlx5_pci_err_detected |cmd_work_handler
|- mlx5_enter_error_state | |- cmd_alloc_index
|- enter_error_state | |- lock cmd->alloc_lock
|- mlx5_cmd_flush | |- clear_bit
|- mlx5_cmd_trigger_completions| |- unlock cmd->alloc_lock
|- lock cmd->alloc_lock |
|- vector = ~dev->cmd.vars.bitmask
|- for_each_set_bit |
|- cmd_ent_get(cmd->ent_arr[i]) (UAF)
|- unlock cmd->alloc_lock | |- cmd->ent_arr[ent->idx]=ent
The cmd->ent_arr[ent->idx] assignment and the bit clearing are not
protected by the cmd->alloc_lock in cmd_work_handler().
Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 58db72869a9f8e01910844ca145efc2ea91bbbf9 ]
Downstream patch will split mlx5_cmd_init() to probe and reload
routines. As a preparation, organize mlx5_cmd struct so that any
field that will be used in the reload routine are grouped at new
nested struct.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Stable-dep-of: 8f5100da56b3 ("net/mlx5e: Fix a race in command alloc flow")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 63fbae0a74c3e1df7c20c81e04353ced050d9887 ]
Certain connection-based device-offload protocols (like TLS) use
per-connection HW objects to track the state, maintain the context, and
perform the offload properly. Some of these objects are created,
modified, and destroyed via FW commands. Under high connection rate,
this type of FW commands might continuously populate all slots of the FW
command interface and throttle it, while starving other critical control
FW commands.
Limit these throttle commands to using only up to a portion (half) of
the FW command interface slots. FW commands maximal rate is not hit, and
the same high rate is still reached when applying this limitation.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Stable-dep-of: 8f5100da56b3 ("net/mlx5e: Fix a race in command alloc flow")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7cb5eb937231663d11f7817e366f6f86a142d6d3 ]
Introduce an opcode getter in the FW command interface, and use it.
Initialize the entry's opcode field early in cmd_alloc_ent() and use it
when possible.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Stable-dep-of: 8f5100da56b3 ("net/mlx5e: Fix a race in command alloc flow")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ddb38ddff9c71026bad481b791a94d446ee37603 ]
Out_sz that the size of out buffer is calculated using query_nic_vport
_context_in structure when driver query the MAC list. However query_nic
_vport_context_in structure is smaller than query_nic_vport_context_out.
When allowed_list_size is greater than 96, calling ether_addr_copy() will
trigger an slab-out-of-bounds.
[ 1170.055866] BUG: KASAN: slab-out-of-bounds in mlx5_query_nic_vport_mac_list+0x481/0x4d0 [mlx5_core]
[ 1170.055869] Read of size 4 at addr ffff88bdbc57d912 by task kworker/u128:1/461
[ 1170.055870]
[ 1170.055932] Workqueue: mlx5_esw_wq esw_vport_change_handler [mlx5_core]
[ 1170.055936] Call Trace:
[ 1170.055949] dump_stack+0x8b/0xbb
[ 1170.055958] print_address_description+0x6a/0x270
[ 1170.055961] kasan_report+0x179/0x2c0
[ 1170.056061] mlx5_query_nic_vport_mac_list+0x481/0x4d0 [mlx5_core]
[ 1170.056162] esw_update_vport_addr_list+0x2c5/0xcd0 [mlx5_core]
[ 1170.056257] esw_vport_change_handle_locked+0xd08/0x1a20 [mlx5_core]
[ 1170.056377] esw_vport_change_handler+0x6b/0x90 [mlx5_core]
[ 1170.056381] process_one_work+0x65f/0x12d0
[ 1170.056383] worker_thread+0x87/0xb50
[ 1170.056390] kthread+0x2e9/0x3a0
[ 1170.056394] ret_from_fork+0x1f/0x40
Fixes: e16aea2744ab ("net/mlx5: Introduce access functions to modify/query vport mac lists")
Cc: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5d089684dc434a31e08d32f0530066d0025c52e4 ]
This reverts commit 6f9b1a0731662648949a1c0587f6acb3b7f8acf1.
This patch is causing a null ptr issue, the proper fix is in the next
patch.
Fixes: 6f9b1a073166 ("net/mlx5e: fix double free of encap_header")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 66ca8d4deca09bce3fc7bcf8ea7997fa1a51c33c ]
This reverts commit 3a4aa3cb83563df942be49d145ee3b7ddf17d6bb.
This patch is causing a null ptr issue, the proper fix is in the next
patch.
Fixes: 3a4aa3cb8356 ("net/mlx5e: fix double free of encap_header in update funcs")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8c386b166e2517cf3a123018e77941ec22625d0f ]
During refactoring the "else" here got lost, add it back.
Fixes: c99a89edb106 ("mac80211: factor out plink event gathering")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20231211085121.795480fa0e0b.I017d501196a5bbdcd9afd33338d342d6fe1edd79@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1fc4a3eec50d726f4663ad3c0bb0158354d6647a ]
ieee802_11_parse_elems() can return NULL, so we must
check for the return value.
Fixes: 5d24828d05f3 ("mac80211: always allocate struct ieee802_11_elems")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20231211085121.93dea364f3d3.Ie87781c6c48979fb25a744b90af4a33dc2d83a28@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c1393c132b906fbdf91f6d1c9eb2ef7a00cce64e ]
[Syz report]
WARNING: CPU: 1 PID: 5067 at net/mac80211/rate.c:48 rate_control_rate_init+0x540/0x690 net/mac80211/rate.c:48
Modules linked in:
CPU: 1 PID: 5067 Comm: syz-executor413 Not tainted 6.7.0-rc3-syzkaller-00014-gdf60cee26a2e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:rate_control_rate_init+0x540/0x690 net/mac80211/rate.c:48
Code: 48 c7 c2 00 46 0c 8c be 08 03 00 00 48 c7 c7 c0 45 0c 8c c6 05 70 79 0b 05 01 e8 1b a0 6f f7 e9 e0 fd ff ff e8 61 b3 8f f7 90 <0f> 0b 90 e9 36 ff ff ff e8 53 b3 8f f7 e8 5e 0b 78 f7 31 ff 89 c3
RSP: 0018:ffffc90003c57248 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888016bc4000 RCX: ffffffff89f7d519
RDX: ffff888076d43b80 RSI: ffffffff89f7d6df RDI: 0000000000000005
RBP: ffff88801daaae20 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000001
R13: 0000000000000000 R14: ffff888020030e20 R15: ffff888078f08000
FS: 0000555556b94380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000005fdeb8 CR3: 0000000076d22000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
sta_apply_auth_flags.constprop.0+0x4b7/0x510 net/mac80211/cfg.c:1674
sta_apply_parameters+0xaf1/0x16c0 net/mac80211/cfg.c:2002
ieee80211_add_station+0x3fa/0x6c0 net/mac80211/cfg.c:2068
rdev_add_station net/wireless/rdev-ops.h:201 [inline]
nl80211_new_station+0x13ba/0x1a70 net/wireless/nl80211.c:7603
genl_family_rcv_msg_doit+0x1fc/0x2e0 net/netlink/genetlink.c:972
genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
genl_rcv_msg+0x561/0x800 net/netlink/genetlink.c:1067
netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2545
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0x53b/0x810 net/netlink/af_netlink.c:1368
netlink_sendmsg+0x93c/0xe40 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
__sys_sendmsg+0x117/0x1e0 net/socket.c:2667
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
[Analysis]
It is inappropriate to make a link configuration change judgment on an
non-existent and non new link.
[Fix]
Quickly exit when there is a existent link and the link configuration has not
changed.
Fixes: b303835dabe0 ("wifi: mac80211: accept STA changes without link changes")
Reported-and-tested-by: syzbot+62d7eef57b09bfebcd84@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Link: https://msgid.link/tencent_DE67FF86DB92ED465489A36ECD2EDDCC8C06@qq.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a4754182dc936b97ec7e9f6b08cdf7ed97ef9069 ]
Evidently I had only looked at all the ones in rx.c, and missed this.
Add bh-disable to this use of the rxq->lock as well.
Fixes: 25edc8f259c7 ("iwlwifi: pcie: properly implement NAPI")
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20231208183100.e79ad3dae649.I8f19713c4383707f8be7fc20ff5cc1ecf12429bb@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e6b2dab41888332bf83f592131e7ea07756770a4 ]
The KERNEL_FPR mask only contains a flag for the first eight vector
registers. However floating point registers overlay parts of the first
sixteen vector registers.
This could lead to vector register corruption if a kernel fpu context uses
any of the vector registers 8 to 15 and is interrupted or calls a
KERNEL_FPR context. If that context uses also vector registers 8 to 15,
their contents will be corrupted on return.
Luckily this is currently not a real bug, since the kernel has only one
KERNEL_FPR user with s390_adjust_jiffies() and it is only using floating
point registers 0 to 2.
Fix this by using the correct bits for KERNEL_FPR.
Fixes: 7f79695cc1b6 ("s390/fpu: improve kernel_fpu_[begin|end]")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4a6756f56bcf8e64c87144a626ce53aea4899c0e ]
When obtaining one or more optional resets, non-existent resets are
stored as NULL pointers, and all related error and cleanup paths need to
take this into account.
Currently only reset_control_put() and reset_control_bulk_put()
get this right. All of __reset_control_bulk_get(),
of_reset_control_array_get(), and reset_control_array_put() lack the
proper checking, causing NULL pointer dereferences on failure or
release.
Fix this by moving the existing check from reset_control_bulk_put() to
__reset_control_put_internal(), so it applies to all callers.
The double check in reset_control_put() doesn't hurt.
Fixes: 17c82e206d2a3cd8 ("reset: Add APIs to manage array of resets")
Fixes: 48d71395896d54ee ("reset: Add reset_control_bulk API")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/2440edae7ca8534628cdbaf559ded288f2998178.1701276806.git.geert+renesas@glider.be
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
omap_soc_device_init
[ Upstream commit c72b9c33ef9695ad7ce7a6eb39a9df8a01b70796 ]
kasprintf() returns a pointer to dynamically allocated memory which can
be NULL upon failure. When 'soc_dev_attr->family' is NULL,it'll trigger
the null pointer dereference issue, such as in 'soc_info_show'.
And when 'soc_device_register' fails, it's necessary to release
'soc_dev_attr->family' to avoid memory leaks.
Fixes: 6770b2114325 ("ARM: OMAP2+: Export SoC information to userspace")
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Message-ID: <20231123145237.609442-1-chentao@kylinos.cn>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1e5caee2ba8f1426e8098afb4ca38dc40a0ca71b ]
This node can access any part of the L3 configuration registers space,
including CLK1 and CLK2 which are 0x800000 offset. Restore this area
size to include these areas.
Fixes: 7f2659ce657e ("ARM: dts: Move dra7 l3 noc to a separate node")
Signed-off-by: Andrew Davis <afd@ti.com>
Message-ID: <20231113181604.546444-1-afd@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f528ee145bd0076cd0ed7e7b2d435893e6329e98 ]
We currently don't support dirty rectangles on hardware rotated modes.
So, if a user is using hardware rotated modes with PSR-SU enabled,
use PSR-SU FFU for all rotated planes (including cursor planes).
Cc: stable@vger.kernel.org
Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support")
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2952
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Bin Li <binli@gnome.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a9f68ffe1170ca4bc17ab29067d806a354a026e0 ]
Users have reported problems with recent Lenovo laptops that contain
an IDEA5002 I2C HID device. Reports include fans turning on and
running even at idle and spurious wakeups from suspend.
Presumably in the Windows ecosystem there is an application that
uses the HID device. Maybe that puts it into a lower power state so
it doesn't cause spurious events.
This device doesn't serve any functional purpose in Linux as nothing
interacts with it so blacklist it from being probed. This will
prevent the GPIO driver from setting up the GPIO and the spurious
interrupts and wake events will not occur.
Cc: stable@vger.kernel.org # 6.1
Reported-and-tested-by: Marcus Aram <marcus+oss@oxar.nl>
Reported-and-tested-by: Mark Herbert <mark.herbert42@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2812
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4122abfed2193e752485282370abf5c419f05cad ]
Unify ACPI ID tables format by:
- surrounding HID by spaces
- dropping unnecessary driver_data assignment to 0
- dropping comma at the terminator entry
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Stable-dep-of: a9f68ffe1170 ("HID: i2c-hid: Add IDEA5002 to i2c_hid_acpi_blacklist[]")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 4b7de801606e504e69689df71475d27e35336fb3 upstream.
Lee pointed out issue found by syscaller [0] hitting BUG in prog array
map poke update in prog_array_map_poke_run function due to error value
returned from bpf_arch_text_poke function.
There's race window where bpf_arch_text_poke can fail due to missing
bpf program kallsym symbols, which is accounted for with check for
-EINVAL in that BUG_ON call.
The problem is that in such case we won't update the tail call jump
and cause imbalance for the next tail call update check which will
fail with -EBUSY in bpf_arch_text_poke.
I'm hitting following race during the program load:
CPU 0 CPU 1
bpf_prog_load
bpf_check
do_misc_fixups
prog_array_map_poke_track
map_update_elem
bpf_fd_array_map_update_elem
prog_array_map_poke_run
bpf_arch_text_poke returns -EINVAL
bpf_prog_kallsyms_add
After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
poke update fails on expected jump instruction check in bpf_arch_text_poke
with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
Similar race exists on the program unload.
Fixing this by moving the update to bpf_arch_poke_desc_update function which
makes sure we call __bpf_arch_text_poke that skips the bpf address check.
Each architecture has slightly different approach wrt looking up bpf address
in bpf_arch_text_poke, so instead of splitting the function or adding new
'checkip' argument in previous version, it seems best to move the whole
map_poke_run update as arch specific code.
[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Cc: Lee Jones <lee@kernel.org>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 17c17567fe510857b18fe01b7a88027600e76ac6 upstream.
On arm64, building with CONFIG_KASAN_HW_TAGS now causes a compile-time
error:
mm/kasan/report.c: In function 'kasan_non_canonical_hook':
mm/kasan/report.c:637:20: error: 'KASAN_SHADOW_OFFSET' undeclared (first use in this function)
637 | if (addr < KASAN_SHADOW_OFFSET)
| ^~~~~~~~~~~~~~~~~~~
mm/kasan/report.c:637:20: note: each undeclared identifier is reported only once for each function it appears in
mm/kasan/report.c:640:77: error: expected expression before ';' token
640 | orig_addr = (addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
This was caused by removing the dependency on CONFIG_KASAN_INLINE that
used to prevent this from happening. Use the more specific dependency
on KASAN_SW_TAGS || KASAN_GENERIC to only ignore the function for hwasan
mode.
Link: https://lkml.kernel.org/r/20231016200925.984439-1-arnd@kernel.org
Fixes: 12ec6a919b0f ("kasan: print the original fault addr when access invalid shadow")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Haibo Li <haibo.li@mediatek.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Link: https://lore.kernel.org/r/20231218135055.005497074@linuxfoundation.org
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Pavel Machek (CIP) <pavel@denx.de> =
Tested-by: SeongJae Park <sj@kernel.org>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: kernelci.org bot <bot@kernelci.org>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Yann Sionneau <ysionneau@kalrayinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0fbd79c01a9a657348f7032df70c57a406468c86 upstream.
Set supports_autosuspend = 1 for the rtl8152_cfgselector_driver.
Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 95a4c1d617b92cdc4522297741b56e8f6cd01a1e upstream.
After commit ec51fbd1b8a2 ("r8152: add USB device driver for
config selection"), the code about changing USB configuration
in rtl_vendor_mode() wouldn't be run anymore. Therefore, the
function could be removed.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0d4cda805a183bbe523f2407edb5c14ade50b841 upstream.
The rtl8152_cfgselector_probe() should set the USB configuration to the
vendor mode only for the devices which the driver (r8152) supports.
Otherwise, no driver would be used for such devices.
Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c5a595000e2677e865a39f249c056bc05d6e55fd upstream.
The curr pointer must also be updated on the splice similar to how
we do this for other copy types.
Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Reported-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20231206232706.374377-2-john.fastabend@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0aa0e5289cfe984a8a9fdd79ccf46ccf080151f7 upstream.
The rb_time_cmpxchg() on 32-bit architectures requires setting three
32-bit words to represent the 64-bit timestamp, with some salt for
synchronization. Those are: msb, top, and bottom
The issue is, the rb_time_cmpxchg() did not properly salt the msb portion,
and the msb that was written was stale.
Link: https://lore.kernel.org/linux-trace-kernel/20231215084114.20899342@rorschach.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: f03f2abce4f39 ("ring-buffer: Have 32 bit time stamps use all 64 bits")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dd939425707898da992e59ab0fcfae4652546910 upstream.
If an update to an event is interrupted by another event between the time
the initial event allocated its buffer and where it wrote to the
write_stamp, the code try to reset the write stamp back to the what it had
just overwritten. It knows that it was overwritten via checking the
before_stamp, and if it didn't match what it wrote to the before_stamp
before it allocated its space, it knows it was overwritten.
To put back the write_stamp, it uses the before_stamp it read. The problem
here is that by writing the before_stamp to the write_stamp it makes the
two equal again, which means that the write_stamp can be considered valid
as the last timestamp written to the ring buffer. But this is not
necessarily true. The event that interrupted the event could have been
interrupted in a way that it was interrupted as well, and can end up
leaving with an invalid write_stamp. But if this happens and returns to
this context that uses the before_stamp to update the write_stamp again,
it can possibly incorrectly make it valid, causing later events to have in
correct time stamps.
As it is OK to leave this function with an invalid write_stamp (one that
doesn't match the before_stamp), there's no reason to try to make it valid
again in this case. If this race happens, then just leave with the invalid
write_stamp and the next event to come along will just add a absolute
timestamp and validate everything again.
Bonus points: This gets rid of another cmpxchg64!
Link: https://lore.kernel.org/linux-trace-kernel/20231214222921.193037a7@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fff88fa0fbc7067ba46dde570912d63da42c59a9 upstream.
Mathieu Desnoyers pointed out an issue in the rb_time_cmpxchg() for 32 bit
architectures. That is:
static bool rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set)
{
unsigned long cnt, top, bottom, msb;
unsigned long cnt2, top2, bottom2, msb2;
u64 val;
/* The cmpxchg always fails if it interrupted an update */
if (!__rb_time_read(t, &val, &cnt2))
return false;
if (val != expect)
return false;
<<<< interrupted here!
cnt = local_read(&t->cnt);
The problem is that the synchronization counter in the rb_time_t is read
*after* the value of the timestamp is read. That means if an interrupt
were to come in between the value being read and the counter being read,
it can change the value and the counter and the interrupted process would
be clueless about it!
The counter needs to be read first and then the value. That way it is easy
to tell if the value is stale or not. If the counter hasn't been updated,
then the value is still good.
Link: https://lore.kernel.org/linux-trace-kernel/20231211201324.652870-1-mathieu.desnoyers@efficios.com/
Link: https://lore.kernel.org/linux-trace-kernel/20231212115301.7a9c9a64@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Fixes: 10464b4aa605e ("ring-buffer: Add rb_time_t 64 bit operations for speeding up 32 bit")
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b3ae7b67b87fed771fa5bf95389df06b0433603e upstream.
The maximum ring buffer data size is the maximum size of data that can be
recorded on the ring buffer. Events must be smaller than the sub buffer
data size minus any meta data. This size is checked before trying to
allocate from the ring buffer because the allocation assumes that the size
will fit on the sub buffer.
The maximum size was calculated as the size of a sub buffer page (which is
currently PAGE_SIZE minus the sub buffer header) minus the size of the
meta data of an individual event. But it missed the possible adding of a
time stamp for events that are added long enough apart that the event meta
data can't hold the time delta.
When an event is added that is greater than the current BUF_MAX_DATA_SIZE
minus the size of a time stamp, but still less than or equal to
BUF_MAX_DATA_SIZE, the ring buffer would go into an infinite loop, looking
for a page that can hold the event. Luckily, there's a check for this loop
and after 1000 iterations and a warning is emitted and the ring buffer is
disabled. But this should never happen.
This can happen when a large event is added first, or after a long period
where an absolute timestamp is prefixed to the event, increasing its size
by 8 bytes. This passes the check and then goes into the algorithm that
causes the infinite loop.
For events that are the first event on the sub-buffer, it does not need to
add a timestamp, because the sub-buffer itself contains an absolute
timestamp, and adding one is redundant.
The fix is to check if the event is to be the first event on the
sub-buffer, and if it is, then do not add a timestamp.
This also fixes 32 bit adding a timestamp when a read of before_stamp or
write_stamp is interrupted. There's still no need to add that timestamp if
the event is going to be the first event on the sub buffer.
Also, if the buffer has "time_stamp_abs" set, then also check if the
length plus the timestamp is greater than the BUF_MAX_DATA_SIZE.
Link: https://lore.kernel.org/all/20231212104549.58863438@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20231212071837.5fdd6c13@gandalf.local.home
Link: https://lore.kernel.org/linux-trace-kernel/20231212111617.39e02849@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: a4543a2fa9ef3 ("ring-buffer: Get timestamp after event is allocated")
Fixes: 58fbc3c63275c ("ring-buffer: Consolidate add_timestamp to remove some branches")
Reported-by: Kent Overstreet <kent.overstreet@linux.dev> # (on IRC)
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b049525855fdd0024881c9b14b8fbec61c3f53d3 upstream.
For the ring buffer iterator (non-consuming read), the event needs to be
copied into the iterator buffer to make sure that a writer does not
overwrite it while the user is reading it. If a write happens during the
copy, the buffer is simply discarded.
But the temp buffer itself was not big enough. The allocation of the
buffer was only BUF_MAX_DATA_SIZE, which is the maximum data size that can
be passed into the ring buffer and saved. But the temp buffer needs to
hold the meta data as well. That would be BUF_PAGE_SIZE and not
BUF_MAX_DATA_SIZE.
Link: https://lore.kernel.org/linux-trace-kernel/20231212072558.61f76493@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 785888c544e04 ("ring-buffer: Have rb_iter_head_event() handle concurrent writer")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9e45e39dc249c970d99d2681f6bcb55736fd725c upstream.
The ring buffer timestamps are synchronized by two timestamp placeholders.
One is the "before_stamp" and the other is the "write_stamp" (sometimes
referred to as the "after stamp" but only in the comments. These two
stamps are key to knowing how to handle nested events coming in with a
lockless system.
When moving across sub-buffers, the before stamp is updated but the write
stamp is not. There's an effort to put back the before stamp to something
that seems logical in case there's nested events. But as the current event
is about to cross sub-buffers, and so will any new nested event that happens,
updating the before stamp is useless, and could even introduce new race
conditions.
The first event on a sub-buffer simply uses the sub-buffer's timestamp
and keeps a "delta" of zero. The "before_stamp" and "write_stamp" are not
used in the algorithm in this case. There's no reason to try to fix the
before_stamp when this happens.
As a bonus, it removes a cmpxchg() when crossing sub-buffers!
Link: https://lore.kernel.org/linux-trace-kernel/20231211114420.36dde01b@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp")
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d06aff1cb13d2a0d52b48e605462518149c98c81 upstream.
The snapshot buffer is to mimic the main buffer so that when a snapshot is
needed, the snapshot and main buffer are swapped. When the snapshot buffer
is allocated, it is set to the minimal size that the ring buffer may be at
and still functional. When it is allocated it becomes the same size as the
main ring buffer, and when the main ring buffer changes in size, it should
do.
Currently, the resize only updates the snapshot buffer if it's used by the
current tracer (ie. the preemptirqsoff tracer). But it needs to be updated
anytime it is allocated.
When changing the size of the main buffer, instead of looking to see if
the current tracer is utilizing the snapshot buffer, just check if it is
allocated to know if it should be updated or not.
Also fix typo in comment just above the code change.
Link: https://lore.kernel.org/linux-trace-kernel/20231210225447.48476a6a@rorschach.local.home
Cc: stable@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: ad909e21bbe69 ("tracing: Add internal tracing_snapshot() functions")
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|