aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-01-08libbpf: Add user-space variants of BPF_CORE_READ() family of macrosAndrii Nakryiko
Add BPF_CORE_READ_USER(), BPF_CORE_READ_USER_STR() and their _INTO() variations to allow reading CO-RE-relocatable kernel data structures from the user-space. One of such cases is reading input arguments of syscalls, while reaping the benefits of CO-RE relocations w.r.t. handling 32/64 bit conversions and handling missing/new fields in UAPI data structs. Suggested-by: Gilad Reti <gilad.reti@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201218235614.2284956-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Trivial conflict in CAN on file rename. Conflicts: drivers/net/can/m_can/tcan4x5x-core.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-08Merge tag 'net-5.11-rc3-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull more networking fixes from Jakub Kicinski: "Slightly lighter pull request to get back into the Thursday cadence. Current release - always broken: - can: mcp251xfd: fix Tx/Rx ring buffer driver race conditions - dsa: hellcreek: fix led_classdev build errors Previous releases - regressions: - ipv6: fib: flush exceptions when purging route to avoid netdev reference leak - ip_tunnels: fix pmtu check in nopmtudisc mode - ip: always refragment ip defragmented packets to avoid MTU issues when forwarding through tunnels, correct "packet too big" message is prohibitively tricky to generate - s390/qeth: fix locking for discipline setup / removal and during recovery to prevent both deadlocks and races - mlx5: Use port_num 1 instead of 0 when delete a RoCE address Previous releases - always broken: - cdc_ncm: correct overhead calculation in delayed_ndp_size to prevent out of bound accesses with Huawei 909s-120 LTE module - fix stmmac dwmac-sun8i suspend/resume: - PHY being left powered off - MAC syscon configuration being reset - reference to the reset controller being improperly dropped - qrtr: fix null-ptr-deref in qrtr_ns_remove - can: tcan4x5x: fix bittiming const, use common bittiming from m_can driver - mlx5e: CT: Use per flow counter when CT flow accounting is enabled - mlx5e: Fix SWP offsets when vlan inserted by driver Misc: - bpf: Fix a task_iter bug caused by a bpf -> net merge conflict resolution And the usual many fixes to various error paths" * tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE s390/qeth: fix L2 header access in qeth_l3_osa_features_check() s390/qeth: fix locking for discipline setup / removal s390/qeth: fix deadlock during recovery selftests: fib_nexthops: Fix wrong mausezahn invocation nexthop: Bounce NHA_GATEWAY in FDB nexthop groups nexthop: Unlink nexthop group entry in error path nexthop: Fix off-by-one error in error path octeontx2-af: fix memory leak of lmac and lmac->name chtls: Fix chtls resources release sequence chtls: Added a check to avoid NULL pointer dereference chtls: Replace skb_dequeue with skb_peek chtls: Avoid unnecessary freeing of oreq pointer chtls: Fix panic when route to peer not configured chtls: Remove invalid set_tcb call chtls: Fix hardware tid leak net: ip: always refragment ip defragmented packets net: fix pmtu check in nopmtudisc mode selftests: netfilter: add selftest for ipip pmtu discovery with enabled connection tracking docs: octeontx2: tune rst markup ...
2021-01-08Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a functional bug in arm/chacha-neon as well as a potential buffer overflow in ecdh" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: ecdh - avoid buffer overflow in ecdh_set_secret() crypto: arm/chacha-neon - add missing counter increment
2021-01-08poll: fix performance regression due to out-of-line __put_user()Linus Torvalds
The kernel test robot reported a -5.8% performance regression on the "poll2" test of will-it-scale, and bisected it to commit d55564cfc222 ("x86: Make __put_user() generate an out-of-line call"). I didn't expect an out-of-line __put_user() to matter, because no normal core code should use that non-checking legacy version of user access any more. But I had overlooked the very odd poll() usage, which does a __put_user() to update the 'revents' values of the poll array. Now, Al Viro correctly points out that instead of updating just the 'revents' field, it would be much simpler to just copy the _whole_ pollfd entry, and then we could just use "copy_to_user()" on the whole array of entries, the same way we use "copy_from_user()" a few lines earlier to get the original values. But that is not what we've traditionally done, and I worry that threaded applications might be concurrently modifying the other fields of the pollfd array. So while Al's suggestion is simpler - and perhaps worth trying in the future - this instead keeps the "just update revents" model. To fix the performance regression, use the modern "unsafe_put_user()" instead of __put_user(), with the proper "user_write_access_begin()" guarding in place. This improves code generation enormously. Link: https://lore.kernel.org/lkml/20210107134723.GA28532@xsang-OptiPlex-9020/ Reported-by: kernel test robot <oliver.sang@intel.com> Tested-by: Oliver Sang <oliver.sang@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Laight <David.Laight@aculab.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-08Revert "init/console: Use ttynull as a fallback when there is no console"Petr Mladek
This reverts commit 757055ae8dedf5333af17b3b5b4b70ba9bc9da4e. The commit caused that ttynull was used as the default console on several systems[1][2][3]. As a result, the console was blank even when a better alternative existed. It happened when there was no console configured on the command line and ttynull_init() was the first initcall calling register_console(). Or it happened when /dev/ did not exist when console_on_rootfs() was called. It was not able to open /dev/console even though a console driver was registered. It tried to add ttynull console but it obviously did not help. But ttynull became the preferred console and was used by /dev/console when it was available later. The commit tried to fix a historical problem that have been there for ages. The primary motivation was the commit 3cffa06aeef7ece30f6 ("printk/console: Allow to disable console output by using console="" or console=null"). It provided a clean solution for a workaround that was widely used and worked only by chance. This revert causes that the console="" or console=null command line options will again work only by chance. These options will cause that a particular console will be preferred and the default (tty) ones will not get enabled. There will be no console registered at all. As a result there won't be stdin, stdout, and stderr for the init process. But it worked exactly this way even before. The proper solution has to fulfill many conditions: + Register ttynull only when explicitly required or as the ultimate fallback. + ttynull should get associated with /dev/console but it must not become preferred console when used as a fallback. Especially, it must still be possible to replace it by a better console later. Such a change requires clean up of the register_console() code. Otherwise, it would be even harder to follow. Especially, the use of has_preferred_console and CON_CONSDEV flag is tricky. The clean up is risky. The ordering of consoles is not well defined. And any changes tend to break existing user settings. Do the revert at the least risky solution for now. [1] https://lore.kernel.org/linux-kselftest/20201221144302.GR4077@smile.fi.intel.com/ [2] https://lore.kernel.org/lkml/d2a3b3c0-e548-7dd1-730f-59bc5c04e191@synopsys.com/ [3] https://patchwork.ozlabs.org/project/linux-um/patch/20210105120128.10854-1-thomas@m3y3r.de/ Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reported-by: Vineet Gupta <vgupta@synopsys.com> Reported-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-01-07Merge tag 'mlx5-fixes-2021-01-07' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-01-07 * tag 'mlx5-fixes-2021-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups net/mlx5e: Fix two double free cases net/mlx5: Release devlink object if adev fails net/mlx5e: ethtool, Fix restriction of autoneg with 56G net/mlx5e: In skb build skip setting mark in switchdev mode net/mlx5: E-Switch, fix changing vf VLANID net/mlx5e: Fix SWP offsets when vlan inserted by driver net/mlx5e: CT: Use per flow counter when CT flow accounting is enabled net/mlx5: Use port_num 1 instead of 0 when delete a RoCE address net/mlx5e: Add missing capability check for uplink follow net/mlx5: Check if lag is supported before creating one ==================== Link: https://lore.kernel.org/r/20210107202845.470205-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbEAleksander Jan Bajkowski
Exclude RMII from modes that report 1 GbE support. Reduced MII supports up to 100 MbE. Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210107195818.3878-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07Merge branch 's390-qeth-fixes-2021-01-07'Jakub Kicinski
Julian Wiedmann says: ==================== s390/qeth: fixes 2021-01-07 This brings two locking fixes for the device control path. Also one fix for a path where our .ndo_features_check() attempts to access a non-existent L2 header. ==================== Link: https://lore.kernel.org/r/20210107172442.1737-1-jwi@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07s390/qeth: fix L2 header access in qeth_l3_osa_features_check()Julian Wiedmann
ip_finish_output_gso() may call .ndo_features_check() even before the skb has a L2 header. This conflicts with qeth_get_ip_version()'s attempt to inspect the L2 header via vlan_eth_hdr(). Switch to vlan_get_protocol(), as already used further down in the common qeth_features_check() path. Fixes: f13ade199391 ("s390/qeth: run non-offload L3 traffic over common xmit path") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07s390/qeth: fix locking for discipline setup / removalJulian Wiedmann
Due to insufficient locking, qeth_core_set_online() and qeth_dev_layer2_store() can run in parallel, both attempting to load & setup the discipline (and stepping on each other toes along the way). A similar race can also occur between qeth_core_remove_device() and qeth_dev_layer2_store(). Access to .discipline is meant to be protected by the discipline_mutex, so add/expand the locking in qeth_core_remove_device() and qeth_core_set_online(). Adjust the locking in qeth_l*_remove_device() accordingly, as it's now handled by the callers in a consistent manner. Based on an initial patch by Ursula Braun. Fixes: 9dc48ccc68b9 ("qeth: serialize sysfs-triggered device configurations") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07s390/qeth: fix deadlock during recoveryJulian Wiedmann
When qeth_dev_layer2_store() - holding the discipline_mutex - waits inside qeth_l*_remove_device() for a qeth_do_reset() thread to complete, we can hit a deadlock if qeth_do_reset() concurrently calls qeth_set_online() and thus tries to aquire the discipline_mutex. Move the discipline_mutex locking outside of qeth_set_online() and qeth_set_offline(), and turn the discipline into a parameter so that callers understand the dependency. To fix the deadlock, we can now relax the locking: As already established, qeth_l*_remove_device() waits for qeth_do_reset() to complete. So qeth_do_reset() itself is under no risk of having card->discipline ripped out while it's running, and thus doesn't need to take the discipline_mutex. Fixes: 9dc48ccc68b9 ("qeth: serialize sysfs-triggered device configurations") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07Merge branch 'nexthop-various-fixes'Jakub Kicinski
Ido Schimmel says: ==================== nexthop: Various fixes This series contains various fixes for the nexthop code. The bugs were uncovered during the development of resilient nexthop groups. Patches #1-#2 fix the error path of nexthop_create_group(). I was not able to trigger these bugs with current code, but it is possible with the upcoming resilient nexthop groups code which adds a user controllable memory allocation further in the function. Patch #3 fixes wrong validation of netlink attributes. Patch #4 fixes wrong invocation of mausezahn in a selftest. ==================== Link: https://lore.kernel.org/r/20210107144824.1135691-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07selftests: fib_nexthops: Fix wrong mausezahn invocationIdo Schimmel
For IPv6 traffic, mausezahn needs to be invoked with '-6'. Otherwise an error is returned: # ip netns exec me mausezahn veth1 -B 2001:db8:101::2 -A 2001:db8:91::1 -c 0 -t tcp "dp=1-1023, flags=syn" Failed to set source IPv4 address. Please check if source is set to a valid IPv4 address. Invalid command line parameters! Fixes: 7c741868ceab ("selftests: Add torture tests to nexthop tests") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07nexthop: Bounce NHA_GATEWAY in FDB nexthop groupsPetr Machata
The function nh_check_attr_group() is called to validate nexthop groups. The intention of that code seems to have been to bounce all attributes above NHA_GROUP_TYPE except for NHA_FDB. However instead it bounces all these attributes except when NHA_FDB attribute is present--then it accepts them. NHA_FDB validation that takes place before, in rtm_to_nh_config(), already bounces NHA_OIF, NHA_BLACKHOLE, NHA_ENCAP and NHA_ENCAP_TYPE. Yet further back, NHA_GROUPS and NHA_MASTER are bounced unconditionally. But that still leaves NHA_GATEWAY as an attribute that would be accepted in FDB nexthop groups (with no meaning), so long as it keeps the address family as unspecified: # ip nexthop add id 1 fdb via 127.0.0.1 # ip nexthop add id 10 fdb via default group 1 The nexthop code is still relatively new and likely not used very broadly, and the FDB bits are newer still. Even though there is a reproducer out there, it relies on an improbable gateway arguments "via default", "via all" or "via any". Given all this, I believe it is OK to reformulate the condition to do the right thing and bounce NHA_GATEWAY. Fixes: 38428d68719c ("nexthop: support for fdb ecmp nexthops") Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07nexthop: Unlink nexthop group entry in error pathIdo Schimmel
In case of error, remove the nexthop group entry from the list to which it was previously added. Fixes: 430a049190de ("nexthop: Add support for nexthop groups") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07nexthop: Fix off-by-one error in error pathIdo Schimmel
A reference was not taken for the current nexthop entry, so do not try to put it in the error path. Fixes: 430a049190de ("nexthop: Add support for nexthop groups") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07octeontx2-af: fix memory leak of lmac and lmac->nameColin Ian King
Currently the error return paths don't kfree lmac and lmac->name leading to some memory leaks. Fix this by adding two error return paths that kfree these objects Addresses-Coverity: ("Resource leak") Fixes: 1463f382f58d ("octeontx2-af: Add support for CGX link management") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210107123916.189748-1-colin.king@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07Merge branch 'bug-fixes-for-chtls-driver'Jakub Kicinski
Ayush Sawal says: ==================== Bug fixes for chtls driver patch 1: Fix hardware tid leak. patch 2: Remove invalid set_tcb call. patch 3: Fix panic when route to peer not configured. patch 4: Avoid unnecessary freeing of oreq pointer. patch 5: Replace skb_dequeue with skb_peek. patch 6: Added a check to avoid NULL pointer dereference patch. patch 7: Fix chtls resources release sequence. ==================== Link: https://lore.kernel.org/r/20210106042912.23512-1-ayush.sawal@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07chtls: Fix chtls resources release sequenceAyush Sawal
CPL_ABORT_RPL is sent after releasing the resources by calling chtls_release_resources(sk); and chtls_conn_done(sk); eventually causing kernel panic. Fixing it by calling release in appropriate order. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07chtls: Added a check to avoid NULL pointer dereferenceAyush Sawal
In case of server removal lookup_stid() may return NULL pointer, which is used as listen_ctx. So added a check before accessing this pointer. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07chtls: Replace skb_dequeue with skb_peekAyush Sawal
The skb is unlinked twice, one in __skb_dequeue in function chtls_reset_synq() and another in cleanup_syn_rcv_conn(). So in this patch using skb_peek() instead of __skb_dequeue(), so that unlink will be handled only in cleanup_syn_rcv_conn(). Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07chtls: Avoid unnecessary freeing of oreq pointerAyush Sawal
In chtls_pass_accept_request(), removing the chtls_reqsk_free() call to avoid oreq freeing twice. Here oreq is the pointer to struct request_sock. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07chtls: Fix panic when route to peer not configuredAyush Sawal
If route to peer is not configured, we might get non tls devices from dst_neigh_lookup() which is invalid, adding a check to avoid it. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07chtls: Remove invalid set_tcb callAyush Sawal
At the time of SYN_RECV, connection information is not initialized at FW, updating tcb flag over uninitialized connection causes adapter crash. We don't need to update the flag during SYN_RECV state, so avoid this. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07chtls: Fix hardware tid leakAyush Sawal
send_abort_rpl() is not calculating cpl_abort_req_rss offset and ends up sending wrong TID with abort_rpl WR causng tid leaks. Replaced send_abort_rpl() with chtls_send_abort_rpl() as it is redundant. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07Merge branch 'generic-zcopy_-functions'Jakub Kicinski
Jonathan Lemon says: ==================== Generic zcopy_* functions This is set of cleanup patches for zerocopy which are intended to allow a introduction of a different zerocopy implementation. The top level API will use the skb_zcopy_*() functions, while the current TCP specific zerocopy ends up using msg_zerocopy_*() calls. There should be no functional changes from these patches. ==================== Link: https://lore.kernel.org/r/20210106221841.1880536-1-jonathan.lemon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: Rename skb_zcopy_{get|put} to net_zcopy_{get|put}Jonathan Lemon
Unlike the rest of the skb_zcopy_ functions, these routines operate on a 'struct ubuf', not a skb. Remove the 'skb_' prefix from the naming to make things clearer. Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07tap/tun: add skb_zcopy_init() helper for initialization.Jonathan Lemon
Replace direct assignments with skb_zcopy_init() for zerocopy cases where a new skb is initialized, without changing the reference counts. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: add flags to ubuf_info for ubuf setupJonathan Lemon
Currently, when an ubuf is attached to a new skb, the shared flags word is initialized to a fixed value. Instead of doing this, set the default flags in the ubuf, and have new skbs inherit from this default. This is needed when setting up different zerocopy types. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: group skb_shinfo zerocopy related bits together.Jonathan Lemon
In preparation for expanded zerocopy (TX and RX), move the zerocopy related bits out of tx_flags into their own flag word. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: rename sock_zerocopy_* to msg_zerocopy_*Jonathan Lemon
At Willem's suggestion, rename the sock_zerocopy_* functions so that they match the MSG_ZEROCOPY flag, which makes it clear they are specific to this zerocopy implementation. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: Call skb_zcopy_clear() before unref'ing fragmentsJonathan Lemon
RX zerocopy fragment pages which are not allocated from the system page pool require special handling. Give the callback in skb_zcopy_clear() a chance to process them first. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: Call sock_zerocopy_put_abort from skb_zcopy_put_abortJonathan Lemon
The sock_zerocopy_put_abort function contains logic which is specific to the current zerocopy implementation. Add a wrapper which checks the callback and dispatches apppropriately. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: Add skb parameter to the ubuf zerocopy callbackJonathan Lemon
Add an optional skb parameter to the zerocopy callback parameter, which is passed down from skb_zcopy_clear(). This gives access to the original skb, which is needed for upcoming RX zero-copy error handling. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: replace sock_zerocopy_get with skb_zcopy_getJonathan Lemon
Rename the get routines for consistency. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: replace sock_zerocopy_put() with skb_zcopy_put()Jonathan Lemon
Replace sock_zerocopy_put with the generic skb_zcopy_put() function. Pass 'true' as the success argument, as this is identical to no change. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: Push status and refcounts into sock_zerocopy_callbackJonathan Lemon
Before this change, the caller of sock_zerocopy_callback would need to save the zerocopy status, decrement and check the refcount, and then call the callback function - the callback was only invoked when the refcount reached zero. Now, the caller just passes the status into the callback function, which saves the status and handles its own refcounts. This makes the behavior of the sock_zerocopy_callback identical to the tpacket and vhost callbacks. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: simplify sock_zerocopy_putJonathan Lemon
All 'struct ubuf_info' users should have a callback defined as of commit 0a4a060bb204 ("sock: fix zerocopy_success regression with msg_zerocopy"). Remove the dead code path to consume_skb(), which makes assumptions about how the structure was allocated. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07skbuff: remove unused skb_zcopy_abort functionJonathan Lemon
skb_zcopy_abort() has no in-tree consumers, remove it. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07Merge tag 'gcc-plugins-v5.11-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc-plugins fix from Kees Cook: "Bump c++ standard version for latest GCC versions (Valdis Kletnieks)" * tag 'gcc-plugins-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: fix gcc 11 indigestion with plugins...
2021-01-07Merge branch 'dwmac-meson8b-picosecond-precision-rx-delay-support'Jakub Kicinski
Martin Blumenstingl says: ==================== dwmac-meson8b: picosecond precision RX delay support with the help of Jianxin Pan (many thanks!) the meaning of the "new" PRG_ETH1[19:16] register bits on Amlogic Meson G12A, G12B and SM1 SoCs are finally known. These SoCs allow fine-tuning the RGMII RX delay in 200ps steps (contrary to what I have thought in the past [0] these are not some "calibration" values). The vendor u-boot has code to automatically detect the best RX/TX delay settings. For now we keep it simple and add a device-tree property with 200ps precision to select the "right" RX delay for each board. While here, deprecate the "amlogic,rx-delay-ns" property as it's not used on any upstream .dts (yet). The driver is backwards compatible. I have tested this on an X96 Air 4GB board (not upstream yet). Testing with iperf3 gives 938 Mbits/sec in both directions (RX and TX). The following network settings were used in the .dts (2ns TX delay generated by the PHY, 800ps RX delay generated by the MAC as the PHY only supports 0ns or 2ns RX delays): &ext_mdio { external_phy: ethernet-phy@0 { /* Realtek RTL8211F (0x001cc916) */ reg = <0>; eee-broken-1000t; reset-assert-us = <10000>; reset-deassert-us = <30000>; reset-gpios = <&gpio GPIOZ_15 (GPIO_ACTIVE_LOW | GPIO_OPEN_DRAIN)>; interrupt-parent = <&gpio_intc>; /* MAC_INTR on GPIOZ_14 */ interrupts = <26 IRQ_TYPE_LEVEL_LOW>; }; }; &ethmac { status = "okay"; pinctrl-0 = <&eth_pins>, <&eth_rgmii_pins>; pinctrl-names = "default"; phy-mode = "rgmii-txid"; phy-handle = <&external_phy>; amlogic,rgmii-rx-delay-ps = <800>; }; To use the same settings from vendor u-boot (which in my case has broken Ethernet) the following commands can be used: mw.l 0xff634540 0x1621 mw.l 0xff634544 0x30000 phyreg w 0x0 0x1040 phyreg w 0x1f 0xd08 phyreg w 0x11 0x9 phyreg w 0x15 0x11 phyreg w 0x1f 0x0 phyreg w 0x0 0x9200 Also I have tested this on a X96 Max board without any .dts changes to confirm that other boards with the same IP block still work fine with these changes. Changes since v3 at [3]. - added Florian's Reviewed-by to patch 1 (thank you!) - rebased on top of net-next Changes since v2 at [2]: - use the generic property name "rx-internal-delay-ps" as suggested by Rob (thanks!). This affects patches #1 and #3. The biggest change is is in patch #1 which is why I didn't add Florian's and Andrew's Reviewed-by - added Andrew's and Florian's Reviewed-by to patches 2, 3, 4, 5 (many thanks to both!). I decided to do this despite renaming the property to the generic name "rx-internal-delay-ps" as it only affects the patch description and one line of code - updated patch description of patch #3 to explain why there's not a lot of validation when parsing the old device-tree property (in nanosecond precision) - dropped RFC status Changes since v1 at [1]: - updated patch 1 by making it more clear when the RX delay is applied. Thanks to Andrew for the suggestion! - added a fix to enabling the timing-adjustment clock only when really needed. Found by Andrew - thanks! - added testing not about X96 Max - v1 did not go to the netdev mailing list, v2 fixes this [0] https://lore.kernel.org/netdev/CAFBinCATt4Hi9rigj52nMf3oygyFbnopZcsakGL=KyWnsjY3JA@mail.gmail.com/ [1] https://patchwork.kernel.org/project/linux-amlogic/list/?series=384279&state=%2A&archive=both [2] https://patchwork.kernel.org/project/linux-amlogic/list/?series=384491&state=%2A&archive=both [3] https://patchwork.kernel.org/project/linux-amlogic/list/?series=406005&state=%2A&archive=both ==================== Link: https://lore.kernel.org/r/20210106134251.45264-1-martin.blumenstingl@googlemail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: stmmac: dwmac-meson8b: add support for the RGMII RX delay on G12AMartin Blumenstingl
Amlogic Meson G12A (and newer: G12B, SM1) SoCs have a more advanced RX delay logic. Instead of fine-tuning the delay in the nanoseconds range it now allows tuning in 200 picosecond steps. This support comes with new bits in the PRG_ETH1[19:16] register. Add support for validating the RGMII RX delay as well as configuring the register accordingly on these platforms. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: stmmac: dwmac-meson8b: move RGMII delays into a separate functionMartin Blumenstingl
Newer SoCs starting with the Amlogic Meson G12A have more a precise RGMII RX delay configuration register. This means more complexity in the code. Extract the existing RGMII delay configuration code into a separate function to make it easier to read/understand even when adding more logic in the future. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: stmmac: dwmac-meson8b: use picoseconds for the RGMII RX delayMartin Blumenstingl
Amlogic Meson G12A, G12B and SM1 SoCs have a more advanced RGMII RX delay register which allows picoseconds precision. Parse the new "rx-internal-delay-ps" property or fall back to the value from the old "amlogic,rx-delay-ns" property. No upstream DTB uses the old "amlogic,rx-delay-ns" property (yet). Only include minimalistic logic to fall back to the old property, without any special validation (for example if the old and new property are given at the same time). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: stmmac: dwmac-meson8b: fix enabling the timing-adjustment clockMartin Blumenstingl
The timing-adjustment clock only has to be enabled when a) there is a 2ns RX delay configured using device-tree and b) the phy-mode indicates that the RX delay should be enabled. Only enable the RX delay if both are true, instead of (by accident) also enabling it when there's the 2ns RX delay configured but the phy-mode incicates that the RX delay is not used. Fixes: 9308c47640d515 ("net: stmmac: dwmac-meson8b: add support for the RX delay configuration") Reported-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07dt-bindings: net: dwmac-meson: use picoseconds for the RGMII RX delayMartin Blumenstingl
Amlogic Meson G12A, G12B and SM1 SoCs have a more advanced RGMII RX delay register which allows picoseconds precision. Deprecate the old "amlogic,rx-delay-ns" in favour of the generic "rx-internal-delay-ps" property. For older SoCs the only known supported values were 0ns and 2ns. The new SoCs have support for RGMII RX delays between 0ps and 3000ps in 200ps steps. Don't carry over the description for the "rx-internal-delay-ps" property and inherit that from ethernet-controller.yaml instead. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07Merge branch 'reduce-coupling-between-dsa-and-broadcom-systemport-driver'Jakub Kicinski
Vladimir Oltean says: ==================== Reduce coupling between DSA and Broadcom SYSTEMPORT driver Upon a quick inspection, it seems that there is some code in the generic DSA layer that is somehow specific to the Broadcom SYSTEMPORT driver. The challenge there is that the hardware integration is very tight between the switch and the DSA master interface. However this does not mean that the drivers must also be as integrated as the hardware is. We can avoid creating a DSA notifier just for the Broadcom SYSTEMPORT, and we can move some Broadcom-specific queue mapping helpers outside of the common include/net/dsa.h. ==================== Link: https://lore.kernel.org/r/20210107012403.1521114-1-olteanv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: dsa: remove the DSA specific notifiersVladimir Oltean
This effectively reverts commit 60724d4bae14 ("net: dsa: Add support for DSA specific notifiers"). The reason is that since commit 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings"), it appears that there is a generic way to achieve the same purpose. The only user thus far, the Broadcom SYSTEMPORT driver, was converted to use the generic notifiers. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07net: systemport: use standard netdevice notifier to detect DSA presenceVladimir Oltean
The SYSTEMPORT driver maps each port of the embedded Broadcom DSA switch port to a certain queue of the master Ethernet controller. For that it currently uses a dedicated notifier infrastructure which was added in commit 60724d4bae14 ("net: dsa: Add support for DSA specific notifiers"). However, since commit 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings"), DSA is actually an upper of the Broadcom SYSTEMPORT as far as the netdevice adjacency lists are concerned. So naturally, the plain NETDEV_CHANGEUPPER net device notifiers are emitted. It looks like there is enough API exposed by DSA to the outside world already to make the call_dsa_notifiers API redundant. So let's convert its only user to plain netdev notifiers. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>