aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2021-07-09Merge tag 'ceph-for-5.14-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "We have new filesystem client metrics for reporting I/O sizes from Xiubo, two patchsets from Jeff that begin to untangle some heavyweight blocking locks in the filesystem and a bunch of code cleanups" * tag 'ceph-for-5.14-rc1' of git://github.com/ceph/ceph-client: ceph: take reference to req->r_parent at point of assignment ceph: eliminate ceph_async_iput() ceph: don't take s_mutex in ceph_flush_snaps ceph: don't take s_mutex in try_flush_caps ceph: don't take s_mutex or snap_rwsem in ceph_check_caps ceph: eliminate session->s_gen_ttl_lock ceph: allow ceph_put_mds_session to take NULL or ERR_PTR ceph: clean up locking annotation for ceph_get_snap_realm and __lookup_snap_realm ceph: add some lockdep assertions around snaprealm handling ceph: decoding error in ceph_update_snap_realm should return -EIO ceph: add IO size metrics support ceph: update and rename __update_latency helper to __update_stdev ceph: simplify the metrics struct libceph: fix doc warnings in cls_lock_client.c libceph: remove unnecessary ret variable in ceph_auth_init() libceph: fix some spelling mistakes libceph: kill ceph_none_authorizer::reply_buf ceph: make ceph_queue_cap_snap static ceph: make ceph_netfs_read_ops static ceph: remove bogus checks and WARN_ONs from ceph_set_page_dirty
2021-07-09Merge tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - Multiple patches to add support for fcntl() leases over NFSv4. - A sysfs interface to display more information about the various transport connections used by the RPC client - A sysfs interface to allow a suitably privileged user to offline a transport that may no longer point to a valid server - A sysfs interface to allow a suitably privileged user to change the server IP address used by the RPC client Stable fixes: - Two sunrpc fixes for deadlocks involving privileged rpc_wait_queues Bugfixes: - SUNRPC: Avoid a KASAN slab-out-of-bounds bug in xdr_set_page_base() - SUNRPC: prevent port reuse on transports which don't request it. - NFSv3: Fix memory leak in posix_acl_create() - NFS: Various fixes to attribute revalidation timeouts - NFSv4: Fix handling of non-atomic change attribute updates - NFSv4: If a server is down, don't cause mounts to other servers to hang as well - pNFS: Fix an Oops in pnfs_mark_request_commit() when doing O_DIRECT - NFS: Fix mount failures due to incorrect setting of the has_sec_mnt_opts filesystem flag - NFS: Ensure nfs_readpage returns promptly when an internal error occurs - NFS: Fix fscache read from NFS after cache error - pNFS: Various bugfixes around the LAYOUTGET operation" * tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (46 commits) NFSv4/pNFS: Return an error if _nfs4_pnfs_v3_ds_connect can't load NFSv3 NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times NFSv4/pnfs: Clean up layout get on open NFSv4/pnfs: Fix layoutget behaviour after invalidation NFSv4/pnfs: Fix the layout barrier update NFS: Fix fscache read from NFS after cache error NFS: Ensure nfs_readpage returns promptly when internal error occurs sunrpc: remove an offlined xprt using sysfs sunrpc: provide showing transport's state info in the sysfs directory sunrpc: display xprt's queuelen of assigned tasks via sysfs sunrpc: provide multipath info in the sysfs directory NFSv4.1 identify and mark RPC tasks that can move between transports sunrpc: provide transport info in the sysfs directory SUNRPC: take a xprt offline using sysfs sunrpc: add dst_attr attributes to the sysfs xprt directory SUNRPC for TCP display xprt's source port in sysfs xprt_info SUNRPC query transport's source port SUNRPC display xprt's main value in sysfs's xprt_info SUNRPC mark the first transport sunrpc: add add sysfs directory per xprt under each xprt_switch ...
2021-07-08Merge part 2 of branch 'sysfs-devel'Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08Merge branch 'sysfs-devel'Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: remove an offlined xprt using sysfsOlga Kornievskaia
Once a transport has been put offline, this transport can be also removed from the list of transports. Any tasks that have been stuck on this transport would find the next available active transport and be re-tried. This transport would be removed from the xprt_switch list and freed. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: provide showing transport's state info in the sysfs directoryOlga Kornievskaia
In preparation of being able to change the xprt's state, add a way to show currect state of the transport. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: display xprt's queuelen of assigned tasks via sysfsOlga Kornievskaia
Once a task grabs a trasnport it's reflected in the queuelen of the rpc_xprt structure. Add display of that value in the xprt's info file in sysfs. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: provide multipath info in the sysfs directoryOlga Kornievskaia
Allow to query xrpt_switch attributes. Currently showing the following fields of the rpc_xprt_switch structure: xps_nxprts, xps_nactive, xps_queuelen. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: provide transport info in the sysfs directoryOlga Kornievskaia
Allow to query transport's attributes. Currently showing following fields of the rpc_xprt structure: state, last_used, cong, cwnd, max_reqs, min_reqs, num_reqs, sizes of queues binding, sending, pending, backlog. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08SUNRPC: take a xprt offline using sysfsOlga Kornievskaia
Using sysfs's xprt_state attribute, mark a particular transport offline. It will not be picked during the round-robin selection. It's not allowed to take the main (1st created transport associated with the rpc_client) offline. Also bring a transport back online via sysfs by writing "online" and that would allow for this transport to be picked during the round- robin selection. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add dst_attr attributes to the sysfs xprt directoryOlga Kornievskaia
Allow to query and set the destination's address of a transport. Setting of the destination address is allowed only for TCP or RDMA based connections. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08SUNRPC for TCP display xprt's source port in sysfs xprt_infoOlga Kornievskaia
Using TCP connection's source port it is useful to match connections seen on the network traces to the xprts used by the linux nfs client. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08SUNRPC query transport's source portOlga Kornievskaia
Provide ability to query transport's source port. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08SUNRPC display xprt's main value in sysfs's xprt_infoOlga Kornievskaia
Display in sysfs in the information about the xprt if this is a main transport or not. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08SUNRPC mark the first transportOlga Kornievskaia
When an RPC client gets created it's first transport is special and should be marked a main transport. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add add sysfs directory per xprt under each xprt_switchOlga Kornievskaia
Add individual transport directories under each transport switch group. For instance, for each nconnect=X connections there will be a transport directory. Naming conventions also identifies transport type -- xprt-<id>-<type> where type is udp, tcp, rdma, local, bc. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add a symlink from rpc-client directory to the xprt_switchOlga Kornievskaia
An rpc client uses a transport switch and one ore more transports associated with that switch. Since transports are shared among rpc clients, create a symlink into the xprt_switch directory instead of duplicating entries under each rpc client. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add xprt_switch direcotry to sunrpc's sysfsOlga Kornievskaia
Add xprt_switch directory to the sysfs and create individual xprt_swith subdirectories for multipath transport group. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: keep track of the xprt_class in rpc_xprt structureOlga Kornievskaia
We need to keep track of the type for a given transport. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add IDs to multipathOlga Kornievskaia
This is used to uniquely identify sunrpc multipath objects in /sys. Signed-off-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: add xprt idOlga Kornievskaia
This adds a unique identifier for a sunrpc transport in sysfs, which is similarly managed to the unique IDs of clients. Signed-off-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: Create per-rpc_clnt sysfs kobjectsOlga Kornievskaia
These will eventually have files placed under them for sysfs operations. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: Create a client/ subdirectory in the sunrpc sysfsOlga Kornievskaia
For network namespace separation. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08sunrpc: Create a sunrpc directory under /sys/kernel/Olga Kornievskaia
This is where we'll put per-rpc_client related files Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-07Merge tag 'nfsd-5.14' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: - add tracepoints for callbacks and for client creation and destruction - cache the mounts used for server-to-server copies - expose callback information in /proc/fs/nfsd/clients/*/info - don't hold locks unnecessarily while waiting for commits - update NLM to use xdr_stream, as we have for NFSv2/v3/v4 * tag 'nfsd-5.14' of git://linux-nfs.org/~bfields/linux: (69 commits) nfsd: fix NULL dereference in nfs3svc_encode_getaclres NFSD: Prevent a possible oops in the nfs_dirent() tracepoint nfsd: remove redundant assignment to pointer 'this' nfsd: Reduce contention for the nfsd_file nf_rwsem lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream lockd: Update the NLMv4 nlm_res results encoder to use struct xdr_stream lockd: Update the NLMv4 TEST results encoder to use struct xdr_stream lockd: Update the NLMv4 void results encoder to use struct xdr_stream lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream lockd: Update the NLMv4 SM_NOTIFY arguments decoder to use struct xdr_stream lockd: Update the NLMv4 nlm_res arguments decoder to use struct xdr_stream lockd: Update the NLMv4 UNLOCK arguments decoder to use struct xdr_stream lockd: Update the NLMv4 CANCEL arguments decoder to use struct xdr_stream lockd: Update the NLMv4 LOCK arguments decoder to use struct xdr_stream lockd: Update the NLMv4 TEST arguments decoder to use struct xdr_stream lockd: Update the NLMv4 void arguments decoder to use struct xdr_stream lockd: Update the NLMv1 SHARE results encoder to use struct xdr_stream lockd: Update the NLMv1 nlm_res results encoder to use struct xdr_stream lockd: Update the NLMv1 TEST results encoder to use struct xdr_stream ...
2021-07-06rpc: remove redundant initialization of variable statusColin Ian King
The variable status is being initialized with a value that is never read, the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-06xprtrdma: Fix spelling mistakesZheng Yongjun
Fix some spelling mistakes in comments: succes ==> success Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-05Merge tag 'tty-5.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial updates from Greg KH: "Here is the big set of tty and serial driver patches for 5.14-rc1. A bit more than normal, but nothing major, lots of cleanups. Highlights are: - lots of tty api cleanups and mxser driver cleanups from Jiri - build warning fixes - various serial driver updates - coding style cleanups - various tty driver minor fixes and updates - removal of broken and disable r3964 line discipline (finally!) All of these have been in linux-next for a while with no reported issues" * tag 'tty-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (227 commits) serial: mvebu-uart: remove unused member nb from struct mvebu_uart arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART dt-bindings: mvebu-uart: fix documentation serial: mvebu-uart: correctly calculate minimal possible baudrate serial: mvebu-uart: do not allow changing baudrate when uartclk is not available serial: mvebu-uart: fix calculation of clock divisor tty: make linux/tty_flip.h self-contained serial: Prefer unsigned int to bare use of unsigned serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs serial: qcom_geni_serial: use DT aliases according to DT bindings Revert "tty: serial: Add UART driver for Cortina-Access platform" tty: serial: Add UART driver for Cortina-Access platform MAINTAINERS: add me back as mxser maintainer mxser: Documentation, fix typos mxser: Documentation, make the docs up-to-date mxser: Documentation, remove traces of callout device mxser: introduce mxser_16550A_or_MUST helper mxser: rename flags to old_speed in mxser_set_serial_info mxser: use port variable in mxser_set_serial_info mxser: access info->MCR under info->slock ...
2021-07-02Merge tag 'asm-generic-unaligned-5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm/unaligned.h unification from Arnd Bergmann: "Unify asm/unaligned.h around struct helper The get_unaligned()/put_unaligned() helpers are traditionally architecture specific, with the two main variants being the "access-ok.h" version that assumes unaligned pointer accesses always work on a particular architecture, and the "le-struct.h" version that casts the data to a byte aligned type before dereferencing, for architectures that cannot always do unaligned accesses in hardware. Based on the discussion linked below, it appears that the access-ok version is not realiable on any architecture, but the struct version probably has no downsides. This series changes the code to use the same implementation on all architectures, addressing the few exceptions separately" Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/ Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363 Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/ Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2 Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/ * tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: simplify asm/unaligned.h asm-generic: uaccess: 1-byte access is always aligned netpoll: avoid put_unaligned() on single character mwifiex: re-fix for unaligned accesses apparmor: use get_unaligned() only for multi-byte words partitions: msdos: fix one-byte get_unaligned() asm-generic: unaligned always use struct helpers asm-generic: unaligned: remove byteshift helpers powerpc: use linux/unaligned/le_struct.h on LE power7 m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS sh: remove unaligned access for sh4a openrisc: always use unaligned-struct header asm-generic: use asm-generic/unaligned.h for most architectures
2021-06-30Merge tag 'net-next-5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - BPF: - add syscall program type and libbpf support for generating instructions and bindings for in-kernel BPF loaders (BPF loaders for BPF), this is a stepping stone for signed BPF programs - infrastructure to migrate TCP child sockets from one listener to another in the same reuseport group/map to improve flexibility of service hand-off/restart - add broadcast support to XDP redirect - allow bypass of the lockless qdisc to improving performance (for pktgen: +23% with one thread, +44% with 2 threads) - add a simpler version of "DO_ONCE()" which does not require jump labels, intended for slow-path usage - virtio/vsock: introduce SOCK_SEQPACKET support - add getsocketopt to retrieve netns cookie - ip: treat lowest address of a IPv4 subnet as ordinary unicast address allowing reclaiming of precious IPv4 addresses - ipv6: use prandom_u32() for ID generation - ip: add support for more flexible field selection for hashing across multi-path routes (w/ offload to mlxsw) - icmp: add support for extended RFC 8335 PROBE (ping) - seg6: add support for SRv6 End.DT46 behavior - mptcp: - DSS checksum support (RFC 8684) to detect middlebox meddling - support Connection-time 'C' flag - time stamping support - sctp: packetization Layer Path MTU Discovery (RFC 8899) - xfrm: speed up state addition with seq set - WiFi: - hidden AP discovery on 6 GHz and other HE 6 GHz improvements - aggregation handling improvements for some drivers - minstrel improvements for no-ack frames - deferred rate control for TXQs to improve reaction times - switch from round robin to virtual time-based airtime scheduler - add trace points: - tcp checksum errors - openvswitch - action execution, upcalls - socket errors via sk_error_report Device APIs: - devlink: add rate API for hierarchical control of max egress rate of virtual devices (VFs, SFs etc.) - don't require RCU read lock to be held around BPF hooks in NAPI context - page_pool: generic buffer recycling New hardware/drivers: - mobile: - iosm: PCIe Driver for Intel M.2 Modem - support for Qualcomm MSM8998 (ipa) - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU) - NXP SJA1110 Automotive Ethernet 10-port switch - Qualcomm QCA8327 switch support (qca8k) - Mikrotik 10/25G NIC (atl1c) Driver changes: - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP (our first foray into MAC/PHY description via ACPI) - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx - Mellanox/Nvidia NIC (mlx5) - NIC VF offload of L2 bridging - support IRQ distribution to Sub-functions - Marvell (prestera): - add flower and match all - devlink trap - link aggregation - Netronome (nfp): connection tracking offload - Intel 1GE (igc): add AF_XDP support - Marvell DPU (octeontx2): ingress ratelimit offload - Google vNIC (gve): new ring/descriptor format support - Qualcomm mobile (rmnet & ipa): inline checksum offload support - MediaTek WiFi (mt76) - mt7915 MSI support - mt7915 Tx status reporting - mt7915 thermal sensors support - mt7921 decapsulation offload - mt7921 enable runtime pm and deep sleep - Realtek WiFi (rtw88) - beacon filter support - Tx antenna path diversity support - firmware crash information via devcoredump - Qualcomm WiFi (wcn36xx) - Wake-on-WLAN support with magic packets and GTK rekeying - Micrel PHY (ksz886x/ksz8081): add cable test support" * tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits) tcp: change ICSK_CA_PRIV_SIZE definition tcp_yeah: check struct yeah size at compile time gve: DQO: Fix off by one in gve_rx_dqo() stmmac: intel: set PCI_D3hot in suspend stmmac: intel: Enable PHY WOL option in EHL net: stmmac: option to enable PHY WOL with PMT enabled net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del} net: use netdev_info in ndo_dflt_fdb_{add,del} ptp: Set lookup cookie when creating a PTP PPS source. net: sock: add trace for socket errors net: sock: introduce sk_error_report net: dsa: replay the local bridge FDB entries pointing to the bridge dev too net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev net: dsa: include fdb entries pointing to bridge in the host fdb list net: dsa: include bridge addresses which are local in the host fdb list net: dsa: sync static FDB entries on foreign interfaces to hardware net: dsa: install the host MDB and FDB entries in the master's RX filter net: dsa: reference count the FDB addresses at the cross-chip notifier level net: dsa: introduce a separate cross-chip notifier type for host FDBs net: dsa: reference count the MDB entries at the cross-chip notifier level ...
2021-06-30Merge tag 'selinux-pr-20210629' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux updates from Paul Moore: - The slow_avc_audit() function is now non-blocking so we can remove the AVC_NONBLOCKING tricks; this also includes the 'flags' variant of avc_has_perm(). - Use kmemdup() instead of kcalloc()+copy when copying parts of the SELinux policydb. - The InfiniBand device name is now passed by reference when possible in the SELinux code, removing a strncpy(). - Minor cleanups including: constification of avtab function args, removal of useless LSM/XFRM function args, SELinux kdoc fixes, and removal of redundant assignments. * tag 'selinux-pr-20210629' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: kill 'flags' argument in avc_has_perm_flags() and avc_audit() selinux: slow_avc_audit has become non-blocking selinux: Fix kernel-doc selinux: use __GFP_NOWARN with GFP_NOWAIT in the AVC lsm_audit,selinux: pass IB device name by reference selinux: Remove redundant assignment to rc selinux: Corrected comment to match kernel-doc comment selinux: delete selinux_xfrm_policy_lookup() useless argument selinux: constify some avtab function arguments selinux: simplify duplicate_policydb_cond_list() by using kmemdup()
2021-06-29Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "191 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab, slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap, mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization, pagealloc, and memory-failure)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits) mm,hwpoison: make get_hwpoison_page() call get_any_page() mm,hwpoison: send SIGBUS with error virutal address mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes mm/page_alloc: allow high-order pages to be stored on the per-cpu lists mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA docs: remove description of DISCONTIGMEM arch, mm: remove stale mentions of DISCONIGMEM mm: remove CONFIG_DISCONTIGMEM m68k: remove support for DISCONTIGMEM arc: remove support for DISCONTIGMEM arc: update comment about HIGHMEM implementation alpha: remove DISCONTIGMEM and NUMA mm/page_alloc: move free_the_page mm/page_alloc: fix counting of managed_pages mm/page_alloc: improve memmap_pages dbg msg mm: drop SECTION_SHIFT in code comments mm/page_alloc: introduce vm.percpu_pagelist_high_fraction mm/page_alloc: limit the number of pages on PCP lists when reclaim is active mm/page_alloc: scale the number of pages that are batch freed ...
2021-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Trivial conflict in net/netfilter/nf_tables_api.c. Duplicate fix in tools/testing/selftests/net/devlink_port_split.py - take the net-next version. skmsg, and L4 bpf - keep the bpf code but remove the flags and err params. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29tcp_yeah: check struct yeah size at compile timeEric Dumazet
Compiler can perform the sanity check instead of waiting to load the module and crash the host. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}Vladimir Oltean
"Static" is a loaded word, and probably not what the author meant when the code was written. In particular, this looks weird: $ bridge fdb add dev swp0 00:01:02:03:04:05 local # totally fine, but $ bridge fdb add dev swp0 00:01:02:03:04:05 static [ 2020.708298] swp0: FDB only supports static addresses # hmm what? By looking at the implementation which uses dev_uc_add/dev_uc_del it is absolutely clear that only local addresses are supported, and the proper Network Unreachability Detection state is being used for this purpose (user space indeed sets NUD_PERMANENT when local addresses are meant). So it is just the message that is wrong, fix it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: use netdev_info in ndo_dflt_fdb_{add,del}Vladimir Oltean
Use the more modern printk helper for network interfaces, which also contains information about the associated struct device, and results in overall shorter line lengths compared to printing an open-coded dev->name. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: sock: add trace for socket errorsAlexander Aring
This patch will add tracers to trace inet socket errors only. A user space monitor application can track connection errors indepedent from socket lifetime and do additional handling. For example a cluster manager can fence a node if errors occurs in a specific heuristic. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: sock: introduce sk_error_reportAlexander Aring
This patch introduces a function wrapper to call the sk_error_report callback. That will prepare to add additional handling whenever sk_error_report is called, for example to trace socket errors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge tag 'hyperv-next-signed-20210629' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: "Just a few minor enhancement patches and bug fixes" * tag 'hyperv-next-signed-20210629' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: PCI: hv: Add check for hyperv_initialized in init_hv_pci_drv() Drivers: hv: Move Hyper-V extended capability check to arch neutral code drivers: hv: Fix missing error code in vmbus_connect() x86/hyperv: fix logical processor creation hv_utils: Fix passing zero to 'PTR_ERR' warning scsi: storvsc: Use blk_mq_unique_tag() to generate requestIDs Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer hv_balloon: Remove redundant assignment to region_start
2021-06-29net/ipv5/tcp: use vma_lookup() in tcp_zerocopy_receive()Liam Howlett
Use vma_lookup() to find the VMA at a specific address. As vma_lookup() will return NULL if the address is not within any VMA, the start address no longer needs to be validated. Link: https://lkml.kernel.org/r/20210521174745.2219620-13-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29net: dsa: replay the local bridge FDB entries pointing to the bridge dev tooVladimir Oltean
When we join a bridge that already has some local addresses pointing to itself, we do not get those notifications. Similarly, when we leave that bridge, we do not get notifications for the deletion of those entries. The only switchdev notifications we get are those of entries added while the DSA port is enslaved to the bridge. This makes use cases such as the following work properly (with the number of additions and removals properly balanced): ip link add br0 type bridge ip link add br1 type bridge ip link set br0 address 00:01:02:03:04:05 ip link set br1 address 00:01:02:03:04:05 ip link set swp0 up ip link set swp1 up ip link set swp0 master br0 ip link set swp1 master br1 ip link set br0 up ip link set br1 up ip link del br1 # 00:01:02:03:04:05 still installed on the CPU port ip link del br0 # 00:01:02:03:04:05 finally removed from the CPU port Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are ↵Vladimir Oltean
on the same dev When (a) "dev" is a bridge port which the DSA switch tree offloads, but is otherwise not a dsa slave (such as a LAG netdev), or (b) "dev" is the bridge net device itself then strange things happen to the dev_hold/dev_put pair: dsa_schedule_work() will still be called with a DSA port that offloads that netdev, but dev_hold() will be called on the non-DSA netdev. Then the "if" condition in dsa_slave_switchdev_event_work() does not pass, because "dev" is not a DSA netdev, so dev_put() is not called. This results in the simple fact that we have a reference counting mismatch on the "dev" net device. This can be seen when we add support for host addresses installed on the bridge net device. ip link add br1 type bridge ip link set br1 address 00:01:02:03:04:05 ip link set swp0 master br1 ip link del br1 [ 968.512278] unregister_netdevice: waiting for br1 to become free. Usage count = 5 It seems foolish to do penny pinching and not add the net_device pointer in the dsa_switchdev_event_work structure, so let's finally do that. As an added bonus, when we start offloading local entries pointing towards the bridge, these will now properly appear as 'offloaded' in 'bridge fdb' (this was not possible before, because 'dev' was assumed to only be a DSA net device): 00:01:02:03:04:05 dev br0 vlan 1 offload master br0 permanent 00:01:02:03:04:05 dev br0 offload master br0 permanent Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: include fdb entries pointing to bridge in the host fdb listVladimir Oltean
The bridge supports a legacy way of adding local (non-forwarded) FDB entries, which works on an individual port basis: bridge fdb add dev swp0 00:01:02:03:04:05 master local As well as a new way, added by Roopa Prabhu in commit 3741873b4f73 ("bridge: allow adding of fdb entries pointing to the bridge device"): bridge fdb add dev br0 00:01:02:03:04:05 self local The two commands are functionally equivalent, except that the first one produces an entry with fdb->dst == swp0, and the other an entry with fdb->dst == NULL. The confusing part, though, is that even if fdb->dst is swp0 for the 'local on port' entry, that destination is not used. Nonetheless, the idea is that the bridge has reference counting for local entries, and local entries pointing towards the bridge are still 'as local' as local entries for a port. The bridge adds the MAC addresses of the interfaces automatically as FDB entries with is_local=1. For the MAC address of the ports, fdb->dst will be equal to the port, and for the MAC address of the bridge, fdb->dst will point towards the bridge (i.e. be NULL). Therefore, if the MAC address of the bridge is not inherited from either of the physical ports, then we must explicitly catch local FDB entries emitted towards the br0, otherwise we'll miss the MAC address of the bridge (and, of course, any entry with 'bridge add dev br0 ... self local'). Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: include bridge addresses which are local in the host fdb listTobias Waldekranz
The bridge automatically creates local (not forwarded) fdb entries pointing towards physical ports with their interface MAC addresses. For switchdev, the significance of these fdb entries is the exact opposite of that of non-local entries: instead of sending these frame outwards, we must send them inwards (towards the host). NOTE: The bridge's own MAC address is also "local". If that address is not shared with any port, the bridge's MAC is not be added by this functionality - but the following commit takes care of that case. NOTE 2: We mark these addresses as host-filtered regardless of the value of ds->assisted_learning_on_cpu_port. This is because, as opposed to the speculative logic done for dynamic address learning on foreign interfaces, the local FDB entries are rather fixed, so there isn't any risk of them migrating from one bridge port to another. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: sync static FDB entries on foreign interfaces to hardwareVladimir Oltean
DSA is able to install FDB entries towards the CPU port for addresses which were dynamically learnt by the software bridge on foreign interfaces that are in the same bridge with a DSA switch interface. Since this behavior is opportunistic, it is guarded by the "assisted_learning_on_cpu_port" property which can be enabled by drivers and is not done automatically (since certain switches may support address learning of packets coming from the CPU port). But if those FDB entries added on the foreign interfaces are static (added by the user) instead of dynamically learnt, currently DSA does not do anything (and arguably it should). Because static FDB entries are not supposed to move on their own, there is no downside in reusing the "assisted_learning_on_cpu_port" logic to sync static FDB entries to the DSA CPU port unconditionally, even if assisted_learning_on_cpu_port is not requested by the driver. For example, this situation: br0 / \ swp0 dummy0 $ bridge fdb add 02:00:de:ad:00:01 dev dummy0 vlan 1 master static Results in DSA adding an entry in the hardware FDB, pointing this address towards the CPU port. The same is true for entries added to the bridge itself, e.g: $ bridge fdb add 02:00:de:ad:00:01 dev br0 vlan 1 self local (except that right now, DSA still ignores 'local' FDB entries, this will be changed in a later patch) Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: install the host MDB and FDB entries in the master's RX filterVladimir Oltean
If the DSA master implements strict address filtering, then the unicast and multicast addresses kept by the DSA CPU ports should be synchronized with the address lists of the DSA master. Note that we want the synchronization of the master's address lists even if the DSA switch doesn't support unicast/multicast database operations, on the premises that the packets will be flooded to the CPU in that case, and we should still instruct the master to receive them. This is why we do the dev_uc_add() etc first, even if dsa_port_notify() returns -EOPNOTSUPP. In turn, dev_uc_add() and friends return error only if memory allocation fails, so it is probably ok to check and propagate that error code and not just ignore it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: reference count the FDB addresses at the cross-chip notifier levelVladimir Oltean
The same concerns expressed for host MDB entries are valid for host FDBs just as well: - in the case of multiple bridges spanning the same switch chip, deleting a host FDB entry that belongs to one bridge will result in breakage to the other bridge - not deleting FDB entries across DSA links means that the switch's hardware tables will eventually run out, given enough wear&tear So do the same thing and introduce reference counting for CPU ports and DSA links using the same data structures as we have for MDB entries. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: introduce a separate cross-chip notifier type for host FDBsVladimir Oltean
DSA treats some bridge FDB entries by trapping them to the CPU port. Currently, the only class of such entries are FDB addresses learnt by the software bridge on a foreign interface. However there are many more to be added: - FDB entries with the is_local flag (for termination) added by the bridge on the user ports (typically containing the MAC address of the bridge port) - FDB entries pointing towards the bridge net device (for termination). Typically these contain the MAC address of the bridge net device. - Static FDB entries installed on a foreign interface that is in the same bridge with a DSA user port. The reason why a separate cross-chip notifier for host FDBs is justified compared to normal FDBs is the same as in the case of host MDBs: the cross-chip notifier matching function in switch.c should avoid installing these entries on routing ports that route towards the targeted switch, but not towards the CPU. This is required in order to have proper support for H-like multi-chip topologies. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: reference count the MDB entries at the cross-chip notifier levelVladimir Oltean
Ever since the cross-chip notifiers were introduced, the design was meant to be simplistic and just get the job done without worrying too much about dangling resources left behind. For example, somebody installs an MDB entry on sw0p0 in this daisy chain topology. It gets installed using ds->ops->port_mdb_add() on sw0p0, sw1p4 and sw2p4. | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] Then the same person deletes that MDB entry. The cross-chip notifier for deletion only matches sw0p0: | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ ] Why? Because the DSA links are 'trunk' ports, if we just go ahead and delete the MDB from sw1p4 and sw2p4 directly, we might delete those multicast entries when they are still needed. Just consider the fact that somebody does: - add a multicast MAC address towards sw0p0 [ via the cross-chip notifiers it gets installed on the DSA links too ] - add the same multicast MAC address towards sw0p1 (another port of that same switch) - delete the same multicast MAC address from sw0p0. At this point, if we deleted the MAC address from the DSA links, it would be flooded, even though there is still an entry on switch 0 which needs it not to. So that is why deletions only match the targeted source port and nothing on DSA links. Of course, dangling resources means that the hardware tables will eventually run out given enough additions/removals, but hey, at least it's simple. But there is a bigger concern which needs to be addressed, and that is our support for SWITCHDEV_OBJ_ID_HOST_MDB. DSA simply translates such an object into a dsa_port_host_mdb_add() which ends up as ds->ops->port_mdb_add() on the upstream port, and a similar thing happens on deletion: dsa_port_host_mdb_del() will trigger ds->ops->port_mdb_del() on the upstream port. When there are 2 VLAN-unaware bridges spanning the same switch (which is a use case DSA proudly supports), each bridge will install its own SWITCHDEV_OBJ_ID_HOST_MDB entries. But upon deletion, DSA goes ahead and emits a DSA_NOTIFIER_MDB_DEL for dp->cpu_dp, which is shared between the user ports enslaved to br0 and the user ports enslaved to br1. Not good. The host-trapped multicast addresses installed by br1 will be deleted when any state changes in br0 (IGMP timers expire, or ports leave, etc). To avoid this, we could of course go the route of the zero-sum game and delete the DSA_NOTIFIER_MDB_DEL call for dp->cpu_dp. But the better design is to just admit that on shared ports like DSA links and CPU ports, we should be reference counting calls, even if this consumes some dynamic memory which DSA has traditionally avoided. On the flip side, the hardware tables of switches are limited in size, so it would be good if the OS managed them properly instead of having them eventually overflow. To address the memory usage concern, we only apply the refcounting of MDB entries on ports that are really shared (CPU ports and DSA links) and not on user ports. In a typical single-switch setup, this means only the CPU port (and the host MDB entries are not that many, really). The name of the newly introduced data structures (dsa_mac_addr) is chosen in such a way that will be reusable for host FDB entries (next patch). With this change, we can finally have the same matching logic for the MDB additions and deletions, as well as for their host-trapped variants. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: introduce a separate cross-chip notifier type for host MDBsVladimir Oltean
Commit abd49535c380 ("net: dsa: execute dsa_switch_mdb_add only for routing port in cross-chip topologies") does a surprisingly good job even for the SWITCHDEV_OBJ_ID_HOST_MDB use case, where DSA simply translates a switchdev object received on dp into a cross-chip notifier for dp->cpu_dp. To visualize how that works, imagine the daisy chain topology below and consider a SWITCHDEV_OBJ_ID_HOST_MDB object emitted on sw2p0. How does the cross-chip notifier know to match on all the right ports (sw0p4, the dedicated CPU port, sw1p4, an upstream DSA link, and sw2p4, another upstream DSA link)? | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] The answer is simple: the dedicated CPU port of sw2p0 is sw0p4, and dsa_routing_port returns the upstream port for all switches. That is fine, but there are other topologies where this does not work as well. There are trees with "H" topologies in the wild, where there are 2 or more switches with DSA links between them, but every switch has its dedicated CPU port. For these topologies, it seems stupid for the neighbor switches to install an MDB entry on the routing port, since these multicast addresses are fundamentally different than the usual ones we support (and that is the justification for this patch, to introduce the concept of a termination plane multicast MAC address, as opposed to a forwarding plane multicast MAC address). For example, when a SWITCHDEV_OBJ_ID_HOST_MDB would get added to sw0p0, without this patch, it would get treated as a regular port MDB on sw0p2 and it would match on the ports below (including the sw1p3 routing port). | | sw0p0 sw0p1 sw0p2 sw0p3 sw1p3 sw1p2 sw1p1 sw1p0 [ user ] [ user ] [ cpu ] [ dsa ] [ dsa ] [ cpu ] [ user ] [ user ] [ ] [ ] [ x ] [ ] ---- [ x ] [ ] [ ] [ ] With the patch, the host MDB notifier on sw0p0 matches only on the local switch, which is what we want for a termination plane address. | | sw0p0 sw0p1 sw0p2 sw0p3 sw1p3 sw1p2 sw1p1 sw1p0 [ user ] [ user ] [ cpu ] [ dsa ] [ dsa ] [ cpu ] [ user ] [ user ] [ ] [ ] [ x ] [ ] ---- [ ] [ ] [ ] [ ] Name this new matching function "dsa_switch_host_address_match" since we will be reusing it soon for host FDB entries as well. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>