diff options
author | Linus Torvalds | 2022-10-04 13:38:03 -0700 |
---|---|---|
committer | Linus Torvalds | 2022-10-04 13:38:03 -0700 |
commit | 0326074ff4652329f2a1a9c8685104576bd8d131 (patch) | |
tree | 9a7574c7ccb05bf4c7cb34fc5a65457bb8f495cb /net | |
parent | 522667b24f08009591c90e75bfe2ffb67f555498 (diff) | |
parent | 681bf011b9b5989c6e9db6beb64494918aab9a43 (diff) |
Merge tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Introduce and use a single page frag cache for allocating small skb
heads, clawing back the 10-20% performance regression in UDP flood
test from previous fixes.
- Run packets which already went thru HW coalescing thru SW GRO. This
significantly improves TCP segment coalescing and simplifies
deployments as different workloads benefit from HW or SW GRO.
- Shrink the size of the base zero-copy send structure.
- Move TCP init under a new slow / sleepable version of DO_ONCE().
BPF:
- Add BPF-specific, any-context-safe memory allocator.
- Add helpers/kfuncs for PKCS#7 signature verification from BPF
programs.
- Define a new map type and related helpers for user space -> kernel
communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF).
- Allow targeting BPF iterators to loop through resources of one
task/thread.
- Add ability to call selected destructive functions. Expose
crash_kexec() to allow BPF to trigger a kernel dump. Use
CAP_SYS_BOOT check on the loading process to judge permissions.
- Enable BPF to collect custom hierarchical cgroup stats efficiently
by integrating with the rstat framework.
- Support struct arguments for trampoline based programs. Only
structs with size <= 16B and x86 are supported.
- Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping
sockets (instead of just TCP and UDP sockets).
- Add a helper for accessing CLOCK_TAI for time sensitive network
related programs.
- Support accessing network tunnel metadata's flags.
- Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open.
- Add support for writing to Netfilter's nf_conn:mark.
Protocols:
- WiFi: more Extremely High Throughput (EHT) and Multi-Link Operation
(MLO) work (802.11be, WiFi 7).
- vsock: improve support for SO_RCVLOWAT.
- SMC: support SO_REUSEPORT.
- Netlink: define and document how to use netlink in a "modern" way.
Support reporting missing attributes via extended ACK.
- IPSec: support collect metadata mode for xfrm interfaces.
- TCPv6: send consistent autoflowlabel in SYN_RECV state and RST
packets.
- TCP: introduce optional per-netns connection hash table to allow
better isolation between namespaces (opt-in, at the cost of memory
and cache pressure).
- MPTCP: support TCP_FASTOPEN_CONNECT.
- Add NEXT-C-SID support in Segment Routing (SRv6) End behavior.
- Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets.
- Open vSwitch:
- Allow specifying ifindex of new interfaces.
- Allow conntrack and metering in non-initial user namespace.
- TLS: support the Korean ARIA-GCM crypto algorithm.
- Remove DECnet support.
Driver API:
- Allow selecting the conduit interface used by each port in DSA
switches, at runtime.
- Ethernet Power Sourcing Equipment and Power Device support.
- Add tc-taprio support for queueMaxSDU parameter, i.e. setting per
traffic class max frame size for time-based packet schedules.
- Support PHY rate matching - adapting between differing host-side
and link-side speeds.
- Introduce QUSGMII PHY mode and 1000BASE-KX interface mode.
- Validate OF (device tree) nodes for DSA shared ports; make
phylink-related properties mandatory on DSA and CPU ports.
Enforcing more uniformity should allow transitioning to phylink.
- Require that flash component name used during update matches one of
the components for which version is reported by info_get().
- Remove "weight" argument from driver-facing NAPI API as much as
possible. It's one of those magic knobs which seemed like a good
idea at the time but is too indirect to use in practice.
- Support offload of TLS connections with 256 bit keys.
New hardware / drivers:
- Ethernet:
- Microchip KSZ9896 6-port Gigabit Ethernet Switch
- Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs
- Analog Devices ADIN1110 and ADIN2111 industrial single pair
Ethernet (10BASE-T1L) MAC+PHY.
- Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP).
- Ethernet SFPs / modules:
- RollBall / Hilink / Turris 10G copper SFPs
- HALNy GPON module
- WiFi:
- CYW43439 SDIO chipset (brcmfmac)
- CYW89459 PCIe chipset (brcmfmac)
- BCM4378 on Apple platforms (brcmfmac)
Drivers:
- CAN:
- gs_usb: HW timestamp support
- Ethernet PHYs:
- lan8814: cable diagnostics
- Ethernet NICs:
- Intel (100G):
- implement control of FCS/CRC stripping
- port splitting via devlink
- L2TPv3 filtering offload
- nVidia/Mellanox:
- tunnel offload for sub-functions
- MACSec offload, w/ Extended packet number and replay window
offload
- significantly restructure, and optimize the AF_XDP support,
align the behavior with other vendors
- Huawei:
- configuring DSCP map for traffic class selection
- querying standard FEC statistics
- querying SerDes lane number via ethtool
- Marvell/Cavium:
- egress priority flow control
- MACSec offload
- AMD/SolarFlare:
- PTP over IPv6 and raw Ethernet
- small / embedded:
- ax88772: convert to phylink (to support SFP cages)
- altera: tse: convert to phylink
- ftgmac100: support fixed link
- enetc: standard Ethtool counters
- macb: ZynqMP SGMII dynamic configuration support
- tsnep: support multi-queue and use page pool
- lan743x: Rx IP & TCP checksum offload
- igc: add xdp frags support to ndo_xdp_xmit
- Ethernet high-speed switches:
- Marvell (prestera):
- support SPAN port features (traffic mirroring)
- nexthop object offloading
- Microchip (sparx5):
- multicast forwarding offload
- QoS queuing offload (tc-mqprio, tc-tbf, tc-ets)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- support RGMII cmode
- NXP (felix):
- standardized ethtool counters
- Microchip (lan966x):
- QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets)
- traffic policing and mirroring
- link aggregation / bonding offload
- QUSGMII PHY mode support
- Qualcomm 802.11ax WiFi (ath11k):
- cold boot calibration support on WCN6750
- support to connect to a non-transmit MBSSID AP profile
- enable remain-on-channel support on WCN6750
- Wake-on-WLAN support for WCN6750
- support to provide transmit power from firmware via nl80211
- support to get power save duration for each client
- spectral scan support for 160 MHz
- MediaTek WiFi (mt76):
- WiFi-to-Ethernet bridging offload for MT7986 chips
- RealTek WiFi (rtw89):
- P2P support"
* tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1864 commits)
eth: pse: add missing static inlines
once: rename _SLOW to _SLEEPABLE
net: pse-pd: add regulator based PSE driver
dt-bindings: net: pse-dt: add bindings for regulator based PoDL PSE controller
ethtool: add interface to interact with Ethernet Power Equipment
net: mdiobus: search for PSE nodes by parsing PHY nodes.
net: mdiobus: fwnode_mdiobus_register_phy() rework error handling
net: add framework to support Ethernet PSE and PDs devices
dt-bindings: net: phy: add PoDL PSE property
net: marvell: prestera: Propagate nh state from hw to kernel
net: marvell: prestera: Add neighbour cache accounting
net: marvell: prestera: add stub handler neighbour events
net: marvell: prestera: Add heplers to interact with fib_notifier_info
net: marvell: prestera: Add length macros for prestera_ip_addr
net: marvell: prestera: add delayed wq and flush wq on deinit
net: marvell: prestera: Add strict cleanup of fib arbiter
net: marvell: prestera: Add cleanup of allocated fib_nodes
net: marvell: prestera: Add router nexthops ABI
eth: octeon: fix build after netif_napi_add() changes
net/mlx5: E-Switch, Return EBUSY if can't get mode lock
...
Diffstat (limited to 'net')
358 files changed, 10083 insertions, 16880 deletions
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c index 5aa8144101dc..0beb44f2fe1f 100644 --- a/net/8021q/vlan_core.c +++ b/net/8021q/vlan_core.c @@ -467,12 +467,9 @@ static struct sk_buff *vlan_gro_receive(struct list_head *head, off_vlan = skb_gro_offset(skb); hlen = off_vlan + sizeof(*vhdr); - vhdr = skb_gro_header_fast(skb, off_vlan); - if (skb_gro_header_hard(skb, hlen)) { - vhdr = skb_gro_header_slow(skb, hlen, off_vlan); - if (unlikely(!vhdr)) - goto out; - } + vhdr = skb_gro_header(skb, hlen, off_vlan); + if (unlikely(!vhdr)) + goto out; type = vhdr->h_vlan_encapsulated_proto; diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index 035812b0461c..e1bb41a443c4 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -674,9 +674,9 @@ static int vlan_ethtool_get_link_ksettings(struct net_device *dev, static void vlan_ethtool_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info) { - strlcpy(info->driver, vlan_fullname, sizeof(info->driver)); - strlcpy(info->version, vlan_version, sizeof(info->version)); - strlcpy(info->fw_version, "N/A", sizeof(info->fw_version)); + strscpy(info->driver, vlan_fullname, sizeof(info->driver)); + strscpy(info->version, vlan_version, sizeof(info->version)); + strscpy(info->fw_version, "N/A", sizeof(info->fw_version)); } static int vlan_ethtool_get_ts_info(struct net_device *dev, diff --git a/net/Kconfig b/net/Kconfig index 6b78f695caa6..48c33c222199 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -204,7 +204,6 @@ config BRIDGE_NETFILTER source "net/netfilter/Kconfig" source "net/ipv4/netfilter/Kconfig" source "net/ipv6/netfilter/Kconfig" -source "net/decnet/netfilter/Kconfig" source "net/bridge/netfilter/Kconfig" endif @@ -221,7 +220,6 @@ source "net/802/Kconfig" source "net/bridge/Kconfig" source "net/dsa/Kconfig" source "net/8021q/Kconfig" -source "net/decnet/Kconfig" source "net/llc/Kconfig" source "drivers/net/appletalk/Kconfig" source "net/x25/Kconfig" diff --git a/net/Kconfig.debug b/net/Kconfig.debug index e6ae11cc2fb7..5e3fffe707dd 100644 --- a/net/Kconfig.debug +++ b/net/Kconfig.debug @@ -2,7 +2,7 @@ config NET_DEV_REFCNT_TRACKER bool "Enable net device refcount tracking" - depends on DEBUG_KERNEL && STACKTRACE_SUPPORT + depends on DEBUG_KERNEL && STACKTRACE_SUPPORT && NET select REF_TRACKER default n help @@ -11,7 +11,7 @@ config NET_DEV_REFCNT_TRACKER config NET_NS_REFCNT_TRACKER bool "Enable networking namespace refcount tracking" - depends on DEBUG_KERNEL && STACKTRACE_SUPPORT + depends on DEBUG_KERNEL && STACKTRACE_SUPPORT && NET select REF_TRACKER default n help diff --git a/net/Makefile b/net/Makefile index fbfeb8a0bb37..6a62e5b27378 100644 --- a/net/Makefile +++ b/net/Makefile @@ -38,7 +38,6 @@ obj-$(CONFIG_AF_KCM) += kcm/ obj-$(CONFIG_STREAM_PARSER) += strparser/ obj-$(CONFIG_ATM) += atm/ obj-$(CONFIG_L2TP) += l2tp/ -obj-$(CONFIG_DECNET) += decnet/ obj-$(CONFIG_PHONET) += phonet/ ifneq ($(CONFIG_VLAN_8021Q),) obj-y += 8021q/ diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index d82a51e69386..6b4c25a92377 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -778,7 +778,7 @@ static int ax25_getsockopt(struct socket *sock, int level, int optname, ax25_dev = ax25->ax25_dev; if (ax25_dev != NULL && ax25_dev->dev != NULL) { - strlcpy(devname, ax25_dev->dev->name, sizeof(devname)); + strscpy(devname, ax25_dev->dev->name, sizeof(devname)); length = strlen(devname) + 1; } else { *devname = '\0'; diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c index b6db999abf75..f1741fbfb617 100644 --- a/net/batman-adv/bat_v_elp.c +++ b/net/batman-adv/bat_v_elp.c @@ -125,7 +125,6 @@ static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh) /* if not a wifi interface, check if this device provides data via * ethtool (e.g. an Ethernet adapter) */ - memset(&link_settings, 0, sizeof(link_settings)); rtnl_lock(); ret = __ethtool_get_link_ksettings(hard_iface->net_dev, &link_settings); rtnl_unlock(); diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h index 23f3d53f4b51..c48803b32bb0 100644 --- a/net/batman-adv/main.h +++ b/net/batman-adv/main.h @@ -13,7 +13,7 @@ #define BATADV_DRIVER_DEVICE "batman-adv" #ifndef BATADV_SOURCE_VERSION -#define BATADV_SOURCE_VERSION "2022.2" +#define BATADV_SOURCE_VERSION "2022.3" #endif /* B.A.T.M.A.N. parameters */ diff --git a/net/batman-adv/netlink.c b/net/batman-adv/netlink.c index 00875e1d8c44..a5e4a4e976cf 100644 --- a/net/batman-adv/netlink.c +++ b/net/batman-adv/netlink.c @@ -1493,6 +1493,7 @@ struct genl_family batadv_netlink_family __ro_after_init = { .module = THIS_MODULE, .small_ops = batadv_netlink_ops, .n_small_ops = ARRAY_SIZE(batadv_netlink_ops), + .resv_start_op = BATADV_CMD_SET_VLAN + 1, .mcgrps = batadv_netlink_mcgrps, .n_mcgrps = ARRAY_SIZE(batadv_netlink_mcgrps), }; diff --git a/net/batman-adv/trace.h b/net/batman-adv/trace.h index 31c8f922651d..5dd52bc5cabb 100644 --- a/net/batman-adv/trace.h +++ b/net/batman-adv/trace.h @@ -9,8 +9,6 @@ #include "main.h" -#include <linux/bug.h> -#include <linux/kernel.h> #include <linux/netdevice.h> #include <linux/percpu.h> #include <linux/printk.h> diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h index 2be5d4a712c5..758cd797a063 100644 --- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -1740,45 +1740,6 @@ struct batadv_priv { #endif }; -/** - * struct batadv_socket_client - layer2 icmp socket client data - */ -struct batadv_socket_client { - /** - * @queue_list: packet queue for packets destined for this socket client - */ - struct list_head queue_list; - - /** @queue_len: number of packets in the packet queue (queue_list) */ - unsigned int queue_len; - - /** @index: socket client's index in the batadv_socket_client_hash */ - unsigned char index; - - /** @lock: lock protecting queue_list, queue_len & index */ - spinlock_t lock; - - /** @queue_wait: socket client's wait queue */ - wait_queue_head_t queue_wait; - - /** @bat_priv: pointer to soft_iface this client belongs to */ - struct batadv_priv *bat_priv; -}; - -/** - * struct batadv_socket_packet - layer2 icmp packet for socket client - */ -struct batadv_socket_packet { - /** @list: list node for &batadv_socket_client.queue_list */ - struct list_head list; - - /** @icmp_len: size of the layer2 icmp packet */ - size_t icmp_len; - - /** @icmp_packet: layer2 icmp packet */ - u8 icmp_packet[BATADV_ICMP_MAX_PACKET_SIZE]; -}; - #ifdef CONFIG_BATMAN_ADV_BLA /** diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 9777e7b109ee..7a59c4487050 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -44,6 +44,11 @@ struct sco_param { u8 retrans_effort; }; +struct conn_handle_t { + struct hci_conn *conn; + __u16 handle; +}; + static const struct sco_param esco_param_cvsd[] = { { EDR_ESCO_MASK & ~ESCO_2EV3, 0x000a, 0x01 }, /* S3 */ { EDR_ESCO_MASK & ~ESCO_2EV3, 0x0007, 0x01 }, /* S2 */ @@ -316,17 +321,60 @@ static bool find_next_esco_param(struct hci_conn *conn, return conn->attempt <= size; } -static bool hci_enhanced_setup_sync_conn(struct hci_conn *conn, __u16 handle) +static int configure_datapath_sync(struct hci_dev *hdev, struct bt_codec *codec) { - struct hci_dev *hdev = conn->hdev; + int err; + __u8 vnd_len, *vnd_data = NULL; + struct hci_op_configure_data_path *cmd = NULL; + + err = hdev->get_codec_config_data(hdev, ESCO_LINK, codec, &vnd_len, + &vnd_data); + if (err < 0) + goto error; + + cmd = kzalloc(sizeof(*cmd) + vnd_len, GFP_KERNEL); + if (!cmd) { + err = -ENOMEM; + goto error; + } + + err = hdev->get_data_path_id(hdev, &cmd->data_path_id); + if (err < 0) + goto error; + + cmd->vnd_len = vnd_len; + memcpy(cmd->vnd_data, vnd_data, vnd_len); + + cmd->direction = 0x00; + __hci_cmd_sync_status(hdev, HCI_CONFIGURE_DATA_PATH, + sizeof(*cmd) + vnd_len, cmd, HCI_CMD_TIMEOUT); + + cmd->direction = 0x01; + err = __hci_cmd_sync_status(hdev, HCI_CONFIGURE_DATA_PATH, + sizeof(*cmd) + vnd_len, cmd, + HCI_CMD_TIMEOUT); +error: + + kfree(cmd); + kfree(vnd_data); + return err; +} + +static int hci_enhanced_setup_sync(struct hci_dev *hdev, void *data) +{ + struct conn_handle_t *conn_handle = data; + struct hci_conn *conn = conn_handle->conn; + __u16 handle = conn_handle->handle; struct hci_cp_enhanced_setup_sync_conn cp; const struct sco_param *param; + kfree(conn_handle); + bt_dev_dbg(hdev, "hcon %p", conn); /* for offload use case, codec needs to configured before opening SCO */ if (conn->codec.data_path) - hci_req_configure_datapath(hdev, &conn->codec); + configure_datapath_sync(hdev, &conn->codec); conn->state = BT_CONNECT; conn->out = true; @@ -344,7 +392,7 @@ static bool hci_enhanced_setup_sync_conn(struct hci_conn *conn, __u16 handle) case BT_CODEC_MSBC: if (!find_next_esco_param(conn, esco_param_msbc, ARRAY_SIZE(esco_param_msbc))) - return false; + return -EINVAL; param = &esco_param_msbc[conn->attempt - 1]; cp.tx_coding_format.id = 0x05; @@ -396,11 +444,11 @@ static bool hci_enhanced_setup_sync_conn(struct hci_conn *conn, __u16 handle) if (lmp_esco_capable(conn->link)) { if (!find_next_esco_param(conn, esco_param_cvsd, ARRAY_SIZE(esco_param_cvsd))) - return false; + return -EINVAL; param = &esco_param_cvsd[conn->attempt - 1]; } else { if (conn->attempt > ARRAY_SIZE(sco_param_cvsd)) - return false; + return -EINVAL; param = &sco_param_cvsd[conn->attempt - 1]; } cp.tx_coding_format.id = 2; @@ -423,7 +471,7 @@ static bool hci_enhanced_setup_sync_conn(struct hci_conn *conn, __u16 handle) cp.out_transport_unit_size = 16; break; default: - return false; + return -EINVAL; } cp.retrans_effort = param->retrans_effort; @@ -431,9 +479,9 @@ static bool hci_enhanced_setup_sync_conn(struct hci_conn *conn, __u16 handle) cp.max_latency = __cpu_to_le16(param->max_latency); if (hci_send_cmd(hdev, HCI_OP_ENHANCED_SETUP_SYNC_CONN, sizeof(cp), &cp) < 0) - return false; + return -EIO; - return true; + return 0; } static bool hci_setup_sync_conn(struct hci_conn *conn, __u16 handle) @@ -490,8 +538,24 @@ static bool hci_setup_sync_conn(struct hci_conn *conn, __u16 handle) bool hci_setup_sync(struct hci_conn *conn, __u16 handle) { - if (enhanced_sync_conn_capable(conn->hdev)) - return hci_enhanced_setup_sync_conn(conn, handle); + int result; + struct conn_handle_t *conn_handle; + + if (enhanced_sync_conn_capable(conn->hdev)) { + conn_handle = kzalloc(sizeof(*conn_handle), GFP_KERNEL); + + if (!conn_handle) + return false; + + conn_handle->conn = conn; + conn_handle->handle = handle; + result = hci_cmd_sync_queue(conn->hdev, hci_enhanced_setup_sync, + conn_handle, NULL); + if (result < 0) + kfree(conn_handle); + + return result == 0; + } return hci_setup_sync_conn(conn, handle); } @@ -2696,3 +2760,79 @@ u32 hci_conn_get_phy(struct hci_conn *conn) return phys; } + +int hci_abort_conn(struct hci_conn *conn, u8 reason) +{ + int r = 0; + + switch (conn->state) { + case BT_CONNECTED: + case BT_CONFIG: + if (conn->type == AMP_LINK) { + struct hci_cp_disconn_phy_link cp; + + cp.phy_handle = HCI_PHY_HANDLE(conn->handle); + cp.reason = reason; + r = hci_send_cmd(conn->hdev, HCI_OP_DISCONN_PHY_LINK, + sizeof(cp), &cp); + } else { + struct hci_cp_disconnect dc; + + dc.handle = cpu_to_le16(conn->handle); + dc.reason = reason; + r = hci_send_cmd(conn->hdev, HCI_OP_DISCONNECT, + sizeof(dc), &dc); + } + + conn->state = BT_DISCONN; + + break; + case BT_CONNECT: + if (conn->type == LE_LINK) { + if (test_bit(HCI_CONN_SCANNING, &conn->flags)) + break; + r = hci_send_cmd(conn->hdev, + HCI_OP_LE_CREATE_CONN_CANCEL, 0, NULL); + } else if (conn->type == ACL_LINK) { + if (conn->hdev->hci_ver < BLUETOOTH_VER_1_2) + break; + r = hci_send_cmd(conn->hdev, + HCI_OP_CREATE_CONN_CANCEL, + 6, &conn->dst); + } + break; + case BT_CONNECT2: + if (conn->type == ACL_LINK) { + struct hci_cp_reject_conn_req rej; + + bacpy(&rej.bdaddr, &conn->dst); + rej.reason = reason; + + r = hci_send_cmd(conn->hdev, + HCI_OP_REJECT_CONN_REQ, + sizeof(rej), &rej); + } else if (conn->type == SCO_LINK || conn->type == ESCO_LINK) { + struct hci_cp_reject_sync_conn_req rej; + + bacpy(&rej.bdaddr, &conn->dst); + + /* SCO rejection has its own limited set of + * allowed error values (0x0D-0x0F) which isn't + * compatible with most values passed to this + * function. To be safe hard-code one of the + * values that's suitable for SCO. + */ + rej.reason = HCI_ERROR_REJ_LIMITED_RESOURCES; + + r = hci_send_cmd(conn->hdev, + HCI_OP_REJECT_SYNC_CONN_REQ, + sizeof(rej), &rej); + } + break; + default: + conn->state = BT_CLOSED; + break; + } + + return r; +} diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index b3a5a3cc9372..0540555b3704 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -597,6 +597,15 @@ static int hci_dev_do_reset(struct hci_dev *hdev) /* Cancel these to avoid queueing non-chained pending work */ hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE); + /* Wait for + * + * if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE)) + * queue_delayed_work(&hdev->{cmd,ncmd}_timer) + * + * inside RCU section to see the flag or complete scheduling. + */ + synchronize_rcu(); + /* Explicitly cancel works in case scheduled after setting the flag. */ cancel_delayed_work(&hdev->cmd_timer); cancel_delayed_work(&hdev->ncmd_timer); @@ -714,7 +723,7 @@ static void hci_update_passive_scan_state(struct hci_dev *hdev, u8 scan) hci_dev_set_flag(hdev, HCI_BREDR_ENABLED); if (hci_dev_test_flag(hdev, HCI_LE_ENABLED)) - hci_req_update_adv_data(hdev, hdev->cur_adv_instance); + hci_update_adv_data(hdev, hdev->cur_adv_instance); mgmt_new_settings(hdev); } @@ -1706,7 +1715,8 @@ struct adv_info *hci_add_adv_instance(struct hci_dev *hdev, u8 instance, u32 flags, u16 adv_data_len, u8 *adv_data, u16 scan_rsp_len, u8 *scan_rsp_data, u16 timeout, u16 duration, s8 tx_power, - u32 min_interval, u32 max_interval) + u32 min_interval, u32 max_interval, + u8 mesh_handle) { struct adv_info *adv; @@ -1717,7 +1727,7 @@ struct adv_info *hci_add_adv_instance(struct hci_dev *hdev, u8 instance, memset(adv->per_adv_data, 0, sizeof(adv->per_adv_data)); } else { if (hdev->adv_instance_cnt >= hdev->le_num_of_adv_sets || - instance < 1 || instance > hdev->le_num_of_adv_sets) + instance < 1 || instance > hdev->le_num_of_adv_sets + 1) return ERR_PTR(-EOVERFLOW); adv = kzalloc(sizeof(*adv), GFP_KERNEL); @@ -1734,6 +1744,11 @@ struct adv_info *hci_add_adv_instance(struct hci_dev *hdev, u8 instance, adv->min_interval = min_interval; adv->max_interval = max_interval; adv->tx_power = tx_power; + /* Defining a mesh_handle changes the timing units to ms, + * rather than seconds, and ties the instance to the requested + * mesh_tx queue. + */ + adv->mesh = mesh_handle; hci_set_adv_instance_data(hdev, instance, adv_data_len, adv_data, scan_rsp_len, scan_rsp_data); @@ -1762,7 +1777,7 @@ struct adv_info *hci_add_per_instance(struct hci_dev *hdev, u8 instance, adv = hci_add_adv_instance(hdev, instance, flags, 0, NULL, 0, NULL, 0, 0, HCI_ADV_TX_POWER_NO_PREFERENCE, - min_interval, max_interval); + min_interval, max_interval, 0); if (IS_ERR(adv)) return adv; @@ -2391,6 +2406,10 @@ static int hci_suspend_notifier(struct notifier_block *nb, unsigned long action, container_of(nb, struct hci_dev, suspend_notifier); int ret = 0; + /* Userspace has full control of this device. Do nothing. */ + if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) + return NOTIFY_DONE; + if (action == PM_SUSPEND_PREPARE) ret = hci_suspend_dev(hdev); else if (action == PM_POST_SUSPEND) @@ -2486,6 +2505,7 @@ struct hci_dev *hci_alloc_dev_priv(int sizeof_priv) mutex_init(&hdev->lock); mutex_init(&hdev->req_lock); + INIT_LIST_HEAD(&hdev->mesh_pending); INIT_LIST_HEAD(&hdev->mgmt_pending); INIT_LIST_HEAD(&hdev->reject_list); INIT_LIST_HEAD(&hdev->accept_list); @@ -3469,15 +3489,27 @@ static inline int __get_blocks(struct hci_dev *hdev, struct sk_buff *skb) return DIV_ROUND_UP(skb->len - HCI_ACL_HDR_SIZE, hdev->block_len); } -static void __check_timeout(struct hci_dev *hdev, unsigned int cnt) +static void __check_timeout(struct hci_dev *hdev, unsigned int cnt, u8 type) { - if (!hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) { - /* ACL tx timeout must be longer than maximum - * link supervision timeout (40.9 seconds) */ - if (!cnt && time_after(jiffies, hdev->acl_last_tx + - HCI_ACL_TX_TIMEOUT)) - hci_link_tx_to(hdev, ACL_LINK); + unsigned long last_tx; + + if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) + return; + + switch (type) { + case LE_LINK: + last_tx = hdev->le_last_tx; + break; + default: + last_tx = hdev->acl_last_tx; + break; } + + /* tx timeout must be longer than maximum link supervision timeout + * (40.9 seconds) + */ + if (!cnt && time_after(jiffies, last_tx + HCI_ACL_TX_TIMEOUT)) + hci_link_tx_to(hdev, type); } /* Schedule SCO */ @@ -3535,7 +3567,7 @@ static void hci_sched_acl_pkt(struct hci_dev *hdev) struct sk_buff *skb; int quote; - __check_timeout(hdev, cnt); + __check_timeout(hdev, cnt, ACL_LINK); while (hdev->acl_cnt && (chan = hci_chan_sent(hdev, ACL_LINK, "e))) { @@ -3578,8 +3610,6 @@ static void hci_sched_acl_blk(struct hci_dev *hdev) int quote; u8 type; - __check_timeout(hdev, cnt); - BT_DBG("%s", hdev->name); if (hdev->dev_type == HCI_AMP) @@ -3587,6 +3617,8 @@ static void hci_sched_acl_blk(struct hci_dev *hdev) else type = ACL_LINK; + __check_timeout(hdev, cnt, type); + while (hdev->block_cnt > 0 && (chan = hci_chan_sent(hdev, type, "e))) { u32 priority = (skb_peek(&chan->data_q))->priority; @@ -3660,7 +3692,7 @@ static void hci_sched_le(struct hci_dev *hdev) cnt = hdev->le_pkts ? hdev->le_cnt : hdev->acl_cnt; - __check_timeout(hdev, cnt); + __check_timeout(hdev, cnt, LE_LINK); tmp = cnt; while (cnt && (chan = hci_chan_sent(hdev, LE_LINK, "e))) { @@ -4056,12 +4088,14 @@ static void hci_cmd_work(struct work_struct *work) if (res < 0) __hci_cmd_sync_cancel(hdev, -res); + rcu_read_lock(); if (test_bit(HCI_RESET, &hdev->flags) || hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE)) cancel_delayed_work(&hdev->cmd_timer); else - schedule_delayed_work(&hdev->cmd_timer, - HCI_CMD_TIMEOUT); + queue_delayed_work(hdev->workqueue, &hdev->cmd_timer, + HCI_CMD_TIMEOUT); + rcu_read_unlock(); } else { skb_queue_head(&hdev->cmd_q, skb); queue_work(hdev->workqueue, &hdev->cmd_work); diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c index 902b40a90b91..3f401ec5bb0c 100644 --- a/net/bluetooth/hci_debugfs.c +++ b/net/bluetooth/hci_debugfs.c @@ -1245,7 +1245,7 @@ void hci_debugfs_create_conn(struct hci_conn *conn) struct hci_dev *hdev = conn->hdev; char name[6]; - if (IS_ERR_OR_NULL(hdev->debugfs)) + if (IS_ERR_OR_NULL(hdev->debugfs) || conn->debugfs) return; snprintf(name, sizeof(name), "%u", conn->handle); diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 6643c9c20fa4..faca701bce2a 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -712,6 +712,47 @@ static u8 hci_cc_read_local_version(struct hci_dev *hdev, void *data, return rp->status; } +static u8 hci_cc_read_enc_key_size(struct hci_dev *hdev, void *data, + struct sk_buff *skb) +{ + struct hci_rp_read_enc_key_size *rp = data; + struct hci_conn *conn; + u16 handle; + u8 status = rp->status; + + bt_dev_dbg(hdev, "status 0x%2.2x", status); + + handle = le16_to_cpu(rp->handle); + + hci_dev_lock(hdev); + + conn = hci_conn_hash_lookup_handle(hdev, handle); + if (!conn) { + status = 0xFF; + goto done; + } + + /* While unexpected, the read_enc_key_size command may fail. The most + * secure approach is to then assume the key size is 0 to force a + * disconnection. + */ + if (status) { + bt_dev_err(hdev, "failed to read key size for handle %u", + handle); + conn->enc_key_size = 0; + } else { + conn->enc_key_size = rp->key_size; + status = 0; + } + + hci_encrypt_cfm(conn, 0); + +done: + hci_dev_unlock(hdev); + + return status; +} + static u8 hci_cc_read_local_commands(struct hci_dev *hdev, void *data, struct sk_buff *skb) { @@ -1715,6 +1756,8 @@ static void le_set_scan_enable_complete(struct hci_dev *hdev, u8 enable) hci_dev_set_flag(hdev, HCI_LE_SCAN); if (hdev->le_scan_type == LE_SCAN_ACTIVE) clear_pending_adv_report(hdev); + if (hci_dev_test_flag(hdev, HCI_MESH)) + hci_discovery_set_state(hdev, DISCOVERY_FINDING); break; case LE_SCAN_DISABLE: @@ -1729,7 +1772,7 @@ static void le_set_scan_enable_complete(struct hci_dev *hdev, u8 enable) d->last_adv_addr_type, NULL, d->last_adv_rssi, d->last_adv_flags, d->last_adv_data, - d->last_adv_data_len, NULL, 0); + d->last_adv_data_len, NULL, 0, 0); } /* Cancel this timer so that we don't try to disable scanning @@ -1745,6 +1788,9 @@ static void le_set_scan_enable_complete(struct hci_dev *hdev, u8 enable) */ if (hci_dev_test_and_clear_flag(hdev, HCI_LE_SCAN_INTERRUPTED)) hci_discovery_set_state(hdev, DISCOVERY_STOPPED); + else if (!hci_dev_test_flag(hdev, HCI_LE_ADV) && + hdev->discovery.state == DISCOVERY_FINDING) + queue_work(hdev->workqueue, &hdev->reenable_adv_work); break; @@ -2152,7 +2198,7 @@ static u8 hci_cc_set_ext_adv_param(struct hci_dev *hdev, void *data, adv_instance->tx_power = rp->tx_power; } /* Update adv data as tx power is known now */ - hci_req_update_adv_data(hdev, cp->handle); + hci_update_adv_data(hdev, cp->handle); hci_dev_unlock(hdev); @@ -3071,7 +3117,7 @@ static void hci_inquiry_result_evt(struct hci_dev *hdev, void *edata, mgmt_device_found(hdev, &info->bdaddr, ACL_LINK, 0x00, info->dev_class, HCI_RSSI_INVALID, - flags, NULL, 0, NULL, 0); + flags, NULL, 0, NULL, 0, 0); } hci_dev_unlock(hdev); @@ -3534,47 +3580,6 @@ unlock: hci_dev_unlock(hdev); } -static void read_enc_key_size_complete(struct hci_dev *hdev, u8 status, - u16 opcode, struct sk_buff *skb) -{ - const struct hci_rp_read_enc_key_size *rp; - struct hci_conn *conn; - u16 handle; - - BT_DBG("%s status 0x%02x", hdev->name, status); - - if (!skb || skb->len < sizeof(*rp)) { - bt_dev_err(hdev, "invalid read key size response"); - return; - } - - rp = (void *)skb->data; - handle = le16_to_cpu(rp->handle); - - hci_dev_lock(hdev); - - conn = hci_conn_hash_lookup_handle(hdev, handle); - if (!conn) - goto unlock; - - /* While unexpected, the read_enc_key_size command may fail. The most - * secure approach is to then assume the key size is 0 to force a - * disconnection. - */ - if (rp->status) { - bt_dev_err(hdev, "failed to read key size for handle %u", - handle); - conn->enc_key_size = 0; - } else { - conn->enc_key_size = rp->key_size; - } - - hci_encrypt_cfm(conn, 0); - -unlock: - hci_dev_unlock(hdev); -} - static void hci_encrypt_change_evt(struct hci_dev *hdev, void *data, struct sk_buff *skb) { @@ -3639,7 +3644,6 @@ static void hci_encrypt_change_evt(struct hci_dev *hdev, void *data, /* Try reading the encryption key size for encrypted ACL links */ if (!ev->status && ev->encrypt && conn->type == ACL_LINK) { struct hci_cp_read_enc_key_size cp; - struct hci_request req; /* Only send HCI_Read_Encryption_Key_Size if the * controller really supports it. If it doesn't, assume @@ -3650,12 +3654,9 @@ static void hci_encrypt_change_evt(struct hci_dev *hdev, void *data, goto notify; } - hci_req_init(&req, hdev); - cp.handle = cpu_to_le16(conn->handle); - hci_req_add(&req, HCI_OP_READ_ENC_KEY_SIZE, sizeof(cp), &cp); - - if (hci_req_run_skb(&req, read_enc_key_size_complete)) { + if (hci_send_cmd(hdev, HCI_OP_READ_ENC_KEY_SIZE, + sizeof(cp), &cp)) { bt_dev_err(hdev, "sending read key size failed"); conn->enc_key_size = HCI_LINK_KEY_SIZE; goto notify; @@ -3766,16 +3767,18 @@ static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd) { cancel_delayed_work(&hdev->cmd_timer); + rcu_read_lock(); if (!test_bit(HCI_RESET, &hdev->flags)) { if (ncmd) { cancel_delayed_work(&hdev->ncmd_timer); atomic_set(&hdev->cmd_cnt, 1); } else { if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE)) - schedule_delayed_work(&hdev->ncmd_timer, - HCI_NCMD_TIMEOUT); + queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer, + HCI_NCMD_TIMEOUT); } } + rcu_read_unlock(); } static u8 hci_cc_le_read_buffer_size_v2(struct hci_dev *hdev, void *data, @@ -4037,6 +4040,8 @@ static const struct hci_cc { sizeof(struct hci_rp_read_local_amp_info)), HCI_CC(HCI_OP_READ_CLOCK, hci_cc_read_clock, sizeof(struct hci_rp_read_clock)), + HCI_CC(HCI_OP_READ_ENC_KEY_SIZE, hci_cc_read_enc_key_size, + sizeof(struct hci_rp_read_enc_key_size)), HCI_CC(HCI_OP_READ_INQ_RSP_TX_POWER, hci_cc_read_inq_rsp_tx_power, sizeof(struct hci_rp_read_inq_rsp_tx_power)), HCI_CC(HCI_OP_READ_DEF_ERR_DATA_REPORTING, @@ -4829,7 +4834,7 @@ static void hci_inquiry_result_with_rssi_evt(struct hci_dev *hdev, void *edata, mgmt_device_found(hdev, &info->bdaddr, ACL_LINK, 0x00, info->dev_class, info->rssi, - flags, NULL, 0, NULL, 0); + flags, NULL, 0, NULL, 0, 0); } } else if (skb->len == array_size(ev->num, sizeof(struct inquiry_info_rssi))) { @@ -4860,7 +4865,7 @@ static void hci_inquiry_result_with_rssi_evt(struct hci_dev *hdev, void *edata, mgmt_device_found(hdev, &info->bdaddr, ACL_LINK, 0x00, info->dev_class, info->rssi, - flags, NULL, 0, NULL, 0); + flags, NULL, 0, NULL, 0, 0); } } else { bt_dev_err(hdev, "Malformed HCI Event: 0x%2.2x", @@ -5116,7 +5121,7 @@ static void hci_extended_inquiry_result_evt(struct hci_dev *hdev, void *edata, mgmt_device_found(hdev, &info->bdaddr, ACL_LINK, 0x00, info->dev_class, info->rssi, - flags, info->data, eir_len, NULL, 0); + flags, info->data, eir_len, NULL, 0, 0); } hci_dev_unlock(hdev); @@ -6172,7 +6177,7 @@ static struct hci_conn *check_pending_le_conn(struct hci_dev *hdev, static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, u8 bdaddr_type, bdaddr_t *direct_addr, u8 direct_addr_type, s8 rssi, u8 *data, u8 len, - bool ext_adv) + bool ext_adv, bool ctl_time, u64 instant) { struct discovery_state *d = &hdev->discovery; struct smp_irk *irk; @@ -6220,7 +6225,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, * important to see if the address is matching the local * controller address. */ - if (direct_addr) { + if (!hci_dev_test_flag(hdev, HCI_MESH) && direct_addr) { direct_addr_type = ev_bdaddr_type(hdev, direct_addr_type, &bdaddr_resolved); @@ -6268,6 +6273,18 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, conn->le_adv_data_len = len; } + if (type == LE_ADV_NONCONN_IND || type == LE_ADV_SCAN_IND) + flags = MGMT_DEV_FOUND_NOT_CONNECTABLE; + else + flags = 0; + + /* All scan results should be sent up for Mesh systems */ + if (hci_dev_test_flag(hdev, HCI_MESH)) { + mgmt_device_found(hdev, bdaddr, LE_LINK, bdaddr_type, NULL, + rssi, flags, data, len, NULL, 0, instant); + return; + } + /* Passive scanning shouldn't trigger any device found events, * except for devices marked as CONN_REPORT for which we do send * device found events, or advertisement monitoring requested. @@ -6281,12 +6298,8 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, idr_is_empty(&hdev->adv_monitors_idr)) return; - if (type == LE_ADV_NONCONN_IND || type == LE_ADV_SCAN_IND) - flags = MGMT_DEV_FOUND_NOT_CONNECTABLE; - else - flags = 0; mgmt_device_found(hdev, bdaddr, LE_LINK, bdaddr_type, NULL, - rssi, flags, data, len, NULL, 0); + rssi, flags, data, len, NULL, 0, 0); return; } @@ -6305,11 +6318,8 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, * and just sends a scan response event, then it is marked as * not connectable as well. */ - if (type == LE_ADV_NONCONN_IND || type == LE_ADV_SCAN_IND || - type == LE_ADV_SCAN_RSP) + if (type == LE_ADV_SCAN_RSP) flags = MGMT_DEV_FOUND_NOT_CONNECTABLE; - else - flags = 0; /* If there's nothing pending either store the data from this * event or send an immediate device found event if the data @@ -6326,7 +6336,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, } mgmt_device_found(hdev, bdaddr, LE_LINK, bdaddr_type, NULL, - rssi, flags, data, len, NULL, 0); + rssi, flags, data, len, NULL, 0, 0); return; } @@ -6345,7 +6355,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, d->last_adv_addr_type, NULL, d->last_adv_rssi, d->last_adv_flags, d->last_adv_data, - d->last_adv_data_len, NULL, 0); + d->last_adv_data_len, NULL, 0, 0); /* If the new report will trigger a SCAN_REQ store it for * later merging. @@ -6362,7 +6372,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, */ clear_pending_adv_report(hdev); mgmt_device_found(hdev, bdaddr, LE_LINK, bdaddr_type, NULL, - rssi, flags, data, len, NULL, 0); + rssi, flags, data, len, NULL, 0, 0); return; } @@ -6372,7 +6382,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, */ mgmt_device_found(hdev, &d->last_adv_addr, LE_LINK, d->last_adv_addr_type, NULL, rssi, d->last_adv_flags, - d->last_adv_data, d->last_adv_data_len, data, len); + d->last_adv_data, d->last_adv_data_len, data, len, 0); clear_pending_adv_report(hdev); } @@ -6380,6 +6390,7 @@ static void hci_le_adv_report_evt(struct hci_dev *hdev, void *data, struct sk_buff *skb) { struct hci_ev_le_advertising_report *ev = data; + u64 instant = jiffies; if (!ev->num) return; @@ -6404,7 +6415,8 @@ static void hci_le_adv_report_evt(struct hci_dev *hdev, void *data, rssi = info->data[info->length]; process_adv_report(hdev, info->type, &info->bdaddr, info->bdaddr_type, NULL, 0, rssi, - info->data, info->length, false); + info->data, info->length, false, + false, instant); } else { bt_dev_err(hdev, "Dropping invalid advertising data"); } @@ -6461,6 +6473,7 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, void *data, struct sk_buff *skb) { struct hci_ev_le_ext_adv_report *ev = data; + u64 instant = jiffies; if (!ev->num) return; @@ -6487,7 +6500,8 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, void *data, process_adv_report(hdev, legacy_evt_type, &info->bdaddr, info->bdaddr_type, NULL, 0, info->rssi, info->data, info->length, - !(evt_type & LE_EXT_ADV_LEGACY_PDU)); + !(evt_type & LE_EXT_ADV_LEGACY_PDU), + false, instant); } } @@ -6710,6 +6724,7 @@ static void hci_le_direct_adv_report_evt(struct hci_dev *hdev, void *data, struct sk_buff *skb) { struct hci_ev_le_direct_adv_report *ev = data; + u64 instant = jiffies; int i; if (!hci_le_ev_skb_pull(hdev, skb, HCI_EV_LE_DIRECT_ADV_REPORT, @@ -6727,7 +6742,7 @@ static void hci_le_direct_adv_report_evt(struct hci_dev *hdev, void *data, process_adv_report(hdev, info->type, &info->bdaddr, info->bdaddr_type, &info->direct_addr, info->direct_addr_type, info->rssi, NULL, 0, - false); + false, false, instant); } hci_dev_unlock(hdev); @@ -6776,6 +6791,13 @@ static void hci_le_cis_estabilished_evt(struct hci_dev *hdev, void *data, goto unlock; } + if (conn->type != ISO_LINK) { + bt_dev_err(hdev, + "Invalid connection link type handle 0x%4.4x", + handle); + goto unlock; + } + if (conn->role == HCI_ROLE_SLAVE) { __le32 interval; @@ -6896,6 +6918,13 @@ static void hci_le_create_big_complete_evt(struct hci_dev *hdev, void *data, if (!conn) goto unlock; + if (conn->type != ISO_LINK) { + bt_dev_err(hdev, + "Invalid connection link type handle 0x%2.2x", + ev->handle); + goto unlock; + } + if (ev->num_bis) conn->handle = __le16_to_cpu(ev->bis_handle[0]); diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c index e64d558e5d69..5a0296a4352e 100644 --- a/net/bluetooth/hci_request.c +++ b/net/bluetooth/hci_request.c @@ -269,42 +269,10 @@ void hci_req_add_ev(struct hci_request *req, u16 opcode, u32 plen, void hci_req_add(struct hci_request *req, u16 opcode, u32 plen, const void *param) { + bt_dev_err(req->hdev, "HCI_REQ-0x%4.4x", opcode); hci_req_add_ev(req, opcode, plen, param, 0); } -void __hci_req_write_fast_connectable(struct hci_request *req, bool enable) -{ - struct hci_dev *hdev = req->hdev; - struct hci_cp_write_page_scan_activity acp; - u8 type; - - if (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED)) - return; - - if (hdev->hci_ver < BLUETOOTH_VER_1_2) - return; - - if (enable) { - type = PAGE_SCAN_TYPE_INTERLACED; - - /* 160 msec page scan interval */ - acp.interval = cpu_to_le16(0x0100); - } else { - type = hdev->def_page_scan_type; - acp.interval = cpu_to_le16(hdev->def_page_scan_int); - } - - acp.window = cpu_to_le16(hdev->def_page_scan_window); - - if (__cpu_to_le16(hdev->page_scan_interval) != acp.interval || - __cpu_to_le16(hdev->page_scan_window) != acp.window) - hci_req_add(req, HCI_OP_WRITE_PAGE_SCAN_ACTIVITY, - sizeof(acp), &acp); - - if (hdev->page_scan_type != type) - hci_req_add(req, HCI_OP_WRITE_PAGE_SCAN_TYPE, 1, &type); -} - static void start_interleave_scan(struct hci_dev *hdev) { hdev->interleave_scan_state = INTERLEAVE_SCAN_NO_FILTER; @@ -357,45 +325,6 @@ static bool __hci_update_interleaved_scan(struct hci_dev *hdev) return false; } -void __hci_req_update_name(struct hci_request *req) -{ - struct hci_dev *hdev = req->hdev; - struct hci_cp_write_local_name cp; - - memcpy(cp.name, hdev->dev_name, sizeof(cp.name)); - - hci_req_add(req, HCI_OP_WRITE_LOCAL_NAME, sizeof(cp), &cp); -} - -void __hci_req_update_eir(struct hci_request *req) -{ - struct hci_dev *hdev = req->hdev; - struct hci_cp_write_eir cp; - - if (!hdev_is_powered(hdev)) - return; - - if (!lmp_ext_inq_capable(hdev)) - return; - - if (!hci_dev_test_flag(hdev, HCI_SSP_ENABLED)) - return; - - if (hci_dev_test_flag(hdev, HCI_SERVICE_CACHE)) - return; - - memset(&cp, 0, sizeof(cp)); - - eir_create(hdev, cp.data); - - if (memcmp(cp.data, hdev->eir, sizeof(cp.data)) == 0) - return; - - memcpy(hdev->eir, cp.data, sizeof(cp.data)); - - hci_req_add(req, HCI_OP_WRITE_EIR, sizeof(cp), &cp); -} - void hci_req_add_le_scan_disable(struct hci_request *req, bool rpa_le_conn) { struct hci_dev *hdev = req->hdev; @@ -721,6 +650,96 @@ static inline bool hci_is_le_conn_scanning(struct hci_dev *hdev) return false; } +static void set_random_addr(struct hci_request *req, bdaddr_t *rpa); +static int hci_update_random_address(struct hci_request *req, + bool require_privacy, bool use_rpa, + u8 *own_addr_type) +{ + struct hci_dev *hdev = req->hdev; + int err; + + /* If privacy is enabled use a resolvable private address. If + * current RPA has expired or there is something else than + * the current RPA in use, then generate a new one. + */ + if (use_rpa) { + /* If Controller supports LL Privacy use own address type is + * 0x03 + */ + if (use_ll_privacy(hdev)) + *own_addr_type = ADDR_LE_DEV_RANDOM_RESOLVED; + else + *own_addr_type = ADDR_LE_DEV_RANDOM; + + if (rpa_valid(hdev)) + return 0; + + err = smp_generate_rpa(hdev, hdev->irk, &hdev->rpa); + if (err < 0) { + bt_dev_err(hdev, "failed to generate new RPA"); + return err; + } + + set_random_addr(req, &hdev->rpa); + + return 0; + } + + /* In case of required privacy without resolvable private address, + * use an non-resolvable private address. This is useful for active + * scanning and non-connectable advertising. + */ + if (require_privacy) { + bdaddr_t nrpa; + + while (true) { + /* The non-resolvable private address is generated + * from random six bytes with the two most significant + * bits cleared. + */ + get_random_bytes(&nrpa, 6); + nrpa.b[5] &= 0x3f; + + /* The non-resolvable private address shall not be + * equal to the public address. + */ + if (bacmp(&hdev->bdaddr, &nrpa)) + break; + } + + *own_addr_type = ADDR_LE_DEV_RANDOM; + set_random_addr(req, &nrpa); + return 0; + } + + /* If forcing static address is in use or there is no public + * address use the static address as random address (but skip + * the HCI command if the current random address is already the + * static one. + * + * In case BR/EDR has been disabled on a dual-mode controller + * and a static address has been configured, then use that + * address instead of the public BR/EDR address. + */ + if (hci_dev_test_flag(hdev, HCI_FORCE_STATIC_ADDR) || + !bacmp(&hdev->bdaddr, BDADDR_ANY) || + (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED) && + bacmp(&hdev->static_addr, BDADDR_ANY))) { + *own_addr_type = ADDR_LE_DEV_RANDOM; + if (bacmp(&hdev->static_addr, &hdev->random_addr)) + hci_req_add(req, HCI_OP_LE_SET_RANDOM_ADDR, 6, + &hdev->static_addr); + return 0; + } + + /* Neither privacy nor static address is being used so use a + * public address. + */ + *own_addr_type = ADDR_LE_DEV_PUBLIC; + + return 0; +} + /* Ensure to call hci_req_add_le_scan_disable() first to disable the * controller based address resolution to be able to reconfigure * resolving list. @@ -810,366 +829,6 @@ void hci_req_add_le_passive_scan(struct hci_request *req) addr_resolv); } -static void cancel_adv_timeout(struct hci_dev *hdev) -{ - if (hdev->adv_instance_timeout) { - hdev->adv_instance_timeout = 0; - cancel_delayed_work(&hdev->adv_instance_expire); - } -} - -static bool adv_cur_instance_is_scannable(struct hci_dev *hdev) -{ - return hci_adv_instance_is_scannable(hdev, hdev->cur_adv_instance); -} - -void __hci_req_disable_advertising(struct hci_request *req) -{ - if (ext_adv_capable(req->hdev)) { - __hci_req_disable_ext_adv_instance(req, 0x00); - } else { - u8 enable = 0x00; - - hci_req_add(req, HCI_OP_LE_SET_ADV_ENABLE, sizeof(enable), &enable); - } -} - -static bool adv_use_rpa(struct hci_dev *hdev, uint32_t flags) -{ - /* If privacy is not enabled don't use RPA */ - if (!hci_dev_test_flag(hdev, HCI_PRIVACY)) - return false; - - /* If basic privacy mode is enabled use RPA */ - if (!hci_dev_test_flag(hdev, HCI_LIMITED_PRIVACY)) - return true; - - /* If limited privacy mode is enabled don't use RPA if we're - * both discoverable and bondable. - */ - if ((flags & MGMT_ADV_FLAG_DISCOV) && - hci_dev_test_flag(hdev, HCI_BONDABLE)) - return false; - - /* We're neither bondable nor discoverable in the limited - * privacy mode, therefore use RPA. - */ - return true; -} - -static bool is_advertising_allowed(struct hci_dev *hdev, bool connectable) -{ - /* If there is no connection we are OK to advertise. */ - if (hci_conn_num(hdev, LE_LINK) == 0) - return true; - - /* Check le_states if there is any connection in peripheral role. */ - if (hdev->conn_hash.le_num_peripheral > 0) { - /* Peripheral connection state and non connectable mode bit 20. - */ - if (!connectable && !(hdev->le_states[2] & 0x10)) - return false; - - /* Peripheral connection state and connectable mode bit 38 - * and scannable bit 21. - */ - if (connectable && (!(hdev->le_states[4] & 0x40) || - !(hdev->le_states[2] & 0x20))) - return false; - } - - /* Check le_states if there is any connection in central role. */ - if (hci_conn_num(hdev, LE_LINK) != hdev->conn_hash.le_num_peripheral) { - /* Central connection state and non connectable mode bit 18. */ - if (!connectable && !(hdev->le_states[2] & 0x02)) - return false; - - /* Central connection state and connectable mode bit 35 and - * scannable 19. - */ - if (connectable && (!(hdev->le_states[4] & 0x08) || - !(hdev->le_states[2] & 0x08))) - return false; - } - - return true; -} - -void __hci_req_enable_advertising(struct hci_request *req) -{ - struct hci_dev *hdev = req->hdev; - struct adv_info *adv; - struct hci_cp_le_set_adv_param cp; - u8 own_addr_type, enable = 0x01; - bool connectable; - u16 adv_min_interval, adv_max_interval; - u32 flags; - - flags = hci_adv_instance_flags(hdev, hdev->cur_adv_instance); - adv = hci_find_adv_instance(hdev, hdev->cur_adv_instance); - - /* If the "connectable" instance flag was not set, then choose between - * ADV_IND and ADV_NONCONN_IND based on the global connectable setting. - */ - connectable = (flags & MGMT_ADV_FLAG_CONNECTABLE) || - mgmt_get_connectable(hdev); - - if (!is_advertising_allowed(hdev, connectable)) - return; - - if (hci_dev_test_flag(hdev, HCI_LE_ADV)) - __hci_req_disable_advertising(req); - - /* Clear the HCI_LE_ADV bit temporarily so that the - * hci_update_random_address knows that it's safe to go ahead - * and write a new random address. The flag will be set back on - * as soon as the SET_ADV_ENABLE HCI command completes. - */ - hci_dev_clear_flag(hdev, HCI_LE_ADV); - - /* Set require_privacy to true only when non-connectable - * advertising is used. In that case it is fine to use a - * non-resolvable private address. - */ - if (hci_update_random_address(req, !connectable, - adv_use_rpa(hdev, flags), - &own_addr_type) < 0) - return; - - memset(&cp, 0, sizeof(cp)); - - if (adv) { - adv_min_interval = adv->min_interval; - adv_max_interval = adv->max_interval; - } else { - adv_min_interval = hdev->le_adv_min_interval; - adv_max_interval = hdev->le_adv_max_interval; - } - - if (connectable) { - cp.type = LE_ADV_IND; - } else { - if (adv_cur_instance_is_scannable(hdev)) - cp.type = LE_ADV_SCAN_IND; - else - cp.type = LE_ADV_NONCONN_IND; - - if (!hci_dev_test_flag(hdev, HCI_DISCOVERABLE) || - hci_dev_test_flag(hdev, HCI_LIMITED_DISCOVERABLE)) { - adv_min_interval = DISCOV_LE_FAST_ADV_INT_MIN; - adv_max_interval = DISCOV_LE_FAST_ADV_INT_MAX; - } - } - - cp.min_interval = cpu_to_le16(adv_min_interval); - cp.max_interval = cpu_to_le16(adv_max_interval); - cp.own_address_type = own_addr_type; - cp.channel_map = hdev->le_adv_channel_map; - - hci_req_add(req, HCI_OP_LE_SET_ADV_PARAM, sizeof(cp), &cp); - - hci_req_add(req, HCI_OP_LE_SET_ADV_ENABLE, sizeof(enable), &enable); -} - -void __hci_req_update_scan_rsp_data(struct hci_request *req, u8 instance) -{ - struct hci_dev *hdev = req->hdev; - u8 len; - - if (!hci_dev_test_flag(hdev, HCI_LE_ENABLED)) - return; - - if (ext_adv_capable(hdev)) { - struct { - struct hci_cp_le_set_ext_scan_rsp_data cp; - u8 data[HCI_MAX_EXT_AD_LENGTH]; - } pdu; - - memset(&pdu, 0, sizeof(pdu)); - - len = eir_create_scan_rsp(hdev, instance, pdu.data); - - if (hdev->scan_rsp_data_len == len && - !memcmp(pdu.data, hdev->scan_rsp_data, len)) - return; - - memcpy(hdev->scan_rsp_data, pdu.data, len); - hdev->scan_rsp_data_len = len; - - pdu.cp.handle = instance; - pdu.cp.length = len; - pdu.cp.operation = LE_SET_ADV_DATA_OP_COMPLETE; - pdu.cp.frag_pref = LE_SET_ADV_DATA_NO_FRAG; - - hci_req_add(req, HCI_OP_LE_SET_EXT_SCAN_RSP_DATA, - sizeof(pdu.cp) + len, &pdu.cp); - } else { - struct hci_cp_le_set_scan_rsp_data cp; - - memset(&cp, 0, sizeof(cp)); - - len = eir_create_scan_rsp(hdev, instance, cp.data); - - if (hdev->scan_rsp_data_len == len && - !memcmp(cp.data, hdev->scan_rsp_data, len)) - return; - - memcpy(hdev->scan_rsp_data, cp.data, sizeof(cp.data)); - hdev->scan_rsp_data_len = len; - - cp.length = len; - - hci_req_add(req, HCI_OP_LE_SET_SCAN_RSP_DATA, sizeof(cp), &cp); - } -} - -void __hci_req_update_adv_data(struct hci_request *req, u8 instance) -{ - struct hci_dev *hdev = req->hdev; - u8 len; - - if (!hci_dev_test_flag(hdev, HCI_LE_ENABLED)) - return; - - if (ext_adv_capable(hdev)) { - struct { - struct hci_cp_le_set_ext_adv_data cp; - u8 data[HCI_MAX_EXT_AD_LENGTH]; - } pdu; - - memset(&pdu, 0, sizeof(pdu)); - - len = eir_create_adv_data(hdev, instance, pdu.data); - - /* There's nothing to do if the data hasn't changed */ - if (hdev->adv_data_len == len && - memcmp(pdu.data, hdev->adv_data, len) == 0) - return; - - memcpy(hdev->adv_data, pdu.data, len); - hdev->adv_data_len = len; - - pdu.cp.length = len; - pdu.cp.handle = instance; - pdu.cp.operation = LE_SET_ADV_DATA_OP_COMPLETE; - pdu.cp.frag_pref = LE_SET_ADV_DATA_NO_FRAG; - - hci_req_add(req, HCI_OP_LE_SET_EXT_ADV_DATA, - sizeof(pdu.cp) + len, &pdu.cp); - } else { - struct hci_cp_le_set_adv_data cp; - - memset(&cp, 0, sizeof(cp)); - - len = eir_create_adv_data(hdev, instance, cp.data); - - /* There's nothing to do if the data hasn't changed */ - if (hdev->adv_data_len == len && - memcmp(cp.data, hdev->adv_data, len) == 0) - return; - - memcpy(hdev->adv_data, cp.data, sizeof(cp.data)); - hdev->adv_data_len = len; - - cp.length = len; - - hci_req_add(req, HCI_OP_LE_SET_ADV_DATA, sizeof(cp), &cp); - } -} - -int hci_req_update_adv_data(struct hci_dev *hdev, u8 instance) -{ - struct hci_request req; - - hci_req_init(&req, hdev); - __hci_req_update_adv_data(&req, instance); - - return hci_req_run(&req, NULL); -} - -static void enable_addr_resolution_complete(struct hci_dev *hdev, u8 status, - u16 opcode) -{ - BT_DBG("%s status %u", hdev->name, status); -} - -void hci_req_disable_address_resolution(struct hci_dev *hdev) -{ - struct hci_request req; - __u8 enable = 0x00; - - if (!hci_dev_test_flag(hdev, HCI_LL_RPA_RESOLUTION)) - return; - - hci_req_init(&req, hdev); - - hci_req_add(&req, HCI_OP_LE_SET_ADDR_RESOLV_ENABLE, 1, &enable); - - hci_req_run(&req, enable_addr_resolution_complete); -} - -static void adv_enable_complete(struct hci_dev *hdev, u8 status, u16 opcode) -{ - bt_dev_dbg(hdev, "status %u", status); -} - -void hci_req_reenable_advertising(struct hci_dev *hdev) -{ - struct hci_request req; - - if (!hci_dev_test_flag(hdev, HCI_ADVERTISING) && - list_empty(&hdev->adv_instances)) - return; - - hci_req_init(&req, hdev); - - if (hdev->cur_adv_instance) { - __hci_req_schedule_adv_instance(&req, hdev->cur_adv_instance, - true); - } else { - if (ext_adv_capable(hdev)) { - __hci_req_start_ext_adv(&req, 0x00); - } else { - __hci_req_update_adv_data(&req, 0x00); - __hci_req_update_scan_rsp_data(&req, 0x00); - __hci_req_enable_advertising(&req); - } - } - - hci_req_run(&req, adv_enable_complete); -} - -static void adv_timeout_expire(struct work_struct *work) -{ - struct hci_dev *hdev = container_of(work, struct hci_dev, - adv_instance_expire.work); - - struct hci_request req; - u8 instance; - - bt_dev_dbg(hdev, ""); - - hci_dev_lock(hdev); - - hdev->adv_instance_timeout = 0; - - instance = hdev->cur_adv_instance; - if (instance == 0x00) - goto unlock; - - hci_req_init(&req, hdev); - - hci_req_clear_adv_instance(hdev, NULL, &req, instance, false); - - if (list_empty(&hdev->adv_instances)) - __hci_req_disable_advertising(&req); - - hci_req_run(&req, NULL); - -unlock: - hci_dev_unlock(hdev); -} - static int hci_req_add_le_interleaved_scan(struct hci_request *req, unsigned long opt) { @@ -1226,84 +885,6 @@ static void interleave_scan_work(struct work_struct *work) &hdev->interleave_scan, timeout); } -int hci_get_random_address(struct hci_dev *hdev, bool require_privacy, - bool use_rpa, struct adv_info *adv_instance, - u8 *own_addr_type, bdaddr_t *rand_addr) -{ - int err; - - bacpy(rand_addr, BDADDR_ANY); - - /* If privacy is enabled use a resolvable private address. If - * current RPA has expired then generate a new one. - */ - if (use_rpa) { - /* If Controller supports LL Privacy use own address type is - * 0x03 - */ - if (use_ll_privacy(hdev)) - *own_addr_type = ADDR_LE_DEV_RANDOM_RESOLVED; - else - *own_addr_type = ADDR_LE_DEV_RANDOM; - - if (adv_instance) { - if (adv_rpa_valid(adv_instance)) - return 0; - } else { - if (rpa_valid(hdev)) - return 0; - } - - err = smp_generate_rpa(hdev, hdev->irk, &hdev->rpa); - if (err < 0) { - bt_dev_err(hdev, "failed to generate new RPA"); - return err; - } - - bacpy(rand_addr, &hdev->rpa); - - return 0; - } - - /* In case of required privacy without resolvable private address, - * use an non-resolvable private address. This is useful for - * non-connectable advertising. - */ - if (require_privacy) { - bdaddr_t nrpa; - - while (true) { - /* The non-resolvable private address is generated - * from random six bytes with the two most significant - * bits cleared. - */ - get_random_bytes(&nrpa, 6); - nrpa.b[5] &= 0x3f; - - /* The non-resolvable private address shall not be - * equal to the public address. - */ - if (bacmp(&hdev->bdaddr, &nrpa)) - break; - } - - *own_addr_type = ADDR_LE_DEV_RANDOM; - bacpy(rand_addr, &nrpa); - - return 0; - } - - /* No privacy so use a public address. */ - *own_addr_type = ADDR_LE_DEV_PUBLIC; - - return 0; -} - -void __hci_req_clear_ext_adv_sets(struct hci_request *req) -{ - hci_req_add(req, HCI_OP_LE_CLEAR_ADV_SETS, 0, NULL); -} - static void set_random_addr(struct hci_request *req, bdaddr_t *rpa) { struct hci_dev *hdev = req->hdev; @@ -1328,933 +909,8 @@ static void set_random_addr(struct hci_request *req, bdaddr_t *rpa) hci_req_add(req, HCI_OP_LE_SET_RANDOM_ADDR, 6, rpa); } -int __hci_req_setup_ext_adv_instance(struct hci_request *req, u8 instance) -{ - struct hci_cp_le_set_ext_adv_params cp; - struct hci_dev *hdev = req->hdev; - bool connectable; - u32 flags; - bdaddr_t random_addr; - u8 own_addr_type; - int err; - struct adv_info *adv; - bool secondary_adv, require_privacy; - - if (instance > 0) { - adv = hci_find_adv_instance(hdev, instance); - if (!adv) - return -EINVAL; - } else { - adv = NULL; - } - - flags = hci_adv_instance_flags(hdev, instance); - - /* If the "connectable" instance flag was not set, then choose between - * ADV_IND and ADV_NONCONN_IND based on the global connectable setting. - */ - connectable = (flags & MGMT_ADV_FLAG_CONNECTABLE) || - mgmt_get_connectable(hdev); - - if (!is_advertising_allowed(hdev, connectable)) - return -EPERM; - - /* Set require_privacy to true only when non-connectable - * advertising is used. In that case it is fine to use a - * non-resolvable private address. - */ - require_privacy = !connectable; - - /* Don't require privacy for periodic adv? */ - if (adv && adv->periodic) - require_privacy = false; - - err = hci_get_random_address(hdev, require_privacy, - adv_use_rpa(hdev, flags), adv, - &own_addr_type, &random_addr); - if (err < 0) - return err; - - memset(&cp, 0, sizeof(cp)); - - if (adv) { - hci_cpu_to_le24(adv->min_interval, cp.min_interval); - hci_cpu_to_le24(adv->max_interval, cp.max_interval); - cp.tx_power = adv->tx_power; - } else { - hci_cpu_to_le24(hdev->le_adv_min_interval, cp.min_interval); - hci_cpu_to_le24(hdev->le_adv_max_interval, cp.max_interval); - cp.tx_power = HCI_ADV_TX_POWER_NO_PREFERENCE; - } - - secondary_adv = (flags & MGMT_ADV_FLAG_SEC_MASK); - - if (connectable) { - if (secondary_adv) - cp.evt_properties = cpu_to_le16(LE_EXT_ADV_CONN_IND); - else - cp.evt_properties = cpu_to_le16(LE_LEGACY_ADV_IND); - } else if (hci_adv_instance_is_scannable(hdev, instance) || - (flags & MGMT_ADV_PARAM_SCAN_RSP)) { - if (secondary_adv) - cp.evt_properties = cpu_to_le16(LE_EXT_ADV_SCAN_IND); - else - cp.evt_properties = cpu_to_le16(LE_LEGACY_ADV_SCAN_IND); - } else { - /* Secondary and periodic cannot use legacy PDUs */ - if (secondary_adv || (adv && adv->periodic)) - cp.evt_properties = cpu_to_le16(LE_EXT_ADV_NON_CONN_IND); - else - cp.evt_properties = cpu_to_le16(LE_LEGACY_NONCONN_IND); - } - - cp.own_addr_type = own_addr_type; - cp.channel_map = hdev->le_adv_channel_map; - cp.handle = instance; - - if (flags & MGMT_ADV_FLAG_SEC_2M) { - cp.primary_phy = HCI_ADV_PHY_1M; - cp.secondary_phy = HCI_ADV_PHY_2M; - } else if (flags & MGMT_ADV_FLAG_SEC_CODED) { - cp.primary_phy = HCI_ADV_PHY_CODED; - cp.secondary_phy = HCI_ADV_PHY_CODED; - } else { - /* In all other cases use 1M */ - cp.primary_phy = HCI_ADV_PHY_1M; - cp.secondary_phy = HCI_ADV_PHY_1M; - } - - hci_req_add(req, HCI_OP_LE_SET_EXT_ADV_PARAMS, sizeof(cp), &cp); - - if ((own_addr_type == ADDR_LE_DEV_RANDOM || - own_addr_type == ADDR_LE_DEV_RANDOM_RESOLVED) && - bacmp(&random_addr, BDADDR_ANY)) { - struct hci_cp_le_set_adv_set_rand_addr cp; - - /* Check if random address need to be updated */ - if (adv) { - if (!bacmp(&random_addr, &adv->random_addr)) - return 0; - } else { - if (!bacmp(&random_addr, &hdev->random_addr)) - return 0; - /* Instance 0x00 doesn't have an adv_info, instead it - * uses hdev->random_addr to track its address so - * whenever it needs to be updated this also set the - * random address since hdev->random_addr is shared with - * scan state machine. - */ - set_random_addr(req, &random_addr); - } - - memset(&cp, 0, sizeof(cp)); - - cp.handle = instance; - bacpy(&cp.bdaddr, &random_addr); - - hci_req_add(req, - HCI_OP_LE_SET_ADV_SET_RAND_ADDR, - sizeof(cp), &cp); - } - - return 0; -} - -int __hci_req_enable_ext_advertising(struct hci_request *req, u8 instance) -{ - struct hci_dev *hdev = req->hdev; - struct hci_cp_le_set_ext_adv_enable *cp; - struct hci_cp_ext_adv_set *adv_set; - u8 data[sizeof(*cp) + sizeof(*adv_set) * 1]; - struct adv_info *adv_instance; - - if (instance > 0) { - adv_instance = hci_find_adv_instance(hdev, instance); - if (!adv_instance) - return -EINVAL; - } else { - adv_instance = NULL; - } - - cp = (void *) data; - adv_set = (void *) cp->data; - - memset(cp, 0, sizeof(*cp)); - - cp->enable = 0x01; - cp->num_of_sets = 0x01; - - memset(adv_set, 0, sizeof(*adv_set)); - - adv_set->handle = instance; - - /* Set duration per instance since controller is responsible for - * scheduling it. - */ - if (adv_instance && adv_instance->duration) { - u16 duration = adv_instance->timeout * MSEC_PER_SEC; - - /* Time = N * 10 ms */ - adv_set->duration = cpu_to_le16(duration / 10); - } - - hci_req_add(req, HCI_OP_LE_SET_EXT_ADV_ENABLE, - sizeof(*cp) + sizeof(*adv_set) * cp->num_of_sets, - data); - - return 0; -} - -int __hci_req_disable_ext_adv_instance(struct hci_request *req, u8 instance) -{ - struct hci_dev *hdev = req->hdev; - struct hci_cp_le_set_ext_adv_enable *cp; - struct hci_cp_ext_adv_set *adv_set; - u8 data[sizeof(*cp) + sizeof(*adv_set) * 1]; - u8 req_size; - - /* If request specifies an instance that doesn't exist, fail */ - if (instance > 0 && !hci_find_adv_instance(hdev, instance)) - return -EINVAL; - - memset(data, 0, sizeof(data)); - - cp = (void *)data; - adv_set = (void *)cp->data; - - /* Instance 0x00 indicates all advertising instances will be disabled */ - cp->num_of_sets = !!instance; - cp->enable = 0x00; - - adv_set->handle = instance; - - req_size = sizeof(*cp) + sizeof(*adv_set) * cp->num_of_sets; - hci_req_add(req, HCI_OP_LE_SET_EXT_ADV_ENABLE, req_size, data); - - return 0; -} - -int __hci_req_remove_ext_adv_instance(struct hci_request *req, u8 instance) -{ - struct hci_dev *hdev = req->hdev; - - /* If request specifies an instance that doesn't exist, fail */ - if (instance > 0 && !hci_find_adv_instance(hdev, instance)) - return -EINVAL; - - hci_req_add(req, HCI_OP_LE_REMOVE_ADV_SET, sizeof(instance), &instance); - - return 0; -} - -int __hci_req_start_ext_adv(struct hci_request *req, u8 instance) -{ - struct hci_dev *hdev = req->hdev; - struct adv_info *adv_instance = hci_find_adv_instance(hdev, instance); - int err; - - /* If instance isn't pending, the chip knows about it, and it's safe to - * disable - */ - if (adv_instance && !adv_instance->pending) - __hci_req_disable_ext_adv_instance(req, instance); - - err = __hci_req_setup_ext_adv_instance(req, instance); - if (err < 0) - return err; - - __hci_req_update_scan_rsp_data(req, instance); - __hci_req_enable_ext_advertising(req, instance); - - return 0; -} - -int __hci_req_schedule_adv_instance(struct hci_request *req, u8 instance, - bool force) -{ - struct hci_dev *hdev = req->hdev; - struct adv_info *adv_instance = NULL; - u16 timeout; - - if (hci_dev_test_flag(hdev, HCI_ADVERTISING) || - list_empty(&hdev->adv_instances)) - return -EPERM; - - if (hdev->adv_instance_timeout) - return -EBUSY; - - adv_instance = hci_find_adv_instance(hdev, instance); - if (!adv_instance) - return -ENOENT; - - /* A zero timeout means unlimited advertising. As long as there is - * only one instance, duration should be ignored. We still set a timeout - * in case further instances are being added later on. - * - * If the remaining lifetime of the instance is more than the duration - * then the timeout corresponds to the duration, otherwise it will be - * reduced to the remaining instance lifetime. - */ - if (adv_instance->timeout == 0 || - adv_instance->duration <= adv_instance->remaining_time) - timeout = adv_instance->duration; - else - timeout = adv_instance->remaining_time; - - /* The remaining time is being reduced unless the instance is being - * advertised without time limit. - */ - if (adv_instance->timeout) - adv_instance->remaining_time = - adv_instance->remaining_time - timeout; - - /* Only use work for scheduling instances with legacy advertising */ - if (!ext_adv_capable(hdev)) { - hdev->adv_instance_timeout = timeout; - queue_delayed_work(hdev->req_workqueue, - &hdev->adv_instance_expire, - msecs_to_jiffies(timeout * 1000)); - } - - /* If we're just re-scheduling the same instance again then do not - * execute any HCI commands. This happens when a single instance is - * being advertised. - */ - if (!force && hdev->cur_adv_instance == instance && - hci_dev_test_flag(hdev, HCI_LE_ADV)) - return 0; - - hdev->cur_adv_instance = instance; - if (ext_adv_capable(hdev)) { - __hci_req_start_ext_adv(req, instance); - } else { - __hci_req_update_adv_data(req, instance); - __hci_req_update_scan_rsp_data(req, instance); - __hci_req_enable_advertising(req); - } - - return 0; -} - -/* For a single instance: - * - force == true: The instance will be removed even when its remaining - * lifetime is not zero. - * - force == false: the instance will be deactivated but kept stored unless - * the remaining lifetime is zero. - * - * For instance == 0x00: - * - force == true: All instances will be removed regardless of their timeout - * setting. - * - force == false: Only instances that have a timeout will be removed. - */ -void hci_req_clear_adv_instance(struct hci_dev *hdev, struct sock *sk, - struct hci_request *req, u8 instance, - bool force) -{ - struct adv_info *adv_instance, *n, *next_instance = NULL; - int err; - u8 rem_inst; - - /* Cancel any timeout concerning the removed instance(s). */ - if (!instance || hdev->cur_adv_instance == instance) - cancel_adv_timeout(hdev); - - /* Get the next instance to advertise BEFORE we remove - * the current one. This can be the same instance again - * if there is only one instance. - */ - if (instance && hdev->cur_adv_instance == instance) - next_instance = hci_get_next_instance(hdev, instance); - - if (instance == 0x00) { - list_for_each_entry_safe(adv_instance, n, &hdev->adv_instances, - list) { - if (!(force || adv_instance->timeout)) - continue; - - rem_inst = adv_instance->instance; - err = hci_remove_adv_instance(hdev, rem_inst); - if (!err) - mgmt_advertising_removed(sk, hdev, rem_inst); - } - } else { - adv_instance = hci_find_adv_instance(hdev, instance); - - if (force || (adv_instance && adv_instance->timeout && - !adv_instance->remaining_time)) { - /* Don't advertise a removed instance. */ - if (next_instance && - next_instance->instance == instance) - next_instance = NULL; - - err = hci_remove_adv_instance(hdev, instance); - if (!err) - mgmt_advertising_removed(sk, hdev, instance); - } - } - - if (!req || !hdev_is_powered(hdev) || - hci_dev_test_flag(hdev, HCI_ADVERTISING)) - return; - - if (next_instance && !ext_adv_capable(hdev)) - __hci_req_schedule_adv_instance(req, next_instance->instance, - false); -} - -int hci_update_random_address(struct hci_request *req, bool require_privacy, - bool use_rpa, u8 *own_addr_type) -{ - struct hci_dev *hdev = req->hdev; - int err; - - /* If privacy is enabled use a resolvable private address. If - * current RPA has expired or there is something else than - * the current RPA in use, then generate a new one. - */ - if (use_rpa) { - /* If Controller supports LL Privacy use own address type is - * 0x03 - */ - if (use_ll_privacy(hdev)) - *own_addr_type = ADDR_LE_DEV_RANDOM_RESOLVED; - else - *own_addr_type = ADDR_LE_DEV_RANDOM; - - if (rpa_valid(hdev)) - return 0; - - err = smp_generate_rpa(hdev, hdev->irk, &hdev->rpa); - if (err < 0) { - bt_dev_err(hdev, "failed to generate new RPA"); - return err; - } - - set_random_addr(req, &hdev->rpa); - - return 0; - } - - /* In case of required privacy without resolvable private address, - * use an non-resolvable private address. This is useful for active - * scanning and non-connectable advertising. - */ - if (require_privacy) { - bdaddr_t nrpa; - - while (true) { - /* The non-resolvable private address is generated - * from random six bytes with the two most significant - * bits cleared. - */ - get_random_bytes(&nrpa, 6); - nrpa.b[5] &= 0x3f; - - /* The non-resolvable private address shall not be - * equal to the public address. - */ - if (bacmp(&hdev->bdaddr, &nrpa)) - break; - } - - *own_addr_type = ADDR_LE_DEV_RANDOM; - set_random_addr(req, &nrpa); - return 0; - } - - /* If forcing static address is in use or there is no public - * address use the static address as random address (but skip - * the HCI command if the current random address is already the - * static one. - * - * In case BR/EDR has been disabled on a dual-mode controller - * and a static address has been configured, then use that - * address instead of the public BR/EDR address. - */ - if (hci_dev_test_flag(hdev, HCI_FORCE_STATIC_ADDR) || - !bacmp(&hdev->bdaddr, BDADDR_ANY) || - (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED) && - bacmp(&hdev->static_addr, BDADDR_ANY))) { - *own_addr_type = ADDR_LE_DEV_RANDOM; - if (bacmp(&hdev->static_addr, &hdev->random_addr)) - hci_req_add(req, HCI_OP_LE_SET_RANDOM_ADDR, 6, - &hdev->static_addr); - return 0; - } - - /* Neither privacy nor static address is being used so use a - * public address. - */ - *own_addr_type = ADDR_LE_DEV_PUBLIC; - - return 0; -} - -static bool disconnected_accept_list_entries(struct hci_dev *hdev) -{ - struct bdaddr_list *b; - - list_for_each_entry(b, &hdev->accept_list, list) { - struct hci_conn *conn; - - conn = hci_conn_hash_lookup_ba(hdev, ACL_LINK, &b->bdaddr); - if (!conn) - return true; - - if (conn->state != BT_CONNECTED && conn->state != BT_CONFIG) - return true; - } - - return false; -} - -void __hci_req_update_scan(struct hci_request *req) -{ - struct hci_dev *hdev = req->hdev; - u8 scan; - - if (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED)) - return; - - if (!hdev_is_powered(hdev)) - return; - - if (mgmt_powering_down(hdev)) - return; - - if (hdev->scanning_paused) - return; - - if (hci_dev_test_flag(hdev, HCI_CONNECTABLE) || - disconnected_accept_list_entries(hdev)) - scan = SCAN_PAGE; - else - scan = SCAN_DISABLED; - - if (hci_dev_test_flag(hdev, HCI_DISCOVERABLE)) - scan |= SCAN_INQUIRY; - - if (test_bit(HCI_PSCAN, &hdev->flags) == !!(scan & SCAN_PAGE) && - test_bit(HCI_ISCAN, &hdev->flags) == !!(scan & SCAN_INQUIRY)) - return; - - hci_req_add(req, HCI_OP_WRITE_SCAN_ENABLE, 1, &scan); -} - -static u8 get_service_classes(struct hci_dev *hdev) -{ - struct bt_uuid *uuid; - u8 val = 0; - - list_for_each_entry(uuid, &hdev->uuids, list) - val |= uuid->svc_hint; - - return val; -} - -void __hci_req_update_class(struct hci_request *req) -{ - struct hci_dev *hdev = req->hdev; - u8 cod[3]; - - bt_dev_dbg(hdev, ""); - - if (!hdev_is_powered(hdev)) - return; - - if (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED)) - return; - - if (hci_dev_test_flag(hdev, HCI_SERVICE_CACHE)) - return; - - cod[0] = hdev->minor_class; - cod[1] = hdev->major_class; - cod[2] = get_service_classes(hdev); - - if (hci_dev_test_flag(hdev, HCI_LIMITED_DISCOVERABLE)) - cod[1] |= 0x20; - - if (memcmp(cod, hdev->dev_class, 3) == 0) - return; - - hci_req_add(req, HCI_OP_WRITE_CLASS_OF_DEV, sizeof(cod), cod); -} - -void __hci_abort_conn(struct hci_request *req, struct hci_conn *conn, - u8 reason) -{ - switch (conn->state) { - case BT_CONNECTED: - case BT_CONFIG: - if (conn->type == AMP_LINK) { - struct hci_cp_disconn_phy_link cp; - - cp.phy_handle = HCI_PHY_HANDLE(conn->handle); - cp.reason = reason; - hci_req_add(req, HCI_OP_DISCONN_PHY_LINK, sizeof(cp), - &cp); - } else { - struct hci_cp_disconnect dc; - - dc.handle = cpu_to_le16(conn->handle); - dc.reason = reason; - hci_req_add(req, HCI_OP_DISCONNECT, sizeof(dc), &dc); - } - - conn->state = BT_DISCONN; - - break; - case BT_CONNECT: - if (conn->type == LE_LINK) { - if (test_bit(HCI_CONN_SCANNING, &conn->flags)) - break; - hci_req_add(req, HCI_OP_LE_CREATE_CONN_CANCEL, - 0, NULL); - } else if (conn->type == ACL_LINK) { - if (req->hdev->hci_ver < BLUETOOTH_VER_1_2) - break; - hci_req_add(req, HCI_OP_CREATE_CONN_CANCEL, - 6, &conn->dst); - } - break; - case BT_CONNECT2: - if (conn->type == ACL_LINK) { - struct hci_cp_reject_conn_req rej; - - bacpy(&rej.bdaddr, &conn->dst); - rej.reason = reason; - - hci_req_add(req, HCI_OP_REJECT_CONN_REQ, - sizeof(rej), &rej); - } else if (conn->type == SCO_LINK || conn->type == ESCO_LINK) { - struct hci_cp_reject_sync_conn_req rej; - - bacpy(&rej.bdaddr, &conn->dst); - - /* SCO rejection has its own limited set of - * allowed error values (0x0D-0x0F) which isn't - * compatible with most values passed to this - * function. To be safe hard-code one of the - * values that's suitable for SCO. - */ - rej.reason = HCI_ERROR_REJ_LIMITED_RESOURCES; - - hci_req_add(req, HCI_OP_REJECT_SYNC_CONN_REQ, - sizeof(rej), &rej); - } - break; - default: - conn->state = BT_CLOSED; - break; - } -} - -static void abort_conn_complete(struct hci_dev *hdev, u8 status, u16 opcode) -{ - if (status) - bt_dev_dbg(hdev, "Failed to abort connection: status 0x%2.2x", status); -} - -int hci_abort_conn(struct hci_conn *conn, u8 reason) -{ - struct hci_request req; - int err; - - hci_req_init(&req, conn->hdev); - - __hci_abort_conn(&req, conn, reason); - - err = hci_req_run(&req, abort_conn_complete); - if (err && err != -ENODATA) { - bt_dev_err(conn->hdev, "failed to run HCI request: err %d", err); - return err; - } - - return 0; -} - -static int le_scan_disable(struct hci_request *req, unsigned long opt) -{ - hci_req_add_le_scan_disable(req, false); - return 0; -} - -static int bredr_inquiry(struct hci_request *req, unsigned long opt) -{ - u8 length = opt; - const u8 giac[3] = { 0x33, 0x8b, 0x9e }; - const u8 liac[3] = { 0x00, 0x8b, 0x9e }; - struct hci_cp_inquiry cp; - - if (test_bit(HCI_INQUIRY, &req->hdev->flags)) - return 0; - - bt_dev_dbg(req->hdev, ""); - - hci_dev_lock(req->hdev); - hci_inquiry_cache_flush(req->hdev); - hci_dev_unlock(req->hdev); - - memset(&cp, 0, sizeof(cp)); - - if (req->hdev->discovery.limited) - memcpy(&cp.lap, liac, sizeof(cp.lap)); - else - memcpy(&cp.lap, giac, sizeof(cp.lap)); - - cp.length = length; - - hci_req_add(req, HCI_OP_INQUIRY, sizeof(cp), &cp); - - return 0; -} - -static void le_scan_disable_work(struct work_struct *work) -{ - struct hci_dev *hdev = container_of(work, struct hci_dev, - le_scan_disable.work); - u8 status; - - bt_dev_dbg(hdev, ""); - - if (!hci_dev_test_flag(hdev, HCI_LE_SCAN)) - return; - - cancel_delayed_work(&hdev->le_scan_restart); - - hci_req_sync(hdev, le_scan_disable, 0, HCI_CMD_TIMEOUT, &status); - if (status) { - bt_dev_err(hdev, "failed to disable LE scan: status 0x%02x", - status); - return; - } - - hdev->discovery.scan_start = 0; - - /* If we were running LE only scan, change discovery state. If - * we were running both LE and BR/EDR inquiry simultaneously, - * and BR/EDR inquiry is already finished, stop discovery, - * otherwise BR/EDR inquiry will stop discovery when finished. - * If we will resolve remote device name, do not change - * discovery state. - */ - - if (hdev->discovery.type == DISCOV_TYPE_LE) - goto discov_stopped; - - if (hdev->discovery.type != DISCOV_TYPE_INTERLEAVED) - return; - - if (test_bit(HCI_QUIRK_SIMULTANEOUS_DISCOVERY, &hdev->quirks)) { - if (!test_bit(HCI_INQUIRY, &hdev->flags) && - hdev->discovery.state != DISCOVERY_RESOLVING) - goto discov_stopped; - - return; - } - - hci_req_sync(hdev, bredr_inquiry, DISCOV_INTERLEAVED_INQUIRY_LEN, - HCI_CMD_TIMEOUT, &status); - if (status) { - bt_dev_err(hdev, "inquiry failed: status 0x%02x", status); - goto discov_stopped; - } - - return; - -discov_stopped: - hci_dev_lock(hdev); - hci_discovery_set_state(hdev, DISCOVERY_STOPPED); - hci_dev_unlock(hdev); -} - -static int le_scan_restart(struct hci_request *req, unsigned long opt) -{ - struct hci_dev *hdev = req->hdev; - - /* If controller is not scanning we are done. */ - if (!hci_dev_test_flag(hdev, HCI_LE_SCAN)) - return 0; - - if (hdev->scanning_paused) { - bt_dev_dbg(hdev, "Scanning is paused for suspend"); - return 0; - } - - hci_req_add_le_scan_disable(req, false); - - if (use_ext_scan(hdev)) { - struct hci_cp_le_set_ext_scan_enable ext_enable_cp; - - memset(&ext_enable_cp, 0, sizeof(ext_enable_cp)); - ext_enable_cp.enable = LE_SCAN_ENABLE; - ext_enable_cp.filter_dup = LE_SCAN_FILTER_DUP_ENABLE; - - hci_req_add(req, HCI_OP_LE_SET_EXT_SCAN_ENABLE, - sizeof(ext_enable_cp), &ext_enable_cp); - } else { - struct hci_cp_le_set_scan_enable cp; - - memset(&cp, 0, sizeof(cp)); - cp.enable = LE_SCAN_ENABLE; - cp.filter_dup = LE_SCAN_FILTER_DUP_ENABLE; - hci_req_add(req, HCI_OP_LE_SET_SCAN_ENABLE, sizeof(cp), &cp); - } - - return 0; -} - -static void le_scan_restart_work(struct work_struct *work) -{ - struct hci_dev *hdev = container_of(work, struct hci_dev, - le_scan_restart.work); - unsigned long timeout, duration, scan_start, now; - u8 status; - - bt_dev_dbg(hdev, ""); - - hci_req_sync(hdev, le_scan_restart, 0, HCI_CMD_TIMEOUT, &status); - if (status) { - bt_dev_err(hdev, "failed to restart LE scan: status %d", - status); - return; - } - - hci_dev_lock(hdev); - - if (!test_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks) || - !hdev->discovery.scan_start) - goto unlock; - - /* When the scan was started, hdev->le_scan_disable has been queued - * after duration from scan_start. During scan restart this job - * has been canceled, and we need to queue it again after proper - * timeout, to make sure that scan does not run indefinitely. - */ - duration = hdev->discovery.scan_duration; - scan_start = hdev->discovery.scan_start; - now = jiffies; - if (now - scan_start <= duration) { - int elapsed; - - if (now >= scan_start) - elapsed = now - scan_start; - else - elapsed = ULONG_MAX - scan_start + now; - - timeout = duration - elapsed; - } else { - timeout = 0; - } - - queue_delayed_work(hdev->req_workqueue, - &hdev->le_scan_disable, timeout); - -unlock: - hci_dev_unlock(hdev); -} - -bool hci_req_stop_discovery(struct hci_request *req) -{ - struct hci_dev *hdev = req->hdev; - struct discovery_state *d = &hdev->discovery; - struct hci_cp_remote_name_req_cancel cp; - struct inquiry_entry *e; - bool ret = false; - - bt_dev_dbg(hdev, "state %u", hdev->discovery.state); - - if (d->state == DISCOVERY_FINDING || d->state == DISCOVERY_STOPPING) { - if (test_bit(HCI_INQUIRY, &hdev->flags)) - hci_req_add(req, HCI_OP_INQUIRY_CANCEL, 0, NULL); - - if (hci_dev_test_flag(hdev, HCI_LE_SCAN)) { - cancel_delayed_work(&hdev->le_scan_disable); - cancel_delayed_work(&hdev->le_scan_restart); - hci_req_add_le_scan_disable(req, false); - } - - ret = true; - } else { - /* Passive scanning */ - if (hci_dev_test_flag(hdev, HCI_LE_SCAN)) { - hci_req_add_le_scan_disable(req, false); - ret = true; - } - } - - /* No further actions needed for LE-only discovery */ - if (d->type == DISCOV_TYPE_LE) - return ret; - - if (d->state == DISCOVERY_RESOLVING || d->state == DISCOVERY_STOPPING) { - e = hci_inquiry_cache_lookup_resolve(hdev, BDADDR_ANY, - NAME_PENDING); - if (!e) - return ret; - - bacpy(&cp.bdaddr, &e->data.bdaddr); - hci_req_add(req, HCI_OP_REMOTE_NAME_REQ_CANCEL, sizeof(cp), - &cp); - ret = true; - } - - return ret; -} - -static void config_data_path_complete(struct hci_dev *hdev, u8 status, - u16 opcode) -{ - bt_dev_dbg(hdev, "status %u", status); -} - -int hci_req_configure_datapath(struct hci_dev *hdev, struct bt_codec *codec) -{ - struct hci_request req; - int err; - __u8 vnd_len, *vnd_data = NULL; - struct hci_op_configure_data_path *cmd = NULL; - - hci_req_init(&req, hdev); - - err = hdev->get_codec_config_data(hdev, ESCO_LINK, codec, &vnd_len, - &vnd_data); - if (err < 0) - goto error; - - cmd = kzalloc(sizeof(*cmd) + vnd_len, GFP_KERNEL); - if (!cmd) { - err = -ENOMEM; - goto error; - } - - err = hdev->get_data_path_id(hdev, &cmd->data_path_id); - if (err < 0) - goto error; - - cmd->vnd_len = vnd_len; - memcpy(cmd->vnd_data, vnd_data, vnd_len); - - cmd->direction = 0x00; - hci_req_add(&req, HCI_CONFIGURE_DATA_PATH, sizeof(*cmd) + vnd_len, cmd); - - cmd->direction = 0x01; - hci_req_add(&req, HCI_CONFIGURE_DATA_PATH, sizeof(*cmd) + vnd_len, cmd); - - err = hci_req_run(&req, config_data_path_complete); -error: - - kfree(cmd); - kfree(vnd_data); - return err; -} - void hci_request_setup(struct hci_dev *hdev) { - INIT_DELAYED_WORK(&hdev->le_scan_disable, le_scan_disable_work); - INIT_DELAYED_WORK(&hdev->le_scan_restart, le_scan_restart_work); - INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); INIT_DELAYED_WORK(&hdev->interleave_scan, interleave_scan_work); } @@ -2262,13 +918,5 @@ void hci_request_cancel_all(struct hci_dev *hdev) { __hci_cmd_sync_cancel(hdev, ENODEV); - cancel_delayed_work_sync(&hdev->le_scan_disable); - cancel_delayed_work_sync(&hdev->le_scan_restart); - - if (hdev->adv_instance_timeout) { - cancel_delayed_work_sync(&hdev->adv_instance_expire); - hdev->adv_instance_timeout = 0; - } - cancel_interleave_scan(hdev); } diff --git a/net/bluetooth/hci_request.h b/net/bluetooth/hci_request.h index 39d001fa3acf..b9c5a9823837 100644 --- a/net/bluetooth/hci_request.h +++ b/net/bluetooth/hci_request.h @@ -68,63 +68,10 @@ int __hci_req_sync(struct hci_dev *hdev, int (*func)(struct hci_request *req, struct sk_buff *hci_prepare_cmd(struct hci_dev *hdev, u16 opcode, u32 plen, const void *param); -void __hci_req_write_fast_connectable(struct hci_request *req, bool enable); -void __hci_req_update_name(struct hci_request *req); -void __hci_req_update_eir(struct hci_request *req); - void hci_req_add_le_scan_disable(struct hci_request *req, bool rpa_le_conn); void hci_req_add_le_passive_scan(struct hci_request *req); void hci_req_prepare_suspend(struct hci_dev *hdev, enum suspended_state next); -void hci_req_disable_address_resolution(struct hci_dev *hdev); -void hci_req_reenable_advertising(struct hci_dev *hdev); -void __hci_req_enable_advertising(struct hci_request *req); -void __hci_req_disable_advertising(struct hci_request *req); -void __hci_req_update_adv_data(struct hci_request *req, u8 instance); -int hci_req_update_adv_data(struct hci_dev *hdev, u8 instance); -int hci_req_start_per_adv(struct hci_dev *hdev, u8 instance, u32 flags, - u16 min_interval, u16 max_interval, - u16 sync_interval); -void __hci_req_update_scan_rsp_data(struct hci_request *req, u8 instance); - -int __hci_req_schedule_adv_instance(struct hci_request *req, u8 instance, - bool force); -void hci_req_clear_adv_instance(struct hci_dev *hdev, struct sock *sk, - struct hci_request *req, u8 instance, - bool force); - -int __hci_req_setup_ext_adv_instance(struct hci_request *req, u8 instance); -int __hci_req_setup_per_adv_instance(struct hci_request *req, u8 instance, - u16 min_interval, u16 max_interval); -int __hci_req_start_ext_adv(struct hci_request *req, u8 instance); -int __hci_req_start_per_adv(struct hci_request *req, u8 instance, u32 flags, - u16 min_interval, u16 max_interval, - u16 sync_interval); -int __hci_req_enable_ext_advertising(struct hci_request *req, u8 instance); -int __hci_req_enable_per_advertising(struct hci_request *req, u8 instance); -int __hci_req_disable_ext_adv_instance(struct hci_request *req, u8 instance); -int __hci_req_remove_ext_adv_instance(struct hci_request *req, u8 instance); -void __hci_req_clear_ext_adv_sets(struct hci_request *req); -int hci_get_random_address(struct hci_dev *hdev, bool require_privacy, - bool use_rpa, struct adv_info *adv_instance, - u8 *own_addr_type, bdaddr_t *rand_addr); - -void __hci_req_update_class(struct hci_request *req); - -/* Returns true if HCI commands were queued */ -bool hci_req_stop_discovery(struct hci_request *req); - -int hci_req_configure_datapath(struct hci_dev *hdev, struct bt_codec *codec); - -void __hci_req_update_scan(struct hci_request *req); - -int hci_update_random_address(struct hci_request *req, bool require_privacy, - bool use_rpa, u8 *own_addr_type); - -int hci_abort_conn(struct hci_conn *conn, u8 reason); -void __hci_abort_conn(struct hci_request *req, struct hci_conn *conn, - u8 reason); - void hci_request_setup(struct hci_dev *hdev); void hci_request_cancel_all(struct hci_dev *hdev); diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c index 0d015d4a8e41..06581223238c 100644 --- a/net/bluetooth/hci_sock.c +++ b/net/bluetooth/hci_sock.c @@ -887,7 +887,6 @@ static int hci_sock_release(struct socket *sock) */ hci_dev_do_close(hdev); hci_dev_clear_flag(hdev, HCI_USER_CHANNEL); - hci_register_suspend_notifier(hdev); mgmt_index_added(hdev); } @@ -1216,7 +1215,6 @@ static int hci_sock_bind(struct socket *sock, struct sockaddr *addr, } mgmt_index_removed(hdev); - hci_unregister_suspend_notifier(hdev); err = hci_dev_open(hdev->id); if (err) { @@ -1231,7 +1229,6 @@ static int hci_sock_bind(struct socket *sock, struct sockaddr *addr, err = 0; } else { hci_dev_clear_flag(hdev, HCI_USER_CHANNEL); - hci_register_suspend_notifier(hdev); mgmt_index_added(hdev); hci_dev_put(hdev); goto done; @@ -2065,6 +2062,7 @@ static int hci_sock_getsockopt(struct socket *sock, int level, int optname, static void hci_sock_destruct(struct sock *sk) { + mgmt_cleanup(sk); skb_queue_purge(&sk->sk_receive_queue); skb_queue_purge(&sk->sk_write_queue); } diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index fbd5613eebfc..76c3107c9f91 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -246,7 +246,7 @@ int __hci_cmd_sync_status_sk(struct hci_dev *hdev, u16 opcode, u32 plen, skb = __hci_cmd_sync_sk(hdev, opcode, plen, param, event, timeout, sk); if (IS_ERR(skb)) { bt_dev_err(hdev, "Opcode 0x%4x failed: %ld", opcode, - PTR_ERR(skb)); + PTR_ERR(skb)); return PTR_ERR(skb); } @@ -321,6 +321,307 @@ static void hci_cmd_sync_cancel_work(struct work_struct *work) wake_up_interruptible(&hdev->req_wait_q); } +static int hci_scan_disable_sync(struct hci_dev *hdev); +static int scan_disable_sync(struct hci_dev *hdev, void *data) +{ + return hci_scan_disable_sync(hdev); +} + +static int hci_inquiry_sync(struct hci_dev *hdev, u8 length); +static int interleaved_inquiry_sync(struct hci_dev *hdev, void *data) +{ + return hci_inquiry_sync(hdev, DISCOV_INTERLEAVED_INQUIRY_LEN); +} + +static void le_scan_disable(struct work_struct *work) +{ + struct hci_dev *hdev = container_of(work, struct hci_dev, + le_scan_disable.work); + int status; + + bt_dev_dbg(hdev, ""); + hci_dev_lock(hdev); + + if (!hci_dev_test_flag(hdev, HCI_LE_SCAN)) + goto _return; + + cancel_delayed_work(&hdev->le_scan_restart); + + status = hci_cmd_sync_queue(hdev, scan_disable_sync, NULL, NULL); + if (status) { + bt_dev_err(hdev, "failed to disable LE scan: %d", status); + goto _return; + } + + hdev->discovery.scan_start = 0; + + /* If we were running LE only scan, change discovery state. If + * we were running both LE and BR/EDR inquiry simultaneously, + * and BR/EDR inquiry is already finished, stop discovery, + * otherwise BR/EDR inquiry will stop discovery when finished. + * If we will resolve remote device name, do not change + * discovery state. + */ + + if (hdev->discovery.type == DISCOV_TYPE_LE) + goto discov_stopped; + + if (hdev->discovery.type != DISCOV_TYPE_INTERLEAVED) + goto _return; + + if (test_bit(HCI_QUIRK_SIMULTANEOUS_DISCOVERY, &hdev->quirks)) { + if (!test_bit(HCI_INQUIRY, &hdev->flags) && + hdev->discovery.state != DISCOVERY_RESOLVING) + goto discov_stopped; + + goto _return; + } + + status = hci_cmd_sync_queue(hdev, interleaved_inquiry_sync, NULL, NULL); + if (status) { + bt_dev_err(hdev, "inquiry failed: status %d", status); + goto discov_stopped; + } + + goto _return; + +discov_stopped: + hci_discovery_set_state(hdev, DISCOVERY_STOPPED); + +_return: + hci_dev_unlock(hdev); +} + +static int hci_le_set_scan_enable_sync(struct hci_dev *hdev, u8 val, + u8 filter_dup); +static int hci_le_scan_restart_sync(struct hci_dev *hdev) +{ + /* If controller is not scanning we are done. */ + if (!hci_dev_test_flag(hdev, HCI_LE_SCAN)) + return 0; + + if (hdev->scanning_paused) { + bt_dev_dbg(hdev, "Scanning is paused for suspend"); + return 0; + } + + hci_le_set_scan_enable_sync(hdev, LE_SCAN_DISABLE, 0x00); + return hci_le_set_scan_enable_sync(hdev, LE_SCAN_ENABLE, + LE_SCAN_FILTER_DUP_ENABLE); +} + +static int le_scan_restart_sync(struct hci_dev *hdev, void *data) +{ + return hci_le_scan_restart_sync(hdev); +} + +static void le_scan_restart(struct work_struct *work) +{ + struct hci_dev *hdev = container_of(work, struct hci_dev, + le_scan_restart.work); + unsigned long timeout, duration, scan_start, now; + int status; + + bt_dev_dbg(hdev, ""); + + hci_dev_lock(hdev); + + status = hci_cmd_sync_queue(hdev, le_scan_restart_sync, NULL, NULL); + if (status) { + bt_dev_err(hdev, "failed to restart LE scan: status %d", + status); + goto unlock; + } + + if (!test_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks) || + !hdev->discovery.scan_start) + goto unlock; + + /* When the scan was started, hdev->le_scan_disable has been queued + * after duration from scan_start. During scan restart this job + * has been canceled, and we need to queue it again after proper + * timeout, to make sure that scan does not run indefinitely. + */ + duration = hdev->discovery.scan_duration; + scan_start = hdev->discovery.scan_start; + now = jiffies; + if (now - scan_start <= duration) { + int elapsed; + + if (now >= scan_start) + elapsed = now - scan_start; + else + elapsed = ULONG_MAX - scan_start + now; + + timeout = duration - elapsed; + } else { + timeout = 0; + } + + queue_delayed_work(hdev->req_workqueue, + &hdev->le_scan_disable, timeout); + +unlock: + hci_dev_unlock(hdev); +} + +static int reenable_adv_sync(struct hci_dev *hdev, void *data) +{ + bt_dev_dbg(hdev, ""); + + if (!hci_dev_test_flag(hdev, HCI_ADVERTISING) && + list_empty(&hdev->adv_instances)) + return 0; + + if (hdev->cur_adv_instance) { + return hci_schedule_adv_instance_sync(hdev, + hdev->cur_adv_instance, + true); + } else { + if (ext_adv_capable(hdev)) { + hci_start_ext_adv_sync(hdev, 0x00); + } else { + hci_update_adv_data_sync(hdev, 0x00); + hci_update_scan_rsp_data_sync(hdev, 0x00); + hci_enable_advertising_sync(hdev); + } + } + + return 0; +} + +static void reenable_adv(struct work_struct *work) +{ + struct hci_dev *hdev = container_of(work, struct hci_dev, + reenable_adv_work); + int status; + + bt_dev_dbg(hdev, ""); + + hci_dev_lock(hdev); + + status = hci_cmd_sync_queue(hdev, reenable_adv_sync, NULL, NULL); + if (status) + bt_dev_err(hdev, "failed to reenable ADV: %d", status); + + hci_dev_unlock(hdev); +} + +static void cancel_adv_timeout(struct hci_dev *hdev) +{ + if (hdev->adv_instance_timeout) { + hdev->adv_instance_timeout = 0; + cancel_delayed_work(&hdev->adv_instance_expire); + } +} + +/* For a single instance: + * - force == true: The instance will be removed even when its remaining + * lifetime is not zero. + * - force == false: the instance will be deactivated but kept stored unless + * the remaining lifetime is zero. + * + * For instance == 0x00: + * - force == true: All instances will be removed regardless of their timeout + * setting. + * - force == false: Only instances that have a timeout will be removed. + */ +int hci_clear_adv_instance_sync(struct hci_dev *hdev, struct sock *sk, + u8 instance, bool force) +{ + struct adv_info *adv_instance, *n, *next_instance = NULL; + int err; + u8 rem_inst; + + /* Cancel any timeout concerning the removed instance(s). */ + if (!instance || hdev->cur_adv_instance == instance) + cancel_adv_timeout(hdev); + + /* Get the next instance to advertise BEFORE we remove + * the current one. This can be the same instance again + * if there is only one instance. + */ + if (instance && hdev->cur_adv_instance == instance) + next_instance = hci_get_next_instance(hdev, instance); + + if (instance == 0x00) { + list_for_each_entry_safe(adv_instance, n, &hdev->adv_instances, + list) { + if (!(force || adv_instance->timeout)) + continue; + + rem_inst = adv_instance->instance; + err = hci_remove_adv_instance(hdev, rem_inst); + if (!err) + mgmt_advertising_removed(sk, hdev, rem_inst); + } + } else { + adv_instance = hci_find_adv_instance(hdev, instance); + + if (force || (adv_instance && adv_instance->timeout && + !adv_instance->remaining_time)) { + /* Don't advertise a removed instance. */ + if (next_instance && + next_instance->instance == instance) + next_instance = NULL; + + err = hci_remove_adv_instance(hdev, instance); + if (!err) + mgmt_advertising_removed(sk, hdev, instance); + } + } + + if (!hdev_is_powered(hdev) || hci_dev_test_flag(hdev, HCI_ADVERTISING)) + return 0; + + if (next_instance && !ext_adv_capable(hdev)) + return hci_schedule_adv_instance_sync(hdev, + next_instance->instance, + false); + + return 0; +} + +static int adv_timeout_expire_sync(struct hci_dev *hdev, void *data) +{ + u8 instance = *(u8 *)data; + + kfree(data); + + hci_clear_adv_instance_sync(hdev, NULL, instance, false); + + if (list_empty(&hdev->adv_instances)) + return hci_disable_advertising_sync(hdev); + + return 0; +} + +static void adv_timeout_expire(struct work_struct *work) +{ + u8 *inst_ptr; + struct hci_dev *hdev = container_of(work, struct hci_dev, + adv_instance_expire.work); + + bt_dev_dbg(hdev, ""); + + hci_dev_lock(hdev); + + hdev->adv_instance_timeout = 0; + + if (hdev->cur_adv_instance == 0x00) + goto unlock; + + inst_ptr = kmalloc(1, GFP_KERNEL); + if (!inst_ptr) + goto unlock; + + *inst_ptr = hdev->cur_adv_instance; + hci_cmd_sync_queue(hdev, adv_timeout_expire_sync, inst_ptr, NULL); + +unlock: + hci_dev_unlock(hdev); +} + void hci_cmd_sync_init(struct hci_dev *hdev) { INIT_WORK(&hdev->cmd_sync_work, hci_cmd_sync_work); @@ -328,6 +629,10 @@ void hci_cmd_sync_init(struct hci_dev *hdev) mutex_init(&hdev->cmd_sync_work_lock); INIT_WORK(&hdev->cmd_sync_cancel_work, hci_cmd_sync_cancel_work); + INIT_WORK(&hdev->reenable_adv_work, reenable_adv); + INIT_DELAYED_WORK(&hdev->le_scan_disable, le_scan_disable); + INIT_DELAYED_WORK(&hdev->le_scan_restart, le_scan_restart); + INIT_DELAYED_WORK(&hdev->adv_instance_expire, adv_timeout_expire); } void hci_cmd_sync_clear(struct hci_dev *hdev) @@ -335,6 +640,7 @@ void hci_cmd_sync_clear(struct hci_dev *hdev) struct hci_cmd_sync_work_entry *entry, *tmp; cancel_work_sync(&hdev->cmd_sync_work); + cancel_work_sync(&hdev->reenable_adv_work); list_for_each_entry_safe(entry, tmp, &hdev->cmd_sync_work_list, list) { if (entry->destroy) @@ -1333,14 +1639,6 @@ int hci_le_terminate_big_sync(struct hci_dev *hdev, u8 handle, u8 reason) sizeof(cp), &cp, HCI_CMD_TIMEOUT); } -static void cancel_adv_timeout(struct hci_dev *hdev) -{ - if (hdev->adv_instance_timeout) { - hdev->adv_instance_timeout = 0; - cancel_delayed_work(&hdev->adv_instance_expire); - } -} - static int hci_set_ext_adv_data_sync(struct hci_dev *hdev, u8 instance) { struct { @@ -1492,10 +1790,13 @@ static int hci_clear_adv_sets_sync(struct hci_dev *hdev, struct sock *sk) static int hci_clear_adv_sync(struct hci_dev *hdev, struct sock *sk, bool force) { struct adv_info *adv, *n; + int err = 0; if (ext_adv_capable(hdev)) /* Remove all existing sets */ - return hci_clear_adv_sets_sync(hdev, sk); + err = hci_clear_adv_sets_sync(hdev, sk); + if (ext_adv_capable(hdev)) + return err; /* This is safe as long as there is no command send while the lock is * held. @@ -1523,11 +1824,13 @@ static int hci_clear_adv_sync(struct hci_dev *hdev, struct sock *sk, bool force) static int hci_remove_adv_sync(struct hci_dev *hdev, u8 instance, struct sock *sk) { - int err; + int err = 0; /* If we use extended advertising, instance has to be removed first. */ if (ext_adv_capable(hdev)) - return hci_remove_ext_adv_instance_sync(hdev, instance, sk); + err = hci_remove_ext_adv_instance_sync(hdev, instance, sk); + if (ext_adv_capable(hdev)) + return err; /* This is safe as long as there is no command send while the lock is * held. @@ -1626,13 +1929,16 @@ int hci_read_tx_power_sync(struct hci_dev *hdev, __le16 handle, u8 type) int hci_disable_advertising_sync(struct hci_dev *hdev) { u8 enable = 0x00; + int err = 0; /* If controller is not advertising we are done. */ if (!hci_dev_test_flag(hdev, HCI_LE_ADV)) return 0; if (ext_adv_capable(hdev)) - return hci_disable_ext_adv_instance_sync(hdev, 0x00); + err = hci_disable_ext_adv_instance_sync(hdev, 0x00); + if (ext_adv_capable(hdev)) + return err; return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_ADV_ENABLE, sizeof(enable), &enable, HCI_CMD_TIMEOUT); @@ -1645,7 +1951,11 @@ static int hci_le_set_ext_scan_enable_sync(struct hci_dev *hdev, u8 val, memset(&cp, 0, sizeof(cp)); cp.enable = val; - cp.filter_dup = filter_dup; + + if (hci_dev_test_flag(hdev, HCI_MESH)) + cp.filter_dup = LE_SCAN_FILTER_DUP_DISABLE; + else + cp.filter_dup = filter_dup; return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_EXT_SCAN_ENABLE, sizeof(cp), &cp, HCI_CMD_TIMEOUT); @@ -1661,7 +1971,11 @@ static int hci_le_set_scan_enable_sync(struct hci_dev *hdev, u8 val, memset(&cp, 0, sizeof(cp)); cp.enable = val; - cp.filter_dup = filter_dup; + + if (val && hci_dev_test_flag(hdev, HCI_MESH)) + cp.filter_dup = LE_SCAN_FILTER_DUP_DISABLE; + else + cp.filter_dup = filter_dup; return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_SCAN_ENABLE, sizeof(cp), &cp, HCI_CMD_TIMEOUT); @@ -2300,6 +2614,7 @@ static int hci_passive_scan_sync(struct hci_dev *hdev) u8 own_addr_type; u8 filter_policy; u16 window, interval; + u8 filter_dups = LE_SCAN_FILTER_DUP_ENABLE; int err; if (hdev->scanning_paused) { @@ -2362,11 +2677,16 @@ static int hci_passive_scan_sync(struct hci_dev *hdev) interval = hdev->le_scan_interval; } + /* Disable all filtering for Mesh */ + if (hci_dev_test_flag(hdev, HCI_MESH)) { + filter_policy = 0; + filter_dups = LE_SCAN_FILTER_DUP_DISABLE; + } + bt_dev_dbg(hdev, "LE passive scan with acceptlist = %d", filter_policy); return hci_start_scan_sync(hdev, LE_SCAN_PASSIVE, interval, window, - own_addr_type, filter_policy, - LE_SCAN_FILTER_DUP_ENABLE); + own_addr_type, filter_policy, filter_dups); } /* This function controls the passive scanning based on hdev->pend_le_conns @@ -2416,7 +2736,8 @@ int hci_update_passive_scan_sync(struct hci_dev *hdev) bt_dev_dbg(hdev, "ADV monitoring is %s", hci_is_adv_monitoring(hdev) ? "on" : "off"); - if (list_empty(&hdev->pend_le_conns) && + if (!hci_dev_test_flag(hdev, HCI_MESH) && + list_empty(&hdev->pend_le_conns) && list_empty(&hdev->pend_le_reports) && !hci_is_adv_monitoring(hdev) && !hci_dev_test_flag(hdev, HCI_PA_SYNC)) { @@ -4355,6 +4676,7 @@ int hci_dev_open_sync(struct hci_dev *hdev) hci_dev_test_flag(hdev, HCI_MGMT) && hdev->dev_type == HCI_PRIMARY) { ret = hci_powered_update_sync(hdev); + mgmt_power_on(hdev, ret); } } else { /* Init failed, cleanup */ @@ -4406,6 +4728,31 @@ static void hci_pend_le_actions_clear(struct hci_dev *hdev) BT_DBG("All LE pending actions cleared"); } +static int hci_dev_shutdown(struct hci_dev *hdev) +{ + int err = 0; + /* Similar to how we first do setup and then set the exclusive access + * bit for userspace, we must first unset userchannel and then clean up. + * Otherwise, the kernel can't properly use the hci channel to clean up + * the controller (some shutdown routines require sending additional + * commands to the controller for example). + */ + bool was_userchannel = + hci_dev_test_and_clear_flag(hdev, HCI_USER_CHANNEL); + + if (!hci_dev_test_flag(hdev, HCI_UNREGISTER) && + test_bit(HCI_UP, &hdev->flags)) { + /* Execute vendor specific shutdown routine */ + if (hdev->shutdown) + err = hdev->shutdown(hdev); + } + + if (was_userchannel) + hci_dev_set_flag(hdev, HCI_USER_CHANNEL); + + return err; +} + int hci_dev_close_sync(struct hci_dev *hdev) { bool auto_off; @@ -4415,17 +4762,18 @@ int hci_dev_close_sync(struct hci_dev *hdev) cancel_delayed_work(&hdev->power_off); cancel_delayed_work(&hdev->ncmd_timer); + cancel_delayed_work(&hdev->le_scan_disable); + cancel_delayed_work(&hdev->le_scan_restart); hci_request_cancel_all(hdev); - if (!hci_dev_test_flag(hdev, HCI_UNREGISTER) && - !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && - test_bit(HCI_UP, &hdev->flags)) { - /* Execute vendor specific shutdown routine */ - if (hdev->shutdown) - err = hdev->shutdown(hdev); + if (hdev->adv_instance_timeout) { + cancel_delayed_work_sync(&hdev->adv_instance_expire); + hdev->adv_instance_timeout = 0; } + err = hci_dev_shutdown(hdev); + if (!test_and_clear_bit(HCI_UP, &hdev->flags)) { cancel_delayed_work_sync(&hdev->cmd_timer); return err; @@ -5023,7 +5371,7 @@ static int hci_active_scan_sync(struct hci_dev *hdev, uint16_t interval) /* Pause advertising since active scanning disables address resolution * which advertising depend on in order to generate its RPAs. */ - if (use_ll_privacy(hdev)) { + if (use_ll_privacy(hdev) && hci_dev_test_flag(hdev, HCI_PRIVACY)) { err = hci_pause_advertising_sync(hdev); if (err) { bt_dev_err(hdev, "pause advertising failed: %d", err); @@ -5737,3 +6085,96 @@ int hci_le_pa_terminate_sync(struct hci_dev *hdev, u16 handle) return __hci_cmd_sync_status(hdev, HCI_OP_LE_PA_TERM_SYNC, sizeof(cp), &cp, HCI_CMD_TIMEOUT); } + +int hci_get_random_address(struct hci_dev *hdev, bool require_privacy, + bool use_rpa, struct adv_info *adv_instance, + u8 *own_addr_type, bdaddr_t *rand_addr) +{ + int err; + + bacpy(rand_addr, BDADDR_ANY); + + /* If privacy is enabled use a resolvable private address. If + * current RPA has expired then generate a new one. + */ + if (use_rpa) { + /* If Controller supports LL Privacy use own address type is + * 0x03 + */ + if (use_ll_privacy(hdev)) + *own_addr_type = ADDR_LE_DEV_RANDOM_RESOLVED; + else + *own_addr_type = ADDR_LE_DEV_RANDOM; + + if (adv_instance) { + if (adv_rpa_valid(adv_instance)) + return 0; + } else { + if (rpa_valid(hdev)) + return 0; + } + + err = smp_generate_rpa(hdev, hdev->irk, &hdev->rpa); + if (err < 0) { + bt_dev_err(hdev, "failed to generate new RPA"); + return err; + } + + bacpy(rand_addr, &hdev->rpa); + + return 0; + } + + /* In case of required privacy without resolvable private address, + * use an non-resolvable private address. This is useful for + * non-connectable advertising. + */ + if (require_privacy) { + bdaddr_t nrpa; + + while (true) { + /* The non-resolvable private address is generated + * from random six bytes with the two most significant + * bits cleared. + */ + get_random_bytes(&nrpa, 6); + nrpa.b[5] &= 0x3f; + + /* The non-resolvable private address shall not be + * equal to the public address. + */ + if (bacmp(&hdev->bdaddr, &nrpa)) + break; + } + + *own_addr_type = ADDR_LE_DEV_RANDOM; + bacpy(rand_addr, &nrpa); + + return 0; + } + + /* No privacy so use a public address. */ + *own_addr_type = ADDR_LE_DEV_PUBLIC; + + return 0; +} + +static int _update_adv_data_sync(struct hci_dev *hdev, void *data) +{ + u8 instance = *(u8 *)data; + + kfree(data); + + return hci_update_adv_data_sync(hdev, instance); +} + +int hci_update_adv_data(struct hci_dev *hdev, u8 instance) +{ + u8 *inst_ptr = kmalloc(1, GFP_KERNEL); + + if (!inst_ptr) + return -ENOMEM; + + *inst_ptr = instance; + return hci_cmd_sync_queue(hdev, _update_adv_data_sync, inst_ptr, NULL); +} diff --git a/net/bluetooth/hci_sysfs.c b/net/bluetooth/hci_sysfs.c index 4e3e0451b08c..08542dfc2dc5 100644 --- a/net/bluetooth/hci_sysfs.c +++ b/net/bluetooth/hci_sysfs.c @@ -48,6 +48,9 @@ void hci_conn_add_sysfs(struct hci_conn *conn) BT_DBG("conn %p", conn); + if (device_is_registered(&conn->dev)) + return; + dev_set_name(&conn->dev, "%s:%d", hdev->name, conn->handle); if (device_add(&conn->dev) < 0) { diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 2c9de67daadc..1f34b82ca0ec 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -61,6 +61,9 @@ static void l2cap_send_disconn_req(struct l2cap_chan *chan, int err); static void l2cap_tx(struct l2cap_chan *chan, struct l2cap_ctrl *control, struct sk_buff_head *skbs, u8 event); +static void l2cap_retrans_timeout(struct work_struct *work); +static void l2cap_monitor_timeout(struct work_struct *work); +static void l2cap_ack_timeout(struct work_struct *work); static inline u8 bdaddr_type(u8 link_type, u8 bdaddr_type) { @@ -476,6 +479,9 @@ struct l2cap_chan *l2cap_chan_create(void) write_unlock(&chan_list_lock); INIT_DELAYED_WORK(&chan->chan_timer, l2cap_chan_timeout); + INIT_DELAYED_WORK(&chan->retrans_timer, l2cap_retrans_timeout); + INIT_DELAYED_WORK(&chan->monitor_timer, l2cap_monitor_timeout); + INIT_DELAYED_WORK(&chan->ack_timer, l2cap_ack_timeout); chan->state = BT_OPEN; @@ -3320,10 +3326,6 @@ int l2cap_ertm_init(struct l2cap_chan *chan) chan->rx_state = L2CAP_RX_STATE_RECV; chan->tx_state = L2CAP_TX_STATE_XMIT; - INIT_DELAYED_WORK(&chan->retrans_timer, l2cap_retrans_timeout); - INIT_DELAYED_WORK(&chan->monitor_timer, l2cap_monitor_timeout); - INIT_DELAYED_WORK(&chan->ack_timer, l2cap_ack_timeout); - skb_queue_head_init(&chan->srej_q); err = l2cap_seq_list_init(&chan->srej_list, chan->tx_win); @@ -4307,6 +4309,12 @@ static int l2cap_connect_create_rsp(struct l2cap_conn *conn, } } + chan = l2cap_chan_hold_unless_zero(chan); + if (!chan) { + err = -EBADSLT; + goto unlock; + } + err = 0; l2cap_chan_lock(chan); @@ -4336,6 +4344,7 @@ static int l2cap_connect_create_rsp(struct l2cap_conn *conn, } l2cap_chan_unlock(chan); + l2cap_chan_put(chan); unlock: mutex_unlock(&conn->chan_lock); diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index 72e6595a71cc..a92e7e485feb 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -129,6 +129,10 @@ static const u16 mgmt_commands[] = { MGMT_OP_ADD_EXT_ADV_PARAMS, MGMT_OP_ADD_EXT_ADV_DATA, MGMT_OP_ADD_ADV_PATTERNS_MONITOR_RSSI, + MGMT_OP_SET_MESH_RECEIVER, + MGMT_OP_MESH_READ_FEATURES, + MGMT_OP_MESH_SEND, + MGMT_OP_MESH_SEND_CANCEL, }; static const u16 mgmt_events[] = { @@ -1048,9 +1052,66 @@ static void discov_off(struct work_struct *work) hci_dev_unlock(hdev); } +static int send_settings_rsp(struct sock *sk, u16 opcode, struct hci_dev *hdev); + +static void mesh_send_complete(struct hci_dev *hdev, + struct mgmt_mesh_tx *mesh_tx, bool silent) +{ + u8 handle = mesh_tx->handle; + + if (!silent) + mgmt_event(MGMT_EV_MESH_PACKET_CMPLT, hdev, &handle, + sizeof(handle), NULL); + + mgmt_mesh_remove(mesh_tx); +} + +static int mesh_send_done_sync(struct hci_dev *hdev, void *data) +{ + struct mgmt_mesh_tx *mesh_tx; + + hci_dev_clear_flag(hdev, HCI_MESH_SENDING); + hci_disable_advertising_sync(hdev); + mesh_tx = mgmt_mesh_next(hdev, NULL); + + if (mesh_tx) + mesh_send_complete(hdev, mesh_tx, false); + + return 0; +} + +static int mesh_send_sync(struct hci_dev *hdev, void *data); +static void mesh_send_start_complete(struct hci_dev *hdev, void *data, int err); +static void mesh_next(struct hci_dev *hdev, void *data, int err) +{ + struct mgmt_mesh_tx *mesh_tx = mgmt_mesh_next(hdev, NULL); + + if (!mesh_tx) + return; + + err = hci_cmd_sync_queue(hdev, mesh_send_sync, mesh_tx, + mesh_send_start_complete); + + if (err < 0) + mesh_send_complete(hdev, mesh_tx, false); + else + hci_dev_set_flag(hdev, HCI_MESH_SENDING); +} + +static void mesh_send_done(struct work_struct *work) +{ + struct hci_dev *hdev = container_of(work, struct hci_dev, + mesh_send_done.work); + + if (!hci_dev_test_flag(hdev, HCI_MESH_SENDING)) + return; + + hci_cmd_sync_queue(hdev, mesh_send_done_sync, NULL, mesh_next); +} + static void mgmt_init_hdev(struct sock *sk, struct hci_dev *hdev) { - if (hci_dev_test_and_set_flag(hdev, HCI_MGMT)) + if (hci_dev_test_flag(hdev, HCI_MGMT)) return; BT_INFO("MGMT ver %d.%d", MGMT_VERSION, MGMT_REVISION); @@ -1058,6 +1119,7 @@ static void mgmt_init_hdev(struct sock *sk, struct hci_dev *hdev) INIT_DELAYED_WORK(&hdev->discov_off, discov_off); INIT_DELAYED_WORK(&hdev->service_cache, service_cache_off); INIT_DELAYED_WORK(&hdev->rpa_expired, rpa_expired); + INIT_DELAYED_WORK(&hdev->mesh_send_done, mesh_send_done); /* Non-mgmt controlled devices get this bit set * implicitly so that pairing works for them, however @@ -1065,6 +1127,8 @@ static void mgmt_init_hdev(struct sock *sk, struct hci_dev *hdev) * it */ hci_dev_clear_flag(hdev, HCI_BONDABLE); + + hci_dev_set_flag(hdev, HCI_MGMT); } static int read_controller_info(struct sock *sk, struct hci_dev *hdev, @@ -2058,6 +2122,8 @@ static int set_le_sync(struct hci_dev *hdev, void *data) int err; if (!val) { + hci_clear_adv_instance_sync(hdev, NULL, 0x00, true); + if (hci_dev_test_flag(hdev, HCI_LE_ADV)) hci_disable_advertising_sync(hdev); @@ -2092,6 +2158,317 @@ static int set_le_sync(struct hci_dev *hdev, void *data) return err; } +static void set_mesh_complete(struct hci_dev *hdev, void *data, int err) +{ + struct mgmt_pending_cmd *cmd = data; + u8 status = mgmt_status(err); + struct sock *sk = cmd->sk; + + if (status) { + mgmt_pending_foreach(MGMT_OP_SET_MESH_RECEIVER, hdev, + cmd_status_rsp, &status); + return; + } + + mgmt_pending_remove(cmd); + mgmt_cmd_complete(sk, hdev->id, MGMT_OP_SET_MESH_RECEIVER, 0, NULL, 0); +} + +static int set_mesh_sync(struct hci_dev *hdev, void *data) +{ + struct mgmt_pending_cmd *cmd = data; + struct mgmt_cp_set_mesh *cp = cmd->param; + size_t len = cmd->param_len; + + memset(hdev->mesh_ad_types, 0, sizeof(hdev->mesh_ad_types)); + + if (cp->enable) + hci_dev_set_flag(hdev, HCI_MESH); + else + hci_dev_clear_flag(hdev, HCI_MESH); + + len -= sizeof(*cp); + + /* If filters don't fit, forward all adv pkts */ + if (len <= sizeof(hdev->mesh_ad_types)) + memcpy(hdev->mesh_ad_types, cp->ad_types, len); + + hci_update_passive_scan_sync(hdev); + return 0; +} + +static int set_mesh(struct sock *sk, struct hci_dev *hdev, void *data, u16 len) +{ + struct mgmt_cp_set_mesh *cp = data; + struct mgmt_pending_cmd *cmd; + int err = 0; + + bt_dev_dbg(hdev, "sock %p", sk); + + if (!lmp_le_capable(hdev) || + !hci_dev_test_flag(hdev, HCI_MESH_EXPERIMENTAL)) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_MESH_RECEIVER, + MGMT_STATUS_NOT_SUPPORTED); + + if (cp->enable != 0x00 && cp->enable != 0x01) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_MESH_RECEIVER, + MGMT_STATUS_INVALID_PARAMS); + + hci_dev_lock(hdev); + + cmd = mgmt_pending_add(sk, MGMT_OP_SET_MESH_RECEIVER, hdev, data, len); + if (!cmd) + err = -ENOMEM; + else + err = hci_cmd_sync_queue(hdev, set_mesh_sync, cmd, + set_mesh_complete); + + if (err < 0) { + err = mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_MESH_RECEIVER, + MGMT_STATUS_FAILED); + + if (cmd) + mgmt_pending_remove(cmd); + } + + hci_dev_unlock(hdev); + return err; +} + +static void mesh_send_start_complete(struct hci_dev *hdev, void *data, int err) +{ + struct mgmt_mesh_tx *mesh_tx = data; + struct mgmt_cp_mesh_send *send = (void *)mesh_tx->param; + unsigned long mesh_send_interval; + u8 mgmt_err = mgmt_status(err); + + /* Report any errors here, but don't report completion */ + + if (mgmt_err) { + hci_dev_clear_flag(hdev, HCI_MESH_SENDING); + /* Send Complete Error Code for handle */ + mesh_send_complete(hdev, mesh_tx, false); + return; + } + + mesh_send_interval = msecs_to_jiffies((send->cnt) * 25); + queue_delayed_work(hdev->req_workqueue, &hdev->mesh_send_done, + mesh_send_interval); +} + +static int mesh_send_sync(struct hci_dev *hdev, void *data) +{ + struct mgmt_mesh_tx *mesh_tx = data; + struct mgmt_cp_mesh_send *send = (void *)mesh_tx->param; + struct adv_info *adv, *next_instance; + u8 instance = hdev->le_num_of_adv_sets + 1; + u16 timeout, duration; + int err = 0; + + if (hdev->le_num_of_adv_sets <= hdev->adv_instance_cnt) + return MGMT_STATUS_BUSY; + + timeout = 1000; + duration = send->cnt * INTERVAL_TO_MS(hdev->le_adv_max_interval); + adv = hci_add_adv_instance(hdev, instance, 0, + send->adv_data_len, send->adv_data, + 0, NULL, + timeout, duration, + HCI_ADV_TX_POWER_NO_PREFERENCE, + hdev->le_adv_min_interval, + hdev->le_adv_max_interval, + mesh_tx->handle); + + if (!IS_ERR(adv)) + mesh_tx->instance = instance; + else + err = PTR_ERR(adv); + + if (hdev->cur_adv_instance == instance) { + /* If the currently advertised instance is being changed then + * cancel the current advertising and schedule the next + * instance. If there is only one instance then the overridden + * advertising data will be visible right away. + */ + cancel_adv_timeout(hdev); + + next_instance = hci_get_next_instance(hdev, instance); + if (next_instance) + instance = next_instance->instance; + else + instance = 0; + } else if (hdev->adv_instance_timeout) { + /* Immediately advertise the new instance if no other, or + * let it go naturally from queue if ADV is already happening + */ + instance = 0; + } + + if (instance) + return hci_schedule_adv_instance_sync(hdev, instance, true); + + return err; +} + +static void send_count(struct mgmt_mesh_tx *mesh_tx, void *data) +{ + struct mgmt_rp_mesh_read_features *rp = data; + + if (rp->used_handles >= rp->max_handles) + return; + + rp->handles[rp->used_handles++] = mesh_tx->handle; +} + +static int mesh_features(struct sock *sk, struct hci_dev *hdev, + void *data, u16 len) +{ + struct mgmt_rp_mesh_read_features rp; + + if (!lmp_le_capable(hdev) || + !hci_dev_test_flag(hdev, HCI_MESH_EXPERIMENTAL)) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_READ_FEATURES, + MGMT_STATUS_NOT_SUPPORTED); + + memset(&rp, 0, sizeof(rp)); + rp.index = cpu_to_le16(hdev->id); + if (hci_dev_test_flag(hdev, HCI_LE_ENABLED)) + rp.max_handles = MESH_HANDLES_MAX; + + hci_dev_lock(hdev); + + if (rp.max_handles) + mgmt_mesh_foreach(hdev, send_count, &rp, sk); + + mgmt_cmd_complete(sk, hdev->id, MGMT_OP_MESH_READ_FEATURES, 0, &rp, + rp.used_handles + sizeof(rp) - MESH_HANDLES_MAX); + + hci_dev_unlock(hdev); + return 0; +} + +static int send_cancel(struct hci_dev *hdev, void *data) +{ + struct mgmt_pending_cmd *cmd = data; + struct mgmt_cp_mesh_send_cancel *cancel = (void *)cmd->param; + struct mgmt_mesh_tx *mesh_tx; + + if (!cancel->handle) { + do { + mesh_tx = mgmt_mesh_next(hdev, cmd->sk); + + if (mesh_tx) + mesh_send_complete(hdev, mesh_tx, false); + } while (mesh_tx); + } else { + mesh_tx = mgmt_mesh_find(hdev, cancel->handle); + + if (mesh_tx && mesh_tx->sk == cmd->sk) + mesh_send_complete(hdev, mesh_tx, false); + } + + mgmt_cmd_complete(cmd->sk, hdev->id, MGMT_OP_MESH_SEND_CANCEL, + 0, NULL, 0); + mgmt_pending_free(cmd); + + return 0; +} + +static int mesh_send_cancel(struct sock *sk, struct hci_dev *hdev, + void *data, u16 len) +{ + struct mgmt_pending_cmd *cmd; + int err; + + if (!lmp_le_capable(hdev) || + !hci_dev_test_flag(hdev, HCI_MESH_EXPERIMENTAL)) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_SEND_CANCEL, + MGMT_STATUS_NOT_SUPPORTED); + + if (!hci_dev_test_flag(hdev, HCI_LE_ENABLED)) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_SEND_CANCEL, + MGMT_STATUS_REJECTED); + + hci_dev_lock(hdev); + cmd = mgmt_pending_new(sk, MGMT_OP_MESH_SEND_CANCEL, hdev, data, len); + if (!cmd) + err = -ENOMEM; + else + err = hci_cmd_sync_queue(hdev, send_cancel, cmd, NULL); + + if (err < 0) { + err = mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_SEND_CANCEL, + MGMT_STATUS_FAILED); + + if (cmd) + mgmt_pending_free(cmd); + } + + hci_dev_unlock(hdev); + return err; +} + +static int mesh_send(struct sock *sk, struct hci_dev *hdev, void *data, u16 len) +{ + struct mgmt_mesh_tx *mesh_tx; + struct mgmt_cp_mesh_send *send = data; + struct mgmt_rp_mesh_read_features rp; + bool sending; + int err = 0; + + if (!lmp_le_capable(hdev) || + !hci_dev_test_flag(hdev, HCI_MESH_EXPERIMENTAL)) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_SEND, + MGMT_STATUS_NOT_SUPPORTED); + if (!hci_dev_test_flag(hdev, HCI_LE_ENABLED) || + len <= MGMT_MESH_SEND_SIZE || + len > (MGMT_MESH_SEND_SIZE + 31)) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_SEND, + MGMT_STATUS_REJECTED); + + hci_dev_lock(hdev); + + memset(&rp, 0, sizeof(rp)); + rp.max_handles = MESH_HANDLES_MAX; + + mgmt_mesh_foreach(hdev, send_count, &rp, sk); + + if (rp.max_handles <= rp.used_handles) { + err = mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_SEND, + MGMT_STATUS_BUSY); + goto done; + } + + sending = hci_dev_test_flag(hdev, HCI_MESH_SENDING); + mesh_tx = mgmt_mesh_add(sk, hdev, send, len); + + if (!mesh_tx) + err = -ENOMEM; + else if (!sending) + err = hci_cmd_sync_queue(hdev, mesh_send_sync, mesh_tx, + mesh_send_start_complete); + + if (err < 0) { + bt_dev_err(hdev, "Send Mesh Failed %d", err); + err = mgmt_cmd_status(sk, hdev->id, MGMT_OP_MESH_SEND, + MGMT_STATUS_FAILED); + + if (mesh_tx) { + if (sending) + mgmt_mesh_remove(mesh_tx); + } + } else { + hci_dev_set_flag(hdev, HCI_MESH_SENDING); + + mgmt_cmd_complete(sk, hdev->id, MGMT_OP_MESH_SEND, 0, + &mesh_tx->handle, 1); + } + +done: + hci_dev_unlock(hdev); + return err; +} + static int set_le(struct sock *sk, struct hci_dev *hdev, void *data, u16 len) { struct mgmt_mode *cp = data; @@ -2131,9 +2508,6 @@ static int set_le(struct sock *sk, struct hci_dev *hdev, void *data, u16 len) val = !!cp->val; enabled = lmp_host_le_capable(hdev); - if (!val) - hci_req_clear_adv_instance(hdev, NULL, NULL, 0x00, true); - if (!hdev_is_powered(hdev) || val == enabled) { bool changed = false; @@ -3186,6 +3560,18 @@ unlock: return err; } +static int abort_conn_sync(struct hci_dev *hdev, void *data) +{ + struct hci_conn *conn; + u16 handle = PTR_ERR(data); + + conn = hci_conn_hash_lookup_handle(hdev, handle); + if (!conn) + return 0; + + return hci_abort_conn_sync(hdev, conn, HCI_ERROR_REMOTE_USER_TERM); +} + static int cancel_pair_device(struct sock *sk, struct hci_dev *hdev, void *data, u16 len) { @@ -3236,7 +3622,8 @@ static int cancel_pair_device(struct sock *sk, struct hci_dev *hdev, void *data, le_addr_type(addr->type)); if (conn->conn_reason == CONN_REASON_PAIR_DEVICE) - hci_abort_conn(conn, HCI_ERROR_REMOTE_USER_TERM); + hci_cmd_sync_queue(hdev, abort_conn_sync, ERR_PTR(conn->handle), + NULL); unlock: hci_dev_unlock(hdev); @@ -3991,17 +4378,28 @@ static const u8 iso_socket_uuid[16] = { 0x6a, 0x49, 0xe0, 0x05, 0x88, 0xf1, 0xba, 0x6f, }; +/* 2ce463d7-7a03-4d8d-bf05-5f24e8f36e76 */ +static const u8 mgmt_mesh_uuid[16] = { + 0x76, 0x6e, 0xf3, 0xe8, 0x24, 0x5f, 0x05, 0xbf, + 0x8d, 0x4d, 0x03, 0x7a, 0xd7, 0x63, 0xe4, 0x2c, +}; + static int read_exp_features_info(struct sock *sk, struct hci_dev *hdev, void *data, u16 data_len) { - char buf[122]; /* Enough space for 6 features: 2 + 20 * 6 */ - struct mgmt_rp_read_exp_features_info *rp = (void *)buf; + struct mgmt_rp_read_exp_features_info *rp; + size_t len; u16 idx = 0; u32 flags; + int status; bt_dev_dbg(hdev, "sock %p", sk); - memset(&buf, 0, sizeof(buf)); + /* Enough space for 7 features */ + len = sizeof(*rp) + (sizeof(rp->features[0]) * 7); + rp = kzalloc(len, GFP_KERNEL); + if (!rp) + return -ENOMEM; #ifdef CONFIG_BT_FEATURE_DEBUG if (!hdev) { @@ -4065,6 +4463,17 @@ static int read_exp_features_info(struct sock *sk, struct hci_dev *hdev, idx++; } + if (hdev && lmp_le_capable(hdev)) { + if (hci_dev_test_flag(hdev, HCI_MESH_EXPERIMENTAL)) + flags = BIT(0); + else + flags = 0; + + memcpy(rp->features[idx].uuid, mgmt_mesh_uuid, 16); + rp->features[idx].flags = cpu_to_le32(flags); + idx++; + } + rp->feature_count = cpu_to_le16(idx); /* After reading the experimental features information, enable @@ -4072,9 +4481,12 @@ static int read_exp_features_info(struct sock *sk, struct hci_dev *hdev, */ hci_sock_set_flag(sk, HCI_MGMT_EXP_FEATURE_EVENTS); - return mgmt_cmd_complete(sk, hdev ? hdev->id : MGMT_INDEX_NONE, - MGMT_OP_READ_EXP_FEATURES_INFO, - 0, rp, sizeof(*rp) + (20 * idx)); + status = mgmt_cmd_complete(sk, hdev ? hdev->id : MGMT_INDEX_NONE, + MGMT_OP_READ_EXP_FEATURES_INFO, + 0, rp, sizeof(*rp) + (20 * idx)); + + kfree(rp); + return status; } static int exp_ll_privacy_feature_changed(bool enabled, struct hci_dev *hdev, @@ -4202,6 +4614,63 @@ static int set_debug_func(struct sock *sk, struct hci_dev *hdev, } #endif +static int set_mgmt_mesh_func(struct sock *sk, struct hci_dev *hdev, + struct mgmt_cp_set_exp_feature *cp, u16 data_len) +{ + struct mgmt_rp_set_exp_feature rp; + bool val, changed; + int err; + + /* Command requires to use the controller index */ + if (!hdev) + return mgmt_cmd_status(sk, MGMT_INDEX_NONE, + MGMT_OP_SET_EXP_FEATURE, + MGMT_STATUS_INVALID_INDEX); + + /* Changes can only be made when controller is powered down */ + if (hdev_is_powered(hdev)) + return mgmt_cmd_status(sk, hdev->id, + MGMT_OP_SET_EXP_FEATURE, + MGMT_STATUS_REJECTED); + + /* Parameters are limited to a single octet */ + if (data_len != MGMT_SET_EXP_FEATURE_SIZE + 1) + return mgmt_cmd_status(sk, hdev->id, + MGMT_OP_SET_EXP_FEATURE, + MGMT_STATUS_INVALID_PARAMS); + + /* Only boolean on/off is supported */ + if (cp->param[0] != 0x00 && cp->param[0] != 0x01) + return mgmt_cmd_status(sk, hdev->id, + MGMT_OP_SET_EXP_FEATURE, + MGMT_STATUS_INVALID_PARAMS); + + val = !!cp->param[0]; + + if (val) { + changed = !hci_dev_test_and_set_flag(hdev, + HCI_MESH_EXPERIMENTAL); + } else { + hci_dev_clear_flag(hdev, HCI_MESH); + changed = hci_dev_test_and_clear_flag(hdev, + HCI_MESH_EXPERIMENTAL); + } + + memcpy(rp.uuid, mgmt_mesh_uuid, 16); + rp.flags = cpu_to_le32(val ? BIT(0) : 0); + + hci_sock_set_flag(sk, HCI_MGMT_EXP_FEATURE_EVENTS); + + err = mgmt_cmd_complete(sk, hdev->id, + MGMT_OP_SET_EXP_FEATURE, 0, + &rp, sizeof(rp)); + + if (changed) + exp_feature_changed(hdev, mgmt_mesh_uuid, val, sk); + + return err; +} + static int set_rpa_resolution_func(struct sock *sk, struct hci_dev *hdev, struct mgmt_cp_set_exp_feature *cp, u16 data_len) @@ -4517,6 +4986,7 @@ static const struct mgmt_exp_feature { #ifdef CONFIG_BT_FEATURE_DEBUG EXP_FEAT(debug_uuid, set_debug_func), #endif + EXP_FEAT(mgmt_mesh_uuid, set_mgmt_mesh_func), EXP_FEAT(rpa_resolution_uuid, set_rpa_resolution_func), EXP_FEAT(quality_report_uuid, set_quality_report_func), EXP_FEAT(offload_codecs_uuid, set_offload_codec_func), @@ -5981,6 +6451,7 @@ static int set_advertising(struct sock *sk, struct hci_dev *hdev, void *data, if (!hdev_is_powered(hdev) || (val == hci_dev_test_flag(hdev, HCI_ADVERTISING) && (cp->val == 0x02) == hci_dev_test_flag(hdev, HCI_ADVERTISING_CONNECTABLE)) || + hci_dev_test_flag(hdev, HCI_MESH) || hci_conn_num(hdev, LE_LINK) > 0 || (hci_dev_test_flag(hdev, HCI_LE_SCAN) && hdev->le_scan_type == LE_SCAN_ACTIVE)) { @@ -7909,8 +8380,7 @@ static u32 get_supported_adv_flags(struct hci_dev *hdev) /* In extended adv TX_POWER returned from Set Adv Param * will be always valid. */ - if ((hdev->adv_tx_power != HCI_TX_POWER_INVALID) || - ext_adv_capable(hdev)) + if (hdev->adv_tx_power != HCI_TX_POWER_INVALID || ext_adv_capable(hdev)) flags |= MGMT_ADV_FLAG_TX_POWER; if (ext_adv_capable(hdev)) { @@ -7963,8 +8433,14 @@ static int read_adv_features(struct sock *sk, struct hci_dev *hdev, instance = rp->instance; list_for_each_entry(adv_instance, &hdev->adv_instances, list) { - *instance = adv_instance->instance; - instance++; + /* Only instances 1-le_num_of_adv_sets are externally visible */ + if (adv_instance->instance <= hdev->adv_instance_cnt) { + *instance = adv_instance->instance; + instance++; + } else { + rp->num_instances--; + rp_len--; + } } hci_dev_unlock(hdev); @@ -8226,7 +8702,7 @@ static int add_advertising(struct sock *sk, struct hci_dev *hdev, timeout, duration, HCI_ADV_TX_POWER_NO_PREFERENCE, hdev->le_adv_min_interval, - hdev->le_adv_max_interval); + hdev->le_adv_max_interval, 0); if (IS_ERR(adv)) { err = mgmt_cmd_status(sk, hdev->id, MGMT_OP_ADD_ADVERTISING, MGMT_STATUS_FAILED); @@ -8430,7 +8906,7 @@ static int add_ext_adv_params(struct sock *sk, struct hci_dev *hdev, /* Create advertising instance with no advertising or response data */ adv = hci_add_adv_instance(hdev, cp->instance, flags, 0, NULL, 0, NULL, timeout, duration, tx_power, min_interval, - max_interval); + max_interval, 0); if (IS_ERR(adv)) { err = mgmt_cmd_status(sk, hdev->id, MGMT_OP_ADD_EXT_ADV_PARAMS, @@ -8876,8 +9352,13 @@ static const struct hci_mgmt_handler mgmt_handlers[] = { { add_ext_adv_data, MGMT_ADD_EXT_ADV_DATA_SIZE, HCI_MGMT_VAR_LEN }, { add_adv_patterns_monitor_rssi, - MGMT_ADD_ADV_PATTERNS_MONITOR_RSSI_SIZE, + MGMT_ADD_ADV_PATTERNS_MONITOR_RSSI_SIZE }, + { set_mesh, MGMT_SET_MESH_RECEIVER_SIZE, + HCI_MGMT_VAR_LEN }, + { mesh_features, MGMT_MESH_READ_FEATURES_SIZE }, + { mesh_send, MGMT_MESH_SEND_SIZE, HCI_MGMT_VAR_LEN }, + { mesh_send_cancel, MGMT_MESH_SEND_CANCEL_SIZE }, }; void mgmt_index_added(struct hci_dev *hdev) @@ -9817,14 +10298,86 @@ static void mgmt_adv_monitor_device_found(struct hci_dev *hdev, kfree_skb(skb); } +static void mesh_device_found(struct hci_dev *hdev, bdaddr_t *bdaddr, + u8 addr_type, s8 rssi, u32 flags, u8 *eir, + u16 eir_len, u8 *scan_rsp, u8 scan_rsp_len, + u64 instant) +{ + struct sk_buff *skb; + struct mgmt_ev_mesh_device_found *ev; + int i, j; + + if (!hdev->mesh_ad_types[0]) + goto accepted; + + /* Scan for requested AD types */ + if (eir_len > 0) { + for (i = 0; i + 1 < eir_len; i += eir[i] + 1) { + for (j = 0; j < sizeof(hdev->mesh_ad_types); j++) { + if (!hdev->mesh_ad_types[j]) + break; + + if (hdev->mesh_ad_types[j] == eir[i + 1]) + goto accepted; + } + } + } + + if (scan_rsp_len > 0) { + for (i = 0; i + 1 < scan_rsp_len; i += scan_rsp[i] + 1) { + for (j = 0; j < sizeof(hdev->mesh_ad_types); j++) { + if (!hdev->mesh_ad_types[j]) + break; + + if (hdev->mesh_ad_types[j] == scan_rsp[i + 1]) + goto accepted; + } + } + } + + return; + +accepted: + skb = mgmt_alloc_skb(hdev, MGMT_EV_MESH_DEVICE_FOUND, + sizeof(*ev) + eir_len + scan_rsp_len); + if (!skb) + return; + + ev = skb_put(skb, sizeof(*ev)); + + bacpy(&ev->addr.bdaddr, bdaddr); + ev->addr.type = link_to_bdaddr(LE_LINK, addr_type); + ev->rssi = rssi; + ev->flags = cpu_to_le32(flags); + ev->instant = cpu_to_le64(instant); + + if (eir_len > 0) + /* Copy EIR or advertising data into event */ + skb_put_data(skb, eir, eir_len); + + if (scan_rsp_len > 0) + /* Append scan response data to event */ + skb_put_data(skb, scan_rsp, scan_rsp_len); + + ev->eir_len = cpu_to_le16(eir_len + scan_rsp_len); + + mgmt_event_skb(skb, NULL); +} + void mgmt_device_found(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 link_type, u8 addr_type, u8 *dev_class, s8 rssi, u32 flags, - u8 *eir, u16 eir_len, u8 *scan_rsp, u8 scan_rsp_len) + u8 *eir, u16 eir_len, u8 *scan_rsp, u8 scan_rsp_len, + u64 instant) { struct sk_buff *skb; struct mgmt_ev_device_found *ev; bool report_device = hci_discovery_active(hdev); + if (hci_dev_test_flag(hdev, HCI_MESH) && link_type == LE_LINK) + mesh_device_found(hdev, bdaddr, addr_type, rssi, flags, + eir, eir_len, scan_rsp, scan_rsp_len, + instant); + /* Don't send events for a non-kernel initiated discovery. With * LE one exception is if we have pend_le_reports > 0 in which * case we're doing passive scanning and want these events. @@ -9983,3 +10536,22 @@ void mgmt_exit(void) { hci_mgmt_chan_unregister(&chan); } + +void mgmt_cleanup(struct sock *sk) +{ + struct mgmt_mesh_tx *mesh_tx; + struct hci_dev *hdev; + + read_lock(&hci_dev_list_lock); + + list_for_each_entry(hdev, &hci_dev_list, list) { + do { + mesh_tx = mgmt_mesh_next(hdev, sk); + + if (mesh_tx) + mesh_send_complete(hdev, mesh_tx, true); + } while (mesh_tx); + } + + read_unlock(&hci_dev_list_lock); +} diff --git a/net/bluetooth/mgmt_util.c b/net/bluetooth/mgmt_util.c index b69cfed62088..0115f783bde8 100644 --- a/net/bluetooth/mgmt_util.c +++ b/net/bluetooth/mgmt_util.c @@ -314,3 +314,77 @@ void mgmt_pending_remove(struct mgmt_pending_cmd *cmd) list_del(&cmd->list); mgmt_pending_free(cmd); } + +void mgmt_mesh_foreach(struct hci_dev *hdev, + void (*cb)(struct mgmt_mesh_tx *mesh_tx, void *data), + void *data, struct sock *sk) +{ + struct mgmt_mesh_tx *mesh_tx, *tmp; + + list_for_each_entry_safe(mesh_tx, tmp, &hdev->mgmt_pending, list) { + if (!sk || mesh_tx->sk == sk) + cb(mesh_tx, data); + } +} + +struct mgmt_mesh_tx *mgmt_mesh_next(struct hci_dev *hdev, struct sock *sk) +{ + struct mgmt_mesh_tx *mesh_tx; + + if (list_empty(&hdev->mesh_pending)) + return NULL; + + list_for_each_entry(mesh_tx, &hdev->mesh_pending, list) { + if (!sk || mesh_tx->sk == sk) + return mesh_tx; + } + + return NULL; +} + +struct mgmt_mesh_tx *mgmt_mesh_find(struct hci_dev *hdev, u8 handle) +{ + struct mgmt_mesh_tx *mesh_tx; + + if (list_empty(&hdev->mesh_pending)) + return NULL; + + list_for_each_entry(mesh_tx, &hdev->mesh_pending, list) { + if (mesh_tx->handle == handle) + return mesh_tx; + } + + return NULL; +} + +struct mgmt_mesh_tx *mgmt_mesh_add(struct sock *sk, struct hci_dev *hdev, + void *data, u16 len) +{ + struct mgmt_mesh_tx *mesh_tx; + + mesh_tx = kzalloc(sizeof(*mesh_tx), GFP_KERNEL); + if (!mesh_tx) + return NULL; + + hdev->mesh_send_ref++; + if (!hdev->mesh_send_ref) + hdev->mesh_send_ref++; + + mesh_tx->handle = hdev->mesh_send_ref; + mesh_tx->index = hdev->id; + memcpy(mesh_tx->param, data, len); + mesh_tx->param_len = len; + mesh_tx->sk = sk; + sock_hold(sk); + + list_add_tail(&mesh_tx->list, &hdev->mesh_pending); + + return mesh_tx; +} + +void mgmt_mesh_remove(struct mgmt_mesh_tx *mesh_tx) +{ + list_del(&mesh_tx->list); + sock_put(mesh_tx->sk); + kfree(mesh_tx); +} diff --git a/net/bluetooth/mgmt_util.h b/net/bluetooth/mgmt_util.h index 98e40395a383..6a8b7e84293d 100644 --- a/net/bluetooth/mgmt_util.h +++ b/net/bluetooth/mgmt_util.h @@ -20,6 +20,16 @@ SOFTWARE IS DISCLAIMED. */ +struct mgmt_mesh_tx { + struct list_head list; + int index; + size_t param_len; + struct sock *sk; + u8 handle; + u8 instance; + u8 param[sizeof(struct mgmt_cp_mesh_send) + 29]; +}; + struct mgmt_pending_cmd { struct list_head list; u16 opcode; @@ -59,3 +69,11 @@ struct mgmt_pending_cmd *mgmt_pending_new(struct sock *sk, u16 opcode, void *data, u16 len); void mgmt_pending_free(struct mgmt_pending_cmd *cmd); void mgmt_pending_remove(struct mgmt_pending_cmd *cmd); +void mgmt_mesh_foreach(struct hci_dev *hdev, + void (*cb)(struct mgmt_mesh_tx *mesh_tx, void *data), + void *data, struct sock *sk); +struct mgmt_mesh_tx *mgmt_mesh_find(struct hci_dev *hdev, u8 handle); +struct mgmt_mesh_tx *mgmt_mesh_next(struct hci_dev *hdev, struct sock *sk); +struct mgmt_mesh_tx *mgmt_mesh_add(struct sock *sk, struct hci_dev *hdev, + void *data, u16 len); +void mgmt_mesh_remove(struct mgmt_mesh_tx *mesh_tx); diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c index 4bf4ea6cbb5e..21e24da4847f 100644 --- a/net/bluetooth/rfcomm/sock.c +++ b/net/bluetooth/rfcomm/sock.c @@ -902,7 +902,10 @@ static int rfcomm_sock_shutdown(struct socket *sock, int how) lock_sock(sk); if (!sk->sk_shutdown) { sk->sk_shutdown = SHUTDOWN_MASK; + + release_sock(sk); __rfcomm_sock_close(sk); + lock_sock(sk); if (sock_flag(sk, SOCK_LINGER) && sk->sk_lingertime && !(current->flags & PF_EXITING)) diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index d11209367dd0..13d578ce2a09 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -606,6 +606,38 @@ noinline void bpf_kfunc_call_memb1_release(struct prog_test_member1 *p) WARN_ON_ONCE(1); } +static int *__bpf_kfunc_call_test_get_mem(struct prog_test_ref_kfunc *p, const int size) +{ + if (size > 2 * sizeof(int)) + return NULL; + + return (int *)p; +} + +noinline int *bpf_kfunc_call_test_get_rdwr_mem(struct prog_test_ref_kfunc *p, const int rdwr_buf_size) +{ + return __bpf_kfunc_call_test_get_mem(p, rdwr_buf_size); +} + +noinline int *bpf_kfunc_call_test_get_rdonly_mem(struct prog_test_ref_kfunc *p, const int rdonly_buf_size) +{ + return __bpf_kfunc_call_test_get_mem(p, rdonly_buf_size); +} + +/* the next 2 ones can't be really used for testing expect to ensure + * that the verifier rejects the call. + * Acquire functions must return struct pointers, so these ones are + * failing. + */ +noinline int *bpf_kfunc_call_test_acq_rdonly_mem(struct prog_test_ref_kfunc *p, const int rdonly_buf_size) +{ + return __bpf_kfunc_call_test_get_mem(p, rdonly_buf_size); +} + +noinline void bpf_kfunc_call_int_mem_release(int *p) +{ +} + noinline struct prog_test_ref_kfunc * bpf_kfunc_call_test_kptr_get(struct prog_test_ref_kfunc **pp, int a, int b) { @@ -695,6 +727,10 @@ noinline void bpf_kfunc_call_test_ref(struct prog_test_ref_kfunc *p) { } +noinline void bpf_kfunc_call_test_destructive(void) +{ +} + __diag_pop(); ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO); @@ -708,6 +744,10 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_memb_acquire, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_kfunc_call_test_release, KF_RELEASE) BTF_ID_FLAGS(func, bpf_kfunc_call_memb_release, KF_RELEASE) BTF_ID_FLAGS(func, bpf_kfunc_call_memb1_release, KF_RELEASE) +BTF_ID_FLAGS(func, bpf_kfunc_call_test_get_rdwr_mem, KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_kfunc_call_test_get_rdonly_mem, KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_kfunc_call_test_acq_rdonly_mem, KF_ACQUIRE | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_kfunc_call_int_mem_release, KF_RELEASE) BTF_ID_FLAGS(func, bpf_kfunc_call_test_kptr_get, KF_ACQUIRE | KF_RET_NULL | KF_KPTR_GET) BTF_ID_FLAGS(func, bpf_kfunc_call_test_pass_ctx) BTF_ID_FLAGS(func, bpf_kfunc_call_test_pass1) @@ -719,6 +759,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_pass1) BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail1) BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail2) BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_kfunc_call_test_destructive, KF_DESTRUCTIVE) BTF_SET8_END(test_sk_check_kfunc_ids) static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size, @@ -1629,6 +1670,7 @@ static int __init bpf_prog_test_run_init(void) ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_prog_test_kfunc_set); ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_prog_test_kfunc_set); + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &bpf_prog_test_kfunc_set); return ret ?: register_btf_id_dtor_kfuncs(bpf_prog_test_dtor_kfunc, ARRAY_SIZE(bpf_prog_test_dtor_kfunc), THIS_MODULE); diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index 58a4f70e01e3..b82906fc999a 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -251,10 +251,10 @@ static int br_set_mac_address(struct net_device *dev, void *p) static void br_getinfo(struct net_device *dev, struct ethtool_drvinfo *info) { - strlcpy(info->driver, "bridge", sizeof(info->driver)); - strlcpy(info->version, BR_VERSION, sizeof(info->version)); - strlcpy(info->fw_version, "N/A", sizeof(info->fw_version)); - strlcpy(info->bus_info, "N/A", sizeof(info->bus_info)); + strscpy(info->driver, "bridge", sizeof(info->driver)); + strscpy(info->version, BR_VERSION, sizeof(info->version)); + strscpy(info->fw_version, "N/A", sizeof(info->fw_version)); + strscpy(info->bus_info, "N/A", sizeof(info->bus_info)); } static int br_get_link_ksettings(struct net_device *dev, diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index a84a7cfb9d6d..228fd5b20f10 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -40,12 +40,21 @@ static int port_cost(struct net_device *dev) switch (ecmd.base.speed) { case SPEED_10000: return 2; - case SPEED_1000: + case SPEED_5000: + return 3; + case SPEED_2500: return 4; + case SPEED_1000: + return 5; case SPEED_100: return 19; case SPEED_10: return 100; + case SPEED_UNKNOWN: + return 100; + default: + if (ecmd.base.speed > SPEED_10000) + return 1; } } @@ -568,26 +577,6 @@ int br_add_if(struct net_bridge *br, struct net_device *dev, !is_valid_ether_addr(dev->dev_addr)) return -EINVAL; - /* Also don't allow bridging of net devices that are DSA masters, since - * the bridge layer rx_handler prevents the DSA fake ethertype handler - * to be invoked, so we don't get the chance to strip off and parse the - * DSA switch tag protocol header (the bridge layer just returns - * RX_HANDLER_CONSUMED, stopping RX processing for these frames). - * The only case where that would not be an issue is when bridging can - * already be offloaded, such as when the DSA master is itself a DSA - * or plain switchdev port, and is bridged only with other ports from - * the same hardware device. - */ - if (netdev_uses_dsa(dev)) { - list_for_each_entry(p, &br->port_list, list) { - if (!netdev_port_same_parent_id(dev, p->dev)) { - NL_SET_ERR_MSG(extack, - "Cannot do software bridging with a DSA master"); - return -EINVAL; - } - } - } - /* No bridging of bridges */ if (dev->netdev_ops->ndo_start_xmit == br_dev_xmit) { NL_SET_ERR_MSG(extack, diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c index 07fa76080512..74fdd8105dca 100644 --- a/net/bridge/br_sysfs_if.c +++ b/net/bridge/br_sysfs_if.c @@ -384,7 +384,7 @@ int br_sysfs_addif(struct net_bridge_port *p) return err; } - strlcpy(p->sysfs_name, p->dev->name, IFNAMSIZ); + strscpy(p->sysfs_name, p->dev->name, IFNAMSIZ); return sysfs_create_link(br->ifobj, &p->kobj, p->sysfs_name); } @@ -406,7 +406,7 @@ int br_sysfs_renameif(struct net_bridge_port *p) netdev_notice(br->dev, "unable to rename link %s to %s", p->sysfs_name, p->dev->name); else - strlcpy(p->sysfs_name, p->dev->name, IFNAMSIZ); + strscpy(p->sysfs_name, p->dev->name, IFNAMSIZ); return err; } diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index 4f385d52a1c4..ce5dfa3babd2 100644 --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -1442,7 +1442,7 @@ static inline int ebt_obj_to_user(char __user *um, const char *_name, /* ebtables expects 31 bytes long names but xt_match names are 29 bytes * long. Copy 29 bytes and fill remaining bytes with zeroes. */ - strlcpy(name, _name, sizeof(name)); + strscpy(name, _name, sizeof(name)); if (copy_to_user(um, name, EBT_EXTENSION_MAXNAMELEN) || put_user(revision, (u8 __user *)(um + EBT_EXTENSION_MAXNAMELEN)) || put_user(datasize, (int __user *)(um + EBT_EXTENSION_MAXNAMELEN + 1)) || diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c index 52dd0b6835bc..6a0cba4fc29f 100644 --- a/net/caif/caif_dev.c +++ b/net/caif/caif_dev.c @@ -342,7 +342,7 @@ int caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, mutex_lock(&caifdevs->lock); list_add_rcu(&caifd->list, &caifdevs->list); - strlcpy(caifd->layer.name, dev->name, + strscpy(caifd->layer.name, dev->name, sizeof(caifd->layer.name)); caifd->layer.transmit = transmit; res = cfcnfg_add_phy_layer(cfg, diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c index 4be6b04879a1..ebc202ffdd8d 100644 --- a/net/caif/caif_usb.c +++ b/net/caif/caif_usb.c @@ -184,7 +184,7 @@ static int cfusbl_device_notify(struct notifier_block *me, unsigned long what, dev_add_pack(&caif_usb_type); pack_added = true; - strlcpy(layer->name, dev->name, sizeof(layer->name)); + strscpy(layer->name, dev->name, sizeof(layer->name)); return 0; err: diff --git a/net/caif/cfcnfg.c b/net/caif/cfcnfg.c index 23267c8db7c4..52509e185960 100644 --- a/net/caif/cfcnfg.c +++ b/net/caif/cfcnfg.c @@ -268,14 +268,14 @@ static int caif_connect_req_to_link_param(struct cfcnfg *cnfg, case CAIFPROTO_RFM: l->linktype = CFCTRL_SRV_RFM; l->u.datagram.connid = s->sockaddr.u.rfm.connection_id; - strlcpy(l->u.rfm.volume, s->sockaddr.u.rfm.volume, + strscpy(l->u.rfm.volume, s->sockaddr.u.rfm.volume, sizeof(l->u.rfm.volume)); break; case CAIFPROTO_UTIL: l->linktype = CFCTRL_SRV_UTIL; l->endpoint = 0x00; l->chtype = 0x00; - strlcpy(l->u.utility.name, s->sockaddr.u.util.service, + strscpy(l->u.utility.name, s->sockaddr.u.util.service, sizeof(l->u.utility.name)); caif_assert(sizeof(l->u.utility.name) > 10); l->u.utility.paramlen = s->param.size; diff --git a/net/caif/cfctrl.c b/net/caif/cfctrl.c index 2809cbd6b7f7..cc405d8c7c30 100644 --- a/net/caif/cfctrl.c +++ b/net/caif/cfctrl.c @@ -258,7 +258,7 @@ int cfctrl_linkup_request(struct cflayer *layer, tmp16 = cpu_to_le16(param->u.utility.fifosize_bufs); cfpkt_add_body(pkt, &tmp16, 2); memset(utility_name, 0, sizeof(utility_name)); - strlcpy(utility_name, param->u.utility.name, + strscpy(utility_name, param->u.utility.name, UTILITY_NAME_LENGTH); cfpkt_add_body(pkt, utility_name, UTILITY_NAME_LENGTH); tmp8 = param->u.utility.paramlen; diff --git a/net/can/af_can.c b/net/can/af_can.c index 1fb49d51b25d..9503ab10f9b8 100644 --- a/net/can/af_can.c +++ b/net/can/af_can.c @@ -199,27 +199,26 @@ static int can_create(struct net *net, struct socket *sock, int protocol, int can_send(struct sk_buff *skb, int loop) { struct sk_buff *newskb = NULL; - struct canfd_frame *cfd = (struct canfd_frame *)skb->data; struct can_pkg_stats *pkg_stats = dev_net(skb->dev)->can.pkg_stats; int err = -EINVAL; - if (skb->len == CAN_MTU) { + if (can_is_canxl_skb(skb)) { + skb->protocol = htons(ETH_P_CANXL); + } else if (can_is_can_skb(skb)) { skb->protocol = htons(ETH_P_CAN); - if (unlikely(cfd->len > CAN_MAX_DLEN)) - goto inval_skb; - } else if (skb->len == CANFD_MTU) { + } else if (can_is_canfd_skb(skb)) { + struct canfd_frame *cfd = (struct canfd_frame *)skb->data; + skb->protocol = htons(ETH_P_CANFD); - if (unlikely(cfd->len > CANFD_MAX_DLEN)) - goto inval_skb; + + /* set CAN FD flag for CAN FD frames by default */ + cfd->flags |= CANFD_FDF; } else { goto inval_skb; } - /* Make sure the CAN frame can pass the selected CAN netdevice. - * As structs can_frame and canfd_frame are similar, we can provide - * CAN FD frames to legacy CAN drivers as long as the length is <= 8 - */ - if (unlikely(skb->len > skb->dev->mtu && cfd->len > CAN_MAX_DLEN)) { + /* Make sure the CAN frame can pass the selected CAN netdevice. */ + if (unlikely(skb->len > skb->dev->mtu)) { err = -EMSGSIZE; goto inval_skb; } @@ -678,53 +677,46 @@ static void can_receive(struct sk_buff *skb, struct net_device *dev) static int can_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev) { - struct canfd_frame *cfd = (struct canfd_frame *)skb->data; - - if (unlikely(dev->type != ARPHRD_CAN || skb->len != CAN_MTU)) { + if (unlikely(dev->type != ARPHRD_CAN || (!can_is_can_skb(skb)))) { pr_warn_once("PF_CAN: dropped non conform CAN skbuff: dev type %d, len %d\n", dev->type, skb->len); - goto free_skb; - } - /* This check is made separately since cfd->len would be uninitialized if skb->len = 0. */ - if (unlikely(cfd->len > CAN_MAX_DLEN)) { - pr_warn_once("PF_CAN: dropped non conform CAN skbuff: dev type %d, len %d, datalen %d\n", - dev->type, skb->len, cfd->len); - goto free_skb; + kfree_skb(skb); + return NET_RX_DROP; } can_receive(skb, dev); return NET_RX_SUCCESS; - -free_skb: - kfree_skb(skb); - return NET_RX_DROP; } static int canfd_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev) { - struct canfd_frame *cfd = (struct canfd_frame *)skb->data; - - if (unlikely(dev->type != ARPHRD_CAN || skb->len != CANFD_MTU)) { + if (unlikely(dev->type != ARPHRD_CAN || (!can_is_canfd_skb(skb)))) { pr_warn_once("PF_CAN: dropped non conform CAN FD skbuff: dev type %d, len %d\n", dev->type, skb->len); - goto free_skb; - } - /* This check is made separately since cfd->len would be uninitialized if skb->len = 0. */ - if (unlikely(cfd->len > CANFD_MAX_DLEN)) { - pr_warn_once("PF_CAN: dropped non conform CAN FD skbuff: dev type %d, len %d, datalen %d\n", - dev->type, skb->len, cfd->len); - goto free_skb; + kfree_skb(skb); + return NET_RX_DROP; } can_receive(skb, dev); return NET_RX_SUCCESS; +} -free_skb: - kfree_skb(skb); - return NET_RX_DROP; +static int canxl_rcv(struct sk_buff *skb, struct net_device *dev, + struct packet_type *pt, struct net_device *orig_dev) +{ + if (unlikely(dev->type != ARPHRD_CAN || (!can_is_canxl_skb(skb)))) { + pr_warn_once("PF_CAN: dropped non conform CAN XL skbuff: dev type %d, len %d\n", + dev->type, skb->len); + + kfree_skb(skb); + return NET_RX_DROP; + } + + can_receive(skb, dev); + return NET_RX_SUCCESS; } /* af_can protocol functions */ @@ -851,6 +843,11 @@ static struct packet_type canfd_packet __read_mostly = { .func = canfd_rcv, }; +static struct packet_type canxl_packet __read_mostly = { + .type = cpu_to_be16(ETH_P_CANXL), + .func = canxl_rcv, +}; + static const struct net_proto_family can_family_ops = { .family = PF_CAN, .create = can_create, @@ -890,6 +887,7 @@ static __init int can_init(void) dev_add_pack(&can_packet); dev_add_pack(&canfd_packet); + dev_add_pack(&canxl_packet); return 0; diff --git a/net/can/bcm.c b/net/can/bcm.c index e60161bec850..27706f6ace34 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -274,6 +274,7 @@ static void bcm_can_tx(struct bcm_op *op) struct sk_buff *skb; struct net_device *dev; struct canfd_frame *cf = op->frames + op->cfsiz * op->currframe; + int err; /* no target device? => exit */ if (!op->ifindex) @@ -298,11 +299,11 @@ static void bcm_can_tx(struct bcm_op *op) /* send with loopback */ skb->dev = dev; can_skb_set_owner(skb, op->sk); - can_send(skb, 1); + err = can_send(skb, 1); + if (!err) + op->frames_abs++; - /* update statistics */ op->currframe++; - op->frames_abs++; /* reached last frame? */ if (op->currframe >= op->nframes) @@ -648,8 +649,13 @@ static void bcm_rx_handler(struct sk_buff *skb, void *data) return; /* make sure to handle the correct frame type (CAN / CAN FD) */ - if (skb->len != op->cfsiz) - return; + if (op->flags & CAN_FD_FRAME) { + if (!can_is_canfd_skb(skb)) + return; + } else { + if (!can_is_can_skb(skb)) + return; + } /* disable timeout */ hrtimer_cancel(&op->timer); @@ -1744,15 +1750,27 @@ static int __init bcm_module_init(void) pr_info("can: broadcast manager protocol\n"); + err = register_pernet_subsys(&canbcm_pernet_ops); + if (err) + return err; + + err = register_netdevice_notifier(&canbcm_notifier); + if (err) + goto register_notifier_failed; + err = can_proto_register(&bcm_can_proto); if (err < 0) { printk(KERN_ERR "can: registration of bcm protocol failed\n"); - return err; + goto register_proto_failed; } - register_pernet_subsys(&canbcm_pernet_ops); - register_netdevice_notifier(&canbcm_notifier); return 0; + +register_proto_failed: + unregister_netdevice_notifier(&canbcm_notifier); +register_notifier_failed: + unregister_pernet_subsys(&canbcm_pernet_ops); + return err; } static void __exit bcm_module_exit(void) diff --git a/net/can/gw.c b/net/can/gw.c index 1ea4cc527db3..23a3d89cad81 100644 --- a/net/can/gw.c +++ b/net/can/gw.c @@ -463,10 +463,10 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data) /* process strictly Classic CAN or CAN FD frames */ if (gwj->flags & CGW_FLAGS_CAN_FD) { - if (skb->len != CANFD_MTU) + if (!can_is_canfd_skb(skb)) return; } else { - if (skb->len != CAN_MTU) + if (!can_is_can_skb(skb)) return; } diff --git a/net/can/isotp.c b/net/can/isotp.c index 43a27d19cdac..a9d1357f8489 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -669,7 +669,7 @@ static void isotp_rcv(struct sk_buff *skb, void *data) if (cf->len <= CAN_MAX_DLEN) { isotp_rcv_sf(sk, cf, SF_PCI_SZ4 + ae, skb, sf_dl); } else { - if (skb->len == CANFD_MTU) { + if (can_is_canfd_skb(skb)) { /* We have a CAN FD frame and CAN_DL is greater than 8: * Only frames with the SF_DL == 0 ESC value are valid. * diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c index 8452b0fbb78c..144c86b0e3ff 100644 --- a/net/can/j1939/main.c +++ b/net/can/j1939/main.c @@ -42,6 +42,10 @@ static void j1939_can_recv(struct sk_buff *iskb, void *data) struct j1939_sk_buff_cb *skcb, *iskcb; struct can_frame *cf; + /* make sure we only get Classical CAN frames */ + if (!can_is_can_skb(iskb)) + return; + /* create a copy of the skb * j1939 only delivers the real data bytes, * the header goes into sockaddr. diff --git a/net/can/raw.c b/net/can/raw.c index d1bd9cc51ebe..3eb7d3e2b541 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -50,6 +50,7 @@ #include <linux/skbuff.h> #include <linux/can.h> #include <linux/can/core.h> +#include <linux/can/dev.h> /* for can_is_canxl_dev_mtu() */ #include <linux/can/skb.h> #include <linux/can/raw.h> #include <net/sock.h> @@ -87,6 +88,7 @@ struct raw_sock { int loopback; int recv_own_msgs; int fd_frames; + int xl_frames; int join_filters; int count; /* number of active filters */ struct can_filter dfilter; /* default/single filter */ @@ -129,21 +131,21 @@ static void raw_rcv(struct sk_buff *oskb, void *data) if (!ro->recv_own_msgs && oskb->sk == sk) return; - /* do not pass non-CAN2.0 frames to a legacy socket */ - if (!ro->fd_frames && oskb->len != CAN_MTU) + /* make sure to not pass oversized frames to the socket */ + if ((can_is_canfd_skb(oskb) && !ro->fd_frames && !ro->xl_frames) || + (can_is_canxl_skb(oskb) && !ro->xl_frames)) return; /* eliminate multiple filter matches for the same skb */ if (this_cpu_ptr(ro->uniq)->skb == oskb && this_cpu_ptr(ro->uniq)->skbcnt == can_skb_prv(oskb)->skbcnt) { - if (ro->join_filters) { - this_cpu_inc(ro->uniq->join_rx_count); - /* drop frame until all enabled filters matched */ - if (this_cpu_ptr(ro->uniq)->join_rx_count < ro->count) - return; - } else { + if (!ro->join_filters) + return; + + this_cpu_inc(ro->uniq->join_rx_count); + /* drop frame until all enabled filters matched */ + if (this_cpu_ptr(ro->uniq)->join_rx_count < ro->count) return; - } } else { this_cpu_ptr(ro->uniq)->skb = oskb; this_cpu_ptr(ro->uniq)->skbcnt = can_skb_prv(oskb)->skbcnt; @@ -346,6 +348,7 @@ static int raw_init(struct sock *sk) ro->loopback = 1; ro->recv_own_msgs = 0; ro->fd_frames = 0; + ro->xl_frames = 0; ro->join_filters = 0; /* alloc_percpu provides zero'ed memory */ @@ -669,6 +672,15 @@ static int raw_setsockopt(struct socket *sock, int level, int optname, break; + case CAN_RAW_XL_FRAMES: + if (optlen != sizeof(ro->xl_frames)) + return -EINVAL; + + if (copy_from_sockptr(&ro->xl_frames, optval, optlen)) + return -EFAULT; + + break; + case CAN_RAW_JOIN_FILTERS: if (optlen != sizeof(ro->join_filters)) return -EINVAL; @@ -751,6 +763,12 @@ static int raw_getsockopt(struct socket *sock, int level, int optname, val = &ro->fd_frames; break; + case CAN_RAW_XL_FRAMES: + if (len > sizeof(int)) + len = sizeof(int); + val = &ro->xl_frames; + break; + case CAN_RAW_JOIN_FILTERS: if (len > sizeof(int)) len = sizeof(int); @@ -776,7 +794,11 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) struct sk_buff *skb; struct net_device *dev; int ifindex; - int err; + int err = -EINVAL; + + /* check for valid CAN frame sizes */ + if (size < CANXL_HDR_SIZE + CANXL_MIN_DLEN || size > CANXL_MTU) + return -EINVAL; if (msg->msg_name) { DECLARE_SOCKADDR(struct sockaddr_can *, addr, msg->msg_name); @@ -796,15 +818,6 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) if (!dev) return -ENXIO; - err = -EINVAL; - if (ro->fd_frames && dev->mtu == CANFD_MTU) { - if (unlikely(size != CANFD_MTU && size != CAN_MTU)) - goto put_dev; - } else { - if (unlikely(size != CAN_MTU)) - goto put_dev; - } - skb = sock_alloc_send_skb(sk, size + sizeof(struct can_skb_priv), msg->msg_flags & MSG_DONTWAIT, &err); if (!skb) @@ -814,10 +827,27 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) can_skb_prv(skb)->ifindex = dev->ifindex; can_skb_prv(skb)->skbcnt = 0; + /* fill the skb before testing for valid CAN frames */ err = memcpy_from_msg(skb_put(skb, size), msg, size); if (err < 0) goto free_skb; + err = -EINVAL; + if (ro->xl_frames && can_is_canxl_dev_mtu(dev->mtu)) { + /* CAN XL, CAN FD and Classical CAN */ + if (!can_is_canxl_skb(skb) && !can_is_canfd_skb(skb) && + !can_is_can_skb(skb)) + goto free_skb; + } else if (ro->fd_frames && dev->mtu == CANFD_MTU) { + /* CAN FD and Classical CAN */ + if (!can_is_canfd_skb(skb) && !can_is_can_skb(skb)) + goto free_skb; + } else { + /* Classical CAN */ + if (!can_is_can_skb(skb)) + goto free_skb; + } + sockcm_init(&sockc, sk); if (msg->msg_controllen) { err = sock_cmsg_send(sk, msg, &sockc); @@ -942,12 +972,20 @@ static __init int raw_module_init(void) pr_info("can: raw protocol\n"); + err = register_netdevice_notifier(&canraw_notifier); + if (err) + return err; + err = can_proto_register(&raw_can_proto); - if (err < 0) + if (err < 0) { pr_err("can: registration of raw protocol failed\n"); - else - register_netdevice_notifier(&canraw_notifier); + goto register_proto_failed; + } + + return 0; +register_proto_failed: + unregister_netdevice_notifier(&canraw_notifier); return err; } diff --git a/net/core/dev.c b/net/core/dev.c index 56c8b0921c9f..fa53830d0683 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1100,7 +1100,7 @@ static int dev_alloc_name_ns(struct net *net, BUG_ON(!net); ret = __dev_alloc_name(net, name, buf); if (ret >= 0) - strlcpy(dev->name, buf, IFNAMSIZ); + strscpy(dev->name, buf, IFNAMSIZ); return ret; } @@ -1137,7 +1137,7 @@ static int dev_get_valid_name(struct net *net, struct net_device *dev, else if (netdev_name_in_use(net, name)) return -EEXIST; else if (dev->name != name) - strlcpy(dev->name, name, IFNAMSIZ); + strscpy(dev->name, name, IFNAMSIZ); return 0; } @@ -6358,23 +6358,6 @@ int dev_set_threaded(struct net_device *dev, bool threaded) } EXPORT_SYMBOL(dev_set_threaded); -/* Double check that napi_get_frags() allocates skbs with - * skb->head being backed by slab, not a page fragment. - * This is to make sure bug fixed in 3226b158e67c - * ("net: avoid 32 x truesize under-estimation for tiny skbs") - * does not accidentally come back. - */ -static void napi_get_frags_check(struct napi_struct *napi) -{ - struct sk_buff *skb; - - local_bh_disable(); - skb = napi_get_frags(napi); - WARN_ON_ONCE(skb && skb->head_frag); - napi_free_frags(napi); - local_bh_enable(); -} - void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int), int weight) { @@ -10370,9 +10353,7 @@ void netdev_run_todo(void) BUG_ON(!list_empty(&dev->ptype_specific)); WARN_ON(rcu_access_pointer(dev->ip_ptr)); WARN_ON(rcu_access_pointer(dev->ip6_ptr)); -#if IS_ENABLED(CONFIG_DECNET) - WARN_ON(dev->dn_ptr); -#endif + if (dev->priv_destructor) dev->priv_destructor(dev); if (dev->needs_free_netdev) diff --git a/net/core/devlink.c b/net/core/devlink.c index b50bcc18b8d9..89baa7c0938b 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -371,6 +371,13 @@ static struct devlink *devlink_get_from_attrs(struct net *net, return ERR_PTR(-ENODEV); } +#define ASSERT_DEVLINK_PORT_REGISTERED(devlink_port) \ + WARN_ON_ONCE(!(devlink_port)->registered) +#define ASSERT_DEVLINK_PORT_NOT_REGISTERED(devlink_port) \ + WARN_ON_ONCE((devlink_port)->registered) +#define ASSERT_DEVLINK_PORT_INITIALIZED(devlink_port) \ + WARN_ON_ONCE(!(devlink_port)->initialized) + static struct devlink_port *devlink_port_get_by_index(struct devlink *devlink, unsigned int port_index) { @@ -1710,7 +1717,7 @@ static int devlink_nl_cmd_port_split_doit(struct sk_buff *skb, struct devlink *devlink = info->user_ptr[0]; u32 count; - if (!info->attrs[DEVLINK_ATTR_PORT_SPLIT_COUNT]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_PORT_SPLIT_COUNT)) return -EINVAL; if (!devlink->ops->port_split) return -EOPNOTSUPP; @@ -1838,7 +1845,7 @@ static int devlink_nl_cmd_port_del_doit(struct sk_buff *skb, if (!devlink->ops->port_del) return -EOPNOTSUPP; - if (!info->attrs[DEVLINK_ATTR_PORT_INDEX]) { + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_PORT_INDEX)) { NL_SET_ERR_MSG_MOD(extack, "Port index is not specified"); return -EINVAL; } @@ -2690,7 +2697,7 @@ static int devlink_nl_cmd_sb_pool_set_doit(struct sk_buff *skb, if (err) return err; - if (!info->attrs[DEVLINK_ATTR_SB_POOL_SIZE]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_SB_POOL_SIZE)) return -EINVAL; size = nla_get_u32(info->attrs[DEVLINK_ATTR_SB_POOL_SIZE]); @@ -2900,7 +2907,7 @@ static int devlink_nl_cmd_sb_port_pool_set_doit(struct sk_buff *skb, if (err) return err; - if (!info->attrs[DEVLINK_ATTR_SB_THRESHOLD]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_SB_THRESHOLD)) return -EINVAL; threshold = nla_get_u32(info->attrs[DEVLINK_ATTR_SB_THRESHOLD]); @@ -3156,7 +3163,7 @@ static int devlink_nl_cmd_sb_tc_pool_bind_set_doit(struct sk_buff *skb, if (err) return err; - if (!info->attrs[DEVLINK_ATTR_SB_THRESHOLD]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_SB_THRESHOLD)) return -EINVAL; threshold = nla_get_u32(info->attrs[DEVLINK_ATTR_SB_THRESHOLD]); @@ -3845,7 +3852,7 @@ static int devlink_nl_cmd_dpipe_entries_get(struct sk_buff *skb, struct devlink_dpipe_table *table; const char *table_name; - if (!info->attrs[DEVLINK_ATTR_DPIPE_TABLE_NAME]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_DPIPE_TABLE_NAME)) return -EINVAL; table_name = nla_data(info->attrs[DEVLINK_ATTR_DPIPE_TABLE_NAME]); @@ -4029,8 +4036,9 @@ static int devlink_nl_cmd_dpipe_table_counters_set(struct sk_buff *skb, const char *table_name; bool counters_enable; - if (!info->attrs[DEVLINK_ATTR_DPIPE_TABLE_NAME] || - !info->attrs[DEVLINK_ATTR_DPIPE_TABLE_COUNTERS_ENABLED]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_DPIPE_TABLE_NAME) || + GENL_REQ_ATTR_CHECK(info, + DEVLINK_ATTR_DPIPE_TABLE_COUNTERS_ENABLED)) return -EINVAL; table_name = nla_data(info->attrs[DEVLINK_ATTR_DPIPE_TABLE_NAME]); @@ -4119,8 +4127,8 @@ static int devlink_nl_cmd_resource_set(struct sk_buff *skb, u64 size; int err; - if (!info->attrs[DEVLINK_ATTR_RESOURCE_ID] || - !info->attrs[DEVLINK_ATTR_RESOURCE_SIZE]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_RESOURCE_ID) || + GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_RESOURCE_SIZE)) return -EINVAL; resource_id = nla_get_u64(info->attrs[DEVLINK_ATTR_RESOURCE_ID]); @@ -4742,10 +4750,76 @@ void devlink_flash_update_timeout_notify(struct devlink *devlink, } EXPORT_SYMBOL_GPL(devlink_flash_update_timeout_notify); +struct devlink_info_req { + struct sk_buff *msg; + void (*version_cb)(const char *version_name, + enum devlink_info_version_type version_type, + void *version_cb_priv); + void *version_cb_priv; +}; + +struct devlink_flash_component_lookup_ctx { + const char *lookup_name; + bool lookup_name_found; +}; + +static void +devlink_flash_component_lookup_cb(const char *version_name, + enum devlink_info_version_type version_type, + void *version_cb_priv) +{ + struct devlink_flash_component_lookup_ctx *lookup_ctx = version_cb_priv; + + if (version_type != DEVLINK_INFO_VERSION_TYPE_COMPONENT || + lookup_ctx->lookup_name_found) + return; + + lookup_ctx->lookup_name_found = + !strcmp(lookup_ctx->lookup_name, version_name); +} + +static int devlink_flash_component_get(struct devlink *devlink, + struct nlattr *nla_component, + const char **p_component, + struct netlink_ext_ack *extack) +{ + struct devlink_flash_component_lookup_ctx lookup_ctx = {}; + struct devlink_info_req req = {}; + const char *component; + int ret; + + if (!nla_component) + return 0; + + component = nla_data(nla_component); + + if (!devlink->ops->info_get) { + NL_SET_ERR_MSG_ATTR(extack, nla_component, + "component update is not supported by this device"); + return -EOPNOTSUPP; + } + + lookup_ctx.lookup_name = component; + req.version_cb = devlink_flash_component_lookup_cb; + req.version_cb_priv = &lookup_ctx; + + ret = devlink->ops->info_get(devlink, &req, NULL); + if (ret) + return ret; + + if (!lookup_ctx.lookup_name_found) { + NL_SET_ERR_MSG_ATTR(extack, nla_component, + "selected component is not supported by this device"); + return -EINVAL; + } + *p_component = component; + return 0; +} + static int devlink_nl_cmd_flash_update(struct sk_buff *skb, struct genl_info *info) { - struct nlattr *nla_component, *nla_overwrite_mask, *nla_file_name; + struct nlattr *nla_overwrite_mask, *nla_file_name; struct devlink_flash_update_params params = {}; struct devlink *devlink = info->user_ptr[0]; const char *file_name; @@ -4755,20 +4829,16 @@ static int devlink_nl_cmd_flash_update(struct sk_buff *skb, if (!devlink->ops->flash_update) return -EOPNOTSUPP; - if (!info->attrs[DEVLINK_ATTR_FLASH_UPDATE_FILE_NAME]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_FLASH_UPDATE_FILE_NAME)) return -EINVAL; - supported_params = devlink->ops->supported_flash_update_params; + ret = devlink_flash_component_get(devlink, + info->attrs[DEVLINK_ATTR_FLASH_UPDATE_COMPONENT], + ¶ms.component, info->extack); + if (ret) + return ret; - nla_component = info->attrs[DEVLINK_ATTR_FLASH_UPDATE_COMPONENT]; - if (nla_component) { - if (!(supported_params & DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT)) { - NL_SET_ERR_MSG_ATTR(info->extack, nla_component, - "component update is not supported by this device"); - return -EOPNOTSUPP; - } - params.component = nla_data(nla_component); - } + supported_params = devlink->ops->supported_flash_update_params; nla_overwrite_mask = info->attrs[DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK]; if (nla_overwrite_mask) { @@ -4936,10 +5006,8 @@ static int devlink_nl_cmd_selftests_run(struct sk_buff *skb, if (!devlink->ops->selftest_run || !devlink->ops->selftest_check) return -EOPNOTSUPP; - if (!info->attrs[DEVLINK_ATTR_SELFTESTS]) { - NL_SET_ERR_MSG_MOD(info->extack, "selftest required"); + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_SELFTESTS)) return -EINVAL; - } attrs = info->attrs[DEVLINK_ATTR_SELFTESTS]; @@ -5393,7 +5461,7 @@ static int devlink_param_type_get_from_info(struct genl_info *info, enum devlink_param_type *param_type) { - if (!info->attrs[DEVLINK_ATTR_PARAM_TYPE]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_PARAM_TYPE)) return -EINVAL; switch (nla_get_u8(info->attrs[DEVLINK_ATTR_PARAM_TYPE])) { @@ -5470,7 +5538,7 @@ devlink_param_get_from_info(struct list_head *param_list, { char *param_name; - if (!info->attrs[DEVLINK_ATTR_PARAM_NAME]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_PARAM_NAME)) return NULL; param_name = nla_data(info->attrs[DEVLINK_ATTR_PARAM_NAME]); @@ -5536,7 +5604,7 @@ static int __devlink_nl_cmd_param_set_doit(struct devlink *devlink, return err; } - if (!info->attrs[DEVLINK_ATTR_PARAM_VALUE_CMODE]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_PARAM_VALUE_CMODE)) return -EINVAL; cmode = nla_get_u8(info->attrs[DEVLINK_ATTR_PARAM_VALUE_CMODE]); if (!devlink_param_cmode_is_supported(param, cmode)) @@ -5574,89 +5642,22 @@ static int devlink_nl_cmd_param_set_doit(struct sk_buff *skb, static int devlink_nl_cmd_port_param_get_dumpit(struct sk_buff *msg, struct netlink_callback *cb) { - struct devlink_param_item *param_item; - struct devlink_port *devlink_port; - struct devlink *devlink; - int start = cb->args[0]; - unsigned long index; - int idx = 0; - int err = 0; - - devlinks_xa_for_each_registered_get(sock_net(msg->sk), index, devlink) { - devl_lock(devlink); - list_for_each_entry(devlink_port, &devlink->port_list, list) { - list_for_each_entry(param_item, - &devlink_port->param_list, list) { - if (idx < start) { - idx++; - continue; - } - err = devlink_nl_param_fill(msg, - devlink_port->devlink, - devlink_port->index, param_item, - DEVLINK_CMD_PORT_PARAM_GET, - NETLINK_CB(cb->skb).portid, - cb->nlh->nlmsg_seq, - NLM_F_MULTI); - if (err == -EOPNOTSUPP) { - err = 0; - } else if (err) { - devl_unlock(devlink); - devlink_put(devlink); - goto out; - } - idx++; - } - } - devl_unlock(devlink); - devlink_put(devlink); - } -out: - if (err != -EMSGSIZE) - return err; - - cb->args[0] = idx; + NL_SET_ERR_MSG_MOD(cb->extack, "Port params are not supported"); return msg->len; } static int devlink_nl_cmd_port_param_get_doit(struct sk_buff *skb, struct genl_info *info) { - struct devlink_port *devlink_port = info->user_ptr[1]; - struct devlink_param_item *param_item; - struct sk_buff *msg; - int err; - - param_item = devlink_param_get_from_info(&devlink_port->param_list, - info); - if (!param_item) - return -EINVAL; - - msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); - if (!msg) - return -ENOMEM; - - err = devlink_nl_param_fill(msg, devlink_port->devlink, - devlink_port->index, param_item, - DEVLINK_CMD_PORT_PARAM_GET, - info->snd_portid, info->snd_seq, 0); - if (err) { - nlmsg_free(msg); - return err; - } - - return genlmsg_reply(msg, info); + NL_SET_ERR_MSG_MOD(info->extack, "Port params are not supported"); + return -EINVAL; } static int devlink_nl_cmd_port_param_set_doit(struct sk_buff *skb, struct genl_info *info) { - struct devlink_port *devlink_port = info->user_ptr[1]; - - return __devlink_nl_cmd_param_set_doit(devlink_port->devlink, - devlink_port->index, - &devlink_port->param_list, info, - DEVLINK_CMD_PORT_PARAM_NEW); + NL_SET_ERR_MSG_MOD(info->extack, "Port params are not supported"); + return -EINVAL; } static int devlink_nl_region_snapshot_id_put(struct sk_buff *msg, @@ -6056,7 +6057,7 @@ static int devlink_nl_cmd_region_get_doit(struct sk_buff *skb, unsigned int index; int err; - if (!info->attrs[DEVLINK_ATTR_REGION_NAME]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_REGION_NAME)) return -EINVAL; if (info->attrs[DEVLINK_ATTR_PORT_INDEX]) { @@ -6189,8 +6190,8 @@ static int devlink_nl_cmd_region_del(struct sk_buff *skb, unsigned int index; u32 snapshot_id; - if (!info->attrs[DEVLINK_ATTR_REGION_NAME] || - !info->attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID]) + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_REGION_NAME) || + GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_REGION_SNAPSHOT_ID)) return -EINVAL; region_name = nla_data(info->attrs[DEVLINK_ATTR_REGION_NAME]); @@ -6238,7 +6239,7 @@ devlink_nl_cmd_region_new(struct sk_buff *skb, struct genl_info *info) u8 *data; int err; - if (!info->attrs[DEVLINK_ATTR_REGION_NAME]) { + if (GENL_REQ_ATTR_CHECK(info, DEVLINK_ATTR_REGION_NAME)) { NL_SET_ERR_MSG_MOD(info->extack, "No region name provided"); return -EINVAL; } @@ -6553,18 +6554,18 @@ out_unlock: return err; } -struct devlink_info_req { - struct sk_buff *msg; -}; - int devlink_info_driver_name_put(struct devlink_info_req *req, const char *name) { + if (!req->msg) + return 0; return nla_put_string(req->msg, DEVLINK_ATTR_INFO_DRIVER_NAME, name); } EXPORT_SYMBOL_GPL(devlink_info_driver_name_put); int devlink_info_serial_number_put(struct devlink_info_req *req, const char *sn) { + if (!req->msg) + return 0; return nla_put_string(req->msg, DEVLINK_ATTR_INFO_SERIAL_NUMBER, sn); } EXPORT_SYMBOL_GPL(devlink_info_serial_number_put); @@ -6572,6 +6573,8 @@ EXPORT_SYMBOL_GPL(devlink_info_serial_number_put); int devlink_info_board_serial_number_put(struct devlink_info_req *req, const char *bsn) { + if (!req->msg) + return 0; return nla_put_string(req->msg, DEVLINK_ATTR_INFO_BOARD_SERIAL_NUMBER, bsn); } @@ -6579,11 +6582,19 @@ EXPORT_SYMBOL_GPL(devlink_info_board_serial_number_put); static int devlink_info_version_put(struct devlink_info_req *req, int attr, const char *version_name, - const char *version_value) + const char *version_value, + enum devlink_info_version_type version_type) { struct nlattr *nest; int err; + if (req->version_cb) + req->version_cb(version_name, version_type, + req->version_cb_priv); + + if (!req->msg) + return 0; + nest = nla_nest_start_noflag(req->msg, attr); if (!nest) return -EMSGSIZE; @@ -6612,7 +6623,8 @@ int devlink_info_version_fixed_put(struct devlink_info_req *req, const char *version_value) { return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_FIXED, - version_name, version_value); + version_name, version_value, + DEVLINK_INFO_VERSION_TYPE_NONE); } EXPORT_SYMBOL_GPL(devlink_info_version_fixed_put); @@ -6621,25 +6633,49 @@ int devlink_info_version_stored_put(struct devlink_info_req *req, const char *version_value) { return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_STORED, - version_name, version_value); + version_name, version_value, + DEVLINK_INFO_VERSION_TYPE_NONE); } EXPORT_SYMBOL_GPL(devlink_info_version_stored_put); +int devlink_info_version_stored_put_ext(struct devlink_info_req *req, + const char *version_name, + const char *version_value, + enum devlink_info_version_type version_type) +{ + return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_STORED, + version_name, version_value, + version_type); +} +EXPORT_SYMBOL_GPL(devlink_info_version_stored_put_ext); + int devlink_info_version_running_put(struct devlink_info_req *req, const char *version_name, const char *version_value) { return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_RUNNING, - version_name, version_value); + version_name, version_value, + DEVLINK_INFO_VERSION_TYPE_NONE); } EXPORT_SYMBOL_GPL(devlink_info_version_running_put); +int devlink_info_version_running_put_ext(struct devlink_info_req *req, + const char *version_name, + const char *version_value, + enum devlink_info_version_type version_type) +{ + return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_RUNNING, + version_name, version_value, + version_type); +} +EXPORT_SYMBOL_GPL(devlink_info_version_running_put_ext); + static int devlink_nl_info_fill(struct sk_buff *msg, struct devlink *devlink, enum devlink_command cmd, u32 portid, u32 seq, int flags, struct netlink_ext_ack *extack) { - struct devlink_info_req req; + struct devlink_info_req req = {}; void *hdr; int err; @@ -9513,6 +9549,7 @@ static struct genl_family devlink_nl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = devlink_nl_ops, .n_small_ops = ARRAY_SIZE(devlink_nl_ops), + .resv_start_op = DEVLINK_CMD_SELFTESTS_RUN + 1, .mcgrps = devlink_nl_mcgrps, .n_mcgrps = ARRAY_SIZE(devlink_nl_mcgrps), }; @@ -9818,6 +9855,44 @@ static void devlink_port_type_warn_cancel(struct devlink_port *devlink_port) } /** + * devlink_port_init() - Init devlink port + * + * @devlink: devlink + * @devlink_port: devlink port + * + * Initialize essencial stuff that is needed for functions + * that may be called before devlink port registration. + * Call to this function is optional and not needed + * in case the driver does not use such functions. + */ +void devlink_port_init(struct devlink *devlink, + struct devlink_port *devlink_port) +{ + if (devlink_port->initialized) + return; + devlink_port->devlink = devlink; + INIT_LIST_HEAD(&devlink_port->region_list); + devlink_port->initialized = true; +} +EXPORT_SYMBOL_GPL(devlink_port_init); + +/** + * devlink_port_fini() - Deinitialize devlink port + * + * @devlink_port: devlink port + * + * Deinitialize essencial stuff that is in use for functions + * that may be called after devlink port unregistration. + * Call to this function is optional and not needed + * in case the driver does not use such functions. + */ +void devlink_port_fini(struct devlink_port *devlink_port) +{ + WARN_ON(!list_empty(&devlink_port->region_list)); +} +EXPORT_SYMBOL_GPL(devlink_port_fini); + +/** * devl_port_register() - Register devlink port * * @devlink: devlink @@ -9839,15 +9914,15 @@ int devl_port_register(struct devlink *devlink, if (devlink_port_index_exists(devlink, port_index)) return -EEXIST; - WARN_ON(devlink_port->devlink); - devlink_port->devlink = devlink; + ASSERT_DEVLINK_PORT_NOT_REGISTERED(devlink_port); + + devlink_port_init(devlink, devlink_port); + devlink_port->registered = true; devlink_port->index = port_index; spin_lock_init(&devlink_port->type_lock); INIT_LIST_HEAD(&devlink_port->reporter_list); mutex_init(&devlink_port->reporters_lock); list_add_tail(&devlink_port->list, &devlink->port_list); - INIT_LIST_HEAD(&devlink_port->param_list); - INIT_LIST_HEAD(&devlink_port->region_list); INIT_DELAYED_WORK(&devlink_port->type_warn_dw, &devlink_port_type_warn); devlink_port_type_warn_schedule(devlink_port); @@ -9897,8 +9972,8 @@ void devl_port_unregister(struct devlink_port *devlink_port) devlink_port_notify(devlink_port, DEVLINK_CMD_PORT_DEL); list_del(&devlink_port->list); WARN_ON(!list_empty(&devlink_port->reporter_list)); - WARN_ON(!list_empty(&devlink_port->region_list)); mutex_destroy(&devlink_port->reporters_lock); + devlink_port->registered = false; } EXPORT_SYMBOL_GPL(devl_port_unregister); @@ -9923,8 +9998,8 @@ static void __devlink_port_type_set(struct devlink_port *devlink_port, enum devlink_port_type type, void *type_dev) { - if (WARN_ON(!devlink_port->devlink)) - return; + ASSERT_DEVLINK_PORT_REGISTERED(devlink_port); + devlink_port_type_warn_cancel(devlink_port); spin_lock_bh(&devlink_port->type_lock); devlink_port->type = type; @@ -10043,8 +10118,8 @@ void devlink_port_attrs_set(struct devlink_port *devlink_port, { int ret; - if (WARN_ON(devlink_port->devlink)) - return; + ASSERT_DEVLINK_PORT_NOT_REGISTERED(devlink_port); + devlink_port->attrs = *attrs; ret = __devlink_port_attrs_set(devlink_port, attrs->flavour); if (ret) @@ -10067,8 +10142,8 @@ void devlink_port_attrs_pci_pf_set(struct devlink_port *devlink_port, u32 contro struct devlink_port_attrs *attrs = &devlink_port->attrs; int ret; - if (WARN_ON(devlink_port->devlink)) - return; + ASSERT_DEVLINK_PORT_NOT_REGISTERED(devlink_port); + ret = __devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PCI_PF); if (ret) @@ -10094,8 +10169,8 @@ void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 contro struct devlink_port_attrs *attrs = &devlink_port->attrs; int ret; - if (WARN_ON(devlink_port->devlink)) - return; + ASSERT_DEVLINK_PORT_NOT_REGISTERED(devlink_port); + ret = __devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PCI_VF); if (ret) @@ -10122,8 +10197,8 @@ void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, u32 contro struct devlink_port_attrs *attrs = &devlink_port->attrs; int ret; - if (WARN_ON(devlink_port->devlink)) - return; + ASSERT_DEVLINK_PORT_NOT_REGISTERED(devlink_port); + ret = __devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PCI_SF); if (ret) @@ -10238,8 +10313,8 @@ EXPORT_SYMBOL_GPL(devl_rate_nodes_destroy); void devlink_port_linecard_set(struct devlink_port *devlink_port, struct devlink_linecard *linecard) { - if (WARN_ON(devlink_port->devlink)) - return; + ASSERT_DEVLINK_PORT_NOT_REGISTERED(devlink_port); + devlink_port->linecard = linecard; } EXPORT_SYMBOL_GPL(devlink_port_linecard_set); @@ -11310,6 +11385,8 @@ devlink_port_region_create(struct devlink_port *port, struct devlink_region *region; int err = 0; + ASSERT_DEVLINK_PORT_INITIALIZED(port); + if (WARN_ON(!ops) || WARN_ON(!ops->destructor)) return ERR_PTR(-EINVAL); @@ -12306,8 +12383,8 @@ EXPORT_SYMBOL_GPL(devl_trap_policers_unregister); static void __devlink_compat_running_version(struct devlink *devlink, char *buf, size_t len) { + struct devlink_info_req req = {}; const struct nlattr *nlattr; - struct devlink_info_req req; struct sk_buff *msg; int rem, err; diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c index 75501e1bdd25..f084a4a6b7ab 100644 --- a/net/core/drop_monitor.c +++ b/net/core/drop_monitor.c @@ -464,7 +464,7 @@ net_dm_hw_trap_summary_probe(void *ignore, const struct devlink *devlink, goto out; hw_entry = &hw_entries->entries[hw_entries->num_entries]; - strlcpy(hw_entry->trap_name, metadata->trap_name, + strscpy(hw_entry->trap_name, metadata->trap_name, NET_DM_MAX_HW_TRAP_NAME_LEN - 1); hw_entry->count = 1; hw_entries->num_entries++; @@ -1645,6 +1645,7 @@ static struct genl_family net_drop_monitor_family __ro_after_init = { .module = THIS_MODULE, .small_ops = dropmon_ops, .n_small_ops = ARRAY_SIZE(dropmon_ops), + .resv_start_op = NET_DM_CMD_STATS_GET + 1, .mcgrps = dropmon_mcgrps, .n_mcgrps = ARRAY_SIZE(dropmon_mcgrps), }; diff --git a/net/core/filter.c b/net/core/filter.c index c191db80ce93..bb0136e7a8e4 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -18,6 +18,7 @@ */ #include <linux/atomic.h> +#include <linux/bpf_verifier.h> #include <linux/module.h> #include <linux/types.h> #include <linux/mm.h> @@ -3010,7 +3011,7 @@ BPF_CALL_0(bpf_get_cgroup_classid_curr) return __task_get_classid(current); } -static const struct bpf_func_proto bpf_get_cgroup_classid_curr_proto = { +const struct bpf_func_proto bpf_get_cgroup_classid_curr_proto = { .func = bpf_get_cgroup_classid_curr, .gpl_only = false, .ret_type = RET_INTEGER, @@ -4489,7 +4490,8 @@ BPF_CALL_4(bpf_skb_get_tunnel_key, struct sk_buff *, skb, struct bpf_tunnel_key void *to_orig = to; int err; - if (unlikely(!info || (flags & ~(BPF_F_TUNINFO_IPV6)))) { + if (unlikely(!info || (flags & ~(BPF_F_TUNINFO_IPV6 | + BPF_F_TUNINFO_FLAGS)))) { err = -EINVAL; goto err_clear; } @@ -4521,7 +4523,10 @@ set_compat: to->tunnel_id = be64_to_cpu(info->key.tun_id); to->tunnel_tos = info->key.tos; to->tunnel_ttl = info->key.ttl; - to->tunnel_ext = 0; + if (flags & BPF_F_TUNINFO_FLAGS) + to->tunnel_flags = info->key.tun_flags; + else + to->tunnel_ext = 0; if (flags & BPF_F_TUNINFO_IPV6) { memcpy(to->remote_ipv6, &info->key.u.ipv6.src, @@ -5014,359 +5019,303 @@ static const struct bpf_func_proto bpf_get_socket_uid_proto = { .arg1_type = ARG_PTR_TO_CTX, }; -static int __bpf_setsockopt(struct sock *sk, int level, int optname, - char *optval, int optlen) +static int sol_socket_sockopt(struct sock *sk, int optname, + char *optval, int *optlen, + bool getopt) +{ + switch (optname) { + case SO_REUSEADDR: + case SO_SNDBUF: + case SO_RCVBUF: + case SO_KEEPALIVE: + case SO_PRIORITY: + case SO_REUSEPORT: + case SO_RCVLOWAT: + case SO_MARK: + case SO_MAX_PACING_RATE: + case SO_BINDTOIFINDEX: + case SO_TXREHASH: + if (*optlen != sizeof(int)) + return -EINVAL; + break; + case SO_BINDTODEVICE: + break; + default: + return -EINVAL; + } + + if (getopt) { + if (optname == SO_BINDTODEVICE) + return -EINVAL; + return sk_getsockopt(sk, SOL_SOCKET, optname, + KERNEL_SOCKPTR(optval), + KERNEL_SOCKPTR(optlen)); + } + + return sk_setsockopt(sk, SOL_SOCKET, optname, + KERNEL_SOCKPTR(optval), *optlen); +} + +static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname, + char *optval, int optlen) { - char devname[IFNAMSIZ]; - int val, valbool; - struct net *net; - int ifindex; - int ret = 0; + struct tcp_sock *tp = tcp_sk(sk); + unsigned long timeout; + int val; - if (!sk_fullsock(sk)) + if (optlen != sizeof(int)) return -EINVAL; - if (level == SOL_SOCKET) { - if (optlen != sizeof(int) && optname != SO_BINDTODEVICE) + val = *(int *)optval; + + /* Only some options are supported */ + switch (optname) { + case TCP_BPF_IW: + if (val <= 0 || tp->data_segs_out > tp->syn_data) return -EINVAL; - val = *((int *)optval); - valbool = val ? 1 : 0; - - /* Only some socketops are supported */ - switch (optname) { - case SO_RCVBUF: - val = min_t(u32, val, READ_ONCE(sysctl_rmem_max)); - val = min_t(int, val, INT_MAX / 2); - sk->sk_userlocks |= SOCK_RCVBUF_LOCK; - WRITE_ONCE(sk->sk_rcvbuf, - max_t(int, val * 2, SOCK_MIN_RCVBUF)); - break; - case SO_SNDBUF: - val = min_t(u32, val, READ_ONCE(sysctl_wmem_max)); - val = min_t(int, val, INT_MAX / 2); - sk->sk_userlocks |= SOCK_SNDBUF_LOCK; - WRITE_ONCE(sk->sk_sndbuf, - max_t(int, val * 2, SOCK_MIN_SNDBUF)); - break; - case SO_MAX_PACING_RATE: /* 32bit version */ - if (val != ~0U) - cmpxchg(&sk->sk_pacing_status, - SK_PACING_NONE, - SK_PACING_NEEDED); - sk->sk_max_pacing_rate = (val == ~0U) ? - ~0UL : (unsigned int)val; - sk->sk_pacing_rate = min(sk->sk_pacing_rate, - sk->sk_max_pacing_rate); - break; - case SO_PRIORITY: - sk->sk_priority = val; - break; - case SO_RCVLOWAT: - if (val < 0) - val = INT_MAX; - if (sk->sk_socket && sk->sk_socket->ops->set_rcvlowat) - ret = sk->sk_socket->ops->set_rcvlowat(sk, val); - else - WRITE_ONCE(sk->sk_rcvlowat, val ? : 1); - break; - case SO_MARK: - if (sk->sk_mark != val) { - sk->sk_mark = val; - sk_dst_reset(sk); - } - break; - case SO_BINDTODEVICE: - optlen = min_t(long, optlen, IFNAMSIZ - 1); - strncpy(devname, optval, optlen); - devname[optlen] = 0; + tcp_snd_cwnd_set(tp, val); + break; + case TCP_BPF_SNDCWND_CLAMP: + if (val <= 0) + return -EINVAL; + tp->snd_cwnd_clamp = val; + tp->snd_ssthresh = val; + break; + case TCP_BPF_DELACK_MAX: + timeout = usecs_to_jiffies(val); + if (timeout > TCP_DELACK_MAX || + timeout < TCP_TIMEOUT_MIN) + return -EINVAL; + inet_csk(sk)->icsk_delack_max = timeout; + break; + case TCP_BPF_RTO_MIN: + timeout = usecs_to_jiffies(val); + if (timeout > TCP_RTO_MIN || + timeout < TCP_TIMEOUT_MIN) + return -EINVAL; + inet_csk(sk)->icsk_rto_min = timeout; + break; + default: + return -EINVAL; + } - ifindex = 0; - if (devname[0] != '\0') { - struct net_device *dev; + return 0; +} - ret = -ENODEV; +static int sol_tcp_sockopt_congestion(struct sock *sk, char *optval, + int *optlen, bool getopt) +{ + struct tcp_sock *tp; + int ret; - net = sock_net(sk); - dev = dev_get_by_name(net, devname); - if (!dev) - break; - ifindex = dev->ifindex; - dev_put(dev); - } - fallthrough; - case SO_BINDTOIFINDEX: - if (optname == SO_BINDTOIFINDEX) - ifindex = val; - ret = sock_bindtoindex(sk, ifindex, false); - break; - case SO_KEEPALIVE: - if (sk->sk_prot->keepalive) - sk->sk_prot->keepalive(sk, valbool); - sock_valbool_flag(sk, SOCK_KEEPOPEN, valbool); - break; - case SO_REUSEPORT: - sk->sk_reuseport = valbool; - break; - case SO_TXREHASH: - if (val < -1 || val > 1) { - ret = -EINVAL; - break; - } - sk->sk_txrehash = (u8)val; - break; - default: - ret = -EINVAL; - } -#ifdef CONFIG_INET - } else if (level == SOL_IP) { - if (optlen != sizeof(int) || sk->sk_family != AF_INET) + if (*optlen < 2) + return -EINVAL; + + if (getopt) { + if (!inet_csk(sk)->icsk_ca_ops) return -EINVAL; + /* BPF expects NULL-terminated tcp-cc string */ + optval[--(*optlen)] = '\0'; + return do_tcp_getsockopt(sk, SOL_TCP, TCP_CONGESTION, + KERNEL_SOCKPTR(optval), + KERNEL_SOCKPTR(optlen)); + } - val = *((int *)optval); - /* Only some options are supported */ - switch (optname) { - case IP_TOS: - if (val < -1 || val > 0xff) { - ret = -EINVAL; - } else { - struct inet_sock *inet = inet_sk(sk); + /* "cdg" is the only cc that alloc a ptr + * in inet_csk_ca area. The bpf-tcp-cc may + * overwrite this ptr after switching to cdg. + */ + if (*optlen >= sizeof("cdg") - 1 && !strncmp("cdg", optval, *optlen)) + return -ENOTSUPP; - if (val == -1) - val = 0; - inet->tos = val; - } - break; - default: - ret = -EINVAL; - } -#if IS_ENABLED(CONFIG_IPV6) - } else if (level == SOL_IPV6) { - if (optlen != sizeof(int) || sk->sk_family != AF_INET6) - return -EINVAL; + /* It stops this looping + * + * .init => bpf_setsockopt(tcp_cc) => .init => + * bpf_setsockopt(tcp_cc)" => .init => .... + * + * The second bpf_setsockopt(tcp_cc) is not allowed + * in order to break the loop when both .init + * are the same bpf prog. + * + * This applies even the second bpf_setsockopt(tcp_cc) + * does not cause a loop. This limits only the first + * '.init' can call bpf_setsockopt(TCP_CONGESTION) to + * pick a fallback cc (eg. peer does not support ECN) + * and the second '.init' cannot fallback to + * another. + */ + tp = tcp_sk(sk); + if (tp->bpf_chg_cc_inprogress) + return -EBUSY; + + tp->bpf_chg_cc_inprogress = 1; + ret = do_tcp_setsockopt(sk, SOL_TCP, TCP_CONGESTION, + KERNEL_SOCKPTR(optval), *optlen); + tp->bpf_chg_cc_inprogress = 0; + return ret; +} - val = *((int *)optval); - /* Only some options are supported */ - switch (optname) { - case IPV6_TCLASS: - if (val < -1 || val > 0xff) { - ret = -EINVAL; - } else { - struct ipv6_pinfo *np = inet6_sk(sk); +static int sol_tcp_sockopt(struct sock *sk, int optname, + char *optval, int *optlen, + bool getopt) +{ + if (sk->sk_prot->setsockopt != tcp_setsockopt) + return -EINVAL; - if (val == -1) - val = 0; - np->tclass = val; - } - break; - default: - ret = -EINVAL; - } -#endif - } else if (level == SOL_TCP && - sk->sk_prot->setsockopt == tcp_setsockopt) { - if (optname == TCP_CONGESTION) { - char name[TCP_CA_NAME_MAX]; - - strncpy(name, optval, min_t(long, optlen, - TCP_CA_NAME_MAX-1)); - name[TCP_CA_NAME_MAX-1] = 0; - ret = tcp_set_congestion_control(sk, name, false, true); - } else { - struct inet_connection_sock *icsk = inet_csk(sk); + switch (optname) { + case TCP_NODELAY: + case TCP_MAXSEG: + case TCP_KEEPIDLE: + case TCP_KEEPINTVL: + case TCP_KEEPCNT: + case TCP_SYNCNT: + case TCP_WINDOW_CLAMP: + case TCP_THIN_LINEAR_TIMEOUTS: + case TCP_USER_TIMEOUT: + case TCP_NOTSENT_LOWAT: + case TCP_SAVE_SYN: + if (*optlen != sizeof(int)) + return -EINVAL; + break; + case TCP_CONGESTION: + return sol_tcp_sockopt_congestion(sk, optval, optlen, getopt); + case TCP_SAVED_SYN: + if (*optlen < 1) + return -EINVAL; + break; + default: + if (getopt) + return -EINVAL; + return bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen); + } + + if (getopt) { + if (optname == TCP_SAVED_SYN) { struct tcp_sock *tp = tcp_sk(sk); - unsigned long timeout; - if (optlen != sizeof(int)) + if (!tp->saved_syn || + *optlen > tcp_saved_syn_len(tp->saved_syn)) return -EINVAL; - - val = *((int *)optval); - /* Only some options are supported */ - switch (optname) { - case TCP_BPF_IW: - if (val <= 0 || tp->data_segs_out > tp->syn_data) - ret = -EINVAL; - else - tcp_snd_cwnd_set(tp, val); - break; - case TCP_BPF_SNDCWND_CLAMP: - if (val <= 0) { - ret = -EINVAL; - } else { - tp->snd_cwnd_clamp = val; - tp->snd_ssthresh = val; - } - break; - case TCP_BPF_DELACK_MAX: - timeout = usecs_to_jiffies(val); - if (timeout > TCP_DELACK_MAX || - timeout < TCP_TIMEOUT_MIN) - return -EINVAL; - inet_csk(sk)->icsk_delack_max = timeout; - break; - case TCP_BPF_RTO_MIN: - timeout = usecs_to_jiffies(val); - if (timeout > TCP_RTO_MIN || - timeout < TCP_TIMEOUT_MIN) - return -EINVAL; - inet_csk(sk)->icsk_rto_min = timeout; - break; - case TCP_SAVE_SYN: - if (val < 0 || val > 1) - ret = -EINVAL; - else - tp->save_syn = val; - break; - case TCP_KEEPIDLE: - ret = tcp_sock_set_keepidle_locked(sk, val); - break; - case TCP_KEEPINTVL: - if (val < 1 || val > MAX_TCP_KEEPINTVL) - ret = -EINVAL; - else - tp->keepalive_intvl = val * HZ; - break; - case TCP_KEEPCNT: - if (val < 1 || val > MAX_TCP_KEEPCNT) - ret = -EINVAL; - else - tp->keepalive_probes = val; - break; - case TCP_SYNCNT: - if (val < 1 || val > MAX_TCP_SYNCNT) - ret = -EINVAL; - else - icsk->icsk_syn_retries = val; - break; - case TCP_USER_TIMEOUT: - if (val < 0) - ret = -EINVAL; - else - icsk->icsk_user_timeout = val; - break; - case TCP_NOTSENT_LOWAT: - tp->notsent_lowat = val; - sk->sk_write_space(sk); - break; - case TCP_WINDOW_CLAMP: - ret = tcp_set_window_clamp(sk, val); - break; - default: - ret = -EINVAL; - } + memcpy(optval, tp->saved_syn->data, *optlen); + /* It cannot free tp->saved_syn here because it + * does not know if the user space still needs it. + */ + return 0; } -#endif - } else { - ret = -EINVAL; + + return do_tcp_getsockopt(sk, SOL_TCP, optname, + KERNEL_SOCKPTR(optval), + KERNEL_SOCKPTR(optlen)); } - return ret; + + return do_tcp_setsockopt(sk, SOL_TCP, optname, + KERNEL_SOCKPTR(optval), *optlen); } -static int _bpf_setsockopt(struct sock *sk, int level, int optname, - char *optval, int optlen) +static int sol_ip_sockopt(struct sock *sk, int optname, + char *optval, int *optlen, + bool getopt) { - if (sk_fullsock(sk)) - sock_owned_by_me(sk); - return __bpf_setsockopt(sk, level, optname, optval, optlen); + if (sk->sk_family != AF_INET) + return -EINVAL; + + switch (optname) { + case IP_TOS: + if (*optlen != sizeof(int)) + return -EINVAL; + break; + default: + return -EINVAL; + } + + if (getopt) + return do_ip_getsockopt(sk, SOL_IP, optname, + KERNEL_SOCKPTR(optval), + KERNEL_SOCKPTR(optlen)); + + return do_ip_setsockopt(sk, SOL_IP, optname, + KERNEL_SOCKPTR(optval), *optlen); } -static int __bpf_getsockopt(struct sock *sk, int level, int optname, - char *optval, int optlen) +static int sol_ipv6_sockopt(struct sock *sk, int optname, + char *optval, int *optlen, + bool getopt) { - if (!sk_fullsock(sk)) - goto err_clear; + if (sk->sk_family != AF_INET6) + return -EINVAL; - if (level == SOL_SOCKET) { - if (optlen != sizeof(int)) - goto err_clear; + switch (optname) { + case IPV6_TCLASS: + case IPV6_AUTOFLOWLABEL: + if (*optlen != sizeof(int)) + return -EINVAL; + break; + default: + return -EINVAL; + } - switch (optname) { - case SO_RCVBUF: - *((int *)optval) = sk->sk_rcvbuf; - break; - case SO_SNDBUF: - *((int *)optval) = sk->sk_sndbuf; - break; - case SO_MARK: - *((int *)optval) = sk->sk_mark; - break; - case SO_PRIORITY: - *((int *)optval) = sk->sk_priority; - break; - case SO_BINDTOIFINDEX: - *((int *)optval) = sk->sk_bound_dev_if; - break; - case SO_REUSEPORT: - *((int *)optval) = sk->sk_reuseport; - break; - case SO_TXREHASH: - *((int *)optval) = sk->sk_txrehash; - break; - default: - goto err_clear; - } -#ifdef CONFIG_INET - } else if (level == SOL_TCP && sk->sk_prot->getsockopt == tcp_getsockopt) { - struct inet_connection_sock *icsk; - struct tcp_sock *tp; + if (getopt) + return ipv6_bpf_stub->ipv6_getsockopt(sk, SOL_IPV6, optname, + KERNEL_SOCKPTR(optval), + KERNEL_SOCKPTR(optlen)); - switch (optname) { - case TCP_CONGESTION: - icsk = inet_csk(sk); + return ipv6_bpf_stub->ipv6_setsockopt(sk, SOL_IPV6, optname, + KERNEL_SOCKPTR(optval), *optlen); +} - if (!icsk->icsk_ca_ops || optlen <= 1) - goto err_clear; - strncpy(optval, icsk->icsk_ca_ops->name, optlen); - optval[optlen - 1] = 0; - break; - case TCP_SAVED_SYN: - tp = tcp_sk(sk); +static int __bpf_setsockopt(struct sock *sk, int level, int optname, + char *optval, int optlen) +{ + if (!sk_fullsock(sk)) + return -EINVAL; - if (optlen <= 0 || !tp->saved_syn || - optlen > tcp_saved_syn_len(tp->saved_syn)) - goto err_clear; - memcpy(optval, tp->saved_syn->data, optlen); - break; - default: - goto err_clear; - } - } else if (level == SOL_IP) { - struct inet_sock *inet = inet_sk(sk); + if (level == SOL_SOCKET) + return sol_socket_sockopt(sk, optname, optval, &optlen, false); + else if (IS_ENABLED(CONFIG_INET) && level == SOL_IP) + return sol_ip_sockopt(sk, optname, optval, &optlen, false); + else if (IS_ENABLED(CONFIG_IPV6) && level == SOL_IPV6) + return sol_ipv6_sockopt(sk, optname, optval, &optlen, false); + else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP) + return sol_tcp_sockopt(sk, optname, optval, &optlen, false); - if (optlen != sizeof(int) || sk->sk_family != AF_INET) - goto err_clear; + return -EINVAL; +} - /* Only some options are supported */ - switch (optname) { - case IP_TOS: - *((int *)optval) = (int)inet->tos; - break; - default: - goto err_clear; - } -#if IS_ENABLED(CONFIG_IPV6) - } else if (level == SOL_IPV6) { - struct ipv6_pinfo *np = inet6_sk(sk); +static int _bpf_setsockopt(struct sock *sk, int level, int optname, + char *optval, int optlen) +{ + if (sk_fullsock(sk)) + sock_owned_by_me(sk); + return __bpf_setsockopt(sk, level, optname, optval, optlen); +} - if (optlen != sizeof(int) || sk->sk_family != AF_INET6) - goto err_clear; +static int __bpf_getsockopt(struct sock *sk, int level, int optname, + char *optval, int optlen) +{ + int err, saved_optlen = optlen; - /* Only some options are supported */ - switch (optname) { - case IPV6_TCLASS: - *((int *)optval) = (int)np->tclass; - break; - default: - goto err_clear; - } -#endif -#endif - } else { - goto err_clear; + if (!sk_fullsock(sk)) { + err = -EINVAL; + goto done; } - return 0; -err_clear: - memset(optval, 0, optlen); - return -EINVAL; + + if (level == SOL_SOCKET) + err = sol_socket_sockopt(sk, optname, optval, &optlen, true); + else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP) + err = sol_tcp_sockopt(sk, optname, optval, &optlen, true); + else if (IS_ENABLED(CONFIG_INET) && level == SOL_IP) + err = sol_ip_sockopt(sk, optname, optval, &optlen, true); + else if (IS_ENABLED(CONFIG_IPV6) && level == SOL_IPV6) + err = sol_ipv6_sockopt(sk, optname, optval, &optlen, true); + else + err = -EINVAL; + +done: + if (err) + optlen = 0; + if (optlen < saved_optlen) + memset(optval + optlen, 0, saved_optlen - optlen); + return err; } static int _bpf_getsockopt(struct sock *sk, int level, int optname, @@ -5380,12 +5329,6 @@ static int _bpf_getsockopt(struct sock *sk, int level, int optname, BPF_CALL_5(bpf_sk_setsockopt, struct sock *, sk, int, level, int, optname, char *, optval, int, optlen) { - if (level == SOL_TCP && optname == TCP_CONGESTION) { - if (optlen >= sizeof("cdg") - 1 && - !strncmp("cdg", optval, optlen)) - return -ENOTSUPP; - } - return _bpf_setsockopt(sk, level, optname, optval, optlen); } @@ -6469,6 +6412,7 @@ static const struct bpf_func_proto bpf_lwt_seg6_adjust_srh_proto = { static struct sock *sk_lookup(struct net *net, struct bpf_sock_tuple *tuple, int dif, int sdif, u8 family, u8 proto) { + struct inet_hashinfo *hinfo = net->ipv4.tcp_death_row.hashinfo; bool refcounted = false; struct sock *sk = NULL; @@ -6477,7 +6421,7 @@ static struct sock *sk_lookup(struct net *net, struct bpf_sock_tuple *tuple, __be32 dst4 = tuple->ipv4.daddr; if (proto == IPPROTO_TCP) - sk = __inet_lookup(net, &tcp_hashinfo, NULL, 0, + sk = __inet_lookup(net, hinfo, NULL, 0, src4, tuple->ipv4.sport, dst4, tuple->ipv4.dport, dif, sdif, &refcounted); @@ -6491,7 +6435,7 @@ static struct sock *sk_lookup(struct net *net, struct bpf_sock_tuple *tuple, struct in6_addr *dst6 = (struct in6_addr *)&tuple->ipv6.daddr; if (proto == IPPROTO_TCP) - sk = __inet6_lookup(net, &tcp_hashinfo, NULL, 0, + sk = __inet6_lookup(net, hinfo, NULL, 0, src6, tuple->ipv6.sport, dst6, ntohs(tuple->ipv6.dport), dif, sdif, &refcounted); @@ -7667,34 +7611,23 @@ const struct bpf_func_proto bpf_sk_storage_get_cg_sock_proto __weak; static const struct bpf_func_proto * sock_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { + const struct bpf_func_proto *func_proto; + + func_proto = cgroup_common_func_proto(func_id, prog); + if (func_proto) + return func_proto; + + func_proto = cgroup_current_func_proto(func_id, prog); + if (func_proto) + return func_proto; + switch (func_id) { - /* inet and inet6 sockets are created in a process - * context so there is always a valid uid/gid - */ - case BPF_FUNC_get_current_uid_gid: - return &bpf_get_current_uid_gid_proto; - case BPF_FUNC_get_local_storage: - return &bpf_get_local_storage_proto; case BPF_FUNC_get_socket_cookie: return &bpf_get_socket_cookie_sock_proto; case BPF_FUNC_get_netns_cookie: return &bpf_get_netns_cookie_sock_proto; case BPF_FUNC_perf_event_output: return &bpf_event_output_data_proto; - case BPF_FUNC_get_current_pid_tgid: - return &bpf_get_current_pid_tgid_proto; - case BPF_FUNC_get_current_comm: - return &bpf_get_current_comm_proto; -#ifdef CONFIG_CGROUPS - case BPF_FUNC_get_current_cgroup_id: - return &bpf_get_current_cgroup_id_proto; - case BPF_FUNC_get_current_ancestor_cgroup_id: - return &bpf_get_current_ancestor_cgroup_id_proto; -#endif -#ifdef CONFIG_CGROUP_NET_CLASSID - case BPF_FUNC_get_cgroup_classid: - return &bpf_get_cgroup_classid_curr_proto; -#endif case BPF_FUNC_sk_storage_get: return &bpf_sk_storage_get_cg_sock_proto; case BPF_FUNC_ktime_get_coarse_ns: @@ -7707,12 +7640,17 @@ sock_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) static const struct bpf_func_proto * sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { + const struct bpf_func_proto *func_proto; + + func_proto = cgroup_common_func_proto(func_id, prog); + if (func_proto) + return func_proto; + + func_proto = cgroup_current_func_proto(func_id, prog); + if (func_proto) + return func_proto; + switch (func_id) { - /* inet and inet6 sockets are created in a process - * context so there is always a valid uid/gid - */ - case BPF_FUNC_get_current_uid_gid: - return &bpf_get_current_uid_gid_proto; case BPF_FUNC_bind: switch (prog->expected_attach_type) { case BPF_CGROUP_INET4_CONNECT: @@ -7725,24 +7663,8 @@ sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_get_socket_cookie_sock_addr_proto; case BPF_FUNC_get_netns_cookie: return &bpf_get_netns_cookie_sock_addr_proto; - case BPF_FUNC_get_local_storage: - return &bpf_get_local_storage_proto; case BPF_FUNC_perf_event_output: return &bpf_event_output_data_proto; - case BPF_FUNC_get_current_pid_tgid: - return &bpf_get_current_pid_tgid_proto; - case BPF_FUNC_get_current_comm: - return &bpf_get_current_comm_proto; -#ifdef CONFIG_CGROUPS - case BPF_FUNC_get_current_cgroup_id: - return &bpf_get_current_cgroup_id_proto; - case BPF_FUNC_get_current_ancestor_cgroup_id: - return &bpf_get_current_ancestor_cgroup_id_proto; -#endif -#ifdef CONFIG_CGROUP_NET_CLASSID - case BPF_FUNC_get_cgroup_classid: - return &bpf_get_cgroup_classid_curr_proto; -#endif #ifdef CONFIG_INET case BPF_FUNC_sk_lookup_tcp: return &bpf_sock_addr_sk_lookup_tcp_proto; @@ -7823,9 +7745,13 @@ const struct bpf_func_proto bpf_sk_storage_delete_proto __weak; static const struct bpf_func_proto * cg_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { + const struct bpf_func_proto *func_proto; + + func_proto = cgroup_common_func_proto(func_id, prog); + if (func_proto) + return func_proto; + switch (func_id) { - case BPF_FUNC_get_local_storage: - return &bpf_get_local_storage_proto; case BPF_FUNC_sk_fullsock: return &bpf_sk_fullsock_proto; case BPF_FUNC_sk_storage_get: @@ -8065,6 +7991,12 @@ const struct bpf_func_proto bpf_sock_hash_update_proto __weak; static const struct bpf_func_proto * sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { + const struct bpf_func_proto *func_proto; + + func_proto = cgroup_common_func_proto(func_id, prog); + if (func_proto) + return func_proto; + switch (func_id) { case BPF_FUNC_setsockopt: return &bpf_sock_ops_setsockopt_proto; @@ -8078,8 +8010,6 @@ sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_sock_hash_update_proto; case BPF_FUNC_get_socket_cookie: return &bpf_get_socket_cookie_sock_ops_proto; - case BPF_FUNC_get_local_storage: - return &bpf_get_local_storage_proto; case BPF_FUNC_perf_event_output: return &bpf_event_output_data_proto; case BPF_FUNC_sk_storage_get: @@ -8714,6 +8644,36 @@ static bool tc_cls_act_is_valid_access(int off, int size, return bpf_skb_is_valid_access(off, size, type, prog, info); } +DEFINE_MUTEX(nf_conn_btf_access_lock); +EXPORT_SYMBOL_GPL(nf_conn_btf_access_lock); + +int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, const struct btf *btf, + const struct btf_type *t, int off, int size, + enum bpf_access_type atype, u32 *next_btf_id, + enum bpf_type_flag *flag); +EXPORT_SYMBOL_GPL(nfct_btf_struct_access); + +static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log, + const struct btf *btf, + const struct btf_type *t, int off, + int size, enum bpf_access_type atype, + u32 *next_btf_id, + enum bpf_type_flag *flag) +{ + int ret = -EACCES; + + if (atype == BPF_READ) + return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, + flag); + + mutex_lock(&nf_conn_btf_access_lock); + if (nfct_btf_struct_access) + ret = nfct_btf_struct_access(log, btf, t, off, size, atype, next_btf_id, flag); + mutex_unlock(&nf_conn_btf_access_lock); + + return ret; +} + static bool __is_valid_xdp_access(int off, int size) { if (off < 0 || off >= sizeof(struct xdp_md)) @@ -8773,6 +8733,27 @@ void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, } EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action); +static int xdp_btf_struct_access(struct bpf_verifier_log *log, + const struct btf *btf, + const struct btf_type *t, int off, + int size, enum bpf_access_type atype, + u32 *next_btf_id, + enum bpf_type_flag *flag) +{ + int ret = -EACCES; + + if (atype == BPF_READ) + return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, + flag); + + mutex_lock(&nf_conn_btf_access_lock); + if (nfct_btf_struct_access) + ret = nfct_btf_struct_access(log, btf, t, off, size, atype, next_btf_id, flag); + mutex_unlock(&nf_conn_btf_access_lock); + + return ret; +} + static bool sock_addr_is_valid_access(int off, int size, enum bpf_access_type type, const struct bpf_prog *prog, @@ -10667,6 +10648,7 @@ const struct bpf_verifier_ops tc_cls_act_verifier_ops = { .convert_ctx_access = tc_cls_act_convert_ctx_access, .gen_prologue = tc_cls_act_prologue, .gen_ld_abs = bpf_gen_ld_abs, + .btf_struct_access = tc_cls_act_btf_struct_access, }; const struct bpf_prog_ops tc_cls_act_prog_ops = { @@ -10678,6 +10660,7 @@ const struct bpf_verifier_ops xdp_verifier_ops = { .is_valid_access = xdp_is_valid_access, .convert_ctx_access = xdp_convert_ctx_access, .gen_prologue = bpf_noop_prologue, + .btf_struct_access = xdp_btf_struct_access, }; const struct bpf_prog_ops xdp_prog_ops = { @@ -10812,14 +10795,13 @@ int sk_detach_filter(struct sock *sk) } EXPORT_SYMBOL_GPL(sk_detach_filter); -int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf, - unsigned int len) +int sk_get_filter(struct sock *sk, sockptr_t optval, unsigned int len) { struct sock_fprog_kern *fprog; struct sk_filter *filter; int ret = 0; - lock_sock(sk); + sockopt_lock_sock(sk); filter = rcu_dereference_protected(sk->sk_filter, lockdep_sock_is_held(sk)); if (!filter) @@ -10844,7 +10826,7 @@ int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf, goto out; ret = -EFAULT; - if (copy_to_user(ubuf, fprog->filter, bpf_classic_proglen(fprog))) + if (copy_to_sockptr(optval, fprog->filter, bpf_classic_proglen(fprog))) goto out; /* Instead of bytes, the API requests to return the number @@ -10852,7 +10834,7 @@ int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf, */ ret = fprog->len; out: - release_sock(sk); + sockopt_release_sock(sk); return ret; } diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 5dc3860e9fc7..25cd35f5922e 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -204,6 +204,30 @@ static void __skb_flow_dissect_icmp(const struct sk_buff *skb, skb_flow_get_icmp_tci(skb, key_icmp, data, thoff, hlen); } +static void __skb_flow_dissect_l2tpv3(const struct sk_buff *skb, + struct flow_dissector *flow_dissector, + void *target_container, const void *data, + int nhoff, int hlen) +{ + struct flow_dissector_key_l2tpv3 *key_l2tpv3; + struct { + __be32 session_id; + } *hdr, _hdr; + + if (!dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_L2TPV3)) + return; + + hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, hlen, &_hdr); + if (!hdr) + return; + + key_l2tpv3 = skb_flow_dissector_target(flow_dissector, + FLOW_DISSECTOR_KEY_L2TPV3, + target_container); + + key_l2tpv3->session_id = hdr->session_id; +} + void skb_flow_dissect_meta(const struct sk_buff *skb, struct flow_dissector *flow_dissector, void *target_container) @@ -866,8 +890,8 @@ static void __skb_flow_bpf_to_target(const struct bpf_flow_keys *flow_keys, } } -bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx, - __be16 proto, int nhoff, int hlen, unsigned int flags) +u32 bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx, + __be16 proto, int nhoff, int hlen, unsigned int flags) { struct bpf_flow_keys *flow_keys = ctx->flow_keys; u32 result; @@ -892,7 +916,7 @@ bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx, flow_keys->thoff = clamp_t(u16, flow_keys->thoff, flow_keys->nhoff, hlen); - return result == BPF_OK; + return result; } static bool is_pppoe_ses_hdr_valid(const struct pppoe_hdr *hdr) @@ -1008,6 +1032,7 @@ bool __skb_flow_dissect(const struct net *net, }; __be16 n_proto = proto; struct bpf_prog *prog; + u32 result; if (skb) { ctx.skb = skb; @@ -1019,13 +1044,16 @@ bool __skb_flow_dissect(const struct net *net, } prog = READ_ONCE(run_array->items[0].prog); - ret = bpf_flow_dissect(prog, &ctx, n_proto, nhoff, - hlen, flags); + result = bpf_flow_dissect(prog, &ctx, n_proto, nhoff, + hlen, flags); + if (result == BPF_FLOW_DISSECTOR_CONTINUE) + goto dissect_continue; __skb_flow_bpf_to_target(&flow_keys, flow_dissector, target_container); rcu_read_unlock(); - return ret; + return result == BPF_OK; } +dissect_continue: rcu_read_unlock(); } @@ -1173,8 +1201,8 @@ proto_again: nhoff += sizeof(*vlan); } - if (dissector_uses_key(flow_dissector, - FLOW_DISSECTOR_KEY_NUM_OF_VLANS)) { + if (dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_NUM_OF_VLANS) && + !(key_control->flags & FLOW_DIS_ENCAPSULATION)) { struct flow_dissector_key_num_of_vlans *key_nvs; key_nvs = skb_flow_dissector_target(flow_dissector, @@ -1497,6 +1525,10 @@ ip_proto_again: __skb_flow_dissect_icmp(skb, flow_dissector, target_container, data, nhoff, hlen); break; + case IPPROTO_L2TP: + __skb_flow_dissect_l2tpv3(skb, flow_dissector, target_container, + data, nhoff, hlen); + break; default: break; diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c index 8cfb63528d18..abe423fd5736 100644 --- a/net/core/flow_offload.c +++ b/net/core/flow_offload.c @@ -237,6 +237,13 @@ void flow_rule_match_pppoe(const struct flow_rule *rule, } EXPORT_SYMBOL(flow_rule_match_pppoe); +void flow_rule_match_l2tpv3(const struct flow_rule *rule, + struct flow_match_l2tpv3 *out) +{ + FLOW_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_L2TPV3, out); +} +EXPORT_SYMBOL(flow_rule_match_l2tpv3); + struct flow_block_cb *flow_block_cb_alloc(flow_setup_cb_t *cb, void *cb_ident, void *cb_priv, void (*release)(void *cb_priv)) diff --git a/net/core/gro.c b/net/core/gro.c index b4190eb08467..bc9451743307 100644 --- a/net/core/gro.c +++ b/net/core/gro.c @@ -160,6 +160,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) unsigned int gro_max_size; unsigned int new_truesize; struct sk_buff *lp; + int segs; /* pairs with WRITE_ONCE() in netif_set_gro_max_size() */ gro_max_size = READ_ONCE(p->dev->gro_max_size); @@ -175,6 +176,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) return -E2BIG; } + segs = NAPI_GRO_CB(skb)->count; lp = NAPI_GRO_CB(p)->last; pinfo = skb_shinfo(lp); @@ -265,7 +267,7 @@ merge: lp = p; done: - NAPI_GRO_CB(p)->count++; + NAPI_GRO_CB(p)->count += segs; p->data_len += len; p->truesize += delta_truesize; p->len += len; @@ -496,8 +498,15 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff BUILD_BUG_ON(!IS_ALIGNED(offsetof(struct napi_gro_cb, zeroed), sizeof(u32))); /* Avoid slow unaligned acc */ *(u32 *)&NAPI_GRO_CB(skb)->zeroed = 0; - NAPI_GRO_CB(skb)->flush = skb_is_gso(skb) || skb_has_frag_list(skb); + NAPI_GRO_CB(skb)->flush = skb_has_frag_list(skb); NAPI_GRO_CB(skb)->is_atomic = 1; + NAPI_GRO_CB(skb)->count = 1; + if (unlikely(skb_is_gso(skb))) { + NAPI_GRO_CB(skb)->count = skb_shinfo(skb)->gso_segs; + /* Only support TCP at the moment. */ + if (!skb_is_gso_tcp(skb)) + NAPI_GRO_CB(skb)->flush = 1; + } /* Setup for GRO checksum validation */ switch (skb->ip_summed) { @@ -545,10 +554,10 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff else gro_list->count++; - NAPI_GRO_CB(skb)->count = 1; NAPI_GRO_CB(skb)->age = jiffies; NAPI_GRO_CB(skb)->last = skb; - skb_shinfo(skb)->gso_size = skb_gro_len(skb); + if (!skb_is_gso(skb)) + skb_shinfo(skb)->gso_size = skb_gro_len(skb); list_add(&skb->list, &gro_list->list); ret = GRO_HELD; @@ -660,6 +669,7 @@ static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb) skb->encapsulation = 0; skb_shinfo(skb)->gso_type = 0; + skb_shinfo(skb)->gso_size = 0; if (unlikely(skb->slow_gro)) { skb_orphan(skb); skb_ext_reset(skb); diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c index 21619c70a82b..ed5ec5de47f6 100644 --- a/net/core/gro_cells.c +++ b/net/core/gro_cells.c @@ -81,8 +81,7 @@ int gro_cells_init(struct gro_cells *gcells, struct net_device *dev) set_bit(NAPI_STATE_NO_BUSY_POLL, &cell->napi.state); - netif_napi_add(dev, &cell->napi, gro_cell_poll, - NAPI_POLL_WEIGHT); + netif_napi_add(dev, &cell->napi, gro_cell_poll); napi_enable(&cell->napi); } return 0; diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c index 9ccd64e8a666..6fac2f0ef074 100644 --- a/net/core/lwtunnel.c +++ b/net/core/lwtunnel.c @@ -50,6 +50,7 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type) return "IOAM6"; case LWTUNNEL_ENCAP_IP6: case LWTUNNEL_ENCAP_IP: + case LWTUNNEL_ENCAP_XFRM: case LWTUNNEL_ENCAP_NONE: case __LWTUNNEL_ENCAP_MAX: /* should not have got here */ diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 78cc8fb68814..e93edb810103 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1853,9 +1853,6 @@ static struct neigh_table *neigh_find_table(int family) case AF_INET6: tbl = neigh_tables[NEIGH_ND_TABLE]; break; - case AF_DECnet: - tbl = neigh_tables[NEIGH_DN_TABLE]; - break; } return tbl; diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index d61afd21aab5..8409d41405df 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -59,7 +59,7 @@ static ssize_t netdev_show(const struct device *dev, #define NETDEVICE_SHOW(field, format_string) \ static ssize_t format_##field(const struct net_device *dev, char *buf) \ { \ - return sprintf(buf, format_string, dev->field); \ + return sysfs_emit(buf, format_string, dev->field); \ } \ static ssize_t field##_show(struct device *dev, \ struct device_attribute *attr, char *buf) \ @@ -118,13 +118,13 @@ static ssize_t iflink_show(struct device *dev, struct device_attribute *attr, { struct net_device *ndev = to_net_dev(dev); - return sprintf(buf, fmt_dec, dev_get_iflink(ndev)); + return sysfs_emit(buf, fmt_dec, dev_get_iflink(ndev)); } static DEVICE_ATTR_RO(iflink); static ssize_t format_name_assign_type(const struct net_device *dev, char *buf) { - return sprintf(buf, fmt_dec, dev->name_assign_type); + return sysfs_emit(buf, fmt_dec, dev->name_assign_type); } static ssize_t name_assign_type_show(struct device *dev, @@ -194,7 +194,7 @@ static ssize_t carrier_show(struct device *dev, struct net_device *netdev = to_net_dev(dev); if (netif_running(netdev)) - return sprintf(buf, fmt_dec, !!netif_carrier_ok(netdev)); + return sysfs_emit(buf, fmt_dec, !!netif_carrier_ok(netdev)); return -EINVAL; } @@ -219,7 +219,7 @@ static ssize_t speed_show(struct device *dev, struct ethtool_link_ksettings cmd; if (!__ethtool_get_link_ksettings(netdev, &cmd)) - ret = sprintf(buf, fmt_dec, cmd.base.speed); + ret = sysfs_emit(buf, fmt_dec, cmd.base.speed); } rtnl_unlock(); return ret; @@ -258,7 +258,7 @@ static ssize_t duplex_show(struct device *dev, duplex = "unknown"; break; } - ret = sprintf(buf, "%s\n", duplex); + ret = sysfs_emit(buf, "%s\n", duplex); } } rtnl_unlock(); @@ -272,7 +272,7 @@ static ssize_t testing_show(struct device *dev, struct net_device *netdev = to_net_dev(dev); if (netif_running(netdev)) - return sprintf(buf, fmt_dec, !!netif_testing(netdev)); + return sysfs_emit(buf, fmt_dec, !!netif_testing(netdev)); return -EINVAL; } @@ -284,7 +284,7 @@ static ssize_t dormant_show(struct device *dev, struct net_device *netdev = to_net_dev(dev); if (netif_running(netdev)) - return sprintf(buf, fmt_dec, !!netif_dormant(netdev)); + return sysfs_emit(buf, fmt_dec, !!netif_dormant(netdev)); return -EINVAL; } @@ -315,7 +315,7 @@ static ssize_t operstate_show(struct device *dev, if (operstate >= ARRAY_SIZE(operstates)) return -EINVAL; /* should not happen */ - return sprintf(buf, "%s\n", operstates[operstate]); + return sysfs_emit(buf, "%s\n", operstates[operstate]); } static DEVICE_ATTR_RO(operstate); @@ -325,9 +325,9 @@ static ssize_t carrier_changes_show(struct device *dev, { struct net_device *netdev = to_net_dev(dev); - return sprintf(buf, fmt_dec, - atomic_read(&netdev->carrier_up_count) + - atomic_read(&netdev->carrier_down_count)); + return sysfs_emit(buf, fmt_dec, + atomic_read(&netdev->carrier_up_count) + + atomic_read(&netdev->carrier_down_count)); } static DEVICE_ATTR_RO(carrier_changes); @@ -337,7 +337,7 @@ static ssize_t carrier_up_count_show(struct device *dev, { struct net_device *netdev = to_net_dev(dev); - return sprintf(buf, fmt_dec, atomic_read(&netdev->carrier_up_count)); + return sysfs_emit(buf, fmt_dec, atomic_read(&netdev->carrier_up_count)); } static DEVICE_ATTR_RO(carrier_up_count); @@ -347,7 +347,7 @@ static ssize_t carrier_down_count_show(struct device *dev, { struct net_device *netdev = to_net_dev(dev); - return sprintf(buf, fmt_dec, atomic_read(&netdev->carrier_down_count)); + return sysfs_emit(buf, fmt_dec, atomic_read(&netdev->carrier_down_count)); } static DEVICE_ATTR_RO(carrier_down_count); @@ -462,7 +462,7 @@ static ssize_t ifalias_show(struct device *dev, ret = dev_get_alias(netdev, tmp, sizeof(tmp)); if (ret > 0) - ret = sprintf(buf, "%s\n", tmp); + ret = sysfs_emit(buf, "%s\n", tmp); return ret; } static DEVICE_ATTR_RW(ifalias); @@ -514,7 +514,7 @@ static ssize_t phys_port_id_show(struct device *dev, ret = dev_get_phys_port_id(netdev, &ppid); if (!ret) - ret = sprintf(buf, "%*phN\n", ppid.id_len, ppid.id); + ret = sysfs_emit(buf, "%*phN\n", ppid.id_len, ppid.id); } rtnl_unlock(); @@ -543,7 +543,7 @@ static ssize_t phys_port_name_show(struct device *dev, ret = dev_get_phys_port_name(netdev, name, sizeof(name)); if (!ret) - ret = sprintf(buf, "%s\n", name); + ret = sysfs_emit(buf, "%s\n", name); } rtnl_unlock(); @@ -573,7 +573,7 @@ static ssize_t phys_switch_id_show(struct device *dev, ret = dev_get_port_parent_id(netdev, &ppid, false); if (!ret) - ret = sprintf(buf, "%*phN\n", ppid.id_len, ppid.id); + ret = sysfs_emit(buf, "%*phN\n", ppid.id_len, ppid.id); } rtnl_unlock(); @@ -591,7 +591,7 @@ static ssize_t threaded_show(struct device *dev, return restart_syscall(); if (dev_isalive(netdev)) - ret = sprintf(buf, fmt_dec, netdev->threaded); + ret = sysfs_emit(buf, fmt_dec, netdev->threaded); rtnl_unlock(); return ret; @@ -673,7 +673,7 @@ static ssize_t netstat_show(const struct device *d, struct rtnl_link_stats64 temp; const struct rtnl_link_stats64 *stats = dev_get_stats(dev, &temp); - ret = sprintf(buf, fmt_u64, *(u64 *)(((u8 *)stats) + offset)); + ret = sysfs_emit(buf, fmt_u64, *(u64 *)(((u8 *)stats) + offset)); } read_unlock(&dev_base_lock); return ret; @@ -824,7 +824,7 @@ static ssize_t show_rps_map(struct netdev_rx_queue *queue, char *buf) for (i = 0; i < map->len; i++) cpumask_set_cpu(map->cpus[i], mask); - len = snprintf(buf, PAGE_SIZE, "%*pb\n", cpumask_pr_args(mask)); + len = sysfs_emit(buf, "%*pb\n", cpumask_pr_args(mask)); rcu_read_unlock(); free_cpumask_var(mask); @@ -910,7 +910,7 @@ static ssize_t show_rps_dev_flow_table_cnt(struct netdev_rx_queue *queue, val = (unsigned long)flow_table->mask + 1; rcu_read_unlock(); - return sprintf(buf, "%lu\n", val); + return sysfs_emit(buf, "%lu\n", val); } static void rps_dev_flow_table_release(struct rcu_head *rcu) @@ -1208,7 +1208,7 @@ static ssize_t tx_timeout_show(struct netdev_queue *queue, char *buf) { unsigned long trans_timeout = atomic_long_read(&queue->trans_timeout); - return sprintf(buf, fmt_ulong, trans_timeout); + return sysfs_emit(buf, fmt_ulong, trans_timeout); } static unsigned int get_netdev_queue_index(struct netdev_queue *queue) @@ -1255,15 +1255,15 @@ static ssize_t traffic_class_show(struct netdev_queue *queue, * belongs to the root device it will be reported with just the * traffic class, so just "0" for TC 0 for example. */ - return num_tc < 0 ? sprintf(buf, "%d%d\n", tc, num_tc) : - sprintf(buf, "%d\n", tc); + return num_tc < 0 ? sysfs_emit(buf, "%d%d\n", tc, num_tc) : + sysfs_emit(buf, "%d\n", tc); } #ifdef CONFIG_XPS static ssize_t tx_maxrate_show(struct netdev_queue *queue, char *buf) { - return sprintf(buf, "%lu\n", queue->tx_maxrate); + return sysfs_emit(buf, "%lu\n", queue->tx_maxrate); } static ssize_t tx_maxrate_store(struct netdev_queue *queue, @@ -1317,7 +1317,7 @@ static struct netdev_queue_attribute queue_traffic_class __ro_after_init */ static ssize_t bql_show(char *buf, unsigned int value) { - return sprintf(buf, "%u\n", value); + return sysfs_emit(buf, "%u\n", value); } static ssize_t bql_set(const char *buf, const size_t count, @@ -1346,7 +1346,7 @@ static ssize_t bql_show_hold_time(struct netdev_queue *queue, { struct dql *dql = &queue->dql; - return sprintf(buf, "%u\n", jiffies_to_msecs(dql->slack_hold_time)); + return sysfs_emit(buf, "%u\n", jiffies_to_msecs(dql->slack_hold_time)); } static ssize_t bql_set_hold_time(struct netdev_queue *queue, @@ -1374,7 +1374,7 @@ static ssize_t bql_show_inflight(struct netdev_queue *queue, { struct dql *dql = &queue->dql; - return sprintf(buf, "%u\n", dql->num_queued - dql->num_completed); + return sysfs_emit(buf, "%u\n", dql->num_queued - dql->num_completed); } static struct netdev_queue_attribute bql_inflight_attribute __ro_after_init = diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c index 1a6a86693b74..d6a70aeaa503 100644 --- a/net/core/netclassid_cgroup.c +++ b/net/core/netclassid_cgroup.c @@ -66,7 +66,7 @@ struct update_classid_context { #define UPDATE_CLASSID_BATCH 1000 -static int update_classid_sock(const void *v, struct file *file, unsigned n) +static int update_classid_sock(const void *v, struct file *file, unsigned int n) { struct update_classid_context *ctx = (void *)v; struct socket *sock = sock_from_file(file); diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 5d27067b72d5..9be762e1d042 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -556,7 +556,7 @@ int netpoll_parse_options(struct netpoll *np, char *opt) if ((delim = strchr(cur, ',')) == NULL) goto parse_failed; *delim = 0; - strlcpy(np->dev_name, cur, sizeof(np->dev_name)); + strscpy(np->dev_name, cur, sizeof(np->dev_name)); cur = delim; } cur++; @@ -610,7 +610,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev) int err; np->dev = ndev; - strlcpy(np->dev_name, ndev->name, IFNAMSIZ); + strscpy(np->dev_name, ndev->name, IFNAMSIZ); if (ndev->priv_flags & IFF_DISABLE_NETPOLL) { np_err(np, "%s doesn't support polling, aborting\n", diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 4b5b15c684ed..74864dc46a7e 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -866,14 +866,12 @@ static void set_operstate(struct net_device *dev, unsigned char transition) break; case IF_OPER_TESTING: - if (operstate == IF_OPER_UP || - operstate == IF_OPER_UNKNOWN) + if (netif_oper_up(dev)) operstate = IF_OPER_TESTING; break; case IF_OPER_DORMANT: - if (operstate == IF_OPER_UP || - operstate == IF_OPER_UNKNOWN) + if (netif_oper_up(dev)) operstate = IF_OPER_DORMANT; break; } @@ -1059,6 +1057,7 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev, + nla_total_size(4) /* IFLA_MASTER */ + nla_total_size(1) /* IFLA_CARRIER */ + nla_total_size(4) /* IFLA_PROMISCUITY */ + + nla_total_size(4) /* IFLA_ALLMULTI */ + nla_total_size(4) /* IFLA_NUM_TX_QUEUES */ + nla_total_size(4) /* IFLA_NUM_RX_QUEUES */ + nla_total_size(4) /* IFLA_GSO_MAX_SEGS */ @@ -1767,6 +1766,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, nla_put_u32(skb, IFLA_MAX_MTU, dev->max_mtu) || nla_put_u32(skb, IFLA_GROUP, dev->group) || nla_put_u32(skb, IFLA_PROMISCUITY, dev->promiscuity) || + nla_put_u32(skb, IFLA_ALLMULTI, dev->allmulti) || nla_put_u32(skb, IFLA_NUM_TX_QUEUES, dev->num_tx_queues) || nla_put_u32(skb, IFLA_GSO_MAX_SEGS, dev->gso_max_segs) || nla_put_u32(skb, IFLA_GSO_MAX_SIZE, dev->gso_max_size) || @@ -1928,6 +1928,7 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = { [IFLA_GRO_MAX_SIZE] = { .type = NLA_U32 }, [IFLA_TSO_MAX_SIZE] = { .type = NLA_REJECT }, [IFLA_TSO_MAX_SEGS] = { .type = NLA_REJECT }, + [IFLA_ALLMULTI] = { .type = NLA_REJECT }, }; static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = { @@ -2776,13 +2777,6 @@ static int do_setlink(const struct sk_buff *skb, call_netdevice_notifiers(NETDEV_CHANGEADDR, dev); } - if (ifm->ifi_flags || ifm->ifi_change) { - err = dev_change_flags(dev, rtnl_dev_combine_flags(dev, ifm), - extack); - if (err < 0) - goto errout; - } - if (tb[IFLA_MASTER]) { err = do_set_master(dev, nla_get_u32(tb[IFLA_MASTER]), extack); if (err) @@ -2790,6 +2784,13 @@ static int do_setlink(const struct sk_buff *skb, status |= DO_SETLINK_MODIFIED; } + if (ifm->ifi_flags || ifm->ifi_change) { + err = dev_change_flags(dev, rtnl_dev_combine_flags(dev, ifm), + extack); + if (err < 0) + goto errout; + } + if (tb[IFLA_CARRIER]) { err = dev_change_carrier(dev, nla_get_u8(tb[IFLA_CARRIER])); if (err) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 417463da4fac..1d9719e72f9d 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -134,8 +134,66 @@ static void skb_under_panic(struct sk_buff *skb, unsigned int sz, void *addr) #define NAPI_SKB_CACHE_BULK 16 #define NAPI_SKB_CACHE_HALF (NAPI_SKB_CACHE_SIZE / 2) +#if PAGE_SIZE == SZ_4K + +#define NAPI_HAS_SMALL_PAGE_FRAG 1 +#define NAPI_SMALL_PAGE_PFMEMALLOC(nc) ((nc).pfmemalloc) + +/* specialized page frag allocator using a single order 0 page + * and slicing it into 1K sized fragment. Constrained to systems + * with a very limited amount of 1K fragments fitting a single + * page - to avoid excessive truesize underestimation + */ + +struct page_frag_1k { + void *va; + u16 offset; + bool pfmemalloc; +}; + +static void *page_frag_alloc_1k(struct page_frag_1k *nc, gfp_t gfp) +{ + struct page *page; + int offset; + + offset = nc->offset - SZ_1K; + if (likely(offset >= 0)) + goto use_frag; + + page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); + if (!page) + return NULL; + + nc->va = page_address(page); + nc->pfmemalloc = page_is_pfmemalloc(page); + offset = PAGE_SIZE - SZ_1K; + page_ref_add(page, offset / SZ_1K); + +use_frag: + nc->offset = offset; + return nc->va + offset; +} +#else + +/* the small page is actually unused in this build; add dummy helpers + * to please the compiler and avoid later preprocessor's conditionals + */ +#define NAPI_HAS_SMALL_PAGE_FRAG 0 +#define NAPI_SMALL_PAGE_PFMEMALLOC(nc) false + +struct page_frag_1k { +}; + +static void *page_frag_alloc_1k(struct page_frag_1k *nc, gfp_t gfp_mask) +{ + return NULL; +} + +#endif + struct napi_alloc_cache { struct page_frag_cache page; + struct page_frag_1k page_small; unsigned int skb_count; void *skb_cache[NAPI_SKB_CACHE_SIZE]; }; @@ -143,6 +201,23 @@ struct napi_alloc_cache { static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache); static DEFINE_PER_CPU(struct napi_alloc_cache, napi_alloc_cache); +/* Double check that napi_get_frags() allocates skbs with + * skb->head being backed by slab, not a page fragment. + * This is to make sure bug fixed in 3226b158e67c + * ("net: avoid 32 x truesize under-estimation for tiny skbs") + * does not accidentally come back. + */ +void napi_get_frags_check(struct napi_struct *napi) +{ + struct sk_buff *skb; + + local_bh_disable(); + skb = napi_get_frags(napi); + WARN_ON_ONCE(!NAPI_HAS_SMALL_PAGE_FRAG && skb && skb->head_frag); + napi_free_frags(napi); + local_bh_enable(); +} + void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) { struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); @@ -561,6 +636,7 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, { struct napi_alloc_cache *nc; struct sk_buff *skb; + bool pfmemalloc; void *data; DEBUG_NET_WARN_ON_ONCE(!in_softirq()); @@ -568,8 +644,10 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, /* If requested length is either too small or too big, * we use kmalloc() for skb->head allocation. + * When the small frag allocator is available, prefer it over kmalloc + * for small fragments */ - if (len <= SKB_WITH_OVERHEAD(1024) || + if ((!NAPI_HAS_SMALL_PAGE_FRAG && len <= SKB_WITH_OVERHEAD(1024)) || len > SKB_WITH_OVERHEAD(PAGE_SIZE) || (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) { skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX | SKB_ALLOC_NAPI, @@ -580,13 +658,33 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, } nc = this_cpu_ptr(&napi_alloc_cache); - len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - len = SKB_DATA_ALIGN(len); if (sk_memalloc_socks()) gfp_mask |= __GFP_MEMALLOC; - data = page_frag_alloc(&nc->page, len, gfp_mask); + if (NAPI_HAS_SMALL_PAGE_FRAG && len <= SKB_WITH_OVERHEAD(1024)) { + /* we are artificially inflating the allocation size, but + * that is not as bad as it may look like, as: + * - 'len' less than GRO_MAX_HEAD makes little sense + * - On most systems, larger 'len' values lead to fragment + * size above 512 bytes + * - kmalloc would use the kmalloc-1k slab for such values + * - Builds with smaller GRO_MAX_HEAD will very likely do + * little networking, as that implies no WiFi and no + * tunnels support, and 32 bits arches. + */ + len = SZ_1K; + + data = page_frag_alloc_1k(&nc->page_small, gfp_mask); + pfmemalloc = NAPI_SMALL_PAGE_PFMEMALLOC(nc->page_small); + } else { + len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + len = SKB_DATA_ALIGN(len); + + data = page_frag_alloc(&nc->page, len, gfp_mask); + pfmemalloc = nc->page.pfmemalloc; + } + if (unlikely(!data)) return NULL; @@ -596,7 +694,7 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, return NULL; } - if (nc->page.pfmemalloc) + if (pfmemalloc) skb->pfmemalloc = 1; skb->head_frag = 1; @@ -781,9 +879,10 @@ EXPORT_SYMBOL(__kfree_skb); * hit zero. Meanwhile, pass the drop reason to 'kfree_skb' * tracepoint. */ -void kfree_skb_reason(struct sk_buff *skb, enum skb_drop_reason reason) +void __fix_address +kfree_skb_reason(struct sk_buff *skb, enum skb_drop_reason reason) { - if (!skb_unref(skb)) + if (unlikely(!skb_unref(skb))) return; DEBUG_NET_WARN_ON_ONCE(reason <= 0 || reason >= SKB_DROP_REASON_MAX); @@ -1187,7 +1286,7 @@ EXPORT_SYMBOL_GPL(mm_unaccount_pinned_pages); static struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size) { - struct ubuf_info *uarg; + struct ubuf_info_msgzc *uarg; struct sk_buff *skb; WARN_ON_ONCE(!in_task()); @@ -1205,19 +1304,19 @@ static struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size) return NULL; } - uarg->callback = msg_zerocopy_callback; + uarg->ubuf.callback = msg_zerocopy_callback; uarg->id = ((u32)atomic_inc_return(&sk->sk_zckey)) - 1; uarg->len = 1; uarg->bytelen = size; uarg->zerocopy = 1; - uarg->flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN; - refcount_set(&uarg->refcnt, 1); + uarg->ubuf.flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN; + refcount_set(&uarg->ubuf.refcnt, 1); sock_hold(sk); - return uarg; + return &uarg->ubuf; } -static inline struct sk_buff *skb_from_uarg(struct ubuf_info *uarg) +static inline struct sk_buff *skb_from_uarg(struct ubuf_info_msgzc *uarg) { return container_of((void *)uarg, struct sk_buff, cb); } @@ -1226,6 +1325,7 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size, struct ubuf_info *uarg) { if (uarg) { + struct ubuf_info_msgzc *uarg_zc; const u32 byte_limit = 1 << 19; /* limit to a few TSO */ u32 bytelen, next; @@ -1241,8 +1341,9 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size, return NULL; } - bytelen = uarg->bytelen + size; - if (uarg->len == USHRT_MAX - 1 || bytelen > byte_limit) { + uarg_zc = uarg_to_msgzc(uarg); + bytelen = uarg_zc->bytelen + size; + if (uarg_zc->len == USHRT_MAX - 1 || bytelen > byte_limit) { /* TCP can create new skb to attach new uarg */ if (sk->sk_type == SOCK_STREAM) goto new_alloc; @@ -1250,11 +1351,11 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size, } next = (u32)atomic_read(&sk->sk_zckey); - if ((u32)(uarg->id + uarg->len) == next) { - if (mm_account_pinned_pages(&uarg->mmp, size)) + if ((u32)(uarg_zc->id + uarg_zc->len) == next) { + if (mm_account_pinned_pages(&uarg_zc->mmp, size)) return NULL; - uarg->len++; - uarg->bytelen = bytelen; + uarg_zc->len++; + uarg_zc->bytelen = bytelen; atomic_set(&sk->sk_zckey, ++next); /* no extra ref when appending to datagram (MSG_MORE) */ @@ -1290,7 +1391,7 @@ static bool skb_zerocopy_notify_extend(struct sk_buff *skb, u32 lo, u16 len) return true; } -static void __msg_zerocopy_callback(struct ubuf_info *uarg) +static void __msg_zerocopy_callback(struct ubuf_info_msgzc *uarg) { struct sk_buff *tail, *skb = skb_from_uarg(uarg); struct sock_exterr_skb *serr; @@ -1343,19 +1444,21 @@ release: void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg, bool success) { - uarg->zerocopy = uarg->zerocopy & success; + struct ubuf_info_msgzc *uarg_zc = uarg_to_msgzc(uarg); + + uarg_zc->zerocopy = uarg_zc->zerocopy & success; if (refcount_dec_and_test(&uarg->refcnt)) - __msg_zerocopy_callback(uarg); + __msg_zerocopy_callback(uarg_zc); } EXPORT_SYMBOL_GPL(msg_zerocopy_callback); void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref) { - struct sock *sk = skb_from_uarg(uarg)->sk; + struct sock *sk = skb_from_uarg(uarg_to_msgzc(uarg))->sk; atomic_dec(&sk->sk_zckey); - uarg->len--; + uarg_to_msgzc(uarg)->len--; if (have_uref) msg_zerocopy_callback(NULL, uarg, true); diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 188f8558d27d..ca70525621c7 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -434,8 +434,10 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, if (copied + copy > len) copy = len - copied; copy = copy_page_to_iter(page, sge->offset, copy, iter); - if (!copy) - return copied ? copied : -EFAULT; + if (!copy) { + copied = copied ? copied : -EFAULT; + goto out; + } copied += copy; if (likely(!peek)) { @@ -455,7 +457,7 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, * didn't copy the entire length lets just break. */ if (copy != sge->length) - return copied; + goto out; sk_msg_iter_var_next(i); } @@ -477,7 +479,9 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, } msg_rx = sk_psock_peek_msg(psock); } - +out: + if (psock->work_state.skb && copied > 0) + schedule_work(&psock->work); return copied; } EXPORT_SYMBOL_GPL(sk_msg_recvmsg); diff --git a/net/core/sock.c b/net/core/sock.c index 788c1372663c..eeb6cbac6f49 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -703,15 +703,17 @@ static int sock_setbindtodevice(struct sock *sk, sockptr_t optval, int optlen) goto out; } - return sock_bindtoindex(sk, index, true); + sockopt_lock_sock(sk); + ret = sock_bindtoindex_locked(sk, index); + sockopt_release_sock(sk); out: #endif return ret; } -static int sock_getbindtodevice(struct sock *sk, char __user *optval, - int __user *optlen, int len) +static int sock_getbindtodevice(struct sock *sk, sockptr_t optval, + sockptr_t optlen, int len) { int ret = -ENOPROTOOPT; #ifdef CONFIG_NETDEVICES @@ -735,12 +737,12 @@ static int sock_getbindtodevice(struct sock *sk, char __user *optval, len = strlen(devname) + 1; ret = -EFAULT; - if (copy_to_user(optval, devname, len)) + if (copy_to_sockptr(optval, devname, len)) goto out; zero: ret = -EFAULT; - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) goto out; ret = 0; @@ -1036,17 +1038,51 @@ static int sock_reserve_memory(struct sock *sk, int bytes) return 0; } +void sockopt_lock_sock(struct sock *sk) +{ + /* When current->bpf_ctx is set, the setsockopt is called from + * a bpf prog. bpf has ensured the sk lock has been + * acquired before calling setsockopt(). + */ + if (has_current_bpf_ctx()) + return; + + lock_sock(sk); +} +EXPORT_SYMBOL(sockopt_lock_sock); + +void sockopt_release_sock(struct sock *sk) +{ + if (has_current_bpf_ctx()) + return; + + release_sock(sk); +} +EXPORT_SYMBOL(sockopt_release_sock); + +bool sockopt_ns_capable(struct user_namespace *ns, int cap) +{ + return has_current_bpf_ctx() || ns_capable(ns, cap); +} +EXPORT_SYMBOL(sockopt_ns_capable); + +bool sockopt_capable(int cap) +{ + return has_current_bpf_ctx() || capable(cap); +} +EXPORT_SYMBOL(sockopt_capable); + /* * This is meant for all protocols to use and covers goings on * at the socket level. Everything here is generic. */ -int sock_setsockopt(struct socket *sock, int level, int optname, - sockptr_t optval, unsigned int optlen) +int sk_setsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, unsigned int optlen) { struct so_timestamping timestamping; + struct socket *sock = sk->sk_socket; struct sock_txtime sk_txtime; - struct sock *sk = sock->sk; int val; int valbool; struct linger ling; @@ -1067,11 +1103,11 @@ int sock_setsockopt(struct socket *sock, int level, int optname, valbool = val ? 1 : 0; - lock_sock(sk); + sockopt_lock_sock(sk); switch (optname) { case SO_DEBUG: - if (val && !capable(CAP_NET_ADMIN)) + if (val && !sockopt_capable(CAP_NET_ADMIN)) ret = -EACCES; else sock_valbool_flag(sk, SOCK_DBG, valbool); @@ -1115,7 +1151,7 @@ set_sndbuf: break; case SO_SNDBUFFORCE: - if (!capable(CAP_NET_ADMIN)) { + if (!sockopt_capable(CAP_NET_ADMIN)) { ret = -EPERM; break; } @@ -1137,7 +1173,7 @@ set_sndbuf: break; case SO_RCVBUFFORCE: - if (!capable(CAP_NET_ADMIN)) { + if (!sockopt_capable(CAP_NET_ADMIN)) { ret = -EPERM; break; } @@ -1164,8 +1200,8 @@ set_sndbuf: case SO_PRIORITY: if ((val >= 0 && val <= 6) || - ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) || - ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) + sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) || + sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) sk->sk_priority = val; else ret = -EPERM; @@ -1228,7 +1264,7 @@ set_sndbuf: case SO_RCVLOWAT: if (val < 0) val = INT_MAX; - if (sock->ops->set_rcvlowat) + if (sock && sock->ops->set_rcvlowat) ret = sock->ops->set_rcvlowat(sk, val); else WRITE_ONCE(sk->sk_rcvlowat, val ? : 1); @@ -1310,8 +1346,8 @@ set_sndbuf: clear_bit(SOCK_PASSSEC, &sock->flags); break; case SO_MARK: - if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) && - !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { + if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) && + !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { ret = -EPERM; break; } @@ -1319,8 +1355,8 @@ set_sndbuf: __sock_set_mark(sk, val); break; case SO_RCVMARK: - if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) && - !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { + if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) && + !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { ret = -EPERM; break; } @@ -1354,7 +1390,7 @@ set_sndbuf: #ifdef CONFIG_NET_RX_BUSY_POLL case SO_BUSY_POLL: /* allow unprivileged users to decrease the value */ - if ((val > sk->sk_ll_usec) && !capable(CAP_NET_ADMIN)) + if ((val > sk->sk_ll_usec) && !sockopt_capable(CAP_NET_ADMIN)) ret = -EPERM; else { if (val < 0) @@ -1364,13 +1400,13 @@ set_sndbuf: } break; case SO_PREFER_BUSY_POLL: - if (valbool && !capable(CAP_NET_ADMIN)) + if (valbool && !sockopt_capable(CAP_NET_ADMIN)) ret = -EPERM; else WRITE_ONCE(sk->sk_prefer_busy_poll, valbool); break; case SO_BUSY_POLL_BUDGET: - if (val > READ_ONCE(sk->sk_busy_poll_budget) && !capable(CAP_NET_ADMIN)) { + if (val > READ_ONCE(sk->sk_busy_poll_budget) && !sockopt_capable(CAP_NET_ADMIN)) { ret = -EPERM; } else { if (val < 0 || val > U16_MAX) @@ -1441,7 +1477,7 @@ set_sndbuf: * scheduler has enough safe guards. */ if (sk_txtime.clockid != CLOCK_MONOTONIC && - !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { + !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { ret = -EPERM; break; } @@ -1496,9 +1532,16 @@ set_sndbuf: ret = -ENOPROTOOPT; break; } - release_sock(sk); + sockopt_release_sock(sk); return ret; } + +int sock_setsockopt(struct socket *sock, int level, int optname, + sockptr_t optval, unsigned int optlen) +{ + return sk_setsockopt(sock->sk, level, optname, + optval, optlen); +} EXPORT_SYMBOL(sock_setsockopt); static const struct cred *sk_get_peer_cred(struct sock *sk) @@ -1525,22 +1568,25 @@ static void cred_to_ucred(struct pid *pid, const struct cred *cred, } } -static int groups_to_user(gid_t __user *dst, const struct group_info *src) +static int groups_to_user(sockptr_t dst, const struct group_info *src) { struct user_namespace *user_ns = current_user_ns(); int i; - for (i = 0; i < src->ngroups; i++) - if (put_user(from_kgid_munged(user_ns, src->gid[i]), dst + i)) + for (i = 0; i < src->ngroups; i++) { + gid_t gid = from_kgid_munged(user_ns, src->gid[i]); + + if (copy_to_sockptr_offset(dst, i * sizeof(gid), &gid, sizeof(gid))) return -EFAULT; + } return 0; } -int sock_getsockopt(struct socket *sock, int level, int optname, - char __user *optval, int __user *optlen) +int sk_getsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, sockptr_t optlen) { - struct sock *sk = sock->sk; + struct socket *sock = sk->sk_socket; union { int val; @@ -1557,7 +1603,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname, int lv = sizeof(int); int len; - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; if (len < 0) return -EINVAL; @@ -1692,7 +1738,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname, cred_to_ucred(sk->sk_peer_pid, sk->sk_peer_cred, &peercred); spin_unlock(&sk->sk_peer_lock); - if (copy_to_user(optval, &peercred, len)) + if (copy_to_sockptr(optval, &peercred, len)) return -EFAULT; goto lenout; } @@ -1710,11 +1756,11 @@ int sock_getsockopt(struct socket *sock, int level, int optname, if (len < n * sizeof(gid_t)) { len = n * sizeof(gid_t); put_cred(cred); - return put_user(len, optlen) ? -EFAULT : -ERANGE; + return copy_to_sockptr(optlen, &len, sizeof(int)) ? -EFAULT : -ERANGE; } len = n * sizeof(gid_t); - ret = groups_to_user((gid_t __user *)optval, cred->group_info); + ret = groups_to_user(optval, cred->group_info); put_cred(cred); if (ret) return ret; @@ -1730,7 +1776,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname, return -ENOTCONN; if (lv < len) return -EINVAL; - if (copy_to_user(optval, address, len)) + if (copy_to_sockptr(optval, address, len)) return -EFAULT; goto lenout; } @@ -1747,7 +1793,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname, break; case SO_PEERSEC: - return security_socket_getpeersec_stream(sock, optval, optlen, len); + return security_socket_getpeersec_stream(sock, optval.user, optlen.user, len); case SO_MARK: v.val = sk->sk_mark; @@ -1779,7 +1825,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname, return sock_getbindtodevice(sk, optval, optlen, len); case SO_GET_FILTER: - len = sk_get_filter(sk, (struct sock_filter __user *)optval, len); + len = sk_get_filter(sk, optval, len); if (len < 0) return len; @@ -1827,7 +1873,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname, sk_get_meminfo(sk, meminfo); len = min_t(unsigned int, len, sizeof(meminfo)); - if (copy_to_user(optval, &meminfo, len)) + if (copy_to_sockptr(optval, &meminfo, len)) return -EFAULT; goto lenout; @@ -1896,14 +1942,22 @@ int sock_getsockopt(struct socket *sock, int level, int optname, if (len > lv) len = lv; - if (copy_to_user(optval, &v, len)) + if (copy_to_sockptr(optval, &v, len)) return -EFAULT; lenout: - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; return 0; } +int sock_getsockopt(struct socket *sock, int level, int optname, + char __user *optval, int __user *optlen) +{ + return sk_getsockopt(sock->sk, level, optname, + USER_SOCKPTR(optval), + USER_SOCKPTR(optlen)); +} + /* * Initialize an sk_lock. * diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 9a9fb9487d63..a660baedd9e7 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -41,7 +41,7 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr) attr->map_flags & ~SOCK_CREATE_FLAG_MASK) return ERR_PTR(-EINVAL); - stab = kzalloc(sizeof(*stab), GFP_USER | __GFP_ACCOUNT); + stab = bpf_map_area_alloc(sizeof(*stab), NUMA_NO_NODE); if (!stab) return ERR_PTR(-ENOMEM); @@ -52,7 +52,7 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr) sizeof(struct sock *), stab->map.numa_node); if (!stab->sks) { - kfree(stab); + bpf_map_area_free(stab); return ERR_PTR(-ENOMEM); } @@ -361,7 +361,7 @@ static void sock_map_free(struct bpf_map *map) synchronize_rcu(); bpf_map_area_free(stab->sks); - kfree(stab); + bpf_map_area_free(stab); } static void sock_map_release_progs(struct bpf_map *map) @@ -1085,7 +1085,7 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr) if (attr->key_size > MAX_BPF_STACK) return ERR_PTR(-E2BIG); - htab = kzalloc(sizeof(*htab), GFP_USER | __GFP_ACCOUNT); + htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE); if (!htab) return ERR_PTR(-ENOMEM); @@ -1115,7 +1115,7 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr) return &htab->map; free_htab: - kfree(htab); + bpf_map_area_free(htab); return ERR_PTR(err); } @@ -1168,7 +1168,7 @@ static void sock_hash_free(struct bpf_map *map) synchronize_rcu(); bpf_map_area_free(htab->buckets); - kfree(htab); + bpf_map_area_free(htab); } static void *sock_hash_lookup_sys(struct bpf_map *map, void *key) diff --git a/net/core/stream.c b/net/core/stream.c index ccc083cdef23..1105057ce00a 100644 --- a/net/core/stream.c +++ b/net/core/stream.c @@ -159,7 +159,8 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p) *timeo_p = current_timeo; } out: - remove_wait_queue(sk_sleep(sk), &wait); + if (!sock_flag(sk, SOCK_DEAD)) + remove_wait_queue(sk_sleep(sk), &wait); return err; do_error: diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index 725891527814..5b1ce656baa1 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -29,7 +29,6 @@ static int int_3600 = 3600; static int min_sndbuf = SOCK_MIN_SNDBUF; static int min_rcvbuf = SOCK_MIN_RCVBUF; static int max_skb_frags = MAX_SKB_FRAGS; -static long long_max __maybe_unused = LONG_MAX; static int net_msg_warn; /* Unused, but still a sysctl */ diff --git a/net/core/xdp.c b/net/core/xdp.c index 24420209bf0e..844c9d99dc0e 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -375,19 +375,17 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model); void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, struct xdp_buff *xdp) { - struct xdp_mem_allocator *xa; struct page *page; switch (mem->type) { case MEM_TYPE_PAGE_POOL: - rcu_read_lock(); - /* mem->id is valid, checked in xdp_rxq_info_reg_mem_model() */ - xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params); page = virt_to_head_page(data); if (napi_direct && xdp_return_frame_no_direct()) napi_direct = false; - page_pool_put_full_page(xa->page_pool, page, napi_direct); - rcu_read_unlock(); + /* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE) + * as mem->type knows this a page_pool page + */ + page_pool_put_full_page(page->pp, page, napi_direct); break; case MEM_TYPE_PAGE_SHARED: page_frag_free(data); diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index da6e3b20cd75..6a6e121dc00c 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -45,10 +45,11 @@ static unsigned int dccp_v4_pernet_id __read_mostly; int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { const struct sockaddr_in *usin = (struct sockaddr_in *)uaddr; + struct inet_bind_hashbucket *prev_addr_hashbucket = NULL; + __be32 daddr, nexthop, prev_sk_rcv_saddr; struct inet_sock *inet = inet_sk(sk); struct dccp_sock *dp = dccp_sk(sk); __be16 orig_sport, orig_dport; - __be32 daddr, nexthop; struct flowi4 *fl4; struct rtable *rt; int err; @@ -89,9 +90,29 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) if (inet_opt == NULL || !inet_opt->opt.srr) daddr = fl4->daddr; - if (inet->inet_saddr == 0) + if (inet->inet_saddr == 0) { + if (inet_csk(sk)->icsk_bind2_hash) { + prev_addr_hashbucket = + inet_bhashfn_portaddr(&dccp_hashinfo, sk, + sock_net(sk), + inet->inet_num); + prev_sk_rcv_saddr = sk->sk_rcv_saddr; + } inet->inet_saddr = fl4->saddr; + } + sk_rcv_saddr_set(sk, inet->inet_saddr); + + if (prev_addr_hashbucket) { + err = inet_bhash2_update_saddr(prev_addr_hashbucket, sk); + if (err) { + inet->inet_saddr = 0; + sk_rcv_saddr_set(sk, prev_sk_rcv_saddr); + ip_rt_put(rt); + return err; + } + } + inet->inet_dport = usin->sin_port; sk_daddr_set(sk, daddr); diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index fd44638ec16b..e57b43006074 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -934,8 +934,26 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr, } if (saddr == NULL) { + struct inet_bind_hashbucket *prev_addr_hashbucket = NULL; + struct in6_addr prev_v6_rcv_saddr; + + if (icsk->icsk_bind2_hash) { + prev_addr_hashbucket = inet_bhashfn_portaddr(&dccp_hashinfo, + sk, sock_net(sk), + inet->inet_num); + prev_v6_rcv_saddr = sk->sk_v6_rcv_saddr; + } + saddr = &fl6.saddr; sk->sk_v6_rcv_saddr = *saddr; + + if (prev_addr_hashbucket) { + err = inet_bhash2_update_saddr(prev_addr_hashbucket, sk); + if (err) { + sk->sk_v6_rcv_saddr = prev_v6_rcv_saddr; + goto failure; + } + } } /* set the source address */ diff --git a/net/dccp/proto.c b/net/dccp/proto.c index e13641c65f88..c548ca3e9b0e 100644 --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -1120,6 +1120,12 @@ static int __init dccp_init(void) SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL); if (!dccp_hashinfo.bind_bucket_cachep) goto out_free_hashinfo2; + dccp_hashinfo.bind2_bucket_cachep = + kmem_cache_create("dccp_bind2_bucket", + sizeof(struct inet_bind2_bucket), 0, + SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL); + if (!dccp_hashinfo.bind2_bucket_cachep) + goto out_free_bind_bucket_cachep; /* * Size and allocate the main established and bind bucket @@ -1150,7 +1156,7 @@ static int __init dccp_init(void) if (!dccp_hashinfo.ehash) { DCCP_CRIT("Failed to allocate DCCP established hash table"); - goto out_free_bind_bucket_cachep; + goto out_free_bind2_bucket_cachep; } for (i = 0; i <= dccp_hashinfo.ehash_mask; i++) @@ -1176,14 +1182,26 @@ static int __init dccp_init(void) goto out_free_dccp_locks; } + dccp_hashinfo.bhash2 = (struct inet_bind_hashbucket *) + __get_free_pages(GFP_ATOMIC | __GFP_NOWARN, bhash_order); + + if (!dccp_hashinfo.bhash2) { + DCCP_CRIT("Failed to allocate DCCP bind2 hash table"); + goto out_free_dccp_bhash; + } + for (i = 0; i < dccp_hashinfo.bhash_size; i++) { spin_lock_init(&dccp_hashinfo.bhash[i].lock); INIT_HLIST_HEAD(&dccp_hashinfo.bhash[i].chain); + spin_lock_init(&dccp_hashinfo.bhash2[i].lock); + INIT_HLIST_HEAD(&dccp_hashinfo.bhash2[i].chain); } + dccp_hashinfo.pernet = false; + rc = dccp_mib_init(); if (rc) - goto out_free_dccp_bhash; + goto out_free_dccp_bhash2; rc = dccp_ackvec_init(); if (rc) @@ -1207,30 +1225,38 @@ out_ackvec_exit: dccp_ackvec_exit(); out_free_dccp_mib: dccp_mib_exit(); +out_free_dccp_bhash2: + free_pages((unsigned long)dccp_hashinfo.bhash2, bhash_order); out_free_dccp_bhash: free_pages((unsigned long)dccp_hashinfo.bhash, bhash_order); out_free_dccp_locks: inet_ehash_locks_free(&dccp_hashinfo); out_free_dccp_ehash: free_pages((unsigned long)dccp_hashinfo.ehash, ehash_order); +out_free_bind2_bucket_cachep: + kmem_cache_destroy(dccp_hashinfo.bind2_bucket_cachep); out_free_bind_bucket_cachep: kmem_cache_destroy(dccp_hashinfo.bind_bucket_cachep); out_free_hashinfo2: inet_hashinfo2_free_mod(&dccp_hashinfo); out_fail: dccp_hashinfo.bhash = NULL; + dccp_hashinfo.bhash2 = NULL; dccp_hashinfo.ehash = NULL; dccp_hashinfo.bind_bucket_cachep = NULL; + dccp_hashinfo.bind2_bucket_cachep = NULL; return rc; } static void __exit dccp_fini(void) { + int bhash_order = get_order(dccp_hashinfo.bhash_size * + sizeof(struct inet_bind_hashbucket)); + ccid_cleanup_builtins(); dccp_mib_exit(); - free_pages((unsigned long)dccp_hashinfo.bhash, - get_order(dccp_hashinfo.bhash_size * - sizeof(struct inet_bind_hashbucket))); + free_pages((unsigned long)dccp_hashinfo.bhash, bhash_order); + free_pages((unsigned long)dccp_hashinfo.bhash2, bhash_order); free_pages((unsigned long)dccp_hashinfo.ehash, get_order((dccp_hashinfo.ehash_mask + 1) * sizeof(struct inet_ehash_bucket))); diff --git a/net/decnet/Kconfig b/net/decnet/Kconfig deleted file mode 100644 index 24336bdb1054..000000000000 --- a/net/decnet/Kconfig +++ /dev/null @@ -1,43 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0-only -# -# DECnet configuration -# -config DECNET - tristate "DECnet Support" - help - The DECnet networking protocol was used in many products made by - Digital (now Compaq). It provides reliable stream and sequenced - packet communications over which run a variety of services similar - to those which run over TCP/IP. - - To find some tools to use with the kernel layer support, please - look at Patrick Caulfield's web site: - <http://linux-decnet.sourceforge.net/>. - - More detailed documentation is available in - <file:Documentation/networking/decnet.rst>. - - Be sure to say Y to "/proc file system support" and "Sysctl support" - below when using DECnet, since you will need sysctl support to aid - in configuration at run time. - - The DECnet code is also available as a module ( = code which can be - inserted in and removed from the running kernel whenever you want). - The module is called decnet. - -config DECNET_ROUTER - bool "DECnet: router support" - depends on DECNET - select FIB_RULES - help - Add support for turning your DECnet Endnode into a level 1 or 2 - router. This is an experimental, but functional option. If you - do say Y here, then make sure that you also say Y to "Kernel/User - network link driver", "Routing messages" and "Network packet - filtering". The first two are required to allow configuration via - rtnetlink (you will need Alexey Kuznetsov's iproute2 package - from <ftp://ftp.tux.org/pub/net/ip-routing/>). The "Network packet - filtering" option will be required for the forthcoming routing daemon - to work. - - See <file:Documentation/networking/decnet.rst> for more information. diff --git a/net/decnet/Makefile b/net/decnet/Makefile deleted file mode 100644 index 07b38e441b2d..000000000000 --- a/net/decnet/Makefile +++ /dev/null @@ -1,10 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0 - -obj-$(CONFIG_DECNET) += decnet.o - -decnet-y := af_decnet.o dn_nsp_in.o dn_nsp_out.o \ - dn_route.o dn_dev.o dn_neigh.o dn_timer.o -decnet-$(CONFIG_DECNET_ROUTER) += dn_fib.o dn_rules.o dn_table.o -decnet-y += sysctl_net_decnet.o - -obj-$(CONFIG_NETFILTER) += netfilter/ diff --git a/net/decnet/README b/net/decnet/README deleted file mode 100644 index 60e7ec88c81f..000000000000 --- a/net/decnet/README +++ /dev/null @@ -1,8 +0,0 @@ - Linux DECnet Project - ====================== - -The documentation for this kernel subsystem is available in the -Documentation/networking subdirectory of this distribution and also -on line at http://www.chygwyn.com/DECnet/ - -Steve Whitehouse <SteveW@ACM.org> diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c deleted file mode 100644 index 6582dfdfb932..000000000000 --- a/net/decnet/af_decnet.c +++ /dev/null @@ -1,2404 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later - -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Socket Layer Interface - * - * Authors: Eduardo Marcelo Serrat <emserrat@geocities.com> - * Patrick Caulfield <patrick@pandh.demon.co.uk> - * - * Changes: - * Steve Whitehouse: Copied from Eduardo Serrat and Patrick Caulfield's - * version of the code. Original copyright preserved - * below. - * Steve Whitehouse: Some bug fixes, cleaning up some code to make it - * compatible with my routing layer. - * Steve Whitehouse: Merging changes from Eduardo Serrat and Patrick - * Caulfield. - * Steve Whitehouse: Further bug fixes, checking module code still works - * with new routing layer. - * Steve Whitehouse: Additional set/get_sockopt() calls. - * Steve Whitehouse: Fixed TIOCINQ ioctl to be same as Eduardo's new - * code. - * Steve Whitehouse: recvmsg() changed to try and behave in a POSIX like - * way. Didn't manage it entirely, but its better. - * Steve Whitehouse: ditto for sendmsg(). - * Steve Whitehouse: A selection of bug fixes to various things. - * Steve Whitehouse: Added TIOCOUTQ ioctl. - * Steve Whitehouse: Fixes to username2sockaddr & sockaddr2username. - * Steve Whitehouse: Fixes to connect() error returns. - * Patrick Caulfield: Fixes to delayed acceptance logic. - * David S. Miller: New socket locking - * Steve Whitehouse: Socket list hashing/locking - * Arnaldo C. Melo: use capable, not suser - * Steve Whitehouse: Removed unused code. Fix to use sk->allocation - * when required. - * Patrick Caulfield: /proc/net/decnet now has object name/number - * Steve Whitehouse: Fixed local port allocation, hashed sk list - * Matthew Wilcox: Fixes for dn_ioctl() - * Steve Whitehouse: New connect/accept logic to allow timeouts and - * prepare for sendpage etc. - */ - - -/****************************************************************************** - (c) 1995-1998 E.M. Serrat emserrat@geocities.com - - -HISTORY: - -Version Kernel Date Author/Comments -------- ------ ---- --------------- -Version 0.0.1 2.0.30 01-dic-97 Eduardo Marcelo Serrat - (emserrat@geocities.com) - - First Development of DECnet Socket La- - yer for Linux. Only supports outgoing - connections. - -Version 0.0.2 2.1.105 20-jun-98 Patrick J. Caulfield - (patrick@pandh.demon.co.uk) - - Port to new kernel development version. - -Version 0.0.3 2.1.106 25-jun-98 Eduardo Marcelo Serrat - (emserrat@geocities.com) - _ - Added support for incoming connections - so we can start developing server apps - on Linux. - - - Module Support -Version 0.0.4 2.1.109 21-jul-98 Eduardo Marcelo Serrat - (emserrat@geocities.com) - _ - Added support for X11R6.4. Now we can - use DECnet transport for X on Linux!!! - - -Version 0.0.5 2.1.110 01-aug-98 Eduardo Marcelo Serrat - (emserrat@geocities.com) - Removed bugs on flow control - Removed bugs on incoming accessdata - order - - -Version 0.0.6 2.1.110 07-aug-98 Eduardo Marcelo Serrat - dn_recvmsg fixes - - Patrick J. Caulfield - dn_bind fixes -*******************************************************************************/ - -#include <linux/module.h> -#include <linux/errno.h> -#include <linux/types.h> -#include <linux/slab.h> -#include <linux/socket.h> -#include <linux/in.h> -#include <linux/kernel.h> -#include <linux/sched/signal.h> -#include <linux/timer.h> -#include <linux/string.h> -#include <linux/sockios.h> -#include <linux/net.h> -#include <linux/netdevice.h> -#include <linux/inet.h> -#include <linux/route.h> -#include <linux/netfilter.h> -#include <linux/seq_file.h> -#include <net/sock.h> -#include <net/tcp_states.h> -#include <net/flow.h> -#include <asm/ioctls.h> -#include <linux/capability.h> -#include <linux/mm.h> -#include <linux/interrupt.h> -#include <linux/proc_fs.h> -#include <linux/stat.h> -#include <linux/init.h> -#include <linux/poll.h> -#include <linux/jiffies.h> -#include <net/net_namespace.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/fib_rules.h> -#include <net/tcp.h> -#include <net/dn.h> -#include <net/dn_nsp.h> -#include <net/dn_dev.h> -#include <net/dn_route.h> -#include <net/dn_fib.h> -#include <net/dn_neigh.h> - -struct dn_sock { - struct sock sk; - struct dn_scp scp; -}; - -static void dn_keepalive(struct sock *sk); - -#define DN_SK_HASH_SHIFT 8 -#define DN_SK_HASH_SIZE (1 << DN_SK_HASH_SHIFT) -#define DN_SK_HASH_MASK (DN_SK_HASH_SIZE - 1) - - -static const struct proto_ops dn_proto_ops; -static DEFINE_RWLOCK(dn_hash_lock); -static struct hlist_head dn_sk_hash[DN_SK_HASH_SIZE]; -static struct hlist_head dn_wild_sk; -static atomic_long_t decnet_memory_allocated; -static DEFINE_PER_CPU(int, decnet_memory_per_cpu_fw_alloc); - -static int __dn_setsockopt(struct socket *sock, int level, int optname, - sockptr_t optval, unsigned int optlen, int flags); -static int __dn_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen, int flags); - -static struct hlist_head *dn_find_list(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - if (scp->addr.sdn_flags & SDF_WILD) - return hlist_empty(&dn_wild_sk) ? &dn_wild_sk : NULL; - - return &dn_sk_hash[le16_to_cpu(scp->addrloc) & DN_SK_HASH_MASK]; -} - -/* - * Valid ports are those greater than zero and not already in use. - */ -static int check_port(__le16 port) -{ - struct sock *sk; - - if (port == 0) - return -1; - - sk_for_each(sk, &dn_sk_hash[le16_to_cpu(port) & DN_SK_HASH_MASK]) { - struct dn_scp *scp = DN_SK(sk); - if (scp->addrloc == port) - return -1; - } - return 0; -} - -static unsigned short port_alloc(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - static unsigned short port = 0x2000; - unsigned short i_port = port; - - while(check_port(cpu_to_le16(++port)) != 0) { - if (port == i_port) - return 0; - } - - scp->addrloc = cpu_to_le16(port); - - return 1; -} - -/* - * Since this is only ever called from user - * level, we don't need a write_lock() version - * of this. - */ -static int dn_hash_sock(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - struct hlist_head *list; - int rv = -EUSERS; - - BUG_ON(sk_hashed(sk)); - - write_lock_bh(&dn_hash_lock); - - if (!scp->addrloc && !port_alloc(sk)) - goto out; - - rv = -EADDRINUSE; - if ((list = dn_find_list(sk)) == NULL) - goto out; - - sk_add_node(sk, list); - rv = 0; -out: - write_unlock_bh(&dn_hash_lock); - return rv; -} - -static void dn_unhash_sock(struct sock *sk) -{ - write_lock(&dn_hash_lock); - sk_del_node_init(sk); - write_unlock(&dn_hash_lock); -} - -static void dn_unhash_sock_bh(struct sock *sk) -{ - write_lock_bh(&dn_hash_lock); - sk_del_node_init(sk); - write_unlock_bh(&dn_hash_lock); -} - -static struct hlist_head *listen_hash(struct sockaddr_dn *addr) -{ - int i; - unsigned int hash = addr->sdn_objnum; - - if (hash == 0) { - hash = addr->sdn_objnamel; - for(i = 0; i < le16_to_cpu(addr->sdn_objnamel); i++) { - hash ^= addr->sdn_objname[i]; - hash ^= (hash << 3); - } - } - - return &dn_sk_hash[hash & DN_SK_HASH_MASK]; -} - -/* - * Called to transform a socket from bound (i.e. with a local address) - * into a listening socket (doesn't need a local port number) and rehashes - * based upon the object name/number. - */ -static void dn_rehash_sock(struct sock *sk) -{ - struct hlist_head *list; - struct dn_scp *scp = DN_SK(sk); - - if (scp->addr.sdn_flags & SDF_WILD) - return; - - write_lock_bh(&dn_hash_lock); - sk_del_node_init(sk); - DN_SK(sk)->addrloc = 0; - list = listen_hash(&DN_SK(sk)->addr); - sk_add_node(sk, list); - write_unlock_bh(&dn_hash_lock); -} - -int dn_sockaddr2username(struct sockaddr_dn *sdn, unsigned char *buf, unsigned char type) -{ - int len = 2; - - *buf++ = type; - - switch (type) { - case 0: - *buf++ = sdn->sdn_objnum; - break; - case 1: - *buf++ = 0; - *buf++ = le16_to_cpu(sdn->sdn_objnamel); - memcpy(buf, sdn->sdn_objname, le16_to_cpu(sdn->sdn_objnamel)); - len = 3 + le16_to_cpu(sdn->sdn_objnamel); - break; - case 2: - memset(buf, 0, 5); - buf += 5; - *buf++ = le16_to_cpu(sdn->sdn_objnamel); - memcpy(buf, sdn->sdn_objname, le16_to_cpu(sdn->sdn_objnamel)); - len = 7 + le16_to_cpu(sdn->sdn_objnamel); - break; - } - - return len; -} - -/* - * On reception of usernames, we handle types 1 and 0 for destination - * addresses only. Types 2 and 4 are used for source addresses, but the - * UIC, GIC are ignored and they are both treated the same way. Type 3 - * is never used as I've no idea what its purpose might be or what its - * format is. - */ -int dn_username2sockaddr(unsigned char *data, int len, struct sockaddr_dn *sdn, unsigned char *fmt) -{ - unsigned char type; - int size = len; - int namel = 12; - - sdn->sdn_objnum = 0; - sdn->sdn_objnamel = cpu_to_le16(0); - memset(sdn->sdn_objname, 0, DN_MAXOBJL); - - if (len < 2) - return -1; - - len -= 2; - *fmt = *data++; - type = *data++; - - switch (*fmt) { - case 0: - sdn->sdn_objnum = type; - return 2; - case 1: - namel = 16; - break; - case 2: - len -= 4; - data += 4; - break; - case 4: - len -= 8; - data += 8; - break; - default: - return -1; - } - - len -= 1; - - if (len < 0) - return -1; - - sdn->sdn_objnamel = cpu_to_le16(*data++); - len -= le16_to_cpu(sdn->sdn_objnamel); - - if ((len < 0) || (le16_to_cpu(sdn->sdn_objnamel) > namel)) - return -1; - - memcpy(sdn->sdn_objname, data, le16_to_cpu(sdn->sdn_objnamel)); - - return size - len; -} - -struct sock *dn_sklist_find_listener(struct sockaddr_dn *addr) -{ - struct hlist_head *list = listen_hash(addr); - struct sock *sk; - - read_lock(&dn_hash_lock); - sk_for_each(sk, list) { - struct dn_scp *scp = DN_SK(sk); - if (sk->sk_state != TCP_LISTEN) - continue; - if (scp->addr.sdn_objnum) { - if (scp->addr.sdn_objnum != addr->sdn_objnum) - continue; - } else { - if (addr->sdn_objnum) - continue; - if (scp->addr.sdn_objnamel != addr->sdn_objnamel) - continue; - if (memcmp(scp->addr.sdn_objname, addr->sdn_objname, le16_to_cpu(addr->sdn_objnamel)) != 0) - continue; - } - sock_hold(sk); - read_unlock(&dn_hash_lock); - return sk; - } - - sk = sk_head(&dn_wild_sk); - if (sk) { - if (sk->sk_state == TCP_LISTEN) - sock_hold(sk); - else - sk = NULL; - } - - read_unlock(&dn_hash_lock); - return sk; -} - -struct sock *dn_find_by_skb(struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct sock *sk; - struct dn_scp *scp; - - read_lock(&dn_hash_lock); - sk_for_each(sk, &dn_sk_hash[le16_to_cpu(cb->dst_port) & DN_SK_HASH_MASK]) { - scp = DN_SK(sk); - if (cb->src != dn_saddr2dn(&scp->peer)) - continue; - if (cb->dst_port != scp->addrloc) - continue; - if (scp->addrrem && (cb->src_port != scp->addrrem)) - continue; - sock_hold(sk); - goto found; - } - sk = NULL; -found: - read_unlock(&dn_hash_lock); - return sk; -} - - - -static void dn_destruct(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - skb_queue_purge(&scp->data_xmit_queue); - skb_queue_purge(&scp->other_xmit_queue); - skb_queue_purge(&scp->other_receive_queue); - - dst_release(rcu_dereference_protected(sk->sk_dst_cache, 1)); -} - -static unsigned long dn_memory_pressure; - -static void dn_enter_memory_pressure(struct sock *sk) -{ - if (!dn_memory_pressure) { - dn_memory_pressure = 1; - } -} - -static struct proto dn_proto = { - .name = "NSP", - .owner = THIS_MODULE, - .enter_memory_pressure = dn_enter_memory_pressure, - .memory_pressure = &dn_memory_pressure, - - .memory_allocated = &decnet_memory_allocated, - .per_cpu_fw_alloc = &decnet_memory_per_cpu_fw_alloc, - - .sysctl_mem = sysctl_decnet_mem, - .sysctl_wmem = sysctl_decnet_wmem, - .sysctl_rmem = sysctl_decnet_rmem, - .max_header = DN_MAX_NSP_DATA_HEADER + 64, - .obj_size = sizeof(struct dn_sock), -}; - -static struct sock *dn_alloc_sock(struct net *net, struct socket *sock, gfp_t gfp, int kern) -{ - struct dn_scp *scp; - struct sock *sk = sk_alloc(net, PF_DECnet, gfp, &dn_proto, kern); - - if (!sk) - goto out; - - if (sock) - sock->ops = &dn_proto_ops; - sock_init_data(sock, sk); - - sk->sk_backlog_rcv = dn_nsp_backlog_rcv; - sk->sk_destruct = dn_destruct; - sk->sk_no_check_tx = 1; - sk->sk_family = PF_DECnet; - sk->sk_protocol = 0; - sk->sk_allocation = gfp; - sk->sk_sndbuf = READ_ONCE(sysctl_decnet_wmem[1]); - sk->sk_rcvbuf = READ_ONCE(sysctl_decnet_rmem[1]); - - /* Initialization of DECnet Session Control Port */ - scp = DN_SK(sk); - scp->state = DN_O; /* Open */ - scp->numdat = 1; /* Next data seg to tx */ - scp->numoth = 1; /* Next oth data to tx */ - scp->ackxmt_dat = 0; /* Last data seg ack'ed */ - scp->ackxmt_oth = 0; /* Last oth data ack'ed */ - scp->ackrcv_dat = 0; /* Highest data ack recv*/ - scp->ackrcv_oth = 0; /* Last oth data ack rec*/ - scp->flowrem_sw = DN_SEND; - scp->flowloc_sw = DN_SEND; - scp->flowrem_dat = 0; - scp->flowrem_oth = 1; - scp->flowloc_dat = 0; - scp->flowloc_oth = 1; - scp->services_rem = 0; - scp->services_loc = 1 | NSP_FC_NONE; - scp->info_rem = 0; - scp->info_loc = 0x03; /* NSP version 4.1 */ - scp->segsize_rem = 230 - DN_MAX_NSP_DATA_HEADER; /* Default: Updated by remote segsize */ - scp->nonagle = 0; - scp->multi_ireq = 1; - scp->accept_mode = ACC_IMMED; - scp->addr.sdn_family = AF_DECnet; - scp->peer.sdn_family = AF_DECnet; - scp->accessdata.acc_accl = 5; - memcpy(scp->accessdata.acc_acc, "LINUX", 5); - - scp->max_window = NSP_MAX_WINDOW; - scp->snd_window = NSP_MIN_WINDOW; - scp->nsp_srtt = NSP_INITIAL_SRTT; - scp->nsp_rttvar = NSP_INITIAL_RTTVAR; - scp->nsp_rxtshift = 0; - - skb_queue_head_init(&scp->data_xmit_queue); - skb_queue_head_init(&scp->other_xmit_queue); - skb_queue_head_init(&scp->other_receive_queue); - - scp->persist = 0; - scp->persist_fxn = NULL; - scp->keepalive = 10 * HZ; - scp->keepalive_fxn = dn_keepalive; - - dn_start_slow_timer(sk); -out: - return sk; -} - -/* - * Keepalive timer. - * FIXME: Should respond to SO_KEEPALIVE etc. - */ -static void dn_keepalive(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - /* - * By checking the other_data transmit queue is empty - * we are double checking that we are not sending too - * many of these keepalive frames. - */ - if (skb_queue_empty(&scp->other_xmit_queue)) - dn_nsp_send_link(sk, DN_NOCHANGE, 0); -} - - -/* - * Timer for shutdown/destroyed sockets. - * When socket is dead & no packets have been sent for a - * certain amount of time, they are removed by this - * routine. Also takes care of sending out DI & DC - * frames at correct times. - */ -int dn_destroy_timer(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - scp->persist = dn_nsp_persist(sk); - - switch (scp->state) { - case DN_DI: - dn_nsp_send_disc(sk, NSP_DISCINIT, 0, GFP_ATOMIC); - if (scp->nsp_rxtshift >= decnet_di_count) - scp->state = DN_CN; - return 0; - - case DN_DR: - dn_nsp_send_disc(sk, NSP_DISCINIT, 0, GFP_ATOMIC); - if (scp->nsp_rxtshift >= decnet_dr_count) - scp->state = DN_DRC; - return 0; - - case DN_DN: - if (scp->nsp_rxtshift < decnet_dn_count) { - /* printk(KERN_DEBUG "dn_destroy_timer: DN\n"); */ - dn_nsp_send_disc(sk, NSP_DISCCONF, NSP_REASON_DC, - GFP_ATOMIC); - return 0; - } - } - - scp->persist = (HZ * decnet_time_wait); - - if (sk->sk_socket) - return 0; - - if (time_after_eq(jiffies, scp->stamp + HZ * decnet_time_wait)) { - dn_unhash_sock(sk); - sock_put(sk); - return 1; - } - - return 0; -} - -static void dn_destroy_sock(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - scp->nsp_rxtshift = 0; /* reset back off */ - - if (sk->sk_socket) { - if (sk->sk_socket->state != SS_UNCONNECTED) - sk->sk_socket->state = SS_DISCONNECTING; - } - - sk->sk_state = TCP_CLOSE; - - switch (scp->state) { - case DN_DN: - dn_nsp_send_disc(sk, NSP_DISCCONF, NSP_REASON_DC, - sk->sk_allocation); - scp->persist_fxn = dn_destroy_timer; - scp->persist = dn_nsp_persist(sk); - break; - case DN_CR: - scp->state = DN_DR; - goto disc_reject; - case DN_RUN: - scp->state = DN_DI; - fallthrough; - case DN_DI: - case DN_DR: -disc_reject: - dn_nsp_send_disc(sk, NSP_DISCINIT, 0, sk->sk_allocation); - fallthrough; - case DN_NC: - case DN_NR: - case DN_RJ: - case DN_DIC: - case DN_CN: - case DN_DRC: - case DN_CI: - case DN_CD: - scp->persist_fxn = dn_destroy_timer; - scp->persist = dn_nsp_persist(sk); - break; - default: - printk(KERN_DEBUG "DECnet: dn_destroy_sock passed socket in invalid state\n"); - fallthrough; - case DN_O: - dn_stop_slow_timer(sk); - - dn_unhash_sock_bh(sk); - sock_put(sk); - - break; - } -} - -char *dn_addr2asc(__u16 addr, char *buf) -{ - unsigned short node, area; - - node = addr & 0x03ff; - area = addr >> 10; - sprintf(buf, "%hd.%hd", area, node); - - return buf; -} - - - -static int dn_create(struct net *net, struct socket *sock, int protocol, - int kern) -{ - struct sock *sk; - - if (protocol < 0 || protocol > U8_MAX) - return -EINVAL; - - if (!net_eq(net, &init_net)) - return -EAFNOSUPPORT; - - switch (sock->type) { - case SOCK_SEQPACKET: - if (protocol != DNPROTO_NSP) - return -EPROTONOSUPPORT; - break; - case SOCK_STREAM: - break; - default: - return -ESOCKTNOSUPPORT; - } - - - if ((sk = dn_alloc_sock(net, sock, GFP_KERNEL, kern)) == NULL) - return -ENOBUFS; - - sk->sk_protocol = protocol; - - return 0; -} - - -static int -dn_release(struct socket *sock) -{ - struct sock *sk = sock->sk; - - if (sk) { - sock_orphan(sk); - sock_hold(sk); - lock_sock(sk); - dn_destroy_sock(sk); - release_sock(sk); - sock_put(sk); - } - - return 0; -} - -static int dn_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - struct sockaddr_dn *saddr = (struct sockaddr_dn *)uaddr; - struct net_device *dev, *ldev; - int rv; - - if (addr_len != sizeof(struct sockaddr_dn)) - return -EINVAL; - - if (saddr->sdn_family != AF_DECnet) - return -EINVAL; - - if (le16_to_cpu(saddr->sdn_nodeaddrl) && (le16_to_cpu(saddr->sdn_nodeaddrl) != 2)) - return -EINVAL; - - if (le16_to_cpu(saddr->sdn_objnamel) > DN_MAXOBJL) - return -EINVAL; - - if (saddr->sdn_flags & ~SDF_WILD) - return -EINVAL; - - if (!capable(CAP_NET_BIND_SERVICE) && (saddr->sdn_objnum || - (saddr->sdn_flags & SDF_WILD))) - return -EACCES; - - if (!(saddr->sdn_flags & SDF_WILD)) { - if (le16_to_cpu(saddr->sdn_nodeaddrl)) { - rcu_read_lock(); - ldev = NULL; - for_each_netdev_rcu(&init_net, dev) { - if (!dev->dn_ptr) - continue; - if (dn_dev_islocal(dev, dn_saddr2dn(saddr))) { - ldev = dev; - break; - } - } - rcu_read_unlock(); - if (ldev == NULL) - return -EADDRNOTAVAIL; - } - } - - rv = -EINVAL; - lock_sock(sk); - if (sock_flag(sk, SOCK_ZAPPED)) { - memcpy(&scp->addr, saddr, addr_len); - sock_reset_flag(sk, SOCK_ZAPPED); - - rv = dn_hash_sock(sk); - if (rv) - sock_set_flag(sk, SOCK_ZAPPED); - } - release_sock(sk); - - return rv; -} - - -static int dn_auto_bind(struct socket *sock) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - int rv; - - sock_reset_flag(sk, SOCK_ZAPPED); - - scp->addr.sdn_flags = 0; - scp->addr.sdn_objnum = 0; - - /* - * This stuff is to keep compatibility with Eduardo's - * patch. I hope I can dispense with it shortly... - */ - if ((scp->accessdata.acc_accl != 0) && - (scp->accessdata.acc_accl <= 12)) { - - scp->addr.sdn_objnamel = cpu_to_le16(scp->accessdata.acc_accl); - memcpy(scp->addr.sdn_objname, scp->accessdata.acc_acc, le16_to_cpu(scp->addr.sdn_objnamel)); - - scp->accessdata.acc_accl = 0; - memset(scp->accessdata.acc_acc, 0, 40); - } - /* End of compatibility stuff */ - - scp->addr.sdn_add.a_len = cpu_to_le16(2); - rv = dn_dev_bind_default((__le16 *)scp->addr.sdn_add.a_addr); - if (rv == 0) { - rv = dn_hash_sock(sk); - if (rv) - sock_set_flag(sk, SOCK_ZAPPED); - } - - return rv; -} - -static int dn_confirm_accept(struct sock *sk, long *timeo, gfp_t allocation) -{ - struct dn_scp *scp = DN_SK(sk); - DEFINE_WAIT_FUNC(wait, woken_wake_function); - int err; - - if (scp->state != DN_CR) - return -EINVAL; - - scp->state = DN_CC; - scp->segsize_loc = dst_metric_advmss(__sk_dst_get(sk)); - dn_send_conn_conf(sk, allocation); - - add_wait_queue(sk_sleep(sk), &wait); - for(;;) { - release_sock(sk); - if (scp->state == DN_CC) - *timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, *timeo); - lock_sock(sk); - err = 0; - if (scp->state == DN_RUN) - break; - err = sock_error(sk); - if (err) - break; - err = sock_intr_errno(*timeo); - if (signal_pending(current)) - break; - err = -EAGAIN; - if (!*timeo) - break; - } - remove_wait_queue(sk_sleep(sk), &wait); - if (err == 0) { - sk->sk_socket->state = SS_CONNECTED; - } else if (scp->state != DN_CC) { - sk->sk_socket->state = SS_UNCONNECTED; - } - return err; -} - -static int dn_wait_run(struct sock *sk, long *timeo) -{ - struct dn_scp *scp = DN_SK(sk); - DEFINE_WAIT_FUNC(wait, woken_wake_function); - int err = 0; - - if (scp->state == DN_RUN) - goto out; - - if (!*timeo) - return -EALREADY; - - add_wait_queue(sk_sleep(sk), &wait); - for(;;) { - release_sock(sk); - if (scp->state == DN_CI || scp->state == DN_CC) - *timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, *timeo); - lock_sock(sk); - err = 0; - if (scp->state == DN_RUN) - break; - err = sock_error(sk); - if (err) - break; - err = sock_intr_errno(*timeo); - if (signal_pending(current)) - break; - err = -ETIMEDOUT; - if (!*timeo) - break; - } - remove_wait_queue(sk_sleep(sk), &wait); -out: - if (err == 0) { - sk->sk_socket->state = SS_CONNECTED; - } else if (scp->state != DN_CI && scp->state != DN_CC) { - sk->sk_socket->state = SS_UNCONNECTED; - } - return err; -} - -static int __dn_connect(struct sock *sk, struct sockaddr_dn *addr, int addrlen, long *timeo, int flags) -{ - struct socket *sock = sk->sk_socket; - struct dn_scp *scp = DN_SK(sk); - int err = -EISCONN; - struct flowidn fld; - struct dst_entry *dst; - - if (sock->state == SS_CONNECTED) - goto out; - - if (sock->state == SS_CONNECTING) { - err = 0; - if (scp->state == DN_RUN) { - sock->state = SS_CONNECTED; - goto out; - } - err = -ECONNREFUSED; - if (scp->state != DN_CI && scp->state != DN_CC) { - sock->state = SS_UNCONNECTED; - goto out; - } - return dn_wait_run(sk, timeo); - } - - err = -EINVAL; - if (scp->state != DN_O) - goto out; - - if (addr == NULL || addrlen != sizeof(struct sockaddr_dn)) - goto out; - if (addr->sdn_family != AF_DECnet) - goto out; - if (addr->sdn_flags & SDF_WILD) - goto out; - - if (sock_flag(sk, SOCK_ZAPPED)) { - err = dn_auto_bind(sk->sk_socket); - if (err) - goto out; - } - - memcpy(&scp->peer, addr, sizeof(struct sockaddr_dn)); - - err = -EHOSTUNREACH; - memset(&fld, 0, sizeof(fld)); - fld.flowidn_oif = sk->sk_bound_dev_if; - fld.daddr = dn_saddr2dn(&scp->peer); - fld.saddr = dn_saddr2dn(&scp->addr); - dn_sk_ports_copy(&fld, scp); - fld.flowidn_proto = DNPROTO_NSP; - if (dn_route_output_sock(&sk->sk_dst_cache, &fld, sk, flags) < 0) - goto out; - dst = __sk_dst_get(sk); - sk->sk_route_caps = dst->dev->features; - sock->state = SS_CONNECTING; - scp->state = DN_CI; - scp->segsize_loc = dst_metric_advmss(dst); - - dn_nsp_send_conninit(sk, NSP_CI); - err = -EINPROGRESS; - if (*timeo) { - err = dn_wait_run(sk, timeo); - } -out: - return err; -} - -static int dn_connect(struct socket *sock, struct sockaddr *uaddr, int addrlen, int flags) -{ - struct sockaddr_dn *addr = (struct sockaddr_dn *)uaddr; - struct sock *sk = sock->sk; - int err; - long timeo = sock_sndtimeo(sk, flags & O_NONBLOCK); - - lock_sock(sk); - err = __dn_connect(sk, addr, addrlen, &timeo, 0); - release_sock(sk); - - return err; -} - -static inline int dn_check_state(struct sock *sk, struct sockaddr_dn *addr, int addrlen, long *timeo, int flags) -{ - struct dn_scp *scp = DN_SK(sk); - - switch (scp->state) { - case DN_RUN: - return 0; - case DN_CR: - return dn_confirm_accept(sk, timeo, sk->sk_allocation); - case DN_CI: - case DN_CC: - return dn_wait_run(sk, timeo); - case DN_O: - return __dn_connect(sk, addr, addrlen, timeo, flags); - } - - return -EINVAL; -} - - -static void dn_access_copy(struct sk_buff *skb, struct accessdata_dn *acc) -{ - unsigned char *ptr = skb->data; - - acc->acc_userl = *ptr++; - memcpy(&acc->acc_user, ptr, acc->acc_userl); - ptr += acc->acc_userl; - - acc->acc_passl = *ptr++; - memcpy(&acc->acc_pass, ptr, acc->acc_passl); - ptr += acc->acc_passl; - - acc->acc_accl = *ptr++; - memcpy(&acc->acc_acc, ptr, acc->acc_accl); - - skb_pull(skb, acc->acc_accl + acc->acc_passl + acc->acc_userl + 3); - -} - -static void dn_user_copy(struct sk_buff *skb, struct optdata_dn *opt) -{ - unsigned char *ptr = skb->data; - u16 len = *ptr++; /* yes, it's 8bit on the wire */ - - BUG_ON(len > 16); /* we've checked the contents earlier */ - opt->opt_optl = cpu_to_le16(len); - opt->opt_status = 0; - memcpy(opt->opt_data, ptr, len); - skb_pull(skb, len + 1); -} - -static struct sk_buff *dn_wait_for_connect(struct sock *sk, long *timeo) -{ - DEFINE_WAIT_FUNC(wait, woken_wake_function); - struct sk_buff *skb = NULL; - int err = 0; - - add_wait_queue(sk_sleep(sk), &wait); - for(;;) { - release_sock(sk); - skb = skb_dequeue(&sk->sk_receive_queue); - if (skb == NULL) { - *timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, *timeo); - skb = skb_dequeue(&sk->sk_receive_queue); - } - lock_sock(sk); - if (skb != NULL) - break; - err = -EINVAL; - if (sk->sk_state != TCP_LISTEN) - break; - err = sock_intr_errno(*timeo); - if (signal_pending(current)) - break; - err = -EAGAIN; - if (!*timeo) - break; - } - remove_wait_queue(sk_sleep(sk), &wait); - - return skb == NULL ? ERR_PTR(err) : skb; -} - -static int dn_accept(struct socket *sock, struct socket *newsock, int flags, - bool kern) -{ - struct sock *sk = sock->sk, *newsk; - struct sk_buff *skb = NULL; - struct dn_skb_cb *cb; - unsigned char menuver; - int err = 0; - unsigned char type; - long timeo = sock_rcvtimeo(sk, flags & O_NONBLOCK); - struct dst_entry *dst; - - lock_sock(sk); - - if (sk->sk_state != TCP_LISTEN || DN_SK(sk)->state != DN_O) { - release_sock(sk); - return -EINVAL; - } - - skb = skb_dequeue(&sk->sk_receive_queue); - if (skb == NULL) { - skb = dn_wait_for_connect(sk, &timeo); - if (IS_ERR(skb)) { - release_sock(sk); - return PTR_ERR(skb); - } - } - - cb = DN_SKB_CB(skb); - sk_acceptq_removed(sk); - newsk = dn_alloc_sock(sock_net(sk), newsock, sk->sk_allocation, kern); - if (newsk == NULL) { - release_sock(sk); - kfree_skb(skb); - return -ENOBUFS; - } - release_sock(sk); - - dst = skb_dst(skb); - sk_dst_set(newsk, dst); - skb_dst_set(skb, NULL); - - DN_SK(newsk)->state = DN_CR; - DN_SK(newsk)->addrrem = cb->src_port; - DN_SK(newsk)->services_rem = cb->services; - DN_SK(newsk)->info_rem = cb->info; - DN_SK(newsk)->segsize_rem = cb->segsize; - DN_SK(newsk)->accept_mode = DN_SK(sk)->accept_mode; - - if (DN_SK(newsk)->segsize_rem < 230) - DN_SK(newsk)->segsize_rem = 230; - - if ((DN_SK(newsk)->services_rem & NSP_FC_MASK) == NSP_FC_NONE) - DN_SK(newsk)->max_window = decnet_no_fc_max_cwnd; - - newsk->sk_state = TCP_LISTEN; - memcpy(&(DN_SK(newsk)->addr), &(DN_SK(sk)->addr), sizeof(struct sockaddr_dn)); - - /* - * If we are listening on a wild socket, we don't want - * the newly created socket on the wrong hash queue. - */ - DN_SK(newsk)->addr.sdn_flags &= ~SDF_WILD; - - skb_pull(skb, dn_username2sockaddr(skb->data, skb->len, &(DN_SK(newsk)->addr), &type)); - skb_pull(skb, dn_username2sockaddr(skb->data, skb->len, &(DN_SK(newsk)->peer), &type)); - *(__le16 *)(DN_SK(newsk)->peer.sdn_add.a_addr) = cb->src; - *(__le16 *)(DN_SK(newsk)->addr.sdn_add.a_addr) = cb->dst; - - menuver = *skb->data; - skb_pull(skb, 1); - - if (menuver & DN_MENUVER_ACC) - dn_access_copy(skb, &(DN_SK(newsk)->accessdata)); - - if (menuver & DN_MENUVER_USR) - dn_user_copy(skb, &(DN_SK(newsk)->conndata_in)); - - if (menuver & DN_MENUVER_PRX) - DN_SK(newsk)->peer.sdn_flags |= SDF_PROXY; - - if (menuver & DN_MENUVER_UIC) - DN_SK(newsk)->peer.sdn_flags |= SDF_UICPROXY; - - kfree_skb(skb); - - memcpy(&(DN_SK(newsk)->conndata_out), &(DN_SK(sk)->conndata_out), - sizeof(struct optdata_dn)); - memcpy(&(DN_SK(newsk)->discdata_out), &(DN_SK(sk)->discdata_out), - sizeof(struct optdata_dn)); - - lock_sock(newsk); - err = dn_hash_sock(newsk); - if (err == 0) { - sock_reset_flag(newsk, SOCK_ZAPPED); - dn_send_conn_ack(newsk); - - /* - * Here we use sk->sk_allocation since although the conn conf is - * for the newsk, the context is the old socket. - */ - if (DN_SK(newsk)->accept_mode == ACC_IMMED) - err = dn_confirm_accept(newsk, &timeo, - sk->sk_allocation); - } - release_sock(newsk); - return err; -} - - -static int dn_getname(struct socket *sock, struct sockaddr *uaddr,int peer) -{ - struct sockaddr_dn *sa = (struct sockaddr_dn *)uaddr; - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - - lock_sock(sk); - - if (peer) { - if ((sock->state != SS_CONNECTED && - sock->state != SS_CONNECTING) && - scp->accept_mode == ACC_IMMED) { - release_sock(sk); - return -ENOTCONN; - } - - memcpy(sa, &scp->peer, sizeof(struct sockaddr_dn)); - } else { - memcpy(sa, &scp->addr, sizeof(struct sockaddr_dn)); - } - - release_sock(sk); - - return sizeof(struct sockaddr_dn); -} - - -static __poll_t dn_poll(struct file *file, struct socket *sock, poll_table *wait) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - __poll_t mask = datagram_poll(file, sock, wait); - - if (!skb_queue_empty_lockless(&scp->other_receive_queue)) - mask |= EPOLLRDBAND; - - return mask; -} - -static int dn_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - int err = -EOPNOTSUPP; - long amount = 0; - struct sk_buff *skb; - int val; - - switch(cmd) - { - case SIOCGIFADDR: - case SIOCSIFADDR: - return dn_dev_ioctl(cmd, (void __user *)arg); - - case SIOCATMARK: - lock_sock(sk); - val = !skb_queue_empty(&scp->other_receive_queue); - if (scp->state != DN_RUN) - val = -ENOTCONN; - release_sock(sk); - return val; - - case TIOCOUTQ: - amount = sk->sk_sndbuf - sk_wmem_alloc_get(sk); - if (amount < 0) - amount = 0; - err = put_user(amount, (int __user *)arg); - break; - - case TIOCINQ: - lock_sock(sk); - skb = skb_peek(&scp->other_receive_queue); - if (skb) { - amount = skb->len; - } else { - skb_queue_walk(&sk->sk_receive_queue, skb) - amount += skb->len; - } - release_sock(sk); - err = put_user(amount, (int __user *)arg); - break; - - default: - err = -ENOIOCTLCMD; - break; - } - - return err; -} - -static int dn_listen(struct socket *sock, int backlog) -{ - struct sock *sk = sock->sk; - int err = -EINVAL; - - lock_sock(sk); - - if (sock_flag(sk, SOCK_ZAPPED)) - goto out; - - if ((DN_SK(sk)->state != DN_O) || (sk->sk_state == TCP_LISTEN)) - goto out; - - sk->sk_max_ack_backlog = backlog; - sk->sk_ack_backlog = 0; - sk->sk_state = TCP_LISTEN; - err = 0; - dn_rehash_sock(sk); - -out: - release_sock(sk); - - return err; -} - - -static int dn_shutdown(struct socket *sock, int how) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - int err = -ENOTCONN; - - lock_sock(sk); - - if (sock->state == SS_UNCONNECTED) - goto out; - - err = 0; - if (sock->state == SS_DISCONNECTING) - goto out; - - err = -EINVAL; - if (scp->state == DN_O) - goto out; - - if (how != SHUT_RDWR) - goto out; - - sk->sk_shutdown = SHUTDOWN_MASK; - dn_destroy_sock(sk); - err = 0; - -out: - release_sock(sk); - - return err; -} - -static int dn_setsockopt(struct socket *sock, int level, int optname, - sockptr_t optval, unsigned int optlen) -{ - struct sock *sk = sock->sk; - int err; - - lock_sock(sk); - err = __dn_setsockopt(sock, level, optname, optval, optlen, 0); - release_sock(sk); -#ifdef CONFIG_NETFILTER - /* we need to exclude all possible ENOPROTOOPTs except default case */ - if (err == -ENOPROTOOPT && optname != DSO_LINKINFO && - optname != DSO_STREAM && optname != DSO_SEQPACKET) - err = nf_setsockopt(sk, PF_DECnet, optname, optval, optlen); -#endif - - return err; -} - -static int __dn_setsockopt(struct socket *sock, int level, int optname, - sockptr_t optval, unsigned int optlen, int flags) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - long timeo; - union { - struct optdata_dn opt; - struct accessdata_dn acc; - int mode; - unsigned long win; - int val; - unsigned char services; - unsigned char info; - } u; - int err; - - if (optlen && sockptr_is_null(optval)) - return -EINVAL; - - if (optlen > sizeof(u)) - return -EINVAL; - - if (copy_from_sockptr(&u, optval, optlen)) - return -EFAULT; - - switch (optname) { - case DSO_CONDATA: - if (sock->state == SS_CONNECTED) - return -EISCONN; - if ((scp->state != DN_O) && (scp->state != DN_CR)) - return -EINVAL; - - if (optlen != sizeof(struct optdata_dn)) - return -EINVAL; - - if (le16_to_cpu(u.opt.opt_optl) > 16) - return -EINVAL; - - memcpy(&scp->conndata_out, &u.opt, optlen); - break; - - case DSO_DISDATA: - if (sock->state != SS_CONNECTED && - scp->accept_mode == ACC_IMMED) - return -ENOTCONN; - - if (optlen != sizeof(struct optdata_dn)) - return -EINVAL; - - if (le16_to_cpu(u.opt.opt_optl) > 16) - return -EINVAL; - - memcpy(&scp->discdata_out, &u.opt, optlen); - break; - - case DSO_CONACCESS: - if (sock->state == SS_CONNECTED) - return -EISCONN; - if (scp->state != DN_O) - return -EINVAL; - - if (optlen != sizeof(struct accessdata_dn)) - return -EINVAL; - - if ((u.acc.acc_accl > DN_MAXACCL) || - (u.acc.acc_passl > DN_MAXACCL) || - (u.acc.acc_userl > DN_MAXACCL)) - return -EINVAL; - - memcpy(&scp->accessdata, &u.acc, optlen); - break; - - case DSO_ACCEPTMODE: - if (sock->state == SS_CONNECTED) - return -EISCONN; - if (scp->state != DN_O) - return -EINVAL; - - if (optlen != sizeof(int)) - return -EINVAL; - - if ((u.mode != ACC_IMMED) && (u.mode != ACC_DEFER)) - return -EINVAL; - - scp->accept_mode = (unsigned char)u.mode; - break; - - case DSO_CONACCEPT: - if (scp->state != DN_CR) - return -EINVAL; - timeo = sock_rcvtimeo(sk, 0); - err = dn_confirm_accept(sk, &timeo, sk->sk_allocation); - return err; - - case DSO_CONREJECT: - if (scp->state != DN_CR) - return -EINVAL; - - scp->state = DN_DR; - sk->sk_shutdown = SHUTDOWN_MASK; - dn_nsp_send_disc(sk, 0x38, 0, sk->sk_allocation); - break; - - case DSO_MAXWINDOW: - if (optlen != sizeof(unsigned long)) - return -EINVAL; - if (u.win > NSP_MAX_WINDOW) - u.win = NSP_MAX_WINDOW; - if (u.win == 0) - return -EINVAL; - scp->max_window = u.win; - if (scp->snd_window > u.win) - scp->snd_window = u.win; - break; - - case DSO_NODELAY: - if (optlen != sizeof(int)) - return -EINVAL; - if (scp->nonagle == TCP_NAGLE_CORK) - return -EINVAL; - scp->nonagle = (u.val == 0) ? 0 : TCP_NAGLE_OFF; - /* if (scp->nonagle == 1) { Push pending frames } */ - break; - - case DSO_CORK: - if (optlen != sizeof(int)) - return -EINVAL; - if (scp->nonagle == TCP_NAGLE_OFF) - return -EINVAL; - scp->nonagle = (u.val == 0) ? 0 : TCP_NAGLE_CORK; - /* if (scp->nonagle == 0) { Push pending frames } */ - break; - - case DSO_SERVICES: - if (optlen != sizeof(unsigned char)) - return -EINVAL; - if ((u.services & ~NSP_FC_MASK) != 0x01) - return -EINVAL; - if ((u.services & NSP_FC_MASK) == NSP_FC_MASK) - return -EINVAL; - scp->services_loc = u.services; - break; - - case DSO_INFO: - if (optlen != sizeof(unsigned char)) - return -EINVAL; - if (u.info & 0xfc) - return -EINVAL; - scp->info_loc = u.info; - break; - - case DSO_LINKINFO: - case DSO_STREAM: - case DSO_SEQPACKET: - default: - return -ENOPROTOOPT; - } - - return 0; -} - -static int dn_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) -{ - struct sock *sk = sock->sk; - int err; - - lock_sock(sk); - err = __dn_getsockopt(sock, level, optname, optval, optlen, 0); - release_sock(sk); -#ifdef CONFIG_NETFILTER - if (err == -ENOPROTOOPT && optname != DSO_STREAM && - optname != DSO_SEQPACKET && optname != DSO_CONACCEPT && - optname != DSO_CONREJECT) { - int len; - - if (get_user(len, optlen)) - return -EFAULT; - - err = nf_getsockopt(sk, PF_DECnet, optname, optval, &len); - if (err >= 0) - err = put_user(len, optlen); - } -#endif - - return err; -} - -static int __dn_getsockopt(struct socket *sock, int level,int optname, char __user *optval,int __user *optlen, int flags) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - struct linkinfo_dn link; - unsigned int r_len; - void *r_data = NULL; - unsigned int val; - - if(get_user(r_len , optlen)) - return -EFAULT; - - switch (optname) { - case DSO_CONDATA: - if (r_len > sizeof(struct optdata_dn)) - r_len = sizeof(struct optdata_dn); - r_data = &scp->conndata_in; - break; - - case DSO_DISDATA: - if (r_len > sizeof(struct optdata_dn)) - r_len = sizeof(struct optdata_dn); - r_data = &scp->discdata_in; - break; - - case DSO_CONACCESS: - if (r_len > sizeof(struct accessdata_dn)) - r_len = sizeof(struct accessdata_dn); - r_data = &scp->accessdata; - break; - - case DSO_ACCEPTMODE: - if (r_len > sizeof(unsigned char)) - r_len = sizeof(unsigned char); - r_data = &scp->accept_mode; - break; - - case DSO_LINKINFO: - if (r_len > sizeof(struct linkinfo_dn)) - r_len = sizeof(struct linkinfo_dn); - - memset(&link, 0, sizeof(link)); - - switch (sock->state) { - case SS_CONNECTING: - link.idn_linkstate = LL_CONNECTING; - break; - case SS_DISCONNECTING: - link.idn_linkstate = LL_DISCONNECTING; - break; - case SS_CONNECTED: - link.idn_linkstate = LL_RUNNING; - break; - default: - link.idn_linkstate = LL_INACTIVE; - } - - link.idn_segsize = scp->segsize_rem; - r_data = &link; - break; - - case DSO_MAXWINDOW: - if (r_len > sizeof(unsigned long)) - r_len = sizeof(unsigned long); - r_data = &scp->max_window; - break; - - case DSO_NODELAY: - if (r_len > sizeof(int)) - r_len = sizeof(int); - val = (scp->nonagle == TCP_NAGLE_OFF); - r_data = &val; - break; - - case DSO_CORK: - if (r_len > sizeof(int)) - r_len = sizeof(int); - val = (scp->nonagle == TCP_NAGLE_CORK); - r_data = &val; - break; - - case DSO_SERVICES: - if (r_len > sizeof(unsigned char)) - r_len = sizeof(unsigned char); - r_data = &scp->services_rem; - break; - - case DSO_INFO: - if (r_len > sizeof(unsigned char)) - r_len = sizeof(unsigned char); - r_data = &scp->info_rem; - break; - - case DSO_STREAM: - case DSO_SEQPACKET: - case DSO_CONACCEPT: - case DSO_CONREJECT: - default: - return -ENOPROTOOPT; - } - - if (r_data) { - if (copy_to_user(optval, r_data, r_len)) - return -EFAULT; - if (put_user(r_len, optlen)) - return -EFAULT; - } - - return 0; -} - - -static int dn_data_ready(struct sock *sk, struct sk_buff_head *q, int flags, int target) -{ - struct sk_buff *skb; - int len = 0; - - if (flags & MSG_OOB) - return !skb_queue_empty(q) ? 1 : 0; - - skb_queue_walk(q, skb) { - struct dn_skb_cb *cb = DN_SKB_CB(skb); - len += skb->len; - - if (cb->nsp_flags & 0x40) { - /* SOCK_SEQPACKET reads to EOM */ - if (sk->sk_type == SOCK_SEQPACKET) - return 1; - /* so does SOCK_STREAM unless WAITALL is specified */ - if (!(flags & MSG_WAITALL)) - return 1; - } - - /* minimum data length for read exceeded */ - if (len >= target) - return 1; - } - - return 0; -} - - -static int dn_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, - int flags) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - struct sk_buff_head *queue = &sk->sk_receive_queue; - size_t target = size > 1 ? 1 : 0; - size_t copied = 0; - int rv = 0; - struct sk_buff *skb, *n; - struct dn_skb_cb *cb = NULL; - unsigned char eor = 0; - long timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT); - - lock_sock(sk); - - if (sock_flag(sk, SOCK_ZAPPED)) { - rv = -EADDRNOTAVAIL; - goto out; - } - - if (sk->sk_shutdown & RCV_SHUTDOWN) { - rv = 0; - goto out; - } - - rv = dn_check_state(sk, NULL, 0, &timeo, flags); - if (rv) - goto out; - - if (flags & ~(MSG_CMSG_COMPAT|MSG_PEEK|MSG_OOB|MSG_WAITALL|MSG_DONTWAIT|MSG_NOSIGNAL)) { - rv = -EOPNOTSUPP; - goto out; - } - - if (flags & MSG_OOB) - queue = &scp->other_receive_queue; - - if (flags & MSG_WAITALL) - target = size; - - - /* - * See if there is data ready to read, sleep if there isn't - */ - for(;;) { - DEFINE_WAIT_FUNC(wait, woken_wake_function); - - if (sk->sk_err) - goto out; - - if (!skb_queue_empty(&scp->other_receive_queue)) { - if (!(flags & MSG_OOB)) { - msg->msg_flags |= MSG_OOB; - if (!scp->other_report) { - scp->other_report = 1; - goto out; - } - } - } - - if (scp->state != DN_RUN) - goto out; - - if (signal_pending(current)) { - rv = sock_intr_errno(timeo); - goto out; - } - - if (dn_data_ready(sk, queue, flags, target)) - break; - - if (flags & MSG_DONTWAIT) { - rv = -EWOULDBLOCK; - goto out; - } - - add_wait_queue(sk_sleep(sk), &wait); - sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); - sk_wait_event(sk, &timeo, dn_data_ready(sk, queue, flags, target), &wait); - sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); - remove_wait_queue(sk_sleep(sk), &wait); - } - - skb_queue_walk_safe(queue, skb, n) { - unsigned int chunk = skb->len; - cb = DN_SKB_CB(skb); - - if ((chunk + copied) > size) - chunk = size - copied; - - if (memcpy_to_msg(msg, skb->data, chunk)) { - rv = -EFAULT; - break; - } - copied += chunk; - - if (!(flags & MSG_PEEK)) - skb_pull(skb, chunk); - - eor = cb->nsp_flags & 0x40; - - if (skb->len == 0) { - skb_unlink(skb, queue); - kfree_skb(skb); - /* - * N.B. Don't refer to skb or cb after this point - * in loop. - */ - if ((scp->flowloc_sw == DN_DONTSEND) && !dn_congested(sk)) { - scp->flowloc_sw = DN_SEND; - dn_nsp_send_link(sk, DN_SEND, 0); - } - } - - if (eor) { - if (sk->sk_type == SOCK_SEQPACKET) - break; - if (!(flags & MSG_WAITALL)) - break; - } - - if (flags & MSG_OOB) - break; - - if (copied >= target) - break; - } - - rv = copied; - - - if (eor && (sk->sk_type == SOCK_SEQPACKET)) - msg->msg_flags |= MSG_EOR; - -out: - if (rv == 0) - rv = (flags & MSG_PEEK) ? -sk->sk_err : sock_error(sk); - - if ((rv >= 0) && msg->msg_name) { - __sockaddr_check_size(sizeof(struct sockaddr_dn)); - memcpy(msg->msg_name, &scp->peer, sizeof(struct sockaddr_dn)); - msg->msg_namelen = sizeof(struct sockaddr_dn); - } - - release_sock(sk); - - return rv; -} - - -static inline int dn_queue_too_long(struct dn_scp *scp, struct sk_buff_head *queue, int flags) -{ - unsigned char fctype = scp->services_rem & NSP_FC_MASK; - if (skb_queue_len(queue) >= scp->snd_window) - return 1; - if (fctype != NSP_FC_NONE) { - if (flags & MSG_OOB) { - if (scp->flowrem_oth == 0) - return 1; - } else { - if (scp->flowrem_dat == 0) - return 1; - } - } - return 0; -} - -/* - * The DECnet spec requires that the "routing layer" accepts packets which - * are at least 230 bytes in size. This excludes any headers which the NSP - * layer might add, so we always assume that we'll be using the maximal - * length header on data packets. The variation in length is due to the - * inclusion (or not) of the two 16 bit acknowledgement fields so it doesn't - * make much practical difference. - */ -unsigned int dn_mss_from_pmtu(struct net_device *dev, int mtu) -{ - unsigned int mss = 230 - DN_MAX_NSP_DATA_HEADER; - if (dev) { - struct dn_dev *dn_db = rcu_dereference_raw(dev->dn_ptr); - mtu -= LL_RESERVED_SPACE(dev); - if (dn_db->use_long) - mtu -= 21; - else - mtu -= 6; - mtu -= DN_MAX_NSP_DATA_HEADER; - } else { - /* - * 21 = long header, 16 = guess at MAC header length - */ - mtu -= (21 + DN_MAX_NSP_DATA_HEADER + 16); - } - if (mtu > mss) - mss = mtu; - return mss; -} - -static inline unsigned int dn_current_mss(struct sock *sk, int flags) -{ - struct dst_entry *dst = __sk_dst_get(sk); - struct dn_scp *scp = DN_SK(sk); - int mss_now = min_t(int, scp->segsize_loc, scp->segsize_rem); - - /* Other data messages are limited to 16 bytes per packet */ - if (flags & MSG_OOB) - return 16; - - /* This works out the maximum size of segment we can send out */ - if (dst) { - u32 mtu = dst_mtu(dst); - mss_now = min_t(int, dn_mss_from_pmtu(dst->dev, mtu), mss_now); - } - - return mss_now; -} - -/* - * N.B. We get the timeout wrong here, but then we always did get it - * wrong before and this is another step along the road to correcting - * it. It ought to get updated each time we pass through the routine, - * but in practise it probably doesn't matter too much for now. - */ -static inline struct sk_buff *dn_alloc_send_pskb(struct sock *sk, - unsigned long datalen, int noblock, - int *errcode) -{ - struct sk_buff *skb = sock_alloc_send_skb(sk, datalen, - noblock, errcode); - if (skb) { - skb->protocol = htons(ETH_P_DNA_RT); - skb->pkt_type = PACKET_OUTGOING; - } - return skb; -} - -static int dn_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) -{ - struct sock *sk = sock->sk; - struct dn_scp *scp = DN_SK(sk); - size_t mss; - struct sk_buff_head *queue = &scp->data_xmit_queue; - int flags = msg->msg_flags; - int err = 0; - size_t sent = 0; - int addr_len = msg->msg_namelen; - DECLARE_SOCKADDR(struct sockaddr_dn *, addr, msg->msg_name); - struct sk_buff *skb = NULL; - struct dn_skb_cb *cb; - size_t len; - unsigned char fctype; - long timeo; - - if (flags & ~(MSG_TRYHARD|MSG_OOB|MSG_DONTWAIT|MSG_EOR|MSG_NOSIGNAL|MSG_MORE|MSG_CMSG_COMPAT)) - return -EOPNOTSUPP; - - if (addr_len && (addr_len != sizeof(struct sockaddr_dn))) - return -EINVAL; - - lock_sock(sk); - timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); - /* - * The only difference between stream sockets and sequenced packet - * sockets is that the stream sockets always behave as if MSG_EOR - * has been set. - */ - if (sock->type == SOCK_STREAM) { - if (flags & MSG_EOR) { - err = -EINVAL; - goto out; - } - flags |= MSG_EOR; - } - - - err = dn_check_state(sk, addr, addr_len, &timeo, flags); - if (err) - goto out_err; - - if (sk->sk_shutdown & SEND_SHUTDOWN) { - err = -EPIPE; - if (!(flags & MSG_NOSIGNAL)) - send_sig(SIGPIPE, current, 0); - goto out_err; - } - - if ((flags & MSG_TRYHARD) && sk->sk_dst_cache) - dst_negative_advice(sk); - - mss = scp->segsize_rem; - fctype = scp->services_rem & NSP_FC_MASK; - - mss = dn_current_mss(sk, flags); - - if (flags & MSG_OOB) { - queue = &scp->other_xmit_queue; - if (size > mss) { - err = -EMSGSIZE; - goto out; - } - } - - scp->persist_fxn = dn_nsp_xmit_timeout; - - while(sent < size) { - err = sock_error(sk); - if (err) - goto out; - - if (signal_pending(current)) { - err = sock_intr_errno(timeo); - goto out; - } - - /* - * Calculate size that we wish to send. - */ - len = size - sent; - - if (len > mss) - len = mss; - - /* - * Wait for queue size to go down below the window - * size. - */ - if (dn_queue_too_long(scp, queue, flags)) { - DEFINE_WAIT_FUNC(wait, woken_wake_function); - - if (flags & MSG_DONTWAIT) { - err = -EWOULDBLOCK; - goto out; - } - - add_wait_queue(sk_sleep(sk), &wait); - sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); - sk_wait_event(sk, &timeo, - !dn_queue_too_long(scp, queue, flags), &wait); - sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); - remove_wait_queue(sk_sleep(sk), &wait); - continue; - } - - /* - * Get a suitably sized skb. - * 64 is a bit of a hack really, but its larger than any - * link-layer headers and has served us well as a good - * guess as to their real length. - */ - skb = dn_alloc_send_pskb(sk, len + 64 + DN_MAX_NSP_DATA_HEADER, - flags & MSG_DONTWAIT, &err); - - if (err) - break; - - if (!skb) - continue; - - cb = DN_SKB_CB(skb); - - skb_reserve(skb, 64 + DN_MAX_NSP_DATA_HEADER); - - if (memcpy_from_msg(skb_put(skb, len), msg, len)) { - err = -EFAULT; - goto out; - } - - if (flags & MSG_OOB) { - cb->nsp_flags = 0x30; - if (fctype != NSP_FC_NONE) - scp->flowrem_oth--; - } else { - cb->nsp_flags = 0x00; - if (scp->seg_total == 0) - cb->nsp_flags |= 0x20; - - scp->seg_total += len; - - if (((sent + len) == size) && (flags & MSG_EOR)) { - cb->nsp_flags |= 0x40; - scp->seg_total = 0; - if (fctype == NSP_FC_SCMC) - scp->flowrem_dat--; - } - if (fctype == NSP_FC_SRC) - scp->flowrem_dat--; - } - - sent += len; - dn_nsp_queue_xmit(sk, skb, sk->sk_allocation, flags & MSG_OOB); - skb = NULL; - - scp->persist = dn_nsp_persist(sk); - - } -out: - - kfree_skb(skb); - - release_sock(sk); - - return sent ? sent : err; - -out_err: - err = sk_stream_error(sk, flags, err); - release_sock(sk); - return err; -} - -static int dn_device_event(struct notifier_block *this, unsigned long event, - void *ptr) -{ - struct net_device *dev = netdev_notifier_info_to_dev(ptr); - - if (!net_eq(dev_net(dev), &init_net)) - return NOTIFY_DONE; - - switch (event) { - case NETDEV_UP: - dn_dev_up(dev); - break; - case NETDEV_DOWN: - dn_dev_down(dev); - break; - default: - break; - } - - return NOTIFY_DONE; -} - -static struct notifier_block dn_dev_notifier = { - .notifier_call = dn_device_event, -}; - -static struct packet_type dn_dix_packet_type __read_mostly = { - .type = cpu_to_be16(ETH_P_DNA_RT), - .func = dn_route_rcv, -}; - -#ifdef CONFIG_PROC_FS -struct dn_iter_state { - int bucket; -}; - -static struct sock *dn_socket_get_first(struct seq_file *seq) -{ - struct dn_iter_state *state = seq->private; - struct sock *n = NULL; - - for(state->bucket = 0; - state->bucket < DN_SK_HASH_SIZE; - ++state->bucket) { - n = sk_head(&dn_sk_hash[state->bucket]); - if (n) - break; - } - - return n; -} - -static struct sock *dn_socket_get_next(struct seq_file *seq, - struct sock *n) -{ - struct dn_iter_state *state = seq->private; - - n = sk_next(n); - while (!n) { - if (++state->bucket >= DN_SK_HASH_SIZE) - break; - n = sk_head(&dn_sk_hash[state->bucket]); - } - return n; -} - -static struct sock *socket_get_idx(struct seq_file *seq, loff_t *pos) -{ - struct sock *sk = dn_socket_get_first(seq); - - if (sk) { - while(*pos && (sk = dn_socket_get_next(seq, sk))) - --*pos; - } - return *pos ? NULL : sk; -} - -static void *dn_socket_get_idx(struct seq_file *seq, loff_t pos) -{ - void *rc; - read_lock_bh(&dn_hash_lock); - rc = socket_get_idx(seq, &pos); - if (!rc) { - read_unlock_bh(&dn_hash_lock); - } - return rc; -} - -static void *dn_socket_seq_start(struct seq_file *seq, loff_t *pos) -{ - return *pos ? dn_socket_get_idx(seq, *pos - 1) : SEQ_START_TOKEN; -} - -static void *dn_socket_seq_next(struct seq_file *seq, void *v, loff_t *pos) -{ - void *rc; - - if (v == SEQ_START_TOKEN) { - rc = dn_socket_get_idx(seq, 0); - goto out; - } - - rc = dn_socket_get_next(seq, v); - if (rc) - goto out; - read_unlock_bh(&dn_hash_lock); -out: - ++*pos; - return rc; -} - -static void dn_socket_seq_stop(struct seq_file *seq, void *v) -{ - if (v && v != SEQ_START_TOKEN) - read_unlock_bh(&dn_hash_lock); -} - -#define IS_NOT_PRINTABLE(x) ((x) < 32 || (x) > 126) - -static void dn_printable_object(struct sockaddr_dn *dn, unsigned char *buf) -{ - int i; - - switch (le16_to_cpu(dn->sdn_objnamel)) { - case 0: - sprintf(buf, "%d", dn->sdn_objnum); - break; - default: - for (i = 0; i < le16_to_cpu(dn->sdn_objnamel); i++) { - buf[i] = dn->sdn_objname[i]; - if (IS_NOT_PRINTABLE(buf[i])) - buf[i] = '.'; - } - buf[i] = 0; - } -} - -static char *dn_state2asc(unsigned char state) -{ - switch (state) { - case DN_O: - return "OPEN"; - case DN_CR: - return " CR"; - case DN_DR: - return " DR"; - case DN_DRC: - return " DRC"; - case DN_CC: - return " CC"; - case DN_CI: - return " CI"; - case DN_NR: - return " NR"; - case DN_NC: - return " NC"; - case DN_CD: - return " CD"; - case DN_RJ: - return " RJ"; - case DN_RUN: - return " RUN"; - case DN_DI: - return " DI"; - case DN_DIC: - return " DIC"; - case DN_DN: - return " DN"; - case DN_CL: - return " CL"; - case DN_CN: - return " CN"; - } - - return "????"; -} - -static inline void dn_socket_format_entry(struct seq_file *seq, struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - char buf1[DN_ASCBUF_LEN]; - char buf2[DN_ASCBUF_LEN]; - char local_object[DN_MAXOBJL+3]; - char remote_object[DN_MAXOBJL+3]; - - dn_printable_object(&scp->addr, local_object); - dn_printable_object(&scp->peer, remote_object); - - seq_printf(seq, - "%6s/%04X %04d:%04d %04d:%04d %01d %-16s " - "%6s/%04X %04d:%04d %04d:%04d %01d %-16s %4s %s\n", - dn_addr2asc(le16_to_cpu(dn_saddr2dn(&scp->addr)), buf1), - scp->addrloc, - scp->numdat, - scp->numoth, - scp->ackxmt_dat, - scp->ackxmt_oth, - scp->flowloc_sw, - local_object, - dn_addr2asc(le16_to_cpu(dn_saddr2dn(&scp->peer)), buf2), - scp->addrrem, - scp->numdat_rcv, - scp->numoth_rcv, - scp->ackrcv_dat, - scp->ackrcv_oth, - scp->flowrem_sw, - remote_object, - dn_state2asc(scp->state), - ((scp->accept_mode == ACC_IMMED) ? "IMMED" : "DEFER")); -} - -static int dn_socket_seq_show(struct seq_file *seq, void *v) -{ - if (v == SEQ_START_TOKEN) { - seq_puts(seq, "Local Remote\n"); - } else { - dn_socket_format_entry(seq, v); - } - return 0; -} - -static const struct seq_operations dn_socket_seq_ops = { - .start = dn_socket_seq_start, - .next = dn_socket_seq_next, - .stop = dn_socket_seq_stop, - .show = dn_socket_seq_show, -}; -#endif - -static const struct net_proto_family dn_family_ops = { - .family = AF_DECnet, - .create = dn_create, - .owner = THIS_MODULE, -}; - -static const struct proto_ops dn_proto_ops = { - .family = AF_DECnet, - .owner = THIS_MODULE, - .release = dn_release, - .bind = dn_bind, - .connect = dn_connect, - .socketpair = sock_no_socketpair, - .accept = dn_accept, - .getname = dn_getname, - .poll = dn_poll, - .ioctl = dn_ioctl, - .listen = dn_listen, - .shutdown = dn_shutdown, - .setsockopt = dn_setsockopt, - .getsockopt = dn_getsockopt, - .sendmsg = dn_sendmsg, - .recvmsg = dn_recvmsg, - .mmap = sock_no_mmap, - .sendpage = sock_no_sendpage, -}; - -MODULE_DESCRIPTION("The Linux DECnet Network Protocol"); -MODULE_AUTHOR("Linux DECnet Project Team"); -MODULE_LICENSE("GPL"); -MODULE_ALIAS_NETPROTO(PF_DECnet); - -static const char banner[] __initconst = KERN_INFO -"NET4: DECnet for Linux: V.2.5.68s (C) 1995-2003 Linux DECnet Project Team\n"; - -static int __init decnet_init(void) -{ - int rc; - - printk(banner); - - rc = proto_register(&dn_proto, 1); - if (rc != 0) - goto out; - - dn_neigh_init(); - dn_dev_init(); - dn_route_init(); - dn_fib_init(); - - sock_register(&dn_family_ops); - dev_add_pack(&dn_dix_packet_type); - register_netdevice_notifier(&dn_dev_notifier); - - proc_create_seq_private("decnet", 0444, init_net.proc_net, - &dn_socket_seq_ops, sizeof(struct dn_iter_state), - NULL); - dn_register_sysctl(); -out: - return rc; - -} -module_init(decnet_init); - -/* - * Prevent DECnet module unloading until its fixed properly. - * Requires an audit of the code to check for memory leaks and - * initialisation problems etc. - */ -#if 0 -static void __exit decnet_exit(void) -{ - sock_unregister(AF_DECnet); - rtnl_unregister_all(PF_DECnet); - dev_remove_pack(&dn_dix_packet_type); - - dn_unregister_sysctl(); - - unregister_netdevice_notifier(&dn_dev_notifier); - - dn_route_cleanup(); - dn_dev_cleanup(); - dn_neigh_cleanup(); - dn_fib_cleanup(); - - remove_proc_entry("decnet", init_net.proc_net); - - proto_unregister(&dn_proto); - - rcu_barrier(); /* Wait for completion of call_rcu()'s */ -} -module_exit(decnet_exit); -#endif diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c deleted file mode 100644 index a09ba642b5e7..000000000000 --- a/net/decnet/dn_dev.c +++ /dev/null @@ -1,1433 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Device Layer - * - * Authors: Steve Whitehouse <SteveW@ACM.org> - * Eduardo Marcelo Serrat <emserrat@geocities.com> - * - * Changes: - * Steve Whitehouse : Devices now see incoming frames so they - * can mark on who it came from. - * Steve Whitehouse : Fixed bug in creating neighbours. Each neighbour - * can now have a device specific setup func. - * Steve Whitehouse : Added /proc/sys/net/decnet/conf/<dev>/ - * Steve Whitehouse : Fixed bug which sometimes killed timer - * Steve Whitehouse : Multiple ifaddr support - * Steve Whitehouse : SIOCGIFCONF is now a compile time option - * Steve Whitehouse : /proc/sys/net/decnet/conf/<sys>/forwarding - * Steve Whitehouse : Removed timer1 - it's a user space issue now - * Patrick Caulfield : Fixed router hello message format - * Steve Whitehouse : Got rid of constant sizes for blksize for - * devices. All mtu based now. - */ - -#include <linux/capability.h> -#include <linux/module.h> -#include <linux/moduleparam.h> -#include <linux/init.h> -#include <linux/net.h> -#include <linux/netdevice.h> -#include <linux/proc_fs.h> -#include <linux/seq_file.h> -#include <linux/timer.h> -#include <linux/string.h> -#include <linux/if_addr.h> -#include <linux/if_arp.h> -#include <linux/if_ether.h> -#include <linux/skbuff.h> -#include <linux/sysctl.h> -#include <linux/notifier.h> -#include <linux/slab.h> -#include <linux/jiffies.h> -#include <linux/uaccess.h> -#include <net/net_namespace.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/flow.h> -#include <net/fib_rules.h> -#include <net/netlink.h> -#include <net/dn.h> -#include <net/dn_dev.h> -#include <net/dn_route.h> -#include <net/dn_neigh.h> -#include <net/dn_fib.h> - -#define DN_IFREQ_SIZE (offsetof(struct ifreq, ifr_ifru) + sizeof(struct sockaddr_dn)) - -static char dn_rt_all_end_mcast[ETH_ALEN] = {0xAB,0x00,0x00,0x04,0x00,0x00}; -static char dn_rt_all_rt_mcast[ETH_ALEN] = {0xAB,0x00,0x00,0x03,0x00,0x00}; -static char dn_hiord[ETH_ALEN] = {0xAA,0x00,0x04,0x00,0x00,0x00}; -static unsigned char dn_eco_version[3] = {0x02,0x00,0x00}; - -extern struct neigh_table dn_neigh_table; - -/* - * decnet_address is kept in network order. - */ -__le16 decnet_address = 0; - -static DEFINE_SPINLOCK(dndev_lock); -static struct net_device *decnet_default_device; -static BLOCKING_NOTIFIER_HEAD(dnaddr_chain); - -static struct dn_dev *dn_dev_create(struct net_device *dev, int *err); -static void dn_dev_delete(struct net_device *dev); -static void dn_ifaddr_notify(int event, struct dn_ifaddr *ifa); - -static int dn_eth_up(struct net_device *); -static void dn_eth_down(struct net_device *); -static void dn_send_brd_hello(struct net_device *dev, struct dn_ifaddr *ifa); -static void dn_send_ptp_hello(struct net_device *dev, struct dn_ifaddr *ifa); - -static struct dn_dev_parms dn_dev_list[] = { -{ - .type = ARPHRD_ETHER, /* Ethernet */ - .mode = DN_DEV_BCAST, - .state = DN_DEV_S_RU, - .t2 = 1, - .t3 = 10, - .name = "ethernet", - .up = dn_eth_up, - .down = dn_eth_down, - .timer3 = dn_send_brd_hello, -}, -{ - .type = ARPHRD_IPGRE, /* DECnet tunneled over GRE in IP */ - .mode = DN_DEV_BCAST, - .state = DN_DEV_S_RU, - .t2 = 1, - .t3 = 10, - .name = "ipgre", - .timer3 = dn_send_brd_hello, -}, -#if 0 -{ - .type = ARPHRD_X25, /* Bog standard X.25 */ - .mode = DN_DEV_UCAST, - .state = DN_DEV_S_DS, - .t2 = 1, - .t3 = 120, - .name = "x25", - .timer3 = dn_send_ptp_hello, -}, -#endif -#if 0 -{ - .type = ARPHRD_PPP, /* DECnet over PPP */ - .mode = DN_DEV_BCAST, - .state = DN_DEV_S_RU, - .t2 = 1, - .t3 = 10, - .name = "ppp", - .timer3 = dn_send_brd_hello, -}, -#endif -{ - .type = ARPHRD_DDCMP, /* DECnet over DDCMP */ - .mode = DN_DEV_UCAST, - .state = DN_DEV_S_DS, - .t2 = 1, - .t3 = 120, - .name = "ddcmp", - .timer3 = dn_send_ptp_hello, -}, -{ - .type = ARPHRD_LOOPBACK, /* Loopback interface - always last */ - .mode = DN_DEV_BCAST, - .state = DN_DEV_S_RU, - .t2 = 1, - .t3 = 10, - .name = "loopback", - .timer3 = dn_send_brd_hello, -} -}; - -#define DN_DEV_LIST_SIZE ARRAY_SIZE(dn_dev_list) - -#define DN_DEV_PARMS_OFFSET(x) offsetof(struct dn_dev_parms, x) - -#ifdef CONFIG_SYSCTL - -static int min_t2[] = { 1 }; -static int max_t2[] = { 60 }; /* No max specified, but this seems sensible */ -static int min_t3[] = { 1 }; -static int max_t3[] = { 8191 }; /* Must fit in 16 bits when multiplied by BCT3MULT or T3MULT */ - -static int min_priority[1]; -static int max_priority[] = { 127 }; /* From DECnet spec */ - -static int dn_forwarding_proc(struct ctl_table *, int, void *, size_t *, - loff_t *); -static struct dn_dev_sysctl_table { - struct ctl_table_header *sysctl_header; - struct ctl_table dn_dev_vars[5]; -} dn_dev_sysctl = { - NULL, - { - { - .procname = "forwarding", - .data = (void *)DN_DEV_PARMS_OFFSET(forwarding), - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = dn_forwarding_proc, - }, - { - .procname = "priority", - .data = (void *)DN_DEV_PARMS_OFFSET(priority), - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_priority, - .extra2 = &max_priority - }, - { - .procname = "t2", - .data = (void *)DN_DEV_PARMS_OFFSET(t2), - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_t2, - .extra2 = &max_t2 - }, - { - .procname = "t3", - .data = (void *)DN_DEV_PARMS_OFFSET(t3), - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_t3, - .extra2 = &max_t3 - }, - { } - }, -}; - -static void dn_dev_sysctl_register(struct net_device *dev, struct dn_dev_parms *parms) -{ - struct dn_dev_sysctl_table *t; - int i; - - char path[sizeof("net/decnet/conf/") + IFNAMSIZ]; - - t = kmemdup(&dn_dev_sysctl, sizeof(*t), GFP_KERNEL); - if (t == NULL) - return; - - for(i = 0; i < ARRAY_SIZE(t->dn_dev_vars) - 1; i++) { - long offset = (long)t->dn_dev_vars[i].data; - t->dn_dev_vars[i].data = ((char *)parms) + offset; - } - - snprintf(path, sizeof(path), "net/decnet/conf/%s", - dev? dev->name : parms->name); - - t->dn_dev_vars[0].extra1 = (void *)dev; - - t->sysctl_header = register_net_sysctl(&init_net, path, t->dn_dev_vars); - if (t->sysctl_header == NULL) - kfree(t); - else - parms->sysctl = t; -} - -static void dn_dev_sysctl_unregister(struct dn_dev_parms *parms) -{ - if (parms->sysctl) { - struct dn_dev_sysctl_table *t = parms->sysctl; - parms->sysctl = NULL; - unregister_net_sysctl_table(t->sysctl_header); - kfree(t); - } -} - -static int dn_forwarding_proc(struct ctl_table *table, int write, - void *buffer, size_t *lenp, loff_t *ppos) -{ -#ifdef CONFIG_DECNET_ROUTER - struct net_device *dev = table->extra1; - struct dn_dev *dn_db; - int err; - int tmp, old; - - if (table->extra1 == NULL) - return -EINVAL; - - dn_db = rcu_dereference_raw(dev->dn_ptr); - old = dn_db->parms.forwarding; - - err = proc_dointvec(table, write, buffer, lenp, ppos); - - if ((err >= 0) && write) { - if (dn_db->parms.forwarding < 0) - dn_db->parms.forwarding = 0; - if (dn_db->parms.forwarding > 2) - dn_db->parms.forwarding = 2; - /* - * What an ugly hack this is... its works, just. It - * would be nice if sysctl/proc were just that little - * bit more flexible so I don't have to write a special - * routine, or suffer hacks like this - SJW - */ - tmp = dn_db->parms.forwarding; - dn_db->parms.forwarding = old; - if (dn_db->parms.down) - dn_db->parms.down(dev); - dn_db->parms.forwarding = tmp; - if (dn_db->parms.up) - dn_db->parms.up(dev); - } - - return err; -#else - return -EINVAL; -#endif -} - -#else /* CONFIG_SYSCTL */ -static void dn_dev_sysctl_unregister(struct dn_dev_parms *parms) -{ -} -static void dn_dev_sysctl_register(struct net_device *dev, struct dn_dev_parms *parms) -{ -} - -#endif /* CONFIG_SYSCTL */ - -static inline __u16 mtu2blksize(struct net_device *dev) -{ - u32 blksize = dev->mtu; - if (blksize > 0xffff) - blksize = 0xffff; - - if (dev->type == ARPHRD_ETHER || - dev->type == ARPHRD_PPP || - dev->type == ARPHRD_IPGRE || - dev->type == ARPHRD_LOOPBACK) - blksize -= 2; - - return (__u16)blksize; -} - -static struct dn_ifaddr *dn_dev_alloc_ifa(void) -{ - struct dn_ifaddr *ifa; - - ifa = kzalloc(sizeof(*ifa), GFP_KERNEL); - - return ifa; -} - -static void dn_dev_free_ifa(struct dn_ifaddr *ifa) -{ - kfree_rcu(ifa, rcu); -} - -static void dn_dev_del_ifa(struct dn_dev *dn_db, struct dn_ifaddr __rcu **ifap, int destroy) -{ - struct dn_ifaddr *ifa1 = rtnl_dereference(*ifap); - unsigned char mac_addr[6]; - struct net_device *dev = dn_db->dev; - - ASSERT_RTNL(); - - *ifap = ifa1->ifa_next; - - if (dn_db->dev->type == ARPHRD_ETHER) { - if (ifa1->ifa_local != dn_eth2dn(dev->dev_addr)) { - dn_dn2eth(mac_addr, ifa1->ifa_local); - dev_mc_del(dev, mac_addr); - } - } - - dn_ifaddr_notify(RTM_DELADDR, ifa1); - blocking_notifier_call_chain(&dnaddr_chain, NETDEV_DOWN, ifa1); - if (destroy) { - dn_dev_free_ifa(ifa1); - - if (dn_db->ifa_list == NULL) - dn_dev_delete(dn_db->dev); - } -} - -static int dn_dev_insert_ifa(struct dn_dev *dn_db, struct dn_ifaddr *ifa) -{ - struct net_device *dev = dn_db->dev; - struct dn_ifaddr *ifa1; - unsigned char mac_addr[6]; - - ASSERT_RTNL(); - - /* Check for duplicates */ - for (ifa1 = rtnl_dereference(dn_db->ifa_list); - ifa1 != NULL; - ifa1 = rtnl_dereference(ifa1->ifa_next)) { - if (ifa1->ifa_local == ifa->ifa_local) - return -EEXIST; - } - - if (dev->type == ARPHRD_ETHER) { - if (ifa->ifa_local != dn_eth2dn(dev->dev_addr)) { - dn_dn2eth(mac_addr, ifa->ifa_local); - dev_mc_add(dev, mac_addr); - } - } - - ifa->ifa_next = dn_db->ifa_list; - rcu_assign_pointer(dn_db->ifa_list, ifa); - - dn_ifaddr_notify(RTM_NEWADDR, ifa); - blocking_notifier_call_chain(&dnaddr_chain, NETDEV_UP, ifa); - - return 0; -} - -static int dn_dev_set_ifa(struct net_device *dev, struct dn_ifaddr *ifa) -{ - struct dn_dev *dn_db = rtnl_dereference(dev->dn_ptr); - int rv; - - if (dn_db == NULL) { - int err; - dn_db = dn_dev_create(dev, &err); - if (dn_db == NULL) - return err; - } - - ifa->ifa_dev = dn_db; - - if (dev->flags & IFF_LOOPBACK) - ifa->ifa_scope = RT_SCOPE_HOST; - - rv = dn_dev_insert_ifa(dn_db, ifa); - if (rv) - dn_dev_free_ifa(ifa); - return rv; -} - - -int dn_dev_ioctl(unsigned int cmd, void __user *arg) -{ - char buffer[DN_IFREQ_SIZE]; - struct ifreq *ifr = (struct ifreq *)buffer; - struct sockaddr_dn *sdn = (struct sockaddr_dn *)&ifr->ifr_addr; - struct dn_dev *dn_db; - struct net_device *dev; - struct dn_ifaddr *ifa = NULL; - struct dn_ifaddr __rcu **ifap = NULL; - int ret = 0; - - if (copy_from_user(ifr, arg, DN_IFREQ_SIZE)) - return -EFAULT; - ifr->ifr_name[IFNAMSIZ-1] = 0; - - dev_load(&init_net, ifr->ifr_name); - - switch (cmd) { - case SIOCGIFADDR: - break; - case SIOCSIFADDR: - if (!capable(CAP_NET_ADMIN)) - return -EACCES; - if (sdn->sdn_family != AF_DECnet) - return -EINVAL; - break; - default: - return -EINVAL; - } - - rtnl_lock(); - - if ((dev = __dev_get_by_name(&init_net, ifr->ifr_name)) == NULL) { - ret = -ENODEV; - goto done; - } - - if ((dn_db = rtnl_dereference(dev->dn_ptr)) != NULL) { - for (ifap = &dn_db->ifa_list; - (ifa = rtnl_dereference(*ifap)) != NULL; - ifap = &ifa->ifa_next) - if (strcmp(ifr->ifr_name, ifa->ifa_label) == 0) - break; - } - - if (ifa == NULL && cmd != SIOCSIFADDR) { - ret = -EADDRNOTAVAIL; - goto done; - } - - switch (cmd) { - case SIOCGIFADDR: - *((__le16 *)sdn->sdn_nodeaddr) = ifa->ifa_local; - if (copy_to_user(arg, ifr, DN_IFREQ_SIZE)) - ret = -EFAULT; - break; - - case SIOCSIFADDR: - if (!ifa) { - if ((ifa = dn_dev_alloc_ifa()) == NULL) { - ret = -ENOBUFS; - break; - } - memcpy(ifa->ifa_label, dev->name, IFNAMSIZ); - } else { - if (ifa->ifa_local == dn_saddr2dn(sdn)) - break; - dn_dev_del_ifa(dn_db, ifap, 0); - } - - ifa->ifa_local = ifa->ifa_address = dn_saddr2dn(sdn); - - ret = dn_dev_set_ifa(dev, ifa); - } -done: - rtnl_unlock(); - - return ret; -} - -struct net_device *dn_dev_get_default(void) -{ - struct net_device *dev; - - spin_lock(&dndev_lock); - dev = decnet_default_device; - if (dev) { - if (dev->dn_ptr) - dev_hold(dev); - else - dev = NULL; - } - spin_unlock(&dndev_lock); - - return dev; -} - -int dn_dev_set_default(struct net_device *dev, int force) -{ - struct net_device *old = NULL; - int rv = -EBUSY; - if (!dev->dn_ptr) - return -ENODEV; - - spin_lock(&dndev_lock); - if (force || decnet_default_device == NULL) { - old = decnet_default_device; - decnet_default_device = dev; - rv = 0; - } - spin_unlock(&dndev_lock); - - dev_put(old); - return rv; -} - -static void dn_dev_check_default(struct net_device *dev) -{ - spin_lock(&dndev_lock); - if (dev == decnet_default_device) { - decnet_default_device = NULL; - } else { - dev = NULL; - } - spin_unlock(&dndev_lock); - - dev_put(dev); -} - -/* - * Called with RTNL - */ -static struct dn_dev *dn_dev_by_index(int ifindex) -{ - struct net_device *dev; - struct dn_dev *dn_dev = NULL; - - dev = __dev_get_by_index(&init_net, ifindex); - if (dev) - dn_dev = rtnl_dereference(dev->dn_ptr); - - return dn_dev; -} - -static const struct nla_policy dn_ifa_policy[IFA_MAX+1] = { - [IFA_ADDRESS] = { .type = NLA_U16 }, - [IFA_LOCAL] = { .type = NLA_U16 }, - [IFA_LABEL] = { .type = NLA_STRING, - .len = IFNAMSIZ - 1 }, - [IFA_FLAGS] = { .type = NLA_U32 }, -}; - -static int dn_nl_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, - struct netlink_ext_ack *extack) -{ - struct net *net = sock_net(skb->sk); - struct nlattr *tb[IFA_MAX+1]; - struct dn_dev *dn_db; - struct ifaddrmsg *ifm; - struct dn_ifaddr *ifa; - struct dn_ifaddr __rcu **ifap; - int err = -EINVAL; - - if (!netlink_capable(skb, CAP_NET_ADMIN)) - return -EPERM; - - if (!net_eq(net, &init_net)) - goto errout; - - err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFA_MAX, - dn_ifa_policy, extack); - if (err < 0) - goto errout; - - err = -ENODEV; - ifm = nlmsg_data(nlh); - if ((dn_db = dn_dev_by_index(ifm->ifa_index)) == NULL) - goto errout; - - err = -EADDRNOTAVAIL; - for (ifap = &dn_db->ifa_list; - (ifa = rtnl_dereference(*ifap)) != NULL; - ifap = &ifa->ifa_next) { - if (tb[IFA_LOCAL] && - nla_memcmp(tb[IFA_LOCAL], &ifa->ifa_local, 2)) - continue; - - if (tb[IFA_LABEL] && nla_strcmp(tb[IFA_LABEL], ifa->ifa_label)) - continue; - - dn_dev_del_ifa(dn_db, ifap, 1); - return 0; - } - -errout: - return err; -} - -static int dn_nl_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, - struct netlink_ext_ack *extack) -{ - struct net *net = sock_net(skb->sk); - struct nlattr *tb[IFA_MAX+1]; - struct net_device *dev; - struct dn_dev *dn_db; - struct ifaddrmsg *ifm; - struct dn_ifaddr *ifa; - int err; - - if (!netlink_capable(skb, CAP_NET_ADMIN)) - return -EPERM; - - if (!net_eq(net, &init_net)) - return -EINVAL; - - err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFA_MAX, - dn_ifa_policy, extack); - if (err < 0) - return err; - - if (tb[IFA_LOCAL] == NULL) - return -EINVAL; - - ifm = nlmsg_data(nlh); - if ((dev = __dev_get_by_index(&init_net, ifm->ifa_index)) == NULL) - return -ENODEV; - - if ((dn_db = rtnl_dereference(dev->dn_ptr)) == NULL) { - dn_db = dn_dev_create(dev, &err); - if (!dn_db) - return err; - } - - if ((ifa = dn_dev_alloc_ifa()) == NULL) - return -ENOBUFS; - - if (tb[IFA_ADDRESS] == NULL) - tb[IFA_ADDRESS] = tb[IFA_LOCAL]; - - ifa->ifa_local = nla_get_le16(tb[IFA_LOCAL]); - ifa->ifa_address = nla_get_le16(tb[IFA_ADDRESS]); - ifa->ifa_flags = tb[IFA_FLAGS] ? nla_get_u32(tb[IFA_FLAGS]) : - ifm->ifa_flags; - ifa->ifa_scope = ifm->ifa_scope; - ifa->ifa_dev = dn_db; - - if (tb[IFA_LABEL]) - nla_strscpy(ifa->ifa_label, tb[IFA_LABEL], IFNAMSIZ); - else - memcpy(ifa->ifa_label, dev->name, IFNAMSIZ); - - err = dn_dev_insert_ifa(dn_db, ifa); - if (err) - dn_dev_free_ifa(ifa); - - return err; -} - -static inline size_t dn_ifaddr_nlmsg_size(void) -{ - return NLMSG_ALIGN(sizeof(struct ifaddrmsg)) - + nla_total_size(IFNAMSIZ) /* IFA_LABEL */ - + nla_total_size(2) /* IFA_ADDRESS */ - + nla_total_size(2) /* IFA_LOCAL */ - + nla_total_size(4); /* IFA_FLAGS */ -} - -static int dn_nl_fill_ifaddr(struct sk_buff *skb, struct dn_ifaddr *ifa, - u32 portid, u32 seq, int event, unsigned int flags) -{ - struct ifaddrmsg *ifm; - struct nlmsghdr *nlh; - u32 ifa_flags = ifa->ifa_flags | IFA_F_PERMANENT; - - nlh = nlmsg_put(skb, portid, seq, event, sizeof(*ifm), flags); - if (nlh == NULL) - return -EMSGSIZE; - - ifm = nlmsg_data(nlh); - ifm->ifa_family = AF_DECnet; - ifm->ifa_prefixlen = 16; - ifm->ifa_flags = ifa_flags; - ifm->ifa_scope = ifa->ifa_scope; - ifm->ifa_index = ifa->ifa_dev->dev->ifindex; - - if ((ifa->ifa_address && - nla_put_le16(skb, IFA_ADDRESS, ifa->ifa_address)) || - (ifa->ifa_local && - nla_put_le16(skb, IFA_LOCAL, ifa->ifa_local)) || - (ifa->ifa_label[0] && - nla_put_string(skb, IFA_LABEL, ifa->ifa_label)) || - nla_put_u32(skb, IFA_FLAGS, ifa_flags)) - goto nla_put_failure; - nlmsg_end(skb, nlh); - return 0; - -nla_put_failure: - nlmsg_cancel(skb, nlh); - return -EMSGSIZE; -} - -static void dn_ifaddr_notify(int event, struct dn_ifaddr *ifa) -{ - struct sk_buff *skb; - int err = -ENOBUFS; - - skb = alloc_skb(dn_ifaddr_nlmsg_size(), GFP_KERNEL); - if (skb == NULL) - goto errout; - - err = dn_nl_fill_ifaddr(skb, ifa, 0, 0, event, 0); - if (err < 0) { - /* -EMSGSIZE implies BUG in dn_ifaddr_nlmsg_size() */ - WARN_ON(err == -EMSGSIZE); - kfree_skb(skb); - goto errout; - } - rtnl_notify(skb, &init_net, 0, RTNLGRP_DECnet_IFADDR, NULL, GFP_KERNEL); - return; -errout: - if (err < 0) - rtnl_set_sk_err(&init_net, RTNLGRP_DECnet_IFADDR, err); -} - -static int dn_nl_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb) -{ - struct net *net = sock_net(skb->sk); - int idx, dn_idx = 0, skip_ndevs, skip_naddr; - struct net_device *dev; - struct dn_dev *dn_db; - struct dn_ifaddr *ifa; - - if (!net_eq(net, &init_net)) - return 0; - - skip_ndevs = cb->args[0]; - skip_naddr = cb->args[1]; - - idx = 0; - rcu_read_lock(); - for_each_netdev_rcu(&init_net, dev) { - if (idx < skip_ndevs) - goto cont; - else if (idx > skip_ndevs) { - /* Only skip over addresses for first dev dumped - * in this iteration (idx == skip_ndevs) */ - skip_naddr = 0; - } - - if ((dn_db = rcu_dereference(dev->dn_ptr)) == NULL) - goto cont; - - for (ifa = rcu_dereference(dn_db->ifa_list), dn_idx = 0; ifa; - ifa = rcu_dereference(ifa->ifa_next), dn_idx++) { - if (dn_idx < skip_naddr) - continue; - - if (dn_nl_fill_ifaddr(skb, ifa, NETLINK_CB(cb->skb).portid, - cb->nlh->nlmsg_seq, RTM_NEWADDR, - NLM_F_MULTI) < 0) - goto done; - } -cont: - idx++; - } -done: - rcu_read_unlock(); - cb->args[0] = idx; - cb->args[1] = dn_idx; - - return skb->len; -} - -static int dn_dev_get_first(struct net_device *dev, __le16 *addr) -{ - struct dn_dev *dn_db; - struct dn_ifaddr *ifa; - int rv = -ENODEV; - - rcu_read_lock(); - dn_db = rcu_dereference(dev->dn_ptr); - if (dn_db == NULL) - goto out; - - ifa = rcu_dereference(dn_db->ifa_list); - if (ifa != NULL) { - *addr = ifa->ifa_local; - rv = 0; - } -out: - rcu_read_unlock(); - return rv; -} - -/* - * Find a default address to bind to. - * - * This is one of those areas where the initial VMS concepts don't really - * map onto the Linux concepts, and since we introduced multiple addresses - * per interface we have to cope with slightly odd ways of finding out what - * "our address" really is. Mostly it's not a problem; for this we just guess - * a sensible default. Eventually the routing code will take care of all the - * nasties for us I hope. - */ -int dn_dev_bind_default(__le16 *addr) -{ - struct net_device *dev; - int rv; - dev = dn_dev_get_default(); -last_chance: - if (dev) { - rv = dn_dev_get_first(dev, addr); - dev_put(dev); - if (rv == 0 || dev == init_net.loopback_dev) - return rv; - } - dev = init_net.loopback_dev; - dev_hold(dev); - goto last_chance; -} - -static void dn_send_endnode_hello(struct net_device *dev, struct dn_ifaddr *ifa) -{ - struct endnode_hello_message *msg; - struct sk_buff *skb = NULL; - __le16 *pktlen; - struct dn_dev *dn_db = rcu_dereference_raw(dev->dn_ptr); - - if ((skb = dn_alloc_skb(NULL, sizeof(*msg), GFP_ATOMIC)) == NULL) - return; - - skb->dev = dev; - - msg = skb_put(skb, sizeof(*msg)); - - msg->msgflg = 0x0D; - memcpy(msg->tiver, dn_eco_version, 3); - dn_dn2eth(msg->id, ifa->ifa_local); - msg->iinfo = DN_RT_INFO_ENDN; - msg->blksize = cpu_to_le16(mtu2blksize(dev)); - msg->area = 0x00; - memset(msg->seed, 0, 8); - memcpy(msg->neighbor, dn_hiord, ETH_ALEN); - - if (dn_db->router) { - struct dn_neigh *dn = container_of(dn_db->router, struct dn_neigh, n); - dn_dn2eth(msg->neighbor, dn->addr); - } - - msg->timer = cpu_to_le16((unsigned short)dn_db->parms.t3); - msg->mpd = 0x00; - msg->datalen = 0x02; - memset(msg->data, 0xAA, 2); - - pktlen = skb_push(skb, 2); - *pktlen = cpu_to_le16(skb->len - 2); - - skb_reset_network_header(skb); - - dn_rt_finish_output(skb, dn_rt_all_rt_mcast, msg->id); -} - - -#define DRDELAY (5 * HZ) - -static int dn_am_i_a_router(struct dn_neigh *dn, struct dn_dev *dn_db, struct dn_ifaddr *ifa) -{ - /* First check time since device went up */ - if (time_before(jiffies, dn_db->uptime + DRDELAY)) - return 0; - - /* If there is no router, then yes... */ - if (!dn_db->router) - return 1; - - /* otherwise only if we have a higher priority or.. */ - if (dn->priority < dn_db->parms.priority) - return 1; - - /* if we have equal priority and a higher node number */ - if (dn->priority != dn_db->parms.priority) - return 0; - - if (le16_to_cpu(dn->addr) < le16_to_cpu(ifa->ifa_local)) - return 1; - - return 0; -} - -static void dn_send_router_hello(struct net_device *dev, struct dn_ifaddr *ifa) -{ - int n; - struct dn_dev *dn_db = rcu_dereference_raw(dev->dn_ptr); - struct dn_neigh *dn = container_of(dn_db->router, struct dn_neigh, n); - struct sk_buff *skb; - size_t size; - unsigned char *ptr; - unsigned char *i1, *i2; - __le16 *pktlen; - char *src; - - if (mtu2blksize(dev) < (26 + 7)) - return; - - n = mtu2blksize(dev) - 26; - n /= 7; - - if (n > 32) - n = 32; - - size = 2 + 26 + 7 * n; - - if ((skb = dn_alloc_skb(NULL, size, GFP_ATOMIC)) == NULL) - return; - - skb->dev = dev; - ptr = skb_put(skb, size); - - *ptr++ = DN_RT_PKT_CNTL | DN_RT_PKT_ERTH; - *ptr++ = 2; /* ECO */ - *ptr++ = 0; - *ptr++ = 0; - dn_dn2eth(ptr, ifa->ifa_local); - src = ptr; - ptr += ETH_ALEN; - *ptr++ = dn_db->parms.forwarding == 1 ? - DN_RT_INFO_L1RT : DN_RT_INFO_L2RT; - *((__le16 *)ptr) = cpu_to_le16(mtu2blksize(dev)); - ptr += 2; - *ptr++ = dn_db->parms.priority; /* Priority */ - *ptr++ = 0; /* Area: Reserved */ - *((__le16 *)ptr) = cpu_to_le16((unsigned short)dn_db->parms.t3); - ptr += 2; - *ptr++ = 0; /* MPD: Reserved */ - i1 = ptr++; - memset(ptr, 0, 7); /* Name: Reserved */ - ptr += 7; - i2 = ptr++; - - n = dn_neigh_elist(dev, ptr, n); - - *i2 = 7 * n; - *i1 = 8 + *i2; - - skb_trim(skb, (27 + *i2)); - - pktlen = skb_push(skb, 2); - *pktlen = cpu_to_le16(skb->len - 2); - - skb_reset_network_header(skb); - - if (dn_am_i_a_router(dn, dn_db, ifa)) { - struct sk_buff *skb2 = skb_copy(skb, GFP_ATOMIC); - if (skb2) { - dn_rt_finish_output(skb2, dn_rt_all_end_mcast, src); - } - } - - dn_rt_finish_output(skb, dn_rt_all_rt_mcast, src); -} - -static void dn_send_brd_hello(struct net_device *dev, struct dn_ifaddr *ifa) -{ - struct dn_dev *dn_db = rcu_dereference_raw(dev->dn_ptr); - - if (dn_db->parms.forwarding == 0) - dn_send_endnode_hello(dev, ifa); - else - dn_send_router_hello(dev, ifa); -} - -static void dn_send_ptp_hello(struct net_device *dev, struct dn_ifaddr *ifa) -{ - int tdlen = 16; - int size = dev->hard_header_len + 2 + 4 + tdlen; - struct sk_buff *skb = dn_alloc_skb(NULL, size, GFP_ATOMIC); - int i; - unsigned char *ptr; - char src[ETH_ALEN]; - - if (skb == NULL) - return ; - - skb->dev = dev; - skb_push(skb, dev->hard_header_len); - ptr = skb_put(skb, 2 + 4 + tdlen); - - *ptr++ = DN_RT_PKT_HELO; - *((__le16 *)ptr) = ifa->ifa_local; - ptr += 2; - *ptr++ = tdlen; - - for(i = 0; i < tdlen; i++) - *ptr++ = 0252; - - dn_dn2eth(src, ifa->ifa_local); - dn_rt_finish_output(skb, dn_rt_all_rt_mcast, src); -} - -static int dn_eth_up(struct net_device *dev) -{ - struct dn_dev *dn_db = rcu_dereference_raw(dev->dn_ptr); - - if (dn_db->parms.forwarding == 0) - dev_mc_add(dev, dn_rt_all_end_mcast); - else - dev_mc_add(dev, dn_rt_all_rt_mcast); - - dn_db->use_long = 1; - - return 0; -} - -static void dn_eth_down(struct net_device *dev) -{ - struct dn_dev *dn_db = rcu_dereference_raw(dev->dn_ptr); - - if (dn_db->parms.forwarding == 0) - dev_mc_del(dev, dn_rt_all_end_mcast); - else - dev_mc_del(dev, dn_rt_all_rt_mcast); -} - -static void dn_dev_set_timer(struct net_device *dev); - -static void dn_dev_timer_func(struct timer_list *t) -{ - struct dn_dev *dn_db = from_timer(dn_db, t, timer); - struct net_device *dev; - struct dn_ifaddr *ifa; - - rcu_read_lock(); - dev = dn_db->dev; - if (dn_db->t3 <= dn_db->parms.t2) { - if (dn_db->parms.timer3) { - for (ifa = rcu_dereference(dn_db->ifa_list); - ifa; - ifa = rcu_dereference(ifa->ifa_next)) { - if (!(ifa->ifa_flags & IFA_F_SECONDARY)) - dn_db->parms.timer3(dev, ifa); - } - } - dn_db->t3 = dn_db->parms.t3; - } else { - dn_db->t3 -= dn_db->parms.t2; - } - rcu_read_unlock(); - dn_dev_set_timer(dev); -} - -static void dn_dev_set_timer(struct net_device *dev) -{ - struct dn_dev *dn_db = rcu_dereference_raw(dev->dn_ptr); - - if (dn_db->parms.t2 > dn_db->parms.t3) - dn_db->parms.t2 = dn_db->parms.t3; - - dn_db->timer.expires = jiffies + (dn_db->parms.t2 * HZ); - - add_timer(&dn_db->timer); -} - -static struct dn_dev *dn_dev_create(struct net_device *dev, int *err) -{ - int i; - struct dn_dev_parms *p = dn_dev_list; - struct dn_dev *dn_db; - - for(i = 0; i < DN_DEV_LIST_SIZE; i++, p++) { - if (p->type == dev->type) - break; - } - - *err = -ENODEV; - if (i == DN_DEV_LIST_SIZE) - return NULL; - - *err = -ENOBUFS; - if ((dn_db = kzalloc(sizeof(struct dn_dev), GFP_ATOMIC)) == NULL) - return NULL; - - memcpy(&dn_db->parms, p, sizeof(struct dn_dev_parms)); - - rcu_assign_pointer(dev->dn_ptr, dn_db); - dn_db->dev = dev; - timer_setup(&dn_db->timer, dn_dev_timer_func, 0); - - dn_db->uptime = jiffies; - - dn_db->neigh_parms = neigh_parms_alloc(dev, &dn_neigh_table); - if (!dn_db->neigh_parms) { - RCU_INIT_POINTER(dev->dn_ptr, NULL); - kfree(dn_db); - return NULL; - } - - if (dn_db->parms.up) { - if (dn_db->parms.up(dev) < 0) { - neigh_parms_release(&dn_neigh_table, dn_db->neigh_parms); - dev->dn_ptr = NULL; - kfree(dn_db); - return NULL; - } - } - - dn_dev_sysctl_register(dev, &dn_db->parms); - - dn_dev_set_timer(dev); - - *err = 0; - return dn_db; -} - - -/* - * This processes a device up event. We only start up - * the loopback device & ethernet devices with correct - * MAC addresses automatically. Others must be started - * specifically. - * - * FIXME: How should we configure the loopback address ? If we could dispense - * with using decnet_address here and for autobind, it will be one less thing - * for users to worry about setting up. - */ - -void dn_dev_up(struct net_device *dev) -{ - struct dn_ifaddr *ifa; - __le16 addr = decnet_address; - int maybe_default = 0; - struct dn_dev *dn_db = rtnl_dereference(dev->dn_ptr); - - if ((dev->type != ARPHRD_ETHER) && (dev->type != ARPHRD_LOOPBACK)) - return; - - /* - * Need to ensure that loopback device has a dn_db attached to it - * to allow creation of neighbours against it, even though it might - * not have a local address of its own. Might as well do the same for - * all autoconfigured interfaces. - */ - if (dn_db == NULL) { - int err; - dn_db = dn_dev_create(dev, &err); - if (dn_db == NULL) - return; - } - - if (dev->type == ARPHRD_ETHER) { - if (memcmp(dev->dev_addr, dn_hiord, 4) != 0) - return; - addr = dn_eth2dn(dev->dev_addr); - maybe_default = 1; - } - - if (addr == 0) - return; - - if ((ifa = dn_dev_alloc_ifa()) == NULL) - return; - - ifa->ifa_local = ifa->ifa_address = addr; - ifa->ifa_flags = 0; - ifa->ifa_scope = RT_SCOPE_UNIVERSE; - strcpy(ifa->ifa_label, dev->name); - - dn_dev_set_ifa(dev, ifa); - - /* - * Automagically set the default device to the first automatically - * configured ethernet card in the system. - */ - if (maybe_default) { - dev_hold(dev); - if (dn_dev_set_default(dev, 0)) - dev_put(dev); - } -} - -static void dn_dev_delete(struct net_device *dev) -{ - struct dn_dev *dn_db = rtnl_dereference(dev->dn_ptr); - - if (dn_db == NULL) - return; - - del_timer_sync(&dn_db->timer); - dn_dev_sysctl_unregister(&dn_db->parms); - dn_dev_check_default(dev); - neigh_ifdown(&dn_neigh_table, dev); - - if (dn_db->parms.down) - dn_db->parms.down(dev); - - dev->dn_ptr = NULL; - - neigh_parms_release(&dn_neigh_table, dn_db->neigh_parms); - neigh_ifdown(&dn_neigh_table, dev); - - if (dn_db->router) - neigh_release(dn_db->router); - if (dn_db->peer) - neigh_release(dn_db->peer); - - kfree(dn_db); -} - -void dn_dev_down(struct net_device *dev) -{ - struct dn_dev *dn_db = rtnl_dereference(dev->dn_ptr); - struct dn_ifaddr *ifa; - - if (dn_db == NULL) - return; - - while ((ifa = rtnl_dereference(dn_db->ifa_list)) != NULL) { - dn_dev_del_ifa(dn_db, &dn_db->ifa_list, 0); - dn_dev_free_ifa(ifa); - } - - dn_dev_delete(dev); -} - -void dn_dev_init_pkt(struct sk_buff *skb) -{ -} - -void dn_dev_veri_pkt(struct sk_buff *skb) -{ -} - -void dn_dev_hello(struct sk_buff *skb) -{ -} - -void dn_dev_devices_off(void) -{ - struct net_device *dev; - - rtnl_lock(); - for_each_netdev(&init_net, dev) - dn_dev_down(dev); - rtnl_unlock(); - -} - -void dn_dev_devices_on(void) -{ - struct net_device *dev; - - rtnl_lock(); - for_each_netdev(&init_net, dev) { - if (dev->flags & IFF_UP) - dn_dev_up(dev); - } - rtnl_unlock(); -} - -int register_dnaddr_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_register(&dnaddr_chain, nb); -} - -int unregister_dnaddr_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_unregister(&dnaddr_chain, nb); -} - -#ifdef CONFIG_PROC_FS -static inline int is_dn_dev(struct net_device *dev) -{ - return dev->dn_ptr != NULL; -} - -static void *dn_dev_seq_start(struct seq_file *seq, loff_t *pos) - __acquires(RCU) -{ - int i; - struct net_device *dev; - - rcu_read_lock(); - - if (*pos == 0) - return SEQ_START_TOKEN; - - i = 1; - for_each_netdev_rcu(&init_net, dev) { - if (!is_dn_dev(dev)) - continue; - - if (i++ == *pos) - return dev; - } - - return NULL; -} - -static void *dn_dev_seq_next(struct seq_file *seq, void *v, loff_t *pos) -{ - struct net_device *dev; - - ++*pos; - - dev = v; - if (v == SEQ_START_TOKEN) - dev = net_device_entry(&init_net.dev_base_head); - - for_each_netdev_continue_rcu(&init_net, dev) { - if (!is_dn_dev(dev)) - continue; - - return dev; - } - - return NULL; -} - -static void dn_dev_seq_stop(struct seq_file *seq, void *v) - __releases(RCU) -{ - rcu_read_unlock(); -} - -static char *dn_type2asc(char type) -{ - switch (type) { - case DN_DEV_BCAST: - return "B"; - case DN_DEV_UCAST: - return "U"; - case DN_DEV_MPOINT: - return "M"; - } - - return "?"; -} - -static int dn_dev_seq_show(struct seq_file *seq, void *v) -{ - if (v == SEQ_START_TOKEN) - seq_puts(seq, "Name Flags T1 Timer1 T3 Timer3 BlkSize Pri State DevType Router Peer\n"); - else { - struct net_device *dev = v; - char peer_buf[DN_ASCBUF_LEN]; - char router_buf[DN_ASCBUF_LEN]; - struct dn_dev *dn_db = rcu_dereference(dev->dn_ptr); - - seq_printf(seq, "%-8s %1s %04u %04u %04lu %04lu" - " %04hu %03d %02x %-10s %-7s %-7s\n", - dev->name, - dn_type2asc(dn_db->parms.mode), - 0, 0, - dn_db->t3, dn_db->parms.t3, - mtu2blksize(dev), - dn_db->parms.priority, - dn_db->parms.state, dn_db->parms.name, - dn_db->router ? dn_addr2asc(le16_to_cpu(*(__le16 *)dn_db->router->primary_key), router_buf) : "", - dn_db->peer ? dn_addr2asc(le16_to_cpu(*(__le16 *)dn_db->peer->primary_key), peer_buf) : ""); - } - return 0; -} - -static const struct seq_operations dn_dev_seq_ops = { - .start = dn_dev_seq_start, - .next = dn_dev_seq_next, - .stop = dn_dev_seq_stop, - .show = dn_dev_seq_show, -}; -#endif /* CONFIG_PROC_FS */ - -static int addr[2]; -module_param_array(addr, int, NULL, 0444); -MODULE_PARM_DESC(addr, "The DECnet address of this machine: area,node"); - -void __init dn_dev_init(void) -{ - if (addr[0] > 63 || addr[0] < 0) { - printk(KERN_ERR "DECnet: Area must be between 0 and 63"); - return; - } - - if (addr[1] > 1023 || addr[1] < 0) { - printk(KERN_ERR "DECnet: Node must be between 0 and 1023"); - return; - } - - decnet_address = cpu_to_le16((addr[0] << 10) | addr[1]); - - dn_dev_devices_on(); - - rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_NEWADDR, - dn_nl_newaddr, NULL, 0); - rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_DELADDR, - dn_nl_deladdr, NULL, 0); - rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETADDR, - NULL, dn_nl_dump_ifaddr, 0); - - proc_create_seq("decnet_dev", 0444, init_net.proc_net, &dn_dev_seq_ops); - -#ifdef CONFIG_SYSCTL - { - int i; - for(i = 0; i < DN_DEV_LIST_SIZE; i++) - dn_dev_sysctl_register(NULL, &dn_dev_list[i]); - } -#endif /* CONFIG_SYSCTL */ -} - -void __exit dn_dev_cleanup(void) -{ -#ifdef CONFIG_SYSCTL - { - int i; - for(i = 0; i < DN_DEV_LIST_SIZE; i++) - dn_dev_sysctl_unregister(&dn_dev_list[i]); - } -#endif /* CONFIG_SYSCTL */ - - remove_proc_entry("decnet_dev", init_net.proc_net); - - dn_dev_devices_off(); -} diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c deleted file mode 100644 index 269c029ad74f..000000000000 --- a/net/decnet/dn_fib.c +++ /dev/null @@ -1,798 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Routing Forwarding Information Base (Glue/Info List) - * - * Author: Steve Whitehouse <SteveW@ACM.org> - * - * - * Changes: - * Alexey Kuznetsov : SMP locking changes - * Steve Whitehouse : Rewrote it... Well to be more correct, I - * copied most of it from the ipv4 fib code. - * Steve Whitehouse : Updated it in style and fixed a few bugs - * which were fixed in the ipv4 code since - * this code was copied from it. - * - */ -#include <linux/string.h> -#include <linux/net.h> -#include <linux/socket.h> -#include <linux/slab.h> -#include <linux/sockios.h> -#include <linux/init.h> -#include <linux/skbuff.h> -#include <linux/netlink.h> -#include <linux/rtnetlink.h> -#include <linux/proc_fs.h> -#include <linux/netdevice.h> -#include <linux/timer.h> -#include <linux/spinlock.h> -#include <linux/atomic.h> -#include <linux/uaccess.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/flow.h> -#include <net/fib_rules.h> -#include <net/dn.h> -#include <net/dn_route.h> -#include <net/dn_fib.h> -#include <net/dn_neigh.h> -#include <net/dn_dev.h> -#include <net/rtnh.h> - -#define RT_MIN_TABLE 1 - -#define for_fib_info() { struct dn_fib_info *fi;\ - for(fi = dn_fib_info_list; fi; fi = fi->fib_next) -#define endfor_fib_info() } - -#define for_nexthops(fi) { int nhsel; const struct dn_fib_nh *nh;\ - for(nhsel = 0, nh = (fi)->fib_nh; nhsel < (fi)->fib_nhs; nh++, nhsel++) - -#define change_nexthops(fi) { int nhsel; struct dn_fib_nh *nh;\ - for(nhsel = 0, nh = (struct dn_fib_nh *)((fi)->fib_nh); nhsel < (fi)->fib_nhs; nh++, nhsel++) - -#define endfor_nexthops(fi) } - -static DEFINE_SPINLOCK(dn_fib_multipath_lock); -static struct dn_fib_info *dn_fib_info_list; -static DEFINE_SPINLOCK(dn_fib_info_lock); - -static struct -{ - int error; - u8 scope; -} dn_fib_props[RTN_MAX+1] = { - [RTN_UNSPEC] = { .error = 0, .scope = RT_SCOPE_NOWHERE }, - [RTN_UNICAST] = { .error = 0, .scope = RT_SCOPE_UNIVERSE }, - [RTN_LOCAL] = { .error = 0, .scope = RT_SCOPE_HOST }, - [RTN_BROADCAST] = { .error = -EINVAL, .scope = RT_SCOPE_NOWHERE }, - [RTN_ANYCAST] = { .error = -EINVAL, .scope = RT_SCOPE_NOWHERE }, - [RTN_MULTICAST] = { .error = -EINVAL, .scope = RT_SCOPE_NOWHERE }, - [RTN_BLACKHOLE] = { .error = -EINVAL, .scope = RT_SCOPE_UNIVERSE }, - [RTN_UNREACHABLE] = { .error = -EHOSTUNREACH, .scope = RT_SCOPE_UNIVERSE }, - [RTN_PROHIBIT] = { .error = -EACCES, .scope = RT_SCOPE_UNIVERSE }, - [RTN_THROW] = { .error = -EAGAIN, .scope = RT_SCOPE_UNIVERSE }, - [RTN_NAT] = { .error = 0, .scope = RT_SCOPE_NOWHERE }, - [RTN_XRESOLVE] = { .error = -EINVAL, .scope = RT_SCOPE_NOWHERE }, -}; - -static int dn_fib_sync_down(__le16 local, struct net_device *dev, int force); -static int dn_fib_sync_up(struct net_device *dev); - -void dn_fib_free_info(struct dn_fib_info *fi) -{ - if (fi->fib_dead == 0) { - printk(KERN_DEBUG "DECnet: BUG! Attempt to free alive dn_fib_info\n"); - return; - } - - change_nexthops(fi) { - dev_put(nh->nh_dev); - nh->nh_dev = NULL; - } endfor_nexthops(fi); - kfree(fi); -} - -void dn_fib_release_info(struct dn_fib_info *fi) -{ - spin_lock(&dn_fib_info_lock); - if (fi && refcount_dec_and_test(&fi->fib_treeref)) { - if (fi->fib_next) - fi->fib_next->fib_prev = fi->fib_prev; - if (fi->fib_prev) - fi->fib_prev->fib_next = fi->fib_next; - if (fi == dn_fib_info_list) - dn_fib_info_list = fi->fib_next; - fi->fib_dead = 1; - dn_fib_info_put(fi); - } - spin_unlock(&dn_fib_info_lock); -} - -static inline int dn_fib_nh_comp(const struct dn_fib_info *fi, const struct dn_fib_info *ofi) -{ - const struct dn_fib_nh *onh = ofi->fib_nh; - - for_nexthops(fi) { - if (nh->nh_oif != onh->nh_oif || - nh->nh_gw != onh->nh_gw || - nh->nh_scope != onh->nh_scope || - nh->nh_weight != onh->nh_weight || - ((nh->nh_flags^onh->nh_flags)&~RTNH_F_DEAD)) - return -1; - onh++; - } endfor_nexthops(fi); - return 0; -} - -static inline struct dn_fib_info *dn_fib_find_info(const struct dn_fib_info *nfi) -{ - for_fib_info() { - if (fi->fib_nhs != nfi->fib_nhs) - continue; - if (nfi->fib_protocol == fi->fib_protocol && - nfi->fib_prefsrc == fi->fib_prefsrc && - nfi->fib_priority == fi->fib_priority && - memcmp(nfi->fib_metrics, fi->fib_metrics, sizeof(fi->fib_metrics)) == 0 && - ((nfi->fib_flags^fi->fib_flags)&~RTNH_F_DEAD) == 0 && - (nfi->fib_nhs == 0 || dn_fib_nh_comp(fi, nfi) == 0)) - return fi; - } endfor_fib_info(); - return NULL; -} - -static int dn_fib_count_nhs(const struct nlattr *attr) -{ - struct rtnexthop *nhp = nla_data(attr); - int nhs = 0, nhlen = nla_len(attr); - - while (rtnh_ok(nhp, nhlen)) { - nhs++; - nhp = rtnh_next(nhp, &nhlen); - } - - /* leftover implies invalid nexthop configuration, discard it */ - return nhlen > 0 ? 0 : nhs; -} - -static int dn_fib_get_nhs(struct dn_fib_info *fi, const struct nlattr *attr, - const struct rtmsg *r) -{ - struct rtnexthop *nhp = nla_data(attr); - int nhlen = nla_len(attr); - - change_nexthops(fi) { - int attrlen; - - if (!rtnh_ok(nhp, nhlen)) - return -EINVAL; - - nh->nh_flags = (r->rtm_flags&~0xFF) | nhp->rtnh_flags; - nh->nh_oif = nhp->rtnh_ifindex; - nh->nh_weight = nhp->rtnh_hops + 1; - - attrlen = rtnh_attrlen(nhp); - if (attrlen > 0) { - struct nlattr *gw_attr; - - gw_attr = nla_find((struct nlattr *) (nhp + 1), attrlen, RTA_GATEWAY); - nh->nh_gw = gw_attr ? nla_get_le16(gw_attr) : 0; - } - - nhp = rtnh_next(nhp, &nhlen); - } endfor_nexthops(fi); - - return 0; -} - - -static int dn_fib_check_nh(const struct rtmsg *r, struct dn_fib_info *fi, struct dn_fib_nh *nh) -{ - int err; - - if (nh->nh_gw) { - struct flowidn fld; - struct dn_fib_res res; - - if (nh->nh_flags&RTNH_F_ONLINK) { - struct net_device *dev; - - if (r->rtm_scope >= RT_SCOPE_LINK) - return -EINVAL; - if (dnet_addr_type(nh->nh_gw) != RTN_UNICAST) - return -EINVAL; - if ((dev = __dev_get_by_index(&init_net, nh->nh_oif)) == NULL) - return -ENODEV; - if (!(dev->flags&IFF_UP)) - return -ENETDOWN; - nh->nh_dev = dev; - dev_hold(dev); - nh->nh_scope = RT_SCOPE_LINK; - return 0; - } - - memset(&fld, 0, sizeof(fld)); - fld.daddr = nh->nh_gw; - fld.flowidn_oif = nh->nh_oif; - fld.flowidn_scope = r->rtm_scope + 1; - - if (fld.flowidn_scope < RT_SCOPE_LINK) - fld.flowidn_scope = RT_SCOPE_LINK; - - if ((err = dn_fib_lookup(&fld, &res)) != 0) - return err; - - err = -EINVAL; - if (res.type != RTN_UNICAST && res.type != RTN_LOCAL) - goto out; - nh->nh_scope = res.scope; - nh->nh_oif = DN_FIB_RES_OIF(res); - nh->nh_dev = DN_FIB_RES_DEV(res); - if (nh->nh_dev == NULL) - goto out; - dev_hold(nh->nh_dev); - err = -ENETDOWN; - if (!(nh->nh_dev->flags & IFF_UP)) - goto out; - err = 0; -out: - dn_fib_res_put(&res); - return err; - } else { - struct net_device *dev; - - if (nh->nh_flags&(RTNH_F_PERVASIVE|RTNH_F_ONLINK)) - return -EINVAL; - - dev = __dev_get_by_index(&init_net, nh->nh_oif); - if (dev == NULL || dev->dn_ptr == NULL) - return -ENODEV; - if (!(dev->flags&IFF_UP)) - return -ENETDOWN; - nh->nh_dev = dev; - dev_hold(nh->nh_dev); - nh->nh_scope = RT_SCOPE_HOST; - } - - return 0; -} - - -struct dn_fib_info *dn_fib_create_info(const struct rtmsg *r, struct nlattr *attrs[], - const struct nlmsghdr *nlh, int *errp) -{ - int err; - struct dn_fib_info *fi = NULL; - struct dn_fib_info *ofi; - int nhs = 1; - - if (r->rtm_type > RTN_MAX) - goto err_inval; - - if (dn_fib_props[r->rtm_type].scope > r->rtm_scope) - goto err_inval; - - if (attrs[RTA_MULTIPATH] && - (nhs = dn_fib_count_nhs(attrs[RTA_MULTIPATH])) == 0) - goto err_inval; - - fi = kzalloc(struct_size(fi, fib_nh, nhs), GFP_KERNEL); - err = -ENOBUFS; - if (fi == NULL) - goto failure; - - fi->fib_protocol = r->rtm_protocol; - fi->fib_nhs = nhs; - fi->fib_flags = r->rtm_flags; - - if (attrs[RTA_PRIORITY]) - fi->fib_priority = nla_get_u32(attrs[RTA_PRIORITY]); - - if (attrs[RTA_METRICS]) { - struct nlattr *attr; - int rem; - - nla_for_each_nested(attr, attrs[RTA_METRICS], rem) { - int type = nla_type(attr); - - if (type) { - if (type > RTAX_MAX || type == RTAX_CC_ALGO || - nla_len(attr) < 4) - goto err_inval; - - fi->fib_metrics[type-1] = nla_get_u32(attr); - } - } - } - - if (attrs[RTA_PREFSRC]) - fi->fib_prefsrc = nla_get_le16(attrs[RTA_PREFSRC]); - - if (attrs[RTA_MULTIPATH]) { - if ((err = dn_fib_get_nhs(fi, attrs[RTA_MULTIPATH], r)) != 0) - goto failure; - - if (attrs[RTA_OIF] && - fi->fib_nh->nh_oif != nla_get_u32(attrs[RTA_OIF])) - goto err_inval; - - if (attrs[RTA_GATEWAY] && - fi->fib_nh->nh_gw != nla_get_le16(attrs[RTA_GATEWAY])) - goto err_inval; - } else { - struct dn_fib_nh *nh = fi->fib_nh; - - if (attrs[RTA_OIF]) - nh->nh_oif = nla_get_u32(attrs[RTA_OIF]); - - if (attrs[RTA_GATEWAY]) - nh->nh_gw = nla_get_le16(attrs[RTA_GATEWAY]); - - nh->nh_flags = r->rtm_flags; - nh->nh_weight = 1; - } - - if (r->rtm_type == RTN_NAT) { - if (!attrs[RTA_GATEWAY] || nhs != 1 || attrs[RTA_OIF]) - goto err_inval; - - fi->fib_nh->nh_gw = nla_get_le16(attrs[RTA_GATEWAY]); - goto link_it; - } - - if (dn_fib_props[r->rtm_type].error) { - if (attrs[RTA_GATEWAY] || attrs[RTA_OIF] || attrs[RTA_MULTIPATH]) - goto err_inval; - - goto link_it; - } - - if (r->rtm_scope > RT_SCOPE_HOST) - goto err_inval; - - if (r->rtm_scope == RT_SCOPE_HOST) { - struct dn_fib_nh *nh = fi->fib_nh; - - /* Local address is added */ - if (nhs != 1 || nh->nh_gw) - goto err_inval; - nh->nh_scope = RT_SCOPE_NOWHERE; - nh->nh_dev = dev_get_by_index(&init_net, fi->fib_nh->nh_oif); - err = -ENODEV; - if (nh->nh_dev == NULL) - goto failure; - } else { - change_nexthops(fi) { - if ((err = dn_fib_check_nh(r, fi, nh)) != 0) - goto failure; - } endfor_nexthops(fi) - } - - if (fi->fib_prefsrc) { - if (r->rtm_type != RTN_LOCAL || !attrs[RTA_DST] || - fi->fib_prefsrc != nla_get_le16(attrs[RTA_DST])) - if (dnet_addr_type(fi->fib_prefsrc) != RTN_LOCAL) - goto err_inval; - } - -link_it: - if ((ofi = dn_fib_find_info(fi)) != NULL) { - fi->fib_dead = 1; - dn_fib_free_info(fi); - refcount_inc(&ofi->fib_treeref); - return ofi; - } - - refcount_set(&fi->fib_treeref, 1); - refcount_set(&fi->fib_clntref, 1); - spin_lock(&dn_fib_info_lock); - fi->fib_next = dn_fib_info_list; - fi->fib_prev = NULL; - if (dn_fib_info_list) - dn_fib_info_list->fib_prev = fi; - dn_fib_info_list = fi; - spin_unlock(&dn_fib_info_lock); - return fi; - -err_inval: - err = -EINVAL; - -failure: - *errp = err; - if (fi) { - fi->fib_dead = 1; - dn_fib_free_info(fi); - } - - return NULL; -} - -int dn_fib_semantic_match(int type, struct dn_fib_info *fi, const struct flowidn *fld, struct dn_fib_res *res) -{ - int err = dn_fib_props[type].error; - - if (err == 0) { - if (fi->fib_flags & RTNH_F_DEAD) - return 1; - - res->fi = fi; - - switch (type) { - case RTN_NAT: - DN_FIB_RES_RESET(*res); - refcount_inc(&fi->fib_clntref); - return 0; - case RTN_UNICAST: - case RTN_LOCAL: - for_nexthops(fi) { - if (nh->nh_flags & RTNH_F_DEAD) - continue; - if (!fld->flowidn_oif || - fld->flowidn_oif == nh->nh_oif) - break; - } - if (nhsel < fi->fib_nhs) { - res->nh_sel = nhsel; - refcount_inc(&fi->fib_clntref); - return 0; - } - endfor_nexthops(fi); - res->fi = NULL; - return 1; - default: - net_err_ratelimited("DECnet: impossible routing event : dn_fib_semantic_match type=%d\n", - type); - res->fi = NULL; - return -EINVAL; - } - } - return err; -} - -void dn_fib_select_multipath(const struct flowidn *fld, struct dn_fib_res *res) -{ - struct dn_fib_info *fi = res->fi; - int w; - - spin_lock_bh(&dn_fib_multipath_lock); - if (fi->fib_power <= 0) { - int power = 0; - change_nexthops(fi) { - if (!(nh->nh_flags&RTNH_F_DEAD)) { - power += nh->nh_weight; - nh->nh_power = nh->nh_weight; - } - } endfor_nexthops(fi); - fi->fib_power = power; - if (power < 0) { - spin_unlock_bh(&dn_fib_multipath_lock); - res->nh_sel = 0; - return; - } - } - - w = jiffies % fi->fib_power; - - change_nexthops(fi) { - if (!(nh->nh_flags&RTNH_F_DEAD) && nh->nh_power) { - if ((w -= nh->nh_power) <= 0) { - nh->nh_power--; - fi->fib_power--; - res->nh_sel = nhsel; - spin_unlock_bh(&dn_fib_multipath_lock); - return; - } - } - } endfor_nexthops(fi); - res->nh_sel = 0; - spin_unlock_bh(&dn_fib_multipath_lock); -} - -static inline u32 rtm_get_table(struct nlattr *attrs[], u8 table) -{ - if (attrs[RTA_TABLE]) - table = nla_get_u32(attrs[RTA_TABLE]); - - return table; -} - -static int dn_fib_rtm_delroute(struct sk_buff *skb, struct nlmsghdr *nlh, - struct netlink_ext_ack *extack) -{ - struct net *net = sock_net(skb->sk); - struct dn_fib_table *tb; - struct rtmsg *r = nlmsg_data(nlh); - struct nlattr *attrs[RTA_MAX+1]; - int err; - - if (!netlink_capable(skb, CAP_NET_ADMIN)) - return -EPERM; - - if (!net_eq(net, &init_net)) - return -EINVAL; - - err = nlmsg_parse_deprecated(nlh, sizeof(*r), attrs, RTA_MAX, - rtm_dn_policy, extack); - if (err < 0) - return err; - - tb = dn_fib_get_table(rtm_get_table(attrs, r->rtm_table), 0); - if (!tb) - return -ESRCH; - - return tb->delete(tb, r, attrs, nlh, &NETLINK_CB(skb)); -} - -static int dn_fib_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh, - struct netlink_ext_ack *extack) -{ - struct net *net = sock_net(skb->sk); - struct dn_fib_table *tb; - struct rtmsg *r = nlmsg_data(nlh); - struct nlattr *attrs[RTA_MAX+1]; - int err; - - if (!netlink_capable(skb, CAP_NET_ADMIN)) - return -EPERM; - - if (!net_eq(net, &init_net)) - return -EINVAL; - - err = nlmsg_parse_deprecated(nlh, sizeof(*r), attrs, RTA_MAX, - rtm_dn_policy, extack); - if (err < 0) - return err; - - tb = dn_fib_get_table(rtm_get_table(attrs, r->rtm_table), 1); - if (!tb) - return -ENOBUFS; - - return tb->insert(tb, r, attrs, nlh, &NETLINK_CB(skb)); -} - -static void fib_magic(int cmd, int type, __le16 dst, int dst_len, struct dn_ifaddr *ifa) -{ - struct dn_fib_table *tb; - struct { - struct nlmsghdr nlh; - struct rtmsg rtm; - } req; - struct { - struct nlattr hdr; - __le16 dst; - } dst_attr = { - .dst = dst, - }; - struct { - struct nlattr hdr; - __le16 prefsrc; - } prefsrc_attr = { - .prefsrc = ifa->ifa_local, - }; - struct { - struct nlattr hdr; - u32 oif; - } oif_attr = { - .oif = ifa->ifa_dev->dev->ifindex, - }; - struct nlattr *attrs[RTA_MAX+1] = { - [RTA_DST] = (struct nlattr *) &dst_attr, - [RTA_PREFSRC] = (struct nlattr * ) &prefsrc_attr, - [RTA_OIF] = (struct nlattr *) &oif_attr, - }; - - memset(&req.rtm, 0, sizeof(req.rtm)); - - if (type == RTN_UNICAST) - tb = dn_fib_get_table(RT_MIN_TABLE, 1); - else - tb = dn_fib_get_table(RT_TABLE_LOCAL, 1); - - if (tb == NULL) - return; - - req.nlh.nlmsg_len = sizeof(req); - req.nlh.nlmsg_type = cmd; - req.nlh.nlmsg_flags = NLM_F_REQUEST|NLM_F_CREATE|NLM_F_APPEND; - req.nlh.nlmsg_pid = 0; - req.nlh.nlmsg_seq = 0; - - req.rtm.rtm_dst_len = dst_len; - req.rtm.rtm_table = tb->n; - req.rtm.rtm_protocol = RTPROT_KERNEL; - req.rtm.rtm_scope = (type != RTN_LOCAL ? RT_SCOPE_LINK : RT_SCOPE_HOST); - req.rtm.rtm_type = type; - - if (cmd == RTM_NEWROUTE) - tb->insert(tb, &req.rtm, attrs, &req.nlh, NULL); - else - tb->delete(tb, &req.rtm, attrs, &req.nlh, NULL); -} - -static void dn_fib_add_ifaddr(struct dn_ifaddr *ifa) -{ - - fib_magic(RTM_NEWROUTE, RTN_LOCAL, ifa->ifa_local, 16, ifa); - -#if 0 - if (!(dev->flags&IFF_UP)) - return; - /* In the future, we will want to add default routes here */ - -#endif -} - -static void dn_fib_del_ifaddr(struct dn_ifaddr *ifa) -{ - int found_it = 0; - struct net_device *dev; - struct dn_dev *dn_db; - struct dn_ifaddr *ifa2; - - ASSERT_RTNL(); - - /* Scan device list */ - rcu_read_lock(); - for_each_netdev_rcu(&init_net, dev) { - dn_db = rcu_dereference(dev->dn_ptr); - if (dn_db == NULL) - continue; - for (ifa2 = rcu_dereference(dn_db->ifa_list); - ifa2 != NULL; - ifa2 = rcu_dereference(ifa2->ifa_next)) { - if (ifa2->ifa_local == ifa->ifa_local) { - found_it = 1; - break; - } - } - } - rcu_read_unlock(); - - if (found_it == 0) { - fib_magic(RTM_DELROUTE, RTN_LOCAL, ifa->ifa_local, 16, ifa); - - if (dnet_addr_type(ifa->ifa_local) != RTN_LOCAL) { - if (dn_fib_sync_down(ifa->ifa_local, NULL, 0)) - dn_fib_flush(); - } - } -} - -static void dn_fib_disable_addr(struct net_device *dev, int force) -{ - if (dn_fib_sync_down(0, dev, force)) - dn_fib_flush(); - dn_rt_cache_flush(0); - neigh_ifdown(&dn_neigh_table, dev); -} - -static int dn_fib_dnaddr_event(struct notifier_block *this, unsigned long event, void *ptr) -{ - struct dn_ifaddr *ifa = (struct dn_ifaddr *)ptr; - - switch (event) { - case NETDEV_UP: - dn_fib_add_ifaddr(ifa); - dn_fib_sync_up(ifa->ifa_dev->dev); - dn_rt_cache_flush(-1); - break; - case NETDEV_DOWN: - dn_fib_del_ifaddr(ifa); - if (ifa->ifa_dev && ifa->ifa_dev->ifa_list == NULL) { - dn_fib_disable_addr(ifa->ifa_dev->dev, 1); - } else { - dn_rt_cache_flush(-1); - } - break; - } - return NOTIFY_DONE; -} - -static int dn_fib_sync_down(__le16 local, struct net_device *dev, int force) -{ - int ret = 0; - int scope = RT_SCOPE_NOWHERE; - - if (force) - scope = -1; - - for_fib_info() { - /* - * This makes no sense for DECnet.... we will almost - * certainly have more than one local address the same - * over all our interfaces. It needs thinking about - * some more. - */ - if (local && fi->fib_prefsrc == local) { - fi->fib_flags |= RTNH_F_DEAD; - ret++; - } else if (dev && fi->fib_nhs) { - int dead = 0; - - change_nexthops(fi) { - if (nh->nh_flags&RTNH_F_DEAD) - dead++; - else if (nh->nh_dev == dev && - nh->nh_scope != scope) { - spin_lock_bh(&dn_fib_multipath_lock); - nh->nh_flags |= RTNH_F_DEAD; - fi->fib_power -= nh->nh_power; - nh->nh_power = 0; - spin_unlock_bh(&dn_fib_multipath_lock); - dead++; - } - } endfor_nexthops(fi) - if (dead == fi->fib_nhs) { - fi->fib_flags |= RTNH_F_DEAD; - ret++; - } - } - } endfor_fib_info(); - return ret; -} - - -static int dn_fib_sync_up(struct net_device *dev) -{ - int ret = 0; - - if (!(dev->flags&IFF_UP)) - return 0; - - for_fib_info() { - int alive = 0; - - change_nexthops(fi) { - if (!(nh->nh_flags&RTNH_F_DEAD)) { - alive++; - continue; - } - if (nh->nh_dev == NULL || !(nh->nh_dev->flags&IFF_UP)) - continue; - if (nh->nh_dev != dev || dev->dn_ptr == NULL) - continue; - alive++; - spin_lock_bh(&dn_fib_multipath_lock); - nh->nh_power = 0; - nh->nh_flags &= ~RTNH_F_DEAD; - spin_unlock_bh(&dn_fib_multipath_lock); - } endfor_nexthops(fi); - - if (alive > 0) { - fi->fib_flags &= ~RTNH_F_DEAD; - ret++; - } - } endfor_fib_info(); - return ret; -} - -static struct notifier_block dn_fib_dnaddr_notifier = { - .notifier_call = dn_fib_dnaddr_event, -}; - -void __exit dn_fib_cleanup(void) -{ - dn_fib_table_cleanup(); - dn_fib_rules_cleanup(); - - unregister_dnaddr_notifier(&dn_fib_dnaddr_notifier); -} - - -void __init dn_fib_init(void) -{ - dn_fib_table_init(); - dn_fib_rules_init(); - - register_dnaddr_notifier(&dn_fib_dnaddr_notifier); - - rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_NEWROUTE, - dn_fib_rtm_newroute, NULL, 0); - rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_DELROUTE, - dn_fib_rtm_delroute, NULL, 0); -} diff --git a/net/decnet/dn_neigh.c b/net/decnet/dn_neigh.c deleted file mode 100644 index 7c569bcc0aca..000000000000 --- a/net/decnet/dn_neigh.c +++ /dev/null @@ -1,607 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Neighbour Functions (Adjacency Database and - * On-Ethernet Cache) - * - * Author: Steve Whitehouse <SteveW@ACM.org> - * - * - * Changes: - * Steve Whitehouse : Fixed router listing routine - * Steve Whitehouse : Added error_report functions - * Steve Whitehouse : Added default router detection - * Steve Whitehouse : Hop counts in outgoing messages - * Steve Whitehouse : Fixed src/dst in outgoing messages so - * forwarding now stands a good chance of - * working. - * Steve Whitehouse : Fixed neighbour states (for now anyway). - * Steve Whitehouse : Made error_report functions dummies. This - * is not the right place to return skbs. - * Steve Whitehouse : Convert to seq_file - * - */ - -#include <linux/net.h> -#include <linux/module.h> -#include <linux/socket.h> -#include <linux/if_arp.h> -#include <linux/slab.h> -#include <linux/if_ether.h> -#include <linux/init.h> -#include <linux/proc_fs.h> -#include <linux/string.h> -#include <linux/netfilter_decnet.h> -#include <linux/spinlock.h> -#include <linux/seq_file.h> -#include <linux/rcupdate.h> -#include <linux/jhash.h> -#include <linux/atomic.h> -#include <net/net_namespace.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/flow.h> -#include <net/dn.h> -#include <net/dn_dev.h> -#include <net/dn_neigh.h> -#include <net/dn_route.h> - -static int dn_neigh_construct(struct neighbour *); -static void dn_neigh_error_report(struct neighbour *, struct sk_buff *); -static int dn_neigh_output(struct neighbour *neigh, struct sk_buff *skb); - -/* - * Operations for adding the link layer header. - */ -static const struct neigh_ops dn_neigh_ops = { - .family = AF_DECnet, - .error_report = dn_neigh_error_report, - .output = dn_neigh_output, - .connected_output = dn_neigh_output, -}; - -static u32 dn_neigh_hash(const void *pkey, - const struct net_device *dev, - __u32 *hash_rnd) -{ - return jhash_2words(*(__u16 *)pkey, 0, hash_rnd[0]); -} - -static bool dn_key_eq(const struct neighbour *neigh, const void *pkey) -{ - return neigh_key_eq16(neigh, pkey); -} - -struct neigh_table dn_neigh_table = { - .family = PF_DECnet, - .entry_size = NEIGH_ENTRY_SIZE(sizeof(struct dn_neigh)), - .key_len = sizeof(__le16), - .protocol = cpu_to_be16(ETH_P_DNA_RT), - .hash = dn_neigh_hash, - .key_eq = dn_key_eq, - .constructor = dn_neigh_construct, - .id = "dn_neigh_cache", - .parms ={ - .tbl = &dn_neigh_table, - .reachable_time = 30 * HZ, - .data = { - [NEIGH_VAR_MCAST_PROBES] = 0, - [NEIGH_VAR_UCAST_PROBES] = 0, - [NEIGH_VAR_APP_PROBES] = 0, - [NEIGH_VAR_RETRANS_TIME] = 1 * HZ, - [NEIGH_VAR_BASE_REACHABLE_TIME] = 30 * HZ, - [NEIGH_VAR_DELAY_PROBE_TIME] = 5 * HZ, - [NEIGH_VAR_INTERVAL_PROBE_TIME_MS] = 5 * HZ, - [NEIGH_VAR_GC_STALETIME] = 60 * HZ, - [NEIGH_VAR_QUEUE_LEN_BYTES] = SK_WMEM_MAX, - [NEIGH_VAR_PROXY_QLEN] = 0, - [NEIGH_VAR_ANYCAST_DELAY] = 0, - [NEIGH_VAR_PROXY_DELAY] = 0, - [NEIGH_VAR_LOCKTIME] = 1 * HZ, - }, - }, - .gc_interval = 30 * HZ, - .gc_thresh1 = 128, - .gc_thresh2 = 512, - .gc_thresh3 = 1024, -}; - -static int dn_neigh_construct(struct neighbour *neigh) -{ - struct net_device *dev = neigh->dev; - struct dn_neigh *dn = container_of(neigh, struct dn_neigh, n); - struct dn_dev *dn_db; - struct neigh_parms *parms; - - rcu_read_lock(); - dn_db = rcu_dereference(dev->dn_ptr); - if (dn_db == NULL) { - rcu_read_unlock(); - return -EINVAL; - } - - parms = dn_db->neigh_parms; - if (!parms) { - rcu_read_unlock(); - return -EINVAL; - } - - __neigh_parms_put(neigh->parms); - neigh->parms = neigh_parms_clone(parms); - rcu_read_unlock(); - - neigh->ops = &dn_neigh_ops; - neigh->nud_state = NUD_NOARP; - neigh->output = neigh->ops->connected_output; - - if ((dev->type == ARPHRD_IPGRE) || (dev->flags & IFF_POINTOPOINT)) - memcpy(neigh->ha, dev->broadcast, dev->addr_len); - else if ((dev->type == ARPHRD_ETHER) || (dev->type == ARPHRD_LOOPBACK)) - dn_dn2eth(neigh->ha, dn->addr); - else { - net_dbg_ratelimited("Trying to create neigh for hw %d\n", - dev->type); - return -EINVAL; - } - - /* - * Make an estimate of the remote block size by assuming that its - * two less then the device mtu, which it true for ethernet (and - * other things which support long format headers) since there is - * an extra length field (of 16 bits) which isn't part of the - * ethernet headers and which the DECnet specs won't admit is part - * of the DECnet routing headers either. - * - * If we over estimate here its no big deal, the NSP negotiations - * will prevent us from sending packets which are too large for the - * remote node to handle. In any case this figure is normally updated - * by a hello message in most cases. - */ - dn->blksize = dev->mtu - 2; - - return 0; -} - -static void dn_neigh_error_report(struct neighbour *neigh, struct sk_buff *skb) -{ - printk(KERN_DEBUG "dn_neigh_error_report: called\n"); - kfree_skb(skb); -} - -static int dn_neigh_output(struct neighbour *neigh, struct sk_buff *skb) -{ - struct dst_entry *dst = skb_dst(skb); - struct dn_route *rt = (struct dn_route *)dst; - struct net_device *dev = neigh->dev; - char mac_addr[ETH_ALEN]; - unsigned int seq; - int err; - - dn_dn2eth(mac_addr, rt->rt_local_src); - do { - seq = read_seqbegin(&neigh->ha_lock); - err = dev_hard_header(skb, dev, ntohs(skb->protocol), - neigh->ha, mac_addr, skb->len); - } while (read_seqretry(&neigh->ha_lock, seq)); - - if (err >= 0) - err = dev_queue_xmit(skb); - else { - kfree_skb(skb); - err = -EINVAL; - } - return err; -} - -static int dn_neigh_output_packet(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - struct dst_entry *dst = skb_dst(skb); - struct dn_route *rt = (struct dn_route *)dst; - struct neighbour *neigh = rt->n; - - return neigh->output(neigh, skb); -} - -/* - * For talking to broadcast devices: Ethernet & PPP - */ -static int dn_long_output(struct neighbour *neigh, struct sock *sk, - struct sk_buff *skb) -{ - struct net_device *dev = neigh->dev; - int headroom = dev->hard_header_len + sizeof(struct dn_long_packet) + 3; - unsigned char *data; - struct dn_long_packet *lp; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - - - if (skb_headroom(skb) < headroom) { - struct sk_buff *skb2 = skb_realloc_headroom(skb, headroom); - if (skb2 == NULL) { - net_crit_ratelimited("dn_long_output: no memory\n"); - kfree_skb(skb); - return -ENOBUFS; - } - consume_skb(skb); - skb = skb2; - net_info_ratelimited("dn_long_output: Increasing headroom\n"); - } - - data = skb_push(skb, sizeof(struct dn_long_packet) + 3); - lp = (struct dn_long_packet *)(data+3); - - *((__le16 *)data) = cpu_to_le16(skb->len - 2); - *(data + 2) = 1 | DN_RT_F_PF; /* Padding */ - - lp->msgflg = DN_RT_PKT_LONG|(cb->rt_flags&(DN_RT_F_IE|DN_RT_F_RQR|DN_RT_F_RTS)); - lp->d_area = lp->d_subarea = 0; - dn_dn2eth(lp->d_id, cb->dst); - lp->s_area = lp->s_subarea = 0; - dn_dn2eth(lp->s_id, cb->src); - lp->nl2 = 0; - lp->visit_ct = cb->hops & 0x3f; - lp->s_class = 0; - lp->pt = 0; - - skb_reset_network_header(skb); - - return NF_HOOK(NFPROTO_DECNET, NF_DN_POST_ROUTING, - &init_net, sk, skb, NULL, neigh->dev, - dn_neigh_output_packet); -} - -/* - * For talking to pointopoint and multidrop devices: DDCMP and X.25 - */ -static int dn_short_output(struct neighbour *neigh, struct sock *sk, - struct sk_buff *skb) -{ - struct net_device *dev = neigh->dev; - int headroom = dev->hard_header_len + sizeof(struct dn_short_packet) + 2; - struct dn_short_packet *sp; - unsigned char *data; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - - - if (skb_headroom(skb) < headroom) { - struct sk_buff *skb2 = skb_realloc_headroom(skb, headroom); - if (skb2 == NULL) { - net_crit_ratelimited("dn_short_output: no memory\n"); - kfree_skb(skb); - return -ENOBUFS; - } - consume_skb(skb); - skb = skb2; - net_info_ratelimited("dn_short_output: Increasing headroom\n"); - } - - data = skb_push(skb, sizeof(struct dn_short_packet) + 2); - *((__le16 *)data) = cpu_to_le16(skb->len - 2); - sp = (struct dn_short_packet *)(data+2); - - sp->msgflg = DN_RT_PKT_SHORT|(cb->rt_flags&(DN_RT_F_RQR|DN_RT_F_RTS)); - sp->dstnode = cb->dst; - sp->srcnode = cb->src; - sp->forward = cb->hops & 0x3f; - - skb_reset_network_header(skb); - - return NF_HOOK(NFPROTO_DECNET, NF_DN_POST_ROUTING, - &init_net, sk, skb, NULL, neigh->dev, - dn_neigh_output_packet); -} - -/* - * For talking to DECnet phase III nodes - * Phase 3 output is the same as short output, execpt that - * it clears the area bits before transmission. - */ -static int dn_phase3_output(struct neighbour *neigh, struct sock *sk, - struct sk_buff *skb) -{ - struct net_device *dev = neigh->dev; - int headroom = dev->hard_header_len + sizeof(struct dn_short_packet) + 2; - struct dn_short_packet *sp; - unsigned char *data; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - - if (skb_headroom(skb) < headroom) { - struct sk_buff *skb2 = skb_realloc_headroom(skb, headroom); - if (skb2 == NULL) { - net_crit_ratelimited("dn_phase3_output: no memory\n"); - kfree_skb(skb); - return -ENOBUFS; - } - consume_skb(skb); - skb = skb2; - net_info_ratelimited("dn_phase3_output: Increasing headroom\n"); - } - - data = skb_push(skb, sizeof(struct dn_short_packet) + 2); - *((__le16 *)data) = cpu_to_le16(skb->len - 2); - sp = (struct dn_short_packet *)(data + 2); - - sp->msgflg = DN_RT_PKT_SHORT|(cb->rt_flags&(DN_RT_F_RQR|DN_RT_F_RTS)); - sp->dstnode = cb->dst & cpu_to_le16(0x03ff); - sp->srcnode = cb->src & cpu_to_le16(0x03ff); - sp->forward = cb->hops & 0x3f; - - skb_reset_network_header(skb); - - return NF_HOOK(NFPROTO_DECNET, NF_DN_POST_ROUTING, - &init_net, sk, skb, NULL, neigh->dev, - dn_neigh_output_packet); -} - -int dn_to_neigh_output(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - struct dst_entry *dst = skb_dst(skb); - struct dn_route *rt = (struct dn_route *) dst; - struct neighbour *neigh = rt->n; - struct dn_neigh *dn = container_of(neigh, struct dn_neigh, n); - struct dn_dev *dn_db; - bool use_long; - - rcu_read_lock(); - dn_db = rcu_dereference(neigh->dev->dn_ptr); - if (dn_db == NULL) { - rcu_read_unlock(); - return -EINVAL; - } - use_long = dn_db->use_long; - rcu_read_unlock(); - - if (dn->flags & DN_NDFLAG_P3) - return dn_phase3_output(neigh, sk, skb); - if (use_long) - return dn_long_output(neigh, sk, skb); - else - return dn_short_output(neigh, sk, skb); -} - -/* - * Unfortunately, the neighbour code uses the device in its hash - * function, so we don't get any advantage from it. This function - * basically does a neigh_lookup(), but without comparing the device - * field. This is required for the On-Ethernet cache - */ - -/* - * Pointopoint link receives a hello message - */ -void dn_neigh_pointopoint_hello(struct sk_buff *skb) -{ - kfree_skb(skb); -} - -/* - * Ethernet router hello message received - */ -int dn_neigh_router_hello(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - struct rtnode_hello_message *msg = (struct rtnode_hello_message *)skb->data; - - struct neighbour *neigh; - struct dn_neigh *dn; - struct dn_dev *dn_db; - __le16 src; - - src = dn_eth2dn(msg->id); - - neigh = __neigh_lookup(&dn_neigh_table, &src, skb->dev, 1); - - dn = container_of(neigh, struct dn_neigh, n); - - if (neigh) { - write_lock(&neigh->lock); - - neigh->used = jiffies; - dn_db = rcu_dereference(neigh->dev->dn_ptr); - - if (!(neigh->nud_state & NUD_PERMANENT)) { - neigh->updated = jiffies; - - if (neigh->dev->type == ARPHRD_ETHER) - memcpy(neigh->ha, ð_hdr(skb)->h_source, ETH_ALEN); - - dn->blksize = le16_to_cpu(msg->blksize); - dn->priority = msg->priority; - - dn->flags &= ~DN_NDFLAG_P3; - - switch (msg->iinfo & DN_RT_INFO_TYPE) { - case DN_RT_INFO_L1RT: - dn->flags &=~DN_NDFLAG_R2; - dn->flags |= DN_NDFLAG_R1; - break; - case DN_RT_INFO_L2RT: - dn->flags |= DN_NDFLAG_R2; - } - } - - /* Only use routers in our area */ - if ((le16_to_cpu(src)>>10) == (le16_to_cpu((decnet_address))>>10)) { - if (!dn_db->router) { - dn_db->router = neigh_clone(neigh); - } else { - if (msg->priority > container_of(dn_db->router, - struct dn_neigh, n)->priority) - neigh_release(xchg(&dn_db->router, neigh_clone(neigh))); - } - } - write_unlock(&neigh->lock); - neigh_release(neigh); - } - - kfree_skb(skb); - return 0; -} - -/* - * Endnode hello message received - */ -int dn_neigh_endnode_hello(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - struct endnode_hello_message *msg = (struct endnode_hello_message *)skb->data; - struct neighbour *neigh; - struct dn_neigh *dn; - __le16 src; - - src = dn_eth2dn(msg->id); - - neigh = __neigh_lookup(&dn_neigh_table, &src, skb->dev, 1); - - dn = container_of(neigh, struct dn_neigh, n); - - if (neigh) { - write_lock(&neigh->lock); - - neigh->used = jiffies; - - if (!(neigh->nud_state & NUD_PERMANENT)) { - neigh->updated = jiffies; - - if (neigh->dev->type == ARPHRD_ETHER) - memcpy(neigh->ha, ð_hdr(skb)->h_source, ETH_ALEN); - dn->flags &= ~(DN_NDFLAG_R1 | DN_NDFLAG_R2); - dn->blksize = le16_to_cpu(msg->blksize); - dn->priority = 0; - } - - write_unlock(&neigh->lock); - neigh_release(neigh); - } - - kfree_skb(skb); - return 0; -} - -static char *dn_find_slot(char *base, int max, int priority) -{ - int i; - unsigned char *min = NULL; - - base += 6; /* skip first id */ - - for(i = 0; i < max; i++) { - if (!min || (*base < *min)) - min = base; - base += 7; /* find next priority */ - } - - if (!min) - return NULL; - - return (*min < priority) ? (min - 6) : NULL; -} - -struct elist_cb_state { - struct net_device *dev; - unsigned char *ptr; - unsigned char *rs; - int t, n; -}; - -static void neigh_elist_cb(struct neighbour *neigh, void *_info) -{ - struct elist_cb_state *s = _info; - struct dn_neigh *dn; - - if (neigh->dev != s->dev) - return; - - dn = container_of(neigh, struct dn_neigh, n); - if (!(dn->flags & (DN_NDFLAG_R1|DN_NDFLAG_R2))) - return; - - if (s->t == s->n) - s->rs = dn_find_slot(s->ptr, s->n, dn->priority); - else - s->t++; - if (s->rs == NULL) - return; - - dn_dn2eth(s->rs, dn->addr); - s->rs += 6; - *(s->rs) = neigh->nud_state & NUD_CONNECTED ? 0x80 : 0x0; - *(s->rs) |= dn->priority; - s->rs++; -} - -int dn_neigh_elist(struct net_device *dev, unsigned char *ptr, int n) -{ - struct elist_cb_state state; - - state.dev = dev; - state.t = 0; - state.n = n; - state.ptr = ptr; - state.rs = ptr; - - neigh_for_each(&dn_neigh_table, neigh_elist_cb, &state); - - return state.t; -} - - -#ifdef CONFIG_PROC_FS - -static inline void dn_neigh_format_entry(struct seq_file *seq, - struct neighbour *n) -{ - struct dn_neigh *dn = container_of(n, struct dn_neigh, n); - char buf[DN_ASCBUF_LEN]; - - read_lock(&n->lock); - seq_printf(seq, "%-7s %s%s%s %02x %02d %07ld %-8s\n", - dn_addr2asc(le16_to_cpu(dn->addr), buf), - (dn->flags&DN_NDFLAG_R1) ? "1" : "-", - (dn->flags&DN_NDFLAG_R2) ? "2" : "-", - (dn->flags&DN_NDFLAG_P3) ? "3" : "-", - dn->n.nud_state, - refcount_read(&dn->n.refcnt), - dn->blksize, - (dn->n.dev) ? dn->n.dev->name : "?"); - read_unlock(&n->lock); -} - -static int dn_neigh_seq_show(struct seq_file *seq, void *v) -{ - if (v == SEQ_START_TOKEN) { - seq_puts(seq, "Addr Flags State Use Blksize Dev\n"); - } else { - dn_neigh_format_entry(seq, v); - } - - return 0; -} - -static void *dn_neigh_seq_start(struct seq_file *seq, loff_t *pos) -{ - return neigh_seq_start(seq, pos, &dn_neigh_table, - NEIGH_SEQ_NEIGH_ONLY); -} - -static const struct seq_operations dn_neigh_seq_ops = { - .start = dn_neigh_seq_start, - .next = neigh_seq_next, - .stop = neigh_seq_stop, - .show = dn_neigh_seq_show, -}; -#endif - -void __init dn_neigh_init(void) -{ - neigh_table_init(NEIGH_DN_TABLE, &dn_neigh_table); - proc_create_net("decnet_neigh", 0444, init_net.proc_net, - &dn_neigh_seq_ops, sizeof(struct neigh_seq_state)); -} - -void __exit dn_neigh_cleanup(void) -{ - remove_proc_entry("decnet_neigh", init_net.proc_net); - neigh_table_clear(NEIGH_DN_TABLE, &dn_neigh_table); -} diff --git a/net/decnet/dn_nsp_in.c b/net/decnet/dn_nsp_in.c deleted file mode 100644 index c59be5b04479..000000000000 --- a/net/decnet/dn_nsp_in.c +++ /dev/null @@ -1,907 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Network Services Protocol (Input) - * - * Author: Eduardo Marcelo Serrat <emserrat@geocities.com> - * - * Changes: - * - * Steve Whitehouse: Split into dn_nsp_in.c and dn_nsp_out.c from - * original dn_nsp.c. - * Steve Whitehouse: Updated to work with my new routing architecture. - * Steve Whitehouse: Add changes from Eduardo Serrat's patches. - * Steve Whitehouse: Put all ack handling code in a common routine. - * Steve Whitehouse: Put other common bits into dn_nsp_rx() - * Steve Whitehouse: More checks on skb->len to catch bogus packets - * Fixed various race conditions and possible nasties. - * Steve Whitehouse: Now handles returned conninit frames. - * David S. Miller: New socket locking - * Steve Whitehouse: Fixed lockup when socket filtering was enabled. - * Paul Koning: Fix to push CC sockets into RUN when acks are - * received. - * Steve Whitehouse: - * Patrick Caulfield: Checking conninits for correctness & sending of error - * responses. - * Steve Whitehouse: Added backlog congestion level return codes. - * Patrick Caulfield: - * Steve Whitehouse: Added flow control support (outbound) - * Steve Whitehouse: Prepare for nonlinear skbs - */ - -/****************************************************************************** - (c) 1995-1998 E.M. Serrat emserrat@geocities.com - -*******************************************************************************/ - -#include <linux/errno.h> -#include <linux/filter.h> -#include <linux/types.h> -#include <linux/socket.h> -#include <linux/in.h> -#include <linux/kernel.h> -#include <linux/timer.h> -#include <linux/string.h> -#include <linux/sockios.h> -#include <linux/net.h> -#include <linux/netdevice.h> -#include <linux/inet.h> -#include <linux/route.h> -#include <linux/slab.h> -#include <net/sock.h> -#include <net/tcp_states.h> -#include <linux/fcntl.h> -#include <linux/mm.h> -#include <linux/termios.h> -#include <linux/interrupt.h> -#include <linux/proc_fs.h> -#include <linux/stat.h> -#include <linux/init.h> -#include <linux/poll.h> -#include <linux/netfilter_decnet.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/dn.h> -#include <net/dn_nsp.h> -#include <net/dn_dev.h> -#include <net/dn_route.h> - -extern int decnet_log_martians; - -static void dn_log_martian(struct sk_buff *skb, const char *msg) -{ - if (decnet_log_martians) { - char *devname = skb->dev ? skb->dev->name : "???"; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - net_info_ratelimited("DECnet: Martian packet (%s) dev=%s src=0x%04hx dst=0x%04hx srcport=0x%04hx dstport=0x%04hx\n", - msg, devname, - le16_to_cpu(cb->src), - le16_to_cpu(cb->dst), - le16_to_cpu(cb->src_port), - le16_to_cpu(cb->dst_port)); - } -} - -/* - * For this function we've flipped the cross-subchannel bit - * if the message is an otherdata or linkservice message. Thus - * we can use it to work out what to update. - */ -static void dn_ack(struct sock *sk, struct sk_buff *skb, unsigned short ack) -{ - struct dn_scp *scp = DN_SK(sk); - unsigned short type = ((ack >> 12) & 0x0003); - int wakeup = 0; - - switch (type) { - case 0: /* ACK - Data */ - if (dn_after(ack, scp->ackrcv_dat)) { - scp->ackrcv_dat = ack & 0x0fff; - wakeup |= dn_nsp_check_xmit_queue(sk, skb, - &scp->data_xmit_queue, - ack); - } - break; - case 1: /* NAK - Data */ - break; - case 2: /* ACK - OtherData */ - if (dn_after(ack, scp->ackrcv_oth)) { - scp->ackrcv_oth = ack & 0x0fff; - wakeup |= dn_nsp_check_xmit_queue(sk, skb, - &scp->other_xmit_queue, - ack); - } - break; - case 3: /* NAK - OtherData */ - break; - } - - if (wakeup && !sock_flag(sk, SOCK_DEAD)) - sk->sk_state_change(sk); -} - -/* - * This function is a universal ack processor. - */ -static int dn_process_ack(struct sock *sk, struct sk_buff *skb, int oth) -{ - __le16 *ptr = (__le16 *)skb->data; - int len = 0; - unsigned short ack; - - if (skb->len < 2) - return len; - - if ((ack = le16_to_cpu(*ptr)) & 0x8000) { - skb_pull(skb, 2); - ptr++; - len += 2; - if ((ack & 0x4000) == 0) { - if (oth) - ack ^= 0x2000; - dn_ack(sk, skb, ack); - } - } - - if (skb->len < 2) - return len; - - if ((ack = le16_to_cpu(*ptr)) & 0x8000) { - skb_pull(skb, 2); - len += 2; - if ((ack & 0x4000) == 0) { - if (oth) - ack ^= 0x2000; - dn_ack(sk, skb, ack); - } - } - - return len; -} - - -/** - * dn_check_idf - Check an image data field format is correct. - * @pptr: Pointer to pointer to image data - * @len: Pointer to length of image data - * @max: The maximum allowed length of the data in the image data field - * @follow_on: Check that this many bytes exist beyond the end of the image data - * - * Returns: 0 if ok, -1 on error - */ -static inline int dn_check_idf(unsigned char **pptr, int *len, unsigned char max, unsigned char follow_on) -{ - unsigned char *ptr = *pptr; - unsigned char flen = *ptr++; - - (*len)--; - if (flen > max) - return -1; - if ((flen + follow_on) > *len) - return -1; - - *len -= flen; - *pptr = ptr + flen; - return 0; -} - -/* - * Table of reason codes to pass back to node which sent us a badly - * formed message, plus text messages for the log. A zero entry in - * the reason field means "don't reply" otherwise a disc init is sent with - * the specified reason code. - */ -static struct { - unsigned short reason; - const char *text; -} ci_err_table[] = { - { 0, "CI: Truncated message" }, - { NSP_REASON_ID, "CI: Destination username error" }, - { NSP_REASON_ID, "CI: Destination username type" }, - { NSP_REASON_US, "CI: Source username error" }, - { 0, "CI: Truncated at menuver" }, - { 0, "CI: Truncated before access or user data" }, - { NSP_REASON_IO, "CI: Access data format error" }, - { NSP_REASON_IO, "CI: User data format error" } -}; - -/* - * This function uses a slightly different lookup method - * to find its sockets, since it searches on object name/number - * rather than port numbers. Various tests are done to ensure that - * the incoming data is in the correct format before it is queued to - * a socket. - */ -static struct sock *dn_find_listener(struct sk_buff *skb, unsigned short *reason) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct nsp_conn_init_msg *msg = (struct nsp_conn_init_msg *)skb->data; - struct sockaddr_dn dstaddr; - struct sockaddr_dn srcaddr; - unsigned char type = 0; - int dstlen; - int srclen; - unsigned char *ptr; - int len; - int err = 0; - unsigned char menuver; - - memset(&dstaddr, 0, sizeof(struct sockaddr_dn)); - memset(&srcaddr, 0, sizeof(struct sockaddr_dn)); - - /* - * 1. Decode & remove message header - */ - cb->src_port = msg->srcaddr; - cb->dst_port = msg->dstaddr; - cb->services = msg->services; - cb->info = msg->info; - cb->segsize = le16_to_cpu(msg->segsize); - - if (!pskb_may_pull(skb, sizeof(*msg))) - goto err_out; - - skb_pull(skb, sizeof(*msg)); - - len = skb->len; - ptr = skb->data; - - /* - * 2. Check destination end username format - */ - dstlen = dn_username2sockaddr(ptr, len, &dstaddr, &type); - err++; - if (dstlen < 0) - goto err_out; - - err++; - if (type > 1) - goto err_out; - - len -= dstlen; - ptr += dstlen; - - /* - * 3. Check source end username format - */ - srclen = dn_username2sockaddr(ptr, len, &srcaddr, &type); - err++; - if (srclen < 0) - goto err_out; - - len -= srclen; - ptr += srclen; - err++; - if (len < 1) - goto err_out; - - menuver = *ptr; - ptr++; - len--; - - /* - * 4. Check that optional data actually exists if menuver says it does - */ - err++; - if ((menuver & (DN_MENUVER_ACC | DN_MENUVER_USR)) && (len < 1)) - goto err_out; - - /* - * 5. Check optional access data format - */ - err++; - if (menuver & DN_MENUVER_ACC) { - if (dn_check_idf(&ptr, &len, 39, 1)) - goto err_out; - if (dn_check_idf(&ptr, &len, 39, 1)) - goto err_out; - if (dn_check_idf(&ptr, &len, 39, (menuver & DN_MENUVER_USR) ? 1 : 0)) - goto err_out; - } - - /* - * 6. Check optional user data format - */ - err++; - if (menuver & DN_MENUVER_USR) { - if (dn_check_idf(&ptr, &len, 16, 0)) - goto err_out; - } - - /* - * 7. Look up socket based on destination end username - */ - return dn_sklist_find_listener(&dstaddr); -err_out: - dn_log_martian(skb, ci_err_table[err].text); - *reason = ci_err_table[err].reason; - return NULL; -} - - -static void dn_nsp_conn_init(struct sock *sk, struct sk_buff *skb) -{ - if (sk_acceptq_is_full(sk)) { - kfree_skb(skb); - return; - } - - sk_acceptq_added(sk); - skb_queue_tail(&sk->sk_receive_queue, skb); - sk->sk_state_change(sk); -} - -static void dn_nsp_conn_conf(struct sock *sk, struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct dn_scp *scp = DN_SK(sk); - unsigned char *ptr; - - if (skb->len < 4) - goto out; - - ptr = skb->data; - cb->services = *ptr++; - cb->info = *ptr++; - cb->segsize = le16_to_cpu(*(__le16 *)ptr); - - if ((scp->state == DN_CI) || (scp->state == DN_CD)) { - scp->persist = 0; - scp->addrrem = cb->src_port; - sk->sk_state = TCP_ESTABLISHED; - scp->state = DN_RUN; - scp->services_rem = cb->services; - scp->info_rem = cb->info; - scp->segsize_rem = cb->segsize; - - if ((scp->services_rem & NSP_FC_MASK) == NSP_FC_NONE) - scp->max_window = decnet_no_fc_max_cwnd; - - if (skb->len > 0) { - u16 dlen = *skb->data; - if ((dlen <= 16) && (dlen <= skb->len)) { - scp->conndata_in.opt_optl = cpu_to_le16(dlen); - skb_copy_from_linear_data_offset(skb, 1, - scp->conndata_in.opt_data, dlen); - } - } - dn_nsp_send_link(sk, DN_NOCHANGE, 0); - if (!sock_flag(sk, SOCK_DEAD)) - sk->sk_state_change(sk); - } - -out: - kfree_skb(skb); -} - -static void dn_nsp_conn_ack(struct sock *sk, struct sk_buff *skb) -{ - struct dn_scp *scp = DN_SK(sk); - - if (scp->state == DN_CI) { - scp->state = DN_CD; - scp->persist = 0; - } - - kfree_skb(skb); -} - -static void dn_nsp_disc_init(struct sock *sk, struct sk_buff *skb) -{ - struct dn_scp *scp = DN_SK(sk); - struct dn_skb_cb *cb = DN_SKB_CB(skb); - unsigned short reason; - - if (skb->len < 2) - goto out; - - reason = le16_to_cpu(*(__le16 *)skb->data); - skb_pull(skb, 2); - - scp->discdata_in.opt_status = cpu_to_le16(reason); - scp->discdata_in.opt_optl = 0; - memset(scp->discdata_in.opt_data, 0, 16); - - if (skb->len > 0) { - u16 dlen = *skb->data; - if ((dlen <= 16) && (dlen <= skb->len)) { - scp->discdata_in.opt_optl = cpu_to_le16(dlen); - skb_copy_from_linear_data_offset(skb, 1, scp->discdata_in.opt_data, dlen); - } - } - - scp->addrrem = cb->src_port; - sk->sk_state = TCP_CLOSE; - - switch (scp->state) { - case DN_CI: - case DN_CD: - scp->state = DN_RJ; - sk->sk_err = ECONNREFUSED; - break; - case DN_RUN: - sk->sk_shutdown |= SHUTDOWN_MASK; - scp->state = DN_DN; - break; - case DN_DI: - scp->state = DN_DIC; - break; - } - - if (!sock_flag(sk, SOCK_DEAD)) { - if (sk->sk_socket->state != SS_UNCONNECTED) - sk->sk_socket->state = SS_DISCONNECTING; - sk->sk_state_change(sk); - } - - /* - * It appears that its possible for remote machines to send disc - * init messages with no port identifier if we are in the CI and - * possibly also the CD state. Obviously we shouldn't reply with - * a message if we don't know what the end point is. - */ - if (scp->addrrem) { - dn_nsp_send_disc(sk, NSP_DISCCONF, NSP_REASON_DC, GFP_ATOMIC); - } - scp->persist_fxn = dn_destroy_timer; - scp->persist = dn_nsp_persist(sk); - -out: - kfree_skb(skb); -} - -/* - * disc_conf messages are also called no_resources or no_link - * messages depending upon the "reason" field. - */ -static void dn_nsp_disc_conf(struct sock *sk, struct sk_buff *skb) -{ - struct dn_scp *scp = DN_SK(sk); - unsigned short reason; - - if (skb->len != 2) - goto out; - - reason = le16_to_cpu(*(__le16 *)skb->data); - - sk->sk_state = TCP_CLOSE; - - switch (scp->state) { - case DN_CI: - scp->state = DN_NR; - break; - case DN_DR: - if (reason == NSP_REASON_DC) - scp->state = DN_DRC; - if (reason == NSP_REASON_NL) - scp->state = DN_CN; - break; - case DN_DI: - scp->state = DN_DIC; - break; - case DN_RUN: - sk->sk_shutdown |= SHUTDOWN_MASK; - fallthrough; - case DN_CC: - scp->state = DN_CN; - } - - if (!sock_flag(sk, SOCK_DEAD)) { - if (sk->sk_socket->state != SS_UNCONNECTED) - sk->sk_socket->state = SS_DISCONNECTING; - sk->sk_state_change(sk); - } - - scp->persist_fxn = dn_destroy_timer; - scp->persist = dn_nsp_persist(sk); - -out: - kfree_skb(skb); -} - -static void dn_nsp_linkservice(struct sock *sk, struct sk_buff *skb) -{ - struct dn_scp *scp = DN_SK(sk); - unsigned short segnum; - unsigned char lsflags; - signed char fcval; - int wake_up = 0; - char *ptr = skb->data; - unsigned char fctype = scp->services_rem & NSP_FC_MASK; - - if (skb->len != 4) - goto out; - - segnum = le16_to_cpu(*(__le16 *)ptr); - ptr += 2; - lsflags = *(unsigned char *)ptr++; - fcval = *ptr; - - /* - * Here we ignore erroneous packets which should really - * should cause a connection abort. It is not critical - * for now though. - */ - if (lsflags & 0xf8) - goto out; - - if (seq_next(scp->numoth_rcv, segnum)) { - seq_add(&scp->numoth_rcv, 1); - switch(lsflags & 0x04) { /* FCVAL INT */ - case 0x00: /* Normal Request */ - switch(lsflags & 0x03) { /* FCVAL MOD */ - case 0x00: /* Request count */ - if (fcval < 0) { - unsigned char p_fcval = -fcval; - if ((scp->flowrem_dat > p_fcval) && - (fctype == NSP_FC_SCMC)) { - scp->flowrem_dat -= p_fcval; - } - } else if (fcval > 0) { - scp->flowrem_dat += fcval; - wake_up = 1; - } - break; - case 0x01: /* Stop outgoing data */ - scp->flowrem_sw = DN_DONTSEND; - break; - case 0x02: /* Ok to start again */ - scp->flowrem_sw = DN_SEND; - dn_nsp_output(sk); - wake_up = 1; - } - break; - case 0x04: /* Interrupt Request */ - if (fcval > 0) { - scp->flowrem_oth += fcval; - wake_up = 1; - } - break; - } - if (wake_up && !sock_flag(sk, SOCK_DEAD)) - sk->sk_state_change(sk); - } - - dn_nsp_send_oth_ack(sk); - -out: - kfree_skb(skb); -} - -/* - * Copy of sock_queue_rcv_skb (from sock.h) without - * bh_lock_sock() (its already held when this is called) which - * also allows data and other data to be queued to a socket. - */ -static __inline__ int dn_queue_skb(struct sock *sk, struct sk_buff *skb, int sig, struct sk_buff_head *queue) -{ - int err; - - /* Cast skb->rcvbuf to unsigned... It's pointless, but reduces - number of warnings when compiling with -W --ANK - */ - if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >= - (unsigned int)sk->sk_rcvbuf) { - err = -ENOMEM; - goto out; - } - - err = sk_filter(sk, skb); - if (err) - goto out; - - skb_set_owner_r(skb, sk); - skb_queue_tail(queue, skb); - - if (!sock_flag(sk, SOCK_DEAD)) - sk->sk_data_ready(sk); -out: - return err; -} - -static void dn_nsp_otherdata(struct sock *sk, struct sk_buff *skb) -{ - struct dn_scp *scp = DN_SK(sk); - unsigned short segnum; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - int queued = 0; - - if (skb->len < 2) - goto out; - - cb->segnum = segnum = le16_to_cpu(*(__le16 *)skb->data); - skb_pull(skb, 2); - - if (seq_next(scp->numoth_rcv, segnum)) { - - if (dn_queue_skb(sk, skb, SIGURG, &scp->other_receive_queue) == 0) { - seq_add(&scp->numoth_rcv, 1); - scp->other_report = 0; - queued = 1; - } - } - - dn_nsp_send_oth_ack(sk); -out: - if (!queued) - kfree_skb(skb); -} - -static void dn_nsp_data(struct sock *sk, struct sk_buff *skb) -{ - int queued = 0; - unsigned short segnum; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct dn_scp *scp = DN_SK(sk); - - if (skb->len < 2) - goto out; - - cb->segnum = segnum = le16_to_cpu(*(__le16 *)skb->data); - skb_pull(skb, 2); - - if (seq_next(scp->numdat_rcv, segnum)) { - if (dn_queue_skb(sk, skb, SIGIO, &sk->sk_receive_queue) == 0) { - seq_add(&scp->numdat_rcv, 1); - queued = 1; - } - - if ((scp->flowloc_sw == DN_SEND) && dn_congested(sk)) { - scp->flowloc_sw = DN_DONTSEND; - dn_nsp_send_link(sk, DN_DONTSEND, 0); - } - } - - dn_nsp_send_data_ack(sk); -out: - if (!queued) - kfree_skb(skb); -} - -/* - * If one of our conninit messages is returned, this function - * deals with it. It puts the socket into the NO_COMMUNICATION - * state. - */ -static void dn_returned_conn_init(struct sock *sk, struct sk_buff *skb) -{ - struct dn_scp *scp = DN_SK(sk); - - if (scp->state == DN_CI) { - scp->state = DN_NC; - sk->sk_state = TCP_CLOSE; - if (!sock_flag(sk, SOCK_DEAD)) - sk->sk_state_change(sk); - } - - kfree_skb(skb); -} - -static int dn_nsp_no_socket(struct sk_buff *skb, unsigned short reason) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - int ret = NET_RX_DROP; - - /* Must not reply to returned packets */ - if (cb->rt_flags & DN_RT_F_RTS) - goto out; - - if ((reason != NSP_REASON_OK) && ((cb->nsp_flags & 0x0c) == 0x08)) { - switch (cb->nsp_flags & 0x70) { - case 0x10: - case 0x60: /* (Retransmitted) Connect Init */ - dn_nsp_return_disc(skb, NSP_DISCINIT, reason); - ret = NET_RX_SUCCESS; - break; - case 0x20: /* Connect Confirm */ - dn_nsp_return_disc(skb, NSP_DISCCONF, reason); - ret = NET_RX_SUCCESS; - break; - } - } - -out: - kfree_skb(skb); - return ret; -} - -static int dn_nsp_rx_packet(struct net *net, struct sock *sk2, - struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct sock *sk = NULL; - unsigned char *ptr = (unsigned char *)skb->data; - unsigned short reason = NSP_REASON_NL; - - if (!pskb_may_pull(skb, 2)) - goto free_out; - - skb_reset_transport_header(skb); - cb->nsp_flags = *ptr++; - - if (decnet_debug_level & 2) - printk(KERN_DEBUG "dn_nsp_rx: Message type 0x%02x\n", (int)cb->nsp_flags); - - if (cb->nsp_flags & 0x83) - goto free_out; - - /* - * Filter out conninits and useless packet types - */ - if ((cb->nsp_flags & 0x0c) == 0x08) { - switch (cb->nsp_flags & 0x70) { - case 0x00: /* NOP */ - case 0x70: /* Reserved */ - case 0x50: /* Reserved, Phase II node init */ - goto free_out; - case 0x10: - case 0x60: - if (unlikely(cb->rt_flags & DN_RT_F_RTS)) - goto free_out; - sk = dn_find_listener(skb, &reason); - goto got_it; - } - } - - if (!pskb_may_pull(skb, 3)) - goto free_out; - - /* - * Grab the destination address. - */ - cb->dst_port = *(__le16 *)ptr; - cb->src_port = 0; - ptr += 2; - - /* - * If not a connack, grab the source address too. - */ - if (pskb_may_pull(skb, 5)) { - cb->src_port = *(__le16 *)ptr; - ptr += 2; - skb_pull(skb, 5); - } - - /* - * Returned packets... - * Swap src & dst and look up in the normal way. - */ - if (unlikely(cb->rt_flags & DN_RT_F_RTS)) { - swap(cb->dst_port, cb->src_port); - swap(cb->dst, cb->src); - } - - /* - * Find the socket to which this skb is destined. - */ - sk = dn_find_by_skb(skb); -got_it: - if (sk != NULL) { - struct dn_scp *scp = DN_SK(sk); - - /* Reset backoff */ - scp->nsp_rxtshift = 0; - - /* - * We linearize everything except data segments here. - */ - if (cb->nsp_flags & ~0x60) { - if (unlikely(skb_linearize(skb))) - goto free_out; - } - - return sk_receive_skb(sk, skb, 0); - } - - return dn_nsp_no_socket(skb, reason); - -free_out: - kfree_skb(skb); - return NET_RX_DROP; -} - -int dn_nsp_rx(struct sk_buff *skb) -{ - return NF_HOOK(NFPROTO_DECNET, NF_DN_LOCAL_IN, - &init_net, NULL, skb, skb->dev, NULL, - dn_nsp_rx_packet); -} - -/* - * This is the main receive routine for sockets. It is called - * from the above when the socket is not busy, and also from - * sock_release() when there is a backlog queued up. - */ -int dn_nsp_backlog_rcv(struct sock *sk, struct sk_buff *skb) -{ - struct dn_scp *scp = DN_SK(sk); - struct dn_skb_cb *cb = DN_SKB_CB(skb); - - if (cb->rt_flags & DN_RT_F_RTS) { - if (cb->nsp_flags == 0x18 || cb->nsp_flags == 0x68) - dn_returned_conn_init(sk, skb); - else - kfree_skb(skb); - return NET_RX_SUCCESS; - } - - /* - * Control packet. - */ - if ((cb->nsp_flags & 0x0c) == 0x08) { - switch (cb->nsp_flags & 0x70) { - case 0x10: - case 0x60: - dn_nsp_conn_init(sk, skb); - break; - case 0x20: - dn_nsp_conn_conf(sk, skb); - break; - case 0x30: - dn_nsp_disc_init(sk, skb); - break; - case 0x40: - dn_nsp_disc_conf(sk, skb); - break; - } - - } else if (cb->nsp_flags == 0x24) { - /* - * Special for connacks, 'cos they don't have - * ack data or ack otherdata info. - */ - dn_nsp_conn_ack(sk, skb); - } else { - int other = 1; - - /* both data and ack frames can kick a CC socket into RUN */ - if ((scp->state == DN_CC) && !sock_flag(sk, SOCK_DEAD)) { - scp->state = DN_RUN; - sk->sk_state = TCP_ESTABLISHED; - sk->sk_state_change(sk); - } - - if ((cb->nsp_flags & 0x1c) == 0) - other = 0; - if (cb->nsp_flags == 0x04) - other = 0; - - /* - * Read out ack data here, this applies equally - * to data, other data, link service and both - * ack data and ack otherdata. - */ - dn_process_ack(sk, skb, other); - - /* - * If we've some sort of data here then call a - * suitable routine for dealing with it, otherwise - * the packet is an ack and can be discarded. - */ - if ((cb->nsp_flags & 0x0c) == 0) { - - if (scp->state != DN_RUN) - goto free_out; - - switch (cb->nsp_flags) { - case 0x10: /* LS */ - dn_nsp_linkservice(sk, skb); - break; - case 0x30: /* OD */ - dn_nsp_otherdata(sk, skb); - break; - default: - dn_nsp_data(sk, skb); - } - - } else { /* Ack, chuck it out here */ -free_out: - kfree_skb(skb); - } - } - - return NET_RX_SUCCESS; -} diff --git a/net/decnet/dn_nsp_out.c b/net/decnet/dn_nsp_out.c deleted file mode 100644 index b05639bdfc8f..000000000000 --- a/net/decnet/dn_nsp_out.c +++ /dev/null @@ -1,696 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Network Services Protocol (Output) - * - * Author: Eduardo Marcelo Serrat <emserrat@geocities.com> - * - * Changes: - * - * Steve Whitehouse: Split into dn_nsp_in.c and dn_nsp_out.c from - * original dn_nsp.c. - * Steve Whitehouse: Updated to work with my new routing architecture. - * Steve Whitehouse: Added changes from Eduardo Serrat's patches. - * Steve Whitehouse: Now conninits have the "return" bit set. - * Steve Whitehouse: Fixes to check alloc'd skbs are non NULL! - * Moved output state machine into one function - * Steve Whitehouse: New output state machine - * Paul Koning: Connect Confirm message fix. - * Eduardo Serrat: Fix to stop dn_nsp_do_disc() sending malformed packets. - * Steve Whitehouse: dn_nsp_output() and friends needed a spring clean - * Steve Whitehouse: Moved dn_nsp_send() in here from route.h - */ - -/****************************************************************************** - (c) 1995-1998 E.M. Serrat emserrat@geocities.com - -*******************************************************************************/ - -#include <linux/errno.h> -#include <linux/types.h> -#include <linux/socket.h> -#include <linux/in.h> -#include <linux/kernel.h> -#include <linux/timer.h> -#include <linux/string.h> -#include <linux/sockios.h> -#include <linux/net.h> -#include <linux/netdevice.h> -#include <linux/inet.h> -#include <linux/route.h> -#include <linux/slab.h> -#include <net/sock.h> -#include <linux/fcntl.h> -#include <linux/mm.h> -#include <linux/termios.h> -#include <linux/interrupt.h> -#include <linux/proc_fs.h> -#include <linux/stat.h> -#include <linux/init.h> -#include <linux/poll.h> -#include <linux/if_packet.h> -#include <linux/jiffies.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/flow.h> -#include <net/dn.h> -#include <net/dn_nsp.h> -#include <net/dn_dev.h> -#include <net/dn_route.h> - - -static int nsp_backoff[NSP_MAXRXTSHIFT + 1] = { 1, 2, 4, 8, 16, 32, 64, 64, 64, 64, 64, 64, 64 }; - -static void dn_nsp_send(struct sk_buff *skb) -{ - struct sock *sk = skb->sk; - struct dn_scp *scp = DN_SK(sk); - struct dst_entry *dst; - struct flowidn fld; - - skb_reset_transport_header(skb); - scp->stamp = jiffies; - - dst = sk_dst_check(sk, 0); - if (dst) { -try_again: - skb_dst_set(skb, dst); - dst_output(&init_net, skb->sk, skb); - return; - } - - memset(&fld, 0, sizeof(fld)); - fld.flowidn_oif = sk->sk_bound_dev_if; - fld.saddr = dn_saddr2dn(&scp->addr); - fld.daddr = dn_saddr2dn(&scp->peer); - dn_sk_ports_copy(&fld, scp); - fld.flowidn_proto = DNPROTO_NSP; - if (dn_route_output_sock(&sk->sk_dst_cache, &fld, sk, 0) == 0) { - dst = sk_dst_get(sk); - sk->sk_route_caps = dst->dev->features; - goto try_again; - } - - sk->sk_err = EHOSTUNREACH; - if (!sock_flag(sk, SOCK_DEAD)) - sk->sk_state_change(sk); -} - - -/* - * If sk == NULL, then we assume that we are supposed to be making - * a routing layer skb. If sk != NULL, then we are supposed to be - * creating an skb for the NSP layer. - * - * The eventual aim is for each socket to have a cached header size - * for its outgoing packets, and to set hdr from this when sk != NULL. - */ -struct sk_buff *dn_alloc_skb(struct sock *sk, int size, gfp_t pri) -{ - struct sk_buff *skb; - int hdr = 64; - - if ((skb = alloc_skb(size + hdr, pri)) == NULL) - return NULL; - - skb->protocol = htons(ETH_P_DNA_RT); - skb->pkt_type = PACKET_OUTGOING; - - if (sk) - skb_set_owner_w(skb, sk); - - skb_reserve(skb, hdr); - - return skb; -} - -/* - * Calculate persist timer based upon the smoothed round - * trip time and the variance. Backoff according to the - * nsp_backoff[] array. - */ -unsigned long dn_nsp_persist(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - unsigned long t = ((scp->nsp_srtt >> 2) + scp->nsp_rttvar) >> 1; - - t *= nsp_backoff[scp->nsp_rxtshift]; - - if (t < HZ) t = HZ; - if (t > (600*HZ)) t = (600*HZ); - - if (scp->nsp_rxtshift < NSP_MAXRXTSHIFT) - scp->nsp_rxtshift++; - - /* printk(KERN_DEBUG "rxtshift %lu, t=%lu\n", scp->nsp_rxtshift, t); */ - - return t; -} - -/* - * This is called each time we get an estimate for the rtt - * on the link. - */ -static void dn_nsp_rtt(struct sock *sk, long rtt) -{ - struct dn_scp *scp = DN_SK(sk); - long srtt = (long)scp->nsp_srtt; - long rttvar = (long)scp->nsp_rttvar; - long delta; - - /* - * If the jiffies clock flips over in the middle of timestamp - * gathering this value might turn out negative, so we make sure - * that is it always positive here. - */ - if (rtt < 0) - rtt = -rtt; - /* - * Add new rtt to smoothed average - */ - delta = ((rtt << 3) - srtt); - srtt += (delta >> 3); - if (srtt >= 1) - scp->nsp_srtt = (unsigned long)srtt; - else - scp->nsp_srtt = 1; - - /* - * Add new rtt variance to smoothed varience - */ - delta >>= 1; - rttvar += ((((delta>0)?(delta):(-delta)) - rttvar) >> 2); - if (rttvar >= 1) - scp->nsp_rttvar = (unsigned long)rttvar; - else - scp->nsp_rttvar = 1; - - /* printk(KERN_DEBUG "srtt=%lu rttvar=%lu\n", scp->nsp_srtt, scp->nsp_rttvar); */ -} - -/** - * dn_nsp_clone_and_send - Send a data packet by cloning it - * @skb: The packet to clone and transmit - * @gfp: memory allocation flag - * - * Clone a queued data or other data packet and transmit it. - * - * Returns: The number of times the packet has been sent previously - */ -static inline unsigned int dn_nsp_clone_and_send(struct sk_buff *skb, - gfp_t gfp) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct sk_buff *skb2; - int ret = 0; - - if ((skb2 = skb_clone(skb, gfp)) != NULL) { - ret = cb->xmit_count; - cb->xmit_count++; - cb->stamp = jiffies; - skb2->sk = skb->sk; - dn_nsp_send(skb2); - } - - return ret; -} - -/** - * dn_nsp_output - Try and send something from socket queues - * @sk: The socket whose queues are to be investigated - * - * Try and send the packet on the end of the data and other data queues. - * Other data gets priority over data, and if we retransmit a packet we - * reduce the window by dividing it in two. - * - */ -void dn_nsp_output(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - struct sk_buff *skb; - unsigned int reduce_win = 0; - - /* - * First we check for otherdata/linkservice messages - */ - if ((skb = skb_peek(&scp->other_xmit_queue)) != NULL) - reduce_win = dn_nsp_clone_and_send(skb, GFP_ATOMIC); - - /* - * If we may not send any data, we don't. - * If we are still trying to get some other data down the - * channel, we don't try and send any data. - */ - if (reduce_win || (scp->flowrem_sw != DN_SEND)) - goto recalc_window; - - if ((skb = skb_peek(&scp->data_xmit_queue)) != NULL) - reduce_win = dn_nsp_clone_and_send(skb, GFP_ATOMIC); - - /* - * If we've sent any frame more than once, we cut the - * send window size in half. There is always a minimum - * window size of one available. - */ -recalc_window: - if (reduce_win) { - scp->snd_window >>= 1; - if (scp->snd_window < NSP_MIN_WINDOW) - scp->snd_window = NSP_MIN_WINDOW; - } -} - -int dn_nsp_xmit_timeout(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - dn_nsp_output(sk); - - if (!skb_queue_empty(&scp->data_xmit_queue) || - !skb_queue_empty(&scp->other_xmit_queue)) - scp->persist = dn_nsp_persist(sk); - - return 0; -} - -static inline __le16 *dn_mk_common_header(struct dn_scp *scp, struct sk_buff *skb, unsigned char msgflag, int len) -{ - unsigned char *ptr = skb_push(skb, len); - - BUG_ON(len < 5); - - *ptr++ = msgflag; - *((__le16 *)ptr) = scp->addrrem; - ptr += 2; - *((__le16 *)ptr) = scp->addrloc; - ptr += 2; - return (__le16 __force *)ptr; -} - -static __le16 *dn_mk_ack_header(struct sock *sk, struct sk_buff *skb, unsigned char msgflag, int hlen, int other) -{ - struct dn_scp *scp = DN_SK(sk); - unsigned short acknum = scp->numdat_rcv & 0x0FFF; - unsigned short ackcrs = scp->numoth_rcv & 0x0FFF; - __le16 *ptr; - - BUG_ON(hlen < 9); - - scp->ackxmt_dat = acknum; - scp->ackxmt_oth = ackcrs; - acknum |= 0x8000; - ackcrs |= 0x8000; - - /* If this is an "other data/ack" message, swap acknum and ackcrs */ - if (other) - swap(acknum, ackcrs); - - /* Set "cross subchannel" bit in ackcrs */ - ackcrs |= 0x2000; - - ptr = dn_mk_common_header(scp, skb, msgflag, hlen); - - *ptr++ = cpu_to_le16(acknum); - *ptr++ = cpu_to_le16(ackcrs); - - return ptr; -} - -static __le16 *dn_nsp_mk_data_header(struct sock *sk, struct sk_buff *skb, int oth) -{ - struct dn_scp *scp = DN_SK(sk); - struct dn_skb_cb *cb = DN_SKB_CB(skb); - __le16 *ptr = dn_mk_ack_header(sk, skb, cb->nsp_flags, 11, oth); - - if (unlikely(oth)) { - cb->segnum = scp->numoth; - seq_add(&scp->numoth, 1); - } else { - cb->segnum = scp->numdat; - seq_add(&scp->numdat, 1); - } - *(ptr++) = cpu_to_le16(cb->segnum); - - return ptr; -} - -void dn_nsp_queue_xmit(struct sock *sk, struct sk_buff *skb, - gfp_t gfp, int oth) -{ - struct dn_scp *scp = DN_SK(sk); - struct dn_skb_cb *cb = DN_SKB_CB(skb); - unsigned long t = ((scp->nsp_srtt >> 2) + scp->nsp_rttvar) >> 1; - - cb->xmit_count = 0; - dn_nsp_mk_data_header(sk, skb, oth); - - /* - * Slow start: If we have been idle for more than - * one RTT, then reset window to min size. - */ - if (time_is_before_jiffies(scp->stamp + t)) - scp->snd_window = NSP_MIN_WINDOW; - - if (oth) - skb_queue_tail(&scp->other_xmit_queue, skb); - else - skb_queue_tail(&scp->data_xmit_queue, skb); - - if (scp->flowrem_sw != DN_SEND) - return; - - dn_nsp_clone_and_send(skb, gfp); -} - - -int dn_nsp_check_xmit_queue(struct sock *sk, struct sk_buff *skb, struct sk_buff_head *q, unsigned short acknum) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct dn_scp *scp = DN_SK(sk); - struct sk_buff *skb2, *n, *ack = NULL; - int wakeup = 0; - int try_retrans = 0; - unsigned long reftime = cb->stamp; - unsigned long pkttime; - unsigned short xmit_count; - unsigned short segnum; - - skb_queue_walk_safe(q, skb2, n) { - struct dn_skb_cb *cb2 = DN_SKB_CB(skb2); - - if (dn_before_or_equal(cb2->segnum, acknum)) - ack = skb2; - - /* printk(KERN_DEBUG "ack: %s %04x %04x\n", ack ? "ACK" : "SKIP", (int)cb2->segnum, (int)acknum); */ - - if (ack == NULL) - continue; - - /* printk(KERN_DEBUG "check_xmit_queue: %04x, %d\n", acknum, cb2->xmit_count); */ - - /* Does _last_ packet acked have xmit_count > 1 */ - try_retrans = 0; - /* Remember to wake up the sending process */ - wakeup = 1; - /* Keep various statistics */ - pkttime = cb2->stamp; - xmit_count = cb2->xmit_count; - segnum = cb2->segnum; - /* Remove and drop ack'ed packet */ - skb_unlink(ack, q); - kfree_skb(ack); - ack = NULL; - - /* - * We don't expect to see acknowledgements for packets we - * haven't sent yet. - */ - WARN_ON(xmit_count == 0); - - /* - * If the packet has only been sent once, we can use it - * to calculate the RTT and also open the window a little - * further. - */ - if (xmit_count == 1) { - if (dn_equal(segnum, acknum)) - dn_nsp_rtt(sk, (long)(pkttime - reftime)); - - if (scp->snd_window < scp->max_window) - scp->snd_window++; - } - - /* - * Packet has been sent more than once. If this is the last - * packet to be acknowledged then we want to send the next - * packet in the send queue again (assumes the remote host does - * go-back-N error control). - */ - if (xmit_count > 1) - try_retrans = 1; - } - - if (try_retrans) - dn_nsp_output(sk); - - return wakeup; -} - -void dn_nsp_send_data_ack(struct sock *sk) -{ - struct sk_buff *skb = NULL; - - if ((skb = dn_alloc_skb(sk, 9, GFP_ATOMIC)) == NULL) - return; - - skb_reserve(skb, 9); - dn_mk_ack_header(sk, skb, 0x04, 9, 0); - dn_nsp_send(skb); -} - -void dn_nsp_send_oth_ack(struct sock *sk) -{ - struct sk_buff *skb = NULL; - - if ((skb = dn_alloc_skb(sk, 9, GFP_ATOMIC)) == NULL) - return; - - skb_reserve(skb, 9); - dn_mk_ack_header(sk, skb, 0x14, 9, 1); - dn_nsp_send(skb); -} - - -void dn_send_conn_ack (struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - struct sk_buff *skb = NULL; - struct nsp_conn_ack_msg *msg; - - if ((skb = dn_alloc_skb(sk, 3, sk->sk_allocation)) == NULL) - return; - - msg = skb_put(skb, 3); - msg->msgflg = 0x24; - msg->dstaddr = scp->addrrem; - - dn_nsp_send(skb); -} - -static int dn_nsp_retrans_conn_conf(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - if (scp->state == DN_CC) - dn_send_conn_conf(sk, GFP_ATOMIC); - - return 0; -} - -void dn_send_conn_conf(struct sock *sk, gfp_t gfp) -{ - struct dn_scp *scp = DN_SK(sk); - struct sk_buff *skb = NULL; - struct nsp_conn_init_msg *msg; - __u8 len = (__u8)le16_to_cpu(scp->conndata_out.opt_optl); - - if ((skb = dn_alloc_skb(sk, 50 + len, gfp)) == NULL) - return; - - msg = skb_put(skb, sizeof(*msg)); - msg->msgflg = 0x28; - msg->dstaddr = scp->addrrem; - msg->srcaddr = scp->addrloc; - msg->services = scp->services_loc; - msg->info = scp->info_loc; - msg->segsize = cpu_to_le16(scp->segsize_loc); - - skb_put_u8(skb, len); - - if (len > 0) - skb_put_data(skb, scp->conndata_out.opt_data, len); - - - dn_nsp_send(skb); - - scp->persist = dn_nsp_persist(sk); - scp->persist_fxn = dn_nsp_retrans_conn_conf; -} - - -static __inline__ void dn_nsp_do_disc(struct sock *sk, unsigned char msgflg, - unsigned short reason, gfp_t gfp, - struct dst_entry *dst, - int ddl, unsigned char *dd, __le16 rem, __le16 loc) -{ - struct sk_buff *skb = NULL; - int size = 7 + ddl + ((msgflg == NSP_DISCINIT) ? 1 : 0); - unsigned char *msg; - - if ((dst == NULL) || (rem == 0)) { - net_dbg_ratelimited("DECnet: dn_nsp_do_disc: BUG! Please report this to SteveW@ACM.org rem=%u dst=%p\n", - le16_to_cpu(rem), dst); - return; - } - - if ((skb = dn_alloc_skb(sk, size, gfp)) == NULL) - return; - - msg = skb_put(skb, size); - *msg++ = msgflg; - *(__le16 *)msg = rem; - msg += 2; - *(__le16 *)msg = loc; - msg += 2; - *(__le16 *)msg = cpu_to_le16(reason); - msg += 2; - if (msgflg == NSP_DISCINIT) - *msg++ = ddl; - - if (ddl) { - memcpy(msg, dd, ddl); - } - - /* - * This doesn't go via the dn_nsp_send() function since we need - * to be able to send disc packets out which have no socket - * associations. - */ - skb_dst_set(skb, dst_clone(dst)); - dst_output(&init_net, skb->sk, skb); -} - - -void dn_nsp_send_disc(struct sock *sk, unsigned char msgflg, - unsigned short reason, gfp_t gfp) -{ - struct dn_scp *scp = DN_SK(sk); - int ddl = 0; - - if (msgflg == NSP_DISCINIT) - ddl = le16_to_cpu(scp->discdata_out.opt_optl); - - if (reason == 0) - reason = le16_to_cpu(scp->discdata_out.opt_status); - - dn_nsp_do_disc(sk, msgflg, reason, gfp, __sk_dst_get(sk), ddl, - scp->discdata_out.opt_data, scp->addrrem, scp->addrloc); -} - - -void dn_nsp_return_disc(struct sk_buff *skb, unsigned char msgflg, - unsigned short reason) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - int ddl = 0; - gfp_t gfp = GFP_ATOMIC; - - dn_nsp_do_disc(NULL, msgflg, reason, gfp, skb_dst(skb), ddl, - NULL, cb->src_port, cb->dst_port); -} - - -void dn_nsp_send_link(struct sock *sk, unsigned char lsflags, char fcval) -{ - struct dn_scp *scp = DN_SK(sk); - struct sk_buff *skb; - unsigned char *ptr; - gfp_t gfp = GFP_ATOMIC; - - if ((skb = dn_alloc_skb(sk, DN_MAX_NSP_DATA_HEADER + 2, gfp)) == NULL) - return; - - skb_reserve(skb, DN_MAX_NSP_DATA_HEADER); - ptr = skb_put(skb, 2); - DN_SKB_CB(skb)->nsp_flags = 0x10; - *ptr++ = lsflags; - *ptr = fcval; - - dn_nsp_queue_xmit(sk, skb, gfp, 1); - - scp->persist = dn_nsp_persist(sk); - scp->persist_fxn = dn_nsp_xmit_timeout; -} - -static int dn_nsp_retrans_conninit(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - if (scp->state == DN_CI) - dn_nsp_send_conninit(sk, NSP_RCI); - - return 0; -} - -void dn_nsp_send_conninit(struct sock *sk, unsigned char msgflg) -{ - struct dn_scp *scp = DN_SK(sk); - struct nsp_conn_init_msg *msg; - unsigned char aux; - unsigned char menuver; - struct dn_skb_cb *cb; - unsigned char type = 1; - gfp_t allocation = (msgflg == NSP_CI) ? sk->sk_allocation : GFP_ATOMIC; - struct sk_buff *skb = dn_alloc_skb(sk, 200, allocation); - - if (!skb) - return; - - cb = DN_SKB_CB(skb); - msg = skb_put(skb, sizeof(*msg)); - - msg->msgflg = msgflg; - msg->dstaddr = 0x0000; /* Remote Node will assign it*/ - - msg->srcaddr = scp->addrloc; - msg->services = scp->services_loc; /* Requested flow control */ - msg->info = scp->info_loc; /* Version Number */ - msg->segsize = cpu_to_le16(scp->segsize_loc); /* Max segment size */ - - if (scp->peer.sdn_objnum) - type = 0; - - skb_put(skb, dn_sockaddr2username(&scp->peer, - skb_tail_pointer(skb), type)); - skb_put(skb, dn_sockaddr2username(&scp->addr, - skb_tail_pointer(skb), 2)); - - menuver = DN_MENUVER_ACC | DN_MENUVER_USR; - if (scp->peer.sdn_flags & SDF_PROXY) - menuver |= DN_MENUVER_PRX; - if (scp->peer.sdn_flags & SDF_UICPROXY) - menuver |= DN_MENUVER_UIC; - - skb_put_u8(skb, menuver); /* Menu Version */ - - aux = scp->accessdata.acc_userl; - skb_put_u8(skb, aux); - if (aux > 0) - skb_put_data(skb, scp->accessdata.acc_user, aux); - - aux = scp->accessdata.acc_passl; - skb_put_u8(skb, aux); - if (aux > 0) - skb_put_data(skb, scp->accessdata.acc_pass, aux); - - aux = scp->accessdata.acc_accl; - skb_put_u8(skb, aux); - if (aux > 0) - skb_put_data(skb, scp->accessdata.acc_acc, aux); - - aux = (__u8)le16_to_cpu(scp->conndata_out.opt_optl); - skb_put_u8(skb, aux); - if (aux > 0) - skb_put_data(skb, scp->conndata_out.opt_data, aux); - - scp->persist = dn_nsp_persist(sk); - scp->persist_fxn = dn_nsp_retrans_conninit; - - cb->rt_flags = DN_RT_F_RQR; - - dn_nsp_send(skb); -} diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c deleted file mode 100644 index ac2ee1689111..000000000000 --- a/net/decnet/dn_route.c +++ /dev/null @@ -1,1922 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Routing Functions (Endnode and Router) - * - * Authors: Steve Whitehouse <SteveW@ACM.org> - * Eduardo Marcelo Serrat <emserrat@geocities.com> - * - * Changes: - * Steve Whitehouse : Fixes to allow "intra-ethernet" and - * "return-to-sender" bits on outgoing - * packets. - * Steve Whitehouse : Timeouts for cached routes. - * Steve Whitehouse : Use dst cache for input routes too. - * Steve Whitehouse : Fixed error values in dn_send_skb. - * Steve Whitehouse : Rework routing functions to better fit - * DECnet routing design - * Alexey Kuznetsov : New SMP locking - * Steve Whitehouse : More SMP locking changes & dn_cache_dump() - * Steve Whitehouse : Prerouting NF hook, now really is prerouting. - * Fixed possible skb leak in rtnetlink funcs. - * Steve Whitehouse : Dave Miller's dynamic hash table sizing and - * Alexey Kuznetsov's finer grained locking - * from ipv4/route.c. - * Steve Whitehouse : Routing is now starting to look like a - * sensible set of code now, mainly due to - * my copying the IPv4 routing code. The - * hooks here are modified and will continue - * to evolve for a while. - * Steve Whitehouse : Real SMP at last :-) Also new netfilter - * stuff. Look out raw sockets your days - * are numbered! - * Steve Whitehouse : Added return-to-sender functions. Added - * backlog congestion level return codes. - * Steve Whitehouse : Fixed bug where routes were set up with - * no ref count on net devices. - * Steve Whitehouse : RCU for the route cache - * Steve Whitehouse : Preparations for the flow cache - * Steve Whitehouse : Prepare for nonlinear skbs - */ - -/****************************************************************************** - (c) 1995-1998 E.M. Serrat emserrat@geocities.com - -*******************************************************************************/ - -#include <linux/errno.h> -#include <linux/types.h> -#include <linux/socket.h> -#include <linux/in.h> -#include <linux/kernel.h> -#include <linux/sockios.h> -#include <linux/net.h> -#include <linux/netdevice.h> -#include <linux/inet.h> -#include <linux/route.h> -#include <linux/in_route.h> -#include <linux/slab.h> -#include <net/sock.h> -#include <linux/mm.h> -#include <linux/proc_fs.h> -#include <linux/seq_file.h> -#include <linux/init.h> -#include <linux/rtnetlink.h> -#include <linux/string.h> -#include <linux/netfilter_decnet.h> -#include <linux/rcupdate.h> -#include <linux/times.h> -#include <linux/export.h> -#include <asm/errno.h> -#include <net/net_namespace.h> -#include <net/netlink.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/flow.h> -#include <net/fib_rules.h> -#include <net/dn.h> -#include <net/dn_dev.h> -#include <net/dn_nsp.h> -#include <net/dn_route.h> -#include <net/dn_neigh.h> -#include <net/dn_fib.h> - -struct dn_rt_hash_bucket { - struct dn_route __rcu *chain; - spinlock_t lock; -}; - -extern struct neigh_table dn_neigh_table; - - -static unsigned char dn_hiord_addr[6] = {0xAA, 0x00, 0x04, 0x00, 0x00, 0x00}; - -static const int dn_rt_min_delay = 2 * HZ; -static const int dn_rt_max_delay = 10 * HZ; -static const int dn_rt_mtu_expires = 10 * 60 * HZ; - -static unsigned long dn_rt_deadline; - -static int dn_dst_gc(struct dst_ops *ops); -static struct dst_entry *dn_dst_check(struct dst_entry *, __u32); -static unsigned int dn_dst_default_advmss(const struct dst_entry *dst); -static unsigned int dn_dst_mtu(const struct dst_entry *dst); -static void dn_dst_destroy(struct dst_entry *); -static void dn_dst_ifdown(struct dst_entry *, struct net_device *dev, int how); -static struct dst_entry *dn_dst_negative_advice(struct dst_entry *); -static void dn_dst_link_failure(struct sk_buff *); -static void dn_dst_update_pmtu(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb , u32 mtu, - bool confirm_neigh); -static void dn_dst_redirect(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb); -static struct neighbour *dn_dst_neigh_lookup(const struct dst_entry *dst, - struct sk_buff *skb, - const void *daddr); -static int dn_route_input(struct sk_buff *); -static void dn_run_flush(struct timer_list *unused); - -static struct dn_rt_hash_bucket *dn_rt_hash_table; -static unsigned int dn_rt_hash_mask; - -static struct timer_list dn_route_timer; -static DEFINE_TIMER(dn_rt_flush_timer, dn_run_flush); -int decnet_dst_gc_interval = 2; - -static struct dst_ops dn_dst_ops = { - .family = PF_DECnet, - .gc_thresh = 128, - .gc = dn_dst_gc, - .check = dn_dst_check, - .default_advmss = dn_dst_default_advmss, - .mtu = dn_dst_mtu, - .cow_metrics = dst_cow_metrics_generic, - .destroy = dn_dst_destroy, - .ifdown = dn_dst_ifdown, - .negative_advice = dn_dst_negative_advice, - .link_failure = dn_dst_link_failure, - .update_pmtu = dn_dst_update_pmtu, - .redirect = dn_dst_redirect, - .neigh_lookup = dn_dst_neigh_lookup, -}; - -static void dn_dst_destroy(struct dst_entry *dst) -{ - struct dn_route *rt = (struct dn_route *) dst; - - if (rt->n) - neigh_release(rt->n); - dst_destroy_metrics_generic(dst); -} - -static void dn_dst_ifdown(struct dst_entry *dst, struct net_device *dev, int how) -{ - if (how) { - struct dn_route *rt = (struct dn_route *) dst; - struct neighbour *n = rt->n; - - if (n && n->dev == dev) { - n->dev = blackhole_netdev; - dev_hold(n->dev); - dev_put(dev); - } - } -} - -static __inline__ unsigned int dn_hash(__le16 src, __le16 dst) -{ - __u16 tmp = (__u16 __force)(src ^ dst); - tmp ^= (tmp >> 3); - tmp ^= (tmp >> 5); - tmp ^= (tmp >> 10); - return dn_rt_hash_mask & (unsigned int)tmp; -} - -static void dn_dst_check_expire(struct timer_list *unused) -{ - int i; - struct dn_route *rt; - struct dn_route __rcu **rtp; - unsigned long now = jiffies; - unsigned long expire = 120 * HZ; - - for (i = 0; i <= dn_rt_hash_mask; i++) { - rtp = &dn_rt_hash_table[i].chain; - - spin_lock(&dn_rt_hash_table[i].lock); - while ((rt = rcu_dereference_protected(*rtp, - lockdep_is_held(&dn_rt_hash_table[i].lock))) != NULL) { - if (atomic_read(&rt->dst.__refcnt) > 1 || - (now - rt->dst.lastuse) < expire) { - rtp = &rt->dn_next; - continue; - } - *rtp = rt->dn_next; - rt->dn_next = NULL; - dst_dev_put(&rt->dst); - dst_release(&rt->dst); - } - spin_unlock(&dn_rt_hash_table[i].lock); - - if (jiffies != now) - break; - } - - mod_timer(&dn_route_timer, now + decnet_dst_gc_interval * HZ); -} - -static int dn_dst_gc(struct dst_ops *ops) -{ - struct dn_route *rt; - struct dn_route __rcu **rtp; - int i; - unsigned long now = jiffies; - unsigned long expire = 10 * HZ; - - for (i = 0; i <= dn_rt_hash_mask; i++) { - - spin_lock_bh(&dn_rt_hash_table[i].lock); - rtp = &dn_rt_hash_table[i].chain; - - while ((rt = rcu_dereference_protected(*rtp, - lockdep_is_held(&dn_rt_hash_table[i].lock))) != NULL) { - if (atomic_read(&rt->dst.__refcnt) > 1 || - (now - rt->dst.lastuse) < expire) { - rtp = &rt->dn_next; - continue; - } - *rtp = rt->dn_next; - rt->dn_next = NULL; - dst_dev_put(&rt->dst); - dst_release(&rt->dst); - break; - } - spin_unlock_bh(&dn_rt_hash_table[i].lock); - } - - return 0; -} - -/* - * The decnet standards don't impose a particular minimum mtu, what they - * do insist on is that the routing layer accepts a datagram of at least - * 230 bytes long. Here we have to subtract the routing header length from - * 230 to get the minimum acceptable mtu. If there is no neighbour, then we - * assume the worst and use a long header size. - * - * We update both the mtu and the advertised mss (i.e. the segment size we - * advertise to the other end). - */ -static void dn_dst_update_pmtu(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb, u32 mtu, - bool confirm_neigh) -{ - struct dn_route *rt = (struct dn_route *) dst; - struct neighbour *n = rt->n; - u32 min_mtu = 230; - struct dn_dev *dn; - - dn = n ? rcu_dereference_raw(n->dev->dn_ptr) : NULL; - - if (dn && dn->use_long == 0) - min_mtu -= 6; - else - min_mtu -= 21; - - if (dst_metric(dst, RTAX_MTU) > mtu && mtu >= min_mtu) { - if (!(dst_metric_locked(dst, RTAX_MTU))) { - dst_metric_set(dst, RTAX_MTU, mtu); - dst_set_expires(dst, dn_rt_mtu_expires); - } - if (!(dst_metric_locked(dst, RTAX_ADVMSS))) { - u32 mss = mtu - DN_MAX_NSP_DATA_HEADER; - u32 existing_mss = dst_metric_raw(dst, RTAX_ADVMSS); - if (!existing_mss || existing_mss > mss) - dst_metric_set(dst, RTAX_ADVMSS, mss); - } - } -} - -static void dn_dst_redirect(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb) -{ -} - -/* - * When a route has been marked obsolete. (e.g. routing cache flush) - */ -static struct dst_entry *dn_dst_check(struct dst_entry *dst, __u32 cookie) -{ - return NULL; -} - -static struct dst_entry *dn_dst_negative_advice(struct dst_entry *dst) -{ - dst_release(dst); - return NULL; -} - -static void dn_dst_link_failure(struct sk_buff *skb) -{ -} - -static inline int compare_keys(struct flowidn *fl1, struct flowidn *fl2) -{ - return ((fl1->daddr ^ fl2->daddr) | - (fl1->saddr ^ fl2->saddr) | - (fl1->flowidn_mark ^ fl2->flowidn_mark) | - (fl1->flowidn_scope ^ fl2->flowidn_scope) | - (fl1->flowidn_oif ^ fl2->flowidn_oif) | - (fl1->flowidn_iif ^ fl2->flowidn_iif)) == 0; -} - -static int dn_insert_route(struct dn_route *rt, unsigned int hash, struct dn_route **rp) -{ - struct dn_route *rth; - struct dn_route __rcu **rthp; - unsigned long now = jiffies; - - rthp = &dn_rt_hash_table[hash].chain; - - spin_lock_bh(&dn_rt_hash_table[hash].lock); - while ((rth = rcu_dereference_protected(*rthp, - lockdep_is_held(&dn_rt_hash_table[hash].lock))) != NULL) { - if (compare_keys(&rth->fld, &rt->fld)) { - /* Put it first */ - *rthp = rth->dn_next; - rcu_assign_pointer(rth->dn_next, - dn_rt_hash_table[hash].chain); - rcu_assign_pointer(dn_rt_hash_table[hash].chain, rth); - - dst_hold_and_use(&rth->dst, now); - spin_unlock_bh(&dn_rt_hash_table[hash].lock); - - dst_release_immediate(&rt->dst); - *rp = rth; - return 0; - } - rthp = &rth->dn_next; - } - - rcu_assign_pointer(rt->dn_next, dn_rt_hash_table[hash].chain); - rcu_assign_pointer(dn_rt_hash_table[hash].chain, rt); - - dst_hold_and_use(&rt->dst, now); - spin_unlock_bh(&dn_rt_hash_table[hash].lock); - *rp = rt; - return 0; -} - -static void dn_run_flush(struct timer_list *unused) -{ - int i; - struct dn_route *rt, *next; - - for (i = 0; i < dn_rt_hash_mask; i++) { - spin_lock_bh(&dn_rt_hash_table[i].lock); - - rt = xchg((struct dn_route **)&dn_rt_hash_table[i].chain, NULL); - if (!rt) - goto nothing_to_declare; - - for (; rt; rt = next) { - next = rcu_dereference_raw(rt->dn_next); - RCU_INIT_POINTER(rt->dn_next, NULL); - dst_dev_put(&rt->dst); - dst_release(&rt->dst); - } - -nothing_to_declare: - spin_unlock_bh(&dn_rt_hash_table[i].lock); - } -} - -static DEFINE_SPINLOCK(dn_rt_flush_lock); - -void dn_rt_cache_flush(int delay) -{ - unsigned long now = jiffies; - int user_mode = !in_interrupt(); - - if (delay < 0) - delay = dn_rt_min_delay; - - spin_lock_bh(&dn_rt_flush_lock); - - if (del_timer(&dn_rt_flush_timer) && delay > 0 && dn_rt_deadline) { - long tmo = (long)(dn_rt_deadline - now); - - if (user_mode && tmo < dn_rt_max_delay - dn_rt_min_delay) - tmo = 0; - - if (delay > tmo) - delay = tmo; - } - - if (delay <= 0) { - spin_unlock_bh(&dn_rt_flush_lock); - dn_run_flush(NULL); - return; - } - - if (dn_rt_deadline == 0) - dn_rt_deadline = now + dn_rt_max_delay; - - dn_rt_flush_timer.expires = now + delay; - add_timer(&dn_rt_flush_timer); - spin_unlock_bh(&dn_rt_flush_lock); -} - -/** - * dn_return_short - Return a short packet to its sender - * @skb: The packet to return - * - */ -static int dn_return_short(struct sk_buff *skb) -{ - struct dn_skb_cb *cb; - unsigned char *ptr; - __le16 *src; - __le16 *dst; - - /* Add back headers */ - skb_push(skb, skb->data - skb_network_header(skb)); - - skb = skb_unshare(skb, GFP_ATOMIC); - if (!skb) - return NET_RX_DROP; - - cb = DN_SKB_CB(skb); - /* Skip packet length and point to flags */ - ptr = skb->data + 2; - *ptr++ = (cb->rt_flags & ~DN_RT_F_RQR) | DN_RT_F_RTS; - - dst = (__le16 *)ptr; - ptr += 2; - src = (__le16 *)ptr; - ptr += 2; - *ptr = 0; /* Zero hop count */ - - swap(*src, *dst); - - skb->pkt_type = PACKET_OUTGOING; - dn_rt_finish_output(skb, NULL, NULL); - return NET_RX_SUCCESS; -} - -/** - * dn_return_long - Return a long packet to its sender - * @skb: The long format packet to return - * - */ -static int dn_return_long(struct sk_buff *skb) -{ - struct dn_skb_cb *cb; - unsigned char *ptr; - unsigned char *src_addr, *dst_addr; - unsigned char tmp[ETH_ALEN]; - - /* Add back all headers */ - skb_push(skb, skb->data - skb_network_header(skb)); - - skb = skb_unshare(skb, GFP_ATOMIC); - if (!skb) - return NET_RX_DROP; - - cb = DN_SKB_CB(skb); - /* Ignore packet length and point to flags */ - ptr = skb->data + 2; - - /* Skip padding */ - if (*ptr & DN_RT_F_PF) { - char padlen = (*ptr & ~DN_RT_F_PF); - ptr += padlen; - } - - *ptr++ = (cb->rt_flags & ~DN_RT_F_RQR) | DN_RT_F_RTS; - ptr += 2; - dst_addr = ptr; - ptr += 8; - src_addr = ptr; - ptr += 6; - *ptr = 0; /* Zero hop count */ - - /* Swap source and destination */ - memcpy(tmp, src_addr, ETH_ALEN); - memcpy(src_addr, dst_addr, ETH_ALEN); - memcpy(dst_addr, tmp, ETH_ALEN); - - skb->pkt_type = PACKET_OUTGOING; - dn_rt_finish_output(skb, dst_addr, src_addr); - return NET_RX_SUCCESS; -} - -/** - * dn_route_rx_packet - Try and find a route for an incoming packet - * @net: The applicable net namespace - * @sk: Socket packet transmitted on - * @skb: The packet to find a route for - * - * Returns: result of input function if route is found, error code otherwise - */ -static int dn_route_rx_packet(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - struct dn_skb_cb *cb; - int err; - - err = dn_route_input(skb); - if (err == 0) - return dst_input(skb); - - cb = DN_SKB_CB(skb); - if (decnet_debug_level & 4) { - char *devname = skb->dev ? skb->dev->name : "???"; - - printk(KERN_DEBUG - "DECnet: dn_route_rx_packet: rt_flags=0x%02x dev=%s len=%d src=0x%04hx dst=0x%04hx err=%d type=%d\n", - (int)cb->rt_flags, devname, skb->len, - le16_to_cpu(cb->src), le16_to_cpu(cb->dst), - err, skb->pkt_type); - } - - if ((skb->pkt_type == PACKET_HOST) && (cb->rt_flags & DN_RT_F_RQR)) { - switch (cb->rt_flags & DN_RT_PKT_MSK) { - case DN_RT_PKT_SHORT: - return dn_return_short(skb); - case DN_RT_PKT_LONG: - return dn_return_long(skb); - } - } - - kfree_skb(skb); - return NET_RX_DROP; -} - -static int dn_route_rx_long(struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - unsigned char *ptr = skb->data; - - if (!pskb_may_pull(skb, 21)) /* 20 for long header, 1 for shortest nsp */ - goto drop_it; - - skb_pull(skb, 20); - skb_reset_transport_header(skb); - - /* Destination info */ - ptr += 2; - cb->dst = dn_eth2dn(ptr); - if (memcmp(ptr, dn_hiord_addr, 4) != 0) - goto drop_it; - ptr += 6; - - - /* Source info */ - ptr += 2; - cb->src = dn_eth2dn(ptr); - if (memcmp(ptr, dn_hiord_addr, 4) != 0) - goto drop_it; - ptr += 6; - /* Other junk */ - ptr++; - cb->hops = *ptr++; /* Visit Count */ - - return NF_HOOK(NFPROTO_DECNET, NF_DN_PRE_ROUTING, - &init_net, NULL, skb, skb->dev, NULL, - dn_route_rx_packet); - -drop_it: - kfree_skb(skb); - return NET_RX_DROP; -} - - - -static int dn_route_rx_short(struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - unsigned char *ptr = skb->data; - - if (!pskb_may_pull(skb, 6)) /* 5 for short header + 1 for shortest nsp */ - goto drop_it; - - skb_pull(skb, 5); - skb_reset_transport_header(skb); - - cb->dst = *(__le16 *)ptr; - ptr += 2; - cb->src = *(__le16 *)ptr; - ptr += 2; - cb->hops = *ptr & 0x3f; - - return NF_HOOK(NFPROTO_DECNET, NF_DN_PRE_ROUTING, - &init_net, NULL, skb, skb->dev, NULL, - dn_route_rx_packet); - -drop_it: - kfree_skb(skb); - return NET_RX_DROP; -} - -static int dn_route_discard(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - /* - * I know we drop the packet here, but that's considered success in - * this case - */ - kfree_skb(skb); - return NET_RX_SUCCESS; -} - -static int dn_route_ptp_hello(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - dn_dev_hello(skb); - dn_neigh_pointopoint_hello(skb); - return NET_RX_SUCCESS; -} - -int dn_route_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev) -{ - struct dn_skb_cb *cb; - unsigned char flags = 0; - __u16 len = le16_to_cpu(*(__le16 *)skb->data); - struct dn_dev *dn = rcu_dereference(dev->dn_ptr); - unsigned char padlen = 0; - - if (!net_eq(dev_net(dev), &init_net)) - goto dump_it; - - if (dn == NULL) - goto dump_it; - - skb = skb_share_check(skb, GFP_ATOMIC); - if (!skb) - goto out; - - if (!pskb_may_pull(skb, 3)) - goto dump_it; - - skb_pull(skb, 2); - - if (len > skb->len) - goto dump_it; - - skb_trim(skb, len); - - flags = *skb->data; - - cb = DN_SKB_CB(skb); - cb->stamp = jiffies; - cb->iif = dev->ifindex; - - /* - * If we have padding, remove it. - */ - if (flags & DN_RT_F_PF) { - padlen = flags & ~DN_RT_F_PF; - if (!pskb_may_pull(skb, padlen + 1)) - goto dump_it; - skb_pull(skb, padlen); - flags = *skb->data; - } - - skb_reset_network_header(skb); - - /* - * Weed out future version DECnet - */ - if (flags & DN_RT_F_VER) - goto dump_it; - - cb->rt_flags = flags; - - if (decnet_debug_level & 1) - printk(KERN_DEBUG - "dn_route_rcv: got 0x%02x from %s [%d %d %d]\n", - (int)flags, dev->name, len, skb->len, - padlen); - - if (flags & DN_RT_PKT_CNTL) { - if (unlikely(skb_linearize(skb))) - goto dump_it; - - switch (flags & DN_RT_CNTL_MSK) { - case DN_RT_PKT_INIT: - dn_dev_init_pkt(skb); - break; - case DN_RT_PKT_VERI: - dn_dev_veri_pkt(skb); - break; - } - - if (dn->parms.state != DN_DEV_S_RU) - goto dump_it; - - switch (flags & DN_RT_CNTL_MSK) { - case DN_RT_PKT_HELO: - return NF_HOOK(NFPROTO_DECNET, NF_DN_HELLO, - &init_net, NULL, skb, skb->dev, NULL, - dn_route_ptp_hello); - - case DN_RT_PKT_L1RT: - case DN_RT_PKT_L2RT: - return NF_HOOK(NFPROTO_DECNET, NF_DN_ROUTE, - &init_net, NULL, skb, skb->dev, NULL, - dn_route_discard); - case DN_RT_PKT_ERTH: - return NF_HOOK(NFPROTO_DECNET, NF_DN_HELLO, - &init_net, NULL, skb, skb->dev, NULL, - dn_neigh_router_hello); - - case DN_RT_PKT_EEDH: - return NF_HOOK(NFPROTO_DECNET, NF_DN_HELLO, - &init_net, NULL, skb, skb->dev, NULL, - dn_neigh_endnode_hello); - } - } else { - if (dn->parms.state != DN_DEV_S_RU) - goto dump_it; - - skb_pull(skb, 1); /* Pull flags */ - - switch (flags & DN_RT_PKT_MSK) { - case DN_RT_PKT_LONG: - return dn_route_rx_long(skb); - case DN_RT_PKT_SHORT: - return dn_route_rx_short(skb); - } - } - -dump_it: - kfree_skb(skb); -out: - return NET_RX_DROP; -} - -static int dn_output(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - struct dst_entry *dst = skb_dst(skb); - struct dn_route *rt = (struct dn_route *)dst; - struct net_device *dev = dst->dev; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - - int err = -EINVAL; - - if (rt->n == NULL) - goto error; - - skb->dev = dev; - - cb->src = rt->rt_saddr; - cb->dst = rt->rt_daddr; - - /* - * Always set the Intra-Ethernet bit on all outgoing packets - * originated on this node. Only valid flag from upper layers - * is return-to-sender-requested. Set hop count to 0 too. - */ - cb->rt_flags &= ~DN_RT_F_RQR; - cb->rt_flags |= DN_RT_F_IE; - cb->hops = 0; - - return NF_HOOK(NFPROTO_DECNET, NF_DN_LOCAL_OUT, - &init_net, sk, skb, NULL, dev, - dn_to_neigh_output); - -error: - net_dbg_ratelimited("dn_output: This should not happen\n"); - - kfree_skb(skb); - - return err; -} - -static int dn_forward(struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct dst_entry *dst = skb_dst(skb); - struct dn_dev *dn_db = rcu_dereference(dst->dev->dn_ptr); - struct dn_route *rt; - int header_len; - struct net_device *dev = skb->dev; - - if (skb->pkt_type != PACKET_HOST) - goto drop; - - /* Ensure that we have enough space for headers */ - rt = (struct dn_route *)skb_dst(skb); - header_len = dn_db->use_long ? 21 : 6; - if (skb_cow(skb, LL_RESERVED_SPACE(rt->dst.dev)+header_len)) - goto drop; - - /* - * Hop count exceeded. - */ - if (++cb->hops > 30) - goto drop; - - skb->dev = rt->dst.dev; - - /* - * If packet goes out same interface it came in on, then set - * the Intra-Ethernet bit. This has no effect for short - * packets, so we don't need to test for them here. - */ - cb->rt_flags &= ~DN_RT_F_IE; - if (rt->rt_flags & RTCF_DOREDIRECT) - cb->rt_flags |= DN_RT_F_IE; - - return NF_HOOK(NFPROTO_DECNET, NF_DN_FORWARD, - &init_net, NULL, skb, dev, skb->dev, - dn_to_neigh_output); - -drop: - kfree_skb(skb); - return NET_RX_DROP; -} - -/* - * Used to catch bugs. This should never normally get - * called. - */ -static int dn_rt_bug_out(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - - net_dbg_ratelimited("dn_rt_bug: skb from:%04x to:%04x\n", - le16_to_cpu(cb->src), le16_to_cpu(cb->dst)); - - kfree_skb(skb); - - return NET_RX_DROP; -} - -static int dn_rt_bug(struct sk_buff *skb) -{ - struct dn_skb_cb *cb = DN_SKB_CB(skb); - - net_dbg_ratelimited("dn_rt_bug: skb from:%04x to:%04x\n", - le16_to_cpu(cb->src), le16_to_cpu(cb->dst)); - - kfree_skb(skb); - - return NET_RX_DROP; -} - -static unsigned int dn_dst_default_advmss(const struct dst_entry *dst) -{ - return dn_mss_from_pmtu(dst->dev, dst_mtu(dst)); -} - -static unsigned int dn_dst_mtu(const struct dst_entry *dst) -{ - unsigned int mtu = dst_metric_raw(dst, RTAX_MTU); - - return mtu ? : dst->dev->mtu; -} - -static struct neighbour *dn_dst_neigh_lookup(const struct dst_entry *dst, - struct sk_buff *skb, - const void *daddr) -{ - return __neigh_lookup_errno(&dn_neigh_table, daddr, dst->dev); -} - -static int dn_rt_set_next_hop(struct dn_route *rt, struct dn_fib_res *res) -{ - struct dn_fib_info *fi = res->fi; - struct net_device *dev = rt->dst.dev; - unsigned int mss_metric; - struct neighbour *n; - - if (fi) { - if (DN_FIB_RES_GW(*res) && - DN_FIB_RES_NH(*res).nh_scope == RT_SCOPE_LINK) - rt->rt_gateway = DN_FIB_RES_GW(*res); - dst_init_metrics(&rt->dst, fi->fib_metrics, true); - } - rt->rt_type = res->type; - - if (dev != NULL && rt->n == NULL) { - n = __neigh_lookup_errno(&dn_neigh_table, &rt->rt_gateway, dev); - if (IS_ERR(n)) - return PTR_ERR(n); - rt->n = n; - } - - if (dst_metric(&rt->dst, RTAX_MTU) > rt->dst.dev->mtu) - dst_metric_set(&rt->dst, RTAX_MTU, rt->dst.dev->mtu); - mss_metric = dst_metric_raw(&rt->dst, RTAX_ADVMSS); - if (mss_metric) { - unsigned int mss = dn_mss_from_pmtu(dev, dst_mtu(&rt->dst)); - if (mss_metric > mss) - dst_metric_set(&rt->dst, RTAX_ADVMSS, mss); - } - return 0; -} - -static inline int dn_match_addr(__le16 addr1, __le16 addr2) -{ - __u16 tmp = le16_to_cpu(addr1) ^ le16_to_cpu(addr2); - int match = 16; - while (tmp) { - tmp >>= 1; - match--; - } - return match; -} - -static __le16 dnet_select_source(const struct net_device *dev, __le16 daddr, int scope) -{ - __le16 saddr = 0; - struct dn_dev *dn_db; - struct dn_ifaddr *ifa; - int best_match = 0; - int ret; - - rcu_read_lock(); - dn_db = rcu_dereference(dev->dn_ptr); - for (ifa = rcu_dereference(dn_db->ifa_list); - ifa != NULL; - ifa = rcu_dereference(ifa->ifa_next)) { - if (ifa->ifa_scope > scope) - continue; - if (!daddr) { - saddr = ifa->ifa_local; - break; - } - ret = dn_match_addr(daddr, ifa->ifa_local); - if (ret > best_match) - saddr = ifa->ifa_local; - if (best_match == 0) - saddr = ifa->ifa_local; - } - rcu_read_unlock(); - - return saddr; -} - -static inline __le16 __dn_fib_res_prefsrc(struct dn_fib_res *res) -{ - return dnet_select_source(DN_FIB_RES_DEV(*res), DN_FIB_RES_GW(*res), res->scope); -} - -static inline __le16 dn_fib_rules_map_destination(__le16 daddr, struct dn_fib_res *res) -{ - __le16 mask = dnet_make_mask(res->prefixlen); - return (daddr&~mask)|res->fi->fib_nh->nh_gw; -} - -static int dn_route_output_slow(struct dst_entry **pprt, const struct flowidn *oldflp, int try_hard) -{ - struct flowidn fld = { - .daddr = oldflp->daddr, - .saddr = oldflp->saddr, - .flowidn_scope = RT_SCOPE_UNIVERSE, - .flowidn_mark = oldflp->flowidn_mark, - .flowidn_iif = LOOPBACK_IFINDEX, - .flowidn_oif = oldflp->flowidn_oif, - }; - struct dn_route *rt = NULL; - struct net_device *dev_out = NULL, *dev; - struct neighbour *neigh = NULL; - unsigned int hash; - unsigned int flags = 0; - struct dn_fib_res res = { .fi = NULL, .type = RTN_UNICAST }; - int err; - int free_res = 0; - __le16 gateway = 0; - - if (decnet_debug_level & 16) - printk(KERN_DEBUG - "dn_route_output_slow: dst=%04x src=%04x mark=%d" - " iif=%d oif=%d\n", le16_to_cpu(oldflp->daddr), - le16_to_cpu(oldflp->saddr), - oldflp->flowidn_mark, LOOPBACK_IFINDEX, - oldflp->flowidn_oif); - - /* If we have an output interface, verify its a DECnet device */ - if (oldflp->flowidn_oif) { - dev_out = dev_get_by_index(&init_net, oldflp->flowidn_oif); - err = -ENODEV; - if (dev_out && dev_out->dn_ptr == NULL) { - dev_put(dev_out); - dev_out = NULL; - } - if (dev_out == NULL) - goto out; - } - - /* If we have a source address, verify that its a local address */ - if (oldflp->saddr) { - err = -EADDRNOTAVAIL; - - if (dev_out) { - if (dn_dev_islocal(dev_out, oldflp->saddr)) - goto source_ok; - dev_put(dev_out); - goto out; - } - rcu_read_lock(); - for_each_netdev_rcu(&init_net, dev) { - if (!dev->dn_ptr) - continue; - if (!dn_dev_islocal(dev, oldflp->saddr)) - continue; - if ((dev->flags & IFF_LOOPBACK) && - oldflp->daddr && - !dn_dev_islocal(dev, oldflp->daddr)) - continue; - - dev_out = dev; - break; - } - rcu_read_unlock(); - if (dev_out == NULL) - goto out; - dev_hold(dev_out); -source_ok: - ; - } - - /* No destination? Assume its local */ - if (!fld.daddr) { - fld.daddr = fld.saddr; - - dev_put(dev_out); - err = -EINVAL; - dev_out = init_net.loopback_dev; - if (!dev_out->dn_ptr) - goto out; - err = -EADDRNOTAVAIL; - dev_hold(dev_out); - if (!fld.daddr) { - fld.daddr = - fld.saddr = dnet_select_source(dev_out, 0, - RT_SCOPE_HOST); - if (!fld.daddr) - goto done; - } - fld.flowidn_oif = LOOPBACK_IFINDEX; - res.type = RTN_LOCAL; - goto make_route; - } - - if (decnet_debug_level & 16) - printk(KERN_DEBUG - "dn_route_output_slow: initial checks complete." - " dst=%04x src=%04x oif=%d try_hard=%d\n", - le16_to_cpu(fld.daddr), le16_to_cpu(fld.saddr), - fld.flowidn_oif, try_hard); - - /* - * N.B. If the kernel is compiled without router support then - * dn_fib_lookup() will evaluate to non-zero so this if () block - * will always be executed. - */ - err = -ESRCH; - if (try_hard || (err = dn_fib_lookup(&fld, &res)) != 0) { - struct dn_dev *dn_db; - if (err != -ESRCH) - goto out; - /* - * Here the fallback is basically the standard algorithm for - * routing in endnodes which is described in the DECnet routing - * docs - * - * If we are not trying hard, look in neighbour cache. - * The result is tested to ensure that if a specific output - * device/source address was requested, then we honour that - * here - */ - if (!try_hard) { - neigh = neigh_lookup_nodev(&dn_neigh_table, &init_net, &fld.daddr); - if (neigh) { - if ((oldflp->flowidn_oif && - (neigh->dev->ifindex != oldflp->flowidn_oif)) || - (oldflp->saddr && - (!dn_dev_islocal(neigh->dev, - oldflp->saddr)))) { - neigh_release(neigh); - neigh = NULL; - } else { - dev_put(dev_out); - if (dn_dev_islocal(neigh->dev, fld.daddr)) { - dev_out = init_net.loopback_dev; - res.type = RTN_LOCAL; - } else { - dev_out = neigh->dev; - } - dev_hold(dev_out); - goto select_source; - } - } - } - - /* Not there? Perhaps its a local address */ - if (dev_out == NULL) - dev_out = dn_dev_get_default(); - err = -ENODEV; - if (dev_out == NULL) - goto out; - dn_db = rcu_dereference_raw(dev_out->dn_ptr); - if (!dn_db) - goto e_inval; - /* Possible improvement - check all devices for local addr */ - if (dn_dev_islocal(dev_out, fld.daddr)) { - dev_put(dev_out); - dev_out = init_net.loopback_dev; - dev_hold(dev_out); - res.type = RTN_LOCAL; - goto select_source; - } - /* Not local either.... try sending it to the default router */ - neigh = neigh_clone(dn_db->router); - BUG_ON(neigh && neigh->dev != dev_out); - - /* Ok then, we assume its directly connected and move on */ -select_source: - if (neigh) - gateway = container_of(neigh, struct dn_neigh, n)->addr; - if (gateway == 0) - gateway = fld.daddr; - if (fld.saddr == 0) { - fld.saddr = dnet_select_source(dev_out, gateway, - res.type == RTN_LOCAL ? - RT_SCOPE_HOST : - RT_SCOPE_LINK); - if (fld.saddr == 0 && res.type != RTN_LOCAL) - goto e_addr; - } - fld.flowidn_oif = dev_out->ifindex; - goto make_route; - } - free_res = 1; - - if (res.type == RTN_NAT) - goto e_inval; - - if (res.type == RTN_LOCAL) { - if (!fld.saddr) - fld.saddr = fld.daddr; - dev_put(dev_out); - dev_out = init_net.loopback_dev; - dev_hold(dev_out); - if (!dev_out->dn_ptr) - goto e_inval; - fld.flowidn_oif = dev_out->ifindex; - if (res.fi) - dn_fib_info_put(res.fi); - res.fi = NULL; - goto make_route; - } - - if (res.fi->fib_nhs > 1 && fld.flowidn_oif == 0) - dn_fib_select_multipath(&fld, &res); - - /* - * We could add some logic to deal with default routes here and - * get rid of some of the special casing above. - */ - - if (!fld.saddr) - fld.saddr = DN_FIB_RES_PREFSRC(res); - - dev_put(dev_out); - dev_out = DN_FIB_RES_DEV(res); - dev_hold(dev_out); - fld.flowidn_oif = dev_out->ifindex; - gateway = DN_FIB_RES_GW(res); - -make_route: - if (dev_out->flags & IFF_LOOPBACK) - flags |= RTCF_LOCAL; - - rt = dst_alloc(&dn_dst_ops, dev_out, 0, DST_OBSOLETE_NONE, 0); - if (rt == NULL) - goto e_nobufs; - - rt->dn_next = NULL; - memset(&rt->fld, 0, sizeof(rt->fld)); - rt->fld.saddr = oldflp->saddr; - rt->fld.daddr = oldflp->daddr; - rt->fld.flowidn_oif = oldflp->flowidn_oif; - rt->fld.flowidn_iif = 0; - rt->fld.flowidn_mark = oldflp->flowidn_mark; - - rt->rt_saddr = fld.saddr; - rt->rt_daddr = fld.daddr; - rt->rt_gateway = gateway ? gateway : fld.daddr; - rt->rt_local_src = fld.saddr; - - rt->rt_dst_map = fld.daddr; - rt->rt_src_map = fld.saddr; - - rt->n = neigh; - neigh = NULL; - - rt->dst.lastuse = jiffies; - rt->dst.output = dn_output; - rt->dst.input = dn_rt_bug; - rt->rt_flags = flags; - if (flags & RTCF_LOCAL) - rt->dst.input = dn_nsp_rx; - - err = dn_rt_set_next_hop(rt, &res); - if (err) - goto e_neighbour; - - hash = dn_hash(rt->fld.saddr, rt->fld.daddr); - /* dn_insert_route() increments dst->__refcnt */ - dn_insert_route(rt, hash, (struct dn_route **)pprt); - -done: - if (neigh) - neigh_release(neigh); - if (free_res) - dn_fib_res_put(&res); - dev_put(dev_out); -out: - return err; - -e_addr: - err = -EADDRNOTAVAIL; - goto done; -e_inval: - err = -EINVAL; - goto done; -e_nobufs: - err = -ENOBUFS; - goto done; -e_neighbour: - dst_release_immediate(&rt->dst); - goto e_nobufs; -} - - -/* - * N.B. The flags may be moved into the flowi at some future stage. - */ -static int __dn_route_output_key(struct dst_entry **pprt, const struct flowidn *flp, int flags) -{ - unsigned int hash = dn_hash(flp->saddr, flp->daddr); - struct dn_route *rt = NULL; - - if (!(flags & MSG_TRYHARD)) { - rcu_read_lock_bh(); - for (rt = rcu_dereference_bh(dn_rt_hash_table[hash].chain); rt; - rt = rcu_dereference_bh(rt->dn_next)) { - if ((flp->daddr == rt->fld.daddr) && - (flp->saddr == rt->fld.saddr) && - (flp->flowidn_mark == rt->fld.flowidn_mark) && - dn_is_output_route(rt) && - (rt->fld.flowidn_oif == flp->flowidn_oif)) { - dst_hold_and_use(&rt->dst, jiffies); - rcu_read_unlock_bh(); - *pprt = &rt->dst; - return 0; - } - } - rcu_read_unlock_bh(); - } - - return dn_route_output_slow(pprt, flp, flags); -} - -static int dn_route_output_key(struct dst_entry **pprt, struct flowidn *flp, int flags) -{ - int err; - - err = __dn_route_output_key(pprt, flp, flags); - if (err == 0 && flp->flowidn_proto) { - *pprt = xfrm_lookup(&init_net, *pprt, - flowidn_to_flowi(flp), NULL, 0); - if (IS_ERR(*pprt)) { - err = PTR_ERR(*pprt); - *pprt = NULL; - } - } - return err; -} - -int dn_route_output_sock(struct dst_entry __rcu **pprt, struct flowidn *fl, struct sock *sk, int flags) -{ - int err; - - err = __dn_route_output_key(pprt, fl, flags & MSG_TRYHARD); - if (err == 0 && fl->flowidn_proto) { - *pprt = xfrm_lookup(&init_net, *pprt, - flowidn_to_flowi(fl), sk, 0); - if (IS_ERR(*pprt)) { - err = PTR_ERR(*pprt); - *pprt = NULL; - } - } - return err; -} - -static int dn_route_input_slow(struct sk_buff *skb) -{ - struct dn_route *rt = NULL; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - struct net_device *in_dev = skb->dev; - struct net_device *out_dev = NULL; - struct dn_dev *dn_db; - struct neighbour *neigh = NULL; - unsigned int hash; - int flags = 0; - __le16 gateway = 0; - __le16 local_src = 0; - struct flowidn fld = { - .daddr = cb->dst, - .saddr = cb->src, - .flowidn_scope = RT_SCOPE_UNIVERSE, - .flowidn_mark = skb->mark, - .flowidn_iif = skb->dev->ifindex, - }; - struct dn_fib_res res = { .fi = NULL, .type = RTN_UNREACHABLE }; - int err = -EINVAL; - int free_res = 0; - - dev_hold(in_dev); - - dn_db = rcu_dereference(in_dev->dn_ptr); - if (!dn_db) - goto out; - - /* Zero source addresses are not allowed */ - if (fld.saddr == 0) - goto out; - - /* - * In this case we've just received a packet from a source - * outside ourselves pretending to come from us. We don't - * allow it any further to prevent routing loops, spoofing and - * other nasties. Loopback packets already have the dst attached - * so this only affects packets which have originated elsewhere. - */ - err = -ENOTUNIQ; - if (dn_dev_islocal(in_dev, cb->src)) - goto out; - - err = dn_fib_lookup(&fld, &res); - if (err) { - if (err != -ESRCH) - goto out; - /* - * Is the destination us ? - */ - if (!dn_dev_islocal(in_dev, cb->dst)) - goto e_inval; - - res.type = RTN_LOCAL; - } else { - __le16 src_map = fld.saddr; - free_res = 1; - - out_dev = DN_FIB_RES_DEV(res); - if (out_dev == NULL) { - net_crit_ratelimited("Bug in dn_route_input_slow() No output device\n"); - goto e_inval; - } - dev_hold(out_dev); - - if (res.r) - src_map = fld.saddr; /* no NAT support for now */ - - gateway = DN_FIB_RES_GW(res); - if (res.type == RTN_NAT) { - fld.daddr = dn_fib_rules_map_destination(fld.daddr, &res); - dn_fib_res_put(&res); - free_res = 0; - if (dn_fib_lookup(&fld, &res)) - goto e_inval; - free_res = 1; - if (res.type != RTN_UNICAST) - goto e_inval; - flags |= RTCF_DNAT; - gateway = fld.daddr; - } - fld.saddr = src_map; - } - - switch (res.type) { - case RTN_UNICAST: - /* - * Forwarding check here, we only check for forwarding - * being turned off, if you want to only forward intra - * area, its up to you to set the routing tables up - * correctly. - */ - if (dn_db->parms.forwarding == 0) - goto e_inval; - - if (res.fi->fib_nhs > 1 && fld.flowidn_oif == 0) - dn_fib_select_multipath(&fld, &res); - - /* - * Check for out_dev == in_dev. We use the RTCF_DOREDIRECT - * flag as a hint to set the intra-ethernet bit when - * forwarding. If we've got NAT in operation, we don't do - * this optimisation. - */ - if (out_dev == in_dev && !(flags & RTCF_NAT)) - flags |= RTCF_DOREDIRECT; - - local_src = DN_FIB_RES_PREFSRC(res); - break; - case RTN_BLACKHOLE: - case RTN_UNREACHABLE: - break; - case RTN_LOCAL: - flags |= RTCF_LOCAL; - fld.saddr = cb->dst; - fld.daddr = cb->src; - - /* Routing tables gave us a gateway */ - if (gateway) - goto make_route; - - /* Packet was intra-ethernet, so we know its on-link */ - if (cb->rt_flags & DN_RT_F_IE) { - gateway = cb->src; - goto make_route; - } - - /* Use the default router if there is one */ - neigh = neigh_clone(dn_db->router); - if (neigh) { - gateway = container_of(neigh, struct dn_neigh, n)->addr; - goto make_route; - } - - /* Close eyes and pray */ - gateway = cb->src; - goto make_route; - default: - goto e_inval; - } - -make_route: - rt = dst_alloc(&dn_dst_ops, out_dev, 1, DST_OBSOLETE_NONE, 0); - if (rt == NULL) - goto e_nobufs; - - rt->dn_next = NULL; - memset(&rt->fld, 0, sizeof(rt->fld)); - rt->rt_saddr = fld.saddr; - rt->rt_daddr = fld.daddr; - rt->rt_gateway = fld.daddr; - if (gateway) - rt->rt_gateway = gateway; - rt->rt_local_src = local_src ? local_src : rt->rt_saddr; - - rt->rt_dst_map = fld.daddr; - rt->rt_src_map = fld.saddr; - - rt->fld.saddr = cb->src; - rt->fld.daddr = cb->dst; - rt->fld.flowidn_oif = 0; - rt->fld.flowidn_iif = in_dev->ifindex; - rt->fld.flowidn_mark = fld.flowidn_mark; - - rt->n = neigh; - rt->dst.lastuse = jiffies; - rt->dst.output = dn_rt_bug_out; - switch (res.type) { - case RTN_UNICAST: - rt->dst.input = dn_forward; - break; - case RTN_LOCAL: - rt->dst.output = dn_output; - rt->dst.input = dn_nsp_rx; - rt->dst.dev = in_dev; - flags |= RTCF_LOCAL; - break; - default: - case RTN_UNREACHABLE: - case RTN_BLACKHOLE: - rt->dst.input = dst_discard; - } - rt->rt_flags = flags; - - err = dn_rt_set_next_hop(rt, &res); - if (err) - goto e_neighbour; - - hash = dn_hash(rt->fld.saddr, rt->fld.daddr); - /* dn_insert_route() increments dst->__refcnt */ - dn_insert_route(rt, hash, &rt); - skb_dst_set(skb, &rt->dst); - -done: - if (neigh) - neigh_release(neigh); - if (free_res) - dn_fib_res_put(&res); - dev_put(in_dev); - dev_put(out_dev); -out: - return err; - -e_inval: - err = -EINVAL; - goto done; - -e_nobufs: - err = -ENOBUFS; - goto done; - -e_neighbour: - dst_release_immediate(&rt->dst); - goto done; -} - -static int dn_route_input(struct sk_buff *skb) -{ - struct dn_route *rt; - struct dn_skb_cb *cb = DN_SKB_CB(skb); - unsigned int hash = dn_hash(cb->src, cb->dst); - - if (skb_dst(skb)) - return 0; - - rcu_read_lock(); - for (rt = rcu_dereference(dn_rt_hash_table[hash].chain); rt != NULL; - rt = rcu_dereference(rt->dn_next)) { - if ((rt->fld.saddr == cb->src) && - (rt->fld.daddr == cb->dst) && - (rt->fld.flowidn_oif == 0) && - (rt->fld.flowidn_mark == skb->mark) && - (rt->fld.flowidn_iif == cb->iif)) { - dst_hold_and_use(&rt->dst, jiffies); - rcu_read_unlock(); - skb_dst_set(skb, (struct dst_entry *)rt); - return 0; - } - } - rcu_read_unlock(); - - return dn_route_input_slow(skb); -} - -static int dn_rt_fill_info(struct sk_buff *skb, u32 portid, u32 seq, - int event, int nowait, unsigned int flags) -{ - struct dn_route *rt = (struct dn_route *)skb_dst(skb); - struct rtmsg *r; - struct nlmsghdr *nlh; - long expires; - - nlh = nlmsg_put(skb, portid, seq, event, sizeof(*r), flags); - if (!nlh) - return -EMSGSIZE; - - r = nlmsg_data(nlh); - r->rtm_family = AF_DECnet; - r->rtm_dst_len = 16; - r->rtm_src_len = 0; - r->rtm_tos = 0; - r->rtm_table = RT_TABLE_MAIN; - r->rtm_type = rt->rt_type; - r->rtm_flags = (rt->rt_flags & ~0xFFFF) | RTM_F_CLONED; - r->rtm_scope = RT_SCOPE_UNIVERSE; - r->rtm_protocol = RTPROT_UNSPEC; - - if (rt->rt_flags & RTCF_NOTIFY) - r->rtm_flags |= RTM_F_NOTIFY; - - if (nla_put_u32(skb, RTA_TABLE, RT_TABLE_MAIN) < 0 || - nla_put_le16(skb, RTA_DST, rt->rt_daddr) < 0) - goto errout; - - if (rt->fld.saddr) { - r->rtm_src_len = 16; - if (nla_put_le16(skb, RTA_SRC, rt->fld.saddr) < 0) - goto errout; - } - if (rt->dst.dev && - nla_put_u32(skb, RTA_OIF, rt->dst.dev->ifindex) < 0) - goto errout; - - /* - * Note to self - change this if input routes reverse direction when - * they deal only with inputs and not with replies like they do - * currently. - */ - if (nla_put_le16(skb, RTA_PREFSRC, rt->rt_local_src) < 0) - goto errout; - - if (rt->rt_daddr != rt->rt_gateway && - nla_put_le16(skb, RTA_GATEWAY, rt->rt_gateway) < 0) - goto errout; - - if (rtnetlink_put_metrics(skb, dst_metrics_ptr(&rt->dst)) < 0) - goto errout; - - expires = rt->dst.expires ? rt->dst.expires - jiffies : 0; - if (rtnl_put_cacheinfo(skb, &rt->dst, 0, expires, - rt->dst.error) < 0) - goto errout; - - if (dn_is_input_route(rt) && - nla_put_u32(skb, RTA_IIF, rt->fld.flowidn_iif) < 0) - goto errout; - - nlmsg_end(skb, nlh); - return 0; - -errout: - nlmsg_cancel(skb, nlh); - return -EMSGSIZE; -} - -const struct nla_policy rtm_dn_policy[RTA_MAX + 1] = { - [RTA_DST] = { .type = NLA_U16 }, - [RTA_SRC] = { .type = NLA_U16 }, - [RTA_IIF] = { .type = NLA_U32 }, - [RTA_OIF] = { .type = NLA_U32 }, - [RTA_GATEWAY] = { .type = NLA_U16 }, - [RTA_PRIORITY] = { .type = NLA_U32 }, - [RTA_PREFSRC] = { .type = NLA_U16 }, - [RTA_METRICS] = { .type = NLA_NESTED }, - [RTA_MULTIPATH] = { .type = NLA_NESTED }, - [RTA_TABLE] = { .type = NLA_U32 }, - [RTA_MARK] = { .type = NLA_U32 }, -}; - -/* - * This is called by both endnodes and routers now. - */ -static int dn_cache_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh, - struct netlink_ext_ack *extack) -{ - struct net *net = sock_net(in_skb->sk); - struct rtmsg *rtm = nlmsg_data(nlh); - struct dn_route *rt = NULL; - struct dn_skb_cb *cb; - int err; - struct sk_buff *skb; - struct flowidn fld; - struct nlattr *tb[RTA_MAX+1]; - - if (!net_eq(net, &init_net)) - return -EINVAL; - - err = nlmsg_parse_deprecated(nlh, sizeof(*rtm), tb, RTA_MAX, - rtm_dn_policy, extack); - if (err < 0) - return err; - - memset(&fld, 0, sizeof(fld)); - fld.flowidn_proto = DNPROTO_NSP; - - skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); - if (skb == NULL) - return -ENOBUFS; - skb_reset_mac_header(skb); - cb = DN_SKB_CB(skb); - - if (tb[RTA_SRC]) - fld.saddr = nla_get_le16(tb[RTA_SRC]); - - if (tb[RTA_DST]) - fld.daddr = nla_get_le16(tb[RTA_DST]); - - if (tb[RTA_IIF]) - fld.flowidn_iif = nla_get_u32(tb[RTA_IIF]); - - if (fld.flowidn_iif) { - struct net_device *dev; - dev = __dev_get_by_index(&init_net, fld.flowidn_iif); - if (!dev || !dev->dn_ptr) { - kfree_skb(skb); - return -ENODEV; - } - skb->protocol = htons(ETH_P_DNA_RT); - skb->dev = dev; - cb->src = fld.saddr; - cb->dst = fld.daddr; - local_bh_disable(); - err = dn_route_input(skb); - local_bh_enable(); - memset(cb, 0, sizeof(struct dn_skb_cb)); - rt = (struct dn_route *)skb_dst(skb); - if (!err && -rt->dst.error) - err = rt->dst.error; - } else { - if (tb[RTA_OIF]) - fld.flowidn_oif = nla_get_u32(tb[RTA_OIF]); - - err = dn_route_output_key((struct dst_entry **)&rt, &fld, 0); - } - - skb->dev = NULL; - if (err) - goto out_free; - skb_dst_set(skb, &rt->dst); - if (rtm->rtm_flags & RTM_F_NOTIFY) - rt->rt_flags |= RTCF_NOTIFY; - - err = dn_rt_fill_info(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq, RTM_NEWROUTE, 0, 0); - if (err < 0) { - err = -EMSGSIZE; - goto out_free; - } - - return rtnl_unicast(skb, &init_net, NETLINK_CB(in_skb).portid); - -out_free: - kfree_skb(skb); - return err; -} - -/* - * For routers, this is called from dn_fib_dump, but for endnodes its - * called directly from the rtnetlink dispatch table. - */ -int dn_cache_dump(struct sk_buff *skb, struct netlink_callback *cb) -{ - struct net *net = sock_net(skb->sk); - struct dn_route *rt; - int h, s_h; - int idx, s_idx; - struct rtmsg *rtm; - - if (!net_eq(net, &init_net)) - return 0; - - if (nlmsg_len(cb->nlh) < sizeof(struct rtmsg)) - return -EINVAL; - - rtm = nlmsg_data(cb->nlh); - if (!(rtm->rtm_flags & RTM_F_CLONED)) - return 0; - - s_h = cb->args[0]; - s_idx = idx = cb->args[1]; - for (h = 0; h <= dn_rt_hash_mask; h++) { - if (h < s_h) - continue; - if (h > s_h) - s_idx = 0; - rcu_read_lock_bh(); - for (rt = rcu_dereference_bh(dn_rt_hash_table[h].chain), idx = 0; - rt; - rt = rcu_dereference_bh(rt->dn_next), idx++) { - if (idx < s_idx) - continue; - skb_dst_set(skb, dst_clone(&rt->dst)); - if (dn_rt_fill_info(skb, NETLINK_CB(cb->skb).portid, - cb->nlh->nlmsg_seq, RTM_NEWROUTE, - 1, NLM_F_MULTI) < 0) { - skb_dst_drop(skb); - rcu_read_unlock_bh(); - goto done; - } - skb_dst_drop(skb); - } - rcu_read_unlock_bh(); - } - -done: - cb->args[0] = h; - cb->args[1] = idx; - return skb->len; -} - -#ifdef CONFIG_PROC_FS -struct dn_rt_cache_iter_state { - int bucket; -}; - -static struct dn_route *dn_rt_cache_get_first(struct seq_file *seq) -{ - struct dn_route *rt = NULL; - struct dn_rt_cache_iter_state *s = seq->private; - - for (s->bucket = dn_rt_hash_mask; s->bucket >= 0; --s->bucket) { - rcu_read_lock_bh(); - rt = rcu_dereference_bh(dn_rt_hash_table[s->bucket].chain); - if (rt) - break; - rcu_read_unlock_bh(); - } - return rt; -} - -static struct dn_route *dn_rt_cache_get_next(struct seq_file *seq, struct dn_route *rt) -{ - struct dn_rt_cache_iter_state *s = seq->private; - - rt = rcu_dereference_bh(rt->dn_next); - while (!rt) { - rcu_read_unlock_bh(); - if (--s->bucket < 0) - break; - rcu_read_lock_bh(); - rt = rcu_dereference_bh(dn_rt_hash_table[s->bucket].chain); - } - return rt; -} - -static void *dn_rt_cache_seq_start(struct seq_file *seq, loff_t *pos) -{ - struct dn_route *rt = dn_rt_cache_get_first(seq); - - if (rt) { - while (*pos && (rt = dn_rt_cache_get_next(seq, rt))) - --*pos; - } - return *pos ? NULL : rt; -} - -static void *dn_rt_cache_seq_next(struct seq_file *seq, void *v, loff_t *pos) -{ - struct dn_route *rt = dn_rt_cache_get_next(seq, v); - ++*pos; - return rt; -} - -static void dn_rt_cache_seq_stop(struct seq_file *seq, void *v) -{ - if (v) - rcu_read_unlock_bh(); -} - -static int dn_rt_cache_seq_show(struct seq_file *seq, void *v) -{ - struct dn_route *rt = v; - char buf1[DN_ASCBUF_LEN], buf2[DN_ASCBUF_LEN]; - - seq_printf(seq, "%-8s %-7s %-7s %04d %04d %04d\n", - rt->dst.dev ? rt->dst.dev->name : "*", - dn_addr2asc(le16_to_cpu(rt->rt_daddr), buf1), - dn_addr2asc(le16_to_cpu(rt->rt_saddr), buf2), - atomic_read(&rt->dst.__refcnt), - rt->dst.__use, 0); - return 0; -} - -static const struct seq_operations dn_rt_cache_seq_ops = { - .start = dn_rt_cache_seq_start, - .next = dn_rt_cache_seq_next, - .stop = dn_rt_cache_seq_stop, - .show = dn_rt_cache_seq_show, -}; -#endif /* CONFIG_PROC_FS */ - -void __init dn_route_init(void) -{ - int i, goal, order; - - dn_dst_ops.kmem_cachep = - kmem_cache_create("dn_dst_cache", sizeof(struct dn_route), 0, - SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL); - dst_entries_init(&dn_dst_ops); - timer_setup(&dn_route_timer, dn_dst_check_expire, 0); - dn_route_timer.expires = jiffies + decnet_dst_gc_interval * HZ; - add_timer(&dn_route_timer); - - goal = totalram_pages() >> (26 - PAGE_SHIFT); - - for (order = 0; (1UL << order) < goal; order++) - /* NOTHING */; - - /* - * Only want 1024 entries max, since the table is very, very unlikely - * to be larger than that. - */ - while (order && ((((1UL << order) * PAGE_SIZE) / - sizeof(struct dn_rt_hash_bucket)) >= 2048)) - order--; - - do { - dn_rt_hash_mask = (1UL << order) * PAGE_SIZE / - sizeof(struct dn_rt_hash_bucket); - while (dn_rt_hash_mask & (dn_rt_hash_mask - 1)) - dn_rt_hash_mask--; - dn_rt_hash_table = (struct dn_rt_hash_bucket *) - __get_free_pages(GFP_ATOMIC, order); - } while (dn_rt_hash_table == NULL && --order > 0); - - if (!dn_rt_hash_table) - panic("Failed to allocate DECnet route cache hash table\n"); - - printk(KERN_INFO - "DECnet: Routing cache hash table of %u buckets, %ldKbytes\n", - dn_rt_hash_mask, - (long)(dn_rt_hash_mask*sizeof(struct dn_rt_hash_bucket))/1024); - - dn_rt_hash_mask--; - for (i = 0; i <= dn_rt_hash_mask; i++) { - spin_lock_init(&dn_rt_hash_table[i].lock); - dn_rt_hash_table[i].chain = NULL; - } - - dn_dst_ops.gc_thresh = (dn_rt_hash_mask + 1); - - proc_create_seq_private("decnet_cache", 0444, init_net.proc_net, - &dn_rt_cache_seq_ops, - sizeof(struct dn_rt_cache_iter_state), NULL); - -#ifdef CONFIG_DECNET_ROUTER - rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETROUTE, - dn_cache_getroute, dn_fib_dump, 0); -#else - rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETROUTE, - dn_cache_getroute, dn_cache_dump, 0); -#endif -} - -void __exit dn_route_cleanup(void) -{ - del_timer(&dn_route_timer); - dn_run_flush(NULL); - - remove_proc_entry("decnet_cache", init_net.proc_net); - dst_entries_destroy(&dn_dst_ops); -} diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c deleted file mode 100644 index ee73057529cf..000000000000 --- a/net/decnet/dn_rules.c +++ /dev/null @@ -1,253 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 - -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Routing Forwarding Information Base (Rules) - * - * Author: Steve Whitehouse <SteveW@ACM.org> - * Mostly copied from Alexey Kuznetsov's ipv4/fib_rules.c - * - * - * Changes: - * Steve Whitehouse <steve@chygwyn.com> - * Updated for Thomas Graf's generic rules - * - */ -#include <linux/net.h> -#include <linux/init.h> -#include <linux/netlink.h> -#include <linux/rtnetlink.h> -#include <linux/netdevice.h> -#include <linux/spinlock.h> -#include <linux/list.h> -#include <linux/rcupdate.h> -#include <linux/export.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/flow.h> -#include <net/fib_rules.h> -#include <net/dn.h> -#include <net/dn_fib.h> -#include <net/dn_neigh.h> -#include <net/dn_dev.h> -#include <net/dn_route.h> - -static struct fib_rules_ops *dn_fib_rules_ops; - -struct dn_fib_rule -{ - struct fib_rule common; - unsigned char dst_len; - unsigned char src_len; - __le16 src; - __le16 srcmask; - __le16 dst; - __le16 dstmask; - __le16 srcmap; - u8 flags; -}; - - -int dn_fib_lookup(struct flowidn *flp, struct dn_fib_res *res) -{ - struct fib_lookup_arg arg = { - .result = res, - }; - int err; - - err = fib_rules_lookup(dn_fib_rules_ops, - flowidn_to_flowi(flp), 0, &arg); - res->r = arg.rule; - - return err; -} - -static int dn_fib_rule_action(struct fib_rule *rule, struct flowi *flp, - int flags, struct fib_lookup_arg *arg) -{ - struct flowidn *fld = &flp->u.dn; - int err = -EAGAIN; - struct dn_fib_table *tbl; - - switch(rule->action) { - case FR_ACT_TO_TBL: - break; - - case FR_ACT_UNREACHABLE: - err = -ENETUNREACH; - goto errout; - - case FR_ACT_PROHIBIT: - err = -EACCES; - goto errout; - - case FR_ACT_BLACKHOLE: - default: - err = -EINVAL; - goto errout; - } - - tbl = dn_fib_get_table(rule->table, 0); - if (tbl == NULL) - goto errout; - - err = tbl->lookup(tbl, fld, (struct dn_fib_res *)arg->result); - if (err > 0) - err = -EAGAIN; -errout: - return err; -} - -static int dn_fib_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) -{ - struct dn_fib_rule *r = (struct dn_fib_rule *)rule; - struct flowidn *fld = &fl->u.dn; - __le16 daddr = fld->daddr; - __le16 saddr = fld->saddr; - - if (((saddr ^ r->src) & r->srcmask) || - ((daddr ^ r->dst) & r->dstmask)) - return 0; - - return 1; -} - -static int dn_fib_rule_configure(struct fib_rule *rule, struct sk_buff *skb, - struct fib_rule_hdr *frh, - struct nlattr **tb, - struct netlink_ext_ack *extack) -{ - int err = -EINVAL; - struct dn_fib_rule *r = (struct dn_fib_rule *)rule; - - if (frh->tos) { - NL_SET_ERR_MSG(extack, "Invalid tos value"); - goto errout; - } - - if (rule->table == RT_TABLE_UNSPEC) { - if (rule->action == FR_ACT_TO_TBL) { - struct dn_fib_table *table; - - table = dn_fib_empty_table(); - if (table == NULL) { - err = -ENOBUFS; - goto errout; - } - - rule->table = table->n; - } - } - - if (frh->src_len) - r->src = nla_get_le16(tb[FRA_SRC]); - - if (frh->dst_len) - r->dst = nla_get_le16(tb[FRA_DST]); - - r->src_len = frh->src_len; - r->srcmask = dnet_make_mask(r->src_len); - r->dst_len = frh->dst_len; - r->dstmask = dnet_make_mask(r->dst_len); - err = 0; -errout: - return err; -} - -static int dn_fib_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, - struct nlattr **tb) -{ - struct dn_fib_rule *r = (struct dn_fib_rule *)rule; - - if (frh->src_len && (r->src_len != frh->src_len)) - return 0; - - if (frh->dst_len && (r->dst_len != frh->dst_len)) - return 0; - - if (frh->src_len && (r->src != nla_get_le16(tb[FRA_SRC]))) - return 0; - - if (frh->dst_len && (r->dst != nla_get_le16(tb[FRA_DST]))) - return 0; - - return 1; -} - -unsigned int dnet_addr_type(__le16 addr) -{ - struct flowidn fld = { .daddr = addr }; - struct dn_fib_res res; - unsigned int ret = RTN_UNICAST; - struct dn_fib_table *tb = dn_fib_get_table(RT_TABLE_LOCAL, 0); - - res.r = NULL; - - if (tb) { - if (!tb->lookup(tb, &fld, &res)) { - ret = res.type; - dn_fib_res_put(&res); - } - } - return ret; -} - -static int dn_fib_rule_fill(struct fib_rule *rule, struct sk_buff *skb, - struct fib_rule_hdr *frh) -{ - struct dn_fib_rule *r = (struct dn_fib_rule *)rule; - - frh->dst_len = r->dst_len; - frh->src_len = r->src_len; - frh->tos = 0; - - if ((r->dst_len && - nla_put_le16(skb, FRA_DST, r->dst)) || - (r->src_len && - nla_put_le16(skb, FRA_SRC, r->src))) - goto nla_put_failure; - return 0; - -nla_put_failure: - return -ENOBUFS; -} - -static void dn_fib_rule_flush_cache(struct fib_rules_ops *ops) -{ - dn_rt_cache_flush(-1); -} - -static const struct fib_rules_ops __net_initconst dn_fib_rules_ops_template = { - .family = AF_DECnet, - .rule_size = sizeof(struct dn_fib_rule), - .addr_size = sizeof(u16), - .action = dn_fib_rule_action, - .match = dn_fib_rule_match, - .configure = dn_fib_rule_configure, - .compare = dn_fib_rule_compare, - .fill = dn_fib_rule_fill, - .flush_cache = dn_fib_rule_flush_cache, - .nlgroup = RTNLGRP_DECnet_RULE, - .owner = THIS_MODULE, - .fro_net = &init_net, -}; - -void __init dn_fib_rules_init(void) -{ - dn_fib_rules_ops = - fib_rules_register(&dn_fib_rules_ops_template, &init_net); - BUG_ON(IS_ERR(dn_fib_rules_ops)); - BUG_ON(fib_default_rule_add(dn_fib_rules_ops, 0x7fff, - RT_TABLE_MAIN, 0)); -} - -void __exit dn_fib_rules_cleanup(void) -{ - rtnl_lock(); - fib_rules_unregister(dn_fib_rules_ops); - rtnl_unlock(); - rcu_barrier(); -} diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c deleted file mode 100644 index 4086f9c746af..000000000000 --- a/net/decnet/dn_table.c +++ /dev/null @@ -1,929 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Routing Forwarding Information Base (Routing Tables) - * - * Author: Steve Whitehouse <SteveW@ACM.org> - * Mostly copied from the IPv4 routing code - * - * - * Changes: - * - */ -#include <linux/string.h> -#include <linux/net.h> -#include <linux/socket.h> -#include <linux/slab.h> -#include <linux/sockios.h> -#include <linux/init.h> -#include <linux/skbuff.h> -#include <linux/rtnetlink.h> -#include <linux/proc_fs.h> -#include <linux/netdevice.h> -#include <linux/timer.h> -#include <linux/spinlock.h> -#include <linux/atomic.h> -#include <linux/uaccess.h> -#include <linux/route.h> /* RTF_xxx */ -#include <net/neighbour.h> -#include <net/netlink.h> -#include <net/tcp.h> -#include <net/dst.h> -#include <net/flow.h> -#include <net/fib_rules.h> -#include <net/dn.h> -#include <net/dn_route.h> -#include <net/dn_fib.h> -#include <net/dn_neigh.h> -#include <net/dn_dev.h> - -struct dn_zone -{ - struct dn_zone *dz_next; - struct dn_fib_node **dz_hash; - int dz_nent; - int dz_divisor; - u32 dz_hashmask; -#define DZ_HASHMASK(dz) ((dz)->dz_hashmask) - int dz_order; - __le16 dz_mask; -#define DZ_MASK(dz) ((dz)->dz_mask) -}; - -struct dn_hash -{ - struct dn_zone *dh_zones[17]; - struct dn_zone *dh_zone_list; -}; - -#define dz_key_0(key) ((key).datum = 0) - -#define for_nexthops(fi) { int nhsel; const struct dn_fib_nh *nh;\ - for(nhsel = 0, nh = (fi)->fib_nh; nhsel < (fi)->fib_nhs; nh++, nhsel++) - -#define endfor_nexthops(fi) } - -#define DN_MAX_DIVISOR 1024 -#define DN_S_ZOMBIE 1 -#define DN_S_ACCESSED 2 - -#define DN_FIB_SCAN(f, fp) \ -for( ; ((f) = *(fp)) != NULL; (fp) = &(f)->fn_next) - -#define DN_FIB_SCAN_KEY(f, fp, key) \ -for( ; ((f) = *(fp)) != NULL && dn_key_eq((f)->fn_key, (key)); (fp) = &(f)->fn_next) - -#define RT_TABLE_MIN 1 -#define DN_FIB_TABLE_HASHSZ 256 -static struct hlist_head dn_fib_table_hash[DN_FIB_TABLE_HASHSZ]; -static DEFINE_RWLOCK(dn_fib_tables_lock); - -static struct kmem_cache *dn_hash_kmem __read_mostly; -static int dn_fib_hash_zombies; - -static inline dn_fib_idx_t dn_hash(dn_fib_key_t key, struct dn_zone *dz) -{ - u16 h = le16_to_cpu(key.datum)>>(16 - dz->dz_order); - h ^= (h >> 10); - h ^= (h >> 6); - h &= DZ_HASHMASK(dz); - return *(dn_fib_idx_t *)&h; -} - -static inline dn_fib_key_t dz_key(__le16 dst, struct dn_zone *dz) -{ - dn_fib_key_t k; - k.datum = dst & DZ_MASK(dz); - return k; -} - -static inline struct dn_fib_node **dn_chain_p(dn_fib_key_t key, struct dn_zone *dz) -{ - return &dz->dz_hash[dn_hash(key, dz).datum]; -} - -static inline struct dn_fib_node *dz_chain(dn_fib_key_t key, struct dn_zone *dz) -{ - return dz->dz_hash[dn_hash(key, dz).datum]; -} - -static inline int dn_key_eq(dn_fib_key_t a, dn_fib_key_t b) -{ - return a.datum == b.datum; -} - -static inline int dn_key_leq(dn_fib_key_t a, dn_fib_key_t b) -{ - return a.datum <= b.datum; -} - -static inline void dn_rebuild_zone(struct dn_zone *dz, - struct dn_fib_node **old_ht, - int old_divisor) -{ - struct dn_fib_node *f, **fp, *next; - int i; - - for(i = 0; i < old_divisor; i++) { - for(f = old_ht[i]; f; f = next) { - next = f->fn_next; - for(fp = dn_chain_p(f->fn_key, dz); - *fp && dn_key_leq((*fp)->fn_key, f->fn_key); - fp = &(*fp)->fn_next) - /* NOTHING */; - f->fn_next = *fp; - *fp = f; - } - } -} - -static void dn_rehash_zone(struct dn_zone *dz) -{ - struct dn_fib_node **ht, **old_ht; - int old_divisor, new_divisor; - u32 new_hashmask; - - old_divisor = dz->dz_divisor; - - switch (old_divisor) { - case 16: - new_divisor = 256; - new_hashmask = 0xFF; - break; - default: - printk(KERN_DEBUG "DECnet: dn_rehash_zone: BUG! %d\n", - old_divisor); - fallthrough; - case 256: - new_divisor = 1024; - new_hashmask = 0x3FF; - break; - } - - ht = kcalloc(new_divisor, sizeof(struct dn_fib_node*), GFP_KERNEL); - if (ht == NULL) - return; - - write_lock_bh(&dn_fib_tables_lock); - old_ht = dz->dz_hash; - dz->dz_hash = ht; - dz->dz_hashmask = new_hashmask; - dz->dz_divisor = new_divisor; - dn_rebuild_zone(dz, old_ht, old_divisor); - write_unlock_bh(&dn_fib_tables_lock); - kfree(old_ht); -} - -static void dn_free_node(struct dn_fib_node *f) -{ - dn_fib_release_info(DN_FIB_INFO(f)); - kmem_cache_free(dn_hash_kmem, f); -} - - -static struct dn_zone *dn_new_zone(struct dn_hash *table, int z) -{ - int i; - struct dn_zone *dz = kzalloc(sizeof(struct dn_zone), GFP_KERNEL); - if (!dz) - return NULL; - - if (z) { - dz->dz_divisor = 16; - dz->dz_hashmask = 0x0F; - } else { - dz->dz_divisor = 1; - dz->dz_hashmask = 0; - } - - dz->dz_hash = kcalloc(dz->dz_divisor, sizeof(struct dn_fib_node *), GFP_KERNEL); - if (!dz->dz_hash) { - kfree(dz); - return NULL; - } - - dz->dz_order = z; - dz->dz_mask = dnet_make_mask(z); - - for(i = z + 1; i <= 16; i++) - if (table->dh_zones[i]) - break; - - write_lock_bh(&dn_fib_tables_lock); - if (i>16) { - dz->dz_next = table->dh_zone_list; - table->dh_zone_list = dz; - } else { - dz->dz_next = table->dh_zones[i]->dz_next; - table->dh_zones[i]->dz_next = dz; - } - table->dh_zones[z] = dz; - write_unlock_bh(&dn_fib_tables_lock); - return dz; -} - - -static int dn_fib_nh_match(struct rtmsg *r, struct nlmsghdr *nlh, struct nlattr *attrs[], struct dn_fib_info *fi) -{ - struct rtnexthop *nhp; - int nhlen; - - if (attrs[RTA_PRIORITY] && - nla_get_u32(attrs[RTA_PRIORITY]) != fi->fib_priority) - return 1; - - if (attrs[RTA_OIF] || attrs[RTA_GATEWAY]) { - if ((!attrs[RTA_OIF] || nla_get_u32(attrs[RTA_OIF]) == fi->fib_nh->nh_oif) && - (!attrs[RTA_GATEWAY] || nla_get_le16(attrs[RTA_GATEWAY]) != fi->fib_nh->nh_gw)) - return 0; - return 1; - } - - if (!attrs[RTA_MULTIPATH]) - return 0; - - nhp = nla_data(attrs[RTA_MULTIPATH]); - nhlen = nla_len(attrs[RTA_MULTIPATH]); - - for_nexthops(fi) { - int attrlen = nhlen - sizeof(struct rtnexthop); - __le16 gw; - - if (attrlen < 0 || (nhlen -= nhp->rtnh_len) < 0) - return -EINVAL; - if (nhp->rtnh_ifindex && nhp->rtnh_ifindex != nh->nh_oif) - return 1; - if (attrlen) { - struct nlattr *gw_attr; - - gw_attr = nla_find((struct nlattr *) (nhp + 1), attrlen, RTA_GATEWAY); - gw = gw_attr ? nla_get_le16(gw_attr) : 0; - - if (gw && gw != nh->nh_gw) - return 1; - } - nhp = RTNH_NEXT(nhp); - } endfor_nexthops(fi); - - return 0; -} - -static inline size_t dn_fib_nlmsg_size(struct dn_fib_info *fi) -{ - size_t payload = NLMSG_ALIGN(sizeof(struct rtmsg)) - + nla_total_size(4) /* RTA_TABLE */ - + nla_total_size(2) /* RTA_DST */ - + nla_total_size(4) /* RTA_PRIORITY */ - + nla_total_size(TCP_CA_NAME_MAX); /* RTAX_CC_ALGO */ - - /* space for nested metrics */ - payload += nla_total_size((RTAX_MAX * nla_total_size(4))); - - if (fi->fib_nhs) { - /* Also handles the special case fib_nhs == 1 */ - - /* each nexthop is packed in an attribute */ - size_t nhsize = nla_total_size(sizeof(struct rtnexthop)); - - /* may contain a gateway attribute */ - nhsize += nla_total_size(4); - - /* all nexthops are packed in a nested attribute */ - payload += nla_total_size(fi->fib_nhs * nhsize); - } - - return payload; -} - -static int dn_fib_dump_info(struct sk_buff *skb, u32 portid, u32 seq, int event, - u32 tb_id, u8 type, u8 scope, void *dst, int dst_len, - struct dn_fib_info *fi, unsigned int flags) -{ - struct rtmsg *rtm; - struct nlmsghdr *nlh; - - nlh = nlmsg_put(skb, portid, seq, event, sizeof(*rtm), flags); - if (!nlh) - return -EMSGSIZE; - - rtm = nlmsg_data(nlh); - rtm->rtm_family = AF_DECnet; - rtm->rtm_dst_len = dst_len; - rtm->rtm_src_len = 0; - rtm->rtm_tos = 0; - rtm->rtm_table = tb_id; - rtm->rtm_flags = fi->fib_flags; - rtm->rtm_scope = scope; - rtm->rtm_type = type; - rtm->rtm_protocol = fi->fib_protocol; - - if (nla_put_u32(skb, RTA_TABLE, tb_id) < 0) - goto errout; - - if (rtm->rtm_dst_len && - nla_put(skb, RTA_DST, 2, dst) < 0) - goto errout; - - if (fi->fib_priority && - nla_put_u32(skb, RTA_PRIORITY, fi->fib_priority) < 0) - goto errout; - - if (rtnetlink_put_metrics(skb, fi->fib_metrics) < 0) - goto errout; - - if (fi->fib_nhs == 1) { - if (fi->fib_nh->nh_gw && - nla_put_le16(skb, RTA_GATEWAY, fi->fib_nh->nh_gw) < 0) - goto errout; - - if (fi->fib_nh->nh_oif && - nla_put_u32(skb, RTA_OIF, fi->fib_nh->nh_oif) < 0) - goto errout; - } - - if (fi->fib_nhs > 1) { - struct rtnexthop *nhp; - struct nlattr *mp_head; - - mp_head = nla_nest_start_noflag(skb, RTA_MULTIPATH); - if (!mp_head) - goto errout; - - for_nexthops(fi) { - if (!(nhp = nla_reserve_nohdr(skb, sizeof(*nhp)))) - goto errout; - - nhp->rtnh_flags = nh->nh_flags & 0xFF; - nhp->rtnh_hops = nh->nh_weight - 1; - nhp->rtnh_ifindex = nh->nh_oif; - - if (nh->nh_gw && - nla_put_le16(skb, RTA_GATEWAY, nh->nh_gw) < 0) - goto errout; - - nhp->rtnh_len = skb_tail_pointer(skb) - (unsigned char *)nhp; - } endfor_nexthops(fi); - - nla_nest_end(skb, mp_head); - } - - nlmsg_end(skb, nlh); - return 0; - -errout: - nlmsg_cancel(skb, nlh); - return -EMSGSIZE; -} - - -static void dn_rtmsg_fib(int event, struct dn_fib_node *f, int z, u32 tb_id, - struct nlmsghdr *nlh, struct netlink_skb_parms *req) -{ - struct sk_buff *skb; - u32 portid = req ? req->portid : 0; - int err = -ENOBUFS; - - skb = nlmsg_new(dn_fib_nlmsg_size(DN_FIB_INFO(f)), GFP_KERNEL); - if (skb == NULL) - goto errout; - - err = dn_fib_dump_info(skb, portid, nlh->nlmsg_seq, event, tb_id, - f->fn_type, f->fn_scope, &f->fn_key, z, - DN_FIB_INFO(f), 0); - if (err < 0) { - /* -EMSGSIZE implies BUG in dn_fib_nlmsg_size() */ - WARN_ON(err == -EMSGSIZE); - kfree_skb(skb); - goto errout; - } - rtnl_notify(skb, &init_net, portid, RTNLGRP_DECnet_ROUTE, nlh, GFP_KERNEL); - return; -errout: - if (err < 0) - rtnl_set_sk_err(&init_net, RTNLGRP_DECnet_ROUTE, err); -} - -static __inline__ int dn_hash_dump_bucket(struct sk_buff *skb, - struct netlink_callback *cb, - struct dn_fib_table *tb, - struct dn_zone *dz, - struct dn_fib_node *f) -{ - int i, s_i; - - s_i = cb->args[4]; - for(i = 0; f; i++, f = f->fn_next) { - if (i < s_i) - continue; - if (f->fn_state & DN_S_ZOMBIE) - continue; - if (dn_fib_dump_info(skb, NETLINK_CB(cb->skb).portid, - cb->nlh->nlmsg_seq, - RTM_NEWROUTE, - tb->n, - (f->fn_state & DN_S_ZOMBIE) ? 0 : f->fn_type, - f->fn_scope, &f->fn_key, dz->dz_order, - f->fn_info, NLM_F_MULTI) < 0) { - cb->args[4] = i; - return -1; - } - } - cb->args[4] = i; - return skb->len; -} - -static __inline__ int dn_hash_dump_zone(struct sk_buff *skb, - struct netlink_callback *cb, - struct dn_fib_table *tb, - struct dn_zone *dz) -{ - int h, s_h; - - s_h = cb->args[3]; - for(h = 0; h < dz->dz_divisor; h++) { - if (h < s_h) - continue; - if (h > s_h) - memset(&cb->args[4], 0, sizeof(cb->args) - 4*sizeof(cb->args[0])); - if (dz->dz_hash == NULL || dz->dz_hash[h] == NULL) - continue; - if (dn_hash_dump_bucket(skb, cb, tb, dz, dz->dz_hash[h]) < 0) { - cb->args[3] = h; - return -1; - } - } - cb->args[3] = h; - return skb->len; -} - -static int dn_fib_table_dump(struct dn_fib_table *tb, struct sk_buff *skb, - struct netlink_callback *cb) -{ - int m, s_m; - struct dn_zone *dz; - struct dn_hash *table = (struct dn_hash *)tb->data; - - s_m = cb->args[2]; - read_lock(&dn_fib_tables_lock); - for(dz = table->dh_zone_list, m = 0; dz; dz = dz->dz_next, m++) { - if (m < s_m) - continue; - if (m > s_m) - memset(&cb->args[3], 0, sizeof(cb->args) - 3*sizeof(cb->args[0])); - - if (dn_hash_dump_zone(skb, cb, tb, dz) < 0) { - cb->args[2] = m; - read_unlock(&dn_fib_tables_lock); - return -1; - } - } - read_unlock(&dn_fib_tables_lock); - cb->args[2] = m; - - return skb->len; -} - -int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb) -{ - struct net *net = sock_net(skb->sk); - unsigned int h, s_h; - unsigned int e = 0, s_e; - struct dn_fib_table *tb; - int dumped = 0; - - if (!net_eq(net, &init_net)) - return 0; - - if (nlmsg_len(cb->nlh) >= sizeof(struct rtmsg) && - ((struct rtmsg *)nlmsg_data(cb->nlh))->rtm_flags&RTM_F_CLONED) - return dn_cache_dump(skb, cb); - - s_h = cb->args[0]; - s_e = cb->args[1]; - - for (h = s_h; h < DN_FIB_TABLE_HASHSZ; h++, s_h = 0) { - e = 0; - hlist_for_each_entry(tb, &dn_fib_table_hash[h], hlist) { - if (e < s_e) - goto next; - if (dumped) - memset(&cb->args[2], 0, sizeof(cb->args) - - 2 * sizeof(cb->args[0])); - if (tb->dump(tb, skb, cb) < 0) - goto out; - dumped = 1; -next: - e++; - } - } -out: - cb->args[1] = e; - cb->args[0] = h; - - return skb->len; -} - -static int dn_fib_table_insert(struct dn_fib_table *tb, struct rtmsg *r, struct nlattr *attrs[], - struct nlmsghdr *n, struct netlink_skb_parms *req) -{ - struct dn_hash *table = (struct dn_hash *)tb->data; - struct dn_fib_node *new_f, *f, **fp, **del_fp; - struct dn_zone *dz; - struct dn_fib_info *fi; - int z = r->rtm_dst_len; - int type = r->rtm_type; - dn_fib_key_t key; - int err; - - if (z > 16) - return -EINVAL; - - dz = table->dh_zones[z]; - if (!dz && !(dz = dn_new_zone(table, z))) - return -ENOBUFS; - - dz_key_0(key); - if (attrs[RTA_DST]) { - __le16 dst = nla_get_le16(attrs[RTA_DST]); - if (dst & ~DZ_MASK(dz)) - return -EINVAL; - key = dz_key(dst, dz); - } - - if ((fi = dn_fib_create_info(r, attrs, n, &err)) == NULL) - return err; - - if (dz->dz_nent > (dz->dz_divisor << 2) && - dz->dz_divisor > DN_MAX_DIVISOR && - (z==16 || (1<<z) > dz->dz_divisor)) - dn_rehash_zone(dz); - - fp = dn_chain_p(key, dz); - - DN_FIB_SCAN(f, fp) { - if (dn_key_leq(key, f->fn_key)) - break; - } - - del_fp = NULL; - - if (f && (f->fn_state & DN_S_ZOMBIE) && - dn_key_eq(f->fn_key, key)) { - del_fp = fp; - fp = &f->fn_next; - f = *fp; - goto create; - } - - DN_FIB_SCAN_KEY(f, fp, key) { - if (fi->fib_priority <= DN_FIB_INFO(f)->fib_priority) - break; - } - - if (f && dn_key_eq(f->fn_key, key) && - fi->fib_priority == DN_FIB_INFO(f)->fib_priority) { - struct dn_fib_node **ins_fp; - - err = -EEXIST; - if (n->nlmsg_flags & NLM_F_EXCL) - goto out; - - if (n->nlmsg_flags & NLM_F_REPLACE) { - del_fp = fp; - fp = &f->fn_next; - f = *fp; - goto replace; - } - - ins_fp = fp; - err = -EEXIST; - - DN_FIB_SCAN_KEY(f, fp, key) { - if (fi->fib_priority != DN_FIB_INFO(f)->fib_priority) - break; - if (f->fn_type == type && - f->fn_scope == r->rtm_scope && - DN_FIB_INFO(f) == fi) - goto out; - } - - if (!(n->nlmsg_flags & NLM_F_APPEND)) { - fp = ins_fp; - f = *fp; - } - } - -create: - err = -ENOENT; - if (!(n->nlmsg_flags & NLM_F_CREATE)) - goto out; - -replace: - err = -ENOBUFS; - new_f = kmem_cache_zalloc(dn_hash_kmem, GFP_KERNEL); - if (new_f == NULL) - goto out; - - new_f->fn_key = key; - new_f->fn_type = type; - new_f->fn_scope = r->rtm_scope; - DN_FIB_INFO(new_f) = fi; - - new_f->fn_next = f; - write_lock_bh(&dn_fib_tables_lock); - *fp = new_f; - write_unlock_bh(&dn_fib_tables_lock); - dz->dz_nent++; - - if (del_fp) { - f = *del_fp; - write_lock_bh(&dn_fib_tables_lock); - *del_fp = f->fn_next; - write_unlock_bh(&dn_fib_tables_lock); - - if (!(f->fn_state & DN_S_ZOMBIE)) - dn_rtmsg_fib(RTM_DELROUTE, f, z, tb->n, n, req); - if (f->fn_state & DN_S_ACCESSED) - dn_rt_cache_flush(-1); - dn_free_node(f); - dz->dz_nent--; - } else { - dn_rt_cache_flush(-1); - } - - dn_rtmsg_fib(RTM_NEWROUTE, new_f, z, tb->n, n, req); - - return 0; -out: - dn_fib_release_info(fi); - return err; -} - - -static int dn_fib_table_delete(struct dn_fib_table *tb, struct rtmsg *r, struct nlattr *attrs[], - struct nlmsghdr *n, struct netlink_skb_parms *req) -{ - struct dn_hash *table = (struct dn_hash*)tb->data; - struct dn_fib_node **fp, **del_fp, *f; - int z = r->rtm_dst_len; - struct dn_zone *dz; - dn_fib_key_t key; - int matched; - - - if (z > 16) - return -EINVAL; - - if ((dz = table->dh_zones[z]) == NULL) - return -ESRCH; - - dz_key_0(key); - if (attrs[RTA_DST]) { - __le16 dst = nla_get_le16(attrs[RTA_DST]); - if (dst & ~DZ_MASK(dz)) - return -EINVAL; - key = dz_key(dst, dz); - } - - fp = dn_chain_p(key, dz); - - DN_FIB_SCAN(f, fp) { - if (dn_key_eq(f->fn_key, key)) - break; - if (dn_key_leq(key, f->fn_key)) - return -ESRCH; - } - - matched = 0; - del_fp = NULL; - DN_FIB_SCAN_KEY(f, fp, key) { - struct dn_fib_info *fi = DN_FIB_INFO(f); - - if (f->fn_state & DN_S_ZOMBIE) - return -ESRCH; - - matched++; - - if (del_fp == NULL && - (!r->rtm_type || f->fn_type == r->rtm_type) && - (r->rtm_scope == RT_SCOPE_NOWHERE || f->fn_scope == r->rtm_scope) && - (!r->rtm_protocol || - fi->fib_protocol == r->rtm_protocol) && - dn_fib_nh_match(r, n, attrs, fi) == 0) - del_fp = fp; - } - - if (del_fp) { - f = *del_fp; - dn_rtmsg_fib(RTM_DELROUTE, f, z, tb->n, n, req); - - if (matched != 1) { - write_lock_bh(&dn_fib_tables_lock); - *del_fp = f->fn_next; - write_unlock_bh(&dn_fib_tables_lock); - - if (f->fn_state & DN_S_ACCESSED) - dn_rt_cache_flush(-1); - dn_free_node(f); - dz->dz_nent--; - } else { - f->fn_state |= DN_S_ZOMBIE; - if (f->fn_state & DN_S_ACCESSED) { - f->fn_state &= ~DN_S_ACCESSED; - dn_rt_cache_flush(-1); - } - if (++dn_fib_hash_zombies > 128) - dn_fib_flush(); - } - - return 0; - } - - return -ESRCH; -} - -static inline int dn_flush_list(struct dn_fib_node **fp, int z, struct dn_hash *table) -{ - int found = 0; - struct dn_fib_node *f; - - while((f = *fp) != NULL) { - struct dn_fib_info *fi = DN_FIB_INFO(f); - - if (fi && ((f->fn_state & DN_S_ZOMBIE) || (fi->fib_flags & RTNH_F_DEAD))) { - write_lock_bh(&dn_fib_tables_lock); - *fp = f->fn_next; - write_unlock_bh(&dn_fib_tables_lock); - - dn_free_node(f); - found++; - continue; - } - fp = &f->fn_next; - } - - return found; -} - -static int dn_fib_table_flush(struct dn_fib_table *tb) -{ - struct dn_hash *table = (struct dn_hash *)tb->data; - struct dn_zone *dz; - int found = 0; - - dn_fib_hash_zombies = 0; - for(dz = table->dh_zone_list; dz; dz = dz->dz_next) { - int i; - int tmp = 0; - for(i = dz->dz_divisor-1; i >= 0; i--) - tmp += dn_flush_list(&dz->dz_hash[i], dz->dz_order, table); - dz->dz_nent -= tmp; - found += tmp; - } - - return found; -} - -static int dn_fib_table_lookup(struct dn_fib_table *tb, const struct flowidn *flp, struct dn_fib_res *res) -{ - int err; - struct dn_zone *dz; - struct dn_hash *t = (struct dn_hash *)tb->data; - - read_lock(&dn_fib_tables_lock); - for(dz = t->dh_zone_list; dz; dz = dz->dz_next) { - struct dn_fib_node *f; - dn_fib_key_t k = dz_key(flp->daddr, dz); - - for(f = dz_chain(k, dz); f; f = f->fn_next) { - if (!dn_key_eq(k, f->fn_key)) { - if (dn_key_leq(k, f->fn_key)) - break; - else - continue; - } - - f->fn_state |= DN_S_ACCESSED; - - if (f->fn_state&DN_S_ZOMBIE) - continue; - - if (f->fn_scope < flp->flowidn_scope) - continue; - - err = dn_fib_semantic_match(f->fn_type, DN_FIB_INFO(f), flp, res); - - if (err == 0) { - res->type = f->fn_type; - res->scope = f->fn_scope; - res->prefixlen = dz->dz_order; - goto out; - } - if (err < 0) - goto out; - } - } - err = 1; -out: - read_unlock(&dn_fib_tables_lock); - return err; -} - - -struct dn_fib_table *dn_fib_get_table(u32 n, int create) -{ - struct dn_fib_table *t; - unsigned int h; - - if (n < RT_TABLE_MIN) - return NULL; - - if (n > RT_TABLE_MAX) - return NULL; - - h = n & (DN_FIB_TABLE_HASHSZ - 1); - rcu_read_lock(); - hlist_for_each_entry_rcu(t, &dn_fib_table_hash[h], hlist) { - if (t->n == n) { - rcu_read_unlock(); - return t; - } - } - rcu_read_unlock(); - - if (!create) - return NULL; - - if (in_interrupt()) { - net_dbg_ratelimited("DECnet: BUG! Attempt to create routing table from interrupt\n"); - return NULL; - } - - t = kzalloc(sizeof(struct dn_fib_table) + sizeof(struct dn_hash), - GFP_KERNEL); - if (t == NULL) - return NULL; - - t->n = n; - t->insert = dn_fib_table_insert; - t->delete = dn_fib_table_delete; - t->lookup = dn_fib_table_lookup; - t->flush = dn_fib_table_flush; - t->dump = dn_fib_table_dump; - hlist_add_head_rcu(&t->hlist, &dn_fib_table_hash[h]); - - return t; -} - -struct dn_fib_table *dn_fib_empty_table(void) -{ - u32 id; - - for(id = RT_TABLE_MIN; id <= RT_TABLE_MAX; id++) - if (dn_fib_get_table(id, 0) == NULL) - return dn_fib_get_table(id, 1); - return NULL; -} - -void dn_fib_flush(void) -{ - int flushed = 0; - struct dn_fib_table *tb; - unsigned int h; - - for (h = 0; h < DN_FIB_TABLE_HASHSZ; h++) { - hlist_for_each_entry(tb, &dn_fib_table_hash[h], hlist) - flushed += tb->flush(tb); - } - - if (flushed) - dn_rt_cache_flush(-1); -} - -void __init dn_fib_table_init(void) -{ - dn_hash_kmem = kmem_cache_create("dn_fib_info_cache", - sizeof(struct dn_fib_info), - 0, SLAB_HWCACHE_ALIGN, - NULL); -} - -void __exit dn_fib_table_cleanup(void) -{ - struct dn_fib_table *t; - struct hlist_node *next; - unsigned int h; - - write_lock(&dn_fib_tables_lock); - for (h = 0; h < DN_FIB_TABLE_HASHSZ; h++) { - hlist_for_each_entry_safe(t, next, &dn_fib_table_hash[h], - hlist) { - hlist_del(&t->hlist); - kfree(t); - } - } - write_unlock(&dn_fib_tables_lock); -} diff --git a/net/decnet/dn_timer.c b/net/decnet/dn_timer.c deleted file mode 100644 index aa4155875ca8..000000000000 --- a/net/decnet/dn_timer.c +++ /dev/null @@ -1,104 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Socket Timer Functions - * - * Author: Steve Whitehouse <SteveW@ACM.org> - * - * - * Changes: - * Steve Whitehouse : Made keepalive timer part of the same - * timer idea. - * Steve Whitehouse : Added checks for sk->sock_readers - * David S. Miller : New socket locking - * Steve Whitehouse : Timer grabs socket ref. - */ -#include <linux/net.h> -#include <linux/socket.h> -#include <linux/skbuff.h> -#include <linux/netdevice.h> -#include <linux/timer.h> -#include <linux/spinlock.h> -#include <net/sock.h> -#include <linux/atomic.h> -#include <linux/jiffies.h> -#include <net/flow.h> -#include <net/dn.h> - -/* - * Slow timer is for everything else (n * 500mS) - */ - -#define SLOW_INTERVAL (HZ/2) - -static void dn_slow_timer(struct timer_list *t); - -void dn_start_slow_timer(struct sock *sk) -{ - timer_setup(&sk->sk_timer, dn_slow_timer, 0); - sk_reset_timer(sk, &sk->sk_timer, jiffies + SLOW_INTERVAL); -} - -void dn_stop_slow_timer(struct sock *sk) -{ - sk_stop_timer(sk, &sk->sk_timer); -} - -static void dn_slow_timer(struct timer_list *t) -{ - struct sock *sk = from_timer(sk, t, sk_timer); - struct dn_scp *scp = DN_SK(sk); - - bh_lock_sock(sk); - - if (sock_owned_by_user(sk)) { - sk_reset_timer(sk, &sk->sk_timer, jiffies + HZ / 10); - goto out; - } - - /* - * The persist timer is the standard slow timer used for retransmits - * in both connection establishment and disconnection as well as - * in the RUN state. The different states are catered for by changing - * the function pointer in the socket. Setting the timer to a value - * of zero turns it off. We allow the persist_fxn to turn the - * timer off in a permant way by returning non-zero, so that - * timer based routines may remove sockets. This is why we have a - * sock_hold()/sock_put() around the timer to prevent the socket - * going away in the middle. - */ - if (scp->persist && scp->persist_fxn) { - if (scp->persist <= SLOW_INTERVAL) { - scp->persist = 0; - - if (scp->persist_fxn(sk)) - goto out; - } else { - scp->persist -= SLOW_INTERVAL; - } - } - - /* - * Check for keepalive timeout. After the other timer 'cos if - * the previous timer caused a retransmit, we don't need to - * do this. scp->stamp is the last time that we sent a packet. - * The keepalive function sends a link service packet to the - * other end. If it remains unacknowledged, the standard - * socket timers will eventually shut the socket down. Each - * time we do this, scp->stamp will be updated, thus - * we won't try and send another until scp->keepalive has passed - * since the last successful transmission. - */ - if (scp->keepalive && scp->keepalive_fxn && (scp->state == DN_RUN)) { - if (time_after_eq(jiffies, scp->stamp + scp->keepalive)) - scp->keepalive_fxn(sk); - } - - sk_reset_timer(sk, &sk->sk_timer, jiffies + SLOW_INTERVAL); -out: - bh_unlock_sock(sk); - sock_put(sk); -} diff --git a/net/decnet/netfilter/Kconfig b/net/decnet/netfilter/Kconfig deleted file mode 100644 index 14ec4ef95fab..000000000000 --- a/net/decnet/netfilter/Kconfig +++ /dev/null @@ -1,17 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0-only -# -# DECnet netfilter configuration -# - -menu "DECnet: Netfilter Configuration" - depends on DECNET && NETFILTER - depends on NETFILTER_ADVANCED - -config DECNET_NF_GRABULATOR - tristate "Routing message grabulator (for userland routing daemon)" - help - Enable this module if you want to use the userland DECnet routing - daemon. You will also need to enable routing support for DECnet - unless you just want to monitor routing messages from other nodes. - -endmenu diff --git a/net/decnet/netfilter/Makefile b/net/decnet/netfilter/Makefile deleted file mode 100644 index 429c84289d0f..000000000000 --- a/net/decnet/netfilter/Makefile +++ /dev/null @@ -1,6 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0-only -# -# Makefile for DECnet netfilter modules -# - -obj-$(CONFIG_DECNET_NF_GRABULATOR) += dn_rtmsg.o diff --git a/net/decnet/netfilter/dn_rtmsg.c b/net/decnet/netfilter/dn_rtmsg.c deleted file mode 100644 index 26a9193df783..000000000000 --- a/net/decnet/netfilter/dn_rtmsg.c +++ /dev/null @@ -1,158 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet Routing Message Grabulator - * - * (C) 2000 ChyGwyn Limited - https://www.chygwyn.com/ - * - * Author: Steven Whitehouse <steve@chygwyn.com> - */ -#include <linux/module.h> -#include <linux/skbuff.h> -#include <linux/slab.h> -#include <linux/init.h> -#include <linux/netdevice.h> -#include <linux/netfilter.h> -#include <linux/spinlock.h> -#include <net/netlink.h> -#include <linux/netfilter_decnet.h> - -#include <net/sock.h> -#include <net/flow.h> -#include <net/dn.h> -#include <net/dn_route.h> - -static struct sock *dnrmg = NULL; - - -static struct sk_buff *dnrmg_build_message(struct sk_buff *rt_skb, int *errp) -{ - struct sk_buff *skb = NULL; - size_t size; - sk_buff_data_t old_tail; - struct nlmsghdr *nlh; - unsigned char *ptr; - struct nf_dn_rtmsg *rtm; - - size = NLMSG_ALIGN(rt_skb->len) + - NLMSG_ALIGN(sizeof(struct nf_dn_rtmsg)); - skb = nlmsg_new(size, GFP_ATOMIC); - if (!skb) { - *errp = -ENOMEM; - return NULL; - } - old_tail = skb->tail; - nlh = nlmsg_put(skb, 0, 0, 0, size, 0); - if (!nlh) { - kfree_skb(skb); - *errp = -ENOMEM; - return NULL; - } - rtm = (struct nf_dn_rtmsg *)nlmsg_data(nlh); - rtm->nfdn_ifindex = rt_skb->dev->ifindex; - ptr = NFDN_RTMSG(rtm); - skb_copy_from_linear_data(rt_skb, ptr, rt_skb->len); - nlh->nlmsg_len = skb->tail - old_tail; - return skb; -} - -static void dnrmg_send_peer(struct sk_buff *skb) -{ - struct sk_buff *skb2; - int status = 0; - int group = 0; - unsigned char flags = *skb->data; - - switch (flags & DN_RT_CNTL_MSK) { - case DN_RT_PKT_L1RT: - group = DNRNG_NLGRP_L1; - break; - case DN_RT_PKT_L2RT: - group = DNRNG_NLGRP_L2; - break; - default: - return; - } - - skb2 = dnrmg_build_message(skb, &status); - if (skb2 == NULL) - return; - NETLINK_CB(skb2).dst_group = group; - netlink_broadcast(dnrmg, skb2, 0, group, GFP_ATOMIC); -} - - -static unsigned int dnrmg_hook(void *priv, - struct sk_buff *skb, - const struct nf_hook_state *state) -{ - dnrmg_send_peer(skb); - return NF_ACCEPT; -} - - -#define RCV_SKB_FAIL(err) do { netlink_ack(skb, nlh, (err), NULL); return; } while (0) - -static inline void dnrmg_receive_user_skb(struct sk_buff *skb) -{ - struct nlmsghdr *nlh = nlmsg_hdr(skb); - - if (skb->len < sizeof(*nlh) || - nlh->nlmsg_len < sizeof(*nlh) || - skb->len < nlh->nlmsg_len) - return; - - if (!netlink_capable(skb, CAP_NET_ADMIN)) - RCV_SKB_FAIL(-EPERM); - - /* Eventually we might send routing messages too */ - - RCV_SKB_FAIL(-EINVAL); -} - -static const struct nf_hook_ops dnrmg_ops = { - .hook = dnrmg_hook, - .pf = NFPROTO_DECNET, - .hooknum = NF_DN_ROUTE, - .priority = NF_DN_PRI_DNRTMSG, -}; - -static int __init dn_rtmsg_init(void) -{ - int rv = 0; - struct netlink_kernel_cfg cfg = { - .groups = DNRNG_NLGRP_MAX, - .input = dnrmg_receive_user_skb, - }; - - dnrmg = netlink_kernel_create(&init_net, NETLINK_DNRTMSG, &cfg); - if (dnrmg == NULL) { - printk(KERN_ERR "dn_rtmsg: Cannot create netlink socket"); - return -ENOMEM; - } - - rv = nf_register_net_hook(&init_net, &dnrmg_ops); - if (rv) { - netlink_kernel_release(dnrmg); - } - - return rv; -} - -static void __exit dn_rtmsg_fini(void) -{ - nf_unregister_net_hook(&init_net, &dnrmg_ops); - netlink_kernel_release(dnrmg); -} - - -MODULE_DESCRIPTION("DECnet Routing Message Grabulator"); -MODULE_AUTHOR("Steven Whitehouse <steve@chygwyn.com>"); -MODULE_LICENSE("GPL"); -MODULE_ALIAS_NET_PF_PROTO(PF_NETLINK, NETLINK_DNRTMSG); - -module_init(dn_rtmsg_init); -module_exit(dn_rtmsg_fini); diff --git a/net/decnet/sysctl_net_decnet.c b/net/decnet/sysctl_net_decnet.c deleted file mode 100644 index 67b5ab2657b7..000000000000 --- a/net/decnet/sysctl_net_decnet.c +++ /dev/null @@ -1,362 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * DECnet An implementation of the DECnet protocol suite for the LINUX - * operating system. DECnet is implemented using the BSD Socket - * interface as the means of communication with the user level. - * - * DECnet sysctl support functions - * - * Author: Steve Whitehouse <SteveW@ACM.org> - * - * - * Changes: - * Steve Whitehouse - C99 changes and default device handling - * Steve Whitehouse - Memory buffer settings, like the tcp ones - * - */ -#include <linux/mm.h> -#include <linux/sysctl.h> -#include <linux/fs.h> -#include <linux/netdevice.h> -#include <linux/string.h> -#include <net/neighbour.h> -#include <net/dst.h> -#include <net/flow.h> - -#include <linux/uaccess.h> - -#include <net/dn.h> -#include <net/dn_dev.h> -#include <net/dn_route.h> - - -int decnet_debug_level; -int decnet_time_wait = 30; -int decnet_dn_count = 1; -int decnet_di_count = 3; -int decnet_dr_count = 3; -int decnet_log_martians = 1; -int decnet_no_fc_max_cwnd = NSP_MIN_WINDOW; - -/* Reasonable defaults, I hope, based on tcp's defaults */ -long sysctl_decnet_mem[3] = { 768 << 3, 1024 << 3, 1536 << 3 }; -int sysctl_decnet_wmem[3] = { 4 * 1024, 16 * 1024, 128 * 1024 }; -int sysctl_decnet_rmem[3] = { 4 * 1024, 87380, 87380 * 2 }; - -#ifdef CONFIG_SYSCTL -extern int decnet_dst_gc_interval; -static int min_decnet_time_wait[] = { 5 }; -static int max_decnet_time_wait[] = { 600 }; -static int min_state_count[] = { 1 }; -static int max_state_count[] = { NSP_MAXRXTSHIFT }; -static int min_decnet_dst_gc_interval[] = { 1 }; -static int max_decnet_dst_gc_interval[] = { 60 }; -static int min_decnet_no_fc_max_cwnd[] = { NSP_MIN_WINDOW }; -static int max_decnet_no_fc_max_cwnd[] = { NSP_MAX_WINDOW }; -static char node_name[7] = "???"; - -static struct ctl_table_header *dn_table_header = NULL; - -/* - * ctype.h :-) - */ -#define ISNUM(x) (((x) >= '0') && ((x) <= '9')) -#define ISLOWER(x) (((x) >= 'a') && ((x) <= 'z')) -#define ISUPPER(x) (((x) >= 'A') && ((x) <= 'Z')) -#define ISALPHA(x) (ISLOWER(x) || ISUPPER(x)) -#define INVALID_END_CHAR(x) (ISNUM(x) || ISALPHA(x)) - -static void strip_it(char *str) -{ - for(;;) { - switch (*str) { - case ' ': - case '\n': - case '\r': - case ':': - *str = 0; - fallthrough; - case 0: - return; - } - str++; - } -} - -/* - * Simple routine to parse an ascii DECnet address - * into a network order address. - */ -static int parse_addr(__le16 *addr, char *str) -{ - __u16 area, node; - - while(*str && !ISNUM(*str)) str++; - - if (*str == 0) - return -1; - - area = (*str++ - '0'); - if (ISNUM(*str)) { - area *= 10; - area += (*str++ - '0'); - } - - if (*str++ != '.') - return -1; - - if (!ISNUM(*str)) - return -1; - - node = *str++ - '0'; - if (ISNUM(*str)) { - node *= 10; - node += (*str++ - '0'); - } - if (ISNUM(*str)) { - node *= 10; - node += (*str++ - '0'); - } - if (ISNUM(*str)) { - node *= 10; - node += (*str++ - '0'); - } - - if ((node > 1023) || (area > 63)) - return -1; - - if (INVALID_END_CHAR(*str)) - return -1; - - *addr = cpu_to_le16((area << 10) | node); - - return 0; -} - -static int dn_node_address_handler(struct ctl_table *table, int write, - void *buffer, size_t *lenp, loff_t *ppos) -{ - char addr[DN_ASCBUF_LEN]; - size_t len; - __le16 dnaddr; - - if (!*lenp || (*ppos && !write)) { - *lenp = 0; - return 0; - } - - if (write) { - len = (*lenp < DN_ASCBUF_LEN) ? *lenp : (DN_ASCBUF_LEN-1); - memcpy(addr, buffer, len); - addr[len] = 0; - strip_it(addr); - - if (parse_addr(&dnaddr, addr)) - return -EINVAL; - - dn_dev_devices_off(); - - decnet_address = dnaddr; - - dn_dev_devices_on(); - - *ppos += len; - - return 0; - } - - dn_addr2asc(le16_to_cpu(decnet_address), addr); - len = strlen(addr); - addr[len++] = '\n'; - - if (len > *lenp) - len = *lenp; - memcpy(buffer, addr, len); - *lenp = len; - *ppos += len; - - return 0; -} - -static int dn_def_dev_handler(struct ctl_table *table, int write, - void *buffer, size_t *lenp, loff_t *ppos) -{ - size_t len; - struct net_device *dev; - char devname[17]; - - if (!*lenp || (*ppos && !write)) { - *lenp = 0; - return 0; - } - - if (write) { - if (*lenp > 16) - return -E2BIG; - - memcpy(devname, buffer, *lenp); - devname[*lenp] = 0; - strip_it(devname); - - dev = dev_get_by_name(&init_net, devname); - if (dev == NULL) - return -ENODEV; - - if (dev->dn_ptr == NULL) { - dev_put(dev); - return -ENODEV; - } - - if (dn_dev_set_default(dev, 1)) { - dev_put(dev); - return -ENODEV; - } - *ppos += *lenp; - - return 0; - } - - dev = dn_dev_get_default(); - if (dev == NULL) { - *lenp = 0; - return 0; - } - - strcpy(devname, dev->name); - dev_put(dev); - len = strlen(devname); - devname[len++] = '\n'; - - if (len > *lenp) len = *lenp; - - memcpy(buffer, devname, len); - *lenp = len; - *ppos += len; - - return 0; -} - -static struct ctl_table dn_table[] = { - { - .procname = "node_address", - .maxlen = 7, - .mode = 0644, - .proc_handler = dn_node_address_handler, - }, - { - .procname = "node_name", - .data = node_name, - .maxlen = 7, - .mode = 0644, - .proc_handler = proc_dostring, - }, - { - .procname = "default_device", - .maxlen = 16, - .mode = 0644, - .proc_handler = dn_def_dev_handler, - }, - { - .procname = "time_wait", - .data = &decnet_time_wait, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_decnet_time_wait, - .extra2 = &max_decnet_time_wait - }, - { - .procname = "dn_count", - .data = &decnet_dn_count, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_state_count, - .extra2 = &max_state_count - }, - { - .procname = "di_count", - .data = &decnet_di_count, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_state_count, - .extra2 = &max_state_count - }, - { - .procname = "dr_count", - .data = &decnet_dr_count, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_state_count, - .extra2 = &max_state_count - }, - { - .procname = "dst_gc_interval", - .data = &decnet_dst_gc_interval, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_decnet_dst_gc_interval, - .extra2 = &max_decnet_dst_gc_interval - }, - { - .procname = "no_fc_max_cwnd", - .data = &decnet_no_fc_max_cwnd, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = &min_decnet_no_fc_max_cwnd, - .extra2 = &max_decnet_no_fc_max_cwnd - }, - { - .procname = "decnet_mem", - .data = &sysctl_decnet_mem, - .maxlen = sizeof(sysctl_decnet_mem), - .mode = 0644, - .proc_handler = proc_doulongvec_minmax - }, - { - .procname = "decnet_rmem", - .data = &sysctl_decnet_rmem, - .maxlen = sizeof(sysctl_decnet_rmem), - .mode = 0644, - .proc_handler = proc_dointvec, - }, - { - .procname = "decnet_wmem", - .data = &sysctl_decnet_wmem, - .maxlen = sizeof(sysctl_decnet_wmem), - .mode = 0644, - .proc_handler = proc_dointvec, - }, - { - .procname = "debug", - .data = &decnet_debug_level, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec, - }, - { } -}; - -void dn_register_sysctl(void) -{ - dn_table_header = register_net_sysctl(&init_net, "net/decnet", dn_table); -} - -void dn_unregister_sysctl(void) -{ - unregister_net_sysctl_table(dn_table_header); -} - -#else /* CONFIG_SYSCTL */ -void dn_unregister_sysctl(void) -{ -} -void dn_register_sysctl(void) -{ -} - -#endif diff --git a/net/dsa/Makefile b/net/dsa/Makefile index af28c24ead18..bf57ef3bce2a 100644 --- a/net/dsa/Makefile +++ b/net/dsa/Makefile @@ -1,7 +1,15 @@ # SPDX-License-Identifier: GPL-2.0 # the core obj-$(CONFIG_NET_DSA) += dsa_core.o -dsa_core-y += dsa.o dsa2.o master.o port.o slave.o switch.o tag_8021q.o +dsa_core-y += \ + dsa.o \ + dsa2.o \ + master.o \ + netlink.o \ + port.o \ + slave.o \ + switch.o \ + tag_8021q.o # tagging formats obj-$(CONFIG_NET_DSA_TAG_AR9331) += tag_ar9331.o diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index be7b320cda76..64b14f655b23 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -536,8 +536,16 @@ static int __init dsa_init_module(void) dsa_tag_driver_register(&DSA_TAG_DRIVER_NAME(none_ops), THIS_MODULE); + rc = rtnl_link_register(&dsa_link_ops); + if (rc) + goto netlink_register_fail; + return 0; +netlink_register_fail: + dsa_tag_driver_unregister(&DSA_TAG_DRIVER_NAME(none_ops)); + dsa_slave_unregister_notifier(); + dev_remove_pack(&dsa_pack_type); register_notifier_fail: destroy_workqueue(dsa_owq); @@ -547,6 +555,7 @@ module_init(dsa_init_module); static void __exit dsa_cleanup_module(void) { + rtnl_link_unregister(&dsa_link_ops); dsa_tag_driver_unregister(&DSA_TAG_DRIVER_NAME(none_ops)); dsa_slave_unregister_notifier(); diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index cac48a741f27..af0e2c0394ac 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -387,6 +387,20 @@ static struct dsa_port *dsa_tree_find_first_cpu(struct dsa_switch_tree *dst) return NULL; } +struct net_device *dsa_tree_find_first_master(struct dsa_switch_tree *dst) +{ + struct device_node *ethernet; + struct net_device *master; + struct dsa_port *cpu_dp; + + cpu_dp = dsa_tree_find_first_cpu(dst); + ethernet = of_parse_phandle(cpu_dp->dn, "ethernet", 0); + master = of_find_net_device_by_node(ethernet); + of_node_put(ethernet); + + return master; +} + /* Assign the default CPU port (the first one in the tree) to all ports of the * fabric which don't already have one as part of their own switch. */ @@ -447,6 +461,72 @@ static void dsa_tree_teardown_cpu_ports(struct dsa_switch_tree *dst) dp->cpu_dp = NULL; } +static int dsa_port_devlink_setup(struct dsa_port *dp) +{ + struct devlink_port *dlp = &dp->devlink_port; + struct dsa_switch_tree *dst = dp->ds->dst; + struct devlink_port_attrs attrs = {}; + struct devlink *dl = dp->ds->devlink; + struct dsa_switch *ds = dp->ds; + const unsigned char *id; + unsigned char len; + int err; + + memset(dlp, 0, sizeof(*dlp)); + devlink_port_init(dl, dlp); + + if (ds->ops->port_setup) { + err = ds->ops->port_setup(ds, dp->index); + if (err) + return err; + } + + id = (const unsigned char *)&dst->index; + len = sizeof(dst->index); + + attrs.phys.port_number = dp->index; + memcpy(attrs.switch_id.id, id, len); + attrs.switch_id.id_len = len; + + switch (dp->type) { + case DSA_PORT_TYPE_UNUSED: + attrs.flavour = DEVLINK_PORT_FLAVOUR_UNUSED; + break; + case DSA_PORT_TYPE_CPU: + attrs.flavour = DEVLINK_PORT_FLAVOUR_CPU; + break; + case DSA_PORT_TYPE_DSA: + attrs.flavour = DEVLINK_PORT_FLAVOUR_DSA; + break; + case DSA_PORT_TYPE_USER: + attrs.flavour = DEVLINK_PORT_FLAVOUR_PHYSICAL; + break; + } + + devlink_port_attrs_set(dlp, &attrs); + err = devlink_port_register(dl, dlp, dp->index); + if (err) { + if (ds->ops->port_teardown) + ds->ops->port_teardown(ds, dp->index); + return err; + } + + return 0; +} + +static void dsa_port_devlink_teardown(struct dsa_port *dp) +{ + struct devlink_port *dlp = &dp->devlink_port; + struct dsa_switch *ds = dp->ds; + + devlink_port_unregister(dlp); + + if (ds->ops->port_teardown) + ds->ops->port_teardown(ds, dp->index); + + devlink_port_fini(dlp); +} + static int dsa_port_setup(struct dsa_port *dp) { struct devlink_port *dlp = &dp->devlink_port; @@ -458,21 +538,25 @@ static int dsa_port_setup(struct dsa_port *dp) if (dp->setup) return 0; - if (ds->ops->port_setup) { - err = ds->ops->port_setup(ds, dp->index); - if (err) - return err; - } + err = dsa_port_devlink_setup(dp); + if (err) + return err; switch (dp->type) { case DSA_PORT_TYPE_UNUSED: dsa_port_disable(dp); break; case DSA_PORT_TYPE_CPU: - err = dsa_port_link_register_of(dp); - if (err) - break; - dsa_port_link_registered = true; + if (dp->dn) { + err = dsa_shared_port_link_register_of(dp); + if (err) + break; + dsa_port_link_registered = true; + } else { + dev_warn(ds->dev, + "skipping link registration for CPU port %d\n", + dp->index); + } err = dsa_port_enable(dp, NULL); if (err) @@ -481,10 +565,16 @@ static int dsa_port_setup(struct dsa_port *dp) break; case DSA_PORT_TYPE_DSA: - err = dsa_port_link_register_of(dp); - if (err) - break; - dsa_port_link_registered = true; + if (dp->dn) { + err = dsa_shared_port_link_register_of(dp); + if (err) + break; + dsa_port_link_registered = true; + } else { + dev_warn(ds->dev, + "skipping link registration for DSA port %d\n", + dp->index); + } err = dsa_port_enable(dp, NULL); if (err) @@ -505,10 +595,9 @@ static int dsa_port_setup(struct dsa_port *dp) if (err && dsa_port_enabled) dsa_port_disable(dp); if (err && dsa_port_link_registered) - dsa_port_link_unregister_of(dp); + dsa_shared_port_link_unregister_of(dp); if (err) { - if (ds->ops->port_teardown) - ds->ops->port_teardown(ds, dp->index); + dsa_port_devlink_teardown(dp); return err; } @@ -517,59 +606,13 @@ static int dsa_port_setup(struct dsa_port *dp) return 0; } -static int dsa_port_devlink_setup(struct dsa_port *dp) -{ - struct devlink_port *dlp = &dp->devlink_port; - struct dsa_switch_tree *dst = dp->ds->dst; - struct devlink_port_attrs attrs = {}; - struct devlink *dl = dp->ds->devlink; - const unsigned char *id; - unsigned char len; - int err; - - id = (const unsigned char *)&dst->index; - len = sizeof(dst->index); - - attrs.phys.port_number = dp->index; - memcpy(attrs.switch_id.id, id, len); - attrs.switch_id.id_len = len; - memset(dlp, 0, sizeof(*dlp)); - - switch (dp->type) { - case DSA_PORT_TYPE_UNUSED: - attrs.flavour = DEVLINK_PORT_FLAVOUR_UNUSED; - break; - case DSA_PORT_TYPE_CPU: - attrs.flavour = DEVLINK_PORT_FLAVOUR_CPU; - break; - case DSA_PORT_TYPE_DSA: - attrs.flavour = DEVLINK_PORT_FLAVOUR_DSA; - break; - case DSA_PORT_TYPE_USER: - attrs.flavour = DEVLINK_PORT_FLAVOUR_PHYSICAL; - break; - } - - devlink_port_attrs_set(dlp, &attrs); - err = devlink_port_register(dl, dlp, dp->index); - - if (!err) - dp->devlink_port_setup = true; - - return err; -} - static void dsa_port_teardown(struct dsa_port *dp) { struct devlink_port *dlp = &dp->devlink_port; - struct dsa_switch *ds = dp->ds; if (!dp->setup) return; - if (ds->ops->port_teardown) - ds->ops->port_teardown(ds, dp->index); - devlink_port_type_clear(dlp); switch (dp->type) { @@ -577,11 +620,13 @@ static void dsa_port_teardown(struct dsa_port *dp) break; case DSA_PORT_TYPE_CPU: dsa_port_disable(dp); - dsa_port_link_unregister_of(dp); + if (dp->dn) + dsa_shared_port_link_unregister_of(dp); break; case DSA_PORT_TYPE_DSA: dsa_port_disable(dp); - dsa_port_link_unregister_of(dp); + if (dp->dn) + dsa_shared_port_link_unregister_of(dp); break; case DSA_PORT_TYPE_USER: if (dp->slave) { @@ -591,46 +636,15 @@ static void dsa_port_teardown(struct dsa_port *dp) break; } - dp->setup = false; -} - -static void dsa_port_devlink_teardown(struct dsa_port *dp) -{ - struct devlink_port *dlp = &dp->devlink_port; + dsa_port_devlink_teardown(dp); - if (dp->devlink_port_setup) - devlink_port_unregister(dlp); - dp->devlink_port_setup = false; + dp->setup = false; } -/* Destroy the current devlink port, and create a new one which has the UNUSED - * flavour. At this point, any call to ds->ops->port_setup has been already - * balanced out by a call to ds->ops->port_teardown, so we know that any - * devlink port regions the driver had are now unregistered. We then call its - * ds->ops->port_setup again, in order for the driver to re-create them on the - * new devlink port. - */ -static int dsa_port_reinit_as_unused(struct dsa_port *dp) +static int dsa_port_setup_as_unused(struct dsa_port *dp) { - struct dsa_switch *ds = dp->ds; - int err; - - dsa_port_devlink_teardown(dp); dp->type = DSA_PORT_TYPE_UNUSED; - err = dsa_port_devlink_setup(dp); - if (err) - return err; - - if (ds->ops->port_setup) { - /* On error, leave the devlink port registered, - * dsa_switch_teardown will clean it up later. - */ - err = ds->ops->port_setup(ds, dp->index); - if (err) - return err; - } - - return 0; + return dsa_port_setup(dp); } static int dsa_devlink_info_get(struct devlink *dl, @@ -854,7 +868,6 @@ static int dsa_switch_setup(struct dsa_switch *ds) { struct dsa_devlink_priv *dl_priv; struct device_node *dn; - struct dsa_port *dp; int err; if (ds->setup) @@ -877,18 +890,9 @@ static int dsa_switch_setup(struct dsa_switch *ds) dl_priv = devlink_priv(ds->devlink); dl_priv->ds = ds; - /* Setup devlink port instances now, so that the switch - * setup() can register regions etc, against the ports - */ - dsa_switch_for_each_port(dp, ds) { - err = dsa_port_devlink_setup(dp); - if (err) - goto unregister_devlink_ports; - } - err = dsa_switch_register_notifier(ds); if (err) - goto unregister_devlink_ports; + goto devlink_free; ds->configure_vlan_while_not_filtering = true; @@ -929,9 +933,7 @@ teardown: ds->ops->teardown(ds); unregister_notifier: dsa_switch_unregister_notifier(ds); -unregister_devlink_ports: - dsa_switch_for_each_port(dp, ds) - dsa_port_devlink_teardown(dp); +devlink_free: devlink_free(ds->devlink); ds->devlink = NULL; return err; @@ -939,8 +941,6 @@ unregister_devlink_ports: static void dsa_switch_teardown(struct dsa_switch *ds) { - struct dsa_port *dp; - if (!ds->setup) return; @@ -959,8 +959,6 @@ static void dsa_switch_teardown(struct dsa_switch *ds) dsa_switch_unregister_notifier(ds); if (ds->devlink) { - dsa_switch_for_each_port(dp, ds) - dsa_port_devlink_teardown(dp); devlink_free(ds->devlink); ds->devlink = NULL; } @@ -1013,7 +1011,7 @@ static int dsa_tree_setup_ports(struct dsa_switch_tree *dst) if (dsa_port_is_user(dp) || dsa_port_is_unused(dp)) { err = dsa_port_setup(dp); if (err) { - err = dsa_port_reinit_as_unused(dp); + err = dsa_port_setup_as_unused(dp); if (err) goto teardown; } @@ -1046,26 +1044,24 @@ static int dsa_tree_setup_switches(struct dsa_switch_tree *dst) static int dsa_tree_setup_master(struct dsa_switch_tree *dst) { - struct dsa_port *dp; + struct dsa_port *cpu_dp; int err = 0; rtnl_lock(); - list_for_each_entry(dp, &dst->ports, list) { - if (dsa_port_is_cpu(dp)) { - struct net_device *master = dp->master; - bool admin_up = (master->flags & IFF_UP) && - !qdisc_tx_is_noop(master); + dsa_tree_for_each_cpu_port(cpu_dp, dst) { + struct net_device *master = cpu_dp->master; + bool admin_up = (master->flags & IFF_UP) && + !qdisc_tx_is_noop(master); - err = dsa_master_setup(master, dp); - if (err) - break; + err = dsa_master_setup(master, cpu_dp); + if (err) + break; - /* Replay master state event */ - dsa_tree_master_admin_state_change(dst, master, admin_up); - dsa_tree_master_oper_state_change(dst, master, - netif_oper_up(master)); - } + /* Replay master state event */ + dsa_tree_master_admin_state_change(dst, master, admin_up); + dsa_tree_master_oper_state_change(dst, master, + netif_oper_up(master)); } rtnl_unlock(); @@ -1075,22 +1071,20 @@ static int dsa_tree_setup_master(struct dsa_switch_tree *dst) static void dsa_tree_teardown_master(struct dsa_switch_tree *dst) { - struct dsa_port *dp; + struct dsa_port *cpu_dp; rtnl_lock(); - list_for_each_entry(dp, &dst->ports, list) { - if (dsa_port_is_cpu(dp)) { - struct net_device *master = dp->master; + dsa_tree_for_each_cpu_port(cpu_dp, dst) { + struct net_device *master = cpu_dp->master; - /* Synthesizing an "admin down" state is sufficient for - * the switches to get a notification if the master is - * currently up and running. - */ - dsa_tree_master_admin_state_change(dst, master, false); + /* Synthesizing an "admin down" state is sufficient for + * the switches to get a notification if the master is + * currently up and running. + */ + dsa_tree_master_admin_state_change(dst, master, false); - dsa_master_teardown(master); - } + dsa_master_teardown(master); } rtnl_unlock(); @@ -1238,7 +1232,6 @@ out_disconnect: * they would have formed disjoint trees (different "dsa,member" values). */ int dsa_tree_change_tag_proto(struct dsa_switch_tree *dst, - struct net_device *master, const struct dsa_device_ops *tag_ops, const struct dsa_device_ops *old_tag_ops) { @@ -1254,12 +1247,9 @@ int dsa_tree_change_tag_proto(struct dsa_switch_tree *dst, * attempts to change the tagging protocol. If we ever lift the IFF_UP * restriction, there needs to be another mutex which serializes this. */ - if (master->flags & IFF_UP) - goto out_unlock; - - list_for_each_entry(dp, &dst->ports, list) { - if (!dsa_port_is_user(dp)) - continue; + dsa_tree_for_each_user_port(dp, dst) { + if (dsa_port_to_master(dp)->flags & IFF_UP) + goto out_unlock; if (dp->slave->flags & IFF_UP) goto out_unlock; @@ -1306,6 +1296,12 @@ void dsa_tree_master_admin_state_change(struct dsa_switch_tree *dst, struct dsa_port *cpu_dp = master->dsa_ptr; bool notify = false; + /* Don't keep track of admin state on LAG DSA masters, + * but rather just of physical DSA masters + */ + if (netif_is_lag_master(master)) + return; + if ((dsa_port_master_is_operational(cpu_dp)) != (up && cpu_dp->master_oper_up)) notify = true; @@ -1323,6 +1319,12 @@ void dsa_tree_master_oper_state_change(struct dsa_switch_tree *dst, struct dsa_port *cpu_dp = master->dsa_ptr; bool notify = false; + /* Don't keep track of oper state on LAG DSA masters, + * but rather just of physical DSA masters + */ + if (netif_is_lag_master(master)) + return; + if ((dsa_port_master_is_operational(cpu_dp)) != (cpu_dp->master_admin_up && up)) notify = true; @@ -1791,7 +1793,7 @@ void dsa_switch_shutdown(struct dsa_switch *ds) rtnl_lock(); dsa_switch_for_each_user_port(dp, ds) { - master = dp->cpu_dp->master; + master = dsa_port_to_master(dp); slave_dev = dp->slave; netdev_upper_dev_unlink(master, slave_dev); diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index d9722e49864b..6e65c7ffd6f3 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -88,6 +88,7 @@ struct dsa_notifier_lag_info { const struct dsa_port *dp; struct dsa_lag lag; struct netdev_lag_upper_info *info; + struct netlink_ext_ack *extack; }; /* DSA_NOTIFIER_VLAN_* */ @@ -184,6 +185,11 @@ static inline int dsa_tag_protocol_overhead(const struct dsa_device_ops *ops) /* master.c */ int dsa_master_setup(struct net_device *dev, struct dsa_port *cpu_dp); void dsa_master_teardown(struct net_device *dev); +int dsa_master_lag_setup(struct net_device *lag_dev, struct dsa_port *cpu_dp, + struct netdev_lag_upper_info *uinfo, + struct netlink_ext_ack *extack); +void dsa_master_lag_teardown(struct net_device *lag_dev, + struct dsa_port *cpu_dp); static inline struct net_device *dsa_master_find_slave(struct net_device *dev, int device, int port) @@ -200,6 +206,9 @@ static inline struct net_device *dsa_master_find_slave(struct net_device *dev, return NULL; } +/* netlink.c */ +extern struct rtnl_link_ops dsa_link_ops __read_mostly; + /* port.c */ void dsa_port_set_tag_protocol(struct dsa_port *cpu_dp, const struct dsa_device_ops *tag_ops); @@ -285,13 +294,16 @@ int dsa_port_mrp_add_ring_role(const struct dsa_port *dp, int dsa_port_mrp_del_ring_role(const struct dsa_port *dp, const struct switchdev_obj_ring_role_mrp *mrp); int dsa_port_phylink_create(struct dsa_port *dp); -int dsa_port_link_register_of(struct dsa_port *dp); -void dsa_port_link_unregister_of(struct dsa_port *dp); +void dsa_port_phylink_destroy(struct dsa_port *dp); +int dsa_shared_port_link_register_of(struct dsa_port *dp); +void dsa_shared_port_link_unregister_of(struct dsa_port *dp); int dsa_port_hsr_join(struct dsa_port *dp, struct net_device *hsr); void dsa_port_hsr_leave(struct dsa_port *dp, struct net_device *hsr); int dsa_port_tag_8021q_vlan_add(struct dsa_port *dp, u16 vid, bool broadcast); void dsa_port_tag_8021q_vlan_del(struct dsa_port *dp, u16 vid, bool broadcast); void dsa_port_set_host_flood(struct dsa_port *dp, bool uc, bool mc); +int dsa_port_change_master(struct dsa_port *dp, struct net_device *master, + struct netlink_ext_ack *extack); /* slave.c */ extern const struct dsa_device_ops notag_netdev_ops; @@ -305,8 +317,12 @@ int dsa_slave_suspend(struct net_device *slave_dev); int dsa_slave_resume(struct net_device *slave_dev); int dsa_slave_register_notifier(void); void dsa_slave_unregister_notifier(void); +void dsa_slave_sync_ha(struct net_device *dev); +void dsa_slave_unsync_ha(struct net_device *dev); void dsa_slave_setup_tagger(struct net_device *slave); int dsa_slave_change_mtu(struct net_device *dev, int new_mtu); +int dsa_slave_change_master(struct net_device *dev, struct net_device *master, + struct netlink_ext_ack *extack); int dsa_slave_manage_vlan_filtering(struct net_device *dev, bool vlan_filtering); @@ -322,7 +338,7 @@ dsa_slave_to_master(const struct net_device *dev) { struct dsa_port *dp = dsa_slave_to_port(dev); - return dp->cpu_dp->master; + return dsa_port_to_master(dp); } /* If under a bridge with vlan_filtering=0, make sure to send pvid-tagged @@ -542,10 +558,10 @@ void dsa_lag_map(struct dsa_switch_tree *dst, struct dsa_lag *lag); void dsa_lag_unmap(struct dsa_switch_tree *dst, struct dsa_lag *lag); struct dsa_lag *dsa_tree_lag_find(struct dsa_switch_tree *dst, const struct net_device *lag_dev); +struct net_device *dsa_tree_find_first_master(struct dsa_switch_tree *dst); int dsa_tree_notify(struct dsa_switch_tree *dst, unsigned long e, void *v); int dsa_broadcast(unsigned long e, void *v); int dsa_tree_change_tag_proto(struct dsa_switch_tree *dst, - struct net_device *master, const struct dsa_device_ops *tag_ops, const struct dsa_device_ops *old_tag_ops); void dsa_tree_master_admin_state_change(struct dsa_switch_tree *dst, diff --git a/net/dsa/master.c b/net/dsa/master.c index 2851e44c4cf0..40367ab41cf8 100644 --- a/net/dsa/master.c +++ b/net/dsa/master.c @@ -58,7 +58,7 @@ static void dsa_master_get_regs(struct net_device *dev, } cpu_info = (struct ethtool_drvinfo *)data; - strlcpy(cpu_info->driver, "dsa", sizeof(cpu_info->driver)); + strscpy(cpu_info->driver, "dsa", sizeof(cpu_info->driver)); data += sizeof(*cpu_info); cpu_regs = (struct ethtool_regs *)data; data += sizeof(*cpu_regs); @@ -226,6 +226,9 @@ static int dsa_master_ethtool_setup(struct net_device *dev) struct dsa_switch *ds = cpu_dp->ds; struct ethtool_ops *ops; + if (netif_is_lag_master(dev)) + return 0; + ops = devm_kzalloc(ds->dev, sizeof(*ops), GFP_KERNEL); if (!ops) return -ENOMEM; @@ -250,6 +253,9 @@ static void dsa_master_ethtool_teardown(struct net_device *dev) { struct dsa_port *cpu_dp = dev->dsa_ptr; + if (netif_is_lag_master(dev)) + return; + dev->ethtool_ops = cpu_dp->orig_ethtool_ops; cpu_dp->orig_ethtool_ops = NULL; } @@ -257,6 +263,9 @@ static void dsa_master_ethtool_teardown(struct net_device *dev) static void dsa_netdev_ops_set(struct net_device *dev, const struct dsa_netdevice_ops *ops) { + if (netif_is_lag_master(dev)) + return; + dev->dsa_ptr->netdev_ops = ops; } @@ -307,7 +316,7 @@ static ssize_t tagging_store(struct device *d, struct device_attribute *attr, */ goto out; - err = dsa_tree_change_tag_proto(cpu_dp->ds->dst, dev, new_tag_ops, + err = dsa_tree_change_tag_proto(cpu_dp->ds->dst, new_tag_ops, old_tag_ops); if (err) { /* On failure the old tagger is restored, so we don't need the @@ -355,12 +364,14 @@ int dsa_master_setup(struct net_device *dev, struct dsa_port *cpu_dp) mtu = ETH_DATA_LEN + dsa_tag_protocol_overhead(tag_ops); /* The DSA master must use SET_NETDEV_DEV for this to work. */ - consumer_link = device_link_add(ds->dev, dev->dev.parent, - DL_FLAG_AUTOREMOVE_CONSUMER); - if (!consumer_link) - netdev_err(dev, - "Failed to create a device link to DSA switch %s\n", - dev_name(ds->dev)); + if (!netif_is_lag_master(dev)) { + consumer_link = device_link_add(ds->dev, dev->dev.parent, + DL_FLAG_AUTOREMOVE_CONSUMER); + if (!consumer_link) + netdev_err(dev, + "Failed to create a device link to DSA switch %s\n", + dev_name(ds->dev)); + } /* The switch driver may not implement ->port_change_mtu(), case in * which dsa_slave_change_mtu() will not update the master MTU either, @@ -417,3 +428,52 @@ void dsa_master_teardown(struct net_device *dev) */ wmb(); } + +int dsa_master_lag_setup(struct net_device *lag_dev, struct dsa_port *cpu_dp, + struct netdev_lag_upper_info *uinfo, + struct netlink_ext_ack *extack) +{ + bool master_setup = false; + int err; + + if (!netdev_uses_dsa(lag_dev)) { + err = dsa_master_setup(lag_dev, cpu_dp); + if (err) + return err; + + master_setup = true; + } + + err = dsa_port_lag_join(cpu_dp, lag_dev, uinfo, extack); + if (err) { + if (extack && !extack->_msg) + NL_SET_ERR_MSG_MOD(extack, + "CPU port failed to join LAG"); + goto out_master_teardown; + } + + return 0; + +out_master_teardown: + if (master_setup) + dsa_master_teardown(lag_dev); + return err; +} + +/* Tear down a master if there isn't any other user port on it, + * optionally also destroying LAG information. + */ +void dsa_master_lag_teardown(struct net_device *lag_dev, + struct dsa_port *cpu_dp) +{ + struct net_device *upper; + struct list_head *iter; + + dsa_port_lag_leave(cpu_dp, lag_dev); + + netdev_for_each_upper_dev_rcu(lag_dev, upper, iter) + if (dsa_slave_dev_check(upper)) + return; + + dsa_master_teardown(lag_dev); +} diff --git a/net/dsa/netlink.c b/net/dsa/netlink.c new file mode 100644 index 000000000000..ecf9ed1de185 --- /dev/null +++ b/net/dsa/netlink.c @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright 2022 NXP + */ +#include <linux/netdevice.h> +#include <net/rtnetlink.h> + +#include "dsa_priv.h" + +static const struct nla_policy dsa_policy[IFLA_DSA_MAX + 1] = { + [IFLA_DSA_MASTER] = { .type = NLA_U32 }, +}; + +static int dsa_changelink(struct net_device *dev, struct nlattr *tb[], + struct nlattr *data[], + struct netlink_ext_ack *extack) +{ + int err; + + if (!data) + return 0; + + if (data[IFLA_DSA_MASTER]) { + u32 ifindex = nla_get_u32(data[IFLA_DSA_MASTER]); + struct net_device *master; + + master = __dev_get_by_index(dev_net(dev), ifindex); + if (!master) + return -EINVAL; + + err = dsa_slave_change_master(dev, master, extack); + if (err) + return err; + } + + return 0; +} + +static size_t dsa_get_size(const struct net_device *dev) +{ + return nla_total_size(sizeof(u32)) + /* IFLA_DSA_MASTER */ + 0; +} + +static int dsa_fill_info(struct sk_buff *skb, const struct net_device *dev) +{ + struct net_device *master = dsa_slave_to_master(dev); + + if (nla_put_u32(skb, IFLA_DSA_MASTER, master->ifindex)) + return -EMSGSIZE; + + return 0; +} + +struct rtnl_link_ops dsa_link_ops __read_mostly = { + .kind = "dsa", + .priv_size = sizeof(struct dsa_port), + .maxtype = IFLA_DSA_MAX, + .policy = dsa_policy, + .changelink = dsa_changelink, + .get_size = dsa_get_size, + .fill_info = dsa_fill_info, + .netns_refund = true, +}; diff --git a/net/dsa/port.c b/net/dsa/port.c index a8895ee3cd60..e4a0513816bb 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -7,6 +7,7 @@ */ #include <linux/if_bridge.h> +#include <linux/netdevice.h> #include <linux/notifier.h> #include <linux/of_mdio.h> #include <linux/of_net.h> @@ -634,6 +635,7 @@ int dsa_port_lag_join(struct dsa_port *dp, struct net_device *lag_dev, struct dsa_notifier_lag_info info = { .dp = dp, .info = uinfo, + .extack = extack, }; struct net_device *bridge_dev; int err; @@ -1026,7 +1028,7 @@ int dsa_port_standalone_host_fdb_add(struct dsa_port *dp, int dsa_port_bridge_host_fdb_add(struct dsa_port *dp, const unsigned char *addr, u16 vid) { - struct dsa_port *cpu_dp = dp->cpu_dp; + struct net_device *master = dsa_port_to_master(dp); struct dsa_db db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge, @@ -1037,8 +1039,8 @@ int dsa_port_bridge_host_fdb_add(struct dsa_port *dp, * requires rtnl_lock(), since we can't guarantee that is held here, * and we can't take it either. */ - if (cpu_dp->master->priv_flags & IFF_UNICAST_FLT) { - err = dev_uc_add(cpu_dp->master, addr); + if (master->priv_flags & IFF_UNICAST_FLT) { + err = dev_uc_add(master, addr); if (err) return err; } @@ -1077,15 +1079,15 @@ int dsa_port_standalone_host_fdb_del(struct dsa_port *dp, int dsa_port_bridge_host_fdb_del(struct dsa_port *dp, const unsigned char *addr, u16 vid) { - struct dsa_port *cpu_dp = dp->cpu_dp; + struct net_device *master = dsa_port_to_master(dp); struct dsa_db db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge, }; int err; - if (cpu_dp->master->priv_flags & IFF_UNICAST_FLT) { - err = dev_uc_del(cpu_dp->master, addr); + if (master->priv_flags & IFF_UNICAST_FLT) { + err = dev_uc_del(master, addr); if (err) return err; } @@ -1208,14 +1210,14 @@ int dsa_port_standalone_host_mdb_add(const struct dsa_port *dp, int dsa_port_bridge_host_mdb_add(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb) { - struct dsa_port *cpu_dp = dp->cpu_dp; + struct net_device *master = dsa_port_to_master(dp); struct dsa_db db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge, }; int err; - err = dev_mc_add(cpu_dp->master, mdb->addr); + err = dev_mc_add(master, mdb->addr); if (err) return err; @@ -1252,14 +1254,14 @@ int dsa_port_standalone_host_mdb_del(const struct dsa_port *dp, int dsa_port_bridge_host_mdb_del(const struct dsa_port *dp, const struct switchdev_obj_port_mdb *mdb) { - struct dsa_port *cpu_dp = dp->cpu_dp; + struct net_device *master = dsa_port_to_master(dp); struct dsa_db db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge, }; int err; - err = dev_mc_del(cpu_dp->master, mdb->addr); + err = dev_mc_del(master, mdb->addr); if (err) return err; @@ -1294,19 +1296,19 @@ int dsa_port_host_vlan_add(struct dsa_port *dp, const struct switchdev_obj_port_vlan *vlan, struct netlink_ext_ack *extack) { + struct net_device *master = dsa_port_to_master(dp); struct dsa_notifier_vlan_info info = { .dp = dp, .vlan = vlan, .extack = extack, }; - struct dsa_port *cpu_dp = dp->cpu_dp; int err; err = dsa_port_notify(dp, DSA_NOTIFIER_HOST_VLAN_ADD, &info); if (err && err != -EOPNOTSUPP) return err; - vlan_vid_add(cpu_dp->master, htons(ETH_P_8021Q), vlan->vid); + vlan_vid_add(master, htons(ETH_P_8021Q), vlan->vid); return err; } @@ -1314,18 +1316,18 @@ int dsa_port_host_vlan_add(struct dsa_port *dp, int dsa_port_host_vlan_del(struct dsa_port *dp, const struct switchdev_obj_port_vlan *vlan) { + struct net_device *master = dsa_port_to_master(dp); struct dsa_notifier_vlan_info info = { .dp = dp, .vlan = vlan, }; - struct dsa_port *cpu_dp = dp->cpu_dp; int err; err = dsa_port_notify(dp, DSA_NOTIFIER_HOST_VLAN_DEL, &info); if (err && err != -EOPNOTSUPP) return err; - vlan_vid_del(cpu_dp->master, htons(ETH_P_8021Q), vlan->vid); + vlan_vid_del(master, htons(ETH_P_8021Q), vlan->vid); return err; } @@ -1374,6 +1376,136 @@ int dsa_port_mrp_del_ring_role(const struct dsa_port *dp, return ds->ops->port_mrp_del_ring_role(ds, dp->index, mrp); } +static int dsa_port_assign_master(struct dsa_port *dp, + struct net_device *master, + struct netlink_ext_ack *extack, + bool fail_on_err) +{ + struct dsa_switch *ds = dp->ds; + int port = dp->index, err; + + err = ds->ops->port_change_master(ds, port, master, extack); + if (err && !fail_on_err) + dev_err(ds->dev, "port %d failed to assign master %s: %pe\n", + port, master->name, ERR_PTR(err)); + + if (err && fail_on_err) + return err; + + dp->cpu_dp = master->dsa_ptr; + dp->cpu_port_in_lag = netif_is_lag_master(master); + + return 0; +} + +/* Change the dp->cpu_dp affinity for a user port. Note that both cross-chip + * notifiers and drivers have implicit assumptions about user-to-CPU-port + * mappings, so we unfortunately cannot delay the deletion of the objects + * (switchdev, standalone addresses, standalone VLANs) on the old CPU port + * until the new CPU port has been set up. So we need to completely tear down + * the old CPU port before changing it, and restore it on errors during the + * bringup of the new one. + */ +int dsa_port_change_master(struct dsa_port *dp, struct net_device *master, + struct netlink_ext_ack *extack) +{ + struct net_device *bridge_dev = dsa_port_bridge_dev_get(dp); + struct net_device *old_master = dsa_port_to_master(dp); + struct net_device *dev = dp->slave; + struct dsa_switch *ds = dp->ds; + bool vlan_filtering; + int err, tmp; + + /* Bridges may hold host FDB, MDB and VLAN objects. These need to be + * migrated, so dynamically unoffload and later reoffload the bridge + * port. + */ + if (bridge_dev) { + dsa_port_pre_bridge_leave(dp, bridge_dev); + dsa_port_bridge_leave(dp, bridge_dev); + } + + /* The port might still be VLAN filtering even if it's no longer + * under a bridge, either due to ds->vlan_filtering_is_global or + * ds->needs_standalone_vlan_filtering. In turn this means VLANs + * on the CPU port. + */ + vlan_filtering = dsa_port_is_vlan_filtering(dp); + if (vlan_filtering) { + err = dsa_slave_manage_vlan_filtering(dev, false); + if (err) { + NL_SET_ERR_MSG_MOD(extack, + "Failed to remove standalone VLANs"); + goto rewind_old_bridge; + } + } + + /* Standalone addresses, and addresses of upper interfaces like + * VLAN, LAG, HSR need to be migrated. + */ + dsa_slave_unsync_ha(dev); + + err = dsa_port_assign_master(dp, master, extack, true); + if (err) + goto rewind_old_addrs; + + dsa_slave_sync_ha(dev); + + if (vlan_filtering) { + err = dsa_slave_manage_vlan_filtering(dev, true); + if (err) { + NL_SET_ERR_MSG_MOD(extack, + "Failed to restore standalone VLANs"); + goto rewind_new_addrs; + } + } + + if (bridge_dev) { + err = dsa_port_bridge_join(dp, bridge_dev, extack); + if (err && err == -EOPNOTSUPP) { + NL_SET_ERR_MSG_MOD(extack, + "Failed to reoffload bridge"); + goto rewind_new_vlan; + } + } + + return 0; + +rewind_new_vlan: + if (vlan_filtering) + dsa_slave_manage_vlan_filtering(dev, false); + +rewind_new_addrs: + dsa_slave_unsync_ha(dev); + + dsa_port_assign_master(dp, old_master, NULL, false); + +/* Restore the objects on the old CPU port */ +rewind_old_addrs: + dsa_slave_sync_ha(dev); + + if (vlan_filtering) { + tmp = dsa_slave_manage_vlan_filtering(dev, true); + if (tmp) { + dev_err(ds->dev, + "port %d failed to restore standalone VLANs: %pe\n", + dp->index, ERR_PTR(tmp)); + } + } + +rewind_old_bridge: + if (bridge_dev) { + tmp = dsa_port_bridge_join(dp, bridge_dev, extack); + if (tmp) { + dev_err(ds->dev, + "port %d failed to rejoin bridge %s: %pe\n", + dp->index, bridge_dev->name, ERR_PTR(tmp)); + } + } + + return err; +} + void dsa_port_set_tag_protocol(struct dsa_port *cpu_dp, const struct dsa_device_ops *tag_ops) { @@ -1529,6 +1661,7 @@ int dsa_port_phylink_create(struct dsa_port *dp) { struct dsa_switch *ds = dp->ds; phy_interface_t mode; + struct phylink *pl; int err; err = of_get_phy_mode(dp->dn, &mode); @@ -1545,17 +1678,25 @@ int dsa_port_phylink_create(struct dsa_port *dp) if (ds->ops->phylink_get_caps) ds->ops->phylink_get_caps(ds, dp->index, &dp->pl_config); - dp->pl = phylink_create(&dp->pl_config, of_fwnode_handle(dp->dn), - mode, &dsa_port_phylink_mac_ops); - if (IS_ERR(dp->pl)) { + pl = phylink_create(&dp->pl_config, of_fwnode_handle(dp->dn), + mode, &dsa_port_phylink_mac_ops); + if (IS_ERR(pl)) { pr_err("error creating PHYLINK: %ld\n", PTR_ERR(dp->pl)); - return PTR_ERR(dp->pl); + return PTR_ERR(pl); } + dp->pl = pl; + return 0; } -static int dsa_port_setup_phy_of(struct dsa_port *dp, bool enable) +void dsa_port_phylink_destroy(struct dsa_port *dp) +{ + phylink_destroy(dp->pl); + dp->pl = NULL; +} + +static int dsa_shared_port_setup_phy_of(struct dsa_port *dp, bool enable) { struct dsa_switch *ds = dp->ds; struct phy_device *phydev; @@ -1593,7 +1734,7 @@ err_put_dev: return err; } -static int dsa_port_fixed_link_register_of(struct dsa_port *dp) +static int dsa_shared_port_fixed_link_register_of(struct dsa_port *dp) { struct device_node *dn = dp->dn; struct dsa_switch *ds = dp->ds; @@ -1627,7 +1768,7 @@ static int dsa_port_fixed_link_register_of(struct dsa_port *dp) return 0; } -static int dsa_port_phylink_register(struct dsa_port *dp) +static int dsa_shared_port_phylink_register(struct dsa_port *dp) { struct dsa_switch *ds = dp->ds; struct device_node *port_dn = dp->dn; @@ -1649,26 +1790,188 @@ static int dsa_port_phylink_register(struct dsa_port *dp) return 0; err_phy_connect: - phylink_destroy(dp->pl); + dsa_port_phylink_destroy(dp); return err; } -int dsa_port_link_register_of(struct dsa_port *dp) +/* During the initial DSA driver migration to OF, port nodes were sometimes + * added to device trees with no indication of how they should operate from a + * link management perspective (phy-handle, fixed-link, etc). Additionally, the + * phy-mode may be absent. The interpretation of these port OF nodes depends on + * their type. + * + * User ports with no phy-handle or fixed-link are expected to connect to an + * internal PHY located on the ds->slave_mii_bus at an MDIO address equal to + * the port number. This description is still actively supported. + * + * Shared (CPU and DSA) ports with no phy-handle or fixed-link are expected to + * operate at the maximum speed that their phy-mode is capable of. If the + * phy-mode is absent, they are expected to operate using the phy-mode + * supported by the port that gives the highest link speed. It is unspecified + * if the port should use flow control or not, half duplex or full duplex, or + * if the phy-mode is a SERDES link, whether in-band autoneg is expected to be + * enabled or not. + * + * In the latter case of shared ports, omitting the link management description + * from the firmware node is deprecated and strongly discouraged. DSA uses + * phylink, which rejects the firmware nodes of these ports for lacking + * required properties. + * + * For switches in this table, DSA will skip enforcing validation and will + * later omit registering a phylink instance for the shared ports, if they lack + * a fixed-link, a phy-handle, or a managed = "in-band-status" property. + * It becomes the responsibility of the driver to ensure that these ports + * operate at the maximum speed (whatever this means) and will interoperate + * with the DSA master or other cascade port, since phylink methods will not be + * invoked for them. + * + * If you are considering expanding this table for newly introduced switches, + * think again. It is OK to remove switches from this table if there aren't DT + * blobs in circulation which rely on defaulting the shared ports. + */ +static const char * const dsa_switches_apply_workarounds[] = { +#if IS_ENABLED(CONFIG_NET_DSA_XRS700X) + "arrow,xrs7003e", + "arrow,xrs7003f", + "arrow,xrs7004e", + "arrow,xrs7004f", +#endif +#if IS_ENABLED(CONFIG_B53) + "brcm,bcm5325", + "brcm,bcm53115", + "brcm,bcm53125", + "brcm,bcm53128", + "brcm,bcm5365", + "brcm,bcm5389", + "brcm,bcm5395", + "brcm,bcm5397", + "brcm,bcm5398", + "brcm,bcm53010-srab", + "brcm,bcm53011-srab", + "brcm,bcm53012-srab", + "brcm,bcm53018-srab", + "brcm,bcm53019-srab", + "brcm,bcm5301x-srab", + "brcm,bcm11360-srab", + "brcm,bcm58522-srab", + "brcm,bcm58525-srab", + "brcm,bcm58535-srab", + "brcm,bcm58622-srab", + "brcm,bcm58623-srab", + "brcm,bcm58625-srab", + "brcm,bcm88312-srab", + "brcm,cygnus-srab", + "brcm,nsp-srab", + "brcm,omega-srab", + "brcm,bcm3384-switch", + "brcm,bcm6328-switch", + "brcm,bcm6368-switch", + "brcm,bcm63xx-switch", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_BCM_SF2) + "brcm,bcm7445-switch-v4.0", + "brcm,bcm7278-switch-v4.0", + "brcm,bcm7278-switch-v4.8", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_LANTIQ_GSWIP) + "lantiq,xrx200-gswip", + "lantiq,xrx300-gswip", + "lantiq,xrx330-gswip", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_MV88E6060) + "marvell,mv88e6060", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX) + "marvell,mv88e6085", + "marvell,mv88e6190", + "marvell,mv88e6250", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ_COMMON) + "microchip,ksz8765", + "microchip,ksz8794", + "microchip,ksz8795", + "microchip,ksz8863", + "microchip,ksz8873", + "microchip,ksz9477", + "microchip,ksz9897", + "microchip,ksz9893", + "microchip,ksz9563", + "microchip,ksz8563", + "microchip,ksz9567", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_SMSC_LAN9303_MDIO) + "smsc,lan9303-mdio", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_SMSC_LAN9303_I2C) + "smsc,lan9303-i2c", +#endif + NULL, +}; + +static void dsa_shared_port_validate_of(struct dsa_port *dp, + bool *missing_phy_mode, + bool *missing_link_description) { + struct device_node *dn = dp->dn, *phy_np; struct dsa_switch *ds = dp->ds; - struct device_node *phy_np; + phy_interface_t mode; + + *missing_phy_mode = false; + *missing_link_description = false; + + if (of_get_phy_mode(dn, &mode)) { + *missing_phy_mode = true; + dev_err(ds->dev, + "OF node %pOF of %s port %d lacks the required \"phy-mode\" property\n", + dn, dsa_port_is_cpu(dp) ? "CPU" : "DSA", dp->index); + } + + /* Note: of_phy_is_fixed_link() also returns true for + * managed = "in-band-status" + */ + if (of_phy_is_fixed_link(dn)) + return; + + phy_np = of_parse_phandle(dn, "phy-handle", 0); + if (phy_np) { + of_node_put(phy_np); + return; + } + + *missing_link_description = true; + + dev_err(ds->dev, + "OF node %pOF of %s port %d lacks the required \"phy-handle\", \"fixed-link\" or \"managed\" properties\n", + dn, dsa_port_is_cpu(dp) ? "CPU" : "DSA", dp->index); +} + +int dsa_shared_port_link_register_of(struct dsa_port *dp) +{ + struct dsa_switch *ds = dp->ds; + bool missing_link_description; + bool missing_phy_mode; int port = dp->index; + dsa_shared_port_validate_of(dp, &missing_phy_mode, + &missing_link_description); + + if ((missing_phy_mode || missing_link_description) && + !of_device_compatible_match(ds->dev->of_node, + dsa_switches_apply_workarounds)) + return -EINVAL; + if (!ds->ops->adjust_link) { - phy_np = of_parse_phandle(dp->dn, "phy-handle", 0); - if (of_phy_is_fixed_link(dp->dn) || phy_np) { + if (missing_link_description) { + dev_warn(ds->dev, + "Skipping phylink registration for %s port %d\n", + dsa_port_is_cpu(dp) ? "CPU" : "DSA", dp->index); + } else { if (ds->ops->phylink_mac_link_down) ds->ops->phylink_mac_link_down(ds, port, MLO_AN_FIXED, PHY_INTERFACE_MODE_NA); - of_node_put(phy_np); - return dsa_port_phylink_register(dp); + + return dsa_shared_port_phylink_register(dp); } - of_node_put(phy_np); return 0; } @@ -1676,12 +1979,12 @@ int dsa_port_link_register_of(struct dsa_port *dp) "Using legacy PHYLIB callbacks. Please migrate to PHYLINK!\n"); if (of_phy_is_fixed_link(dp->dn)) - return dsa_port_fixed_link_register_of(dp); + return dsa_shared_port_fixed_link_register_of(dp); else - return dsa_port_setup_phy_of(dp, true); + return dsa_shared_port_setup_phy_of(dp, true); } -void dsa_port_link_unregister_of(struct dsa_port *dp) +void dsa_shared_port_link_unregister_of(struct dsa_port *dp) { struct dsa_switch *ds = dp->ds; @@ -1689,15 +1992,14 @@ void dsa_port_link_unregister_of(struct dsa_port *dp) rtnl_lock(); phylink_disconnect_phy(dp->pl); rtnl_unlock(); - phylink_destroy(dp->pl); - dp->pl = NULL; + dsa_port_phylink_destroy(dp); return; } if (of_phy_is_fixed_link(dp->dn)) of_phy_deregister_fixed_link(dp->dn); else - dsa_port_setup_phy_of(dp, false); + dsa_shared_port_setup_phy_of(dp, false); } int dsa_port_hsr_join(struct dsa_port *dp, struct net_device *hsr) diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 1291c2431d44..1a59918d3b30 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -164,6 +164,48 @@ static int dsa_slave_unsync_mc(struct net_device *dev, return dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL, addr, 0); } +void dsa_slave_sync_ha(struct net_device *dev) +{ + struct dsa_port *dp = dsa_slave_to_port(dev); + struct dsa_switch *ds = dp->ds; + struct netdev_hw_addr *ha; + + netif_addr_lock_bh(dev); + + netdev_for_each_synced_mc_addr(ha, dev) + dsa_slave_sync_mc(dev, ha->addr); + + netdev_for_each_synced_uc_addr(ha, dev) + dsa_slave_sync_uc(dev, ha->addr); + + netif_addr_unlock_bh(dev); + + if (dsa_switch_supports_uc_filtering(ds) || + dsa_switch_supports_mc_filtering(ds)) + dsa_flush_workqueue(); +} + +void dsa_slave_unsync_ha(struct net_device *dev) +{ + struct dsa_port *dp = dsa_slave_to_port(dev); + struct dsa_switch *ds = dp->ds; + struct netdev_hw_addr *ha; + + netif_addr_lock_bh(dev); + + netdev_for_each_synced_uc_addr(ha, dev) + dsa_slave_unsync_uc(dev, ha->addr); + + netdev_for_each_synced_mc_addr(ha, dev) + dsa_slave_unsync_mc(dev, ha->addr); + + netif_addr_unlock_bh(dev); + + if (dsa_switch_supports_uc_filtering(ds) || + dsa_switch_supports_mc_filtering(ds)) + dsa_flush_workqueue(); +} + /* slave mii_bus handling ***************************************************/ static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg) { @@ -826,9 +868,9 @@ static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, struct net_device *dev) static void dsa_slave_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *drvinfo) { - strlcpy(drvinfo->driver, "dsa", sizeof(drvinfo->driver)); - strlcpy(drvinfo->fw_version, "N/A", sizeof(drvinfo->fw_version)); - strlcpy(drvinfo->bus_info, "platform", sizeof(drvinfo->bus_info)); + strscpy(drvinfo->driver, "dsa", sizeof(drvinfo->driver)); + strscpy(drvinfo->fw_version, "N/A", sizeof(drvinfo->fw_version)); + strscpy(drvinfo->bus_info, "platform", sizeof(drvinfo->bus_info)); } static int dsa_slave_get_regs_len(struct net_device *dev) @@ -1503,8 +1545,7 @@ static int dsa_slave_setup_tc_block(struct net_device *dev, static int dsa_slave_setup_ft_block(struct dsa_switch *ds, int port, void *type_data) { - struct dsa_port *cpu_dp = dsa_to_port(ds, port)->cpu_dp; - struct net_device *master = cpu_dp->master; + struct net_device *master = dsa_port_to_master(dsa_to_port(ds, port)); if (!master->netdev_ops->ndo_setup_tc) return -EOPNOTSUPP; @@ -2147,13 +2188,14 @@ static int dsa_slave_fill_forward_path(struct net_device_path_ctx *ctx, struct net_device_path *path) { struct dsa_port *dp = dsa_slave_to_port(ctx->dev); + struct net_device *master = dsa_port_to_master(dp); struct dsa_port *cpu_dp = dp->cpu_dp; path->dev = ctx->dev; path->type = DEV_PATH_DSA; path->dsa.proto = cpu_dp->tag_ops->proto; path->dsa.port = dp->index; - ctx->dev = cpu_dp->master; + ctx->dev = master; return 0; } @@ -2262,7 +2304,7 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev) if (ret) { netdev_err(slave_dev, "failed to connect to PHY: %pe\n", ERR_PTR(ret)); - phylink_destroy(dp->pl); + dsa_port_phylink_destroy(dp); } return ret; @@ -2271,9 +2313,9 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev) void dsa_slave_setup_tagger(struct net_device *slave) { struct dsa_port *dp = dsa_slave_to_port(slave); + struct net_device *master = dsa_port_to_master(dp); struct dsa_slave_priv *p = netdev_priv(slave); const struct dsa_port *cpu_dp = dp->cpu_dp; - struct net_device *master = cpu_dp->master; const struct dsa_switch *ds = dp->ds; slave->needed_headroom = cpu_dp->tag_ops->needed_headroom; @@ -2330,8 +2372,7 @@ int dsa_slave_resume(struct net_device *slave_dev) int dsa_slave_create(struct dsa_port *port) { - const struct dsa_port *cpu_dp = port->cpu_dp; - struct net_device *master = cpu_dp->master; + struct net_device *master = dsa_port_to_master(port); struct dsa_switch *ds = port->ds; const char *name = port->name; struct net_device *slave_dev; @@ -2347,6 +2388,7 @@ int dsa_slave_create(struct dsa_port *port) if (slave_dev == NULL) return -ENOMEM; + slave_dev->rtnl_link_ops = &dsa_link_ops; slave_dev->ethtool_ops = &dsa_slave_ethtool_ops; #if IS_ENABLED(CONFIG_DCB) slave_dev->dcbnl_ops = &dsa_slave_dcbnl_ops; @@ -2434,7 +2476,7 @@ out_phy: rtnl_lock(); phylink_disconnect_phy(p->dp->pl); rtnl_unlock(); - phylink_destroy(p->dp->pl); + dsa_port_phylink_destroy(p->dp); out_gcells: gro_cells_destroy(&p->gcells); out_free: @@ -2457,12 +2499,89 @@ void dsa_slave_destroy(struct net_device *slave_dev) phylink_disconnect_phy(dp->pl); rtnl_unlock(); - phylink_destroy(dp->pl); + dsa_port_phylink_destroy(dp); gro_cells_destroy(&p->gcells); free_percpu(slave_dev->tstats); free_netdev(slave_dev); } +int dsa_slave_change_master(struct net_device *dev, struct net_device *master, + struct netlink_ext_ack *extack) +{ + struct net_device *old_master = dsa_slave_to_master(dev); + struct dsa_port *dp = dsa_slave_to_port(dev); + struct dsa_switch *ds = dp->ds; + struct net_device *upper; + struct list_head *iter; + int err; + + if (master == old_master) + return 0; + + if (!ds->ops->port_change_master) { + NL_SET_ERR_MSG_MOD(extack, + "Driver does not support changing DSA master"); + return -EOPNOTSUPP; + } + + if (!netdev_uses_dsa(master)) { + NL_SET_ERR_MSG_MOD(extack, + "Interface not eligible as DSA master"); + return -EOPNOTSUPP; + } + + netdev_for_each_upper_dev_rcu(master, upper, iter) { + if (dsa_slave_dev_check(upper)) + continue; + if (netif_is_bridge_master(upper)) + continue; + NL_SET_ERR_MSG_MOD(extack, "Cannot join master with unknown uppers"); + return -EOPNOTSUPP; + } + + /* Since we allow live-changing the DSA master, plus we auto-open the + * DSA master when the user port opens => we need to ensure that the + * new DSA master is open too. + */ + if (dev->flags & IFF_UP) { + err = dev_open(master, extack); + if (err) + return err; + } + + netdev_upper_dev_unlink(old_master, dev); + + err = netdev_upper_dev_link(master, dev, extack); + if (err) + goto out_revert_old_master_unlink; + + err = dsa_port_change_master(dp, master, extack); + if (err) + goto out_revert_master_link; + + /* Update the MTU of the new CPU port through cross-chip notifiers */ + err = dsa_slave_change_mtu(dev, dev->mtu); + if (err && err != -EOPNOTSUPP) { + netdev_warn(dev, + "nonfatal error updating MTU with new master: %pe\n", + ERR_PTR(err)); + } + + /* If the port doesn't have its own MAC address and relies on the DSA + * master's one, inherit it again from the new DSA master. + */ + if (is_zero_ether_addr(dp->mac)) + eth_hw_addr_inherit(dev, master); + + return 0; + +out_revert_master_link: + netdev_upper_dev_unlink(master, dev); +out_revert_old_master_unlink: + netdev_upper_dev_link(old_master, dev, NULL); + return err; +} + bool dsa_slave_dev_check(const struct net_device *dev) { return dev->netdev_ops == &dsa_slave_netdev_ops; @@ -2476,6 +2595,9 @@ static int dsa_slave_changeupper(struct net_device *dev, struct netlink_ext_ack *extack; int err = NOTIFY_DONE; + if (!dsa_slave_dev_check(dev)) + return err; + extack = netdev_notifier_info_to_extack(&info->info); if (netif_is_bridge_master(info->upper_dev)) { @@ -2531,6 +2653,9 @@ static int dsa_slave_prechangeupper(struct net_device *dev, { struct dsa_port *dp = dsa_slave_to_port(dev); + if (!dsa_slave_dev_check(dev)) + return NOTIFY_DONE; + if (netif_is_bridge_master(info->upper_dev) && !info->linking) dsa_port_pre_bridge_leave(dp, info->upper_dev); else if (netif_is_lag_master(info->upper_dev) && !info->linking) @@ -2551,6 +2676,9 @@ dsa_slave_lag_changeupper(struct net_device *dev, int err = NOTIFY_DONE; struct dsa_port *dp; + if (!netif_is_lag_master(dev)) + return err; + netdev_for_each_lower_dev(dev, lower, iter) { if (!dsa_slave_dev_check(lower)) continue; @@ -2580,6 +2708,9 @@ dsa_slave_lag_prechangeupper(struct net_device *dev, int err = NOTIFY_DONE; struct dsa_port *dp; + if (!netif_is_lag_master(dev)) + return err; + netdev_for_each_lower_dev(dev, lower, iter) { if (!dsa_slave_dev_check(lower)) continue; @@ -2687,6 +2818,277 @@ dsa_slave_prechangeupper_sanity_check(struct net_device *dev, return NOTIFY_DONE; } +/* To be eligible as a DSA master, a LAG must have all lower interfaces be + * eligible DSA masters. Additionally, all LAG slaves must be DSA masters of + * switches in the same switch tree. + */ +static int dsa_lag_master_validate(struct net_device *lag_dev, + struct netlink_ext_ack *extack) +{ + struct net_device *lower1, *lower2; + struct list_head *iter1, *iter2; + + netdev_for_each_lower_dev(lag_dev, lower1, iter1) { + netdev_for_each_lower_dev(lag_dev, lower2, iter2) { + if (!netdev_uses_dsa(lower1) || + !netdev_uses_dsa(lower2)) { + NL_SET_ERR_MSG_MOD(extack, + "All LAG ports must be eligible as DSA masters"); + return notifier_from_errno(-EINVAL); + } + + if (lower1 == lower2) + continue; + + if (!dsa_port_tree_same(lower1->dsa_ptr, + lower2->dsa_ptr)) { + NL_SET_ERR_MSG_MOD(extack, + "LAG contains DSA masters of disjoint switch trees"); + return notifier_from_errno(-EINVAL); + } + } + } + + return NOTIFY_DONE; +} + +static int +dsa_master_prechangeupper_sanity_check(struct net_device *master, + struct netdev_notifier_changeupper_info *info) +{ + struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(&info->info); + + if (!netdev_uses_dsa(master)) + return NOTIFY_DONE; + + if (!info->linking) + return NOTIFY_DONE; + + /* Allow DSA switch uppers */ + if (dsa_slave_dev_check(info->upper_dev)) + return NOTIFY_DONE; + + /* Allow bridge uppers of DSA masters, subject to further + * restrictions in dsa_bridge_prechangelower_sanity_check() + */ + if (netif_is_bridge_master(info->upper_dev)) + return NOTIFY_DONE; + + /* Allow LAG uppers, subject to further restrictions in + * dsa_lag_master_prechangelower_sanity_check() + */ + if (netif_is_lag_master(info->upper_dev)) + return dsa_lag_master_validate(info->upper_dev, extack); + + NL_SET_ERR_MSG_MOD(extack, + "DSA master cannot join unknown upper interfaces"); + return notifier_from_errno(-EBUSY); +} + +static int +dsa_lag_master_prechangelower_sanity_check(struct net_device *dev, + struct netdev_notifier_changeupper_info *info) +{ + struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(&info->info); + struct net_device *lag_dev = info->upper_dev; + struct net_device *lower; + struct list_head *iter; + + if (!netdev_uses_dsa(lag_dev) || !netif_is_lag_master(lag_dev)) + return NOTIFY_DONE; + + if (!info->linking) + return NOTIFY_DONE; + + if (!netdev_uses_dsa(dev)) { + NL_SET_ERR_MSG(extack, + "Only DSA masters can join a LAG DSA master"); + return notifier_from_errno(-EINVAL); + } + + netdev_for_each_lower_dev(lag_dev, lower, iter) { + if (!dsa_port_tree_same(dev->dsa_ptr, lower->dsa_ptr)) { + NL_SET_ERR_MSG(extack, + "Interface is DSA master for a different switch tree than this LAG"); + return notifier_from_errno(-EINVAL); + } + + break; + } + + return NOTIFY_DONE; +} + +/* Don't allow bridging of DSA masters, since the bridge layer rx_handler + * prevents the DSA fake ethertype handler to be invoked, so we don't get the + * chance to strip off and parse the DSA switch tag protocol header (the bridge + * layer just returns RX_HANDLER_CONSUMED, stopping RX processing for these + * frames). + * The only case where that would not be an issue is when bridging can already + * be offloaded, such as when the DSA master is itself a DSA or plain switchdev + * port, and is bridged only with other ports from the same hardware device. + */ +static int +dsa_bridge_prechangelower_sanity_check(struct net_device *new_lower, + struct netdev_notifier_changeupper_info *info) +{ + struct net_device *br = info->upper_dev; + struct netlink_ext_ack *extack; + struct net_device *lower; + struct list_head *iter; + + if (!netif_is_bridge_master(br)) + return NOTIFY_DONE; + + if (!info->linking) + return NOTIFY_DONE; + + extack = netdev_notifier_info_to_extack(&info->info); + + netdev_for_each_lower_dev(br, lower, iter) { + if (!netdev_uses_dsa(new_lower) && !netdev_uses_dsa(lower)) + continue; + + if (!netdev_port_same_parent_id(lower, new_lower)) { + NL_SET_ERR_MSG(extack, + "Cannot do software bridging with a DSA master"); + return notifier_from_errno(-EINVAL); + } + } + + return NOTIFY_DONE; +} + +static void dsa_tree_migrate_ports_from_lag_master(struct dsa_switch_tree *dst, + struct net_device *lag_dev) +{ + struct net_device *new_master = dsa_tree_find_first_master(dst); + struct dsa_port *dp; + int err; + + dsa_tree_for_each_user_port(dp, dst) { + if (dsa_port_to_master(dp) != lag_dev) + continue; + + err = dsa_slave_change_master(dp->slave, new_master, NULL); + if (err) { + netdev_err(dp->slave, + "failed to restore master to %s: %pe\n", + new_master->name, ERR_PTR(err)); + } + } +} + +static int dsa_master_lag_join(struct net_device *master, + struct net_device *lag_dev, + struct netdev_lag_upper_info *uinfo, + struct netlink_ext_ack *extack) +{ + struct dsa_port *cpu_dp = master->dsa_ptr; + struct dsa_switch_tree *dst = cpu_dp->dst; + struct dsa_port *dp; + int err; + + err = dsa_master_lag_setup(lag_dev, cpu_dp, uinfo, extack); + if (err) + return err; + + dsa_tree_for_each_user_port(dp, dst) { + if (dsa_port_to_master(dp) != master) + continue; + + err = dsa_slave_change_master(dp->slave, lag_dev, extack); + if (err) + goto restore; + } + + return 0; + +restore: + dsa_tree_for_each_user_port_continue_reverse(dp, dst) { + if (dsa_port_to_master(dp) != lag_dev) + continue; + + err = dsa_slave_change_master(dp->slave, master, NULL); + if (err) { + netdev_err(dp->slave, + "failed to restore master to %s: %pe\n", + master->name, ERR_PTR(err)); + } + } + + dsa_master_lag_teardown(lag_dev, master->dsa_ptr); + + return err; +} + +static void dsa_master_lag_leave(struct net_device *master, + struct net_device *lag_dev) +{ + struct dsa_port *dp, *cpu_dp = lag_dev->dsa_ptr; + struct dsa_switch_tree *dst = cpu_dp->dst; + struct dsa_port *new_cpu_dp = NULL; + struct net_device *lower; + struct list_head *iter; + + netdev_for_each_lower_dev(lag_dev, lower, iter) { + if (netdev_uses_dsa(lower)) { + new_cpu_dp = lower->dsa_ptr; + break; + } + } + + if (new_cpu_dp) { + /* Update the CPU port of the user ports still under the LAG + * so that dsa_port_to_master() continues to work properly + */ + dsa_tree_for_each_user_port(dp, dst) + if (dsa_port_to_master(dp) == lag_dev) + dp->cpu_dp = new_cpu_dp; + + /* Update the index of the virtual CPU port to match the lowest + * physical CPU port + */ + lag_dev->dsa_ptr = new_cpu_dp; + wmb(); + } else { + /* If the LAG DSA master has no ports left, migrate back all + * user ports to the first physical CPU port + */ + dsa_tree_migrate_ports_from_lag_master(dst, lag_dev); + } + + /* This DSA master has left its LAG in any case, so let + * the CPU port leave the hardware LAG as well + */ + dsa_master_lag_teardown(lag_dev, master->dsa_ptr); +} + +static int dsa_master_changeupper(struct net_device *dev, + struct netdev_notifier_changeupper_info *info) +{ + struct netlink_ext_ack *extack; + int err = NOTIFY_DONE; + + if (!netdev_uses_dsa(dev)) + return err; + + extack = netdev_notifier_info_to_extack(&info->info); + + if (netif_is_lag_master(info->upper_dev)) { + if (info->linking) { + err = dsa_master_lag_join(dev, info->upper_dev, + info->upper_info, extack); + err = notifier_from_errno(err); + } else { + dsa_master_lag_leave(dev, info->upper_dev); + err = NOTIFY_OK; + } + } + + return err; +} + static int dsa_slave_netdevice_event(struct notifier_block *nb, unsigned long event, void *ptr) { @@ -2698,36 +3100,68 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb, int err; err = dsa_slave_prechangeupper_sanity_check(dev, info); - if (err != NOTIFY_DONE) + if (notifier_to_errno(err)) + return err; + + err = dsa_master_prechangeupper_sanity_check(dev, info); + if (notifier_to_errno(err)) + return err; + + err = dsa_lag_master_prechangelower_sanity_check(dev, info); + if (notifier_to_errno(err)) + return err; + + err = dsa_bridge_prechangelower_sanity_check(dev, info); + if (notifier_to_errno(err)) return err; - if (dsa_slave_dev_check(dev)) - return dsa_slave_prechangeupper(dev, ptr); + err = dsa_slave_prechangeupper(dev, ptr); + if (notifier_to_errno(err)) + return err; - if (netif_is_lag_master(dev)) - return dsa_slave_lag_prechangeupper(dev, ptr); + err = dsa_slave_lag_prechangeupper(dev, ptr); + if (notifier_to_errno(err)) + return err; break; } - case NETDEV_CHANGEUPPER: - if (dsa_slave_dev_check(dev)) - return dsa_slave_changeupper(dev, ptr); + case NETDEV_CHANGEUPPER: { + int err; - if (netif_is_lag_master(dev)) - return dsa_slave_lag_changeupper(dev, ptr); + err = dsa_slave_changeupper(dev, ptr); + if (notifier_to_errno(err)) + return err; + + err = dsa_slave_lag_changeupper(dev, ptr); + if (notifier_to_errno(err)) + return err; + + err = dsa_master_changeupper(dev, ptr); + if (notifier_to_errno(err)) + return err; break; + } case NETDEV_CHANGELOWERSTATE: { struct netdev_notifier_changelowerstate_info *info = ptr; struct dsa_port *dp; int err; - if (!dsa_slave_dev_check(dev)) - break; + if (dsa_slave_dev_check(dev)) { + dp = dsa_slave_to_port(dev); - dp = dsa_slave_to_port(dev); + err = dsa_port_lag_change(dp, info->lower_state_info); + } + + /* Mirror LAG port events on DSA masters that are in + * a LAG towards their respective switch CPU ports + */ + if (netdev_uses_dsa(dev)) { + dp = dev->dsa_ptr; + + err = dsa_port_lag_change(dp, info->lower_state_info); + } - err = dsa_port_lag_change(dp, info->lower_state_info); return notifier_from_errno(err); } case NETDEV_CHANGE: @@ -2777,6 +3211,9 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb, if (!dsa_port_is_user(dp)) continue; + if (dp->cpu_dp != cpu_dp) + continue; + list_add(&dp->slave->close_list, &close_list); } diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 4dfd68cf61c5..ce56acdba203 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -398,8 +398,15 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds, dsa_switch_for_each_port(dp, ds) { if (dsa_port_host_address_match(dp, info->dp)) { - err = dsa_port_do_fdb_add(dp, info->addr, info->vid, - info->db); + if (dsa_port_is_cpu(dp) && info->dp->cpu_port_in_lag) { + err = dsa_switch_do_lag_fdb_add(ds, dp->lag, + info->addr, + info->vid, + info->db); + } else { + err = dsa_port_do_fdb_add(dp, info->addr, + info->vid, info->db); + } if (err) break; } @@ -419,8 +426,15 @@ static int dsa_switch_host_fdb_del(struct dsa_switch *ds, dsa_switch_for_each_port(dp, ds) { if (dsa_port_host_address_match(dp, info->dp)) { - err = dsa_port_do_fdb_del(dp, info->addr, info->vid, - info->db); + if (dsa_port_is_cpu(dp) && info->dp->cpu_port_in_lag) { + err = dsa_switch_do_lag_fdb_del(ds, dp->lag, + info->addr, + info->vid, + info->db); + } else { + err = dsa_port_do_fdb_del(dp, info->addr, + info->vid, info->db); + } if (err) break; } @@ -507,12 +521,12 @@ static int dsa_switch_lag_join(struct dsa_switch *ds, { if (info->dp->ds == ds && ds->ops->port_lag_join) return ds->ops->port_lag_join(ds, info->dp->index, info->lag, - info->info); + info->info, info->extack); if (info->dp->ds != ds && ds->ops->crosschip_lag_join) return ds->ops->crosschip_lag_join(ds, info->dp->ds->index, info->dp->index, info->lag, - info->info); + info->info, info->extack); return -EOPNOTSUPP; } diff --git a/net/dsa/tag_8021q.c b/net/dsa/tag_8021q.c index 01a427800797..34e5ec5d3e23 100644 --- a/net/dsa/tag_8021q.c +++ b/net/dsa/tag_8021q.c @@ -2,9 +2,7 @@ /* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com> * * This module is not a complete tagger implementation. It only provides - * primitives for taggers that rely on 802.1Q VLAN tags to use. The - * dsa_8021q_netdev_ops is registered for API compliance and not used - * directly by callers. + * primitives for taggers that rely on 802.1Q VLAN tags to use. */ #include <linux/if_vlan.h> #include <linux/dsa/8021q.h> @@ -332,7 +330,7 @@ static int dsa_tag_8021q_port_setup(struct dsa_switch *ds, int port) if (!dsa_port_is_user(dp)) return 0; - master = dp->cpu_dp->master; + master = dsa_port_to_master(dp); err = dsa_port_tag_8021q_vlan_add(dp, vid, false); if (err) { @@ -361,7 +359,7 @@ static void dsa_tag_8021q_port_teardown(struct dsa_switch *ds, int port) if (!dsa_port_is_user(dp)) return; - master = dp->cpu_dp->master; + master = dsa_port_to_master(dp); dsa_port_tag_8021q_vlan_del(dp, vid, false); diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c index 62b89d6f54fd..e02daa74e833 100644 --- a/net/ethernet/eth.c +++ b/net/ethernet/eth.c @@ -414,12 +414,9 @@ struct sk_buff *eth_gro_receive(struct list_head *head, struct sk_buff *skb) off_eth = skb_gro_offset(skb); hlen = off_eth + sizeof(*eh); - eh = skb_gro_header_fast(skb, off_eth); - if (skb_gro_header_hard(skb, hlen)) { - eh = skb_gro_header_slow(skb, hlen, off_eth); - if (unlikely(!eh)) - goto out; - } + eh = skb_gro_header(skb, hlen, off_eth); + if (unlikely(!eh)) + goto out; flush = 0; diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile index b76432e70e6b..72ab0944262a 100644 --- a/net/ethtool/Makefile +++ b/net/ethtool/Makefile @@ -7,4 +7,5 @@ obj-$(CONFIG_ETHTOOL_NETLINK) += ethtool_nl.o ethtool_nl-y := netlink.o bitset.o strset.o linkinfo.o linkmodes.o \ linkstate.o debug.o wol.o features.o privflags.o rings.o \ channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \ - tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o + tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o \ + pse-pd.o diff --git a/net/ethtool/common.h b/net/ethtool/common.h index 2dc2b80aea5f..c1779657e074 100644 --- a/net/ethtool/common.h +++ b/net/ethtool/common.h @@ -46,6 +46,7 @@ int ethtool_get_max_rxfh_channel(struct net_device *dev, u32 *max); int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info); extern const struct ethtool_phy_ops *ethtool_phy_ops; +extern const struct ethtool_pse_ops *ethtool_pse_ops; int ethtool_get_module_info_call(struct net_device *dev, struct ethtool_modinfo *modinfo); diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 6a7308de192d..57e7238a4136 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -571,6 +571,7 @@ static int ethtool_get_link_ksettings(struct net_device *dev, = __ETHTOOL_LINK_MODE_MASK_NU32; link_ksettings.base.master_slave_cfg = MASTER_SLAVE_CFG_UNSUPPORTED; link_ksettings.base.master_slave_state = MASTER_SLAVE_STATE_UNSUPPORTED; + link_ksettings.base.rate_matching = RATE_MATCH_NONE; return store_link_ksettings_for_user(useraddr, &link_ksettings); } @@ -714,16 +715,16 @@ ethtool_get_drvinfo(struct net_device *dev, struct ethtool_devlink_compat *rsp) const struct ethtool_ops *ops = dev->ethtool_ops; rsp->info.cmd = ETHTOOL_GDRVINFO; - strlcpy(rsp->info.version, UTS_RELEASE, sizeof(rsp->info.version)); + strscpy(rsp->info.version, UTS_RELEASE, sizeof(rsp->info.version)); if (ops->get_drvinfo) { ops->get_drvinfo(dev, &rsp->info); } else if (dev->dev.parent && dev->dev.parent->driver) { - strlcpy(rsp->info.bus_info, dev_name(dev->dev.parent), + strscpy(rsp->info.bus_info, dev_name(dev->dev.parent), sizeof(rsp->info.bus_info)); - strlcpy(rsp->info.driver, dev->dev.parent->driver->name, + strscpy(rsp->info.driver, dev->dev.parent->driver->name, sizeof(rsp->info.driver)); } else if (dev->rtnl_link_ops) { - strlcpy(rsp->info.driver, dev->rtnl_link_ops->kind, + strscpy(rsp->info.driver, dev->rtnl_link_ops->kind, sizeof(rsp->info.driver)); } else { return -EOPNOTSUPP; diff --git a/net/ethtool/linkmodes.c b/net/ethtool/linkmodes.c index 99b29b4fe947..126e06c713a3 100644 --- a/net/ethtool/linkmodes.c +++ b/net/ethtool/linkmodes.c @@ -70,6 +70,7 @@ static int linkmodes_reply_size(const struct ethnl_req_info *req_base, + nla_total_size(sizeof(u32)) /* LINKMODES_SPEED */ + nla_total_size(sizeof(u32)) /* LINKMODES_LANES */ + nla_total_size(sizeof(u8)) /* LINKMODES_DUPLEX */ + + nla_total_size(sizeof(u8)) /* LINKMODES_RATE_MATCHING */ + 0; ret = ethnl_bitset_size(ksettings->link_modes.advertising, ksettings->link_modes.supported, @@ -143,6 +144,10 @@ static int linkmodes_fill_reply(struct sk_buff *skb, lsettings->master_slave_state)) return -EMSGSIZE; + if (nla_put_u8(skb, ETHTOOL_A_LINKMODES_RATE_MATCHING, + lsettings->rate_matching)) + return -EMSGSIZE; + return 0; } diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c index e26079e11835..1a4c11356c96 100644 --- a/net/ethtool/netlink.c +++ b/net/ethtool/netlink.c @@ -286,6 +286,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = { [ETHTOOL_MSG_STATS_GET] = ðnl_stats_request_ops, [ETHTOOL_MSG_PHC_VCLOCKS_GET] = ðnl_phc_vclocks_request_ops, [ETHTOOL_MSG_MODULE_GET] = ðnl_module_request_ops, + [ETHTOOL_MSG_PSE_GET] = ðnl_pse_request_ops, }; static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb) @@ -361,6 +362,9 @@ static int ethnl_default_doit(struct sk_buff *skb, struct genl_info *info) ops = ethnl_default_requests[cmd]; if (WARN_ONCE(!ops, "cmd %u has no ethnl_request_ops\n", cmd)) return -EOPNOTSUPP; + if (GENL_REQ_ATTR_CHECK(info, ops->hdr_attr)) + return -EINVAL; + req_info = kzalloc(ops->req_info_size, GFP_KERNEL); if (!req_info) return -ENOMEM; @@ -1020,6 +1024,22 @@ static const struct genl_ops ethtool_genl_ops[] = { .policy = ethnl_module_set_policy, .maxattr = ARRAY_SIZE(ethnl_module_set_policy) - 1, }, + { + .cmd = ETHTOOL_MSG_PSE_GET, + .doit = ethnl_default_doit, + .start = ethnl_default_start, + .dumpit = ethnl_default_dumpit, + .done = ethnl_default_done, + .policy = ethnl_pse_get_policy, + .maxattr = ARRAY_SIZE(ethnl_pse_get_policy) - 1, + }, + { + .cmd = ETHTOOL_MSG_PSE_SET, + .flags = GENL_UNS_ADMIN_PERM, + .doit = ethnl_set_pse, + .policy = ethnl_pse_set_policy, + .maxattr = ARRAY_SIZE(ethnl_pse_set_policy) - 1, + }, }; static const struct genl_multicast_group ethtool_nl_mcgrps[] = { @@ -1033,6 +1053,7 @@ static struct genl_family ethtool_genl_family __ro_after_init = { .parallel_ops = true, .ops = ethtool_genl_ops, .n_ops = ARRAY_SIZE(ethtool_genl_ops), + .resv_start_op = ETHTOOL_MSG_MODULE_GET + 1, .mcgrps = ethtool_nl_mcgrps, .n_mcgrps = ARRAY_SIZE(ethtool_nl_mcgrps), }; diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h index c0d587611854..1bfd374f9718 100644 --- a/net/ethtool/netlink.h +++ b/net/ethtool/netlink.h @@ -345,6 +345,7 @@ extern const struct ethnl_request_ops ethnl_module_eeprom_request_ops; extern const struct ethnl_request_ops ethnl_stats_request_ops; extern const struct ethnl_request_ops ethnl_phc_vclocks_request_ops; extern const struct ethnl_request_ops ethnl_module_request_ops; +extern const struct ethnl_request_ops ethnl_pse_request_ops; extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1]; extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1]; @@ -383,6 +384,8 @@ extern const struct nla_policy ethnl_stats_get_policy[ETHTOOL_A_STATS_GROUPS + 1 extern const struct nla_policy ethnl_phc_vclocks_get_policy[ETHTOOL_A_PHC_VCLOCKS_HEADER + 1]; extern const struct nla_policy ethnl_module_get_policy[ETHTOOL_A_MODULE_HEADER + 1]; extern const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MODE_POLICY + 1]; +extern const struct nla_policy ethnl_pse_get_policy[ETHTOOL_A_PSE_HEADER + 1]; +extern const struct nla_policy ethnl_pse_set_policy[ETHTOOL_A_PSE_MAX + 1]; int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info); int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info); @@ -402,6 +405,7 @@ int ethnl_tunnel_info_start(struct netlink_callback *cb); int ethnl_tunnel_info_dumpit(struct sk_buff *skb, struct netlink_callback *cb); int ethnl_set_fec(struct sk_buff *skb, struct genl_info *info); int ethnl_set_module(struct sk_buff *skb, struct genl_info *info); +int ethnl_set_pse(struct sk_buff *skb, struct genl_info *info); extern const char stats_std_names[__ETHTOOL_STATS_CNT][ETH_GSTRING_LEN]; extern const char stats_eth_phy_names[__ETHTOOL_A_STATS_ETH_PHY_CNT][ETH_GSTRING_LEN]; diff --git a/net/ethtool/pse-pd.c b/net/ethtool/pse-pd.c new file mode 100644 index 000000000000..5a471e115b66 --- /dev/null +++ b/net/ethtool/pse-pd.c @@ -0,0 +1,185 @@ +// SPDX-License-Identifier: GPL-2.0-only +// +// ethtool interface for for Ethernet PSE (Power Sourcing Equipment) +// and PD (Powered Device) +// +// Copyright (c) 2022 Pengutronix, Oleksij Rempel <kernel@pengutronix.de> +// + +#include "common.h" +#include "linux/pse-pd/pse.h" +#include "netlink.h" +#include <linux/ethtool_netlink.h> +#include <linux/ethtool.h> +#include <linux/phy.h> + +struct pse_req_info { + struct ethnl_req_info base; +}; + +struct pse_reply_data { + struct ethnl_reply_data base; + struct pse_control_status status; +}; + +#define PSE_REPDATA(__reply_base) \ + container_of(__reply_base, struct pse_reply_data, base) + +/* PSE_GET */ + +const struct nla_policy ethnl_pse_get_policy[ETHTOOL_A_PSE_HEADER + 1] = { + [ETHTOOL_A_PSE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy), +}; + +static int pse_get_pse_attributes(struct net_device *dev, + struct netlink_ext_ack *extack, + struct pse_reply_data *data) +{ + struct phy_device *phydev = dev->phydev; + + if (!phydev) { + NL_SET_ERR_MSG(extack, "No PHY is attached"); + return -EOPNOTSUPP; + } + + if (!phydev->psec) { + NL_SET_ERR_MSG(extack, "No PSE is attached"); + return -EOPNOTSUPP; + } + + memset(&data->status, 0, sizeof(data->status)); + + return pse_ethtool_get_status(phydev->psec, extack, &data->status); +} + +static int pse_prepare_data(const struct ethnl_req_info *req_base, + struct ethnl_reply_data *reply_base, + struct genl_info *info) +{ + struct pse_reply_data *data = PSE_REPDATA(reply_base); + struct net_device *dev = reply_base->dev; + int ret; + + ret = ethnl_ops_begin(dev); + if (ret < 0) + return ret; + + ret = pse_get_pse_attributes(dev, info->extack, data); + + ethnl_ops_complete(dev); + + return ret; +} + +static int pse_reply_size(const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + const struct pse_reply_data *data = PSE_REPDATA(reply_base); + const struct pse_control_status *st = &data->status; + int len = 0; + + if (st->podl_admin_state > 0) + len += nla_total_size(sizeof(u32)); /* _PODL_PSE_ADMIN_STATE */ + if (st->podl_pw_status > 0) + len += nla_total_size(sizeof(u32)); /* _PODL_PSE_PW_D_STATUS */ + + return len; +} + +static int pse_fill_reply(struct sk_buff *skb, + const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + const struct pse_reply_data *data = PSE_REPDATA(reply_base); + const struct pse_control_status *st = &data->status; + + if (st->podl_admin_state > 0 && + nla_put_u32(skb, ETHTOOL_A_PODL_PSE_ADMIN_STATE, + st->podl_admin_state)) + return -EMSGSIZE; + + if (st->podl_pw_status > 0 && + nla_put_u32(skb, ETHTOOL_A_PODL_PSE_PW_D_STATUS, + st->podl_pw_status)) + return -EMSGSIZE; + + return 0; +} + +const struct ethnl_request_ops ethnl_pse_request_ops = { + .request_cmd = ETHTOOL_MSG_PSE_GET, + .reply_cmd = ETHTOOL_MSG_PSE_GET_REPLY, + .hdr_attr = ETHTOOL_A_PSE_HEADER, + .req_info_size = sizeof(struct pse_req_info), + .reply_data_size = sizeof(struct pse_reply_data), + + .prepare_data = pse_prepare_data, + .reply_size = pse_reply_size, + .fill_reply = pse_fill_reply, +}; + +/* PSE_SET */ + +const struct nla_policy ethnl_pse_set_policy[ETHTOOL_A_PSE_MAX + 1] = { + [ETHTOOL_A_PSE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy), + [ETHTOOL_A_PODL_PSE_ADMIN_CONTROL] = + NLA_POLICY_RANGE(NLA_U32, ETHTOOL_PODL_PSE_ADMIN_STATE_DISABLED, + ETHTOOL_PODL_PSE_ADMIN_STATE_ENABLED), +}; + +static int pse_set_pse_config(struct net_device *dev, + struct netlink_ext_ack *extack, + struct nlattr **tb) +{ + struct phy_device *phydev = dev->phydev; + struct pse_control_config config = {}; + + /* Optional attribute. Do not return error if not set. */ + if (!tb[ETHTOOL_A_PODL_PSE_ADMIN_CONTROL]) + return 0; + + /* this values are already validated by the ethnl_pse_set_policy */ + config.admin_cotrol = nla_get_u32(tb[ETHTOOL_A_PODL_PSE_ADMIN_CONTROL]); + + if (!phydev) { + NL_SET_ERR_MSG(extack, "No PHY is attached"); + return -EOPNOTSUPP; + } + + if (!phydev->psec) { + NL_SET_ERR_MSG(extack, "No PSE is attached"); + return -EOPNOTSUPP; + } + + return pse_ethtool_set_config(phydev->psec, extack, &config); +} + +int ethnl_set_pse(struct sk_buff *skb, struct genl_info *info) +{ + struct ethnl_req_info req_info = {}; + struct nlattr **tb = info->attrs; + struct net_device *dev; + int ret; + + ret = ethnl_parse_header_dev_get(&req_info, tb[ETHTOOL_A_PSE_HEADER], + genl_info_net(info), info->extack, + true); + if (ret < 0) + return ret; + + dev = req_info.dev; + + rtnl_lock(); + ret = ethnl_ops_begin(dev); + if (ret < 0) + goto out_rtnl; + + ret = pse_set_pse_config(dev, info->extack, tb); + ethnl_ops_complete(dev); +out_rtnl: + rtnl_unlock(); + + ethnl_parse_header_dev_put(&req_info); + + return ret; +} diff --git a/net/ethtool/strset.c b/net/ethtool/strset.c index 2d51b7ab4dc5..3f7de54d85fb 100644 --- a/net/ethtool/strset.c +++ b/net/ethtool/strset.c @@ -167,7 +167,7 @@ static int strset_get_id(const struct nlattr *nest, u32 *val, get_stringset_policy, extack); if (ret < 0) return ret; - if (!tb[ETHTOOL_A_STRINGSET_ID]) + if (NL_REQ_ATTR_CHECK(extack, nest, tb, ETHTOOL_A_STRINGSET_ID)) return -EINVAL; *val = nla_get_u32(tb[ETHTOOL_A_STRINGSET_ID]); diff --git a/net/ethtool/tunnels.c b/net/ethtool/tunnels.c index efde33536687..67fb414ca859 100644 --- a/net/ethtool/tunnels.c +++ b/net/ethtool/tunnels.c @@ -136,6 +136,8 @@ ethnl_tunnel_info_fill_reply(const struct ethnl_req_info *req_base, goto err_cancel_table; entry = nla_nest_start(skb, ETHTOOL_A_TUNNEL_UDP_TABLE_ENTRY); + if (!entry) + goto err_cancel_entry; if (nla_put_be16(skb, ETHTOOL_A_TUNNEL_UDP_ENTRY_PORT, htons(IANA_VXLAN_UDP_PORT)) || diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c index 1405c037cf7a..7174a9092900 100644 --- a/net/hsr/hsr_netlink.c +++ b/net/hsr/hsr_netlink.c @@ -522,6 +522,7 @@ static struct genl_family hsr_genl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = hsr_ops, .n_small_ops = ARRAY_SIZE(hsr_ops), + .resv_start_op = HSR_C_SET_NODE_LIST + 1, .mcgrps = hsr_mcgrps, .n_mcgrps = ARRAY_SIZE(hsr_mcgrps), }; diff --git a/net/ieee802154/netlink.c b/net/ieee802154/netlink.c index b07abc38b4b3..7d2de4ee6992 100644 --- a/net/ieee802154/netlink.c +++ b/net/ieee802154/netlink.c @@ -132,6 +132,7 @@ struct genl_family nl802154_family __ro_after_init = { .module = THIS_MODULE, .small_ops = ieee802154_ops, .n_small_ops = ARRAY_SIZE(ieee802154_ops), + .resv_start_op = IEEE802154_LLSEC_DEL_SECLEVEL + 1, .mcgrps = ieee802154_mcgrps, .n_mcgrps = ARRAY_SIZE(ieee802154_mcgrps), }; diff --git a/net/ieee802154/nl802154.c b/net/ieee802154/nl802154.c index e0b072aecf0f..38c4f3cb010e 100644 --- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -2500,6 +2500,7 @@ static struct genl_family nl802154_fam __ro_after_init = { .module = THIS_MODULE, .ops = nl802154_ops, .n_ops = ARRAY_SIZE(nl802154_ops), + .resv_start_op = NL802154_CMD_DEL_SEC_LEVEL + 1, .mcgrps = nl802154_mcgrps, .n_mcgrps = ARRAY_SIZE(nl802154_mcgrps), }; diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c index 7889e1ef7fad..cbd0e2ac4ffe 100644 --- a/net/ieee802154/socket.c +++ b/net/ieee802154/socket.c @@ -251,6 +251,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) return -EOPNOTSUPP; } + if (!size) + return -EINVAL; + lock_sock(sk); if (!sk->sk_bound_dev_if) dev = dev_getfirstbyhwtype(sock_net(sk), ARPHRD_IEEE802154); diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 3ca0cc467886..e2c219382345 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1219,6 +1219,7 @@ EXPORT_SYMBOL(inet_unregister_protosw); static int inet_sk_reselect_saddr(struct sock *sk) { + struct inet_bind_hashbucket *prev_addr_hashbucket; struct inet_sock *inet = inet_sk(sk); __be32 old_saddr = inet->inet_saddr; __be32 daddr = inet->inet_daddr; @@ -1226,6 +1227,7 @@ static int inet_sk_reselect_saddr(struct sock *sk) struct rtable *rt; __be32 new_saddr; struct ip_options_rcu *inet_opt; + int err; inet_opt = rcu_dereference_protected(inet->inet_opt, lockdep_sock_is_held(sk)); @@ -1240,20 +1242,34 @@ static int inet_sk_reselect_saddr(struct sock *sk) if (IS_ERR(rt)) return PTR_ERR(rt); - sk_setup_caps(sk, &rt->dst); - new_saddr = fl4->saddr; - if (new_saddr == old_saddr) + if (new_saddr == old_saddr) { + sk_setup_caps(sk, &rt->dst); return 0; + } + + prev_addr_hashbucket = + inet_bhashfn_portaddr(tcp_or_dccp_get_hashinfo(sk), sk, + sock_net(sk), inet->inet_num); + + inet->inet_saddr = inet->inet_rcv_saddr = new_saddr; + + err = inet_bhash2_update_saddr(prev_addr_hashbucket, sk); + if (err) { + inet->inet_saddr = old_saddr; + inet->inet_rcv_saddr = old_saddr; + ip_rt_put(rt); + return err; + } + + sk_setup_caps(sk, &rt->dst); if (READ_ONCE(sock_net(sk)->ipv4.sysctl_ip_dynaddr) > 1) { pr_info("%s(): shifting inet->saddr from %pI4 to %pI4\n", __func__, &old_saddr, &new_saddr); } - inet->inet_saddr = inet->inet_rcv_saddr = new_saddr; - /* * XXX The only one ugly spot where we need to * XXX really change the sockets identity after @@ -1448,12 +1464,9 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb) off = skb_gro_offset(skb); hlen = off + sizeof(*iph); - iph = skb_gro_header_fast(skb, off); - if (skb_gro_header_hard(skb, hlen)) { - iph = skb_gro_header_slow(skb, hlen, off); - if (unlikely(!iph)) - goto out; - } + iph = skb_gro_header(skb, hlen, off); + if (unlikely(!iph)) + goto out; proto = iph->protocol; diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index f8ad04470d3a..ee4e578c7f20 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -471,30 +471,38 @@ static int ah4_err(struct sk_buff *skb, u32 info) return 0; } -static int ah_init_state(struct xfrm_state *x) +static int ah_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { struct ah_data *ahp = NULL; struct xfrm_algo_desc *aalg_desc; struct crypto_ahash *ahash; - if (!x->aalg) + if (!x->aalg) { + NL_SET_ERR_MSG(extack, "AH requires a state with an AUTH algorithm"); goto error; + } - if (x->encap) + if (x->encap) { + NL_SET_ERR_MSG(extack, "AH is not compatible with encapsulation"); goto error; + } ahp = kzalloc(sizeof(*ahp), GFP_KERNEL); if (!ahp) return -ENOMEM; ahash = crypto_alloc_ahash(x->aalg->alg_name, 0, 0); - if (IS_ERR(ahash)) + if (IS_ERR(ahash)) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; + } ahp->ahash = ahash; if (crypto_ahash_setkey(ahash, x->aalg->alg_key, - (x->aalg->alg_key_len + 7) / 8)) + (x->aalg->alg_key_len + 7) / 8)) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; + } /* * Lookup the algorithm description maintained by xfrm_algo, @@ -507,10 +515,7 @@ static int ah_init_state(struct xfrm_state *x) if (aalg_desc->uinfo.auth.icv_fullbits/8 != crypto_ahash_digestsize(ahash)) { - pr_info("%s: %s digestsize %u != %u\n", - __func__, x->aalg->alg_name, - crypto_ahash_digestsize(ahash), - aalg_desc->uinfo.auth.icv_fullbits / 8); + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; } diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c index 87c7e3fc5197..4f7237661afb 100644 --- a/net/ipv4/arp.c +++ b/net/ipv4/arp.c @@ -1129,7 +1129,7 @@ static int arp_req_get(struct arpreq *r, struct net_device *dev) r->arp_flags = arp_state_to_flags(neigh); read_unlock_bh(&neigh->lock); r->arp_ha.sa_family = dev->type; - strlcpy(r->arp_dev, dev->name, sizeof(r->arp_dev)); + strscpy(r->arp_dev, dev->name, sizeof(r->arp_dev)); err = 0; } neigh_release(neigh); diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index 85a9e500c42d..6da16ae6a962 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -124,7 +124,7 @@ static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log, return -EACCES; } - return NOT_INIT; + return 0; } BPF_CALL_2(bpf_tcp_send_ack, struct tcp_sock *, tp, u32, rcv_nxt) diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c index ffd57523331f..405a8c2aea64 100644 --- a/net/ipv4/datagram.c +++ b/net/ipv4/datagram.c @@ -42,6 +42,8 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len oif = inet->mc_index; if (!saddr) saddr = inet->mc_addr; + } else if (!oif) { + oif = inet->uc_index; } fl4 = &inet->cork.fl.u.ip4; rt = ip_route_connect(fl4, usin->sin_addr.s_addr, saddr, oif, diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 5c03eba787e5..52c8047efedb 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -134,6 +134,7 @@ static void esp_free_tcp_sk(struct rcu_head *head) static struct sock *esp_find_tcp_sk(struct xfrm_state *x) { struct xfrm_encap_tmpl *encap = x->encap; + struct net *net = xs_net(x); struct esp_tcp_sk *esk; __be16 sport, dport; struct sock *nsk; @@ -160,7 +161,7 @@ static struct sock *esp_find_tcp_sk(struct xfrm_state *x) } spin_unlock_bh(&x->lock); - sk = inet_lookup_established(xs_net(x), &tcp_hashinfo, x->id.daddr.a4, + sk = inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, x->id.daddr.a4, dport, x->props.saddr.a4, sport, 0); if (!sk) return ERR_PTR(-ENOENT); @@ -1007,16 +1008,17 @@ static void esp_destroy(struct xfrm_state *x) crypto_free_aead(aead); } -static int esp_init_aead(struct xfrm_state *x) +static int esp_init_aead(struct xfrm_state *x, struct netlink_ext_ack *extack) { char aead_name[CRYPTO_MAX_ALG_NAME]; struct crypto_aead *aead; int err; - err = -ENAMETOOLONG; if (snprintf(aead_name, CRYPTO_MAX_ALG_NAME, "%s(%s)", - x->geniv, x->aead->alg_name) >= CRYPTO_MAX_ALG_NAME) - goto error; + x->geniv, x->aead->alg_name) >= CRYPTO_MAX_ALG_NAME) { + NL_SET_ERR_MSG(extack, "Algorithm name is too long"); + return -ENAMETOOLONG; + } aead = crypto_alloc_aead(aead_name, 0, 0); err = PTR_ERR(aead); @@ -1034,11 +1036,15 @@ static int esp_init_aead(struct xfrm_state *x) if (err) goto error; + return 0; + error: + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); return err; } -static int esp_init_authenc(struct xfrm_state *x) +static int esp_init_authenc(struct xfrm_state *x, + struct netlink_ext_ack *extack) { struct crypto_aead *aead; struct crypto_authenc_key_param *param; @@ -1049,10 +1055,6 @@ static int esp_init_authenc(struct xfrm_state *x) unsigned int keylen; int err; - err = -EINVAL; - if (!x->ealg) - goto error; - err = -ENAMETOOLONG; if ((x->props.flags & XFRM_STATE_ESN)) { @@ -1061,22 +1063,28 @@ static int esp_init_authenc(struct xfrm_state *x) x->geniv ?: "", x->geniv ? "(" : "", x->aalg ? x->aalg->alg_name : "digest_null", x->ealg->alg_name, - x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) + x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) { + NL_SET_ERR_MSG(extack, "Algorithm name is too long"); goto error; + } } else { if (snprintf(authenc_name, CRYPTO_MAX_ALG_NAME, "%s%sauthenc(%s,%s)%s", x->geniv ?: "", x->geniv ? "(" : "", x->aalg ? x->aalg->alg_name : "digest_null", x->ealg->alg_name, - x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) + x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) { + NL_SET_ERR_MSG(extack, "Algorithm name is too long"); goto error; + } } aead = crypto_alloc_aead(authenc_name, 0, 0); err = PTR_ERR(aead); - if (IS_ERR(aead)) + if (IS_ERR(aead)) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; + } x->data = aead; @@ -1106,17 +1114,16 @@ static int esp_init_authenc(struct xfrm_state *x) err = -EINVAL; if (aalg_desc->uinfo.auth.icv_fullbits / 8 != crypto_aead_authsize(aead)) { - pr_info("ESP: %s digestsize %u != %u\n", - x->aalg->alg_name, - crypto_aead_authsize(aead), - aalg_desc->uinfo.auth.icv_fullbits / 8); + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto free_key; } err = crypto_aead_setauthsize( aead, x->aalg->alg_trunc_len / 8); - if (err) + if (err) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto free_key; + } } param->enckeylen = cpu_to_be32((x->ealg->alg_key_len + 7) / 8); @@ -1131,7 +1138,7 @@ error: return err; } -static int esp_init_state(struct xfrm_state *x) +static int esp_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { struct crypto_aead *aead; u32 align; @@ -1139,10 +1146,14 @@ static int esp_init_state(struct xfrm_state *x) x->data = NULL; - if (x->aead) - err = esp_init_aead(x); - else - err = esp_init_authenc(x); + if (x->aead) { + err = esp_init_aead(x, extack); + } else if (x->ealg) { + err = esp_init_authenc(x, extack); + } else { + NL_SET_ERR_MSG(extack, "ESP: AEAD or CRYPT must be provided"); + err = -EINVAL; + } if (err) goto error; @@ -1160,6 +1171,7 @@ static int esp_init_state(struct xfrm_state *x) switch (encap->encap_type) { default: + NL_SET_ERR_MSG(extack, "Unsupported encapsulation type for ESP"); err = -EINVAL; goto error; case UDP_ENCAP_ESPINUDP: diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c index 935026f4c807..170152772d33 100644 --- a/net/ipv4/esp4_offload.c +++ b/net/ipv4/esp4_offload.c @@ -110,7 +110,10 @@ static struct sk_buff *xfrm4_tunnel_gso_segment(struct xfrm_state *x, struct sk_buff *skb, netdev_features_t features) { - return skb_eth_gso_segment(skb, features, htons(ETH_P_IP)); + __be16 type = x->inner_mode.family == AF_INET6 ? htons(ETH_P_IPV6) + : htons(ETH_P_IP); + + return skb_eth_gso_segment(skb, features, type); } static struct sk_buff *xfrm4_transport_gso_segment(struct xfrm_state *x, diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c index 025a33c1b04d..0c3c6d0cee29 100644 --- a/net/ipv4/fou.c +++ b/net/ipv4/fou.c @@ -323,12 +323,9 @@ static struct sk_buff *gue_gro_receive(struct sock *sk, off = skb_gro_offset(skb); len = off + sizeof(*guehdr); - guehdr = skb_gro_header_fast(skb, off); - if (skb_gro_header_hard(skb, len)) { - guehdr = skb_gro_header_slow(skb, len, off); - if (unlikely(!guehdr)) - goto out; - } + guehdr = skb_gro_header(skb, len, off); + if (unlikely(!guehdr)) + goto out; switch (guehdr->version) { case 0: @@ -931,6 +928,7 @@ static struct genl_family fou_nl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = fou_nl_ops, .n_small_ops = ARRAY_SIZE(fou_nl_ops), + .resv_start_op = FOU_CMD_GET + 1, }; size_t fou_encap_hlen(struct ip_tunnel_encap *e) diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c index 07073fa35205..2b9cb5398335 100644 --- a/net/ipv4/gre_offload.c +++ b/net/ipv4/gre_offload.c @@ -137,12 +137,9 @@ static struct sk_buff *gre_gro_receive(struct list_head *head, off = skb_gro_offset(skb); hlen = off + sizeof(*greh); - greh = skb_gro_header_fast(skb, off); - if (skb_gro_header_hard(skb, hlen)) { - greh = skb_gro_header_slow(skb, hlen, off); - if (unlikely(!greh)) - goto out; - } + greh = skb_gro_header(skb, hlen, off); + if (unlikely(!greh)) + goto out; /* Only support version 0 and K (key), C (csum) flags. Note that * although the support for the S (seq#) flag can be added easily diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index e3ab0cb61624..df0660d818ac 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -2529,11 +2529,10 @@ done: err = ip_mc_leave_group(sk, &imr); return err; } - int ip_mc_msfget(struct sock *sk, struct ip_msfilter *msf, - struct ip_msfilter __user *optval, int __user *optlen) + sockptr_t optval, sockptr_t optlen) { - int err, len, count, copycount; + int err, len, count, copycount, msf_size; struct ip_mreqn imr; __be32 addr = msf->imsf_multiaddr; struct ip_mc_socklist *pmc; @@ -2575,12 +2574,15 @@ int ip_mc_msfget(struct sock *sk, struct ip_msfilter *msf, copycount = count < msf->imsf_numsrc ? count : msf->imsf_numsrc; len = flex_array_size(psl, sl_addr, copycount); msf->imsf_numsrc = count; - if (put_user(IP_MSFILTER_SIZE(copycount), optlen) || - copy_to_user(optval, msf, IP_MSFILTER_SIZE(0))) { + msf_size = IP_MSFILTER_SIZE(copycount); + if (copy_to_sockptr(optlen, &msf_size, sizeof(int)) || + copy_to_sockptr(optval, msf, IP_MSFILTER_SIZE(0))) { return -EFAULT; } if (len && - copy_to_user(&optval->imsf_slist_flex[0], psl->sl_addr, len)) + copy_to_sockptr_offset(optval, + offsetof(struct ip_msfilter, imsf_slist_flex), + psl->sl_addr, len)) return -EFAULT; return 0; done: @@ -2588,7 +2590,7 @@ done: } int ip_mc_gsfget(struct sock *sk, struct group_filter *gsf, - struct sockaddr_storage __user *p) + sockptr_t optval, size_t ss_offset) { int i, count, copycount; struct sockaddr_in *psin; @@ -2618,15 +2620,17 @@ int ip_mc_gsfget(struct sock *sk, struct group_filter *gsf, count = psl ? psl->sl_count : 0; copycount = count < gsf->gf_numsrc ? count : gsf->gf_numsrc; gsf->gf_numsrc = count; - for (i = 0; i < copycount; i++, p++) { + for (i = 0; i < copycount; i++) { struct sockaddr_storage ss; psin = (struct sockaddr_in *)&ss; memset(&ss, 0, sizeof(ss)); psin->sin_family = AF_INET; psin->sin_addr.s_addr = psl->sl_addr[i]; - if (copy_to_user(p, &ss, sizeof(ss))) + if (copy_to_sockptr_offset(optval, ss_offset, + &ss, sizeof(ss))) return -EFAULT; + ss_offset += sizeof(ss); } return 0; } diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index eb31c7158b39..ebca860e113f 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -130,14 +130,75 @@ void inet_get_local_port_range(struct net *net, int *low, int *high) } EXPORT_SYMBOL(inet_get_local_port_range); +static bool inet_use_bhash2_on_bind(const struct sock *sk) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family == AF_INET6) { + int addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr); + + return addr_type != IPV6_ADDR_ANY && + addr_type != IPV6_ADDR_MAPPED; + } +#endif + return sk->sk_rcv_saddr != htonl(INADDR_ANY); +} + +static bool inet_bind_conflict(const struct sock *sk, struct sock *sk2, + kuid_t sk_uid, bool relax, + bool reuseport_cb_ok, bool reuseport_ok) +{ + int bound_dev_if2; + + if (sk == sk2) + return false; + + bound_dev_if2 = READ_ONCE(sk2->sk_bound_dev_if); + + if (!sk->sk_bound_dev_if || !bound_dev_if2 || + sk->sk_bound_dev_if == bound_dev_if2) { + if (sk->sk_reuse && sk2->sk_reuse && + sk2->sk_state != TCP_LISTEN) { + if (!relax || (!reuseport_ok && sk->sk_reuseport && + sk2->sk_reuseport && reuseport_cb_ok && + (sk2->sk_state == TCP_TIME_WAIT || + uid_eq(sk_uid, sock_i_uid(sk2))))) + return true; + } else if (!reuseport_ok || !sk->sk_reuseport || + !sk2->sk_reuseport || !reuseport_cb_ok || + (sk2->sk_state != TCP_TIME_WAIT && + !uid_eq(sk_uid, sock_i_uid(sk2)))) { + return true; + } + } + return false; +} + +static bool inet_bhash2_conflict(const struct sock *sk, + const struct inet_bind2_bucket *tb2, + kuid_t sk_uid, + bool relax, bool reuseport_cb_ok, + bool reuseport_ok) +{ + struct sock *sk2; + + sk_for_each_bound_bhash2(sk2, &tb2->owners) { + if (sk->sk_family == AF_INET && ipv6_only_sock(sk2)) + continue; + + if (inet_bind_conflict(sk, sk2, sk_uid, relax, + reuseport_cb_ok, reuseport_ok)) + return true; + } + return false; +} + +/* This should be called only when the tb and tb2 hashbuckets' locks are held */ static int inet_csk_bind_conflict(const struct sock *sk, const struct inet_bind_bucket *tb, + const struct inet_bind2_bucket *tb2, /* may be null */ bool relax, bool reuseport_ok) { - struct sock *sk2; bool reuseport_cb_ok; - bool reuse = sk->sk_reuse; - bool reuseport = !!sk->sk_reuseport; struct sock_reuseport *reuseport_cb; kuid_t uid = sock_i_uid((struct sock *)sk); @@ -150,58 +211,88 @@ static int inet_csk_bind_conflict(const struct sock *sk, /* * Unlike other sk lookup places we do not check * for sk_net here, since _all_ the socks listed - * in tb->owners list belong to the same net - the - * one this bucket belongs to. + * in tb->owners and tb2->owners list belong + * to the same net - the one this bucket belongs to. */ - sk_for_each_bound(sk2, &tb->owners) { - int bound_dev_if2; + if (!inet_use_bhash2_on_bind(sk)) { + struct sock *sk2; - if (sk == sk2) - continue; - bound_dev_if2 = READ_ONCE(sk2->sk_bound_dev_if); - if ((!sk->sk_bound_dev_if || - !bound_dev_if2 || - sk->sk_bound_dev_if == bound_dev_if2)) { - if (reuse && sk2->sk_reuse && - sk2->sk_state != TCP_LISTEN) { - if ((!relax || - (!reuseport_ok && - reuseport && sk2->sk_reuseport && - reuseport_cb_ok && - (sk2->sk_state == TCP_TIME_WAIT || - uid_eq(uid, sock_i_uid(sk2))))) && - inet_rcv_saddr_equal(sk, sk2, true)) - break; - } else if (!reuseport_ok || - !reuseport || !sk2->sk_reuseport || - !reuseport_cb_ok || - (sk2->sk_state != TCP_TIME_WAIT && - !uid_eq(uid, sock_i_uid(sk2)))) { - if (inet_rcv_saddr_equal(sk, sk2, true)) - break; - } - } + sk_for_each_bound(sk2, &tb->owners) + if (inet_bind_conflict(sk, sk2, uid, relax, + reuseport_cb_ok, reuseport_ok) && + inet_rcv_saddr_equal(sk, sk2, true)) + return true; + + return false; + } + + /* Conflicts with an existing IPV6_ADDR_ANY (if ipv6) or INADDR_ANY (if + * ipv4) should have been checked already. We need to do these two + * checks separately because their spinlocks have to be acquired/released + * independently of each other, to prevent possible deadlocks + */ + return tb2 && inet_bhash2_conflict(sk, tb2, uid, relax, reuseport_cb_ok, + reuseport_ok); +} + +/* Determine if there is a bind conflict with an existing IPV6_ADDR_ANY (if ipv6) or + * INADDR_ANY (if ipv4) socket. + * + * Caller must hold bhash hashbucket lock with local bh disabled, to protect + * against concurrent binds on the port for addr any + */ +static bool inet_bhash2_addr_any_conflict(const struct sock *sk, int port, int l3mdev, + bool relax, bool reuseport_ok) +{ + kuid_t uid = sock_i_uid((struct sock *)sk); + const struct net *net = sock_net(sk); + struct sock_reuseport *reuseport_cb; + struct inet_bind_hashbucket *head2; + struct inet_bind2_bucket *tb2; + bool reuseport_cb_ok; + + rcu_read_lock(); + reuseport_cb = rcu_dereference(sk->sk_reuseport_cb); + /* paired with WRITE_ONCE() in __reuseport_(add|detach)_closed_sock */ + reuseport_cb_ok = !reuseport_cb || READ_ONCE(reuseport_cb->num_closed_socks); + rcu_read_unlock(); + + head2 = inet_bhash2_addr_any_hashbucket(sk, net, port); + + spin_lock(&head2->lock); + + inet_bind_bucket_for_each(tb2, &head2->chain) + if (inet_bind2_bucket_match_addr_any(tb2, net, port, l3mdev, sk)) + break; + + if (tb2 && inet_bhash2_conflict(sk, tb2, uid, relax, reuseport_cb_ok, + reuseport_ok)) { + spin_unlock(&head2->lock); + return true; } - return sk2 != NULL; + + spin_unlock(&head2->lock); + return false; } /* * Find an open port number for the socket. Returns with the - * inet_bind_hashbucket lock held. + * inet_bind_hashbucket locks held if successful. */ static struct inet_bind_hashbucket * -inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int *port_ret) +inet_csk_find_open_port(const struct sock *sk, struct inet_bind_bucket **tb_ret, + struct inet_bind2_bucket **tb2_ret, + struct inet_bind_hashbucket **head2_ret, int *port_ret) { - struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo; - int port = 0; - struct inet_bind_hashbucket *head; + struct inet_hashinfo *hinfo = tcp_or_dccp_get_hashinfo(sk); + int i, low, high, attempt_half, port, l3mdev; + struct inet_bind_hashbucket *head, *head2; struct net *net = sock_net(sk); - bool relax = false; - int i, low, high, attempt_half; + struct inet_bind2_bucket *tb2; struct inet_bind_bucket *tb; u32 remaining, offset; - int l3mdev; + bool relax = false; l3mdev = inet_sk_bound_l3mdev(sk); ports_exhausted: @@ -239,11 +330,20 @@ other_parity_scan: head = &hinfo->bhash[inet_bhashfn(net, port, hinfo->bhash_size)]; spin_lock_bh(&head->lock); + if (inet_use_bhash2_on_bind(sk)) { + if (inet_bhash2_addr_any_conflict(sk, port, l3mdev, relax, false)) + goto next_port; + } + + head2 = inet_bhashfn_portaddr(hinfo, sk, net, port); + spin_lock(&head2->lock); + tb2 = inet_bind2_bucket_find(head2, net, port, l3mdev, sk); inet_bind_bucket_for_each(tb, &head->chain) - if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev && - tb->port == port) { - if (!inet_csk_bind_conflict(sk, tb, relax, false)) + if (inet_bind_bucket_match(tb, net, port, l3mdev)) { + if (!inet_csk_bind_conflict(sk, tb, tb2, + relax, false)) goto success; + spin_unlock(&head2->lock); goto next_port; } tb = NULL; @@ -272,6 +372,8 @@ next_port: success: *port_ret = port; *tb_ret = tb; + *tb2_ret = tb2; + *head2_ret = head2; return head; } @@ -365,56 +467,97 @@ void inet_csk_update_fastreuse(struct inet_bind_bucket *tb, */ int inet_csk_get_port(struct sock *sk, unsigned short snum) { + struct inet_hashinfo *hinfo = tcp_or_dccp_get_hashinfo(sk); bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN; - struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo; - int ret = 1, port = snum; - struct inet_bind_hashbucket *head; - struct net *net = sock_net(sk); + bool found_port = false, check_bind_conflict = true; + bool bhash_created = false, bhash2_created = false; + struct inet_bind_hashbucket *head, *head2; + struct inet_bind2_bucket *tb2 = NULL; struct inet_bind_bucket *tb = NULL; - int l3mdev; + bool head2_lock_acquired = false; + int ret = 1, port = snum, l3mdev; + struct net *net = sock_net(sk); l3mdev = inet_sk_bound_l3mdev(sk); if (!port) { - head = inet_csk_find_open_port(sk, &tb, &port); + head = inet_csk_find_open_port(sk, &tb, &tb2, &head2, &port); if (!head) return ret; + + head2_lock_acquired = true; + + if (tb && tb2) + goto success; + found_port = true; + } else { + head = &hinfo->bhash[inet_bhashfn(net, port, + hinfo->bhash_size)]; + spin_lock_bh(&head->lock); + inet_bind_bucket_for_each(tb, &head->chain) + if (inet_bind_bucket_match(tb, net, port, l3mdev)) + break; + } + + if (!tb) { + tb = inet_bind_bucket_create(hinfo->bind_bucket_cachep, net, + head, port, l3mdev); if (!tb) - goto tb_not_found; - goto success; + goto fail_unlock; + bhash_created = true; } - head = &hinfo->bhash[inet_bhashfn(net, port, - hinfo->bhash_size)]; - spin_lock_bh(&head->lock); - inet_bind_bucket_for_each(tb, &head->chain) - if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev && - tb->port == port) - goto tb_found; -tb_not_found: - tb = inet_bind_bucket_create(hinfo->bind_bucket_cachep, - net, head, port, l3mdev); - if (!tb) - goto fail_unlock; -tb_found: - if (!hlist_empty(&tb->owners)) { - if (sk->sk_reuse == SK_FORCE_REUSE) - goto success; - if ((tb->fastreuse > 0 && reuse) || - sk_reuseport_match(tb, sk)) - goto success; - if (inet_csk_bind_conflict(sk, tb, true, true)) + if (!found_port) { + if (!hlist_empty(&tb->owners)) { + if (sk->sk_reuse == SK_FORCE_REUSE || + (tb->fastreuse > 0 && reuse) || + sk_reuseport_match(tb, sk)) + check_bind_conflict = false; + } + + if (check_bind_conflict && inet_use_bhash2_on_bind(sk)) { + if (inet_bhash2_addr_any_conflict(sk, port, l3mdev, true, true)) + goto fail_unlock; + } + + head2 = inet_bhashfn_portaddr(hinfo, sk, net, port); + spin_lock(&head2->lock); + head2_lock_acquired = true; + tb2 = inet_bind2_bucket_find(head2, net, port, l3mdev, sk); + } + + if (!tb2) { + tb2 = inet_bind2_bucket_create(hinfo->bind2_bucket_cachep, + net, head2, port, l3mdev, sk); + if (!tb2) goto fail_unlock; + bhash2_created = true; } + + if (!found_port && check_bind_conflict) { + if (inet_csk_bind_conflict(sk, tb, tb2, true, true)) + goto fail_unlock; + } + success: inet_csk_update_fastreuse(tb, sk); if (!inet_csk(sk)->icsk_bind_hash) - inet_bind_hash(sk, tb, port); + inet_bind_hash(sk, tb, tb2, port); WARN_ON(inet_csk(sk)->icsk_bind_hash != tb); + WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2); ret = 0; fail_unlock: + if (ret) { + if (bhash_created) + inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb); + if (bhash2_created) + inet_bind2_bucket_destroy(hinfo->bind2_bucket_cachep, + tb2); + } + if (head2_lock_acquired) + spin_unlock(&head2->lock); spin_unlock_bh(&head->lock); return ret; } @@ -763,14 +906,15 @@ static void reqsk_migrate_reset(struct request_sock *req) /* return true if req was found in the ehash table */ static bool reqsk_queue_unlink(struct request_sock *req) { - struct inet_hashinfo *hashinfo = req_to_sk(req)->sk_prot->h.hashinfo; + struct sock *sk = req_to_sk(req); bool found = false; - if (sk_hashed(req_to_sk(req))) { + if (sk_hashed(sk)) { + struct inet_hashinfo *hashinfo = tcp_or_dccp_get_hashinfo(sk); spinlock_t *lock = inet_ehash_lockp(hashinfo, req->rsk_hash); spin_lock(lock); - found = __sk_nulls_del_node_init_rcu(req_to_sk(req)); + found = __sk_nulls_del_node_init_rcu(sk); spin_unlock(lock); } if (timer_pending(&req->rsk_timer) && del_timer_sync(&req->rsk_timer)) @@ -962,6 +1106,7 @@ struct sock *inet_csk_clone_lock(const struct sock *sk, inet_sk_set_state(newsk, TCP_SYN_RECV); newicsk->icsk_bind_hash = NULL; + newicsk->icsk_bind2_hash = NULL; inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port; inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num; diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index b9d995b5ce24..a0ad34e4f044 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -92,12 +92,79 @@ void inet_bind_bucket_destroy(struct kmem_cache *cachep, struct inet_bind_bucket } } +bool inet_bind_bucket_match(const struct inet_bind_bucket *tb, const struct net *net, + unsigned short port, int l3mdev) +{ + return net_eq(ib_net(tb), net) && tb->port == port && + tb->l3mdev == l3mdev; +} + +static void inet_bind2_bucket_init(struct inet_bind2_bucket *tb, + struct net *net, + struct inet_bind_hashbucket *head, + unsigned short port, int l3mdev, + const struct sock *sk) +{ + write_pnet(&tb->ib_net, net); + tb->l3mdev = l3mdev; + tb->port = port; +#if IS_ENABLED(CONFIG_IPV6) + tb->family = sk->sk_family; + if (sk->sk_family == AF_INET6) + tb->v6_rcv_saddr = sk->sk_v6_rcv_saddr; + else +#endif + tb->rcv_saddr = sk->sk_rcv_saddr; + INIT_HLIST_HEAD(&tb->owners); + hlist_add_head(&tb->node, &head->chain); +} + +struct inet_bind2_bucket *inet_bind2_bucket_create(struct kmem_cache *cachep, + struct net *net, + struct inet_bind_hashbucket *head, + unsigned short port, + int l3mdev, + const struct sock *sk) +{ + struct inet_bind2_bucket *tb = kmem_cache_alloc(cachep, GFP_ATOMIC); + + if (tb) + inet_bind2_bucket_init(tb, net, head, port, l3mdev, sk); + + return tb; +} + +/* Caller must hold hashbucket lock for this tb with local BH disabled */ +void inet_bind2_bucket_destroy(struct kmem_cache *cachep, struct inet_bind2_bucket *tb) +{ + if (hlist_empty(&tb->owners)) { + __hlist_del(&tb->node); + kmem_cache_free(cachep, tb); + } +} + +static bool inet_bind2_bucket_addr_match(const struct inet_bind2_bucket *tb2, + const struct sock *sk) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family != tb2->family) + return false; + + if (sk->sk_family == AF_INET6) + return ipv6_addr_equal(&tb2->v6_rcv_saddr, + &sk->sk_v6_rcv_saddr); +#endif + return tb2->rcv_saddr == sk->sk_rcv_saddr; +} + void inet_bind_hash(struct sock *sk, struct inet_bind_bucket *tb, - const unsigned short snum) + struct inet_bind2_bucket *tb2, unsigned short port) { - inet_sk(sk)->inet_num = snum; + inet_sk(sk)->inet_num = port; sk_add_bind_node(sk, &tb->owners); inet_csk(sk)->icsk_bind_hash = tb; + sk_add_bind2_node(sk, &tb2->owners); + inet_csk(sk)->icsk_bind2_hash = tb2; } /* @@ -105,11 +172,15 @@ void inet_bind_hash(struct sock *sk, struct inet_bind_bucket *tb, */ static void __inet_put_port(struct sock *sk) { - struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; - const int bhash = inet_bhashfn(sock_net(sk), inet_sk(sk)->inet_num, - hashinfo->bhash_size); - struct inet_bind_hashbucket *head = &hashinfo->bhash[bhash]; + struct inet_hashinfo *hashinfo = tcp_or_dccp_get_hashinfo(sk); + struct inet_bind_hashbucket *head, *head2; + struct net *net = sock_net(sk); struct inet_bind_bucket *tb; + int bhash; + + bhash = inet_bhashfn(net, inet_sk(sk)->inet_num, hashinfo->bhash_size); + head = &hashinfo->bhash[bhash]; + head2 = inet_bhashfn_portaddr(hashinfo, sk, net, inet_sk(sk)->inet_num); spin_lock(&head->lock); tb = inet_csk(sk)->icsk_bind_hash; @@ -117,6 +188,17 @@ static void __inet_put_port(struct sock *sk) inet_csk(sk)->icsk_bind_hash = NULL; inet_sk(sk)->inet_num = 0; inet_bind_bucket_destroy(hashinfo->bind_bucket_cachep, tb); + + spin_lock(&head2->lock); + if (inet_csk(sk)->icsk_bind2_hash) { + struct inet_bind2_bucket *tb2 = inet_csk(sk)->icsk_bind2_hash; + + __sk_del_bind2_node(sk); + inet_csk(sk)->icsk_bind2_hash = NULL; + inet_bind2_bucket_destroy(hashinfo->bind2_bucket_cachep, tb2); + } + spin_unlock(&head2->lock); + spin_unlock(&head->lock); } @@ -130,17 +212,26 @@ EXPORT_SYMBOL(inet_put_port); int __inet_inherit_port(const struct sock *sk, struct sock *child) { - struct inet_hashinfo *table = sk->sk_prot->h.hashinfo; + struct inet_hashinfo *table = tcp_or_dccp_get_hashinfo(sk); unsigned short port = inet_sk(child)->inet_num; - const int bhash = inet_bhashfn(sock_net(sk), port, - table->bhash_size); - struct inet_bind_hashbucket *head = &table->bhash[bhash]; + struct inet_bind_hashbucket *head, *head2; + bool created_inet_bind_bucket = false; + struct net *net = sock_net(sk); + bool update_fastreuse = false; + struct inet_bind2_bucket *tb2; struct inet_bind_bucket *tb; - int l3mdev; + int bhash, l3mdev; + + bhash = inet_bhashfn(net, port, table->bhash_size); + head = &table->bhash[bhash]; + head2 = inet_bhashfn_portaddr(table, child, net, port); spin_lock(&head->lock); + spin_lock(&head2->lock); tb = inet_csk(sk)->icsk_bind_hash; - if (unlikely(!tb)) { + tb2 = inet_csk(sk)->icsk_bind2_hash; + if (unlikely(!tb || !tb2)) { + spin_unlock(&head2->lock); spin_unlock(&head->lock); return -ENOENT; } @@ -153,25 +244,49 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child) * as that of the child socket. We have to look up or * create a new bind bucket for the child here. */ inet_bind_bucket_for_each(tb, &head->chain) { - if (net_eq(ib_net(tb), sock_net(sk)) && - tb->l3mdev == l3mdev && tb->port == port) + if (inet_bind_bucket_match(tb, net, port, l3mdev)) break; } if (!tb) { tb = inet_bind_bucket_create(table->bind_bucket_cachep, - sock_net(sk), head, port, - l3mdev); + net, head, port, l3mdev); if (!tb) { + spin_unlock(&head2->lock); spin_unlock(&head->lock); return -ENOMEM; } + created_inet_bind_bucket = true; + } + update_fastreuse = true; + + goto bhash2_find; + } else if (!inet_bind2_bucket_addr_match(tb2, child)) { + l3mdev = inet_sk_bound_l3mdev(sk); + +bhash2_find: + tb2 = inet_bind2_bucket_find(head2, net, port, l3mdev, child); + if (!tb2) { + tb2 = inet_bind2_bucket_create(table->bind2_bucket_cachep, + net, head2, port, + l3mdev, child); + if (!tb2) + goto error; } - inet_csk_update_fastreuse(tb, child); } - inet_bind_hash(child, tb, port); + if (update_fastreuse) + inet_csk_update_fastreuse(tb, child); + inet_bind_hash(child, tb, tb2, port); + spin_unlock(&head2->lock); spin_unlock(&head->lock); return 0; + +error: + if (created_inet_bind_bucket) + inet_bind_bucket_destroy(table->bind_bucket_cachep, tb); + spin_unlock(&head2->lock); + spin_unlock(&head->lock); + return -ENOMEM; } EXPORT_SYMBOL_GPL(__inet_inherit_port); @@ -275,7 +390,7 @@ static inline struct sock *inet_lookup_run_bpf(struct net *net, struct sock *sk, *reuse_sk; bool no_reuseport; - if (hashinfo != &tcp_hashinfo) + if (hashinfo != net->ipv4.tcp_death_row.hashinfo) return NULL; /* only TCP is supported */ no_reuseport = bpf_sk_lookup_run_v4(net, IPPROTO_TCP, saddr, sport, @@ -518,9 +633,9 @@ static bool inet_ehash_lookup_by_sk(struct sock *sk, */ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk) { - struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; - struct hlist_nulls_head *list; + struct inet_hashinfo *hashinfo = tcp_or_dccp_get_hashinfo(sk); struct inet_ehash_bucket *head; + struct hlist_nulls_head *list; spinlock_t *lock; bool ret = true; @@ -590,7 +705,7 @@ static int inet_reuseport_add_sock(struct sock *sk, int __inet_hash(struct sock *sk, struct sock *osk) { - struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; + struct inet_hashinfo *hashinfo = tcp_or_dccp_get_hashinfo(sk); struct inet_listen_hashbucket *ilb2; int err = 0; @@ -636,7 +751,7 @@ EXPORT_SYMBOL_GPL(inet_hash); void inet_unhash(struct sock *sk) { - struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; + struct inet_hashinfo *hashinfo = tcp_or_dccp_get_hashinfo(sk); if (sk_unhashed(sk)) return; @@ -675,6 +790,118 @@ void inet_unhash(struct sock *sk) } EXPORT_SYMBOL_GPL(inet_unhash); +static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb, + const struct net *net, unsigned short port, + int l3mdev, const struct sock *sk) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family != tb->family) + return false; + + if (sk->sk_family == AF_INET6) + return net_eq(ib2_net(tb), net) && tb->port == port && + tb->l3mdev == l3mdev && + ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr); + else +#endif + return net_eq(ib2_net(tb), net) && tb->port == port && + tb->l3mdev == l3mdev && tb->rcv_saddr == sk->sk_rcv_saddr; +} + +bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const struct net *net, + unsigned short port, int l3mdev, const struct sock *sk) +{ +#if IS_ENABLED(CONFIG_IPV6) + struct in6_addr addr_any = {}; + + if (sk->sk_family != tb->family) + return false; + + if (sk->sk_family == AF_INET6) + return net_eq(ib2_net(tb), net) && tb->port == port && + tb->l3mdev == l3mdev && + ipv6_addr_equal(&tb->v6_rcv_saddr, &addr_any); + else +#endif + return net_eq(ib2_net(tb), net) && tb->port == port && + tb->l3mdev == l3mdev && tb->rcv_saddr == 0; +} + +/* The socket's bhash2 hashbucket spinlock must be held when this is called */ +struct inet_bind2_bucket * +inet_bind2_bucket_find(const struct inet_bind_hashbucket *head, const struct net *net, + unsigned short port, int l3mdev, const struct sock *sk) +{ + struct inet_bind2_bucket *bhash2 = NULL; + + inet_bind_bucket_for_each(bhash2, &head->chain) + if (inet_bind2_bucket_match(bhash2, net, port, l3mdev, sk)) + break; + + return bhash2; +} + +struct inet_bind_hashbucket * +inet_bhash2_addr_any_hashbucket(const struct sock *sk, const struct net *net, int port) +{ + struct inet_hashinfo *hinfo = tcp_or_dccp_get_hashinfo(sk); + u32 hash; +#if IS_ENABLED(CONFIG_IPV6) + struct in6_addr addr_any = {}; + + if (sk->sk_family == AF_INET6) + hash = ipv6_portaddr_hash(net, &addr_any, port); + else +#endif + hash = ipv4_portaddr_hash(net, 0, port); + + return &hinfo->bhash2[hash & (hinfo->bhash_size - 1)]; +} + +int inet_bhash2_update_saddr(struct inet_bind_hashbucket *prev_saddr, struct sock *sk) +{ + struct inet_hashinfo *hinfo = tcp_or_dccp_get_hashinfo(sk); + struct inet_bind2_bucket *tb2, *new_tb2; + int l3mdev = inet_sk_bound_l3mdev(sk); + struct inet_bind_hashbucket *head2; + int port = inet_sk(sk)->inet_num; + struct net *net = sock_net(sk); + + /* Allocate a bind2 bucket ahead of time to avoid permanently putting + * the bhash2 table in an inconsistent state if a new tb2 bucket + * allocation fails. + */ + new_tb2 = kmem_cache_alloc(hinfo->bind2_bucket_cachep, GFP_ATOMIC); + if (!new_tb2) + return -ENOMEM; + + head2 = inet_bhashfn_portaddr(hinfo, sk, net, port); + + if (prev_saddr) { + spin_lock_bh(&prev_saddr->lock); + __sk_del_bind2_node(sk); + inet_bind2_bucket_destroy(hinfo->bind2_bucket_cachep, + inet_csk(sk)->icsk_bind2_hash); + spin_unlock_bh(&prev_saddr->lock); + } + + spin_lock_bh(&head2->lock); + tb2 = inet_bind2_bucket_find(head2, net, port, l3mdev, sk); + if (!tb2) { + tb2 = new_tb2; + inet_bind2_bucket_init(tb2, net, head2, port, l3mdev, sk); + } + sk_add_bind2_node(sk, &tb2->owners); + inet_csk(sk)->icsk_bind2_hash = tb2; + spin_unlock_bh(&head2->lock); + + if (tb2 != new_tb2) + kmem_cache_free(hinfo->bind2_bucket_cachep, new_tb2); + + return 0; +} +EXPORT_SYMBOL_GPL(inet_bhash2_update_saddr); + /* RFC 6056 3.3.4. Algorithm 4: Double-Hash Port Selection Algorithm * Note that we use 32bit integers (vs RFC 'short integers') * because 2^16 is not a multiple of num_ephemeral and this @@ -694,11 +921,13 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, struct sock *, __u16, struct inet_timewait_sock **)) { struct inet_hashinfo *hinfo = death_row->hashinfo; + struct inet_bind_hashbucket *head, *head2; struct inet_timewait_sock *tw = NULL; - struct inet_bind_hashbucket *head; int port = inet_sk(sk)->inet_num; struct net *net = sock_net(sk); + struct inet_bind2_bucket *tb2; struct inet_bind_bucket *tb; + bool tb_created = false; u32 remaining, offset; int ret, i, low, high; int l3mdev; @@ -729,8 +958,8 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, if (likely(remaining > 1)) remaining &= ~1U; - net_get_random_once(table_perturb, - INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb)); + get_random_sleepable_once(table_perturb, + INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb)); index = port_offset & (INET_TABLE_PERTURB_SIZE - 1); offset = READ_ONCE(table_perturb[index]) + (port_offset >> 32); @@ -755,8 +984,7 @@ other_parity_scan: * the established check is already unique enough. */ inet_bind_bucket_for_each(tb, &head->chain) { - if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev && - tb->port == port) { + if (inet_bind_bucket_match(tb, net, port, l3mdev)) { if (tb->fastreuse >= 0 || tb->fastreuseport >= 0) goto next_port; @@ -774,6 +1002,7 @@ other_parity_scan: spin_unlock_bh(&head->lock); return -ENOMEM; } + tb_created = true; tb->fastreuse = -1; tb->fastreuseport = -1; goto ok; @@ -789,6 +1018,20 @@ next_port: return -EADDRNOTAVAIL; ok: + /* Find the corresponding tb2 bucket since we need to + * add the socket to the bhash2 table as well + */ + head2 = inet_bhashfn_portaddr(hinfo, sk, net, port); + spin_lock(&head2->lock); + + tb2 = inet_bind2_bucket_find(head2, net, port, l3mdev, sk); + if (!tb2) { + tb2 = inet_bind2_bucket_create(hinfo->bind2_bucket_cachep, net, + head2, port, l3mdev, sk); + if (!tb2) + goto error; + } + /* Here we want to add a little bit of randomness to the next source * port that will be chosen. We use a max() with a random here so that * on low contention the randomness is maximal and on high contention @@ -798,7 +1041,10 @@ ok: WRITE_ONCE(table_perturb[index], READ_ONCE(table_perturb[index]) + i + 2); /* Head lock still held and bh's disabled */ - inet_bind_hash(sk, tb, port); + inet_bind_hash(sk, tb, tb2, port); + + spin_unlock(&head2->lock); + if (sk_unhashed(sk)) { inet_sk(sk)->inet_sport = htons(port); inet_ehash_nolisten(sk, (struct sock *)tw, NULL); @@ -810,6 +1056,13 @@ ok: inet_twsk_deschedule_put(tw); local_bh_enable(); return 0; + +error: + spin_unlock(&head2->lock); + if (tb_created) + inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb); + spin_unlock_bh(&head->lock); + return -ENOMEM; } /* @@ -902,3 +1155,50 @@ int inet_ehash_locks_alloc(struct inet_hashinfo *hashinfo) return 0; } EXPORT_SYMBOL_GPL(inet_ehash_locks_alloc); + +struct inet_hashinfo *inet_pernet_hashinfo_alloc(struct inet_hashinfo *hashinfo, + unsigned int ehash_entries) +{ + struct inet_hashinfo *new_hashinfo; + int i; + + new_hashinfo = kmemdup(hashinfo, sizeof(*hashinfo), GFP_KERNEL); + if (!new_hashinfo) + goto err; + + new_hashinfo->ehash = vmalloc_huge(ehash_entries * sizeof(struct inet_ehash_bucket), + GFP_KERNEL_ACCOUNT); + if (!new_hashinfo->ehash) + goto free_hashinfo; + + new_hashinfo->ehash_mask = ehash_entries - 1; + + if (inet_ehash_locks_alloc(new_hashinfo)) + goto free_ehash; + + for (i = 0; i < ehash_entries; i++) + INIT_HLIST_NULLS_HEAD(&new_hashinfo->ehash[i].chain, i); + + new_hashinfo->pernet = true; + + return new_hashinfo; + +free_ehash: + vfree(new_hashinfo->ehash); +free_hashinfo: + kfree(new_hashinfo); +err: + return NULL; +} +EXPORT_SYMBOL_GPL(inet_pernet_hashinfo_alloc); + +void inet_pernet_hashinfo_free(struct inet_hashinfo *hashinfo) +{ + if (!hashinfo->pernet) + return; + + inet_ehash_locks_free(hashinfo); + vfree(hashinfo->ehash); + kfree(hashinfo); +} +EXPORT_SYMBOL_GPL(inet_pernet_hashinfo_free); diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c index 47ccc343c9fb..71d3bb0abf6c 100644 --- a/net/ipv4/inet_timewait_sock.c +++ b/net/ipv4/inet_timewait_sock.c @@ -59,9 +59,7 @@ static void inet_twsk_kill(struct inet_timewait_sock *tw) inet_twsk_bind_unhash(tw, hashinfo); spin_unlock(&bhead->lock); - if (refcount_dec_and_test(&tw->tw_dr->tw_refcount)) - kfree(tw->tw_dr); - + refcount_dec(&tw->tw_dr->tw_refcount); inet_twsk_put(tw); } diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 04e2034f2f8e..1ae83ad629b2 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1043,7 +1043,7 @@ static int __ip_append_data(struct sock *sk, paged = true; zc = true; } else { - uarg->zerocopy = 0; + uarg_to_msgzc(uarg)->zerocopy = 0; skb_zcopy_set(skb, uarg, &extra_uref); } } @@ -1109,10 +1109,7 @@ alloc_new_skb: (fraglen + alloc_extra < SKB_MAX_ALLOC || !(rt->dst.dev->features & NETIF_F_SG))) alloclen = fraglen; - else if (!zc) { - alloclen = min_t(int, fraglen, MAX_HEADER); - pagedlen = fraglen - alloclen; - } else { + else { alloclen = fragheaderlen + transhdrlen; pagedlen = datalen - transhdrlen; } diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index e49a61a053a6..6e19cad154f5 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -888,8 +888,8 @@ static int compat_ip_mcast_join_leave(struct sock *sk, int optname, DEFINE_STATIC_KEY_FALSE(ip4_min_ttl); -static int do_ip_setsockopt(struct sock *sk, int level, int optname, - sockptr_t optval, unsigned int optlen) +int do_ip_setsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, unsigned int optlen) { struct inet_sock *inet = inet_sk(sk); struct net *net = sock_net(sk); @@ -944,7 +944,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname, err = 0; if (needs_rtnl) rtnl_lock(); - lock_sock(sk); + sockopt_lock_sock(sk); switch (optname) { case IP_OPTIONS: @@ -1333,14 +1333,14 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname, case IP_IPSEC_POLICY: case IP_XFRM_POLICY: err = -EPERM; - if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) + if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) break; err = xfrm_user_policy(sk, optname, optval, optlen); break; case IP_TRANSPARENT: - if (!!val && !ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) && - !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { + if (!!val && !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) && + !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { err = -EPERM; break; } @@ -1368,13 +1368,13 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname, err = -ENOPROTOOPT; break; } - release_sock(sk); + sockopt_release_sock(sk); if (needs_rtnl) rtnl_unlock(); return err; e_inval: - release_sock(sk); + sockopt_release_sock(sk); if (needs_rtnl) rtnl_unlock(); return -EINVAL; @@ -1462,37 +1462,37 @@ static bool getsockopt_needs_rtnl(int optname) return false; } -static int ip_get_mcast_msfilter(struct sock *sk, void __user *optval, - int __user *optlen, int len) +static int ip_get_mcast_msfilter(struct sock *sk, sockptr_t optval, + sockptr_t optlen, int len) { const int size0 = offsetof(struct group_filter, gf_slist_flex); - struct group_filter __user *p = optval; struct group_filter gsf; - int num; + int num, gsf_size; int err; if (len < size0) return -EINVAL; - if (copy_from_user(&gsf, p, size0)) + if (copy_from_sockptr(&gsf, optval, size0)) return -EFAULT; num = gsf.gf_numsrc; - err = ip_mc_gsfget(sk, &gsf, p->gf_slist_flex); + err = ip_mc_gsfget(sk, &gsf, optval, + offsetof(struct group_filter, gf_slist_flex)); if (err) return err; if (gsf.gf_numsrc < num) num = gsf.gf_numsrc; - if (put_user(GROUP_FILTER_SIZE(num), optlen) || - copy_to_user(p, &gsf, size0)) + gsf_size = GROUP_FILTER_SIZE(num); + if (copy_to_sockptr(optlen, &gsf_size, sizeof(int)) || + copy_to_sockptr(optval, &gsf, size0)) return -EFAULT; return 0; } -static int compat_ip_get_mcast_msfilter(struct sock *sk, void __user *optval, - int __user *optlen, int len) +static int compat_ip_get_mcast_msfilter(struct sock *sk, sockptr_t optval, + sockptr_t optlen, int len) { const int size0 = offsetof(struct compat_group_filter, gf_slist_flex); - struct compat_group_filter __user *p = optval; struct compat_group_filter gf32; struct group_filter gf; int num; @@ -1500,7 +1500,7 @@ static int compat_ip_get_mcast_msfilter(struct sock *sk, void __user *optval, if (len < size0) return -EINVAL; - if (copy_from_user(&gf32, p, size0)) + if (copy_from_sockptr(&gf32, optval, size0)) return -EFAULT; gf.gf_interface = gf32.gf_interface; @@ -1508,21 +1508,24 @@ static int compat_ip_get_mcast_msfilter(struct sock *sk, void __user *optval, num = gf.gf_numsrc = gf32.gf_numsrc; gf.gf_group = gf32.gf_group; - err = ip_mc_gsfget(sk, &gf, p->gf_slist_flex); + err = ip_mc_gsfget(sk, &gf, optval, + offsetof(struct compat_group_filter, gf_slist_flex)); if (err) return err; if (gf.gf_numsrc < num) num = gf.gf_numsrc; len = GROUP_FILTER_SIZE(num) - (sizeof(gf) - sizeof(gf32)); - if (put_user(len, optlen) || - put_user(gf.gf_fmode, &p->gf_fmode) || - put_user(gf.gf_numsrc, &p->gf_numsrc)) + if (copy_to_sockptr(optlen, &len, sizeof(int)) || + copy_to_sockptr_offset(optval, offsetof(struct compat_group_filter, gf_fmode), + &gf.gf_fmode, sizeof(gf.gf_fmode)) || + copy_to_sockptr_offset(optval, offsetof(struct compat_group_filter, gf_numsrc), + &gf.gf_numsrc, sizeof(gf.gf_numsrc))) return -EFAULT; return 0; } -static int do_ip_getsockopt(struct sock *sk, int level, int optname, - char __user *optval, int __user *optlen) +int do_ip_getsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, sockptr_t optlen) { struct inet_sock *inet = inet_sk(sk); bool needs_rtnl = getsockopt_needs_rtnl(optname); @@ -1535,14 +1538,14 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, if (ip_mroute_opt(optname)) return ip_mroute_getsockopt(sk, optname, optval, optlen); - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; if (len < 0) return -EINVAL; if (needs_rtnl) rtnl_lock(); - lock_sock(sk); + sockopt_lock_sock(sk); switch (optname) { case IP_OPTIONS: @@ -1558,17 +1561,19 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, memcpy(optbuf, &inet_opt->opt, sizeof(struct ip_options) + inet_opt->opt.optlen); - release_sock(sk); + sockopt_release_sock(sk); - if (opt->optlen == 0) - return put_user(0, optlen); + if (opt->optlen == 0) { + len = 0; + return copy_to_sockptr(optlen, &len, sizeof(int)); + } ip_options_undo(opt); len = min_t(unsigned int, len, opt->optlen); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, opt->__data, len)) + if (copy_to_sockptr(optval, opt->__data, len)) return -EFAULT; return 0; } @@ -1632,7 +1637,7 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, dst_release(dst); } if (!val) { - release_sock(sk); + sockopt_release_sock(sk); return -ENOTCONN; } break; @@ -1657,11 +1662,11 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, struct in_addr addr; len = min_t(unsigned int, len, sizeof(struct in_addr)); addr.s_addr = inet->mc_addr; - release_sock(sk); + sockopt_release_sock(sk); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &addr, len)) + if (copy_to_sockptr(optval, &addr, len)) return -EFAULT; return 0; } @@ -1673,12 +1678,11 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, err = -EINVAL; goto out; } - if (copy_from_user(&msf, optval, IP_MSFILTER_SIZE(0))) { + if (copy_from_sockptr(&msf, optval, IP_MSFILTER_SIZE(0))) { err = -EFAULT; goto out; } - err = ip_mc_msfget(sk, &msf, - (struct ip_msfilter __user *)optval, optlen); + err = ip_mc_msfget(sk, &msf, optval, optlen); goto out; } case MCAST_MSFILTER: @@ -1695,13 +1699,18 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, { struct msghdr msg; - release_sock(sk); + sockopt_release_sock(sk); if (sk->sk_type != SOCK_STREAM) return -ENOPROTOOPT; - msg.msg_control_is_user = true; - msg.msg_control_user = optval; + if (optval.is_kernel) { + msg.msg_control_is_user = false; + msg.msg_control = optval.kernel; + } else { + msg.msg_control_is_user = true; + msg.msg_control_user = optval.user; + } msg.msg_controllen = len; msg.msg_flags = in_compat_syscall() ? MSG_CMSG_COMPAT : 0; @@ -1722,7 +1731,7 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, put_cmsg(&msg, SOL_IP, IP_TOS, sizeof(tos), &tos); } len -= msg.msg_controllen; - return put_user(len, optlen); + return copy_to_sockptr(optlen, &len, sizeof(int)); } case IP_FREEBIND: val = inet->freebind; @@ -1734,29 +1743,29 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname, val = inet->min_ttl; break; default: - release_sock(sk); + sockopt_release_sock(sk); return -ENOPROTOOPT; } - release_sock(sk); + sockopt_release_sock(sk); if (len < sizeof(int) && len > 0 && val >= 0 && val <= 255) { unsigned char ucval = (unsigned char)val; len = 1; - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &ucval, 1)) + if (copy_to_sockptr(optval, &ucval, 1)) return -EFAULT; } else { len = min_t(unsigned int, sizeof(int), len); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &val, len)) + if (copy_to_sockptr(optval, &val, len)) return -EFAULT; } return 0; out: - release_sock(sk); + sockopt_release_sock(sk); if (needs_rtnl) rtnl_unlock(); return err; @@ -1767,7 +1776,8 @@ int ip_getsockopt(struct sock *sk, int level, { int err; - err = do_ip_getsockopt(sk, level, optname, optval, optlen); + err = do_ip_getsockopt(sk, level, optname, + USER_SOCKPTR(optval), USER_SOCKPTR(optlen)); #if IS_ENABLED(CONFIG_BPFILTER_UMH) if (optname >= BPFILTER_IPT_SO_GET_INFO && diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index cc1caab4a654..92c02c886fe7 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -1079,3 +1079,70 @@ EXPORT_SYMBOL(ip_tunnel_parse_protocol); const struct header_ops ip_tunnel_header_ops = { .parse_protocol = ip_tunnel_parse_protocol }; EXPORT_SYMBOL(ip_tunnel_header_ops); + +/* This function returns true when ENCAP attributes are present in the nl msg */ +bool ip_tunnel_netlink_encap_parms(struct nlattr *data[], + struct ip_tunnel_encap *encap) +{ + bool ret = false; + + memset(encap, 0, sizeof(*encap)); + + if (!data) + return ret; + + if (data[IFLA_IPTUN_ENCAP_TYPE]) { + ret = true; + encap->type = nla_get_u16(data[IFLA_IPTUN_ENCAP_TYPE]); + } + + if (data[IFLA_IPTUN_ENCAP_FLAGS]) { + ret = true; + encap->flags = nla_get_u16(data[IFLA_IPTUN_ENCAP_FLAGS]); + } + + if (data[IFLA_IPTUN_ENCAP_SPORT]) { + ret = true; + encap->sport = nla_get_be16(data[IFLA_IPTUN_ENCAP_SPORT]); + } + + if (data[IFLA_IPTUN_ENCAP_DPORT]) { + ret = true; + encap->dport = nla_get_be16(data[IFLA_IPTUN_ENCAP_DPORT]); + } + + return ret; +} +EXPORT_SYMBOL_GPL(ip_tunnel_netlink_encap_parms); + +void ip_tunnel_netlink_parms(struct nlattr *data[], + struct ip_tunnel_parm *parms) +{ + if (data[IFLA_IPTUN_LINK]) + parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]); + + if (data[IFLA_IPTUN_LOCAL]) + parms->iph.saddr = nla_get_be32(data[IFLA_IPTUN_LOCAL]); + + if (data[IFLA_IPTUN_REMOTE]) + parms->iph.daddr = nla_get_be32(data[IFLA_IPTUN_REMOTE]); + + if (data[IFLA_IPTUN_TTL]) { + parms->iph.ttl = nla_get_u8(data[IFLA_IPTUN_TTL]); + if (parms->iph.ttl) + parms->iph.frag_off = htons(IP_DF); + } + + if (data[IFLA_IPTUN_TOS]) + parms->iph.tos = nla_get_u8(data[IFLA_IPTUN_TOS]); + + if (!data[IFLA_IPTUN_PMTUDISC] || nla_get_u8(data[IFLA_IPTUN_PMTUDISC])) + parms->iph.frag_off = htons(IP_DF); + + if (data[IFLA_IPTUN_FLAGS]) + parms->i_flags = nla_get_be16(data[IFLA_IPTUN_FLAGS]); + + if (data[IFLA_IPTUN_PROTO]) + parms->iph.protocol = nla_get_u8(data[IFLA_IPTUN_PROTO]); +} +EXPORT_SYMBOL_GPL(ip_tunnel_netlink_parms); diff --git a/net/ipv4/ipcomp.c b/net/ipv4/ipcomp.c index 366094c1ce6c..5a4fb2539b08 100644 --- a/net/ipv4/ipcomp.c +++ b/net/ipv4/ipcomp.c @@ -117,7 +117,8 @@ out: return err; } -static int ipcomp4_init_state(struct xfrm_state *x) +static int ipcomp4_init_state(struct xfrm_state *x, + struct netlink_ext_ack *extack) { int err = -EINVAL; @@ -129,17 +130,20 @@ static int ipcomp4_init_state(struct xfrm_state *x) x->props.header_len += sizeof(struct iphdr); break; default: + NL_SET_ERR_MSG(extack, "Unsupported XFRM mode for IPcomp"); goto out; } - err = ipcomp_init_state(x); + err = ipcomp_init_state(x, extack); if (err) goto out; if (x->props.mode == XFRM_MODE_TUNNEL) { err = ipcomp_tunnel_attach(x); - if (err) + if (err) { + NL_SET_ERR_MSG(extack, "Kernel error: failed to initialize the associated state"); goto out; + } } err = 0; diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index 123ea63a04cb..180f9daf5bec 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -417,29 +417,7 @@ static void ipip_netlink_parms(struct nlattr *data[], if (!data) return; - if (data[IFLA_IPTUN_LINK]) - parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]); - - if (data[IFLA_IPTUN_LOCAL]) - parms->iph.saddr = nla_get_in_addr(data[IFLA_IPTUN_LOCAL]); - - if (data[IFLA_IPTUN_REMOTE]) - parms->iph.daddr = nla_get_in_addr(data[IFLA_IPTUN_REMOTE]); - - if (data[IFLA_IPTUN_TTL]) { - parms->iph.ttl = nla_get_u8(data[IFLA_IPTUN_TTL]); - if (parms->iph.ttl) - parms->iph.frag_off = htons(IP_DF); - } - - if (data[IFLA_IPTUN_TOS]) - parms->iph.tos = nla_get_u8(data[IFLA_IPTUN_TOS]); - - if (data[IFLA_IPTUN_PROTO]) - parms->iph.protocol = nla_get_u8(data[IFLA_IPTUN_PROTO]); - - if (!data[IFLA_IPTUN_PMTUDISC] || nla_get_u8(data[IFLA_IPTUN_PMTUDISC])) - parms->iph.frag_off = htons(IP_DF); + ip_tunnel_netlink_parms(data, parms); if (data[IFLA_IPTUN_COLLECT_METADATA]) *collect_md = true; @@ -448,40 +426,6 @@ static void ipip_netlink_parms(struct nlattr *data[], *fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); } -/* This function returns true when ENCAP attributes are present in the nl msg */ -static bool ipip_netlink_encap_parms(struct nlattr *data[], - struct ip_tunnel_encap *ipencap) -{ - bool ret = false; - - memset(ipencap, 0, sizeof(*ipencap)); - - if (!data) - return ret; - - if (data[IFLA_IPTUN_ENCAP_TYPE]) { - ret = true; - ipencap->type = nla_get_u16(data[IFLA_IPTUN_ENCAP_TYPE]); - } - - if (data[IFLA_IPTUN_ENCAP_FLAGS]) { - ret = true; - ipencap->flags = nla_get_u16(data[IFLA_IPTUN_ENCAP_FLAGS]); - } - - if (data[IFLA_IPTUN_ENCAP_SPORT]) { - ret = true; - ipencap->sport = nla_get_be16(data[IFLA_IPTUN_ENCAP_SPORT]); - } - - if (data[IFLA_IPTUN_ENCAP_DPORT]) { - ret = true; - ipencap->dport = nla_get_be16(data[IFLA_IPTUN_ENCAP_DPORT]); - } - - return ret; -} - static int ipip_newlink(struct net *src_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) @@ -491,7 +435,7 @@ static int ipip_newlink(struct net *src_net, struct net_device *dev, struct ip_tunnel_encap ipencap; __u32 fwmark = 0; - if (ipip_netlink_encap_parms(data, &ipencap)) { + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { int err = ip_tunnel_encap_setup(t, &ipencap); if (err < 0) @@ -512,7 +456,7 @@ static int ipip_changelink(struct net_device *dev, struct nlattr *tb[], bool collect_md; __u32 fwmark = t->fwmark; - if (ipip_netlink_encap_parms(data, &ipencap)) { + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { int err = ip_tunnel_encap_setup(t, &ipencap); if (err < 0) diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index e11d6b0b62b7..e04544ac4b45 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -1548,7 +1548,8 @@ out: } /* Getsock opt support for the multicast routing system. */ -int ip_mroute_getsockopt(struct sock *sk, int optname, char __user *optval, int __user *optlen) +int ip_mroute_getsockopt(struct sock *sk, int optname, sockptr_t optval, + sockptr_t optlen) { int olr; int val; @@ -1579,14 +1580,14 @@ int ip_mroute_getsockopt(struct sock *sk, int optname, char __user *optval, int return -ENOPROTOOPT; } - if (get_user(olr, optlen)) + if (copy_from_sockptr(&olr, optlen, sizeof(int))) return -EFAULT; olr = min_t(unsigned int, olr, sizeof(int)); if (olr < 0) return -EINVAL; - if (put_user(olr, optlen)) + if (copy_to_sockptr(optlen, &olr, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &val, olr)) + if (copy_to_sockptr(optval, &val, olr)) return -EFAULT; return 0; } diff --git a/net/ipv4/netfilter/ipt_rpfilter.c b/net/ipv4/netfilter/ipt_rpfilter.c index 8cd3224d913e..8183bbcabb4a 100644 --- a/net/ipv4/netfilter/ipt_rpfilter.c +++ b/net/ipv4/netfilter/ipt_rpfilter.c @@ -33,7 +33,6 @@ static bool rpfilter_lookup_reverse(struct net *net, struct flowi4 *fl4, const struct net_device *dev, u8 flags) { struct fib_result res; - int ret __maybe_unused; if (fib_lookup(net, fl4, &res, FIB_LOOKUP_IGNORE_LINKSTATE)) return false; diff --git a/net/ipv4/netfilter/nf_nat_h323.c b/net/ipv4/netfilter/nf_nat_h323.c index a334f0dcc2d0..faee20af4856 100644 --- a/net/ipv4/netfilter/nf_nat_h323.c +++ b/net/ipv4/netfilter/nf_nat_h323.c @@ -291,20 +291,7 @@ static int nat_t120(struct sk_buff *skb, struct nf_conn *ct, exp->expectfn = nf_nat_follow_master; exp->dir = !dir; - /* Try to get same port: if not, try to change it. */ - for (; nated_port != 0; nated_port++) { - int ret; - - exp->tuple.dst.u.tcp.port = htons(nated_port); - ret = nf_ct_expect_related(exp, 0); - if (ret == 0) - break; - else if (ret != -EBUSY) { - nated_port = 0; - break; - } - } - + nated_port = nf_nat_exp_find_port(exp, nated_port); if (nated_port == 0) { /* No port available */ net_notice_ratelimited("nf_nat_h323: out of TCP ports\n"); return 0; @@ -347,20 +334,7 @@ static int nat_h245(struct sk_buff *skb, struct nf_conn *ct, if (info->sig_port[dir] == port) nated_port = ntohs(info->sig_port[!dir]); - /* Try to get same port: if not, try to change it. */ - for (; nated_port != 0; nated_port++) { - int ret; - - exp->tuple.dst.u.tcp.port = htons(nated_port); - ret = nf_ct_expect_related(exp, 0); - if (ret == 0) - break; - else if (ret != -EBUSY) { - nated_port = 0; - break; - } - } - + nated_port = nf_nat_exp_find_port(exp, nated_port); if (nated_port == 0) { /* No port available */ net_notice_ratelimited("nf_nat_q931: out of TCP ports\n"); return 0; @@ -439,20 +413,7 @@ static int nat_q931(struct sk_buff *skb, struct nf_conn *ct, if (info->sig_port[dir] == port) nated_port = ntohs(info->sig_port[!dir]); - /* Try to get same port: if not, try to change it. */ - for (; nated_port != 0; nated_port++) { - int ret; - - exp->tuple.dst.u.tcp.port = htons(nated_port); - ret = nf_ct_expect_related(exp, 0); - if (ret == 0) - break; - else if (ret != -EBUSY) { - nated_port = 0; - break; - } - } - + nated_port = nf_nat_exp_find_port(exp, nated_port); if (nated_port == 0) { /* No port available */ net_notice_ratelimited("nf_nat_ras: out of TCP ports\n"); return 0; @@ -532,20 +493,7 @@ static int nat_callforwarding(struct sk_buff *skb, struct nf_conn *ct, exp->expectfn = ip_nat_callforwarding_expect; exp->dir = !dir; - /* Try to get same port: if not, try to change it. */ - for (nated_port = ntohs(port); nated_port != 0; nated_port++) { - int ret; - - exp->tuple.dst.u.tcp.port = htons(nated_port); - ret = nf_ct_expect_related(exp, 0); - if (ret == 0) - break; - else if (ret != -EBUSY) { - nated_port = 0; - break; - } - } - + nated_port = nf_nat_exp_find_port(exp, ntohs(port)); if (nated_port == 0) { /* No port available */ net_notice_ratelimited("nf_nat_q931: out of TCP ports\n"); return 0; diff --git a/net/ipv4/netfilter/nf_socket_ipv4.c b/net/ipv4/netfilter/nf_socket_ipv4.c index 2d42e4c35a20..a1350fc25838 100644 --- a/net/ipv4/netfilter/nf_socket_ipv4.c +++ b/net/ipv4/netfilter/nf_socket_ipv4.c @@ -71,8 +71,8 @@ nf_socket_get_sock_v4(struct net *net, struct sk_buff *skb, const int doff, { switch (protocol) { case IPPROTO_TCP: - return inet_lookup(net, &tcp_hashinfo, skb, doff, - saddr, sport, daddr, dport, + return inet_lookup(net, net->ipv4.tcp_death_row.hashinfo, + skb, doff, saddr, sport, daddr, dport, in->ifindex); case IPPROTO_UDP: return udp4_lib_lookup(net, saddr, sport, daddr, dport, diff --git a/net/ipv4/netfilter/nf_tproxy_ipv4.c b/net/ipv4/netfilter/nf_tproxy_ipv4.c index b2bae0b0e42a..b22b2c745c76 100644 --- a/net/ipv4/netfilter/nf_tproxy_ipv4.c +++ b/net/ipv4/netfilter/nf_tproxy_ipv4.c @@ -79,6 +79,7 @@ nf_tproxy_get_sock_v4(struct net *net, struct sk_buff *skb, const struct net_device *in, const enum nf_tproxy_lookup_t lookup_type) { + struct inet_hashinfo *hinfo = net->ipv4.tcp_death_row.hashinfo; struct sock *sk; switch (protocol) { @@ -92,12 +93,10 @@ nf_tproxy_get_sock_v4(struct net *net, struct sk_buff *skb, switch (lookup_type) { case NF_TPROXY_LOOKUP_LISTENER: - sk = inet_lookup_listener(net, &tcp_hashinfo, skb, - ip_hdrlen(skb) + - __tcp_hdrlen(hp), - saddr, sport, - daddr, dport, - in->ifindex, 0); + sk = inet_lookup_listener(net, hinfo, skb, + ip_hdrlen(skb) + __tcp_hdrlen(hp), + saddr, sport, daddr, dport, + in->ifindex, 0); if (sk && !refcount_inc_not_zero(&sk->sk_refcnt)) sk = NULL; @@ -108,9 +107,8 @@ nf_tproxy_get_sock_v4(struct net *net, struct sk_buff *skb, */ break; case NF_TPROXY_LOOKUP_ESTABLISHED: - sk = inet_lookup_established(net, &tcp_hashinfo, - saddr, sport, daddr, dport, - in->ifindex); + sk = inet_lookup_established(net, hinfo, saddr, sport, + daddr, dport, in->ifindex); break; default: BUG(); diff --git a/net/ipv4/netfilter/nft_fib_ipv4.c b/net/ipv4/netfilter/nft_fib_ipv4.c index b75cac69bd7e..7ade04ff972d 100644 --- a/net/ipv4/netfilter/nft_fib_ipv4.c +++ b/net/ipv4/netfilter/nft_fib_ipv4.c @@ -83,6 +83,9 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, else oif = NULL; + if (priv->flags & NFTA_FIB_F_IIF) + fl4.flowi4_oif = l3mdev_master_ifindex_rcu(oif); + if (nft_hook(pkt) == NF_INET_PRE_ROUTING && nft_fib_is_loopback(pkt->skb, nft_in(pkt))) { nft_fib_store_result(dest, priv, nft_in(pkt)); diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c index b83c2bd9d722..517042caf6dc 100644 --- a/net/ipv4/ping.c +++ b/net/ipv4/ping.c @@ -33,6 +33,7 @@ #include <linux/skbuff.h> #include <linux/proc_fs.h> #include <linux/export.h> +#include <linux/bpf-cgroup.h> #include <net/sock.h> #include <net/ping.h> #include <net/udp.h> @@ -295,6 +296,19 @@ void ping_close(struct sock *sk, long timeout) } EXPORT_SYMBOL_GPL(ping_close); +static int ping_pre_connect(struct sock *sk, struct sockaddr *uaddr, + int addr_len) +{ + /* This check is replicated from __ip4_datagram_connect() and + * intended to prevent BPF program called below from accessing bytes + * that are out of the bound specified by user in addr_len. + */ + if (addr_len < sizeof(struct sockaddr_in)) + return -EINVAL; + + return BPF_CGROUP_RUN_PROG_INET4_CONNECT_LOCK(sk, uaddr); +} + /* Checks the bind address and possibly modifies sk->sk_bound_dev_if. */ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk, struct sockaddr *uaddr, int addr_len) @@ -1009,6 +1023,7 @@ struct proto ping_prot = { .owner = THIS_MODULE, .init = ping_init_sock, .close = ping_close, + .pre_connect = ping_pre_connect, .connect = ip4_datagram_connect, .disconnect = __udp_disconnect, .setsockopt = ip_setsockopt, diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c index 0088a4c64d77..5386f460bd20 100644 --- a/net/ipv4/proc.c +++ b/net/ipv4/proc.c @@ -59,7 +59,7 @@ static int sockstat_seq_show(struct seq_file *seq, void *v) socket_seq_show(seq); seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %ld\n", sock_prot_inuse_get(net, &tcp_prot), orphans, - refcount_read(&net->ipv4.tcp_death_row->tw_refcount) - 1, + refcount_read(&net->ipv4.tcp_death_row.tw_refcount) - 1, sockets, proto_memory_allocated(&tcp_prot)); seq_printf(seq, "UDP: inuse %d mem %ld\n", sock_prot_inuse_get(net, &udp_prot), diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 5490c285668b..9b8a6db7a66b 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -39,6 +39,7 @@ static u32 u32_max_div_HZ = UINT_MAX / HZ; static int one_day_secs = 24 * 3600; static u32 fib_multipath_hash_fields_all_mask __maybe_unused = FIB_MULTIPATH_HASH_FIELD_ALL_MASK; +static unsigned int tcp_child_ehash_entries_max = 16 * 1024 * 1024; /* obsolete */ static int sysctl_tcp_low_latency __read_mostly; @@ -382,6 +383,29 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl, return ret; } +static int proc_tcp_ehash_entries(struct ctl_table *table, int write, + void *buffer, size_t *lenp, loff_t *ppos) +{ + struct net *net = container_of(table->data, struct net, + ipv4.sysctl_tcp_child_ehash_entries); + struct inet_hashinfo *hinfo = net->ipv4.tcp_death_row.hashinfo; + int tcp_ehash_entries; + struct ctl_table tbl; + + tcp_ehash_entries = hinfo->ehash_mask + 1; + + /* A negative number indicates that the child netns + * shares the global ehash. + */ + if (!net_eq(net, &init_net) && !hinfo->pernet) + tcp_ehash_entries *= -1; + + tbl.data = &tcp_ehash_entries; + tbl.maxlen = sizeof(int); + + return proc_dointvec(&tbl, write, buffer, lenp, ppos); +} + #ifdef CONFIG_IP_ROUTE_MULTIPATH static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write, void *buffer, size_t *lenp, @@ -530,10 +554,9 @@ static struct ctl_table ipv4_table[] = { }; static struct ctl_table ipv4_net_table[] = { - /* tcp_max_tw_buckets must be first in this table. */ { .procname = "tcp_max_tw_buckets", -/* .data = &init_net.ipv4.tcp_death_row.sysctl_max_tw_buckets, */ + .data = &init_net.ipv4.tcp_death_row.sysctl_max_tw_buckets, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec @@ -1322,6 +1345,21 @@ static struct ctl_table ipv4_net_table[] = { .extra2 = SYSCTL_ONE, }, { + .procname = "tcp_ehash_entries", + .data = &init_net.ipv4.sysctl_tcp_child_ehash_entries, + .mode = 0444, + .proc_handler = proc_tcp_ehash_entries, + }, + { + .procname = "tcp_child_ehash_entries", + .data = &init_net.ipv4.sysctl_tcp_child_ehash_entries, + .maxlen = sizeof(unsigned int), + .mode = 0644, + .proc_handler = proc_douintvec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = &tcp_child_ehash_entries_max, + }, + { .procname = "udp_rmem_min", .data = &init_net.ipv4.sysctl_udp_rmem_min, .maxlen = sizeof(init_net.ipv4.sysctl_udp_rmem_min), @@ -1361,8 +1399,7 @@ static __net_init int ipv4_sysctl_init_net(struct net *net) if (!table) goto err_alloc; - /* skip first entry (sysctl_max_tw_buckets) */ - for (i = 1; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) { + for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) { if (table[i].data) { /* Update the variables to point into * the current struct net @@ -1377,8 +1414,6 @@ static __net_init int ipv4_sysctl_init_net(struct net *net) } } - table[0].data = &net->ipv4.tcp_death_row->sysctl_max_tw_buckets; - net->ipv4.ipv4_hdr = register_net_sysctl(net, "net/ipv4", table); if (!net->ipv4.ipv4_hdr) goto err_reg; diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e373dde1f46f..0c51abeee172 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1162,9 +1162,8 @@ void tcp_free_fastopen_req(struct tcp_sock *tp) } } -static int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, - int *copied, size_t size, - struct ubuf_info *uarg) +int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied, + size_t size, struct ubuf_info *uarg) { struct tcp_sock *tp = tcp_sk(sk); struct inet_sock *inet = inet_sk(sk); @@ -1239,7 +1238,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) } zc = sk->sk_route_caps & NETIF_F_SG; if (!zc) - uarg->zerocopy = 0; + uarg_to_msgzc(uarg)->zerocopy = 0; } } @@ -3137,6 +3136,8 @@ int tcp_disconnect(struct sock *sk, int flags) tp->snd_ssthresh = TCP_INFINITE_SSTHRESH; tcp_snd_cwnd_set(tp, TCP_INIT_CWND); tp->snd_cwnd_cnt = 0; + tp->is_cwnd_limited = 0; + tp->max_packets_out = 0; tp->window_clamp = 0; tp->delivered = 0; tp->delivered_ce = 0; @@ -3208,7 +3209,7 @@ EXPORT_SYMBOL(tcp_disconnect); static inline bool tcp_can_repair_sock(const struct sock *sk) { - return ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN) && + return sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN) && (sk->sk_state != TCP_LISTEN); } @@ -3485,8 +3486,8 @@ int tcp_set_window_clamp(struct sock *sk, int val) /* * Socket option code for TCP. */ -static int do_tcp_setsockopt(struct sock *sk, int level, int optname, - sockptr_t optval, unsigned int optlen) +int do_tcp_setsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, unsigned int optlen) { struct tcp_sock *tp = tcp_sk(sk); struct inet_connection_sock *icsk = inet_csk(sk); @@ -3508,11 +3509,11 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname, return -EFAULT; name[val] = 0; - lock_sock(sk); - err = tcp_set_congestion_control(sk, name, true, - ns_capable(sock_net(sk)->user_ns, - CAP_NET_ADMIN)); - release_sock(sk); + sockopt_lock_sock(sk); + err = tcp_set_congestion_control(sk, name, !has_current_bpf_ctx(), + sockopt_ns_capable(sock_net(sk)->user_ns, + CAP_NET_ADMIN)); + sockopt_release_sock(sk); return err; } case TCP_ULP: { @@ -3528,9 +3529,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname, return -EFAULT; name[val] = 0; - lock_sock(sk); + sockopt_lock_sock(sk); err = tcp_set_ulp(sk, name); - release_sock(sk); + sockopt_release_sock(sk); return err; } case TCP_FASTOPEN_KEY: { @@ -3563,7 +3564,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname, if (copy_from_sockptr(&val, optval, sizeof(val))) return -EFAULT; - lock_sock(sk); + sockopt_lock_sock(sk); switch (optname) { case TCP_MAXSEG: @@ -3785,7 +3786,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname, break; } - release_sock(sk); + sockopt_release_sock(sk); return err; } @@ -4049,15 +4050,15 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk, return stats; } -static int do_tcp_getsockopt(struct sock *sk, int level, - int optname, char __user *optval, int __user *optlen) +int do_tcp_getsockopt(struct sock *sk, int level, + int optname, sockptr_t optval, sockptr_t optlen) { struct inet_connection_sock *icsk = inet_csk(sk); struct tcp_sock *tp = tcp_sk(sk); struct net *net = sock_net(sk); int val, len; - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; len = min_t(unsigned int, len, sizeof(int)); @@ -4107,15 +4108,15 @@ static int do_tcp_getsockopt(struct sock *sk, int level, case TCP_INFO: { struct tcp_info info; - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; tcp_get_info(sk, &info); len = min_t(unsigned int, len, sizeof(info)); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &info, len)) + if (copy_to_sockptr(optval, &info, len)) return -EFAULT; return 0; } @@ -4125,7 +4126,7 @@ static int do_tcp_getsockopt(struct sock *sk, int level, size_t sz = 0; int attr; - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; ca_ops = icsk->icsk_ca_ops; @@ -4133,9 +4134,9 @@ static int do_tcp_getsockopt(struct sock *sk, int level, sz = ca_ops->get_info(sk, ~0U, &attr, &info); len = min_t(unsigned int, len, sz); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &info, len)) + if (copy_to_sockptr(optval, &info, len)) return -EFAULT; return 0; } @@ -4144,27 +4145,28 @@ static int do_tcp_getsockopt(struct sock *sk, int level, break; case TCP_CONGESTION: - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; len = min_t(unsigned int, len, TCP_CA_NAME_MAX); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, icsk->icsk_ca_ops->name, len)) + if (copy_to_sockptr(optval, icsk->icsk_ca_ops->name, len)) return -EFAULT; return 0; case TCP_ULP: - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; len = min_t(unsigned int, len, TCP_ULP_NAME_MAX); if (!icsk->icsk_ulp_ops) { - if (put_user(0, optlen)) + len = 0; + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; return 0; } - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, icsk->icsk_ulp_ops->name, len)) + if (copy_to_sockptr(optval, icsk->icsk_ulp_ops->name, len)) return -EFAULT; return 0; @@ -4172,15 +4174,15 @@ static int do_tcp_getsockopt(struct sock *sk, int level, u64 key[TCP_FASTOPEN_KEY_BUF_LENGTH / sizeof(u64)]; unsigned int key_len; - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; key_len = tcp_fastopen_get_cipher(net, icsk, key) * TCP_FASTOPEN_KEY_LENGTH; len = min_t(unsigned int, len, key_len); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, key, len)) + if (copy_to_sockptr(optval, key, len)) return -EFAULT; return 0; } @@ -4206,7 +4208,7 @@ static int do_tcp_getsockopt(struct sock *sk, int level, case TCP_REPAIR_WINDOW: { struct tcp_repair_window opt; - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; if (len != sizeof(opt)) @@ -4221,7 +4223,7 @@ static int do_tcp_getsockopt(struct sock *sk, int level, opt.rcv_wnd = tp->rcv_wnd; opt.rcv_wup = tp->rcv_wup; - if (copy_to_user(optval, &opt, len)) + if (copy_to_sockptr(optval, &opt, len)) return -EFAULT; return 0; } @@ -4267,35 +4269,35 @@ static int do_tcp_getsockopt(struct sock *sk, int level, val = tp->save_syn; break; case TCP_SAVED_SYN: { - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; - lock_sock(sk); + sockopt_lock_sock(sk); if (tp->saved_syn) { if (len < tcp_saved_syn_len(tp->saved_syn)) { - if (put_user(tcp_saved_syn_len(tp->saved_syn), - optlen)) { - release_sock(sk); + len = tcp_saved_syn_len(tp->saved_syn); + if (copy_to_sockptr(optlen, &len, sizeof(int))) { + sockopt_release_sock(sk); return -EFAULT; } - release_sock(sk); + sockopt_release_sock(sk); return -EINVAL; } len = tcp_saved_syn_len(tp->saved_syn); - if (put_user(len, optlen)) { - release_sock(sk); + if (copy_to_sockptr(optlen, &len, sizeof(int))) { + sockopt_release_sock(sk); return -EFAULT; } - if (copy_to_user(optval, tp->saved_syn->data, len)) { - release_sock(sk); + if (copy_to_sockptr(optval, tp->saved_syn->data, len)) { + sockopt_release_sock(sk); return -EFAULT; } tcp_saved_syn_free(tp); - release_sock(sk); + sockopt_release_sock(sk); } else { - release_sock(sk); + sockopt_release_sock(sk); len = 0; - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; } return 0; @@ -4306,31 +4308,31 @@ static int do_tcp_getsockopt(struct sock *sk, int level, struct tcp_zerocopy_receive zc = {}; int err; - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; if (len < 0 || len < offsetofend(struct tcp_zerocopy_receive, length)) return -EINVAL; if (unlikely(len > sizeof(zc))) { - err = check_zeroed_user(optval + sizeof(zc), - len - sizeof(zc)); + err = check_zeroed_sockptr(optval, sizeof(zc), + len - sizeof(zc)); if (err < 1) return err == 0 ? -EINVAL : err; len = sizeof(zc); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; } - if (copy_from_user(&zc, optval, len)) + if (copy_from_sockptr(&zc, optval, len)) return -EFAULT; if (zc.reserved) return -EINVAL; if (zc.msg_flags & ~(TCP_VALID_ZC_MSG_FLAGS)) return -EINVAL; - lock_sock(sk); + sockopt_lock_sock(sk); err = tcp_zerocopy_receive(sk, &zc, &tss); err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname, &zc, &len, err); - release_sock(sk); + sockopt_release_sock(sk); if (len >= offsetofend(struct tcp_zerocopy_receive, msg_flags)) goto zerocopy_rcv_cmsg; switch (len) { @@ -4360,7 +4362,7 @@ zerocopy_rcv_sk_err: zerocopy_rcv_inq: zc.inq = tcp_inq_hint(sk); zerocopy_rcv_out: - if (!err && copy_to_user(optval, &zc, len)) + if (!err && copy_to_sockptr(optval, &zc, len)) err = -EFAULT; return err; } @@ -4369,9 +4371,9 @@ zerocopy_rcv_out: return -ENOPROTOOPT; } - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &val, len)) + if (copy_to_sockptr(optval, &val, len)) return -EFAULT; return 0; } @@ -4396,7 +4398,8 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval, if (level != SOL_TCP) return icsk->icsk_af_ops->getsockopt(sk, level, optname, optval, optlen); - return do_tcp_getsockopt(sk, level, optname, optval, optlen); + return do_tcp_getsockopt(sk, level, optname, USER_SOCKPTR(optval), + USER_SOCKPTR(optlen)); } EXPORT_SYMBOL(tcp_getsockopt); @@ -4442,12 +4445,16 @@ static void __tcp_alloc_md5sig_pool(void) * to memory. See smp_rmb() in tcp_get_md5sig_pool() */ smp_wmb(); - tcp_md5sig_pool_populated = true; + /* Paired with READ_ONCE() from tcp_alloc_md5sig_pool() + * and tcp_get_md5sig_pool(). + */ + WRITE_ONCE(tcp_md5sig_pool_populated, true); } bool tcp_alloc_md5sig_pool(void) { - if (unlikely(!tcp_md5sig_pool_populated)) { + /* Paired with WRITE_ONCE() from __tcp_alloc_md5sig_pool() */ + if (unlikely(!READ_ONCE(tcp_md5sig_pool_populated))) { mutex_lock(&tcp_md5sig_mutex); if (!tcp_md5sig_pool_populated) { @@ -4458,7 +4465,8 @@ bool tcp_alloc_md5sig_pool(void) mutex_unlock(&tcp_md5sig_mutex); } - return tcp_md5sig_pool_populated; + /* Paired with WRITE_ONCE() from __tcp_alloc_md5sig_pool() */ + return READ_ONCE(tcp_md5sig_pool_populated); } EXPORT_SYMBOL(tcp_alloc_md5sig_pool); @@ -4474,7 +4482,8 @@ struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) { local_bh_disable(); - if (tcp_md5sig_pool_populated) { + /* Paired with WRITE_ONCE() from __tcp_alloc_md5sig_pool() */ + if (READ_ONCE(tcp_md5sig_pool_populated)) { /* coupled with smp_wmb() in __tcp_alloc_md5sig_pool() */ smp_rmb(); return this_cpu_ptr(&tcp_md5sig_pool); @@ -4745,6 +4754,12 @@ void __init tcp_init(void) SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT, NULL); + tcp_hashinfo.bind2_bucket_cachep = + kmem_cache_create("tcp_bind2_bucket", + sizeof(struct inet_bind2_bucket), 0, + SLAB_HWCACHE_ALIGN | SLAB_PANIC | + SLAB_ACCOUNT, + NULL); /* Size and allocate the main established and bind bucket * hash tables. @@ -4768,7 +4783,7 @@ void __init tcp_init(void) panic("TCP: failed to alloc ehash_locks"); tcp_hashinfo.bhash = alloc_large_system_hash("TCP bind", - sizeof(struct inet_bind_hashbucket), + 2 * sizeof(struct inet_bind_hashbucket), tcp_hashinfo.ehash_mask + 1, 17, /* one slot per 128 KB of memory */ 0, @@ -4777,11 +4792,15 @@ void __init tcp_init(void) 0, 64 * 1024); tcp_hashinfo.bhash_size = 1U << tcp_hashinfo.bhash_size; + tcp_hashinfo.bhash2 = tcp_hashinfo.bhash + tcp_hashinfo.bhash_size; for (i = 0; i < tcp_hashinfo.bhash_size; i++) { spin_lock_init(&tcp_hashinfo.bhash[i].lock); INIT_HLIST_HEAD(&tcp_hashinfo.bhash[i].chain); + spin_lock_init(&tcp_hashinfo.bhash2[i].lock); + INIT_HLIST_HEAD(&tcp_hashinfo.bhash2[i].chain); } + tcp_hashinfo.pernet = false; cnt = tcp_hashinfo.ehash_mask + 1; sysctl_tcp_max_orphans = cnt / 2; diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c index 75a1c985f49a..01b50fa79189 100644 --- a/net/ipv4/tcp_diag.c +++ b/net/ipv4/tcp_diag.c @@ -181,13 +181,21 @@ static size_t tcp_diag_get_aux_size(struct sock *sk, bool net_admin) static void tcp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, const struct inet_diag_req_v2 *r) { - inet_diag_dump_icsk(&tcp_hashinfo, skb, cb, r); + struct inet_hashinfo *hinfo; + + hinfo = sock_net(cb->skb->sk)->ipv4.tcp_death_row.hashinfo; + + inet_diag_dump_icsk(hinfo, skb, cb, r); } static int tcp_diag_dump_one(struct netlink_callback *cb, const struct inet_diag_req_v2 *req) { - return inet_diag_dump_one_icsk(&tcp_hashinfo, cb, req); + struct inet_hashinfo *hinfo; + + hinfo = sock_net(cb->skb->sk)->ipv4.tcp_death_row.hashinfo; + + return inet_diag_dump_one_icsk(hinfo, cb, req); } #ifdef CONFIG_INET_DIAG_DESTROY @@ -195,9 +203,13 @@ static int tcp_diag_destroy(struct sk_buff *in_skb, const struct inet_diag_req_v2 *req) { struct net *net = sock_net(in_skb->sk); - struct sock *sk = inet_diag_find_one_icsk(net, &tcp_hashinfo, req); + struct inet_hashinfo *hinfo; + struct sock *sk; int err; + hinfo = net->ipv4.tcp_death_row.hashinfo; + sk = inet_diag_find_one_icsk(net, hinfo, req); + if (IS_ERR(sk)) return PTR_ERR(sk); diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index 825b216d11f5..45cc7f1ca296 100644 --- a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -272,8 +272,9 @@ static struct sock *tcp_fastopen_create_child(struct sock *sk, * The request socket is not added to the ehash * because it's been added to the accept queue directly. */ + req->timeout = tcp_timeout_init(child); inet_csk_reset_xmit_timer(child, ICSK_TIME_RETRANS, - TCP_TIMEOUT_INIT, TCP_RTO_MAX); + req->timeout, TCP_RTO_MAX); refcount_set(&req->rsk_refcnt, 2); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 5b019ba2b9d2..6376ad915765 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -199,16 +199,18 @@ static int tcp_v4_pre_connect(struct sock *sk, struct sockaddr *uaddr, /* This will initiate an outgoing connection. */ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { + struct inet_bind_hashbucket *prev_addr_hashbucket = NULL; struct sockaddr_in *usin = (struct sockaddr_in *)uaddr; + struct inet_timewait_death_row *tcp_death_row; + __be32 daddr, nexthop, prev_sk_rcv_saddr; struct inet_sock *inet = inet_sk(sk); struct tcp_sock *tp = tcp_sk(sk); + struct ip_options_rcu *inet_opt; + struct net *net = sock_net(sk); __be16 orig_sport, orig_dport; - __be32 daddr, nexthop; struct flowi4 *fl4; struct rtable *rt; int err; - struct ip_options_rcu *inet_opt; - struct inet_timewait_death_row *tcp_death_row = sock_net(sk)->ipv4.tcp_death_row; if (addr_len < sizeof(struct sockaddr_in)) return -EINVAL; @@ -234,7 +236,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) if (IS_ERR(rt)) { err = PTR_ERR(rt); if (err == -ENETUNREACH) - IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES); + IP_INC_STATS(net, IPSTATS_MIB_OUTNOROUTES); return err; } @@ -246,10 +248,29 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) if (!inet_opt || !inet_opt->opt.srr) daddr = fl4->daddr; - if (!inet->inet_saddr) + tcp_death_row = &sock_net(sk)->ipv4.tcp_death_row; + + if (!inet->inet_saddr) { + if (inet_csk(sk)->icsk_bind2_hash) { + prev_addr_hashbucket = inet_bhashfn_portaddr(tcp_death_row->hashinfo, + sk, net, inet->inet_num); + prev_sk_rcv_saddr = sk->sk_rcv_saddr; + } inet->inet_saddr = fl4->saddr; + } + sk_rcv_saddr_set(sk, inet->inet_saddr); + if (prev_addr_hashbucket) { + err = inet_bhash2_update_saddr(prev_addr_hashbucket, sk); + if (err) { + inet->inet_saddr = 0; + sk_rcv_saddr_set(sk, prev_sk_rcv_saddr); + ip_rt_put(rt); + return err; + } + } + if (tp->rx_opt.ts_recent_stamp && inet->inet_daddr != daddr) { /* Reset inherited state */ tp->rx_opt.ts_recent = 0; @@ -298,8 +319,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) inet->inet_daddr, inet->inet_sport, usin->sin_port)); - tp->tsoffset = secure_tcp_ts_off(sock_net(sk), - inet->inet_saddr, + tp->tsoffset = secure_tcp_ts_off(net, inet->inet_saddr, inet->inet_daddr); } @@ -475,9 +495,9 @@ int tcp_v4_err(struct sk_buff *skb, u32 info) int err; struct net *net = dev_net(skb->dev); - sk = __inet_lookup_established(net, &tcp_hashinfo, iph->daddr, - th->dest, iph->saddr, ntohs(th->source), - inet_iif(skb), 0); + sk = __inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, + iph->daddr, th->dest, iph->saddr, + ntohs(th->source), inet_iif(skb), 0); if (!sk) { __ICMP_INC_STATS(net, ICMP_MIB_INERRORS); return -ENOENT; @@ -740,8 +760,8 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) * Incoming packet is checked with md5 hash with finding key, * no RST generated if md5 hash doesn't match. */ - sk1 = __inet_lookup_listener(net, &tcp_hashinfo, NULL, 0, - ip_hdr(skb)->saddr, + sk1 = __inet_lookup_listener(net, net->ipv4.tcp_death_row.hashinfo, + NULL, 0, ip_hdr(skb)->saddr, th->source, ip_hdr(skb)->daddr, ntohs(th->source), dif, sdif); /* don't send rst if it can't find key */ @@ -1709,6 +1729,7 @@ EXPORT_SYMBOL(tcp_v4_do_rcv); int tcp_v4_early_demux(struct sk_buff *skb) { + struct net *net = dev_net(skb->dev); const struct iphdr *iph; const struct tcphdr *th; struct sock *sk; @@ -1725,7 +1746,7 @@ int tcp_v4_early_demux(struct sk_buff *skb) if (th->doff < sizeof(struct tcphdr) / 4) return 0; - sk = __inet_lookup_established(dev_net(skb->dev), &tcp_hashinfo, + sk = __inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, iph->saddr, th->source, iph->daddr, ntohs(th->dest), skb->skb_iif, inet_sdif(skb)); @@ -1951,7 +1972,8 @@ int tcp_v4_rcv(struct sk_buff *skb) th = (const struct tcphdr *)skb->data; iph = ip_hdr(skb); lookup: - sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source, + sk = __inet_lookup_skb(net->ipv4.tcp_death_row.hashinfo, + skb, __tcp_hdrlen(th), th->source, th->dest, sdif, &refcounted); if (!sk) goto no_tcp_socket; @@ -2133,9 +2155,9 @@ do_time_wait: } switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) { case TCP_TW_SYN: { - struct sock *sk2 = inet_lookup_listener(dev_net(skb->dev), - &tcp_hashinfo, skb, - __tcp_hdrlen(th), + struct sock *sk2 = inet_lookup_listener(net, + net->ipv4.tcp_death_row.hashinfo, + skb, __tcp_hdrlen(th), iph->saddr, th->source, iph->daddr, th->dest, inet_iif(skb), @@ -2285,15 +2307,16 @@ static bool seq_sk_match(struct seq_file *seq, const struct sock *sk) */ static void *listening_get_first(struct seq_file *seq) { + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct tcp_iter_state *st = seq->private; st->offset = 0; - for (; st->bucket <= tcp_hashinfo.lhash2_mask; st->bucket++) { + for (; st->bucket <= hinfo->lhash2_mask; st->bucket++) { struct inet_listen_hashbucket *ilb2; struct hlist_nulls_node *node; struct sock *sk; - ilb2 = &tcp_hashinfo.lhash2[st->bucket]; + ilb2 = &hinfo->lhash2[st->bucket]; if (hlist_nulls_empty(&ilb2->nulls_head)) continue; @@ -2318,6 +2341,7 @@ static void *listening_get_next(struct seq_file *seq, void *cur) struct tcp_iter_state *st = seq->private; struct inet_listen_hashbucket *ilb2; struct hlist_nulls_node *node; + struct inet_hashinfo *hinfo; struct sock *sk = cur; ++st->num; @@ -2329,7 +2353,8 @@ static void *listening_get_next(struct seq_file *seq, void *cur) return sk; } - ilb2 = &tcp_hashinfo.lhash2[st->bucket]; + hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; + ilb2 = &hinfo->lhash2[st->bucket]; spin_unlock(&ilb2->lock); ++st->bucket; return listening_get_first(seq); @@ -2351,9 +2376,10 @@ static void *listening_get_idx(struct seq_file *seq, loff_t *pos) return rc; } -static inline bool empty_bucket(const struct tcp_iter_state *st) +static inline bool empty_bucket(struct inet_hashinfo *hinfo, + const struct tcp_iter_state *st) { - return hlist_nulls_empty(&tcp_hashinfo.ehash[st->bucket].chain); + return hlist_nulls_empty(&hinfo->ehash[st->bucket].chain); } /* @@ -2362,20 +2388,21 @@ static inline bool empty_bucket(const struct tcp_iter_state *st) */ static void *established_get_first(struct seq_file *seq) { + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct tcp_iter_state *st = seq->private; st->offset = 0; - for (; st->bucket <= tcp_hashinfo.ehash_mask; ++st->bucket) { + for (; st->bucket <= hinfo->ehash_mask; ++st->bucket) { struct sock *sk; struct hlist_nulls_node *node; - spinlock_t *lock = inet_ehash_lockp(&tcp_hashinfo, st->bucket); + spinlock_t *lock = inet_ehash_lockp(hinfo, st->bucket); /* Lockless fast path for the common case of empty buckets */ - if (empty_bucket(st)) + if (empty_bucket(hinfo, st)) continue; spin_lock_bh(lock); - sk_nulls_for_each(sk, node, &tcp_hashinfo.ehash[st->bucket].chain) { + sk_nulls_for_each(sk, node, &hinfo->ehash[st->bucket].chain) { if (seq_sk_match(seq, sk)) return sk; } @@ -2387,9 +2414,10 @@ static void *established_get_first(struct seq_file *seq) static void *established_get_next(struct seq_file *seq, void *cur) { - struct sock *sk = cur; - struct hlist_nulls_node *node; + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct tcp_iter_state *st = seq->private; + struct hlist_nulls_node *node; + struct sock *sk = cur; ++st->num; ++st->offset; @@ -2401,7 +2429,7 @@ static void *established_get_next(struct seq_file *seq, void *cur) return sk; } - spin_unlock_bh(inet_ehash_lockp(&tcp_hashinfo, st->bucket)); + spin_unlock_bh(inet_ehash_lockp(hinfo, st->bucket)); ++st->bucket; return established_get_first(seq); } @@ -2439,6 +2467,7 @@ static void *tcp_get_idx(struct seq_file *seq, loff_t pos) static void *tcp_seek_last_pos(struct seq_file *seq) { + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct tcp_iter_state *st = seq->private; int bucket = st->bucket; int offset = st->offset; @@ -2447,7 +2476,7 @@ static void *tcp_seek_last_pos(struct seq_file *seq) switch (st->state) { case TCP_SEQ_STATE_LISTENING: - if (st->bucket > tcp_hashinfo.lhash2_mask) + if (st->bucket > hinfo->lhash2_mask) break; st->state = TCP_SEQ_STATE_LISTENING; rc = listening_get_first(seq); @@ -2459,7 +2488,7 @@ static void *tcp_seek_last_pos(struct seq_file *seq) st->state = TCP_SEQ_STATE_ESTABLISHED; fallthrough; case TCP_SEQ_STATE_ESTABLISHED: - if (st->bucket > tcp_hashinfo.ehash_mask) + if (st->bucket > hinfo->ehash_mask) break; rc = established_get_first(seq); while (offset-- && rc && bucket == st->bucket) @@ -2527,16 +2556,17 @@ EXPORT_SYMBOL(tcp_seq_next); void tcp_seq_stop(struct seq_file *seq, void *v) { + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct tcp_iter_state *st = seq->private; switch (st->state) { case TCP_SEQ_STATE_LISTENING: if (v != SEQ_START_TOKEN) - spin_unlock(&tcp_hashinfo.lhash2[st->bucket].lock); + spin_unlock(&hinfo->lhash2[st->bucket].lock); break; case TCP_SEQ_STATE_ESTABLISHED: if (v) - spin_unlock_bh(inet_ehash_lockp(&tcp_hashinfo, st->bucket)); + spin_unlock_bh(inet_ehash_lockp(hinfo, st->bucket)); break; } } @@ -2731,6 +2761,7 @@ static int bpf_iter_tcp_realloc_batch(struct bpf_tcp_iter_state *iter, static unsigned int bpf_iter_tcp_listening_batch(struct seq_file *seq, struct sock *start_sk) { + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct bpf_tcp_iter_state *iter = seq->private; struct tcp_iter_state *st = &iter->state; struct hlist_nulls_node *node; @@ -2750,7 +2781,7 @@ static unsigned int bpf_iter_tcp_listening_batch(struct seq_file *seq, expected++; } } - spin_unlock(&tcp_hashinfo.lhash2[st->bucket].lock); + spin_unlock(&hinfo->lhash2[st->bucket].lock); return expected; } @@ -2758,6 +2789,7 @@ static unsigned int bpf_iter_tcp_listening_batch(struct seq_file *seq, static unsigned int bpf_iter_tcp_established_batch(struct seq_file *seq, struct sock *start_sk) { + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct bpf_tcp_iter_state *iter = seq->private; struct tcp_iter_state *st = &iter->state; struct hlist_nulls_node *node; @@ -2777,13 +2809,14 @@ static unsigned int bpf_iter_tcp_established_batch(struct seq_file *seq, expected++; } } - spin_unlock_bh(inet_ehash_lockp(&tcp_hashinfo, st->bucket)); + spin_unlock_bh(inet_ehash_lockp(hinfo, st->bucket)); return expected; } static struct sock *bpf_iter_tcp_batch(struct seq_file *seq) { + struct inet_hashinfo *hinfo = seq_file_net(seq)->ipv4.tcp_death_row.hashinfo; struct bpf_tcp_iter_state *iter = seq->private; struct tcp_iter_state *st = &iter->state; unsigned int expected; @@ -2799,7 +2832,7 @@ static struct sock *bpf_iter_tcp_batch(struct seq_file *seq) st->offset = 0; st->bucket++; if (st->state == TCP_SEQ_STATE_LISTENING && - st->bucket > tcp_hashinfo.lhash2_mask) { + st->bucket > hinfo->lhash2_mask) { st->state = TCP_SEQ_STATE_ESTABLISHED; st->bucket = 0; } @@ -3064,7 +3097,7 @@ struct proto tcp_prot = { .slab_flags = SLAB_TYPESAFE_BY_RCU, .twsk_prot = &tcp_timewait_sock_ops, .rsk_prot = &tcp_request_sock_ops, - .h.hashinfo = &tcp_hashinfo, + .h.hashinfo = NULL, .no_autobind = true, .diag_destroy = tcp_abort, }; @@ -3072,19 +3105,43 @@ EXPORT_SYMBOL(tcp_prot); static void __net_exit tcp_sk_exit(struct net *net) { - struct inet_timewait_death_row *tcp_death_row = net->ipv4.tcp_death_row; - if (net->ipv4.tcp_congestion_control) bpf_module_put(net->ipv4.tcp_congestion_control, net->ipv4.tcp_congestion_control->owner); - if (refcount_dec_and_test(&tcp_death_row->tw_refcount)) - kfree(tcp_death_row); } -static int __net_init tcp_sk_init(struct net *net) +static void __net_init tcp_set_hashinfo(struct net *net) { - int cnt; + struct inet_hashinfo *hinfo; + unsigned int ehash_entries; + struct net *old_net; + + if (net_eq(net, &init_net)) + goto fallback; + + old_net = current->nsproxy->net_ns; + ehash_entries = READ_ONCE(old_net->ipv4.sysctl_tcp_child_ehash_entries); + if (!ehash_entries) + goto fallback; + + ehash_entries = roundup_pow_of_two(ehash_entries); + hinfo = inet_pernet_hashinfo_alloc(&tcp_hashinfo, ehash_entries); + if (!hinfo) { + pr_warn("Failed to allocate TCP ehash (entries: %u) " + "for a netns, fallback to the global one\n", + ehash_entries); +fallback: + hinfo = &tcp_hashinfo; + ehash_entries = tcp_hashinfo.ehash_mask + 1; + } + net->ipv4.tcp_death_row.hashinfo = hinfo; + net->ipv4.tcp_death_row.sysctl_max_tw_buckets = ehash_entries / 2; + net->ipv4.sysctl_max_syn_backlog = max(128U, ehash_entries / 128); +} + +static int __net_init tcp_sk_init(struct net *net) +{ net->ipv4.sysctl_tcp_ecn = 2; net->ipv4.sysctl_tcp_ecn_fallback = 1; @@ -3110,15 +3167,9 @@ static int __net_init tcp_sk_init(struct net *net) net->ipv4.sysctl_tcp_tw_reuse = 2; net->ipv4.sysctl_tcp_no_ssthresh_metrics_save = 1; - net->ipv4.tcp_death_row = kzalloc(sizeof(struct inet_timewait_death_row), GFP_KERNEL); - if (!net->ipv4.tcp_death_row) - return -ENOMEM; - refcount_set(&net->ipv4.tcp_death_row->tw_refcount, 1); - cnt = tcp_hashinfo.ehash_mask + 1; - net->ipv4.tcp_death_row->sysctl_max_tw_buckets = cnt / 2; - net->ipv4.tcp_death_row->hashinfo = &tcp_hashinfo; + refcount_set(&net->ipv4.tcp_death_row.tw_refcount, 1); + tcp_set_hashinfo(net); - net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 128); net->ipv4.sysctl_tcp_sack = 1; net->ipv4.sysctl_tcp_window_scaling = 1; net->ipv4.sysctl_tcp_timestamps = 1; @@ -3180,10 +3231,13 @@ static void __net_exit tcp_sk_exit_batch(struct list_head *net_exit_list) { struct net *net; - inet_twsk_purge(&tcp_hashinfo, AF_INET); + tcp_twsk_purge(net_exit_list, AF_INET); - list_for_each_entry(net, net_exit_list, exit_list) + list_for_each_entry(net, net_exit_list, exit_list) { + inet_pernet_hashinfo_free(net->ipv4.tcp_death_row.hashinfo); + WARN_ON_ONCE(!refcount_dec_and_test(&net->ipv4.tcp_death_row.tw_refcount)); tcp_fastopen_ctx_destroy(net); + } } static struct pernet_operations __net_initdata tcp_sk_ops = { diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c index d58e672be31c..82f4575f9cd9 100644 --- a/net/ipv4/tcp_metrics.c +++ b/net/ipv4/tcp_metrics.c @@ -969,6 +969,7 @@ static struct genl_family tcp_metrics_nl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = tcp_metrics_nl_ops, .n_small_ops = ARRAY_SIZE(tcp_metrics_nl_ops), + .resv_start_op = TCP_METRICS_CMD_DEL + 1, }; static unsigned int tcpmhash_entries; diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index cb95d88497ae..79f30f026d89 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -247,10 +247,10 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) { const struct inet_connection_sock *icsk = inet_csk(sk); const struct tcp_sock *tp = tcp_sk(sk); + struct net *net = sock_net(sk); struct inet_timewait_sock *tw; - struct inet_timewait_death_row *tcp_death_row = sock_net(sk)->ipv4.tcp_death_row; - tw = inet_twsk_alloc(sk, tcp_death_row, state); + tw = inet_twsk_alloc(sk, &net->ipv4.tcp_death_row, state); if (tw) { struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); @@ -319,14 +319,14 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) /* Linkage updates. * Note that access to tw after this point is illegal. */ - inet_twsk_hashdance(tw, sk, &tcp_hashinfo); + inet_twsk_hashdance(tw, sk, net->ipv4.tcp_death_row.hashinfo); local_bh_enable(); } else { /* Sorry, if we're out of memory, just CLOSE this * socket up. We've got bigger problems than * non-graceful socket closings. */ - NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPTIMEWAITOVERFLOW); + NET_INC_STATS(net, LINUX_MIB_TCPTIMEWAITOVERFLOW); } tcp_update_metrics(sk); @@ -347,6 +347,26 @@ void tcp_twsk_destructor(struct sock *sk) } EXPORT_SYMBOL_GPL(tcp_twsk_destructor); +void tcp_twsk_purge(struct list_head *net_exit_list, int family) +{ + bool purged_once = false; + struct net *net; + + list_for_each_entry(net, net_exit_list, exit_list) { + /* The last refcount is decremented in tcp_sk_exit_batch() */ + if (refcount_read(&net->ipv4.tcp_death_row.tw_refcount) == 1) + continue; + + if (net->ipv4.tcp_death_row.hashinfo->pernet) { + inet_twsk_purge(net->ipv4.tcp_death_row.hashinfo, family); + } else if (!purged_once) { + inet_twsk_purge(&tcp_hashinfo, family); + purged_once = true; + } + } +} +EXPORT_SYMBOL_GPL(tcp_twsk_purge); + /* Warning : This function is called without sk_listener being locked. * Be sure to read socket fields once, as their value could change under us. */ @@ -541,6 +561,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newtp->fastopen_req = NULL; RCU_INIT_POINTER(newtp->fastopen_rsk, NULL); + newtp->bpf_chg_cc_inprogress = 0; tcp_bpf_clone(sk, newsk); __TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS); diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index 30abde86db45..45dda7889387 100644 --- a/net/ipv4/tcp_offload.c +++ b/net/ipv4/tcp_offload.c @@ -195,12 +195,9 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb) off = skb_gro_offset(skb); hlen = off + sizeof(*th); - th = skb_gro_header_fast(skb, off); - if (skb_gro_header_hard(skb, hlen)) { - th = skb_gro_header_slow(skb, hlen, off); - if (unlikely(!th)) - goto out; - } + th = skb_gro_header(skb, hlen, off); + if (unlikely(!th)) + goto out; thlen = th->doff * 4; if (thlen < sizeof(*th)) @@ -258,7 +255,15 @@ found: mss = skb_shinfo(p)->gso_size; - flush |= (len - 1) >= mss; + /* If skb is a GRO packet, make sure its gso_size matches prior packet mss. + * If it is a single frame, do not aggregate it if its length + * is bigger than our mss. + */ + if (unlikely(skb_is_gso(skb))) + flush |= (mss != skb_shinfo(skb)->gso_size); + else + flush |= (len - 1) >= mss; + flush |= (ntohl(th2->seq) + skb_gro_len(p)) ^ ntohl(th->seq); #ifdef CONFIG_TLS_DEVICE flush |= p->decrypted ^ skb->decrypted; @@ -272,7 +277,12 @@ found: tcp_flag_word(th2) |= flags & (TCP_FLAG_FIN | TCP_FLAG_PSH); out_check_final: - flush = len < mss; + /* Force a flush if last segment is smaller than mss. */ + if (unlikely(skb_is_gso(skb))) + flush = len != NAPI_GRO_CB(skb)->count * skb_shinfo(skb)->gso_size; + else + flush = len < mss; + flush |= (__force int)(flags & (TCP_FLAG_URG | TCP_FLAG_PSH | TCP_FLAG_RST | TCP_FLAG_SYN | TCP_FLAG_FIN)); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 290019de766d..c69f4d966024 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1875,15 +1875,20 @@ static void tcp_cwnd_validate(struct sock *sk, bool is_cwnd_limited) const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops; struct tcp_sock *tp = tcp_sk(sk); - /* Track the maximum number of outstanding packets in each - * window, and remember whether we were cwnd-limited then. + /* Track the strongest available signal of the degree to which the cwnd + * is fully utilized. If cwnd-limited then remember that fact for the + * current window. If not cwnd-limited then track the maximum number of + * outstanding packets in the current window. (If cwnd-limited then we + * chose to not update tp->max_packets_out to avoid an extra else + * clause with no functional impact.) */ - if (!before(tp->snd_una, tp->max_packets_seq) || - tp->packets_out > tp->max_packets_out || - is_cwnd_limited) { - tp->max_packets_out = tp->packets_out; - tp->max_packets_seq = tp->snd_nxt; + if (!before(tp->snd_una, tp->cwnd_usage_seq) || + is_cwnd_limited || + (!tp->is_cwnd_limited && + tp->packets_out > tp->max_packets_out)) { tp->is_cwnd_limited = is_cwnd_limited; + tp->max_packets_out = tp->packets_out; + tp->cwnd_usage_seq = tp->snd_nxt; } if (tcp_is_cwnd_limited(sk)) { diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index b4dfb82d6ecb..cb79127f45c3 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -428,7 +428,7 @@ static void tcp_fastopen_synack_timer(struct sock *sk, struct request_sock *req) if (!tp->retrans_stamp) tp->retrans_stamp = tcp_time_stamp(tp); inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, - TCP_TIMEOUT_INIT << req->num_timeout, TCP_RTO_MAX); + req->timeout << req->num_timeout, TCP_RTO_MAX); } diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 560d9eadeaa5..d63118ce5900 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1801,41 +1801,29 @@ EXPORT_SYMBOL(__skb_recv_udp); int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) { - int copied = 0; - - while (1) { - struct sk_buff *skb; - int err, used; - - skb = skb_recv_udp(sk, MSG_DONTWAIT, &err); - if (!skb) - return err; + struct sk_buff *skb; + int err, copied; - if (udp_lib_checksum_complete(skb)) { - __UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, - IS_UDPLITE(sk)); - __UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, - IS_UDPLITE(sk)); - atomic_inc(&sk->sk_drops); - kfree_skb(skb); - continue; - } +try_again: + skb = skb_recv_udp(sk, MSG_DONTWAIT, &err); + if (!skb) + return err; - WARN_ON_ONCE(!skb_set_owner_sk_safe(skb, sk)); - used = recv_actor(sk, skb); - if (used <= 0) { - if (!copied) - copied = used; - kfree_skb(skb); - break; - } else if (used <= skb->len) { - copied += used; - } + if (udp_lib_checksum_complete(skb)) { + int is_udplite = IS_UDPLITE(sk); + struct net *net = sock_net(sk); + __UDP_INC_STATS(net, UDP_MIB_CSUMERRORS, is_udplite); + __UDP_INC_STATS(net, UDP_MIB_INERRORS, is_udplite); + atomic_inc(&sk->sk_drops); kfree_skb(skb); - break; + goto try_again; } + WARN_ON_ONCE(!skb_set_owner_sk_safe(skb, sk)); + copied = recv_actor(sk, skb); + kfree_skb(skb); + return copied; } EXPORT_SYMBOL(udp_read_skb); diff --git a/net/ipv4/xfrm4_tunnel.c b/net/ipv4/xfrm4_tunnel.c index 9d4f418f1bf8..8489fa106583 100644 --- a/net/ipv4/xfrm4_tunnel.c +++ b/net/ipv4/xfrm4_tunnel.c @@ -22,13 +22,17 @@ static int ipip_xfrm_rcv(struct xfrm_state *x, struct sk_buff *skb) return ip_hdr(skb)->protocol; } -static int ipip_init_state(struct xfrm_state *x) +static int ipip_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { - if (x->props.mode != XFRM_MODE_TUNNEL) + if (x->props.mode != XFRM_MODE_TUNNEL) { + NL_SET_ERR_MSG(extack, "IPv4 tunnel can only be used with tunnel mode"); return -EINVAL; + } - if (x->encap) + if (x->encap) { + NL_SET_ERR_MSG(extack, "IPv4 tunnel is not compatible with encapsulation"); return -EINVAL; + } x->props.header_len = sizeof(struct iphdr); diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index dbb1430d6cc2..d40b7d60e00e 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -1057,6 +1057,8 @@ static const struct ipv6_stub ipv6_stub_impl = { static const struct ipv6_bpf_stub ipv6_bpf_stub_impl = { .inet6_bind = __inet6_bind, .udp6_lib_lookup = __udp6_lib_lookup, + .ipv6_setsockopt = do_ipv6_setsockopt, + .ipv6_getsockopt = do_ipv6_getsockopt, }; static int __init inet6_init(void) diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c index b5995c1f4d7a..5228d2716289 100644 --- a/net/ipv6/ah6.c +++ b/net/ipv6/ah6.c @@ -666,30 +666,38 @@ static int ah6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, return 0; } -static int ah6_init_state(struct xfrm_state *x) +static int ah6_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { struct ah_data *ahp = NULL; struct xfrm_algo_desc *aalg_desc; struct crypto_ahash *ahash; - if (!x->aalg) + if (!x->aalg) { + NL_SET_ERR_MSG(extack, "AH requires a state with an AUTH algorithm"); goto error; + } - if (x->encap) + if (x->encap) { + NL_SET_ERR_MSG(extack, "AH is not compatible with encapsulation"); goto error; + } ahp = kzalloc(sizeof(*ahp), GFP_KERNEL); if (!ahp) return -ENOMEM; ahash = crypto_alloc_ahash(x->aalg->alg_name, 0, 0); - if (IS_ERR(ahash)) + if (IS_ERR(ahash)) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; + } ahp->ahash = ahash; if (crypto_ahash_setkey(ahash, x->aalg->alg_key, - (x->aalg->alg_key_len + 7) / 8)) + (x->aalg->alg_key_len + 7) / 8)) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; + } /* * Lookup the algorithm description maintained by xfrm_algo, @@ -702,9 +710,7 @@ static int ah6_init_state(struct xfrm_state *x) if (aalg_desc->uinfo.auth.icv_fullbits/8 != crypto_ahash_digestsize(ahash)) { - pr_info("AH: %s digestsize %u != %u\n", - x->aalg->alg_name, crypto_ahash_digestsize(ahash), - aalg_desc->uinfo.auth.icv_fullbits/8); + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; } @@ -721,6 +727,7 @@ static int ah6_init_state(struct xfrm_state *x) x->props.header_len += sizeof(struct ipv6hdr); break; default: + NL_SET_ERR_MSG(extack, "Invalid mode requested for AH, must be one of TRANSPORT, TUNNEL, BEET"); goto error; } x->data = ahp; diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c index 8220923a12f7..14ed868680c6 100644 --- a/net/ipv6/esp6.c +++ b/net/ipv6/esp6.c @@ -151,6 +151,7 @@ static void esp_free_tcp_sk(struct rcu_head *head) static struct sock *esp6_find_tcp_sk(struct xfrm_state *x) { struct xfrm_encap_tmpl *encap = x->encap; + struct net *net = xs_net(x); struct esp_tcp_sk *esk; __be16 sport, dport; struct sock *nsk; @@ -177,7 +178,7 @@ static struct sock *esp6_find_tcp_sk(struct xfrm_state *x) } spin_unlock_bh(&x->lock); - sk = __inet6_lookup_established(xs_net(x), &tcp_hashinfo, &x->id.daddr.in6, + sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, &x->id.daddr.in6, dport, &x->props.saddr.in6, ntohs(sport), 0, 0); if (!sk) return ERR_PTR(-ENOENT); @@ -1050,16 +1051,17 @@ static void esp6_destroy(struct xfrm_state *x) crypto_free_aead(aead); } -static int esp_init_aead(struct xfrm_state *x) +static int esp_init_aead(struct xfrm_state *x, struct netlink_ext_ack *extack) { char aead_name[CRYPTO_MAX_ALG_NAME]; struct crypto_aead *aead; int err; - err = -ENAMETOOLONG; if (snprintf(aead_name, CRYPTO_MAX_ALG_NAME, "%s(%s)", - x->geniv, x->aead->alg_name) >= CRYPTO_MAX_ALG_NAME) - goto error; + x->geniv, x->aead->alg_name) >= CRYPTO_MAX_ALG_NAME) { + NL_SET_ERR_MSG(extack, "Algorithm name is too long"); + return -ENAMETOOLONG; + } aead = crypto_alloc_aead(aead_name, 0, 0); err = PTR_ERR(aead); @@ -1077,11 +1079,15 @@ static int esp_init_aead(struct xfrm_state *x) if (err) goto error; + return 0; + error: + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); return err; } -static int esp_init_authenc(struct xfrm_state *x) +static int esp_init_authenc(struct xfrm_state *x, + struct netlink_ext_ack *extack) { struct crypto_aead *aead; struct crypto_authenc_key_param *param; @@ -1092,10 +1098,6 @@ static int esp_init_authenc(struct xfrm_state *x) unsigned int keylen; int err; - err = -EINVAL; - if (!x->ealg) - goto error; - err = -ENAMETOOLONG; if ((x->props.flags & XFRM_STATE_ESN)) { @@ -1104,22 +1106,28 @@ static int esp_init_authenc(struct xfrm_state *x) x->geniv ?: "", x->geniv ? "(" : "", x->aalg ? x->aalg->alg_name : "digest_null", x->ealg->alg_name, - x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) + x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) { + NL_SET_ERR_MSG(extack, "Algorithm name is too long"); goto error; + } } else { if (snprintf(authenc_name, CRYPTO_MAX_ALG_NAME, "%s%sauthenc(%s,%s)%s", x->geniv ?: "", x->geniv ? "(" : "", x->aalg ? x->aalg->alg_name : "digest_null", x->ealg->alg_name, - x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) + x->geniv ? ")" : "") >= CRYPTO_MAX_ALG_NAME) { + NL_SET_ERR_MSG(extack, "Algorithm name is too long"); goto error; + } } aead = crypto_alloc_aead(authenc_name, 0, 0); err = PTR_ERR(aead); - if (IS_ERR(aead)) + if (IS_ERR(aead)) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto error; + } x->data = aead; @@ -1149,17 +1157,16 @@ static int esp_init_authenc(struct xfrm_state *x) err = -EINVAL; if (aalg_desc->uinfo.auth.icv_fullbits / 8 != crypto_aead_authsize(aead)) { - pr_info("ESP: %s digestsize %u != %u\n", - x->aalg->alg_name, - crypto_aead_authsize(aead), - aalg_desc->uinfo.auth.icv_fullbits / 8); + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto free_key; } err = crypto_aead_setauthsize( aead, x->aalg->alg_trunc_len / 8); - if (err) + if (err) { + NL_SET_ERR_MSG(extack, "Kernel was unable to initialize cryptographic operations"); goto free_key; + } } param->enckeylen = cpu_to_be32((x->ealg->alg_key_len + 7) / 8); @@ -1174,7 +1181,7 @@ error: return err; } -static int esp6_init_state(struct xfrm_state *x) +static int esp6_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { struct crypto_aead *aead; u32 align; @@ -1182,10 +1189,14 @@ static int esp6_init_state(struct xfrm_state *x) x->data = NULL; - if (x->aead) - err = esp_init_aead(x); - else - err = esp_init_authenc(x); + if (x->aead) { + err = esp_init_aead(x, extack); + } else if (x->ealg) { + err = esp_init_authenc(x, extack); + } else { + NL_SET_ERR_MSG(extack, "ESP: AEAD or CRYPT must be provided"); + err = -EINVAL; + } if (err) goto error; @@ -1213,6 +1224,7 @@ static int esp6_init_state(struct xfrm_state *x) switch (encap->encap_type) { default: + NL_SET_ERR_MSG(extack, "Unsupported encapsulation type for ESP"); err = -EINVAL; goto error; case UDP_ENCAP_ESPINUDP: diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c index 3a293838a91d..79d43548279c 100644 --- a/net/ipv6/esp6_offload.c +++ b/net/ipv6/esp6_offload.c @@ -145,7 +145,10 @@ static struct sk_buff *xfrm6_tunnel_gso_segment(struct xfrm_state *x, struct sk_buff *skb, netdev_features_t features) { - return skb_eth_gso_segment(skb, features, htons(ETH_P_IPV6)); + __be16 type = x->inner_mode.family == AF_INET ? htons(ETH_P_IP) + : htons(ETH_P_IPV6); + + return skb_eth_gso_segment(skb, features, type); } static struct sk_buff *xfrm6_transport_gso_segment(struct xfrm_state *x, diff --git a/net/ipv6/ila/ila_main.c b/net/ipv6/ila/ila_main.c index 36c58aa257e8..3faf62530d6a 100644 --- a/net/ipv6/ila/ila_main.c +++ b/net/ipv6/ila/ila_main.c @@ -55,6 +55,7 @@ struct genl_family ila_nl_family __ro_after_init = { .module = THIS_MODULE, .ops = ila_nl_ops, .n_ops = ARRAY_SIZE(ila_nl_ops), + .resv_start_op = ILA_CMD_FLUSH + 1, }; static __net_init int ila_init_net(struct net *net) diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c index 7d53d62783b1..b64b49012655 100644 --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -21,8 +21,6 @@ #include <net/ip.h> #include <net/sock_reuseport.h> -extern struct inet_hashinfo tcp_hashinfo; - u32 inet6_ehashfn(const struct net *net, const struct in6_addr *laddr, const u16 lport, const struct in6_addr *faddr, const __be16 fport) @@ -169,7 +167,7 @@ static inline struct sock *inet6_lookup_run_bpf(struct net *net, struct sock *sk, *reuse_sk; bool no_reuseport; - if (hashinfo != &tcp_hashinfo) + if (hashinfo != net->ipv4.tcp_death_row.hashinfo) return NULL; /* only TCP is supported */ no_reuseport = bpf_sk_lookup_run_v6(net, IPPROTO_TCP, saddr, sport, diff --git a/net/ipv6/ioam6.c b/net/ipv6/ioam6.c index 1098131ed90c..571f0e4d9cf3 100644 --- a/net/ipv6/ioam6.c +++ b/net/ipv6/ioam6.c @@ -619,6 +619,7 @@ static struct genl_family ioam6_genl_family __ro_after_init = { .parallel_ops = true, .ops = ioam6_genl_ops, .n_ops = ARRAY_SIZE(ioam6_genl_ops), + .resv_start_op = IOAM6_CMD_NS_SET_SCHEMA + 1, .module = THIS_MODULE, }; diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 80cb50d459e4..48b4ff0294f6 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -360,7 +360,7 @@ static struct ip6_tnl *ip6gre_tunnel_locate(struct net *net, if (parms->name[0]) { if (!dev_valid_name(parms->name)) return NULL; - strlcpy(name, parms->name, IFNAMSIZ); + strscpy(name, parms->name, IFNAMSIZ); } else { strcpy(name, "ip6gre%d"); } diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c index d12dba2dd535..3ee345672849 100644 --- a/net/ipv6/ip6_offload.c +++ b/net/ipv6/ip6_offload.c @@ -219,12 +219,9 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head, off = skb_gro_offset(skb); hlen = off + sizeof(*iph); - iph = skb_gro_header_fast(skb, off); - if (skb_gro_header_hard(skb, hlen)) { - iph = skb_gro_header_slow(skb, hlen, off); - if (unlikely(!iph)) - goto out; - } + iph = skb_gro_header(skb, hlen, off); + if (unlikely(!iph)) + goto out; skb_set_network_header(skb, off); skb_gro_pull(skb, sizeof(*iph)); @@ -235,7 +232,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head, proto = iph->nexthdr; ops = rcu_dereference(inet6_offloads[proto]); if (!ops || !ops->callbacks.gro_receive) { - __pskb_pull(skb, skb_gro_offset(skb)); + pskb_pull(skb, skb_gro_offset(skb)); skb_gro_frag0_invalidate(skb); proto = ipv6_gso_pull_exthdrs(skb, proto); skb_gro_pull(skb, -skb_transport_offset(skb)); diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index f152e51242cb..e19507614f64 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1567,7 +1567,7 @@ emsgsize: paged = true; zc = true; } else { - uarg->zerocopy = 0; + uarg_to_msgzc(uarg)->zerocopy = 0; skb_zcopy_set(skb, uarg, &extra_uref); } } @@ -1648,10 +1648,7 @@ alloc_new_skb: (fraglen + alloc_extra < SKB_MAX_ALLOC || !(rt->dst.dev->features & NETIF_F_SG))) alloclen = fraglen; - else if (!zc) { - alloclen = min_t(int, fraglen, MAX_HEADER); - pagedlen = fraglen - alloclen; - } else { + else { alloclen = fragheaderlen + transhdrlen; pagedlen = datalen - transhdrlen; } diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 79c6a827dea9..cc5d5e75b658 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -293,7 +293,7 @@ static struct ip6_tnl *ip6_tnl_create(struct net *net, struct __ip6_tnl_parm *p) if (p->name[0]) { if (!dev_valid_name(p->name)) goto failed; - strlcpy(name, p->name, IFNAMSIZ); + strscpy(name, p->name, IFNAMSIZ); } else { sprintf(name, "ip6tnl%%d"); } @@ -1988,39 +1988,6 @@ static void ip6_tnl_netlink_parms(struct nlattr *data[], parms->fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); } -static bool ip6_tnl_netlink_encap_parms(struct nlattr *data[], - struct ip_tunnel_encap *ipencap) -{ - bool ret = false; - - memset(ipencap, 0, sizeof(*ipencap)); - - if (!data) - return ret; - - if (data[IFLA_IPTUN_ENCAP_TYPE]) { - ret = true; - ipencap->type = nla_get_u16(data[IFLA_IPTUN_ENCAP_TYPE]); - } - - if (data[IFLA_IPTUN_ENCAP_FLAGS]) { - ret = true; - ipencap->flags = nla_get_u16(data[IFLA_IPTUN_ENCAP_FLAGS]); - } - - if (data[IFLA_IPTUN_ENCAP_SPORT]) { - ret = true; - ipencap->sport = nla_get_be16(data[IFLA_IPTUN_ENCAP_SPORT]); - } - - if (data[IFLA_IPTUN_ENCAP_DPORT]) { - ret = true; - ipencap->dport = nla_get_be16(data[IFLA_IPTUN_ENCAP_DPORT]); - } - - return ret; -} - static int ip6_tnl_newlink(struct net *src_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) @@ -2033,7 +2000,7 @@ static int ip6_tnl_newlink(struct net *src_net, struct net_device *dev, nt = netdev_priv(dev); - if (ip6_tnl_netlink_encap_parms(data, &ipencap)) { + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { err = ip6_tnl_encap_setup(nt, &ipencap); if (err < 0) return err; @@ -2070,7 +2037,7 @@ static int ip6_tnl_changelink(struct net_device *dev, struct nlattr *tb[], if (dev == ip6n->fb_tnl_dev) return -EINVAL; - if (ip6_tnl_netlink_encap_parms(data, &ipencap)) { + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { int err = ip6_tnl_encap_setup(t, &ipencap); if (err < 0) diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 8fe59a79e800..151337d7f67b 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -154,7 +154,7 @@ vti6_tnl_link(struct vti6_net *ip6n, struct ip6_tnl *t) { struct ip6_tnl __rcu **tp = vti6_tnl_bucket(ip6n, &t->parms); - rcu_assign_pointer(t->next , rtnl_dereference(*tp)); + rcu_assign_pointer(t->next, rtnl_dereference(*tp)); rcu_assign_pointer(*tp, t); } @@ -211,7 +211,7 @@ static struct ip6_tnl *vti6_tnl_create(struct net *net, struct __ip6_tnl_parm *p if (p->name[0]) { if (!dev_valid_name(p->name)) goto failed; - strlcpy(name, p->name, IFNAMSIZ); + strscpy(name, p->name, IFNAMSIZ); } else { sprintf(name, "ip6_vti%%d"); } diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c index 858fd8a28b5b..facdc78a43e5 100644 --- a/net/ipv6/ip6mr.c +++ b/net/ipv6/ip6mr.c @@ -1830,8 +1830,8 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, sockptr_t optval, * Getsock opt support for the multicast routing system. */ -int ip6_mroute_getsockopt(struct sock *sk, int optname, char __user *optval, - int __user *optlen) +int ip6_mroute_getsockopt(struct sock *sk, int optname, sockptr_t optval, + sockptr_t optlen) { int olr; int val; @@ -1862,16 +1862,16 @@ int ip6_mroute_getsockopt(struct sock *sk, int optname, char __user *optval, return -ENOPROTOOPT; } - if (get_user(olr, optlen)) + if (copy_from_sockptr(&olr, optlen, sizeof(int))) return -EFAULT; olr = min_t(int, olr, sizeof(int)); if (olr < 0) return -EINVAL; - if (put_user(olr, optlen)) + if (copy_to_sockptr(optlen, &olr, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &val, olr)) + if (copy_to_sockptr(optval, &val, olr)) return -EFAULT; return 0; } diff --git a/net/ipv6/ipcomp6.c b/net/ipv6/ipcomp6.c index 15f984be3570..72d4858dec18 100644 --- a/net/ipv6/ipcomp6.c +++ b/net/ipv6/ipcomp6.c @@ -136,7 +136,8 @@ out: return err; } -static int ipcomp6_init_state(struct xfrm_state *x) +static int ipcomp6_init_state(struct xfrm_state *x, + struct netlink_ext_ack *extack) { int err = -EINVAL; @@ -148,17 +149,20 @@ static int ipcomp6_init_state(struct xfrm_state *x) x->props.header_len += sizeof(struct ipv6hdr); break; default: + NL_SET_ERR_MSG(extack, "Unsupported XFRM mode for IPcomp"); goto out; } - err = ipcomp_init_state(x); + err = ipcomp_init_state(x, extack); if (err) goto out; if (x->props.mode == XFRM_MODE_TUNNEL) { err = ipcomp6_tunnel_attach(x); - if (err) + if (err) { + NL_SET_ERR_MSG(extack, "Kernel error: failed to initialize the associated state"); goto out; + } } err = 0; diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index e0dcc7a193df..2d2f4dd9e5df 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -327,7 +327,7 @@ static int ipv6_set_opt_hdr(struct sock *sk, int optname, sockptr_t optval, int err; /* hop-by-hop / destination options are privileged option */ - if (optname != IPV6_RTHDR && !ns_capable(net->user_ns, CAP_NET_RAW)) + if (optname != IPV6_RTHDR && !sockopt_ns_capable(net->user_ns, CAP_NET_RAW)) return -EPERM; /* remove any sticky options header with a zero option @@ -391,8 +391,8 @@ sticky_done: return err; } -static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, - sockptr_t optval, unsigned int optlen) +int do_ipv6_setsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, unsigned int optlen) { struct ipv6_pinfo *np = inet6_sk(sk); struct net *net = sock_net(sk); @@ -417,7 +417,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, if (needs_rtnl) rtnl_lock(); - lock_sock(sk); + sockopt_lock_sock(sk); switch (optname) { @@ -634,8 +634,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, break; case IPV6_TRANSPARENT: - if (valbool && !ns_capable(net->user_ns, CAP_NET_RAW) && - !ns_capable(net->user_ns, CAP_NET_ADMIN)) { + if (valbool && !sockopt_ns_capable(net->user_ns, CAP_NET_RAW) && + !sockopt_ns_capable(net->user_ns, CAP_NET_ADMIN)) { retv = -EPERM; break; } @@ -946,7 +946,7 @@ done: case IPV6_IPSEC_POLICY: case IPV6_XFRM_POLICY: retv = -EPERM; - if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) + if (!sockopt_ns_capable(net->user_ns, CAP_NET_ADMIN)) break; retv = xfrm_user_policy(sk, optname, optval, optlen); break; @@ -994,14 +994,14 @@ done: break; } - release_sock(sk); + sockopt_release_sock(sk); if (needs_rtnl) rtnl_unlock(); return retv; e_inval: - release_sock(sk); + sockopt_release_sock(sk); if (needs_rtnl) rtnl_unlock(); return -EINVAL; @@ -1030,7 +1030,7 @@ int ipv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval, EXPORT_SYMBOL(ipv6_setsockopt); static int ipv6_getsockopt_sticky(struct sock *sk, struct ipv6_txoptions *opt, - int optname, char __user *optval, int len) + int optname, sockptr_t optval, int len) { struct ipv6_opt_hdr *hdr; @@ -1058,56 +1058,53 @@ static int ipv6_getsockopt_sticky(struct sock *sk, struct ipv6_txoptions *opt, return 0; len = min_t(unsigned int, len, ipv6_optlen(hdr)); - if (copy_to_user(optval, hdr, len)) + if (copy_to_sockptr(optval, hdr, len)) return -EFAULT; return len; } -static int ipv6_get_msfilter(struct sock *sk, void __user *optval, - int __user *optlen, int len) +static int ipv6_get_msfilter(struct sock *sk, sockptr_t optval, + sockptr_t optlen, int len) { const int size0 = offsetof(struct group_filter, gf_slist_flex); - struct group_filter __user *p = optval; struct group_filter gsf; int num; int err; if (len < size0) return -EINVAL; - if (copy_from_user(&gsf, p, size0)) + if (copy_from_sockptr(&gsf, optval, size0)) return -EFAULT; if (gsf.gf_group.ss_family != AF_INET6) return -EADDRNOTAVAIL; num = gsf.gf_numsrc; - lock_sock(sk); - err = ip6_mc_msfget(sk, &gsf, p->gf_slist_flex); + sockopt_lock_sock(sk); + err = ip6_mc_msfget(sk, &gsf, optval, size0); if (!err) { if (num > gsf.gf_numsrc) num = gsf.gf_numsrc; - if (put_user(GROUP_FILTER_SIZE(num), optlen) || - copy_to_user(p, &gsf, size0)) + len = GROUP_FILTER_SIZE(num); + if (copy_to_sockptr(optlen, &len, sizeof(int)) || + copy_to_sockptr(optval, &gsf, size0)) err = -EFAULT; } - release_sock(sk); + sockopt_release_sock(sk); return err; } -static int compat_ipv6_get_msfilter(struct sock *sk, void __user *optval, - int __user *optlen) +static int compat_ipv6_get_msfilter(struct sock *sk, sockptr_t optval, + sockptr_t optlen, int len) { const int size0 = offsetof(struct compat_group_filter, gf_slist_flex); - struct compat_group_filter __user *p = optval; struct compat_group_filter gf32; struct group_filter gf; - int len, err; + int err; int num; - if (get_user(len, optlen)) - return -EFAULT; if (len < size0) return -EINVAL; - if (copy_from_user(&gf32, p, size0)) + if (copy_from_sockptr(&gf32, optval, size0)) return -EFAULT; gf.gf_interface = gf32.gf_interface; gf.gf_fmode = gf32.gf_fmode; @@ -1117,23 +1114,25 @@ static int compat_ipv6_get_msfilter(struct sock *sk, void __user *optval, if (gf.gf_group.ss_family != AF_INET6) return -EADDRNOTAVAIL; - lock_sock(sk); - err = ip6_mc_msfget(sk, &gf, p->gf_slist_flex); - release_sock(sk); + sockopt_lock_sock(sk); + err = ip6_mc_msfget(sk, &gf, optval, size0); + sockopt_release_sock(sk); if (err) return err; if (num > gf.gf_numsrc) num = gf.gf_numsrc; len = GROUP_FILTER_SIZE(num) - (sizeof(gf)-sizeof(gf32)); - if (put_user(len, optlen) || - put_user(gf.gf_fmode, &p->gf_fmode) || - put_user(gf.gf_numsrc, &p->gf_numsrc)) + if (copy_to_sockptr(optlen, &len, sizeof(int)) || + copy_to_sockptr_offset(optval, offsetof(struct compat_group_filter, gf_fmode), + &gf.gf_fmode, sizeof(gf32.gf_fmode)) || + copy_to_sockptr_offset(optval, offsetof(struct compat_group_filter, gf_numsrc), + &gf.gf_numsrc, sizeof(gf32.gf_numsrc))) return -EFAULT; return 0; } -static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, - char __user *optval, int __user *optlen, unsigned int flags) +int do_ipv6_getsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, sockptr_t optlen) { struct ipv6_pinfo *np = inet6_sk(sk); int len; @@ -1142,7 +1141,7 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, if (ip6_mroute_opt(optname)) return ip6_mroute_getsockopt(sk, optname, optval, optlen); - if (get_user(len, optlen)) + if (copy_from_sockptr(&len, optlen, sizeof(int))) return -EFAULT; switch (optname) { case IPV6_ADDRFORM: @@ -1156,7 +1155,7 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, break; case MCAST_MSFILTER: if (in_compat_syscall()) - return compat_ipv6_get_msfilter(sk, optval, optlen); + return compat_ipv6_get_msfilter(sk, optval, optlen, len); return ipv6_get_msfilter(sk, optval, optlen, len); case IPV6_2292PKTOPTIONS: { @@ -1166,16 +1165,21 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, if (sk->sk_type != SOCK_STREAM) return -ENOPROTOOPT; - msg.msg_control_user = optval; + if (optval.is_kernel) { + msg.msg_control_is_user = false; + msg.msg_control = optval.kernel; + } else { + msg.msg_control_is_user = true; + msg.msg_control_user = optval.user; + } msg.msg_controllen = len; - msg.msg_flags = flags; - msg.msg_control_is_user = true; + msg.msg_flags = 0; - lock_sock(sk); + sockopt_lock_sock(sk); skb = np->pktoptions; if (skb) ip6_datagram_recv_ctl(sk, &msg, skb); - release_sock(sk); + sockopt_release_sock(sk); if (!skb) { if (np->rxopt.bits.rxinfo) { struct in6_pktinfo src_info; @@ -1212,7 +1216,7 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, } } len -= msg.msg_controllen; - return put_user(len, optlen); + return copy_to_sockptr(optlen, &len, sizeof(int)); } case IPV6_MTU: { @@ -1264,15 +1268,15 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, { struct ipv6_txoptions *opt; - lock_sock(sk); + sockopt_lock_sock(sk); opt = rcu_dereference_protected(np->opt, lockdep_sock_is_held(sk)); len = ipv6_getsockopt_sticky(sk, opt, optname, optval, len); - release_sock(sk); + sockopt_release_sock(sk); /* check if ipv6_getsockopt_sticky() returns err code */ if (len < 0) return len; - return put_user(len, optlen); + return copy_to_sockptr(optlen, &len, sizeof(int)); } case IPV6_RECVHOPOPTS: @@ -1326,9 +1330,9 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, if (!mtuinfo.ip6m_mtu) return -ENOTCONN; - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &mtuinfo, len)) + if (copy_to_sockptr(optval, &mtuinfo, len)) return -EFAULT; return 0; @@ -1405,7 +1409,7 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, if (len < sizeof(freq)) return -EINVAL; - if (copy_from_user(&freq, optval, sizeof(freq))) + if (copy_from_sockptr(&freq, optval, sizeof(freq))) return -EFAULT; if (freq.flr_action != IPV6_FL_A_GET) @@ -1420,9 +1424,9 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, if (val < 0) return val; - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &freq, len)) + if (copy_to_sockptr(optval, &freq, len)) return -EFAULT; return 0; @@ -1474,9 +1478,9 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname, return -ENOPROTOOPT; } len = min_t(unsigned int, sizeof(int), len); - if (put_user(len, optlen)) + if (copy_to_sockptr(optlen, &len, sizeof(int))) return -EFAULT; - if (copy_to_user(optval, &val, len)) + if (copy_to_sockptr(optval, &val, len)) return -EFAULT; return 0; } @@ -1492,7 +1496,8 @@ int ipv6_getsockopt(struct sock *sk, int level, int optname, if (level != SOL_IPV6) return -ENOPROTOOPT; - err = do_ipv6_getsockopt(sk, level, optname, optval, optlen, 0); + err = do_ipv6_getsockopt(sk, level, optname, + USER_SOCKPTR(optval), USER_SOCKPTR(optlen)); #ifdef CONFIG_NETFILTER /* we need to exclude all possible ENOPROTOOPTs except default case */ if (err == -ENOPROTOOPT && optname != IPV6_2292PKTOPTIONS) { diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 87c699d57b36..0566ab03ddbe 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -580,7 +580,7 @@ done: } int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf, - struct sockaddr_storage __user *p) + sockptr_t optval, size_t ss_offset) { struct ipv6_pinfo *inet6 = inet6_sk(sk); const struct in6_addr *group; @@ -612,8 +612,7 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf, copycount = count < gsf->gf_numsrc ? count : gsf->gf_numsrc; gsf->gf_numsrc = count; - - for (i = 0; i < copycount; i++, p++) { + for (i = 0; i < copycount; i++) { struct sockaddr_in6 *psin6; struct sockaddr_storage ss; @@ -621,8 +620,9 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf, memset(&ss, 0, sizeof(ss)); psin6->sin6_family = AF_INET6; psin6->sin6_addr = psl->sl_addr[i]; - if (copy_to_user(p, &ss, sizeof(ss))) + if (copy_to_sockptr_offset(optval, ss_offset, &ss, sizeof(ss))) return -EFAULT; + ss_offset += sizeof(ss); } return 0; } diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c index aeb35d26e474..83d2a8be263f 100644 --- a/net/ipv6/mip6.c +++ b/net/ipv6/mip6.c @@ -247,15 +247,14 @@ static int mip6_destopt_reject(struct xfrm_state *x, struct sk_buff *skb, return err; } -static int mip6_destopt_init_state(struct xfrm_state *x) +static int mip6_destopt_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { if (x->id.spi) { - pr_info("%s: spi is not 0: %u\n", __func__, x->id.spi); + NL_SET_ERR_MSG(extack, "SPI must be 0"); return -EINVAL; } if (x->props.mode != XFRM_MODE_ROUTEOPTIMIZATION) { - pr_info("%s: state's mode is not %u: %u\n", - __func__, XFRM_MODE_ROUTEOPTIMIZATION, x->props.mode); + NL_SET_ERR_MSG(extack, "XFRM mode must be XFRM_MODE_ROUTEOPTIMIZATION"); return -EINVAL; } @@ -333,15 +332,14 @@ static int mip6_rthdr_output(struct xfrm_state *x, struct sk_buff *skb) return 0; } -static int mip6_rthdr_init_state(struct xfrm_state *x) +static int mip6_rthdr_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { if (x->id.spi) { - pr_info("%s: spi is not 0: %u\n", __func__, x->id.spi); + NL_SET_ERR_MSG(extack, "SPI must be 0"); return -EINVAL; } if (x->props.mode != XFRM_MODE_ROUTEOPTIMIZATION) { - pr_info("%s: state's mode is not %u: %u\n", - __func__, XFRM_MODE_ROUTEOPTIMIZATION, x->props.mode); + NL_SET_ERR_MSG(extack, "XFRM mode must be XFRM_MODE_ROUTEOPTIMIZATION"); return -EINVAL; } diff --git a/net/ipv6/netfilter/nf_socket_ipv6.c b/net/ipv6/netfilter/nf_socket_ipv6.c index aa5bb8789ba0..a7690ec62325 100644 --- a/net/ipv6/netfilter/nf_socket_ipv6.c +++ b/net/ipv6/netfilter/nf_socket_ipv6.c @@ -83,8 +83,8 @@ nf_socket_get_sock_v6(struct net *net, struct sk_buff *skb, int doff, { switch (protocol) { case IPPROTO_TCP: - return inet6_lookup(net, &tcp_hashinfo, skb, doff, - saddr, sport, daddr, dport, + return inet6_lookup(net, net->ipv4.tcp_death_row.hashinfo, + skb, doff, saddr, sport, daddr, dport, in->ifindex); case IPPROTO_UDP: return udp6_lib_lookup(net, saddr, sport, daddr, dport, diff --git a/net/ipv6/netfilter/nf_tproxy_ipv6.c b/net/ipv6/netfilter/nf_tproxy_ipv6.c index 6bac68fb27a3..929502e51203 100644 --- a/net/ipv6/netfilter/nf_tproxy_ipv6.c +++ b/net/ipv6/netfilter/nf_tproxy_ipv6.c @@ -80,6 +80,7 @@ nf_tproxy_get_sock_v6(struct net *net, struct sk_buff *skb, int thoff, const struct net_device *in, const enum nf_tproxy_lookup_t lookup_type) { + struct inet_hashinfo *hinfo = net->ipv4.tcp_death_row.hashinfo; struct sock *sk; switch (protocol) { @@ -93,7 +94,7 @@ nf_tproxy_get_sock_v6(struct net *net, struct sk_buff *skb, int thoff, switch (lookup_type) { case NF_TPROXY_LOOKUP_LISTENER: - sk = inet6_lookup_listener(net, &tcp_hashinfo, skb, + sk = inet6_lookup_listener(net, hinfo, skb, thoff + __tcp_hdrlen(hp), saddr, sport, daddr, ntohs(dport), @@ -108,9 +109,8 @@ nf_tproxy_get_sock_v6(struct net *net, struct sk_buff *skb, int thoff, */ break; case NF_TPROXY_LOOKUP_ESTABLISHED: - sk = __inet6_lookup_established(net, &tcp_hashinfo, - saddr, sport, daddr, ntohs(dport), - in->ifindex, 0); + sk = __inet6_lookup_established(net, hinfo, saddr, sport, daddr, + ntohs(dport), in->ifindex, 0); break; default: BUG(); diff --git a/net/ipv6/netfilter/nft_fib_ipv6.c b/net/ipv6/netfilter/nft_fib_ipv6.c index 8970d0b4faeb..1d7e520d9966 100644 --- a/net/ipv6/netfilter/nft_fib_ipv6.c +++ b/net/ipv6/netfilter/nft_fib_ipv6.c @@ -41,6 +41,9 @@ static int nft_fib6_flowi_init(struct flowi6 *fl6, const struct nft_fib *priv, if (ipv6_addr_type(&fl6->daddr) & IPV6_ADDR_LINKLOCAL) { lookup_flags |= RT6_LOOKUP_F_IFACE; fl6->flowi6_oif = get_ifindex(dev ? dev : pkt->skb->dev); + } else if ((priv->flags & NFTA_FIB_F_IIF) && + (netif_is_l3_master(dev) || netif_is_l3_slave(dev))) { + fl6->flowi6_oif = dev->ifindex; } if (ipv6_addr_type(&fl6->saddr) & IPV6_ADDR_UNICAST) @@ -197,7 +200,8 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs, if (rt->rt6i_flags & (RTF_REJECT | RTF_ANYCAST | RTF_LOCAL)) goto put_rt_err; - if (oif && oif != rt->rt6i_idev->dev) + if (oif && oif != rt->rt6i_idev->dev && + l3mdev_master_ifindex_rcu(rt->rt6i_idev->dev) != oif->ifindex) goto put_rt_err; nft_fib_store_result(dest, priv, rt->rt6i_idev->dev); diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c index 91b840514656..5f2ef8493714 100644 --- a/net/ipv6/ping.c +++ b/net/ipv6/ping.c @@ -20,6 +20,7 @@ #include <net/udp.h> #include <net/transp_v6.h> #include <linux/proc_fs.h> +#include <linux/bpf-cgroup.h> #include <net/ping.h> static void ping_v6_destroy(struct sock *sk) @@ -49,6 +50,20 @@ static int dummy_ipv6_chk_addr(struct net *net, const struct in6_addr *addr, return 0; } +static int ping_v6_pre_connect(struct sock *sk, struct sockaddr *uaddr, + int addr_len) +{ + /* This check is replicated from __ip6_datagram_connect() and + * intended to prevent BPF program called below from accessing + * bytes that are out of the bound specified by user in addr_len. + */ + + if (addr_len < SIN6_LEN_RFC2133) + return -EINVAL; + + return BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr); +} + static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) { struct inet_sock *inet = inet_sk(sk); @@ -191,6 +206,7 @@ struct proto pingv6_prot = { .init = ping_init_sock, .close = ping_close, .destroy = ping_v6_destroy, + .pre_connect = ping_v6_pre_connect, .connect = ip6_datagram_connect_v6_only, .disconnect = __udp_disconnect, .setsockopt = ipv6_setsockopt, diff --git a/net/ipv6/seg6.c b/net/ipv6/seg6.c index 0b0e34ddc64e..29346a6eec9f 100644 --- a/net/ipv6/seg6.c +++ b/net/ipv6/seg6.c @@ -504,6 +504,7 @@ static struct genl_family seg6_genl_family __ro_after_init = { .parallel_ops = true, .ops = seg6_genl_ops, .n_ops = ARRAY_SIZE(seg6_genl_ops), + .resv_start_op = SEG6_CMD_GET_TUNSRC + 1, .module = THIS_MODULE, }; diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index b7de5e46fdd8..8370726ae7bf 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -73,6 +73,55 @@ struct bpf_lwt_prog { char *name; }; +/* default length values (expressed in bits) for both Locator-Block and + * Locator-Node Function. + * + * Both SEG6_LOCAL_LCBLOCK_DBITS and SEG6_LOCAL_LCNODE_FN_DBITS *must* be: + * i) greater than 0; + * ii) evenly divisible by 8. In other terms, the lengths of the + * Locator-Block and Locator-Node Function must be byte-aligned (we can + * relax this constraint in the future if really needed). + * + * Moreover, a third condition must hold: + * iii) SEG6_LOCAL_LCBLOCK_DBITS + SEG6_LOCAL_LCNODE_FN_DBITS <= 128. + * + * The correctness of SEG6_LOCAL_LCBLOCK_DBITS and SEG6_LOCAL_LCNODE_FN_DBITS + * values are checked during the kernel compilation. If the compilation stops, + * check the value of these parameters to see if they meet conditions (i), (ii) + * and (iii). + */ +#define SEG6_LOCAL_LCBLOCK_DBITS 32 +#define SEG6_LOCAL_LCNODE_FN_DBITS 16 + +/* The following next_csid_chk_{cntr,lcblock,lcblock_fn}_bits macros can be + * used directly to check whether the lengths (in bits) of Locator-Block and + * Locator-Node Function are valid according to (i), (ii), (iii). + */ +#define next_csid_chk_cntr_bits(blen, flen) \ + ((blen) + (flen) > 128) + +#define next_csid_chk_lcblock_bits(blen) \ +({ \ + typeof(blen) __tmp = blen; \ + (!__tmp || __tmp > 120 || (__tmp & 0x07)); \ +}) + +#define next_csid_chk_lcnode_fn_bits(flen) \ + next_csid_chk_lcblock_bits(flen) + +/* Supported Flavor operations are reported in this bitmask */ +#define SEG6_LOCAL_FLV_SUPP_OPS (BIT(SEG6_LOCAL_FLV_OP_NEXT_CSID)) + +struct seg6_flavors_info { + /* Flavor operations */ + __u32 flv_ops; + + /* Locator-Block length, expressed in bits */ + __u8 lcblock_bits; + /* Locator-Node Function length, expressed in bits*/ + __u8 lcnode_func_bits; +}; + enum seg6_end_dt_mode { DT_INVALID_MODE = -EINVAL, DT_LEGACY_MODE = 0, @@ -136,6 +185,8 @@ struct seg6_local_lwt { #ifdef CONFIG_NET_L3_MASTER_DEV struct seg6_end_dt_info dt_info; #endif + struct seg6_flavors_info flv_info; + struct pcpu_seg6_local_counters __percpu *pcpu_counters; int headroom; @@ -271,8 +322,50 @@ int seg6_lookup_nexthop(struct sk_buff *skb, return seg6_lookup_any_nexthop(skb, nhaddr, tbl_id, false); } -/* regular endpoint function */ -static int input_action_end(struct sk_buff *skb, struct seg6_local_lwt *slwt) +static __u8 seg6_flv_lcblock_octects(const struct seg6_flavors_info *finfo) +{ + return finfo->lcblock_bits >> 3; +} + +static __u8 seg6_flv_lcnode_func_octects(const struct seg6_flavors_info *finfo) +{ + return finfo->lcnode_func_bits >> 3; +} + +static bool seg6_next_csid_is_arg_zero(const struct in6_addr *addr, + const struct seg6_flavors_info *finfo) +{ + __u8 fnc_octects = seg6_flv_lcnode_func_octects(finfo); + __u8 blk_octects = seg6_flv_lcblock_octects(finfo); + __u8 arg_octects; + int i; + + arg_octects = 16 - blk_octects - fnc_octects; + for (i = 0; i < arg_octects; ++i) { + if (addr->s6_addr[blk_octects + fnc_octects + i] != 0x00) + return false; + } + + return true; +} + +/* assume that DA.Argument length > 0 */ +static void seg6_next_csid_advance_arg(struct in6_addr *addr, + const struct seg6_flavors_info *finfo) +{ + __u8 fnc_octects = seg6_flv_lcnode_func_octects(finfo); + __u8 blk_octects = seg6_flv_lcblock_octects(finfo); + + /* advance DA.Argument */ + memmove(&addr->s6_addr[blk_octects], + &addr->s6_addr[blk_octects + fnc_octects], + 16 - blk_octects - fnc_octects); + + memset(&addr->s6_addr[16 - fnc_octects], 0x00, fnc_octects); +} + +static int input_action_end_core(struct sk_buff *skb, + struct seg6_local_lwt *slwt) { struct ipv6_sr_hdr *srh; @@ -291,6 +384,38 @@ drop: return -EINVAL; } +static int end_next_csid_core(struct sk_buff *skb, struct seg6_local_lwt *slwt) +{ + const struct seg6_flavors_info *finfo = &slwt->flv_info; + struct in6_addr *daddr = &ipv6_hdr(skb)->daddr; + + if (seg6_next_csid_is_arg_zero(daddr, finfo)) + return input_action_end_core(skb, slwt); + + /* update DA */ + seg6_next_csid_advance_arg(daddr, finfo); + + seg6_lookup_nexthop(skb, NULL, 0); + + return dst_input(skb); +} + +static bool seg6_next_csid_enabled(__u32 fops) +{ + return fops & BIT(SEG6_LOCAL_FLV_OP_NEXT_CSID); +} + +/* regular endpoint function */ +static int input_action_end(struct sk_buff *skb, struct seg6_local_lwt *slwt) +{ + const struct seg6_flavors_info *finfo = &slwt->flv_info; + + if (seg6_next_csid_enabled(finfo->flv_ops)) + return end_next_csid_core(skb, slwt); + + return input_action_end_core(skb, slwt); +} + /* regular endpoint, and forward to specified nexthop */ static int input_action_end_x(struct sk_buff *skb, struct seg6_local_lwt *slwt) { @@ -951,7 +1076,8 @@ static struct seg6_action_desc seg6_action_table[] = { { .action = SEG6_LOCAL_ACTION_END, .attrs = 0, - .optattrs = SEG6_F_LOCAL_COUNTERS, + .optattrs = SEG6_F_LOCAL_COUNTERS | + SEG6_F_ATTR(SEG6_LOCAL_FLAVORS), .input = input_action_end, }, { @@ -1132,9 +1258,11 @@ static const struct nla_policy seg6_local_policy[SEG6_LOCAL_MAX + 1] = { [SEG6_LOCAL_OIF] = { .type = NLA_U32 }, [SEG6_LOCAL_BPF] = { .type = NLA_NESTED }, [SEG6_LOCAL_COUNTERS] = { .type = NLA_NESTED }, + [SEG6_LOCAL_FLAVORS] = { .type = NLA_NESTED }, }; -static int parse_nla_srh(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_srh(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { struct ipv6_sr_hdr *srh; int len; @@ -1191,7 +1319,8 @@ static void destroy_attr_srh(struct seg6_local_lwt *slwt) kfree(slwt->srh); } -static int parse_nla_table(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_table(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { slwt->table = nla_get_u32(attrs[SEG6_LOCAL_TABLE]); @@ -1225,7 +1354,8 @@ seg6_end_dt_info *seg6_possible_end_dt_info(struct seg6_local_lwt *slwt) } static int parse_nla_vrftable(struct nlattr **attrs, - struct seg6_local_lwt *slwt) + struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { struct seg6_end_dt_info *info = seg6_possible_end_dt_info(slwt); @@ -1261,7 +1391,8 @@ static int cmp_nla_vrftable(struct seg6_local_lwt *a, struct seg6_local_lwt *b) return 0; } -static int parse_nla_nh4(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_nh4(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { memcpy(&slwt->nh4, nla_data(attrs[SEG6_LOCAL_NH4]), sizeof(struct in_addr)); @@ -1287,7 +1418,8 @@ static int cmp_nla_nh4(struct seg6_local_lwt *a, struct seg6_local_lwt *b) return memcmp(&a->nh4, &b->nh4, sizeof(struct in_addr)); } -static int parse_nla_nh6(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_nh6(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { memcpy(&slwt->nh6, nla_data(attrs[SEG6_LOCAL_NH6]), sizeof(struct in6_addr)); @@ -1313,7 +1445,8 @@ static int cmp_nla_nh6(struct seg6_local_lwt *a, struct seg6_local_lwt *b) return memcmp(&a->nh6, &b->nh6, sizeof(struct in6_addr)); } -static int parse_nla_iif(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_iif(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { slwt->iif = nla_get_u32(attrs[SEG6_LOCAL_IIF]); @@ -1336,7 +1469,8 @@ static int cmp_nla_iif(struct seg6_local_lwt *a, struct seg6_local_lwt *b) return 0; } -static int parse_nla_oif(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_oif(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { slwt->oif = nla_get_u32(attrs[SEG6_LOCAL_OIF]); @@ -1366,7 +1500,8 @@ static const struct nla_policy bpf_prog_policy[SEG6_LOCAL_BPF_PROG_MAX + 1] = { .len = MAX_PROG_NAME }, }; -static int parse_nla_bpf(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_bpf(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { struct nlattr *tb[SEG6_LOCAL_BPF_PROG_MAX + 1]; struct bpf_prog *p; @@ -1444,7 +1579,8 @@ nla_policy seg6_local_counters_policy[SEG6_LOCAL_CNT_MAX + 1] = { }; static int parse_nla_counters(struct nlattr **attrs, - struct seg6_local_lwt *slwt) + struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { struct pcpu_seg6_local_counters __percpu *pcounters; struct nlattr *tb[SEG6_LOCAL_CNT_MAX + 1]; @@ -1542,8 +1678,195 @@ static void destroy_attr_counters(struct seg6_local_lwt *slwt) free_percpu(slwt->pcpu_counters); } +static const +struct nla_policy seg6_local_flavors_policy[SEG6_LOCAL_FLV_MAX + 1] = { + [SEG6_LOCAL_FLV_OPERATION] = { .type = NLA_U32 }, + [SEG6_LOCAL_FLV_LCBLOCK_BITS] = { .type = NLA_U8 }, + [SEG6_LOCAL_FLV_LCNODE_FN_BITS] = { .type = NLA_U8 }, +}; + +/* check whether the lengths of the Locator-Block and Locator-Node Function + * are compatible with the dimension of a C-SID container. + */ +static int seg6_chk_next_csid_cfg(__u8 block_len, __u8 func_len) +{ + /* Locator-Block and Locator-Node Function cannot exceed 128 bits + * (i.e. C-SID container lenghts). + */ + if (next_csid_chk_cntr_bits(block_len, func_len)) + return -EINVAL; + + /* Locator-Block length must be greater than zero and evenly divisible + * by 8. There must be room for a Locator-Node Function, at least. + */ + if (next_csid_chk_lcblock_bits(block_len)) + return -EINVAL; + + /* Locator-Node Function length must be greater than zero and evenly + * divisible by 8. There must be room for the Locator-Block. + */ + if (next_csid_chk_lcnode_fn_bits(func_len)) + return -EINVAL; + + return 0; +} + +static int seg6_parse_nla_next_csid_cfg(struct nlattr **tb, + struct seg6_flavors_info *finfo, + struct netlink_ext_ack *extack) +{ + __u8 func_len = SEG6_LOCAL_LCNODE_FN_DBITS; + __u8 block_len = SEG6_LOCAL_LCBLOCK_DBITS; + int rc; + + if (tb[SEG6_LOCAL_FLV_LCBLOCK_BITS]) + block_len = nla_get_u8(tb[SEG6_LOCAL_FLV_LCBLOCK_BITS]); + + if (tb[SEG6_LOCAL_FLV_LCNODE_FN_BITS]) + func_len = nla_get_u8(tb[SEG6_LOCAL_FLV_LCNODE_FN_BITS]); + + rc = seg6_chk_next_csid_cfg(block_len, func_len); + if (rc < 0) { + NL_SET_ERR_MSG(extack, + "Invalid Locator Block/Node Function lengths"); + return rc; + } + + finfo->lcblock_bits = block_len; + finfo->lcnode_func_bits = func_len; + + return 0; +} + +static int parse_nla_flavors(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) +{ + struct seg6_flavors_info *finfo = &slwt->flv_info; + struct nlattr *tb[SEG6_LOCAL_FLV_MAX + 1]; + unsigned long fops; + int rc; + + rc = nla_parse_nested_deprecated(tb, SEG6_LOCAL_FLV_MAX, + attrs[SEG6_LOCAL_FLAVORS], + seg6_local_flavors_policy, NULL); + if (rc < 0) + return rc; + + /* this attribute MUST always be present since it represents the Flavor + * operation(s) to be carried out. + */ + if (!tb[SEG6_LOCAL_FLV_OPERATION]) + return -EINVAL; + + fops = nla_get_u32(tb[SEG6_LOCAL_FLV_OPERATION]); + if (fops & ~SEG6_LOCAL_FLV_SUPP_OPS) { + NL_SET_ERR_MSG(extack, "Unsupported Flavor operation(s)"); + return -EOPNOTSUPP; + } + + finfo->flv_ops = fops; + + if (seg6_next_csid_enabled(fops)) { + /* Locator-Block and Locator-Node Function lengths can be + * provided by the user space. Otherwise, default values are + * applied. + */ + rc = seg6_parse_nla_next_csid_cfg(tb, finfo, extack); + if (rc < 0) + return rc; + } + + return 0; +} + +static int seg6_fill_nla_next_csid_cfg(struct sk_buff *skb, + struct seg6_flavors_info *finfo) +{ + if (nla_put_u8(skb, SEG6_LOCAL_FLV_LCBLOCK_BITS, finfo->lcblock_bits)) + return -EMSGSIZE; + + if (nla_put_u8(skb, SEG6_LOCAL_FLV_LCNODE_FN_BITS, + finfo->lcnode_func_bits)) + return -EMSGSIZE; + + return 0; +} + +static int put_nla_flavors(struct sk_buff *skb, struct seg6_local_lwt *slwt) +{ + struct seg6_flavors_info *finfo = &slwt->flv_info; + __u32 fops = finfo->flv_ops; + struct nlattr *nest; + int rc; + + nest = nla_nest_start(skb, SEG6_LOCAL_FLAVORS); + if (!nest) + return -EMSGSIZE; + + if (nla_put_u32(skb, SEG6_LOCAL_FLV_OPERATION, fops)) { + rc = -EMSGSIZE; + goto err; + } + + if (seg6_next_csid_enabled(fops)) { + rc = seg6_fill_nla_next_csid_cfg(skb, finfo); + if (rc < 0) + goto err; + } + + return nla_nest_end(skb, nest); + +err: + nla_nest_cancel(skb, nest); + return rc; +} + +static int seg6_cmp_nla_next_csid_cfg(struct seg6_flavors_info *finfo_a, + struct seg6_flavors_info *finfo_b) +{ + if (finfo_a->lcblock_bits != finfo_b->lcblock_bits) + return 1; + + if (finfo_a->lcnode_func_bits != finfo_b->lcnode_func_bits) + return 1; + + return 0; +} + +static int cmp_nla_flavors(struct seg6_local_lwt *a, struct seg6_local_lwt *b) +{ + struct seg6_flavors_info *finfo_a = &a->flv_info; + struct seg6_flavors_info *finfo_b = &b->flv_info; + + if (finfo_a->flv_ops != finfo_b->flv_ops) + return 1; + + if (seg6_next_csid_enabled(finfo_a->flv_ops)) { + if (seg6_cmp_nla_next_csid_cfg(finfo_a, finfo_b)) + return 1; + } + + return 0; +} + +static int encap_size_flavors(struct seg6_local_lwt *slwt) +{ + struct seg6_flavors_info *finfo = &slwt->flv_info; + int nlsize; + + nlsize = nla_total_size(0) + /* nest SEG6_LOCAL_FLAVORS */ + nla_total_size(4); /* SEG6_LOCAL_FLV_OPERATION */ + + if (seg6_next_csid_enabled(finfo->flv_ops)) + nlsize += nla_total_size(1) + /* SEG6_LOCAL_FLV_LCBLOCK_BITS */ + nla_total_size(1); /* SEG6_LOCAL_FLV_LCNODE_FN_BITS */ + + return nlsize; +} + struct seg6_action_param { - int (*parse)(struct nlattr **attrs, struct seg6_local_lwt *slwt); + int (*parse)(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack); int (*put)(struct sk_buff *skb, struct seg6_local_lwt *slwt); int (*cmp)(struct seg6_local_lwt *a, struct seg6_local_lwt *b); @@ -1593,6 +1916,10 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = { .put = put_nla_counters, .cmp = cmp_nla_counters, .destroy = destroy_attr_counters }, + + [SEG6_LOCAL_FLAVORS] = { .parse = parse_nla_flavors, + .put = put_nla_flavors, + .cmp = cmp_nla_flavors }, }; /* call the destroy() callback (if available) for each set attribute in @@ -1636,7 +1963,8 @@ static void destroy_attrs(struct seg6_local_lwt *slwt) } static int parse_nla_optional_attrs(struct nlattr **attrs, - struct seg6_local_lwt *slwt) + struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { struct seg6_action_desc *desc = slwt->desc; unsigned long parsed_optattrs = 0; @@ -1652,7 +1980,7 @@ static int parse_nla_optional_attrs(struct nlattr **attrs, */ param = &seg6_action_params[i]; - err = param->parse(attrs, slwt); + err = param->parse(attrs, slwt, extack); if (err < 0) goto parse_optattrs_err; @@ -1705,7 +2033,8 @@ static void seg6_local_lwtunnel_destroy_state(struct seg6_local_lwt *slwt) ops->destroy_state(slwt); } -static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt) +static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt, + struct netlink_ext_ack *extack) { struct seg6_action_param *param; struct seg6_action_desc *desc; @@ -1749,14 +2078,14 @@ static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt) param = &seg6_action_params[i]; - err = param->parse(attrs, slwt); + err = param->parse(attrs, slwt, extack); if (err < 0) goto parse_attrs_err; } } /* parse the optional attributes, if any */ - err = parse_nla_optional_attrs(attrs, slwt); + err = parse_nla_optional_attrs(attrs, slwt, extack); if (err < 0) goto parse_attrs_err; @@ -1800,7 +2129,7 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla, slwt = seg6_local_lwtunnel(newts); slwt->action = nla_get_u32(tb[SEG6_LOCAL_ACTION]); - err = parse_nla_action(tb, slwt); + err = parse_nla_action(tb, slwt, extack); if (err < 0) goto out_free; @@ -1904,6 +2233,9 @@ static int seg6_local_get_encap_size(struct lwtunnel_state *lwt) /* SEG6_LOCAL_CNT_ERRORS */ nla_total_size_64bit(sizeof(__u64)); + if (attrs & SEG6_F_ATTR(SEG6_LOCAL_FLAVORS)) + nlsize += encap_size_flavors(slwt); + return nlsize; } @@ -1959,6 +2291,15 @@ int __init seg6_local_init(void) */ BUILD_BUG_ON(SEG6_LOCAL_MAX + 1 > BITS_PER_TYPE(unsigned long)); + /* If the default NEXT-C-SID Locator-Block/Node Function lengths (in + * bits) have been changed with invalid values, kernel build stops + * here. + */ + BUILD_BUG_ON(next_csid_chk_cntr_bits(SEG6_LOCAL_LCBLOCK_DBITS, + SEG6_LOCAL_LCNODE_FN_DBITS)); + BUILD_BUG_ON(next_csid_chk_lcblock_bits(SEG6_LOCAL_LCBLOCK_DBITS)); + BUILD_BUG_ON(next_csid_chk_lcnode_fn_bits(SEG6_LOCAL_LCNODE_FN_DBITS)); + return lwtunnel_encap_add_ops(&seg6_local_ops, LWTUNNEL_ENCAP_SEG6_LOCAL); } diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 6b73b7a5f175..d27683e3fc97 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -254,7 +254,7 @@ static struct ip_tunnel *ipip6_tunnel_locate(struct net *net, if (parms->name[0]) { if (!dev_valid_name(parms->name)) goto failed; - strlcpy(name, parms->name, IFNAMSIZ); + strscpy(name, parms->name, IFNAMSIZ); } else { strcpy(name, "sit%d"); } @@ -1503,71 +1503,12 @@ static void ipip6_netlink_parms(struct nlattr *data[], if (!data) return; - if (data[IFLA_IPTUN_LINK]) - parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]); - - if (data[IFLA_IPTUN_LOCAL]) - parms->iph.saddr = nla_get_be32(data[IFLA_IPTUN_LOCAL]); - - if (data[IFLA_IPTUN_REMOTE]) - parms->iph.daddr = nla_get_be32(data[IFLA_IPTUN_REMOTE]); - - if (data[IFLA_IPTUN_TTL]) { - parms->iph.ttl = nla_get_u8(data[IFLA_IPTUN_TTL]); - if (parms->iph.ttl) - parms->iph.frag_off = htons(IP_DF); - } - - if (data[IFLA_IPTUN_TOS]) - parms->iph.tos = nla_get_u8(data[IFLA_IPTUN_TOS]); - - if (!data[IFLA_IPTUN_PMTUDISC] || nla_get_u8(data[IFLA_IPTUN_PMTUDISC])) - parms->iph.frag_off = htons(IP_DF); - - if (data[IFLA_IPTUN_FLAGS]) - parms->i_flags = nla_get_be16(data[IFLA_IPTUN_FLAGS]); - - if (data[IFLA_IPTUN_PROTO]) - parms->iph.protocol = nla_get_u8(data[IFLA_IPTUN_PROTO]); + ip_tunnel_netlink_parms(data, parms); if (data[IFLA_IPTUN_FWMARK]) *fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); } -/* This function returns true when ENCAP attributes are present in the nl msg */ -static bool ipip6_netlink_encap_parms(struct nlattr *data[], - struct ip_tunnel_encap *ipencap) -{ - bool ret = false; - - memset(ipencap, 0, sizeof(*ipencap)); - - if (!data) - return ret; - - if (data[IFLA_IPTUN_ENCAP_TYPE]) { - ret = true; - ipencap->type = nla_get_u16(data[IFLA_IPTUN_ENCAP_TYPE]); - } - - if (data[IFLA_IPTUN_ENCAP_FLAGS]) { - ret = true; - ipencap->flags = nla_get_u16(data[IFLA_IPTUN_ENCAP_FLAGS]); - } - - if (data[IFLA_IPTUN_ENCAP_SPORT]) { - ret = true; - ipencap->sport = nla_get_be16(data[IFLA_IPTUN_ENCAP_SPORT]); - } - - if (data[IFLA_IPTUN_ENCAP_DPORT]) { - ret = true; - ipencap->dport = nla_get_be16(data[IFLA_IPTUN_ENCAP_DPORT]); - } - - return ret; -} - #ifdef CONFIG_IPV6_SIT_6RD /* This function returns true when 6RD attributes are present in the nl msg */ static bool ipip6_netlink_6rd_parms(struct nlattr *data[], @@ -1619,7 +1560,7 @@ static int ipip6_newlink(struct net *src_net, struct net_device *dev, nt = netdev_priv(dev); - if (ipip6_netlink_encap_parms(data, &ipencap)) { + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { err = ip_tunnel_encap_setup(nt, &ipencap); if (err < 0) return err; @@ -1671,7 +1612,7 @@ static int ipip6_changelink(struct net_device *dev, struct nlattr *tb[], if (dev == sitn->fb_tunnel_dev) return -EINVAL; - if (ipip6_netlink_encap_parms(data, &ipencap)) { + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { err = ip_tunnel_encap_setup(t, &ipencap); if (err < 0) return err; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index e54eee80ce5f..a8adda623da1 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -146,15 +146,16 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; - struct inet_sock *inet = inet_sk(sk); struct inet_connection_sock *icsk = inet_csk(sk); + struct in6_addr *saddr = NULL, *final_p, final; struct inet_timewait_death_row *tcp_death_row; struct ipv6_pinfo *np = tcp_inet6_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_sock *tp = tcp_sk(sk); - struct in6_addr *saddr = NULL, *final_p, final; + struct net *net = sock_net(sk); struct ipv6_txoptions *opt; - struct flowi6 fl6; struct dst_entry *dst; + struct flowi6 fl6; int addr_type; int err; @@ -280,15 +281,33 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6)); - dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(net, sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto failure; } + tcp_death_row = &sock_net(sk)->ipv4.tcp_death_row; + if (!saddr) { + struct inet_bind_hashbucket *prev_addr_hashbucket = NULL; + struct in6_addr prev_v6_rcv_saddr; + + if (icsk->icsk_bind2_hash) { + prev_addr_hashbucket = inet_bhashfn_portaddr(tcp_death_row->hashinfo, + sk, net, inet->inet_num); + prev_v6_rcv_saddr = sk->sk_v6_rcv_saddr; + } saddr = &fl6.saddr; sk->sk_v6_rcv_saddr = *saddr; + + if (prev_addr_hashbucket) { + err = inet_bhash2_update_saddr(prev_addr_hashbucket, sk); + if (err) { + sk->sk_v6_rcv_saddr = prev_v6_rcv_saddr; + goto failure; + } + } } /* set the source address */ @@ -308,7 +327,6 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, inet->inet_dport = usin->sin6_port; tcp_set_state(sk, TCP_SYN_SENT); - tcp_death_row = sock_net(sk)->ipv4.tcp_death_row; err = inet6_hash_connect(tcp_death_row, sk); if (err) goto late_failure; @@ -322,8 +340,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, sk->sk_v6_daddr.s6_addr32, inet->inet_sport, inet->inet_dport)); - tp->tsoffset = secure_tcpv6_ts_off(sock_net(sk), - np->saddr.s6_addr32, + tp->tsoffset = secure_tcpv6_ts_off(net, np->saddr.s6_addr32, sk->sk_v6_daddr.s6_addr32); } @@ -386,7 +403,7 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, bool fatal; int err; - sk = __inet6_lookup_established(net, &tcp_hashinfo, + sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, &hdr->daddr, th->dest, &hdr->saddr, ntohs(th->source), skb->dev->ifindex, inet6_sdif(skb)); @@ -841,7 +858,7 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = { static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, struct tcp_md5sig_key *key, int rst, - u8 tclass, __be32 label, u32 priority) + u8 tclass, __be32 label, u32 priority, u32 txhash) { const struct tcphdr *th = tcp_hdr(skb); struct tcphdr *t1; @@ -932,16 +949,16 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 } if (sk) { - if (sk->sk_state == TCP_TIME_WAIT) { + if (sk->sk_state == TCP_TIME_WAIT) mark = inet_twsk(sk)->tw_mark; - /* autoflowlabel relies on buff->hash */ - skb_set_hash(buff, inet_twsk(sk)->tw_txhash, - PKT_HASH_TYPE_L4); - } else { + else mark = sk->sk_mark; - } skb_set_delivery_time(buff, tcp_transmit_time(sk), true); } + if (txhash) { + /* autoflowlabel/skb_get_hash_flowi6 rely on buff->hash */ + skb_set_hash(buff, txhash, PKT_HASH_TYPE_L4); + } fl6.flowi6_mark = IP6_REPLY_MARK(net, skb->mark) ?: mark; fl6.fl6_dport = t1->dest; fl6.fl6_sport = t1->source; @@ -984,6 +1001,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) __be32 label = 0; u32 priority = 0; struct net *net; + u32 txhash = 0; int oif = 0; if (th->rst) @@ -1019,11 +1037,10 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) * Incoming packet is checked with md5 hash with finding key, * no RST generated if md5 hash doesn't match. */ - sk1 = inet6_lookup_listener(net, - &tcp_hashinfo, NULL, 0, - &ipv6h->saddr, - th->source, &ipv6h->daddr, - ntohs(th->source), dif, sdif); + sk1 = inet6_lookup_listener(net, net->ipv4.tcp_death_row.hashinfo, + NULL, 0, &ipv6h->saddr, th->source, + &ipv6h->daddr, ntohs(th->source), + dif, sdif); if (!sk1) goto out; @@ -1057,10 +1074,12 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) if (np->repflow) label = ip6_flowlabel(ipv6h); priority = sk->sk_priority; + txhash = sk->sk_hash; } if (sk->sk_state == TCP_TIME_WAIT) { label = cpu_to_be32(inet_twsk(sk)->tw_flowlabel); priority = inet_twsk(sk)->tw_priority; + txhash = inet_twsk(sk)->tw_txhash; } } else { if (net->ipv6.sysctl.flowlabel_reflect & FLOWLABEL_REFLECT_TCP_RESET) @@ -1068,7 +1087,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) } tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, key, 1, - ipv6_get_dsfield(ipv6h), label, priority); + ipv6_get_dsfield(ipv6h), label, priority, txhash); #ifdef CONFIG_TCP_MD5SIG out: @@ -1079,10 +1098,10 @@ out: static void tcp_v6_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, struct tcp_md5sig_key *key, u8 tclass, - __be32 label, u32 priority) + __be32 label, u32 priority, u32 txhash) { tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, key, 0, - tclass, label, priority); + tclass, label, priority, txhash); } static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) @@ -1094,7 +1113,8 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale, tcp_time_stamp_raw() + tcptw->tw_ts_offset, tcptw->tw_ts_recent, tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw), - tw->tw_tclass, cpu_to_be32(tw->tw_flowlabel), tw->tw_priority); + tw->tw_tclass, cpu_to_be32(tw->tw_flowlabel), tw->tw_priority, + tw->tw_txhash); inet_twsk_put(tw); } @@ -1121,7 +1141,8 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, tcp_time_stamp_raw() + tcp_rsk(req)->ts_off, req->ts_recent, sk->sk_bound_dev_if, tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index), - ipv6_get_dsfield(ipv6_hdr(skb)), 0, sk->sk_priority); + ipv6_get_dsfield(ipv6_hdr(skb)), 0, sk->sk_priority, + tcp_rsk(req)->txhash); } @@ -1619,7 +1640,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) hdr = ipv6_hdr(skb); lookup: - sk = __inet6_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), + sk = __inet6_lookup_skb(net->ipv4.tcp_death_row.hashinfo, skb, __tcp_hdrlen(th), th->source, th->dest, inet6_iif(skb), sdif, &refcounted); if (!sk) @@ -1794,7 +1815,7 @@ do_time_wait: { struct sock *sk2; - sk2 = inet6_lookup_listener(dev_net(skb->dev), &tcp_hashinfo, + sk2 = inet6_lookup_listener(net, net->ipv4.tcp_death_row.hashinfo, skb, __tcp_hdrlen(th), &ipv6_hdr(skb)->saddr, th->source, &ipv6_hdr(skb)->daddr, @@ -1827,6 +1848,7 @@ do_time_wait: void tcp_v6_early_demux(struct sk_buff *skb) { + struct net *net = dev_net(skb->dev); const struct ipv6hdr *hdr; const struct tcphdr *th; struct sock *sk; @@ -1844,7 +1866,7 @@ void tcp_v6_early_demux(struct sk_buff *skb) return; /* Note : We use inet6_iif() here, not tcp_v6_iif() */ - sk = __inet6_lookup_established(dev_net(skb->dev), &tcp_hashinfo, + sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, &hdr->saddr, th->source, &hdr->daddr, ntohs(th->dest), inet6_iif(skb), inet6_sdif(skb)); @@ -2176,7 +2198,7 @@ struct proto tcpv6_prot = { .slab_flags = SLAB_TYPESAFE_BY_RCU, .twsk_prot = &tcp6_timewait_sock_ops, .rsk_prot = &tcp6_request_sock_ops, - .h.hashinfo = &tcp_hashinfo, + .h.hashinfo = NULL, .no_autobind = true, .diag_destroy = tcp_abort, }; @@ -2210,7 +2232,7 @@ static void __net_exit tcpv6_net_exit(struct net *net) static void __net_exit tcpv6_net_exit_batch(struct list_head *net_exit_list) { - inet_twsk_purge(&tcp_hashinfo, AF_INET6); + tcp_twsk_purge(net_exit_list, AF_INET6); } static struct pernet_operations tcpv6_net_ops = { diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 3366d6a77ff2..91e795bb9ade 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -650,16 +650,20 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) rc = __udp_enqueue_schedule_skb(sk, skb); if (rc < 0) { int is_udplite = IS_UDPLITE(sk); + enum skb_drop_reason drop_reason; /* Note that an ENOMEM error is charged twice */ - if (rc == -ENOMEM) + if (rc == -ENOMEM) { UDP6_INC_STATS(sock_net(sk), UDP_MIB_RCVBUFERRORS, is_udplite); - else + drop_reason = SKB_DROP_REASON_SOCKET_RCVBUFF; + } else { UDP6_INC_STATS(sock_net(sk), UDP_MIB_MEMERRORS, is_udplite); + drop_reason = SKB_DROP_REASON_PROTO_MEM; + } UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); - kfree_skb(skb); + kfree_skb_reason(skb, drop_reason); return -1; } @@ -675,11 +679,14 @@ static __inline__ int udpv6_err(struct sk_buff *skb, static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb) { + enum skb_drop_reason drop_reason = SKB_DROP_REASON_NOT_SPECIFIED; struct udp_sock *up = udp_sk(sk); int is_udplite = IS_UDPLITE(sk); - if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) + if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) { + drop_reason = SKB_DROP_REASON_XFRM_POLICY; goto drop; + } if (static_branch_unlikely(&udpv6_encap_needed_key) && up->encap_type) { int (*encap_rcv)(struct sock *sk, struct sk_buff *skb); @@ -738,8 +745,10 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb) udp_lib_checksum_complete(skb)) goto csum_error; - if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr))) + if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr))) { + drop_reason = SKB_DROP_REASON_SOCKET_FILTER; goto drop; + } udp_csum_pull_header(skb); @@ -748,11 +757,12 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb) return __udpv6_queue_rcv_skb(sk, skb); csum_error: + drop_reason = SKB_DROP_REASON_UDP_CSUM; __UDP6_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite); drop: __UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); atomic_inc(&sk->sk_drops); - kfree_skb(skb); + kfree_skb_reason(skb, drop_reason); return -1; } diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c index 2b31112c0856..1323f2f6928e 100644 --- a/net/ipv6/xfrm6_tunnel.c +++ b/net/ipv6/xfrm6_tunnel.c @@ -270,13 +270,17 @@ static int xfrm6_tunnel_err(struct sk_buff *skb, struct inet6_skb_parm *opt, return 0; } -static int xfrm6_tunnel_init_state(struct xfrm_state *x) +static int xfrm6_tunnel_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { - if (x->props.mode != XFRM_MODE_TUNNEL) + if (x->props.mode != XFRM_MODE_TUNNEL) { + NL_SET_ERR_MSG(extack, "IPv6 tunnel can only be used with tunnel mode"); return -EINVAL; + } - if (x->encap) + if (x->encap) { + NL_SET_ERR_MSG(extack, "IPv6 tunnel is not compatible with encapsulation"); return -EINVAL; + } x->props.header_len = sizeof(struct ipv6hdr); diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c index 6cd97c75445c..f2ae03c40473 100644 --- a/net/l2tp/l2tp_eth.c +++ b/net/l2tp/l2tp_eth.c @@ -254,7 +254,7 @@ static int l2tp_eth_create(struct net *net, struct l2tp_tunnel *tunnel, int rc; if (cfg->ifname) { - strlcpy(name, cfg->ifname, IFNAMSIZ); + strscpy(name, cfg->ifname, IFNAMSIZ); name_assign_type = NET_NAME_USER; } else { strcpy(name, L2TP_ETH_DEV_NAME); @@ -314,7 +314,7 @@ static int l2tp_eth_create(struct net *net, struct l2tp_tunnel *tunnel, return rc; } - strlcpy(session->ifname, dev->name, IFNAMSIZ); + strscpy(session->ifname, dev->name, IFNAMSIZ); rcu_assign_pointer(spriv->dev, dev); rtnl_unlock(); diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c index 96eb91be9238..a901fd14fe3b 100644 --- a/net/l2tp/l2tp_netlink.c +++ b/net/l2tp/l2tp_netlink.c @@ -989,6 +989,7 @@ static struct genl_family l2tp_nl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = l2tp_nl_ops, .n_small_ops = ARRAY_SIZE(l2tp_nl_ops), + .resv_start_op = L2TP_CMD_SESSION_GET + 1, .mcgrps = l2tp_multicast_group, .n_mcgrps = ARRAY_SIZE(l2tp_multicast_group), }; diff --git a/net/mac80211/Makefile b/net/mac80211/Makefile index af1df3a6bd55..b8de44da1fb8 100644 --- a/net/mac80211/Makefile +++ b/net/mac80211/Makefile @@ -16,6 +16,7 @@ mac80211-y := \ s1g.o \ ibss.o \ iface.o \ + link.o \ rate.o \ michael.o \ tkip.o \ diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index a4f6971b7a19..687b4c878d4a 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -23,6 +23,30 @@ #include "mesh.h" #include "wme.h" +static struct ieee80211_link_data * +ieee80211_link_or_deflink(struct ieee80211_sub_if_data *sdata, int link_id, + bool require_valid) +{ + struct ieee80211_link_data *link; + + if (link_id < 0) { + /* + * For keys, if sdata is not an MLD, we might not use + * the return value at all (if it's not a pairwise key), + * so in that case (require_valid==false) don't error. + */ + if (require_valid && sdata->vif.valid_links) + return ERR_PTR(-EINVAL); + + return &sdata->deflink; + } + + link = sdata_dereference(sdata->link[link_id], sdata); + if (!link) + return ERR_PTR(-ENOLINK); + return link; +} + static void ieee80211_set_mu_mimo_follow(struct ieee80211_sub_if_data *sdata, struct vif_params *params) { @@ -202,6 +226,10 @@ static int ieee80211_change_iface(struct wiphy *wiphy, if (params->use_4addr == ifmgd->use_4addr) return 0; + /* FIXME: no support for 4-addr MLO yet */ + if (sdata->vif.valid_links) + return -EOPNOTSUPP; + sdata->u.mgd.use_4addr = params->use_4addr; if (!ifmgd->associated) return 0; @@ -434,10 +462,12 @@ static int ieee80211_set_tx(struct ieee80211_sub_if_data *sdata, } static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev, - u8 key_idx, bool pairwise, const u8 *mac_addr, - struct key_params *params) + int link_id, u8 key_idx, bool pairwise, + const u8 *mac_addr, struct key_params *params) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); + struct ieee80211_link_data *link = + ieee80211_link_or_deflink(sdata, link_id, false); struct ieee80211_local *local = sdata->local; struct sta_info *sta = NULL; struct ieee80211_key *key; @@ -446,6 +476,9 @@ static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev, if (!ieee80211_sdata_running(sdata)) return -ENETDOWN; + if (IS_ERR(link)) + return PTR_ERR(link); + if (pairwise && params->mode == NL80211_KEY_SET_TX) return ieee80211_set_tx(sdata, mac_addr, key_idx); @@ -454,6 +487,8 @@ static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev, case WLAN_CIPHER_SUITE_WEP40: case WLAN_CIPHER_SUITE_TKIP: case WLAN_CIPHER_SUITE_WEP104: + if (link_id >= 0) + return -EINVAL; if (WARN_ON_ONCE(fips_enabled)) return -EINVAL; break; @@ -466,6 +501,8 @@ static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev, if (IS_ERR(key)) return PTR_ERR(key); + key->conf.link_id = link_id; + if (pairwise) key->conf.flags |= IEEE80211_KEY_FLAG_PAIRWISE; @@ -527,7 +564,7 @@ static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev, break; } - err = ieee80211_key_link(key, sdata, sta); + err = ieee80211_key_link(key, link, sta); out_unlock: mutex_unlock(&local->sta_mtx); @@ -536,18 +573,37 @@ static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev, } static struct ieee80211_key * -ieee80211_lookup_key(struct ieee80211_sub_if_data *sdata, +ieee80211_lookup_key(struct ieee80211_sub_if_data *sdata, int link_id, u8 key_idx, bool pairwise, const u8 *mac_addr) { struct ieee80211_local *local = sdata->local; + struct ieee80211_link_data *link = &sdata->deflink; struct ieee80211_key *key; - struct sta_info *sta; + + if (link_id >= 0) { + link = rcu_dereference_check(sdata->link[link_id], + lockdep_is_held(&sdata->wdev.mtx)); + if (!link) + return NULL; + } if (mac_addr) { + struct sta_info *sta; + struct link_sta_info *link_sta; + sta = sta_info_get_bss(sdata, mac_addr); if (!sta) return NULL; + if (link_id >= 0) { + link_sta = rcu_dereference_check(sta->link[link_id], + lockdep_is_held(&local->sta_mtx)); + if (!link_sta) + return NULL; + } else { + link_sta = &sta->deflink; + } + if (pairwise && key_idx < NUM_DEFAULT_KEYS) return rcu_dereference_check_key_mtx(local, sta->ptk[key_idx]); @@ -557,7 +613,7 @@ ieee80211_lookup_key(struct ieee80211_sub_if_data *sdata, NUM_DEFAULT_MGMT_KEYS + NUM_DEFAULT_BEACON_KEYS) return rcu_dereference_check_key_mtx(local, - sta->deflink.gtk[key_idx]); + link_sta->gtk[key_idx]); return NULL; } @@ -566,7 +622,7 @@ ieee80211_lookup_key(struct ieee80211_sub_if_data *sdata, return rcu_dereference_check_key_mtx(local, sdata->keys[key_idx]); - key = rcu_dereference_check_key_mtx(local, sdata->deflink.gtk[key_idx]); + key = rcu_dereference_check_key_mtx(local, link->gtk[key_idx]); if (key) return key; @@ -578,7 +634,8 @@ ieee80211_lookup_key(struct ieee80211_sub_if_data *sdata, } static int ieee80211_del_key(struct wiphy *wiphy, struct net_device *dev, - u8 key_idx, bool pairwise, const u8 *mac_addr) + int link_id, u8 key_idx, bool pairwise, + const u8 *mac_addr) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); struct ieee80211_local *local = sdata->local; @@ -588,7 +645,7 @@ static int ieee80211_del_key(struct wiphy *wiphy, struct net_device *dev, mutex_lock(&local->sta_mtx); mutex_lock(&local->key_mtx); - key = ieee80211_lookup_key(sdata, key_idx, pairwise, mac_addr); + key = ieee80211_lookup_key(sdata, link_id, key_idx, pairwise, mac_addr); if (!key) { ret = -ENOENT; goto out_unlock; @@ -605,8 +662,8 @@ static int ieee80211_del_key(struct wiphy *wiphy, struct net_device *dev, } static int ieee80211_get_key(struct wiphy *wiphy, struct net_device *dev, - u8 key_idx, bool pairwise, const u8 *mac_addr, - void *cookie, + int link_id, u8 key_idx, bool pairwise, + const u8 *mac_addr, void *cookie, void (*callback)(void *cookie, struct key_params *params)) { @@ -624,7 +681,7 @@ static int ieee80211_get_key(struct wiphy *wiphy, struct net_device *dev, rcu_read_lock(); - key = ieee80211_lookup_key(sdata, key_idx, pairwise, mac_addr); + key = ieee80211_lookup_key(sdata, link_id, key_idx, pairwise, mac_addr); if (!key) goto out; @@ -711,34 +768,49 @@ static int ieee80211_get_key(struct wiphy *wiphy, struct net_device *dev, static int ieee80211_config_default_key(struct wiphy *wiphy, struct net_device *dev, - u8 key_idx, bool uni, + int link_id, u8 key_idx, bool uni, bool multi) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); + struct ieee80211_link_data *link = + ieee80211_link_or_deflink(sdata, link_id, false); - ieee80211_set_default_key(sdata, key_idx, uni, multi); + if (IS_ERR(link)) + return PTR_ERR(link); + + ieee80211_set_default_key(link, key_idx, uni, multi); return 0; } static int ieee80211_config_default_mgmt_key(struct wiphy *wiphy, struct net_device *dev, - u8 key_idx) + int link_id, u8 key_idx) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); + struct ieee80211_link_data *link = + ieee80211_link_or_deflink(sdata, link_id, true); + + if (IS_ERR(link)) + return PTR_ERR(link); - ieee80211_set_default_mgmt_key(sdata, key_idx); + ieee80211_set_default_mgmt_key(link, key_idx); return 0; } static int ieee80211_config_default_beacon_key(struct wiphy *wiphy, struct net_device *dev, - u8 key_idx) + int link_id, u8 key_idx) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); + struct ieee80211_link_data *link = + ieee80211_link_or_deflink(sdata, link_id, true); + + if (IS_ERR(link)) + return PTR_ERR(link); - ieee80211_set_default_beacon_key(sdata, key_idx); + ieee80211_set_default_beacon_key(link, key_idx); return 0; } @@ -1610,6 +1682,18 @@ static int sta_link_apply_parameters(struct ieee80211_local *local, rcu_dereference_protected(sta->link[link_id], lockdep_is_held(&local->sta_mtx)); + /* + * If there are no changes, then accept a link that doesn't exist, + * unless it's a new link. + */ + if (params->link_id < 0 && !new_link && + !params->link_mac && !params->txpwr_set && + !params->supported_rates_len && + !params->ht_capa && !params->vht_capa && + !params->he_capa && !params->eht_capa && + !params->opmode_notif_used) + return 0; + if (!link || !link_sta) return -EINVAL; @@ -1625,6 +1709,8 @@ static int sta_link_apply_parameters(struct ieee80211_local *local, params->link_mac)) { return -EINVAL; } + } else if (new_link) { + return -EINVAL; } if (params->txpwr_set) { @@ -2554,7 +2640,8 @@ static int ieee80211_set_txq_params(struct wiphy *wiphy, { struct ieee80211_local *local = wiphy_priv(wiphy); struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); - struct ieee80211_link_data *link = &sdata->deflink; + struct ieee80211_link_data *link = + ieee80211_link_or_deflink(sdata, params->link_id, true); struct ieee80211_tx_queue_params p; if (!local->ops->conf_tx) @@ -2563,6 +2650,9 @@ static int ieee80211_set_txq_params(struct wiphy *wiphy, if (local->hw.queues < IEEE80211_NUM_ACS) return -EOPNOTSUPP; + if (IS_ERR(link)) + return PTR_ERR(link); + memset(&p, 0, sizeof(p)); p.aifs = params->aifs; p.cw_max = params->cwmax; @@ -3597,9 +3687,6 @@ static int ieee80211_set_csa_beacon(struct ieee80211_sub_if_data *sdata, case NL80211_IFTYPE_MESH_POINT: { struct ieee80211_if_mesh *ifmsh = &sdata->u.mesh; - if (params->chandef.width != sdata->vif.bss_conf.chandef.width) - return -EINVAL; - /* changes into another band are not supported */ if (sdata->vif.bss_conf.chandef.chan->band != params->chandef.chan->band) @@ -3732,7 +3819,7 @@ __ieee80211_channel_switch(struct wiphy *wiphy, struct net_device *dev, IEEE80211_QUEUE_STOP_REASON_CSA); cfg80211_ch_switch_started_notify(sdata->dev, - &sdata->deflink.csa_chandef, + &sdata->deflink.csa_chandef, 0, params->count, params->block_tx); if (changed) { @@ -4614,6 +4701,9 @@ static int ieee80211_add_intf_link(struct wiphy *wiphy, { struct ieee80211_sub_if_data *sdata = IEEE80211_WDEV_TO_SUB_IF(wdev); + if (wdev->use_4addr) + return -EOPNOTSUPP; + return ieee80211_vif_set_links(sdata, wdev->valid_links); } diff --git a/net/mac80211/chan.c b/net/mac80211/chan.c index f247daa41563..e72cf0749d49 100644 --- a/net/mac80211/chan.c +++ b/net/mac80211/chan.c @@ -1799,6 +1799,12 @@ int ieee80211_link_use_channel(struct ieee80211_link_data *link, lockdep_assert_held(&local->mtx); + if (sdata->vif.active_links && + !(sdata->vif.active_links & BIT(link->link_id))) { + ieee80211_link_update_chandef(link, chandef); + return 0; + } + mutex_lock(&local->chanctx_mtx); ret = cfg80211_chandef_dfs_required(local->hw.wiphy, diff --git a/net/mac80211/debugfs_netdev.c b/net/mac80211/debugfs_netdev.c index 1e5b041a5cea..5b014786fd2d 100644 --- a/net/mac80211/debugfs_netdev.c +++ b/net/mac80211/debugfs_netdev.c @@ -570,6 +570,30 @@ static ssize_t ieee80211_if_parse_tsf( } IEEE80211_IF_FILE_RW(tsf); +static ssize_t ieee80211_if_fmt_valid_links(const struct ieee80211_sub_if_data *sdata, + char *buf, int buflen) +{ + return snprintf(buf, buflen, "0x%x\n", sdata->vif.valid_links); +} +IEEE80211_IF_FILE_R(valid_links); + +static ssize_t ieee80211_if_fmt_active_links(const struct ieee80211_sub_if_data *sdata, + char *buf, int buflen) +{ + return snprintf(buf, buflen, "0x%x\n", sdata->vif.active_links); +} + +static ssize_t ieee80211_if_parse_active_links(struct ieee80211_sub_if_data *sdata, + const char *buf, int buflen) +{ + u16 active_links; + + if (kstrtou16(buf, 0, &active_links)) + return -EINVAL; + + return ieee80211_set_active_links(&sdata->vif, active_links) ?: buflen; +} +IEEE80211_IF_FILE_RW(active_links); #ifdef CONFIG_MAC80211_MESH IEEE80211_IF_FILE(estab_plinks, u.mesh.estab_plinks, ATOMIC); @@ -670,6 +694,8 @@ static void add_sta_files(struct ieee80211_sub_if_data *sdata) DEBUGFS_ADD_MODE(uapsd_queues, 0600); DEBUGFS_ADD_MODE(uapsd_max_sp_len, 0600); DEBUGFS_ADD_MODE(tdls_wider_bw, 0600); + DEBUGFS_ADD_MODE(valid_links, 0200); + DEBUGFS_ADD_MODE(active_links, 0600); } static void add_ap_files(struct ieee80211_sub_if_data *sdata) diff --git a/net/mac80211/driver-ops.c b/net/mac80211/driver-ops.c index 9b61dc7889c2..5392ffa18270 100644 --- a/net/mac80211/driver-ops.c +++ b/net/mac80211/driver-ops.c @@ -192,6 +192,10 @@ int drv_conf_tx(struct ieee80211_local *local, if (!check_sdata_in_driver(sdata)) return -EIO; + if (sdata->vif.active_links && + !(sdata->vif.active_links & BIT(link->link_id))) + return 0; + if (params->cw_min == 0 || params->cw_min > params->cw_max) { /* * If we can't configure hardware anyway, don't warn. We may @@ -272,6 +276,60 @@ void drv_reset_tsf(struct ieee80211_local *local, trace_drv_return_void(local); } +int drv_assign_vif_chanctx(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_bss_conf *link_conf, + struct ieee80211_chanctx *ctx) +{ + int ret = 0; + + drv_verify_link_exists(sdata, link_conf); + if (!check_sdata_in_driver(sdata)) + return -EIO; + + if (sdata->vif.active_links && + !(sdata->vif.active_links & BIT(link_conf->link_id))) + return 0; + + trace_drv_assign_vif_chanctx(local, sdata, link_conf, ctx); + if (local->ops->assign_vif_chanctx) { + WARN_ON_ONCE(!ctx->driver_present); + ret = local->ops->assign_vif_chanctx(&local->hw, + &sdata->vif, + link_conf, + &ctx->conf); + } + trace_drv_return_int(local, ret); + + return ret; +} + +void drv_unassign_vif_chanctx(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_bss_conf *link_conf, + struct ieee80211_chanctx *ctx) +{ + might_sleep(); + + drv_verify_link_exists(sdata, link_conf); + if (!check_sdata_in_driver(sdata)) + return; + + if (sdata->vif.active_links && + !(sdata->vif.active_links & BIT(link_conf->link_id))) + return; + + trace_drv_unassign_vif_chanctx(local, sdata, link_conf, ctx); + if (local->ops->unassign_vif_chanctx) { + WARN_ON_ONCE(!ctx->driver_present); + local->ops->unassign_vif_chanctx(&local->hw, + &sdata->vif, + link_conf, + &ctx->conf); + } + trace_drv_return_void(local); +} + int drv_switch_vif_chanctx(struct ieee80211_local *local, struct ieee80211_vif_chanctx_switch *vifs, int n_vifs, enum ieee80211_chanctx_switch_mode mode) @@ -346,3 +404,117 @@ int drv_ampdu_action(struct ieee80211_local *local, return ret; } + +void drv_link_info_changed(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_bss_conf *info, + int link_id, u64 changed) +{ + might_sleep(); + + if (WARN_ON_ONCE(changed & (BSS_CHANGED_BEACON | + BSS_CHANGED_BEACON_ENABLED) && + sdata->vif.type != NL80211_IFTYPE_AP && + sdata->vif.type != NL80211_IFTYPE_ADHOC && + sdata->vif.type != NL80211_IFTYPE_MESH_POINT && + sdata->vif.type != NL80211_IFTYPE_OCB)) + return; + + if (WARN_ON_ONCE(sdata->vif.type == NL80211_IFTYPE_P2P_DEVICE || + sdata->vif.type == NL80211_IFTYPE_NAN || + (sdata->vif.type == NL80211_IFTYPE_MONITOR && + !sdata->vif.bss_conf.mu_mimo_owner && + !(changed & BSS_CHANGED_TXPOWER)))) + return; + + if (!check_sdata_in_driver(sdata)) + return; + + if (sdata->vif.active_links && + !(sdata->vif.active_links & BIT(link_id))) + return; + + trace_drv_link_info_changed(local, sdata, info, changed); + if (local->ops->link_info_changed) + local->ops->link_info_changed(&local->hw, &sdata->vif, + info, changed); + else if (local->ops->bss_info_changed) + local->ops->bss_info_changed(&local->hw, &sdata->vif, + info, changed); + trace_drv_return_void(local); +} + +int drv_set_key(struct ieee80211_local *local, + enum set_key_cmd cmd, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_sta *sta, + struct ieee80211_key_conf *key) +{ + int ret; + + might_sleep(); + + sdata = get_bss_sdata(sdata); + if (!check_sdata_in_driver(sdata)) + return -EIO; + + if (WARN_ON(key->link_id >= 0 && sdata->vif.active_links && + !(sdata->vif.active_links & BIT(key->link_id)))) + return -ENOLINK; + + trace_drv_set_key(local, cmd, sdata, sta, key); + ret = local->ops->set_key(&local->hw, cmd, &sdata->vif, sta, key); + trace_drv_return_int(local, ret); + return ret; +} + +int drv_change_vif_links(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + u16 old_links, u16 new_links, + struct ieee80211_bss_conf *old[IEEE80211_MLD_MAX_NUM_LINKS]) +{ + int ret = -EOPNOTSUPP; + + might_sleep(); + + if (!check_sdata_in_driver(sdata)) + return -EIO; + + if (old_links == new_links) + return 0; + + trace_drv_change_vif_links(local, sdata, old_links, new_links); + if (local->ops->change_vif_links) + ret = local->ops->change_vif_links(&local->hw, &sdata->vif, + old_links, new_links, old); + trace_drv_return_int(local, ret); + + return ret; +} + +int drv_change_sta_links(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_sta *sta, + u16 old_links, u16 new_links) +{ + int ret = -EOPNOTSUPP; + + might_sleep(); + + if (!check_sdata_in_driver(sdata)) + return -EIO; + + old_links &= sdata->vif.active_links; + new_links &= sdata->vif.active_links; + + if (old_links == new_links) + return 0; + + trace_drv_change_sta_links(local, sdata, sta, old_links, new_links); + if (local->ops->change_sta_links) + ret = local->ops->change_sta_links(&local->hw, &sdata->vif, sta, + old_links, new_links); + trace_drv_return_int(local, ret); + + return ret; +} diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h index 482f5c97a72b..81e40b0a3b16 100644 --- a/net/mac80211/driver-ops.h +++ b/net/mac80211/driver-ops.h @@ -165,40 +165,10 @@ static inline void drv_vif_cfg_changed(struct ieee80211_local *local, trace_drv_return_void(local); } -static inline void drv_link_info_changed(struct ieee80211_local *local, - struct ieee80211_sub_if_data *sdata, - struct ieee80211_bss_conf *info, - int link_id, u64 changed) -{ - might_sleep(); - - if (WARN_ON_ONCE(changed & (BSS_CHANGED_BEACON | - BSS_CHANGED_BEACON_ENABLED) && - sdata->vif.type != NL80211_IFTYPE_AP && - sdata->vif.type != NL80211_IFTYPE_ADHOC && - sdata->vif.type != NL80211_IFTYPE_MESH_POINT && - sdata->vif.type != NL80211_IFTYPE_OCB)) - return; - - if (WARN_ON_ONCE(sdata->vif.type == NL80211_IFTYPE_P2P_DEVICE || - sdata->vif.type == NL80211_IFTYPE_NAN || - (sdata->vif.type == NL80211_IFTYPE_MONITOR && - !sdata->vif.bss_conf.mu_mimo_owner && - !(changed & BSS_CHANGED_TXPOWER)))) - return; - - if (!check_sdata_in_driver(sdata)) - return; - - trace_drv_link_info_changed(local, sdata, info, changed); - if (local->ops->link_info_changed) - local->ops->link_info_changed(&local->hw, &sdata->vif, - info, changed); - else if (local->ops->bss_info_changed) - local->ops->bss_info_changed(&local->hw, &sdata->vif, - info, changed); - trace_drv_return_void(local); -} +void drv_link_info_changed(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_bss_conf *info, + int link_id, u64 changed); static inline u64 drv_prepare_multicast(struct ieee80211_local *local, struct netdev_hw_addr_list *mc_list) @@ -256,25 +226,11 @@ static inline int drv_set_tim(struct ieee80211_local *local, return ret; } -static inline int drv_set_key(struct ieee80211_local *local, - enum set_key_cmd cmd, - struct ieee80211_sub_if_data *sdata, - struct ieee80211_sta *sta, - struct ieee80211_key_conf *key) -{ - int ret; - - might_sleep(); - - sdata = get_bss_sdata(sdata); - if (!check_sdata_in_driver(sdata)) - return -EIO; - - trace_drv_set_key(local, cmd, sdata, sta, key); - ret = local->ops->set_key(&local->hw, cmd, &sdata->vif, sta, key); - trace_drv_return_int(local, ret); - return ret; -} +int drv_set_key(struct ieee80211_local *local, + enum set_key_cmd cmd, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_sta *sta, + struct ieee80211_key_conf *key); static inline void drv_update_tkip_key(struct ieee80211_local *local, struct ieee80211_sub_if_data *sdata, @@ -945,52 +901,14 @@ static inline void drv_verify_link_exists(struct ieee80211_sub_if_data *sdata, sdata_assert_lock(sdata); } -static inline int drv_assign_vif_chanctx(struct ieee80211_local *local, - struct ieee80211_sub_if_data *sdata, - struct ieee80211_bss_conf *link_conf, - struct ieee80211_chanctx *ctx) -{ - int ret = 0; - - drv_verify_link_exists(sdata, link_conf); - if (!check_sdata_in_driver(sdata)) - return -EIO; - - trace_drv_assign_vif_chanctx(local, sdata, link_conf, ctx); - if (local->ops->assign_vif_chanctx) { - WARN_ON_ONCE(!ctx->driver_present); - ret = local->ops->assign_vif_chanctx(&local->hw, - &sdata->vif, - link_conf, - &ctx->conf); - } - trace_drv_return_int(local, ret); - - return ret; -} - -static inline void drv_unassign_vif_chanctx(struct ieee80211_local *local, - struct ieee80211_sub_if_data *sdata, - struct ieee80211_bss_conf *link_conf, - struct ieee80211_chanctx *ctx) -{ - might_sleep(); - - drv_verify_link_exists(sdata, link_conf); - if (!check_sdata_in_driver(sdata)) - return; - - trace_drv_unassign_vif_chanctx(local, sdata, link_conf, ctx); - if (local->ops->unassign_vif_chanctx) { - WARN_ON_ONCE(!ctx->driver_present); - local->ops->unassign_vif_chanctx(&local->hw, - &sdata->vif, - link_conf, - &ctx->conf); - } - trace_drv_return_void(local); -} - +int drv_assign_vif_chanctx(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_bss_conf *link_conf, + struct ieee80211_chanctx *ctx); +void drv_unassign_vif_chanctx(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_bss_conf *link_conf, + struct ieee80211_chanctx *ctx); int drv_switch_vif_chanctx(struct ieee80211_local *local, struct ieee80211_vif_chanctx_switch *vifs, int n_vifs, enum ieee80211_chanctx_switch_mode mode); @@ -1552,46 +1470,13 @@ static inline int drv_net_fill_forward_path(struct ieee80211_local *local, return ret; } -static inline int drv_change_vif_links(struct ieee80211_local *local, - struct ieee80211_sub_if_data *sdata, - u16 old_links, u16 new_links, - struct ieee80211_bss_conf *old[IEEE80211_MLD_MAX_NUM_LINKS]) -{ - int ret = -EOPNOTSUPP; - - might_sleep(); - - if (!check_sdata_in_driver(sdata)) - return -EIO; - - trace_drv_change_vif_links(local, sdata, old_links, new_links); - if (local->ops->change_vif_links) - ret = local->ops->change_vif_links(&local->hw, &sdata->vif, - old_links, new_links, old); - trace_drv_return_int(local, ret); - - return ret; -} - -static inline int drv_change_sta_links(struct ieee80211_local *local, - struct ieee80211_sub_if_data *sdata, - struct ieee80211_sta *sta, - u16 old_links, u16 new_links) -{ - int ret = -EOPNOTSUPP; - - might_sleep(); - - if (!check_sdata_in_driver(sdata)) - return -EIO; - - trace_drv_change_sta_links(local, sdata, sta, old_links, new_links); - if (local->ops->change_sta_links) - ret = local->ops->change_sta_links(&local->hw, &sdata->vif, sta, - old_links, new_links); - trace_drv_return_int(local, ret); - - return ret; -} +int drv_change_vif_links(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + u16 old_links, u16 new_links, + struct ieee80211_bss_conf *old[IEEE80211_MLD_MAX_NUM_LINKS]); +int drv_change_sta_links(struct ieee80211_local *local, + struct ieee80211_sub_if_data *sdata, + struct ieee80211_sta *sta, + u16 old_links, u16 new_links); #endif /* __MAC80211_DRIVER_OPS */ diff --git a/net/mac80211/eht.c b/net/mac80211/eht.c index 31e20a342f21..18bc6b78b267 100644 --- a/net/mac80211/eht.c +++ b/net/mac80211/eht.c @@ -30,7 +30,9 @@ ieee80211_eht_cap_ie_to_sta_eht_cap(struct ieee80211_sub_if_data *sdata, return; mcs_nss_size = ieee80211_eht_mcs_nss_size(he_cap_ie_elem, - &eht_cap_ie_elem->fixed); + &eht_cap_ie_elem->fixed, + sdata->vif.type == + NL80211_IFTYPE_STATION); eht_total_size += mcs_nss_size; diff --git a/net/mac80211/ethtool.c b/net/mac80211/ethtool.c index c2b38370bfb1..a3830d925cc2 100644 --- a/net/mac80211/ethtool.c +++ b/net/mac80211/ethtool.c @@ -83,17 +83,17 @@ static void ieee80211_get_stats(struct net_device *dev, #define ADD_STA_STATS(sta) \ do { \ - data[i++] += (sta)->rx_stats.packets; \ - data[i++] += (sta)->rx_stats.bytes; \ + data[i++] += sinfo.rx_packets; \ + data[i++] += sinfo.rx_bytes; \ data[i++] += (sta)->rx_stats.num_duplicates; \ data[i++] += (sta)->rx_stats.fragments; \ - data[i++] += (sta)->rx_stats.dropped; \ + data[i++] += sinfo.rx_dropped_misc; \ \ data[i++] += sinfo.tx_packets; \ data[i++] += sinfo.tx_bytes; \ data[i++] += (sta)->status_stats.filtered; \ - data[i++] += (sta)->status_stats.retry_failed; \ - data[i++] += (sta)->status_stats.retry_count; \ + data[i++] += sinfo.tx_failed; \ + data[i++] += sinfo.tx_retries; \ } while (0) /* For Managed stations, find the single station based on BSSID diff --git a/net/mac80211/he.c b/net/mac80211/he.c index d9228fd3f77a..729f261520c7 100644 --- a/net/mac80211/he.c +++ b/net/mac80211/he.c @@ -31,25 +31,27 @@ ieee80211_update_from_he_6ghz_capa(const struct ieee80211_he_6ghz_capa *he_6ghz_ break; } - sta->sta.smps_mode = smps_mode; + link_sta->pub->smps_mode = smps_mode; } else { - sta->sta.smps_mode = IEEE80211_SMPS_OFF; + link_sta->pub->smps_mode = IEEE80211_SMPS_OFF; } switch (le16_get_bits(he_6ghz_capa->capa, IEEE80211_HE_6GHZ_CAP_MAX_MPDU_LEN)) { case IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454: - sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_11454; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_11454; break; case IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_7991: - sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_7991; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_7991; break; case IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_3895: default: - sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_3895; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_3895; break; } + ieee80211_sta_recalc_aggregates(&sta->sta); + link_sta->pub->he_6ghz_capa = *he_6ghz_capa; } diff --git a/net/mac80211/ht.c b/net/mac80211/ht.c index 8c24817cd497..83bc41346ae7 100644 --- a/net/mac80211/ht.c +++ b/net/mac80211/ht.c @@ -241,9 +241,11 @@ bool ieee80211_ht_cap_ie_to_sta_ht_cap(struct ieee80211_sub_if_data *sdata, ht_cap.mcs.rx_highest = ht_cap_ie->mcs.rx_highest; if (ht_cap.cap & IEEE80211_HT_CAP_MAX_AMSDU) - sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_7935; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_7935; else - sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_3839; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_3839; + + ieee80211_sta_recalc_aggregates(&sta->sta); apply: changed = memcmp(&link_sta->pub->ht_cap, &ht_cap, sizeof(ht_cap)); @@ -299,12 +301,13 @@ bool ieee80211_ht_cap_ie_to_sta_ht_cap(struct ieee80211_sub_if_data *sdata, break; } - if (smps_mode != sta->sta.smps_mode) + if (smps_mode != link_sta->pub->smps_mode) changed = true; - sta->sta.smps_mode = smps_mode; + link_sta->pub->smps_mode = smps_mode; } else { - sta->sta.smps_mode = IEEE80211_SMPS_OFF; + link_sta->pub->smps_mode = IEEE80211_SMPS_OFF; } + return changed; } diff --git a/net/mac80211/ibss.c b/net/mac80211/ibss.c index 9b283bbc7bb4..9dffc3079588 100644 --- a/net/mac80211/ibss.c +++ b/net/mac80211/ibss.c @@ -1350,10 +1350,10 @@ static void ieee80211_sta_create_ibss(struct ieee80211_sub_if_data *sdata) capability, 0, true); } -static unsigned ibss_setup_channels(struct wiphy *wiphy, - struct ieee80211_channel **channels, - unsigned int channels_max, - u32 center_freq, u32 width) +static unsigned int ibss_setup_channels(struct wiphy *wiphy, + struct ieee80211_channel **channels, + unsigned int channels_max, + u32 center_freq, u32 width) { struct ieee80211_channel *chan = NULL; unsigned int n_chan = 0; diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h index e192e1ec0261..4e1d4c339f2d 100644 --- a/net/mac80211/ieee80211_i.h +++ b/net/mac80211/ieee80211_i.h @@ -213,6 +213,7 @@ struct ieee80211_rx_data { struct ieee80211_sub_if_data *sdata; struct ieee80211_link_data *link; struct sta_info *sta; + struct link_sta_info *link_sta; struct ieee80211_key *key; unsigned int flags; @@ -1080,6 +1081,10 @@ struct ieee80211_sub_if_data { struct ieee80211_link_data deflink; struct ieee80211_link_data __rcu *link[IEEE80211_MLD_MAX_NUM_LINKS]; + /* for ieee80211_set_active_links_async() */ + struct work_struct activate_links_work; + u16 desired_active_links; + #ifdef CONFIG_MAC80211_DEBUGFS struct { struct dentry *subdir_stations; @@ -1810,6 +1815,7 @@ void ieee80211_sta_connection_lost(struct ieee80211_sub_if_data *sdata, u8 reason, bool tx); void ieee80211_mgd_setup_link(struct ieee80211_link_data *link); void ieee80211_mgd_stop_link(struct ieee80211_link_data *link); +void ieee80211_mgd_set_link_qos_params(struct ieee80211_link_data *link); /* IBSS code */ void ieee80211_ibss_notify_scan_completed(struct ieee80211_local *local); @@ -1929,9 +1935,6 @@ void ieee80211_sdata_stop(struct ieee80211_sub_if_data *sdata); int ieee80211_add_virtual_monitor(struct ieee80211_local *local); void ieee80211_del_virtual_monitor(struct ieee80211_local *local); -int ieee80211_vif_set_links(struct ieee80211_sub_if_data *sdata, - u16 new_links); - bool __ieee80211_recalc_txpower(struct ieee80211_sub_if_data *sdata); void ieee80211_recalc_txpower(struct ieee80211_sub_if_data *sdata, bool update_bss); @@ -1942,6 +1945,17 @@ static inline bool ieee80211_sdata_running(struct ieee80211_sub_if_data *sdata) return test_bit(SDATA_STATE_RUNNING, &sdata->state); } +/* link handling */ +void ieee80211_link_setup(struct ieee80211_link_data *link); +void ieee80211_link_init(struct ieee80211_sub_if_data *sdata, + int link_id, + struct ieee80211_link_data *link, + struct ieee80211_bss_conf *link_conf); +void ieee80211_link_stop(struct ieee80211_link_data *link); +int ieee80211_vif_set_links(struct ieee80211_sub_if_data *sdata, + u16 new_links); +void ieee80211_vif_clear_links(struct ieee80211_sub_if_data *sdata); + /* tx handling */ void ieee80211_clear_tx_pending(struct ieee80211_local *local); void ieee80211_tx_pending(struct tasklet_struct *t); @@ -2184,6 +2198,8 @@ static inline void ieee80211_tx_skb(struct ieee80211_sub_if_data *sdata, * for that non-transmitting BSS is returned * @link_id: the link ID to parse elements for, if a STA profile * is present in the multi-link element, or -1 to ignore + * @from_ap: frame is received from an AP (currently used only + * for EHT capabilities parsing) */ struct ieee80211_elems_parse_params { const u8 *start; @@ -2193,6 +2209,7 @@ struct ieee80211_elems_parse_params { u32 crc; struct cfg80211_bss *bss; int link_id; + bool from_ap; }; struct ieee802_11_elems * @@ -2382,6 +2399,7 @@ u8 *ieee80211_ie_build_he_cap(ieee80211_conn_flags_t disable_flags, u8 *pos, const struct ieee80211_sta_he_cap *he_cap, u8 *end); void ieee80211_ie_build_he_6ghz_cap(struct ieee80211_sub_if_data *sdata, + enum ieee80211_smps_mode smps_mode, struct sk_buff *skb); u8 *ieee80211_ie_build_he_oper(u8 *pos, struct cfg80211_chan_def *chandef); int ieee80211_parse_bitrates(enum nl80211_chan_width width, @@ -2407,8 +2425,7 @@ bool ieee80211_chandef_vht_oper(struct ieee80211_hw *hw, u32 vht_cap_info, const struct ieee80211_vht_operation *oper, const struct ieee80211_ht_operation *htop, struct cfg80211_chan_def *chandef); -void ieee80211_chandef_eht_oper(struct ieee80211_sub_if_data *sdata, - const struct ieee80211_eht_operation *eht_oper, +void ieee80211_chandef_eht_oper(const struct ieee80211_eht_operation *eht_oper, bool support_160, bool support_320, struct cfg80211_chan_def *chandef); bool ieee80211_chandef_he_6ghz_oper(struct ieee80211_sub_if_data *sdata, @@ -2513,7 +2530,8 @@ u8 ieee80211_ie_len_eht_cap(struct ieee80211_sub_if_data *sdata, u8 iftype); u8 *ieee80211_ie_build_eht_cap(u8 *pos, const struct ieee80211_sta_he_cap *he_cap, const struct ieee80211_sta_eht_cap *eht_cap, - u8 *end); + u8 *end, + bool for_ap); void ieee80211_eht_cap_ie_to_sta_eht_cap(struct ieee80211_sub_if_data *sdata, diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c index 95b58c5cac07..572254366a0f 100644 --- a/net/mac80211/iface.c +++ b/net/mac80211/iface.c @@ -200,15 +200,73 @@ static int ieee80211_verify_mac(struct ieee80211_sub_if_data *sdata, u8 *addr, return ret; } +static int ieee80211_can_powered_addr_change(struct ieee80211_sub_if_data *sdata) +{ + struct ieee80211_roc_work *roc; + struct ieee80211_local *local = sdata->local; + struct ieee80211_sub_if_data *scan_sdata; + int ret = 0; + + /* To be the most flexible here we want to only limit changing the + * address if the specific interface is doing offchannel work or + * scanning. + */ + if (netif_carrier_ok(sdata->dev)) + return -EBUSY; + + mutex_lock(&local->mtx); + + /* First check no ROC work is happening on this iface */ + list_for_each_entry(roc, &local->roc_list, list) { + if (roc->sdata != sdata) + continue; + + if (roc->started) { + ret = -EBUSY; + goto unlock; + } + } + + /* And if this iface is scanning */ + if (local->scanning) { + scan_sdata = rcu_dereference_protected(local->scan_sdata, + lockdep_is_held(&local->mtx)); + if (sdata == scan_sdata) + ret = -EBUSY; + } + + switch (sdata->vif.type) { + case NL80211_IFTYPE_STATION: + case NL80211_IFTYPE_P2P_CLIENT: + /* More interface types could be added here but changing the + * address while powered makes the most sense in client modes. + */ + break; + default: + return -EOPNOTSUPP; + } + +unlock: + mutex_unlock(&local->mtx); + return ret; +} + static int ieee80211_change_mac(struct net_device *dev, void *addr) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); + struct ieee80211_local *local = sdata->local; struct sockaddr *sa = addr; bool check_dup = true; + bool live = false; int ret; - if (ieee80211_sdata_running(sdata)) - return -EBUSY; + if (ieee80211_sdata_running(sdata)) { + ret = ieee80211_can_powered_addr_change(sdata); + if (ret) + return ret; + + live = true; + } if (sdata->vif.type == NL80211_IFTYPE_MONITOR && !(sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE)) @@ -218,6 +276,8 @@ static int ieee80211_change_mac(struct net_device *dev, void *addr) if (ret) return ret; + if (live) + drv_remove_interface(local, sdata); ret = eth_mac_addr(dev, sa); if (ret == 0) { @@ -225,6 +285,12 @@ static int ieee80211_change_mac(struct net_device *dev, void *addr) ether_addr_copy(sdata->vif.bss_conf.addr, sdata->vif.addr); } + /* Regardless of eth_mac_addr() return we still want to add the + * interface back. This should not fail... + */ + if (live) + WARN_ON(drv_add_interface(local, sdata)); + return ret; } @@ -296,6 +362,11 @@ static int ieee80211_check_concurrent_iface(struct ieee80211_sub_if_data *sdata, nsdata->vif.type)) return -ENOTUNIQ; + /* No support for VLAN with MLO yet */ + if (iftype == NL80211_IFTYPE_AP_VLAN && + nsdata->wdev.use_4addr) + return -EOPNOTSUPP; + /* * can only add VLANs to enabled APs */ @@ -368,246 +439,6 @@ static int ieee80211_open(struct net_device *dev) return err; } -static void ieee80211_link_setup(struct ieee80211_link_data *link) -{ - if (link->sdata->vif.type == NL80211_IFTYPE_STATION) - ieee80211_mgd_setup_link(link); -} - -static void ieee80211_link_init(struct ieee80211_sub_if_data *sdata, - int link_id, - struct ieee80211_link_data *link, - struct ieee80211_bss_conf *link_conf) -{ - bool deflink = link_id < 0; - - if (link_id < 0) - link_id = 0; - - rcu_assign_pointer(sdata->vif.link_conf[link_id], link_conf); - rcu_assign_pointer(sdata->link[link_id], link); - - link->sdata = sdata; - link->link_id = link_id; - link->conf = link_conf; - link_conf->link_id = link_id; - - INIT_WORK(&link->csa_finalize_work, - ieee80211_csa_finalize_work); - INIT_WORK(&link->color_change_finalize_work, - ieee80211_color_change_finalize_work); - INIT_LIST_HEAD(&link->assigned_chanctx_list); - INIT_LIST_HEAD(&link->reserved_chanctx_list); - INIT_DELAYED_WORK(&link->dfs_cac_timer_work, - ieee80211_dfs_cac_timer_work); - - if (!deflink) { - switch (sdata->vif.type) { - case NL80211_IFTYPE_AP: - ether_addr_copy(link_conf->addr, - sdata->wdev.links[link_id].addr); - WARN_ON(!(sdata->wdev.valid_links & BIT(link_id))); - break; - case NL80211_IFTYPE_STATION: - break; - default: - WARN_ON(1); - } - } -} - -static void ieee80211_link_stop(struct ieee80211_link_data *link) -{ - if (link->sdata->vif.type == NL80211_IFTYPE_STATION) - ieee80211_mgd_stop_link(link); - - ieee80211_link_release_channel(link); -} - -struct link_container { - struct ieee80211_link_data data; - struct ieee80211_bss_conf conf; -}; - -static void ieee80211_free_links(struct ieee80211_sub_if_data *sdata, - struct link_container **links) -{ - unsigned int link_id; - - synchronize_rcu(); - - for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) { - if (!links[link_id]) - continue; - ieee80211_link_stop(&links[link_id]->data); - kfree(links[link_id]); - } -} - -static int ieee80211_check_dup_link_addrs(struct ieee80211_sub_if_data *sdata) -{ - unsigned int i, j; - - for (i = 0; i < IEEE80211_MLD_MAX_NUM_LINKS; i++) { - struct ieee80211_link_data *link1; - - link1 = sdata_dereference(sdata->link[i], sdata); - if (!link1) - continue; - for (j = i + 1; j < IEEE80211_MLD_MAX_NUM_LINKS; j++) { - struct ieee80211_link_data *link2; - - link2 = sdata_dereference(sdata->link[j], sdata); - if (!link2) - continue; - - if (ether_addr_equal(link1->conf->addr, - link2->conf->addr)) - return -EALREADY; - } - } - - return 0; -} - -static int ieee80211_vif_update_links(struct ieee80211_sub_if_data *sdata, - struct link_container **to_free, - u16 new_links) -{ - u16 old_links = sdata->vif.valid_links; - unsigned long add = new_links & ~old_links; - unsigned long rem = old_links & ~new_links; - unsigned int link_id; - int ret; - struct link_container *links[IEEE80211_MLD_MAX_NUM_LINKS] = {}, *link; - struct ieee80211_bss_conf *old[IEEE80211_MLD_MAX_NUM_LINKS]; - struct ieee80211_link_data *old_data[IEEE80211_MLD_MAX_NUM_LINKS]; - bool use_deflink = old_links == 0; /* set for error case */ - - sdata_assert_lock(sdata); - - memset(to_free, 0, sizeof(links)); - - if (old_links == new_links) - return 0; - - /* if there were no old links, need to clear the pointers to deflink */ - if (!old_links) - rem |= BIT(0); - - /* allocate new link structures first */ - for_each_set_bit(link_id, &add, IEEE80211_MLD_MAX_NUM_LINKS) { - link = kzalloc(sizeof(*link), GFP_KERNEL); - if (!link) { - ret = -ENOMEM; - goto free; - } - links[link_id] = link; - } - - /* keep track of the old pointers for the driver */ - BUILD_BUG_ON(sizeof(old) != sizeof(sdata->vif.link_conf)); - memcpy(old, sdata->vif.link_conf, sizeof(old)); - /* and for us in error cases */ - BUILD_BUG_ON(sizeof(old_data) != sizeof(sdata->link)); - memcpy(old_data, sdata->link, sizeof(old_data)); - - /* grab old links to free later */ - for_each_set_bit(link_id, &rem, IEEE80211_MLD_MAX_NUM_LINKS) { - if (rcu_access_pointer(sdata->link[link_id]) != &sdata->deflink) { - /* - * we must have allocated the data through this path so - * we know we can free both at the same time - */ - to_free[link_id] = container_of(rcu_access_pointer(sdata->link[link_id]), - typeof(*links[link_id]), - data); - } - - RCU_INIT_POINTER(sdata->link[link_id], NULL); - RCU_INIT_POINTER(sdata->vif.link_conf[link_id], NULL); - } - - /* link them into data structures */ - for_each_set_bit(link_id, &add, IEEE80211_MLD_MAX_NUM_LINKS) { - WARN_ON(!use_deflink && - rcu_access_pointer(sdata->link[link_id]) == &sdata->deflink); - - link = links[link_id]; - ieee80211_link_init(sdata, link_id, &link->data, &link->conf); - ieee80211_link_setup(&link->data); - } - - if (new_links == 0) - ieee80211_link_init(sdata, -1, &sdata->deflink, - &sdata->vif.bss_conf); - - sdata->vif.valid_links = new_links; - - ret = ieee80211_check_dup_link_addrs(sdata); - if (!ret) { - /* tell the driver */ - ret = drv_change_vif_links(sdata->local, sdata, - old_links, new_links, - old); - } - - if (ret) { - /* restore config */ - memcpy(sdata->link, old_data, sizeof(old_data)); - memcpy(sdata->vif.link_conf, old, sizeof(old)); - sdata->vif.valid_links = old_links; - /* and free (only) the newly allocated links */ - memset(to_free, 0, sizeof(links)); - goto free; - } - - /* use deflink/bss_conf again if and only if there are no more links */ - use_deflink = new_links == 0; - - goto deinit; -free: - /* if we failed during allocation, only free all */ - for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) { - kfree(links[link_id]); - links[link_id] = NULL; - } -deinit: - if (use_deflink) - ieee80211_link_init(sdata, -1, &sdata->deflink, - &sdata->vif.bss_conf); - return ret; -} - -int ieee80211_vif_set_links(struct ieee80211_sub_if_data *sdata, - u16 new_links) -{ - struct link_container *links[IEEE80211_MLD_MAX_NUM_LINKS]; - int ret; - - ret = ieee80211_vif_update_links(sdata, links, new_links); - ieee80211_free_links(sdata, links); - - return ret; -} - -static void ieee80211_vif_clear_links(struct ieee80211_sub_if_data *sdata) -{ - struct link_container *links[IEEE80211_MLD_MAX_NUM_LINKS]; - - /* - * The locking here is different because when we free links - * in the station case we need to be able to cancel_work_sync() - * something that also takes the lock. - */ - - sdata_lock(sdata); - ieee80211_vif_update_links(sdata, links, 0); - sdata_unlock(sdata); - - ieee80211_free_links(sdata, links); -} - static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, bool going_down) { struct ieee80211_local *local = sdata->local; @@ -923,6 +754,8 @@ static int ieee80211_stop(struct net_device *dev) ieee80211_stop_mbssid(sdata); } + cancel_work_sync(&sdata->activate_links_work); + wiphy_lock(sdata->local->hw.wiphy); ieee80211_do_stop(sdata, true); wiphy_unlock(sdata->local->hw.wiphy); @@ -1893,6 +1726,15 @@ static void ieee80211_recalc_smps_work(struct work_struct *work) ieee80211_recalc_smps(sdata, &sdata->deflink); } +static void ieee80211_activate_links_work(struct work_struct *work) +{ + struct ieee80211_sub_if_data *sdata = + container_of(work, struct ieee80211_sub_if_data, + activate_links_work); + + ieee80211_set_active_links(&sdata->vif, sdata->desired_active_links); +} + /* * Helper function to initialise an interface to a specific type. */ @@ -1930,6 +1772,7 @@ static void ieee80211_setup_sdata(struct ieee80211_sub_if_data *sdata, skb_queue_head_init(&sdata->status_queue); INIT_WORK(&sdata->work, ieee80211_iface_work); INIT_WORK(&sdata->recalc_smps, ieee80211_recalc_smps_work); + INIT_WORK(&sdata->activate_links_work, ieee80211_activate_links_work); switch (type) { case NL80211_IFTYPE_P2P_GO: @@ -2267,7 +2110,7 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name, wdev = &sdata->wdev; sdata->dev = NULL; - strlcpy(sdata->name, name, IFNAMSIZ); + strscpy(sdata->name, name, IFNAMSIZ); ieee80211_assign_perm_addr(local, wdev->address, type); memcpy(sdata->vif.addr, wdev->address, ETH_ALEN); ether_addr_copy(sdata->vif.bss_conf.addr, sdata->vif.addr); @@ -2396,6 +2239,7 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name, sdata->u.mgd.use_4addr = params->use_4addr; ndev->features |= local->hw.netdev_features; + ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE; ndev->hw_features |= ndev->features & MAC80211_SUPPORTED_FEATURES_TX; diff --git a/net/mac80211/key.c b/net/mac80211/key.c index 6befb578ed9e..e8f6c1e5eabf 100644 --- a/net/mac80211/key.c +++ b/net/mac80211/key.c @@ -177,6 +177,10 @@ static int ieee80211_key_enable_hw_accel(struct ieee80211_key *key) } } + if (key->conf.link_id >= 0 && sdata->vif.active_links && + !(sdata->vif.active_links & BIT(key->conf.link_id))) + return 0; + ret = drv_set_key(key->local, SET_KEY, sdata, sta ? &sta->sta : NULL, &key->conf); @@ -246,6 +250,10 @@ static void ieee80211_key_disable_hw_accel(struct ieee80211_key *key) sta = key->sta; sdata = key->sdata; + if (key->conf.link_id >= 0 && sdata->vif.active_links && + !(sdata->vif.active_links & BIT(key->conf.link_id))) + return; + if (!(key->conf.flags & (IEEE80211_KEY_FLAG_GENERATE_MMIC | IEEE80211_KEY_FLAG_PUT_MIC_SPACE | IEEE80211_KEY_FLAG_RESERVE_TAILROOM))) @@ -344,9 +352,10 @@ static void ieee80211_pairwise_rekey(struct ieee80211_key *old, } } -static void __ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata, +static void __ieee80211_set_default_key(struct ieee80211_link_data *link, int idx, bool uni, bool multi) { + struct ieee80211_sub_if_data *sdata = link->sdata; struct ieee80211_key *key = NULL; assert_key_lock(sdata->local); @@ -354,7 +363,7 @@ static void __ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata, if (idx >= 0 && idx < NUM_DEFAULT_KEYS) { key = key_mtx_dereference(sdata->local, sdata->keys[idx]); if (!key) - key = key_mtx_dereference(sdata->local, sdata->deflink.gtk[idx]); + key = key_mtx_dereference(sdata->local, link->gtk[idx]); } if (uni) { @@ -365,47 +374,48 @@ static void __ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata, } if (multi) - rcu_assign_pointer(sdata->deflink.default_multicast_key, key); + rcu_assign_pointer(link->default_multicast_key, key); ieee80211_debugfs_key_update_default(sdata); } -void ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata, int idx, +void ieee80211_set_default_key(struct ieee80211_link_data *link, int idx, bool uni, bool multi) { - mutex_lock(&sdata->local->key_mtx); - __ieee80211_set_default_key(sdata, idx, uni, multi); - mutex_unlock(&sdata->local->key_mtx); + mutex_lock(&link->sdata->local->key_mtx); + __ieee80211_set_default_key(link, idx, uni, multi); + mutex_unlock(&link->sdata->local->key_mtx); } static void -__ieee80211_set_default_mgmt_key(struct ieee80211_sub_if_data *sdata, int idx) +__ieee80211_set_default_mgmt_key(struct ieee80211_link_data *link, int idx) { + struct ieee80211_sub_if_data *sdata = link->sdata; struct ieee80211_key *key = NULL; assert_key_lock(sdata->local); if (idx >= NUM_DEFAULT_KEYS && idx < NUM_DEFAULT_KEYS + NUM_DEFAULT_MGMT_KEYS) - key = key_mtx_dereference(sdata->local, - sdata->deflink.gtk[idx]); + key = key_mtx_dereference(sdata->local, link->gtk[idx]); - rcu_assign_pointer(sdata->deflink.default_mgmt_key, key); + rcu_assign_pointer(link->default_mgmt_key, key); ieee80211_debugfs_key_update_default(sdata); } -void ieee80211_set_default_mgmt_key(struct ieee80211_sub_if_data *sdata, +void ieee80211_set_default_mgmt_key(struct ieee80211_link_data *link, int idx) { - mutex_lock(&sdata->local->key_mtx); - __ieee80211_set_default_mgmt_key(sdata, idx); - mutex_unlock(&sdata->local->key_mtx); + mutex_lock(&link->sdata->local->key_mtx); + __ieee80211_set_default_mgmt_key(link, idx); + mutex_unlock(&link->sdata->local->key_mtx); } static void -__ieee80211_set_default_beacon_key(struct ieee80211_sub_if_data *sdata, int idx) +__ieee80211_set_default_beacon_key(struct ieee80211_link_data *link, int idx) { + struct ieee80211_sub_if_data *sdata = link->sdata; struct ieee80211_key *key = NULL; assert_key_lock(sdata->local); @@ -413,28 +423,30 @@ __ieee80211_set_default_beacon_key(struct ieee80211_sub_if_data *sdata, int idx) if (idx >= NUM_DEFAULT_KEYS + NUM_DEFAULT_MGMT_KEYS && idx < NUM_DEFAULT_KEYS + NUM_DEFAULT_MGMT_KEYS + NUM_DEFAULT_BEACON_KEYS) - key = key_mtx_dereference(sdata->local, - sdata->deflink.gtk[idx]); + key = key_mtx_dereference(sdata->local, link->gtk[idx]); - rcu_assign_pointer(sdata->deflink.default_beacon_key, key); + rcu_assign_pointer(link->default_beacon_key, key); ieee80211_debugfs_key_update_default(sdata); } -void ieee80211_set_default_beacon_key(struct ieee80211_sub_if_data *sdata, +void ieee80211_set_default_beacon_key(struct ieee80211_link_data *link, int idx) { - mutex_lock(&sdata->local->key_mtx); - __ieee80211_set_default_beacon_key(sdata, idx); - mutex_unlock(&sdata->local->key_mtx); + mutex_lock(&link->sdata->local->key_mtx); + __ieee80211_set_default_beacon_key(link, idx); + mutex_unlock(&link->sdata->local->key_mtx); } static int ieee80211_key_replace(struct ieee80211_sub_if_data *sdata, - struct sta_info *sta, - bool pairwise, - struct ieee80211_key *old, - struct ieee80211_key *new) + struct ieee80211_link_data *link, + struct sta_info *sta, + bool pairwise, + struct ieee80211_key *old, + struct ieee80211_key *new) { + struct link_sta_info *link_sta = sta ? &sta->deflink : NULL; + int link_id; int idx; int ret = 0; bool defunikey, defmultikey, defmgmtkey, defbeaconkey; @@ -446,13 +458,36 @@ static int ieee80211_key_replace(struct ieee80211_sub_if_data *sdata, if (new) { idx = new->conf.keyidx; - list_add_tail_rcu(&new->list, &sdata->key_list); is_wep = new->conf.cipher == WLAN_CIPHER_SUITE_WEP40 || new->conf.cipher == WLAN_CIPHER_SUITE_WEP104; + link_id = new->conf.link_id; } else { idx = old->conf.keyidx; is_wep = old->conf.cipher == WLAN_CIPHER_SUITE_WEP40 || old->conf.cipher == WLAN_CIPHER_SUITE_WEP104; + link_id = old->conf.link_id; + } + + if (WARN(old && old->conf.link_id != link_id, + "old link ID %d doesn't match new link ID %d\n", + old->conf.link_id, link_id)) + return -EINVAL; + + if (link_id >= 0) { + if (!link) { + link = sdata_dereference(sdata->link[link_id], sdata); + if (!link) + return -ENOLINK; + } + + if (sta) { + link_sta = rcu_dereference_protected(sta->link[link_id], + lockdep_is_held(&sta->local->sta_mtx)); + if (!link_sta) + return -ENOLINK; + } + } else { + link = &sdata->deflink; } if ((is_wep || pairwise) && idx >= NUM_DEFAULT_KEYS) @@ -482,6 +517,9 @@ static int ieee80211_key_replace(struct ieee80211_sub_if_data *sdata, if (ret) return ret; + if (new) + list_add_tail_rcu(&new->list, &sdata->key_list); + if (sta) { if (pairwise) { rcu_assign_pointer(sta->ptk[idx], new); @@ -489,7 +527,7 @@ static int ieee80211_key_replace(struct ieee80211_sub_if_data *sdata, !(new->conf.flags & IEEE80211_KEY_FLAG_NO_AUTO_TX)) _ieee80211_set_tx_key(new, true); } else { - rcu_assign_pointer(sta->deflink.gtk[idx], new); + rcu_assign_pointer(link_sta->gtk[idx], new); } /* Only needed for transition from no key -> key. * Still triggers unnecessary when using Extended Key ID @@ -503,39 +541,39 @@ static int ieee80211_key_replace(struct ieee80211_sub_if_data *sdata, sdata->default_unicast_key); defmultikey = old && old == key_mtx_dereference(sdata->local, - sdata->deflink.default_multicast_key); + link->default_multicast_key); defmgmtkey = old && old == key_mtx_dereference(sdata->local, - sdata->deflink.default_mgmt_key); + link->default_mgmt_key); defbeaconkey = old && old == key_mtx_dereference(sdata->local, - sdata->deflink.default_beacon_key); + link->default_beacon_key); if (defunikey && !new) - __ieee80211_set_default_key(sdata, -1, true, false); + __ieee80211_set_default_key(link, -1, true, false); if (defmultikey && !new) - __ieee80211_set_default_key(sdata, -1, false, true); + __ieee80211_set_default_key(link, -1, false, true); if (defmgmtkey && !new) - __ieee80211_set_default_mgmt_key(sdata, -1); + __ieee80211_set_default_mgmt_key(link, -1); if (defbeaconkey && !new) - __ieee80211_set_default_beacon_key(sdata, -1); + __ieee80211_set_default_beacon_key(link, -1); if (is_wep || pairwise) rcu_assign_pointer(sdata->keys[idx], new); else - rcu_assign_pointer(sdata->deflink.gtk[idx], new); + rcu_assign_pointer(link->gtk[idx], new); if (defunikey && new) - __ieee80211_set_default_key(sdata, new->conf.keyidx, + __ieee80211_set_default_key(link, new->conf.keyidx, true, false); if (defmultikey && new) - __ieee80211_set_default_key(sdata, new->conf.keyidx, + __ieee80211_set_default_key(link, new->conf.keyidx, false, true); if (defmgmtkey && new) - __ieee80211_set_default_mgmt_key(sdata, + __ieee80211_set_default_mgmt_key(link, new->conf.keyidx); if (defbeaconkey && new) - __ieee80211_set_default_beacon_key(sdata, + __ieee80211_set_default_beacon_key(link, new->conf.keyidx); } @@ -569,6 +607,7 @@ ieee80211_key_alloc(u32 cipher, int idx, size_t key_len, key->conf.flags = 0; key->flags = 0; + key->conf.link_id = -1; key->conf.cipher = cipher; key->conf.keyidx = idx; key->conf.keylen = key_len; @@ -797,9 +836,10 @@ static bool ieee80211_key_identical(struct ieee80211_sub_if_data *sdata, } int ieee80211_key_link(struct ieee80211_key *key, - struct ieee80211_sub_if_data *sdata, + struct ieee80211_link_data *link, struct sta_info *sta) { + struct ieee80211_sub_if_data *sdata = link->sdata; static atomic_t key_color = ATOMIC_INIT(0); struct ieee80211_key *old_key = NULL; int idx = key->conf.keyidx; @@ -827,15 +867,26 @@ int ieee80211_key_link(struct ieee80211_key *key, (old_key && old_key->conf.cipher != key->conf.cipher)) goto out; } else if (sta) { - old_key = key_mtx_dereference(sdata->local, - sta->deflink.gtk[idx]); + struct link_sta_info *link_sta = &sta->deflink; + int link_id = key->conf.link_id; + + if (link_id >= 0) { + link_sta = rcu_dereference_protected(sta->link[link_id], + lockdep_is_held(&sta->local->sta_mtx)); + if (!link_sta) { + ret = -ENOLINK; + goto out; + } + } + + old_key = key_mtx_dereference(sdata->local, link_sta->gtk[idx]); } else { if (idx < NUM_DEFAULT_KEYS) old_key = key_mtx_dereference(sdata->local, sdata->keys[idx]); if (!old_key) old_key = key_mtx_dereference(sdata->local, - sdata->deflink.gtk[idx]); + link->gtk[idx]); } /* Non-pairwise keys must also not switch the cipher on rekey */ @@ -866,7 +917,7 @@ int ieee80211_key_link(struct ieee80211_key *key, increment_tailroom_need_count(sdata); - ret = ieee80211_key_replace(sdata, sta, pairwise, old_key, key); + ret = ieee80211_key_replace(sdata, link, sta, pairwise, old_key, key); if (!ret) { ieee80211_debugfs_key_add(key); @@ -890,9 +941,9 @@ void ieee80211_key_free(struct ieee80211_key *key, bool delay_tailroom) * Replace key with nothingness if it was ever used. */ if (key->sdata) - ieee80211_key_replace(key->sdata, key->sta, - key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, - key, NULL); + ieee80211_key_replace(key->sdata, NULL, key->sta, + key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, + key, NULL); ieee80211_key_destroy(key, delay_tailroom); } @@ -1019,15 +1070,45 @@ static void ieee80211_free_keys_iface(struct ieee80211_sub_if_data *sdata, ieee80211_debugfs_key_remove_beacon_default(sdata); list_for_each_entry_safe(key, tmp, &sdata->key_list, list) { - ieee80211_key_replace(key->sdata, key->sta, - key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, - key, NULL); + ieee80211_key_replace(key->sdata, NULL, key->sta, + key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, + key, NULL); list_add_tail(&key->list, keys); } ieee80211_debugfs_key_update_default(sdata); } +void ieee80211_remove_link_keys(struct ieee80211_link_data *link, + struct list_head *keys) +{ + struct ieee80211_sub_if_data *sdata = link->sdata; + struct ieee80211_local *local = sdata->local; + struct ieee80211_key *key, *tmp; + + mutex_lock(&local->key_mtx); + list_for_each_entry_safe(key, tmp, &sdata->key_list, list) { + if (key->conf.link_id != link->link_id) + continue; + ieee80211_key_replace(key->sdata, link, key->sta, + key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, + key, NULL); + list_add_tail(&key->list, keys); + } + mutex_unlock(&local->key_mtx); +} + +void ieee80211_free_key_list(struct ieee80211_local *local, + struct list_head *keys) +{ + struct ieee80211_key *key, *tmp; + + mutex_lock(&local->key_mtx); + list_for_each_entry_safe(key, tmp, keys, list) + __ieee80211_key_destroy(key, false); + mutex_unlock(&local->key_mtx); +} + void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata, bool force_synchronize) { @@ -1087,9 +1168,9 @@ void ieee80211_free_sta_keys(struct ieee80211_local *local, key = key_mtx_dereference(local, sta->deflink.gtk[i]); if (!key) continue; - ieee80211_key_replace(key->sdata, key->sta, - key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, - key, NULL); + ieee80211_key_replace(key->sdata, NULL, key->sta, + key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, + key, NULL); __ieee80211_key_destroy(key, key->sdata->vif.type == NL80211_IFTYPE_STATION); } @@ -1098,9 +1179,9 @@ void ieee80211_free_sta_keys(struct ieee80211_local *local, key = key_mtx_dereference(local, sta->ptk[i]); if (!key) continue; - ieee80211_key_replace(key->sdata, key->sta, - key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, - key, NULL); + ieee80211_key_replace(key->sdata, NULL, key->sta, + key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE, + key, NULL); __ieee80211_key_destroy(key, key->sdata->vif.type == NL80211_IFTYPE_STATION); } @@ -1307,7 +1388,8 @@ ieee80211_gtk_rekey_add(struct ieee80211_vif *vif, if (sdata->u.mgd.mfp != IEEE80211_MFP_DISABLED) key->conf.flags |= IEEE80211_KEY_FLAG_RX_MGMT; - err = ieee80211_key_link(key, sdata, NULL); + /* FIXME: this function needs to get a link ID */ + err = ieee80211_key_link(key, &sdata->deflink, NULL); if (err) return ERR_PTR(err); @@ -1363,3 +1445,37 @@ void ieee80211_key_replay(struct ieee80211_key_conf *keyconf) } } EXPORT_SYMBOL_GPL(ieee80211_key_replay); + +int ieee80211_key_switch_links(struct ieee80211_sub_if_data *sdata, + unsigned long del_links_mask, + unsigned long add_links_mask) +{ + struct ieee80211_key *key; + int ret; + + list_for_each_entry(key, &sdata->key_list, list) { + if (key->conf.link_id < 0 || + !(del_links_mask & BIT(key->conf.link_id))) + continue; + + /* shouldn't happen for per-link keys */ + WARN_ON(key->sta); + + ieee80211_key_disable_hw_accel(key); + } + + list_for_each_entry(key, &sdata->key_list, list) { + if (key->conf.link_id < 0 || + !(add_links_mask & BIT(key->conf.link_id))) + continue; + + /* shouldn't happen for per-link keys */ + WARN_ON(key->sta); + + ret = ieee80211_key_enable_hw_accel(key); + if (ret) + return ret; + } + + return 0; +} diff --git a/net/mac80211/key.h b/net/mac80211/key.h index e994dcea1ce3..f3df97df4b72 100644 --- a/net/mac80211/key.h +++ b/net/mac80211/key.h @@ -22,6 +22,7 @@ struct ieee80211_local; struct ieee80211_sub_if_data; +struct ieee80211_link_data; struct sta_info; /** @@ -144,22 +145,29 @@ ieee80211_key_alloc(u32 cipher, int idx, size_t key_len, * to make it used, free old key. On failure, also free the new key. */ int ieee80211_key_link(struct ieee80211_key *key, - struct ieee80211_sub_if_data *sdata, + struct ieee80211_link_data *link, struct sta_info *sta); int ieee80211_set_tx_key(struct ieee80211_key *key); void ieee80211_key_free(struct ieee80211_key *key, bool delay_tailroom); void ieee80211_key_free_unused(struct ieee80211_key *key); -void ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata, int idx, +void ieee80211_set_default_key(struct ieee80211_link_data *link, int idx, bool uni, bool multi); -void ieee80211_set_default_mgmt_key(struct ieee80211_sub_if_data *sdata, +void ieee80211_set_default_mgmt_key(struct ieee80211_link_data *link, int idx); -void ieee80211_set_default_beacon_key(struct ieee80211_sub_if_data *sdata, +void ieee80211_set_default_beacon_key(struct ieee80211_link_data *link, int idx); +void ieee80211_remove_link_keys(struct ieee80211_link_data *link, + struct list_head *keys); +void ieee80211_free_key_list(struct ieee80211_local *local, + struct list_head *keys); void ieee80211_free_keys(struct ieee80211_sub_if_data *sdata, bool force_synchronize); void ieee80211_free_sta_keys(struct ieee80211_local *local, struct sta_info *sta); void ieee80211_reenable_keys(struct ieee80211_sub_if_data *sdata); +int ieee80211_key_switch_links(struct ieee80211_sub_if_data *sdata, + unsigned long del_links_mask, + unsigned long add_links_mask); #define key_mtx_dereference(local, ref) \ rcu_dereference_protected(ref, lockdep_is_held(&((local)->key_mtx))) diff --git a/net/mac80211/link.c b/net/mac80211/link.c new file mode 100644 index 000000000000..e309708abae8 --- /dev/null +++ b/net/mac80211/link.c @@ -0,0 +1,473 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * MLO link handling + * + * Copyright (C) 2022 Intel Corporation + */ +#include <linux/slab.h> +#include <linux/kernel.h> +#include <net/mac80211.h> +#include "ieee80211_i.h" +#include "driver-ops.h" +#include "key.h" + +void ieee80211_link_setup(struct ieee80211_link_data *link) +{ + if (link->sdata->vif.type == NL80211_IFTYPE_STATION) + ieee80211_mgd_setup_link(link); +} + +void ieee80211_link_init(struct ieee80211_sub_if_data *sdata, + int link_id, + struct ieee80211_link_data *link, + struct ieee80211_bss_conf *link_conf) +{ + bool deflink = link_id < 0; + + if (link_id < 0) + link_id = 0; + + rcu_assign_pointer(sdata->vif.link_conf[link_id], link_conf); + rcu_assign_pointer(sdata->link[link_id], link); + + link->sdata = sdata; + link->link_id = link_id; + link->conf = link_conf; + link_conf->link_id = link_id; + + INIT_WORK(&link->csa_finalize_work, + ieee80211_csa_finalize_work); + INIT_WORK(&link->color_change_finalize_work, + ieee80211_color_change_finalize_work); + INIT_LIST_HEAD(&link->assigned_chanctx_list); + INIT_LIST_HEAD(&link->reserved_chanctx_list); + INIT_DELAYED_WORK(&link->dfs_cac_timer_work, + ieee80211_dfs_cac_timer_work); + + if (!deflink) { + switch (sdata->vif.type) { + case NL80211_IFTYPE_AP: + ether_addr_copy(link_conf->addr, + sdata->wdev.links[link_id].addr); + link_conf->bssid = link_conf->addr; + WARN_ON(!(sdata->wdev.valid_links & BIT(link_id))); + break; + case NL80211_IFTYPE_STATION: + /* station sets the bssid in ieee80211_mgd_setup_link */ + break; + default: + WARN_ON(1); + } + } +} + +void ieee80211_link_stop(struct ieee80211_link_data *link) +{ + if (link->sdata->vif.type == NL80211_IFTYPE_STATION) + ieee80211_mgd_stop_link(link); + + ieee80211_link_release_channel(link); +} + +struct link_container { + struct ieee80211_link_data data; + struct ieee80211_bss_conf conf; +}; + +static void ieee80211_tear_down_links(struct ieee80211_sub_if_data *sdata, + struct link_container **links, u16 mask) +{ + struct ieee80211_link_data *link; + LIST_HEAD(keys); + unsigned int link_id; + + for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) { + if (!(mask & BIT(link_id))) + continue; + link = &links[link_id]->data; + if (link_id == 0 && !link) + link = &sdata->deflink; + if (WARN_ON(!link)) + continue; + ieee80211_remove_link_keys(link, &keys); + ieee80211_link_stop(link); + } + + synchronize_rcu(); + + ieee80211_free_key_list(sdata->local, &keys); +} + +static void ieee80211_free_links(struct ieee80211_sub_if_data *sdata, + struct link_container **links) +{ + unsigned int link_id; + + for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) + kfree(links[link_id]); +} + +static int ieee80211_check_dup_link_addrs(struct ieee80211_sub_if_data *sdata) +{ + unsigned int i, j; + + for (i = 0; i < IEEE80211_MLD_MAX_NUM_LINKS; i++) { + struct ieee80211_link_data *link1; + + link1 = sdata_dereference(sdata->link[i], sdata); + if (!link1) + continue; + for (j = i + 1; j < IEEE80211_MLD_MAX_NUM_LINKS; j++) { + struct ieee80211_link_data *link2; + + link2 = sdata_dereference(sdata->link[j], sdata); + if (!link2) + continue; + + if (ether_addr_equal(link1->conf->addr, + link2->conf->addr)) + return -EALREADY; + } + } + + return 0; +} + +static void ieee80211_set_vif_links_bitmaps(struct ieee80211_sub_if_data *sdata, + u16 links) +{ + sdata->vif.valid_links = links; + + if (!links) { + sdata->vif.active_links = 0; + return; + } + + switch (sdata->vif.type) { + case NL80211_IFTYPE_AP: + /* in an AP all links are always active */ + sdata->vif.active_links = links; + break; + case NL80211_IFTYPE_STATION: + if (sdata->vif.active_links) + break; + WARN_ON(hweight16(links) > 1); + sdata->vif.active_links = links; + break; + default: + WARN_ON(1); + } +} + +static int ieee80211_vif_update_links(struct ieee80211_sub_if_data *sdata, + struct link_container **to_free, + u16 new_links) +{ + u16 old_links = sdata->vif.valid_links; + u16 old_active = sdata->vif.active_links; + unsigned long add = new_links & ~old_links; + unsigned long rem = old_links & ~new_links; + unsigned int link_id; + int ret; + struct link_container *links[IEEE80211_MLD_MAX_NUM_LINKS] = {}, *link; + struct ieee80211_bss_conf *old[IEEE80211_MLD_MAX_NUM_LINKS]; + struct ieee80211_link_data *old_data[IEEE80211_MLD_MAX_NUM_LINKS]; + bool use_deflink = old_links == 0; /* set for error case */ + + sdata_assert_lock(sdata); + + memset(to_free, 0, sizeof(links)); + + if (old_links == new_links) + return 0; + + /* if there were no old links, need to clear the pointers to deflink */ + if (!old_links) + rem |= BIT(0); + + /* allocate new link structures first */ + for_each_set_bit(link_id, &add, IEEE80211_MLD_MAX_NUM_LINKS) { + link = kzalloc(sizeof(*link), GFP_KERNEL); + if (!link) { + ret = -ENOMEM; + goto free; + } + links[link_id] = link; + } + + /* keep track of the old pointers for the driver */ + BUILD_BUG_ON(sizeof(old) != sizeof(sdata->vif.link_conf)); + memcpy(old, sdata->vif.link_conf, sizeof(old)); + /* and for us in error cases */ + BUILD_BUG_ON(sizeof(old_data) != sizeof(sdata->link)); + memcpy(old_data, sdata->link, sizeof(old_data)); + + /* grab old links to free later */ + for_each_set_bit(link_id, &rem, IEEE80211_MLD_MAX_NUM_LINKS) { + if (rcu_access_pointer(sdata->link[link_id]) != &sdata->deflink) { + /* + * we must have allocated the data through this path so + * we know we can free both at the same time + */ + to_free[link_id] = container_of(rcu_access_pointer(sdata->link[link_id]), + typeof(*links[link_id]), + data); + } + + RCU_INIT_POINTER(sdata->link[link_id], NULL); + RCU_INIT_POINTER(sdata->vif.link_conf[link_id], NULL); + } + + /* link them into data structures */ + for_each_set_bit(link_id, &add, IEEE80211_MLD_MAX_NUM_LINKS) { + WARN_ON(!use_deflink && + rcu_access_pointer(sdata->link[link_id]) == &sdata->deflink); + + link = links[link_id]; + ieee80211_link_init(sdata, link_id, &link->data, &link->conf); + ieee80211_link_setup(&link->data); + } + + if (new_links == 0) + ieee80211_link_init(sdata, -1, &sdata->deflink, + &sdata->vif.bss_conf); + + ret = ieee80211_check_dup_link_addrs(sdata); + if (!ret) { + /* for keys we will not be able to undo this */ + ieee80211_tear_down_links(sdata, to_free, rem); + + ieee80211_set_vif_links_bitmaps(sdata, new_links); + + /* tell the driver */ + ret = drv_change_vif_links(sdata->local, sdata, + old_links & old_active, + new_links & sdata->vif.active_links, + old); + } + + if (ret) { + /* restore config */ + memcpy(sdata->link, old_data, sizeof(old_data)); + memcpy(sdata->vif.link_conf, old, sizeof(old)); + ieee80211_set_vif_links_bitmaps(sdata, old_links); + /* and free (only) the newly allocated links */ + memset(to_free, 0, sizeof(links)); + goto free; + } + + /* use deflink/bss_conf again if and only if there are no more links */ + use_deflink = new_links == 0; + + goto deinit; +free: + /* if we failed during allocation, only free all */ + for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) { + kfree(links[link_id]); + links[link_id] = NULL; + } +deinit: + if (use_deflink) + ieee80211_link_init(sdata, -1, &sdata->deflink, + &sdata->vif.bss_conf); + return ret; +} + +int ieee80211_vif_set_links(struct ieee80211_sub_if_data *sdata, + u16 new_links) +{ + struct link_container *links[IEEE80211_MLD_MAX_NUM_LINKS]; + int ret; + + ret = ieee80211_vif_update_links(sdata, links, new_links); + ieee80211_free_links(sdata, links); + + return ret; +} + +void ieee80211_vif_clear_links(struct ieee80211_sub_if_data *sdata) +{ + struct link_container *links[IEEE80211_MLD_MAX_NUM_LINKS]; + + /* + * The locking here is different because when we free links + * in the station case we need to be able to cancel_work_sync() + * something that also takes the lock. + */ + + sdata_lock(sdata); + ieee80211_vif_update_links(sdata, links, 0); + sdata_unlock(sdata); + + ieee80211_free_links(sdata, links); +} + +static int _ieee80211_set_active_links(struct ieee80211_sub_if_data *sdata, + u16 active_links) +{ + struct ieee80211_bss_conf *link_confs[IEEE80211_MLD_MAX_NUM_LINKS]; + struct ieee80211_local *local = sdata->local; + u16 old_active = sdata->vif.active_links; + unsigned long rem = old_active & ~active_links; + unsigned long add = active_links & ~old_active; + struct sta_info *sta; + unsigned int link_id; + int ret, i; + + if (!ieee80211_sdata_running(sdata)) + return -ENETDOWN; + + if (sdata->vif.type != NL80211_IFTYPE_STATION) + return -EINVAL; + + /* cannot activate links that don't exist */ + if (active_links & ~sdata->vif.valid_links) + return -EINVAL; + + /* nothing to do */ + if (old_active == active_links) + return 0; + + for (i = 0; i < IEEE80211_MLD_MAX_NUM_LINKS; i++) + link_confs[i] = sdata_dereference(sdata->vif.link_conf[i], + sdata); + + if (add) { + sdata->vif.active_links |= active_links; + ret = drv_change_vif_links(local, sdata, + old_active, + sdata->vif.active_links, + link_confs); + if (ret) { + sdata->vif.active_links = old_active; + return ret; + } + } + + for_each_set_bit(link_id, &rem, IEEE80211_MLD_MAX_NUM_LINKS) { + struct ieee80211_link_data *link; + + link = sdata_dereference(sdata->link[link_id], sdata); + + /* FIXME: kill TDLS connections on the link */ + + ieee80211_link_release_channel(link); + } + + list_for_each_entry(sta, &local->sta_list, list) { + if (sdata != sta->sdata) + continue; + ret = drv_change_sta_links(local, sdata, &sta->sta, + old_active, + old_active | active_links); + WARN_ON_ONCE(ret); + } + + ret = ieee80211_key_switch_links(sdata, rem, add); + WARN_ON_ONCE(ret); + + list_for_each_entry(sta, &local->sta_list, list) { + if (sdata != sta->sdata) + continue; + ret = drv_change_sta_links(local, sdata, &sta->sta, + old_active | active_links, + active_links); + WARN_ON_ONCE(ret); + } + + for_each_set_bit(link_id, &add, IEEE80211_MLD_MAX_NUM_LINKS) { + struct ieee80211_link_data *link; + + link = sdata_dereference(sdata->link[link_id], sdata); + + ret = ieee80211_link_use_channel(link, &link->conf->chandef, + IEEE80211_CHANCTX_SHARED); + WARN_ON_ONCE(ret); + + ieee80211_link_info_change_notify(sdata, link, + BSS_CHANGED_ERP_CTS_PROT | + BSS_CHANGED_ERP_PREAMBLE | + BSS_CHANGED_ERP_SLOT | + BSS_CHANGED_HT | + BSS_CHANGED_BASIC_RATES | + BSS_CHANGED_BSSID | + BSS_CHANGED_CQM | + BSS_CHANGED_QOS | + BSS_CHANGED_TXPOWER | + BSS_CHANGED_BANDWIDTH | + BSS_CHANGED_TWT | + BSS_CHANGED_HE_OBSS_PD | + BSS_CHANGED_HE_BSS_COLOR); + ieee80211_mgd_set_link_qos_params(link); + } + + old_active = sdata->vif.active_links; + sdata->vif.active_links = active_links; + + if (rem) { + ret = drv_change_vif_links(local, sdata, old_active, + active_links, link_confs); + WARN_ON_ONCE(ret); + } + + return 0; +} + +int ieee80211_set_active_links(struct ieee80211_vif *vif, u16 active_links) +{ + struct ieee80211_sub_if_data *sdata = vif_to_sdata(vif); + struct ieee80211_local *local = sdata->local; + u16 old_active; + int ret; + + sdata_lock(sdata); + mutex_lock(&local->sta_mtx); + mutex_lock(&local->mtx); + mutex_lock(&local->key_mtx); + old_active = sdata->vif.active_links; + if (old_active & active_links) { + /* + * if there's at least one link that stays active across + * the change then switch to it (to those) first, and + * then enable the additional links + */ + ret = _ieee80211_set_active_links(sdata, + old_active & active_links); + if (!ret) + ret = _ieee80211_set_active_links(sdata, active_links); + } else { + /* otherwise switch directly */ + ret = _ieee80211_set_active_links(sdata, active_links); + } + mutex_unlock(&local->key_mtx); + mutex_unlock(&local->mtx); + mutex_unlock(&local->sta_mtx); + sdata_unlock(sdata); + + return ret; +} +EXPORT_SYMBOL_GPL(ieee80211_set_active_links); + +void ieee80211_set_active_links_async(struct ieee80211_vif *vif, + u16 active_links) +{ + struct ieee80211_sub_if_data *sdata = vif_to_sdata(vif); + + if (!ieee80211_sdata_running(sdata)) + return; + + if (sdata->vif.type != NL80211_IFTYPE_STATION) + return; + + /* cannot activate links that don't exist */ + if (active_links & ~sdata->vif.valid_links) + return; + + /* nothing to do */ + if (sdata->vif.active_links == active_links) + return; + + sdata->desired_active_links = active_links; + schedule_work(&sdata->activate_links_work); +} +EXPORT_SYMBOL_GPL(ieee80211_set_active_links_async); diff --git a/net/mac80211/main.c b/net/mac80211/main.c index 5b1c47ed0cc0..46f3eddc2388 100644 --- a/net/mac80211/main.c +++ b/net/mac80211/main.c @@ -699,6 +699,8 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len, NL80211_EXT_FEATURE_CONTROL_PORT_OVER_NL80211_TX_STATUS); wiphy_ext_feature_set(wiphy, NL80211_EXT_FEATURE_SCAN_FREQ_KHZ); + wiphy_ext_feature_set(wiphy, + NL80211_EXT_FEATURE_POWERED_ADDR_CHANGE); if (!ops->hw_scan) { wiphy->features |= NL80211_FEATURE_LOW_PRIORITY_SCAN | diff --git a/net/mac80211/mesh.c b/net/mac80211/mesh.c index 6991c4c479da..5a99b8f6e465 100644 --- a/net/mac80211/mesh.c +++ b/net/mac80211/mesh.c @@ -634,7 +634,7 @@ int mesh_add_he_6ghz_cap_ie(struct ieee80211_sub_if_data *sdata, if (!iftd) return 0; - ieee80211_ie_build_he_6ghz_cap(sdata, skb); + ieee80211_ie_build_he_6ghz_cap(sdata, sdata->deflink.smps_mode, skb); return 0; } diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index fc764984d687..54b8d5065bbd 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -314,7 +314,7 @@ ieee80211_determine_chantype(struct ieee80211_sub_if_data *sdata, if (eht_oper && (eht_oper->params & IEEE80211_EHT_OPER_INFO_PRESENT)) { struct cfg80211_chan_def eht_chandef = *chandef; - ieee80211_chandef_eht_oper(sdata, eht_oper, + ieee80211_chandef_eht_oper(eht_oper, eht_chandef.width == NL80211_CHAN_WIDTH_160, false, &eht_chandef); @@ -695,6 +695,7 @@ static bool ieee80211_add_vht_ie(struct ieee80211_sub_if_data *sdata, static void ieee80211_add_he_ie(struct ieee80211_sub_if_data *sdata, struct sk_buff *skb, struct ieee80211_supported_band *sband, + enum ieee80211_smps_mode smps_mode, ieee80211_conn_flags_t conn_flags) { u8 *pos, *pre_he_pos; @@ -719,7 +720,7 @@ static void ieee80211_add_he_ie(struct ieee80211_sub_if_data *sdata, /* trim excess if any */ skb_trim(skb, skb->len - (pre_he_pos + he_cap_size - pos)); - ieee80211_ie_build_he_6ghz_cap(sdata, skb); + ieee80211_ie_build_he_6ghz_cap(sdata, smps_mode, skb); } static void ieee80211_add_eht_ie(struct ieee80211_sub_if_data *sdata, @@ -746,11 +747,13 @@ static void ieee80211_add_eht_ie(struct ieee80211_sub_if_data *sdata, eht_cap_size = 2 + 1 + sizeof(eht_cap->eht_cap_elem) + ieee80211_eht_mcs_nss_size(&he_cap->he_cap_elem, - &eht_cap->eht_cap_elem) + + &eht_cap->eht_cap_elem, + false) + ieee80211_eht_ppe_size(eht_cap->eht_ppe_thres[0], eht_cap->eht_cap_elem.phy_cap_info); pos = skb_put(skb, eht_cap_size); - ieee80211_ie_build_eht_cap(pos, he_cap, eht_cap, pos + eht_cap_size); + ieee80211_ie_build_eht_cap(pos, he_cap, eht_cap, pos + eht_cap_size, + false); } static void ieee80211_assoc_add_rates(struct sk_buff *skb, @@ -1098,7 +1101,7 @@ static size_t ieee80211_assoc_link_elems(struct ieee80211_sub_if_data *sdata, offset); if (!(assoc_data->link[link_id].conn_flags & IEEE80211_CONN_DISABLE_HE)) { - ieee80211_add_he_ie(sdata, skb, sband, + ieee80211_add_he_ie(sdata, skb, sband, smps_mode, assoc_data->link[link_id].conn_flags); ADD_PRESENT_EXT_ELEM(WLAN_EID_EXT_HE_CAPABILITY); } @@ -1220,14 +1223,21 @@ static void ieee80211_assoc_add_ml_elem(struct ieee80211_sub_if_data *sdata, ml_elem = skb_put(skb, sizeof(*ml_elem)); ml_elem->control = cpu_to_le16(IEEE80211_ML_CONTROL_TYPE_BASIC | - IEEE80211_MLC_BASIC_PRES_EML_CAPA | IEEE80211_MLC_BASIC_PRES_MLD_CAPA_OP); common = skb_put(skb, sizeof(*common)); common->len = sizeof(*common) + - 2 + /* EML capabilities */ 2; /* MLD capa/ops */ memcpy(common->mld_mac_addr, sdata->vif.addr, ETH_ALEN); - skb_put_data(skb, &eml_capa, sizeof(eml_capa)); + + /* add EML_CAPA only if needed, see Draft P802.11be_D2.1, 35.3.17 */ + if (eml_capa & + cpu_to_le16((IEEE80211_EML_CAP_EMLSR_SUPP | + IEEE80211_EML_CAP_EMLMR_SUPPORT))) { + common->len += 2; /* EML capabilities */ + ml_elem->control |= + cpu_to_le16(IEEE80211_MLC_BASIC_PRES_EML_CAPA); + skb_put_data(skb, &eml_capa, sizeof(eml_capa)); + } /* need indication from userspace to support this */ mld_capa_ops &= ~cpu_to_le16(IEEE80211_MLD_CAP_OP_TID_TO_LINK_MAP_NEG_SUPP); skb_put_data(skb, &mld_capa_ops, sizeof(mld_capa_ops)); @@ -1536,8 +1546,9 @@ void ieee80211_send_nullfunc(struct ieee80211_local *local, struct ieee80211_hdr_3addr *nullfunc; struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; - skb = ieee80211_nullfunc_get(&local->hw, &sdata->vif, - !ieee80211_hw_check(&local->hw, DOESNT_SUPPORT_QOS_NDP)); + skb = ieee80211_nullfunc_get(&local->hw, &sdata->vif, -1, + !ieee80211_hw_check(&local->hw, + DOESNT_SUPPORT_QOS_NDP)); if (!skb) return; @@ -1902,7 +1913,7 @@ ieee80211_sta_process_chanswitch(struct ieee80211_link_data *link, IEEE80211_QUEUE_STOP_REASON_CSA); mutex_unlock(&local->mtx); - cfg80211_ch_switch_started_notify(sdata->dev, &csa_ie.chandef, + cfg80211_ch_switch_started_notify(sdata->dev, &csa_ie.chandef, 0, csa_ie.count, csa_ie.mode); if (local->ops->channel_switch) { @@ -2435,6 +2446,29 @@ static void ieee80211_sta_handle_tspec_ac_params_wk(struct work_struct *work) ieee80211_sta_handle_tspec_ac_params(sdata); } +void ieee80211_mgd_set_link_qos_params(struct ieee80211_link_data *link) +{ + struct ieee80211_sub_if_data *sdata = link->sdata; + struct ieee80211_local *local = sdata->local; + struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; + struct ieee80211_tx_queue_params *params = link->tx_conf; + u8 ac; + + for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) { + mlme_dbg(sdata, + "WMM AC=%d acm=%d aifs=%d cWmin=%d cWmax=%d txop=%d uapsd=%d, downgraded=%d\n", + ac, params[ac].acm, + params[ac].aifs, params[ac].cw_min, params[ac].cw_max, + params[ac].txop, params[ac].uapsd, + ifmgd->tx_tspec[ac].downgraded); + if (!ifmgd->tx_tspec[ac].downgraded && + drv_conf_tx(local, link, ac, ¶ms[ac])) + link_err(link, + "failed to set TX queue parameters for AC %d\n", + ac); + } +} + /* MLME */ static bool ieee80211_sta_wmm_params(struct ieee80211_local *local, @@ -2566,20 +2600,10 @@ ieee80211_sta_wmm_params(struct ieee80211_local *local, } } - for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) { - mlme_dbg(sdata, - "WMM AC=%d acm=%d aifs=%d cWmin=%d cWmax=%d txop=%d uapsd=%d, downgraded=%d\n", - ac, params[ac].acm, - params[ac].aifs, params[ac].cw_min, params[ac].cw_max, - params[ac].txop, params[ac].uapsd, - ifmgd->tx_tspec[ac].downgraded); + for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) link->tx_conf[ac] = params[ac]; - if (!ifmgd->tx_tspec[ac].downgraded && - drv_conf_tx(local, link, ac, ¶ms[ac])) - link_err(link, - "failed to set TX queue parameters for AC %d\n", - ac); - } + + ieee80211_mgd_set_link_qos_params(link); /* enable WMM or activate new settings */ link->conf->qos = true; @@ -3904,6 +3928,7 @@ static bool ieee80211_assoc_config_link(struct ieee80211_link_data *link, .len = elem_len, .bss = cbss, .link_id = link == &sdata->deflink ? -1 : link->link_id, + .from_ap = true, }; bool is_6ghz = cbss->channel->band == NL80211_BAND_6GHZ; bool is_s1g = cbss->channel->band == NL80211_BAND_S1GHZ; @@ -4032,11 +4057,11 @@ static bool ieee80211_assoc_config_link(struct ieee80211_link_data *link, goto out; } - sband = ieee80211_get_link_sband(link); - if (!sband) { + if (WARN_ON(!link->conf->chandef.chan)) { ret = false; goto out; } + sband = local->hw.wiphy->bands[link->conf->chandef.chan->band]; if (!(link->u.mgd.conn_flags & IEEE80211_CONN_DISABLE_HE) && (!elems->he_cap || !elems->he_operation)) { @@ -4571,6 +4596,11 @@ static int ieee80211_prep_channel(struct ieee80211_sub_if_data *sdata, bool is_6ghz = cbss->channel->band == NL80211_BAND_6GHZ; bool is_5ghz = cbss->channel->band == NL80211_BAND_5GHZ; struct ieee80211_bss *bss = (void *)cbss->priv; + struct ieee80211_elems_parse_params parse_params = { + .bss = cbss, + .link_id = -1, + .from_ap = true, + }; struct ieee802_11_elems *elems; const struct cfg80211_bss_ies *ies; int ret; @@ -4580,7 +4610,9 @@ static int ieee80211_prep_channel(struct ieee80211_sub_if_data *sdata, rcu_read_lock(); ies = rcu_dereference(cbss->ies); - elems = ieee802_11_parse_elems(ies->data, ies->len, false, cbss); + parse_params.start = ies->data; + parse_params.len = ies->len; + elems = ieee802_11_parse_elems_full(&parse_params); if (!elems) { rcu_read_unlock(); return -ENOMEM; @@ -4786,6 +4818,40 @@ static int ieee80211_prep_channel(struct ieee80211_sub_if_data *sdata, return ret; } +static bool ieee80211_get_dtim(const struct cfg80211_bss_ies *ies, + u8 *dtim_count, u8 *dtim_period) +{ + const u8 *tim_ie = cfg80211_find_ie(WLAN_EID_TIM, ies->data, ies->len); + const u8 *idx_ie = cfg80211_find_ie(WLAN_EID_MULTI_BSSID_IDX, ies->data, + ies->len); + const struct ieee80211_tim_ie *tim = NULL; + const struct ieee80211_bssid_index *idx; + bool valid = tim_ie && tim_ie[1] >= 2; + + if (valid) + tim = (void *)(tim_ie + 2); + + if (dtim_count) + *dtim_count = valid ? tim->dtim_count : 0; + + if (dtim_period) + *dtim_period = valid ? tim->dtim_period : 0; + + /* Check if value is overridden by non-transmitted profile */ + if (!idx_ie || idx_ie[1] < 3) + return valid; + + idx = (void *)(idx_ie + 2); + + if (dtim_count) + *dtim_count = idx->dtim_count; + + if (dtim_period) + *dtim_period = idx->dtim_period; + + return true; +} + static bool ieee80211_assoc_success(struct ieee80211_sub_if_data *sdata, struct ieee80211_mgmt *mgmt, struct ieee802_11_elems *elems, @@ -4849,11 +4915,24 @@ static bool ieee80211_assoc_success(struct ieee80211_sub_if_data *sdata, goto out_err; if (link_id != assoc_data->assoc_link_id) { - err = ieee80211_prep_channel(sdata, link, - assoc_data->link[link_id].bss, + struct cfg80211_bss *cbss = assoc_data->link[link_id].bss; + const struct cfg80211_bss_ies *ies; + + rcu_read_lock(); + ies = rcu_dereference(cbss->ies); + ieee80211_get_dtim(ies, + &link->conf->sync_dtim_count, + &link->u.mgd.dtim_period); + link->conf->dtim_period = link->u.mgd.dtim_period ?: 1; + link->conf->beacon_int = cbss->beacon_interval; + rcu_read_unlock(); + + err = ieee80211_prep_channel(sdata, link, cbss, &link->u.mgd.conn_flags); - if (err) + if (err) { + link_info(link, "prep_channel failed\n"); goto out_err; + } } err = ieee80211_mgd_setup_link_sta(link, sta, link_sta, @@ -4935,6 +5014,11 @@ static void ieee80211_rx_mgmt_assoc_resp(struct ieee80211_sub_if_data *sdata, struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; struct ieee80211_mgd_assoc_data *assoc_data = ifmgd->assoc_data; u16 capab_info, status_code, aid; + struct ieee80211_elems_parse_params parse_params = { + .bss = NULL, + .link_id = -1, + .from_ap = true, + }; struct ieee802_11_elems *elems; int ac; const u8 *elem_start; @@ -4989,7 +5073,9 @@ static void ieee80211_rx_mgmt_assoc_resp(struct ieee80211_sub_if_data *sdata, return; elem_len = len - (elem_start - (u8 *)mgmt); - elems = ieee802_11_parse_elems(elem_start, elem_len, false, NULL); + parse_params.start = elem_start; + parse_params.len = elem_len; + elems = ieee802_11_parse_elems_full(&parse_params); if (!elems) goto notify_driver; @@ -5122,7 +5208,7 @@ static void ieee80211_rx_mgmt_assoc_resp(struct ieee80211_sub_if_data *sdata, resp.req_ies = ifmgd->assoc_req_ies; resp.req_ies_len = ifmgd->assoc_req_ies_len; if (sdata->vif.valid_links) - resp.ap_mld_addr = assoc_data->ap_addr; + resp.ap_mld_addr = sdata->vif.cfg.ap_addr; cfg80211_rx_assoc_resp(sdata->dev, &resp); notify_driver: drv_mgd_complete_tx(sdata->local, sdata, &info); @@ -5354,6 +5440,10 @@ static void ieee80211_rx_mgmt_beacon(struct ieee80211_link_data *link, u32 ncrc = 0; u8 *bssid, *variable = mgmt->u.beacon.variable; u8 deauth_buf[IEEE80211_DEAUTH_FRAME_LEN]; + struct ieee80211_elems_parse_params parse_params = { + .link_id = -1, + .from_ap = true, + }; sdata_assert_lock(sdata); @@ -5372,6 +5462,9 @@ static void ieee80211_rx_mgmt_beacon(struct ieee80211_link_data *link, if (baselen > len) return; + parse_params.start = variable; + parse_params.len = len - baselen; + rcu_read_lock(); chanctx_conf = rcu_dereference(link->conf->chanctx_conf); if (!chanctx_conf) { @@ -5390,8 +5483,8 @@ static void ieee80211_rx_mgmt_beacon(struct ieee80211_link_data *link, if (ifmgd->assoc_data && ifmgd->assoc_data->need_beacon && !WARN_ON(sdata->vif.valid_links) && ieee80211_rx_our_beacon(bssid, ifmgd->assoc_data->link[0].bss)) { - elems = ieee802_11_parse_elems(variable, len - baselen, false, - ifmgd->assoc_data->link[0].bss); + parse_params.bss = ifmgd->assoc_data->link[0].bss; + elems = ieee802_11_parse_elems_full(&parse_params); if (!elems) return; @@ -5457,9 +5550,10 @@ static void ieee80211_rx_mgmt_beacon(struct ieee80211_link_data *link, */ if (!ieee80211_is_s1g_beacon(hdr->frame_control)) ncrc = crc32_be(0, (void *)&mgmt->u.beacon.beacon_int, 4); - elems = ieee802_11_parse_elems_crc(variable, len - baselen, - false, care_about_ies, ncrc, - link->u.mgd.bss); + parse_params.bss = link->u.mgd.bss; + parse_params.filter = care_about_ies; + parse_params.crc = ncrc; + elems = ieee802_11_parse_elems_full(&parse_params); if (!elems) return; ncrc = elems->crc; @@ -5673,6 +5767,13 @@ void ieee80211_sta_rx_queued_mgmt(struct ieee80211_sub_if_data *sdata, sdata_lock(sdata); + if (rx_status->link_valid) { + link = sdata_dereference(sdata->link[rx_status->link_id], + sdata); + if (!link) + goto out; + } + switch (fc & IEEE80211_FCTL_STYPE) { case IEEE80211_STYPE_BEACON: ieee80211_rx_mgmt_beacon(link, (void *)mgmt, @@ -5749,6 +5850,7 @@ void ieee80211_sta_rx_queued_mgmt(struct ieee80211_sub_if_data *sdata, } break; } +out: sdata_unlock(sdata); } @@ -6284,6 +6386,8 @@ void ieee80211_mgd_setup_link(struct ieee80211_link_data *link) if (sdata->u.mgd.assoc_data) ether_addr_copy(link->conf->addr, sdata->u.mgd.assoc_data->link[link_id].addr); + else if (!is_valid_ether_addr(link->conf->addr)) + eth_random_addr(link->conf->addr); } /* scan finished notification */ @@ -6300,40 +6404,6 @@ void ieee80211_mlme_notify_scan_completed(struct ieee80211_local *local) rcu_read_unlock(); } -static bool ieee80211_get_dtim(const struct cfg80211_bss_ies *ies, - u8 *dtim_count, u8 *dtim_period) -{ - const u8 *tim_ie = cfg80211_find_ie(WLAN_EID_TIM, ies->data, ies->len); - const u8 *idx_ie = cfg80211_find_ie(WLAN_EID_MULTI_BSSID_IDX, ies->data, - ies->len); - const struct ieee80211_tim_ie *tim = NULL; - const struct ieee80211_bssid_index *idx; - bool valid = tim_ie && tim_ie[1] >= 2; - - if (valid) - tim = (void *)(tim_ie + 2); - - if (dtim_count) - *dtim_count = valid ? tim->dtim_count : 0; - - if (dtim_period) - *dtim_period = valid ? tim->dtim_period : 0; - - /* Check if value is overridden by non-transmitted profile */ - if (!idx_ie || idx_ie[1] < 3) - return valid; - - idx = (void *)(idx_ie + 2); - - if (dtim_count) - *dtim_count = idx->dtim_count; - - if (dtim_period) - *dtim_period = idx->dtim_period; - - return true; -} - static int ieee80211_prep_connection(struct ieee80211_sub_if_data *sdata, struct cfg80211_bss *cbss, s8 link_id, const u8 *ap_mld_addr, bool assoc, @@ -6371,9 +6441,6 @@ static int ieee80211_prep_connection(struct ieee80211_sub_if_data *sdata, goto out_err; } - if (mlo && !is_valid_ether_addr(link->conf->addr)) - eth_random_addr(link->conf->addr); - if (WARN_ON(!ifmgd->auth_data && !ifmgd->assoc_data)) { err = -EINVAL; goto out_err; @@ -6839,22 +6906,9 @@ int ieee80211_mgd_assoc(struct ieee80211_sub_if_data *sdata, for (i = 0; i < IEEE80211_MLD_MAX_NUM_LINKS; i++) size += req->links[i].elems_len; - if (req->ap_mld_addr) { - for (i = 0; i < IEEE80211_MLD_MAX_NUM_LINKS; i++) { - if (!req->links[i].bss) - continue; - if (i == assoc_link_id) - continue; - /* - * For now, support only a single link in MLO, we - * don't have the necessary parsing of the multi- - * link element in the association response, etc. - */ - sdata_info(sdata, - "refusing MLO association with >1 links\n"); - return -EINVAL; - } - } + /* FIXME: no support for 4-addr MLO yet */ + if (sdata->u.mgd.use_4addr && req->link_id >= 0) + return -EOPNOTSUPP; assoc_data = kzalloc(size, GFP_KERNEL); if (!assoc_data) diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c index 788a82f9c74d..7f3f5f51081d 100644 --- a/net/mac80211/rc80211_minstrel_ht.c +++ b/net/mac80211/rc80211_minstrel_ht.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (C) 2010-2013 Felix Fietkau <nbd@openwrt.org> - * Copyright (C) 2019-2021 Intel Corporation + * Copyright (C) 2019-2022 Intel Corporation */ #include <linux/netdevice.h> #include <linux/types.h> @@ -1479,7 +1479,7 @@ minstrel_ht_set_rate(struct minstrel_priv *mp, struct minstrel_ht_sta *mi, * - for fallback rates, to increase chances of getting through */ if (offset > 0 || - (mi->sta->smps_mode == IEEE80211_SMPS_DYNAMIC && + (mi->sta->deflink.smps_mode == IEEE80211_SMPS_DYNAMIC && group->streams > 1)) { ratetbl->rate[offset].count = ratetbl->rate[offset].count_rts; flags |= IEEE80211_TX_RC_USE_RTS_CTS; @@ -1570,7 +1570,8 @@ minstrel_ht_update_rates(struct minstrel_priv *mp, struct minstrel_ht_sta *mi) if (i < IEEE80211_TX_RATE_TABLE_SIZE) rates->rate[i].idx = -1; - mi->sta->max_rc_amsdu_len = minstrel_ht_get_max_amsdu_len(mi); + mi->sta->deflink.agg.max_rc_amsdu_len = minstrel_ht_get_max_amsdu_len(mi); + ieee80211_sta_recalc_aggregates(mi->sta); rate_control_set_rates(mp->hw, mi->sta, rates); } @@ -1781,7 +1782,7 @@ minstrel_ht_update_caps(void *priv, struct ieee80211_supported_band *sband, nss = minstrel_mcs_groups[i].streams; /* Mark MCS > 7 as unsupported if STA is in static SMPS mode */ - if (sta->smps_mode == IEEE80211_SMPS_STATIC && nss > 1) + if (sta->deflink.smps_mode == IEEE80211_SMPS_STATIC && nss > 1) continue; /* HT rate */ diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index 45d7e71661e3..bd215fe3c796 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -49,7 +49,7 @@ static struct sk_buff *ieee80211_clean_skb(struct sk_buff *skb, if (present_fcs_len) __pskb_trim(skb, skb->len - present_fcs_len); - __pskb_pull(skb, rtap_space); + pskb_pull(skb, rtap_space); hdr = (void *)skb->data; fc = hdr->frame_control; @@ -74,7 +74,7 @@ static struct sk_buff *ieee80211_clean_skb(struct sk_buff *skb, memmove(skb->data + IEEE80211_HT_CTL_LEN, skb->data, hdrlen - IEEE80211_HT_CTL_LEN); - __pskb_pull(skb, IEEE80211_HT_CTL_LEN); + pskb_pull(skb, IEEE80211_HT_CTL_LEN); return skb; } @@ -215,9 +215,19 @@ ieee80211_rx_radiotap_hdrlen(struct ieee80211_local *local, } static void __ieee80211_queue_skb_to_iface(struct ieee80211_sub_if_data *sdata, + int link_id, struct sta_info *sta, struct sk_buff *skb) { + struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); + + if (link_id >= 0) { + status->link_valid = 1; + status->link_id = link_id; + } else { + status->link_valid = 0; + } + skb_queue_tail(&sdata->skb_queue, skb); ieee80211_queue_work(&sdata->local->hw, &sdata->work); if (sta) @@ -225,11 +235,12 @@ static void __ieee80211_queue_skb_to_iface(struct ieee80211_sub_if_data *sdata, } static void ieee80211_queue_skb_to_iface(struct ieee80211_sub_if_data *sdata, + int link_id, struct sta_info *sta, struct sk_buff *skb) { skb->protocol = 0; - __ieee80211_queue_skb_to_iface(sdata, sta, skb); + __ieee80211_queue_skb_to_iface(sdata, link_id, sta, skb); } static void ieee80211_handle_mu_mimo_mon(struct ieee80211_sub_if_data *sdata, @@ -272,7 +283,7 @@ static void ieee80211_handle_mu_mimo_mon(struct ieee80211_sub_if_data *sdata, if (!skb) return; - ieee80211_queue_skb_to_iface(sdata, NULL, skb); + ieee80211_queue_skb_to_iface(sdata, -1, NULL, skb); } /* @@ -1394,7 +1405,7 @@ static void ieee80211_rx_reorder_ampdu(struct ieee80211_rx_data *rx, /* if this mpdu is fragmented - terminate rx aggregation session */ sc = le16_to_cpu(hdr->seq_ctrl); if (sc & IEEE80211_SCTL_FRAG) { - ieee80211_queue_skb_to_iface(rx->sdata, NULL, skb); + ieee80211_queue_skb_to_iface(rx->sdata, rx->link_id, NULL, skb); return; } @@ -1441,7 +1452,7 @@ ieee80211_rx_h_check_dup(struct ieee80211_rx_data *rx) if (unlikely(ieee80211_has_retry(hdr->frame_control) && rx->sta->last_seq_ctrl[rx->seqno_idx] == hdr->seq_ctrl)) { I802_DEBUG_INC(rx->local->dot11FrameDuplicateCount); - rx->sta->deflink.rx_stats.num_duplicates++; + rx->link_sta->rx_stats.num_duplicates++; return RX_DROP_UNUSABLE; } else if (!(status->flag & RX_FLAG_AMSDU_MORE)) { rx->sta->last_seq_ctrl[rx->seqno_idx] = hdr->seq_ctrl; @@ -1720,12 +1731,13 @@ static ieee80211_rx_result debug_noinline ieee80211_rx_h_sta_process(struct ieee80211_rx_data *rx) { struct sta_info *sta = rx->sta; + struct link_sta_info *link_sta = rx->link_sta; struct sk_buff *skb = rx->skb; struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; int i; - if (!sta) + if (!sta || !link_sta) return RX_CONTINUE; /* @@ -1741,47 +1753,47 @@ ieee80211_rx_h_sta_process(struct ieee80211_rx_data *rx) NL80211_IFTYPE_ADHOC); if (ether_addr_equal(bssid, rx->sdata->u.ibss.bssid) && test_sta_flag(sta, WLAN_STA_AUTHORIZED)) { - sta->deflink.rx_stats.last_rx = jiffies; + link_sta->rx_stats.last_rx = jiffies; if (ieee80211_is_data(hdr->frame_control) && !is_multicast_ether_addr(hdr->addr1)) - sta->deflink.rx_stats.last_rate = + link_sta->rx_stats.last_rate = sta_stats_encode_rate(status); } } else if (rx->sdata->vif.type == NL80211_IFTYPE_OCB) { - sta->deflink.rx_stats.last_rx = jiffies; + link_sta->rx_stats.last_rx = jiffies; } else if (!ieee80211_is_s1g_beacon(hdr->frame_control) && !is_multicast_ether_addr(hdr->addr1)) { /* * Mesh beacons will update last_rx when if they are found to * match the current local configuration when processed. */ - sta->deflink.rx_stats.last_rx = jiffies; + link_sta->rx_stats.last_rx = jiffies; if (ieee80211_is_data(hdr->frame_control)) - sta->deflink.rx_stats.last_rate = sta_stats_encode_rate(status); + link_sta->rx_stats.last_rate = sta_stats_encode_rate(status); } - sta->deflink.rx_stats.fragments++; + link_sta->rx_stats.fragments++; - u64_stats_update_begin(&rx->sta->deflink.rx_stats.syncp); - sta->deflink.rx_stats.bytes += rx->skb->len; - u64_stats_update_end(&rx->sta->deflink.rx_stats.syncp); + u64_stats_update_begin(&link_sta->rx_stats.syncp); + link_sta->rx_stats.bytes += rx->skb->len; + u64_stats_update_end(&link_sta->rx_stats.syncp); if (!(status->flag & RX_FLAG_NO_SIGNAL_VAL)) { - sta->deflink.rx_stats.last_signal = status->signal; - ewma_signal_add(&sta->deflink.rx_stats_avg.signal, + link_sta->rx_stats.last_signal = status->signal; + ewma_signal_add(&link_sta->rx_stats_avg.signal, -status->signal); } if (status->chains) { - sta->deflink.rx_stats.chains = status->chains; + link_sta->rx_stats.chains = status->chains; for (i = 0; i < ARRAY_SIZE(status->chain_signal); i++) { int signal = status->chain_signal[i]; if (!(status->chains & BIT(i))) continue; - sta->deflink.rx_stats.chain_signal_last[i] = signal; - ewma_signal_add(&sta->deflink.rx_stats_avg.chain_signal[i], + link_sta->rx_stats.chain_signal_last[i] = signal; + ewma_signal_add(&link_sta->rx_stats_avg.chain_signal[i], -signal); } } @@ -1842,7 +1854,7 @@ ieee80211_rx_h_sta_process(struct ieee80211_rx_data *rx) * Update counter and free packet here to avoid * counting this as a dropped packed. */ - sta->deflink.rx_stats.packets++; + link_sta->rx_stats.packets++; dev_kfree_skb(rx->skb); return RX_QUEUED; } @@ -1854,7 +1866,6 @@ static struct ieee80211_key * ieee80211_rx_get_bigtk(struct ieee80211_rx_data *rx, int idx) { struct ieee80211_key *key = NULL; - struct ieee80211_sub_if_data *sdata = rx->sdata; int idx2; /* Make sure key gets set if either BIGTK key index is set so that @@ -1873,14 +1884,14 @@ ieee80211_rx_get_bigtk(struct ieee80211_rx_data *rx, int idx) idx2 = idx - 1; } - if (rx->sta) - key = rcu_dereference(rx->sta->deflink.gtk[idx]); + if (rx->link_sta) + key = rcu_dereference(rx->link_sta->gtk[idx]); if (!key) - key = rcu_dereference(sdata->deflink.gtk[idx]); - if (!key && rx->sta) - key = rcu_dereference(rx->sta->deflink.gtk[idx2]); + key = rcu_dereference(rx->link->gtk[idx]); + if (!key && rx->link_sta) + key = rcu_dereference(rx->link_sta->gtk[idx2]); if (!key) - key = rcu_dereference(sdata->deflink.gtk[idx2]); + key = rcu_dereference(rx->link->gtk[idx2]); return key; } @@ -1986,15 +1997,15 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx) if (mmie_keyidx < NUM_DEFAULT_KEYS || mmie_keyidx >= NUM_DEFAULT_KEYS + NUM_DEFAULT_MGMT_KEYS) return RX_DROP_MONITOR; /* unexpected BIP keyidx */ - if (rx->sta) { + if (rx->link_sta) { if (ieee80211_is_group_privacy_action(skb) && test_sta_flag(rx->sta, WLAN_STA_MFP)) return RX_DROP_MONITOR; - rx->key = rcu_dereference(rx->sta->deflink.gtk[mmie_keyidx]); + rx->key = rcu_dereference(rx->link_sta->gtk[mmie_keyidx]); } if (!rx->key) - rx->key = rcu_dereference(rx->sdata->deflink.gtk[mmie_keyidx]); + rx->key = rcu_dereference(rx->link->gtk[mmie_keyidx]); } else if (!ieee80211_has_protected(fc)) { /* * The frame was not protected, so skip decryption. However, we @@ -2003,25 +2014,24 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx) * have been expected. */ struct ieee80211_key *key = NULL; - struct ieee80211_sub_if_data *sdata = rx->sdata; int i; if (ieee80211_is_beacon(fc)) { key = ieee80211_rx_get_bigtk(rx, -1); } else if (ieee80211_is_mgmt(fc) && is_multicast_ether_addr(hdr->addr1)) { - key = rcu_dereference(rx->sdata->deflink.default_mgmt_key); + key = rcu_dereference(rx->link->default_mgmt_key); } else { - if (rx->sta) { + if (rx->link_sta) { for (i = 0; i < NUM_DEFAULT_KEYS; i++) { - key = rcu_dereference(rx->sta->deflink.gtk[i]); + key = rcu_dereference(rx->link_sta->gtk[i]); if (key) break; } } if (!key) { for (i = 0; i < NUM_DEFAULT_KEYS; i++) { - key = rcu_dereference(sdata->deflink.gtk[i]); + key = rcu_dereference(rx->link->gtk[i]); if (key) break; } @@ -2050,13 +2060,13 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx) return RX_DROP_UNUSABLE; /* check per-station GTK first, if multicast packet */ - if (is_multicast_ether_addr(hdr->addr1) && rx->sta) - rx->key = rcu_dereference(rx->sta->deflink.gtk[keyidx]); + if (is_multicast_ether_addr(hdr->addr1) && rx->link_sta) + rx->key = rcu_dereference(rx->link_sta->gtk[keyidx]); /* if not found, try default key */ if (!rx->key) { if (is_multicast_ether_addr(hdr->addr1)) - rx->key = rcu_dereference(rx->sdata->deflink.gtk[keyidx]); + rx->key = rcu_dereference(rx->link->gtk[keyidx]); if (!rx->key) rx->key = rcu_dereference(rx->sdata->keys[keyidx]); @@ -2380,7 +2390,7 @@ ieee80211_rx_h_defragment(struct ieee80211_rx_data *rx) out: ieee80211_led_rx(rx->local); if (rx->sta) - rx->sta->deflink.rx_stats.packets++; + rx->link_sta->rx_stats.packets++; return RX_CONTINUE; } @@ -2656,9 +2666,9 @@ ieee80211_deliver_skb(struct ieee80211_rx_data *rx) * for non-QoS-data frames. Here we know it's a data * frame, so count MSDUs. */ - u64_stats_update_begin(&rx->sta->deflink.rx_stats.syncp); - rx->sta->deflink.rx_stats.msdu[rx->seqno_idx]++; - u64_stats_update_end(&rx->sta->deflink.rx_stats.syncp); + u64_stats_update_begin(&rx->link_sta->rx_stats.syncp); + rx->link_sta->rx_stats.msdu[rx->seqno_idx]++; + u64_stats_update_end(&rx->link_sta->rx_stats.syncp); } if ((sdata->vif.type == NL80211_IFTYPE_AP || @@ -3046,7 +3056,8 @@ ieee80211_rx_h_data(struct ieee80211_rx_data *rx) (tf->action_code == WLAN_TDLS_CHANNEL_SWITCH_REQUEST || tf->action_code == WLAN_TDLS_CHANNEL_SWITCH_RESPONSE)) { rx->skb->protocol = cpu_to_be16(ETH_P_TDLS); - __ieee80211_queue_skb_to_iface(sdata, rx->sta, rx->skb); + __ieee80211_queue_skb_to_iface(sdata, rx->link_id, + rx->sta, rx->skb); return RX_QUEUED; } } @@ -3354,7 +3365,7 @@ ieee80211_rx_h_action(struct ieee80211_rx_data *rx) switch (mgmt->u.action.category) { case WLAN_CATEGORY_HT: /* reject HT action frames from stations not supporting HT */ - if (!rx->sta->sta.deflink.ht_cap.ht_supported) + if (!rx->link_sta->pub->ht_cap.ht_supported) goto invalid; if (sdata->vif.type != NL80211_IFTYPE_STATION && @@ -3394,9 +3405,9 @@ ieee80211_rx_h_action(struct ieee80211_rx_data *rx) } /* if no change do nothing */ - if (rx->sta->sta.smps_mode == smps_mode) + if (rx->link_sta->pub->smps_mode == smps_mode) goto handled; - rx->sta->sta.smps_mode = smps_mode; + rx->link_sta->pub->smps_mode = smps_mode; sta_opmode.smps_mode = ieee80211_smps_mode_to_smps_mode(smps_mode); sta_opmode.changed = STA_OPMODE_SMPS_MODE_CHANGED; @@ -3418,26 +3429,26 @@ ieee80211_rx_h_action(struct ieee80211_rx_data *rx) struct sta_opmode_info sta_opmode = {}; /* If it doesn't support 40 MHz it can't change ... */ - if (!(rx->sta->sta.deflink.ht_cap.cap & + if (!(rx->link_sta->pub->ht_cap.cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40)) goto handled; if (chanwidth == IEEE80211_HT_CHANWIDTH_20MHZ) max_bw = IEEE80211_STA_RX_BW_20; else - max_bw = ieee80211_sta_cap_rx_bw(&rx->sta->deflink); + max_bw = ieee80211_sta_cap_rx_bw(rx->link_sta); /* set cur_max_bandwidth and recalc sta bw */ - rx->sta->deflink.cur_max_bandwidth = max_bw; - new_bw = ieee80211_sta_cur_vht_bw(&rx->sta->deflink); + rx->link_sta->cur_max_bandwidth = max_bw; + new_bw = ieee80211_sta_cur_vht_bw(rx->link_sta); - if (rx->sta->sta.deflink.bandwidth == new_bw) + if (rx->link_sta->pub->bandwidth == new_bw) goto handled; - rx->sta->sta.deflink.bandwidth = new_bw; + rx->link_sta->pub->bandwidth = new_bw; sband = rx->local->hw.wiphy->bands[status->band]; sta_opmode.bw = - ieee80211_sta_rx_bw_to_chan_width(&rx->sta->deflink); + ieee80211_sta_rx_bw_to_chan_width(rx->link_sta); sta_opmode.changed = STA_OPMODE_MAX_BW_CHANGED; rate_control_rate_update(local, sband, rx->sta, 0, @@ -3631,12 +3642,12 @@ ieee80211_rx_h_action(struct ieee80211_rx_data *rx) handled: if (rx->sta) - rx->sta->deflink.rx_stats.packets++; + rx->link_sta->rx_stats.packets++; dev_kfree_skb(rx->skb); return RX_QUEUED; queue: - ieee80211_queue_skb_to_iface(sdata, rx->sta, rx->skb); + ieee80211_queue_skb_to_iface(sdata, rx->link_id, rx->sta, rx->skb); return RX_QUEUED; } @@ -3675,7 +3686,7 @@ ieee80211_rx_h_userspace_mgmt(struct ieee80211_rx_data *rx) if (cfg80211_rx_mgmt_ext(&rx->sdata->wdev, &info)) { if (rx->sta) - rx->sta->deflink.rx_stats.packets++; + rx->link_sta->rx_stats.packets++; dev_kfree_skb(rx->skb); return RX_QUEUED; } @@ -3713,7 +3724,7 @@ ieee80211_rx_h_action_post_userspace(struct ieee80211_rx_data *rx) handled: if (rx->sta) - rx->sta->deflink.rx_stats.packets++; + rx->link_sta->rx_stats.packets++; dev_kfree_skb(rx->skb); return RX_QUEUED; } @@ -3794,7 +3805,7 @@ ieee80211_rx_h_ext(struct ieee80211_rx_data *rx) return RX_DROP_MONITOR; /* for now only beacons are ext, so queue them */ - ieee80211_queue_skb_to_iface(sdata, rx->sta, rx->skb); + ieee80211_queue_skb_to_iface(sdata, rx->link_id, rx->sta, rx->skb); return RX_QUEUED; } @@ -3851,7 +3862,7 @@ ieee80211_rx_h_mgmt(struct ieee80211_rx_data *rx) return RX_DROP_MONITOR; } - ieee80211_queue_skb_to_iface(sdata, rx->sta, rx->skb); + ieee80211_queue_skb_to_iface(sdata, rx->link_id, rx->sta, rx->skb); return RX_QUEUED; } @@ -3933,7 +3944,7 @@ static void ieee80211_rx_handlers_result(struct ieee80211_rx_data *rx, case RX_DROP_MONITOR: I802_DEBUG_INC(rx->sdata->local->rx_handlers_drop); if (rx->sta) - rx->sta->deflink.rx_stats.dropped++; + rx->link_sta->rx_stats.dropped++; fallthrough; case RX_CONTINUE: { struct ieee80211_rate *rate = NULL; @@ -3952,7 +3963,7 @@ static void ieee80211_rx_handlers_result(struct ieee80211_rx_data *rx, case RX_DROP_UNUSABLE: I802_DEBUG_INC(rx->sdata->local->rx_handlers_drop); if (rx->sta) - rx->sta->deflink.rx_stats.dropped++; + rx->link_sta->rx_stats.dropped++; dev_kfree_skb(rx->skb); break; case RX_QUEUED: @@ -4097,6 +4108,7 @@ void ieee80211_release_reorder_timeout(struct sta_info *sta, int tid) /* FIXME: statistics won't be right with this */ link_id = sta->sta.valid_links ? ffs(sta->sta.valid_links) - 1 : 0; rx.link = rcu_dereference(sta->sdata->link[link_id]); + rx.link_sta = rcu_dereference(sta->link[link_id]); ieee80211_rx_handlers(&rx, &frames); } @@ -4512,6 +4524,15 @@ void ieee80211_check_fast_rx_iface(struct ieee80211_sub_if_data *sdata) mutex_unlock(&local->sta_mtx); } +static bool +ieee80211_rx_is_valid_sta_link_id(struct ieee80211_sta *sta, u8 link_id) +{ + if (!sta->mlo) + return false; + + return !!(sta->valid_links & BIT(link_id)); +} + static void ieee80211_rx_8023(struct ieee80211_rx_data *rx, struct ieee80211_fast_rx *fast_rx, int orig_len) @@ -4519,19 +4540,30 @@ static void ieee80211_rx_8023(struct ieee80211_rx_data *rx, struct ieee80211_sta_rx_stats *stats; struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(rx->skb); struct sta_info *sta = rx->sta; + struct link_sta_info *link_sta; struct sk_buff *skb = rx->skb; void *sa = skb->data + ETH_ALEN; void *da = skb->data; - stats = &sta->deflink.rx_stats; + if (rx->link_id >= 0) { + link_sta = rcu_dereference(sta->link[rx->link_id]); + if (WARN_ON_ONCE(!link_sta)) { + dev_kfree_skb(rx->skb); + return; + } + } else { + link_sta = &sta->deflink; + } + + stats = &link_sta->rx_stats; if (fast_rx->uses_rss) - stats = this_cpu_ptr(sta->deflink.pcpu_rx_stats); + stats = this_cpu_ptr(link_sta->pcpu_rx_stats); /* statistics part of ieee80211_rx_h_sta_process() */ if (!(status->flag & RX_FLAG_NO_SIGNAL_VAL)) { stats->last_signal = status->signal; if (!fast_rx->uses_rss) - ewma_signal_add(&sta->deflink.rx_stats_avg.signal, + ewma_signal_add(&link_sta->rx_stats_avg.signal, -status->signal); } @@ -4547,7 +4579,7 @@ static void ieee80211_rx_8023(struct ieee80211_rx_data *rx, stats->chain_signal_last[i] = signal; if (!fast_rx->uses_rss) - ewma_signal_add(&sta->deflink.rx_stats_avg.chain_signal[i], + ewma_signal_add(&link_sta->rx_stats_avg.chain_signal[i], -signal); } } @@ -4623,7 +4655,8 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx, u8 da[ETH_ALEN]; u8 sa[ETH_ALEN]; } addrs __aligned(2); - struct ieee80211_sta_rx_stats *stats = &sta->deflink.rx_stats; + struct link_sta_info *link_sta; + struct ieee80211_sta_rx_stats *stats; /* for parallel-rx, we need to have DUP_VALIDATED, otherwise we write * to a common data structure; drivers can implement that per queue @@ -4724,8 +4757,19 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx, return true; drop: dev_kfree_skb(skb); + + if (rx->link_id >= 0) { + link_sta = rcu_dereference(sta->link[rx->link_id]); + if (!link_sta) + return true; + } else { + link_sta = &sta->deflink; + } + if (fast_rx->uses_rss) - stats = this_cpu_ptr(sta->deflink.pcpu_rx_stats); + stats = this_cpu_ptr(link_sta->pcpu_rx_stats); + else + stats = &link_sta->rx_stats; stats->dropped++; return true; @@ -4773,7 +4817,17 @@ static bool ieee80211_prepare_and_rx_handle(struct ieee80211_rx_data *rx, if (!link) return true; rx->link = link; + + if (rx->sta) { + rx->link_sta = + rcu_dereference(rx->sta->link[rx->link_id]); + if (!rx->link_sta) + return true; + } } else { + if (rx->sta) + rx->link_sta = &rx->sta->deflink; + rx->link = &sdata->deflink; } @@ -4831,6 +4885,7 @@ static void __ieee80211_rx_handle_8023(struct ieee80211_hw *hw, struct list_head *list) { struct ieee80211_local *local = hw_to_local(hw); + struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); struct ieee80211_fast_rx *fast_rx; struct ieee80211_rx_data rx; @@ -4851,7 +4906,31 @@ static void __ieee80211_rx_handle_8023(struct ieee80211_hw *hw, rx.sta = container_of(pubsta, struct sta_info, sta); rx.sdata = rx.sta->sdata; - rx.link = &rx.sdata->deflink; + + if (status->link_valid && + !ieee80211_rx_is_valid_sta_link_id(pubsta, status->link_id)) + goto drop; + + /* + * TODO: Should the frame be dropped if the right link_id is not + * available? Or may be it is fine in the current form to proceed with + * the frame processing because with frame being in 802.3 format, + * link_id is used only for stats purpose and updating the stats on + * the deflink is fine? + */ + if (status->link_valid) + rx.link_id = status->link_id; + + if (rx.link_id >= 0) { + struct ieee80211_link_data *link; + + link = rcu_dereference(rx.sdata->link[rx.link_id]); + if (!link) + goto drop; + rx.link = link; + } else { + rx.link = &rx.sdata->deflink; + } fast_rx = rcu_dereference(rx.sta->fast_rx); if (!fast_rx) @@ -4881,7 +4960,19 @@ static bool ieee80211_rx_for_interface(struct ieee80211_rx_data *rx, rx->sta = link_sta->sta; rx->link_id = link_sta->link_id; } else { + struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); + rx->sta = sta_info_get_bss(rx->sdata, hdr->addr2); + if (rx->sta) { + if (status->link_valid && + !ieee80211_rx_is_valid_sta_link_id(&rx->sta->sta, + status->link_id)) + return false; + + rx->link_id = status->link_valid ? status->link_id : -1; + } else { + rx->link_id = -1; + } } return ieee80211_prepare_and_rx_handle(rx, skb, consume); @@ -4897,6 +4988,7 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw, struct list_head *list) { struct ieee80211_local *local = hw_to_local(hw); + struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); struct ieee80211_sub_if_data *sdata; struct ieee80211_hdr *hdr; __le16 fc; @@ -4941,10 +5033,39 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw, if (ieee80211_is_data(fc)) { struct sta_info *sta, *prev_sta; + u8 link_id = status->link_id; if (pubsta) { rx.sta = container_of(pubsta, struct sta_info, sta); rx.sdata = rx.sta->sdata; + + if (status->link_valid && + !ieee80211_rx_is_valid_sta_link_id(pubsta, link_id)) + goto out; + + if (status->link_valid) + rx.link_id = status->link_id; + + /* + * In MLO connection, fetch the link_id using addr2 + * when the driver does not pass link_id in status. + * When the address translation is already performed by + * driver/hw, the valid link_id must be passed in + * status. + */ + + if (!status->link_valid && pubsta->mlo) { + struct ieee80211_hdr *hdr = (void *)skb->data; + struct link_sta_info *link_sta; + + link_sta = link_sta_info_get_bss(rx.sdata, + hdr->addr2); + if (!link_sta) + goto out; + + rx.link_id = link_sta->link_id; + } + if (ieee80211_prepare_and_rx_handle(&rx, skb, true)) return; goto out; @@ -4958,6 +5079,13 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw, continue; } + if ((status->link_valid && + !ieee80211_rx_is_valid_sta_link_id(&prev_sta->sta, + link_id)) || + (!status->link_valid && prev_sta->sta.mlo)) + continue; + + rx.link_id = status->link_valid ? link_id : -1; rx.sta = prev_sta; rx.sdata = prev_sta->sdata; ieee80211_prepare_and_rx_handle(&rx, skb, false); @@ -4966,6 +5094,13 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw, } if (prev_sta) { + if ((status->link_valid && + !ieee80211_rx_is_valid_sta_link_id(&prev_sta->sta, + link_id)) || + (!status->link_valid && prev_sta->sta.mlo)) + goto out; + + rx.link_id = status->link_valid ? link_id : -1; rx.sta = prev_sta; rx.sdata = prev_sta->sdata; @@ -5108,6 +5243,9 @@ void ieee80211_rx_list(struct ieee80211_hw *hw, struct ieee80211_sta *pubsta, } } + if (WARN_ON_ONCE(status->link_id >= IEEE80211_LINK_UNSPECIFIED)) + goto drop; + status->rx_flags = 0; kcov_remote_start_common(skb_get_kcov_handle(skb)); diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c index c4f2aeb31da3..0e8c4f48c36d 100644 --- a/net/mac80211/scan.c +++ b/net/mac80211/scan.c @@ -485,7 +485,7 @@ static void __ieee80211_scan_completed(struct ieee80211_hw *hw, bool aborted) /* Set power back to normal operating levels. */ ieee80211_hw_config(local, 0); - if (!hw_scan) { + if (!hw_scan && was_scanning) { ieee80211_configure_filter(local); drv_sw_scan_complete(local, scan_sdata); ieee80211_offchannel_return(local); diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index 58998d821778..cebfd148bb40 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -274,6 +274,43 @@ link_sta_info_get_bss(struct ieee80211_sub_if_data *sdata, const u8 *addr) return NULL; } +struct ieee80211_sta * +ieee80211_find_sta_by_link_addrs(struct ieee80211_hw *hw, + const u8 *addr, + const u8 *localaddr, + unsigned int *link_id) +{ + struct ieee80211_local *local = hw_to_local(hw); + struct link_sta_info *link_sta; + struct rhlist_head *tmp; + + for_each_link_sta_info(local, addr, link_sta, tmp) { + struct sta_info *sta = link_sta->sta; + struct ieee80211_link_data *link; + u8 _link_id = link_sta->link_id; + + if (!localaddr) { + if (link_id) + *link_id = _link_id; + return &sta->sta; + } + + link = rcu_dereference(sta->sdata->link[_link_id]); + if (!link) + continue; + + if (memcmp(link->conf->addr, localaddr, ETH_ALEN)) + continue; + + if (link_id) + *link_id = _link_id; + return &sta->sta; + } + + return NULL; +} +EXPORT_SYMBOL_GPL(ieee80211_find_sta_by_link_addrs); + struct sta_info *sta_info_get_by_addrs(struct ieee80211_local *local, const u8 *sta_addr, const u8 *vif_addr) { @@ -339,6 +376,8 @@ static void sta_remove_link(struct sta_info *sta, unsigned int link_id, sta_info_free_link(&alloc->info); kfree_rcu(alloc, rcu_head); } + + ieee80211_sta_recalc_aggregates(&sta->sta); } /** @@ -472,8 +511,12 @@ static void sta_info_add_link(struct sta_info *sta, link_info->sta = sta; link_info->link_id = link_id; link_info->pub = link_sta; + link_sta->link_id = link_id; rcu_assign_pointer(sta->link[link_id], link_info); rcu_assign_pointer(sta->sta.link[link_id], link_sta); + + link_sta->smps_mode = IEEE80211_SMPS_OFF; + link_sta->agg.max_rc_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_BA; } static struct sta_info * @@ -504,6 +547,8 @@ __sta_info_alloc(struct ieee80211_sub_if_data *sdata, sta_info_add_link(sta, 0, &sta->deflink, &sta->sta.deflink); } + sta->sta.cur = &sta->sta.deflink.agg; + spin_lock_init(&sta->lock); spin_lock_init(&sta->ps_lock); INIT_WORK(&sta->drv_deliver_wk, sta_deliver_ps_frames); @@ -627,9 +672,6 @@ __sta_info_alloc(struct ieee80211_sub_if_data *sdata, } } - sta->sta.smps_mode = IEEE80211_SMPS_OFF; - sta->sta.max_rc_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_BA; - sta->cparams.ce_threshold = CODEL_DISABLED_THRESHOLD; sta->cparams.target = MS2TIME(20); sta->cparams.interval = MS2TIME(100); @@ -2085,6 +2127,44 @@ void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid, } EXPORT_SYMBOL(ieee80211_sta_register_airtime); +void ieee80211_sta_recalc_aggregates(struct ieee80211_sta *pubsta) +{ + struct sta_info *sta = container_of(pubsta, struct sta_info, sta); + struct ieee80211_link_sta *link_sta; + int link_id, i; + bool first = true; + + if (!pubsta->valid_links || !pubsta->mlo) { + pubsta->cur = &pubsta->deflink.agg; + return; + } + + rcu_read_lock(); + for_each_sta_active_link(&sta->sdata->vif, pubsta, link_sta, link_id) { + if (first) { + sta->cur = pubsta->deflink.agg; + first = false; + continue; + } + + sta->cur.max_amsdu_len = + min(sta->cur.max_amsdu_len, + link_sta->agg.max_amsdu_len); + sta->cur.max_rc_amsdu_len = + min(sta->cur.max_rc_amsdu_len, + link_sta->agg.max_rc_amsdu_len); + + for (i = 0; i < ARRAY_SIZE(sta->cur.max_tid_amsdu_len); i++) + sta->cur.max_tid_amsdu_len[i] = + min(sta->cur.max_tid_amsdu_len[i], + link_sta->agg.max_tid_amsdu_len[i]); + } + rcu_read_unlock(); + + pubsta->cur = &sta->cur; +} +EXPORT_SYMBOL(ieee80211_sta_recalc_aggregates); + void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local, struct sta_info *sta, u8 ac, u16 tx_airtime, bool tx_completed) @@ -2777,10 +2857,13 @@ int ieee80211_sta_activate_link(struct sta_info *sta, unsigned int link_id) sta->sta.valid_links = new_links; - if (!test_sta_flag(sta, WLAN_STA_INSERTED)) { - ret = 0; + if (!test_sta_flag(sta, WLAN_STA_INSERTED)) goto hash; - } + + /* Ensure the values are updated for the driver, + * redone by sta_remove_link on failure. + */ + ieee80211_sta_recalc_aggregates(&sta->sta); ret = drv_change_sta_links(sdata->local, sdata, &sta->sta, old_links, new_links); @@ -2799,6 +2882,7 @@ hash: void ieee80211_sta_remove_link(struct sta_info *sta, unsigned int link_id) { struct ieee80211_sub_if_data *sdata = sta->sdata; + u16 old_links = sta->sta.valid_links; lockdep_assert_held(&sdata->local->sta_mtx); @@ -2806,8 +2890,7 @@ void ieee80211_sta_remove_link(struct sta_info *sta, unsigned int link_id) if (test_sta_flag(sta, WLAN_STA_INSERTED)) drv_change_sta_links(sdata->local, sdata, &sta->sta, - sta->sta.valid_links, - sta->sta.valid_links & ~BIT(link_id)); + old_links, sta->sta.valid_links); sta_remove_link(sta, link_id, true); } @@ -2834,3 +2917,13 @@ void ieee80211_sta_set_max_amsdu_subframes(struct sta_info *sta, if (val) sta->sta.max_amsdu_subframes = 4 << val; } + +#ifdef CONFIG_LOCKDEP +bool lockdep_sta_mutex_held(struct ieee80211_sta *pubsta) +{ + struct sta_info *sta = container_of(pubsta, struct sta_info, sta); + + return lockdep_is_held(&sta->local->sta_mtx); +} +EXPORT_SYMBOL(lockdep_sta_mutex_held); +#endif diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h index 2eb3a9452e07..2517ea714dc4 100644 --- a/net/mac80211/sta_info.h +++ b/net/mac80211/sta_info.h @@ -622,6 +622,8 @@ struct link_sta_info { * @tdls_chandef: a TDLS peer can have a wider chandef that is compatible to * the BSS one. * @frags: fragment cache + * @cur: storage for aggregation data + * &struct ieee80211_sta points either here or to deflink.agg. * @deflink: This is the default link STA information, for non MLO STA all link * specific STA information is accessed through @deflink or through * link[0] which points to address of @deflink. For MLO Link STA @@ -705,6 +707,7 @@ struct sta_info { struct ieee80211_fragment_cache frags; + struct ieee80211_sta_aggregates cur; struct link_sta_info deflink; struct link_sta_info __rcu *link[IEEE80211_MLD_MAX_NUM_LINKS]; diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 13249e97a069..27c964be102e 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -576,6 +576,51 @@ ieee80211_tx_h_check_control_port_protocol(struct ieee80211_tx_data *tx) return TX_CONTINUE; } +static struct ieee80211_key * +ieee80211_select_link_key(struct ieee80211_tx_data *tx) +{ + struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data; + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb); + enum { + USE_NONE, + USE_MGMT_KEY, + USE_MCAST_KEY, + } which_key = USE_NONE; + struct ieee80211_link_data *link; + unsigned int link_id; + + if (ieee80211_is_group_privacy_action(tx->skb)) + which_key = USE_MCAST_KEY; + else if (ieee80211_is_mgmt(hdr->frame_control) && + is_multicast_ether_addr(hdr->addr1) && + ieee80211_is_robust_mgmt_frame(tx->skb)) + which_key = USE_MGMT_KEY; + else if (is_multicast_ether_addr(hdr->addr1)) + which_key = USE_MCAST_KEY; + else + return NULL; + + link_id = u32_get_bits(info->control.flags, IEEE80211_TX_CTRL_MLO_LINK); + if (link_id == IEEE80211_LINK_UNSPECIFIED) { + link = &tx->sdata->deflink; + } else { + link = rcu_dereference(tx->sdata->link[link_id]); + if (!link) + return NULL; + } + + switch (which_key) { + case USE_NONE: + break; + case USE_MGMT_KEY: + return rcu_dereference(link->default_mgmt_key); + case USE_MCAST_KEY: + return rcu_dereference(link->default_multicast_key); + } + + return NULL; +} + static ieee80211_tx_result debug_noinline ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx) { @@ -591,16 +636,7 @@ ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx) if (tx->sta && (key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]))) tx->key = key; - else if (ieee80211_is_group_privacy_action(tx->skb) && - (key = rcu_dereference(tx->sdata->deflink.default_multicast_key))) - tx->key = key; - else if (ieee80211_is_mgmt(hdr->frame_control) && - is_multicast_ether_addr(hdr->addr1) && - ieee80211_is_robust_mgmt_frame(tx->skb) && - (key = rcu_dereference(tx->sdata->deflink.default_mgmt_key))) - tx->key = key; - else if (is_multicast_ether_addr(hdr->addr1) && - (key = rcu_dereference(tx->sdata->deflink.default_multicast_key))) + else if ((key = ieee80211_select_link_key(tx))) tx->key = key; else if (!is_multicast_ether_addr(hdr->addr1) && (key = rcu_dereference(tx->sdata->default_unicast_key))) @@ -2640,7 +2676,8 @@ static struct sk_buff *ieee80211_build_hdr(struct ieee80211_sub_if_data *sdata, goto free; } memcpy(hdr.addr2, link->conf->addr, ETH_ALEN); - } else if (link_id == IEEE80211_LINK_UNSPECIFIED) { + } else if (link_id == IEEE80211_LINK_UNSPECIFIED || + (sta && sta->sta.mlo)) { memcpy(hdr.addr2, sdata->vif.addr, ETH_ALEN); } else { struct ieee80211_bss_conf *conf; @@ -3350,7 +3387,7 @@ static bool ieee80211_amsdu_aggregate(struct ieee80211_sub_if_data *sdata, int subframe_len = skb->len - ETH_ALEN; u8 max_subframes = sta->sta.max_amsdu_subframes; int max_frags = local->hw.max_tx_fragments; - int max_amsdu_len = sta->sta.max_amsdu_len; + int max_amsdu_len = sta->sta.cur->max_amsdu_len; int orig_truesize; u32 flow_idx; __be16 len; @@ -3376,13 +3413,13 @@ static bool ieee80211_amsdu_aggregate(struct ieee80211_sub_if_data *sdata, if (test_bit(IEEE80211_TXQ_NO_AMSDU, &txqi->flags)) return false; - if (sta->sta.max_rc_amsdu_len) + if (sta->sta.cur->max_rc_amsdu_len) max_amsdu_len = min_t(int, max_amsdu_len, - sta->sta.max_rc_amsdu_len); + sta->sta.cur->max_rc_amsdu_len); - if (sta->sta.max_tid_amsdu_len[tid]) + if (sta->sta.cur->max_tid_amsdu_len[tid]) max_amsdu_len = min_t(int, max_amsdu_len, - sta->sta.max_tid_amsdu_len[tid]); + sta->sta.cur->max_tid_amsdu_len[tid]); flow_idx = fq_flow_idx(fq, skb); @@ -3735,8 +3772,8 @@ begin: !test_sta_flag(tx.sta, WLAN_STA_AUTHORIZED) && (!(info->control.flags & IEEE80211_TX_CTRL_PORT_CTRL_PROTO) || - !ether_addr_equal(tx.sdata->vif.addr, - hdr->addr2)))) { + !ieee80211_is_our_addr(tx.sdata, hdr->addr2, + NULL)))) { I802_DEBUG_INC(local->tx_handlers_drop_unauth_port); ieee80211_free_txskb(&local->hw, skb); goto begin; @@ -5061,6 +5098,8 @@ ieee80211_beacon_get_finish(struct ieee80211_hw *hw, rate_control_get_rate(sdata, NULL, &txrc); info->control.vif = vif; + info->control.flags |= u32_encode_bits(link->link_id, + IEEE80211_TX_CTRL_MLO_LINK); info->flags |= IEEE80211_TX_CTL_CLEAR_PS_FILT | IEEE80211_TX_CTL_ASSIGN_SEQ | IEEE80211_TX_CTL_FIRST_FRAGMENT; @@ -5430,33 +5469,39 @@ EXPORT_SYMBOL(ieee80211_pspoll_get); struct sk_buff *ieee80211_nullfunc_get(struct ieee80211_hw *hw, struct ieee80211_vif *vif, - bool qos_ok) + int link_id, bool qos_ok) { + struct ieee80211_sub_if_data *sdata = vif_to_sdata(vif); + struct ieee80211_local *local = sdata->local; + struct ieee80211_link_data *link = NULL; struct ieee80211_hdr_3addr *nullfunc; - struct ieee80211_sub_if_data *sdata; - struct ieee80211_local *local; struct sk_buff *skb; bool qos = false; if (WARN_ON(vif->type != NL80211_IFTYPE_STATION)) return NULL; - sdata = vif_to_sdata(vif); - local = sdata->local; + skb = dev_alloc_skb(local->hw.extra_tx_headroom + + sizeof(*nullfunc) + 2); + if (!skb) + return NULL; + rcu_read_lock(); if (qos_ok) { struct sta_info *sta; - rcu_read_lock(); - sta = sta_info_get(sdata, sdata->deflink.u.mgd.bssid); + sta = sta_info_get(sdata, vif->cfg.ap_addr); qos = sta && sta->sta.wme; - rcu_read_unlock(); } - skb = dev_alloc_skb(local->hw.extra_tx_headroom + - sizeof(*nullfunc) + 2); - if (!skb) - return NULL; + if (link_id >= 0) { + link = rcu_dereference(sdata->link[link_id]); + if (WARN_ON_ONCE(!link)) { + rcu_read_unlock(); + kfree_skb(skb); + return NULL; + } + } skb_reserve(skb, local->hw.extra_tx_headroom); @@ -5477,9 +5522,16 @@ struct sk_buff *ieee80211_nullfunc_get(struct ieee80211_hw *hw, skb_put_data(skb, &qoshdr, sizeof(qoshdr)); } - memcpy(nullfunc->addr1, sdata->deflink.u.mgd.bssid, ETH_ALEN); - memcpy(nullfunc->addr2, vif->addr, ETH_ALEN); - memcpy(nullfunc->addr3, sdata->deflink.u.mgd.bssid, ETH_ALEN); + if (link) { + memcpy(nullfunc->addr1, link->conf->bssid, ETH_ALEN); + memcpy(nullfunc->addr2, link->conf->addr, ETH_ALEN); + memcpy(nullfunc->addr3, link->conf->bssid, ETH_ALEN); + } else { + memcpy(nullfunc->addr1, vif->cfg.ap_addr, ETH_ALEN); + memcpy(nullfunc->addr2, vif->addr, ETH_ALEN); + memcpy(nullfunc->addr3, vif->cfg.ap_addr, ETH_ALEN); + } + rcu_read_unlock(); return skb; } diff --git a/net/mac80211/util.c b/net/mac80211/util.c index efcefb2dd882..bf7461c41bef 100644 --- a/net/mac80211/util.c +++ b/net/mac80211/util.c @@ -954,9 +954,11 @@ void ieee80211_queue_delayed_work(struct ieee80211_hw *hw, } EXPORT_SYMBOL(ieee80211_queue_delayed_work); -static void ieee80211_parse_extension_element(u32 *crc, - const struct element *elem, - struct ieee802_11_elems *elems) +static void +ieee80211_parse_extension_element(u32 *crc, + const struct element *elem, + struct ieee802_11_elems *elems, + struct ieee80211_elems_parse_params *params) { const void *data = elem->data + 1; u8 len; @@ -1013,7 +1015,8 @@ static void ieee80211_parse_extension_element(u32 *crc, break; case WLAN_EID_EXT_EHT_CAPABILITY: if (ieee80211_eht_capa_size_ok(elems->he_cap, - data, len)) { + data, len, + params->from_ap)) { elems->eht_cap = data; elems->eht_cap_len = len; } @@ -1385,7 +1388,7 @@ _ieee802_11_parse_elems_full(struct ieee80211_elems_parse_params *params, case WLAN_EID_EXTENSION: ieee80211_parse_extension_element(calc_crc ? &crc : NULL, - elem, elems); + elem, elems, params); break; case WLAN_EID_S1G_CAPABILITIES: if (elen >= sizeof(*elems->s1g_capab)) @@ -2025,7 +2028,8 @@ static int ieee80211_build_preq_ies_band(struct ieee80211_sub_if_data *sdata, cfg80211_any_usable_channels(local->hw.wiphy, BIT(sband->band), IEEE80211_CHAN_NO_HE | IEEE80211_CHAN_NO_EHT)) { - pos = ieee80211_ie_build_eht_cap(pos, he_cap, eht_cap, end); + pos = ieee80211_ie_build_eht_cap(pos, he_cap, eht_cap, end, + sdata->vif.type == NL80211_IFTYPE_AP); if (!pos) goto out_err; } @@ -2526,7 +2530,6 @@ int ieee80211_reconfig(struct ieee80211_local *local) if (link) ieee80211_assign_chanctx(local, sdata, link); } - sdata_unlock(sdata); switch (sdata->vif.type) { case NL80211_IFTYPE_AP_VLAN: @@ -2545,6 +2548,7 @@ int ieee80211_reconfig(struct ieee80211_local *local) &sdata->deflink.tx_conf[i]); break; } + sdata_unlock(sdata); /* common change flags for all interface types */ changed = BSS_CHANGED_ERP_CTS_PROT | @@ -2653,23 +2657,21 @@ int ieee80211_reconfig(struct ieee80211_local *local) } /* APs are now beaconing, add back stations */ - mutex_lock(&local->sta_mtx); - list_for_each_entry(sta, &local->sta_list, list) { - enum ieee80211_sta_state state; - - if (!sta->uploaded) - continue; - - if (sta->sdata->vif.type != NL80211_IFTYPE_AP && - sta->sdata->vif.type != NL80211_IFTYPE_AP_VLAN) + list_for_each_entry(sdata, &local->interfaces, list) { + if (!ieee80211_sdata_running(sdata)) continue; - for (state = IEEE80211_STA_NOTEXIST; - state < sta->sta_state; state++) - WARN_ON(drv_sta_state(local, sta->sdata, sta, state, - state + 1)); + sdata_lock(sdata); + switch (sdata->vif.type) { + case NL80211_IFTYPE_AP_VLAN: + case NL80211_IFTYPE_AP: + ieee80211_reconfig_stations(sdata); + break; + default: + break; + } + sdata_unlock(sdata); } - mutex_unlock(&local->sta_mtx); /* add back keys */ list_for_each_entry(sdata, &local->interfaces, list) @@ -2898,7 +2900,7 @@ void ieee80211_recalc_min_chandef(struct ieee80211_sub_if_data *sdata, */ rcu_read_unlock(); - if (WARN_ON_ONCE(!chanctx_conf)) + if (!chanctx_conf) goto unlock; chanctx = container_of(chanctx_conf, struct ieee80211_chanctx, @@ -3080,6 +3082,7 @@ end: } void ieee80211_ie_build_he_6ghz_cap(struct ieee80211_sub_if_data *sdata, + enum ieee80211_smps_mode smps_mode, struct sk_buff *skb) { struct ieee80211_supported_band *sband; @@ -3106,7 +3109,7 @@ void ieee80211_ie_build_he_6ghz_cap(struct ieee80211_sub_if_data *sdata, cap = le16_to_cpu(iftd->he_6ghz_capa.capa); cap &= ~IEEE80211_HE_6GHZ_CAP_SM_PS; - switch (sdata->deflink.smps_mode) { + switch (smps_mode) { case IEEE80211_SMPS_AUTOMATIC: case IEEE80211_SMPS_NUM_MODES: WARN_ON(1); @@ -3507,8 +3510,7 @@ bool ieee80211_chandef_vht_oper(struct ieee80211_hw *hw, u32 vht_cap_info, return true; } -void ieee80211_chandef_eht_oper(struct ieee80211_sub_if_data *sdata, - const struct ieee80211_eht_operation *eht_oper, +void ieee80211_chandef_eht_oper(const struct ieee80211_eht_operation *eht_oper, bool support_160, bool support_320, struct cfg80211_chan_def *chandef) { @@ -3684,7 +3686,7 @@ bool ieee80211_chandef_he_6ghz_oper(struct ieee80211_sub_if_data *sdata, support_320 = eht_phy_cap & IEEE80211_EHT_PHY_CAP0_320MHZ_IN_6GHZ; - ieee80211_chandef_eht_oper(sdata, eht_oper, support_160, + ieee80211_chandef_eht_oper(eht_oper, support_160, support_320, &he_chandef); } @@ -4770,6 +4772,7 @@ u8 ieee80211_ie_len_eht_cap(struct ieee80211_sub_if_data *sdata, u8 iftype) const struct ieee80211_sta_he_cap *he_cap; const struct ieee80211_sta_eht_cap *eht_cap; struct ieee80211_supported_band *sband; + bool is_ap; u8 n; sband = ieee80211_get_sband(sdata); @@ -4781,8 +4784,12 @@ u8 ieee80211_ie_len_eht_cap(struct ieee80211_sub_if_data *sdata, u8 iftype) if (!he_cap || !eht_cap) return 0; + is_ap = iftype == NL80211_IFTYPE_AP || + iftype == NL80211_IFTYPE_P2P_GO; + n = ieee80211_eht_mcs_nss_size(&he_cap->he_cap_elem, - &eht_cap->eht_cap_elem); + &eht_cap->eht_cap_elem, + is_ap); return 2 + 1 + sizeof(he_cap->he_cap_elem) + n + ieee80211_eht_ppe_size(eht_cap->eht_ppe_thres[0], @@ -4793,7 +4800,8 @@ u8 ieee80211_ie_len_eht_cap(struct ieee80211_sub_if_data *sdata, u8 iftype) u8 *ieee80211_ie_build_eht_cap(u8 *pos, const struct ieee80211_sta_he_cap *he_cap, const struct ieee80211_sta_eht_cap *eht_cap, - u8 *end) + u8 *end, + bool for_ap) { u8 mcs_nss_len, ppet_len; u8 ie_len; @@ -4804,7 +4812,8 @@ u8 *ieee80211_ie_build_eht_cap(u8 *pos, return orig_pos; mcs_nss_len = ieee80211_eht_mcs_nss_size(&he_cap->he_cap_elem, - &eht_cap->eht_cap_elem); + &eht_cap->eht_cap_elem, + for_ap); ppet_len = ieee80211_eht_ppe_size(eht_cap->eht_ppe_thres[0], eht_cap->eht_cap_elem.phy_cap_info); diff --git a/net/mac80211/vht.c b/net/mac80211/vht.c index b2b09d421e8b..803de5881485 100644 --- a/net/mac80211/vht.c +++ b/net/mac80211/vht.c @@ -323,16 +323,18 @@ ieee80211_vht_cap_ie_to_sta_vht_cap(struct ieee80211_sub_if_data *sdata, */ switch (vht_cap->cap & IEEE80211_VHT_CAP_MAX_MPDU_MASK) { case IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454: - link_sta->sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_11454; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_11454; break; case IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_7991: - link_sta->sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_7991; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_7991; break; case IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_3895: default: - link_sta->sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_3895; + link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_3895; break; } + + ieee80211_sta_recalc_aggregates(&link_sta->sta->sta); } /* FIXME: move this to some better location - parses HE/EHT now */ diff --git a/net/mptcp/mptcp_diag.c b/net/mptcp/mptcp_diag.c index 7f9a71780437..8df1bdb647e2 100644 --- a/net/mptcp/mptcp_diag.c +++ b/net/mptcp/mptcp_diag.c @@ -81,15 +81,18 @@ static void mptcp_diag_dump_listeners(struct sk_buff *skb, struct netlink_callba struct mptcp_diag_ctx *diag_ctx = (void *)cb->ctx; struct nlattr *bc = cb_data->inet_diag_nla_bc; struct net *net = sock_net(skb->sk); + struct inet_hashinfo *hinfo; int i; - for (i = diag_ctx->l_slot; i <= tcp_hashinfo.lhash2_mask; i++) { + hinfo = net->ipv4.tcp_death_row.hashinfo; + + for (i = diag_ctx->l_slot; i <= hinfo->lhash2_mask; i++) { struct inet_listen_hashbucket *ilb; struct hlist_nulls_node *node; struct sock *sk; int num = 0; - ilb = &tcp_hashinfo.lhash2[i]; + ilb = &hinfo->lhash2[i]; rcu_read_lock(); spin_lock(&ilb->lock); diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 291b5da42fdb..9813ed0fde9b 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -796,7 +796,7 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, u8 rm_id = rm_list->ids[i]; bool removed = false; - list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) { + mptcp_for_each_subflow_safe(msk, subflow, tmp) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); int how = RCV_SHUTDOWN | SEND_SHUTDOWN; u8 id = subflow->local_id; @@ -1327,7 +1327,7 @@ static int mptcp_nl_cmd_add_addr(struct sk_buff *skb, struct genl_info *info) return -EINVAL; } - entry = kmalloc(sizeof(*entry), GFP_KERNEL); + entry = kmalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT); if (!entry) { GENL_SET_ERR_MSG(info, "can't allocate addr"); return -ENOMEM; @@ -2218,17 +2218,17 @@ static const struct genl_small_ops mptcp_pm_ops[] = { { .cmd = MPTCP_PM_CMD_ADD_ADDR, .doit = mptcp_nl_cmd_add_addr, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_DEL_ADDR, .doit = mptcp_nl_cmd_del_addr, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_FLUSH_ADDRS, .doit = mptcp_nl_cmd_flush_addrs, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_GET_ADDR, @@ -2238,7 +2238,7 @@ static const struct genl_small_ops mptcp_pm_ops[] = { { .cmd = MPTCP_PM_CMD_SET_LIMITS, .doit = mptcp_nl_cmd_set_limits, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_GET_LIMITS, @@ -2247,27 +2247,27 @@ static const struct genl_small_ops mptcp_pm_ops[] = { { .cmd = MPTCP_PM_CMD_SET_FLAGS, .doit = mptcp_nl_cmd_set_flags, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_ANNOUNCE, .doit = mptcp_nl_cmd_announce, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_REMOVE, .doit = mptcp_nl_cmd_remove, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_SUBFLOW_CREATE, .doit = mptcp_nl_cmd_sf_create, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, { .cmd = MPTCP_PM_CMD_SUBFLOW_DESTROY, .doit = mptcp_nl_cmd_sf_destroy, - .flags = GENL_ADMIN_PERM, + .flags = GENL_UNS_ADMIN_PERM, }, }; @@ -2280,6 +2280,7 @@ static struct genl_family mptcp_genl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = mptcp_pm_ops, .n_small_ops = ARRAY_SIZE(mptcp_pm_ops), + .resv_start_op = MPTCP_PM_CMD_SUBFLOW_DESTROY + 1, .mcgrps = mptcp_pm_mcgrps, .n_mcgrps = ARRAY_SIZE(mptcp_pm_mcgrps), }; diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index f8897a70c11d..f599ad44ed24 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -662,9 +662,9 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk, skb = skb_peek(&ssk->sk_receive_queue); if (!skb) { - /* if no data is found, a racing workqueue/recvmsg - * already processed the new data, stop here or we - * can enter an infinite loop + /* With racing move_skbs_to_msk() and __mptcp_move_skbs(), + * a different CPU can have already processed the pending + * data, stop here or we can enter an infinite loop */ if (!moved) done = true; @@ -672,9 +672,9 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk, } if (__mptcp_check_fallback(msk)) { - /* if we are running under the workqueue, TCP could have - * collapsed skbs between dummy map creation and now - * be sure to adjust the size + /* Under fallback skbs have no MPTCP extension and TCP could + * collapse them between the dummy map creation and the + * current dequeue. Be sure to adjust the map size. */ map_remaining = skb->len; subflow->map_data_len = skb->len; @@ -1544,8 +1544,9 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) struct mptcp_sendmsg_info info = { .flags = flags, }; + bool do_check_data_fin = false; struct mptcp_data_frag *dfrag; - int len, copied = 0; + int len; while ((dfrag = mptcp_send_head(sk))) { info.sent = dfrag->already_sent; @@ -1580,8 +1581,8 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) goto out; } + do_check_data_fin = true; info.sent += ret; - copied += ret; len -= ret; mptcp_update_post_push(msk, dfrag, ret); @@ -1597,7 +1598,7 @@ out: /* ensure the rtx timer is running */ if (!mptcp_timer_pending(sk)) mptcp_reset_timer(sk); - if (copied) + if (do_check_data_fin) __mptcp_check_send_data_fin(sk); } @@ -1676,6 +1677,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) { struct mptcp_sock *msk = mptcp_sk(sk); struct page_frag *pfrag; + struct socket *ssock; size_t copied = 0; int ret = 0; long timeo; @@ -1689,14 +1691,39 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) lock_sock(sk); + ssock = __mptcp_nmpc_socket(msk); + if (unlikely(ssock && inet_sk(ssock->sk)->defer_connect)) { + struct sock *ssk = ssock->sk; + int copied_syn = 0; + + lock_sock(ssk); + + ret = tcp_sendmsg_fastopen(ssk, msg, &copied_syn, len, NULL); + copied += copied_syn; + if (ret == -EINPROGRESS && copied_syn > 0) { + /* reflect the new state on the MPTCP socket */ + inet_sk_state_store(sk, inet_sk_state_load(ssk)); + release_sock(ssk); + goto out; + } else if (ret) { + release_sock(ssk); + goto do_error; + } + release_sock(ssk); + } + timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT); if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { ret = sk_stream_wait_connect(sk, &timeo); if (ret) - goto out; + goto do_error; } + ret = -EPIPE; + if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))) + goto do_error; + pfrag = sk_page_frag(sk); while (msg_data_left(msg)) { @@ -1705,11 +1732,6 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) bool dfrag_collapsed; size_t psize, offset; - if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) { - ret = -EPIPE; - goto out; - } - /* reuse tail pfrag, if possible, or carve a new one from the * page allocator */ @@ -1741,7 +1763,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) if (copy_page_from_iter(dfrag->page, offset, psize, &msg->msg_iter) != psize) { ret = -EFAULT; - goto out; + goto do_error; } /* data successfully copied into the write queue */ @@ -1773,7 +1795,7 @@ wait_for_memory: __mptcp_push_pending(sk, msg->msg_flags); ret = sk_stream_wait_memory(sk, &timeo); if (ret) - goto out; + goto do_error; } if (copied) @@ -1781,7 +1803,14 @@ wait_for_memory: out: release_sock(sk); - return copied ? : ret; + return copied; + +do_error: + if (copied) + goto out; + + copied = sk_stream_error(sk, msg->msg_flags, ret); + goto out; } static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk, @@ -2284,8 +2313,14 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); - if (flags & MPTCP_CF_FASTCLOSE) + if (flags & MPTCP_CF_FASTCLOSE) { + /* be sure to force the tcp_disconnect() path, + * to generate the egress reset + */ + ssk->sk_lingertime = 0; + sock_set_flag(ssk, SOCK_LINGER); subflow->send_fastclose = 1; + } need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk); if (!dispose_it) { @@ -2363,7 +2398,7 @@ static void __mptcp_close_subflow(struct mptcp_sock *msk) might_sleep(); - list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) { + mptcp_for_each_subflow_safe(msk, subflow, tmp) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); if (inet_sk_state_load(ssk) != TCP_CLOSE) @@ -2406,7 +2441,7 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk) mptcp_token_destroy(msk); - list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) { + mptcp_for_each_subflow_safe(msk, subflow, tmp) { struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow); bool slow; @@ -2418,12 +2453,31 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk) unlock_sock_fast(tcp_sk, slow); } + /* Mirror the tcp_reset() error propagation */ + switch (sk->sk_state) { + case TCP_SYN_SENT: + sk->sk_err = ECONNREFUSED; + break; + case TCP_CLOSE_WAIT: + sk->sk_err = EPIPE; + break; + case TCP_CLOSE: + return; + default: + sk->sk_err = ECONNRESET; + } + inet_sk_state_store(sk, TCP_CLOSE); sk->sk_shutdown = SHUTDOWN_MASK; smp_mb__before_atomic(); /* SHUTDOWN must be visible first */ set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags); - mptcp_close_wake_up(sk); + /* the calling mptcp_worker will properly destroy the socket */ + if (sock_flag(sk, SOCK_DEAD)) + return; + + sk->sk_state_change(sk); + sk_error_report(sk); } static void __mptcp_retrans(struct sock *sk) @@ -2529,6 +2583,16 @@ static void mptcp_mp_fail_no_response(struct mptcp_sock *msk) mptcp_reset_timeout(msk, 0); } +static void mptcp_do_fastclose(struct sock *sk) +{ + struct mptcp_subflow_context *subflow, *tmp; + struct mptcp_sock *msk = mptcp_sk(sk); + + mptcp_for_each_subflow_safe(msk, subflow, tmp) + __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), + subflow, MPTCP_CF_FASTCLOSE); +} + static void mptcp_worker(struct work_struct *work) { struct mptcp_sock *msk = container_of(work, struct mptcp_sock, work); @@ -2557,11 +2621,15 @@ static void mptcp_worker(struct work_struct *work) * closed, but we need the msk around to reply to incoming DATA_FIN, * even if it is orphaned and in FIN_WAIT2 state */ - if (sock_flag(sk, SOCK_DEAD) && - (mptcp_check_close_timeout(sk) || sk->sk_state == TCP_CLOSE)) { - inet_sk_state_store(sk, TCP_CLOSE); - __mptcp_destroy_sock(sk); - goto unlock; + if (sock_flag(sk, SOCK_DEAD)) { + if (mptcp_check_close_timeout(sk)) { + inet_sk_state_store(sk, TCP_CLOSE); + mptcp_do_fastclose(sk); + } + if (sk->sk_state == TCP_CLOSE) { + __mptcp_destroy_sock(sk); + goto unlock; + } } if (test_and_clear_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags)) @@ -2802,6 +2870,18 @@ static void __mptcp_destroy_sock(struct sock *sk) sock_put(sk); } +static __poll_t mptcp_check_readable(struct mptcp_sock *msk) +{ + /* Concurrent splices from sk_receive_queue into receive_queue will + * always show at least one non-empty queue when checked in this order. + */ + if (skb_queue_empty_lockless(&((struct sock *)msk)->sk_receive_queue) && + skb_queue_empty_lockless(&msk->receive_queue)) + return 0; + + return EPOLLIN | EPOLLRDNORM; +} + bool __mptcp_close(struct sock *sk, long timeout) { struct mptcp_subflow_context *subflow; @@ -2815,8 +2895,13 @@ bool __mptcp_close(struct sock *sk, long timeout) goto cleanup; } - if (mptcp_close_state(sk)) + if (mptcp_check_readable(msk)) { + /* the msk has read data, do the MPTCP equivalent of TCP reset */ + inet_sk_state_store(sk, TCP_CLOSE); + mptcp_do_fastclose(sk); + } else if (mptcp_close_state(sk)) { __mptcp_wr_shutdown(sk); + } sk_stream_wait_close(sk, timeout); @@ -3063,7 +3148,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags) __mptcp_clear_xmit(sk); /* join list will be eventually flushed (with rst) at sock lock release time */ - list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) + mptcp_for_each_subflow_safe(msk, subflow, tmp) __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags); /* move to sk_receive_queue, sk_stream_kill_queues will purge it */ @@ -3535,6 +3620,7 @@ static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr, do_connect: err = ssock->ops->connect(ssock, uaddr, addr_len, flags); + inet_sk(sock->sk)->defer_connect = inet_sk(ssock->sk)->defer_connect; sock->state = ssock->state; /* on successful connect, the msk state will be moved to established by @@ -3632,18 +3718,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock, return err; } -static __poll_t mptcp_check_readable(struct mptcp_sock *msk) -{ - /* Concurrent splices from sk_receive_queue into receive_queue will - * always show at least one non-empty queue when checked in this order. - */ - if (skb_queue_empty_lockless(&((struct sock *)msk)->sk_receive_queue) && - skb_queue_empty_lockless(&msk->receive_queue)) - return 0; - - return EPOLLIN | EPOLLRDNORM; -} - static __poll_t mptcp_check_writeable(struct mptcp_sock *msk) { struct sock *sk = (struct sock *)msk; @@ -3685,13 +3759,16 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock, if (state != TCP_SYN_SENT && state != TCP_SYN_RECV) { mask |= mptcp_check_readable(msk); mask |= mptcp_check_writeable(msk); + } else if (state == TCP_SYN_SENT && inet_sk(sk)->defer_connect) { + /* cf tcp_poll() note about TFO */ + mask |= EPOLLOUT | EPOLLWRNORM; } if (sk->sk_shutdown == SHUTDOWN_MASK || state == TCP_CLOSE) mask |= EPOLLHUP; if (sk->sk_shutdown & RCV_SHUTDOWN) mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP; - /* This barrier is coupled with smp_wmb() in tcp_reset() */ + /* This barrier is coupled with smp_wmb() in __mptcp_error_report() */ smp_rmb(); if (sk->sk_err) mask |= EPOLLERR; diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 8f372b8f059c..c0b5b4628f65 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -314,6 +314,8 @@ struct mptcp_sock { #define mptcp_for_each_subflow(__msk, __subflow) \ list_for_each_entry(__subflow, &((__msk)->conn_list), node) +#define mptcp_for_each_subflow_safe(__msk, __subflow, __tmp) \ + list_for_each_entry_safe(__subflow, __tmp, &((__msk)->conn_list), node) static inline void msk_owned_by_me(const struct mptcp_sock *msk) { diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c index 423d3826ca1e..c7cb68c725b2 100644 --- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -559,6 +559,7 @@ static bool mptcp_supported_sockopt(int level, int optname) case TCP_NOTSENT_LOWAT: case TCP_TX_DELAY: case TCP_INQ: + case TCP_FASTOPEN_CONNECT: return true; } @@ -567,7 +568,7 @@ static bool mptcp_supported_sockopt(int level, int optname) /* TCP_REPAIR, TCP_REPAIR_QUEUE, TCP_QUEUE_SEQ, TCP_REPAIR_OPTIONS, * TCP_REPAIR_WINDOW are not supported, better avoid this mess */ - /* TCP_FASTOPEN_KEY, TCP_FASTOPEN TCP_FASTOPEN_CONNECT, TCP_FASTOPEN_NO_COOKIE, + /* TCP_FASTOPEN_KEY, TCP_FASTOPEN, TCP_FASTOPEN_NO_COOKIE, * are not supported fastopen is currently unsupported */ } @@ -768,6 +769,19 @@ static int mptcp_setsockopt_sol_tcp_defer(struct mptcp_sock *msk, sockptr_t optv return tcp_setsockopt(listener->sk, SOL_TCP, TCP_DEFER_ACCEPT, optval, optlen); } +static int mptcp_setsockopt_sol_tcp_fastopen_connect(struct mptcp_sock *msk, sockptr_t optval, + unsigned int optlen) +{ + struct socket *sock; + + /* Limit to first subflow */ + sock = __mptcp_nmpc_socket(msk); + if (!sock) + return -EINVAL; + + return tcp_setsockopt(sock->sk, SOL_TCP, TCP_FASTOPEN_CONNECT, optval, optlen); +} + static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname, sockptr_t optval, unsigned int optlen) { @@ -796,6 +810,8 @@ static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname, return mptcp_setsockopt_sol_tcp_nodelay(msk, optval, optlen); case TCP_DEFER_ACCEPT: return mptcp_setsockopt_sol_tcp_defer(msk, optval, optlen); + case TCP_FASTOPEN_CONNECT: + return mptcp_setsockopt_sol_tcp_fastopen_connect(msk, optval, optlen); } return -EOPNOTSUPP; @@ -1157,6 +1173,7 @@ static int mptcp_getsockopt_sol_tcp(struct mptcp_sock *msk, int optname, case TCP_INFO: case TCP_CC_INFO: case TCP_DEFER_ACCEPT: + case TCP_FASTOPEN_CONNECT: return mptcp_getsockopt_first_sf_only(msk, SOL_TCP, optname, optval, optlen); case TCP_INQ: diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c index c189b4c8a182..d27f4eccce6d 100644 --- a/net/ncsi/ncsi-netlink.c +++ b/net/ncsi/ncsi-netlink.c @@ -768,6 +768,7 @@ static struct genl_family ncsi_genl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = ncsi_ops, .n_small_ops = ARRAY_SIZE(ncsi_ops), + .resv_start_op = NCSI_CMD_SET_CHANNEL_MASK + 1, }; static int __init ncsi_init_netlink(void) diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile index 06df49ea6329..0f060d100880 100644 --- a/net/netfilter/Makefile +++ b/net/netfilter/Makefile @@ -60,6 +60,12 @@ obj-$(CONFIG_NF_NAT) += nf_nat.o nf_nat-$(CONFIG_NF_NAT_REDIRECT) += nf_nat_redirect.o nf_nat-$(CONFIG_NF_NAT_MASQUERADE) += nf_nat_masquerade.o +ifeq ($(CONFIG_NF_NAT),m) +nf_nat-$(CONFIG_DEBUG_INFO_BTF_MODULES) += nf_nat_bpf.o +else ifeq ($(CONFIG_NF_NAT),y) +nf_nat-$(CONFIG_DEBUG_INFO_BTF) += nf_nat_bpf.o +endif + # NAT helpers obj-$(CONFIG_NF_NAT_AMANDA) += nf_nat_amanda.o obj-$(CONFIG_NF_NAT_FTP) += nf_nat_ftp.o diff --git a/net/netfilter/core.c b/net/netfilter/core.c index dcf752b55a52..5a6705a0e4ec 100644 --- a/net/netfilter/core.c +++ b/net/netfilter/core.c @@ -300,12 +300,6 @@ nf_hook_entry_head(struct net *net, int pf, unsigned int hooknum, if (WARN_ON_ONCE(ARRAY_SIZE(net->nf.hooks_ipv6) <= hooknum)) return NULL; return net->nf.hooks_ipv6 + hooknum; -#if IS_ENABLED(CONFIG_DECNET) - case NFPROTO_DECNET: - if (WARN_ON_ONCE(ARRAY_SIZE(net->nf.hooks_decnet) <= hooknum)) - return NULL; - return net->nf.hooks_decnet + hooknum; -#endif default: WARN_ON_ONCE(1); return NULL; @@ -750,10 +744,6 @@ static int __net_init netfilter_net_init(struct net *net) #ifdef CONFIG_NETFILTER_FAMILY_BRIDGE __netfilter_net_init(net->nf.hooks_bridge, ARRAY_SIZE(net->nf.hooks_bridge)); #endif -#if IS_ENABLED(CONFIG_DECNET) - __netfilter_net_init(net->nf.hooks_decnet, ARRAY_SIZE(net->nf.hooks_decnet)); -#endif - #ifdef CONFIG_PROC_FS net->nf.proc_netfilter = proc_net_mkdir(net, "netfilter", net->proc_net); diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 16ae92054baa..e7ba5b6dd2b7 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -353,7 +353,7 @@ ip_set_init_comment(struct ip_set *set, struct ip_set_comment *comment, c = kmalloc(sizeof(*c) + len + 1, GFP_ATOMIC); if (unlikely(!c)) return; - strlcpy(c->str, ext->comment, len + 1); + strscpy(c->str, ext->comment, len + 1); set->ext_size += sizeof(*c) + strlen(c->str) + 1; rcu_assign_pointer(comment->c, c); } @@ -1072,7 +1072,7 @@ static int ip_set_create(struct sk_buff *skb, const struct nfnl_info *info, if (!set) return -ENOMEM; spin_lock_init(&set->lock); - strlcpy(set->name, name, IPSET_MAXNAMELEN); + strscpy(set->name, name, IPSET_MAXNAMELEN); set->family = family; set->revision = revision; @@ -1719,11 +1719,13 @@ call_ad(struct net *net, struct sock *ctnl, struct sk_buff *skb, skb2 = nlmsg_new(payload, GFP_KERNEL); if (!skb2) return -ENOMEM; - rep = __nlmsg_put(skb2, NETLINK_CB(skb).portid, - nlh->nlmsg_seq, NLMSG_ERROR, payload, 0); + rep = nlmsg_put(skb2, NETLINK_CB(skb).portid, + nlh->nlmsg_seq, NLMSG_ERROR, payload, 0); errmsg = nlmsg_data(rep); errmsg->error = ret; - memcpy(&errmsg->msg, nlh, nlh->nlmsg_len); + unsafe_memcpy(&errmsg->msg, nlh, nlh->nlmsg_len, + /* Bounds checked by the skb layer. */); + cmdattr = (void *)&errmsg->msg + min_len; ret = nla_parse(cda, IPSET_ATTR_CMD_MAX, cmdattr, diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index efab2b06d373..988222fff9f0 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2611,7 +2611,7 @@ ip_vs_copy_service(struct ip_vs_service_entry *dst, struct ip_vs_service *src) dst->addr = src->addr.ip; dst->port = src->port; dst->fwmark = src->fwmark; - strlcpy(dst->sched_name, sched_name, sizeof(dst->sched_name)); + strscpy(dst->sched_name, sched_name, sizeof(dst->sched_name)); dst->flags = src->flags; dst->timeout = src->timeout / HZ; dst->netmask = src->netmask; @@ -2805,13 +2805,13 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) mutex_lock(&ipvs->sync_mutex); if (ipvs->sync_state & IP_VS_STATE_MASTER) { d[0].state = IP_VS_STATE_MASTER; - strlcpy(d[0].mcast_ifn, ipvs->mcfg.mcast_ifn, + strscpy(d[0].mcast_ifn, ipvs->mcfg.mcast_ifn, sizeof(d[0].mcast_ifn)); d[0].syncid = ipvs->mcfg.syncid; } if (ipvs->sync_state & IP_VS_STATE_BACKUP) { d[1].state = IP_VS_STATE_BACKUP; - strlcpy(d[1].mcast_ifn, ipvs->bcfg.mcast_ifn, + strscpy(d[1].mcast_ifn, ipvs->bcfg.mcast_ifn, sizeof(d[1].mcast_ifn)); d[1].syncid = ipvs->bcfg.syncid; } @@ -3561,7 +3561,7 @@ static int ip_vs_genl_new_daemon(struct netns_ipvs *ipvs, struct nlattr **attrs) attrs[IPVS_DAEMON_ATTR_MCAST_IFN] && attrs[IPVS_DAEMON_ATTR_SYNC_ID])) return -EINVAL; - strlcpy(c.mcast_ifn, nla_data(attrs[IPVS_DAEMON_ATTR_MCAST_IFN]), + strscpy(c.mcast_ifn, nla_data(attrs[IPVS_DAEMON_ATTR_MCAST_IFN]), sizeof(c.mcast_ifn)); c.syncid = nla_get_u32(attrs[IPVS_DAEMON_ATTR_SYNC_ID]); @@ -4005,6 +4005,7 @@ static struct genl_family ip_vs_genl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = ip_vs_genl_ops, .n_small_ops = ARRAY_SIZE(ip_vs_genl_ops), + .resv_start_op = IPVS_CMD_FLUSH + 1, }; static int __init ip_vs_genl_register(void) diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c index 1cd87b28c9b0..8639e7efd0e2 100644 --- a/net/netfilter/nf_conntrack_bpf.c +++ b/net/netfilter/nf_conntrack_bpf.c @@ -6,12 +6,14 @@ * are exposed through to BPF programs is explicitly unstable. */ +#include <linux/bpf_verifier.h> #include <linux/bpf.h> #include <linux/btf.h> +#include <linux/filter.h> +#include <linux/mutex.h> #include <linux/types.h> #include <linux/btf_ids.h> #include <linux/net_namespace.h> -#include <net/netfilter/nf_conntrack.h> #include <net/netfilter/nf_conntrack_bpf.h> #include <net/netfilter/nf_conntrack_core.h> @@ -134,7 +136,6 @@ __bpf_nf_ct_alloc_entry(struct net *net, struct bpf_sock_tuple *bpf_tuple, memset(&ct->proto, 0, sizeof(ct->proto)); __nf_ct_set_timeout(ct, timeout * HZ); - ct->status |= IPS_CONFIRMED; out: if (opts->netns_id >= 0) @@ -184,14 +185,58 @@ static struct nf_conn *__bpf_nf_ct_lookup(struct net *net, return ct; } +BTF_ID_LIST(btf_nf_conn_ids) +BTF_ID(struct, nf_conn) +BTF_ID(struct, nf_conn___init) + +/* Check writes into `struct nf_conn` */ +static int _nf_conntrack_btf_struct_access(struct bpf_verifier_log *log, + const struct btf *btf, + const struct btf_type *t, int off, + int size, enum bpf_access_type atype, + u32 *next_btf_id, + enum bpf_type_flag *flag) +{ + const struct btf_type *ncit; + const struct btf_type *nct; + size_t end; + + ncit = btf_type_by_id(btf, btf_nf_conn_ids[1]); + nct = btf_type_by_id(btf, btf_nf_conn_ids[0]); + + if (t != nct && t != ncit) { + bpf_log(log, "only read is supported\n"); + return -EACCES; + } + + /* `struct nf_conn` and `struct nf_conn___init` have the same layout + * so we are safe to simply merge offset checks here + */ + switch (off) { +#if defined(CONFIG_NF_CONNTRACK_MARK) + case offsetof(struct nf_conn, mark): + end = offsetofend(struct nf_conn, mark); + break; +#endif + default: + bpf_log(log, "no write support to nf_conn at off %d\n", off); + return -EACCES; + } + + if (off + size > end) { + bpf_log(log, + "write access at off %d with size %d beyond the member of nf_conn ended at %zu\n", + off, size, end); + return -EACCES; + } + + return 0; +} + __diag_push(); __diag_ignore_all("-Wmissing-prototypes", "Global functions as their definitions will be in nf_conntrack BTF"); -struct nf_conn___init { - struct nf_conn ct; -}; - /* bpf_xdp_ct_alloc - Allocate a new CT entry * * Parameters: @@ -339,6 +384,7 @@ struct nf_conn *bpf_ct_insert_entry(struct nf_conn___init *nfct_i) struct nf_conn *nfct = (struct nf_conn *)nfct_i; int err; + nfct->status |= IPS_CONFIRMED; err = nf_conntrack_hash_check_insert(nfct); if (err < 0) { nf_conntrack_free(nfct); @@ -449,5 +495,19 @@ int register_nf_conntrack_bpf(void) int ret; ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &nf_conntrack_kfunc_set); - return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &nf_conntrack_kfunc_set); + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &nf_conntrack_kfunc_set); + if (!ret) { + mutex_lock(&nf_conn_btf_access_lock); + nfct_btf_struct_access = _nf_conntrack_btf_struct_access; + mutex_unlock(&nf_conn_btf_access_lock); + } + + return ret; +} + +void cleanup_nf_conntrack_bpf(void) +{ + mutex_lock(&nf_conn_btf_access_lock); + nfct_btf_struct_access = NULL; + mutex_unlock(&nf_conn_btf_access_lock); } diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 1357a2729a4b..f97bda06d2a9 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -67,6 +67,7 @@ struct conntrack_gc_work { struct delayed_work dwork; u32 next_bucket; u32 avg_timeout; + u32 count; u32 start_time; bool exiting; bool early_drop; @@ -85,10 +86,12 @@ static DEFINE_MUTEX(nf_conntrack_mutex); /* clamp timeouts to this value (TCP unacked) */ #define GC_SCAN_INTERVAL_CLAMP (300ul * HZ) -/* large initial bias so that we don't scan often just because we have - * three entries with a 1s timeout. +/* Initial bias pretending we have 100 entries at the upper bound so we don't + * wakeup often just because we have three entries with a 1s timeout while still + * allowing non-idle machines to wakeup more often when needed. */ -#define GC_SCAN_INTERVAL_INIT INT_MAX +#define GC_SCAN_INITIAL_COUNT 100 +#define GC_SCAN_INTERVAL_INIT GC_SCAN_INTERVAL_MAX #define GC_SCAN_MAX_DURATION msecs_to_jiffies(10) #define GC_SCAN_EXPIRED_MAX (64000u / HZ) @@ -1466,6 +1469,7 @@ static void gc_worker(struct work_struct *work) unsigned int expired_count = 0; unsigned long next_run; s32 delta_time; + long count; gc_work = container_of(work, struct conntrack_gc_work, dwork.work); @@ -1475,10 +1479,12 @@ static void gc_worker(struct work_struct *work) if (i == 0) { gc_work->avg_timeout = GC_SCAN_INTERVAL_INIT; + gc_work->count = GC_SCAN_INITIAL_COUNT; gc_work->start_time = start_time; } next_run = gc_work->avg_timeout; + count = gc_work->count; end_time = start_time + GC_SCAN_MAX_DURATION; @@ -1498,8 +1504,8 @@ static void gc_worker(struct work_struct *work) hlist_nulls_for_each_entry_rcu(h, n, &ct_hash[i], hnnode) { struct nf_conntrack_net *cnet; - unsigned long expires; struct net *net; + long expires; tmp = nf_ct_tuplehash_to_ctrack(h); @@ -1513,6 +1519,7 @@ static void gc_worker(struct work_struct *work) gc_work->next_bucket = i; gc_work->avg_timeout = next_run; + gc_work->count = count; delta_time = nfct_time_stamp - gc_work->start_time; @@ -1528,8 +1535,8 @@ static void gc_worker(struct work_struct *work) } expires = clamp(nf_ct_expires(tmp), GC_SCAN_INTERVAL_MIN, GC_SCAN_INTERVAL_CLAMP); + expires = (expires - (long)next_run) / ++count; next_run += expires; - next_run /= 2u; if (nf_conntrack_max95 == 0 || gc_worker_skip_ct(tmp)) continue; @@ -1570,6 +1577,7 @@ static void gc_worker(struct work_struct *work) delta_time = nfct_time_stamp - end_time; if (delta_time > 0 && i < hashsz) { gc_work->avg_timeout = next_run; + gc_work->count = count; gc_work->next_bucket = i; next_run = 0; goto early_exit; @@ -2508,6 +2516,7 @@ static int kill_all(struct nf_conn *i, void *data) void nf_conntrack_cleanup_start(void) { + cleanup_nf_conntrack_bpf(); conntrack_gc_work.exiting = true; } @@ -2802,10 +2811,6 @@ err_expect: return ret; } -#if (IS_BUILTIN(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) || \ - (IS_MODULE(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES) || \ - IS_ENABLED(CONFIG_NF_CT_NETLINK)) - /* ctnetlink code shared by both ctnetlink and nf_conntrack_bpf */ int __nf_ct_change_timeout(struct nf_conn *ct, u64 timeout) @@ -2861,5 +2866,3 @@ int nf_ct_change_status_common(struct nf_conn *ct, unsigned int status) return 0; } EXPORT_SYMBOL_GPL(nf_ct_change_status_common); - -#endif diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index a634c72b1ffc..656631083177 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -47,6 +47,12 @@ static const char *const tcp_conntrack_names[] = { "SYN_SENT2", }; +enum nf_ct_tcp_action { + NFCT_TCP_IGNORE, + NFCT_TCP_INVALID, + NFCT_TCP_ACCEPT, +}; + #define SECS * HZ #define MINS * 60 SECS #define HOURS * 60 MINS @@ -472,23 +478,45 @@ static void tcp_init_sender(struct ip_ct_tcp_state *sender, } } -static bool tcp_in_window(struct nf_conn *ct, - enum ip_conntrack_dir dir, - unsigned int index, - const struct sk_buff *skb, - unsigned int dataoff, - const struct tcphdr *tcph, - const struct nf_hook_state *hook_state) +__printf(6, 7) +static enum nf_ct_tcp_action nf_tcp_log_invalid(const struct sk_buff *skb, + const struct nf_conn *ct, + const struct nf_hook_state *state, + const struct ip_ct_tcp_state *sender, + enum nf_ct_tcp_action ret, + const char *fmt, ...) +{ + const struct nf_tcp_net *tn = nf_tcp_pernet(nf_ct_net(ct)); + struct va_format vaf; + va_list args; + bool be_liberal; + + be_liberal = sender->flags & IP_CT_TCP_FLAG_BE_LIBERAL || tn->tcp_be_liberal; + if (be_liberal) + return NFCT_TCP_ACCEPT; + + va_start(args, fmt); + vaf.fmt = fmt; + vaf.va = &args; + nf_ct_l4proto_log_invalid(skb, ct, state, "%pV", &vaf); + va_end(args); + + return ret; +} + +static enum nf_ct_tcp_action +tcp_in_window(struct nf_conn *ct, enum ip_conntrack_dir dir, + unsigned int index, const struct sk_buff *skb, + unsigned int dataoff, const struct tcphdr *tcph, + const struct nf_hook_state *hook_state) { struct ip_ct_tcp *state = &ct->proto.tcp; - struct net *net = nf_ct_net(ct); - struct nf_tcp_net *tn = nf_tcp_pernet(net); struct ip_ct_tcp_state *sender = &state->seen[dir]; struct ip_ct_tcp_state *receiver = &state->seen[!dir]; __u32 seq, ack, sack, end, win, swin; - u16 win_raw; + bool in_recv_win, seq_ok; s32 receiver_offset; - bool res, in_recv_win; + u16 win_raw; /* * Get the required data from the packet. @@ -517,7 +545,7 @@ static bool tcp_in_window(struct nf_conn *ct, end, win); if (!tcph->ack) /* Simultaneous open */ - return true; + return NFCT_TCP_ACCEPT; } else { /* * We are in the middle of a connection, @@ -560,7 +588,7 @@ static bool tcp_in_window(struct nf_conn *ct, end, win); if (dir == IP_CT_DIR_REPLY && !tcph->ack) - return true; + return NFCT_TCP_ACCEPT; } if (!(tcph->ack)) { @@ -584,122 +612,166 @@ static bool tcp_in_window(struct nf_conn *ct, */ seq = end = sender->td_end; - /* Is the ending sequence in the receive window (if available)? */ - in_recv_win = !receiver->td_maxwin || - after(end, sender->td_end - receiver->td_maxwin - 1); - - if (before(seq, sender->td_maxend + 1) && - in_recv_win && - before(sack, receiver->td_end + 1) && - after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1)) { - /* - * Take into account window scaling (RFC 1323). - */ - if (!tcph->syn) - win <<= sender->td_scale; - - /* - * Update sender data. - */ - swin = win + (sack - ack); - if (sender->td_maxwin < swin) - sender->td_maxwin = swin; - if (after(end, sender->td_end)) { + seq_ok = before(seq, sender->td_maxend + 1); + if (!seq_ok) { + u32 overshot = end - sender->td_maxend + 1; + bool ack_ok; + + ack_ok = after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1); + in_recv_win = receiver->td_maxwin && + after(end, sender->td_end - receiver->td_maxwin - 1); + + if (in_recv_win && + ack_ok && + overshot <= receiver->td_maxwin && + before(sack, receiver->td_end + 1)) { + /* Work around TCPs that send more bytes than allowed by + * the receive window. + * + * If the (marked as invalid) packet is allowed to pass by + * the ruleset and the peer acks this data, then its possible + * all future packets will trigger 'ACK is over upper bound' check. + * + * Thus if only the sequence check fails then do update td_end so + * possible ACK for this data can update internal state. + */ sender->td_end = end; sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; - } - if (tcph->ack) { - if (!(sender->flags & IP_CT_TCP_FLAG_MAXACK_SET)) { - sender->td_maxack = ack; - sender->flags |= IP_CT_TCP_FLAG_MAXACK_SET; - } else if (after(ack, sender->td_maxack)) - sender->td_maxack = ack; + + return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_IGNORE, + "%u bytes more than expected", overshot); } - /* - * Update receiver data. - */ - if (receiver->td_maxwin != 0 && after(end, sender->td_maxend)) - receiver->td_maxwin += end - sender->td_maxend; - if (after(sack + win, receiver->td_maxend - 1)) { - receiver->td_maxend = sack + win; - if (win == 0) - receiver->td_maxend++; + return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_INVALID, + "SEQ is over upper bound %u (over the window of the receiver)", + sender->td_maxend + 1); + } + + if (!before(sack, receiver->td_end + 1)) + return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_INVALID, + "ACK is over upper bound %u (ACKed data not seen yet)", + receiver->td_end + 1); + + /* Is the ending sequence in the receive window (if available)? */ + in_recv_win = !receiver->td_maxwin || + after(end, sender->td_end - receiver->td_maxwin - 1); + if (!in_recv_win) + return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_IGNORE, + "SEQ is under lower bound %u (already ACKed data retransmitted)", + sender->td_end - receiver->td_maxwin - 1); + if (!after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1)) + return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_IGNORE, + "ignored ACK under lower bound %u (possible overly delayed)", + receiver->td_end - MAXACKWINDOW(sender) - 1); + + /* Take into account window scaling (RFC 1323). */ + if (!tcph->syn) + win <<= sender->td_scale; + + /* Update sender data. */ + swin = win + (sack - ack); + if (sender->td_maxwin < swin) + sender->td_maxwin = swin; + if (after(end, sender->td_end)) { + sender->td_end = end; + sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; + } + if (tcph->ack) { + if (!(sender->flags & IP_CT_TCP_FLAG_MAXACK_SET)) { + sender->td_maxack = ack; + sender->flags |= IP_CT_TCP_FLAG_MAXACK_SET; + } else if (after(ack, sender->td_maxack)) { + sender->td_maxack = ack; } - if (ack == receiver->td_end) - receiver->flags &= ~IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; + } - /* - * Check retransmissions. - */ - if (index == TCP_ACK_SET) { - if (state->last_dir == dir - && state->last_seq == seq - && state->last_ack == ack - && state->last_end == end - && state->last_win == win_raw) - state->retrans++; - else { - state->last_dir = dir; - state->last_seq = seq; - state->last_ack = ack; - state->last_end = end; - state->last_win = win_raw; - state->retrans = 0; - } + /* Update receiver data. */ + if (receiver->td_maxwin != 0 && after(end, sender->td_maxend)) + receiver->td_maxwin += end - sender->td_maxend; + if (after(sack + win, receiver->td_maxend - 1)) { + receiver->td_maxend = sack + win; + if (win == 0) + receiver->td_maxend++; + } + if (ack == receiver->td_end) + receiver->flags &= ~IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; + + /* Check retransmissions. */ + if (index == TCP_ACK_SET) { + if (state->last_dir == dir && + state->last_seq == seq && + state->last_ack == ack && + state->last_end == end && + state->last_win == win_raw) { + state->retrans++; + } else { + state->last_dir = dir; + state->last_seq = seq; + state->last_ack = ack; + state->last_end = end; + state->last_win = win_raw; + state->retrans = 0; } - res = true; - } else { - res = false; - if (sender->flags & IP_CT_TCP_FLAG_BE_LIBERAL || - tn->tcp_be_liberal) - res = true; - if (!res) { - bool seq_ok = before(seq, sender->td_maxend + 1); - - if (!seq_ok) { - u32 overshot = end - sender->td_maxend + 1; - bool ack_ok; - - ack_ok = after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1); - - if (in_recv_win && - ack_ok && - overshot <= receiver->td_maxwin && - before(sack, receiver->td_end + 1)) { - /* Work around TCPs that send more bytes than allowed by - * the receive window. - * - * If the (marked as invalid) packet is allowed to pass by - * the ruleset and the peer acks this data, then its possible - * all future packets will trigger 'ACK is over upper bound' check. - * - * Thus if only the sequence check fails then do update td_end so - * possible ACK for this data can update internal state. - */ - sender->td_end = end; - sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; - - nf_ct_l4proto_log_invalid(skb, ct, hook_state, - "%u bytes more than expected", overshot); - return res; - } - } + } + + return NFCT_TCP_ACCEPT; +} + +static void __cold nf_tcp_handle_invalid(struct nf_conn *ct, + enum ip_conntrack_dir dir, + int index, + const struct sk_buff *skb, + const struct nf_hook_state *hook_state) +{ + const unsigned int *timeouts; + const struct nf_tcp_net *tn; + unsigned int timeout; + u32 expires; + + if (!test_bit(IPS_ASSURED_BIT, &ct->status) || + test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status)) + return; + + /* We don't want to have connections hanging around in ESTABLISHED + * state for long time 'just because' conntrack deemed a FIN/RST + * out-of-window. + * + * Shrink the timeout just like when there is unacked data. + * This speeds up eviction of 'dead' connections where the + * connection and conntracks internal state are out of sync. + */ + switch (index) { + case TCP_RST_SET: + case TCP_FIN_SET: + break; + default: + return; + } + + if (ct->proto.tcp.last_dir != dir && + (ct->proto.tcp.last_index == TCP_FIN_SET || + ct->proto.tcp.last_index == TCP_RST_SET)) { + expires = nf_ct_expires(ct); + if (expires < 120 * HZ) + return; + + tn = nf_tcp_pernet(nf_ct_net(ct)); + timeouts = nf_ct_timeout_lookup(ct); + if (!timeouts) + timeouts = tn->timeouts; + timeout = READ_ONCE(timeouts[TCP_CONNTRACK_UNACK]); + if (expires > timeout) { nf_ct_l4proto_log_invalid(skb, ct, hook_state, - "%s", - before(seq, sender->td_maxend + 1) ? - in_recv_win ? - before(sack, receiver->td_end + 1) ? - after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1) ? "BUG" - : "ACK is under the lower bound (possible overly delayed ACK)" - : "ACK is over the upper bound (ACKed data not seen yet)" - : "SEQ is under the lower bound (already ACKed data retransmitted)" - : "SEQ is over the upper bound (over the window of the receiver)"); + "packet (index %d, dir %d) response for index %d lower timeout to %u", + index, dir, ct->proto.tcp.last_index, timeout); + + WRITE_ONCE(ct->timeout, timeout + nfct_time_stamp); } + } else { + ct->proto.tcp.last_index = index; + ct->proto.tcp.last_dir = dir; } - - return res; } /* table of valid flag combinations - PUSH, ECE and CWR are always valid */ @@ -861,6 +933,7 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct, struct nf_conntrack_tuple *tuple; enum tcp_conntrack new_state, old_state; unsigned int index, *timeouts; + enum nf_ct_tcp_action res; enum ip_conntrack_dir dir; const struct tcphdr *th; struct tcphdr _tcph; @@ -1126,10 +1199,18 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct, break; } - if (!tcp_in_window(ct, dir, index, - skb, dataoff, th, state)) { + res = tcp_in_window(ct, dir, index, + skb, dataoff, th, state); + switch (res) { + case NFCT_TCP_IGNORE: + spin_unlock_bh(&ct->lock); + return NF_ACCEPT; + case NFCT_TCP_INVALID: + nf_tcp_handle_invalid(ct, dir, index, skb, state); spin_unlock_bh(&ct->lock); return -NF_ACCEPT; + case NFCT_TCP_ACCEPT: + break; } in_window: /* From now on we have got in-window packets */ diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c index edee7fa944c1..8a29290149bd 100644 --- a/net/netfilter/nf_log.c +++ b/net/netfilter/nf_log.c @@ -443,9 +443,9 @@ static int nf_log_proc_dostring(struct ctl_table *table, int write, mutex_lock(&nf_log_mutex); logger = nft_log_dereference(net->nf.nf_loggers[tindex]); if (!logger) - strlcpy(buf, "NONE", sizeof(buf)); + strscpy(buf, "NONE", sizeof(buf)); else - strlcpy(buf, logger->name, sizeof(buf)); + strscpy(buf, logger->name, sizeof(buf)); mutex_unlock(&nf_log_mutex); r = proc_dostring(&tmp, write, buffer, lenp, ppos); } diff --git a/net/netfilter/nf_nat_amanda.c b/net/netfilter/nf_nat_amanda.c index 3bc7e0854efe..98deef6cde69 100644 --- a/net/netfilter/nf_nat_amanda.c +++ b/net/netfilter/nf_nat_amanda.c @@ -44,19 +44,7 @@ static unsigned int help(struct sk_buff *skb, exp->expectfn = nf_nat_follow_master; /* Try to get same port: if not, try to change it. */ - for (port = ntohs(exp->saved_proto.tcp.port); port != 0; port++) { - int res; - - exp->tuple.dst.u.tcp.port = htons(port); - res = nf_ct_expect_related(exp, 0); - if (res == 0) - break; - else if (res != -EBUSY) { - port = 0; - break; - } - } - + port = nf_nat_exp_find_port(exp, ntohs(exp->saved_proto.tcp.port)); if (port == 0) { nf_ct_helper_log(skb, exp->master, "all ports in use"); return NF_DROP; diff --git a/net/netfilter/nf_nat_bpf.c b/net/netfilter/nf_nat_bpf.c new file mode 100644 index 000000000000..0fa5a0bbb0ff --- /dev/null +++ b/net/netfilter/nf_nat_bpf.c @@ -0,0 +1,79 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Unstable NAT Helpers for XDP and TC-BPF hook + * + * These are called from the XDP and SCHED_CLS BPF programs. Note that it is + * allowed to break compatibility for these functions since the interface they + * are exposed through to BPF programs is explicitly unstable. + */ + +#include <linux/bpf.h> +#include <linux/btf_ids.h> +#include <net/netfilter/nf_conntrack_bpf.h> +#include <net/netfilter/nf_conntrack_core.h> +#include <net/netfilter/nf_nat.h> + +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in nf_nat BTF"); + +/* bpf_ct_set_nat_info - Set source or destination nat address + * + * Set source or destination nat address of the newly allocated + * nf_conn before insertion. This must be invoked for referenced + * PTR_TO_BTF_ID to nf_conn___init. + * + * Parameters: + * @nfct - Pointer to referenced nf_conn object, obtained using + * bpf_xdp_ct_alloc or bpf_skb_ct_alloc. + * @addr - Nat source/destination address + * @port - Nat source/destination port. Non-positive values are + * interpreted as select a random port. + * @manip - NF_NAT_MANIP_SRC or NF_NAT_MANIP_DST + */ +int bpf_ct_set_nat_info(struct nf_conn___init *nfct, + union nf_inet_addr *addr, int port, + enum nf_nat_manip_type manip) +{ + struct nf_conn *ct = (struct nf_conn *)nfct; + u16 proto = nf_ct_l3num(ct); + struct nf_nat_range2 range; + + if (proto != NFPROTO_IPV4 && proto != NFPROTO_IPV6) + return -EINVAL; + + memset(&range, 0, sizeof(struct nf_nat_range2)); + range.flags = NF_NAT_RANGE_MAP_IPS; + range.min_addr = *addr; + range.max_addr = range.min_addr; + if (port > 0) { + range.flags |= NF_NAT_RANGE_PROTO_SPECIFIED; + range.min_proto.all = cpu_to_be16(port); + range.max_proto.all = range.min_proto.all; + } + + return nf_nat_setup_info(ct, &range, manip) == NF_DROP ? -ENOMEM : 0; +} + +__diag_pop() + +BTF_SET8_START(nf_nat_kfunc_set) +BTF_ID_FLAGS(func, bpf_ct_set_nat_info, KF_TRUSTED_ARGS) +BTF_SET8_END(nf_nat_kfunc_set) + +static const struct btf_kfunc_id_set nf_bpf_nat_kfunc_set = { + .owner = THIS_MODULE, + .set = &nf_nat_kfunc_set, +}; + +int register_nf_nat_bpf(void) +{ + int ret; + + ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, + &nf_bpf_nat_kfunc_set); + if (ret) + return ret; + + return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, + &nf_bpf_nat_kfunc_set); +} diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c index 7981be526f26..d8e6380f6337 100644 --- a/net/netfilter/nf_nat_core.c +++ b/net/netfilter/nf_nat_core.c @@ -16,7 +16,7 @@ #include <linux/siphash.h> #include <linux/rtnetlink.h> -#include <net/netfilter/nf_conntrack.h> +#include <net/netfilter/nf_conntrack_bpf.h> #include <net/netfilter/nf_conntrack_core.h> #include <net/netfilter/nf_conntrack_helper.h> #include <net/netfilter/nf_conntrack_seqadj.h> @@ -1152,7 +1152,7 @@ static int __init nf_nat_init(void) WARN_ON(nf_nat_hook != NULL); RCU_INIT_POINTER(nf_nat_hook, &nat_hook); - return 0; + return register_nf_nat_bpf(); } static void __exit nf_nat_cleanup(void) diff --git a/net/netfilter/nf_nat_ftp.c b/net/netfilter/nf_nat_ftp.c index aace6768a64e..c92a436d9c48 100644 --- a/net/netfilter/nf_nat_ftp.c +++ b/net/netfilter/nf_nat_ftp.c @@ -86,22 +86,9 @@ static unsigned int nf_nat_ftp(struct sk_buff *skb, * this one. */ exp->expectfn = nf_nat_follow_master; - /* Try to get same port: if not, try to change it. */ - for (port = ntohs(exp->saved_proto.tcp.port); port != 0; port++) { - int ret; - - exp->tuple.dst.u.tcp.port = htons(port); - ret = nf_ct_expect_related(exp, 0); - if (ret == 0) - break; - else if (ret != -EBUSY) { - port = 0; - break; - } - } - + port = nf_nat_exp_find_port(exp, ntohs(exp->saved_proto.tcp.port)); if (port == 0) { - nf_ct_helper_log(skb, ct, "all ports in use"); + nf_ct_helper_log(skb, exp->master, "all ports in use"); return NF_DROP; } diff --git a/net/netfilter/nf_nat_helper.c b/net/netfilter/nf_nat_helper.c index a263505455fc..a95a25196943 100644 --- a/net/netfilter/nf_nat_helper.c +++ b/net/netfilter/nf_nat_helper.c @@ -198,3 +198,34 @@ void nf_nat_follow_master(struct nf_conn *ct, nf_nat_setup_info(ct, &range, NF_NAT_MANIP_DST); } EXPORT_SYMBOL(nf_nat_follow_master); + +u16 nf_nat_exp_find_port(struct nf_conntrack_expect *exp, u16 port) +{ + static const unsigned int max_attempts = 128; + int range, attempts_left; + u16 min = port; + + range = USHRT_MAX - port; + attempts_left = range; + + if (attempts_left > max_attempts) + attempts_left = max_attempts; + + /* Try to get same port: if not, try to change it. */ + for (;;) { + int res; + + exp->tuple.dst.u.tcp.port = htons(port); + res = nf_ct_expect_related(exp, 0); + if (res == 0) + return port; + + if (res != -EBUSY || (--attempts_left < 0)) + break; + + port = min + prandom_u32_max(range); + } + + return 0; +} +EXPORT_SYMBOL_GPL(nf_nat_exp_find_port); diff --git a/net/netfilter/nf_nat_irc.c b/net/netfilter/nf_nat_irc.c index c691ab8d234c..19c4fcc60c50 100644 --- a/net/netfilter/nf_nat_irc.c +++ b/net/netfilter/nf_nat_irc.c @@ -48,20 +48,8 @@ static unsigned int help(struct sk_buff *skb, exp->dir = IP_CT_DIR_REPLY; exp->expectfn = nf_nat_follow_master; - /* Try to get same port: if not, try to change it. */ - for (port = ntohs(exp->saved_proto.tcp.port); port != 0; port++) { - int ret; - - exp->tuple.dst.u.tcp.port = htons(port); - ret = nf_ct_expect_related(exp, 0); - if (ret == 0) - break; - else if (ret != -EBUSY) { - port = 0; - break; - } - } - + port = nf_nat_exp_find_port(exp, + ntohs(exp->saved_proto.tcp.port)); if (port == 0) { nf_ct_helper_log(skb, ct, "all ports in use"); return NF_DROP; diff --git a/net/netfilter/nf_nat_sip.c b/net/netfilter/nf_nat_sip.c index f0a735e86851..cf4aeb299bde 100644 --- a/net/netfilter/nf_nat_sip.c +++ b/net/netfilter/nf_nat_sip.c @@ -410,19 +410,7 @@ static unsigned int nf_nat_sip_expect(struct sk_buff *skb, unsigned int protoff, exp->dir = !dir; exp->expectfn = nf_nat_sip_expected; - for (; port != 0; port++) { - int ret; - - exp->tuple.dst.u.udp.port = htons(port); - ret = nf_ct_expect_related(exp, NF_CT_EXP_F_SKIP_MASTER); - if (ret == 0) - break; - else if (ret != -EBUSY) { - port = 0; - break; - } - } - + port = nf_nat_exp_find_port(exp, port); if (port == 0) { nf_ct_helper_log(skb, ct, "all ports in use for SIP"); return NF_DROP; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 63c70141b3e5..a0653a8dfa82 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -742,7 +742,7 @@ __printf(2, 3) int nft_request_module(struct net *net, const char *fmt, return -ENOMEM; req->done = false; - strlcpy(req->module, module_name, MODULE_NAME_LEN); + strscpy(req->module, module_name, MODULE_NAME_LEN); list_add_tail(&req->list, &nft_net->module_list); return -EAGAIN; diff --git a/net/netfilter/nfnetlink_hook.c b/net/netfilter/nfnetlink_hook.c index 71e29adac48b..8120aadf6a0f 100644 --- a/net/netfilter/nfnetlink_hook.c +++ b/net/netfilter/nfnetlink_hook.c @@ -215,13 +215,6 @@ nfnl_hook_entries_head(u8 pf, unsigned int hook, struct net *net, const char *de hook_head = rcu_dereference(net->nf.hooks_bridge[hook]); #endif break; -#if IS_ENABLED(CONFIG_DECNET) - case NFPROTO_DECNET: - if (hook >= ARRAY_SIZE(net->nf.hooks_decnet)) - return ERR_PTR(-EINVAL); - hook_head = rcu_dereference(net->nf.hooks_decnet[hook]); - break; -#endif #if defined(CONFIG_NETFILTER_INGRESS) || defined(CONFIG_NETFILTER_EGRESS) case NFPROTO_NETDEV: if (hook >= NF_NETDEV_NUMHOOKS) diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c index 89342ccccdcc..adacf95b6e2b 100644 --- a/net/netfilter/nft_osf.c +++ b/net/netfilter/nft_osf.c @@ -51,7 +51,7 @@ static void nft_osf_eval(const struct nft_expr *expr, struct nft_regs *regs, snprintf(os_match, NFT_OSF_MAXGENRELEN, "%s:%s", data.genre, data.version); else - strlcpy(os_match, data.genre, NFT_OSF_MAXGENRELEN); + strscpy(os_match, data.genre, NFT_OSF_MAXGENRELEN); strncpy((char *)dest, os_match, NFT_OSF_MAXGENRELEN); } diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c index eb0e40c29712..088244f9d838 100644 --- a/net/netfilter/nft_payload.c +++ b/net/netfilter/nft_payload.c @@ -173,10 +173,10 @@ static const struct nla_policy nft_payload_policy[NFTA_PAYLOAD_MAX + 1] = { [NFTA_PAYLOAD_SREG] = { .type = NLA_U32 }, [NFTA_PAYLOAD_DREG] = { .type = NLA_U32 }, [NFTA_PAYLOAD_BASE] = { .type = NLA_U32 }, - [NFTA_PAYLOAD_OFFSET] = { .type = NLA_U32 }, - [NFTA_PAYLOAD_LEN] = { .type = NLA_U32 }, + [NFTA_PAYLOAD_OFFSET] = NLA_POLICY_MAX_BE(NLA_U32, 255), + [NFTA_PAYLOAD_LEN] = NLA_POLICY_MAX_BE(NLA_U32, 255), [NFTA_PAYLOAD_CSUM_TYPE] = { .type = NLA_U32 }, - [NFTA_PAYLOAD_CSUM_OFFSET] = { .type = NLA_U32 }, + [NFTA_PAYLOAD_CSUM_OFFSET] = NLA_POLICY_MAX_BE(NLA_U32, 255), [NFTA_PAYLOAD_CSUM_FLAGS] = { .type = NLA_U32 }, }; diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 54a489f16b17..470282cf3fae 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -766,7 +766,7 @@ void xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, msize += off; m->u.user.match_size = msize; - strlcpy(name, match->name, sizeof(name)); + strscpy(name, match->name, sizeof(name)); module_put(match->me); strncpy(m->u.user.name, name, sizeof(m->u.user.name)); @@ -1146,7 +1146,7 @@ void xt_compat_target_from_user(struct xt_entry_target *t, void **dstptr, tsize += off; t->u.user.target_size = tsize; - strlcpy(name, target->name, sizeof(name)); + strscpy(name, target->name, sizeof(name)); module_put(target->me); strncpy(t->u.user.name, name, sizeof(t->u.user.name)); @@ -1827,7 +1827,7 @@ int xt_proto_init(struct net *net, u_int8_t af) root_uid = make_kuid(net->user_ns, 0); root_gid = make_kgid(net->user_ns, 0); - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_TABLES, sizeof(buf)); proc = proc_create_net_data(buf, 0440, net->proc_net, &xt_table_seq_ops, sizeof(struct seq_net_private), @@ -1837,7 +1837,7 @@ int xt_proto_init(struct net *net, u_int8_t af) if (uid_valid(root_uid) && gid_valid(root_gid)) proc_set_user(proc, root_uid, root_gid); - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_MATCHES, sizeof(buf)); proc = proc_create_seq_private(buf, 0440, net->proc_net, &xt_match_seq_ops, sizeof(struct nf_mttg_trav), @@ -1847,7 +1847,7 @@ int xt_proto_init(struct net *net, u_int8_t af) if (uid_valid(root_uid) && gid_valid(root_gid)) proc_set_user(proc, root_uid, root_gid); - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_TARGETS, sizeof(buf)); proc = proc_create_seq_private(buf, 0440, net->proc_net, &xt_target_seq_ops, sizeof(struct nf_mttg_trav), @@ -1862,12 +1862,12 @@ int xt_proto_init(struct net *net, u_int8_t af) #ifdef CONFIG_PROC_FS out_remove_matches: - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_MATCHES, sizeof(buf)); remove_proc_entry(buf, net->proc_net); out_remove_tables: - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_TABLES, sizeof(buf)); remove_proc_entry(buf, net->proc_net); out: @@ -1881,15 +1881,15 @@ void xt_proto_fini(struct net *net, u_int8_t af) #ifdef CONFIG_PROC_FS char buf[XT_FUNCTION_MAXNAMELEN]; - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_TABLES, sizeof(buf)); remove_proc_entry(buf, net->proc_net); - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_TARGETS, sizeof(buf)); remove_proc_entry(buf, net->proc_net); - strlcpy(buf, xt_prefix[af], sizeof(buf)); + strscpy(buf, xt_prefix[af], sizeof(buf)); strlcat(buf, FORMAT_MATCHES, sizeof(buf)); remove_proc_entry(buf, net->proc_net); #endif /*CONFIG_PROC_FS*/ diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c index 8aec1b529364..80f6624e2355 100644 --- a/net/netfilter/xt_RATEEST.c +++ b/net/netfilter/xt_RATEEST.c @@ -144,7 +144,7 @@ static int xt_rateest_tg_checkentry(const struct xt_tgchk_param *par) goto err1; gnet_stats_basic_sync_init(&est->bstats); - strlcpy(est->name, info->name, sizeof(est->name)); + strscpy(est->name, info->name, sizeof(est->name)); spin_lock_init(&est->lock); est->refcnt = 1; est->params.interval = info->interval; diff --git a/net/netlabel/netlabel_calipso.c b/net/netlabel/netlabel_calipso.c index 91a19c3ea1a3..f1d5b8465217 100644 --- a/net/netlabel/netlabel_calipso.c +++ b/net/netlabel/netlabel_calipso.c @@ -344,6 +344,7 @@ static struct genl_family netlbl_calipso_gnl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = netlbl_calipso_ops, .n_small_ops = ARRAY_SIZE(netlbl_calipso_ops), + .resv_start_op = NLBL_CALIPSO_C_LISTALL + 1, }; /* NetLabel Generic NETLINK Protocol Functions diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c index 894e6b8f1a86..fa08ee75ac06 100644 --- a/net/netlabel/netlabel_cipso_v4.c +++ b/net/netlabel/netlabel_cipso_v4.c @@ -767,6 +767,7 @@ static struct genl_family netlbl_cipsov4_gnl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = netlbl_cipsov4_ops, .n_small_ops = ARRAY_SIZE(netlbl_cipsov4_ops), + .resv_start_op = NLBL_CIPSOV4_C_LISTALL + 1, }; /* diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c index 032b7d7b32c7..689eaa2afbec 100644 --- a/net/netlabel/netlabel_mgmt.c +++ b/net/netlabel/netlabel_mgmt.c @@ -826,6 +826,7 @@ static struct genl_family netlbl_mgmt_gnl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = netlbl_mgmt_genl_ops, .n_small_ops = ARRAY_SIZE(netlbl_mgmt_genl_ops), + .resv_start_op = NLBL_MGMT_C_VERSION + 1, }; /* diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c index 0555dffd80e0..9996883bf2b7 100644 --- a/net/netlabel/netlabel_unlabeled.c +++ b/net/netlabel/netlabel_unlabeled.c @@ -1374,6 +1374,7 @@ static struct genl_family netlbl_unlabel_gnl_family __ro_after_init = { .module = THIS_MODULE, .small_ops = netlbl_unlabel_genl_ops, .n_small_ops = ARRAY_SIZE(netlbl_unlabel_genl_ops), + .resv_start_op = NLBL_UNLABEL_C_STATICLISTDEF + 1, }; /* diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 0cd91f813a3b..a662e8a5ff84 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2400,6 +2400,69 @@ error_free: } EXPORT_SYMBOL(__netlink_dump_start); +static size_t +netlink_ack_tlv_len(struct netlink_sock *nlk, int err, + const struct netlink_ext_ack *extack) +{ + size_t tlvlen; + + if (!extack || !(nlk->flags & NETLINK_F_EXT_ACK)) + return 0; + + tlvlen = 0; + if (extack->_msg) + tlvlen += nla_total_size(strlen(extack->_msg) + 1); + if (extack->cookie_len) + tlvlen += nla_total_size(extack->cookie_len); + + /* Following attributes are only reported as error (not warning) */ + if (!err) + return tlvlen; + + if (extack->bad_attr) + tlvlen += nla_total_size(sizeof(u32)); + if (extack->policy) + tlvlen += netlink_policy_dump_attr_size_estimate(extack->policy); + if (extack->miss_type) + tlvlen += nla_total_size(sizeof(u32)); + if (extack->miss_nest) + tlvlen += nla_total_size(sizeof(u32)); + + return tlvlen; +} + +static void +netlink_ack_tlv_fill(struct sk_buff *in_skb, struct sk_buff *skb, + struct nlmsghdr *nlh, int err, + const struct netlink_ext_ack *extack) +{ + if (extack->_msg) + WARN_ON(nla_put_string(skb, NLMSGERR_ATTR_MSG, extack->_msg)); + if (extack->cookie_len) + WARN_ON(nla_put(skb, NLMSGERR_ATTR_COOKIE, + extack->cookie_len, extack->cookie)); + + if (!err) + return; + + if (extack->bad_attr && + !WARN_ON((u8 *)extack->bad_attr < in_skb->data || + (u8 *)extack->bad_attr >= in_skb->data + in_skb->len)) + WARN_ON(nla_put_u32(skb, NLMSGERR_ATTR_OFFS, + (u8 *)extack->bad_attr - (u8 *)nlh)); + if (extack->policy) + netlink_policy_dump_write_attr(skb, extack->policy, + NLMSGERR_ATTR_POLICY); + if (extack->miss_type) + WARN_ON(nla_put_u32(skb, NLMSGERR_ATTR_MISS_TYPE, + extack->miss_type)); + if (extack->miss_nest && + !WARN_ON((u8 *)extack->miss_nest < in_skb->data || + (u8 *)extack->miss_nest > in_skb->data + in_skb->len)) + WARN_ON(nla_put_u32(skb, NLMSGERR_ATTR_MISS_NEST, + (u8 *)extack->miss_nest - (u8 *)nlh)); +} + void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err, const struct netlink_ext_ack *extack) { @@ -2407,29 +2470,20 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err, struct nlmsghdr *rep; struct nlmsgerr *errmsg; size_t payload = sizeof(*errmsg); - size_t tlvlen = 0; struct netlink_sock *nlk = nlk_sk(NETLINK_CB(in_skb).sk); unsigned int flags = 0; - bool nlk_has_extack = nlk->flags & NETLINK_F_EXT_ACK; + size_t tlvlen; /* Error messages get the original request appened, unless the user * requests to cap the error message, and get extra error data if * requested. */ - if (nlk_has_extack && extack && extack->_msg) - tlvlen += nla_total_size(strlen(extack->_msg) + 1); - if (err && !(nlk->flags & NETLINK_F_CAP_ACK)) payload += nlmsg_len(nlh); else flags |= NLM_F_CAPPED; - if (err && nlk_has_extack && extack && extack->bad_attr) - tlvlen += nla_total_size(sizeof(u32)); - if (nlk_has_extack && extack && extack->cookie_len) - tlvlen += nla_total_size(extack->cookie_len); - if (err && nlk_has_extack && extack && extack->policy) - tlvlen += netlink_policy_dump_attr_size_estimate(extack->policy); + tlvlen = netlink_ack_tlv_len(nlk, err, extack); if (tlvlen) flags |= NLM_F_ACK_TLVS; @@ -2440,31 +2494,16 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err, return; } - rep = __nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq, - NLMSG_ERROR, payload, flags); + rep = nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq, + NLMSG_ERROR, payload, flags); errmsg = nlmsg_data(rep); errmsg->error = err; - memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg) ? nlh->nlmsg_len : sizeof(*nlh)); + unsafe_memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg) + ? nlh->nlmsg_len : sizeof(*nlh), + /* Bounds checked by the skb layer. */); - if (nlk_has_extack && extack) { - if (extack->_msg) { - WARN_ON(nla_put_string(skb, NLMSGERR_ATTR_MSG, - extack->_msg)); - } - if (err && extack->bad_attr && - !WARN_ON((u8 *)extack->bad_attr < in_skb->data || - (u8 *)extack->bad_attr >= in_skb->data + - in_skb->len)) - WARN_ON(nla_put_u32(skb, NLMSGERR_ATTR_OFFS, - (u8 *)extack->bad_attr - - (u8 *)nlh)); - if (extack->cookie_len) - WARN_ON(nla_put(skb, NLMSGERR_ATTR_COOKIE, - extack->cookie_len, extack->cookie)); - if (extack->policy) - netlink_policy_dump_write_attr(skb, extack->policy, - NLMSGERR_ATTR_POLICY); - } + if (tlvlen) + netlink_ack_tlv_fill(in_skb, skb, nlh, err, extack); nlmsg_end(skb, rep); diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index 57010927e20a..39b7c00e4cef 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -739,6 +739,36 @@ out: return err; } +static int genl_header_check(const struct genl_family *family, + struct nlmsghdr *nlh, struct genlmsghdr *hdr, + struct netlink_ext_ack *extack) +{ + u16 flags; + + /* Only for commands added after we started validating */ + if (hdr->cmd < family->resv_start_op) + return 0; + + if (hdr->reserved) { + NL_SET_ERR_MSG(extack, "genlmsghdr.reserved field is not 0"); + return -EINVAL; + } + + /* Old netlink flags have pretty loose semantics, allow only the flags + * consumed by the core where we can enforce the meaning. + */ + flags = nlh->nlmsg_flags; + if ((flags & NLM_F_DUMP) == NLM_F_DUMP) /* DUMP is 2 bits */ + flags &= ~NLM_F_DUMP; + if (flags & ~(NLM_F_REQUEST | NLM_F_ACK | NLM_F_ECHO)) { + NL_SET_ERR_MSG(extack, + "ambiguous or reserved bits set in nlmsg_flags"); + return -EINVAL; + } + + return 0; +} + static int genl_family_rcv_msg(const struct genl_family *family, struct sk_buff *skb, struct nlmsghdr *nlh, @@ -757,6 +787,9 @@ static int genl_family_rcv_msg(const struct genl_family *family, if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen)) return -EINVAL; + if (genl_header_check(family, nlh, hdr, extack)) + return -EINVAL; + if (genl_get_cmd(hdr->cmd, family, &op)) return -EOPNOTSUPP; @@ -1348,6 +1381,7 @@ static struct genl_family genl_ctrl __ro_after_init = { .module = THIS_MODULE, .ops = genl_ctrl_ops, .n_ops = ARRAY_SIZE(genl_ctrl_ops), + .resv_start_op = CTRL_CMD_GETPOLICY + 1, .mcgrps = genl_ctrl_groups, .n_mcgrps = ARRAY_SIZE(genl_ctrl_groups), .id = GENL_ID_CTRL, @@ -1362,7 +1396,7 @@ static int genl_bind(struct net *net, int group) unsigned int id; int ret = 0; - genl_lock_all(); + down_read(&cb_lock); idr_for_each_entry(&genl_fam_idr, family, id) { const struct genl_multicast_group *grp; @@ -1383,7 +1417,7 @@ static int genl_bind(struct net *net, int group) break; } - genl_unlock_all(); + up_read(&cb_lock); return ret; } diff --git a/net/nfc/hci/hcp.c b/net/nfc/hci/hcp.c index 05c60988f59a..4902f5064098 100644 --- a/net/nfc/hci/hcp.c +++ b/net/nfc/hci/hcp.c @@ -73,14 +73,12 @@ int nfc_hci_hcp_message_tx(struct nfc_hci_dev *hdev, u8 pipe, if (firstfrag) { firstfrag = false; packet->message.header = HCP_HEADER(type, instruction); - if (ptr) { - memcpy(packet->message.data, ptr, - data_link_len - 1); - ptr += data_link_len - 1; - } } else { - memcpy(&packet->message, ptr, data_link_len); - ptr += data_link_len; + packet->message.header = *ptr++; + } + if (ptr) { + memcpy(packet->message.data, ptr, data_link_len - 1); + ptr += data_link_len - 1; } /* This is the last fragment, set the cb bit */ diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c index 7c62417ccfd7..9d91087b9399 100644 --- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -1783,6 +1783,7 @@ static struct genl_family nfc_genl_family __ro_after_init = { .module = THIS_MODULE, .ops = nfc_genl_ops, .n_ops = ARRAY_SIZE(nfc_genl_ops), + .resv_start_op = NFC_CMD_DEACTIVATE_TARGET + 1, .mcgrps = nfc_genl_mcgrps, .n_mcgrps = ARRAY_SIZE(nfc_genl_mcgrps), }; diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index 4e70df91d0f2..cb255d8ed99a 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -1982,7 +1982,8 @@ static int ovs_ct_limit_set_zone_limit(struct nlattr *nla_zone_limit, } else { struct ovs_ct_limit *ct_limit; - ct_limit = kmalloc(sizeof(*ct_limit), GFP_KERNEL); + ct_limit = kmalloc(sizeof(*ct_limit), + GFP_KERNEL_ACCOUNT); if (!ct_limit) return -ENOMEM; @@ -2252,14 +2253,16 @@ exit_err: static const struct genl_small_ops ct_limit_genl_ops[] = { { .cmd = OVS_CT_LIMIT_CMD_SET, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, - .flags = GENL_ADMIN_PERM, /* Requires CAP_NET_ADMIN - * privilege. */ + .flags = GENL_UNS_ADMIN_PERM, /* Requires CAP_NET_ADMIN + * privilege. + */ .doit = ovs_ct_limit_cmd_set, }, { .cmd = OVS_CT_LIMIT_CMD_DEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, - .flags = GENL_ADMIN_PERM, /* Requires CAP_NET_ADMIN - * privilege. */ + .flags = GENL_UNS_ADMIN_PERM, /* Requires CAP_NET_ADMIN + * privilege. + */ .doit = ovs_ct_limit_cmd_del, }, { .cmd = OVS_CT_LIMIT_CMD_GET, @@ -2283,6 +2286,7 @@ struct genl_family dp_ct_limit_genl_family __ro_after_init = { .parallel_ops = true, .small_ops = ct_limit_genl_ops, .n_small_ops = ARRAY_SIZE(ct_limit_genl_ops), + .resv_start_op = OVS_CT_LIMIT_CMD_GET + 1, .mcgrps = &ovs_ct_limit_multicast_group, .n_mcgrps = 1, .module = THIS_MODULE, diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index 6c9d153afbee..c8a9075ddd0a 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -252,10 +252,17 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct sw_flow_key *key) upcall.mru = OVS_CB(skb)->mru; error = ovs_dp_upcall(dp, skb, key, &upcall, 0); - if (unlikely(error)) - kfree_skb(skb); - else + switch (error) { + case 0: + case -EAGAIN: + case -ERESTARTSYS: + case -EINTR: consume_skb(skb); + break; + default: + kfree_skb(skb); + break; + } stats_counter = &stats->n_missed; goto out; } @@ -551,8 +558,9 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, out: if (err) skb_tx_error(skb); - kfree_skb(user_skb); - kfree_skb(nskb); + consume_skb(user_skb); + consume_skb(nskb); + return err; } @@ -684,6 +692,7 @@ static struct genl_family dp_packet_genl_family __ro_after_init = { .parallel_ops = true, .small_ops = dp_packet_genl_ops, .n_small_ops = ARRAY_SIZE(dp_packet_genl_ops), + .resv_start_op = OVS_PACKET_CMD_EXECUTE + 1, .module = THIS_MODULE, }; @@ -1501,6 +1510,7 @@ static struct genl_family dp_flow_genl_family __ro_after_init = { .parallel_ops = true, .small_ops = dp_flow_genl_ops, .n_small_ops = ARRAY_SIZE(dp_flow_genl_ops), + .resv_start_op = OVS_FLOW_CMD_SET + 1, .mcgrps = &ovs_dp_flow_multicast_group, .n_mcgrps = 1, .module = THIS_MODULE, @@ -1515,6 +1525,7 @@ static size_t ovs_dp_cmd_msg_size(void) msgsize += nla_total_size_64bit(sizeof(struct ovs_dp_megaflow_stats)); msgsize += nla_total_size(sizeof(u32)); /* OVS_DP_ATTR_USER_FEATURES */ msgsize += nla_total_size(sizeof(u32)); /* OVS_DP_ATTR_MASKS_CACHE_SIZE */ + msgsize += nla_total_size(sizeof(u32) * nr_cpu_ids); /* OVS_DP_ATTR_PER_CPU_PIDS */ return msgsize; } @@ -1526,7 +1537,8 @@ static int ovs_dp_cmd_fill_info(struct datapath *dp, struct sk_buff *skb, struct ovs_header *ovs_header; struct ovs_dp_stats dp_stats; struct ovs_dp_megaflow_stats dp_megaflow_stats; - int err; + struct dp_nlsk_pids *pids = ovsl_dereference(dp->upcall_portids); + int err, pids_len; ovs_header = genlmsg_put(skb, portid, seq, &dp_datapath_genl_family, flags, cmd); @@ -1556,6 +1568,12 @@ static int ovs_dp_cmd_fill_info(struct datapath *dp, struct sk_buff *skb, ovs_flow_tbl_masks_cache_size(&dp->table))) goto nla_put_failure; + if (dp->user_features & OVS_DP_F_DISPATCH_UPCALL_PER_CPU && pids) { + pids_len = min(pids->n_pids, nr_cpu_ids) * sizeof(u32); + if (nla_put(skb, OVS_DP_ATTR_PER_CPU_PIDS, pids_len, &pids->pids)) + goto nla_put_failure; + } + genlmsg_end(skb, ovs_header); return 0; @@ -1779,6 +1797,8 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info) parms.dp = dp; parms.port_no = OVSP_LOCAL; parms.upcall_portids = a[OVS_DP_ATTR_UPCALL_PID]; + parms.desired_ifindex = a[OVS_DP_ATTR_IFINDEX] + ? nla_get_u32(a[OVS_DP_ATTR_IFINDEX]) : 0; /* So far only local changes have been made, now need the lock. */ ovs_lock(); @@ -1998,6 +2018,7 @@ static const struct nla_policy datapath_policy[OVS_DP_ATTR_MAX + 1] = { [OVS_DP_ATTR_USER_FEATURES] = { .type = NLA_U32 }, [OVS_DP_ATTR_MASKS_CACHE_SIZE] = NLA_POLICY_RANGE(NLA_U32, 0, PCPU_MIN_UNIT_SIZE / sizeof(struct mask_cache_entry)), + [OVS_DP_ATTR_IFINDEX] = {.type = NLA_U32 }, }; static const struct genl_small_ops dp_datapath_genl_ops[] = { @@ -2034,6 +2055,7 @@ static struct genl_family dp_datapath_genl_family __ro_after_init = { .parallel_ops = true, .small_ops = dp_datapath_genl_ops, .n_small_ops = ARRAY_SIZE(dp_datapath_genl_ops), + .resv_start_op = OVS_DP_CMD_SET + 1, .mcgrps = &ovs_dp_datapath_multicast_group, .n_mcgrps = 1, .module = THIS_MODULE, @@ -2201,7 +2223,10 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info) if (!a[OVS_VPORT_ATTR_NAME] || !a[OVS_VPORT_ATTR_TYPE] || !a[OVS_VPORT_ATTR_UPCALL_PID]) return -EINVAL; - if (a[OVS_VPORT_ATTR_IFINDEX]) + + parms.type = nla_get_u32(a[OVS_VPORT_ATTR_TYPE]); + + if (a[OVS_VPORT_ATTR_IFINDEX] && parms.type != OVS_VPORT_TYPE_INTERNAL) return -EOPNOTSUPP; port_no = a[OVS_VPORT_ATTR_PORT_NO] @@ -2238,11 +2263,12 @@ restart: } parms.name = nla_data(a[OVS_VPORT_ATTR_NAME]); - parms.type = nla_get_u32(a[OVS_VPORT_ATTR_TYPE]); parms.options = a[OVS_VPORT_ATTR_OPTIONS]; parms.dp = dp; parms.port_no = port_no; parms.upcall_portids = a[OVS_VPORT_ATTR_UPCALL_PID]; + parms.desired_ifindex = a[OVS_VPORT_ATTR_IFINDEX] + ? nla_get_u32(a[OVS_VPORT_ATTR_IFINDEX]) : 0; vport = new_vport(&parms); err = PTR_ERR(vport); diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index 4c09cf8a0ab2..4a07ab094a84 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -3304,7 +3304,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr, /* Disallow subsequent L2.5+ set actions and mpls_pop * actions once the last MPLS label in the packet is - * is popped as there is no check here to ensure that + * popped as there is no check here to ensure that * the new eth type is valid and thus set actions could * write off the end of the packet or otherwise corrupt * it. diff --git a/net/openvswitch/meter.c b/net/openvswitch/meter.c index 04a060ac7fdf..6e38f68f88c2 100644 --- a/net/openvswitch/meter.c +++ b/net/openvswitch/meter.c @@ -343,7 +343,7 @@ static struct dp_meter *dp_meter_create(struct nlattr **a) return ERR_PTR(-EINVAL); /* Allocate and set up the meter before locking anything. */ - meter = kzalloc(struct_size(meter, bands, n_bands), GFP_KERNEL); + meter = kzalloc(struct_size(meter, bands, n_bands), GFP_KERNEL_ACCOUNT); if (!meter) return ERR_PTR(-ENOMEM); @@ -687,9 +687,9 @@ static const struct genl_small_ops dp_meter_genl_ops[] = { }, { .cmd = OVS_METER_CMD_SET, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, - .flags = GENL_ADMIN_PERM, /* Requires CAP_NET_ADMIN - * privilege. - */ + .flags = GENL_UNS_ADMIN_PERM, /* Requires CAP_NET_ADMIN + * privilege. + */ .doit = ovs_meter_cmd_set, }, { .cmd = OVS_METER_CMD_GET, @@ -699,9 +699,9 @@ static const struct genl_small_ops dp_meter_genl_ops[] = { }, { .cmd = OVS_METER_CMD_DEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, - .flags = GENL_ADMIN_PERM, /* Requires CAP_NET_ADMIN - * privilege. - */ + .flags = GENL_UNS_ADMIN_PERM, /* Requires CAP_NET_ADMIN + * privilege. + */ .doit = ovs_meter_cmd_del }, }; @@ -720,6 +720,7 @@ struct genl_family dp_meter_genl_family __ro_after_init = { .parallel_ops = true, .small_ops = dp_meter_genl_ops, .n_small_ops = ARRAY_SIZE(dp_meter_genl_ops), + .resv_start_op = OVS_METER_CMD_GET + 1, .mcgrps = &ovs_meter_multicast_group, .n_mcgrps = 1, .module = THIS_MODULE, diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c index 5b2ee9c1c00b..74c88a6baa43 100644 --- a/net/openvswitch/vport-internal_dev.c +++ b/net/openvswitch/vport-internal_dev.c @@ -65,7 +65,7 @@ static int internal_dev_stop(struct net_device *netdev) static void internal_dev_getinfo(struct net_device *netdev, struct ethtool_drvinfo *info) { - strlcpy(info->driver, "openvswitch", sizeof(info->driver)); + strscpy(info->driver, "openvswitch", sizeof(info->driver)); } static const struct ethtool_ops internal_dev_ethtool_ops = { @@ -147,6 +147,7 @@ static struct vport *internal_dev_create(const struct vport_parms *parms) } dev_net_set(vport->dev, ovs_dp_get_net(vport->dp)); + dev->ifindex = parms->desired_ifindex; internal_dev = internal_dev_priv(vport->dev); internal_dev->vport = vport; @@ -189,7 +190,7 @@ static void internal_dev_destroy(struct vport *vport) rtnl_unlock(); } -static netdev_tx_t internal_dev_recv(struct sk_buff *skb) +static int internal_dev_recv(struct sk_buff *skb) { struct net_device *netdev = skb->dev; diff --git a/net/openvswitch/vport.h b/net/openvswitch/vport.h index 9de5030d9801..6ff45e8a0868 100644 --- a/net/openvswitch/vport.h +++ b/net/openvswitch/vport.h @@ -90,12 +90,14 @@ struct vport { * @type: New vport's type. * @options: %OVS_VPORT_ATTR_OPTIONS attribute from Netlink message, %NULL if * none was supplied. + * @desired_ifindex: New vport's ifindex. * @dp: New vport's datapath. * @port_no: New vport's port number. */ struct vport_parms { const char *name; enum ovs_vport_type type; + int desired_ifindex; struct nlattr *options; /* For ovs_vport_alloc(). */ @@ -130,7 +132,7 @@ struct vport_ops { int (*set_options)(struct vport *, struct nlattr *); int (*get_options)(const struct vport *, struct sk_buff *); - netdev_tx_t (*send) (struct sk_buff *skb); + int (*send)(struct sk_buff *skb); struct module *owner; struct list_head list; }; diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 5cbe07116e04..d3f6db350de7 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1905,7 +1905,7 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev, */ spkt->spkt_family = dev->type; - strlcpy(spkt->spkt_device, dev->name, sizeof(spkt->spkt_device)); + strscpy(spkt->spkt_device, dev->name, sizeof(spkt->spkt_device)); spkt->spkt_protocol = skb->protocol; /* @@ -3565,7 +3565,7 @@ static int packet_getname_spkt(struct socket *sock, struct sockaddr *uaddr, rcu_read_lock(); dev = dev_get_by_index_rcu(sock_net(sk), READ_ONCE(pkt_sk(sk)->ifindex)); if (dev) - strlcpy(uaddr->sa_data, dev->name, sizeof(uaddr->sa_data)); + strscpy(uaddr->sa_data, dev->name, sizeof(uaddr->sa_data)); rcu_read_unlock(); return sizeof(*uaddr); @@ -4725,37 +4725,37 @@ static struct pernet_operations packet_net_ops = { static void __exit packet_exit(void) { - unregister_netdevice_notifier(&packet_netdev_notifier); - unregister_pernet_subsys(&packet_net_ops); sock_unregister(PF_PACKET); proto_unregister(&packet_proto); + unregister_netdevice_notifier(&packet_netdev_notifier); + unregister_pernet_subsys(&packet_net_ops); } static int __init packet_init(void) { int rc; - rc = proto_register(&packet_proto, 0); - if (rc) - goto out; - rc = sock_register(&packet_family_ops); - if (rc) - goto out_proto; rc = register_pernet_subsys(&packet_net_ops); if (rc) - goto out_sock; + goto out; rc = register_netdevice_notifier(&packet_netdev_notifier); if (rc) goto out_pernet; + rc = proto_register(&packet_proto, 0); + if (rc) + goto out_notifier; + rc = sock_register(&packet_family_ops); + if (rc) + goto out_proto; return 0; -out_pernet: - unregister_pernet_subsys(&packet_net_ops); -out_sock: - sock_unregister(PF_PACKET); out_proto: proto_unregister(&packet_proto); +out_notifier: + unregister_netdevice_notifier(&packet_netdev_notifier); +out_pernet: + unregister_pernet_subsys(&packet_net_ops); out: return rc; } diff --git a/net/psample/psample.c b/net/psample/psample.c index 118d5d2a81a0..81a794e36f53 100644 --- a/net/psample/psample.c +++ b/net/psample/psample.c @@ -115,6 +115,7 @@ static struct genl_family psample_nl_family __ro_after_init = { .mcgrps = psample_nl_mcgrps, .small_ops = psample_nl_ops, .n_small_ops = ARRAY_SIZE(psample_nl_ops), + .resv_start_op = PSAMPLE_CMD_GET_GROUP + 1, .n_mcgrps = ARRAY_SIZE(psample_nl_mcgrps), }; diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c index b239120dd9ca..3ff6995244e5 100644 --- a/net/rds/af_rds.c +++ b/net/rds/af_rds.c @@ -894,7 +894,7 @@ module_exit(rds_exit); u32 rds_gen_num; -static int rds_init(void) +static int __init rds_init(void) { int ret; diff --git a/net/rds/message.c b/net/rds/message.c index d74be4e3f3fa..44dbc612ef54 100644 --- a/net/rds/message.c +++ b/net/rds/message.c @@ -354,7 +354,7 @@ struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned in for (i = 0; i < rm->data.op_nents; ++i) { sg_set_page(&rm->data.op_sg[i], - virt_to_page(page_addrs[i]), + virt_to_page((void *)page_addrs[i]), PAGE_SIZE, 0); } diff --git a/net/rds/rdma_transport.c b/net/rds/rdma_transport.c index a9e4ff948a7d..d36f3f6b4351 100644 --- a/net/rds/rdma_transport.c +++ b/net/rds/rdma_transport.c @@ -291,7 +291,7 @@ static void rds_rdma_listen_stop(void) #endif } -static int rds_rdma_init(void) +static int __init rds_rdma_init(void) { int ret; @@ -307,7 +307,7 @@ out: } module_init(rds_rdma_init); -static void rds_rdma_exit(void) +static void __exit rds_rdma_exit(void) { /* stop listening first to ensure no new connections are attempted */ rds_rdma_listen_stop(); diff --git a/net/rds/tcp.c b/net/rds/tcp.c index 73ee2771093d..4444fd82b66d 100644 --- a/net/rds/tcp.c +++ b/net/rds/tcp.c @@ -166,10 +166,10 @@ void rds_tcp_reset_callbacks(struct socket *sock, */ atomic_set(&cp->cp_state, RDS_CONN_RESETTING); wait_event(cp->cp_waitq, !test_bit(RDS_IN_XMIT, &cp->cp_flags)); - lock_sock(osock->sk); /* reset receive side state for rds_tcp_data_recv() for osock */ cancel_delayed_work_sync(&cp->cp_send_w); cancel_delayed_work_sync(&cp->cp_recv_w); + lock_sock(osock->sk); if (tc->t_tinc) { rds_inc_put(&tc->t_tinc->ti_inc); tc->t_tinc = NULL; @@ -712,7 +712,7 @@ static void rds_tcp_exit(void) } module_exit(rds_tcp_exit); -static int rds_tcp_init(void) +static int __init rds_tcp_init(void) { int ret; diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index 62c70709d798..1ad0ec5afb50 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -782,7 +782,6 @@ void rxrpc_delete_call_timer(struct rxrpc_call *call); */ extern const char *const rxrpc_call_states[]; extern const char *const rxrpc_call_completions[]; -extern unsigned int rxrpc_max_call_lifetime; extern struct kmem_cache *rxrpc_call_jar; struct rxrpc_call *rxrpc_find_call_by_user_ID(struct rxrpc_sock *, unsigned long); diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 817065aa2833..9b31a10cc639 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -676,6 +676,31 @@ int tcf_idr_search(struct tc_action_net *tn, struct tc_action **a, u32 index) } EXPORT_SYMBOL(tcf_idr_search); +static int __tcf_generic_walker(struct net *net, struct sk_buff *skb, + struct netlink_callback *cb, int type, + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) +{ + struct tc_action_net *tn = net_generic(net, ops->net_id); + + if (unlikely(ops->walk)) + return ops->walk(net, skb, cb, type, ops, extack); + + return tcf_generic_walker(tn, skb, cb, type, ops, extack); +} + +static int __tcf_idr_search(struct net *net, + const struct tc_action_ops *ops, + struct tc_action **a, u32 index) +{ + struct tc_action_net *tn = net_generic(net, ops->net_id); + + if (unlikely(ops->lookup)) + return ops->lookup(net, a, index); + + return tcf_idr_search(tn, a, index); +} + static int tcf_idr_delete_index(struct tcf_idrinfo *idrinfo, u32 index) { struct tc_action *p; @@ -926,7 +951,7 @@ int tcf_register_action(struct tc_action_ops *act, struct tc_action_ops *a; int ret; - if (!act->act || !act->dump || !act->init || !act->walk || !act->lookup) + if (!act->act || !act->dump || !act->init) return -EINVAL; /* We have to register pernet ops before making the action ops visible, @@ -1638,7 +1663,7 @@ static struct tc_action *tcf_action_get_1(struct net *net, struct nlattr *nla, goto err_out; } err = -ENOENT; - if (ops->lookup(net, &a, index) == 0) { + if (__tcf_idr_search(net, ops, &a, index) == 0) { NL_SET_ERR_MSG(extack, "TC action with specified index not found"); goto err_mod; } @@ -1703,7 +1728,7 @@ static int tca_action_flush(struct net *net, struct nlattr *nla, goto out_module_put; } - err = ops->walk(net, skb, &dcb, RTM_DELACTION, ops, extack); + err = __tcf_generic_walker(net, skb, &dcb, RTM_DELACTION, ops, extack); if (err <= 0) { nla_nest_cancel(skb, nest); goto out_module_put; @@ -2121,7 +2146,7 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb) if (nest == NULL) goto out_module_put; - ret = a_o->walk(net, skb, cb, RTM_GETACTION, a_o, NULL); + ret = __tcf_generic_walker(net, skb, cb, RTM_GETACTION, a_o, NULL); if (ret < 0) goto out_module_put; diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index fea2d78b9ddc..b79eee44e24e 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -29,7 +29,6 @@ struct tcf_bpf_cfg { bool is_ebpf; }; -static unsigned int bpf_net_id; static struct tc_action_ops act_bpf_ops; static int tcf_bpf_act(struct sk_buff *skb, const struct tc_action *act, @@ -280,7 +279,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, bpf_net_id); + struct tc_action_net *tn = net_generic(net, act_bpf_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_ACT_BPF_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -334,7 +333,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla, is_bpf = tb[TCA_ACT_BPF_OPS_LEN] && tb[TCA_ACT_BPF_OPS]; is_ebpf = tb[TCA_ACT_BPF_FD]; - if ((!is_bpf && !is_ebpf) || (is_bpf && is_ebpf)) { + if (is_bpf == is_ebpf) { ret = -EINVAL; goto put_chain; } @@ -390,23 +389,6 @@ static void tcf_bpf_cleanup(struct tc_action *act) tcf_bpf_cfg_cleanup(&tmp); } -static int tcf_bpf_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, bpf_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_bpf_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, bpf_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_bpf_ops __read_mostly = { .kind = "bpf", .id = TCA_ID_BPF, @@ -415,27 +397,25 @@ static struct tc_action_ops act_bpf_ops __read_mostly = { .dump = tcf_bpf_dump, .cleanup = tcf_bpf_cleanup, .init = tcf_bpf_init, - .walk = tcf_bpf_walker, - .lookup = tcf_bpf_search, .size = sizeof(struct tcf_bpf), }; static __net_init int bpf_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, bpf_net_id); + struct tc_action_net *tn = net_generic(net, act_bpf_ops.net_id); return tc_action_net_init(net, tn, &act_bpf_ops); } static void __net_exit bpf_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, bpf_net_id); + tc_action_net_exit(net_list, act_bpf_ops.net_id); } static struct pernet_operations bpf_net_ops = { .init = bpf_init_net, .exit_batch = bpf_exit_net, - .id = &bpf_net_id, + .id = &act_bpf_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c index 09e2aafc8943..66b143bb04ac 100644 --- a/net/sched/act_connmark.c +++ b/net/sched/act_connmark.c @@ -25,7 +25,6 @@ #include <net/netfilter/nf_conntrack_core.h> #include <net/netfilter/nf_conntrack_zones.h> -static unsigned int connmark_net_id; static struct tc_action_ops act_connmark_ops; static int tcf_connmark_act(struct sk_buff *skb, const struct tc_action *a, @@ -99,7 +98,7 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, connmark_net_id); + struct tc_action_net *tn = net_generic(net, act_connmark_ops.net_id); struct nlattr *tb[TCA_CONNMARK_MAX + 1]; bool bind = flags & TCA_ACT_FLAGS_BIND; struct tcf_chain *goto_ch = NULL; @@ -200,23 +199,6 @@ nla_put_failure: return -1; } -static int tcf_connmark_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, connmark_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_connmark_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, connmark_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_connmark_ops = { .kind = "connmark", .id = TCA_ID_CONNMARK, @@ -224,27 +206,25 @@ static struct tc_action_ops act_connmark_ops = { .act = tcf_connmark_act, .dump = tcf_connmark_dump, .init = tcf_connmark_init, - .walk = tcf_connmark_walker, - .lookup = tcf_connmark_search, .size = sizeof(struct tcf_connmark_info), }; static __net_init int connmark_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, connmark_net_id); + struct tc_action_net *tn = net_generic(net, act_connmark_ops.net_id); return tc_action_net_init(net, tn, &act_connmark_ops); } static void __net_exit connmark_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, connmark_net_id); + tc_action_net_exit(net_list, act_connmark_ops.net_id); } static struct pernet_operations connmark_net_ops = { .init = connmark_init_net, .exit_batch = connmark_exit_net, - .id = &connmark_net_id, + .id = &act_connmark_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c index 22847ee009ef..1366adf9b909 100644 --- a/net/sched/act_csum.c +++ b/net/sched/act_csum.c @@ -37,7 +37,6 @@ static const struct nla_policy csum_policy[TCA_CSUM_MAX + 1] = { [TCA_CSUM_PARMS] = { .len = sizeof(struct tc_csum), }, }; -static unsigned int csum_net_id; static struct tc_action_ops act_csum_ops; static int tcf_csum_init(struct net *net, struct nlattr *nla, @@ -45,7 +44,7 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, csum_net_id); + struct tc_action_net *tn = net_generic(net, act_csum_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct tcf_csum_params *params_new; struct nlattr *tb[TCA_CSUM_MAX + 1]; @@ -673,23 +672,6 @@ static void tcf_csum_cleanup(struct tc_action *a) kfree_rcu(params, rcu); } -static int tcf_csum_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, csum_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_csum_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, csum_net_id); - - return tcf_idr_search(tn, a, index); -} - static size_t tcf_csum_get_fill_size(const struct tc_action *act) { return nla_total_size(sizeof(struct tc_csum)); @@ -722,8 +704,6 @@ static struct tc_action_ops act_csum_ops = { .dump = tcf_csum_dump, .init = tcf_csum_init, .cleanup = tcf_csum_cleanup, - .walk = tcf_csum_walker, - .lookup = tcf_csum_search, .get_fill_size = tcf_csum_get_fill_size, .offload_act_setup = tcf_csum_offload_act_setup, .size = sizeof(struct tcf_csum), @@ -731,20 +711,20 @@ static struct tc_action_ops act_csum_ops = { static __net_init int csum_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, csum_net_id); + struct tc_action_net *tn = net_generic(net, act_csum_ops.net_id); return tc_action_net_init(net, tn, &act_csum_ops); } static void __net_exit csum_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, csum_net_id); + tc_action_net_exit(net_list, act_csum_ops.net_id); } static struct pernet_operations csum_net_ops = { .init = csum_init_net, .exit_batch = csum_exit_net, - .id = &csum_net_id, + .id = &act_csum_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 5950974ae8f6..b38d91d6b249 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -649,7 +649,6 @@ static void tcf_ct_flow_tables_uninit(void) } static struct tc_action_ops act_ct_ops; -static unsigned int ct_net_id; struct tc_ct_action_net { struct tc_action_net tn; /* Must be first */ @@ -697,7 +696,6 @@ drop_ct: static int tcf_ct_skb_network_trim(struct sk_buff *skb, int family) { unsigned int len; - int err; switch (family) { case NFPROTO_IPV4: @@ -711,9 +709,7 @@ static int tcf_ct_skb_network_trim(struct sk_buff *skb, int family) len = skb->len; } - err = pskb_trim_rcsum(skb, len); - - return err; + return pskb_trim_rcsum(skb, len); } static u8 tcf_ct_skb_nf_family(struct sk_buff *skb) @@ -1255,7 +1251,7 @@ static int tcf_ct_fill_params(struct net *net, struct nlattr **tb, struct netlink_ext_ack *extack) { - struct tc_ct_action_net *tn = net_generic(net, ct_net_id); + struct tc_ct_action_net *tn = net_generic(net, act_ct_ops.net_id); struct nf_conntrack_zone zone; struct nf_conn *tmpl; int err; @@ -1330,7 +1326,7 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, ct_net_id); + struct tc_action_net *tn = net_generic(net, act_ct_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct tcf_ct_params *params = NULL; struct nlattr *tb[TCA_CT_MAX + 1]; @@ -1561,23 +1557,6 @@ nla_put_failure: return -1; } -static int tcf_ct_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, ct_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_ct_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, ct_net_id); - - return tcf_idr_search(tn, a, index); -} - static void tcf_stats_update(struct tc_action *a, u64 bytes, u64 packets, u64 drops, u64 lastuse, bool hw) { @@ -1616,8 +1595,6 @@ static struct tc_action_ops act_ct_ops = { .dump = tcf_ct_dump, .init = tcf_ct_init, .cleanup = tcf_ct_cleanup, - .walk = tcf_ct_walker, - .lookup = tcf_ct_search, .stats_update = tcf_stats_update, .offload_act_setup = tcf_ct_offload_act_setup, .size = sizeof(struct tcf_ct), @@ -1626,7 +1603,7 @@ static struct tc_action_ops act_ct_ops = { static __net_init int ct_init_net(struct net *net) { unsigned int n_bits = sizeof_field(struct tcf_ct_params, labels) * 8; - struct tc_ct_action_net *tn = net_generic(net, ct_net_id); + struct tc_ct_action_net *tn = net_generic(net, act_ct_ops.net_id); if (nf_connlabels_get(net, n_bits - 1)) { tn->labels = false; @@ -1644,20 +1621,20 @@ static void __net_exit ct_exit_net(struct list_head *net_list) rtnl_lock(); list_for_each_entry(net, net_list, exit_list) { - struct tc_ct_action_net *tn = net_generic(net, ct_net_id); + struct tc_ct_action_net *tn = net_generic(net, act_ct_ops.net_id); if (tn->labels) nf_connlabels_put(net); } rtnl_unlock(); - tc_action_net_exit(net_list, ct_net_id); + tc_action_net_exit(net_list, act_ct_ops.net_id); } static struct pernet_operations ct_net_ops = { .init = ct_init_net, .exit_batch = ct_exit_net, - .id = &ct_net_id, + .id = &act_ct_ops.net_id, .size = sizeof(struct tc_ct_action_net), }; diff --git a/net/sched/act_ctinfo.c b/net/sched/act_ctinfo.c index 0281e45987a4..d4102f0a9abd 100644 --- a/net/sched/act_ctinfo.c +++ b/net/sched/act_ctinfo.c @@ -25,7 +25,6 @@ #include <net/netfilter/nf_conntrack_zones.h> static struct tc_action_ops act_ctinfo_ops; -static unsigned int ctinfo_net_id; static void tcf_ctinfo_dscp_set(struct nf_conn *ct, struct tcf_ctinfo *ca, struct tcf_ctinfo_params *cp, @@ -157,7 +156,7 @@ static int tcf_ctinfo_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, ctinfo_net_id); + struct tc_action_net *tn = net_generic(net, act_ctinfo_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; u32 dscpmask = 0, dscpstatemask, index; struct nlattr *tb[TCA_CTINFO_MAX + 1]; @@ -342,23 +341,6 @@ nla_put_failure: return -1; } -static int tcf_ctinfo_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, ctinfo_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_ctinfo_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, ctinfo_net_id); - - return tcf_idr_search(tn, a, index); -} - static void tcf_ctinfo_cleanup(struct tc_action *a) { struct tcf_ctinfo *ci = to_ctinfo(a); @@ -377,27 +359,25 @@ static struct tc_action_ops act_ctinfo_ops = { .dump = tcf_ctinfo_dump, .init = tcf_ctinfo_init, .cleanup= tcf_ctinfo_cleanup, - .walk = tcf_ctinfo_walker, - .lookup = tcf_ctinfo_search, .size = sizeof(struct tcf_ctinfo), }; static __net_init int ctinfo_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, ctinfo_net_id); + struct tc_action_net *tn = net_generic(net, act_ctinfo_ops.net_id); return tc_action_net_init(net, tn, &act_ctinfo_ops); } static void __net_exit ctinfo_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, ctinfo_net_id); + tc_action_net_exit(net_list, act_ctinfo_ops.net_id); } static struct pernet_operations ctinfo_net_ops = { .init = ctinfo_init_net, .exit_batch = ctinfo_exit_net, - .id = &ctinfo_net_id, + .id = &act_ctinfo_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index ac29d1065232..abe1bcc5c797 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -19,7 +19,6 @@ #include <linux/tc_act/tc_gact.h> #include <net/tc_act/tc_gact.h> -static unsigned int gact_net_id; static struct tc_action_ops act_gact_ops; #ifdef CONFIG_GACT_PROB @@ -55,7 +54,7 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, gact_net_id); + struct tc_action_net *tn = net_generic(net, act_gact_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_GACT_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -222,23 +221,6 @@ nla_put_failure: return -1; } -static int tcf_gact_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, gact_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_gact_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, gact_net_id); - - return tcf_idr_search(tn, a, index); -} - static size_t tcf_gact_get_fill_size(const struct tc_action *act) { size_t sz = nla_total_size(sizeof(struct tc_gact)); /* TCA_GACT_PARMS */ @@ -308,8 +290,6 @@ static struct tc_action_ops act_gact_ops = { .stats_update = tcf_gact_stats_update, .dump = tcf_gact_dump, .init = tcf_gact_init, - .walk = tcf_gact_walker, - .lookup = tcf_gact_search, .get_fill_size = tcf_gact_get_fill_size, .offload_act_setup = tcf_gact_offload_act_setup, .size = sizeof(struct tcf_gact), @@ -317,20 +297,20 @@ static struct tc_action_ops act_gact_ops = { static __net_init int gact_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, gact_net_id); + struct tc_action_net *tn = net_generic(net, act_gact_ops.net_id); return tc_action_net_init(net, tn, &act_gact_ops); } static void __net_exit gact_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, gact_net_id); + tc_action_net_exit(net_list, act_gact_ops.net_id); } static struct pernet_operations gact_net_ops = { .init = gact_init_net, .exit_batch = gact_exit_net, - .id = &gact_net_id, + .id = &act_gact_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c index fd5155274733..3049878e7315 100644 --- a/net/sched/act_gate.c +++ b/net/sched/act_gate.c @@ -15,7 +15,6 @@ #include <net/pkt_cls.h> #include <net/tc_act/tc_gate.h> -static unsigned int gate_net_id; static struct tc_action_ops act_gate_ops; static ktime_t gate_get_time(struct tcf_gate *gact) @@ -298,7 +297,7 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, gate_net_id); + struct tc_action_net *tn = net_generic(net, act_gate_ops.net_id); enum tk_offsets tk_offset = TK_OFFS_TAI; bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_GATE_MAX + 1]; @@ -565,16 +564,6 @@ nla_put_failure: return -1; } -static int tcf_gate_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, gate_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - static void tcf_gate_stats_update(struct tc_action *a, u64 bytes, u64 packets, u64 drops, u64 lastuse, bool hw) { @@ -585,13 +574,6 @@ static void tcf_gate_stats_update(struct tc_action *a, u64 bytes, u64 packets, tm->lastuse = max_t(u64, tm->lastuse, lastuse); } -static int tcf_gate_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, gate_net_id); - - return tcf_idr_search(tn, a, index); -} - static size_t tcf_gate_get_fill_size(const struct tc_action *act) { return nla_total_size(sizeof(struct tc_gate)); @@ -654,30 +636,28 @@ static struct tc_action_ops act_gate_ops = { .dump = tcf_gate_dump, .init = tcf_gate_init, .cleanup = tcf_gate_cleanup, - .walk = tcf_gate_walker, .stats_update = tcf_gate_stats_update, .get_fill_size = tcf_gate_get_fill_size, - .lookup = tcf_gate_search, .offload_act_setup = tcf_gate_offload_act_setup, .size = sizeof(struct tcf_gate), }; static __net_init int gate_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, gate_net_id); + struct tc_action_net *tn = net_generic(net, act_gate_ops.net_id); return tc_action_net_init(net, tn, &act_gate_ops); } static void __net_exit gate_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, gate_net_id); + tc_action_net_exit(net_list, act_gate_ops.net_id); } static struct pernet_operations gate_net_ops = { .init = gate_init_net, .exit_batch = gate_exit_net, - .id = &gate_net_id, + .id = &act_gate_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c index 41ba55e60b1b..41d63b33461d 100644 --- a/net/sched/act_ife.c +++ b/net/sched/act_ife.c @@ -30,7 +30,6 @@ #include <linux/etherdevice.h> #include <net/ife.h> -static unsigned int ife_net_id; static int max_metacnt = IFE_META_MAX + 1; static struct tc_action_ops act_ife_ops; @@ -482,7 +481,7 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, ife_net_id); + struct tc_action_net *tn = net_generic(net, act_ife_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_IFE_MAX + 1]; struct nlattr *tb2[IFE_META_MAX + 1]; @@ -878,23 +877,6 @@ static int tcf_ife_act(struct sk_buff *skb, const struct tc_action *a, return tcf_ife_decode(skb, a, res); } -static int tcf_ife_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, ife_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_ife_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, ife_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_ife_ops = { .kind = "ife", .id = TCA_ID_IFE, @@ -903,27 +885,25 @@ static struct tc_action_ops act_ife_ops = { .dump = tcf_ife_dump, .cleanup = tcf_ife_cleanup, .init = tcf_ife_init, - .walk = tcf_ife_walker, - .lookup = tcf_ife_search, .size = sizeof(struct tcf_ife_info), }; static __net_init int ife_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, ife_net_id); + struct tc_action_net *tn = net_generic(net, act_ife_ops.net_id); return tc_action_net_init(net, tn, &act_ife_ops); } static void __net_exit ife_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, ife_net_id); + tc_action_net_exit(net_list, act_ife_ops.net_id); } static struct pernet_operations ife_net_ops = { .init = ife_init_net, .exit_batch = ife_exit_net, - .id = &ife_net_id, + .id = &act_ife_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index 2f3d507c24a1..1625e1037416 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -24,10 +24,7 @@ #include <linux/netfilter_ipv4/ip_tables.h> -static unsigned int ipt_net_id; static struct tc_action_ops act_ipt_ops; - -static unsigned int xt_net_id; static struct tc_action_ops act_xt_ops; static int ipt_init_target(struct net *net, struct xt_entry_target *t, @@ -206,8 +203,8 @@ static int tcf_ipt_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - return __tcf_ipt_init(net, ipt_net_id, nla, est, a, &act_ipt_ops, - tp, flags); + return __tcf_ipt_init(net, act_ipt_ops.net_id, nla, est, + a, &act_ipt_ops, tp, flags); } static int tcf_xt_init(struct net *net, struct nlattr *nla, @@ -215,8 +212,8 @@ static int tcf_xt_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - return __tcf_ipt_init(net, xt_net_id, nla, est, a, &act_xt_ops, - tp, flags); + return __tcf_ipt_init(net, act_xt_ops.net_id, nla, est, + a, &act_xt_ops, tp, flags); } static int tcf_ipt_act(struct sk_buff *skb, const struct tc_action *a, @@ -316,23 +313,6 @@ nla_put_failure: return -1; } -static int tcf_ipt_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, ipt_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_ipt_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, ipt_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_ipt_ops = { .kind = "ipt", .id = TCA_ID_IPT, @@ -341,47 +321,28 @@ static struct tc_action_ops act_ipt_ops = { .dump = tcf_ipt_dump, .cleanup = tcf_ipt_release, .init = tcf_ipt_init, - .walk = tcf_ipt_walker, - .lookup = tcf_ipt_search, .size = sizeof(struct tcf_ipt), }; static __net_init int ipt_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, ipt_net_id); + struct tc_action_net *tn = net_generic(net, act_ipt_ops.net_id); return tc_action_net_init(net, tn, &act_ipt_ops); } static void __net_exit ipt_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, ipt_net_id); + tc_action_net_exit(net_list, act_ipt_ops.net_id); } static struct pernet_operations ipt_net_ops = { .init = ipt_init_net, .exit_batch = ipt_exit_net, - .id = &ipt_net_id, + .id = &act_ipt_ops.net_id, .size = sizeof(struct tc_action_net), }; -static int tcf_xt_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, xt_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_xt_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, xt_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_xt_ops = { .kind = "xt", .id = TCA_ID_XT, @@ -390,27 +351,25 @@ static struct tc_action_ops act_xt_ops = { .dump = tcf_ipt_dump, .cleanup = tcf_ipt_release, .init = tcf_xt_init, - .walk = tcf_xt_walker, - .lookup = tcf_xt_search, .size = sizeof(struct tcf_ipt), }; static __net_init int xt_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, xt_net_id); + struct tc_action_net *tn = net_generic(net, act_xt_ops.net_id); return tc_action_net_init(net, tn, &act_xt_ops); } static void __net_exit xt_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, xt_net_id); + tc_action_net_exit(net_list, act_xt_ops.net_id); } static struct pernet_operations xt_net_ops = { .init = xt_init_net, .exit_batch = xt_exit_net, - .id = &xt_net_id, + .id = &act_xt_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index a1d70cf86843..b8ad6ae282c0 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -86,7 +86,6 @@ static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = { [TCA_MIRRED_PARMS] = { .len = sizeof(struct tc_mirred) }, }; -static unsigned int mirred_net_id; static struct tc_action_ops act_mirred_ops; static int tcf_mirred_init(struct net *net, struct nlattr *nla, @@ -94,7 +93,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, mirred_net_id); + struct tc_action_net *tn = net_generic(net, act_mirred_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_MIRRED_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -306,8 +305,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, /* let's the caller reinsert the packet, if possible */ if (use_reinsert) { - res->ingress = want_ingress; - err = tcf_mirred_forward(res->ingress, skb); + err = tcf_mirred_forward(want_ingress, skb); if (err) tcf_action_inc_overlimit_qstats(&m->common); __this_cpu_dec(mirred_rec_level); @@ -373,23 +371,6 @@ nla_put_failure: return -1; } -static int tcf_mirred_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, mirred_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_mirred_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, mirred_net_id); - - return tcf_idr_search(tn, a, index); -} - static int mirred_device_event(struct notifier_block *unused, unsigned long event, void *ptr) { @@ -510,8 +491,6 @@ static struct tc_action_ops act_mirred_ops = { .dump = tcf_mirred_dump, .cleanup = tcf_mirred_release, .init = tcf_mirred_init, - .walk = tcf_mirred_walker, - .lookup = tcf_mirred_search, .get_fill_size = tcf_mirred_get_fill_size, .offload_act_setup = tcf_mirred_offload_act_setup, .size = sizeof(struct tcf_mirred), @@ -520,20 +499,20 @@ static struct tc_action_ops act_mirred_ops = { static __net_init int mirred_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, mirred_net_id); + struct tc_action_net *tn = net_generic(net, act_mirred_ops.net_id); return tc_action_net_init(net, tn, &act_mirred_ops); } static void __net_exit mirred_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, mirred_net_id); + tc_action_net_exit(net_list, act_mirred_ops.net_id); } static struct pernet_operations mirred_net_ops = { .init = mirred_init_net, .exit_batch = mirred_exit_net, - .id = &mirred_net_id, + .id = &act_mirred_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_mpls.c b/net/sched/act_mpls.c index adabeccb63e1..8ad25cc8ccd5 100644 --- a/net/sched/act_mpls.c +++ b/net/sched/act_mpls.c @@ -15,7 +15,6 @@ #include <net/pkt_cls.h> #include <net/tc_act/tc_mpls.h> -static unsigned int mpls_net_id; static struct tc_action_ops act_mpls_ops; #define ACT_MPLS_TTL_DEFAULT 255 @@ -155,7 +154,7 @@ static int tcf_mpls_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, mpls_net_id); + struct tc_action_net *tn = net_generic(net, act_mpls_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_MPLS_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -367,23 +366,6 @@ nla_put_failure: return -EMSGSIZE; } -static int tcf_mpls_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, mpls_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_mpls_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, mpls_net_id); - - return tcf_idr_search(tn, a, index); -} - static int tcf_mpls_offload_act_setup(struct tc_action *act, void *entry_data, u32 *index_inc, bool bind, struct netlink_ext_ack *extack) @@ -451,28 +433,26 @@ static struct tc_action_ops act_mpls_ops = { .dump = tcf_mpls_dump, .init = tcf_mpls_init, .cleanup = tcf_mpls_cleanup, - .walk = tcf_mpls_walker, - .lookup = tcf_mpls_search, .offload_act_setup = tcf_mpls_offload_act_setup, .size = sizeof(struct tcf_mpls), }; static __net_init int mpls_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, mpls_net_id); + struct tc_action_net *tn = net_generic(net, act_mpls_ops.net_id); return tc_action_net_init(net, tn, &act_mpls_ops); } static void __net_exit mpls_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, mpls_net_id); + tc_action_net_exit(net_list, act_mpls_ops.net_id); } static struct pernet_operations mpls_net_ops = { .init = mpls_init_net, .exit_batch = mpls_exit_net, - .id = &mpls_net_id, + .id = &act_mpls_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c index 2a39b3729e84..9265145f1040 100644 --- a/net/sched/act_nat.c +++ b/net/sched/act_nat.c @@ -26,7 +26,6 @@ #include <net/udp.h> -static unsigned int nat_net_id; static struct tc_action_ops act_nat_ops; static const struct nla_policy nat_policy[TCA_NAT_MAX + 1] = { @@ -37,7 +36,7 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **a, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, nat_net_id); + struct tc_action_net *tn = net_generic(net, act_nat_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_NAT_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -289,23 +288,6 @@ nla_put_failure: return -1; } -static int tcf_nat_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, nat_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_nat_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, nat_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_nat_ops = { .kind = "nat", .id = TCA_ID_NAT, @@ -313,27 +295,25 @@ static struct tc_action_ops act_nat_ops = { .act = tcf_nat_act, .dump = tcf_nat_dump, .init = tcf_nat_init, - .walk = tcf_nat_walker, - .lookup = tcf_nat_search, .size = sizeof(struct tcf_nat), }; static __net_init int nat_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, nat_net_id); + struct tc_action_net *tn = net_generic(net, act_nat_ops.net_id); return tc_action_net_init(net, tn, &act_nat_ops); } static void __net_exit nat_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, nat_net_id); + tc_action_net_exit(net_list, act_nat_ops.net_id); } static struct pernet_operations nat_net_ops = { .init = nat_init_net, .exit_batch = nat_exit_net, - .id = &nat_net_id, + .id = &act_nat_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index 823ee643371c..94ed5857ce67 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c @@ -21,7 +21,6 @@ #include <uapi/linux/tc_act/tc_pedit.h> #include <net/pkt_cls.h> -static unsigned int pedit_net_id; static struct tc_action_ops act_pedit_ops; static const struct nla_policy pedit_policy[TCA_PEDIT_MAX + 1] = { @@ -139,7 +138,7 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, pedit_net_id); + struct tc_action_net *tn = net_generic(net, act_pedit_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_PEDIT_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -492,23 +491,6 @@ nla_put_failure: return -1; } -static int tcf_pedit_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, pedit_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_pedit_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, pedit_net_id); - - return tcf_idr_search(tn, a, index); -} - static int tcf_pedit_offload_act_setup(struct tc_action *act, void *entry_data, u32 *index_inc, bool bind, struct netlink_ext_ack *extack) @@ -553,28 +535,26 @@ static struct tc_action_ops act_pedit_ops = { .dump = tcf_pedit_dump, .cleanup = tcf_pedit_cleanup, .init = tcf_pedit_init, - .walk = tcf_pedit_walker, - .lookup = tcf_pedit_search, .offload_act_setup = tcf_pedit_offload_act_setup, .size = sizeof(struct tcf_pedit), }; static __net_init int pedit_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, pedit_net_id); + struct tc_action_net *tn = net_generic(net, act_pedit_ops.net_id); return tc_action_net_init(net, tn, &act_pedit_ops); } static void __net_exit pedit_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, pedit_net_id); + tc_action_net_exit(net_list, act_pedit_ops.net_id); } static struct pernet_operations pedit_net_ops = { .init = pedit_init_net, .exit_batch = pedit_exit_net, - .id = &pedit_net_id, + .id = &act_pedit_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_police.c b/net/sched/act_police.c index b759628a47c2..0adb26e366a7 100644 --- a/net/sched/act_police.c +++ b/net/sched/act_police.c @@ -22,19 +22,8 @@ /* Each policer is serialized by its individual spinlock */ -static unsigned int police_net_id; static struct tc_action_ops act_police_ops; -static int tcf_police_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, police_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - static const struct nla_policy police_policy[TCA_POLICE_MAX + 1] = { [TCA_POLICE_RATE] = { .len = TC_RTAB_SIZE }, [TCA_POLICE_PEAKRATE] = { .len = TC_RTAB_SIZE }, @@ -58,7 +47,7 @@ static int tcf_police_init(struct net *net, struct nlattr *nla, struct tc_police *parm; struct tcf_police *police; struct qdisc_rate_table *R_tab = NULL, *P_tab = NULL; - struct tc_action_net *tn = net_generic(net, police_net_id); + struct tc_action_net *tn = net_generic(net, act_police_ops.net_id); struct tcf_police_params *new; bool exists = false; u32 index; @@ -412,13 +401,6 @@ nla_put_failure: return -1; } -static int tcf_police_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, police_net_id); - - return tcf_idr_search(tn, a, index); -} - static int tcf_police_act_to_flow_act(int tc_act, u32 *extval, struct netlink_ext_ack *extack) { @@ -513,8 +495,6 @@ static struct tc_action_ops act_police_ops = { .act = tcf_police_act, .dump = tcf_police_dump, .init = tcf_police_init, - .walk = tcf_police_walker, - .lookup = tcf_police_search, .cleanup = tcf_police_cleanup, .offload_act_setup = tcf_police_offload_act_setup, .size = sizeof(struct tcf_police), @@ -522,20 +502,20 @@ static struct tc_action_ops act_police_ops = { static __net_init int police_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, police_net_id); + struct tc_action_net *tn = net_generic(net, act_police_ops.net_id); return tc_action_net_init(net, tn, &act_police_ops); } static void __net_exit police_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, police_net_id); + tc_action_net_exit(net_list, act_police_ops.net_id); } static struct pernet_operations police_net_ops = { .init = police_init_net, .exit_batch = police_exit_net, - .id = &police_net_id, + .id = &act_police_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c index 2f7f5e44d28c..5ba36f70e3a1 100644 --- a/net/sched/act_sample.c +++ b/net/sched/act_sample.c @@ -23,7 +23,6 @@ #include <linux/if_arp.h> -static unsigned int sample_net_id; static struct tc_action_ops act_sample_ops; static const struct nla_policy sample_policy[TCA_SAMPLE_MAX + 1] = { @@ -38,7 +37,7 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, sample_net_id); + struct tc_action_net *tn = net_generic(net, act_sample_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_SAMPLE_MAX + 1]; struct psample_group *psample_group; @@ -241,23 +240,6 @@ nla_put_failure: return -1; } -static int tcf_sample_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, sample_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_sample_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, sample_net_id); - - return tcf_idr_search(tn, a, index); -} - static void tcf_psample_group_put(void *priv) { struct psample_group *group = priv; @@ -321,8 +303,6 @@ static struct tc_action_ops act_sample_ops = { .dump = tcf_sample_dump, .init = tcf_sample_init, .cleanup = tcf_sample_cleanup, - .walk = tcf_sample_walker, - .lookup = tcf_sample_search, .get_psample_group = tcf_sample_get_group, .offload_act_setup = tcf_sample_offload_act_setup, .size = sizeof(struct tcf_sample), @@ -330,20 +310,20 @@ static struct tc_action_ops act_sample_ops = { static __net_init int sample_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, sample_net_id); + struct tc_action_net *tn = net_generic(net, act_sample_ops.net_id); return tc_action_net_init(net, tn, &act_sample_ops); } static void __net_exit sample_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, sample_net_id); + tc_action_net_exit(net_list, act_sample_ops.net_id); } static struct pernet_operations sample_net_ops = { .init = sample_init_net, .exit_batch = sample_exit_net, - .id = &sample_net_id, + .id = &act_sample_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c index 8c1d60bde93e..18d376135461 100644 --- a/net/sched/act_simple.c +++ b/net/sched/act_simple.c @@ -18,7 +18,6 @@ #include <linux/tc_act/tc_defact.h> #include <net/tc_act/tc_defact.h> -static unsigned int simp_net_id; static struct tc_action_ops act_simp_ops; #define SIMP_MAX_DATA 32 @@ -89,7 +88,7 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, simp_net_id); + struct tc_action_net *tn = net_generic(net, act_simp_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_DEF_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -198,23 +197,6 @@ nla_put_failure: return -1; } -static int tcf_simp_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, simp_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_simp_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, simp_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_simp_ops = { .kind = "simple", .id = TCA_ID_SIMP, @@ -223,27 +205,25 @@ static struct tc_action_ops act_simp_ops = { .dump = tcf_simp_dump, .cleanup = tcf_simp_release, .init = tcf_simp_init, - .walk = tcf_simp_walker, - .lookup = tcf_simp_search, .size = sizeof(struct tcf_defact), }; static __net_init int simp_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, simp_net_id); + struct tc_action_net *tn = net_generic(net, act_simp_ops.net_id); return tc_action_net_init(net, tn, &act_simp_ops); } static void __net_exit simp_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, simp_net_id); + tc_action_net_exit(net_list, act_simp_ops.net_id); } static struct pernet_operations simp_net_ops = { .init = simp_init_net, .exit_batch = simp_exit_net, - .id = &simp_net_id, + .id = &act_simp_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c index e3bd11dfe1ca..7f598784fd30 100644 --- a/net/sched/act_skbedit.c +++ b/net/sched/act_skbedit.c @@ -20,7 +20,6 @@ #include <linux/tc_act/tc_skbedit.h> #include <net/tc_act/tc_skbedit.h> -static unsigned int skbedit_net_id; static struct tc_action_ops act_skbedit_ops; static u16 tcf_skbedit_hash(struct tcf_skbedit_params *params, @@ -118,7 +117,7 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 act_flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, skbedit_net_id); + struct tc_action_net *tn = net_generic(net, act_skbedit_ops.net_id); bool bind = act_flags & TCA_ACT_FLAGS_BIND; struct tcf_skbedit_params *params_new; struct nlattr *tb[TCA_SKBEDIT_MAX + 1]; @@ -347,23 +346,6 @@ static void tcf_skbedit_cleanup(struct tc_action *a) kfree_rcu(params, rcu); } -static int tcf_skbedit_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, skbedit_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_skbedit_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, skbedit_net_id); - - return tcf_idr_search(tn, a, index); -} - static size_t tcf_skbedit_get_fill_size(const struct tc_action *act) { return nla_total_size(sizeof(struct tc_skbedit)) @@ -428,29 +410,27 @@ static struct tc_action_ops act_skbedit_ops = { .dump = tcf_skbedit_dump, .init = tcf_skbedit_init, .cleanup = tcf_skbedit_cleanup, - .walk = tcf_skbedit_walker, .get_fill_size = tcf_skbedit_get_fill_size, - .lookup = tcf_skbedit_search, .offload_act_setup = tcf_skbedit_offload_act_setup, .size = sizeof(struct tcf_skbedit), }; static __net_init int skbedit_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, skbedit_net_id); + struct tc_action_net *tn = net_generic(net, act_skbedit_ops.net_id); return tc_action_net_init(net, tn, &act_skbedit_ops); } static void __net_exit skbedit_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, skbedit_net_id); + tc_action_net_exit(net_list, act_skbedit_ops.net_id); } static struct pernet_operations skbedit_net_ops = { .init = skbedit_init_net, .exit_batch = skbedit_exit_net, - .id = &skbedit_net_id, + .id = &act_skbedit_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c index 2083612d8780..d98758a63934 100644 --- a/net/sched/act_skbmod.c +++ b/net/sched/act_skbmod.c @@ -19,7 +19,6 @@ #include <linux/tc_act/tc_skbmod.h> #include <net/tc_act/tc_skbmod.h> -static unsigned int skbmod_net_id; static struct tc_action_ops act_skbmod_ops; static int tcf_skbmod_act(struct sk_buff *skb, const struct tc_action *a, @@ -103,7 +102,7 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, skbmod_net_id); + struct tc_action_net *tn = net_generic(net, act_skbmod_ops.net_id); bool ovr = flags & TCA_ACT_FLAGS_REPLACE; bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_SKBMOD_MAX + 1]; @@ -276,23 +275,6 @@ nla_put_failure: return -1; } -static int tcf_skbmod_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, skbmod_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tcf_skbmod_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, skbmod_net_id); - - return tcf_idr_search(tn, a, index); -} - static struct tc_action_ops act_skbmod_ops = { .kind = "skbmod", .id = TCA_ACT_SKBMOD, @@ -301,27 +283,25 @@ static struct tc_action_ops act_skbmod_ops = { .dump = tcf_skbmod_dump, .init = tcf_skbmod_init, .cleanup = tcf_skbmod_cleanup, - .walk = tcf_skbmod_walker, - .lookup = tcf_skbmod_search, .size = sizeof(struct tcf_skbmod), }; static __net_init int skbmod_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, skbmod_net_id); + struct tc_action_net *tn = net_generic(net, act_skbmod_ops.net_id); return tc_action_net_init(net, tn, &act_skbmod_ops); } static void __net_exit skbmod_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, skbmod_net_id); + tc_action_net_exit(net_list, act_skbmod_ops.net_id); } static struct pernet_operations skbmod_net_ops = { .init = skbmod_init_net, .exit_batch = skbmod_exit_net, - .id = &skbmod_net_id, + .id = &act_skbmod_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c index 856dc23cef8c..2691a3d8e451 100644 --- a/net/sched/act_tunnel_key.c +++ b/net/sched/act_tunnel_key.c @@ -20,7 +20,6 @@ #include <linux/tc_act/tc_tunnel_key.h> #include <net/tc_act/tc_tunnel_key.h> -static unsigned int tunnel_key_net_id; static struct tc_action_ops act_tunnel_key_ops; static int tunnel_key_act(struct sk_buff *skb, const struct tc_action *a, @@ -358,7 +357,7 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 act_flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, tunnel_key_net_id); + struct tc_action_net *tn = net_generic(net, act_tunnel_key_ops.net_id); bool bind = act_flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_TUNNEL_KEY_MAX + 1]; struct tcf_tunnel_key_params *params_new; @@ -770,23 +769,6 @@ nla_put_failure: return -1; } -static int tunnel_key_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, tunnel_key_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - -static int tunnel_key_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, tunnel_key_net_id); - - return tcf_idr_search(tn, a, index); -} - static void tcf_tunnel_encap_put_tunnel(void *priv) { struct ip_tunnel_info *tunnel = priv; @@ -850,28 +832,26 @@ static struct tc_action_ops act_tunnel_key_ops = { .dump = tunnel_key_dump, .init = tunnel_key_init, .cleanup = tunnel_key_release, - .walk = tunnel_key_walker, - .lookup = tunnel_key_search, .offload_act_setup = tcf_tunnel_key_offload_act_setup, .size = sizeof(struct tcf_tunnel_key), }; static __net_init int tunnel_key_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, tunnel_key_net_id); + struct tc_action_net *tn = net_generic(net, act_tunnel_key_ops.net_id); return tc_action_net_init(net, tn, &act_tunnel_key_ops); } static void __net_exit tunnel_key_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, tunnel_key_net_id); + tc_action_net_exit(net_list, act_tunnel_key_ops.net_id); } static struct pernet_operations tunnel_key_net_ops = { .init = tunnel_key_init_net, .exit_batch = tunnel_key_exit_net, - .id = &tunnel_key_net_id, + .id = &act_tunnel_key_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c index 68b5e772386a..7b24e898a3e6 100644 --- a/net/sched/act_vlan.c +++ b/net/sched/act_vlan.c @@ -16,7 +16,6 @@ #include <linux/tc_act/tc_vlan.h> #include <net/tc_act/tc_vlan.h> -static unsigned int vlan_net_id; static struct tc_action_ops act_vlan_ops; static int tcf_vlan_act(struct sk_buff *skb, const struct tc_action *a, @@ -117,7 +116,7 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { - struct tc_action_net *tn = net_generic(net, vlan_net_id); + struct tc_action_net *tn = net_generic(net, act_vlan_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; struct nlattr *tb[TCA_VLAN_MAX + 1]; struct tcf_chain *goto_ch = NULL; @@ -333,16 +332,6 @@ nla_put_failure: return -1; } -static int tcf_vlan_walker(struct net *net, struct sk_buff *skb, - struct netlink_callback *cb, int type, - const struct tc_action_ops *ops, - struct netlink_ext_ack *extack) -{ - struct tc_action_net *tn = net_generic(net, vlan_net_id); - - return tcf_generic_walker(tn, skb, cb, type, ops, extack); -} - static void tcf_vlan_stats_update(struct tc_action *a, u64 bytes, u64 packets, u64 drops, u64 lastuse, bool hw) { @@ -353,13 +342,6 @@ static void tcf_vlan_stats_update(struct tc_action *a, u64 bytes, u64 packets, tm->lastuse = max_t(u64, tm->lastuse, lastuse); } -static int tcf_vlan_search(struct net *net, struct tc_action **a, u32 index) -{ - struct tc_action_net *tn = net_generic(net, vlan_net_id); - - return tcf_idr_search(tn, a, index); -} - static size_t tcf_vlan_get_fill_size(const struct tc_action *act) { return nla_total_size(sizeof(struct tc_vlan)) @@ -438,30 +420,28 @@ static struct tc_action_ops act_vlan_ops = { .dump = tcf_vlan_dump, .init = tcf_vlan_init, .cleanup = tcf_vlan_cleanup, - .walk = tcf_vlan_walker, .stats_update = tcf_vlan_stats_update, .get_fill_size = tcf_vlan_get_fill_size, - .lookup = tcf_vlan_search, .offload_act_setup = tcf_vlan_offload_act_setup, .size = sizeof(struct tcf_vlan), }; static __net_init int vlan_init_net(struct net *net) { - struct tc_action_net *tn = net_generic(net, vlan_net_id); + struct tc_action_net *tn = net_generic(net, act_vlan_ops.net_id); return tc_action_net_init(net, tn, &act_vlan_ops); } static void __net_exit vlan_exit_net(struct list_head *net_list) { - tc_action_net_exit(net_list, vlan_net_id); + tc_action_net_exit(net_list, act_vlan_ops.net_id); } static struct pernet_operations vlan_net_ops = { .init = vlan_init_net, .exit_batch = vlan_exit_net, - .id = &vlan_net_id, + .id = &act_vlan_ops.net_id, .size = sizeof(struct tc_action_net), }; diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 51d175f3fbcb..50566db45949 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1977,9 +1977,6 @@ static int tc_new_tfilter(struct sk_buff *skb, struct nlmsghdr *n, bool rtnl_held = false; u32 flags; - if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) - return -EPERM; - replay: tp_created = 0; @@ -2209,9 +2206,6 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n, int err; bool rtnl_held = false; - if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) - return -EPERM; - err = nlmsg_parse_deprecated(n, sizeof(*t), tca, TCA_MAX, rtm_tca_policy, extack); if (err < 0) @@ -2827,10 +2821,6 @@ static int tc_ctl_chain(struct sk_buff *skb, struct nlmsghdr *n, unsigned long cl; int err; - if (n->nlmsg_type != RTM_GETCHAIN && - !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) - return -EPERM; - replay: q = NULL; err = nlmsg_parse_deprecated(n, sizeof(*t), tca, TCA_MAX, @@ -3640,9 +3630,6 @@ int tcf_qevent_init(struct tcf_qevent *qe, struct Qdisc *sch, if (err) return err; - if (!block_index) - return 0; - qe->info.binder_type = binder_type; qe->info.chain_head_change = tcf_chain_head_change_dflt; qe->info.chain_head_change_priv = &qe->filter_chain; diff --git a/net/sched/cls_basic.c b/net/sched/cls_basic.c index 8158fc9ee1ab..d229ce99e554 100644 --- a/net/sched/cls_basic.c +++ b/net/sched/cls_basic.c @@ -251,15 +251,8 @@ static void basic_walk(struct tcf_proto *tp, struct tcf_walker *arg, struct basic_filter *f; list_for_each_entry(f, &head->flist, link) { - if (arg->count < arg->skip) - goto skip; - - if (arg->fn(tp, f, arg) < 0) { - arg->stop = 1; + if (!tc_cls_stats_dump(tp, arg, f)) break; - } -skip: - arg->count++; } } @@ -268,12 +261,7 @@ static void basic_bind_class(void *fh, u32 classid, unsigned long cl, void *q, { struct basic_filter *f = fh; - if (f && f->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &f->res, base); - else - __tcf_unbind_filter(q, &f->res); - } + tc_cls_bind_class(classid, cl, q, &f->res, base); } static int basic_dump(struct net *net, struct tcf_proto *tp, void *fh, diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c index c85b85a192bf..bc317b3eac12 100644 --- a/net/sched/cls_bpf.c +++ b/net/sched/cls_bpf.c @@ -635,12 +635,7 @@ static void cls_bpf_bind_class(void *fh, u32 classid, unsigned long cl, { struct cls_bpf_prog *prog = fh; - if (prog && prog->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &prog->res, base); - else - __tcf_unbind_filter(q, &prog->res); - } + tc_cls_bind_class(classid, cl, q, &prog->res, base); } static void cls_bpf_walk(struct tcf_proto *tp, struct tcf_walker *arg, @@ -650,14 +645,8 @@ static void cls_bpf_walk(struct tcf_proto *tp, struct tcf_walker *arg, struct cls_bpf_prog *prog; list_for_each_entry(prog, &head->plist, link) { - if (arg->count < arg->skip) - goto skip; - if (arg->fn(tp, prog, arg) < 0) { - arg->stop = 1; + if (!tc_cls_stats_dump(tp, arg, prog)) break; - } -skip: - arg->count++; } } diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c index 972303aa8edd..014cd3de7b5d 100644 --- a/net/sched/cls_flow.c +++ b/net/sched/cls_flow.c @@ -683,14 +683,8 @@ static void flow_walk(struct tcf_proto *tp, struct tcf_walker *arg, struct flow_filter *f; list_for_each_entry(f, &head->filters, list) { - if (arg->count < arg->skip) - goto skip; - if (arg->fn(tp, f, arg) < 0) { - arg->stop = 1; + if (!tc_cls_stats_dump(tp, arg, f)) break; - } -skip: - arg->count++; } } diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 041d63ff809a..25bc57ee6ea1 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -69,6 +69,7 @@ struct fl_flow_key { struct flow_dissector_key_hash hash; struct flow_dissector_key_num_of_vlans num_of_vlans; struct flow_dissector_key_pppoe pppoe; + struct flow_dissector_key_l2tpv3 l2tpv3; } __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. */ struct fl_flow_mask_range { @@ -712,6 +713,7 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = { [TCA_FLOWER_KEY_NUM_OF_VLANS] = { .type = NLA_U8 }, [TCA_FLOWER_KEY_PPPOE_SID] = { .type = NLA_U16 }, [TCA_FLOWER_KEY_PPP_PROTO] = { .type = NLA_U16 }, + [TCA_FLOWER_KEY_L2TPV3_SID] = { .type = NLA_U32 }, }; @@ -1790,6 +1792,11 @@ static int fl_set_key(struct net *net, struct nlattr **tb, fl_set_key_val(tb, key->arp.tha, TCA_FLOWER_KEY_ARP_THA, mask->arp.tha, TCA_FLOWER_KEY_ARP_THA_MASK, sizeof(key->arp.tha)); + } else if (key->basic.ip_proto == IPPROTO_L2TP) { + fl_set_key_val(tb, &key->l2tpv3.session_id, + TCA_FLOWER_KEY_L2TPV3_SID, + &mask->l2tpv3.session_id, TCA_FLOWER_UNSPEC, + sizeof(key->l2tpv3.session_id)); } if (key->basic.ip_proto == IPPROTO_TCP || @@ -1970,6 +1977,8 @@ static void fl_init_dissector(struct flow_dissector *dissector, FLOW_DISSECTOR_KEY_NUM_OF_VLANS, num_of_vlans); FL_KEY_SET_IF_MASKED(mask, keys, cnt, FLOW_DISSECTOR_KEY_PPPOE, pppoe); + FL_KEY_SET_IF_MASKED(mask, keys, cnt, + FLOW_DISSECTOR_KEY_L2TPV3, l2tpv3); skb_flow_dissector_init(dissector, keys, cnt); } @@ -3196,6 +3205,13 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net, mask->arp.tha, TCA_FLOWER_KEY_ARP_THA_MASK, sizeof(key->arp.tha)))) goto nla_put_failure; + else if (key->basic.ip_proto == IPPROTO_L2TP && + fl_dump_key_val(skb, &key->l2tpv3.session_id, + TCA_FLOWER_KEY_L2TPV3_SID, + &mask->l2tpv3.session_id, + TCA_FLOWER_UNSPEC, + sizeof(key->l2tpv3.session_id))) + goto nla_put_failure; if ((key->basic.ip_proto == IPPROTO_TCP || key->basic.ip_proto == IPPROTO_UDP || @@ -3389,12 +3405,7 @@ static void fl_bind_class(void *fh, u32 classid, unsigned long cl, void *q, { struct cls_fl_filter *f = fh; - if (f && f->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &f->res, base); - else - __tcf_unbind_filter(q, &f->res); - } + tc_cls_bind_class(classid, cl, q, &f->res, base); } static bool fl_delete_empty(struct tcf_proto *tp) diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c index 8654b0ce997c..a32351da968c 100644 --- a/net/sched/cls_fw.c +++ b/net/sched/cls_fw.c @@ -358,15 +358,8 @@ static void fw_walk(struct tcf_proto *tp, struct tcf_walker *arg, for (f = rtnl_dereference(head->ht[h]); f; f = rtnl_dereference(f->next)) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(tp, f, arg) < 0) { - arg->stop = 1; + if (!tc_cls_stats_dump(tp, arg, f)) return; - } - arg->count++; } } } @@ -423,12 +416,7 @@ static void fw_bind_class(void *fh, u32 classid, unsigned long cl, void *q, { struct fw_filter *f = fh; - if (f && f->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &f->res, base); - else - __tcf_unbind_filter(q, &f->res); - } + tc_cls_bind_class(classid, cl, q, &f->res, base); } static struct tcf_proto_ops cls_fw_ops __read_mostly = { diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c index 06cf22adbab7..39a5d9c170de 100644 --- a/net/sched/cls_matchall.c +++ b/net/sched/cls_matchall.c @@ -313,10 +313,7 @@ static int mall_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb, tc_cleanup_offload_action(&cls_mall.rule->action); kfree(cls_mall.rule); - if (err) - return err; - - return 0; + return err; } static void mall_stats_hw_filter(struct tcf_proto *tp, @@ -397,12 +394,7 @@ static void mall_bind_class(void *fh, u32 classid, unsigned long cl, void *q, { struct cls_mall_head *head = fh; - if (head && head->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &head->res, base); - else - __tcf_unbind_filter(q, &head->res); - } + tc_cls_bind_class(classid, cl, q, &head->res, base); } static struct tcf_proto_ops cls_mall_ops __read_mostly = { diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c index 48712bc51bda..9e43b929d4ca 100644 --- a/net/sched/cls_route.c +++ b/net/sched/cls_route.c @@ -488,7 +488,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb, } if (opt == NULL) - return handle ? -EINVAL : 0; + return -EINVAL; err = nla_parse_nested_deprecated(tb, TCA_ROUTE4_MAX, opt, route4_policy, NULL); @@ -496,7 +496,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb, return err; fold = *arg; - if (fold && handle && fold->handle != handle) + if (fold && fold->handle != handle) return -EINVAL; err = -ENOBUFS; @@ -587,15 +587,8 @@ static void route4_walk(struct tcf_proto *tp, struct tcf_walker *arg, for (f = rtnl_dereference(b->ht[h1]); f; f = rtnl_dereference(f->next)) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(tp, f, arg) < 0) { - arg->stop = 1; + if (!tc_cls_stats_dump(tp, arg, f)) return; - } - arg->count++; } } } @@ -656,12 +649,7 @@ static void route4_bind_class(void *fh, u32 classid, unsigned long cl, void *q, { struct route4_filter *f = fh; - if (f && f->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &f->res, base); - else - __tcf_unbind_filter(q, &f->res); - } + tc_cls_bind_class(classid, cl, q, &f->res, base); } static struct tcf_proto_ops cls_route4_ops __read_mostly = { diff --git a/net/sched/cls_rsvp.h b/net/sched/cls_rsvp.h index 5cd9d6b143c4..b00a7dbd0587 100644 --- a/net/sched/cls_rsvp.h +++ b/net/sched/cls_rsvp.h @@ -671,15 +671,8 @@ static void rsvp_walk(struct tcf_proto *tp, struct tcf_walker *arg, for (f = rtnl_dereference(s->ht[h1]); f; f = rtnl_dereference(f->next)) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(tp, f, arg) < 0) { - arg->stop = 1; + if (!tc_cls_stats_dump(tp, arg, f)) return; - } - arg->count++; } } } @@ -740,12 +733,7 @@ static void rsvp_bind_class(void *fh, u32 classid, unsigned long cl, void *q, { struct rsvp_filter *f = fh; - if (f && f->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &f->res, base); - else - __tcf_unbind_filter(q, &f->res); - } + tc_cls_bind_class(classid, cl, q, &f->res, base); } static struct tcf_proto_ops RSVP_OPS __read_mostly = { diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c index 742c7d49a958..1c9eeb98d826 100644 --- a/net/sched/cls_tcindex.c +++ b/net/sched/cls_tcindex.c @@ -566,13 +566,8 @@ static void tcindex_walk(struct tcf_proto *tp, struct tcf_walker *walker, for (i = 0; i < p->hash; i++) { if (!p->perfect[i].res.class) continue; - if (walker->count >= walker->skip) { - if (walker->fn(tp, p->perfect + i, walker) < 0) { - walker->stop = 1; - return; - } - } - walker->count++; + if (!tc_cls_stats_dump(tp, walker, p->perfect + i)) + return; } } if (!p->h) @@ -580,13 +575,8 @@ static void tcindex_walk(struct tcf_proto *tp, struct tcf_walker *walker, for (i = 0; i < p->hash; i++) { for (f = rtnl_dereference(p->h[i]); f; f = next) { next = rtnl_dereference(f->next); - if (walker->count >= walker->skip) { - if (walker->fn(tp, &f->result, walker) < 0) { - walker->stop = 1; - return; - } - } - walker->count++; + if (!tc_cls_stats_dump(tp, walker, &f->result)) + return; } } } @@ -701,12 +691,7 @@ static void tcindex_bind_class(void *fh, u32 classid, unsigned long cl, { struct tcindex_filter_result *r = fh; - if (r && r->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &r->res, base); - else - __tcf_unbind_filter(q, &r->res); - } + tc_cls_bind_class(classid, cl, q, &r->res, base); } static struct tcf_proto_ops cls_tcindex_ops __read_mostly = { diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index 4d27300c287c..34d25f7a0687 100644 --- a/net/sched/cls_u32.c +++ b/net/sched/cls_u32.c @@ -1040,7 +1040,11 @@ static int u32_change(struct net *net, struct sk_buff *in_skb, } #endif - memcpy(&n->sel, s, sel_size); + unsafe_memcpy(&n->sel, s, sel_size, + /* A composite flex-array structure destination, + * which was correctly sized with struct_size(), + * bounds-checked against nla_len(), and allocated + * above. */); RCU_INIT_POINTER(n->ht_up, ht); n->handle = handle; n->fshift = s->hmask ? ffs(ntohl(s->hmask)) - 1 : 0; @@ -1125,26 +1129,16 @@ static void u32_walk(struct tcf_proto *tp, struct tcf_walker *arg, ht = rtnl_dereference(ht->next)) { if (ht->prio != tp->prio) continue; - if (arg->count >= arg->skip) { - if (arg->fn(tp, ht, arg) < 0) { - arg->stop = 1; - return; - } - } - arg->count++; + + if (!tc_cls_stats_dump(tp, arg, ht)) + return; + for (h = 0; h <= ht->divisor; h++) { for (n = rtnl_dereference(ht->ht[h]); n; n = rtnl_dereference(n->next)) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(tp, n, arg) < 0) { - arg->stop = 1; + if (!tc_cls_stats_dump(tp, arg, n)) return; - } - arg->count++; } } } @@ -1256,12 +1250,7 @@ static void u32_bind_class(void *fh, u32 classid, unsigned long cl, void *q, { struct tc_u_knode *n = fh; - if (n && n->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &n->res, base); - else - __tcf_unbind_filter(q, &n->res); - } + tc_cls_bind_class(classid, cl, q, &n->res, base); } static int u32_dump(struct net *net, struct tcf_proto *tp, void *fh, diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index bf87b50837a8..c98af0ada706 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -171,7 +171,7 @@ out_einval: } EXPORT_SYMBOL(register_qdisc); -int unregister_qdisc(struct Qdisc_ops *qops) +void unregister_qdisc(struct Qdisc_ops *qops) { struct Qdisc_ops *q, **qp; int err = -ENOENT; @@ -186,7 +186,8 @@ int unregister_qdisc(struct Qdisc_ops *qops) err = 0; } write_unlock(&qdisc_mod_lock); - return err; + + WARN(err, "unregister qdisc(%s) failed\n", qops->id); } EXPORT_SYMBOL(unregister_qdisc); @@ -194,7 +195,7 @@ EXPORT_SYMBOL(unregister_qdisc); void qdisc_get_default(char *name, size_t len) { read_lock(&qdisc_mod_lock); - strlcpy(name, default_qdisc_ops->id, len); + strscpy(name, default_qdisc_ops->id, len); read_unlock(&qdisc_mod_lock); } @@ -867,6 +868,23 @@ void qdisc_offload_graft_helper(struct net_device *dev, struct Qdisc *sch, } EXPORT_SYMBOL(qdisc_offload_graft_helper); +void qdisc_offload_query_caps(struct net_device *dev, + enum tc_setup_type type, + void *caps, size_t caps_len) +{ + const struct net_device_ops *ops = dev->netdev_ops; + struct tc_query_caps_base base = { + .type = type, + .caps = caps, + }; + + memset(caps, 0, caps_len); + + if (ops->ndo_setup_tc) + ops->ndo_setup_tc(dev, TC_QUERY_CAPS, &base); +} +EXPORT_SYMBOL(qdisc_offload_query_caps); + static void qdisc_offload_graft_root(struct net_device *dev, struct Qdisc *new, struct Qdisc *old, struct netlink_ext_ack *extack) @@ -1163,7 +1181,7 @@ static int qdisc_block_indexes_set(struct Qdisc *sch, struct nlattr **tca, static struct Qdisc *qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue, - struct Qdisc *p, u32 parent, u32 handle, + u32 parent, u32 handle, struct nlattr **tca, int *errp, struct netlink_ext_ack *extack) { @@ -1424,10 +1442,6 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n, struct Qdisc *p = NULL; int err; - if ((n->nlmsg_type != RTM_GETQDISC) && - !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) - return -EPERM; - err = nlmsg_parse_deprecated(n, sizeof(*tcm), tca, TCA_MAX, rtm_tca_policy, extack); if (err < 0) @@ -1508,9 +1522,6 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, struct Qdisc *q, *p; int err; - if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) - return -EPERM; - replay: /* Reinit, just in case something touches this. */ err = nlmsg_parse_deprecated(n, sizeof(*tcm), tca, TCA_MAX, @@ -1640,7 +1651,7 @@ create_n_graft: } if (clid == TC_H_INGRESS) { if (dev_ingress_queue(dev)) { - q = qdisc_create(dev, dev_ingress_queue(dev), p, + q = qdisc_create(dev, dev_ingress_queue(dev), tcm->tcm_parent, tcm->tcm_parent, tca, &err, extack); } else { @@ -1657,7 +1668,7 @@ create_n_graft: else dev_queue = netdev_get_tx_queue(dev, 0); - q = qdisc_create(dev, dev_queue, p, + q = qdisc_create(dev, dev_queue, tcm->tcm_parent, tcm->tcm_handle, tca, &err, extack); } @@ -1904,7 +1915,7 @@ static int tcf_node_bind(struct tcf_proto *tp, void *n, struct tcf_walker *arg) { struct tcf_bind_args *a = (void *)arg; - if (tp->ops->bind_class) { + if (n && tp->ops->bind_class) { struct Qdisc *q = tcf_block_q(tp->chain->block); sch_tree_lock(q); @@ -1992,10 +2003,6 @@ static int tc_ctl_tclass(struct sk_buff *skb, struct nlmsghdr *n, u32 qid; int err; - if ((n->nlmsg_type != RTM_GETTCLASS) && - !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) - return -EPERM; - err = nlmsg_parse_deprecated(n, sizeof(*tcm), tca, TCA_MAX, rtm_tca_policy, extack); if (err < 0) diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c index 4c8e994cf0a5..f52255fea652 100644 --- a/net/sched/sch_atm.c +++ b/net/sched/sch_atm.c @@ -354,12 +354,8 @@ static void atm_tc_walk(struct Qdisc *sch, struct qdisc_walker *walker) if (walker->stop) return; list_for_each_entry(flow, &p->flows, list) { - if (walker->count >= walker->skip && - walker->fn(sch, (unsigned long)flow, walker) < 0) { - walker->stop = 1; + if (!tc_qdisc_stats_dump(sch, (unsigned long)flow, walker)) break; - } - walker->count++; } } @@ -577,7 +573,6 @@ static void atm_tc_reset(struct Qdisc *sch) pr_debug("atm_tc_reset(sch %p,[qdisc %p])\n", sch, p); list_for_each_entry(flow, &p->flows, list) qdisc_reset(flow->q); - sch->q.qlen = 0; } static void atm_tc_destroy(struct Qdisc *sch) diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index a43a58a73d09..55c6879d2c7e 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -2569,9 +2569,6 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt, struct nlattr *tb[TCA_CAKE_MAX + 1]; int err; - if (!opt) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_CAKE_MAX, opt, cake_policy, extack); if (err < 0) @@ -3064,16 +3061,13 @@ static void cake_walk(struct Qdisc *sch, struct qdisc_walker *arg) struct cake_tin_data *b = &q->tins[q->tin_order[i]]; for (j = 0; j < CAKE_QUEUES; j++) { - if (list_empty(&b->flows[j].flowchain) || - arg->count < arg->skip) { + if (list_empty(&b->flows[j].flowchain)) { arg->count++; continue; } - if (arg->fn(sch, i * CAKE_QUEUES + j + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, i * CAKE_QUEUES + j + 1, + arg)) break; - } - arg->count++; } } } diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c index 91a0dc463c48..6568e17c4c63 100644 --- a/net/sched/sch_cbq.c +++ b/net/sched/sch_cbq.c @@ -975,7 +975,6 @@ cbq_reset(struct Qdisc *sch) cl->cpriority = cl->priority; } } - sch->q.qlen = 0; } @@ -1677,15 +1676,8 @@ static void cbq_walk(struct Qdisc *sch, struct qdisc_walker *arg) for (h = 0; h < q->clhash.hashsize; h++) { hlist_for_each_entry(cl, &q->clhash.hash[h], common.hnode) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, (unsigned long)cl, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, (unsigned long)cl, arg)) return; - } - arg->count++; } } } diff --git a/net/sched/sch_cbs.c b/net/sched/sch_cbs.c index 459cc240eda9..cac870eb7897 100644 --- a/net/sched/sch_cbs.c +++ b/net/sched/sch_cbs.c @@ -520,13 +520,7 @@ static unsigned long cbs_find(struct Qdisc *sch, u32 classid) static void cbs_walk(struct Qdisc *sch, struct qdisc_walker *walker) { if (!walker->stop) { - if (walker->count >= walker->skip) { - if (walker->fn(sch, 1, walker) < 0) { - walker->stop = 1; - return; - } - } - walker->count++; + tc_qdisc_stats_dump(sch, 1, walker); } } diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c index 2adbd945bf15..3ac3e5c80b6f 100644 --- a/net/sched/sch_choke.c +++ b/net/sched/sch_choke.c @@ -60,7 +60,6 @@ struct choke_sched_data { u32 forced_drop; /* Forced drops, qavg > max_thresh */ u32 forced_mark; /* Forced marks, qavg > max_thresh */ u32 pdrop; /* Drops due to queue limits */ - u32 other; /* Drops due to drop() calls */ u32 matched; /* Drops to flow match */ } stats; @@ -315,8 +314,6 @@ static void choke_reset(struct Qdisc *sch) rtnl_qdisc_drop(skb, sch); } - sch->q.qlen = 0; - sch->qstats.backlog = 0; if (q->tab) memset(q->tab, 0, (q->tab_mask + 1) * sizeof(struct sk_buff *)); q->head = q->tail = 0; @@ -466,7 +463,6 @@ static int choke_dump_stats(struct Qdisc *sch, struct gnet_dump *d) .early = q->stats.prob_drop + q->stats.forced_drop, .marked = q->stats.prob_mark + q->stats.forced_mark, .pdrop = q->stats.pdrop, - .other = q->stats.other, .matched = q->stats.matched, }; diff --git a/net/sched/sch_codel.c b/net/sched/sch_codel.c index 30169b3adbbb..d7a4874543de 100644 --- a/net/sched/sch_codel.c +++ b/net/sched/sch_codel.c @@ -138,9 +138,6 @@ static int codel_change(struct Qdisc *sch, struct nlattr *opt, unsigned int qlen, dropped = 0; int err; - if (!opt) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_CODEL_MAX, opt, codel_policy, NULL); if (err < 0) diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c index 18e4f7a0b291..e35a4e90f4e6 100644 --- a/net/sched/sch_drr.c +++ b/net/sched/sch_drr.c @@ -284,15 +284,8 @@ static void drr_walk(struct Qdisc *sch, struct qdisc_walker *arg) for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, (unsigned long)cl, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, (unsigned long)cl, arg)) return; - } - arg->count++; } } } @@ -441,8 +434,6 @@ static void drr_reset_qdisc(struct Qdisc *sch) qdisc_reset(cl->qdisc); } } - sch->qstats.backlog = 0; - sch->q.qlen = 0; } static void drr_destroy_qdisc(struct Qdisc *sch) diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c index 4c100d105269..401ffaf87d62 100644 --- a/net/sched/sch_dsmark.c +++ b/net/sched/sch_dsmark.c @@ -176,16 +176,12 @@ static void dsmark_walk(struct Qdisc *sch, struct qdisc_walker *walker) return; for (i = 0; i < p->indices; i++) { - if (p->mv[i].mask == 0xff && !p->mv[i].value) - goto ignore; - if (walker->count >= walker->skip) { - if (walker->fn(sch, i + 1, walker) < 0) { - walker->stop = 1; - break; - } + if (p->mv[i].mask == 0xff && !p->mv[i].value) { + walker->count++; + continue; } -ignore: - walker->count++; + if (!tc_qdisc_stats_dump(sch, i + 1, walker)) + break; } } @@ -409,8 +405,6 @@ static void dsmark_reset(struct Qdisc *sch) pr_debug("%s(sch %p,[qdisc %p])\n", __func__, sch, p); if (p->q) qdisc_reset(p->q); - sch->qstats.backlog = 0; - sch->q.qlen = 0; } static void dsmark_destroy(struct Qdisc *sch) diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c index c48f91075b5c..61d1f0e32cf3 100644 --- a/net/sched/sch_etf.c +++ b/net/sched/sch_etf.c @@ -323,9 +323,6 @@ static int etf_enable_offload(struct net_device *dev, struct etf_sched_data *q, struct tc_etf_qopt_offload etf = { }; int err; - if (q->offload) - return 0; - if (!ops->ndo_setup_tc) { NL_SET_ERR_MSG(extack, "Specified device does not support ETF offload"); return -EOPNOTSUPP; @@ -445,9 +442,6 @@ static void etf_reset(struct Qdisc *sch) timesortedlist_clear(sch); __qdisc_reset_queue(&sch->q); - sch->qstats.backlog = 0; - sch->q.qlen = 0; - q->last = 0; } diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index d73393493553..b10efeaf0629 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -341,15 +341,8 @@ static void ets_qdisc_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (i = 0; i < q->nbands; i++) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, i + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, i + 1, arg)) break; - } - arg->count++; } } @@ -594,11 +587,6 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt, unsigned int i; int err; - if (!opt) { - NL_SET_ERR_MSG(extack, "ETS options are required for this operation"); - return -EINVAL; - } - err = nla_parse_nested(tb, TCA_ETS_MAX, opt, ets_policy, extack); if (err < 0) return err; @@ -727,8 +715,6 @@ static void ets_qdisc_reset(struct Qdisc *sch) } for (band = 0; band < q->nbands; band++) qdisc_reset(q->classes[band].qdisc); - sch->qstats.backlog = 0; - sch->q.qlen = 0; } static void ets_qdisc_destroy(struct Qdisc *sch) diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c index 2fb76fc0cc31..48d14fb90ba0 100644 --- a/net/sched/sch_fq.c +++ b/net/sched/sch_fq.c @@ -808,9 +808,6 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt, unsigned drop_len = 0; u32 fq_log; - if (!opt) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_FQ_MAX, opt, fq_policy, NULL); if (err < 0) diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c index 839e1235db05..99d318b60568 100644 --- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -347,8 +347,6 @@ static void fq_codel_reset(struct Qdisc *sch) codel_vars_init(&flow->cvars); } memset(q->backlogs, 0, q->flows_cnt * sizeof(u32)); - sch->q.qlen = 0; - sch->qstats.backlog = 0; q->memory_usage = 0; } @@ -374,9 +372,6 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt, u32 quantum = 0; int err; - if (!opt) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_FQ_CODEL_MAX, opt, fq_codel_policy, NULL); if (err < 0) @@ -483,26 +478,24 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt, if (opt) { err = fq_codel_change(sch, opt, extack); if (err) - goto init_failure; + return err; } err = tcf_block_get(&q->block, &q->filter_list, sch, extack); if (err) - goto init_failure; + return err; if (!q->flows) { q->flows = kvcalloc(q->flows_cnt, sizeof(struct fq_codel_flow), GFP_KERNEL); - if (!q->flows) { - err = -ENOMEM; - goto init_failure; - } + if (!q->flows) + return -ENOMEM; + q->backlogs = kvcalloc(q->flows_cnt, sizeof(u32), GFP_KERNEL); - if (!q->backlogs) { - err = -ENOMEM; - goto alloc_failure; - } + if (!q->backlogs) + return -ENOMEM; + for (i = 0; i < q->flows_cnt; i++) { struct fq_codel_flow *flow = q->flows + i; @@ -515,13 +508,6 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt, else sch->flags &= ~TCQ_F_CAN_BYPASS; return 0; - -alloc_failure: - kvfree(q->flows); - q->flows = NULL; -init_failure: - q->flows_cnt = 0; - return err; } static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb) @@ -687,16 +673,12 @@ static void fq_codel_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (i = 0; i < q->flows_cnt; i++) { - if (list_empty(&q->flows[i].flowchain) || - arg->count < arg->skip) { + if (list_empty(&q->flows[i].flowchain)) { arg->count++; continue; } - if (arg->fn(sch, i + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, i + 1, arg)) break; - } - arg->count++; } } diff --git a/net/sched/sch_fq_pie.c b/net/sched/sch_fq_pie.c index d6aba6edd16e..6980796d435d 100644 --- a/net/sched/sch_fq_pie.c +++ b/net/sched/sch_fq_pie.c @@ -283,9 +283,6 @@ static int fq_pie_change(struct Qdisc *sch, struct nlattr *opt, unsigned int num_dropped = 0; int err; - if (!opt) - return -EINVAL; - err = nla_parse_nested(tb, TCA_FQ_PIE_MAX, opt, fq_pie_policy, extack); if (err < 0) return err; @@ -521,9 +518,6 @@ static void fq_pie_reset(struct Qdisc *sch) INIT_LIST_HEAD(&flow->flowchain); pie_vars_init(&flow->vars); } - - sch->q.qlen = 0; - sch->qstats.backlog = 0; } static void fq_pie_destroy(struct Qdisc *sch) diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 7a8ea03f673d..a9aadc4e6858 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -941,7 +941,6 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue, goto errout; __skb_queue_head_init(&sch->gso_skb); __skb_queue_head_init(&sch->skb_bad_txq); - qdisc_skb_head_init(&sch->q); gnet_stats_basic_sync_init(&sch->bstats); spin_lock_init(&sch->q.lock); diff --git a/net/sched/sch_gred.c b/net/sched/sch_gred.c index 1073c76d05c4..a661b062cca8 100644 --- a/net/sched/sch_gred.c +++ b/net/sched/sch_gred.c @@ -648,9 +648,6 @@ static int gred_change(struct Qdisc *sch, struct nlattr *opt, u32 max_P; struct gred_sched_data *prealloc; - if (opt == NULL) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_GRED_MAX, opt, gred_policy, extack); if (err < 0) @@ -829,7 +826,6 @@ static int gred_dump(struct Qdisc *sch, struct sk_buff *skb) opt.Wlog = q->parms.Wlog; opt.Plog = q->parms.Plog; opt.Scell_log = q->parms.Scell_log; - opt.other = q->stats.other; opt.early = q->stats.prob_drop; opt.forced = q->stats.forced_drop; opt.pdrop = q->stats.pdrop; @@ -895,8 +891,6 @@ append_opt: goto nla_put_failure; if (nla_put_u32(skb, TCA_GRED_VQ_STAT_PDROP, q->stats.pdrop)) goto nla_put_failure; - if (nla_put_u32(skb, TCA_GRED_VQ_STAT_OTHER, q->stats.other)) - goto nla_put_failure; nla_nest_end(skb, vq); } @@ -914,10 +908,9 @@ static void gred_destroy(struct Qdisc *sch) struct gred_sched *table = qdisc_priv(sch); int i; - for (i = 0; i < table->DPs; i++) { - if (table->tab[i]) - gred_destroy_vq(table->tab[i]); - } + for (i = 0; i < table->DPs; i++) + gred_destroy_vq(table->tab[i]); + gred_offload(sch, TC_GRED_DESTROY); kfree(table->opt); } diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index d3979a6000e7..70b0c5873d32 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -1349,15 +1349,8 @@ hfsc_walk(struct Qdisc *sch, struct qdisc_walker *arg) for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry(cl, &q->clhash.hash[i], cl_common.hnode) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, (unsigned long)cl, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, (unsigned long)cl, arg)) return; - } - arg->count++; } } } @@ -1430,7 +1423,7 @@ hfsc_change_qdisc(struct Qdisc *sch, struct nlattr *opt, struct hfsc_sched *q = qdisc_priv(sch); struct tc_hfsc_qopt *qopt; - if (opt == NULL || nla_len(opt) < sizeof(*qopt)) + if (nla_len(opt) < sizeof(*qopt)) return -EINVAL; qopt = nla_data(opt); @@ -1484,8 +1477,6 @@ hfsc_reset_qdisc(struct Qdisc *sch) } q->eligible = RB_ROOT; qdisc_watchdog_cancel(&q->watchdog); - sch->qstats.backlog = 0; - sch->q.qlen = 0; } static void diff --git a/net/sched/sch_hhf.c b/net/sched/sch_hhf.c index 420ede875322..d26cd436cbe3 100644 --- a/net/sched/sch_hhf.c +++ b/net/sched/sch_hhf.c @@ -516,9 +516,6 @@ static int hhf_change(struct Qdisc *sch, struct nlattr *opt, u32 new_quantum = q->quantum; u32 new_hhf_non_hh_weight = q->hhf_non_hh_weight; - if (!opt) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_HHF_MAX, opt, hhf_policy, NULL); if (err < 0) diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 23a9d6242429..e5b4bbf3ce3d 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1008,8 +1008,6 @@ static void htb_reset(struct Qdisc *sch) } qdisc_watchdog_cancel(&q->watchdog); __qdisc_reset_queue(&q->direct_queue); - sch->q.qlen = 0; - sch->qstats.backlog = 0; memset(q->hlevel, 0, sizeof(q->hlevel)); memset(q->row_mask, 0, sizeof(q->row_mask)); } @@ -1104,9 +1102,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, err = qdisc_class_hash_init(&q->clhash); if (err < 0) - goto err_free_direct_qdiscs; - - qdisc_skb_head_init(&q->direct_queue); + return err; if (tb[TCA_HTB_DIRECT_QLEN]) q->direct_qlen = nla_get_u32(tb[TCA_HTB_DIRECT_QLEN]); @@ -1127,8 +1123,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, qdisc = qdisc_create_dflt(dev_queue, &pfifo_qdisc_ops, TC_H_MAKE(sch->handle, 0), extack); if (!qdisc) { - err = -ENOMEM; - goto err_free_qdiscs; + return -ENOMEM; } htb_set_lockdep_class_child(qdisc); @@ -1146,7 +1141,7 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, }; err = htb_offload(dev, &offload_opt); if (err) - goto err_free_qdiscs; + return err; /* Defer this assignment, so that htb_destroy skips offload-related * parts (especially calling ndo_setup_tc) on errors. @@ -1154,22 +1149,6 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt, q->offload = true; return 0; - -err_free_qdiscs: - for (ntx = 0; ntx < q->num_direct_qdiscs && q->direct_qdiscs[ntx]; - ntx++) - qdisc_put(q->direct_qdiscs[ntx]); - - qdisc_class_hash_destroy(&q->clhash); - /* Prevent use-after-free and double-free when htb_destroy gets called. - */ - q->clhash.hash = NULL; - q->clhash.hashsize = 0; - -err_free_direct_qdiscs: - kfree(q->direct_qdiscs); - q->direct_qdiscs = NULL; - return err; } static void htb_attach_offload(struct Qdisc *sch) @@ -1692,13 +1671,12 @@ static void htb_destroy(struct Qdisc *sch) qdisc_class_hash_destroy(&q->clhash); __qdisc_reset_queue(&q->direct_queue); - if (!q->offload) - return; - - offload_opt = (struct tc_htb_qopt_offload) { - .command = TC_HTB_DESTROY, - }; - htb_offload(dev, &offload_opt); + if (q->offload) { + offload_opt = (struct tc_htb_qopt_offload) { + .command = TC_HTB_DESTROY, + }; + htb_offload(dev, &offload_opt); + } if (!q->direct_qdiscs) return; @@ -2141,15 +2119,8 @@ static void htb_walk(struct Qdisc *sch, struct qdisc_walker *arg) for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, (unsigned long)cl, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, (unsigned long)cl, arg)) return; - } - arg->count++; } } } diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c index 83d2e54bf303..d0bc660d7401 100644 --- a/net/sched/sch_mq.c +++ b/net/sched/sch_mq.c @@ -247,11 +247,8 @@ static void mq_walk(struct Qdisc *sch, struct qdisc_walker *arg) arg->count = arg->skip; for (ntx = arg->skip; ntx < dev->num_tx_queues; ntx++) { - if (arg->fn(sch, ntx + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, ntx + 1, arg)) break; - } - arg->count++; } } diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index b29f3453c6ea..4c68abaa289b 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -558,11 +558,8 @@ static void mqprio_walk(struct Qdisc *sch, struct qdisc_walker *arg) /* Walk hierarchy with a virtual class per tc */ arg->count = arg->skip; for (ntx = arg->skip; ntx < netdev_get_num_tc(dev); ntx++) { - if (arg->fn(sch, ntx + TC_H_MIN_PRIORITY, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, ntx + TC_H_MIN_PRIORITY, arg)) return; - } - arg->count++; } /* Pad the values and skip over unused traffic classes */ diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c index cd8ab90c4765..75c9c860182b 100644 --- a/net/sched/sch_multiq.c +++ b/net/sched/sch_multiq.c @@ -152,7 +152,6 @@ multiq_reset(struct Qdisc *sch) for (band = 0; band < q->bands; band++) qdisc_reset(q->queues[band]); - sch->q.qlen = 0; q->curband = 0; } @@ -354,15 +353,8 @@ static void multiq_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (band = 0; band < q->bands; band++) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, band + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, band + 1, arg)) break; - } - arg->count++; } } diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 5449ed114e40..18f4273a835b 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -961,9 +961,6 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt, int old_loss_model = CLG_RANDOM; int ret; - if (opt == NULL) - return -EINVAL; - qopt = nla_data(opt); ret = parse_attr(tb, TCA_NETEM_MAX, opt, netem_policy, sizeof(*qopt)); if (ret < 0) @@ -1254,12 +1251,8 @@ static unsigned long netem_find(struct Qdisc *sch, u32 classid) static void netem_walk(struct Qdisc *sch, struct qdisc_walker *walker) { if (!walker->stop) { - if (walker->count >= walker->skip) - if (walker->fn(sch, 1, walker) < 0) { - walker->stop = 1; - return; - } - walker->count++; + if (!tc_qdisc_stats_dump(sch, 1, walker)) + return; } } diff --git a/net/sched/sch_pie.c b/net/sched/sch_pie.c index 5a457ff61acd..974038ba6c7b 100644 --- a/net/sched/sch_pie.c +++ b/net/sched/sch_pie.c @@ -143,9 +143,6 @@ static int pie_change(struct Qdisc *sch, struct nlattr *opt, unsigned int qlen, dropped = 0; int err; - if (!opt) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_PIE_MAX, opt, pie_policy, NULL); if (err < 0) diff --git a/net/sched/sch_plug.c b/net/sched/sch_plug.c index cbc2ebca4548..ea8c4a7174bb 100644 --- a/net/sched/sch_plug.c +++ b/net/sched/sch_plug.c @@ -161,9 +161,6 @@ static int plug_change(struct Qdisc *sch, struct nlattr *opt, struct plug_sched_data *q = qdisc_priv(sch); struct tc_plug_qopt *msg; - if (opt == NULL) - return -EINVAL; - msg = nla_data(opt); if (nla_len(opt) < sizeof(*msg)) return -EINVAL; diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c index 3b8d7197c06b..fdc5ef52c3ee 100644 --- a/net/sched/sch_prio.c +++ b/net/sched/sch_prio.c @@ -135,8 +135,6 @@ prio_reset(struct Qdisc *sch) for (prio = 0; prio < q->bands; prio++) qdisc_reset(q->queues[prio]); - sch->qstats.backlog = 0; - sch->q.qlen = 0; } static int prio_offload(struct Qdisc *sch, struct tc_prio_qopt *qopt) @@ -187,7 +185,7 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt, return -EINVAL; qopt = nla_data(opt); - if (qopt->bands > TCQ_PRIO_BANDS || qopt->bands < 2) + if (qopt->bands > TCQ_PRIO_BANDS || qopt->bands < TCQ_MIN_PRIO_BANDS) return -EINVAL; for (i = 0; i <= TC_PRIO_MAX; i++) { @@ -378,15 +376,8 @@ static void prio_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (prio = 0; prio < q->bands; prio++) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, prio + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, prio + 1, arg)) break; - } - arg->count++; } } diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index d4ce58c90f9f..cf5ebe43b3b4 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -659,15 +659,8 @@ static void qfq_walk(struct Qdisc *sch, struct qdisc_walker *arg) for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, (unsigned long)cl, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, (unsigned long)cl, arg)) return; - } - arg->count++; } } } @@ -1458,8 +1451,6 @@ static void qfq_reset_qdisc(struct Qdisc *sch) qdisc_reset(cl->qdisc); } } - sch->qstats.backlog = 0; - sch->q.qlen = 0; } static void qfq_destroy_qdisc(struct Qdisc *sch) diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c index 40adf1f07a82..a5a401f93c1a 100644 --- a/net/sched/sch_red.c +++ b/net/sched/sch_red.c @@ -176,8 +176,6 @@ static void red_reset(struct Qdisc *sch) struct red_sched_data *q = qdisc_priv(sch); qdisc_reset(q->qdisc); - sch->qstats.backlog = 0; - sch->q.qlen = 0; red_restart(&q->vars); } @@ -370,9 +368,6 @@ static int red_change(struct Qdisc *sch, struct nlattr *opt, struct nlattr *tb[TCA_RED_MAX + 1]; int err; - if (!opt) - return -EINVAL; - err = nla_parse_nested_deprecated(tb, TCA_RED_MAX, opt, red_policy, extack); if (err < 0) @@ -463,7 +458,6 @@ static int red_dump_stats(struct Qdisc *sch, struct gnet_dump *d) } st.early = q->stats.prob_drop + q->stats.forced_drop; st.pdrop = q->stats.pdrop; - st.other = q->stats.other; st.marked = q->stats.prob_mark + q->stats.forced_mark; return gnet_stats_copy_app(d, &st, sizeof(st)); @@ -522,12 +516,7 @@ static unsigned long red_find(struct Qdisc *sch, u32 classid) static void red_walk(struct Qdisc *sch, struct qdisc_walker *walker) { if (!walker->stop) { - if (walker->count >= walker->skip) - if (walker->fn(sch, 1, walker) < 0) { - walker->stop = 1; - return; - } - walker->count++; + tc_qdisc_stats_dump(sch, 1, walker); } } diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c index 2829455211f8..e2389fa3cff8 100644 --- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -456,8 +456,6 @@ static void sfb_reset(struct Qdisc *sch) struct sfb_sched_data *q = qdisc_priv(sch); qdisc_reset(q->qdisc); - sch->qstats.backlog = 0; - sch->q.qlen = 0; q->slot = 0; q->double_buffering = false; sfb_zero_all_buckets(q); @@ -661,12 +659,7 @@ static int sfb_delete(struct Qdisc *sch, unsigned long cl, static void sfb_walk(struct Qdisc *sch, struct qdisc_walker *walker) { if (!walker->stop) { - if (walker->count >= walker->skip) - if (walker->fn(sch, 1, walker) < 0) { - walker->stop = 1; - return; - } - walker->count++; + tc_qdisc_stats_dump(sch, 1, walker); } } diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index f8e569f79f13..abd436307d6a 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -888,16 +888,12 @@ static void sfq_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (i = 0; i < q->divisor; i++) { - if (q->ht[i] == SFQ_EMPTY_SLOT || - arg->count < arg->skip) { + if (q->ht[i] == SFQ_EMPTY_SLOT) { arg->count++; continue; } - if (arg->fn(sch, i + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, i + 1, arg)) break; - } - arg->count++; } } diff --git a/net/sched/sch_skbprio.c b/net/sched/sch_skbprio.c index 7a5e4c454715..5df2dacb7b1a 100644 --- a/net/sched/sch_skbprio.c +++ b/net/sched/sch_skbprio.c @@ -213,9 +213,6 @@ static void skbprio_reset(struct Qdisc *sch) struct skbprio_sched_data *q = qdisc_priv(sch); int prio; - sch->qstats.backlog = 0; - sch->q.qlen = 0; - for (prio = 0; prio < SKBPRIO_MAX_PRIORITY; prio++) __skb_queue_purge(&q->qdiscs[prio]); @@ -268,15 +265,8 @@ static void skbprio_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (i = 0; i < SKBPRIO_MAX_PRIORITY; i++) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(sch, i + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, i + 1, arg)) break; - } - arg->count++; } } diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c index 86675a79da1e..435d866fcfa0 100644 --- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -27,7 +27,6 @@ #include <net/tcp.h> static LIST_HEAD(taprio_list); -static DEFINE_SPINLOCK(taprio_list_lock); #define TAPRIO_ALL_GATES_OPEN -1 @@ -79,8 +78,8 @@ struct taprio_sched { struct sched_gate_list __rcu *admin_sched; struct hrtimer advance_timer; struct list_head taprio_list; - struct sk_buff *(*dequeue)(struct Qdisc *sch); - struct sk_buff *(*peek)(struct Qdisc *sch); + u32 max_frm_len[TC_MAX_QUEUE]; /* for the fast path */ + u32 max_sdu[TC_MAX_QUEUE]; /* for dump and offloading */ u32 txtime_delay; }; @@ -418,6 +417,9 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch, struct Qdisc *child, struct sk_buff **to_free) { struct taprio_sched *q = qdisc_priv(sch); + struct net_device *dev = qdisc_dev(sch); + int prio = skb->priority; + u8 tc; /* sk_flags are only safe to use on full sockets. */ if (skb->sk && sk_fullsock(skb->sk) && sock_flag(skb->sk, SOCK_TXTIME)) { @@ -429,12 +431,20 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch, return qdisc_drop(skb, sch, to_free); } + /* Devices with full offload are expected to honor this in hardware */ + tc = netdev_get_prio_tc_map(dev, prio); + if (skb->len > q->max_frm_len[tc]) + return qdisc_drop(skb, sch, to_free); + qdisc_qstats_backlog_inc(sch, skb); sch->q.qlen++; return qdisc_enqueue(skb, child, to_free); } +/* Will not be called in the full offload case, since the TX queues are + * attached to the Qdisc created using qdisc_create_dflt() + */ static int taprio_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) { @@ -442,11 +452,6 @@ static int taprio_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct Qdisc *child; int queue; - if (unlikely(FULL_OFFLOAD_IS_ENABLED(q->flags))) { - WARN_ONCE(1, "Trying to enqueue skb into the root of a taprio qdisc configured with full offload\n"); - return qdisc_drop(skb, sch, to_free); - } - queue = skb_get_queue_mapping(skb); child = q->qdiscs[queue]; @@ -455,10 +460,10 @@ static int taprio_enqueue(struct sk_buff *skb, struct Qdisc *sch, /* Large packets might not be transmitted when the transmission duration * exceeds any configured interval. Therefore, segment the skb into - * smaller chunks. Skip it for the full offload case, as the driver - * and/or the hardware is expected to handle this. + * smaller chunks. Drivers with full offload are expected to handle + * this in hardware. */ - if (skb_is_gso(skb) && !FULL_OFFLOAD_IS_ENABLED(q->flags)) { + if (skb_is_gso(skb)) { unsigned int slen = 0, numsegs = 0, len = qdisc_pkt_len(skb); netdev_features_t features = netif_skb_features(skb); struct sk_buff *segs, *nskb; @@ -492,7 +497,10 @@ static int taprio_enqueue(struct sk_buff *skb, struct Qdisc *sch, return taprio_enqueue_one(skb, sch, child, to_free); } -static struct sk_buff *taprio_peek_soft(struct Qdisc *sch) +/* Will not be called in the full offload case, since the TX queues are + * attached to the Qdisc created using qdisc_create_dflt() + */ +static struct sk_buff *taprio_peek(struct Qdisc *sch) { struct taprio_sched *q = qdisc_priv(sch); struct net_device *dev = qdisc_dev(sch); @@ -536,20 +544,6 @@ static struct sk_buff *taprio_peek_soft(struct Qdisc *sch) return NULL; } -static struct sk_buff *taprio_peek_offload(struct Qdisc *sch) -{ - WARN_ONCE(1, "Trying to peek into the root of a taprio qdisc configured with full offload\n"); - - return NULL; -} - -static struct sk_buff *taprio_peek(struct Qdisc *sch) -{ - struct taprio_sched *q = qdisc_priv(sch); - - return q->peek(sch); -} - static void taprio_set_budget(struct taprio_sched *q, struct sched_entry *entry) { atomic_set(&entry->budget, @@ -557,7 +551,10 @@ static void taprio_set_budget(struct taprio_sched *q, struct sched_entry *entry) atomic64_read(&q->picos_per_byte))); } -static struct sk_buff *taprio_dequeue_soft(struct Qdisc *sch) +/* Will not be called in the full offload case, since the TX queues are + * attached to the Qdisc created using qdisc_create_dflt() + */ +static struct sk_buff *taprio_dequeue(struct Qdisc *sch) { struct taprio_sched *q = qdisc_priv(sch); struct net_device *dev = qdisc_dev(sch); @@ -645,20 +642,6 @@ done: return skb; } -static struct sk_buff *taprio_dequeue_offload(struct Qdisc *sch) -{ - WARN_ONCE(1, "Trying to dequeue from the root of a taprio qdisc configured with full offload\n"); - - return NULL; -} - -static struct sk_buff *taprio_dequeue(struct Qdisc *sch) -{ - struct taprio_sched *q = qdisc_priv(sch); - - return q->dequeue(sch); -} - static bool should_restart_cycle(const struct sched_gate_list *oper, const struct sched_entry *entry) { @@ -781,6 +764,11 @@ static const struct nla_policy entry_policy[TCA_TAPRIO_SCHED_ENTRY_MAX + 1] = { [TCA_TAPRIO_SCHED_ENTRY_INTERVAL] = { .type = NLA_U32 }, }; +static const struct nla_policy taprio_tc_policy[TCA_TAPRIO_TC_ENTRY_MAX + 1] = { + [TCA_TAPRIO_TC_ENTRY_INDEX] = { .type = NLA_U32 }, + [TCA_TAPRIO_TC_ENTRY_MAX_SDU] = { .type = NLA_U32 }, +}; + static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = { [TCA_TAPRIO_ATTR_PRIOMAP] = { .len = sizeof(struct tc_mqprio_qopt) @@ -793,6 +781,7 @@ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = { [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 }, [TCA_TAPRIO_ATTR_FLAGS] = { .type = NLA_U32 }, [TCA_TAPRIO_ATTR_TXTIME_DELAY] = { .type = NLA_U32 }, + [TCA_TAPRIO_ATTR_TC_ENTRY] = { .type = NLA_NESTED }, }; static int fill_sched_entry(struct taprio_sched *q, struct nlattr **tb, @@ -1099,27 +1088,20 @@ static int taprio_dev_notifier(struct notifier_block *nb, unsigned long event, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr); - struct net_device *qdev; struct taprio_sched *q; - bool found = false; ASSERT_RTNL(); if (event != NETDEV_UP && event != NETDEV_CHANGE) return NOTIFY_DONE; - spin_lock(&taprio_list_lock); list_for_each_entry(q, &taprio_list, taprio_list) { - qdev = qdisc_dev(q->root); - if (qdev == dev) { - found = true; - break; - } - } - spin_unlock(&taprio_list_lock); + if (dev != qdisc_dev(q->root)) + continue; - if (found) taprio_set_picos_per_byte(dev, q); + break; + } return NOTIFY_DONE; } @@ -1194,16 +1176,10 @@ static void taprio_offload_config_changed(struct taprio_sched *q) { struct sched_gate_list *oper, *admin; - spin_lock(&q->current_entry_lock); - - oper = rcu_dereference_protected(q->oper_sched, - lockdep_is_held(&q->current_entry_lock)); - admin = rcu_dereference_protected(q->admin_sched, - lockdep_is_held(&q->current_entry_lock)); + oper = rtnl_dereference(q->oper_sched); + admin = rtnl_dereference(q->admin_sched); switch_schedules(q, &admin, &oper); - - spin_unlock(&q->current_entry_lock); } static u32 tc_map_to_queue_mask(struct net_device *dev, u32 tc_mask) @@ -1256,7 +1232,8 @@ static int taprio_enable_offload(struct net_device *dev, { const struct net_device_ops *ops = dev->netdev_ops; struct tc_taprio_qopt_offload *offload; - int err = 0; + struct tc_taprio_caps caps; + int tc, err = 0; if (!ops->ndo_setup_tc) { NL_SET_ERR_MSG(extack, @@ -1264,6 +1241,19 @@ static int taprio_enable_offload(struct net_device *dev, return -EOPNOTSUPP; } + qdisc_offload_query_caps(dev, TC_SETUP_QDISC_TAPRIO, + &caps, sizeof(caps)); + + if (!caps.supports_queue_max_sdu) { + for (tc = 0; tc < TC_MAX_QUEUE; tc++) { + if (q->max_sdu[tc]) { + NL_SET_ERR_MSG_MOD(extack, + "Device does not handle queueMaxSDU"); + return -EOPNOTSUPP; + } + } + } + offload = taprio_offload_alloc(sched->num_entries); if (!offload) { NL_SET_ERR_MSG(extack, @@ -1273,6 +1263,9 @@ static int taprio_enable_offload(struct net_device *dev, offload->enable = 1; taprio_sched_to_offload(dev, sched, offload); + for (tc = 0; tc < TC_MAX_QUEUE; tc++) + offload->max_sdu[tc] = q->max_sdu[tc]; + err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload); if (err < 0) { NL_SET_ERR_MSG(extack, @@ -1407,6 +1400,89 @@ out: return err; } +static int taprio_parse_tc_entry(struct Qdisc *sch, + struct nlattr *opt, + u32 max_sdu[TC_QOPT_MAX_QUEUE], + unsigned long *seen_tcs, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[TCA_TAPRIO_TC_ENTRY_MAX + 1] = { }; + struct net_device *dev = qdisc_dev(sch); + u32 val = 0; + int err, tc; + + err = nla_parse_nested(tb, TCA_TAPRIO_TC_ENTRY_MAX, opt, + taprio_tc_policy, extack); + if (err < 0) + return err; + + if (!tb[TCA_TAPRIO_TC_ENTRY_INDEX]) { + NL_SET_ERR_MSG_MOD(extack, "TC entry index missing"); + return -EINVAL; + } + + tc = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_INDEX]); + if (tc >= TC_QOPT_MAX_QUEUE) { + NL_SET_ERR_MSG_MOD(extack, "TC entry index out of range"); + return -ERANGE; + } + + if (*seen_tcs & BIT(tc)) { + NL_SET_ERR_MSG_MOD(extack, "Duplicate TC entry"); + return -EINVAL; + } + + *seen_tcs |= BIT(tc); + + if (tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU]) + val = nla_get_u32(tb[TCA_TAPRIO_TC_ENTRY_MAX_SDU]); + + if (val > dev->max_mtu) { + NL_SET_ERR_MSG_MOD(extack, "TC max SDU exceeds device max MTU"); + return -ERANGE; + } + + max_sdu[tc] = val; + + return 0; +} + +static int taprio_parse_tc_entries(struct Qdisc *sch, + struct nlattr *opt, + struct netlink_ext_ack *extack) +{ + struct taprio_sched *q = qdisc_priv(sch); + struct net_device *dev = qdisc_dev(sch); + u32 max_sdu[TC_QOPT_MAX_QUEUE]; + unsigned long seen_tcs = 0; + struct nlattr *n; + int tc, rem; + int err = 0; + + for (tc = 0; tc < TC_QOPT_MAX_QUEUE; tc++) + max_sdu[tc] = q->max_sdu[tc]; + + nla_for_each_nested(n, opt, rem) { + if (nla_type(n) != TCA_TAPRIO_ATTR_TC_ENTRY) + continue; + + err = taprio_parse_tc_entry(sch, n, max_sdu, &seen_tcs, extack); + if (err) + goto out; + } + + for (tc = 0; tc < TC_QOPT_MAX_QUEUE; tc++) { + q->max_sdu[tc] = max_sdu[tc]; + if (max_sdu[tc]) + q->max_frm_len[tc] = max_sdu[tc] + dev->hard_header_len; + else + q->max_frm_len[tc] = U32_MAX; /* never oversized */ + } + +out: + return err; +} + static int taprio_mqprio_cmp(const struct net_device *dev, const struct tc_mqprio_qopt *mqprio) { @@ -1485,6 +1561,10 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt, if (err < 0) return err; + err = taprio_parse_tc_entries(sch, opt, extack); + if (err) + return err; + new_admin = kzalloc(sizeof(*new_admin), GFP_KERNEL); if (!new_admin) { NL_SET_ERR_MSG(extack, "Not enough memory for a new schedule"); @@ -1492,10 +1572,8 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt, } INIT_LIST_HEAD(&new_admin->entries); - rcu_read_lock(); - oper = rcu_dereference(q->oper_sched); - admin = rcu_dereference(q->admin_sched); - rcu_read_unlock(); + oper = rtnl_dereference(q->oper_sched); + admin = rtnl_dereference(q->admin_sched); /* no changes - no new mqprio settings */ if (!taprio_mqprio_cmp(dev, mqprio)) @@ -1565,17 +1643,6 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt, q->advance_timer.function = advance_sched; } - if (FULL_OFFLOAD_IS_ENABLED(q->flags)) { - q->dequeue = taprio_dequeue_offload; - q->peek = taprio_peek_offload; - } else { - /* Be sure to always keep the function pointers - * in a consistent state. - */ - q->dequeue = taprio_dequeue_soft; - q->peek = taprio_peek_soft; - } - err = taprio_get_start_time(sch, new_admin, &start); if (err < 0) { NL_SET_ERR_MSG(extack, "Internal error: failed get start time"); @@ -1638,19 +1705,16 @@ static void taprio_reset(struct Qdisc *sch) if (q->qdiscs[i]) qdisc_reset(q->qdiscs[i]); } - sch->qstats.backlog = 0; - sch->q.qlen = 0; } static void taprio_destroy(struct Qdisc *sch) { struct taprio_sched *q = qdisc_priv(sch); struct net_device *dev = qdisc_dev(sch); + struct sched_gate_list *oper, *admin; unsigned int i; - spin_lock(&taprio_list_lock); list_del(&q->taprio_list); - spin_unlock(&taprio_list_lock); /* Note that taprio_reset() might not be called if an error * happens in qdisc_create(), after taprio_init() has been called. @@ -1669,11 +1733,14 @@ static void taprio_destroy(struct Qdisc *sch) netdev_reset_tc(dev); - if (q->oper_sched) - call_rcu(&q->oper_sched->rcu, taprio_free_sched_cb); + oper = rtnl_dereference(q->oper_sched); + admin = rtnl_dereference(q->admin_sched); + + if (oper) + call_rcu(&oper->rcu, taprio_free_sched_cb); - if (q->admin_sched) - call_rcu(&q->admin_sched->rcu, taprio_free_sched_cb); + if (admin) + call_rcu(&admin->rcu, taprio_free_sched_cb); } static int taprio_init(struct Qdisc *sch, struct nlattr *opt, @@ -1688,9 +1755,6 @@ static int taprio_init(struct Qdisc *sch, struct nlattr *opt, hrtimer_init(&q->advance_timer, CLOCK_TAI, HRTIMER_MODE_ABS); q->advance_timer.function = advance_sched; - q->dequeue = taprio_dequeue_soft; - q->peek = taprio_peek_soft; - q->root = sch; /* We only support static clockids. Use an invalid value as default @@ -1699,15 +1763,17 @@ static int taprio_init(struct Qdisc *sch, struct nlattr *opt, q->clockid = -1; q->flags = TAPRIO_FLAGS_INVALID; - spin_lock(&taprio_list_lock); list_add(&q->taprio_list, &taprio_list); - spin_unlock(&taprio_list_lock); - if (sch->parent != TC_H_ROOT) + if (sch->parent != TC_H_ROOT) { + NL_SET_ERR_MSG_MOD(extack, "Can only be attached as root qdisc"); return -EOPNOTSUPP; + } - if (!netif_is_multiqueue(dev)) + if (!netif_is_multiqueue(dev)) { + NL_SET_ERR_MSG_MOD(extack, "Multi-queue device is required"); return -EOPNOTSUPP; + } /* pre-allocate qdisc, attachment can't fail */ q->qdiscs = kcalloc(dev->num_tx_queues, @@ -1879,6 +1945,33 @@ error_nest: return -1; } +static int taprio_dump_tc_entries(struct taprio_sched *q, struct sk_buff *skb) +{ + struct nlattr *n; + int tc; + + for (tc = 0; tc < TC_MAX_QUEUE; tc++) { + n = nla_nest_start(skb, TCA_TAPRIO_ATTR_TC_ENTRY); + if (!n) + return -EMSGSIZE; + + if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_INDEX, tc)) + goto nla_put_failure; + + if (nla_put_u32(skb, TCA_TAPRIO_TC_ENTRY_MAX_SDU, + q->max_sdu[tc])) + goto nla_put_failure; + + nla_nest_end(skb, n); + } + + return 0; + +nla_put_failure: + nla_nest_cancel(skb, n); + return -EMSGSIZE; +} + static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb) { struct taprio_sched *q = qdisc_priv(sch); @@ -1888,9 +1981,8 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb) struct nlattr *nest, *sched_nest; unsigned int i; - rcu_read_lock(); - oper = rcu_dereference(q->oper_sched); - admin = rcu_dereference(q->admin_sched); + oper = rtnl_dereference(q->oper_sched); + admin = rtnl_dereference(q->admin_sched); opt.num_tc = netdev_get_num_tc(dev); memcpy(opt.prio_tc_map, dev->prio_tc_map, sizeof(opt.prio_tc_map)); @@ -1918,6 +2010,9 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb) nla_put_u32(skb, TCA_TAPRIO_ATTR_TXTIME_DELAY, q->txtime_delay)) goto options_error; + if (taprio_dump_tc_entries(q, skb)) + goto options_error; + if (oper && dump_schedule(skb, oper)) goto options_error; @@ -1934,8 +2029,6 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb) nla_nest_end(skb, sched_nest); done: - rcu_read_unlock(); - return nla_nest_end(skb, nest); admin_error: @@ -1945,7 +2038,6 @@ options_error: nla_nest_cancel(skb, nest); start_error: - rcu_read_unlock(); return -ENOSPC; } @@ -2006,11 +2098,8 @@ static void taprio_walk(struct Qdisc *sch, struct qdisc_walker *arg) arg->count = arg->skip; for (ntx = arg->skip; ntx < dev->num_tx_queues; ntx++) { - if (arg->fn(sch, ntx + 1, arg) < 0) { - arg->stop = 1; + if (!tc_qdisc_stats_dump(sch, ntx + 1, arg)) break; - } - arg->count++; } } diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c index 36079fdde2cb..277ad11f4d61 100644 --- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -330,8 +330,6 @@ static void tbf_reset(struct Qdisc *sch) struct tbf_sched_data *q = qdisc_priv(sch); qdisc_reset(q->qdisc); - sch->qstats.backlog = 0; - sch->q.qlen = 0; q->t_c = ktime_get_ns(); q->tokens = q->buffer; q->ptokens = q->mtu; @@ -582,12 +580,7 @@ static unsigned long tbf_find(struct Qdisc *sch, u32 classid) static void tbf_walk(struct Qdisc *sch, struct qdisc_walker *walker) { if (!walker->stop) { - if (walker->count >= walker->skip) - if (walker->fn(sch, 1, walker) < 0) { - walker->stop = 1; - return; - } - walker->count++; + tc_qdisc_stats_dump(sch, 1, walker); } } diff --git a/net/sched/sch_teql.c b/net/sched/sch_teql.c index 6af6b95bdb67..16f9238aa51d 100644 --- a/net/sched/sch_teql.c +++ b/net/sched/sch_teql.c @@ -124,7 +124,6 @@ teql_reset(struct Qdisc *sch) struct teql_sched_data *dat = qdisc_priv(sch); skb_queue_purge(&dat->q); - sch->q.qlen = 0; } static void @@ -492,7 +491,7 @@ static int __init teql_init(void) master = netdev_priv(dev); - strlcpy(master->qops.id, dev->name, IFNAMSIZ); + strscpy(master->qops.id, dev->name, IFNAMSIZ); err = register_qdisc(&master->qops); if (err) { diff --git a/net/sctp/auth.c b/net/sctp/auth.c index db6b7373d16c..34964145514e 100644 --- a/net/sctp/auth.c +++ b/net/sctp/auth.c @@ -863,12 +863,17 @@ int sctp_auth_set_key(struct sctp_endpoint *ep, } list_del_init(&shkey->key_list); - sctp_auth_shkey_release(shkey); list_add(&cur_key->key_list, sh_keys); - if (asoc && asoc->active_key_id == auth_key->sca_keynumber) - sctp_auth_asoc_init_active_key(asoc, GFP_KERNEL); + if (asoc && asoc->active_key_id == auth_key->sca_keynumber && + sctp_auth_asoc_init_active_key(asoc, GFP_KERNEL)) { + list_del_init(&cur_key->key_list); + sctp_auth_shkey_release(cur_key); + list_add(&shkey->key_list, sh_keys); + return -ENOMEM; + } + sctp_auth_shkey_release(shkey); return 0; } @@ -902,8 +907,13 @@ int sctp_auth_set_active_key(struct sctp_endpoint *ep, return -EINVAL; if (asoc) { + __u16 active_key_id = asoc->active_key_id; + asoc->active_key_id = key_id; - sctp_auth_asoc_init_active_key(asoc, GFP_KERNEL); + if (sctp_auth_asoc_init_active_key(asoc, GFP_KERNEL)) { + asoc->active_key_id = active_key_id; + return -ENOMEM; + } } else ep->active_key_id = key_id; diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 0939cc3b915a..3ccbf3c201cd 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -379,6 +379,8 @@ static struct sock *smc_sock_alloc(struct net *net, struct socket *sock, sk->sk_state = SMC_INIT; sk->sk_destruct = smc_destruct; sk->sk_protocol = protocol; + WRITE_ONCE(sk->sk_sndbuf, READ_ONCE(net->smc.sysctl_wmem)); + WRITE_ONCE(sk->sk_rcvbuf, READ_ONCE(net->smc.sysctl_rmem)); smc = smc_sk(sk); INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work); INIT_WORK(&smc->connect_work, smc_connect_work); @@ -427,6 +429,7 @@ static int smc_bind(struct socket *sock, struct sockaddr *uaddr, goto out_rel; smc->clcsock->sk->sk_reuse = sk->sk_reuse; + smc->clcsock->sk->sk_reuseport = sk->sk_reuseport; rc = kernel_bind(smc->clcsock, uaddr, addr_len); out_rel: @@ -3253,9 +3256,6 @@ static int __smc_create(struct net *net, struct socket *sock, int protocol, smc->clcsock = clcsock; } - smc->sk.sk_sndbuf = max(smc->clcsock->sk->sk_sndbuf, SMC_BUF_MIN_SIZE); - smc->sk.sk_rcvbuf = max(smc->clcsock->sk->sk_rcvbuf, SMC_BUF_MIN_SIZE); - out: return rc; } diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index df89c2e08cbf..e6ee797640b4 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -2310,10 +2310,10 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb) if (is_rmb) /* use socket recv buffer size (w/o overhead) as start value */ - sk_buf_size = smc->sk.sk_rcvbuf / 2; + sk_buf_size = smc->sk.sk_rcvbuf; else /* use socket send buffer size (w/o overhead) as start value */ - sk_buf_size = smc->sk.sk_sndbuf / 2; + sk_buf_size = smc->sk.sk_sndbuf; for (bufsize_short = smc_compress_bufsize(sk_buf_size, is_smcd, is_rmb); bufsize_short >= 0; bufsize_short--) { @@ -2372,7 +2372,7 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb) if (is_rmb) { conn->rmb_desc = buf_desc; conn->rmbe_size_short = bufsize_short; - smc->sk.sk_rcvbuf = bufsize * 2; + smc->sk.sk_rcvbuf = bufsize; atomic_set(&conn->bytes_to_rcv, 0); conn->rmbe_update_limit = smc_rmb_wnd_update_limit(buf_desc->len); @@ -2380,7 +2380,7 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb) smc_ism_set_conn(conn); /* map RMB/smcd_dev to conn */ } else { conn->sndbuf_desc = buf_desc; - smc->sk.sk_sndbuf = bufsize * 2; + smc->sk.sk_sndbuf = bufsize; atomic_set(&conn->sndbuf_space, bufsize); } return 0; diff --git a/net/smc/smc_llc.c b/net/smc/smc_llc.c index 175026ae33ae..524649d0ab65 100644 --- a/net/smc/smc_llc.c +++ b/net/smc/smc_llc.c @@ -2127,7 +2127,7 @@ void smc_llc_lgr_init(struct smc_link_group *lgr, struct smc_sock *smc) init_waitqueue_head(&lgr->llc_flow_waiter); init_waitqueue_head(&lgr->llc_msg_waiter); mutex_init(&lgr->llc_conf_mutex); - lgr->llc_testlink_time = READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time); + lgr->llc_testlink_time = READ_ONCE(net->smc.sysctl_smcr_testlink_time); } /* called after lgr was removed from lgr_list */ diff --git a/net/smc/smc_llc.h b/net/smc/smc_llc.h index 4404e52b3346..7e7a3162c68b 100644 --- a/net/smc/smc_llc.h +++ b/net/smc/smc_llc.h @@ -19,6 +19,7 @@ #define SMC_LLC_WAIT_FIRST_TIME (5 * HZ) #define SMC_LLC_WAIT_TIME (2 * HZ) +#define SMC_LLC_TESTLINK_DEFAULT_TIME (30 * HZ) enum smc_llc_reqresp { SMC_LLC_REQ, diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c index c5a62f6f52ba..621c46c70073 100644 --- a/net/smc/smc_netlink.c +++ b/net/smc/smc_netlink.c @@ -142,7 +142,8 @@ struct genl_family smc_gen_nl_family __ro_after_init = { .netnsok = true, .module = THIS_MODULE, .ops = smc_gen_nl_ops, - .n_ops = ARRAY_SIZE(smc_gen_nl_ops) + .n_ops = ARRAY_SIZE(smc_gen_nl_ops), + .resv_start_op = SMC_NETLINK_DISABLE_HS_LIMITATION + 1, }; int __init smc_nl_init(void) diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c index 4c3bf6db7038..25fb2fd186e2 100644 --- a/net/smc/smc_pnet.c +++ b/net/smc/smc_pnet.c @@ -715,7 +715,8 @@ static struct genl_family smc_pnet_nl_family __ro_after_init = { .netnsok = true, .module = THIS_MODULE, .ops = smc_pnet_ops, - .n_ops = ARRAY_SIZE(smc_pnet_ops) + .n_ops = ARRAY_SIZE(smc_pnet_ops), + .resv_start_op = SMC_PNETID_FLUSH + 1, }; bool smc_pnet_is_ndev_pnetid(struct net *net, u8 *pnetid) diff --git a/net/smc/smc_sysctl.c b/net/smc/smc_sysctl.c index 0613868fdb97..b6f79fabb9d3 100644 --- a/net/smc/smc_sysctl.c +++ b/net/smc/smc_sysctl.c @@ -16,8 +16,12 @@ #include "smc.h" #include "smc_core.h" +#include "smc_llc.h" #include "smc_sysctl.h" +static int min_sndbuf = SMC_BUF_MIN_SIZE; +static int min_rcvbuf = SMC_BUF_MIN_SIZE; + static struct ctl_table smc_table[] = { { .procname = "autocorking_size", @@ -35,6 +39,29 @@ static struct ctl_table smc_table[] = { .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO, }, + { + .procname = "smcr_testlink_time", + .data = &init_net.smc.sysctl_smcr_testlink_time, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_jiffies, + }, + { + .procname = "wmem", + .data = &init_net.smc.sysctl_wmem, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = &min_sndbuf, + }, + { + .procname = "rmem", + .data = &init_net.smc.sysctl_rmem, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = &min_rcvbuf, + }, { } }; @@ -60,6 +87,9 @@ int __net_init smc_sysctl_net_init(struct net *net) net->smc.sysctl_autocorking_size = SMC_AUTOCORKING_DEFAULT_SIZE; net->smc.sysctl_smcr_buf_type = SMCR_PHYS_CONT_BUFS; + net->smc.sysctl_smcr_testlink_time = SMC_LLC_TESTLINK_DEFAULT_TIME; + WRITE_ONCE(net->smc.sysctl_wmem, READ_ONCE(net->ipv4.sysctl_tcp_wmem[1])); + WRITE_ONCE(net->smc.sysctl_rmem, READ_ONCE(net->ipv4.sysctl_tcp_rmem[1])); return 0; diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c index 8267b751a526..190b49c5cbc3 100644 --- a/net/tipc/name_distr.c +++ b/net/tipc/name_distr.c @@ -41,14 +41,6 @@ int sysctl_tipc_named_timeout __read_mostly = 2000; -struct distr_queue_item { - struct distr_item i; - u32 dtype; - u32 node; - unsigned long expires; - struct list_head next; -}; - /** * publ_to_item - add publication info to a publication message * @p: publication info diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c index c447cb5f879e..e8fd257c0e68 100644 --- a/net/tipc/netlink.c +++ b/net/tipc/netlink.c @@ -294,6 +294,7 @@ struct genl_family tipc_genl_family __ro_after_init = { .module = THIS_MODULE, .ops = tipc_genl_v2_ops, .n_ops = ARRAY_SIZE(tipc_genl_v2_ops), + .resv_start_op = TIPC_NL_ADDR_LEGACY_GET + 1, }; int __init tipc_netlink_start(void) diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c index 0749df80454d..fc68733673ba 100644 --- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -1357,6 +1357,7 @@ static struct genl_family tipc_genl_compat_family __ro_after_init = { .module = THIS_MODULE, .small_ops = tipc_genl_compat_ops, .n_small_ops = ARRAY_SIZE(tipc_genl_compat_ops), + .resv_start_op = TIPC_GENL_CMD + 1, }; int __init tipc_netlink_compat_start(void) diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index 0f983e5f7dde..a03d66046ca3 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -902,17 +902,28 @@ static void tls_device_core_ctrl_rx_resync(struct tls_context *tls_ctx, } static int -tls_device_reencrypt(struct sock *sk, struct tls_sw_context_rx *sw_ctx) +tls_device_reencrypt(struct sock *sk, struct tls_context *tls_ctx) { + struct tls_sw_context_rx *sw_ctx = tls_sw_ctx_rx(tls_ctx); + const struct tls_cipher_size_desc *cipher_sz; int err, offset, copy, data_len, pos; struct sk_buff *skb, *skb_iter; struct scatterlist sg[1]; struct strp_msg *rxm; char *orig_buf, *buf; + switch (tls_ctx->crypto_recv.info.cipher_type) { + case TLS_CIPHER_AES_GCM_128: + case TLS_CIPHER_AES_GCM_256: + break; + default: + return -EINVAL; + } + cipher_sz = &tls_cipher_size_desc[tls_ctx->crypto_recv.info.cipher_type]; + rxm = strp_msg(tls_strp_msg(sw_ctx)); - orig_buf = kmalloc(rxm->full_len + TLS_HEADER_SIZE + - TLS_CIPHER_AES_GCM_128_IV_SIZE, sk->sk_allocation); + orig_buf = kmalloc(rxm->full_len + TLS_HEADER_SIZE + cipher_sz->iv, + sk->sk_allocation); if (!orig_buf) return -ENOMEM; buf = orig_buf; @@ -927,10 +938,8 @@ tls_device_reencrypt(struct sock *sk, struct tls_sw_context_rx *sw_ctx) sg_init_table(sg, 1); sg_set_buf(&sg[0], buf, - rxm->full_len + TLS_HEADER_SIZE + - TLS_CIPHER_AES_GCM_128_IV_SIZE); - err = skb_copy_bits(skb, offset, buf, - TLS_HEADER_SIZE + TLS_CIPHER_AES_GCM_128_IV_SIZE); + rxm->full_len + TLS_HEADER_SIZE + cipher_sz->iv); + err = skb_copy_bits(skb, offset, buf, TLS_HEADER_SIZE + cipher_sz->iv); if (err) goto free_buf; @@ -941,7 +950,7 @@ tls_device_reencrypt(struct sock *sk, struct tls_sw_context_rx *sw_ctx) else err = 0; - data_len = rxm->full_len - TLS_CIPHER_AES_GCM_128_TAG_SIZE; + data_len = rxm->full_len - cipher_sz->tag; if (skb_pagelen(skb) > offset) { copy = min_t(int, skb_pagelen(skb) - offset, data_len); @@ -1024,7 +1033,7 @@ int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx) * likely have initial fragments decrypted, and final ones not * decrypted. We need to reencrypt that single SKB. */ - return tls_device_reencrypt(sk, sw_ctx); + return tls_device_reencrypt(sk, tls_ctx); } /* Return immediately if the record is either entirely plaintext or @@ -1041,7 +1050,7 @@ int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx) } ctx->resync_nh_reset = 1; - return tls_device_reencrypt(sk, sw_ctx); + return tls_device_reencrypt(sk, tls_ctx); } static void tls_device_attach(struct tls_context *ctx, struct sock *sk, @@ -1062,9 +1071,9 @@ static void tls_device_attach(struct tls_context *ctx, struct sock *sk, int tls_set_device_offload(struct sock *sk, struct tls_context *ctx) { - u16 nonce_size, tag_size, iv_size, rec_seq_size, salt_size; struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; + const struct tls_cipher_size_desc *cipher_sz; struct tls_record_info *start_marker_record; struct tls_offload_context_tx *offload_ctx; struct tls_crypto_info *crypto_info; @@ -1099,44 +1108,44 @@ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx) switch (crypto_info->cipher_type) { case TLS_CIPHER_AES_GCM_128: - nonce_size = TLS_CIPHER_AES_GCM_128_IV_SIZE; - tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE; - iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE; iv = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->iv; - rec_seq_size = TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE; - salt_size = TLS_CIPHER_AES_GCM_128_SALT_SIZE; rec_seq = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->rec_seq; break; + case TLS_CIPHER_AES_GCM_256: + iv = ((struct tls12_crypto_info_aes_gcm_256 *)crypto_info)->iv; + rec_seq = + ((struct tls12_crypto_info_aes_gcm_256 *)crypto_info)->rec_seq; + break; default: rc = -EINVAL; goto release_netdev; } + cipher_sz = &tls_cipher_size_desc[crypto_info->cipher_type]; /* Sanity-check the rec_seq_size for stack allocations */ - if (rec_seq_size > TLS_MAX_REC_SEQ_SIZE) { + if (cipher_sz->rec_seq > TLS_MAX_REC_SEQ_SIZE) { rc = -EINVAL; goto release_netdev; } prot->version = crypto_info->version; prot->cipher_type = crypto_info->cipher_type; - prot->prepend_size = TLS_HEADER_SIZE + nonce_size; - prot->tag_size = tag_size; + prot->prepend_size = TLS_HEADER_SIZE + cipher_sz->iv; + prot->tag_size = cipher_sz->tag; prot->overhead_size = prot->prepend_size + prot->tag_size; - prot->iv_size = iv_size; - prot->salt_size = salt_size; - ctx->tx.iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE, - GFP_KERNEL); + prot->iv_size = cipher_sz->iv; + prot->salt_size = cipher_sz->salt; + ctx->tx.iv = kmalloc(cipher_sz->iv + cipher_sz->salt, GFP_KERNEL); if (!ctx->tx.iv) { rc = -ENOMEM; goto release_netdev; } - memcpy(ctx->tx.iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, iv, iv_size); + memcpy(ctx->tx.iv + cipher_sz->salt, iv, cipher_sz->iv); - prot->rec_seq_size = rec_seq_size; - ctx->tx.rec_seq = kmemdup(rec_seq, rec_seq_size, GFP_KERNEL); + prot->rec_seq_size = cipher_sz->rec_seq; + ctx->tx.rec_seq = kmemdup(rec_seq, cipher_sz->rec_seq, GFP_KERNEL); if (!ctx->tx.rec_seq) { rc = -ENOMEM; goto free_iv; diff --git a/net/tls/tls_device_fallback.c b/net/tls/tls_device_fallback.c index 7dfc8023e0f1..cdb391a8754b 100644 --- a/net/tls/tls_device_fallback.c +++ b/net/tls/tls_device_fallback.c @@ -54,13 +54,25 @@ static int tls_enc_record(struct aead_request *aead_req, struct scatter_walk *out, int *in_len, struct tls_prot_info *prot) { - unsigned char buf[TLS_HEADER_SIZE + TLS_CIPHER_AES_GCM_128_IV_SIZE]; + unsigned char buf[TLS_HEADER_SIZE + MAX_IV_SIZE]; + const struct tls_cipher_size_desc *cipher_sz; struct scatterlist sg_in[3]; struct scatterlist sg_out[3]; + unsigned int buf_size; u16 len; int rc; - len = min_t(int, *in_len, ARRAY_SIZE(buf)); + switch (prot->cipher_type) { + case TLS_CIPHER_AES_GCM_128: + case TLS_CIPHER_AES_GCM_256: + break; + default: + return -EINVAL; + } + cipher_sz = &tls_cipher_size_desc[prot->cipher_type]; + + buf_size = TLS_HEADER_SIZE + cipher_sz->iv; + len = min_t(int, *in_len, buf_size); scatterwalk_copychunks(buf, in, len, 0); scatterwalk_copychunks(buf, out, len, 1); @@ -73,13 +85,11 @@ static int tls_enc_record(struct aead_request *aead_req, scatterwalk_pagedone(out, 1, 1); len = buf[4] | (buf[3] << 8); - len -= TLS_CIPHER_AES_GCM_128_IV_SIZE; + len -= cipher_sz->iv; - tls_make_aad(aad, len - TLS_CIPHER_AES_GCM_128_TAG_SIZE, - (char *)&rcd_sn, buf[0], prot); + tls_make_aad(aad, len - cipher_sz->tag, (char *)&rcd_sn, buf[0], prot); - memcpy(iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, buf + TLS_HEADER_SIZE, - TLS_CIPHER_AES_GCM_128_IV_SIZE); + memcpy(iv + cipher_sz->salt, buf + TLS_HEADER_SIZE, cipher_sz->iv); sg_init_table(sg_in, ARRAY_SIZE(sg_in)); sg_init_table(sg_out, ARRAY_SIZE(sg_out)); @@ -90,7 +100,7 @@ static int tls_enc_record(struct aead_request *aead_req, *in_len -= len; if (*in_len < 0) { - *in_len += TLS_CIPHER_AES_GCM_128_TAG_SIZE; + *in_len += cipher_sz->tag; /* the input buffer doesn't contain the entire record. * trim len accordingly. The resulting authentication tag * will contain garbage, but we don't care, so we won't @@ -111,7 +121,7 @@ static int tls_enc_record(struct aead_request *aead_req, scatterwalk_pagedone(out, 1, 1); } - len -= TLS_CIPHER_AES_GCM_128_TAG_SIZE; + len -= cipher_sz->tag; aead_request_set_crypt(aead_req, sg_in, sg_out, len, iv); rc = crypto_aead_encrypt(aead_req); @@ -299,11 +309,14 @@ static void fill_sg_out(struct scatterlist sg_out[3], void *buf, int sync_size, void *dummy_buf) { + const struct tls_cipher_size_desc *cipher_sz = + &tls_cipher_size_desc[tls_ctx->crypto_send.info.cipher_type]; + sg_set_buf(&sg_out[0], dummy_buf, sync_size); sg_set_buf(&sg_out[1], nskb->data + tcp_payload_offset, payload_len); /* Add room for authentication tag produced by crypto */ dummy_buf += sync_size; - sg_set_buf(&sg_out[2], dummy_buf, TLS_CIPHER_AES_GCM_128_TAG_SIZE); + sg_set_buf(&sg_out[2], dummy_buf, cipher_sz->tag); } static struct sk_buff *tls_enc_skb(struct tls_context *tls_ctx, @@ -315,7 +328,8 @@ static struct sk_buff *tls_enc_skb(struct tls_context *tls_ctx, struct tls_offload_context_tx *ctx = tls_offload_ctx_tx(tls_ctx); int tcp_payload_offset = skb_tcp_all_headers(skb); int payload_len = skb->len - tcp_payload_offset; - void *buf, *iv, *aad, *dummy_buf; + const struct tls_cipher_size_desc *cipher_sz; + void *buf, *iv, *aad, *dummy_buf, *salt; struct aead_request *aead_req; struct sk_buff *nskb = NULL; int buf_len; @@ -324,20 +338,26 @@ static struct sk_buff *tls_enc_skb(struct tls_context *tls_ctx, if (!aead_req) return NULL; - buf_len = TLS_CIPHER_AES_GCM_128_SALT_SIZE + - TLS_CIPHER_AES_GCM_128_IV_SIZE + - TLS_AAD_SPACE_SIZE + - sync_size + - TLS_CIPHER_AES_GCM_128_TAG_SIZE; + switch (tls_ctx->crypto_send.info.cipher_type) { + case TLS_CIPHER_AES_GCM_128: + salt = tls_ctx->crypto_send.aes_gcm_128.salt; + break; + case TLS_CIPHER_AES_GCM_256: + salt = tls_ctx->crypto_send.aes_gcm_256.salt; + break; + default: + return NULL; + } + cipher_sz = &tls_cipher_size_desc[tls_ctx->crypto_send.info.cipher_type]; + buf_len = cipher_sz->salt + cipher_sz->iv + TLS_AAD_SPACE_SIZE + + sync_size + cipher_sz->tag; buf = kmalloc(buf_len, GFP_ATOMIC); if (!buf) goto free_req; iv = buf; - memcpy(iv, tls_ctx->crypto_send.aes_gcm_128.salt, - TLS_CIPHER_AES_GCM_128_SALT_SIZE); - aad = buf + TLS_CIPHER_AES_GCM_128_SALT_SIZE + - TLS_CIPHER_AES_GCM_128_IV_SIZE; + memcpy(iv, salt, cipher_sz->salt); + aad = buf + cipher_sz->salt + cipher_sz->iv; dummy_buf = aad + TLS_AAD_SPACE_SIZE; nskb = alloc_skb(skb_headroom(skb) + skb->len, GFP_ATOMIC); @@ -451,6 +471,7 @@ int tls_sw_fallback_init(struct sock *sk, struct tls_offload_context_tx *offload_ctx, struct tls_crypto_info *crypto_info) { + const struct tls_cipher_size_desc *cipher_sz; const u8 *key; int rc; @@ -463,15 +484,23 @@ int tls_sw_fallback_init(struct sock *sk, goto err_out; } - key = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->key; + switch (crypto_info->cipher_type) { + case TLS_CIPHER_AES_GCM_128: + key = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->key; + break; + case TLS_CIPHER_AES_GCM_256: + key = ((struct tls12_crypto_info_aes_gcm_256 *)crypto_info)->key; + break; + default: + return -EINVAL; + } + cipher_sz = &tls_cipher_size_desc[crypto_info->cipher_type]; - rc = crypto_aead_setkey(offload_ctx->aead_send, key, - TLS_CIPHER_AES_GCM_128_KEY_SIZE); + rc = crypto_aead_setkey(offload_ctx->aead_send, key, cipher_sz->key); if (rc) goto free_aead; - rc = crypto_aead_setauthsize(offload_ctx->aead_send, - TLS_CIPHER_AES_GCM_128_TAG_SIZE); + rc = crypto_aead_setauthsize(offload_ctx->aead_send, cipher_sz->tag); if (rc) goto free_aead; diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 08ddf9d837ae..3735cb00905d 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -58,6 +58,23 @@ enum { TLS_NUM_PROTS, }; +#define CIPHER_SIZE_DESC(cipher) [cipher] = { \ + .iv = cipher ## _IV_SIZE, \ + .key = cipher ## _KEY_SIZE, \ + .salt = cipher ## _SALT_SIZE, \ + .tag = cipher ## _TAG_SIZE, \ + .rec_seq = cipher ## _REC_SEQ_SIZE, \ +} + +const struct tls_cipher_size_desc tls_cipher_size_desc[] = { + CIPHER_SIZE_DESC(TLS_CIPHER_AES_GCM_128), + CIPHER_SIZE_DESC(TLS_CIPHER_AES_GCM_256), + CIPHER_SIZE_DESC(TLS_CIPHER_AES_CCM_128), + CIPHER_SIZE_DESC(TLS_CIPHER_CHACHA20_POLY1305), + CIPHER_SIZE_DESC(TLS_CIPHER_SM4_GCM), + CIPHER_SIZE_DESC(TLS_CIPHER_SM4_CCM), +}; + static const struct proto *saved_tcpv6_prot; static DEFINE_MUTEX(tcpv6_prot_mutex); static const struct proto *saved_tcpv4_prot; @@ -507,6 +524,54 @@ static int do_tls_getsockopt_conf(struct sock *sk, char __user *optval, rc = -EFAULT; break; } + case TLS_CIPHER_ARIA_GCM_128: { + struct tls12_crypto_info_aria_gcm_128 * + crypto_info_aria_gcm_128 = + container_of(crypto_info, + struct tls12_crypto_info_aria_gcm_128, + info); + + if (len != sizeof(*crypto_info_aria_gcm_128)) { + rc = -EINVAL; + goto out; + } + lock_sock(sk); + memcpy(crypto_info_aria_gcm_128->iv, + cctx->iv + TLS_CIPHER_ARIA_GCM_128_SALT_SIZE, + TLS_CIPHER_ARIA_GCM_128_IV_SIZE); + memcpy(crypto_info_aria_gcm_128->rec_seq, cctx->rec_seq, + TLS_CIPHER_ARIA_GCM_128_REC_SEQ_SIZE); + release_sock(sk); + if (copy_to_user(optval, + crypto_info_aria_gcm_128, + sizeof(*crypto_info_aria_gcm_128))) + rc = -EFAULT; + break; + } + case TLS_CIPHER_ARIA_GCM_256: { + struct tls12_crypto_info_aria_gcm_256 * + crypto_info_aria_gcm_256 = + container_of(crypto_info, + struct tls12_crypto_info_aria_gcm_256, + info); + + if (len != sizeof(*crypto_info_aria_gcm_256)) { + rc = -EINVAL; + goto out; + } + lock_sock(sk); + memcpy(crypto_info_aria_gcm_256->iv, + cctx->iv + TLS_CIPHER_ARIA_GCM_256_SALT_SIZE, + TLS_CIPHER_ARIA_GCM_256_IV_SIZE); + memcpy(crypto_info_aria_gcm_256->rec_seq, cctx->rec_seq, + TLS_CIPHER_ARIA_GCM_256_REC_SEQ_SIZE); + release_sock(sk); + if (copy_to_user(optval, + crypto_info_aria_gcm_256, + sizeof(*crypto_info_aria_gcm_256))) + rc = -EFAULT; + break; + } default: rc = -EINVAL; } @@ -668,6 +733,20 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval, case TLS_CIPHER_SM4_CCM: optsize = sizeof(struct tls12_crypto_info_sm4_ccm); break; + case TLS_CIPHER_ARIA_GCM_128: + if (crypto_info->version != TLS_1_2_VERSION) { + rc = -EINVAL; + goto err_crypto_info; + } + optsize = sizeof(struct tls12_crypto_info_aria_gcm_128); + break; + case TLS_CIPHER_ARIA_GCM_256: + if (crypto_info->version != TLS_1_2_VERSION) { + rc = -EINVAL; + goto err_crypto_info; + } + optsize = sizeof(struct tls12_crypto_info_aria_gcm_256); + break; default: rc = -EINVAL; goto err_crypto_info; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index fe27241cd13f..264cf367e265 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -2629,6 +2629,40 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx) cipher_name = "ccm(sm4)"; break; } + case TLS_CIPHER_ARIA_GCM_128: { + struct tls12_crypto_info_aria_gcm_128 *aria_gcm_128_info; + + aria_gcm_128_info = (void *)crypto_info; + nonce_size = TLS_CIPHER_ARIA_GCM_128_IV_SIZE; + tag_size = TLS_CIPHER_ARIA_GCM_128_TAG_SIZE; + iv_size = TLS_CIPHER_ARIA_GCM_128_IV_SIZE; + iv = aria_gcm_128_info->iv; + rec_seq_size = TLS_CIPHER_ARIA_GCM_128_REC_SEQ_SIZE; + rec_seq = aria_gcm_128_info->rec_seq; + keysize = TLS_CIPHER_ARIA_GCM_128_KEY_SIZE; + key = aria_gcm_128_info->key; + salt = aria_gcm_128_info->salt; + salt_size = TLS_CIPHER_ARIA_GCM_128_SALT_SIZE; + cipher_name = "gcm(aria)"; + break; + } + case TLS_CIPHER_ARIA_GCM_256: { + struct tls12_crypto_info_aria_gcm_256 *gcm_256_info; + + gcm_256_info = (void *)crypto_info; + nonce_size = TLS_CIPHER_ARIA_GCM_256_IV_SIZE; + tag_size = TLS_CIPHER_ARIA_GCM_256_TAG_SIZE; + iv_size = TLS_CIPHER_ARIA_GCM_256_IV_SIZE; + iv = gcm_256_info->iv; + rec_seq_size = TLS_CIPHER_ARIA_GCM_256_REC_SEQ_SIZE; + rec_seq = gcm_256_info->rec_seq; + keysize = TLS_CIPHER_ARIA_GCM_256_KEY_SIZE; + key = gcm_256_info->key; + salt = gcm_256_info->salt; + salt_size = TLS_CIPHER_ARIA_GCM_256_SALT_SIZE; + cipher_name = "gcm(aria)"; + break; + } default: rc = -EINVAL; goto free_priv; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index bf338b782fc4..0f08c3177872 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -569,12 +569,6 @@ static void unix_sock_destructor(struct sock *sk) skb_queue_purge(&sk->sk_receive_queue); -#if IS_ENABLED(CONFIG_AF_UNIX_OOB) - if (u->oob_skb) { - kfree_skb(u->oob_skb); - u->oob_skb = NULL; - } -#endif DEBUG_NET_WARN_ON_ONCE(refcount_read(&sk->sk_wmem_alloc)); DEBUG_NET_WARN_ON_ONCE(!sk_unhashed(sk)); DEBUG_NET_WARN_ON_ONCE(sk->sk_socket); @@ -620,6 +614,13 @@ static void unix_release_sock(struct sock *sk, int embrion) unix_state_unlock(sk); +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) + if (u->oob_skb) { + kfree_skb(u->oob_skb); + u->oob_skb = NULL; + } +#endif + wake_up_interruptible_all(&u->peer_wait); if (skpair != NULL) { @@ -785,15 +786,45 @@ static int unix_set_peek_off(struct sock *sk, int val) } #ifdef CONFIG_PROC_FS +static int unix_count_nr_fds(struct sock *sk) +{ + struct sk_buff *skb; + struct unix_sock *u; + int nr_fds = 0; + + spin_lock(&sk->sk_receive_queue.lock); + skb = skb_peek(&sk->sk_receive_queue); + while (skb) { + u = unix_sk(skb->sk); + nr_fds += atomic_read(&u->scm_stat.nr_fds); + skb = skb_peek_next(skb, &sk->sk_receive_queue); + } + spin_unlock(&sk->sk_receive_queue.lock); + + return nr_fds; +} + static void unix_show_fdinfo(struct seq_file *m, struct socket *sock) { struct sock *sk = sock->sk; struct unix_sock *u; + int nr_fds; if (sk) { - u = unix_sk(sock->sk); - seq_printf(m, "scm_fds: %u\n", - atomic_read(&u->scm_stat.nr_fds)); + u = unix_sk(sk); + if (sock->type == SOCK_DGRAM) { + nr_fds = atomic_read(&u->scm_stat.nr_fds); + goto out_print; + } + + unix_state_lock(sk); + if (sk->sk_state != TCP_LISTEN) + nr_fds = atomic_read(&u->scm_stat.nr_fds); + else + nr_fds = unix_count_nr_fds(sk); + unix_state_unlock(sk); +out_print: + seq_printf(m, "scm_fds: %u\n", nr_fds); } } #else @@ -2506,32 +2537,18 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor) { - int copied = 0; - - while (1) { - struct unix_sock *u = unix_sk(sk); - struct sk_buff *skb; - int used, err; - - mutex_lock(&u->iolock); - skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err); - mutex_unlock(&u->iolock); - if (!skb) - return err; + struct unix_sock *u = unix_sk(sk); + struct sk_buff *skb; + int err, copied; - used = recv_actor(sk, skb); - if (used <= 0) { - if (!copied) - copied = used; - kfree_skb(skb); - break; - } else if (used <= skb->len) { - copied += used; - } + mutex_lock(&u->iolock); + skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err); + mutex_unlock(&u->iolock); + if (!skb) + return err; - kfree_skb(skb); - break; - } + copied = recv_actor(sk, skb); + kfree_skb(skb); return copied; } diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index b4ee163154a6..ee418701cdee 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -882,6 +882,16 @@ s64 vsock_stream_has_space(struct vsock_sock *vsk) } EXPORT_SYMBOL_GPL(vsock_stream_has_space); +void vsock_data_ready(struct sock *sk) +{ + struct vsock_sock *vsk = vsock_sk(sk); + + if (vsock_stream_has_data(vsk) >= sk->sk_rcvlowat || + sock_flag(sk, SOCK_DONE)) + sk->sk_data_ready(sk); +} +EXPORT_SYMBOL_GPL(vsock_data_ready); + static int vsock_release(struct socket *sock) { __vsock_release(sock->sk, 0); @@ -1066,8 +1076,9 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock, if (transport && transport->stream_is_active(vsk) && !(sk->sk_shutdown & RCV_SHUTDOWN)) { bool data_ready_now = false; + int target = sock_rcvlowat(sk, 0, INT_MAX); int ret = transport->notify_poll_in( - vsk, 1, &data_ready_now); + vsk, target, &data_ready_now); if (ret < 0) { mask |= EPOLLERR; } else { @@ -2137,6 +2148,25 @@ out: return err; } +static int vsock_set_rcvlowat(struct sock *sk, int val) +{ + const struct vsock_transport *transport; + struct vsock_sock *vsk; + + vsk = vsock_sk(sk); + + if (val > vsk->buffer_size) + return -EINVAL; + + transport = vsk->transport; + + if (transport && transport->set_rcvlowat) + return transport->set_rcvlowat(vsk, val); + + WRITE_ONCE(sk->sk_rcvlowat, val ? : 1); + return 0; +} + static const struct proto_ops vsock_stream_ops = { .family = PF_VSOCK, .owner = THIS_MODULE, @@ -2156,6 +2186,7 @@ static const struct proto_ops vsock_stream_ops = { .recvmsg = vsock_connectible_recvmsg, .mmap = sock_no_mmap, .sendpage = sock_no_sendpage, + .set_rcvlowat = vsock_set_rcvlowat, }; static const struct proto_ops vsock_seqpacket_ops = { diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c index fd98229e3db3..59c3e2697069 100644 --- a/net/vmw_vsock/hyperv_transport.c +++ b/net/vmw_vsock/hyperv_transport.c @@ -815,6 +815,12 @@ int hvs_notify_send_post_enqueue(struct vsock_sock *vsk, ssize_t written, return 0; } +static +int hvs_set_rcvlowat(struct vsock_sock *vsk, int val) +{ + return -EOPNOTSUPP; +} + static struct vsock_transport hvs_transport = { .module = THIS_MODULE, @@ -850,6 +856,7 @@ static struct vsock_transport hvs_transport = { .notify_send_pre_enqueue = hvs_notify_send_pre_enqueue, .notify_send_post_enqueue = hvs_notify_send_post_enqueue, + .set_rcvlowat = hvs_set_rcvlowat }; static bool hvs_check_transport(struct vsock_sock *vsk) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index ec2c2afbf0d0..a9980e9b9304 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -634,10 +634,7 @@ virtio_transport_notify_poll_in(struct vsock_sock *vsk, size_t target, bool *data_ready_now) { - if (vsock_stream_has_data(vsk)) - *data_ready_now = true; - else - *data_ready_now = false; + *data_ready_now = vsock_stream_has_data(vsk) >= target; return 0; } @@ -1084,7 +1081,7 @@ virtio_transport_recv_connected(struct sock *sk, switch (le16_to_cpu(pkt->hdr.op)) { case VIRTIO_VSOCK_OP_RW: virtio_transport_recv_enqueue(vsk, pkt); - sk->sk_data_ready(sk); + vsock_data_ready(sk); return err; case VIRTIO_VSOCK_OP_CREDIT_REQUEST: virtio_transport_send_credit_update(vsk); @@ -1342,7 +1339,7 @@ EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt); void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt) { - kfree(pkt->buf); + kvfree(pkt->buf); kfree(pkt); } EXPORT_SYMBOL_GPL(virtio_transport_free_pkt); diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c index b14f0ed7427b..842c94286d31 100644 --- a/net/vmw_vsock/vmci_transport.c +++ b/net/vmw_vsock/vmci_transport.c @@ -951,7 +951,7 @@ static int vmci_transport_recv_listen(struct sock *sk, * for ourself or any previous connection requests that we received. * If it's the latter, we try to find a socket in our list of pending * connections and, if we do, call the appropriate handler for the - * state that that socket is in. Otherwise we try to service the + * state that socket is in. Otherwise we try to service the * connection request. */ pending = vmci_transport_get_pending(sk, pkt); diff --git a/net/vmw_vsock/vmci_transport_notify.c b/net/vmw_vsock/vmci_transport_notify.c index d69fc4b595ad..7c3a7db134b2 100644 --- a/net/vmw_vsock/vmci_transport_notify.c +++ b/net/vmw_vsock/vmci_transport_notify.c @@ -307,7 +307,7 @@ vmci_transport_handle_wrote(struct sock *sk, struct vsock_sock *vsk = vsock_sk(sk); PKT_FIELD(vsk, sent_waiting_read) = false; #endif - sk->sk_data_ready(sk); + vsock_data_ready(sk); } static void vmci_transport_notify_pkt_socket_init(struct sock *sk) @@ -340,12 +340,12 @@ vmci_transport_notify_pkt_poll_in(struct sock *sk, { struct vsock_sock *vsk = vsock_sk(sk); - if (vsock_stream_has_data(vsk)) { + if (vsock_stream_has_data(vsk) >= target) { *data_ready_now = true; } else { - /* We can't read right now because there is nothing in the - * queue. Ask for notifications when there is something to - * read. + /* We can't read right now because there is not enough data + * in the queue. Ask for notifications when there is something + * to read. */ if (sk->sk_state == TCP_ESTABLISHED) { if (!send_waiting_read(sk, 1)) diff --git a/net/vmw_vsock/vmci_transport_notify_qstate.c b/net/vmw_vsock/vmci_transport_notify_qstate.c index 0f36d7c45db3..e96a88d850a8 100644 --- a/net/vmw_vsock/vmci_transport_notify_qstate.c +++ b/net/vmw_vsock/vmci_transport_notify_qstate.c @@ -84,7 +84,7 @@ vmci_transport_handle_wrote(struct sock *sk, bool bottom_half, struct sockaddr_vm *dst, struct sockaddr_vm *src) { - sk->sk_data_ready(sk); + vsock_data_ready(sk); } static void vsock_block_update_write_window(struct sock *sk) @@ -161,12 +161,12 @@ vmci_transport_notify_pkt_poll_in(struct sock *sk, { struct vsock_sock *vsk = vsock_sk(sk); - if (vsock_stream_has_data(vsk)) { + if (vsock_stream_has_data(vsk) >= target) { *data_ready_now = true; } else { - /* We can't read right now because there is nothing in the - * queue. Ask for notifications when there is something to - * read. + /* We can't read right now because there is not enough data + * in the queue. Ask for notifications when there is something + * to read. */ if (sk->sk_state == TCP_ESTABLISHED) vsock_block_update_write_window(sk); @@ -282,7 +282,7 @@ vmci_transport_notify_pkt_recv_post_dequeue( /* See the comment in * vmci_transport_notify_pkt_send_post_enqueue(). */ - sk->sk_data_ready(sk); + vsock_data_ready(sk); } return err; diff --git a/net/wireless/core.c b/net/wireless/core.c index eefd6d8ff465..5b0c4d5b80cf 100644 --- a/net/wireless/core.c +++ b/net/wireless/core.c @@ -860,6 +860,9 @@ int wiphy_register(struct wiphy *wiphy) for (i = 0; i < sband->n_iftype_data; i++) { const struct ieee80211_sband_iftype_data *iftd; + bool has_ap, has_non_ap; + u32 ap_bits = BIT(NL80211_IFTYPE_AP) | + BIT(NL80211_IFTYPE_P2P_GO); iftd = &sband->iftype_data[i]; @@ -879,6 +882,19 @@ int wiphy_register(struct wiphy *wiphy) else have_he = have_he && iftd->he_cap.has_he; + + has_ap = iftd->types_mask & ap_bits; + has_non_ap = iftd->types_mask & ~ap_bits; + + /* + * For EHT 20 MHz STA, the capabilities format differs + * but to simplify, don't check 20 MHz but rather check + * only if AP and non-AP were mentioned at the same time, + * reject if so. + */ + if (WARN_ON(iftd->eht_cap.has_eht && + has_ap && has_non_ap)) + return -EINVAL; } if (WARN_ON(!have_he && band == NL80211_BAND_6GHZ)) diff --git a/net/wireless/ibss.c b/net/wireless/ibss.c index 4935f94d1acc..edd062f104f4 100644 --- a/net/wireless/ibss.c +++ b/net/wireless/ibss.c @@ -171,7 +171,7 @@ static void __cfg80211_clear_ibss(struct net_device *dev, bool nowext) */ if (rdev->ops->del_key) for (i = 0; i < 6; i++) - rdev_del_key(rdev, dev, i, false, NULL); + rdev_del_key(rdev, dev, -1, i, false, NULL); if (wdev->u.ibss.current_bss) { cfg80211_unhold_bss(wdev->u.ibss.current_bss); diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index 2705e3ee8fc4..8ff8b1c040f0 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -1545,7 +1545,6 @@ static int nl80211_key_allowed(struct wireless_dev *wdev) return -ENOLINK; case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_P2P_CLIENT: - /* for MLO, require driver validation of the link ID */ if (wdev->connected) return 0; return -ENOLINK; @@ -1821,10 +1820,15 @@ nl80211_send_iftype_data(struct sk_buff *msg, if (eht_cap->has_eht && he_cap->has_he) { u8 mcs_nss_size, ppe_thresh_size; u16 ppe_thres_hdr; + bool is_ap; + + is_ap = iftdata->types_mask & BIT(NL80211_IFTYPE_AP) || + iftdata->types_mask & BIT(NL80211_IFTYPE_P2P_GO); mcs_nss_size = ieee80211_eht_mcs_nss_size(&he_cap->he_cap_elem, - &eht_cap->eht_cap_elem); + &eht_cap->eht_cap_elem, + is_ap); ppe_thres_hdr = get_unaligned_le16(&eht_cap->eht_ppe_thres[0]); ppe_thresh_size = @@ -3476,8 +3480,21 @@ static int nl80211_set_wiphy(struct sk_buff *skb, struct genl_info *info) if (result) goto out; - result = rdev_set_txq_params(rdev, netdev, - &txq_params); + txq_params.link_id = + nl80211_link_id_or_invalid(info->attrs); + + wdev_lock(netdev->ieee80211_ptr); + if (txq_params.link_id >= 0 && + !(netdev->ieee80211_ptr->valid_links & + BIT(txq_params.link_id))) + result = -ENOLINK; + else if (txq_params.link_id >= 0 && + !netdev->ieee80211_ptr->valid_links) + result = -EINVAL; + else + result = rdev_set_txq_params(rdev, netdev, + &txq_params); + wdev_unlock(netdev->ieee80211_ptr); if (result) goto out; } @@ -3848,12 +3865,19 @@ static int nl80211_send_iface(struct sk_buff *msg, u32 portid, u32 seq, int flag for_each_valid_link(wdev, link_id) { struct nlattr *link = nla_nest_start(msg, link_id + 1); + struct cfg80211_chan_def chandef = {}; + int ret; if (nla_put_u8(msg, NL80211_ATTR_MLO_LINK_ID, link_id)) goto nla_put_failure; if (nla_put(msg, NL80211_ATTR_MAC, ETH_ALEN, wdev->links[link_id].addr)) goto nla_put_failure; + + ret = rdev_get_channel(rdev, wdev, link_id, &chandef); + if (ret == 0 && nl80211_send_chandef(msg, &chandef)) + goto nla_put_failure; + nla_nest_end(msg, link); } @@ -4320,6 +4344,38 @@ static int nl80211_set_noack_map(struct sk_buff *skb, struct genl_info *info) return rdev_set_noack_map(rdev, dev, noack_map); } +static int nl80211_validate_key_link_id(struct genl_info *info, + struct wireless_dev *wdev, + int link_id, bool pairwise) +{ + if (pairwise) { + if (link_id != -1) { + GENL_SET_ERR_MSG(info, + "link ID not allowed for pairwise key"); + return -EINVAL; + } + + return 0; + } + + if (wdev->valid_links) { + if (link_id == -1) { + GENL_SET_ERR_MSG(info, + "link ID must for MLO group key"); + return -EINVAL; + } + if (!(wdev->valid_links & BIT(link_id))) { + GENL_SET_ERR_MSG(info, "invalid link ID for MLO group key"); + return -EINVAL; + } + } else if (link_id != -1) { + GENL_SET_ERR_MSG(info, "link ID not allowed for non-MLO group key"); + return -EINVAL; + } + + return 0; +} + struct get_key_cookie { struct sk_buff *msg; int error; @@ -4381,13 +4437,15 @@ static int nl80211_get_key(struct sk_buff *skb, struct genl_info *info) void *hdr; struct sk_buff *msg; bool bigtk_support = false; + int link_id = nl80211_link_id_or_invalid(info->attrs); + struct wireless_dev *wdev = dev->ieee80211_ptr; if (wiphy_ext_feature_isset(&rdev->wiphy, NL80211_EXT_FEATURE_BEACON_PROTECTION)) bigtk_support = true; - if ((dev->ieee80211_ptr->iftype == NL80211_IFTYPE_STATION || - dev->ieee80211_ptr->iftype == NL80211_IFTYPE_P2P_CLIENT) && + if ((wdev->iftype == NL80211_IFTYPE_STATION || + wdev->iftype == NL80211_IFTYPE_P2P_CLIENT) && wiphy_ext_feature_isset(&rdev->wiphy, NL80211_EXT_FEATURE_BEACON_PROTECTION_CLIENT)) bigtk_support = true; @@ -4439,8 +4497,12 @@ static int nl80211_get_key(struct sk_buff *skb, struct genl_info *info) nla_put(msg, NL80211_ATTR_MAC, ETH_ALEN, mac_addr)) goto nla_put_failure; - err = rdev_get_key(rdev, dev, key_idx, pairwise, mac_addr, &cookie, - get_key_callback); + err = nl80211_validate_key_link_id(info, wdev, link_id, pairwise); + if (err) + goto free_msg; + + err = rdev_get_key(rdev, dev, link_id, key_idx, pairwise, mac_addr, + &cookie, get_key_callback); if (err) goto free_msg; @@ -4464,6 +4526,8 @@ static int nl80211_set_key(struct sk_buff *skb, struct genl_info *info) struct key_parse key; int err; struct net_device *dev = info->user_ptr[1]; + int link_id = nl80211_link_id_or_invalid(info->attrs); + struct wireless_dev *wdev = dev->ieee80211_ptr; err = nl80211_parse_key(info, &key); if (err) @@ -4479,7 +4543,7 @@ static int nl80211_set_key(struct sk_buff *skb, struct genl_info *info) !(key.p.mode == NL80211_KEY_SET_TX)) return -EINVAL; - wdev_lock(dev->ieee80211_ptr); + wdev_lock(wdev); if (key.def) { if (!rdev->ops->set_default_key) { @@ -4487,18 +4551,22 @@ static int nl80211_set_key(struct sk_buff *skb, struct genl_info *info) goto out; } - err = nl80211_key_allowed(dev->ieee80211_ptr); + err = nl80211_key_allowed(wdev); if (err) goto out; - err = rdev_set_default_key(rdev, dev, key.idx, - key.def_uni, key.def_multi); + err = nl80211_validate_key_link_id(info, wdev, link_id, false); + if (err) + goto out; + + err = rdev_set_default_key(rdev, dev, link_id, key.idx, + key.def_uni, key.def_multi); if (err) goto out; #ifdef CONFIG_CFG80211_WEXT - dev->ieee80211_ptr->wext.default_key = key.idx; + wdev->wext.default_key = key.idx; #endif } else if (key.defmgmt) { if (key.def_uni || !key.def_multi) { @@ -4511,16 +4579,20 @@ static int nl80211_set_key(struct sk_buff *skb, struct genl_info *info) goto out; } - err = nl80211_key_allowed(dev->ieee80211_ptr); + err = nl80211_key_allowed(wdev); if (err) goto out; - err = rdev_set_default_mgmt_key(rdev, dev, key.idx); + err = nl80211_validate_key_link_id(info, wdev, link_id, false); + if (err) + goto out; + + err = rdev_set_default_mgmt_key(rdev, dev, link_id, key.idx); if (err) goto out; #ifdef CONFIG_CFG80211_WEXT - dev->ieee80211_ptr->wext.default_mgmt_key = key.idx; + wdev->wext.default_mgmt_key = key.idx; #endif } else if (key.defbeacon) { if (key.def_uni || !key.def_multi) { @@ -4533,11 +4605,15 @@ static int nl80211_set_key(struct sk_buff *skb, struct genl_info *info) goto out; } - err = nl80211_key_allowed(dev->ieee80211_ptr); + err = nl80211_key_allowed(wdev); + if (err) + goto out; + + err = nl80211_validate_key_link_id(info, wdev, link_id, false); if (err) goto out; - err = rdev_set_default_beacon_key(rdev, dev, key.idx); + err = rdev_set_default_beacon_key(rdev, dev, link_id, key.idx); if (err) goto out; } else if (key.p.mode == NL80211_KEY_SET_TX && @@ -4553,14 +4629,18 @@ static int nl80211_set_key(struct sk_buff *skb, struct genl_info *info) goto out; } - err = rdev_add_key(rdev, dev, key.idx, + err = nl80211_validate_key_link_id(info, wdev, link_id, true); + if (err) + goto out; + + err = rdev_add_key(rdev, dev, link_id, key.idx, NL80211_KEYTYPE_PAIRWISE, mac_addr, &key.p); } else { err = -EINVAL; } out: - wdev_unlock(dev->ieee80211_ptr); + wdev_unlock(wdev); return err; } @@ -4572,6 +4652,8 @@ static int nl80211_new_key(struct sk_buff *skb, struct genl_info *info) struct net_device *dev = info->user_ptr[1]; struct key_parse key; const u8 *mac_addr = NULL; + int link_id = nl80211_link_id_or_invalid(info->attrs); + struct wireless_dev *wdev = dev->ieee80211_ptr; err = nl80211_parse_key(info, &key); if (err) @@ -4613,18 +4695,23 @@ static int nl80211_new_key(struct sk_buff *skb, struct genl_info *info) return -EINVAL; } - wdev_lock(dev->ieee80211_ptr); - err = nl80211_key_allowed(dev->ieee80211_ptr); + wdev_lock(wdev); + err = nl80211_key_allowed(wdev); if (err) GENL_SET_ERR_MSG(info, "key not allowed"); + + if (!err) + err = nl80211_validate_key_link_id(info, wdev, link_id, + key.type == NL80211_KEYTYPE_PAIRWISE); + if (!err) { - err = rdev_add_key(rdev, dev, key.idx, + err = rdev_add_key(rdev, dev, link_id, key.idx, key.type == NL80211_KEYTYPE_PAIRWISE, mac_addr, &key.p); if (err) GENL_SET_ERR_MSG(info, "key addition failed"); } - wdev_unlock(dev->ieee80211_ptr); + wdev_unlock(wdev); return err; } @@ -4636,6 +4723,8 @@ static int nl80211_del_key(struct sk_buff *skb, struct genl_info *info) struct net_device *dev = info->user_ptr[1]; u8 *mac_addr = NULL; struct key_parse key; + int link_id = nl80211_link_id_or_invalid(info->attrs); + struct wireless_dev *wdev = dev->ieee80211_ptr; err = nl80211_parse_key(info, &key); if (err) @@ -4663,27 +4752,31 @@ static int nl80211_del_key(struct sk_buff *skb, struct genl_info *info) if (!rdev->ops->del_key) return -EOPNOTSUPP; - wdev_lock(dev->ieee80211_ptr); - err = nl80211_key_allowed(dev->ieee80211_ptr); + wdev_lock(wdev); + err = nl80211_key_allowed(wdev); if (key.type == NL80211_KEYTYPE_GROUP && mac_addr && !(rdev->wiphy.flags & WIPHY_FLAG_IBSS_RSN)) err = -ENOENT; if (!err) - err = rdev_del_key(rdev, dev, key.idx, + err = nl80211_validate_key_link_id(info, wdev, link_id, + key.type == NL80211_KEYTYPE_PAIRWISE); + + if (!err) + err = rdev_del_key(rdev, dev, link_id, key.idx, key.type == NL80211_KEYTYPE_PAIRWISE, mac_addr); #ifdef CONFIG_CFG80211_WEXT if (!err) { - if (key.idx == dev->ieee80211_ptr->wext.default_key) - dev->ieee80211_ptr->wext.default_key = -1; - else if (key.idx == dev->ieee80211_ptr->wext.default_mgmt_key) - dev->ieee80211_ptr->wext.default_mgmt_key = -1; + if (key.idx == wdev->wext.default_key) + wdev->wext.default_key = -1; + else if (key.idx == wdev->wext.default_mgmt_key) + wdev->wext.default_mgmt_key = -1; } #endif - wdev_unlock(dev->ieee80211_ptr); + wdev_unlock(wdev); return err; } @@ -5587,7 +5680,7 @@ static int nl80211_calculate_ap_params(struct cfg80211_ap_settings *params) params->eht_cap = (void *)(cap->data + 1); if (!ieee80211_eht_capa_size_ok((const u8 *)params->he_cap, (const u8 *)params->eht_cap, - cap->datalen - 1)) + cap->datalen - 1, true)) return -EINVAL; } cap = cfg80211_find_ext_elem(WLAN_EID_EXT_EHT_OPERATION, ies, ies_len); @@ -6816,7 +6909,8 @@ static int nl80211_set_station_tdls(struct genl_info *info, if (!ieee80211_eht_capa_size_ok((const u8 *)params->link_sta_params.he_capa, (const u8 *)params->link_sta_params.eht_capa, - params->link_sta_params.eht_capa_len)) + params->link_sta_params.eht_capa_len, + false)) return -EINVAL; } } @@ -7127,7 +7221,8 @@ static int nl80211_new_station(struct sk_buff *skb, struct genl_info *info) if (!ieee80211_eht_capa_size_ok((const u8 *)params.link_sta_params.he_capa, (const u8 *)params.link_sta_params.eht_capa, - params.link_sta_params.eht_capa_len)) + params.link_sta_params.eht_capa_len, + false)) return -EINVAL; } } @@ -10087,8 +10182,10 @@ static int nl80211_send_bss(struct sk_buff *msg, struct netlink_callback *cb, (nla_put_u32(msg, NL80211_BSS_STATUS, NL80211_BSS_STATUS_ASSOCIATED) || (wdev->valid_links && - nla_put_u8(msg, NL80211_BSS_MLO_LINK_ID, - link_id)))) + (nla_put_u8(msg, NL80211_BSS_MLO_LINK_ID, + link_id) || + nla_put(msg, NL80211_BSS_MLD_ADDR, ETH_ALEN, + wdev->u.client.connected_addr))))) goto nla_put_failure; } break; @@ -11184,7 +11281,6 @@ static int nl80211_set_mcast_rate(struct sk_buff *skb, struct genl_info *info) struct net_device *dev = info->user_ptr[1]; int mcast_rate[NUM_NL80211_BANDS]; u32 nla_rate; - int err; if (dev->ieee80211_ptr->iftype != NL80211_IFTYPE_ADHOC && dev->ieee80211_ptr->iftype != NL80211_IFTYPE_MESH_POINT && @@ -11203,9 +11299,7 @@ static int nl80211_set_mcast_rate(struct sk_buff *skb, struct genl_info *info) if (!nl80211_parse_mcast_rate(rdev, mcast_rate, nla_rate)) return -EINVAL; - err = rdev_set_mcast_rate(rdev, dev, mcast_rate); - - return err; + return rdev_set_mcast_rate(rdev, dev, mcast_rate); } static struct sk_buff * @@ -15887,7 +15981,8 @@ nl80211_add_mod_link_station(struct sk_buff *skb, struct genl_info *info, if (!ieee80211_eht_capa_size_ok((const u8 *)params.he_capa, (const u8 *)params.eht_capa, - params.eht_capa_len)) + params.eht_capa_len, + false)) return -EINVAL; } } @@ -17141,6 +17236,7 @@ static struct genl_family nl80211_fam __ro_after_init = { .n_ops = ARRAY_SIZE(nl80211_ops), .small_ops = nl80211_small_ops, .n_small_ops = ARRAY_SIZE(nl80211_small_ops), + .resv_start_op = NL80211_CMD_REMOVE_LINK_STA + 1, .mcgrps = nl80211_mcgrps, .n_mcgrps = ARRAY_SIZE(nl80211_mcgrps), .parallel_ops = true, @@ -18836,11 +18932,13 @@ EXPORT_SYMBOL(cfg80211_pmksa_candidate_notify); static void nl80211_ch_switch_notify(struct cfg80211_registered_device *rdev, struct net_device *netdev, + unsigned int link_id, struct cfg80211_chan_def *chandef, gfp_t gfp, enum nl80211_commands notif, u8 count, bool quiet) { + struct wireless_dev *wdev = netdev->ieee80211_ptr; struct sk_buff *msg; void *hdr; @@ -18857,6 +18955,10 @@ static void nl80211_ch_switch_notify(struct cfg80211_registered_device *rdev, if (nla_put_u32(msg, NL80211_ATTR_IFINDEX, netdev->ifindex)) goto nla_put_failure; + if (wdev->valid_links && + nla_put_u8(msg, NL80211_ATTR_MLO_LINK_ID, link_id)) + goto nla_put_failure; + if (nl80211_send_chandef(msg, chandef)) goto nla_put_failure; @@ -18916,22 +19018,26 @@ void cfg80211_ch_switch_notify(struct net_device *dev, cfg80211_sched_dfs_chan_update(rdev); - nl80211_ch_switch_notify(rdev, dev, chandef, GFP_KERNEL, + nl80211_ch_switch_notify(rdev, dev, link_id, chandef, GFP_KERNEL, NL80211_CMD_CH_SWITCH_NOTIFY, 0, false); } EXPORT_SYMBOL(cfg80211_ch_switch_notify); void cfg80211_ch_switch_started_notify(struct net_device *dev, struct cfg80211_chan_def *chandef, - u8 count, bool quiet) + unsigned int link_id, u8 count, + bool quiet) { struct wireless_dev *wdev = dev->ieee80211_ptr; struct wiphy *wiphy = wdev->wiphy; struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); - trace_cfg80211_ch_switch_started_notify(dev, chandef); + ASSERT_WDEV_LOCK(wdev); + WARN_INVALID_LINK_ID(wdev, link_id); + + trace_cfg80211_ch_switch_started_notify(dev, chandef, link_id); - nl80211_ch_switch_notify(rdev, dev, chandef, GFP_KERNEL, + nl80211_ch_switch_notify(rdev, dev, link_id, chandef, GFP_KERNEL, NL80211_CMD_CH_SWITCH_STARTED_NOTIFY, count, quiet); } diff --git a/net/wireless/rdev-ops.h b/net/wireless/rdev-ops.h index 40915a82da73..13b209a8db28 100644 --- a/net/wireless/rdev-ops.h +++ b/net/wireless/rdev-ops.h @@ -77,65 +77,69 @@ rdev_change_virtual_intf(struct cfg80211_registered_device *rdev, } static inline int rdev_add_key(struct cfg80211_registered_device *rdev, - struct net_device *netdev, u8 key_index, - bool pairwise, const u8 *mac_addr, + struct net_device *netdev, int link_id, + u8 key_index, bool pairwise, const u8 *mac_addr, struct key_params *params) { int ret; - trace_rdev_add_key(&rdev->wiphy, netdev, key_index, pairwise, + trace_rdev_add_key(&rdev->wiphy, netdev, link_id, key_index, pairwise, mac_addr, params->mode); - ret = rdev->ops->add_key(&rdev->wiphy, netdev, key_index, pairwise, - mac_addr, params); + ret = rdev->ops->add_key(&rdev->wiphy, netdev, link_id, key_index, + pairwise, mac_addr, params); trace_rdev_return_int(&rdev->wiphy, ret); return ret; } static inline int rdev_get_key(struct cfg80211_registered_device *rdev, struct net_device *netdev, - u8 key_index, bool pairwise, const u8 *mac_addr, void *cookie, + int link_id, u8 key_index, bool pairwise, const u8 *mac_addr, + void *cookie, void (*callback)(void *cookie, struct key_params*)) { int ret; - trace_rdev_get_key(&rdev->wiphy, netdev, key_index, pairwise, mac_addr); - ret = rdev->ops->get_key(&rdev->wiphy, netdev, key_index, pairwise, - mac_addr, cookie, callback); + trace_rdev_get_key(&rdev->wiphy, netdev, link_id, key_index, pairwise, + mac_addr); + ret = rdev->ops->get_key(&rdev->wiphy, netdev, link_id, key_index, + pairwise, mac_addr, cookie, callback); trace_rdev_return_int(&rdev->wiphy, ret); return ret; } static inline int rdev_del_key(struct cfg80211_registered_device *rdev, - struct net_device *netdev, u8 key_index, - bool pairwise, const u8 *mac_addr) + struct net_device *netdev, int link_id, + u8 key_index, bool pairwise, const u8 *mac_addr) { int ret; - trace_rdev_del_key(&rdev->wiphy, netdev, key_index, pairwise, mac_addr); - ret = rdev->ops->del_key(&rdev->wiphy, netdev, key_index, pairwise, - mac_addr); + trace_rdev_del_key(&rdev->wiphy, netdev, link_id, key_index, pairwise, + mac_addr); + ret = rdev->ops->del_key(&rdev->wiphy, netdev, link_id, key_index, + pairwise, mac_addr); trace_rdev_return_int(&rdev->wiphy, ret); return ret; } static inline int rdev_set_default_key(struct cfg80211_registered_device *rdev, - struct net_device *netdev, u8 key_index, bool unicast, - bool multicast) + struct net_device *netdev, int link_id, u8 key_index, + bool unicast, bool multicast) { int ret; - trace_rdev_set_default_key(&rdev->wiphy, netdev, key_index, + trace_rdev_set_default_key(&rdev->wiphy, netdev, link_id, key_index, unicast, multicast); - ret = rdev->ops->set_default_key(&rdev->wiphy, netdev, key_index, - unicast, multicast); + ret = rdev->ops->set_default_key(&rdev->wiphy, netdev, link_id, + key_index, unicast, multicast); trace_rdev_return_int(&rdev->wiphy, ret); return ret; } static inline int rdev_set_default_mgmt_key(struct cfg80211_registered_device *rdev, - struct net_device *netdev, u8 key_index) + struct net_device *netdev, int link_id, u8 key_index) { int ret; - trace_rdev_set_default_mgmt_key(&rdev->wiphy, netdev, key_index); - ret = rdev->ops->set_default_mgmt_key(&rdev->wiphy, netdev, + trace_rdev_set_default_mgmt_key(&rdev->wiphy, netdev, link_id, + key_index); + ret = rdev->ops->set_default_mgmt_key(&rdev->wiphy, netdev, link_id, key_index); trace_rdev_return_int(&rdev->wiphy, ret); return ret; @@ -143,13 +147,15 @@ rdev_set_default_mgmt_key(struct cfg80211_registered_device *rdev, static inline int rdev_set_default_beacon_key(struct cfg80211_registered_device *rdev, - struct net_device *netdev, u8 key_index) + struct net_device *netdev, int link_id, + u8 key_index) { int ret; - trace_rdev_set_default_beacon_key(&rdev->wiphy, netdev, key_index); - ret = rdev->ops->set_default_beacon_key(&rdev->wiphy, netdev, - key_index); + trace_rdev_set_default_beacon_key(&rdev->wiphy, netdev, link_id, + key_index); + ret = rdev->ops->set_default_beacon_key(&rdev->wiphy, netdev, link_id, + key_index); trace_rdev_return_int(&rdev->wiphy, ret); return ret; } diff --git a/net/wireless/reg.c b/net/wireless/reg.c index c7383ede794f..d5c7a5aa6853 100644 --- a/net/wireless/reg.c +++ b/net/wireless/reg.c @@ -2389,6 +2389,10 @@ static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev) switch (iftype) { case NL80211_IFTYPE_AP: case NL80211_IFTYPE_P2P_GO: + if (!wdev->links[link].ap.beacon_interval) + continue; + chandef = wdev->links[link].ap.chandef; + break; case NL80211_IFTYPE_MESH_POINT: if (!wdev->u.mesh.beacon_interval) continue; diff --git a/net/wireless/scan.c b/net/wireless/scan.c index 0134e5d5c81a..5382fc2003db 100644 --- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -540,7 +540,7 @@ static int cfg80211_parse_ap_info(struct cfg80211_colocated_ap *entry, memcpy(entry->bssid, pos, ETH_ALEN); pos += ETH_ALEN; - if (length == IEEE80211_TBTT_INFO_OFFSET_BSSID_SSSID_BSS_PARAM) { + if (length >= IEEE80211_TBTT_INFO_OFFSET_BSSID_SSSID_BSS_PARAM) { memcpy(&entry->short_ssid, pos, sizeof(entry->short_ssid)); entry->short_ssid_valid = true; diff --git a/net/wireless/sme.c b/net/wireless/sme.c index 27fb2a0c4052..d513536617bd 100644 --- a/net/wireless/sme.c +++ b/net/wireless/sme.c @@ -747,6 +747,9 @@ void __cfg80211_connect_result(struct net_device *dev, if (WARN_ON(!cr->links[link].addr)) goto out; } + + if (WARN_ON(wdev->connect_keys)) + goto out; } wdev->unprot_beacon_reported = 0; @@ -1325,7 +1328,7 @@ void __cfg80211_disconnected(struct net_device *dev, const u8 *ie, NL80211_EXT_FEATURE_BEACON_PROTECTION_CLIENT)) max_key_idx = 7; for (i = 0; i <= max_key_idx; i++) - rdev_del_key(rdev, dev, i, false, NULL); + rdev_del_key(rdev, dev, -1, i, false, NULL); } rdev_set_qos_map(rdev, dev, NULL); diff --git a/net/wireless/trace.h b/net/wireless/trace.h index 10b2fd9bacb5..a405c3edbc47 100644 --- a/net/wireless/trace.h +++ b/net/wireless/trace.h @@ -434,13 +434,14 @@ TRACE_EVENT(rdev_change_virtual_intf, ); DECLARE_EVENT_CLASS(key_handle, - TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, u8 key_index, - bool pairwise, const u8 *mac_addr), - TP_ARGS(wiphy, netdev, key_index, pairwise, mac_addr), + TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, int link_id, + u8 key_index, bool pairwise, const u8 *mac_addr), + TP_ARGS(wiphy, netdev, link_id, key_index, pairwise, mac_addr), TP_STRUCT__entry( WIPHY_ENTRY NETDEV_ENTRY MAC_ENTRY(mac_addr) + __field(int, link_id) __field(u8, key_index) __field(bool, pairwise) ), @@ -448,34 +449,38 @@ DECLARE_EVENT_CLASS(key_handle, WIPHY_ASSIGN; NETDEV_ASSIGN; MAC_ASSIGN(mac_addr, mac_addr); + __entry->link_id = link_id; __entry->key_index = key_index; __entry->pairwise = pairwise; ), - TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", key_index: %u, pairwise: %s, mac addr: " MAC_PR_FMT, - WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->key_index, - BOOL_TO_STR(__entry->pairwise), MAC_PR_ARG(mac_addr)) + TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", link_id: %d, " + "key_index: %u, pairwise: %s, mac addr: " MAC_PR_FMT, + WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->link_id, + __entry->key_index, BOOL_TO_STR(__entry->pairwise), + MAC_PR_ARG(mac_addr)) ); DEFINE_EVENT(key_handle, rdev_get_key, - TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, u8 key_index, - bool pairwise, const u8 *mac_addr), - TP_ARGS(wiphy, netdev, key_index, pairwise, mac_addr) + TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, int link_id, + u8 key_index, bool pairwise, const u8 *mac_addr), + TP_ARGS(wiphy, netdev, link_id, key_index, pairwise, mac_addr) ); DEFINE_EVENT(key_handle, rdev_del_key, - TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, u8 key_index, - bool pairwise, const u8 *mac_addr), - TP_ARGS(wiphy, netdev, key_index, pairwise, mac_addr) + TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, int link_id, + u8 key_index, bool pairwise, const u8 *mac_addr), + TP_ARGS(wiphy, netdev, link_id, key_index, pairwise, mac_addr) ); TRACE_EVENT(rdev_add_key, - TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, u8 key_index, - bool pairwise, const u8 *mac_addr, u8 mode), - TP_ARGS(wiphy, netdev, key_index, pairwise, mac_addr, mode), + TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, int link_id, + u8 key_index, bool pairwise, const u8 *mac_addr, u8 mode), + TP_ARGS(wiphy, netdev, link_id, key_index, pairwise, mac_addr, mode), TP_STRUCT__entry( WIPHY_ENTRY NETDEV_ENTRY MAC_ENTRY(mac_addr) + __field(int, link_id) __field(u8, key_index) __field(bool, pairwise) __field(u8, mode) @@ -484,24 +489,27 @@ TRACE_EVENT(rdev_add_key, WIPHY_ASSIGN; NETDEV_ASSIGN; MAC_ASSIGN(mac_addr, mac_addr); + __entry->link_id = link_id; __entry->key_index = key_index; __entry->pairwise = pairwise; __entry->mode = mode; ), - TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", key_index: %u, " - "mode: %u, pairwise: %s, mac addr: " MAC_PR_FMT, - WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->key_index, - __entry->mode, BOOL_TO_STR(__entry->pairwise), - MAC_PR_ARG(mac_addr)) + TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", link_id: %d, " + "key_index: %u, mode: %u, pairwise: %s, " + "mac addr: " MAC_PR_FMT, + WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->link_id, + __entry->key_index, __entry->mode, + BOOL_TO_STR(__entry->pairwise), MAC_PR_ARG(mac_addr)) ); TRACE_EVENT(rdev_set_default_key, - TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, u8 key_index, - bool unicast, bool multicast), - TP_ARGS(wiphy, netdev, key_index, unicast, multicast), + TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, int link_id, + u8 key_index, bool unicast, bool multicast), + TP_ARGS(wiphy, netdev, link_id, key_index, unicast, multicast), TP_STRUCT__entry( WIPHY_ENTRY NETDEV_ENTRY + __field(int, link_id) __field(u8, key_index) __field(bool, unicast) __field(bool, multicast) @@ -509,48 +517,58 @@ TRACE_EVENT(rdev_set_default_key, TP_fast_assign( WIPHY_ASSIGN; NETDEV_ASSIGN; + __entry->link_id = link_id; __entry->key_index = key_index; __entry->unicast = unicast; __entry->multicast = multicast; ), - TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", key index: %u, unicast: %s, multicast: %s", - WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->key_index, - BOOL_TO_STR(__entry->unicast), + TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", link_id: %d, " + "key index: %u, unicast: %s, multicast: %s", + WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->link_id, + __entry->key_index, BOOL_TO_STR(__entry->unicast), BOOL_TO_STR(__entry->multicast)) ); TRACE_EVENT(rdev_set_default_mgmt_key, - TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, u8 key_index), - TP_ARGS(wiphy, netdev, key_index), + TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, int link_id, + u8 key_index), + TP_ARGS(wiphy, netdev, link_id, key_index), TP_STRUCT__entry( WIPHY_ENTRY NETDEV_ENTRY + __field(int, link_id) __field(u8, key_index) ), TP_fast_assign( WIPHY_ASSIGN; NETDEV_ASSIGN; + __entry->link_id = link_id; __entry->key_index = key_index; ), - TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", key index: %u", - WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->key_index) + TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", link_id: %d, " + "key index: %u", WIPHY_PR_ARG, NETDEV_PR_ARG, + __entry->link_id, __entry->key_index) ); TRACE_EVENT(rdev_set_default_beacon_key, - TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, u8 key_index), - TP_ARGS(wiphy, netdev, key_index), + TP_PROTO(struct wiphy *wiphy, struct net_device *netdev, int link_id, + u8 key_index), + TP_ARGS(wiphy, netdev, link_id, key_index), TP_STRUCT__entry( WIPHY_ENTRY NETDEV_ENTRY + __field(int, link_id) __field(u8, key_index) ), TP_fast_assign( WIPHY_ASSIGN; NETDEV_ASSIGN; + __entry->link_id = link_id; __entry->key_index = key_index; ), - TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", key index: %u", - WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->key_index) + TP_printk(WIPHY_PR_FMT ", " NETDEV_PR_FMT ", link_id: %d, " + "key index: %u", WIPHY_PR_ARG, NETDEV_PR_ARG, + __entry->link_id, __entry->key_index) ); TRACE_EVENT(rdev_start_ap, @@ -3245,18 +3263,21 @@ TRACE_EVENT(cfg80211_ch_switch_notify, TRACE_EVENT(cfg80211_ch_switch_started_notify, TP_PROTO(struct net_device *netdev, - struct cfg80211_chan_def *chandef), - TP_ARGS(netdev, chandef), + struct cfg80211_chan_def *chandef, + unsigned int link_id), + TP_ARGS(netdev, chandef, link_id), TP_STRUCT__entry( NETDEV_ENTRY CHAN_DEF_ENTRY + __field(unsigned int, link_id) ), TP_fast_assign( NETDEV_ASSIGN; CHAN_DEF_ASSIGN(chandef); + __entry->link_id = link_id; ), - TP_printk(NETDEV_PR_FMT ", " CHAN_DEF_PR_FMT, - NETDEV_PR_ARG, CHAN_DEF_PR_ARG) + TP_printk(NETDEV_PR_FMT ", " CHAN_DEF_PR_FMT ", link:%d", + NETDEV_PR_ARG, CHAN_DEF_PR_ARG, __entry->link_id) ); TRACE_EVENT(cfg80211_radar_event, diff --git a/net/wireless/util.c b/net/wireless/util.c index 775836f6785a..01493568a21d 100644 --- a/net/wireless/util.c +++ b/net/wireless/util.c @@ -935,13 +935,13 @@ void cfg80211_upload_connect_keys(struct wireless_dev *wdev) for (i = 0; i < CFG80211_MAX_WEP_KEYS; i++) { if (!wdev->connect_keys->params[i].cipher) continue; - if (rdev_add_key(rdev, dev, i, false, NULL, + if (rdev_add_key(rdev, dev, -1, i, false, NULL, &wdev->connect_keys->params[i])) { netdev_err(dev, "failed to set key %d\n", i); continue; } if (wdev->connect_keys->def == i && - rdev_set_default_key(rdev, dev, i, true, true)) { + rdev_set_default_key(rdev, dev, -1, i, true, true)) { netdev_err(dev, "failed to set defkey %d\n", i); continue; } diff --git a/net/wireless/wext-compat.c b/net/wireless/wext-compat.c index a9767bfe7330..ddf340bfa07a 100644 --- a/net/wireless/wext-compat.c +++ b/net/wireless/wext-compat.c @@ -470,7 +470,7 @@ static int __cfg80211_set_encryption(struct cfg80211_registered_device *rdev, !(rdev->wiphy.flags & WIPHY_FLAG_IBSS_RSN)) err = -ENOENT; else - err = rdev_del_key(rdev, dev, idx, pairwise, + err = rdev_del_key(rdev, dev, -1, idx, pairwise, addr); } wdev->wext.connect.privacy = false; @@ -509,7 +509,7 @@ static int __cfg80211_set_encryption(struct cfg80211_registered_device *rdev, if (wdev->connected || (wdev->iftype == NL80211_IFTYPE_ADHOC && wdev->u.ibss.current_bss)) - err = rdev_add_key(rdev, dev, idx, pairwise, addr, params); + err = rdev_add_key(rdev, dev, -1, idx, pairwise, addr, params); else if (params->cipher != WLAN_CIPHER_SUITE_WEP40 && params->cipher != WLAN_CIPHER_SUITE_WEP104) return -EINVAL; @@ -546,7 +546,8 @@ static int __cfg80211_set_encryption(struct cfg80211_registered_device *rdev, __cfg80211_leave_ibss(rdev, wdev->netdev, true); rejoin = true; } - err = rdev_set_default_key(rdev, dev, idx, true, true); + err = rdev_set_default_key(rdev, dev, -1, idx, true, + true); } if (!err) { wdev->wext.default_key = idx; @@ -561,7 +562,7 @@ static int __cfg80211_set_encryption(struct cfg80211_registered_device *rdev, if (wdev->connected || (wdev->iftype == NL80211_IFTYPE_ADHOC && wdev->u.ibss.current_bss)) - err = rdev_set_default_mgmt_key(rdev, dev, idx); + err = rdev_set_default_mgmt_key(rdev, dev, -1, idx); if (!err) wdev->wext.default_mgmt_key = idx; return err; @@ -632,7 +633,7 @@ static int cfg80211_wext_siwencode(struct net_device *dev, if (wdev->connected || (wdev->iftype == NL80211_IFTYPE_ADHOC && wdev->u.ibss.current_bss)) - err = rdev_set_default_key(rdev, dev, idx, true, + err = rdev_set_default_key(rdev, dev, -1, idx, true, true); if (!err) wdev->wext.default_key = idx; @@ -685,6 +686,13 @@ static int cfg80211_wext_siwencodeext(struct net_device *dev, !rdev->ops->set_default_key) return -EOPNOTSUPP; + wdev_lock(wdev); + if (wdev->valid_links) { + wdev_unlock(wdev); + return -EOPNOTSUPP; + } + wdev_unlock(wdev); + switch (ext->alg) { case IW_ENCODE_ALG_NONE: remove = true; diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c index 869b9b9b9fad..4681e8e8ad94 100644 --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@ -19,8 +19,6 @@ #include "xdp_umem.h" #include "xsk_queue.h" -#define XDP_UMEM_MIN_CHUNK_SIZE 2048 - static DEFINE_IDA(umem_ida); static void xdp_umem_unpin_pages(struct xdp_umem *umem) diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 5b4ce6ba1bc7..9f0561b67c12 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -355,16 +355,15 @@ static u32 xsk_tx_peek_release_fallback(struct xsk_buff_pool *pool, u32 max_entr return nb_pkts; } -u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries) +u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 nb_pkts) { struct xdp_sock *xs; - u32 nb_pkts; rcu_read_lock(); if (!list_is_singular(&pool->xsk_tx_list)) { /* Fallback to the non-batched version */ rcu_read_unlock(); - return xsk_tx_peek_release_fallback(pool, max_entries); + return xsk_tx_peek_release_fallback(pool, nb_pkts); } xs = list_first_or_null_rcu(&pool->xsk_tx_list, struct xdp_sock, tx_list); @@ -373,12 +372,7 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries) goto out; } - max_entries = xskq_cons_nb_entries(xs->tx, max_entries); - nb_pkts = xskq_cons_read_desc_batch(xs->tx, pool, max_entries); - if (!nb_pkts) { - xs->tx->queue_empty_descs++; - goto out; - } + nb_pkts = xskq_cons_nb_entries(xs->tx, nb_pkts); /* This is the backpressure mechanism for the Tx path. Try to * reserve space in the completion queue for all packets, but @@ -386,12 +380,18 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries) * packets. This avoids having to implement any buffering in * the Tx path. */ - nb_pkts = xskq_prod_reserve_addr_batch(pool->cq, pool->tx_descs, nb_pkts); + nb_pkts = xskq_prod_nb_free(pool->cq, nb_pkts); if (!nb_pkts) goto out; - xskq_cons_release_n(xs->tx, max_entries); + nb_pkts = xskq_cons_read_desc_batch(xs->tx, pool, nb_pkts); + if (!nb_pkts) { + xs->tx->queue_empty_descs++; + goto out; + } + __xskq_cons_release(xs->tx); + xskq_prod_write_addr_batch(pool->cq, pool->tx_descs, nb_pkts); xs->sk.sk_write_space(&xs->sk); out: @@ -954,8 +954,8 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) goto out_unlock; } - err = xp_assign_dev_shared(xs->pool, umem_xs->umem, - dev, qid); + err = xp_assign_dev_shared(xs->pool, umem_xs, dev, + qid); if (err) { xp_destroy(xs->pool); xs->pool = NULL; diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index a71a8c6edf55..ed6c71826d31 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -212,17 +212,18 @@ err_unreg_pool: return err; } -int xp_assign_dev_shared(struct xsk_buff_pool *pool, struct xdp_umem *umem, +int xp_assign_dev_shared(struct xsk_buff_pool *pool, struct xdp_sock *umem_xs, struct net_device *dev, u16 queue_id) { u16 flags; + struct xdp_umem *umem = umem_xs->umem; /* One fill and completion ring required for each queue id. */ if (!pool->fq || !pool->cq) return -EINVAL; flags = umem->zc ? XDP_ZEROCOPY : XDP_COPY; - if (pool->uses_need_wakeup) + if (umem_xs->pool->uses_need_wakeup) flags |= XDP_USE_NEED_WAKEUP; return xp_assign_dev(pool, dev, queue_id, flags); diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index fb20bf7207cf..c6fb6b763658 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -205,6 +205,11 @@ static inline bool xskq_cons_read_desc(struct xsk_queue *q, return false; } +static inline void xskq_cons_release_n(struct xsk_queue *q, u32 cnt) +{ + q->cached_cons += cnt; +} + static inline u32 xskq_cons_read_desc_batch(struct xsk_queue *q, struct xsk_buff_pool *pool, u32 max) { @@ -226,6 +231,8 @@ static inline u32 xskq_cons_read_desc_batch(struct xsk_queue *q, struct xsk_buff cached_cons++; } + /* Release valid plus any invalid entries */ + xskq_cons_release_n(q, cached_cons - q->cached_cons); return nb_entries; } @@ -291,11 +298,6 @@ static inline void xskq_cons_release(struct xsk_queue *q) q->cached_cons++; } -static inline void xskq_cons_release_n(struct xsk_queue *q, u32 cnt) -{ - q->cached_cons += cnt; -} - static inline u32 xskq_cons_present_entries(struct xsk_queue *q) { /* No barriers needed since data is not accessed */ @@ -350,21 +352,17 @@ static inline int xskq_prod_reserve_addr(struct xsk_queue *q, u64 addr) return 0; } -static inline u32 xskq_prod_reserve_addr_batch(struct xsk_queue *q, struct xdp_desc *descs, - u32 max) +static inline void xskq_prod_write_addr_batch(struct xsk_queue *q, struct xdp_desc *descs, + u32 nb_entries) { struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring; - u32 nb_entries, i, cached_prod; - - nb_entries = xskq_prod_nb_free(q, max); + u32 i, cached_prod; /* A, matches D */ cached_prod = q->cached_prod; for (i = 0; i < nb_entries; i++) ring->desc[cached_prod++ & q->ring_mask] = descs[i].addr; q->cached_prod = cached_prod; - - return nb_entries; } static inline int xskq_prod_reserve_desc(struct xsk_queue *q, diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c index 974eb97b77d2..29a540dcb5a7 100644 --- a/net/xfrm/espintcp.c +++ b/net/xfrm/espintcp.c @@ -91,7 +91,7 @@ static void espintcp_rcv(struct strparser *strp, struct sk_buff *skb) } /* remove header, leave non-ESP marker/SPI */ - if (!__pskb_pull(skb, rxm->offset + 2)) { + if (!pskb_pull(skb, rxm->offset + 2)) { XFRM_INC_STATS(sock_net(strp->sk), LINUX_MIB_XFRMINERROR); kfree_skb(skb); return; diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c index 637ca8838436..5f5aafd418af 100644 --- a/net/xfrm/xfrm_device.c +++ b/net/xfrm/xfrm_device.c @@ -207,7 +207,8 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur EXPORT_SYMBOL_GPL(validate_xmit_xfrm); int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, - struct xfrm_user_offload *xuo) + struct xfrm_user_offload *xuo, + struct netlink_ext_ack *extack) { int err; struct dst_entry *dst; @@ -216,15 +217,21 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, xfrm_address_t *saddr; xfrm_address_t *daddr; - if (!x->type_offload) + if (!x->type_offload) { + NL_SET_ERR_MSG(extack, "Type doesn't support offload"); return -EINVAL; + } /* We don't yet support UDP encapsulation and TFC padding. */ - if (x->encap || x->tfcpad) + if (x->encap || x->tfcpad) { + NL_SET_ERR_MSG(extack, "Encapsulation and TFC padding can't be offloaded"); return -EINVAL; + } - if (xuo->flags & ~(XFRM_OFFLOAD_IPV6 | XFRM_OFFLOAD_INBOUND)) + if (xuo->flags & ~(XFRM_OFFLOAD_IPV6 | XFRM_OFFLOAD_INBOUND)) { + NL_SET_ERR_MSG(extack, "Unrecognized flags in offload request"); return -EINVAL; + } dev = dev_get_by_index(net, xuo->ifindex); if (!dev) { @@ -256,6 +263,7 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, if (x->props.flags & XFRM_STATE_ESN && !dev->xfrmdev_ops->xdo_dev_state_advance_esn) { + NL_SET_ERR_MSG(extack, "Device doesn't support offload with ESN"); xso->dev = NULL; dev_put(dev); return -EINVAL; @@ -277,8 +285,10 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, xso->real_dev = NULL; netdev_put(dev, &xso->dev_tracker); - if (err != -EOPNOTSUPP) + if (err != -EOPNOTSUPP) { + NL_SET_ERR_MSG(extack, "Device failed to offload this state"); return err; + } } return 0; diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c index b2f4ec9c537f..97074f6f2bde 100644 --- a/net/xfrm/xfrm_input.c +++ b/net/xfrm/xfrm_input.c @@ -20,11 +20,13 @@ #include <net/xfrm.h> #include <net/ip_tunnels.h> #include <net/ip6_tunnel.h> +#include <net/dst_metadata.h> #include "xfrm_inout.h" struct xfrm_trans_tasklet { - struct tasklet_struct tasklet; + struct work_struct work; + spinlock_t queue_lock; struct sk_buff_head queue; }; @@ -719,7 +721,8 @@ resume: sp = skb_sec_path(skb); if (sp) sp->olen = 0; - skb_dst_drop(skb); + if (skb_valid_dst(skb)) + skb_dst_drop(skb); gro_cells_receive(&gro_cells, skb); return 0; } else { @@ -737,7 +740,8 @@ resume: sp = skb_sec_path(skb); if (sp) sp->olen = 0; - skb_dst_drop(skb); + if (skb_valid_dst(skb)) + skb_dst_drop(skb); gro_cells_receive(&gro_cells, skb); return err; } @@ -760,18 +764,22 @@ int xfrm_input_resume(struct sk_buff *skb, int nexthdr) } EXPORT_SYMBOL(xfrm_input_resume); -static void xfrm_trans_reinject(struct tasklet_struct *t) +static void xfrm_trans_reinject(struct work_struct *work) { - struct xfrm_trans_tasklet *trans = from_tasklet(trans, t, tasklet); + struct xfrm_trans_tasklet *trans = container_of(work, struct xfrm_trans_tasklet, work); struct sk_buff_head queue; struct sk_buff *skb; __skb_queue_head_init(&queue); + spin_lock_bh(&trans->queue_lock); skb_queue_splice_init(&trans->queue, &queue); + spin_unlock_bh(&trans->queue_lock); + local_bh_disable(); while ((skb = __skb_dequeue(&queue))) XFRM_TRANS_SKB_CB(skb)->finish(XFRM_TRANS_SKB_CB(skb)->net, NULL, skb); + local_bh_enable(); } int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb, @@ -789,8 +797,10 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb, XFRM_TRANS_SKB_CB(skb)->finish = finish; XFRM_TRANS_SKB_CB(skb)->net = net; + spin_lock_bh(&trans->queue_lock); __skb_queue_tail(&trans->queue, skb); - tasklet_schedule(&trans->tasklet); + spin_unlock_bh(&trans->queue_lock); + schedule_work(&trans->work); return 0; } EXPORT_SYMBOL(xfrm_trans_queue_net); @@ -817,7 +827,8 @@ void __init xfrm_input_init(void) struct xfrm_trans_tasklet *trans; trans = &per_cpu(xfrm_trans_tasklet, i); + spin_lock_init(&trans->queue_lock); __skb_queue_head_init(&trans->queue); - tasklet_setup(&trans->tasklet, xfrm_trans_reinject); + INIT_WORK(&trans->work, xfrm_trans_reinject); } } diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c index 5113fa0fbcee..5a67b120c4db 100644 --- a/net/xfrm/xfrm_interface.c +++ b/net/xfrm/xfrm_interface.c @@ -41,6 +41,7 @@ #include <net/addrconf.h> #include <net/xfrm.h> #include <net/net_namespace.h> +#include <net/dst_metadata.h> #include <net/netns/generic.h> #include <linux/etherdevice.h> @@ -56,6 +57,89 @@ static const struct net_device_ops xfrmi_netdev_ops; struct xfrmi_net { /* lists for storing interfaces in use */ struct xfrm_if __rcu *xfrmi[XFRMI_HASH_SIZE]; + struct xfrm_if __rcu *collect_md_xfrmi; +}; + +static const struct nla_policy xfrm_lwt_policy[LWT_XFRM_MAX + 1] = { + [LWT_XFRM_IF_ID] = NLA_POLICY_MIN(NLA_U32, 1), + [LWT_XFRM_LINK] = NLA_POLICY_MIN(NLA_U32, 1), +}; + +static void xfrmi_destroy_state(struct lwtunnel_state *lwt) +{ +} + +static int xfrmi_build_state(struct net *net, struct nlattr *nla, + unsigned int family, const void *cfg, + struct lwtunnel_state **ts, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[LWT_XFRM_MAX + 1]; + struct lwtunnel_state *new_state; + struct xfrm_md_info *info; + int ret; + + ret = nla_parse_nested(tb, LWT_XFRM_MAX, nla, xfrm_lwt_policy, extack); + if (ret < 0) + return ret; + + if (!tb[LWT_XFRM_IF_ID]) { + NL_SET_ERR_MSG(extack, "if_id must be set"); + return -EINVAL; + } + + new_state = lwtunnel_state_alloc(sizeof(*info)); + if (!new_state) { + NL_SET_ERR_MSG(extack, "failed to create encap info"); + return -ENOMEM; + } + + new_state->type = LWTUNNEL_ENCAP_XFRM; + + info = lwt_xfrm_info(new_state); + + info->if_id = nla_get_u32(tb[LWT_XFRM_IF_ID]); + + if (tb[LWT_XFRM_LINK]) + info->link = nla_get_u32(tb[LWT_XFRM_LINK]); + + *ts = new_state; + return 0; +} + +static int xfrmi_fill_encap_info(struct sk_buff *skb, + struct lwtunnel_state *lwt) +{ + struct xfrm_md_info *info = lwt_xfrm_info(lwt); + + if (nla_put_u32(skb, LWT_XFRM_IF_ID, info->if_id) || + (info->link && nla_put_u32(skb, LWT_XFRM_LINK, info->link))) + return -EMSGSIZE; + + return 0; +} + +static int xfrmi_encap_nlsize(struct lwtunnel_state *lwtstate) +{ + return nla_total_size(sizeof(u32)) + /* LWT_XFRM_IF_ID */ + nla_total_size(sizeof(u32)); /* LWT_XFRM_LINK */ +} + +static int xfrmi_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b) +{ + struct xfrm_md_info *a_info = lwt_xfrm_info(a); + struct xfrm_md_info *b_info = lwt_xfrm_info(b); + + return memcmp(a_info, b_info, sizeof(*a_info)); +} + +static const struct lwtunnel_encap_ops xfrmi_encap_ops = { + .build_state = xfrmi_build_state, + .destroy_state = xfrmi_destroy_state, + .fill_encap = xfrmi_fill_encap_info, + .get_encap_size = xfrmi_encap_nlsize, + .cmp_encap = xfrmi_encap_cmp, + .owner = THIS_MODULE, }; #define for_each_xfrmi_rcu(start, xi) \ @@ -77,17 +161,23 @@ static struct xfrm_if *xfrmi_lookup(struct net *net, struct xfrm_state *x) return xi; } + xi = rcu_dereference(xfrmn->collect_md_xfrmi); + if (xi && (xi->dev->flags & IFF_UP)) + return xi; + return NULL; } -static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb, - unsigned short family) +static bool xfrmi_decode_session(struct sk_buff *skb, + unsigned short family, + struct xfrm_if_decode_session_result *res) { struct net_device *dev; + struct xfrm_if *xi; int ifindex = 0; if (!secpath_exists(skb) || !skb->dev) - return NULL; + return false; switch (family) { case AF_INET6: @@ -107,11 +197,18 @@ static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb, } if (!dev || !(dev->flags & IFF_UP)) - return NULL; + return false; if (dev->netdev_ops != &xfrmi_netdev_ops) - return NULL; + return false; - return netdev_priv(dev); + xi = netdev_priv(dev); + res->net = xi->net; + + if (xi->p.collect_md) + res->if_id = xfrm_input_state(skb)->if_id; + else + res->if_id = xi->p.if_id; + return true; } static void xfrmi_link(struct xfrmi_net *xfrmn, struct xfrm_if *xi) @@ -157,7 +254,10 @@ static int xfrmi_create(struct net_device *dev) if (err < 0) goto out; - xfrmi_link(xfrmn, xi); + if (xi->p.collect_md) + rcu_assign_pointer(xfrmn->collect_md_xfrmi, xi); + else + xfrmi_link(xfrmn, xi); return 0; @@ -185,7 +285,10 @@ static void xfrmi_dev_uninit(struct net_device *dev) struct xfrm_if *xi = netdev_priv(dev); struct xfrmi_net *xfrmn = net_generic(xi->net, xfrmi_net_id); - xfrmi_unlink(xfrmn, xi); + if (xi->p.collect_md) + RCU_INIT_POINTER(xfrmn->collect_md_xfrmi, NULL); + else + xfrmi_unlink(xfrmn, xi); } static void xfrmi_scrub_packet(struct sk_buff *skb, bool xnet) @@ -214,6 +317,7 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err) struct xfrm_state *x; struct xfrm_if *xi; bool xnet; + int link; if (err && !secpath_exists(skb)) return 0; @@ -224,6 +328,7 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err) if (!xi) return 1; + link = skb->dev->ifindex; dev = xi->dev; skb->dev = dev; @@ -254,6 +359,17 @@ static int xfrmi_rcv_cb(struct sk_buff *skb, int err) } xfrmi_scrub_packet(skb, xnet); + if (xi->p.collect_md) { + struct metadata_dst *md_dst; + + md_dst = metadata_dst_alloc(0, METADATA_XFRM, GFP_ATOMIC); + if (!md_dst) + return -ENOMEM; + + md_dst->u.xfrm_info.if_id = x->if_id; + md_dst->u.xfrm_info.link = link; + skb_dst_set(skb, (struct dst_entry *)md_dst); + } dev_sw_netstats_rx_add(dev, skb->len); return 0; @@ -269,10 +385,23 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl) struct net_device *tdev; struct xfrm_state *x; int err = -1; + u32 if_id; int mtu; + if (xi->p.collect_md) { + struct xfrm_md_info *md_info = skb_xfrm_md_info(skb); + + if (unlikely(!md_info)) + return -EINVAL; + + if_id = md_info->if_id; + fl->flowi_oif = md_info->link; + } else { + if_id = xi->p.if_id; + } + dst_hold(dst); - dst = xfrm_lookup_with_ifid(xi->net, dst, fl, NULL, 0, xi->p.if_id); + dst = xfrm_lookup_with_ifid(xi->net, dst, fl, NULL, 0, if_id); if (IS_ERR(dst)) { err = PTR_ERR(dst); dst = NULL; @@ -283,7 +412,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl) if (!x) goto tx_err_link_failure; - if (x->if_id != xi->p.if_id) + if (x->if_id != if_id) goto tx_err_link_failure; tdev = dst->dev; @@ -633,6 +762,9 @@ static void xfrmi_netlink_parms(struct nlattr *data[], if (data[IFLA_XFRM_IF_ID]) parms->if_id = nla_get_u32(data[IFLA_XFRM_IF_ID]); + + if (data[IFLA_XFRM_COLLECT_METADATA]) + parms->collect_md = true; } static int xfrmi_newlink(struct net *src_net, struct net_device *dev, @@ -645,14 +777,27 @@ static int xfrmi_newlink(struct net *src_net, struct net_device *dev, int err; xfrmi_netlink_parms(data, &p); - if (!p.if_id) { - NL_SET_ERR_MSG(extack, "if_id must be non zero"); - return -EINVAL; - } + if (p.collect_md) { + struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id); - xi = xfrmi_locate(net, &p); - if (xi) - return -EEXIST; + if (p.link || p.if_id) { + NL_SET_ERR_MSG(extack, "link and if_id must be zero"); + return -EINVAL; + } + + if (rtnl_dereference(xfrmn->collect_md_xfrmi)) + return -EEXIST; + + } else { + if (!p.if_id) { + NL_SET_ERR_MSG(extack, "if_id must be non zero"); + return -EINVAL; + } + + xi = xfrmi_locate(net, &p); + if (xi) + return -EEXIST; + } xi = netdev_priv(dev); xi->p = p; @@ -682,12 +827,22 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[], return -EINVAL; } + if (p.collect_md) { + NL_SET_ERR_MSG(extack, "collect_md can't be changed"); + return -EINVAL; + } + xi = xfrmi_locate(net, &p); if (!xi) { xi = netdev_priv(dev); } else { if (xi->dev != dev) return -EEXIST; + if (xi->p.collect_md) { + NL_SET_ERR_MSG(extack, + "device can't be changed to collect_md"); + return -EINVAL; + } } return xfrmi_update(xi, &p); @@ -700,6 +855,8 @@ static size_t xfrmi_get_size(const struct net_device *dev) nla_total_size(4) + /* IFLA_XFRM_IF_ID */ nla_total_size(4) + + /* IFLA_XFRM_COLLECT_METADATA */ + nla_total_size(0) + 0; } @@ -709,7 +866,8 @@ static int xfrmi_fill_info(struct sk_buff *skb, const struct net_device *dev) struct xfrm_if_parms *parm = &xi->p; if (nla_put_u32(skb, IFLA_XFRM_LINK, parm->link) || - nla_put_u32(skb, IFLA_XFRM_IF_ID, parm->if_id)) + nla_put_u32(skb, IFLA_XFRM_IF_ID, parm->if_id) || + (xi->p.collect_md && nla_put_flag(skb, IFLA_XFRM_COLLECT_METADATA))) goto nla_put_failure; return 0; @@ -725,8 +883,10 @@ static struct net *xfrmi_get_link_net(const struct net_device *dev) } static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = { - [IFLA_XFRM_LINK] = { .type = NLA_U32 }, - [IFLA_XFRM_IF_ID] = { .type = NLA_U32 }, + [IFLA_XFRM_UNSPEC] = { .strict_start_type = IFLA_XFRM_COLLECT_METADATA }, + [IFLA_XFRM_LINK] = { .type = NLA_U32 }, + [IFLA_XFRM_IF_ID] = { .type = NLA_U32 }, + [IFLA_XFRM_COLLECT_METADATA] = { .type = NLA_FLAG }, }; static struct rtnl_link_ops xfrmi_link_ops __read_mostly = { @@ -762,6 +922,9 @@ static void __net_exit xfrmi_exit_batch_net(struct list_head *net_exit_list) xip = &xi->next) unregister_netdevice_queue(xi->dev, &list); } + xi = rtnl_dereference(xfrmn->collect_md_xfrmi); + if (xi) + unregister_netdevice_queue(xi->dev, &list); } unregister_netdevice_many(&list); rtnl_unlock(); @@ -999,6 +1162,8 @@ static int __init xfrmi_init(void) if (err < 0) goto rtnl_link_failed; + lwtunnel_encap_add_ops(&xfrmi_encap_ops, LWTUNNEL_ENCAP_XFRM); + xfrm_if_register_cb(&xfrm_if_cb); return err; @@ -1017,6 +1182,7 @@ pernet_dev_failed: static void __exit xfrmi_fini(void) { xfrm_if_unregister_cb(); + lwtunnel_encap_del_ops(&xfrmi_encap_ops, LWTUNNEL_ENCAP_XFRM); rtnl_link_unregister(&xfrmi_link_ops); xfrmi4_fini(); xfrmi6_fini(); diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c index cb40ff0ff28d..80143360bf09 100644 --- a/net/xfrm/xfrm_ipcomp.c +++ b/net/xfrm/xfrm_ipcomp.c @@ -203,6 +203,7 @@ static void ipcomp_free_scratches(void) vfree(*per_cpu_ptr(scratches, i)); free_percpu(scratches); + ipcomp_scratches = NULL; } static void * __percpu *ipcomp_alloc_scratches(void) @@ -325,18 +326,22 @@ void ipcomp_destroy(struct xfrm_state *x) } EXPORT_SYMBOL_GPL(ipcomp_destroy); -int ipcomp_init_state(struct xfrm_state *x) +int ipcomp_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { int err; struct ipcomp_data *ipcd; struct xfrm_algo_desc *calg_desc; err = -EINVAL; - if (!x->calg) + if (!x->calg) { + NL_SET_ERR_MSG(extack, "Missing required compression algorithm"); goto out; + } - if (x->encap) + if (x->encap) { + NL_SET_ERR_MSG(extack, "IPComp is not compatible with encapsulation"); goto out; + } err = -ENOMEM; ipcd = kzalloc(sizeof(*ipcd), GFP_KERNEL); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index cc6ab79609e2..e392d8d05e0c 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1889,7 +1889,7 @@ EXPORT_SYMBOL(xfrm_policy_walk_done); */ static int xfrm_policy_match(const struct xfrm_policy *pol, const struct flowi *fl, - u8 type, u16 family, int dir, u32 if_id) + u8 type, u16 family, u32 if_id) { const struct xfrm_selector *sel = &pol->selector; int ret = -ESRCH; @@ -2014,7 +2014,7 @@ static struct xfrm_policy * __xfrm_policy_eval_candidates(struct hlist_head *chain, struct xfrm_policy *prefer, const struct flowi *fl, - u8 type, u16 family, int dir, u32 if_id) + u8 type, u16 family, u32 if_id) { u32 priority = prefer ? prefer->priority : ~0u; struct xfrm_policy *pol; @@ -2028,7 +2028,7 @@ __xfrm_policy_eval_candidates(struct hlist_head *chain, if (pol->priority > priority) break; - err = xfrm_policy_match(pol, fl, type, family, dir, if_id); + err = xfrm_policy_match(pol, fl, type, family, if_id); if (err) { if (err != -ESRCH) return ERR_PTR(err); @@ -2053,7 +2053,7 @@ static struct xfrm_policy * xfrm_policy_eval_candidates(struct xfrm_pol_inexact_candidates *cand, struct xfrm_policy *prefer, const struct flowi *fl, - u8 type, u16 family, int dir, u32 if_id) + u8 type, u16 family, u32 if_id) { struct xfrm_policy *tmp; int i; @@ -2061,8 +2061,7 @@ xfrm_policy_eval_candidates(struct xfrm_pol_inexact_candidates *cand, for (i = 0; i < ARRAY_SIZE(cand->res); i++) { tmp = __xfrm_policy_eval_candidates(cand->res[i], prefer, - fl, type, family, dir, - if_id); + fl, type, family, if_id); if (!tmp) continue; @@ -2101,7 +2100,7 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type, ret = NULL; hlist_for_each_entry_rcu(pol, chain, bydst) { - err = xfrm_policy_match(pol, fl, type, family, dir, if_id); + err = xfrm_policy_match(pol, fl, type, family, if_id); if (err) { if (err == -ESRCH) continue; @@ -2120,7 +2119,7 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type, goto skip_inexact; pol = xfrm_policy_eval_candidates(&cand, ret, fl, type, - family, dir, if_id); + family, if_id); if (pol) { ret = pol; if (IS_ERR(pol)) @@ -3516,17 +3515,17 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, int xerr_idx = -1; const struct xfrm_if_cb *ifcb; struct sec_path *sp; - struct xfrm_if *xi; u32 if_id = 0; rcu_read_lock(); ifcb = xfrm_if_get_cb(); if (ifcb) { - xi = ifcb->decode_session(skb, family); - if (xi) { - if_id = xi->p.if_id; - net = xi->net; + struct xfrm_if_decode_session_result r; + + if (ifcb->decode_session(skb, family, &r)) { + if_id = r.if_id; + net = r.net; } } rcu_read_unlock(); diff --git a/net/xfrm/xfrm_replay.c b/net/xfrm/xfrm_replay.c index 9277d81b344c..9f4d42eb090f 100644 --- a/net/xfrm/xfrm_replay.c +++ b/net/xfrm/xfrm_replay.c @@ -766,18 +766,22 @@ int xfrm_replay_overflow(struct xfrm_state *x, struct sk_buff *skb) } #endif -int xfrm_init_replay(struct xfrm_state *x) +int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack) { struct xfrm_replay_state_esn *replay_esn = x->replay_esn; if (replay_esn) { if (replay_esn->replay_window > - replay_esn->bmp_len * sizeof(__u32) * 8) + replay_esn->bmp_len * sizeof(__u32) * 8) { + NL_SET_ERR_MSG(extack, "ESN replay window is too large for the chosen bitmap size"); return -EINVAL; + } if (x->props.flags & XFRM_STATE_ESN) { - if (replay_esn->replay_window == 0) + if (replay_esn->replay_window == 0) { + NL_SET_ERR_MSG(extack, "ESN replay window must be > 0"); return -EINVAL; + } x->repl_mode = XFRM_REPLAY_MODE_ESN; } else { x->repl_mode = XFRM_REPLAY_MODE_BMP; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 91c32a3b6924..81df34b3da6e 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2611,7 +2611,8 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu) } EXPORT_SYMBOL_GPL(xfrm_state_mtu); -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload) +int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload, + struct netlink_ext_ack *extack) { const struct xfrm_mode *inner_mode; const struct xfrm_mode *outer_mode; @@ -2626,12 +2627,16 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload) if (x->sel.family != AF_UNSPEC) { inner_mode = xfrm_get_mode(x->props.mode, x->sel.family); - if (inner_mode == NULL) + if (inner_mode == NULL) { + NL_SET_ERR_MSG(extack, "Requested mode not found"); goto error; + } if (!(inner_mode->flags & XFRM_MODE_FLAG_TUNNEL) && - family != x->sel.family) + family != x->sel.family) { + NL_SET_ERR_MSG(extack, "Only tunnel modes can accommodate a change of family"); goto error; + } x->inner_mode = *inner_mode; } else { @@ -2639,11 +2644,15 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload) int iafamily = AF_INET; inner_mode = xfrm_get_mode(x->props.mode, x->props.family); - if (inner_mode == NULL) + if (inner_mode == NULL) { + NL_SET_ERR_MSG(extack, "Requested mode not found"); goto error; + } - if (!(inner_mode->flags & XFRM_MODE_FLAG_TUNNEL)) + if (!(inner_mode->flags & XFRM_MODE_FLAG_TUNNEL)) { + NL_SET_ERR_MSG(extack, "Only tunnel modes can accommodate an AF_UNSPEC selector"); goto error; + } x->inner_mode = *inner_mode; @@ -2658,24 +2667,27 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload) } x->type = xfrm_get_type(x->id.proto, family); - if (x->type == NULL) + if (x->type == NULL) { + NL_SET_ERR_MSG(extack, "Requested type not found"); goto error; + } x->type_offload = xfrm_get_type_offload(x->id.proto, family, offload); - err = x->type->init_state(x); + err = x->type->init_state(x, extack); if (err) goto error; outer_mode = xfrm_get_mode(x->props.mode, family); if (!outer_mode) { + NL_SET_ERR_MSG(extack, "Requested mode not found"); err = -EPROTONOSUPPORT; goto error; } x->outer_mode = *outer_mode; if (init_replay) { - err = xfrm_init_replay(x); + err = xfrm_init_replay(x, extack); if (err) goto error; } @@ -2690,7 +2702,7 @@ int xfrm_init_state(struct xfrm_state *x) { int err; - err = __xfrm_init_state(x, true, false); + err = __xfrm_init_state(x, true, false, NULL); if (!err) x->km.state = XFRM_STATE_VALID; diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 2ff017117730..e73f9efc54c1 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -35,7 +35,8 @@ #endif #include <asm/unaligned.h> -static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type) +static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type, + struct netlink_ext_ack *extack) { struct nlattr *rt = attrs[type]; struct xfrm_algo *algp; @@ -44,8 +45,10 @@ static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type) return 0; algp = nla_data(rt); - if (nla_len(rt) < (int)xfrm_alg_len(algp)) + if (nla_len(rt) < (int)xfrm_alg_len(algp)) { + NL_SET_ERR_MSG(extack, "Invalid AUTH/CRYPT/COMP attribute length"); return -EINVAL; + } switch (type) { case XFRMA_ALG_AUTH: @@ -54,6 +57,7 @@ static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type) break; default: + NL_SET_ERR_MSG(extack, "Invalid algorithm attribute type"); return -EINVAL; } @@ -61,7 +65,8 @@ static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type) return 0; } -static int verify_auth_trunc(struct nlattr **attrs) +static int verify_auth_trunc(struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct nlattr *rt = attrs[XFRMA_ALG_AUTH_TRUNC]; struct xfrm_algo_auth *algp; @@ -70,14 +75,16 @@ static int verify_auth_trunc(struct nlattr **attrs) return 0; algp = nla_data(rt); - if (nla_len(rt) < (int)xfrm_alg_auth_len(algp)) + if (nla_len(rt) < (int)xfrm_alg_auth_len(algp)) { + NL_SET_ERR_MSG(extack, "Invalid AUTH_TRUNC attribute length"); return -EINVAL; + } algp->alg_name[sizeof(algp->alg_name) - 1] = '\0'; return 0; } -static int verify_aead(struct nlattr **attrs) +static int verify_aead(struct nlattr **attrs, struct netlink_ext_ack *extack) { struct nlattr *rt = attrs[XFRMA_ALG_AEAD]; struct xfrm_algo_aead *algp; @@ -86,8 +93,10 @@ static int verify_aead(struct nlattr **attrs) return 0; algp = nla_data(rt); - if (nla_len(rt) < (int)aead_len(algp)) + if (nla_len(rt) < (int)aead_len(algp)) { + NL_SET_ERR_MSG(extack, "Invalid AEAD attribute length"); return -EINVAL; + } algp->alg_name[sizeof(algp->alg_name) - 1] = '\0'; return 0; @@ -102,7 +111,7 @@ static void verify_one_addr(struct nlattr **attrs, enum xfrm_attr_type_t type, *addrp = nla_data(rt); } -static inline int verify_sec_ctx_len(struct nlattr **attrs) +static inline int verify_sec_ctx_len(struct nlattr **attrs, struct netlink_ext_ack *extack) { struct nlattr *rt = attrs[XFRMA_SEC_CTX]; struct xfrm_user_sec_ctx *uctx; @@ -112,42 +121,59 @@ static inline int verify_sec_ctx_len(struct nlattr **attrs) uctx = nla_data(rt); if (uctx->len > nla_len(rt) || - uctx->len != (sizeof(struct xfrm_user_sec_ctx) + uctx->ctx_len)) + uctx->len != (sizeof(struct xfrm_user_sec_ctx) + uctx->ctx_len)) { + NL_SET_ERR_MSG(extack, "Invalid security context length"); return -EINVAL; + } return 0; } static inline int verify_replay(struct xfrm_usersa_info *p, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct nlattr *rt = attrs[XFRMA_REPLAY_ESN_VAL]; struct xfrm_replay_state_esn *rs; - if (!rt) - return (p->flags & XFRM_STATE_ESN) ? -EINVAL : 0; + if (!rt) { + if (p->flags & XFRM_STATE_ESN) { + NL_SET_ERR_MSG(extack, "Missing required attribute for ESN"); + return -EINVAL; + } + return 0; + } rs = nla_data(rt); - if (rs->bmp_len > XFRMA_REPLAY_ESN_MAX / sizeof(rs->bmp[0]) / 8) + if (rs->bmp_len > XFRMA_REPLAY_ESN_MAX / sizeof(rs->bmp[0]) / 8) { + NL_SET_ERR_MSG(extack, "ESN bitmap length must be <= 128"); return -EINVAL; + } if (nla_len(rt) < (int)xfrm_replay_state_esn_len(rs) && - nla_len(rt) != sizeof(*rs)) + nla_len(rt) != sizeof(*rs)) { + NL_SET_ERR_MSG(extack, "ESN attribute is too short to fit the full bitmap length"); return -EINVAL; + } /* As only ESP and AH support ESN feature. */ - if ((p->id.proto != IPPROTO_ESP) && (p->id.proto != IPPROTO_AH)) + if ((p->id.proto != IPPROTO_ESP) && (p->id.proto != IPPROTO_AH)) { + NL_SET_ERR_MSG(extack, "ESN only supported for ESP and AH"); return -EINVAL; + } - if (p->replay_window != 0) + if (p->replay_window != 0) { + NL_SET_ERR_MSG(extack, "ESN not compatible with legacy replay_window"); return -EINVAL; + } return 0; } static int verify_newsa_info(struct xfrm_usersa_info *p, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { int err; @@ -161,10 +187,12 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, break; #else err = -EAFNOSUPPORT; + NL_SET_ERR_MSG(extack, "IPv6 support disabled"); goto out; #endif default: + NL_SET_ERR_MSG(extack, "Invalid address family"); goto out; } @@ -173,65 +201,98 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, break; case AF_INET: - if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) + if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) { + NL_SET_ERR_MSG(extack, "Invalid prefix length in selector (must be <= 32 for IPv4)"); goto out; + } break; case AF_INET6: #if IS_ENABLED(CONFIG_IPV6) - if (p->sel.prefixlen_d > 128 || p->sel.prefixlen_s > 128) + if (p->sel.prefixlen_d > 128 || p->sel.prefixlen_s > 128) { + NL_SET_ERR_MSG(extack, "Invalid prefix length in selector (must be <= 128 for IPv6)"); goto out; + } break; #else + NL_SET_ERR_MSG(extack, "IPv6 support disabled"); err = -EAFNOSUPPORT; goto out; #endif default: + NL_SET_ERR_MSG(extack, "Invalid address family in selector"); goto out; } err = -EINVAL; switch (p->id.proto) { case IPPROTO_AH: - if ((!attrs[XFRMA_ALG_AUTH] && - !attrs[XFRMA_ALG_AUTH_TRUNC]) || - attrs[XFRMA_ALG_AEAD] || + if (!attrs[XFRMA_ALG_AUTH] && + !attrs[XFRMA_ALG_AUTH_TRUNC]) { + NL_SET_ERR_MSG(extack, "Missing required attribute for AH: AUTH_TRUNC or AUTH"); + goto out; + } + + if (attrs[XFRMA_ALG_AEAD] || attrs[XFRMA_ALG_CRYPT] || attrs[XFRMA_ALG_COMP] || - attrs[XFRMA_TFCPAD]) + attrs[XFRMA_TFCPAD]) { + NL_SET_ERR_MSG(extack, "Invalid attributes for AH: AEAD, CRYPT, COMP, TFCPAD"); goto out; + } break; case IPPROTO_ESP: - if (attrs[XFRMA_ALG_COMP]) + if (attrs[XFRMA_ALG_COMP]) { + NL_SET_ERR_MSG(extack, "Invalid attribute for ESP: COMP"); goto out; + } + if (!attrs[XFRMA_ALG_AUTH] && !attrs[XFRMA_ALG_AUTH_TRUNC] && !attrs[XFRMA_ALG_CRYPT] && - !attrs[XFRMA_ALG_AEAD]) + !attrs[XFRMA_ALG_AEAD]) { + NL_SET_ERR_MSG(extack, "Missing required attribute for ESP: at least one of AUTH, AUTH_TRUNC, CRYPT, AEAD"); goto out; + } + if ((attrs[XFRMA_ALG_AUTH] || attrs[XFRMA_ALG_AUTH_TRUNC] || attrs[XFRMA_ALG_CRYPT]) && - attrs[XFRMA_ALG_AEAD]) + attrs[XFRMA_ALG_AEAD]) { + NL_SET_ERR_MSG(extack, "Invalid attribute combination for ESP: AEAD can't be used with AUTH, AUTH_TRUNC, CRYPT"); goto out; + } + if (attrs[XFRMA_TFCPAD] && - p->mode != XFRM_MODE_TUNNEL) + p->mode != XFRM_MODE_TUNNEL) { + NL_SET_ERR_MSG(extack, "TFC padding can only be used in tunnel mode"); goto out; + } break; case IPPROTO_COMP: - if (!attrs[XFRMA_ALG_COMP] || - attrs[XFRMA_ALG_AEAD] || + if (!attrs[XFRMA_ALG_COMP]) { + NL_SET_ERR_MSG(extack, "Missing required attribute for COMP: COMP"); + goto out; + } + + if (attrs[XFRMA_ALG_AEAD] || attrs[XFRMA_ALG_AUTH] || attrs[XFRMA_ALG_AUTH_TRUNC] || attrs[XFRMA_ALG_CRYPT] || - attrs[XFRMA_TFCPAD] || - (ntohl(p->id.spi) >= 0x10000)) + attrs[XFRMA_TFCPAD]) { + NL_SET_ERR_MSG(extack, "Invalid attributes for COMP: AEAD, AUTH, AUTH_TRUNC, CRYPT, TFCPAD"); goto out; + } + + if (ntohl(p->id.spi) >= 0x10000) { + NL_SET_ERR_MSG(extack, "SPI is too large for COMP (must be < 0x10000)"); + goto out; + } break; #if IS_ENABLED(CONFIG_IPV6) @@ -244,29 +305,36 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, attrs[XFRMA_ALG_CRYPT] || attrs[XFRMA_ENCAP] || attrs[XFRMA_SEC_CTX] || - attrs[XFRMA_TFCPAD] || - !attrs[XFRMA_COADDR]) + attrs[XFRMA_TFCPAD]) { + NL_SET_ERR_MSG(extack, "Invalid attributes for DSTOPTS/ROUTING"); goto out; + } + + if (!attrs[XFRMA_COADDR]) { + NL_SET_ERR_MSG(extack, "Missing required COADDR attribute for DSTOPTS/ROUTING"); + goto out; + } break; #endif default: + NL_SET_ERR_MSG(extack, "Unsupported protocol"); goto out; } - if ((err = verify_aead(attrs))) + if ((err = verify_aead(attrs, extack))) goto out; - if ((err = verify_auth_trunc(attrs))) + if ((err = verify_auth_trunc(attrs, extack))) goto out; - if ((err = verify_one_alg(attrs, XFRMA_ALG_AUTH))) + if ((err = verify_one_alg(attrs, XFRMA_ALG_AUTH, extack))) goto out; - if ((err = verify_one_alg(attrs, XFRMA_ALG_CRYPT))) + if ((err = verify_one_alg(attrs, XFRMA_ALG_CRYPT, extack))) goto out; - if ((err = verify_one_alg(attrs, XFRMA_ALG_COMP))) + if ((err = verify_one_alg(attrs, XFRMA_ALG_COMP, extack))) goto out; - if ((err = verify_sec_ctx_len(attrs))) + if ((err = verify_sec_ctx_len(attrs, extack))) goto out; - if ((err = verify_replay(p, attrs))) + if ((err = verify_replay(p, attrs, extack))) goto out; err = -EINVAL; @@ -278,14 +346,19 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, break; default: + NL_SET_ERR_MSG(extack, "Unsupported mode"); goto out; } err = 0; - if (attrs[XFRMA_MTIMER_THRESH]) - if (!attrs[XFRMA_ENCAP]) + if (attrs[XFRMA_MTIMER_THRESH]) { + if (!attrs[XFRMA_ENCAP]) { + NL_SET_ERR_MSG(extack, "MTIMER_THRESH attribute can only be set on ENCAP states"); err = -EINVAL; + goto out; + } + } out: return err; @@ -293,7 +366,7 @@ out: static int attach_one_algo(struct xfrm_algo **algpp, u8 *props, struct xfrm_algo_desc *(*get_byname)(const char *, int), - struct nlattr *rta) + struct nlattr *rta, struct netlink_ext_ack *extack) { struct xfrm_algo *p, *ualg; struct xfrm_algo_desc *algo; @@ -304,8 +377,10 @@ static int attach_one_algo(struct xfrm_algo **algpp, u8 *props, ualg = nla_data(rta); algo = get_byname(ualg->alg_name, 1); - if (!algo) + if (!algo) { + NL_SET_ERR_MSG(extack, "Requested COMP algorithm not found"); return -ENOSYS; + } *props = algo->desc.sadb_alg_id; p = kmemdup(ualg, xfrm_alg_len(ualg), GFP_KERNEL); @@ -317,7 +392,8 @@ static int attach_one_algo(struct xfrm_algo **algpp, u8 *props, return 0; } -static int attach_crypt(struct xfrm_state *x, struct nlattr *rta) +static int attach_crypt(struct xfrm_state *x, struct nlattr *rta, + struct netlink_ext_ack *extack) { struct xfrm_algo *p, *ualg; struct xfrm_algo_desc *algo; @@ -328,8 +404,10 @@ static int attach_crypt(struct xfrm_state *x, struct nlattr *rta) ualg = nla_data(rta); algo = xfrm_ealg_get_byname(ualg->alg_name, 1); - if (!algo) + if (!algo) { + NL_SET_ERR_MSG(extack, "Requested CRYPT algorithm not found"); return -ENOSYS; + } x->props.ealgo = algo->desc.sadb_alg_id; p = kmemdup(ualg, xfrm_alg_len(ualg), GFP_KERNEL); @@ -343,7 +421,7 @@ static int attach_crypt(struct xfrm_state *x, struct nlattr *rta) } static int attach_auth(struct xfrm_algo_auth **algpp, u8 *props, - struct nlattr *rta) + struct nlattr *rta, struct netlink_ext_ack *extack) { struct xfrm_algo *ualg; struct xfrm_algo_auth *p; @@ -355,8 +433,10 @@ static int attach_auth(struct xfrm_algo_auth **algpp, u8 *props, ualg = nla_data(rta); algo = xfrm_aalg_get_byname(ualg->alg_name, 1); - if (!algo) + if (!algo) { + NL_SET_ERR_MSG(extack, "Requested AUTH algorithm not found"); return -ENOSYS; + } *props = algo->desc.sadb_alg_id; p = kmalloc(sizeof(*p) + (ualg->alg_key_len + 7) / 8, GFP_KERNEL); @@ -373,7 +453,7 @@ static int attach_auth(struct xfrm_algo_auth **algpp, u8 *props, } static int attach_auth_trunc(struct xfrm_algo_auth **algpp, u8 *props, - struct nlattr *rta) + struct nlattr *rta, struct netlink_ext_ack *extack) { struct xfrm_algo_auth *p, *ualg; struct xfrm_algo_desc *algo; @@ -384,10 +464,14 @@ static int attach_auth_trunc(struct xfrm_algo_auth **algpp, u8 *props, ualg = nla_data(rta); algo = xfrm_aalg_get_byname(ualg->alg_name, 1); - if (!algo) + if (!algo) { + NL_SET_ERR_MSG(extack, "Requested AUTH_TRUNC algorithm not found"); return -ENOSYS; - if (ualg->alg_trunc_len > algo->uinfo.auth.icv_fullbits) + } + if (ualg->alg_trunc_len > algo->uinfo.auth.icv_fullbits) { + NL_SET_ERR_MSG(extack, "Invalid length requested for truncated ICV"); return -EINVAL; + } *props = algo->desc.sadb_alg_id; p = kmemdup(ualg, xfrm_alg_auth_len(ualg), GFP_KERNEL); @@ -402,7 +486,8 @@ static int attach_auth_trunc(struct xfrm_algo_auth **algpp, u8 *props, return 0; } -static int attach_aead(struct xfrm_state *x, struct nlattr *rta) +static int attach_aead(struct xfrm_state *x, struct nlattr *rta, + struct netlink_ext_ack *extack) { struct xfrm_algo_aead *p, *ualg; struct xfrm_algo_desc *algo; @@ -413,8 +498,10 @@ static int attach_aead(struct xfrm_state *x, struct nlattr *rta) ualg = nla_data(rta); algo = xfrm_aead_get_byname(ualg->alg_name, ualg->alg_icv_len, 1); - if (!algo) + if (!algo) { + NL_SET_ERR_MSG(extack, "Requested AEAD algorithm not found"); return -ENOSYS; + } x->props.ealgo = algo->desc.sadb_alg_id; p = kmemdup(ualg, aead_len(ualg), GFP_KERNEL); @@ -579,7 +666,8 @@ static void xfrm_smark_init(struct nlattr **attrs, struct xfrm_mark *m) static struct xfrm_state *xfrm_state_construct(struct net *net, struct xfrm_usersa_info *p, struct nlattr **attrs, - int *errp) + int *errp, + struct netlink_ext_ack *extack) { struct xfrm_state *x = xfrm_state_alloc(net); int err = -ENOMEM; @@ -606,21 +694,21 @@ static struct xfrm_state *xfrm_state_construct(struct net *net, if (attrs[XFRMA_SA_EXTRA_FLAGS]) x->props.extra_flags = nla_get_u32(attrs[XFRMA_SA_EXTRA_FLAGS]); - if ((err = attach_aead(x, attrs[XFRMA_ALG_AEAD]))) + if ((err = attach_aead(x, attrs[XFRMA_ALG_AEAD], extack))) goto error; if ((err = attach_auth_trunc(&x->aalg, &x->props.aalgo, - attrs[XFRMA_ALG_AUTH_TRUNC]))) + attrs[XFRMA_ALG_AUTH_TRUNC], extack))) goto error; if (!x->props.aalgo) { if ((err = attach_auth(&x->aalg, &x->props.aalgo, - attrs[XFRMA_ALG_AUTH]))) + attrs[XFRMA_ALG_AUTH], extack))) goto error; } - if ((err = attach_crypt(x, attrs[XFRMA_ALG_CRYPT]))) + if ((err = attach_crypt(x, attrs[XFRMA_ALG_CRYPT], extack))) goto error; if ((err = attach_one_algo(&x->calg, &x->props.calgo, xfrm_calg_get_byname, - attrs[XFRMA_ALG_COMP]))) + attrs[XFRMA_ALG_COMP], extack))) goto error; if (attrs[XFRMA_TFCPAD]) @@ -633,7 +721,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net, if (attrs[XFRMA_IF_ID]) x->if_id = nla_get_u32(attrs[XFRMA_IF_ID]); - err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV]); + err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV], extack); if (err) goto error; @@ -653,7 +741,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net, /* sysctl_xfrm_aevent_etime is in 100ms units */ x->replay_maxage = (net->xfrm.sysctl_aevent_etime*HZ)/XFRM_AE_ETH_M; - if ((err = xfrm_init_replay(x))) + if ((err = xfrm_init_replay(x, extack))) goto error; /* override default values from above */ @@ -662,7 +750,8 @@ static struct xfrm_state *xfrm_state_construct(struct net *net, /* configure the hardware if offload is requested */ if (attrs[XFRMA_OFFLOAD_DEV]) { err = xfrm_dev_state_add(net, x, - nla_data(attrs[XFRMA_OFFLOAD_DEV])); + nla_data(attrs[XFRMA_OFFLOAD_DEV]), + extack); if (err) goto error; } @@ -678,7 +767,7 @@ error_no_put: } static int xfrm_add_sa(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_usersa_info *p = nlmsg_data(nlh); @@ -686,11 +775,11 @@ static int xfrm_add_sa(struct sk_buff *skb, struct nlmsghdr *nlh, int err; struct km_event c; - err = verify_newsa_info(p, attrs); + err = verify_newsa_info(p, attrs, extack); if (err) return err; - x = xfrm_state_construct(net, p, attrs, &err); + x = xfrm_state_construct(net, p, attrs, &err, extack); if (!x) return err; @@ -757,7 +846,7 @@ static struct xfrm_state *xfrm_user_state_lookup(struct net *net, } static int xfrm_del_sa(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_state *x; @@ -1254,7 +1343,8 @@ static int build_spdinfo(struct sk_buff *skb, struct net *net, } static int xfrm_set_spdinfo(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrmu_spdhthresh *thresh4 = NULL; @@ -1299,7 +1389,8 @@ static int xfrm_set_spdinfo(struct sk_buff *skb, struct nlmsghdr *nlh, } static int xfrm_get_spdinfo(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct sk_buff *r_skb; @@ -1358,7 +1449,8 @@ static int build_sadinfo(struct sk_buff *skb, struct net *net, } static int xfrm_get_sadinfo(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct sk_buff *r_skb; @@ -1378,7 +1470,7 @@ static int xfrm_get_sadinfo(struct sk_buff *skb, struct nlmsghdr *nlh, } static int xfrm_get_sa(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_usersa_id *p = nlmsg_data(nlh); @@ -1402,7 +1494,8 @@ out_noput: } static int xfrm_alloc_userspi(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_state *x; @@ -1477,7 +1570,7 @@ out_noput: return err; } -static int verify_policy_dir(u8 dir) +static int verify_policy_dir(u8 dir, struct netlink_ext_ack *extack) { switch (dir) { case XFRM_POLICY_IN: @@ -1486,13 +1579,14 @@ static int verify_policy_dir(u8 dir) break; default: + NL_SET_ERR_MSG(extack, "Invalid policy direction"); return -EINVAL; } return 0; } -static int verify_policy_type(u8 type) +static int verify_policy_type(u8 type, struct netlink_ext_ack *extack) { switch (type) { case XFRM_POLICY_TYPE_MAIN: @@ -1502,13 +1596,15 @@ static int verify_policy_type(u8 type) break; default: + NL_SET_ERR_MSG(extack, "Invalid policy type"); return -EINVAL; } return 0; } -static int verify_newpolicy_info(struct xfrm_userpolicy_info *p) +static int verify_newpolicy_info(struct xfrm_userpolicy_info *p, + struct netlink_ext_ack *extack) { int ret; @@ -1520,6 +1616,7 @@ static int verify_newpolicy_info(struct xfrm_userpolicy_info *p) break; default: + NL_SET_ERR_MSG(extack, "Invalid policy share"); return -EINVAL; } @@ -1529,35 +1626,44 @@ static int verify_newpolicy_info(struct xfrm_userpolicy_info *p) break; default: + NL_SET_ERR_MSG(extack, "Invalid policy action"); return -EINVAL; } switch (p->sel.family) { case AF_INET: - if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) + if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) { + NL_SET_ERR_MSG(extack, "Invalid prefix length in selector (must be <= 32 for IPv4)"); return -EINVAL; + } break; case AF_INET6: #if IS_ENABLED(CONFIG_IPV6) - if (p->sel.prefixlen_d > 128 || p->sel.prefixlen_s > 128) + if (p->sel.prefixlen_d > 128 || p->sel.prefixlen_s > 128) { + NL_SET_ERR_MSG(extack, "Invalid prefix length in selector (must be <= 128 for IPv6)"); return -EINVAL; + } break; #else + NL_SET_ERR_MSG(extack, "IPv6 support disabled"); return -EAFNOSUPPORT; #endif default: + NL_SET_ERR_MSG(extack, "Invalid selector family"); return -EINVAL; } - ret = verify_policy_dir(p->dir); + ret = verify_policy_dir(p->dir, extack); if (ret) return ret; - if (p->index && (xfrm_policy_id2dir(p->index) != p->dir)) + if (p->index && (xfrm_policy_id2dir(p->index) != p->dir)) { + NL_SET_ERR_MSG(extack, "Policy index doesn't match direction"); return -EINVAL; + } return 0; } @@ -1599,13 +1705,16 @@ static void copy_templates(struct xfrm_policy *xp, struct xfrm_user_tmpl *ut, } } -static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family) +static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family, + struct netlink_ext_ack *extack) { u16 prev_family; int i; - if (nr > XFRM_MAX_DEPTH) + if (nr > XFRM_MAX_DEPTH) { + NL_SET_ERR_MSG(extack, "Template count must be <= XFRM_MAX_DEPTH (" __stringify(XFRM_MAX_DEPTH) ")"); return -EINVAL; + } prev_family = family; @@ -1625,12 +1734,16 @@ static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family) case XFRM_MODE_BEET: break; default: - if (ut[i].family != prev_family) + if (ut[i].family != prev_family) { + NL_SET_ERR_MSG(extack, "Mode in template doesn't support a family change"); return -EINVAL; + } break; } - if (ut[i].mode >= XFRM_MODE_MAX) + if (ut[i].mode >= XFRM_MODE_MAX) { + NL_SET_ERR_MSG(extack, "Mode in template must be < XFRM_MODE_MAX (" __stringify(XFRM_MODE_MAX) ")"); return -EINVAL; + } prev_family = ut[i].family; @@ -1642,17 +1755,21 @@ static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family) break; #endif default: + NL_SET_ERR_MSG(extack, "Invalid family in template"); return -EINVAL; } - if (!xfrm_id_proto_valid(ut[i].id.proto)) + if (!xfrm_id_proto_valid(ut[i].id.proto)) { + NL_SET_ERR_MSG(extack, "Invalid XFRM protocol in template"); return -EINVAL; + } } return 0; } -static int copy_from_user_tmpl(struct xfrm_policy *pol, struct nlattr **attrs) +static int copy_from_user_tmpl(struct xfrm_policy *pol, struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct nlattr *rt = attrs[XFRMA_TMPL]; @@ -1663,7 +1780,7 @@ static int copy_from_user_tmpl(struct xfrm_policy *pol, struct nlattr **attrs) int nr = nla_len(rt) / sizeof(*utmpl); int err; - err = validate_tmpl(nr, utmpl, pol->family); + err = validate_tmpl(nr, utmpl, pol->family, extack); if (err) return err; @@ -1672,7 +1789,8 @@ static int copy_from_user_tmpl(struct xfrm_policy *pol, struct nlattr **attrs) return 0; } -static int copy_from_user_policy_type(u8 *tp, struct nlattr **attrs) +static int copy_from_user_policy_type(u8 *tp, struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct nlattr *rt = attrs[XFRMA_POLICY_TYPE]; struct xfrm_userpolicy_type *upt; @@ -1684,7 +1802,7 @@ static int copy_from_user_policy_type(u8 *tp, struct nlattr **attrs) type = upt->type; } - err = verify_policy_type(type); + err = verify_policy_type(type, extack); if (err) return err; @@ -1719,7 +1837,11 @@ static void copy_to_user_policy(struct xfrm_policy *xp, struct xfrm_userpolicy_i p->share = XFRM_SHARE_ANY; /* XXX xp->share */ } -static struct xfrm_policy *xfrm_policy_construct(struct net *net, struct xfrm_userpolicy_info *p, struct nlattr **attrs, int *errp) +static struct xfrm_policy *xfrm_policy_construct(struct net *net, + struct xfrm_userpolicy_info *p, + struct nlattr **attrs, + int *errp, + struct netlink_ext_ack *extack) { struct xfrm_policy *xp = xfrm_policy_alloc(net, GFP_KERNEL); int err; @@ -1731,11 +1853,11 @@ static struct xfrm_policy *xfrm_policy_construct(struct net *net, struct xfrm_us copy_from_user_policy(xp, p); - err = copy_from_user_policy_type(&xp->type, attrs); + err = copy_from_user_policy_type(&xp->type, attrs, extack); if (err) goto error; - if (!(err = copy_from_user_tmpl(xp, attrs))) + if (!(err = copy_from_user_tmpl(xp, attrs, extack))) err = copy_from_user_sec_ctx(xp, attrs); if (err) goto error; @@ -1754,7 +1876,8 @@ static struct xfrm_policy *xfrm_policy_construct(struct net *net, struct xfrm_us } static int xfrm_add_policy(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_userpolicy_info *p = nlmsg_data(nlh); @@ -1763,14 +1886,14 @@ static int xfrm_add_policy(struct sk_buff *skb, struct nlmsghdr *nlh, int err; int excl; - err = verify_newpolicy_info(p); + err = verify_newpolicy_info(p, extack); if (err) return err; - err = verify_sec_ctx_len(attrs); + err = verify_sec_ctx_len(attrs, extack); if (err) return err; - xp = xfrm_policy_construct(net, p, attrs, &err); + xp = xfrm_policy_construct(net, p, attrs, &err, extack); if (!xp) return err; @@ -2015,7 +2138,7 @@ static bool xfrm_userpolicy_is_valid(__u8 policy) } static int xfrm_set_default(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_userpolicy_default *up = nlmsg_data(nlh); @@ -2036,7 +2159,7 @@ static int xfrm_set_default(struct sk_buff *skb, struct nlmsghdr *nlh, } static int xfrm_get_default(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct sk_buff *r_skb; struct nlmsghdr *r_nlh; @@ -2066,7 +2189,8 @@ static int xfrm_get_default(struct sk_buff *skb, struct nlmsghdr *nlh, } static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_policy *xp; @@ -2081,11 +2205,11 @@ static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh, p = nlmsg_data(nlh); delete = nlh->nlmsg_type == XFRM_MSG_DELPOLICY; - err = copy_from_user_policy_type(&type, attrs); + err = copy_from_user_policy_type(&type, attrs, extack); if (err) return err; - err = verify_policy_dir(p->dir); + err = verify_policy_dir(p->dir, extack); if (err) return err; @@ -2101,7 +2225,7 @@ static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh, struct nlattr *rt = attrs[XFRMA_SEC_CTX]; struct xfrm_sec_ctx *ctx; - err = verify_sec_ctx_len(attrs); + err = verify_sec_ctx_len(attrs, extack); if (err) return err; @@ -2149,7 +2273,8 @@ out: } static int xfrm_flush_sa(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct km_event c; @@ -2249,7 +2374,7 @@ out_cancel: } static int xfrm_get_ae(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_state *x; @@ -2293,7 +2418,7 @@ static int xfrm_get_ae(struct sk_buff *skb, struct nlmsghdr *nlh, } static int xfrm_new_ae(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_state *x; @@ -2344,14 +2469,15 @@ out: } static int xfrm_flush_policy(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct km_event c; u8 type = XFRM_POLICY_TYPE_MAIN; int err; - err = copy_from_user_policy_type(&type, attrs); + err = copy_from_user_policy_type(&type, attrs, extack); if (err) return err; @@ -2372,7 +2498,8 @@ static int xfrm_flush_policy(struct sk_buff *skb, struct nlmsghdr *nlh, } static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_policy *xp; @@ -2383,11 +2510,11 @@ static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh, struct xfrm_mark m; u32 if_id = 0; - err = copy_from_user_policy_type(&type, attrs); + err = copy_from_user_policy_type(&type, attrs, extack); if (err) return err; - err = verify_policy_dir(p->dir); + err = verify_policy_dir(p->dir, extack); if (err) return err; @@ -2403,7 +2530,7 @@ static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh, struct nlattr *rt = attrs[XFRMA_SEC_CTX]; struct xfrm_sec_ctx *ctx; - err = verify_sec_ctx_len(attrs); + err = verify_sec_ctx_len(attrs, extack); if (err) return err; @@ -2438,7 +2565,8 @@ out: } static int xfrm_add_sa_expire(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_state *x; @@ -2472,7 +2600,8 @@ out: } static int xfrm_add_acquire(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, + struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); struct xfrm_policy *xp; @@ -2490,15 +2619,15 @@ static int xfrm_add_acquire(struct sk_buff *skb, struct nlmsghdr *nlh, xfrm_mark_get(attrs, &mark); - err = verify_newpolicy_info(&ua->policy); + err = verify_newpolicy_info(&ua->policy, extack); if (err) goto free_state; - err = verify_sec_ctx_len(attrs); + err = verify_sec_ctx_len(attrs, extack); if (err) goto free_state; /* build an XP */ - xp = xfrm_policy_construct(net, &ua->policy, attrs, &err); + xp = xfrm_policy_construct(net, &ua->policy, attrs, &err, extack); if (!xp) goto free_state; @@ -2577,7 +2706,7 @@ static int copy_from_user_migrate(struct xfrm_migrate *ma, } static int xfrm_do_migrate(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { struct xfrm_userpolicy_id *pi = nlmsg_data(nlh); struct xfrm_migrate m[XFRM_MAX_DEPTH]; @@ -2594,7 +2723,7 @@ static int xfrm_do_migrate(struct sk_buff *skb, struct nlmsghdr *nlh, kmp = attrs[XFRMA_KMADDRESS] ? &km : NULL; - err = copy_from_user_policy_type(&type, attrs); + err = copy_from_user_policy_type(&type, attrs, extack); if (err) return err; @@ -2623,7 +2752,7 @@ static int xfrm_do_migrate(struct sk_buff *skb, struct nlmsghdr *nlh, } #else static int xfrm_do_migrate(struct sk_buff *skb, struct nlmsghdr *nlh, - struct nlattr **attrs) + struct nlattr **attrs, struct netlink_ext_ack *extack) { return -ENOPROTOOPT; } @@ -2819,7 +2948,8 @@ static const struct nla_policy xfrma_spd_policy[XFRMA_SPD_MAX+1] = { }; static const struct xfrm_link { - int (*doit)(struct sk_buff *, struct nlmsghdr *, struct nlattr **); + int (*doit)(struct sk_buff *, struct nlmsghdr *, struct nlattr **, + struct netlink_ext_ack *); int (*start)(struct netlink_callback *); int (*dump)(struct sk_buff *, struct netlink_callback *); int (*done)(struct netlink_callback *); @@ -2921,7 +3051,7 @@ static int xfrm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, goto err; } - err = link->doit(skb, nlh, attrs); + err = link->doit(skb, nlh, attrs, extack); /* We need to free skb allocated in xfrm_alloc_compat() before * returning from this function, because consume_skb() won't take @@ -3272,11 +3402,11 @@ static struct xfrm_policy *xfrm_compile_policy(struct sock *sk, int opt, *dir = -EINVAL; if (len < sizeof(*p) || - verify_newpolicy_info(p)) + verify_newpolicy_info(p, NULL)) return NULL; nr = ((len - sizeof(*p)) / sizeof(*ut)); - if (validate_tmpl(nr, ut, p->sel.family)) + if (validate_tmpl(nr, ut, p->sel.family, NULL)) return NULL; if (p->dir > XFRM_POLICY_OUT) |