aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-05-07mac80211: average ack rssi support for data framesBalaji Pothunoori
The driver will process the RSSI if available and send it to mac80211. mac80211 will compute the weighted average of ack RSSI for stations. Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07cfg80211: average ack rssi support for data framesBalaji Pothunoori
Average ack rssi will be given to userspace via NL80211 interface if firmware is capable. Userspace tool ‘iw’ can process this information and give the output as one of the fields in ‘iw dev wlanX station dump’. Example output : localhost ~ #iw dev wlan-5000mhz station dump Station 34:f3:9a:aa:3b:29 (on wlan-5000mhz) inactive time: 5370 ms rx bytes: 85321 rx packets: 576 tx bytes: 14225 tx packets: 71 tx retries: 0 tx failed: 2 beacon loss: 0 rx drop misc: 0 signal: -54 dBm signal avg: -53 dBm tx bitrate: 866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2 rx bitrate: 866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2 avg ack signal: -56 dBm authorized: yes authenticated: yes associated: yes preamble: short WMM/WME: yes MFP: no TDLS peer: no DTIM period: 2 beacon interval:100 short preamble: yes short slot time:yes connected time: 203 seconds Main use case is to measure the signal strength of a connected station to AP. Data packet transmit rates and bandwidth used by station can vary a lot even if the station is at fixed location, especially if the rates used are multi stream(2stream, 3stream) rates with different bandwidth(20/40/80 Mhz). These multi stream rates are sensitive and station can use different transmit power for each of the rate and bandwidth combinations. RSSI measured from these RX packets on AP will be not stable and can vary a lot with in a short time. Whereas 802.11 ack frames from station are sent relatively at a constant rate (6/12/24 Mbps) with constant bandwidth(20 Mhz). So average rssi of the ack packets is good and more accurate. Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07cfg80211: Call reg_notifier for self managed hints conditionallyAmar Singhal
Currently the regulatory core does not call the regulatory callback reg_notifier for self managed wiphys, but regulatory_hint_user() call is independent of wiphy and is meant for all wiphys in the system. Even a self managed wiphy may be interested in regulatory_hint_user() to know the country code from a trusted regulatory domain change like a cellular base station. Therefore, for the regulatory source NL80211_REGDOM_SET_BY_USER and the user hint type NL80211_USER_REG_HINT_CELL_BASE, call the regulatory notifier. No current wlan driver uses the REGULATORY_WIPHY_SELF_MANAGED flag while also registering the reg_notifier regulatory callback, therefore there will be no impact on existing drivers without them being explicitly modified to take advantage of this new possibility. Signed-off-by: Amar Singhal <asinghal@codeaurora.org> Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07nl80211: Add wmm rule attribute to NL80211_CMD_GET_WIPHY dump commandHaim Dreyfuss
This will serve userspace entity to maintain its regulatory limitation. More specifcally APs can use this data to calculate the WMM IE when building: beacons, probe responses, assoc responses etc... Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07mac80211: add api to set CSA counter in mac80211Gregory Greenman
Sometimes the most updated CSA counter values are known only to the device. Add an API to pass this data to mac80211. Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07mac80211: remove pointless flags=0 assignmentJohannes Berg
The data structure is initialized to all zeroes, and we already rely on that in other places, so remove the pointless assignment to 0. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07mac80211: ethtool: memset the whole sinfo struct to 0Johannes Berg
Rather than just setting the valid flags to 0 set the whole struct to 0 since other places might rely on it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07mac80211: clean up rate info bandwidth settingJohannes Berg
There's no need to do the same thing three times in the different switch cases, pull that out to a single place. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-07mac80211: rename rtap_vendor_space to rtap_spaceJohannes Berg
Since all the HE data won't fit into struct ieee80211_rx_status, we'll (have to) move that into the SKB proper. This means we'll need to skip over more things in the future, so rename this to remove the vendor-only notion from it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-04-19regulatory: Rename confusing 'country IE' in log outputToke Høiland-Jørgensen
The 'country IE' messages in the log can be confusing and make people think that the country code has been set to Ireland. Fix this by changing the log messages to use 'country element' instead (as they are no longer called 'information element' in the spec anyway). Reported-by: Bernhard Gabler <Bernhard_Gabler@web.de> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-04-19mac80211_hwsim: indicate support for powersave.Bjoern Johansson
Without this, higher layers in the kernel will return an error code when trying to set the power state because the driver doesn't indicate power state support. This in turn causes VTS (Android Vendor Test Suite) failures because the WiFi HAL can't enable power saving mode. Signed-off-by: Bjoern Johansson <bjoernj@google.com> Signed-off-by: Lingfeng Yang <lfy@google.com> Signed-off-by: Roman Kiryanov <rkir@google.com> [johannes: remove remaining code, it was useless even as a skeleton since it didn't even have the right function arguments] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-04-18ipv6: frags: fix a lockdep false positiveEric Dumazet
lockdep does not know that the locks used by IPv4 defrag and IPv6 reassembly units are of different classes. It complains because of following chains : 1) sch_direct_xmit() (lock txq->_xmit_lock) dev_hard_start_xmit() xmit_one() dev_queue_xmit_nit() packet_rcv_fanout() ip_check_defrag() ip_defrag() spin_lock() (lock frag queue spinlock) 2) ip6_input_finish() ipv6_frag_rcv() (lock frag queue spinlock) ip6_frag_queue() icmpv6_param_prob() (lock txq->_xmit_lock at some point) We could add lockdep annotations, but we also can make sure IPv6 calls icmpv6_param_prob() only after the release of the frag queue spinlock, since this naturally makes frag queue spinlock a leaf in lock hierarchy. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18hv_netvsc: Add NetVSP v6 and v6.1 into version negotiationHaiyang Zhang
This patch adds the NetVSP v6 and 6.1 message structures, and includes these versions into NetVSC/NetVSP version negotiation process. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18hv_netvsc: propogate Hyper-V friendly name into interface aliasStephen Hemminger
This patch implement the 'Device Naming' feature of the Hyper-V network device API. In Hyper-V on the host through the GUI or PowerShell it is possible to enable the device naming feature which causes the host to make available to the guest the name of the device. This shows up in the RNDIS protocol as the friendly name. The name has no particular meaning and is limited to 256 characters. The value can only be set via PowerShell on the host, but could be scripted for mass deployments. The default value is the string 'Network Adapter' and since that is the same for all devices and useless, the driver ignores it. In Windows, the value goes into a registry key for use in SNMP ifAlias. For Linux, this patch puts the value in the network device alias property; where it is visible in ip tools and SNMP. The host provided ifAlias is just a suggestion, and can be overridden by later ip commands. Also requires exporting dev_set_alias in netdev core. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18Merge branch 'r8169-series-with-further-smaller-improvements'David S. Miller
Heiner Kallweit says: ==================== r8169: series with further smaller improvements This series includes further smaller improvements. Then I think the basic cleanup has been done and next step would be preparing the switch to phylib. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove jumbo_tx_csum from chip config structHeiner Kallweit
According to the chip configuration entries only RTL8169 (ver <= 06) supports tx checksumming for jumbo packets. By the way: constant JUMBO_1K is a little misleading because it refers to the standard packet size and not to a jumbo packet size. By implementing this rule we can get rid of configuring tx checksumming support per chip type. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: improve pci region handlingHeiner Kallweit
The region to be used is always the first of type IORESOURCE_MEM. We can implement this rule directly w/o having to specify which region is the first one per configuration entry. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: drop member txd_version from struct rtl8169_privateHeiner Kallweit
txd_version is used in rtl_init_one() only, so we can drop member txd_version from struct rtl8169_private. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: improve rtl8169_get_mac_versionHeiner Kallweit
Certain entries in array mac_info[] are redundant, so remove them: 0x7cf, 0x2c200000 (VER 33): matched by entry 0x7c8, 0x2c000000 0x7cf, 0x28300000 (VER 26): matched by entry 0x7c8, 0x28000000 0x7cf, 0x3cb00000 (VER 24): matched by entry 0x7c8, 0x3c800000 0x7cf, 0x3c400000 (VER 22): matched by entry 0x7c8, 0x3c000000 0x7cf, 0x38500000 (VER 17): matched by entry 0x7c8, 0x38000000 0x7cf, 0x44900000 (VER 39): matched by entry 0x7c8, 0x44800000 0x7cf, 0x40b00000 (VER 30): matched by entry 0x7c8, 0x40800000 0x7cf, 0x40a00000 (VER 30): matched by entry 0x7c8, 0x40800000 0x7cf, 0x34a00000 (VER 09): matched by entry 0x7c8, 0x34800000 0x7cf, 0x24a00000 (VER 09): matched by entry 0x7c8, 0x24800000 In addition don't mask out bits 30 and 29 when printing the XID. Most likely this is a relict from the times when the driver covered RTL8169 chip version only. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: don't display tp->mmio_addr addressHeiner Kallweit
For security reasons since commit ad67b74d2469 "printk: hash addresses printed with %p" %p doesn't display the full address any longer. We could switch to %px, but I think the pointer address doesn't provide a real benefit, so remove printing the hashed address. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: drop member opts1_mask from struct rtl8169_privateHeiner Kallweit
We can get rid of member opts1_mask and in addition save a few cpu cycles in the hot path of rtl_rx(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: change interrupt handler argument typeHeiner Kallweit
Code can be a little simplified by switching the interrupt handler argument type to struct rtl8169_private *. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: change argument type of counters handling functionsHeiner Kallweit
The counter handling functions don't deal with the net_device, so code can be simplified by changing the argument type to struct rtl8169_private *. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: change hw_start argument typeHeiner Kallweit
Code can be simplified by changing the argument type of hw_start callbacks from struct net_device * to struct rtl8169_private *. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove rtl8169_map_to_asicHeiner Kallweit
This function is very simple and used only once, so we can inline the two statements. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: replace rx_buf_sz with a constantHeiner Kallweit
rx_buf_sz is constant, so we don't have to pass it as parameter and in general can replace it with a constant. When working on this I noticed that also before in rtl_set_rx_max_size() a value of 0x4000 is set, what is not in line with the chip spec. According to the spec only bits 0..13 are used and we set an effective value of zero therefore. However, the driver still seems to work and due to potential side effects I'm reluctant to make a change. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove unneeded check in rtl8169_rx_fillHeiner Kallweit
rtl8169_rx_fill() is called only once and directly before the call array tp->Rx_databuff[] is filled with zero's. Therefore we don't need this check. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: improve rtl8169_init_ringHeiner Kallweit
This function doesn't use the net_device, therefore change the parameter to type struct rtl8169_private * to simplify the code. In addition we don't need the calculations in the memset statements, we can use the size of the arrays directly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: simplify rtl8169_alloc_rx_dataHeiner Kallweit
dev->dev.parent has the same value as tp_to_dev(tp) (set by SET_NETDEV_DEV() in rtl_init_one()) and we know it can't be NULL. This allows us to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: switch to napi_schedule_irqoffHeiner Kallweit
napi_schedule() is called from hard irq context, so we can switch to napi_schedule_irqoff() and avoid some overhead. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: use constant NAPI_POLL_WAITHeiner Kallweit
We can use generic constant NAPI_POLL_WAIT instead of defining an own constant for the same value. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: use skb_copy_to_linear_data in rtl8169_try_rx_copyHeiner Kallweit
Not a giant leap for mankind, but let's avoid the open-coded memcpy and use standard helper skb_copy_to_linear_data instead. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove member align from struct rtl_cfg_infoHeiner Kallweit
Since commit 6f0333b8fde4 "r8169: use 50% less ram for RX ring" member align isn't used any longer, so remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18r8169: remove unused member features from structHeiner Kallweit
Member features of struct rtl8169_private isn't used any longer since commit 6c6aa15fdea5 "r8169: improve interrupt handling", so remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18Merge branch 'netcp-K2G-SoC-support'David S. Miller
Murali Karicheri says: ==================== Add support for netcp driver on K2G SoC K2G SoC is another variant of Keystone family of SoCs. This patch series add support for NetCP driver on this SoC. The QMSS found on K2G SoC is a cut down version of the QMSS found on other keystone devices with less number of queues, internal link ram etc. The patch series has 2 patch sets that goes into the drivers/soc and the rest has to be applied to net sub system. Please review and merge if this looks good. K2G TRM is located at http://www.ti.com/lit/ug/spruhy8g/spruhy8g.pdf Thanks The boot logs on K2G ICE board (tftp boot over Ethernet and from mmc) https://pastebin.ubuntu.com/p/yvZ6drFhkW/ The boot logs on K2G GP board (tftp boot over Ethernet and from mmc) https://pastebin.ubuntu.com/p/QTr6K7s4Zp/ Also regressed boot on K2HK and K2L EVMs as we have modified GBE version detection logic (K2E uses same version of NetCP as in K2L. So regression on one of them is needed). Boot log on K2L and K2HK EVMs are at https://pastebin.ubuntu.com/p/N9DBdPjbvR/ This series applies to net-next master branch. Change history: v4 - ready for merge to net-next Folded the series "Add promiscous mode support in k2g network driver" into this. Fixed a typo in 5/11 (sgmii to rgmii) based on TI internal comment Reworked 4/11 and title changed to reflect additional changes to exclude sgmii configuration code for 2U cpsw. Use IS_SS_ID_2U() macro for customization. Added Reviewed-by from Rob Herring against 1/13 v3 - Addressed comments from Andrew Lunn and Grygorii Strashko against v2. v2 - Addressed following comments on initial version - split patch 3/5 to multiple patches from Andrew Lunn ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: k2g: add promiscuous mode supportWingMan Kwok
This patch adds support for promiscuous mode in k2g's network driver. When upper layer instructs to transition from non-promiscuous mode to promiscuous mode or vice versa K2G network driver needs to configure ALE accordingly so that in case of non-promiscuous mode, ALE will not flood all unicast packets to host port, while in promiscuous mode, it will pass all received unicast packets to host port. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: add api to support set rx mode in netcp modulesWingMan Kwok
This patch adds an API to support setting rx mode in netcp modules. If a netcp module needs to be notified when upper layer transitions from one rx mode to another and react accordingly, such a module will implement the new API set_rx_mode added in this patch. Currently rx modes supported are PROMISCUOUS and NON_PROMISCUOUS modes. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: support probe deferralMurali Karicheri
The netcp driver shouldn't proceed until the knav qmss and dma devices are ready. So return -EPROBE_DEFER if these devices are not ready. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18Revert "net: netcp: remove dead code from the driver"Murali Karicheri
As the probe sequence is not guaranteed contrary to the assumption of the commit 2d8e276a9030, same has to be reverted. commit 2d8e276a9030 ("net: netcp: remove dead code from the driver") Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: use of_get_phy_mode() to support different RGMII modesMurali Karicheri
The phy used for K2G allows for internal delays to be added optionally to the clock circuitry based on board desing. To add this support, enhance the driver to use of_get_phy_mode() to read the phy-mode from the phy device and pass the same to phy through of_phy_connect(). Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: re-use stats handling code for 2u hardwareMurali Karicheri
The stats block in 2u cpsw hardware is similar to the one on nu and hence handle it in a similar way by using a macro that includes 2u hardware as well. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: map vlan priorities to zero flowMurali Karicheri
The driver currently support only vlan priority zero. So map the vlan priorities to zero flow in hardware. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: use rgmii link status for 2u cpsw hardwareMurali Karicheri
Introduce rgmii link status to handle link state events for 2u cpsw hardware on K2G. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: add support for handling rgmii link interfaceMurali Karicheri
2u cpsw hardware on K2G uses rgmii link to interface with Phy. So add support for this interface in the code so that driver can be re-used for 2u hardware. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: make sgmii configuration conditionalMurali Karicheri
As a preparatory patch to add support for 2u cpsw hardware found on K2G SoC, make sgmii configuration conditional. This is required since 2u uses RGMII interface instead of SGMII and to allow for driver re-use. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18net: netcp: ethss: use macro for checking ss_version consistentlyMurali Karicheri
Driver currently uses macro for NU and XBE hardwrae, while other places for older hardware such as that on K2H/K SoC (version 1.4 of the cpsw hardware, it explicitly check for the ss_version inline. Add a new macro for version 1.4 and use it to customize code in the driver. While at it also fix similar issue with checking XBE version by re-using existing macro IS_SS_ID_XGBE(). Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18soc: ti: K2G: provide APIs to support driver probe deferralMurali Karicheri
This patch provide APIs to allow client drivers to support probe deferral. On K2G SoC, devices can be probed only after the ti_sci_pm_domains driver is probed and ready. As drivers may get probed at different order, any driver that depends on knav dma and qmss drivers, for example netcp network driver, needs to defer probe until knav devices are probed and ready to service. To do this, add an API to query the device ready status from the knav dma and qmss devices. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18soc: ti: K2G: enhancement to support QMSS in K2G NAVSSMurali Karicheri
Navigator Subsystem (NAVSS) available on K2G SoC has a cut down version of QMSS with less number of queues, internal linking ram with lesser number of buffers etc. It doesn't have status and explicit push register space as in QMSS available on other K2 SoCs. So define reg indices specific to QMSS on K2G. This patch introduces "ti,66ak2g-navss-qm" compatibility to identify QMSS on K2G NAVSS and to customize the dts handling code. Per Device manual, descriptors with index less than or equal to regions0_size is in region 0 in the case of K2 QMSS where as for QMSS on K2G, descriptors with index less than regions0_size is in region 0. So update the size accordingly in the regions0_size bits of the linking ram size 0 register. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17Merge branch 'ipv6-Separate-data-structures-for-FIB-and-data-path'David S. Miller
David Ahern says: ==================== net/ipv6: Separate data structures for FIB and data path IPv6 uses the same data struct for both control plane (FIB entries) and data path (dst entries). This struct has elements needed for both paths adding memory overhead and complexity (taking a dst hold in most places but an additional reference on rt6i_ref in a few). Furthermore, because of the dst_alloc tie, all FIB entries are allocated with GFP_ATOMIC. This patch set separates FIB entries from dst entries, better aligning IPv6 code with IPv4, simplifying the reference counting and allowing FIB entries added by userspace (not autoconf) to use GFP_KERNEL. It is first step to a number of performance and scalability changes. The end result of this patch set: - FIB entries (fib6_info): /* size: 208, cachelines: 4, members: 25 */ /* sum members: 207, holes: 1, sum holes: 1 */ - dst entries (rt6_info) /* size: 240, cachelines: 4, members: 11 */ Versus the the single rt6_info struct today for both paths: /* size: 320, cachelines: 5, members: 28 */ This amounts to a 35% reduction in memory use for FIB entries and a 25% reduction for dst entries. With respect to locking FIB entries use RCU and a single atomic counter with fib6_info_hold and fib6_info_release helpers to manage the reference counting. dst entries use only the traditional dst refcounts with dst_hold and dst_release. FIB entries for host routes are referenced by inet6_ifaddr and ifacaddr6. In both cases, additional holds are taken -- similar to what is done for devices. This set is the first of many changes to improve the scalability of the IPv6 code. Follow on changes include: - consolidating duplicate fib6_info references like IPv4 does with duplicate fib_info - moving fib6_info into a slab cache to avoid allocation roundups to power of 2 (the 208 size becomes a 256 actual allocation) - Allow FIB lookups without generating a dst (e.g., most rt6_lookup users just want to verify the egress device). Means moving dst allocation to the other side of fib6_rule_lookup which again aligns with IPv4 behavior - using separate standalone nexthop objects which have performance benefits beyond fib_info consolidation At this point I am not seeing any refcount leaks or underflows, no oops or bug_ons, or warnings from kasan, so I think it is ready for others to beat up on it finding errors in code paths I have missed. v2 changes - rebased to top of tree - improved commit message on patch 7 v1 changes - rebased to top of tree - fix memory leak of metrics as noted by Ido - MTU fixes based on pmtu tests (thanks Stefano Brivio for writing) RFC v2 changes - improved commit messages - move common metrics code from dst.c to net/ipv4/metrics.c (comment from DaveM) - address comments from Wei Wang and Martin KaFai Lau (let me know if I missed something) - fixes detected by kernel test robots + added fib6_metric_set to change metric on a FIB entry which could be pointing to read-only dst_default_metrics + 0day testing found a problem with an intermediate patch; added dst_hold_safe on rt->from. Code is removed 3 patches later - allow cacheinfo to handle NULL dst; means only expires is pushed to userspace ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17net/ipv6: Remove unused code and variables for rt6_infoDavid Ahern
Drop unneeded elements from rt6_info struct and rearrange layout to something more relevant for the data path. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>