aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-03-02RDS: IB: add Fastreg MR (FRMR) detection supportsantosh.shilimkar@oracle.com
Discovere Fast Memmory Registration support using IB device IB_DEVICE_MEM_MGT_EXTENSIONS. Certain HCA might support just FRMR or FMR or both FMR and FRWR. In case both mr type are supported, default FMR is used. Default MR is still kept as FMR against what everyone else is following. Default will be changed to FRMR once the RDS performance with FRMR is comparable with FMR. The work is in progress for the same. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: IB: add mr reused statssantosh.shilimkar@oracle.com
Add MR reuse statistics to RDS IB transport. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: IB: handle the RDMA CM time wait eventsantosh.shilimkar@oracle.com
Drop the RDS connection on RDMA_CM_EVENT_TIMEWAIT_EXIT so that it can reconnect and resume. While testing fastreg, this error happened in couple of tests but was getting un-noticed. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: IB: add connection info to ibmrsantosh.shilimkar@oracle.com
Preperatory patch for FRMR support. From connection info, we can retrieve cm_id which contains qp handled needed for work request posting. We also need to drop the RDS connection on QP error states where connection handle becomes useful. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: IB: move FMR code to its own filesantosh.shilimkar@oracle.com
No functional change. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: IB: create struct rds_ib_fmrsantosh.shilimkar@oracle.com
Keep fmr related filed in its own struct. Fastreg MR structure will be added to the union. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: IB: Re-organise ibmr codesantosh.shilimkar@oracle.com
No functional changes. This is in preperation towards adding fastreg memory resgitration support. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: IB: Remove the RDS_IB_SEND_OP dependencysantosh.shilimkar@oracle.com
This helps to combine asynchronous fastreg MR completion handler with send completion handler. No functional change. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02MAINTAINERS: update RDS entrysantosh.shilimkar@oracle.com
Acked-by: Chien Yen <chien.yen@oracle.com> Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: Add support for SO_TIMESTAMP for incoming messagessantosh.shilimkar@oracle.com
The SO_TIMESTAMP generates time stamp for each incoming RDS messages User app can enable it by using SO_TIMESTAMP setsocketopt() at SOL_SOCKET level. CMSG data of cmsg type SO_TIMESTAMP contains the time stamp in struct timeval format. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02RDS: Drop stale iWARP RDMA transportsantosh.shilimkar@oracle.com
RDS iWarp support code has become stale and non testable. As indicated earlier, am dropping the support for it. If new iWarp user(s) shows up in future, we can adapat the RDS IB transprt for the special RDMA READ sink case. iWarp needs an MR for the RDMA READ sink. Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02Merge branch 'qed-next'David S. Miller
Yuval Mintz says: ==================== qed: update series This patch series tries to improve general configuration by changing configuration to better suit B0 boards and allow more available resources to each physical function. In additition, it contains some small fixes and semantic changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02qed: Remove unused NVM vendor IDYuval Mintz
Remove 2 unused fields from driver code. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02qed: Fix error flow on slowpath startYuval Mintz
In case of problems when initializing the chip, the error flows aren't being properly done. Specifically, it's possible that the chip would be left in a configuration allowing it [internally] to access the host memory, causing fatal problems in the device that would require power cycle to overcome. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02qed: Move statistics to L2 codeYuval Mintz
Current statistics logic is meant for L2, not for all future protocols. Move this content to the proper designated file. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02qed: Support B0 instead of A0Yuval Mintz
BB_A0 is a development model that is will not reach actual clients. In fact, future firmware would simply fail to initialize such chip. This changes the configuration into B0 instead of A0, and adds a safeguard against the slim chance someone would actually try this with an A0 adapter in which case probe would gracefully fail. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02qed: Correct BAR sizes for older MFWRam Amrani
Driver learns the inner bar sized from a register configured by management firmware, but older versions are not setting this register. But since we know which values were configured back then, use them instead. Signed-off-by: Ram Amrani <Ram.Amrani@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02batman-adv: clarify CFG80211 dependencyArnd Bergmann
The driver calls cfg80211_get_station, which may be part of a module, so we must not enable BATMAN_ADV_BATMAN_V if BATMAN_ADV=y and CFG80211=m: net/built-in.o: In function `batadv_v_elp_get_throughput': (text+0x5c62c): undefined reference to `cfg80211_get_station' This clarifies the dependency to cover all combinations. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: c833484e5f38 ("batman-adv: ELP - compute the metric based on the estimated throughput") Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02wan: lmc: Switch to using managed resourcesAmitoj Kaur Chawla
Use managed resource functions devm_kzalloc and pcim_enable_device to simplify error handling. Subsequently, remove unnecessary kfree, pci_disable_device and pci_release_regions. To be compatible with the change, various gotos are replaced with direct returns and unneeded labels are dropped. Also, `sc` was only being freed in the probe function and not the remove function before the change. By using devm_kzalloc this patch also fixes this memory leak. Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Antonio Quartulli says: ==================== batman-adv 20160229 this is our (hopefully) latest batch of patches intended for net-next. With this patchset we finally introduce B.A.T.M.A.N. V: the latest version of our routing protocol. Technical documentation describing the protocol in more detail can be found in our wiki[1][2][3][4]. For what concerns this pull request, you can find the high level description right below. [1] https://www.open-mesh.org/projects/batman-adv/wiki/BATMAN_V [2] https://www.open-mesh.org/projects/batman-adv/wiki/OGMv2 [3] https://www.open-mesh.org/projects/batman-adv/wiki/ELP [4] https://www.open-mesh.org/projects/batman-adv/wiki/BATMAN_V_Tests ... With this patchset we finally introduce our new routing protocol: B.A.T.M.A.N. V. Its implementation started quite some years ago, but due to the big changes being introduced it took a while to be discussed, designed, worked, re-worked, tested and debugged (well, we're never done with the latest). The entire operation has basically been a team work involving all the core contributors together with other people interested in the project. The new protocol is divided into two main subcomponents, called respectively ELP and OGMv2. The former is in charge of dealing with the neighbour discovery and link quality estimation, while the latter implements the algorithm that spreads the metrics around the network and computes optimal paths. The biggest change introduced with B.A.T.M.A.N. V is the new metric: the protocol won't rely on packet loss anymore, but it will use the estimated throughput extracted directly from the wifi driver (when available) by querying cfg80211. Batman-adv will also send some unicast probing packets when an interface is not used for payload traffic to make sure that such values are current. The new protocol can be compiled-in or not like other features we have and when selected will pull in CFG80211 as dependency for the reason described above. Thanks to the big work brought up in the past by Marek Lindner, batman-adv can easily deal several protocol implementations, therefore compiling in this new version does not exclude the older. This means that the user is offered the option to choose the protocol when creating the mesh interface (default is the old one to keep backward compatibility). Along with the protocol there are some sysfs knobs that are introduced to fine tune some of its behaviours, but users are recommended to keep the default values unless they know what they are doing. The last patch is about advertising our own patchwork platform (thanks to Sven Eckelmann for having set that up!) in the MAINTAINERS file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: pktgen: use reset to set mac headerZhang Shengju
Since offset is zero, it's not necessary to use set function. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01sch_mqprio: Fix build with older gcc.David S. Miller
CC [M] net/sched/sch_mqprio.o net/sched/sch_mqprio.c: In function ?mqprio_init?: net/sched/sch_mqprio.c:145: error: unknown field ?tc? specified in initializer net/sched/sch_mqprio.c:145: warning: missing braces around initializer net/sched/sch_mqprio.c:145: warning: (near initialization for ?tc.<anonymous>?) make[2]: *** [net/sched/sch_mqprio.o] Error 1 make[1]: *** [net/sched] Error 2 make: *** [net] Error 2 Several people reported this, surround the unnamed union member initialization with braces to fix. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01of_mdio: kill useless variable in of_mdiobus_register()Sergei Shtylyov
of_mdiobus_register() declares the 'paddr' variable to hold the result of the of_get_property() but only uses it once after that while the function can be called directly from the *if* statement. Remove that variable and switch to calling of_find_property() instead since we don't care about the "reg" property's value anyway... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Merge branch 'qed-next'David S. Miller
Yuval Mintz says: ==================== qed: Attention support patch series Until now we've only enabled attention generation for the sake of management firmware indications [required for link notifications]. This series enables [almost] all the attention sources of the HW, currently for the sake of logging information relating to issues experienced by HW. In future, infrastructure laid here would also be used for the sake of the recovery process. The first patch in the series is a semantic alignemnt of the code. The later 3 patches incremently create said infrastructure and enrich the logged information. Notice #3 contains quite a bit of structures [consisting of ~1K lines] that will eventually be removed and incorporated in the binary fw file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01qed: Print additional HW attention infoYuval Mintz
This patch utilizes the attention infrastructure to log additional information that relates only to specific HW blocks. For some of those HW blocks, it also stops automatically disabling the attention generation as the attention is considered benign and thus should only be logged; No fear of it flooding the system. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01qed: Print HW attention reasonsYuval Mintz
Each HW block contains common information about attention reasons, raising a bit for each one of the different sub-reasons that caused it to raise an attention. This patch extends the infrastructure by allowing logging of the various reasons causing the HW blocks to generate an attention. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01qed: Add support for HW attentionsYuval Mintz
HW is capable of generating attentnions for a multitude of reasons, but current driver is enabling attention generation only for management firmware [required for link notifications]. This patch enables almost all of the possible reasons for HW attentions, logging the HW block generating the attention and preventing further attentions from that source [to prevent possible attention flood]. It also lays the infrastructure for additional exploration of the various attentions. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01qed: Semantic refactoring of interrupt codeYuval Mintz
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: remove skb_sender_cpu_clear()WANG Cong
After commit 52bd2d62ce67 ("net: better skb->sender_cpu and skb->napi_id cohabitation") skb_sender_cpu_clear() becomes empty and can be removed. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Merge branch 'mlx5-next'David S. Miller
Saeed Mahameed says: ==================== mlx5 driver updates This series includes some bug fixes and updates for the mlx5 core and ethernet driver. From Gal, two fixes that protects the update CQ moderation flows when it is not allowed. From Moshe, two fixes for the core and ethernet driver in non-cached(NC) and write combining(WC) buffers mappings, which prevents the driver from double memory mappings. From Or, reduce the firmware command completion timeout. From Tariq, several small trivial fixes. Changes from v0: - "Fix global UAR mapping" commit messages updated to explain ARCH_HAS_IOREMAP_WC usage. - rebased to commit 8d3f2806f8fb 'Merge branch ethtool-ksettings' Changes from v1: - Removed ARCH_HAS_IOREMAP_WC config flag from "Fix global UAR mapping" commit, as it was not accurate to use it. - Squashed "Fix global UAR mapping" and "net/mlx5: Avoid double mapping of io mapped memory" - Added more info for "Fix global UAR mapping" in commit message Changes from v2: - None. resubmission per Dave's request due to two parallel submissions to mlx5 driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5: Fix global UAR mappingMoshe Lazer
Avoid double mapping of io mapped memory, Device page may be mapped to non-cached(NC) or to write-combining(WC). The code before this fix tries to map it both to WC and NC contrary to what stated in Intel's software developer manual. Here we remove the global WC mapping of all UARS "dev->priv.bf_mapping", since UAR mapping should be decided per UAR (e.g we want different mappings for EQs, CQs vs QPs). Caller will now have to choose whether to map via write-combining API or not. mlx5e SQs will choose write-combining in order to perform BlueFlame writes. Fixes: 88a85f99e51f ('TX latency optimization to save DMA reads') Signed-off-by: Moshe Lazer <moshel@mellanox.com> Reviewed-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5: Make command timeout way shorterOr Gerlitz
The command timeout is terribly long, whole two hours. Make it 60s so if things do go wrong, the user gets feedback in relatively short time, so they can take corrective actions and/or investigate using tools and such. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5e: Don't modify CQ before it was createdGal Pressman
Calling mlx5e_set_coalesce while the interface is down will result in modifying CQs that don't exist. Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5e: Don't try to modify CQ moderation if it is not supportedGal Pressman
If CQ moderation is not supported by the device, print a warning on netdevice load, and return error when trying to modify/query cq moderation via ethtool. Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5e: Set drop RQ's necessary parameters onlyTariq Toukan
By its role, there is no need to set all the other parameters for the drop RQ. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5e: Move common case counters within sq_stats structTariq Toukan
For data cache locality considerations, we moved the nop and csum_offload_inner within sq_stats struct as they are more commonly accessed in xmit path. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5e: Changed naming convention of tx queues in ethtool statsTariq Toukan
Instead of the pair (channel, tc), we now use a single number that goes over all tx queues of a TC, for all TCs. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5e: Placement changed for carrier state updatesTariq Toukan
More proper to declare carrier state UP only after the channels are ready for traffic. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net/mlx5e: Replace async events spinlock with synchronize_irq()Tariq Toukan
We only need to flush the irq handler to make sure it does not queue a work into the global work queue after we start to flush it. So using synchronize_irq() is more appropriate than a spin lock. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: ipv6/l3mdev: Move host route on saved address if necessaryDavid Ahern
Commit f1705ec197e70 allows IPv6 addresses to be retained on a link down. The address can have a cached host route which can point to the wrong FIB table if the L3 enslavement is changed (e.g., route can point to local table instead of VRF table if device is added to an L3 domain). On link up check the table of the cached host route against the FIB table associated with the device and correct if needed. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Merge branch 'net-timestamps-y2038'David S. Miller
Deepa Dinamani says: ==================== Convert network timestamps to be y2038 safe Introduction: The series is aimed at transitioning network timestamps to being y2038 safe. All patches can be reviewed and merged independently. Socket timestamps and ioctl calls will be handled separately. Thanks to Arnd Bergmann for discussing solution options with me. Solution: Data type struct timespec is not y2038 safe. Replace timespec with struct timespec64 which is y2038 safe. Changes v1 -> v2: Move and rename inet_current_time() as discussed Squash patches 1 and 2 Reword commit text for patch 2/3 Carry over review tags ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: sctp: Convert log timestamps to be y2038 safeDeepa Dinamani
SCTP probe log timestamps use struct timespec which is not y2038 safe. Use struct timespec64 which is 2038 safe instead. Use monotonic time instead of real time as only time differences are logged. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-sctp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: ipv4: tcp_probe: Replace timespec with timespec64Deepa Dinamani
TCP probe log timestamps use struct timespec which is not y2038 safe. Even though timespec might be good enough here as it is used to represent delta time, the plan is to get rid of all uses of timespec in the kernel. Replace with struct timespec64 which is y2038 safe. Prints still use unsigned long format and type. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: ipv4: Convert IP network timestamps to be y2038 safeDeepa Dinamani
ICMP timestamp messages and IP source route options require timestamps to be in milliseconds modulo 24 hours from midnight UT format. Add inet_current_timestamp() function to support this. The function returns the required timestamp in network byte order. Timestamp calculation is also changed to call ktime_get_real_ts64() which uses struct timespec64. struct timespec64 is y2038 safe. Previously it called getnstimeofday() which uses struct timespec. struct timespec is not y2038 safe. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Merge branch 'ife'David S. Miller
Jamal Hadi Salim says: ==================== net_sched: Add support for IFE action As agreed at netconf in Seville, here's the patch finally (1 year was just too long to wait for an ethertype. Now we are just going have the user configure one). Described in netdev01 paper: "Distributing Linux Traffic Control Classifier-Action Subsystem" Authors: Jamal Hadi Salim and Damascene M. Joachimpillai The original motivation and deployment of this work was to horizontally scale packet processing at scope of a chasis or rack. This means one could take a tc policy and split it across machines connected over L2. The paper refers to this as "pipeline stage indexing". Other use cases which evolved out of the original intent include but are not limited to carrying OAM information, carrying exception handling metadata, carrying programmed authentication and authorization information, encapsulating programmed compliance information, service IDs etc. Read the referenced paper for more details. The architecture allows for incremental updates for new metadatum support to cover different use cases. This patch set includes support for basic skb metadatum. Followup patches will have more examples of metadata and other features. v4 changes: Integrate more feedback from Cong v3 changes: Integrate with the new namespace changes Remove skbhash and queue mapping metadata (but keep their claim for ids) Integrate feedback from Cong Integrate feedback from Daniel v2 changes: Remove module option for an upper bound of metadata Integrate feedback from Cong Integrate feedback from Daniel ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Support to encoding decoding skb prio on IFE actionJamal Hadi Salim
Example usage: Set the skb priority using skbedit then allow it to be encoded sudo tc qdisc add dev $ETH root handle 1: prio sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \ u32 match ip protocol 1 0xff flowid 1:2 \ action skbedit prio 17 \ action ife encode \ allow prio \ dst 02:15:15:15:15:15 Note: You dont need the skbedit action if you are already encoding the skb priority earlier. A zero skb priority will not be sent Alternative hard code static priority of decimal 33 (unlike skbedit) then mark of 0x12 every time the filter matches sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 \ u32 match ip protocol 1 0xff flowid 1:2 \ action ife encode \ type 0xDEAD \ use prio 33 \ use mark 0x12 \ dst 02:15:15:15:15:15 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Support to encoding decoding skb mark on IFE actionJamal Hadi Salim
Example usage: Set the skb using skbedit then allow it to be encoded sudo tc qdisc add dev $ETH root handle 1: prio sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \ u32 match ip protocol 1 0xff flowid 1:2 \ action skbedit mark 17 \ action ife encode \ allow mark \ dst 02:15:15:15:15:15 Note: You dont need the skbedit action if you are already encoding the skb mark earlier. A zero skb mark, when seen, will not be encoded. Alternative hard code static mark of 0x12 every time the filter matches sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 \ u32 match ip protocol 1 0xff flowid 1:2 \ action ife encode \ type 0xDEAD \ use mark 0x12 \ dst 02:15:15:15:15:15 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01introduce IFE actionJamal Hadi Salim
This action allows for a sending side to encapsulate arbitrary metadata which is decapsulated by the receiving end. The sender runs in encoding mode and the receiver in decode mode. Both sender and receiver must specify the same ethertype. At some point we hope to have a registered ethertype and we'll then provide a default so the user doesnt have to specify it. For now we enforce the user specify it. Lets show example usage where we encode icmp from a sender towards a receiver with an skbmark of 17; both sender and receiver use ethertype of 0xdead to interop. YYYY: Lets start with Receiver-side policy config: xxx: add an ingress qdisc sudo tc qdisc add dev $ETH ingress xxx: any packets with ethertype 0xdead will be subjected to ife decoding xxx: we then restart the classification so we can match on icmp at prio 3 sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \ u32 match u32 0 0 flowid 1:1 \ action ife decode reclassify xxx: on restarting the classification from above if it was an icmp xxx: packet, then match it here and continue to the next rule at prio 4 xxx: which will match based on skb mark of 17 sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \ u32 match ip protocol 1 0xff flowid 1:1 \ action continue xxx: match on skbmark of 0x11 (decimal 17) and accept sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \ handle 0x11 fw flowid 1:1 \ action ok xxx: Lets show the decoding policy sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead xxx: filter pref 2 u32 filter pref 2 u32 fh 800: ht divisor 1 filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 (rule hit 0 success 0) match 00000000/00000000 at 0 (success 0 ) action order 1: ife decode action reclassify index 1 ref 1 bind 1 installed 14 sec used 14 sec type: 0x0 Metadata: allow mark allow hash allow prio allow qmap Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 xxx: Observe that above lists all metadatum it can decode. Typically these submodules will already be compiled into a monolithic kernel or loaded as modules YYYY: Lets show the sender side now .. xxx: Add an egress qdisc on the sender netdev sudo tc qdisc add dev $ETH root handle 1: prio xxx: xxx: Match all icmp packets to 192.168.122.237/24, then xxx: tag the packet with skb mark of decimal 17, then xxx: Encode it with: xxx: ethertype 0xdead xxx: add skb->mark to whitelist of metadatum to send xxx: rewrite target dst MAC address to 02:15:15:15:15:15 xxx: sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 u32 \ match ip dst 192.168.122.237/24 \ match ip protocol 1 0xff \ flowid 1:2 \ action skbedit mark 17 \ action ife encode \ type 0xDEAD \ allow mark \ dst 02:15:15:15:15:15 xxx: Lets show the encoding policy sudo tc -s filter ls dev $ETH parent 1: protocol ip xxx: filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2 (rule hit 0 success 0) match c0a87aed/ffffffff at 16 (success 0 ) match 00010000/00ff0000 at 8 (success 0 ) action order 1: skbedit mark 17 index 6 ref 1 bind 1 Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 action order 2: ife encode action pipe index 3 ref 1 bind 1 dst MAC: 02:15:15:15:15:15 type: 0xDEAD Metadata: allow mark Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 xxx: test by sending ping from sender to destination Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Merge tag 'mac80211-next-for-davem-2016-02-26' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Here's another round of updates for -next: * big A-MSDU RX performance improvement (avoid linearize of paged RX) * rfkill changes: cleanups, documentation, platform properties * basic PBSS support in cfg80211 * MU-MIMO action frame processing support * BlockAck reordering & duplicate detection offload support * various cleanups & little fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01Merge branch 'bridge-mcast-tmp-router-port'David S. Miller
Nikolay Aleksandrov says: ==================== bridge: mcast: add support for temp router port This set adds support for temporary router port which doesn't depend only on the incoming queries. It can be refreshed by setting multicast_router to the same value (3). The first two patches are minor changes that prepare the code for the third which adds this new type of router port. In order to be able to dump its information the mdb router port format is changed in patch 04 and extended similar to how mdb entries format was done recently. The related iproute2 changes will be posted if this is accepted. v2: set val first and adjust router type later in patch 01, patch 03 was split in 2 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>