linux.git - Linux kernel

Age	Commit message	Author
2017-04-24	lwtunnel: check return value of nla_nest_start	Pan Bian
	Function nla_nest_start() may return a NULL pointer on error. However, in function lwtunnel_fill_encap(), the return value of nla_nest_start() is not validated before it is used. This patch checks the return value of nla_nest_start() against NULL. Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'nfp-dma-adjust_head-fixes'	David S. Miller
	Jakub Kicinski says: ==================== nfp: DMA flags, adjust head and fixes This series takes advantage of Alex's DMA_ATTR_SKIP_CPU_SYNC to make XDP packet modifications "correct" from DMA API point of view. It also allows us to parse the metadata before we run XDP at no additional DMA sync cost. That way we can get rid of the metadata memcpy, and remove the last upstream user of bpf_prog->xdp_adjust_head. David's patch adds a way to read capabilities from the management firmware. There are also two net-next fixes. Patch 4 which fixes what seems to be a result of a botched rebase on my part. Patch 5 corrects locking when state of ethernet ports is being refreshed. v3: move the sync from alloc func to the actual give to hw func v2: sync rx buffers before giving them to the card (Alex) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	nfp: remove the refresh of all ports optimization	Jakub Kicinski
	The code refreshing the eth port state was trying to update state of all ports of the card. Unfortunately to safely walk the port list we would have to hold the port lock, which we can't due to lock ordering constraints against rtnl. Make the per-port sync refresh and async refresh of all ports completely separate routines. Fixes: 172f638c93dd ("nfp: add port state refresh") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	nfp: fix free list buffer size reporting	Jakub Kicinski
	XDP headroom should not be included in free list buffer size. Fixes: 6fe0c3b43804 ("nfp: add support for xdp_adjust_head()") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	nfp: add NSP routine to get static information	David Brunecz
	Retrieve identifying information from the NSP. For now it only contains versions of firmware subcomponents. Signed-off-by: David Brunecz <david.brunecz@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	nfp: parse metadata prepend before XDP runs	Jakub Kicinski
	Calling memcpy to shift metadata out of the way for XDP to run seems like an overkill. The most common metadata contents are 8 bytes containing type and flow hash. Simply parse the metadata before we run XDP. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	nfp: make use of the DMA_ATTR_SKIP_CPU_SYNC attr	Jakub Kicinski
	DMA unmap may destroy changes CPU made to the buffer. To make XDP run correctly on non-x86 platforms we should use the DMA_ATTR_SKIP_CPU_SYNC attribute. Thanks to using the attribute we can now push the sync operation to the common code path from XDP handler. A little bit of variable name reshuffling is required to bring the code back to readable state. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'cls_flower-MPLS'	David S. Miller
	Benjamin LaHaise says: ==================== flower: add MPLS matching support This patch series adds support for parsing MPLS flows in the flow dissector and the flower classifier. Each of the MPLS TTL, BOS, TC and Label fields can be used for matching. v2: incorporate style feedback, move #defines to linux/include/mpls.h Note: this omits Jiri's request to remove tabs between the type and field names in struct declarations. This would be inconsistent with numerous other struct definitions. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	cls_flower: add support for matching MPLS fields (v2)	Benjamin LaHaise
	Add support to the tc flower classifier to match based on fields in MPLS labels (TTL, Bottom of Stack, TC field, Label). Signed-off-by: Benjamin LaHaise <benjamin.lahaise@netronome.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Simon Horman <simon.horman@netronome.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Hadar Hen Zion <hadarh@mellanox.com> Cc: Gao Feng <fgao@ikuai8.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	flow_dissector: add mpls support (v2)	Benjamin LaHaise
	Add support for parsing MPLS flows to the flow dissector in preparation for adding MPLS match support to cls_flower. Signed-off-by: Benjamin LaHaise <benjamin.lahaise@netronome.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Simon Horman <simon.horman@netronome.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Eric Dumazet <jhs@mojatatu.com> Cc: Hadar Hen Zion <hadarh@mellanox.com> Cc: Gao Feng <fgao@ikuai8.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
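The four MPLS fields named in these two commits (Label, TC, Bottom of Stack, TTL) all live in one 32-bit label stack entry. The following is a minimal userspace sketch of how such an entry can be split into those fields, assuming the standard RFC 3032 layout (20-bit label, 3-bit TC, 1-bit bottom-of-stack, 8-bit TTL, network byte order). The names `MplsEntry` and `parse_mpls_entry` are hypothetical and are not taken from the kernel's flow dissector.

```python
from typing import NamedTuple


class MplsEntry(NamedTuple):
    """Hypothetical container for one decoded MPLS label stack entry."""
    label: int  # 20-bit label value
    tc: int     # 3-bit traffic class
    bos: int    # 1-bit bottom-of-stack flag
    ttl: int    # 8-bit time to live


def parse_mpls_entry(raw: bytes) -> MplsEntry:
    """Decode a 4-byte label stack entry, assuming the RFC 3032 layout."""
    if len(raw) != 4:
        raise ValueError("an MPLS label stack entry is exactly 4 bytes")
    word = int.from_bytes(raw, "big")  # entries are in network byte order
    return MplsEntry(
        label=(word >> 12) & 0xFFFFF,
        tc=(word >> 9) & 0x7,
        bos=(word >> 8) & 0x1,
        ttl=word & 0xFF,
    )
```

For example, `parse_mpls_entry(bytes.fromhex("0001f140"))` decodes to label 31, TC 0, bottom-of-stack set, TTL 64. A classifier matching on any subset of these fields only needs to compare the corresponding masked bits.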
2017-04-24	Merge branch 'tcp-fastopen-middlebox-fixes'	David S. Miller
	Wei Wang says: ==================== net/tcp_fastopen: Fix for various TFO firewall issues Currently there are still some firewall issues in the middlebox which make the middlebox drop packets silently for TFO sockets. This kind of issue is hard to be detected by the end client. This patch series tries to detect such issues in the kernel and disable TFO temporarily. More details about the issues and the fixes are included in the following patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	net/tcp_fastopen: Remove mss check in tcp_write_timeout()	Wei Wang
	Christoph Paasch from Apple found another firewall issue for TFO: After successful 3WHS using TFO, server and client starts to exchange data. Afterwards, a 10s idle time occurs on this connection. After that, firewall starts to drop every packet on this connection. The fix for this issue is to extend existing firewall blackhole detection logic in tcp_write_timeout() by removing the mss check. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	net/tcp_fastopen: Add snmp counter for blackhole detection	Wei Wang
	This counter records the number of times the firewall blackhole issue is detected and active TFO is disabled. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	net/tcp_fastopen: Disable active side TFO in certain scenarios	Wei Wang
	Middlebox firewall issues can potentially cause the server's data to be blackholed after a successful 3WHS using TFO. Following are the related reports from Apple:
	https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf
	Slide 31 identifies an issue where the client ACK to the server's data sent during a TFO'd handshake is dropped.
	    C ---> syn-data ---> S
	    C <--- syn/ack ----- S
	    C (accept & write)
	    C <---- data ------- S
	    C ----- ACK -> X     S
	                   [retry and timeout]
	https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf
	Slide 5 shows a similar situation in which the server's data gets dropped after 3WHS.
	    C ---- syn-data ---> S
	    C <--- syn/ack ----- S
	    C ---- ack --------> S
	    S (accept & write)
	    C?  X <- data ------ S
	                   [retry and timeout]
	This is the worst failure b/c the client can not detect such behavior to mitigate the situation (such as disabling TFO). Failing to proceed, the application (e.g., SSL library) may simply timeout and retry with TFO again, and the process repeats indefinitely.
	The proposed solution is to disable active TFO globally under the following circumstances:
	    1. client side TFO socket detects out of order FIN
	    2. client side TFO socket receives out of order RST
	We disable active side TFO globally for 1hr at first. Then if it happens again, we disable it for 2h, then 4h, 8h, ... And we reset the timeout to 1hr if a client side TFO socket not opened on loopback has successfully received data segs from server. And we examine this condition during close().
	The rationale behind it is that when such a firewall issue happens, the application running on the client should eventually close the socket as it is not able to get the data it is expecting. Or the application running on the server should close the socket as it is not able to receive any response from the client. In both cases, an out of order FIN or RST will get received on the client given that the firewall will not block them as no data are in those frames.
	And we want to disable active TFO globally as it helps if the middle box is very close to the client and most of the connections are likely to fail.
	Also, add a debug sysctl: tcp_fastopen_blackhole_detect_timeout_sec: the initial timeout to use when the firewall blackhole issue happens. This can be set and read. Setting it to 0 disables the active disable logic.
	Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
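The disable schedule described in this commit message (1h, then 2h, 4h, 8h, ..., reset to the initial value after a successful data receive, and a timeout of 0 switching the logic off) can be illustrated with a small sketch. This is not the kernel implementation: the class `TfoBlackholeState` and its method names are hypothetical, the 3600-second default is taken only from the "1hr at first" wording, and no upper bound on the doubling is modeled because the message does not state one.

```python
class TfoBlackholeState:
    """Hypothetical model of the doubling disable period for active TFO."""

    def __init__(self, base_timeout_sec: int = 3600) -> None:
        # Stands in for tcp_fastopen_blackhole_detect_timeout_sec.
        self.base_timeout_sec = base_timeout_sec
        # Number of consecutive blackhole detections since the last reset.
        self.disable_count = 0

    def on_blackhole_detected(self) -> int:
        """Return how long (in seconds) active TFO is disabled this time."""
        if self.base_timeout_sec == 0:
            return 0  # a zero timeout turns the disable logic off
        self.disable_count += 1
        # 1st detection: base, 2nd: 2 * base, 3rd: 4 * base, ...
        return self.base_timeout_sec << (self.disable_count - 1)

    def on_data_received(self) -> None:
        """A non-loopback TFO socket got data: start again from the base."""
        self.disable_count = 0
```

Calling `on_blackhole_detected()` four times in a row on a default instance yields 3600, 7200, 14400 and 28800 seconds. After `on_data_received()`, the next detection starts again at 3600.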
2017-04-24	Merge tag 'mlx5-updates-2017-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux	David S. Miller
	Saeed Mahameed says: ==================== mlx5-updates-2017-04-22 Sparse and compiler warnings fixes from Stephen Hemminger. From Roi Dayan and Or Gerlitz, Add devlink and mlx5 support for controlling E-Switch encapsulation mode, this knob will enable HW support for applying encapsulation/decapsulation to VF traffic as part of SRIOV e-switch offloading. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	net: add rcu locking when changing early demux	David Ahern
	systemd-sysctl is triggering a suspicious RCU usage message when net.ipv4.tcp_early_demux or net.ipv4.udp_early_demux is changed via a sysctl config file:
	[ 33.896184] ===============================
	[ 33.899558] [ ERR: suspicious RCU usage. ]
	[ 33.900624] 4.11.0-rc7+ #104 Not tainted
	[ 33.901698] -------------------------------
	[ 33.903059] /home/dsa/kernel-2.git/net/ipv4/sysctl_net_ipv4.c:305 suspicious rcu_dereference_check() usage!
	[ 33.905724] other info that might help us debug this:
	[ 33.907656] rcu_scheduler_active = 2, debug_locks = 0
	[ 33.909288] 1 lock held by systemd-sysctl/143:
	[ 33.910373] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff8123a370>] file_start_write+0x45/0x48
	[ 33.912407] stack backtrace:
	[ 33.914018] CPU: 0 PID: 143 Comm: systemd-sysctl Not tainted 4.11.0-rc7+ #104
	[ 33.915631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
	[ 33.917870] Call Trace:
	[ 33.918431] dump_stack+0x81/0xb6
	[ 33.919241] lockdep_rcu_suspicious+0x10f/0x118
	[ 33.920263] proc_configure_early_demux+0x65/0x10a
	[ 33.921391] proc_udp_early_demux+0x3a/0x41
	Add rcu locking to proc_configure_early_demux.
	Fixes: dddb64bcb3461 ("net: Add sysctl to toggle early demux for tcp and udp") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	David S. Miller
	Johan Hedberg says: ==================== pull request: bluetooth-next 2017-04-22 Here are some more Bluetooth patches (and one 802.15.4 patch) in the bluetooth-next tree targeting the 4.12 kernel. Most of them are pure fixes. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	net: netcp: fix spelling mistake: "memomry" -> "memory"	Colin Ian King
	Trivial fix to spelling mistake in dev_err message and rejoin line. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	net: atheros: atl1: use offset_in_page() macro	Geliang Tang
	Use offset_in_page() macro instead of open-coding. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'bnxt_en-misc-next'	David S. Miller
	Michael Chan says: ==================== bnxt_en: Updates for net-next. Miscellaneous updates include passing DCBX RoCE VLAN priority to firmware, checking one more new firmware flag before allowing DCBX to run on the host, adding 100Gbps speed support, adding check to disallow speed settings on Multi-host NICs, and a minor fix for reporting VF attributes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	bnxt_en: Restrict a PF in Multi-Host mode from changing port PHY configuration	Deepak Khungar
	This change restricts the PF in multi-host mode from setting any port level PHY configuration. The settings are controlled by firmware in Multi-Host mode. Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	bnxt_en: Check the FW_LLDP_AGENT flag before allowing DCBX host agent.	Michael Chan
	Check the additional flag in bnxt_hwrm_func_qcfg() before allowing DCBX to be done in host mode. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	bnxt_en: Add 100G link speed reporting for BCM57454 ASIC in ethtool	Deepak Khungar
	Added support for 100G link speed reporting for Broadcom BCM57454 ASIC in ethtool command. Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com> Signed-off-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	bnxt_en: Fix VF attributes reporting.	Michael Chan
	The .ndo_get_vf_config() is returning the wrong qos attribute. Fix the code that checks and reports the qos and spoofchk attributes. The BNXT_VF_QOS and BNXT_VF_LINK_UP flags should not be set by default during init. time. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	bnxt_en: Pass DCB RoCE app priority to firmware.	Michael Chan
	When the driver gets the RoCE app priority set/delete call through DCBNL, the driver will send the information to the firmware to set up the priority VLAN tag for RDMA traffic. [ New version using the common ETH_P_IBOE constant in if_ether.h ] Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	openvswitch: Add eventmask support to CT action.	Jarno Rajahalme
	Add a new optional conntrack action attribute OVS_CT_ATTR_EVENTMASK, which can be used in conjunction with the commit flag (OVS_CT_ATTR_COMMIT) to set the mask of bits specifying which conntrack events (IPCT_*) should be delivered via the Netfilter netlink multicast groups. Default behavior depends on the system configuration, but typically a lot of events are delivered. This can be very chatty for the NFNLGRP_CONNTRACK_UPDATE group, even if only some types of events are of interest. Netfilter core init_conntrack() adds the event cache extension, so we only need to set the ctmask value. However, if the system is configured without support for events, the setting will be skipped due to extension not being found. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
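As a rough illustration of what "the mask of bits specifying which conntrack events" means, the sketch below builds such a mask by setting one bit per event type. It assumes that each event contributes the bit `1 << IPCT_*` and that the first three `IPCT_*` values are NEW = 0, RELATED = 1 and DESTROY = 2; check both assumptions against the netfilter uapi headers before relying on them. `ConntrackEvent` and `event_mask` are hypothetical names, not part of Open vSwitch or Netfilter.

```python
from enum import IntEnum


class ConntrackEvent(IntEnum):
    """Assumed bit positions for a few IPCT_* event types."""
    NEW = 0      # assumed to mirror IPCT_NEW
    RELATED = 1  # assumed to mirror IPCT_RELATED
    DESTROY = 2  # assumed to mirror IPCT_DESTROY


def event_mask(*events: ConntrackEvent) -> int:
    """Combine event types into a bitmask with one bit set per event."""
    mask = 0
    for event in events:
        mask |= 1 << event
    return mask
```

Under these assumptions, a mask that requests only creation and destruction events would be `event_mask(ConntrackEvent.NEW, ConntrackEvent.DESTROY)`, which suppresses the chattier update events the commit message mentions.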
2017-04-24	openvswitch: Typo fix.	Jarno Rajahalme
	Fix typo in a comment. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'ibmvnic-updates-and-fixes'	David S. Miller
	Nathan Fontenot says: ==================== ibmvnic: Additional updates and bug fixes This set of patches is an additional set of updates and bug fixes to the ibmvnic driver which applies on top of the previous set of updates sent out on 4/19. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	ibmvnic: Free skb's in cases of failure in transmit	Thomas Falcon
	When an error is encountered during transmit we need to free the skb instead of returning TX_BUSY. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	ibmvnic: Validate napi exist before disabling them	Nathan Fontenot
	Validate that the napi structs exist before trying to disable them at driver close. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	ibmvnic: Add set_link_state routine for setting adapter link state	Nathan Fontenot
	Create a common routine for setting the link state for the vnic adapter. This update moves the sending of the crq and waiting for the link state response to a common place. The new routine also adds handling of resending the crq in cases of getting a partial success response. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	ibmvnic: Move initialization of the stats token to ibmvnic_open	Nathan Fontenot
	We should be initializing the stats token in the same place we initialize the other resources for the driver. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	ibmvnic: Only retrieve error info if present	Nathan Fontenot
	When handling a fatal error in the driver, there can be additional error information provided by the vios. This information is not always present, so only retrieve the additional error information when present. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	ibmvnic: Insert header on VLAN tagged received frame	Murilo Fossa Vicentini
	This patch addresses a modification in the PAPR+ specification which now defines a previously reserved value for vNIC capabilities. It indicates whether the system firmware performs a VLAN header stripping on all VLAN tagged received frames, in case it does, the behavior expected is for the ibmvnic driver to be responsible for inserting the VLAN header. Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com> Signed-off-by: Murilo Fossa Vicentini <muvic@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	ibmvnic: Set real number of rx queues	Thomas Falcon
	Along with 5 TX queues, 5 RX queues are allocated at the beginning of device probe. However, only the real number of TX queues is set. Configure the real number of RX queues as well. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'packet-fanout-unique-id'	David S. Miller
	Mike Maloney says: ==================== packet: Add option to create new fanout group with unique id. Fanout uses a per net global namespace. A process that intends to create a new fanout group can accidentally join an existing group. It is not possible to detect this. Add a socket option to specify on the first call to setsockopt(..., PACKET_FANOUT, ...) to ensure that a new group is created. Also add tests. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	selftests/net: add tests for PACKET_FANOUT_FLAG_UNIQUEID	Mike Maloney
	Create two groups with PACKET_FANOUT_FLAG_UNIQUEID, add a socket to one. Ensure that the groups can only be joined if all options are consistent with the original except for this flag. Signed-off-by: Mike Maloney <maloney@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	packet: add PACKET_FANOUT_FLAG_UNIQUEID to assign new fanout group id.	Mike Maloney
	Fanout uses a per net global namespace. A process that intends to create a new fanout group can accidentally join an existing group. It is not possible to detect this. Add socket option PACKET_FANOUT_FLAG_UNIQUEID. When specified the supplied fanout group id must be set to 0, and the kernel chooses an id that is not already in use. This is an ephemeral flag so that other sockets can be added to this group using setsockopt, but NOT specifying this flag. The current getsockopt(..., PACKET_FANOUT, ...) can be used to retrieve the new group id. We assume that there are not a lot of fanout groups and that this is not a high frequency call. The method assigns ids starting at zero and increases until it finds an unused id. It keeps track of the last assigned id, and uses it as a starting point to find new ids. Signed-off-by: Mike Maloney <maloney@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
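The id search described in this commit message (start from the id after the last one assigned, walk upward until an unused id is found) can be sketched in userspace as follows. This is an illustration, not the kernel code: `FanoutIdAllocator` is a hypothetical name, the 16-bit id space reflects the 16-bit group id field of the PACKET_FANOUT argument, and the wraparound and the "no free id" result are assumptions about how exhaustion would be handled.

```python
from typing import Optional, Set

ID_SPACE = 1 << 16  # fanout group ids are assumed to be 16-bit values


class FanoutIdAllocator:
    """Hypothetical model of choosing an unused fanout group id."""

    def __init__(self) -> None:
        self.in_use: Set[int] = set()  # ids of existing fanout groups
        self.next_id = 0               # where the next search starts

    def allocate(self) -> Optional[int]:
        """Return an unused id, or None if every id is taken."""
        candidate = self.next_id
        for _ in range(ID_SPACE):
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                # Remember the position so the next search continues here.
                self.next_id = (candidate + 1) % ID_SPACE
                return candidate
            candidate = (candidate + 1) % ID_SPACE
        return None

    def release(self, group_id: int) -> None:
        """Mark a group id as free again once its group is gone."""
        self.in_use.discard(group_id)
```

Because the search resumes after the last assigned id, a released id is not handed out again immediately. With few fanout groups and infrequent calls, as the message assumes, the linear walk stays cheap.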
2017-04-24	selftests/net: cleanup unused parameter in psock_fanout	Mike Maloney
	sock_fanout_open no longer sets the size of packet_socket ring, so stop passing the parameter. Signed-off-by: Mike Maloney <maloney@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	mdio_bus: Issue GPIO RESET to PHYs.	Roger Quadros
	Some boards [1] leave the PHYs at an invalid state during system power-up or reset thus causing unreliability issues with the PHY which manifests as PHY not being detected or link not functional. To fix this, these PHYs need to be RESET via a GPIO connected to the PHY's RESET pin. Some boards have a single GPIO controlling the PHY RESET pin of all PHYs on the bus whereas some others have separate GPIOs controlling individual PHY RESETs. In both cases, the RESET de-assertion cannot be done in the PHY driver as the PHY will not probe till its reset is de-asserted. So do the RESET de-assertion in the MDIO bus driver. [1] - am572x-idk, am571x-idk, a437x-idk Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'VSOCK-add-vsockmon'	David S. Miller
	Stefan Hajnoczi says: ==================== VSOCK: vsockmon virtual device to monitor AF_VSOCK sockets.
	v5:
	 * Change vsock_deliver_tap() API to avoid unnecessary skb creation [Jorgen]
	 * Fix skb leak when no taps are registered [Jorgen]
	 * s/cpu_to_le16(pkt->hdr.op)/le16_to_cpu(pkt->hdr.op)/ [Michael]
	 * Add af_vsock_tap.c and vsockmon.[ch] to MAINTAINERS
	 * checkpatch.pl and sparse fixes
	v4:
	 * Add explicit reserved padding field to struct af_vsockmon_hdr and drop __attribute__((packed)) [Michael, DaveM]
	 * Call synchronize_net() before module_put() [Michael]
	v3:
	 * Hook virtio_transport.c (guest driver), not just drivers/vhost/vsock.c (host driver)
	 * Fix DEFAULT_MTU macro definition [Zhu Yanjun]
	 * Rename af_vsockmon_hdr->t field ->transport for clarity
	 * Update .ndo_get_stats64() return type since it has changed
	 * Include missing <linux/module.h> header in af_vsock_tap.c
	This is a continuation of Gerard Garcia's work on the vsockmon packet capture interface for AF_VSOCK. Packet capture is an essential feature for network communication. Gerard began addressing this feature gap in his Google Summer of Code 2016 project. I have cleaned up, rebased, and retested the v2 series he posted previously.
	The design follows the nlmon packet capture interface closely. This is because vsock has the same problem as netlink: there is no netdev on which packets can be captured. The nlmon driver is a synthetic netdev purely for the purpose of enabling packet capture. We follow the same approach here with vsockmon.
	See include/uapi/linux/vsockmon.h in this series for details on the packet layout.
	How to try it:
	1. Build tcpdump with vsockmon patches:
	    $ git clone -b vsock https://github.com/stefanha/libpcap
	    $ (cd libpcap && ./configure && make)
	    $ git clone -b vsock https://github.com/stefanha/tcpdump
	    $ (cd tcpdump && ./configure && make)
	2. Build nc-vsock (a netcat-like tool):
	    $ git clone https://github.com/stefanha/nc-vsock
	    $ (cd nc-vsock && make)
	3. Launch a virtual machine:
	    # modprobe vhost_vsock
	    # qemu-system-x86_64 -M accel=kvm -m 1024 -cpu host \
	        -drive if=virtio,file=test.img,format=raw \
	        -device vhost-vsock-pci,guest-cid=3
	    (Assumes guest is running a kernel with this patch)
	4. Capture AF_VSOCK traffic in guest and/or host:
	    # modprobe vsockmon
	    # ip link add type vsockmon
	    # ip link set vsockmon0 up
	    # tcpdump -i vsockmon0 -vvv
	5. Communicate!
	    (host)$ nc-vsock -l 1234
	    (guest)$ nc-vsock 2 1234
	==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	VSOCK: Add virtio vsock vsockmon hooks	Gerard Garcia
	The virtio drivers deal with struct virtio_vsock_pkt. Add virtio_transport_deliver_tap_pkt(pkt) for handing packets to the vsockmon device. We call virtio_transport_deliver_tap_pkt(pkt) from net/vmw_vsock/virtio_transport.c and drivers/vhost/vsock.c instead of common code. This is because the drivers may drop packets before handing them to common code - we still want to capture them. Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	VSOCK: Add vsockmon device	Gerard Garcia
	Add vsockmon virtual network device that receives packets from the vsock transports and exposes them to user space. Based on the nlmon device. Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	VSOCK: Add vsockmon tap functions	Gerard Garcia
	Add tap functions that can be used by the vsock transports to deliver packets to vsockmon virtual network devices. Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge tag 'wireless-drivers-next-for-davem-2017-04-21' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next	David S. Miller
	Kalle Valo says: ==================== wireless-drivers-next patches for 4.12 Quite a lot of patches for rtlwifi and iwlwifi this time, but changes also for other active wireless drivers. Major changes: ath9k * add support for Dell Wireless 1601 PCI device * add debugfs file to manually override noise floor ath10k * bump up FW API to 6 for a new QCA6174 firmware branch wil6210 * support 8 kB RX buffers iwlwifi * work to support A000 devices continues * add support for FW API 30 * add Geographical and Dynamic Specific Absorption Rate (SAR) support * support a few new PCI device IDs rtlwifi * work on adding Bluetooth coexistence support, not finished yet ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	Merge branch 'qed-dcb-enhancements'	David S. Miller
	Sudarsana Reddy Kalluru says: ==================== qed*: Dcbx/dcbnl enhancements. The series has a set of enhancements for the dcbx/dcbnl implementation of the qed/qede drivers. - Patches (1) & (3) capture the semantic and debug changes. - Patch (2) adds the driver support for populating RoCEv2 dcb data. - Patch (4) adds the required support for reading/configuring the IEEE selection field (SF). - Patch (5) adds the support for configuring the static dcbx mode. Please consider applying this to 'net-next' branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	qed: Add support for static dcbx.	sudarsana.kalluru@cavium.com
	The patch adds driver support for static/local dcbx mode. In this mode adapter brings up the dcbx link with locally configured parameters instead of performing the dcbx negotiation with the peer. The feature is useful when peer device/switch doesn't support dcbx. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	qed: Support dcbnl IEEE selector field.	sudarsana.kalluru@cavium.com
	Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	qed: Add additional DCBx debug messages.	sudarsana.kalluru@cavium.com
	Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24	qed: Separate RoCE DCBx support for V2.	sudarsana.kalluru@cavium.com
	In the older firmware there was no distinction between RoCE and RoCEv2 whereas the newer firmware (8.15.3.0) allows us to configure each independently. Driver need to populate the RoCEv2 data in its specific structure. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>