aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-12-07net: mvmdio: disable/unprepare clocks in EPROBE_DEFER caseTobias Jordan
add appropriate calls to clk_disable_unprepare() by jumping to out_mdio in case orion_mdio_probe() returns -EPROBE_DEFER. Found by Linux Driver Verification project (linuxtesting.org). Fixes: 3d604da1e954 ("net: mvmdio: get and enable optional clock") Signed-off-by: Tobias Jordan <Tobias.Jordan@elektrobit.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06drivers: net: dsa: remove duplicate includesPravin Shedge
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06rds: Fix NULL pointer dereference in __rds_rdma_mapHåkon Bugge
This is a fix for syzkaller719569, where memory registration was attempted without any underlying transport being loaded. Analysis of the case reveals that it is the setsockopt() RDS_GET_MR (2) and RDS_GET_MR_FOR_DEST (7) that are vulnerable. Here is an example stack trace when the bug is hit: BUG: unable to handle kernel NULL pointer dereference at 00000000000000c0 IP: __rds_rdma_map+0x36/0x440 [rds] PGD 2f93d03067 P4D 2f93d03067 PUD 2f93d02067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: bridge stp llc tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rds binfmt_misc sb_edac intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul c rc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt mei_me sg iTCO_vendor_support ipmi_si mei ipmi_devintf nfsd shpchp pcspkr i2c_i801 ioatd ma ipmi_msghandler wmi lpc_ich mfd_core auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 mgag200 i2c_algo_bit drm_kms_helper ixgbe syscopyarea ahci sysfillrect sysimgblt libahci mdio fb_sys_fops ttm ptp libata sd_mod mlx4_core drm crc32c_intel pps_core megaraid_sas i2c_core dca dm_mirror dm_region_hash dm_log dm_mod CPU: 48 PID: 45787 Comm: repro_set2 Not tainted 4.14.2-3.el7uek.x86_64 #2 Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017 task: ffff882f9190db00 task.stack: ffffc9002b994000 RIP: 0010:__rds_rdma_map+0x36/0x440 [rds] RSP: 0018:ffffc9002b997df0 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff882fa2182580 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffc9002b997e40 RDI: ffff882fa2182580 RBP: ffffc9002b997e30 R08: 0000000000000000 R09: 0000000000000002 R10: ffff885fb29e3838 R11: 0000000000000000 R12: ffff882fa2182580 R13: ffff882fa2182580 R14: 0000000000000002 R15: 0000000020000ffc FS: 00007fbffa20b700(0000) GS:ffff882fbfb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c0 CR3: 0000002f98a66006 CR4: 00000000001606e0 Call Trace: rds_get_mr+0x56/0x80 [rds] rds_setsockopt+0x172/0x340 [rds] ? __fget_light+0x25/0x60 ? __fdget+0x13/0x20 SyS_setsockopt+0x80/0xe0 do_syscall_64+0x67/0x1b0 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7fbff9b117f9 RSP: 002b:00007fbffa20aed8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00000000000c84a4 RCX: 00007fbff9b117f9 RDX: 0000000000000002 RSI: 0000400000000114 RDI: 000000000000109b RBP: 00007fbffa20af10 R08: 0000000000000020 R09: 00007fbff9dd7860 R10: 0000000020000ffc R11: 0000000000000293 R12: 0000000000000000 R13: 00007fbffa20b9c0 R14: 00007fbffa20b700 R15: 0000000000000021 Code: 41 56 41 55 49 89 fd 41 54 53 48 83 ec 18 8b 87 f0 02 00 00 48 89 55 d0 48 89 4d c8 85 c0 0f 84 2d 03 00 00 48 8b 87 00 03 00 00 <48> 83 b8 c0 00 00 00 00 0f 84 25 03 00 0 0 48 8b 06 48 8b 56 08 The fix is to check the existence of an underlying transport in __rds_rdma_map(). Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06net_sched: use macvlan real dev trans_start in dev_trans_start()Chris Dion
Macvlan devices are similar to vlans and do not update their own trans_start. In order for arp monitoring to work for a bond device when the slaves are macvlans, obtain its real device. Signed-off-by: Chris Dion <christopher.dion@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06xen-netback: Fix logging message with spurious period after newlineJoe Perches
Using a period after a newline causes bad output. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06net: thunderx: Fix TCP/UDP checksum offload for IPv4 pktsFlorian Westphal
Offload IP header checksum to NIC. This fixes a previous patch which disabled checksum offloading for both IPv4 and IPv6 packets. So L3 checksum offload was getting disabled for IPv4 pkts. And HW is dropping these pkts for some reason. Without this patch, IPv4 TSO appears to be broken: WIthout this patch I get ~16kbyte/s, with patch close to 2mbyte/s when copying files via scp from test box to my home workstation. Looking at tcpdump on sender it looks like hardware drops IPv4 TSO skbs. This patch restores performance for me, ipv6 looks good too. Fixes: fa6d7cb5d76c ("net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts") Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05make sock_alloc_file() do sock_release() on failuresAl Viro
This changes calling conventions (and simplifies the hell out the callers). New rules: once struct socket had been passed to sock_alloc_file(), it's been consumed either by struct file or by sock_release() done by sock_alloc_file(). Either way the caller should not do sock_release() after that point. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05socketpair(): allocate descriptors firstAl Viro
simplifies failure exits considerably... Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05fix kcm_clone()Al Viro
1) it's fput() or sock_release(), not both 2) don't do fd_install() until the last failure exit. 3) not a bug per se, but... don't attach socket to struct file until it's set up. Take reserving descriptor into the caller, move fd_install() to the caller, sanitize failure exits and calling conventions. Cc: stable@vger.kernel.org # v4.6+ Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05dccp: CVE-2017-8824: use-after-free in DCCP codeMohamed Ghannam
Whenever the sock object is in DCCP_CLOSED state, dccp_disconnect() must free dccps_hc_tx_ccid and dccps_hc_rx_ccid and set to NULL. Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net: remove hlist_nulls_add_tail_rcu()Eric Dumazet
Alexander Potapenko reported use of uninitialized memory [1] This happens when inserting a request socket into TCP ehash, in __sk_nulls_add_node_rcu(), since sk_reuseport is not initialized. Bug was added by commit d894ba18d4e4 ("soreuseport: fix ordering for mixed v4/v6 sockets") Note that d296ba60d8e2 ("soreuseport: Resolve merge conflict for v4/v6 ordering fix") missed the opportunity to get rid of hlist_nulls_add_tail_rcu() : Both UDP sockets and TCP/DCCP listeners no longer use __sk_nulls_add_node_rcu() for their hash insertion. Since all other sockets have unique 4-tuple, the reuseport status has no special meaning, so we can always use hlist_nulls_add_head_rcu() for them and save few cycles/instructions. [1] ================================================================== BUG: KMSAN: use of uninitialized memory in inet_ehash_insert+0xd40/0x1050 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0+ #3288 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace:  <IRQ>  __dump_stack lib/dump_stack.c:16  dump_stack+0x185/0x1d0 lib/dump_stack.c:52  kmsan_report+0x13f/0x1c0 mm/kmsan/kmsan.c:1016  __msan_warning_32+0x69/0xb0 mm/kmsan/kmsan_instr.c:766  __sk_nulls_add_node_rcu ./include/net/sock.h:684  inet_ehash_insert+0xd40/0x1050 net/ipv4/inet_hashtables.c:413  reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:754  inet_csk_reqsk_queue_hash_add+0x1cc/0x300 net/ipv4/inet_connection_sock.c:765  tcp_conn_request+0x31e7/0x36f0 net/ipv4/tcp_input.c:6414  tcp_v4_conn_request+0x16d/0x220 net/ipv4/tcp_ipv4.c:1314  tcp_rcv_state_process+0x42a/0x7210 net/ipv4/tcp_input.c:5917  tcp_v4_do_rcv+0xa6a/0xcd0 net/ipv4/tcp_ipv4.c:1483  tcp_v4_rcv+0x3de0/0x4ab0 net/ipv4/tcp_ipv4.c:1763  ip_local_deliver_finish+0x6bb/0xcb0 net/ipv4/ip_input.c:216  NF_HOOK ./include/linux/netfilter.h:248  ip_local_deliver+0x3fa/0x480 net/ipv4/ip_input.c:257  dst_input ./include/net/dst.h:477  ip_rcv_finish+0x6fb/0x1540 net/ipv4/ip_input.c:397  NF_HOOK ./include/linux/netfilter.h:248  ip_rcv+0x10f6/0x15c0 net/ipv4/ip_input.c:488  __netif_receive_skb_core+0x36f6/0x3f60 net/core/dev.c:4298  __netif_receive_skb net/core/dev.c:4336  netif_receive_skb_internal+0x63c/0x19c0 net/core/dev.c:4497  napi_skb_finish net/core/dev.c:4858  napi_gro_receive+0x629/0xa50 net/core/dev.c:4889  e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4018  e1000_clean_rx_irq+0x1492/0x1d30 drivers/net/ethernet/intel/e1000/e1000_main.c:4474  e1000_clean+0x43aa/0x5970 drivers/net/ethernet/intel/e1000/e1000_main.c:3819  napi_poll net/core/dev.c:5500  net_rx_action+0x73c/0x1820 net/core/dev.c:5566  __do_softirq+0x4b4/0x8dd kernel/softirq.c:284  invoke_softirq kernel/softirq.c:364  irq_exit+0x203/0x240 kernel/softirq.c:405  exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:638  do_IRQ+0x15e/0x1a0 arch/x86/kernel/irq.c:263  common_interrupt+0x86/0x86 Fixes: d894ba18d4e4 ("soreuseport: fix ordering for mixed v4/v6 sockets") Fixes: d296ba60d8e2 ("soreuseport: Resolve merge conflict for v4/v6 ordering fix") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexander Potapenko <glider@google.com> Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05Merge branch 'rmnet-Fix-leaks-in-failure-scenarios'David S. Miller
Subash Abhinov Kasiviswanathan says: ==================== net: qualcomm: rmnet: Fix leaks in failure scenarios Patch 1 fixes a leak in transmit path where a skb cannot be transmitted due to insufficient headroom to stamp the map header. Patch 2 fixes a leak in rmnet_newlink() failure because the rmnet endpoint was never freed ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net: qualcomm: rmnet: Fix leak in device creation failureSubash Abhinov Kasiviswanathan
If the rmnet device creation fails in the newlink either while registering with the physical device or after subsequent operations, the rmnet endpoint information is never freed. Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net: qualcomm: rmnet: Fix leak on transmit failureSubash Abhinov Kasiviswanathan
If a skb in transmit path does not have sufficient headroom to add the map header, the skb is not sent out and is never freed. Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05VSOCK: fix outdated sk_state value in hvs_release()Stefan Hajnoczi
Since commit 3b4477d2dcf2709d0be89e2a8dced3d0f4a017f2 ("VSOCK: use TCP state constants for sk_state") VSOCK has used TCP_* constants for sk_state. Commit b4562ca7925a3bedada87a3dd072dd5bad043288 ("hv_sock: add locking in the open/close/release code paths") reintroduced the SS_DISCONNECTING constant. This patch replaces the old SS_DISCONNECTING with the new TCP_CLOSING constant. CC: Dexuan Cui <decui@microsoft.com> CC: Cathy Avery <cavery@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05tipc: fix memory leak in tipc_accept_from_sock()Jon Maloy
When the function tipc_accept_from_sock() fails to create an instance of struct tipc_subscriber it omits to free the already created instance of struct tipc_conn instance before it returns. We fix that with this commit. Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05tipc: fix a null pointer deref on error pathCong Wang
In tipc_topsrv_kern_subscr() when s->tipc_conn_new() fails we call tipc_close_conn() to clean up, but in this case calling conn_put() is just enough. This fixes the folllowing crash: kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ #137 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: 00000000c24413a5 task.stack: 000000005e8160b5 RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378 RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400 RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020 FS: 00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline] _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175 spin_lock_bh include/linux/spinlock.h:320 [inline] tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201 tipc_subscrb_delete net/tipc/subscr.c:238 [inline] tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316 tipc_close_conn+0x171/0x270 net/tipc/server.c:204 tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514 tipc_group_create+0x702/0x9c0 net/tipc/group.c:184 tipc_sk_join net/tipc/socket.c:2747 [inline] tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861 SYSC_setsockopt net/socket.c:1851 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1830 entry_SYSCALL_64_fastpath+0x1f/0x96 Fixes: 14c04493cb77 ("tipc: add ability to order and receive topology events in driver") Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05Merge branch 'sh_eth-dma-mapping-fixes'David S. Miller
Thomas Petazzoni says: ==================== net: sh_eth: DMA mapping API fixes Here are two patches that fix how the sh_eth driver is using the DMA mapping API: a bogus struct device is used in some places, or a NULL struct device is used. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net: sh_eth: don't use NULL as "struct device" for the DMA mapping APIThomas Petazzoni
Using NULL as argument for the DMA mapping API is bogus, as the DMA mapping API may use information from the "struct device" to perform the DMA mapping operation. Therefore, pass the appropriate "struct device". Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net: sh_eth: use correct "struct device" when calling DMA mapping functionsThomas Petazzoni
There are two types of "struct device": the one representing the physical device on its physical bus (platform, SPI, PCI, etc.), and the one representing the logical device in its device class (net, etc.). The DMA mapping API expects to receive as argument a "struct device" representing the physical device, as the "struct device" contains information about the bus that the DMA API needs. However, the sh_eth driver mistakenly uses the "struct device" representing the logical device (embedded in "struct net_device") rather than the "struct device" representing the physical device on its bus. This commit fixes that by adjusting all calls to the DMA mapping API. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05Merge branch 'RED-qdisc-fixes'David S. Miller
Nogah Frankel says: ==================== RED qdisc fixes Add some input validation checks to RED qdisc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net_sched: red: Avoid illegal valuesNogah Frankel
Check the qmin & qmax values doesn't overflow for the given Wlog value. Check that qmin <= qmax. Fixes: a783474591f2 ("[PKT_SCHED]: Generic RED layer") Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net_sched: red: Avoid devision by zeroNogah Frankel
Do not allow delta value to be zero since it is used as a divisor. Fixes: 8af2a218de38 ("sch_red: Adaptative RED AQM") Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05gianfar: fix a flooded alignment reports because of padding issue.Zumeng Chen
According to LS1021A RM, the value of PAL can be set so that the start of the IP header in the receive data buffer is aligned to a 32-bit boundary. Normally, setting PAL = 2 provides minimal padding to ensure such alignment of the IP header. However every incoming packet's 8-byte time stamp will be inserted into the packet data buffer as padding alignment bytes when hardware time stamping is enabled. So we set the padding 8+2 here to avoid the flooded alignment faults: root@128:~# cat /proc/cpu/alignment User: 0 System: 17539 (inet_gro_receive+0x114/0x2c0) Skipped: 0 Half: 0 Word: 0 DWord: 0 Multi: 17539 User faults: 2 (fixup) Also shown when exception report enablement CPU: 0 PID: 161 Comm: irq/66-eth1_g0_ Not tainted 4.1.21-rt13-WR8.0.0.0_preempt-rt #16 Hardware name: Freescale LS1021A [<8001b420>] (unwind_backtrace) from [<8001476c>] (show_stack+0x20/0x24) [<8001476c>] (show_stack) from [<807cfb48>] (dump_stack+0x94/0xac) [<807cfb48>] (dump_stack) from [<80025d70>] (do_alignment+0x720/0x958) [<80025d70>] (do_alignment) from [<80009224>] (do_DataAbort+0x40/0xbc) [<80009224>] (do_DataAbort) from [<80015398>] (__dabt_svc+0x38/0x60) Exception stack(0x86ad1cc0 to 0x86ad1d08) 1cc0: f9b3e080 86b3d072 2d78d287 00000000 866816c0 86b3d05e 86e785d0 00000000 1ce0: 00000011 0000000e 80840ab0 86ad1d3c 86ad1d08 86ad1d08 806d7fc0 806d806c 1d00: 40070013 ffffffff [<80015398>] (__dabt_svc) from [<806d806c>] (inet_gro_receive+0x114/0x2c0) [<806d806c>] (inet_gro_receive) from [<80660eec>] (dev_gro_receive+0x21c/0x3c0) [<80660eec>] (dev_gro_receive) from [<8066133c>] (napi_gro_receive+0x44/0x17c) [<8066133c>] (napi_gro_receive) from [<804f0538>] (gfar_clean_rx_ring+0x39c/0x7d4) [<804f0538>] (gfar_clean_rx_ring) from [<804f0bf4>] (gfar_poll_rx_sq+0x58/0xe0) [<804f0bf4>] (gfar_poll_rx_sq) from [<80660b10>] (net_rx_action+0x27c/0x43c) [<80660b10>] (net_rx_action) from [<80033638>] (do_current_softirqs+0x1e0/0x3dc) [<80033638>] (do_current_softirqs) from [<800338c4>] (__local_bh_enable+0x90/0xa8) [<800338c4>] (__local_bh_enable) from [<8008025c>] (irq_forced_thread_fn+0x70/0x84) [<8008025c>] (irq_forced_thread_fn) from [<800805e8>] (irq_thread+0x16c/0x244) [<800805e8>] (irq_thread) from [<8004e490>] (kthread+0xe8/0x104) [<8004e490>] (kthread) from [<8000fda8>] (ret_from_fork+0x14/0x2c) Signed-off-by: Zumeng Chen <zumeng.chen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05Revert "net: core: maybe return -EEXIST in __dev_alloc_name"Johannes Berg
This reverts commit d6f295e9def0; some userspace (in the case we noticed it's wpa_supplicant), is relying on the current error code to determine that a fixed name interface already exists. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05nfp: fix port stats for mac representorsPieter Jansen van Vuuren
Previously we swapped the tx_packets, tx_bytes and tx_dropped counters with rx_packets, rx_bytes and rx_dropped counters, respectively. This behaviour is correct and expected for VF representors but it should not be swapped for physical port mac representors. Fixes: eadfa4c3be99 ("nfp: add stats and xmit helpers for representors") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05Revert "tcp: must block bh in __inet_twsk_hashdance()"Eric Dumazet
We had to disable BH _before_ calling __inet_twsk_hashdance() in commit cfac7f836a71 ("tcp/dccp: block bh before arming time_wait timer"). This means we can revert 614bdd4d6e61 ("tcp: must block bh in __inet_twsk_hashdance()"). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-04Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "virtio and qemu bugfixes A couple of bugfixes that just became ready" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_balloon: fix increment of vb->num_pfns in fill_balloon() virtio: release virtio index when fail to device_register fw_cfg: fix driver remove
2017-12-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Various TCP control block fixes, including one that crashes with SELinux, from David Ahern and Eric Dumazet. 2) Fix ACK generation in rxrpc, from David Howells. 3) ipvlan doesn't set the mark properly in the ipv4 route lookup key, from Gao Feng. 4) SIT configuration doesn't take on the frag_off ipv4 field configuration properly, fix from Hangbin Liu. 5) TSO can fail after device down/up on stmmac, fix from Lars Persson. 6) Various bpftool fixes (mostly in JSON handling) from Quentin Monnet. 7) Various SKB leak fixes in vhost/tun/tap (mostly observed as performance problems). From Wei Xu. 8) mvpps's TX descriptors were not zero initialized, from Yan Markman. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits) tcp: use IPCB instead of TCP_SKB_CB in inet_exact_dif_match() tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb() rxrpc: Fix the MAINTAINERS record rxrpc: Use correct netns source in rxrpc_release_sock() liquidio: fix incorrect indentation of assignment statement stmmac: reset last TSO segment size after device open ipvlan: Add the skb->mark as flow4's member to lookup route s390/qeth: build max size GSO skbs on L2 devices s390/qeth: fix GSO throughput regression s390/qeth: fix thinko in IPv4 multicast address tracking tap: free skb if flags error tun: free skb in early errors vhost: fix skb leak in handle_rx() bnxt_en: Fix a variable scoping in bnxt_hwrm_do_send_msg() bnxt_en: fix dst/src fid for vxlan encap/decap actions bnxt_en: wildcard smac while creating tunnel decap filter bnxt_en: Need to unconditionally shut down RoCE in bnxt_shutdown phylink: ensure we take the link down when phylink_stop() is called sfp: warn about modules requiring address change sequence sfp: improve RX_LOS handling ...
2017-12-04arch/tile: mark as orphanedChris Metcalf
The chip family of TILEPro and TILE-Gx was developed by Tilera, which was eventually acquired by Mellanox. The tile architecture was added to the kernel in 2010 and first appeared in 2.6.36. Now at Mellanox we are developing new chips based on the ARM64 architecture; our last TILE-Gx chip (the Gx72) was released in 2013, and our customers using tile architecture products are not, as far as we know, looking to upgrade to newer kernel releases. In the absence of someone in the community stepping up to take over maintainership, this commit marks the architecture as orphaned. Cc: Chris Metcalf <metcalf@alum.mit.edu> Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a compilation warning in xdp redirect tracepoint due to missing bpf.h include that pulls in struct bpf_map, from Xie. 2) Limit the maximum number of attachable BPF progs for a given perf event as long as uabi is not frozen yet. The hard upper limit is now 64 and therefore the same as with BPF multi-prog for cgroups. Also add related error checking for the sample BPF loader when enabling and attaching to the perf event, from Yonghong. 3) Specifically set the RLIMIT_MEMLOCK for the test_verifier_log case, so that the test case can always pass and not fail in some environments due to too low default limit, also from Yonghong. 4) Fix up a missing license header comment for kernel/bpf/offload.c, from Jakub. 5) Several fixes for bpftool, among others a crash on incorrect arguments when json output is used, error message handling fixes on unknown options and proper destruction of json writer for some exit cases, all from Quentin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03Merge branch 'tcp-cb-selinux-corruption'David S. Miller
Eric Dumazet says: ==================== tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb() James Morris reported kernel stack corruption bug that we tracked back to commit 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") First patch needs to be backported to kernels >= 3.18, while second patch needs to be backported to kernels >= 4.9, since this was the time when inet_exact_dif_match appeared. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03tcp: use IPCB instead of TCP_SKB_CB in inet_exact_dif_match()David Ahern
After this fix : ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()"), socket lookups happen while skb->cb[] has not been mangled yet by TCP. Fixes: a04a480d4392 ("net: Require exact match for TCP socket lookups if dif is l3mdev") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()Eric Dumazet
James Morris reported kernel stack corruption bug [1] while running the SELinux testsuite, and bisected to a recent commit bffa72cf7f9d ("net: sk_buff rbnode reorg") We believe this commit is fine, but exposes an older bug. SELinux code runs from tcp_filter() and might send an ICMP, expecting IP options to be found in skb->cb[] using regular IPCB placement. We need to defer TCP mangling of skb->cb[] after tcp_filter() calls. This patch adds tcp_v4_fill_cb()/tcp_v4_restore_cb() in a very similar way we added them for IPv6. [1] [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to parse packet [ 339.822505] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81745af5 [ 339.822505] [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-test #15 [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS FWKT68A 01/19/2017 [ 339.885060] Call Trace: [ 339.896875] <IRQ> [ 339.908103] dump_stack+0x63/0x87 [ 339.920645] panic+0xe8/0x248 [ 339.932668] ? ip_push_pending_frames+0x33/0x40 [ 339.946328] ? icmp_send+0x525/0x530 [ 339.958861] ? kfree_skbmem+0x60/0x70 [ 339.971431] __stack_chk_fail+0x1b/0x20 [ 339.984049] icmp_send+0x525/0x530 [ 339.996205] ? netlbl_skbuff_err+0x36/0x40 [ 340.008997] ? selinux_netlbl_err+0x11/0x20 [ 340.021816] ? selinux_socket_sock_rcv_skb+0x211/0x230 [ 340.035529] ? security_sock_rcv_skb+0x3b/0x50 [ 340.048471] ? sk_filter_trim_cap+0x44/0x1c0 [ 340.061246] ? tcp_v4_inbound_md5_hash+0x69/0x1b0 [ 340.074562] ? tcp_filter+0x2c/0x40 [ 340.086400] ? tcp_v4_rcv+0x820/0xa20 [ 340.098329] ? ip_local_deliver_finish+0x71/0x1a0 [ 340.111279] ? ip_local_deliver+0x6f/0xe0 [ 340.123535] ? ip_rcv_finish+0x3a0/0x3a0 [ 340.135523] ? ip_rcv_finish+0xdb/0x3a0 [ 340.147442] ? ip_rcv+0x27c/0x3c0 [ 340.158668] ? inet_del_offload+0x40/0x40 [ 340.170580] ? __netif_receive_skb_core+0x4ac/0x900 [ 340.183285] ? rcu_accelerate_cbs+0x5b/0x80 [ 340.195282] ? __netif_receive_skb+0x18/0x60 [ 340.207288] ? process_backlog+0x95/0x140 [ 340.218948] ? net_rx_action+0x26c/0x3b0 [ 340.230416] ? __do_softirq+0xc9/0x26a [ 340.241625] ? do_softirq_own_stack+0x2a/0x40 [ 340.253368] </IRQ> [ 340.262673] ? do_softirq+0x50/0x60 [ 340.273450] ? __local_bh_enable_ip+0x57/0x60 [ 340.285045] ? ip_finish_output2+0x175/0x350 [ 340.296403] ? ip_finish_output+0x127/0x1d0 [ 340.307665] ? nf_hook_slow+0x3c/0xb0 [ 340.318230] ? ip_output+0x72/0xe0 [ 340.328524] ? ip_fragment.constprop.54+0x80/0x80 [ 340.340070] ? ip_local_out+0x35/0x40 [ 340.350497] ? ip_queue_xmit+0x15c/0x3f0 [ 340.361060] ? __kmalloc_reserve.isra.40+0x31/0x90 [ 340.372484] ? __skb_clone+0x2e/0x130 [ 340.382633] ? tcp_transmit_skb+0x558/0xa10 [ 340.393262] ? tcp_connect+0x938/0xad0 [ 340.403370] ? ktime_get_with_offset+0x4c/0xb0 [ 340.414206] ? tcp_v4_connect+0x457/0x4e0 [ 340.424471] ? __inet_stream_connect+0xb3/0x300 [ 340.435195] ? inet_stream_connect+0x3b/0x60 [ 340.445607] ? SYSC_connect+0xd9/0x110 [ 340.455455] ? __audit_syscall_entry+0xaf/0x100 [ 340.466112] ? syscall_trace_enter+0x1d0/0x2b0 [ 340.476636] ? __audit_syscall_exit+0x209/0x290 [ 340.487151] ? SyS_connect+0xe/0x10 [ 340.496453] ? do_syscall_64+0x67/0x1b0 [ 340.506078] ? entry_SYSCALL64_slow_path+0x25/0x25 Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: James Morris <james.l.morris@oracle.com> Tested-by: James Morris <james.l.morris@oracle.com> Tested-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03Linux 4.15-rc2Linus Torvalds
2017-12-03Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fix from Russell King: "Just one fix this time around, for the late commit in the merge window that triggered a problem with qemu. Qemu is apparently also going to receive a fix for the discovered issue" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: avoid faulting on qemu
2017-12-03Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Here are two bugfixes for I2C, fixing a memleak in the core and irq allocation for i801. Also three bugfixes for the at24 eeprom driver which Bartosz collected while taking over maintainership for this driver" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: eeprom: at24: check at24_read/write arguments eeprom: at24: fix reading from 24MAC402/24MAC602 eeprom: at24: correctly set the size for at24mac402 i2c: i2c-boardinfo: fix memory leaks on devinfo i2c: i801: Fix Failed to allocate irq -2147483648 error
2017-12-03Merge tag 'hwmon-for-linus-v4.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Fixes: - Drop reference to obsolete maintainer tree - Fix overflow bug in pmbus driver - Fix SMBUS timeout problem in jc42 driver For the SMBUS timeout handling, we had a brief discussion if this should be considered a bug fix or a feature. Peter says "it fixes real problems where the application misbehave due to faulty content when reading from an eeprom", and he needs the patch in his company's v4.14 images. This is good enough for me and warrants backport to stable kernels" * tag 'hwmon-for-linus-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (jc42) optionally try to disable the SMBUS timeout hwmon: (pmbus) Use 64bit math for DIRECT format values hwmon: Drop reference to Jean's tree
2017-12-03rxrpc: Fix the MAINTAINERS recordDavid Howells
Fix the MAINTAINERS record so that it's more obvious who the maintainer for AF_RXRPC is. Reported-by: Joe Perches <joe@perches.com> Reported-by: David Miller <davem@davemloft.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03rxrpc: Use correct netns source in rxrpc_release_sock()David Howells
In rxrpc_release_sock() there may be no rx->local value to access, so we can't unconditionally follow it to the rxrpc network namespace information to poke the connection reapers. Instead, use the socket's namespace pointer to find the namespace. This unfixed code causes the following static checker warning: net/rxrpc/af_rxrpc.c:898 rxrpc_release_sock() error: we previously assumed 'rx->local' could be null (see line 887) Fixes: 3d18cbb7fd0c ("rxrpc: Fix conn expiry timers") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03liquidio: fix incorrect indentation of assignment statementColin Ian King
Remove one extraneous level of indentation on assignment statement. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03Merge tag 'linux-can-fixes-for-4.15-20171201' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2017-12-01 this is a pull for net consisting of nine patches. The first three patches are by Jimmy Assarsson for the kvaser_usb driver and add the missing free()s in some error path, a signed/unsigned comparison and ratelimit the error messages in case of incomplete messages. Oliver Stäbler's patch for the ti_hecc driver fix the napi poll function's return value. The return values of the probe function of the peak_canfd and peak_pci PCI drivers are fixed by Stephane Grosjean's patch. Two patches by me for the flexcan driver update the bugs/features/quirks overview table and fix the error state transition for the VF610 SoC. The two patches by Martin Kelly for the mcba_usb driver fix a typo and a device disconnect bug. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03Merge tag 'at24-4.15-fixes-for-wolfram' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-current Please consider pulling the following fixes for v4.15. While it doesn't fix any regression introduced in the v4.15 merge window, we have a feature in at24 since linux v4.8 - reading the mac address block from at24mac series - which turned out to be not working. This pull request contains changes that fix it together with a patch that hardens the read and write argument sanitization with out-of-bounds checks that were missing.
2017-12-03stmmac: reset last TSO segment size after device openLars Persson
The mss variable tracks the last max segment size sent to the TSO engine. We do not update the hardware as long as we receive skb:s with the same value in gso_size. During a network device down/up cycle (mapped to stmmac_release() and stmmac_open() callbacks) we issue a reset to the hardware and it forgets the setting for mss. However we did not zero out our mss variable so the next transmission of a gso packet happens with an undefined hardware setting. This triggers a hang in the TSO engine and eventuelly the netdev watchdog will bark. Fixes: f748be531d70 ("stmmac: support new GMAC4") Signed-off-by: Lars Persson <larper@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03ipvlan: Add the skb->mark as flow4's member to lookup routeGao Feng
Current codes don't use skb->mark to assign flowi4_mark, it would make the policy route rule with fwmark doesn't work as expected. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02Merge branch 's390-qeth-fixes'David S. Miller
Julian Wiedmann says: ==================== s390/qeth: fixes 2017-12-01 please apply the following three fixes for 4.15. These should also go back to stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02s390/qeth: build max size GSO skbs on L2 devicesJulian Wiedmann
The current GSO skb size limit was copy&pasted over from the L3 path, where it is needed due to a TSO limitation. As L2 devices don't offer TSO support (and thus all GSO skbs are segmented before they reach the driver), there's no reason to restrict the stack in how large it may build the GSO skbs. Fixes: d52aec97e5bc ("qeth: enable scatter/gather in layer 2 mode") Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02s390/qeth: fix GSO throughput regressionJulian Wiedmann
Using GSO with small MTUs currently results in a substantial throughput regression - which is caused by how qeth needs to map non-linear skbs into its IO buffer elements: compared to a linear skb, each GSO-segmented skb effectively consumes twice as many buffer elements (ie two instead of one) due to the additional header-only part. This causes the Output Queue to be congested with low-utilized IO buffers. Fix this as follows: If the MSS is low enough so that a non-SG GSO segmentation produces order-0 skbs (currently ~3500 byte), opt out from NETIF_F_SG. This is where we anticipate the biggest savings, since an SG-enabled GSO segmentation produces skbs that always consume at least two buffer elements. Larger MSS values continue to get a SG-enabled GSO segmentation, since 1) the relative overhead of the additional header-only buffer element becomes less noticeable, and 2) the linearization overhead increases. With the throughput regression fixed, re-enable NETIF_F_SG by default to reap the significant CPU savings of GSO. Fixes: 5722963a8e83 ("qeth: do not turn on SG per default") Reported-by: Nils Hoppmann <niho@de.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02s390/qeth: fix thinko in IPv4 multicast address trackingJulian Wiedmann
Commit 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback") reworked how secondary addresses are managed for qeth devices. Instead of dropping & subsequently re-adding all addresses on every ndo_set_rx_mode() call, qeth now keeps track of the addresses that are currently registered with the HW. On a ndo_set_rx_mode(), we thus only need to do (de-)registration requests for the addresses that have actually changed. On L3 devices, the lookup for IPv4 Multicast addresses checks the wrong hashtable - and thus never finds a match. As a result, we first delete *all* such addresses, and then re-add them again. So each set_rx_mode() causes a short period where the IPv4 Multicast addresses are not registered, and the card stops forwarding inbound traffic for them. Fix this by setting the ->is_multicast flag on the lookup object, thus enabling qeth_l3_ip_from_hash() to search the correct hashtable and find a match there. Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback") Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02Merge branch 'vhost-skb-leaks'David S. Miller
Wei Xu says: ==================== vhost: fix a few skb leaks Matthew found a roughly 40% tcp throughput regression with commit c67df11f(vhost_net: try batch dequing from skb array) as discussed in the following thread: https://www.mail-archive.com/netdev@vger.kernel.org/msg187936.html v4: - fix zero iov iterator count in tap/tap_do_read()(Jason) - don't put tun in case of EBADFD(Jason) - Replace msg->msg_control with new 'skb' when calling tun/tap_do_read() v3: - move freeing skb from vhost to tun/tap recvmsg() to not confuse the callers. v2: - add Matthew as the reporter, thanks matthew. - moving zero headcount check ahead instead of defer consuming skb due to jason and mst's comment. - add freeing skb in favor of recvmsg() fails. ==================== Acked-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>