Age  Commit message  Author
2017-05-02  tcp: fix wraparound issue in tcp_lp  Eric Dumazet
Be careful when comparing tcp_time_stamp to some u32 quantity, otherwise result can be surprising. Fixes: 7c106d7e782b ("[TCP]: TCP Low Priority congestion control") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
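The pitfall described above can be sketched in plain C. This is only an illustration of wraparound-safe comparison on a 32-bit tick counter, not the actual tcp_lp change; the helper names ts_before and ts_elapsed are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if timestamp a lies before timestamp b on a
 * wrapping 32-bit clock. The unsigned difference is reinterpreted as
 * signed, so the result stays correct as long as the two values are less
 * than 2^31 ticks apart. */
static bool ts_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Hypothetical helper: ticks elapsed between 'then' and 'now'. Unsigned
 * subtraction is modular, so a single wrap of the counter is harmless. */
static uint32_t ts_elapsed(uint32_t now, uint32_t then)
{
    return now - then;
}
```

A plain `a < b` on the raw values gives the opposite answer once the counter has wrapped between the two samples, which is the kind of surprise the commit message warns about.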
2017-05-02  bpf, arm64: fix jit branch offset related to ldimm64  Daniel Borkmann
When the instruction right before the branch destination is a 64 bit load immediate, we currently calculate the wrong jump offset in the ctx->offset[] array as we only account one instruction slot for the 64 bit load immediate although it uses two BPF instructions. Fix it up by setting the offset into the right slot after we incremented the index. Before (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 54ffff82 b.cs 0x00000020 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] After (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 540000a2 b.cs 0x00000044 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] Also, add a couple of test cases to make sure JITs pass this test. Tested on Cavium ThunderX ARMv8. The added test cases all pass after the fix. Fixes: 8eee539ddea0 ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
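The bookkeeping behind this fix can be modelled with a toy offset table. The sketch below is not the arm64 JIT code; struct toy_insn, build_offsets and native_target are hypothetical names, and the table is assumed to be zero-initialised by the caller. It records, for each BPF slot, how many native instructions have been emitted once that slot is complete, and it steps onto the second slot of a 64-bit load immediate before recording.

```c
#include <stddef.h>

/* Toy model of one BPF instruction slot (hypothetical). */
struct toy_insn {
    int is_ldimm64;  /* nonzero on the first slot of a 64-bit load immediate */
    int native_len;  /* native instructions emitted for this BPF instruction */
};

/* Fill offset[i] with the native instruction count after BPF slot i.
 * For a two-slot ldimm64 the entry is written into the SECOND slot,
 * after the index has been incremented. */
static void build_offsets(const struct toy_insn *prog, size_t len, int *offset)
{
    int idx = 0;

    for (size_t i = 0; i < len; i++) {
        idx += prog[i].native_len;
        if (prog[i].is_ldimm64 && i + 1 < len)
            i++;                /* move to the second slot first */
        offset[i] = idx;        /* then record the offset */
    }
}

/* Native index of a branch whose destination is BPF slot t. The table is
 * stored off by one, so the entry of the previous slot is used. */
static int native_target(const int *offset, size_t t)
{
    return t == 0 ? 0 : offset[t - 1];
}
```

With a program of one single-slot instruction, one ldimm64 expanding to four native instructions, and a following instruction, a branch to the slot right after the ldimm64 resolves through the second ldimm64 slot. Recording into the first slot instead would leave that entry at zero and send the branch to the wrong place.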
2017-05-02  bpf, arm64: implement jiting of BPF_XADD  Daniel Borkmann
This work adds BPF_XADD for BPF_W/BPF_DW to the arm64 JIT and therefore completes JITing of all BPF instructions, meaning we can thus also remove the 'notyet' label and do not need to fall back to the interpreter when BPF_XADD is used in a program! This now also brings arm64 JIT in line with x86_64, s390x, ppc64, sparc64, where all current eBPF features are supported. BPF_W example from test_bpf: .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_W, R10, -40, 0x10), BPF_STX_XADD(BPF_W, R10, R0, -40), BPF_LDX_MEM(BPF_W, R0, R10, -40), BPF_EXIT_INSN(), }, [...] 00000020: 52800247 mov w7, #0x12 // #18 00000024: 928004eb mov x11, #0xffffffffffffffd8 // #-40 00000028: d280020a mov x10, #0x10 // #16 0000002c: b82b6b2a str w10, [x25,x11] // start of xadd mapping: 00000030: 928004ea mov x10, #0xffffffffffffffd8 // #-40 00000034: 8b19014a add x10, x10, x25 00000038: f9800151 prfm pstl1strm, [x10] 0000003c: 885f7d4b ldxr w11, [x10] 00000040: 0b07016b add w11, w11, w7 00000044: 880b7d4b stxr w11, w11, [x10] 00000048: 35ffffab cbnz w11, 0x0000003c // end of xadd mapping: [...] BPF_DW example from test_bpf: .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_DW, R10, -40, 0x10), BPF_STX_XADD(BPF_DW, R10, R0, -40), BPF_LDX_MEM(BPF_DW, R0, R10, -40), BPF_EXIT_INSN(), }, [...] 00000020: 52800247 mov w7, #0x12 // #18 00000024: 928004eb mov x11, #0xffffffffffffffd8 // #-40 00000028: d280020a mov x10, #0x10 // #16 0000002c: f82b6b2a str x10, [x25,x11] // start of xadd mapping: 00000030: 928004ea mov x10, #0xffffffffffffffd8 // #-40 00000034: 8b19014a add x10, x10, x25 00000038: f9800151 prfm pstl1strm, [x10] 0000003c: c85f7d4b ldxr x11, [x10] 00000040: 8b07016b add x11, x11, x7 00000044: c80b7d4b stxr w11, x11, [x10] 00000048: 35ffffab cbnz w11, 0x0000003c // end of xadd mapping: [...] 
Tested on Cavium ThunderX ARMv8, test suite results after the patch: No JIT: [ 3751.855362] test_bpf: Summary: 311 PASSED, 0 FAILED, [0/303 JIT'ed] With JIT: [ 3573.759527] test_bpf: Summary: 311 PASSED, 0 FAILED, [303/303 JIT'ed] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
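The ldxr/add/stxr/cbnz sequence above is a load-exclusive/store-exclusive retry loop. A rough portable analogue, assuming C11 atomics and using the hypothetical name xadd32, is a compare-and-swap loop; it is meant to illustrate the semantics, not to reproduce what the JIT emits.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sketch: atomically add val to *addr and discard the old
 * value. Relaxed ordering is assumed here because the listing above uses
 * plain ldxr/stxr without acquire/release semantics. */
static void xadd32(_Atomic uint32_t *addr, uint32_t val)
{
    uint32_t old = atomic_load_explicit(addr, memory_order_relaxed);

    /* On failure 'old' is refreshed with the current contents and the
     * addition is retried, much like the cbnz branch back to ldxr. */
    while (!atomic_compare_exchange_weak_explicit(addr, &old, old + val,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;
}
```

Starting from a slot holding 0x10 and adding 0x12, as in the BPF_W test program, leaves 0x22 in the slot.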
2017-05-02  Merge branch 'bpf-test-prog-fixes'  David S. Miller
I say: ==================== Fix some bpf program testing framework bugs This series fixes two issues: 1) Accidental user pointer dereference in bpf_test_finish() 2) The packet data given to the test programs is not aligned correctly The first issue is fixed simply because we have a kernel side copy of the data structure in question already. And the second bug is a simple matter of applying NET_IP_ALIGN where needed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-02  bpf: Align packet data properly in program testing framework.  David Miller
Make sure we apply NET_IP_ALIGN when reserving headroom for SKB and XDP test runs, just like a real driver would. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
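Why the extra headroom matters can be shown with a little arithmetic. The sketch below is an illustration only: TOY_IP_ALIGN mirrors the common kernel default of 2 for NET_IP_ALIGN (some architectures define it as 0), the 14-byte Ethernet header length is the standard value, and ip_header_offset is a hypothetical helper.

```c
#include <stddef.h>

#define TOY_IP_ALIGN 2   /* assumed value; NET_IP_ALIGN is arch-dependent */
#define TOY_ETH_HLEN 14  /* Ethernet header length in bytes */

/* Hypothetical helper: byte offset of the IP header from the start of an
 * aligned buffer, given the reserved headroom and the alignment pad. */
static size_t ip_header_offset(size_t headroom, size_t ip_align)
{
    return headroom + ip_align + TOY_ETH_HLEN;
}
```

With 256 bytes of headroom the IP header lands at offset 272 when the pad is applied, which is a multiple of 4. Without the pad it lands at offset 270, so 32-bit fields in the IP header are misaligned.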
2017-05-02  bpf: Do not dereference user pointer in bpf_test_finish().  David Miller
Instead, pass the kattr in which has a kernel side copy of this data structure from userspace already. Fix based upon a suggestion from Alexei Starovoitov. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-05-02  selftests: bpf: Use bpf_endian.h in test_xdp.c  David S. Miller
This fixes the testcase on big-endian. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-02  xdp: fix parameter kdoc for extack  Jakub Kicinski
Fix kdoc parameter spelling from extact to extack. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-02  bpf, samples: fix build warning in cookie_uid_helper_example  Daniel Borkmann
Fix the following warnings triggered by 51570a5ab2b7 ("A Sample of using socket cookie and uid for traffic monitoring"): In file included from /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:54:0: /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c: In function 'prog_load': /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:119:27: warning: overflow in implicit constant conversion [-Woverflow] -32 + offsetof(struct stats, uid)), ^ /home/foo/net-next/samples/bpf/libbpf.h:135:12: note: in definition of macro 'BPF_STX_MEM' .off = OFF, \ ^ /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:121:27: warning: overflow in implicit constant conversion [-Woverflow] -32 + offsetof(struct stats, packets), 1), ^ /home/foo/net-next/samples/bpf/libbpf.h:155:12: note: in definition of macro 'BPF_ST_MEM' .off = OFF, \ ^ /home/foo/net-next/samples/bpf/cookie_uid_helper_example.c:129:27: warning: overflow in implicit constant conversion [-Woverflow] -32 + offsetof(struct stats, bytes)), ^ /home/foo/net-next/samples/bpf/libbpf.h:135:12: note: in definition of macro 'BPF_STX_MEM' .off = OFF, \ ^ HOSTLD /home/foo/net-next/samples/bpf/per_socket_stats_example Fixes: 51570a5ab2b7 ("A Sample of using socket cookie and uid for traffic monitoring") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  sparc64: Fix BPF JIT wrt. branches and ldimm64 instructions.  David S. Miller
Like other JITs, sparc64 maintains an array of instruction offsets but stores the entries off by one. This is done because jumps to the exit block are indexed to one past the last BPF instruction. So if we size the array by the program length, we need to record the previous instruction in order to stay within the array bounds. This is explained in ARM JIT commit 8eee539ddea0 ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()"). But this scheme requires a little bit of careful handling when the instruction before the branch destination is a 64-bit load immediate. It takes up 2 BPF instruction slots. Therefore, we have to fill in the array entry for the second half of the 64-bit load immediate instruction rather than for the one for the beginning of that instruction. Fixes: 7a12b5031c6b ("sparc64: Add eBPF JIT.") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  rhashtable: compact struct rhashtable_params  Florian Westphal
By using smaller datatypes this (rather large) struct shrinks considerably (80 -> 48 bytes on x86_64). As this is embedded in other structs, this also reduces the size of several others, e.g. cls_fl_head or nft_hash. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
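The effect of narrower field types can be seen with a pair of toy structs. The field names below are hypothetical and only loosely modelled on hash-table parameters; this is not the real struct rhashtable_params layout, and the exact sizes depend on the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "wide" parameter block using size_t for small quantities. */
struct toy_params_wide {
    size_t nelem_hint;
    size_t key_len;
    size_t key_offset;
    size_t head_offset;
    unsigned int max_size;
    unsigned int min_size;
};

/* The same information with 16-bit fields where the range allows it. */
struct toy_params_compact {
    uint16_t nelem_hint;
    uint16_t key_len;
    uint16_t key_offset;
    uint16_t head_offset;
    unsigned int max_size;
    uint16_t min_size;
};

/* Bytes saved per instance; every struct embedding the block saves the same. */
static size_t toy_bytes_saved(void)
{
    return sizeof(struct toy_params_wide) - sizeof(struct toy_params_compact);
}
```

On a typical LP64 target the two structs come to 40 and 16 bytes. The trade-off is that each narrowed field now has a smaller valid range, which callers have to respect.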
2017-05-01  bpf: Include bpf_endian.h in test_progs.c too.  David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  bpf: Move endianness BPF helpers out of bpf_util.h  David S. Miller
We do not want to include things like stdio.h and friends into eBPF program builds. bpf_util.h is for host compiled programs, so eBPF C-code helpers don't really belong there. Add a new bpf_endian.h as a quick fix for this for now. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  ipv6: Need to export ipv6_push_frag_opts for tunneling now.  David S. Miller
Since that change also made the nfrag function not necessary for exports, remove it. Fixes: 89a23c8b528b ("ip6_tunnel: Fix missing tunnel encapsulation limit option") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  Merge branch 'dsa-mv88e6xxx-802.1s-and-88E6390-VTU'  David S. Miller
Vivien Didelot says:

====================
net: dsa: mv88e6xxx: 802.1s and 88E6390 VTU

This patch series adds support for the VLAN Table Unit (a.k.a. the VTU) to the 88E6390 family of Marvell Ethernet switch chips. The plumbing for the per VLAN Spanning Tree support is added as a side effect of the necessary refactoring. The patchset is split up so that no duplication of code is introduced.

With this patchset applied, the mv88e6xxx driver has 2 new function pointers for the VTU GetNext and VTU Load/Purge operations (with 3 implementations), both handling programming of 802.1q and 802.1s.

On a ZII Rev C board (featuring 2 88E6390X chips) with all ports bridged together, we obtain the following hardware VLAN configuration:

    # cat /sys/class/net/br0/bridge/vlan_filtering
    1
    # cat /sys/class/net/br0/bridge/default_pvid
    42
    # bridge vlan add dev lan3 vid 666
    # bridge vlan show
    port    vlan ids
    lan1     42 PVID Egress Untagged
    lan1     42 PVID Egress Untagged
    lan2     42 PVID Egress Untagged
    lan2     42 PVID Egress Untagged
    lan3     42 PVID Egress Untagged
             666
    lan3     42 PVID Egress Untagged
             666
    lan4     42 PVID Egress Untagged
    lan4     42 PVID Egress Untagged
    lan5     42 PVID Egress Untagged
    lan5     42 PVID Egress Untagged
    lan6     42 PVID Egress Untagged
    lan6     42 PVID Egress Untagged
    lan7     42 PVID Egress Untagged
    lan7     42 PVID Egress Untagged
    lan8     42 PVID Egress Untagged
    lan8     42 PVID Egress Untagged
    br0      42 PVID Egress Untagged

Below are the technical details for the different implementations.

All switch families have up to 3 dedicated VTU Data registers used to program 802.1q and 802.1s, both using 2-bit values. On 88E6185 and 88E6352 families, port membership and state are adjacent, while the 88E6390 family shares the same bits:

    Bits   88E6185/88E6352    88E6390
    -----  -----------------  --------------------------
    0-1    Port 0 membership  Port 0 membership or state
    2-3    Port 0 state       Port 1 membership or state
    4-5    Port 1 membership  Port 2 membership or state
    6-7    Port 1 state       Port 3 membership or state
    8-9    Port 2 membership  Port 4 membership or state
    10-11  Port 2 state       Port 5 membership or state
    ...    ...                ...

The 88E6185 family programs all ports membership and state in a single VTU GetNext or Load/Purge operation.

The 88E6352 family introduced an indirect Spanning Tree Unit table (a.k.a. STU) which requires additional STU GetNext and Load/Purge operations to read and write the ports state bits.

The 88E6390 family also has an STU and requires data bits to be accessed before and after every single VTU or STU operation.

Finally, the 88E6390 family introduced a 13th bit for the VLAN ID, which must be taken care of regardless of the VTU operating mode. This means that iterating over the VTU now starts or ends with value 8191, not 4095.

Patch 1 adds a max_vid field to the chip info structure.
Patch 2 adds 802.1q and 802.1s data to the generic VTU entry structure.
Patches 3 to 10 move helpers to a dedicated file (later made static).
Patches 11 and 12 abstract handling of the STU behind VTU operations.
Patches 13 and 14 add the new function pointers for VTU operations.
Patches 15 and 18 polish the VTU code and add VTU support for 88E6390.

Changes in v2:
  - add Reviewed-by tags
  - fix comments in 8/18
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
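The two bit layouts in the table can be expressed as small accessors. This is a sketch under stated assumptions, not the driver code: the data registers are assumed to be 16 bits wide and held in an array, and the function names are hypothetical.

```c
#include <stdint.h>

/* Adjacent layout (88E6185/88E6352 style in the table above): each port
 * owns 4 bits, membership in the low pair and state in the high pair,
 * so an assumed 16-bit register carries 4 ports. */
static unsigned int adjacent_member(const uint16_t *regs, unsigned int port)
{
    return (regs[port / 4] >> ((port % 4) * 4)) & 0x3;
}

static unsigned int adjacent_state(const uint16_t *regs, unsigned int port)
{
    return (regs[port / 4] >> ((port % 4) * 4 + 2)) & 0x3;
}

/* Shared layout (88E6390 style): each port owns 2 bits and the same bits
 * mean membership or state depending on the operation in progress, so an
 * assumed 16-bit register carries 8 ports. */
static unsigned int shared_field(const uint16_t *regs, unsigned int port)
{
    return (regs[port / 8] >> ((port % 8) * 2)) & 0x3;
}
```

For a first register holding 0x00B4, the adjacent layout reads port 0 as membership 0 and state 1, while the shared layout reads the same bits as the 2-bit values of ports 0 to 3.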
2017-05-01  net: dsa: mv88e6xxx: add VTU support for 88E6390  Vivien Didelot
The 6390 family of chips uses only 2 of the 3 VTU Data registers to pack the MemberTag and PortState VLAN data. This means that they must be written or read before or after each VTU/STU operation. Implement this variant to add support for VTU with such chips. These chips have a 13th bit for the VID, thus set their max_vid to 8191. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: support the VTU Page bit  Vivien Didelot
Newer chips such as the 88E6390 have a VTU Page bit in the VTU VID register to specify a 13th bit for the VID. This can be used to support 8K VLANs. When dumping the whole VTU, all VID bits must be set to one, including this VTU Page bit. Add support for VID greater than 4095. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
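Packing a 13-bit VID into a 12-bit field plus a page bit might look like the sketch below. The position of the page bit (0x2000) and all macro and function names are assumptions made for illustration; the commit message does not give the register layout.

```c
#include <stdint.h>

#define TOY_VTU_VID_MASK 0x0fff  /* low 12 bits of the VID */
#define TOY_VTU_VID_PAGE 0x2000  /* ASSUMED position of the VTU Page bit */

/* Encode a VID (0..8191) into a toy VTU VID register value. */
static uint16_t toy_vid_to_reg(uint16_t vid)
{
    uint16_t reg = vid & TOY_VTU_VID_MASK;

    if (vid & 0x1000)            /* 13th bit of the VID goes to the page bit */
        reg |= TOY_VTU_VID_PAGE;
    return reg;
}

/* Decode a toy VTU VID register value back into a VID. */
static uint16_t toy_reg_to_vid(uint16_t reg)
{
    uint16_t vid = reg & TOY_VTU_VID_MASK;

    if (reg & TOY_VTU_VID_PAGE)
        vid |= 0x1000;
    return vid;
}
```

Starting a dump of the whole table from VID 8191 sets every VID bit including the page bit, which corresponds to "all VID bits must be set to one" in the text above.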
2017-05-01  net: dsa: mv88e6xxx: simplify VTU entry getter  Vivien Didelot
Make the code which fetches or initializes a new VTU entry more concise. This allows us to get rid of the old underscore prefix naming. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: make VTU helpers static  Vivien Didelot
Now that we have chip operations for VTU accesses, mark all helpers from global1_vtu.c as static. Only the various implementations of the GetNext, LoadPurge and Flush operations need to be exposed. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: add VTU Load/Purge operation  Vivien Didelot
Add a new vtu_loadpurge operation to the chip info structure to differentiate the various implementations of the VTU accesses. Now that the STU handling is abstracted behind VTU operations, kill the obsolete MV88E6XXX_FLAG_STU flag. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: add VTU GetNext operation  Vivien Didelot
Add a new vtu_getnext operation to the chip info structure to differentiate the various implementations of the VTU accesses. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: load STU entry with VTU entry  Vivien Didelot
Now that the code writes both VTU and STU data when loading a VTU entry, load the corresponding STU entry at the same time. This allows us to get rid of the STU management in the _mv88e6xxx_vtu_new helper and thus remove the separate implementations of STU Load/Purge and STU GetNext, as well as the unused family checks. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: get STU entry on VTU GetNext  Vivien Didelot
Now that the code reads both VTU and STU data on VTU GetNext operation, fetch the STU entry data of a VTU entry at the same time. The STU data bits are masked with the VTU data bits and they are now all read at the same time a VTU GetNext operation is issued. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move STU GetNext operation  Vivien Didelot
Extract the generic portion of code to issue an STU GetNext operation, which will be used in other implementations. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move VTU Data accessors  Vivien Didelot
The code to access the VTU Data registers currently only supports the 88E6185 family and similar chips: 2-bit membership adjacent to 2-bit port state. Even though the 88E6352 family introduced an indirect table to program the VLAN Spanning Tree states, the usage of the VTU Data registers remains the same regardless of the VTU or STU operation. Now that the mv88e6xxx_vtu_entry structure contains both port membership and states data, factor out the code to access them into global1_vtu.c. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move generic VTU GetNext  Vivien Didelot
Even though every switch model has a different way to access the VTU Data bits, the base implementation of the VTU GetNext operation remains the same: wait, write the first VID to iterate from, start the operation, and read the next VID. Move this generic implementation into global1_vtu.c and abstract the handling of the start VID (similarly to the ATU GetNext implementation), before introducing a new chip operation for specific chips. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move VTU VID accessors  Vivien Didelot
Add helpers to access the VTU VID register in the global1_vtu.c file. At the same time, move mv88e6xxx_g1_vtu_vid_write at the beginning of _mv88e6xxx_vtu_loadpurge, which adds no functional changes but makes future patches simpler. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move VTU SID accessors  Vivien Didelot
Add helpers to access the VTU SID register in the global1_vtu.c file. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move VTU FID accessors  Vivien Didelot
Add helpers to access the VTU FID register in the global1_vtu.c file. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move VTU flush  Vivien Didelot
Move the VTU flush operation to global1_vtu.c and call it from a mv88e6xxx_vtu_setup helper, similarly to the ATU and PVT setup. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: move VTU Operation accessors  Vivien Didelot
Move the helper functions to access the Global 1 VTU Operation register to a new global1_vtu.c file, and get rid of the old underscore prefix naming convention. This file will be extended with all VTU/STU related code. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: split VTU entry data member  Vivien Didelot
VLAN aware Marvell chips can program 802.1Q VLAN membership as well as 802.1s per VLAN Spanning Tree state using the same 3 VTU Data registers. Some chips such as 88E6185 use different Data registers offsets for ports state and membership, and program them in a single operation. Other chips such as 88E6352 use the same register layout but program them in distinct operations (an indirect table is used for 802.1s.) Newer chips such as 88E6390 use the same offsets for both state and membership in distinct operations, thus require multiple data accesses. To correctly abstract this, split the "data" structure member of mv88e6xxx_vtu_entry in two "state" and "member" members, before adding VTU support for newer chips. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net: dsa: mv88e6xxx: add max VID to info  Vivien Didelot
Some chips don't have a VLAN Table Unit, most of them have a 4K table, and some others such as the 88E6390 family have a 13th bit for the VID. Add a new max_vid member to the info structure, used to check the presence of a VTU as well as the value used to iterate from in VTU GetNext operations. This makes the MV88E6XXX_FLAG_VTU obsolete, thus remove it. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  xfrm: Indicate xfrm_state offload errors  Ilan Tayari
Current code silently ignores driver errors when configuring IPSec offload xfrm_state, and falls back to host-based crypto. Fail the xfrm_state creation if the driver has an error, because the NIC offloading was explicitly requested by the user program. This will communicate back to the user that there was an error. Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  net/esp4: Fix invalid esph pointer crash  Ilan Tayari
Both esp_output and esp_xmit take a pointer to the ESP header and place it in esp_info struct prior to calling esp_output_head. Inside esp_output_head, the call to esp_output_udp_encap makes sure to update the pointer if it gets invalid. However, if esp_output_head itself calls skb_cow_data, the pointer is not updated and stays invalid, causing a crash after esp_output_head returns. Update the pointer if it becomes invalid in esp_output_head. Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
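The general hazard here is a pointer into a buffer that may be reallocated. The sketch below is a user-space analogy with hypothetical names (struct toy_buf, toy_grow, toy_grow_and_get_hdr); it is not the esp4 code and uses realloc in place of skb_cow_data. The idea is to keep the header position as an offset and recompute the pointer after any call that may move the data.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable packet buffer. */
struct toy_buf {
    unsigned char *data;
    size_t len;
};

/* Grow the buffer by 'extra' zeroed bytes. realloc may move the data, so
 * any pointer computed from the old b->data is invalid afterwards. */
static int toy_grow(struct toy_buf *b, size_t extra)
{
    unsigned char *p = realloc(b->data, b->len + extra);

    if (!p)
        return -1;
    memset(p + b->len, 0, extra);
    b->data = p;
    b->len += extra;
    return 0;
}

/* Grow the buffer and return a FRESH pointer to the header at hdr_off,
 * derived from the (possibly new) base address. Returns NULL on failure. */
static unsigned char *toy_grow_and_get_hdr(struct toy_buf *b, size_t hdr_off,
                                           size_t extra)
{
    if (toy_grow(b, extra) < 0)
        return NULL;
    return b->data + hdr_off;
}
```

A caller that cached `b->data + hdr_off` before the grow and kept using it afterwards would be in the same position as the stale ESP header pointer described above.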
2017-05-01  ip6_tunnel: Fix missing tunnel encapsulation limit option  Craig Gallek
The IPv6 tunneling code tries to insert IPV6_TLV_TNL_ENCAP_LIMIT and IPV6_TLV_PADN options when an encapsulation limit is defined (the default is a limit of 4). An MTU adjustment is done to account for these options as well. However, the options are never present in the generated packets. The issue appears to be a subtlety between IPV6_DSTOPTS and IPV6_RTHDRDSTOPTS defined in RFC 3542. When the IPIP tunnel driver was written, the encap limit options were included as IPV6_RTHDRDSTOPTS in dst0opt of struct ipv6_txoptions. Later, ipv6_push_nfrags_opts was (correctly) updated to require IPV6_RTHDR options when IPV6_RTHDRDSTOPTS are to be used. This caused the options to no longer be included in v6 encapsulated packets. The fix is to use IPV6_DSTOPTS (in dst1opt of struct ipv6_txoptions) instead. IPV6_DSTOPTS do not have the additional IPV6_RTHDR requirement. Fixes: 1df64a8569c7: ("[IPV6]: Add ip6ip6 tunnel driver.") Fixes: 333fad5364d6: ("[IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542)") Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next  David S. Miller
Johan Hedberg says:

====================
pull request: bluetooth-next 2017-04-30

Here's one last batch of Bluetooth patches in the bluetooth-next tree targeting the 4.12 kernel.

 - Remove custom ECDH implementation and use new KPP API instead
 - Add protocol checks to hci_ldisc
 - Add module license to HCI UART Nokia H4+ driver
 - Minor fix for 32bit user space - 64 bit kernel combination

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  switchdev: documentation: fix whitespace issues  Liam Beguin
Figure 1 is full of whitespaces; fix it Signed-off-by: Liam Beguin <lbeguin@tycoint.com> Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  mlxsw: spectrum_router: Simplify VRF enslavement  Ido Schimmel
When a netdev is enslaved to a VRF master, its router interface (RIF) needs to be destroyed (if it exists) and a new one created using the corresponding virtual router (VR). From the driver's perspective, the above is equivalent to an inetaddr event sent for this netdev. Therefore, when a port netdev (or its uppers) is enslaved to a VRF master, call the same function that would've been called had a NETDEV_UP been sent for this netdev in the inetaddr notification chain. This patch also fixes a bug when a LAG netdev with an existing RIF is enslaved to a VRF. Before this patch, each LAG port would drop the reference on the RIF, but would re-join the same one (in the wrong VR) soon after. With this patch, the corresponding RIF is first destroyed and a new one is created using the correct VR. Fixes: 7179eb5acd59 ("mlxsw: spectrum_router: Add support for VRFs") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  Merge tag 'mlx5-updates-2017-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux  David S. Miller
mlx5-updates-2017-04-30

Or says:

================
mlx5 neigh update

This series (whose code name is 'neigh update') from Hadar enhances the mlx5 TC IP tunnel offloads to deal with changes to tunnel destination neighbours used in offloaded flows which involve encapsulation.

In order to keep track of the validity state of such neighbours, we register a netevent notifier callback and act on NEIGH_UPDATE events: if a neighbour becomes valid, offload the related flows to HW (the other way around when a neigh becomes invalid) and similarly when a neigh MAC address changes.

Since this traffic is offloaded from the host OS, the neighbour for the IP tunnel destination can mistakenly become STALE and be deleted by the kernel since its 'used' value wasn't changed. To address that, we proactively update the neighbour 'used' value every DELAY_PROBE_TIME seconds, using time stamps generated by the existing driver code for HW flow counters. We use the DELAY_PROBE_TIME_UPDATE event to adjust the frequency of the updates.

Prior to the core of the series, there's a patch from Saeed that introduces an extendable vport representor implementation scheme. It provides a separation between the eswitch and the netdev related aspects of the representors.

We would like to thank Ido Schimmel and Ilya Lesokhin for their coaching && advice through the long design and review cycles while we struggled to understand and (hopefully correctly) implement the locking around the different driver flows(..) .

- Or.
=================

Misc Updates:

From Tariq: Some small performance and trivial code optimization for mlx5 netdev driver
 - Optimize poll ICOSQ completion queue
 - Use prefetchw when a write is to follow
 - Use u8 as ownership type in mlx5e_get_cqe()

From Eran:
 - Disable LRO by default on specific setups

From Eli:
 - Small cleanup for E-Switch to avoid redundant allocation

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  qed: Prevent warning without CONFIG_RFS_ACCEL  Mintz, Yuval
After removing the PTP related initialization from slowpath start, the remaining PTT entry is required only in case CONFIG_RFS_ACCEL is set. Otherwise, it leads to a warning due to it being unused. Fixes: d179bd1699fc ("qed: Acquire/release ptt_ptp lock when enabling/disabling PTP") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  Merge branch 'qed-RoCE-fixes'  David S. Miller
Yuval Mintz says: ==================== qed: RoCE related pseudo-fixes This series contains multiple small corrections to the RoCE logic in qed plus some debug information and inter-module parameter meant to prevent issues further along. - #1, #6 Share information with protocol driver [either new or filling missing bits in existing API]. - #2, #3 correct error flows in qed. - #4 add debug related information. - #5 fixes a minor issue in the HW configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  qed: output the DPM status and WID count  Ram Amrani
Output to the RDMA driver whether DPM mode is enabled or disabled in the HW and, if it is enabled, the number of WIDs it supports. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  qed: align DPI configuration to HW requirements  Ram Amrani
When calculating doorbell BAR partitioning round up the number of CPUs to the nearest power of 2 so the size of the DPI (per user section) configured in the hardware will be stored properly and not truncated. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
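Rounding a count up to the nearest power of two can be sketched as follows. The function name is hypothetical and this is not the qed code; the kernel has its own roundup_pow_of_two() helper for this purpose.

```c
/* Hypothetical helper: smallest power of two >= n, for n in 1..2^31
 * (assuming a 32-bit unsigned int). Returns 0 if no such power of two
 * fits in an unsigned int. */
static unsigned int toy_roundup_pow_of_two(unsigned int n)
{
    unsigned int p = 1;

    while (p && p < n)   /* p becomes 0 if the shift overflows */
        p <<= 1;
    return p;
}
```

For example, 6 CPUs would be treated as 8 and 12 CPUs as 16, while counts that are already a power of two are left unchanged.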
2017-05-01  qed: verify RoCE resource bitmaps are released  Ram Amrani
Add mechanism to verify RoCE resources are released prior to freeing the bitmaps. If this is not the case, print what resources were not released. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  qed: add error handling flow to TID deregistration posting failure  Ram Amrani
If the posting of the ramrod for the purpose of TID deregistration fails, abort the deregistration operation without using the FW's return code. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  qed: remove unused SQ error state  Ram Amrani
The internal RoCE SQE QP state isn't being used. Instead we mark the QP as in regular error state. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  qed: configure the RoCE max message size  Ram Amrani
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01  bpf: enhance verifier to understand stack pointer arithmetic  Yonghong Song
llvm 4.0 and above generates the code like below:

    ....
    440: (b7) r1 = 15
    441: (05) goto pc+73
    515: (79) r6 = *(u64 *)(r10 -152)
    516: (bf) r7 = r10
    517: (07) r7 += -112
    518: (bf) r2 = r7
    519: (0f) r2 += r1
    520: (71) r1 = *(u8 *)(r8 +0)
    521: (73) *(u8 *)(r2 +45) = r1
    ....

and the verifier complains "R2 invalid mem access 'inv'" for insn #521. This is because verifier marks register r2 as unknown value after #519 where r2 is a stack pointer and r1 holds a constant value.

Teach verifier to recognize "stack_ptr + imm" and "stack_ptr + reg with const val" as valid stack_ptr with new offset.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
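The tracking rule described above can be modelled with a toy register state. This is a simplified sketch, not the verifier's data structures: the enum, struct and function names are hypothetical, and only the 512-byte BPF stack size is taken as a known fact.

```c
#define TOY_STACK_SIZE 512  /* BPF programs have a 512-byte stack */

enum toy_type { TOY_UNKNOWN, TOY_CONST, TOY_PTR_TO_STACK };

/* Hypothetical register state: for TOY_CONST, val is the constant; for
 * TOY_PTR_TO_STACK, val is the offset from the frame pointer (r10). */
struct toy_reg {
    enum toy_type type;
    long val;
};

/* Model "dst += src". A stack pointer plus a known constant stays a
 * stack pointer with an adjusted offset; anything else the toy model
 * cannot reason about becomes unknown. */
static void toy_add(struct toy_reg *dst, const struct toy_reg *src)
{
    if (src->type == TOY_CONST &&
        (dst->type == TOY_PTR_TO_STACK || dst->type == TOY_CONST)) {
        dst->val += src->val;
    } else {
        dst->type = TOY_UNKNOWN;
        dst->val = 0;
    }
}

/* Is an access of 'size' bytes at reg + off inside the stack frame? */
static int toy_stack_access_ok(const struct toy_reg *reg, long off, long size)
{
    long start = reg->val + off;

    return reg->type == TOY_PTR_TO_STACK &&
           start >= -TOY_STACK_SIZE && start + size <= 0;
}
```

Replaying the listing, r7 = r10 - 112 and r1 = 15 give r2 a stack offset of -97, so the one-byte store at r2 + 45 targets offset -52 and falls inside the frame.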
2017-05-01  benet: Use time_before_eq for time comparison  Karim Eshapa
Use time_before_eq for the time comparison; it is safer because it handles timer wrapping, which makes the code future-proof. Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>