aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/cxgb4
AgeCommit message (Collapse)Author
2016-05-26IB/core: Make device counter infrastructure dynamicChristoph Lameter
In practice, each RDMA device has a unique set of counters that the hardware implements. Having a central set of counters that they must all adhere to is limiting and causes many useful counters to not be available. Therefore we create a dynamic counter registration infrastructure. The driver must implement a stats structure allocation routine, in which the driver must place the directory name it wants, a list of names for all of the counters, an array of u64 counters themselves, plus a few generic configuration options. We then implement a core routine to create a sysfs file for each of the named stats elements, and a core routine to retrieve the stats when any of the sysfs attribute files are read. To avoid excessive beating on the stats generation routine in the drivers, the core code also caches the stats for a short period of time so that someone attempting to read all of the stats in a given device's directory will not result in a stats generation call per file read. Future work will attempt to standardize just the shared stats elements, and possibly add a method to get the stats via netlink in addition to sysfs. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [ Add caching, make structure names more informative, add i40iw support, other significant rewrites from the original patch ]
2016-05-13Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into ↵Doug Ledford
k.o/for-4.7
2016-05-13iw_cxgb4: Convert a __force castBart Van Assche
__force casts should be avoided if there is a better alternative. Hence modify the comparison of s_addr with INADDR_ANY such that the __force cast is no longer necessary. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Vipul Pandya <vipul@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Add arp failure handlers to send_mpa_reply/reject()Hariprasad S
These handlers when called print error message to the kernel log, but the actual handling is done by _c4iw_free_ep() and process_timeout(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Always wake up waiter in c4iw_peer_abort_intr()Hariprasad S
Currently c4iw_peer_abort_intr() does not wake up the waiter if the endpoint state indicates we're using MPAv2 and we're currently trying to connect. This was introduced with commit 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") However, this original fix is flawed because it introduces a race that can cause a deadlock of the iwarp stack. Here is the race: ->local side sets up an active offload connection. ->local side sends MPA_START request. ->peer sends MPA_START response. ->local side ingress cpl thread begins processing the MPA_START response, but before it changes the state from MPA_REQ_SENT to FPDU_MODE: ->peer sends a RST which results in a ABORT_REQ_RSS. This triggers peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the stack will re-attempt the connection using MPAv1. ->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT WR. ->So the cpl thread is stuck waiting for a reply and cannot process the ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a halt because no more ingress cpls are processed by the stack... The correct fix for the issue is to always do the wake up in c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect(). Fixes: 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle ret value of process_mpa_reply() in rx_dataHariprasad S
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: atomic find and reference for listening endpointsHariprasad S
Add get_ep_from_stid() which will atomically find and reference the endpoint struct if found. This avoids touch-after-free races between threads destroying listening endpoints and the CPL processing thread processing an incoming PASS_ACCEPT_REQ CPL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle ULP accept/reject during ABORTINGHariprasad S
c4iw_reject() and c4iw_accept() need to handle the case where the endpoint has timed out and is in the middle of ABORTING the connection. Here is the flow that causes the BUG_ON() to fire on the server side: 1) offload connection setup and endpoint timer started 2) MPA_START request received from peer, CONNECT_REQUEST passed to ULP 3) endpoint timer fires, and process_timeout() aborts the connection, this moves the endpoint state to ABORTING until HW sends up the ABORT_RPL_RSS. 4) application exits closing the CONNECT_REQUEST cm_id. The IWCM calls c4iw_reject_cr() to destroy this connection request. 5) WHAMO: BUG_ON() because the state is ABORTING. The fix is to change c4iw_reject_cr() and c4iw_accept_cr() to fail the operation if the state is not in MPA_REQ_RCVD vs in DEAD. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Release ep for for FPDU_MODE and MPA_REQ_RCVD in process_timeoutHariprasad S
ARP failure may also happen when ep in FPDU_MODE and these failures need to be handled by process_timeout(). process_timeout() also has to handle case MPA_REQ_RCVD, setting abort to 1, leading to ep resource release. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Free skb in case of arp failure in _c4iw_free_ep()Hariprasad S
Arp failure for send_mpa_reply/reject() is handled by freeing the mpa_skb in c4iw_free_ep() before releasing ep. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: atomically lookup ep and get a referenceHariprasad S
There is a race between ULP threads calling c4iw_ep_disconnect() via c4iw_modify_rc_qp() and the ingress CPL thread where the ULP thread can free the endpoint just after the ingress CPL thread finds the ep pointer in the tid table. To avoid this, we now use the hwtid_idr table for lookups instead of the LLD tid table so we can lock around insert, remove, and lookup+get_ep to avoid the race. The CPL handlers now will either find the ep ptr and have a ref on it, or not find it and they can discard the CPL. Callers of get_ep_from_tid() will have a ref on the ep if found, and thus must deref when they are done. Negative advice in peer_abort_intr() need to dereference the ep. therefore peer_abort() is scheduled to dereference the ep later. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle return value of c4iw_ofld_send() in abort_arp_failure()Hariprasad S
In abort_arp_failure(), the return value from c4iw_ofld_send() is ignored and thus if the CPL isn't sent, the endpoint is stuck and never gets aborted. Failure of c4iw_ofld_send() is treated as fatal error, and the ep resources are released in a safer context through process_work(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: in process_timeout() don't move ep state to ABORTINGHariprasad S
Moving the state to ABORTING causes the ep to get stuck because c4iw_ep_timeout() thinks the ABORT has already been done. So leave the state alone and let c4iw_ep_disconnect() do the right thing given the ep state. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: handle return value of c4iw_l2t_send() and send_mpa_req()Hariprasad S
->In act_open_rpl(), CPL_ERR_TCAM_FULL error handling branch, there is no handling of the return value of send_fw_act_open_req(). ->In send_fw_act_open_req(), there is no handling of return value of c4iw_l2t_send(), which may cause a ep leak and won't notify upper layers on connection establish failure. ->send_mpa_req() should act on the return from c4iw_l2t_send() and return the error to the caller. ->In case of c4iw_l2t_send() failure in send_mpa_req(), returns without starting the timer and not changing the ep state, which is further handled by act_establish() -> In act_establish()?if send_mpa_request's get_skb returns an error, may cause an ep leak. So handle return value of send_mpa_req() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: stop_ep_timer() after MPA negotiationHariprasad S
->Stop the ep timer after MPA negotiation so that the arp failures during send_mpa_reply/reject will be handled by process_timeout() after the ep timer expires. ->Added case MPA_REP_SENT in process_timeout(). ->For MPA reject, c4iw_ep_disconnect tries to start an already started timer, which leads to warning message "timer already started". -> In case of mpa reject stop the timer and call send_mpa_reject(). -> Added new ep flag STOP_MPA_TIMER to tell fw4_ack() to stop the timer only for send_mpa_reply(), which is set in c4iw_accept_cr(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Do not stop timer in case of incomplete messagesHariprasad S
In case of incomplete mpa messages we should not stop timer as it results in return with timeout for the next mpa message Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: parent_ep has to be dereferenced in case of passive accept ↵Hariprasad S
failure -> On passive side of connection parent_ep referenced during connection request has to be dereferenced during the passive accept failure. -> As passive accept failure error handlinglogic runs in atomic context, the parent ep is dereferenced by scheduling work request. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: set the correct FID value in DSGL commandsHariprasad S
The FID value in a ULP_MEMIO command needs to be set to an IQ ID of a queue configured for our PF. The FID/IQ id is used to index into the PCIE FID table, to find out on which function the DMA needs to be issued. Essentially, every DMA needs to have the ingress queue. The exact ingress queue doesn't matter, but it needs to be an ingress queue associated with the function you want to see the DMA on. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Correct RFC number of MPAHariprasad S
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Add few history bits for epHariprasad S
- add EP_DISC_FAIL history bit - add QP_REFED/DEREFED history bits - Add functions to ref/deref the cm_id and add history bit for the same - add CLOSE_CON_RPL history Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13IB/core: Enhance ib_map_mr_sg()Bart Van Assche
The SRP initiator allows to set max_sectors to a value that exceeds the largest amount of data that can be mapped at once with an mlx4 HCA using fast registration and a page size of 4 KB. Hence modify ib_map_mr_sg() such that it can map partial sg-elements. If an sg-element has been mapped partially, let the caller know which fraction has been mapped by adjusting *sg_offset. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Laurence Oberman <loberman@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13IB/core: Add passing an offset into the SG to ib_map_mr_sgChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: remove abort_connection() usage from ep_timeout()Hariprasad S
Use c4iw_ep_disconnect() instead. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. This is the last user of abort_connection(), so remove it too. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: move QP -> ERROR on fatal disconnect errorsHariprasad S
In c4iw_ep_disconnect(), if we fail to initiate a close operation, then move the qp to ERROR to disassociate the ep from the qp. Failure to do this will leak the ep resources. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: don't use abort_connection in process_mpa_request()Hariprasad S
Instead return whether the caller needs to disconnect. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: remove abort_connection() usage from accept/rejectHariprasad S
Use c4iw_ep_disconnect() instead. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: free resources when send_flowc() failsHariprasad S
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: remove connection abort from process_mpa_replyHariprasad S
Instead, have the caller, rx_data() handle the close/abort like it does for process_mpa_request(). This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: ensure eps don't get freed while the mutex is heldHariprasad S
In rx_data(), with the ep in FPDU_MODE, refcnt=2, if we get unexpected streaming data, we call c4iw_modify_rc_qp() and move the qp from RTS -> TERMINATE. In c4iw_modify_rc_qp(), if rdma_fini() returns an error, the ep will be dereferenced (refcnt=1). Then rx_data() calls c4iw_ep_disconnect() which starts the close operation. But if send_halfclose() fails in c4iw_ep_disconnect(), we will call release_ep_resources() derefing the ep which reduces the refcnt to 0 and and frees the ep. However we still has the ep mutex at that point, so we have a touch-after-free bug. There is a similar issue where peer_close() calls c4iw_ep_disconnect(). The solution is to add a reference to the ep in c4iw_ep_disconnect() after acquiring the mutex, and release it after releasing the mutex. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: stop ep timer on close failureHariprasad S
In c4iw_ep_disconnect(), if we start the ep timer to begin a close, but send_halfclose() fails, we need to stop the timer and send a CLOSE event up to the IWCM before releasing the resources. Otherwise, we can crash when the ep timer fires if the ep is referencing a previous instance of the device. This can happen as part of adapter reset/recovery, for instance. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: release ep resources on accept arp failureHariprasad S
If ARP fails before the CPL_PASS_ACCEPT_RPL is seen by hardware, the tid will be stuck in SYN_PEND and never released. So create an arp failure handler specifically for this message to release the endpoint resources. In pass_accept_rpl_arp_failure(), put the parent endpoint so it will be freed when destroyed. Also we don't need to call release_tid() here because _c4iw_free_ep() calls cxgb4_remove_tid() which releases the hwtid. If we get an ABORT_REQ_RSS instead of a PASS_ESTABLISH (because the peer's ACK to our SYN is never received), then put the parent as well in peer_abort(). Treat accept_cr() failures just like arp failures: put the parent ep and release the ep resources destroying the tid The ARP failure handlers are called in an atomic context, so we need to schedule some of the processing which might block. Namely _c4iw_free_ep() which needs a mutex. So create a "special" CPL opcode and handler and schedule it via sched() to be run by process_work() in a blockable context. Also rework the active open arp failure handler to make use of release_ep_resources(). This allows both the active and passive arp failure handlers to use the same deferred cleanup function. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-26RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chipsHariprasad S
For T4, kernel mode qps don't use the user doorbell. User mode qps during flow control db ringing are forced into kernel, where user doorbell is treated as kernel doorbell and proper bar2 offset in bar2 virtual space is calculated, which incase of T4 is a bogus address, causing a kernel panic due to illegal write during doorbell ringing. In case of T4, kernel mode qp bar2 virtual address should be 0. Added T4 check during bar2 virtual address calculation to return 0. Fixed Bar2 range checks based on bar2 physical address. The below oops will be fixed <1>BUG: unable to handle kernel paging request at 000000000002aa08 <1>IP: [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4>PGD 1416a8067 PUD 15bf35067 PMD 0 <4>Oops: 0002 [#1] SMP <4>last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.4/infiniband/cxgb4_0/node_guid <4>CPU 5 <4>Modules linked in: rdma_ucm rdma_cm ib_cm ib_sa ib_mad ib_uverbs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge autofs4 target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf vhost_net macvtap macvlan tun kvm uinput microcode iTCO_wdt iTCO_vendor_support sg joydev serio_raw i2c_i801 i2c_core lpc_ich mfd_core e1000e ptp pps_core ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif pata_acpi ata_generic ata_piix iw_cxgb4 iw_cm ib_core ib_addr cxgb4 ipv6 dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] <4> Supermicro X8ST3/X8ST3 <4>RIP: 0010:[<ffffffffa011d800>] [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4>RSP: 0000:ffff880155a03db0 EFLAGS: 00010006 <4>RAX: 000000000000001d RBX: ffff88013ae5fc00 RCX: ffff880155adb180 <4>RDX: 000000000002aa00 RSI: 0000000000000001 RDI: ffff88013ae5fdf8 <4>RBP: ffff880155a03e10 R08: 0000000000000000 R09: 0000000000000001 <4>R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 <4>R13: 000000000000001d R14: ffff880156414ab0 R15: ffffe8ffffc05b88 <4>FS: 0000000000000000(0000) GS:ffff8800282a0000(0000) knlGS:0000000000000000 <4>CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b <4>CR2: 000000000002aa08 CR3: 000000015bd0e000 CR4: 00000000000007e0 <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>Process cxgb4 (pid: 394, threadinfo ffff880155a00000, task ffff880156414ab0) <4>Stack: <4> ffff880156415068 ffff880155adb180 ffff880155a03df0 ffffffffa00a344b <4><d> 00000000000003e8 ffff880155920000 0000000000000004 ffff880155920000 <4><d> ffff88015592d438 ffffffffa00a3860 ffff880155a03fd8 ffffe8ffffc05b88 <4>Call Trace: <4> [<ffffffffa00a344b>] ? enable_txq_db+0x2b/0x80 [cxgb4] <4> [<ffffffffa00a3860>] ? process_db_full+0x0/0xa0 [cxgb4] <4> [<ffffffffa00a38a6>] process_db_full+0x46/0xa0 [cxgb4] <4> [<ffffffff8109fda0>] worker_thread+0x170/0x2a0 <4> [<ffffffff810a6aa0>] ? autoremove_wake_function+0x0/0x40 <4> [<ffffffff8109fc30>] ? worker_thread+0x0/0x2a0 <4> [<ffffffff810a660e>] kthread+0x9e/0xc0 <4> [<ffffffff8100c28a>] child_rip+0xa/0x20 <4> [<ffffffff810a6570>] ? kthread+0x0/0xc0 <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20 <4>Code: e9 ba 00 00 00 66 0f 1f 44 00 00 44 8b 05 29 07 02 00 45 85 c0 0f 85 71 02 00 00 8b 83 70 01 00 00 45 0f b7 ed c1 e0 0f 44 09 e8 <89> 42 08 0f ae f8 66 c7 83 82 01 00 00 00 00 44 0f b7 ab dc 01 <1>RIP [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4> RSP <ffff880155a03db0> <4>CR2: 000000000002aa08` Based on original work by Bharat Potnuri <bharat@chelsio.com> Fixes: 74217d4c6a4fb0d8 ("iw_cxgb4: support for bar2 qid densities exceeding the page size") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Reviewed-by: Leon Romanovsky <leon@leon.nu> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-26iw_cxgb4: handle draining an idle qpSteve Wise
In c4iw_drain_sq/rq(), if the particular queue is already empty then don't block. Fixes: ce4af14d94aa ('iw_cxgb4: add queue drain functions') Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-26iw_cxgb4: initialize ibdev.iwcm->ifname for port mappingSteve Wise
The IWCM uses ibdev.iwcm->ifname for registration with the iwarp port map daemon. But iw_cxgb4 did not initialize this field which causes intermittent registration failures based on the contents of the uninitialized memory. Fixes: 170003c894d9 ("iw_cxgb4: remove port mapper related code") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-23Merge branch 'for-next-merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull more SCSI target updates from Nicholas Bellinger: "This series contains cxgb4 driver prerequisites for supporting iscsi segmentation offload (ISO), that will be utilized for a number of future v4.7 developments in iscsi-target for supporting generic hw offloads" * 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: cxgb4: update Kconfig and Makefile cxgb4: add iSCSI DDP page pod manager cxgb4, iw_cxgb4: move delayed ack macro definitions cxgb4: move VLAN_NONE macro definition cxgb4: update struct cxgb4_lld_info definition cxgb4: add definitions for iSCSI target ULD cxgb4, cxgb4i: move struct cpl_rx_data_ddp definition cxgb4, iw_cxgb4, cxgb4i: remove duplicate definitions cxgb4, iw_cxgb4: move definitions to common header file cxgb4: large receive offload support cxgb4: allocate resources for CXGB4_ULD_ISCSIT cxgb4: add new ULD type CXGB4_ULD_ISCSIT
2016-03-22cxgb4, iw_cxgb4: move delayed ack macro definitionsVarun Prakash
move delayed ack macro definitions to common header file t4_msg.h. Signed-off-by: Varun Prakash <varun@chelsio.com> Acked-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-22cxgb4, iw_cxgb4, cxgb4i: remove duplicate definitionsVarun Prakash
move struct ulptx_idata definition to common header file t4_msg.h. Signed-off-by: Varun Prakash <varun@chelsio.com> Acked-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-22cxgb4, iw_cxgb4: move definitions to common header fileVarun Prakash
move struct tcp_options, struct cpl_pass_accept_req, enum defining congestion control algorithms and associated macros to common header file t4_msg.h Signed-off-by: Varun Prakash <varun@chelsio.com> Acked-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-16Merge branches 'nes', 'cxgb4' and 'iwpm' into k.o/for-4.6Doug Ledford
2016-03-16iw_cxgb4: remove port mapper related codeSteve Wise
Now that most of the port mapper code been moved to iwcm, we can remove it from iw_cxgb4. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-16Merge branches 'mlx4', 'mlx5' and 'ocrdma' into k.o/for-4.6Doug Ledford
2016-03-01IB/core: Add vendor's specific data to alloc mwMatan Barak
Passing udata to the vendor's driver in order to pass data from the user-space driver to the kernel-space driver. This data will be used in downstream patches. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29iw_cxgb4: add queue drain functionsSteve Wise
Add completion objects, named sq_drained and rq_drained, to the c4iw_qp struct. The queue-specific completion object is signaled when the last CQE is drained from the CQ for that queue. Add c4iw_drain_sq() to block until qp->rq_drained is completed. Add c4iw_drain_rq() to block until qp->sq_drained is completed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29infiniband: cxgb4: use %pR format string for printing resourcesArnd Bergmann
The cxgb4 prints an MMIO resource using the "0x%x" and "%p" format strings on the length and start, respective, but that triggers a compiler warning when using a 64-bit resource_size_t on a 32-bit architecture: drivers/infiniband/hw/cxgb4/device.c: In function 'c4iw_rdev_open': drivers/infiniband/hw/cxgb4/device.c:807:7: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] (void *)pci_resource_start(rdev->lldi.pdev, 2), This changes the format string to use %pR instead, which pretty-prints the resource, avoids the warning and is shorter. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29iw_cxgb4: Max fastreg depth depends on DSGL supportHariprasad S
The max depth of a fastreg mr depends on whether the device supports DSGL or not. So compute it dynamically based on the device support and the module use_dsgl option. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29cxgb4/iw_cxgb4: TOS supportHariprasad S
This series provides support for iWARP applications to specify a TOS value and have that map to a VLAN Priority for iw_cxgb4 iWARP connections. In iw_cxgb4, when allocating an L2T entry, pass the skb_priority based on the tos value in the cm_id. Also pass the correct tos value during connection setup so the passive side gets the client's desired tos. When sending the FLOWC work request to FW, if the egress device is in a vlan, then use the vlan priority bits as the scheduling class. This allows associating RDMA connections with scheduling classes to provide traffic shaping per flow. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29iw_cxgb4: remove false error log entryHariprasad S
Don't log errors if a listening endpoint is going away when procesing a PASS_ACCEPT_REQ message. This can happen. Change the error printk to a PDBG() debug log entry Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29iw_cxgb4: make queue allocation code more readableHariprasad S
Rename local mm* variables to more meaningful names Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "Initial roundup of 4.5 merge window patches - Remove usage of ib_query_device and instead store attributes in ib_device struct - Move iopoll out of block and into lib, rename to irqpoll, and use in several places in the rdma stack as our new completion queue polling library mechanism. Update the other block drivers that already used iopoll to use the new mechanism too. - Replace the per-entry GID table locks with a single GID table lock - IPoIB multicast cleanup - Cleanups to the IB MR facility - Add support for 64bit extended IB counters - Fix for netlink oops while parsing RDMA nl messages - RoCEv2 support for the core IB code - mlx4 RoCEv2 support - mlx5 RoCEv2 support - Cross Channel support for mlx5 - Timestamp support for mlx5 - Atomic support for mlx5 - Raw QP support for mlx5 - MAINTAINERS update for mlx4/mlx5 - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates - Add support for remote invalidate to the iSER driver (pushed through the RDMA tree due to dependencies, acknowledged by nab) - Update to NFSoRDMA (pushed through the RDMA tree due to dependencies, acknowledged by Bruce)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits) IB/mlx5: Unify CQ create flags check IB/mlx5: Expose Raw Packet QP to user space consumers {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib IB/mlx5: Support setting Ethernet priority for Raw Packet QPs IB/mlx5: Add Raw Packet QP query functionality IB/mlx5: Add create and destroy functionality for Raw Packet QP IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types IB/mlx5: Allocate a Transport Domain for each ucontext net/mlx5_core: Warn on unsupported events of QP/RQ/SQ net/mlx5_core: Add RQ and SQ event handling net/mlx5_core: Export transport objects IB/mlx5: Expose CQE version to user-space IB/mlx5: Add CQE version 1 support to user QPs and SRQs IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext IB/sa: Fix netlink local service GFP crash IB/srpt: Remove redundant wc array IB/qib: Improve ipoib UD performance IB/mlx4: Advertise RoCE v2 support IB/mlx4: Create and use another QP1 for RoCEv2 IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers ...
2016-01-19iw_cxgb4: Take clip reference before starting IPv6 listenHariprasad S
The h/w is designed in such a way that, if you do anything IPv6 related, a valid clip entry must be there. So take clip reference before creating IPv6 listening servers, and then if we fail to create server, release the clip entry. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>