aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/cxgb4 (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-09-02IB/cxgb4: Make _free_qp static to silence build warningBaoyou Xie1-1/+1
We get 1 warning when build kernel with W=1: drivers/infiniband/hw/cxgb4/qp.c:686:6: warning: no previous prototype for '_free_qp' [-Wmissing-prototypes] In fact, this function is only used in the file in which it is declared and don't need a declaration, but can be made static. so this patch marks it 'static'. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-23iw_cxgb4: Fix cxgb4 arm CQ logic w/IB_CQ_REPORT_MISSED_EVENTSBharat Potnuri2-5/+10
Current cxgb4 arm CQ logic ignores IB_CQ_REPORT_MISSED_EVENTS for request completion notification on a CQ. Due to this ib_poll_handler() assumes all events polled and avoids further iopoll scheduling. This patch adds logic to cxgb4 ib_req_notify_cq() handler to check if CQ is not empty and return accordingly. Based on the return value of ib_req_notify_cq() handler, ib_poll_handler() will schedule a run of iopoll handler. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-22iw_cxgb4: use the MPA initiator's IRD if < our ORDSteve Wise1-1/+1
The i40iw initiator sends an MPA-request with ird=16 and ord=16. The cxgb4 responder sends an MPA-reply with ord = 32 causing i40iw to terminate due to insufficient resources. The logic to reduce the ORD to <= peer's IRD was wrong. Reported-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-22iw_cxgb4: limit IRD/ORD advertised to ULP by device max.Steve Wise1-0/+4
The i40iw initiator sends an MPA-request with ird = 63, ord = 63. The cxgb4 responder sends a RST. Since the inbound ord=63 and it exceeds the max_ird/c4iw_max_read_depth (=32 default), chelsio decides to abort. Instead, cxgb4 should adjust the ord/ird down before presenting it to the ULP. Reported-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-03Merge branches 'cxgb4' and 'mlx5' into k.o/for-4.8Doug Ledford4-20/+50
2016-08-02iw_cxgb4: don't block in destroy_qp awaiting the last derefSteve Wise2-8/+15
Blocking in c4iw_destroy_qp() causes a deadlock when apps destroy a qp or disconnect a cm_id from their cm event handler function. There is no need to block here anyway, so just replace the refcnt atomic with a kref object and free the memory on the last put. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02iw_cxgb4: explicitly move the qp to ERROR state during flushSteve Wise1-0/+9
This forces the connection to abort if the application failed to disconnect before flushing. This is aligned with how the common flush services work. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02iw_cxgb4: stop MPA_REPLY timer when disconnectingSteve Wise1-1/+11
There exists a race where the application can setup a connection and then disconnect it before iw_cxgb4 processes the fw4_ack message. For passive side connections, the fw4_ack message is used to know when to stop the ep timer for MPA_REPLY messages. If the application disconnects before the fw4_ack is handled then c4iw_ep_disconnect() needs to clean up the timer state and stop the timer before restarting it for the disconnect timer. Failure to do this results in a "timer already started" message and a premature stopping of the disconnect timer. Fixes: e4b76a2 ("RDMA/iw_cxgb4: stop_ep_timer() after MPA negotiation") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02RDMA/iw_cxgb4: Use kfree_skb instead of kfreeHariprasad S1-11/+15
The commit 0f8ab0b6e91b4d53 ("RDMA/iw_cxgb4: Low resource fixes for Memory registration") from Jun 10, 2016, leads to the following static checker warning: drivers/infiniband/hw/cxgb4/mem.c:612 c4iw_alloc_mw() error: use kfree_skb() here instead of kfree(mhp->dereg_skb) Also fixes skb leak in c4iw_dealloc_mw Fixes: 0f8ab0b6e91b4d53 ("RDMA/iw_cxgb4: Low resource fixes for Memory registration") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23Merge branches 'cxgb4-4.8', 'mlx5-4.8' and 'fw-version' into k.o/for-4.8Doug Ledford7-133/+266
2016-06-23IB/cxgb4: Support device FW version stringIra Weiny1-16/+15
And remove sysfs fw_ver in favor of the core. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Low resource fixes for Completion queueHariprasad S2-17/+26
Pre-allocate buffers to deallocate completion queue, so that completion queue is deallocated during RDMA termination when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Low resource fixes for Memory registrationHariprasad S2-37/+76
Pre-allocate buffers for deregistering memory region and memory window during RDMA connection close, when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Low resource fixes for connection managerHariprasad S3-42/+114
Pre-allocate buffers for sending various control messages to close connection, abort connection, etc so that we gracefully handle connections when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Add missing error codes for act open cmdHariprasad S1-3/+8
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: clean up c4iw_reject_cr()Hariprasad S1-11/+8
Get rid of unneeded code, and refactor things a bit. For MPA version 0 we abort the connection. For > 0, we attempt to send an MPA_START/REJECT Reply, and then disconnect gracefully. If the send of the MPA message fails, then we abort the connection. We can ignore c4iw_ep_disconnect() errors here because it will clean up the endpoint if there are failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: allocate enough space for debugfs "qps" dumpHariprasad S1-1/+1
With IPv6 addresses, the "qps" debugfs is running out of space and truncating the output. Bump the required size accordingly. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: only read markers_enabled mod param onceHariprasad S1-6/+18
markers_enabled should be read only once during MPA negotiation. The present code does read markers_enabled twice during negotiation which results in setting wrong recv/xmit markers if the markers_enabled is changed in the middle of negotiation. With this change the markers_enabled is read only once during MPA negotiation. recv markers are set based on markers enabled module parameter and xmit markers are set based on markers flag from the MPA_START_REQ/MPA_START_REP. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26IB/core: Make device counter infrastructure dynamicChristoph Lameter1-9/+49
In practice, each RDMA device has a unique set of counters that the hardware implements. Having a central set of counters that they must all adhere to is limiting and causes many useful counters to not be available. Therefore we create a dynamic counter registration infrastructure. The driver must implement a stats structure allocation routine, in which the driver must place the directory name it wants, a list of names for all of the counters, an array of u64 counters themselves, plus a few generic configuration options. We then implement a core routine to create a sysfs file for each of the named stats elements, and a core routine to retrieve the stats when any of the sysfs attribute files are read. To avoid excessive beating on the stats generation routine in the drivers, the core code also caches the stats for a short period of time so that someone attempting to read all of the stats in a given device's directory will not result in a stats generation call per file read. Future work will attempt to standardize just the shared stats elements, and possibly add a method to get the stats via netlink in addition to sysfs. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [ Add caching, make structure names more informative, add i40iw support, other significant rewrites from the original patch ]
2016-05-13Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into k.o/for-4.7Doug Ledford3-133/+254
2016-05-13iw_cxgb4: Convert a __force castBart Van Assche1-1/+1
__force casts should be avoided if there is a better alternative. Hence modify the comparison of s_addr with INADDR_ANY such that the __force cast is no longer necessary. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Vipul Pandya <vipul@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Add arp failure handlers to send_mpa_reply/reject()Hariprasad S1-0/+7
These handlers when called print error message to the kernel log, but the actual handling is done by _c4iw_free_ep() and process_timeout(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Always wake up waiter in c4iw_peer_abort_intr()Hariprasad S1-10/+2
Currently c4iw_peer_abort_intr() does not wake up the waiter if the endpoint state indicates we're using MPAv2 and we're currently trying to connect. This was introduced with commit 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") However, this original fix is flawed because it introduces a race that can cause a deadlock of the iwarp stack. Here is the race: ->local side sets up an active offload connection. ->local side sends MPA_START request. ->peer sends MPA_START response. ->local side ingress cpl thread begins processing the MPA_START response, but before it changes the state from MPA_REQ_SENT to FPDU_MODE: ->peer sends a RST which results in a ABORT_REQ_RSS. This triggers peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the stack will re-attempt the connection using MPAv1. ->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT WR. ->So the cpl thread is stuck waiting for a reply and cannot process the ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a halt because no more ingress cpls are processed by the stack... The correct fix for the issue is to always do the wake up in c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect(). Fixes: 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle ret value of process_mpa_reply() in rx_dataHariprasad S1-2/+2
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: atomic find and reference for listening endpointsHariprasad S1-8/+28
Add get_ep_from_stid() which will atomically find and reference the endpoint struct if found. This avoids touch-after-free races between threads destroying listening endpoints and the CPL processing thread processing an incoming PASS_ACCEPT_REQ CPL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle ULP accept/reject during ABORTINGHariprasad S1-4/+2
c4iw_reject() and c4iw_accept() need to handle the case where the endpoint has timed out and is in the middle of ABORTING the connection. Here is the flow that causes the BUG_ON() to fire on the server side: 1) offload connection setup and endpoint timer started 2) MPA_START request received from peer, CONNECT_REQUEST passed to ULP 3) endpoint timer fires, and process_timeout() aborts the connection, this moves the endpoint state to ABORTING until HW sends up the ABORT_RPL_RSS. 4) application exits closing the CONNECT_REQUEST cm_id. The IWCM calls c4iw_reject_cr() to destroy this connection request. 5) WHAMO: BUG_ON() because the state is ABORTING. The fix is to change c4iw_reject_cr() and c4iw_accept_cr() to fail the operation if the state is not in MPA_REQ_RCVD vs in DEAD. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Release ep for for FPDU_MODE and MPA_REQ_RCVD in process_timeoutHariprasad S1-0/+2
ARP failure may also happen when ep in FPDU_MODE and these failures need to be handled by process_timeout(). process_timeout() also has to handle case MPA_REQ_RCVD, setting abort to 1, leading to ep resource release. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Free skb in case of arp failure in _c4iw_free_ep()Hariprasad S1-0/+2
Arp failure for send_mpa_reply/reject() is handled by freeing the mpa_skb in c4iw_free_ep() before releasing ep. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: atomically lookup ep and get a referenceHariprasad S1-28/+82
There is a race between ULP threads calling c4iw_ep_disconnect() via c4iw_modify_rc_qp() and the ingress CPL thread where the ULP thread can free the endpoint just after the ingress CPL thread finds the ep pointer in the tid table. To avoid this, we now use the hwtid_idr table for lookups instead of the LLD tid table so we can lock around insert, remove, and lookup+get_ep to avoid the race. The CPL handlers now will either find the ep ptr and have a ref on it, or not find it and they can discard the CPL. Callers of get_ep_from_tid() will have a ref on the ep if found, and thus must deref when they are done. Negative advice in peer_abort_intr() need to dereference the ep. therefore peer_abort() is scheduled to dereference the ep later. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle return value of c4iw_ofld_send() in abort_arp_failure()Hariprasad S1-3/+9
In abort_arp_failure(), the return value from c4iw_ofld_send() is ignored and thus if the CPL isn't sent, the endpoint is stuck and never gets aborted. Failure of c4iw_ofld_send() is treated as fatal error, and the ep resources are released in a safer context through process_work(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: in process_timeout() don't move ep state to ABORTINGHariprasad S1-3/+0
Moving the state to ABORTING causes the ep to get stuck because c4iw_ep_timeout() thinks the ABORT has already been done. So leave the state alone and let c4iw_ep_disconnect() do the right thing given the ep state. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: handle return value of c4iw_l2t_send() and send_mpa_req()Hariprasad S1-13/+22
->In act_open_rpl(), CPL_ERR_TCAM_FULL error handling branch, there is no handling of the return value of send_fw_act_open_req(). ->In send_fw_act_open_req(), there is no handling of return value of c4iw_l2t_send(), which may cause a ep leak and won't notify upper layers on connection establish failure. ->send_mpa_req() should act on the return from c4iw_l2t_send() and return the error to the caller. ->In case of c4iw_l2t_send() failure in send_mpa_req(), returns without starting the timer and not changing the ep state, which is further handled by act_establish() -> In act_establish()?if send_mpa_request's get_skb returns an error, may cause an ep leak. So handle return value of send_mpa_req() Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: stop_ep_timer() after MPA negotiationHariprasad S2-20/+20
->Stop the ep timer after MPA negotiation so that the arp failures during send_mpa_reply/reject will be handled by process_timeout() after the ep timer expires. ->Added case MPA_REP_SENT in process_timeout(). ->For MPA reject, c4iw_ep_disconnect tries to start an already started timer, which leads to warning message "timer already started". -> In case of mpa reject stop the timer and call send_mpa_reject(). -> Added new ep flag STOP_MPA_TIMER to tell fw4_ack() to stop the timer only for send_mpa_reply(), which is set in c4iw_accept_cr(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Do not stop timer in case of incomplete messagesHariprasad S1-14/+16
In case of incomplete mpa messages we should not stop timer as it results in return with timeout for the next mpa message Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: parent_ep has to be dereferenced in case of passive accept failureHariprasad S1-6/+19
-> On passive side of connection parent_ep referenced during connection request has to be dereferenced during the passive accept failure. -> As passive accept failure error handlinglogic runs in atomic context, the parent ep is dereferenced by scheduling work request. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: set the correct FID value in DSGL commandsHariprasad S1-2/+3
The FID value in a ULP_MEMIO command needs to be set to an IQ ID of a queue configured for our PF. The FID/IQ id is used to index into the PCIE FID table, to find out on which function the DMA needs to be issued. Essentially, every DMA needs to have the ingress queue. The exact ingress queue doesn't matter, but it needs to be an ingress queue associated with the function you want to see the DMA on. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Correct RFC number of MPAHariprasad S1-1/+1
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Add few history bits for epHariprasad S2-18/+36
- add EP_DISC_FAIL history bit - add QP_REFED/DEREFED history bits - Add functions to ref/deref the cm_id and add history bit for the same - add CLOSE_CON_RPL history Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13IB/core: Enhance ib_map_mr_sg()Bart Van Assche2-2/+2
The SRP initiator allows to set max_sectors to a value that exceeds the largest amount of data that can be mapped at once with an mlx4 HCA using fast registration and a page size of 4 KB. Hence modify ib_map_mr_sg() such that it can map partial sg-elements. If an sg-element has been mapped partially, let the caller know which fraction has been mapped by adjusting *sg_offset. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Laurence Oberman <loberman@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13IB/core: Add passing an offset into the SG to ib_map_mr_sgChristoph Hellwig2-7/+5
Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: remove abort_connection() usage from ep_timeout()Hariprasad S1-10/+2
Use c4iw_ep_disconnect() instead. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. This is the last user of abort_connection(), so remove it too. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: move QP -> ERROR on fatal disconnect errorsHariprasad S1-0/+13
In c4iw_ep_disconnect(), if we fail to initiate a close operation, then move the qp to ERROR to disassociate the ep from the qp. Failure to do this will leak the ep resources. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: don't use abort_connection in process_mpa_request()Hariprasad S1-29/+35
Instead return whether the caller needs to disconnect. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: remove abort_connection() usage from accept/rejectHariprasad S1-14/+15
Use c4iw_ep_disconnect() instead. This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: free resources when send_flowc() failsHariprasad S1-5/+18
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: remove connection abort from process_mpa_replyHariprasad S1-2/+13
Instead, have the caller, rx_data() handle the close/abort like it does for process_mpa_request(). This is part of getting rid of abort_connection() altogether so we properly clean up on send_abort() failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: ensure eps don't get freed while the mutex is heldHariprasad S1-0/+7
In rx_data(), with the ep in FPDU_MODE, refcnt=2, if we get unexpected streaming data, we call c4iw_modify_rc_qp() and move the qp from RTS -> TERMINATE. In c4iw_modify_rc_qp(), if rdma_fini() returns an error, the ep will be dereferenced (refcnt=1). Then rx_data() calls c4iw_ep_disconnect() which starts the close operation. But if send_halfclose() fails in c4iw_ep_disconnect(), we will call release_ep_resources() derefing the ep which reduces the refcnt to 0 and and frees the ep. However we still has the ep mutex at that point, so we have a touch-after-free bug. There is a similar issue where peer_close() calls c4iw_ep_disconnect(). The solution is to add a reference to the ep in c4iw_ep_disconnect() after acquiring the mutex, and release it after releasing the mutex. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: stop ep timer on close failureHariprasad S1-1/+6
In c4iw_ep_disconnect(), if we start the ep timer to begin a close, but send_halfclose() fails, we need to stop the timer and send a CLOSE event up to the IWCM before releasing the resources. Otherwise, we can crash when the ep timer fires if the ep is referencing a previous instance of the device. This can happen as part of adapter reset/recovery, for instance. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-05RDMA/iw_cxgb4: release ep resources on accept arp failureHariprasad S1-16/+66
If ARP fails before the CPL_PASS_ACCEPT_RPL is seen by hardware, the tid will be stuck in SYN_PEND and never released. So create an arp failure handler specifically for this message to release the endpoint resources. In pass_accept_rpl_arp_failure(), put the parent endpoint so it will be freed when destroyed. Also we don't need to call release_tid() here because _c4iw_free_ep() calls cxgb4_remove_tid() which releases the hwtid. If we get an ABORT_REQ_RSS instead of a PASS_ESTABLISH (because the peer's ACK to our SYN is never received), then put the parent as well in peer_abort(). Treat accept_cr() failures just like arp failures: put the parent ep and release the ep resources destroying the tid The ARP failure handlers are called in an atomic context, so we need to schedule some of the processing which might block. Namely _c4iw_free_ep() which needs a mutex. So create a "special" CPL opcode and handler and schedule it via sched() to be run by process_work() in a blockable context. Also rework the active open arp failure handler to make use of release_ep_resources(). This allows both the active and passive arp failure handlers to use the same deferred cleanup function. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-26RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chipsHariprasad S2-2/+6
For T4, kernel mode qps don't use the user doorbell. User mode qps during flow control db ringing are forced into kernel, where user doorbell is treated as kernel doorbell and proper bar2 offset in bar2 virtual space is calculated, which incase of T4 is a bogus address, causing a kernel panic due to illegal write during doorbell ringing. In case of T4, kernel mode qp bar2 virtual address should be 0. Added T4 check during bar2 virtual address calculation to return 0. Fixed Bar2 range checks based on bar2 physical address. The below oops will be fixed <1>BUG: unable to handle kernel paging request at 000000000002aa08 <1>IP: [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4>PGD 1416a8067 PUD 15bf35067 PMD 0 <4>Oops: 0002 [#1] SMP <4>last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.4/infiniband/cxgb4_0/node_guid <4>CPU 5 <4>Modules linked in: rdma_ucm rdma_cm ib_cm ib_sa ib_mad ib_uverbs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge autofs4 target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf vhost_net macvtap macvlan tun kvm uinput microcode iTCO_wdt iTCO_vendor_support sg joydev serio_raw i2c_i801 i2c_core lpc_ich mfd_core e1000e ptp pps_core ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif pata_acpi ata_generic ata_piix iw_cxgb4 iw_cm ib_core ib_addr cxgb4 ipv6 dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] <4> Supermicro X8ST3/X8ST3 <4>RIP: 0010:[<ffffffffa011d800>] [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4>RSP: 0000:ffff880155a03db0 EFLAGS: 00010006 <4>RAX: 000000000000001d RBX: ffff88013ae5fc00 RCX: ffff880155adb180 <4>RDX: 000000000002aa00 RSI: 0000000000000001 RDI: ffff88013ae5fdf8 <4>RBP: ffff880155a03e10 R08: 0000000000000000 R09: 0000000000000001 <4>R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 <4>R13: 000000000000001d R14: ffff880156414ab0 R15: ffffe8ffffc05b88 <4>FS: 0000000000000000(0000) GS:ffff8800282a0000(0000) knlGS:0000000000000000 <4>CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b <4>CR2: 000000000002aa08 CR3: 000000015bd0e000 CR4: 00000000000007e0 <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>Process cxgb4 (pid: 394, threadinfo ffff880155a00000, task ffff880156414ab0) <4>Stack: <4> ffff880156415068 ffff880155adb180 ffff880155a03df0 ffffffffa00a344b <4><d> 00000000000003e8 ffff880155920000 0000000000000004 ffff880155920000 <4><d> ffff88015592d438 ffffffffa00a3860 ffff880155a03fd8 ffffe8ffffc05b88 <4>Call Trace: <4> [<ffffffffa00a344b>] ? enable_txq_db+0x2b/0x80 [cxgb4] <4> [<ffffffffa00a3860>] ? process_db_full+0x0/0xa0 [cxgb4] <4> [<ffffffffa00a38a6>] process_db_full+0x46/0xa0 [cxgb4] <4> [<ffffffff8109fda0>] worker_thread+0x170/0x2a0 <4> [<ffffffff810a6aa0>] ? autoremove_wake_function+0x0/0x40 <4> [<ffffffff8109fc30>] ? worker_thread+0x0/0x2a0 <4> [<ffffffff810a660e>] kthread+0x9e/0xc0 <4> [<ffffffff8100c28a>] child_rip+0xa/0x20 <4> [<ffffffff810a6570>] ? kthread+0x0/0xc0 <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20 <4>Code: e9 ba 00 00 00 66 0f 1f 44 00 00 44 8b 05 29 07 02 00 45 85 c0 0f 85 71 02 00 00 8b 83 70 01 00 00 45 0f b7 ed c1 e0 0f 44 09 e8 <89> 42 08 0f ae f8 66 c7 83 82 01 00 00 00 00 44 0f b7 ab dc 01 <1>RIP [<ffffffffa011d800>] c4iw_uld_control+0x4e0/0x880 [iw_cxgb4] <4> RSP <ffff880155a03db0> <4>CR2: 000000000002aa08` Based on original work by Bharat Potnuri <bharat@chelsio.com> Fixes: 74217d4c6a4fb0d8 ("iw_cxgb4: support for bar2 qid densities exceeding the page size") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Reviewed-by: Leon Romanovsky <leon@leon.nu> Signed-off-by: Doug Ledford <dledford@redhat.com>