aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/qib/qib_rc.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2011-11-04IB/qib: Fix panic in RC error flushing logicMike Marciniszyn1-7/+3
The following panic can occur when flushing a QP: RIP: 0010:[<ffffffffa0168e8b>] [<ffffffffa0168e8b>] qib_send_complete+0x3b/0x190 [ib_qib] RSP: 0018:ffff8803cdc6fc90 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff8803d84ba000 RCX: 0000000000000000 RDX: 0000000000000005 RSI: ffffc90015a53430 RDI: ffff8803d84ba000 RBP: ffff8803cdc6fce0 R08: ffff8803cdc6fc90 R09: 0000000000000001 R10: 00000000ffffffff R11: 0000000000000000 R12: ffff8803d84ba0c0 R13: ffff8803d84ba5cc R14: 0000000000000800 R15: 0000000000000246 FS: 0000000000000000(0000) GS:ffff880036600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000034 CR3: 00000003e44f9000 CR4: 00000000000406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process qib/0 (pid: 1350, threadinfo ffff8803cdc6e000, task ffff88042728a100) Stack: 53544c5553455201 0000000100000005 0000000000000000 ffff8803d84ba000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000001 ffff8803cdc6fd30 ffffffffa0165d7a Call Trace: [<ffffffffa0165d7a>] qib_make_rc_req+0x36a/0xe80 [ib_qib] [<ffffffffa0165a10>] ? qib_make_rc_req+0x0/0xe80 [ib_qib] [<ffffffffa01698b3>] qib_do_send+0xf3/0xb60 [ib_qib] [<ffffffff814db757>] ? thread_return+0x4e/0x777 [<ffffffffa01697c0>] ? qib_do_send+0x0/0xb60 [ib_qib] [<ffffffff81088bf0>] worker_thread+0x170/0x2a0 [<ffffffff8108e530>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81088a80>] ? worker_thread+0x0/0x2a0 [<ffffffff8108e1c6>] kthread+0x96/0xa0 [<ffffffff8100c1ca>] child_rip+0xa/0x20 [<ffffffff8108e130>] ? kthread+0x0/0xa0 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 RIP [<ffffffffa0168e8b>] qib_send_complete+0x3b/0x190 [ib_qib] The RC error state flush logic in qib_make_rc_req() could return all of the acked wqes and potentially have emptied the queue. It would then unconditionally try return a flush completion via qib_send_complete() for an invalid wqe, or worse a valid one that is not queued. The panic results when the completion code tries to maintain an MR reference count for a NULL MR. This fix modifies logic to only send one completion per qib_make_rc_req() call and changing the completion status from IB_WC_SUCCESS to IB_WC_WR_FLUSH_ERR as the completions progress. The outer loop will call as many times as necessary to flush the queue. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21IB/qib: Remove s_lock around header validationMike Marciniszyn1-3/+1
Review of qib_ruc_check_hdr() shows that the s_lock is not required in the normal case. The r_lock is held in all cases, and protects the qp fields that are read. The s_lock will be needed to around the call to qib_migrate_qp() to insure that the send engine sees a consistent set of fields. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21IB/qib: Precompute timeout jiffies to optimize latencyMike Marciniszyn1-5/+2
A new field is added to qib_qp called timeout_jiffies. It is initialized upon create and modify. The field is now used instead of a computation based on qp->timeout. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21IB/qib: Decode path MTU optimizationMike Marciniszyn1-3/+3
Store both the encoded and decoded MTU in the QP structure as a minor optimization for UC/RC receive routines. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-21IB/qib: Optimize RC/UC code by IB operationMike Marciniszyn1-6/+13
The memset for zeroing work completions had been unconditional. This patch removes the memset and moves the zeroing into the work completion with a more explicit field by field set. With this patch, non-ONLY/non-LAST packets will avoid the overhead since they will not generate a completion. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-02-17IB/qib: Prevent double completions after a timeout or RNR errorMike Marciniszyn1-1/+2
There is a double completion associated with error handling for RC QPs. The sequence is: - The do_rc_ack() routine fields an RNR nack and there are 0 rnr_retries configured on the QP. - qib_error_qp() stops the pending timer - qib_rc_send_complete() is called from sdma_complete() - qib_rc_send_complete() starts the timer because the msb of the psn just completed says an ack is needed. - a bunch of flushes occur as ipoib posts WQEs to an error'ed QP - rc_timeout() calls qib_restart_rc() - qib_restart_rc() calls qib_send_complete() with a IB_WC_RETRY_EXC_ERR on a wqe that has already been completed in the past The fix avoids starting the timer since another packet will never arrive. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-02-10IB/qib: Fix double add_timer()Mike Marciniszyn1-0/+2
The following panic BUG_ON occurs during qib testing: Kernel BUG at include/linux/timer.h:82 RIP [<ffffffff881f7109>] :ib_qib:start_timer+0x73/0x89 RSP <ffffffff80425bd0> <0>Kernel panic - not syncing: Fatal exception <0>Dumping qib trace buffer from panic qib_set_lid INFO: IB0:1 got a lid: 0xf8 Done dumping qib trace buffer BUG: warning at kernel/panic.c:137/panic() (Tainted: G The flaw is due to a missing state test when processing responses that results in an add_timer() call when the same timer is already queued. This code was executing in parallel with a QP destroy on another CPU that had changed the state to reset, but the missing test caused to response handling code to run on into the panic. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-01-10IB/qib: Unnecessary delayed completions on RC connectionMike Marciniszyn1-0/+24
Currently on receipt of a response message (ACKs, RDMA Response, Atomic Responses etc.) if the SDMA completion counter is not advanced the driver delays the completion of the WQE. In most cases this is overly pessimistic as the response (ACK) to a previously transmitted send implies that the send is complete. Ensure that SDMA queue is progressed appropriately before determining if a send has delayed completions. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22IB/qib: Process RDMA WRITE ONLY with IMMEDIATE properlyJason Gunthorpe1-1/+4
See table 35 in IBA - the header order for RDMA_WRITE_ONLY_WITH_IMMEDIATE and SEND_LAST_WITH_IMMEDIATE is different: the RDMA_WRITE_ONLY has a RETH header before the immediate data, so we need a different code path to extract the immediate data. I tested this with a userspace app that does RDMA_WRITE with immediate on a QLE7140. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-03IB/qib: Fix race between qib_error_qp() and receive packet processingRalph Campbell1-34/+13
When transitioning a QP to the error state, in progress RWQEs need to be marked complete. This also involves releasing the reference count to the memory regions referenced in the SGEs. The locking in the receive packet processing wasn't sufficient to prevent qib_error_qp() from modifying the r_sge state at the same time, thus leading to kernel panics. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-23IB/qib: Add new qib driver for QLogic PCIe InfiniBand adaptersRalph Campbell1-0/+2288
Add a low-level IB driver for QLogic PCIe adapters. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>