aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/sw (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-10-25IB/rxe: Convert timers to use timer_setup()Kees Cook4-8/+8
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Moni Shoua <monis@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18Merge branch 'timer_setup' into for-nextDoug Ledford1-6/+4
Conflicts: drivers/infiniband/hw/cxgb4/cm.c drivers/infiniband/hw/qib/qib_driver.c drivers/infiniband/hw/qib/qib_mad.c There were minor fixups needed in these files. Just minor context diffs due to patches from independent sources touching the same basic area. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/rdmavt: Convert timers to use timer_setup()Kees Cook1-6/+4
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. setup_timer() was already being called before the open-coded init_timer() and .data assignment. These are removed as well. Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18Merge branch 'hfi1' into k.o/for-nextDoug Ledford2-2/+2
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/rdmavt: Don't wait for resources in QP resetAlex Estrin2-2/+2
Per the IBTA spec, QP destroy shall fail if the QP is attached to multicast groups, although the spec is silent on modify_qp to reset state. It implies that ULP must deregister QP from all mcast groups for destroy to succeed. The faulty patch "IB/ipoib: Update broadcast object if PKey value was changed in index 0" exposed two issues in rdmavt: 1. Rvt QP reset waits for qp references to go to zero. This will hang if QP is attached to multicast groups. 2. The mcast group detach will fail for a QP in reset state therefore preventing ULP from correcting the issue. This patch moves the reference count wait to the the destroy QP path and allows a QP mcast detach to work in the reset state. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-14RDMA/rxe: Suppress gcc 7 fall-through complaintsBart Van Assche3-3/+4
Avoid that gcc 7 reports the following warning when building with W=1: warning: this statement may fall through [-Wimplicit-fallthrough=] Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-14RDMA/rdmavt: Suppress gcc 7 fall-through complaintsBart Van Assche1-0/+1
Avoid that gcc 7 reports the following warning when building with W=1: warning: this statement may fall through [-Wimplicit-fallthrough=] Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-09IB/rxe: put the pool on allocation failureDoug Ledford1-7/+9
If the allocation of elem fails, it is not sufficient to simply check for NULL and return. We need to also put our reference on the pool or else we will leave the pool with a permanent ref count and we will never be able to free it. Fixes: 4831ca9e4a8e ("IB/rxe: check for allocation failure on elem") Suggested-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-09IB/rxe: check for allocation failure on elemColin Ian King1-0/+2
The allocation for elem may fail (especially because we're using GFP_ATOMIC) so best to check for a null return. This fixes a potential null pointer dereference when assigning elem->pool. Detected by CoverityScan CID#1357507 ("Dereference null return value") Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27IB: Move PCI dependency from root KConfig to HW's KConfigsYuval Shaia1-0/+1
No reason to have dependency on PCI for the entire infiniband stack so move it to KConfig of only the drivers that actually using PCI. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Handle NETDEV_CHANGE eventsAndrew Boyer1-1/+6
Without this fix, ports configured on top of ixgbe miss link up notifications. ibv_query_port() will continue to return IBV_PORT_DOWN even though the port is up and working. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Avoid ICRC errors by copying into the skb firstAndrew Boyer1-6/+6
The current process is to first calculate the CRC and then copy the client data into the packet. This leaves a window in which the packet contents and CRC can get out of sync, if the client changes the data after the CRC is calculated but before the data is copied. By copying the data into the packet and then calculating the CRC directly from the packet contents we eliminate the window. This can be seen with qperf's ud_bi_bw test. This seems like very strange/reckless client behavior, but whether the client has mangled its data or not RXE should be able to transfer it reliably. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Another fix for broken receive queue drainingAndrew Boyer1-1/+3
This fixes another path in rxe_requester() that might overlook stale SKBs, preventing cleanup. Fixes: 1217197142d1 ("rxe: fix broken receive queue draining") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Remove unneeded initialization in prepare6()Andrew Boyer1-1/+1
Fixes: 4ed6ad1eb30e ("IB/rxe: Cache dst in QP instead of getting it...") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Fix up rxe_qp_cleanup()Andrew Boyer1-7/+2
Replace sk_dst_get()/dst_release() in rxe_qp_cleanup() with sk_dst_reset(). sk_dst_get() takes a new reference on dst, so the dst_release() doesn't actually release the original reference, which was the design intent. Fixes: 4ed6ad1eb30e ("IB/rxe: Cache dst in QP instead of getting it...") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Add dst_clone() in prepare_ipv6_hdr()Andrew Boyer1-1/+1
Otherwise the reference count goes negative as IPv6 packets complete. Fixes: 4ed6ad1eb30e ("IB/rxe: Cache dst in QP instead of getting it...") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Fix destination cache for IPv6Andrew Boyer2-1/+7
To successfully match an IPv6 path, the path cookie must match. Store it in the QP so that the IPv6 path can be reused. Replace open-coded version of dst_check() with the actual call, fixing the logic. The open-coded version skips the check call if dst->obsolete is 0 (DST_OBSOLETE_NONE), proceeding to replace the route. DST_OBSOLETE_NONE means that the route may continue to be used, though. Fixes: 4ed6ad1eb30e ("IB/rxe: Cache dst in QP instead of getting it...") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Fix up the responder's find_resources() functionAndrew Boyer1-1/+1
The resource array is sized by max_dest_rd_atomic, not max_rd_atomic. Iterating over max_rd_atomic entries of qp->resp.resources[] will cause incorrect behavior when the two attributes are different (or even crash if max_rd_atomic is larger). Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Remove dangling prototypeAndrew Boyer1-2/+0
Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Acked-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Disable completion upcalls when a CQ is destroyedAndrew Boyer4-0/+24
This prevents the stack from accessing userspace objects while they are being torn down. One possible sequence of events: - Userspace program exits - ib_uverbs_cleanup_ucontext() runs, calling ib_destroy_qp(), ib_destroy_cq(), etc. and releasing/freeing the UCQ - The QP still has tasklets running, so it isn't destroyed yet - The CQ is referenced by the QP, so the CQ isn't destroyed yet - The UCQ is kfree()'d anyway - A send work request completes - rxe_send_complete() calls cq->ibcq.comp_handler() - ib_uverbs_comp_handler() runs and crashes; the event queue is checked for is_closed, but it has no way to check the ib_ucq_object before accessing it The reference counting on the CQ doesn't protect against this since the CQ hasn't been destroyed yet. There's no available interface to deregister the UCQ from the CQ, and it didn't appear that attempting to add reference counting to the UCQ was going to be a good way to go since this solution is much simpler. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Move refcounting earlier in rxe_send()Andrew Boyer1-3/+5
The network stack will call nskb's destructor, rxe_skb_tx_dtor(), if the packet gets dropped by ip_local_out()/ip6_local_out(). Thus we need to add the QP ref before output to avoid extra dereferences during network congestion. This could lead to unwanted destruction of the QP. Fix up the skb_out accounting, too. Fixes: fda85ce91240 ("IB/rxe: Fix kernel panic from skb destructor") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Acked-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rdmavt: Handle dereg of inuse MRs properlyMike Marciniszyn2-21/+212
A destroy of an MR prior to destroying the QP can cause the following diagnostic if the QP is referencing the MR being de-registered: hfi1 0000:05:00.0: hfi1_0: rvt_dereg_mr timeout mr ffff8808562108 00 pd ffff880859b20b00 The solution is to when the a non-zero refcount is encountered when the MR is destroyed the QPs needs to be iterated looking for QPs in the same PD as the MR. If rvt_qp_mr_clean() detects any such QP references the rkey/lkey, the QP needs to be put into an error state via a call to rvt_qp_error() which will trigger the clean up of any stuck references. This solution is as specified in IBTA 1.3 Volume 1 11.2.10.5. [This is reproduced with the 0.4.9 version of qperf and the rc_bw test] Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rdmavt: Add QP iterator API for QPsMike Marciniszyn1-0/+144
There are currently 3 spots in the qib and hfi1 driver that have knowledge of the internal QP hash list that should only be in scope to rdmavt QP code. Add an iterator API for processing all QPs to hide the nature of the RCU hashlist. The API consists of: - rvt_qp_iter_init() * For iterating QPs one at a time for seq_file semantics - rvt_qp_iter_next() * For iterating QPs one at a time for seq_file semantics - rvt_qp_iter() * For iterating all QPs The first two are used for things like seq_file prints. The last is for code that just needs to iterate all QPs in the system. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rdmavt: Use rvt_put_swqe() in rvt_clear_mr_ref()Mike Marciniszyn1-5/+1
hfi1 and qib were converted in previous patches, do the same for rdmavt. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24IB/rxe: Make rxe_counter_name staticKamal Heib1-1/+1
rxe_counter_name is used in rxe_hw_counters.c only. Make it static. Fixes: 0b1e5b99a48b ('IB/rxe: Add port protocol stats') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-22IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDsDon Hiatt2-16/+23
rvt_check_ah() delegates lid verification to underlying driver. Underlying driver uses different conditions to check for dlid depending on whether the device supports extended LIDs Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-22IB/hfi1: Remove pmtu from the QP structureSebastian Sanchez1-2/+1
The pmtu field doens't have be stored in the QP structure as it can easily be calculated when needed. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18Add OPA extended LID supportHiatt, Don1-1/+1
This patch series primarily increases sizes of variables that hold lid values from 16 to 32 bits. Additionally, it adds a check in the IB mad stack to verify a properly formatted MAD when OPA extended LIDs are used. Signed-off-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18Merge branch 'misc' into k.o/for-nextDoug Ledford1-6/+1
Conflicts: drivers/infiniband/core/iwcm.c - The rdma_netlink patches in HEAD and the iwarp cm workqueue fix (don't use WQ_MEM_RECLAIM, we aren't safe for that context) touched the same code. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18IB/rxe: Remove unneeded checkYuval Shaia1-5/+0
Port validation is performed in ib_core, no need to duplicate it here. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18IB/rxe: Convert pr_info to pr_warnYuval Shaia1-1/+1
This message is warning so let's print it accordingly. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-10Merge branches '32bit_lid' and 'irq_affinity' into k.o/merge-testDoug Ledford1-1/+1
Conflicts: drivers/infiniband/hw/mlx5/main.c - Both add new code include/rdma/ib_verbs.h - Both add new code Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-08IB/core: Change wc.slid from 16 to 32 bitsHiatt, Don1-1/+1
slid field in struct ib_wc is increased to 32 bits. This enables core components to use larger LIDs if needed. The user ABI is unchanged and return 16 bit values when queried. Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-31IB/{rdmavt, hfi1, qib}: Fix panic with post receive and SGE compressionMike Marciniszyn2-14/+10
The server side of qperf panics as follows: [242446.336860] IP: report_bug+0x64/0x10 [242446.341031] PGD 1c0c067 [242446.341032] P4D 1c0c067 [242446.343951] PUD 1c0d063 [242446.346870] PMD 8587ea067 [242446.349788] PTE 800000083e14016 [242446.352901] [242446.358352] Oops: 0003 [#1] SM [242446.437919] CPU: 1 PID: 7442 Comm: irq/92-hfi1_0 k Not tainted 4.12.0-mam-asm #1 [242446.446365] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/201 [242446.458397] task: ffff8808392d2b80 task.stack: ffffc9000664000 [242446.465097] RIP: 0010:report_bug+0x64/0x10 [242446.469859] RSP: 0018:ffffc900066439c0 EFLAGS: 0001000 [242446.475784] RAX: ffffffffa06647e4 RBX: ffffffffa06461e1 RCX: 000000000000000 [242446.483840] RDX: 0000000000000907 RSI: ffffffffa0675040 RDI: ffffffffffff740 [242446.491897] RBP: ffffc900066439e0 R08: 0000000000000001 R09: 000000000000025 [242446.499953] R10: ffffffff81a253df R11: 0000000000000133 R12: ffffc90006643b3 [242446.508010] R13: ffffffffa065bbf0 R14: 00000000000001e5 R15: 000000000000000 [242446.516067] FS: 0000000000000000(0000) GS:ffff88085f640000(0000) knlGS:000000000000000 [242446.525191] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003 [242446.531698] CR2: ffffffffa06647ee CR3: 0000000001c09000 CR4: 00000000001406e [242446.539756] Call Trace [242446.542582] fixup_bug+0x2c/0x5 [242446.546277] do_trap+0x12b/0x18 [242446.549972] do_error_trap+0x89/0x11 [242446.554171] ? hfi1_copy_sge+0x271/0x2b0 [hfi1 [242446.559324] ? ttwu_do_wakeup+0x1e/0x14 [242446.563795] ? ttwu_do_activate+0x77/0x8 [242446.568363] do_invalid_op+0x20/0x3 [242446.572448] invalid_op+0x1e/0x3 [242446.576247] RIP: 0010:hfi1_copy_sge+0x271/0x2b0 [hfi1 [242446.582075] RSP: 0018:ffffc90006643be8 EFLAGS: 0001004 [242446.587999] RAX: 0000000000000000 RBX: ffff88083e0fa240 RCX: 000000000000000 [242446.596058] RDX: 0000000000000000 RSI: ffff880842508000 RDI: ffff88083e0fa24 [242446.604116] RBP: ffffc90006643c28 R08: 0000000000000000 R09: 000000000000000 [242446.612172] R10: ffffc90009473640 R11: 0000000000000133 R12: 000000000000000 [242446.620228] R13: 0000000000000000 R14: 0000000000002000 R15: ffff88084250800 [242446.628293] ? hfi1_copy_sge+0x1a1/0x2b0 [hfi1 [242446.633449] hfi1_rc_rcv+0x3da/0x1270 [hfi1 [242446.638312] ? sc_buffer_alloc+0x113/0x150 [hfi1 [242446.643662] hfi1_ib_rcv+0x1c9/0x2e0 [hfi1 [242446.648428] process_receive_ib+0x19a/0x270 [hfi1 [242446.653866] ? process_rcv_qp_work+0xd2/0x160 [hfi1 [242446.659505] handle_receive_interrupt_nodma_rtail+0x184/0x2e0 [hfi1 [242446.666693] ? irq_finalize_oneshot+0x100/0x10 [242446.671846] receive_context_thread+0x1b/0x140 [hfi1 [242446.677576] irq_thread_fn+0x1e/0x4 [242446.681659] irq_thread+0x13c/0x1b [242446.685646] ? irq_forced_thread_fn+0x60/0x6 [242446.690604] kthread+0x112/0x15 [242446.694298] ? irq_thread_check_affinity+0xe0/0xe [242446.699738] ? kthread_park+0x60/0x6 [242446.703919] ? do_syscall_64+0x67/0x15 [242446.708292] ret_from_fork+0x25/0x3 [242446.712374] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0 [242446.733527] RIP: report_bug+0x64/0x100 RSP: ffffc900066439c [242446.739935] CR2: ffffffffa06647e [242446.743763] ---[ end trace 0e90a20d0aa494f7 ]-- The root cause is that the qib/hfi1 post receive call to rvt_lkey_ok() doesn't interpret the new return value from rvt_lkey_ok() properly leading to an mr reference count underrun. Additionally, remove an unused argument in rvt_sge_adjacent() aw well as an unneeded incr local in rvt_post_one_wr(). Fixes: Commit 14fe13fcd3af ("IB/rdmavt: Compress adjacent SGEs in rvt_lkey_ok()") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-27Merge branch 'misc' into k.o/for-nextDoug Ledford2-48/+6
2017-07-26Merge branches 'rxe' and 'mlx' into k.o/for-nextDoug Ledford7-13/+13
2017-07-24RDMA: Remove useless MODULE_VERSIONLeon Romanovsky1-1/+0
All modules in drivers/infiniband defined and used MODULE_VERSION, which was pointless because the kernel version describes their state more accurate then those arbitrary numbers. Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Sagi Grimbrg <sagi@grimberg.me> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/core: Add generic function to extract IB speed from netdevYuval Shaia1-47/+6
Logic of retrieving netdev speed from net_device and translating it to IB speed is implemented in rxe, in usnic and in bnxt drivers. Define new function which merges all. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Christian Benvenuti <benve@cisco.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/rxe: Constify static rxe_vm_opsKamal Heib1-1/+1
Constify static rxe_vm_ops that is never modified. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/rxe: Use __func__ to print function's nameKamal Heib2-5/+5
Its better to use __func__ to print functions name instead of writing the name in the print statement. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/rxe: Use DEVICE_ATTR_RO macro to show parent fieldKamal Heib1-3/+3
Use DEVICE_ATTR RO() macro and rename the show function accordingly. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/rxe: Prefer 'unsigned int' to bare use of 'unsigned'Kamal Heib3-3/+3
Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/rxe: Use "foo *bar" instead of "foo * bar"Kamal Heib1-1/+1
Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24Merge branch 'hfi1' into k.o/for-4.14Doug Ledford5-46/+147
2017-07-20rxe: fix broken receive queue drainingVijay Immanuel2-0/+6
If we modified the qp to ERROR state, and drained the recieve queue, post_recv must trigger the responder task to complete the drain work request. Cc: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>-- Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20IB/rdmavt: Setting of QP timeout can overflow jiffies computationKaike Wan1-3/+1
Current computation of qp->timeout_jiffies in rvt_modify_qp() will cause overflow due to the fact that the input to the function usecs_to_jiffies is only 32-bit ( unsigned int). Overflow will occur when attr->timeout is equal to or greater than 30. The consequence is unnecessarily excessive retry and thus degradation of the system performance. This patch fixes the problem by limiting the input to 5-bit and calling usecs_to_jiffies() before multiplying the scaling factor. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17IB/rxe: Set dma_mask and coherent_dma_maskyonatanc1-0/+2
The RXE coupled with dummy device causes to the kernel panic attached below. The panic happens when ib_register_device tries to set dma_mask by accessing a NULLed parent device. The RXE does not actually use DMA, so we can set the dma_mask to architecture value. [16240.199689] RIP: 0010:ib_register_device+0x468/0x5a0 [ib_core] [16240.205289] RSP: 0018:ffffc9000220fc10 EFLAGS: 00010246 [16240.209909] RAX: 0000000000000024 RBX: ffff880220d1a2a8 RCX: 0000000000000000 [16240.212244] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 [16240.214385] RBP: ffffc9000220fcb0 R08: 0000000000000000 R09: 000000000000023f [16240.254465] R10: 0000000000000007 R11: 0000000000000000 R12: 0000000000000000 [16240.259467] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880220d1a2a8 [16240.263314] FS: 00007fd8ecca0740(0000) GS:ffff8802364c0000(0000) knlGS:0000000000000000 [16240.267292] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [16240.273503] CR2: 0000000000000218 CR3: 00000002253ba000 CR4: 00000000000006e0 [16240.277066] Call Trace: [16240.281836] ? __kmalloc+0x26f/0x280 [16240.286596] rxe_register_device+0x297/0x300 [rdma_rxe] [16240.291377] rxe_add+0x535/0x5b0 [rdma_rxe] [16240.297586] rxe_net_add+0x3e/0xc0 [rdma_rxe] [16240.302375] rxe_param_set_add+0x65/0x144 [rdma_rxe] [16240.307769] param_attr_store+0x68/0xd0 [16240.311640] module_attr_store+0x1d/0x30 [16240.316421] sysfs_kf_write+0x3a/0x50 [16240.317802] kernfs_fop_write+0xff/0x180 [16240.322989] __vfs_write+0x37/0x140 [16240.328164] ? handle_mm_fault+0xce/0x240 [16240.333340] vfs_write+0xb2/0x1b0 [16240.335013] SyS_write+0x55/0xc0 [16240.340632] entry_SYSCALL_64_fastpath+0x1a/0xa9 Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17IB/rxe: Fix kernel panic from skb destructorYonatan Cohen1-0/+3
In the time between rxe_send has finished and skb destructor called, the QP's ref count might be 0, leading to a possible QP destruction. This will lead to a kernel panic when the destructor dereferences the QP. The operation of incrementing QP ref count at rxe_send and decrementing from skb destructor will prevent this crash. BUG: unable to handle kernel NULL pointer dereference at 000000000000072c IP: [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] PGD 0 [16240.211178] Oops: 0002 [#1] SMP CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 4.9.0-mlnx #1 Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011 task: ffff88042d6b1480 task.stack: ffffc90001904000 RIP: 0010:[<ffffffffa05df765>] [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] RSP: 0018:ffff88043fcc3df0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880429684700 RCX: ffff88042d248200 RDX: 00000000ffffffff RSI: 00000000fffffe01 RDI: ffff880429684700 RBP: ffff88043fcc3e00 R08: ffff88043fcda240 R09: 00000000ff2d1de6 R10: 0000000000000000 R11: 00000000f49cf6fe R12: ffff880429684700 R13: ffffffff81893f96 R14: ffffffff817d66f0 R15: ffff880427f74200 FS: 0000000000000000(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000072c CR3: 000000041d3df000 CR4: 00000000000006e0 Stack: ffffffff817b29cf ffff880429684700 ffff88043fcc3e18 ffffffff817b42c2 ffff880429684700 ffff88043fcc3e40 ffffffff817b4332 ffff880429684700 ffff880427f74238 ffff880427f74228 ffff88043fcc3e58 ffffffff81893f96 Call Trace: <IRQ> [16240.336345] [<ffffffff817b29cf>] ? skb_release_head_state+0x4f/0xb0 [<ffffffff817b42c2>] skb_release_all+0x12/0x30 [<ffffffff817b4332>] kfree_skb+0x32/0x90 [<ffffffff81893f96>] ndisc_error_report+0x36/0x40 [<ffffffff817d4de1>] neigh_invalidate+0x81/0xf0 [<ffffffff817d68f7>] neigh_timer_handler+0x207/0x2b0 [<ffffffff81109295>] call_timer_fn+0x35/0x120 [<ffffffff81109db7>] run_timer_softirq+0x1d7/0x460 [<ffffffff8106155e>] ? kvm_sched_clock_read+0x1e/0x30 [<ffffffff810366b9>] ? sched_clock+0x9/0x10 [<ffffffff810cfed2>] ? sched_clock_cpu+0x72/0xa0 [<ffffffff818dd537>] __do_softirq+0xd7/0x289 [<ffffffff810a6c95>] irq_exit+0xb5/0xc0 [<ffffffff818dd372>] smp_apic_timer_interrupt+0x42/0x50 [<ffffffff818dc682>] apic_timer_interrupt+0x82/0x90 <EOI> [16240.395776] [<ffffffff818da156>] ? native_safe_halt+0x6/0x10 [<ffffffff818d9e6e>] default_idle+0x1e/0xd0 [<ffffffff8103797f>] arch_cpu_idle+0xf/0x20 [<ffffffff818da2c5>] default_idle_call+0x35/0x40 [<ffffffff810e3eb5>] cpu_startup_entry+0x185/0x210 [<ffffffff81050433>] start_secondary+0x103/0x130 RIP [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17IB/{rdmavt, qib, hfi1}: Remove gfp flags argumentLeon Romanovsky1-35/+13
The caller to the driver marks GFP_NOIO allocations with help of memalloc_noio-* calls now. This makes redundant to pass down to the driver gfp flags, which can be GFP_KERNEL only. The patch removes the gfp flags argument and updates all driver paths. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-12IB/rxe: do not copy extra stack memory to skbKees Cook1-1/+3
This fixes a over-read condition detected by FORTIFY_SOURCE for this line: memcpy(SKB_TO_PKT(skb), &ack_pkt, sizeof(skb->cb)); The error was: In file included from ./include/linux/bitmap.h:8:0, from ./include/linux/cpumask.h:11, from ./include/linux/mm_types_task.h:13, from ./include/linux/mm_types.h:4, from ./include/linux/kmemcheck.h:4, from ./include/linux/skbuff.h:18, from drivers/infiniband/sw/rxe/rxe_resp.c:34: In function 'memcpy', inlined from 'send_atomic_ack.constprop' at drivers/infiniband/sw/rxe/rxe_resp.c:998:2, inlined from 'acknowledge' at drivers/infiniband/sw/rxe/rxe_resp.c:1026:3, inlined from 'rxe_responder' at drivers/infiniband/sw/rxe/rxe_resp.c:1286:10: ./include/linux/string.h:309:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter __read_overflow2(); Daniel Micay noted that struct rxe_pkt_info is 32 bytes on 32-bit architectures, but skb->cb is still 64. The memcpy() over-reads 32 bytes. This fixes it by zeroing the unused bytes in skb->cb. Link: http://lkml.kernel.org/r/1497903987-21002-5-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Moni Shoua <monis@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>