aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/sw (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-05-01IB/rxe: Don't clamp residual length to mtuJohannes Thumshirn1-2/+0
When reading a RDMA WRITE FIRST packet we copy the DMA length from the RDMA header into the qp->resp.resid variable for later use. Later in check_rkey() we clamp it to the MTU if the packet is an RDMA WRITE packet and has a residual length bigger than the MTU. Later in write_data_in() we subtract the payload of the packet from the residual length. If the packet happens to have a payload of exactly the MTU size we end up with a residual length of 0 despite the packet not being the last in the conversation. When the next packet in the conversation arrives, we don't have any residual length left and thus set the QP into an error state. This broke NVMe over Fabrics functionality over rdma_rxe.ko The patch was verified using the following test. # echo eth0 > /sys/module/rdma_rxe/parameters/add # nvme connect -t rdma -a 192.168.155.101 -s 1023 -n nvmf-test # mkfs.xfs -fK /dev/nvme0n1 meta-data=/dev/nvme0n1 isize=256 agcount=4, agsize=65536 blks = sectsz=4096 attr=2, projid32bit=1 = crc=0 finobt=0, sparse=0 data = bsize=4096 blocks=262144, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 ftype=1 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=4096 sunit=1 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 # mount /dev/nvme0n1 /tmp/ [ 148.923263] XFS (nvme0n1): Mounting V4 Filesystem [ 148.961196] XFS (nvme0n1): Ending clean mount # dd if=/dev/urandom of=test.bin bs=1M count=128 128+0 records in 128+0 records out 134217728 bytes (134 MB, 128 MiB) copied, 0.437991 s, 306 MB/s # sha256sum test.bin cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a test.bin # cp test.bin /tmp/ sha256sum /tmp/test.bin cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a /tmp/test.bin Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Max Gurtovoy <maxg@mellanox.com> Acked-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Define 'ib' and 'roce' rdma_ah_attr typesDasaratharaman Chandramouli2-0/+2
rdma_ah_attr can now be either ib or roce allowing core components to use one type or the other and also to define attributes unique to a specific type. struct ib_ah is also initialized with the type when its first created. This ensures that calls such as modify_ah dont modify the type of the address handle attribute. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Use rdma_ah_attr accessor functionsDasaratharaman Chandramouli5-37/+46
Modify core and driver components to use accessor functions introduced to access individual fields of rdma_ah_attr Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Rename ib_destroy_ah to rdma_destroy_ahDasaratharaman Chandramouli1-1/+1
Rename ib_destroy_ah to rdma_destroy_ah so its in sync with the rename of the ib address handle attribute Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Rename struct ib_ah_attr to rdma_ah_attrDasaratharaman Chandramouli5-19/+20
This patch simply renames struct ib_ah_attr to rdma_ah_attr as these fields specify attributes that are not necessarily specific to IB. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/rxe: Initialize ib_ah_attr during query_ahDasaratharaman Chandramouli2-0/+2
Zero out ib_ah_attr before calling query_ah. Set ah_flags appropriately. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28IB/rdmavt/hfi1/qib: Use the MGID and MLID for multicast addressingMichael J. Ruhl1-17/+44
The Infiniband spec defines "A multicast address is defined by a MGID and a MLID" (section 10.5). The current code only uses the MGID for identifying multicast groups. Update the driver to be compliant with this definition. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28IB/rxe: fix typo: "algorithmi" -> "algorithm"Colin Ian King1-1/+1
trivial fix to typo in pr_err message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28IB/rdmavt: restore IRQs on error path in rvt_create_ah()Dan Carpenter1-1/+1
We need to call spin_unlock_irqrestore() instead of vanilla spin_unlock() on this error path. Fixes: 119a8e708d16 ("IB/rdmavt: Add AH to rdmavt") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25IB: Replace ib_umem page_size by page_shiftArtemy Kovalyov2-9/+7
Size of pages are held by struct ib_umem in page_size field. It is better to store it as an exponent, because page size by nature is always power-of-two and used as a factor, divisor or ilog2's argument. The conversion of page_size to be page_shift allows to have portable code and avoid following error while compiling on ARM: ERROR: "__aeabi_uldivmod" [drivers/infiniband/core/ib_core.ko] undefined! CC: Selvin Xavier <selvin.xavier@broadcom.com> CC: Steve Wise <swise@chelsio.com> CC: Lijun Ou <oulijun@huawei.com> CC: Shiraz Saleem <shiraz.saleem@intel.com> CC: Adit Ranadive <aditr@vmware.com> CC: Dennis Dalessandro <dennis.dalessandro@intel.com> CC: Ram Amrani <Ram.Amrani@Cavium.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25{net,IB}/{rxe,usnic}: Utilize generic mac to eui32 functionYuval Shaia4-32/+6
This logic seems to be duplicated in (at least) three separate files. Move it to one place so code can be re-use. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2017-04-21IB/rxe: Cache dst in QP instead of getting it for each sendyonatanc2-4/+59
In RC QP there is no need to resolve the outgoing interface for each packet, as this does not change during QP life cycle. Instead cache the interface on the socket and use that one. This improves performance by 12% by sparing redundant calls to rxe_find_route. ib_send_bw -d rxe0 -x 1 -n 9000 -e -s $((1024 * 1024 )) -l 100 ---------------------------------------------------------------------------------------- | | bytes | iterations | BW peak[MB/sec] | BW average[MB/sec] | MsgRate[Mpps] | ---------------------------------------------------------------------------------------- | before | 1048576 | 9000 | inf | 551.21 | 0.000551 | | after | 1048576 | 9000 | inf | 615.54 | 0.000616 | ---------------------------------------------------------------------------------------- Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/rxe: Offload CRC calculation when possibleyonatanc9-9/+44
Use CPU ability to perform CRC calculations, by replacing direct calls to crc32_le() with crypto_shash_updata(). The overall performance gain measured with ib_send_bw tool is 10% and it was tested on "Intel CPU ES-2660 v2 @ 2.20Ghz" CPU. ib_send_bw -d rxe0 -x 1 -n 9000 -e -s $((1024 * 1024 )) -l 100 --------------------------------------------------------------------------------------------- | | bytes | iterations | BW peak[MB/sec] | BW average[MB/sec] | MsgRate[Mpps] | --------------------------------------------------------------------------------------------- | crc32_le | 1048576 | 9000 | inf | 497.60 | 0.000498 | | CRC offload | 1048576 | 9000 | inf | 546.70 | 0.000547 | --------------------------------------------------------------------------------------------- Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/rxe: Do not export module's private functionParav Pandit1-1/+0
Function rxe_rcv is used internally in RXE and don't need to be exported. This patch removes such export declaration. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/rxe: Avoid accessing timers for non RC QPsParav Pandit1-5/+8
This patch avoids RNR NAK timer and retransmit timer initialization and cleanup for non RC QPs (such as UD QP, GSI QP). Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/rxe: Add port protocol statsYonatan Cohen9-2/+174
Expose new counters using the get_hw_stats callback. We expose the following counters: +---------------------+----------------------------------------+ | Name | Description | |---------------------+----------------------------------------| |sent_pkts | number of sent pkts | |---------------------+----------------------------------------| |rcvd_pkts | number of received packets | |---------------------+----------------------------------------| |out_of_sequence | number of errors due to packet | | | transport sequence number | |---------------------+----------------------------------------| |duplicate_request | number of received duplicated packets. | | | A request that previously executed is | | | named duplicated. | |---------------------+----------------------------------------| |rcvd_rnr_err | number of received RNR by completer | |---------------------+----------------------------------------| |send_rnr_err | number of sent RNR by responder | |---------------------+----------------------------------------| |rcvd_seq_err | number of out of sequence packets | | | received | |---------------------+----------------------------------------| |ack_deffered | number of deferred handling of ack | | | packets. | |---------------------+----------------------------------------| |retry_exceeded_err | number of times retry exceeded | |---------------------+----------------------------------------| |completer_retry_err | number of times completer decided to | | | retry | |---------------------+----------------------------------------| |send_err | number of failed send packet | +---------------------+----------------------------------------+ Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20Merge branch 'k.o/for-4.12' into k.o/for-4.12-rdma-netdeviceDoug Ledford7-41/+317
2017-04-05IB/hfi1: Eliminate synchronize_rcu() in mr deleteMike Marciniszyn1-16/+33
The synchronize_rcu() call can be eliminated to improve memory deregistration performance. There are two key fields involved: - The rcu pointer itself - the lkey_published field To close the window between the rcu read of the mregion pointer and the reference count the code should: 1. To lkey/rkey validation (reader) Read the rcu pointer. If the pointer is non-NULL, get a reference. To the current validation tests use a READ_ONCE() on the lkey_published. Upon any failure release the reference. 2. To the remove logic (delete) Insure the published is zeroed prior to setting the pointer to NULL. This requires using rcu_assign_pointer() to insure lkey_published is written prior to the NULL. 3. To the insert logic (add) Insure the published is set use an rcu_assign_pointer() to insure the pointer is after all MR fields. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05IB/rdmavt: Avoid reseting wqe send_flags in unreserveMike Marciniszyn1-2/+5
The wqe should be read only and in fact the superfluous reset of the RVT_SEND_RESERVE_USED flag causes an issue where reserved operations elicit a bad completion to the ULP. The maintenance of the flag is now entirely within rvt_post_one_wr() where a reserved operation will set the flag and a non-reserved operation will insure the operation that is about to be posted has the flag reset. Fixes: Commit 856cc4c237ad ("IB/hfi1: Add the capability for reserved operations") Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05IB/rdmavt, IB/hfi1: Fix timer migration regressionsSebastian Sanchez3-2/+116
RC timeout counter isn't getting incremented. Increment counter and add the trace for it. Fixes: 87c23b4ab018 ("IB/rdmavt: Adding timer logic to rdmavt") Reviewed-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05IB/rdmavt: Add tracing for cq entry and pollMike Marciniszyn3-0/+131
The following fields are defined for filtering and triggering: - wr_id - status - opcode - qpn - length - idx Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05IB/rdmavt: Add additional fields to post send traceMike Marciniszyn2-4/+32
This fix is to get additional debugging information. The following fields are added: - wqe - qpt - num_sge - ssn - pid - send_flags These additional fields provide for more focused filtering and triggering. The patch also moves the trace to just before the wqe is posted to get the most accurate information and future proofs the code to trace all possible reserved opcodes. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependentMike Marciniszyn1-17/+0
The work to create a completion helper moved the translation of send wqe operations to completion opcodes to rdmvat. This precludes having driver dependent operations. Make the translation driver dependent by doing the translation in the driver prior to the rvt_qp_swqe_complete() call using restored translation tables. Fixes: Commit f2dc9cdce83c ("IB/rdmavt: Add a send completion helper") Fixes: Commit 0771da5a6e9d ("IB/hfi1,IB/qib: Use new send completion helper") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24IB/rxe: increment msn only when completing a requestDavid Marchand1-5/+4
According to C9-147, MSN should only be incremented when the last packet of a multi packet request has been received. "Logically, the requester associates a sequential Send Sequence Number (SSN) with each WQE posted to the send queue. The SSN bears a one- to-one relationship to the MSN returned by the responder in each re- sponse packet. Therefore, when the requester receives a response, it in- terprets the MSN as representing the SSN of the most recent request completed by the responder to determine which send WQE(s) can be completed." Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: David Marchand <david.marchand@6wind.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24IB/rxe: Update documentation linkLeon Romanovsky1-1/+1
All Soft-RoCE (rxe) is handled now in rdma-core user space library, so the documentation. The patch below updates the documentation link to that new location. Reported-by: Josh Beavers <josh.beavers@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24IB/rxe: double free on errorDan Carpenter1-1/+1
"goto err;" has it's own kfree_skb() call so it's a double free. We only need to free on the "goto exit;" path. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24infiniband: Fix alignment of mmap cookies to support VIPT cachingJason Gunthorpe2-4/+4
When vmalloc_user is used to create memory that is supposed to be mmap'd to user space, it is necessary for the mmap cookie (eg the offset) to be aligned to SHMLBA. This creates a situation where all virtual mappings of the same physical page share the same virtual cache index and guarantees VIPT coherence. Otherwise the cache is non-coherent and the kernel will not see writes by userspace when reading the shared page (or vice-versa). Reported-by: Josh Beavers <josh.beavers@gmail.com> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-27scripts/spelling.txt: add "therfore" pattern and fix typo instancesMasahiro Yamada1-3/+3
Fix typos and add the following to the scripts/spelling.txt: therfore||therefore Besides, tidy up comment blocks for 80-col wrapping. Link: http://lkml.kernel.org/r/1481573103-11329-31-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-25Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdmaLinus Torvalds12-449/+14
Pull rdma DMA mapping updates from Doug Ledford: "Drop IB DMA mapping code and use core DMA code instead. Bart Van Assche noted that the ib DMA mapping code was significantly similar enough to the core DMA mapping code that with a few changes it was possible to remove the IB DMA mapping code entirely and switch the RDMA stack to use the core DMA mapping code. This resulted in a nice set of cleanups, but touched the entire tree and has been kept separate for that reason." * tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits) IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it IB/core: Remove ib_device.dma_device nvme-rdma: Switch from dma_device to dev.parent RDS: net: Switch from dma_device to dev.parent IB/srpt: Modify a debug statement IB/srp: Switch from dma_device to dev.parent IB/iser: Switch from dma_device to dev.parent IB/IPoIB: Switch from dma_device to dev.parent IB/rxe: Switch from dma_device to dev.parent IB/vmw_pvrdma: Switch from dma_device to dev.parent IB/usnic: Switch from dma_device to dev.parent IB/qib: Switch from dma_device to dev.parent IB/qedr: Switch from dma_device to dev.parent IB/ocrdma: Switch from dma_device to dev.parent IB/nes: Remove a superfluous assignment statement IB/mthca: Switch from dma_device to dev.parent IB/mlx5: Switch from dma_device to dev.parent IB/mlx4: Switch from dma_device to dev.parent IB/i40iw: Remove a superfluous assignment statement IB/hns: Switch from dma_device to dev.parent ...
2017-02-23Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdmaLinus Torvalds2-5/+8
Pull Mellanox rdma updates from Doug Ledford: "Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (40 commits) IB/mlx5: Fix configuration of port capabilities IB/mlx4: Take source GID by index from HW GID table IB/mlx5: Fix blue flame buffer size calculation IB/mlx4: Remove unused variable from function declaration IB: Query ports via the core instead of direct into the driver IB: Add protocol for USNIC IB/mlx4: Support raw packet protocol IB/mlx5: Support raw packet protocol IB/core: Add raw packet protocol IB/mlx5: Add implicit MR support IB/mlx5: Expose MR cache for mlx5_ib IB/mlx5: Add null_mkey access IB/umem: Indicate that process is being terminated IB/umem: Update on demand page (ODP) support IB/core: Add implicit MR flag IB/mlx5: Support creation of a WQ with scatter FCS offload IB/mlx5: Enable QP creation with cvlan offload IB/mlx5: Enable WQ creation and modification with cvlan offload IB/mlx5: Expose vlan offloads capabilities IB/uverbs: Enable QP creation with cvlan offload ...
2017-02-23Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdmaLinus Torvalds21-226/+637
Pull rdma updates from Doug Ledford: "First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc minor changes all over" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (114 commits) RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0." rdma_cm: fail iwarp accepts w/o connection params IB/srp: Drain the send queue before destroying a QP IB/core: Add support for draining IB_POLL_DIRECT completion queues IB/srp: Improve an error path IB/srp: Make a diagnostic message more informative IB/srp: Document locking conventions IB/srp: Fix race conditions related to task management IB/srp: Avoid that duplicate responses trigger a kernel bug IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS RDMA/qedr: Fix some error handling RDMA/bnxt_re: add DCB dependency IB/hns: include linux/module.h IB/vmw_pvrdma: Expose vendor error to ULPs vmw_pvrdma: switch to pci_alloc_irq_vectors IB/hfi1: use size_t for passing array length IB/ipoib: Remove redudant label IB/ipoib: remove the unnecessary memory free IB/mthca: switch to pci_alloc_irq_vectors IB/hfi1: Code reuse with memdup_copy ...
2017-02-19IB/hfi1, qib, rdmavt: Move AETH defines to rdma/ib_hdrs.hDon Hiatt2-11/+11
Rename RVT AETH defines and export in rdma/ib_hdrs.h Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/hfi1: Add rvt_rnr_tbl_to_usec functionDon Hiatt1-0/+11
Return usec from an index into ib_rvt_rnr_table. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/hfi1, rdmavt: Update copy_sge to use boolean argumentsBrian Welty1-1/+1
Convert copy_sge and related SGE state functions to use boolean. For determining if QP is in user mode, add helper function in rdmavt_qp.h. This is used to determine if QP needs the last byte ordering. While here, change rvt_pd.user to a boolean. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/rdmavt: Adding timer logic to rdmavtVenkata Sandeep Dhanalakota1-2/+181
To move common code across target to rdmavt for code reuse. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/hfi1, qib, rdmavt: Move AETH credit functions into rdmavtBrian Welty2-2/+192
Add rvt_compute_aeth() and rvt_get_credit() as shared functions in rdmavt, moved from hfi1/qib logic. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/hfi1, qib, rdmavt: Move two IB event functions into rdmavtBrian Welty1-0/+38
Add rvt_rc_error() and rvt_comm_est() as shared functions in rdmavt, moved from hfi1/qib logic. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/rdmavt: Use per-CPU reference count for MRsSebastian Sanchez1-21/+38
Having per-CPU reference count for each MR prevents cache-line bouncing across the system. Thus, it prevents bottlenecks. Use per-CPU reference counts per MR. The per-CPU reference count for FMRs is used in atomic mode to allow accurate testing of the busy state. Other MR types run in per-CPU mode MR until they're freed. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/rxe: use setup_timer to simplify the codeWei Yongjun1-7/+2
Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19Merge branch 'k.o/for-4.10-rc' into HEADDoug Ledford4-7/+8
2017-02-14IB: Query ports via the core instead of direct into the driverOr Gerlitz2-5/+8
Change the drivers to call ib_query_port in their get port immutable handler instead of their own query port handler. Doing this required to set the core cap flags of this device before the ib_query_port call is made, since the IB core might need these caps to serve the port query. Drivers are ensured by the IB core that the port attributes passed to the port query verb implementation are zero, and hence we removed the zeroing from the drivers. This patch doesn't add any new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-4/+6
2017-02-08IB/rxe: Fix mem_check_range integer overflowEyal Itkin1-3/+5
Update the range check to avoid integer-overflow in edge case. Resolves CVE 2016-8636. Signed-off-by: Eyal Itkin <eyal.itkin@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-08IB/rxe: Fix resid updateEyal Itkin1-1/+1
Update the response's resid field when larger than MTU, instead of only updating the local resid variable. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Eyal Itkin <eyal.itkin@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-06net-next: treewide use is_vlan_dev() helper function.Parav Pandit1-1/+1
This patch makes use of is_vlan_dev() function instead of flag comparison which is exactly done by is_vlan_dev() helper function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jon Maxwell <jmaxwell37@gmail.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24IB/rxe: Prevent from completer to operate on non valid QPYonatan Cohen1-2/+1
On UD QP completer tasklet is scheduled for each packet sent. If it is followed by a destroy_qp(), the kernel panic will happen as the completer tries to operate on a destroyed QP. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/rxe: Fix rxe dev insertion to rxe_dev_listMaor Gottlieb1-1/+1
The first argument of list_add_tail is the new item and the second is the head of the list. Fix the code to pass arguments in the right order, otherwise not all the rxe devices will be removed during teardown. Fixes: 8700e3e7c4857 ('Soft RoCE driver') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating itBart Van Assche12-446/+11
Make the rxe and rdmavt drivers use dma_virt_ops. Update the comments that refer to the source files removed by this patch. Remove struct ib_dma_mapping_ops. Remove ib_device.dma_ops. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Jonathan Toppins <jtoppins@redhat.com> Cc: Alex Estrin <alex.estrin@intel.com> Cc: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/rxe: Switch from dma_device to dev.parentBart Van Assche1-3/+3
Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-10IB/rxe: Fix an skb leakBart Van Assche1-1/+14
Additionally, make it easier to detect skb leaks by issuing a warning if a leak occurs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>