path: root/drivers/infiniband/sw/rxe/rxe_verbs.c
Age | Commit message | Author | Files | Lines
2019-06-11 | RDMA: Convert CQ allocations to be under core responsibility | Leon Romanovsky | 1 | -20/+10
Ensure that CQ is allocated and freed by IB/core and not by drivers. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11 | RDMA: Clean destroy CQ in drivers do not return errors | Leon Romanovsky | 1 | -2/+1
Like all other destroy commands, the .destroy_cq() call is not supposed to fail. In all flows, the attempt to return early caused memory leaks. This patch converts .destroy_cq() so that it does not return any errors. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-10 | RDMA: Move owner into struct ib_device_ops | Jason Gunthorpe | 1 | -1/+1
This more closely follows how other subsystems work, with owner being a member of the structure containing the function pointers. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 | RDMA: Move uverbs_abi_ver into struct ib_device_ops | Jason Gunthorpe | 1 | -1/+1
No reason for every driver to emit code to set this, just make it part of the driver's existing static const ops structure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 | RDMA: Move driver_id into struct ib_device_ops | Jason Gunthorpe | 1 | -1/+2
No reason for every driver to emit code to set this, just make it part of the driver's existing static const ops structure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
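The three commits above move per-driver metadata into the same static const structure that holds the function pointers. A minimal userspace sketch of that pattern follows. Every name in it (demo_device_ops, demo_module, DEMO_DRIVER_RXE, the ABI value 2) is a hypothetical stand-in chosen for illustration, not the kernel's definition of struct ib_device_ops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; illustration only. */
struct demo_module { const char *name; };

enum demo_driver_id { DEMO_DRIVER_UNKNOWN = 0, DEMO_DRIVER_RXE = 1 };

struct demo_device_ops {
    /* Metadata lives next to the callbacks instead of being set by code. */
    struct demo_module *owner;
    enum demo_driver_id driver_id;
    unsigned int uverbs_abi_ver;

    int (*query_port)(unsigned int port);
};

static struct demo_module demo_this_module = { "demo_rxe" };

/* Accepts only port 1 in this sketch. */
static int demo_query_port(unsigned int port)
{
    return port == 1 ? 0 : -1;
}

/* One static initializer replaces several run-time assignments. */
static const struct demo_device_ops demo_rxe_ops = {
    .owner = &demo_this_module,
    .driver_id = DEMO_DRIVER_RXE,
    .uverbs_abi_ver = 2,            /* arbitrary value for the sketch */
    .query_port = demo_query_port,
};

/* Core-side helper: read the owner through the ops table. */
static const char *demo_ops_owner_name(const struct demo_device_ops *ops)
{
    assert(ops != NULL && ops->owner != NULL);
    return ops->owner->name;
}
```

Because the table is const and fully initialized at compile time, no driver code has to run to fill in these fields, which is the saving the commit messages describe.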
2019-04-08 | RDMA: Handle SRQ allocations by IB/core | Leon Romanovsky | 1 | -19/+12
Convert SRQ allocation from drivers to be in the IB/core. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 | RDMA: Handle AH allocations by IB/core | Leon Romanovsky | 1 | -18/+12
Simplify drivers by ensuring the lifetime of the ib_ah object. The changes in .create_ah() go hand in hand with the relevant update in .destroy_ah(). We will use this opportunity to convert .destroy_ah() so that it cannot fail, as was suggested a long time ago, because there is nothing to do in case of failure during destroy. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01 | IB: Pass only ib_udata in function prototypes | Shamir Rabinovitch | 1 | -10/+6
Now that ib_udata is passed to all the driver's object create/destroy APIs, the ib_udata will carry the ib_ucontext for every user command. There is no need to also pass the ib_ucontext via the function prototypes. Make ib_udata the only argument passed. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01 | IB: Pass uverbs_attr_bundle down ib_x destroy path | Shamir Rabinovitch | 1 | -9/+8
The uverbs_attr_bundle with the ucontext is sent down to the drivers' ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers' destroy path from the dependency on 'uobject->context', as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-22 | RDMA: Handle ucontext allocations by IB/core | Leon Romanovsky | 1 | -8/+6
Following the PD conversion patch, do the same for ucontext allocations. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19 | rdma_rxe: Use netlink messages to add/delete links | Steve Wise | 1 | -2/+2
Add support for the RDMA_NLDEV_CMD_NEWLINK/DELLINK messages which allow dynamically adding new RXE links. Deprecate the old module options for now. Cc: Moni Shoua <monis@mellanox.com> Reviewed-by: Yanjun Zhu <yanjun.zhu@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19 | RDMA/rxe: Close a race after ib_register_device | Jason Gunthorpe | 1 | -0/+14
Since rxe allows unregistration from other threads, the rxe pointer can become invalid at any moment after ib_register_device returns. This could cause a user-triggered use after free. Add another driver callback to be called right after the device becomes registered to complete any device setup required post-registration. This callback has enough core locking to prevent the device from becoming unregistered. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19 | RDMA/rxe: Use driver_unregister and new unregistration API | Jason Gunthorpe | 1 | -15/+2
rxe does not have correct locking for its registration/unregistration paths, use the core code to handle it instead. In this mode ib_unregister_device will also do the dealloc, so rxe is required to do clean up from a callback. The core code ensures that unregistration is done only once, and generally takes care of locking and concurrency problems for rxe. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19 | RDMA/rxe: Use ib_device_get_by_netdev() instead of open coding | Jason Gunthorpe | 1 | -14/+3
The core API handles the locking correctly and is faster if there are multiple devices. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15 | IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIs | Shamir Rabinovitch | 1 | -2/+4
Now that we have the udata passed to all the ib_xxx object creation APIs, and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from the ib_udata stored in uverbs_attr_bundle, we can finally start to remove the drivers' dependency on ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
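The message above describes recovering the user context from a udata pointer that is embedded in a larger attribute bundle. The following userspace sketch shows the general container-of technique under that assumption. The structure layout and all demo_ names are hypothetical; the real rdma_udata_to_drv_context macro and struct uverbs_attr_bundle may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Step back from a member pointer to the structure that contains it. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_ucontext { int id; };

struct demo_udata { size_t inlen; size_t outlen; };

/* Hypothetical bundle: the udata handed to drivers is embedded here. */
struct demo_attr_bundle {
    struct demo_udata driver_udata;
    struct demo_ucontext *context;
};

/*
 * A NULL udata models a call from a kernel consumer, which has no
 * user context; otherwise the context is found via the enclosing bundle.
 */
static struct demo_ucontext *demo_udata_to_context(struct demo_udata *udata)
{
    if (!udata)
        return NULL;
    return demo_container_of(udata, struct demo_attr_bundle,
                             driver_udata)->context;
}
```

With a helper of this shape a driver no longer needs to follow an object's uobject pointer to find the context, which is the dependency the commit sets out to remove.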
2019-02-08 | RDMA: Handle PD allocations by IB/core | Leon Romanovsky | 1 | -9/+7
The PD allocations in IB/core allow us to simplify drivers and the error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand in hand with the relevant update in .dealloc_pd(). We will use this opportunity to convert .dealloc_pd() so that it cannot fail, as was suggested a long time ago; failures are not happening, as we have never seen a WARN_ON print. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
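As a rough illustration of what "allocation by the core" can look like, the sketch below lets the driver declare the size of its PD object while a core helper allocates and frees the memory. It is a userspace model under stated assumptions: the names (demo_pd_ops, size_pd, demo_core_alloc_pd) are hypothetical, and calloc/free stand in for kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_ib_pd { int handle; };          /* part the core knows about */

struct demo_drv_pd {                         /* driver object wraps it */
    struct demo_ib_pd ibpd;                  /* must be the first member */
    int drv_state;
};

struct demo_pd_ops {
    size_t size_pd;                              /* driver-declared size */
    int (*alloc_pd)(struct demo_ib_pd *pd);      /* init only, may fail */
    void (*dealloc_pd)(struct demo_ib_pd *pd);   /* cleanup, cannot fail */
};

static int demo_drv_alloc_pd(struct demo_ib_pd *pd)
{
    ((struct demo_drv_pd *)pd)->drv_state = 1;   /* no allocation here */
    return 0;
}

static void demo_drv_dealloc_pd(struct demo_ib_pd *pd)
{
    ((struct demo_drv_pd *)pd)->drv_state = 0;   /* no free here either */
}

static const struct demo_pd_ops demo_rxe_pd_ops = {
    .size_pd = sizeof(struct demo_drv_pd),
    .alloc_pd = demo_drv_alloc_pd,
    .dealloc_pd = demo_drv_dealloc_pd,
};

/* Core side: owns the memory, so the error unwind lives in one place. */
static struct demo_ib_pd *demo_core_alloc_pd(const struct demo_pd_ops *ops)
{
    struct demo_ib_pd *pd;

    assert(ops->size_pd >= sizeof(*pd));
    pd = calloc(1, ops->size_pd);
    if (!pd)
        return NULL;
    if (ops->alloc_pd(pd)) {
        free(pd);
        return NULL;
    }
    return pd;
}

static void demo_core_dealloc_pd(const struct demo_pd_ops *ops,
                                 struct demo_ib_pd *pd)
{
    ops->dealloc_pd(pd);    /* void: there is nothing to unwind on failure */
    free(pd);
}
```

The driver callbacks shrink to initialization and cleanup, and the void dealloc callback mirrors the point in the message that a failing destroy leaves the caller with nothing useful to do.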
2019-02-04 | RDMA/rxe: Move rxe_init_av() to rxe_av.c | Kamal Heib | 1 | -9/+2
Move the function rxe_init_av() to rxe_av.c file and use it instead of calling rxe_av_from_attr() and rxe_av_fill_ip_info(), also remove the unused rxe_dev parameter from rxe_init_av(). Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 | RDMA: Introduce and use rdma_device_to_ibdev() | Parav Pandit | 1 | -2/+2
Introduce and use the rdma_device_to_ibdev() API for those drivers which register one sysfs group, and also use it in ib_core. In a subsequent patch, the device->provider_ibdev one-to-one mapping no longer holds true when accessing sysfs entries. Therefore, introduce an API rdma_device_to_ibdev() that provides such information. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 | RDMA: Rename port_callback to init_port | Parav Pandit | 1 | -1/+1
Most provider routines are callback routines which ib core invokes. _callback suffix doesn't convey information about when such callback is invoked. Therefore, rename port_callback to init_port. Additionally, store the init_port function pointer in ib_device_ops, so that it can be accessed in subsequent patches when binding rdma device to net namespace. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 | RDMA: Mark if destroy address handle is in a sleepable context | Gal Pressman | 1 | -1/+1
Introduce a 'flags' field to destroy address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 | RDMA: Mark if create address handle is in a sleepable context | Gal Pressman | 1 | -0/+1
Introduce a 'flags' field to create address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
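The two address-handle commits above pass a flags word so a driver can tell whether it is allowed to sleep. A minimal sketch of how a driver might branch on such a flag is given below. The flag name DEMO_AH_SLEEPABLE and the enum are hypothetical placeholders for illustration, not the kernel's identifiers.

```c
#include <assert.h>

/* Hypothetical flag bit: set when the caller is in a sleepable context. */
#define DEMO_AH_SLEEPABLE (1u << 0)

enum demo_wait_mode {
    DEMO_WAIT_POLL,     /* atomic context: busy-poll for the completion */
    DEMO_WAIT_BLOCK     /* sleepable context: block until completion */
};

/* Choose how to wait for a hardware command based on the caller's flags. */
static enum demo_wait_mode demo_pick_wait_mode(unsigned int flags)
{
    return (flags & DEMO_AH_SLEEPABLE) ? DEMO_WAIT_BLOCK : DEMO_WAIT_POLL;
}
```

Defaulting to polling when the bit is clear keeps the callback safe for callers in atomic context, which appears to be the conservative choice the messages imply.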
2018-12-18 | RDMA: Cleanup undesired pd->uobject usage | Shamir Rabinovitch | 1 | -1/+1
Drivers should be using udata to determine if a method is invoked from user space or kernel space. A PD does not necessarily indicate whether a different object is kernel or user. Transforming the tests to use udata eliminates a large number of uobject references from the drivers. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-12 | RDMA/rxe: Initialize ib_device_ops struct | Kamal Heib | 1 | -43/+47
Initialize ib_device_ops with the supported operations using ib_set_device_ops(). Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
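To illustrate the idea of registering callbacks through one ops table, here is a small userspace sketch of a "set ops" helper. It assumes a merge rule in which only non-NULL source entries are copied and an already-populated destination entry is left alone. That rule and all demo_ names are assumptions made for this sketch and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dev_ops {
    int (*query_port)(void);
    int (*post_send)(void);
    int (*post_recv)(void);
};

/* Copy one callback if the source provides it and the slot is empty. */
#define DEMO_SET_OP(dst, src, name)                     \
    do {                                                \
        if ((src)->name && !(dst)->name)                \
            (dst)->name = (src)->name;                  \
    } while (0)

static void demo_set_device_ops(struct demo_dev_ops *dst,
                                const struct demo_dev_ops *src)
{
    assert(dst != NULL && src != NULL);
    DEMO_SET_OP(dst, src, query_port);
    DEMO_SET_OP(dst, src, post_send);
    DEMO_SET_OP(dst, src, post_recv);
}

/* Example callbacks used to populate a driver table. */
static int demo_op_a(void) { return 1; }
static int demo_op_b(void) { return 2; }
static int demo_op_c(void) { return 3; }
```

A driver then lists only the operations it supports in a static table and hands it to the helper once, instead of assigning each function pointer on the device by hand.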
2018-12-11 | IB/{mlx5,ocrdma,qedr,rxe}: Omit port validation from IB verbs | Yuval Shaia | 1 | -21/+1
The RDMA core layer already makes sure the port is valid, so there is no need to check it here again. For the pkey validation this depends on commit b3ac5742fead ("RDMA/core: Validate port number in query_pkey verb") Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-08 | IB/rxe: make rxe_unregister_device void | Zhu Yanjun | 1 | -3/+1
Since the function rxe_unregister_device always returns 0, it is changed to void. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-08 | RDMA/rxe: Distinguish between down links and disabled links | Andrew Boyer | 1 | -0/+8
In ib_query_port(), use the netdev's IFF_UP flag to determine phys_state (flag set = down = POLLING, flag clear = disabled = DISABLED). Callers can then use the phys_state field to distinguish between links which have a dead partner, cable missing, etc., from links which are turned off on the local node. This is useful for HA and supportability. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-17 | RDMA/drivers: Use core provided API for registering device attributes | Parav Pandit | 1 | -18/+8
Use rdma_set_device_sysfs_group() to register device attributes and simplify the driver. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-26 | RDMA: Fully setup the device name in ib_register_device | Jason Gunthorpe | 1 | -2/+1
The current code has two copies of the device name, ibdev->dev and dev_name(&ibdev->dev), and they are set up at different times, which is very confusing. Set them both up at the same time and make dev_name() the lead name, which is the proper use of the driver core APIs. To make it very clear that the name is not valid until registration, pass it in to the ib_register_device() call rather than messing with ibdev->name directly. Also the reorganization now checks that dev_name is unique even if it does not contain a %. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Devesh Sharma <devesh.sharma@broadcom.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
2018-07-30 | RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const | Bart Van Assche | 1 | -9/+9
Since neither ib_post_send() nor ib_post_recv() modify the data structure their second argument points at, declare that argument const. This change makes it necessary to declare the 'bad_wr' argument const too and also to modify all ULPs that call ib_post_send(), ib_post_recv() or ib_post_srq_recv(). This patch does not change any functionality but makes it possible for the compiler to verify whether the ib_post_(send|recv|srq_recv) really do not modify the posted work request. To make this possible, only one cast had to be introduced that casts away constness, namely in rpcrdma_post_recvs(). The only way I can think of to avoid that cast is to introduce an additional loop in that function or to change the data type of bad_wr from struct ib_recv_wr ** into int (an index that refers to an element in the work request list). However, both approaches would require even more extensive changes than this patch. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
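The effect of the const change can be seen in a reduced userspace model of a post-send routine. The structure, the DEMO_MAX_SGE limit, and the function name are hypothetical; only the shape of the signature (a const work-request list and a const out-pointer for the failing request) follows the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_SGE 4u     /* arbitrary limit for the sketch */

struct demo_send_wr {
    const struct demo_send_wr *next;    /* singly linked request list */
    unsigned int num_sge;
};

/*
 * Walk the list without modifying it. Because wr is const, the compiler
 * rejects any accidental write such as wr->num_sge = 0. On failure,
 * *bad_wr points at the first request that could not be posted, so the
 * out-parameter has to be a pointer to const as well.
 */
static int demo_post_send_list(const struct demo_send_wr *wr,
                               const struct demo_send_wr **bad_wr)
{
    assert(bad_wr != NULL);
    for (; wr; wr = wr->next) {
        if (wr->num_sge > DEMO_MAX_SGE) {
            *bad_wr = wr;
            return -EINVAL;
        }
    }
    return 0;
}
```

This also shows why the message says bad_wr had to become const: assigning a const pointer to a non-const out-parameter would itself require a cast.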
2018-07-30 | RDMA: Constify the argument of the work request conversion functions | Bart Van Assche | 1 | -4/+4
When posting a send work request, the work request that is posted is not modified by any of the RDMA drivers. Make this explicit by constifying most ib_send_wr pointers in RDMA transport drivers. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-09 | RDMA/rxe: Simplify the error handling code in rxe_create_ah() | Bart Van Assche | 1 | -10/+3
This patch not only simplifies the error handling code in rxe_create_ah() but also removes the dead code that was left behind by commit 47ec38666210 ("RDMA: Convert drivers to use sgid_attr instead of sgid_index"). Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18 | RDMA: Convert drivers to use sgid_attr instead of sgid_index | Parav Pandit | 1 | -26/+5
The core code now ensures that all driver callbacks that receive an rdma_ah_attrs will have a sgid_attr's pointer if there is a GRH present. Drivers can use this pointer instead of calling a query function with sgid_index. This simplifies the drivers and also avoids races where a gid_index lookup may return different data if it is changed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-05-28 | Merge branch 'mr_fix' into git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma for-next | Jason Gunthorpe | 1 | -9/+1
Update mlx4 to support user MR creation against read-only memory, previously it required the memory to be writable. Based on rdma for-rc due to dependencies.
* mr_fix: (2 commits)
  IB/mlx4: Mark user MR as writable if actual virtual memory is writable
  IB/core: Make testing MR flags for writability a static inline function
2018-05-09 | nvmet,rxe: defer ip datagram sending to tasklet | Alexandru Moise | 1 | -9/+1
This addresses 3 separate problems: 1. When using NVME over Fabrics we may end up sending IP packets in interrupt context, we should defer this work to a tasklet. [ 50.939957] WARNING: CPU: 3 PID: 0 at kernel/softirq.c:161 __local_bh_enable_ip+0x1f/0xa0 [ 50.942602] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G W 4.17.0-rc3-ARCH+ #104 [ 50.945466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 [ 50.948163] RIP: 0010:__local_bh_enable_ip+0x1f/0xa0 [ 50.949631] RSP: 0018:ffff88009c183900 EFLAGS: 00010006 [ 50.951029] RAX: 0000000080010403 RBX: 0000000000000200 RCX: 0000000000000001 [ 50.952636] RDX: 0000000000000000 RSI: 0000000000000200 RDI: ffffffff817e04ec [ 50.954278] RBP: ffff88009c183910 R08: 0000000000000001 R09: 0000000000000614 [ 50.956000] R10: ffffea00021d5500 R11: 0000000000000001 R12: ffffffff817e04ec [ 50.957779] R13: 0000000000000000 R14: ffff88009566f400 R15: ffff8800956c7000 [ 50.959402] FS: 0000000000000000(0000) GS:ffff88009c180000(0000) knlGS:0000000000000000 [ 50.961552] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 50.963798] CR2: 000055c4ec0ccac0 CR3: 0000000002209001 CR4: 00000000000606e0 [ 50.966121] Call Trace: [ 50.966845] <IRQ> [ 50.967497] __dev_queue_xmit+0x62d/0x690 [ 50.968722] dev_queue_xmit+0x10/0x20 [ 50.969894] neigh_resolve_output+0x173/0x190 [ 50.971244] ip_finish_output2+0x2b8/0x370 [ 50.972527] ip_finish_output+0x1d2/0x220 [ 50.973785] ? ip_finish_output+0x1d2/0x220 [ 50.975010] ip_output+0xd4/0x100 [ 50.975903] ip_local_out+0x3b/0x50 [ 50.976823] rxe_send+0x74/0x120 [ 50.977702] rxe_requester+0xe3b/0x10b0 [ 50.978881] ? 
ip_local_deliver_finish+0xd1/0xe0 [ 50.980260] rxe_do_task+0x85/0x100 [ 50.981386] rxe_run_task+0x2f/0x40 [ 50.982470] rxe_post_send+0x51a/0x550 [ 50.983591] nvmet_rdma_queue_response+0x10a/0x170 [ 50.985024] __nvmet_req_complete+0x95/0xa0 [ 50.986287] nvmet_req_complete+0x15/0x60 [ 50.987469] nvmet_bio_done+0x2d/0x40 [ 50.988564] bio_endio+0x12c/0x140 [ 50.989654] blk_update_request+0x185/0x2a0 [ 50.990947] blk_mq_end_request+0x1e/0x80 [ 50.991997] nvme_complete_rq+0x1cc/0x1e0 [ 50.993171] nvme_pci_complete_rq+0x117/0x120 [ 50.994355] __blk_mq_complete_request+0x15e/0x180 [ 50.995988] blk_mq_complete_request+0x6f/0xa0 [ 50.997304] nvme_process_cq+0xe0/0x1b0 [ 50.998494] nvme_irq+0x28/0x50 [ 50.999572] __handle_irq_event_percpu+0xa2/0x1c0 [ 51.000986] handle_irq_event_percpu+0x32/0x80 [ 51.002356] handle_irq_event+0x3c/0x60 [ 51.003463] handle_edge_irq+0x1c9/0x200 [ 51.004473] handle_irq+0x23/0x30 [ 51.005363] do_IRQ+0x46/0xd0 [ 51.006182] common_interrupt+0xf/0xf [ 51.007129] </IRQ> 2. Work must always be offloaded to tasklet for rxe_post_send_kernel() when using NVMEoF in order to solve lock ordering between neigh->ha_lock seqlock and the nvme queue lock: [ 77.833783] Possible interrupt unsafe locking scenario: [ 77.833783] [ 77.835831] CPU0 CPU1 [ 77.837129] ---- ---- [ 77.838313] lock(&(&n->ha_lock)->seqcount); [ 77.839550] local_irq_disable(); [ 77.841377] lock(&(&nvmeq->q_lock)->rlock); [ 77.843222] lock(&(&n->ha_lock)->seqcount); [ 77.845178] <Interrupt> [ 77.846298] lock(&(&nvmeq->q_lock)->rlock); [ 77.847986] [ 77.847986] *** DEADLOCK *** 3. 
Same goes for the lock ordering between sch->q.lock and nvme queue lock: [ 47.634271] Possible interrupt unsafe locking scenario: [ 47.634271] [ 47.636452] CPU0 CPU1 [ 47.637861] ---- ---- [ 47.639285] lock(&(&sch->q.lock)->rlock); [ 47.640654] local_irq_disable(); [ 47.642451] lock(&(&nvmeq->q_lock)->rlock); [ 47.644521] lock(&(&sch->q.lock)->rlock); [ 47.646480] <Interrupt> [ 47.647263] lock(&(&nvmeq->q_lock)->rlock); [ 47.648492] [ 47.648492] *** DEADLOCK *** Using NVMEoF after this patch seems to finally be stable, without it, rxe eventually deadlocks the whole system and causes RCU stalls. Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27 | IB/rxe: remove unused function variable | Zhu Yanjun | 1 | -3/+3
In the functions rxe_mem_init_dma, rxe_mem_init_user, rxe_mem_init_fast and copy_data, the function parameter rxe is not used, so it is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-05 | IB/rxe: Fix for oops in rxe_register_device on ppc64le arch | Mikhail Malygin | 1 | -1/+1
On ppc64le arch rxe_add command causes oops in kernel log: [ 92.495140] Oops: Kernel access of bad area, sig: 11 [#1] [ 92.499710] SMP NR_CPUS=2048 NUMA pSeries [ 92.499792] Modules linked in: ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) nf_conntrack_netlink(E) nfnetlink(E) xfrm_user(E) iptable _nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) xt_addrtype(E) iptable_filter(E) ip_tables(E) xt_conntrack(E) x_tables(E) nf_nat(E) nf_conntrack(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) af_packet(E) rpcrdma(E) ib_isert(E) iscsi_target_mod(E) i b_iser(E) libiscsi(E) ib_srpt(E) target_core_mod(E) ib_srp(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) bochs_drm(E) tt m(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) drm(E) agpgart(E) virtio_rng(E) virtio_console(E) rtc_ generic(E) dm_ec(OEN) ttln_rdma(OEN) rdma_cm(E) configfs(E) iw_cm(E) ib_cm(E) rdma_rxe(E) ip6_udp_tunnel(E) udp_tunnel(E) ib_core(E) ql a2xxx(E) [ 92.499832] scsi_transport_fc(E) nvme_fc(E) nvme_fabrics(E) nvme_core(E) ipmi_watchdog(E) ipmi_ssif(E) ipmi_poweroff(E) ipmi_powernv(EX) ipmi_devintf(E) ipmi_msghandler(E) dummy(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_service_time(E) scsi_transport_iscsi(E) sd_mod(E) sr_mod(E) cdrom(E) hid_generic(E) usbhid(E) virtio_blk(E) virtio_scsi(E) virtio_net(E) ibmvscsi(EX) scsi_transport_srp(E) xhci_pci(E) xhci_hcd(E) usbcore(E) usb_common(E) virtio_pci(E) virtio_ring(E) virtio(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) autofs4(E) [ 92.499834] Supported: No, Unsupported modules are loaded [ 92.499839] CPU: 3 PID: 5576 Comm: sh Tainted: G OE NX 4.4.120-ttln.17-default #1 [ 92.499841] task: c0000000afe8a490 ti: c0000000beba8000 task.ti: c0000000beba8000 [ 92.499842] NIP: c00000000008ba3c LR: c000000000027644 CTR: c00000000008ba10 [ 92.499844] REGS: c0000000bebab750 TRAP: 0300 Tainted: G OE NX 
(4.4.120-ttln.17-default) [ 92.499850] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28424428 XER: 20000000 [ 92.499871] CFAR: 0000000000002424 DAR: 0000000000000208 DSISR: 40000000 SOFTE: 1 GPR00: c000000000027644 c0000000bebab9d0 c000000000f09700 0000000000000000 GPR04: d0000000043d7192 0000000000000002 000000000000001a fffffffffffffffe GPR08: 000000000000009c c00000000008ba10 d0000000043e5848 d0000000043d3828 GPR12: c00000000008ba10 c000000007a02400 0000000010062e38 0000010020388860 GPR16: 0000000000000000 0000000000000000 00000100203885f0 00000000100f6c98 GPR20: c0000000b3f1fcc0 c0000000b3f1fc48 c0000000b3f1fbd0 c0000000b3f1fb58 GPR24: c0000000b3f1fae0 c0000000b3f1fa68 00000000000005dc c0000000b3f1f9f0 GPR28: d0000000043e5848 c0000000b3f1f900 c0000000b3f1f320 c0000000b3f1f000 [ 92.499881] NIP [c00000000008ba3c] dma_get_required_mask_pSeriesLP+0x2c/0x1a0 [ 92.499885] LR [c000000000027644] dma_get_required_mask+0x44/0xac [ 92.499886] Call Trace: [ 92.499891] [c0000000bebab9d0] [c0000000bebaba30] 0xc0000000bebaba30 (unreliable) [ 92.499894] [c0000000bebaba10] [c000000000027644] dma_get_required_mask+0x44/0xac [ 92.499904] [c0000000bebaba30] [d0000000043cb4b4] rxe_register_device+0xc4/0x430 [rdma_rxe] [ 92.499910] [c0000000bebabab0] [d0000000043c06c8] rxe_add+0x448/0x4e0 [rdma_rxe] [ 92.499915] [c0000000bebabb30] [d0000000043d28dc] rxe_net_add+0x4c/0xf0 [rdma_rxe] [ 92.499921] [c0000000bebabb60] [d0000000043d305c] rxe_param_set_add+0x6c/0x1ac [rdma_rxe] [ 92.499924] [c0000000bebabbf0] [c0000000000e78c0] param_attr_store+0xa0/0x180 [ 92.499927] [c0000000bebabc70] [c0000000000e6448] module_attr_store+0x48/0x70 [ 92.499932] [c0000000bebabc90] [c000000000391f60] sysfs_kf_write+0x70/0xb0 [ 92.499935] [c0000000bebabcb0] [c000000000390f1c] kernfs_fop_write+0x18c/0x1e0 [ 92.499939] [c0000000bebabd00] [c0000000002e22ac] __vfs_write+0x4c/0x1d0 [ 92.499942] [c0000000bebabd90] [c0000000002e2f94] vfs_write+0xc4/0x200 [ 92.499945] [c0000000bebabde0] [c0000000002e488c] 
SyS_write+0x6c/0x110 [ 92.499948] [c0000000bebabe30] [c000000000009384] system_call+0x38/0xe4 [ 92.499949] Instruction dump: [ 92.499954] 4e800020 3c4c00e8 3842dcf0 7c0802a6 f8010010 60000000 7c0802a6 fba1ffe8 [ 92.499958] fbc1fff0 fbe1fff8 f8010010 f821ffc1 <e9230208> 7c7e1b78 2fa90000 419e0078 [ 92.499962] ---[ end trace bed077e15eb420cf ]--- It fails in dma_get_required_mask, that has ppc-specific implementation, and fail if provided device argument is NULL Signed-off-by: Mikhail Malygin <mikhail@malygin.me> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05 | IB/rxe: Removed GID add/del dummy routines | Parav Pandit | 1 | -17/+0
rxe driver's add_gid() and del_gid() callbacks are doing simple checks which are already done by the ib core before invoking these callback routines. Therefore, code is simplified to skip implementing add_gid() and del_gid() callback functions. They are only invoked by ib_core if they are implemented. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-03 | RDMA: Use ib_gid_attr during GID modification | Parav Pandit | 1 | -6/+4
Now that ib_gid_attr contains device, port and index, simplify the provider APIs add_gid() and del_gid() to use device, port and index fields from the ib_gid_attr attributes structure. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-03 | IB/providers: Avoid null netdev check for RoCE | Parav Pandit | 1 | -3/+1
Now that the IB core GID cache ensures that all RoCE entries have an associated netdev, remove null checks from the provider drivers for clarity. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-03 | RDMA/providers: Simplify query_gid callback of RoCE providers | Parav Pandit | 1 | -18/+0
ib_query_gid() fetches the GID from the software cache maintained in ib_core for RoCE ports. Therefore, simplify the provider drivers for RoCE to treat query_gid() callback as never called for RoCE, and only require non-RoCE devices to implement it. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-19 | IB/uverbs: Extend uverbs_ioctl header with driver_id | Matan Barak | 1 | -0/+1
Extending uverbs_ioctl header with driver_id and another reserved field. driver_id should be used in order to identify the driver. Since every driver could have its own parsing tree, this is necessary for strace support. Downstream patches take off the EXPERIMENTAL flag from the ioctl() IB support and thus we add some reserved fields for future usage. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 | RDMA/rxe: Use structs to describe the uABI instead of opencoding | Jason Gunthorpe | 1 | -5/+43
Open coding pointer math is not acceptable for describing the uABI in RDMA. Provide structs for all the cases. The udata is cast to the struct as close to the verbs entry point as possible for maximum clarity. Function signatures and so forth are revised to allow for this. Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
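The difference between open-coded pointer math and a named uABI struct can be sketched in userspace as follows. The response layout (a 64-bit offset, a 32-bit size, and explicit padding) and every demo_ name are hypothetical. The point of the sketch is only that the layout is stated once, in a type, and checked at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical response layout shared between kernel and user space. */
struct demo_create_cq_resp {
    uint64_t mmap_offset;
    uint32_t mmap_size;
    uint32_t reserved;      /* explicit padding keeps the size fixed */
};

/* The layout is part of the ABI, so pin its size at compile time. */
_Static_assert(sizeof(struct demo_create_cq_resp) == 16,
               "uABI struct must not change size");

/*
 * Fill the caller's output buffer from a typed struct instead of
 * writing fields at hand-computed byte offsets.
 */
static int demo_fill_cq_resp(void *outbuf, size_t outlen,
                             uint64_t offset, uint32_t size)
{
    struct demo_create_cq_resp resp = {
        .mmap_offset = offset,
        .mmap_size = size,
    };

    assert(outbuf != NULL);
    if (outlen < sizeof(resp))
        return -1;          /* buffer too small for the response */
    memcpy(outbuf, &resp, sizeof(resp));
    return 0;
}
```

A reader of the driver can see the wire format by looking at one struct definition, and a change that alters its size fails the build rather than silently breaking user space.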
2018-03-15 | RDMA/rxe: Get rid of confusing udata parameter to rxe_cq_chk_attr | Jason Gunthorpe | 1 | -2/+2
It isn't used and it couldn't possibly ever be used correctly. Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 | RDMA/rxe: Fix an out-of-bounds read | Bart Van Assche | 1 | -3/+2
This patch avoids that KASAN reports the following when the SRP initiator calls srp_post_send(): ================================================================== BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x5c4/0x980 [rdma_rxe] Read of size 8 at addr ffff880066606e30 by task 02-mq/1074 CPU: 2 PID: 1074 Comm: 02-mq Not tainted 4.16.0-rc3-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x85/0xc7 print_address_description+0x65/0x270 kasan_report+0x231/0x350 rxe_post_send+0x5c4/0x980 [rdma_rxe] srp_post_send.isra.16+0x149/0x190 [ib_srp] srp_queuecommand+0x94d/0x1670 [ib_srp] scsi_dispatch_cmd+0x1c2/0x550 [scsi_mod] scsi_queue_rq+0x843/0xa70 [scsi_mod] blk_mq_dispatch_rq_list+0x143/0xac0 blk_mq_do_dispatch_ctx+0x1c5/0x260 blk_mq_sched_dispatch_requests+0x2bf/0x2f0 __blk_mq_run_hw_queue+0xdb/0x160 __blk_mq_delay_run_hw_queue+0xba/0x100 blk_mq_run_hw_queue+0xf2/0x190 blk_mq_sched_insert_request+0x163/0x2f0 blk_execute_rq+0xb0/0x130 scsi_execute+0x14e/0x260 [scsi_mod] scsi_probe_and_add_lun+0x366/0x13d0 [scsi_mod] __scsi_scan_target+0x18a/0x810 [scsi_mod] scsi_scan_target+0x11e/0x130 [scsi_mod] srp_create_target+0x1522/0x19e0 [ib_srp] kernfs_fop_write+0x180/0x210 __vfs_write+0xb1/0x2e0 vfs_write+0xf6/0x250 SyS_write+0x99/0x110 do_syscall_64+0xee/0x2b0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 The buggy address belongs to the page: page:ffffea0001998180 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x4000000000000000() raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880066606d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 ffff880066606d80: f1 00 f2 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f2 f2 f2 >ffff880066606e00: f2 00 00 00 00 00 f2 f2 f2 f3 f3 f3 f3 00 00 00 ^ ffff880066606e80: 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880066606f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Moni Shoua <monis@mellanox.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-31 | IB/rxe: remove redudant parameter in rxe_av_fill_ip_info | Zhu Yanjun | 1 | -1/+1
In the function rxe_av_fill_ip_info, the parameter rxe is not used. So it is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-31 | IB/rxe: change the function rxe_av_fill_ip_info to void | Zhu Yanjun | 1 | -2/+2
The function rxe_av_fill_ip_info always returns 0. So the function type is changed to void. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-31 | IB/rxe: remove unnecessary parameter in rxe_av_to_attr | Zhu Yanjun | 1 | -2/+1
In the function rxe_av_to_attr, the parameter rxe is not used. So it is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-31 | IB/rxe: change the function to void from int | Zhu Yanjun | 1 | -3/+2
The function rxe_av_from_attr always returns 0. So change the function to void. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-31 | IB/rxe: remove redudant parameter in function | Zhu Yanjun | 1 | -1/+1
In the function rxe_av_from_attr, the parameter rxe is not used. So it is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-10 | RDMA/rxe: Fix a race condition related to the QP error state | Bart Van Assche | 1 | -0/+2
The following sequence:
* Change queue pair state into IB_QPS_ERR.
* Post a work request on the queue pair.
Triggers the following race condition in the rdma_rxe driver:
* rxe_qp_error() triggers an asynchronous call of rxe_completer(), the function that examines the QP send queue.
* rxe_post_send() posts a work request on the QP send queue.
If rxe_completer() runs prior to rxe_post_send(), it will drain the send queue and the driver will assume no further action is necessary. However, once we post the send to the send queue, because the queue is in error, no send completion will ever happen and the send will get stuck. In order to process the send, we need to make sure that rxe_completer() gets run after a send is posted to a queue pair in an error state. This patch ensures that happens.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Moni Shoua <monis@mellanox.com> Cc: <stable@vger.kernel.org> # v4.8 Signed-off-by: Doug Ledford <dledford@redhat.com>
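The race described above can be modeled in a few lines of single-threaded userspace C. The sketch is not the driver's code: the demo_qp structure, its counters, and the "scheduled" flag are hypothetical, and running the completer is simulated by a direct function call. It only shows why re-arming the completer after a post on an errored QP lets the stuck request get flushed.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_qp_state { DEMO_QP_READY, DEMO_QP_ERROR };

struct demo_qp {
    enum demo_qp_state state;
    unsigned int sq_depth;          /* posted, not yet completed */
    unsigned int flushed;           /* completed with a flush error */
    bool completer_scheduled;
};

/* Models the completer: on an errored QP it flushes everything queued. */
static void demo_run_completer(struct demo_qp *qp)
{
    qp->completer_scheduled = false;
    if (qp->state == DEMO_QP_ERROR) {
        qp->flushed += qp->sq_depth;
        qp->sq_depth = 0;
    }
}

/* Models moving the QP to the error state: schedules one completer run. */
static void demo_qp_error(struct demo_qp *qp)
{
    qp->state = DEMO_QP_ERROR;
    qp->completer_scheduled = true;
}

/* Models posting a send, including the extra step the fix adds. */
static void demo_qp_post_send(struct demo_qp *qp)
{
    qp->sq_depth++;
    /*
     * Without this check, a request posted after the completer already
     * drained the queue would never receive a completion.
     */
    if (qp->state == DEMO_QP_ERROR)
        qp->completer_scheduled = true;
}
```

In the losing order (error, completer runs, then post), the second completer run requested by the post is what flushes the late request; without it the request would stay queued indefinitely.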