linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-04-18	RDMA/cxgb4: Fix spelling mistake "immedate" -> "immediate"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a module parameter description. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08	RDMA: Handle SRQ allocations by IB/core	Leon Romanovsky	1	-20/+12
	Convert SRQ allocation from drivers to be in the IB/core Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-04	RDMA/iw_cxgb4: Always disconnect when QP is transitioning to TERMINATE state	Potnuri Bharat Teja	1	-2/+2
	On receiving a TERM from tje peer, Host moves the QP to TERMINATE state and then moves the adapter out of RDMA mode. After issuing a TERM, peer issues a CLOSE and at this point of time if the connectivity between peer and host is lost for a significant amount of time, the QP remains in TERMINATE state. Therefore c4iw_modify_qp() needs to initiate a close on entering terminate state. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01	IB: Remove 'uobject->context' dependency in object destroy APIs	Shamir Rabinovitch	1	-2/+2
	Now that we have the udata passed to all the ib_xxx object destroy APIs and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally start to remove the dependency of the drivers in the ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01	IB: Pass uverbs_attr_bundle down ib_x destroy path	Shamir Rabinovitch	1	-2/+2
	The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers destroy path from the dependency in 'uobject->context' as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-25	cxgb4: Convert qpidr to XArray	Matthew Wilcox	1	-17/+16
	Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15	RDMA/cxgb4: Remove kref accounting for sync operation	Leon Romanovsky	1	-3/+0
	Ucontext allocation and release aren't async events and don't need kref accounting. The common layer of RDMA subsystem ensures that dealloc ucontext will be called after all other objects are released. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15	IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIs	Shamir Rabinovitch	1	-4/+5
	Now when we have the udata passed to all the ib_xxx object creation APIs and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally start to remove the dependency of the drivers in the ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29	Merge branch 'devx-async' into k.o/for-next	Jason Gunthorpe	1	-3/+2
	Yishai Hadas says: Enable DEVX asynchronous query commands This series enables querying a DEVX object in an asynchronous mode. The userspace application won't block when calling the firmware and it will be able to get the response back once that it will be ready. To enable the above functionality: - DEVX asynchronous command completion FD object was introduced. - The applicable file operations were implemented to enable using it by the user application. - Query asynchronous method was added to the DEVX object, it will call the firmware asynchronously and manages the response on the given input FD. - Hot unplug support was added for the FD to work properly upon unbind/disassociate. - mlx5 core fence for asynchronous commands was implemented and used to prevent racing upon unbind/disassociate. This branch is based on mlx5-next & v5.0-rc2 due to dependencies, from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux * branch 'devx-async': IB/mlx5: Implement DEVX hot unplug for async command FD IB/mlx5: Implement the file ops of DEVX async command FD IB/mlx5: Introduce async DEVX obj query API IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-24	RDMA/iw_cxgb4: Drop __GFP_NOFAIL	Jason Gunthorpe	1	-1/+1
	There is no reason for this __GFP_NOFAIL, none of the other routines in this file use it, and there is an error unwind here. NOFAIL should be reserved for special cases, not used by network drivers. Fixes: 6a0b6174d35a ("rdma/cxgb4: Add support for kernel mode SRQ's") Reported-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-08	cross-tree: phase out dma_zalloc_coherent()	Luis Chamberlain	1	-3/+2
	We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-07	iw_cxgb4: Check for send WR also while posting write with completion WR	Potnuri Bharat Teja	1	-6/+13
	Inorder to optimize the NVMEoF read IOPs, iw_cxgb4 posts a FW Write with Completion WQE that combines an RDMA Write WR and the subsequent RDMA Send with Invalidate WR. This patch is an extension to it, where it posts a Write with completion for RDMA WRITE WR + RDMA SEND WR combination as well. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18	RDMA: Cleanup undesired pd->uobject usage	Shamir Rabinovitch	1	-2/+2
	Drivers should be using udata to determine if a method is invoked from user space or kernel space. A pd does not necessarily say a different objects is kernel or user. Transforming the tests to use udata eliminates a large number of uobject references from the drivers. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21	infiniband/hw/cxgb4/qp.c: Use dma_zalloc_coherent	Sabyasachi Gupta	1	-2/+1
	Replaced dma_alloc_coherent + memset with dma_zalloc_coherent Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-26	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma	Linus Torvalds	1	-5/+5
	Pull rdma updates from Jason Gunthorpe: "This has been a smaller cycle with many of the commits being smallish code fixes and improvements across the drivers. - Driver updates for bnxt_re, cxgb4, hfi1, hns, mlx5, nes, qedr, and rxe - Memory window support in hns - mlx5 user API 'flow mutate/steering' allows accessing the full packet mangling and matching machinery from user space - Support inter-working with verbs API calls in the 'devx' mlx5 user API, and provide options to use devx with less privilege - Modernize the use of syfs and the device interface to use attribute groups and cdev properly for uverbs, and clean up some of the core code's device list management - More progress on net namespaces for RDMA devices - Consolidate driver BAR mmapping support into core code helpers and rework how RDMA holds poitners to mm_struct for get_user_pages cases - First pass to use 'dev_name' instead of ib_device->name - Device renaming for RDMA devices" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (242 commits) IB/mlx5: Add support for extended atomic operations RDMA/core: Fix comment for hw stats init for port == 0 RDMA/core: Refactor ib_register_device() function RDMA/core: Fix unwinding flow in case of error to register device ib_srp: Remove WARN_ON in srp_terminate_io() IB/mlx5: Allow scatter to CQE without global signaled WRs IB/mlx5: Verify that driver supports user flags IB/mlx5: Support scatter to CQE for DC transport type RDMA/drivers: Use core provided API for registering device attributes RDMA/core: Allow existing drivers to set one sysfs group per device IB/rxe: Remove unnecessary enum values RDMA/umad: Use kernel API to allocate umad indexes RDMA/uverbs: Use kernel API to allocate uverbs indexes RDMA/core: Increase total number of RDMA ports across all devices IB/mlx4: Add port and TID to MAD debug print IB/mlx4: Enable debug print of SMPs RDMA/core: Rename ports_parent to ports_kobj RDMA/core: Do not expose unsupported counters IB/mlx4: Refer to the device kobject instead of ports_parent RDMA/nldev: Allow IB device rename through RDMA netlink ...
2018-10-10	PCI: Remove pci_unmap_addr() wrappers for DMA API	Christoph Hellwig	1	-5/+5
	Only some of these were still used by the cxgb4 driver, and that despite the fact that the driver otherwise uses the generic DMA API. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-09-25	iw_cxgb4: Use proper enumerated type in c4iw_bar2_addrs	Nathan Chancellor	1	-3/+4
	Clang warns when one enumerated type is implicitly converted to another. drivers/infiniband/hw/cxgb4/qp.c:287:8: warning: implicit conversion from enumeration type 'enum t4_bar2_qtype' to different enumeration type 'enum cxgb4_bar2_qtype' [-Wenum-conversion] T4_BAR2_QTYPE_EGRESS, ^~~~~~~~~~~~~~~~~~~~ c4iw_bar2_addrs expects a value from enum cxgb4_bar2_qtype so use the corresponding values from that type so Clang is satisfied without changing the meaning of the code. T4_BAR2_QTYPE_EGRESS = CXGB4_BAR2_QTYPE_EGRESS = 0 T4_BAR2_QTYPE_INGRESS = CXGB4_BAR2_QTYPE_INGRESS = 1 Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-21	RDMA/cxgb4: remove redundant null pointer check before kfree_skb	zhong jiang	1	-2/+1
	kfree_skb has taken the null pointer into account. hence it is safe to remove the redundant null pointer check before kfree_skb. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-04	iw_cxgb4: only allow 1 flush on user qps	Steve Wise	1	-0/+6
	Once the qp has been flushed, it cannot be flushed again. The user qp flush logic wasn't enforcing it however. The bug can cause touch-after-free crashes like: Unable to handle kernel paging request for data at address 0x000001ec Faulting instruction address: 0xc008000016069100 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c008000016069100] flush_qp+0x80/0x480 [iw_cxgb4] LR [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4] Call Trace: [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4] [c00800001606e868] c4iw_ib_modify_qp+0x118/0x200 [iw_cxgb4] [c0080000119eae80] ib_security_modify_qp+0xd0/0x3d0 [ib_core] [c0080000119c4e24] ib_modify_qp+0xc4/0x2c0 [ib_core] [c008000011df0284] iwcm_modify_qp_err+0x44/0x70 [iw_cm] [c008000011df0fec] destroy_cm_id+0xcc/0x370 [iw_cm] [c008000011ed4358] rdma_destroy_id+0x3c8/0x520 [rdma_cm] [c0080000134b0540] ucma_close+0x90/0x1b0 [rdma_ucm] [c000000000444da4] __fput+0xe4/0x2f0 So fix flush_qp() to only flush the wq once. Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	iw_cxgb4: Support FW write completion WR	Potnuri Bharat Teja	1	-1/+146
	To optimize NVME-oF READ IOPs, use a specialized WQE that combines the RDMA WRITE and SEND_INV WR chain submitted by the NVME-oF target driver. This reduces uP overhead per NVME-oF IO, and results in over 10% improvement in NVME-oF 4K READ IOPs. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	iw_cxgb4: RDMA write with immediate support	Potnuri Bharat Teja	1	-8/+29
	Adds iw_cxgb4 functionality to support RDMA_WRITE_WITH_IMMEDATE opcode. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	rdma/cxgb4: fix some info leaks	Dan Carpenter	1	-4/+3
	In c4iw_create_qp() there are several struct members which potentially aren't inintialized like uresp.rq_key. I've fixed this code before in in commit ae1fe07f3f42 ("RDMA/cxgb4: Fix stack info leak in c4iw_create_qp()") so this time I'm just going to take a big hammer approach and memset the whole struct to zero. Hopefully, it will stay fixed this time. In c4iw_create_srq() we don't clear uresp.reserved. Fixes: 6a0b6174d35a ("rdma/cxgb4: Add support for kernel mode SRQ's") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-31	rdma/cxgb4: Simplify a structure initialization	Bart Van Assche	1	-1/+1
	This patch avoids that sparse reports the following warning: drivers/infiniband/hw/cxgb4/qp.c:2269:34: warning: Using plain integer as NULL pointer Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30	RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const	Bart Van Assche	1	-12/+15
	Since neither ib_post_send() nor ib_post_recv() modify the data structure their second argument points at, declare that argument const. This change makes it necessary to declare the 'bad_wr' argument const too and also to modify all ULPs that call ib_post_send(), ib_post_recv() or ib_post_srq_recv(). This patch does not change any functionality but makes it possible for the compiler to verify whether the ib_post_(send\|recv\|srq_recv) really do not modify the posted work request. To make this possible, only one cast had to be introduce that casts away constness, namely in rpcrdma_post_recvs(). The only way I can think of to avoid that cast is to introduce an additional loop in that function or to change the data type of bad_wr from struct ib_recv_wr ** into int (an index that refers to an element in the work request list). However, both approaches would require even more extensive changes than this patch. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30	RDMA: Constify the argument of the work request conversion functions	Bart Van Assche	1	-9/+12
	When posting a send work request, the work request that is posted is not modified by any of the RDMA drivers. Make this explicit by constifying most ib_send_wr pointers in RDMA transport drivers. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-25	rdma/cxgb4: Add support for kernel mode SRQ's	Raju Rangoju	1	-127/+691
	This patch implements the srq specific verbs such as create/destroy/modify and post_srq_recv. And adds srq specific structures and defines to t4.h and uapi. Also updates the cq poll logic to deal with completions that are associated with the SRQ's. This patch also handles kernel mode SRQ_LIMIT events as well as flushed SRQ buffers Signed-off-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-12	treewide: kzalloc() -> kcalloc()	Kees Cook	1	-4/+4
	The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(char) * COUNT + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) \| kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) \| kzalloc(sizeof(TYPE) * C2, ...) \| kzalloc(C1 * C2 * C3, ...) \| kzalloc(C1 * C2, ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) \| - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) \| - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-05-09	Merge branch 'k.o/for-rc' into k.o/wip/dl-for-next	Doug Ledford	1	-2/+2
	Several items of conflict have arisen between the RDMA stack's for-rc branch and upcoming for-next work: 9fd4350ba895 ("IB/rxe: avoid double kfree_skb") directly conflicts with 2e47350789eb ("IB/rxe: optimize the function duplicate_request") Patches already submitted by Intel for the hfi1 driver will fail to apply cleanly without this merge Other people on the mailing list have notified that their upcoming patches also fail to apply cleanly without this merge Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-01	IB/cxgb4: use skb_put_zero()/__skb_put_zero	YueHaibing	1	-6/+3
	Use the recently introduced helper to replace the pattern of skb_put_zero/__skb_put() && memset(). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27	iw_cxgb4: Atomically flush per QP HW CQEs	Bharat Potnuri	1	-2/+2
	When a CQ is shared by multiple QPs, c4iw_flush_hw_cq() needs to acquire corresponding QP lock before moving the CQEs into its corresponding SW queue and accessing the SQ contents for completing a WR. Ignore CQEs if corresponding QP is already flushed. Cc: stable@vger.kernel.org Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-12-27	Merge branch 'from-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git	Jason Gunthorpe	1	-11/+83
	Patches for 4.16 that are dependent on patches sent to 4.15-rc. These are small clean ups for the vmw_pvrdma and i40iw drivers. * 'from-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git: RDMA/vmw_pvrdma: Remove usage of BIT() from UAPI header RDMA/vmw_pvrdma: Use refcount_t instead of atomic_t RDMA/vmw_pvrdma: Use more specific sizeof in kcalloc RDMA/vmw_pvrdma: Clarify QP and CQ is_kernel logic RDMA/vmw_pvrdma: Add UAR SRQ macros in ABI header file i40iw: Change accelerated flag to bool
2017-12-21	iw_cxgb4: when flushing, complete all wrs in a chain	Steve Wise	1	-2/+26
	If a wr chain was posted and needed to be flushed, only the first wr in the chain was completed with FLUSHED status. The rest were never completed. This caused isert to hang on shutdown due to the missing completions which left iscsi IO commands referenced, stalling the shutdown. Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21	iw_cxgb4: reflect the original WR opcode in drain cqes	Steve Wise	1	-4/+42
	The flush/drain logic was not retaining the original wr opcode in its completion. This can cause problems if the application uses the completion opcode to make decisions. Use bit 10 of the CQE header word to indicate the CQE is a special drain completion, and save the original WR opcode in the cqe header opcode field. Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-13	infiniband: cxgb4: use ktime_get for timestamps	Arnd Bergmann	1	-3/+3
	The debugfs file prints the difference between host timestamps as a seconds/nanoseconds tuple, along with a 64-bit nanoseconds hardware timestamp. The host time is read using getnstimeofday() which is deprecated because of the y2038 overflow, and it suffers from time jumps during settimeofday() and leap seconds. Converting to ktime_get_ts64() would solve those two, but I'm going a little further here by changing to ktime_get() and printing 64-bit nanoseconds on both host and hw timestamps. This simplifies the code further and makes the output easier to understand. The format of the debugfs file obviously changes here, but this should only be read by humans and not scripts, so I assume it's fine. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11	iw_cxgb4: only insert drain cqes if wq is flushed	Steve Wise	1	-2/+12
	Only insert our special drain CQEs to support ib_drain_sq/rq() after the wq is flushed. Otherwise, existing but not yet polled CQEs can be returned out of order to the user application. This can happen when the QP has exited RTS but not yet flushed the QP, which can happen during a normal close (vs abortive close). In addition never count the drain CQEs when determining how many CQEs need to be synthesized during the flush operation. This latter issue should never happen if the QP is properly flushed before inserting the drain CQE, but I wanted to avoid corrupting the CQ state. So we handle it and log a warning once. Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-07	iw_cxgb4: only clear the ARMED bit if a notification is needed	Steve Wise	1	-4/+4
	In __flush_qp(), the CQ ARMED bit was being cleared regardless of whether any notification is actually needed. This resulted in the iser termination logic getting stuck in ib_drain_sq() because the CQ was not marked ARMED and thus the drain CQE notification wasn't triggered. This new bug was exposed when this commit was merged: commit cbb40fadd31c ("iw_cxgb4: only call the cq comp_handler when the cq is armed") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-11-13	iw_cxgb4: atomically flush the qp	Steve Wise	1	-8/+11
	__flush_qp() has a race condition where during the flush operation, the qp lock is released allowing another thread to possibly post a WR, which corrupts the queue state, possibly causing crashes. The lock was released to preserve the cq/qp locking hierarchy of cq first, then qp. However releasing the qp lock is not necessary; both RQ and SQ CQ locks can be acquired first, followed by the qp lock, and then the RQ and SQ flushing can be done w/o unlocking. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-11-13	iw_cxgb4: only call the cq comp_handler when the cq is armed	Steve Wise	1	-8/+12
	The ULPs completion handler should only be called if the CQ is armed for notification. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-11-13	iw_cxgb4: remove BUG_ON() usage.	Steve Wise	1	-3/+0
	iw_cxgb4 has many BUG_ON()s that were left over from various enhancemnets made over the years. Almost all of them should just be removed. Some, however indicate a ULP usage error and can be handled w/o bringing down the system. If the condition cannot happen with correctly implemented cxgb4 sw/fw, then remove the BUG_ON. If the condition indicates a misbehaving ULP (like CQ overflows), add proper recovery logic. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-11-13	RDMA/cxgb4: Protect from possible dereference	Leon Romanovsky	1	-1/+1
	Smatch tool reports the following error: drivers/infiniband/hw/cxgb4/qp.c:1886 c4iw_create_qp() error: we previously assumed 'ucontext' could be null (see line 1804) Cc: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18	Merge branch 'timer_setup' into for-next	Doug Ledford	1	-1/+0
	Conflicts: drivers/infiniband/hw/cxgb4/cm.c drivers/infiniband/hw/qib/qib_driver.c drivers/infiniband/hw/qib/qib_mad.c There were minor fixups needed in these files. Just minor context diffs due to patches from independent sources touching the same basic area. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18	RDMA/cxgb4: Convert timers to use timer_setup()	Kees Cook	1	-1/+0
	In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Also removes an unused timer and drops a redundant initialization. Cc: Steve Wise <swise@chelsio.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-29	iw_cxgb4: add referencing to wait objects	Steve Wise	1	-21/+10
	For messages sent from the host to fw that solicit a reply from fw, the c4iw_wr_wait struct pointer is passed in the host->fw message, and included in the fw->host fw6_msg reply. This allows the sender to wait until the reply is received, and the code processing the ingress reply to wake up the sender. If c4iw_wait_for_reply() times out, however, we need to keep the c4iw_wr_wait object around in case the reply eventually does arrive. Otherwise we have touch-after-free bugs in the wake_up paths. This was hit due to a bad kernel driver that blocked ingress processing of cxgb4 for a long time, causing iw_cxgb4 timeouts, but eventually resuming ingress processing and thus hitting the touch-after-free bug. So I want to fix iw_cxgb4 such that we'll at least keep the wait object around until the reply comes. If it never comes we leak a small amount of memory, but if it does come late, we won't potentially crash the system. So add a kref struct in the c4iw_wr_wait struct, and take a reference before sending a message to FW that will generate a FW6 reply. And remove the reference (and potentially free the wait object) when the reply is processed. The ep code also uses the wr_wait for non FW6 CPL messages and doesn't embed the c4iw_wr_wait object in the message sent to firmware. So for those cases we add c4iw_wake_up_noref(). The mr/mw, cq, and qp object create/destroy paths do need this reference logic. For these paths, c4iw_ref_send_wait() is introduced to take the wr_wait reference, send the msg to fw, and then wait for the reply. So going forward, iw_cxgb4 either uses c4iw_ofld_send(), c4iw_wait_for_reply() and c4iw_wake_up_noref() like is done in the some of the endpoint logic, or c4iw_ref_send_wait() and c4iw_wake_up_deref() (formerly c4iw_wake_up()) when sending messages with the c4iw_wr_wait object pointer embedded in the message and resulting FW6 reply. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-29	iw_cxgb4: allocate wait object for each ep object	Steve Wise	1	-4/+4
	Remove the embedded c4iw_wr_wait object in preparation for correctly handling timeouts. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-29	iw_cxgb4: allocate wait object for each qp object	Steve Wise	1	-22/+33
	Remove the local stack allocated c4iw_wr_wait object in preparation for correctly handling timeouts. Also cleaned up some error path unwind logic to make it more readable. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27	iw_cxgb4: change pr_debug to appropriate log level	Bharat Potnuri	1	-2/+2
	Error logs of iw_cxgb4 needs to be printed by default. This patch changes the necessary pr_debug() to appropriate pr_<log level>. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27	iw_cxgb4: Remove __func__ parameter from pr_debug()	Bharat Potnuri	1	-30/+25
	pr_debug() can be enabled to print function names, So removing the unwanted __func__ parameters from debug logs. Realign function parameters. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20	iw_cxgb4: don't use WR keys/addrs for 0 byte reads	Ganesh Goudar	1	-1/+1
	Only use the read sge lkey/addr and the remote rkey/addr if the length of the read is not zero. Otherwise the read response might be treated as the RTR read response and not delivered to the application. Or worse Terminator hardware will fail a 0B read if the STAG is 0 even if the read length is 0. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-20	net: introduce __skb_put_[zero, data, u8]	yuan linyu	1	-2/+1
	follow Johannes Berg, semantic patch file as below, @@ identifier p, p2; expression len; expression skb; type t, t2; @@ ( -p = __skb_put(skb, len); +p = __skb_put_zero(skb, len); \| -p = (t)__skb_put(skb, len); +p = __skb_put_zero(skb, len); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, len); \| -memset(p, 0, len); ) @@ identifier p; expression len; expression skb; type t; @@ ( -t p = __skb_put(skb, len); +t p = __skb_put_zero(skb, len); ) ... when != p ( -memset(p, 0, len); ) @@ type t, t2; identifier p, p2; expression skb; @@ t p; ... ( -p = __skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); \| -p = (t )__skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, sizeof(p)); \| -memset(p, 0, sizeof(p)); ) @@ expression skb, len; @@ -memset(__skb_put(skb, len), 0, len); +__skb_put_zero(skb, len); @@ expression skb, len, data; @@ -memcpy(__skb_put(skb, len), data, len); +__skb_put_data(skb, data, len); @@ expression SKB, C, S; typedef u8; identifier fn = {__skb_put}; fresh identifier fn2 = fn ## "_u8"; @@ - (u8 )fn(SKB, S) = C; + fn2(SKB, C); Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16	networking: make skb_put & friends return void pointers	Johannes Berg	1	-4/+4
	It seems like a historic accident that these return unsigned char , and in many places that means casts are required, more often than not. Make these functions (skb_put, __skb_put and pskb_put) return void and remove all the casts across the tree, adding a (u8 ) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_put, __skb_put }; @@ - (fn(SKB, LEN)) + (u8 )fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_put, __skb_put }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) which actually doesn't cover pskb_put since there are only three users overall. A handful of stragglers were converted manually, notably a macro in drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many instances in net/bluetooth/hci_sock.c. In the former file, I also had to fix one whitespace problem spatch introduced. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>