linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2010-03-03	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband	Linus Torvalds	35	-1200/+1123
	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (48 commits) IB/srp: Clean up error path in srp_create_target_ib() IB/srp: Split send and recieve CQs to reduce number of interrupts RDMA/nes: Add support for KR device id 0x0110 IB/uverbs: Use anon_inodes instead of private infinibandeventfs IB/core: Fix and clean up ib_ud_header_init() RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing RDMA/cxgb3: Don't allocate the SW queue for user mode CQs RDMA/cxgb3: Increase the max CQ depth RDMA/cxgb3: Doorbell overflow avoidance and recovery IB/core: Pack struct ib_device a little tighter IB/ucm: Clean whitespace errors IB/ucm: Increase maximum devices supported IB/ucm: Use stack variable 'base' in ib_ucm_add_one IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one IB/umad: Clean whitespace IB/umad: Increase maximum devices supported IB/umad: Use stack variable 'base' in ib_umad_init_port IB/umad: Use stack variable 'devnum' in ib_umad_init_port IB/umad: Remove port_table[] IB/umad: Convert *cdev to cdev in struct ib_umad_port ...
2010-03-03	[SCSI] libiscsi: Make iscsi_eh_target_reset start with session reset	Jayamohan Kallickal	1	-1/+1
	The iscsi_eh_target_reset has been modified to attempt target reset only. If it fails, then iscsi_eh_session_reset will be called. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-03-01	Merge branch 'misc' into for-next	Roland Dreier	7	-71/+14
	Conflicts: drivers/infiniband/core/uverbs_main.c
2010-03-01	Merge branch 'srp' into for-next	Roland Dreier	2	-32/+65

2010-03-01	Merge branch 'nes' into for-next	Roland Dreier	7	-290/+284

2010-03-01	Merge branch 'mlx4' into for-next	Roland Dreier	1	-1/+1

2010-03-01	Merge branch 'iser' into for-next	Roland Dreier	5	-604/+391

2010-03-01	Merge branch 'ipoib' into for-next	Roland Dreier	1	-8/+2

2010-03-01	Merge branch 'ehca' into for-next	Roland Dreier	3	-7/+4

2010-03-01	Merge branch 'cxgb3' into for-next	Roland Dreier	7	-19/+110

2010-03-01	IB/srp: Clean up error path in srp_create_target_ib()	Roland Dreier	1	-13/+18
	Instead of repeating the error unwinding steps in each place an error can be detected, use the common idiom of gotos into an error flow. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-01	IB/srp: Split send and recieve CQs to reduce number of interrupts	Bart Van Assche	2	-25/+53
	We can reduce the number of IB interrupts from two interrupts per srp_queuecommand() call to one by using separate CQs for send and receive completions and processing send completions by polling every time a TX IU is allocated. Receive completion events still trigger an interrupt. Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-28	ipoib: returned back addrlen check for mc addresses	Jiri Pirko	1	-1/+5
	Apparently bogus mc address can break IPOIB multicast processing. Therefore returning the check for addrlen back until this is resolved in bonding (I don't see any other point from where mc address with non-dev->addr_len length can came from). Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-26	infiniband: convert to use netdev_for_each_mc_addr	Jiri Pirko	2	-42/+51
	Due to the loop complexicity in nes_nic.c, I'm using char* to copy mc addresses to it. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-25	RDMA/nes: Add support for KR device id 0x0110	Chien Tung	5	-218/+243
	Add support for KR device id 0x0110. While at it, cleanup nes_init_phy() by splitting it into nes_init_1g_phy() and nes_init_2025_phy(). Remove support for NES_PHY_TYPE_IRIS, which was used on an XFP board that was only manufactured in small quantities and given out for evals in even smaller quantities. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Use anon_inodes instead of private infinibandeventfs	Roland Dreier	2	-56/+5
	The anon_inodes interface has been split to allow creating a bare (non-installed) file pointer and also extended to allow specifying O_RDONLY in the flags. This makes it a suitable replacement for the private "infinibandeventfs" pseudo-filesystem used by uverbs, and this replacement saves a small chunk of boilerplate code. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/core: Fix and clean up ib_ud_header_init()	Eli Cohen	3	-12/+6
	ib_ud_header_init() first clears header and then fills up the various fields. Later on, it tests header->immediate_present, which it has already cleared, so the condition is always false. Fix this by adding an immediate_present parameter and setting header->immediate_present as is done with grh_present. Also remove unused calculation of header_len. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing	Steve Wise	1	-0/+1
	If cxgb3 calls the iw_cxgb3 t3cclient remove function due to a device removal event, then the iwch device must be marked with CXIO_ERROR_FATAL since the device below us is going away. Otherwise, we can get stuck in a deadlock as RDMA ULPs try and deallocate objects (like MRs, QPs, etc). So always mark the device with CXIO_ERROR_FATAL when removing. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Don't allocate the SW queue for user mode CQs	Steve Wise	3	-9/+9
	Only kernel mode CQs need the SW queue memory allocated. The SW queue for user mode CQs is allocated in userspace by libcxgb3. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Increase the max CQ depth	Steve Wise	1	-1/+1
	Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Doorbell overflow avoidance and recovery	Steve Wise	4	-8/+99
	T3 hardware doorbell FIFO overflows can cause application stalls due to lost doorbell ring events. This has been seen when running large NP IMB alltoall MPI jobs. The T3 hardware supports an xon/xoff-type flow control mechanism to help avoid overflowing the HW doorbell FIFO. This patch uses these interrupts to disable RDMA QP doorbell rings when we near an overflow condition, and then turn them back on (and ring all the active QP doorbells) when when the doorbell FIFO empties out. In addition if an doorbell ring is dropped by the hardware, the code will now recover. Design: cxgb3: - enable these DB interrupts - in the interrupt handler, schedule work tasks to call the ULPs event handlers with the new events. - ring all the qset txqs when an overflow is detected. iw_cxgb3: - disable db ringing on all active qps when we get the DB_FULL event - enable db ringing on all active qps and ring all active dbs when we get the DB_EMPTY event - On DB_DROP event: - disable db rings in the event handler - delay-schedule a work task which rings and enables the dbs on all active qps. - in post_send and post_recv logic, don't ring the db if it's disabled. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Clean whitespace errors	Alexander Chiang	1	-3/+3
	As shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Increase maximum devices supported	Alexander Chiang	1	-8/+45
	Some large systems may support more than IB_UCM_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. the first IB_UCM_MAX_DEVICES keep the same major/minor device numbers they've always had. If there are more than IB_UCM_MAX_DEVICES, then we dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Use stack variable 'base' in ib_ucm_add_one	Alexander Chiang	1	-1/+3
	This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UCM_MAX_DEVICES. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one	Alexander Chiang	1	-5/+7
	This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UCM_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Clean whitespace	Alexander Chiang	1	-13/+13
	Clean errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Increase maximum devices supported	Alexander Chiang	1	-6/+44
	Some large systems may support more than IB_UMAD_MAX_PORTS (currently 64). This change allows us to support more ports in a backwards-compatible manner. The first IB_UMAD_MAX_PORTS keep the same major/minor device numbers they've always had. If there are more than IB_UMAD_MAX_PORTS, we then dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Use stack variable 'base' in ib_umad_init_port	Alexander Chiang	1	-2/+5
	This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UMAD_MAX_PORTS in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Use stack variable 'devnum' in ib_umad_init_port	Alexander Chiang	1	-6/+9
	This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UMAD_MAX_PORTS in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Remove port_table[]	Alexander Chiang	1	-36/+9
	We no longer need this data structure, as it was used to associate an inode back to a struct ib_umad_port during ->open(). But now that we're embedding a struct cdev in struct ib_umad_port, we can use the container_of() macro to go from the inode back to the device instead. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Convert *cdev to cdev in struct ib_umad_port	Alexander Chiang	1	-26/+20
	Instead of storing pointers to cdev and sm_cdev, embed the full structures instead. This change allows us to use the container_of() macro in ib_umad_open() and ib_umad_sm_open() in a future patch. This change increases the size of struct ib_umad_port to 320 bytes from 128. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Whitespace cleanup	Alexander Chiang	1	-34/+34
	Clean up the errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Pack struct ib_uverbs_event_file tighter	Alexander Chiang	1	-2/+2
	Eliminate some padding in the structure by rearranging the members. sizeof(struct ib_uverbs_event_file) is now 72 bytes (from 80) and more members now fit in the first cacheline. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Increase maximum devices supported	Alexander Chiang	1	-6/+50
	Some large systems may support more than IB_UVERBS_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. The first IB_UVERBS_MAX_DEVICES keep the same major/minor device numbers that they've always had. If there are more than IB_UVERBS_MAX_DEVICES, we then dynamically request a new major device number (new minors start at 0). This change increases the maximum number of HCAs to 64 (from 32). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: use stack variable 'base' in ib_uverbs_add_one	Alexander Chiang	1	-1/+3
	This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UVERBS_MAX_DEVICES in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Use stack variable 'devnum' in ib_uverbs_add_one	Alexander Chiang	1	-5/+7
	This change is not useful by itself, but it sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UVERBS_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Remove dev_table	Alexander Chiang	1	-19/+5
	dev_table's raison d'etre was to associate an inode back to a struct ib_uverbs_device. However, now that we've converted ib_uverbs_device to contain an embedded cdev (instead of a *cdev), we can use the container_of() macro and cast back to the containing device. There's no longer any need for dev_table, so get rid of it. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Convert *cdev to cdev in struct ib_uverbs_device	Alexander Chiang	2	-16/+14
	Instead of storing a pointer to a cdev, embed the entire struct cdev. This change allows us to use the container_of() macro in ib_uverbs_open() in a future patch. This change increases the size of struct ib_uverbs_device to 168 bytes across 3 cachelines from 80 bytes in 2 cachelines. However, we rearrange the members so that everything fits into the first cacheline except for the struct cdev. Finally, we don't touch the cdev in any fastpaths, so this change shouldn't negatively affect performance. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Remove redundant locking from iser scsi command response flow	Or Gerlitz	1	-25/+0
	Currently the iSER receive completion flow takes the session lock twice. Optimize it to avoid the first one by letting iser_task_rdma_finalize() be called only from the cleanup_task callback invoked by iscsi_free_task, thus reducing the contention on the session lock between the scsi command submission to the scsi command completion flows. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Use libiscsi passthrough mode	Or Gerlitz	2	-20/+3
	libiscsi passthrough mode invokes the transport xmit calls directly without first going through an internal queue, unlike the other mode, which uses a queue and a xmitworker thread. Now that the "cant_sleep" prerequisite of iscsi_host_alloc is met, move to use it. Handling xmit errors is now done by the passthrough flow of libiscsi. Since the queue/worker aren't used in this mode, the code that schedules the xmitworker is removed. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Remove unnecessary connection checks	Or Gerlitz	3	-52/+0
	Remove unnecessary checks for the IB connection state and for QP overflow, as conn state changes are reported by iSER to libiscsi and handled there. QP overflow is theoretically possible only when unsolicited data-outs are used; anyway it's being checked and handled by HW drivers. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Use atomic allocations	Or Gerlitz	2	-3/+3
	Two minor flows in iSER's data path still use allocations; move them to be atomic as a preperation step towards moving to use libiscsi passthrough mode. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Simplify send flow/descriptors	Or Gerlitz	5	-278/+115
	Simplify and shrink the logic/code used for the send descriptors. Changes include removing struct iser_dto (an unnecessary abstraction), using struct iser_regd_buf only for handling SCSI commands, using dma_sync instead of dma_map/unmap, etc. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Use different CQ for send completions	Or Gerlitz	2	-37/+76
	Use a different CQ for send completions, where send completions are polled by the interrupt-driven receive completion handler. Therefore, interrupts aren't used for the send CQ. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Remove atomic counter for posted receive buffers	Or Gerlitz	3	-12/+12
	Now that both the posting and reaping of receive buffers is done in the completion path, the counter of outstanding buffers not be atomic. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: New receive buffer posting logic	Or Gerlitz	4	-176/+235
	Currently, the recv buffer posting logic is based on the transactional nature of iSER which allows for posting a buffer before sending a PDU. Change this to post only when the number of outstanding recv buffers is below a water mark and in a batched manner, thus simplifying and optimizing the data path. Use a pre-allocated ring of recv buffers instead of allocating from kmem cache. A special treatment is given to the login response buffer whose size must be 8K unlike the size of buffers used for any other purpose which is 128 bytes. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Revert commit bba7ebb "avoid recv buffer exhaustion"	Or Gerlitz	3	-95/+41
	We will make a major change in the recv buffer posting logic, after which the problem commit bba7ebb "avoid recv buffer exhaustion caused by unexpected PDUs" comes to solve doesn't exist any more, so revert it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	RDMA/nes: Change WQ overflow return code	Or Gerlitz	1	-3/+3
	Change the nes driver to return -ENOMEM on SQ/RQ overflow to match the return code of other RDMA HW drivers (e.g cxgb3, ehca, mlx4, mthca). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Acked-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	RDMA/nes: Multiple disconnects cause crash during AE handling	Faisal Latif	1	-60/+28
	There is a double disconnect during AE processing, causing crashes. While fixing the crash, also simplify the AE handling code. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	RDMA/nes: Fix crash when listener destroyed during loopback setup	Faisal Latif	1	-1/+2
	When a listener is destroyed and there is an MPA response pending for loopback connection, the active side cm_node gets destroyed twice: once in cm_event_connect_error() and again in nes_accept()/nes_reject(). Increment the cm_node's refcount so it's not destroyed by cm_event_connect_error(). Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>