aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/ehca/ehca_iverbs.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2013-02-21IB/core: Add "type 2" memory windows supportShani Michaeli1-1/+1
This patch enhances the IB core support for Memory Windows (MWs). MWs allow an application to have better/flexible control over remote access to memory. Two types of MWs are supported, with the second type having two flavors: Type 1 - associated with PD only Type 2A - associated with QPN only Type 2B - associated with PD and QPN Applications can allocate a MW once, and then repeatedly bind the MW to different ranges in MRs that are associated to the same PD. Type 1 windows are bound through a verb, while type 2 windows are bound by posting a work request. The 32-bit memory key is composed of a 24-bit index and an 8-bit key. The key is changed with each bind, thus allowing more control over the peer's use of the memory key. The changes introduced are the following: * add memory window type enum and a corresponding parameter to ib_alloc_mw. * type 2 memory window bind work request support. * create a struct that contains the common part of the bind verb struct ibv_mw_bind and the bind work request into a single struct. * add the ib_inc_rkey helper function to advance the tag part of an rkey. Consumer interface details: * new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support for these features. Devices can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B memory windows. It can set neither to indicate it doesn't support type 2 windows at all. * modify existing provides and consumers code to the new param of ib_alloc_mw and the ib_mw_bind_info structure Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2008-09-20IB/ehca: Generate flush status CQ entriesAlexander Schmidt1-0/+2
When a QP goes into error state, it is required that CQ entries with a flush error status are delivered to the application for any outstanding work requests. eHCA does not do this in hardware, so this patch adds software flush CQE generation to the ehca driver. Whenever a QP gets into error state, it is added to the QP error list of its respective CQ. If the error QP list of a CQ is not empty, poll_cq() generates flush CQEs before polling the actual CQ. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04IB/ehca: Add PMA supportHoang-Nam Nguyen1-0/+5
This patch enables ehca to redirect any PMA queries to the actual PMA QP. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Reviewed-by: Joachim Fenkes <fenkes@de.ibm.com> Reviewed-by: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25IB/ehca: Add "port connection autodetect mode"Hoang-Nam Nguyen1-0/+2
This patch enhances ehca with a capability to "autodetect" the ports being connected physically. In order to utilize that function the module option nr_ports must be set to -1 (default is 2 - two ports). This feature is experimental and will made the default later. More detail: If the user connects only one port to the switch, current code requires 1) port one to be connected and 2) module option nr_ports=1 to be given. If autodetect is enabled, ehca will not wait at creation of the GSI QP for the respective port to become active. Since firmware does not accept modify_qp() while the port is down at initialization, we need to cache all calls to modify_qp() for the SMI/GSI QP and just return a good return code. When a port is activated and we get a PORT_ACTIVE event, we replay the cached modify-qp() parms and re-trigger any posted recv WRs. Only then do we forward the PORT_ACTIVE event to registered clients. The result of this autodetect patch is that all ports will be accessible by the users. Depending on their respective cabling only those ports that are connected properly will become operable. If a user tries to modify a regular QP of a non-connected port, modify_qp() will fail. Furthermore, ibv_devinfo should show the port state accordingly. Note that this patch primarily improves the loading behaviour of ehca. If the cable is removed while the driver is operating and plugged in again, firmware will handle that properly by sending an appropriate async event. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-11-13IB/ehca: Fix static rate calculationJoachim Fenkes1-0/+3
The IPD (inter-packet delay) formula was a little off and assumed a fixed physical link rate; fix the formula and query the actual physical link rate, now that we can get it. Also, refactor the calculation into a common function ehca_calc_ipd() and use that instead of duplicating code. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-17IB/ehca: Fix warnings issued by checkpatch.plHoang-Nam Nguyen1-3/+4
Run the existing ehca code through checkpatch.pl and clean up the worst of the coding style violations. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive eventsJoachim Fenkes1-0/+3
When firmware reports a nondisruptive port configuration change event, previous versions of the eHCA driver didn't forward the event to consumers like IPoIB. Add code that determines the type of configuration change by comparing old and new port attributes and reports it. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09IB/ehca: add Shared Receive Queue supportJoachim Fenkes1-0/+15
Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code (structures, create, destroy, post_recv) can be shared between QP and SRQ. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-08IB/uverbs: Export ib_umem_get()/ib_umem_release() to modulesRoland Dreier1-2/+1
Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_mem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06IB: Return "maybe missed event" hint from ib_req_notify_cq()Roland Dreier1-1/+1
The semantics defined by the InfiniBand specification say that completion events are only generated when a completions is added to a completion queue (CQ) after completion notification is requested. In other words, this means that the following race is possible: while (CQ is not empty) ib_poll_cq(CQ); // new completion is added after while loop is exited ib_req_notify_cq(CQ); // no event is generated for the existing completion To close this race, the IB spec recommends doing another poll of the CQ after requesting notification. However, it is not always possible to arrange code this way (for example, we have found that NAPI for IPoIB cannot poll after requesting notification). Also, some hardware (eg Mellanox HCAs) actually will generate an event for completions added before the call to ib_req_notify_cq() -- which is allowed by the spec, since there's no way for any upper-layer consumer to know exactly when a completion was really added -- so the extra poll of the CQ is just a waste. Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for ib_req_notify_cq() so that it can return a hint about whether the a completion may have been added before the request for notification. The return value of ib_req_notify_cq() is extended so: < 0 means an error occurred while requesting notification == 0 means notification was requested successfully, and if IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events were missed and it is safe to wait for another event. > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed in. It means that the consumer must poll the CQ again to make sure it is empty to avoid the race described above. We add a flag to enable this behavior rather than turning it on unconditionally, because checking for missed events may incur significant overhead for some low-level drivers, and consumers that don't care about the results of this test shouldn't be forced to pay for the test. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06IB: Add CQ comp_vector supportMichael S. Tsirkin1-1/+1
Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04IB/ehca: Remove use of do_mmap()Hoang-Nam Nguyen1-8/+0
This patch removes do_mmap() from ehca: - Call remap_pfn_range() for hardware register block - Use vm_insert_page() to register memory allocated for completion queues and queue pairs - The actual mmap() call/trigger is now controlled by user space, ie. libehca Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-01-09IB/ehca: Use proper GFP_ flags for get_zeroed_page()Hoang-Nam Nguyen1-2/+2
Here is a patch for ehca to use proper flag, ie. GFP_ATOMIC resp. GFP_KERNEL, when calling get_zeroed_page() to prevent "Bug: scheduling while atomic...". This error does not cause a kernel panic but makes ipoib un-usable afterwards. It is reproducible on 2.6.20-rc4 if one does ifconfig down during a flood ping test. I have not observed this error in earlier releases incl. 2.6.20-rc1. This error occurs when a qp event/irq is received and ehca event handler allocates a control block/page to obtain HCA error data block. Use of GFP_ATOMIC when in interrupt context prevents this issue. Signed-off-by Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-11-09IB/ehca: Assure 4K alignment for firmware control blocksHoang-Nam Nguyen1-0/+8
Assure 4K alignment for firmware control blocks in 64K page mode, because kzalloc()'s result address might not be 4K aligned if 64K pages are enabled. Thus, we introduce wrappers called ehca_{alloc,free}_fw_ctrlblock(), which use a slab cache for objects with 4K length and 4K alignment in order to alloc/free firmware control blocks in 64K page mode. In 4K page mode those wrappers just are defines of get_zeroed_page() and free_page(). Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Pass userspace data to modify_srq and modify_qp methodsRalph Campbell1-1/+2
Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/ehca: Add driver for IBM eHCA InfiniBand adaptersHeiko J Schick1-0/+181
Add a driver for IBM GX bus InfiniBand adapters, which are usable with some pSeries/System p systems. Signed-off-by: Heiko J Schick <schickhj.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>