aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-10-07Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-12/+2
Pull SCSI updates from James Bottomley: "Updates to the usual drivers (qla2xxx, lpfc, ufs, hisi_sas, mpi3mr, mpt3sas, target). The biggest change (from my biased viewpoint) being that the mpi3mr now attached to the SAS transport class, making it the first fusion type device to do so. Beyond the usual bug fixing and security class reworks, there aren't a huge number of core changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits) scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername() scsi: mpi3mr: Remove unnecessary cast scsi: stex: Properly zero out the passthrough command structure scsi: mpi3mr: Update driver version to 8.2.0.3.0 scsi: mpi3mr: Fix scheduling while atomic type bug scsi: mpi3mr: Scan the devices during resume time scsi: mpi3mr: Free enclosure objects during driver unload scsi: mpi3mr: Handle 0xF003 Fault Code scsi: mpi3mr: Graceful handling of surprise removal of PCIe HBA scsi: mpi3mr: Schedule IRQ kthreads only on non-RT kernels scsi: mpi3mr: Support new power management framework scsi: mpi3mr: Update mpi3 header files scsi: mpt3sas: Revert "scsi: mpt3sas: Fix ioc->base_readl() use" scsi: mpt3sas: Revert "scsi: mpt3sas: Fix writel() use" scsi: wd33c93: Remove dead code related to the long-gone config WD33C93_PIO scsi: core: Add I/O timeout count for SCSI device scsi: qedf: Populate sysfs attributes for vport scsi: pm8001: Replace one-element array with flexible-array member scsi: 3w-xxxx: Replace one-element array with flexible-array member scsi: hptiop: Replace one-element array with flexible-array member in struct hpt_iop_request_ioctl_command() ...
2022-09-06scsi: hisi_sas: Add helper to process bcast eventsJohn Garry1-5/+2
Add a helper for bcast processing to reduce duplication. Link: https://lore.kernel.org/r/1662378529-101489-5-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-09-06scsi: hisi_sas: Revert change to limit max hw sectors for v3 HWJohn Garry1-7/+0
Now that libsas and the SCSI core code limits the default sectors from commit 4cbfca5f7750 ("scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit") and commit 608128d391fa ("scsi: sd: allow max_sectors be capped at DMA optimal size limit"), there is no need for the hack to limit the max HW sectors. Link: https://lore.kernel.org/r/1662378529-101489-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-08-22block: Change the return type of blk_mq_map_queues() into voidBart Van Assche1-3/+2
Since blk_mq_map_queues() and the .map_queues() callbacks always return 0, change their return type into void. Most callers ignore the returned value anyway. Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Wang <jasowang@redhat.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: John Garry <john.garry@huawei.com> Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20220815170043.19489-3-bvanassche@acm.org [axboe: fold in fix from Bart] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-18scsi: hisi_sas: Modify v3 HW SATA completion error processingXingui Yang1-1/+8
If the I/O completion response frame returned by the target device has been written to the host memory and the err bit in the status field of the received fis is 1, ts->stat should set to SAS_PROTO_RESPONSE, and this will let EH analyze and further determine cause of failure. Link: https://lore.kernel.org/r/1657823002-139010-5-git-send-email-john.garry@huawei.com Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-18scsi: hisi_sas: Relocate DMA unmap of SMP taskXiang Chen1-2/+0
Currently SMP tasks are DMA unmapped only when cq of SMP I/O is returned normally. If the cq of SMP I/O is returned with exception actually SMP TAS is never unmapped. Relocate DMA unmap of SMP task to fix the issue. Link: https://lore.kernel.org/r/1657823002-139010-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-18scsi: hisi_sas: Call hisi_sas_slave_configure() from slave_configure_v3_hw()John Garry1-4/+1
There is duplicated code between slave_configure_v3_hw() and hisi_sas_slave_configure(), so call common function hisi_sas_slave_configure() from slave_configure_v3_hw(). Link: https://lore.kernel.org/r/1657823002-139010-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-27scsi: hisi_sas: Limit max hw sectors for v3 HWJohn Garry1-0/+7
If the controller is behind an IOMMU then the IOMMU IOVA caching range can affect performance, as discussed in [0]. Limit the max HW sectors to not exceed this limit. We need to hardcode the value until a proper DMA mapping API is available. [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/ Link: https://lore.kernel.org/r/1655988119-223714-1-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-10scsi: hisi_sas: Undo RPM resume for failed notify phy event for v3 HWXiang Chen1-2/+8
If we fail to notify the phy up event then undo the RPM resume, as the phy up notify event handling pairs with that RPM resume. Link: https://lore.kernel.org/r/1651839939-101188-1-git-send-email-john.garry@huawei.com Reported-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Yihang Li <liyihang6@hisilicon.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: hisi_sas: Use libsas internal abort supportJohn Garry1-9/+9
Use the common libsas internal abort functionality. In addition, this driver has special handling for internal abort timeouts - specifically whether to reset the controller in that instance, so extend the API for that. Timeout is now increased to 20 * Hz from 6 * Hz. We also retry for failure now, but this should not make a difference. Link: https://lore.kernel.org/r/1647001432-239276-5-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27scsi: hisi_sas: Modify v3 HW SSP underflow error processingXingui Yang1-13/+26
In case of SSP underflow allow the response frame IU to be examined for setting the response stat value rather than always setting SAS_DATA_UNDERRUN. This will mean that we call sas_ssp_task_response() in those scenarios and may send sense data to upper layer. Such a condition would be for bad blocks were we just reporting an underflow error to upper layer, but now the sense data will tell immediately that the media is faulty. Link: https://lore.kernel.org/r/1645703489-87194-7-git-send-email-john.garry@huawei.com Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27scsi: hisi_sas: Limit users changing debugfs BIST count valueXiang Chen1-2/+50
Add a file operation for "cnt" file under bist directory, so users can only read "cnt" or clear "cnt" to zero, but cannot randomly modify. Link: https://lore.kernel.org/r/1645703489-87194-6-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27scsi: hisi_sas: Rename error labels in hisi_sas_v3_probe()Qi Liu1-10/+10
To avoid doubt, rename the error labels to indicate the action they will take. Link: https://lore.kernel.org/r/1645703489-87194-5-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27scsi: hisi_sas: Free irq vectors in order for v3 HWQi Liu1-5/+11
If the driver probe fails to request the channel IRQ or fatal IRQ, the driver will free the IRQ vectors before freeing the IRQs in free_irq(), and this will cause a kernel BUG like this: ------------[ cut here ]------------ kernel BUG at drivers/pci/msi.c:369! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Call trace: free_msi_irqs+0x118/0x13c pci_disable_msi+0xfc/0x120 pci_free_irq_vectors+0x24/0x3c hisi_sas_v3_probe+0x360/0x9d0 [hisi_sas_v3_hw] local_pci_probe+0x44/0xb0 work_for_cpu_fn+0x20/0x34 process_one_work+0x1d0/0x340 worker_thread+0x2e0/0x460 kthread+0x180/0x190 ret_from_fork+0x10/0x20 ---[ end trace b88990335b610c11 ]--- So we use devm_add_action() to control the order in which we free the vectors. Link: https://lore.kernel.org/r/1645703489-87194-4-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27scsi: hisi_sas: Change permission of parameter prot_maskXiang Chen1-1/+1
Currently the permission of parameter prot_mask is 0x0, which means that the member does not appear in sysfs. Change it as other module parameters to 0444 for world-readable. [mkp: s/v3/v2/] Link: https://lore.kernel.org/r/1645703489-87194-2-git-send-email-john.garry@huawei.com Fixes: d6a9000b81be ("scsi: hisi_sas: Add support for DIF feature for v2 hw") Reported-by: Yihang Li <liyihang6@hisilicon.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19scsi: libsas: Add struct sas_tmf_taskJohn Garry1-1/+1
Some of the LLDDs which use libsas have their own definition of a struct to hold TMF info, so add a common struct for libsas. Also add an interim force phy id field for hisi_sas driver, which will be removed once the STP "TMF" code is factored out. Even though some LLDDs (pm8001) use a u32 for the tag, u16 will be adequate, as that named driver only uses tags in range [0, 1024). Link: https://lore.kernel.org/r/1645112566-115804-8-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-14Merge branch '5.17/scsi-fixes' into 5.18/scsi-stagingMartin K. Petersen1-2/+0
Pull 5.17 fixes branch into 5.18 tree to resolve a few pm8001 driver merge conflicts. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-11scsi: libsas: Drop SAS_TASK_AT_INITIATORJohn Garry1-2/+1
This flag is now only ever set, so delete it. This also avoids a use-after-free in the pm8001 queue path, as reported in the following: https://lore.kernel.org/linux-scsi/c3cb7228-254e-9584-182b-007ac5e6fe0a@huawei.com/T/#m28c94c6d3ff582ec4a9fa54819180740e8bd4cfb https://lore.kernel.org/linux-scsi/0cc0c435-b4f2-9c76-258d-865ba50a29dd@huawei.com/ [mkp: checkpatch + two SAS_TASK_AT_INITIATOR references] Link: https://lore.kernel.org/r/1644489804-85730-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: hisi_sas: Remove useless DMA-32 fallback configurationChristophe JAILLET1-2/+0
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/1bf2d3660178b0e6f172e5208bc0bd68d31d9268.1642237482.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22scsi: hisi_sas: Use autosuspend for the host controllerXiang Chen1-0/+2
The controller may frequently enter and exit suspend for each I/O which we need to deal with. This is inefficient and may cause too much suspend and resume activity for the controller. To avoid this, use a default 5s autosuspend for the controller to stop frequently suspending and resuming. This value may still be modified via sysfs interfaces. Link: https://lore.kernel.org/r/1639999298-244569-16-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22scsi: hisi_sas: Keep controller active between ISR of phyup and the event being processedXiang Chen1-1/+3
It is possible that controller may become suspended between processing a phyup interrupt and the event being processed by libsas. As such, we can't ensure the controller is active when processing the phyup event - this may cause the phyup event to be lost or other issues. To avoid any possible issues, add pm_runtime_get_noresume() in phyup interrupt handler and pm_runtime_put_sync() in the work handler exit to ensure that we stay always active. Since we only want to call pm_runtime_get_noresume() for v3 hw, signal this will a new event, HISI_PHYE_PHY_UP_PM. Link: https://lore.kernel.org/r/1639999298-244569-14-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22scsi: hisi_sas: Add more logs for runtime suspend/resumeXiang Chen1-2/+6
Add some logs at the beginning and end of suspend/resume. Link: https://lore.kernel.org/r/1639999298-244569-9-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22scsi: libsas: Don't always drain event workqueue for HA resumeJohn Garry1-1/+9
For the hisi_sas driver, if a directly attached disk is removed during suspend, a hang will occur in the resume process: The background is that in commit 16fd4a7c5917 ("scsi: hisi_sas: Add device link between SCSI devices and hisi_hba"), it is ensured that the HBA device cannot be runtime suspended when any SCSI device associated is active. Other drivers which use libsas don't worry about this as none support runtime suspend. The mentioned hang occurs when an disk is removed during suspend. In the removal process - from PHYE_RESUME_TIMEOUT event processing - we call into scsi_remove_device(), which is being processed in the HA event workqueue. Here we wait for all suppliers of the SCSI device to resume, which includes the HBA device (from the above commit). However the HBA device cannot resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from calling sas_resume_ha() -> sas_drain_work()). This is the deadlock. There does not appear to be any need for the sas_drain_work() to be called at all in sas_resume_ha() as it is not syncing against anything, so allow LLDDs to avoid this by providing a variant of sas_resume_ha() which does "sync", i.e. doesn't drain the event workqueue. Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16scsi: hisi_sas: Fix phyup timeout on FPGAQi Liu1-2/+8
The OOB interrupt and phyup interrupt handlers may run out-of-order in high CPU usage scenarios. Since the hisi_sas_phy.timer is added in hisi_sas_phy_oob_ready() and disarmed in phy_up_v3_hw(), this out-of-order execution will cause hisi_sas_phy.timer timeout to trigger. To solve, protect hisi_sas_phy.timer and .attached with a lock, and ensure that the timer won't be added after phyup handler completes. Link: https://lore.kernel.org/r/1639579061-179473-8-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16scsi: hisi_sas: Prevent parallel FLR and controller resetQi Liu1-0/+1
If we issue a controller reset command during executing a FLR a hung task may be found: Call trace: __switch_to+0x158/0x1cc __schedule+0x2e8/0x85c schedule+0x7c/0x110 schedule_timeout+0x190/0x1cc __down+0x7c/0xd4 down+0x5c/0x7c hisi_sas_task_exec+0x510/0x680 [hisi_sas_main] hisi_sas_queue_command+0x24/0x30 [hisi_sas_main] smp_execute_task_sg+0xf4/0x23c [libsas] sas_smp_phy_control+0x110/0x1e0 [libsas] transport_sas_phy_reset+0xc8/0x190 [libsas] phy_reset_work+0x2c/0x40 [libsas] process_one_work+0x1dc/0x48c worker_thread+0x15c/0x464 kthread+0x160/0x170 ret_from_fork+0x10/0x18 This is a race condition which occurs when the FLR completes first. Here the host HISI_SAS_RESETTING_BIT flag out gets of sync as HISI_SAS_RESETTING_BIT is not always cleared with the hisi_hba.sem held, so now only set/unset HISI_SAS_RESETTING_BIT under hisi_hba.sem . Link: https://lore.kernel.org/r/1639579061-179473-7-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-05Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-35/+27
Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, smartpqi, lpfc, target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug fixes. Notable core changes are the removal of scsi->tag which caused some churn in obsolete drivers and a sweep through all drivers to call scsi_done() directly instead of scsi->done() which removes a pointer indirection from the hot path and a move to register core sysfs files earlier, which means they're available to KOBJ_ADD processing, which necessitates switching all drivers to using attribute groups" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits) scsi: lpfc: Update lpfc version to 14.0.0.3 scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss scsi: lpfc: Fix link down processing to address NULL pointer dereference scsi: lpfc: Allow PLOGI retry if previous PLOGI was aborted scsi: lpfc: Fix use-after-free in lpfc_unreg_rpi() routine scsi: lpfc: Correct sysfs reporting of loop support after SFP status change scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset scsi: lpfc: Revert LOG_TRACE_EVENT back to LOG_INIT prior to driver_resource_setup() scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer scsi: ufs: mediatek: Avoid sched_clock() misuse scsi: mpt3sas: Make mpt3sas_dev_attrs static scsi: scsi_transport_sas: Add 22.5 Gbps link rate definitions scsi: target: core: Stop using bdevname() scsi: aha1542: Use memcpy_{from,to}_bvec() scsi: sr: Add error handling support for add_disk() scsi: sd: Add error handling support for add_disk() scsi: target: Perform ALUA group changes in one step scsi: target: Replace lun_tg_pt_gp_lock with rcu in I/O path scsi: target: Fix alua_tg_pt_gps_count tracking scsi: target: Fix ordered tag handling ...
2021-10-18block: drop unused includes in <linux/blkdev.h>Christoph Hellwig1-0/+1
Drop various include not actually used in blkdev.h itself. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-16scsi: hisi_sas: Switch to attribute groupsBart Van Assche1-6/+8
struct device supports attribute groups directly but does not support struct device_attribute directly. Hence switch to attribute groups. Link: https://lore.kernel.org/r/20211012233558.4066756-21-bvanassche@acm.org Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12scsi: hisi_sas: Wait for phyup in hisi_sas_control_phy()Xiang Chen1-7/+2
When issuing a hardreset/linkreset/phy_set_linkrate from sysfs, the phy will be disabled and re-enabled for the directly attached scenario. It takes some time for the phy to come back up after re-enabling the phy. If the controller becomes suspended while waiting for the phy to come back, the phy up may be lost (along with the disk). To solve this problem, wait for the phy up to occur with a timeout. Indeed this is already done in hisi_sas_debug_I_T_nexus_reset() for local phys, so just relocate the functionality to hisi_sas_control_phy(). Since the HA workqueue is drained when suspending the controller, and the phy control function is called from the same workqueue, we can guarantee that the controller will not be suspended during this period. Link: https://lore.kernel.org/r/1634041588-74824-3-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12scsi: hisi_sas: Initialise devices in .slave_alloc callbackXiang Chen1-1/+1
Perform driver-specific SCSI device initialization in the designated SCSI midlayer callback instead of relying on the libsas "device found" callback. The SCSI midlayer .slave_alloc interface is called prior to sending any I/O to the device. Link: https://lore.kernel.org/r/1634041588-74824-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: hisi_sas: Increase debugfs_dump_index after dump is completedLuo Jiaxing1-1/+1
The hisi_hba debugfs_dump_index member should increased after a dump insertion completed, and not before it has started, so fix the code to do so. Link: https://lore.kernel.org/r/1629799260-120116-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: hisi_sas: Replace del_timer() calls with del_timer_sync()Xiang Chen1-2/+1
Some usage of del_timer() in the driver is potentially unsafe. When running the sas_task->slow_task timer in hisi_sas_exec_internal_tmf_task(), execution may be blocked in function hisi_sas_task_exec(); so it is possible that the timer is running when the callback to disable the timer is running. This could be dangerous, as we immediately release resources which the timer callback uses after disabling the timer. The same situation may be found at other sites, such as _hisi_sas_internal_task_abort(). Change calls to del_timer() to del_timer_sync() as necessary, to ensure any timer has finished when disabling. Also remove calls to timer_pending() prior to del_timer() as it is not necessary. Link: https://lore.kernel.org/r/1629799260-120116-5-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: hisi_sas: Rename HISI_SAS_{RESET -> RESETTING}_BITLuo Jiaxing1-5/+5
HISI_SAS_RESET_BIT means that the controller is being reset, and so the name is a bit vague. Rename it to HISI_SAS_RESETTING_BIT. Link: https://lore.kernel.org/r/1629799260-120116-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: hisi_sas: Stop printing queue count in v3 hardware probeJohn Garry1-1/+1
The number of hardware queues is available from sysfs. Remove the print in the v3 hardware probe function. Link: https://lore.kernel.org/r/1629799260-120116-3-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: hisi_sas: Use managed PCI functionsXiang Chen1-12/+8
Use managed PCI functions such as pcim_enable_device() and pcim_iomap_regions() to simplify exception handling code. Link: https://lore.kernel.org/r/1629799260-120116-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-11scsi: hisi_sas: Use scsi_cmd_to_rq() instead of scsi_cmnd.requestBart Van Assche1-1/+1
Prepare for removal of the request pointer by using scsi_cmd_to_rq() instead. This patch does not change any functionality. Link: https://lore.kernel.org/r/20210809230355.8186-22-bvanassche@acm.org Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-11Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-0/+1
Pull more SCSI updates from James Bottomley: "This is a set of minor fixes and clean ups in the core and various drivers. The only core change in behaviour is the I/O retry for spinup notify, but that shouldn't impact anything other than the failing case" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits) scsi: virtio_scsi: Add validation for residual bytes from response scsi: ipr: System crashes when seeing type 20 error scsi: core: Retry I/O for Notify (Enable Spinup) Required error scsi: mpi3mr: Fix warnings reported by smatch scsi: qedf: Add check to synchronize abort and flush scsi: MAINTAINERS: Add mpi3mr driver maintainers scsi: libfc: Fix array index out of bound exception scsi: mvsas: Use DEVICE_ATTR_RO()/RW() macro scsi: megaraid_mbox: Use DEVICE_ATTR_ADMIN_RO() macro scsi: qedf: Use DEVICE_ATTR_RO() macro scsi: qedi: Use DEVICE_ATTR_RO() macro scsi: message: mptfc: Switch from pci_ to dma_ API scsi: be2iscsi: Fix some missing space in some messages scsi: be2iscsi: Fix an error handling path in beiscsi_dev_probe() scsi: ufs: Fix build warning without CONFIG_PM scsi: bnx2fc: Remove meaningless bnx2fc_abts_cleanup() return value assignment scsi: qla2xxx: Add heartbeat check scsi: virtio_scsi: Do not overwrite SCSI status scsi: libsas: Add LUN number check in .slave_alloc callback scsi: core: Inline scsi_mq_alloc_queue() ...
2021-07-02Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-5/+5
Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (ufs, ibmvfc, megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with elx and mpi3mr being new drivers. The major core change is a rework to drop the status byte handling macros and the old bit shifted definitions and the rest of the updates are minor fixes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits) scsi: aha1740: Avoid over-read of sense buffer scsi: arcmsr: Avoid over-read of sense buffer scsi: ips: Avoid over-read of sense buffer scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe() scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame() scsi: elx: libefc: Fix less than zero comparison of a unsigned int scsi: elx: efct: Fix pointer error checking in debugfs init scsi: elx: efct: Fix is_originator return code type scsi: elx: efct: Fix link error for _bad_cmpxchg scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel() scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session() scsi: elx: efct: Fix error handling in efct_hw_init() scsi: elx: efct: Remove redundant initialization of variable lun scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected" scsi: lpfc: Fix build error in lpfc_scsi.c scsi: target: iscsi: Remove redundant continue statement scsi: qla4xxx: Remove redundant continue statement scsi: ppa: Switch to use module_parport_driver() scsi: imm: Switch to use module_parport_driver() scsi: mpt3sas: Fix error return value in _scsih_expander_add() ...
2021-06-22scsi: libsas: Add LUN number check in .slave_alloc callbackYufen Yu1-0/+1
Offlining a SATA device connected to a hisi SAS controller and then scanning the host will result in detecting 255 non-existent devices: # lsscsi [2:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sda [2:0:1:0] disk ATA WDC WD2003FYYS-3 1D01 /dev/sdb [2:0:2:0] disk SEAGATE ST600MM0006 B001 /dev/sdc # echo "offline" > /sys/block/sdb/device/state # echo "- - -" > /sys/class/scsi_host/host2/scan # lsscsi [2:0:0:0] disk ATA Samsung SSD 860 2B6Q /dev/sda [2:0:1:0] disk ATA WDC WD2003FYYS-3 1D01 /dev/sdb [2:0:1:1] disk ATA WDC WD2003FYYS-3 1D01 /dev/sdh ... [2:0:1:255] disk ATA WDC WD2003FYYS-3 1D01 /dev/sdjb After a REPORT LUN command issued to the offline device fails, the SCSI midlayer tries to do a sequential scan of all devices whose LUN number is not 0. However, SATA does not support LUN numbers at all. Introduce a generic sas_slave_alloc() handler which will return -ENXIO for SATA devices if the requested LUN number is larger than 0 and make libsas drivers use this function as their .slave_alloc callback. Link: https://lore.kernel.org/r/20210622034037.1467088-1-yuyufen@huawei.com Reported-by: Wu Bo <wubo40@huawei.com> Suggested-by: John Garry <john.garry@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-09scsi: hisi_sas: Include HZ in timer macrosLuo Jiaxing1-1/+1
Include HZ in timer macros to make the code more concise. Link: https://lore.kernel.org/r/1623058179-80434-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02scsi: libsas: Introduce more SAM status code aliases in enum exec_statusBart Van Assche1-4/+4
This patch prepares for converting SAM status codes into an enum. Without this patch converting SAM status codes into an enumeration type would trigger complaints about enum type mismatches for the SAS code. Link: https://lore.kernel.org/r/20210524025457.11299-2-bvanassche@acm.org Cc: Hannes Reinecke <hare@suse.com> Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Cc: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irqYang Yingliang1-4/+4
irqs allocated with devm_request_irq() should not be freed using free_irq(). Doing so causes a dangling pointer and a subsequent double free. Link: https://lore.kernel.org/r/20210519130519.2661938-1-yangyingliang@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12scsi: hisi_sas: Warn in v3 hw channel interrupt handler when status reg clearedLuo Jiaxing1-2/+8
If a channel interrupt occurs without any status bit set, the handler will return directly. However, if such redundant interrupts are received, it's better to check what happen, so add logs for this. Link: https://lore.kernel.org/r/1617709711-195853-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: Yihang Li <liyihang6@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12scsi: hisi_sas: Directly snapshot registers when executing a resetJianqin Xie1-12/+15
The debugfs snapshot should be executed before the reset occurs to ensure that the register contents are saved properly. As such, it is incorrect to queue the debugfs dump when running a reset as the reset will occur prior to the snapshot work item is handler. Therefore, directly snapshot registers in the reset work handler. Link: https://lore.kernel.org/r/1617709711-195853-5-git-send-email-john.garry@huawei.com Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com> Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12scsi: hisi_sas: Call sas_unregister_ha() to roll back if .hw_init() failsXiang Chen1-1/+3
Function sas_unregister_ha() needs to be called to roll back if hisi_hba->hw->hw_init() fails in function hisi_sas_probe() or hisi_sas_v3_probe(). Make that change. Link: https://lore.kernel.org/r/1617709711-195853-4-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12scsi: hisi_sas: Print SAS address for v3 hw erroneous completion printLuo Jiaxing1-1/+2
To help debugging efforts, print the device SAS address for v3 hw erroneous completion log. Here is an example print: hisi_sas_v3_hw 0000:b4:02.0: erroneous completion iptt=2193 task=000000002b0c13f8 dev id=17 addr=570fd45f9d17b001 Link: https://lore.kernel.org/r/1617709711-195853-3-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-26scsi: hisi_sas: Add trace FIFO debugfs supportLuo Jiaxing1-0/+251
The controller provides trace FIFO DFX tool to assist link fault debugging and link optimization. This tool can be helpful when debugging link faults without SAS analyzers. Each PHY has an independent trace FIFO interface. The user can configure the trace FIFO tool of one PHY by using the following six interfaces: signal_sel: select signal group applies to different scenarios. 0x0: linkrate negotiation 0x1: Host 12G TX train 0x2: Disk 12G TX train 0x3: SAS PHY CTRL DFX 0 0x4: SAS PHY CTRL DFX 1 0x5: SAS PCS DFX other: linkrate negotiation dump_mask: The masked hardware status bit will not be updated. dump_mode: determines how to dump data after trigger signal is generated. 0x0: dump forever 0x1: dump 32 data after trigger signal is generated 0x2: no more dump after trigger signal is generated trigger_mode: determines the trigger mode, level or edge. 0x0: dump when trigger signal changed 0x1: dump when trigger signal's level equal to trigger_level 0x2: dump when trigger signal's level different from trigger_level trigger_level: determines the trigger level. trigger_msk: mask trigger signal The user can get 32-byte values from hardware by reading the rd_data. These values consitute the status record of the hardware at different time points. Link: https://lore.kernel.org/r/1611659068-131975-6-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-26scsi: hisi_sas: Flush workqueue in hisi_sas_v3_remove()Luo Jiaxing1-0/+1
If the controller reset occurs at the same time as driver removal, it may be possible that the interrupts have been released prior to the host softreset, and calling pci_irq_vector() there causes a WARN: WARNING: CPU: 37 PID: 1542 /pci/msi.c:1275 pci_irq_vector+0xc0/0xd0 Call trace: pci_irq_vector+0xc0/0xd0 disable_host_v3_hw+0x58/0x5b0 [hisi_sas_v3_hw] soft_reset_v3_hw+0x40/0xc0 [hisi_sas_v3_hw] hisi_sas_controller_reset+0x150/0x260 [hisi_sas_main] hisi_sas_rst_work_handler+0x3c/0x58 [hisi_sas_main] To fix, flush the driver workqueue prior to releasing the interrupts to ensure any resets have been completed. Link: https://lore.kernel.org/r/1611659068-131975-5-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22scsi: hisi_sas: Switch back to original libsas event notifiersAhmed S. Darwish1-2/+2
libsas event notifiers required an extension where gfp_t flags must be explicitly passed. For bisectability, a temporary _gfp() variant of such functions were added. All call sites then got converted use the _gfp() variants and explicitly pass GFP context. Having no callers left, the original libsas notifiers were then modified to accept gfp_t flags by default. Switch back to the original libas API, while still passing GFP context. The libsas _gfp() variants will be removed afterwards. Link: https://lore.kernel.org/r/20210118100955.1761652-14-a.darwish@linutronix.de Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22scsi: hisi_sas: Pass gfp_t flags to libsas event notifiersAhmed S. Darwish1-2/+4
Use the new libsas event notifiers API, which requires callers to explicitly pass the gfp_t memory allocation flags. Below are the context analysis for modified functions: => hisi_sas_bytes_dmaed(): Since it is invoked from both process and atomic contexts, let its callers pass the gfp_t flags: * hisi_sas_main.c: ------------------ hisi_sas_phyup_work(): workqueue context -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_controller_reset_done(): has an msleep() -> hisi_sas_rescan_topology() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_debug_I_T_nexus_reset(): calls wait_for_completion_timeout() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) * hisi_sas_v1_hw.c: ------------------- int_abnormal_v1_hw(): irq handler -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) * hisi_sas_v[23]_hw.c: ---------------------- int_phy_updown_v[23]_hw(): irq handler -> phy_down_v[23]_hw() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) => int_bcast_v1_hw() and phy_bcast_v3_hw(): Both are invoked exclusively from irq handlers. Pass GFP_ATOMIC. Link: https://lore.kernel.org/r/20210118100955.1761652-12-a.darwish@linutronix.de Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>