aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2015-12-28RDMA/ocrdma: Depend on async link events from CNADevesh Sharma6-22/+119
Recently Dough Ledford reported a deadlock happening between ocrdma-load sequence and NetworkManager service issuing "open" on be2net interface. The deadlock happens when any be2net hook (e.g. open/close) is called in parallel to insmod ocrdma.ko. A. be2net is sending administrative open/close event to ocrdma holding device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net. So sequence of locks is rtnl_lock---> device_list lock B. When new ocrdma roce device gets registered, infiniband stack now takes rtnl_lock in ib_register_device() in GID initialization routines. So sequence of locks in this path is device_list lock ---> rtnl_lock. This improper locking sequence causes deadlock. With this patch we stop using administrative open and close events injected by be2net driver. These events were used to dispatch PORT_ACTIVE and PORT_ERROR events to the IB-stack. This patch implements a logic to receive async-link-events generated from CNA whenever link-state-change is detected. Now on, these async-events will be used to dispatch PORT_ACTIVE and PORT_ERROR events to IB-stack. Depending on async-events from CNA removes the need to hold device-list-mutex and thus breaks the busy-wait scenario. Reported-by: Doug Ledford <dledford@redhat.com> CC: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Selvin Xavier <selvin.xavier@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-28RDMA/ocrdma: Dispatch only port event when port state changesDevesh Sharma1-23/+0
Dispatch only port event to IB stack when port state changes. Don't explicitly modify qps to error. Let application listen to port events on async event queue or let QP fail with retry-exceeded completion error. Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-28RDMA/ocrdma: Fix vlan-id assignment in qp parametersDevesh Sharma1-3/+4
vlan-id is wrongly getting as 0 when PFC is enabled. Set vlan-id configured by user in QP parameters. In case vlan interface is not used, flash a warning to user to configure vlan and assign vlan-id as 0 in qp params. Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution') Cc: Matan Barak <matanb@mellanox.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22IB/mlx4: Replace kfree with kvfree in mlx4_ib_destroy_srqWengang Wang1-1/+1
Commit 0ef2f05c7e02ff99c0b5b583d7dee2cd12b053f2 uses vmalloc for WR buffers when needed and uses kvfree to free the buffers. It missed changing kfree to kvfree in mlx4_ib_destroy_srq(). Reported-by: Matthew Finaly <matt@Mellanox.com> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22IB/cma: cma_match_net_dev needs to take into account port_numMatan Barak1-7/+9
Previously, cma_match_net_dev called cma_protocol_roce which tried to verify that the IB device uses RoCE protocol. However, if rdma_id wasn't bound to a port, then the check would occur against the first port of the device without regard to whether that port was even of the same type as the type of port the incoming packet was received on. Fix this by passing the port of the request and only checking against the same port of the device. Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/mlx5: Postpone remove_keys under knowledge of coming preemptionLeon Romanovsky1-1/+13
The remove_keys() logic is performed as garbage collection task. Such task is intended to be run when no other active processes are running. The need_resched() will return TRUE if there are user tasks to be activated in near future. In such case, we don't execute remove_keys() and postpone the garbage collection work to try to run in next cycle, in order to free CPU resources to other tasks. The possible pseudo-code to trigger such scenario: 1. Allocate a lot of MR to fill the cache above the limit. 2. Wait a small amount of time "to calm" the system. 3. Start CPU extensive operations on multi-node cluster. 4. Expect performance degradation during MR cache shrink operation. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/mlx4: Use vmalloc for WR buffers when neededWengang Wang2-9/+21
There are several hits that WR buffer allocation(kmalloc) failed. It failed at order 3 and/or 4 contigous pages allocation. At the same time there are actually 100MB+ free memory but well fragmented. So try vmalloc when kmalloc failed. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/mlx4: Use correct order of variables in log messageWengang Wang1-1/+1
There is a mis-order in mlx4 log. Fix it. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08iser-target: Remove explicit mlx4 work-aroundSagi Grimberg1-10/+3
The driver now exposes sufficient limits so we can avoid having mlx4 specific work-around. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08mlx4: Expose correct max_sge_rd limitSagi Grimberg2-1/+12
mlx4 devices (ConnectX-2, ConnectX-3) has a limitation where rdma read work queue entries cannot exceed 512 bytes. A rdma_read wqe needs to fit in 512 bytes: - wqe control segment (16 bytes) - rdma segment (16 bytes) - scatter elements (16 bytes each) So max_sge_rd should be: (512 - 16 - 16) / 16 = 30. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/mad: Require CM send method for everything except ClassPortInfoHal Rosenstock2-0/+7
Receipt of CM MAD with other than the Send method for an attribute other than the ClassPortInfo attribute is invalid. CM attributes other than ClassPortInfo only use the send method. The SRP initiator does not maintain a timeout policy for CM connect requests relies on the CM layer to do that. The result was that the SRP initiator hung as the connect request never completed. A new SRP target has been observed to respond to Send CM REQ with GetResp of CM REQ with bad status. This is non conformant with IBA spec but exposes a vulnerability in the current MAD/CM code which will respond to the incoming GetResp of CM REQ as if it was a valid incoming Send of CM REQ rather than tossing this on the floor. It also causes the MAD layer not to retransmit the original REQ even though it has not received a REP. Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/cma: Add a missing rcu_read_unlock()Bart Van Assche1-4/+1
Ensure that validate_ipv4_net_dev() calls rcu_read_unlock() if fib_lookup() fails. Detected by sparse. Compile-tested only. Fixes: "IB/cma: Validate routing of incoming requests" (commit f887f2ac87c2). Cc: Haggai Eran <haggaie@mellanox.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB core: Fix ib_sg_to_pages()Bart Van Assche1-21/+22
On 12/03/2015 01:18 AM, Christoph Hellwig wrote: > The patch looks good to me, but while we touch this area, how about > throwing in a few cosmetic fixes as well? How about the patch below ? In that version of the ib_sg_to_pages() fix these concerns have been addressed and additionally to more bugs have been fixed. ------------ [PATCH] IB core: Fix ib_sg_to_pages() Fix the code for detecting gaps. A gap occurs not only if the second or later scatterlist element is not aligned but also if any scatterlist element other than the last does not end at a page boundary. In the code for coalescing contiguous elements, ensure that mr->length is correct and that last_page_addr is up-to-date. Ensure that this function returns a negative error code instead of zero if the first set_page() call fails. Fixes: commit 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API") Reported-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix srp_map_sg_fr()Bart Van Assche2-15/+9
After dma_map_sg() has been called the return value of that function must be used as the number of elements in the scatterlist instead of scsi_sg_count(). Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API") Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: stable <stable@vger.kernel.org> # v4.4+ Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix indirect data buffer rkey endiannessBart Van Assche1-1/+1
Detected by sparse. Fixes: commit 330179f2fa93 ("IB/srp: Register the indirect data buffer descriptor") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: stable <stable@vger.kernel.org> # v4.3+ Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Initialize dma_length in srp_map_idbChristoph Hellwig1-0/+3
Without this sg_dma_len will return 0 on architectures tha have the dma_length field. Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix possible send queue overflowSagi Grimberg1-1/+1
When using work request based memory registration (fast_reg) we must reserve SQ entries for registration and invalidation in addition to send operations. Each IO consumes 3 SQ entries (registration, send, invalidation) so we need to allocate 3x larger send-queue instead of 2x. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix a memory leakBart Van Assche1-9/+13
If srp_connect_ch() returns a positive value then that is considered by its caller as a connection failure but this does not result in a scsi_host_put() call and additionally causes the srp_create_target() function to return a positive value while it should return a negative value. Avoid all this confusion and additionally fix a memory leak by ensuring that srp_connect_ch() always returns a value that is <= 0. This patch avoids that a rejected login triggers the following memory leak: unreferenced object 0xffff88021b24a220 (size 8): comm "srp_daemon", pid 56421, jiffies 4295006762 (age 4240.750s) hex dump (first 8 bytes): 68 6f 73 74 35 38 00 a5 host58.. backtrace: [<ffffffff8151014a>] kmemleak_alloc+0x7a/0xc0 [<ffffffff81165c1e>] __kmalloc_track_caller+0xfe/0x160 [<ffffffff81260d2b>] kvasprintf+0x5b/0x90 [<ffffffff81260e2d>] kvasprintf_const+0x8d/0xb0 [<ffffffff81254b0c>] kobject_set_name_vargs+0x3c/0xa0 [<ffffffff81337e3c>] dev_set_name+0x3c/0x40 [<ffffffff81355757>] scsi_host_alloc+0x327/0x4b0 [<ffffffffa03edc8e>] srp_create_target+0x4e/0x8a0 [ib_srp] [<ffffffff8133778b>] dev_attr_store+0x1b/0x20 [<ffffffff811f27fa>] sysfs_kf_write+0x4a/0x60 [<ffffffff811f1e8e>] kernfs_fop_write+0x14e/0x180 [<ffffffff81176eef>] __vfs_write+0x2f/0xf0 [<ffffffff811771e4>] vfs_write+0xa4/0x100 [<ffffffff81177c64>] SyS_write+0x54/0xc0 [<ffffffff8151b257>] entry_SYSCALL_64_fastpath+0x12/0x6f Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/sa: Put netlink request into the request list before sendingKaike Wan1-15/+17
It was found by Saurabh Sengar that the netlink code tried to allocate memory with GFP_KERNEL while holding a spinlock. While it is possible to fix the issue by replacing GFP_KERNEL with GFP_ATOMIC, it is better to get rid of the spinlock while sending the packet. However, in order to protect against a race condition that a quick response may be received before the request is put on the request list, we need to put the request on the list first. Signed-off-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reported-by: Saurabh Sengar <saurabh.truth@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/iser: use sector_div instead of do_divArnd Bergmann1-1/+1
do_div is the wrong way to divide a sector_t, as it is less efficient when sector_t is 32-bit wide. With the upcoming do_div optimizations, the kernel starts warning about this: drivers/infiniband/ulp/iser/iser_verbs.c:1296:4: note: in expansion of macro 'do_div' include/asm-generic/div64.h:224:22: warning: passing argument 1 of '__div64_32' from incompatible pointer type This changes the code to use sector_div instead, which always produces optimal code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/core: use RCU for uverbs id lookupMike Marciniszyn2-5/+8
The current implementation gets a spin_lock, and at any scale with qib and hfi1 post send, the lock contention grows exponentially with the number of QPs. idr_find() is RCU compatibile, so read doesn't need the lock. Change to use rcu_read_lock() and rcu_read_unlock() in __idr_get_uobj(). kfree_rcu() is used to insure a grace period between the idr removal and actual free. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/qib: Minor fixes to qib per SFF 8636Easwar Hariharan1-2/+2
Minor errors found via code inspection during future development. SFF 8636 defines bit position 2 to hold the status indication of QSFP memory paging. The mask used to test for the value was incorrect and is fixed in this patch. Additionally, the dump function had a mismatch between the field being printed out and the field used to source the data which was fixed. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reported-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/core: Fix user mode post wr corruptionMike Marciniszyn1-5/+10
Commit e622f2f4ad21 ("IB: split struct ib_send_wr") introduced a regression for HCAs whose user mode post sends go through ib_uverbs_post_send(). The code didn't account for the fact that the first sge is offset by an operation dependent length. The allocation did, but the pointer to the destination sge list is computed without that knowledge. The sge list copy_from_user() then corrupts fields in the work request Store the operation dependent length in a local variable and compute the sge list copy_from_user() destination using that length. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/qib: Fix qib_mr structureIra Weiny1-1/+1
struct qib_mr requires the mr member be the last because struct qib_mregion contains a dynamic array at the end. The additions of members should have been placed before this structure as the comment noted. Failure to do so was causing random memory corruption. Reproducing this bug was easy to do by running the client and server of ib_write_bw -s 8 -n 5 on the same node. This BUG() was tripped in a slab debug kernel: kernel BUG at mm/slab.c:2572! Fixes: 38071a461f0a ("IB/qib: Support the new memory registration API") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-06Linux 4.4-rc4Linus Torvalds1-1/+1
2015-12-06staging/lustre: remove IOC_LIBCFS_PING_TEST ioctlJames Simmons2-18/+0
The ioctl IOC_LIBCFS_PING_TEST has not been used in ages. The recent nidstring changes which moved all the nidstring operations from libcfs to the LNet layer but this ioctl code was still using an nidstring operation that was causing a circular dependency loop between libcfs and LNet. Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-06Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds4-31/+15
Pull vfs fixes from Al Viro: "A couple of fixes (-stable fodder) + dead code removal after the overlayfs fix. I agree that it's better to separate from the fix part to make backporting easier, but IMO it's not worth delaying said dead code removal until the next window" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Don't reset ->total_link_count on nested calls of vfs_path_lookup() ovl: get rid of the dead code left from broken (and disabled) optimizations ovl: fix permission checking for setattr
2015-12-06Don't reset ->total_link_count on nested calls of vfs_path_lookup()Al Viro1-1/+0
we already zero it on outermost set_nameidata(), so initialization in path_init() is pointless and wrong. The same DoS exists on pre-4.2 kernels, but there a slightly different fix will be needed. Cc: stable@vger.kernel.org # v4.2 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06ovl: get rid of the dead code left from broken (and disabled) optimizationsAl Viro3-26/+11
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06ovl: fix permission checking for setattrMiklos Szeredi1-4/+4
[Al Viro] The bug is in being too enthusiastic about optimizing ->setattr() away - instead of "copy verbatim with metadata" + "chmod/chown/utimes" (with the former being always safe and the latter failing in case of insufficient permissions) it tries to combine these two. Note that copyup itself will have to do ->setattr() anyway; _that_ is where the elevated capabilities are right. Having these two ->setattr() (one to set verbatim copy of metadata, another to do what overlayfs ->setattr() had been asked to do in the first place) combined is where it breaks. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds5-15/+45
Pull scheduler fixes from Thomas Gleixner: "This updates contains the following changes: - Fix a signal handling regression in the bit wait functions. - Avoid false positive warnings in the wakeup path. - Initialize the scheduler root domain properly. - Handle gtime calculations in proc/$PID/stat proper. - Add more documentation for the barriers in try_to_wake_up(). - Fix a subtle race in try_to_wake_up() which might cause a task to be scheduled on two cpus - Compile static helper function only when it is used" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule() sched/core: Better document the try_to_wake_up() barriers sched/cputime: Fix invalid gtime in proc sched/core: Clear the root_domain cpumasks in init_rootdomain() sched/core: Remove false-positive warning from wake_up_process() sched/wait: Fix signal handling in bit wait helpers sched/rt: Hide the push_irq_work_func() declaration
2015-12-06Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds12-36/+54
Pull x86 fixes from Thoma Gleixner: "Another round of fixes for x86: - Move the initialization of the microcode driver to late_initcall to make sure everything that init function needs is available. - Make sure that lockdep knows about interrupts being off in the entry code before calling into c-code. - Undo the cpu hotplug init delay regression. - Use the proper conditionals in the mpx instruction decoder. - Fixup restart_syscall for x32 tasks. - Fix the hugepage regression on PAE kernels which was introduced with the latest PAT changes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/signal: Fix restart_syscall number for x32 tasks x86/mpx: Fix instruction decoder condition x86/mm: Fix regression with huge pages on PAE x86 smpboot: Re-enable init_udelay=0 by default on modern CPUs x86/entry/64: Fix irqflag tracing wrt context tracking x86/microcode: Initialize the driver late when facilities are up
2015-12-06Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds19-70/+129
Pull SCSI fixes from James Bottomley: "This is quite a bumper crop of fixes: three from Arnd correcting various build issues in some configurations, a lock recursion in qla2xxx. Two potentially exploitable issues in hpsa and mvsas, a potential null deref in st, a revert of a bdi registration fix that turned out to cause even more problems, a set of fixes to allow people who only defined MPT2SAS to still work after the mpt2/mpt3sas merger and a couple of fixes for issues turned up by the hyper-v storvsc driver" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: mpt3sas: fix Kconfig dependency problem for mpt2sas back compatibility Revert "scsi: Fix a bdi reregistration race" mpt3sas: Add dummy Kconfig option for backwards compatibility Fix a memory leak in scsi_host_dev_release() block/sd: Fix device-imposed transfer length limits scsi_debug: fix prevent_allow+verify regressions MAINTAINERS: Add myself as co-maintainer of the SCSI subsystem. sd: Make discard granularity match logical block size when LBPRZ=1 scsi: hpsa: select CONFIG_SCSI_SAS_ATTR scsi: advansys needs ISA dma api for ISA support scsi_sysfs: protect against double execution of __scsi_remove_device() st: fix potential null pointer dereference. scsi: report 'INQUIRY result too short' once per host advansys: fix big-endian builds qla2xxx: Fix rwlock recursion hpsa: logical vs bitwise AND typo mvsas: don't allow negative timeouts mpt3sas: Fix use sas_is_tlr_enabled API before enabling MPI2_SCSIIO_CONTROL_TLR_ON flag
2015-12-05Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds57-331/+735
Pull drm fixes from Dave Airlie: "A bunch of change across the board, the main things are some vblank fallout in radeon and nouveau required some work, but I think this should fix it all. There is also one drm fix for an oops in vmwgfx with how we pass the drm master around. The rest is just some amdgpu, i915, imx and rockchip fixes. Probably more than I'd like at this point, but hopefully things settle down now" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (40 commits) drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3) drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2) drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt drm/amdgpu: add spin lock to protect freed list in vm (v2) drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2 drm/amdgpu: take a BO reference for the user fence drm/amdgpu: take a BO reference in the display code drm/amdgpu: set snooped flags only on system addresses v2 drm/nouveau: Fix pre-nv50 pageflip events (v4) drm: Fix an unwanted master inheritance v2 drm/amdgpu: fix race condition in amd_sched_entity_push_job drm/amdgpu: add err check for pin userptr drm/i915: take a power domain reference while checking the HDMI live status drm/i915: add MISSING_CASE to a few port/aux power domain helpers drm/i915/ddi: fix intel_display_port_aux_power_domain() after HDMI detect drm/i915: Introduce a gmbus power domain drm/i915: Clean up AUX power domain handling drm/rockchip: Use CRTC vblank event interface drm/rockchip: Fix module autoload for OF platform driver drm/rockchip: vop: fix window origin calculation ...
2015-12-05Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds3-3/+4
Pull crypto fixes from Herbert Xu: "This fixes a couple of crypto drivers that were using memcmp to verify authentication tags. They now use crypto_memneq instead" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: talitos - Fix timing leak in ESP ICV verification crypto: nx - Fix timing leak in GCM and CCM decryption
2015-12-05x86/signal: Fix restart_syscall number for x32 tasksDmitry V. Levin1-7/+10
When restarting a syscall with regs->ax == -ERESTART_RESTARTBLOCK, regs->ax is assigned to a restart_syscall number. For x32 tasks, this syscall number must have __X32_SYSCALL_BIT set, otherwise it will be an x86_64 syscall number instead of a valid x32 syscall number. This issue has been there since the introduction of x32. Reported-by: strace/tests/restart_syscall.test Reported-and-tested-by: Elvira Khabirova <lineprinter0@gmail.com> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Cc: Elvira Khabirova <lineprinter0@gmail.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20151130215436.GA25996@altlinux.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-05x86/mpx: Fix instruction decoder conditionDave Hansen1-3/+3
MPX decodes instructions in order to tell which bounds register was violated. Part of this decoding involves looking at the "REX prefix" which is a special instrucion prefix used to retrofit support for new registers in to old instructions. The X86_REX_*() macros are defined to return actual bit values: #define X86_REX_R(rex) ((rex) & 4) *not* boolean values. However, the MPX code was checking for them like they were booleans. This might have led to us mis-decoding the "REX prefix" and giving false information out to userspace about bounds violations. X86_REX_B() actually is bit 1, so this is really only broken for the X86_REX_X() case. Fix the conditionals up to tolerate the non-boolean values. Fixes: fcc7ffd67991 "x86, mpx: Decode MPX instruction to get bound violation information" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: Dave Hansen <dave@sr71.net> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20151201003113.D800C1E0@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-05Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-nextDave Airlie30-91/+387
A few more last minute fixes for 4.4 on top of my pull request from earlier this week. The big change here is a vblank regression fix due to commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed". Beyond that, a hotplug fix and a few VM fixes. * 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3) drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2) drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt drm/amdgpu: add spin lock to protect freed list in vm (v2) drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2 drm/amdgpu: take a BO reference for the user fence drm/amdgpu: take a BO reference in the display code drm/amdgpu: set snooped flags only on system addresses v2 drm/amdgpu: fix race condition in amd_sched_entity_push_job drm/amdgpu: add err check for pin userptr add blacklist for thinkpad T40p drm/amdgpu: fix VM page table reference counting drm/amdgpu: fix userptr flags check
2015-12-04Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-clientLinus Torvalds1-0/+1
Pull Ceph fix from Sage Weil: "This addresses a refcounting bug that leads to a use-after-free" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: don't put snap_context twice in rbd_queue_workfn()
2015-12-04drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)Alex Deucher6-30/+140
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank. This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync. This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications. See bug #93147 for an example of the regression on Linux 4.4-rc: https://bugs.freedesktop.org/show_bug.cgi?id=93147 This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem: 1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval. To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank. 2. The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update. 3. Calls from vblank irq-context and regular non-irq calls are now treated identical, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn(). 4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank. The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controllers line buffer in scanlines for the given video mode: Vblank irq's are triggered by the line buffer logic when the line buffer refill for a video frame ends, ie. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode. This patch has been successfully tested on a RV730 card with DCE-3 display engine and on a evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes. A similar patch is needed for amdgpu-kms to fix the same problem. Limitations: - Maybe replace the udelay() in the flip_work_func() by a suitable usleep_range() for a bit better efficiency? Will try that. - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value i just guessed to be high enough to work ok, lacking info on the true sizes atm. Probably fixes: fdo#93147 Port of Mario's radeon fix to amdgpu. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> (v2) Refine amdgpu_flip_work_func() for better efficiency. In amdgpu_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases. Also small fix to code comment and formatting in that function. (v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> (v3) Fix crash in crtc disabled case
2015-12-04Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds4-53/+76
Pull libnvdimm fixes from Dan Williams: - NFIT parsing regression fixes from Linda. The nvdimm hot-add implementation merged in 4.4-rc1 interpreted the specification in a way that breaks actual HPE platforms. We are also closing the loop with the ACPI Working Group to get this clarification added to the spec. - Andy pointed out that his laptop without nvdimm resources is loading the e820-nvdimm module by default, fix that up to only load the module when an e820-type-12 range is present. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nfit: Adjust for different _FIT and NFIT headers nfit: Fix the check for a successful NFIT merge nfit: Account for table size length variation libnvdimm, e820: skip module loading when no type-12
2015-12-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds9-89/+107
Pull ARM KVM fixes from Paolo Bonzini: - a series of fixes to deal with the aliasing between the sp and xzr register - a fix for the cache flush fix that went in -rc3 * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: ARM/arm64: KVM: correct PTE uncachedness check arm64: KVM: Get rid of old vcpu_reg() arm64: KVM: Correctly handle zero register in system register accesses arm64: KVM: Remove const from struct sys_reg_params arm64: KVM: Correctly handle zero register during MMIO
2015-12-04drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)Mario Kleiner9-29/+164
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank. This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync. This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications. See bug #93147 for an example of the regression on Linux 4.4-rc: https://bugs.freedesktop.org/show_bug.cgi?id=93147 This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem: 1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval. To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank. 2. The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update. 3. Calls from vblank irq-context and regular non-irq calls are now treated identical, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn(). 4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank. The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controllers line buffer in scanlines for the given video mode: Vblank irq's are triggered by the line buffer logic when the line buffer refill for a video frame ends, ie. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode. This patch has been successfully tested on a RV730 card with DCE-3 display engine and on a evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes. A similar patch is needed for amdgpu-kms to fix the same problem. Limitations: - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value i just guessed to be high enough to work ok, lacking info on the true sizes atm. Fixes: fdo#93147 Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: Harry Wentland <Harry.Wentland@amd.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Tested-by: Dave Witbrodt <dawitbro@sbcglobal.net> (v2) Refine radeon_flip_work_func() for better efficiency: In radeon_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases. Retested on DCE-3 and DCE-4 to verify it still works nicely. (v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interruptLyude10-12/+32
HPD signals on DVI ports can be fired off before the pins required for DDC probing actually make contact, due to the pins for HPD making contact first. This results in a HPD signal being asserted but DDC probing failing, resulting in hotplugging occasionally failing. This is somewhat rare on most cards (depending on what angle you plug the DVI connector in), but on some cards it happens constantly. The Radeon R5 on the machine used for testing this patch for instance, runs into this issue just about every time I try to hotplug a DVI monitor and as a result hotplugging almost never works. Rescheduling the hotplug work for a second when we run into an HPD signal with a failing DDC probe usually gives enough time for the rest of the connector's pins to make contact, and fixes this issue. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lyude <cpaul@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04drm/amdgpu: add spin lock to protect freed list in vm (v2)jimqu2-3/+15
there is a protection fault about freed list when OCL test. add a spin lock to protect it. v2: drop changes in vm_fini Signed-off-by: JimQu <jim.qu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-04drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2Christian König2-2/+2
The gtt_end is already inclusive, we don't need to subtract one here. v2 (chk): keep the fix for the VM code, cause here it really applies. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04drm/amdgpu: take a BO reference for the user fenceChristian König1-2/+4
No need for a GEM reference here. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04drm/amdgpu: take a BO reference in the display codeChristian König1-3/+3
No need for the GEM reference here. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04Merge tag 'kvm-arm-for-v4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-masterPaolo Bonzini9-89/+107
KVM/ARM fixes for v4.4-rc4 - A series of fixes to deal with the aliasing between the sp and xzr register - A fix for the cache flush fix that went in -rc3
2015-12-04drm/amdgpu: set snooped flags only on system addresses v2Christian König1-3/+4
Not necessary for VRAM. v2: no need to check if ttm is NULL. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>