Age | Commit message | Author | Files | Lines
|
Port of amdgpu patch 9298e52f8b51d1e4acd68f502832f3a97f8cf892.
Signed-off-by: Christian König <christian.koenig@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This can be the case when the GPU is powered off, e.g. via vgaswitcheroo
or runpm. When the GPU is powered up again, radeon_gart_table_vram_pin
flushes the TLB after setting rdev->gart.ptr to non-NULL.
Fixes panic on powering off R7xx GPUs.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61529
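As a rough illustration of the guard being described (a sketch, not the
literal diff; radeon_gart_tlb_flush() is the driver's flush hook):
	/* skip the TLB flush while the GART table is not mapped */
	if (!rdev->gart.ptr)
		return;	/* GPU powered off; radeon_gart_table_vram_pin() flushes on power-up */
	mb();
	radeon_gart_tlb_flush(rdev);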
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=76490
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Destroy ocrdma_dev_id IDR on module exit, reclaiming the allocated memory.
This was detected by the following semantic patch (written by Luis Rodriguez
<mcgrof@suse.com>)
<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@
module_init(init);
@ defines_module_exit @
identifier exit;
@@
module_exit(exit);
@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@
DEFINE_IDR(idr);
@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@
exit(void)
{
...
idr_destroy(&idr);
...
}
@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@
exit(void)
{
...
+idr_destroy(&idr);
}
</SmPL>
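As a self-contained illustration of the pattern the semantic patch enforces
(module and IDR names here are hypothetical, not the ocrdma code):
#include <linux/module.h>
#include <linux/idr.h>

static DEFINE_IDR(example_idr);	/* stands in for ocrdma_dev_id */

static int __init example_init(void)
{
	return 0;
}

static void __exit example_exit(void)
{
	/* free the IDR's internal caches when the module goes away */
	idr_destroy(&example_idr);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");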
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Destroy multcast_idr on module exit, reclaiming the allocated memory.
This was detected by the following semantic patch (written by Luis Rodriguez
<mcgrof@suse.com>)
<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@
module_init(init);
@ defines_module_exit @
identifier exit;
@@
module_exit(exit);
@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@
DEFINE_IDR(idr);
@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@
exit(void)
{
...
idr_destroy(&idr);
...
}
@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@
exit(void)
{
...
+idr_destroy(&idr);
}
</SmPL>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
There is little chance our memory allocation will fail, so we can
combine initializing the work structs with allocating them, instead of
looping through all of them once to allocate and again to initialize.
Then, when we need to find out whether our device is up or in the
process of going down, we have all of our work structs batched up, can
take the spin_lock once and only once, and can queue the whole batch
under that single spin_lock invocation, instead of incurring the locked
memory cycles of taking and releasing the spin_lock over and over
again.
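A sketch of both halves with invented names (ex_dev is assumed to carry the
spin lock, a going_down flag and the workqueue):
#define EX_MAX_PORTS 8

struct ex_dev {
	spinlock_t lock;
	bool going_down;
	struct workqueue_struct *wq;
};

struct ex_work {
	struct work_struct work;
	int port;
};

static void ex_work_handler(struct work_struct *work)
{
	struct ex_work *w = container_of(work, struct ex_work, work);

	/* ... per-port processing ... */
	kfree(w);	/* the handler owns and frees its work item on completion */
}

static int ex_queue_all(struct ex_dev *dev, int nports)
{
	struct ex_work *w[EX_MAX_PORTS];
	unsigned long flags;
	bool dead;
	int i;

	/* allocate and initialize in a single pass */
	for (i = 0; i < nports; i++) {
		w[i] = kmalloc(sizeof(*w[i]), GFP_KERNEL);
		if (!w[i])
			goto err_free;
		INIT_WORK(&w[i]->work, ex_work_handler);
		w[i]->port = i;
	}

	/* take the lock once, check the device state once, queue the batch */
	spin_lock_irqsave(&dev->lock, flags);
	dead = dev->going_down;
	if (!dead)
		for (i = 0; i < nports; i++)
			queue_work(dev->wq, &w[i]->work);
	spin_unlock_irqrestore(&dev->lock, flags);

	if (dead) {
		/* nothing was queued, so we still own (and must free) the memory */
		for (i = 0; i < nports; i++)
			kfree(w[i]);
		return -ENODEV;
	}
	return 0;

err_free:
	while (--i >= 0)
		kfree(w[i]);
	return -ENOMEM;
}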
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
We create a number of work structs to be queued up to a workqueue, and
on completion of the workqueue handler, the workqueue handler frees the
allocated memory. If, however, we don't queue the work struct because
the device is going down, then we need to free the memory ourselves.
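The ownership rule this implies, in sketch form (same invented names as
above): whatever is not handed to the workqueue must be freed by the caller,
because the handler will never run for it.
	bool queued = false;

	spin_lock_irqsave(&dev->lock, flags);
	if (!dev->going_down) {
		queue_work(dev->wq, &w->work);	/* the handler kfree()s w when it runs */
		queued = true;
	}
	spin_unlock_irqrestore(&dev->lock, flags);

	if (!queued)
		kfree(w);	/* never queued, so nobody else will free it */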
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
On failure, we loop through all possible pointers and test them before
calling kfree. But really, why attempt to free items we never allocated
when we can easily loop through exactly and only the devices for which
the original memory allocation succeeded, and free just those?
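In sketch form (invented names), the error path then unwinds only what
actually succeeded:
	struct ex_ctx *ctx[EX_MAX_DEVS];
	int i;

	for (i = 0; i < ndevs; i++) {
		ctx[i] = kzalloc(sizeof(*ctx[i]), GFP_KERNEL);
		if (!ctx[i])
			goto err;
	}
	return 0;

err:
	/* free exactly and only what was successfully allocated */
	while (--i >= 0)
		kfree(ctx[i]);
	return -ENOMEM;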
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
For IB links, reading HCA flow counters through iboe_process_mad() should
only be done when mlx4_ib_process_mad() is invoked for VF PMA queries, and
for exactly nothing else.
Fixes: 7193a141eb74 ('IB/mlx4: Set VF to read from QP counters')
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In the little endian case, the macros be16_to_cpu and cpu_to_be64
unfold to __swab{16,64}, which provide a special case for constants.
In the big endian case, __constant_be16_to_cpu and be16_to_cpu
expand directly to the same expression. The same applies to
__constant_cpu_to_be64 and cpu_to_be64.
So, replace __constant_be16_to_cpu with be16_to_cpu and
__constant_cpu_to_be64 with cpu_to_be64, with the goal of getting
rid of the definitions of __constant_be16_to_cpu and
__constant_cpu_to_be64 completely.
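In C terms, with an illustrative constant EX_NODE_GUID, the change is simply:
	/* before */
	guid = __constant_cpu_to_be64(EX_NODE_GUID);
	/* after: cpu_to_be64() already folds constant input at compile time */
	guid = cpu_to_be64(EX_NODE_GUID);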
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When switching between modes (datagram / connected), change the MTU
accordingly: datagram mode supports up to 4K, connected mode up to
(64K - 0x10).
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
By default, the IPoIB-CM driver uses a 64k MTU. A larger MTU gives better
performance.
This MTU plus overhead puts the memory allocation for IP based packets at
32 4k pages (order 5), which have to be contiguous.
When system memory is under pressure, it was observed that allocating 128k
of contiguous physical memory is difficult and causes serious errors (such
as the system becoming unusable).
This enhancement resolves the issue by removing the physically contiguous
memory requirement, using the scatter/gather support that exists in the
Linux network stack.
With this fix, scatter-gather is also supported in connected mode.
This change reverts some of the change made in commit e112373fd6aa
("IPoIB/cm: Reduce connected mode TX object size").
The ability to use SG in IPoIB CM is possible because the coupling
between NETIF_F_SG and NETIF_F_CSUM was removed in commit
ec5f06156423 ("net: Kill link between CSUM and SG features.")
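As a rough sketch of what relying on scatter/gather means on the send side
(names follow the ipoib style but are illustrative; the frag accessors are
the standard skb ones of that kernel generation):
static int ex_map_tx_skb(struct ib_device *ca, struct sk_buff *skb, u64 *mapping)
{
	int i;

	/* linear part of the skb */
	mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb),
				       DMA_TO_DEVICE);
	if (unlikely(ib_dma_mapping_error(ca, mapping[0])))
		return -EIO;

	/* page fragments: no physically contiguous buffer required */
	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];

		mapping[i + 1] = ib_dma_map_page(ca, skb_frag_page(frag),
						 frag->page_offset,
						 skb_frag_size(frag),
						 DMA_TO_DEVICE);
		if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1])))
			return -EIO;	/* a real driver unmaps what was mapped so far */
	}
	return 0;
}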
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Christian Marie <christian@ponies.io>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
ib_ucm_release_dev clears the wrong bit if devnum is greater
than IB_UCM_MAX_DEVICES.
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
__ipoib_ib_dev_flush calls itself recursively on child devices, and lockdep
complains about locking vlan_rwsem twice (see below). Use down_read_nested
instead of down_read to prevent the warning.
=============================================
[ INFO: possible recursive locking detected ]
4.1.0-rc4+ #36 Tainted: G O
---------------------------------------------
kworker/u20:2/261 is trying to acquire lock:
(&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
but task is already holding lock:
(&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&priv->vlan_rwsem);
lock(&priv->vlan_rwsem);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/u20:2/261:
#0: ("%s""ipoib_flush"){.+.+..}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
#1: ((&priv->flush_heavy)){+.+...}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
#2: (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
stack backtrace:
CPU: 3 PID: 261 Comm: kworker/u20:2 Tainted: G O 4.1.0-rc4+ #36
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
Workqueue: ipoib_flush ipoib_ib_dev_flush_heavy [ib_ipoib]
ffff8801c6c54790 ffff8801c9927af8 ffffffff81665238 0000000000000001
ffffffff825b5b30 ffff8801c9927bd8 ffffffff810bba51 ffff880100000000
ffffffff00000001 ffff880100000001 ffff8801c6c55428 ffff8801c6c54790
Call Trace:
[<ffffffff81665238>] dump_stack+0x4f/0x6f
[<ffffffff810bba51>] __lock_acquire+0x741/0x1820
[<ffffffff810bcbf8>] lock_acquire+0xc8/0x240
[<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
[<ffffffff81669d2c>] down_read+0x4c/0x70
[<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
[<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
[<ffffffffa0791e4a>] __ipoib_ib_dev_flush+0x5a/0x2b0 [ib_ipoib]
[<ffffffffa07920ba>] ipoib_ib_dev_flush_heavy+0x1a/0x20 [ib_ipoib]
[<ffffffff81082871>] process_one_work+0x201/0x760
[<ffffffff810827cc>] ? process_one_work+0x15c/0x760
[<ffffffff81082ef0>] worker_thread+0x120/0x4d0
[<ffffffff81082dd0>] ? process_one_work+0x760/0x760
[<ffffffff81082dd0>] ? process_one_work+0x760/0x760
[<ffffffff81088b7e>] kthread+0xfe/0x120
[<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
[<ffffffff8166c6e2>] ret_from_fork+0x42/0x70
[<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
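A sketch of the annotation (the nesting parameter and the child-interface
walk are modelled on the description, with illustrative names):
static void __ex_dev_flush(struct ex_priv *priv, int nesting)
{
	struct ex_priv *cpriv;

	/* annotate each recursion level so lockdep does not treat the
	 * child-device acquisition as a self-deadlock */
	down_read_nested(&priv->vlan_rwsem, nesting);

	list_for_each_entry(cpriv, &priv->child_intfs, list)
		__ex_dev_flush(cpriv, nesting + 1);

	up_read(&priv->vlan_rwsem);
}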
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
ucma_lock_files() locks the file->mut mutex on two files, e.g. when
migrating an ID. Use mutex_lock_nested() to prevent the warning below.
=============================================
[ INFO: possible recursive locking detected ]
4.1.0-rc6-hmm+ #40 Tainted: G O
---------------------------------------------
pingpong_rpc_se/10260 is trying to acquire lock:
(&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
but task is already holding lock:
(&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&file->mut);
lock(&file->mut);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by pingpong_rpc_se/10260:
#0: (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
stack backtrace:
CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G O 4.1.0-rc6-hmm+ #40
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001
ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000
ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9
Call Trace:
[<ffffffff81668f49>] dump_stack+0x4f/0x6e
[<ffffffff810bb991>] __lock_acquire+0x741/0x1820
[<ffffffff8121bee9>] ? dput+0x29/0x320
[<ffffffff810bcb38>] lock_acquire+0xc8/0x240
[<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
[<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0
[<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0
[<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
[<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10
[<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
[<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm]
[<ffffffff81200674>] __vfs_write+0x34/0x100
[<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
[<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0
[<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90
[<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0
[<ffffffff812009bb>] vfs_write+0xab/0x120
[<ffffffff81201519>] SyS_write+0x59/0xd0
[<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
[<ffffffff8166ffee>] system_call_fastpath+0x12/0x76
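A sketch in the spirit of ucma_lock_files() (field names taken from the
lockdep report; the ordering rule is the point):
	/* lock the two files' mutexes in a stable (address) order and annotate
	 * the second acquisition so lockdep does not flag it as recursive */
	if (file1 < file2) {
		mutex_lock(&file1->mut);
		mutex_lock_nested(&file2->mut, SINGLE_DEPTH_NESTING);
	} else {
		mutex_lock(&file2->mut);
		mutex_lock_nested(&file1->mut, SINGLE_DEPTH_NESTING);
	}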
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Fixes: 3e0249f9c05c ("RDS/IB: add refcount tracking to struct rds_ib_device")
The reference on rds_ib_device.refcount is not dropped when rds_ib_alloc_fmr
fails (the MR pool running out), which leads to a refcount overflow.
The BUG_ON complaint at line 117 (see below) is then seen. From the vmcore:
s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
That is evidence the MR pool was used up, so rds_ib_alloc_fmr was very likely
returning ERR_PTR(-EAGAIN).
115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
116 {
117 BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
118 if (atomic_dec_and_test(&rds_ibdev->refcount))
119 queue_work(rds_wq, &rds_ibdev->free_work);
120 }
The fix is to drop the refcount when rds_ib_alloc_fmr fails.
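The shape of the fix, roughly (exact call site and call signature are
assumptions based on the description):
	ibmr = rds_ib_alloc_fmr(rds_ibdev);
	if (IS_ERR(ibmr)) {
		rds_ib_dev_put(rds_ibdev);	/* drop the ref taken earlier in this path */
		return ibmr;
	}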
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Fix for incorrect recording of the MAC address
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Neighbor resolution doesn't work without this fix
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Fixes to allow clients to make remove mapping requests after the user
space service is restarted, once they have provided the service with
the mapping information they are using.
1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF
registration types for the port mapper clients and functions
to set/check the registration type.
2) If the port mapper user space service is not available to register
the client, then its registration stays IWPM_REG_UNDEF and the
registration isn't checked until the service becomes available
(no mappings are possible, if the user space service isn't running).
3) After the service is restarted, the user space port mapper pid is set
to valid and the client registration is set to IWPM_REG_INCOMPL
to allow the client to make remove mapping requests.
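A sketch of the registration states introduced above (numeric values and the
helper are illustrative, not the driver's definitions):
enum ex_iwpm_reg {
	IWPM_REG_UNDEF,		/* user space service unavailable at registration time */
	IWPM_REG_INCOMPL,	/* service restarted; remove mapping requests allowed */
	IWPM_REG_VALID,		/* fully registered with the user space service */
};

/* hypothetical helper in the spirit of the set/check functions mentioned */
static inline bool ex_iwpm_reg_allows_remove(enum ex_iwpm_reg reg)
{
	return reg == IWPM_REG_VALID || reg == IWPM_REG_INCOMPL;
}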
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Error values of ib_query_port() and ib_query_device() weren't propagated
correctly. Because of that, ipoib_add_port() could return a NULL value,
which escaped the IS_ERR() check in ipoib_add_one(), and we crashed.
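A sketch of the convention the fix enforces (function name and the elided
setup are invented; the point is that failure paths return ERR_PTR(),
never NULL):
static struct net_device *ex_add_port(struct ib_device *hca, u8 port)
{
	struct ib_port_attr attr;
	int result;

	result = ib_query_port(hca, port, &attr);
	if (result) {
		pr_warn("%s: ib_query_port %d failed (%d)\n",
			hca->name, port, result);
		return ERR_PTR(result);	/* propagate the error, never NULL */
	}

	/* ... remaining setup elided; every other failure path also
	 * returns ERR_PTR(err), so IS_ERR() in the caller catches it ... */
	return ERR_PTR(-ENOMEM);
}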
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
mlx4 VFs can provide CQE raw time-stamping services, but they
don't have the hca core clock mapped to their PCI bars.
As such, we should not attempt to query and report the clock offset
to user space for VFs. Doing so causes query_device over VFs to fail
with -ENOSUPP.
Fixes: 4b664c4355b2 ('IB/mlx4: Add support for CQ time-stamping')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Whenever ib_cm gets a remove_one call, such as when there is a hot-unplug
event, the driver should mark itself as going_down and confirm that no
new work will be queued for that device.
So, the order of the actions is:
1. mark the going_down bit.
2. flush the wq.
3. [make sure no new work is queued for that device.]
4. unregister the mad agent.
Otherwise, work that is already queued can be scheduled after the mad
agent has been freed.
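In sketch form, with invented names for the per-port structure:
static void ex_shutdown(struct ex_port *port)
{
	unsigned long flags;

	spin_lock_irqsave(&port->lock, flags);
	port->going_down = 1;		/* 1. refuse any new work from now on */
	spin_unlock_irqrestore(&port->lock, flags);

	flush_workqueue(port->wq);	/* 2. drain work that was already queued */

	/* 3. queueing paths test going_down under port->lock, so nothing new slipped in */

	ib_unregister_mad_agent(port->mad_agent);	/* 4. only now free the agent */
}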
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
We might return res which is not initialized. Also
reduce code duplication by exporting srp_parse_tmo so
srp_tmo_set can reuse it.
Detected by Coverity.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jenny Falkovich <jennyf@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In the little endian case, the macros cpu_to_be{16,32,64} unfold to
__swab{16,32,64}, which provide a special case for constants. In the
big endian case, __constant_cpu_to_be{16,32,64} and
cpu_to_be{16,32,64} expand directly to the same expression. So,
replace __constant_cpu_to_be{16,32,64} with cpu_to_be{16,32,64},
with the goal of getting rid of the definitions of
__constant_cpu_to_be{16,32,64} completely.
The Coccinelle semantic patch that performs this transformation
is as follows:
@@expression x;@@
(
- __constant_cpu_to_be16(x)
+ cpu_to_be16(x)
|
- __constant_cpu_to_be32(x)
+ cpu_to_be32(x)
|
- __constant_cpu_to_be64(x)
+ cpu_to_be64(x)
)
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
We recently added BUG_ON's, which are an inappropriate reaction to a
condition that should never happen. Change these to WARN_ON_ONCE as a
debugging aid.
Fixes: 4cd7c9479aff ('IB/mad: Add support for additional MAD info to/from drivers')
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The define OPA_LID_PERMISSIVE is big endian and was compared to the
cpu endian variable opa_drslid.
Problem caught by 0-day build infrastructure.
Fixes: 8e4349d13f33 (IB/mad: Add final OPA MAD processing)
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: John, Jubin <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Pursuant to Liran's comments on node_type on the linux-rdma
mailing list:
In an effort to reform the RDMA core and ULPs to minimize use of
node_type in struct ib_device, an additional bit is added to
struct ib_device for is_switch (IB switch). This bit needs to be
initialized by any IB switch device driver; this is a NEW requirement
on such device drivers, which are all "out of tree".
In addition, an ib_switch helper was added to ib_verbs.h
based on the is_switch device bit rather than node_type
(although the two should be consistent).
The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs)
as well as (IPoIB and SRP) ULPs are updated where
appropriate to use this new helper. In some cases,
the helper is now used under the covers of using
rdma_[start end]_port rather than the open coding
previously used.
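A sketch of such a helper (the exact name in ib_verbs.h may differ; the
is_switch bit is per the text above):
static inline bool rdma_cap_ib_switch(const struct ib_device *device)
{
	/* is_switch is the new bit that IB switch drivers must set at registration */
	return device->is_switch;
}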
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
At least some versions of AMI BIOS have corrupted contents in the TPM2
ACPI table: namely, the physical address of the control area is set to
zero.
This patch changes the driver to fail gracefully when we observe a zero
address instead of continuing to ioremap.
Cc: <stable@vger.kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
|
|
When a cdev is contained in a dynamic structure the cdev parent kobj
should be set to the kobj that controls the lifetime of the enclosing
structure. In TPM's case this is the embedded struct device.
Also, cdev_init() zeroes the whole structure, so all assignments must come
after it, not before. This fixes module ref counting and cdev.
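The resulting order, roughly (TPM field names are inferred from the
description, not quoted from the driver):
	cdev_init(&chip->cdev, &tpm_fops);	/* zeroes the cdev, so assign fields after this */

	chip->cdev.owner = THIS_MODULE;
	/* the parent kobj controls the lifetime of the enclosing structure:
	 * point it at the embedded struct device */
	chip->cdev.kobj.parent = &chip->dev.kobj;

	rc = cdev_add(&chip->cdev, chip->dev.devt, 1);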
Cc: <stable@vger.kernel.org>
Fixes: 313d21eeab92 ("tpm: device class for tpm")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
|
|
They just call file_inode and then the corresponding *_inode_wait
function. Just make them static inlines instead.
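A sketch of the conversion, assuming the inode-taking variants are named
posix_lock_inode_wait() and flock_lock_inode_wait():
static inline int posix_lock_file_wait(struct file *filp, struct file_lock *fl)
{
	return posix_lock_inode_wait(file_inode(filp), fl);
}

static inline int flock_lock_file_wait(struct file *filp, struct file_lock *fl)
{
	return flock_lock_inode_wait(file_inode(filp), fl);
}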
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
|
|
Now that we have file locking helpers that can deal with an inode
instead of a filp, we can change the NFSv4 locking code to use that
instead.
This should fix the case where we have a filp that is closed while flock
or OFD locks are set on it, and the task is signaled so that it doesn't
wait for the LOCKU reply to come in before the filp is freed. At that
point we can end up with a use-after-free with the current code, which
relies on dereferencing the fl_file in the lock request.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
|
|
Allow callers to pass in an inode instead of a filp.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
|
|
...and rename it to better describe how it works.
In order to fix a use-after-free in NFS, we need to be able to remove
locks from an inode after the filp associated with them may have already
been freed. flock_lock_file only ever dereferences the filp to get to
the inode, so just push that dereference out to the callers. All of the
callers already pass in a lock request that has fl_file set properly,
so we don't need to pass the filp in separately.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
|
|
This reverts commit db2efec0caba4f81a22d95a34da640b86c313c8e.
William reported that he was seeing instability with this patch, which
is likely due to the fact that it can cause the kernel to take a new
reference to a filp after the last reference has already been put.
Revert this patch for now, as we'll need to fix this in another way.
Cc: stable@vger.kernel.org
Reported-by: William Dauchy <william@gandi.net>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
|
|
If a machine check happens, the machine has the vector facility installed
and the extended save area exists, the cpu will save vector register
contents into the extended save area. This is regardless of control
register 0 contents, which enables and disables the vector facility during
runtime.
On each machine check we should validate the vector registers. The current
code however tries to validate the registers only if the running task is
using vector registers in user space.
However even the current code is broken and causes vector register
corruption on machine checks, if user space uses them:
the prefix area contains a pointer (absolute address) to the machine check
extended save area. In order to save some space the save area was put into
an unused area of the second prefix page.
When validating vector register contents the code uses the absolute address
of the extended save area, which is wrong. Due to prefixing the vector
instructions will then access contents using absolute addresses instead
of real addresses, where the machine stored the contents.
Even if the above worked, there is still the problem that register
validation would only happen if user space uses vector registers. If
kernel space also uses them, this may likewise lead to vector register
content corruption: if the kernel makes use of vector instructions, but
the currently running user space context does not, the machine check
handler will validate floating point registers instead of vector
registers.
Given that writing to a floating point register may change the upper
half of the corresponding vector register, we also experience vector
register corruption in this case.
Fix all of these issues, and always validate vector registers on each
machine check, if the machine has the vector facility installed and the
extended save area is defined.
Cc: <stable@vger.kernel.org> # 4.1+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The sfpc inline assembly within execve_tail() may incorrectly set bits
28-31 of the sfpc instruction to a value which is not zero.
These bits however are currently unused and therefore should be zero,
so we won't get surprised if they are used in the future.
Therefore remove the second operand from the inline assembly.
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The dasd device driver selects which (alias or base) device is used
for a given request when the request is built. If the chosen alias
device is set offline before the request gets queued to the device
queue, the start function may use device structures that have already
been freed. This might lead to a hanging offline process or a
kernel panic.
Add a check to the start function that returns the request to the
upper layer if the device is already in offline processing.
In addition, prevent an alias device that is already in offline
processing from being chosen as the start device.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <peter.oberparleiter@linux.vnet.ibm.com>
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Currently instruction_pointer() returns pt_regs->ret, so the return value
is of type "long", which implicitly stands for "signed long".
While that's perfectly fine when dealing with 32-bit values, if the return
value of instruction_pointer() gets assigned to a 64-bit variable, sign
extension may happen.
And in at least one real use-case it already does.
In perf_prepare_sample() the return value of perf_instruction_pointer()
(which is an alias of instruction_pointer() in the case of ARC) is assigned
to (struct perf_sample_data)->ip (whose type is "u64").
And what we see: if the instruction pointer points to a user-space
application, which in case of ARC lies below 0x8000_0000, "ip" gets set
properly with 32 leading zeros. But if the instruction pointer points to
kernel address space, which starts at 0x8000_0000, then "ip" is set with
32 leading "f"-s. I.e. if instruction_pointer() returns 0x8100_0000, "ip"
will be assigned 0xffff_ffff__8100_0000, which is obviously wrong.
In particular, that issue broke the output of perf, because perf was unable
to associate addresses like 0xffff_ffff__8100_0000 with anything in
/proc/kallsyms.
That's what we used to see:
----------->8----------
6.27% ls [unknown] [k] 0xffffffff8046c5cc
2.96% ls libuClibc-0.9.34-git.so [.] memcpy
2.25% ls libuClibc-0.9.34-git.so [.] memset
1.66% ls [unknown] [k] 0xffffffff80666536
1.54% ls libuClibc-0.9.34-git.so [.] 0x000224d6
1.18% ls libuClibc-0.9.34-git.so [.] 0x00022472
----------->8----------
With that change perf output looks much better now:
----------->8----------
8.21% ls [kernel.kallsyms] [k] memset
3.52% ls libuClibc-0.9.34-git.so [.] memcpy
2.11% ls libuClibc-0.9.34-git.so [.] malloc
1.88% ls libuClibc-0.9.34-git.so [.] memset
1.64% ls [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.41% ls [kernel.kallsyms] [k] __d_lookup_rcu
----------->8----------
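The effect is easy to demonstrate in plain C on any host; the 32-bit signed
type here stands in for ARC's signed long return value:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	int32_t  ip_signed   = (int32_t)0x81000000;	/* signed 32-bit, like "long" on ARC */
	uint32_t ip_unsigned = 0x81000000u;

	uint64_t sample_bad  = ip_signed;	/* sign-extended on assignment to u64 */
	uint64_t sample_good = ip_unsigned;	/* zero-extended, as perf expects */

	printf("bad:  %llx\n", (unsigned long long)sample_bad);		/* ffffffff81000000 */
	printf("good: %llx\n", (unsigned long long)sample_good);	/* 81000000 */
	return 0;
}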
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: arc-linux-dev@synopsys.com
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
This reverts commit dec4f799d0a4c9edae20512fa60b0a36f3299ca2.
Jörg Otte reports a NULL pointer dereference due to this commit, as
'crtc_state' very much can be NULL:
crtc_state = state->base.state ?
intel_atomic_get_crtc_state(state->base.state, intel_crtc) : NULL;
So the change to test 'crtc_state->base.active' cannot possibly be
correct as-is.
There may be some other minimal fix (like just checking crtc_state for
NULL), but I'm just reverting it now for the rc2 release, and people
like Daniel Vetter who actually know this code will figure out what the
right solution is in the longer term.
Reported-and-bisected-by: Jörg Otte <jrg.otte@gmail.com>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
CC: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for
overlapping CAN filters" requires the skb->tstamp to be set to check for
identical CAN skbs.
With no timestamping required by user space applications, this timestamp
was not generated, which led to commit 36c01245eb8 "can: fix loss of CAN
frames in raw_rcv" - which forces the timestamp to be set in all CAN related
skbuffs by introducing several __net_timestamp() calls.
This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb()
to add __net_timestamp() after skbuff creation to prevent the frame loss fixed
in mainline Linux.
This patch removes the timestamp dependency and uses an atomic counter to
create a unique identifier together with the skbuff pointer.
Btw: the new skbcnt element introduced in struct can_skb_priv has to be
initialized with zero in out-of-tree drivers which are not using
alloc_can{,fd}_skb() too.
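A sketch of the identifier scheme (placement of the two halves is
illustrative; can_skb_prv() is the existing accessor for struct
can_skb_priv):
	/* at skb setup time: tag every CAN skb with a unique count */
	static atomic_t skbcounter = ATOMIC_INIT(0);

	can_skb_prv(skb)->skbcnt = atomic_inc_return(&skbcounter);

	/* in the receive path: a frame is a duplicate only if both the skb
	 * pointer and the count match what was delivered last time */
	if (uniq->skb == oskb && uniq->skbcnt == can_skb_prv(oskb)->skbcnt)
		return;		/* already delivered via an overlapping filter */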
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The driver core sets the "default" pinmux on probe and the CAN driver
sets the "sleep" pinmux during registration. This causes a small window
where the CAN pins are in the "default" state with the DCAN module
being disabled.
Change the "default" state to be like sleep so this glitch is
avoided. Add a new "active" state that is used by the driver
when CAN is actually active.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The previous change 3973c526ae9c (net: can: c_can: Disable pins when CAN
interface is down) causes a slight glitch on the pinctrl settings when used.
Since commit ab78029 (drivers/pinctrl: grab default handles from device core),
the device core will automatically set the default pins. This causes the pins
to be momentarily set to the default and then to the sleep state in
register_c_can_dev(). By adding an optional "enable" state, boards can set the
default pin state to be disabled and avoid the glitch when the switch from
default to sleep first occurs. If the "enable" state is not available
c_can_pinctrl_select_state() falls back to using the "default" pinctrl state.
[Roger Q] - Forward port to v4.2 and use pinctrl_get_select().
Signed-off-by: J.D. Schroeder <jay.schroeder@garmin.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
All the error messages in the driver but the ones from devm_clk_get() failures
use a similar format. Make those two messages consistent with the others.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Also print the error code when the request_irq() call fails in rcar_can_open(),
rewording the error message...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Fix typo in the first error message printed by rcar_can_open().
Based on the original patch by Vladimir Barinov.
Fixes: 862e2b6af941 ("can: rcar_can: support all input clocks")
Reported-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
'ndev->irq' is of type *int*, so the "%d" format needs to be used instead.
While fixing this, beautify the dev_info() message in rcar_can_probe() a bit.
Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
rcar_can_probe() regards 0 as an invalid IRQ #, even though platform_get_irq(),
which it calls, returns a negative error code in that case. This leads to the
following being printed to the console when attempting to open the device:
error requesting interrupt fffffffa
because rcar_can_open() calls request_irq() with a negative IRQ #, and that
function naturally fails with -EINVAL.
Check for the negative error codes instead and propagate them upstream instead
of just returning -ENODEV.
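A sketch of the corrected check (the error label is invented):
	irq = platform_get_irq(pdev, 0);
	if (irq < 0) {
		dev_err(&pdev->dev, "No IRQ resource\n");
		err = irq;		/* propagate the real error code, not -ENODEV */
		goto fail;
	}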
Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Normally opening a file, unlinking it and then closing will have
the inode freed upon close() (provided that it's not otherwise busy and
has no remaining links, of course). However, there's one case where that
does *not* happen. Namely, if you open it by fhandle with cold dcache,
then unlink() and close().
In the normal case, d_delete() in unlink(2) notices that the dentry is
busy and unhashes it; on the final dput() it will be forcibly evicted from
the dcache, triggering iput() and inode removal. In this case, though, we
end up with *two* dentries - a disconnected one (created by open-by-fhandle)
and a regular one (used by unlink()). The latter will have its reference to
the inode dropped just fine, but the former will not - it's considered hashed
(it is on the ->s_anon list), so it will stay around until memory pressure
finally does it in. As a result, the final iput() is delayed
indefinitely. It's trivial to reproduce -
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/stat.h>

#ifndef MAX_HANDLE_SZ
#define MAX_HANDLE_SZ 128	/* the kernel's limit; not all libc headers export it */
#endif

void flush_dcache(void)
{
	system("mount -o remount,rw /");
}

static char buf[20 * 1024 * 1024];

int main(void)
{
	int fd;
	union {
		struct file_handle f;
		char buf[MAX_HANDLE_SZ];
	} x;
	int m;

	x.f.handle_bytes = sizeof(x);
	chdir("/root");
	mkdir("foo", 0700);
	fd = open("foo/bar", O_CREAT | O_RDWR, 0600);
	close(fd);
	name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0);
	flush_dcache();
	fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR);
	unlink("foo/bar");
	write(fd, buf, sizeof(buf));
	system("df .");	/* 20Mb eaten */
	close(fd);
	system("df .");	/* should've freed those 20Mb */
	flush_dcache();
	system("df .");	/* should be the same as #2 */
	return 0;
}
will spit out something like
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 322023 303843 1131 100% /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 322023 303843 1131 100% /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 322023 283282 21692 93% /
- the inode gets freed only when the dentry is finally evicted (here we
trigger that by remount; normally it would've happened in response to memory
pressure hell knows when).
Cc: stable@vger.kernel.org # v2.6.38+; earlier ones need s/kill_it/unhash_it/
Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|