Age | Commit message (Collapse) | Author | Files | Lines |
|
The 0day kernel test build report reported an oops:
>
> IP: put_pid+0x22/0x5c
> PGD 19efa067 P4D 19efa067 PUD 0
> Oops: 0000 [#1]
> CPU: 0 PID: 727 Comm: trinity Not tainted 4.16.0-rc2-00010-g98f929b #1
> RIP: 0010:put_pid+0x22/0x5c
> RSP: 0018:ffff986719f73e48 EFLAGS: 00010202
> RAX: 00000006d765f710 RBX: ffff98671a4fa4d0 RCX: ffff986719f73d40
> RDX: 000000006f6e6125 RSI: 0000000000000000 RDI: ffffffffa01e6d21
> RBP: ffffffffa0955fe0 R08: 0000000000000020 R09: 0000000000000000
> R10: 0000000000000078 R11: ffff986719f73e76 R12: 0000000000001000
> R13: 00000000ffffffea R14: 0000000054000fb0 R15: 0000000000000000
> FS: 00000000028c2880(0000) GS:ffffffffa06ad000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000677846439 CR3: 0000000019fc1005 CR4: 00000000000606b0
> Call Trace:
> ? ipc_update_pid+0x36/0x3e
> ? newseg+0x34c/0x3a6
> ? ipcget+0x5d/0x528
> ? entry_SYSCALL_64_after_hwframe+0x52/0xb7
> ? SyS_shmget+0x5a/0x84
> ? do_syscall_64+0x194/0x1b3
> ? entry_SYSCALL_64_after_hwframe+0x42/0xb7
> Code: ff 05 e7 20 9b 03 58 c9 c3 48 ff 05 85 21 9b 03 48 85 ff 74 4f 8b 47 04 8b 17 48 ff 05 7c 21 9b 03 48 83 c0 03 48 c1 e0 04 ff ca <48> 8b 44 07 08 74 1f 48 ff 05 6c 21 9b 03 ff 0f 0f 94 c2 48 ff
> RIP: put_pid+0x22/0x5c RSP: ffff986719f73e48
> CR2: 0000000677846439
> ---[ end trace ab8c5cb4389d37c5 ]---
> Kernel panic - not syncing: Fatal exception
In newseg when changing shm_cprid and shm_lprid from pid_t to struct
pid* I misread the kvmalloc as kvzalloc and thought shp was
initialized to 0. As that is not the case it is not safe to for the
error handling to address shm_cprid and shm_lprid before they are
initialized.
Therefore move the cleanup of shm_cprid and shm_lprid from the no_file
error cleanup path to the no_id error cleanup path. Ensuring that an
early error exit won't cause the oops above.
Reported-by: kernel test robot <fengguang.wu@intel.com>
Reviewed-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Stephen Rothewell <sfr@canb.auug.org.au> wrote:
> After merging the userns tree, today's linux-next build (powerpc
> ppc64_defconfig) produced this warning:
>
> In file included from include/linux/sched.h:16:0,
> from arch/powerpc/lib/xor_vmx_glue.c:14:
> include/linux/shm.h:17:35: error: 'struct file' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
> bool is_file_shm_hugepages(struct file *file);
> ^~~~
>
> and many, many more (most warnings, but some errors - arch/powerpc is
> mostly built with -Werror)
I dug through this and I discovered that the error was caused by the
removal of struct shmid_kernel from shm.h when building on powerpc.
Except for observing the existence of "struct file *shm_file" in
struct shmid_kernel I have no clue why the structure move would cause
such a failure. I suspect shm.h always needed the forward declaration
and someting had been confusing gcc into not issuing the warning. --EWB
Fixes: a2e102cd3cdd ("shm: Move struct shmid_kernel into ipc/shm.c")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Rename the variables shp, sma, msq to isp. As that is how the code already
refers to those variables.
Collapse smack_of_shm, smack_of_sem, and smack_of_msq into smack_of_ipc,
as the three functions had become completely identical.
Collapse smack_shm_alloc_security, smack_sem_alloc_security and
smack_msg_queue_alloc_security into smack_ipc_alloc_security as the three
functions had become identical.
Collapse smack_shm_free_security, smack_sem_free_security and
smack_msg_queue_free_security into smack_ipc_free_security as the
three functions had become identical.
Requested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
After the last round of cleanups the shm, sem, and msg associate
operations just became trivial wrappers around the appropriate security
method. Simplify things further by just calling the security method
directly.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Today the last process to update a semaphore is remembered and
reported in the pid namespace of that process. If there are processes
in any other pid namespace querying that process id with GETPID the
result will be unusable nonsense as it does not make any
sense in your own pid namespace.
Due to ipc_update_pid I don't think you will be able to get System V
ipc semaphores into a troublesome cache line ping-pong. Using struct
pids from separate process are not a problem because they do not share
a cache line. Using struct pid from different threads of the same
process are unlikely to be a problem as the reference count update
can be avoided.
Further linux futexes are a much better tool for the job of mutual
exclusion between processes than System V semaphores. So I expect
programs that are performance limited by their interprocess mutual
exclusion primitive will be using futexes.
So while it is possible that enhancing the storage of the last
rocess of a System V semaphore from an integer to a struct pid
will cause a performance regression because of the effect
of frequently updating the pid reference count. I don't expect
that to happen in practice.
This change updates semctl(..., GETPID, ...) to return the
process id of the last process to update a semphore inthe
pid namespace of the calling process.
Fixes: b488893a390e ("pid namespaces: changes to show virtual ids to user")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Today msg_lspid and msg_lrpid are remembered in the pid namespace of
the creator and the processes that last send or received a sysvipc
message. If you have processes in multiple pid namespaces that is
just wrong. The process ids reported will not make the least bit of
sense.
This fix is slightly more susceptible to a performance problem than
the related fix for System V shared memory. By definition the pids
are updated by msgsnd and msgrcv, the fast path of System V message
queues. The only concern over the previous implementation is the
incrementing and decrementing of the pid reference count. As that is
the only difference and multiple updates by of the task_tgid by
threads in the same process have been shown in af_unix sockets to
create a cache line ping-pong between cpus of the same processor.
In this case I don't expect cache lines holding pid reference counts
to ping pong between cpus. As senders and receivers update different
pids there is a natural separation there. Further if multiple threads
of the same process either send or receive messages the pid will be
updated to the same value and ipc_update_pid will avoid the reference
count update.
Which means in the common case I expect msg_lspid and msg_lrpid to
remain constant, and reference counts not to be updated when messages
are sent.
In rare cases it may be possible to trigger the issue which was
observed for af_unix sockets, but it will require multiple processes
with multiple threads to be either sending or receiving messages. It
just does not feel likely that anyone would do that in practice.
This change updates msgctl(..., IPC_STAT, ...) to return msg_lspid and
msg_lrpid in the pid namespace of the process calling stat.
This change also updates cat /proc/sysvipc/msg to return print msg_lspid
and msg_lrpid in the pid namespace of the process that opened the proc
file.
Fixes: b488893a390e ("pid namespaces: changes to show virtual ids to user")
Reviewed-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Today shm_cpid and shm_lpid are remembered in the pid namespace of the
creator and the processes that last touched a sysvipc shared memory
segment. If you have processes in multiple pid namespaces that
is just wrong, and I don't know how this has been over-looked for
so long.
As only creation and shared memory attach and shared memory detach
update the pids I do not expect there to be a repeat of the issues
when struct pid was attached to each af_unix skb, which in some
notable cases cut the performance in half. The problem was threads of
the same process updating same struct pid from different cpus causing
the cache line to be highly contended and bounce between cpus.
As creation, attach, and detach are expected to be rare operations for
sysvipc shared memory segments I do not expect that kind of cache line
ping pong to cause probems. In addition because the pid is at a fixed
location in the structure instead of being dynamic on a skb, the
reference count of the pid does not need to be updated on each
operation if the pid is the same. This ability to simply skip the pid
reference count changes if the pid is unchanging further reduces the
likelihood of the a cache line holding a pid reference count
ping-ponging between cpus.
Fixes: b488893a390e ("pid namespaces: changes to show virtual ids to user")
Reviewed-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Capture the pid namespace when /proc/sysvipc/msg /proc/sysvipc/shm
and /proc/sysvipc/sem are opened, and make it available through
the new helper ipc_seq_pid_ns.
This makes it possible to report the pids in these files in the
pid namespace of the opener of the files.
Implement ipc_update_pid. A simple impline helper that will only update
a struct pid pointer if the new value does not equal the old value. This
removes the need for wordy code sequences like:
old = object->pid;
object->pid = new;
put_pid(old);
and
old = object->pid;
if (old != new) {
object->pid = new;
put_pid(old);
}
Allowing the following to be written instead:
ipc_update_pid(&object->pid, new);
Which is easier to read and ensures that the pid reference count is
not touched the old and the new values are the same. Not touching
the reference count in this case is important to help avoid issues
like af_unix experienced, where multiple threads of the same
process managed to bounce the struct pid between cpu cache lines,
but updating the pids reference count.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
The definition IPCMNI is only used in ipc/util.h and ipc/util.c. So
there is no reason to keep it in a header file that the whole kernel
can see. Move it into util.h to simplify future maintenance.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
All of the users are now in ipc/msg.c so make the definition local to
that file to make code maintenance easier. AKA to prevent rebuilding
the entire kernel when struct msg_queue changes.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
All of the users are now in ipc/shm.c so make the definition local to
that file to make code maintenance easier. AKA to prevent rebuilding
the entire kernel when struct shmid_kernel changes.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
All of the users are now in ipc/sem.c so make the definitions
local to that file to make code maintenance easier. AKA
to prevent rebuilding the entire kernel when one of these
files is changed.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
All of the implementations of security hooks that take msg_queue only
access q_perm the struct kern_ipc_perm member. This means the
dependencies of the msg_queue security hooks can be simplified by
passing the kern_ipc_perm member of msg_queue.
Making this change will allow struct msg_queue to become private to
ipc/msg.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
All of the implementations of security hooks that take shmid_kernel only
access shm_perm the struct kern_ipc_perm member. This means the
dependencies of the shm security hooks can be simplified by passing
the kern_ipc_perm member of shmid_kernel..
Making this change will allow struct shmid_kernel to become private to ipc/shm.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
All of the implementations of security hooks that take sem_array only
access sem_perm the struct kern_ipc_perm member. This means the
dependencies of the sem security hooks can be simplified by passing
the kern_ipc_perm member of sem_array.
Making this change will allow struct sem and struct sem_array
to become private to ipc/sem.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Those pid_* caches are created on demand when a process advances to the new
level of pid namespace. Which means pointers are stable, write only and
thus can be packed into an array instead of spreading them over and using
lists(!) to find them.
Both first and subsequent clone/unshare(CLONE_NEWPID) become faster.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
|
|
Passive sockets can have ongoing operations on them, specifically, we
have two wait_event_interruptable calls in pvcalls_front_accept.
Add two wake_up calls in pvcalls_front_release, then wait for the
potential waiters to return and release the sock_mapping refcount.
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
Introduce a per sock_mapping refcount, in addition to the existing
global refcount. Thanks to the sock_mapping refcount, we can safely wait
for it to be 1 in pvcalls_front_release before freeing an active socket,
instead of waiting for the global refcount to be 1.
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
The kernel panics on PV domains because native_smp_cpus_done() is
only called for HVM domains.
Calculate __max_logical_packages for PV domains.
Fixes: b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages estimate")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Tested-and-reported-by: Simon Gaiser <simon@invisiblethingslab.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: xen-devel@lists.xenproject.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent
xenstore accesses") optimized xenbus concurrent accesses but in doing so
broke UABI of /dev/xen/xenbus. Through /dev/xen/xenbus applications are in
charge of xenbus message exchange with the correct header and body. Now,
after the mentioned commit the replies received by application will no
longer have the header req_id echoed back as it was on request (see
specification below for reference), because that particular field is being
overwritten by kernel.
struct xsd_sockmsg
{
uint32_t type; /* XS_??? */
uint32_t req_id;/* Request identifier, echoed in daemon's response. */
uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */
uint32_t len; /* Length of data following this. */
/* Generally followed by nul-terminated string(s). */
};
Before there was only one request at a time so req_id could simply be
forwarded back and forth. To allow simultaneous requests we need a
different req_id for each message thus kernel keeps a monotonic increasing
counter for this field and is written on every request irrespective of
userspace value.
Forwarding again the req_id on userspace requests is not a solution because
we would open the possibility of userspace-generated req_id colliding with
kernel ones. So this patch instead takes another route which is to
artificially keep user req_id while keeping the xenbus logic as is. We do
that by saving the original req_id before xs_send(), use the private kernel
counter as req_id and then once reply comes and was validated, we restore
back the original req_id.
Cc: <stable@vger.kernel.org> # 4.11
Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
Reported-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
Sparse makes a fair bit of noise about our MPIDR mask being implicitly
long - let's explicitly describe it as such rather than just relying on
the value forcing automatic promotion.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In many cases, page tables can be accessed concurrently by either another
CPU (due to things like fast gup) or by the hardware page table walker
itself, which may set access/dirty bits. In such cases, it is important
to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
entries cannot be torn, merged or subject to apparent loss of coherence
due to compiler transformations.
Whilst there are some scenarios where this cannot happen (e.g. pinned
kernel mappings for the linear region), the overhead of using READ_ONCE
/WRITE_ONCE everywhere is minimal and makes the code an awful lot easier
to reason about. This patch consistently uses these macros in the arch
code, as well as explicitly namespacing pointers to page table entries
from the entries themselves by using adopting a 'p' suffix for the former
(as is sometimes used elsewhere in the kernel source).
Tested-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
We get a warning about some slow configurations in randconfig kernels:
mm/memory.c:83:2: error: #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid. [-Werror=cpp]
The warning is reasonable by itself, but gets in the way of randconfig
build testing, so I'm hiding it whenever CONFIG_COMPILE_TEST is set.
The warning was added in 2013 in commit 75980e97dacc ("mm: fold
page->_last_nid into page->flags where possible").
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
dec_pending() is given an error status (possibly 0) to be recorded
against a bio. It can be called several times on the one 'struct
dm_io', and it is careful to only assign a non-zero error to
io->status. However when it then assigned io->status to bio->bi_status,
it is not careful and could overwrite a genuine error status with 0.
This can happen when chained bios are in use. If a bio is chained
beneath the bio that this dm_io is handling, the child bio might
complete and set bio->bi_status before the dm_io completes.
This has been possible since chained bios were introduced in 3.14, and
has become a lot easier to trigger with commit 18a25da84354 ("dm: ensure
bio submission follows a depth-first tree walk") as that commit caused
dm to start using chained bios itself.
A particular failure mode is that if a bio spans an 'error' target and a
working target, the 'error' fragment will complete instantly and set the
->bi_status, and the other fragment will normally complete a little
later, and will clear ->bi_status.
The fix is simply to only assign io_error to bio->bi_status when
io_error is not zero.
Reported-and-tested-by: Milan Broz <gmazyland@gmail.com>
Cc: stable@vger.kernel.org (v3.14+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
...instead of open coding file operations followed by custom ->open()
callbacks per each attribute.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
pointers are being hashed when printed. Displaying the virtual memory at
bootup time is not helpful. so delete the prints.
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
We'd never implemented Multi-MSI support with GICv2m, because
it is weird and clunky, and you'd think people would rather use
MSI-X.
Turns out there is still plenty of devices out there that rely
on Multi-MSI. Oh well, let's teach that trick to the v2m widget,
it is not a big deal anyway.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
On some platforms there's an ITS available but it's not enabled
because reading or writing the registers is denied by the
firmware. In fact, reading or writing them will cause the system
to reset. We could remove the node from DT in such a case, but
it's better to skip nodes that are marked as "disabled" in DT so
that we can describe the hardware that exists and use the status
property to indicate how the firmware has configured things.
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
A DMB instruction can be used to ensure the relative order of only
memory accesses before and after the barrier. Since writes to system
registers are not memory operations, barrier DMB is not sufficient
for observability of memory accesses that occur before ICC_SGI1R_EL1
writes.
A DSB instruction ensures that no instructions that appear in program
order after the DSB instruction, can execute until the DSB instruction
has completed.
Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will.deacon@arm.com>,
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
The pr_debug() in gic-v3 gic_send_sgi() can trigger a circular locking
warning:
GICv3: CPU10: ICC_SGI1R_EL1 5000400
======================================================
WARNING: possible circular locking dependency detected
4.15.0+ #1 Tainted: G W
------------------------------------------------------
dynamic_debug01/1873 is trying to acquire lock:
((console_sem).lock){-...}, at: [<0000000099c891ec>] down_trylock+0x20/0x4c
but task is already holding lock:
(&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&rq->lock){-.-.}:
__lock_acquire+0x3b4/0x6e0
lock_acquire+0xf4/0x2a8
_raw_spin_lock+0x4c/0x60
task_fork_fair+0x3c/0x148
sched_fork+0x10c/0x214
copy_process.isra.32.part.33+0x4e8/0x14f0
_do_fork+0xe8/0x78c
kernel_thread+0x48/0x54
rest_init+0x34/0x2a4
start_kernel+0x45c/0x488
-> #1 (&p->pi_lock){-.-.}:
__lock_acquire+0x3b4/0x6e0
lock_acquire+0xf4/0x2a8
_raw_spin_lock_irqsave+0x58/0x70
try_to_wake_up+0x48/0x600
wake_up_process+0x28/0x34
__up.isra.0+0x60/0x6c
up+0x60/0x68
__up_console_sem+0x4c/0x7c
console_unlock+0x328/0x634
vprintk_emit+0x25c/0x390
dev_vprintk_emit+0xc4/0x1fc
dev_printk_emit+0x88/0xa8
__dev_printk+0x58/0x9c
_dev_info+0x84/0xa8
usb_new_device+0x100/0x474
hub_port_connect+0x280/0x92c
hub_event+0x740/0xa84
process_one_work+0x240/0x70c
worker_thread+0x60/0x400
kthread+0x110/0x13c
ret_from_fork+0x10/0x18
-> #0 ((console_sem).lock){-...}:
validate_chain.isra.34+0x6e4/0xa20
__lock_acquire+0x3b4/0x6e0
lock_acquire+0xf4/0x2a8
_raw_spin_lock_irqsave+0x58/0x70
down_trylock+0x20/0x4c
__down_trylock_console_sem+0x3c/0x9c
console_trylock+0x20/0xb0
vprintk_emit+0x254/0x390
vprintk_default+0x58/0x90
vprintk_func+0xbc/0x164
printk+0x80/0xa0
__dynamic_pr_debug+0x84/0xac
gic_raise_softirq+0x184/0x18c
smp_cross_call+0xac/0x218
smp_send_reschedule+0x3c/0x48
resched_curr+0x60/0x9c
check_preempt_curr+0x70/0xdc
wake_up_new_task+0x310/0x470
_do_fork+0x188/0x78c
SyS_clone+0x44/0x50
__sys_trace_return+0x0/0x4
other info that might help us debug this:
Chain exists of:
(console_sem).lock --> &p->pi_lock --> &rq->lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&rq->lock);
lock(&p->pi_lock);
lock(&rq->lock);
lock((console_sem).lock);
*** DEADLOCK ***
2 locks held by dynamic_debug01/1873:
#0: (&p->pi_lock){-.-.}, at: [<000000001366df53>] wake_up_new_task+0x40/0x470
#1: (&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc
stack backtrace:
CPU: 10 PID: 1873 Comm: dynamic_debug01 Tainted: G W 4.15.0+ #1
Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS T48 10/02/2017
Call trace:
dump_backtrace+0x0/0x188
show_stack+0x24/0x2c
dump_stack+0xa4/0xe0
print_circular_bug.isra.31+0x29c/0x2b8
check_prev_add.constprop.39+0x6c8/0x6dc
validate_chain.isra.34+0x6e4/0xa20
__lock_acquire+0x3b4/0x6e0
lock_acquire+0xf4/0x2a8
_raw_spin_lock_irqsave+0x58/0x70
down_trylock+0x20/0x4c
__down_trylock_console_sem+0x3c/0x9c
console_trylock+0x20/0xb0
vprintk_emit+0x254/0x390
vprintk_default+0x58/0x90
vprintk_func+0xbc/0x164
printk+0x80/0xa0
__dynamic_pr_debug+0x84/0xac
gic_raise_softirq+0x184/0x18c
smp_cross_call+0xac/0x218
smp_send_reschedule+0x3c/0x48
resched_curr+0x60/0x9c
check_preempt_curr+0x70/0xdc
wake_up_new_task+0x310/0x470
_do_fork+0x188/0x78c
SyS_clone+0x44/0x50
__sys_trace_return+0x0/0x4
GICv3: CPU0: ICC_SGI1R_EL1 12000
This could be fixed with printk_deferred() but that might lessen its
usefulness for debugging. So change it to pr_devel to keep it out of
production kernels. Developers working on gic-v3 can enable it as
needed in their kernels.
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Commit 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading
GIC_SH_MASK*") removed the read of the hardware mask register when
handling shared interrupts, instead using the driver's shadow pcpu_masks
entry as the effective mask. Unfortunately this did not take account of
the write to pcpu_masks during gic_shared_irq_domain_map, which
effectively unmasks the interrupt early. If an interrupt is asserted,
gic_handle_shared_int decodes and processes the interrupt even though it
has not yet been unmasked via gic_unmask_irq, which also sets the
appropriate bit in pcpu_masks.
On the MIPS Boston board, when a console command line of
"console=ttyS0,115200n8r" is passed, the modem status IRQ is enabled in
the UART, which is immediately raised to the GIC. The interrupt has been
mapped, but no handler has yet been registered, nor is it expected to be
unmasked. However, the write to pcpu_masks in gic_shared_irq_domain_map
has effectively unmasked it, resulting in endless reports of:
[ 5.058454] irq 13, desc: ffffffff80a7ad80, depth: 1, count: 0, unhandled: 0
[ 5.062057] ->handle_irq(): ffffffff801b1838,
[ 5.062175] handle_bad_irq+0x0/0x2c0
Where IRQ 13 is the UART interrupt.
To fix this, just remove the write to pcpu_masks in
gic_shared_irq_domain_map. The existing write in gic_unmask_irq is the
correct place for what is now the effective unmasking.
Cc: stable@vger.kernel.org
Fixes: 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading GIC_SH_MASK*")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Some versions of QEMU will produce an ibm,dynamic-reconfiguration-memory
node with a ibm,dynamic-memory property that is zero-filled. This
causes the drmem code to oops trying to parse this property.
The fix for this is to validate that the property does contain LMB
entries before trying to parse it and bail if the count is zero.
Oops: Kernel access of bad area, sig: 11 [#1]
DAR: 0000000000000010
NIP read_drconf_v1_cell+0x54/0x9c
LR read_drconf_v1_cell+0x48/0x9c
Call Trace:
__param_initcall_debug+0x0/0x28 (unreliable)
drmem_init+0x144/0x2f8
do_one_initcall+0x64/0x1d0
kernel_init_freeable+0x298/0x38c
kernel_init+0x24/0x160
ret_from_kernel_thread+0x5c/0xb4
The ibm,dynamic-reconfiguration-memory device tree property generated
that causes this:
ibm,dynamic-reconfiguration-memory {
ibm,lmb-size = <0x0 0x10000000>;
ibm,memory-flags-mask = <0xff>;
ibm,dynamic-memory = <0x0 0x0 0x0 0x0 0x0 0x0>;
linux,phandle = <0x7e57eed8>;
ibm,associativity-lookup-arrays = <0x1 0x4 0x0 0x0 0x0 0x0>;
ibm,memory-preservation-time = <0x0>;
};
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Tested-by: Daniel Black <daniel@linux.vnet.ibm.com>
[mpe: Trim oops report]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
for_each_cpu_wrap() was originally added in the #else half of a
large "#if NR_CPUS == 1" statement, but was omitted in the #if
half. This patch adds the missing #if half to prevent compile
errors when NR_CPUS is 1.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Michael Kelley <mhkelley@outlook.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kys@microsoft.com
Cc: martin.petersen@oracle.com
Cc: mikelley@microsoft.com
Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")
Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901MB2045.namprd19.prod.outlook.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The X86_P6_NOP config class leaves out many i686-class CPUs. Instead,
explicitly enumerate all these CPUs.
Using a configuration with M686 currently sets X86_MINIMUM_CPU_FAMILY=5
instead of the correct value of 6.
Booting on an i586 it will fail to generate the "This kernel
requires an i686 CPU, but only detected an i586 CPU" message and
intentional halt as expected. It will instead just silently hang
when it hits i686-specific instructions.
Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518713696-11360-3-git-send-email-tedheadster@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
i586-class machines also lack support for Physical Address Extension (PAE),
so add them to the exclusion list.
Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518713696-11360-2-git-send-email-tedheadster@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Several i586-class CPUs supporting this instruction are missing from
the X86_CMPXCHG64 config group.
Using a configuration with either M586TSC or M586MMX currently sets
X86_MINIMUM_CPU_FAMILY=4 instead of the correct value of 5.
Booting on an i486 it will fail to generate the "This kernel
requires an i586 CPU, but only detected an i486 CPU" message and
intentional halt as expected. It will instead just silently hang
when it hits i586-specific instructions.
The M586 CPU is not in this list because at least the Cyrix 5x86
lacks this instruction, and perhaps others.
Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518713696-11360-1-git-send-email-tedheadster@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Improve error handling when disarming ftrace-based kprobes. Like with
arm_kprobe_ftrace(), propagate any errors from disarm_kprobe_ftrace() so
that we do not disable/unregister kprobes that are still armed. In other
words, unregister_kprobe() and disable_kprobe() should not report success
if the kprobe could not be disarmed.
disarm_all_kprobes() keeps its current behavior and attempts to
disarm all kprobes. It returns the last encountered error and gives a
warning if not all probes could be disarmed.
This patch is based on Petr Mladek's original patchset (patches 2 and 3)
back in 2015, which improved kprobes error handling, found here:
https://lkml.org/lkml/2015/2/26/452
However, further work on this had been paused since then and the patches
were not upstreamed.
Based-on-patches-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20180109235124.30886-3-jeyu@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Improve error handling when arming ftrace-based kprobes. Specifically, if
we fail to arm a ftrace-based kprobe, register_kprobe()/enable_kprobe()
should report an error instead of success. Previously, this has lead to
confusing situations where register_kprobe() would return 0 indicating
success, but the kprobe would not be functional if ftrace registration
during the kprobe arming process had failed. We should therefore take any
errors returned by ftrace into account and propagate this error so that we
do not register/enable kprobes that cannot be armed. This can happen if,
for example, register_ftrace_function() finds an IPMODIFY conflict (since
kprobe_ftrace_ops has this flag set) and returns an error. Such a conflict
is possible since livepatches also set the IPMODIFY flag for their ftrace_ops.
arm_all_kprobes() keeps its current behavior and attempts to arm all
kprobes. It returns the last encountered error and gives a warning if
not all probes could be armed.
This patch is based on Petr Mladek's original patchset (patches 2 and 3)
back in 2015, which improved kprobes error handling, found here:
https://lkml.org/lkml/2015/2/26/452
However, further work on this had been paused since then and the patches
were not upstreamed.
Based-on-patches-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20180109235124.30886-2-jeyu@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The recently introduced clock gate support breaks on Tegra chips because
no thermal support is enabled for those devices. Conditionalize the code
on the existence of thermal support to fix this.
Fixes: b138eca661cc ("drm/nouveau: Add support for basic clockgating on Kepler1")
Cc: Martin Peres <martin.peres@free.fr>
Cc: Lyude Paul <lyude@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Now that USB_UHCI_BIG_ENDIAN_MMIO and USB_UHCI_BIG_ENDIAN_DESC are moved
outside of the USB_SUPPORT conditional, simply select them from
SPARC_LEON rather than by the symbol's defaults in drivers/usb/Kconfig,
similar to how it is done for USB_EHCI_BIG_ENDIAN_MMIO and
USB_EHCI_BIG_ENDIAN_DESC.
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: sparclinux@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Patchwork: https://patchwork.linux-mips.org/patch/18560/
|
|
Move the Kconfig symbols USB_UHCI_BIG_ENDIAN_MMIO and
USB_UHCI_BIG_ENDIAN_DESC out of drivers/usb/host/Kconfig, which is
conditional upon USB && USB_SUPPORT, so that it can be freely selected
by platform Kconfig symbols in architecture code.
For example once the MIPS_GENERIC platform selects are fixed in commit
2e6522c56552 ("MIPS: Fix typo BIG_ENDIAN to CPU_BIG_ENDIAN"), the MIPS
32r6_defconfig warns like so:
warning: (MIPS_GENERIC) selects USB_UHCI_BIG_ENDIAN_MMIO which has unmet direct dependencies (USB_SUPPORT && USB)
warning: (MIPS_GENERIC) selects USB_UHCI_BIG_ENDIAN_DESC which has unmet direct dependencies (USB_SUPPORT && USB)
Fixes: 2e6522c56552 ("MIPS: Fix typo BIG_ENDIAN to CPU_BIG_ENDIAN")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: linux-usb@vger.kernel.org
Cc: linux-mips@linux-mips.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patchwork: https://patchwork.linux-mips.org/patch/18559/
|
|
Update comment typo _consisitent_ to _consistent_ from following commit.
commit 0206319fdfee ("blk-mq: Fix poll_stat for new size-based bucketing.")
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This reverts commit f120c7b187e6c418238710b48723ce141f467543 which is no
longer required with the introduction of a syscall.tbl on s390.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-q1lg0nvhha1tk39ri9aqalcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Recently, s390 uses a syscall.tbl input file to generate its system call
table and unistd uapi header files. Hence, update mksyscalltbl to use
it as input to create the system table for perf.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-4-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-bdyhllhsq1zgxv2qx4m377y6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|