Age | Commit message (Collapse) | Author | Files | Lines |
|
radeon_copy_dma and radeon_copy_blit must be called with
a valid reservation object. Otherwise a crash will be provoked.
We borrow the object from vram BO.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=88464
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
radeon_copy_dma and radeon_copy_blit must be called with
a valid reservation object. Otherwise a crash will be provoked.
We borrow the object from destination BO.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=88464
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Should be the same as cayman. We don't use VM by default
on NI parts so this isn't critical.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
If acceleration is disabled, it does not make sense
to init gpuvm since nothing will use it. Moreover,
if radeon_vm_init() gets called it uses accel to try
and clear the pde tables, etc. which results in a bug.
v2: handle vm_fini as well
v3: handle bo_open/close as well
Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=88786
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
This is a workaround for RS880 and older chips which seem to have
an additional limit on the minimum PLL input frequency.
v2: fix signed/unsigned warning
bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=91861
https://bugzilla.kernel.org/show_bug.cgi?id=83461
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
The variables diff_input, ext_vref, and vref_mv are only used in the probe
function and therefore don't need to be kept in the device data structure.
Reviewed-and-Tested-by: Robert Rosengren <robert.rosengren@axis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Simplify code and reduce code size by using regmap to access i2c registers.
Reviewed-and-Tested-by: Robert Rosengren <robert.rosengren@axis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
NEC OEMs the same platforms as Stratus does, which have multiple devices on
some PCIe buses under downstream ports.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=51331
Fixes: 1278998f8ff6 ("PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)")
Signed-off-by: Charlotte Richardson <charlotte.richardson@stratus.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.5+
CC: Myron Stowe <myron.stowe@redhat.com>
|
|
The following patch fixes an issue observed with 4k sector disks
where the max_hw_sectors attribute was getting set too large in
sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units
of SCSI logical blocks and queue_max_hw_sectors is in units of
512 byte blocks, on a 4k sector disk, every time we went through
sd_revalidate_disk, we were taking the current value of
queue_max_hw_sectors and increasing it by a factor of 8. Fix
this by only shifting sdkp->max_xfer_blocks.
Cc: stable@vger.kernel.org
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
This fixes a regression caused by commit 1d5203 ("scsi: handle more device
handler setup/teardown in common code").
The bug is that the alua detach() callout will try to access the
sddev->scsi_dh_data, but we have already set it to NULL. This patch
moves the clearing of that field to after detach() is called.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
On 1 and 2 bytes per word, the transfer of the 3 last bytes will access
memory outside tx_ptr.
Although this has not trigger any error on real hardware, we should
better fix this.
Fixes: 24ba5e593f391507 (Remove rx_fn and tx_fn pointer)
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Adding myself as the Intel BDW/HSW ASoC driver maintainer.
Signed-off-by: Jie Yang <yang.jie@intel.com>
Acked-by: Liam Girdwood <lgirdwood@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
This patch changes a BUG_ON() statement to pr_debug, in case the user tries to
update a non-existing queue.
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In commit be9f4a44e7d41 ("ipv4: tcp: remove per net tcp_sock")
I tried to address contention on a socket lock, but the solution
I chose was horrible :
commit 3a7c384ffd57e ("ipv4: tcp: unicast_sock should not land outside
of TCP stack") addressed a selinux regression.
commit 0980e56e506b ("ipv4: tcp: set unicast_sock uc_ttl to -1")
took care of another regression.
commit b5ec8eeac46 ("ipv4: fix ip_send_skb()") fixed another regression.
commit 811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate")
was another shot in the dark.
Really, just use a proper socket per cpu, and remove the skb_orphan()
call, to re-enable flow control.
This solves a serious problem with FQ packet scheduler when used in
hostile environments, as we do not want to allocate a flow structure
for every RST packet sent in response to a spoofed packet.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 8eb23b9f35aae413140d3fda766a98092c21e9b0
sched: Debug nested sleeps
causes false-positive warnings in RAID5 code.
This annotation removes them and adds a comment
explaining why there is no real problem.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
If a non-page-aligned write is destined for a device which
is missing/faulty, we can deadlock.
As the target device is missing, a read-modify-write cycle
is not possible.
As the write is not for a full-page, a recontruct-write cycle
is not possible.
This should be handled by logic in fetch_block() which notices
there is a non-R5_OVERWRITE write to a missing device, and so
loads all blocks.
However since commit 67f455486d2ea2, that code requires
STRIPE_PREREAD_ACTIVE before it will active, and those circumstances
never set STRIPE_PREREAD_ACTIVE.
So: in handle_stripe_dirtying, if neither rmw or rcw was possible,
set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE be set
after a suitable delay.
Fixes: 67f455486d2ea20b2d94d6adf5b9b783d079e321
Cc: stable@vger.kernel.org (v3.16+)
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
Commit 8eb23b9f35aa ("sched: Debug nested sleeps") added code to report
on nested sleep conditions, which we generally want to avoid because the
inner sleeping operation can re-set the thread state to TASK_RUNNING,
but that will then cause the outer sleep loop not actually sleep when it
calls schedule.
However, that's actually valid traditional behavior, with the inner
sleep being some fairly rare case (like taking a sleeping lock that
normally doesn't actually need to sleep).
And the debug code would actually change the state of the task to
TASK_RUNNING internally, which makes that kind of traditional and
working code not work at all, because now the nested sleep doesn't just
sometimes cause the outer one to not block, but will cause it to happen
every time.
In particular, it will cause the cardbus kernel daemon (pccardd) to
basically busy-loop doing scheduling, converting a laptop into a heater,
as reported by Bruno Prémont. But there may be other legacy uses of
that nested sleep model in other drivers that are also likely to never
get converted to the new model.
This fixes both cases:
- don't set TASK_RUNNING when the nested condition happens (note: even
if WARN_ONCE() only _warns_ once, the return value isn't whether the
warning happened, but whether the condition for the warning was true.
So despite the warning only happening once, the "if (WARN_ON(..))"
would trigger for every nested sleep.
- in the cases where we knowingly disable the warning by using
"sched_annotate_sleep()", don't change the task state (that is used
for all core scheduling decisions), instead use '->task_state_change'
that is used for the debugging decision itself.
(Credit for the second part of the fix goes to Oleg Nesterov: "Can't we
avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds?" with the
suggested change to use 'task_state_change' as part of the test)
Reported-and-bisected-by: Bruno Prémont <bonbons@linux-vserver.org>
Tested-by: Rafael J Wysocki <rjw@rjwysocki.net>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Cc: Ilya Dryomov <ilya.dryomov@inktank.com>,
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>,
Cc: Davidlohr Bueso <dave@stgolabs.net>,
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add two more Fujitsu LIFEBOOK models that also ship with the Elantech
touchpad and don't work with crc_disabled to the quirk list.
Signed-off-by: Rainer Koenig <Rainer.Koenig@ts.fujitsu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
We used to optimize rescheduling and audit on syscall exit. Now
that the full slow path is reasonably fast, remove these
optimizations. Syscall exit auditing is now handled exclusively by
syscall_trace_leave.
This adds something like 10ns to the previously optimized paths on
my computer, presumably due mostly to SAVE_REST / RESTORE_REST.
I think that we should eventually replace both the syscall and
non-paranoid interrupt exit slow paths with a pair of C functions
along the lines of the syscall entry hooks.
Link: http://lkml.kernel.org/r/22f2aa4a0361707a5cfb1de9d45260b39965dead.1421453410.git.luto@amacapital.net
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
|
|
The x86_64 entry code currently jumps through complex and
inconsistent hoops to try to minimize the impact of syscall exit
work. For a true fast-path syscall, almost nothing needs to be
done, so returning is just a check for exit work and sysret. For a
full slow-path return from a syscall, the C exit hook is invoked if
needed and we join the iret path.
Using iret to return to userspace is very slow, so the entry code
has accumulated various special cases to try to do certain forms of
exit work without invoking iret. This is error-prone, since it
duplicates assembly code paths, and it's dangerous, since sysret
can malfunction in interesting ways if used carelessly. It's
also inefficient, since a lot of useful cases aren't optimized
and therefore force an iret out of a combination of paranoia and
the fact that no one has bothered to write even more asm code
to avoid it.
I would argue that this approach is backwards. Rather than trying
to avoid the iret path, we should instead try to make the iret path
fast. Under a specific set of conditions, iret is unnecessary. In
particular, if RIP==RCX, RFLAGS==R11, RIP is canonical, RF is not
set, and both SS and CS are as expected, then
movq 32(%rsp),%rsp;sysret does the same thing as iret. This set of
conditions is nearly always satisfied on return from syscalls, and
it can even occasionally be satisfied on return from an irq.
Even with the careful checks for sysret applicability, this cuts
nearly 80ns off of the overhead from syscalls with unoptimized exit
work. This includes tracing and context tracking, and any return
that invokes KVM's user return notifier. For example, the cost of
getpid with CONFIG_CONTEXT_TRACKING_FORCE=y drops from ~360ns to
~280ns on my computer.
This may allow the removal and even eventual conversion to C
of a respectable amount of exit asm.
This may require further tweaking to give the full benefit on Xen.
It may be worthwhile to adjust signal delivery and exec to try hit
the sysret path.
This does not optimize returns to 32-bit userspace. Making the same
optimization for CS == __USER32_CS is conceptually straightforward,
but it will require some tedious code to handle the differences
between sysretl and sysexitl.
Link: http://lkml.kernel.org/r/71428f63e681e1b4aa1a781e3ef7c27f027d1103.1421453410.git.luto@amacapital.net
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
|
|
context_tracking_user_exit() has no effect if in_interrupt() returns true,
so ist_enter() didn't work. Fix it by calling exception_enter(), and thus
context_tracking_user_exit(), before incrementing the preempt count.
This also adds an assertion that will catch the problem reliably if
CONFIG_PROVE_RCU=y to help prevent the bug from being reintroduced.
Link: http://lkml.kernel.org/r/261ebee6aee55a4724746d0d7024697013c40a08.1422709102.git.luto@amacapital.net
Fixes: 959274753857 x86, traps: Track entry into and exit from IST context
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
|
|
Doing the following commands on a non idle network device
panics the box instantly, because cpu_bstats gets overwritten
by stats.
tc qdisc add dev eth0 root <your_favorite_qdisc>
... some traffic (one packet is enough) ...
tc qdisc replace dev eth0 root est 1sec 4sec <your_favorite_qdisc>
[ 325.355596] BUG: unable to handle kernel paging request at ffff8841dc5a074c
[ 325.362609] IP: [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
[ 325.369158] PGD 1fa7067 PUD 0
[ 325.372254] Oops: 0000 [#1] SMP
[ 325.375514] Modules linked in: ...
[ 325.398346] CPU: 13 PID: 14313 Comm: tc Not tainted 3.19.0-smp-DEV #1163
[ 325.412042] task: ffff8800793ab5d0 ti: ffff881ff2fa4000 task.ti: ffff881ff2fa4000
[ 325.419518] RIP: 0010:[<ffffffff81541c9e>] [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
[ 325.428506] RSP: 0018:ffff881ff2fa7928 EFLAGS: 00010286
[ 325.433824] RAX: 000000000000000c RBX: ffff881ff2fa796c RCX: 000000000000000c
[ 325.440988] RDX: ffff8841dc5a0744 RSI: 0000000000000060 RDI: 0000000000000060
[ 325.448120] RBP: ffff881ff2fa7948 R08: ffffffff81cd4f80 R09: 0000000000000000
[ 325.455268] R10: ffff883ff223e400 R11: 0000000000000000 R12: 000000015cba0744
[ 325.462405] R13: ffffffff81cd4f80 R14: ffff883ff223e460 R15: ffff883feea0722c
[ 325.469536] FS: 00007f2ee30fa700(0000) GS:ffff88407fa20000(0000) knlGS:0000000000000000
[ 325.477630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 325.483380] CR2: ffff8841dc5a074c CR3: 0000003feeae9000 CR4: 00000000001407e0
[ 325.490510] Stack:
[ 325.492524] ffff883feea0722c ffff883fef719dc0 ffff883feea0722c ffff883ff223e4a0
[ 325.499990] ffff881ff2fa79a8 ffffffff815424ee ffff883ff223e49c 000000015cba0744
[ 325.507460] 00000000f2fa7978 0000000000000000 ffff881ff2fa79a8 ffff883ff223e4a0
[ 325.514956] Call Trace:
[ 325.517412] [<ffffffff815424ee>] gen_new_estimator+0x8e/0x230
[ 325.523250] [<ffffffff815427aa>] gen_replace_estimator+0x4a/0x60
[ 325.529349] [<ffffffff815718ab>] tc_modify_qdisc+0x52b/0x590
[ 325.535117] [<ffffffff8155edd0>] rtnetlink_rcv_msg+0xa0/0x240
[ 325.540963] [<ffffffff8155ed30>] ? __rtnl_unlock+0x20/0x20
[ 325.546532] [<ffffffff8157f811>] netlink_rcv_skb+0xb1/0xc0
[ 325.552145] [<ffffffff8155b355>] rtnetlink_rcv+0x25/0x40
[ 325.557558] [<ffffffff8157f0d8>] netlink_unicast+0x168/0x220
[ 325.563317] [<ffffffff8157f47c>] netlink_sendmsg+0x2ec/0x3e0
Lets play safe and not use an union : percpu 'pointers' are mostly read
anyway, and we have typically few qdiscs per host.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Fixes: 22e0f8b9322c ("net: sched: make bstats per cpu and estimator RCU safe")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The existing code frees the skb in EAGAIN case, in which the skb will be
retried from upper layer and used again.
Also, the existing code doesn't free send buffer slot in error case, because
there is no completion message for unsent packets.
This patch fixes these problems.
(Please also include this patch for stable trees. Thanks!)
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the following kernel crash,
WARNING: CPU: 2 PID: 0 at net/ipv4/tcp_input.c:3079 tcp_clean_rtx_queue+0x658/0x80c()
Call trace:
[<fffffe0000096b7c>] dump_backtrace+0x0/0x184
[<fffffe0000096d10>] show_stack+0x10/0x1c
[<fffffe0000685ea0>] dump_stack+0x74/0x98
[<fffffe00000b44e0>] warn_slowpath_common+0x88/0xb0
[<fffffe00000b461c>] warn_slowpath_null+0x14/0x20
[<fffffe00005b5c1c>] tcp_clean_rtx_queue+0x654/0x80c
[<fffffe00005b6228>] tcp_ack+0x454/0x688
[<fffffe00005b6ca8>] tcp_rcv_established+0x4a4/0x62c
[<fffffe00005bf4b4>] tcp_v4_do_rcv+0x16c/0x350
[<fffffe00005c225c>] tcp_v4_rcv+0x8e8/0x904
[<fffffe000059d470>] ip_local_deliver_finish+0x100/0x26c
[<fffffe000059dad8>] ip_local_deliver+0xac/0xc4
[<fffffe000059d6c4>] ip_rcv_finish+0xe8/0x328
[<fffffe000059dd3c>] ip_rcv+0x24c/0x38c
[<fffffe0000563950>] __netif_receive_skb_core+0x29c/0x7c8
[<fffffe0000563ea4>] __netif_receive_skb+0x28/0x7c
[<fffffe0000563f54>] netif_receive_skb_internal+0x5c/0xe0
[<fffffe0000564810>] napi_gro_receive+0xb4/0x110
[<fffffe0000482a2c>] xgene_enet_process_ring+0x144/0x338
[<fffffe0000482d18>] xgene_enet_napi+0x1c/0x50
[<fffffe0000565454>] net_rx_action+0x154/0x228
[<fffffe00000b804c>] __do_softirq+0x110/0x28c
[<fffffe00000b8424>] irq_exit+0x8c/0xc0
[<fffffe0000093898>] handle_IRQ+0x44/0xa8
[<fffffe000009032c>] gic_handle_irq+0x38/0x7c
[...]
Software writes poison data into the descriptor bytes[15:8] and upon
receiving the interrupt, if those bytes are overwritten by the hardware with
the valid data, software also reads bytes[7:0] and executes receive/tx
completion logic.
If the CPU executes the above two reads in out of order fashion, then the
bytes[7:0] will have older data and causing the kernel panic. We have to
force the order of the reads and thus this patch introduces read memory
barrier between these reads.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
ixgbevf_tx_csum() fails to get the network protocol and checksum related
descriptor fields are not configured correctly because skb->protocol
doesn't show the L3 protocol in this case.
Use first->protocol instead of skb->protocol to get the proper network
protocol.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
ixgbe_tx_csum() fails to get the network protocol and checksum related
descriptor fields are not configured correctly because skb->protocol
doesn't show the L3 protocol in this case.
Use vlan_get_protocol() to get the proper network protocol.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
igbvf_tx_csum() fails to get the network protocol and checksum related
descriptor fields are not configured correctly because skb->protocol
doesn't show the L3 protocol in this case.
Use vlan_get_protocol() to get the proper network protocol.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
vlan_get_protocol() could not get network protocol if a skb has a 802.1ad
vlan tag or multiple vlans, which caused incorrect checksum calculation
in several drivers.
Fix vlan_get_protocol() to retrieve network protocol instead of incorrect
vlan protocol.
As the logic is the same as skb_network_protocol(), create a common helper
function __vlan_get_protocol() and call it from existing functions.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When making use of RFC5061, section 4.2.4. for setting the primary IP
address, we're passing a wrong parameter header to param_type2af(),
resulting always in NULL being returned.
At this point, param.p points to a sctp_addip_param struct, containing
a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation
id. Followed by that, as also presented in RFC5061 section 4.2.4., comes
the actual sctp_addr_param, which also contains a sctp_paramhdr, but
this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that
param_type2af() can make use of. Since we already hold a pointer to
addr_param from previous line, just reuse it for param_type2af().
Fixes: d6de3097592b ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
Signed-off-by: Saran Maruti Ramanara <saran.neti@telus.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The subscription bitmask passed via struct sockaddr_nl is converted to
the group number when calling the netlink_bind() and netlink_unbind()
callbacks.
The conversion is however incorrect since bitmask (1 << 0) needs to be
mapped to group number 1. Note that you cannot specify the group number 0
(usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP
since this is rejected through -EINVAL.
This problem became noticeable since 97840cb ("netfilter: nfnetlink:
fix insufficient validation in nfnetlink_bind") when binding to bitmask
(1 << 0) in ctnetlink.
Reported-by: Andre Tomt <andre@tomt.net>
Reported-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a race in the MIPS fork code which allows the child to get a
stale copy of parent MSA/FPU/DSP state that is active in hardware
registers when the fork() is called. This is because copy_thread() saves
the live register state into the child context only if the hardware is
currently in use, apparently on the assumption that the hardware state
cannot have been saved and disabled since the initial duplication of the
task_struct. However preemption is certainly possible during this
window.
An example sequence of events is as follows:
1) The parent userland process puts important data into saved floating
point registers ($f20-$f31), which are then dirty compared to the
process' stored context.
2) The parent process calls fork() which does a clone system call.
3) In the kernel, do_fork() -> copy_process() -> dup_task_struct() ->
arch_dup_task_struct() (which uses the weakly defined default
implementation). This duplicates the parent process' task context,
which includes a stale version of its FP context from when it was
last saved, probably some time before (1).
4) At some point before copy_process() calls copy_thread(), such as when
duplicating the memory map, the process is desceduled. Perhaps it is
preempted asynchronously, or perhaps it sleeps while blocked on a
mutex. The dirty FP state in the FP registers is saved to the parent
process' context and the FPU is disabled.
5) When the process is rescheduled again it continues copying state
until it gets to copy_thread(), which checks whether the FPU is in
use, so that it can copy that dirty state to the child process' task
context. Because of the deschedule however the FPU is not in use, so
the child process' context is left with stale FP context from the
last time the parent saved it (some time before (1)).
6) When the new child process is scheduled it reads the important data
from the saved floating point register, and ends up doing a NULL
pointer dereference as a result of the stale data.
This use of saved floating point registers across function calls can be
triggered fairly easily by explicitly using inline asm with a current
(MIPS R2) compiler, but is far more likely to happen unintentionally
with a MIPS R6 compiler where the FP registers are more likely to get
used as scratch registers for storing non-fp data.
It is easily fixed, in the same way that other architectures do it, by
overriding the implementation of arch_dup_task_struct() to sync the
dirty hardware state to the parent process' task context *prior* to
duplicating it, rather than copying straight to the child process' task
context in copy_thread(). Note, the FPU hardware is not disabled so the
parent process may continue executing with the live register context,
but now the child process is guaranteed to have an identical copy of it
at that point.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reported-by: Matthew Fortune <matthew.fortune@imgtec.com>
Tested-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9075/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The following commits:
5890f70f15c52d (MIPS: Use dedicated exception handler if CPU supports RI/XI exceptions)
6575b1d4173eae (MIPS: kernel: cpu-probe: Detect unique RI/XI exceptions)
break the kernel for *all* existing MIPS CPUs that implement the
CP0_PageGrain[IEC] bit. They cause the TLB exception handlers to be
generated without the legacy execute-inhibit handling, but never set
the CP0_PageGrain[IEC] bit to activate the use of dedicated exception
vectors for execute-inhibit exceptions. The result is that upon
detection of an execute-inhibit violation, we loop forever in the TLB
exception handlers instead of sending SIGSEGV to the task.
If we are generating TLB exception handlers expecting separate
vectors, we must also enable the CP0_PageGrain[IEC] feature.
The bug was introduced in kernel version 3.17.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: <stable@vger.kernel.org>
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/8880/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
This reverts commit afe1de664ef3cb756e70938d99417dcbc6b1379a.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This reverts commit 67d7209e1f481cbaed37f9a224a328a3f83d0482.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This reverts commit 016d9fb25cd9817ea9c723f4f7ecd978636b4489.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This reverts commit e5d1dcf1b0951f4ba00d93653942dda6196109d8.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This reverts commit 3bcce487fda8161597c20ed303d510e41ad7770e.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This reverts commit 5141861cd5e17eac9676ff49c5abfafbea2b0e98.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This reverts commit bb42a6dd02fb2901a69dbec2358810735b14b186.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
This reverts commit ce347ab90eaabc69a6146d41943981d51e7a9b82.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Commit 842dfc11ea9a ("MIPS: Fix build with binutils 2.24.51+") in v3.18
enabled -msoft-float and sprinkled ".set hardfloat" where necessary to
use FP instructions. However it missed enable_restore_fp_context() which
since v3.17 does a ctc1 with inline assembly, causing the following
assembler errors on Mentor's 2014.05 toolchain:
{standard input}: Assembler messages:
{standard input}:2913: Error: opcode not supported on this processor: mips32r2 (mips32r2) `ctc1 $2,$31'
scripts/Makefile.build:257: recipe for target 'arch/mips/kernel/traps.o' failed
Fix that to use the new write_32bit_cp1_register() macro so that ".set
hardfloat" is automatically added when -msoft-float is in use.
Fixes 842dfc11ea9a ("MIPS: Fix build with binutils 2.24.51+")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.18+, depends on "MIPS: mipsregs.h: Add write_32bit_cp1_register()"
Patchwork: https://patchwork.linux-mips.org/patch/9173/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Add a write_32bit_cp1_register() macro to compliment the
read_32bit_cp1_register() macro. This is to abstract whether .set
hardfloat needs to be used based on GAS_HAS_SET_HARDFLOAT.
The implementation of _read_32bit_cp1_register() .sets mips1 due to
failure of gas v2.19 to assemble cfc1 for Octeon (see commit
25c300030016 ("MIPS: Override assembler target architecture for
octeon.")). I haven't copied this over to _write_32bit_cp1_register() as
I'm uncertain whether it applies to ctc1 too, or whether anybody cares
about that version of binutils any longer.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9172/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Currently, cpudl::free_cpus contains all CPUs during init, see
cpudl_init(). When calling cpudl_find(), we have to add rd->span
to avoid selecting the cpu outside the current root domain, because
cpus_allowed cannot be depended on when performing clustered
scheduling using the cpuset, see find_later_rq().
This patch adds cpudl_set_freecpu() and cpudl_clear_freecpu() for
changing cpudl::free_cpus when doing rq_online_dl()/rq_offline_dl(),
so we can avoid the rd->span operation when calling cpudl_find()
in find_later_rq().
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421642980-10045-1-git-send-email-pang.xunlei@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
cpu_idle_poll() is entered into when either the cpu_idle_force_poll is set or
tick_check_broadcast_expired() returns true. The exit condition from
cpu_idle_poll() is tif_need_resched().
However this does not take into account scenarios where cpu_idle_force_poll
changes or tick_check_broadcast_expired() returns false, without setting
the resched flag. So a cpu will be caught in cpu_idle_poll() needlessly,
thereby wasting power. Add an explicit check on cpu_idle_force_poll and
tick_check_broadcast_expired() to the exit condition of cpu_idle_poll()
to avoid this.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150121105655.15279.59626.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
If an interrupt fires in cond_resched(), between the call to __schedule()
and the PREEMPT_ACTIVE count decrementation, and that interrupt sets
TIF_NEED_RESCHED, the call to preempt_schedule_irq() will be ignored
due to the PREEMPT_ACTIVE count. This kind of scenario, with irq preemption
being delayed because it's interrupting a preempt-disabled area, is
usually fixed up after preemption is re-enabled back with an explicit
call to preempt_schedule().
This is what preempt_enable() does but a raw preempt count decrement as
performed by __preempt_count_sub(PREEMPT_ACTIVE) doesn't handle delayed
preemption check. Therefore when such a race happens, the rescheduling
is going to be delayed until the next scheduler or preemption entrypoint.
This can be a problem for scheduler latency sensitive workloads.
Lets fix that by consolidating cond_resched() with preempt_schedule()
internals.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Ingo Molnar <mingo@kernel.org>
Original-patch-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1421946484-9298-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This patch adds checks that prevens futile attempts to move rt tasks
to a CPU with active tasks of equal or higher priority.
This reduces run queue lock contention and improves the performance of
a well known OLTP benchmark by 0.7%.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Cc: Suruchi Kadu <suruchi.a.kadu@intel.com>
Cc: Doug Nelson<doug.nelson@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421430374.2399.27.camel@schen9-desk2.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|