Age | Commit message (Collapse) | Author | Files | Lines |
|
SKX is similar to BDX
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This driver registers cpuidle devices when a CPU comes online, but it
leaves the registrations in place when a CPU goes offline. The module
exit code only unregisters the currently online CPUs, leaving the
devices for offline CPUs dangling.
This patch changes the driver to clean up all registrations on exit,
even those from CPUs that are offline.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If a cpuidle registration error occurs during the hot plug notifier
callback, we should really inform the hot plug machinery instead of
just ignoring the error. This patch changes the callback to properly
return on error.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The helper function, intel_idle_cpu_init, registers one new device
with the cpuidle layer. If the registration should fail, that
function immediately calls intel_idle_cpuidle_devices_uninit() to
unregister every last CPU's device. However, it makes no sense to do
so, when called from the hot plug notifier callback.
This patch moves the call to intel_idle_cpuidle_devices_uninit()
outside of the helper function to the one call site that actually
needs to perform the de-registrations.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This driver sets the broadcast tick quite early on during probe and does
not clean up again in cast of failure. This patch moves the setup call
after the registration, placing the on_each_cpu() calls within the global
CPU lock region.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The helper function, intel_idle_cpuidle_devices_uninit, frees the
globally allocated per-CPU data. However, this function is invoked
from the hot plug notifier callback at a time when freeing that data
is not safe.
If the call to cpuidle_register_driver() should fail (say, due to lack
of memory), then the driver will free its per-CPU region. On the
*next* CPU_ONLINE event, the driver will happily use the region again
and even free it again if the failure repeats.
This patch fixes the issue by moving the call to free_percpu() outside
of the helper function at the two call sites that actually need to
free the per-CPU data.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In the module_init() method, if the per-CPU allocation fails, then the
active cpuidle registration is not cleaned up. This patch fixes the
issue by attempting the allocation before registration, and then
cleaning it up again on registration failure.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In the module_exit() method, this driver first frees its per-CPU
pointer, then unregisters a callback making use of the pointer.
Furthermore, the function, intel_idle_cpuidle_devices_uninit, is racy
against CPU hot plugging as it calls for_each_online_cpu().
This patch corrects the issues by unregistering first on the exit path
while holding the hot plug lock.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The function, intel_idle_cpuidle_driver_init, makes calls on each CPU
to auto_demotion_disable() and c1e_promotion_disable(). These calls
are redundant, as intel_idle_cpu_init() does the same calls just a bit
later on. They are also premature, as the driver registration may yet
fail.
This patch removes the redundant code.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The function, intel_idle_cpuidle_driver_init, delivers no error codes
at all. This patch changes the function to return 'void' instead of
returning zero.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Enables "Intel(R) Xeon Phi(TM) Processor x200 Product Family" support,
formerly code-named KNL. It is based on modified Intel Atom Silvermont
microarchitecture.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
[micah.barany@intel.com: adjusted values of residency and latency]
Signed-off-by: Micah Barany <micah.barany@intel.com>
[hubert.chrzaniuk@intel.com: removed deprecated CPUIDLE_FLAG_TIME_VALID flag]
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Pawel Karczewski <pawel.karczewski@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Some SKL-H configurations require "intel_idle.max_cstate=7" to boot.
While that is an effective workaround, it disables C10.
This patch detects the problematic configuration,
and disables C8 and C9, keeping C10 enabled.
Note that enabling SGX in BIOS SETUP can also prevent this issue,
if the system BIOS provides that option.
https://bugzilla.kernel.org/show_bug.cgi?id=109081
"Freezes with Intel i7 6700HQ (Skylake), unless intel_idle.max_cstate=7"
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
|
|
Commit a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable
polling) changed the behavior of the fallback state selection part
of menu_select() so it looks at interactivity_req instead of
data->next_timer_us when it makes its decision. That effectively
caused polling to be used more often as fallback idle which led to
significant increases of energy consumption in some cases.
Commit e132b9b3bc7f (cpuidle: menu: use high confidence factors
only when considering polling) changed that logic again to be more
predictable, but that didn't help with the increased energy
consumption problem.
For this reason, go back to making decisions on which state to fall
back to based on data->next_timer_us which is the time we know for
sure something will happen rather than a prediction (which may be
inaccurate and turns out to be so often enough to be problematic).
However, take the target residency of the first proper idle state
(C1) into account, so that state is not used as the fallback one
if its target residency is greater than data->next_timer_us.
Fixes: a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable polling)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-and-tested-by: Doug Smythies <dsmythies@telus.net>
|
|
The menu governor uses five different factors to pick the
idle state:
- the user configured latency_req
- the time until the next timer (next_timer_us)
- the typical sleep interval, as measured recently
- an estimate of sleep time by dividing next_timer_us by an observed factor
- a load corrected version of the above, divided again by load
Only the first three items are known with enough confidence that
we can use them to consider polling, instead of an actual CPU
idle state, because the cost of being wrong about polling can be
excessive power use.
The latter two are used in the menu governor's main selection
loop, and can result in choosing a shallower idle state when
the system is expected to be busy again soon.
This pushes a busy system in the "performance" direction of
the performance<>power tradeoff, when choosing between idle
states, but stays more strictly on the "power" state when
deciding between polling and C1.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We know that the avg variable actually ends up holding a 32 bit
quantity, since it's an average of such numbers. It is only a u64
because it is temporarily used to hold the sum. Making it an actual
u32 allows gcc to generate slightly better code, e.g. when computing
the square, it can do a 32x32->64 multiply.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance <=
U64_MAX/36).
On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev <= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
The 'umidi' object will be free'd on the error path by snd_usbmidi_free()
when tearing down the rawmidi interface. So we shouldn't try to free it
in snd_usbmidi_create() after having registered the rawmidi interface.
Found by KASAN.
Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Fix the RC QPs send queue overhead computation to take into account
two additional segments in the WQE which are needed for registration
operations.
The ATOMIC and UMR segments can't coexist together, so chose maximum out
of them.
The commit 9e65dc371b5c ("IB/mlx5: Fix RC transport send queue overhead
computation") was intended to update RC transport as commit messages
states, but added the code to UC transport.
Fixes: 9e65dc371b5c ("IB/mlx5: Fix RC transport send queue overhead computation")
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
A narrow window for race condition still exist between
multicast join thread and *dev_flush workers.
A kernel crash caused by prolong erratic link state changes
was observed (most likely a faulty cabling):
[167275.656270] BUG: unable to handle kernel NULL pointer dereference at
0000000000000020
[167275.665973] IP: [<ffffffffa05f8f2e>] ipoib_mcast_join+0xae/0x1d0 [ib_ipoib]
[167275.674443] PGD 0
[167275.677373] Oops: 0000 [#1] SMP
...
[167275.977530] Call Trace:
[167275.982225] [<ffffffffa05f92f0>] ? ipoib_mcast_free+0x200/0x200 [ib_ipoib]
[167275.992024] [<ffffffffa05fa1b7>] ipoib_mcast_join_task+0x2a7/0x490
[ib_ipoib]
[167276.002149] [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[167276.010754] [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[167276.019088] [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[167276.027737] [<ffffffff810a5aef>] kthread+0xcf/0xe0
Here was a hit spot:
ipoib_mcast_join() {
..............
rec.qkey = priv->broadcast->mcmember.qkey;
^^^^^^^
.....
}
Proposed patch should prevent multicast join task to continue
if link state change is detected.
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Changes from v4:
- as suggested by Doug Ledford, optimized spinlock usage,
i.e. ipoib_mcast_join() is called with lock held.
Changes from v3:
- sync with priv->lock before flag check.
Chages from v2:
- Move check for OPER_UP flag state to mcast_join() to
ensure no event worker is in progress.
- minor style fixes.
Changes from v1:
- No need to lock again if error detected.
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch fixes vulnerability CVE-2016-2085. The problem exists
because the vm_verify_hmac() function includes a use of memcmp().
Unfortunately, this allows timing side channel attacks; specifically
a MAC forgery complexity drop from 2^128 to 2^12. This patch changes
the memcmp() to the cryptographically safe crypto_memneq().
Reported-by: Xiaofei Rex Guo <xiaofei.rex.guo@intel.com>
Signed-off-by: Ryan Ware <ware@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
MMUv4 supports 2 concurrent page sizes: Normal and Super [4K to 16M]
So far Linux supported a single super page size for a given Normal page,
depending on the software page walking address split.
e.g. we had 11:8:13 address split for 8K page, which meant super page
was 2 ^(8+13) = 2M (given that THP size has to be PMD_SHIFT)
Now we turn this around, by allowing multiple Super Pages in Kconfig
(currently 2M and 16M only) and forcing page walker address split to
PGDIR_SHIFT and PAGE_SHIFT
For configs without Super page, things are same as before and
PGDIR_SHIFT can be hacked to get non default address split
The motivation for this change is a customer who needs 16M super page
and a 8K Normal page combo.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
arch/x86/built-in.o: In function `uv_bios_call':
(.text+0xeba00): undefined reference to `efi_call'
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The pfn_t type uses an unsigned long to store a pfn + flags value. On a
64-bit platform the upper 12 bits of an unsigned long are never used for
storing the value of a pfn. However, this is not true on highmem
platforms, all 32-bits of a pfn value are used to address a 44-bit
physical address space. A pfn_t needs to store a 64-bit value.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=112211
Fixes: 01c8f1c44b83 ("mm, dax, gpu: convert vm_insert_mixed to pfn_t")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Stuart Foster <smf.linux@ntlworld.com>
Reported-by: Julian Margetson <runaway@candw.ms>
Tested-by: Julian Margetson <runaway@candw.ms>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Mike said:
: CONFIG_UBSAN_ALIGNMENT breaks x86-64 kernel with lockdep enabled, i. e
: kernel with CONFIG_UBSAN_ALIGNMENT fails to load without even any error
: message.
:
: The problem is that ubsan callbacks use spinlocks and might be called
: before lockdep is initialized. Particularly this line in the
: reserve_ebda_region function causes problem:
:
: lowmem = *(unsigned short *)__va(BIOS_LOWMEM_KILOBYTES);
:
: If i put lockdep_init() before reserve_ebda_region call in
: x86_64_start_reservations kernel loads well.
Fix this ordering issue permanently: change lockdep so that it uses
hlists for the hash tables. Unlike a list_head, an hlist_head is in its
initialized state when it is all-zeroes, so lockdep is ready for
operation immediately upon boot - lockdep_init() need not have run.
The patch will also save some memory.
lockdep_init() and lockdep_initialized can be done away with now - a 4.6
patch has been prepared to do this.
Reported-by: Mike Krinkin <krinkin.m.u@gmail.com>
Suggested-by: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
[akpm@linux-foundation.org: s/threshhold/threshold/]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This showed up on ARC when running LMBench bw_mem tests as Overlapping
TLB Machine Check Exception triggered due to STLB entry (2M pages)
overlapping some NTLB entry (regular 8K page).
bw_mem 2m touches a large chunk of vaddr creating NTLB entries. In the
interim khugepaged kicks in, collapsing the contiguous ptes into a
single pmd. pmdp_collapse_flush()->flush_pmd_tlb_range() is called to
flush out NTLB entries for the ptes. This for ARC (by design) can only
shootdown STLB entries (for pmd). The stray NTLB entries cause the
overlap with the subsequent STLB entry for collapsed page. So make
pmdp_collapse_flush() call pte flush interface not pmd flush.
Note that originally all thp flush call sites in generic code called
flush_tlb_range() leaving it to architecture to implement the flush for
pte and/or pmd. Commit 12ebc1581ad11454 changed this by calling a new
opt-in API flush_pmd_tlb_range() which made the semantics more explicit
but failed to distinguish the pte vs pmd flush in generic code, which is
what this patch fixes.
Note that ARC can fixed w/o touching the generic pmdp_collapse_flush()
by defining a ARC version, but that defeats the purpose of generic
version, plus sementically this is the right thing to do.
Fixes STAR 9000961194: LMBench on AXS103 triggering duplicate TLB
exceptions with super pages
Fixes: 12ebc1581ad11454 ("mm,thp: introduce flush_pmd_tlb_range")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We need to use post-decrement to get percpu_counter_destroy() called on
&wb->stat[0]. Moreover, the pre-decremebt would cause infinite
out-of-bounds accesses if the setup code failed at i==0.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
DAX implements split_huge_pmd() by clearing pmd. This simple approach
reduces memory overhead, as we don't need to deposit page table on huge
page mapping to make split_huge_pmd() never-fail. PTE table can be
allocated and populated later on page fault from backing store.
But one side effect is that have to check if pmd is pmd_none() after
split_huge_pmd(). In most places we do this already to deal with
parallel MADV_DONTNEED.
But I found two call sites which is not affected by MADV_DONTNEED (due
down_write(mmap_sem)), but need to have the check to work with DAX
properly.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The kptr_restrict flag, when set to 1, only prints the kernel address
when the user has CAP_SYSLOG. When it is set to 2, the kernel address
is always printed as zero. When set to 1, this needs to check whether
or not we're in IRQ.
However, when set to 2, this check is unneccessary, and produces
confusing results in dmesg. Thus, only make sure we're not in IRQ when
mode 1 is used, but not mode 2.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add missing kernel-doc notation for function parameter 'gfp_mask' to fix
kernel-doc warning.
mm/filemap.c:1898: warning: No description found for parameter 'gfp_mask'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When enabling UBSAN_SANITIZE_ALL, the kernel image size gets increased
significantly (~3x). So, it sounds better to have some note in Kconfig.
And, fixed a typo.
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When checking specific attribute from a bit mask, need to use bitwise
AND and not logical AND, fixed that.
Fixes: 145d9c541032 ('IB/core: Display extended counter set if
available')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The while loop after err_slaves should use post-decrement; otherwise
we'll fail to do the kfrees for i==0, and will run into out-of-bounds
accesses if the setup above failed already at i==0.
[I'm not sure why one even bothers populating the ->vlan_filter array:
mlx4.h isn't #included by anything outside
drivers/net/ethernet/mellanox/mlx4/, and "git grep -C2 -w vlan_filter
drivers/net/ethernet/mellanox/mlx4/" seems to suggest that the
vlan_filter elements aren't used at all.]
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This reverts commit 829b6962f7e3cfc06f7c5c26269fd47ad48cf503.
Revert this change as it causes a sysfs path to change and therefore
introduces and ABI regression. More precisely Android's vold is not being
able to access /sys/module/mmcblk/parameters/perdev_minors any more, since
the path becomes changed to: "/sys/module/mmc_block/..."
Fixes: 829b6962f7e3 ("mmc: block: don't use parameter prefix if built as
module")
Reported-by: John Stultz <john.stultz@linaro.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
In __cpufreq_cooling_register() we allocate the arrays for time_in_idle
and time_in_idle_timestamp to be as big as the number of cpus in this
cpufreq device. However, in get_load() we access this array using the
cpu number as index, which can result in an out of bound access.
Index time_in_idle{,_timestamp} using the index in the cpufreq_device's
allowed_cpus mask, as we do for the load_cpu array in
cpufreq_get_requested_power()
Reported-by: Nicolas Boichat <drinkcat@chromium.org>
Cc: Amit Daniel Kachhap <amit.kachhap@gmail.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Tested-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
|
The value of ctx->pos in the last readdir call is supposed to be set to
INT_MAX due to 32bit compatibility, unless 'pos' is intentially set to a
larger value, then it's LLONG_MAX.
There's a report from PaX SIZE_OVERFLOW plugin that "ctx->pos++"
overflows (https://forums.grsecurity.net/viewtopic.php?f=1&t=4284), on a
64bit arch, where the value is 0x7fffffffffffffff ie. LLONG_MAX before
the increment.
We can get to that situation like that:
* emit all regular readdir entries
* still in the same call to readdir, bump the last pos to INT_MAX
* next call to readdir will not emit any entries, but will reach the
bump code again, finds pos to be INT_MAX and sets it to LLONG_MAX
Normally this is not a problem, but if we call readdir again, we'll find
'pos' set to LLONG_MAX and the unconditional increment will overflow.
The report from Victor at
(http://thread.gmane.org/gmane.comp.file-systems.btrfs/49500) with debugging
print shows that pattern:
Overflow: e
Overflow: 7fffffff
Overflow: 7fffffffffffffff
PAX: size overflow detected in function btrfs_real_readdir
fs/btrfs/inode.c:5760 cicus.935_282 max, count: 9, decl: pos; num: 0;
context: dir_context;
CPU: 0 PID: 2630 Comm: polkitd Not tainted 4.2.3-grsec #1
Hardware name: Gigabyte Technology Co., Ltd. H81ND2H/H81ND2H, BIOS F3 08/11/2015
ffffffff81901608 0000000000000000 ffffffff819015e6 ffffc90004973d48
ffffffff81742f0f 0000000000000007 ffffffff81901608 ffffc90004973d78
ffffffff811cb706 0000000000000000 ffff8800d47359e0 ffffc90004973ed8
Call Trace:
[<ffffffff81742f0f>] dump_stack+0x4c/0x7f
[<ffffffff811cb706>] report_size_overflow+0x36/0x40
[<ffffffff812ef0bc>] btrfs_real_readdir+0x69c/0x6d0
[<ffffffff811dafc8>] iterate_dir+0xa8/0x150
[<ffffffff811e6d8d>] ? __fget_light+0x2d/0x70
[<ffffffff811dba3a>] SyS_getdents+0xba/0x1c0
Overflow: 1a
[<ffffffff811db070>] ? iterate_dir+0x150/0x150
[<ffffffff81749b69>] entry_SYSCALL_64_fastpath+0x12/0x83
The jump from 7fffffff to 7fffffffffffffff happens when new dir entries
are not yet synced and are processed from the delayed list. Then the code
could go to the bump section again even though it might not emit any new
dir entries from the delayed list.
The fix avoids entering the "bump" section again once we've finished
emitting the entries, both for synced and delayed entries.
References: https://forums.grsecurity.net/viewtopic.php?f=1&t=4284
Reported-by: Victor <services@swwu.com>
CC: stable@vger.kernel.org
Signed-off-by: David Sterba <dsterba@suse.com>
Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Since the dawn of time the ICST code has only supported divide
by one or hang in an eternal loop. Luckily we were always dividing
by one because the reference frequency for the systems using
the ICSTs is 24MHz and the [min,max] values for the PLL input
if [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop
will always terminate immediately without assigning any divisor
for the reference frequency.
But for the code to make sense, let's insert the missing i++
Reported-by: David Binderman <dcb314@hotmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Intel BXT/APL use a card detect GPIO however the host controller
will not enable bus power unless it's card detect also reflects
the presence of a card. Unfortunately those 2 things race which
can result in commands not starting, after which the controller
does nothing and there is a 10 second wait for the driver's
10-second timer to timeout.
That is fixed by having the driver look also at the present state
register to determine if the card is present. Consequently, provide
a 'get_cd' mmc host operation for BXT/APL that does that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Intel BXT/APL use a card detect GPIO however the host controller
will not enable bus power unless it's card detect also reflects
the presence of a card. Unfortunately those 2 things race which
can result in commands not starting, after which the controller
does nothing and there is a 10 second wait for the driver's
10-second timer to timeout.
That is fixed by having the driver look also at the present state
register to determine if the card is present. Consequently, provide
a 'get_cd' mmc host operation for BXT/APL that does that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Drivers may need to provide their own get_cd() mmc host op, but
currently the internals of the current op (sdhci_get_cd()) are
provided by sdhci_do_get_cd() which is also called from
sdhci_request().
To allow override of the get_cd functionality, change sdhci_request()
to call ->get_cd() instead of sdhci_do_get_cd().
Note, in the future the call to ->get_cd() will likely be removed
from sdhci_request() since most drivers don't need actually it.
However this change is being done now to facilitate a subsequent
bug fix.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
In the past, fixes for specific hardware devices were implemented
in sdhci using quirks. That approach is no longer accepted because
the growing number of quirks was starting to make the code difficult
to understand and maintain.
One alternative to quirks, is to allow drivers to override the default
mmc host operations. This patch makes it easy to do that, and it is
needed for a subsequent bug fix, for which separate patches are
provided.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Depending on the configuration either the 32 or 64 bit version of
elf_check_arch() is defined. parse_crash_elf{32|64}_headers() does
some basic verification of the ELF header via
vmcore_elf{32|64}_check_arch() which happen to map to elf_check_arch().
Since the implementation 32 and 64 bit version of elf_check_arch()
differ, we use the wrong type:
In file included from include/linux/elf.h:4:0,
from fs/proc/vmcore.c:13:
fs/proc/vmcore.c: In function 'parse_crash_elf64_headers':
>> arch/mips/include/asm/elf.h:228:23: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
struct elfhdr *__h = (hdr); \
^
include/linux/crash_dump.h:41:37: note: in expansion of macro 'elf_check_arch'
#define vmcore_elf64_check_arch(x) (elf_check_arch(x) || vmcore_elf_check_arch_cross(x))
^
fs/proc/vmcore.c:1015:4: note: in expansion of macro 'vmcore_elf64_check_arch'
!vmcore_elf64_check_arch(&ehdr) ||
^
Therefore, we rather define vmcore_elf{32|64}_check_arch() as a
basic machine check and use it also in binfm_elf?32.c as well.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Suggested-by: Maciej W. Rozycki <macro@imgtec.com>
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12529/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The ARM GICv3 specification mentions the need for dsb after a read
from the ICC_IAR1_EL1 register:
4.1.1 Physical CPU Interface:
The effects of reading ICC_IAR0_EL1 and ICC_IAR1_EL1
on the state of a returned INTID are not guaranteed
to be visible until after the execution of a DSB.
Not having this could result in missed interrupts, so let's add the
required barrier.
[Marc: fixed commit message]
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
EOImode1 is only used for the root controller and hence only the root
controller uses the eoimode1 functions for handling interrupts. However,
if the root controller supports EOImode1, then the EOImodeNS bit will be
set for all GICs, enabling EOImode1. This is not what we want and this
causes interrupts on non-root GICs to only be dropped in priority but
never deactivated. Therefore, only set the EOImodeNS bit for the root
controller.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Setting the affinity of an IRQ, it only applicable for the root
interrupt controller and so only populate this operator for the root
controller.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Calling component_add() may result in the completion of a set of
devices, which will try to bring up a master. In bringing the master
up, we populate its match array with the current set of children.
If binding any of the devices fails, component_add() itself will fail,
free the struct component entry, and return to the caller. The
now-freed entry is never removed from the master's match array, and
will later be used in a futile attempt to bind to freed memory.
Bring component_add's behaviour on failure to bring up a master into
line with component_del by removing the (to-be-freed) component from
the master's match array.
The specific case which broke was:
- rockchip_drm_drv adds a component master
- dwhdmi_rockchip adds a child component in probe (master incomplete)
- rockchip_drm_vop adds two children in probe, which completes the
set
- inside component_add, we try to bring up the master, having
populated the master's match array, and fail with EPROBE_DEFER from
dwhdmi_rockchip; we delete the putative component
- rockchip_drm_vop's probe fails and returns EPROBE_DEFER
- we later re-probe rockchip_drm_vop and add the component; the
master is complete, so we attempt to bring it up again
- walking the match array, we find the previous child, whose master
pointer doesn't match (as it has been freed in the meantime)
- rockchip_drm_vop probe fails, and will never be attempted again
Fixes: ffc30b74fd6d01588bd3fdebc3b1acc0857e6fc8
Signed-off-by: Daniel Stone <daniels@collabora.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes: fb3269b ("qla2xxx: Add selective command queuing")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes: fb3269b ("qla2xxx: Add selective command queuing")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|