Age | Commit message (Collapse) | Author | Files | Lines |
|
It doesn't make sense to enable merge because the I/O
submitted to backing file is handled page by page.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
When direct read IO is submitted from kernel, it is often
unnecessary to dirty pages, for example of loop, dirtying pages
have been considered in the upper filesystem(over loop) side
already, and they don't need to be dirtied again.
So this patch doesn't dirtying pages for ITER_BVEC/ITER_KVEC
direct read, and loop should be the 1st case to use ITER_BVEC/ITER_KVEC
for direct read I/O.
The patch is based on previous Dave's patch.
Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
In case of wbc->sync_mode == WB_SYNC_ALL we need to do data integrity
write, thus mark request as WRITE_SYNC.
akpm: afaict this change will cause the data integrity write bios to be
placed onto the second queue in cfq_io_cq.cfqq[], which presumably results
in special treatment. The documentation for REQ_SYNC is horrid.
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The pages allocated for struct request contain pointers to other slab
allocations (via ops->init_request). Since kmemleak does not track/scan
page allocations, the slab objects will be reported as leaks (false
positives). This patch adds kmemleak callbacks to allow tracking of such
pages.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche<bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Jann Horn <jann@thejh.net>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
Commit 505a666ee3fc ("writeback: plug writeback in wb_writeback() and
writeback_inodes_wb()") has us holding a plug during writeback_sb_inodes,
which increases the merge rate when relatively contiguous small files
are written by the filesystem. It helps both on flash and spindles.
For an fs_mark workload creating 4K files in parallel across 8 drives,
this commit improves performance ~9% more by unplugging before calling
cond_resched(). cond_resched() doesn't trigger an implicit unplug, so
explicitly getting the IO down to the device before scheduling reduces
latencies for anyone waiting on clean pages.
It also cuts down on how often we use kblockd to unplug, which means
less work bouncing from one workqueue to another.
Many more details about how we got here:
https://lkml.org/lkml/2015/9/11/570
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The various definitions of __pfn_to_phys() have been consolidated to
use a generic macro in include/asm-generic/memory_model.h. This hit
mainline in the form of 012dcef3f058 "mm: move __phys_to_pfn and
__pfn_to_phys to asm/generic/memory_model.h". When the generic macro
was implemented the type cast to phys_addr_t was dropped which caused
boot regressions on ARM platforms with more than 4GB of memory and
LPAE enabled.
It was suggested to use PFN_PHYS() defined in include/linux/pfn.h
as provides the correct logic and avoids further duplication.
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Some of the device files are required to be user accessible for PSM while
most should remain accessible only by root.
Add a parameter to hfi1_cdev_init which controls if the user should have access
to this device which places it in a different class with the appropriate
devnode callback.
In addition set the devnode call back for the existing class to be a bit more
explicit for those permissions.
Finally remove the unnecessary null check before class_destroy
Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Haralanov, Mitko (mitko.haralanov@intel.com)
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
We are shifting by the _MASK macros instead of the _SHIFT ones.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
I added spaces around operators so it matches kernel style because
normally "-1ULL" is a number and " - 1" is a subtract operation. Also
removed some superflous "ULL" types so "1ULL" becomes "1".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The cinfo struct has a hole after the last struct member so we need to
zero it out. Otherwise we disclose some uninitialized stack data.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
mutex_trylock() returns zero on failure, not EBUSY.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
__get_txreq() returns an ERR_PTR() but this checks for NULL so it would
oops on failure.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The boolean tests should have been or-ed.
Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
copy_to/from_user() returns the number of bytes which we were not able
to copy. It doesn't return an error code.
Also a couple places had a printk() on error and I removed that because
people can take advantage of it to fill /var/log/messages with spam.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Byteswap link_width_downgrade_*_active values before sending on the wire. In
addition properly define the Port State Info structure.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Christian Gomez <christian.gomez@intel.com>
Signed-off-by: Rimmer, Todd <todd.rimmer@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Commit 2ee507c47293 ("sched: Add function single_task_running to let a task
check if it is the only task running on a cpu") referenced the current
runqueue with the smp_processor_id. When CONFIG_DEBUG_PREEMPT is enabled,
that is only allowed if preemption is disabled or the currrent task is
bound to the local cpu (e.g. kernel worker).
With commit f78195129963 ("kvm: add halt_poll_ns module parameter") KVM
calls single_task_running. If CONFIG_DEBUG_PREEMPT is enabled that
generates a lot of kernel messages.
To avoid adding preemption in that cases, as it would limit the usefulness,
we change single_task_running to access directly the cpu local runqueue.
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Fixes: 2ee507c472939db4b146d545352b8a7c79ef47f8
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
As discussed on linux-arch all architectures should wire up the separate
system calls that are hidden behind the socketcall multiplexer system call.
It's just a couple more system calls and gives us a very small performance
improvement.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
A couple of compat wrapper functions are simply trampolines to the real
system call. This happened because the compat wrapper defines will only
sign and zero extend system call parameters which are of different size
on s390/s390x (longs and pointers).
All other parameters will be correctly sign and zero extended by the
normal system call wrappers.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Add notrace to the compat wrapper define to disable tracing of compat
wrapper functions. These are supposed to be very small and more or less
just a trampoline to the real system call.
Also fix indentation.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Revert commit 6dc296e7df4c "mm: make sure all file VMAs have ->vm_ops
set".
Will Deacon reports that it "causes some mmap regressions in LTP, which
appears to use a MAP_PRIVATE mmap of /dev/zero as a way to get anonymous
pages in some of its tests (specifically mmap10 [1])".
William Shuman reports Oracle crashes.
So revert the patch while we work out what to do.
Reported-by: William Shuman <wshuman3@gmail.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
[akpm@linux-foundation.org: Wanlong Gao has moved]
Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Stanislav Kholmanskikh <stanislav.kholmanskikh@oracle.com>
Cc: Alexey Kodanev <alexey.kodanev@oracle.com>
Cc: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This fixes a memleak if anon_inode_getfile() fails in userfaultfd().
Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some string_get_size() calls (e.g.:
string_get_size(1, 512, STRING_UNITS_10, ..., ...)
string_get_size(15, 64, STRING_UNITS_10, ..., ...)
) result in an infinite loop. The problem is that if size is equal to
divisor[units]/blk_size and is smaller than divisor[units] we'll end
up with size == 0 when we start doing sf_cap calculations:
For string_get_size(1, 512, STRING_UNITS_10, ..., ...) case:
...
remainder = do_div(size, divisor[units]); -> size is 0, remainder is 1
remainder *= blk_size; -> remainder is 512
...
size *= blk_size; -> size is still 0
size += remainder / divisor[units]; -> size is still 0
The caller causing the issue is sd_read_capacity(), the problem was
noticed on Hyper-V, such weird size was reported by host when scanning
collides with device removal. This is probably a separate issue worth
fixing, this patch is intended to prevent the library routine from
infinite looping.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: James Bottomley <JBottomley@Odin.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
__delay was not exported as a result while building with allmodconfig we
were getting build error of undefined symbol. __delay is being used by:
drivers/net/phy/mdio-octeon.c
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ioremap_uc was not defined and as a result while building with
allmodconfig were getting build error of: implicit declaration of
function 'ioremap_uc'.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The shadow which correspond 16 bytes memory may span 2 or 3 bytes. If
the memory is aligned on 8, then the shadow takes only 2 bytes. So we
check "shadow_first_bytes" is enough, and need not to call
"memory_is_poisoned_1(addr + 15);". But the code "if
(likely(!last_byte))" is wrong judgement.
e.g. addr=0, so last_byte = 15 & KASAN_SHADOW_MASK = 7, then the code
will continue to call "memory_is_poisoned_1(addr + 15);"
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michal Marek <mmarek@suse.cz>
Cc: <zhongjiang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
zcomp_create() verifies the success of zcomp_strm_{multi,single}_create()
through comp->stream, which can potentially be pointing to memory that
was freed if these functions returned an error.
While at it, replace a 'ERR_PTR(-ENOMEM)' by a more generic
'ERR_PTR(error)' as in the future zcomp_strm_{multi,siggle}_create()
could return other error codes. Function documentation updated
accordingly.
Fixes: beca3ec71fe5 ("zram: add multi stream functionality")
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Do not write initialize magic on systems that do not have
feature query 0xb. Fixes Bug #82451.
Redefine FEATURE_QUERY to align with 0xb and FEATURE2 with 0xd
for code clearity.
Add a new test function, hp_wmi_bios_2008_later() & simplify
hp_wmi_bios_2009_later(), which fixes a bug in cases where
an improper value is returned. Probably also fixes Bug #69131.
Add missing __init tag.
Signed-off-by: Kyle Evans <kvans32@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
|
|
Use a generic name for this kind of PLL
Correction in dts files are already done here:
commit 5eb26c605909 ("ARM: STi: DT: Rename st_pll3200c32_407_c0_x into st_pll3200c32_cx_x")
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
|
|
We are the client, but advertise keepalive2 anyway - for consistency,
if nothing else. In the future the server might want to know whether
its clients support keepalive2.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
|
|
This
struct ceph_timespec ceph_ts;
...
con_out_kvec_add(con, sizeof(ceph_ts), &ceph_ts);
wraps ceph_ts into a kvec and adds it to con->out_kvec array, yet
ceph_ts becomes invalid on return from prepare_write_keepalive(). As
a result, we send out bogus keepalive2 stamps. Fix this by encoding
into a ceph_timespec member, similar to how acks are read and written.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
|
|
When bio bounce is involved, one new bio and its biovecs are
cloned from the comming bio, which can be one fast-cloned bio
from upper layer(such as dm).
So it is obviously wrong to assume the start index of the coming(
original) bio's io vector is zero, which can be any value between
0 and (bi_max_vecs - 1), especially in case of bio split.
This patch fixes Fedora's booting oops on i386, often with the
following kernel log together:
> [ 9.026738] systemd[1]: Switching root.
> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1
> (systemd).
> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178
> index:0x16a
> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
> [ 9.087284] page dumped because: page still charged to cgroup
> [ 9.088772] bad because of flags:
> [ 9.089731] flags: 0x21(locked|lru)
> [ 9.090818] page->mem_cgroup:f2c3e400
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Tested-by: Adam Williamson <awilliam@redhat.com>
Cc: Ming Lin <mlin@kernel.org>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
biovecs has become immutable since v3.13, so it isn't necessary
to allocate biovecs for the new cloned bios, then we can save
one extra biovecs allocation/copy, and the allocation is often
not fixed-length and a bit more expensive.
For example, if the 'max_sectors_kb' of null blk's queue is set
as 16(32 sectors) via sysfs just for making more splits, this patch
can increase throught about ~70% in the sequential read test over
null_blk(direct io, bs: 1M).
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Ming Lin <ming.l@ssi.samsung.com>
Cc: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
This fixes a performance regression introduced by commit 54efd50bfd,
and allows us to take full advantage of the fact that we have immutable
bio_vecs. Hand applied, as it rejected violently with commit
5014c311baa2.
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
pmem_rw_page() needs to call wmb_pmem() on writes to make sure that the
newly written data is durable. This flow was added to pmem_rw_bytes()
and pmem_make_request() with this commit:
commit 61031952f4c8 ("arch, x86: pmem api for ensuring durability of
persistent memory updates")
...the pmem_rw_page() path was missed.
Cc: <stable@vger.kernel.org>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Always take device_lock() before nvdimm_bus_lock() to prevent deadlock.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Always take device_lock() before nvdimm_bus_lock() to prevent deadlock.
Cc: <stable@vger.kernel.org>
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Commit 6894258eda2f reversed the order of gfp_flags adjustment in
dma_alloc_attrs() for x86 [arch/x86/kernel/pci-dma.c] As a result,
relevant flags set by dma_alloc_coherent_gfp_flags() are just
discarded and cause coherent DMA memory allocation failure on some
devices.
Fixes: 6894258eda2f ("dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20150914073834.GA13077@xzibit.linux.bs1.fc.nec.co.jp
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This patch removes config option of KVM_ARM_MAX_VCPUS,
and like other ARCHs, just choose the maximum allowed
value from hardware, and follows the reasons:
1) from distribution view, the option has to be
defined as the max allowed value because it need to
meet all kinds of virtulization applications and
need to support most of SoCs;
2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because kvm_vpu structure isn't allocated until request
of creating VCPU is sent from QEMU;
3) the main effect is that the field of vcpus[] in 'struct kvm'
becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
lines to hold the structure, but 'struct kvm' is one generic struct,
and it has worked well on other ARCHs already in this way. Also,
the world switch frequecy is often low, for example, it is ~2000
when running kernel building load in VM from APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.
Cc: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Although the ThumbEE registers and traps were present in earlier
versions of the v8 architecture, it was retrospectively removed and so
we can do the same.
Whilst this breaks migrating a guest started on a previous version of
the kernel, it is much better to kill these (non existent) registers
as soon as possible.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[maz: added commend about migration]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Cc: stable@vger.kernel.org
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: linux-api@vger.kernel.org
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
CC: linux-s390@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
This config option is completely irrelevant for zfcpdump and
unfortunately causes a kernel panic on recent kernels in
"mspro_block_init()/driver_register()".
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The scaled cputime is supposed to be derived from the normal per-thread
cputime by dividing it with the average thread density in the last interval.
The calculation of the scaling values for the average thread density is
incorrect. The current, incorrect calculation:
Ci = cycle count with i active threads
T = unscaled cputime, sT = scaled cputime
sT = T * (C1 + C2 + ... + Cn) / (1*C1 + 2*C2 + ... + n*Cn)
The calculation happens to yield the correct numbers for the simple cases
with only one Ci value not zero. But for cases with multiple Ci values not
zero it fails. E.g. on a SMT-2 system with one thread active half the time
and two threads active for the other half of the time it fails, the scaling
factor should be 3/4 but the formula gives 2/3.
The correct formula is
sT = T * (C1/1 + C2/2 + ... + Cn/n) / (C1 + C2 + ... + Cn)
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Previously, the cpum_cf PMU returned -EPERM if a counter is requested and
the counter set to which the counter belongs is not authorized. According
to the perf_event_open() system call manual, an error code of EPERM indicates
an unsupported exclude setting or CAP_SYS_ADMIN is missing.
Use ENOENT to indicate that particular counters are not available when the
counter set which contains the counter is not authorized. For generic events,
this might trigger a fall back, for example, to a software event.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The uc_sigmask in the ucontext structure is an array of words to keep
the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t
only has 64 bits).
For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit
there are two 4 byte words. The compat signal handler code uses a
simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t.
As s390 is a big-endian architecture this is incorrect, the two words
in the 31 bit sigset_t array need to be swapped.
Cc: <stable@vger.kernel.org>
Reported-by: Stefan Liebler <stli@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The critical section cleanup code misses to add the offset of the
thread_struct to the task address.
Therefore, if the critical section code gets executed, it may corrupt
the task struct or restore the contents of the floating point registers
from the wrong memory location.
Fixes d0164ee20d "s390/kernel: remove save_fpu_regs() parameter and use
__LC_CURRENT instead".
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|