Age | Commit message (Collapse) | Author | Files | Lines |
|
It's possible for a request to invalidate a fscache_cookie will come in
while we're already processing an invalidation. If that happens we
currently take an extra access reference that will leak. Only call
__fscache_begin_cookie_access if the FSCACHE_COOKIE_DO_INVALIDATE bit
was previously clear.
Also, ensure that we attempt to clear the bit when the cookie is
"FAILED" and put the reference to avoid an access leak.
Fixes: 85e4ea1049c7 ("fscache: Fix invalidation/lookup race")
Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when
ITER_BVEC had first appeared)...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... since April 2021
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... just shove it into one pipe_buffer.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
now that we are advancing the iterator, there's no need to
treat the first page separately - just call append_pipe()
in a loop.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
mechanical change; will be further massaged in subsequent commits
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
here nothing even looks at the iov_iter after the call, so we couldn't
care less whether it advances or not.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
that one is somewhat clumsier than usual and needs serious testing.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and adjust the callers
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and untangle the cleanup on failure to add into pipe.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... doing revert if we end up not using some pages
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Most of the users immediately follow successful iov_iter_get_pages()
with advancing by the amount it had returned.
Provide inline wrappers doing that, convert trivial open-coded
uses of those.
BTW, iov_iter_get_pages() never returns more than it had been asked
to; such checks in cifs ought to be removed someday...
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
All call sites of get_pages_array() are essenitally identical now.
Replace with common helper...
Returns number of slots available in resulting array or 0 on OOM;
it's up to the caller to make sure it doesn't ask to zero-entry
array (i.e. neither maxpages nor size are allowed to be zero).
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and don't mangle maxsize there - turn the loop into counting
one instead. Easier to see that we won't run out of array that
way. Note that special treatment of the partial buffer in that
thing is an artifact of the non-advancing semantics of
iov_iter_get_pages() - if not for that, it would be append_pipe(),
same as the body of the loop that follows it. IOW, once we make
iov_iter_get_pages() advancing, the whole thing will turn into
calculate how many pages do we want
allocate an array (if needed)
call append_pipe() that many times.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
same as for pipes and xarrays; after that iov_iter_get_pages() becomes
a wrapper for __iov_iter_get_pages_alloc().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
same as for pipes
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The differences between those two are
* pipe_get_pages() gets a non-NULL struct page ** value pointing to
preallocated array + array size.
* pipe_get_pages_alloc() gets an address of struct page ** variable that
contains NULL, allocates the array and (on success) stores its address in
that variable.
Not hard to combine - always pass struct page ***, have
the previous pipe_get_pages_alloc() caller pass ~0U as cap for
array size.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
zero maxpages is bogus, but best treated as "just return 0";
NULL pages, OTOH, should be treated as a hard bug.
get rid of now completely useless checks in xarray_get_pages{,_alloc}().
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Incidentally, ITER_XARRAY did *not* free the sucker in case when
iter_xarray_populate_pages() returned 0...
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
All their callers are next to each other; all of them
want the total amount of pages and, possibly, the
offset in the partial final buffer.
Combine into a new helper (pipe_npages()), fix the
bogosity in pipe_space_for_user(), while we are at it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We often need to find whether the last buffer is anon or not, and
currently it's rather clumsy:
check if ->iov_offset is non-zero (i.e. that pipe is not empty)
if so, get the corresponding pipe_buffer and check its ->ops
if it's &default_pipe_buf_ops, we have an anon buffer.
Let's replace the use of ->iov_offset (which is nowhere near similar to
its role for other flavours) with signed field (->last_offset), with
the following rules:
empty, no buffers occupied: 0
anon, with bytes up to N-1 filled: N
zero-copy, with bytes up to N-1 filled: -N
That way abs(i->last_offset) is equal to what used to be in i->iov_offset
and empty vs. anon vs. zero-copy can be distinguished by the sign of
i->last_offset.
Checks for "should we extend the last buffer or should we start
a new one?" become easier to follow that way.
Note that most of the operations can only be done in a sane
state - i.e. when the pipe has nothing past the current position of
iterator. About the only thing that could be done outside of that
state is iov_iter_advance(), which transitions to the sane state by
truncating the pipe. There are only two cases where we leave the
sane state:
1) iov_iter_get_pages()/iov_iter_get_pages_alloc(). Will be
dealt with later, when we make get_pages advancing - the callers are
actually happier that way.
2) iov_iter copied, then something is put into the copy. Since
they share the underlying pipe, the original gets behind. When we
decide that we are done with the copy (original is not usable until then)
we advance the original. direct_io used to be done that way; nowadays
it operates on the original and we do iov_iter_revert() to discard
the excessive data. At the moment there's nothing in the kernel that
could do that to ITER_PIPE iterators, so this reason for insane state
is theoretical right now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Fold pipe_truncate() into it, clean up. We can release buffers
in the same loop where we walk backwards to the iterator beginning
looking for the place where the new position will be.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
instead of setting ->iov_offset for new position and calling
pipe_truncate() to adjust ->len of the last buffer and discard
everything after it, adjust ->len at the same time we set ->iov_offset
and use pipe_discard_from() to deal with buffers past that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
it's only used to get to the partial buffer we can add to,
and that's always the last one, i.e. pipe->head - 1.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Expand the only remaining call of push_pipe() (in
__pipe_get_pages()), combine it with the page-collecting loop there.
Note that the only reason it's not a loop doing append_pipe() is
that append_pipe() is advancing, while iov_iter_get_pages() is not.
As soon as it switches to saner semantics, this thing will switch
to using append_pipe().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
New helper: append_pipe(). Extends the last buffer if possible,
allocates a new one otherwise. Returns page and offset in it
on success, NULL on failure. iov_iter is advanced past the
data we've got.
Use that instead of push_pipe() in copy-to-pipe primitives;
they get simpler that way. Handling of short copy (in "mc" one)
is done simply by iov_iter_revert() - iov_iter is in consistent
state after that one, so we can use that.
[Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in]
[another braino fix, this time in copy_pipe_to_iter() and pipe_zero();
caught by testcase from Hugh Dickins]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
There are only two kinds of pipe_buffer in the area used by ITER_PIPE.
1) anonymous - copy_to_iter() et.al. end up creating those and copying
data there. They have zero ->offset, and their ->ops points to
default_pipe_page_ops.
2) zero-copy ones - those come from copy_page_to_iter(), and page
comes from caller. ->offset is also caller-supplied - it might be
non-zero. ->ops points to page_cache_pipe_buf_ops.
Move creation and insertion of those into helpers - push_anon(pipe, size)
and push_page(pipe, page, offset, size) resp., separating them from
the "could we avoid creating a new buffer by merging with the current
head?" logics.
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
pipe_buffer instances of a pipe are organized as a ring buffer,
with power-of-2 size. Indices are kept *not* reduced modulo ring
size, so the buffer refered to by index N is
pipe->bufs[N & (pipe->ring_size - 1)].
Ring size can change over the lifetime of a pipe, but not while
the pipe is locked. So for any iov_iter primitives it's a constant.
Original conversion of pipes to this layout went overboard trying
to microoptimize that - calculating pipe->ring_size - 1, storing
it in a local variable and using through the function. In some
cases it might be warranted, but most of the times it only
obfuscates what's going on in there.
Introduce a helper (pipe_buf(pipe, N)) that would encapsulate
that and use it in the obvious cases. More will follow...
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother
with rather non-obvious use of iov_iter_advance() in there.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(),
checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC
ones.
We are going to expose the things like ->write_iter() et.al. to those
in subsequent commits.
New predicate (user_backed_iter()) that is true for ITER_IOVEC and
ITER_UBUF; places like direct-IO handling should use that for
checking that pages we modify after getting them from iov_iter_get_pages()
would need to be dirtied.
DO NOT assume that replacing iter_is_iovec() with user_backed_iter()
will solve all problems - there's code that uses iter_is_iovec() to
decide how to poke around in iov_iter guts and for that the predicate
replacement obviously won't suffice.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
If something manages to set the maximum file size to MAX_OFFSET+1, this
can cause the xfs and ext4 filesystems at least to become corrupt.
Ordinarily, the kernel protects against userspace trying this by
checking the value early in the truncate() and ftruncate() system calls
calls - but there are at least two places that this check is bypassed:
(1) Cachefiles will round up the EOF of the backing file to DIO block
size so as to allow DIO on the final block - but this might push
the offset negative. It then calls notify_change(), but this
inadvertently bypasses the checking. This can be triggered if
someone puts an 8EiB-1 file on a server for someone else to try and
access by, say, nfs.
(2) ksmbd doesn't check the value it is given in set_end_of_file_info()
and then calls vfs_truncate() directly - which also bypasses the
check.
In both cases, it is potentially possible for a network filesystem to
cause a disk filesystem to be corrupted: cachefiles in the client's
cache filesystem; ksmbd in the server's filesystem.
nfsd is okay as it checks the value, but we can then remove this check
too.
Fix this by adding a check to inode_newsize_ok(), as called from
setattr_prepare(), thereby catching the issue as filesystems set up to
perform the truncate with minimal opportunity for bypassing the new
check.
Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
Fixes: f44158485826 ("cifsd: add file operations")
Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: stable@kernel.org
Acked-by: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Steve French <sfrench@samba.org>
cc: Hyunchul Lee <hyc.lee@gmail.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch removes the trailing white space in kernel/sysysctl.c
Signed-off-by: Fanjun Kong <bh1scw@gmail.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
[mcgrof: fix commit message subject]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
This patch fixes two coding style issues:
1. Clean up indentation, replace spaces with tab
2. Add space after ','
Signed-off-by: Fanjun Kong <bh1scw@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
There are two adjacent sysctl entries protected by the same
CONFIG_TREE_RCU config symbol. Merge them into a single block to
improve readability.
Use the more common "#ifdef" form while at it.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
devm_regulator_get_optional() API will return -ENODEV if the regulator was
not found. For the optional supplies CX, PX we should not fail in that case
but rather continue. So let's catch that error and continue silently if
those regulators are not found.
The commit 3f52d118f992 ("remoteproc: qcom_q6v5_pas: Deal silently with
optional px and cx regulators") was supposed to do the same but it missed
the fact that devm_regulator_get_optional() API returns -ENODEV when the
regulator was not found.
Cc: Abel Vesa <abel.vesa@linaro.org>
Fixes: 3f52d118f992 ("remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators")
Reported-by: Steev Klimaszewski <steev@kali.org>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220801053939.12556-1-manivannan.sadhasivam@linaro.org
|
|
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
The various functions contain a NULL check starting in v5.15.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
This reverts commit 4bf7fda4dce22214c70c49960b1b6438e6260b67.
It turns out that it was hopelessly naive to think that this would work,
considering that we've always done this. The first machine I actually
tested this on broke at bootup, getting to
Reached target cryptsetup.target - Local Encrypted Volumes.
and then hanging. It's unclear what actually fails, since there's a lot
else going on around that time (eg amdgpu probing also happens around
that same time, but it could be some other random init thing that didn't
complete earlier and just caused the boot to hang at that point).
The expectations that we should default to some unsafe and untested mode
seems entirely unfounded, and the belief that this wouldn't affect
modern systems is clearly entirely false. The machine in question is
about two years old, so it's not exactly shiny, but it's also not some
dusty old museum piece PDP-11 in a closet.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This reverts commit 6f5c672d17f583b081e283927f5040f726c54598.
This breaks normal crash dump when CPU0 is offline.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
This reverts commit 7d06fed77b7d8fc9f6cc41b4e3f2823d32532ad8.
This introduced vmem_mutex locking from vmem_map_4k_page()
function called from smp_reinit_ipl_cpu() with interrupts
disabled. While it is a pre-SMP early initcall no other CPUs
running in parallel nor other code taking vmem_mutex on this
boot stage - it still needs to be fixed.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
This reverts commit e409b7f19172a3c154de62de4baf32a2c25a375a.
Commit 7d06fed77b7d ("s390/smp: rework absolute lowcore access")
introduced mutex lock with interrupts disabled. This commit is
a follow-up that needs to be reverted as well.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
In the function s3fb_set_par(), the value of 'screen_size' is
calculated by the user input. If the user provides the improper value,
the value of 'screen_size' may larger than 'info->screen_size', which
may cause the following bug:
[ 54.083733] BUG: unable to handle page fault for address: ffffc90003000000
[ 54.083742] #PF: supervisor write access in kernel mode
[ 54.083744] #PF: error_code(0x0002) - not-present page
[ 54.083760] RIP: 0010:memset_orig+0x33/0xb0
[ 54.083782] Call Trace:
[ 54.083788] s3fb_set_par+0x1ec6/0x4040
[ 54.083806] fb_set_var+0x604/0xeb0
[ 54.083836] do_fb_ioctl+0x234/0x670
Fix the this by checking the value of 'screen_size' before memset_io().
Fixes: a268422de8bf ("fbdev driver for S3 Trio/Virge")
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
In the function arkfb_set_par(), the value of 'screen_size' is
calculated by the user input. If the user provides the improper value,
the value of 'screen_size' may larger than 'info->screen_size', which
may cause the following bug:
[ 659.399066] BUG: unable to handle page fault for address: ffffc90003000000
[ 659.399077] #PF: supervisor write access in kernel mode
[ 659.399079] #PF: error_code(0x0002) - not-present page
[ 659.399094] RIP: 0010:memset_orig+0x33/0xb0
[ 659.399116] Call Trace:
[ 659.399122] arkfb_set_par+0x143f/0x24c0
[ 659.399130] fb_set_var+0x604/0xeb0
[ 659.399161] do_fb_ioctl+0x234/0x670
[ 659.399189] fb_ioctl+0xdd/0x130
Fix the this by checking the value of 'screen_size' before memset_io().
Fixes: 681e14730c73 ("arkfb: new framebuffer driver for ARK Logic cards")
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
In the function vt8623fb_set_par(), the value of 'screen_size' is
calculated by the user input. If the user provides the improper value,
the value of 'screen_size' may larger than 'info->screen_size', which
may cause the following bug:
[ 583.339036] BUG: unable to handle page fault for address: ffffc90005000000
[ 583.339049] #PF: supervisor write access in kernel mode
[ 583.339052] #PF: error_code(0x0002) - not-present page
[ 583.339074] RIP: 0010:memset_orig+0x33/0xb0
[ 583.339110] Call Trace:
[ 583.339118] vt8623fb_set_par+0x11cd/0x21e0
[ 583.339146] fb_set_var+0x604/0xeb0
[ 583.339181] do_fb_ioctl+0x234/0x670
[ 583.339209] fb_ioctl+0xdd/0x130
Fix the this by checking the value of 'screen_size' before memset_io().
Fixes: 558b7bd86c32 ("vt8623fb: new framebuffer driver for VIA VT8623")
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
To 2.38
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
It is only used in transport.c.
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Rename generic mid functions to same style, i.e. without "cifs_"
prefix.
cifs_{init,destroy}_mids() -> {init,destroy}_mids()
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|