Age | Commit message (Collapse) | Author | Files | Lines |
|
It seems like Fedora 34 ends up enabling a few new gcc warnings, notably
"-Wstringop-overread" and "-Warray-parameter".
Both of them cause what seem to be valid warnings in the kernel, where
we have array size mismatches in function arguments (that are no longer
just silently converted to a pointer to element, but actually checked).
This fixes most of the trivial ones, by making the function declaration
match the function definition, and in the case of intel_pm.c, removing
the over-specified array size from the argument declaration.
At least one 'stringop-overread' warning remains in the i915 driver, but
that one doesn't have the same obvious trivial fix, and may or may not
actually be indicative of a bug.
[ It was a mistake to upgrade one of my machines to Fedora 34 while
being busy with the merge window, but if this is the extent of the
compiler upgrade problems, things are better than usual - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The Kconfig dependency is incomplete since DRM_I915_GVT is a 'bool'
symbol that depends on the 'tristate' VFIO_MDEV. This allows a
configuration with VFIO_MDEV=m, DRM_I915_GVT=y and DRM_I915=y that
causes a link failure:
x86_64-linux-ld: drivers/gpu/drm/i915/gvt/gvt.o: in function `available_instances_show':
gvt.c:(.text+0x67a): undefined reference to `mtype_get_parent_dev'
x86_64-linux-ld: gvt.c:(.text+0x6a5): undefined reference to `mtype_get_type_group_id'
x86_64-linux-ld: drivers/gpu/drm/i915/gvt/gvt.o: in function `description_show':
gvt.c:(.text+0x76e): undefined reference to `mtype_get_parent_dev'
x86_64-linux-ld: gvt.c:(.text+0x799): undefined reference to `mtype_get_type_group_id'
Clarify the dependency by specifically disallowing the broken
configuration. If VFIO_MDEV is built-in, it will work, but if
VFIO_MDEV=m, the i915 driver cannot be built-in here.
Fixes: 07e543f4f9d1 ("vfio/gvt: Make DRM_I915_GVT depend on VFIO_MDEV")
Fixes: 9169cff168ff ("vfio/mdev: Correct the function signatures for the mdev_type_attributes")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Message-Id: <20210422133547.1861063-1-arnd@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Harald Arnesen reported [1] a deadlock at reboot time, and after
he captured a stack trace a picture developed of what's going on:
The distribution he's using is using iwd (not wpa_supplicant) to
manage wireless. iwd will usually use the "socket owner" option
when it creates new interfaces, so that they're automatically
destroyed when it quits (unexpectedly or otherwise). This is also
done by wpa_supplicant, but it doesn't do it for the normal one,
only for additional ones, which is different with iwd.
Anyway, during shutdown, iwd quits while the netdev is still UP,
i.e. IFF_UP is set. This causes the stack trace that Linus so
nicely transcribed from the pictures:
cfg80211_destroy_iface_wk() takes wiphy_lock
-> cfg80211_destroy_ifaces()
->ieee80211_del_iface
->ieeee80211_if_remove
->cfg80211_unregister_wdev
->unregister_netdevice_queue
->dev_close_many
->__dev_close_many
->raw_notifier_call_chain
->cfg80211_netdev_notifier_call
and that last call tries to take wiphy_lock again.
In commit a05829a7222e ("cfg80211: avoid holding the RTNL when
calling the driver") I had taken into account the possibility of
recursing from cfg80211 into cfg80211_netdev_notifier_call() via
the network stack, but only for NETDEV_UNREGISTER, not for what
happens here, NETDEV_GOING_DOWN and NETDEV_DOWN notifications.
Additionally, while this worked still back in commit 78f22b6a3a92
("cfg80211: allow userspace to take ownership of interfaces"), it
missed another corner case: unregistering a netdev will cause
dev_close() to be called, and thus stop wireless operations (e.g.
disconnecting), but there are some types of virtual interfaces in
wifi that don't have a netdev - for that we need an additional
call to cfg80211_leave().
So, to fix this mess, change cfg80211_destroy_ifaces() to not
require the wiphy_lock(), but instead make it acquire it, but
only after it has actually closed all the netdevs on the list,
and then call cfg80211_leave() as well before removing them
from the driver, to fix the second issue. The locking change in
this requires modifying the nl80211 call to not get the wiphy
lock passed in, but acquire it by itself after flushing any
potentially pending destruction requests.
[1] https://lore.kernel.org/r/09464e67-f3de-ac09-28a3-e27b7914ee7d@skogtun.org
Cc: stable@vger.kernel.org # 5.12
Reported-by: Harald Arnesen <harald@skogtun.org>
Fixes: 776a39b8196d ("cfg80211: call cfg80211_destroy_ifaces() with wiphy lock held")
Fixes: 78f22b6a3a92 ("cfg80211: allow userspace to take ownership of interfaces")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Now that we have multishot poll requests, one SQE can emit multiple
CQEs. given below example:
sqe0(multishot poll)-->sqe1-->sqe2(drain req)
sqe2 is designed to issue after sqe0 and sqe1 completed, but since sqe0
is a multishot poll request, sqe2 may be issued after sqe0's event
triggered twice before sqe1 completed. This isn't what users leverage
drain requests for.
Here the solution is to wait for multishot poll requests fully
completed.
To achieve this, we should reconsider the req_need_defer equation, the
original one is:
all_sqes(excluding dropped ones) == all_cqes(including dropped ones)
This means we issue a drain request when all the previous submitted
SQEs have generated their CQEs.
Now we should consider multishot requests, we deduct all the multishot
CQEs except the cancellation one, In this way a multishot poll request
behave like a normal request, so:
all_sqes == all_cqes - multishot_cqes(except cancellations)
Here we introduce cq_extra for it.
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/1618298439-136286-1-git-send-email-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
syzkaller identified KASAN: null-ptr-deref Write in
io_uring_cancel_sqpoll.
io_uring_cancel_sqpoll is called by io_sq_thread before calling
io_uring_alloc_task_context. This leads to current->io_uring being NULL.
io_uring_cancel_sqpoll should not have to deal with threads where
current->io_uring is NULL.
In order to cast a wider safety net, perform input sanitisation directly
in io_uring_cancel_sqpoll and return for NULL value of current->io_uring.
This is safe since if current->io_uring isn't set, then there's no way
for the task to have submitted any requests.
Reported-by: syzbot+be51ca5a4d97f017cd50@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Palash Oswal <hello@oswalpalash.com>
Link: https://lore.kernel.org/r/20210427125148.21816-1-hello@oswalpalash.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When directory iterate and lookup is called, there's a buggy rewinding
of start point for traversing cluster chain to the parent directory
entry's first cluster. This caused repeated cluster chain traversing
from the first entry of the parent directory that would show worse
performance if huge amounts of files exist under the parent directory.
Fix not to rewind, make continue from currently referenced cluster and
dir entry.
Tested with 50,000 files under single directory / 256GB sdcard,
with command "time ls -l > /dev/null",
Before : 0m08.69s real 0m00.27s user 0m05.91s system
After : 0m07.01s real 0m00.25s user 0m04.34s system
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
|
|
Degradation of write speed caused by frequent disk access for cluster
bitmap update on every cluster allocation could be improved by
selective syncing bitmap buffer. Change to flush bitmap buffer only
for the directory related operations.
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
|
|
Add FITRIM ioctl to enable discarding unused blocks while mounted.
As current exFAT doesn't have generic ioctl handler, add empty ioctl
function first, and add FITRIM handler.
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
|
|
s_lock which is for protecting concurrent access of file operations is
too huge for cluster bitmap protection, so introduce a new bitmap_lock
to narrow the lock range if only need to access cluster bitmap.
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
|
|
If mounted with discard option, exFAT issues discard command when clear
cluster bit to remove file. But the input parameter of cluster-to-sector
calculation is abnormally added by reserved cluster size which is 2,
leading to discard unrelated sectors included in target+2 cluster.
With fixing this, remove the wrong comments in set/clear/find bitmap
functions.
Fixes: 1e49a94cf707 ("exfat: add bitmap operations")
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
|
|
Fix some miscellaneous things in the new netfs lib[1]:
(1) The kerneldoc for netfs_readpage() shouldn't say netfs_page().
(2) netfs_readpage() can get an integer overflow on 32-bit when it
multiplies page_index(page) by PAGE_SIZE. It should use
page_file_offset() instead.
(3) netfs_write_begin() should use page_offset() to avoid the same
overflow.
Note that netfs_readpage() needs to use page_file_offset() rather than
page_offset() as it may see swap-over-NFS.
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/161789062190.6155.12711584466338493050.stgit@warthog.procyon.org.uk/ [1]
|
|
Fix four things[1] in the patch that adds ITER_XARRAY[2]:
(1) Remove the address_space struct predeclaration. This is a holdover
from when it was ITER_MAPPING.
(2) Fix _copy_mc_to_iter() so that the xarray segment updates count and
iov_offset in the iterator before returning.
(3) Fix iov_iter_alignment() to not loop in the xarray case. Because the
middle pages are all whole pages, only the end pages need be
considered - and this can be reduced to just looking at the start
position in the xarray and the iteration size.
(4) Fix iov_iter_advance() to limit the size of the advance to no more
than the remaining iteration size.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Link: https://lore.kernel.org/r/YIVrJT8GwLI0Wlgx@zeniv-ca.linux.org.uk [1]
Link: https://lore.kernel.org/r/161918448151.3145707.11541538916600921083.stgit@warthog.procyon.org.uk [2]
|
|
Uninitialized local variable "elf_info" would be passed to
kexec_free_elf_info() if kexec_build_elf_info() returns an error
in elf64_load().
If kexec_build_elf_info() returns an error, return the error
immediately.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421163610.23775-2-nramas@linux.microsoft.com
|
|
There are a few "goto out;" statements before the local variable "fdt"
is initialized through the call to of_kexec_alloc_and_setup_fdt() in
elf64_load(). This will result in an uninitialized "fdt" being passed
to kvfree() in this function if there is an error before the call to
of_kexec_alloc_and_setup_fdt().
If there is any error after fdt is allocated, but before it is
saved in the arch specific kimage struct, free the fdt.
Fixes: 3c985d31ad66 ("powerpc: Use common of_kexec_alloc_and_setup_fdt()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421163610.23775-1-nramas@linux.microsoft.com
|
|
Commit d1f044103dad ("certs: Add ability to preload revocation certs")
created a new generated file for revocation certs, but didn't tell git
to ignore it. Thus causing unnecessary "git status" noise after a
kernel build with CONFIG_SYSTEM_REVOCATION_LIST enabled.
Add the proper gitignore magic.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Now we support sharing one page if PAGE_SIZE is not equal stripe size. To
support this, it needs to support calculating xor value with different
offsets for each r5dev. One offset array is used to record those offsets.
In RMW mode, parity page is used as a source page. It sets
ASYNC_TX_XOR_DROP_DST before calculating xor value in ops_run_prexor5.
So it needs to add src_list and src_offs at the same time. Now it only
needs src_list. So the xor value which is calculated is wrong. It can
cause data corruption problem.
I can reproduce this problem 100% on a POWER8 machine. The steps are:
mdadm -CR /dev/md0 -l5 -n3 /dev/sdb1 /dev/sdc1 /dev/sdd1 --size=3G
mkfs.xfs /dev/md0
mount /dev/md0 /mnt/test
mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
Fixes: 29bcff787a25 ("md/raid5: add new xor function to support different page offset")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
In null_init, null_add_dev(dev) is called.
In null_add_dev, it calls null_free_zoned_dev(dev) to free dev->zones
via kvfree(dev->zones) in out_cleanup_zone branch and returns err.
Then null_init accept the err code and then calls null_free_dev(dev).
But in null_free_dev(dev), dev->zones is freed again by
null_free_zoned_dev().
My patch set dev->zones to NULL in null_free_zoned_dev() after
kvfree(dev->zones) is called, to avoid the double free.
Fixes: 2984c8684f962 ("nullb: factor disk parameters")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Link: https://lore.kernel.org/r/20210426143229.7374-1-lyl2019@mail.ustc.edu.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_import_fixed() doesn't expect a registered buffer slot to be NULL and
would fail stumbling on it. We don't allow it, but if during
__io_sqe_buffers_update() rsrc removal succeeds but following register
fails, we'll get such a situation.
Do it atomically and don't remove buffers until we sure that a new one
can be set.
Fixes: 634d00df5e1cf ("io_uring: add full-fledged dynamic buffers support")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/830020f9c387acddd51962a3123b5566571b8c6d.1619446608.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add IPCC compatible for SC7280 SoC.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
All sqpoll rings (even sharing sqpoll task) are currently dead bound
to the task that created them, iow when owner task dies it kills all
its SQPOLL rings and their inflight requests via task_work infra. It's
neither the nicist way nor the most convenient as adds extra
locking/waiting and dependencies.
Leave it alone and rely on SIGKILL being delivered on its thread group
exit, so there are only two cases left:
1) thread group is dying, so sqpoll task gets a signal and exit itself
cancelling all requests.
2) an sqpoll ring is dying. Because refs_kill() is called the sqpoll not
going to submit any new request, and that's what we need. And
io_ring_exit_work() will do all the cancellation itself before
actually killing ctx, so sqpoll doesn't need to worry about it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3cd7f166b9c326a2c932b70e71a655b03257b366.1619389911.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
After closing an SQPOLL ring, io_ring_exit_work() kicks in and starts
doing cancellations via io_uring_try_cancel_requests(). It will go
through io_uring_try_cancel_iowq(), which uses ctx->tctx_list, but as
SQPOLL task don't have a ctx note, its io-wq won't be reachable and so
is left not cancelled.
It will eventually cancelled when one of the tasks dies, but if a thread
group survives for long and changes rings, it will spawn lots of
unreclaimed resources and live locked works.
Cancel SQPOLL task's io-wq separately in io_ring_exit_work().
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a71a7fe345135d684025bb529d5cb1d8d6b46e10.1619389911.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The variable up.resv is not initialized and is being checking for a
non-zero value in the call to _io_register_rsrc_update. Fix this by
explicitly setting the variable to 0.
Addresses-Coverity: ("Uninitialized scalar variable)"
Fixes: c3bdad027183 ("io_uring: add generic rsrc update with tags")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210426094735.8320-1-colin.king@canonical.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now we allocate io_mapped_ubuf instead of bvec, so we clearly have to
check its address after allocation.
Fixes: 41edf1a5ec967 ("io_uring: keep table of pointers to ubufs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d28eb1bc4384284f69dbce35b9f70c115ff6176f.1619392565.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ioc_adjust_base_vrate() ignored vrate_min when rq_wait_pct indicates that
there is QD contention. The reasoning was that QD depletion always reliably
indicates device saturation and thus it's safe to override user specified
vrate_min. However, this sometimes leads to unnecessary throttling,
especially on really fast devices, because vrate adjustments have delays and
inertia. It also confuses users because the behavior violates the explicitly
specified configuration.
This patch drops the special case handling so that vrate_min is always
applied.
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/YIIo1HuyNmhDeiNx@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In command queueing mode, the cache isn't flushed via the mmc_flush_cache()
function, but instead by issuing a CMDQ_TASK_MGMT (CMD48) with a
FLUSH_CACHE opcode. In this path, we need to check if cache has been
enabled, before deciding to flush the cache, along the lines of what's
being done in mmc_flush_cache().
To fix this problem, let's add a new bus ops callback ->cache_enabled() and
implement it for the mmc bus type. In this way, the mmc block device driver
can call it to know whether cache flushing should be done.
Fixes: 1e8e55b67030 (mmc: block: Add CQE support)
Cc: stable@vger.kernel.org
Reported-by: Brendan Peter <bpeter@lytx.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Tested-by: Brendan Peter <bpeter@lytx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20210425060207.2591-2-avri.altman@wdc.com
Link: https://lore.kernel.org/r/20210425060207.2591-3-avri.altman@wdc.com
[Ulf: Squashed the two patches and made some minor updates]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
A minor cleanup to address a clang warning removed an assigned
but unused local variable, but this now caused a gcc warning as
kfifo_out() is annotated to require checking its return code:
In file included from drivers/memstick/host/r592.h:13,
from drivers/memstick/host/r592.c:21:
drivers/memstick/host/r592.c: In function 'r592_flush_fifo_write':
include/linux/kfifo.h:588:1: error: ignoring return value of '__kfifo_uint_must_check_helper' declared with attribute 'warn_unused_result' [-Werror=unused-result]
588 | __kfifo_uint_must_check_helper( \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
589 | ({ \
| ~~~~
590 | typeof((fifo) + 1) __tmp = (fifo); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
591 | typeof(__tmp->ptr) __buf = (buf); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
592 | unsigned long __n = (n); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
593 | const size_t __recsize = sizeof(*__tmp->rectype); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
594 | struct __kfifo *__kfifo = &__tmp->kfifo; \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
595 | (__recsize) ?\
| ~~~~~~~~~~~~~~
596 | __kfifo_out_r(__kfifo, __buf, __n, __recsize) : \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
597 | __kfifo_out(__kfifo, __buf, __n); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
598 | }) \
| ~~~~
599 | )
| ~
drivers/memstick/host/r592.c:367:9: note: in expansion of macro 'kfifo_out'
367 | kfifo_out(&dev->pio_fifo, buffer, 4);
| ^~~~~~~~~
The value was never checked here, and the purpose of the function
is only to flush the contents, so restore the old behavior but
add a cast to void and a comment, which hopefully warns with neither
gcc nor clang now.
If anyone has an idea for how to fix it without ignoring the return
code, that is probably better.
Fixes: 4b00ed3c5072 ("memstick: r592: remove unused variable")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210421135215.3414589-1-arnd@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
BUG_ON uses unlikely in if(), it can be optimized at compile time.
Usually, the condition in if() is not satisfied. In my opinion,
this can improve the efficiency of the multi-stage pipeline.
Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
To 2.32
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In some cases readahead of more than the read size can help
(to allow parallel i/o of read ahead which can improve performance).
Ceph introduced a mount parameter "rasize" to allow controlling this.
Add mount parameter "rasize" to allow control of amount of readahead
requested of the server. If rasize not set, rasize defaults to
negotiated rsize as before.
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
For servers which don't support copy_range (SMB3 CopyChunk), the
logging of:
CIFS: VFS: \\server\share refcpy ioctl error -95 getting resume key
can fill the client logs and make debugging real problems more
difficult. Change the -EOPNOTSUPP on copy_range to a "warn once"
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
cifs_smb3_do_mount() calls smb3_fs_context_dup() and then
cifs_setup_volume_info(). The latter's subsequent smb3_parse_devname()
call overwrites the cifs_sb->ctx->UNC string already dup'ed by
smb3_fs_context_dup(), resulting in a leak. E.g.
unreferenced object 0xffff888002980420 (size 32):
comm "mount", pid 160, jiffies 4294892541 (age 30.416s)
hex dump (first 32 bytes):
5c 5c 31 39 32 2e 31 36 38 2e 31 37 34 2e 31 30 \\192.168.174.10
34 5c 72 61 70 69 64 6f 2d 73 68 61 72 65 00 00 4\rapido-share..
backtrace:
[<00000000069e12f6>] kstrdup+0x28/0x50
[<00000000b61f4032>] smb3_fs_context_dup+0x127/0x1d0 [cifs]
[<00000000c6e3e3bf>] cifs_smb3_do_mount+0x77/0x660 [cifs]
[<0000000063467a6b>] smb3_get_tree+0xdf/0x220 [cifs]
[<00000000716f731e>] vfs_get_tree+0x1b/0x90
[<00000000491d3892>] path_mount+0x62a/0x910
[<0000000046b2e774>] do_mount+0x50/0x70
[<00000000ca7b64dd>] __x64_sys_mount+0x81/0xd0
[<00000000b5122496>] do_syscall_64+0x33/0x40
[<000000002dd397af>] entry_SYSCALL_64_after_hwframe+0x44/0xae
This change is a bandaid until the cifs_setup_volume_info() TODO and
error handling issues are resolved.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
CC: <stable@vger.kernel.org> # v5.11+
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
pfid is being set to tcon->crfid.fid and they are copied in each other
multiple times. Remove the memcopy between same pointers - memory
locations.
Addresses-Coverity: ("Overlapped copy")
Fixes: 9e81e8ff74b9 ("cifs: return cached_fid from open_shroot")
Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Avoid a warning if the error percolates back up:
[440700.376476] CIFS VFS: \\otters.example.com crypt_message: Could not get encryption key
[440700.386947] ------------[ cut here ]------------
[440700.386948] err = 1
[440700.386977] WARNING: CPU: 11 PID: 2733 at /build/linux-hwe-5.4-p6lk6L/linux-hwe-5.4-5.4.0/lib/errseq.c:74 errseq_set+0x5c/0x70
...
[440700.397304] CPU: 11 PID: 2733 Comm: tar Tainted: G OE 5.4.0-70-generic #78~18.04.1-Ubuntu
...
[440700.397334] Call Trace:
[440700.397346] __filemap_set_wb_err+0x1a/0x70
[440700.397419] cifs_writepages+0x9c7/0xb30 [cifs]
[440700.397426] do_writepages+0x4b/0xe0
[440700.397444] __filemap_fdatawrite_range+0xcb/0x100
[440700.397455] filemap_write_and_wait+0x42/0xa0
[440700.397486] cifs_setattr+0x68b/0xf30 [cifs]
[440700.397493] notify_change+0x358/0x4a0
[440700.397500] utimes_common+0xe9/0x1c0
[440700.397510] do_utimes+0xc5/0x150
[440700.397520] __x64_sys_utimensat+0x88/0xd0
Fixes: 61cfac6f267d ("CIFS: Fix possible use after free in demultiplex thread")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
CC: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
If smb3_notify() is called at mount point of CIFS, build_path_from_dentry()
returns the pointer to kmalloc-ed memory with terminating zero (this is
empty FileName to be passed to SMB2 CREATE request). This pointer is assigned
to the `path` variable.
Then `path + 1` (to skip first backslash symbol) is passed to
cifs_convert_path_to_utf16(). This is incorrect for empty path and causes
out-of-bound memory access.
Get rid of this "increase by one". cifs_convert_path_to_utf16() already
contains the check for leading backslash in the path.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212693
CC: <stable@vger.kernel.org> # v5.6+
Signed-off-by: Eugene Korenevsky <ekorenevsky@astralinux.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
* rqst[1,2,3] is allocated in vars
* each rqst->rq_iov is also allocated in vars or using pooled memory
SMB2_open_free, SMB2_ioctl_free, SMB2_query_info_free are iterating on
each rqst after vars has been freed (use-after-free), and they are
freeing the kvec a second time (double-free).
How to trigger:
* compile with KASAN
* mount a share
$ smbinfo quota /mnt/foo
Segmentation fault
$ dmesg
==================================================================
BUG: KASAN: use-after-free in SMB2_open_free+0x1c/0xa0
Read of size 8 at addr ffff888007b10c00 by task python3/1200
CPU: 2 PID: 1200 Comm: python3 Not tainted 5.12.0-rc6+ #107
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
Call Trace:
dump_stack+0x93/0xc2
print_address_description.constprop.0+0x18/0x130
? SMB2_open_free+0x1c/0xa0
? SMB2_open_free+0x1c/0xa0
kasan_report.cold+0x7f/0x111
? smb2_ioctl_query_info+0x240/0x990
? SMB2_open_free+0x1c/0xa0
SMB2_open_free+0x1c/0xa0
smb2_ioctl_query_info+0x2bf/0x990
? smb2_query_reparse_tag+0x600/0x600
? cifs_mapchar+0x250/0x250
? rcu_read_lock_sched_held+0x3f/0x70
? cifs_strndup_to_utf16+0x12c/0x1c0
? rwlock_bug.part.0+0x60/0x60
? rcu_read_lock_sched_held+0x3f/0x70
? cifs_convert_path_to_utf16+0xf8/0x140
? smb2_check_message+0x6f0/0x6f0
cifs_ioctl+0xf18/0x16b0
? smb2_query_reparse_tag+0x600/0x600
? cifs_readdir+0x1800/0x1800
? selinux_bprm_creds_for_exec+0x4d0/0x4d0
? do_user_addr_fault+0x30b/0x950
? __x64_sys_openat+0xce/0x140
__x64_sys_ioctl+0xb9/0xf0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fdcf1f4ba87
Code: b3 66 90 48 8b 05 11 14 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 13 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffef1ce7748 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000c018cf07 RCX: 00007fdcf1f4ba87
RDX: 0000564c467c5590 RSI: 00000000c018cf07 RDI: 0000000000000003
RBP: 00007ffef1ce7770 R08: 00007ffef1ce7420 R09: 00007fdcf0e0562b
R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000004018
R13: 0000000000000001 R14: 0000000000000003 R15: 0000564c467c5590
Allocated by task 1200:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7a/0x90
smb2_ioctl_query_info+0x10e/0x990
cifs_ioctl+0xf18/0x16b0
__x64_sys_ioctl+0xb9/0xf0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 1200:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0xe5/0x110
slab_free_freelist_hook+0x53/0x130
kfree+0xcc/0x320
smb2_ioctl_query_info+0x2ad/0x990
cifs_ioctl+0xf18/0x16b0
__x64_sys_ioctl+0xb9/0xf0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff888007b10c00
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 0 bytes inside of
512-byte region [ffff888007b10c00, ffff888007b10e00)
The buggy address belongs to the page:
page:0000000044e14b75 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7b10
head:0000000044e14b75 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 ffffea000015f500 0000000400000004 ffff888001042c80
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888007b10b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888007b10b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888007b10c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888007b10c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888007b10d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Can aid in making mount problems easier to diagnose
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This makes the errors accessible from userspace via dmesg and
the fs_context fd.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Add fs_context param to parsing helpers to be able to log into it in
next patch.
Make some helper static as they are not used outside of fs_context.c
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This new helper will be used in the fs_context mount option parsing
code. It log errors both in:
* the fs_context log queue for userspace to read
* kernel printk buffer (dmesg, old behaviour)
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Emulated via server side copy and setsize for
SMB3 and later. In the future we could compound
this (and/or optionally use DUPLICATE_EXTENTS
if supported by the server).
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Emulated for SMB3 and later via server side copy
and setsize. Eventually this could be compounded.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Improves directory metadata caching
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Needed for the final patch in the directory caching series
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
and clear the timestamp when we receive a lease break.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Needed for subsequent patches in the directory caching
series.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
We need to hold both a reference for the root/superblock as well as the directory that we
are caching. We need to drop these references before we call kill_anon_sb().
At this point, the root and the cached dentries are always the same but this will change
once we start caching other directories as well.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
And use this to only allow to take out a shared handle once the mount has completed and the
sb becomes available.
This will become important in follow up patches where we will start holding a reference to the
directory dentry for the shared handle during the lifetime of the handle.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
These functions will eventually be used to cache any directory, not just the root
so change the names.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Move the check for the directory path into the open_shroot() function
but still fail for any non-root directories.
This is preparation for later when we will start using the cache also
for other directories than the root.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
instead of doing it in the callsites for open_shroot.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|