Age | Commit message (Collapse) | Author | Files | Lines |
|
EFI has a rather unique benefit that it has access to some limited
non-volatile storage, where the kernel can store a random seed. Read
that seed in EFISTUB and concatenate it with other seeds we wind up
passing onward to the kernel in the configuration table. This is
complementary to the current other two sources - previous bootloaders,
and the EFI RNG protocol.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
[ardb: check for non-NULL RNG protocol pointer, call GetVariable()
without buffer first to obtain the size]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
In anticipation of putting random seeds in EFI variables, it's important
that the random GUID namespace of variables remains hidden from
userspace. We accomplish this by not populating efivarfs with entries
from that GUID, as well as denying the creation of new ones in that
GUID.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Instead of blindly creating the EFI random seed configuration table if
the RNG protocol is implemented and works, check whether such a EFI
configuration table was provided by an earlier boot stage and if so,
concatenate the existing and the new seeds, leaving it up to the core
code to mix it in and credit it the way it sees fit.
This can be used for, e.g., systemd-boot, to pass an additional seed to
Linux in a way that can be consumed by the kernel very early. In that
case, the following definitions should be used to pass the seed to the
EFI stub:
struct linux_efi_random_seed {
u32 size; // of the 'seed' array in bytes
u8 seed[];
};
The memory for the struct must be allocated as EFI_ACPI_RECLAIM_MEMORY
pool memory, and the address of the struct in memory should be installed
as a EFI configuration table using the following GUID:
LINUX_EFI_RANDOM_SEED_TABLE_GUID 1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b
Note that doing so is safe even on kernels that were built without this
patch applied, but the seed will simply be overwritten with a seed
derived from the EFI RNG protocol, if available. The recommended seed
size is 32 bytes, and seeds larger than 512 bytes are considered
corrupted and ignored entirely.
In order to preserve forward secrecy, seeds from previous bootloaders
are memzero'd out, and in order to preserve memory, those older seeds
are also freed from memory. Freeing from memory without first memzeroing
is not safe to do, as it's possible that nothing else will ever
overwrite those pages used by EFI.
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
[ardb: incorporate Jason's followup changes to extend the maximum seed
size on the consumer end, memzero() it and drop a needless printk]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Print the CXL Error Log field as found in CXL Protocol Error Section.
The CXL RAS Capability structure will be reused by OS First Handling
and the duplication/appropriate placement will be addressed eventually.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Add support for decoding CXL Protocol Error Section as defined in UEFI 2.10
Section N.2.13.
Do the section decoding in a new cper_cxl.c file. This new file will be
used in the future for more CXL CPERs decode support. Add this to the
existing UEFI_CPER config.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
commit f4dc7fffa987 ("efi: libstub: unify initrd loading between
architectures") merge the first and the second parameters into a
struct without updating the kernel-doc. Let's fix it.
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The EFI runtime map code is only wired up on x86, which is the only
architecture that has a need for it in its implementation of kexec.
So let's move this code under arch/x86 and drop all references to it
from generic code. To ensure that the efi_runtime_map_init() is invoked
at the appropriate time use a 'sync' subsys_initcall() that will be
called right after the EFI initcall made from generic code where the
original invocation of efi_runtime_map_init() resided.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Dave Young <dyoung@redhat.com>
|
|
The current Kconfig logic for CONFIG_EFI_RUNTIME_MAPS does not convey
that without it, a kexec kernel is not able to boot in EFI mode at all.
So clarify this, and make the option only configurable via the menu
system if CONFIG_EXPERT is set.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Dave Young <dyoung@redhat.com>
|
|
By default, the efi-pstore backend hardcode the UEFI variable size
as 1024 bytes. The historical reasons for that were discussed by
Ard in threads [0][1]:
"there is some cargo cult from prehistoric EFI times going
on here, it seems. Or maybe just misinterpretation of the maximum
size for the variable *name* vs the variable itself.".
"OVMF has
OvmfPkg/OvmfPkgX64.dsc:
gEfiMdeModulePkgTokenSpaceGuid.PcdMaxVariableSize|0x2000
OvmfPkg/OvmfPkgX64.dsc:
gEfiMdeModulePkgTokenSpaceGuid.PcdMaxVariableSize|0x8400
where the first one is without secure boot and the second with secure
boot. Interestingly, the default is
gEfiMdeModulePkgTokenSpaceGuid.PcdMaxVariableSize|0x400
so this is probably where this 1k number comes from."
With that, and since there is not such a limit in the UEFI spec, we
have the confidence to hereby add a module parameter to enable advanced
users to change the UEFI record size for efi-pstore data collection,
this way allowing a much easier reading of the collected log, which
wouldn't be scattered anymore among many small files.
Through empirical analysis we observed that extreme low values (like 8
bytes) could eventually cause writing issues, so given that and the OVMF
default discussed, we limited the minimum value to 1024 bytes, which also
is still the default.
[0] https://lore.kernel.org/lkml/CAMj1kXF4UyRMh2Y_KakeNBHvkHhTtavASTAxXinDO1rhPe_wYg@mail.gmail.com/
[1] https://lore.kernel.org/lkml/CAMj1kXFy-2KddGu+dgebAdU9v2sindxVoiHLWuVhqYw+R=kqng@mail.gmail.com/
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Currently, the EFI_PARAVIRT flag is only used by Xen dom0 boot on x86,
even though other architectures also support pseudo-EFI boot, where the
core kernel is invoked directly and provided with a set of data tables
that resemble the ones constructed by the EFI stub, which never actually
runs in that case.
Let's fix this inconsistency, and always set this flag when booting dom0
via the EFI boot path. Note that Xen on x86 does not provide the EFI
memory map in this case, whereas other architectures do, so move the
associated EFI_PARAVIRT check into the x86 platform code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The EFI memory map is a description of the memory layout as provided by
the firmware, and only x86 manipulates it in various different ways for
its own memory bookkeeping. So let's move the memmap routines that are
only used by x86 into the x86 arch tree.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The EFI fake memmap support is specific to x86, which manipulates the
EFI memory map in various different ways after receiving it from the EFI
stub. On other architectures, we have managed to push back on this, and
the EFI memory map is kept pristine.
So let's move the fake memmap code into the x86 arch tree, where it
arguably belongs.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The initrd= command line loader can be useful for development, but it
was limited to loading files from the same file system as the loaded
kernel (and it didn't work on x86 mixed mode).
As both issues have been fixed, and the initrd= can now be used with
files residing on any simple file system exposed by the EFI firmware,
let's permit it to be enabled on RISC-V and LoongArch, which did not
support it up to this point.
Note that LoadFile2 remains the preferred option, as it is much simpler
to use and implement, but generic loaders (including the UEFI shell) may
not implement this so there, initrd= can now be used as well (if enabled
in the build)
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Now that we have support for calling protocols that need additional
marshalling for mixed mode, wire up the initrd command line loader.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Rework the EFI stub macro wrappers around protocol method calls and
other indirect calls in order to allow return types other than
efi_status_t. This means the widening should be conditional on whether
or not the return type is efi_status_t, and should be omitted otherwise.
Also, switch to _Generic() to implement the type based compile time
conditionals, which is more concise, and distinguishes between
efi_status_t and u64 properly.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Currently, the initrd= command line option to the EFI stub only supports
loading files that reside on the same volume as the loaded image, which
is not workable for loaders like GRUB that don't even implement the
volume abstraction (EFI_SIMPLE_FILE_SYSTEM_PROTOCOL), and load the
kernel from an anonymous buffer in memory. For this reason, another
method was devised that relies on the LoadFile2 protocol.
However, the command line loader is rather useful when using the UEFI
shell or other generic loaders that have no awareness of Linux specific
protocols so let's make it a bit more flexible, by permitting textual
device paths to be provided to initrd= as well, provided that they refer
to a file hosted on a EFI_SIMPLE_FILE_SYSTEM_PROTOCOL volume. E.g.,
initrd=PciRoot(0x0)/Pci(0x3,0x0)/HD(1,MBR,0xBE1AFDFA,0x3F,0xFBFC1)/rootfs.cpio.gz
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The EFI spec is not very clear about which permissions are being given
when allocating pages of a certain type. However, it is quite obvious
that EFI_LOADER_CODE is more likely to permit execution than
EFI_LOADER_DATA, which becomes relevant once we permit booting the
kernel proper with the firmware's 1:1 mapping still active.
Ostensibly, recent systems such as the Surface Pro X grant executable
permissions to EFI_LOADER_CODE regions but not EFI_LOADER_DATA regions.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
|
|
The latest fix for the non-contiguous memalloc helper changed the
allocation method for a non-IOMMU system to use only the fallback
allocator. This should have worked, but it caused a problem sometimes
when too many non-contiguous pages are allocated that can't be treated
by HD-audio controller.
As a quirk workaround, go back to the original strategy: use
dma_alloc_noncontiguous() at first, and apply the fallback only when
it fails, but only for non-IOMMU case.
We'll need a better fix in the fallback code as well, but this
workaround should paper over most cases.
Fixes: 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wgSH5ubdvt76gNwa004ooZAEJL_1Q-Fyw5M2FDdqL==dg@mail.gmail.com
Link: https://lore.kernel.org/r/20221112084718.3305-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
While the ATA specification states that a device should return command
aborted for all commands queued after the device has entered error state,
since ATA only keeps the sense data for the latest command (in non-NCQ
case), we really don't want to send block layer commands to the device
after it has entered error state. (Only ATA EH commands should be sent,
to read the sense data etc.)
Currently, scsi_queue_rq() will check if scsi_host_in_recovery()
(state is SHOST_RECOVERY), and if so, it will _not_ issue a command via:
scsi_dispatch_cmd() -> host->hostt->queuecommand() (ata_scsi_queuecmd())
-> __ata_scsi_queuecmd() -> ata_scsi_translate() -> ata_qc_issue()
Before commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
when receiving a TFES error IRQ, the call chain looked like this:
ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort() ->
ata_qc_complete() -> ata_qc_schedule_eh() -> blk_abort_request() ->
blk_rq_timed_out() -> q->rq_timed_out_fn() (scsi_times_out()) ->
scsi_eh_scmd_add() -> scsi_host_set_state(shost, SHOST_RECOVERY)
Which meant that as soon as an error IRQ was serviced, SHOST_RECOVERY
would be set.
However, after commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
scsi_times_out() will instead call scsi_abort_command() which will queue
delayed work, and the worker function scmd_eh_abort_handler() will call
scsi_eh_scmd_add(), which calls scsi_host_set_state(shost, SHOST_RECOVERY).
So now, after the TFES error IRQ has been serviced, we need to wait for
the SCSI workqueue to run its work before SHOST_RECOVERY gets set.
It is worth noting that, even before commit e494f6a72839 ("[SCSI] improved
eh timeout handler"), we could receive an error IRQ from the time when
scsi_queue_rq() checks scsi_host_in_recovery(), to the time when
ata_scsi_queuecmd() is actually called.
In order to handle both the delayed setting of SHOST_RECOVERY and the
window where we can receive an error IRQ, add a check against
ATA_PFLAG_EH_PENDING (which gets set when servicing the error IRQ),
inside ata_scsi_queuecmd() itself, while holding the ap->lock.
(Since the ap->lock is held while servicing IRQs.)
Fixes: e494f6a72839 ("[SCSI] improved eh timeout handler")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
Add a lockdep annotation in io_poll_req_insert_locked().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8115d8e702733754d0aea119e9b5bb63d1eb8b24.1668184658.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_poll_double_prepare() | io_poll_wake()
| poll->head = NULL
smp_load(&poll->head); /* NULL */ |
flags = req->flags; |
| req->flags &= ~SINGLE_POLL;
req->flags = flags | DOUBLE_POLL |
The idea behind io_poll_double_prepare() is to serialise with the
first poll entry by taking the wq lock. However, it's not safe to assume
that io_poll_wake() is not running when we can't grab the lock and so we
may race modifying req->flags.
Skip double poll setup if that happens. It's ok because the first poll
entry will only be removed when it's definitely completing, e.g.
pollfree or oneshot with a valid mask.
Fixes: 49f1c68e048f1 ("io_uring: optimise submission side poll_refs")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b7fab2d502f6121a7d7b199fe4d914a43ca9cdfd.1668184658.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
get_port_from_cmdline() returns an int, yet is assigned to a char, which
is wrong in its own right, but also, with char becoming unsigned, this
poses problems, because -1 is used as an error value. Further
complicating things, fw_init_early_console() is only ever called with a
-1 argument. Fix this up by removing the unused argument from
fw_init_early_console() and treating port as a proper signed integer.
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Cast upper bound of branch range to long to do signed compare,
avoid negative offset trigger this warning.
Fixes: 9b6584e35f40 ("MIPS: jump_label: Use compact branches for >= r6")
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
The local GPIO driver in the MIPS Alchemy is including the legacy
<linux/gpio.h> header but what it wants is to implement a GPIO
driver so include <linux/gpio/driver.h> instead.
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: linux-gpio@vger.kernel.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Add WARN_ON on kexec related kmalloc failed, avoid to pass NULL pointer
to following memcpy and loongson_kexec_prepare.
Fixes: 6ce48897ce47 ("MIPS: Loongson64: Add kexec/kdump support")
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Building with clang-14 fails with:
AS arch/mips/kernel/relocate_kernel.o
<unknown>:0: error: symbol 'kexec_args' is already defined
<unknown>:0: error: symbol 'secondary_kexec_args' is already defined
<unknown>:0: error: symbol 'kexec_start_address' is already defined
<unknown>:0: error: symbol 'kexec_indirection_page' is already defined
<unknown>:0: error: symbol 'relocate_new_kernel_size' is already defined
It turns out EXPORT defined in asm/asm.h expands to a symbol definition,
so there is no need to define these symbols again. Remove duplicated
symbol definitions.
Fixes: 7aa1c8f47e7e ("MIPS: kdump: Add support")
Signed-off-by: Rongwei Zhang <pudh4418@gmail.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
In the mips CONFIG_SYS_SUPPORTS_ZBOOT kernel, fix the compile error
when using CONFIG_FORTIFY_SOURCE=y
LD vmlinuz
mipsel-openwrt-linux-musl-ld: arch/mips/boot/compressed/decompress.o: in
function `decompress_kernel':
./include/linux/decompress/mm.h:(.text.decompress_kernel+0x177c):
undefined reference to `warn_slowpath_fmt'
kernel test robot helped identify this as related to fortify. The error
appeared with commit 54d9469bc515 ("fortify: Add run-time WARN for
cross-field memcpy()")
Link: https://lore.kernel.org/r/202209161144.x9xSqNQZ-lkp@intel.com/
Resolve this in the same style as commit cfecea6ead5f ("lib/string:
Move helper functions out of string.c")
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 54d9469bc515 ("fortify: Add run-time WARN for cross-field memcpy()")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
When zapping a GFN range, pass 0 => ALL_ONES for the to-be-invalidated
range to effectively block all page faults while the zap is in-progress.
The invalidation helpers take a host virtual address, whereas zapping a
GFN obviously provides a guest physical address and with the wrong unit
of measurement (frame vs. byte).
Alternatively, KVM could walk all memslots to get the associated HVAs,
but thanks to SMM, that would require multiple lookups. And practically
speaking, kvm_zap_gfn_range() usage is quite rare and not a hot path,
e.g. MTRR and CR0.CD are almost guaranteed to be done only on vCPU0
during boot, and APICv inhibits are similarly infrequent operations.
Fixes: edb298c663fc ("KVM: x86/mmu: bump mmu notifier count in kvm_zap_gfn_range")
Reported-by: Chao Peng <chao.p.peng@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221111001841.2412598-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In ata_tdev_add(), the return value of transport_add_device() is
not checked. As a result, it causes null-ptr-deref while removing
the module, because transport_remove_device() is called to remove
the device that was not added.
Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0
CPU: 13 PID: 13603 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc3+ #36
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : device_del+0x48/0x3a0
lr : device_del+0x44/0x3a0
Call trace:
device_del+0x48/0x3a0
attribute_container_class_device_del+0x28/0x40
transport_remove_classdev+0x60/0x7c
attribute_container_device_trigger+0x118/0x120
transport_remove_device+0x20/0x30
ata_tdev_delete+0x24/0x50 [libata]
ata_tlink_delete+0x40/0xa0 [libata]
ata_tport_delete+0x2c/0x60 [libata]
ata_port_detach+0x148/0x1b0 [libata]
ata_pci_remove_one+0x50/0x80 [libata]
ahci_remove_one+0x4c/0x8c [ahci]
Fix this by checking and handling return value of transport_add_device()
in ata_tdev_add(). In the error path, device_del() is called to delete
the device which was added earlier in this function, and ata_tdev_free()
is called to free ata_dev.
Fixes: d9027470b886 ("[libata] Add ATA transport class")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
In ata_tlink_add(), the return value of transport_add_device() is
not checked. As a result, it causes null-ptr-deref while removing
the module, because transport_remove_device() is called to remove
the device that was not added.
Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0
CPU: 33 PID: 13850 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc3+ #12
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : device_del+0x48/0x39c
lr : device_del+0x44/0x39c
Call trace:
device_del+0x48/0x39c
attribute_container_class_device_del+0x28/0x40
transport_remove_classdev+0x60/0x7c
attribute_container_device_trigger+0x118/0x120
transport_remove_device+0x20/0x30
ata_tlink_delete+0x88/0xb0 [libata]
ata_tport_delete+0x2c/0x60 [libata]
ata_port_detach+0x148/0x1b0 [libata]
ata_pci_remove_one+0x50/0x80 [libata]
ahci_remove_one+0x4c/0x8c [ahci]
Fix this by checking and handling return value of transport_add_device()
in ata_tlink_add().
Fixes: d9027470b886 ("[libata] Add ATA transport class")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
In ata_tport_add(), the return value of transport_add_device() is
not checked. As a result, it causes null-ptr-deref while removing
the module, because transport_remove_device() is called to remove
the device that was not added.
Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0
CPU: 12 PID: 13605 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc3+ #8
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : device_del+0x48/0x39c
lr : device_del+0x44/0x39c
Call trace:
device_del+0x48/0x39c
attribute_container_class_device_del+0x28/0x40
transport_remove_classdev+0x60/0x7c
attribute_container_device_trigger+0x118/0x120
transport_remove_device+0x20/0x30
ata_tport_delete+0x34/0x60 [libata]
ata_port_detach+0x148/0x1b0 [libata]
ata_pci_remove_one+0x50/0x80 [libata]
ahci_remove_one+0x4c/0x8c [ahci]
Fix this by checking and handling return value of transport_add_device()
in ata_tport_add().
Fixes: d9027470b886 ("[libata] Add ATA transport class")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
In the error path in ata_tport_add(), when calling put_device(),
ata_tport_release() is called, it will put the refcount of 'ap->host'.
And then ata_host_put() is called again, the refcount is decreased
to 0, ata_host_release() is called, all ports are freed and set to
null.
When unbinding the device after failure, ata_host_stop() is called
to release the resources, it leads a null-ptr-deref(), because all
the ports all freed and null.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
CPU: 7 PID: 18671 Comm: modprobe Kdump: loaded Tainted: G E 6.1.0-rc3+ #8
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : ata_host_stop+0x3c/0x84 [libata]
lr : release_nodes+0x64/0xd0
Call trace:
ata_host_stop+0x3c/0x84 [libata]
release_nodes+0x64/0xd0
devres_release_all+0xbc/0x1b0
device_unbind_cleanup+0x20/0x70
really_probe+0x158/0x320
__driver_probe_device+0x84/0x120
driver_probe_device+0x44/0x120
__driver_attach+0xb4/0x220
bus_for_each_dev+0x78/0xdc
driver_attach+0x2c/0x40
bus_add_driver+0x184/0x240
driver_register+0x80/0x13c
__pci_register_driver+0x4c/0x60
ahci_pci_driver_init+0x30/0x1000 [ahci]
Fix this by removing redundant ata_host_put() in the error path.
Fixes: 2623c7a5f279 ("libata: add refcounting to ata_host")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
It's not necessary to free netdev allocated with devm_alloc_etherdev()
and using free_netdev() leads to double free.
Fixes: fd3040b9394c ("net: ethernet: Add driver for Sunplus SP7021")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20221109150116.2988194-1-weiyongjun@huaweicloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recently, ld.lld moved from '--undefined-version' to
'--no-undefined-version' as the default, which breaks the compat vDSO
build:
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_gettimeofday' failed: symbol not defined
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_gettime' failed: symbol not defined
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_getres' failed: symbol not defined
These symbols are not present in the compat vDSO or the regular vDSO for
32-bit but they are unconditionally included in the version section of
the linker script, which is prohibited with '--no-undefined-version'.
Fix this issue by only including the symbols that are actually exported
in the version section of the linker script.
Link: https://github.com/ClangBuiltLinux/linux/issues/1756
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221108171324.3377226-1-nathan@kernel.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Currently, RISC-V sets up reserved memory using the "early" copy of the
device tree. As a result, when trying to get a reserved memory region
using of_reserved_mem_lookup(), the pointer to reserved memory regions
is using the early, pre-virtual-memory address which causes a kernel
panic when trying to use the buffer's name:
Unable to handle kernel paging request at virtual address 00000000401c31ac
Oops [#1]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc1-00001-g0d9d6953d834 #1
Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
epc : string+0x4a/0xea
ra : vsnprintf+0x1e4/0x336
epc : ffffffff80335ea0 ra : ffffffff80338936 sp : ffffffff81203be0
gp : ffffffff812e0a98 tp : ffffffff8120de40 t0 : 0000000000000000
t1 : ffffffff81203e28 t2 : 7265736572203a46 s0 : ffffffff81203c20
s1 : ffffffff81203e28 a0 : ffffffff81203d22 a1 : 0000000000000000
a2 : ffffffff81203d08 a3 : 0000000081203d21 a4 : ffffffffffffffff
a5 : 00000000401c31ac a6 : ffff0a00ffffff04 a7 : ffffffffffffffff
s2 : ffffffff81203d08 s3 : ffffffff81203d00 s4 : 0000000000000008
s5 : ffffffff000000ff s6 : 0000000000ffffff s7 : 00000000ffffff00
s8 : ffffffff80d9821a s9 : ffffffff81203d22 s10: 0000000000000002
s11: ffffffff80d9821c t3 : ffffffff812f3617 t4 : ffffffff812f3617
t5 : ffffffff812f3618 t6 : ffffffff81203d08
status: 0000000200000100 badaddr: 00000000401c31ac cause: 000000000000000d
[<ffffffff80338936>] vsnprintf+0x1e4/0x336
[<ffffffff80055ae2>] vprintk_store+0xf6/0x344
[<ffffffff80055d86>] vprintk_emit+0x56/0x192
[<ffffffff80055ed8>] vprintk_default+0x16/0x1e
[<ffffffff800563d2>] vprintk+0x72/0x80
[<ffffffff806813b2>] _printk+0x36/0x50
[<ffffffff8068af48>] print_reserved_mem+0x1c/0x24
[<ffffffff808057ec>] paging_init+0x528/0x5bc
[<ffffffff808031ae>] setup_arch+0xd0/0x592
[<ffffffff8080070e>] start_kernel+0x82/0x73c
early_init_fdt_scan_reserved_mem() takes no arguments as it operates on
initial_boot_params, which is populated by early_init_dt_verify(). On
RISC-V, early_init_dt_verify() is called twice. Once, directly, in
setup_arch() if CONFIG_BUILTIN_DTB is not enabled and once indirectly,
very early in the boot process, by parse_dtb() when it calls
early_init_dt_scan_nodes().
This first call uses dtb_early_va to set initial_boot_params, which is
not usable later in the boot process when
early_init_fdt_scan_reserved_mem() is called. On arm64 for example, the
corresponding call to early_init_dt_scan_nodes() uses fixmap addresses
and doesn't suffer the same fate.
Move early_init_fdt_scan_reserved_mem() further along the boot sequence,
after the direct call to early_init_dt_verify() in setup_arch() so that
the names use the correct virtual memory addresses. The above supposed
that CONFIG_BUILTIN_DTB was not set, but should work equally in the case
where it is - unflatted_and_copy_device_tree() also updates
initial_boot_params.
Reported-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com>
Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/linux-riscv/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/
Fixes: 922b0375fc93 ("riscv: Fix memblock reservation for device tree blob")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/r/20221107151524.3941467-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Currently, when mapping the EFI runtime regions in the EFI page tables,
we complain about misaligned regions in a rather noisy way, using
WARN().
Not only does this produce a lot of irrelevant clutter in the log, it is
factually incorrect, as misaligned runtime regions are actually allowed
by the EFI spec as long as they don't require conflicting memory types
within the same 64k page.
So let's drop the warning, and tweak the code so that we
- take both the start and end of the region into account when checking
for misalignment
- only revert to RWX mappings for non-code regions if misaligned code
regions are also known to exist.
Cc: <stable@vger.kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Ampere Altra machines are reported to misbehave when the SetTime() EFI
runtime service is called after ExitBootServices() but before calling
SetVirtualAddressMap(). Given that the latter is horrid, pointless and
explicitly documented as optional by the EFI spec, we no longer invoke
it at boot if the configured size of the VA space guarantees that the
EFI runtime memory regions can remain mapped 1:1 like they are at boot
time.
On Ampere Altra machines, this results in SetTime() calls issued by the
rtc-efi driver triggering synchronous exceptions during boot. We can
now recover from those without bringing down the system entirely, due to
commit 23715a26c8d81291 ("arm64: efi: Recover from synchronous
exceptions occurring in firmware"). However, it would be better to avoid
the issue entirely, given that the firmware appears to remain in a funny
state after this.
So attempt to identify these machines based on the 'family' field in the
type #1 SMBIOS record, and call SetVirtualAddressMap() unconditionally
in that case.
Tested-by: Alexandru Elisei <alexandru.elisei@gmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Even after commit 89fd4a1df829 ("riscv: jump_label: mark arguments as
const to satisfy asm constraints"), building with CC_OPTIMIZE_FOR_SIZE
+ LLVM=1 can reproduce below build error:
CC arch/riscv/kernel/vdso/vgettimeofday.o
In file included from <built-in>:4:
In file included from lib/vdso/gettimeofday.c:5:
In file included from include/vdso/datapage.h:17:
In file included from include/vdso/processor.h:10:
In file included from arch/riscv/include/asm/vdso/processor.h:7:
In file included from include/linux/jump_label.h:112:
arch/riscv/include/asm/jump_label.h:42:3: error:
invalid operand for inline asm constraint 'i'
" .option push \n\t"
^
1 error generated.
I think the problem is when "-Os" is passed as CFLAGS, it's removed by
"CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os" which is
introduced in commit e05d57dcb8c7 ("riscv: Fixup __vdso_gettimeofday
broke dynamic ftrace"), thus no optimization at all for vgettimeofday.c
arm64 does remove "-Os" as well, but it forces "-O2" after removing
"-Os".
I compared the generated vgettimeofday.o with "-O2" and "-Os",
I think no big performance difference. So let's tell the kbuild not
to remove "-Os" rather than follow arm64 style.
vdso related performance can be improved a lot when building kernel with
CC_OPTIMIZE_FOR_SIZE after this commit, ("-Os" VS no optimization)
Fixes: e05d57dcb8c7 ("riscv: Fixup __vdso_gettimeofday broke dynamic ftrace")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221031182943.2453-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Last patch from Vivien was nearly 3 years ago and he has not reviewed or
responded to DSA patches since then, move to CREDITS.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221109231907.621678-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
thread_struct's s[12] may contain random kernel memory content, which
may be finally leaked to userspace. This is a security hole. Fix it
by clearing the s[12] array in thread_struct when fork.
As for kthread case, it's better to clear the s[12] array as well.
Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20221029113450.4027-1-jszhang@kernel.org
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/CAJF2gTSdVyAaM12T%2B7kXAdRPGS4VyuO08X1c7paE-n4Fr8OtRA@mail.gmail.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
We already check if the chosen starting offset for the buffer IDs fit
within an unsigned short, as 65535 is the maximum value for a provided
buffer. But if the caller asks to add N buffers at offset M, and M + N
would exceed the size of the unsigned short, we simply add buffers with
wrapping around the ID.
This is not necessarily a bug and could in fact be a valid use case, but
it seems confusing and inconsistent with the initial check for starting
offset. Let's check for wrap consistently, and error the addition if we
do need to wrap.
Reported-by: Olivier Langlois <olivier@trillion01.com>
Link: https://github.com/axboe/liburing/issues/726
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
kmemleak reports memory leaks in macvlan_common_newlink, as follows:
ip link add link eth0 name .. type macvlan mode source macaddr add
<MAC-ADDR>
kmemleak reports:
unreferenced object 0xffff8880109bb140 (size 64):
comm "ip", pid 284, jiffies 4294986150 (age 430.108s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 b8 aa 5a 12 80 88 ff ff ..........Z.....
80 1b fa 0d 80 88 ff ff 1e ff ac af c7 c1 6b 6b ..............kk
backtrace:
[<ffffffff813e06a7>] kmem_cache_alloc_trace+0x1c7/0x300
[<ffffffff81b66025>] macvlan_hash_add_source+0x45/0xc0
[<ffffffff81b66a67>] macvlan_changelink_sources+0xd7/0x170
[<ffffffff81b6775c>] macvlan_common_newlink+0x38c/0x5a0
[<ffffffff81b6797e>] macvlan_newlink+0xe/0x20
[<ffffffff81d97f8f>] __rtnl_newlink+0x7af/0xa50
[<ffffffff81d98278>] rtnl_newlink+0x48/0x70
...
In the scenario where the macvlan mode is configured as 'source',
macvlan_changelink_sources() will be execured to reconfigure list of
remote source mac addresses, at the same time, if register_netdevice()
return an error, the resource generated by macvlan_changelink_sources()
is not cleaned up.
Using this patch, in the case of an error, it will execute
macvlan_flush_sources() to ensure that the resource is cleaned up.
Fixes: aa5fd0fb7748 ("driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Link: https://lore.kernel.org/r/20221109090735.690500-1-nashuiliang@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When alloc tx/rx ring failed in tsi108_open(), it doesn't free irq. Fix
it.
Fixes: 5e123b844a1c ("[PATCH] Add tsi108/9 On Chip Ethernet device driver support")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109044016.126866-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
As 'kobject_add' may allocated memory for 'kobject->name' when return error.
And in this function, if call 'kobject_add' failed didn't free kobject.
So call 'kobject_put' to recycling resources.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20221110144539.2989354-1-yebin@huaweicloud.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
When the non-contiguous page allocation for SG buffer allocation
fails, the memalloc helper tries to fall back to the old page
allocation methods. This would, however, result in the bogus page
addresses when IOMMU is enabled. Usually in such a case, the fallback
allocation should fail as well, but occasionally it succeeds and
hitting a bad access.
The fallback was thought for non-IOMMU case, and as the error from
dma_alloc_noncontiguous() with IOMMU essentially implies a fatal
memory allocation error, we should return the error straightforwardly
without fallback. This avoids the corner case like the above.
The patch also renames the local variable "dma_ops" with snd_ prefix
for avoiding the name conflict.
Fixes: a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Reported-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2211041541090.3532114@eliteleevi.tm.intel.com
Link: https://lore.kernel.org/r/20221110132216.30605-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
When failed to init rxq or txq in mv643xx_eth_open() for opening device,
napi isn't disabled. When open mv643xx_eth device next time, it will
trigger a BUG_ON() in napi_enable(). Compile tested only.
Fixes: 2257e05c1705 ("mv643xx_eth: get rid of receive-side locking")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109025432.80900-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When failed to start nic or add interrupt service routine in
s2io_card_up() for opening device, napi isn't disabled. When open
s2io device next time, it will trigger a BUG_ON()in napi_enable().
Compile tested only.
Fixes: 5f490c968056 ("S2io: Fixed synchronization between scheduling of napi with card reset and close")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109023741.131552-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Commit aaab73f8fba4 ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading, but the atlantic driver made a copy and did
not clear it. Fix this.
[4 Fixes tags below, all part of the same series, no need to split this]
Fixes: 9ff40a751a6f ("net: atlantic: MACSec ingress offload implementation")
Fixes: b8f8a0b7b5cb ("net: atlantic: MACSec ingress offload HW bindings")
Fixes: 27736563ce32 ("net: atlantic: MACSec egress offload implementation")
Fixes: 9d106c6dd81b ("net: atlantic: MACSec egress offload HW bindings")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Commit aaab73f8fba4 ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading, but the MSCC PHY driver made a copy, kept
it in the flow data and did not clear it when freeing a flow. Fix this.
Fixes: 28c5107aa904 ("net: phy: mscc: macsec support")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|