path: root/include
Age | Commit message | Author | Files | Lines
2019-11-25 | Merge tag 'edac_for_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras | Linus Torvalds | 1 | -78/+68
Pull EDAC updates from Borislav Petkov: "A lot of changes this time around, details below. From the next cycle onwards, we'll switch the EDAC tree to topic branches (instead of a single edac-for-next branch) which should make the changes handling more flexible, hopefully. We'll see. Summary: - Rework error logging functions to accept a count of errors parameter (Hanna Hawa) - Part one of substantial EDAC core + ghes_edac driver cleanup (Robert Richter) - Print additional useful logging information in skx_* (Tony Luck) - Improve amd64_edac hw detection + cleanups (Yazen Ghannam) - Misc cleanups, fixes and code improvements" * tag 'edac_for_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (35 commits) EDAC/altera: Use the Altera System Manager driver EDAC/altera: Cleanup the ECC Manager EDAC/altera: Use fast register IO for S10 IRQs EDAC/ghes: Do not warn when incrementing refcount on 0 EDAC/Documentation: Describe CPER module definition and DIMM ranks EDAC: Unify the mc_event tracepoint call EDAC/ghes: Remove intermediate buffer pvt->detail_location EDAC/ghes: Fix grain calculation EDAC/ghes: Use standard kernel macros for page calculations EDAC: Remove misleading comment in struct edac_raw_error_desc EDAC/mc: Reduce indentation level in edac_mc_handle_error() EDAC/mc: Remove needless zero string termination EDAC/mc: Do not BUG_ON() in edac_mc_alloc() EDAC: Introduce an mci_for_each_dimm() iterator EDAC: Remove EDAC_DIMM_OFF() macro EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function EDAC/amd64: Get rid of the ECC disabled long message EDAC/ghes: Fix locking and memory barrier issues EDAC/amd64: Check for memory before fully initializing an instance EDAC/amd64: Use cached data when checking for ECC ...
2019-11-25 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds | 11 | -17/+166
Pull KVM updates from Paolo Bonzini: "ARM: - data abort report and injection - steal time support - GICv4 performance improvements - vgic ITS emulation fixes - simplify FWB handling - enable halt polling counters - make the emulated timer PREEMPT_RT compliant s390: - small fixes and cleanups - selftest improvements - yield improvements PPC: - add capability to tell userspace whether we can single-step the guest - improve the allocation of XIVE virtual processor IDs - rewrite interrupt synthesis code to deliver interrupts in virtual mode when appropriate. - minor cleanups and improvements. x86: - XSAVES support for AMD - more accurate report of nested guest TSC to the nested hypervisor - retpoline optimizations - support for nested 5-level page tables - PMU virtualization optimizations, and improved support for nested PMU virtualization - correct latching of INITs for nested virtualization - IOAPIC optimization - TSX_CTRL virtualization for more TAA happiness - improved allocation and flushing of SEV ASIDs - many bugfixes and cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits) kvm: nVMX: Relax guest IA32_FEATURE_CONTROL constraints KVM: x86: Grab KVM's srcu lock when setting nested state KVM: x86: Open code shared_msr_update() in its only caller KVM: Fix jump label out_free_* in kvm_init() KVM: x86: Remove a spurious export of a static function KVM: x86: create mmu/ subdirectory KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 use apic-access-page KVM: x86: remove set but not used variable 'called' KVM: nVMX: Do not mark vmcs02->apic_access_page as dirty when unpinning KVM: vmx: use MSR_IA32_TSX_CTRL to hard-disable TSX on guest that lack it KVM: vmx: implement MSR_IA32_TSX_CTRL disable RTM functionality KVM: x86: implement MSR_IA32_TSX_CTRL effect on CPUID KVM: x86: do not modify masked bits of shared MSRs KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES KVM: PPC: Book3S HV: XIVE: Fix potential 
page leak on error path KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one KVM: nVMX: Assume TLB entries of L1 and L2 are tagged differently if L0 use EPT KVM: x86: Unexport kvm_vcpu_reload_apic_access_page() KVM: nVMX: add CR4_LA57 bit to nested CR4_FIXED1 KVM: nVMX: Use semi-colon instead of comma for exit-handlers initialization ...
2019-11-25 | Merge tag 'for-linus-5.5a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip | Linus Torvalds | 1 | -2/+8
Pull xen updates from Juergen Gross: - a small series to remove the build constraint of Xen x86 MCE handling to 64-bit only - a bunch of minor cleanups * tag 'for-linus-5.5a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Fix Kconfig indentation xen/mcelog: also allow building for 32-bit kernels xen/mcelog: add PPIN to record when available xen/mcelog: drop __MC_MSR_MCGCAP xen/gntdev: Use select for DMA_SHARED_BUFFER xen: mm: make xen_mm_init static xen: mm: include <xen/xen-ops.h> for missing declarations
2019-11-25 | Merge tag 'mips_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux | Linus Torvalds | 2 | -1/+10
Pull MIPS updates from Paul Burton: "The main MIPS changes for 5.5: - Atomics-related code sees some rework & cleanup, most notably allowing Loongson LL/SC errata workarounds to be more bulletproof & their correctness to be checked at build time. - Command line setup code is simplified somewhat, resolving various corner cases. - MIPS kernels can now be built with kcov code coverage support. - We can now build with CONFIG_FORTIFY_SOURCE=y. - Miscellaneous cleanups. And some platform specific changes: - We now disable some broken TLB functionality on certain Ingenic systems, and JZ4780 systems gain some devicetree nodes to support more devices. - Loongson support sees a number of cleanups, and we gain initial support for Loongson 3A R4 systems. - We gain support for MediaTek MT7688-based GARDENA Smart Gateway systems. - SGI IP27 (Origin 2*) see a number of fixes, cleanups & simplifications. - SGI IP30 (Octane) systems are now supported" * tag 'mips_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (107 commits) MIPS: SGI-IP27: Enable ethernet phy on second Origin 200 module MIPS: PCI: Fix fake subdevice ID for IOC3 MIPS: Ingenic: Disable abandoned HPTLB function. 
MIPS: PCI: remember nasid changed by set interrupt affinity MIPS: SGI-IP27: Fix crash, when CPUs are disabled via nr_cpus parameter mips: add support for folded p4d page tables mips: drop __pXd_offset() macros that duplicate pXd_index() ones mips: fix build when "48 bits virtual memory" is enabled MIPS: math-emu: Reuse name array in debugfs_fpuemu() MIPS: allow building with kcov coverage MIPS: Loongson64: Drop setup_pcimap MIPS: Loongson2ef: Convert to early_printk_8250 MIPS: Drop CPU_SUPPORTS_UNCACHED_ACCELERATED MIPS: Loongson{2ef, 32, 64} convert to generic fw cmdline MIPS: Drop pmon.h MIPS: Loongson: Unify LOONGSON3/LOONGSON64 Kconfig usage MIPS: Loongson: Rename LOONGSON1 to LOONGSON32 MIPS: Loongson: Fix return value of loongson_hwmon_init MIPS: add support for SGI Octane (IP30) MIPS: PCI: make phys_to_dma/dma_to_phys for pci-xtalk-bridge common ...
2019-11-25 | Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux | Linus Torvalds | 8 | -42/+92
Pull arm64 updates from Catalin Marinas: "Apart from the arm64-specific bits (core arch and perf, new arm64 selftests), it touches the generic cow_user_page() (reviewed by Kirill) together with a macro for x86 to preserve the existing behaviour on this architecture. Summary: - On ARMv8 CPUs without hardware updates of the access flag, avoid failing cow_user_page() on PFN mappings if the pte is old. The patches introduce an arch_faults_on_old_pte() macro, defined as false on x86. When true, cow_user_page() makes the pte young before attempting __copy_from_user_inatomic(). - Convert the synchronous exception handling paths in arch/arm64/kernel/entry.S to C. - FTRACE_WITH_REGS support for arm64. - ZONE_DMA re-introduced on arm64 to support Raspberry Pi 4 - Several kselftest cases specific to arm64, together with a MAINTAINERS update for these files (moved to the ARM64 PORT entry). - Workaround for a Neoverse-N1 erratum where the CPU may fetch stale instructions under certain conditions. - Workaround for Cortex-A57 and A72 errata where the CPU may speculatively execute an AT instruction and associate a VMID with the wrong guest page tables (corrupting the TLB). - Perf updates for arm64: additional PMU topologies on HiSilicon platforms, support for CCN-512 interconnect, AXI ID filtering in the IMX8 DDR PMU, support for the CCPI2 uncore PMU in ThunderX2. - GICv3 optimisation to avoid a heavy barrier when accessing the ICC_PMR_EL1 register. - ELF HWCAP documentation updates and clean-up. - SMC calling convention conduit code clean-up. - KASLR diagnostics printed during boot - NVIDIA Carmel CPU added to the KPTI whitelist - Some arm64 mm clean-ups: use generic free_initrd_mem(), remove stale macro, simplify calculation in __create_pgd_mapping(), typos. 
- Kconfig clean-ups: CMDLINE_FORCE to depend on CMDLINE, choice for endianness to help with allmodconfig" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits) arm64: Kconfig: add a choice for endianness kselftest: arm64: fix spelling mistake "contiguos" -> "contiguous" arm64: Kconfig: make CMDLINE_FORCE depend on CMDLINE MAINTAINERS: Add arm64 selftests to the ARM64 PORT entry arm64: kaslr: Check command line before looking for a seed arm64: kaslr: Announce KASLR status on boot kselftest: arm64: fake_sigreturn_misaligned_sp kselftest: arm64: fake_sigreturn_bad_size kselftest: arm64: fake_sigreturn_duplicated_fpsimd kselftest: arm64: fake_sigreturn_missing_fpsimd kselftest: arm64: fake_sigreturn_bad_size_for_magic0 kselftest: arm64: fake_sigreturn_bad_magic kselftest: arm64: add helper get_current_context kselftest: arm64: extend test_init functionalities kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht] kselftest: arm64: mangle_pstate_invalid_daif_bits kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils kselftest: arm64: extend toplevel skeleton Makefile drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform arm64: mm: reserve CMA and crashkernel in ZONE_DMA32 ...
2019-11-25 | Merge tag 'linux-kselftest-5.5-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest | Linus Torvalds | 4 | -0/+1972
Pull kselftest KUnit support from Shuah Khan: "This adds KUnit, a lightweight unit testing and mocking framework for the Linux kernel from Brendan Higgins. KUnit is not an end-to-end testing framework. It is currently supported on UML and sub-systems can write unit tests and run them in UML env. KUnit documentation is included in this update. In addition, this KUnit update adds 3 new kunit tests: - proc sysctl test from Iurii Zaikin - the 'list' doubly linked list test from David Gow - ext4 tests for decoding extended timestamps from Iurii Zaikin In the future KUnit will be linked to Kselftest framework to provide a way to trigger KUnit tests from user-space" * tag 'linux-kselftest-5.5-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (23 commits) lib/list-test: add a test for the 'list' doubly linked list ext4: add kunit test for decoding extended timestamps Documentation: kunit: Fix verification command kunit: Fix '--build_dir' option kunit: fix failure to build without printk MAINTAINERS: add proc sysctl KUnit test to PROC SYSCTL section kernel/sysctl-test: Add null pointer test for sysctl.c:proc_dointvec() MAINTAINERS: add entry for KUnit the unit testing framework Documentation: kunit: add documentation for KUnit kunit: defconfig: add defconfigs for building KUnit tests kunit: tool: add Python wrappers for running KUnit tests kunit: test: add tests for KUnit managed resources kunit: test: add the concept of assertions kunit: test: add tests for kunit test abort kunit: test: add support for test abort objtool: add kunit_try_catch_throw to the noreturn list kunit: test: add initial tests lib: enable building KUnit in lib/ kunit: test: add the concept of expectations kunit: test: add assertion printing library ...
2019-11-25 | Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt | Linus Torvalds | 2 | -2/+3
Pull fsverity updates from Eric Biggers: "Expose the fs-verity bit through statx()" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: docs: fs-verity: mention statx() support f2fs: support STATX_ATTR_VERITY ext4: support STATX_ATTR_VERITY statx: define STATX_ATTR_VERITY docs: fs-verity: document first supported kernel version
2019-11-25 | Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt | Linus Torvalds | 2 | -33/+5
Pull fscrypt updates from Eric Biggers: - Add the IV_INO_LBLK_64 encryption policy flag which modifies the encryption to be optimized for UFS inline encryption hardware. - For AES-128-CBC, use the crypto API's implementation of ESSIV (which was added in 5.4) rather than doing ESSIV manually. - A few other cleanups. * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: f2fs: add support for IV_INO_LBLK_64 encryption policies ext4: add support for IV_INO_LBLK_64 encryption policies fscrypt: add support for IV_INO_LBLK_64 policies fscrypt: avoid data race on fscrypt_mode::logged_impl_name docs: ioctl-number: document fscrypt ioctl numbers fscrypt: zeroize fscrypt_info before freeing fscrypt: remove struct fscrypt_ctx fscrypt: invoke crypto API for ESSIV handling
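The pull message does not spell out the IV layout. According to the kernel's fscrypt documentation, an IV_INO_LBLK_64 policy puts the logical block number in bits 0-31 of the IV and the inode number in bits 32-63, with both limited to 32 bits. The sketch below only illustrates that packing; iv_ino_lblk_64() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): pack an inode number and a
 * logical block number into the 64-bit IV value described for
 * IV_INO_LBLK_64 policies. Both inputs must fit in 32 bits.
 */
static uint64_t iv_ino_lblk_64(uint64_t ino, uint64_t lblk)
{
    assert(ino <= UINT32_MAX);   /* inode number limited to 32 bits */
    assert(lblk <= UINT32_MAX);  /* logical block number limited to 32 bits */

    /* bits 32-63: inode number, bits 0-31: logical block number */
    return (ino << 32) | lblk;
}
```

Because the inode number is part of the IV, files sharing one contents key still get distinct IVs for the same block offset, which is what lets the key be shared in the first place.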
2019-11-25 | Merge tag 'for-5.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux | Linus Torvalds | 3 | -73/+86
Pull btrfs updates from David Sterba: "User visible changes: - new block group profiles: RAID1 with 3- and 4- copies - RAID1 in btrfs has always 2 copies, now add support for 3 and 4 - this is an incompat feature (named RAID1C34) - recommended use of RAID1C3 is replacement of RAID6 profile on metadata, this brings a more reliable resiliency against 2 device loss/damage - support for new checksums - per-filesystem, set at mkfs time - fast hash (crc32c successor): xxhash, 64bit digest - strong hashes (both 256bit): sha256 (slower, FIPS), blake2b (faster) - the blake2b module goes via the crypto tree, btrfs.ko has a soft dependency - speed up lseek, don't take inode locks unnecessarily, this can speed up parallel SEEK_CUR/SEEK_SET/SEEK_END by 80% - send: - allow clone operations within the same file - limit maximum number of sent clone references to avoid slow backref walking - error message improvements: device scan prints process name and PID Core changes: - cleanups - remove unique workqueue helpers, used to provide a way to avoid deadlocks in the workqueue code, now done in a simpler way - remove lots of indirect function calls in compression code - extent IO tree code moved out of extent_io.c - cleanup backup superblock handling at mount time - transaction life cycle documentation and cleanups - locking code cleanups, annotations and documentation - add more cold, const, pure function attributes - removal of unused or redundant struct members or variables - new tree-checker sanity tests - try to detect missing INODE_ITEM, cross-reference checks of DIR_ITEM, DIR_INDEX, INODE_REF, and XATTR_* items - remove own bio scheduling code (used to avoid checksum submissions being stuck behind other IO), replaced by cgroup controller-based code to allow better control and avoid priority inversions in cases where the custom and cgroup scheduling disagreed Fixes: - avoid getting stuck during cyclic writebacks - fix trimming of ranges crossing block group boundaries - fix 
rename exchange on subvolumes, all involved subvolumes need to be recorded in the transaction" * tag 'for-5.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits) btrfs: drop bdev argument from submit_extent_page btrfs: remove extent_map::bdev btrfs: drop bio_set_dev where not needed btrfs: get bdev directly from fs_devices in submit_extent_page btrfs: record all roots for rename exchange on a subvol Btrfs: fix block group remaining RO forever after error during device replace btrfs: scrub: Don't check free space before marking a block group RO btrfs: change btrfs_fs_devices::rotating to bool btrfs: change btrfs_fs_devices::seeding to bool btrfs: rename btrfs_block_group_cache btrfs: block-group: Reuse the item key from caller of read_one_block_group() btrfs: block-group: Refactor btrfs_read_block_groups() btrfs: document extent buffer locking btrfs: access eb::blocking_writers according to ACCESS_ONCE policies btrfs: set blocking_writers directly, no increment or decrement btrfs: merge blocking_writers branches in btrfs_tree_read_lock btrfs: drop incompat bit for raid1c34 after last block group is gone btrfs: add incompat for raid1 with 3, 4 copies btrfs: add support for 4-copy replication (raid1c4) btrfs: add support for 3-copy replication (raid1c3) ...
2019-11-25 | Merge tag 'mtd/for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux | Linus Torvalds | 2 | -28/+37
Pull MTD updates from Miquel Raynal: "MTD core: - drop inactive maintainers, update the repositories and add IRC channel - debugfs functions improvements - initialize more structure parameters - misc fixes reported by robots MTD devices: - spear_smi: Fixed Write Burst mode - new Intel IXP4xx flash probing hook Raw NAND core: - useless extra checks dropped - update the detection of the bad block markers position Raw NAND controller drivers: - Cadence: new driver - Brcmnand: support for flash-dma v0 + fixes - Denali: drop support for the legacy controller/chip DT representation - superfluous dev_err() calls removed SPI NOR core changes: - introduce 'struct spi_nor_controller_ops' - clean the Register Operations methods - use dev_dbg instead of dev_err for low level info - fix retlen handling in sst_write() - fix silent truncations in spi_nor_read and spi_nor_read_raw() - fix the clearing of QE bit on lock()/unlock() - rework the disabling of the block write protection - rework the Quad Enable methods - make sure nor->spimem and nor->controller_ops are mutually exclusive - set default Quad Enable method for ISSI flashes - add support for few flashes SPI NOR controller drivers changes: - intel-spi: - support chips without software sequencer - add support for Intel Cannon Lake and Intel Comet Lake-H flashes CFI core changes: - code cleanups related useless initializers and coding style issues - fix for a possible double free problem in cfi_cmdset_0002 - improved HyperFlash error reporting and handling in cfi_cmdset_0002 core" * tag 'mtd/for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (73 commits) mtd: devices: fix mchp23k256 read and write mtd: no need to check return value of debugfs_create functions mtd: spi-nor: Set default Quad Enable method for ISSI flashes mtd: spi-nor: Add support for is25wp256 mtd: spi-nor: Add support for w25q256jw mtd: spi-nor: Move condition to avoid a NULL check mtd: spi-nor: Make sure nor->spimem and nor->controller_ops 
are mutually exclusive mtd: spi-nor: Rename Quad Enable methods mtd: spi-nor: Merge spansion Quad Enable methods mtd: spi-nor: Rename CR_QUAD_EN_SPAN to SR2_QUAD_EN_BIT1 mtd: spi-nor: Extend the SR Read Back test mtd: spi-nor: Rework the disabling of block write protection mtd: spi-nor: Fix clearing of QE bit on lock()/unlock() mtd: cfi_cmdset_0002: fix delayed error detection on HyperFlash mtd: cfi_cmdset_0002: only check errors when ready in cfi_check_err_status() mtd: cfi_cmdset_0002: don't free cfi->cfiq in error path of cfi_amdstd_setup() mtd: cfi_cmdset_*: kill useless 'ret' variable initializers mtd: cfi_util: use DIV_ROUND_UP() in cfi_udelay() mtd: spi-nor: Print debug message when the read back test fails mtd: spi-nor: Check all the bits written, not just the BP ones ...
2019-11-25 | Merge tag 'for-5.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm | Linus Torvalds | 1 | -3/+0
Pull device mapper updates from Mike Snitzer: - Fix DM core to disallow stacking request-based DM on partitions. - Fix DM raid target to properly resync raidset even if bitmap needed additional pages. - Fix DM crypt performance regression due to use of WQ_HIGHPRI for the IO and crypt workqueues. - Fix DM integrity metadata layout that was aligned on 128K boundary rather than the intended 4K boundary (removes 124K of wasted space for each metadata block). - Improve the DM thin, cache and clone targets to use spin_lock_irq rather than spin_lock_irqsave where possible. - Fix DM thin single thread performance that was lost due to needless workqueue wakeups. - Fix DM zoned target performance that was lost due to excessive backing device checks. - Add ability to trigger write failure with the DM dust test target. - Fix whitespace indentation in drivers/md/Kconfig. - Various small fixes and cleanups (e.g. use struct_size, fix uninitialized variable, variable renames, etc). * tag 'for-5.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (22 commits) Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues" dm: Fix Kconfig indentation dm thin: wakeup worker only when deferred bios exist dm integrity: fix excessive alignment of metadata runs dm raid: Remove unnecessary negation of a shift in raid10_format_to_md_layout dm zoned: reduce overhead of backing device checks dm dust: add limited write failure mode dm dust: change ret to r in dust_map_read and dust_map dm dust: change result vars to r dm cache: replace spin_lock_irqsave with spin_lock_irq dm bio prison: replace spin_lock_irqsave with spin_lock_irq dm thin: replace spin_lock_irqsave with spin_lock_irq dm clone: add bucket_lock_irq/bucket_unlock_irq helpers dm clone: replace spin_lock_irqsave with spin_lock_irq dm writecache: handle REQ_FUA dm writecache: fix uninitialized variable warning dm stripe: use struct_size() in kmalloc() dm raid: streamline rs_get_progress() and 
its raid_status() caller side dm raid: simplify rs_setup_recovery call chain dm raid: to ensure resynchronization, perform raid set grow in preresume ...
2019-11-25 | Merge tag 'for-5.5/disk-revalidate-20191122' of git://git.kernel.dk/linux-block | Linus Torvalds | 2 | -6/+3
Pull disk revalidation updates from Jens Axboe: "This continues the work that Jan Kara started to thoroughly cleanup and consolidate how we handle rescans and revalidations" * tag 'for-5.5/disk-revalidate-20191122' of git://git.kernel.dk/linux-block: block: move clearing bd_invalidated into check_disk_size_change block: remove (__)blkdev_reread_part as an exported API block: fix bdev_disk_changed for non-partitioned devices block: move rescan_partitions to fs/block_dev.c block: merge invalidate_partitions into rescan_partitions block: refactor rescan_partitions
2019-11-25 | Merge tag 'for-5.5/zoned-20191122' of git://git.kernel.dk/linux-block | Linus Torvalds | 2 | -13/+26
Pull zoned block device update from Jens Axboe: "Enhancements and improvements to the zoned device support" * tag 'for-5.5/zoned-20191122' of git://git.kernel.dk/linux-block: scsi: sd_zbc: Remove set but not used variable 'buflen' block: rework zone reporting scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer() null_blk: Add zone_nr_conv to features null_blk: clean up report zones null_blk: clean up the block device operations block: Remove partition support for zoned block devices block: Simplify report zones execution block: cleanup the !zoned case in blk_revalidate_disk_zones block: Enhance blk_revalidate_disk_zones()
2019-11-25 | Merge tag 'for-5.5/drivers-post-20191122' of git://git.kernel.dk/linux-block | Linus Torvalds | 1 | -0/+6
Pull additional block driver updates from Jens Axboe: "Here's another block driver update, done to avoid conflicts with the zoned changes coming next. This contains: - Prepare SCSI sd for zone open/close/finish support - Small NVMe pull request - hwmon support (Akinobu) - add new co-maintainer (Christoph) - work-around for a discard issue on non-conformant drives (Eduard) - Small nbd leak fix" * tag 'for-5.5/drivers-post-20191122' of git://git.kernel.dk/linux-block: nbd: prevent memory leak nvme: hwmon: add quirk to avoid changing temperature threshold nvme: hwmon: provide temperature min and max values for each sensor nvmet: add another maintainer nvme: Discard workaround for non-conformant devices nvme: Add hardware monitoring support scsi: sd_zbc: add zone open, close, and finish support
2019-11-25 | Merge tag 'for-5.5/drivers-20191121' of git://git.kernel.dk/linux-block | Linus Torvalds | 2 | -48/+188
Pull block driver updates from Jens Axboe: "Here are the main block driver updates for 5.5. Nothing major in here, mostly just fixes. This contains: - a set of bcache changes via Coly - MD changes from Song - loop unmap write-zeroes fix (Darrick) - spelling fixes (Geert) - zoned additions cleanups to null_blk/dm (Ajay) - allow null_blk online submit queue changes (Bart) - NVMe changes via Keith, nothing major here either" * tag 'for-5.5/drivers-20191121' of git://git.kernel.dk/linux-block: (56 commits) Revert "bcache: fix fifo index swapping condition in journal_pin_cmp()" drivers/md/raid5-ppl.c: use the new spelling of RWH_WRITE_LIFE_NOT_SET drivers/md/raid5.c: use the new spelling of RWH_WRITE_LIFE_NOT_SET bcache: don't export symbols bcache: remove the extra cflags for request.o bcache: at least try to shrink 1 node in bch_mca_scan() bcache: add idle_max_writeback_rate sysfs interface bcache: add code comments in bch_btree_leaf_dirty() bcache: fix deadlock in bcache_allocator bcache: add code comment bch_keylist_pop() and bch_keylist_pop_front() bcache: deleted code comments for dead code in bch_data_insert_keys() bcache: add more accurate error messages in read_super() bcache: fix static checker warning in bcache_device_free() bcache: fix a lost wake-up problem caused by mca_cannibalize_lock bcache: fix fifo index swapping condition in journal_pin_cmp() md/raid10: prevent access of uninitialized resync_pages offset md: avoid invalid memory access for array sb->dev_roles md/raid1: avoid soft lockup under high load null_blk: add zone open, close, and finish support dm: add zone open, close and finish support ...
2019-11-25 | Merge tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block | Linus Torvalds | 10 | -242/+369
Pull core block updates from Jens Axboe: "Due to more granular branches, this one is small and will be followed with other core branches that add specific features. I meant to just have a core and drivers branch, but due to external dependencies we ended up adding a few more that are also core. The changes are: - Fixes and improvements for the zoned device support (Ajay, Damien) - sed-opal table writing and datastore UID (Revanth) - blk-cgroup (and bfq) blk-cgroup stat fixes (Tejun) - Improvements to the block stats tracking (Pavel) - Fix for overrunning sysfs buffer for large number of CPUs (Ming) - Optimization for small IO (Ming, Christoph) - Fix typo in RWH lifetime hint (Eugene) - Dead code removal and documentation (Bart) - Reduction in memory usage for queue and tag set (Bart) - Kerneldoc header documentation (André) - Device/partition revalidation fixes (Jan) - Stats tracking for flush requests (Konstantin) - Various other little fixes here and there (et al)" * tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block: (48 commits) Revert "block: split bio if the only bvec's length is > SZ_4K" block: add iostat counters for flush requests block,bfq: Skip tracing hooks if possible block: sed-opal: Introduce SUM_SET_LIST parameter and append it using 'add_token_u64' blk-cgroup: cgroup_rstat_updated() shouldn't be called on cgroup1 block: Don't disable interrupts in trigger_softirq() sbitmap: Delete sbitmap_any_bit_clear() blk-mq: Delete blk_mq_has_free_tags() and blk_mq_can_queue() block: split bio if the only bvec's length is > SZ_4K block: still try to split bio if the bvec crosses pages blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT blk-cgroup: reimplement basic IO stats using cgroup rstat blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive() blk-throtl: stop using blkg->stat_bytes and ->stat_ios bfq-iosched: stop using blkg->stat_bytes and ->stat_ios bfq-iosched: relocate bfqg_*rwstat*() helpers block: add zone open, 
close and finish ioctl support block: add zone open, close and finish operations block: Simplify REQ_OP_ZONE_RESET_ALL handling block: Remove REQ_OP_ZONE_RESET plugging ...
2019-11-25 | Merge tag 'for-5.5/libata-20191121' of git://git.kernel.dk/linux-block | Linus Torvalds | 1 | -6/+7
Pull libata updates from Jens Axboe: "Just a few fixes all over the place, support for the Annapurna SATA controller, and a patchset that cleans up the error defines and ultimately fixes an issue with sata_mv" * tag 'for-5.5/libata-20191121' of git://git.kernel.dk/linux-block: ata: pata_artop: make arrays static const, makes object smaller ata_piix: remove open-coded dmi_match(DMI_OEM_STRING) ata: sata_mv, avoid trigerrable BUG_ON ata: make qc_prep return ata_completion_errors ata: define AC_ERR_OK ata: Documentation, fix function names libata: Ensure ata_port probe has completed before detach ahci: tegra: use regulator_bulk_set_supply_names() ahci: Add support for Amazon's Annapurna Labs SATA controller
2019-11-25 | Merge tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block | Linus Torvalds | 5 | -1/+386
Pull io_uring updates from Jens Axboe: "A lot of stuff has been going on this cycle, with improving the support for networked IO (and hence unbounded request completion times) being one of the major themes. There's been a set of fixes done this week, I'll send those out as well once we're certain we're fully happy with them. This contains: - Unification of the "normal" submit path and the SQPOLL path (Pavel) - Support for sparse (and bigger) file sets, and updating of those file sets without needing to unregister/register again. - Independently sized CQ ring, instead of just making it always 2x the SQ ring size. This makes it more flexible for networked applications. - Support for overflowed CQ ring, never dropping events but providing backpressure on submits. - Add support for absolute timeouts, not just relative ones. - Support for generic cancellations. This divorces io_uring from workqueues as well, which additionally gets us one step closer to generic async system call support. - With cancellations, we can support grabbing the process file table as well, just like we do mm context. This allows support for system calls that create file descriptors, like accept4() support that's built on top of that. - Support for io_uring tracing (Dmitrii) - Support for linked timeouts. These abort an operation if it isn't completed by the time noted in the linked timeout. 
- Speedup tracking of poll requests - Various cleanups making the code easier to follow (Jackie, Pavel, Bob, YueHaibing, me) - Update MAINTAINERS with new io_uring list" * tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block: (64 commits) io_uring: make POLL_ADD/POLL_REMOVE scale better io-wq: remove now redundant struct io_wq_nulls_list io_uring: Fix getting file for non-fd opcodes io_uring: introduce req_need_defer() io_uring: clean up io_uring_cancel_files() io-wq: ensure free/busy list browsing see all items io-wq: ensure we have a stable view of ->cur_work for cancellations io_wq: add get/put_work handlers to io_wq_create() io_uring: check for validity of ->rings in teardown io_uring: fix potential deadlock in io_poll_wake() io_uring: use correct "is IO worker" helper io_uring: fix -ENOENT issue with linked timer with short timeout io_uring: don't do flush cancel under inflight_lock io_uring: flag SQPOLL busy condition to userspace io_uring: make ASYNC_CANCEL work with poll and timeout io_uring: provide fallback request for OOM situations io_uring: convert accept4() -ERESTARTSYS into -EINTR io_uring: fix error clear of ->file_table in io_sqe_files_register() io_uring: separate the io_free_req and io_free_req_find_next interface io_uring: keep io_put_req only responsible for release and put req ...
2019-11-25Merge tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmddLinus Torvalds3-56/+244
Pull tpmdd updates from Jarkko Sakkinen: - support for Cr50 fTPM - support for fTPM on AMD Zen+ CPUs - TPM 2.0 trusted keys code relocated from drivers/char/tpm to security/keys * tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmdd: KEYS: trusted: Remove set but not used variable 'keyhndl' tpm: Switch to platform_get_irq_optional() tpm_crb: fix fTPM on AMD Zen+ CPUs KEYS: trusted: Move TPM2 trusted keys code KEYS: trusted: Create trusted keys subsystem KEYS: Use common tpm_buf for trusted and asymmetric keys tpm: Move tpm_buf code to include/linux/ tpm: use GFP_KERNEL instead of GFP_HIGHMEM for tpm_buf tpm: add check after commands attribs tab allocation tpm: tpm_tis_spi: Drop THIS_MODULE usage from driver struct tpm: tpm_tis_spi: Cleanup includes tpm: tpm_tis_spi: Support cr50 devices tpm: tpm_tis_spi: Introduce a flow control callback tpm: Add a flag to indicate TPM power is managed by firmware dt-bindings: tpm: document properties for cr50 tpm_tis: override durations for STM tpm with firmware 1.2.8.28 tpm: provide a way to override the chip returned durations tpm: Remove duplicate code from caps_show() in tpm-sysfs.c
2019-11-25vfs: properly and reliably lock f_pos in fdget_pos()Linus Torvalds1-2/+0
fdget_pos() is used by file operations that will read and update f_pos: things like "read()", "write()" and "lseek()" (but not, for example, "pread()/pwrite" that get their file positions elsewhere). However, it had two separate escape clauses for this, because not everybody wants or needs serialization of the file position. The first and most obvious case is the "file descriptor doesn't have a position at all", ie a stream-like file. Except we didn't actually use FMODE_STREAM, but instead used FMODE_ATOMIC_POS. The reason for that was that FMODE_STREAM didn't exist back in the days, but also that we didn't want to mark all the special cases, so we only marked the ones that _required_ position atomicity according to POSIX - regular files and directories. The case one was intentionally lazy, but now that we _do_ have FMODE_STREAM we could and should just use it. With the change to use FMODE_STREAM, there are no remaining uses for FMODE_ATOMIC_POS, and all the code to set it is deleted. Any cases where we don't want the serialization because the driver (or subsystem) doesn't use the file position should just be updated to do "stream_open()". We've done that for all the obvious and common situations, we may need a few more. Quoting Kirill Smelkov in the original FMODE_STREAM thread (see link below for full email): "And I appreciate if people could help at least somehow with "getting rid of mixed case entirely" (i.e. always lock f_pos_lock on !FMODE_STREAM), because this transition starts to diverge from my particular use-case too far. 
To me it makes sense to do that transition as follows: - convert nonseekable_open -> stream_open via stream_open.cocci; - audit other nonseekable_open calls and convert left users that truly don't depend on position to stream_open; - extend stream_open.cocci to analyze alloc_file_pseudo as well (this will cover pipes and sockets), or maybe convert pipes and sockets to FMODE_STREAM manually; - extend stream_open.cocci to analyze file_operations that use no_llseek or noop_llseek, but do not use nonseekable_open or alloc_file_pseudo. This might find files that have stream semantic but are opened differently; - extend stream_open.cocci to analyze file_operations whose .read/.write do not use ppos at all (independently of how file was opened); - ... - after that remove FMODE_ATOMIC_POS and always take f_pos_lock if !FMODE_STREAM; - gather bug reports for deadlocked read/write and convert missed cases to FMODE_STREAM, probably extending stream_open.cocci along the road to catch similar cases i.e. always take f_pos_lock unless a file is explicitly marked as being stream, and try to find and cover all files that are streams" We have not done the "extend stream_open.cocci to analyze alloc_file_pseudo" as well, but the previous commit did manually handle the case of pipes and sockets. The other case where we can avoid locking f_pos is the "this file descriptor only has a single user and it is us, and thus there is no need to lock it". The second test was correct, although a bit subtle and worth just re-iterating here. There are two kinds of other sources of references to the same file descriptor: file descriptors that have been explicitly shared across fork() or with dup(), and file tables having elevated reference counts due to threading (or explicit file sharing with clone()). The first case would have incremented the file count explicitly, and in the second case the previous __fdget() would have incremented it for us and set the FDPUT_FPUT flag. 
But in both cases the file count would be greater than one, so the "file_count(file) > 1" test catches both situations. Also note that if file_count is 1, that also means that no other thread can have access to the file table, so there also cannot be races with concurrent calls to dup()/fork()/clone() that would increment the file count any other way. Link: https://lore.kernel.org/linux-fsdevel/20190413184404.GA13490@deco.navytux.spb.ru Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marco Elver <elver@google.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Paul McKenney <paulmck@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
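The locking rule described in this commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct sketch_file`, `SKETCH_FMODE_STREAM` and `sketch_need_pos_lock()` are hypothetical stand-ins, not the kernel's `struct file`, `FMODE_STREAM` or `fdget_pos()`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the stream mode bit; the value is illustrative. */
#define SKETCH_FMODE_STREAM 0x1u

/* Hypothetical, heavily reduced model of an open file. */
struct sketch_file {
	unsigned int mode; /* mode flags, e.g. SKETCH_FMODE_STREAM */
	long count;        /* reference count, as file_count() would report */
};

/*
 * Decide whether the file position must be serialized:
 *  - stream-like files have no position, so never lock;
 *  - otherwise lock only if someone else may share the file,
 *    which the message says is caught by "count > 1".
 */
static bool sketch_need_pos_lock(const struct sketch_file *f)
{
	if (f->mode & SKETCH_FMODE_STREAM)
		return false;
	return f->count > 1;
}
```

A regular file with a single reference, or any stream file, skips the lock; a regular file shared through dup(), fork() or threads takes it.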
2019-11-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2-0/+8
Pull networking fixes from David Miller: 1) Validate tunnel options length in act_tunnel_key, from Xin Long. 2) Fix DMA sync bug in gve driver, from Adi Suresh. 3) TSO kills performance on some r8169 chips due to HW issues, disable by default in that case, from Corinna Vinschen. 4) Fix clock disable mismatch in fec driver, from Chuhong Yuan. 5) Fix interrupt status bits define in hns3 driver, from Huazhong Tan. 6) Fix workqueue deadlocks in qeth driver, from Julian Wiedmann. 7) Don't napi_disable() twice in r8152 driver, from Hayes Wang. 8) Fix SKB extension memory leak, from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) r8152: avoid to call napi_disable twice MAINTAINERS: Add myself as maintainer of virtio-vsock udp: drop skb extensions before marking skb stateless net: rtnetlink: prevent underflows in do_setvfinfo() can: m_can_platform: remove unnecessary m_can_class_resume() call can: m_can_platform: set net_device structure as driver data hv_netvsc: Fix send_table offset in case of a host bug hv_netvsc: Fix offset usage in netvsc_send_table() net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN sfc: Only cancel the PPS workqueue if it exists nfc: port100: handle command failure cleanly net-sysfs: fix netdev_queue_add_kobject() breakage r8152: Re-order napi_disable in rtl8152_close net: qca_spi: Move reset_count to struct qcaspi net: qca_spi: fix receive buffer size check net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode Revert "net/ibmvnic: Fix EOI when running in XIVE mode" net/mlxfw: Verify FSM error code translation doesn't exceed array size net/mlx5: Update the list of the PCI supported devices net/mlx5: Fix auto group size calculation ...
2019-11-22udp: drop skb extensions before marking skb statelessFlorian Westphal1-0/+6
Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free assumes all skb head state has been dropped already. This will leak the extension memory in case the skb has extensions other than the ipsec secpath, e.g. bridge nf data. To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have extensions or if the extension space can be free'd. Fixes: 895b5c9f206eb7d25dc1360a ("netfilter: drop bridge nf reset from nf_reset") Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: Byron Stanoszek <gandalf@winds.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21Merge branch 'nvme-5.5' of git://git.infradead.org/nvme into for-5.5/drivers-postJens Axboe1-0/+6
Pull NVMe changes from Keith: "- The only new feature is the optional hwmon support for nvme (Guenter and Akinobu) - A universal work-around for controllers reading discard payloads beyond the range boundary (Eduard) - Chaitanya graciously agreed to share the target driver maintenance" * 'nvme-5.5' of git://git.infradead.org/nvme: nvme: hwmon: add quirk to avoid changing temperature threshold nvme: hwmon: provide temperature min and max values for each sensor nvmet: add another maintainer nvme: Discard workaround for non-conformant devices nvme: Add hardware monitoring support
2019-11-22nvme: hwmon: provide temperature min and max values for each sensorAkinobu Mita1-0/+6
According to the NVMe specification, the over temperature threshold and under temperature threshold features shall be implemented for Composite Temperature if a non-zero WCTEMP field value is reported in the Identify Controller data structure. The features are also implemented for all implemented temperature sensors (i.e., all Temperature Sensor fields that report a non-zero value). This provides the over temperature threshold and under temperature threshold for each sensor as temperature min and max values of hwmon sysfs attributes. The WCTEMP is already provided as a temperature max value for Composite Temperature, but this change is not incompatible, because the default value of the over temperature threshold for Composite Temperature is the WCTEMP. Now the alarm attribute for Composite Temperature indicates that one of the temperatures is outside of a temperature threshold, because there is only a single bit in the Critical Warning field that indicates a temperature is outside of a threshold. Example output from the "sensors" command: nvme-pci-0100 Adapter: PCI adapter Composite: +33.9°C (low = -273.1°C, high = +69.8°C) (crit = +79.8°C) Sensor 1: +34.9°C (low = -273.1°C, high = +65261.8°C) Sensor 2: +31.9°C (low = -273.1°C, high = +65261.8°C) Sensor 5: +47.9°C (low = -273.1°C, high = +65261.8°C) This also adds helper macros for kelvin from/to milli Celsius conversion, and replaces the repeated code in hwmon.c. Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Jean Delvare <jdelvare@suse.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
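The kelvin from/to milli Celsius conversion mentioned above is simple arithmetic, and a small sketch makes the odd-looking limits in the sample output easier to read. The function names below are hypothetical, and the rounding is an assumption; the macros actually added by the commit may be named and rounded differently.

```c
/*
 * Hypothetical helpers, not the macros from the commit.
 * 0 degrees Celsius is 273.15 K, i.e. 273150 milli-kelvin.
 */
static long sketch_kelvin_to_millicelsius(long kelvin)
{
	return kelvin * 1000L - 273150L;
}

/*
 * Inverse conversion, rounded to the nearest kelvin.
 * Assumes the result is non-negative (no temperature below 0 K).
 */
static long sketch_millicelsius_to_kelvin(long millicelsius)
{
	return (millicelsius + 273150L + 500L) / 1000L;
}
```

Under these assumptions a threshold of 0 K maps to -273150 milli Celsius and a threshold of 65535 K (all 16 bits set) maps to 65261850 milli Celsius, which is consistent with the "low = -273.1°C, high = +65261.8°C" values shown for the individual sensors.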
2019-11-21block: add iostat counters for flush requestsKonstantin Khlebnikov1-0/+1
Requests that trigger flushing of the volatile writeback cache to disk (barriers) have a significant effect on overall performance. The block layer has a sophisticated engine for combining several flush requests into one, but there are no statistics for the actual flushes executed by the disk. Requests which trigger flushes usually are barriers - zero-size writes. This patch adds two iostat counters into /sys/class/block/$dev/stat and /proc/diskstats - count of completed flush requests and their total time. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-21Merge branch 'kvm-tsx-ctrl' into HEADPaolo Bonzini64-206/+347
Conflicts: arch/x86/kvm/vmx/vmx.c
2019-11-21Merge tag 'kvmarm-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEADPaolo Bonzini12-21/+167
KVM/arm updates for Linux 5.5: - Allow non-ISV data aborts to be reported to userspace - Allow injection of data aborts from userspace - Expose stolen time to guests - GICv4 performance improvements - vgic ITS emulation fixes - Simplify FWB handling - Enable halt poll counters - Make the emulated timer PREEMPT_RT compliant Conflicts: include/uapi/linux/kvm.h
2019-11-19net/tls: enable sk_msg redirect to tls socket egressWillem de Bruijn1-0/+2
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket with TLS_TX takes the following path: tcp_bpf_sendmsg_redir tcp_bpf_push_locked tcp_bpf_push kernel_sendpage_locked sock->ops->sendpage_locked Also update the flags test in tls_sw_sendpage_locked to allow flag MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this. Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t Link: https://github.com/wdebruij/kerneltools/commits/icept.2 Fixes: 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP") Fixes: f3de19af0f5b ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18btrfs: rename btrfs_block_group_cacheDavid Sterba1-12/+12
The type name is misleading, a single entry is named 'cache' while this normally means a collection of objects. Rename that everywhere. Also the identifier was quite long, making function prototypes harder to format. Suggested-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add incompat for raid1 with 3, 4 copiesDavid Sterba1-0/+1
The new raid1c3 and raid1c4 profiles are backward incompatible and the name shall be 'raid1c34', the status can be found in the global supported features in /sys/fs/btrfs/features or in the per-filesystem directory. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add support for 4-copy replication (raid1c4)David Sterba2-1/+6
Add new block group profile to store 4 copies in a similar way that current RAID1 does. The profile attributes and constraints are defined in the raid table and used by the same code that already handles the 2- and 3-copy RAID1. The minimum number of devices is 4, the maximum number of devices/chunks that can be lost/damaged is 3. There is no comparable traditional RAID level, the profile is added for future needs to accompany triple-parity and beyond. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add support for 3-copy replication (raid1c3)David Sterba2-2/+7
Add new block group profile to store 3 copies in a similar way that current RAID1 does. The profile attributes and constraints are defined in the raid table and used by the same code that already handles the 2-copy RAID1. The minimum number of devices is 3, the maximum number of devices/chunks that can be lost/damaged is 2. Like RAID6 but with 33% space utilization. Signed-off-by: David Sterba <dsterba@suse.com>
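The constraints quoted for the 3-copy and 4-copy profiles follow directly from the number of copies, which a short sketch can make explicit. The struct and function names are hypothetical and are not taken from the btrfs raid table; the code only restates the numbers given in the two messages above.

```c
/* Hypothetical summary of an N-copy mirror profile. */
struct sketch_mirror_profile {
	int copies;         /* copies of every block */
	int min_devices;    /* one device per copy is needed */
	int max_failures;   /* devices that may be lost while one copy survives */
	int usable_percent; /* share of raw space holding unique data, rounded down */
};

/* Derive the constraints for a profile that stores `copies` copies (copies >= 1). */
static struct sketch_mirror_profile sketch_mirror_profile(int copies)
{
	struct sketch_mirror_profile p;

	p.copies = copies;
	p.min_devices = copies;
	p.max_failures = copies - 1;
	p.usable_percent = 100 / copies;
	return p;
}
```

With 3 copies this gives 3 devices minimum, 2 tolerated failures and 33% utilization; with 4 copies it gives 4 devices, 3 tolerated failures and 25%.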
2019-11-18btrfs: add dedicated members for start and length of a block groupDavid Sterba1-8/+8
The on-disk format of block group item makes use of the key that stores the offset and length. This is further used in the code, although this makes things harder to understand. The key is also packed so the offset/length is not properly aligned as u64. Add start (key.objectid) and length (key.offset) members to block group and remove the embedded key. When the item is searched or written, a local variable for key is used. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: move block_group_item::used to block groupDavid Sterba1-3/+2
For unknown reasons, the member 'used' in the block group struct is stored in the b-tree item and accessed everywhere using the special accessor helper. Let's unify it and make it a regular member and only update the item before writing it to the tree. The item is still being used for flags and chunk_objectid, there's some duplication until the item is removed in following patches. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add blake2b to checksumming algorithmsDavid Sterba1-0/+1
Add blake2b (with 256 bit digest) to the list of possible checksumming algorithms used by BTRFS. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add sha256 to checksumming algorithmJohannes Thumshirn1-0/+1
Add sha256 to the list of possible checksumming algorithms used by BTRFS. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add xxhash64 to checksumming algorithmsJohannes Thumshirn1-0/+1
Add xxhash64 to the list of possible checksumming algorithms used by BTRFS. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18blk-cgroup: cgroup_rstat_updated() shouldn't be called on cgroup1Tejun Heo1-1/+2
Currently, cgroup rstat is supported only on cgroup2 hierarchy and rstat functions shouldn't be called on cgroup1 cgroups. While converting blk-cgroup core statistics to rstat, f73316482977 ("blk-cgroup: reimplement basic IO stats using cgroup rstat") accidentally ended up calling cgroup_rstat_updated() on cgroup1 cgroups causing crashes. Longer term, we probably should add cgroup1 support to rstat but for now let's mask the call directly. Fixes: f73316482977 ("blk-cgroup: reimplement basic IO stats using cgroup rstat") Tested-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-18btrfs: tracepoints: constify all pointersDavid Sterba1-26/+26
We don't modify the data passed to tracepoints, some of the declarations are already const, add it to the rest. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: tracepoints: drop typecasts from printkDavid Sterba1-15/+13
Remove typecasts from trace printk, adjust types and move typecast to the assignment if necessary. When assigning, the types are more obvious compared to matching the variables to the format strings. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: use enum for extent type definesChengguang Xu1-4/+6
Use enum to replace macro definitions of extent types. Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: get rid of pointless wtag variable in async-thread.cOmar Sandoval1-3/+3
Commit ac0c7cf8be00 ("btrfs: fix crash when tracepoint arguments are freed by wq callbacks") added a void pointer, wtag, which is passed into trace_btrfs_all_work_done() instead of the freed work item. This is silly for a few reasons: 1. The freed work item still has the same address. 2. work is still in scope after it's freed, so assigning wtag doesn't stop anyone from using it. 3. The tracepoint has always taken a void * argument, so assigning wtag doesn't actually make things any more type-safe. (Note that the original bug in commit bc074524e123 ("btrfs: prefix fsid to all trace events") was that the void * was implicitly cast when it was passed to btrfs_work_owner() in the trace point itself). Instead, let's add some clearer warnings as comments. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-17Merge tag 'iommu-fixes-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds1-2/+4
Pull iommu fixes from Joerg Roedel: - Fix for Intel IOMMU to correct invalidation commands when in SVA mode. - Update MAINTAINERS entry for Intel IOMMU * tag 'iommu-fixes-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros MAINTAINERS: Update for INTEL IOMMU (VT-d) entry
2019-11-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds5-3/+11
Pull networking fixes from David Miller: 1) Fix memory leak in xfrm_state code, from Steffen Klassert. 2) Fix races between devlink reload operations and device setup/cleanup, from Jiri Pirko. 3) Null deref in NFC code, from Stephan Gerhold. 4) Refcount fixes in SMC, from Ursula Braun. 5) Memory leak in slcan open error paths, from Jouni Hogander. 6) Fix ETS bandwidth validation in hns3, from Yonglong Liu. 7) Info leak on short USB request answers in ax88172a driver, from Oliver Neukum. 8) Release mem region properly in ep93xx_eth, from Chuhong Yuan. 9) PTP config timestamp flags validation, from Richard Cochran. 10) Dangling pointers after SKB data realloc in seg6, from Andrea Mayer. 11) Missing free_netdev() in gemini driver, from Chuhong Yuan. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits) ipmr: Fix skb headroom in ipmr_get_route(). net: hns3: cleanup of stray struct hns3_link_mode_mapping net/smc: fix fastopen for non-blocking connect() rds: ib: update WR sizes when bringing up connection net: gemini: add missed free_netdev net: dsa: tag_8021q: Fix dsa_8021q_restore_pvid for an absent pvid seg6: fix skb transport_header after decap_and_validate() seg6: fix srh pointer in get_srh() net: stmmac: Use the correct style for SPDX License Identifier octeontx2-af: Use the correct style for SPDX License Identifier ptp: Extend the test program to check the external time stamp flags. mlx5: Reject requests to enable time stamping on both edges. igb: Reject requests that fail to enable time stamping on both edges. dp83640: Reject requests to enable time stamping on both edges. mv88e6xxx: Reject requests to enable time stamping on both edges. ptp: Introduce strict checking of external time stamp options. renesas: reject unsupported external timestamp flags mlx5: reject unsupported external timestamp flags igb: reject unsupported external timestamp flags dp83640: reject unsupported external timestamp flags ...
2019-11-15mm/memory_hotplug: fix try_offline_node()David Hildenbrand1-0/+1
try_offline_node() is pretty much broken right now: - The node span is updated when onlining memory, not when adding it. We ignore memory that was never onlined. Bad. - We touch possible garbage memmaps. The pfn_to_nid(pfn) can easily trigger a kernel panic. Bad for memory that is offline but also bad for subsection hotadd with ZONE_DEVICE, whereby the memmap of the first PFN of a section might contain garbage. - Sections belonging to mixed nodes are not properly considered. As memory blocks might belong to multiple nodes, we would have to walk all pageblocks (or at least subsections) within present sections. However, we don't have a way to identify whether a memmap that is not online was initialized (relevant for ZONE_DEVICE). This makes things more complicated. Luckily, we can piggyback on the node span and the nid stored in memory blocks. Currently, the node span is grown when calling move_pfn_range_to_zone() - e.g., when onlining memory, and shrunk when removing memory, before calling try_offline_node(). Sysfs links are created via link_mem_sections(), e.g., during boot or when adding memory. If the node still spans memory or if any memory block belongs to the nid, we don't set the node offline. As memory blocks that span multiple nodes cannot get offlined, the nid stored in memory blocks is reliable enough (for such online memory blocks, the node still spans the memory). Introduce for_each_memory_block() to efficiently walk all memory blocks. Note: We will soon stop shrinking the ZONE_DEVICE zone and the node span when removing ZONE_DEVICE memory to fix similar issues (access of garbage memmaps) - until we have a reliable way to identify whether these memmaps were properly initialized. This implies later, that once a node had ZONE_DEVICE memory, we won't be able to set a node offline - which should be acceptable.
Since commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") memory that is added is not associated with a zone/node (memmap not initialized). The introducing commit 60a5a19e7419 ("memory-hotplug: remove sysfs file of node") already missed that we could have multiple nodes for a section and that the zone/node span is updated when onlining pages, not when adding them. I tested this by hotplugging two DIMMs to a memory-less and cpu-less NUMA node. The node is properly onlined when adding the DIMMs. When removing the DIMMs, the node is properly offlined. Masayoshi Mizuma reported: : Without this patch, memory hotplug fails as panic: : : BUG: kernel NULL pointer dereference, address: 0000000000000000 : ... : Call Trace: : remove_memory_block_devices+0x81/0xc0 : try_remove_memory+0xb4/0x130 : __remove_memory+0xa/0x20 : acpi_memory_device_remove+0x84/0x100 : acpi_bus_trim+0x57/0x90 : acpi_bus_trim+0x2e/0x90 : acpi_device_hotplug+0x2b2/0x4d0 : acpi_hotplug_work_fn+0x1a/0x30 : process_one_work+0x171/0x380 : worker_thread+0x49/0x3f0 : kthread+0xf8/0x130 : ret_from_fork+0x35/0x40 [david@redhat.com: v3] Link: http://lkml.kernel.org/r/20191102120221.7553-1-david@redhat.com Link: http://lkml.kernel.org/r/20191028105458.28320-1-david@redhat.com Fixes: 60a5a19e7419 ("memory-hotplug: remove sysfs file of node") Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after d0dc12e86b319 Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J.
Wysocki" <rafael@kernel.org> Cc: Keith Busch <keith.busch@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Nayna Jain <nayna@linux.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15ptp: Introduce strict checking of external time stamp options.Richard Cochran1-1/+3
User space may request time stamps on rising edges, falling edges, or both. However, the particular mode may or may not be supported in the hardware or in the driver. This patch adds a "strict" flag that tells drivers to ensure that the requested mode will be honored. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
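What "strict" could mean on the driver side can be sketched as follows. This is an assumption-laden illustration, not code from any driver: the flag names are hypothetical stand-ins, and the hardware is assumed to time stamp in exactly one fixed edge mode.

```c
#include <stdbool.h>

/* Hypothetical flag bits; names and values are illustrative only. */
#define SKETCH_RISING_EDGE  (1u << 1)
#define SKETCH_FALLING_EDGE (1u << 2)
#define SKETCH_STRICT_FLAGS (1u << 3)

/*
 * hw_edges is the single edge mode the (assumed) hardware implements:
 * rising, falling, or both bits together.
 * Without the strict flag the driver does what it can, as before.
 * With the strict flag the request is accepted only if the requested
 * edges match what the hardware will really do.
 */
static bool sketch_driver_accepts(unsigned int flags, unsigned int hw_edges)
{
	unsigned int requested = flags & (SKETCH_RISING_EDGE | SKETCH_FALLING_EDGE);

	if (!(flags & SKETCH_STRICT_FLAGS))
		return true;
	return requested == hw_edges;
}
```

The design point is that the strict bit is opt-in: a request without it keeps the old, lenient behaviour, while a request with it is accepted only when the requested mode is the one the hardware will actually apply.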
2019-11-15ptp: Validate requests to enable time stamping of external signals.Richard Cochran1-0/+1
Commit 415606588c61 ("PTP: introduce new versions of IOCTLs") introduced a new external time stamp ioctl that validates the flags. This patch extends the validation to ensure that at least one rising or falling edge flag is set when enabling external time stamps. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
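The extended validation described here amounts to two checks on the request flags. The sketch below uses hypothetical flag names and values and a hypothetical function; it is not the code from the PTP character device.

```c
#include <stdbool.h>

/* Hypothetical flag bits; names and values are illustrative only. */
#define SKETCH_ENABLE_FEATURE (1u << 0)
#define SKETCH_RISING_EDGE    (1u << 1)
#define SKETCH_FALLING_EDGE   (1u << 2)

/*
 * Accept an external time stamp request only if
 *  1) no unknown flag bits are set, and
 *  2) when enabling, at least one edge (rising or falling) is selected.
 */
static bool sketch_extts_flags_valid(unsigned int flags)
{
	const unsigned int known = SKETCH_ENABLE_FEATURE |
				   SKETCH_RISING_EDGE |
				   SKETCH_FALLING_EDGE;
	const unsigned int edges = SKETCH_RISING_EDGE | SKETCH_FALLING_EDGE;

	if (flags & ~known)
		return false;
	if ((flags & SKETCH_ENABLE_FEATURE) && !(flags & edges))
		return false;
	return true;
}
```

A disable request (no enable bit) passes without any edge flag, since the edge selection only matters when time stamping is being turned on.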
2019-11-15KVM: x86: deliver KVM IOAPIC scan request to target vCPUsNitesh Narayan Lal1-0/+2
In IOAPIC fixed delivery mode instead of flushing the scan requests to all vCPUs, we should only send the requests to vCPUs specified within the destination field. This patch introduces kvm_get_dest_vcpus_mask() API which retrieves an array of target vCPUs by using kvm_apic_map_get_dest_lapic() and then based on the vcpus_idx, it sets the bit in a bitmap. However, if the above fails, kvm_get_dest_vcpus_mask() finds the target vCPUs by traversing all available vCPUs and then sets the bits in the bitmap. If we had different vCPUs in the previous request for the same redirection table entry then bits corresponding to these vCPUs are also set. This is done to keep ioapic_handled_vectors synchronized. This bitmap is then eventually passed on to kvm_make_vcpus_request_mask() to generate a masked request only for the target vCPUs. This would enable us to reduce the latency overhead on isolated vCPUs caused by the IPI to process due to KVM_REQ_IOAPIC_SCAN. Suggested-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
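The bitmap step described above, turning a list of target vCPU indexes into a mask, can be sketched in plain C. Everything here is hypothetical (the size limit, the helper names, the plain array input); it does not reproduce kvm_get_dest_vcpus_mask() or the kernel bitmap API.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical limit and layout; not KVM's real constants. */
#define SKETCH_MAX_VCPUS      256
#define SKETCH_BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITMAP_LONGS   \
	((SKETCH_MAX_VCPUS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Set bit `idx` in the bitmap. */
static void sketch_set_bit(unsigned long *bitmap, unsigned int idx)
{
	bitmap[idx / SKETCH_BITS_PER_LONG] |= 1UL << (idx % SKETCH_BITS_PER_LONG);
}

/* Return non-zero if bit `idx` is set. */
static int sketch_test_bit(const unsigned long *bitmap, unsigned int idx)
{
	return (bitmap[idx / SKETCH_BITS_PER_LONG] >>
		(idx % SKETCH_BITS_PER_LONG)) & 1UL;
}

/*
 * OR the indexes of the destination vCPUs into the bitmap.
 * Existing bits are kept, so vCPUs from a previous request for the
 * same entry can be merged in by calling this a second time.
 * Out-of-range indexes are ignored.
 */
static void sketch_add_dest_vcpus(const unsigned int *dest_idx, size_t n,
				  unsigned long *bitmap)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (dest_idx[i] < SKETCH_MAX_VCPUS)
			sketch_set_bit(bitmap, dest_idx[i]);
}
```

Only the vCPUs whose bits are set would then receive the request, instead of every vCPU in the VM.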
2019-11-15KVM: remember position in kvm->vcpus arrayRadim Krčmář1-8/+3
Fetching an index for any vcpu in kvm->vcpus array by traversing the entire array every time is costly. This patch remembers the position of each vcpu in kvm->vcpus array by storing it in vcpus_idx under kvm_vcpu structure. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
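The trade-off in this commit is a linear search versus a stored index. A minimal sketch, using a hypothetical `struct sketch_vcpu` rather than the real kvm_vcpu, shows both lookups side by side.

```c
#include <stddef.h>

/* Hypothetical, reduced vCPU object; `idx` plays the role of vcpus_idx. */
struct sketch_vcpu {
	int id;  /* identifier chosen by user space */
	int idx; /* position in the vcpus array, recorded at creation */
};

/* Old approach: walk the array until the pointer matches. O(n). */
static int sketch_index_by_scan(struct sketch_vcpu *const *vcpus, int n,
				const struct sketch_vcpu *v)
{
	int i;

	for (i = 0; i < n; i++)
		if (vcpus[i] == v)
			return i;
	return -1; /* not found */
}

/* New approach: read the remembered position. O(1). */
static int sketch_index_cached(const struct sketch_vcpu *v)
{
	return v->idx;
}
```

Both functions return the same value as long as `idx` is written once when the vCPU is placed in the array and the array order never changes afterwards, which is the assumption this sketch relies on.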
2019-11-15perf/core: Provide a kernel-internal interface to pause perf_eventLike Xu1-0/+5
Export perf_event_pause() as an external accessor for kernel users (such as KVM) that need to both disable a perf_event and read its count while taking perf_event_ctx_lock only once. The value can optionally be reset as well. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Like Xu <like.xu@linux.intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>