aboutsummaryrefslogtreecommitdiffstats
path: root/init/Kconfig (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-02-05membarrier: Provide core serializing command, *_SYNC_COREMathieu Desnoyers1-0/+3
Provide core serializing membarrier command to support memory reclaim by JIT. Each architecture needs to explicitly opt into that support by documenting in their architecture code how they provide the core serializing instructions required when returning from the membarrier IPI, and after the scheduler has updated the curr->mm pointer (before going back to user-space). They should then select ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on their architecture. Architectures selecting this feature need to either document that they issue core serializing instructions when returning to user-space, or implement their architecture-specific sync_core_before_usermode(). Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-9-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05locking: Introduce sync_core_before_usermode()Mathieu Desnoyers1-0/+3
Introduce an architecture function that ensures the current CPU issues a core serializing instruction before returning to usermode. This is needed for the membarrier "sync_core" command. Architectures defining the sync_core_before_usermode() static inline need to select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-7-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05powerpc, membarrier: Skip memory barrier in switch_mm()Mathieu Desnoyers1-0/+3
Allow PowerPC to skip the full memory barrier in switch_mm(), and only issue the barrier when scheduling into a task belonging to a process that has registered to use expedited private. Threads targeting the same VM but which belong to different thread groups is a tricky case. It has a few consequences: It turns out that we cannot rely on get_nr_threads(p) to count the number of threads using a VM. We can use (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1) instead to skip the synchronize_sched() for cases where the VM only has a single user, and that user only has a single thread. It also turns out that we cannot use for_each_thread() to set thread flags in all threads using a VM, as it only iterates on the thread group. Therefore, test the membarrier state variable directly rather than relying on thread flags. This means membarrier_register_private_expedited() needs to set the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows private expedited membarrier commands to succeed. membarrier_arch_switch_mm() now tests for the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20180129202020.8515-3-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-12Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+1
Pull scheduler fixes from Ingo Molnar: "A Kconfig fix, a build fix and a membarrier bug fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: membarrier: Disable preemption when calling smp_call_function_many() sched/isolation: Make CONFIG_CPU_ISOLATION=y depend on SMP or COMPILE_TEST ia64, sched/cputime: Fix build error if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
2018-01-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+7
Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09bpf: introduce BPF_JIT_ALWAYS_ON configAlexei Starovoitov1-0/+7
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-08sched/isolation: Make CONFIG_CPU_ISOLATION=y depend on SMP or COMPILE_TESTGeert Uytterhoeven1-0/+1
On uniprocessor systems, critical and non-critical tasks cannot be isolated, as there is only a single CPU core. Hence enabling CPU isolation by default on such systems does not make much sense. Instead of changing the default for !SMP, fix this by making the feature depend on SMP, with an override for compile-testing. Note that its sole selector (NO_HZ_FULL) already depends on SMP. This decreases kernel size for a default uniprocessor kernel by ca. 1 KiB. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 2c43838c99d9d23f ("sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default") Link: http://lkml.kernel.org/r/1514891590-20782-1-git-send-email-geert@linux-m68k.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-18sched/isolation: Enable CONFIG_CPU_ISOLATION=y by defaultFrederic Weisbecker1-1/+5
The "isolcpus=" boot parameter support was always built-in before we moved the related code under CONFIG_CPU_ISOLATION. Having it disabled by default is very confusing for people accustomed to use this parameter. So enable it by dafault to keep the previous behaviour but keep it optable for those who want to tinify their kernels. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Christoph Lameter <cl@linux.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: kernel test robot <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/1513275507-29200-3-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-17EXPERT Kconfig menu: fix broken EXPERT menuRandy Dunlap1-91/+93
Clean up the EXPERT menu (yet again). Move FHANDLE and CHECKPOINT_RESTORE into the primary EXPERT menu since they already depend on EXPERT. Move BPF_SYSCALL and USERFAULTFD out of the EXPERT Kconfig symbols menu list since they do not depend on EXPERT and were breaking the continuity of that menu list. Move all of the KALLSYMS Kconfig symbols to the end of the EXPERT menu. This separates the kernel services from the build options. This patch depends on [PATCH] pci: move PCI_QUIRKS to the PCI bus menu (https://lkml.org/lkml/2017/11/2/907). Link: http://lkml.kernel.org/r/72e4465a-a5ff-cb3c-1a90-11aa4861b161@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> [BPF] Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-6/+0
Merge updates from Andrew Morton: - a few misc bits - ocfs2 updates - almost all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits) memory hotplug: fix comments when adding section mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP mm: simplify nodemask printing mm,oom_reaper: remove pointless kthread_run() error check mm/page_ext.c: check if page_ext is not prepared writeback: remove unused function parameter mm: do not rely on preempt_count in print_vma_addr mm, sparse: do not swamp log with huge vmemmap allocation failures mm/hmm: remove redundant variable align_end mm/list_lru.c: mark expected switch fall-through mm/shmem.c: mark expected switch fall-through mm/page_alloc.c: broken deferred calculation mm: don't warn about allocations which stall for too long fs: fuse: account fuse_inode slab memory as reclaimable mm, page_alloc: fix potential false positive in __zone_watermark_ok mm: mlock: remove lru_add_drain_all() mm, sysctl: make NUMA stats configurable shmem: convert shmem_init_inodecache() to void Unify migrate_pages and move_pages access checks mm, pagevec: rename pagevec drained field ...
2017-11-15mm: slabinfo: remove CONFIG_SLABINFOYang Shi1-6/+0
According to discussion with Christoph (https://marc.info/?l=linux-kernel&m=150695909709711&w=2), it sounds like it is pointless to keep CONFIG_SLABINFO around. This patch removes the CONFIG_SLABINFO config option, but /proc/slabinfo is still available. [yang.s@alibaba-inc.com: v11] Link: http://lkml.kernel.org/r/1507656303-103845-3-git-send-email-yang.s@alibaba-inc.com Link: http://lkml.kernel.org/r/1507152550-46205-3-git-send-email-yang.s@alibaba-inc.com Signed-off-by: Yang Shi <yang.s@alibaba-inc.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15Merge tag 'pci-v4.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds1-9/+0
Pull PCI updates from Bjorn Helgaas: - detach driver before tearing down procfs/sysfs (Alex Williamson) - disable PCIe services during shutdown (Sinan Kaya) - fix ASPM oops on systems with no Root Ports (Ard Biesheuvel) - fix ASPM LTR_L1.2_THRESHOLD programming (Bjorn Helgaas) - fix ASPM Common_Mode_Restore_Time computation (Bjorn Helgaas) - fix portdrv MSI/MSI-X vector allocation (Dongdong Liu, Bjorn Helgaas) - report non-fatal AER errors only to the affected endpoint (Gabriele Paoloni) - distribute bus numbers, MMIO, and I/O space among hotplug bridges to allow more devices to be hot-added (Mika Westerberg) - fix pciehp races during initialization and surprise link down (Mika Westerberg) - handle surprise-removed devices in PME handling (Qiang) - support resizable BARs for large graphics devices (Christian König) - expose SR-IOV offset, stride, and VF device ID via sysfs (Filippo Sironi) - create SR-IOV virtfn/physfn sysfs links before attaching driver (Stuart Hayes) - fix SR-IOV "ARI Capable Hierarchy" restore issue (Tony Nguyen) - enforce Kconfig IOV/REALLOC dependency (Sascha El-Sharkawy) - avoid slot reset if bridge itself is broken (Jan Glauber) - clean up pci_reset_function() path (Jan H. Schönherr) - make pci_map_rom() fail if the option ROM is invalid (Changbin Du) - convert timers to timer_setup() (Kees Cook) - move PCI_QUIRKS to PCI bus Kconfig menu (Randy Dunlap) - constify pci_dev_type and intel_mid_pci_ops (Bhumika Goyal) - remove unnecessary pci_dev, pci_bus, resource, pcibios_set_master() declarations (Bjorn Helgaas) - fix endpoint framework overflows and BUG()s (Dan Carpenter) - fix endpoint framework issues (Kishon Vijay Abraham I) - avoid broken Cavium CN8xxx bus reset behavior (David Daney) - extend Cavium ACS capability quirks (Vadim Lomovtsev) - support Synopsys DesignWare RC in ECAM mode (Ard Biesheuvel) - turn off dra7xx clocks cleanly on shutdown (Keerthy) - fix Faraday probe error path (Wei Yongjun) - support HiSilicon STB SoC PCIe host controller (Jianguo Sun) - fix Hyper-V interrupt affinity issue (Dexuan Cui) - remove useless ACPI warning for Hyper-V pass-through devices (Vitaly Kuznetsov) - support multiple MSI on iProc (Sandor Bodo-Merle) - support Layerscape LS1012a and LS1046a PCIe host controllers (Hou Zhiqiang) - fix Layerscape default error response (Minghuan Lian) - support MSI on Tango host controller (Marc Gonzalez) - support Tegra186 PCIe host controller (Manikanta Maddireddy) - use generic accessors on Tegra when possible (Thierry Reding) - support V3 Semiconductor PCI host controller (Linus Walleij) * tag 'pci-v4.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (85 commits) PCI/ASPM: Add L1 Substates definitions PCI/ASPM: Reformat ASPM register definitions PCI/ASPM: Use correct capability pointer to program LTR_L1.2_THRESHOLD PCI/ASPM: Account for downstream device's Port Common_Mode_Restore_Time PCI: xgene: Rename xgene_pcie_probe_bridge() to xgene_pcie_probe() PCI: xilinx: Rename xilinx_pcie_link_is_up() to xilinx_pcie_link_up() PCI: altera: Rename altera_pcie_link_is_up() to altera_pcie_link_up() PCI: Fix kernel-doc build warning PCI: Fail pci_map_rom() if the option ROM is invalid PCI: Move pci_map_rom() error path PCI: Move PCI_QUIRKS to the PCI bus menu alpha/PCI: Make pdev_save_srm_config() static PCI: Remove unused declarations PCI: Remove redundant pci_dev, pci_bus, resource declarations PCI: Remove redundant pcibios_set_master() declarations PCI/PME: Handle invalid data when reading Root Status PCI: hv: Use effective affinity mask PCI: pciehp: Do not clear Presence Detect Changed during initialization PCI: pciehp: Fix race condition handling surprise link down PCI: Distribute available resources to hotplug-capable bridges ...
2017-11-15Merge branch 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/trivialLinus Torvalds1-1/+1
Pull trivial tree updates from Jiri Kosina: "The usual rocket-science from trivial tree for 4.15" * 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: MAINTAINERS: relinquish kconfig MAINTAINERS: Update my email address treewide: Fix typos in Kconfig kfifo: Fix comments init/Kconfig: Fix module signing document location misc: ibmasm: Return error on error path HID: logitech-hidpp: fix mistake in printk, "feeback" -> "feedback" MAINTAINERS: Correct path to uDraw PS3 driver tracing: Fix doc mistakes in trace sample tracing: Kconfig text fixes for CONFIG_HWLAT_TRACER MIPS: Alchemy: Remove reverted CONFIG_NETLINK_MMAP from db1xxx_defconfig mm/huge_memory.c: fixup grammar in comment lib/xz: Add fall-through comments to a switch statement
2017-11-08Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar1-1/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07PCI: Move PCI_QUIRKS to the PCI bus menuRandy Dunlap1-9/+0
Localize PCI_QUIRKS in the PCI bus menu. Move PCI_QUIRKS to the PCI bus menu instead of the (often broken) General Setup EXPERT menu. The prompt still depends on EXPERT. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-10-27sched/isolation: Handle the nohz_full= parameterFrederic Weisbecker1-1/+0
We want to centralize the isolation management, done by the housekeeping subsystem. Therefore we need to handle the nohz_full= parameter from there. Since nohz_full= so far has involved unbound timers, watchdog, RCU and tilegx NAPI isolation, we keep that default behaviour. nohz_full= will be deprecated in the future. We want to control the isolation features from the isolcpus= parameter. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1509072159-31808-10-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-27sched/isolation: Split out new CONFIG_CPU_ISOLATION=y config from CONFIG_NO_HZ_FULLFrederic Weisbecker1-0/+8
Split the housekeeping config from CONFIG_NO_HZ_FULL. This way we finally separate the isolation code from NOHZ. Although a dependency to CONFIG_NO_HZ_FULL remains for now, while the housekeeping code still deals with NOHZ internals. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1509072159-31808-8-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-12init/Kconfig: Fix module signing document locationNathan Chancellor1-1/+1
This was moved in commit 94e980cc45f2 ("Documentation/module-signing.txt: convert to ReST markup") and was missed by commit 8c27ceff3604 ("docs: fix locations of several documents that got moved"). Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-10-07kbuild: Fix optimization level choice defaultUlf Magnusson1-1/+1
The choice containing the CC_OPTIMIZE_FOR_PERFORMANCE symbol accidentally added a "CONFIG_" prefix when trying to make it the default, selecting an undefined symbol as the default. The mistake is harmless here: Since the default symbol is not visible, the choice falls back on using the visible symbol as the default instead, which is CC_OPTIMIZE_FOR_PERFORMANCE, as intended. A patch that makes Kconfig print a warning in this case has been submitted separately: http://www.spinics.net/lists/linux-kbuild/msg15566.html Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-09-06mm: add SLUB free list pointer obfuscationKees Cook1-0/+9
This SLUB free list pointer obfuscation code is modified from Brad Spengler/PaX Team's code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. This adds a per-cache random value to SLUB caches that is XORed with their freelist pointer address and value. This adds nearly zero overhead and frustrates the very common heap overflow exploitation method of overwriting freelist pointers. A recent example of the attack is written up here: http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit and there is a section dedicated to the technique the book "A Guide to Kernel Exploitation: Attacking the Core". This is based on patches by Daniel Micay, and refactored to minimize the use of #ifdef. With 200-count cycles of "hackbench -g 20 -l 1000" I saw the following run times: before: mean 10.11882499999999999995 variance .03320378329145728642 stdev .18221905304181911048 after: mean 10.12654000000000000014 variance .04700556623115577889 stdev .21680767106160192064 The difference gets lost in the noise, but if the above is to be taken literally, using CONFIG_FREELIST_HARDENED is 0.07% slower. Link: http://lkml.kernel.org/r/20170802180609.GA66807@beast Signed-off-by: Kees Cook <keescook@chromium.org> Suggested-by: Daniel Micay <danielmicay@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Tycho Andersen <tycho@docker.com> Cc: Alexander Popov <alex.popov@linux.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-01futex: Allow for compiling out PI supportNicolas Pitre1-1/+6
This makes it possible to preserve basic futex support and compile out the PI support when RT mutexes are not available. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1708010024190.5981@knanqh.ubzr
2017-07-06mm: allow slab_nomerge to be set at build timeKees Cook1-0/+14
Some hardened environments want to build kernels with slab_nomerge already set (so that they do not depend on remembering to set the kernel command line option). This is desired to reduce the risk of kernel heap overflows being able to overwrite objects from merged caches and changes the requirements for cache layout control, increasing the difficulty of these attacks. By keeping caches unmerged, these kinds of exploits can usually only damage objects in the same cache (though the risk to metadata exploitation is unchanged). Link: http://lkml.kernel.org/r/20170620230911.GA25238@beast Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Cc: David Windsor <dave@nullcore.net> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: David Windsor <dave@nullcore.net> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Tejun Heo <tj@kernel.org> Cc: Daniel Mack <daniel@zonque.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Helge Deller <deller@gmx.de> Cc: Rik van Riel <riel@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06Merge branch 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds1-2/+5
Pull cgroup changes from Tejun Heo: - Waiman made the debug controller work and a lot more useful on cgroup2 - There were a couple issues with cgroup subtree delegation. The documentation on delegating to a non-root user was missing some part and cgroup namespace support wasn't factoring in delegation at all. The documentation is updated and the now there is a mount option to make cgroup namespace fit for delegation * 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: implement "nsdelegate" mount option cgroup: restructure cgroup_procs_write_permission() cgroup: "cgroup.subtree_control" should be writeable by delegatee cgroup: fix lockdep warning in debug controller cgroup: refactor cgroup_masks_read() in the debug controller cgroup: make debug an implicit controller on cgroup2 cgroup: Make debug cgroup support v2 and thread mode cgroup: Make Kconfig prompt of debug cgroup more accurate cgroup: Move debug cgroup to its own file cgroup: Keep accurate count of tasks in each css_set
2017-07-03Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+1
Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Add the SYSTEM_SCHEDULING bootup state to move various scheduler debug checks earlier into the bootup. This turns silent and sporadically deadly bugs into nice, deterministic splats. Fix some of the splats that triggered. (Thomas Gleixner) - A round of restructuring and refactoring of the load-balancing and topology code (Peter Zijlstra) - Another round of consolidating ~20 of incremental scheduler code history: this time in terms of wait-queue nomenclature. (I didn't get much feedback on these renaming patches, and we can still easily change any names I might have misplaced, so if anyone hates a new name, please holler and I'll fix it.) (Ingo Molnar) - sched/numa improvements, fixes and updates (Rik van Riel) - Another round of x86/tsc scheduler clock code improvements, in hope of making it more robust (Peter Zijlstra) - Improve NOHZ behavior (Frederic Weisbecker) - Deadline scheduler improvements and fixes (Luca Abeni, Daniel Bristot de Oliveira) - Simplify and optimize the topology setup code (Lauro Ramos Venancio) - Debloat and decouple scheduler code some more (Nicolas Pitre) - Simplify code by making better use of llist primitives (Byungchul Park) - ... plus other fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) sched/cputime: Refactor the cputime_adjust() code sched/debug: Expose the number of RT/DL tasks that can migrate sched/numa: Hide numa_wake_affine() from UP build sched/fair: Remove effective_load() sched/numa: Implement NUMA node level wake_affine() sched/fair: Simplify wake_affine() for the single socket case sched/numa: Override part of migrate_degrades_locality() when idle balancing sched/rt: Move RT related code from sched/core.c to sched/rt.c sched/deadline: Move DL related code from sched/core.c to sched/deadline.c sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled sched/fair: Spare idle load balancing on nohz_full CPUs nohz: Move idle balancer registration to the idle path sched/loadavg: Generalize "_idle" naming to "_nohz" sched/core: Drop the unused try_get_task_struct() helper function sched/fair: WARN() and refuse to set buddy when !se->on_rq sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h> ...
2017-06-23sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabledNicolas Pitre1-0/+1
Make CONFIG_CPUSETS=y depend on SMP as this feature makes no sense on UP. This allows for configuring out cpuset_cpumask_can_shrink() and task_can_attach() entirely, which shrinks the kernel a bit. Signed-off-by: Nicolas Pitre <nico@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170614171926.8345-2-nicolas.pitre@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-14cgroup: Make Kconfig prompt of debug cgroup more accurateWaiman Long1-2/+5
The Kconfig prompt and description of the debug cgroup controller more accurate by saying that it is for debug purpose only and its interfaces are unstable. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-08rcu: Move RCU non-debug Kconfig options to kernel/rcuPaul E. McKenney1-238/+1
RCU's Kconfig options are scattered, and there are enough of them that it would be good for them to be more centralized. This commit therefore extracts RCU's Kconfig options from init/Kconfig into a new kernel/rcu/Kconfig file. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcu: Eliminate NOCBs CPU-state Kconfig optionsPaul E. McKenney1-53/+0
The CONFIG_RCU_NOCB_CPU_ALL, CONFIG_RCU_NOCB_CPU_NONE, and CONFIG_RCU_NOCB_CPU_ZERO Kconfig options are used only in testing and are redundant with the rcu_nocbs= boot parameter. This commit therefore removes these three Kconfig options and adjusts the rcutorture scripts to use the boot parameter instead. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcu: Remove debugfs tracingPaul E. McKenney1-8/+0
RCU's debugfs tracing used to be the only reasonable low-level debug information available, but ftrace and event tracing has since surpassed the RCU debugfs level of usefulness. This commit therefore removes RCU's debugfs tracing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08srcu: Remove Classic SRCUPaul E. McKenney1-19/+2
Classic SRCU was only ever intended to be a fallback in case of issues with Tree/Tiny SRCU, and the latter two are doing quite well in testing. This commit therefore removes Classic SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcu: Remove the RCU_KTHREAD_PRIO Kconfig optionPaul E. McKenney1-31/+0
Anything that can be done with the RCU_KTHREAD_PRIO Kconfig option can also be done with the rcutree.kthread_prio kernel boot parameter. This commit therefore removes this Kconfig option. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rik van Riel <riel@redhat.com>
2017-06-08srcu: Apply trivial callback lists to shrink Tiny SRCUPaul E. McKenney1-1/+1
The rcu_segcblist structure provides quite a bit of functionality, and Tiny SRCU needs almost none of it. So this commit replaces Tiny SRCU's uses of rcu_segcblist with a simple singly linked list with tail pointer. This change significantly reduces Tiny SRCU's memory footprint, more than making up for the growth caused by the creation of rcu_segcblist.c Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08srcu: Make SRCU be once again optionalPaul E. McKenney1-1/+0
Commit d160a727c40e ("srcu: Make SRCU be built by default") in response to build errors, which were caused by code that included srcu.h despite !SRCU. However, srcutiny.o is almost 2K of code, which is not insignificant for those attempting to run the Linux kernel on IoT devices. This commit therefore makes SRCU be once again optional, and adjusts srcu.h to allow error-free inclusion in !SRCU kernel builds. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Nicolas Pitre <nico@linaro.org>
2017-05-02rcu: Separately compile large rcu_segcblist functionsPaul E. McKenney1-0/+3
This commit creates a new kernel/rcu/rcu_segcblist.c file that contains non-trivial segcblist functions. Trivial functions remain as static inline functions in kernel/rcu/rcu_segcblist.h Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2017-04-24srcu: Make SRCU be built by defaultPaul E. McKenney1-0/+1
SRCU is optional, and included only if there is a "select SRCU" in effect. However, we now have Tiny SRCU, so this commit defaults CONFIG_SRCU=y. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-24srcu: Fix Kconfig botch when SRCU not selectedPaul E. McKenney1-2/+2
If the CONFIG_SRCU option is not selected, for example, when building arch/tile allnoconfig, the following build errors appear: kernel/rcu/tree.o: In function `srcu_online_cpu': tree.c:(.text+0x4248): multiple definition of `srcu_online_cpu' kernel/rcu/srcutree.o:srcutree.c:(.text+0x2120): first defined here kernel/rcu/tree.o: In function `srcu_offline_cpu': tree.c:(.text+0x4250): multiple definition of `srcu_offline_cpu' kernel/rcu/srcutree.o:srcutree.c:(.text+0x2160): first defined here The corresponding .config file shows CONFIG_TREE_SRCU=y, but no sign of CONFIG_SRCU, which fatally confuses SRCU's #ifdefs, resulting in the above errors. The reason this occurs is the folowing line in init/Kconfig's definition for TREE_SRCU: default y if !TINY_RCU && !CLASSIC_SRCU If CONFIG_CLASSIC_SRCU=n, as it will be in for allnoconfig, and if CONFIG_SMP=y, then we will get CONFIG_TREE_SRCU=y but no CONFIG_SRCU, as seen in the .config file, and which will result in the above errors. This error did not show up during rcutorture testing because rcutorture forces CONFIG_SRCU=y, as it must to prevent build errors in rcutorture.c. This commit therefore conditions TREE_SRCU (and TINY_SRCU, while it is at it) with SRCU, like this: default y if SRCU && !TINY_RCU && !CLASSIC_SRCU Reported-by: kbuild test robot <fengguang.wu@intel.com> Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20170423162205.GP3956@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-21Merge branches 'doc.2017.04.12a', 'fixes.2017.04.19a' and 'srcu.2017.04.21a' into HEADPaul E. McKenney1-2/+37
doc.2017.04.12a: Documentation updates fixes.2017.04.19a: Miscellaneous fixes srcu.2017.04.21a: Parallelize SRCU callback handling
2017-04-19rcu: Make RCU_FANOUT_LEAF help text more explicit about skew_tickPaul E. McKenney1-2/+8
If you set RCU_FANOUT_LEAF too high, you can get lock contention on the leaf rcu_node, and you should boot with the skew_tick kernel parameter set in order to avoid this lock contention. This commit therefore upgrades the RCU_FANOUT_LEAF help text to explicitly state this. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-18srcu: Introduce CLASSIC_SRCU Kconfig optionPaul E. McKenney1-2/+19
The TREE_SRCU rewrite is large and a bit on the non-simple side, so this commit helps reduce risk by allowing the old v4.11 SRCU algorithm to be selected using a new CLASSIC_SRCU Kconfig option that depends on RCU_EXPERT. The default is to use the new TREE_SRCU and TINY_SRCU algorithms, in order to help get these the testing that they need. However, if your users do not require the update-side scalability that is to be provided by TREE_SRCU, select RCU_EXPERT and then CLASSIC_SRCU to revert back to the old classic SRCU algorithm. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-18srcu: Create a tiny SRCUPaul E. McKenney1-0/+12
In response to automated complaints about modifications to SRCU increasing its size, this commit creates a tiny SRCU that is used in SMP=n && PREEMPT=n builds. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-02-27Merge branch 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds1-0/+10
Pull cgroup updates from Tejun Heo: "Several noteworthy changes. - Parav's rdma controller is finally merged. It is very straight forward and can limit the abosolute numbers of common rdma constructs used by different cgroups. - kernel/cgroup.c got too chubby and disorganized. Created kernel/cgroup/ subdirectory and moved all cgroup related files under kernel/ there and reorganized the core code. This hurts for backporting patches but was long overdue. - cgroup v2 process listing reimplemented so that it no longer depends on allocating a buffer large enough to cache the entire result to sort and uniq the output. v2 has always mangled the sort order to ensure that users don't depend on the sorted output, so this shouldn't surprise anybody. This makes the pid listing functions use the same iterators that are used internally, which have to have the same iterating capabilities anyway. - perf cgroup filtering now works automatically on cgroup v2. This patch was posted a long time ago but somehow fell through the cracks. - misc fixes asnd documentation updates" * 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits) kernfs: fix locking around kernfs_ops->release() callback cgroup: drop the matching uid requirement on migration for cgroup v2 cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy cgroup: misc cleanups cgroup: call subsys->*attach() only for subsystems which are actually affected by migration cgroup: track migration context in cgroup_mgctx cgroup: cosmetic update to cgroup_taskset_add() rdmacg: Fixed uninitialized current resource usage cgroup: Add missing cgroup-v2 PID controller documentation. rdmacg: Added documentation for rdmacg IB/core: added support to use rdma cgroup controller rdmacg: Added rdma cgroup controller cgroup: fix a comment typo cgroup: fix RCU related sparse warnings cgroup: move namespace code to kernel/cgroup/namespace.c cgroup: rename functions for consistency cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c cgroup: separate out cgroup1_kf_syscall_ops cgroup: refactor mount path and clearly distinguish v1 and v2 paths cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c ...
2017-02-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-0/+14
Merge updates from Andrew Morton: "142 patches: - DAX updates - various misc bits - OCFS2 updates - most of MM" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (142 commits) mm/z3fold.c: limit first_num to the actual range of possible buddy indexes mm: fix <linux/pagemap.h> stray kernel-doc notation zram: remove obsolete sysfs attrs mm/memblock.c: remove unnecessary log and clean up oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA mm: drop unused argument of zap_page_range() mm: drop zap_details::check_swap_entries mm: drop zap_details::ignore_dirty mm, page_alloc: warn_alloc nodemask is NULL when cpusets are disabled mm: help __GFP_NOFAIL allocations which do not trigger OOM killer mm, oom: do not enforce OOM killer for __GFP_NOFAIL automatically mm: consolidate GFP_NOFAIL checks in the allocator slowpath lib/show_mem.c: teach show_mem to work with the given nodemask arch, mm: remove arch specific show_mem mm, page_alloc: warn_alloc print nodemask mm, page_alloc: do not report all nodes in show_mem Revert "mm: bail out in shrink_inactive_list()" mm, vmscan: consider eligible zones in get_scan_count mm, vmscan: cleanup lru size claculations mm, vmscan: do not count freed pages as PGDEACTIVATE ...
2017-02-22Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printkLinus Torvalds1-7/+9
Pull printk updates from Petr Mladek: - Add Petr Mladek, Sergey Senozhatsky as printk maintainers, and Steven Rostedt as the printk reviewer. This idea came up after the discussion about printk issues at Kernel Summit. It was formulated and discussed at lkml[1]. - Extend a lock-less NMI per-cpu buffers idea to handle recursive printk() calls by Sergey Senozhatsky[2]. It is the first step in sanitizing printk as discussed at Kernel Summit. The change allows to see messages that would normally get ignored or would cause a deadlock. Also it allows to enable lockdep in printk(). This already paid off. The testing in linux-next helped to discover two old problems that were hidden before[3][4]. - Remove unused parameter by Sergey Senozhatsky. Clean up after a past change. [1] http://lkml.kernel.org/r/1481798878-31898-1-git-send-email-pmladek@suse.com [2] http://lkml.kernel.org/r/20161227141611.940-1-sergey.senozhatsky@gmail.com [3] http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com [4] http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: printk: drop call_console_drivers() unused param printk: convert the rest to printk-safe printk: remove zap_locks() function printk: use printk_safe buffers in printk printk: report lost messages in printk safe/nmi contexts printk: always use deferred printk when flush printk_safe lines printk: introduce per-cpu safe_print seq buffer printk: rename nmi.c and exported api printk: use vprintk_func in vprintk() MAINTAINERS: Add printk maintainers
2017-02-22slub: make sysfs directories for memcg sub-caches optionalTejun Heo1-0/+14
SLUB creates a per-cache directory under /sys/kernel/slab which hosts a bunch of debug files. Usually, there aren't that many caches on a system and this doesn't really matter; however, if memcg is in use, each cache can have per-cgroup sub-caches. SLUB creates the same directories for these sub-caches under /sys/kernel/slab/$CACHE/cgroup. Unfortunately, because there can be a lot of cgroups, active or draining, the product of the numbers of caches, cgroups and files in each directory can reach a very high number - hundreds of thousands is commonplace. Millions and beyond aren't difficult to reach either. What's under /sys/kernel/slab is primarily for debugging and the information and control on the a root cache already cover its sub-caches. While having a separate directory for each sub-cache can be helpful for development, it doesn't make much sense to pay this amount of overhead by default. This patch introduces a boot parameter slub_memcg_sysfs which determines whether to create sysfs directories for per-memcg sub-caches. It also adds CONFIG_SLUB_MEMCG_SYSFS_ON which determines the boot parameter's default value and defaults to 0. [akpm@linux-foundation.org: kset_unregister(NULL) is legal] Link: http://lkml.kernel.org/r/20170204145203.GB26958@mtj.duckdns.org Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-22Merge tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds1-0/+7
Pull char/misc driver updates from Greg KH: "Here is the big char/misc driver patchset for 4.11-rc1. Lots of different driver subsystems updated here: rework for the hyperv subsystem to handle new platforms better, mei and w1 and extcon driver updates, as well as a number of other "minor" driver updates. All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (169 commits) goldfish: Sanitize the broken interrupt handler x86/platform/goldfish: Prevent unconditional loading vmbus: replace modulus operation with subtraction vmbus: constify parameters where possible vmbus: expose hv_begin/end_read vmbus: remove conditional locking of vmbus_write vmbus: add direct isr callback mode vmbus: change to per channel tasklet vmbus: put related per-cpu variable together vmbus: callback is in softirq not workqueue binder: Add support for file-descriptor arrays binder: Add support for scatter-gather binder: Add extra size to allocator binder: Refactor binder_transact() binder: Support multiple /dev instances binder: Deal with contexts in debugfs binder: Support multiple context managers binder: Split flat_binder_object auxdisplay: ht16k33: remove private workqueue auxdisplay: ht16k33: rework input device initialization ...
2017-02-20Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-14/+0
Pull RCU updates from Ingo Molnar: "The RCU changes in this cycle are: - Dynticks updates, consolidating open-coded counter accesses into a well-defined API - SRCU updates: Simplify algorithm, add formal verification - Documentation updates - Miscellaneous fixes - Torture-test updates Most of the diffstat comes from the relatively large documentation update" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) srcu: Reduce probability of SRCU ->unlock_count[] counter overflow rcutorture: Add CBMC-based formal verification for SRCU srcu: Force full grace-period ordering srcu: Implement more-efficient reader counts rcu: Adjust FQS offline checks for exact online-CPU detection rcu: Check cond_resched_rcu_qs() state less often to reduce GP overhead rcu: Abstract extended quiescent state determination rcu: Abstract dynticks extended quiescent state enter/exit operations rcu: Add lockdep checks to synchronous expedited primitives rcu: Eliminate unused expedited_normal counter llist: Clarify comments about when locking is needed rcu: Fix comment in rcu_organize_nocb_kthreads() rcu: Enable RCU tracepoints by default to aid in debugging rcu: Make rcu_cpu_starting() use its "cpu" argument rcu: Add comment headers to expedited-grace-period counter functions rcu: Don't wake rcuc/X kthreads on NOCB CPUs rcu: Re-enable TASKS_RCU for User Mode Linux rcu: Once again use NMI-based stack traces in stall warnings rcu: Remove short-term CPU kicking rcu: Add long-term CPU kicking ...
2017-02-08printk: rename nmi.c and exported apiSergey Senozhatsky1-7/+9
A preparation patch for printk_safe work. No functional change. - rename nmi.c to print_safe.c - add `printk_safe' prefix to some (which used both by printk-safe and printk-nmi) of the exported functions. Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-06Merge 4.10-rc7 into char-misc-nextGreg Kroah-Hartman1-0/+4
We want the hv and other fixes in here as well to handle merge and testing issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-03kbuild: modversions: add infrastructure for emitting relative CRCsArd Biesheuvel1-0/+4
This add the kbuild infrastructure that will allow architectures to emit vmlinux symbol CRCs as 32-bit offsets to another location in the kernel where the actual value is stored. This works around problems with CRCs being mistaken for relocatable symbols on kernels that self relocate at runtime (i.e., powerpc with CONFIG_RELOCATABLE=y) For the kbuild side of things, this comes down to the following: - introducing a Kconfig symbol MODULE_REL_CRCS - adding a -R switch to genksyms to instruct it to emit the CRC symbols as references into the .rodata section - making modpost distinguish such references from absolute CRC symbols by the section index (SHN_ABS) - making kallsyms disregard non-absolute symbols with a __crc_ prefix Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-31Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcuIngo Molnar1-14/+0
Pull RCU changes from Paul E. McKenney: - Dynticks updates, consolidating open-coded counter accesses into a well-defined API - SRCU updates: Simplify algorithm, add formal verification - Documentation updates - Miscellaneous fixes - Torture-test updates Signed-off-by: Ingo Molnar <mingo@kernel.org>