path: root/arch/riscv/include/asm/pgtable.h (follow)
2020-03-26  riscv: Add support to dump the kernel page tables  (Zong Li, 1 file, -0/+10)
In a similar manner to arm64, x86, powerpc, etc., it can traverse all page tables, and dump the page table layout with the memory types and permissions. Add a debugfs file at /sys/kernel/debug/kernel_page_tables to export the page table layout to userspace.

Signed-off-by: Zong Li <zong.li@sifive.com>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-03-05  RISC-V: Move all address space definition macros to one place  (Atish Patra, 1 file, -37/+41)
If both CONFIG_KASAN and CONFIG_SPARSEMEM_VMEMMAP are set, we get the following compilation error.

---------------------------------------------------------------
./arch/riscv/include/asm/pgtable-64.h: In function ‘pud_page’:
./include/asm-generic/memory_model.h:54:29: error: ‘vmemmap’ undeclared (first use in this function); did you mean ‘mem_map’?
 #define __pfn_to_page(pfn)      (vmemmap + (pfn))
                                  ^~~~~~~
./include/asm-generic/memory_model.h:82:21: note: in expansion of macro ‘__pfn_to_page’
 #define pfn_to_page __pfn_to_page
                     ^~~~~~~~~~~~~
./arch/riscv/include/asm/pgtable-64.h:70:9: note: in expansion of macro ‘pfn_to_page’
  return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
---------------------------------------------------------------

Fix the compilation errors by moving all the address space definition macros before including pgtable-64.h.

Fixes: 8ad8b72721d0 (riscv: Add KASAN support)
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-02-04  riscv: mm: add p?d_leaf() definitions  (Steven Price, 1 file, -0/+7)
walk_page_range() is going to be allowed to walk page tables other than those of user space. For this it needs to know when it has reached a 'leaf' entry in the page tables. This information is provided by the p?d_leaf() functions/macros.

For riscv a page is a leaf page when it has a read, write or execute bit set on it.

Link: http://lkml.kernel.org/r/20191218162402.45610-8-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Zong Li <zong.li@sifive.com>
Acked-by: Paul Walmsley <paul.walmsley@sifive.com> [arch/riscv]
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
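A minimal sketch of what these helpers look like under the rule just described (a leaf entry is one with any of the R/W/X bits set); the bodies below are illustrative rather than a quote of the patch:

    /*
     * Illustrative only: a p?d entry is a leaf when any of the read/write/
     * execute permission bits is set, since non-leaf entries keep all three
     * clear.
     */
    static inline int pmd_leaf(pmd_t pmd)
    {
            return pmd_present(pmd) &&
                   (pmd_val(pmd) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC));
    }

    static inline int pud_leaf(pud_t pud)
    {
            return pud_present(pud) &&
                   (pud_val(pud) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC));
    }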
2019-12-27  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next  (David S. Miller, 1 file, -0/+4)
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-12-27

The following pull-request contains BPF updates for your *net-next* tree.

We've added 127 non-merge commits during the last 17 day(s) which contain a total of 110 files changed, 6901 insertions(+), 2721 deletions(-).

There are three merge conflicts. Conflicts and resolution looks as follows:

1) Merge conflict in net/bpf/test_run.c:

   There was a tree-wide cleanup c593642c8be0 ("treewide: Use sizeof_field() macro") which gets in the way with b590cb5f802d ("bpf: Switch to offsetofend in BPF_PROG_TEST_RUN"):

     <<<<<<< HEAD
             if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) +
                                sizeof_field(struct __sk_buff, priority),
     =======
             if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority),
     >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

   There are a few occasions that look similar to this. Always take the chunk with offsetofend(). Note that there is one where the fields differ in here:

     <<<<<<< HEAD
             if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) +
                                sizeof_field(struct __sk_buff, tstamp),
     =======
             if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs),
     >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

   Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to 850a88cc4096 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN").

2) Merge conflict in arch/riscv/net/bpf_jit_comp.c:

   (I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.)

     <<<<<<< HEAD
             if (is_13b_check(off, insn))
                     return -1;
             emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx);
     =======
             emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx);
     >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

   Result should look like:

             emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx);

3) Merge conflict in arch/riscv/include/asm/pgtable.h:

     <<<<<<< HEAD
     =======
     #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
     #define VMALLOC_END      (PAGE_OFFSET - 1)
     #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

     #define BPF_JIT_REGION_SIZE    (SZ_128M)
     #define BPF_JIT_REGION_START   (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
     #define BPF_JIT_REGION_END     (VMALLOC_END)

     /*
      * Roughly size the vmemmap space to be large enough to fit enough
      * struct pages to map half the virtual address space. Then
      * position vmemmap directly below the VMALLOC region.
      */
     #define VMEMMAP_SHIFT \
             (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
     #define VMEMMAP_SIZE     BIT(VMEMMAP_SHIFT)
     #define VMEMMAP_END      (VMALLOC_START - 1)
     #define VMEMMAP_START    (VMALLOC_START - VMEMMAP_SIZE)

     #define vmemmap          ((struct page *)VMEMMAP_START)
     >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

   Only take the BPF_* defines from there and move them higher up in the same file. Remove the rest from the chunk. The VMALLOC_* etc defines got moved via 01f52e16b868 ("riscv: define vmemmap before pfn_to_page calls"). Result:

     [...]
     #define __S101   PAGE_READ_EXEC
     #define __S110   PAGE_SHARED_EXEC
     #define __S111   PAGE_SHARED_EXEC

     #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
     #define VMALLOC_END      (PAGE_OFFSET - 1)
     #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

     #define BPF_JIT_REGION_SIZE    (SZ_128M)
     #define BPF_JIT_REGION_START   (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
     #define BPF_JIT_REGION_END     (VMALLOC_END)

     /*
      * Roughly size the vmemmap space to be large enough to fit enough
      * struct pages to map half the virtual address space. Then
      * position vmemmap directly below the VMALLOC region.
      */
     #define VMEMMAP_SHIFT \
             (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
     #define VMEMMAP_SIZE     BIT(VMEMMAP_SHIFT)
     #define VMEMMAP_END      (VMALLOC_START - 1)
     #define VMEMMAP_START    (VMALLOC_START - VMEMMAP_SIZE)
     [...]

Let me know if there are any other issues.

Anyway, the main changes are:

1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific to a provided BPF object file. This provides an alternative, simplified API compared to standard libbpf interaction. Also, add libbpf extern variable resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko.

2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also, add various BPF riscv JIT improvements, from Björn Töpel.

3) Extend bpftool to allow matching BPF programs and maps by name, from Paul Chaignon.

4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI flag for allowing updates without service interruption, from Andrey Ignatov.

5) Cleanup and simplification of ring access functions for AF_XDP with a bonus of 0-5% performance improvement, from Magnus Karlsson.

6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa.

7) Move and extend test_select_reuseport into BPF program tests under BPF selftests, from Jakub Sitnicki.

8) Various BPF sample improvements for xdpsock for customizing parameters to set up and benchmark AF_XDP, from Jay Jayatheerthan.

9) Improve libbpf to provide a ulimit hint on permission denied errors. Also change XDP sample programs to attach in driver mode by default, from Toke Høiland-Jørgensen.

10) Extend BPF test infrastructure to allow changing skb mark from tc BPF programs, from Nikita V. Shirokov.

11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King.

12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after libbpf conversion, from Jesper Dangaard Brouer.

13) Minor misc improvements from various others.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20  riscv: define vmemmap before pfn_to_page calls  (David Abdurachmanov, 1 file, -17/+21)
pfn_to_page & page_to_pfn depend on vmemmap being available before the calls if the kernel is configured with CONFIG_SPARSEMEM_VMEMMAP=y. This was caused by the NOMMU changes, which moved the vmemmap definition below the definitions of the functions that call pfn_to_page & page_to_pfn.

Noticed while compiling a 5.5-rc2 kernel for Fedora/RISCV.

v2:
- Add a comment for vmemmap in source

Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Fixes: 6bd33e1ece52 ("riscv: add nommu support")
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
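For reference, the fix amounts to making a definition of this shape visible before the pfn_to_page()/page_to_pfn() users in pgtable.h; this is the same define quoted in the bpf-next merge message above:

    /* vmemmap must be visible before pfn_to_page()/page_to_pfn() are used */
    #define vmemmap          ((struct page *)VMEMMAP_START)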
2019-12-19  riscv, bpf: Provide RISC-V specific JIT image alloc/free  (Björn Töpel, 1 file, -0/+4)
This commit makes sure that the JIT image is kept close to the kernel text, so BPF calls can use relative calling with auipc/jalr or jal instead of loading the full 64-bit address and jalr. The BPF JIT image region is 128 MB before the kernel text.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-7-bjorn.topel@gmail.com
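Concretely, the 128 MB window below the kernel text corresponds to the BPF_JIT_REGION_* macros quoted in the bpf-next merge message above; they are repeated here only to illustrate the layout being described:

    /* Reserve a 128 MB JIT region immediately below the kernel mapping so
     * that jumps between BPF images and kernel text stay within the range
     * reachable with auipc/jalr. */
    #define BPF_JIT_REGION_SIZE    (SZ_128M)
    #define BPF_JIT_REGION_START   (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
    #define BPF_JIT_REGION_END     (VMALLOC_END)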
2019-11-28  Merge tag 'ioremap-5.5' of git://git.infradead.org/users/hch/ioremap  (Linus Torvalds, 1 file, -0/+6)
Pull generic ioremap support from Christoph Hellwig:
 "This adds the remaining bits for an entirely generic ioremap and iounmap to lib/ioremap.c. To facilitate that, it cleans up the giant mess of weird ioremap variants we had with no users outside the arch code.

  For now just the three newest ports use the code, but there is more than a handful others that can be converted without too much work.

  Summary:

   - clean up various obsolete ioremap and iounmap variants

   - add a new generic ioremap implementation and switch csky, nds32 and riscv over to it"

* tag 'ioremap-5.5' of git://git.infradead.org/users/hch/ioremap: (21 commits)
  nds32: use generic ioremap
  csky: use generic ioremap
  csky: remove ioremap_cache
  riscv: use the generic ioremap code
  lib: provide a simple generic ioremap implementation
  sh: remove __iounmap
  nios2: remove __iounmap
  hexagon: remove __iounmap
  m68k: rename __iounmap and mark it static
  arch: rely on asm-generic/io.h for default ioremap_* definitions
  asm-generic: don't provide ioremap for CONFIG_MMU
  asm-generic: ioremap_uc should behave the same with and without MMU
  xtensa: clean up ioremap
  x86: Clean up ioremap()
  parisc: remove __ioremap
  nios2: remove __ioremap
  alpha: remove the unused __ioremap wrapper
  hexagon: clean up ioremap
  ia64: rename ioremap_nocache to ioremap_uc
  unicore32: remove ioremap_cached
  ...
2019-11-17  riscv: add nommu support  (Christoph Hellwig, 1 file, -41/+53)
The kernel runs in M-mode without using page tables, and thus can't run bare metal without help from additional firmware.

Most of the patch is just stubbing out code not needed without page tables, but there is an interesting detail in the signals implementation:

 - The normal RISC-V syscall ABI only implements rt_sigreturn as VDSO entry point, but the ELF VDSO is not supported for nommu Linux. We instead copy the code to call the syscall onto the stack.

In addition to enabling the nommu code, a new defconfig for a small kernel image that can run in nommu mode on qemu is also provided. To run a kernel in qemu you can use the following command line:

qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \
        -kernel arch/riscv/boot/loader \
        -drive file=rootfs.ext2,format=raw,id=hd0 \
        -device virtio-blk-device,drive=hd0

Contains contributions from Damien Le Moal <Damien.LeMoal@wdc.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: updated to apply; add CONFIG_MMU guards around PCI_IOBASE definition to fix build issues; fixed checkpatch issues; move the PCI_IO_* and VMEMMAP address space macros along with the others; resolve sparse warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-11  riscv: use the generic ioremap code  (Christoph Hellwig, 1 file, -0/+6)
Use the generic ioremap code instead of providing a local version. Note that this relies on the asm-generic no-op definition of pgprot_noncached.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
Tested-by: Paul Walmsley <paul.walmsley@sifive.com> # rv32, rv64 boot
Acked-by: Paul Walmsley <paul.walmsley@sifive.com> # arch/riscv
2019-10-28  RISC-V: Add PCIe I/O BAR memory mapping  (Yash Shah, 1 file, -1/+6)
For legacy I/O BARs (non-MMIO BARs) to work correctly on RISC-V Linux, we need to establish a reserved memory region for them, so that drivers that wish to use the legacy I/O BARs can issue reads and writes against a memory region that is mapped to the host PCIe controller's I/O BAR mapping.

Signed-off-by: Yash Shah <yash.shah@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
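A sketch of what such a reserved window can look like in pgtable.h; the names, the 16 MB size, and the placement directly below the VMEMMAP area are assumptions made for illustration, not a quote from the patch:

    /* Illustrative only: carve out a fixed-size virtual window for legacy
     * PCI I/O BARs in the kernel address layout (size/placement assumed). */
    #define PCI_IO_SIZE      SZ_16M
    #define PCI_IO_END       VMEMMAP_START
    #define PCI_IO_START     (PCI_IO_END - PCI_IO_SIZE)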
2019-10-23  riscv: Fix implicit declaration of 'page_to_section'  (Kefeng Wang, 1 file, -4/+1)
With CONFIG_SPARSEMEM and !CONFIG_SPARSEMEM_VMEMMAP,

arch/riscv/include/asm/pgtable.h: In function ‘mk_pte’:
include/asm-generic/memory_model.h:64:14: error: implicit declaration of function ‘page_to_section’; did you mean ‘present_section’? [-Werror=implicit-function-declaration]
   int __sec = page_to_section(__pg); \
               ^~~~~~~~~~~~~~~

Fixed by changing mk_pte() from an inline function to a macro.

Cc: Logan Gunthorpe <logang@deltatee.com>
Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
[paul.walmsley@sifive.com: fixed checkpatch errors]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
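The shape of the change, sketched under the assumption that mk_pte() is a thin wrapper around pfn_pte() (as on most architectures); once it is a macro, page_to_section() is only expanded at the call sites, where memory_model.h is fully set up:

    /* Before (sketch): an inline function, which forces page_to_pfn() and,
     * with sparsemem, page_to_section() to be resolvable at this point. */
    static inline pte_t mk_pte(struct page *page, pgprot_t prot)
    {
            return pfn_pte(page_to_pfn(page), prot);
    }

    /* After (sketch): a macro, expanded only where it is actually used. */
    #define mk_pte(page, prot)    pfn_pte(page_to_pfn(page), prot)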
2019-10-23  riscv: fix fs/proc/kcore.c compilation with sparsemem enabled  (David Abdurachmanov, 1 file, -2/+0)
Failed to compile Fedora/RISCV kernel (5.4-rc3+) with sparsemem enabled:

fs/proc/kcore.c: In function 'read_kcore':
fs/proc/kcore.c:510:8: error: implicit declaration of function 'kern_addr_valid'; did you mean 'virt_addr_valid'? [-Werror=implicit-function-declaration]
  510 |    if (kern_addr_valid(start)) {
      |        ^~~~~~~~~~~~~~~
      |        virt_addr_valid

Looking at other architectures I don't see kern_addr_valid being guarded by CONFIG_FLATMEM.

Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Tested-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
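In other words, the fix is simply to stop hiding the definition behind CONFIG_FLATMEM so that generic code such as fs/proc/kcore.c can always see it; riscv's kern_addr_valid() was a trivial stub along these lines (sketch, not a quote of the hunk):

    /* Before (sketch): only visible with CONFIG_FLATMEM */
    #ifdef CONFIG_FLATMEM
    #define kern_addr_valid(addr)    (1) /* FIXME */
    #endif

    /* After (sketch): always visible, regardless of the memory model */
    #define kern_addr_valid(addr)    (1) /* FIXME */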
2019-10-15  RISC-V: fix virtual address overlapped in FIXADDR_START and VMEMMAP_START  (Greentime Hu, 1 file, -8/+8)
This patch fixes the virtual address layout in pgtable.h. The virtual addresses of FIXADDR_START and VMEMMAP_START should not overlap.

Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: fixed patch description]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-09-27  Merge tag 'riscv/for-v5.4-rc1-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux  (Linus Torvalds, 1 file, -12/+12)
Pull more RISC-V updates from Paul Walmsley:
 "Some additional RISC-V updates.

  This includes one significant fix:

   - Prevent interrupts from being unconditionally re-enabled during exception handling if they were disabled in the context in which the exception occurred

  Also a few other fixes:

   - Fix a build error when sparse memory support is manually enabled

   - Prevent CPUs beyond CONFIG_NR_CPUS from being enabled in early boot

  And a few minor improvements:

   - DT improvements: in the FU540 SoC DT files, improve U-Boot compatibility by adding an "ethernet0" alias, drop an unnecessary property from the DT files, and add support for the PWM device

   - KVM preparation: add a KVM-related macro for future RISC-V KVM support, and export some symbols required to build KVM support as modules

   - defconfig additions: build more drivers by default for QEMU configurations"

* tag 'riscv/for-v5.4-rc1-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Avoid interrupts being erroneously enabled in handle_exception()
  riscv: dts: sifive: Drop "clock-frequency" property of cpu nodes
  riscv: dts: sifive: Add ethernet0 to the aliases node
  RISC-V: Export kernel symbols for kvm
  KVM: RISC-V: Add KVM_REG_RISCV for ONE_REG interface
  arch/riscv: disable excess harts before picking main boot hart
  RISC-V: Enable VIRTIO drivers in RV64 and RV32 defconfig
  RISC-V: Fix building error when CONFIG_SPARSEMEM_MANUAL=y
  riscv: dts: Add DT support for SiFive FU540 PWM driver
2019-09-24  mm: consolidate pgtable_cache_init() and pgd_cache_init()  (Mike Rapoport, 1 file, -5/+0)
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem cache for page table allocations on several architectures that do not use PAGE_SIZE tables for one or more levels of the page table hierarchy.

Most architectures do not implement these functions and use __weak default NOP implementation of pgd_cache_init(). Since there is no such default for pgtable_cache_init(), its empty stub is duplicated among most architectures.

Rename the definitions of pgd_cache_init() to pgtable_cache_init() and drop empty stubs of pgtable_cache_init().

Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org> [arm64]
Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-19  RISC-V: Fix building error when CONFIG_SPARSEMEM_MANUAL=y  (Greentime Hu, 1 file, -12/+12)
Fix a build break by adjusting where VMALLOC_* and FIXADDR_* are defined. This fixes the definition of the VMEMMAP_* macros.

  CC      init/main.o
In file included from ./include/linux/mm.h:99,
                 from ./include/linux/ring_buffer.h:5,
                 from ./include/linux/trace_events.h:6,
                 from ./include/trace/syscall.h:7,
                 from ./include/linux/syscalls.h:85,
                 from init/main.c:21:
./arch/riscv/include/asm/pgtable.h: In function ‘pmd_page’:
./arch/riscv/include/asm/pgtable.h:95:24: error: ‘VMALLOC_START’ undeclared (first use in this function); did you mean ‘VMEMMAP_START’?
 #define VMEMMAP_START  (VMALLOC_START - VMEMMAP_SIZE)
                         ^~~~~~~~~~~~~

Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
[paul.walmsley@sifive.com: minor patch description fix]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-30  RISC-V: Implement sparsemem  (Logan Gunthorpe, 1 file, -0/+13)
Implement sparsemem support for RISC-V, which helps pave the way for memory hotplug and eventually P2P support.

Introduce Kconfig options for virtual and physical address bits which are used to calculate the size of the vmemmap and set the MAX_PHYSMEM_BITS.

The vmemmap is located directly before the VMALLOC region and sized such that we can allocate enough pages to populate all the virtual address space in the system (similar to the way it's done in arm64).

During initialization, call memblocks_present() and sparse_init(), and provide a stub for vmemmap_populate() (all of which is similar to arm64).

[greentime.hu@sifive.com: fixed pfn_valid, FIXADDR_TOP and fixed a bug rebasing onto v5.3]
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Andrew Waterman <andrew@sifive.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Michael Clark <michaeljclark@mac.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Zong Li <zong@andestech.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
[paul.walmsley@sifive.com: updated to apply; minor commit message reformat]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
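The "provide a stub for vmemmap_populate()" part could look roughly like this; the exact body is an assumption based on the description above (populate the range with base pages, as other arm64-style ports do):

    /* Sketch of the vmemmap_populate() stub mentioned above: no special
     * huge-page or altmap handling, just base pages for the vmemmap. */
    int __meminit vmemmap_populate(unsigned long start, unsigned long end,
                                   int node, struct vmem_altmap *altmap)
    {
            return vmemmap_populate_basepages(start, end, node);
    }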
2019-08-28  RISC-V: Fix FIXMAP area corruption on RV32 systems  (Anup Patel, 1 file, -2/+10)
Currently, the various virtual memory areas of Linux RISC-V are organized in increasing order of their virtual addresses, as follows:

1. User space area (this is the lowest area and starts at 0x0)
2. FIXMAP area
3. VMALLOC area
4. Kernel area (this is the highest area and starts at PAGE_OFFSET)

The maximum size of the user space area is represented by TASK_SIZE. On RV32 systems, TASK_SIZE is defined as VMALLOC_START, which causes the user space area to overlap the FIXMAP area. This allows user space apps to potentially corrupt the FIXMAP area, and kernel OF APIs will crash whenever they access the corrupted FDT in the FIXMAP area.

On RV64 systems, TASK_SIZE is set to a fixed 256GB and no other areas happen to overlap, so we don't see any FIXMAP area corruptions.

This patch fixes the FIXMAP area corruption on RV32 systems by setting TASK_SIZE to FIXADDR_START (see the sketch below). We also move the FIXADDR_TOP, FIXADDR_SIZE, and FIXADDR_START defines to asm/pgtable.h so that we can avoid cyclic header includes.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
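A sketch of the resulting TASK_SIZE definition, assuming the usual RV64/RV32 split described above (illustrative, not the literal hunk):

    #ifdef CONFIG_64BIT
    /* RV64: user space ends well below the kernel areas (fixed 256GB). */
    #define TASK_SIZE    (PGDIR_SIZE * PTRS_PER_PGD / 2)
    #else
    /* RV32: stop user space right where the FIXMAP area begins, so the
     * two areas can no longer overlap. */
    #define TASK_SIZE    FIXADDR_START
    #endif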
2019-07-09  RISC-V: Setup initial page tables in two stages  (Anup Patel, 1 file, -0/+8)
Currently, setup_vm() does the initial page table setup in one shot, very early, before enabling the MMU. Due to this, setup_vm() has to map all possible kernel virtual addresses since it does not know the size and location of RAM. This means we have kernel mappings for non-existent RAM, and any buggy driver (or kernel) code doing out-of-bound access to RAM will not fault and will cause non-deterministic behaviour.

Further, setup_vm() creates PMD mappings (i.e. 2M mappings) for RV64 systems. This means that for PAGE_OFFSET=0xffffffe000000000 (i.e. MAXPHYSMEM_128GB=y), setup_vm() will require 129 pages (i.e. 516 KB) of memory for initial page tables, which is never freed. The memory required for the initial page tables will further increase if we choose a lower value of PAGE_OFFSET (e.g. 0xffffff0000000000).

This patch implements a two-staged initial page table setup (outlined in the sketch below), as follows:

1. Early (i.e. setup_vm()): This stage maps the kernel image and DTB in an early page table (i.e. early_pg_dir). The early_pg_dir will be used only by the boot HART, so it can be freed as part of the init memory free-up.

2. Final (i.e. setup_vm_final()): This stage maps all possible RAM banks in the final page table (i.e. swapper_pg_dir). The boot HART will start using swapper_pg_dir at the end of setup_vm_final(). All non-boot HARTs directly use the swapper_pg_dir created by the boot HART.

We have the following advantages with this new approach:

1. Kernel mappings for non-existent RAM don't exist anymore.
2. Memory consumed by the initial page tables is now independent of the chosen PAGE_OFFSET.
3. Memory consumed by the initial page tables on an RV64 system is 2 pages (i.e. 8 KB), which is significantly reduced, and these pages will be freed as part of the init memory free-up.

The patch also provides a foundation for implementing strict kernel mappings where we protect kernel text and rodata using PTE permissions.

Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
[paul.walmsley@sifive.com: updated to apply; fixed a checkpatch warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
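A bare outline of the two stages, with the bodies reduced to comments; the function names follow the description above, but the skeleton itself is illustrative only:

    /* Stage 1: runs very early, MMU still off, on the boot hart only. */
    asmlinkage void __init setup_vm(uintptr_t dtb_pa)
    {
            /* Map only the kernel image and the DTB into early_pg_dir;
             * early_pg_dir is discarded with the rest of init memory. */
    }

    /* Stage 2: runs once memblock knows the real RAM banks. */
    static void __init setup_vm_final(void)
    {
            /* Map every RAM bank into swapper_pg_dir, then switch the boot
             * hart to it; other harts use swapper_pg_dir directly. */
    }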
2019-07-03  riscv: Introduce huge page support for 32/64bit kernel  (Alexandre Ghiti, 1 file, -2/+6)
This patch implements both 4MB huge page support for the 32-bit kernel and 2MB/1GB huge page support for the 64-bit kernel.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-06-05  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 286  (Thomas Gleixner, 1 file, -9/+1)
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 97 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.025053186@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-21  RISC-V: Move setup_bootmem() to mm/init.c  (Anup Patel, 1 file, -0/+1)
The setup_bootmem() mainly populates memblocks and does early memory reservations. The right location for this function is mm/init.c. It calls setup_initrd() so we move that as well.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
2019-02-11  riscv: Add pte bit to distinguish swap from invalid  (Stefan O'Rear, 1 file, -4/+4)
Previously, invalid PTEs and swap PTEs had the same binary representation, causing errors when attempting to unmap PROT_NONE mappings, including implicit unmap on exit. Typical error:

  swap_info_get: Bad swap file entry 40000000007a9879
  BUG: Bad page map in process a.out  pte:3d4c3cc0 pmd:3e521401

Cc: stable@vger.kernel.org
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-01-07  riscv: remove CONFIG_MMU ifdefs  (Christoph Hellwig, 1 file, -4/+0)
The RISC-V port doesn't support a nommu mode, so there is no reason to provide some code only under a CONFIG_MMU ifdef.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-30  RISC-V: Flush I$ when making a dirty page executable  (Andrew Waterman, 1 file, -26/+32)
The RISC-V ISA allows for instruction caches that are not coherent WRT stores, even on a single hart. As a result, we need to explicitly flush the instruction cache whenever marking a dirty page as executable in order to preserve the correct system behavior.

Local instruction caches aren't that scary (our implementations actually flush the cache, but RISC-V is defined to allow higher-performance implementations to exist), but RISC-V defines no way to perform an instruction cache shootdown. When explicitly asked to do so we can shoot down remote instruction caches via an IPI, but this is a bit on the slow side.

Instead of requiring an IPI to all harts whenever marking a page as executable, we simply flush the currently running harts. In order to maintain correct behavior, we additionally mark every other hart as needing a deferred instruction cache flush, which will be performed before anything runs on it (see the sketch below).

Signed-off-by: Andrew Waterman <andrew@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
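A compact sketch of the deferred-flush bookkeeping, assuming a per-mm stale-hart mask; the helper name and the context field are assumptions for illustration, not quotes from the patch:

    /* Sketch only: flush this hart's I$ now and record that every other
     * hart must flush before it next runs code from this mm. */
    static void sketch_flush_icache_mm(struct mm_struct *mm)
    {
            cpumask_t *stale = &mm->context.icache_stale_mask; /* assumed field */
            unsigned int cpu;

            preempt_disable();
            cpumask_setall(stale);            /* every hart is stale... */
            cpu = smp_processor_id();
            cpumask_clear_cpu(cpu, stale);    /* ...except this one, */
            local_flush_icache_all();         /* which flushes right away. */
            preempt_enable();
    }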
2017-09-26  RISC-V: Paging and MMU  (Palmer Dabbelt, 1 file, -0/+430)
This patch contains code to manage the RISC-V MMU, including definitions of the page tables and the page walking code.

Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>