linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-10-28	drm/amd/display: move FPU associated DSC code to DML folder	Qingqing Zhuo	1	-29/+0
	[Why & How] As part of the FPU isolation work documented in https://patchwork.freedesktop.org/series/93042/, isolate code that uses FPU in DSC to DML, where all FPU code should locate. This change does not refactor any functions but move code around. Cc: Christian König <christian.koenig@amd.com> Cc: Hersen Wu <hersenxs.wu@amd.com> Cc: Anson Jacob <Anson.Jacob@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Agustin Gutierrez <agustin.gutierrez@amd.com> Tested-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-06	drm/amdgpu/display: drop DCN support for aarch64	Alex Deucher	1	-4/+0
	From Ard: "Simply disabling -mgeneral-regs-only left and right is risky, given that the standard AArch64 ABI permits the use of FP/SIMD registers anywhere, and GCC is known to use SIMD registers for spilling, and may invent other uses of the FP/SIMD register file that have nothing to do with the floating point code in question. Note that putting kernel_neon_begin() and kernel_neon_end() around the code that does use FP is not sufficient here, the problem is in all the other code that may be emitted with references to SIMD registers in it. So the only way to do this properly is to put all floating point code in a separate compilation unit, and only compile that unit with -mgeneral-regs-only." Disable support until the code can be properly refactored to support this properly on aarch64. Acked-by: Will Deacon <will@kernel.org> Reported-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-10	drm/amd/display: add DCN support for aarch64	Daniel Kolesa	1	-0/+5
	This adds ARM64 support into the DCN. This mainly enables support for Navi graphics cards. The dcn10 changes haven't been tested, since I don't have the relevant hardware available, but there is no way to conditionally disable them, so I've done them anyway. Signed-off-by: Daniel Kolesa <daniel@octaforge.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-06-11	drm/amd/display: Rework dsc to isolate FPU operations	Rodrigo Siqueira	1	-2/+0
	When we want to use float point operation on Linux we need to use within special kernel protection (`kernel_fpu_{begin,end}()`.), otherwise the kernel can clobber userspace FPU register state. For detecting these issues we use a tool named objtool (with -Ffa flags) to highlight the FPU problems, all warnings can be summed up as follows: ./tools/objtool/objtool check -Ffa drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.o [..] dc/dsc/rc_calc.o: warning: objtool: get_qp_set()+0x2f8: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/rc_calc.o: warning: objtool: dsc_roundf()+0x5: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/rc_calc.o: warning: objtool: dsc_ceil()+0x5: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/rc_calc.o: warning: objtool: get_ofs_set()+0x3eb: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/rc_calc.o: warning: objtool: calc_rc_params()+0x3c: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/dc_dsc.o: warning: objtool: get_dsc_bandwidth_range.isra.0()+0x8d: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/dc_dsc.o: warning: objtool: setup_dsc_config()+0x2ef: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/rc_calc_dpi.o: warning: objtool:copy_pps_fields()+0xbb: FPU instruction outside of kernel_fpu_{begin,end}() [..] dc/dsc/rc_calc_dpi.o: warning: objtool: dscc_compute_dsc_parameters()+0x7b: FPU instruction outside of kernel_fpu_{begin,end}() This commit fixes the above issues by rework DSC as described: 1. Isolate all FPU operations in a single file; 2. Use FPU flags only in the file that handles FPU operations; 3. Isolate all functions that require float point operation in static functions; 4. Add a mid-layer function that does not use any float point operation, and that could be safely invoked in other parts of the code. 5. Keep float point operation under DC_FP_{START/END} macro. CC: Christian König <christian.koenig@amd.com> CC: Alexander Deucher <Alexander.Deucher@amd.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Tony Cheng <tony.cheng@amd.com> CC: Harry Wentland <hwentlan@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18	amdgpu: Enable initial DCN support on POWER	Timothy Pearson	1	-0/+8
	DCN requires floating point support to operate. Add the appropriate x86/ppc64 guards and FPU / AltiVec / VSX context switches to DCN. Note that the current DC20 code doesn't contain all required FPU wrappers on x86 or POWER, so this patch is insufficient to fully enable DC20 on POWER. v2: s/X86_64/X86/g to retain previous behavior. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11	drm/amdgpu: fix license on Kconfig and Makefiles	Alex Deucher	1	-0/+1
	amdgpu is MIT licensed. Fixes: ec8f24b7faaf3d ("treewide: Add SPDX license identifier - Makefile/Kconfig") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30	drm/amdgpu: enable -msse2 for GCC 7.1+ users	Nick Desaulniers	1	-3/+1
	A final attempt at enabling sse2 for GCC users. Orininally attempted in: commit 10117450735c ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") Reverted due to "reported instability" in: commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") Re-added just for Clang in: commit 0f0727d971f6 ("drm/amd/display: readd -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") The original report didn't have enough information to know if the GPF was due to misalignment, but I suspect that it was. (The missing information was the disassembly of the function at the bottom of the trace, to see if the instruction pointer pointed to an instruction with 16B alignment memory operand requirements. The stack trace does show the stack was only 8B but not 16B aligned though, which makes this a strong possibility). Now that the stack misalignment issue has been fixed for users of GCC 7.1+, reattempt adding -msse2. This matches Clang. It will likely never be safe to enable this for pre-GCC 7.1 AND use a 16B aligned stack in these translation units. This is only a functional change for GCC 7.1+ users, and should be boot tested. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30	drm/amdgpu: fix stack alignment ABI mismatch for GCC 7.1+	Nick Desaulniers	1	-0/+9
	GCC earlier than 7.1 errors when compiling code that makes use of `double`s and sets a stack alignment outside of the range of [2^4-2^12]: $ cat foo.c double foo(double x, double y) { return x + y; } $ gcc-4.9 -mpreferred-stack-boundary=3 foo.c error: -mpreferred-stack-boundary=3 is not between 4 and 12 This is likely why the AMDGPU driver was ever compiled with a different stack alignment (and thus different ABI) than the rest of the x86 kernel. The kernel uses 8B stack alignment, while the driver was using 16B stack alignment in a few places. Since GCC 7.1+ doesn't error, fix the ABI mismatch for users of newer versions of GCC. There was discussion about whether to mark the driver broken or not for users of GCC earlier than 7.1, but since the driver currently is working, don't explicitly break the driver for them here. Relying on differing stack alignment is unspecified behavior, and brittle, and may break in the future. This patch is no functional change for GCC users earlier than 7.1. It's been compile tested on GCC 4.9 and 8.3 to check the correct flags. It should be boot tested when built with GCC 7.1+. -mincoming-stack-boundary= or -mstackrealign may help keep this code building for pre-GCC 7.1 users. The version check for GCC is broken into two conditionals, both because cc-ifversion is currently GCC specific, and it simplifies a subsequent patch. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30	drm/amdgpu: fix stack alignment ABI mismatch for Clang	Nick Desaulniers	1	-6/+4
	The x86 kernel is compiled with an 8B stack alignment via `-mpreferred-stack-boundary=3` for GCC since 3.6-rc1 via commit d9b0cde91c60 ("x86-64, gcc: Use -mpreferred-stack-boundary=3 if supported") or `-mstack-alignment=8` for Clang. Parts of the AMDGPU driver are compiled with 16B stack alignment. Generally, the stack alignment is part of the ABI. Linking together two different translation units with differing stack alignment is dangerous, particularly when the translation unit with the smaller stack alignment makes calls into the translation unit with the larger stack alignment. While 8B aligned stacks are sometimes also 16B aligned, they are not always. Multiple users have reported General Protection Faults (GPF) when using the AMDGPU driver compiled with Clang. Clang is placing objects in stack slots assuming the stack is 16B aligned, and selecting instructions that require 16B aligned memory operands. At runtime, syscall handlers with 8B aligned stack call into code that assumes 16B stack alignment. When the stack is a multiple of 8B but not 16B, these instructions result in a GPF. Remove the code that added compatibility between the differing compiler flags, as it will result in runtime GPFs when built with Clang. Cleanups for GCC will be sent in later patches in the series. Link: https://github.com/ClangBuiltLinux/linux/issues/735 Debugged-by: Yuxuan Shui <yshuiv7@gmail.com> Reported-by: Shirish S <shirish.s@amd.com> Reported-by: Yuxuan Shui <yshuiv7@gmail.com> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-20	Merge tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild	Linus Torvalds	1	-4/+3
	Pull Kbuild updates from Masahiro Yamada: - add modpost warn exported symbols marked as 'static' because 'static' and EXPORT_SYMBOL is an odd combination - break the build early if gold linker is used - optimize the Bison rule to produce .c and .h files by a single pattern rule - handle PREEMPT_RT in the module vermagic and UTS_VERSION - warn CONFIG options leaked to the user-space except existing ones - make single targets work properly - rebuild modules when module linker scripts are updated - split the module final link stage into scripts/Makefile.modfinal - fix the missed error code in merge_config.sh - improve the error message displayed on the attempt of the O= build in unclean source tree - remove 'clean-dirs' syntax - disable -Wimplicit-fallthrough warning for Clang - add CONFIG_CC_OPTIMIZE_FOR_SIZE_O3 for ARC - remove ARCH_{CPP,A,C}FLAGS variables - add $(BASH) to run bash scripts - change CFLAGS_<basetarget>.o to take the relative path to $(obj) instead of the basename - stop suppressing Clang's -Wunused-function warnings when W=1 - fix linux/export.h to avoid genksyms calculating CRC of trimmed exported symbols - misc cleanups tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (63 commits) genksyms: convert to SPDX License Identifier for lex.l and parse.y modpost: use __section in the output to .mod.c modpost: use MODULE_INFO() for __module_depends export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols export.h: remove defined(__KERNEL__), which is no longer needed kbuild: allow Clang to find unused static inline functions for W=1 build kbuild: rename KBUILD_ENABLE_EXTRA_GCC_CHECKS to KBUILD_EXTRA_WARN kbuild: refactor scripts/Makefile.extrawarn merge_config.sh: ignore unwanted grep errors kbuild: change FLAGS_<basetarget>.o to take the path relative to $(obj) modpost: add NOFAIL to strndup modpost: add guid_t type definition kbuild: add $(BASH) to run scripts with bash-extension kbuild: remove ARCH_{CPP,A,C}FLAGS kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC kbuild: Do not enable -Wimplicit-fallthrough for clang for now kbuild: clean up subdir-ymn calculation in Makefile.clean kbuild: remove unneeded '+' marker from cmd_clean kbuild: remove clean-dirs syntax kbuild: check clean srctree even earlier ...
2019-09-04	kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)	Masahiro Yamada	1	-4/+3
	Kbuild provides per-file compiler flag addition/removal: CFLAGS_<basetarget>.o CFLAGS_REMOVE_<basetarget>.o AFLAGS_<basetarget>.o AFLAGS_REMOVE_<basetarget>.o CPPFLAGS_<basetarget>.lds HOSTCFLAGS_<basetarget>.o HOSTCXXFLAGS_<basetarget>.o The <basetarget> is the filename of the target with its directory and suffix stripped. This syntax comes into a trouble when two files with the same basename appear in one Makefile, for example: obj-y += foo.o obj-y += dir/foo.o CFLAGS_foo.o := <some-flags> Here, the <some-flags> applies to both foo.o and dir/foo.o The real world problem is: scripts/kconfig/util.c scripts/kconfig/lxdialog/util.c Both files are compiled into scripts/kconfig/mconf, but only the latter should be given with the ncurses flags. It is more sensible to use the relative path to the Makefile, like this: obj-y += foo.o CFLAGS_foo.o := <some-flags> obj-y += dir/foo.o CFLAGS_dir/foo.o := <other-flags> At first, I attempted to replace $(basetarget) with $. The $ variable is replaced with the stem ('%') part in a pattern rule. This works with most of cases, but does not for explicit rules. For example, arch/ia64/lib/Makefile reuses rule_as_o_S in its own explicit rules, so $* will be empty, resulting in ignoring the per-file AFLAGS. I introduced a new variable, target-stem, which can be used also from explicit rules. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Marc Zyngier <maz@kernel.org>
2019-07-30	drm/amd/display: readd -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines	Nick Desaulniers	1	-0/+4
	arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn on SSE2 to support emitting double precision floating point instructions rather than calls to non-existent (usually available from gcc_s or compiler_rt) floating point helper routines for Clang. This was originally landed in: commit 10117450735c ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines") but reverted in: commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"") due to bugreports from GCC builds. Add guards to only do so for Clang. Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487 Link: https://github.com/ClangBuiltLinux/linux/issues/327 Suggested-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-16	drm/amd/display: Support clang option for stack alignment	Arnd Bergmann	1	-4/+12
	As previously fixed for dml in commit 4769278e5c7f ("amdgpu/dc/dml: Support clang option for stack alignment") and calcs in commit cc32ad8f559c ("amdgpu/dc/calcs: Support clang option for stack alignment"), dcn20 uses an option that is not available with clang: clang: error: unknown argument: '-mpreferred-stack-boundary=4' scripts/Makefile.build:281: recipe for target 'drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.o' failed Use the same trick that we have in the other two files. Fixes: 7ed4e6352c16 ("drm/amd/display: Add DCN2 HW Sequencer and Resource") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-22	drm/amd/display: Add DSC support for Navi (v2)	Harry Wentland	1	-0/+13
	Add support for DCN2 DSC (Display Stream Compression) HW Blocks: +--------++------+ +----------+ \| HUBBUB \|\| HUBP \| <-- \| MMHUBBUB \| +--------++------+ +----------+ \| ^ v \| +--------+ +--------+ \| DPP \| \| DWB \| +--------+ +--------+ \| v ^ +--------+ \| \| MPC \| \| +--------+ \| \| \| v \| +-------+ +-------+ \| \| OPP \| <--> \| DSC \| \| +-------+ +-------+ \| \| \| v \| +--------+ / \| OPTC \| -------------- +--------+ \| v +--------+ +--------+ \| DIO \| \| DCCG \| +--------+ +--------+ v2: rebase (Alex) Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>