aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2024-07-04perf: Fix perf_aux_size() for greater-than 32-bit sizeAdrian Hunter1-1/+1
perf_buffer->aux_nr_pages uses a 32-bit type, so a cast is needed to calculate a 64-bit size. Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240624201101.60186-5-adrian.hunter@intel.com
2024-07-04perf/x86/intel/pt: Fix pt_topa_entry_for_page() address calculationAdrian Hunter1-1/+1
Currently, perf allocates an array of page pointers which is limited in size by MAX_PAGE_ORDER. That in turn limits the maximum Intel PT buffer size to 2GiB. Should that limitation be lifted, the Intel PT driver can support larger sizes, except for one calculation in pt_topa_entry_for_page(), which is limited to 32-bits. Fix pt_topa_entry_for_page() address calculation by adding a cast. Fixes: 39152ee51b77 ("perf/x86/intel/pt: Get rid of reverse lookup table for ToPA") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240624201101.60186-4-adrian.hunter@intel.com
2024-07-04perf/x86/intel/pt: Fix a topa_entry base address calculationAdrian Hunter1-1/+1
topa_entry->base is a bit-field. Bit-fields are not promoted to a 64-bit type, even if the underlying type is 64-bit, and so, if necessary, must be cast to a larger type when calculations are done. Fix a topa_entry->base address calculation by adding a cast. Without the cast, the address was limited to 36-bits i.e. 64GiB. The address calculation is used on systems that do not support Multiple Entry ToPA (only Broadwell), and affects physical addresses on or above 64GiB. Instead of writing to the correct address, the address comprising the first 36 bits would be written to. Intel PT snapshot and sampling modes are not affected. Fixes: 52ca9ced3f70 ("perf/x86/intel/pt: Add Intel PT PMU driver") Reported-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240624201101.60186-3-adrian.hunter@intel.com
2024-07-04perf/x86/intel/pt: Fix topa_entry base lengthMarco Cavenati1-2/+2
topa_entry->base needs to store a pfn. It obviously needs to be large enough to store the largest possible x86 pfn which is MAXPHYADDR-PAGE_SIZE (52-12). So it is 4 bits too small. Increase the size of topa_entry->base from 36 bits to 40 bits. Note, systems where physical addresses can be 256TiB or more are affected. [ Adrian: Amend commit message as suggested by Dave Hansen ] Fixes: 52ca9ced3f70 ("perf/x86/intel/pt: Add Intel PT PMU driver") Signed-off-by: Marco Cavenati <cavenati.marco@gmail.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240624201101.60186-2-adrian.hunter@intel.com
2024-06-17perf/x86/intel/uncore: Support HBM and CXL PMON countersKan Liang1-2/+53
Unknown uncore PMON types can be found in both SPR and EMR with HBM or CXL. $ls /sys/devices/ | grep type uncore_type_12_16 uncore_type_12_18 uncore_type_12_2 uncore_type_12_4 uncore_type_12_6 uncore_type_12_8 uncore_type_13_17 uncore_type_13_19 uncore_type_13_3 uncore_type_13_5 uncore_type_13_7 uncore_type_13_9 The unknown PMON types are HBM and CXL PMON. Except for the name, the other information regarding the HBM and CXL PMON counters can be retrieved via the discovery table. Add them into the uncores tables for SPR and EMR. The event config registers for all CXL related units are 8-byte apart. Add SPR_UNCORE_MMIO_OFFS8_COMMON_FORMAT to specially handle it. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yunying Sun <yunying.sun@intel.com> Link: https://lore.kernel.org/r/20240614134631.1092359-9-kan.liang@linux.intel.com
2024-06-17perf/x86/uncore: Cleanup unused unit structureKan Liang4-112/+12
The unit control and ID information are retrieved from the unit control RB tree. No one uses the old structure anymore. Remove them. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yunying Sun <yunying.sun@intel.com> Link: https://lore.kernel.org/r/20240614134631.1092359-8-kan.liang@linux.intel.com
2024-06-17perf/x86/uncore: Apply the unit control RB tree to PCI uncore unitsKan Liang5-48/+94
The unit control RB tree has the unit control and unit ID information for all the PCI units. Use them to replace the box_ctls/pci_offsets to get an accurate unit control address for PCI uncore units. The UPI/M3UPI units in the discovery table are ignored. Please see the commit 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR"). Manually allocate a unit control RB tree for UPI/M3UPI. Add cleanup_extra_boxes to release such manual allocation. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yunying Sun <yunying.sun@intel.com> Link: https://lore.kernel.org/r/20240614134631.1092359-7-kan.liang@linux.intel.com
2024-06-17perf/x86/uncore: Apply the unit control RB tree to MSR uncore unitsKan Liang4-11/+59
The unit control RB tree has the unit control and unit ID information for all the MSR units. Use them to replace the box_ctl and uncore_msr_box_ctl() to get an accurate unit control address for MSR uncore units. Add intel_generic_uncore_assign_hw_event(), which utilizes the accurate unit control address from the unit control RB tree to calculate the config_base and event_base. The unit id related information should be retrieved from the unit control RB tree as well. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yunying Sun <yunying.sun@intel.com> Link: https://lore.kernel.org/r/20240614134631.1092359-6-kan.liang@linux.intel.com
2024-06-17perf/x86/uncore: Apply the unit control RB tree to MMIO uncore unitsKan Liang1-16/+14
The unit control RB tree has the unit control and unit ID information for all the units. Use it to replace the box_ctls/mmio_offsets to get an accurate unit control address for MMIO uncore units. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yunying Sun <yunying.sun@intel.com> Link: https://lore.kernel.org/r/20240614134631.1092359-5-kan.liang@linux.intel.com
2024-06-17perf/x86/uncore: Retrieve the unit ID from the unit control RB treeKan Liang1-0/+3
The box_ids only save the unit ID for the first die. If a unit, e.g., a CXL unit, doesn't exist in the first die. The unit ID cannot be retrieved. The unit control RB tree also stores the unit ID information. Retrieve the unit ID from the unit control RB tree Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yunying Sun <yunying.sun@intel.com> Link: https://lore.kernel.org/r/20240614134631.1092359-4-kan.liang@linux.intel.com
2024-06-17perf/x86/uncore: Support per PMU cpumaskKan Liang4-5/+89
The cpumask of some uncore units, e.g., CXL uncore units, may be wrong under some configurations. Perf may access an uncore counter of a non-existent uncore unit. The uncore driver assumes that all uncore units are symmetric among dies. A global cpumask is shared among all uncore PMUs. However, some CXL uncore units may only be available on some dies. A per PMU cpumask is introduced to track the CPU mask of this PMU. The driver searches the unit control RB tree to check whether the PMU is available on a given die, and updates the per PMU cpumask accordingly. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yunying Sun <yunying.sun@intel.com> Link: https://lore.kernel.org/r/20240614134631.1092359-3-kan.liang@linux.intel.com
2024-06-17perf/x86/uncore: Save the unit control address of all unitsKan Liang2-2/+87
The unit control address of some CXL units may be wrongly calculated under some configuration on a EMR machine. The current implementation only saves the unit control address of the units from the first die, and the first unit of the rest of dies. Perf assumed that the units from the other dies have the same offset as the first die. So the unit control address of the rest of the units can be calculated. However, the assumption is wrong, especially for the CXL units. Introduce an RB tree for each uncore type to save the unit control address and three kinds of ID information (unit ID, PMU ID, and die ID) for all units. The unit ID is a physical ID of a unit. The PMU ID is a logical ID assigned to a unit. The logical IDs start from 0 and must be contiguous. The physical ID and the logical ID are 1:1 mapping. The units with the same physical ID in different dies share the same PMU. The die ID indicates which die a unit belongs to. The RB tree can be searched by two different keys (unit ID or PMU ID + die ID). During the RB tree setup, the unit ID is used as a key to look up the RB tree. The perf can create/assign a proper PMU ID to the unit. Later, after the RB tree is setup, PMU ID + die ID is used as a key to look up the RB tree to fill the cpumask of a PMU. It's used more frequently, so PMU ID + die ID is compared in the unit_less(). The uncore_find_unit() has to be O(N). But the RB tree setup only occurs once during the driver load time. It should be acceptable. Compared with the current implementation, more space is required to save the information of all units. The extra size should be acceptable. For example, on EMR, there are 221 units at most. For a 2-socket machine, the extra space is ~6KB at most. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240614134631.1092359-2-kan.liang@linux.intel.com
2024-05-18perf/x86/amd: Use try_cmpxchg() in events/amd/{un,}core.cUros Bizjak2-3/+9
Replace this pattern in events/amd/{un,}core.c: cmpxchg(*ptr, old, new) == old ... with the simpler and faster: try_cmpxchg(*ptr, &old, new) The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after the CMPXCHG. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240425101708.5025-1-ubizjak@gmail.com
2024-05-08perf/x86/cstate: Remove unused 'struct perf_cstate_msr'Ingo Molnar1-6/+0
Use of this structure was removed in: 8f2a28c5859b ("perf/x86/cstate: Use new probe function") Remove the now stale type as well. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-05-02perf/x86/rapl: Rename 'maxdie' to nr_rapl_pmu and 'dieid' to rapl_pmu_idxDhananjay Ugwekar1-8/+8
AMD CPUs have the scope of RAPL energy-pkg event as package, whereas Intel Cascade Lake CPUs have the scope as die. To account for the difference in the energy-pkg event scope between AMD and Intel CPUs, give more generic and semantically correct names to the maxdie and dieid variables. No functional change. Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20240502095115.177713-2-Dhananjay.Ugwekar@amd.com
2024-05-02x86/insn: Add support for APX EVEX instructions to the opcode mapAdrian Hunter2-0/+186
To support APX functionality, the EVEX prefix is used to: - promote legacy instructions - promote VEX instructions - add new instructions Promoted VEX instructions require no extra annotation because the opcodes do not change and the permissive nature of the instruction decoder already allows them to have an EVEX prefix. Promoted legacy instructions and new instructions are placed in map 4 which has not been used before. Create a new table for map 4 and add APX instructions. Annotate SCALABLE instructions with "(es)" - refer to patch "x86/insn: Add support for APX EVEX to the instruction decoder logic". SCALABLE instructions must be represented in both no-prefix (NP) and 66 prefix forms. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-9-adrian.hunter@intel.com
2024-05-02x86/insn: Add support for APX EVEX to the instruction decoder logicAdrian Hunter8-0/+42
Intel Advanced Performance Extensions (APX) extends the EVEX prefix to support: - extended general purpose registers (EGPRs) i.e. r16 to r31 - Push-Pop Acceleration (PPX) hints - new data destination (NDD) register - suppress status flags writes (NF) of common instructions - new instructions Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. The extended EVEX prefix does not need amended instruction decoder logic, except in one area. Some instructions are defined as SCALABLE which means the EVEX.W bit and EVEX.pp bits are used to determine operand size. Specifically, if an instruction is SCALABLE and EVEX.W is zero, then EVEX.pp value 0 (representing no prefix NP) means default operand size, whereas EVEX.pp value 1 (representing 66 prefix) means operand size override i.e. 16 bits Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and amend the logic appropriately. Amend the awk script that generates the attribute tables from the opcode map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-8-adrian.hunter@intel.com
2024-05-02x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode mapAdrian Hunter2-144/+152
Support for REX2 has been added to the instruction decoder logic and the awk script that generates the attribute tables from the opcode map. Add REX2 prefix byte (0xD5) to the opcode map. Add annotation (!REX2) for map 0/1 opcodes that are reserved under REX2. Add JMPABS to the opcode map and add annotation (REX2) to identify that it has a mandatory REX2 prefix. A separate opcode attribute table is not needed at this time because JMPABS has the same attribute encoding as the MOV instruction that it shares an opcode with i.e. INAT_MOFFSET. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-7-adrian.hunter@intel.com
2024-05-02x86/insn: Add support for REX2 prefix to the instruction decoder logicAdrian Hunter8-12/+132
Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31. The REX2 prefix is effectively an extended version of the REX prefix. REX2 and EVEX are also used with PUSH/POP instructions to provide a Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to fast-forward register data between matching PUSH and POP instructions. REX2 is valid only with opcodes in maps 0 and 1. Similar extension for other maps is provided by the EVEX prefix, covered in a separate patch. Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used for a new 64-bit absolute direct jump instruction JMPABS. Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify opcodes (only JMPABS) that require a mandatory REX2 prefix (INAT_REX2_VARIANT). Amend logic to read the REX2 prefix and get the opcode attribute for the map number (0 or 1) encoded in the REX2 prefix. Amend the awk script that generates the attribute tables from the opcode map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)" as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-6-adrian.hunter@intel.com
2024-05-02x86/insn: Add misc new Intel instructionsAdrian Hunter2-24/+90
The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Add instructions documented in Intel Architecture Instruction Set Extensions and Future Features Programming Reference March 2024 319433-052, that have not been added yet: AADD AAND AOR AXOR CMPccXADD PBNDKB RDMSRLIST URDMSR UWRMSR VBCSTNEBF162PS VBCSTNESH2PS VCVTNEEBF162PS VCVTNEEPH2PS VCVTNEOBF162PS VCVTNEOPH2PS VCVTNEPS2BF16 VPDPB[SU,UU,SS]D[,S] VPDPW[SU,US,UU]D[,S] VPMADD52HUQ VPMADD52LUQ VSHA512MSG1 VSHA512MSG2 VSHA512RNDS2 VSM3MSG1 VSM3MSG2 VSM3RNDS2 VSM4KEY4 VSM4RNDS4 WRMSRLIST TCMMIMFP16PS TCMMRLFP16PS TDPFP16PS PREFETCHIT1 PREFETCHIT0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-5-adrian.hunter@intel.com
2024-05-02x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDSAdrian Hunter2-8/+8
The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Intel Architecture Instruction Set Extensions and Future Features manual number 319433-044 of May 2021, documented VEX versions of instructions VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, but the opcode map has them listed as EVEX only. Remove EVEX-only (ev) annotation from instructions VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, which allows them to be decoded with either a VEX or EVEX prefix. Fixes: 0153d98f2dd6 ("x86/insn: Add misc instructions to x86 instruction decoder") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-4-adrian.hunter@intel.com
2024-05-02x86/insn: Fix PUSH instruction in x86 instruction decoder opcode mapAdrian Hunter2-2/+2
The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size only i.e. (d64). That was based on Intel SDM Opcode Map. However that is contradicted by the Instruction Set Reference section for PUSH in the same manual. Remove 64-bit operand size only annotation from opcode 0x68 PUSH instruction. Example: $ cat pushw.s .global _start .text _start: pushw $0x1234 mov $0x1,%eax # system call number (sys_exit) int $0x80 $ as -o pushw.o pushw.s $ ld -s -o pushw pushw.o $ objdump -d pushw | tail -4 0000000000401000 <.text>: 401000: 66 68 34 12 pushw $0x1234 401004: b8 01 00 00 00 mov $0x1,%eax 401009: cd 80 int $0x80 $ perf record -e intel_pt//u ./pushw [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB perf.data ] Before: $ perf script --insn-trace=disasm Warning: 1 instruction trace errors pushw 10349 [000] 10586.869237014: 401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) pushw $0x1234 pushw 10349 [000] 10586.869237014: 401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %al, (%rax) pushw 10349 [000] 10586.869237014: 401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %cl, %ch pushw 10349 [000] 10586.869237014: 40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb $0x2e, (%rax) instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction After: $ perf script --insn-trace=disasm pushw 10349 [000] 10586.869237014: 401000 [unknown] (./pushw) pushw $0x1234 pushw 10349 [000] 10586.869237014: 401004 [unknown] (./pushw) movl $1, %eax Fixes: eb13296cfaf6 ("x86: Instruction decoder API") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-3-adrian.hunter@intel.com
2024-05-02x86/insn: Add Key Locker instructions to the opcode mapChang S. Bae2-8/+14
The x86 instruction decoder needs to know these new instructions that are going to be used in the crypto library as well as the x86 core code. Add the following: LOADIWKEY: Load a CPU-internal wrapping key. ENCODEKEY128: Wrap a 128-bit AES key to a key handle. ENCODEKEY256: Wrap a 256-bit AES key to a key handle. AESENC128KL: Encrypt a 128-bit block of data using a 128-bit AES key indicated by a key handle. AESENC256KL: Encrypt a 128-bit block of data using a 256-bit AES key indicated by a key handle. AESDEC128KL: Decrypt a 128-bit block of data using a 128-bit AES key indicated by a key handle. AESDEC256KL: Decrypt a 128-bit block of data using a 256-bit AES key indicated by a key handle. AESENCWIDE128KL: Encrypt 8 128-bit blocks of data using a 128-bit AES key indicated by a key handle. AESENCWIDE256KL: Encrypt 8 128-bit blocks of data using a 256-bit AES key indicated by a key handle. AESDECWIDE128KL: Decrypt 8 128-bit blocks of data using a 128-bit AES key indicated by a key handle. AESDECWIDE256KL: Decrypt 8 128-bit blocks of data using a 256-bit AES key indicated by a key handle. The detail can be found in Intel Software Developer Manual. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240502105853.5338-2-adrian.hunter@intel.com
2024-04-29x86/mm: Switch to new Intel CPU model definesTony Luck1-10/+6
New CPU #defines encode vendor and family as well as model. [ dhansen: vertically align 0's in invlpg_miss_ids[] ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181518.41946-1-tony.luck%40intel.com
2024-04-29x86/tsc_msr: Switch to new Intel CPU model definesTony Luck1-7/+7
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181518.41927-1-tony.luck%40intel.com
2024-04-29x86/tsc: Switch to new Intel CPU model definesTony Luck1-3/+3
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181517.41907-1-tony.luck%40intel.com
2024-04-29x86/cpu: Switch to new Intel CPU model definesTony Luck1-3/+3
New CPU #defines encode vendor and family as well as model. [ dhansen: vertically align macro and remove stray subject / ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181516.41887-1-tony.luck%40intel.com
2024-04-29x86/resctrl: Switch to new Intel CPU model definesTony Luck2-16/+16
New CPU #defines encode vendor and family as well as model. [ bp: Squash two resctrl patches into one. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181514.41848-1-tony.luck%40intel.com
2024-04-29x86/microcode/intel: Switch to new Intel CPU model definesTony Luck1-3/+2
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181513.41829-1-tony.luck%40intel.com
2024-04-29x86/mce: Switch to new Intel CPU model definesTony Luck3-19/+18
New CPU #defines encode vendor and family as well as model. [ bp: Squash *three* mce patches into one, fold in fix: https://lore.kernel.org/r/20240429022051.63360-1-tony.luck@intel.com ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181511.41772-1-tony.luck%40intel.com
2024-04-29x86/cpu: Switch to new Intel CPU model definesTony Luck1-1/+1
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181511.41753-1-tony.luck%40intel.com
2024-04-29x86/cpu/intel_epb: Switch to new Intel CPU model definesTony Luck1-6/+6
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181510.41733-1-tony.luck%40intel.com
2024-04-29x86/aperfmperf: Switch to new Intel CPU model definesTony Luck1-9/+8
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181505.41654-1-tony.luck%40intel.com
2024-04-29x86/apic: Switch to new Intel CPU model definesTony Luck1-19/+19
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181504.41634-1-tony.luck%40intel.com
2024-04-29perf/x86/msr: Switch to new Intel CPU model definesTony Luck1-66/+66
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181503.41614-1-tony.luck%40intel.com
2024-04-29perf/x86/intel/uncore: Switch to new Intel CPU model definesTony Luck3-53/+55
New CPU #defines encode vendor and family as well as model. [ bp: Squash *three* uncore patches into one. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/all/20240424181501.41557-1-tony.luck%40intel.com
2024-04-28Linux 6.9-rc6Linus Torvalds1-1/+1
2024-04-28sched/isolation: Fix boot crash when maxcpus < first housekeeping CPUOleg Nesterov1-1/+6
housekeeping_setup() checks cpumask_intersects(present, online) to ensure that the kernel will have at least one housekeeping CPU after smp_init(), but this doesn't work if the maxcpus= kernel parameter limits the number of processors available after bootup. For example, a kernel with "maxcpus=2 nohz_full=0-2" parameters crashes at boot time on a virtual machine with 4 CPUs. Change housekeeping_setup() to use cpumask_first_and() and check that the returned CPU number is valid and less than setup_max_cpus. Another corner case is "nohz_full=0" on a machine with a single CPU or with the maxcpus=1 kernel argument. In this case non_housekeeping_mask is empty and tick_nohz_full_setup() makes no sense. And indeed, the kernel hits the WARN_ON(tick_nohz_full_running) in tick_sched_do_timer(). And how should the kernel interpret the "nohz_full=" parameter? It should be silently ignored, but currently cpulist_parse() happily returns the empty cpumask and this leads to the same problem. Change housekeeping_setup() to check cpumask_empty(non_housekeeping_mask) and do nothing in this case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Phil Auld <pauld@redhat.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20240413141746.GA10008@redhat.com
2024-04-28sched/isolation: Prevent boot crash when the boot CPU is nohz_fullOleg Nesterov2-6/+12
Documentation/timers/no_hz.rst states that the "nohz_full=" mask must not include the boot CPU, which is no longer true after: 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full"). However after: aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work") the kernel will crash at boot time in this case; housekeeping_any_cpu() returns an invalid CPU number until smp_init() brings the first housekeeping CPU up. Change housekeeping_any_cpu() to check the result of cpumask_any_and() and return smp_processor_id() in this case. This is just the simple and backportable workaround which fixes the symptom, but smp_processor_id() at boot time should be safe at least for type == HK_TYPE_TIMER, this more or less matches the tick_do_timer_boot_cpu logic. There is no worry about cpu_down(); tick_nohz_cpu_down() will not allow to offline tick_do_timer_cpu (the 1st online housekeeping CPU). Fixes: aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work") Reported-by: Chris von Recklinghausen <crecklin@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Phil Auld <pauld@redhat.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20240411143905.GA19288@redhat.com Closes: https://lore.kernel.org/all/20240402105847.GA24832@redhat.com/
2024-04-27profiling: Remove create_prof_cpu_mask().Tetsuo Handa2-48/+0
create_prof_cpu_mask() is no longer used after commit 1f44a225777e ("s390: convert interrupt handling to use generic hardirq"). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-04-27i2c: smbus: fix NULL function pointer dereferenceWolfram Sang1-6/+6
Baruch reported an OOPS when using the designware controller as target only. Target-only modes break the assumption of one transfer function always being available. Fix this by always checking the pointer in __i2c_transfer. Reported-by: Baruch Siach <baruch@tkos.co.il> Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il Fixes: 4b1acc43331d ("i2c: core changes for slave support") [wsa: dropped the simplification in core-smbus to avoid theoretical regressions] Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Baruch Siach <baruch@tkos.co.il>
2024-04-26RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2Andrew Jones2-1/+11
Commit 0de65288d75f ("RISC-V: selftests: cbo: Ensure asm operands match constraints") attempted to ensure MK_CBO() would always provide to a compile-time constant when given a constant, but cpu_to_le32() isn't necessarily going to do that. Switch to manually shifting the bytes, when needed, to finally get this right. Reported-by: Woodrow Shen <woodrow.shen@sifive.com> Closes: https://lore.kernel.org/all/CABquHATcBTUwfLpd9sPObBgNobqQKEAZ2yxk+TWSpyO5xvpXpg@mail.gmail.com/ Fixes: a29e2a48afe3 ("RISC-V: selftests: Add CBO tests") Fixes: 0de65288d75f ("RISC-V: selftests: cbo: Ensure asm operands match constraints") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240322134728.151255-2-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-26perf riscv: Fix the warning due to the incompatible typeBen Zong-You Xie1-1/+1
In the 32-bit platform, the second argument of getline is expectd to be 'size_t *'(aka 'unsigned int *'), but line_sz is of type 'unsigned long *'. Therefore, declare line_sz as size_t. Signed-off-by: Ben Zong-You Xie <ben717@andestech.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240305120501.1785084-3-ben717@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-26netfs: Fix the pre-flush when appending to a file in writethrough modeDavid Howells1-7/+6
In netfs_perform_write(), when the file is marked NETFS_ICTX_WRITETHROUGH or O_*SYNC or RWF_*SYNC was specified, write-through caching is performed on a buffered file. When setting up for write-through, we flush any conflicting writes in the region and wait for the write to complete, failing if there's a write error to return. The issue arises if we're writing at or above the EOF position because we skip the flush and - more importantly - the wait. This becomes a problem if there's a partial folio at the end of the file that is being written out and we want to make a write to it too. Both the already-running write and the write we start both want to clear the writeback mark, but whoever is second causes a warning looking something like: ------------[ cut here ]------------ R=00000012: folio 11 is not under writeback WARNING: CPU: 34 PID: 654 at fs/netfs/write_collect.c:105 ... CPU: 34 PID: 654 Comm: kworker/u386:27 Tainted: G S ... ... Workqueue: events_unbound netfs_write_collection_worker ... RIP: 0010:netfs_writeback_lookup_folio Fix this by making the flush-and-wait unconditional. It will do nothing if there are no folios in the pagecache and will return quickly if there are no folios in the region specified. Further, move the WBC attachment above the flush call as the flush is going to attach a WBC and detach it again if it is not present - and since we need one anyway we might as well share it. Fixes: 41d8e7673a77 ("netfs: Implement a write-through caching option") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202404161031.468b84f-oliver.sang@intel.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/2150448.1714130115@warthog.procyon.org.uk Reviewed-by: Jeffrey Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-26MAINTAINERS: Update Uwe's email address, drop SIOX maintenanceUwe Kleine-König1-2/+1
In the context of changing my career path, my Pengutronix email address will soon stop to be available to me. Update the PWM maintainer entry to my kernel.org identity. I drop my co-maintenance of SIOX. Thorsten will continue to care for it with the support of the Pengutronix kernel team. Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org> Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> Link: https://lore.kernel.org/r/20240424212626.603631-2-ukleinek@kernel.org Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
2024-04-26MAINTAINERS: Drop entry for PCA9541 bus master selectorGuenter Roeck1-6/+0
I no longer have access to PCA9541 hardware, and I am no longer involved in related development. Listing me as PCA9541 maintainer does not make sense anymore. Remove PCA9541 from MAINTAINERS to let its support default to the generic I2C multiplexer entry. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Peter Rosin <peda@axentia.se> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-04-25smb3: fix lock ordering potential deadlock in cifs_sync_mid_resultSteve French1-0/+3
Coverity spotted that the cifs_sync_mid_result function could deadlock "Thread deadlock (ORDER_REVERSAL) lock_order: Calling spin_lock acquires lock TCP_Server_Info.srv_lock while holding lock TCP_Server_Info.mid_lock" Addresses-Coverity: 1590401 ("Thread deadlock (ORDER_REVERSAL)") Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-25smb3: missing lock when picking channelSteve French1-1/+3
Coverity spotted a place where we should have been holding the channel lock when accessing the ses channel index. Addresses-Coverity: 1582039 ("Data race condition (MISSING_LOCK)") Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-25riscv: T-Head: Test availability bit before enabling MAE errataChristoph Müllner1-4/+10
T-Head's memory attribute extension (XTheadMae) (non-compatible equivalent of RVI's Svpbmt) is currently assumed for all T-Head harts. However, QEMU recently decided to drop acceptance of guests that write reserved bits in PTEs. As XTheadMae uses reserved bits in PTEs and Linux applies the MAE errata for all T-Head harts, this broke the Linux startup on QEMU emulations of the C906 emulation. This patch attempts to address this issue by testing the MAE-enable bit in the th.sxstatus CSR. This CSR is available in HW and can be emulated in QEMU. This patch also makes the XTheadMae probing mechanism reliable, because a test for the right combination of mvendorid, marchid, and mimpid is not sufficient to enable MAE. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Link: https://lore.kernel.org/r/20240407213236.2121592-3-christoph.muellner@vrull.eu Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-25riscv: thead: Rename T-Head PBMT to MAEChristoph Müllner3-19/+19
T-Head's vendor extension to set page attributes has the name MAE (memory attribute extension). Let's rename it, so it is clear what this referes to. Link: https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadmae.adoc Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Link: https://lore.kernel.org/r/20240407213236.2121592-2-christoph.muellner@vrull.eu Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>