aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2020-03-13NTB: add helper functions to set and clear sideinfoArindam Nath2-10/+37
We define two new helper functions to set and clear sideinfo registers respectively. These functions take an additional boolean parameter which signifies whether we want to set/clear the sideinfo register of the peer(true) or local host(false). Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: move ntb_ctrl handling to init and deinitArindam Nath1-10/+10
It does not really make sense to enable or disable the bits of NTB_CTRL register only during enable and disable link callbacks. They should be done independent of these callbacks. The correct placement for that is during the amd_init_side_info() and amd_deinit_side_info() functions, which are invoked during probe and remove respectively. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: handle link up, D0 and D3 events correctlyArindam Nath1-0/+6
Just like for Link-Down event, Link-Up and D3 events are also mutually exclusive to Link-Down and D0 events respectively. So we clear the bitmasks in peer_sta depending on event type. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: handle link down event correctlyArindam Nath1-3/+6
Link-Up and Link-Down are mutually exclusive events. So when we receive a Link-Down event, we should also clear the bitmask for Link-Up event in peer_sta. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: remove handling of peer_sta from amd_link_is_upArindam Nath1-11/+0
amd_link_is_up() is a callback to inquire whether the NTB link is up or not. So it should not indulge itself into clearing the bitmasks of peer_sta. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: set peer_sta within event handler itselfArindam Nath1-2/+4
amd_ack_smu() should only set the corresponding bits into SMUACK register. Setting the bitmask of peer_sta should be done within the event handler. They are two different things, and so should be handled differently and at different places. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: return the side info status from amd_poll_linkArindam Nath2-8/+5
Bit 1 of SIDE_INFO register is an indication that the driver on the other side of link is ready. We set this bit during driver initialization sequence. So rather than having separate macros to return the status, we can simply return the status of this bit from amd_poll_link(). So a return of 1 or 0 from this function will indicate to the caller whether the driver on the other side of link is ready or not, respectively. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: define a new function to get link statusArindam Nath1-43/+50
Since getting the status of link is a logically separate operation, we simply create a new function which will store the link status to be used later. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: Enable link up and down event notificationArindam Nath1-0/+5
Link-Up and Link-Down events can occur irrespective of whether a data transfer is in progress or not. So we need to enable the interrupt delivery for these events early during driver load. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: clear interrupt status registerArindam Nath1-0/+3
The interrupt status register should be cleared by driver once the particular event is handled. The patch fixes this. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: Fix access to link status and control registerArindam Nath1-3/+37
The design of AMD NTB implementation is such that NTB primary acts as an endpoint device and NTB secondary is an endpoint device behind a combination of Switch Upstream and Switch Downstream. Considering that, the link status and control register needs to be accessed differently based on the NTB topology. So in the case of NTB secondary, we first get the pointer to the Switch Downstream device for the NTB device. Then we get the pointer to the Switch Upstream device. Once we have that, we read the Link Status and Control register to get the correct status of link at the secondary. In the case of NTB primary, simply reading the Link Status and Control register of the NTB device itself will suffice. Suggested-by: Jiasen Lin <linjiasen@hygon.cn> Signed-off-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13MAINTAINERS: update maintainer list for AMD NTB driverSanjay R Mehta1-0/+1
updating with my email address. Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: ntb_transport: Use scnprintf() for avoiding potential buffer overflowTakashi Iwai1-29/+29
Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Fixes: fce8a7bb5b4b (PCI-Express Non-Transparent Bridge Support) Fixes: 282a2feeb9bf (NTB: Use DMA Engine to Transmit and Receive) Fixes: a754a8fcaf38 (NTB: allocate number transport entries depending on size of ring size) Fixes: d98ef99e378b (NTB: Clean up QP stats info) Fixes: e74bfeedad08 (NTB: Add flow control to the ntb_netdev) Fixes: 569410ca756c (NTB: Use unique DMA channels for TX and RX) Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13ntb_hw_switchtec: Fix ntb_mw_clear_trans error if size == 0Alexander Fomichev1-1/+1
ntb_mw_set_trans() should work as ntb_mw_clear_trans() when size == 0 and/or addr == 0. But error in xlate_pos checking condition prevents this. Fix the condition to make ntb_mw_clear_trans() working. Fixes: 87d11e645e31 (NTB: switchtec_ntb: Add memory window support) Signed-off-by: Alexander Fomichev <fomichev.ru@gmail.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13ntb_tool: Fix printk formatHelge Deller1-7/+7
The correct printk format is %pa or %pap, but not %pa[p]. Fixes: 7f46c8b3a5523 ("NTB: ntb_tool: Add full multi-port NTB API support") Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-13NTB: ntb_perf: Fix address err in perf_copy_chunkJiasen Lin1-10/+47
peer->outbuf is a virtual address which is get by ioremap, it can not be converted to a physical address by virt_to_page and page_to_phys. This conversion will result in DMA error, because the destination address which is converted by page_to_phys is invalid. This patch save the MMIO address of NTB BARx in perf_setup_peer_mw, and map the BAR space to DMA address after we assign the DMA channel. Then fill the destination address of DMA descriptor with this DMA address to guarantee that the address of memory write requests fall into memory window of NBT BARx with IOMMU enabled and disabled. Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support") Signed-off-by: Jiasen Lin <linjiasen@hygon.cn> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-03-12NTB: Fix an error in get link statusJiasen Lin2-3/+2
The offset of PCIe Capability Header for AMD and HYGON NTB is 0x64, but the macro which named "AMD_LINK_STATUS_OFFSET" is defined as 0x68. It is offset of Device Capabilities Reg rather than Link Control Reg. This code trigger an error in get link statsus: cat /sys/kernel/debug/ntb_hw_amd/0000:43:00.1/info LNK STA - 0x8fa1 Link Status - Up Link Speed - PCI-E Gen 0 Link Width - x0 This patch use pcie_capability_read_dword to get link status. After fix this issue, we can get link status accurately: cat /sys/kernel/debug/ntb_hw_amd/0000:43:00.1/info LNK STA - 0x11030042 Link Status - Up Link Speed - PCI-E Gen 3 Link Width - x16 Fixes: a1b3695820aa4 ("NTB: Add support for AMD PCI-Express Non-Transparent Bridge") Signed-off-by: Jiasen Lin <linjiasen@hygon.cn> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2020-02-09Linux 5.6-rc1Linus Torvalds1-2/+2
2020-02-09irqchip/gic-v4.1: Avoid 64bit division for the sake of 32bit ARMMarc Zyngier1-2/+2
In order to allow the GICv4 code to link properly on 32bit ARM, make sure we don't use 64bit divisions when it isn't strictly necessary. Fixes: 4e6437f12d6e ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08fs: Add VirtualBox guest shared folder (vboxsf) supportHans de Goede12-0/+3280
VirtualBox hosts can share folders with guests, this commit adds a VFS driver implementing the Linux-guest side of this, allowing folders exported by the host to be mounted under Linux. This driver depends on the guest <-> host IPC functions exported by the vboxguest driver. Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-08Fix up remaining devm_ioremap_nocache() in SGI IOC3 8250 UART driverLinus Torvalds1-1/+1
This is a merge error on my part - the driver was merged into mainline by commit c5951e7c8ee5 ("Merge tag 'mips_5.6' of git://../mips/linux") over a week ago, but nobody apparently noticed that it didn't actually build due to still having a reference to the devm_ioremap_nocache() function, removed a few days earlier through commit 6a1000bd2703 ("Merge tag 'ioremap-5.6' of git://../ioremap"). Apparently this didn't get any build testing anywhere. Not perhaps all that surprising: it's restricted to 64-bit MIPS only, and only with the new SGI_MFD_IOC3 support enabled. I only noticed because the ioremap conflicts in the ARM SoC driver update made me check there weren't any others hiding, and I found this one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08pipe: use exclusive waits when reading or writingLinus Torvalds4-30/+51
This makes the pipe code use separate wait-queues and exclusive waiting for readers and writers, avoiding a nasty thundering herd problem when there are lots of readers waiting for data on a pipe (or, less commonly, lots of writers waiting for a pipe to have space). While this isn't a common occurrence in the traditional "use a pipe as a data transport" case, where you typically only have a single reader and a single writer process, there is one common special case: using a pipe as a source of "locking tokens" rather than for data communication. In particular, the GNU make jobserver code ends up using a pipe as a way to limit parallelism, where each job consumes a token by reading a byte from the jobserver pipe, and releases the token by writing a byte back to the pipe. This pattern is fairly traditional on Unix, and works very well, but will waste a lot of time waking up a lot of processes when only a single reader needs to be woken up when a writer releases a new token. A simplified test-case of just this pipe interaction is to create 64 processes, and then pass a single token around between them (this test-case also intentionally passes another token that gets ignored to test the "wake up next" logic too, in case anybody wonders about it): #include <unistd.h> int main(int argc, char **argv) { int fd[2], counters[2]; pipe(fd); counters[0] = 0; counters[1] = -1; write(fd[1], counters, sizeof(counters)); /* 64 processes */ fork(); fork(); fork(); fork(); fork(); fork(); do { int i; read(fd[0], &i, sizeof(i)); if (i < 0) continue; counters[0] = i+1; write(fd[1], counters, (1+(i & 1)) *sizeof(int)); } while (counters[0] < 1000000); return 0; } and in a perfect world, passing that token around should only cause one context switch per transfer, when the writer of a token causes a directed wakeup of just a single reader. But with the "writer wakes all readers" model we traditionally had, on my test box the above case causes more than an order of magnitude more scheduling: instead of the expected ~1M context switches, "perf stat" shows 231,852.37 msec task-clock # 15.857 CPUs utilized 11,250,961 context-switches # 0.049 M/sec 616,304 cpu-migrations # 0.003 M/sec 1,648 page-faults # 0.007 K/sec 1,097,903,998,514 cycles # 4.735 GHz 120,781,778,352 instructions # 0.11 insn per cycle 27,997,056,043 branches # 120.754 M/sec 283,581,233 branch-misses # 1.01% of all branches 14.621273891 seconds time elapsed 0.018243000 seconds user 3.611468000 seconds sys before this commit. After this commit, I get 5,229.55 msec task-clock # 3.072 CPUs utilized 1,212,233 context-switches # 0.232 M/sec 103,951 cpu-migrations # 0.020 M/sec 1,328 page-faults # 0.254 K/sec 21,307,456,166 cycles # 4.074 GHz 12,947,819,999 instructions # 0.61 insn per cycle 2,881,985,678 branches # 551.096 M/sec 64,267,015 branch-misses # 2.23% of all branches 1.702148350 seconds time elapsed 0.004868000 seconds user 0.110786000 seconds sys instead. Much better. [ Note! This kernel improvement seems to be very good at triggering a race condition in the make jobserver (in GNU make 4.2.1) for me. It's a long known bug that was fixed back in June 2017 by GNU make commit b552b0525198 ("[SV 51159] Use a non-blocking read with pselect to avoid hangs."). But there wasn't a new release of GNU make until 4.3 on Jan 19 2020, so a number of distributions may still have the buggy version. Some have backported the fix to their 4.2.1 release, though, and even without the fix it's quite timing-dependent whether the bug actually is hit. ] Josh Triplett says: "I've been hammering on your pipe fix patch (switching to exclusive wait queues) for a month or so, on several different systems, and I've run into no issues with it. The patch *substantially* improves parallel build times on large (~100 CPU) systems, both with parallel make and with other things that use make's pipe-based jobserver. All current distributions (including stable and long-term stable distributions) have versions of GNU make that no longer have the jobserver bug" Tested-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08compat_ioctl: fix FIONREAD on devicesArnd Bergmann1-4/+7
My final cleanup patch for sys_compat_ioctl() introduced a regression on the FIONREAD ioctl command, which is used for both regular and special files, but only works on regular files after my patch, as I had missed the warning that Al Viro put into a comment right above it. Change it back so it can work on any file again by moving the implementation to do_vfs_ioctl() instead. Fixes: 77b9040195de ("compat_ioctl: simplify the implementation") Reported-and-tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Reported-and-tested-by: youling257 <youling257@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-02-08net: thunderx: use proper interface type for RGMIITim Harvey1-1/+1
The configuration of the OCTEONTX XCV_DLL_CTL register via xcv_init_hw() is such that the RGMII RX delay is bypassed leaving the RGMII TX delay enabled in the MAC: /* Configure DLL - enable or bypass * TX no bypass, RX bypass */ cfg = readq_relaxed(xcv->reg_base + XCV_DLL_CTL); cfg &= ~0xFF03; cfg |= CLKRX_BYP; writeq_relaxed(cfg, xcv->reg_base + XCV_DLL_CTL); This would coorespond to a interface type of PHY_INTERFACE_MODE_RGMII_RXID and not PHY_INTERFACE_MODE_RGMII. Fixing this allows RGMII PHY drivers to do the right thing (enable RX delay in the PHY) instead of erroneously enabling both delays in the PHY. Signed-off-by: Tim Harvey <tharvey@gateworks.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-08powerpc: Fix CONFIG_TRACE_IRQFLAGS with CONFIG_VMAP_STACKChristophe Leroy1-1/+1
When CONFIG_PROVE_LOCKING is selected together with (now default) CONFIG_VMAP_STACK, kernel enter deadlock during boot. At the point of checking whether interrupts are enabled or not, the value of MSR saved on stack is read using the physical address of the stack. But at this point, when using VMAP stack the DATA MMU translation has already been re-enabled, leading to deadlock. Don't use the physical address of the stack when CONFIG_VMAP_STACK is set. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 028474876f47 ("powerpc/32: prepare for CONFIG_VMAP_STACK") Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/daeacdc0dec0416d1c587cc9f9e7191ad3068dc0.1581095957.git.christophe.leroy@c-s.fr
2020-02-08powerpc/futex: Fix incorrect user access blockingMichael Ellerman1-4/+6
The early versions of our kernel user access prevention (KUAP) were written by Russell and Christophe, and didn't have separate read/write access. At some point I picked up the series and added the read/write access, but I failed to update the usages in futex.h to correctly allow read and write. However we didn't notice because of another bug which was causing the low-level code to always enable read and write. That bug was fixed recently in commit 1d8f739b07bd ("powerpc/kuap: Fix set direction in allow/prevent_user_access()"). futex_atomic_cmpxchg_inatomic() is passed the user address as %3 and does: 1: lwarx %1, 0, %3 cmpw 0, %1, %4 bne- 3f 2: stwcx. %5, 0, %3 Which clearly loads and stores from/to %3. The logic in arch_futex_atomic_op_inuser() is similar, so fix both of them to use allow_read_write_user(). Without this fix, and with PPC_KUAP_DEBUG=y, we see eg: Bug: Read fault blocked by AMR! WARNING: CPU: 94 PID: 149215 at arch/powerpc/include/asm/book3s/64/kup-radix.h:126 __do_page_fault+0x600/0xf30 CPU: 94 PID: 149215 Comm: futex_requeue_p Tainted: G W 5.5.0-rc7-gcc9x-g4c25df5640ae #1 ... NIP [c000000000070680] __do_page_fault+0x600/0xf30 LR [c00000000007067c] __do_page_fault+0x5fc/0xf30 Call Trace: [c00020138e5637e0] [c00000000007067c] __do_page_fault+0x5fc/0xf30 (unreliable) [c00020138e5638c0] [c00000000000ada8] handle_page_fault+0x10/0x30 --- interrupt: 301 at cmpxchg_futex_value_locked+0x68/0xd0 LR = futex_lock_pi_atomic+0xe0/0x1f0 [c00020138e563bc0] [c000000000217b50] futex_lock_pi_atomic+0x80/0x1f0 (unreliable) [c00020138e563c30] [c00000000021b668] futex_requeue+0x438/0xb60 [c00020138e563d60] [c00000000021c6cc] do_futex+0x1ec/0x2b0 [c00020138e563d90] [c00000000021c8b8] sys_futex+0x128/0x200 [c00020138e563e20] [c00000000000b7ac] system_call+0x5c/0x68 Fixes: de78a9c42a79 ("powerpc: Add a framework for Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Reported-by: syzbot+e808452bad7c375cbee6@syzkaller-ppc64.appspotmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Link: https://lore.kernel.org/r/20200207122145.11928-1-mpe@ellerman.id.au
2020-02-08irqchip/gic-v3-its: Rename VPENDBASER/VPROPBASER accessorsZenghui Yu3-24/+24
V{PEND,PROP}BASER registers are actually located in VLPI_base frame of the *redistributor*. Rename their accessors to reflect this fact. No functional changes. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-7-yuzenghui@huawei.com
2020-02-08irqchip/gic-v3-its: Remove superfluous WARN_ONZenghui Yu1-1/+0
"ITS virtual pending table not cleaning" is already complained inside its_clear_vpend_valid(), there's no need to trigger a WARN_ON again. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-6-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Drop 'tmp' in inherit_vpe_l1_table_from_rd()Zenghui Yu1-3/+1
The variable 'tmp' in inherit_vpe_l1_table_from_rd() is actually not needed, drop it. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-5-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD levelZenghui Yu1-0/+80
In GICv4, we will ensure that level2 vPE table memory is allocated for the specified vpe_id on all v4 ITS, in its_alloc_vpe_table(). This still works well for the typical GICv4.1 implementation, where the new vPE table is shared between the ITSs and the RDs. To make it explicit, let us introduce allocate_vpe_l2_table() to make sure that the L2 tables are allocated on all v4.1 RDs. We're likely not need to allocate memory in it because the vPE table is shared and (L2 table is) already allocated at ITS level, except for the case where the ITS doesn't share anything (say SVPET == 0, practically unlikely but architecturally allowed). The implementation of allocate_vpe_l2_table() is mostly copied from its_alloc_table_entry(). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-4-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Set vpe_l1_base for all redistributorsZenghui Yu2-2/+5
Currently, we will not set vpe_l1_page for the current RD if we can inherit the vPE configuration table from another RD (or ITS), which results in an inconsistency between RDs within the same CommonLPIAff group. Let's rename it to vpe_l1_base to indicate the base address of the vPE configuration table of this RD, and set it properly for *all* v4.1 redistributors. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-3-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Fix programming of GICR_VPROPBASER_4_1_SIZEZenghui Yu1-1/+1
The Size field of GICv4.1 VPROPBASER register indicates number of pages minus one and together Page_Size and Size control the vPEID width. Let's respect this requirement of the architecture. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-2-yuzenghui@huawei.com
2020-02-08mt76: mt7615: fix max_nss in mt7615_eeprom_parse_hw_capLorenzo Bianconi1-1/+2
Fix u8 cast reading max_nss from MT_TOP_STRAP_STA register in mt7615_eeprom_parse_hw_cap routine Fixes: acf5457fd99db ("mt76: mt7615: read {tx,rx} mask from eeprom") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2020-02-07bpf: Improve bucket_log calculation logicMartin KaFai Lau1-2/+3
It was reported that the max_t, ilog2, and roundup_pow_of_two macros have exponential effects on the number of states in the sparse checker. This patch breaks them up by calculating the "nbuckets" first so that the "bucket_log" only needs to take ilog2(). In addition, Linus mentioned: Patch looks good, but I'd like to point out that it's not just sparse. You can see it with a simple make net/core/bpf_sk_storage.i grep 'smap->bucket_log = ' net/core/bpf_sk_storage.i | wc and see the end result: 1 365071 2686974 That's one line (the assignment line) that is 2,686,974 characters in length. Now, sparse does happen to react particularly badly to that (I didn't look to why, but I suspect it's just that evaluating all the types that don't actually ever end up getting used ends up being much more expensive than it should be), but I bet it's not good for gcc either. Fixes: 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Link: https://lore.kernel.org/bpf/20200207081810.3918919-1-kafai@fb.com