aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-sqlite.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2020-06-04afs: Don't get epoch from a server because it may be ambiguousDavid Howells2-54/+2
Don't get the epoch from a server, particularly one that we're looking up by UUID, as UUIDs may be ambiguous and may map to more than one server - so we can't draw any conclusions from it. Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04afs: Build an abstraction around an "operation" conceptDavid Howells21-2832/+2648
Turn the afs_operation struct into the main way that most fileserver operations are managed. Various things are added to the struct, including the following: (1) All the parameters and results of the relevant operations are moved into it, removing corresponding fields from the afs_call struct. afs_call gets a pointer to the op. (2) The target volume is made the main focus of the operation, rather than the target vnode(s), and a bunch of op->vnode->volume are made op->volume instead. (3) Two vnode records are defined (op->file[]) for the vnode(s) involved in most operations. The vnode record (struct afs_vnode_param) contains: - The vnode pointer. - The fid of the vnode to be included in the parameters or that was returned in the reply (eg. FS.MakeDir). - The status and callback information that may be returned in the reply about the vnode. - Callback break and data version tracking for detecting simultaneous third-parth changes. (4) Pointers to dentries to be updated with new inodes. (5) An operations table pointer. The table includes pointers to functions for issuing AFS and YFS-variant RPCs, handling the success and abort of an operation and handling post-I/O-lock local editing of a directory. To make this work, the following function restructuring is made: (A) The rotation loop that issues calls to fileservers that can be found in each function that wants to issue an RPC (such as afs_mkdir()) is extracted out into common code, in a new file called fs_operation.c. (B) The rotation loops, such as the one in afs_mkdir(), are replaced with a much smaller piece of code that allocates an operation, sets the parameters and then calls out to the common code to do the actual work. (C) The code for handling the success and failure of an operation are moved into operation functions (as (5) above) and these are called from the core code at appropriate times. (D) The pseudo inode getting stuff used by the dynamic root code is moved over into dynroot.c. (E) struct afs_iget_data is absorbed into the operation struct and afs_iget() expects to be given an op pointer and a vnode record. (F) Point (E) doesn't work for the root dir of a volume, but we know the FID in advance (it's always vnode 1, unique 1), so a separate inode getter, afs_root_iget(), is provided to special-case that. (G) The inode status init/update functions now also take an op and a vnode record. (H) The RPC marshalling functions now, for the most part, just take an afs_operation struct as their only argument. All the data they need is held there. The result delivery functions write their answers there as well. (I) The call is attached to the operation and then the operation core does the waiting. And then the new operation code is, for the moment, made to just initialise the operation, get the appropriate vnode I/O locks and do the same rotation loop as before. This lays the foundation for the following changes in the future: (*) Overhauling the rotation (again). (*) Support for asynchronous I/O, where the fileserver rotation must be done asynchronously also. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Rename struct afs_fs_cursor to afs_operationDavid Howells14-278/+278
As a prelude to implementing asynchronous fileserver operations in the afs filesystem, rename struct afs_fs_cursor to afs_operation. This struct is going to form the core of the operation management and is going to acquire more members in later. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Remove the error argument from afs_protocol_error()David Howells8-60/+39
Remove the error argument from afs_protocol_error() as it's always -EBADMSG. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Set error flag rather than return error from file status decodeDavid Howells4-123/+55
Set a flag in the call struct to indicate an unmarshalling error rather than return and handle an error from the decoding of file statuses. This flag is checked on a successful return from the delivery function. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Make callback processing more efficient.David Howells3-60/+100
afs_vol_interest objects represent the volume IDs currently being accessed from a fileserver. These hold lists of afs_cb_interest objects that repesent the superblocks using that volume ID on that server. When a callback notification from the server telling of a modification by another client arrives, the volume ID specified in the notification is looked up in the server's afs_vol_interest list. Through the afs_cb_interest list, the relevant superblocks can be iterated over and the specific inode looked up and marked in each one. Make the following efficiency improvements: (1) Hold rcu_read_lock() over the entire processing rather than locking it each time. (2) Do all the callbacks for each vid together rather than individually. Each volume then only needs to be looked up once. (3) afs_vol_interest objects are now stored in an rb_tree rather than a flat list to reduce the lookup step count. (4) afs_vol_interest lookup is now done with RCU, but because it's in an rb_tree which may rotate under us, a seqlock is used so that if it changes during the walk, we repeat the walk with a lock held. With this and the preceding patch which adds RCU-based lookups in the inode cache, target volumes/vnodes can be taken without the need to take any locks, except on the target itself. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Show more information in /proc/net/afs/serversDavid Howells1-8/+9
Show more information in /proc/net/afs/servers to make it easier to see what's going on with the server probing. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Actively poll fileservers to maintain NAT or firewall openingsDavid Howells9-119/+281
When an AFS client accesses a file, it receives a limited-duration callback promise that the server will notify it if another client changes a file. This callback duration can be a few hours in length. If a client mounts a volume and then an application prevents it from being unmounted, say by chdir'ing into it, but then does nothing for some time, the rxrpc_peer record will expire and rxrpc-level keepalive will cease. If there is NAT or a firewall between the client and the server, the route back for the server may close after a comparatively short duration, meaning that attempts by the server to notify the client may then bounce. The client, however, may (so far as it knows) still have a valid unexpired promise and will then rely on its cached data and will not see changes made on the server by a third party until it incidentally rechecks the status or the promise needs renewal. To deal with this, the client needs to regularly probe the server. This has two effects: firstly, it keeps a route open back for the server, and secondly, it causes the server to disgorge any notifications that got queued up because they couldn't be sent. Fix this by adding a mechanism to emit regular probes. Two levels of probing are made available: Under normal circumstances the 'slow' queue will be used for a fileserver - this just probes the preferred address once every 5 mins or so; however, if server fails to respond to any probes, the server will shift to the 'fast' queue from which all its interfaces will be probed every 30s. When it finally responds, the record will switch back to the slow queue. Further notes: (1) Probing is now no longer driven from the fileserver rotation algorithm. (2) Probes are dispatched to all interfaces on a fileserver when that an afs_server object is set up to record it. (3) The afs_server object is removed from the probe queues when we start to probe it. afs_is_probing_server() returns true if it's not listed - ie. it's undergoing probing. (4) The afs_server object is added back on to the probe queue when the final outstanding probe completes, but the probed_at time is set when we're about to launch a probe so that it's not dependent on the probe duration. (5) The timer and the work item added for this must be handed a count on net->servers_outstanding, which they hand on or release. This makes sure that network namespace cleanup waits for them. Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation") Reported-by: Dave Botsch <botsch@cnf.cornell.edu> Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Split the usage count on struct afs_serverDavid Howells9-71/+131
Split the usage count on the afs_server struct to have an active count that registers who's actually using it separately from the reference count on the object. This allows a future patch to dispatch polling probes without advancing the "unuse" time into the future each time we emit a probe, which would otherwise prevent unused server records from expiring. Included in this: (1) The latter part of afs_destroy_server() in which the RCU destruction of afs_server objects is invoked and the outstanding server count is decremented is split out into __afs_put_server(). (2) afs_put_server() now calls __afs_put_server() rather then setting the management timer. (3) The calls begun by afs_fs_give_up_all_callbacks() and afs_fs_get_capabilities() can now take a ref on the server record, so afs_destroy_server() can just drop its ref and needn't wait for the completion of these calls. They'll put the ref when they're done. (4) Because of (3), afs_fs_probe_done() no longer needs to wake up afs_destroy_server() with server->probe_outstanding. (5) afs_gc_servers can be simplified. It only needs to check if server->active is 0 rather than playing games with the refcount. (6) afs_manage_servers() can propose a server for gc if usage == 0 rather than if ref == 1. The gc is effected by (5). Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Use the serverUnique field in the UVLDB record to reduce rpc opsDavid Howells4-15/+20
The U-version VLDB volume record retrieved by the VL.GetEntryByNameU rpc op carries a change counter (the serverUnique field) for each fileserver listed in the record as backing that volume. This is incremented whenever the registration details for a fileserver change (such as its address list). Note that the same value will be seen in all UVLDB records that refer to that fileserver. This should be checked before calling the VL server to re-query the address list for a fileserver. If it's the same, there's no point doing the query. Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31afs: Always include dir in bulk status fetch from afs_do_lookup()David Howells1-2/+7
When a lookup is done in an AFS directory, the filesystem will speculate and fetch up to 49 other statuses for files in the same directory and fetch those as well, turning them into inodes or updating inodes that already exist. However, occasionally, a callback break might go missing due to NAT timing out, but the afs filesystem doesn't then realise that the directory is not up to date. Alleviate this by using one of the status slots to check the directory in which the lookup is being done. Reported-by: Dave Botsch <botsch@cnf.cornell.edu> Suggested-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31rxrpc: Adjust /proc/net/rxrpc/calls to display call->debug_id not user_IDDavid Howells1-3/+3
The user ID value isn't actually much use - and leaks a kernel pointer or a userspace value - so replace it with the call debug ID, which appears in trace points. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31rxrpc: Map the EACCES error produced by some ICMP6 to EHOSTUNREACHDavid Howells1-0/+3
Map the EACCES error that is produced by some ICMP6 packets to EHOSTUNREACH when we get them as EACCES has other meanings within a filesystem context. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-31vfs, afs, ext4: Make the inode hash table RCU searchableDavid Howells4-41/+130
Make the inode hash table RCU searchable so that searches that want to access or modify an inode without taking a ref on that inode can do so without taking the inode hash table lock. The main thing this requires is some RCU annotation on the list manipulation operations. Inodes are already freed by RCU in most cases. Users of this interface must take care as the inode may be still under construction or may be being torn down around them. There are at least three instances where this can be of use: (1) Testing whether the inode number iunique() is going to return is currently unique (the iunique_lock is still held). (2) Ext4 date stamp updating. (3) AFS callback breaking. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> cc: linux-ext4@vger.kernel.org cc: linux-afs@lists.infradead.org
2020-05-24Linux 5.7-rc7Linus Torvalds1-1/+1
2020-05-23net: smsc911x: Fix runtime PM imbalance on errorDinghao Liu1-4/+5
Remove runtime PM usage counter decrement when the increment function has not been called to keep the counter balanced. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23net/mlx4_core: fix a memory leak bug.Qiushi Wu1-1/+1
In function mlx4_opreq_action(), pointer "mailbox" is not released, when mlx4_cmd_box() return and error, causing a memory leak bug. Fix this issue by going to "out" label, mlx4_free_cmd_mailbox() can free this pointer. Fixes: fe6f700d6cbb ("net/mlx4_core: Respond to operation request by firmware") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspendGrygorii Strashko1-0/+4
vlan_for_each() are required to be called with rtnl_lock taken, otherwise ASSERT_RTNL() warning will be triggered - which happens now during System resume from suspend: cpsw_suspend() |- cpsw_ndo_stop() |- __hw_addr_ref_unsync_dev() |- cpsw_purge_all_mc() |- vlan_for_each() |- ASSERT_RTNL(); Hence, fix it by surrounding cpsw_ndo_stop() by rtnl_lock/unlock() calls. Fixes: 15180eca569b ("net: ethernet: ti: cpsw: fix vlan mcast") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23net: phy: mscc: fix initialization of the MACsec protocol modeAntoine Tenart5-10/+21
At the very end of the MACsec block initialization in the MSCC PHY driver, the MACsec "protocol mode" is set. This setting should be set based on the PHY id within the package, as the bank used to access the register used depends on this. This was not done correctly, and only the first bank was used leading to the two upper PHYs being unstable when using the VSC8584. This patch fixes it. Fixes: 1bbe0ecc2a1a ("net: phy: mscc: macsec initialization") Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23net: stmmac: don't attach interface until resume finishesLeon Yu1-2/+2
Commit 14b41a2959fb ("net: stmmac: Delete txtimer in suspend") was the first attempt to fix a race between mod_timer() and setup_timer() during stmmac_resume(). However the issue still exists as the commit only addressed half of the issue. Same race can still happen as stmmac_resume() re-attaches interface way too early - even before hardware is fully initialized. Worse, doing so allows network traffic to restart and stmmac_tx_timer_arm() being called in the middle of stmmac_resume(), which re-init tx timers in stmmac_init_coalesce(). timer_list will be corrupted and system crashes as a result of race between mod_timer() and setup_timer(). systemd--1995 2.... 552950018us : stmmac_suspend: 4994 ksoftirq-9 0..s2 553123133us : stmmac_tx_timer_arm: 2276 systemd--1995 0.... 553127896us : stmmac_resume: 5101 systemd--320 7...2 553132752us : stmmac_tx_timer_arm: 2276 (sd-exec-1999 5...2 553135204us : stmmac_tx_timer_arm: 2276 --------------------------------- pc : run_timer_softirq+0x468/0x5e0 lr : run_timer_softirq+0x570/0x5e0 Call trace: run_timer_softirq+0x468/0x5e0 __do_softirq+0x124/0x398 irq_exit+0xd8/0xe0 __handle_domain_irq+0x6c/0xc0 gic_handle_irq+0x60/0xb0 el1_irq+0xb8/0x180 arch_cpu_idle+0x38/0x230 default_idle_call+0x24/0x3c do_idle+0x1e0/0x2b8 cpu_startup_entry+0x28/0x48 secondary_start_kernel+0x1b4/0x208 Fix this by deferring netif_device_attach() to the end of stmmac_resume(). Signed-off-by: Leon Yu <leoyu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23net: Fix return value about devm_platform_ioremap_resource()Tiezhu Yang4-4/+7
When call function devm_platform_ioremap_resource(), we should use IS_ERR() to check the return value and return PTR_ERR() if failed. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23sparc32: fix page table traversal in srmmu_nocache_init()Mike Rapoport1-2/+2
The srmmu_nocache_init() uses __nocache_fix() macro to add an offset to page table entry to access srmmu_nocache_pool. But since sparc32 has only three actual page table levels, pgd, p4d and pud are essentially the same thing and pgd_offset() and p4d_offset() are no-ops, the __nocache_fix() should be done only at PUD level. Remove __nocache_fix() for p4d_offset() and pud_offset() and keep it only for PUD and lower levels. Fixes: c2bc26f7ca1f ("sparc32: use PUD rather than PGD to get PMD in srmmu_nocache_init()") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Anatoly Pugachev <matorola@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23MAINTAINERS: add files related to kdumpBaoquan He1-0/+5
Kdump is implemented based on kexec, however some files are only related to crash dumping and missing, add them to KDUMP entry. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Dave Young <dyoung@redhat.com> Link: http://lkml.kernel.org/r/20200520103633.GW5029@MiWiFi-R3L-srv Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23z3fold: fix use-after-free when freeing handlesUladzislau Rezki1-5/+6
free_handle() for a foreign handle may race with inter-page compaction, what can lead to memory corruption. To avoid that, take write lock not read lock in free_handle to be synchronized with __release_z3fold_page(). For example KASAN can detect it: ================================================================== BUG: KASAN: use-after-free in LZ4_decompress_safe+0x2c4/0x3b8 Read of size 1 at addr ffffffc976695ca3 by task GoogleApiHandle/4121 CPU: 0 PID: 4121 Comm: GoogleApiHandle Tainted: P S OE 4.19.81-perf+ #162 Hardware name: Sony Mobile Communications. PDX-203(KONA) (DT) Call trace: LZ4_decompress_safe+0x2c4/0x3b8 lz4_decompress_crypto+0x3c/0x70 crypto_decompress+0x58/0x70 zcomp_decompress+0xd4/0x120 ... Apart from that, initialize zhdr->mapped_count in init_z3fold_page() and remove "newpage" variable because it is not used anywhere. Signed-off-by: Uladzislau Rezki <uladzislau.rezki@sony.com> Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Qian Cai <cai@lca.pw> Cc: Raymond Jennings <shentino@gmail.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200520082100.28876-1-vitaly.wool@konsulko.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23sparc32: use PUD rather than PGD to get PMD in srmmu_nocache_init()Mike Rapoport1-1/+1
The kbuild test robot reported the following warning: arch/sparc/mm/srmmu.c: In function 'srmmu_nocache_init': arch/sparc/mm/srmmu.c:300:9: error: variable 'pud' set but not used [-Werror=unused-but-set-variable] 300 | pud_t *pud; This warning is caused by misprint in the page table traversal in srmmu_nocache_init() function which accessed a PMD entry using PGD rather than PUD. Since sparc32 has only 3 page table levels, the PGD and PUD are essentially the same and usage of __nocache_fix() removed the type checking. Use PUD for the consistency and to silence the compiler warning. Fixes: 7235db268a2777bc38 ("sparc32: use pgtable-nopud instead of 4level-fixup") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: David S. Miller <davem@davemloft.net> Cc: Anatoly Pugachev <matorola@gmail.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200520132005.GM1059226@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23MAINTAINERS: update email address for Naoya HoriguchiNaoya Horiguchi1-1/+1
My email address has changed due to system upgrade, so please update it in MAINTAINERS list. My old address (n-horiguchi@ah.jp.nec.com) will be still active for a few months. Note that my email system has some encoding issue and can't send patches in raw format via git-send-email. So patches from me will be delivered via my free address (nao.horiguchi@gmail.com) or GitHub. Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1589874488-9247-1-git-send-email-naoya.horiguchi@nec.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23sh: include linux/time_types.h for sockiosArnd Bergmann1-0/+2
Using the socket ioctls on arch/sh (and only there) causes build time problems when __kernel_old_timeval/__kernel_old_timespec are not already visible to the compiler. Add an explict include line for the header that defines these structures. Fixes: 8c709f9a0693 ("y2038: sh: remove timeval/timespec usage from headers") Fixes: 0768e17073dc ("net: socket: implement 64-bit timestamps") Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200519131327.1836482-1-arnd@arndb.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23kasan: disable branch tracing for core runtimeMarco Elver3-10/+8
During early boot, while KASAN is not yet initialized, it is possible to enter reporting code-path and end up in kasan_report(). While uninitialized, the branch there prevents generating any reports, however, under certain circumstances when branches are being traced (TRACE_BRANCH_PROFILING), we may recurse deep enough to cause kernel reboots without warning. To prevent similar issues in future, we should disable branch tracing for the core runtime. [elver@google.com: remove duplicate DISABLE_BRANCH_PROFILING, per Qian Cai] Link: https://lore.kernel.org/lkml/20200517011732.GE24705@shao2-debian/ Link: http://lkml.kernel.org/r/20200522075207.157349-1-elver@google.com Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r//20200517011732.GE24705@shao2-debian/ Link: http://lkml.kernel.org/r/20200519182459.87166-1-elver@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23selftests/vm/write_to_hugetlbfs.c: fix unused variable warningJohn Hubbard1-2/+0
Remove unused variable "i", which was triggering a compiler warning. Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-By: Mina Almasry <almasrymina@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Link: http://lkml.kernel.org/r/20200517001245.361762-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23selftests/vm/.gitignore: add mremap_dontunmapJohn Hubbard1-0/+1
Add mremap_dontunmap to .gitignore. Fixes: 0c28759ee3c9 ("selftests: add MREMAP_DONTUNMAP selftest") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Brian Geffon <bgeffon@google.com> Link: http://lkml.kernel.org/r/20200517002509.362401-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23rapidio: fix an error in get_user_pages_fast() error handlingJohn Hubbard1-0/+5
In the case of get_user_pages_fast() returning fewer pages than requested, rio_dma_transfer() does not quite do the right thing. It attempts to release all the pages that were requested, rather than just the pages that were pinned. Fix the error handling so that only the pages that were successfully pinned are released. Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200517235620.205225-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23x86: bitops: fix build regressionNick Desaulniers1-6/+6
This is easily reproducible via CC=clang + CONFIG_STAGING=y + CONFIG_VT6656=m. It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, these statements produce invalid assembly: $ cat foo.c long a(long b, long c) { asm("orb %1, %0" : "+q"(c): "r"(b)); return c; } $ gcc foo.c foo.c: Assembler messages: foo.c:2: Error: `%rax' not allowed with `orb' Use the `%b` "x86 Operand Modifier" to instead force register allocation to select a lower-8-bit GPR operand. The "q" constraint only has meaning on -m32 otherwise is treated as "r". Not all GPRs have low-8-bit aliases for -m32. Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast") Reported-by: kernelci.org bot <bot@kernelci.org> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com> Suggested-by: Brian Gerst <brgerst@gmail.com> Suggested-by: H. Peter Anvin <hpa@zytor.com> Suggested-by: Ilie Halip <ilie.halip@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> [build, clang-11] Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-By: Brian Gerst <brgerst@gmail.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Marco Elver <elver@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Daniel Axtens <dja@axtens.net> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Link: http://lkml.kernel.org/r/20200508183230.229464-1-ndesaulniers@google.com Link: https://github.com/ClangBuiltLinux/linux/issues/961 Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/ Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23device-dax: don't leak kernel memory to user space after unloading kmemDavid Hildenbrand1-3/+11
Assume we have kmem configured and loaded: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory$ 140000000-1481fffff : namespace0.0 150000000-33fffffff : dax0.0 150000000-33fffffff : System RAM Assume we try to unload kmem. This force-unloading will work, even if memory cannot get removed from the system. [root@localhost ~]# rmmod kmem [ 86.380228] removing memory fails, because memory [0x0000000150000000-0x0000000157ffffff] is onlined ... [ 86.431225] kmem dax0.0: DAX region [mem 0x150000000-0x33fffffff] cannot be hotremoved until the next reboot Now, we can reconfigure the namespace: [root@localhost ~]# ndctl create-namespace --force --reconfig=namespace0.0 --mode=devdax [ 131.409351] nd_pmem namespace0.0: could not reserve region [mem 0x140000000-0x33fffffff]dax [ 131.410147] nd_pmem: probe of namespace0.0 failed with error -16namespace0.0 --mode=devdax ... This fails as expected due to the busy memory resource, and the memory cannot be used. However, the dax0.0 device is removed, and along its name. The name of the memory resource now points at freed memory (name of the device): [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 150000000-33fffffff : �_�^7_��/_��wR��WQ���^��� ... 150000000-33fffffff : System RAM We have to make sure to duplicate the string. While at it, remove the superfluous setting of the name and fixup a stale comment. Fixes: 9f960da72b25 ("device-dax: "Hotremove" persistent memory that is used like normal RAM") Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> [5.3] Link: http://lkml.kernel.org/r/20200508084217.9160-2-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23Revert "kobject: Make sure the parent does not get released before its children"Greg Kroah-Hartman1-20/+10
This reverts commit 4ef12f7198023c09ad6d25b652bd8748c965c7fa. Guenter reports: All my arm64be (arm64 big endian) boot tests crash with this patch applied. Reverting it fixes the problem. Crash log and bisect results (from pending-fixes branch) below. And also: arm64 images don't crash but report lots of "poison overwritten" backtraces like the one below. On arm, I see "refcount_t: underflow", also attached. I didn't bisect those, but given the context I would suspect the same culprit. Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200513151840.36400-1-heikki.krogerus@linux.intel.com Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: kernel test robot <rong.a.chen@intel.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-22net/mlx5: Fix error flow in case of function_setup failureShay Drory1-1/+2
Currently, if an error occurred during mlx5_function_setup(), we keep dev->state as DEVICE_STATE_UP. Fixing it by adding a goto label. Fixes: e161105e58da ("net/mlx5: Function setup/teardown procedures") Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5e: CT: Correctly get flow ruleRoi Dayan2-3/+6
The correct way is to us the flow_cls_offload_flow_rule() wrapper instead of f->rule directly. Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5e: Update netdev txq on completions during closureMoshe Shemesh1-3/+6
On sq closure when we free its descriptors, we should also update netdev txq on completions which would not arrive. Otherwise if we reopen sqs and attach them back, for example on fw fatal recovery flow, we may get tx timeout. Fixes: 29429f3300a3 ("net/mlx5e: Timeout if SQ doesn't flush during close") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5: Annotate mutex destroy for root nsRoi Dayan1-0/+6
Invoke mutex_destroy() to catch any errors. Fixes: 2cc43b494a6c ("net/mlx5_core: Managing root flow table") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5: Don't maintain a case of del_sw_func being nullRoi Dayan1-8/+9
Add del_sw_func cb for root ns. Now there is no need to maintain a case of del_sw_func being null when freeing the node. Fixes: 2cc43b494a6c ("net/mlx5_core: Managing root flow table") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5: Fix cleaning unmanaged flow tablesRoi Dayan1-5/+6
Unmanaged flow tables doesn't have a parent and tree_put_node() assume there is always a parent if cleaning is needed. fix that. Fixes: 5281a0c90919 ("net/mlx5: fs_core: Introduce unmanaged flow tables") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5: Fix memory leak in mlx5_events_initMoshe Shemesh1-1/+3
Fix memory leak in mlx5_events_init(), in case create_single_thread_workqueue() fails, events struct should be freed. Fixes: 5d3c537f9070 ("net/mlx5: Handle event of power detection in the PCIE slot") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5e: Fix inner tirs handlingRoi Dayan4-10/+12
In the cited commit inner_tirs argument was added to create and destroy inner tirs, and no indication was added to mlx5e_modify_tirs_hash() function. In order to have a consistent handling, use inner_indir_tir[0].tirn in tirs destroy/modify function as an indication to whether inner tirs are created. Inner tirs are not created for representors and before this commit, a call to mlx5e_modify_tirs_hash() was sending HW commands to modify non-existent inner tirs. Fixes: 46dc933cee82 ("net/mlx5e: Provide explicit directive if to create inner indirect tirs") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5e: kTLS, Destroy key object after destroying the TISTariq Toukan1-1/+1
The TLS TIS object contains the dek/key ID. By destroying the key first, the TIS would contain an invalid non-existing key ID. Reverse the destroy order, this also acheives the desired assymetry between the destroy and the create flows. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5e: Fix allowed tc redirect merged eswitch offload casesMaor Dickman3-14/+41
After changing the parent_id to be the same for both NICs of same The cited commit wrongly allow offload of tc redirect flows from VF to uplink and vice versa when devcies are on different eswitch, these cases aren't supported by HW. Disallow the above offloads when devcies are on different eswitch and VF LAG is not configured. Fixes: f6dc1264f1c0 ("net/mlx5e: Disallow tc redirect offload cases we don't support") Signed-off-by: Maor Dickman <maord@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5: Avoid processing commands before cmdif is readyEran Ben Elisha3-0/+23
When driver is reloading during recovery flow, it can't get new commands till command interface is up again. Otherwise we may get to null pointer trying to access non initialized command structures. Add cmdif state to avoid processing commands while cmdif is not ready. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5: Fix a race when moving command interface to events modeEran Ben Elisha3-4/+40
After driver creates (via FW command) an EQ for commands, the driver will be informed on new commands completion by EQE. However, due to a race in driver's internal command mode metadata update, some new commands will still be miss-handled by driver as if we are in polling mode. Such commands can get two non forced completion, leading to already freed command entry access. CREATE_EQ command, that maps EQ to the command queue must be posted to the command queue while it is empty and no other command should be posted. Add SW mechanism that once the CREATE_EQ command is about to be executed, all other commands will return error without being sent to the FW. Allow sending other commands only after successfully changing the driver's internal command mode metadata. We can safely return error to all other commands while creating the command EQ, as all other commands might be sent from the user/application during driver load. Application can rerun them later after driver's load was finished. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5: Add command entry handling completionMoshe Shemesh2-0/+15
When FW response to commands is very slow and all command entries in use are waiting for completion we can have a race where commands can get timeout before they get out of the queue and handled. Timeout completion on uninitialized command will cause releasing command's buffers before accessing it for initialization and then we will get NULL pointer exception while trying access it. It may also cause releasing buffers of another command since we may have timeout completion before even allocating entry index for this command. Add entry handling completion to avoid this race. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-23rxrpc: Fix a memory leak in rxkad_verify_response()Qiushi Wu1-2/+1
A ticket was not released after a call of the function "rxkad_decrypt_ticket" failed. Thus replace the jump target "temporary_error_free_resp" by "temporary_error_free_ticket". Fixes: 8c2f826dc3631 ("rxrpc: Don't put crypto buffers on the stack") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: David Howells <dhowells@redhat.com> cc: Markus Elfring <Markus.Elfring@web.de>
2020-05-23rxrpc: Fix a warningDavid Howells1-1/+1
Fix a warning due to an uninitialised variable. le included from ../fs/afs/fs_probe.c:11: ../fs/afs/fs_probe.c: In function 'afs_fileserver_probe_result': ../fs/afs/internal.h:1453:2: warning: 'rtt_us' may be used uninitialized in this function [-Wmaybe-uninitialized] 1453 | printk("[%-6.6s] "FMT"\n", current->comm ,##__VA_ARGS__) | ^~~~~~ ../fs/afs/fs_probe.c:35:15: note: 'rtt_us' was declared here Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-22net: sun: fix missing release regions in cas_init_one().Qiushi Wu1-2/+1
In cas_init_one(), "pdev" is requested by "pci_request_regions", but it was not released after a call of the function “pci_write_config_byte” failed. Thus replace the jump target “err_write_cacheline” by "err_out_free_res". Fixes: 1f26dac32057 ("[NET]: Add Sun Cassini driver.") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>