aboutsummaryrefslogtreecommitdiffstatshomepage
AgeCommit message (Collapse)AuthorFilesLines
2020-08-15net: xdp: account for layer 3 packets in generic skb handlerjd/xdp-l3Jason A. Donenfeld1-0/+12
A user reported that packets from wireguard were possibly ignored by XDP [1]. Another user reported that modifying packets from layer 3 interfaces results in impossible to diagnose drops. Apparently, the generic skb xdp handler path seems to assume that packets will always have an ethernet header, which really isn't always the case for layer 3 packets, which are produced by multiple drivers. This patch fixes the oversight. If the mac_len is 0 and so is hard_header_len, then we know that the skb is a layer 3 packet, and in that case prepend a pseudo ethhdr to the packet whose h_proto is copied from skb->protocol, which will have the appropriate v4 or v6 ethertype. This allows us to keep XDP programs' assumption correct about packets always having that ethernet header, so that existing code doesn't break, while still allowing layer 3 devices to use the generic XDP handler. We push on the ethernet header and then pull it right off and set mac_len to the ethernet header size, so that the rest of the XDP code does not need any changes. That is, it makes it so that the skb has its ethernet header just before the data pointer, of size ETH_HLEN. Previous discussions have included the point that maybe XDP should just be intentionally broken on layer 3 interfaces, by design, and that layer 3 people should be using cls_bpf. However, I think there are good grounds to reconsider this perspective: - Complicated deployments wind up applying XDP modifications to a variety of different devices on a given host, some of which are using specialized ethernet cards and other ones using virtual layer 3 interfaces, such as WireGuard. Being able to apply one codebase to each of these winds up being essential. - cls_bpf does not support the same feature set as XDP, and operates at a slightly different stage in the networking stack. You may reply, "then add all the features you want to cls_bpf", but that seems to be missing the point, and would still result in there being two ways to do everything, which is not desirable for anyone actually _using_ this code. - While XDP was originally made for hardware offloading, and while many look disdainfully upon the generic mode, it nevertheless remains a highly useful and popular way of adding bespoke packet transformations, and from that perspective, a difference between layer 2 and layer 3 packets is immaterial if the user is primarily concerned with transformations to layer 3 and beyond. - It's not impossible to imagine layer 3 hardware (e.g. a WireGuard PCIe card) including eBPF/XDP functionality built-in. In that case, why limit XDP as a technology to only layer 2? Then, having generic XDP work for layer 3 would naturally fit as well. [1] https://lore.kernel.org/wireguard/M5WzVK5--3-2@tuta.io/ Reported-by: Thomas Ptacek <thomas@sockpuppet.org> Reported-by: Adhipati Blambangan <adhipati@tuta.io> Cc: David Ahern <dsahern@gmail.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-08-15net: xdp: pull ethernet header off packet after computing skb->protocolJason A. Donenfeld1-0/+1
When an XDP program changes the ethernet header protocol field, eth_type_trans is used to recalculate skb->protocol. In order for eth_type_trans to work correctly, the ethernet header must actually be part of the skb data segment, so the code first pushes that onto the head of the skb. However, it subsequently forgets to pull it back off, making the behavior of the passed-on packet inconsistent between the protocol modifying case and the static protocol case. This patch fixes the issue by simply pulling the ethernet header back off of the skb head. Fixes: 297249569932 ("net: fix generic XDP to handle if eth header was mangled") Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds66-350/+494
Pull networking fixes from David Miller: "Some merge window fallout, some longer term fixes: 1) Handle headroom properly in lapbether and x25_asy drivers, from Xie He. 2) Fetch MAC address from correct r8152 device node, from Thierry Reding. 3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg, from Rouven Czerwinski. 4) Correct fdputs in socket layer, from Miaohe Lin. 5) Revert troublesome sockptr_t optimization, from Christoph Hellwig. 6) Fix TCP TFO key reading on big endian, from Jason Baron. 7) Missing CAP_NET_RAW check in nfc, from Qingyu Li. 8) Fix inet fastreuse optimization with tproxy sockets, from Tim Froidcoeur. 9) Fix 64-bit divide in new SFC driver, from Edward Cree. 10) Add a tracepoint for prandom_u32 so that we can more easily perform usage analysis. From Eric Dumazet. 11) Fix rwlock imbalance in AF_PACKET, from John Ogness" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits) net: openvswitch: introduce common code for flushing flows af_packet: TPACKET_V3: fix fill status rwlock imbalance random32: add a tracepoint for prandom_u32() Revert "ipv4: tunnel: fix compilation on ARCH=um" net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus net: ethernet: stmmac: Disable hardware multicast filter net: stmmac: dwmac1000: provide multicast filter fallback ipv4: tunnel: fix compilation on ARCH=um vsock: fix potential null pointer dereference in vsock_poll() sfc: fix ef100 design-param checking net: initialize fastreuse on inet_inherit_port net: refactor bind_bucket fastreuse into helper net: phy: marvell10g: fix null pointer dereference net: Fix potential memory leak in proto_register() net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc() net/nfc/rawsock.c: add CAP_NET_RAW check. hinic: fix strncpy output truncated compile warnings drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check net/tls: Fix kmap usage ...
2020-08-13Merge branch 'i2c/for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds41-250/+409
Pull i2c updates from Wolfram Sang: - bus recovery can now be given a pinctrl handle and the I2C core will do all the steps to switch to/from GPIO which can save quite some boilerplate code from drivers - "fallthrough" conversion - driver updates, mostly ID additions * 'i2c/for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (32 commits) i2c: iproc: fix race between client unreg and isr i2c: eg20t: use generic power management i2c: eg20t: Drop PCI wakeup calls from .suspend/.resume i2c: mediatek: Fix i2c_spec_values description i2c: mediatek: Add i2c compatible for MediaTek MT8192 dt-bindings: i2c: update bindings for MT8192 SoC i2c: mediatek: Add access to more than 8GB dram in i2c driver i2c: mediatek: Add apdma sync in i2c driver i2c: i801: Add support for Intel Tiger Lake PCH-H i2c: i801: Add support for Intel Emmitsburg PCH i2c: bcm2835: Replace HTTP links with HTTPS ones Documentation: i2c: dev: 'block process call' is supported i2c: at91: Move to generic GPIO bus recovery i2c: core: treat EPROBE_DEFER when acquiring SCL/SDA GPIOs i2c: core: add generic I2C GPIO recovery dt-bindings: i2c: add generic properties for GPIO bus recovery i2c: rcar: avoid race when unregistering slave i2c: tegra: Avoid tegra_i2c_init_dma() for Tegra210 vi i2c i2c: tegra: Fix runtime resume to re-init VI I2C i2c: tegra: Fix the error path in tegra_i2c_runtime_resume ...
2020-08-13net: openvswitch: introduce common code for flushing flowsTonghao Zhang3-21/+27
To avoid some issues, for example RCU usage warning and double free, we should flush the flows under ovs_lock. This patch refactors table_instance_destroy and introduces table_instance_flow_flush which can be invoked by __dp_destroy or ovs_flow_tbl_flush. Fixes: 50b0e61b32ee ("net: openvswitch: fix possible memleak on destroy flow-table") Reported-by: Johan Knöös <jknoos@google.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-August/050489.html Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-13af_packet: TPACKET_V3: fix fill status rwlock imbalanceJohn Ogness1-2/+7
After @blk_fill_in_prog_lock is acquired there is an early out vnet situation that can occur. In that case, the rwlock needs to be released. Also, since @blk_fill_in_prog_lock is only acquired when @tp_version is exactly TPACKET_V3, only release it on that exact condition as well. And finally, add sparse annotation so that it is clearer that prb_fill_curr_block() and prb_clear_blk_fill_status() are acquiring and releasing @blk_fill_in_prog_lock, respectively. sparse is still unable to understand the balance, but the warnings are now on a higher level that make more sense. Fixes: 632ca50f2cbd ("af_packet: TPACKET_V3: replace busy-wait loop") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-13random32: add a tracepoint for prandom_u32()Eric Dumazet2-0/+19
There has been some heat around prandom_u32() lately, and some people were wondering if there was a simple way to determine how often it was used, before considering making it maybe 10 times more expensive. This tracepoint exports the generated pseudo random value. Tested: perf list | grep prandom_u32 random:prandom_u32 [Tracepoint event] perf record -a [-g] [-C1] -e random:prandom_u32 sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 259.748 MB perf.data (924087 samples) ] perf report --nochildren ... 97.67% ksoftirqd/1 [kernel.vmlinux] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--48.86%--tcp_v4_syn_recv_sock | tcp_check_req | tcp_v4_rcv | ... --48.81%--tcp_conn_request tcp_v4_conn_request tcp_rcv_state_process ... perf script Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-13Merge tag 'docs-5.9-2' of git://git.lwn.net/linuxLinus Torvalds18-379/+105
Pull documentation fixes from Jonathan Corbet: "A handful of obvious fixes that wandered in during the merge window" * tag 'docs-5.9-2' of git://git.lwn.net/linux: Documentation/locking/locktypes: fix the typo doc/zh_CN: resolve undefined label warning in admin-guide index doc/zh_CN: fix title heading markup in admin-guide cpu-load docs: remove the 2.6 "Upgrading I2C Drivers" guide docs: Correct the release date of 5.2 stable mailmap: Update comments for with format and more detalis docs: cdrom: Fix a typo and rst markup Doc: admin-guide: use correct legends in kernel-parameters.txt Documentation/features: refresh RISC-V arch support files documentation: coccinelle: Improve command example for make C={1,2} Core-api: Documentation: Replace deprecated :c:func: Usage Dev-tools: Documentation: Replace deprecated :c:func: Usage Filesystems: Documentation: Replace deprecated :c:func: Usage docs: trace: fix a typo
2020-08-13Documentation/locking/locktypes: fix the typoHuang Shijie1-1/+1
We have three categories locks, not two. Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200813060220.18199-1-sjhuang@iluvatar.ai Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-08-13Merge tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds12-46/+59
Pull more s390 updates from Heiko Carstens: - Allow s390 debug feature to handle finally more than 256 CPU numbers, instead of truncating the most significant bits. - Improve THP splitting required by qemu processes by making use of walk_page_vma() instead of calling follow_page() for every single page within each vma. - Add missing ZCRYPT dependency to VFIO_AP to fix potential compile problems. - Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again. - Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma translates a node distance of 0 to "no NUMA support available". - Couple of other minor fixes and improvements. * tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/numa: move code to arch/s390/kernel s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again s390/debug: debug feature version 3 s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP s390/numa: set node distance to LOCAL_DISTANCE s390/pkey: remove redundant variable initialization s390/test_unwind: fix possible memleak in test_unwind() s390/gmap: improve THP splitting s390/atomic: circumvent gcc 10 build regression
2020-08-13Merge tag 'for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds9-25/+65
Pull more btrfs updates from David Sterba: "One minor update, the rest are fixes that have arrived a bit late for the first batch. There are also some recent fixes for bugs that were discovered during the merge window and pop up during testing. User visible change: - show correct subvolume path in /proc/mounts for bind mounts Fixes: - fix compression messages when remounting with different level or compression algorithm - tree-log: fix some memory leaks on error handling paths - restore I_VERSION on remount - fix return values and error code mixups - fix umount crash with quotas enabled when removing sysfs files - fix trim range on a shrunk device" * tag 'for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: trim: fix underflow in trim length to prevent access beyond device boundary btrfs: fix return value mixup in btrfs_get_extent btrfs: sysfs: fix NULL pointer dereference at btrfs_sysfs_del_qgroups() btrfs: check correct variable after allocation in btrfs_backref_iter_alloc btrfs: make sure SB_I_VERSION doesn't get unset by remount btrfs: fix memory leaks after failure to lookup checksums during inode logging btrfs: don't show full path of bind mounts in subvol= btrfs: fix messages after changing compression level by remount btrfs: only search for left_info if there is no right_info in try_merge_free_space btrfs: inode: fix NULL pointer dereference if inode doesn't need compression
2020-08-13Merge tag 'xfs-5.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds15-19/+21
Pull xfs fixes from Darrick Wong: "Two small fixes that have come in during the past week: - Fix duplicated words in comments - Fix an ubsan complaint about null pointer arithmetic" * tag 'xfs-5.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init xfs: delete duplicated words + other fixes
2020-08-13Merge tag 'exfat-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfatLinus Torvalds10-116/+121
Pull exfat updates from Namjae Jeon: - don't clear MediaFailure and VolumeDirty bit in volume flags if these were already set before mounting - write multiple dirty buffers at once in sync mode - remove unneeded EXFAT_SB_DIRTY bit set * tag 'exfat-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: retain 'VolumeFlags' properly exfat: optimize exfat_zeroed_cluster() exfat: add error check when updating dir-entries exfat: write multiple sectors at once exfat: remove EXFAT_SB_DIRTY flag
2020-08-13mm: memcontrol: fix warning when allocating the root cgroupJohannes Weiner1-6/+0
Commit 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup") adds memory tracking to the memcg kernel structures themselves to make cgroups liable for the memory they are consuming through the allocation of child groups (which can be significant). This code is a bit awkward as it's spread out through several functions: The outermost function does memalloc_use_memcg(parent) to set up current->active_memcg, which designates which cgroup to charge, and the inner functions pass GFP_ACCOUNT to request charging for specific allocations. To make sure this dependency is satisfied at all times - to make sure we don't randomly charge whoever is calling the functions - the inner functions warn on !current->active_memcg. However, this triggers a false warning when the root memcg itself is allocated. No parent exists in this case, and so current->active_memcg is rightfully NULL. It's a false positive, not indicative of a bug. Delete the warnings for now, we can revisit this later. Fixes: 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Roman Gushchin <guro@fb.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12Merge tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linuxLinus Torvalds15-218/+244
Pull RTC updates from Alexandre Belloni: "Not much this cycle - mostly non urgent driver fixes: - ds1374: use watchdog core - pcf2127: add alarm and pcf2129 support" * tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: rtc: pcf2127: fix alarm handling rtc: pcf2127: add alarm support rtc: pcf2127: add pca2129 device id rtc: max77686: Fix wake-ups for max77620 rtc: ds1307: provide an indication that the watchdog has fired rtc: ds1374: remove unused define rtc: ds1374: fix RTC_DRV_DS1374_WDT dependencies rtc: cleanup obsolete comment about struct rtc_class_ops rtc: pl031: fix set_alarm by adding back call to alarm_irq_enable rtc: ds1374: wdt: Use watchdog core for watchdog part rtc: Replace HTTP links with HTTPS ones rtc: goldfish: Enable interrupt in set_alarm() when necessary rtc: max77686: Do not allow interrupt to fire before system resume rtc: imxdi: fix trivial typos rtc: cpcap: fix range
2020-08-12Revert "ipv4: tunnel: fix compilation on ARCH=um"David S. Miller1-1/+0
This reverts commit 06a7a37be55e29961c9ba2abec4d07c8e0e21861. The bug was already fixed, this added a dup include. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpusEric Dumazet1-5/+7
We must accept an empty mask in store_rps_map(), or we are not able to disable RPS on a queue. Fixes: 07bbecb34106 ("net: Restrict receive packets queuing to housekeeping CPUs") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Cc: Alex Belits <abelits@marvell.com> Cc: Nitesh Narayan Lal <nitesh@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Maciej Żenczykowski <maze@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Nitesh Narayan Lal <nitesh@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12Merge branch 'net-stmmac-Fix-multicast-filter-on-IPQ806x'David S. Miller2-0/+4
Jonathan McDowell says: ==================== net: stmmac: Fix multicast filter on IPQ806x This pair of patches are the result of discovering a failure to correctly receive IPv6 multicast packets on such a device (in particular DHCPv6 requests and RA solicitations). Putting the device into promiscuous mode, or allmulti, both resulted in such packets correctly being received. Examination of the vendor driver (nss-gmac from the qsdk) shows that it does not enable the multicast filter and instead falls back to allmulti. Extend the base dwmac1000 driver to fall back when there's no suitable hardware filter, and update the ipq806x platform to request this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12net: ethernet: stmmac: Disable hardware multicast filterJonathan McDowell1-0/+1
The IPQ806x does not appear to have a functional multicast ethernet address filter. This was observed as a failure to correctly receive IPv6 packets on a LAN to the all stations address. Checking the vendor driver shows that it does not attempt to enable the multicast filter and instead falls back to receiving all multicast packets, internally setting ALLMULTI. Use the new fallback support in the dwmac1000 driver to correctly achieve the same with the mainline IPQ806x driver. Confirmed to fix IPv6 functionality on an RB3011 router. Cc: stable@vger.kernel.org Signed-off-by: Jonathan McDowell <noodles@earth.li> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12net: stmmac: dwmac1000: provide multicast filter fallbackJonathan McDowell1-0/+3
If we don't have a hardware multicast filter available then instead of silently failing to listen for the requested ethernet broadcast addresses fall back to receiving all multicast packets, in a similar fashion to other drivers with no multicast filter. Cc: stable@vger.kernel.org Signed-off-by: Jonathan McDowell <noodles@earth.li> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12i2c: iproc: fix race between client unreg and isrDhananjay Phadke1-1/+12
When i2c client unregisters, synchronize irq before setting iproc_i2c->slave to NULL. (1) disable_irq() (2) Mask event enable bits in control reg (3) Erase slave address (avoid further writes to rx fifo) (4) Flush tx and rx FIFOs (5) Clear pending event (interrupt) bits in status reg (6) enable_irq() (7) Set client pointer to NULL Unable to handle kernel NULL pointer dereference at virtual address 0000000000000318 [ 371.020421] pc : bcm_iproc_i2c_isr+0x530/0x11f0 [ 371.025098] lr : __handle_irq_event_percpu+0x6c/0x170 [ 371.030309] sp : ffff800010003e40 [ 371.033727] x29: ffff800010003e40 x28: 0000000000000060 [ 371.039206] x27: ffff800010ca9de0 x26: ffff800010f895df [ 371.044686] x25: ffff800010f18888 x24: ffff0008f7ff3600 [ 371.050165] x23: 0000000000000003 x22: 0000000001600000 [ 371.055645] x21: ffff800010f18888 x20: 0000000001600000 [ 371.061124] x19: ffff0008f726f080 x18: 0000000000000000 [ 371.066603] x17: 0000000000000000 x16: 0000000000000000 [ 371.072082] x15: 0000000000000000 x14: 0000000000000000 [ 371.077561] x13: 0000000000000000 x12: 0000000000000001 [ 371.083040] x11: 0000000000000000 x10: 0000000000000040 [ 371.088519] x9 : ffff800010f317c8 x8 : ffff800010f317c0 [ 371.093999] x7 : ffff0008f805b3b0 x6 : 0000000000000000 [ 371.099478] x5 : ffff0008f7ff36a4 x4 : ffff8008ee43d000 [ 371.104957] x3 : 0000000000000000 x2 : ffff8000107d64c0 [ 371.110436] x1 : 00000000c00000af x0 : 0000000000000000 [ 371.115916] Call trace: [ 371.118439] bcm_iproc_i2c_isr+0x530/0x11f0 [ 371.122754] __handle_irq_event_percpu+0x6c/0x170 [ 371.127606] handle_irq_event_percpu+0x34/0x88 [ 371.132189] handle_irq_event+0x40/0x120 [ 371.136234] handle_fasteoi_irq+0xcc/0x1a0 [ 371.140459] generic_handle_irq+0x24/0x38 [ 371.144594] __handle_domain_irq+0x60/0xb8 [ 371.148820] gic_handle_irq+0xc0/0x158 [ 371.152687] el1_irq+0xb8/0x140 [ 371.155927] arch_cpu_idle+0x10/0x18 [ 371.159615] do_idle+0x204/0x290 [ 371.162943] cpu_startup_entry+0x24/0x60 [ 371.166990] rest_init+0xb0/0xbc [ 371.170322] arch_call_rest_init+0xc/0x14 [ 371.174458] start_kernel+0x404/0x430 Fixes: c245d94ed106 ("i2c: iproc: Add multi byte read-write support for slave mode") Signed-off-by: Dhananjay Phadke <dphadke@linux.microsoft.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-08-12ipv4: tunnel: fix compilation on ARCH=umJohannes Berg1-0/+1
With certain configurations, a 64-bit ARCH=um errors out here with an unknown csum_ipv6_magic() function. Include the right header file to always have it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12vsock: fix potential null pointer dereference in vsock_poll()Stefano Garzarella1-1/+1
syzbot reported this issue where in the vsock_poll() we find the socket state at TCP_ESTABLISHED, but 'transport' is null: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 8227 Comm: syz-executor.2 Not tainted 5.8.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:vsock_poll+0x75a/0x8e0 net/vmw_vsock/af_vsock.c:1038 Call Trace: sock_poll+0x159/0x460 net/socket.c:1266 vfs_poll include/linux/poll.h:90 [inline] do_pollfd fs/select.c:869 [inline] do_poll fs/select.c:917 [inline] do_sys_poll+0x607/0xd40 fs/select.c:1011 __do_sys_poll fs/select.c:1069 [inline] __se_sys_poll fs/select.c:1057 [inline] __x64_sys_poll+0x18c/0x440 fs/select.c:1057 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This issue can happen if the TCP_ESTABLISHED state is set after we read the vsk->transport in the vsock_poll(). We could put barriers to synchronize, but this can only happen during connection setup, so we can simply check that 'transport' is valid. Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Reported-and-tested-by: syzbot+a61bac2fcc1a7c6623fe@syzkaller.appspotmail.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12sfc: fix ef100 design-param checkingEdward Cree1-1/+2
The handling of the RXQ/TXQ size granularity design-params had two problems: it had a 64-bit divide that didn't build on 32-bit platforms, and it could divide by zero if the NIC supplied 0 as the value of the design-param. Fix both by checking for 0 and for a granularity bigger than our min-size; if the granularity <= EFX_MIN_DMAQ_SIZE then it fits in 32 bits, so we can cast it to u32 for the divide. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Edward Cree <ecree@solarflare.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-12Merge tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds25-136/+511
Pull ceph updates from Ilya Dryomov: "Xiubo has completed his work on filesystem client metrics, they are sent to all available MDSes once per second now. Other than that, we have a lot of fixes and cleanups all around the filesystem, including a tweak to cut down on MDS request resends in multi-MDS setups from Yanhu and fixups for SELinux symlink labeling and MClientSession message decoding from Jeff" * tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client: (22 commits) ceph: handle zero-length feature mask in session messages ceph: use frag's MDS in either mode ceph: move sb->wb_pagevec_pool to be a global mempool ceph: set sec_context xattr on symlink creation ceph: remove redundant initialization of variable mds ceph: fix use-after-free for fsc->mdsc ceph: remove unused variables in ceph_mdsmap_decode() ceph: delete repeated words in fs/ceph/ ceph: send client provided metric flags in client metadata ceph: periodically send perf metrics to MDSes ceph: check the sesion state and return false in case it is closed libceph: replace HTTP links with HTTPS ones ceph: remove unnecessary cast in kfree() libceph: just have osd_req_op_init() return a pointer ceph: do not access the kiocb after aio requests ceph: clean up and optimize ceph_check_delayed_caps() ceph: fix potential mdsc use-after-free crash ceph: switch to WARN_ON_ONCE in encode_supported_features() ceph: add global total_caps to count the mdsc's total caps number ceph: add check_session_state() helper and make it global ...
2020-08-12Merge branch 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds5-8/+70
Pull more parisc updates from Helge Deller: - Oscar Carter contributed a patch which fixes parisc's usage of dereference_function_descriptor() and thus will allow using the -Wcast-function-type compiler option in the top-level Makefile - Sven Schnelle fixed a bug in the SBA code to prevent crashes during kexec - John David Anglin provided implementations for __smp_store_release() and __smp_load_acquire barriers() which avoids using the sync assembler instruction and thus speeds up barrier paths - Some whitespace cleanups in parisc's atomic.h header file * 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Implement __smp_store_release and __smp_load_acquire barriers parisc: mask out enable and reserved bits from sba imask parisc: Whitespace cleanups in atomic.h parisc/kernel/ftrace: Remove function callback casts sections.h: dereference_function_descriptor() returns void pointer
2020-08-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds83-2636/+3479
Pull more KVM updates from Paolo Bonzini: "PPC: - Improvements and bugfixes for secure VM support, giving reduced startup time and memory hotplug support. - Locking fixes in nested KVM code - Increase number of guests supported by HV KVM to 4094 - Preliminary POWER10 support ARM: - Split the VHE and nVHE hypervisor code bases, build the EL2 code separately, allowing for the VHE code to now be built with instrumentation - Level-based TLB invalidation support - Restructure of the vcpu register storage to accomodate the NV code - Pointer Authentication available for guests on nVHE hosts - Simplification of the system register table parsing - MMU cleanups and fixes - A number of post-32bit cleanups and other fixes MIPS: - compilation fixes x86: - bugfixes - support for the SERIALIZE instruction" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits) KVM: MIPS/VZ: Fix build error caused by 'kvm_run' cleanup x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled MIPS: KVM: Convert a fallthrough comment to fallthrough MIPS: VZ: Only include loongson_regs.h for CPU_LOONGSON64 x86: Expose SERIALIZE for supported cpuid KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort() KVM: arm64: Don't skip cache maintenance for read-only memslots KVM: arm64: Handle data and instruction external aborts the same way KVM: arm64: Rename kvm_vcpu_dabt_isextabt() KVM: arm: Add trace name for ARM_NISV KVM: arm64: Ensure that all nVHE hyp code is in .hyp.text KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS KVM: arm64: Make nVHE ASLR conditional on RANDOMIZE_BASE KVM: PPC: Book3S HV: Rework secure mem slot dropping KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up KVM: PPC: Book3S HV: Migrate hot plugged memory KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs KVM: PPC: Book3S HV: Track the state GFNs associated with secure VMs KVM: PPC: Book3S HV: Disable page merging in H_SVM_INIT_START ...
2020-08-12Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds82-871/+4778
Pull more clk updates from Stephen Boyd: "Here's some more updates that missed the last pull request because I happened to tag the tree at an earlier point in the history of clk-next. I must have fat fingered it and checked out an older version of clk-next on this second computer I'm using. This time it actually includes more code for Qualcomm SoCs, the AT91 major updates, and some Rockchip SoC clk driver updates as well. I've corrected this flow so this shouldn't happen again" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (83 commits) clk: bcm2835: Do not use prediv with bcm2711's PLLs clk: drop unused function __clk_get_flags clk: hsdk: Fix bad dependency on IOMEM dt-bindings: clock: Fix YAML schemas for LPASS clocks on SC7180 clk: mmp: avoid missing prototype warning clk: sparx5: Add Sparx5 SoC DPLL clock driver dt-bindings: clock: sparx5: Add bindings include file clk: qoriq: add LS1021A core pll mux options clk: clk-atlas6: fix return value check in atlas6_clk_init() clk: tegra: pll: Improve PLLM enable-state detection clk: X1000: Add support for calculat REFCLK of USB PHY. clk: JZ4780: Reformat the code to align it. clk: JZ4780: Add functions for enable and disable USB PHY. clk: Ingenic: Add RTC related clocks for Ingenic SoCs. dt-bindings: clock: Add tabs to align code. dt-bindings: clock: Add RTC related clocks for Ingenic SoCs. clk: davinci: Use fallthrough pseudo-keyword clk: imx: Use fallthrough pseudo-keyword clk: qcom: gcc-sdm660: Fix up gcc_mss_mnoc_bimc_axi_clk clk: qcom: gcc-sdm660: Add missing modem reset ...
2020-08-12Merge tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdogLinus Torvalds62-210/+1032
Pull watchdog updates from Wim Van Sebroeck: - f71808e_wdt imporvements - dw_wdt improvements - mlx-wdt: support new watchdog type with longer timeout period - fallthrough pseudo-keyword replacements - overall small fixes and improvements * tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog: (35 commits) watchdog: rti-wdt: balance pm runtime enable calls watchdog: rti-wdt: attach to running watchdog during probe watchdog: add support for adjusting last known HW keepalive time watchdog: use __watchdog_ping in startup watchdog: softdog: Add options 'soft_reboot_cmd' and 'soft_active_on_boot' watchdog: pcwd_usb: remove needless check before usb_free_coherent() watchdog: Replace HTTP links with HTTPS ones dt-bindings: watchdog: renesas,wdt: Document r8a774e1 support watchdog: initialize device before misc_register watchdog: booke_wdt: Add common nowayout parameter driver watchdog: scx200_wdt: Use fallthrough pseudo-keyword watchdog: Use fallthrough pseudo-keyword watchdog: f71808e_wdt: do stricter parameter validation watchdog: f71808e_wdt: clear watchdog timeout occurred flag watchdog: f71808e_wdt: remove use of wrong watchdog_info option watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options docs: watchdog: codify ident.options as superset of possible status flags dt-bindings: watchdog: Add compatible for QCS404, SC7180, SDM845, SM8150 dt-bindings: watchdog: Convert QCOM watchdog timer bindings to YAML watchdog: dw_wdt: Add DebugFS files ...
2020-08-12Merge tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds5-193/+282
Pull VFIO updates from Alex Williamson: - Inclusive naming updates (Alex Williamson) - Intel X550 INTx quirk (Alex Williamson) - Error path resched between unmaps (Xiang Zheng) - SPAPR IOMMU pin_user_pages() conversion (John Hubbard) - Trivial mutex simplification (Alex Williamson) - QAT device denylist (Giovanni Cabiddu) - type1 IOMMU ioctl refactor (Liu Yi L) * tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio: vfio/type1: Refactor vfio_iommu_type1_ioctl() vfio/pci: Add QAT devices to denylist vfio/pci: Add device denylist PCI: Add Intel QuickAssist device IDs vfio/pci: Hold igate across releasing eventfd contexts vfio/spapr_tce: convert get_user_pages() --> pin_user_pages() vfio/type1: Add conditional rescheduling after iommu map failed vfio/pci: Add Intel X550 to hidden INTx devices vfio: Cleanup allowed driver naming
2020-08-12Merge tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drmLinus Torvalds84-382/+903
Pull drm fixes from Dave Airlie: "This has a few vmwgfx regression fixes we hit from the merge window (one in TTM), it also has a bunch of amdgpu fixes along with a scattering everywhere else. core: - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi - Remove null check for kfree in drm_dev_release. - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition. - re-added docs for drm_gem_flink_ioctl() - add orientation quirk for ASUS T103HAF ttm: - ttm: fix page-offset calculation within TTM - revert patch causing vmwgfx regressions fbcon: - Fix a fbcon OOB read in fbdev, found by syzbot. vga: - Mark vga_tryget static as it's not used elsewhere. amdgpu: - Re-add spelling typo fix - Sienna Cichlid fixes - Navy Flounder fixes - DC fixes - SMU i2c fix - Power fixes vmwgfx: - regression fixes for modesetting crashes - misc fixes xlnx: - Small fixes to xlnx. omap: - Fix mode initialization in omap_connector_mode_valid(). - force runtime PM suspend on system suspend tidss: - fix modeset init for DPI panels" * tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm: (70 commits) drm/ttm: revert "drm/ttm: make TT creation purely optional v3" drm/vmwgfx: fix spelling mistake "Cant" -> "Can't" drm/vmwgfx: fix spelling mistake "Cound" -> "Could" drm/vmwgfx/ldu: Use drm_mode_config_reset drm/vmwgfx/sou: Use drm_mode_config_reset drm/vmwgfx/stdu: Use drm_mode_config_reset drm/vmwgfx: Fix two list_for_each loop exit tests drm/vmwgfx: Use correct vmw_legacy_display_unit pointer drm/vmwgfx: Use struct_size() helper drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3) drm/amd/powerplay: update swSMU VCN/JPEG PG logics drm/amdgpu: use mode1 reset by default for sienna_cichlid drm/amdgpu/smu: rework i2c adpater registration drm/amd/display: Display goes blank after inst drm/amd/display: Change null plane state swizzle mode to 4kb_s drm/amd/display: Use helper function to check for HDMI signal drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink drm/amd/display: Fix logger context drm/amd/display: populate new dml variable ...
2020-08-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds264-1437/+2357
Merge more updates from Andrew Morton: - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction, mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util, memory-hotplug, cleanups, uaccess, migration, gup, pagemap), - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops, checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump, exec, kdump, rapidio, panic, kcov, kgdb, ipc). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits) mm/gup: remove task_struct pointer for all gup code mm: clean up the last pieces of page fault accountings mm/xtensa: use general page fault accounting mm/x86: use general page fault accounting mm/sparc64: use general page fault accounting mm/sparc32: use general page fault accounting mm/sh: use general page fault accounting mm/s390: use general page fault accounting mm/riscv: use general page fault accounting mm/powerpc: use general page fault accounting mm/parisc: use general page fault accounting mm/openrisc: use general page fault accounting mm/nios2: use general page fault accounting mm/nds32: use general page fault accounting mm/mips: use general page fault accounting mm/microblaze: use general page fault accounting mm/m68k: use general page fault accounting mm/ia64: use general page fault accounting mm/hexagon: use general page fault accounting mm/csky: use general page fault accounting ...
2020-08-12mm/gup: remove task_struct pointer for all gup codePeter Xu18-87/+69
After the cleanup of page fault accounting, gup does not need to pass task_struct around any more. Remove that parameter in the whole gup stack. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm: clean up the last pieces of page fault accountingsPeter Xu4-29/+10
Here're the last pieces of page fault accounting that were still done outside handle_mm_fault() where we still have regs==NULL when calling handle_mm_fault(): arch/powerpc/mm/copro_fault.c: copro_handle_mm_fault arch/sparc/mm/fault_32.c: force_user_fault arch/um/kernel/trap.c: handle_page_fault mm/gup.c: faultin_page fixup_user_fault mm/hmm.c: hmm_vma_fault mm/ksm.c: break_ksm Some of them has the issue of duplicated accounting for page fault retries. Some of them didn't do the accounting at all. This patch cleans all these up by letting handle_mm_fault() to do per-task page fault accounting even if regs==NULL (though we'll still skip the perf event accountings). With that, we can safely remove all the outliers now. There's another functional change in that now we account the page faults to the caller of gup, rather than the task_struct that passed into the gup code. More information of this can be found at [1]. After this patch, below things should never be touched again outside handle_mm_fault(): - task_struct.[maj|min]_flt - PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] [1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/ Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/xtensa: use general page fault accountingPeter Xu1-11/+4
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf events because it's now also done in handle_mm_fault(). Move the PERF_COUNT_SW_PAGE_FAULTS event higher before taking mmap_sem for the fault, then it'll match with the rest of the archs. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Cc: Chris Zankel <chris@zankel.net> Link: http://lkml.kernel.org/r/20200707225021.200906-24-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/x86: use general page fault accountingPeter Xu1-15/+2
Use the general page fault accounting by passing regs into handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/20200707225021.200906-23-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/sparc64: use general page fault accountingPeter Xu1-10/+1
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David S. Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/20200707225021.200906-22-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/sparc32: use general page fault accountingPeter Xu1-10/+1
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David S. Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/20200707225021.200906-21-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/sh: use general page fault accountingPeter Xu1-10/+1
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Link: http://lkml.kernel.org/r/20200707225021.200906-20-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/s390: use general page fault accountingPeter Xu1-15/+1
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Link: http://lkml.kernel.org/r/20200707225021.200906-19-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/riscv: use general page fault accountingPeter Xu1-15/+1
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Link: http://lkml.kernel.org/r/20200707225021.200906-18-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/powerpc: use general page fault accountingPeter Xu1-8/+3
Use the general page fault accounting by passing regs into handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Link: http://lkml.kernel.org/r/20200707225021.200906-17-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/parisc: use general page fault accountingPeter Xu1-5/+3
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Link: http://lkml.kernel.org/r/20200707225021.200906-16-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/openrisc: use general page fault accountingPeter Xu1-5/+4
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Stafford Horne <shorne@gmail.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Link: http://lkml.kernel.org/r/20200707225021.200906-15-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/nios2: use general page fault accountingPeter Xu1-10/+4
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Link: http://lkml.kernel.org/r/20200707225021.200906-14-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/nds32: use general page fault accountingPeter Xu1-16/+3
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries, by moving it before taking mmap_sem. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greentime Hu <green.hu@gmail.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Link: http://lkml.kernel.org/r/20200707225021.200906-13-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/mips: use general page fault accountingPeter Xu1-11/+3
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries, by moving it before taking mmap_sem. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Link: http://lkml.kernel.org/r/20200707225021.200906-12-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/microblaze: use general page fault accountingPeter Xu1-5/+4
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Simek <monstr@monstr.eu> Link: http://lkml.kernel.org/r/20200707225021.200906-11-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/m68k: use general page fault accountingPeter Xu1-10/+4
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200707225021.200906-10-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12mm/ia64: use general page fault accountingPeter Xu1-5/+4
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: "Luck, Tony" <tony.luck@intel.com> Link: http://lkml.kernel.org/r/20200707225021.200906-9-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>