Age | Commit message (Collapse) | Author | Files | Lines |
|
We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and add latency. Commit 5640f7685831e0
introduces the order-3 allocation. According to the changelog, the order-3
allocation isn't a must-have but to improve performance. But direct memory
compaction has high overhead. The benefit of order-3 allocation can't
compensate the overhead of direct memory compaction.
This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the alloction will still success, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered, skb_page_frag_refill will
fallback to order-0 immediately, hence the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is waken up and doing
compaction, so chances are allocation could success next time.
alloc_skb_with_frags is the same.
The mellanox driver does similar thing, if this is accepted, we must fix
the driver too.
V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer
Cc: Eric Dumazet <edumazet@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a device is renamed and the original name is subsequently reused
for a new device, the following warning is generated:
sysctl duplicate entry: /net/mpls/conf/veth0//input
CPU: 3 PID: 1379 Comm: ip Not tainted 4.1.0-rc4+ #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
0000000000000000 0000000000000000 ffffffff81566aaf 0000000000000000
ffffffff81236279 ffff88002f7d7f00 0000000000000000 ffff88000db336d8
ffff88000db33698 0000000000000005 ffff88002e046000 ffff8800168c9280
Call Trace:
[<ffffffff81566aaf>] ? dump_stack+0x40/0x50
[<ffffffff81236279>] ? __register_sysctl_table+0x289/0x5a0
[<ffffffffa051a24f>] ? mpls_dev_notify+0x1ff/0x300 [mpls_router]
[<ffffffff8108db7f>] ? notifier_call_chain+0x4f/0x70
[<ffffffff81470e72>] ? register_netdevice+0x2b2/0x480
[<ffffffffa0524748>] ? veth_newlink+0x178/0x2d3 [veth]
[<ffffffff8147f84c>] ? rtnl_newlink+0x73c/0x8e0
[<ffffffff8147f27a>] ? rtnl_newlink+0x16a/0x8e0
[<ffffffff81459ff2>] ? __kmalloc_reserve.isra.30+0x32/0x90
[<ffffffff8147ccfd>] ? rtnetlink_rcv_msg+0x8d/0x250
[<ffffffff8145b027>] ? __alloc_skb+0x47/0x1f0
[<ffffffff8149badb>] ? __netlink_lookup+0xab/0xe0
[<ffffffff8147cc70>] ? rtnetlink_rcv+0x30/0x30
[<ffffffff8149e7a0>] ? netlink_rcv_skb+0xb0/0xd0
[<ffffffff8147cc64>] ? rtnetlink_rcv+0x24/0x30
[<ffffffff8149df17>] ? netlink_unicast+0x107/0x1a0
[<ffffffff8149e4be>] ? netlink_sendmsg+0x50e/0x630
[<ffffffff8145209c>] ? sock_sendmsg+0x3c/0x50
[<ffffffff81452beb>] ? ___sys_sendmsg+0x27b/0x290
[<ffffffff811bd258>] ? mem_cgroup_try_charge+0x88/0x110
[<ffffffff811bd5b6>] ? mem_cgroup_commit_charge+0x56/0xa0
[<ffffffff811d7700>] ? do_filp_open+0x30/0xa0
[<ffffffff8145336e>] ? __sys_sendmsg+0x3e/0x80
[<ffffffff8156c3f2>] ? system_call_fastpath+0x16/0x75
Fix this by unregistering the previous sysctl table (registered for
the path containing the original device name) and re-registering the
table for the path containing the new device name.
Fixes: 37bde79979c3 ("mpls: Per-device enabling of packet input")
Reported-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When programming the start of a periodic output, the code wrongly places
the seconds value into the "low" register and the nanoseconds into the
"high" register. Even though this is backwards, it slipped through my
testing, because the re-arming code in the interrupt service routine is
correct, and the signal does appear starting with the second edge.
This patch fixes the issue by programming the registers correctly.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When incoming packet qualifies for rx_copybreak, we copy the data to newly
allocated skb. We do not free/unmap the original buffer. At this point driver
assumes this buffer is unallocated. When enic_rq_alloc_buf() is called for
buffer allocation, it checks if buf->os_buf is NULL. If its not NULL that means
buffer can be re-used.
When vnic_rq_clean() is called for freeing all rq buffers, and if the
rx_copybreak reused buffer falls outside the used desc, we do not free the
buffer. The following trace is observer when dma-debug is enabled.
Fix is to walk through complete ring and clean if buffer is present.
[ 40.555386] ------------[ cut here ]------------
[ 40.555396] WARNING: CPU: 0 PID: 491 at lib/dma-debug.c:971 dma_debug_device_change+0x188/0x1f0()
[ 40.555400] pci 0000:06:00.0: DMA-API: device driver has pending DMA allocations while released from device [count=4]
One of leaked entries details: [device address=0x00000000ff4cc040] [size=9018 bytes] [mapped with DMA_FROM_DEVICE] [mapped as single]
[ 40.555402] Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 dns_resolver coretemp intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw joydev mousedev gf128mul hid_generic glue_helper mgag200 usbhid ttm hid drm_kms_helper drm ablk_helper syscopyarea sysfillrect sysimgblt i2c_algo_bit i2c_core iTCO_wdt cryptd mac_hid evdev pcspkr sb_edac edac_core tpm_tis iTCO_vendor_support ipmi_si wmi tpm ipmi_msghandler shpchp lpc_ich processor acpi_power_meter hwmon button ac sch_fq_codel nfs lockd grace sunrpc fscache sd_mod ehci_pci ehci_hcd megaraid_sas usbcore scsi_mod usb_common enic(-) crc32c_generic crc32c_intel btrfs xor raid6_pq ext4 crc16 mbcache jbd2
[ 40.555467] CPU: 0 PID: 491 Comm: rmmod Not tainted 4.1.0-rc7-ARCH-01305-gf59b71f #118
[ 40.555469] Hardware name: Cisco Systems Inc UCSB-B200-M4/UCSB-B200-M4, BIOS B200M4.2.2.2.23.061220140128 06/12/2014
[ 40.555471] 0000000000000000 00000000e2f8a5b7 ffff880275f8bc48 ffffffff8158d6f0
[ 40.555474] 0000000000000000 ffff880275f8bca0 ffff880275f8bc88 ffffffff8107b04a
[ 40.555477] ffff8802734e0000 0000000000000004 ffff8804763fb3c0 ffff88027600b650
[ 40.555480] Call Trace:
[ 40.555488] [<ffffffff8158d6f0>] dump_stack+0x4f/0x7b
[ 40.555492] [<ffffffff8107b04a>] warn_slowpath_common+0x8a/0xc0
[ 40.555494] [<ffffffff8107b0d5>] warn_slowpath_fmt+0x55/0x70
[ 40.555498] [<ffffffff812fa408>] dma_debug_device_change+0x188/0x1f0
[ 40.555503] [<ffffffff8109aaef>] notifier_call_chain+0x4f/0x80
[ 40.555506] [<ffffffff8109aecb>] __blocking_notifier_call_chain+0x4b/0x70
[ 40.555510] [<ffffffff8109af06>] blocking_notifier_call_chain+0x16/0x20
[ 40.555514] [<ffffffff813f8066>] __device_release_driver+0xf6/0x120
[ 40.555518] [<ffffffff813f8b08>] driver_detach+0xc8/0xd0
[ 40.555523] [<ffffffff813f7c59>] bus_remove_driver+0x59/0xe0
[ 40.555527] [<ffffffff813f93a0>] driver_unregister+0x30/0x70
[ 40.555534] [<ffffffff8131532d>] pci_unregister_driver+0x2d/0xa0
[ 40.555542] [<ffffffffa0200ec2>] enic_cleanup_module+0x10/0x14e [enic]
[ 40.555547] [<ffffffff8110158f>] SyS_delete_module+0x1cf/0x280
[ 40.555551] [<ffffffff811e284e>] ? ____fput+0xe/0x10
[ 40.555554] [<ffffffff810980ec>] ? task_work_run+0xbc/0xf0
[ 40.555558] [<ffffffff815930ee>] system_call_fastpath+0x12/0x71
[ 40.555561] ---[ end trace 4988cadc77c2b236 ]---
[ 40.555562] Mapped at:
[ 40.555563] [<ffffffff812fa865>] debug_dma_map_page+0x95/0x150
[ 40.555566] [<ffffffffa01f4a88>] enic_rq_alloc_buf+0x1b8/0x360 [enic]
[ 40.555570] [<ffffffffa01f7658>] enic_open+0xf8/0x820 [enic]
[ 40.555574] [<ffffffff8148d50e>] __dev_open+0xce/0x150
[ 40.555579] [<ffffffff8148d851>] __dev_change_flags+0xa1/0x170
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We do not check the return value of enic_dev_stats_dump(). If allocation
fails, we will hit NULL pointer reference.
Return only if memory allocation fails. For other failures, we return the
previously recorded values.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a small window between vnic_intr_unmask() and enic_poll_unlock_napi().
In this window if an irq occurs and napi is scheduled on different cpu, it tries
to acquire enic_poll_lock_napi() and hits the following WARN_ON message.
Fix is to unlock napi_poll before unmasking the interrupt.
[ 781.121746] ------------[ cut here ]------------
[ 781.121789] WARNING: CPU: 1 PID: 0 at drivers/net/ethernet/cisco/enic/vnic_rq.h:228 enic_poll_msix_rq+0x36a/0x3c0 [enic]()
[ 781.121834] Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 dns_resolver coretemp intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel mgag200 ttm drm_kms_helper joydev aes_x86_64 lrw drm gf128mul mousedev glue_helper sb_edac ablk_helper iTCO_wdt iTCO_vendor_support evdev ipmi_si syscopyarea sysfillrect sysimgblt i2c_algo_bit i2c_core edac_core lpc_ich mac_hid cryptd pcspkr ipmi_msghandler shpchp tpm_tis acpi_power_meter tpm wmi processor hwmon button ac sch_fq_codel nfs lockd grace sunrpc fscache hid_generic usbhid hid ehci_pci ehci_hcd sd_mod megaraid_sas usbcore scsi_mod usb_common enic crc32c_generic crc32c_intel btrfs xor raid6_pq ext4 crc16 mbcache jbd2
[ 781.122176] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.0-rc6-ARCH-00040-gc46a024-dirty #106
[ 781.122210] Hardware name: Cisco Systems Inc UCSB-B200-M4/UCSB-B200-M4, BIOS B200M4.2.2.2.23.061220140128 06/12/2014
[ 781.122252] 0000000000000000 bddbbc9d655ec96e ffff880277e43da8 ffffffff81583fe8
[ 781.122286] 0000000000000000 0000000000000000 ffff880277e43de8 ffffffff8107acfa
[ 781.122319] ffff880272c01000 ffff880273f18000 ffff880273f1a100 0000000000000000
[ 781.122352] Call Trace:
[ 781.122364] <IRQ> [<ffffffff81583fe8>] dump_stack+0x4f/0x7b
[ 781.122399] [<ffffffff8107acfa>] warn_slowpath_common+0x8a/0xc0
[ 781.122425] [<ffffffff8107ae2a>] warn_slowpath_null+0x1a/0x20
[ 781.122455] [<ffffffffa01fa9ca>] enic_poll_msix_rq+0x36a/0x3c0 [enic]
[ 781.122487] [<ffffffff8148525a>] net_rx_action+0x22a/0x370
[ 781.122512] [<ffffffff8107ed3d>] __do_softirq+0xed/0x2d0
[ 781.122537] [<ffffffff8107f06e>] irq_exit+0x7e/0xa0
[ 781.122560] [<ffffffff8158c424>] do_IRQ+0x64/0x100
[ 781.122582] [<ffffffff8158a42e>] common_interrupt+0x6e/0x6e
[ 781.122605] <EOI> [<ffffffff810bd331>] ? cpu_startup_entry+0x121/0x480
[ 781.122638] [<ffffffff810bd2fc>] ? cpu_startup_entry+0xec/0x480
[ 781.122667] [<ffffffff810f2ed3>] ? clockevents_register_device+0x113/0x1f0
[ 781.122698] [<ffffffff81050ab6>] start_secondary+0x196/0x1e0
[ 781.122723] ---[ end trace cec2e9dd3af7b9db ]---
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jeff Layton reported the following;
[ 74.232485] ------------[ cut here ]------------
[ 74.233354] WARNING: CPU: 2 PID: 754 at net/core/sock.c:364 sk_clear_memalloc+0x51/0x80()
[ 74.234790] Modules linked in: cts rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device nfsd snd_pcm snd_timer snd e1000 ppdev parport_pc joydev parport pvpanic soundcore floppy serio_raw i2c_piix4 pcspkr nfs_acl lockd virtio_balloon acpi_cpufreq auth_rpcgss grace sunrpc qxl drm_kms_helper ttm drm virtio_console virtio_blk virtio_pci ata_generic virtio_ring pata_acpi virtio
[ 74.243599] CPU: 2 PID: 754 Comm: swapoff Not tainted 4.1.0-rc6+ #5
[ 74.244635] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 74.245546] 0000000000000000 0000000079e69e31 ffff8800d066bde8 ffffffff8179263d
[ 74.246786] 0000000000000000 0000000000000000 ffff8800d066be28 ffffffff8109e6fa
[ 74.248175] 0000000000000000 ffff880118d48000 ffff8800d58f5c08 ffff880036e380a8
[ 74.249483] Call Trace:
[ 74.249872] [<ffffffff8179263d>] dump_stack+0x45/0x57
[ 74.250703] [<ffffffff8109e6fa>] warn_slowpath_common+0x8a/0xc0
[ 74.251655] [<ffffffff8109e82a>] warn_slowpath_null+0x1a/0x20
[ 74.252585] [<ffffffff81661241>] sk_clear_memalloc+0x51/0x80
[ 74.253519] [<ffffffffa0116c72>] xs_disable_swap+0x42/0x80 [sunrpc]
[ 74.254537] [<ffffffffa01109de>] rpc_clnt_swap_deactivate+0x7e/0xc0 [sunrpc]
[ 74.255610] [<ffffffffa03e4fd7>] nfs_swap_deactivate+0x27/0x30 [nfs]
[ 74.256582] [<ffffffff811e99d4>] destroy_swap_extents+0x74/0x80
[ 74.257496] [<ffffffff811ecb52>] SyS_swapoff+0x222/0x5c0
[ 74.258318] [<ffffffff81023f27>] ? syscall_trace_leave+0xc7/0x140
[ 74.259253] [<ffffffff81798dae>] system_call_fastpath+0x12/0x71
[ 74.260158] ---[ end trace 2530722966429f10 ]---
The warning in question was unnecessary but with Jeff's series the rules
are also clearer. This patch removes the warning and updates the comment
to explain why sk_mem_reclaim() may still be called.
[jlayton: remove if (sk->sk_forward_alloc) conditional. As Leon
points out that it's not needed.]
Cc: Leon Romanovsky <leon@leon.nu>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since the addition of sysfs multicast router support if one set
multicast_router to "2" more than once, then the port would be added to
the hlist every time and could end up linking to itself and thus causing an
endless loop for rlist walkers.
So to reproduce just do:
echo 2 > multicast_router; echo 2 > multicast_router;
in a bridge port and let some igmp traffic flow, for me it hangs up
in br_multicast_flood().
Fix this by adding a check in br_multicast_add_router() if the port is
already linked.
The reason this didn't happen before the addition of multicast_router
sysfs entries is because there's a !hlist_unhashed check that prevents
it.
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the TIPC connection timer expires in a probing state, a
self abort message is supposed to be generated and delivered
to the local socket. This is currently broken, and the abort
message is actually sent out to the peer node with invalid
addressing information. This will cause the link to enter
a constant retransmission state and eventually reset.
We fix this by removing the self-abort message creation and
tear down connection immediately instead.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 0243508edd317ff1fa63b495643a7c192fbfcd92.
It introduces new regressions.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Until recently, mac80211 overwrote all the statistics it could
provide when getting called, but it now relies on the struct
having been zeroed by the caller. This was always the case in
nl80211, but wext used a static struct which could even cause
values from one device leak to another.
Using a static struct is OK (as even documented in a comment)
since the whole usage of this function and its return value is
always locked under RTNL. Not clearing the struct for calling
the driver has always been wrong though, since drivers were
free to only fill values they could report, so calling this
for one device and then for another would always have leaked
values from one to the other.
Fix this by initializing the structure in question before the
driver method call.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=99691
Cc: stable@vger.kernel.org
Reported-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Reported-by: Alexander Kaltsas <alexkaltsas@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 70008aa50e92 ("skbuff: convert to skb_orphan_frags") replaced
open coded tests of SKBTX_DEV_ZEROCOPY and skb_copy_ubufs with calls
to helper function skb_orphan_frags. Apply that to the last remaining
open coded site.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The RGMII block is currently only powered on when using RGMII or
RGMII_NO_ID, which is not correct when using the GENET interface in MII
or Reverse MII modes. We always need to power on the RGMII interface for
this block to properly work, regardless of the MII mode in which we
operate.
Fixes: aa09677cba423 ("net: bcmgenet: add MDIO routines")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
UDP encapsulation is broken on IPv6. This is because the logic to resubmit
the nexthdr is inverted, checking for a ret value > 0 instead of < 0. Also,
the resubmit label is in the wrong position since we already get the
nexthdr value when performing decapsulation. In addition the skb pull is no
longer necessary either.
This changes the return value check to look for < 0, using it for the
nexthdr on the next iteration, and moves the resubmit label to the proper
location.
With these changes the v6 code now matches what we do in the v4 ip input
code wrt resubmitting when decapsulating.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Acked-by: "Tom Herbert" <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The memory pointed to by idev->stats.icmpv6msgdev,
idev->stats.icmpv6dev and idev->stats.ipv6 can each be used in an RCU
read context without taking a reference on idev. For example, through
IP6_*_STATS_* calls in ip6_rcv. These memory blocks are freed without
waiting for an RCU grace period to elapse. This could lead to the
memory being written to after it has been freed.
Fix this by using call_rcu to free the memory used for stats, as well
as idev after an RCU grace period has elapsed.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
MODULE_DEVICE_TABLE is referring to wrong driver's table and breaks the
build. Fix that.
Cc: stable@vger.kernel.org
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
|
|
When the driver gets unregistered a call to netif_napi_del() was
missing, this all was also missing in the error paths of
b44_init_one().
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to disable softirqs because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. The spin locks in br_fdb_update were _bh before commit f8ae737deea1
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit:
292d1398983f ("bridge: add NTF_USE support")
Using local_bh_disable/enable around br_fdb_update() allows us to keep
using the spin_lock/unlock in br_fdb_update for the fast-path.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 292d1398983f ("bridge: add NTF_USE support")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 1d7c49037b12016e7056b9f2c990380e2187e766.
Nikolay Aleksandrov has a better version of this fix.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The mpls device is used in an RCU read context without a lock being
held. As the memory is freed without waiting for the RCU grace period
to elapse, the freed memory could still be in use.
Address this by using kfree_rcu to free the memory for the mpls device
after the RCU grace period has elapsed.
Fixes: 03c57747a702 ("mpls: Per-device MPLS state")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are several places in the driver (all in control paths) where
coherent dma memory is being allocated using either dma_alloc_coherent()
or the deprecated pci_alloc_consistent(). All these calls should be
changed to use dma_zalloc_coherent() to avoid uninitialized fields in
data structures backed by this memory.
Reported-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to use spin_lock_bh because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. These locks were _bh before commit f8ae737deea1
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit:
292d1398983f ("bridge: add NTF_USE support")
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 292d1398983f ("bridge: add NTF_USE support")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since the Tx timer function runs in softirq context the driver needs
to call disable_irq_nosync instead of a disable_irq.
Reported-by: Josh Stone <jistone@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rhashtable uses EXPORT_SYMBOL_GPL() without importing linux/export.h
directly it is only imported indirectly through some other includes.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix possible unintended sign extension in unsigned MMIO loads by casting
to uint16_t in the case of mmio_needed != 2.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/9985/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Fix stack pointer offset which could potentially corrupt
argument registers in the previous frame. The calculated offset
reflects the size of all the registers we need to preserve so there
is no need for this erroneous subtraction.
[ralf@linux-mips.org: Fixed conflict due to only applying this fix part
of the entire series as part of 4.1 fixes.]
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/10527/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
setup_per_cpu_areas() only setup __per_cpu_offset[] for each possible
cpu, but loongson_sysconf.nr_cpus can be greater than possible cpus
(due to reserved_cpus_mask). So in loongson3_ipi_interrupt(), percpu
access will touch the original varible in .data..percpu section which
has been freed. Without this patch, cpu-hotplug will cause memery
corruption.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: http://patchwork.linux-mips.org/patch/10524/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Commit 334c86c494b9 ("MIPS: IRQ: Add stackoverflow detection") added
kernel stack overflow detection, however it only enabled it conditional
upon the preprocessor definition DEBUG_STACKOVERFLOW, which is never
actually defined. The Kconfig option is called DEBUG_STACKOVERFLOW,
which manifests to the preprocessor as CONFIG_DEBUG_STACKOVERFLOW, so
switch it to using that definition instead.
Fixes: 334c86c494b9 ("MIPS: IRQ: Add stackoverflow detection")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Adam Jiang <jiang.adam@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 2.6.37+
Patchwork: http://patchwork.linux-mips.org/patch/10531/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Fixes a typo in arch/mips/mm/c-r4k.c's probe_scache().
Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
I've had reports of UEFI platforms failing iSCSI boot in various
configurations, that ended up being caused by network initialization
scripts getting tripped up by unexpected null addresses (0.0.0.0) being
reported for gateways, dhcp servers, and dns servers.
The tianocore EDK2 iSCSI driver generates an iBFT table that always uses
IPv4-mapped IPv6 addresses for the NIC structure fields. This results
in values that are "not present or not specified" being reported as
::ffff:0.0.0.0 rather than all zeros as specified.
The iscsi_ibft module filters unspecified fields from the iBFT from
sysfs, preventing userspace from using invalid values and making it easy
to check for the presence of a value. This currently fails in regard to
these mapped null addresses.
In order to remain consistent with how the iBFT information is exposed,
we should accommodate the behavior of the tianocore iSCSI driver as it's
already in the wild in a large number of servers.
Tested under qemu using an OVMF build of tianocore EDK2.
Signed-off-by: Chris Leech <cleech@redhat.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
The map_single() function is not defined as static, even though it
doesn't seem to be used anywhere else in the kernel. Make it static to
avoid namespace pollution since this is a rather generic symbol.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Writing to a file is supposed to return the number of bytes written.
Returning zero unfortunately causes bash to constantly spin trying
to write to the sysfs file, to such an extent that even ^c and ^z
have no effect. The only way out of that is to kill the shell and
log back in. This isn't nice behaviour.
Fix it by returning the number of characters written to sysfs files.
[airlied: used suggestion from Al Viro]
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
On some v7 devices (e.g. Lenovo-E550) the deltas reported are typically
only in the 0-1 range dividing this by 2 results in a range of 0-0.
And even for v7 devices where this does not lead to making the trackstick
entirely unusable, it makes it twice as slow as before we added v7 support
and were using the ps/2 mouse emulation of the dual point setup.
If some kind of generic slowdown is actually necessary for some devices,
then that belongs in userspace, not in the kernel.
Cc: stable@vger.kernel.org
Reported-and-tested-by: Rico Moorman <rico.moorman@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
This adds new icbody type to the list recognized by Elantech PS/2 driver.
Cc: stable@vger.kernel.org
Signed-off-by: Sam Hung <sam.hung@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
If SRIOV is enabled we need to be in VEB mode not VEPA mode at probe.
This fixes an NPAR bug when SRIOV is enabled in the BIOS.
Change-ID: Ibf006abafd9a0ca3698ec24848cd771cf345cbbc
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The patch fixes a bug in the default configuration which
prevented a software bridge loaded on the PF interface from
working correctly because broadcast packets are incorrectly
looped back.
Fix the general case, by loading the driver in VEPA mode Until a
VF or VMDq VSI is added. This way loopback on the Main VSI is
turned off until needed and can resolve the issue of unnecessary
reflection for users that do not have VF or VMDq VSIs setup.
The driver must now coordinate the loopback setting for the Flow
Director (FDIR) VSI to make sure it is in sync with the current
VEB or VEPA mode setting.
The user can still switch bridge modes from the bridge commands and
choose to be in VEPA mode with VF VSIs. Because of hardware
requirements, the call to switch to VEB mode when no VF/VMDqs are
present will be rejected.
NOTE: This patch uses BIT_ULL as that is preferred going forward,
a followup patch in the lower priority queue to net-next will fix
up the remaining 1 << usages.
Change-ID: Ib121ddb18fe4b3c4f52e9deda6fcbeb9105683d1
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This patch fixes a bug where the i40e Tx queue will hang if this
skb is passed to the driver.
With mixed size fragments while using TSO there was a corner case
where we needed to linearize but we were not. This was seen with
iSCSI traffic and could be reproduced with a frag list that looks
like this:
num_frags = 17, gso_segs = 17, hdr_len = 66,
skb_shinfo(skb)->gso_size = 1448
size = 3002, j = 1, frag_size = 2936, num_frags = 17
size = 4268, j = 1, frag_size = 4096, num_frags = 16
size = 5534, j = 1, frag_size = 4096, num_frags = 15
size = 5352, j = 1, frag_size = 4096, num_frags = 14
size = 5170, j = 1, frag_size = 4096, num_frags = 13
size = 3468, j = 1, frag_size = 2576, num_frags = 12
size = 750, j = 1, frag_size = 112, num_frags = 11
size = 862, j = 2, frag_size = 112, num_frags = 10
size = 974, j = 3, frag_size = 112, num_frags = 9
size = 1126, j = 4, frag_size = 152, num_frags = 8
size = 1330, j = 5, frag_size = 204, num_frags = 7
size = 1534, j = 6, frag_size = 204, num_frags = 6
size = 356, j = 1, frag_size = 204, num_frags = 5
size = 560, j = 2, frag_size = 204, num_frags = 4
size = 764, j = 3, frag_size = 204, num_frags = 3
size = 968, j = 4, frag_size = 204, num_frags = 2
size = 1140, j = 5, frag_size = 172, num_frags = 1
result: linearize = 0, j = 6
Change-ID: I79bb1aeab0af255fe2ce28e93672a85d85bf47e8
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
"IS_ENABLED(PPC_PSERIES)" always evaluates to false, as IS_ENABLED() is
supposed to be used with the full Kconfig symbol name, including the
"CONFIG_" prefix.
Add the missing "CONFIG_" prefix to fix this.
Fixes: a25095d451ece23b ("of: Move dynamic node fixups out of powerpc and into common code")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org #+3.17
Signed-off-by: Grant Likely <grant.likely@linaro.org>
|
|
In the functions compat_get_bitmap() and compat_put_bitmap() the
variable nr_compat_longs stores how many compat_ulong_t words should be
copied in a loop.
The copy loop itself is this:
if (nr_compat_longs-- > 0) {
if (__get_user(um, umask)) return -EFAULT;
} else {
um = 0;
}
Since nr_compat_longs gets unconditionally decremented in each loop and
since it's type is unsigned this could theoretically lead to out of
bounds accesses to userspace if nr_compat_longs wraps around to
(unsigned)(-1).
Although the callers currently do not trigger out-of-bounds accesses, we
should better implement the loop in a safe way to completely avoid such
warp-arounds.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
|
|
Added the USB serial device ID for the HubZ dual ZigBee
and Z-Wave radio dongle.
Signed-off-by: John D. Blair <johnb@candicontrols.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
|
|
Commit 066450be41 ("perf/x86/intel/pt: Clean up the control flow
in pt_pmu_hw_init()") changed attribute initialization so that
only the first attribute gets initialized using
sysfs_attr_init(), which upsets lockdep.
This patch fixes the glitch so that all allocated attributes are
properly initialized thus fixing the lockdep warning reported by
Tvrtko and Imre.
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reported-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The cpumask vp_dev->msix_affinity_masks[info->msix_vector] may contain
staled information when vp_set_vq_affinity() gets called, so clear it
before setting the new cpu bit mask.
Cc: stable@vger.kernel.org
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
commit 65ca7514e21adbee25b8175fc909759c735d00ff
Author: Damien Lespiau <damien.lespiau@intel.com>
Date: Mon Feb 9 19:33:22 2015 +0000
drm/i915/skl: Implement WaBarrierPerformanceFixDisable
got misapplied and the code landed in chv_init_workarounds() instead of
the intended skl_init_workarounds(). Move it over to the right place.
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Add all missing platforms handled by intel_set_memory_cxsr() to the
i915_sr_status debugfs entry.
v2: Add G4X too. (Ville)
Clarify the change also affects CHV. (Ander)
References: https://bugs.freedesktop.org/show_bug.cgi?id=89792
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
After GPU reset, HW is losing the address of HWS page in the register.
The page itself is valid except that HW is not aware of its location.
[ 64.368623] [drm:gen8_init_common_ring [i915]] *ERROR* HWS Page address = 0x00000000
[ 64.368655] [drm:gen8_init_common_ring [i915]] *ERROR* HWS Page address = 0x00000000
[ 64.368681] [drm:gen8_init_common_ring [i915]] *ERROR* HWS Page address = 0x00000000
[ 64.368704] [drm:gen8_init_common_ring [i915]] *ERROR* HWS Page address = 0x00000000
This patch reloads this value into the register during ring init.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
421b3885bf6d56391297844f43fb7154a6396e12 "udp: ipv4: Add udp early
demux" introduced a regression that allowed sockets bound to INADDR_ANY
to receive packets from multicast groups that the socket had not joined.
For example a socket that had joined 224.168.2.9 could also receive
packets from 225.168.2.9 despite not having joined that group if
ip_early_demux is enabled.
Fix this by calling ip_check_mc_rcu() in udp_v4_early_demux() to verify
that the multicast packet is indeed ours.
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, openvswitch tries to disable LRO from the user space. This does
not work correctly when the device added is a vlan interface, though.
Instead of dealing with possibly complex stacked cross name space relations
in the user space, do the same as bridging does and call dev_disable_lro in
the kernel.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the bpf frame pointer is set to the old r15. This is
wrong because of packed stack. Fix this and adjust the frame pointer
to respect packed stack. This now generates a prolog like the following:
3ff8001c3fa: eb67f0480024 stmg %r6,%r7,72(%r15)
3ff8001c400: ebcff0780024 stmg %r12,%r15,120(%r15)
3ff8001c406: b904001f lgr %r1,%r15 <- load backchain
3ff8001c40a: 41d0f048 la %r13,72(%r15) <- load adjusted bfp
3ff8001c40e: a7fbfd98 aghi %r15,-616
3ff8001c412: e310f0980024 stg %r1,152(%r15) <- save backchain
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On s390x we have to provide 160 bytes stack space before we can call
the next function. From the 160 bytes that we got from the previous
function we only use 11 * 8 bytes and have 160 - 11 * 8 bytes left.
Currently for BPF we allocate additional 160 - 11 * 8 bytes for the
next function. This is wrong because then the next function only gets:
(160 - 11 * 8) + (160 - 11 * 8) = 2 * 72 = 144 bytes
Fix this and allocate enough memory for the next function.
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|