Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch fixes trace formatting issues using the
QETH_CARD_TEXT_ macro. The total size of each trace entry
is 8 bytes. Some of the sprintf formats exceed these 8
bytes (for example using abcd:%d and the converted value
needs more than 3 bytes). The solution is to shorten the
text prepending the value or use a different format (%x).
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch makes some global functions static and removes
the prototypes from the header file.
Also function qeth_query_card_info is not exported anymore,
there is no external user for it, this function should never
have been exported in the first place.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A full Rx ring only requires 1 MiB of memory. This is not enough
memory that it is useful to dynamically scale the number of Rx
requests in the ring based on traffic rates, because:
a) Even the full 1 MiB is a tiny fraction of a typically modern Linux
VM (for example, the AWS micro instance still has 1 GiB of memory).
b) Netfront would have used up to 1 MiB already even with moderate
data rates (there was no adjustment of target based on memory
pressure).
c) Small VMs are going to typically have one VCPU and hence only one
queue.
Keeping the ring full of Rx requests handles bursty traffic better
than trying to converge on an optimal number of requests to keep
filled.
On a 4 core host, an iperf -P 64 -t 60 run from dom0 to a 4 VCPU guest
improved from 5.1 Gbit/s to 5.6 Gbit/s. Gains with more bursty
traffic are expected to be higher.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After the NAPIfication of sunvnet, we no longer need to
synchronize by doing irqsave/restore on vio.lock in the
I/O fastpath.
NAPI ->poll() is non-reentrant, so all RX processing occurs
strictly in a serialized environment. TX reclaim is done in NAPI
context, so the netif_tx_lock can be used to serialize
critical sections between Tx and Rx paths.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A vnet_port_remove could be triggered as a result of an ldm-unbind
operation by the peer, module unload, or other changes to the
inter-vnet-link configuration. When this is concurrent with
vnet_start_xmit(), there are several race sequences possible,
such as
thread 1 thread 2
vnet_start_xmit
-> tx_port_find
spin_lock_irqsave(&vp->lock..)
ret = __tx_port_find(..)
spin_lock_irqrestore(&vp->lock..)
vio_remove -> ..
->vnet_port_remove
spin_lock_irqsave(&vp->lock..)
cleanup
spin_lock_irqrestore(&vp->lock..)
kfree(port)
/* attempt to use ret will bomb */
This patch adds RCU locking for port access so that vnet_port_remove
will correctly clean up port-related state.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move Rx packet procssing to the NAPI poll callback.
Disable VIO interrupt and unconditioanlly go into NAPI
context from vnet_event.
Note that we want to minimize the number of LDC
STOP/START messages sent. Specifically, do not send a STOP
message if vnet_walk_rx does not read all the available descriptors
because of the NAPI budget limitation. Instead, note the end index
as part of port state, and resume from this index when the
next poll callback is triggered.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
See Documentation/CodingStyle Chapter 6
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With properly using libphy PHYs now, remove the in-driver PHY
mangling.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Marvell Ethernet IP supports PHY negotiation driven by HW. This
fundamentally clashes with libphy (software) driven negotiation and
also cannot cope with quirky PHYs. Therefore, always disable any HW
negotiation features and properly use libphy's phy_device.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Current libphy handling in pxa168_eth lacks proper phy_connect. Prepare
to fix this by first moving phy properties from platform_data to private
driver data.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The PXA168 Ethernet IP support MII and RMII connection to its PHY.
Currently, pxa168 platform_data does not provide a way to pass that
and there is one user of pxa168 platform_data (mach-mmp/gplug).
Given the pinctrl settings of gplug it uses RMII, so add and pass
a corresponding phy_interface_t.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Marvell 88E3016 is a FastEthernet PHY that also can be found in Marvell
Berlin SoCs as integrated PHY.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As of commit e4dc601bf99ccd1c ("m68k: Disable/restore interrupts in
hwreg_present()/hwreg_write()"), this is no longer needed.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As of commit e4dc601bf99ccd1c ("m68k: Disable/restore interrupts in
hwreg_present()/hwreg_write()"), this is no longer needed.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both image_data and typhoon_fw->data are const u8*, so the cast to u8*
is unnecessary and confusing.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: David Dillow <dave@thedillows.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sctp_addr_is_valid() only appeared in its definition.
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch save the fn before doing rt6_backtrack.
Hence, without redo-ing the fib6_lookup(), saved_fn can be used
to redo rt6_select() with RT6_LOOKUP_F_REACHABLE off.
Some minor changes I think make sense to review as a single patch:
* Remove the 'out:' goto label.
* Remove the 'reachable' variable. Only use the 'strict' variable instead.
After this patch, "failing ip6_ins_rt()" should be the only case that
requires a redo of fib6_lookup().
Cc: David Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When there is a RTF_CACHE hit, no need to redo fib6_lookup()
with reachable=0.
Cc: David Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is the prep work to reduce the number of calls to fib6_lookup().
The BACKTRACK macro could be hard-to-read and error-prone due to
its side effects (mainly goto).
This patch is to:
1. Replace BACKTRACK macro with a function (fib6_backtrack) with the following
return values:
* If it is backtrack-able, returns next fn for retry.
* If it reaches the root, returns NULL.
2. The caller needs to decide if a backtrack is needed (by testing
rt == net->ipv6.ip6_null_entry).
3. Rename the goto labels in ip6_pol_route() to make the next few
patches easier to read.
Cc: David Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove trailing whitespace in tcp.h icmp.c syncookies.c
Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Bump i40e version to 1.0.21.
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-By: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Move the three variables out of the loop, so it only declares once.
Change-ID: I436913777c7da3c16dc0031b59e3ffa61de74718
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Add driver support for 10GBaseT device.
Change-ID: I4be6ed847ac0bddd220b9878a95c523b32038174
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Add code to handle link events when updating the PF switch. This
allows link information to be properly provided to VFs in all cases.
Change-ID: If314c95f3d39259ef4c40a4a3b823381e28fb24f
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Move the setting of flow control because this should be done at a pf level not
a vsi level. Also add a sleep and restart an to fix a bug where Rx would stop
after some stress.
Change-ID: I9a93d8c2ff27c39339eb00bc4ec1225e43900be0
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
As per the Documentation/timers/timers-howto.txt it is preferred to use
usleep_range() instead of udelay() if the delay value is > 10us in
non-atomic contexts.
So, replacing all the instances of udelay() with 10 or greater than 10
micro seconds delay in the driver and using usleep_range() instead.
Change-ID: Iaa2ab499a4c26f6005e5d86cc421407ef9de16c7
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This is one small step in making the indentation more consistent. If
we truly want to align values, then use tabs rather than spaces.
Change-ID: I12368bc77a52f296d1843fdcb67201a7d7cd4749
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
|
|
The driver can do a simpler job of managing link state by simply
using the admin queue receive event for link events as a doorbell
that tells the driver to update link state.
Additionally, add a workaround will help make sure the link state in the
hardware is consistent with the link state the driver is reporting
by refreshing the link state every service task interval.
Change-ID: Ib95b5b7b8cc016e97d8009f6363c9f9eed301444
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Tell the firmware what kind of link related events the driver is
interested in. In this case, just link up/down and qualified module
events are the ones the driver really cares about.
Change-ID: If132c812c340c8e1927c2caf6d55185296b66201
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
total_data_buflen is used by netvsc_send() to decide if a packet can be put
into send buffer. It should also include the size of RNDIS message before the
Ethernet frame. Otherwise, a messge with total size bigger than send_section_size
may be copied into the send buffer, and cause data corruption.
[Request to include this patch to the Stable branches]
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the amd-xgbe driver increments the packets processed counter
each time a descriptor is processed. Since a packet can be represented
by more than one descriptor incrementing the counter in this way is not
appropriate. Also, since multiple descriptors cause the budget check
to be short circuited, sometimes the returned value from the poll
function would be larger than the budget value resulting in a WARN_ONCE
being triggered.
Update the polling logic to properly account for the number of packets
processed and exit when the budget value is reached.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ndo_set_features callback function was improperly using an unsigned
int to save the current feature value for features such as NETIF_F_RXCSUM.
Since that feature is in the upper 32 bits of a 64 bit variable the
result was always 0 making it not possible to actually turn off the
hardware RX checksum support. Change the unsigned int type to the
netdev_features_t type in order to properly capture the current value
and perform the proper operation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit 278d24047891 (net: fec: ptp: Enable PPS output based on ptp clock)
fec_enet_interrupt calls fec_ptp_check_pps_event unconditionally, which calls
into ptp_clock_event. If fep->ptp_clock is NULL, ptp_clock_event tries to
dereference the NULL pointer.
Since on i.MX53 fep->bufdesc_ex is not set, fec_ptp_init is never called,
and fep->ptp_clock is NULL, which reliably causes a kernel panic.
This patch adds a check for fep->ptp_clock == NULL in fec_enet_interrupt.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The commit "net: Save TX flow hash in sock and set in skbuf on xmit"
introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate
and record flow hash(sk_txhash) in the socket structure. sk_txhash is used
to set skb->hash which is used to spread flows across multiple TXQs.
But, the above routines are invoked before the source port of the connection
is created. Because of this all outgoing connections that just differ in the
source port get hashed into the same TXQ.
This patch fixes this problem for IPv4/6 by invoking the the above routines
after the source port is available for the socket.
Fixes: b73c3d0e4("net: Save TX flow hash in sock and set in skbuf on xmit")
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
pskb_may_pull() maybe change skb->data and make nh and exthdr pointer
oboslete, so recompute the nd and exthdr
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After interface restart (eg: after link disconnection/reconnection), the bridge
function doesn't work anymore. This is due to the promiscuous mode being cleared
by the restart.
The mac-fcc already includes code to set the promiscuous mode back during the restart.
This patch adds the same handling to mac-fec and mac-scc.
Tested with bridge function on MPC885 with FEC.
Reported-by: Germain Montoies <germain.montoies@c-s.fr>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The crafted header start address is from a driver supplied buffer, which
one can reasonably expect to be aligned on a 4-bytes boundary.
However ATM the TSO helper API is only used by ethernet drivers and
the tcp header will then be aligned to a 2-bytes only boundary from the
header start address.
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
write_count and insert_count can wrap around, making > check invalid.
Fixes: 70b33fb0ddec827cbbd14cdc664fc27b2ef4a6b6 ("sfc: add support for
skb->xmit_more").
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp.
Fixes: 22e0f8b9322c "net: sched: make bstats per cpu and estimator RCU safe"
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
while comparing for verifier state equivalency the comparison
was missing a check for uninitialized register.
Make sure it does so and add a testcase.
Fixes: f1bca824dabb ("bpf: add search pruning optimization to verifier")
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The synchronize_rcu() in netlink_release() introduces unacceptable
latency. Reintroduce minimal lookup so we can drop the
synchronize_rcu() until socket destruction has been RCUfied.
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When running tipcTC&tipcTS test suite, below lockdep unsafe locking
scenario is reported:
[ 1109.997854]
[ 1109.997988] =================================
[ 1109.998290] [ INFO: inconsistent lock state ]
[ 1109.998575] 3.17.0-rc1+ #113 Not tainted
[ 1109.998762] ---------------------------------
[ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 1109.998762] (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
[ 1109.998762] [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
[ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
[ 1109.998762] [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
[ 1109.998762] [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
[ 1109.998762] [<ffffffff817676ee>] SYSC_connect+0xae/0xc0
[ 1109.998762] [<ffffffff81767b7e>] SyS_connect+0xe/0x10
[ 1109.998762] [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
[ 1109.998762] [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
[ 1109.998762] irq event stamp: 241060
[ 1109.998762] hardirqs last enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
[ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
[ 1109.998762] softirqs last enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50
[ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762]
[ 1109.998762] other info that might help us debug this:
[ 1109.998762] Possible unsafe locking scenario:
[ 1109.998762]
[ 1109.998762] CPU0
[ 1109.998762] ----
[ 1109.998762] lock(slock-AF_TIPC);
[ 1109.998762] <Interrupt>
[ 1109.998762] lock(slock-AF_TIPC);
[ 1109.998762]
[ 1109.998762] *** DEADLOCK ***
[ 1109.998762]
[ 1109.998762] 2 locks held by swapper/7/0:
[ 1109.998762] #0: (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
[ 1109.998762] #1: (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762]
[ 1109.998762] stack backtrace:
[ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
[ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1109.998762] ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
[ 1109.998762] ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
[ 1109.998762] ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
[ 1109.998762] Call Trace:
[ 1109.998762] <IRQ> [<ffffffff81a209eb>] dump_stack+0x4e/0x68
[ 1109.998762] [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
[ 1109.998762] [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
[ 1109.998762] [<ffffffff810a406c>] mark_lock+0x28c/0x2f0
[ 1109.998762] [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
[ 1109.998762] [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
[ 1109.998762] [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
[ 1109.998762] [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
[ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762] [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
[ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762] [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
[ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762] [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
[ 1109.998762] [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
[ 1109.998762] [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762] [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
[ 1109.998762] [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
[ 1109.998762] [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
[ 1109.998762] [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70
[ 1109.998762] [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
[ 1109.998762] [<ffffffff81785518>] napi_gro_receive+0xd8/0x240
[ 1109.998762] [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
[ 1109.998762] [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0
[ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762] [<ffffffff817842b1>] net_rx_action+0x141/0x310
[ 1109.998762] [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
[ 1109.998762] [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0
[ 1109.998762] [<ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762] [<ffffffff81a30d07>] do_IRQ+0x67/0x110
[ 1109.998762] [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
[ 1109.998762] <EOI> [<ffffffff8100d2b7>] ? default_idle+0x37/0x250
[ 1109.998762] [<ffffffff8100d2b5>] ? default_idle+0x35/0x250
[ 1109.998762] [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
[ 1109.998762] [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
[ 1109.998762] [<ffffffff81034c78>] start_secondary+0x188/0x1f0
When intra-node messages are delivered from one process to another
process, tipc_link_xmit() doesn't disable BH before it directly calls
tipc_sk_rcv() on process context to forward messages to destination
socket. Meanwhile, if messages delivered by remote node arrive at the
node and their destinations are also the same socket, tipc_sk_rcv()
running on process context might be preempted by tipc_sk_rcv() running
BH context. As a result, the latter cannot obtain the socket lock as
the lock was obtained by the former, however, the former has no chance
to be run as the latter is owning the CPU now, so headlock happens. To
avoid it, BH should be always disabled in tipc_sk_rcv().
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Locking dependency detected below possible unsafe locking scenario:
CPU0 CPU1
T0: tipc_named_rcv() tipc_rcv()
T1: [grab nametble write lock]* [grab node lock]*
T2: tipc_update_nametbl() tipc_node_link_up()
T3: tipc_nodesub_subscribe() tipc_nametbl_publish()
T4: [grab node lock]* [grab nametble write lock]*
The opposite order of holding nametbl write lock and node lock on
above two different paths may result in a deadlock. If we move the
the updating of the name table after link state named out of node
lock, the reverse order of holding locks will be eliminated, and
as a result, the deadlock risk.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In enic_stop, we disable preemption using local_bh_disable(). We disable
preemption to wait for busy_poll to finish.
napi_disable should not be called here as it might sleep.
Moving napi_disable() call out side of local_bh_disable.
BUG: sleeping function called from invalid context at include/linux/netdevice.h:477
in_atomic(): 1, irqs_disabled(): 0, pid: 443, name: ifconfig
INFO: lockdep is turned off.
Preemption disabled at:[<ffffffffa029c5c4>] enic_rfs_flw_tbl_free+0x34/0xd0 [enic]
CPU: 31 PID: 443 Comm: ifconfig Not tainted 3.17.0-netnext-05504-g59f35b8 #268
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
ffff8800dac10000 ffff88020b8dfcb8 ffffffff8148a57c 0000000000000000
ffff88020b8dfcd0 ffffffff8107e253 ffff8800dac12a40 ffff88020b8dfd10
ffffffffa029305b ffff88020b8dfd48 ffff8800dac10000 ffff88020b8dfd48
Call Trace:
[<ffffffff8148a57c>] dump_stack+0x4e/0x7a
[<ffffffff8107e253>] __might_sleep+0x123/0x1a0
[<ffffffffa029305b>] enic_stop+0xdb/0x4d0 [enic]
[<ffffffff8138ed7d>] __dev_close_many+0x9d/0xf0
[<ffffffff8138ef81>] __dev_close+0x31/0x50
[<ffffffff813974a8>] __dev_change_flags+0x98/0x160
[<ffffffff81397594>] dev_change_flags+0x24/0x60
[<ffffffff814085fd>] devinet_ioctl+0x63d/0x710
[<ffffffff81139c16>] ? might_fault+0x56/0xc0
[<ffffffff81409ef5>] inet_ioctl+0x65/0x90
[<ffffffff813768e0>] sock_do_ioctl+0x20/0x50
[<ffffffff81376ebb>] sock_ioctl+0x20b/0x2e0
[<ffffffff81197250>] do_vfs_ioctl+0x2e0/0x500
[<ffffffff81492619>] ? sysret_check+0x22/0x5d
[<ffffffff81285f23>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff8109fe19>] ? trace_hardirqs_on_caller+0x119/0x270
[<ffffffff811974ac>] SyS_ioctl+0x3c/0x80
[<ffffffff814925ed>] system_call_fastpath+0x1a/0x1f
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The following warning is shown when spinlock debug is enabled.
This occurs when enic_flow_may_expire timer function is running and
enic_stop is called on same CPU.
Fix this by using spink_lock_bh().
=================================
[ INFO: inconsistent lock state ]
3.17.0-netnext-05504-g59f35b8 #268 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
ifconfig/443 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&enic->rfs_h.lock)->rlock){+.?...}, at:
enic_rfs_flw_tbl_free+0x34/0xd0 [enic]
{IN-SOFTIRQ-W} state was registered at:
[<ffffffff810a25af>] __lock_acquire+0x83f/0x21c0
[<ffffffff810a45f2>] lock_acquire+0xa2/0xd0
[<ffffffff814913fc>] _raw_spin_lock+0x3c/0x80
[<ffffffffa029c3d5>] enic_flow_may_expire+0x25/0x130[enic]
[<ffffffff810bcd07>] call_timer_fn+0x77/0x100
[<ffffffff810bd8e3>] run_timer_softirq+0x1e3/0x270
[<ffffffff8105f9ae>] __do_softirq+0x14e/0x280
[<ffffffff8105fdae>] irq_exit+0x8e/0xb0
[<ffffffff8103da0f>] smp_apic_timer_interrupt+0x3f/0x50
[<ffffffff81493742>] apic_timer_interrupt+0x72/0x80
[<ffffffff81018143>] default_idle+0x13/0x20
[<ffffffff81018a6a>] arch_cpu_idle+0xa/0x10
[<ffffffff81097676>] cpu_startup_entry+0x2c6/0x330
[<ffffffff8103b7ad>] start_secondary+0x21d/0x290
irq event stamp: 2997
hardirqs last enabled at (2997): [<ffffffff81491865>] _raw_spin_unlock_irqrestore+0x65/0x90
hardirqs last disabled at (2996): [<ffffffff814915e6>] _raw_spin_lock_irqsave+0x26/0x90
softirqs last enabled at (2968): [<ffffffff813b57a3>] dev_deactivate_many+0x213/0x260
softirqs last disabled at (2966): [<ffffffff813b5783>] dev_deactivate_many+0x1f3/0x260
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&enic->rfs_h.lock)->rlock);
<Interrupt>
lock(&(&enic->rfs_h.lock)->rlock);
*** DEADLOCK ***
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
if ->encapsulation is set we have to use inner_tcp_hdrlen and add the
size of the inner network headers too.
This is 'mostly harmless'; tbf might send skb that is slightly over
quota or drop skb even if it would have fit.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb_gso_segment has three possible return values:
1. a pointer to the first segmented skb
2. an errno value (IS_ERR())
3. NULL. This can happen when GSO is used for header verification.
However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
and would oops when NULL is returned.
Note that these call sites should never actually see such a NULL return
value; all callers mask out the GSO bits in the feature argument.
However, there have been issues with some protocol handlers erronously not
respecting the specified feature mask in some cases.
It is preferable to get 'have to turn off hw offloading, else slow' reports
rather than 'kernel crashes'.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb_gso_segment() has a 'features' argument representing offload features
available to the output path.
A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use
those instead of the provided ones when handing encapsulation/tunnels.
Depending on dev->hw_enc_features of the output device skb_gso_segment() can
then return NULL even when the caller has disabled all GSO feature bits,
as segmentation of inner header thinks device will take care of segmentation.
This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs
that did not fit the remaining token quota as the segmentation does not work
when device supports corresponding hw offload capabilities.
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The following patch fixes a bug which causes the ax88179_178a driver to be
incapable of being added to a bond.
When I brought up the issue with the bonding maintainers, they indicated
that the real problem was with the NIC driver which must return zero for
success (of setting the MAC address). I see that several other NIC drivers
follow that pattern by either simply always returing zero, or by passing
through a negative (error) result while rewriting any positive return code
to zero. With that same philisophy applied to the ax88179_178a driver, it
allows it to work correctly with the bonding driver.
I believe this is suitable for queuing in -stable, as it's a small, simple,
and obvious fix that corrects a defect with no other known workaround.
This patch is against vanilla 3.17(.0).
Signed-off-by: Ian Morgan <imorgan@primordial.ca>
drivers/net/usb/ax88179_178a.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
|