path: root/tools/perf/scripts/python/call-graph-from-postgresql.py
Age | Commit message | Author | Files | Lines
2016-01-29 | tcp: Change reference to experimental CWND RFC. | Jörg Thalheim | 1 | -1/+1
Signed-off-by: Jörg Thalheim <joerg@higgsboson.tk> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-29 | macvlan: make operstate and carrier more accurate | Nikolay Aleksandrov | 1 | -0/+2
Currently when a macvlan is being initialized and the lower device is netif_carrier_ok(), the macvlan device doesn't run through rfc2863_policy() and is left with UNKNOWN operstate. Fix it by adding an unconditional linkwatch event for the new macvlan device. Similar fix is already used by the 8021q device (see register_vlan_dev()). Also fix the inconsistent state when the lower device has been down and its carrier was changed (when a device is down NETDEV_CHANGE doesn't get generated). The second issue can be seen f.e. when we have a macvlan on top of a 8021q device which has been down and its real device has been changing carrier states, after setting the 8021q device up, the macvlan device will have the same carrier state as it was before even though the 8021q can now have a different state.

Example for case 1:
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
$ ip l add l eth2 macvl0 type macvlan
$ ip l set macvl0 up
$ ip l sh macvl0
72: macvl0@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether f6:0b:54:0a:9d:a3 brd ff:ff:ff:ff:ff:ff

Example for case 2 (order is important):
Prestate: eth2 UP/CARRIER, vlan1 down, vlan1-macvlan down
$ ip l set vlan1-macvlan up
$ ip l sh vlan1-macvlan
71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff
[ eth2 loses CARRIER before vlan1 has been UP-ed ]
$ ip l sh eth2
4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff
$ ip l sh vlan1-macvlan
71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff
$ ip l set vlan1 up
$ ip l sh vlan1
70: vlan1@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff
$ ip l sh vlan1-macvlan
71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff
vlan1-macvlan is still UP, still has carrier and is still in the same operstate as before.

After the patch in case 1 macvl0 has state UP as it should and in case 2 vlan1-macvlan has state LOWERLAYERDOWN again as it should. Note that while the lower macvlan device is down their carrier and thus operstate can go out of sync but that will be fixed once the lower device goes up again. This behaviour seems to have been present since beginning of git history.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-29 | tipc: fix connection abort during subscription cancel | Parthasarathy Bhuvaragan | 1 | -6/+5
In 'commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing to events")', we terminate the connection if the subscription creation fails. In the same commit, the subscription creation result was based on the value of the subscription pointer (set in the function) instead of the return code. Unfortunately, the same function tipc_subscrp_create() handles subscription cancel request. For a subscription cancellation request, the subscription pointer cannot be set. Thus if a subscriber has several subscriptions and cancels any of them, the connection is terminated. In this commit, we terminate the connection based on the return value of tipc_subscrp_create(). Fixes: commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing to events") Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-29 | net: cavium: liquidio: use helpers ns_to_timespec64() | Kefeng Wang | 1 | -3/+1
Convert the driver to use ns_to_timespec64() to keep consistency with timespec64_to_ns() instead of open coding the same logic. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
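The conversion this commit delegates to a helper can be sketched in a few lines. This is only an illustration of the arithmetic, not the kernel code: `ns_to_timespec` and `timespec_to_ns` are hypothetical Python stand-ins, and the sketch assumes the nanosecond part is normalized into the range [0, NSEC_PER_SEC) even for negative inputs.

```python
NSEC_PER_SEC = 1_000_000_000


def ns_to_timespec(nsec):
    """Split a signed nanosecond count into (seconds, nanoseconds)."""
    # divmod() floors toward negative infinity, so the remainder always
    # lands in [0, NSEC_PER_SEC), including for negative inputs.
    sec, rem = divmod(nsec, NSEC_PER_SEC)
    return sec, rem


def timespec_to_ns(sec, nsec):
    """Inverse of ns_to_timespec(): fold the pair back into nanoseconds."""
    return sec * NSEC_PER_SEC + nsec
```

Keeping both directions in one place is the point of the commit: an open-coded division and modulo in a driver can drift from the helper's normalization rules, while a shared pair round-trips by construction.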
2016-01-29 | ipv4: early demux should be aware of fragments | Eric Dumazet | 1 | -1/+4
We should not assume a valid protocol header is present, as this is not the case for IPv4 fragments. Let's avoid extra cache line misses and potential bugs if we actually find a socket and incorrectly use its dst. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
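The check described above amounts to looking at the fragment bits of the IPv4 header before trusting anything that follows it. A minimal sketch, assuming a raw 20-byte header as input; `is_ipv4_fragment`, `may_early_demux` and `make_header` are hypothetical helper names, while the two masks are the standard IPv4 "more fragments" flag and fragment-offset mask.

```python
import struct

IP_MF = 0x2000      # "more fragments" flag in the flags/offset field
IP_OFFSET = 0x1FFF  # mask for the 13-bit fragment offset


def is_ipv4_fragment(header):
    """Return True if the IPv4 header belongs to a fragment."""
    if len(header) < 20:
        raise ValueError("IPv4 header is at least 20 bytes")
    # The flags/fragment-offset field is the 16-bit word at byte offset 6.
    (frag_off,) = struct.unpack_from("!H", header, 6)
    return (frag_off & (IP_MF | IP_OFFSET)) != 0


def may_early_demux(header):
    """Only look for a socket when a full transport header can be present."""
    return not is_ipv4_fragment(header)


def make_header(frag_off):
    """Build a dummy 20-byte IPv4 header with the given flags/offset word."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, frag_off,
                       64, 6, 0, bytes(4), bytes(4))
```

A packet with only the "don't fragment" bit (0x4000) set is not a fragment and stays eligible; anything with the MF bit or a non-zero offset is skipped.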
2016-01-28 | DT: phy.txt: Clarify expected compatible values | Andrew Lunn | 4 | -20/+6
PHY devices may only list compatibility with clause 22, 45, and if they need to be more specific, their PHY identifier values. No other compatible strings are allowed. Make this clear in the documentation, and remove examples where make/model compatible strings are listed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | of: of_mdio: Add a whitelist of PHY compatibilities. | Andrew Lunn | 1 | -0/+27
Some phy nodes list a compatible value indicating the PHY make/model. This is never used to match the device to the driver. However it does confuse the code to separate a PHY from a generic MDIO device like a switch. Generic MDIO devices must have a compatible value, PHYs can list clause 22 or 45, but nothing else. Issue a warning if we find a compatible value known on the whitelist, and say it is a PHY. Fixes: a9049e0c513c ("mdio: Add support for mdio drivers.") Reported-by: Aaro Koskinen <aaro.koskinen@nokia.com> Reported-by: Olof Johansson <olof@lixom.net> Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
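The classification rule described here can be sketched as follows. This is a reading of the commit message, not the kernel function: `classify_mdio_child` and the whitelist entry `vendor,example-phy` are hypothetical, the clause 22/45 strings and the `ethernet-phy-idAAAA.BBBB` form come from the PHY device tree binding, and treating a node without any compatible value as a PHY is an assumption drawn from "Generic MDIO devices must have a compatible value".

```python
import re

CLAUSE_COMPATIBLES = {
    "ethernet-phy-ieee802.3-c22",
    "ethernet-phy-ieee802.3-c45",
}
PHY_ID_PATTERN = re.compile(r"^ethernet-phy-id[0-9a-f]{4}\.[0-9a-f]{4}$",
                            re.IGNORECASE)
# Hypothetical stand-in for the make/model strings the commit whitelists.
LEGACY_WHITELIST = {"vendor,example-phy"}


def classify_mdio_child(compatibles, whitelist=LEGACY_WHITELIST):
    """Return (is_phy, warning) for a node's list of compatible strings."""
    if not compatibles:
        # Assumption: a node with no compatible value can only be a PHY.
        return True, None
    for value in compatibles:
        if value in CLAUSE_COMPATIBLES or PHY_ID_PATTERN.match(value):
            return True, None
    for value in compatibles:
        if value in whitelist:
            # Tolerated for existing device trees, but flagged.
            return True, "deprecated PHY compatible: " + value
    # Any other compatible value marks a generic MDIO device.
    return False, None
```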
2016-01-28 | lan78xx: throttle TX path at slower than SuperSpeed USB | Woojung.Huh@microchip.com | 1 | -1/+3
Throttle TX path only at slower than SuperSpeed USB. SuperSpeed USB has enough bandwidth to maintain GigE. Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | lan78xx: Add to handle mux control per chip id | Woojung.Huh@microchip.com | 1 | -27/+71
Depending on the chip, some EEPROM pins are muxed with the LED function. Disable and restore the LED function to access the EEPROM. Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | lan78xx: change to use updated phy-ignore-interrupts | Woojung.Huh@microchip.com | 1 | -16/+14
Update lan78xx to use patch of commit 4f2aaf7dd95b ("Merge branch 'fix-phy-ignore-interrupts'"). Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | tcp: beware of alignments in tcp_get_info() | Eric Dumazet | 1 | -4/+8
With some combinations of user provided flags in netlink command, it is possible to call tcp_get_info() with a buffer that is not 8-bytes aligned. It does matter on some arches, so we need to use put_unaligned() to store the u64 fields. Current iproute2 package does not trigger this particular issue. Fixes: 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info") Fixes: 977cb0ecf82e ("tcp: add pacing_rate information into tcp_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | switchdev: Require RTNL mutex to be held when sending FDB notifications | Ido Schimmel | 5 | -15/+14
When switchdev drivers process FDB notifications from the underlying device they resolve the netdev to which the entry points to and notify the bridge using the switchdev notifier. However, since the RTNL mutex is not held there is nothing preventing the netdev from disappearing in the middle, which will cause br_switchdev_event() to dereference a non-existing netdev. Make switchdev drivers hold the lock at the beginning of the notification processing session and release it once it ends, after notifying the bridge. Also, remove switchdev_mutex and fdb_lock, as they are no longer needed when RTNL mutex is held. Fixes: 03bf0c281234 ("switchdev: introduce switchdev notifier") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | xen-netfront: request Tx response events more often | Malcolm Crossley | 1 | -12/+3
Trying to batch Tx response events results in poor performance because this delays freeing the transmitted skbs. Instead use the standard RING_FINAL_CHECK_FOR_RESPONSES() macro to be notified once the next Tx response is placed on the ring. Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | net: mv643xx_eth: fix packet corruption with TSO and tiny unaligned packets. | Nicolas Schichan | 1 | -2/+2
For unaligned fragments of less than 8 bytes, the code in txq_put_data() would use txq->tx_curr_desc to index the tso_hdrs/tso_hdrs_dma buffers, even though it has already been advanced to the next descriptor at the beginning of the function. If that fragment was the last of the skb, the next skb would use that same space to place the IP headers, overwriting that small fragment data. Fixes: 91986fd3d335 (net: mv643xx_eth: Ensure proper data alignment in TSO TX path) Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Reviewed-by: Philipp Kirchhofer <philipp@familie-kirchhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | of: of_mdio: Ensure mdio device is a PHY | Andrew Lunn | 1 | -1/+9
of_phy_find_device() is used to find the phy device associated with a device node. It is expected the node is for a PHY device, but in fact it could have been probed as a generic MDIO device. Ensure the device is a PHY before returning it. Fixes: a9049e0c513c ("mdio: Add support for mdio drivers.") Reported-by: Aaro Koskinen <aaro.koskinen@nokia.com> Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | net: Fix dependencies for !HAS_IOMEM archs | Richard Weinberger | 2 | -0/+2
Not every arch has io memory. So, unbreak the build by fixing the dependencies. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | tcp: fix tcp_mark_head_lost to check skb len before fragmenting | Neal Cardwell | 1 | -5/+5
This commit fixes a corner case in tcp_mark_head_lost() which was causing the WARN_ON(len > skb->len) in tcp_fragment() to fire. tcp_mark_head_lost() was assuming that if a packet has tcp_skb_pcount(skb) of N, then it's safe to fragment off a prefix of M*mss bytes, for any M < N. But with the tricky way TCP pcounts are maintained, this is not always true. For example, suppose the sender sends 4 1-byte packets and has the last 3 packets SACKed. It will merge the last 3 packets in the write queue into an skb with pcount = 3 and len = 3 bytes. If another recovery happens after a SACK reneging event, tcp_mark_head_lost() may attempt to split the skb assuming it has more than 2*MSS bytes. This sounds very counterintuitive, but as the commit description for the related commit c0638c247f55 ("tcp: don't fragment SACKed skbs in tcp_mark_head_lost()") notes, this is because tcp_shifted_skb() coalesces adjacent regions of SACKed skbs, and when doing this it preserves the sum of their packet counts in order to reflect the real-world dynamics on the wire. The c0638c247f55 commit tried to avoid problems by not fragmenting SACKed skbs, since SACKed skbs are where the non-proportionality between pcount and skb->len/mss is known to be possible. However, that commit did not handle the case where during a reneging event one of these weird SACKed skbs becomes an un-SACKed skb, which tcp_mark_head_lost() can then try to fragment. The fix is to simply mark the entire skb lost when this happens. This makes the recovery slightly more aggressive in such corner cases before we detect reordering. But once we detect reordering this code path is bypassed because FACK is disabled. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
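The corner case can be reproduced with a toy model of an skb that carries only a byte length and a packet count. This is a simplified sketch of the decision the commit message describes, not the kernel code: `Skb` and `mark_prefix_lost` are hypothetical, and the way the packet count is divided between the two halves of a split is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    length: int         # payload bytes held by the skb
    pcount: int         # number of wire packets it accounts for
    lost: bool = False


def mark_prefix_lost(skb, packets_to_mark, mss):
    """Mark the first packets_to_mark packets of skb lost.

    Returns the resulting skbs, head first. The prefix is split off only
    when it is strictly shorter than the skb; otherwise the whole skb is
    marked lost instead of attempting an impossible split.
    """
    prefix_bytes = packets_to_mark * mss
    if packets_to_mark < skb.pcount and prefix_bytes < skb.length:
        head = Skb(prefix_bytes, packets_to_mark, lost=True)
        tail = Skb(skb.length - prefix_bytes, skb.pcount - packets_to_mark)
        return [head, tail]
    skb.lost = True
    return [skb]
```

With the coalesced skb from the example (pcount = 3, len = 3) and an MSS of 1460, a request to mark two packets lost would need a 2920-byte prefix, so the sketch marks the entire skb lost. A regular 4380-byte skb with the same pcount is split into 2920 and 1460 bytes.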
2016-01-28 | inet: frag: Always orphan skbs inside ip_defrag() | Joe Stringer | 2 | -2/+1
Later parts of the stack (including fragmentation) expect that there is never a socket attached to frag in a frag_list, however this invariant was not enforced on all defrag paths. This could lead to the BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the end of this commit message. While the call could be added to openvswitch to fix this particular error, the head and tail of the frags list are already orphaned indirectly inside ip_defrag(), so it seems like the remaining fragments should all be orphaned in all circumstances. kernel BUG at net/ipv4/ip_output.c:586! [...] Call Trace: <IRQ> [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch] [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch] [<ffffffff81667830>] ? dst_discard_out+0x20/0x20 [<ffffffff81667810>] ? dst_ifdown+0x80/0x80 [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch] [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack] [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch] [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch] [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch] [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch] [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch] [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch] [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch] [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch] [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffffa0214895>] ? 
internal_dev_xmit+0x5/0x140 [openvswitch] [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch] [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch] [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0 [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0 [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930 [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930 [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90 [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220 [<ffffffff81661060>] dev_queue_xmit+0x10/0x20 [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220 [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590 [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590 [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590 [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0 [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0 [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80 [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340 [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190 [<ffffffff816ab4c0>] ip_output+0x70/0x110 [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80 [<ffffffff816aa9f9>] ip_local_out+0x39/0x70 [<ffffffff816abf89>] ip_send_skb+0x19/0x40 [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40 [<ffffffff816df21a>] icmp_push_reply+0xea/0x120 [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230 [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70 [<ffffffff816dfa06>] icmp_echo+0x36/0x70 [<ffffffff816e0d11>] icmp_rcv+0x271/0x450 [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0 [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0 [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0 [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560 [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560 [<ffffffff816a5453>] ip_rcv+0x283/0x3e0 [<ffffffff810b6302>] ? match_held_lock+0x192/0x200 [<ffffffff816a4620>] ? 
inet_del_offload+0x40/0x40 [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0 [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230 [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90 [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60 [<ffffffff8165e678>] process_backlog+0x78/0x230 [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230 [<ffffffff8165e355>] net_rx_action+0x155/0x400 [<ffffffff8106b48c>] __do_softirq+0xcc/0x420 [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590 [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30 <EOI> [<ffffffff8106b88e>] do_softirq+0x4e/0x60 [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0 [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590 [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0 [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0 [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0 [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80 [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340 [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190 [<ffffffff816ab4c0>] ip_output+0x70/0x110 [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80 [<ffffffff816aa9f9>] ip_local_out+0x39/0x70 [<ffffffff816abf89>] ip_send_skb+0x19/0x40 [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40 [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0 [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0 [<ffffffff8163e398>] sock_sendmsg+0x38/0x50 [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270 [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320 [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40 [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460 [<ffffffff81204886>] ? 
__fget_light+0x66/0x90 [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80 [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20 [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 RIP [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0 RSP <ffff88006d603170> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | sctp: remove the dead field of sctp_transport | Xin Long | 4 | -21/+2
Now that refcnt is used to check whether a transport is alive, the dead field can be removed from sctp_transport. The traversal of transport_addr_list in the procfs dump uses list_for_each_entry_rcu, so there is no need to check whether the transport has been freed. sctp_generate_t3_rtx_event and sctp_generate_heartbeat_event are protected by the sock lock, so it is not necessary to check dead there either. Also, the timers are cancelled when sctp_transport_free() is called; it doesn't wait for refcnt to reach 0 to cancel them. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | sctp: hold transport before we access t->asoc in sctp proc | Xin Long | 1 | -0/+8
Previously, before rhashtable, /proc assoc listing was done by read-locking the entire hash entry and dumping all assocs at once, so we were sure that the assoc wasn't freed because it wouldn't be possible to remove it from the hash meanwhile. Now we use rhashtable to list transports, and dump entries one by one. That is, now we have to check if the assoc is still a good one, as the transport we got may be being freed. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | sctp: fix the transport dead race check by using atomic_add_unless on refcnt | Xin Long | 3 | -9/+14
Now when __sctp_lookup_association is running in BH, it will try to check if t->dead is set, but meanwhile other CPUs may be freeing this transport and this assoc and if it happens that __sctp_lookup_association checked t->dead a bit too early, it may think that the association is still good while it was already freed. So we fix this race by using atomic_add_unless in sctp_transport_hold. After we get one transport from hashtable, we will hold it only when this transport's refcnt is not 0, so that we can make sure t->asoc cannot be freed before we hold the asoc again. Note that sctp association is not freed using RCU so we can't use atomic_add_unless() with it as it may just be too late for that either. Fixes: 4f0087812648 ("sctp: apply rhashtable api to send/recv path") Reported-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
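The "hold only if the refcount is not already zero" pattern can be modelled in user space as below. This is a sketch of the idea, not the kernel implementation: `RefCount`, `Transport` and `lookup_and_hold` are hypothetical names, and a lock stands in for the atomic operation.

```python
import threading


class RefCount:
    """Counter whose increment fails once it has dropped to zero."""

    def __init__(self, initial=1):
        self._value = initial
        self._lock = threading.Lock()

    def inc_not_zero(self):
        # Mirrors "add 1 unless the value is 0": zero means the object is
        # already being torn down and must not be revived.
        with self._lock:
            if self._value == 0:
                return False
            self._value += 1
            return True

    def dec_and_test(self):
        # Returns True for the caller that drops the last reference.
        with self._lock:
            self._value -= 1
            return self._value == 0


class Transport:
    def __init__(self, name):
        self.name = name
        self.refcnt = RefCount(1)


def lookup_and_hold(table, key):
    """Return a held transport, or None if it is missing or dying."""
    transport = table.get(key)
    if transport is None or not transport.refcnt.inc_not_zero():
        return None
    return transport
```

The difference from a separate dead flag is that the test and the increment happen as one step, so a lookup can never observe "alive" and then take a reference on an object whose last reference has just been dropped.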
2016-01-28 | mlxsw: reg: Use correct offset in field definiton | Ido Schimmel | 1 | -3/+3
The rx_lane, tx_lane and module fields in the PMLP register don't have an additional offset besides the base one (0x04), so set it to 0x00. Fixes: 4ec14b7634b2 ("mlxsw: Add interface to access registers and process events") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Compare local ports instead of pointers | Ido Schimmel | 1 | -2/+4
When dumping the FDB we can't compare the actual pointers of the port structs, as it's possible the struct represents a vPort instead of the underlying physical port. Solve this by comparing the local port number instead, as it's shared between the physical port and all the vPorts on top of it. Fixes: 54a732018d8e ("mlxsw: spectrum: Adjust switchdev ops for VLAN devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Dump LAG FDB records only once | Ido Schimmel | 1 | -2/+10
LAG FDB records can only point to LAG devices or VLAN devices configured on top of them. Therefore, when dumping the FDB we shouldn't associate these records with the underlying physical ports. Fixes: 8a1ab5d76639 ("mlxsw: spectrum: Implement FDB add/remove/dump for LAG") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Use correct netdev when notifying bridge | Ido Schimmel | 1 | -2/+4
LAG FDB entries pointing to VLAN devices should be reported to the bridge with the matching VLAN device and not the underlying LAG device. Fixes: aac78a440887 ("mlxsw: spectrum: Adjust FDB notifications for VLAN devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Don't report VLAN for 802.1D FDB entries | Ido Schimmel | 1 | -15/+16
When dumping the hardware FDB we should report entries pointing to VLAN devices with VLAN 0, as packets coming into the bridge are untagged. Likewise, pass FDB_{ADD,DEL} notifications with VLAN 0 for these devices. Fixes: 54a732018d8e ("mlxsw: spectrum: Adjust switchdev ops for VLAN devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Notify bridge's FDB only based on learning_sync | Ido Schimmel | 1 | -8/+6
When we disable learning on a bridge port we should still update the software bridge's FDB when an entry pointing to this bridge port is aged out. Otherwise the software and hardware tables can become inconsistent. Fixes: 8a1ab5d76639 ("mlxsw: spectrum: Implement FDB add/remove/dump for LAG") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Disable learning according to STP state | Ido Schimmel | 1 | -1/+1
When port is put into LISTENING state it shouldn't populate the FDB, so set the port's STP state in hardware to DISCARDING instead of LEARNING. It will therefore keep listening to BPDU packets, but discard other non-control packets and won't perform any learning. Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Don't forward packets when STP state is DISABLED | Ido Schimmel | 1 | -1/+1
When STP state is set to DISABLED the port is assumed to be inactive, but currently we forward packets ingressing through it. Instead, set the port's STP state in hardware to DISCARDING, which means it doesn't forward packets or perform any learning, but it does trap control packets. However, these packets will be dropped by bridge code, which results in the expected behavior. Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
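Taken together with the LISTENING change in the entry above, the resulting mapping from bridge STP state to hardware state can be sketched as a lookup table. Only the DISABLED and LISTENING rows are stated in these two commit messages; the other three rows, and all of the names used here, are assumptions added to make the table complete.

```python
from enum import Enum


class BridgeStpState(Enum):
    DISABLED = "disabled"
    LISTENING = "listening"
    LEARNING = "learning"
    FORWARDING = "forwarding"
    BLOCKING = "blocking"


class HwStpState(Enum):
    DISCARDING = "discarding"
    LEARNING = "learning"
    FORWARDING = "forwarding"


HW_STATE = {
    # Stated in the commit messages: both end up as DISCARDING, which
    # still traps control packets but neither forwards nor learns.
    BridgeStpState.DISABLED: HwStpState.DISCARDING,
    BridgeStpState.LISTENING: HwStpState.DISCARDING,
    # Assumed for illustration; not spelled out in these commits.
    BridgeStpState.BLOCKING: HwStpState.DISCARDING,
    BridgeStpState.LEARNING: HwStpState.LEARNING,
    BridgeStpState.FORWARDING: HwStpState.FORWARDING,
}


def hw_stp_state(state):
    """Translate a bridge STP state into the state programmed in hardware."""
    return HW_STATE[state]
```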
2016-01-28 | mlxsw: spectrum: Flush FDB when leaving bridge | Ido Schimmel | 1 | -8/+129
As explained in the previous commit, we should always take care of flushing the FDB in the driver and not rely on bridge code. We need to distinguish between two cases with regards to LAG: 1) A port is leaving the LAG while the LAG (or VLAN devices on top of it) is bridged. In this case don't flush the FDB entries pointing to the LAG ID, as this would affect other ports that are still members of the LAG. Only flush the FDB when the last port in the LAG is leaving the bridge. 2) The LAG device is leaving the bridge. In this case the CHANGEUPPER event is simply propagated to each member port, so make each port flush the FDB in its turn. Note that emptying a bridged LAG from ports creates an inconsistency between hardware and software. A user who later (< ageing_time) re-populates the LAG won't have any FDB entries pointing to the LAG ID in hardware, but they will be present in the software bridge's FDB. Currently there is no good solution to this problem, but this will be addressed by us in the future. In order to optimize the flushing process, flush by port or LAG ID if there are no VLAN interfaces on top of the port. Otherwise, flush using {Port / LAG ID, FID=VID} for each of the lower 4K FIDs. In the case of a VLAN device simply flush using {Port / LAG ID, vFID} with the vFID to which the VLAN device is mapped. Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
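The choice between the three flush granularities can be sketched as a small planning function. This is a reading of the last paragraph of the commit message only: `plan_fdb_flush` and the tuple layout are hypothetical, and interpreting "the lower 4K FIDs" as FIDs 0 through 4095 is an assumption.

```python
NUM_LOWER_FIDS = 4096  # assumed meaning of "the lower 4K FIDs"


def plan_fdb_flush(port_or_lag_id, has_vlan_uppers=False, vfid=None):
    """Return the list of flush operations for a port or LAG leaving a bridge.

    Each operation is a tuple: ("per_port", id) or ("per_port_fid", id, fid).
    """
    if vfid is not None:
        # VLAN device: a single flush keyed by the vFID it is mapped to.
        return [("per_port_fid", port_or_lag_id, vfid)]
    if not has_vlan_uppers:
        # No VLAN interfaces on top: one flush covers every entry.
        return [("per_port", port_or_lag_id)]
    # VLAN interfaces exist, so flush only the FIDs that mirror VIDs and
    # leave the vFID-based entries of the VLAN devices alone.
    return [("per_port_fid", port_or_lag_id, fid)
            for fid in range(NUM_LOWER_FIDS)]
```

The single per-port flush is the cheap path; the per-FID loop is needed only when a blanket flush would also wipe entries that belong to VLAN devices still attached to another bridge.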
2016-01-28 | mlxsw: reg: Add the Switch Filtering DB Flush register | Ido Schimmel | 1 | -0/+88
When removing a net device from a bridge we should flush the FDB entries associated with this net device. Up until now, we relied upon bridge code to do that for us, but it is possible for the user to prevent hardware from syncing with the software bridge (learning_sync=0), so we need to flush ourselves. Add the Switch Filtering DB Flush (SFDF) register that is used to flush FDB entries according to different parameters (per-port, per-FID etc). Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 | mlxsw: spectrum: Handle port leaving LAG while bridged | Ido Schimmel | 3 | -3/+36
It is possible for a user to remove a port from a LAG device, while the LAG device or VLAN devices on top of it are bridged. In these cases, bridge's teardown sequence is never issued, so we need to take care of it ourselves. When LAG's unlinking event is received by port netdev: 1) Traverse its vPorts list and make those that are members of a bridge leave it. They will be deleted later by LAG code. 2) Make the port netdev itself leave its bridge if it is a member of one. Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-26 | rfkill: fix rfkill_fop_read wait_event usage | Johannes Berg | 1 | -12/+4
The code within wait_event_interruptible() is called with !TASK_RUNNING, so mustn't call any functions that can sleep, like mutex_lock(). Since we re-check the list_empty() in a loop after the wait, it's safe to simply use list_empty() without locking. This bug has existed forever, but was only discovered now because all userspace implementations, including the default 'rfkill' tool, use poll() or select() to get a readable fd before attempting to read. Cc: stable@vger.kernel.org Fixes: c64fb01627e24 ("rfkill: create useful userspace interface") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-01-26 | mac80211: Requeue work after scan complete for all VIF types. | Sachin Kulkarni | 5 | -19/+11
During a sw scan ieee80211_iface_work ignores work items for all vifs. However after the scan complete work is requeued only for STA, ADHOC and MESH iftypes. This occasionally results in event processing getting delayed/not processed for iftype AP when it coexists with a STA. This can result in data halt and eventually disconnection on the AP interface. Cc: stable@vger.kernel.org Signed-off-by: Sachin Kulkarni <Sachin.Kulkarni@imgtec.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-01-25 | net: i40e: shut up uninitialized variable warnings | Arnd Bergmann | 1 | -2/+2
intel/i40e/i40e_txrx.c: In function 'i40e_xmit_frame_ring': intel/i40e/i40e_txrx.c:2367:20: error: 'oiph' may be used uninitialized in this function [-Werror=maybe-uninitialized] intel/i40e/i40e_txrx.c:2317:16: note: 'oiph' was declared here intel/i40e/i40e_txrx.c:2367:17: error: 'oudph' may be used uninitialized in this function [-Werror=maybe-uninitialized] intel/i40e/i40e_txrx.c:2316:17: note: 'oudph' was declared here Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-01-25 | i40e: fix build warnings | Eric Dumazet | 1 | -10/+5
Fixes following build warnings : drivers/net/ethernet/intel/i40e/i40e_main.c:7057:13: warning: 'i40e_sync_udp_filters_subtask' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8524:13: warning: 'i40e_add_vxlan_port' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8569:13: warning: 'i40e_del_vxlan_port' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8604:13: warning: 'i40e_add_geneve_port' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8651:13: warning: 'i40e_del_geneve_port' defined but not used [-Wunused-function] Fixes: 6a899024058d ("i40e: geneve tunnel offload support") Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-01-25hv_netvsc: Fix book keeping of skb during batching processHaiyang Zhang2-11/+23
Since eliminating send_completion_tid from struct hv_netvsc_packet, we haven't added proper bookkeeping for the skb of the batched packet. This patch fixes this issue and allows the previous skb to be properly freed. Otherwise, a panic may happen. Thanks to Simon Xiao <sixiao@microsoft.com> for bisecting and analysis. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25hv_netvsc: use skb_get_hash() instead of a homegrown implementationVitaly Kuznetsov1-64/+3
Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add VLAN ID to flow_keys")) introduced a performance regression in the netvsc driver. The problem is, however, not the above-mentioned commit but the fact that the netvsc_set_hash() function made some assumptions about the struct flow_keys data layout, and this is wrong. Get rid of netvsc_set_hash() by switching to skb_get_hash(). This change will also imply switching to Jenkins hash from the currently used Toeplitz but it seems there is no good excuse for Toeplitz to stay. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25sit: set rtnl_link_ops before calling register_netdeviceThadeu Lima de Souza Cascardo1-2/+2
When creating a SIT tunnel with ip tunnel, rtnl_link_ops is not set before ipip6_tunnel_create is called. When register_netdevice is called, there is no linkinfo attribute in the NEWLINK message because of that. Setting rtnl_link_ops before calling register_netdevice fixes that. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25net: fec: use CONFIG_ARM instead of CONFIG_ARCH_MXC/SOC_IMX28Johannes Berg2-7/+4
As Arnd Bergmann points out, using CONFIG_ARCH_MXC and/or SOC_IMX28 is wrong if some other ARM platform uses this device - the operation of the driver would depend on an unrelated ARM platform that might or might not be set for multi-platform kernels. Prior to my previous patch, any other platforms using it would have been broken already due to having the cbd_datlen/cbd_sc fields in the wrong order, but byte ordering correctly, so no such platforms can exist and work today. In any case, it seems likely that only Freescale SoCs use this part, and those are little-endian on ARM, so CONFIG_ARM is safe for them. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25defxx: fix build warningSudip Mukherjee1-4/+4
We are getting many build warnings about: 'bar_start' may be used uninitialized and 'bar_len' may be used uninitialized. They are not actually uninitialized, as dfx_get_bars() will initialize them properly. But still, let's have them initialized just to satisfy the compiler (gcc 4.8.2). Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Acked-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25net: macb: fix build warningSudip Mukherjee1-1/+1
We are getting build warnings about: macb.c:2889:13: warning: 'tx_clk' may be used uninitialized in this function macb.c:2888:11: warning: 'hclk' may be used uninitialized in this function In reality they are not used uninitialized, as clk_init() will initialize them; this patch will just silence the warning. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25net: fec: make driver endian-safeJohannes Berg3-72/+101
The driver treats the device descriptors as CPU-endian, which appears to be correct with the default endianness on both ARM (typically LE) and PowerPC (typically BE) SoCs, indicating that the hardware block is generated differently. Add endianness annotations and byteswaps as necessary. It's not clear that the ifdef there really is correct and shouldn't just be #ifdef CONFIG_ARM, but I also can't test on anything but the i.MX6 HummingBoard where this gets it working with a BE kernel. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25net: dsa: fix mv88e6xxx switchesRussell King1-1/+1
Since commit 76e398a62712 ("net: dsa: use switchdev obj for VLAN add/del ops"), the Marvell 88E6xxx switch has been unable to pass traffic between ports - any received traffic is discarded by the switch. Taking a port out of bridge mode and configuring a vlan on it also allows the port to start passing traffic. With the debugfs files re-instated to allow debug of this issue by comparing the register settings between the working and non-working case, the reason becomes clear: GLOBAL GLOBAL2 SERDES 0 1 2 3 4 5 6 - 7: 1111 707f 2001 2 2 2 2 2 0 2 + 7: 1111 707f 2001 1 1 1 1 1 0 1 Register 7 for the ports is the default vlan tag register, and in the non-working setup, it has been set to 2, despite vlan 2 not being configured. This causes the switch to drop all packets coming in to these ports. The working setup has the default vlan tag register set to 1, which is the default vlan when none is configured. Inspection of the code reveals why. The code prior to this commit was: - for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) { ... - if (!err && vlan->flags & BRIDGE_VLAN_INFO_PVID) - err = ds->drv->port_pvid_set(ds, p->port, vid); but the new code is: + for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) { ... + } ... + if (pvid) + err = _mv88e6xxx_port_pvid_set(ds, port, vid); This causes the new code to always set the default vlan to one higher than the old code. Fix this. Fixes: 76e398a62712 ("net: dsa: use switchdev obj for VLAN add/del ops") Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
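The off-by-one described in this commit message can be reproduced outside the kernel. The following is only a minimal userspace sketch, not the driver code: struct vlan_range and both function names are hypothetical stand-ins, and using vid_end in the second function is one possible correction that matches the old in-loop behaviour, not necessarily the exact change that was merged.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the VLAN range passed to the driver. */
struct vlan_range {
	unsigned short vid_begin;
	unsigned short vid_end;
	bool pvid;
};

/* Shape of the regressed code: vid is read after the loop has ended. */
static unsigned short pvid_after_loop(const struct vlan_range *vlan)
{
	unsigned short vid;
	unsigned short pvid = 0;

	for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
		/* per-VID configuration would happen here */
	}

	if (vlan->pvid)
		pvid = vid;	/* vid is vid_end + 1 at this point */

	return pvid;
}

/* Possible correction (assumption): use the last VID of the range. */
static unsigned short pvid_fixed(const struct vlan_range *vlan)
{
	unsigned short vid;
	unsigned short pvid = 0;

	for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
		/* per-VID configuration would happen here */
	}

	if (vlan->pvid)
		pvid = vlan->vid_end;

	return pvid;
}
```

For a range holding only VLAN 1, the first function yields 2 and the second yields 1, which corresponds to the register 7 values in the non-working and working dumps quoted in the message.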
2016-01-2582xx: FCC: Fixing a bug causing to FCC port lock-up (second try)Martin Roth1-1/+1
This is an additional patch to the one already submitted recently. The previous patch was not complete, and the FCC port lock-up scenario has been reproduced in the lab. I had an opportunity to check the current patch in the lab and the FCC port no longer locks up, while the previous patch still locks up the FCC port. The current patch fixes a pointer arithmetic bug (second bug in the same line), which leads to FCC port lock-up during underrun/collision handling. Within the tx_startup() function in mac-fcc.c, the address of the last BD is not calculated correctly. As a result of the wrong calculation of the last BD address, the next transmitted BD may be set to an area outside the transmit BD ring. This actually causes the port lock-up, and it is not recoverable. Signed-off-by: Martin Roth <martin.roth@motorolasolutions.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-25ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIVThomas Egerer2-0/+2
The ESP algorithms using CBC mode require echainiv. Hence INET*_ESP have to select CRYPTO_ECHAINIV in order to work properly. This solves the issues caused by a misconfiguration as described in [1]. The original approach, patching crypto/Kconfig was turned down by Herbert Xu [2]. [1] https://lists.strongswan.org/pipermail/users/2015-December/009074.html [2] http://marc.info/?l=linux-crypto-vger&m=145224655809562&w=2 Signed-off-by: Thomas Egerer <hakke_007@gmx.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-24sctp: allow setting SCTP_SACK_IMMEDIATELY by the applicationMarcelo Ricardo Leitner1-0/+2
This patch extends commit b93d6471748d ("sctp: implement the sender side for SACK-IMMEDIATELY extension") as it didn't white list SCTP_SACK_IMMEDIATELY on sctp_msghdr_parse(), causing it to be understood as an invalid flag and returning -EINVAL to the application. Note that the actual handling of the flag is already there in sctp_datamsg_from_user(). https://tools.ietf.org/html/rfc7053#section-7 Fixes: b93d6471748d ("sctp: implement the sender side for SACK-IMMEDIATELY extension") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-24net: simplify napi_synchronize() to avoid warningsArnd Bergmann1-6/+5
The napi_synchronize() function is defined twice: The definition for SMP builds waits for other CPUs to be done, while the uniprocessor variant just contains a barrier and ignores its argument. In the mvneta driver, this leads to a warning about an unused variable when we lookup the NAPI struct of another CPU and then don't use it: ethernet/marvell/mvneta.c: In function 'mvneta_percpu_notifier': ethernet/marvell/mvneta.c:2910:30: error: unused variable 'other_port' [-Werror=unused-variable] There are no other CPUs on a UP build, so that code never runs, but gcc does not know this. The nicest solution seems to be to turn the napi_synchronize() helper into an inline function for the UP case as well, as that leads gcc to not complain about the argument being unused. Once we do that, we can also combine the two cases into a single function definition and use if(IS_ENABLED()) rather than #ifdef to make it look a bit nicer. The warning first came up in linux-4.4, but I failed to catch it earlier. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: f86428854480 ("net: mvneta: Statically assign queues to CPUs") Signed-off-by: David S. Miller <davem@davemloft.net>
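The pattern described in this commit message, a single inline function with a compile-time constant condition instead of two #ifdef variants, can be sketched in userspace C. This is an illustration only: SKETCH_SMP stands in for IS_ENABLED(CONFIG_SMP), and struct napi_sketch, napi_synchronize_sketch() and caller() are hypothetical names, not the kernel's definitions. The sketch returns a spin count purely so the behaviour can be observed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical compile-time switch standing in for IS_ENABLED(CONFIG_SMP). */
#ifndef SKETCH_SMP
#define SKETCH_SMP 0
#endif

/* Hypothetical, heavily reduced stand-in for the NAPI context. */
struct napi_sketch {
	volatile bool scheduled;
};

/*
 * One definition for both configurations. Because the argument is
 * referenced in the function body, a caller's variable counts as used
 * even when the waiting branch is compiled out.
 */
static inline int napi_synchronize_sketch(const struct napi_sketch *n)
{
	int spins = 0;

	if (SKETCH_SMP) {
		/* SMP case: wait until the other CPU is done. */
		while (n->scheduled)
			spins++;
	} else {
		/* UP case: only a compiler barrier. */
		atomic_signal_fence(memory_order_seq_cst);
	}

	return spins;
}

/* Caller resembling the driver code that triggered the warning. */
static int caller(void)
{
	struct napi_sketch other_port = { false };

	return napi_synchronize_sketch(&other_port);
}
```

With a macro-based UP variant that discards its argument, other_port in caller() would be reported as unused; with the inline function it is not, and the dead branch is still checked by the compiler before being dropped.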
2016-01-24pptp: fix illegal memory access caused by multiple bind()sHannes Frederic Sowa1-10/+24
Several times already this has been reported as kasan reports caused by syzkaller and trinity and people always looked at RCU races, but it is much simpler. :) In case we bind a pptp socket multiple times, we simply add it to the callid_sock list but don't remove the old binding. Thus the old socket stays in the bucket with unused call_id indexes and doesn't get cleaned up. This causes various forms of kasan reports which were hard to pinpoint. Simply don't allow multiple binds and correct error handling in pptp_bind. Also keep sk_state bits in place in pptp_connect. Fixes: 00959ade36acad ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)") Cc: Dmitry Kozlov <xeb@mail.ru> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Dmitry Vyukov <dvyukov@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Dave Jones <davej@codemonkey.org.uk> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-24drivers: net: xgene: fix extra IRQ issueIyappan Subramanian2-3/+10
For an interrupt controller that doesn't support irq_disable and hardware with level interrupts, an extra interrupt may be pending. This patch fixes the issue by setting the IRQ_DISABLE_UNLAZY flag for the interrupt line, as suggested by commit e9849777d0e2 ("genirq: Add flag to force mask in disable_irq[_nosync]()"). Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>