Age | Commit message (Collapse) | Author | Files | Lines |
|
On each processed XCOPY command, two "kmalloc-512" memory objects are
leaked. These represent two allocations of struct xcopy_pt_cmd in
target_core_xcopy.c.
The reason for the memory leak is that the cmd_kref field is not
initialized (thus, it is zero because the allocations were done with
kzalloc). When we decrement zero kref in target_put_sess_cmd, the result
is not zero, thus target_release_cmd_kref is not called.
This patch fixes the bug by moving kref initialization from
target_get_sess_cmd to transport_init_se_cmd (this function is called from
target_core_xcopy.c, so it will correctly initialize cmd_kref). It can be
easily verified that all code that calls target_get_sess_cmd also calls
transport_init_se_cmd earlier, thus moving kref_init shouldn't introduce
any new problems.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This is the prevent previous stores from overlapping the block stores
done by the memcpy loop.
Based upon a glibc patch by Jose E. Marchesi
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ALB/TLB learning packets use all vlans configured on top
of the bond. This ends up being incorrect if we have a stack
of vlans on top of the bond. ALB/TLB should only use
first level/outer most vlans in its announcements.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Prior to commit fbd929f2dce460456807a51e18d623db3db9f077
bonding: support QinQ for bond arp interval
the arp monitoring code allowed for proper detection of devices
stacked on top of vlans. Since the above commit, the
code can still detect a device stacked on top of single
vlan, but not a device stacked on top of Q-in-Q configuration.
The search will only set the inner vlan tag if the route
device is the vlan device. However, this is not always the
case, as it is possible to extend the stacked configuration.
With this patch it is possible to provision devices on
top Q-in-Q vlan configuration that should be used as
a source of ARP monitoring information.
For example:
ip link add link bond0 vlan10 type vlan proto 802.1q id 10
ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
ip link add link vlan100 type macvlan
Note: This patch limites the number of stacked VLANs to 2,
just like before. The original, however had another issue
in that if we had more then 2 levels of VLANs, we would end
up generating incorrectly tagged traffic. This is no longer
possible.
Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval)
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@redhat.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: Patric McHardy <kaber@trash.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit ee1de406ba6eb1 ("random: simplify accounting logic") simplified
things too much, in that it allows the following to trigger an
overflow that results in a BUG_ON crash:
dd if=/dev/urandom of=/dev/zero bs=67108707 count=1
Thanks to Peter Zihlstra for discovering the crash, and Hannes
Frederic for analyizing the root cause.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Greg Price <price@mit.edu>
|
|
Macvlan devices try to avoid stacking, but that's not always
successfull or even desired. As an example, the following
configuration is perefectly legal and valid:
eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1
However, this configuration produces the following lockdep
trace:
[ 115.620418] ======================================================
[ 115.620477] [ INFO: possible circular locking dependency detected ]
[ 115.620516] 3.15.0-rc1+ #24 Not tainted
[ 115.620540] -------------------------------------------------------
[ 115.620577] ip/1704 is trying to acquire lock:
[ 115.620604] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[ 115.620686]
but task is already holding lock:
[ 115.620723] (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[ 115.620795]
which lock already depends on the new lock.
[ 115.620853]
the existing dependency chain (in reverse order) is:
[ 115.620894]
-> #1 (&macvlan_netdev_addr_lock_key){+.....}:
[ 115.620935] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[ 115.620974] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[ 115.621019] [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q]
[ 115.621066] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[ 115.621105] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[ 115.621143] [<ffffffff815da6be>] __dev_open+0xde/0x140
[ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[ 115.621174]
-> #0 (&vlan_netdev_addr_lock_key/1){+.....}:
[ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140
[ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[ 115.621174]
other info that might help us debug this:
[ 115.621174] Possible unsafe locking scenario:
[ 115.621174] CPU0 CPU1
[ 115.621174] ---- ----
[ 115.621174] lock(&macvlan_netdev_addr_lock_key);
[ 115.621174] lock(&vlan_netdev_addr_lock_key/1);
[ 115.621174] lock(&macvlan_netdev_addr_lock_key);
[ 115.621174] lock(&vlan_netdev_addr_lock_key/1);
[ 115.621174]
*** DEADLOCK ***
[ 115.621174] 2 locks held by ip/1704:
[ 115.621174] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40
[ 115.621174] #1: (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[ 115.621174]
stack backtrace:
[ 115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ #24
[ 115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010
[ 115.621174] ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0
[ 115.621174] ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8
[ 115.621174] 0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230
[ 115.621174] Call Trace:
[ 115.621174] [<ffffffff816ee20c>] dump_stack+0x4d/0x66
[ 115.621174] [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e
[ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[ 115.621174] [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0
[ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140
[ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[ 115.621174] [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30
[ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[ 115.621174] [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60
[ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[ 115.621174] [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730
[ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[ 115.621174] [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10
[ 115.621174] [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40
[ 115.621174] [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40
[ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[ 115.621174] [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0
[ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[ 115.621174] [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0
[ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[ 115.621174] [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570
[ 115.621174] [<ffffffff810cfe9f>] ? up_read+0x1f/0x40
[ 115.621174] [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570
[ 115.621174] [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0
[ 115.621174] [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0
[ 115.621174] [<ffffffff8120a284>] ? mntput+0x24/0x40
[ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
Fix this by correctly providing macvlan lockdep class.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit dc8eaaa006350d24030502a4521542e74b5cb39f.
vlan: Fix lockdep warning when vlan dev handle notification
Instead we use the new new API to find the lock subclass of
our vlan device. This way we can support configurations where
vlans are interspersed with other devices:
bond -> vlan -> macvlan -> vlan
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently netif_addr_lock_nested assumes that there can be only
a single nesting level between 2 devices. However, if we
have multiple devices of the same type stacked, this fails.
For example:
eth0 <-- vlan0.10 <-- vlan0.10.20
A more complicated configuration may stack more then one type of
device in different order.
Ex:
eth0 <-- vlan0.10 <-- macvlan0 <-- vlan1.10.20 <-- macvlan1
This patch adds an ndo_* function that allows each stackable
device to report its nesting level. If the device doesn't
provide this function default subclass of 1 is used.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.
For example:
eth0 <- vlan0.10 <- macvlan0 <- vlan1.20
The nesting level of vlan1.20 would be 1, since there is another vlan
in the stack under it.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The value written to PLLE_AUX was incorrect due to a wrong variable
being used. Without this fix SATA does not work.
Cc: stable@vger.kernel.org
Signed-off-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com>
Tested-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: improved changelog]
|
|
wdev->ifdev should be set by .change_virtual_intf(). This solves the
problem of WARN() messages on module unload.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Starting from linux-3.13, GRO attempts to build full size skbs.
Problem is the commit assumed one particular field in skb->cb[]
was clean, but it is not the case on some stacked devices.
Timo reported a crash in case traffic is decrypted before
reaching a GRE device.
Fix this by initializing NAPI_GRO_CB(skb)->last at the right place,
this also removes one conditional.
Thanks a lot to Timo for providing full reports and bisecting this.
Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb")
Bisected-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The connected check fails to check for ip_gre nbma mode tunnels
properly. ip_gre creates temporary tnl_params with daddr specified
to pass-in the actual target on per-packet basis from neighbor
layer. Detect these tunnels by inspecting the actual tunnel
configuration.
Minimal test case:
ip route add 192.168.1.1/32 via 10.0.0.1
ip route add 192.168.1.2/32 via 10.0.0.2
ip tunnel add nbma0 mode gre key 1 tos c0
ip addr add 172.17.0.0/16 dev nbma0
ip link set nbma0 up
ip neigh add 172.17.0.1 lladdr 192.168.1.1 dev nbma0
ip neigh add 172.17.0.2 lladdr 192.168.1.2 dev nbma0
ping 172.17.0.1
ping 172.17.0.2
The second ping should be going to 192.168.1.2 and head 10.0.0.2;
but cached gre tunnel level route is used and it's actually going
to 192.168.1.1 via 10.0.0.1.
The lladdr's need to go to separate dst for the bug to trigger.
Test case uses separate route entries, but this can also happen
when the route entry is same: if there is a nexthop exception or
the GRE tunnel is IPsec'ed in which case the dst points to xfrm
bundle unique to the gre lladdr.
Fixes: 7d442fab0a67 ("ipv4: Cache dst in tunnels")
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Cc: Tom Herbert <therbert@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adding more than one chip on device-tree currently causes the probing
routine to always use the first chips data pointer.
Signed-off-by: Fabian Godehardt <fg@emlix.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, "ip -6 route get mark xyz" ignores the mark passed in
by userspace. Make it honour the mark, just like IPv4 does.
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the NAPI budget was not all used, xenvif_poll() would call
napi_complete() /after/ enabling the interrupt. This resulted in a
race between the napi_complete() and the napi_schedule() in the
interrupt handler. The use of local_irq_save/restore() avoided by
race iff the handler is running on the same CPU but not if it was
running on a different CPU.
Fix this properly by calling napi_complete() before reenabling
interrupts (in the xenvif_napi_schedule_or_enable_irq() call).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Seems it helps some users, but causes issues for other users:
https://bugzilla.redhat.com/show_bug.cgi?id=1089545
So lets drop it for now until we've figured out a better fix.
Fixes: 43d949024425 (ACPI / video: Add use_native_backlight quirks for more systems)
References: https://bugzilla.redhat.com/show_bug.cgi?id=1089545
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There may be padding on the ticket contained in the key payload, so just ensure
that the claimed token length is large enough, rather than exactly the right
size.
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With commit be9dad1f9f26604fb ("net: phy: suspend phydev when going
to HALTED"), an unused PHY device will be put in a low-power mode
using BMCR_PDOWN. Some Ethernet drivers might be calling phy_start()
and phy_stop() from ndo_open and ndo_close() respectively, while
calling phy_connect() and phy_disconnect() from probe and remove.
In such a case, the PHY will be powered down during the phy_stop()
call, but will fail to be powered up in phy_start().
This patch fixes this scenario.
Signed-off-by: Jiancheng Xue <xuejiancheng@huawei.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we receive a netdev event indicating a netdev change and/or
a netdev address change, we must change the MAC index used by the
proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the
VF will not carry the same source MAC address as the non-CM packets.
We use the UPDATE_QP command to perform this change.
In order to avoid modifying a QP context based on netdev event,
while the driver attempts to destroy this QP (e.g either the mlx4_ib
or ib_mad modules are unloaded), we use mutex locking in both flows.
Since the relevant mlx4 proxy GSI QP is created indirectly by the
mad module when they create their GSI QP, the mlx4 didn't need to
keep track on that QP prior to this change.
Now, when QP modifications are needed to this QP from within the
driver, we added refernece to it.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds UPDATE_QP SRIOV wrapper support.
The mechanism is a general one, but currently only source MAC
index changes are allowed for VFs.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use a correct pipe type when filling un interrupt urbs. This should
finally take care of the WARN() messages on the console when USB urbs
are submitted.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit e2b149cc4ba0 ("crush: add chooseleaf_vary_r tunable") added the
crush_map::chooseleaf_vary_r field but missed the decode part. This
lead to misdirected requests caused by incorrect raw crush mapping
sets.
Fixes: http://tracker.ceph.com/issues/8226
Reported-and-Tested-by: Dmitry Smirnov <onlyjob@member.fsf.org>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
|
It has been reported that using ZFSonLinux on rbd will result in memory
corruption. The bug report can be found here:
https://github.com/zfsonlinux/spl/issues/241
http://tracker.ceph.com/issues/7790
The reason is that ZFS will send pages with page_count 0 into rbd, which in
turns send them to tcp_sendpage. However, tcp_sendpage cannot deal with
page_count 0, as it will do get_page and put_page, and erroneously free the
page.
This type of issue has been noted before, and handled in iscsi, drbd,
etc. So, rbd should also handle this. This fix address this issue by fall back
to slower sendmsg when page_count 0 detected.
Cc: Sage Weil <sage@inktank.com>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
|
|
The following happens when trying to run a kvm guest on a kernel
configured for 64k pages. This doesn't happen with 4k pages:
BUG: failure at include/linux/mm.h:297/put_page_testzero()!
Kernel panic - not syncing: BUG!
CPU: 2 PID: 4228 Comm: qemu-system-aar Tainted: GF 3.13.0-0.rc7.31.sa2.k32v1.aarch64.debug #1
Call trace:
[<fffffe0000096034>] dump_backtrace+0x0/0x16c
[<fffffe00000961b4>] show_stack+0x14/0x1c
[<fffffe000066e648>] dump_stack+0x84/0xb0
[<fffffe0000668678>] panic+0xf4/0x220
[<fffffe000018ec78>] free_reserved_area+0x0/0x110
[<fffffe000018edd8>] free_pages+0x50/0x88
[<fffffe00000a759c>] kvm_free_stage2_pgd+0x30/0x40
[<fffffe00000a5354>] kvm_arch_destroy_vm+0x18/0x44
[<fffffe00000a1854>] kvm_put_kvm+0xf0/0x184
[<fffffe00000a1938>] kvm_vm_release+0x10/0x1c
[<fffffe00001edc1c>] __fput+0xb0/0x288
[<fffffe00001ede4c>] ____fput+0xc/0x14
[<fffffe00000d5a2c>] task_work_run+0xa8/0x11c
[<fffffe0000095c14>] do_notify_resume+0x54/0x58
In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra put_page()
on the stage2 pgd which leads to the BUG in put_page_testzero(). This
happens because a pud_huge() test in unmap_range() returns true when it
should always be false with 2-level pages tables used by 64k pages.
This patch removes support for huge puds if 2-level pagetables are
being used.
Signed-off-by: Mark Salter <msalter@redhat.com>
[catalin.marinas@arm.com: removed #ifndef around PUD_SIZE check]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org> # v3.11+
|
|
A few platforms lack a 'device_type = "memory"' for their memory
nodes, relying on an old ppc quirk in order to discover its memory.
Add the missing data so that all parsing code can find memory nodes
correctly.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Acked-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
|
|
The current .dts for ste-ccu8540 lacks a 'device_type = "memory"' for
its memory node, relying on an old ppc quirk in order to discover its
memory. Fix the data so that all parsing code can handle it correctly.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: devicetree@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
|
|
I've missed to add a NULL entry to the bond_intmax_tbl when I introduced
it with the conversion of arp_interval so add it now.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Fixes: 7bdb04ed0dbf ("bonding: convert arp_interval to use the new option API")
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The original series for reintroducing grant mapping for netback had a patch [1]
to handle receiving of packets from an another VIF. Grant copy on the receiving
side needs the grant ref of the page to set up the op.
The original patch assumed (wrongly) that the frags array haven't changed. In
the case reported by Sander, the sending guest sent a packet where the linear
buffer and the first frag were under PKT_PROT_LEN (=128) bytes.
xenvif_tx_submit() then pulled up the linear area to 128 bytes, and ditched the
first frag. The receiving side had an off-by-one problem when gathered the grant
refs.
This patch fixes that by checking whether the actual frag's page pointer is the
same as the page in the original frag list. It can handle any kind of changes on
the original frags array, like:
- removing granted frags from the array at any point
- adding local pages to the frags list anywhere
- reordering the frags
It's optimized to the most common case, when there is 1:1 relation between the
frags and the list, plus works optimal when frags are removed from the end or
the beginning.
[1]: 3e2234: xen-netback: Handle foreign mapped pages on the guest RX path
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RFC 4861 states in 7.2.5:
The IsRouter flag in the cache entry MUST be set based on the
Router flag in the received advertisement. In those cases
where the IsRouter flag changes from TRUE to FALSE as a result
of this update, the node MUST remove that router from the
Default Router List and update the Destination Cache entries
for all destinations using that neighbor as a router as
specified in Section 7.3.3. This is needed to detect when a
node that is used as a router stops forwarding packets due to
being configured as a host.
Currently, when dealing with NA Message which IsRouter flag changes from
TRUE to FALSE, the kernel only removes router from the Default Router List,
and don't update the Destination Cache entries.
Now in order to update those Destination Cache entries, i introduce
function rt6_clean_tohost().
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After the call to phy_init_hw failed in phy_attach_direct, phy_detach is called
to detach the phy device from its network device. If the attached driver is a
generic phy driver, this also detaches the driver. Subsequently phy_resume
is called, which assumes without checking that a driver is attached to the
device. This will result in a crash such as
Unable to handle kernel paging request for data at address 0xffffffffffffff90
Faulting instruction address: 0xc0000000003a0e18
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0000000003a0e18] .phy_attach_direct+0x68/0x17c
LR [c0000000003a0e6c] .phy_attach_direct+0xbc/0x17c
Call Trace:
[c0000003fc0475d0] [c0000000003a0e6c] .phy_attach_direct+0xbc/0x17c (unreliable)
[c0000003fc047670] [c0000000003a0ff8] .phy_connect_direct+0x28/0x98
[c0000003fc047700] [c0000000003f0074] .of_phy_connect+0x4c/0xa4
Only call phy_resume if phy_init_hw was successful.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Just like for pSCSI, if the transport sets get_write_cache, then it is
not valid to enable write cache emulation for it. Return an error.
see https://bugzilla.redhat.com/show_bug.cgi?id=1082675
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch explicitly disables Immediate + Unsolicited Data for ISER
connections during login in iscsi_login_zero_tsih_s2() when protection
has been enabled for the session by the underlying hardware.
This is currently required because protection / signature memory regions
(MRs) expect T10 PI to occur on RDMA READs + RDMA WRITEs transfers, and
not on a immediate data payload associated with ISCSI_OP_SCSI_CMD, or
unsolicited data-out associated with a ISCSI_OP_SCSI_DATA_OUT.
v2 changes:
- Add TARGET_PROT_DOUT_INSERT check (Sagi)
- Add pr_debug noisemaker (Sagi)
- Add goto to avoid early return from MRDSL check (nab)
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch fixes a free-after-use regression in ft_free_cmd(), where
ft_sess_put() is called with cmd->sess after percpu_ida_free() has
already released the tag.
Fix this bug by saving the ft_sess pointer ahead of percpu_ida_free(),
and pass it directly to ft_sess_put().
The regression was originally introduced in v3.13-rc1 commit:
commit 5f544cfac956971099e906f94568bc3fd1a7108a
Author: Nicholas Bellinger <nab@daterainc.com>
Date: Mon Sep 23 12:12:42 2013 -0700
tcm_fc: Convert to per-cpu command map pre-allocation of ft_cmd
Reported-by: Jun Wu <jwu@stormojo.com>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: <stable@vger.kernel.org> #3.13+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch changes an incorrect use of BUG_ON to instead generate a
REJECT + PROTOCOL_ERROR in iscsit_process_nop_out() code. This case
can occur with traditional TCP where a flood of zeros in the data
stream can reach this block for what is presumed to be a NOP-OUT with
a solicited reply, but without a valid iscsi_cmd pointer.
This incorrect BUG_ON was introduced during the v3.11-rc timeframe
with the following commit:
commit 778de368964c5b7e8100cde9f549992d521e9c89
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Fri Jun 14 16:07:47 2013 -0700
iscsi/isert-target: Refactor ISCSI_OP_NOOP RX handling
Reported-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
When the target is in stop stage, iSER transport initiates RDMA disconnects.
The iSER initiator may wish to establish a new connection over the
still existing network portal. In this case iSER transport should not
accept and resume new RDMA connections. In order to learn that, iscsi_np
is added with enabled flag so the iSER transport can check when deciding
weather to accept and resume a new connection request.
The iscsi_np is enabled after successful transport setup, and disabled
before iscsi_np login threads are cleaned up.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
RDMA CM and iSCSI target flows are asynchronous and completely
uncorrelated. Relying on the fact that iscsi_accept_np will be called
after CM connection request event and will wait for it is a mistake.
When attempting to login to a few targets this flow is racy and
unpredictable, but for parallel login to dozens of targets will
race and hang every time.
The correct synchronizing mechanism in this case is pending on
a semaphore rather than a wait_for_event. We keep the pending
interruptible for iscsi_np cleanup stage.
(Squash patch to remove dead code into parent - nab)
Reported-by: Slava Shwartsman <valyushash@gmail.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Should be adding list_add_tail($new, $head) and not
the other way around.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
Userspace tools assume if a value is read from configfs, it is valid
and will not cause an error if the same value is written back. The only
valid value for pi_prot_type for backends not supporting DIF is 0, so allow
this particular value to be set without returning an error.
Reported-by: Krzysztof Chojnowski <frirajder@gmail.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch disables multicast hash filtering if present in the hardware
and uses promiscuous mode instead until the problem with multicast
filtering has been debugged, integrated and tested.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the many sparse errors and warnings contained in the
initial submission of the Altera Triple Speed Ethernet driver, and a
few minor cppcheck warnings. Changes are tested on ARM and NIOS2
example designs, and compile tested against multiple architectures.
Typical issues addressed were as follows:
altera_tse_ethtool.c:136:19: warning: incorrect type in argument
1 (different address spaces)
altera_tse_ethtool.c:136:19: expected void const volatile
[noderef] <asn:2>*addr
altera_tse_ethtool.c:136:19: got unsigned int *<noident>
...
altera_sgdma.c:129:31: warning: cast removes address space of
expression
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
From: Cong Wang <cwang@twopensource.com>
commit 50624c934db18ab90 (net: Delay default_device_exit_batch until no
devices are unregistering) introduced rtnl_lock_unregistering() for
default_device_exit_batch(). Same race could happen we when rmmod a driver
which calls rtnl_link_unregister() as we call dev->destructor without rtnl
lock.
For long term, I think we should clean up the mess of netdev_run_todo()
and net namespce exit code.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The attached change significantly improves the performance of the LWS-CAS code
in syscall.S.
This allows a number of packages to build (e.g., zeromq3, gtest and libxs)
that previously failed because slow LWS-CAS performance under contention. In
particular, interrupts taken while the lock was taken degraded performance
significantly.
The change does the following:
1) Disables interrupts around the CAS operation, and
2) Changes the loads and stores to use the ordered completer, "o", on
PA 2.0. "o" and "ma" with a zero offset are equivalent. The latter is
accepted on both PA 1.X and 2.0.
The use of ordered loads and stores probably makes no difference on all
existing hardware, but it seemed pedantically correct. In particular, the CAS
operation must complete before LDCW lock is released. As written before, a
processor could reorder the operations.
I don't believe the period interrupts are disabled is long enough to
significantly increase interrupt latency. For example, the TLB insert code is
longer. Worst case is a memory fault in the CAS operation.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Ratelimit printing of userspace segfaults and make it runtime
configurable via the /proc/sys/debug/exception-trace variable. This
should resolve syslog from growing way too fast and thus prevents
possible system service attacks.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 3.13+
|
|
When a new device is added below a hotplug bridge, the bridge's secondary
bus speed and the device's bus speed must match. The shpchp driver
previously checked the bridge's *primary* bus speed, not the secondary bus
speed.
This caused hot-add errors like:
shpchp 0000:00:03.0: Speed of bus ff and adapter 0 mismatch
Check the secondary bus speed instead.
[bhelgaas: changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=75251
Fixes: 3749c51ac6c1 ("PCI: Make current and maximum bus speeds part of the PCI core")
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
CC: stable@vger.kernel.org # v2.6.34+
|
|
Change introduced by 88e48d7b3340ef07b108eb8a8b3813dd093cc7f7
("batman-adv: make DAT drop ARP requests targeting local clients")
implements a check that prevents DAT from using the caching
mechanism when the client that is supposed to provide a reply
to an arp request is local.
However change brought by be1db4f6615b5e6156c807ea8985171c215c2d57
("batman-adv: make the Distributed ARP Table vlan aware")
has not converted the above check into its vlan aware version
thus making it useless when the local client is behind a vlan.
Fix the behaviour by properly specifying the vlan when
checking for a client being local or not.
Reported-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
|
|
A pointer to the orig_node representing a bat-gateway is
stored in the gw_node->orig_node member, but the refcount
for such orig_node is never increased.
This leads to memory faults when gw_node->orig_node is accessed
and the originator has already been freed.
Fix this by increasing the refcount on gw_node creation
and decreasing it on gw_node free.
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
|
|
In the new fragmentation code the batadv_frag_send_packet()
function obtains a reference to the primary_if, but it does
not release it upon return.
This reference imbalance prevents the primary_if (and then
the related netdevice) to be properly released on shut down.
Fix this by releasing the primary_if in batadv_frag_send_packet().
Introduced by ee75ed88879af88558818a5c6609d85f60ff0df4
("batman-adv: Fragment and send skbs larger than mtu")
Cc: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Martin Hundebøll <martin@hundeboll.net>
|
|
If hard_iface is NULL and goto out is made batadv_hardif_free_ref()
doesn't check for NULL before dereferencing it to get to refcount.
Introduced in cb1c92ec37fb70543d133a1fa7d9b54d6f8a1ecd
("batman-adv: add debugfs support to view multiif tables").
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
|
|
Add the corresponding trace if we have a full match in a non-terminal
rule. Note that the traces will look slightly different than in
x_tables since the log message after all expressions have been
evaluated (contrary to x_tables, that emits it before the target
action). This manifests in two differences in nf_tables wrt. x_tables:
1) The rule that enables the tracing is included in the trace.
2) If the rule emits some log message, that is shown before the
trace log message.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|