Age | Commit message (Collapse) | Author | Files | Lines |
|
tcp_rcv_established() can now run in process context.
We need to disable BH while acquiring tcp probe spinlock,
or risk a deadlock.
Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Ricardo Nabinger Sanchez <rnsanchez@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Because of <linux/libc-compat.h> interface limitations, <netinet/in.h>
provided by libc cannot be included after <linux/in.h>, therefore any
header that includes <netinet/in.h> cannot be included after <linux/in.h>.
Change uapi/linux/l2tp.h, the last uapi header that includes
<netinet/in.h>, to include <linux/in.h> and <linux/in6.h> instead of
<netinet/in.h> and use __SOCK_SIZE__ instead of sizeof(struct sockaddr)
the same way as uapi/linux/in.h does, to fix linux/if_pppol2tp.h userspace
compilation errors like this:
In file included from /usr/include/linux/l2tp.h:12:0,
from /usr/include/linux/if_pppol2tp.h:21,
/usr/include/netinet/in.h:31:8: error: redefinition of 'struct in_addr'
Fixes: 47c3e7783be4 ("net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Multiple threads can call fanout_add() at the same time.
We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po->rollover that was set by another thread.
Do the same in fanout_release(), for peace of mind, and to help us
finding lockdep issues earlier.
Fixes: dc99f600698d ("packet: Add fanout support.")
Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the current driver, the MTU is set to the maximum value
capable for the backing device. This decision turned out to
be a mistake as it led to confusion among users. The expected
initial MTU value used for other IBM vNIC capable operating
systems is 1500, with the maximum value (9000) reserved for
when Jumbo frames are enabled. This patch sets the MTU to
the default value for a net device.
It also corrects a discrepancy between MTU values received from
firmware, which includes the ethernet header length, and net
device MTU values.
Finally, it removes redundant min/max MTU assignments after device
initialization.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a copy-paste error, which hides breaking of resume
for CPSW driver: there was replaced netdev_priv() to ndev_to_cpsw(ndev)
in suspend, but left it unchanged in resume.
Fixes: 606f39939595a4d4540406bfc11f265b2036af6d
(ti: cpsw: move platform data and slaves info to cpsw_common)
Reported-by: Alexey Starikovskiy <AStarikovskiy@topcon.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
I tried to avoid skb allocation for 0-length case, but missed
a check for NULL pointer in the non EOR case.
Fixes: 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix hardware setup of multicast address hash:
- Never clear the hardware hash (to avoid packet loss)
- Construct the hash register values in software and then write once
to hardware
Signed-off-by: Rui Sousa <rui.sousa@nxp.com>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a check on the type of the source address for the case
where the destination address is in6addr_any. If the source is an
IPv4-mapped IPv6 source address, the destination is changed to
::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
is done in three locations to handle UDP calls to either connect() or
sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
handling an in6addr_any destination until very late, so the patch only
needs to handle the case where the source is an IPv4-mapped IPv6
address.
Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a check for the problematic case of an IPv4-mapped IPv6
source address and a destination address that is neither an IPv4-mapped
IPv6 address nor in6addr_any, and returns an appropriate error. The
check in done before returning from looking up the route.
Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When called by HW offloading drivers, the TC action (e.g
net/sched/act_mirred.c) code uses this_cpu logic, e.g
_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)
per the kernel documention, preemption should be disabled, add that.
Before the fix, when running with CONFIG_PREEMPT set, we get a
BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793
asserion from the TC action (mirred) stats_update callback.
Fixes: aad7e08d39bd ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds code that handles GFP_ATOMIC kmalloc failure on
insertion. As we cannot use vmalloc, we solve it by making our
hash table nested. That is, we allocate single pages at each level
and reach our desired table size by nesting them.
When a nested table is created, only a single page is allocated
at the top-level. Lower levels are allocated on demand during
insertion. Therefore for each insertion to succeed, only two
(non-consecutive) pages are needed.
After a nested table is created, a rehash will be scheduled in
order to switch to a vmalloced table as soon as possible. Also,
the rehash code will never rehash into a nested table. If we
detect a nested table during a rehash, the rehash will be aborted
and a new rehash will be scheduled.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are two problems with the function tipc_sk_reinit. Firstly
it's doing a manual walk over an rhashtable. This is broken as
an rhashtable can be resized and if you manually walk over it
during a resize then you may miss entries.
Secondly it's missing memory barriers as previously the code used
spinlocks which provide the barriers implicitly.
This patch fixes both problems.
Fixes: 07f6c4bc048a ("tipc: convert tipc reference table to...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function glock_hash_walk walks the rhashtable by hand. This
is broken because if it catches the hash table in the middle of
a rehash, then it will miss entries.
This patch replaces the manual walk by using the rhashtable walk
interface.
Fixes: 88ffbf3e037e ("GFS2: Use resizable hash table for glocks")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When sending ARP requests over AX.25 links the hwaddress in the neighbour
cache are not getting initialized. For such an incomplete arp entry
ax2asc2 will generate an empty string resulting in /proc/net/arp output
like the following:
$ cat /proc/net/arp
IP address HW type Flags HW address Mask Device
192.168.122.1 0x1 0x2 52:54:00:00:5d:5f * ens3
172.20.1.99 0x3 0x0 * bpq0
The missing field will confuse the procfs parsing of arp(8) resulting in
incorrect output for the device such as the following:
$ arp
Address HWtype HWaddress Flags Mask Iface
gateway ether 52:54:00:00:5d:5f C ens3
172.20.1.99 (incomplete) ens3
This changes the content of /proc/net/arp to:
$ cat /proc/net/arp
IP address HW type Flags HW address Mask Device
172.20.1.99 0x3 0x0 * * bpq0
192.168.122.1 0x1 0x2 52:54:00:00:5d:5f * ens3
To do so it change ax2asc to put the string "*" in buf for a NULL address
argument. Finally the HW address field is left aligned in a 17 character
field (the length of an ethernet HW address in the usual hex notation) for
readability.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes an issue where the type of counters in the queue(s)
and interface are not in sync (queue counters are int, interface
counters are long), causing incorrect reporting of tx/rx values
of the vif interface and unclear counter overflows.
This patch sets both counters to the u64 type.
Signed-off-by: Mart van Santen <mart@greenhost.nl>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ghostprotocols.net domain is not working, remove it from CREDITS and
MAINTAINERS, and change the status to "Odd fixes", and since I haven't
been maintaining those, remove my address from there.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It seems nobody used LLC since linux-3.12.
Fortunately fuzzers like syzkaller still know how to run this code,
otherwise it would be no fun.
Setting skb->sk without skb->destructor leads to all kinds of
bugs, we now prefer to be very strict about it.
Ideally here we would use skb_set_owner() but this helper does not exist yet,
only CAN seems to have a private helper for that.
Fixes: 376c7311bdb6 ("net: add a temporary sanity check in skb_orphan()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command
to the given cgroup the descendent cgroup will be able to override
effective bpf program that was inherited from this cgroup.
By default it's not passed, therefore override is disallowed.
Examples:
1.
prog X attached to /A with default
prog Y fails to attach to /A/B and /A/B/C
Everything under /A runs prog X
2.
prog X attached to /A with allow_override.
prog Y fails to attach to /A/B with default (non-override)
prog M attached to /A/B with allow_override.
Everything under /A/B runs prog M only.
3.
prog X attached to /A with allow_override.
prog Y fails to attach to /A with default.
The user has to detach first to switch the mode.
In the future this behavior may be extended with a chain of
non-overridable programs.
Also fix the bug where detach from cgroup where nothing is attached
was not throwing error. Return ENOENT in such case.
Add several testcases and adjust libbpf.
Fixes: 3007098494be ("cgroup: add support for eBPF programs")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The failure path in ibmvnic_open() mistakenly makes a second call
to napi_enable instead of calling napi_disable. This can result
in a BUG_ON for any queues that were enabled in the previous call
to napi_enable.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Initialize condition variables prior to invoking any work that can
mark them complete. This resolves a race in the ibmvnic driver where
the driver faults trying to complete an uninitialized condition
variable.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
udp_ioctl(), as its name suggests, is used by UDP protocols,
but is also used by L2TP :(
L2TP should use its own handler, because it really does not
look the same.
SIOCINQ for instance should not assume UDP checksum or headers.
Thanks to Andrey and syzkaller team for providing the report
and a nice reproducer.
While crashes only happen on recent kernels (after commit
7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")), this
probably needs to be backported to older kernels.
Fixes: 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")
Fixes: 85584672012e ("udp: Fix udp_poll() and ioctl()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rx_refill_timer should be deleted as soon as we disconnect from the
backend since otherwise it is possible for the timer to go off before
we get to xennet_destroy_queues(). If this happens we may dereference
queue->rx.sring which is set to NULL in xennet_disconnect_backend().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a USB-to-serial adapter is unplugged, the driver re-initializes, with
dev->hard_header_len and dev->addr_len set to zero, instead of the correct
values. If then a packet is sent through the half-dead interface, the
kernel will panic due to running out of headroom in the skb when pushing
for the AX.25 headers resulting in this panic:
[<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50)
[<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 [ax25])
[<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] (ax_header+0x38/0x40 [mkiss])
[<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] (neigh_compat_output+0x8c/0xd8)
[<c041b584>] (neigh_compat_output) from [<c043e7a8>] (ip_finish_output+0x2a0/0x914)
[<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0)
[<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48)
This patch makes mkiss behave like the 6pack driver. 6pack does not
panic. In 6pack.c sp_setup() (same function name here) the values for
dev->hard_header_len and dev->addr_len are set to the same values as in
my mkiss patch.
[ralf@linux-mips.org: Massages original submission to conform to the usual
standards for patch submissions.]
Signed-off-by: Thomas Osterried <thomas@osterried.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the device being used to DMA map skb->data.
Erroneous device assignment causes the crash when SMMU is enabled.
This happens during TX since buffer gets DMA mapped with device
correspondign to net_device and gets unmapped using the device
related to DSAF.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch incorrectly attempted nested mnt_want_write, and incorrectly
disabled nfsd's owner override for truncate. We'll fix those problems
and make another attempt soon, for the moment I think the safest is to
revert.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
USB PHYs need the MDIO clock divisor enabled earlier to work.
Initialize mdio clock divisor in probe function. The ext bus
bit available in the same register will be used by mdio mux
to enable external mdio.
Signed-off-by: Yendapally Reddy Dhananjaya Reddy <yendapally.reddy@broadcom.com>
Fixes: ddc24ae1 ("net: phy: Broadcom iProc MDIO bus driver")
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In function igmpv3/mld_add_delrec() we allocate pmc and put it in
idev->mc_tomb, so we should free it when we don't need it in del_delrec().
But I removed kfree(pmc) incorrectly in latest two patches. Now fix it.
Fixes: 24803f38a5c0 ("igmp: do not remove igmp souce list info when ...")
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when ...")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This fixes a crash when running out of grant refs when creating many
queues across many netdevs.
* If creating queues fails (i.e. there are no grant refs available),
call xenbus_dev_fatal() to ensure that the xenbus device is set to the
closed state.
* If no queues are created, don't call xennet_disconnect_backend as
netdev->real_num_tx_queues will not have been set correctly.
* If setup_netfront() fails, ensure that all the queues created are
cleaned up, not just those that have been set up.
* If any queues were set up and an error occurs, call
xennet_destroy_queues() to clean up the napi context.
* If any fatal error occurs, unregister and destroy the netdev to avoid
leaving around a half setup network device.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the context is deactivated, the link_type is set to 0xff, which
triggers a warning message, and results in a wrong link status, as
the LSI is ignored.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a context is configured as dualstack ("IPv4v6"), the modem indicates
the context activation with a slightly different indication message.
The dual-stack indication omits the link_type (IPv4/v6) and adds
additional address fields.
IPv6 LSIs are identical to IPv4 LSIs, but have a different link type.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Reviewed-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Dmitry reported a kernel warning:
WARNING: CPU: 3 PID: 2936 at net/kcm/kcmsock.c:627
kcm_write_msgs+0x12e3/0x1b90 net/kcm/kcmsock.c:627
CPU: 3 PID: 2936 Comm: a.out Not tainted 4.10.0-rc6+ #209
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
panic+0x1fb/0x412 kernel/panic.c:179
__warn+0x1c4/0x1e0 kernel/panic.c:539
warn_slowpath_null+0x2c/0x40 kernel/panic.c:582
kcm_write_msgs+0x12e3/0x1b90 net/kcm/kcmsock.c:627
kcm_sendmsg+0x163a/0x2200 net/kcm/kcmsock.c:1029
sock_sendmsg_nosec net/socket.c:635 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:645
sock_write_iter+0x326/0x600 net/socket.c:848
new_sync_write fs/read_write.c:499 [inline]
__vfs_write+0x483/0x740 fs/read_write.c:512
vfs_write+0x187/0x530 fs/read_write.c:560
SYSC_write fs/read_write.c:607 [inline]
SyS_write+0xfb/0x230 fs/read_write.c:599
entry_SYSCALL_64_fastpath+0x1f/0xc2
when calling syscall(__NR_write, sock2, 0x208aaf27ul, 0x0ul) on a KCM
seqpacket socket. It appears that kcm_sendmsg() does not handle len==0
case correctly, which causes an empty skb is allocated and queued.
Fix this by skipping the skb allocation for len==0 case.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The commit 90c311b0eeea ("xen-netfront: Fix Rx stall during network
stress and OOM") caused the refill timer to be triggerred almost on
all invocations of xennet_alloc_rx_buffers for certain workloads.
This reworks the fix by reverting to the old behaviour and taking into
consideration the skb allocation failure. Refill timer is now triggered
on insufficient requests or skb allocation failure.
Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com>
Fixes: 90c311b0eeea (xen-netfront: Fix Rx stall during network stress and OOM)
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Generic PHY drivers gets assigned after we checked that the current
PHY driver is NULL, so we need to check a few things before we can
safely dereference d->driver. This would be causing a NULL deference to
occur when a system binds to the Generic PHY driver. Update
phy_attach_direct() to do the following:
- grab the driver module reference after we have assigned the Generic
PHY drivers accordingly, and remember we came from the generic PHY
path
- update the error path to clean up the module reference in case the
Generic PHY probe function fails
- split the error path involving phy_detacht() to avoid double free/put
since phy_detach() does all the clean up
- finally, have phy_detach() drop the module reference count before we
call device_release_driver() for the Generic PHY driver case
Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We'll OOPS in ramoops_get_next_prz() if the platform didn't ask for any
ftrace zones (i.e., cxt->fprzs will be NULL). Let's just skip this
entire FTRACE section if there's no 'fprzs'.
Regression seen on a coreboot/depthcharge-based Chromebook.
Fixes: 2fbea82bbb89 ("pstore: Merge per-CPU ftrace records into one")
Cc: Joel Fernandes <joelaf@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
SMBSLVCNT must be protected with the piix4_mutex_sb800 in order to avoid
multiple buses accessing to the semaphore at the same time.
Fixes: 701dc207bf55 ("i2c: piix4: Avoid race conditions with IMC")
Reported-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Since '701dc207bf55 ("i2c: piix4: Avoid race conditions with IMC")' we
are using the SMBSLVCNT register at offset 0x8. We need to request it.
Fixes: 701dc207bf55 ("i2c: piix4: Avoid race conditions with IMC")
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Without this change, the HDMI/DP codec will be recognised as a
generic codec, and there is no sound when playing through this codec.
As suggested by NVidia side, after adding the new ID in the driver,
the sound playing works well.
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Otherwise KVM will fail to pass them through to the host
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The IPIs come in as HVI not EE, so we need to test the appropriate
SRR1 bits. The encoding is such that it won't have false positives
on P7 and P8 so we can just test it like that. We also need to handle
the icp-opal variant of the flush.
Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Three tiny changes to the ERAT flushing logic: First don't make
it depend on DD1. It hasn't been decided yet but we might run
DD2 in a mode that also requires explicit flushes for performance
reasons so make it unconditional. We also add a missing isync, and
finally remove the flush from _tlbiel_va as it is only necessary
for congruence-class invalidations (PID, LPID and full TLB), not
targetted invalidations.
Fixes: 96ed1fe511a8 ("powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This reverts commit 020eb3daaba2857b32c4cf4c82f503d6a00a67de.
Gabriel C reports that it causes his machine to not boot, and we haven't
tracked down the reason for it yet. Since the bug it fixes has been
around for a longish time, we're better off reverting the fix for now.
Gabriel says:
"It hangs early and freezes with a lot RCU warnings.
I bisected it down to :
> Ruslan Ruslichenko (1):
> x86/ioapic: Restore IO-APIC irq_chip retrigger callback
Reverting this one fixes the problem for me..
The box is a PRIMERGY TX200 S5 , 2 socket , 2 x E5520 CPU(s) installed"
and Ruslan and Thomas are currently stumped.
Reported-and-bisected-by: Gabriel C <nix.or.die@gmail.com>
Cc: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org # for the backport of the original commit
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This reverts commit 2cc751545854d7bd7eedf4d7e377bb52e176cd07.
With this commit in place I get on a Cavium ThunderX (arm64) system:
$ if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v > rng-bad.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 9.1171e-05 s, 2.8 MB/s
$ dd if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v >> rng-bad.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 9.6141e-05 s, 2.7 MB/s
$ cat rng-bad.txt
000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000050 00 00 00 00 37 20 46 ae d0 fc 1c 55 25 6e b0 b8
000060 7c 7e d7 d4 00 0f 6f b2 91 1e 30 a8 fa 3e 52 0e
000070 06 2d 53 30 be a1 20 0f aa 56 6e 0e 44 6e f4 35
000080 b7 6a fe d2 52 70 7e 58 56 02 41 ea d1 9c 6a 6a
000090 d1 bd d8 4c da 35 45 ef 89 55 fc 59 d5 cd 57 ba
0000a0 4e 3e 02 1c 12 76 43 37 23 e1 9f 7a 9f 9e 99 24
0000b0 47 b2 de e3 79 85 f6 55 7e ad 76 13 4f a0 b5 41
0000c0 c6 92 42 01 d9 12 de 8f b4 7b 6e ae d7 24 fc 65
0000d0 4d af 0a aa 36 d9 17 8d 0e 8b 7a 3b b6 5f 96 47
0000e0 46 f7 d8 ce 0b e8 3e c6 13 a6 2c b6 d6 cc 17 26
0000f0 e3 c3 17 8e 9e 45 56 1e 41 ef 29 1a a8 65 c8 3a
000100
000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000050 00 00 00 00 f4 90 65 aa 8b f2 5e 31 01 53 b4 d4
000060 06 c0 23 a2 99 3d 01 e4 b0 c1 b1 55 0f 80 63 cf
000070 33 24 d8 3a 1d 5e cd 2c ba c0 d0 18 6f bc 97 46
000080 1e 19 51 b1 90 15 af 80 5e d1 08 0d eb b0 6c ab
000090 6a b4 fe 62 37 c5 e1 ee 93 c3 58 78 91 2a d5 23
0000a0 63 50 eb 1f 3b 84 35 18 cf b2 a4 b8 46 69 9e cf
0000b0 0c 95 af 03 51 45 a8 42 f1 64 c9 55 fc 69 76 63
0000c0 98 9d 82 fa 76 85 24 da 80 07 29 fe 4e 76 0c 61
0000d0 ff 23 94 4f c8 5c ce 0b 50 e8 31 bc 9d ce f4 ca
0000e0 be ca 28 da e6 fa cc 64 1c ec a8 41 db fe 42 bd
0000f0 a0 e2 4b 32 b4 52 ba 03 70 8e c1 8e d0 50 3a c6
000100
To my untrained mental entropy detector, the first several bytes of
each read from /dev/hwrng seem to not be very random (i.e. all zero).
When I revert the patch (apply this patch), I get back to what we have
in v4.9, which looks like (much more random appearing):
$ dd if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v > rng-good.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 0.000252233 s, 1.0 MB/s
$ dd if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v >> rng-good.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 0.000113571 s, 2.3 MB/s
$ cat rng-good.txt
000000 75 d1 2d 19 68 1f d2 26 a1 49 22 61 66 e8 09 e5
000010 e0 4e 10 d0 1a 2c 45 5d 59 04 79 8e e2 b7 2c 2e
000020 e8 ad da 34 d5 56 51 3d 58 29 c7 7a 8e ed 22 67
000030 f9 25 b9 fb c6 b7 9c 35 1f 84 21 35 c1 1d 48 34
000040 45 7c f6 f1 57 63 1a 88 38 e8 81 f0 a9 63 ad 0e
000050 be 5d 3e 74 2e 4e cb 36 c2 01 a8 14 e1 38 e1 bb
000060 23 79 09 56 77 19 ff 98 e8 44 f3 27 eb 6e 0a cb
000070 c9 36 e3 2a 96 13 07 a0 90 3f 3b bd 1d 04 1d 67
000080 be 33 14 f8 02 c2 a4 02 ab 8b 5b 74 86 17 f0 5e
000090 a1 d7 aa ef a6 21 7b 93 d1 85 86 eb 4e 8c d0 4c
0000a0 56 ac e4 45 27 44 84 9f 71 db 36 b9 f7 47 d7 b3
0000b0 f2 9c 62 41 a3 46 2b 5b e3 80 63 a4 35 b5 3c f4
0000c0 bc 1e 3a ad e4 59 4a 98 6c e8 8d ff 1b 16 f8 52
0000d0 05 5c 2f 52 2a 0f 45 5b 51 fb 93 97 a4 49 4f 06
0000e0 f3 a0 d1 1e ba 3d ed a7 60 8f bb 84 2c 21 94 2d
0000f0 b3 66 a6 61 1e 58 30 24 85 f8 c8 18 c3 77 00 22
000100
000000 73 ca cc a1 d9 bb 21 8d c3 5c f3 ab 43 6d a7 a4
000010 4a fd c5 f4 9c ba 4a 0f b1 2e 19 15 4e 84 26 e0
000020 67 c9 f2 52 4d 65 1f 81 b7 8b 6d 2b 56 7b 99 75
000030 2e cd d0 db 08 0c 4b df f3 83 c6 83 00 2e 2b b8
000040 0f af 61 1d f2 02 35 74 b5 a4 6f 28 f3 a1 09 12
000050 f2 53 b5 d2 da 45 01 e5 12 d6 46 f8 0b db ed 51
000060 7b f4 0d 54 e0 63 ea 22 e2 1d d0 d6 d0 e7 7e e0
000070 93 91 fb 87 95 43 41 28 de 3d 8b a3 a8 8f c4 9e
000080 30 95 12 7a b2 27 28 ff 37 04 2e 09 7c dd 7c 12
000090 e1 50 60 fb 6d 5f a8 65 14 40 89 e3 4c d2 87 8f
0000a0 34 76 7e 66 7a 8e 6b a3 fc cf 38 52 2e f9 26 f0
0000b0 98 63 15 06 34 99 b2 88 4f aa d8 14 88 71 f1 81
0000c0 be 51 11 2b f4 7e a0 1e 12 b2 44 2e f6 8d 84 ea
0000d0 63 82 2b 66 b3 9a fd 08 73 5a c2 cc ab 5a af b1
0000e0 88 e3 a6 80 4b fc db ed 71 e0 ae c0 0a a4 8c 35
0000f0 eb 89 f9 8a 4b 52 59 6f 09 7c 01 3f 56 e7 c7 bf
000100
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 210e7a43fa90 ("mm: SLUB freelist randomization") broke USB hub
initialisation as described in
https://bugzilla.kernel.org/show_bug.cgi?id=177551.
Bail out early from init_cache_random_seq if s->random_seq is already
initialised. This prevents destroying the previously computed
random_seq offsets later in the function.
If the offsets are destroyed, then shuffle_freelist will truncate
page->freelist to just the first object (orphaning the rest).
Fixes: 210e7a43fa90 ("mm: SLUB freelist randomization")
Link: http://lkml.kernel.org/r/20170207140707.20824-1-sean@erifax.org
Signed-off-by: Sean Rees <sean@erifax.org>
Reported-by: <userwithuid@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and
parsing functions") converted both cpumask printing and parsing
functions to use nr_cpu_ids instead of nr_cpumask_bits. While this was
okay for the printing functions as it just picked one of the two output
formats that we were alternating between depending on a kernel config,
doing the same for parsing wasn't okay.
nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS. We can always use
nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can
be more efficient to use NR_CPUS when we can get away with it.
Converting the printing functions to nr_cpu_ids makes sense because it
affects how the masks get presented to userspace and doesn't break
anything; however, using nr_cpu_ids for parsing functions can
incorrectly leave the higher bits uninitialized while reading in these
masks from userland. As all testing and comparison functions use
nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks
can erroneously yield false negative results.
This made the taskstats interface incorrectly return -EINVAL even when
the inputs were correct.
Fix it by restoring the parse functions to use nr_cpumask_bits instead
of nr_cpu_ids.
Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org
Fixes: 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Martin Steigerwald <martin.steigerwald@teamix.de>
Debugged-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted.
Fix Lustre and block_page_mkwrite_ret() used by other filesystems
(notably GFS2) to return VM_FAULT_NOPAGE instead which results in
bailing out from the fault code, the CPU then retries the access, and we
fault again effectively doing what the handler wanted.
Link: http://lkml.kernel.org/r/20170203150729.15863-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The user_header gets caught by kmemleak with the following splat as
missing a free:
unreferenced object 0xffff99667a733d80 (size 96):
comm "swapper/0", pid 1, jiffies 4294892317 (age 62191.468s)
hex dump (first 32 bytes):
a0 b6 92 b4 ff ff ff ff 00 00 00 00 01 00 00 00 ................
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
kmemleak_alloc+0x4a/0xa0
__kmalloc+0x144/0x260
__register_sysctl_table+0x54/0x5e0
register_sysctl+0x1b/0x20
user_namespace_sysctl_init+0x17/0x34
do_one_initcall+0x52/0x1a0
kernel_init_freeable+0x173/0x200
kernel_init+0xe/0x100
ret_from_fork+0x2c/0x40
The BUG_ON()s are intended to crash so no need to clean up after
ourselves on error there. This is also a kernel/ subsys_init() we don't
need a respective exit call here as this is never modular, so just white
list it.
Link: http://lkml.kernel.org/r/20170203211404.31458-1-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When drm_crtc_init_with_planes() was orignally added
(in drm_crtc.c, e13161af80c185ecd8dc4641d0f5df58f9e3e0af
drm: Add drm_crtc_init_with_planes() (v2)), it only checked for "primary"
being non-null. If that was the case, it modified primary->possible_crtcs.
Then, when support for cursor planes was added
(fc1d3e44ef7c1db93384150fdbf8948dcf949f15 drm: Allow drivers to register
cursor planes with crtc), the same behaviour was implemented for cursor
planes.
vc4_plane_init() since its inception has passed 0xff as "possible_crtcs"
parameter to drm_universal_plane_init(). With a change in drm_crtc.c
(7abc7d47510c75dd984380ebf819616e574c9604 drm: don't override
possible_crtcs for primary/cursor planes) passing 0xff results in primary's
possible_crtcs set to 0xff (cursor was updated manually by vc4_crtc.c).
Consequently, it would be allowed to use the primary plane from CRTC 1 (for
example) on CRTC 0, which would result in the overlay and cursors being
buried.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/1485941708-27892-1-git-send-email-andrzej.p@samsung.com
Fixes: 7abc7d47510c ("drm: don't override possible_crtcs for primary/cursor planes")
|
|
This patch fixes the case where there is no phydev attached
to a LMAC in DT due to non-existance of a PHY driver or due
to usage of non-stanadard PHY which doesn't support autoneg.
Changes dependeds on firmware to send correct info w.r.t
PHY and autoneg capability.
This patch also covers a case where a 10G/40G interface is used
as a 1G with convertors with Cortina PHY in between.
Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
for the network device not being NULL before attempting to destroy it. We were
not setting the slave network device as NULL if dsa_slave_create() failed, so
we would later on be calling dsa_slave_destroy() on a now free'd and
unitialized network device, causing crashes in dsa_slave_destroy().
Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Andrey reported a kernel crash:
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880060048040 task.stack: ffff880069be8000
RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline]
RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837
RSP: 0018:ffff880069bef8b8 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff880069befb90 RCX: 0000000000000000
RDX: 0000000000000018 RSI: ffff880069befa30 RDI: 00000000000000c2
RBP: ffff880069befbb8 R08: 0000000000000008 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069befab0
R13: ffff88006c624a80 R14: ffff880069befa70 R15: 0000000000000000
FS: 00007f6f7c716700(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004a6f28 CR3: 000000003a134000 CR4: 00000000000006e0
Call Trace:
inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
sock_sendmsg_nosec net/socket.c:635 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:645
SYSC_sendto+0x660/0x810 net/socket.c:1687
SyS_sendto+0x40/0x50 net/socket.c:1655
entry_SYSCALL_64_fastpath+0x1f/0xc2
This is because we miss a check for NULL pointer for skb_peek() when
the queue is empty. Other places already have the same check.
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|