aboutsummaryrefslogtreecommitdiffstats
path: root/net/mptcp (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-01-29mptcp: Fix build with PROC_FS disabled.David S. Miller1-0/+2
net/mptcp/subflow.c: In function ‘mptcp_subflow_create_socket’: net/mptcp/subflow.c:624:25: error: ‘struct netns_core’ has no member named ‘sock_inuse’ Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mptcp: Fix code formattingMat Martineau2-3/+3
checkpatch.pl had a few complaints in the last set of MPTCP patches: ERROR: code indent should use tabs where possible +^I subflow, sk->sk_family, icsk->icsk_af_ops, target, mapped);$ CHECK: Comparison to NULL could be written "!new_ctx" + if (new_ctx == NULL) { ERROR: "foo * bar" should be "foo *bar" +static const struct proto_ops * tcp_proto_ops(struct sock *sk) Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mptcp: do not inherit inet proto opsFlorian Westphal1-19/+51
We need to initialise the struct ourselves, else we expose tcp-specific callbacks such as tcp_splice_read which will then trigger splat because the socket is an mptcp one: BUG: KASAN: slab-out-of-bounds in tcp_mstamp_refresh+0x80/0xa0 net/ipv4/tcp_output.c:57 Write of size 8 at addr ffff888116aa21d0 by task syz-executor.0/5478 CPU: 1 PID: 5478 Comm: syz-executor.0 Not tainted 5.5.0-rc6 #3 Call Trace: tcp_mstamp_refresh+0x80/0xa0 net/ipv4/tcp_output.c:57 tcp_rcv_space_adjust+0x72/0x7f0 net/ipv4/tcp_input.c:612 tcp_read_sock+0x622/0x990 net/ipv4/tcp.c:1674 tcp_splice_read+0x20b/0xb40 net/ipv4/tcp.c:791 do_splice+0x1259/0x1560 fs/splice.c:1205 To prevent build error with ipv6, add the recv/sendmsg function declaration to ipv6.h. The functions are already accessible "thanks" to retpoline related work, but they are currently only made visible by socket.c specific INDIRECT_CALLABLE macros. Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: cope with later TCP fallbackPaolo Abeni2-17/+103
With MPTCP v1, passive connections can fallback to TCP after the subflow becomes established: syn + MP_CAPABLE -> <- syn, ack + MP_CAPABLE ack, seq = 3 -> // OoO packet is accepted because in-sequence // passive socket is created, is in ESTABLISHED // status and tentatively as MP_CAPABLE ack, seq = 2 -> // no MP_CAPABLE opt, subflow should fallback to TCP We can't use the 'subflow' socket fallback, as we don't have it available for passive connection. Instead, when the fallback is detected, replace the mptcp socket with the underlying TCP subflow. Beyond covering the above scenario, it makes a TCP fallback socket as efficient as plain TCP ones. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: process MP_CAPABLE data optionChristoph Paasch4-27/+95
This patch implements the handling of MP_CAPABLE + data option, as per RFC 6824 bis / RFC 8684: MPTCP v1. On the server side we can receive the remote key after that the connection is established. We need to explicitly track the 'missing remote key' status and avoid emitting a mptcp ack until we get such info. When a late/retransmitted/OoO pkt carrying MP_CAPABLE[+data] option is received, we have to propagate the mptcp seq number info to the msk socket. To avoid ABBA locking issue, explicitly check for that in recvmsg(), where we own msk and subflow sock locks. The above also means that an established mp_capable subflow - still waiting for the remote key - can be 'downgraded' to plain TCP. Such change could potentially block a reader waiting for new data forever - as they hook to msk, while later wake-up after the downgrade will be on subflow only. The above issue is not handled here, we likely have to get rid of msk->fallback to handle that cleanly. Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: parse and emit MP_CAPABLE option according to v1 specChristoph Paasch3-36/+146
This implements MP_CAPABLE options parsing and writing according to RFC 6824 bis / RFC 8684: MPTCP v1. Local key is sent on syn/ack, and both keys are sent on 3rd ack. MP_CAPABLE messages len are updated accordingly. We need the skbuff to correctly emit the above, so we push the skbuff struct as an argument all the way from tcp code to the relevant mptcp callbacks. When processing incoming MP_CAPABLE + data, build a full blown DSS-like map info, to simplify later processing. On child socket creation, we need to record the remote key, if available. Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: move from sha1 (v0) to sha256 (v1)Paolo Abeni4-59/+104
For simplicity's sake use directly sha256 primitives (and pull them as a required build dep). Add optional, boot-time self-tests for the hmac function. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: new sysctl to control the activation per NSMatthieu Baerts4-5/+146
New MPTCP sockets will return -ENOPROTOOPT if MPTCP support is disabled for the current net namespace. We are providing here a way to control access to the feature for those that need to turn it on or off. The value of this new sysctl can be different per namespace. We can then restrict the usage of MPTCP to the selected NS. In case of serious issues with MPTCP, administrators can now easily turn MPTCP off. Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: allow collapsing consecutive sendpages on the same substreamPaolo Abeni1-15/+60
If the current sendmsg() lands on the same subflow we used last, we can try to collapse the data. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: recvmsg() can drain data from multiple subflowsPaolo Abeni1-10/+168
With the previous patch in place, the msk can detect which subflow has the current map with a simple walk, let's update the main loop to always select the 'current' subflow. The exit conditions now closely mirror tcp_recvmsg() to get expected timeout and signal behavior. Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Co-developed-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: add subflow write space signalling and mptcp_pollFlorian Westphal3-0/+57
Add new SEND_SPACE flag to indicate that a subflow has enough space to accept more data for transmission. It gets cleared at the end of mptcp_sendmsg() in case ssk has run below the free watermark. It is (re-set) from the wspace callback. This allows us to use msk->flags to determine the poll mask. Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Implement MPTCP receive pathMat Martineau4-4/+584
Parses incoming DSS options and populates outgoing MPTCP ACK fields. MPTCP fields are parsed from the TCP option header and placed in an skb extension, allowing the upper MPTCP layer to access MPTCP options after the skb has gone through the TCP stack. The subflow implements its own data_ready() ops, which ensures that the pending data is in sequence - according to MPTCP seq number - dropping out-of-seq skbs. The DATA_READY bit flag is set if this is the case. This allows the MPTCP socket layer to determine if more data is available without having to consult the individual subflows. It additionally validates the current mapping and propagates EoF events to the connection socket. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Co-developed-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Write MPTCP DSS headers to outgoing data packetsMat Martineau3-6/+285
Per-packet metadata required to write the MPTCP DSS option is written to the skb_ext area. One write to the socket may contain more than one packet of data, which is copied to page fragments and mapped in to MPTCP DSS segments with size determined by the available page fragments and the maximum mapping length allowed by the MPTCP specification. If do_tcp_sendpages() splits a DSS segment in to multiple skbs, that's ok - the later skbs can either have duplicated DSS mapping information or none at all, and the receiver can handle that. The current implementation uses the subflow frag cache and tcp sendpages to avoid excessive code duplication. More work is required to ensure that it works correctly under memory pressure and to support MPTCP-level retransmissions. The MPTCP DSS checksum is not yet implemented. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Add setsockopt()/getsockopt() socket operationsPeter Krystad1-0/+58
set/getsockopt behaviour with multiple subflows is undefined. Therefore, for now, we return -EOPNOTSUPP unless we're in fallback mode. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Add shutdown() socket operationPeter Krystad1-0/+66
Call shutdown on all subflows in use on the given socket, or on the fallback socket. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Add key generation and token treePeter Krystad6-8/+428
Generate the local keys, IDSN, and token when creating a new socket. Introduce the token tree to track all tokens in use using a radix tree with the MPTCP token itself as the index. Override the rebuild_header callback in inet_connection_sock_af_ops for creating the local key on a new outgoing connection. Override the init_req callback of tcp_request_sock_ops for creating the local key on a new incoming connection. Will be used to obtain the MPTCP parent socket to handle incoming joins. Co-developed-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Create SUBFLOW socket for incoming connectionsPeter Krystad1-5/+231
Add subflow_request_sock type that extends tcp_request_sock and add an is_mptcp flag to tcp_request_sock distinguish them. Override the listen() and accept() methods of the MPTCP socket proto_ops so they may act on the subflow socket. Override the conn_request() and syn_recv_sock() handlers in the inet_connection_sock to handle incoming MPTCP SYNs and the ACK to the response SYN. Add handling in tcp_output.c to add MP_CAPABLE to an outgoing SYN-ACK response for a subflow_request_sock. Co-developed-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Handle MP_CAPABLE options for outgoing connectionsPeter Krystad4-24/+547
Add hooks to tcp_output.c to add MP_CAPABLE to an outgoing SYN request, to capture the MP_CAPABLE in the received SYN-ACK, to add MP_CAPABLE to the final ACK of the three-way handshake. Use the .sk_rx_dst_set() handler in the subflow proto to capture when the responding SYN-ACK is received and notify the MPTCP connection layer. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Associate MPTCP context with TCP socketPeter Krystad4-7/+272
Use ULP to associate a subflow_context structure with each TCP subflow socket. Creating these sockets requires new bind and connect functions to make sure ULP is set up immediately when the subflow sockets are created. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Handle MPTCP TCP optionsPeter Krystad3-1/+127
Add hooks to parse and format the MP_CAPABLE option. This option is handled according to MPTCP version 0 (RFC6824). MPTCP version 1 MP_CAPABLE (RFC6824bis/RFC8684) will be added later in coordination with related code changes. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24mptcp: Add MPTCP socket stubsMat Martineau4-0/+184
Implements the infrastructure for MPTCP sockets. MPTCP sockets open one in-kernel TCP socket per subflow. These subflow sockets are only managed by the MPTCP socket that owns them and are not visible from userspace. This commit allows a userspace program to open an MPTCP socket with: sock = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); The resulting socket is simply a wrapper around a single regular TCP socket, without any of the MPTCP protocol implemented over the wire. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net>