Diffstat (limited to 'Documentation/networking/kcm.txt')
-rw-r--r-- Documentation/networking/kcm.txt | 285
1 files changed, 0 insertions, 285 deletions
diff --git a/Documentation/networking/kcm.txt b/Documentation/networking/kcm.txt
deleted file mode 100644
index b773a5278ac4..000000000000
--- a/Documentation/networking/kcm.txt
+++ /dev/null
@@ -1,285 +0,0 @@
-Kernel Connection Multiplexor
------------------------------
-
-Kernel Connection Multiplexor (KCM) is a mechanism that provides a message based
-interface over TCP for generic application protocols. With KCM an application
-can efficiently send and receive application protocol messages over TCP using
-datagram sockets.
-
-KCM implements an NxM multiplexor in the kernel as diagrammed below:
-
-+------------+   +------------+   +------------+   +------------+
-| KCM socket |   | KCM socket |   | KCM socket |   | KCM socket |
-+------------+   +------------+   +------------+   +------------+
-      |                 |               |                |
-      +-----------+     |               |     +----------+
-                  |     |               |     |
-               +----------------------------------+
-               |           Multiplexor            |
-               +----------------------------------+
-                 |   |           |           |  |
-       +---------+   |           |           |  ------------+
-       |             |           |           |              |
-+----------+  +----------+  +----------+  +----------+ +----------+
-|  Psock   |  |  Psock   |  |  Psock   |  |  Psock   | |  Psock   |
-+----------+  +----------+  +----------+  +----------+ +----------+
-      |              |           |            |             |
-+----------+  +----------+  +----------+  +----------+ +----------+
-| TCP sock |  | TCP sock |  | TCP sock |  | TCP sock | | TCP sock |
-+----------+  +----------+  +----------+  +----------+ +----------+
-
-KCM sockets
------------
-
-The KCM sockets provide the user interface to the multiplexor. All the KCM sockets
-bound to a multiplexor are considered to have equivalent function, and I/O
-operations in different sockets may be done in parallel without the need for
-synchronization between threads in userspace.
-
-Multiplexor
------------
-
-The multiplexor provides the message steering. In the transmit path, messages
-written on a KCM socket are sent atomically on an appropriate TCP socket.
-Similarly, in the receive path, messages are constructed on each TCP socket
-(Psock) and complete messages are steered to a KCM socket.
-
-TCP sockets & Psocks
---------------------
-
-TCP sockets may be bound to a KCM multiplexor. A Psock structure is allocated
-for each bound TCP socket; this structure holds the state for constructing
-messages on receive as well as other connection specific information for KCM.
-
-Connected mode semantics
-------------------------
-
-Each multiplexor assumes that all attached TCP connections are to the same
-destination and can use the different connections for load balancing when
-transmitting. The normal send and recv calls (including sendmmsg and recvmmsg)
-can be used to send and receive messages from the KCM socket.
-
-Socket types
-------------
-
-KCM supports SOCK_DGRAM and SOCK_SEQPACKET socket types.
-
-Message delineation
--------------------
-
-Messages are sent over a TCP stream with some application protocol message
-format that typically includes a header which frames the messages. The length
-of a received message can be deduced from the application protocol header
-(often just a simple length field).
-
-A TCP stream must be parsed to determine message boundaries. Berkeley Packet
-Filter (BPF) is used for this. When attaching a TCP socket to a multiplexor a
-BPF program must be specified. The program is called at the start of receiving
-a new message and is given an skbuff that contains the bytes received so far.
-It parses the message header and returns the length of the message. Given this
-information, KCM will construct the message of the stated length and deliver it
-to a KCM socket.
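As a rough user-space illustration of the value such a parser returns (this is not the BPF program itself), the sketch below assumes a framing in which every message begins with a 4-byte big-endian payload length. The function name frame_total_len is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the total on-wire size of the message that
 * starts at buf, or 0 if fewer than 4 header bytes have arrived so far.
 * Assumes a 4-byte big-endian payload length at offset 0.
 */
uint64_t frame_total_len(const unsigned char *buf, size_t avail)
{
	uint32_t payload;

	assert(buf != NULL);
	if (avail < 4)
		return 0;	/* header incomplete, cannot decide yet */

	payload = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
		  ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];

	/* Header plus payload; 64-bit so a huge length cannot wrap. */
	return (uint64_t)payload + 4;
}
```

The returned length covers the header as well as the payload, since the message delivered to the KCM socket is cut from the start of the header.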
-
-TCP socket management
----------------------
-
-When a TCP socket is attached to a KCM multiplexor, data ready (POLLIN) and
-write space available (POLLOUT) events are handled by the multiplexor. If there
-is a state change (disconnection) or other error on a TCP socket, an error is
-posted on the TCP socket so that a POLLERR event happens and KCM discontinues
-using the socket. When the application gets the error notification for a
-TCP socket, it should unattach the socket from KCM and then handle the error
-condition (the typical response is to close the socket and create a new
-connection if necessary).
-
-KCM limits the maximum receive message size to be the size of the receive
-socket buffer on the attached TCP socket (the socket buffer size can be set by
-SO_RCVBUF). If the length of a new message reported by the BPF program is
-greater than this limit a corresponding error (EMSGSIZE) is posted on the TCP
-socket. The BPF program may also enforce a maximum message size and report an
-error when it is exceeded.
-
-A timeout may be set for assembling messages on a receive socket. The timeout
-value is taken from the receive timeout of the attached TCP socket (this is set
-by SO_RCVTIMEO). If the timer expires before assembly is complete an error
-(ETIMEDOUT) is posted on the socket.
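Both limits are ordinary socket options on the TCP socket, so they can be set before the socket is attached. The following is a minimal sketch; the helper name kcm_prepare_tcp and its parameters are assumptions made for illustration, not part of the KCM interface.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>

/*
 * Hypothetical helper: configure a TCP socket before attaching it to KCM.
 * rcvbuf_bytes is the requested receive buffer size, which bounds the
 * largest message KCM will assemble; assembly_secs is the receive timeout
 * used for message assembly.
 * Returns 0 on success, -1 with errno set on failure.
 */
int kcm_prepare_tcp(int tcpfd, int rcvbuf_bytes, long assembly_secs)
{
	struct timeval tv = { .tv_sec = assembly_secs, .tv_usec = 0 };

	assert(rcvbuf_bytes > 0 && assembly_secs >= 0);

	if (setsockopt(tcpfd, SOL_SOCKET, SO_RCVBUF,
		       &rcvbuf_bytes, sizeof(rcvbuf_bytes)) < 0)
		return -1;

	return setsockopt(tcpfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}
```

Linux may adjust the requested buffer size (see socket(7)), so reading SO_RCVBUF back with getsockopt is the reliable way to learn the effective value.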
-
-User interface
-==============
-
-Creating a multiplexor
-----------------------
-
-A new multiplexor and initial KCM socket are created by a socket call:
-
-  socket(AF_KCM, type, protocol)
-
-  - type is either SOCK_DGRAM or SOCK_SEQPACKET
-  - protocol is KCMPROTO_CONNECTED
-
-Cloning KCM sockets
--------------------
-
-After the first KCM socket is created using the socket call as described
-above, additional sockets for the multiplexor can be created by cloning
-a KCM socket. This is accomplished by an ioctl on a KCM socket:
-
-  /* From linux/kcm.h */
-  struct kcm_clone {
-        int fd;
-  };
-
-  struct kcm_clone info;
-  int err, newkcmfd = -1;
-
-  memset(&info, 0, sizeof(info));
-
-  /* On success the new KCM socket's descriptor is returned in info.fd */
-  err = ioctl(kcmfd, SIOCKCMCLONE, &info);
-
-  if (!err)
-        newkcmfd = info.fd;
-
-Attach transport sockets
-------------------------
-
-Transport sockets are attached to a multiplexor by calling an ioctl on a KCM
-socket for the multiplexor, e.g.:
-
-  /* From linux/kcm.h */
-  struct kcm_attach {
-        int fd;
-        int bpf_fd;
-  };
-
-  struct kcm_attach info;
-
-  memset(&info, 0, sizeof(info));
-
-  info.fd = tcpfd;
-  info.bpf_fd = bpf_prog_fd;
-
-  ioctl(kcmfd, SIOCKCMATTACH, &info);
-
-The kcm_attach structure contains:
-  fd: file descriptor for the TCP socket being attached
-  bpf_fd: file descriptor for the compiled BPF program that was loaded
-
-Unattach transport sockets
---------------------------
-
-Unattaching a transport socket from a multiplexor is straightforward. An
-"unattach" ioctl is done with the kcm_unattach structure as the argument:
-
-  /* From linux/kcm.h */
-  struct kcm_unattach {
-        int fd;
-  };
-
-  struct kcm_unattach info;
-
-  memset(&info, 0, sizeof(info));
-
-  /* tcpfd is the attached TCP socket, kcmfd a KCM socket of the multiplexor */
-  info.fd = tcpfd;
-
-  ioctl(kcmfd, SIOCKCMUNATTACH, &info);
-
-Disabling receive on KCM socket
--------------------------------
-
-A setsockopt is used to disable or enable receiving on a KCM socket.
-When receive is disabled, any pending messages in the socket's
-receive buffer are moved to other sockets. This feature is useful
-if an application thread knows that it will be doing a lot of
-work on a request and won't be able to service new messages for a
-while. Example use:
-
-  int val = 1;  /* nonzero disables receive, zero enables it again */
-
-  setsockopt(kcmfd, SOL_KCM, KCM_RECV_DISABLE, &val, sizeof(val));
-
-BPF programs for message delineation
-------------------------------------
-
-BPF programs can be compiled using the BPF LLVM backend. For example,
-the BPF program for parsing Thrift is:
-
-  #include "bpf.h" /* for __sk_buff */
-  #include "bpf_helpers.h" /* for load_word intrinsic */
-
-  SEC("socket_kcm")
-  int bpf_prog1(struct __sk_buff *skb)
-  {
-        /* 4-byte length field at offset 0, plus the 4 bytes of the field itself */
-        return load_word(skb, 0) + 4;
-  }
-
-  char _license[] SEC("license") = "GPL";
-
-Use in applications
-===================
-
-KCM accelerates application layer protocols. Specifically, it allows
-applications to use a message based interface for sending and receiving
-messages. The kernel provides necessary assurances that messages are sent
-and received atomically. This relieves much of the burden applications have
-in mapping a message based protocol onto the TCP stream. KCM also makes
-application layer messages a unit of work in the kernel for the purposes of
-steering and scheduling, which in turn allows a simpler networking model in
-multithreaded applications.
-
-Configurations
---------------
-
-In an Nx1 configuration, KCM logically provides multiple socket handles
-to the same TCP connection. This allows parallelism between I/O
-operations on the TCP socket (for instance copyin and copyout of data is
-parallelized). In an application, a KCM socket can be opened for each
-processing thread and added to an epoll set (similar to how SO_REUSEPORT
-is used to allow multiple listener sockets on the same port).
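One possible shape for the per-thread setup is sketched below, using the clone ioctl described under "Cloning KCM sockets". The helper name kcm_worker_socket is hypothetical, and the sketch assumes each worker thread owns its own epoll instance.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/ioctl.h>
#include <linux/sockios.h>	/* SIOCPROTOPRIVATE, base of the KCM ioctls */
#include <linux/kcm.h>

/*
 * Hypothetical helper: clone kcmfd for one worker thread and register the
 * clone for input events in that worker's epoll instance.
 * Returns the new KCM descriptor, or -1 with errno set.
 */
int kcm_worker_socket(int kcmfd, int epfd)
{
	struct kcm_clone info;
	struct epoll_event ev;

	memset(&info, 0, sizeof(info));
	if (ioctl(kcmfd, SIOCKCMCLONE, &info) < 0)
		return -1;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = info.fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, info.fd, &ev) < 0) {
		int saved = errno;	/* close() may overwrite errno */

		close(info.fd);
		errno = saved;
		return -1;
	}

	return info.fd;
}
```

Each worker would call this once at startup and then wait on its own epoll descriptor, so no userspace locking is needed around the shared TCP connection.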
-
-In an MxN configuration, multiple connections are established to the
-same destination. These are used for simple load balancing.
-
-Message batching
-----------------
-
-The primary purpose of KCM is load balancing between KCM sockets and hence
-threads in a nominal use case. Perfect load balancing, that is steering
-each received message to a different KCM socket or steering each sent
-message to a different TCP socket, can negatively impact performance
-since this doesn't allow for affinities to be established. Balancing
-based on groups, or batches of messages, can be beneficial for performance.
-
-On transmit, there are three ways an application can batch (pipeline)
-messages on a KCM socket.
-  1) Send multiple messages in a single sendmmsg.
-  2) Send a group of messages each with a sendmsg call, where all messages
-     except the last have MSG_BATCH in the flags of the sendmsg call.
-  3) Create a "super message" composed of multiple messages and send this
-     with a single sendmsg.
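Option 2 can be sketched roughly as follows. The helper names batch_flags and kcm_send_batch are made up for this example, and the fallback definition of MSG_BATCH is only there for C libraries whose headers do not yet provide it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#ifndef MSG_BATCH
#define MSG_BATCH 0x40000	/* value used by the Linux socket headers */
#endif

/* Flags for message idx out of count: MSG_BATCH on all but the last. */
int batch_flags(size_t idx, size_t count)
{
	assert(idx < count);
	return idx + 1 < count ? MSG_BATCH : 0;
}

/*
 * Hypothetical helper: send count messages with one sendmsg call each.
 * Every iovec entry describes one complete application message.
 * Returns 0 on success, -1 with errno set on the first failure.
 */
int kcm_send_batch(int kcmfd, struct iovec *msgs, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		struct msghdr mh;

		memset(&mh, 0, sizeof(mh));
		mh.msg_iov = &msgs[i];
		mh.msg_iovlen = 1;
		if (sendmsg(kcmfd, &mh, batch_flags(i, count)) < 0)
			return -1;
	}

	return 0;
}
```

Leaving MSG_BATCH off the final message is what tells the kernel that the batch is complete.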
-
-On receive, the KCM module attempts to queue messages received on the
-same KCM socket during each TCP ready callback. The targeted KCM socket
-changes at each receive ready callback on the KCM socket. The application
-does not need to configure this.
-
-Error handling
---------------
-
-An application should include a thread to monitor errors raised on
-the TCP connection. Normally, this will be done by placing each
-TCP socket attached to a KCM multiplexor in an epoll set for POLLERR
-events. If an error occurs on an attached TCP socket, KCM sets an EPIPE
-on the socket thus waking up the application thread. When the application
-sees the error (which may just be a disconnect) it should unattach the
-socket from KCM and then close it. It is assumed that once an error is
-posted on the TCP socket the data stream is unrecoverable (i.e. an error
-may have occurred in the middle of receiving a message).
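A monitoring thread along these lines might register each attached TCP socket as in the sketch below. The function names are hypothetical; EPOLLERR is the epoll counterpart of POLLERR, and SO_ERROR is the standard way to fetch the pending error once the thread wakes up.

```c
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: add tcpfd to the monitoring thread's epoll set.
 * Returns 0 on success, -1 with errno set.
 */
int watch_tcp_errors(int epfd, int tcpfd)
{
	struct epoll_event ev;

	memset(&ev, 0, sizeof(ev));
	/* epoll always reports these two; they are listed for clarity. */
	ev.events = EPOLLERR | EPOLLHUP;
	ev.data.fd = tcpfd;

	return epoll_ctl(epfd, EPOLL_CTL_ADD, tcpfd, &ev);
}

/*
 * Hypothetical helper: fetch and clear the pending error on tcpfd.
 * Returns the errno value, 0 if no error is pending, or -1 if the
 * query itself failed.
 */
int pending_socket_error(int tcpfd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(tcpfd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -1;

	return err;
}
```

After epoll_wait reports the descriptor, the thread would unattach the socket from KCM and close it, as described above.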
-
-TCP connection monitoring
--------------------------
-
-In KCM there is no means to correlate a message to the TCP socket that
-was used to send or receive the message (except in the case there is
-only one attached TCP socket). However, the application does retain
-an open file descriptor to the socket so it will be able to get statistics
-from the socket which can be used in detecting issues (such as high
-retransmissions on the socket).
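One way to collect such statistics is the TCP_INFO socket option, which is a general Linux TCP facility and not something KCM provides. The sketch below is an assumption about how an application might poll it; the helper name tcp_total_retrans is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes TCP_INFO under strict -std modes */
#endif
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: read the total retransmission count of a TCP socket.
 * Returns 0 and stores the count in *retrans, or -1 with errno set.
 */
int tcp_total_retrans(int tcpfd, unsigned int *retrans)
{
	struct tcp_info ti;
	socklen_t len = sizeof(ti);

	assert(retrans != NULL);
	memset(&ti, 0, sizeof(ti));
	if (getsockopt(tcpfd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0)
		return -1;

	*retrans = ti.tcpi_total_retrans;
	return 0;
}
```

Sampling this value periodically for every attached TCP socket lets the application spot a connection whose retransmissions climb faster than the others.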