From 32857cf57f920cdc03b5095f08febec94cf9c36b Mon Sep 17 00:00:00 2001 From: John Fastabend Date: Fri, 19 Jul 2019 10:29:18 -0700 Subject: net/tls: fix transition through disconnect with close It is possible (via shutdown()) for TCP socks to go through TCP_CLOSE state via tcp_disconnect() without actually calling tcp_close which would then call the tls close callback. Because of this a user could disconnect a socket then put it in a LISTEN state which would break our assumptions about sockets always being ESTABLISHED state. More directly because close() can call unhash() and unhash is implemented by sockmap if a sockmap socket has TLS enabled we can incorrectly destroy the psock from unhash() and then call its close handler again. But because the psock (sockmap socket representation) is already destroyed we call close handler in sk->prot. However, in some cases (TLS BASE/BASE case) this will still point at the sockmap close handler resulting in a circular call and crash reported by syzbot. To fix both above issues implement the unhash() routine for TLS. v4: - add note about tls offload still needing the fix; - move sk_proto to the cold cache line; - split TX context free into "release" and "free", otherwise the GC work itself is in already freed memory; - more TX before RX for consistency; - reuse tls_ctx_free(); - schedule the GC work after we're done with context to avoid UAF; - don't set the unhash in all modes, all modes "inherit" TLS_BASE's callbacks anyway; - disable the unhash hook for TLS_HW. Fixes: 3c4d7559159bf ("tls: kernel TLS support") Reported-by: Eric Dumazet Signed-off-by: John Fastabend Signed-off-by: Jakub Kicinski Signed-off-by: Daniel Borkmann --- Documentation/networking/tls-offload.rst | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst index 048e5ca44824..8a1eeb393316 100644 --- a/Documentation/networking/tls-offload.rst +++ b/Documentation/networking/tls-offload.rst @@ -513,3 +513,9 @@ Redirects leak clear text In the RX direction, if segment has already been decrypted by the device and it gets redirected or mirrored - clear text will be transmitted out. + +shutdown() doesn't clear TLS state +---------------------------------- + +shutdown() system call allows for a TLS socket to be reused as a different +connection. Offload doesn't currently handle that. -- cgit v1.2.3-59-g8ed1b From 280c089916228a005af7f95c1716ea1fea1027b5 Mon Sep 17 00:00:00 2001 From: Tariq Toukan Date: Mon, 22 Jul 2019 13:43:03 +0300 Subject: Documentation: TLS: fix stat counters description Add missing description of counters. Split tx_tls_encrypted counter into two, to give packets and bytes indications. Fixes: f42c104f2ec9 ("Documentation: add TLS offload documentation") Suggested-by: Jakub Kicinski Signed-off-by: Tariq Toukan Signed-off-by: Saeed Mahameed --- Documentation/networking/tls-offload.rst | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst index 048e5ca44824..b70b70dc4524 100644 --- a/Documentation/networking/tls-offload.rst +++ b/Documentation/networking/tls-offload.rst @@ -424,13 +424,24 @@ Statistics Following minimum set of TLS-related statistics should be reported by the driver: - * ``rx_tls_decrypted`` - number of successfully decrypted TLS segments - * ``tx_tls_encrypted`` - number of in-order TLS segments passed to device - for encryption + * ``rx_tls_decrypted_packets`` - number of successfully decrypted RX packets + which were part of a TLS stream. + * ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets + which were successfully decrypted. + * ``tx_tls_encrypted_packets`` - number of TX packets passed to the device + for encryption of their TLS payload. + * ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets + passed to the device for encryption. + * ``tx_tls_ctx`` - number of TLS TX HW offload contexts added to device for + encryption. * ``tx_tls_ooo`` - number of TX packets which were part of a TLS stream - but did not arrive in the expected order - * ``tx_tls_drop_no_sync_data`` - number of TX packets dropped because - they arrived out of order and associated record could not be found + but did not arrive in the expected order. + * ``tx_tls_drop_no_sync_data`` - number of TX packets which were part of + a TLS stream dropped, because they arrived out of order and associated + record could not be found. + * ``tx_tls_drop_bypass_req`` - number of TX packets which were part of a TLS + stream dropped, because they contain both data that has been encrypted by + software and data that expects hardware crypto offload. Notable corner cases, exceptions and additional requirements ============================================================ -- cgit v1.2.3-59-g8ed1b From 5d92e631b8be8965a90c144320f06e096081a551 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Thu, 1 Aug 2019 14:36:01 -0700 Subject: net/tls: partially revert fix transition through disconnect with close Looks like we were slightly overzealous with the shutdown() cleanup. Even though the sock->sk_state can reach CLOSED again, socket->state will not got back to SS_UNCONNECTED once connections is ESTABLISHED. Meaning we will see EISCONN if we try to reconnect, and EINVAL if we try to listen. Only listen sockets can be shutdown() and reused, but since ESTABLISHED sockets can never be re-connected() or used for listen() we don't need to try to clean up the ULP state early. Fixes: 32857cf57f92 ("net/tls: fix transition through disconnect with close") Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller --- Documentation/networking/tls-offload.rst | 6 ---- include/net/tls.h | 2 -- net/tls/tls_main.c | 55 -------------------------------- 3 files changed, 63 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst index 2d9f9ebf4117..b70b70dc4524 100644 --- a/Documentation/networking/tls-offload.rst +++ b/Documentation/networking/tls-offload.rst @@ -524,9 +524,3 @@ Redirects leak clear text In the RX direction, if segment has already been decrypted by the device and it gets redirected or mirrored - clear text will be transmitted out. - -shutdown() doesn't clear TLS state ----------------------------------- - -shutdown() system call allows for a TLS socket to be reused as a different -connection. Offload doesn't currently handle that. diff --git a/include/net/tls.h b/include/net/tls.h index 9e425ac2de45..41b2d41bb1b8 100644 --- a/include/net/tls.h +++ b/include/net/tls.h @@ -290,8 +290,6 @@ struct tls_context { struct list_head list; refcount_t refcount; - - struct work_struct gc; }; enum tls_offload_ctx_dir { diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index f208f8455ef2..9cbbae606ced 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -261,33 +261,6 @@ void tls_ctx_free(struct tls_context *ctx) kfree(ctx); } -static void tls_ctx_free_deferred(struct work_struct *gc) -{ - struct tls_context *ctx = container_of(gc, struct tls_context, gc); - - /* Ensure any remaining work items are completed. The sk will - * already have lost its tls_ctx reference by the time we get - * here so no xmit operation will actually be performed. - */ - if (ctx->tx_conf == TLS_SW) { - tls_sw_cancel_work_tx(ctx); - tls_sw_free_ctx_tx(ctx); - } - - if (ctx->rx_conf == TLS_SW) { - tls_sw_strparser_done(ctx); - tls_sw_free_ctx_rx(ctx); - } - - tls_ctx_free(ctx); -} - -static void tls_ctx_free_wq(struct tls_context *ctx) -{ - INIT_WORK(&ctx->gc, tls_ctx_free_deferred); - schedule_work(&ctx->gc); -} - static void tls_sk_proto_cleanup(struct sock *sk, struct tls_context *ctx, long timeo) { @@ -315,29 +288,6 @@ static void tls_sk_proto_cleanup(struct sock *sk, #endif } -static void tls_sk_proto_unhash(struct sock *sk) -{ - struct inet_connection_sock *icsk = inet_csk(sk); - long timeo = sock_sndtimeo(sk, 0); - struct tls_context *ctx; - - if (unlikely(!icsk->icsk_ulp_data)) { - if (sk->sk_prot->unhash) - sk->sk_prot->unhash(sk); - } - - ctx = tls_get_ctx(sk); - tls_sk_proto_cleanup(sk, ctx, timeo); - write_lock_bh(&sk->sk_callback_lock); - icsk->icsk_ulp_data = NULL; - sk->sk_prot = ctx->sk_proto; - write_unlock_bh(&sk->sk_callback_lock); - - if (ctx->sk_proto->unhash) - ctx->sk_proto->unhash(sk); - tls_ctx_free_wq(ctx); -} - static void tls_sk_proto_close(struct sock *sk, long timeout) { struct inet_connection_sock *icsk = inet_csk(sk); @@ -786,7 +736,6 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG], prot[TLS_BASE][TLS_BASE].setsockopt = tls_setsockopt; prot[TLS_BASE][TLS_BASE].getsockopt = tls_getsockopt; prot[TLS_BASE][TLS_BASE].close = tls_sk_proto_close; - prot[TLS_BASE][TLS_BASE].unhash = tls_sk_proto_unhash; prot[TLS_SW][TLS_BASE] = prot[TLS_BASE][TLS_BASE]; prot[TLS_SW][TLS_BASE].sendmsg = tls_sw_sendmsg; @@ -804,20 +753,16 @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG], #ifdef CONFIG_TLS_DEVICE prot[TLS_HW][TLS_BASE] = prot[TLS_BASE][TLS_BASE]; - prot[TLS_HW][TLS_BASE].unhash = base->unhash; prot[TLS_HW][TLS_BASE].sendmsg = tls_device_sendmsg; prot[TLS_HW][TLS_BASE].sendpage = tls_device_sendpage; prot[TLS_HW][TLS_SW] = prot[TLS_BASE][TLS_SW]; - prot[TLS_HW][TLS_SW].unhash = base->unhash; prot[TLS_HW][TLS_SW].sendmsg = tls_device_sendmsg; prot[TLS_HW][TLS_SW].sendpage = tls_device_sendpage; prot[TLS_BASE][TLS_HW] = prot[TLS_BASE][TLS_SW]; - prot[TLS_BASE][TLS_HW].unhash = base->unhash; prot[TLS_SW][TLS_HW] = prot[TLS_SW][TLS_SW]; - prot[TLS_SW][TLS_HW].unhash = base->unhash; prot[TLS_HW][TLS_HW] = prot[TLS_HW][TLS_SW]; #endif -- cgit v1.2.3-59-g8ed1b