author John Fastabend <john.fastabend@gmail.com> 2019-07-19 10:29:22 -0700
committer Daniel Borkmann <daniel@iogearbox.net> 2019-07-22 16:04:17 +0200
commit 95fa145479fbc0a0c1fd3274ceb42ec03c042a4a
tree 22b1ec2fdeaa59725aff3e18e3cb366bb6fb7bec /include
parent bpf: sockmap, only create entry if ulp is not already enabled
bpf: sockmap/tls, close can race with map free
When a map free is called while a socket is being closed in parallel, two paths can potentially reset the socket prot ops: the bpf close() path and the map free path. This raises the question of which prot ops the socket close side should use. If the map_free side completes first, we want to call the original lowest-level ops. However, if the tls path runs first, we want to call the sockmap ops. Additionally, there was no locking around prot updates in the TLS code paths, so the prot ops could be changed multiple times, once from the TLS path and again from the sockmap side, potentially leaving ops pointed at either TLS or sockmap when the psock and/or tls context have already been destroyed.

To fix this race, first only update ops inside the callback lock so that TLS, sockmap and the lowest level all agree on prot state. Second, add a ULP callback update() so that lower layers can inform the upper layer when they are being removed, allowing the upper layer to reset prot ops.

This gets us close to allowing sockmap and tls to be stacked in arbitrary order, but that patch is saved for the *next trees.

v4:
- make sure we don't free things for device;
- remove the checks which swap the callbacks back only if TLS is at the top.

Reported-by: syzbot+06537213db7ba2745c4a@syzkaller.appspotmail.com
Fixes: 02c558b2d5d6 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
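The update() idea described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name, the simplified `struct sock`, and the `ulp_lower` field are hypothetical stand-ins, and the body of the real `tcp_update_ulp()` is not part of this diff. The model also assumes the caller already holds whatever lock serializes prot updates (the callback lock in the commit message).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; NOT the kernel definitions. */
struct proto { const char *name; };
struct sock;

struct ulp_ops {
	/* Lower layer tells the ULP which ops now sit beneath it. */
	void (*update)(struct sock *sk, struct proto *p);
};

struct sock {
	struct proto *sk_prot;		/* ops the socket currently uses */
	void *ulp_data;			/* non-NULL when a ULP is attached */
	const struct ulp_ops *ulp_ops;
	struct proto *ulp_lower;	/* hypothetical: ops the ULP falls back to */
};

struct psock { struct proto *sk_proto; };	/* ops saved by the sockmap layer */

/* Hypothetical ULP callback: remember the new lower-layer ops. */
static void model_ulp_update(struct sock *sk, struct proto *p)
{
	sk->ulp_lower = p;
}

/* Assumed dispatch: forward to the ULP if it provides update(). */
static void model_update_ulp(struct sock *sk, struct proto *p)
{
	if (sk->ulp_ops && sk->ulp_ops->update)
		sk->ulp_ops->update(sk, p);
}

/* Mirrors the branch added to sk_psock_restore_proto() in this patch. */
static void model_restore_proto(struct sock *sk, struct psock *psock)
{
	bool has_ulp = sk->ulp_data != NULL;

	if (!psock->sk_proto)
		return;
	if (has_ulp)
		model_update_ulp(sk, psock->sk_proto);	/* ULP stays on top */
	else
		sk->sk_prot = psock->sk_proto;		/* plain restore */
	psock->sk_proto = NULL;
}

/* Runs both cases; returns 1 when every check passes. */
int model_demo(void)
{
	struct proto tcp = { "tcp" }, sockmap = { "sockmap" }, tls = { "tls" };
	const struct ulp_ops ops = { .update = model_ulp_update };
	int ulp_ctx = 0;

	/* Case 1: no ULP, so the saved ops go straight back on the socket. */
	struct sock plain = { .sk_prot = &sockmap };
	struct psock ps1 = { .sk_proto = &tcp };
	model_restore_proto(&plain, &ps1);
	assert(plain.sk_prot == &tcp);
	assert(ps1.sk_proto == NULL);

	/* Case 2: ULP on top; only its view of the lower layer changes. */
	struct sock stacked = { .sk_prot = &tls, .ulp_data = &ulp_ctx,
				.ulp_ops = &ops, .ulp_lower = &sockmap };
	struct psock ps2 = { .sk_proto = &tcp };
	model_restore_proto(&stacked, &ps2);
	assert(stacked.sk_prot == &tls);
	assert(stacked.ulp_lower == &tcp);
	assert(ps2.sk_proto == NULL);
	return 1;
}
```

Calling `model_demo()` from a `main()` exercises both cases. The point of the design is that when a ULP sits on top, the lower layer never overwrites `sk_prot` directly; it hands the restored ops to the upper layer and lets that layer decide what to install.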
2 files changed, 10 insertions, 1 deletions
diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 50ced8aba9db..e4b3fb4bb77c 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -354,7 +354,13 @@ static inline void sk_psock_restore_proto(struct sock *sk,
 	sk->sk_write_space = psock->saved_write_space;
 	if (psock->sk_proto) {
-		sk->sk_prot = psock->sk_proto;
+		struct inet_connection_sock *icsk = inet_csk(sk);
+		bool has_ulp = !!icsk->icsk_ulp_data;
+		if (has_ulp)
+			tcp_update_ulp(sk, psock->sk_proto);
+		else
+			sk->sk_prot = psock->sk_proto;
 		psock->sk_proto = NULL;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index f42d300f0cfa..c82a23470081 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2103,6 +2103,8 @@ struct tcp_ulp_ops {
 	/* initialize ulp */
 	int (*init)(struct sock *sk);
+	/* update ulp */
+	void (*update)(struct sock *sk, struct proto *p);
 	/* cleanup ulp */
 	void (*release)(struct sock *sk);
@@ -2114,6 +2116,7 @@ void tcp_unregister_ulp(struct tcp_ulp_ops *type);
 int tcp_set_ulp(struct sock *sk, const char *name);
 void tcp_get_available_ulp(char *buf, size_t len);
 void tcp_cleanup_ulp(struct sock *sk);
+void tcp_update_ulp(struct sock *sk, struct proto *p);
 #define MODULE_ALIAS_TCP_ULP(name) \
 	__MODULE_INFO(alias, alias_userspace, name); \