From ecd2ebdea350c40e73c00d400d74c8a09c072082 Mon Sep 17 00:00:00 2001
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Thu, 10 Jan 2008 21:21:20 -0800
Subject: [AX25] af_ax25: Possible circular locking.

Bernard Pidoux F6BVP reported:
> When I killall kissattach I can see the following message.
>
> This happens on kernel 2.6.24-rc5 already patched with the 6 previously
> patches I sent recently.
>
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.23.9 #1
> -------------------------------------------------------
> kissattach/2906 is trying to acquire lock:
>  (linkfail_lock){-+..}, at: [<d8bd4603>] ax25_link_failed+0x11/0x39 [ax25]
>
> but task is already holding lock:
>  (ax25_list_lock){-+..}, at: [<d8bd7c7c>] ax25_device_event+0x38/0x84
> [ax25]
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
...

lockdep is worried about the different order here:

#1 (rose_neigh_list_lock){-+..}:
#3 (ax25_list_lock){-+..}:

#0 (linkfail_lock){-+..}:
#1 (rose_neigh_list_lock){-+..}:

#3 (ax25_list_lock){-+..}:
#0 (linkfail_lock){-+..}:

So, ax25_list_lock could be taken before and after linkfail_lock.
I don't know if this three-thread clutch is very probable (or
possible at all), but it seems another bug reported by Bernard
("[...] system impossible to reboot with linux-2.6.24-rc5")
could have similar source - namely ax25_list_lock held by
ax25_kill_by_device() during ax25_disconnect(). It looks like the
only place which calls ax25_disconnect() this way, so I guess, it
isn't necessary.

This patch is breaking the lock for ax25_disconnect().

Reported-and-tested-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ax25/af_ax25.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

(limited to 'net/ax25')
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index ecb14ee00498..b4725ff317c0 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -87,10 +87,22 @@ static void ax25_kill_by_device(struct net_device *dev)
 		return;
 
 	spin_lock_bh(&ax25_list_lock);
+again:
 	ax25_for_each(s, node, &ax25_list) {
 		if (s->ax25_dev == ax25_dev) {
 			s->ax25_dev = NULL;
+			spin_unlock_bh(&ax25_list_lock);
 			ax25_disconnect(s, ENETUNREACH);
+			spin_lock_bh(&ax25_list_lock);
+
+			/* The entry could have been deleted from the
+			 * list meanwhile and thus the next pointer is
+			 * no longer valid.  Play it safe and restart
+			 * the scan.  Forward progress is ensured
+			 * because we set s->ax25_dev to NULL and we
+			 * are never passed a NULL 'dev' argument.
+			 */
+			goto again;
 		}
 	}
 	spin_unlock_bh(&ax25_list_lock);
-- 
cgit v1.2.3-59-g8ed1b