author Jack Morgenstein <jackm@dev.mellanox.co.il> 2009-09-05 20:24:50 -0700
committer Roland Dreier <rolandd@cisco.com> 2009-09-05 20:24:50 -0700
commit 3b4a8cd51e59c1c342c51b241bbb96c6ac24a147
tree f185d61b515a21e93159c2c6d50efd8ebf2ac7c7 /drivers/infiniband/hw/mthca
parent mlx4_core: Distinguish multiple devices in /proc/interrupts
IB/mlx4: Don't allow userspace open while recovering from catastrophic error
Userspace apps are supposed to release all IB device resources if they receive a fatal async event (IBV_EVENT_DEVICE_FATAL). However, the app has no way of knowing when the device has come back up, except to repeatedly attempt ibv_open_device() until it succeeds.

Currently there is no protection against the open succeeding while the device is being removed following the fatal event. In this case the open succeeds, but the device then waits in the middle of its removal until the new app releases its resources. The new app will not do so, since its open succeeded after the fatal event was generated and it therefore never received that event.

This patch adds an "active" flag to the device. The flag is set to false (in the fatal event flow) before the "fatal" event is generated, so any subsequent ibv_open_device() call to the device fails until the device comes back up, thus preventing the above deadlock.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
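
The gating described in the commit message can be illustrated with a small standalone C model. This is only a sketch, not the driver code from this patch: the `sketch_dev` structure, every `sketch_*` function, and the choice of `-EAGAIN` as the error returned to a rejected open are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not the real driver structure. */
struct sketch_dev {
    bool active;        /* false while recovering from a fatal error */
    int  open_contexts; /* userspace contexts still holding resources */
};

/* Fatal-error flow: clear the flag BEFORE the fatal event is generated. */
void sketch_fatal_event(struct sketch_dev *dev)
{
    dev->active = false;
    /* ...the fatal async event would be dispatched to userspace here... */
}

/* Open path: refuse new contexts while the device is not active.
 * The error code is an assumption of this sketch. */
int sketch_open(struct sketch_dev *dev)
{
    if (!dev->active)
        return -EAGAIN;
    dev->open_contexts++;
    return 0;
}

/* Close path: an app releasing its resources after the fatal event. */
void sketch_close(struct sketch_dev *dev)
{
    assert(dev->open_contexts > 0);
    dev->open_contexts--;
}

/* Removal can only complete once no context holds resources. */
bool sketch_can_finish_removal(const struct sketch_dev *dev)
{
    return dev->open_contexts == 0;
}

/* Called once the device has come back up after recovery. */
void sketch_restart_done(struct sketch_dev *dev)
{
    dev->active = true;
}
```

Because the flag is cleared before the event is sent, the set of open contexts can only shrink during removal, so the wait in `sketch_can_finish_removal()` ends once the existing apps react to the event. An app polling with ibv_open_device() keeps failing until the flag is set again.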
Diffstat (limited to 'drivers/infiniband/hw/mthca')
0 files changed, 0 insertions, 0 deletions