Discussion:
[PATCH net] net/mlx4_en: mlx4_en_netpoll shouldn't call napi_schedule when port is down
Amir Vadai
2014-09-29 11:04:55 UTC
Permalink
From: Ido Shamay <***@mellanox.com>

mlx4_en_netpoll, which is mlx4_en ndo_poll_controller callback,
might be called when port is down, causing a napi_schedule when
napi->poll callback in NULL. mutex_trylock is needed to acquire
the port_state lock, since other threads may grab it and stop
the port while we are in napi scheduling. Using trylock since in atomic
context.

Signed-off-by: Ido Shamay <***@mellanox.com>
Signed-off-by: Amir Vadai <***@mellanox.com>
---

Hi Dave,

Please push this commit to -stable

Thanks,
Amir

drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index abddcf8..e243f1c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1272,13 +1272,23 @@ out:
static void mlx4_en_netpoll(struct net_device *dev)
{
struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
struct mlx4_en_cq *cq;
int i;

+ if (!mutex_trylock(&mdev->state_lock))
+ return;
+
+ if (!priv->port_up)
+ goto out;
+
for (i = 0; i < priv->rx_ring_num; i++) {
cq = priv->rx_cq[i];
napi_schedule(&cq->napi);
}
+
+out:
+ mutex_unlock(&mdev->state_lock);
}
#endif
--
1.8.3.4
Cong Wang
2014-09-29 16:47:30 UTC
Permalink
Post by Amir Vadai
mlx4_en_netpoll, which is mlx4_en ndo_poll_controller callback,
might be called when port is down, causing a napi_schedule when
napi->poll callback in NULL. mutex_trylock is needed to acquire
the port_state lock, since other threads may grab it and stop
the port while we are in napi scheduling. Using trylock since in atomic
context.
Are you sure it's safe? Its comment says:

* This function must not be used in interrupt context. The
* mutex must be released by the same task that acquired it.

Did you test it with LOCKDEP enabled?
David Miller
2014-09-29 17:47:04 UTC
Permalink
From: Amir Vadai <***@mellanox.com>
Date: Mon, 29 Sep 2014 14:04:55 +0300
Post by Amir Vadai
mlx4_en_netpoll, which is mlx4_en ndo_poll_controller callback,
might be called when port is down, causing a napi_schedule when
napi->poll callback in NULL. mutex_trylock is needed to acquire
the port_state lock, since other threads may grab it and stop
the port while we are in napi scheduling. Using trylock since in atomic
context.
I don't think netpoll should 'variably succeed' on poll controller's
ability to immediately take a trylock.

You need to find a way to make this code always process the queues
when it is called, the approach you are taking here is not a correct.

If you've designed things in such a way that the locking here is
difficult, that's unfortunate but you still have to make this
function behave properly.
Ido Shamay
2014-10-20 15:59:44 UTC
Permalink
Post by David Miller
Date: Mon, 29 Sep 2014 14:04:55 +0300
Post by Amir Vadai
mlx4_en_netpoll, which is mlx4_en ndo_poll_controller callback,
might be called when port is down, causing a napi_schedule when
napi->poll callback in NULL. mutex_trylock is needed to acquire
the port_state lock, since other threads may grab it and stop
the port while we are in napi scheduling. Using trylock since in atomic
context.
I don't think netpoll should 'variably succeed' on poll controller's
ability to immediately take a trylock.
You need to find a way to make this code always process the queues
when it is called, the approach you are taking here is not a correct.
If you've designed things in such a way that the locking here is
difficult, that's unfortunate but you still have to make this
function behave properly.
Sorry about that, aborting this commit since the problem was already
fixed in commit 3484aac1, which added netif_device_detach to driver port
stop flow and that made netconsole not to call ndo_poll_controller callback.

Thanks.
Post by David Miller
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Loading...