Discussion:
[PATCH] sfc: efx: add support for skb->xmit_more
Daniel Borkmann
2014-10-13 16:59:45 UTC
Permalink
Add support for xmit_more batching: efx has BQL support, thus we need
to only move the queue hang check before the hw tail pointer write
and test for xmit_more bit resp. whether the queue/stack has stopped.
Thanks to Nikolay Aleksandrov who had a Solarflare NIC to test!

Signed-off-by: Daniel Borkmann <***@redhat.com>
Acked-by: Nikolay Aleksandrov <***@redhat.com>
Cc: Shradha Shah <***@solarflare.com>
---
drivers/net/ethernet/sfc/tx.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..566432c 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -439,13 +439,13 @@ finish_packet:

netdev_tx_sent_queue(tx_queue->core_txq, skb->len);

+ efx_tx_maybe_stop_queue(tx_queue);
+
/* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);

tx_queue->tx_packets++;
-
- efx_tx_maybe_stop_queue(tx_queue);
-
return NETDEV_TX_OK;

dma_err:
@@ -1308,11 +1308,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,

netdev_tx_sent_queue(tx_queue->core_txq, skb->len);

- /* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
-
efx_tx_maybe_stop_queue(tx_queue);

+ /* Pass off to hardware */
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);
+
tx_queue->tso_bursts++;
return NETDEV_TX_OK;
--
1.7.11.7
Edward Cree
2014-10-13 18:02:22 UTC
Permalink
Post by Daniel Borkmann
Add support for xmit_more batching: efx has BQL support, thus we need
to only move the queue hang check before the hw tail pointer write
and test for xmit_more bit resp. whether the queue/stack has stopped.
Thanks to Nikolay Aleksandrov who had a Solarflare NIC to test!
I see two issues here.

1) The error handling path (labels dma_err: and err:) will unwind back
to tx_queue->insert_count, which (AFAICT) only gets updated by
efx_nic_push_buffers() (at least on EF10, where that calls
efx_ef10_tx_write()).
So if we transmit a bunch of skbs with xmit_more, then get an error, we
will unwind all the way back to the start, whereas (again, AFAICT) the
previous skbs were nicely completed and safe to send.
The unwind will leave us in a good state, but we'll have failed to send
some packets that there was nothing wrong with.
I think the appropriate fix for this would be to maintain two separate
insert_counts, as xmit_more means we can't conflate "last write pointer
pushed" and "write pointer at start of this skb" any longer.

2) I think we shouldn't do PIO when xmit_more is set - it's probably bad
for performance (though I haven't measured), and I'm not entirely
convinced it will behave correctly. I think
efx_nic_tx_is_empty(tx_queue) will return true if we have xmit_more'd a
PIO packet, enabling us to potentially PIO the next packet as well, thus
overwriting the PIO buffer before the NIC's had a chance to read it.
I'm also unsure about what happens to TX Push (efx_ef10_push_tx_desc),
for similar reasons.
So I think TX PIO and TX Push both need to be suppressed when
skb->xmit_more (as a correctness issue), and should also be suppressed
on the !xmit_more skb that 'uncorks' a previous row of xmit_mores - as a
performance issue at the least, and for TX Push I'm also unsure about
the correctness (though I haven't worked that one through in detail).

Until these issues are addressed, I have to NAK this. Tomorrow I'll try
to rustle up an rfc patch that handles these issues, unless someone
beats me to it.

-Edward
Post by Daniel Borkmann
---
drivers/net/ethernet/sfc/tx.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..566432c 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
+ efx_tx_maybe_stop_queue(tx_queue);
+
/* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);
tx_queue->tx_packets++;
-
- efx_tx_maybe_stop_queue(tx_queue);
-
return NETDEV_TX_OK;
@@ -1308,11 +1308,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
- /* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
-
efx_tx_maybe_stop_queue(tx_queue);
+ /* Pass off to hardware */
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);
+
tx_queue->tso_bursts++;
return NETDEV_TX_OK;
Daniel Borkmann
2014-10-13 19:18:32 UTC
Permalink
On 10/13/2014 08:02 PM, Edward Cree wrote:
...
Post by Edward Cree
Until these issues are addressed, I have to NAK this. Tomorrow I'll try
to rustle up an rfc patch that handles these issues, unless someone
beats me to it.
Thanks for your feedback! Yes, please feel free to take this onwards and
integrate/adapt it to your needs for sfc's xmit_more support.

Thanks a lot,
Daniel
Edward Cree
2014-10-14 18:41:37 UTC
Permalink
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.

Signed-off-by: Edward Cree <***@solarflare.com>
---

drivers/net/ethernet/sfc/nic.h | 29 +++++++++++++++++++++++-----
drivers/net/ethernet/sfc/tx.c | 43 +++++++++++++++++-------------------------
2 files changed, 41 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 60f8514..f77cce0 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -71,9 +71,17 @@ efx_tx_desc(struct efx_tx_queue *tx_queue, unsigned int index)
return ((efx_qword_t *) (tx_queue->txd.buf.addr)) + index;
}

-/* Report whether the NIC considers this TX queue empty, given the
- * write_count used for the last doorbell push. May return false
- * negative.
+/* Get partner of a TX queue, seen as part of the same net core queue */
+static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
+{
+ if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
+ return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
+ else
+ return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
+}
+
+/* Report whether this TX queue would be empty for the given write_count.
+ * May return false negative.
*/
static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
unsigned int write_count)
@@ -86,9 +94,18 @@ static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
return ((empty_read_count ^ write_count) & ~EFX_EMPTY_COUNT_VALID) == 0;
}

-static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
+/* Decide whether we can use TX PIO, ie. write packet data directly into
+ * a buffer on the device. This can reduce latency at the expense of
+ * throughput, so we only do this if both hardware and software TX rings
+ * are empty. This also ensures that only one packet at a time can be
+ * using the PIO buffer.
+ */
+static inline bool efx_nic_may_tx_pio(struct efx_tx_queue *tx_queue)
{
- return __efx_nic_tx_is_empty(tx_queue, tx_queue->write_count);
+ struct efx_tx_queue *partner = efx_tx_queue_partner(tx_queue);
+ return tx_queue->piobuf &&
+ __efx_nic_tx_is_empty(tx_queue, tx_queue->insert_count) &&
+ __efx_nic_tx_is_empty(partner, partner->insert_count);
}

/* Decide whether to push a TX descriptor to the NIC vs merely writing
@@ -96,6 +113,8 @@ static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
* descriptor to an empty queue, but is otherwise pointless. Further,
* Falcon and Siena have hardware bugs (SF bug 33851) that may be
* triggered if we don't check this.
+ * We use the write_count used for the last doorbell push, to get the
+ * NIC's view of the tx queue.
*/
static inline bool efx_nic_may_push_tx_desc(struct efx_tx_queue *tx_queue,
unsigned int write_count)
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..aaf2987 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -132,15 +132,6 @@ unsigned int efx_tx_max_skb_descs(struct efx_nic *efx)
return max_descs;
}

-/* Get partner of a TX queue, seen as part of the same net core queue */
-static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
-{
- if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
- return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
- else
- return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
-}
-
static void efx_tx_maybe_stop_queue(struct efx_tx_queue *txq1)
{
/* We need to consider both queues that the net core sees as one */
@@ -344,6 +335,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
struct efx_nic *efx = tx_queue->efx;
struct device *dma_dev = &efx->pci_dev->dev;
struct efx_tx_buffer *buffer;
+ unsigned int old_insert_count = tx_queue->insert_count;
skb_frag_t *fragment;
unsigned int len, unmap_len = 0;
dma_addr_t dma_addr, unmap_addr = 0;
@@ -351,8 +343,6 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
unsigned short dma_flags;
int i = 0;

- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
if (skb_shinfo(skb)->gso_size)
return efx_enqueue_skb_tso(tx_queue, skb);

@@ -369,9 +359,8 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)

/* Consider using PIO for short packets */
#ifdef EFX_USE_PIO
- if (skb->len <= efx_piobuf_size && tx_queue->piobuf &&
- efx_nic_tx_is_empty(tx_queue) &&
- efx_nic_tx_is_empty(efx_tx_queue_partner(tx_queue))) {
+ if (skb->len <= efx_piobuf_size && !skb->xmit_more &&
+ efx_nic_may_tx_pio(tx_queue)) {
buffer = efx_enqueue_skb_pio(tx_queue, skb);
dma_flags = EFX_TX_BUF_OPTION;
goto finish_packet;
@@ -439,13 +428,14 @@ finish_packet:

netdev_tx_sent_queue(tx_queue->core_txq, skb->len);

+ efx_tx_maybe_stop_queue(tx_queue);
+
/* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);

tx_queue->tx_packets++;

- efx_tx_maybe_stop_queue(tx_queue);
-
return NETDEV_TX_OK;

dma_err:
@@ -458,7 +448,7 @@ finish_packet:
dev_kfree_skb_any(skb);

/* Work backwards until we hit the original insert pointer value */
- while (tx_queue->insert_count != tx_queue->write_count) {
+ while (tx_queue->insert_count != old_insert_count) {
unsigned int pkts_compl = 0, bytes_compl = 0;
--tx_queue->insert_count;
buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
@@ -989,12 +979,13 @@ static int efx_tso_put_header(struct efx_tx_queue *tx_queue,
/* Remove buffers put into a tx_queue. None of the buffers must have
* an skb attached.
*/
-static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue)
+static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue,
+ unsigned int insert_count)
{
struct efx_tx_buffer *buffer;

/* Work backwards until we hit the original insert pointer value */
- while (tx_queue->insert_count != tx_queue->write_count) {
+ while (tx_queue->insert_count != insert_count) {
--tx_queue->insert_count;
buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
efx_dequeue_buffer(tx_queue, buffer, NULL, NULL);
@@ -1258,14 +1249,13 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
struct sk_buff *skb)
{
struct efx_nic *efx = tx_queue->efx;
+ unsigned int old_insert_count = tx_queue->insert_count;
int frag_i, rc;
struct tso_state state;

/* Find the packet protocol and sanity-check it */
state.protocol = efx_tso_check_protocol(skb);

- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
rc = tso_start(&state, efx, skb);
if (rc)
goto mem_err;
@@ -1308,11 +1298,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,

netdev_tx_sent_queue(tx_queue->core_txq, skb->len);

- /* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
-
efx_tx_maybe_stop_queue(tx_queue);

+ /* Pass off to hardware */
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);
+
tx_queue->tso_bursts++;
return NETDEV_TX_OK;

@@ -1336,6 +1327,6 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
dma_unmap_single(&efx->pci_dev->dev, state.header_dma_addr,
state.header_unmap_len, DMA_TO_DEVICE);

- efx_enqueue_unwind(tx_queue);
+ efx_enqueue_unwind(tx_queue, old_insert_count);
return NETDEV_TX_OK;
}
--
1.7.11.7
David Miller
2014-10-14 21:15:01 UTC
Permalink
From: Edward Cree <***@solarflare.com>
Date: Tue, 14 Oct 2014 19:41:37 +0100
Post by Edward Cree
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
This looks good to me, mind if I apply this now?
Edward Cree
2014-10-15 11:05:35 UTC
Permalink
Post by David Miller
Date: Tue, 14 Oct 2014 19:41:37 +0100
Post by Edward Cree
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
This looks good to me, mind if I apply this now?
I'd rather wait until Jon Cooper's had a chance to look at it; he
understands our TX path better than I.

-Edward
David Miller
2014-10-15 16:20:34 UTC
Permalink
From: Edward Cree <***@solarflare.com>
Date: Wed, 15 Oct 2014 12:05:35 +0100
Post by Edward Cree
Post by David Miller
Date: Tue, 14 Oct 2014 19:41:37 +0100
Post by Edward Cree
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
This looks good to me, mind if I apply this now?
I'd rather wait until Jon Cooper's had a chance to look at it; he
understands our TX path better than I.
Ok.
Jonathan Cooper
2014-10-16 15:42:38 UTC
Permalink
Post by David Miller
Date: Wed, 15 Oct 2014 12:05:35 +0100
Post by Edward Cree
Post by David Miller
Date: Tue, 14 Oct 2014 19:41:37 +0100
Post by Edward Cree
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
This looks good to me, mind if I apply this now?
I'd rather wait until Jon Cooper's had a chance to look at it; he
understands our TX path better than I.
Ok.
Looks good to me.

Acked-by: Jon Cooper <***@solarflare.com>
Robert Stonehouse
2014-10-16 16:36:24 UTC
Permalink
Post by Edward Cree
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
---
@@ -351,8 +343,6 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
unsigned short dma_flags;
int i = 0;
- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
if (skb_shinfo(skb)->gso_size)
return efx_enqueue_skb_tso(tx_queue, skb);
Would it be possible to keep a weaker version of this check i.e.
EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);
Post by Edward Cree
@@ -1258,14 +1249,13 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
struct sk_buff *skb)
{
struct efx_nic *efx = tx_queue->efx;
+ unsigned int old_insert_count = tx_queue->insert_count;
int frag_i, rc;
struct tso_state state;
/* Find the packet protocol and sanity-check it */
state.protocol = efx_tso_check_protocol(skb);
- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
rc = tso_start(&state, efx, skb);
if (rc)
goto mem_err;
The same would apply here.

Thanks
Rob
Edward Cree
2014-10-17 14:32:25 UTC
Permalink
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.

Signed-off-by: Edward Cree <***@solarflare.com>
---
drivers/net/ethernet/sfc/nic.h | 29 +++++++++++++++++++++++-----
drivers/net/ethernet/sfc/tx.c | 43 +++++++++++++++++++-----------------------
2 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 60f8514..f77cce0 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -71,9 +71,17 @@ efx_tx_desc(struct efx_tx_queue *tx_queue, unsigned int index)
return ((efx_qword_t *) (tx_queue->txd.buf.addr)) + index;
}

-/* Report whether the NIC considers this TX queue empty, given the
- * write_count used for the last doorbell push. May return false
- * negative.
+/* Get partner of a TX queue, seen as part of the same net core queue */
+static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
+{
+ if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
+ return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
+ else
+ return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
+}
+
+/* Report whether this TX queue would be empty for the given write_count.
+ * May return false negative.
*/
static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
unsigned int write_count)
@@ -86,9 +94,18 @@ static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
return ((empty_read_count ^ write_count) & ~EFX_EMPTY_COUNT_VALID) == 0;
}

-static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
+/* Decide whether we can use TX PIO, ie. write packet data directly into
+ * a buffer on the device. This can reduce latency at the expense of
+ * throughput, so we only do this if both hardware and software TX rings
+ * are empty. This also ensures that only one packet at a time can be
+ * using the PIO buffer.
+ */
+static inline bool efx_nic_may_tx_pio(struct efx_tx_queue *tx_queue)
{
- return __efx_nic_tx_is_empty(tx_queue, tx_queue->write_count);
+ struct efx_tx_queue *partner = efx_tx_queue_partner(tx_queue);
+ return tx_queue->piobuf &&
+ __efx_nic_tx_is_empty(tx_queue, tx_queue->insert_count) &&
+ __efx_nic_tx_is_empty(partner, partner->insert_count);
}

/* Decide whether to push a TX descriptor to the NIC vs merely writing
@@ -96,6 +113,8 @@ static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
* descriptor to an empty queue, but is otherwise pointless. Further,
* Falcon and Siena have hardware bugs (SF bug 33851) that may be
* triggered if we don't check this.
+ * We use the write_count used for the last doorbell push, to get the
+ * NIC's view of the tx queue.
*/
static inline bool efx_nic_may_push_tx_desc(struct efx_tx_queue *tx_queue,
unsigned int write_count)
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..ee84a90 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -132,15 +132,6 @@ unsigned int efx_tx_max_skb_descs(struct efx_nic *efx)
return max_descs;
}

-/* Get partner of a TX queue, seen as part of the same net core queue */
-static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
-{
- if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
- return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
- else
- return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
-}
-
static void efx_tx_maybe_stop_queue(struct efx_tx_queue *txq1)
{
/* We need to consider both queues that the net core sees as one */
@@ -344,6 +335,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
struct efx_nic *efx = tx_queue->efx;
struct device *dma_dev = &efx->pci_dev->dev;
struct efx_tx_buffer *buffer;
+ unsigned int old_insert_count = tx_queue->insert_count;
skb_frag_t *fragment;
unsigned int len, unmap_len = 0;
dma_addr_t dma_addr, unmap_addr = 0;
@@ -351,7 +343,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
unsigned short dma_flags;
int i = 0;

- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
+ EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);

if (skb_shinfo(skb)->gso_size)
return efx_enqueue_skb_tso(tx_queue, skb);
@@ -369,9 +361,8 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)

/* Consider using PIO for short packets */
#ifdef EFX_USE_PIO
- if (skb->len <= efx_piobuf_size && tx_queue->piobuf &&
- efx_nic_tx_is_empty(tx_queue) &&
- efx_nic_tx_is_empty(efx_tx_queue_partner(tx_queue))) {
+ if (skb->len <= efx_piobuf_size && !skb->xmit_more &&
+ efx_nic_may_tx_pio(tx_queue)) {
buffer = efx_enqueue_skb_pio(tx_queue, skb);
dma_flags = EFX_TX_BUF_OPTION;
goto finish_packet;
@@ -439,13 +430,14 @@ finish_packet:

netdev_tx_sent_queue(tx_queue->core_txq, skb->len);

+ efx_tx_maybe_stop_queue(tx_queue);
+
/* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);

tx_queue->tx_packets++;

- efx_tx_maybe_stop_queue(tx_queue);
-
return NETDEV_TX_OK;

dma_err:
@@ -458,7 +450,7 @@ finish_packet:
dev_kfree_skb_any(skb);

/* Work backwards until we hit the original insert pointer value */
- while (tx_queue->insert_count != tx_queue->write_count) {
+ while (tx_queue->insert_count != old_insert_count) {
unsigned int pkts_compl = 0, bytes_compl = 0;
--tx_queue->insert_count;
buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
@@ -989,12 +981,13 @@ static int efx_tso_put_header(struct efx_tx_queue *tx_queue,
/* Remove buffers put into a tx_queue. None of the buffers must have
* an skb attached.
*/
-static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue)
+static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue,
+ unsigned int insert_count)
{
struct efx_tx_buffer *buffer;

/* Work backwards until we hit the original insert pointer value */
- while (tx_queue->insert_count != tx_queue->write_count) {
+ while (tx_queue->insert_count != insert_count) {
--tx_queue->insert_count;
buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
efx_dequeue_buffer(tx_queue, buffer, NULL, NULL);
@@ -1258,13 +1251,14 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
struct sk_buff *skb)
{
struct efx_nic *efx = tx_queue->efx;
+ unsigned int old_insert_count = tx_queue->insert_count;
int frag_i, rc;
struct tso_state state;

/* Find the packet protocol and sanity-check it */
state.protocol = efx_tso_check_protocol(skb);

- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
+ EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);

rc = tso_start(&state, efx, skb);
if (rc)
@@ -1308,11 +1302,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,

netdev_tx_sent_queue(tx_queue->core_txq, skb->len);

- /* Pass off to hardware */
- efx_nic_push_buffers(tx_queue);
-
efx_tx_maybe_stop_queue(tx_queue);

+ /* Pass off to hardware */
+ if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+ efx_nic_push_buffers(tx_queue);
+
tx_queue->tso_bursts++;
return NETDEV_TX_OK;

@@ -1336,6 +1331,6 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
dma_unmap_single(&efx->pci_dev->dev, state.header_dma_addr,
state.header_unmap_len, DMA_TO_DEVICE);

- efx_enqueue_unwind(tx_queue);
+ efx_enqueue_unwind(tx_queue, old_insert_count);
return NETDEV_TX_OK;
}
--
1.7.11.7
David Miller
2014-10-18 03:47:43 UTC
Permalink
From: Edward Cree <***@solarflare.com>
Date: Fri, 17 Oct 2014 15:32:25 +0100
Post by Edward Cree
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
Applied, thanks.
Ben Hutchings
2014-10-22 08:58:43 UTC
Permalink
On Fri, 2014-10-17 at 15:32 +0100, Edward Cree wrote:
[...]
Post by Edward Cree
@@ -351,7 +343,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
unsigned short dma_flags;
int i = 0;
- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
+ EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);
[...]

Doesn't this break after 2^32 descriptors? It seems like you would need
a similar comparison to time_after(); possibly:

EFX_BUG_ON_PARANOID((int)(tx_queue->write_count - tx_queue->insert_count) > 0);

Ben.
--
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison
Edward Cree
2014-10-22 12:36:21 UTC
Permalink
Post by Ben Hutchings
[...]
Post by Edward Cree
@@ -351,7 +343,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
unsigned short dma_flags;
int i = 0;
- EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
+ EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count);
[...]
Doesn't this break after 2^32 descriptors? It seems like you would need
EFX_BUG_ON_PARANOID((int)(tx_queue->write_count - tx_queue->insert_count) > 0);
Ben.
Yes, Jon already spotted that. We just diked it out - see "[PATCH net]
sfc: remove incorrect EFX_BUG_ON_PARANOID check"
http://patchwork.ozlabs.org/patch/401503/
But your suggestion is a good one; feel free to post as a new patch on
that thread.
-Edward

Loading...