Discussion:
[PATCH] net: fix saving TX flow hash in sock for outgoing connections
Sathya Perla
2014-10-22 16:12:01 UTC
Permalink
The commit "net: Save TX flow hash in sock and set in skbuf on xmit"
introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate
and record flow hash(sk_txhash) in the socket structure. sk_txhash is used
to set skb->hash which is used to spread flows across multiple TXQs.

But, the above routines are invoked before the source port of the connection
is created. Because of this all outgoing connections that just differ in the
source port get hashed into the same TXQ.

This patch fixes this problem for IPv4/6 by invoking the the above routines
after the source port is available for the socket.

Fixes: b73c3d0e4("net: Save TX flow hash in sock and set in skbuf on xmit")

Signed-off-by: Sathya Perla <***@emulex.com>
---
net/ipv4/tcp_ipv4.c | 4 ++--
net/ipv6/tcp_ipv6.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 94d1a77..9c7d762 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -206,8 +206,6 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
inet->inet_dport = usin->sin_port;
inet->inet_daddr = daddr;

- inet_set_txhash(sk);
-
inet_csk(sk)->icsk_ext_hdr_len = 0;
if (inet_opt)
inet_csk(sk)->icsk_ext_hdr_len = inet_opt->opt.optlen;
@@ -224,6 +222,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
if (err)
goto failure;

+ inet_set_txhash(sk);
+
rt = ip_route_newports(fl4, rt, orig_sport, orig_dport,
inet->inet_sport, inet->inet_dport, sk);
if (IS_ERR(rt)) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 8314955..ace29b6 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -200,8 +200,6 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
sk->sk_v6_daddr = usin->sin6_addr;
np->flow_label = fl6.flowlabel;

- ip6_set_txhash(sk);
-
/*
* TCP over IPv4
*/
@@ -297,6 +295,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
if (err)
goto late_failure;

+ ip6_set_txhash(sk);
+
if (!tp->write_seq && likely(!tp->repair))
tp->write_seq = secure_tcpv6_sequence_number(np->saddr.s6_addr32,
sk->sk_v6_daddr.s6_addr32,
--
1.7.1
Eric Dumazet
2014-10-22 17:39:02 UTC
Permalink
Post by Sathya Perla
The commit "net: Save TX flow hash in sock and set in skbuf on xmit"
introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate
and record flow hash(sk_txhash) in the socket structure. sk_txhash is used
to set skb->hash which is used to spread flows across multiple TXQs.
But, the above routines are invoked before the source port of the connection
is created. Because of this all outgoing connections that just differ in the
source port get hashed into the same TXQ.
Acked-by: Eric Dumazet <***@google.com>

Are you really using the socket/flow hash to select a TXQ ?

Even with this patch, you have a good probability of multiple
cpus hitting same TXQ.
Sathya Perla
2014-10-22 18:35:59 UTC
Permalink
-----Original Message-----
Post by Sathya Perla
The commit "net: Save TX flow hash in sock and set in skbuf on xmit"
introduced the inet_set_txhash() and ip6_set_txhash() routines to
calculate
Post by Sathya Perla
and record flow hash(sk_txhash) in the socket structure. sk_txhash is used
to set skb->hash which is used to spread flows across multiple TXQs.
But, the above routines are invoked before the source port of the
connection
Post by Sathya Perla
is created. Because of this all outgoing connections that just differ in the
source port get hashed into the same TXQ.
Are you really using the socket/flow hash to select a TXQ ?
Yes, as I don't have XPS configured on my setup.
netdev_pick_tx() uses the socket/flow hash when XPS is not used.
Even with this patch, you have a good probability of multiple
cpus hitting same TXQ.
Agree. Are you suggesting that drivers should automatically
register an XPS configuration? I thought it was upto the user
to enab
Eric Dumazet
2014-10-22 19:09:56 UTC
Permalink
Post by Sathya Perla
-----Original Message-----
Are you really using the socket/flow hash to select a TXQ ?
Yes, as I don't have XPS configured on my setup.
netdev_pick_tx() uses the socket/flow hash when XPS is not used.
Yes, this is the (poor) fallback
Post by Sathya Perla
Even with this patch, you have a good probability of multiple
cpus hitting same TXQ.
Agree. Are you suggesting that drivers should automatically
register an XPS configuration? I thought it was upto the user
to enable it...
Yep, search for netif_set_xps_queue()

(commit 537c00de1c9ba9876b9)

Look at commit d03a68f8217ea0349 for an example of how it can be done,
if user do not override this later.
David Miller
2014-10-22 20:14:02 UTC
Permalink
From: Eric Dumazet <***@gmail.com>
Date: Wed, 22 Oct 2014 12:09:56 -0700
Post by Eric Dumazet
Post by Sathya Perla
Agree. Are you suggesting that drivers should automatically
register an XPS configuration? I thought it was upto the user
to enable it...
Yep, search for netif_set_xps_queue()
(commit 537c00de1c9ba9876b9)
Look at commit d03a68f8217ea0349 for an example of how it can be done,
if user do not override this later.
Very few people know about this :-/

I only see 4 drivers adjusted to do this, it would be nice if this
was more widespread.
David Miller
2014-10-22 20:15:00 UTC
Permalink
From: Sathya Perla <***@emulex.com>
Date: Wed, 22 Oct 2014 21:42:01 +0530
Post by Sathya Perla
The commit "net: Save TX flow hash in sock and set in skbuf on xmit"
introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate
and record flow hash(sk_txhash) in the socket structure. sk_txhash is used
to set skb->hash which is used to spread flows across multiple TXQs.
But, the above routines are invoked before the source port of the connection
is created. Because of this all outgoing connections that just differ in the
source port get hashed into the same TXQ.
This patch fixes this problem for IPv4/6 by invoking the the above routines
after the source port is available for the socket.
Fixes: b73c3d0e4("net: Save TX flow hash in sock and set in skbuf on xmit")
Applied and queued up for -stable, thanks.

Loading...