That is the paper I was referring to in my last comment.
Aug 18 13:53:29 prod-0064 kernel: [ 151.261120] LNetError: 485:0:(o2iblnd.c:869:kiblnd_create_conn()) Can't create QP: -12, send_wr: 16191, recv_wr: 254
There’s some discussion of this error message in LU-5718
(https://jira.hpdd.intel.com/browse/LU-5718).
There is clearly a bug somewhere and anyone that googles recommended settings and then
tries to apply those settings to their mlx5 network will encounter it.
There may be a bug, or there may just be a lack of documentation or knowledge about the
interaction between the o2iblnd driver parameters and the mlx5 drivers. I think it’s
important to understand that the “recommended settings” laid out in that paper are
recommended in the context of dealing with the ping storms. The scale (client:router:OST
ratio) at which the ping storm becomes an issue is not clear. Thus, it may not be
necessary for *every* system to use these settings. That isn’t to say that the default
settings are the best either. But perhaps small to moderately sized systems can get away
with 16/16, or 32/32, etc.
Anyone that googles recommended settings needs to understand the context in which those
recommendations were made.
Chris Horn
On Aug 19, 2015, at 11:49 AM, Ken Jeffries <jeffries@cray.com> wrote:
Hi Chris,
AFAICT the generally recommended values are peer_credits=126 and concurrent_sends=63. See
https://cug.org/proceedings/attendee_program_cug2012/includes/files/pap16...
and others. When those values are set on mlx5 they produce a non-working network, with
errors in /var/log/messages like:
Aug 18 13:53:29 prod-0064 kernel: [ 151.261120] LNetError: 485:0:(o2iblnd.c:869:kiblnd_create_conn()) Can't create QP: -12, send_wr: 16191, recv_wr: 254
Aug 18 13:54:05 prod-0064 kernel: [ 187.241154] LNetError: 6:0:(o2iblnd.c:869:kiblnd_create_conn()) Can't create QP: -12, send_wr: 16191, recv_wr: 254
Aug 18 13:54:05 prod-0064 kernel: [ 187.241161] LNetError: 6:0:(o2iblnd.c:869:kiblnd_create_conn()) Skipped 3 previous similar messages
Aug 18 13:54:41 prod-0064 kernel: [ 223.220728] LNetError: 6:0:(o2iblnd.c:869:kiblnd_create_conn()) Can't create QP: -12, send_wr: 16191, recv_wr: 254
The 63/16 combination is the closest we could come to 126/63 and still have a working
network. Because we cannot run any performance tests with 126/63, we cannot say directly
whether we are leaving performance on the table.
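For what it's worth, the send_wr and recv_wr numbers in those messages line up with peer_credits=126 and concurrent_sends=63 if one assumes the queue pair is sized as (rdma_frags + 1) * concurrent_sends send work requests and 2 * peer_credits + 2 receive work requests, with 256 RDMA fragments. That formula and its constants are an assumption, not something stated in this thread, and the helper names below are hypothetical. The -12 in the message is -ENOMEM on Linux. A minimal sketch of the arithmetic:

```c
#include <stdio.h>

/* ASSUMPTION: 256 RDMA fragments per message and 2 out-of-band messages.
 * These constants are not given in this thread; they are used here only
 * because they reproduce the numbers in the log above. */
#define ASSUMED_RDMA_FRAGS 256
#define ASSUMED_OOB_MSGS   2

/* Hypothetical helper: send work requests asked for on one connection. */
int assumed_send_wr(int concurrent_sends)
{
    return (ASSUMED_RDMA_FRAGS + 1) * concurrent_sends;
}

/* Hypothetical helper: receive work requests asked for on one connection. */
int assumed_recv_wr(int peer_credits)
{
    return 2 * peer_credits + ASSUMED_OOB_MSGS;
}

/* Print the assumed queue pair size for one pair of settings. */
void print_qp_size(int peer_credits, int concurrent_sends)
{
    printf("peer_credits=%d concurrent_sends=%d -> send_wr: %d, recv_wr: %d\n",
           peer_credits, concurrent_sends,
           assumed_send_wr(concurrent_sends),
           assumed_recv_wr(peer_credits));
}
```

Under these assumptions, print_qp_size(126, 63) reports send_wr 16191 and recv_wr 254, the same values as in the failing log lines, while print_qp_size(63, 16) reports 4112 and 128. If the assumption holds, the request that mlx5 refuses is roughly four times larger than the one that works.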
There is clearly a bug somewhere and anyone that googles recommended settings and then
tries to apply those settings to their mlx5 network will encounter it.
Regards,
Ken
From: Chris Horn <hornc@cray.com>
Date: Wednesday, August 19, 2015 at 11:21 AM
To: Kenneth Jeffries <jeffries@cray.com>
Cc: Martin Hecht <hecht@hlrs.de>, "Prescott,Craig P" <prescott@rc.ufl.edu>,
"hpdd-discuss@lists.01.org" <hpdd-discuss@ml01.01.org>
Subject: Re: [HPDD-discuss] o2iblnd peer_credits and concurrent_sends
I don’t know that there’s a good one-size-fits-all solution for how to configure the
credits. The recommendations laid out in Cray’s paper are in response to an acute problem
seen at large scale to deal with ping storms created by the Lustre pinger. If your system
doesn’t experience that problem then the default values may be sufficient. If you have
evidence that you’re leaving performance on the table, and LNet is your bottleneck, then
experimenting with credits may be worthwhile.
Chris Horn
On Aug 19, 2015, at 11:17 AM, Chris Horn <hornc@cray.com> wrote:
The o2iblnd driver code forces peer_credits and concurrent_sends to be in a reasonable
range of each other:
    if (*kiblnd_tunables.kib_concurrent_sends > *kiblnd_tunables.kib_peertxcredits * 2)
            *kiblnd_tunables.kib_concurrent_sends = *kiblnd_tunables.kib_peertxcredits * 2;

    if (*kiblnd_tunables.kib_concurrent_sends < *kiblnd_tunables.kib_peertxcredits / 2)
            *kiblnd_tunables.kib_concurrent_sends = *kiblnd_tunables.kib_peertxcredits / 2;
The code above ensures that concurrent_sends cannot be larger than 2*peer_credits or
smaller than peer_credits/2. I’m not really sure why it allows concurrent_sends to be less
than peer_credits.
By changing the value of concurrent_sends after the module has loaded you’re circumventing
the above logic.
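To make the effect of that check concrete, here is a minimal standalone sketch of just the two clamps quoted above, using plain ints in place of the module's tunables. The function name clamp_concurrent_sends is hypothetical, and the real module initialization may do other validation that is not shown in this thread. Note that C integer division truncates, so 63 / 2 is 31.

```c
/* Sketch of the two checks quoted above. clamp_concurrent_sends is a
 * hypothetical name; the real code operates on the module tunables. */
int clamp_concurrent_sends(int concurrent_sends, int peer_credits)
{
    /* Upper bound: no more than twice peer_credits. */
    if (concurrent_sends > peer_credits * 2)
        concurrent_sends = peer_credits * 2;

    /* Lower bound: no less than half of peer_credits (integer division). */
    if (concurrent_sends < peer_credits / 2)
        concurrent_sends = peer_credits / 2;

    return concurrent_sends;
}
```

With this sketch, clamp_concurrent_sends(16, 63) returns 31 and clamp_concurrent_sends(63, 126) returns 63 unchanged. If these are the only checks applied at load time, that would be consistent with concurrent_sends=16 not sticking when it is set together with peer_credits=63 from modprobe.d.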
Chris Horn
On Aug 19, 2015, at 8:01 AM, Ken Jeffries <jeffries@cray.com> wrote:
Hi Martin and Craig,
This seems to be a problem only on mlx5 and not on mlx4. As Craig says, the default values
(peer_credits=8 concurrent_sends=8) do work. The values peer_credits=63 concurrent_sends=16
also work, but concurrent_sends=16 cannot be set via the normal .conf file in
modprobe.d/. After the modprobe of ko2iblnd but before the module is used, it is possible
to chmod /sys/module/ko2iblnd/parameters/concurrent_sends to writeable and then echo 16
into the parameter.
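The chmod-then-echo step described above could be sketched in C roughly as follows. This is only an illustration: the function name set_module_param is hypothetical, the mode 0644 is an assumption (the message only says "writeable"), and against the real sysfs path it would need root and would have to run after the modprobe but before the module is used.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Make a parameter file owner-writable and write an integer to it.
 * Returns 0 on success, -1 on failure. Hypothetical helper. */
int set_module_param(const char *path, int value)
{
    FILE *f;
    int ok;

    /* ASSUMPTION: mode 0644 (rw for owner, read-only for everyone else). */
    if (chmod(path, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) != 0)
        return -1;

    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    /* Same effect as "echo <value> > path". */
    ok = fprintf(f, "%d\n", value) > 0;

    /* The buffered write is flushed on close, so check that too. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A call such as set_module_param("/sys/module/ko2iblnd/parameters/concurrent_sends", 16) would then correspond to the manual steps. Whether the driver honors the value written this way is as reported in this message, not something the sketch verifies.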
These values are still well short of some generally recommended values and that is
concerning. As Martin says, it may be possible to increase other parameters to go beyond
these values.
Regards,
Ken
From: Martin Hecht <hecht@hlrs.de>
Date: Wednesday, August 19, 2015 at 6:52 AM
To: "Prescott,Craig P" <prescott@rc.ufl.edu>, Kenneth Jeffries <jeffries@cray.com>,
"hpdd-discuss@lists.01.org" <hpdd-discuss@ml01.01.org>
Subject: Re: [HPDD-discuss] o2iblnd peer_credits and concurrent_sends
Hi,
We stumbled over peer_credits as well. It must be set to the same value on all clients
and servers.
I also heard from Cray that 63 was the maximum that works. Maybe apart from the limitation
of the lnet protocol there are further restrictions, or you have to increase other
parameters as well, in order to go beyond 63.
Martin
On 08/19/2015 03:14 AM, Prescott,Craig P wrote:
Hi Ken,
No, I never got any answers to that old post. We ended up going with the default values
back then - those have actually been ok for our scale/use case. FWIW, I have a hunch that
the problem may have been due to limitations of the Connect-IB driver we were using at the
time on the clients.
Kind of timely that you bring this issue up now, though, as we are bringing up a new file
system and already had it on our list to revisit.
Cheers,
Craig
________________________________
From: HPDD-discuss <hpdd-discuss-bounces@ml01.01.org> on behalf of Ken Jeffries <jeffries@cray.com>
Sent: Monday, August 17, 2015 10:01 PM
To: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] o2iblnd peer_credits and concurrent_sends
Craig,
Did you ever get an answer to your question? Or pick values that worked?
https://lists.01.org/pipermail/hpdd-discuss/2013-July/000358.html
Ken
_______________________________________________
HPDD-discuss mailing list
HPDD-discuss@lists.01.org
https://lists.01.org/mailman/listinfo/hpdd-discuss