On 01/03/18 04:20 PM, Jason Gunthorpe wrote:
On Thu, Mar 01, 2018 at 11:00:51PM +0000, Stephen Bates wrote:
No, locality matters. If you have a bunch of NICs and a bunch of drives
and the allocator chooses to put all P2P memory on a single drive your
performance will suck horribly even if all the traffic is offloaded.
Performance will suck if you have speed differences between the PCI-E
devices. E.g. a bunch of Gen 3 x8 NVMe cards paired with a Gen 4 x16 NIC
will not reach full performance unless the p2p buffers are properly
balanced between all cards.
This would be solved better by choosing the closest devices in the
hierarchy in the p2pmem_find function (i.e. preferring devices involved
in the transaction). Also, based on the current code, it's a bit of a
moot point seeing as it requires all devices to be on the same switch. So
unless you have a giant switch hidden somewhere you're not going to get
too many NIC/NVMe pairs.
In any case, we are looking at changing both of those things so that
closer devices in the topology are preferred and spanning multiple
switches is allowed. We're also discussing changing it to "user picks",
which simplifies all of this.
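
To make the "prefer the closest device" idea concrete, here is a rough
user-space sketch. It is not the kernel code and not the p2pmem_find
implementation: struct toy_node, toy_distance and toy_find_closest are
hypothetical names, and the PCI hierarchy is modelled as a plain array of
parent indices with a single root. Distance is assumed to be the number of
hops between two devices through their nearest common upstream port, and
the provider with the smallest summed distance to all clients wins.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Toy stand-in for a PCI hierarchy (an assumption for illustration, not
 * struct pci_dev): each node stores the index of its upstream port and
 * the single root stores -1.
 */
struct toy_node {
	int parent;
};

/* Number of hops from node n up to the root. */
static int toy_depth(const struct toy_node *t, int n)
{
	int depth = 0;

	assert(n >= 0);
	while (t[n].parent >= 0) {
		n = t[n].parent;
		depth++;
	}
	return depth;
}

/* Hops between a and b through their nearest common upstream port. */
static int toy_distance(const struct toy_node *t, int a, int b)
{
	int da = toy_depth(t, a);
	int db = toy_depth(t, b);
	int hops = 0;

	/* Bring the deeper node up to the depth of the shallower one. */
	while (da > db) {
		a = t[a].parent;
		da--;
		hops++;
	}
	while (db > da) {
		b = t[b].parent;
		db--;
		hops++;
	}
	/* Walk both up in lockstep until they meet. */
	while (a != b) {
		a = t[a].parent;
		b = t[b].parent;
		hops += 2;
	}
	return hops;
}

/*
 * Return the provider with the smallest summed distance to all clients,
 * or -1 if there are no providers. The first provider wins on a tie.
 */
static int toy_find_closest(const struct toy_node *t,
			    const int *providers, size_t nr_providers,
			    const int *clients, size_t nr_clients)
{
	int best = -1;
	int best_cost = INT_MAX;

	for (size_t i = 0; i < nr_providers; i++) {
		int cost = 0;

		for (size_t j = 0; j < nr_clients; j++)
			cost += toy_distance(t, providers[i], clients[j]);
		if (cost < best_cost) {
			best_cost = cost;
			best = providers[i];
		}
	}
	return best;
}
```

With a root (0), two switches (1 and 2) below it, a NIC (3) and an NVMe
drive (4) under switch 1, and a second NVMe drive (5) under switch 2, the
sketch picks drive 4 for a transfer involving the NIC, since it sits two
hops away instead of four. A real implementation would also have to decide
what to do when the path crosses the root complex, which this toy model
ignores.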