James,
Why, specifically, did you expect a performance boost? I don't think the relevant
bottlenecks are affected by RPC size, so I don't think you should expect an
improvement.
My understanding (which comes partly from LUG presentations, etc., and partly from
investigations of these issues I've done at Cray) is that 2.x is slower than 1.8 for
two reasons:
1. Clients are often CPU bound in the CLIO layers, which I think would be unaffected by a
larger RPC size.
2. Clients do not package IOs anywhere near as well, resulting in a larger number of
smaller IOs. (Even in a test where we do only 1 MB IO requests, on 2.x, we would see a
large number of RPCs of (much) < 1 MB from the client, whereas in 1.8, we saw almost
exclusively 1 MB RPCs. For some tests, we'd see as many as 10 times as many total RPCs
on 2.x vs 1.8.)
Since the client isn't doing a good job of filling 1 MB RPCs, I don't think it
would fill 4 MB RPCs.
In contrast, 2.6 is much more like 1.8. CPU usage is down, and IOs are packaged much
better. Our IO statistics for 2.6 look much more like 1.8 than for earlier 2.x.
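To put the RPC packaging point in rough numbers, here is a minimal sketch. The 1 GiB
transfer size is a made-up figure for illustration, not a measurement, and the helper
names are hypothetical. It assumes the client's RPC sizes stay the same when the
ceiling is raised, which is what I'd expect but have not verified.

```python
MIB = 1024 * 1024

def mean_rpc_size(total_bytes, rpc_count):
    """Average payload per RPC for a given amount of data moved."""
    return total_bytes / rpc_count

def fill_fraction(total_bytes, rpc_count, max_rpc_bytes):
    """How full the average RPC is relative to the configured ceiling."""
    return mean_rpc_size(total_bytes, rpc_count) / max_rpc_bytes

# Hypothetical workload: 1 GiB written in 1 MiB application requests.
total = 1024 * MIB

# 1.8-like behaviour: almost every RPC is a full 1 MiB.
full_rpcs = total // MIB

# Early 2.x-like behaviour: roughly 10 times as many RPCs for the same data.
fragmented_rpcs = full_rpcs * 10

fill_18 = fill_fraction(total, full_rpcs, 1 * MIB)
fill_2x_1m = fill_fraction(total, fragmented_rpcs, 1 * MIB)
# Assumption: raising the ceiling to 4 MiB does not change the RPC count.
fill_2x_4m = fill_fraction(total, fragmented_rpcs, 4 * MIB)

print(f"1.8-like, 1 MiB ceiling: {fill_18:.3f}")
print(f"2.x-like, 1 MiB ceiling: {fill_2x_1m:.3f}")
print(f"2.x-like, 4 MiB ceiling: {fill_2x_4m:.3f}")
```

If the client is already averaging about a tenth of a 1 MiB RPC, a 4 MiB ceiling only
lowers the fill fraction; it does not reduce the number of RPCs on the wire.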
- Patrick Farrell
________________________________
From: HPDD-discuss [hpdd-discuss-bounces(a)lists.01.org] on behalf of Simmons, James A.
[simmonsja(a)ornl.gov]
Sent: Wednesday, October 22, 2014 8:16 PM
To: hpdd-discuss(a)lists.01.org
Subject: [HPDD-discuss] Anyone using 4MB RPCs
So recently we have moved our systems from 1.8 to 2.5 clients and have lost some of the
performance we had before, which is expected. So I thought we could try using
4MB RPCs instead of the default 1MB RPC packet. I set max_pages_per_rpc to 1024
and looked at the value of max_dirty_mb which was 32 and max_rpcs_in_flight which
is 8. By default a dirty cache of 32MB should be enough in this case. So I tested it and
saw no performance improvements. After that I boosted max_dirty_mb to 64 and still
no improvements over the default settings. Has anyone seen this before? What could
I be missing?
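
For reference, the arithmetic behind those settings can be sketched as follows. This
assumes 4 KiB pages (typical on x86_64, but an assumption here) and that the dirty
cache only needs to cover the RPCs in flight for one OSC; the function names are
hypothetical.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
MIB = 1024 * 1024

def rpc_size_bytes(max_pages_per_rpc, page_size=PAGE_SIZE):
    """Largest RPC payload implied by max_pages_per_rpc."""
    return max_pages_per_rpc * page_size

def in_flight_bytes(max_pages_per_rpc, max_rpcs_in_flight, page_size=PAGE_SIZE):
    """Data needed to keep every in-flight RPC slot full."""
    return rpc_size_bytes(max_pages_per_rpc, page_size) * max_rpcs_in_flight

rpc_mib = rpc_size_bytes(1024) // MIB          # max_pages_per_rpc = 1024
window_mib = in_flight_bytes(1024, 8) // MIB   # max_rpcs_in_flight = 8

print(f"RPC size: {rpc_mib} MiB")
print(f"Data to fill all in-flight RPCs: {window_mib} MiB")
print(f"Covered by max_dirty_mb=32: {window_mib <= 32}")
print(f"Covered by max_dirty_mb=64: {window_mib <= 64}")
```

Under these assumptions, 8 RPCs of 4 MiB each need exactly 32 MiB, so max_dirty_mb=32
is just enough and 64 leaves headroom.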