I'm looking for some help to understand why write throughput in my
Lustre IB cluster is only about 50-80 MB/sec, while read performance is 5-6
GB/sec. I'm running 2.4.1-RC2-PRISTINE on my MDT and 2 OSSes. My 16 clients
run 2.4.92, and 3 of them run the iozone write throughput tests.
I noticed a few threads in the 2.4.1 RC1 and RC2 timeframe discussing low
write performance. In my case, curr_dirty_bytes starts off at 0 at the
start of the test, as one would expect. As the test proceeds, one OSS's
curr_dirty_bytes stays pegged at some huge number, implying it didn't see a
commit. The other OSS's curr_dirty_bytes varies during the test as iozone
writes data that gets committed. What can I look at to see why the commit
isn't happening?
Thanks in advance,
Michael