On 2014/12/16, 4:32 PM, "Kumar, Amit" <ahkumar(a)mail.smu.edu> wrote:
Hi Andreas,
Thank you for the quick response. All looks okay; maybe I had a panic attack
misinterpreting the output.
Here is the output. I will clear the stats and run this again to better
understand it.
Please let me know if you see anything unusual here.
Best Regards,
Amit
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772281.510957
Name          Count  Rate  #Events  Unit     last     min   avg      max
req_waittime  7      0     1548734  [usec]   225307   72    58639    8744767
req_active    7      0     1548734  [reqs]   7        1     2.90     13
write_bytes   6      0     1045411  [bytes]  6291456  1     1031744  1048576
ost_write     6      0     1045411  [usec]   224684   1399  85930    3173670
It looks like your client is just not sending very many RPCs. In a 10s
period, the highest write RPC count was only 7, and each write RPC carries
at most 1MB, so you couldn't be seeing more than 7MB per period (about
0.7MB/s) at the application. The write RPCs are completing on average in
86ms, but the recent ones are taking 225ms. That isn't good, but even 7 RPCs
at 225ms each add up to only about 1/6 of the available time, and that is
without counting RPCs sent in parallel.
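
The arithmetic behind that estimate can be checked with a short Python sketch. It assumes the counts cover a 10s sampling window, that each write RPC carries at most 1048576 bytes (the max column for write_bytes), and that the RPCs run one after another; the function names are made up for illustration and are not part of any Lustre tool.

```python
# Back-of-the-envelope check of the client-side numbers quoted above.
# Assumptions (not taken from any Lustre tool): a 10 s sampling window,
# at most 1 MiB per write RPC, and strictly serial RPCs.

MIB = 1024 * 1024


def max_throughput_mib_s(rpc_count, max_rpc_bytes, window_s):
    """Upper bound on write throughput if every RPC were full-sized."""
    return rpc_count * max_rpc_bytes / window_s / MIB


def busy_fraction(rpc_count, latency_s, window_s):
    """Share of the window spent waiting on RPCs, assuming no overlap."""
    return rpc_count * latency_s / window_s


if __name__ == "__main__":
    window_s = 10.0         # sampling period mentioned in the reply
    rpc_count = 7           # highest RPC count seen in one period
    max_rpc_bytes = 1048576  # max column of write_bytes
    recent_latency_s = 0.225  # roughly the "last" value, 225 ms

    bound = max_throughput_mib_s(rpc_count, max_rpc_bytes, window_s)
    busy = busy_fraction(rpc_count, recent_latency_s, window_s)
    print(f"throughput bound: {bound:.2f} MiB/s")
    print(f"busy fraction of window: {busy:.3f}")
```

If the RPCs overlap (req_active averages 2.90 in the output above), the busy fraction is smaller still, which points at the client not issuing enough RPCs.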
With any performance analysis, you need to look at the various components
(back end disk, network, client) to see where the actual bottleneck is. I
can't do that for you. There are a number of tools for this (lnet
selftest, obdfilter-survey, IOR) that can measure performance at various
levels of the IO stack to see what the limits are.
Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division