Hi Andreas,
Thank you for quick response, All looks okay, may be I had a panic attack misinterpreting
the output:
Here is the output. I will clear the stats and run this to better understand this.
Please let me know if you see anything usual here.
Best Regards,
Amit
llstat -i 10 /proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats
/usr/bin/llstat: STATS on 12/16/14
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats on 10.42.206.97@o2ib
snapshot_time 1418772211.507153
req_waittime 1548724
req_active 1548724
read_bytes 36053
write_bytes 1045405
ost_setattr 17764
ost_read 36053
ost_write 1045405
ost_connect 1
ost_punch 2960
ost_statfs 256
ldlm_cancel 79309
obd_ping 229843
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772221.507480
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 0 0 1548724 [usec] 0 72
58639.58 8744767 98781.00
req_active 0 0 1548724 [reqs] 0 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 0 0 1045405 [bytes] 0 1
1031744.42 1048576 126401.91
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 0 0 1045405 [usec] 0 1399
85930.30 3173670 109508.79
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79309 [usec] 0 93
3637.37 774585 26263.10
obd_ping 0 0 229843 [usec] 0 72
375.10 1363317 2847.93
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772231.508082
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 1 0 1548725 [usec] 875 72
58639.54 8744767 98780.98
req_active 1 0 1548725 [reqs] 1 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 0 0 1045405 [bytes] 0 1
1031744.42 1048576 126401.91
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 0 0 1045405 [usec] 0 1399
85930.30 3173670 109508.79
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79309 [usec] 0 93
3637.37 774585 26263.10
obd_ping 1 0 229844 [usec] 875 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772241.508661
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 0 0 1548725 [usec] 0 72
58639.54 8744767 98780.98
req_active 0 0 1548725 [reqs] 0 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 0 0 1045405 [bytes] 0 1
1031744.42 1048576 126401.91
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 0 0 1045405 [usec] 0 1399
85930.30 3173670 109508.79
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79309 [usec] 0 93
3637.37 774585 26263.10
obd_ping 0 0 229844 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772251.509246
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 1 0 1548726 [usec] 316 72
58639.51 8744767 98780.96
req_active 1 0 1548726 [reqs] 1 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 0 0 1045405 [bytes] 0 1
1031744.42 1048576 126401.91
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 0 0 1045405 [usec] 0 1399
85930.30 3173670 109508.79
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79309 [usec] 0 93
3637.37 774585 26263.10
obd_ping 1 0 229845 [usec] 316 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772261.509811
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 0 0 1548726 [usec] 0 72
58639.51 8744767 98780.96
req_active 0 0 1548726 [reqs] 0 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 0 0 1045405 [bytes] 0 1
1031744.42 1048576 126401.91
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 0 0 1045405 [usec] 0 1399
85930.30 3173670 109508.79
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79309 [usec] 0 93
3637.37 774585 26263.10
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772271.510391
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 1 0 1548727 [usec] 496 72
58639.47 8744767 98780.94
req_active 1 0 1548727 [reqs] 1 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 0 0 1045405 [bytes] 0 1
1031744.42 1048576 126401.91
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 0 0 1045405 [usec] 0 1399
85930.30 3173670 109508.79
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79309 [usec] 0 93
3637.37 774585 26263.10
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772281.510957
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 7 0 1548734 [usec] 225307 72
58639.35 8744767 98780.75
req_active 7 0 1548734 [reqs] 7 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 6 0 1045411 [bytes] 6291456 1
1031744.52 1048576 126401.56
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 6 0 1045411 [usec] 224684 1399
85930.02 3173670 109508.55
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 1 0 79310 [usec] 623 93
3637.34 774585 26262.94
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772291.511506
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 5 0 1548739 [usec] 192330 72
58639.28 8744767 98780.61
req_active 5 0 1548739 [reqs] 5 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 5 0 1045416 [bytes] 5242880 1
1031744.60 1048576 126401.26
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 5 0 1045416 [usec] 192330 1399
85929.79 3173670 109508.36
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79310 [usec] 0 93
3637.34 774585 26262.94
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772301.512044
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 6 0 1548745 [usec] 207370 72
58639.19 8744767 98780.44
req_active 6 0 1548745 [reqs] 6 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 6 0 1045422 [bytes] 6062080 1
1031744.47 1048576 126401.08
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 6 0 1045422 [usec] 207370 1399
85929.50 3173670 109508.13
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79310 [usec] 0 93
3637.34 774585 26262.94
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772311.512583
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 5 0 1548750 [usec] 178679 72
58639.12 8744767 98780.30
req_active 5 0 1548750 [reqs] 5 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 5 0 1045427 [bytes] 5242880 1
1031744.55 1048576 126400.78
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 5 0 1045427 [usec] 178679 1399
85929.26 3173670 109507.93
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79310 [usec] 0 93
3637.34 774585 26262.94
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772321.513150
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 6 0 1548756 [usec] 244869 72
58639.05 8744767 98780.13
req_active 6 0 1548756 [reqs] 6 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 6 0 1045433 [bytes] 6291456 1
1031744.65 1048576 126400.42
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 6 0 1045433 [usec] 244869 1399
85929.00 3173670 109507.70
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79310 [usec] 0 93
3637.34 774585 26262.94
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772331.513715
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 6 0 1548762 [usec] 217509 72
58638.96 8744767 98779.96
req_active 6 0 1548762 [reqs] 6 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 6 0 1045439 [bytes] 5767168 1
1031744.25 1048576 126401.04
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 6 0 1045439 [usec] 217509 1399
85928.71 3173670 109507.47
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79310 [usec] 0 93
3637.34 774585 26262.94
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
/proc/fs/lustre/osc/scratch-OST0000-osc-ffff880637fb2c00/stats @ 1418772341.514308
Name Cur.Count Cur.Rate #Events Unit last min
avg max stddev
req_waittime 2 0 1548764 [usec] 37517 72
58638.91 8744767 98779.91
req_active 2 0 1548764 [reqs] 2 1
2.90 13 2.89
read_bytes 0 0 36053 [bytes] 0 0
968093.45 1048576 275326.45
write_bytes 2 0 1045441 [bytes] 2097152 1
1031744.28 1048576 126400.92
ost_setattr 0 0 17764 [usec] 0 106
5441.21 731731 27806.96
ost_read 0 0 36053 [usec] 0 158
12357.40 477701 18094.23
ost_write 2 0 1045441 [usec] 37517 1399
85928.58 3173670 109507.40
ost_connect 0 0 1 [usec] 0 28281
28281.00 28281 0.00
ost_punch 0 0 2960 [usec] 0 165
2337.63 324251 17537.66
ost_statfs 0 0 256 [usec] 0 196
354.67 819 82.18
ldlm_cancel 0 0 79310 [usec] 0 93
3637.34 774585 26262.94
obd_ping 0 0 229845 [usec] 0 72
375.10 1363317 2847.92
________________________________________
From: Dilger, Andreas [andreas.dilger(a)intel.com]
Sent: Tuesday, December 16, 2014 5:16 PM
To: Kumar, Amit
Cc: hpdd-discuss(a)lists.01.org
Subject: Re: [HPDD-discuss] Long req_waittime
You're cutting off all of the useful values from these stats. All this
tells us is that you sent 239189 RPCs from this client, but you chopped
off the actual stats that report how long the RPCs took (in microseconds:
min, max, sum, sum_squared for stddev). You can use something like:
llstat -i 10 fsname-OST0000-*
to get a nicely formatted output (condensed to 80 columns for this email):
/proc/fs/lustre/osc/myth-OST0000-osc-ffff880079252800/stats @
1418771188.848400
Name Count Rate Events Unit last min avg max
stddev
req_waittime 0 0 38104 [usec] 0 511 29747 859246
38441.08
req_active 0 0 38104 [reqs] 0 1 1.15 17
0.64
read_bytes 0 0 10967 [bytes] 0 0 644180 1048576
470678.11
write_bytes 0 0 8831 [bytes] 0 188 314025 1048576
154912.70
ost_read 0 0 10967 [usec] 0 1311 43729 859246
55680.31
ost_write 0 0 8831 [usec] 0 4003 21733 578621
22564.67
ost_connect 0 0 1 [usec] 0 1581 1581 1581
0.00
ost_punch 0 0 70 [usec] 0 981 10499 51744
10787.99
ost_statfs 0 0 8060 [usec] 0 511 2217 110251
2467.72
ost_sync 0 0 8240 [usec] 0 660 51698 265859
20041.93
ldlm_cancel 0 0 638 [usec] 0 688 16629 202345
25984.31
obd_ping 0 0 444 [usec] 0 840 3848 52907
2766.05
These are aggregate stats since the client/server was first mounted, or
last cleared (by writing into the /proc file or using "lctl set_param
NNNN=clear").
This will at least give you some more useful info about the stats to start
looking at what is going slowly.
Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division