Thanks for your email, Brett. Yes, ost-survey does run from the client, but it seems to take a very long time to produce output. My command is “/usr/bin/ost-survey -s 10 /mnt/lustre/”.

 

Yes, I still get the errors with “thrhi=4” and without any size setting. FYI: I am using RAID1 for the MDT and OST data disks.

 

Thanks,

-Upanshu

 

From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 5:55 PM
To: Singhal, Upanshu; hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit

 

For ost-survey, can you try running it from a Lustre client?

 

For obdfilter-survey, do you still get the errors with “thrhi=4” and without “size=24576”?

 

--

Brett Lee

Sr. Systems Engineer

Intel High Performance Data Division

 

From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Singhal, Upanshu
Sent: Wednesday, September 18, 2013 5:00 AM
To: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: [HPDD-discuss] Lustre 2.4 IO testing with Lustre IO Toolkit

 

Hello,

 

We now have a 5-node Lustre 2.4 setup with 1 MGS/MDS, 2 OSSs, and 2 clients. Each OSS has 1 OST, and each OST device has been provisioned from a storage array. The setup seems to be fine, as I am able to mount all OSTs on the client and write files to the OSTs. I am also able to run basic commands successfully, e.g. “lfs df -h”, “lfs df -ih”, and “ls -lsah”.

However, we face some issues when we run the Lustre I/O Toolkit commands, especially obdfilter-survey and ost-survey. We get the following errors:

 

obdfilter-survey – (noobjlo=2 noobjhi=16 thrlo=2 thrhi=64 size=24576 targets=lustre-OST0001 case=disk /usr/bin/obdfilter-survey)

 

ost  1 sz 25165824K rsz 1024K obj    1 thr    4 write 22126.57             ERROR rewrite 1264310.26             ERROR read 5448.33 [5982.41,6001.74]

 

The error in /var/log/messages is: LustreError: 11277:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0000: cli ECHO_UUID/ffff88013a50b400 left 0 < tot_grant 4750324 unstable 2097152 pending 2097152

 

Is this related to the size of the data we are writing or rewriting? I ask because I do not see the error when I use a smaller size.
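
One thing that may be worth ruling out here is simply whether the requested size fits on the OST. The output above shows “sz 25165824K” for size=24576, i.e. the size is taken in MB, and the log message reports “left 0”. The sketch below only does that arithmetic. The helper names survey_size_kb and survey_fits are made up for illustration, and the idea that free space is behind the error is an assumption, not a confirmed diagnosis.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the Lustre I/O kit.

# Convert an obdfilter-survey "size" value (MB) to KB, the unit shown
# in the survey output ("sz ...K") and in the "lfs df" columns.
survey_size_kb() {
    echo $(( $1 * 1024 ))
}

# Succeed if a survey of <size_mb> MB fits into <avail_kb> KB of free space.
survey_fits() {
    [ "$(survey_size_kb "$1")" -le "$2" ]
}

# Example use on a client (assumes the file system is mounted at /mnt/lustre
# and that the 4th column of "lfs df" is the available space in KB):
#   avail_kb=$(lfs df /mnt/lustre | awk '$1 == "lustre-OST0001_UUID" { print $4 }')
#   survey_fits 24576 "$avail_kb" || echo "size=24576 exceeds free space on the OST"
```

For size=24576, survey_size_kb gives 25165824, which matches the “sz” value in the output above.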

 

ost-survey – ost-survey -s 300 /mnt/ost0/

/usr/bin/ost-survey: 09/17/13 OST speed survey on /mnt/ost0/ from 10.x.x.x@tcp

Cannot open : No such file or directory
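
For reference, ost-survey takes a directory on a mounted Lustre client file system, so a quick sanity check before running it is to confirm that the path is a client mount on the node. The sketch below is one way to do that. The function name is_lustre_client_mount is made up, and it assumes that client mounts appear in /proc/mounts with type “lustre” and a device of the form “<mgs-nid>:/<fsname>”, while server targets show a block device instead.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Lustre I/O kit.
# Usage: is_lustre_client_mount <path> [mounts-file]
# Succeeds if <path> is listed as a Lustre client mount.
# Assumption: a client mount has fstype "lustre" and a device containing
# ":/" (e.g. "10.x.x.x@tcp:/lustre"); an OST or MDT mount on a server
# has a block device such as /dev/sdb in the first field.
is_lustre_client_mount() {
    path=${1%/}                      # drop one trailing slash
    [ -n "$path" ] || path=/         # keep "/" itself intact
    mounts=${2:-/proc/mounts}
    awk -v p="$path" '
        $2 == p && $3 == "lustre" && $1 ~ /:\// { found = 1 }
        END { exit !found }
    ' "$mounts"
}

# Example use:
#   if is_lustre_client_mount /mnt/lustre/; then
#       /usr/bin/ost-survey -s 10 /mnt/lustre/
#   else
#       echo "/mnt/lustre/ is not a Lustre client mount on this node" >&2
#   fi
```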

 

Has anyone faced these kinds of errors? If so, what configuration changes do I need to make? Any suggestions on tests would also be welcome.

 

Thanks,

-Upanshu

 

Upanshu Singhal

EMC Data Storage Systems, Bangalore, India.

Phone: 91-80-67375604