Thanks a lot, Lee. ost-survey ran all night but never returned any results. I will try sgpdd-survey to verify the OSTs.

 

Thanks,

-Upanshu

 

From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 7:57 PM
To: Singhal, Upanshu
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit

 

Sounds like obdfilter-survey is also working, but your system is not big enough to handle large scale testing.

 

As for working examples, I’d use the one that you reported as working, and then gently scale up from there.

 

Might I ask why you are running obdfilter-survey?  What is your goal here?

 

Frankly, I’d run sgpdd-survey to verify the OSTs, and then run IOR or something like that to perform an initial benchmark of the file system to see if it matches the block IO expectation of your design.

 

--

Brett Lee

Sr. Systems Engineer

Intel High Performance Data Division

 

From: Singhal, Upanshu [mailto:upanshu.singhal@emc.com]
Sent: Wednesday, September 18, 2013 7:48 AM
To: Lee, Brett
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit

 

Hello Lee,

 

I am using 4 cores with 4 GB of RAM and 128 GB of disk space. I tried a very low value, size=24K, and it worked fine; when I increased it to 100K, it started giving me errors. So it seems that there is an issue with the space on the disk. I do not have any data on the disks, and this is just a fresh test setup.

 

Can you suggest a few examples to start with?

 

Thanks,

-Upanshu

 

From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 6:36 PM
To: Singhal, Upanshu
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit

 

Sounds like ost-survey is working, just not fast enough for your liking. :)

 

Regarding obdfilter-survey, and the other *survey scripts, there are some default values at the top of the scripts that you are overriding by passing in the “name=X” value on the command line.  You probably knew that already…  So I’m wondering about the suitability of the parameters for your system.  Like, if you ask for too many threads, or too much disk space, or …  Are you working with multicore CPUs on the OSSs, and do the OSTs have sufficient free space?  Perhaps start with some very low values across the board?
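A minimal sketch of what "very low values across the board" might look like. It only prints the command (a dry run) rather than executing it. The variable names nobjlo/nobjhi, thrlo/thrhi, and size (in MB) are assumed to match the defaults at the top of the script, so check them against your copy; the target name is the one from the original report, and the specific numbers are illustrative, not recommendations.

```shell
#!/bin/sh
# Dry run: build a small obdfilter-survey invocation and print it.
# Assumptions: parameter names match the defaults listed at the top of
# the obdfilter-survey script; size is interpreted as MB per OST.
nobjlo=1                  # fewest objects per OST
nobjhi=2                  # most objects per OST
thrlo=1                   # fewest threads
thrhi=2                   # most threads
size=1024                 # 1 GB per OST, well below the OST's free space
targets="lustre-OST0001"  # target name taken from the original report

cmd="nobjlo=$nobjlo nobjhi=$nobjhi thrlo=$thrlo thrhi=$thrhi size=$size targets=$targets case=disk obdfilter-survey"
echo "$cmd"
```

Once a run at these values completes cleanly, raise one parameter at a time so that any failure points at a single setting.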

 

Another thought is that there remains some data on the OSTs that is still accounted for – possibly from earlier runs of the *survey tools.  Deleting that data manually, or recreating the file system is a quick fix in that case.  Of course, those suggestions assume you are working in a lab and the file system does not hold valuable data.

 

--

Brett Lee

Sr. Systems Engineer

Intel High Performance Data Division

 

From: Singhal, Upanshu [mailto:upanshu.singhal@emc.com]
Sent: Wednesday, September 18, 2013 6:46 AM
To: Lee, Brett
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit

 

Thanks for your email Brett. Yes, it is able to run from client but it seems to be taking a very long time to produce the output. My command is “/usr/bin/ost-survey -s 10 /mnt/lustre/”

 

Yes, I still get the errors with “thrhi=4” and without any size. FYI: I am using RAID1 for the data disks for MDT and OST.

 

Thanks,

-Upanshu

 

From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 5:55 PM
To: Singhal, Upanshu; hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit

 

For ost-survey, can you try running it from a Lustre client?

 

For obdfilter-survey, do you still get the errors with “thrhi=4” and without “size=24576”?

 

--

Brett Lee

Sr. Systems Engineer

Intel High Performance Data Division

 

From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Singhal, Upanshu
Sent: Wednesday, September 18, 2013 5:00 AM
To: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: [HPDD-discuss] Lustre 2.4 IO testing with Lustre IO Toolkit

 

Hello,

 

We have a 5-node Lustre 2.4 setup with 1 MGS/MDS, 2 OSSs, and 2 clients. Each OSS has 1 OST, and the OST devices are provisioned from a storage array. The setup seems fine: I can mount all OSTs on the clients and write files to them. I can also run the basic commands successfully, e.g. “lfs df -h”, “lfs df -ih”, “ls -lsah”, etc.

But we face some issues when we run the Lustre I/O Toolkit commands, especially obdfilter-survey and ost-survey. We get the following errors:

 

obdfilter-survey – (noobjlo=2 noobjhi=16 thrlo=2 thrhi=64 size=24576 targets=lustre-OST0001 case=disk /usr/bin/obdfilter-survey)

 

ost  1 sz 25165824K rsz 1024K obj    1 thr    4 write 22126.57             ERROR rewrite 1264310.26             ERROR read 5448.33 [5982.41,6001.74]

 

The error in /var/log/messages is: LustreError: 11277:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0000: cli ECHO_UUID/ffff88013a50b400 left 0 < tot_grant 4750324 unstable 2097152 pending 2097152

 

Is this related to the size of the data we are writing or rewriting? If I use a small size, I do not see the error.
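As a quick check on the size question, the arithmetic below works out how much data size=24576 requests. It assumes obdfilter-survey interprets size as MB per OST; under that assumption the result matches the "sz 25165824K" column in the output line above. The variable names are only for illustration.

```shell
#!/bin/sh
# Assumption: obdfilter-survey reads size= as megabytes per OST.
size_mb=24576
size_kb=$((size_mb * 1024))   # the figure reported in the "sz" column
size_gb=$((size_mb / 1024))   # the same amount expressed in GB
echo "size=${size_mb} -> ${size_kb}K (${size_gb} GB per OST)"
```

Run as written, this prints "size=24576 -> 25165824K (24 GB per OST)", so each pass asks the OST for 24 GB of space.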

 

ost-survey – ost-survey -s 300 /mnt/ost0/

/usr/bin/ost-survey: 09/17/13 OST speed survey on /mnt/ost0/ from 10.x.x.x@tcp

Cannot open : No such file or directory

 

Has anyone faced these kinds of errors? If so, what configuration changes do I need to make? Any suggestions on tests?

 

Thanks,

-Upanshu

 

Upanshu Singhal

EMC Data Storage Systems, Bangalore, India.

Phone: 91-80-67375604