Sounds like ost-survey is working, just not fast enough for your liking. :)
Regarding obdfilter-survey and the other *survey scripts: there are default values at the top of each script, which you override by passing “name=X” values on the command line. You probably knew that already, so I’m wondering whether the parameters suit your system. For example, you may be asking for too many threads or too much disk space. Are you working with multicore CPUs on the OSSs, and do the OSTs have sufficient free space? Perhaps start with some very low values across the board?
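One way to act on the "very low values" suggestion is sketched below. It only prints a conservative command line rather than running it. The spellings nobjlo and nobjhi are an assumption about the variables defined at the top of obdfilter-survey (the command quoted later in this thread spells them noobjlo and noobjhi), so check them against the script. The values are illustrative, not recommendations, and the helper name build_survey_cmd is hypothetical.

```shell
#!/bin/sh
# Print (do not run) a low-valued obdfilter-survey command line.
# Assumption: nobjlo/nobjhi/thrlo/thrhi/size/targets/case are the variable
# names read by the script; verify them at the top of obdfilter-survey.
build_survey_cmd() {
    target="$1"              # OST name, e.g. lustre-OST0001
    size_mb="${2:-1024}"     # illustrative default, in MB
    printf 'nobjlo=1 nobjhi=2 thrlo=1 thrhi=4 size=%s targets=%s case=disk /usr/bin/obdfilter-survey\n' \
        "$size_mb" "$target"
}

# Review the printed line, then paste it into a shell on the OSS.
build_survey_cmd lustre-OST0001
```

If the low-valued run completes cleanly, raising one parameter at a time should show which one triggers the errors.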
Another thought is that some data may remain on the OSTs and still be accounted for, possibly from earlier runs of the *survey tools. Deleting that data manually or recreating the file system is a quick fix in that case. Of course, those suggestions assume you are working in a lab and the file system does not hold valuable data.
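A rough way to check the free-space question is sketched below. It assumes that size= is given in MB (consistent with size=24576 appearing as "sz 25165824K" in the reported output) and that the available space is read by hand, in KB, from the Available column of "lfs df" on a client. The function names are hypothetical.

```shell
#!/bin/sh
# Convert the survey's size= value (MB) to KB, the unit of its "sz" column.
size_mb_to_kb() {
    echo $(( $1 * 1024 ))
}

# Succeed only if the requested size fits in the available space.
# Assumption: avail_kb is copied from the Available column of "lfs df".
fits_on_ost() {
    size_mb="$1"
    avail_kb="$2"
    [ "$(size_mb_to_kb "$size_mb")" -le "$avail_kb" ]
}

size_mb_to_kb 24576    # prints 25165824, matching "sz 25165824K"
```

For example, "fits_on_ost 24576 2097152" fails, because 24 GiB does not fit in 2 GiB of free space; leftover data from earlier runs reduces the available figure in the same way.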
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: Singhal, Upanshu [mailto:upanshu.singhal@emc.com]
Sent: Wednesday, September 18, 2013 6:46 AM
To: Lee, Brett
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
Thanks for your email, Brett. Yes, it runs from the client, but it takes a very long time to produce output. My command is “/usr/bin/ost-survey -s 10 /mnt/lustre/”.
Yes, I still get the errors with “thrhi=4” and without any size. FYI: I am using RAID1 for the data disks for MDT and OST.
Thanks,
-Upanshu
From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 5:55 PM
To: Singhal, Upanshu; hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
For ost-survey, can you try running it from a Lustre client?
For obdfilter-survey, do you still get the errors with “thrhi=4” and without “size=24576”?
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Singhal, Upanshu
Sent: Wednesday, September 18, 2013 5:00 AM
To: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: [HPDD-discuss] Lustre 2.4 IO testing with Lustre IO Toolkit
Hello,
We have a 5-node Lustre 2.4 setup with 1 MGS/MDS, 2 OSSs, and 2 clients. Each OSS has 1 OST, and the OST device is provisioned from a storage array. The setup seems fine: I can mount all OSTs on the client and write files to them. Basic commands such as “lfs df -h”, “lfs df -ih”, and “ls -lsah” run successfully.
But we face some issues when we run Lustre I/O Toolkit commands, especially obdfilter-survey and ost-survey. We get the following errors:
obdfilter-survey – (noobjlo=2 noobjhi=16 thrlo=2 thrhi=64 size=24576 targets=lustre-OST0001 case=disk /usr/bin/obdfilter-survey)
ost 1 sz 25165824K rsz 1024K obj 1 thr 4 write 22126.57 ERROR rewrite 1264310.26 ERROR read 5448.33 [5982.41,6001.74]
The error in /var/log/messages is: LustreError: 11277:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0000: cli ECHO_UUID/ffff88013a50b400 left 0 < tot_grant 4750324 unstable 2097152 pending 2097152
Is this related to the size of the data we are writing or rewriting? If I use a small size, I do not see the error.
ost-survey – ost-survey -s 300 /mnt/ost0/
/usr/bin/ost-survey: 09/17/13 OST speed survey on /mnt/ost0/ from 10.x.x.x@tcp
Cannot open : No such file or directory
Has anyone faced these kinds of errors? If so, what configuration change do I need to make? Any suggestions on tests?
Thanks,
-Upanshu
Upanshu Singhal
EMC Data Storage Systems, Bangalore, India.
Phone: 91-80-67375604