Thanks for your email, Brett. Yes, ost-survey does run from the client, but it is taking a very long time to produce output. My command is “/usr/bin/ost-survey -s 10 /mnt/lustre/”.
Yes, I still get the obdfilter-survey errors with “thrhi=4” and without the size parameter. FYI: I am using RAID1 for the MDT and OST data disks.
Thanks,
-Upanshu
From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 5:55 PM
To: Singhal, Upanshu; hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
For ost-survey, can you try running it from a Lustre client?
For obdfilter-survey, do you still get the errors with “thrhi=4” and without “size=24576”?
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Singhal, Upanshu
Sent: Wednesday, September 18, 2013 5:00 AM
To: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: [HPDD-discuss] Lustre 2.4 IO testing with Lustre IO Toolkit
Hello,
We now have a 5-node Lustre 2.4 setup with 1 MGS/MDS, 2 OSSs, and 2 clients. Each OSS has 1 OST, and each OST device has been provisioned from a storage array. The setup seems to be fine, as I am able to mount all OSTs on the client and write files to the OSTs. I am also able to run basic commands successfully, e.g. “lfs df -h”, “lfs df -ih”, and “ls -lsah”.
However, we run into issues when we run the Lustre I/O Toolkit commands, specifically obdfilter-survey and ost-survey. We get the following errors:
obdfilter-survey – (noobjlo=2 noobjhi=16 thrlo=2 thrhi=64 size=24576 targets=lustre-OST0001 case=disk /usr/bin/obdfilter-survey)
ost 1 sz 25165824K rsz 1024K obj 1 thr 4 write 22126.57 ERROR rewrite 1264310.26 ERROR read 5448.33 [5982.41,6001.74]
The error in /var/log/messages is: LustreError: 11277:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0000: cli ECHO_UUID/ffff88013a50b400 left 0 < tot_grant 4750324 unstable 2097152 pending 2097152
Is this related to the size of the data we are writing or rewriting? I ask because I do not see the error when I use a smaller size.
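For reference, here is a minimal Python sketch of how the result line above can be read. It splits the line into tokens, flags each phase whose rate is followed by ERROR, and checks that the reported sz matches size=24576 if that parameter is taken to be in MB (an assumption inferred from the numbers, since 24576 x 1024 = 25165824). The function name and the token layout are assumptions based only on the single line quoted here, not on the tool's documented output format.

```python
# Result line copied verbatim from the obdfilter-survey output quoted above.
LINE = ("ost 1 sz 25165824K rsz 1024K obj 1 thr 4 "
        "write 22126.57 ERROR rewrite 1264310.26 ERROR "
        "read 5448.33 [5982.41,6001.74]")

PHASES = ("write", "rewrite", "read")


def parse_result_line(line):
    """Hypothetical helper: return (size_kib, {phase: (rate, ok)}).

    Assumes each phase name is followed by a rate and then either the
    literal token ERROR or a [min,max] range, as in the quoted line.
    """
    tokens = line.split()
    # The token after "sz" is the total size with a trailing "K".
    size_kib = int(tokens[tokens.index("sz") + 1].rstrip("K"))
    results = {}
    for i, tok in enumerate(tokens):
        if tok in PHASES:
            rate = float(tokens[i + 1])
            ok = tokens[i + 2] != "ERROR"
            results[tok] = (rate, ok)
    return size_kib, results


size_kib, results = parse_result_line(LINE)
failed = [phase for phase in PHASES if not results[phase][1]]

# Assumption: size= is given in MB, so size=24576 should appear as 24576 * 1024 K.
size_matches = size_kib == 24576 * 1024

print("failed phases:", failed)
print("sz matches size=24576 (MB):", size_matches)
```

On the quoted line this reports write and rewrite as failed and read as completed, which matches the ERROR markers in the output.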
ost-survey – ost-survey -s 300 /mnt/ost0/
/usr/bin/ost-survey: 09/17/13 OST speed survey on /mnt/ost0/ from 10.x.x.x@tcp
Cannot open : No such file or directory
Has anyone faced these kinds of errors? If so, what configuration changes do I need to make? Any suggestions on tests?
Thanks,
-Upanshu
Upanshu Singhal
EMC Data Storage Systems, Bangalore, India.
Phone: 91-80-67375604