Thanks a lot Lee. ost-survey ran the whole night but never came back with the results. I
will try sgpdd-survey to verify OSTs.
Thanks,
-Upanshu
From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 7:57 PM
To: Singhal, Upanshu
Cc: hpdd-discuss(a)lists.01.org (hpdd-discuss(a)ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
Sounds like obdfilter-survey is also working, but your system is not big enough to handle
large scale testing.
As for working examples, I'd use the one that you reported as working, and then gently
scale up from there.
Might I ask why you are running obdfilter-survey? What is your goal here?
Frankly, I'd run sgpdd-survey to verify the OSTs, and then run IOR or something like
that to perform an initial benchmark of the file system to see if it matches the block IO
expectation of your design.
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: Singhal, Upanshu [mailto:upanshu.singhal@emc.com]
Sent: Wednesday, September 18, 2013 7:48 AM
To: Lee, Brett
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
Hello Lee,
So, I am using 4 cores with 4 GB RAM and 128 GB of HD space. I tried with a very low value
like size=24K and it worked fine; when I increase it to 100K, it starts giving me errors.
So, it seems that there is an issue with the space on the disk. I do not have any data on
the disks and this is just a fresh test setup.
Can you suggest a few examples to start with?
Thanks,
-Upanshu
From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 6:36 PM
To: Singhal, Upanshu
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
Sounds like ost-survey is working, just not fast enough for your liking. :)
Regarding obdfilter-survey, and the other *survey scripts, there are some default values
at the top of the scripts that you are overriding by passing in the "name=X"
value on the command line. You probably knew that already... So I'm wondering about
the suitability of the parameters for your system. Like, if you ask for too many threads,
or too much disk space, or ... Are you working with multicore CPUs on the OSSs, and do
the OSTs have sufficient free space? Perhaps start with some very low values across the
board?
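One way to follow the "very low values" suggestion is to spell out every bound on the command line so that no default is left in play. The sketch below only prints such a command rather than running it. The parameter names (thrlo, thrhi, size, targets, case) and the script path are the ones reported elsewhere in this thread; the specific numbers and the helper name small_survey_cmd are illustrative assumptions, not recommended settings.

```shell
# Build (but do not run) an obdfilter-survey command line with small values.
# All numbers here are assumptions chosen only to be "low across the board".
small_survey_cmd() {
  target=$1          # OST name, e.g. lustre-OST0001
  size=${2:-64}      # assumed small value for size=
  thrhi=${3:-4}      # upper thread bound; thrhi=4 is the value tried in this thread
  echo "thrlo=1 thrhi=$thrhi size=$size targets=$target case=disk /usr/bin/obdfilter-survey"
}

small_survey_cmd lustre-OST0001
```

If the printed command runs cleanly, one value at a time can be raised to find the point where the errors start.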
Another thought is that there remains some data on the OSTs that is still accounted for -
possibly from earlier runs of the *survey tools. Deleting that data manually, or
recreating the file system is a quick fix in that case. Of course, those suggestions
assume you are working in a lab and the file system does not hold valuable data.
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: Singhal, Upanshu [mailto:upanshu.singhal@emc.com]
Sent: Wednesday, September 18, 2013 6:46 AM
To: Lee, Brett
Cc: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
Thanks for your email, Brett. Yes, it is able to run from the client, but it seems to be
taking a very long time to produce the output. My command is
"/usr/bin/ost-survey -s 10 /mnt/lustre/"
Yes, I still get the errors with "thrhi=4" and without any size. FYI: I am using
RAID1 for the data disks for MDT and OST.
Thanks,
-Upanshu
From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Wednesday, September 18, 2013 5:55 PM
To: Singhal, Upanshu; hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: RE: Lustre 2.4 IO testing with Lustre IO Toolkit
For ost-survey, can you try running it from a Lustre client?
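Before pointing ost-survey at a directory, it may help to confirm that the directory really is a Lustre client mount on the node where the tool is run. The sketch below is a heuristic, not part of the Lustre tools: it assumes a client mount appears in /proc/mounts with filesystem type "lustre" and a device field of the form "nid:/fsname", whereas a server-side target mount has a block device there. The helper name is_lustre_client_mount is hypothetical.

```shell
# Succeed if PATH is mounted as a Lustre client mount (heuristic, see above).
# The mounts file defaults to /proc/mounts; another file can be passed for testing.
is_lustre_client_mount() {
  path=${1%/}                       # drop one trailing slash: /mnt/lustre/ -> /mnt/lustre
  mounts_file=${2:-/proc/mounts}
  awk -v p="$path" '
    $2 == p && $3 == "lustre" && $1 ~ /:\// { found = 1 }
    END { exit !found }
  ' "$mounts_file"
}

# Example: is_lustre_client_mount /mnt/lustre/ && /usr/bin/ost-survey -s 10 /mnt/lustre/
```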
For obdfilter-survey, do you still get the errors with "thrhi=4" and without
"size=24576"?
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Singhal, Upanshu
Sent: Wednesday, September 18, 2013 5:00 AM
To: hpdd-discuss@lists.01.org (hpdd-discuss@ml01.01.org)
Subject: [HPDD-discuss] Lustre 2.4 IO testing with Lustre IO Toolkit
Hello,
We now have a 5-node Lustre 2.4 setup with 1 MGS/MDS, 2 OSSs and 2 clients. Each OSS has
1 OST, and the OST devices have been provisioned from a storage array. The setup seems to
be fine, as I am able to mount all OSTs on the client and write files to the OSTs. I am
able to run the basic commands successfully, e.g. "lfs df -h", "lfs df -ih",
"ls -lsah" etc.
But we face some issues when we run Lustre I/O Toolkit commands, especially
obdfilter-survey and ost-survey. We get the following errors:
obdfilter-survey - (noobjlo=2 noobjhi=16 thrlo=2 thrhi=64 size=24576 targets=lustre-OST0001 case=disk /usr/bin/obdfilter-survey)
ost 1 sz 25165824K rsz 1024K obj 1 thr 4 write 22126.57 ERROR rewrite 1264310.26 ERROR read 5448.33 [5982.41,6001.74]
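A result line like the one above can be scanned for the phases that failed. The sketch below assumes the layout seen in this single line (a phase name, a number, then an optional ERROR marker) and may not cover other output variants; failed_phases is a hypothetical helper name, not part of the Lustre I/O kit.

```shell
# Read obdfilter-survey result lines on stdin and print, per line,
# the phases (write/rewrite/read) whose figure is followed by ERROR.
failed_phases() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++) {
      if (($i == "write" || $i == "rewrite" || $i == "read") && $(i + 2) == "ERROR") {
        out = out (out == "" ? "" : " ") $i
      }
    }
    print out
  }'
}

echo 'ost 1 sz 25165824K rsz 1024K obj 1 thr 4 write 22126.57 ERROR rewrite 1264310.26 ERROR read 5448.33 [5982.41,6001.74]' | failed_phases
# prints: write rewrite
```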
The /var/log/messages error is:
LustreError: 11277:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0000: cli ECHO_UUID/ffff88013a50b400 left 0 < tot_grant 4750324 unstable 2097152 pending 2097152
Is this related to the size of the data we are writing or rewriting? If I use a small
size, I do not see the error.
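As a quick check on the size question: the "sz 25165824K" figure in the output above is exactly 1024 times the size=24576 given on the command line, which is consistent with size being interpreted in MB, so this run would be asking for about 24 GB on the OST. The arithmetic can be sketched as follows; size_mb_to_kb is a hypothetical helper, and reading size as MB is an inference from this one output line.

```shell
# Convert a size given in MB to KB (1 MB = 1024 KB).
size_mb_to_kb() {
  echo $(( $1 * 1024 ))
}

size_mb_to_kb 24576   # prints 25165824, matching "sz 25165824K" in the report
```

Comparing that figure with the free space reported by "lfs df -h" for the target OST is a simple way to see whether the requested size can fit.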
ost-survey - ost-survey -s 300 /mnt/ost0/
/usr/bin/ost-survey: 09/17/13 OST speed survey on /mnt/ost0/ from 10.x.x.x@tcp
Cannot open : No such file or directory
Has anyone faced these kinds of errors? If so, what configuration change do I need to
make? Any suggestions on tests?
Thanks,
-Upanshu
Upanshu Singhal
EMC Data Storage Systems, Bangalore, India.
Phone: 91-80-67375604