I have a question about file striping.
I installed 1 MDS, 2 OSS/OST and 2 Lustre clients (which will become the
namenode and datanodes). When I run the lfs utility to check file
striping, I get:
[root@lustreclient1 ~]# lfs getstripe /mnt/lustre
/mnt/lustre
stripe_count: 1 stripe_size: 1048576 stripe_offset: -1
/mnt/lustre/ebook
stripe_count: 1 stripe_size: 1048576 stripe_offset: -1
/mnt/lustre/hadoop_tmp
stripe_count: 1 stripe_size: 1048576 stripe_offset: -1
I understand that stripe_count=1 currently means each file is striped
over just 1 OST.
Since I have 7 OSTs, do I need to set the stripe count to 7? What is the
exact command, if needed?
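From the lfs man page, my guess is that it would look roughly like the sketch below. This is untested on this setup: the target directory is simply one from the listing above, and DRY_RUN and the run helper are hypothetical names added so the commands are only printed by default. As far as I understand, -c sets the stripe count and -1 means "all available OSTs", so -c 7 would be the explicit form.

```shell
#!/bin/sh
# Sketch only. TARGET_DIR is taken from the getstripe listing above;
# DRY_RUN and run() are hypothetical helpers, not part of lfs.
TARGET_DIR=/mnt/lustre/hadoop_tmp
STRIPE_COUNT=-1          # -1 = stripe over all available OSTs
DRY_RUN=${DRY_RUN:-1}    # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the default layout on the directory. Files created afterwards
# inherit it; files that already exist keep their current layout.
run lfs setstripe -c "$STRIPE_COUNT" "$TARGET_DIR"

# Show the directory's own default layout to confirm the change.
run lfs getstripe -d "$TARGET_DIR"
```

Running it with DRY_RUN=0 on a client would apply the setting for real.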
Since I am comparing Hadoop over Lustre, will striping play a role here?
Please advise.
On 3/20/13, Diep, Minh <minh.diep(a)intel.com> wrote:
On 3/20/13 10:54 AM, "linux freaker" <linuxfreaker(a)gmail.com> wrote:
>Yes, you are correct. I wrote lustreclient3 twice, instead of
>lustreclient4.
>I got your point on this.
>
>>while MDS and OSS/OST will be undisturbed and will
>>neither be namenode/datanode.. am I right?
>It can be namenode but should not be datanode since we do not
>recommend mounting a Lustre client on servers.
>
>You said it can be namenode. How is that possible?
>Say I take the MDS as namenode. For a namenode, we usually take the
>mount point as /mnt/lustre. But there is no such mount point on the
>MDS. Same for the OSS/OST.
The namenode only uses a small amount of storage. You could do that with
local disk; it doesn't need Lustre. However, if you use a Lustre client as
both namenode and datanode, that's fine too.
>
>One more doubt is:
>
>You expressed .."If you have 2 oss with 6 OST each, resulting total of
>12 disks, then you
>might use 3 disks on each of 4 datanode (ie. Total 12 disks.)
>However, you are using LVM. That's different."
>
>I didn't understand why we are concerned about 3 disks on each of the 4
>datanodes. Are you talking about Hadoop + HDFS here?
Yes, to try to use comparable resources.
>
>Though I will go ahead and test the environment, and then come back
>with more results meanwhile.
>
>Thanks for all the suggestions. It's really great to see such an active
>mailing list.
No problem
Thanks
-Minh
>
>
>
>
>On 3/20/13, Diep, Minh <minh.diep(a)intel.com> wrote:
>>
>>
>> On 3/20/13 10:21 AM, "linux freaker" <linuxfreaker(a)gmail.com> wrote:
>>
>>>Just to understand it correctly.
>>>If I have 1 MDS, 2 OSS with 6 OST each(created through LVM) and 4
>>>lustreclients.
>>>So, as per your statement, it's equivalent to 1 NameNode(=>
>>>LustreClient1) and 3 DataNode(=>lustreclient2, lustreclient3,
>>>lustreclient3),
>> Should be 4 datanode + 1 namenode. You had lustreclient3 twice?
>>
>>>while MDS and OSS/OST will be undisturbed and will
>>>neither be namenode/datanode.. am I right?
>> It can be namenode but should not be datanode since we do not
>> recommend mounting a Lustre client on servers.
>>>
>>>I didn't get this point: "I would also keep the same total number
>>>of OSTs and total number of disks on all datanodes." Can you please
>>>clarify?
>> If you have 2 oss with 6 OST each, resulting total of 12 disks, then you
>> might use 3 disks on each of 4 datanode (ie. Total 12 disks.)
>> However, you are using LVM. That's different.
>>
>> You can start out with what you have to see how the perf numbers turn out,
>> but it's difficult to draw any conclusion if we are not comparing
>> apples to apples.
>>
>> Thanks
>> -Minh
>>
>>>
>>>Regarding LUG, I will try to see if I can attend it. Thanks for sharing
>>>it.
>>>
>>>On 3/20/13, Diep, Minh <minh.diep(a)intel.com> wrote:
>>>> Hi,
>>>>
>>>> There isn't a simple or trivial comparison between Hadoop+HDFS and
>>>> Hadoop+Lustre.
>>>> A typical approach (IMHO) is keeping the same number of Lustre clients
>>>> as Hadoop datanodes.
>>>> I would also keep the same total number of OSTs and total number of
>>>> disks on all datanodes.
>>>>
>>>>