> did you try to find files based on ost ID?
> lfs find -O 1,2
Yes, I did.
(In my previous post I just forgot to write "lfs".)
find /path_to_directory -type f | wc -l gave me 240 files
lfs find -O OST1 -O OST2 /path_to_directory gave me just 4 files
[root@archives-mds ~]# lfs find -O archives-OST0001 /ARCHIVES/spectre_hr_2/tigr_nc/
[root@archives-mds ~]# lfs find -O archives-OST0002 /ARCHIVES/spectre_hr_2/tigr_nc/
/ARCHIVES/spectre_hr_2/tigr_nc//spi4a_1241_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc//spi4a_0121_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc//spi4a_2261_Night.nc
[root@archives-mds ~]# find /ARCHIVES/spectre_hr_2/tigr_nc/
/ARCHIVES/spectre_hr_2/tigr_nc/
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0061_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1050_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0261_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2161_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1461_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0201_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2001_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2021_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1641_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2001_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0341_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1761_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1221_Night.nc
… etc. (I truncated the output.)
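The two listings above can be compared mechanically. What follows is only a minimal sketch, assuming each listing has been saved to a text file with one path per line; the function name files_not_on_osts and the file names in the usage comment are hypothetical. It collapses repeated slashes first, because lfs find printed paths with a doubled slash here while find did not.

```shell
#!/bin/sh
# Print paths that appear in the plain `find` listing ($1) but in none of
# the per-OST `lfs find -O` listings ($2).
# Both arguments are text files with one path per line.
files_not_on_osts() {
    all_list=$1
    ost_list=$2
    tmp_all=$(mktemp) || return 1
    tmp_ost=$(mktemp) || return 1
    # Collapse runs of slashes so /dir//file and /dir/file compare equal.
    sed 's#//*#/#g' "$all_list" | sort -u > "$tmp_all"
    sed 's#//*#/#g' "$ost_list" | sort -u > "$tmp_ost"
    # comm -23 keeps lines that occur only in the first (sorted) input.
    comm -23 "$tmp_all" "$tmp_ost"
    rm -f "$tmp_all" "$tmp_ost"
}

# Usage on a Lustre client (not run here; all.txt and on_osts.txt are
# hypothetical file names):
#   find /ARCHIVES/spectre_hr_2/tigr_nc/ -type f > all.txt
#   lfs find -O archives-OST0001 /ARCHIVES/spectre_hr_2/tigr_nc/ >  on_osts.txt
#   lfs find -O archives-OST0002 /ARCHIVES/spectre_hr_2/tigr_nc/ >> on_osts.txt
#   files_not_on_osts all.txt on_osts.txt
```

The files it prints are the ones that find sees but that neither OST query reports.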
This file, spi4a_1241_Day.nc, is ok
==========================
[root@archives-mds ~]# lfs getstripe /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 2
obdidx objid objid group
2 14458853 0xdc9fe5 0
[root@archives-mds ~]# stat /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc
File: « /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc »
Size: 40020545 Blocks: 78168 IO Block: 4194304 fichier
Device: effa7236h/4026167862d Inode: 144115373027889935 Links: 1
Access: (0660/-rw-rw----) Uid: (11161/tournier) Gid: (11434/ sps)
Access: 2015-05-13 12:13:32.000000000 +0200
Modify: 2014-04-01 19:03:10.000000000 +0200
Change: 2015-06-17 14:56:15.000000000 +0200
[root@archives-mds ~]# cp /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc /tmp/waste.nc
The copy completes immediately.
This file, spi4a_1941_Day.nc, is not ok
==========================
[root@archives-mds ~]# lfs getstripe /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 1
obdidx objid objid group
1 9958236 0x97f35c 0
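
The obdidx column is what ties a file to an OST: 2 for the readable file above, 1 for this one. As a small sketch for pulling that index out of lfs getstripe output, assuming the table layout printed in this thread (the function name ost_indexes is hypothetical):

```shell
#!/bin/sh
# Read `lfs getstripe` output on stdin and print the OST index (obdidx)
# of every object row. Assumes the layout shown in this thread:
# a header line containing "obdidx" followed by one row per object.
ost_indexes() {
    awk '
        /obdidx/ { in_table = 1; next }                 # header row starts the table
        NF == 0  { in_table = 0; next }                 # blank line ends the table
        in_table && $1 ~ /^[0-9]+$/  { print $1 }       # object row: first column
        in_table && $1 !~ /^[0-9]+$/ { in_table = 0 }   # any other line ends the table
    '
}

# Usage on a Lustre client (not run here): count objects per OST index.
#   lfs getstripe /ARCHIVES/spectre_hr_2/tigr_nc/*.nc | ost_indexes | sort -n | uniq -c
```

Tallying the indexes this way shows how many files in the directory have their object on each OST.
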
[root@archives-mds ~]# stat /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc
File: « /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc »
Size: 40020545 Blocks: 78168 IO Block: 4194304 fichier
Device: effa7236h/4026167862d Inode: 144115373027890005 Links: 1
Access: (0660/-rw-rw----) Uid: (11161/tournier) Gid: (11434/ sps)
Access: 2015-05-13 12:16:03.000000000 +0200
Modify: 2014-04-01 19:04:48.000000000 +0200
Change: 2015-06-17 14:58:03.000000000 +0200
[root@archives-mds ~]# cp /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc /tmp/waste.nc
I have to wait a few minutes before the system gives the prompt back. The copy does complete, but only after a long period of inactivity.
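
To find which files behave like this without waiting on each cp, a time-limited read can serve as a rough probe. This is only a sketch, assuming GNU coreutils (timeout, head -c) is installed; the function name probe_read and the 10-second default are hypothetical choices. A process stuck in uninterruptible I/O may not react to the signal that timeout sends, so the probe itself can still hang.

```shell
#!/bin/sh
# Try to read the first MiB of a file, giving up after a time limit.
# Returns 0 if the read finished in time, non-zero otherwise
# (GNU timeout exits with status 124 when the limit is hit).
probe_read() {
    file=$1
    limit=${2:-10}   # seconds; 10 is an arbitrary default
    timeout "$limit" head -c 1048576 "$file" > /dev/null 2>&1
}

# Usage on a Lustre client (not run here): list files whose read does not
# finish within 10 seconds.
#   for f in /ARCHIVES/spectre_hr_2/tigr_nc/*.nc; do
#       probe_read "$f" 10 || echo "slow or unreadable: $f"
#   done
```
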
David
-----Original Message-----
From: Arman Khalatyan [mailto:arm2arm@gmail.com]
Sent: Thursday, 18 June 2015 11:50
To: David Roman
Cc: hpdd-discuss@ml01.01.org
Subject: Re: [HPDD-discuss] Data lost on OST
Did you try to find files based on OST ID?
lfs find -O 1,2
Also check stat on the files that you cannot copy:
lfs getstripe /lustre/filename
stat /lustre/filename
***********************************************************
Dr. Arman Khalatyan eScience -SuperComputing Leibniz-Institut für Astrophysik Potsdam (AIP) An der Sternwarte 16, 14482 Potsdam, Germany
***********************************************************
On Thu, Jun 18, 2015 at 11:01 AM, David Roman <David.Roman@noveltis.fr> wrote:
> I am still trying to understand my problem, and I have just made a surprising discovery.
> A quick reminder:
>
>
> find /path_to_directory -type f | wc -l gave me 240 files
> lfs find -O OST1 -O OST2 /path_to_directory gave me just 4 files
>
> I can read and copy these 4 files, but not the others.
>
> If I run cp /path_to_directory/* /other_path, the command doesn't work.
>
> BUT the really surprising thing is that with the rsync command I can copy all the data!
>
> David
>
>
> -----Original Message-----
> From: David Roman
> Sent: Wednesday, 17 June 2015 16:38
> To: hpdd-discuss@ml01.01.org
> Subject: RE: [HPDD-discuss] Data lost on OST
>
> I made a mistake when I replied, so I am re-posting for everyone.
>
>
> -----Original Message-----
> From: David Roman
> Sent: Wednesday, 17 June 2015 16:07
> To: 'Arman Khalatyan'
> Subject: RE: [HPDD-discuss] Data lost on OST
>
> I already rebooted the servers this morning.
>
>
> -----Original Message-----
> From: David Roman
> Sent: Wednesday, 17 June 2015 16:03
> To: 'Arman Khalatyan'
> Subject: RE: [HPDD-discuss] Data lost on OST
>
> No. I planned to use 3 OSS servers, each with 1 OST. First I deployed archives-mds (MDT0000) and archives-oss3 (OST0002). Later I deployed archives-oss2 (OST0001). I never used the third server, archives-oss1 (OST0000). OST0000 doesn't exist.
>
> I ran a test just now.
> I copied about 106 GB of data to my Lustre volume.
> The copy reported no errors, but the problem is the same: I
> don't have all the data. I found data on OST0002, but nothing on OST0001.
>
>
>
> -----Original Message-----
> From: Arman Khalatyan [mailto:arm2arm@gmail.com]
> Sent: Wednesday, 17 June 2015 15:53
> To: David Roman
> Subject: Re: [HPDD-discuss] Data lost on OST
>
> What about
> OST0000 : Resource temporarily unavailable???
> Did you recently remove it from the MDS?
> Before running lctl lfsck, try rebooting the MDS/OSS. After startup it usually
> starts an automatic scrub.
>
>
> On Wed, Jun 17, 2015 at 3:43 PM, David Roman <David.Roman@noveltis.fr> wrote:
>> Yes if I do
>> ls -l /directory/path
>> find /directory/path
>> lfs find /directory/path
>>
>> I see my files (240)
>>
>> If I do
>> lfs find -O archives-OST0001 /directory/path ==> I see nothing
>> lfs find -O archives-OST0002 /directory/path ==> I see only 4 files, and I can read only these files.
>>
>>
>> # lctl dl
>> 0 UP osd-ldiskfs archives-MDT0000-osd archives-MDT0000-osd_UUID 10
>> 1 UP mgs MGS MGS 51
>> 2 UP mgc MGC192.168.1.45@tcp 86f5008e-05e8-6d58-4fa6-64dfebed9dd8 5
>> 3 UP mds MDS MDS_uuid 3
>> 4 UP lod archives-MDT0000-mdtlov archives-MDT0000-mdtlov_UUID 4
>> 5 UP mdt archives-MDT0000 archives-MDT0000_UUID 53
>> 6 UP mdd archives-MDD0000 archives-MDD0000_UUID 4
>> 7 UP qmt archives-QMT0000 archives-QMT0000_UUID 4
>> 8 UP osp archives-OST0002-osc-MDT0000 archives-MDT0000-mdtlov_UUID 5
>> 9 UP osp archives-OST0001-osc-MDT0000 archives-MDT0000-mdtlov_UUID 5
>> 10 UP lwp archives-MDT0000-lwp-MDT0000 archives-MDT0000-lwp-MDT0000_UUID 5
>> 11 UP lov archives-clilov-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 4
>> 12 UP lmv archives-clilmv-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 4
>> 13 UP mdc archives-MDT0000-mdc-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 5
>> 14 UP osc archives-OST0002-osc-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 5
>> 15 UP osc archives-OST0001-osc-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 5
>> ***************************************************************************
>>
>> # lfs df
>> UUID 1K-blocks Used Available Use% Mounted on
>> archives-MDT0000_UUID 366138224 10685360 330392200 3% /ARCHIVES[MDT:0]
>> OST0000 : Resource temporarily unavailable
>> archives-OST0001_UUID 42910527264 15650953496 25064315980 38% /ARCHIVES[OST:1]
>> archives-OST0002_UUID 42911625592 40387604352 377020420 99% /ARCHIVES[OST:2]
>>
>> filesystem summary: 85822152856 56038557848 25441336400 69% /ARCHIVES
>> ***************************************************************************
>>
>> # cat /proc/fs/lustre/lov/*-MDT0000-mdtlov/target_obd
>> 1: archives-OST0001_UUID ACTIVE
>> 2: archives-OST0002_UUID ACTIVE
>> ***************************************************************************
>>
>>
>>
>>
>>
>>
>>
>>
>> I found some Lustre errors in the logs, but I don't understand them.
>>
>> For example:
>>
>> messages-20150531:May 27 17:41:28 archives-oss2 kernel: LustreError: dumping log to /tmp/lustre-log.1432741288.2818
>> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bad1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
>> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bc1e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
>> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bad1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
>> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bc1e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
>> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bad1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
>> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bc1e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
>>
>> Can I try to use lctl lfsck?
>>
>>
>> David
>>
>>
>>
>>
>> -----Original Message-----
>> From: Arman Khalatyan [mailto:arm2arm@gmail.com]
>> Sent: Wednesday, 17 June 2015 15:16
>> To: David Roman
>> Cc: hpdd-discuss@ml01.01.org
>> Subject: Re: [HPDD-discuss] Data lost on OST
>>
>> Can you see the file names with "ls -l"?
>> Do you see any errors in the logs? You can also check connectivity from the client to the OSTs:
>> lctl dl
>> lfs df
>> or, on the MDS:
>> cat /proc/fs/lustre/lov/*-MDT0000-mdtlov/target_obd
>>
>>
>>
>> On Wed, Jun 17, 2015 at 11:23 AM, David Roman <David.Roman@noveltis.fr> wrote:
>>> Hello,
>>>
>>>
>>> I use Lustre 2.6. I have one MDS and 2 OSS servers.
>>> When I run ls in a specific directory I see my files. But when I try to read some of them with cat, the command hangs.
>>>
>>> With lfs find -O <device> /my/directory I do not see all the files!
>>>
>>> Could you help me, please?
>>>
>>>
>>> Thank you
>>> _______________________________________________
>>> HPDD-discuss mailing list
>>> https://lists.01.org/mailman/listinfo/hpdd-discuss