did you try to find files based on ost ID?
lfs find -O 1,2
Yes, I did. (In my previous post I just forgot to write "lfs".)
find /path_to_directory -type f | wc -l found 240 files
lfs find -O OST1 -O OST2 /path_to_directory found only 4 files
[root@archives-mds ~]# lfs find -O archives-OST0001 /ARCHIVES/spectre_hr_2/tigr_nc/
[root@archives-mds ~]# lfs find -O archives-OST0002 /ARCHIVES/spectre_hr_2/tigr_nc/
/ARCHIVES/spectre_hr_2/tigr_nc//spi4a_1241_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc//spi4a_0121_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc//spi4a_2261_Night.nc
[root@archives-mds ~]# find /ARCHIVES/spectre_hr_2/tigr_nc/
/ARCHIVES/spectre_hr_2/tigr_nc/
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0061_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1050_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0261_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2161_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1461_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0201_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2001_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2021_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1641_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_2001_Night.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_0341_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1761_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1221_Night.nc
… etc. I truncated the output.
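The comparison above (plain find versus lfs find -O) can be automated once both listings are saved as text. The following is only a minimal sketch: the function names are hypothetical, and it assumes one path per line, with find run with -type f so that directories are not listed. It collapses the doubled slash that lfs find prints when the directory argument ends in "/", so the two listings can be compared directly.

```python
import posixpath


def normalize(paths):
    """Strip whitespace, drop blank lines, and collapse duplicate slashes."""
    return {posixpath.normpath(p.strip()) for p in paths if p.strip()}


def files_not_on_osts(all_files, per_ost_files):
    """Return the paths listed by plain find but by no `lfs find -O` run.

    all_files     -- lines from: find /path_to_directory -type f
    per_ost_files -- lines from every `lfs find -O <ost>` run, concatenated
    """
    return sorted(normalize(all_files) - normalize(per_ost_files))
```

Feeding it the saved output of the commands above gives the list of files that no queried OST reports, which are the candidates to inspect with lfs getstripe.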
This file, spi4a_1241_Day.nc, is ok
==========================
[root@archives-mds ~]# lfs getstripe /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 2
     obdidx           objid           objid           group
          2        14458853        0xdc9fe5               0
[root@archives-mds ~]# stat /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc
File: « /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc »
Size: 40020545 Blocks: 78168 IO Block: 4194304 fichier
Device: effa7236h/4026167862d Inode: 144115373027889935 Links: 1
Access: (0660/-rw-rw----) Uid: (11161/tournier) Gid: (11434/ sps)
Access: 2015-05-13 12:13:32.000000000 +0200
Modify: 2014-04-01 19:03:10.000000000 +0200
Change: 2015-06-17 14:56:15.000000000 +0200
[root@archives-mds ~]# cp /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1241_Day.nc /tmp/waste.nc
The copy completes immediately.
This file, spi4a_1941_Day.nc, is not ok
==========================
[root@archives-mds ~]# lfs getstripe /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc
/ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 1
     obdidx           objid           objid           group
          1         9958236        0x97f35c               0
[root@archives-mds ~]# stat /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc
File: « /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc »
Size: 40020545 Blocks: 78168 IO Block: 4194304 fichier
Device: effa7236h/4026167862d Inode: 144115373027890005 Links: 1
Access: (0660/-rw-rw----) Uid: (11161/tournier) Gid: (11434/ sps)
Access: 2015-05-13 12:16:03.000000000 +0200
Modify: 2014-04-01 19:04:48.000000000 +0200
Change: 2015-06-17 14:58:03.000000000 +0200
[root@archives-mds ~]# cp /ARCHIVES/spectre_hr_2/tigr_nc/spi4a_1941_Day.nc /tmp/waste.nc
I have to wait a few minutes before the system returns the prompt. The copy does complete, but only after a long period of inactivity.
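In the two examples above, the file that copies immediately has its object on obdidx 2 and the slow one on obdidx 1. To check whether that pattern holds for every file, the getstripe output can be grouped by OST index. This is a rough sketch with hypothetical function names; it assumes the stripe-row layout shown above (obdidx, decimal objid, hex objid, group), which other Lustre versions may print differently.

```python
import re

# One stripe row: obdidx, objid (decimal), objid (hex), group.
STRIPE_ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(0x[0-9a-fA-F]+)\s+(\S+)\s*$")


def ost_indices(getstripe_text):
    """Return the obdidx value of every stripe row found in the text."""
    indices = []
    for line in getstripe_text.splitlines():
        match = STRIPE_ROW.match(line)
        if match:
            indices.append(int(match.group(1)))
    return indices


def group_by_ost(outputs):
    """Map each OST index to the sorted files that have a stripe on it.

    `outputs` maps a file path to the `lfs getstripe` text captured for it.
    """
    groups = {}
    for path, text in outputs.items():
        for index in set(ost_indices(text)):
            groups.setdefault(index, []).append(path)
    return {index: sorted(paths) for index, paths in groups.items()}
```

If all the files that block on cp end up under the same index, that points at one OST rather than at the files themselves.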
David
-----Original Message-----
From: Arman Khalatyan [mailto:arm2arm@gmail.com]
Sent: Thursday, 18 June 2015 11:50
To: David Roman
Cc: hpdd-discuss(a)ml01.01.org
Subject: Re: [HPDD-discuss] Data lost on OST
did you try to find files based on ost ID?
lfs find -O 1,2
Also check stat on the files that you cannot copy:
lfs getstripe /lustre/filename
stat /lustre/filename
***********************************************************
Dr. Arman Khalatyan eScience -SuperComputing Leibniz-Institut für Astrophysik Potsdam
(AIP) An der Sternwarte 16, 14482 Potsdam, Germany
***********************************************************
On Thu, Jun 18, 2015 at 11:01 AM, David Roman <David.Roman@noveltis.fr> wrote:
I am still trying to understand my problem, and I have just made a surprising discovery.
A quick reminder:
find /path_to_directory -type f | wc -l found 240 files
lfs find -O OST1 -O OST2 /path_to_directory found only 4 files
I can read and copy these 4 files, but not the others.
If I run cp /path_to_directory/* /other_path, the command does not work.
BUT the really surprising thing is that with the rsync command I can copy all the data!
David
-----Original Message-----
From: David Roman
Sent: Wednesday, 17 June 2015 16:38
To: hpdd-discuss@ml01.01.org
Subject: RE: [HPDD-discuss] Data lost on OST
I made a mistake when I replied, so I am re-posting for everyone.
-----Original Message-----
From: David Roman
Sent: Wednesday, 17 June 2015 16:07
To: 'Arman Khalatyan'
Subject: RE: [HPDD-discuss] Data lost on OST
I already rebooted the servers this morning.
-----Original Message-----
From: David Roman
Sent: Wednesday, 17 June 2015 16:03
To: 'Arman Khalatyan'
Subject: RE: [HPDD-discuss] Data lost on OST
No, I plan to use 3 OSS servers, each with 1 OST. First I deployed archives-mds (MDT0000) and archives-oss3 (OST0002). Later I deployed archives-oss2 (OST0001). I have never used the third server, archives-oss1 (OST0000), so OST0000 does not exist.
I ran a test just now.
I copied some data, about 106 GB, to my Lustre volume.
The copy operation reported no error, but the problem is the same:
I don't have all the data. I found data on OST0002, but nothing on OST0001.
-----Original Message-----
From: Arman Khalatyan [mailto:arm2arm@gmail.com]
Sent: Wednesday, 17 June 2015 15:53
To: David Roman
Subject: Re: [HPDD-discuss] Data lost on OST
What about
OST0000 : Resource temporarily unavailable?
Did you recently remove it from the MDS?
Before running lctl lfsck, try rebooting the MDS/OSS. After startup it usually starts an auto scrub.
On Wed, Jun 17, 2015 at 3:43 PM, David Roman <David.Roman@noveltis.fr> wrote:
> Yes, if I run
> ls -l /directory/path
> find /directory/path
> lfs find /directory/path
>
> I see my files (240).
>
> If I run
> lfs find -O archives-OST0001 /directory/path ==> I see nothing
> lfs find -O archives-OST0002 /directory/path ==> I see only 4 files, and I can read only these files.
>
>
> # lctl dl
>   0 UP osd-ldiskfs archives-MDT0000-osd archives-MDT0000-osd_UUID 10
>   1 UP mgs MGS MGS 51
>   2 UP mgc MGC192.168.1.45@tcp 86f5008e-05e8-6d58-4fa6-64dfebed9dd8 5
>   3 UP mds MDS MDS_uuid 3
>   4 UP lod archives-MDT0000-mdtlov archives-MDT0000-mdtlov_UUID 4
>   5 UP mdt archives-MDT0000 archives-MDT0000_UUID 53
>   6 UP mdd archives-MDD0000 archives-MDD0000_UUID 4
>   7 UP qmt archives-QMT0000 archives-QMT0000_UUID 4
>   8 UP osp archives-OST0002-osc-MDT0000 archives-MDT0000-mdtlov_UUID 5
>   9 UP osp archives-OST0001-osc-MDT0000 archives-MDT0000-mdtlov_UUID 5
>  10 UP lwp archives-MDT0000-lwp-MDT0000 archives-MDT0000-lwp-MDT0000_UUID 5
>  11 UP lov archives-clilov-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 4
>  12 UP lmv archives-clilmv-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 4
>  13 UP mdc archives-MDT0000-mdc-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 5
>  14 UP osc archives-OST0002-osc-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 5
>  15 UP osc archives-OST0001-osc-ffff880029b0e000 3ba711ce-278f-bd95-e4be-9cae34c7a5ab 5
>
>
> ***************************************************************************
>
> # lfs df
> UUID                     1K-blocks        Used   Available Use% Mounted on
> archives-MDT0000_UUID    366138224    10685360   330392200   3% /ARCHIVES[MDT:0]
> OST0000 : Resource temporarily unavailable
> archives-OST0001_UUID  42910527264 15650953496 25064315980  38% /ARCHIVES[OST:1]
> archives-OST0002_UUID  42911625592 40387604352   377020420  99% /ARCHIVES[OST:2]
>
> filesystem summary:    85822152856 56038557848 25441336400  69% /ARCHIVES
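Two things stand out in the lfs df output above: OST0000 is reported as unavailable, and OST0002 is at 99%. A small parser can flag both conditions automatically. The sketch below is illustrative only: the function names and the 90% threshold are assumptions of this example, and the regular expressions assume the column layout shown above. It does not claim that either condition is the cause of the problem.

```python
import re

# A healthy target row: UUID, 1K-blocks, Used, Available, Use%, mount point.
TARGET_ROW = re.compile(
    r"^(\S+_UUID)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)%\s+(\S+)\s*$"
)
# An error row such as "OST0000 : Resource temporarily unavailable".
ERROR_ROW = re.compile(r"^((?:OST|MDT)[0-9a-fA-F]{4})\s*:\s*(.+?)\s*$")


def parse_lfs_df(text):
    """Split `lfs df` output into (targets, errors)."""
    targets, errors = [], []
    for line in text.splitlines():
        row = TARGET_ROW.match(line)
        if row:
            targets.append({
                "uuid": row.group(1),
                "use_pct": int(row.group(5)),
                "mount": row.group(6),
            })
            continue
        err = ERROR_ROW.match(line)
        if err:
            errors.append((err.group(1), err.group(2)))
    return targets, errors


def nearly_full(targets, threshold_pct=90):
    """Return the UUIDs at or above threshold_pct (90 is an arbitrary choice)."""
    return [t["uuid"] for t in targets if t["use_pct"] >= threshold_pct]
```

The header and the "filesystem summary" line match neither pattern, so they are skipped.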
>
> ***************************************************************************
>
> # cat /proc/fs/lustre/lov/*-MDT0000-mdtlov/target_obd
> 1: archives-OST0001_UUID ACTIVE
> 2: archives-OST0002_UUID ACTIVE
>
> ***************************************************************************
>
>
>
>
>
>
>
>
> I found some Lustre errors in the logs, but I don't understand them.
>
> For example:
>
> messages-20150531:May 27 17:41:28 archives-oss2 kernel: LustreError: dumping log to /tmp/lustre-log.1432741288.2818
> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bad1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bc1e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bad1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bc1e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bad1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
> messages-20150531:May 27 17:41:28 archives-oss2 kernel: [<ffffffffa080bc1e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
>
> Can I try to use lctl lfsck?
>
> David
>
> -----Original Message-----
> From: Arman Khalatyan [mailto:arm2arm@gmail.com]
> Sent: Wednesday, 17 June 2015 15:16
> To: David Roman
> Cc: hpdd-discuss@ml01.01.org
> Subject: Re: [HPDD-discuss] Data lost on OST
>
> Can you see the file names with "ls -l"?
> Do you see any errors in the logs? You can also check connectivity from the client to the OSTs:
> lctl dl
> lfs df
> or, on the MDS, run:
> cat /proc/fs/lustre/lov/*-MDT0000-mdtlov/target_obd
>
>
>
>
> On Wed, Jun 17, 2015 at 11:23 AM, David Roman <David.Roman@noveltis.fr> wrote:
>> Hello,
>>
>> I use Lustre 2.6. I have one MDS and 2 OSS servers.
>> When I run ls in a specific directory I see my files, but when I try to read some of them with cat, the command blocks.
>>
>> With lfs find -O <device> /my/directory I do not see all the files!
>>
>> Could you help me, please?
>>
>> Thank you
>> _______________________________________________
>> HPDD-discuss mailing list
>> HPDD-discuss@lists.01.org