Andreas,
Your summary here is incredibly valuable. Although I aspire to someday know and
understand the entire Lustre documentation, in the meantime this translation of fact into
application is wonderful.
Thanks for your time on this list.
John Richards
john.richards(a)icecube.wisc.edu
On Oct 18, 2013, at 04:46, "Dilger, Andreas" <andreas.dilger(a)intel.com> wrote:
On 2013/10/18 1:53 AM, "Pascal Decourt" <pdecourt(a)sgi.com> wrote:
> Very interesting.
> I ran the lctl command and here is the output:
>
>
> 65838502:10.149.254.2@o2ib:12345-10.149.1.171@o2ib:x1448244599539665:448:Interpret:1382081942:-1382081942s(-1382081952s) opc 4
The size of the history buffer is controlled via the "req_buffer_history_max"
tunable. By default it only has a few entries in it.
> For my understanding, could you please give more details on the fields of
> this line?
Fields are separated by colons.
1st field: a sequence counter, so it is possible to determine if a
record has already been processed.
2nd field: target NID (i.e. server)
3rd field: source NID (i.e. client)
4th field: client RPC XID number (per-client unique identifier)
5th field: RPC request buffer size
6th field: state of the RPC processing
7th field: time (unix epoch) when RPC arrived
8th field: total processing time (seconds)
9th field: time before client would have timed out request
10th field: RPC opcode number (see lustre_idl.h)
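
To make the field list above concrete, here is a minimal Python sketch that splits one req_history line into the ten fields. The names ReqHistoryEntry and parse_req_history_line are made up for illustration, and the pattern is inferred only from the sample lines in this thread, so treat it as an assumption rather than a format specification. Note that in those samples only the first seven fields are actually colon-separated; the last three appear as "Ns(Ms) opc K". The sketch also assumes the leading "12345-" on the third field is a process ID prefix in front of the client NID.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the sample lines quoted in this thread (an assumption,
# not taken from the Lustre source).
LINE_RE = re.compile(
    r"^(?P<seq>\d+):"                 # 1st: sequence counter
    r"(?P<target>[^:]+):"             # 2nd: target NID (server)
    r"(?P<source>[^:]+):"             # 3rd: source NID (client)
    r"x(?P<xid>\d+):"                 # 4th: client RPC XID
    r"(?P<size>\d+):"                 # 5th: request buffer size
    r"(?P<state>\w+):"                # 6th: RPC processing state
    r"(?P<arrival>\d+):"              # 7th: arrival time (unix epoch)
    r"(?P<service>[+-]?\d+)s"         # 8th: total processing time (seconds)
    r"\((?P<deadline>[+-]?\d+)s\)"    # 9th: time before client timeout
    r"\s+opc\s+(?P<opcode>\d+)\s*$"   # 10th: RPC opcode number
)


class ReqHistoryEntry(NamedTuple):
    seq: int
    target_nid: str
    client_nid: str
    xid: int
    buf_size: int
    state: str
    arrival: int
    service_secs: int
    deadline_secs: int
    opcode: int


def parse_req_history_line(line: str) -> Optional[ReqHistoryEntry]:
    """Return the parsed fields, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    source = m["source"]
    # Assumption: a numeric "12345-" prefix is a process ID, not part of the NID.
    pid_prefix = re.match(r"\d+-(.+)$", source)
    client_nid = pid_prefix.group(1) if pid_prefix else source
    return ReqHistoryEntry(
        seq=int(m["seq"]),
        target_nid=m["target"],
        client_nid=client_nid,
        xid=int(m["xid"]),
        buf_size=int(m["size"]),
        state=m["state"],
        arrival=int(m["arrival"]),
        service_secs=int(m["service"]),
        deadline_secs=int(m["deadline"]),
        opcode=int(m["opcode"]),
    )
```

Lines that do not match (for example the "ost.OSS.ost_io.req_history=" header printed by lctl) come back as None, so they can simply be skipped.
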
This is all described in the user manual if you search for req_history:
https://wiki.hpdd.intel.com/display/PUB/Documentation
There is also a tool "lustre_req_history" that processes these files a bit.
Cheers, Andreas
> On 18/10/2013 07:14, Kumar, Amit wrote:
>
>
> Awesome, Andreas, this sounds promising. I will give this a shot in the
> morning....
> Regards
> Amit
>
> Sent from my iPhone
>
> On Oct 17, 2013, at 6:19 PM, "Dilger, Andreas" <andreas.dilger(a)intel.com> wrote:
>
> On 2013/10/14 11:56 AM, "Kumar, Amit" <ahkumar(a)mail.smu.edu> wrote:
> I just saw your message. Even though I have the OST deactivated, something
> is still writing a significant amount of data to it. I hope you can help me
> understand this better. Even though new objects are not assigned, can the
> disk usage increase for the existing objects?
>
> On the other hand, my migration script is not able to locate any files on
> the OST to move, which is odd. One thing I know is that some file names are
> very odd, with maybe 20-30 spaces in a name that is at least 100 characters
> long. Some user's program is very liberal in how it creates file names, and
> I have no control over it.
>
> Hope you can shed some more light. I am planning on running fsck to
> identify any issues with the ost.
>
>
> I don't think fsck is needed. It seems like you have some process on a
> client that is writing to an open-unlinked file on those OSTs.
>
> To find out which client, run on the affected OSS:
> # lctl get_param ost.OSS.ost_io.req_history | grep "opc 4"
> ost.OSS.ost_io.req_history=
> 1431689:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1448908035689984:488:Complete:1382050012:0s(-43s) opc 4
> 1431690:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1448908035689988:488:Complete:1382050012:0s(-43s) opc 4
>
> This will print out all clients (NID "192.168.20.159@tcp" in this case)
> that are writing. On the client(s), use "lsof | grep /lustre/mountpoint"
> to see what processes have files open on the filesystem (probably marked
> "(deleted)") and kill those processes.
>
> This should immediately free up all of the space on those OSTs.
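
If the history holds many entries, picking the writers out by eye gets tedious. As a rough sketch (not part of any Lustre tool), the Python below tallies "opc 4" entries per client NID from saved req_history output. The function name count_clients_by_opcode is hypothetical, the line layout is assumed from the two sample lines above, and the "12345-" prefix on the third field is assumed to be a process ID rather than part of the NID.

```python
import re
from collections import Counter

# Opcode that the grep above filters on ("opc 4").
WRITE_OPCODE = 4


def count_clients_by_opcode(req_history_text: str,
                            opcode: int = WRITE_OPCODE) -> Counter:
    """Count req_history entries per client NID for one RPC opcode."""
    counts: Counter = Counter()
    suffix = f" opc {opcode}"
    for line in req_history_text.splitlines():
        line = line.strip()
        # Skips the "ost.OSS.ost_io.req_history=" header and other opcodes.
        if not line.endswith(suffix):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue
        source = fields[2]  # 3rd colon-separated field: source (client)
        # Assumption: a numeric "12345-" prefix is a process ID, not NID text.
        pid_prefix = re.match(r"\d+-(.+)$", source)
        counts[pid_prefix.group(1) if pid_prefix else source] += 1
    return counts


# The two sample entries shown above, as they might be saved to a file.
SAMPLE = """ost.OSS.ost_io.req_history=
1431689:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1448908035689984:488:Complete:1382050012:0s(-43s) opc 4
1431690:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1448908035689988:488:Complete:1382050012:0s(-43s) opc 4
"""

if __name__ == "__main__":
    for nid, n in count_clients_by_opcode(SAMPLE).most_common():
        print(nid, n)
```

The clients with the highest counts are the first places to run the lsof check described above.
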
>
> Cheers, Andreas
>
> -----Original Message-----
> From: Nico Budewitz [mailto:Nico.Budewitz@aei.mpg.de]
> Sent: Thursday, October 10, 2013 3:14 PM
> To: Kumar, Amit
> Cc: hpdd-discuss(a)lists.01.org
> Subject: Re: [HPDD-discuss] OST deactivated but still objects being
> written to it?
>
> Hi,
>
> OSTs temporarily deactivated via 'lctl --device devno deactivate' are
> marked as inactive. No new objects are assigned to a deactivated OST,
> but reads and writes of existing objects still work.
>
> See the Lustre Manual chapter "Removing an OST from the File System".
>
> Hope that helps
> --
> "la lykken gro, som gresset bak do"
> Nico Budewitz
> High Performance Computing
> Max-Planck-Institute for Gravitational Physics /
> Albert-Einstein-Institute Am Muehlenberg 1, 14476 Golm
> Tel.: +49 (0)331 567 7364 Fax: +49 (0)331 567 7284
>
> http://supercomputers.aei.mpg.de
>
> On Oct 10, 2013, at 10:01 PM, "Kumar, Amit" <ahkumar(a)mail.smu.edu> wrote:
>
> Dear All,
>
> I have deactivated an OST, yet I see that the space used on it is
> constantly increasing.
>
> Any thoughts what could be causing this?
>
> Thank you,
> Amit
>
Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division
_______________________________________________
HPDD-discuss mailing list
HPDD-discuss(a)lists.01.org
https://lists.01.org/mailman/listinfo/hpdd-discuss