On 2016/01/08, 14:05, "HPDD-discuss on behalf of Kurt Strosahl"
<hpdd-discuss-bounces(a)lists.01.org on behalf of strosahl(a)jlab.org> wrote:
Perhaps... but the AT was showing up on the mdt system when I did an
lctl there.
After a reboot I recreated testL, and I'm now able to write to both ost3
and ost4 (tested by using pools).
I'm going to start stepping through the ost removal process, and see if I
can find which step broke testL.
Has anyone else out there tried permanently removing an ost from a lustre
2.5.3 system? I know I have to set lazystatfs so that df commands don't
hang, but I'm not sure what other traps 2.5.3 has laid for me.
My home MythTV Lustre filesystem (running 2.5.3.90) has had a permanently
deactivated OST for several months, partly for testing reasons, and partly
because I was reconfiguring the disks in my server. I haven't had any
problems with this.
After emptying the OST, I used "lctl conf_param myth-OST0004.osc.active=0"
to mark the OST inactive and "lctl conf_param myth.llite.lazystatfs=1"
(though I'm not sure if this is required if the OST is permanently
inactive).
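
As a minimal sketch of the two commands above, the helper below only prints
them for a given filesystem name and OST index so they can be reviewed before
being run on the MGS. The function name ost_deactivate_cmds and its arguments
are hypothetical, and it assumes the OST target name uses a four-digit
hexadecimal index, as in myth-OST0004.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands used to mark an
# emptied OST permanently inactive. Function name and arguments are
# hypothetical, not part of lctl.
ost_deactivate_cmds() {
    fsname=$1   # filesystem name, e.g. myth
    index=$2    # decimal OST index, e.g. 4
    # Assumption: target names carry a 4-digit hex index (OST0004).
    target=$(printf '%s-OST%04x' "$fsname" "$index")
    printf 'lctl conf_param %s.osc.active=0\n' "$target"
    printf 'lctl conf_param %s.llite.lazystatfs=1\n' "$fsname"
}

# Usage: ost_deactivate_cmds myth 4
```

Printing rather than executing keeps the sketch safe to try on any machine;
the output can be pasted on the MGS once it looks right.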
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel High Performance Data Division
w/r,
Kurt J. Strosahl
System Administrator
Scientific Computing Group, Thomas Jefferson National Accelerator Facility
----- Original Message -----
From: "Ken Jeffries" <jeffries(a)cray.com>
To: "Kurt Strosahl" <strosahl(a)jlab.org>, hpdd-discuss(a)ml01.01.org
Sent: Friday, January 8, 2016 3:32:28 PM
Subject: Re: [HPDD-discuss] lctl dl showing AT lustre 2.5.3
This might be related to
https://jira.hpdd.intel.com/browse/LU-5582
Ken J
From: HPDD-discuss <hpdd-discuss-bounces@lists.01.org> on behalf of
Kurt Strosahl <strosahl@jlab.org>
Date: Friday, January 8, 2016 at 2:20 PM
To: "hpdd-discuss@ml01.01.org" <hpdd-discuss@ml01.01.org>
Subject: [HPDD-discuss] lctl dl showing AT lustre 2.5.3
Good Afternoon,
While testing how to remove an ost from a lustre 2.5.3 file system I
have come across an unusual bug... After I deactivated the ost and
unmounted it I tried using the --replace option listed in man mkfs.lustre.
After that I noticed that I could no longer write to the osts that
followed the one I'd been testing on, and having torn down and rebuilt
the test file system I was unable to get the ost to work again (this
would be on a completely fresh filesystem). While trying to debug I came
across the following...
lctl dl
22 AT osp testL-OST0003-osc-MDT0000 testL-MDT0000-mdtlov_UUID 1
Everything else on the file system is unmounted, and ost3 is not mounted
on the system it was previously mounted on (an lctl dl there shows
nothing loaded).
I'm going to reboot the system, which I think should clear this issue...
but I was curious as to what the AT meant, and if it crops up in a
production system how I'd go about removing it.
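
One way to spot such a device quickly is to filter the device list on its
second column. The sketch below assumes that column is the device state (UP
for a fully set-up device) and that AT most likely marks a device that is
attached but not set up; that reading is an assumption and has not been
checked against 2.5.3. The function name non_up_devices is hypothetical.

```shell
#!/bin/sh
# Sketch: read `lctl dl`-style lines on stdin and print the state and
# name of every device whose state column is not "UP".
# Assumed column order: index, state, type, name, UUID, refcount.
non_up_devices() {
    awk 'NF >= 4 && $2 != "UP" { print $2, $4 }'
}

# Usage: lctl dl | non_up_devices
```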
I also wonder what I did that resulted in the state of AT... I set it to
inactive using:
lctl conf_param testL-OST0003.osc.active=1
Note that this is permanently setting the OST _active_ instead of
_inactive_ as was written above.
then used the --replace option from mkfs.lustre to reuse index 3.
When I mounted it back up everything seemed functional... I was able to
use that OST to perform some benchmarking. It was only later, when I'd
added more osts to the test environment, that I found I couldn't write to
them.