I have also noticed that when only the MDS/MGS are up, I do not have an "UP" lwp device.
I suspect this may have something to do with the connection problems.

Is there a way to clear the process recover log and get lwp up again?
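
For anyone wanting to check the same thing, here is a minimal sketch that scans `lctl dl` output for a device type in the UP state. It assumes the six-column layout seen in the listing quoted further down (index, state, type, name, UUID, refcount). The helper names `parse_lctl_dl`, `not_up` and `has_up_device` are made up for illustration and are not part of Lustre. The sample lines are copied from the listings in this thread.

```python
def parse_lctl_dl(text):
    """Parse `lctl dl` output into dicts; tolerates '>' mail-quote prefixes."""
    devices = []
    for raw in text.splitlines():
        fields = raw.lstrip("> ").split()
        # Assumed columns: index, state, type, name, uuid, refcount
        if len(fields) != 6 or not fields[0].isdigit() or not fields[5].isdigit():
            continue
        index, state, dev_type, name, uuid, refcount = fields
        devices.append({
            "index": int(index),
            "state": state,
            "type": dev_type,
            "name": name,
            "uuid": uuid,
            "refcount": int(refcount),
        })
    return devices


def not_up(devices):
    """Return devices whose state is anything other than UP (e.g. AT)."""
    return [d for d in devices if d["state"] != "UP"]


def has_up_device(devices, dev_type):
    """True if at least one device of the given type is in state UP."""
    return any(d["type"] == dev_type and d["state"] == "UP" for d in devices)


# Sample lines taken from the listings in this thread.
SAMPLE = """\
  0 UP osd-ldiskfs MGS-osd MGS-osd_UUID 5
  1 UP mgs MGS MGS 5
  6 UP mdt testfs-MDT0000 testfs-MDT0000_UUID 3
  9 AT osp testfs-OST0000-osc-MDT0000 testfs-MDT0000-mdtlov_UUID 1
"""

devs = parse_lctl_dl(SAMPLE)
print("lwp UP:", has_up_device(devs, "lwp"))
print("not UP:", [d["name"] for d in not_up(devs)])
```

In practice the text would come from running `lctl dl` on the server rather than from a pasted sample.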


On Wed, Feb 5, 2014 at 3:35 AM, Dilger, Andreas <andreas.dilger@intel.com> wrote:
On 2014/02/03, 9:29 PM, "Anthony Alba" <ascanio.alba7@gmail.com> wrote:

>I have split an MDS from an MGS using 2.4.2; running on the same server
>using CentOS 2.6.32-358.23.2.el6_lustre + ldiskfs only.

What steps did you take to do this?

>I have done a writeconf; now the MDS won't "register" with the MGS.
>
>
>On the split MDS/MGS server the MGT and MDT devices mount but the logs
>show:
>
>
>Process recover log testfs-mdtir error -22.

Did you copy this file over to the new MGS?

>However, if I now try to mount the OSTs, they get stuck in "AT".
>
>
>One symptom is that on the split MGS/MDS the lustre modules cannot unload,
>as some process is using osp.o. If I don't start the OSTs the lustre
>stack can be unloaded.
>
>
>Any suggestions on getting the MDS/MGS to play nice?
>
>
>Thanks
>Anthony
>
>
>
>
>[root@mds1 ~]# lctl dl
>  0 UP osd-ldiskfs MGS-osd MGS-osd_UUID 5
>  1 UP mgs MGS MGS 5
>  2 UP mgc MGC5.5.200.5@o2ib 9ce8a82a-136e-616e-f6f3-f4570fbd364e 5
>  3 UP osd-ldiskfs testfs-MDT0000-osd testfs-MDT0000-osd_UUID 7
>  4 UP mds MDS MDS_uuid 3
>  5 UP lod testfs-MDT0000-mdtlov testfs-MDT0000-mdtlov_UUID 4
>  6 UP mdt testfs-MDT0000 testfs-MDT0000_UUID 3
>  7 UP mdd testfs-MDD0000 testfs-MDD0000_UUID 4
>  8 UP qmt testfs-QMT0000 testfs-QMT0000_UUID 4
>
>
>Feb  4 12:12:10 mds1 kernel: Lustre: MGS: Logs for fs testfs were removed by user request. All servers must be restarted in order to regenerate the logs.
>
>Feb  4 12:12:10 mds1 kernel: Lustre: testfs-MDT0000: used disk, loading
>
>Feb  4 12:12:10 mds1 kernel: LustreError: 8412:0:(osd_io.c:1000:osd_ldiskfs_read()) testfs-MDT0000: can't read 128@8192 on ino 21: rc = 0
>
>Feb  4 12:12:10 mds1 kernel: LustreError: 8412:0:(mdt_recovery.c:112:mdt_clients_data_init()) error reading MDS last_rcvd idx 0, off 8192: rc -14
>
>Feb  4 12:12:18 mds1 kernel: Lustre: 6987:0:(mgc_request.c:1564:mgc_process_recover_log()) Process recover log testfs-mdtir error -22
>
>
>
>Now attempt to mount 2 x OST:
>
>
>  9 AT osp testfs-OST0000-osc-MDT0000 testfs-MDT0000-mdtlov_UUID 1
>
>
>
>
>
>Feb  4 12:22:48 mds1 kernel: Lustre: MGS: Regenerating testfs-OST0000 log by user request.
>
>Feb  4 12:22:58 mds1 kernel: Lustre: 8697:0:(mgc_request.c:1564:mgc_process_recover_log()) Process recover log testfs-mdtir error -22
>
>Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(ldlm_lib.c:429:client_obd_setup()) can't add initial connection
>
>Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(osp_dev.c:686:osp_init0()) testfs-OST0000-osc-MDT0000: can't setup obd: -2
>
>Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(obd_config.c:572:class_setup()) setup testfs-OST0000-osc-MDT0000 failed (-2)
>
>
>Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(obd_config.c:1553:class_config_llog_handler()) MGC5.5.200.5@o2ib: cfg command failed: rc = -2
>
>Feb  4 12:22:58 mds1 kernel: Lustre:    cmd=cf003 0:testfs-OST0000-osc-MDT0000  1:testfs-OST0000_UUID  2:0@<0:0>
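
As a reading aid for the return codes in these logs (-22, -14, -2): assuming they are negated Linux errno values, which is the usual kernel convention, they can be mapped to symbolic names with Python's standard `errno` module. The helper `describe_rc` is made up for illustration. The symbolic name only narrows things down; it does not by itself say which file or record is at fault.

```python
import errno
import os


def describe_rc(rc):
    """Map a negative kernel-style return code to its errno name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric codes to names for the local platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{rc}: {name} ({os.strerror(code)})"


# Return codes that appear in the log excerpts in this thread.
for rc in (-22, -14, -2):
    print(describe_rc(rc))
```

On Linux these correspond to EINVAL, EFAULT and ENOENT respectively.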


Cheers, Andreas
--
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division