I have split an MDS from an MGS using Lustre 2.4.2; both still run on the
same server, on CentOS (kernel 2.6.32-358.23.2.el6_lustre), ldiskfs only.
I have done a writeconf; now the MDS won't "register" with the MGS.
On the split MDS/MGS server the MGT and MDT devices mount but the logs show:
Process recover log testfs-mdtir error -22.
However, if I now try to mount the OSTs, their osp devices on the MDS get stuck in the "AT" state.
One symptom is that on the split MGS/MDS the Lustre modules cannot be unloaded,
because something is still using osp.o. If I don't start the OSTs, the Lustre
stack can be unloaded.
Any suggestions on getting the MDS/MGS to play nice?
Thanks
Anthony
[root@mds1 ~]# lctl dl
0 UP osd-ldiskfs MGS-osd MGS-osd_UUID 5
1 UP mgs MGS MGS 5
2 UP mgc MGC5.5.200.5@o2ib 9ce8a82a-136e-616e-f6f3-f4570fbd364e 5
3 UP osd-ldiskfs testfs-MDT0000-osd testfs-MDT0000-osd_UUID 7
4 UP mds MDS MDS_uuid 3
5 UP lod testfs-MDT0000-mdtlov testfs-MDT0000-mdtlov_UUID 4
6 UP mdt testfs-MDT0000 testfs-MDT0000_UUID 3
7 UP mdd testfs-MDD0000 testfs-MDD0000_UUID 4
8 UP qmt testfs-QMT0000 testfs-QMT0000_UUID 4
Feb 4 12:12:10 mds1 kernel: Lustre: MGS: Logs for fs testfs were removed by user request. All servers must be restarted in order to regenerate the logs.
Feb 4 12:12:10 mds1 kernel: Lustre: testfs-MDT0000: used disk, loading
Feb 4 12:12:10 mds1 kernel: LustreError: 8412:0:(osd_io.c:1000:osd_ldiskfs_read()) testfs-MDT0000: can't read 128@8192 on ino 21: rc = 0
Feb 4 12:12:10 mds1 kernel: LustreError: 8412:0:(mdt_recovery.c:112:mdt_clients_data_init()) error reading MDS last_rcvd idx 0, off 8192: rc -14
Feb 4 12:12:18 mds1 kernel: Lustre: 6987:0:(mgc_request.c:1564:mgc_process_recover_log()) Process recover log testfs-mdtir error -22
Now attempt to mount 2 x OST:
9 AT osp testfs-OST0000-osc-MDT0000 testfs-MDT0000-mdtlov_UUID 1
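To pick out devices like the one above from a longer listing, here is a minimal sketch that filters `lctl dl` output for anything not in the UP state. It assumes the six-column layout seen in the listings here (index, state, type, name, UUID, and a final count I take to be a refcount); the function name `devices_not_up` and the `SAMPLE` text are hypothetical, for illustration only.

```python
# Sketch: list Lustre devices from `lctl dl` output that are not in the UP state.
# Assumption: six whitespace-separated columns per device row:
# index, state, type, name, UUID, refcount.

SAMPLE = """\
6 UP mdt testfs-MDT0000 testfs-MDT0000_UUID 3
9 AT osp testfs-OST0000-osc-MDT0000 testfs-MDT0000-mdtlov_UUID 1
"""


def devices_not_up(lctl_dl_text):
    """Return (index, state, type, name) for every device whose state is not UP."""
    result = []
    for line in lctl_dl_text.splitlines():
        fields = line.split()
        # Skip anything that does not look like a device row (prompts, blanks).
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        index, state, dev_type, name, _uuid, _refs = fields
        if state != "UP":
            result.append((int(index), state, dev_type, name))
    return result


if __name__ == "__main__":
    for dev in devices_not_up(SAMPLE):
        print(dev)
```

On a live server the text could come from running `lctl dl` and capturing its standard output; that part is left out so the sketch runs anywhere.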
Feb 4 12:22:48 mds1 kernel: Lustre: MGS: Regenerating testfs-OST0000 log by user request.
Feb 4 12:22:58 mds1 kernel: Lustre: 8697:0:(mgc_request.c:1564:mgc_process_recover_log()) Process recover log testfs-mdtir error -22
Feb 4 12:22:58 mds1 kernel: LustreError: 8803:0:(ldlm_lib.c:429:client_obd_setup()) can't add initial connection
Feb 4 12:22:58 mds1 kernel: LustreError: 8803:0:(osp_dev.c:686:osp_init0()) testfs-OST0000-osc-MDT0000: can't setup obd: -2
Feb 4 12:22:58 mds1 kernel: LustreError: 8803:0:(obd_config.c:572:class_setup()) setup testfs-OST0000-osc-MDT0000 failed (-2)
Feb 4 12:22:58 mds1 kernel: LustreError: 8803:0:(obd_config.c:1553:class_config_llog_handler()) MGC5.5.200.5@o2ib: cfg command failed: rc = -2
Feb 4 12:22:58 mds1 kernel: Lustre: cmd=cf003 0:testfs-OST0000-osc-MDT0000 1:testfs-OST0000_UUID 2:0@<0:0>
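
For reading the return codes in the logs above: as far as I understand, these negative values are negated Linux errno numbers, so they can be mapped to symbolic names with Python's standard `errno` module. The sketch below is only a decoding aid and says nothing about the root cause; the helper name `decode_rc` and the regular expression are my own assumptions about how the codes appear in these particular messages.

```python
import errno
import re

# Assumption: error codes appear as a negative integer preceded by whitespace
# or "(" (e.g. "rc -14", "rc = -2", "error -22", "failed (-2)").
_NEG_CODE = re.compile(r"(?<=[\s(])-(\d+)\b")


def decode_rc(log_line):
    """Return a list of (code, errno_name) pairs found in one log line."""
    pairs = []
    for digits in _NEG_CODE.findall(log_line):
        number = int(digits)
        # errno.errorcode maps numbers to names such as 'ENOENT'.
        pairs.append((-number, errno.errorcode.get(number, "unknown")))
    return pairs


if __name__ == "__main__":
    lines = [
        "error reading MDS last_rcvd idx 0, off 8192: rc -14",
        "Process recover log testfs-mdtir error -22",
        "setup testfs-OST0000-osc-MDT0000 failed (-2)",
    ]
    for line in lines:
        print(decode_rc(line), "<-", line)
```

With the three sample lines this yields EFAULT for -14, EINVAL for -22 and ENOENT for -2.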