I have split the MDS from the MGS using Lustre 2.4.2; both still run on the same server,
on CentOS (kernel 2.6.32-358.23.2.el6_lustre) with ldiskfs only.

I have done a writeconf; now the MDS won't "register" with the MGS.  

On the split MDS/MGS server the MGT and MDT devices mount but the logs show:

Process recover log testfs-mdtir error -22.

However, if I now try to mount the OSTs, the matching osp devices on the MDS get stuck in "AT".

One symptom is that on the split MGS/MDS the Lustre modules cannot be unloaded,
as something is still using the osp module. If I don't start the OSTs, the Lustre stack can be unloaded.

Any suggestions on getting the MDS/MGS to play nice?

Thanks
Anthony


[root@mds1 ~]# lctl dl                                                
  0 UP osd-ldiskfs MGS-osd MGS-osd_UUID 5                             
  1 UP mgs MGS MGS 5                                                  
  2 UP mgc MGC5.5.200.5@o2ib 9ce8a82a-136e-616e-f6f3-f4570fbd364e 5   
  3 UP osd-ldiskfs testfs-MDT0000-osd testfs-MDT0000-osd_UUID 7       
  4 UP mds MDS MDS_uuid 3                                             
  5 UP lod testfs-MDT0000-mdtlov testfs-MDT0000-mdtlov_UUID 4         
  6 UP mdt testfs-MDT0000 testfs-MDT0000_UUID 3                       
  7 UP mdd testfs-MDD0000 testfs-MDD0000_UUID 4                       
  8 UP qmt testfs-QMT0000 testfs-QMT0000_UUID 4                       

Feb  4 12:12:10 mds1 kernel: Lustre: MGS: Logs for fs testfs were removed by user request.  All servers must be restarted in order to regenerate the logs.
Feb  4 12:12:10 mds1 kernel: Lustre: testfs-MDT0000: used disk, loading
Feb  4 12:12:10 mds1 kernel: LustreError: 8412:0:(osd_io.c:1000:osd_ldiskfs_read()) testfs-MDT0000: can't read 128@8192 on ino 21: rc = 0
Feb  4 12:12:10 mds1 kernel: LustreError: 8412:0:(mdt_recovery.c:112:mdt_clients_data_init()) error reading MDS last_rcvd idx 0, off 8192: rc -14
Feb  4 12:12:18 mds1 kernel: Lustre: 6987:0:(mgc_request.c:1564:mgc_process_recover_log()) Process recover log testfs-mdtir error -22

Now attempt to mount 2 x OST:

  9 AT osp testfs-OST0000-osc-MDT0000 testfs-MDT0000-mdtlov_UUID 1               


Feb  4 12:22:48 mds1 kernel: Lustre: MGS: Regenerating testfs-OST0000 log by user request.
Feb  4 12:22:58 mds1 kernel: Lustre: 8697:0:(mgc_request.c:1564:mgc_process_recover_log()) Process recover log testfs-mdtir error -22
Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(ldlm_lib.c:429:client_obd_setup()) can't add initial connection
Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(osp_dev.c:686:osp_init0()) testfs-OST0000-osc-MDT0000: can't setup obd: -2
Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(obd_config.c:572:class_setup()) setup testfs-OST0000-osc-MDT0000 failed (-2)

Feb  4 12:22:58 mds1 kernel: LustreError: 8803:0:(obd_config.c:1553:class_config_llog_handler()) MGC5.5.200.5@o2ib: cfg command failed: rc = -2
Feb  4 12:22:58 mds1 kernel: Lustre:    cmd=cf003 0:testfs-OST0000-osc-MDT0000  1:testfs-OST0000_UUID  2:0@<0:0>
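
For reference, the negative return codes in the log lines above (-2, -14, -22) are standard Linux errno values. Here is a minimal Python sketch to decode them. Nothing in it is Lustre-specific: the list of codes is copied by hand from the logs, the helper name decode is made up for illustration, and it assumes it is run on a Linux host so that the errno numbering matches the kernel's.

```python
import errno
import os

# Return codes copied by hand from the kernel log lines in this message.
codes = [-2, -14, -22]


def decode(rc):
    """Return (symbolic name, description) for a negative kernel return code."""
    num = abs(rc)
    # errno.errorcode maps the numeric value to its symbolic name, e.g. 22 -> 'EINVAL'.
    name = errno.errorcode.get(num, "UNKNOWN")
    return name, os.strerror(num)


for rc in codes:
    name, text = decode(rc)
    print(f"rc = {rc}: {name} ({text})")
```

On Linux this maps -2 to ENOENT, -14 to EFAULT and -22 to EINVAL. It only names the codes; it does not say why Lustre returned them.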

                                       