Anthony,

If you have another volume available and don’t mind losing any settings stored on the MGS/MGT (no file data, just anything you set with conf_param), you could try formatting a new, separate volume as the MGS/MGT.  Use the usual process from the manual, then try to start as normal.

(Note – The manual doesn’t really distinguish between the management server (MGS) and management target (MGT).)

I suggest trying a new MGT because it seems likely that your copy of the MGS/MGT is missing some things, and replacing the MGT is often fairly painless.  If the new MGT turns out to be missing something important, then as long as you used a new volume, you’re just back where you were before.

About the file referenced: I’m not really familiar with the on-disk contents of the MGT, but you should be able to see whether it is there if you mount the MGT as ldiskfs.  The worrying thing is that if that file is missing, other things may not have been copied either, which is why I suggested trying a new MGS/MGT if that’s an option.
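
If the original MGT is still around, one rough way to check what the copy is missing is to mount both as ldiskfs and compare the file names in the configuration directory. The sketch below assumes the configuration logs live in a directory named CONFIGS under the mount point; that name, the mount points, and the function names are my assumptions for illustration, not something taken from the manual.

```python
import os


def list_config_logs(mount_point, config_dir="CONFIGS"):
    """Return the sorted file names under <mount_point>/<config_dir>.

    config_dir="CONFIGS" is an assumption about the on-disk layout of an
    ldiskfs-mounted MGT. A missing directory yields an empty list.
    """
    path = os.path.join(mount_point, config_dir)
    if not os.path.isdir(path):
        return []
    return sorted(os.listdir(path))


def missing_from_copy(original_mount, copy_mount):
    """Return names present on the original MGT but absent from the copy."""
    copied = set(list_config_logs(copy_mount))
    return [name for name in list_config_logs(original_mount)
            if name not in copied]
```

With both volumes mounted read-only at hypothetical paths such as /mnt/mgt-orig and /mnt/mgt-copy, missing_from_copy("/mnt/mgt-orig", "/mnt/mgt-copy") lists whatever did not make it across. It only compares names, so a truncated or corrupt file would not show up.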

- Patrick

From: Anthony Alba <ascanio.alba7@gmail.com>
Date: Wednesday, February 5, 2014 at 3:03 AM
To: "Dilger, Andreas" <andreas.dilger@intel.com>
Cc: "hpdd-discuss@lists.01.org" <hpdd-discuss@lists.01.org>
Subject: Re: [HPDD-discuss] Split MDS/MGS - process recover log testfs-mtdir error -22

Further debugging through the logs

Comparing the startup of this MDS/MGS pair with a known good system:

On the known good system:

mgc_copy_llog()  
lustre_start_simple() starting obd goodfs-MDT0000-lwp-MDT0000 (type=lwp)
..etc etc

On the bad system:
mgc_copy_llog()
/* no lwp stuff at all */


For some reason, my MDS/MGS pair is not invoking the lwp OBD.
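
The comparison above could be automated roughly as follows. This is only a sketch: it assumes the debug log lines have the same shape as the lustre_start_simple() line quoted above, and the function names are made up for illustration.

```python
import re

# Matches lines shaped like the one quoted from the known good system:
#   lustre_start_simple() starting obd <name> (type=<type>)
START_RE = re.compile(
    r"lustre_start_simple\(\)\s+starting obd\s+(\S+)\s+\(type=(\w+)\)"
)


def started_obds(log_text):
    """Return (obd_name, obd_type) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in START_RE.finditer(log_text)]


def missing_obd_types(good_log, bad_log):
    """Return OBD types started in the good log but never in the bad log."""
    good_types = {obd_type for _, obd_type in started_obds(good_log)}
    bad_types = {obd_type for _, obd_type in started_obds(bad_log)}
    return sorted(good_types - bad_types)
```

Fed the two excerpts above as strings, missing_obd_types() reports the lwp type as absent from the bad system, which matches what I see by eye.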