We are doing a POC with Lustre version 2.3. We have three VMs functioning as MDS servers, and the MGS is co-located with one of the MDS servers. MDS and OSS failover using Red Hat Cluster Manager worked fine for us. I’m interested to know what happens to the cluster if the MGS server crashes. I have gone through the Lustre manual and could not find any information on making the MGS server highly available like the MDS and OSS. So, my questions are given below:
1. From the Lustre manual, it appears that we can have only one MGS server in a cluster. Can someone confirm? Is it possible to have multiple MGS servers with failover using a cluster manager?
2. What impact will the Lustre clients see if the MGS server goes down, given that we reference the MGS server when the client mounts the Lustre file systems? Can the clients continue I/O since the MDS and OSS servers are still available?
3. Does the MGS server hold the complete cluster configuration data? What is the process for backing up the MGS file system?
Prasad Surampudi | Systems Engineer | ATS Group, LLC
mobile 302.419.5833 | fax 484.320.4306 | prasad.surampudi@theATSgroup.com