I see that Minh was correct – you were *not* trying to configure Lustre on two different networks, but instead you had a misconfigured LNet NID on client1.
Well, the same thing is happening to you again. client2 is still configured with the TCP NID. We can see this because the LNet module did *NOT* unload.
Currently, to change the NID you need to remove the LNet module and then reload it. When the module loads, it picks up the “options lnet networks” parameters.
Before running the “lctl ping”, make sure you run a “lctl list_nids” to ensure that client2 has correctly configured the o2ib NID.
Note: “ping <IP>” indicates something different than “lctl ping <NID>”.
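The steps above could be sketched as a small shell helper. This is only an illustration: the function names `nid_on_network` and `reload_lnet_and_check` are hypothetical, the modprobe file path and the `ib0` interface in the comment are assumptions, and the NID has to be given the way “lctl list_nids” prints it (for example “192.168.1.1@o2ib”).

```shell
# Hypothetical helper: succeeds if any NID read from stdin is on network $1.
# Pass the network name as "lctl list_nids" prints it (e.g. "o2ib").
nid_on_network() {
    grep -q "@$1\$"
}

# Hypothetical wrapper around the steps described above; run as root on the client.
# Assumes the modprobe configuration (e.g. /etc/modprobe.d/lustre.conf; the path
# may differ) already holds a line such as:
#   options lnet networks=o2ib0(ib0)      <- "ib0" is an assumed interface name
reload_lnet_and_check() {
    mgs_nid="$1"                  # e.g. 192.168.1.1@o2ib
    net="${mgs_nid##*@}"          # network part of the NID, e.g. o2ib

    lustre_rmmod                  # try to unload the Lustre and LNet modules
    if lsmod | grep -q '^lnet '; then
        echo "lnet is still loaded; the old NID will be kept" >&2
        return 1
    fi

    modprobe lnet || return 1     # reload; picks up "options lnet networks"
    lctl network up || return 1   # bring LNet up so the NID gets configured

    if ! lctl list_nids | nid_on_network "$net"; then
        echo "client has no NID on network $net" >&2
        return 1
    fi

    lctl ping "$mgs_nid"          # LNet-level ping, not an ICMP ping
}
```

On client2 this would be called as `reload_lnet_and_check 192.168.1.1@o2ib`, and the mount attempted only if it succeeds.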
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of linux freaker
Sent: Saturday, May 04, 2013 4:16 AM
To: Diep, Minh; hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Issue setting up Lustre Client
I tried on another lustreclient but couldn't get it working.
[root@lustreclient2 ~]# lustre_rmmod
Modules still loaded:
lnet/klnds/socklnd/ksocklnd.o lnet/lnet/lnet.o libcfs/libcfs/libcfs.o
[root@lustreclient2 ~]#
[root@lustreclient2 ~]# modprobe lustre
[root@lustreclient2 ~]# modprobe lnet
[root@lustreclient2 ~]# ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.233 ms
^C
--- 192.168.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 982ms
rtt min/avg/max/mdev = 0.233/0.233/0.233/0.000 ms
[root@lustreclient2 ~]# mount -t lustre 192.168.1.1@o2ib0:/lustre /mnt/lustre
mount.lustre: mount 192.168.1.1@o2ib0:/lustre at /mnt/lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
[root@lustreclient2 ~]# ^C
[root@lustreclient2 ~]#
On Sat, May 4, 2013 at 1:45 PM, linux freaker <linuxfreaker@gmail.com> wrote:
Any ideas?
On Sat, May 4, 2013 at 9:25 AM, linux freaker <linuxfreaker@gmail.com> wrote:
On Sat, May 4, 2013 at 9:21 AM, Diep, Minh <minh.diep@intel.com> wrote:
#lctl list_nids
10.94.214.188@tcp

If this is on the client, then it's the wrong network. Do a lustre_rmmod to unload the modules and load lnet with the correct x.x.x.x@o2ib.

Thanks
-Minh
From: linux freaker <linuxfreaker@gmail.com>
Date: Friday, May 3, 2013 8:24 PM
To: "hpdd-discuss@lists.01.org" <hpdd-discuss@lists.01.org>
Subject: [HPDD-discuss] Issue setting up Lustre Client
My MDS and OSS configuration looks like:

MDS:
#mount
/dev/mapper/vg00-mdt on /mnt/mdt type lustre (rw)
lctl list_nids
192.168.1.1@o2ib

OSS1 and OSS2:
# lctl list_nids
192.168.1.2@o2ib
cat /proc/fs/lustre/devices
0 UP mgc MGC192.168.1.1@o2ib fe161cab-092e-5a7b-0ac1-6081653d099e 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter lustre-OST0006 lustre-OST0006_UUID 5
3 UP obdfilter lustre-OST0007 lustre-OST0007_UUID 5
4 UP obdfilter lustre-OST0008 lustre-OST0008_UUID 5
5 UP obdfilter lustre-OST0009 lustre-OST0009_UUID 5
6 UP obdfilter lustre-OST000a lustre-OST000a_UUID 5
7 UP obdfilter lustre-OST000b lustre-OST000b_UUID 5

LustreClient:
[root@lustreclient1 ~]# mount -t lustre 192.168.1.1@o2ib:/lustre /mnt/lustre
mount.lustre: mount 192.168.1.1@o2ib0:/lustre at /mnt/lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
#lctl list_nids
10.94.214.188@tcp

Please note that earlier I had 10.94.214.188 as the MDS, which is still being displayed. I am not sure if lctl list_nids needs to be run on the client.
[root@lustreclient1 ~]# ^C
[root@lustreclient1 ~]#

Why is the client not mounting? My MGS is working fine.