I tried as you suggested:

rmmod lnet
ERROR: Module lnet is in use by ksocklnd
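
The error above means another loaded module still holds a reference to lnet, so that module has to go first. A minimal sketch for listing the users of a module from lsmod-style output follows; the helper name module_users is hypothetical, and it only looks one level deep. Note that ksocklnd itself may refuse to unload while the LNet network is still configured, in which case running "lctl network down" first is probably needed.

```shell
# Print the modules that lsmod reports as users of a given module,
# one per line. Reads lsmod-style text ("Module Size Used by") on stdin.
# module_users is a hypothetical helper name, not part of Lustre.
module_users() {
    awk -v mod="$1" '
        $1 == mod && NF >= 4 {
            n = split($4, users, ",")
            for (i = 1; i <= n; i++)
                if (users[i] != "") print users[i]
        }'
}

# Typical use on a live system (not run here):
#   lsmod | module_users lnet
# Each module printed would need "rmmod <name>" before "rmmod lnet".
```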



On Sat, May 4, 2013 at 5:48 PM, linux freaker <linuxfreaker@gmail.com> wrote:
Lee,

I understand what you both suggested.
If you see, currently my Lustreclient2 is showing:

[root@lustreclient2 ~]# lctl list_nids
10.94.214.189@tcp
[root@lustreclient2 ~]#

Now this is the wrong network. I want to change it to o2ib too, to match 192.168.1.1@o2ib.

When I tried to run the lustre_rmmod command, it threw an error:

lustre_rmmod
error: dl: No such file or directory opening /proc/fs/lustre/devices
opening /dev/obd failed: No such device
hint: the kernel modules may not be loaded
Error getting device list: No such device: check dmesg.
Modules still loaded:
lnet/klnds/socklnd/ksocklnd.o lnet/lnet/lnet.o libcfs/libcfs/libcfs.o
[root@lustreclient2 ~]#


I don't know why it's throwing this error.



On Sat, May 4, 2013 at 5:40 PM, Lee, Brett <brett.lee@intel.com> wrote:

I see that Minh was correct – you were *not* trying to configure Lustre on two different networks, but instead you had a misconfigured LNet NID on client1.

 

Well, the same thing is happening to you again.  Client 2 is still configured with the TCP NID.  We see this because the LNet module did *NOT* unload.

 

In order to change the NID, currently, you will need to remove the LNet module and then reload it.  When the module loads it will pick up the “options lnet networks” parameters.
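
As an illustration of the "options lnet networks" parameter, here is a small sketch that prints the option line for a given network and interface. The helper name lnet_options_line, the interface name ib0, and the file path in the comment are assumptions and not taken from this thread; check which InfiniBand interface the client actually uses.

```shell
# Print a modprobe option line for LNet.
# $1 = LNet network (e.g. o2ib0), $2 = interface (ib0 is an assumed name).
# lnet_options_line is a hypothetical helper, not a Lustre tool.
lnet_options_line() {
    printf 'options lnet networks=%s(%s)\n' "$1" "$2"
}

# Example (not run here): review the output, then place it in a modprobe
# configuration file such as /etc/modprobe.d/lustre.conf (assumed path):
#   lnet_options_line o2ib0 ib0
```

The line only takes effect the next time the lnet module is loaded, which is why the module has to be removed and reloaded.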

 

Before running the “lctl ping”, make sure you run a “lctl list_nids” to ensure that client2 has correctly configured the o2ib NID.
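
A rough way to script that check is sketched below. The helper name has_o2ib_nid is hypothetical, and the client address in the comment is only a placeholder.

```shell
# Succeed if any NID read from stdin is on an o2ib network
# (o2ib, o2ib0, o2ib1, ...). has_o2ib_nid is a hypothetical helper.
has_o2ib_nid() {
    grep -Eq '@o2ib[0-9]*[[:space:]]*$'
}

# Example on the client (not run here):
#   if lctl list_nids | has_o2ib_nid; then
#       lctl ping 192.168.1.1@o2ib
#   else
#       echo "no o2ib NID configured yet" >&2
#   fi
```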

 

Note:  “ping <IP>” indicates something different than “lctl ping <NID>”.

 

--

Brett Lee

Sr. Systems Engineer

Intel High Performance Data Division

 

From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of linux freaker
Sent: Saturday, May 04, 2013 4:16 AM
To: Diep, Minh; hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Issue setting up Lustre Client

 

I tried on another Lustre client but couldn't get it working.

 

[root@lustreclient2 ~]# lustre_rmmod
Modules still loaded:
lnet/klnds/socklnd/ksocklnd.o lnet/lnet/lnet.o libcfs/libcfs/libcfs.o
[root@lustreclient2 ~]#
[root@lustreclient2 ~]# modprobe lustre
[root@lustreclient2 ~]# modprobe lnet
[root@lustreclient2 ~]# ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.233 ms
^C
--- 192.168.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 982ms
rtt min/avg/max/mdev = 0.233/0.233/0.233/0.000 ms
[root@lustreclient2 ~]# mount -t lustre 192.168.1.1@o2ib0:/lustre /mnt/lustre
mount.lustre: mount 192.168.1.1@o2ib0:/lustre at /mnt/lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
[root@lustreclient2 ~]# ^C
[root@lustreclient2 ~]#

 

On Sat, May 4, 2013 at 1:45 PM, linux freaker <linuxfreaker@gmail.com> wrote:


 

 

Any idea?

 

On Sat, May 4, 2013 at 9:25 AM, linux freaker <linuxfreaker@gmail.com> wrote:

Wow !!! Great.

 

lctl list_nids

 

It worked !! I am able to mount it.

 

 

On Sat, May 4, 2013 at 9:21 AM, Diep, Minh <minh.diep@intel.com> wrote:

#lctl list_nids
10.94.214.188@tcp
 
If this is on the client, then it's the wrong network. Run lustre_rmmod to unload the modules, then load lnet with the correct x.x.x.x@o2ib network.
 
Thanks
-Minh
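
The unload and reload sequence described above might look roughly like the sketch below. It assumes the "options lnet networks" line has already been corrected and that no Lustre filesystem is mounted on the client. The wrapper reload_lnet and the APPLY variable are hypothetical; by default the function only prints the commands so they can be reviewed first.

```shell
# Print (or, with APPLY=1, run as root) a possible LNet reload sequence.
# reload_lnet and APPLY are hypothetical names used for illustration.
reload_lnet() {
    for cmd in \
        "lctl network down" \
        "lustre_rmmod" \
        "modprobe lnet" \
        "lctl network up" \
        "lctl list_nids"
    do
        if [ "${APPLY:-0}" = "1" ]; then
            # Unquoted on purpose so the string splits into command + args.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

# Review the sequence first:   reload_lnet
# Then run it for real:        APPLY=1 reload_lnet
```

The final "lctl list_nids" should then report an @o2ib NID rather than the old @tcp one.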

 

From: linux freaker <linuxfreaker@gmail.com>
Date: Friday, May 3, 2013 8:24 PM
To: "hpdd-discuss@lists.01.org" <hpdd-discuss@lists.01.org>
Subject: [HPDD-discuss] Issue setting up Lustre Client

 

My MDS and OSS configuration looks like this:

 

MDS:
 
#mount
/dev/mapper/vg00-mdt on /mnt/mdt type lustre (rw)
 
lctl list_nids
192.168.1.1@o2ib
 
OSS1 and OSS2:
 
# lctl list_nids
192.168.1.2@o2ib
 
 cat /proc/fs/lustre/devices
  0 UP mgc MGC192.168.1.1@o2ib fe161cab-092e-5a7b-0ac1-6081653d099e 5
  1 UP ost OSS OSS_uuid 3
  2 UP obdfilter lustre-OST0006 lustre-OST0006_UUID 5
  3 UP obdfilter lustre-OST0007 lustre-OST0007_UUID 5
  4 UP obdfilter lustre-OST0008 lustre-OST0008_UUID 5
  5 UP obdfilter lustre-OST0009 lustre-OST0009_UUID 5
  6 UP obdfilter lustre-OST000a lustre-OST000a_UUID 5
  7 UP obdfilter lustre-OST000b lustre-OST000b_UUID 5
 
 
LustreClient:
 
[root@lustreclient1 ~]# mount -t lustre 192.168.1.1@o2ib:/lustre /mnt/lustre
mount.lustre: mount 192.168.1.1@o2ib0:/lustre at /mnt/lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
 
#lctl list_nids
10.94.214.188@tcp
Please note that I earlier had 10.94.214.188 as the MDS, and it is still being displayed. I am not sure whether lctl list_nids needs to be run on the client.
 
[root@lustreclient1 ~]# ^C
[root@lustreclient1 ~]#
 
Why is the client not mounting? My MGS is working fine.
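
One thing worth checking in a situation like this is whether the client's NID and the MGS NID are on the same LNet network. A minimal sketch of that comparison is below; the helper names nid_network and same_network are hypothetical. It treats a bare network type as instance 0, matching how the mount output above shows 192.168.1.1@o2ib as 192.168.1.1@o2ib0.

```shell
# Print the LNet network part of a NID (the text after '@'), treating a
# bare type such as "o2ib" or "tcp" as instance 0 ("o2ib0", "tcp0").
# nid_network and same_network are hypothetical helper names.
nid_network() {
    net="${1##*@}"
    case "$net" in
        *[0-9]) printf '%s\n' "$net" ;;
        *)      printf '%s0\n' "$net" ;;
    esac
}

# Succeed if both NIDs are on the same LNet network.
same_network() {
    [ "$(nid_network "$1")" = "$(nid_network "$2")" ]
}

# Example with the NIDs from this post (not run here):
#   same_network 10.94.214.188@tcp 192.168.1.1@o2ib || echo "network mismatch"
```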