lustre_rmmod doesn't always work as expected. Usually I do the following to unload
modules:
lctl network unconfigure
lustre_rmmod
Note that the first command will return an error (busy) but this can be ignored. Of
course, rebooting the client will also reload the modules :).
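The two-step unload above can be wrapped in a small shell function. This is only a sketch: the name `unload_lustre` is made up for illustration, and it assumes `lctl` and `lustre_rmmod` are on the PATH and that the function is run as root. The error from the first command is deliberately ignored, as noted above.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative, not part of Lustre).
unload_lustre() {
    # May report "busy"; per the note above, this error is ignored.
    lctl network unconfigure || true
    # Pass through whatever exit status lustre_rmmod returns.
    lustre_rmmod
}
```

After calling `unload_lustre`, checking `lsmod` for `lnet` is a reasonable way to confirm that the modules are really gone.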
Malcolm.
--
Malcolm Cowe, Systems Engineer
Intel High Performance Data Division
malcolm.j.cowe@intel.com
+61 408 573 001
-----Original Message-----
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Diep, Minh
Sent: Sunday, May 05, 2013 2:51 AM
To: linux freaker; Lee, Brett
Cc: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Issue setting up Lustre Client
Hi,
The lustre_rmmod error could be caused by modules that weren't
shut down or removed properly. I suggest running lustre_rmmod again or
rebooting the client to get a clean start. If you want to investigate further,
please provide the dmesg output.
Thanks
-Minh
From: linux freaker <linuxfreaker@gmail.com>
Date: Saturday, May 4, 2013 5:18 AM
To: "Lee, Brett" <brett.lee@intel.com>
Cc: Minh Diep <minh.diep@intel.com>, "hpdd-discuss@lists.01.org" <hpdd-discuss@lists.01.org>
Subject: Re: [HPDD-discuss] Issue setting up Lustre Client
Lee,
I understand what you both suggested.
As you can see, my lustreclient2 currently shows:
[root@lustreclient2 ~]# lctl list_nids
10.94.214.189@tcp
[root@lustreclient2 ~]#
This is the wrong network. I want to change this one to 192.168.1.1@o2ib as well.
When I ran the lustre_rmmod command, it threw an error:
lustre_rmmod
error: dl: No such file or directory opening /proc/fs/lustre/devices
opening /dev/obd failed: No such device
hint: the kernel modules may not be loaded
Error getting device list: No such device: check dmesg.
Modules still loaded:
lnet/klnds/socklnd/ksocklnd.o lnet/lnet/lnet.o libcfs/libcfs/libcfs.o
[root@lustreclient2 ~]#
I don't know why it's throwing this error.
On Sat, May 4, 2013 at 5:40 PM, Lee, Brett <brett.lee@intel.com> wrote:
I see that Minh was correct - you were *not* trying to configure Lustre
on two different networks, but instead you had a misconfigured LNet
NID on client1.
Well, the same thing is happening to you again. Client 2 is still configured
with the TCP NID. We see this because the LNet module did *NOT*
unload.
In order to change the NID, currently, you will need to remove the LNet
module and then reload it. When the module loads it will pick up the
"options lnet networks" parameters.
Before running the "lctl ping", make sure you run a "lctl list_nids" to
ensure that client2 has correctly configured the o2ib NID.
Note: "ping <IP>" indicates something different than "lctl ping <NID>".
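The check described above can be sketched as a small shell filter that reads "lctl list_nids" output and succeeds only if at least one NID uses the expected network type. The function name `nid_type_present` is hypothetical; the `address@network` NID format is taken from the output shown in this thread. Note that it compares the type only (tcp, o2ib) and ignores any trailing network number.

```shell
#!/bin/sh
# Hypothetical helper: reads NIDs on stdin, one per line
# (e.g. 192.168.1.4@o2ib), and succeeds if any of them uses the
# network type given as $1. A trailing network number (o2ib0) is accepted.
nid_type_present() {
    grep -Eq "@$1[0-9]*\$"
}

# Example (on the client, as root):
#   lctl list_nids | nid_type_present o2ib && lctl ping 192.168.1.1@o2ib
```

With the output shown for lustreclient2 (10.94.214.189@tcp), the check for o2ib fails, so the "lctl ping" would not be attempted until LNet has been reloaded with the right network.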
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of linux freaker
Sent: Saturday, May 04, 2013 4:16 AM
To: Diep, Minh; hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Issue setting up Lustre Client
I tried on another Lustre client but couldn't get it working.
[root@lustreclient2 ~]# lustre_rmmod
Modules still loaded:
lnet/klnds/socklnd/ksocklnd.o lnet/lnet/lnet.o libcfs/libcfs/libcfs.o
[root@lustreclient2 ~]#
[root@lustreclient2 ~]# modprobe lustre
[root@lustreclient2 ~]# modprobe lnet
[root@lustreclient2 ~]# ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.233 ms
^C
--- 192.168.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 982ms
rtt min/avg/max/mdev = 0.233/0.233/0.233/0.000 ms
[root@lustreclient2 ~]# mount -t lustre 192.168.1.1@o2ib0:/lustre /mnt/lustre
mount.lustre: mount 192.168.1.1@o2ib0:/lustre at /mnt/lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
[root@lustreclient2 ~]# ^C
[root@lustreclient2 ~]#
On Sat, May 4, 2013 at 1:45 PM, linux freaker <linuxfreaker@gmail.com> wrote:
Any idea ?
On Sat, May 4, 2013 at 9:25 AM, linux freaker <linuxfreaker@gmail.com> wrote:
Wow !!! Great.
lctl list_nids
192.168.1.4@o2ib
It worked !! I am able to mount it.
On Sat, May 4, 2013 at 9:21 AM, Diep, Minh <minh.diep@intel.com> wrote:
#lctl list_nids
10.94.214.188@tcp
If this is on the client, then it's the wrong network. Run lustre_rmmod to
unload the modules, then load lnet with the correct x.x.x.x@o2ib NID.
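For illustration, the lnet module takes its network from the "options lnet networks" module parameter when it loads. The sketch below prints such a line; the helper name `lnet_options_line`, the interface name ib0, and the file name /etc/modprobe.d/lustre.conf are assumptions and do not come from this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the modprobe line that binds LNet to one network.
#   $1 = LNet network name (e.g. o2ib0)
#   $2 = network interface (e.g. ib0; an assumption, check your own system)
lnet_options_line() {
    printf 'options lnet networks=%s(%s)\n' "$1" "$2"
}

# Example (as root; the file name is a common convention, not from this thread,
# and ">" overwrites any options already stored in that file):
#   lnet_options_line o2ib0 ib0 > /etc/modprobe.d/lustre.conf
#   lustre_rmmod && modprobe lnet && lctl network up && lctl list_nids
```

If the reload worked, "lctl list_nids" on the client should then report an @o2ib NID rather than the @tcp one.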
Thanks
-Minh
From: linux freaker <linuxfreaker@gmail.com>
Date: Friday, May 3, 2013 8:24 PM
To: "hpdd-discuss@lists.01.org" <hpdd-discuss@lists.01.org>
Subject: [HPDD-discuss] Issue setting up Lustre Client
My MDS and OSS configuration looks like this:
MDS:
#mount
/dev/mapper/vg00-mdt on /mnt/mdt type lustre (rw)
lctl list_nids
192.168.1.1@o2ib
OSS1 and OSS2:
# lctl list_nids
192.168.1.2@o2ib
cat /proc/fs/lustre/devices
0 UP mgc MGC192.168.1.1@o2ib fe161cab-092e-5a7b-0ac1-6081653d099e 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter lustre-OST0006 lustre-OST0006_UUID 5
3 UP obdfilter lustre-OST0007 lustre-OST0007_UUID 5
4 UP obdfilter lustre-OST0008 lustre-OST0008_UUID 5
5 UP obdfilter lustre-OST0009 lustre-OST0009_UUID 5
6 UP obdfilter lustre-OST000a lustre-OST000a_UUID 5
7 UP obdfilter lustre-OST000b lustre-OST000b_UUID 5
LustreClient:
[root@lustreclient1 ~]# mount -t lustre 192.168.1.1@o2ib:/lustre /mnt/lustre
mount.lustre: mount 192.168.1.1@o2ib0:/lustre at /mnt/lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
#lctl list_nids
10.94.214.188@tcp
Please note that I previously used 10.94.214.188 as the MDS, and it is still being
displayed. I am not sure whether lctl list_nids needs to be run on the client.
[root@lustreclient1 ~]# ^C
[root@lustreclient1 ~]#
Why is the client not mounting? My MGS is working fine.
_______________________________________________
HPDD-discuss mailing list
HPDD-discuss@lists.01.org
https://lists.01.org/mailman/listinfo/hpdd-discuss