Of course I've checked it: eth0 is present and up (it is used as the internal network between the VMs). The other interface, eth1 (Host-Only adapter in VirtualBox), is used to connect from the host to the guest VM.

The log messages were taken during boot. After logging in, "modprobe -v lnet" prints nothing (the module is already loaded), but "modprobe -v lustre" shows:

insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/mdc.ko
insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/osc.ko
insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/lov.ko
insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/lustre.ko

The service status is:

# service lnet status
running
# service lustre status
partial
# service lustre stop
# service lustre status
partial
# service lnet stop
Removing module lustre
ERROR: Module osc has non-zero reference count.
# service lnet status
running

So reloading lnet/lustre is not working here.

On Wed, Feb 5, 2014 at 1:11 AM, Mohr Jr, Richard Frank (Rick Mohr) <rmohr@utk.edu> wrote:
Are the log messages from when the system booted up? If so, did the Lustre modules get loaded before the eth0 interface was activated?
I would run "ifconfig" and some ping tests to verify that eth0 is active. If so, try reloading the lnet/lustre modules to see if it starts working.
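
The checks suggested above could be sketched roughly as follows. This is a minimal sketch, not a definitive procedure: it assumes a Linux guest whose sysfs tree exposes /sys/class/net/<iface>/operstate, and the helper name iface_is_up and the SYSFS_NET override are made up for illustration. The peer address is a placeholder, and the reload steps are left as comments because they change system state.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the interface reports operstate "up".
# SYSFS_NET is an illustrative override so the check can be pointed at a
# different directory; by default it reads the real sysfs tree.
# Note: some drivers report "unknown" even when the link works.
iface_is_up() {
    state_file="${SYSFS_NET:-/sys/class/net}/$1/operstate"
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "up" ]
}

# Possible sequence, run as root on the MGS (assumed order, adjust as needed):
#   iface_is_up eth0 || echo "eth0 is not up"
#   ping -c 3 <peer-ip>     # placeholder: address of another VM on eth0
#   lustre_rmmod            # unload the Lustre/LNet modules
#   modprobe -v lustre      # reload; LNet re-reads the networks= option
#   lctl list_nids          # a tcp NID should be listed if LNet came up
```

If iface_is_up fails at the point where the modules are loaded during boot, that would be consistent with the "Interface eth0 is down" message in the log.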
--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu
On Feb 4, 2014, at 1:37 PM, San B <forum.san@gmail.com> wrote:
> Hi,
>
> The lustre service on the MGS server reports the following status:
>
> # service lustre status
> partial
>
> This Lustre setup runs on VirtualBox VMs, so there is no InfiniBand connectivity. I tried to debug it further and found the following error messages.
>
> From /var/log/messages:
>
> Feb 4 18:02:11 ic2 kernel: LNet: HW CPU cores: 1, npartitions: 1
> Feb 4 18:02:11 ic2 kernel: alg: No test for crc32 (crc32-table)
> Feb 4 18:02:11 ic2 kernel: alg: No test for adler32 (adler32-zlib)
> Feb 4 18:02:11 ic2 kernel: padlock: VIA PadLock Hash Engine not detected.
> Feb 4 18:02:11 ic2 kernel: Lustre: Lustre: Build Version: 2.4.0-RC2-gd3f91c4-PRISTINE-2.6.32-358.6.2.el6_lustre.g230b174.x86_64
> Feb 4 18:02:11 ic2 kernel: LNetError: 685:0:(socklnd.c:2822:ksocknal_startup()) Interface eth0 is down
> Feb 4 18:02:11 ic2 kernel: LNetError: 105-4: Error -100 starting up LNI tcp
> Feb 4 18:02:11 ic2 kernel: LustreError: 685:0:(events.c:806:ptlrpc_init_portals()) network initialisation failed
>
> # cat /etc/modprobe.d/lustre.conf
> options lnet networks="tcp0(eth0)"
>
> # modprobe -v lnet
> # modprobe -v lustre
> insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/mdc.ko
> insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/osc.ko
> insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/lov.ko
> insmod /lib/modules/2.6.32-358.6.2.el6_lustre.g230b174.x86_64/updates/kernel/fs/lustre/lustre.ko
> # df -h
> Filesystem                    Size  Used Avail Use% Mounted on
> /dev/mapper/VolGroup-lv_root  8.5G  2.9G  5.2G  36% /
> tmpfs                         246M     0  246M   0% /dev/shm
> /dev/sda1                     485M   57M  403M  13% /boot
> /dev/sr0                      3.5G  3.5G     0 100% /var/isoimages/rhel6.4
> /dev/sdb                      7.9G  344M  7.2G   5% /mgs
> #
>
> Can someone point out what could be wrong here?
>
> Thanks in advance
> _______________________________________________
> HPDD-discuss mailing list
> HPDD-discuss@lists.01.org
> https://lists.01.org/mailman/listinfo/hpdd-discuss