Hello Upanshu,

I am assuming that mkfs.lustre still did not succeed, even after modprobe/depmod.

From the logs, this seems to be the problem:

Jul 22 02:59:41 lustre-3 kernel: LDISKFS-fs error (device sdb): ldiskfs_ext_find_extent: :463: bad header in inode #3629057: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)

IMO this suggests that the proper e2fsprogs version is needed (see https://wiki.hpdd.intel.com/display/PUB/Changelog+1.8 ), so upgrading e2fsprogs seems like the right direction.
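In case it helps while you sort out the upgrade, here is a minimal sketch for comparing the version of the e2fsprogs package file you have against what is actually installed. The helper name rpm_file_version is made up for illustration, and the file name is simply the one listed further down in this thread; which version is required has to come from the changelog above.

```shell
#!/bin/sh
# Sketch: extract the version field from an RPM file name of the form
# name-version-release.arch.rpm. rpm_file_version is a hypothetical helper.
rpm_file_version() {
    base=${1##*/}        # drop any directory part
    base=${base%.rpm}    # drop the .rpm suffix
    base=${base%-*}      # drop the trailing "-release.arch"
    printf '%s\n' "${base##*-}"   # text after the last remaining "-" is the version
}

# Version of the package file listed in this thread (prints 1.42.ora1).
rpm_file_version e2fsprogs-1.42.ora1-0redhat.rhel5.x86_64.rpm

# Version actually installed, if rpm is available on this host.
if command -v rpm >/dev/null 2>&1; then
    rpm -q --qf '%{VERSION}\n' e2fsprogs || true
fi
```

If the two versions differ after the upgrade attempt, the old package is probably still in place.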

Also, I would try /usr/lib64/lustre/tests/llmount.sh (from lustre-test*.rpm), which uses files in /tmp/<fs-name>-ost/mdt instead of /dev/sd*.
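A rough sketch of how that test could be wrapped, assuming the script lives at the path given above. The helper name run_llmount and the TESTS_DIR variable are only for illustration; the script itself has to be run as root.

```shell
#!/bin/sh
# Sketch: run the Lustre test-suite mount script if it is installed.
# run_llmount and TESTS_DIR are hypothetical names used for illustration.
TESTS_DIR=${TESTS_DIR:-/usr/lib64/lustre/tests}

run_llmount() {
    dir=$1
    if [ ! -x "$dir/llmount.sh" ]; then
        echo "llmount.sh not found in $dir (is lustre-test*.rpm installed?)" >&2
        return 1
    fi
    # Sets up a small test file system backed by files rather than /dev/sd*.
    "$dir/llmount.sh"
}

# Example (as root):
#   run_llmount "$TESTS_DIR"
```

If this works while mkfs.lustre on /dev/sdb still fails, that points at the disk or at e2fsprogs rather than at the Lustre modules. The same directory normally also ships llmountcleanup.sh to tear the test setup down again.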

Let us know how it goes.

HTH




On 22 July 2013 14:32, Singhal, Upanshu <upanshu.singhal@emc.com> wrote:

Hello Parinay,

 

Below is the output of "modprobe -v lustre"; depmod does not show any output. The log file is attached as well. I checked Patrick's suggestion: I am trying to uninstall e2fsprogs and then upgrade it, but I am facing some issues.

 

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/net/lustre/libcfs.ko

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/fs/lustre/lvfs.ko

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/net/lustre/lnet.ko networks=tcp0(eth0)

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/fs/lustre/obdclass.ko

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/fs/lustre/ptlrpc.ko

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/fs/lustre/osc.ko

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/fs/lustre/mdc.ko

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/fs/lustre/lov.ko

insmod /lib/modules/2.6.18-194.17.1.el5_lustre.1.8.7/updates/kernel/fs/lustre/lustre.ko

 

 

Thanks,

-Upanshu

 

From: Parinay Kondekar [mailto:parinay_kondekar@xyratex.com]
Sent: Monday, July 22, 2013 1:03 PM


To: Singhal, Upanshu
Cc: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Issue installing Lustre 1.87

 

Hello Upanshu,

 

- Are the Lustre modules loaded? What does `modprobe -v lustre` say? And depmod?

- Logs: /var/log/messages and dmesg. If they are huge, it would be good to attach a trimmed version.
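For the module check, something along these lines could be used. It is only a sketch: missing_modules is a made-up helper, the module names are the usual core ones (libcfs, lnet, lustre), and the /dev/lnet check is based on the fact that lctl reports failing to open that device when the modules are not loaded.

```shell
#!/bin/sh
# Sketch: report which of the given kernel modules are absent from a
# /proc/modules-style listing. missing_modules is a hypothetical helper.
missing_modules() {
    listing=$1; shift
    for mod in "$@"; do
        # The first field of each /proc/modules line is the module name.
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$listing"; then
            printf '%s\n' "$mod"
        fi
    done
}

# Print any core Lustre modules that are not currently loaded.
if [ -r /proc/modules ]; then
    missing_modules /proc/modules libcfs lnet lustre
fi

# lctl opens this device node, so its absence is a quick indicator.
[ -e /dev/lnet ] || echo "/dev/lnet is missing"
```

No output at all means the three modules are loaded and /dev/lnet exists, so "lctl list_nids" should at least be able to talk to the kernel.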

 

I hope you also cross-checked what Patrick suggested.


HTH

 

On 22 July 2013 12:32, Singhal, Upanshu <upanshu.singhal@emc.com> wrote:

Hello Parinay,

 

Thanks for your email, much appreciated. By multiple interfaces, do you mean multiple network interfaces? If so, then the answer is no; I am using Ethernet only.

 

Output of "lctl list_nids": opening /dev/lnet failed: No such device. hint: the kernel modules may not be loaded. IOC_LIBCFS_GET_NI error 19: No such device

Output of "lctl ping": opening /dev/lnet failed: No such device. hint: the kernel modules may not be loaded. failed to ping 10.243.109.63@tcp: No such device

 

It seems that I am missing some module on my machines. Or do I need to configure something to resolve the above errors?

 

Log messages from /var/log/messages are below. Is there any other location where I can get log messages?

 

LDISKFS FS on sdb, internal journal

LDISKFS-fs: mounted filesystem with ordered data mode.

LDISKFS-fs: file extents enabled

LDISKFS-fs: mballoc enabled

LDISKFS-fs error (device sdb): ldiskfs_ext_find_extent: :463: bad header in inode #3629057: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)

Remounting filesystem read-only

LDISKFS-fs error (device sdb) in start_transaction: Readonly filesystem

LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)

LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost

LDISKFS-fs: mballoc: 0 generated and it took 0

LDISKFS-fs: mballoc: 0 preallocated, 0 discarded

 

Thanks,

-Upanshu

 

From: Parinay Kondekar [mailto:parinay_kondekar@xyratex.com]
Sent: Friday, July 19, 2013 7:16 PM
To: Singhal, Upanshu
Cc: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Issue installing Lustre 1.87

 

More logs would help, such as /var/log/messages.
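If the file is large, a trimmed version is enough. A minimal sketch, assuming the interesting lines carry the usual LDISKFS, Lustre or LNet prefixes; trim_lustre_log is a hypothetical helper and the default of 200 lines is arbitrary.

```shell
#!/bin/sh
# Sketch: keep only LDISKFS/Lustre/LNet lines from a syslog-style file,
# and at most the last N of them. trim_lustre_log is a hypothetical helper.
trim_lustre_log() {
    file=$1
    max=${2:-200}    # arbitrary default number of lines to keep
    grep -E 'LDISKFS|Lustre|LNet' "$file" | tail -n "$max"
}

# Example:
#   trim_lustre_log /var/log/messages 200 > /tmp/lustre-messages.txt
```

Note that this filter drops related lines without one of those prefixes, so it is worth checking the surrounding context in the full log as well.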

 

- Do you have multiple interfaces? It would be good to check "lctl list_nids" and "lctl ping".

- For 2.3 and other releases, you can refer to http://downloads.whamcloud.com/public/lustre/

 

 

 

HTH

 

 

On 19 July 2013 18:26, Singhal, Upanshu <upanshu.singhal@emc.com> wrote:

Hello,

 

I am installing Lustre 1.8.7 on RHEL 5.8 and am having an issue while configuring the OST on one of the nodes. These are the steps I am performing, following the manual; please advise how to proceed further, and let me know if you find any issue with these steps:

 

1.       I have 2 RHEL 5.8 hosts, one for MGS and another one for OSS

2.       Configured Static IP address on both the hosts

3.       Disabled SELINUX in /etc/selinux/config file

4.       Provisioned 1 raw disk to each of the two systems for MDT and OST; did not create any partitions on them

5.       Installed the Lustre 1.8.7 RPM packages on both machines, in the order given below

o    kernel-2.6.18-194.17.1.el5_lustre.1.8.7.x86_64.rpm

o    lustre-ldiskfs-3.1.6-2.6.18_194.17.1.el5_lustre.1.8.7.x86_64.rpm

o    lustre-modules-1.8.7-2.6.18_194.17.1.el5_lustre.1.8.7.x86_64.rpm

o    lustre-1.8.7-2.6.18_194.17.1.el5_lustre.1.8.7.x86_64.rpm

o    e2fsprogs-1.42.ora1-0redhat.rhel5.x86_64.rpm

o    e2fsprogs-devel-1.42.ora1-0redhat.rhel5.x86_64.rpm

o    Am I missing any package, or is the order wrong?

6.       Modified /etc/modprobe.conf to have : options lnet networks=tcp0(eth0)

7.       Rebooted both the machines

8.       Disabled iptables and ip6tables

9.       On machine 1 (the MDS/MGS server), successfully executed the following commands to create and mount the MGS/MDT file system

a.       mkfs.lustre --fsname=lustre --mgs --mdt /dev/sdb

b.      mount -t lustre /dev/sdb /mnt

10.   On machine 2, executed the following command, but it fails with the error I mentioned before:

a.       mkfs.lustre --fsname=lustre --ost --mgsnode=10.243.107.39@tcp0 /dev/sdb

         i.      Fails with the error:

                 1.       mkfs.lustre: Can't make configs dir /tmp/mntRiWwg6/CONFIGS: Input/output error

                 2.       mkfs.lustre FATAL: failed to write local files

                 3.       mkfs.lustre: exiting with -1 (Unknown error 18446744073709551615)

b.      I tried the same command on machine 1 as well, to have the MGS and OSS on the same machine, but still have the same issue.
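For step 6, the LNET setting can be read back from the config file to confirm it is what the modules will see. This is only a sketch: lnet_networks is a hypothetical helper, and it assumes the option sits on a single "options lnet ..." line as shown in step 6.

```shell
#!/bin/sh
# Sketch: print the LNET "networks" value from a modprobe.conf-style file.
# lnet_networks is a hypothetical helper name.
lnet_networks() {
    sed -n 's/^[[:space:]]*options[[:space:]]\{1,\}lnet[[:space:]].*networks=\([^[:space:]]*\).*/\1/p' "$1"
}

# Show the configured value on this host, if the file exists.
if [ -r /etc/modprobe.conf ]; then
    lnet_networks /etc/modprobe.conf
fi
```

With the line from step 6 this prints tcp0(eth0), which should agree with what "lctl list_nids" reports once the modules are loaded.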

 

Please let me know whether this is similar to the approach you take, or whether there is something else I need to do. The next thing I am planning is to download and compile the Lustre 2.3 sources and perform the same exercise on RHEL 6.3. Can you please let me know which source RPMs I need to download and build?

 

Thanks.

Upanshu Singhal

 


_______________________________________________
HPDD-discuss mailing list
HPDD-discuss@lists.01.org
https://lists.01.org/mailman/listinfo/hpdd-discuss