Hi Mike,
While installing OFED I used the command below:
# ./mlnxofedinstall -vvv --add-kernel-support --without-32bit
--without-fw-update --hpc
I used the --add-kernel-support option, which adds kernel support (it runs
mlnx_add_kernel_support.sh). Is this what you meant?
Thanks,
Aayush.
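
One way to check whether the installed OFED modules were really built for the running kernel is to compare a module's vermagic with the kernel release. The sketch below is only an illustration: module_matches_kernel is a hypothetical helper name, and ib_core is assumed to be one of the installed OFED modules.

```shell
#!/bin/sh
# Succeeds when a module's vermagic names the given kernel release.
# A vermagic string looks like "<kernel release> SMP mod_unload modversions".
module_matches_kernel() {
    vermagic=$1
    running=$2
    built_for=${vermagic%% *}    # keep the text before the first space
    [ "$built_for" = "$running" ]
}

# On a live node (ib_core is an assumed module name):
#   module_matches_kernel "$(modinfo -F vermagic ib_core)" "$(uname -r)" \
#       && echo "built for running kernel" || echo "kernel mismatch"
```

A mismatch would suggest the drivers still need to be rebuilt against the kernel that Lustre runs on.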
On 9/30/2014 11:04 PM, Mike Ware wrote:
I knew I had it somewhere
http://lists.lustre.org/pipermail/lustre-discuss/2012-November/016988.html
Mike
On Tue, Sep 30, 2014 at 10:32 AM, Mike Ware <charnobyl3000(a)gmail.com
<mailto:charnobyl3000@gmail.com>> wrote:
I had a similar issue using the Mellanox packages. If I remember
correctly I had to recompile the drivers against the Lustre kernel
for the install. I believe Mellanox had an article on this but I
don't have the link.
Mike
On Tue, Sep 30, 2014 at 8:07 AM, Parinay Kondekar
<parinay.kondekar(a)seagate.com
<mailto:parinay.kondekar@seagate.com>> wrote:
IMO you should try strace to see if anything shows up.
"Write failed: Broken pipe" is quite a common message, and it is
difficult to conclude anything from it.
Regards
parinay
On Tue, Sep 30, 2014 at 8:16 PM, aayush agrawal
<aayush.agrawal(a)calsoftinc.com
<mailto:aayush.agrawal@calsoftinc.com>> wrote:
Hi Parinay,
Yes, I see ib0 in the output of ifconfig -a.
I also tried with options lnet networks=o2ib0(ib0), but
no luck.
While loading lnet I do see errors in /var/log/messages:
kernel: LNet: HW CPU cores: 32, npartitions: 4
alg: No test for crc32 (crc32-table)
kernel: alg: No test for adler32 (adler32-zlib)
kernel: alg: No test for crc32 (crc32-pclmul)
kernel: padlock: VIA PadLock Hash Engine not detected.
modprobe: FATAL: Error inserting padlock_sha
(/lib/modules/2.6.32_358/kernel/drivers/crypto/padlock-sha.ko):
No such device
But as per the link below, this should not be a problem?
https://jira.hpdd.intel.com/browse/LU-1599
modprobe lnet completes successfully, but I see "Write
failed: Broken pipe" after running "lctl network up", and
then the session gets logged out from the server.
Thanks,
Aayush.
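
To separate LNet messages from the crypto self-test noise in a log like the one above, a small filter could be used. This is a rough sketch: lnet_log_lines is a hypothetical helper name, and the patterns are assumptions based on the usual LNet/Lustre message prefixes and the ko2iblnd module name.

```shell
#!/bin/sh
# Keep lines that mention LNet, Lustre or the o2ib LND module, and drop the
# padlock / "alg: No test" lines that LU-1599 describes as harmless.
lnet_log_lines() {
    grep -E 'LNet|Lustre|ko2iblnd' | grep -v -E 'padlock|alg: No test'
}

# Usage on a live node:
#   lnet_log_lines < /var/log/messages
```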
On 9/30/2014 7:21 PM, Parinay Kondekar wrote:
> - What is the output of 'ifconfig -a'? Do you see ib0
> there? Specifying 'options lnet
> networks=o2ib0(ib0)' should be enough.
> - Anything in syslog?
>
> HTH
>
> On Tue, Sep 30, 2014 at 6:03 PM, aayush agrawal
> <aayush.agrawal(a)calsoftinc.com
> <mailto:aayush.agrawal@calsoftinc.com>> wrote:
>
> Hi,
>
> I am trying to build Lustre 2.5.0 against
> MLNX_OFED_LINUX-2.2-1.0.1-rhel6.4-x86_64 on CentOS 6.4
> with kernel version 2.6.32-358.
> But I am not able to get the lnet configuration
> right. I used the settings suggested in the Lustre 2.x
> manual, but I cannot bring the network up using lctl.
>
> Details:
>
> I have two server machines, one for mgs+mdt and a
> second for oss, plus one client machine. I want to
> set up InfiniBand on all of these machines.
> I could run the steps below successfully on all
> three machines:
> 1. Run script mlnxofedinstall
> # ./mlnxofedinstall -vvv --add-kernel-support
> --without-32bit --without-fw-update --hpc
> 2. Restart openibd service
> # /etc/init.d/openibd restart
> 3. configure ib0 interface.
> 4. configure lustre with o2ib
> # ./configure
> --with-linux=Path_to_linux-2.6.32-358.18.1.el6
> --with-o2ib=/usr/src/ofa_kernel/default/
>
> 5. make lustre rpms:
> # make rpms
> This gave me a compilation error.
> I looked online for this error and found a bug
> registered for it:
>
> https://jira.hpdd.intel.com/browse/LU-4266
>
> The patch below, from the link above, solved the problem,
> so I could build the lustre rpms:
>
> http://review.whamcloud.com/#/c/8451/1
>
>
> Now I first want to do the InfiniBand setup for mgs
> and mdt on a single machine, which also has an Ethernet IP.
> Then I want to format and mount mgs and mdt.
> So I installed the lustre rpms created above and then
> added the line below to /etc/modprobe.d/lustre.conf:
> options lnet networks=o2ib(ib0)
>
> Then I rebooted the machine to remove all lustre-related
> modules, including lnet, ran "modprobe lnet" to apply
> the parameters above, and then ran "lctl network up",
> which gives me the error below:
> LNET configure error 100: Network is down
>
> I looked online and found the discussion below on the same error:
>
> http://lists.lustre.org/pipermail/lustre-discuss/2010-June/013510.html
>
>
> As suggested in the mail above, I tried the line below
> in /etc/modprobe.d/lustre.conf, where IB_IP stands for
> the InfiniBand IP address:
> options lnet networks=o2ib(ib0) routes="tcp0 IB_IP@o2ib"
> This command hangs for around 2 to 3 minutes and then
> gives the error: Write failed: Broken pipe. The same
> happens with "options lnet networks=o2ib(ib0)".
> But if I set: options lnet
> networks=tcp0(eth0),o2ib(ib0) routes="tcp1 IB_IP@o2ib",
> then it gives LNET configure error 100:
> Network is down.
>
> It seems that with networks=o2ib(ib0) I get the
> error Write failed: Broken pipe.
> Am I missing anything in the steps above? How
> do I resolve this error?
>
> Thanks,
> Aayush.
>
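
Rebooting is one way to clear the modules between attempts. A lighter reload cycle for retrying a changed lnet options line might look like the sketch below. It assumes lustre_rmmod (shipped with Lustre) is on the PATH and that /etc/modprobe.d/lustre.conf already holds the options line; run and reload_lnet are hypothetical helper names. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of a reload cycle for retrying an lnet configuration.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reload_lnet() {
    run lustre_rmmod       # unload the Lustre/LNet modules without a reboot
    run modprobe lnet      # re-reads options from /etc/modprobe.d/lustre.conf
    run lctl network up    # bring LNet up with the configured networks
    run lctl list_nids     # with networks=o2ib(ib0), expect a NID ending in @o2ib
}
```

Calling reload_lnet prints the four commands; setting DRY_RUN=0 first would execute them, which needs root.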
> _______________________________________________
> HPDD-discuss mailing list
> HPDD-discuss(a)lists.01.org
> <mailto:HPDD-discuss@lists.01.org>
>
https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.01.org_mailman...
>
>
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss(a)lists.lustre.org
<mailto:Lustre-discuss@lists.lustre.org>
http://lists.lustre.org/mailman/listinfo/lustre-discuss