Hi all,

Here I am, shooting for the moon again :O

So I successfully got my Lustre servers up and running on CentOS 7, and now I need to make some clients. Most machines at my site run a custom, house-built 3.4.61 kernel on Ubuntu 12.04. The kernel was picked mostly due to MOSIX [1] support considerations; however, we also carefully reviewed the ChangeLog to ensure that the key components we depend on (XFS, NFS, MD RAID) were fairly stable. It would be pretty nice if we could get these machines keyed into Lustre using the kernel they are already running.

I'm just curious whether this is actually possible. I'm using the same lustre-release.tar that I pulled from Git maybe a few weeks ago for the CentOS 7 build.

I take that tarball over to the host where I did the kernel build, so the full kernel source and all the intermediates are in /usr/src/linux. I untar lustre-release.tar in /usr/src and follow a procedure that someone basically outlined when submitting Lustre bug report LU-1706:

# cd lustre-release
# sh autogen.sh
# ./configure
# make debs

I go ahead and let this run ... it appears to successfully compile the userland; I get a few packages out:

# ls -l /usr/src/*lustre*.deb
-rw-r--r-- 1 root root    119984 Jul  9 16:39 /usr/src/linux-patch-lustre_2.7.55-1_all.deb
-rw-r--r-- 1 root root    585924 Jul  9 16:40 /usr/src/lustre-dev_2.7.55-1_amd64.deb
-rw-r--r-- 1 root root 106588528 Jul  9 16:40 /usr/src/lustre-source_2.7.55-1_all.deb
-rw-r--r-- 1 root root   4163394 Jul  9 16:40 /usr/src/lustre-tests_2.7.55-1_amd64.deb
-rw-r--r-- 1 root root    659826 Jul  9 16:40 /usr/src/lustre-utils_2.7.55-1_amd64.deb

However, I don't get any kernel module built. It's not clear to me whether I need to follow a manual process of patching the kernel like I did on the server side, or whether "make debs" can figure it out on its own, especially given the presence of the complete kernel source, headers and intermediates at /usr/src/linux.

The actual "make debs" run, in spite of generating a few *.debs, does fail with an error, as follows:

# Doesn't seem possible to only build modules...
./configure --with-linux=/lib/modules/3.4.61/build \
                    --with-linux-obj=/lib/modules/3.4.61/build \
                    --disable-server \
                    --disable-quilt  \
                    --disable-dependency-tracking \
                    --disable-doc  \
                    --disable-utils \
                    --disable-iokit \
                    --disable-snmp \
                    --disable-tests \
                    --enable-quota \
                    --with-o2ib=/lib/modules/3.4.61/build /lib/modules/3.4.61/build/drivers/infiniband
configure: WARNING: you should use --build, --host, --target
configure: WARNING: invalid host type: /lib/modules/3.4.61/build/drivers/infiniband
checking build system type... Invalid configuration `/lib/modules/3.4.61/build/drivers/infiniband': machine `/lib/modules/3.4.61/build/drivers/infiniband' not recognized
configure: error: /bin/bash config/config.sub /lib/modules/3.4.61/build/drivers/infiniband failed
make[2]: *** [kdist_config] Error 1
make[2]: Leaving directory `/usr/src/lustre-release/debian/tmp/modules-deb/usr_src/modules/lustre'
make[1]: *** [kdist_build] Error 2
make[1]: Leaving directory `/usr/src/lustre-release/debian/tmp/modules-deb/usr_src/modules/lustre'
BUILD FAILED!
See /usr/src/lustre-release/debian/tmp/modules-deb/var_cache_modass/lustre.buildlog.3.4.61.1436474447 for details.
make: *** [debs] Error 7

That directory does exist...

# ls -l /lib/modules/3.4.61/build/drivers/infiniband
total 20
drwxr-xr-x  2 root root 4096 Sep 10  2013 core
drwxr-xr-x 11 root root 4096 Sep 10  2013 hw
-rw-r--r--  1 root root 1967 Sep  8  2013 Kconfig
-rw-r--r--  1 root root  609 Sep  8  2013 Makefile
drwxr-xr-x  6 root root 4096 Sep 10  2013 ulp

However, we don't use InfiniBand here; it's all Ethernet. I really couldn't care less if some InfiniBand configuration is not recognized, except that it seems to be hanging up the module build stage of "make debs". Can anyone help me out with a workaround for this?
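
Looking at the log again, the value after --with-o2ib= contains two paths separated by a space, and configure then complains about the second path as an "invalid host type". My guess (an assumption, I have not traced it through the packaging scripts) is that the value reaches ./configure unquoted, so the shell splits it into two arguments. A minimal sketch of that effect, where count_args and O2IB_PATHS are names I made up purely for illustration:

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it was called with.
count_args() {
    echo "$#"
}

# The two paths exactly as they appear in the failing configure line.
O2IB_PATHS="/lib/modules/3.4.61/build /lib/modules/3.4.61/build/drivers/infiniband"

# Unquoted expansion: the shell splits on the space, so the second path
# becomes a separate positional argument.
unquoted=$(count_args --with-o2ib=$O2IB_PATHS)

# Quoted expansion: the whole value stays attached to the option.
quoted=$(count_args "--with-o2ib=$O2IB_PATHS")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

If that is what is going on, the stray second argument would explain why configure tries to interpret a directory path as a host type, regardless of whether the directory exists.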

Is it an issue of manual patching? Do I perhaps need to specify some parameters to autogen.sh and/or ./configure? Or maybe it's just a bug of some sort in that particular Git pull, and I can clear it up with some quick and dirty modifications to the Makefile? Is it even possible to use Lustre on a 3.4.61 kernel with an Ubuntu 12.04 userland? The Lustre userland seems to build okay; it's just the kernel module build that is getting hung up.

Thanks,

Sean

[1] http://www.mosix.cs.huji.ac.il/