Hi Amit,
AFAIK, 'lfs_migrate' works on a copy/compare/move scheme, serially (per file),
using rsync. So it really comes down to how fast the I/O is.
One thought is to create a new file of known size, run lfs_migrate on that file, and
extrapolate your results from that.
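To make that concrete, a rough sketch (paths, sizes, and timings below are made up, the TB/GiB conversion is approximate, and this assumes the serial rate stays constant):

```shell
# Hypothetical procedure: migrate a probe file of known size on a quiesced
# client, then scale the measured rate up to the full 500TB (serial estimate).
#
#   dd if=/dev/zero of=/lustre/probe bs=1M count=10240   # 10 GiB probe file
#   time lfs_migrate -y /lustre/probe
#
# estimate_hours TEST_GiB TEST_SECONDS TOTAL_TB -> wall-clock hours at that rate
estimate_hours() {
  awk -v g="$1" -v s="$2" -v t="$3" \
    'BEGIN { printf "%.1f\n", (t * 1024 / g) * s / 3600 }'
}

# e.g. if the 10 GiB probe took 120s, 500TB at the same rate is roughly:
estimate_hours 10 120 500   # -> 1706.7 (hours, i.e. about 71 days serial)
```

The serial number is the worst case; running several lfs_migrate instances in parallel over disjoint directory trees would divide it accordingly.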
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: Kumar, Amit [mailto:ahkumar@mail.smu.edu]
Sent: Friday, June 21, 2013 5:50 PM
To: Lee, Brett; hpdd-discuss@lists.01.org
Subject: RE: [HPDD-discuss] Lustre Error
Hi Brett,
Thank you for explaining the process. I do plan to unmount the entire file system from the
nodes before performing this change, although I could get away without doing so.
Having said that, if I were to use lfs_migrate on about 500TB, how long would it take to
complete, just so I can estimate our downtime?
Amit
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Lee, Brett
Sent: Friday, June 21, 2013 2:54 PM
To: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Lustre Error
Subject: Re: [HPDD-discuss] Lustre Error
Hello Amit,
There are a few things I've learned on this topic, but honestly, I have some lingering
questions about the process myself.
1. Adding the new OSTs
2. Rebalancing the file system
Regarding 1, my understanding is that mounting the new OSTs makes the MGS aware of them,
and then the MGS explicitly informs the MDS, and then the clients learn about the new OSTs
via the MDS (either explicitly like the MDS, or when the MDS allocates a new file to a new
OST for that client). So it seems this should be doable live. If that is not
accurate, remounting should do the trick. :)
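For what it's worth, a sketch of that sequence (the device name, mount point, and MGS NID below are placeholders, not from your system):

```shell
[oss]# mkfs.lustre --ost --fsname=smuhpc --mgsnode=<mgs-nid>@tcp /dev/<new-ost-dev>
[oss]# mount -t lustre /dev/<new-ost-dev> /mnt/ost-new
[client]$ lfs df -h     # the new OST should show up once the clients learn of it
```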
Regarding 2, there is 'lfs_migrate', which is a very flexible tool that allows
rebalancing in a variety of ways - entire file system, per-directory, and per-OST (migrate
to, or migrate away from). The caveat here is that the files being migrated should be
quiesced. Ideally, *all* the Lustre clients would unmount the file system, and then a
secure "management client" would mount the file system and run lfs_migrate - thus
ensuring that there is no other r/w activity.
If you "just" add the OSTs, Lustre is smart enough to detect the % utilization
differential and rebalance over time (by preferring the emptier OSTs for new files) - but
manually rebalancing would be much more effective.
I recall a presentation by Andreas that provides some "tips and tricks", including
lfs_migrate:
http://wiki.lustre.org/images/e/e4/LUG-2010-tricksRev.pdf
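For reference, the common invocations look something like this (the paths and OST name are examples only; check the options against the lfs_migrate shipped with your release):

```shell
[client]# lfs_migrate -y /lustre/projects/dir            # restripe a directory tree
[client]# lfs find /lustre --obd smuhpc-OST0010_UUID | lfs_migrate -y
                                                         # drain files off one OST
```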
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: Kumar, Amit [mailto:ahkumar@mail.smu.edu]
Sent: Friday, June 21, 2013 1:12 PM
To: Lee, Brett; hpdd-discuss@lists.01.org
Subject: RE: [HPDD-discuss] Lustre Error
Hi Brett,
This is interesting and intersects nicely with what I am in the process of doing.
I am adding a new OSS to manage additional new OSTs that will join this existing
pool. Can you please point me to any specifics or best practices that I need to pay
attention to as I join this new storage to the existing pool? Any tips for rebalancing the
entire pool?
Thank you,
Amit
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Lee, Brett
Sent: Friday, June 21, 2013 1:33 PM
To: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Lustre Error
Subject: Re: [HPDD-discuss] Lustre Error
(lctl get_param ost.*.ost_io.threads_started) Interestingly, a very diverse spread for the
threads started on the OSSs; I don't know why.
Amit, I should have mentioned that Lustre starts up an initial number of threads
(ost.OSS.ost_io.threads_min, or oss_num_threads specified in the module config), and then
increases the thread count automatically - that's why you see differing thread counts
per OSS.
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Lee, Brett
Sent: Friday, June 21, 2013 12:23 PM
To: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Lustre Error
Subject: Re: [HPDD-discuss] Lustre Error
Hi Amit,
Good to hear there were no LBUGs found. Some of my colleagues may have better
advice, but it seems like throttling down the threads would be advisable as a stop-gap
measure. Preferably, you'd be able to migrate some of the data off the Lustre file
system, and/or add additional OSTs and rebalance the OST utilization. I'm thinking
the "max threads" setting will require OSS reboots to take effect (vs. Lustre
shutting down threads).
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: Kumar, Amit [mailto:ahkumar@mail.smu.edu]
Sent: Friday, June 21, 2013 11:59 AM
To: Lee, Brett; hpdd-discuss@lists.01.org
Subject: RE: [HPDD-discuss] Lustre Error
Hi Brett,
Thank you for assuring me that I am not stupid :)
Here is more info:
# ON MDS & OSS # rpm -qa | grep lustre
kernel-devel-2.6.18-194.17.1.el5_lustre.1.8.5
lustre-ldiskfs-3.1.4-2.6.18_194.17.1.el5_lustre.1.8.5
kernel-2.6.18-194.17.1.el5_lustre.1.8.5
lustre-1.8.5-2.6.18_194.17.1.el5_lustre.1.8.5
lustre-modules-1.8.5-2.6.18_194.17.1.el5_lustre.1.8.5
# ON Clients: $ rpm -qa | grep lustre
lustre-1.8.7-2.6.18_308.8.1.el5_201206071003
lustre-modules-1.8.7-2.6.18_308.8.1.el5_201206071003
(lctl get_param ost.*.ost_io.threads_started) Interestingly, a very diverse spread for the
threads started on the OSSs; I don't know why.
ost.OSS.ost_io.threads_started=248
ost.OSS.ost_io.threads_started=512
ost.OSS.ost_io.threads_started=128
ost.OSS.ost_io.threads_started=128
ost.OSS.ost_io.threads_started=512
ost.OSS.ost_io.threads_started=248
ost.OSS.ost_io.threads_started=512
ost.OSS.ost_io.threads_started=512
ost.OSS.ost_io.threads_started=362
ost.OSS.ost_io.threads_started=293
ost.OSS.ost_io.threads_started=295
ost.OSS.ost_io.threads_started=64
Across the board, every OSS is set to .*.ost_io.threads_max=512 except one, which is set
to 64.
I checked with "grep -i LBUG /var/log/messages" on all OSSs, but got no results
at all.
Regards,
Amit
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Lee, Brett
Sent: Friday, June 21, 2013 12:14 PM
To: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Lustre Error
Subject: Re: [HPDD-discuss] Lustre Error
Hi Amit,
You're very welcome for the help. In this email, it looks like you've provided
some very key information.
In short, each OSS runs different types of service threads. Some of those threads are for
IO to disk. With OSTs at 90% capacity utilization, it is harder to find large
contiguous extents, thus it is slower to write to the OSTs, and so more threads can
get started. So what you are wondering about is not "stupid" but seemingly,
AFAIK, on the mark. :)
Obviously, you'll want to eliminate the 90% utilization.
You might, in the meantime, reduce the number of OSS IO threads. This is kind of a
"band-aid" and not a solution, but doing that should reduce the load on the
OSSs - at the cost of slowing down the clients.
The OSS thread counts are tuneable, based on a few items (e.g. speed of storage, number
of CPU cores). On that OSS, can you check how many IO threads are running? Something
like:
[oss]# lctl get_param ost.*.ost_io.threads_started
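Should throttling be the route you take, a sketch (128 is an arbitrary example value, and note that lowering the ceiling does not stop threads that are already running):

```shell
[oss]# lctl set_param ost.OSS.ost_io.threads_max=128
[oss]# lctl get_param ost.OSS.ost_io.threads_max     # confirm the new ceiling
```

To make it persistent across reboots, the equivalent module option (e.g. in /etc/modprobe.conf) would be something like "options ost oss_num_threads=128".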
Just in case, could you do one more check? This "seems to be" tied fully to the
% utilization, but could you do a quick "grep -i LBUG /var/log/messages" on the
OSS?
And for completeness, what version of Lustre are you running? ;)
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: Kumar, Amit [mailto:ahkumar@mail.smu.edu]
Sent: Friday, June 21, 2013 10:47 AM
To: Lee, Brett; hpdd-discuss@lists.01.org
Subject: RE: [HPDD-discuss] Lustre Error
Hi Brett,
Thank you for your reply.
The state of the OSTs is healthy; no issues with the storage arrays. Although the OSS
shows as healthy in /proc/fs/lustre/health_check, I see the following messages in dmesg
and /var/log/messages (below).
And this is not the only OST that every client is having connection issues with; each one
is probably having issues with OSTs served by 12 different OSSs. But lately I have been
seeing more of these, and I was inclined to think our approaching 90% of capacity is the
cause - maybe a stupid idea, but I was wondering.
Also, what interests me is that I do not see a heavy % of IO wait on the OSS, yet I see
messages like "slow i_mutex 2080s due to heavy IO load" and don't know how they
relate.
Heavy IO, but low IO wait (< 2%) == "slow i_mutex 2080s due to heavy IO load"
Can you please help me understand these messages in terms of assessing what really
constitutes heavy IO, for example the number of RPC requests, etc.
Regards,
Amit
======
Lustre: Service thread pid 12065 was inactive for 1200.00s. Watchdog stack traces are
limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 2 previous similar messages
Lustre: Service thread pid 12284 was inactive for 1200.00s. Watchdog stack traces are
limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 3 previous similar messages
Lustre: Service thread pid 9156 was inactive for 1200.00s. Watchdog stack traces are
limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 9 previous similar messages
Lustre: Service thread pid 12271 was inactive for 1200.00s. Watchdog stack traces are
limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 4 previous similar messages
Lustre: Service thread pid 12061 was inactive for 1200.00s. Watchdog stack traces are
limited to 3 per 300 seconds, skipping this one.
Lustre: Skipped 1 previous similar message
Lustre: 9083:0:(ldlm_lib.c:574:target_handle_reconnect()) smuhpc-OST0010:
855a66c9-5bbe-5ffd-df02-97093d51b99b reconnecting
Lustre: 9083:0:(ldlm_lib.c:574:target_handle_reconnect()) Skipped 589 previous similar
messages
Lustre: 9083:0:(ldlm_lib.c:874:target_handle_connect()) smuhpc-OST0010: refuse
reconnection from
855a66c9-5bbe-5ffd-df02-97093d51b99b@10.1.9.1@tcp
to 0xffff8103996f7e00; still busy with 1 active RPCs
Lustre: 9083:0:(ldlm_lib.c:874:target_handle_connect()) Skipped 589 previous similar
messages
LustreError: 9083:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-16)
req@ffff8101c7c68400 x1437061797611875/t0
o8->855a66c9-5bbe-5ffd-df02-97093d51b99b@NET_0x200000a010901_UUID:0/0 lens 368/264 e 0
to 0 dl 1371830482 ref 1 fl Interpret:/0/0 rc -16/0
LustreError: 9083:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 589 previous similar
messages
Lustre: 9200:0:(service.c:808:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time
(5/-150), not sending early reply
req@ffff81047a0a0c50 x1431402790240966/t0
o4->e7d2b14c-dfd6-fd11-3c14-52e5f55225d4@NET_0x200000a010d32_UUID:0/0 lens 448/416 e 0
to 0 dl 1371830559 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 9200:0:(service.c:808:ptlrpc_at_send_early_reply()) Skipped 7 previous similar
messages
Lustre: 9158:0:(service.c:808:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time
(5/-150), not sending early reply
req@ffff8104820f7c50 x1437061797610933/t0
o4->855a66c9-5bbe-5ffd-df02-97093d51b99b@NET_0x200000a010901_UUID:0/0 lens 448/416 e 0
to 0 dl 1371830649 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: 10083:0:(ldlm_lib.c:574:target_handle_reconnect()) smuhpc-OST0010:
855a66c9-5bbe-5ffd-df02-97093d51b99b reconnecting
Lustre: 10083:0:(ldlm_lib.c:574:target_handle_reconnect()) Skipped 748 previous similar
messages
Lustre: 10083:0:(ldlm_lib.c:874:target_handle_connect()) smuhpc-OST0010: refuse
reconnection from
855a66c9-5bbe-5ffd-df02-97093d51b99b@10.1.9.1@tcp
to 0xffff8103996f7e00; still busy with 1 active RPCs
Lustre: 10083:0:(ldlm_lib.c:874:target_handle_connect()) Skipped 748 previous similar
messages
LustreError: 10083:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-16)
req@ffff8105467a4400 x1437061797613244/t0
o8->855a66c9-5bbe-5ffd-df02-97093d51b99b@NET_0x200000a010901_UUID:0/0 lens 368/264 e 0
to 0 dl 1371831084 ref 1 fl Interpret:/0/0 rc -16/0
LustreError: 10083:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 748 previous
similar messages
Lustre: Service thread pid 31704 was inactive for 1200.00s. The thread might be hung, or
it might only be slow and will resume later. Dumping the stack trace for debugging
purposes:
Lustre: Skipped 4 previous similar messages
Pid: 31704, comm: ll_ost_io_206
Call Trace:
[<ffffffff8863fd0e>] start_this_handle+0x301/0x3cb [jbd2]
[<ffffffff800a09ca>] autoremove_wake_function+0x0/0x2e
[<ffffffff8863fe83>] jbd2_journal_start+0xab/0xdf [jbd2]
[<ffffffff88a79ddc>] fsfilt_ldiskfs_brw_start+0x35c/0x490 [fsfilt_ldiskfs]
[<ffffffff88904b70>] filter_quota_acquire+0x0/0x120 [lquota]
[<ffffffff88ab5c9a>] filter_commitrw_write+0xdaa/0x2be0 [obdfilter]
[<ffffffff88a51e28>] ost_checksum_bulk+0x2d8/0x5b0 [ost]
[<ffffffff88a51c38>] ost_checksum_bulk+0xe8/0x5b0 [ost]
[<ffffffff88a58cf9>] ost_brw_write+0x1c99/0x2480 [ost]
[<ffffffff88811658>] ptlrpc_send_reply+0x5c8/0x5e0 [ptlrpc]
[<ffffffff887dc8b0>] target_committed_to_req+0x40/0x120 [ptlrpc]
[<ffffffff8008cf93>] default_wake_function+0x0/0xe
[<ffffffff88815bc8>] lustre_msg_check_version_v2+0x8/0x20 [ptlrpc]
[<ffffffff88a5c08e>] ost_handle+0x2bae/0x55b0 [ost]
[<ffffffff80150d56>] __next_cpu+0x19/0x28
[<ffffffff800767ae>] smp_send_reschedule+0x4e/0x53
[<ffffffff8882515a>] ptlrpc_server_handle_request+0x97a/0xdf0 [ptlrpc]
[<ffffffff888258a8>] ptlrpc_wait_event+0x2d8/0x310 [ptlrpc]
[<ffffffff8008b3bd>] __wake_up_common+0x3e/0x68
[<ffffffff88826817>] ptlrpc_main+0xf37/0x10f0 [ptlrpc]
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff888258e0>] ptlrpc_main+0x0/0x10f0 [ptlrpc]
[<ffffffff8005dfa7>] child_rip+0x0/0x11
Lustre: 9237:0:(service.c:808:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time
(5/-150), not sending early reply
req@ffff810565372c00 x1421713761522861/t0
o4->b30a3ede-80f8-3b78-f87d-84d4445760e2@NET_0x200000a010b32_UUID:0/0 lens 448/416 e 0
to 0 dl 1371831064 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: Service thread pid 12101 was inactive for 1200.00s. The thread might be hung, or
it might only be slow and will resume later. Dumping the stack trace for debugging
purposes:
Pid: 12101, comm: ll_ost_io_187
Call Trace:
[<ffffffff8863fd0e>] start_this_handle+0x301/0x3cb [jbd2]
[<ffffffff800a09ca>] autoremove_wake_function+0x0/0x2e
[<ffffffff8863fe83>] jbd2_journal_start+0xab/0xdf [jbd2]
[<ffffffff88a79ddc>] fsfilt_ldiskfs_brw_start+0x35c/0x490 [fsfilt_ldiskfs]
[<ffffffff88904b70>] filter_quota_acquire+0x0/0x120 [lquota]
[<ffffffff88ab5c9a>] filter_commitrw_write+0xdaa/0x2be0 [obdfilter]
[<ffffffff88a51c38>] ost_checksum_bulk+0xe8/0x5b0 [ost]
[<ffffffff88a58cf9>] ost_brw_write+0x1c99/0x2480 [ost]
[<ffffffff88811658>] ptlrpc_send_reply+0x5c8/0x5e0 [ptlrpc]
[<ffffffff887dc8b0>] target_committed_to_req+0x40/0x120 [ptlrpc]
[<ffffffff8008cf93>] default_wake_function+0x0/0xe
[<ffffffff88815bc8>] lustre_msg_check_version_v2+0x8/0x20 [ptlrpc]
[<ffffffff88a5c08e>] ost_handle+0x2bae/0x55b0 [ost]
[<ffffffff80150d56>] __next_cpu+0x19/0x28
[<ffffffff800767ae>] smp_send_reschedule+0x4e/0x53
[<ffffffff8882515a>] ptlrpc_server_handle_request+0x97a/0xdf0 [ptlrpc]
[<ffffffff888258a8>] ptlrpc_wait_event+0x2d8/0x310 [ptlrpc]
[<ffffffff8008b3bd>] __wake_up_common+0x3e/0x68
[<ffffffff88826817>] ptlrpc_main+0xf37/0x10f0 [ptlrpc]
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff888258e0>] ptlrpc_main+0x0/0x10f0 [ptlrpc]
[<ffffffff8005dfa7>] child_rip+0x0/0x11
Lustre: smuhpc-OST0010: slow journal start 1409s due to heavy IO load
Lustre: smuhpc-OST0010: slow brw_start 2078s due to heavy IO load
Lustre: Skipped 7 previous similar messages
Lustre: smuhpc-OST0010: slow quota init 2181s due to heavy IO load
Lustre: Skipped 5 previous similar messages
Lustre: Skipped 1 previous similar message
Lustre: smuhpc-OST0010: slow quota init 2180s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: smuhpc-OST0010: slow direct_io 2181s due to heavy IO load
Lustre: smuhpc-OST0010: slow journal start 2181s due to heavy IO load
Lustre: smuhpc-OST0010: slow commitrw commit 2181s due to heavy IO load
Lustre: 12270:0:(service.c:1429:ptlrpc_server_handle_request()) @@@ Request
x1421713792149431 took longer than estimated (755+1426s); client may timeout.
req@ffff810029734c00 x1421713792149431/t8627977641
o4->46f21a5d-81ac-89be-d3c6-517411ba6faf@NET_0x200000a010b0c_UUID:0/0 lens 448/416 e 0
to 0 dl 1371829788 ref 1 fl Complete:/0/0 rc 0/0
Lustre: 12270:0:(service.c:1429:ptlrpc_server_handle_request()) Skipped 19 previous
similar messages
Lustre: Service thread pid 12270 completed after 2181.07s. This indicates the system was
overloaded (too many service threads, or there were not enough hardware resources).
Lustre: Skipped 18 previous similar messages
Lustre: Skipped 1 previous similar message
Lustre: 31704:0:(service.c:1429:ptlrpc_server_handle_request()) @@@ Request
x1431402790240966 took longer than estimated (755+655s); client may timeout.
req@ffff81047a0a0c50 x1431402790240966/t8627977645
o4->e7d2b14c-dfd6-fd11-3c14-52e5f55225d4@NET_0x200000a010d32_UUID:0/0 lens 448/416 e 0
to 0 dl 1371830559 ref 1 fl Complete:/0/0 rc 0/0
Lustre: Service thread pid 9175 completed after 2181.10s. This indicates the system was
overloaded (too many service threads, or there were not enough hardware resources).
Lustre: Skipped 2 previous similar messages
Lustre: 31704:0:(service.c:1429:ptlrpc_server_handle_request()) Skipped 5 previous similar
messages
Lustre: smuhpc-OST0017: slow journal start 2061s due to heavy IO load
Lustre: smuhpc-OST0017: slow journal start 1325s due to heavy IO load
Lustre: smuhpc-OST0017: slow journal start 910s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: smuhpc-OST0017: slow brw_start 2080s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: Skipped 2 previous similar messages
Lustre: smuhpc-OST0017: slow quota init 2082s due to heavy IO load
Lustre: Skipped 3 previous similar messages
Lustre: Skipped 10 previous similar messages
Lustre: smuhpc-OST0017: slow direct_io 2082s due to heavy IO load
Lustre: smuhpc-OST0017: slow journal start 2082s due to heavy IO load
Lustre: smuhpc-OST0017: slow commitrw commit 2082s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: 12267:0:(service.c:1429:ptlrpc_server_handle_request()) @@@ Request
x1421712688053164 took longer than estimated (755+1327s); client may timeout.
req@ffff8102cc972400 x1421712688053164/t8668727237
o4->31bfe693-51e1-7b6e-c416-9f1901c39087@NET_0x200000a010903_UUID:0/0 lens 448/416 e 0
to 0 dl 1371829892 ref 1 fl Complete:/0/0 rc 0/0
Lustre: 12267:0:(service.c:1429:ptlrpc_server_handle_request()) Skipped 4 previous similar
messages
Lustre: Service thread pid 12267 completed after 2082.17s. This indicates the system was
overloaded (too many service threads, or there were not enough hardware resources).
Lustre: Skipped 7 previous similar messages
Lustre: Skipped 1 previous similar message
Lustre: smuhpc-OST0017: slow i_mutex 2080s due to heavy IO load
From: Lee, Brett [mailto:brett.lee@intel.com]
Sent: Friday, June 21, 2013 11:04 AM
To: Kumar, Amit; hpdd-discuss@lists.01.org
Subject: RE: [HPDD-discuss] Lustre Error
Amit,
You wrote about multiple clients seeing errors, seemingly with OST0017. What is the state
of that OST, both now and then? Did it undergo any manual (administrative) actions?
Some clues may come from running these commands on the OSS serving that OST:
[oss]# cat /proc/fs/lustre/devices
[oss]# cat /proc/fs/lustre/health_check
[oss]# grep -i recover /var/log/messages
Any problems with other OSTs or the network?
--
Brett Lee
Sr. Systems Engineer
Intel High Performance Data Division
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Kumar, Amit
Sent: Friday, June 21, 2013 8:48 AM
To: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Lustre Error
Subject: Re: [HPDD-discuss] Lustre Error
Just to add to my original question: I looked at the OSS in the error/debug message below,
and it looks fairly loaded.
Does 12 active RPC requests sound like a heavy load for an OSS?
Regards,
Amit
Jun 21 09:34:44 array5b kernel: Lustre: 10156:0:(ldlm_lib.c:874:target_handle_connect())
smuhpc-OST0017: refuse reconnection from
855a66c9-5bbe-5ffd-df02-97093d51b99b@10.1.9.1@tcp
to 0xffff8104b2ca5600; still busy with 12 active RPCs
Jun 21 09:34:44 array5b kernel: Lustre: 10156:0:(ldlm_lib.c:874:target_handle_connect())
Skipped 10 previous similar messages
# ps
root 24243 0.1 0.0 0 0 ? S 2012 431:11 [socknal_sd00]
root 24244 0.2 0.0 0 0 ? S 2012 845:40 [socknal_sd01]
root 24245 0.0 0.0 0 0 ? S 2012 345:52 [socknal_sd02]
root 24246 0.1 0.0 0 0 ? S 2012 359:08 [socknal_sd03]
root 24247 0.0 0.0 0 0 ? S 2012 332:10 [socknal_sd04]
root 24248 0.2 0.0 0 0 ? S 2012 874:48 [socknal_sd05]
root 24249 0.0 0.0 0 0 ? S 2012 290:02 [socknal_sd06]
root 24250 0.3 0.0 0 0 ? S 2012 1157:13 [socknal_sd07]
root 24251 0.2 0.0 0 0 ? S 2012 969:46 [socknal_sd08]
root 24252 0.1 0.0 0 0 ? S 2012 447:14 [socknal_sd09]
root 24253 0.2 0.0 0 0 ? S 2012 727:55 [socknal_sd10]
root 24254 0.5 0.0 0 0 ? S 2012 1822:56 [socknal_sd11]
root 24255 0.2 0.0 0 0 ? S 2012 740:05 [socknal_sd12]
root 24256 0.3 0.0 0 0 ? S 2012 1144:03 [socknal_sd13]
root 24257 0.4 0.0 0 0 ? S 2012 1552:31 [socknal_sd14]
root 24258 0.2 0.0 0 0 ? S 2012 940:08 [socknal_sd15]
root 24259 0.0 0.0 0 0 ? S 2012 0:00 [socknal_cd00]
root 24260 0.0 0.0 0 0 ? S 2012 0:00 [socknal_cd01]
root 24261 0.0 0.0 0 0 ? S 2012 0:00 [socknal_cd02]
root 24262 0.0 0.0 0 0 ? S 2012 0:00 [socknal_cd03]
root 24263 0.0 0.0 0 0 ? S 2012 0:00 [socknal_reaper]
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Kumar, Amit
Sent: Friday, June 21, 2013 9:17 AM
To: hpdd-discuss@lists.01.org
Subject: Re: [HPDD-discuss] Lustre Error
Subject: Re: [HPDD-discuss] Lustre Error
As a follow-up question, I would like to understand what we can expect from Lustre 1.8.7
as we approach 90% of our 500TB capacity, in terms of performance and response
times.
Best,
Amit
From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Kumar, Amit
Sent: Friday, June 21, 2013 9:07 AM
To: hpdd-discuss@lists.01.org
Subject: [HPDD-discuss] Lustre Error
Subject: [HPDD-discuss] Lustre Error
Dear Lustre,
I am seeing quite a bit of these errors on quite a number of Lustre clients. Any idea what
could be causing so many of the clients to have connection issues?
The network seems to be solid - no issues on that end, because it is internal.
LustreError: 11-0: an error occurred while communicating with
10.1.1.58@tcp. The ost_connect operation failed with -114
LustreError: Skipped 19 previous similar messages
Lustre: 5315:0:(import.c:517:import_select_connection())
smuhpc-OST0017-osc-ffff810c6d349400: tried all connections, increasing latency to 25s
Lustre: 5315:0:(import.c:517:import_select_connection()) Skipped 20 previous similar
messages
Lustre: 5314:0:(client.c:1496:ptlrpc_expire_one_request()) @@@ Request x1437061797490383
sent from smuhpc-OST0017-osc-ffff810c6d349400 to NID
10.1.1.58@tcp 30s ago has timed out (30s prior to deadline).
req@ffff81045d656c00 x1437061797490383/t0
o8->smuhpc-OST0017_UUID@10.1.1.58@tcp:28/4
lens 368/584 e 0 to 1 dl 1371822883 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 5314:0:(client.c:1496:ptlrpc_expire_one_request()) @@@ Request x1437061797492276
sent from smuhpc-OST0017-osc-ffff810c6d349400 to NID
10.1.1.58@tcp 20s ago has timed out (20s prior to deadline).
req@ffff81067e7b4800 x1437061797492276/t0
o8->smuhpc-OST0017_UUID@10.1.1.58@tcp:28/4
lens 368/584 e 0 to 1 dl 1371823117 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 11-0: an error occurred while communicating with
10.1.1.58@tcp. The ost_connect operation failed with -114
LustreError: Skipped 25 previous similar messages
Any thoughts will be greatly appreciated.
Best,
Amit