I have used lfs migrate --block in the past to move files that were in
use. It works for files that are being read, but fails with an error
message if they are being written to (as expected). It seems to be safe
for the files in either case. I'd limit the number of migrates per OST
to a conservative value: optimally 1, but 2 or 4 at most.
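One way to enforce that per-OST limit is to group files into batches
before launching anything. The sketch below is only an illustration, not
the scripts mentioned in this thread: plan_batches and migrate_command
are hypothetical helper names, and the mapping from each file to the OST
indices it touches is assumed to have been collected beforehand (for
example from the file's stripe layout).

```python
from collections import Counter
from typing import Dict, List, Sequence


def plan_batches(file_osts: Dict[str, Sequence[int]],
                 max_per_ost: int = 1) -> List[List[str]]:
    """Group files into batches so that, within one batch, no OST is
    touched by more than max_per_ost migrates (first-fit, input order)."""
    if max_per_ost < 1:
        raise ValueError("max_per_ost must be at least 1")
    batches: List[List[str]] = []
    loads: List[Counter] = []  # per-batch count of migrates touching each OST
    for path, osts in file_osts.items():
        unique = set(osts)
        for batch, load in zip(batches, loads):
            # Place the file in the first batch that still has room on
            # every OST the file touches.
            if all(load[ost] < max_per_ost for ost in unique):
                batch.append(path)
                load.update(unique)
                break
        else:
            # No existing batch fits, so start a new one.
            batches.append([path])
            loads.append(Counter(unique))
    return batches


def migrate_command(path: str) -> List[str]:
    """Build the command line for one file. Only the --block flag from
    this thread is used; striping options are deliberately left out."""
    return ["lfs", "migrate", "--block", path]
```

Each batch would then be run to completion (one migrate_command per file,
for instance via subprocess.run) before the next batch starts. The paths
and OST indices passed in are whatever your own inventory step produces.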
I generally used it when trying to increase the stripe width on a file
that was being read heavily, and it's a great tool for that.
It seems to do a pretty slow copy, single threaded, so it's not the
fastest method. I've written scripts that will launch a job per file
and use this to rebalance file systems, and it seemed to work well for a
decent set of data, but not nearly 500 TB. For that amount of data you
might want to weigh the cost of an lfs migrate (time spent burning up
compute nodes/data mover nodes) against the time you'd spend on a
high-speed parallel copy (dcp) plus a resync during a downtime.
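To put rough numbers on that trade-off, a back-of-the-envelope estimate
like the one below can help. It is a sketch only: transfer_days is a
hypothetical helper, and the stream counts and per-stream rates you feed
it are assumptions you would replace with figures measured on your own
system.

```python
def transfer_days(total_tb: float, streams: int,
                  mb_per_s_per_stream: float) -> float:
    """Estimate the days needed to move total_tb terabytes using the
    given number of concurrent streams, each at a fixed rate in MB/s."""
    if streams < 1 or mb_per_s_per_stream <= 0:
        raise ValueError("streams and rate must be positive")
    total_mb = total_tb * 1_000_000  # decimal units: 1 TB = 10**6 MB
    seconds = total_mb / (streams * mb_per_s_per_stream)
    return seconds / 86_400  # seconds per day
```

Calling it once with the aggregate rate you expect from concurrent
migrates and once with the rate you expect from a parallel copy gives two
durations to compare. It ignores metadata overhead, contention from
production jobs, and the final resync, so treat the result as a lower
bound.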
I assume you don't have a purge policy where you could just let the file
system rebalance naturally over time?
Shawn
On Thu, 2014-09-04 at 12:59 -0400, Robin Humble wrote:
Hiya,
has anyone used 'lfs migrate [--block]' to live migrate lots of data?
it worked ok?
any hints for best usage? (how many migrates running per OST etc.)
the context is that we've doubled our number of OSTs and now need to
rebalance our ~1 PB of data by moving roughly 1/2 of it onto the new
empty OSTs.
I have yet to chat to anyone who's used 'lfs migrate' (either directly
or via lfs_migrate) in production, so I'm being paranoid and looking
for comforting war stories where it's been used to shift around a lot
of data without problems...
documentation is a bit scarce. maybe just
https://jira.hpdd.intel.com/browse/LU-2445
and 'lfs help migrate'. but with --block it sounds pretty amazing.
it should be able to do the rebalance live and without a downtime
(with some delays to file access).
we're using the latest(?) Intel Enterprise Lustre version 2.0.1.1
(which appears to be 2.5.2 based). we've heard via Intel support that
'lfs migrate' runs a verify pass, which sounds nice.
cheers,
robin
--
Dr Robin Humble
_______________________________________________
HPDD-discuss mailing list
HPDD-discuss(a)lists.01.org
https://lists.01.org/mailman/listinfo/hpdd-discuss