Hi Andreas,
Thank you for your response!!. I looked at the application's strace and starting to
make sense in retrospect.
Looks like the application is syncing after every 4 bytes (strace snippet below) are
written out. With 1MB stripe size this can be a very taxing behavior I presume.
Also I am not sure how/where did you get the time in milli-seconds(25ms) for OST's
time to complete an operation. I could only see from the below output that at the most it
takes about 1s to complete these operations on OST's for that particular client in
question.
I have asked application owner to see if he can tweak this a bit further. Are there any
other tips or tricks to isolate these sync creatures from taxing others while we have
about 500-1000 of these jobs that will be running simultaneously for the next few months?
Thank you,
Amit
lctl get_param ost.OSS.ost_io.req_history | grep 10.42.90.13
6252718641215373314:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203612:504:Complete:1455824505:0s(-6s)
opc 4
6252718641378492418:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203620:504:Complete:1455824505:0s(-6s)
opc 4
6252718641481842690:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203628:504:Complete:1455824505:1s(-5s)
opc 4
6252718642018713602:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203636:504:Complete:1455824506:0s(-6s)
opc 4
6252718642252808194:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203644:504:Complete:1455824506:0s(-6s)
opc 4
6252718642526617602:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203652:504:Complete:1455824506:0s(-6s)
opc 4
6252718642890997762:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203660:504:Complete:1455824506:0s(-6s)
opc 4
6252718643055427586:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203668:504:Complete:1455824506:0s(-6s)
opc 4
6252718643149275138:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203676:504:Complete:1455824506:0s(-6s)
opc 4
6252718643259572226:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203684:504:Complete:1455824506:0s(-6s)
opc 4
6252718643368624130:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203692:504:Complete:1455824506:0s(-6s)
opc 4
6252718643706724354:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203700:504:Complete:1455824506:0s(-6s)
opc 4
6252718643865649154:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203708:504:Complete:1455824506:0s(-6s)
opc 4
6252718644294844418:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203716:504:Complete:1455824506:0s(-6s)
opc 4
6252718644496957442:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203724:504:Complete:1455824506:0s(-6s)
opc 4
6252718644604698626:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203732:504:Complete:1455824506:0s(-6s)
opc 4
6252718644703395842:10.42.146.10@o2ib:12345-10.42.90.13@o2ib:x1526494076203740:504:Interpret:1455824506:-1455824506s(-1455824512s)
opc 4
STRACE on application:
3285 1455822493.327917 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000010>
3285 1455822493.327958 write(28,
"\0\0'\"\0\4\0\2\253GT\244\322\r\0006\0\1\16\246\357\356\0\0\0d\5TTree"...,
10018) = 10018 <0.000077>
3285 1455822493.328063 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000009>
3285 1455822493.328101 lseek(28, 216015618, SEEK_SET) = 216015618 <0.000007>
3285 1455822493.328125 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000006>
3285 1455822493.328164 write(28, "\377\377\374\302", 4) = 4 <0.000057>
3285 1455822493.328242 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000005>
3285 1455822493.328268 fsync(28) = 0 <0.054499>
3285 1455822493.382910 lseek(28, 216015618, SEEK_SET) = 216015618 <0.000070>
3285 1455822493.383013 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000056>
3285 1455822493.383134 write(28,
"\0\0\1,\0\4\0\0\2\234T\244\322\r\0005\0\1\f\340#\2\0\0\0d\4TH1F\r"..., 304) =
304 <0.000103>
3285 1455822493.383269 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000050>
3285 1455822493.383360 lseek(28, 245821422, SEEK_SET) = 245821422 <0.000012>
3285 1455822493.383394 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000052>
3285 1455822493.383498 write(28, "\227q[\355", 4) = 4 <0.000080>
3285 1455822493.383600 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000006>
3285 1455822493.383630 fsync(28) = 0 <0.061134>
3285 1455822493.446691 lseek(28, 245821422, SEEK_SET) = 245821422 <0.000016>
3285 1455822493.446797 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000136>
3285 1455822493.447133 write(28,
"\0\0'\"\0\4\0\2\253GT\244\322\r\0006\0\1\16\246\357\356\0\0\0d\5TTree"...,
10018) = 10018 <0.000186>
3285 1455822493.447454 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000116>
3285 1455822493.447701 lseek(28, 216015618, SEEK_SET) = 216015618 <0.000066>
3285 1455822493.447797 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000050>
3285 1455822493.447933 write(28, "\377\377\374\302", 4) = 4 <0.000087>
3285 1455822493.448050 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000030>
3285 1455822493.448117 fsync(28) = 0 <0.030902>
3285 1455822493.479217 lseek(28, 216015618, SEEK_SET) = 216015618 <0.000100>
3285 1455822493.479452 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000057>
3285 1455822493.479582 write(28,
"\0\0\1,\0\4\0\0\2\234T\244\322\r\0005\0\1\f\340#\2\0\0\0d\4TH1F\r"..., 304) =
304 <0.000080>
3285 1455822493.479697 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000010>
3285 1455822493.479747 lseek(28, 245821422, SEEK_SET) = 245821422 <0.000010>
3285 1455822493.479778 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000009>
3285 1455822493.479819 write(28, "\227q[\355", 4) = 4 <0.000058>
3285 1455822493.479899 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000006>
3285 1455822493.479931 fsync(28) = 0 <0.059761>
3285 1455822493.541591 lseek(28, 245821422, SEEK_SET) = 245821422 <0.000011>
3285 1455822493.541628 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000054>
3285 1455822493.541760 write(28,
"\0\0'\"\0\4\0\2\253GT\244\322\r\0006\0\1\16\246\357\356\0\0\0d\5TTree"...,
10018) = 10018 <0.000126>
3285 1455822493.541918 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000057>
3285 1455822493.542012 lseek(28, 216015618, SEEK_SET) = 216015618 <0.000014>
3285 1455822493.542080 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000052>
3285 1455822493.542211 write(28, "\377\377\374\302", 4) = 4 <0.000065>
3285 1455822493.542301 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000009>
3285 1455822493.542332 fsync(28) = 0 <0.018463>
3285 1455822493.560945 lseek(28, 216015618, SEEK_SET) = 216015618 <0.000032>
3285 1455822493.561017 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000009>
3285 1455822493.561066 write(28,
"\0\0\1,\0\4\0\0\2\234T\244\322\r\0005\0\1\f\340#\2\0\0\0d\4TH1F\r"..., 304) =
304 <0.000063>
3285 1455822493.561157 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000009>
3285 1455822493.561201 lseek(28, 245821422, SEEK_SET) = 245821422 <0.000007>
3285 1455822493.561224 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x36d8a0f710},
{SIG_DFL, [], SA_RESTORER, 0x36d8a0f710}, 8) = 0 <0.000006>
3285 1455822493.561261 write(28, "\227q[\355", 4) = 4 <0.000057>
3285 1455822493.561338 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x36d8a0f710},
NULL, 8) = 0 <0.000006>
3285 1455822493.561368 fsync(28^C
-----Original Message-----
From: Dilger, Andreas [mailto:andreas.dilger@intel.com]
Sent: Wednesday, February 17, 2016 6:59 PM
To: Kumar, Amit <ahkumar(a)mail.smu.edu>; hpdd-discuss(a)lists.01.org
Subject: Re: [HPDD-discuss] Large number of sync requests
What does strace of the application show? The clients also keep track of sync
operations:
lctl get_param {osc,mdc}.*.stats | grep sync
and also the time it takes these operations to complete. For my OSTs it is about
25ms (SATA HDD), and the MDT it is about 3ms (old SSD).
If the application is issuing a lot of sync requests, not much that can be done
about it except fix the app.
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel High Performance Data Division
On 2016/02/17, 09:16, "HPDD-discuss on behalf of Kumar, Amit" <hpdd-discuss-
bounces@lists.01.org<mailto:hpdd-discuss-bounces@lists.01.org> on behalf of
ahkumar@mail.smu.edu<mailto:ahkumar@mail.smu.edu>> wrote:
Dear All,
I am noticing large number of sync requests originating from the jobs. And I am
trying to understand the numbers.
I handpicked a few below from the output of "lctl get_param mdt.scratch-
MDT0000.job_stats" from MDS. Strangely when I run try to get job_stats from
all OSS for all OST's I don't see these high sync requests, rather it is 0.
My question why would I see these many sync requests, even if it is cumulative
it is high for a job that runs for couple of hours at the most.
Can any body please shed some light on what I am seeing here?
Thank you,
Amit
- job_id: 7819402
snapshot_time: 1455722134
open: { samples: 632, unit: reqs }
close: { samples: 484, unit: reqs }
mknod: { samples: 0, unit: reqs }
link: { samples: 0, unit: reqs }
unlink: { samples: 0, unit: reqs }
mkdir: { samples: 0, unit: reqs }
rmdir: { samples: 0, unit: reqs }
rename: { samples: 0, unit: reqs }
getattr: { samples: 58, unit: reqs }
setattr: { samples: 0, unit: reqs }
getxattr: { samples: 0, unit: reqs }
setxattr: { samples: 0, unit: reqs }
statfs: { samples: 0, unit: reqs }
sync: { samples: 1183934, unit: reqs }
samedir_rename: { samples: 0, unit: reqs }
crossdir_rename: { samples: 0, unit: reqs }
- job_id: 7819409
snapshot_time: 1455722644
open: { samples: 738, unit: reqs }
close: { samples: 572, unit: reqs }
mknod: { samples: 0, unit: reqs }
link: { samples: 0, unit: reqs }
unlink: { samples: 0, unit: reqs }
mkdir: { samples: 0, unit: reqs }
rmdir: { samples: 0, unit: reqs }
rename: { samples: 0, unit: reqs }
getattr: { samples: 66, unit: reqs }
setattr: { samples: 0, unit: reqs }
getxattr: { samples: 0, unit: reqs }
setxattr: { samples: 0, unit: reqs }
statfs: { samples: 0, unit: reqs }
sync: { samples: 1396655, unit: reqs }
samedir_rename: { samples: 0, unit: reqs }
crossdir_rename: { samples: 0, unit: reqs }