Patrick,
Here are a few images that should clarify what the blue (writes) and
red (reads) lines are all about. When plotting the millions of these
for an entire iozone run, each read and write plotted ends up on a
single pixel, but you then get an aggregate view of the application's
I/O pattern. The slopes of the aggregate lines indicate the data
delivery rate of the file system. When reads are coming out of the
system cache, the slope is very steep, as opposed to the shallow slope
when the data has to come from the OSTs.
With this info, now go back and look at the image in the previous
email, where the file position activity is overlaid with the OSC
cached_mb values. Now you can see how the OSCs are responding to
iozone's writes and reads.
John
On 2/5/2015 11:27 AM, Patrick Farrell wrote:
John,
I don't have anything to add at the moment, but I am watching your
explorations with interest. Thanks for sharing this.
One question - The blue and red lines coming up the graph... What are
those? (Particularly, the one which peaks and then heads back down?)
- Patrick
On 02/05/2015 07:46 AM, John Bauer wrote:
> Richard, Patrick
>
> The more I look at this, the more bizarre it gets. Now when I run
> this iozone test, I also track *cached_mb* for each OSC. This plot
> has the file position activity plot overlaid with the value of
> cached_mb for the 16 OSTs that the file was striped across. Things
> are predictable until the size of the file being written exceeds the
> amount of memory that Lustre can use for caching (during the first
> write). After that, the competition for buffer memory among the
> OSCs starts. Further comments are on the plot image.
>
> [plot image: file position activity overlaid with per-OSC cached_mb]
>
> I still have not determined why *cached_mb* for any OSC never
> exceeds 8GB in the test cases where I stripe across 2, 3, 4, 5, 6,
> or 7 OSTs. In those cases the sum of cached_mb for the OSTs in use
> never reaches the 50%-of-system-memory limit.
>
> John
>
>
> On 2/3/2015 4:47 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
>>> On Feb 3, 2015, at 3:58 PM, Patrick Farrell<paf(a)cray.com> wrote:
>>>
>>> Interesting that Rick's seeing 3/4 on his system. The limit looks
>>> to be < 512MB, if I'm reading correctly.
>> I glanced at the Lustre source for my 2.5.3 client and found this:
>>
>>     pages = si.totalram - si.totalhigh;
>>     if (pages >> (20 - PAGE_CACHE_SHIFT) < 512) {
>>             lru_page_max = pages / 2;
>>     } else {
>>             lru_page_max = (pages / 4) * 3;
>>     }
>>
>> The way I am reading this is that if the system has < 512MB of
>> memory, lru_page_max is 1/2 the system RAM. Otherwise, it will be
>> 3/4 of the system RAM.
>>
>> --
>> Rick Mohr
>> Senior HPC System Administrator
>> National Institute for Computational Sciences
>>
http://www.nics.tennessee.edu
>>
>> _______________________________________________
>> HPDD-discuss mailing list
>> HPDD-discuss(a)lists.01.org
>>
https://lists.01.org/mailman/listinfo/hpdd-discuss
>