-----BEGIN PGP SIGNED MESSAGE-----
On 07/31/2014 01:04 AM, Aaron Lu wrote:
On Wed, Jul 30, 2014 at 10:25:03AM -0400, Rik van Riel wrote:
> On 07/29/2014 10:14 PM, Aaron Lu wrote:
>> On Tue, Jul 29, 2014 at 04:04:37PM -0400, Rik van Riel wrote:
>>> On Tue, 29 Jul 2014 10:17:12 +0200 Peter Zijlstra
>>> <peterz(a)infradead.org> wrote:
>>>>> +#define NUMA_SCALE 1000
>>>>> +#define NUMA_MOVE_THRESH 50
>>>> Please make that 1024, there's no reason not to use power
>>>> of two here. This base 10 factor thing annoyed me no end
>>>> already, it's time for it to die.
>>> That's easy enough. However, it would be good to know
>>> whether this actually helps with the regression Aaron found
>> Sorry for the delay.
>> I applied the last patch and queued the hackbench job to the
>> ivb42 test machine to run 5 times. Here is the result
>> (the proc-vmstat.numa_hint_faults_local field):
>> 173565 201262 192317 198342 198595
>> avg: 192816
>> It still seems much higher than with previous kernels.
> It looks like a step in the right direction, though.
> Could you try running with a larger threshold?
>>> +++ b/kernel/sched/fair.c
>>> @@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int
>>>  /*
>>>   * These return the fraction of accesses done by a particular task, or
>>> - * task group, on a particular numa node. The group weight is given a
>>> - * larger multiplier, in order to group tasks together that are almost
>>> - * evenly spread out between numa nodes.
>>> + * task group, on a particular numa node. The NUMA move threshold
>>> + * prevents task moves with marginal improvement, and is set to 5%.
>>>   */
>>> +#define NUMA_SCALE 1024
>>> +#define NUMA_MOVE_THRESH (5 * NUMA_SCALE / 100)
> It would be good to see if changing NUMA_MOVE_THRESH to
> (NUMA_SCALE / 8) does the trick.
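
For reference, here is what the two thresholds under discussion work out to. This is a standalone user-space sketch, not kernel code: the names NUMA_MOVE_THRESH_5PCT and NUMA_MOVE_THRESH_EIGHTH and the helper numa_scaled_to_percent() are made up for illustration, and how fair.c actually compares against the threshold is not shown here.

```c
#define NUMA_SCALE 1024

/* The patch's 5% threshold; integer division truncates 51.2 to 51. */
#define NUMA_MOVE_THRESH_5PCT   (5 * NUMA_SCALE / 100)

/* The larger threshold suggested for the next test run. */
#define NUMA_MOVE_THRESH_EIGHTH (NUMA_SCALE / 8)

/* Hypothetical helper: a NUMA_SCALE-based value as a percentage. */
static double numa_scaled_to_percent(long scaled)
{
    return 100.0 * (double)scaled / NUMA_SCALE;
}
```

With NUMA_SCALE at 1024, the 5% threshold is 51 (just under 5% after truncation) and NUMA_SCALE / 8 is 128, i.e. 12.5%, so the suggested change makes the threshold roughly two and a half times larger.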
With your 2nd patch and the above change, the result is:
"proc-vmstat.numa_hint_faults_local": [ 199708, 209152, 200638,
187324, 196654 ],
OK, so it is still a little higher than your original 162245.
I guess this is to be expected, since the code will be more
successful at placing a task on the right node, which results
in the task scanning its memory more rapidly for a little bit.
Are you seeing any changes in throughput?
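
To put a number on "a little higher", the five runs above can be averaged and compared with the 162245 figure. A rough sketch in plain C follows; the helper names mean_of() and percent_over() are hypothetical, and treating 162245 as the baseline is an assumption based on the wording above.

```c
#include <stddef.h>

/* numa_hint_faults_local from the five runs with NUMA_SCALE / 8. */
static const long runs_thresh_eighth[] = {
    199708, 209152, 200638, 187324, 196654
};

/* Assumed baseline: the "original 162245" mentioned above. */
static const long baseline_faults = 162245;

/* Hypothetical helper: integer mean of n samples (rounded down). */
static long mean_of(const long *samples, size_t n)
{
    long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return n ? sum / (long)n : 0;
}

/* Hypothetical helper: how far value sits above baseline, in percent. */
static double percent_over(long value, long baseline)
{
    return 100.0 * (double)(value - baseline) / (double)baseline;
}
```

Calling mean_of(runs_thresh_eighth, 5) gives 198695, and percent_over() of that against baseline_faults comes out a bit over 22%. Five runs is a small sample, so this is only a rough indication.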
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
-----END PGP SIGNATURE-----