Feng Tang <feng.tang(a)intel.com> writes:
On Mon, Jul 06, 2020 at 06:34:34AM -0700, Andi Kleen wrote:
> > ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > - if (ret == 0 && write)
> > + if (ret == 0 && write) {
> > + if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> > + schedule_on_each_cpu(sync_overcommit_as);
>
> The schedule_on_each_cpu is not atomic, so the problem could still happen
> in that window.
>
> I think it may be ok if it eventually resolves, but certainly needs
> a comment explaining it. Can you do some stress testing toggling the
> policy all the time on different CPUs and running the test on
> other CPUs and see if the test fails?
For the raw test case reported by 0day, this patch passed in 200
runs. I will also read the LTP code and try stress testing it as you
suggested.
> The other alternative would be to define some intermediate state
> for the sysctl variable and only switch to never once the schedule_on_each_cpu
> returned. But that's more complexity.
One thought I had is to put this schedule_on_each_cpu() before
the proc_dointvec_minmax(), so that the sync is done before
sysctl_overcommit_memory is actually changed. But the window still
exists, as the batch is still the larger one at that point.
Can we change the batch first, then sync the global counter, and
finally change the overcommit policy?
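
To make the ordering asked about above concrete (batch first, then sync,
then policy), here is a rough single-threaded userspace model. It is not
kernel code and does not model the races discussed in this thread; it
only shows the sequence of steps. All names in it (counter_model,
model_add, model_sync, model_error, model_set_policy, model_policy) are
hypothetical and made up for illustration. The fold rule (fold the
per-CPU delta into the global count once its magnitude reaches the
batch) is an assumption meant to mimic how a batched per-CPU counter
behaves.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

enum { OVERCOMMIT_GUESS, OVERCOMMIT_ALWAYS, OVERCOMMIT_NEVER };

/* Hypothetical stand-in for a batched per-CPU counter. */
struct counter_model {
	long global;			/* shared count */
	long local[MODEL_NR_CPUS];	/* per-CPU deltas not yet folded in */
	long batch;			/* fold threshold */
};

/* Hypothetical stand-in for the overcommit policy variable. */
static int model_policy = OVERCOMMIT_GUESS;

/* Add on one CPU; fold into the global count once |delta| >= batch. */
static void model_add(struct counter_model *c, int cpu, long amount)
{
	c->local[cpu] += amount;
	if (c->local[cpu] >= c->batch || c->local[cpu] <= -c->batch) {
		c->global += c->local[cpu];
		c->local[cpu] = 0;
	}
}

/* Fold every per-CPU delta into the global count. */
static void model_sync(struct counter_model *c)
{
	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		c->global += c->local[cpu];
		c->local[cpu] = 0;
	}
}

/* Sum of deltas that the global count does not reflect yet. */
static long model_error(const struct counter_model *c)
{
	long err = 0;

	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		err += c->local[cpu];
	return err;
}

/* The proposed order: shrink batch, sync, and only then switch policy. */
static void model_set_policy(struct counter_model *c, int new_policy,
			     long new_batch)
{
	c->batch = new_batch;		/* 1. later adds use the new batch */
	model_sync(c);			/* 2. flush what the old batch left */
	assert(model_error(c) == 0);	/* holds in this model (no concurrency) */
	model_policy = new_policy;	/* 3. policy becomes visible last */
}
```

In this model, any delta accumulated under the larger batch is flushed
before the policy changes, and anything added afterwards is bounded by
the smaller batch. On a real system, adds can still run concurrently
with steps 1 and 2, so this only illustrates the intent.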
Best Regards,
Huang, Ying