Hi Minchan,
Do you have time to look at this? I re-tested it on v5.14 and the
regression still exists. Thanks.
On 5/26/2021 12:57 AM, Chris Goldsworthy wrote:
On 2021-05-25 09:39, Minchan Kim wrote:
> On Tue, May 25, 2021 at 08:16:03AM -0700, Minchan Kim wrote:
>
> < snip >
>
>> > > On Thu, May 20, 2021 at 04:31:44PM +0800, kernel test robot wrote:
>> > > >
>> > > >
>> > > > Greeting,
>> > > >
>> > > > FYI, we noticed a -21.8% regression of fio.write_iops due to commit:
>> > > >
>> > > >
>> > > > commit: 8cc621d2f45ddd3dc664024a647ee7adf48d79a5 ("mm: fs: invalidate BH LRU during page migration")
>> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>> > > >
>> > > >
>> > > > in testcase: fio-basic
>> > > > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
>> > > > with following parameters:
>> > > >
>> > > > disk: 2pmem
>> > > > fs: ext4
>> > > > runtime: 200s
>> > > > nr_task: 50%
>> > > > time_based: tb
>> > > > rw: randwrite
>> > > > bs: 4k
>> > > > ioengine: libaio
>> > > > test_size: 200G
>> > > > cpufreq_governor: performance
>> > > > ucode: 0x5003006
>> > > >
>> > > > test-description: Fio is a tool that will spawn a number of threads
>> > > > or processes doing a particular type of I/O action as specified by
>> > > > the user.
>> > > > test-url: https://github.com/axboe/fio
>> > > >
>> > > >
>> > > >
>> > > > If you fix the issue, kindly add following tag
>> > > > Reported-by: kernel test robot <oliver.sang(a)intel.com>
>> > > >
>> > > >
>> > > > Details are as below:
>> > > >
>> > > > -------------------------------------------------------------------------------------------------->
>> > > >
>> > > >
>> > > > To reproduce:
>> > > >
>> > > >         git clone https://github.com/intel/lkp-tests.git
>> > > >         cd lkp-tests
>> > > >         bin/lkp install job.yaml  # job file is attached in this email
>> > > >         bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
>> > > >         bin/lkp run generated-yaml-file
>> > >
>> > > Hi,
>> > >
>> > > I tried to install lkp-tests on my machine by following the guide above,
>> > > but it failed due to package problems (probably my fault, since I use an
>> > > unusual environment). However, I guess the regression comes from an
>> > > increased miss ratio of bh_lrus, since the patch invalidates bh_lrus
>> > > more frequently than before. For example, lru_add_drain can be called
>> > > from several hot places (e.g., unmap, and pagevec_release from several
>> > > paths), so it could keep invalidating bh_lrus.
>> > >
>> > > IMO, we should move the overhead from such hot paths to a cold one. How
>> > > about this?
>> > >
>> > > From ebf4ede1cf32fb14d85f0015a3693cb8e1b8dbfe Mon Sep 17 00:00:00 2001
>> > > From: Minchan Kim <minchan(a)kernel.org>
>> > > Date: Thu, 20 May 2021 11:17:56 -0700
>> > > Subject: [PATCH] invalidate bh_lrus only at lru_add_drain_all
>> > >
>> > > Not-Yet-Signed-off-by: Minchan Kim <minchan(a)kernel.org>
>> > > ---
>> > > mm/swap.c | 15 +++++++++++++--
>> > > 1 file changed, 13 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/mm/swap.c b/mm/swap.c
>> > > index dfb48cf9c2c9..d6168449e28c 100644
>> > > --- a/mm/swap.c
>> > > +++ b/mm/swap.c
>> > > @@ -642,7 +642,6 @@ void lru_add_drain_cpu(int cpu)
>> > >  		pagevec_lru_move_fn(pvec, lru_lazyfree_fn);
>> > >
>> > >  	activate_page_drain(cpu);
>> > > -	invalidate_bh_lrus_cpu(cpu);
>> > >  }
>> > >
>> > >  /**
>> > > @@ -725,6 +724,17 @@ void lru_add_drain(void)
>> > >  	local_unlock(&lru_pvecs.lock);
>> > >  }
>> > >
>> > > +void lru_and_bh_lrus_drain(void)
>> > > +{
>> > > +	int cpu;
>> > > +
>> > > +	local_lock(&lru_pvecs.lock);
>> > > +	cpu = smp_processor_id();
>> > > +	lru_add_drain_cpu(cpu);
>> > > +	local_unlock(&lru_pvecs.lock);
>> > > +	invalidate_bh_lrus_cpu(cpu);
>> > > +}
>> > > +
>> >
>> > Nit: drop int cpu?
>>
>> Do you mean to suggest using smp_processor_id() at both places
>> instead of a local variable? Since invalidate_bh_lrus_cpu
>> is called outside the lru_pvecs.lock, I wanted to express
>> that the draining happens on the same CPU by storing the CPU.
>>
>> >
>> > >  void lru_add_drain_cpu_zone(struct zone *zone)
>> > >  {
>> > >  	local_lock(&lru_pvecs.lock);
>> > > @@ -739,7 +749,7 @@ static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>> > >
>> > >  static void lru_add_drain_per_cpu(struct work_struct *dummy)
>> > >  {
>> > > -	lru_add_drain();
>> > > +	lru_and_bh_lrus_drain();
>> > >  }
>> > >
>> > >  /*
>> > > @@ -881,6 +891,7 @@ void lru_cache_disable(void)
>> > >  	__lru_add_drain_all(true);
>> > >  #else
>> > >  	lru_add_drain();
>> > > +	invalidate_bh_lrus_cpu(smp_processor_id());
>> > >  #endif
>> > >  }
>> >
>> > Can't we replace the call to lru_add_drain() and
>> > invalidate_bh_lrus_cpu(smp_processor_id()) with a single call to
>> > lru_and_bh_lrus_drain()?
>>
>> Good idea.
>
> From 8caadeb49d82403a08643dfbdb0b7749017c00bb Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan(a)kernel.org>
> Date: Tue, 25 May 2021 08:19:17 -0700
> Subject: [PATCH] mm: fs: invalidate bh_lrus for only cold path
>
> The kernel test robot reported a regression of fio.write_iops[1]
> with [2].
>
> Since lru_add_drain is called frequently, invalidating bh_lrus
> there could increase the bh_lrus cache miss ratio, which needs
> more IO in the end.
>
> This patch moves the bh_lrus invalidation from the hot path
> (e.g., zap_page_range, pagevec_release) to the cold path (i.e.,
> lru_add_drain_all, lru_cache_disable).
>
> [1] https://lore.kernel.org/lkml/20210520083144.GD14190@xsang-OptiPlex-9020/
> [2] 8cc621d2f45d, mm: fs: invalidate BH LRU during page migration
> Cc: Xing, Zhengjun <zhengjun.xing(a)intel.com>
> Cc: Chris Goldsworthy <cgoldswo(a)codeaurora.org>
> Reported-by: kernel test robot <oliver.sang(a)intel.com>
> Signed-off-by: Minchan Kim <minchan(a)kernel.org>
> ---
> mm/swap.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 1958d5feb148..3e25d99a9dbb 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -642,7 +642,6 @@ void lru_add_drain_cpu(int cpu)
>  		pagevec_lru_move_fn(pvec, lru_lazyfree_fn);
>
>  	activate_page_drain(cpu);
> -	invalidate_bh_lrus_cpu(cpu);
>  }
>
>  /**
> @@ -725,6 +724,17 @@ void lru_add_drain(void)
>  	local_unlock(&lru_pvecs.lock);
>  }
>
> +static void lru_add_and_bh_lrus_drain(void)
> +{
> +	int cpu;
> +
> +	local_lock(&lru_pvecs.lock);
> +	cpu = smp_processor_id();
> +	lru_add_drain_cpu(cpu);
> +	local_unlock(&lru_pvecs.lock);
> +	invalidate_bh_lrus_cpu(cpu);
> +}
> +
>  void lru_add_drain_cpu_zone(struct zone *zone)
>  {
>  	local_lock(&lru_pvecs.lock);
> @@ -739,7 +749,7 @@ static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>
>  static void lru_add_drain_per_cpu(struct work_struct *dummy)
>  {
> -	lru_add_drain();
> +	lru_add_and_bh_lrus_drain();
>  }
>
>  /*
> @@ -880,7 +890,7 @@ void lru_cache_disable(void)
>  	 */
>  	__lru_add_drain_all(true);
>  #else
> -	lru_add_drain();
> +	lru_add_and_bh_lrus_drain();
>  #endif
>  }
Feel free to add:
Reviewed-by: Chris Goldsworthy <cgoldswo(a)codeaurora.org>