On Fri, May 18, 2018 at 3:00 PM, Mikulas Patocka <mpatocka(a)redhat.com> wrote:
> On Fri, 18 May 2018, Dan Williams wrote:
>
> > >> ...and I wonder what the benefit is of the 16-byte case? I would
> > >> assume the bulk of the benefit is limited to the 4 and 8 byte copy
> > >> cases.
> > >
> > > dm-writecache uses 16-byte writes frequently, so it is needed for that.
> > >
> > > If we split a 16-byte write into two 8-byte writes, it would degrade
> > > performance for architectures where memcpy_flushcache needs to flush the
> > > cache.
> >
> > My question was how measurable it is to special case 16-byte
> > transfers? I know Ingo is going to ask this question, so it would
> > speed things along if this patch included performance benefit numbers
> > for each special case in the changelog.
>
> I tested it some time ago - and the movnti instruction has 2% better
> throughput than the existing memcpy_flushcache function.
>
> It is doing one 16-byte write for every sector written and one 8-byte
> write for every sector clean-up. So, the overhead is measurable.

Awesome, include those measured numbers in the changelog for the next
spin of the patch.
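
For readers following along, the kind of size special-casing being discussed
can be sketched in user space roughly as below. This is only an illustration
under stated assumptions, not the code from the patch: copy_small_nt() and
copy_small_nt_drain() are hypothetical names, the destination is assumed to be
naturally aligned for the store size, and the non-temporal path is taken only
on x86-64 with SSE2, where the compiler intrinsics map to movnti. Everything
else falls back to plain memcpy().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>
#define HAVE_MOVNTI 1
#else
#define HAVE_MOVNTI 0
#endif

/*
 * Hypothetical helper (not the kernel's memcpy_flushcache()): copy cnt
 * bytes, using non-temporal stores for the 4, 8 and 16 byte cases.
 * Assumes dst is aligned to the store size.
 */
static inline void copy_small_nt(void *dst, const void *src, size_t cnt)
{
#if HAVE_MOVNTI
	int32_t w;
	long long q[2];

	switch (cnt) {
	case 4:
		memcpy(&w, src, 4);		/* load without alignment assumptions */
		_mm_stream_si32((int *)dst, w);	/* one 32-bit movnti */
		return;
	case 8:
		memcpy(q, src, 8);
		_mm_stream_si64((long long *)dst, q[0]);	/* one 64-bit movnti */
		return;
	case 16:
		memcpy(q, src, 16);
		_mm_stream_si64((long long *)dst, q[0]);	/* two 64-bit movnti */
		_mm_stream_si64((long long *)dst + 1, q[1]);
		return;
	}
#endif
	/* Any other size, or no movnti available: ordinary copy. */
	memcpy(dst, src, cnt);
}

/* Hypothetical helper: order earlier non-temporal stores before later stores. */
static inline void copy_small_nt_drain(void)
{
#if HAVE_MOVNTI
	_mm_sfence();
#endif
}
```

A caller would issue copy_small_nt(dst, src, 16) for each 16-byte record and
call copy_small_nt_drain() once the batch is done. Whether the 16-byte case
earns its own branch is exactly what the measured numbers should show.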