On Thu, Nov 8, 2018 at 10:06 AM Alexander Duyck
<alexander.h.duyck(a)linux.intel.com> wrote:
Introduce four new variants of the async_schedule_ functions that allow
scheduling on a specific NUMA node.
The first two functions are async_schedule_near and
async_schedule_near_domain end up mapping to async_schedule and
async_schedule_domain, but provide NUMA node specific functionality. They
replace the original functions which were moved to inline function
definitions that call the new functions while passing NUMA_NO_NODE.
The second two functions are async_schedule_dev and
async_schedule_dev_domain which provide NUMA specific functionality when
passing a device as the data member and that device has a NUMA node other
than NUMA_NO_NODE.
The main motivation behind this is to address the need to be able to
schedule device specific init work on specific NUMA nodes in order to
improve performance of memory initialization.
What Andrew tends to due when an enhancement is spread over multiple
patches is to duplicate the motivation in each patch. So here you
could include the few sentences you wrote about the performance
benefits of this work:
"What I have seen on several systems is a pretty significant improvement
in initialization time for persistent memory. In the case of 3TB of
memory being initialized on a single node the improvement in the worst
case was from about 36s down to 26s for total initialization time."
...and conclude that the data shows a general benefit for affinitizing
async work to a specific numa node.
With that changelog clarification:
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>