On 17/04/2019 at 23:35, Dan Williams wrote:
On Tue, Apr 16, 2019 at 8:31 AM Brice Goglin
<Brice.Goglin(a)inria.fr> wrote:
>
> On 08/04/2019 at 21:55, Brice Goglin wrote:
>
> On 08/04/2019 at 16:56, Dan Williams wrote:
>
> Yes, I agree with all of the above, but I think we need a way to fix
> this independent of the HMAT data being present. The SLIT already
> tells the kernel enough to let tooling figure out equidistant "local"
> nodes. While the numa_node attribute will remain a singleton, the
> tooling needs to handle this case and can't assume the HMAT data will
> be present.
>
> So you want to export the part of SLIT that is currently hidden to
> userspace because the corresponding nodes aren't registered?
>
> With the patch below, I get 17 17 28 28 in dax0.0/node_distance which
> means it's close to node0 and node1.
>
> The code is pretty much a duplicate of read_node_distance() in
> drivers/base/node.c. Not sure it's worth factorizing such small functions?
>
> The name "node_distance" (instead of "distance" for NUMA nodes) is also
> subject to discussion.
>
> Here's a better patch that exports the existing routine for showing
> node distances, and reuses it in dax/bus.c and nvdimm/pfn_devs.c:
>
> # cat /sys/class/block/pmem1/device/node_distance
> 28 28 17 17
> # cat /sys/bus/dax/devices/dax0.0/node_distance
> 17 17 28 28
>
> By the way, it also handles the case where the nd_region has no
> valid target_node (idea stolen from kmem.c).
>
> Are there other places where it'd be useful to export that attribute?
>
> Ideally we could just export it in the region sysfs directory,
> but I can't find backlinks going from daxX.Y or pmemZ to that
> region directory :/
I understand where you're trying to go, but this is too dax-device
specific. What about a storage controller in the topology that is
equidistant from multiple CPU nodes? I'd rather solve this from the
tooling perspective, by looking up the CPU nodes that are equidistant
to the device's "numa_node".

I don't see how you're going to look up those equidistant nodes. In the
above case, pmem1's numa_node is 2. Where do you want tools to find the
information that pmem1 is actually close to node2 AND node3?

That information is hidden in the SLIT entries node5<->node2 and
node5<->node3, but these are not exposed to userspace tools since node5
isn't registered.

Brice
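
For illustration only, here is a minimal sketch of how a tool might
interpret a line in the format shown for the proposed "node_distance"
attribute. It assumes that the i-th value on the line is the distance to
node i (that is, that the online nodes are numbered contiguously from 0),
which the thread does not state explicitly. The helper name
nearest_nodes is hypothetical, and the attribute itself only exists with
the patch discussed above.

```python
def nearest_nodes(distance_line):
    """Return the indices of the nodes that share the smallest distance.

    distance_line is a whitespace-separated list of integer distances,
    e.g. the content of the proposed node_distance attribute.
    Assumption: position i in the line corresponds to node i.
    """
    distances = [int(tok) for tok in distance_line.split()]
    if not distances:
        return []
    best = min(distances)
    # Keep every node tied for the minimum, so equidistant nodes all show up.
    return [node for node, d in enumerate(distances) if d == best]


# The two lines quoted earlier in the thread.
print(nearest_nodes("28 28 17 17"))  # pmem1
print(nearest_nodes("17 17 28 28"))  # dax0.0
```

With the values quoted in the thread, this reports nodes 2 and 3 for
pmem1 and nodes 0 and 1 for dax0.0, which matches the locality described
above. Returning all nodes tied for the minimum, rather than a single
node, is the point of the exercise: a singleton numa_node cannot express
that tie.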