[PATCH 00/16] Memory Hierarchy: Enable target node lookups for reserved memory
by Dan Williams
Yes, this patch series looks like a pile of boring libnvdimm cleanups,
but buried at the end are some small gems that testing with libnvdimm
uncovered. These gems will prove more valuable over time for Memory
Hierarchy management as more platforms, via the ACPI HMAT and EFI
Specific Purpose Memory, publish reserved or "soft-reserved" ranges to
Linux. Linux system administrators will expect to be able to interact
with those ranges with a unique numa node number when/if that memory is
onlined via the dax_kmem driver [1].
One configuration that currently fails to properly convey the target
node for the resulting memory hotplug operation is persistent memory
defined by the memmap=nn!ss parameter. For example, today if node1 is a
memory-only node, and all the memory from node1 is specified to
memmap=nn!ss and subsequently onlined, it will end up being onlined as
node0 memory. As it stands, memory_add_physaddr_to_nid() can only
identify online nodes, and since node1 in this example has no online
cpus or memory, the target node is initialized to node0.
The fix is to preserve rather than discard the numa_meminfo entries that
are relevant for reserved memory ranges, and to uplevel the node
distance helper for determining the "local" (closest) node relative to
an initiator node.
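To illustrate the idea of keeping reserved ranges around for lookup, here is a
minimal userspace sketch. It is not the code from this series: the struct
range_node type, the NO_NODE constant, and the function names are hypothetical
stand-ins, and the two flat tables are an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)	/* hypothetical "no node found" marker */

/* Hypothetical range record; not the kernel's own data structure. */
struct range_node {
	uint64_t start;	/* inclusive physical start address */
	uint64_t end;	/* exclusive physical end address */
	int nid;	/* node that the range belongs to */
};

/* Linear search of one table for the range containing addr. */
static int lookup(const struct range_node *tbl, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		assert(tbl[i].start < tbl[i].end);
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return tbl[i].nid;
	}
	return NO_NODE;
}

/*
 * Consult the online ranges first, then fall back to the preserved
 * reserved ranges. If the reserved entries had been discarded, the
 * second lookup would have nothing to match against.
 */
static int phys_to_target_node_sketch(const struct range_node *online,
				      size_t n_online,
				      const struct range_node *reserved,
				      size_t n_reserved, uint64_t addr)
{
	int nid = lookup(online, n_online, addr);

	if (nid != NO_NODE)
		return nid;
	return lookup(reserved, n_reserved, addr);
}
```

With one online range on node 0 and one reserved range on node 1, an address
inside the reserved range resolves to node 1 only while the reserved table is
still populated; with an empty reserved table the lookup reports NO_NODE.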
The first 12 patches are cleanups to make sure that all nvdimm devices
and their children properly export a numa_node attribute. The switch to
a device-type is less code and less error prone as a result.
Patches 13 and 14 are the core changes (gems) to allow numa node
information for offline memory to be tracked.
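The "map to online node" notion mentioned above can be pictured as picking the
online node with the smallest distance from a given node. The following is a
rough standalone sketch under stated assumptions: MAX_NODES, NO_NODE, the
distance matrix layout, and nearest_online_node() are all hypothetical and do
not come from the patches.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4	/* hypothetical fixed node count for the sketch */
#define NO_NODE (-1)	/* hypothetical "no candidate" marker */

/*
 * Return node itself if it is online, otherwise the online node with the
 * smallest distance[node][n]. Ties go to the lowest-numbered node.
 */
static int nearest_online_node(int node,
			       const int distance[MAX_NODES][MAX_NODES],
			       const bool online[MAX_NODES])
{
	int best = NO_NODE;
	int best_dist = INT_MAX;

	assert(distance && online);
	if (node < 0 || node >= MAX_NODES)
		return NO_NODE;
	if (online[node])
		return node;

	for (int n = 0; n < MAX_NODES; n++) {
		if (!online[n])
			continue;
		if (distance[node][n] < best_dist) {
			best_dist = distance[node][n];
			best = n;
		}
	}
	return best;
}
```

The sketch returns NO_NODE when nothing is online, which leaves the policy for
that case to the caller.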
Patches 15 and 16 use this new capability to fix the conveyance of numa
node information for memmap=nn!ss assignments. See patch 16 for more
details.
[1]: https://pmem.io/ndctl/daxctl-reconfigure-device.html
---
Dan Williams (16):
      libnvdimm: Move attribute groups to device type
      libnvdimm: Move region attribute group definition
      libnvdimm: Move nd_device_attribute_group to device_type
      libnvdimm: Move nd_numa_attribute_group to device_type
      libnvdimm: Move nd_region_attribute_group to device_type
      libnvdimm: Move nd_mapping_attribute_group to device_type
      libnvdimm: Move nvdimm_attribute_group to device_type
      libnvdimm: Move nvdimm_bus_attribute_group to device_type
      dax: Create a dax device_type
      dax: Simplify root read-only definition for the 'resource' attribute
      libnvdimm: Simplify root read-only definition for the 'resource' attribute
      dax: Add numa_node to the default device-dax attributes
      acpi/mm: Up-level "map to online node" functionality
      x86/numa: Provide a range-to-target_node lookup facility
      libnvdimm/e820: Drop the wrapper around memory_add_physaddr_to_nid
      libnvdimm/e820: Retrieve and populate correct 'target_node' info

 arch/powerpc/platforms/pseries/papr_scm.c |  25 ---
 arch/x86/mm/numa.c                        |  72 ++++++++-
 drivers/acpi/nfit/core.c                  |   7 -
 drivers/acpi/numa.c                       |  41 -----
 drivers/dax/bus.c                         |  22 ++-
 drivers/nvdimm/btt_devs.c                 |  24 +--
 drivers/nvdimm/bus.c                      |  15 +-
 drivers/nvdimm/core.c                     |   8 +
 drivers/nvdimm/dax_devs.c                 |  27 +--
 drivers/nvdimm/dimm_devs.c                |  30 ++--
 drivers/nvdimm/e820.c                     |  30 ----
 drivers/nvdimm/namespace_devs.c           |  77 +++++-----
 drivers/nvdimm/nd.h                       |   5 -
 drivers/nvdimm/of_pmem.c                  |  13 --
 drivers/nvdimm/pfn_devs.c                 |  38 ++---
 drivers/nvdimm/region_devs.c              | 235 +++++++++++++++--------------
 include/linux/acpi.h                      |  23 +++
 include/linux/libnvdimm.h                 |   7 -
 include/linux/memory_hotplug.h            |   6 +
 include/linux/numa.h                      |   2
 mm/mempolicy.c                            |  30 ++++
 21 files changed, 382 insertions(+), 355 deletions(-)
1 year, 2 months
[ndctl PATCH v2 1/2] ndctl/namespace: Rework counts reported by enable-namespace
by Vishal Verma
Add detection of 'seed' namespaces
(ndctl_namespace_is_configuration_idle()) to the enable-namespace
operation and libndctl API. In libndctl, return a '1' for seed
namespaces. In namespace.c, reinterpret a '1' based on a check for a
seed namespace, and decide on skip vs success accordingly. Collect this
into a new namespace_enable helper, and make the reported count
consistent by also skipping namespaces that were already enabled.
Link: https://github.com/pmem/ndctl/issues/119
Reported-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
Changes in v2:
- The kernel is the ultimate authority on enabling namespaces, so we
  should let it make the decision of how to handle seed namespaces
  instead of preemptively skipping them. Let the kernel make that
  decision, and fix up error reporting after the fact.

 ndctl/lib/libndctl.c |  9 +++++++--
 ndctl/namespace.c    | 37 ++++++++++++++++++++++++++++++++++---
 2 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
index d6a2800..cde58ff 100644
--- a/ndctl/lib/libndctl.c
+++ b/ndctl/lib/libndctl.c
@@ -4045,8 +4045,13 @@ NDCTL_EXPORT int ndctl_namespace_enable(struct ndctl_namespace *ndns)
 			return 1;
 		}
 
-		err(ctx, "%s: failed to enable\n", devname);
-		return rc ? rc : -ENXIO;
+		if (ndctl_namespace_is_configuration_idle(ndns)) {
+			dbg(ctx, "%s: skip seed namespace\n", devname);
+			return 1;
+		} else {
+			err(ctx, "%s: failed to enable\n", devname);
+			return rc ? rc : -ENXIO;
+		}
 	}
 	rc = 0;
 	dbg(ctx, "%s: enabled\n", devname);
diff --git a/ndctl/namespace.c b/ndctl/namespace.c
index a07d7e2..f2987ca 100644
--- a/ndctl/namespace.c
+++ b/ndctl/namespace.c
@@ -961,6 +961,36 @@ out:
 	return rc;
 }
 
+/*
+ * Adjust the return convention slightly differently from
+ * ndctl_namespace_enable(). We don't care as much if the enable resulted in
+ * a different namespace personality being attached. We care more about success,
+ * failure, or skipped.
+ * return 0 for success
+ * return < 0 for failure
+ * return > 0 for skipped
+ */
+static int namespace_enable(struct ndctl_namespace *ndns)
+{
+	int rc;
+
+	if (ndctl_namespace_is_enabled(ndns))
+		return 1;
+
+	rc = ndctl_namespace_enable(ndns);
+	if (rc < 0)
+		return rc;
+
+	/*
+	 * ndctl_namespace_enable() returns 'success' even for seed namespaces.
+	 * Reinterpret it to determine success vs. skipped.
+	 */
+	if (ndctl_namespace_is_configuration_idle(ndns))
+		return 1;
+
+	return 0;
+}
+
 static int enable_labels(struct ndctl_region *region)
 {
 	int mappings = ndctl_region_get_mappings(region);
@@ -1401,11 +1431,12 @@ static int do_xaction_namespace(const char *namespace,
 						(*processed)++;
 					break;
 				case ACTION_ENABLE:
-					rc = ndctl_namespace_enable(ndns);
-					if (rc >= 0) {
+					rc = namespace_enable(ndns);
+					if (rc == 0)
 						(*processed)++;
+					/* return success if skipped */
+					if (rc > 0)
 						rc = 0;
-					}
 					break;
 				case ACTION_DESTROY:
 					rc = namespace_destroy(region, ndns);
--
2.20.1
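
The success / failure / skipped convention described in the patch above can be
exercised outside of ndctl with a small mock. This is only a sketch: struct
fake_ns, fake_namespace_enable(), and enable_all() are hypothetical names, the
seed check is reduced to a flag, and stopping at the first failure is an
assumption of the sketch rather than a statement about ndctl's loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a namespace; not struct ndctl_namespace. */
struct fake_ns {
	bool enabled;	/* already enabled before the operation */
	bool seed;	/* idle "seed" namespace with no configuration */
	bool fail;	/* simulate an enable failure */
};

/* Returns 0 for success, < 0 for failure, > 0 for skipped. */
static int fake_namespace_enable(struct fake_ns *ns)
{
	assert(ns);
	if (ns->enabled)
		return 1;	/* already enabled: skipped */
	if (ns->seed)
		return 1;	/* seed namespace: skipped */
	if (ns->fail)
		return -ENXIO;
	ns->enabled = true;
	return 0;
}

/* Count only real successes; treat skips as non-errors. */
static int enable_all(struct fake_ns *list, int n, int *processed)
{
	int rc = 0;

	for (int i = 0; i < n; i++) {
		rc = fake_namespace_enable(&list[i]);
		if (rc == 0)
			(*processed)++;
		if (rc > 0)
			rc = 0;	/* skipped is not an error */
		if (rc < 0)
			break;	/* assumption: stop at first failure */
	}
	return rc;
}
```

Given one already-enabled namespace, one seed namespace, and one ordinary
namespace, enable_all() reports success with a processed count of one, which
mirrors the count behaviour the commit message describes.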
Re: DAX filesystem support on ARMv8
by Jan Kara
Hi!
On Tue 12-11-19 02:12:09, Bharat Kumar Gogada wrote:
> As per Documentation/filesystems/dax.txt
>
> The DAX code does not work correctly on architectures which have virtually
> mapped caches such as ARM, MIPS and SPARC.
>
> Can anyone please shed light on dax filesystem issue w.r.t ARM architecture ?
I've CCed Dan, he might have an idea what that comment means :)
Out of curiosity, why do you care?
Honza
--
Jan Kara <jack(a)suse.com>
SUSE Labs, CR