[PATCH v6 00/11] device-dax: support sub-dividing soft-reserved
ranges
by Dan Williams
Changes since v5 [1]:
- (David) Introduce range_len() to include/linux/range.h immediately in
"device-dax: make pgmap optional for instance creation" rather than
wait until "mm/memremap_pages: convert to 'struct range'" to move it.
- (David) David points out that release_mem_region() can not be used in
the kmem driver since it depends on the resource range being busy at
free. The dance the driver does to hand-off busy/free management to
add_memory_driver_managed() breaks request_mem_region()'s assumptions
and requires the driver to continue to use an open-coded
release_resource() + kfree() sequence. For the new multi-range case,
expand the driver-data to hold all the resulting 'struct resource'
instances from mapping the ranges.
- (Boris) consolidate pgmap manipulation code in the
xen_alloc_unpopulated_pages() path. Since this touched
"mm/memremap_pages: convert to 'struct range'" with the pending fix from
Dan, I folded in that fix and gave him a Reported-by credit.
[1]: http://lore.kernel.org/r/160106109960.30709.7379926726669669398.stgit@dwi...
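As an illustration of the range_len() helper mentioned above, here is a minimal userspace sketch. It assumes that 'struct range' holds an inclusive start and end, which is why the length adds one. The ranges_total_len() helper is hypothetical and only shows how a multi-range consumer might sum several ranges; it is not taken from the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of 'struct range': both ends are inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Length in bytes of an inclusive range, hence the "+ 1". */
static inline uint64_t range_len(const struct range *range)
{
	return range->end - range->start + 1;
}

/* Hypothetical helper: total bytes covered by an array of ranges. */
static uint64_t ranges_total_len(const struct range *ranges, size_t nr)
{
	uint64_t total = 0;

	for (size_t i = 0; i < nr; i++)
		total += range_len(&ranges[i]);
	return total;
}
```

For example, a range of 0x1000..0x1fff has a length of 0x1000 bytes, not 0xfff.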
---
Hi Andrew,
As before, patches that are in your tree and did not change as a result
of these updates are not re-sent. This set replaces:
device-dax-make-pgmap-optional-for-instance-creation.patch
...through...
device-dax-add-dis-contiguous-resource-support.patch
...in your stack.
I let this soak over the weekend in a kbuild-robot visible tree, and it
received a build-success notification over 160 configs and no other
regression notices.
---
The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM. It is the mechanism for converting persistent
memory (pmem) to be used as another volatile memory pool i.e. the
current Memory Tiering hot topic on linux-mm.
In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [2]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.
The motivations for this facility are:
1/ Allow performance differentiated memory ranges to be split between
kernel-managed and directly-accessed use cases.
2/ Allow physical memory to be provisioned along performance relevant
address boundaries. For example, divide a memory-side cache [3] along
cache-color boundaries.
3/ Parcel out soft-reserved memory to VMs using device-dax as a security
/ permissions boundary [4]. Specifically I have seen people (ab)using
memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
device-dax interface on custom address ranges. A follow-on for the VM
use case is to teach device-dax to dynamically allocate 'struct page' at
runtime to reduce the duplication of 'struct page' space in both the
guest and the host kernel for the same physical pages.
[2]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@...
[3]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@...
[4]: http://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com
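To make the sysfs-based provisioning flow more concrete, here is a rough sketch of how a tool could write a new size to a device attribute. The attribute path is passed in by the caller and is an assumption of this sketch; the cover letter does not name the attributes. Both write_attr() and set_dax_size() are hypothetical helpers, not daxctl code.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a string value to a sysfs-style attribute file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_attr(const char *path, const char *value)
{
	size_t len = strlen(value);
	int fd = open(path, O_WRONLY);
	ssize_t n;
	int rc = 0;

	if (fd < 0)
		return -errno;
	n = write(fd, value, len);
	if (n < 0)
		rc = -errno;
	else if ((size_t)n != len)
		rc = -EIO;	/* treat a short write as an error */
	close(fd);
	return rc;
}

/* Hypothetical: request a new instance size, in bytes, via an attribute. */
static int set_dax_size(const char *size_attr_path, unsigned long long bytes)
{
	char buf[32];

	snprintf(buf, sizeof(buf), "%llu", bytes);
	return write_attr(size_attr_path, buf);
}
```

A caller would pass the path of the size attribute for the instance it wants to resize and check for a negative return value.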
---
Dan Williams (11):
device-dax: make pgmap optional for instance creation
device-dax/kmem: introduce dax_kmem_range()
device-dax/kmem: move resource tracking to drvdata
device-dax: add an allocation interface for device-dax instances
device-dax: introduce 'struct dev_dax' typed-driver operations
device-dax: introduce 'seed' devices
drivers/base: make device_find_child_by_name() compatible with sysfs inputs
device-dax: add resize support
mm/memremap_pages: convert to 'struct range'
mm/memremap_pages: support multiple ranges per invocation
device-dax: add dis-contiguous resource support
arch/powerpc/kvm/book3s_hv_uvmem.c | 14 -
drivers/base/core.c | 2
drivers/dax/bus.c | 708 ++++++++++++++++++++++++++++++--
drivers/dax/bus.h | 11
drivers/dax/dax-private.h | 23 +
drivers/dax/device.c | 71 ++-
drivers/dax/hmem/hmem.c | 14 -
drivers/dax/kmem.c | 198 ++++++---
drivers/dax/pmem/compat.c | 2
drivers/dax/pmem/core.c | 14 -
drivers/gpu/drm/nouveau/nouveau_dmem.c | 15 -
drivers/nvdimm/badrange.c | 26 +
drivers/nvdimm/claim.c | 13 -
drivers/nvdimm/nd.h | 3
drivers/nvdimm/pfn_devs.c | 13 -
drivers/nvdimm/pmem.c | 27 +
drivers/nvdimm/region.c | 21 +
drivers/pci/p2pdma.c | 12 -
drivers/xen/unpopulated-alloc.c | 49 +-
include/linux/memremap.h | 11
include/linux/range.h | 6
lib/test_hmm.c | 51 +-
mm/memremap.c | 299 ++++++++------
tools/testing/nvdimm/dax-dev.c | 22 +
tools/testing/nvdimm/test/iomap.c | 2
25 files changed, 1216 insertions(+), 411 deletions(-)
base-commit: d524ed85683d657593ac1e58098407bed0601a84
1 year, 7 months
[PATCH v2] ext4/xfs: add page refcount helper
by Ralph Campbell
There are several places where ZONE_DEVICE struct pages assume a reference
count == 1 means the page is idle and free. Instead of open coding this,
add helper functions to hide this detail.
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Acked-by: Darrick J. Wong <darrick.wong(a)oracle.com>
Acked-by: Theodore Ts'o <tytso(a)mit.edu> # for fs/ext4/inode.c
---
Changes in v2:
I strongly resisted the idea of extending this patch but after Jan
Kara's comment about there being more places that could be cleaned
up, I felt compelled to make this one tensy wensy change to add
a dax_wakeup_page() to match the dax_wait_page().
I kept the Reviewed/Acked-bys since I don't think this substantially
changes the patch.
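For readers unfamiliar with the convention the helpers encode, the following toy model sketches it in userspace. It assumes a simplified page with only a reference count, where a count of 1 means idle and the drop to 1 is the point at which waiters get notified. All toy_* names are hypothetical and the wakeup is modeled as a counter rather than a real wait queue.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for 'struct page'; only the reference count is modeled. */
struct toy_page {
	atomic_int refcount;
	int wakeups;	/* counts how often waiters would have been woken */
};

/* The page is treated as idle when only the base reference remains. */
static bool toy_page_is_idle(struct toy_page *page)
{
	return atomic_load(&page->refcount) == 1;
}

static void toy_page_get(struct toy_page *page)
{
	atomic_fetch_add(&page->refcount, 1);
}

/* Drop a reference; on the transition to 1, notify would-be waiters. */
static void toy_page_put(struct toy_page *page)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&page->refcount, 1) - 1 == 1)
		page->wakeups++;
}
```

Keeping the "== 1" test in one helper means callers do not need to know which count marks a page as idle.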
fs/dax.c | 4 ++--
fs/ext4/inode.c | 5 +----
fs/xfs/xfs_file.c | 4 +---
include/linux/dax.h | 15 +++++++++++++++
mm/memremap.c | 3 ++-
5 files changed, 21 insertions(+), 10 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 5b47834f2e1b..85c63f735909 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping,
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);

-		WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
+		WARN_ON_ONCE(trunc && !dax_layout_is_idle_page(page));
 		WARN_ON_ONCE(page->mapping && page->mapping != mapping);
 		page->mapping = NULL;
 		page->index = 0;
@@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry)
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);

-		if (page_ref_count(page) > 1)
+		if (!dax_layout_is_idle_page(page))
 			return page;
 	}
 	return NULL;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 771ed8b1fadb..132620cbfa13 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3937,10 +3937,7 @@ int ext4_break_layouts(struct inode *inode)
 		if (!page)
 			return 0;

-		error = ___wait_var_event(&page->_refcount,
-				atomic_read(&page->_refcount) == 1,
-				TASK_INTERRUPTIBLE, 0, 0,
-				ext4_wait_dax_page(ei));
+		error = dax_wait_page(ei, page, ext4_wait_dax_page);
 	} while (error == 0);

 	return error;
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 3d1b95124744..a5304aaeaa3a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -749,9 +749,7 @@ xfs_break_dax_layouts(
 		return 0;

 	*retry = true;
-	return ___wait_var_event(&page->_refcount,
-			atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
-			0, 0, xfs_wait_dax_page(inode));
+	return dax_wait_page(inode, page, xfs_wait_dax_page);
 }

 int
diff --git a/include/linux/dax.h b/include/linux/dax.h
index b52f084aa643..e2da78e87338 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -243,6 +243,21 @@ static inline bool dax_mapping(struct address_space *mapping)
 	return mapping->host && IS_DAX(mapping->host);
 }

+static inline bool dax_layout_is_idle_page(struct page *page)
+{
+	return page_ref_count(page) == 1;
+}
+
+static inline void dax_wakeup_page(struct page *page)
+{
+	wake_up_var(&page->_refcount);
+}
+
+#define dax_wait_page(_inode, _page, _wait_cb) \
+	___wait_var_event(&(_page)->_refcount, \
+		dax_layout_is_idle_page(_page), \
+		TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode))
+
 #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
 void hmem_register_device(int target_nid, struct resource *r);
 #else
diff --git a/mm/memremap.c b/mm/memremap.c
index 2bb276680837..504a10ff2edf 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -12,6 +12,7 @@
 #include <linux/types.h>
 #include <linux/wait_bit.h>
 #include <linux/xarray.h>
+#include <linux/dax.h>

 static DEFINE_XARRAY(pgmap_array);

@@ -508,7 +509,7 @@ void free_devmap_managed_page(struct page *page)
 {
 	/* notify page idle for dax */
 	if (!is_device_private_page(page)) {
-		wake_up_var(&page->_refcount);
+		dax_wakeup_page(page);
 		return;
 	}

--
2.20.1
[ndctl PATCH] build: Use asciidoc instead of asciidoctor on RHEL
by Dan Williams
Until RHEL moves to asciidoctor, fall back to the old asciidoc for RHEL
builds.
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
ndctl.spec.in | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/ndctl.spec.in b/ndctl.spec.in
index 94e15ad309c5..056c53069082 100644
--- a/ndctl.spec.in
+++ b/ndctl.spec.in
@@ -9,7 +9,12 @@ Source0: https://github.com/pmem/%{name}/archive/v%{version}.tar.gz#/%{name}-%{v
Requires: LNAME%{?_isa} = %{version}-%{release}
Requires: DAX_LNAME%{?_isa} = %{version}-%{release}
BuildRequires: autoconf
+%if 0%{?rhel} < 9
+BuildRequires: asciidoc
+%define asciidoc --disable-asciidoctor
+%else
BuildRequires: rubygem-asciidoctor
+%endif
BuildRequires: xmlto
BuildRequires: automake
BuildRequires: libtool
@@ -86,7 +91,7 @@ control API for these devices.
%build
echo %{version} > version
./autogen.sh
-%configure --disable-static --disable-silent-rules
+%configure --disable-static --disable-silent-rules %{?asciidoc}
make %{?_smp_mflags}
%install
[PATCH] ext4/xfs: add page refcount helper
by Ralph Campbell
There are several places where ZONE_DEVICE struct pages assume a reference
count == 1 means the page is idle and free. Instead of open coding this,
add a helper function to hide this detail.
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
---
I'm resending this as a separate patch since I think it is ready to
merge. Originally, this was part of an RFC and is unchanged from v3:
https://lore.kernel.org/linux-mm/20201001181715.17416-1-rcampbell@nvidia.com
It applies cleanly to linux-5.9.0-rc7-mm1 but doesn't really
depend on anything, just simple merge conflicts when applied to
other trees.
I'll let the various maintainers decide which tree and when to merge.
It isn't urgent since it is a clean up patch.
fs/dax.c | 4 ++--
fs/ext4/inode.c | 5 +----
fs/xfs/xfs_file.c | 4 +---
include/linux/dax.h | 10 ++++++++++
4 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 5b47834f2e1b..85c63f735909 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping,
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);

-		WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
+		WARN_ON_ONCE(trunc && !dax_layout_is_idle_page(page));
 		WARN_ON_ONCE(page->mapping && page->mapping != mapping);
 		page->mapping = NULL;
 		page->index = 0;
@@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry)
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);

-		if (page_ref_count(page) > 1)
+		if (!dax_layout_is_idle_page(page))
 			return page;
 	}
 	return NULL;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 771ed8b1fadb..132620cbfa13 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3937,10 +3937,7 @@ int ext4_break_layouts(struct inode *inode)
 		if (!page)
 			return 0;

-		error = ___wait_var_event(&page->_refcount,
-				atomic_read(&page->_refcount) == 1,
-				TASK_INTERRUPTIBLE, 0, 0,
-				ext4_wait_dax_page(ei));
+		error = dax_wait_page(ei, page, ext4_wait_dax_page);
 	} while (error == 0);

 	return error;
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 3d1b95124744..a5304aaeaa3a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -749,9 +749,7 @@ xfs_break_dax_layouts(
 		return 0;

 	*retry = true;
-	return ___wait_var_event(&page->_refcount,
-			atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
-			0, 0, xfs_wait_dax_page(inode));
+	return dax_wait_page(inode, page, xfs_wait_dax_page);
 }

 int
diff --git a/include/linux/dax.h b/include/linux/dax.h
index b52f084aa643..8909a91cd381 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space *mapping)
 	return mapping->host && IS_DAX(mapping->host);
 }

+static inline bool dax_layout_is_idle_page(struct page *page)
+{
+	return page_ref_count(page) == 1;
+}
+
+#define dax_wait_page(_inode, _page, _wait_cb) \
+	___wait_var_event(&(_page)->_refcount, \
+		dax_layout_is_idle_page(_page), \
+		TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode))
+
 #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
 void hmem_register_device(int target_nid, struct resource *r);
 #else
--
2.20.1
[PATCH] x86/mce: Gate copy_mc_fragile() export by
CONFIG_COPY_MC_TEST=y
by Dan Williams
It appears that modpost is not happy about exporting assembly symbols
that are not consumed in the same build. As Boris reports:
WARNING: modpost: EXPORT symbol "copy_mc_fragile" [vmlinux] version generation failed, symbol will not be versioned.
The export is only consumed in the CONFIG_COPY_MC_TEST=y case, and even
then not in a way that modpost could see. CONFIG_COPY_MC_TEST uses a
module built in tools/testing/nvdimm/ to exercise the copy_mc_fragile()
corner cases. Given the test already requires manually editing the
config entry for CONFIG_COPY_MC_TEST to make it "def_bool y", the
additional requirement of CONFIG_MODVERSIONS=n is not too
onerous.
Alternatively, COPY_MC_TEST and its related infrastructure could just be
ripped out because it has served its purpose. For now, just stop
exporting the symbol by default, and add the MODVERSIONS dependency to
the test.
Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
Reported-by: Borislav Petkov <bp(a)suse.de>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: x86(a)kernel.org
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
arch/x86/Kconfig.debug | 1 +
arch/x86/lib/copy_mc_64.S | 2 ++
2 files changed, 3 insertions(+)
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 27b5e2bc6a01..6f0f5d8ac62e 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -63,6 +63,7 @@ config EARLY_PRINTK_USB_XDBC
 	  crashes or need a very simple printk logging facility.

 config COPY_MC_TEST
+	depends on !MODVERSIONS
 	def_bool n

 config EFI_PGT_DUMP
diff --git a/arch/x86/lib/copy_mc_64.S b/arch/x86/lib/copy_mc_64.S
index 892d8915f609..50fb05256751 100644
--- a/arch/x86/lib/copy_mc_64.S
+++ b/arch/x86/lib/copy_mc_64.S
@@ -88,7 +88,9 @@ SYM_FUNC_START(copy_mc_fragile)
 .L_done:
 	ret
 SYM_FUNC_END(copy_mc_fragile)
+#ifdef CONFIG_COPY_MC_TEST
 EXPORT_SYMBOL_GPL(copy_mc_fragile)
+#endif

 	.section .fixup, "ax"
 	/*
[ndctl PATCH] ndctl/namespace: Catch attempts to sub-divide legacy
/ label-less capacity
by Dan Williams
Fail attempts to specify a size smaller than the host region to
'create-namespace' when labels are not available. Otherwise ndctl
confusingly succeeds and reports that the namespace is still statically
sized to the region:
Example before:
# ndctl create-namespace -s 32g
"size":"63.00 GiB (67.64 GB)",
Example after:
# ndctl create-namespace -e namespace0.0 -s 2G -f
Error: Legacy / label-less namespaces do not support sub-dividing a region retry without -s/--size=
failed to reconfigure namespace: Invalid argument
The memmap= parameter, while useful, does not emulate many of the
provisioning flows of real persistent memory devices. The set of useful
namespace configurations that can be performed on top of a memmap= defined
region+namespace is limited to reconfiguring the namespace between
operation modes:
create-namespace -e namespace0.0 -f -m {devdax,fsdax,sector}
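The rule being enforced can be sketched in isolation as follows. This is a simplified userspace model, assuming the check only needs to know whether the region is label-less, the region size, and the requested size. The 'struct size_request' type and validate_size() are hypothetical names for illustration and are not part of ndctl.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the inputs to the size check. */
struct size_request {
	bool labelless;			/* region only offers a legacy namespace */
	unsigned long long region_size;	/* bytes */
	unsigned long long requested;	/* bytes requested via -s/--size */
};

/* A label-less namespace can only span the whole region. */
static int validate_size(const struct size_request *req)
{
	if (req->labelless && req->requested != req->region_size)
		return -EINVAL;
	return 0;
}
```

With a 63 GiB label-less region, a 32 GiB request is rejected while a request for the full region size still passes.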
Link: https://github.com/pmem/ndctl/issues/150
Reported-by: Eric Sandeen <esandeen(a)redhat.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
ndctl/namespace.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/ndctl/namespace.c b/ndctl/namespace.c
index e734248c9752..e946bb6c9bfa 100644
--- a/ndctl/namespace.c
+++ b/ndctl/namespace.c
@@ -684,6 +684,17 @@ static int validate_namespace_options(struct ndctl_region *region,
return rc;
}
+ /*
+ * Block attempts to set a custom size on legacy (label-less)
+ * namespaces
+ */
+ if (ndctl_region_get_nstype(region) == ND_DEVICE_NAMESPACE_IO
+ && p->size != ndctl_region_get_size(region)) {
+ error("Legacy / label-less namespaces do not support sub-dividing a region\n");
+ error("Retry without -s/--size=\n");
+ return -EINVAL;
+ }
+
if (param.uuid) {
if (uuid_parse(param.uuid, p->uuid) != 0) {
err("%s: invalid uuid\n", __func__);
[ANNOUNCE] ndctl v70
by Verma, Vishal L
A new release of ndctl is available [1].
Highlights include support for the new firmware activation facility, a
new 'split-acpi' command in 'daxctl' to aid testing and debugging, and
other minor fixes.
A shortlog is appended below.
[1]: https://github.com/pmem/ndctl/releases/tag/v70
Dan Williams (15):
ndctl/dimm: Fix chatty status messages
ndctl/list: Indicate firmware update capability
ndctl/dimm: Detect firmware-update vs ARS conflict
ndctl/dimm: Improve firmware-update failure message
ndctl/dimm: Prepare to emit dimm json object after firmware update
ndctl/dimm: Emit dimm firmware details after update
ndctl/list: Add firmware activation enumeration
ndctl/dimm: Auto-arm firmware activation
ndctl/bus: Add 'activate-firmware' command
ndctl/test: Test firmware-activation interface
ndctl/docs: Update copyright date
test: Validate strict iomem protections of pmem
ndctl: Refactor nfit.h to acpi.h
daxctl: Add 'split-acpi' command to generate custom ACPI tables
test/ndctl: mremap pmd confusion
Santosh Sivaraj (1):
test: Remove a redundant ndctl_namespace_foreach
Vishal Verma (3):
ndctl/contrib: update 'prepare-release' for merge workflow
libndctl: fix a potential buffer overflow
ndctl/inject-error: remove logically dead code
[PATCH v10 0/2] Renovate memcpy_mcsafe with copy_mc_to_{user,
kernel}
by Dan Williams
Changes since v9 [1]:
- (Boris) Compile out the copy_mc_fragile() infrastructure in the
CONFIG_X86_MCE=n case.
This had several knock-on effects. The proposed x86: copy_mc_generic()
was internally checking for X86_FEATURE_ERMS and falling back to
copy_mc_fragile(); however, that fallback is not possible in the
CONFIG_X86_MCE=n case when copy_mc_fragile() is compiled out. Instead,
copy_mc_to_user() is rewritten, similar to copy_user_generic(), to walk
through several fallback implementations: copy_mc_fragile ->
copy_mc_enhanced_fast_string (new) -> copy_user_generic (no #MC
recovery).
[1]: http://lore.kernel.org/r/160087928642.3520.17063139768910633998.stgit@dwi...
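The fallback order described above can be sketched as a small selection function. This is a userspace model under stated assumptions: the 'struct cpu_caps' flags and pick_copy_impl() are hypothetical stand-ins for the build-time option and CPU feature checks, and no actual copying or #MC handling is modeled.

```c
#include <stdbool.h>

/* Which copy routine the sketch would dispatch to. */
enum copy_impl {
	COPY_FRAGILE,	/* slow, careful copy */
	COPY_ERMS,	/* enhanced fast string copy */
	COPY_PLAIN,	/* ordinary copy, no #MC recovery */
};

/* Hypothetical flags standing in for build options and CPU features. */
struct cpu_caps {
	bool mce_built_in;	/* models CONFIG_X86_MCE=y */
	bool fragile_opt_in;	/* platform is on the "fragile" list */
	bool erms;		/* models X86_FEATURE_ERMS */
};

static enum copy_impl pick_copy_impl(const struct cpu_caps *caps)
{
	/* The fragile path only exists when MCE support is built in. */
	if (caps->mce_built_in && caps->fragile_opt_in)
		return COPY_FRAGILE;
	if (caps->erms)
		return COPY_ERMS;
	return COPY_PLAIN;
}
```

The point of the ordering is that no step depends on a routine that may have been compiled out.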
---
Hi Boris,
I gave this some soak time over the weekend for the robots to chew on
for regressions. No reports, and the updates pass my testing. Please
consider including this in your updates for v5.10, and thanks for
offering to pick this up.
---
The motivations to go rework memcpy_mcsafe() are that the benefit of
doing slow and careful copies is obviated on newer CPUs, and that the
current opt-in list of cpus to instrument recovery is broken relative to
those cpus. There is no need to keep an opt-in list up to date on an
ongoing basis if pmem/dax operations are instrumented for recovery by
default. With recovery enabled by default, the old "mcsafe_key" opt-in to
careful copying can be made a "fragile" opt-out, where the "fragile"
list takes steps to not consume poison across cachelines.
The discussion with Linus made clear that the current "_mcsafe" suffix
was imprecise to a fault. The operations that are needed by pmem/dax are
to copy from a source address that might throw #MC to a destination that
may write-fault, if it is a user page. So copy_to_user_mcsafe() becomes
copy_mc_to_user() to indicate the separate precautions taken on source
and destination. copy_mc_to_kernel() is introduced as a non-SMAP version
that does not expect write-faults on the destination, but is still
prepared to abort with an error code upon taking #MC.
---
Dan Williams (2):
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user,kernel}()
x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
arch/powerpc/Kconfig | 2
arch/powerpc/include/asm/string.h | 2
arch/powerpc/include/asm/uaccess.h | 40 +++--
arch/powerpc/lib/Makefile | 2
arch/powerpc/lib/copy_mc_64.S | 4
arch/x86/Kconfig | 2
arch/x86/Kconfig.debug | 2
arch/x86/include/asm/copy_mc_test.h | 75 +++++++++
arch/x86/include/asm/mce.h | 9 +
arch/x86/include/asm/mcsafe_test.h | 75 ---------
arch/x86/include/asm/string_64.h | 32 ----
arch/x86/include/asm/uaccess.h | 9 +
arch/x86/include/asm/uaccess_64.h | 20 --
arch/x86/kernel/cpu/mce/core.c | 8 -
arch/x86/kernel/quirks.c | 10 -
arch/x86/lib/Makefile | 1
arch/x86/lib/copy_mc.c | 96 ++++++++++++
arch/x86/lib/copy_mc_64.S | 163 ++++++++++++++++++++
arch/x86/lib/memcpy_64.S | 115 --------------
arch/x86/lib/usercopy_64.c | 21 ---
drivers/md/dm-writecache.c | 15 +-
drivers/nvdimm/claim.c | 2
drivers/nvdimm/pmem.c | 6 -
include/linux/string.h | 9 -
include/linux/uaccess.h | 13 ++
include/linux/uio.h | 10 +
lib/Kconfig | 7 +
lib/iov_iter.c | 48 +++---
tools/arch/x86/include/asm/mcsafe_test.h | 13 --
tools/arch/x86/lib/memcpy_64.S | 115 --------------
tools/objtool/check.c | 5 -
tools/perf/bench/Build | 1
tools/perf/bench/mem-memcpy-x86-64-lib.c | 24 ---
tools/testing/nvdimm/test/nfit.c | 49 +++---
.../testing/selftests/powerpc/copyloops/.gitignore | 2
tools/testing/selftests/powerpc/copyloops/Makefile | 6 -
.../selftests/powerpc/copyloops/copy_mc_64.S | 1
.../selftests/powerpc/copyloops/memcpy_mcsafe_64.S | 1
38 files changed, 484 insertions(+), 531 deletions(-)
rename arch/powerpc/lib/{memcpy_mcsafe_64.S => copy_mc_64.S} (98%)
create mode 100644 arch/x86/include/asm/copy_mc_test.h
delete mode 100644 arch/x86/include/asm/mcsafe_test.h
create mode 100644 arch/x86/lib/copy_mc.c
create mode 100644 arch/x86/lib/copy_mc_64.S
delete mode 100644 tools/arch/x86/include/asm/mcsafe_test.h
delete mode 100644 tools/perf/bench/mem-memcpy-x86-64-lib.c
create mode 120000 tools/testing/selftests/powerpc/copyloops/copy_mc_64.S
delete mode 120000 tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S
base-commit: a1b8638ba1320e6684aa98233c15255eb803fac7
[ndctl PATCH 1/2] libndctl: fix a potential buffer overflow
by Vishal Verma
Static analysis points out that the 'buf' in ndctl_dimm_is_active was
inappropriately sized. We already have 'SYSFS_ATTR_SIZE' for such
buffers, and it looks like this was just an oversight.
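As a reminder of the pattern that fixed-size buffers rely on, here is a minimal sketch of a truncation check around snprintf(), similar in spirit to the check visible in the diff context below. The value of SYSFS_ATTR_SIZE used here is an assumption for the sketch, and build_state_path() is a hypothetical helper, not a libndctl function.

```c
#include <stdio.h>

#define SYSFS_ATTR_SIZE 1024	/* assumed value for this sketch */

/*
 * Build "<dimm_path>/state" into 'path'.
 * Returns 0 on success, -1 if the result would not fit in 'len' bytes.
 */
static int build_state_path(char *path, size_t len, const char *dimm_path)
{
	int n = snprintf(path, len, "%s/state", dimm_path);

	/* snprintf() returns the length it wanted to write, excluding NUL. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Sizing every such buffer from one shared constant avoids ad hoc sizes that can end up too small.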
Fixes: 0a4509d7de2f ("ndctl: enumerate interleave sets")
Cc: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
ndctl/lib/libndctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
index 6556b33..5546963 100644
--- a/ndctl/lib/libndctl.c
+++ b/ndctl/lib/libndctl.c
@@ -3675,8 +3675,8 @@ NDCTL_EXPORT int ndctl_dimm_is_active(struct ndctl_dimm *dimm)
 {
 	struct ndctl_ctx *ctx = ndctl_dimm_get_ctx(dimm);
 	char *path = dimm->dimm_buf;
+	char buf[SYSFS_ATTR_SIZE];
 	int len = dimm->buf_len;
-	char buf[20];

 	if (snprintf(path, len, "%s/state", dimm->dimm_path) >= len) {
 		err(ctx, "%s: buffer too small!\n",
--
2.26.2
[PATCH v5 00/17] device-dax: support sub-dividing soft-reserved
ranges
by Dan Williams
Changes since v4 [1]:
- Rebased on
device-dax-move-instance-creation-parameters-to-struct-dev_dax_data.patch
in -mm [2]. I.e. patches that did not need fixups from v4 are not
included.
- Folded all fixes
- Replaced "device-dax: kill dax_kmem_res" with:
device-dax/kmem: introduce dax_kmem_range()
device-dax/kmem: move resource name tracking to drvdata
device-dax/kmem: replace release_resource() with release_mem_region()
...to address David's request to make those cleanups easier to review.
Note that I dropped changes to how IORESOURCE_BUSY is manipulated since
David and I are still debating the best way forward there.
- Broke out some of dax-bus reworks in "device-dax: introduce 'seed'
devices" to a new "device-dax: introduce 'struct dev_dax' typed-driver
operations"
- Added a conversion of xen_alloc_unpopulated_pages() from pgmap.res to
pgmap.range. I found it odd that there is no corresponding
memunmap_pages() triggered by xen_free_unpopulated_pages()?
- Not included, a conversion of virtio_fs to use pgmap.range for its new
usage of devm_memremap_pages(). It appears the virtio_fs changes are
merged after -mm? My mental model of -mm was that it applies on top of
linux-next? In any event, Vivek, you will need to coordinate a
conversion to pgmap.range for the virtio_fs dax-support merge. Maybe
that should go through Andrew as well?
- Lowercase all the subject lines per akpm's preference
- Received a 0day robot build-success notification over 122 configs
- Thanks to Joao for looking after this set while I was out.
[1]: http://lore.kernel.org/r/159625229779.3040297.11363509688097221416.stgit@...
[2]: https://ozlabs.org/~akpm/mmots/broken-out/device-dax-move-instance-creati...
---
Andrew, this series replaces
device-dax-make-pgmap-optional-for-instance-creation.patch
...through...
dax-hmem-introduce-dax_hmemregion_idle-parameter.patch
...in your stack.
Let me know if there is a different / preferred way to refresh a bulk of
patches in your queue when only a subset need updates.
---
The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM. It is the mechanism for converting persistent
memory (pmem) to be used as another volatile memory pool i.e. the
current Memory Tiering hot topic on linux-mm.
In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [3]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.
The motivations for this facility are:
1/ Allow performance differentiated memory ranges to be split between
kernel-managed and directly-accessed use cases.
2/ Allow physical memory to be provisioned along performance relevant
address boundaries. For example, divide a memory-side cache [4] along
cache-color boundaries.
3/ Parcel out soft-reserved memory to VMs using device-dax as a security
/ permissions boundary [5]. Specifically I have seen people (ab)using
memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
device-dax interface on custom address ranges. A follow-on for the VM
use case is to teach device-dax to dynamically allocate 'struct page' at
runtime to reduce the duplication of 'struct page' space in both the
guest and the host kernel for the same physical pages.
[3]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@...
[4]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@...
[5]: http://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com
---
Dan Williams (14):
device-dax: make pgmap optional for instance creation
device-dax/kmem: introduce dax_kmem_range()
device-dax/kmem: move resource name tracking to drvdata
device-dax/kmem: replace release_resource() with release_mem_region()
device-dax: add an allocation interface for device-dax instances
device-dax: introduce 'struct dev_dax' typed-driver operations
device-dax: introduce 'seed' devices
drivers/base: make device_find_child_by_name() compatible with sysfs inputs
device-dax: add resize support
mm/memremap_pages: convert to 'struct range'
mm/memremap_pages: support multiple ranges per invocation
device-dax: add dis-contiguous resource support
device-dax: introduce 'mapping' devices
device-dax: add an 'align' attribute
Joao Martins (3):
device-dax: make align a per-device property
dax/hmem: introduce dax_hmem.region_idle parameter
device-dax: add a range mapping allocation attribute
arch/powerpc/kvm/book3s_hv_uvmem.c | 14
drivers/base/core.c | 2
drivers/dax/bus.c | 1039 ++++++++++++++++++++++++++++++--
drivers/dax/bus.h | 11
drivers/dax/dax-private.h | 58 ++
drivers/dax/device.c | 112 ++-
drivers/dax/hmem/hmem.c | 17 -
drivers/dax/kmem.c | 178 +++--
drivers/dax/pmem/compat.c | 2
drivers/dax/pmem/core.c | 14
drivers/gpu/drm/nouveau/nouveau_dmem.c | 15
drivers/nvdimm/badrange.c | 26 -
drivers/nvdimm/claim.c | 13
drivers/nvdimm/nd.h | 3
drivers/nvdimm/pfn_devs.c | 13
drivers/nvdimm/pmem.c | 27 -
drivers/nvdimm/region.c | 21 -
drivers/pci/p2pdma.c | 12
drivers/xen/unpopulated-alloc.c | 45 +
include/linux/memremap.h | 11
include/linux/range.h | 6
lib/test_hmm.c | 15
mm/memremap.c | 299 +++++----
tools/testing/nvdimm/dax-dev.c | 22 -
tools/testing/nvdimm/test/iomap.c | 2
25 files changed, 1557 insertions(+), 420 deletions(-)
base-commit: 6764736525f27a411ba2c0c430aaa2df7375f3ac