[PATCH 4.4 005/137] x86/mm: Fix vmalloc_fault() to handle large pages properly
by Greg Kroah-Hartman
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toshi Kani <toshi.kani(a)hpe.com>
commit f4eafd8bcd5229e998aa252627703b8462c3b90f upstream.
A kernel page fault oops with the call stack below was observed
when a read syscall was made to a pmem device after a large amount
(>512GB) of vmalloc ranges had been allocated by ioremap() on an
x86_64 system:
BUG: unable to handle kernel paging request at ffff880840000ff8
IP: vmalloc_fault+0x1be/0x300
PGD c7f03a067 PUD 0
Oops: 0000 [#1] SM
Call Trace:
__do_page_fault+0x285/0x3e0
do_page_fault+0x2f/0x80
? put_prev_entity+0x35/0x7a0
page_fault+0x28/0x30
? memcpy_erms+0x6/0x10
? schedule+0x35/0x80
? pmem_rw_bytes+0x6a/0x190 [nd_pmem]
? schedule_timeout+0x183/0x240
btt_log_read+0x63/0x140 [nd_btt]
:
? __symbol_put+0x60/0x60
? kernel_read+0x50/0x80
SyS_finit_module+0xb9/0xf0
entry_SYSCALL_64_fastpath+0x1a/0xa4
Since v4.1, ioremap() supports large page (pud/pmd) mappings on
x86_64 and PAE. vmalloc_fault(), however, assumes that the vmalloc
range is limited to pte mappings.
vmalloc faults do not normally happen in ioremap'd ranges since
ioremap() sets up the kernel page tables, which are shared by
user processes: pgd_ctor() copies the kernel's PGD entries into a
process's page table during fork(). When allocation of the vmalloc
ranges crosses a 512GB boundary, ioremap() allocates a new pud table
and updates the kernel PGD entry to point to it. If a user process's
PGD entry does not yet have this update, a read/write syscall
to the range will cause a vmalloc fault, which hits the Oops above
because vmalloc_fault() does not handle large pages properly.
The following changes are made to vmalloc_fault(); a condensed C
sketch of the resulting 64-bit walk follows the two lists below.
64-bit:
- No change for the PGD sync operation as it handles large
pages already.
- Add pud_huge() and pmd_huge() to the validation code to
handle large pages.
- Change pud_page_vaddr() to pud_pfn() since an ioremap range
is not directly mapped (while the if-statement still works
with a bogus addr).
- Change pmd_page() to pmd_pfn() since an ioremap range is not
backed by struct page (while the if-statement still works
with a bogus addr).
32-bit:
- No change for the sync operation since the index3 PGD entry
covers the entire vmalloc range, which is always valid.
(A separate change to sync PGD entry is necessary if this
memory layout is changed regardless of the page size.)
- Add pmd_huge() to the validation code to handle large pages.
This is for completeness since vmalloc_fault() won't happen
in ioremap'd ranges as its PGD entry is always valid.
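For orientation, here is a condensed C sketch of the resulting 64-bit walk. It is an illustration, not the applied code: the cross-checks against the reference (init_mm) tables, the BUG() paths, and the initial vmalloc-range check are omitted, and vmalloc_fault_sketch() is just an illustrative name.
/*
 * Condensed sketch of the 64-bit vmalloc_fault() walk after this patch
 * (simplified; the real handler also validates each level against the
 * reference entry from init_mm and BUG()s on a mismatch).
 */
static noinline int vmalloc_fault_sketch(unsigned long address)
{
	pgd_t *pgd = pgd_offset(current->active_mm, address);
	pgd_t *pgd_ref = pgd_offset_k(address);
	pud_t *pud;
	pmd_t *pmd;
	pte_t *pte;

	if (pgd_none(*pgd_ref))
		return -1;			/* not a mapped vmalloc address */
	if (pgd_none(*pgd))
		set_pgd(pgd, *pgd_ref);		/* sync the missing PGD entry */

	pud = pud_offset(pgd, address);
	if (pud_none(*pud))
		return -1;
	if (pud_huge(*pud))			/* 1GB ioremap mapping: no pmd/pte below */
		return 0;

	pmd = pmd_offset(pud, address);
	if (pmd_none(*pmd))
		return -1;
	if (pmd_huge(*pmd))			/* 2MB ioremap mapping: no pte below */
		return 0;

	pte = pte_offset_kernel(pmd, address);
	return pte_present(*pte) ? 0 : -1;
}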
Reported-by: Henning Schild <henning.schild(a)siemens.com>
Signed-off-by: Toshi Kani <toshi.kani(a)hpe.com>
Acked-by: Borislav Petkov <bp(a)alien8.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof(a)suse.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Toshi Kani <toshi.kani(a)hp.com>
Cc: linux-mm(a)kvack.org
Cc: linux-nvdimm(a)lists.01.org
Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe...
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/fault.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -287,6 +287,9 @@ static noinline int vmalloc_fault(unsign
if (!pmd_k)
return -1;
+ if (pmd_huge(*pmd_k))
+ return 0;
+
pte_k = pte_offset_kernel(pmd_k, address);
if (!pte_present(*pte_k))
return -1;
@@ -360,8 +363,6 @@ void vmalloc_sync_all(void)
* 64-bit:
*
* Handle a fault on the vmalloc area
- *
- * This assumes no large pages in there.
*/
static noinline int vmalloc_fault(unsigned long address)
{
@@ -403,17 +404,23 @@ static noinline int vmalloc_fault(unsign
if (pud_none(*pud_ref))
return -1;
- if (pud_none(*pud) || pud_page_vaddr(*pud) != pud_page_vaddr(*pud_ref))
+ if (pud_none(*pud) || pud_pfn(*pud) != pud_pfn(*pud_ref))
BUG();
+ if (pud_huge(*pud))
+ return 0;
+
pmd = pmd_offset(pud, address);
pmd_ref = pmd_offset(pud_ref, address);
if (pmd_none(*pmd_ref))
return -1;
- if (pmd_none(*pmd) || pmd_page(*pmd) != pmd_page(*pmd_ref))
+ if (pmd_none(*pmd) || pmd_pfn(*pmd) != pmd_pfn(*pmd_ref))
BUG();
+ if (pmd_huge(*pmd))
+ return 0;
+
pte_ref = pte_offset_kernel(pmd_ref, address);
if (!pte_present(*pte_ref))
return -1;
[PATCH 4.4 004/137] x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()
by Greg Kroah-Hartman
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toshi Kani <toshi.kani(a)hpe.com>
commit a82eee7424525e34e98d821dd059ce14560a1e35 upstream.
Data corruption issues were observed in tests which initiated
a system crash/reset while accessing BTT devices. This problem
is reproducible.
The BTT driver calls pmem_rw_bytes() to update data in pmem
devices. This interface calls __copy_user_nocache(), which
uses non-temporal stores so that the stores to pmem are
persistent.
__copy_user_nocache() uses non-temporal stores when the request
size is 8 bytes or larger (and the destination is 8-byte aligned).
The BTT driver updates the BTT map table, whose entries are 4 bytes
each. Therefore, updates to the map table entries remain cached, and
are not written to pmem after a crash.
Change __copy_user_nocache() to use a non-temporal store when the
request size is 4 bytes. The change extends the current byte-copy
path for requests smaller than 8 bytes, and does not add any
overhead to the regular path.
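For orientation, the copy policy that the assembly implements after this patch can be sketched in C as follows. This is an illustration only: nt_store8()/nt_store4() are hypothetical stand-ins for the 64-bit/32-bit movnti non-temporal stores, and the ALIGN_DESTINATION step and the fault-fixup paths are ignored.
/*
 * Illustrative sketch of the __copy_user_nocache() copy policy after this
 * patch -- NOT the kernel implementation.  nt_store8()/nt_store4() stand in
 * for movnti; here they are plain stores so the sketch compiles anywhere.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static inline void nt_store8(void *dst, const void *src) { memcpy(dst, src, 8); }
static inline void nt_store4(void *dst, const void *src) { memcpy(dst, src, 4); }

static void copy_nocache_sketch(char *dst, const char *src, size_t len)
{
	/* 8-byte non-temporal stores while the destination is 8-byte aligned */
	while (len >= 8 && ((uintptr_t)dst & 7) == 0) {
		nt_store8(dst, src);
		dst += 8; src += 8; len -= 8;
	}
	/* new in this patch: one 4-byte non-temporal store when applicable */
	if (len >= 4 && ((uintptr_t)dst & 3) == 0) {
		nt_store4(dst, src);
		dst += 4; src += 4; len -= 4;
	}
	/* remaining bytes fall back to ordinary ("cached") byte copies */
	while (len--)
		*dst++ = *src++;
}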
Reported-and-tested-by: Micah Parrish <micah.parrish(a)hpe.com>
Reported-and-tested-by: Brian Boylston <brian.boylston(a)hpe.com>
Signed-off-by: Toshi Kani <toshi.kani(a)hpe.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof(a)suse.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Toshi Kani <toshi.kani(a)hp.com>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: linux-nvdimm(a)lists.01.org
Link: http://lkml.kernel.org/r/1455225857-12039-3-git-send-email-toshi.kani@hpe...
[ Small readability edits. ]
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/lib/copy_user_64.S | 36 ++++++++++++++++++++++++++++++++----
1 file changed, 32 insertions(+), 4 deletions(-)
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -237,13 +237,14 @@ ENDPROC(copy_user_enhanced_fast_string)
* Note: Cached memory copy is used when destination or size is not
* naturally aligned. That is:
* - Require 8-byte alignment when size is 8 bytes or larger.
+ * - Require 4-byte alignment when size is 4 bytes.
*/
ENTRY(__copy_user_nocache)
ASM_STAC
- /* If size is less than 8 bytes, go to byte copy */
+ /* If size is less than 8 bytes, go to 4-byte copy */
cmpl $8,%edx
- jb .L_1b_cache_copy_entry
+ jb .L_4b_nocache_copy_entry
/* If destination is not 8-byte aligned, "cache" copy to align it */
ALIGN_DESTINATION
@@ -282,7 +283,7 @@ ENTRY(__copy_user_nocache)
movl %edx,%ecx
andl $7,%edx
shrl $3,%ecx
- jz .L_1b_cache_copy_entry /* jump if count is 0 */
+ jz .L_4b_nocache_copy_entry /* jump if count is 0 */
/* Perform 8-byte nocache loop-copy */
.L_8b_nocache_copy_loop:
@@ -294,11 +295,33 @@ ENTRY(__copy_user_nocache)
jnz .L_8b_nocache_copy_loop
/* If no byte left, we're done */
-.L_1b_cache_copy_entry:
+.L_4b_nocache_copy_entry:
+ andl %edx,%edx
+ jz .L_finish_copy
+
+ /* If destination is not 4-byte aligned, go to byte copy: */
+ movl %edi,%ecx
+ andl $3,%ecx
+ jnz .L_1b_cache_copy_entry
+
+ /* Set 4-byte copy count (1 or 0) and remainder */
+ movl %edx,%ecx
+ andl $3,%edx
+ shrl $2,%ecx
+ jz .L_1b_cache_copy_entry /* jump if count is 0 */
+
+ /* Perform 4-byte nocache copy: */
+30: movl (%rsi),%r8d
+31: movnti %r8d,(%rdi)
+ leaq 4(%rsi),%rsi
+ leaq 4(%rdi),%rdi
+
+ /* If no bytes left, we're done: */
andl %edx,%edx
jz .L_finish_copy
/* Perform byte "cache" loop-copy for the remainder */
+.L_1b_cache_copy_entry:
movl %edx,%ecx
.L_1b_cache_copy_loop:
40: movb (%rsi),%al
@@ -323,6 +346,9 @@ ENTRY(__copy_user_nocache)
.L_fixup_8b_copy:
lea (%rdx,%rcx,8),%rdx
jmp .L_fixup_handle_tail
+.L_fixup_4b_copy:
+ lea (%rdx,%rcx,4),%rdx
+ jmp .L_fixup_handle_tail
.L_fixup_1b_copy:
movl %ecx,%edx
.L_fixup_handle_tail:
@@ -348,6 +374,8 @@ ENTRY(__copy_user_nocache)
_ASM_EXTABLE(16b,.L_fixup_4x8b_copy)
_ASM_EXTABLE(20b,.L_fixup_8b_copy)
_ASM_EXTABLE(21b,.L_fixup_8b_copy)
+ _ASM_EXTABLE(30b,.L_fixup_4b_copy)
+ _ASM_EXTABLE(31b,.L_fixup_4b_copy)
_ASM_EXTABLE(40b,.L_fixup_1b_copy)
_ASM_EXTABLE(41b,.L_fixup_1b_copy)
ENDPROC(__copy_user_nocache)
[PATCH 4.4 003/137] x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable
by Greg Kroah-Hartman
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toshi Kani <toshi.kani(a)hpe.com>
commit ee9737c924706aaa72c2ead93e3ad5644681dc1c upstream.
Add comments to __copy_user_nocache() to clarify its procedures
and alignment requirements.
Also change numeric branch target labels to named local labels.
No code changed:
arch/x86/lib/copy_user_64.o:
text data bss dec hex filename
1239 0 0 1239 4d7 copy_user_64.o.before
1239 0 0 1239 4d7 copy_user_64.o.after
md5:
58bed94c2db98c1ca9a2d46d0680aaae copy_user_64.o.before.asm
58bed94c2db98c1ca9a2d46d0680aaae copy_user_64.o.after.asm
Signed-off-by: Toshi Kani <toshi.kani(a)hpe.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof(a)suse.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Toshi Kani <toshi.kani(a)hp.com>
Cc: brian.boylston(a)hpe.com
Cc: dan.j.williams(a)intel.com
Cc: linux-nvdimm(a)lists.01.org
Cc: micah.parrish(a)hpe.com
Cc: ross.zwisler(a)linux.intel.com
Cc: vishal.l.verma(a)intel.com
Link: http://lkml.kernel.org/r/1455225857-12039-2-git-send-email-toshi.kani@hpe...
[ Small readability edits and added object file comparison. ]
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/lib/copy_user_64.S | 114 ++++++++++++++++++++++++++++----------------
1 file changed, 73 insertions(+), 41 deletions(-)
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -232,17 +232,30 @@ ENDPROC(copy_user_enhanced_fast_string)
/*
* copy_user_nocache - Uncached memory copy with exception handling
- * This will force destination/source out of cache for more performance.
+ * This will force destination out of cache for more performance.
+ *
+ * Note: Cached memory copy is used when destination or size is not
+ * naturally aligned. That is:
+ * - Require 8-byte alignment when size is 8 bytes or larger.
*/
ENTRY(__copy_user_nocache)
ASM_STAC
+
+ /* If size is less than 8 bytes, go to byte copy */
cmpl $8,%edx
- jb 20f /* less then 8 bytes, go to byte copy loop */
+ jb .L_1b_cache_copy_entry
+
+ /* If destination is not 8-byte aligned, "cache" copy to align it */
ALIGN_DESTINATION
+
+ /* Set 4x8-byte copy count and remainder */
movl %edx,%ecx
andl $63,%edx
shrl $6,%ecx
- jz 17f
+ jz .L_8b_nocache_copy_entry /* jump if count is 0 */
+
+ /* Perform 4x8-byte nocache loop-copy */
+.L_4x8b_nocache_copy_loop:
1: movq (%rsi),%r8
2: movq 1*8(%rsi),%r9
3: movq 2*8(%rsi),%r10
@@ -262,60 +275,79 @@ ENTRY(__copy_user_nocache)
leaq 64(%rsi),%rsi
leaq 64(%rdi),%rdi
decl %ecx
- jnz 1b
-17: movl %edx,%ecx
+ jnz .L_4x8b_nocache_copy_loop
+
+ /* Set 8-byte copy count and remainder */
+.L_8b_nocache_copy_entry:
+ movl %edx,%ecx
andl $7,%edx
shrl $3,%ecx
- jz 20f
-18: movq (%rsi),%r8
-19: movnti %r8,(%rdi)
+ jz .L_1b_cache_copy_entry /* jump if count is 0 */
+
+ /* Perform 8-byte nocache loop-copy */
+.L_8b_nocache_copy_loop:
+20: movq (%rsi),%r8
+21: movnti %r8,(%rdi)
leaq 8(%rsi),%rsi
leaq 8(%rdi),%rdi
decl %ecx
- jnz 18b
-20: andl %edx,%edx
- jz 23f
+ jnz .L_8b_nocache_copy_loop
+
+ /* If no byte left, we're done */
+.L_1b_cache_copy_entry:
+ andl %edx,%edx
+ jz .L_finish_copy
+
+ /* Perform byte "cache" loop-copy for the remainder */
movl %edx,%ecx
-21: movb (%rsi),%al
-22: movb %al,(%rdi)
+.L_1b_cache_copy_loop:
+40: movb (%rsi),%al
+41: movb %al,(%rdi)
incq %rsi
incq %rdi
decl %ecx
- jnz 21b
-23: xorl %eax,%eax
+ jnz .L_1b_cache_copy_loop
+
+ /* Finished copying; fence the prior stores */
+.L_finish_copy:
+ xorl %eax,%eax
ASM_CLAC
sfence
ret
.section .fixup,"ax"
-30: shll $6,%ecx
+.L_fixup_4x8b_copy:
+ shll $6,%ecx
addl %ecx,%edx
- jmp 60f
-40: lea (%rdx,%rcx,8),%rdx
- jmp 60f
-50: movl %ecx,%edx
-60: sfence
+ jmp .L_fixup_handle_tail
+.L_fixup_8b_copy:
+ lea (%rdx,%rcx,8),%rdx
+ jmp .L_fixup_handle_tail
+.L_fixup_1b_copy:
+ movl %ecx,%edx
+.L_fixup_handle_tail:
+ sfence
jmp copy_user_handle_tail
.previous
- _ASM_EXTABLE(1b,30b)
- _ASM_EXTABLE(2b,30b)
- _ASM_EXTABLE(3b,30b)
- _ASM_EXTABLE(4b,30b)
- _ASM_EXTABLE(5b,30b)
- _ASM_EXTABLE(6b,30b)
- _ASM_EXTABLE(7b,30b)
- _ASM_EXTABLE(8b,30b)
- _ASM_EXTABLE(9b,30b)
- _ASM_EXTABLE(10b,30b)
- _ASM_EXTABLE(11b,30b)
- _ASM_EXTABLE(12b,30b)
- _ASM_EXTABLE(13b,30b)
- _ASM_EXTABLE(14b,30b)
- _ASM_EXTABLE(15b,30b)
- _ASM_EXTABLE(16b,30b)
- _ASM_EXTABLE(18b,40b)
- _ASM_EXTABLE(19b,40b)
- _ASM_EXTABLE(21b,50b)
- _ASM_EXTABLE(22b,50b)
+ _ASM_EXTABLE(1b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(2b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(3b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(4b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(5b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(6b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(7b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(8b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(9b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(10b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(11b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(12b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(13b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(14b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(15b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(16b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(20b,.L_fixup_8b_copy)
+ _ASM_EXTABLE(21b,.L_fixup_8b_copy)
+ _ASM_EXTABLE(40b,.L_fixup_1b_copy)
+ _ASM_EXTABLE(41b,.L_fixup_1b_copy)
ENDPROC(__copy_user_nocache)
[ndctl PATCH] ndctl: wait for initial ars scrubbing
by Dan Williams
Regions are registered asynchronously after scrubbing, so the list of
regions may still be populating when the test starts.
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
test/libndctl.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/test/libndctl.c b/test/libndctl.c
index 0af0c071ddc1..2601117333a4 100644
--- a/test/libndctl.c
+++ b/test/libndctl.c
@@ -1903,6 +1903,8 @@ static int do_test0(struct ndctl_ctx *ctx, struct ndctl_test *test)
if (!bus)
return -ENXIO;
+ ndctl_bus_wait_probe(bus);
+
/* disable all regions so that set_config_data commands are permitted */
ndctl_region_foreach(bus, region)
ndctl_region_disable_invalidate(region);
@@ -1940,6 +1942,8 @@ static int do_test1(struct ndctl_ctx *ctx, struct ndctl_test *test)
if (!bus)
return -ENXIO;
+ ndctl_bus_wait_probe(bus);
+
rc = check_dimms(bus, dimms1, ARRAY_SIZE(dimms1), 0, 0, test);
if (rc)
return rc;
[ndctl PATCH] ndctl: update local ndctl.h for acpi6.1
by Dan Williams
Grab the latest ars command definitions from ndctl.h.
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
ndctl.h | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/ndctl.h b/ndctl.h
index 50df797f1974..0c50ff4dc3b1 100644
--- a/ndctl.h
+++ b/ndctl.h
@@ -66,14 +66,18 @@ struct nd_cmd_ars_cap {
__u64 length;
__u32 status;
__u32 max_ars_out;
+ __u32 clear_err_unit;
+ __u32 reserved;
} __attribute__((packed));
struct nd_cmd_ars_start {
__u64 address;
__u64 length;
__u16 type;
- __u8 reserved[6];
+ __u8 flags;
+ __u8 reserved[5];
__u32 status;
+ __u32 scrub_time;
} __attribute__((packed));
struct nd_cmd_ars_status {
@@ -81,11 +85,14 @@ struct nd_cmd_ars_status {
__u32 out_length;
__u64 address;
__u64 length;
+ __u64 restart_address;
+ __u64 restart_length;
__u16 type;
+ __u16 flags;
__u32 num_records;
struct nd_ars_record {
__u32 handle;
- __u32 flags;
+ __u32 reserved;
__u64 err_address;
__u64 length;
} __attribute__((packed)) records[0];
[ndctl PATCH] ndctl: fix check_ars_status
by Dan Williams
When an ars status command results in a busy status, ACPI 6.1
specifies that the command buffer is zeroed. To poll for ars results,
check_ars_status() must reinstantiate the command for every poll
iteration.
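The polling pattern this fix enforces can be sketched as a standalone helper, shown below. This is illustrative only and not part of the patch; the installed header path and the helper name are assumptions, while the ndctl_cmd_* calls are the ones used in the diff.
/*
 * Illustrative polling helper (not part of this patch): a fresh ars_status
 * command is created on every iteration, because a "busy" completion means
 * the previous command's buffer was zeroed and must not be reused.
 */
#include <unistd.h>
#include <ndctl/libndctl.h>	/* assumed installed header path */

static struct ndctl_cmd *poll_ars_status(struct ndctl_cmd *ars_cap)
{
	int tries = 5;

	while (tries--) {
		struct ndctl_cmd *cmd = ndctl_bus_cmd_new_ars_status(ars_cap);

		if (!cmd)
			return NULL;
		if (ndctl_cmd_submit(cmd) != 0) {
			ndctl_cmd_unref(cmd);
			return NULL;
		}
		if (!ndctl_cmd_ars_in_progress(cmd))
			return cmd;	/* caller reads the records, then unrefs */
		ndctl_cmd_unref(cmd);	/* stale, zeroed buffer: throw it away */
		sleep(1);
	}
	return NULL;			/* scrub did not complete in time */
}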
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
lib/libndctl-ars.c | 27 ++++++++++++++-------------
lib/ndctl/libndctl.h | 2 +-
test/libndctl.c | 29 +++++++++++++++++++++--------
3 files changed, 36 insertions(+), 22 deletions(-)
diff --git a/lib/libndctl-ars.c b/lib/libndctl-ars.c
index 863217dd09aa..ad85191898d4 100644
--- a/lib/libndctl-ars.c
+++ b/lib/libndctl-ars.c
@@ -141,20 +141,21 @@ NDCTL_EXPORT unsigned int ndctl_cmd_ars_cap_get_size(struct ndctl_cmd *ars_cap)
return 0;
}
-NDCTL_EXPORT unsigned int ndctl_cmd_ars_in_progress(struct ndctl_cmd *ars_stat)
+NDCTL_EXPORT int ndctl_cmd_ars_in_progress(struct ndctl_cmd *cmd)
{
- struct ndctl_ctx *ctx = ndctl_bus_get_ctx(cmd_to_bus(ars_stat));
- int rc;
-
- if (ars_stat->type == ND_CMD_ARS_STATUS && ars_stat->status == 0) {
- rc = (((*ars_stat->firmware_status >> ARS_EXT_STATUS_SHIFT) == 1) ? 1 : 0);
- /*
- * If in-progress, invalidate the ndctl_cmd, so that if we're
- * called again without a fresh ars_stat command, we fail.
- */
- if (rc)
- ars_stat->status = 1;
- return rc;
+ struct ndctl_ctx *ctx = ndctl_bus_get_ctx(cmd_to_bus(cmd));
+
+ if (cmd->type == ND_CMD_ARS_STATUS && cmd->status == 0) {
+ if (cmd->ars_status->status == 1 << 16) {
+ /*
+ * If in-progress, invalidate the ndctl_cmd, so
+ * that if we're called again without a fresh
+ * ars_status command, we fail.
+ */
+ cmd->status = 1;
+ return 1;
+ }
+ return 0;
}
dbg(ctx, "invalid ars_status\n");
diff --git a/lib/ndctl/libndctl.h b/lib/ndctl/libndctl.h
index 088bb3c3d8b9..17770f78ce01 100644
--- a/lib/ndctl/libndctl.h
+++ b/lib/ndctl/libndctl.h
@@ -154,7 +154,7 @@ struct ndctl_cmd *ndctl_bus_cmd_new_ars_cap(struct ndctl_bus *bus,
struct ndctl_cmd *ndctl_bus_cmd_new_ars_start(struct ndctl_cmd *ars_cap, int type);
struct ndctl_cmd *ndctl_bus_cmd_new_ars_status(struct ndctl_cmd *ars_cap);
unsigned int ndctl_cmd_ars_cap_get_size(struct ndctl_cmd *ars_cap);
-unsigned int ndctl_cmd_ars_in_progress(struct ndctl_cmd *ars_status);
+int ndctl_cmd_ars_in_progress(struct ndctl_cmd *ars_status);
unsigned int ndctl_cmd_ars_num_records(struct ndctl_cmd *ars_stat);
unsigned long long ndctl_cmd_ars_get_record_addr(struct ndctl_cmd *ars_stat,
unsigned int rec_index);
diff --git a/test/libndctl.c b/test/libndctl.c
index b4539b996d6a..0af0c071ddc1 100644
--- a/test/libndctl.c
+++ b/test/libndctl.c
@@ -1657,6 +1657,7 @@ static int check_ars_status(struct ndctl_bus *bus, struct ndctl_dimm *dimm,
{
struct ndctl_cmd *cmd_ars_cap = check_cmds[ND_CMD_ARS_CAP].cmd;
struct ndctl_cmd *cmd;
+ unsigned long tmo = 5;
unsigned int i;
int rc;
@@ -1665,6 +1666,8 @@ static int check_ars_status(struct ndctl_bus *bus, struct ndctl_dimm *dimm,
__func__, ndctl_dimm_get_handle(dimm));
return -ENXIO;
}
+
+ retry:
cmd = ndctl_bus_cmd_new_ars_status(cmd_ars_cap);
if (!cmd) {
fprintf(stderr, "%s: bus: %s failed to create cmd\n",
@@ -1672,15 +1675,25 @@ static int check_ars_status(struct ndctl_bus *bus, struct ndctl_dimm *dimm,
return -ENOTTY;
}
- do {
- rc = ndctl_cmd_submit(cmd);
- if (rc) {
- fprintf(stderr, "%s: bus: %s failed to submit cmd: %d\n",
+ rc = ndctl_cmd_submit(cmd);
+ if (rc) {
+ fprintf(stderr, "%s: bus: %s failed to submit cmd: %d\n",
__func__, ndctl_bus_get_provider(bus), rc);
- ndctl_cmd_unref(cmd);
- return rc;
- }
- } while (ndctl_cmd_ars_in_progress(cmd));
+ ndctl_cmd_unref(cmd);
+ return rc;
+ }
+
+ if (!tmo) {
+ fprintf(stderr, "%s: bus: %s ars timeout\n", __func__,
+ ndctl_bus_get_provider(bus));
+ return -EIO;
+ }
+
+ if (ndctl_cmd_ars_in_progress(cmd)) {
+ tmo--;
+ sleep(1);
+ goto retry;
+ }
for (i = 0; i < ndctl_cmd_ars_num_records(cmd); i++) {
fprintf(stderr, "%s: record[%d].addr: 0x%llx\n", __func__, i,
[PATCH 0/2] nfit, libnvdimm: ars fixups for acpi6.1 compatibility
by Dan Williams
The address range scrub format is formalized in ACPI 6.1 and the
existing Linux implementation needs to be updated accordingly. In the
course of testing this update, a bug in how Linux handles the buffer
size for ars_status commands was discovered and fixed.
---
Dan Williams (2):
libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing
nfit: update address range scrub commands to the acpi 6.1 format
drivers/acpi/nfit.c | 6 +++---
drivers/nvdimm/bus.c | 20 ++++++++++----------
include/linux/libnvdimm.h | 3 +--
include/uapi/linux/ndctl.h | 11 +++++++++--
tools/testing/nvdimm/test/nfit.c | 8 ++++++--
5 files changed, 29 insertions(+), 19 deletions(-)
[PATCH 0/6] DAX cleanups
by Matthew Wilcox
Very little exciting in here. This is all based on the PUD support code
that I just sent, mostly addressing things that came up during review
of the PUD code but weren't really justifiable as being mixed into the
adding of PUD support.
Matthew Wilcox (6):
dax: Use vmf->gfp_mask
dax: Remove unnecessary rechecking of i_size
dax: Use vmf->pgoff in fault handlers
dax: Use PAGE_CACHE_SIZE where appropriate
dax: Factor dax_insert_pmd_mapping out of dax_pmd_fault
dax: Factor dax_insert_pud_mapping out of dax_pud_fault
fs/dax.c | 395 ++++++++++++++++++++++++++-------------------------------------
1 file changed, 164 insertions(+), 231 deletions(-)
--
2.7.0.rc3
[ndctl PATCH] ndctl: fix ndctl-libs package name for rhel/ndctl.spec
by Dan Williams
Commit e05edf480650 "ndctl: rework release collateral" errantly changed
the name of the 'libs' package for rhel/ndctl.spec to 'libndctl'; change
it back.
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Makefile.am | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile.am b/Makefile.am
index 61dce040131c..c8ce3177e5a2 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -54,7 +54,7 @@ CLEANFILES += $(noinst_SCRIPTS)
do_rhel_subst = sed -e 's,VERSION,$(VERSION),g' \
-e 's,DNAME,ndctl-devel,g' \
- -e 's,LNAME,libndctl,g'
+ -e 's,LNAME,ndctl-libs,g'
do_sles_subst = sed -e 's,VERSION,$(VERSION),g' \
-e 's,DNAME,libndctl-devel,g' \
[PATCH] nvdimm: use 'u64' for pfn flags
by Arnd Bergmann
A recent bugfix changed pfn_t to always be 64-bit wide, but did not
change the code in pmem.c, which is now broken on 32-bit architectures
as reported by gcc:
In file included from ../drivers/nvdimm/pmem.c:28:0:
drivers/nvdimm/pmem.c: In function 'pmem_alloc':
include/linux/pfn_t.h:15:17: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
#define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
This changes the intermediate pfn_flags field in struct pmem_device to
be 64 bits wide as well, so it can store the flags correctly.
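The failure mode is plain integer truncation. A minimal userspace illustration follows (not kernel code; the PFN_DEV value is taken from the definition quoted in the gcc error above):
/*
 * Minimal userspace illustration of the truncation gcc warns about --
 * not kernel code.  PFN_DEV mirrors (1ULL << (BITS_PER_LONG_LONG - 3)),
 * i.e. bit 61, from the error message above.
 */
#include <stdio.h>

#define PFN_DEV (1ULL << 61)

int main(void)
{
	unsigned long narrow = PFN_DEV;		/* truncated to 0 on a 32-bit build */
	unsigned long long wide = PFN_DEV;	/* always holds the full flag */

	printf("narrow=%#lx wide=%#llx\n", narrow, wide);
	return 0;
}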
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Fixes: db78c22230d0 ("mm: fix pfn_t vs highmem")
---
drivers/nvdimm/pmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 7edf31671dab..8d0b54670184 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -41,7 +41,7 @@ struct pmem_device {
phys_addr_t phys_addr;
/* when non-zero this device is hosting a 'pfn' instance */
phys_addr_t data_offset;
- unsigned long pfn_flags;
+ u64 pfn_flags;
void __pmem *virt_addr;
size_t size;
struct badblocks bb;
--
2.7.0