On Wed 23-03-16 14:50:00, Ross Zwisler wrote:
On Mon, Mar 21, 2016 at 02:22:52PM +0100, Jan Kara wrote:
> Currently the handling of huge pages for DAX is racy. For example the
> following can happen:
>
> CPU0 (THP write fault) CPU1 (normal read fault)
>
> __dax_pmd_fault() __dax_fault()
> get_block(inode, block, &bh, 0) -> not mapped
> get_block(inode, block, &bh, 0)
> -> not mapped
> if (!buffer_mapped(&bh) && write)
> get_block(inode, block, &bh, 1) -> allocates blocks
> truncate_pagecache_range(inode, lstart, lend);
> dax_load_hole();
>
> This results in data corruption since process on CPU1 won't see changes
> into the file done by CPU0.
>
> The race can happen even if two normal faults race however with THP the
> situation is even worse because the two faults don't operate on the same
> entries in the radix tree and we want to use these entries for
> serialization. So disable THP support in DAX code for now.
Yep, I agree that we should disable PMD faults until we get the multi-order
radix tree work finished and integrated with this locking.
I do agree with Dan though that it would be preferable to disable PMD faults
by having CONFIG_FS_DAX_PMD depend on BROKEN. That seems smaller and easier
to switch PMD faults back on for testing.
I did it this way because I wasn't sure PMD fault code even compiles and I
didn't want to break CONFIG_BROKEN builds. But OK, I'll make the code at
least compile and do what you say.
Honza
> Signed-off-by: Jan Kara <jack(a)suse.cz>
> ---
> fs/dax.c | 2 +-
> include/linux/dax.h | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 0329ec0bee2e..444e9dd079ca 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -719,7 +719,7 @@ int dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
> }
> EXPORT_SYMBOL_GPL(dax_fault);
>
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#if 0
> /*
> * The 'colour' (ie low bits) within a PMD of a page offset. This comes
up
> * more often than one might expect in the below function.
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 4b63923e1f8d..fd28d824254b 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -29,7 +29,7 @@ static inline struct page *read_dax_sector(struct block_device
*bdev,
> }
> #endif
>
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#if 0
> int dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *,
> unsigned int flags, get_block_t);
> int __dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *,
> --
> 2.6.2
>
--
Jan Kara <jack(a)suse.com>
SUSE Labs, CR