Jan Kara <jack(a)suse.cz> writes:
On Tue 02-01-18 13:51:49, Dan Williams wrote:
> On Tue, Jan 2, 2018 at 1:44 PM, Dave Chinner <david(a)fromorbit.com> wrote:
> > On Sat, Dec 23, 2017 at 04:56:43PM -0800, Dan Williams wrote:
> >> In support of testing truncate colliding with dma, add a mechanism that
> >> delays the completion of block I/O requests by a programmable number of
> >> seconds. This allows a truncate operation to be issued while page
> >> references are held for direct-I/O.
> >> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
> > Why not put this in the generic bio layer code and then write a
> > generic fstest to exercise this truncate vs direct IO completion
> > race condition on all types of storage and filesystems?
> > i.e. if it sits in a nvdimm test suite, it's never going to be run
> > by filesystem developers....
> I do want to get it into xfstests eventually. I picked the nvdimm
> infrastructure for expediency of getting the fix developed. Also, I
> consider the collision in the non-dax case a solved problem since the
> core mm will keep the page out of circulation indefinitely.
Yes, but there are different races that could happen even for regular page
cache pages. So I also think it would be worthwhile to have this inside the
block layer possibly as part of the generic fault-injection framework which
is already there for fail_make_request. That already supports various
filtering, frequency, and other options that could be useful.
Or consider extending the dm-delay target (which delays the queuing of
bios) to support delaying the completions. I'm not sure I'm a fan of
sticking all sorts of debug code into the generic I/O submission path.
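
For reference, a rough sketch of how a dm-delay mapping is set up today.
The table line format is "<start> <length> delay <device> <offset>
<delay_ms>". Note that this delays the queuing of bios, not their
completion, so it only shows the interface that would need extending.
The helper name delay_table and the device /dev/sdX are placeholders
made up for illustration.

```shell
#!/bin/sh
# delay_table is a hypothetical helper, not part of dmsetup. It only
# formats a dm-delay table line covering the whole backing device.
delay_table() {
    dev=$1      # backing block device
    sectors=$2  # mapping length in 512-byte sectors
    ms=$3       # delay applied to each bio, in milliseconds
    printf '0 %s delay %s 0 %s\n' "$sectors" "$dev" "$ms"
}

# Example use (needs root and a scratch device, so left commented out):
# dev=/dev/sdX
# dmsetup create delayed --table \
#     "$(delay_table "$dev" "$(blockdev --getsz "$dev")" 5000)"
```

A test would then do direct I/O against /dev/mapper/delayed and issue
the truncate while the bios are still held back.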