On Thu, Dec 29, 2016 at 08:56:13PM -0800, Dan Williams wrote:
> > Um... Then we do have a problem - nocache variant of uaccess
> > does *not* guarantee that clwb is redundant.
> > What about the requirements of e.g. tcp_sendmsg() with its use of
> > skb_add_data_nocache()? What warranties do we need there?
>
> Yes, we need to distinguish the existing "nocache" that tries to avoid
> unnecessary cache pollution and this new "must write through" semantic
> for writing to persistent memory. I suspect usages of
> skb_add_data_nocache() are ok since they are in the transmit path.
> Receiving directly into a buffer that is expected to be persisted
> immediately is where we would need to be careful, but that is already
> backstopped by dirty cacheline tracking. So as far as I can see, we
> should only need a new memcpy_writethrough() (?) for the pmem
> direct-i/o path at present.

OK... Right now we have several places playing with nocache:
* dax_iomap_actor(). Writethrough warranties needed; the nocache
  side serves to reduce the cache impact *and* avoid the need for clwb.
* several memcpy_to_pmem() users - acpi_nfit_blk_single_io(),
  nsio_rw_bytes(), write_pmem(). No clwb attempted; is it needed there?
* hfi1_copy_sge(). Cache pollution avoidance? The source is
  in the kernel, so it looks like a memcpy_nocache() candidate.
* ntb_memcpy_tx(). A really fishy one - it copies from kernel to iomem,
  with the nocache userland->kernel copying primitive abused on x86. As soon
  as e.g. powerpc or sparc grows ARCH_HAS_NOCACHE_UACCESS, we are in trouble
  there. What is it actually trying to achieve? memcpy_toio() with
  cache pollution avoidance?
* networking copy_from_iter_full_nocache() users - cache pollution
  avoidance, AFAICS; no writethrough warranties sought.
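To make the two contracts in the list above concrete, here is a rough user-space sketch. It is not kernel code: sketch_memcpy_nocache() and sketch_memcpy_writethrough() are hypothetical names, the 64-byte line size is an assumption, and clflush stands in for clwb only because it is available on every x86-64 CPU. The point is the difference in what the caller is promised, not the exact instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si32, _mm_clflush, _mm_sfence */
#define SKETCH_X86 1
#endif

#define SKETCH_CACHELINE 64	/* assumed cache line size */

/*
 * "nocache" as a hint: try not to pollute the cache.  The caller is
 * promised nothing about where the data ends up, so falling back to a
 * plain memcpy() is a legitimate implementation.
 */
static void sketch_memcpy_nocache(void *dst, const void *src, size_t len)
{
#ifdef SKETCH_X86
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Byte copies until the destination is 4-byte aligned. */
	while (len && ((uintptr_t)d & 3)) {
		*d++ = *s++;
		len--;
	}
	/* Non-temporal 32-bit stores for the aligned body. */
	while (len >= 4) {
		int v;

		memcpy(&v, s, sizeof(v));
		_mm_stream_si32((int *)d, v);
		d += 4;
		s += 4;
		len -= 4;
	}
	/* Ordinary (cached) stores for the tail. */
	while (len) {
		*d++ = *s++;
		len--;
	}
	_mm_sfence();	/* order the non-temporal stores */
#else
	memcpy(dst, src, len);
#endif
}

/*
 * "writethrough" as a guarantee: on a 0 return every cache line touched
 * by the copy has been flushed.  Where this sketch has no flush
 * primitive it refuses (-1) instead of silently degrading to memcpy().
 */
static int sketch_memcpy_writethrough(void *dst, const void *src, size_t len)
{
#ifdef SKETCH_X86
	uintptr_t p = (uintptr_t)dst & ~(uintptr_t)(SKETCH_CACHELINE - 1);
	uintptr_t end = (uintptr_t)dst + len;

	memcpy(dst, src, len);
	for (; p < end; p += SKETCH_CACHELINE)
		_mm_clflush((const void *)p);
	_mm_sfence();	/* flushes are complete before we return */
	return 0;
#else
	(void)dst;
	(void)src;
	(void)len;
	return -1;
#endif
}
```

Note that the tail of the nocache variant goes through the cache, which is fine for a hint and would not be fine for a guarantee; that is the gap between the two semantics.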
Why does pmem need writethrough warranties, anyway? All explanations I've
found on the net have been along the lines of "we should not store a pointer
to a pmem data structure until the structure itself has been committed to
pmem", and that looks like something that ought to be a job for barriers
- after all, we don't want the pointer store to be observed by _anything_
in the system until the earlier stores are visible, so what makes pmem
different from e.g. another CPU or a PCI busmaster, or...?
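For reference, the barrier-based pattern that paragraph has in mind looks roughly like the following user-space C11 analogue (in the kernel one would reach for smp_store_release()/smp_load_acquire() instead). The struct, the slot and the function names are made up for illustration. It only shows the ordering between the data stores and the pointer store as seen by other observers; it does nothing about flushing cache lines, and whether pmem needs more than this is exactly the open question.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload; stands in for "the structure" in the text. */
struct sketch_record {
	int a;
	int b;
};

static struct sketch_record sketch_slot;
static _Atomic(struct sketch_record *) sketch_published;	/* starts out NULL */

/* Fill in the structure, then make the pointer to it visible. */
static void sketch_publish(int a, int b)
{
	sketch_slot.a = a;
	sketch_slot.b = b;
	/*
	 * Release store: anyone who observes the pointer (with an acquire
	 * load) also observes the stores to sketch_slot made before it.
	 */
	atomic_store_explicit(&sketch_published, &sketch_slot,
			      memory_order_release);
}

/* Returns NULL until something has been published. */
static const struct sketch_record *sketch_consume(void)
{
	return atomic_load_explicit(&sketch_published, memory_order_acquire);
}
```

A reader that gets a non-NULL pointer back from sketch_consume() is guaranteed to see the fully initialized fields; nothing here says where those fields physically live at that moment.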
I'm trying to figure out what would be the right API here; sure, we can
add separate memcpy_writethrough()/__copy_from_user_inatomic_writethrough()/
copy_from_iter_writethrough(), but I would like to understand what's going