On Mon, Mar 05, 2018 at 01:10:53PM -0700, Jason Gunthorpe wrote:
So when reading the above mlx code, we see the first wmb() being
used to ensure that CPU stores to cacheable memory are visible to the
DMA triggered by the doorbell ring.
IIUC, we don't need a similar barrier for NVMe to ensure memory is
visible to DMA since the SQE memory is allocated DMA coherent when the
SQ is not within a CMB.
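
To make that concrete, here is a rough userspace model of an NVMe-style
submission, not the actual driver code. The names model_writel, model_sqe,
model_sq, model_submit and SQ_DEPTH are made up for illustration; the
volatile store only stands in for writel(), and the array only stands in
for DMA-coherent SQ memory. It just shows the sequence of steps, with no
explicit wmb() between filling the SQE and ringing the doorbell:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SQ_DEPTH 4                      /* hypothetical queue depth */

/* Hypothetical stand-in for the kernel's writel(): a plain volatile
 * 32-bit store. It does not model real MMIO ordering semantics. */
static inline void model_writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

struct model_sqe {
	uint8_t bytes[64];              /* NVMe SQEs are 64 bytes */
};

struct model_sq {
	struct model_sqe entries[SQ_DEPTH]; /* models DMA-coherent SQ memory */
	uint16_t tail;
	volatile uint32_t *doorbell;        /* models the SQ tail doorbell */
};

static void model_submit(struct model_sq *sq, const struct model_sqe *cmd)
{
	assert(sq->tail < SQ_DEPTH);

	/* Step 1: copy the command into the queue slot at the tail. */
	memcpy(&sq->entries[sq->tail], cmd, sizeof(*cmd));

	/* Step 2: advance the tail, wrapping at the end of the queue. */
	if (++sq->tail == SQ_DEPTH)
		sq->tail = 0;

	/* Step 3: ring the doorbell with the new tail. No explicit wmb()
	 * precedes this, per the reasoning above (assumption of the model). */
	model_writel(sq->tail, sq->doorbell);
}
```

If the SQ lived in a CMB instead, the SQE copy itself would be an MMIO
write and this model would no longer describe the situation.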
The mmiowb() is used to ensure that DB writes are not combined and
issued in any order other than implied by the lock that encloses the
whole thing. This is needed because uar_map is WC memory.
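
As a sketch of the pattern being described, the doorbell write and the
mmiowb() both sit inside the lock, so the barrier is issued before the lock
is dropped. This is a userspace model only: model_lock, model_mmiowb,
model_uar and model_ring_doorbell are hypothetical names, the atomic_flag
spin loop stands in for a kernel spinlock, and the C11 fence does not
actually order MMIO or WC writes. It only marks where the barrier goes:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel spinlock. */
struct model_lock {
	atomic_flag held;
};

static void model_lock_acquire(struct model_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		;                       /* spin until the flag was clear */
}

static void model_lock_release(struct model_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical stand-in for mmiowb(). A C11 fence has no effect on
 * real MMIO; it is a placeholder for the barrier's position. */
static inline void model_mmiowb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

struct model_uar {
	struct model_lock lock;
	volatile uint32_t *db;          /* models the WC-mapped doorbell */
};

static void model_ring_doorbell(struct model_uar *uar, uint32_t val)
{
	model_lock_acquire(&uar->lock);
	*uar->db = val;                 /* doorbell write under the lock */
	model_mmiowb();                 /* barrier before the lock is dropped */
	model_lock_release(&uar->lock);
}
```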
We don't have ordering with respect to two writel's here, so if ARM
performance was a concern the writel could be switched to
Presumably nvme has similar requirements, although I guess the DB
register is mapped UC not WC?
Yep, the NVMe DB register is required by the spec to be mapped UC.