On Mon, Mar 11, 2019 at 8:37 AM Dan Williams <dan.j.williams(a)intel.com> wrote:
> Another feature the userspace tooling can support for the PMEM as RAM
> case is the ability to complete an Address Range Scrub of the range
> before it is added to the core-mm. I.e., at least ensure that previously
> encountered poison is eliminated.

Ok, so this at least makes sense as an argument to me.
In the "PMEM as filesystem" part, the errors have long-term history,
while in "PMEM as RAM" the memory may be physically the same thing,
but it doesn't have the history and as such may not be prone to
long-term errors the same way.
So that validly argues that yes, when used as RAM, the likelihood of
errors is much lower because they don't accumulate the same way.

> The driver can also publish an
> attribute to indicate when rep; mov is recoverable, and gate the
> hotplug policy on the result. In my opinion a positive indicator of
> the cpu's ability to recover rep; mov exceptions is a gap that needs
> addressing.

Is there some way to say "don't raise MC for this region"? Or at least
limit it to a nonfatal one?
Linus
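
A rough userspace sketch of the gating policy discussed above, in Python.
Assumptions: the input text uses the one "start_sector length" pair per
line format of the kernel's badblocks sysfs files; `copy_recoverable`
stands in for the proposed "rep; mov is recoverable" attribute, which does
not exist yet, so the name is made up here; and the policy itself (refuse
on any remaining poison, or when recovery is not advertised) is only one
possible choice, not something the thread settled on.

```python
def parse_badblocks(text):
    """Parse badblocks-style text: one 'start_sector length' pair per line."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        start, length = (int(f) for f in fields)
        ranges.append((start, length))
    return ranges


def may_online_as_ram(badblocks_text, copy_recoverable):
    """Hypothetical hotplug gate, evaluated after a scrub has completed.

    badblocks_text   -- contents of a badblocks file read after the scrub
    copy_recoverable -- value of the proposed (not yet existing) attribute
    Returns (allowed, reason).
    """
    poison = parse_badblocks(badblocks_text)
    if poison:
        return False, "known poison in %d range(s)" % len(poison)
    if not copy_recoverable:
        return False, "rep; mov machine checks not known to be recoverable"
    return True, "clean after scrub, recoverable copies"
```

A caller would read the badblocks text for the range once the scrub has
finished and only hand the range to memory hotplug when the first element
of the result is true. Whether an unrecoverable-copy platform should be
refused outright or merely warned about is a policy decision left open.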