On 04/03/2015 08:12 PM, Yinghai Lu wrote:
On Fri, Apr 3, 2015 at 9:14 AM, Toshi Kani <toshi.kani(a)hp.com> wrote:
> On Wed, 2015-04-01 at 09:12 +0200, Christoph Hellwig wrote:
> :
>> @@ -748,7 +758,7 @@ u64 __init early_reserve_e820(u64 size, u64 align)
>> /*
>> * Find the highest page frame number we have available
>> */
>> -static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
>> +static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
>> {
>> int i;
>> unsigned long last_pfn = 0;
>> @@ -759,7 +769,11 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
>> unsigned long start_pfn;
>> unsigned long end_pfn;
>>
>> - if (ei->type != type)
>> + /*
>> + * Persistent memory is accounted as ram for purposes of
>> + * establishing max_pfn and mem_map.
>> + */
>> + if (ei->type != E820_RAM && ei->type != E820_PRAM)
>> continue;
>
> Should we also delete this code, accounting E820_PRAM as ram, along with
> the deletion of reserve_pmem() in this version?
Hi Yinghai, Toshi
My old patches did not have these updates either, and everything
was very much usable for a long time.
However, I actually liked these changes in Christoph's patches and
thought they should stay. Here is why.
Today I will be sending patches that let pmem be supported with
struct page, as an optional alternative to the use of ioremap.
This is for advanced users who want to do RDMA, direct_IO and so
on directly out of pmem.
At one point we had a BUG in some mm/memory.c code that was checking max_pfn.
Actually that was a bug, and we do not go through this code anymore. And between
us, the global variable max_pfn is a bad hack. But as long as it is in use I
would like pmem to be covered by it, so that code that protects itself with a
max_pfn check can still accept pmem memory submitted to it.
I have tried to audit the kernel's use of max_pfn and I do not see how
this can hurt. I do see where it would theoretically help.
Think of a system whose memory map looks like this:
1. VM (Volatile mem)
2. PM
3. VM
4. PM
This is what current and planned NUMA implementations return.
If only RAM is counted, pmem region 2 will be covered by max_pfn, but pmem
region 4 will not. Any code that checks against max_pfn will then be OK with
pmem region 2 but *not* with pmem region 4. This is highly unexpected.
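To make the layout above concrete, here is a minimal user-space sketch, not the kernel code. The struct, the enum values, the pfn ranges and the helper names (end_pfn, pfn_below_max, example_map) are all made up for illustration; only the idea of counting or skipping persistent memory when computing the highest pfn comes from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the e820 entry types; values are illustrative. */
enum region_type { REGION_RAM = 1, REGION_PRAM = 2 };

struct region {
	unsigned long start_pfn;
	unsigned long end_pfn;		/* exclusive */
	enum region_type type;
};

/* The four-region layout from the mail: VM, PM, VM, PM (made-up pfn ranges). */
const struct region example_map[] = {
	{ 0x000, 0x100, REGION_RAM  },	/* 1. VM */
	{ 0x100, 0x200, REGION_PRAM },	/* 2. PM */
	{ 0x200, 0x300, REGION_RAM  },	/* 3. VM */
	{ 0x300, 0x400, REGION_PRAM },	/* 4. PM */
};

/*
 * Return the highest end pfn in the map. RAM regions always count;
 * persistent memory counts only when count_pram is non-zero.
 */
unsigned long end_pfn(const struct region *map, size_t n, int count_pram)
{
	unsigned long last_pfn = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (map[i].type != REGION_RAM &&
		    !(count_pram && map[i].type == REGION_PRAM))
			continue;
		if (map[i].end_pfn > last_pfn)
			last_pfn = map[i].end_pfn;
	}
	return last_pfn;
}

/* The kind of sanity check a max_pfn user might make. */
int pfn_below_max(unsigned long pfn, unsigned long max_pfn)
{
	return pfn < max_pfn;
}
```

With count_pram == 0 the result stops at the end of region 3, so a pfn inside region 2 passes pfn_below_max() while a pfn inside region 4 fails it. With count_pram != 0 both pmem regions pass.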
I think max_pfn as a whole should be killed ASAP, but until it is,
it will not hurt for pmem to be covered.
Thanks
Boaz