On 2/15/2017 9:03 PM, Dan Williams wrote:
On Wed, Feb 15, 2017 at 4:45 PM, Linda Knippers wrote:
> On 02/15/2017 07:29 PM, Dan Williams wrote:
>> On Wed, Feb 15, 2017 at 3:19 PM, Linda Knippers <linda.knippers(a)hpe.com> wrote:
>>> On 02/13/2017 01:30 PM, Dan Williams wrote:
>>>> On Mon, Feb 13, 2017 at 10:17 AM, Linda Knippers wrote:
>>>>> On 02/13/2017 12:28 PM, Dan Williams wrote:
>>>>>> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers wrote:
>>>>>>> As it is today, we can't enable or test new functions in
>>>>>>> changing the kernel.
>>>>>>> With this patch we allow function 0 for the HPE DSM
>>>>>>> as is already allowed with the MS family. We now only
>>>>>>> the functions to the currently documented set if the module
>>>>>>> "disable_vendor_specific" is set. Since the module
>>>>>>> description says "Limit commands to the publicly
>>>>>>> this approach seems correct.
>>>>>> My concern is that this makes the kernel less strict by default.
>>>>>> this is only a test capability let's add a module option to
>>>>>> the default dsm_mask.
>>>>> This part isn't strictly a test capability. It's also to
>>>>> kernels to be supported with newer NVDIMM hardware, firmware, and
>>>>> We shouldn't need a customer to update their production kernel just to support
>>>>> an NVDIMM management tool.
>>>>> As for less secure by default, the default is to allow undocumented
>>>>> functions of the "vendor specific" type. How is the really
>>>> It's about pushing back on undocumented BIOS interfaces as much as
>>> I'm not looking to propagate a bunch of undocumented interfaces.
>>> looking to avoid having distros release new kernels and customers update
>>> their systems because there's a new documented interface, especially when
>>> thing that needs to change is the mask. As you've experienced, the
>>> process takes a long time and I'm anticipating that between now and
>>> eventual standardization, there will be at least one new DSM that's
>>> I'm sure we'll document it too, but that won't help a customer
>>> older kernel.
>>>> The kernel has the documented set enabled by default,
>>>> including the documented vendor-specific function code.
>>> Ok, so if I update my document to say that undefined values are
>>> we're good? Or shall I define the unused values that way explicitly?
>>>> There is the option to disable the vendor-specific tunnel to restrict the
>>>> only allowing commands with publicly documented effects.
>>> My patch respects that option.
>>>> With a new "default_dsm_mask" parameter the default kernel policy can be
>>>> overridden by the system owner.
>>> It can, although it means customers would have to add kernel parameters,
>>> they try to avoid. I assume that's why the module parameter for a
>>> policy isn't the default since customers using Intel management tools would need to
>>> add a parameter to allow the use of the vendor-specific function.
>>> I'm not trying to be difficult here, but I really don't see the
>>> between a documented function with undocumented behavior and an
>>> undocumented function with undocumented behavior.
>> The difference is that it is slightly more painful to require a kernel
>> module override. Disabling these commands by default has some effect
>> of pushing back on the creation of new interfaces.
> Intel's vendor-specific function, which can expose new interfaces,
> requires no override.
Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
option to disable all vendor specific passthroughs if an environment
only wants to run commands with publicly documented effects.
Ok, I can add some to the HPE1 family too. :-)
There's also nothing on the kernel side stopping any vendor from
implementing the "HPE2" or "INTEL" vendor specific interfaces in
BIOS; the interfaces are not exclusive.
I'm sure that many vendors will implement the INTEL interfaces when
there are devices using the 0x0201 RFIC. However, those aren't sufficient
for NVDIMM-N devices.
As for HPE2, it may be short-lived. I'm not sure it's in the wild and
if it's not, perhaps we can remove it. Still checking.
>> Customers are
>> already prepared to need new kernels for new enabling.
> This may not be new enabling. It could be hardware they already have
> but there is a FW update or a management feature or some error recovery
> that gets turned on.
I consider new platform functionality, hardware or firmware, as new enabling.
A customer may not see it that way.
>> It's a similar coordination we do for new device-ids for drivers, but we still allow
>> for overriding the default via the "new_id" interface in sysfs.
>> Look at the reaction we got from the community the last time this
>> subject was brought up.
> I actually think we handled that objection in the next few messages.
I think we placated it with the strict checking, but the fact that the
kernel continues to have 4 nvdimm-family-types is still a point of
We will likely end up supporting a 5th one as we move to a standard.
>> That reaction stems from the historical
>> divergence of storage management interfaces requiring Linux to have a
>> vendor-specific toolkit per device. Requiring a kernel change to make
>> a command supported by default is a good thing because it affords
>> ongoing opportunities to find commonalities between vendors and
>> support common tooling.
> A good thing for who? Not for the distro or the customer. I don't
> see how this is different than Intel adding new features through your generic
> vendor-specific function, which would not require a kernel spin.
A distro would rather carry one utility that handles multiple vendors.
The kernel would rather carry support code for one generic interface
than multiple interfaces.
We'd all like that - but we don't have that yet.
A kernel spin is not required with the proposed module parameter.
True, but configuring boot options and a reboot are required.
>> I hope the need for the vendor-specific tunnels goes away
>> and Linux can generically support the most common management tasks
>> with generic infrastructure.
> We totally agree. In case you're wondering, we don't have a new
> DSM queued up just waiting to be exposed by this patch. However, cases
> have come up where we have considered the need for a new DSM function
> outside of pure test environments. So far we have come up with other
> approaches but at some point, the right approach will be to add a new DSM
> function and I'd really like to not have to spin the kernel and all that
> goes along with that when it happens. It's not good for customers.
I don't understand why a module option is unacceptable when that is
the proposed solution to the kernel picking the wrong "default"
The difference is that picking the default family is primarily for
testing, while in the other case it's not. Testers do all sorts of things
that customers won't like.
In fact, thinking about it further, I'd be more open to a patch that
allows overriding a DIMM's family via sysfs.
Sorry if this is a dumb question but will that work at boot time? Today
there are DSMs that are called during initialization. There could also
be DSMs called based on whether NFIT device flag bit 4 is set, although
we don't currently do that.
That way we can
potentially have cross usage of these different interfaces and it's
less awkward for tooling to use than a module option.
For testing purposes, being able to switch DSM families without a reboot
would be really nice. You could even expose the GUID and allow it, the
family, and the dsm mask to be overridden. Would be helpful when testing
that new standard DSM we all aspire to.
This approach would touch a lot more code and require a lot more testing
than my relatively simple patches because today these things are configured
at boot time rather than run time. I wonder about ordering and consistency
checking. For example, if someone changed the family would you automatically
change the mask or is that two operations? I assume you'd check
that the family is actually supported for the device? With sysfs, different
devices could have different families, where with the module parameter option
there's one default that's applied to all devices for which that family is
At this point I'd probably prefer the simplicity of the module parameter
because I know I can do it and test it. If the only way you'll take the
patch is to add a second parameter then I'll do that, but I still don't
see the point.
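
To make the two policies in this thread concrete, here is a rough userspace
sketch in C. It is not the kernel code and not either proposed patch. The
mask value and the helper names (DOCUMENTED_MASK, mask_permissive,
mask_with_override, fn_allowed) are made up for illustration; only
"disable_vendor_specific" and "default_dsm_mask" come from the discussion
above, and treating a zero "default_dsm_mask" as "no override" is an
assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: pretend DSM functions 0..5 are the documented set. */
#define DOCUMENTED_MASK 0x3fu
#define ALL_FUNCTIONS   0xffffffffu

/*
 * Policy as described for the patch: every function is allowed unless
 * disable_vendor_specific is set, in which case only the documented
 * set remains.
 */
uint32_t mask_permissive(bool disable_vendor_specific)
{
	return disable_vendor_specific ? DOCUMENTED_MASK : ALL_FUNCTIONS;
}

/*
 * Alternative raised in review: the documented set is the default, and
 * a nonzero default_dsm_mask supplied by the system owner replaces it.
 */
uint32_t mask_with_override(uint32_t default_dsm_mask)
{
	return default_dsm_mask ? default_dsm_mask : DOCUMENTED_MASK;
}

/* A function number is usable only if its bit is set in the mask. */
bool fn_allowed(uint32_t mask, unsigned int fn)
{
	return fn < 32 && (mask & (UINT32_C(1) << fn)) != 0;
}
```

Under these assumptions a not-yet-documented function, say number 20, is
usable out of the box with the first policy, while the second policy needs
the owner to pass a mask with bit 20 set. That is the practical difference
being argued about here.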