[PATCH 0/4] clean up some functions in nvdimm/badrange.c and namespace_devs.c
by Zhen Lei
When I learned the code of drivers/nvdimm, I found some places can be improved.
Zhen Lei (4):
libnvdimm: remove redundant list_empty() check in badrange.c
libnvdimm: eliminate a meaningless spinlock operation
libnvdimm: eliminate two unnecessary zero initializations in
badrange.c
libnvdimm: avoid unnecessary judgments in nvdimm_namespace_disk_name()
drivers/nvdimm/badrange.c | 36 ++++++++++--------------------------
drivers/nvdimm/namespace_devs.c | 12 +++++-------
2 files changed, 15 insertions(+), 33 deletions(-)
--
1.8.3
1 year, 8 months
[ndctl PATCH v2] namespace-action: Don't act on any seed namespaces
by Santosh Sivaraj
Catch seed namespaces early on. This will prevent checking for sizes in enable,
disable and destroy namespace code path, which in turn prevents the inconsistent
reporting in count of enabled/disabled namespaces.
Signed-off-by: Santosh Sivaraj <santosh(a)fossix.org>
---
ndctl/lib/libndctl.c | 5 ----
ndctl/namespace.c | 57 +++++++++++++++++++++-----------------------
2 files changed, 27 insertions(+), 35 deletions(-)
Changes from v2: Updated patch to return the proper error code. In V1, we return
the default -ENXIO even if we had skipped all the namespaces (when there are
only idle namespaces). In cases like that, we need to return zero, except when
creating namespaces, which may fail due to insufficient space.
diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
index ee737cb..d0599f7 100644
--- a/ndctl/lib/libndctl.c
+++ b/ndctl/lib/libndctl.c
@@ -4130,16 +4130,11 @@ NDCTL_EXPORT int ndctl_namespace_enable(struct ndctl_namespace *ndns)
const char *devname = ndctl_namespace_get_devname(ndns);
struct ndctl_ctx *ctx = ndctl_namespace_get_ctx(ndns);
struct ndctl_region *region = ndns->region;
- unsigned long long size = ndctl_namespace_get_size(ndns);
int rc;
if (ndctl_namespace_is_enabled(ndns))
return 0;
- /* Don't try to enable idle namespace (no capacity allocated) */
- if (size == 0)
- return -ENXIO;
-
rc = ndctl_bind(ctx, ndns->module, devname);
/*
diff --git a/ndctl/namespace.c b/ndctl/namespace.c
index 0550580..ea540dd 100644
--- a/ndctl/namespace.c
+++ b/ndctl/namespace.c
@@ -1102,7 +1102,6 @@ static int namespace_destroy(struct ndctl_region *region,
struct ndctl_namespace *ndns)
{
const char *devname = ndctl_namespace_get_devname(ndns);
- unsigned long long size;
bool did_zero = false;
int rc;
@@ -1147,19 +1146,9 @@ static int namespace_destroy(struct ndctl_region *region,
goto out;
}
- size = ndctl_namespace_get_size(ndns);
-
rc = ndctl_namespace_delete(ndns);
if (rc)
debug("%s: failed to reclaim\n", devname);
-
- /*
- * Don't report a destroyed namespace when no capacity was
- * allocated.
- */
- if (size == 0 && rc == 0)
- rc = 1;
-
out:
return rc;
}
@@ -2128,8 +2117,9 @@ static int do_xaction_namespace(const char *namespace,
ndctl_namespace_foreach_safe(region, ndns, _n) {
ndns_name = ndctl_namespace_get_devname(ndns);
- if (strcmp(namespace, "all") != 0
+ if ((strcmp(namespace, "all") != 0
&& strcmp(namespace, ndns_name) != 0)
+ || ndctl_namespace_get_size(ndns) == 0)
continue;
switch (action) {
case ACTION_DISABLE:
@@ -2148,9 +2138,6 @@ static int do_xaction_namespace(const char *namespace,
rc = namespace_destroy(region, ndns);
if (rc == 0)
(*processed)++;
- /* return success if skipped */
- if (rc > 0)
- rc = 0;
break;
case ACTION_CHECK:
rc = namespace_check(ndns, verbose,
@@ -2195,22 +2182,32 @@ static int do_xaction_namespace(const char *namespace,
if (ri_ctx.f_out && ri_ctx.f_out != stdout)
fclose(ri_ctx.f_out);
- if (action == ACTION_CREATE && rc == -EAGAIN) {
- /*
- * Namespace creation searched through all candidate
- * regions and all of them said "nope, I don't have
- * enough capacity", so report -ENOSPC. Except during
- * greedy namespace creation using --continue as we
- * may have created some namespaces already, and the
- * last one in the region search may preexist.
- */
- if (param.greedy && (*processed) > 0)
- rc = 0;
- else
- rc = -ENOSPC;
+ if (action == ACTION_CREATE) {
+ if (rc == -EAGAIN) {
+ /*
+ * Namespace creation searched through all candidate
+ * regions and all of them said "nope, I don't have
+ * enough capacity", so report -ENOSPC. Except during
+ * greedy namespace creation using --continue as we
+ * may have created some namespaces already, and the
+ * last one in the region search may preexist.
+ */
+ if (param.greedy && (*processed) > 0)
+ rc = 0;
+ else
+ rc = -ENOSPC;
+ }
+
+ if (saved_rc)
+ rc = saved_rc;
+
+ return rc;
}
- if (saved_rc)
- rc = saved_rc;
+
+ /* If there is nothing processed, and if we have only skipped idle
+ * namespaces, there shouldn't be any error */
+ if (rc == -ENXIO && !*processed)
+ rc = 0;
return rc;
}
--
2.26.2
1 year, 8 months
[PATCH] libnvdimm/e820: Fix build error without MEMORY_HOTPLUG
by YueHaibing
If MEMORY_HOTPLUG is n, build fails:
drivers/nvdimm/e820.c: In function ‘e820_register_one’:
drivers/nvdimm/e820.c:24:12: error: implicit declaration of function ‘phys_to_target_node’; did you mean ‘set_page_node’? [-Werror=implicit-function-declaration]
int nid = phys_to_target_node(res->start);
^~~~~~~~~~~~~~~~~~~
set_page_node
Fixes: 7b27a8622f80 ("libnvdimm/e820: Retrieve and populate correct 'target_node' info")
Signed-off-by: YueHaibing <yuehaibing(a)huawei.com>
---
drivers/nvdimm/e820.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/nvdimm/e820.c b/drivers/nvdimm/e820.c
index 4cd18be9d0e9..c741f029f215 100644
--- a/drivers/nvdimm/e820.c
+++ b/drivers/nvdimm/e820.c
@@ -17,6 +17,13 @@ static int e820_pmem_remove(struct platform_device *pdev)
return 0;
}
+#ifndef CONFIG_MEMORY_HOTPLUG
+static inline int phys_to_target_node(u64 start)
+{
+ return 0;
+}
+#endif
+
static int e820_register_one(struct resource *res, void *data)
{
struct nd_region_desc ndr_desc;
--
2.17.1
1 year, 8 months
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if
DCACHE_REFERENCED is set
by Li, Hao
Hello,
CC to Dave, darrick.wong and xfs/nvdimm list to get more discussions.
Thanks.
Hao Li
On 2020/8/24 14:17, Li, Hao wrote:
> On 2020/8/23 14:54, Ira Weiny wrote:
>> On Fri, Aug 21, 2020 at 10:40:41AM -0700, 'Ira Weiny' wrote:
>>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote:
>>>> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE
>>>> set from being killed, so the corresponding inode can't be evicted. If
>>>> the DAX policy of an inode is changed, we can't make policy changing
>>>> take effects unless dropping caches manually.
>>>>
>>>> This patch fixes this problem and flushes the inode to disk to prepare
>>>> for evicting it.
>>> This looks intriguing and I really hope this helps but I don't think this will
>>> guarantee that the state changes immediately will it?
>>>
>>> Do you have a test case which fails before and passes after? Perhaps one of
>>> the new xfstests submitted by Xiao?
>> Ok I just went back and read your comment before.[1] Sorry for being a bit
>> slow on the correlation between this patch and that email. (BTW, I think it
>> would have been good to put those examples in the commit message and or
>> reference that example.)
> Thanks for your advice. I will add those examples in v2 after further
> discussion of this patch.
>
>> I'm assuming that with this patch example 2 from [1]
>> works without a drop_cache _if_ no other task has the file open?
> Yes. If no other task is opening the file, the inode and page cache of this
> file will be dropped during xfs_io exiting process. There is no need to run
> echo 2 > drop_caches.
>
>> Anyway, with that explanation I think you are correct that this improves the
>> situation _if_ the only references on the file is controlled by the user and
>> they have indeed closed all of them.
>>
>> The code for DCACHE_DONTCACHE as I attempted to write it was that it should
>> have prevented further caching of the inode such that the inode would evict
>> sooner. But it seems you have found a bug/optimization?
> Yes. This patch is an optimization and can also be treated as a bugfix.
> On the other side, even though this patch can make DCACHE_DONTCACHE more
> reasonable, I am not quite sure if my approach is safe and doesn't impact
> the fs performance. I hope the community can give me more advice.
>
>> In the end, however, if another user (such as a backup running by the admin)
>> has a reference the DAX change may still be delayed.
> Yes. In this situation, neither drop_caches approach nor this patch can make
> the DAX change take effects soon.
> Moreover, things are different if the backup task exits, this patch
> will make sure the inode and page cache of the file can be dropped
> _automatically_ without manual intervention. By contrast, the original
> approach needs a manual cache dropping.
>
>> So I'm thinking the
>> documentation should remain largely as is? But perhaps I am wrong. Does this
>> completely remove the need for a drop_caches or only in the example you gave?
> I think the contents related to drop_caches in documentation can be removed
> if this patch's approach is acceptable.
>
>> Since I'm not a FS expert I'm still not sure.
> Frankly, I'm not an expert either, so I hope this patch can be discussed
> further in case it has side effects.
>
> Thanks,
> Hao Li
>
>> Regardless, thanks for the fixup! :-D
>> Ira
>>
>> [1] https://lore.kernel.org/linux-xfs/ba98b77e-a806-048a-a0dc-ca585677daf3@cn...
>>
>>> Ira
>>>
>>>> Signed-off-by: Hao Li <lihao2018.fnst(a)cn.fujitsu.com>
>>>> ---
>>>> fs/dcache.c | 3 ++-
>>>> fs/inode.c | 2 +-
>>>> 2 files changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/dcache.c b/fs/dcache.c
>>>> index ea0485861d93..486c7409dc82 100644
>>>> --- a/fs/dcache.c
>>>> +++ b/fs/dcache.c
>>>> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry)
>>>> */
>>>> smp_rmb();
>>>> d_flags = READ_ONCE(dentry->d_flags);
>>>> - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED;
>>>> + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED
>>>> + | DCACHE_DONTCACHE;
>>>>
>>>> /* Nothing to do? Dropping the reference was all we needed? */
>>>> if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry))
>>>> diff --git a/fs/inode.c b/fs/inode.c
>>>> index 72c4c347afb7..5218a8aebd7f 100644
>>>> --- a/fs/inode.c
>>>> +++ b/fs/inode.c
>>>> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode)
>>>> }
>>>>
>>>> state = inode->i_state;
>>>> - if (!drop) {
>>>> + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) {
>>>> WRITE_ONCE(inode->i_state, state | I_WILL_FREE);
>>>> spin_unlock(&inode->i_lock);
>>>>
>>>> --
>>>> 2.28.0
>>>>
>>>>
>>>>
>
>
1 year, 8 months
Re: [PATCH 9/9] iomap: Change calling convention for zeroing
by Dave Chinner
On Mon, Aug 24, 2020 at 09:35:59PM -0600, Andreas Dilger wrote:
> On Aug 24, 2020, at 9:26 PM, Matthew Wilcox <willy(a)infradead.org> wrote:
> >
> > On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinner wrote:
> >>> do {
> >>> - unsigned offset, bytes;
> >>> -
> >>> - offset = offset_in_page(pos);
> >>> - bytes = min_t(loff_t, PAGE_SIZE - offset, count);
> >>> + loff_t bytes;
> >>>
> >>> if (IS_DAX(inode))
> >>> - status = dax_iomap_zero(pos, offset, bytes, iomap);
> >>> + bytes = dax_iomap_zero(pos, length, iomap);
> >>
> >> Hmmm. everything is loff_t here, but the callers are defining length
> >> as u64, not loff_t. Is there a potential sign conversion problem
> >> here? (sure 64 bit is way beyond anything we'll pass here, but...)
> >
> > I've gone back and forth on the correct type for 'length' a few times.
> > size_t is too small (not for zeroing, but for seek()). An unsigned type
> > seems right -- a length can't be negative, and we don't want to give
> > the impression that it can. But the return value from these functions
> > definitely needs to be signed so we can represent an error. So a u64
> > length with an loff_t return type feels like the best solution. And
> > the upper layers have to promise not to pass in a length that's more
> > than 2^63-1.
>
> The problem with allowing a u64 as the length is that it leads to the
> possibility of an argument value that cannot be returned. Checking
> length < 0 is not worse than checking length > 0x7ffffffffffffff,
> and has the benefit of consistency with the other argument types and
> signs...
I think the problem here is that we have no guaranteed 64 bit size
type. when that was the case with off_t, we created loff_t to always
represent a 64 bit offset value. However, we never created one for
the count/size that is passed alongside loff_t in many places - it
was said that "syscalls are limited to 32 bit sizes" and
"size_t is 64 bit on 64 bit platforms" and so on and so we still
don't have a clean way to pass 64 bit sizes through the IO path.
We've been living with this shitty situation for a long time now, so
perhaps it's time for us to define lsize_t for 64 bit lengths and
start using that everywhere that needs a 64 bit clean path
through the code, regardless of whether the arch is 32 or 64 bit...
Thoughts?
-Dave.
--
Dave Chinner
david(a)fromorbit.com
1 year, 8 months
[GIT PULL] libnvdimm fixes for v5.9-rc3
by Verma, Vishal L
Hi Linus, please pull from:
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git
tags/libnvdimm-fix-v5.9-rc3
...to receive a couple of minor fixes for things merged in 5.9-rc1. One
is an out-of-bounds access caught by KASAN, and the second is a tweak to
some overzealous logging about dax support even for traditional block
devices which was unnecessary. These have appeared in -next without any
problems.
---
The following changes since commit 9123e3a74ec7b934a4a099e98af6a61c2f80bbf5:
Linux 5.9-rc1 (2020-08-16 13:04:57 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git tags/libnvdimm-fix-v5.9-rc3
for you to fetch changes up to c2affe920b0e0669650943ac086215cf6519be34:
dax: do not print error message for non-persistent memory block device (2020-08-20 11:43:18 -0600)
----------------------------------------------------------------
libnvdimm fixes for v5.9-rc3
Fix an out-of-bounds access introduced in libnvdimm v5.9-rc1
dax: do not print error message for non-persistent memory block device
----------------------------------------------------------------
Adrian Huang (1):
dax: do not print error message for non-persistent memory block device
Zqiang (1):
libnvdimm: KASAN: global-out-of-bounds Read in internal_create_group
drivers/dax/super.c | 6 ++++++
drivers/nvdimm/dimm_devs.c | 1 +
2 files changed, 7 insertions(+)
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index c82cbcb64202..32642634c1bb 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -100,6 +100,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
+ if (!dax_dev && !bdev_dax_supported(bdev, blocksize)) {
+ pr_debug("%s: error: dax unsupported by block device\n",
+ bdevname(bdev, buf));
+ return false;
+ }
+
id = dax_read_lock();
len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 61374def5155..b59032e0859b 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -529,6 +529,7 @@ static DEVICE_ATTR_ADMIN_RW(activate);
static struct attribute *nvdimm_firmware_attributes[] = {
&dev_attr_activate.attr,
&dev_attr_result.attr,
+ NULL,
};
static umode_t nvdimm_firmware_visible(struct kobject *kobj, struct attribute *a, int n)
1 year, 8 months
[PATCH] drivers/dax: Use kobj_to_dev() instead
by Wang Qing
Use kobj_to_dev() instead of container_of()
Signed-off-by: Wang Qing <wangqing(a)vivo.com>
---
drivers/dax/bus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index df238c8..24625d2
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -331,7 +331,7 @@ static DEVICE_ATTR_RO(numa_node);
static umode_t dev_dax_visible(struct kobject *kobj, struct attribute *a, int n)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct dev_dax *dev_dax = to_dev_dax(dev);
if (a == &dev_attr_target_node.attr && dev_dax_target_node(dev_dax) < 0)
--
2.7.4
1 year, 8 months
[PATCH 1/1] dax: do not print error message for non-persistent memory block device
by Adrian Huang
From: Adrian Huang <ahuang12(a)lenovo.com>
From: Adrian Huang <ahuang12(a)lenovo.com>
Commit 231609785cbf ("dax: print error message by pr_info()
in __generic_fsdax_supported()") happens to print the following
error message during booting when the non-persistent memory block
devices are configured by device mapper. Those error messages are
caused by the variable 'dax_dev' is NULL. Users might be confused
with those error messages since they do not use the persistent
memory device. Moreover, users might scare about "what's wrong
with my disks" because they see the 'error' and 'failed' keywords.
# dmesg | grep fail
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdk3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
sdb3: error: dax access failed (-95)
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.1T 0 disk
├─sda1 8:1 0 156M 0 part
├─sda2 8:2 0 40G 0 part
└─sda3 8:3 0 1.1T 0 part
sdb 8:16 0 1.1T 0 disk
├─sdb1 8:17 0 600M 0 part
├─sdb2 8:18 0 1G 0 part
└─sdb3 8:19 0 1.1T 0 part
├─rhel00-swap 254:3 0 4G 0 lvm
├─rhel00-home 254:4 0 1T 0 lvm
└─rhel00-root 254:5 0 50G 0 lvm
sdc 8:32 0 1.1T 0 disk
sdd 8:48 0 1.1T 0 disk
sde 8:64 0 1.1T 0 disk
sdf 8:80 0 1.1T 0 disk
sdg 8:96 0 1.1T 0 disk
sdh 8:112 0 3.3T 0 disk
├─sdh1 8:113 0 500M 0 part /boot/efi
├─sdh2 8:114 0 40G 0 part /
├─sdh3 8:115 0 2.9T 0 part /home
└─sdh4 8:116 0 314.6G 0 part [SWAP]
sdi 8:128 0 1.1T 0 disk
sdj 8:144 0 3.3T 0 disk
├─sdj1 8:145 0 512M 0 part
└─sdj2 8:146 0 3.3T 0 part
sdk 8:160 0 119.2G 0 disk
├─sdk1 8:161 0 200M 0 part
├─sdk2 8:162 0 1G 0 part
└─sdk3 8:163 0 118G 0 part
├─rhel-swap 254:0 0 4G 0 lvm
├─rhel-home 254:1 0 64G 0 lvm
└─rhel-root 254:2 0 50G 0 lvm
sdl 8:176 0 119.2G 0 disk
The call path is shown as follows:
dm_table_determine_type
dm_table_supports_dax
device_supports_dax
generic_fsdax_supported
__generic_fsdax_supported
With the disk configuration listing from the command 'lsblk',
the member 'dev->dax_dev' of the block devices 'sdb3' and 'sdk3'
(configured by device mapper) is NULL in function
generic_fsdax_supported() because the member is configured in
function open_table_device().
To prevent the confusing error messages in this scenario (this is
normal behavior), just print those error messages by pr_debug()
by checking if dax_dev is NULL and the block device does not support
DAX.
Fixes: 231609785cbf ("dax: print error message by pr_info() in __generic_fsdax_supported()")
Cc: Coly Li <colyli(a)suse.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Alasdair Kergon <agk(a)redhat.com>
Cc: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Adrian Huang <ahuang12(a)lenovo.com>
---
drivers/dax/super.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index c82cbcb64202..32642634c1bb 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -100,6 +100,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
+ if (!dax_dev && !bdev_dax_supported(bdev, blocksize)) {
+ pr_debug("%s: error: dax unsupported by block device\n",
+ bdevname(bdev, buf));
+ return false;
+ }
+
id = dax_read_lock();
len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);
--
2.17.1
1 year, 8 months
[PATCH v2 0/4] bugfix and optimize for drivers/nvdimm
by Zhen Lei
v1 --> v2:
1. Add Fixes for Patch 1-2
2. Slightly change the subject and description of Patch 1
3. Add a new trivial Patch 4, I just found that yesterday.
v1:
I found a memleak when I learned the drivers/nvdimm code today. And I also
added a sanity check for priv->bus_desc.provider_name, because strdup()
maybe failed. Patch 3 is a trivial source code optimization.
Zhen Lei (4):
libnvdimm: fix memmory leaks in of_pmem.c
libnvdimm: add sanity check for provider_name in
of_pmem_region_probe()
libnvdimm/bus: simplify walk_to_nvdimm_bus()
libnvdimm/region: reduce an unnecessary if branch in
nd_region_create()
drivers/nvdimm/bus.c | 7 +++----
drivers/nvdimm/of_pmem.c | 7 +++++++
drivers/nvdimm/region_devs.c | 5 +----
3 files changed, 11 insertions(+), 8 deletions(-)
--
1.8.3
1 year, 8 months
[PATCH] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
by Vaibhav Jain
The newly introduced 'perf_stats' attribute uses the default access
mode of 0444 letting non-root users access performance stats of an
nvdimm and potentially force the kernel into issuing large number of
expensive HCALLs. Since the information exposed by this attribute
cannot be cached hence its better to ward of access to this attribute
from users who don't need to access these performance statistics.
Hence this patch adds check in perf_stats_show() to only let users
that are 'perfmon_capable()' to read the nvdimm performance
statistics.
Fixes: 2d02bf835e573 ('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
Reported-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.ibm.com>
---
arch/powerpc/platforms/pseries/papr_scm.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f439f0dfea7d1..36c51bf8af9a8 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -792,6 +792,10 @@ static ssize_t perf_stats_show(struct device *dev,
struct nvdimm *dimm = to_nvdimm(dev);
struct papr_scm_priv *p = nvdimm_provider_data(dimm);
+ /* Allow access only to perfmon capable users */
+ if (!perfmon_capable())
+ return -EACCES;
+
if (!p->stat_buffer_len)
return -ENOENT;
--
2.26.2
1 year, 9 months