On Sun, Jul 8, 2018 at 9:59 PM, Qi, Fuli <qi.fuli(a)jp.fujitsu.com> wrote:
> -----Original Message-----
> From: Dan Williams [mailto:dan.j.williams@intel.com]
> Sent: Sunday, July 8, 2018 5:06 AM
> To: Qi, Fuli/斉 福利 <qi.fuli(a)jp.fujitsu.com>
> Cc: linux-nvdimm <linux-nvdimm(a)lists.01.org>
> Subject: Re: [PATCH v9 1/3] ndctl, monitor: add ndctl monitor
>
> On Thu, Jun 28, 2018 at 7:30 PM, QI Fuli <qi.fuli(a)jp.fujitsu.com> wrote:
> > Ndctl monitor is used for monitoring the smart events of nvdimm DIMMs.
> > When a smart event fires, monitor will output the notifications which
> > include dimm health status and evnet informations to syslog or a
> > logfile by setting [--logfile] option. The notifications follow json
> > format and can be consumed by log collectors like Fluentd.
> >
> > The objects to monitor can be selected by setting [--dimm] [--region]
> > [--namespace] [--bus] options and the event type can be filtered by
> > setting [--dimm-event] option. These options support multiple
> > space-separated arguments.
> >
> > Ndctl monitor can be forked as a daemon by using [--daemon] option,
> > such as:
> > # ndctl monitor --daemon --logfile /var/log/ndctl/monitor.log
> >
> > Signed-off-by: QI Fuli <qi.fuli(a)jp.fujitsu.com>
> > ---
> > builtin.h | 1 +
> > ndctl/Makefile.am | 3 +-
> > ndctl/lib/libndctl.sym | 1 +
> > ndctl/lib/smart.c | 17 ++
> > ndctl/libndctl.h | 6 +
> > ndctl/monitor.c | 531 +++++++++++++++++++++++++++++++++++++++++
> > ndctl/ndctl.c | 1 +
> > util/filter.h | 9 +
> > 8 files changed, 568 insertions(+), 1 deletion(-) create mode 100644
> > ndctl/monitor.c
> >
> > diff --git a/builtin.h b/builtin.h
> > index d3cc723..675a6ce 100644
> > --- a/builtin.h
> > +++ b/builtin.h
> > @@ -39,6 +39,7 @@ int cmd_inject_error(int argc, const char **argv,
> > void *ctx); int cmd_wait_scrub(int argc, const char **argv, void
> > *ctx); int cmd_start_scrub(int argc, const char **argv, void *ctx);
> > int cmd_list(int argc, const char **argv, void *ctx);
> > +int cmd_monitor(int argc, const char **argv, void *ctx);
> > #ifdef ENABLE_TEST
> > int cmd_test(int argc, const char **argv, void *ctx); #endif diff
> > --git a/ndctl/Makefile.am b/ndctl/Makefile.am index d22a379..7dbf223
> > 100644
> > --- a/ndctl/Makefile.am
> > +++ b/ndctl/Makefile.am
> > @@ -16,7 +16,8 @@ ndctl_SOURCES = ndctl.c \
> > util/json-smart.c \
> > util/json-firmware.c \
> > inject-error.c \
> > - inject-smart.c
> > + inject-smart.c \
> > + monitor.c
> >
> > if ENABLE_DESTRUCTIVE
> > ndctl_SOURCES += ../test/blk_namespaces.c \ diff --git
> > a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym index
> > e939993..f64df56 100644
> > --- a/ndctl/lib/libndctl.sym
> > +++ b/ndctl/lib/libndctl.sym
> > @@ -366,4 +366,5 @@ global:
> > ndctl_namespace_inject_error2;
> > ndctl_namespace_uninject_error2;
> > ndctl_cmd_ars_stat_get_flag_overflow;
> > + ndctl_cmd_smart_get_event_flags;
> > } LIBNDCTL_15;
> > diff --git a/ndctl/lib/smart.c b/ndctl/lib/smart.c index
> > 0455252..90a65d0 100644
> > --- a/ndctl/lib/smart.c
> > +++ b/ndctl/lib/smart.c
> > @@ -101,6 +101,23 @@ NDCTL_EXPORT unsigned int
> > ndctl_cmd_smart_threshold_get_temperature(
> >
> > smart_cmd_op(smart_threshold_get_supported_alarms, unsigned int, 0);
> >
> > +NDCTL_EXPORT unsigned int ndctl_cmd_smart_get_event_flags(struct
> > +ndctl_cmd *cmd)
>
> My expectation for this ndctl_*_get_event_flags() api was to have it be:
>
> ndctl_dimm_get_event_flags()
>
> ...and with that in place get rid of the 'struct monitor_dimm' object.
> Push everything to be retrieved through api calls against a 'struct
ndctl_dimm' object.
> In other words, the usage of 'struct ndctl_cmd'
> should be hidden and all monitor operations should be done in terms of 'struct
> ndctl_dimm' helper calls.
>
Hi Dan,
Thanks for your comments.
In the v9 of monitor, I use the 'struct ndctl_cmd' object in the following
places:
ndctl_cmd_smart_get_flags(struct ndctl_cmd *cmd)
ndctl_cmd_smart_get_health(struct ndctl_cmd *cmd)
ndctl_cmd_smart_get_event_flags(struct ndctl_cmd *cmd)
Is it that you want to hide all of the 'struct ndctl_cmd' objects and add the
following 'struct ndctl_dimm' helper calls?
I'm primarily reacting to:
+struct monitor_dimm {
+ struct ndctl_dimm *dimm;
+ int health_eventfd;
+ unsigned int health;
+ unsigned int event_flags;
+ struct list_node list;
+};
Which is effectively duplicating ndctl_dimm internal data.
ndctl_dimm_get_flags(struct ndctl_dimm *dimm)
ndctl_dimm_get_health(struct ndctl_dimm *dimm)
ndctl_dimm_get_event_flags(struct ndctl_dimm *dimm)
I understand why we need ndctl_dimm_get_event_flags() since that tells
you what events have fired. Why do we need the other 2? If the event
has fired then the monitor proceeds to call util_dimm_health_to_json.
Is that not sufficient?