Processor & Aggregator Plugin Support #1726

sparrc · 2016-09-08T10:25:00Z

EDIT: the proposed Aggregator interface has changed, see #1726 (comment)

Proposal:

Processor & aggregator plugins will be new types of plugins that sit between input and output plugins.

If there are processors or aggregators defined in the config, then all metrics will pass through them before being passed onto the output plugins.

processor plugins will generically support matching based on (with globbing):

tag key/value
measurement name
field keys

aggregator plugins will generically support matching based on (with globbing):

tag key/value
measurement name
field keys
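
As a rough illustration of the matching described above, here is a minimal sketch. It assumes the standard library's `path.Match` as the glob engine and a hypothetical `Filter` struct with a simplified `Metric` stand-in; none of these names come from the proposal. For brevity, tag keys are compared exactly and only tag values are globbed.

```go
package main

import (
	"fmt"
	"path"
)

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct {
	Name   string
	Tags   map[string]string
	Fields map[string]interface{}
}

// Filter is a hypothetical matcher; an empty pattern matches everything.
type Filter struct {
	NamePattern  string            // glob on the measurement name
	TagPatterns  map[string]string // exact tag key -> glob on the tag value
	FieldPattern string            // glob that at least one field key must match
}

// globMatch reports whether s matches pattern; malformed patterns never match.
func globMatch(pattern, s string) bool {
	ok, err := path.Match(pattern, s)
	return err == nil && ok
}

// Match reports whether m satisfies every non-empty criterion of f.
func (f Filter) Match(m Metric) bool {
	if f.NamePattern != "" && !globMatch(f.NamePattern, m.Name) {
		return false
	}
	for key, pattern := range f.TagPatterns {
		value, ok := m.Tags[key]
		if !ok || !globMatch(pattern, value) {
			return false
		}
	}
	if f.FieldPattern != "" {
		found := false
		for key := range m.Fields {
			if globMatch(f.FieldPattern, key) {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	m := Metric{
		Name:   "cpu",
		Tags:   map[string]string{"host": "web01"},
		Fields: map[string]interface{}{"usage_idle": 92.5},
	}
	f := Filter{
		NamePattern:  "c*",
		TagPatterns:  map[string]string{"host": "web*"},
		FieldPattern: "usage_*",
	}
	fmt.Println(f.Match(m)) // prints "true"
}
```

Since both plugin types list the same three criteria, a single matcher of this kind could presumably be shared between processors and aggregators.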

An initial implementation has been written by @alimousazy in this PR: #1364, but I would like to consider here the structure & interface of processor plugins independent of the histogram/aggregator feature.

My proposal for the processor interface differs a bit from that PR. While that PR presents an interesting way of streaming metrics through multiple channels, it also raises an important question of how large to create each channel, which could greatly increase the total possible buffer size of telegraf.

Channels are great for multiple processes to run concurrently and aggregate their work in one place, but this is not actually the workflow of a processor plugin. For each metric that comes from the input plugins, each processor will need to be applied, and after all processors are applied the metric(s) will be passed onto the aggregator plugin(s) & output plugin(s).

The original metric will therefore get sent directly to the output plugins, while the aggregator plugins are free to process the metric as they need, adding their metrics to their accumulator as they need.

type Processor interface {
    // SampleConfig returns the default configuration of the Input
    SampleConfig() string

    // Description returns a one-sentence description on the Input
    Description() string

    // Apply the processor to the given metric
    Apply(in ...telegraf.Metric) []telegraf.Metric
}

type Aggregator interface {
    // SampleConfig returns the default configuration of the Input
    SampleConfig() string

    // Description returns a one-sentence description on the Input
    Description() string

    // Apply the metric to the aggregator
    Apply(in telegraf.Metric)

    // Start starts the aggregator
    Start(acc telegraf.Accumulator)
    Stop()
}
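
To make the channel-free workflow concrete, here is a minimal sketch of applying processors one after another. It assumes a simplified `Metric` stand-in for `telegraf.Metric` and mirrors only the `Apply` method of the proposed interface; `tagger`, `dropper` and `applyAll` are hypothetical names used for illustration.

```go
package main

import "fmt"

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct {
	Name   string
	Tags   map[string]string
	Fields map[string]interface{}
}

// Processor mirrors only the Apply method of the proposed interface.
type Processor interface {
	Apply(in ...Metric) []Metric
}

// tagger adds a fixed tag to every metric (hypothetical example plugin).
type tagger struct{ key, value string }

func (t tagger) Apply(in ...Metric) []Metric {
	for i := range in {
		if in[i].Tags == nil {
			in[i].Tags = make(map[string]string)
		}
		in[i].Tags[t.key] = t.value
	}
	return in
}

// dropper discards metrics with a given measurement name
// (hypothetical example plugin).
type dropper struct{ name string }

func (d dropper) Apply(in ...Metric) []Metric {
	out := make([]Metric, 0, len(in))
	for _, m := range in {
		if m.Name != d.name {
			out = append(out, m)
		}
	}
	return out
}

// applyAll runs the processors in slice order, feeding each one the
// output of the previous one. No channels are involved.
func applyAll(procs []Processor, in ...Metric) []Metric {
	out := in
	for _, p := range procs {
		out = p.Apply(out...)
	}
	return out
}

func main() {
	procs := []Processor{
		dropper{name: "mem"},
		tagger{key: "dc", value: "us-east"},
	}
	out := applyAll(procs, Metric{Name: "cpu"}, Metric{Name: "mem"})
	for _, m := range out {
		fmt.Println(m.Name, m.Tags["dc"]) // prints "cpu us-east"
	}
}
```

Because `Apply` accepts and returns a list, a processor can drop a metric (return fewer) or emit several (return more). In this sketch the order of application is simply the order of the slice.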

Use case: [Why is this important (helps with prioritizing requests)]

some of the uses of these plugins:

dropping metrics
aggregating metrics
adding & removing tags
adding & removing fields
modifying fields, measurement names, tags, etc.

Open Questions:

Ordering: how do we deal with ordering of processors? Do we need to support an argument for users to manually order the plugins? Or can we rely on the configuration file to provide the order for us?
Allocations: what effect are processor plugins going to have on allocations?

alimousazy · 2016-09-08T21:11:15Z

Hi,

I totally agree that channels will increase complexity in terms of memory usage and execution, but I have a question regarding the proposed design. Some of these filters may need a trigger other than metric arrival; for example, a histogram may need to flush data every minute (aggregated data). How can we handle that? Another thing: a filter's mapping of input to output metrics is not always one-to-one. For example, histogram or dropping filters may decide not to pass a metric on. Another case: when a filter flushes metrics (e.g. a histogram), it might return multiple metrics instead of one.

sparrc · 2016-09-09T12:08:01Z

> Some of these filters may need a trigger other than metric arrival; for example, a histogram may need to flush data every minute (aggregated data). How can we handle that?

That's a good point, I think it might be necessary to define two types of plugins: filters and aggregators. Aggregators would behave sort of like a "service filter" where they have continuous access to an output channel.

I'll come up with a design overview for this soon.

> Another thing: a filter's mapping of input to output metrics is not always one-to-one. For example, histogram or dropping filters may decide not to pass a metric on. Another case: when a filter flushes metrics (e.g. a histogram), it might return multiple metrics instead of one.

agreed, I have updated the Apply function to reflect this (accepting and returning lists)

sparrc · 2016-09-09T12:28:24Z

~~Updated design, this is to take into account the need for two different types of plugins: filters & aggregators.~~

alimousazy · 2016-09-09T21:27:11Z

While I do feel that this model will solve flushing metrics from active-state components, active-state filters should still be considered filters, and their output should go through the other filters based on order. In the suggested design, active-state filters that flush metrics separately will bypass the other filters and push metrics directly to the output plugins. It came to my mind that the Apply pattern we are using looks similar to a channel with no buffer, if you ignore the cost of creating a channel in terms of memory and functionality, while I still feel that a channel is overkill for such functionality.

sparrc · 2016-09-09T23:32:50Z

@alimousazy I don't quite understand what you're suggesting, just eliminating the channel directly before the outputs? That channel shouldn't need to have a large buffer as it will have a goroutine constantly reading off of it.

alimousazy · 2016-09-09T23:35:57Z

@sparrc what I meant is that since Histogram will emit metrics every minute, these metrics should also pass through other filters such as a drop filter, etc. Based on my understanding of the latest design you proposed, Histogram metrics will go directly to the output plugins.

sparrc · 2016-09-09T23:41:57Z

yes, correct, the metrics coming from the aggregators would have the same fields as the metrics they are aggregating

alimousazy · 2016-09-12T03:57:46Z

Nice ideas for filters https://hekad.readthedocs.io/en/v0.10.0/config/filters/index.html#config-filters.

I would like to work on a Lua sandbox for input plugins, filters and output plugins; this will make it easier for the end user to load different kinds of plugins at run time without needing to implement them in Go.

sparrc · 2016-09-12T09:25:08Z

I actually don't like the concept of making filter plugins that are specific to any input or output plugin. The filter plugins should perform generic tasks on any metric passing through, rather than being specifically defined for only a single type of plugin.

If you'd like to work on a Lua sandbox please open a separate issue where we can discuss the design of that. I'm not 100% convinced it's necessary, to me it seems like we can serve non-Go needs with the exec plugin, but I'd like to discuss in case there is a compelling case for it.

sparrc · 2016-09-12T13:57:26Z

@alimousazy Any more thoughts on the design I've outlined above? I know that you've raised the issue that aggregated metrics don't get passed onto the filter.

In my view, since aggregations will be "opt-in" metrics, there is no need to further pass them on to the filter plugins, it will be sufficient for the aggregators to create output metrics based on the incoming metrics that have already been filtered.

alimousazy · 2016-09-12T17:16:34Z

I still feel that aggregated data should pass through the other filters. Let me give you some use cases:

1- A bandwidth limit filter should be able to limit the number of emitted metrics regardless of whether they are aggregated or not; therefore aggregated metrics should pass through that filter.

2- A metrics shaping filter should be able to rename metrics regardless of whether they come from an aggregator or a normal input plugin.

3- An anomaly detection filter should be able to work on aggregated data as well as normal data.

These are the use cases that I was able to recall.

@sparrc

sparrc · 2016-09-13T13:15:08Z

fair enough, we'll probably need to make a separate AggregatorAccumulator for adding aggregated metrics to the first metric channel. That accumulator will have some way of marking a metric as an "aggregate", so that after it passes thru the filters it does not get re-sent to the aggregators, so flow would look like this:

 ┌───────────┐                                       ┌───────────┐                        ┌───────────┐
 │  inputs   │                                       │processors │                        │  outputs  │
 │   .cpu    │────┐                          ┌──────▶│  .tagger  │                     ┌─▶│ .influxdb │
 │           │    │                          │       │           │                     │  │           │
 └───────────┘    │                          │       └───────────┘                     │  └───────────┘
                  │    ┌─────────────┐       │             │           ┌─────────────┐ │
                  ├───▶│ metric chan │───────┘             │        ┌─▶│ metric chan │─┤
                  │    └─────────────┘                     ▼        │  └─────────────┘ │
 ┌───────────┐    │           ▲                      ┌───────────┐  │                  │  ┌───────────┐
 │  inputs   │    │           │                      │processors │  │                  │  │  outputs  │
 │   .mem    │────┘           │                      │ .renamer  │──┤                  └─▶│   .file   │
 │           │         original metric               │           │  │                     │           │
 └───────────┘       ┌─────not sent─────┐            └───────────┘  │                     └───────────┘
                     │                  │                           │
                     │                  │                           │
                     │                  │                   aggregate metrics
                     │                  │                       not sent
                     │                  │                           │
               ┌───────────┐      ┌───────────┐                     │
               │aggregators│      │aggregators│                     │
               │.histogram │      │   .min    │◀────────────────────┤
               │           │      │           │                     │
               └───────────┘      └───────────┘                     │
                     ▲                                              │
                     │                                              │
                     └──────────────────────────────────────────────┘
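
The routing rule in the diagram could be sketched roughly as follows. This assumes a hypothetical `IsAggregate` flag on a simplified `Metric` stand-in; the comment above only says the accumulator "will have some way of marking a metric", so the flag and the `dispatch` helper are illustrative assumptions, not the actual mechanism.

```go
package main

import "fmt"

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct {
	Name        string
	Fields      map[string]float64
	IsAggregate bool // hypothetical marker set by the aggregator's accumulator
}

// dispatch is called with metrics that have already passed through the
// processors. Every metric goes to the outputs, but only original
// (non-aggregate) metrics are handed to the aggregators, so aggregates
// are never re-aggregated.
func dispatch(processed []Metric, toAggregators, toOutputs func(Metric)) {
	for _, m := range processed {
		if !m.IsAggregate {
			toAggregators(m)
		}
		toOutputs(m)
	}
}

func main() {
	var aggSeen, outSeen []string
	dispatch(
		[]Metric{
			{Name: "cpu"},
			{Name: "cpu_min", IsAggregate: true},
		},
		func(m Metric) { aggSeen = append(aggSeen, m.Name) },
		func(m Metric) { outSeen = append(outSeen, m.Name) },
	)
	fmt.Println(aggSeen, outSeen) // prints "[cpu] [cpu cpu_min]"
}
```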

sparrc · 2016-09-13T17:29:06Z

@alimousazy fyi I've renamed "filters" to "processors" because it was a bit of name overload. Filters already refer to the metric filter options that users can apply to plugins (tagdrop, tagpass, tagexclude, taginclude, etc.).

sparrc · 2016-09-27T15:26:10Z

Updating the Aggregator interface due to some internal discussions and implementation details that came up.

We have decided that it would be best if all aggregator plugins were required to have a period parameter which allows the user to configure how large the bucket of each aggregation is. This is going to be a required argument, with a default value of 30s.

Because of this, the flushing of aggregator plugins can be done outside of the plugin itself, and thus the interface will be simplified to not require Start/Stop functions. Instead, the Push function pushes the current aggregated metrics to the given accumulator (similar to the input plugin Gather(acc) function). The Reset() function resets the aggregator's internal buffers and starts counting new aggregations.

Because of the way the plugin is handled, no locking needs to be done inside it; the Reset/Push/Add functions will never conflict with one another.

type Aggregator interface {
    // SampleConfig returns the default configuration of the Input.
    SampleConfig() string

    // Description returns a one-sentence description on the Input.
    Description() string

    // Add the metric to the aggregator.
    Add(in Metric)

    // Push pushes the current aggregates to the accumulator.
    Push(acc Accumulator)

    // Reset resets the aggregator's caches and aggregates.
    Reset()
}
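
As an illustration of the revised interface, here is a minimal sketch of a `min` aggregator. `Metric` and `Accumulator` are simplified, hypothetical stand-ins for the telegraf types (the real accumulator has a richer signature), and `recorder` and `flush` are hypothetical helpers. The sketch assumes the caller runs `flush` once per `period`; here it is called directly so the example stays deterministic.

```go
package main

import "fmt"

// Metric is a simplified, hypothetical stand-in for telegraf.Metric.
type Metric struct {
	Name   string
	Fields map[string]float64
}

// Accumulator is a simplified, hypothetical stand-in for telegraf.Accumulator.
type Accumulator interface {
	AddFields(name string, fields map[string]float64)
}

// Aggregator mirrors Add, Push and Reset from the revised interface.
type Aggregator interface {
	Add(in Metric)
	Push(acc Accumulator)
	Reset()
}

// minAggregator tracks the minimum of each field per measurement.
type minAggregator struct {
	mins map[string]map[string]float64
}

func newMinAggregator() *minAggregator {
	a := &minAggregator{}
	a.Reset()
	return a
}

// Add folds one metric into the running minimums.
func (a *minAggregator) Add(in Metric) {
	fields, ok := a.mins[in.Name]
	if !ok {
		fields = make(map[string]float64)
		a.mins[in.Name] = fields
	}
	for k, v := range in.Fields {
		if cur, seen := fields[k]; !seen || v < cur {
			fields[k] = v
		}
	}
}

// Push writes the current minimums to the accumulator as "<field>_min".
func (a *minAggregator) Push(acc Accumulator) {
	for name, fields := range a.mins {
		out := make(map[string]float64, len(fields))
		for k, v := range fields {
			out[k+"_min"] = v
		}
		acc.AddFields(name, out)
	}
}

// Reset clears the internal state so a new period starts empty.
func (a *minAggregator) Reset() {
	a.mins = make(map[string]map[string]float64)
}

// recorder is a toy accumulator that remembers what was pushed.
type recorder struct{ pushed []Metric }

func (r *recorder) AddFields(name string, fields map[string]float64) {
	r.pushed = append(r.pushed, Metric{Name: name, Fields: fields})
}

// flush is what the caller would run at the end of each period.
// The plugin itself needs no timer and no Start/Stop.
func flush(a Aggregator, acc Accumulator) {
	a.Push(acc)
	a.Reset()
}

func main() {
	agg := newMinAggregator()
	rec := &recorder{}
	agg.Add(Metric{Name: "cpu", Fields: map[string]float64{"usage": 40}})
	agg.Add(Metric{Name: "cpu", Fields: map[string]float64{"usage": 25}})
	flush(agg, rec)
	fmt.Println(rec.pushed[0].Name, rec.pushed[0].Fields["usage_min"]) // prints "cpu 25"
}
```

Keeping the timer outside the plugin means the plugin only has to maintain its buckets; as long as the caller invokes Add, Push and Reset from a single goroutine, the plugin needs no locks.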

sparrc added the feature request label Sep 8, 2016

sparrc added this to the 1.1.0 milestone Sep 8, 2016

sparrc mentioned this issue Sep 8, 2016

"Histogram" statistics aggregator plugin #1662

Closed

sparrc added commits that referenced this issue:

Sep 8, 2016: Support Filter Plugins (9149ebd, f718c26, 1f02fa0, c7f5dbe, f3a3447)
Sep 12, 2016: Support Filter Plugins (739e72a); Support Filter & Aggregator Plugins (53a44e3, afb7cb7)
Sep 13, 2016: Support Filter & Aggregator Plugins (b14eade, b3600c7); Support Processor & Aggregator Plugins (41f55e5)

sparrc changed the title ~~Filter Plugin Support~~ Processor & Aggregator Plugin Support Sep 13, 2016

sparrc added commits that referenced this issue:

Sep 13, 2016: Support Processor & Aggregator Plugins (87a7b99, 0e1b490)

sparrc mentioned this issue Sep 16, 2016

Support Processor & Aggregator Plugins #1777

Merged

13 tasks

sparrc added commits that referenced this issue, all titled "Support Processor & Aggregator Plugins":

Sep 16, 2016: 0e8a3f1, 7f2825f
Sep 19, 2016: bf6bd0a, ee770f6, 27a8180, 75facef, 2c15327, 3118f8b, e1fd7bc, 911f92e
Sep 20, 2016: 3e6f0bd, e56b548
Sep 21, 2016: 96319d9
Sep 23, 2016: 160d667, 1cf2988
Sep 27, 2016: d700525
Sep 28, 2016: 1e40451, 8890a4c
Oct 3, 2016: e72559c
Oct 4, 2016: 6018813
Oct 5, 2016: 2fefda1
Oct 6, 2016: 725fa41
Oct 7, 2016: cd6597f
Oct 10, 2016: 984a814
Oct 12, 2016: 64a7126

sparrc closed this as completed in #1777 Oct 12, 2016

dylanmei mentioned this issue Jan 12, 2017

Jolokia input plugin should extract tags #2014

Closed

danielnelson mentioned this issue Mar 24, 2017

Create static fields parsing log files #2564

Closed
