This repository has been archived by the owner on Jul 11, 2022. It is now read-only.

Make Metrics consistent with Go client. #129

Merged
merged 8 commits into jaegertracing:master from eundoosong:consistent_metrics on Mar 6, 2018

Conversation

eundoosong
Contributor

@eundoosong eundoosong commented Feb 24, 2018

Make Metrics consistent with Go client.

The metric names are now consistent with the Go client, as you can see in the Prometheus test output below.

  • In the sampler, add SamplerMetrics, which has the same 4 types of metrics as the Go client.
  • In the tracer, add spans_finished.

Resolves #102

Please review.

# HELP jaeger:traces jaeger:traces
# TYPE jaeger:traces counter
jaeger:traces{sampled="n",state="started"} 6.0
jaeger:traces{sampled="y",state="started"} 0.0
jaeger:traces{sampled="y",state="joined"} 0.0
jaeger:traces{sampled="n",state="joined"} 0.0
# HELP jaeger:sampler_updates jaeger:sampler_updates
# TYPE jaeger:sampler_updates counter
jaeger:sampler_updates{result="err"} 0.0
jaeger:sampler_updates{result="ok"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="2",minor="7",patchlevel="10",version="2.7.10"} 1.0
# HELP jaeger:sampler_queries jaeger:sampler_queries
# TYPE jaeger:sampler_queries counter
jaeger:sampler_queries{result="err"} 0.0
jaeger:sampler_queries{result="ok"} 0.0
# HELP jaeger:started_spans jaeger:started_spans
# TYPE jaeger:started_spans counter
jaeger:started_spans{sampled="n"} 12.0
jaeger:started_spans{sampled="y"} 0.0
# HELP jaeger:finished_spans jaeger:finished_spans
# TYPE jaeger:finished_spans counter
jaeger:finished_spans 0.0
# HELP jaeger:reporter_spans jaeger:reporter_spans
# TYPE jaeger:reporter_spans counter
jaeger:reporter_spans{result="dropped"} 0.0
jaeger:reporter_spans{result="socket_error"} 0.0
jaeger:reporter_spans{result="err"} 0.0
jaeger:reporter_spans{result="ok"} 0.0
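
For illustration only: the output above came from the author's own Prometheus factory (linked later in this thread). The sketch below simply shows how labeled counters with these names map onto prometheus_client objects; it is not the code used for the test.

from prometheus_client import Counter, generate_latest

# Labels become the {key="value"} pairs in the exposition output above.
traces = Counter('jaeger:traces', 'jaeger:traces', ['sampled', 'state'])
sampler_updates = Counter('jaeger:sampler_updates', 'jaeger:sampler_updates', ['result'])

# Produces a line like: jaeger:traces{sampled="n",state="started"} 6.0
traces.labels(sampled='n', state='started').inc(6)
sampler_updates.labels(result='ok').inc(0)

print(generate_latest().decode('utf-8'))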

Signed-off-by: eundoo-song eundoo.song@gmail.com

eundoo-song and others added 2 commits February 13, 2018 19:19
Signed-off-by: eundoo-song <eundoo.song@samsung.com>
Signed-off-by: Eundoo Song <eundoo.song@gmail.com>
@coveralls

coveralls commented Feb 24, 2018

Coverage Status

Coverage decreased (-0.008%) to 93.903% when pulling 3002240 on eundoosong:consistent_metrics into 9cbf95b on jaegertracing:master.

@eundoosong
Contributor Author

eundoosong commented Feb 24, 2018

It looks like the uncovered line (return PeriodicCallback) is not related to this change.
Does anyone know how to fix this?

@yurishkuro
Member

@eundoosong Sweet! Thanks for contributing. Don't worry about the coverage, it looks like a glitch.

@jpkrohling could you take a look, since you just did the same in Java?

Signed-off-by: Eundoo Song <eundoo.song@gmail.com>
@jpkrohling

LGTM. Looks like the Python version is missing a gauge (reporter_queue_length) and has an extra counter (jaeger:reporter_spans{'result': 'socket_error'}), but I think those differences are OK.

@eundoosong
Contributor Author

@jpkrohling Thanks for your review.
I think we can add the reporter_queue_length gauge just after the span batch submission, like this. Tornado provides an API for getting the current size of a queue:
http://www.tornadoweb.org/en/stable/_modules/tornado/queues.html#Queue.qsize

if spans:
    yield self._submit(spans)
    self.metrics.reporter_queue_length(self.queue.qsize())
    for _ in spans:
        self.queue.task_done()
    spans = spans[:0]

If it's ok, I'll update.

@jpkrohling

I'm not familiar with Tornado (or Python in general, for that matter), but looks OK to me.

Signed-off-by: Eundoo Song <eundoo.song@gmail.com>
@eundoosong
Contributor Author

@yurishkuro Is there any reason the reporter_queue_length gauge is missing in Python? Do you think it should be added as above?

@yurishkuro
Member

@eundoosong yes, please add it. I would only add it after that if statement, i.e. as the last statement of the while loop (or, alternatively, the first statement).

Re the jaeger:reporter_spans{'result': 'socket_error'} counter, I am not sure it's worth having it separate from the err counter, as it makes aggregations less obvious.
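
To make that placement concrete, here is a rough sketch of the consumer loop with the gauge update as the last statement of each iteration (attribute names like self._stopped are approximations, not the exact reporter code):

while not self._stopped:
    # ... drain spans from self.queue into the local `spans` batch ...
    if spans:
        yield self._submit(spans)
        for _ in spans:
            self.queue.task_done()
        spans = spans[:0]
    # refresh the gauge at the end of every iteration, whether or not
    # a batch was actually submitted
    self.metrics.reporter_queue_length(self.queue.qsize())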

replace socket_error by reporter failure counter.

Signed-off-by: Eundoo Song <eundoo.song@gmail.com>
@eundoosong
Contributor Author

@yurishkuro @jpkrohling reporter_socket has been removed and replaced by reporter_failure, and reporter_queue_length has been moved.
Is this ready to be merged, or is there anything else missing?


@jpkrohling jpkrohling left a comment


LGTM, but I can't judge the Python part :)

Member

@yurishkuro yurishkuro left a comment


One small issue with the sampler update metric:

        self.error_reporter.error(
            'Fail to update sampler '
            'from jaeger-agent: %s [%s]', e, response)

    def _update_adaptive_sampler(self, per_operation_strategies):
        if isinstance(self.sampler, AdaptiveSampler):
            self.sampler.update(per_operation_strategies)
            self.metrics.sampler_updated(1)
Member


This should always run, for the else: branch as well.
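
In other words, a sketch of the suggested shape (the else branch is abbreviated here; this is not the final diff):

def _update_adaptive_sampler(self, per_operation_strategies):
    if isinstance(self.sampler, AdaptiveSampler):
        self.sampler.update(per_operation_strategies)
    else:
        # the real code creates or replaces the adaptive sampler here;
        # details omitted in this sketch
        pass
    # count the update on every call, not only on the isinstance branch
    self.metrics.sampler_updated(1)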

…ondition

Signed-off-by: Eundoo Song <eundoo.song@gmail.com>
@eundoosong
Contributor Author

@yurishkuro moved the sampler_updated metric out of the condition.

@yurishkuro
Member

@eundoosong have you tried this in a real service with a real metrics library?

@eundoosong
Contributor Author

eundoosong commented Mar 6, 2018

@yurishkuro I tried this using my own Prometheus factory, as you can see in my first comment above.
code: eundoosong@78aa262#diff-08bb54ec260941e2becb5054785b7fa2
It was only for testing, but I'm planning to use this in a real service with Prometheus and Grafana.

@yurishkuro yurishkuro merged commit 8ddfcc1 into jaegertracing:master Mar 6, 2018
@yurishkuro
Member

Thanks, @eundoosong!

Do you think you can add a PR for PrometheusMetricsFactory? I think it would be great to have it in the Jaeger client. The only gotcha is that we don't want a runtime dependency on the prometheus client, so I would design the PrometheusMetricsFactory to import the prom client conditionally (and blow up if it's not found).
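
For example, a minimal sketch of that conditional import (class layout, namespace handling, and the create_counter/create_gauge signatures are assumptions here, not a finished design):

class PrometheusMetricsFactory(object):
    """Metrics factory backed by Prometheus; prometheus_client is imported
    lazily so it remains an optional dependency."""

    def __init__(self, namespace=''):
        try:
            from prometheus_client import Counter, Gauge
        except ImportError:
            raise RuntimeError(
                'prometheus_client is required to use PrometheusMetricsFactory')
        self._counter_cls = Counter
        self._gauge_cls = Gauge
        self._namespace = namespace
        self._cache = {}  # (cls, name, labelnames) -> metric object

    def _get_metric(self, cls, name, label_names):
        key = (cls, name, label_names)
        if key not in self._cache:
            self._cache[key] = cls(name, name, labelnames=label_names,
                                   namespace=self._namespace)
        return self._cache[key]

    def create_counter(self, name, tags=None):
        tags = tags or {}
        metric = self._get_metric(self._counter_cls, name, tuple(sorted(tags)))
        child = metric.labels(**tags) if tags else metric
        return lambda value: child.inc(value)

    def create_gauge(self, name, tags=None):
        tags = tags or {}
        metric = self._get_metric(self._gauge_cls, name, tuple(sorted(tags)))
        child = metric.labels(**tags) if tags else metric
        return lambda value: child.set(value)

Callers would then just pass an instance of this factory wherever the tracer expects a metrics factory; users who never construct it pay no import cost.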

@eundoosong
Contributor Author

@yurishkuro Sure, I can add PrometheusMetricsFactory. I'll create the PR after tidying the code and trying to remove the runtime dependency on the prometheus client.
I'd also love to contribute the same to Node.js, since our other main language for tracing clients is Node.js.
related issue: jaegertracing/jaeger-client-node#178
