Add telemetry to bundler #17
Again, I'll leave the yaml config to someone who knows what it's doing but the code change looks ok.
OpenTelemetry looks widely used enough to be a good choice for the exporting side, but I'm not familiar enough with the downstream platforms to really comment on them.
What does Jaeger offer over something like Grafana that's used elsewhere at Diamond? Are we going to end up with an individual monitoring system for each service or is the aim that they're all going to be combined into a central system at some point?
@@ -12,6 +12,7 @@ services:
DATABASE_URL: mysql://root:rootpassword@ispyb/ispyb_build
BUNDLER_DATABASE_URL: mysql://root:rootpassword@ispyb/ispyb_build
BUNDLER_LOG_LEVEL: DEBUG
BUNDLER_OTEL_COLLECTOR_URL: http://collector:4317
Is there any way of parameterising these values rather than repeating them in several places/files?
`otelcol` & `jaeger` both support env var substitution in their configs, so I could put them in the `docker-compose`. Unfortunately `prometheus` doesn't.
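On the bundler side of the same question, here is a minimal sketch (an assumption, not the code in this PR) of resolving the collector endpoint in one place from `BUNDLER_OTEL_COLLECTOR_URL`, so the value only has to be set in the compose file. The function name `collector_url`, the injected `lookup` closure, and the fallback URL are all hypothetical.

```rust
/// Name of the variable set in the compose file in this PR.
const COLLECTOR_URL_VAR: &str = "BUNDLER_OTEL_COLLECTOR_URL";

/// Hypothetical fallback for local runs; 4317 is the conventional OTLP/gRPC port.
const DEFAULT_COLLECTOR_URL: &str = "http://localhost:4317";

/// Resolve the collector endpoint through `lookup`, treating a missing or
/// blank value as "not set" and falling back to the default.
///
/// `lookup` is injected so the logic can be exercised without touching the
/// real process environment.
pub fn collector_url(lookup: impl Fn(&str) -> Option<String>) -> String {
    lookup(COLLECTOR_URL_VAR)
        .map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
        .unwrap_or_else(|| DEFAULT_COLLECTOR_URL.to_string())
}
```

In a binary this could be called as `collector_url(|name| std::env::var(name).ok())`.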
Jaeger facilitates distributed tracing, as opposed to metrics visualisation. That is to say, it provides a view of 'events', each of which can be drilled down into to give a waterfall of the spans involved in that event and the logs captured in those spans, even across service boundaries. To make distributed tracing really powerful there needs to be a single instance which can correlate spans from disparate services, so I would envisage having one instance (a cluster for HA) which all services push their tracing data to.
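To illustrate why a single instance matters, here is a toy sketch of the correlation step: spans arriving from different services are grouped by a shared trace id and ordered by start time, which is the raw material for a waterfall view. This is not how Jaeger is implemented; the `Span` fields, the `correlate` function, and the service names used with it are hypothetical.

```rust
use std::collections::BTreeMap;

/// A heavily simplified span: real spans also carry span ids, parent ids,
/// durations, attributes and logs.
#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub trace_id: u64,
    pub service: &'static str,
    pub name: &'static str,
    pub start_ms: u64,
}

/// Group spans from any number of services by trace id, then order each
/// trace by start time. Only a backend that sees spans from every service
/// can build the complete trace.
pub fn correlate(spans: Vec<Span>) -> BTreeMap<u64, Vec<Span>> {
    let mut traces: BTreeMap<u64, Vec<Span>> = BTreeMap::new();
    for span in spans {
        traces.entry(span.trace_id).or_default().push(span);
    }
    for trace in traces.values_mut() {
        // Stable sort: spans with equal start times keep their arrival order.
        trace.sort_by_key(|span| span.start_ms);
    }
    traces
}
```

If each service pushed to its own backend, each backend would hold only a fragment of a trace and no single view could show the whole request.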
I thought Grafana offered tracing visualisation as well, but looking at it, it requires an external data source (such as Jaeger) to work. I have no preference one way or the other; I'm just not sure what each part is doing. Where does Prometheus fit in?
Jaeger and Prometheus serve effectively the same role: they act as a storage backend and query engine, for traces and metrics respectively. Grafana can then provide visualisations of this data.
If Peter is happy with the Rust, the devcontainer changes look sensible; presumably we have an equivalent standard logging deployment which handles the jaeger?
Nothing in the deployment yet, and there is no recommended good practice for how to do this. We are more or less using this project (and XChem) as the test bed for OTEL & Jaeger, with a view to spinning up some Diamond-wide infra to handle traces & metrics.
Either way, the monitoring deployment lives outside of the deployment it's monitoring, so that's a future problem.
Add telemetry (logs + metrics + traces) to bundler using `tracing` with export in OpenTelemetry format via `opentelemetry-otlp`.
Collecting & visualising results with: