-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement HTTP metrics for Connectors #6067
base: next
Are you sure you want to change the base?
Conversation
apollo-federation/src/sources/connect/validation/graphql/strings.rs
Outdated
Show resolved
Hide resolved
6459d7c
to
6de78c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example of config for instruments:
telemetry:
instrumentation:
instruments:
subgraph:
http.client.request.duration:
attributes:
subgraph.name: true
connector.source:
connector_source: name
subgraph.http.method: true
subgraph.graphql.operation.type: string
connector.url.template: true
specific.metrics.for.a.connector:
condition:
eq:
- service_kind: true
- connector
value:
connector_http_response_header: "x-ratelimit-remaining"
unit: count
type: histogram
description: "my desc"
We also have to think about testing other instrumentation elements like events and spans using the new selectors and attributes
@@ -264,6 +272,78 @@ impl DefaultForLevel for SubgraphAttributes { | |||
} | |||
} | |||
|
|||
#[derive(Deserialize, JsonSchema, Clone, Default, Debug)] | |||
#[serde(deny_unknown_fields, default)] | |||
pub(crate) struct ConnectorHttpAttributes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should put this in subgraph attributes. Let's only keep subgraph attributes for now, and as it's namespaced by connector. It will be enough, I also think we shouldn't have both connector.http.method
and subgraph.http.method
we should just keep the subgraph one for now. If people wants to know if it's a connector or a subgraph I would suggest adding in SubgraphSelelector
something like service_kind
which could be connector
or subgraph
. (I don't think service_kind
is the right naming but you got the idea).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue here is that the relationship between subgraph and connector is not 1-to-1. There can be a number of connectors on a given subgraph, and there can be multiple calls to each connector per GraphQL request. So a connector isn't really a "kind" of subgraph - the subgraph service lifecycle stage is not invoked for connectors. The lifecycle is really now like this:
------ subgraph service ------ http client service
|
fetch service --|
| ------- http client service
------ connector service ------|
------- http client service
The fetch service and connector service aren't currently exposed as plugin lifecycle hooks, but likely will be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm indeed, I didn't know we could have several http calls for a connector. Any idea @BrynCooke ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what's here is the way forward, and then we add http_client
service support to the telemetry plugin.
I know it's a big job, but it feels like the right thing to do.
@@ -81,6 +86,11 @@ pub(crate) struct InstrumentsConfig { | |||
SubgraphInstrumentsConfig, | |||
Instrument<SubgraphAttributes, SubgraphSelector, SubgraphValue>, | |||
>, | |||
/// Connector service instruments. For more information see documentation on Router lifecycle. | |||
pub(crate) connector: Extendable< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, I think everything should be in subgraph for now. To keep only subgraph. Because I think usually people will want to have the same attributes even if it's a connector or subgraph (if not they would still be able to know using the selector and a condition to add specific attributes or metrics if it's a connector for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another issue here is that HTTP connectors are only one kind of connector - there will be more. So unlike subgraphs, where every request is HTTP, we may have connector requests that are SQL, gRPC, etc. As we go forward with these, sharing attributes between connectors and subgraph would make less and less sense, since the attributes would never apply to a subgraph.
} | ||
} | ||
|
||
pub(crate) fn new_builtin_connector_instruments(&self) -> HashMap<String, StaticInstrument> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one is basically the same than subgraph instruments so probably one more reason to merge everything in subgraph
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The request and response types limit things here. You can't have attributes or selectors that are for both subgraph request/response and http request/response - they can only be generic over one type of request/response. That limits how things can be arranged.
@@ -3569,4 +3852,203 @@ mod test { | |||
Some("default".into()) | |||
); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need tests including both subgraph calls and connectors calls to make sure it works properly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test with both connectors and subgraph, let me know if it's sufficient.
cc @BrynCooke I need your review here too |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this is looking good. Can we pull the connectors stuff out into a separate module though rather than adding to the already large attributes.rs
instruments.rs
and selectors.rs
We should eventually be aiming for a separate module for each router pipeline stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just needs more tests for instruments but also for events and spans, feel free to ask if you need help or if you don't find these existing tests
const TEST_SUBGRAPH_NAME: &str = "test_subgraph_name"; | ||
const TEST_SOURCE_NAME: &str = "test_source_name"; | ||
const TEST_URL_TEMPLATE: &str = "/test"; | ||
const TEST_HEADER_NAME: &str = "test_header_name"; | ||
const TEST_HEADER_VALUE: &str = "test_header_value"; | ||
const TEST_STATIC: &str = "test_static"; | ||
|
||
#[fixture] | ||
fn connector_info() -> ConnectorInfo { | ||
ConnectorInfo { | ||
subgraph_name: TEST_SUBGRAPH_NAME.to_string(), | ||
source_name: Some(TEST_SOURCE_NAME.to_string()), | ||
http_method: HTTPMethod::Get.as_str().to_string(), | ||
url_template: TEST_URL_TEMPLATE.to_string(), | ||
} | ||
} | ||
|
||
#[fixture] | ||
fn context(connector_info: ConnectorInfo) -> RouterContext { | ||
let context = RouterContext::default(); | ||
context | ||
.insert(CONNECTOR_INFO_CONTEXT_KEY, connector_info) | ||
.unwrap(); | ||
context | ||
} | ||
|
||
#[fixture] | ||
fn http_request(context: RouterContext) -> HttpRequest { | ||
HttpRequest { | ||
http_request: http::Request::builder().body("".into()).unwrap(), | ||
context, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel confident introducing this technique to debug. We try as much as we can to lower the compilation time and if we introduce macros in our tests it will increase the time. Could you try to use the same kind of tests we did before. If it's something you would like to introduce in the router codebase then I would suggest to talk about this during one of our router arch meeting because we need opinion from the team. Also we have yaml based tests in apollo-router/src/plugins/telemetry/config_new/fixtures
Could you add more tests directly there please ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel confident introducing this technique to debug. We try as much as we can to lower the compilation time and if we introduce macros in our tests it will increase the time. Could you try to use the same kind of tests we did before. If it's something you would like to introduce in the router codebase then I would suggest to talk about this during one of our router arch meeting because we need opinion from the team. Also we have yaml based tests in
apollo-router/src/plugins/telemetry/config_new/fixtures
Could you add more tests directly there please ?
I replaced the use of rstest
with just plain unit tests, and added more tests in fixtures
.
ae88b1b
to
970b75d
Compare
I've added support for connectors HTTP events, and added tests for that in the places I found existing tests for other events. Please let me know if I didn't find the tests you were looking for. Spans are another matter - I haven't added support for spans in this PR. The HTTP client service isn't really set up for this - it creates the @bnjjj - let me know if this sounds ok, and if the tests I added are what you were looking for |
@BrynCooke - I moved connectors related things out to a |
970b75d
to
0a6ba87
Compare
✅ Docs Preview ReadyNo new or changed pages found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, could you also open an issue explaining the connector selectors are not already supported for spans link it directly in this PR please ? Also could be great to document that limitation in selectors docs and spans docs probably.
apollo-router/src/plugins/telemetry/config_new/connector/http/events.rs
Outdated
Show resolved
Hide resolved
apollo-router/src/plugins/telemetry/config_new/connector/http/events.rs
Outdated
Show resolved
Hide resolved
docs/source/configuration/telemetry/instrumentation/selectors.mdx
Outdated
Show resolved
Hide resolved
docs/source/configuration/telemetry/instrumentation/selectors.mdx
Outdated
Show resolved
Hide resolved
3801f57
to
994d598
Compare
Add standard metrics for connectors HTTP requests and responses:
http.client.request.duration
http.client.request.body.size
http.client.response.body.size
Selectors are defined for:
subgraph_name
connector_source
connector_http_request_header
connector_http_response_header
connector_http_response_status
connector_http_method
connector_url_template
static
The implementation uses the HTTP client service, as a single call to the connector service can result in multiple HTTP calls. The metrics make more sense at the HTTP client level anyway. However, since the instrumentation types are generic over a single request/response type, this means we'll be unable to have any other instruments at the connector service level that use the same selectors. However, that should be fine as the interesting data is at the HTTP level for connectors.
Since the HTTP client service is not public, this required making the telemetry plugin private. This in turn required some small changes to the plugin test harness, which was previously incompatible with private plugins.
Planning ahead for future Connectors beyond the HTTP REST API connector, the type names include
Http
where appropriate so that additional types can be created in the future without conflict or confusion. For example, in addition toConnectorHttpSelector
, there might eventually be aConnectorGrpcSelector
. These need to be different types, because theSelector
implementation has associated request and response types. These areHttpRequest
andHttpResponse
for the HTTP implementation but would be something else for gRPC.Additionally, the configuration settings are under
http
underconnectors
, to accommodate future configuration.Examples
Add a metric for the number of 404 response codes from a particular connector source API:
Add connector attributes to the
http.client.request.duration
instrument:Create a histogram of the remaining rate limit from an API based on a response header value:
It's also now possible to configure events for connectors:
Checklist
Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.
Exceptions
Notes
Footnotes
It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this. ↩
Configuration is an important part of many changes. Where applicable please try to document configuration examples. ↩
Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions. ↩