Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[probabilisticsamplerprocessor] Support consistent intermediate span sampling (OTEP 226) #22058

Closed
wants to merge 10 commits into from

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented May 17, 2023

Description: Implements a consistent form of tail sampling described in OTEP 226. Extents the probabilisticsamplerprocessor to demonstrate this and to use in drafting a specification.

As shown, this specification is an alternative to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md.

Link to tracking Issue: See open-telemetry/opentelemetry-specification#1826

Testing: The existing consistent probability sampler specification describes a deterministic test to ensure the sampler passes a chi-squared test; that test can be re-used. Existing tests that use HashSeed==0 are now tested using the newer logic. The existing test data uses 128 bits of random trace ID, so this is expected.

More testing is needed for the tracestate logic. The logs sampling logic in this sampler has not been updated for this prototype.

Documentation: TODO

@jmacd
Copy link
Contributor Author

jmacd commented May 24, 2023

@atoulme

@jmacd
Copy link
Contributor Author

jmacd commented May 25, 2023

Feedback from @kentquirk: Please do not hand-parse TraceState. I accept!

@jmacd
Copy link
Contributor Author

jmacd commented Jun 2, 2023

The latest commit 9010a67 has a regexp-based parser but still avoids allocations. I've updated the sampler processor to use the current tracestate-handling and draft-spec t-value and now also the s-value when using a HashSeed configuration.

I started looking into the tail sampler processor, with its many configurations. I suspect that will be harder to achieve, but we should expect to have well-defined adjusted counts from our samplers, and we can accomplish this.

@oertl
Copy link

oertl commented Jun 6, 2023

The latest commit 9010a67 has a regexp-based parser but still avoids allocations.

@jmacd I would be interested to know how costly parsing/encoding decimal-based t-values is compared to the (optional) r-value, which is hex encoded. I have some performance concerns, since some consistent samplers (e.g., rate-limiting or reservoir-based samplers) need to analyze the t-value of a span to adjust their own sampling probability.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 21, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Jul 5, 2023

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Jul 5, 2023
@jpkrohling
Copy link
Member

@jmacd, should this be reopened? Was this pending on me?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants