Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Elasticsearch uses Span2 internally #1651

Closed
wants to merge 5 commits into from
Closed

Conversation

codefromthecrypt
Copy link
Member

@codefromthecrypt codefromthecrypt commented Jul 12, 2017

This uses the span2 model from #1499 to implement Elasticsearch without using nested queries. This requires code from #1673

@codefromthecrypt
Copy link
Member Author

added a "shared" field to help with status quo where libraries on the client and server side usually share a span ID. #1499 (comment)

Made one of the converters, but lacking tests at the moment. Once there's 2-way conversion, I'll open another PR trying out ES or similar

@codefromthecrypt
Copy link
Member Author

lack of tests was driving me bonkers. Backfilled tests so that next change (which includes conversion code) is more sane

@codefromthecrypt
Copy link
Member Author

I've a working span converter, but still needs a lot more tests as there are many ways data can come in (and come in unexpectedly)

@codefromthecrypt
Copy link
Member Author

to move this forward, I'm first temporarily converting the internals of InMemorySpanStore to use SimpleSpan (and the converter). This will help break tests and show bugs before trying in elasticsearch.

@codefromthecrypt codefromthecrypt force-pushed the simplespan branch 2 times, most recently from e6f8d8f to 5b0d520 Compare July 19, 2017 14:23
@codefromthecrypt
Copy link
Member Author

ok have almost all storage tests working.. should finish it tomorrow and move to ES

@codefromthecrypt
Copy link
Member Author

update: all spanstore tests work, but some dependency linker tests don't probably missing some edge cases

@codefromthecrypt
Copy link
Member Author

ok almost there on the data conversion. a couple more tests to go

@codefromthecrypt codefromthecrypt force-pushed the simplespan branch 4 times, most recently from edf10fa to a1937ec Compare July 21, 2017 06:29
@codefromthecrypt
Copy link
Member Author

all tests are good with local span store.

I moved onto @openzipkin/elasticsearch, but ran into a glitch and need feedback on the idea or reverting to span.timestamp/duration (so that duration query is effortless) #1499 (comment)

@codefromthecrypt
Copy link
Member Author

tentatively changed from startTimestamp/finishTimestamp to timestamp/duration to see if elasticsearch becomes easier (almost certainly will)

@codefromthecrypt
Copy link
Member Author

@openzipkin/elasticsearch I added a commit for elasticsearch which uses the new format internally (no change to the format collectors use). Can I get a guinea pig to test this out by building this branch and throwing traffic at it? Bear in mind it requires ES 2.4+

@codefromthecrypt codefromthecrypt force-pushed the simplespan branch 2 times, most recently from e32a28e to 2b28827 Compare July 22, 2017 09:37
@codefromthecrypt
Copy link
Member Author

ps thanks @ImFlog and @anuraaga for the help getting ES to work. Would have been a lot harder without you!

@codefromthecrypt
Copy link
Member Author

FYI working on dependency link now. deferring ES as I expect that it might be effortful for folks to test a branch

@codefromthecrypt
Copy link
Member Author

dependency linking (killing DependencyLinkSpan for SimpleSpan) went well enough except one-way spans. I might have to re-jig the converters a bit to keep span store tests passing and also pass on dependency links. Glad I hit a snag as this stuff really needs to be solid.

@codefromthecrypt
Copy link
Member Author

cleanups all done and provably we won't need DependencyLinkSpan after this. We can use the same span type for input, storage, query and aggregation. To that end, I've raised a comment whether to rename SimpleSpan to zipkin2.Span.

#1499 (comment)

Will proceed in whichever direction following a couple people feeding back. Regardless, it will be chopping this PR into smaller pieces.

@jcarres-mdsol
Copy link
Contributor

What's the timeline? First changes to internal storage then API or all together?

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Jul 27, 2017 via email

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Jul 27, 2017 via email

@jcarres-mdsol
Copy link
Contributor

The flag is a good idea, would allow for incremental adoption. I would imagine that we just wipe out old data and store new in the new format. I think most users do not store the data for long anyways

@codefromthecrypt
Copy link
Member Author

@jcarres-mdsol I think I found a way to tease out a lot of the code safely.

If I start with swapping the internal DependencyLinkSpan for an internal copy of what's now SimpleSpan, we can get that code out there and make sure it works. Later will be a simple import change to formalize the type.

@codefromthecrypt
Copy link
Member Author

first thing splitting off is the span model type (no codec or converters) #1669

Adrian Cole added 3 commits July 27, 2017 18:58
This adds an internal copy of the new span data structure defined in
issue 1499. This is starting internal to ease review and allow
incremental progress. The first consumer will be dependency linking.
This adds an internal copy of a span conversion utility mentioned in
issue #1499. This is starting internal to ease review and allow
incremental progress. The first consumer will be dependency linking.
This adds an internal copy of a span json codec issue #1499. This starts
internal to ease review and allow incremental progress. The first
consumer will be Elasticsearch, as this format removes nested queries.

Note: this change also introduces json serialization of Span2, which
allows future use in Spark.
@codefromthecrypt codefromthecrypt changed the base branch from master to span2-link July 27, 2017 11:08
@codefromthecrypt codefromthecrypt changed the title WIP: simple json java type and codec WIP: Elasticsearch uses Span2 internally Jul 27, 2017
This drops the internal type of DependencyLinkSpan in favor of the Span2
type coming in #1499. Doing so now gives us practice, solves a few bugs
along the way. When Span2 becomes non-internal, the change will be a
simple package rename.
@codefromthecrypt
Copy link
Member Author

All the rejigging PRs are green. I plan to merge the ones leading up to this (Elasticsearch spike) tomorrow. If folks are keen on the new span kinds for producer/consumer, I'll add a PR for that.

Once the above is done, I'll look at how hard it is to make this have an internal flag to use the "span2" type conditionally (right now, this branch won't read old data only new data). If it is straightforward enough, I'll push that out with a minor bump adding instructions here on how to enable it.

In the mean time, anyone can try this branch on their own. It writes to the same indexes, but using the "span2" type and no "servicespan" index.

@codefromthecrypt
Copy link
Member Author

heh.. something's up as github won't let me re-open this

@codefromthecrypt
Copy link
Member Author

carried forward in #1674

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants