Expansion of Encoding Interfaces and Addition of V2 #1227

Merged: 17 commits merged on Feb 2, 2022

Conversation

@joe-elliott joe-elliott commented Jan 13, 2022

What this PR does:
Expands the encoding interfaces added in #1211 and adds a v2 interface that can efficiently retrieve its start/end time range to improve search speeds. This PR also puts everything in place to fix the start/end time range of blocks referenced in #1175.

Marked as draft because this shouldn't be merged in 1.3. It also needs some cleanup.

UPDATE:
This PR is ready for review but I'm going to leave it draft until 1.3 is cut.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott joe-elliott marked this pull request as ready for review January 25, 2022 19:30
-  rpc PushBytes(PushBytesRequest) returns (PushResponse) {};
+  // different versions of PushBytes expect the trace data to be pushed in different formats
+  rpc PushBytes(PushBytesRequest) returns (PushResponse) {}; // ./pkg/model/v1
+  rpc PushBytesV2(PushBytesRequest) returns (PushResponse) {}; // ./pkg/model/v2
Contributor

Are you adding both versions here to allow handling of both versions at the same time?

Member Author

Yup. While the distributors transition to the new version, the ingesters will need to accept both v1 and v2.
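As an illustration of the point above, here is a minimal sketch of an ingester that accepts both endpoints during the transition. It is not the code from this PR: `pushRequest`, `decodeFunc`, `decodeV1`, `decodeV2`, and `push` are hypothetical names, v1 entries are assumed to be the bare marshalled trace, and v2 entries are assumed to carry the 8-byte start/end header discussed in the v2 encoding review thread.

```go
package main

import (
	"errors"
	"fmt"
)

// pushRequest is a hypothetical stand-in for the PushBytesRequest message.
type pushRequest struct {
	traces [][]byte
}

// decodeFunc extracts the marshalled trace payload from one pushed entry.
type decodeFunc func(entry []byte) ([]byte, error)

// decodeV1 assumes a v1 entry is just the marshalled trace.
func decodeV1(entry []byte) ([]byte, error) { return entry, nil }

// decodeV2 assumes a v2 entry is an 8-byte start/end header plus the trace.
func decodeV2(entry []byte) ([]byte, error) {
	if len(entry) < 8 {
		return nil, errors.New("v2 entry shorter than 8-byte start/end header")
	}
	return entry[8:], nil
}

// ingester exposes one method per wire version so old and new
// distributors can push to it at the same time.
type ingester struct{}

func (ingester) PushBytes(req pushRequest) ([][]byte, error)   { return push(req, decodeV1) }
func (ingester) PushBytesV2(req pushRequest) ([][]byte, error) { return push(req, decodeV2) }

// push decodes every entry with the decoder that matches the endpoint.
func push(req pushRequest, decode decodeFunc) ([][]byte, error) {
	payloads := make([][]byte, 0, len(req.traces))
	for _, entry := range req.traces {
		p, err := decode(entry)
		if err != nil {
			return nil, err
		}
		payloads = append(payloads, p)
	}
	return payloads, nil
}

func main() {
	var ing ingester
	v1, _ := ing.PushBytes(pushRequest{traces: [][]byte{[]byte("trace")}})
	v2, _ := ing.PushBytesV2(pushRequest{traces: [][]byte{append(make([]byte, 8), "trace"...)}})
	fmt.Println(string(v1[0]), string(v2[0]))
}
```

Keeping one method per version means the choice of decoder is made by the endpoint a distributor calls, so no version flag has to be added to the request itself.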

@@ -87,6 +89,14 @@ var (
	}, []string{discardReasonLabel, "tenant"})
)

// rebatchedTrace is used to more cleanly pass the set of data
type rebatchedTrace struct {
	id []byte
Contributor

Is id used? I see trace.id used, but maybe missed where id is used.

Member Author

Yup, it's used on line 333 where the distributor creates the batches it sends to the ingesters.


_ = buffer.EncodeFixed32(uint64(start)) // EncodeFixed32 can't return an error
_ = buffer.EncodeFixed32(uint64(end))
err := buffer.Marshal(pb)
Contributor

Is this marshal appending to the buffer? So the two uint64s are put in place, and then the variable-length trace bytes?

Member Author

Technically two uint32s, but yes. For some reason EncodeFixed32 takes a uint64.

Contributor

Right sorry, uint32.
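The layout described in this thread (two fixed-width uint32 values followed by the variable-length trace bytes) can be sketched with only the standard library. This is an illustration, not the PR's implementation: `encodeSegment` and `fastRange` are hypothetical names, and little-endian byte order is assumed because that is how protobuf encodes fixed32 values.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeSegment (hypothetical) writes start and end as two little-endian
// fixed32 values, then appends the already-marshalled trace bytes.
func encodeSegment(start, end uint32, payload []byte) []byte {
	out := make([]byte, 8, 8+len(payload))
	binary.LittleEndian.PutUint32(out[0:4], start)
	binary.LittleEndian.PutUint32(out[4:8], end)
	return append(out, payload...)
}

// fastRange (hypothetical) reads only the 8-byte header, so the time range
// is available without unmarshalling the trace that follows it.
func fastRange(segment []byte) (uint32, uint32, error) {
	if len(segment) < 8 {
		return 0, 0, errors.New("segment shorter than 8-byte start/end header")
	}
	start := binary.LittleEndian.Uint32(segment[0:4])
	end := binary.LittleEndian.Uint32(segment[4:8])
	return start, end, nil
}

func main() {
	seg := encodeSegment(1643068800, 1643068860, []byte("marshalled trace bytes"))
	start, end, err := fastRange(seg)
	if err != nil {
		panic(err)
	}
	fmt.Println("start:", start, "end:", end, "payload bytes:", len(seg)-8)
}
```

Because the header has a fixed width, a search can skip any entry whose range falls outside the query window after reading 8 bytes, which is the kind of saving the PR description refers to.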

Contributor

@mdisibio mdisibio left a comment

Overall this is looking good and should make it easier to support more data encodings in the future. Do you have any numbers on how much the new per-trace start/end improve search times?

I'm not sure I am understanding the difference between ObjectDecoder and SegmentDecoder. In V2 they have the same underlying format (uint32/uint32/[]byte), and similarities in PrepareForRead and Combine/ToObject. Could/should these be combined, or maybe some more comments/clarification why they are not? This might be a terminology/definition difference (see the other comment for more specifics)

joe-elliott commented Jan 26, 2022

Could/should these be combined, or maybe some more comments/clarification why they are not?

Initially I had them combined but found that more confusing. Having the SegmentDecoder handle the relationship between Distributor/Ingester and the ObjectDecoder handle the backend felt more natural to me.

For v2, the SegmentDecoder handles uint32/uint32/tempopb.Trace, and the ObjectDecoder handles uint32/uint32/tempopb.TraceBytes, which contains byte slices of marshalled tempopb.Traces. These are just askew enough that I felt they should be distinct. I would definitely take recommendations on consolidating if you have some thoughts.
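To make the distinction concrete, the sketch below combines several v2-style segments (each with its own start/end header and one trace payload) into a single object that carries one overall range plus the still-marshalled payloads. It is an assumption-laden illustration rather than the PR's code: `object`, `toObject`, and `segment` are hypothetical names, and the in-memory struct stands in for the serialized backend format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// object is a hypothetical in-memory stand-in for the backend format:
// one overall time range plus the still-marshalled trace payloads.
type object struct {
	start, end uint32
	traces     [][]byte
}

// toObject merges segments laid out as fixed32 start, fixed32 end, payload.
// The object's range is the earliest start and the latest end seen.
func toObject(segments ...[]byte) (object, error) {
	if len(segments) == 0 {
		return object{}, errors.New("no segments to combine")
	}
	var obj object
	for i, seg := range segments {
		if len(seg) < 8 {
			return object{}, fmt.Errorf("segment %d shorter than 8-byte header", i)
		}
		start := binary.LittleEndian.Uint32(seg[0:4])
		end := binary.LittleEndian.Uint32(seg[4:8])
		if i == 0 || start < obj.start {
			obj.start = start
		}
		if i == 0 || end > obj.end {
			obj.end = end
		}
		// Payloads are kept as opaque bytes; nothing is unmarshalled here.
		obj.traces = append(obj.traces, seg[8:])
	}
	return obj, nil
}

// segment builds a test segment in the assumed v2 layout.
func segment(start, end uint32, payload string) []byte {
	b := make([]byte, 8, 8+len(payload))
	binary.LittleEndian.PutUint32(b[0:4], start)
	binary.LittleEndian.PutUint32(b[4:8], end)
	return append(b, payload...)
}

func main() {
	obj, err := toObject(segment(10, 20, "first"), segment(5, 15, "second"))
	if err != nil {
		panic(err)
	}
	fmt.Println(obj.start, obj.end, len(obj.traces))
}
```

In this sketch the segment side deals with one trace per entry, while the object side deals with a list of byte slices under a single range, which mirrors the asymmetry described in the comment above.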

@joe-elliott joe-elliott mentioned this pull request Feb 1, 2022
3 tasks
@joe-elliott joe-elliott merged commit 668ae5e into grafana:main Feb 2, 2022