Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kafka 11 - record headers support #414

Merged
merged 9 commits into from
Nov 8, 2021

Conversation

habutre
Copy link
Contributor

@habutre habutre commented Aug 13, 2020

This PR is targeted to #397 to provide support for record headers provided since Kafka 0.11

The headers were implemented first on Kayrock and for testing purposes this branch points to Kayrock master. After a new Kayrock be released this PR can be changed and merged. This change is only for your review and validation.

To maintain compability while using the new client the Protocol.{Produce, Fetch}.Message had to be changed.

Only the new client is aware about such info on messages and may not cause conflicts or misbehavior

Out of scope: The mix format causes some changes that should be evaluate if make sense or have to be reverted

@habutre
Copy link
Contributor Author

habutre commented Aug 14, 2020

Hey @dantswain @jbruggem @joshuawscott @bjhaid I think you are the right guys to involve here to help me (let me know if someone might be added or removed from my mentions)

I am facing random errors with ci_tests.sh also with all_tests.sh locally, did you have some tips & tricks that I can apply to get rid of it and making the tests pass?

One can see in the CI tests each config has erros in different test cases and I am a bit lost to find a solution.

Thanks for any help.

@joshuawscott
Copy link
Member

When I released 0.11.0 I fixed most of the intermittent failures, I thought. I was able to run 20x in a row without any failures locally. That said, because Travis uses 1.5 core machines, it doesn't necessarily work the same. There does seem to be a completely consistent failure, though:

  1) test gzip compression - produce v3, fetch v0 (KafkaEx.KayrockRecordBatchTest)
     test/integration/kayrock/record_batch_test.exs:301
     ** (UndefinedFunctionError) function nil.value/0 is undefined
     code: assert message.value == msg
     stacktrace:
       nil.value()
       test/integration/kayrock/record_batch_test.exs:328: (test)
     The following output was logged:
     
     21:27:44.845 [debug] Successfully connected to broker "localhost":9093
     
     21:27:44.888 [debug] Successfully connected to broker "localhost":9094
     
     21:27:44.932 [debug] Successfully connected to broker "localhost":9095

Often a single failure will cause ripples due to using the same underlying brokers, so my suggestion would be to solve this failure and see if the others begin to work.

@joshuawscott joshuawscott self-requested a review August 19, 2020 17:41
@habutre
Copy link
Contributor Author

habutre commented Sep 11, 2020

Hey @dantswain I would really appreciate if you could take a look into my last commit that make the ci tests pass.
I am a bit affraid to change pre-existent tests to make all them pass.

@joshuawscott I think now you can review and if you can help to clarify about the offsets manipulation on kafka_ex/test/integration/kayrock/record_batch_test.exs will be awesome

ℹ️ Just a remind I am pointing the Kayrock to master and a release is necessary there before move forward with this PR

@joshuawscott
Copy link
Member

kayrock 0.1.12 is out now

@dantswain
Copy link
Collaborator

@habutre I apologize for not getting to this yet. I have been unbelievably busy :( I will try to make time.

Copy link
Collaborator

@dantswain dantswain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@habutre I think this all looks good in theory, but it's hard to be sure of what I'm reviewing with all of the mix format changes included. Maybe after #420 is merged you could merge into your branch and update this PR?

lib/kafka_ex/protocol/fetch.ex Outdated Show resolved Hide resolved
@habutre
Copy link
Contributor Author

habutre commented Nov 5, 2020

Is it possible to retry the CI? Checking the errors 1) timeout 2) failed to get deps

Copy link
Collaborator

@dantswain dantswain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good here. I kicked the build again and had one question about what seems like an unrelated change.

@@ -234,7 +287,7 @@ defmodule KafkaEx.KayrockRecordBatchTest do

fetch_responses =
KafkaEx.fetch(topic, 0,
offset: max(offset - 2, 0),
offset: offset,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember exactly why this was set this way - probably some kind of bizarre test behavior. Did you have a compelling reason to change it?

Copy link
Contributor Author

@habutre habutre Nov 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I tried to understand the reason to get a previous offset instead the one returned by the producer, but since it always worked before I kept is AS-IS. After I have introduced the headers the kayrock tests start failing randomly but always with the same reason nil message.

I watch the topic with Kafka tool so I figure out that the offset - 2 doesn't match with the expected message then I tried to remove the max(offset - 2, 0) and got success on every execution what make me think that was the reason and also no other test failed.

I don't have a rational about that behavior only that on my mind the offset should not be manipulated to get the desired message, it was just a recursively try-n-check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dantswain in case you're interested, here is the commit where you added this max(offset - 2, 0): 7197a90

I didn't see anything in particular to help me understand. IMO the new changed code makes more sense 🤷

@Argonus
Copy link
Contributor

Argonus commented May 31, 2021

@habutre @joshuawscott
Are we blocked by sth here? Maybe it's worth merging master to this PR & if everything is green merge it to the main branch.
If I can help somehow, please let me know.

@habutre
Copy link
Contributor Author

habutre commented May 31, 2021

@habutre @joshuawscott
Are we blocked by sth here? Maybe it's worth merging master to this PR & if everything is green merge it to the main branch.
If I can help somehow, please let me know.

Hey @Argonus this PR is ready for review and I am keeping as much as possible synced with latest merged PRs.

Copy link
Collaborator

@jbruggem jbruggem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'd rather have a second opinion though :)

@Argonus
Copy link
Contributor

Argonus commented Oct 29, 2021

@jbruggem @joshuawscott @dantswain
Can I help somehow with this PR?

Rogério Ramos and others added 7 commits November 1, 2021 17:09
To maintain compability while using the new client the message protocol
had to be changed.

Only the new client is aware about such info on messages
As usual I ran the command mix format that added changes to all these
files.

I don't know if run the `mix format` is a desired or common practice on
_KafkaEx_ contribution, so I left it in a separated commit that can be
reverted without big efforts
The offset resulting from the KafkaEx.produce sometimes was decreased
causing some msg not being found or retrieved correct

I don't have background to know why in the compressed msg testing the
offset was subtract by -2 `max(offset-2, 0)` but the tests only passed
when the retrieved offset was used to fetch messages
Copy link
Collaborator

@jbruggem jbruggem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM. @joshuawscott @dantswain could one of you cross-check these changes to make sure it makes sense ? I'd rather not approve alone.

lib/kafka_ex/protocol/produce.ex Outdated Show resolved Hide resolved
Copy link
Collaborator

@dantswain dantswain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeesh sorry for the delay here. @habutre can you check @jbruggem 's comment, but otherwise looks good to me. Thanks for the work on this and for your patience!

@jbruggem
Copy link
Collaborator

jbruggem commented Nov 2, 2021

@habutre do you wish to commit those last documentation suggestions ?

Once done, I'll merge.

habutre and others added 2 commits November 2, 2021 21:20
Co-authored-by: Jehan Bruggeman <jbruggem@users.noreply.github.com>
@Argonus
Copy link
Contributor

Argonus commented Nov 5, 2021

@jbruggem Maybe it's worth releasing a new version as well?
There would be two quite big things: support for snappy & support for headers

@jbruggem
Copy link
Collaborator

jbruggem commented Nov 8, 2021

@Argonus TBH I've never cut a release on KafkaEx before, so I'm not sure what the criteria are :).

@dantswain @joshuawscott would this be a good moment to make a release, and if so who can do it ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants