2.1.1 release includes SaslAuthenticationException regression #180

Closed
jamielwhite opened this issue Jun 20, 2024 · 16 comments

Comments

@jamielwhite

In #143, I reported an issue in which the first re-authentication failed for the OAUTHBEARER mechanism. That issue was resolved in release 2.0.3 for the case where awsRoleArn is provided. When I upgraded our apps from 2.0.3 to 2.1.1, I noticed they were restarting every hour due to a SaslAuthenticationException: Session too short error. The difference from #143 is that the failure now occurs after one hour, rather than after 15 minutes when the role first expires.

I reported this in a comment on a related issue for the case where awsRoleArn is not provided, but since that error also occurs in 2.0.3 I've created this as a separate issue.

```properties
security.protocol=SASL_SSL
sasl.mechanism=OAUTHBEARER
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required awsRoleArn="<role arn>" awsStsRegion="<region>";
sasl.login.callback.handler.class=software.amazon.msk.auth.iam.IAMOAuthBearerLoginCallbackHandler
```

@james1miller93

We're also seeing this with version 2.1.0. Looking back through the comments from @jamielwhite, it seems we're running our services in exactly the same way (pods in a Kubernetes cluster with IAM credentials assigned via IRSA roles). As mentioned above, the error is causing our apps to restart.

@sidyag
Contributor

sidyag commented Jul 8, 2024

I think this PR will fix this. Can you test with it? #182

@jamielwhite
Author

I'll test it out, but since we aren't using awsProfileName I'm not sure if that branch is going to be hit.

@sidyag
Contributor

sidyag commented Jul 8, 2024

Interesting. So the configuration is provided with awsRoleArn? Let me try to reproduce it that way.

@jamielwhite
Author

I tested the jar from the PR and confirmed it still results in the "Session too short" error after 1 hour with awsRoleArn.

@sidyag
Contributor

sidyag commented Jul 10, 2024

I added the fix for awsRoleArn in #183.

I think something similar is happening for default creds. Will try to fix that today.

@jamielwhite
Author

I pulled down the code from #183 and tried out the JAR, and unfortunately it still resulted in the same error after an hour.

@sidyag
Contributor

sidyag commented Jul 10, 2024

Is that with awsRoleArn or with default? The fix for default is on its way.

@jamielwhite
Author

It's with awsRoleArn. I inspected the jar and double-checked it includes the new asyncCredentialUpdateEnabled property being set to true.

@sidyag
Contributor

sidyag commented Jul 11, 2024

Okay. Found the issue. It is a documentation problem.

Try with the following configurations:

```properties
security.protocol=SASL_SSL
sasl.mechanism=OAUTHBEARER
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required awsRoleArn="<role arn>" awsStsRegion="<region>";
sasl.login.callback.handler.class=software.amazon.msk.auth.iam.IAMOAuthBearerLoginCallbackHandler
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMOAuthBearerLoginCallbackHandler
```
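
For clients that build their configuration in code rather than from a properties file, the same settings might be assembled roughly as follows. This is a minimal sketch, not code from the library: the class name `MskOAuthClientConfig` and the `build` helper are hypothetical, and the keys and values are copied from the configuration quoted above. It uses only `java.util.Properties`, so the result still has to be passed to whatever Kafka client constructor the application uses.

```java
import java.util.Properties;

public class MskOAuthClientConfig {
    // Handler class name as quoted in this thread; used for both the login
    // and the client callback handler settings.
    public static final String IAM_HANDLER =
            "software.amazon.msk.auth.iam.IAMOAuthBearerLoginCallbackHandler";

    /**
     * Hypothetical helper: builds the client properties from the thread.
     * roleArn and stsRegion are supplied by the caller.
     */
    public static Properties build(String roleArn, String stsRegion) {
        Properties props = new Properties();
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "OAUTHBEARER");
        props.setProperty("sasl.jaas.config",
                "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required "
                        + "awsRoleArn=\"" + roleArn + "\" awsStsRegion=\"" + stsRegion + "\";");
        props.setProperty("sasl.login.callback.handler.class", IAM_HANDLER);
        // The setting that was missing from the original configuration in this issue.
        props.setProperty("sasl.client.callback.handler.class", IAM_HANDLER);
        return props;
    }

    public static void main(String[] args) {
        // Placeholders kept as in the thread; substitute real values before use.
        Properties props = build("<role arn>", "<region>");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The only difference from the configuration in the original report is the extra `sasl.client.callback.handler.class` entry.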

@sidyag
Contributor

sidyag commented Jul 11, 2024

I have updated the docs and the default chain. Could you try again with mainline?

@jamielwhite
Author

jamielwhite commented Jul 11, 2024

With awsRoleArn and the addition of sasl.client.callback.handler.class, I'm able to run it for longer than 1 hour without any errors 🎉. I'll try out the default later today, but it will take longer to confirm since that error appeared after 2 hours.
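
Since the failure only shows up an hour or more after startup, a start-up check for the missing setting may be cheaper than waiting for the error. The following is a rough sketch of such a guard, assuming the application intends to use the IAM handler for OAUTHBEARER. The class `MskOAuthConfigCheck` and the method `missingHandlerKeys` are hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class MskOAuthConfigCheck {
    // Handler class name as quoted in this thread.
    public static final String IAM_HANDLER =
            "software.amazon.msk.auth.iam.IAMOAuthBearerLoginCallbackHandler";

    /**
     * Hypothetical guard: for an OAUTHBEARER configuration, returns the callback
     * handler keys that are not set to the IAM handler. Returns an empty list
     * for any other SASL mechanism.
     */
    public static List<String> missingHandlerKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        if (!"OAUTHBEARER".equals(props.getProperty("sasl.mechanism"))) {
            return missing;
        }
        String[] keys = {
            "sasl.login.callback.handler.class",
            "sasl.client.callback.handler.class"
        };
        for (String key : keys) {
            if (!IAM_HANDLER.equals(props.getProperty(key))) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Mirrors the configuration from the original report, which set only
        // the login callback handler.
        Properties props = new Properties();
        props.setProperty("sasl.mechanism", "OAUTHBEARER");
        props.setProperty("sasl.login.callback.handler.class", IAM_HANDLER);
        System.out.println(missingHandlerKeys(props));
        // prints "[sasl.client.callback.handler.class]"
    }
}
```

An application could log the returned keys or refuse to start when the list is non-empty; which of those is appropriate depends on the deployment.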

@jamielwhite
Author

The case without awsRoleArn is also working for longer than 2 hours now. Thanks for the fix and documentation!

@james1miller93

Thank you for your work on resolving this @sidyag.
Do you know when a release tag is likely to be published with the fix?

@sidyag
Contributor

sidyag commented Jul 17, 2024

I have published release 2.2.0. It should be available in the next 12-24 hours.

@sidyag closed this as completed Jul 17, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
