Fix #1198: handle github ratelimit #1266
Check the remaining rate limit and sleep if remaining is below threshold
:param token: The Github API token as string.
'''
response = requests.get('https://api.github.com/rate_limit', headers={'Authorization': f"token {token}"})
Rather than add another call, would it be possible to use the rateLimit
object in the graphql response object itself?
cartography/cartography/intel/github/teams.py
Lines 91 to 96 in f3f237c
rateLimit {
    limit
    cost
    remaining
    resetAt
}
This would let us save an API call for each page.
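For illustration, reading the `rateLimit` object out of an already-parsed GraphQL response could look like the sketch below. The function name `graphql_wait_seconds`, the `data.rateLimit` response shape, and the default threshold are assumptions made for this example and are not taken from the PR.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def graphql_wait_seconds(
    response_json: Dict[str, Any],
    threshold: int = 500,
    now: Optional[datetime] = None,
) -> float:
    """Return seconds to wait, based on the rateLimit object of a GraphQL response.

    Assumes the query requested `rateLimit { limit cost remaining resetAt }`
    at the root, so the values appear under response_json["data"]["rateLimit"].
    """
    rate = response_json["data"]["rateLimit"]
    if rate["remaining"] >= threshold:
        return 0.0
    # resetAt is an ISO 8601 UTC timestamp such as "2024-01-01T01:00:00Z".
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    reset_at = datetime.fromisoformat(rate["resetAt"].replace("Z", "+00:00"))
    current = datetime.now(timezone.utc) if now is None else now
    return max(0.0, (reset_at - current).total_seconds())
```

This only helps once a response has actually come back, so it cannot cover the case where the quota is already exhausted before the first query.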
That's what I tried originally, but if the graphql ratelimit is already fully consumed, the query does not return a response. Calling the REST endpoint consumes a different ratelimit, `core`, I believe. Test data shows that the limit for `core` is 60, but it's actually >5k for authenticated requests.
From a conversation we had offline: if we start the sync without knowing that the remaining
quota is already 0, the graphql response won't even return and the code will crash.
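The approach discussed in this thread, checking the REST `rate_limit` endpoint before querying, can be sketched roughly as follows. This is a minimal illustration and not the code merged in the PR: the function name `seconds_to_sleep` and its parameters are hypothetical. It works on the parsed JSON body of `GET https://api.github.com/rate_limit`, where the `resources.graphql` entry carries `remaining` and a `reset` time in UTC epoch seconds.

```python
import time
from typing import Any, Dict, Optional


def seconds_to_sleep(
    rate_limit_payload: Dict[str, Any],
    threshold: int = 500,
    now: Optional[float] = None,
) -> float:
    """Return how long to wait before sending the next GraphQL request.

    `rate_limit_payload` is the parsed JSON body of GET /rate_limit.
    Returns 0.0 when the remaining GraphQL quota is at or above `threshold`.
    """
    graphql = rate_limit_payload["resources"]["graphql"]
    if graphql["remaining"] >= threshold:
        return 0.0
    current = time.time() if now is None else now
    # `reset` is the UTC epoch second at which the quota is restored.
    return max(0.0, graphql["reset"] - current)


# Example with a hand-written payload (values are made up):
payload = {"resources": {"graphql": {"limit": 5000, "remaining": 120, "reset": 1700003600}}}
print(seconds_to_sleep(payload, now=1700000000.0))  # 3600.0
```

A caller would fetch the payload with something like the `requests.get(...)` call in the diff above and pass `response.json()` to this function, then `time.sleep` for the returned duration.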
Co-authored-by: Alex Chantavy <achantavy@lyft.com>
Github has a requests-per-hour ratelimit that resets every hour.

### Changes
- when `remaining` is lower than a threshold of `500`, delay github graphql requests until the `reset` time.
- change the retry mode from constant to exponential backoff.
- change the page size of the repos query from `100` to `50`. The github API would very frequently time out, but is much better now.

### Testing
- added unit tests
- manual testing

---------

Co-authored-by: Alex Chantavy <achantavy@lyft.com>
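To make the difference between constant and exponential retry concrete, here is a small sketch. The PR's actual retry mechanism is not shown on this page, so the helper names `backoff_delays` and `call_with_backoff` and all default values are hypothetical.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(
    base: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 60.0,
    max_tries: int = 5,
) -> List[float]:
    """Return an exponentially growing delay schedule, capped at `max_delay`."""
    return [min(max_delay, base * factor ** attempt) for attempt in range(max_tries)]


def call_with_backoff(
    func: Callable[[], T],
    delays: List[float],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, sleeping for each delay in turn after a failure."""
    for delay in delays:
        try:
            return func()
        except Exception:
            sleep(delay)
    # Final attempt: any exception raised here propagates to the caller.
    return func()


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

With a constant schedule every retry waits the same time; here each wait doubles, which gives a struggling API progressively more room to recover.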