The pagination of S3 list_objects_v2 skips pages when using CommonPrefixes (i.e. Delimiter) and StartingToken
Use case:
Our API provides a list of S3 "folders" and supports pagination. It is a wrapper over our internal S3 bucket and forwards the information. The first response of the API returns a list of common prefixes and the next token provided by the PageIterator. The second request uses this token to continue the listing.
Expected Behavior
Using the paginator.paginate() method with the Delimiter parameter and not setting StartingToken should return all pages starting from the first one and its next token.
Using it again, but this time with a given StartingToken (the first page's next token), should return all pages starting from the second one, along with its next token.
Current Behavior
When paginator.paginate() is called with StartingToken, it returns the second page with an empty CommonPrefixes list, but the third page with a valid CommonPrefixes list.
Reproduction Steps
You need a bucket with date partitions and files in them.
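A bucket like the one described could be populated along these lines. This is only a sketch under assumptions: the original reproduction script is not shown here, so the key layout (`data/<date>/file_<n>.txt`), the helper names `date_partition_keys` and `seed_bucket`, and the default values are all hypothetical. `client` is assumed to be a boto3 S3 client, and `put_object` is the only S3 call used.

```python
from datetime import date, timedelta


def date_partition_keys(start, days, files_per_day=2, root="data"):
    """Build object keys for `days` consecutive date partitions (hypothetical layout)."""
    keys = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        for i in range(files_per_day):
            keys.append(f"{root}/{day.isoformat()}/file_{i}.txt")
    return keys


def seed_bucket(client, bucket, keys):
    """Upload an empty object for every key; `client` is assumed to be a boto3 S3 client."""
    for key in keys:
        client.put_object(Bucket=bucket, Key=key, Body=b"")


keys = date_partition_keys(date(2024, 1, 1), days=3)
```

With a real client, `seed_bucket(boto3.client("s3"), "my-test-bucket", keys)` would create three date "folders" under `data/`; the bucket name is a placeholder.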
I followed the issue down to PageIterator.__iter__() (.venv/lib/python3.11/site-packages/botocore/paginate.py)
    if first_request:
        # The first request is handled differently. We could
        # possibly have a resume/starting token that tells us where
        # to index into the retrieved page.
        if self._starting_token is not None:
            starting_truncation = self._handle_first_request(
                parsed, primary_result_key, starting_truncation
            )
        first_request = False
        self._record_non_aggregate_key_values(parsed)
The primary_result_key is initialized a few lines before that as self.result_keys[0], and result_keys essentially comes from the JSON schema in venv/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/paginators-1.json
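To make the suspected mechanism concrete, here is a toy model of what the first-request handling might do to a resumed page. It is a simplified, hypothetical re-creation, not botocore's code: it assumes the primary result key is sliced at the truncation index and every other result key is reset to an empty value of the same type, and it takes `Contents` as the first result key, as the report states. `simulate_first_resumed_page` is an invented name; check `paginate.py` in your installed version before relying on this reading.

```python
from copy import deepcopy


def simulate_first_resumed_page(parsed, result_keys, starting_truncation=0):
    """Toy model of post-processing applied to the first page after a StartingToken."""
    page = deepcopy(parsed)  # leave the caller's response untouched
    primary, *secondary = result_keys

    # Primary key: keep only the items after the truncation index.
    data = page.get(primary)
    if isinstance(data, (list, str)):
        page[primary] = data[starting_truncation:]
    else:
        page[primary] = None

    # Secondary keys (assumption): reset to an "empty" value of the same type.
    for key in secondary:
        sample = page.get(key)
        if isinstance(sample, list):
            page[key] = []
        elif isinstance(sample, str):
            page[key] = ""
        elif isinstance(sample, (int, float)):
            page[key] = 0
        else:
            page[key] = None
    return page


# A delimiter-only listing: the response has CommonPrefixes but no Contents.
response = {"CommonPrefixes": [{"Prefix": "data/2024-01-03/"}], "KeyCount": 1}
resumed = simulate_first_resumed_page(response, ["Contents", "CommonPrefixes"])
print(resumed["CommonPrefixes"])  # prints []
```

Under these assumptions, a page that carries only `CommonPrefixes` comes back empty on the first resumed request, which matches the empty second page described in the report.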
Hey @dboyadzhiev, thanks for reaching out and for the detailed reproduction steps. I was able to reproduce this behavior, and will bring it up with the team. I'll provide an update when I know more.
Hi @dboyadzhiev, thanks for your patience. Could you clarify why you have the first call separate from the rest? I was able to get all the common prefixes by using just one loop, and initializing next_token to None. This seems to be what you're trying to achieve, unless I'm misunderstanding the problem.
We used that logic to implement pagination. With the code above I simulated two separate requests. Imagine you have an app that lists 20 files per page; the second request corresponds to clicking the "Next" button.
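For this kind of stateless "next page" request, one possible workaround is to bypass the paginator and pass S3's own continuation token straight to `list_objects_v2`. This is a sketch, not something proposed in the thread: `list_folder_page` and its parameters are hypothetical names, and `client` is assumed to be a boto3 S3 client. It uses only the documented `Delimiter`, `MaxKeys`, and `ContinuationToken` parameters and the `CommonPrefixes`, `IsTruncated`, and `NextContinuationToken` response fields.

```python
def list_folder_page(client, bucket, prefix="", page_size=20, continuation_token=None):
    """Return one page of "folders" and the token for the next page (None when done)."""
    kwargs = {
        "Bucket": bucket,
        "Prefix": prefix,
        "Delimiter": "/",
        "MaxKeys": page_size,
    }
    # Only send the token on follow-up requests; the first request has none.
    if continuation_token is not None:
        kwargs["ContinuationToken"] = continuation_token

    response = client.list_objects_v2(**kwargs)

    # CommonPrefixes is absent when the page contains no rolled-up prefixes.
    folders = [cp["Prefix"] for cp in response.get("CommonPrefixes", [])]
    next_token = (
        response.get("NextContinuationToken") if response.get("IsTruncated") else None
    )
    return folders, next_token
```

The first API request would call `list_folder_page(client, bucket, prefix)` and hand `next_token` back to the caller; the "Next" request passes it in as `continuation_token`. Note that `MaxKeys` counts objects and common prefixes together, so a page can hold fewer than `page_size` folders if objects sit directly under the prefix.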
Possible Solution
No response
Additional Information/Context
As traced above in PageIterator.__iter__(), primary_result_key is self.result_keys[0], and result_keys comes from paginators-1.json. There, result_key is Contents, which is missing from the S3 response body (parsed).
SDK version used
1.31.17
Environment details (OS name and version, etc.)
MacOS 14.2.1 (23C71)