Added iterate_episodes and made dataset an iterable #54

Merged · 8 commits into Farama-Foundation:main · Apr 16, 2023

Conversation

@Howuhh (Contributor) commented Apr 6, 2023

Description

Although the documentation says that Minari doesn't serve the purpose of creating replay buffers, the dataset itself only lets you sample random episodes, which is not very useful in practice for building custom replay buffers or dataloaders. It is much more useful to be able to iterate over the episodes, or to get an episode generator, without loading everything into memory. I am currently using Minari to build datasets for my tasks, and this kind of functionality is missing for convenient use, so I added the ability to iterate through the episodes.
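
A minimal usage sketch of what this enables, assuming a locally available dataset (the dataset name below is just an example; EpisodeData fields such as rewards follow Minari's episode format):

```python
import minari

# Any locally available Minari dataset works here; the name is illustrative.
dataset = minari.load_dataset("door-human-v0")

# Iterate over episodes lazily instead of sampling them at random;
# episodes are loaded one at a time, so the full dataset never has
# to be held in memory.
returns = []
for episode in dataset.iterate_episodes():
    returns.append(episode.rewards.sum())

# The PR also makes the dataset itself iterable, so this is equivalent:
for episode in dataset:
    pass
```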

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have run pytest -v and no errors are present.
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@rodrigodelazcano (Member) commented:

Hey @Howuhh, thanks a lot for this PR! I'm wondering whether it would be possible to merge this new method, iterate_episodes, with sample_episodes: have a single generator, iterate_episodes, with batch_size and shuffle parameters. Let me know your thoughts on this.

I can merge this PR first, once the pre-commit is fixed, and open an issue for the single-generator implementation.
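
A hedged sketch of what that single generator could look like, written here as a standalone function rather than the actual MinariDataset method (the episodes argument stands in for the on-disk episode storage):

```python
import random
from typing import Iterator, List, Optional, Sequence


def iterate_episodes(
    episodes: Sequence[dict],
    episode_indices: Optional[List[int]] = None,
    batch_size: int = 1,
    shuffle: bool = False,
) -> Iterator[List[dict]]:
    """Yield batches of episodes, optionally in shuffled order."""
    if episode_indices is None:
        episode_indices = list(range(len(episodes)))
    indices = list(episode_indices)
    if shuffle:
        random.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        # In the real dataset each episode would be read from disk here,
        # so memory use stays proportional to batch_size.
        yield [episodes[i] for i in indices[start : start + batch_size]]
```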

@rodrigodelazcano (Member) commented Apr 7, 2023

On second thought, I don't think batch_size is necessary, since what we are trying to do here is populate a buffer. @Howuhh, can you also remove sample_episodes in favor of iterate_episodes in this PR?
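
For context, a hedged sketch of the buffer-filling pattern this refers to (the list is a stand-in for a custom replay buffer, and flat observation/action arrays are assumed):

```python
import minari

dataset = minari.load_dataset("door-human-v0")  # dataset name illustrative

buffer = []  # stand-in for a custom replay buffer
for episode in dataset.iterate_episodes():
    # A single sequential pass per episode is enough to populate the
    # buffer, so no batch_size parameter is needed.
    for t in range(len(episode.rewards)):
        buffer.append(
            (episode.observations[t], episode.actions[t], episode.rewards[t])
        )
```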

@Howuhh (Contributor, Author) commented Apr 7, 2023

@rodrigodelazcano To be honest, I don't really understand the current issue with the pre-commit; what's more, all the checks pass locally without errors (with the setup described in CONTRIBUTING.md).

@rodrigodelazcano (Member) commented:

> @rodrigodelazcano To be honest, I don't really understand the current issue with the pre-commit; what's more, all the checks pass locally without errors (with the setup described in CONTRIBUTING.md).

Interesting, let me have a look at it.

@rodrigodelazcano (Member) commented:

@Howuhh Regarding the pre-commit errors, I'm also unable to reproduce them locally. Looking at the logs, the issue seems to come from pyright identifying the ndarray type as Unknown. I also think we can avoid using a numpy array to pass the list of indices; passing a plain list should be enough. Can you remove this type and see if that fixes the error?
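
For illustration, the kind of annotation change being discussed, shown on a simplified function rather than the actual PR code:

```python
from typing import Iterator, List, Optional

import numpy as np


# Before: indices passed as a numpy array; pyright resolved the ndarray
# type to Unknown in CI.
def iter_indices_before(episode_indices: Optional[np.ndarray] = None) -> Iterator[int]:
    if episode_indices is None:
        episode_indices = np.arange(0)
    for i in episode_indices:
        yield int(i)


# After: a plain list of ints carries the same information and
# type-checks cleanly, with no ndarray involved.
def iter_indices_after(episode_indices: Optional[List[int]] = None) -> Iterator[int]:
    if episode_indices is None:
        episode_indices = []
    for i in episode_indices:
        yield i
```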

@Howuhh (Contributor, Author) commented Apr 7, 2023

@rodrigodelazcano Yup, but then it will be slightly inconsistent with the type of episode_indices in the MinariDataset class.

@Howuhh (Contributor, Author) commented Apr 7, 2023

Also, the documentation has examples that use dataset.name, but as far as I can tell the .name attribute does not exist in the code.

@rodrigodelazcano I actually don't think it's necessary to remove sample_episodes; it could be useful to others as well. For example, we do exactly that in CORL for the Decision Transformer.

@younik (Member) left a comment:

Looks good to me; I just added a small comment

Comment on lines 234 to 236:

    for (
        episode_index
    ) in episode_indices:  # pyright: ignore [reportOptionalIterable]

@younik (Member) commented:

Is it possible to avoid the # pyright: ignore here and thus make it one line?

It seems pyright doesn't recognize at line 140 that episode_indices can't be None, but it should work with an assert.
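
A minimal sketch of the assert-based alternative (the class below is illustrative, not the actual MinariDataset code):

```python
from typing import Iterator, List, Optional


class EpisodeIndexHolder:
    """Illustrative stand-in for a dataset whose stored indices are Optional."""

    def __init__(self, episode_indices: Optional[List[int]] = None):
        self.episode_indices = episode_indices

    def iterate_episode_indices(self) -> Iterator[int]:
        # The assert narrows Optional[List[int]] to List[int], so pyright
        # no longer needs a per-line ignore and the for statement fits on
        # one line.
        assert self.episode_indices is not None
        for episode_index in self.episode_indices:
            yield episode_index
```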

@rodrigodelazcano (Member) commented:

@Howuhh can you edit this?

@Howuhh (Contributor, Author) commented:

@rodrigodelazcano Yup, I've been a little busy; I'll correct it.

@Howuhh (Contributor, Author) commented Apr 16, 2023:

Done, the checks are all green locally, so maybe this will work.

@rodrigodelazcano (Member) left a comment:

LGTM! I'll merge once @younik's review comments are addressed.

rodrigodelazcano merged commit a614223 into Farama-Foundation:main on Apr 16, 2023