Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added cloud storage utility to toolkit #19

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

leela
Copy link
Collaborator

@leela leela commented Jul 16, 2020

closes #18

@leela leela changed the title Added cloud storage utility to toolkit Added cloud storage utility to toolkit - WIP Jul 16, 2020
@leela leela requested a review from anandology July 16, 2020 04:18
@leela leela marked this pull request as draft July 16, 2020 04:18
@leela leela force-pushed the 18-add-cloud-storage-utility branch 3 times, most recently from 47d571a to 22fd201 Compare July 16, 2020 05:47
Copy link
Member

@anandology anandology left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some features like copying a directory, downloading a whole directory etc. present in the storage.py of algoshelf/maverix that are not available here. It would be good to add them here.

Also, can you please test cases?

collection = bucket.objects.filter(Prefix=self.path)
return [obj.key for obj in collection.all()]

def read_dataframe(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is better have separate read_csv and read_paquet methods.

return False
raise

def delete(self, del_dir: bool=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The del_dir doesn't sound right. See what is used in pathlib for this kind of operation.

"""Read the contents of a path
"""
obj = self._object
return obj.get()['Body'].read()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a read_text, then there should be a read_bytes as well. What about write? Shouldn't we have write_text and write_bytes methods?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also. i guess, [Body].read() returns bytes, not text. Please check.

@leela leela force-pushed the 18-add-cloud-storage-utility branch from 22fd201 to 1397967 Compare July 17, 2020 11:45
@leela
Copy link
Collaborator Author

leela commented Jul 17, 2020

@anandology , It is still needs a proper testing, test cases and exception handling, I am on it currently. Meanwhile you can have a look at it and please pass your comments.

@leela leela force-pushed the 18-add-cloud-storage-utility branch 9 times, most recently from 8c917db to 0f6e99f Compare July 19, 2020 10:56
Copy link
Member

@anandology anandology left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@leela leela changed the title Added cloud storage utility to toolkit - WIP Added cloud storage utility to toolkit Jul 24, 2020
@leela leela marked this pull request as ready for review July 24, 2020 04:03
@leela leela force-pushed the 18-add-cloud-storage-utility branch 7 times, most recently from 1e59f1e to c02c02d Compare July 24, 2020 06:12
@leela leela force-pushed the 18-add-cloud-storage-utility branch 21 times, most recently from 78692a6 to 85b0858 Compare July 25, 2020 04:03
@leela leela force-pushed the 18-add-cloud-storage-utility branch from 85b0858 to d60c137 Compare July 25, 2020 04:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add storage module
2 participants