Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Exp backoff for downloads. #9671

Closed
wants to merge 4 commits into from

Conversation

KellenSunderland
Copy link
Contributor

@KellenSunderland KellenSunderland commented Feb 1, 2018

Description

The primary goal of this PR was to add exponential backoff to our http request. This should be a good feature for customers with slow Internet connections, but I also hope it will help mitigate test issues such as: #9669.

I noticed that we have a lot of duplicate code between test_utils.download and gluon.download. I've attempted to combine these two functions, but can roll it back if they had a good reason to be separated.

Checklist

Essentials

  • Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Code is well-documented:

Note examples are affected by this change. I've updated them so that they do not reference download functions that no longer exist.

Comments

  • The tests require a new test dependency: requests_mock. Which may need to be added to the CI. Due to this, this PR may require a little iteration before test work properly. No rush to get it merged.
  • I've created a requirements.test.txt file for convenient installation of test deps.
  • NNVM will also need to be updated with a few new references in files tutorials/deploy_model_on_rasp.py and tutorials/from_mxnet.py.

@szha
Copy link
Member

szha commented Feb 1, 2018

Looks like a breaking change

@KellenSunderland
Copy link
Contributor Author

Yes, this should be considered a breaking API change. It removes the download function from gluon utils. Sorry, I should have called that out in the description.



def _download_retry_with_backoff(fname, url, max_attempts=4):
"""Downloads a url with backoff applied, automatically re-requesting after a failure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retry can be added to existing download function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires using a session right? I'm not sure if that then lets us stream to a file. I'll check it out though. I'd prefer to use built in functionality provided we get backoff and streaming support.

Copy link
Contributor

@larroy larroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change

backoff_coef = 50.0
while True:
try:
print('Downloading %s from %s...' % (fname, url))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use logging instead of print and arguments to logging instead of %

# Likely non-2** result, possibly timeout or redirection failure.
attempts = attempts + 1
if attempts > max_attempts:
print('Downloading %s from %s, failed after #%d attempts' % (fname, url, attempts))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.


if overwrite or not os.path.exists(fname) or (sha1_hash and not check_sha1(fname, sha1_hash)):
dirname = os.path.dirname(os.path.abspath(os.path.expanduser(fname)))
if not os.path.exists(dirname):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use exist_ok and avoid the check:

https://docs.python.org/3/library/os.html#os.makedirs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.



def create_dir(dirname):
if not os.path.exists(dirname):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't have this change until 2.0. You may consider adding the retry to existing download functions.

@KellenSunderland
Copy link
Contributor Author

@larroy: Thanks for the review, good tips.

@szha I'm alright with having this wait for 2.0. I can bring some of the non-breaking changes in with another minimal PR if you'd like.

@szha
Copy link
Member

szha commented Feb 1, 2018

OK

@szha szha added the Breaking label Feb 1, 2018
return fname


def create_dir(dirname):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not os.path.exists(dirname):
os.makedirs(dirname)

This is just two lines. I don't think its worth making it a function

@eric-haibin-lin
Copy link
Member

Can we keep the alias in test_utils and gluon to keep backward compatibility?

@@ -61,13 +52,13 @@ def split_data(data, num_slice, batch_axis=0, even_split=True):
size = data.shape[batch_axis]
if size < num_slice:
raise ValueError(
"Too many slices for data with shape %s. Arguments are " \
"num_slice=%d and batch_axis=%d."%(str(data.shape), num_slice, batch_axis))
"Too many slices for data with shape %s. Arguments are "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the backslash required?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't triple quotes useful in this case? """blahs dah"""

Copy link
Contributor Author

@KellenSunderland KellenSunderland Feb 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think triplequotes would be the equiv of

"Too many slices for data with shape %s. Arguments are \n" \
"num_slice=%d and batch_axis=%d."%(str(data.shape), num_slice, batch_axis))

but I'll try a few options.

"data with shape %s cannot be evenly split into %d slices along axis %d. " \
"Use a batch size that's multiple of %d or set even_split=False to allow " \
"uneven partitioning of data."%(
"data with shape %s cannot be evenly split into %d slices along axis %d. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backslashes

@KellenSunderland
Copy link
Contributor Author

@eric-haibin-lin Yes I was thinking of just pointing everything to the gluon download function for now.

@CodingCat
Copy link
Contributor

Hi, the community has passed to vote about associating the code changes with JIRA (https://lists.apache.org/thread.html/ab22cf0e35f1bce2c3bf3bec2bc5b85a9583a3fe7fd56ba1bbade55f@%3Cdev.mxnet.apache.org%3E)

We have updated the guidelines for contributors in https://cwiki.apache.org/confluence/display/MXNET/Development+Process, please ensure that you have created a JIRA at https://issues.apache.org/jira/projects/MXNET/issues/ to describe your work in this pull request and include the JIRA title in your PR as [MXNET-xxxx] your title where MXNET-xxxx is the JIRA id

Thanks!

@KellenSunderland
Copy link
Contributor Author

Closing, will reopen if I have time to refactor.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants