Add README for image classification example #21758
Can one of the admins verify this patch?
Looks good, just a couple small comments.
### Data
Data related to RunInference has been staged in
`gs://apache-beam-ml/` for use with these example pipelines:
I don't know whether `gs://apache-beam-ml` will really work as a link, or whether it is right.
Maybe "staged in apache-beam-testing" would work.
Feel free to keep it this way if I'm just wrong or misunderstanding.
Maybe this will link correctly if the user's cloud account is set to apache-beam-testing.
It's not intended to be a link, but rather to show the location of datasets, models, etc. related to RunInference. The apache-beam-ml bucket is public; outside users should be able to view (with the command `gsutil ls gs://apache-beam-ml`) and read these models / datasets with no issue?
```
gs://apache-beam-ml/datasets/imagenet/raw-data/validation/ILSVRC2012_val_00005012.JPEG,573
```
where the second item in each line is the integer representing the predicted class of the
image.
It would be cool if one of the PTransforms in the example joined the integer prediction to the actual class name of the image.
For example:
gs://apache-beam-ml/datasets/.....5102.jpeg, horse
gs://apache-beam-ml/datasets/.....5102.jpeg, cheese
etc.
But that is outside the scope of this PR.
I have that for a different example, in #21766.
I have another PR, #21766, and once that gets merged I can add instructions for it in the README.md as well.
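The join suggested above could look roughly like the following. This is only a sketch, not the code from #21766: the function names and the label map are hypothetical, and the class names are placeholders rather than real ImageNet labels. It assumes each output line has the `<image path>,<class id>` form shown in the README excerpt.

```python
from typing import Dict, Tuple

# Hypothetical label map. The names are placeholders, not real ImageNet labels.
PLACEHOLDER_LABELS: Dict[int, str] = {573: "placeholder_label"}


def parse_prediction(line: str) -> Tuple[str, int]:
    """Split one output line of the form '<image path>,<class id>'."""
    # rsplit on the last comma so that commas inside the path are preserved.
    path, class_id = line.strip().rsplit(",", 1)
    return path, int(class_id)


def to_labeled_line(line: str, labels: Dict[int, str]) -> str:
    """Replace the integer class id with its label, or 'unknown' if missing."""
    path, class_id = parse_prediction(line)
    return f"{path},{labels.get(class_id, 'unknown')}"
```

In a pipeline, a function like `to_labeled_line` could be applied per element, for instance with `beam.Map(to_labeled_line, labels=PLACEHOLDER_LABELS)`, before the results are written out.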
Codecov Report
@@ Coverage Diff @@
## master #21758 +/- ##
==========================================
+ Coverage 74.02% 74.06% +0.03%
==========================================
Files 698 698
Lines 92192 92333 +141
==========================================
+ Hits 68248 68383 +135
- Misses 22693 22699 +6
Partials 1251 1251
LGTM, a couple minor suggestions
### Datasets and Models for RunInference
Data related to RunInference has been staged in
`gs://apache-beam-ml/` for use with these example pipelines. You can see this by using the [gsutil tool](https://cloud.google.com/storage/docs/gsutil#gettingstarted).
(optional) maybe link to the cloud console here: https://pantheon.corp.google.com/storage/browser/apache-beam-ml
Is that link accessible only to Google employees, though? Would https://console.cloud.google.com/ be better?
whoops, yes it would. I copied the wrong thing :)
Done.
- ...
- gs://apache-beam-ml/datasets/imagenet/raw-data/validation/ILSVRC2012_val_00050000.JPEG
-->
- `gs://apache-beam-ml/testing/inputs/it_imagenet_validation_inputs.txt/`:
Suggested change, removing the trailing slash:
Before: - `gs://apache-beam-ml/testing/inputs/it_imagenet_validation_inputs.txt/`:
After: - `gs://apache-beam-ml/testing/inputs/it_imagenet_validation_inputs.txt`:
Fixed.
validation data
- gs://apache-beam-ml/datasets/imagenet/raw-data/validation/ILSVRC2012_val_00000001.JPEG
- ...
- gs://apache-beam-ml/datasets/imagenet/raw-data/validation/ILSVRC2012_val_00000015.JPEG
(optional) It might be nice to clarify that these sub-bullets are the file contents with something like:
$ gsutil cat gs://apache-beam-ml/testing/inputs/it_imagenet_validation_inputs.txt
gs://apache-beam-ml/datasets/imagenet/raw-data/validation/ILSVRC2012_val_00000001.JPEG
...
gs://apache-beam-ml/datasets/imagenet/raw-data/validation/ILSVRC2012_val_00000015.JPEG
Added.
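For readers who want to see how such an input file might be consumed, here is a minimal sketch that turns the file contents into a list of image paths. The function name is hypothetical, and the `gs://` check is an assumption made for illustration; the example pipeline itself may accept other path schemes.

```python
from typing import List


def read_image_paths(file_text: str) -> List[str]:
    """Return one image path per non-empty line of the input file text."""
    paths = [line.strip() for line in file_text.splitlines() if line.strip()]
    for path in paths:
        # Assumption: every entry is a Cloud Storage path, as in the staged file.
        if not path.startswith("gs://"):
            raise ValueError(f"expected a gs:// path, got: {path}")
    return paths
```

The text passed in would be whatever `gsutil cat` prints for the inputs file, with one path per line.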
Sorry, one more thing I forgot to include in the last review.
For installation of the `torch` dependency for Dataflow pipelines, refer to these
[instructions](https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#pypi-dependencies).
Suggested change:
Before:
For installation of the `torch` dependency for Dataflow pipelines, refer to these
[instructions](https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#pypi-dependencies).
After:
For installation of the `torch` dependency on a distributed runner, like Dataflow, refer to these
[instructions](https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#pypi-dependencies).
(It doesn't need to be exactly that text; in general, Beam docs should present Dataflow as one distributed runner and make clear that there are others.)
Thanks, fixed.
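As a rough illustration of what the linked PyPI-dependencies instructions describe, the sketch below assembles pipeline arguments that point a distributed runner at a requirements file listing `torch`. The helper name and the `requirements.txt` file name are assumptions for illustration; `--runner` and `--requirements_file` are standard Beam Python pipeline options. Dataflow would also need options such as project, region, and temp location, which are omitted here.

```python
from typing import List


def build_pipeline_args(runner: str, requirements_file: str) -> List[str]:
    """Build command-line style pipeline arguments for a distributed runner."""
    return [
        f"--runner={runner}",
        # The requirements file is assumed to list `torch` (one package per line).
        f"--requirements_file={requirements_file}",
    ]


# DataflowRunner is used as one example of a distributed runner.
args = build_pipeline_args("DataflowRunner", "requirements.txt")
```

A list like `args` could then be passed to `PipelineOptions(args)` when constructing the pipeline.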
TODO: Add link to full documentation on Beam website when it's published.

i.e. "See the
[documentation](https://beam.apache.org/documentation/dsls/dataframes/overview/#pre-requisites)
is this a leftover?
Merging despite the failed test since this is just a docs change.
* Add README for image classification example
* Fix typos and input name changes
* Fix typos and clarify inputs text
* Add link to GCP console; Add clarifying comment
Instructions on how to run a PyTorch RunInference example.