Scriptable Tokenizer for Text Classification Example #1691
…ptable tokenizer into a single artifact
Thanks Matthias, super glad to see this work in. Left a bunch of feedback. The important items are: we need some tests, updates to the README to clarify the benefits to users, and a few more comments in train.py.
examples/text_classification_with_scriptable_tokenzier/README.md
examples/text_classification_with_scriptable_tokenzier/handler.py
examples/text_classification_with_scriptable_tokenzier/train_model.py
LGTM, although I think we should revert the changes to test/pytest and create_mar.sh. Let's fix those properly in another PR, pending review of the CONTRIBUTING-dev.md spec I shared.
@@ -16,7 +16,7 @@ function cleanup {
 trap cleanup EXIT

 # Download and Extract model's source code
-sudo apt-get install zip unzip
+sudo apt-get install -y zip unzip
Is this change needed?
"I think we should revert the changes to test/pytest and create_mar.sh. Let's fix those properly in another PR."
Sounds good, reverted the changes
@mreso could you run the sanity and regression tests to make sure the pytest you added is verified?
Thank you @lxning! Here are the logs for the regression tests: https://gist.github.com/mreso/0bcaed895517f74a09cbd4e80621969c
Description
This example shows how to combine a model with a scriptable tokenizer and compile the two together with TorchScript to produce a single artifact.
Bundling the model and tokenizer guarantees that the same tokenizer is used in training and inference.
It also reduces handler complexity and makes the artifact easier to deploy in the future C++-based backend.
Training the model depends on torchdata, as noted in the README; inference does not require the package.
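To make the idea of a single artifact concrete, here is a minimal sketch of the pattern described above. It is not the code from this PR: `WhitespaceTokenizer`, `TextClassifier`, `TokenizerAndModel`, and the tiny vocabulary are hypothetical names invented for illustration, and the tokenizer is a plain-PyTorch stand-in rather than the scriptable tokenizer the example actually uses. It only assumes that `torch` is installed.

```python
import io
from typing import Dict, List

import torch
from torch import nn


class WhitespaceTokenizer(nn.Module):
    """Hypothetical scriptable tokenizer: lowercase, split on whitespace, map to ids."""

    vocab: Dict[str, int]

    def __init__(self, vocab: Dict[str, int], unk_id: int = 0):
        super().__init__()
        self.vocab = vocab
        self.unk_id = unk_id

    def forward(self, text: str) -> List[int]:
        ids: List[int] = []
        for token in text.lower().split():
            if token in self.vocab:
                ids.append(self.vocab[token])
            else:
                ids.append(self.unk_id)
        return ids


class TextClassifier(nn.Module):
    """Hypothetical classifier: bag-of-embeddings followed by a linear layer."""

    def __init__(self, vocab_size: int, embed_dim: int, num_classes: int):
        super().__init__()
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
        return self.fc(self.embedding(token_ids, offsets))


class TokenizerAndModel(nn.Module):
    """Wraps tokenizer and model so that raw strings go in and logits come out."""

    def __init__(self, tokenizer: nn.Module, model: nn.Module):
        super().__init__()
        self.tokenizer = tokenizer
        self.model = model

    def forward(self, texts: List[str]) -> torch.Tensor:
        ids: List[int] = []
        offsets: List[int] = []
        for text in texts:
            offsets.append(len(ids))  # start index of this text's tokens
            ids.extend(self.tokenizer(text))
        return self.model(
            torch.tensor(ids, dtype=torch.long),
            torch.tensor(offsets, dtype=torch.long),
        )


torch.manual_seed(0)
vocab = {"<unk>": 0, "great": 1, "movie": 2, "terrible": 3, "plot": 4}
combined = TokenizerAndModel(
    WhitespaceTokenizer(vocab), TextClassifier(len(vocab), embed_dim=8, num_classes=2)
).eval()

# Script tokenizer and model together, then round-trip through a single serialized artifact.
scripted = torch.jit.script(combined)
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
restored = torch.jit.load(buffer)

texts = ["Great movie", "terrible plot twist"]
logits = restored(texts)
```

Because the restored module accepts raw strings, a serving handler in this setup would only need to decode the request and forward the text, which is the reduction in handler complexity the description refers to.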
Fixes #(issue)
Type of change
Feature/Issue validation/testing
Test plan:
Checklist: