Automatic "Model Cards" for Flair models, and default ability to resume training #2457

Merged: alanakbik merged 9 commits into master from trainer-details on Oct 1, 2021

Conversation

alanakbik (Collaborator)

This PR modifies the ModelTrainer so that it automatically stores all training parameters as a "model card" in a Flair model. This allows you to:

  • print all parameters with which a model was trained
  • resume training of any Flair model, since optimizer and scheduler state are now stored by default

Model cards

When you train any Flair model, a model card will now automatically be saved. The following example trains a small POS tagger and prints the model card at the end:

from flair.datasets import UD_ENGLISH
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# initialize corpus and make label dictionary for POS tags
corpus = UD_ENGLISH().downsample(0.01)
tag_type = "pos"
tag_dictionary = corpus.make_label_dictionary(tag_type)

# simple sequence tagger
tagger = SequenceTagger(hidden_size=256,
                        embeddings=WordEmbeddings("glove"),
                        tag_dictionary=tag_dictionary,
                        tag_type=tag_type)

# initialize model trainer and experiment path
trainer = ModelTrainer(tagger, corpus)
path = 'resources/taggers/model-card'

# train for a few epochs
trainer.train(path,
              max_epochs=20,
              )

# load best model and print "model card"
trained_model = SequenceTagger.load(path + '/best-model.pt')
trained_model.print_model_card()

This should print a model card like:

------------------------------------
--------- Flair Model Card ---------
------------------------------------
- this Flair model was trained with:
-- Flair version 0.9
-- PyTorch version 1.7.1
-- Transformers version 4.8.1
------------------------------------
------- Training Parameters: -------
------------------------------------
-- base_path = resources/taggers/model-card
-- learning_rate = 0.1
-- mini_batch_size = 32
-- mini_batch_chunk_size = None
-- max_epochs = 20
-- train_with_dev = False
-- train_with_test = False
[... shortened ...]
------------------------------------
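
The model card is also available programmatically on the loaded model. As a minimal sketch, assuming the card is kept as a plain dict attribute named model_card with keys matching the printout above (the attribute and key names are assumptions; this PR description only shows print_model_card()):

# inspect the raw model card dict (assumed attribute name: model_card)
card = trained_model.model_card
print(card["flair_version"])                         # e.g. "0.9"
print(card["training_parameters"]["learning_rate"])  # e.g. 0.1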

Resume training

Previously, we distinguished between checkpoints and model files. Now all models can function as checkpoints, meaning you can load them and continue training them. Say you want to load the model above (trained to epoch 20) and continue training it to epoch 25. Do it like this:

# resume training best model, but this time until epoch 25
trainer.resume(trained_model,
               base_path=path + '-resume',
               max_epochs=25,
               )
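
Because optimizer and scheduler state now travel with the model file, the same works from a fresh session. A minimal sketch, assuming resume() takes the loaded model plus the usual train() arguments, as in the example above:

# in a new session: load the saved model and pick up training where it stopped
checkpoint = SequenceTagger.load('resources/taggers/model-card/best-model.pt')
trainer = ModelTrainer(checkpoint, corpus)
trainer.resume(checkpoint,
               base_path='resources/taggers/model-card-resume',
               max_epochs=25,
               )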

alanakbik merged commit d58128a into master on Oct 1, 2021
alanakbik deleted the trainer-details branch on Oct 2, 2021