Automatic "Model Cards" for Flair models, and default ability to resume training #2457

Merged: alanakbik merged 9 commits into master from trainer-details on Oct 1, 2021

Conversation

alanakbik (Collaborator)

This PR modifies the ModelTrainer so that it automatically stores all training parameters as a "model card" in a Flair model. This allows you to:

  • print all parameters with which a model was trained
  • resume training of any Flair model, since optimizer and scheduler state are now stored by default

Model cards

When you train any Flair model, a model card will now automatically be saved. The following example trains a small POS tagger and prints the model card at the end:

from flair.datasets import UD_ENGLISH
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# initialize corpus and make label dictionary for POS tags
corpus = UD_ENGLISH().downsample(0.01)
tag_type = "pos"
tag_dictionary = corpus.make_label_dictionary(tag_type)

# simple sequence tagger
tagger = SequenceTagger(hidden_size=256,
                        embeddings=WordEmbeddings("glove"),
                        tag_dictionary=tag_dictionary,
                        tag_type=tag_type)

# initialize model trainer and experiment path
trainer = ModelTrainer(tagger, corpus)
path = 'resources/taggers/model-card'

# train for a few epochs
trainer.train(path,
              max_epochs=20,
              )

# load best model and print "model card"
trained_model = SequenceTagger.load(path + '/best-model.pt')
trained_model.print_model_card()

This should print a model card like:

------------------------------------
--------- Flair Model Card ---------
------------------------------------
- this Flair model was trained with:
-- Flair version 0.9
-- PyTorch version 1.7.1
-- Transformers version 4.8.1
------------------------------------
------- Training Parameters: -------
------------------------------------
-- base_path = resources/taggers/model-card
-- learning_rate = 0.1
-- mini_batch_size = 32
-- mini_batch_chunk_size = None
-- max_epochs = 20
-- train_with_dev = False
-- train_with_test = False
[... shortened ...]
------------------------------------
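
The model card is also available programmatically on the loaded model. As a minimal sketch, assuming the card is kept as a plain dict attribute named model_card with keys matching the printout above (the attribute and key names are assumptions; this PR description only shows print_model_card()):

# inspect the raw model card dict (assumed attribute name: model_card)
card = trained_model.model_card
print(card["flair_version"])                         # e.g. "0.9"
print(card["training_parameters"]["learning_rate"])  # e.g. 0.1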

Resume training

Previously, we distinguished between checkpoints and model files. Now all models can function as checkpoints, meaning you can load them and continue training them. Say you want to load the model above (trained to epoch 20) and continue training it to epoch 25. Do it like this:

# resume training best model, but this time until epoch 25
trainer.resume(trained_model,
               base_path=path + '-resume',
               max_epochs=25,
               )
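
Because optimizer and scheduler state now travel with the model file, the same works from a fresh session. A minimal sketch, assuming resume() takes the loaded model plus the usual train() arguments, as in the example above:

# in a new session: load the saved model and pick up training where it stopped
checkpoint = SequenceTagger.load('resources/taggers/model-card/best-model.pt')
trainer = ModelTrainer(checkpoint, corpus)
trainer.resume(checkpoint,
               base_path='resources/taggers/model-card-resume',
               max_epochs=25,
               )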

alanakbik merged commit d58128a into master on Oct 1, 2021
alanakbik deleted the trainer-details branch on Oct 2, 2021