v0.3.1

@zhaoqf123 released this 17 Aug 16:54

Model

  • Add drop path (stochastic depth) to regularize large models; it works well for deep models
  • Add EMA (exponential moving average) of model weights
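
For readers unfamiliar with the two techniques, here is a minimal, dependency-free sketch of the ideas. It is not the code in this repository (the release implements EMA through timm); the function names `drop_path` and `ema_update`, the list-based data layout, and the default decay are assumptions made for illustration only.

```python
import random


def drop_path(branch_out, drop_prob, training, rng):
    """Stochastic depth on a residual branch (illustrative sketch).

    branch_out: list of per-sample lists of floats (the branch output).
    In training mode, each sample's branch is zeroed with probability
    drop_prob; kept samples are scaled by 1 / keep_prob so the expected
    value of the branch output is unchanged.
    """
    if not training or drop_prob == 0.0:
        return [list(row) for row in branch_out]
    keep_prob = 1.0 - drop_prob
    out = []
    for row in branch_out:
        keep = rng.random() < keep_prob  # one decision per sample
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([x * scale for x in row])
    return out


def ema_update(ema_weights, weights, decay=0.999):
    """One EMA step: ema = decay * ema + (1 - decay) * current weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed, for reproducibility only
    branch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(drop_path(branch, drop_prob=0.5, training=True, rng=rng))
    print(ema_update([0.0, 0.0], [1.0, 2.0], decay=0.9))
```

At evaluation time the branch passes through unchanged, and the EMA copy of the weights is typically the one used for inference.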

Other

  • Add one package dependency: timm, to implement EMA
  • Update README to include details of Eulerian sequence and cyclic node re-index.
  • Code refactoring.
  • Tokenization config json refactoring.
  • Update vocab by adding some special tokens, e.g., <bos>, <new>, <mask>, etc.
  • Turn off optimizer offload in the DeepSpeed config to speed up training.
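
As a rough illustration of the DeepSpeed change, the sketch below removes the `offload_optimizer` section from a ZeRO config so that optimizer state stays on the GPU. The helper name `disable_optimizer_offload` and the example values (batch size, ZeRO stage, `pin_memory`) are assumptions; the repository's actual config file may differ.

```python
import copy
import json


def disable_optimizer_offload(ds_config):
    """Return a copy of a DeepSpeed config dict without optimizer offload.

    Dropping the "offload_optimizer" entry under "zero_optimization" keeps
    optimizer state on the accelerator instead of moving it to CPU or NVMe.
    """
    cfg = copy.deepcopy(ds_config)  # leave the caller's dict untouched
    cfg.get("zero_optimization", {}).pop("offload_optimizer", None)
    return cfg


# Hypothetical config, for illustration only.
example_config = {
    "train_micro_batch_size_per_gpu": 8,
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
}

if __name__ == "__main__":
    print(json.dumps(disable_optimizer_offload(example_config), indent=2))
```

Offloading trades speed for memory, so disabling it only makes sense when the optimizer state fits in GPU memory.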