
Transformers Loss Landscape Exploration project, part of a master's thesis at FDT, ITMO University


Transformers Loss Landscape Exploration

This project demonstrates methods for visualizing the loss landscape of a Transformer network fine-tuned on a question-answering (QA) task. Visualizing the optimum reached by training can inform how the fine-tuning process is set up.

After comparing the available corpora and the premises of their creation, two datasets were selected for the QA task: SQuAD and AdversarialQA. DistilBERT was chosen for its significant advantage in training speed and model size over other Transformer models without a major loss of accuracy, and was fine-tuned on the QA task.

1-dimensional loss and F1-score plot over linearly interpolated weights: x = 0 corresponds to the untuned model, x = 1 to the model fine-tuned on the SQuAD 1.1 dataset.
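The 1-D plot evaluates the loss along the straight line between the two weight vectors, θ(α) = (1 − α)·θ_untuned + α·θ_finetuned. A minimal NumPy sketch of the idea, where a toy quadratic loss stands in for the actual DistilBERT QA loss and the 4-dimensional weight vectors are hypothetical placeholders for the flattened model parameters:

```python
import numpy as np

# Toy stand-in loss; in the project this would be the QA loss of
# DistilBERT evaluated on a validation set (hypothetical placeholder).
def loss(theta):
    return float(np.sum((theta - 1.0) ** 2))

theta_untuned = np.zeros(4)    # x = 0: untuned model weights
theta_finetuned = np.ones(4)   # x = 1: fine-tuned model weights

# Evaluate the loss at interpolated points along the segment
alphas = np.linspace(0.0, 1.0, 11)
losses = [loss((1 - a) * theta_untuned + a * theta_finetuned)
          for a in alphas]
```

For the real model, each checkpoint's parameters would be flattened and interpolated tensor-by-tensor before computing the loss on held-out data.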

2-dimensional loss surface contour plot: the point (0, 0) corresponds to the untuned model, (0, 1) to the model fine-tuned on the SQuAD 1.1 dataset.
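For the 2-D surface, one axis can be taken along the direction connecting the untuned and fine-tuned weights, and the other along a random direction rescaled to a comparable norm (Li et al. use filter-wise normalization for the random direction). A hedged NumPy sketch with the same toy loss standing in for the QA loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the QA loss (hypothetical placeholder)
def loss(theta):
    return float(np.sum((theta - 1.0) ** 2))

theta0 = np.zeros(4)   # untuned model, plotted at (0, 0)
theta1 = np.ones(4)    # fine-tuned model, plotted at (0, 1)

d_axis = theta1 - theta0                 # axis connecting the two models
d_rand = rng.standard_normal(4)          # random second direction
d_rand *= np.linalg.norm(d_axis) / np.linalg.norm(d_rand)  # match scale

# Grid of loss values; contour-plot `surface` over (xs, ys)
xs = np.linspace(-1.0, 1.0, 21)
ys = np.linspace(-0.5, 1.5, 21)
surface = np.array([[loss(theta0 + x * d_rand + y * d_axis) for x in xs]
                    for y in ys])
```

The simple norm matching above is a simplification; per-filter normalization of `d_rand` is what makes surfaces comparable across architectures in the original paper.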

Projection of the optimization trajectory onto the 2-dimensional loss surface contour plot: (0, 0) corresponds to the untuned model, (0, 1) to the model fine-tuned on the SQuAD 1.1 dataset.
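To draw the trajectory, checkpoints saved during fine-tuning are projected onto a 2-D plane; following Li et al., a natural choice is the top two PCA directions of the checkpoint differences from the final solution. A minimal NumPy sketch with synthetic checkpoints standing in for real flattened DistilBERT parameter snapshots:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in: 10 checkpoints recorded during fine-tuning,
# each flattened into a 6-dimensional parameter vector (one per row)
checkpoints = np.cumsum(rng.standard_normal((10, 6)), axis=0)

final = checkpoints[-1]
deltas = checkpoints - final          # center on the fine-tuned solution

# Top-2 PCA directions of the trajectory via SVD of the centered matrix
_, _, vt = np.linalg.svd(deltas, full_matrices=False)
d1, d2 = vt[0], vt[1]

# 2-D coordinates of each checkpoint in the PCA plane; these points
# are overlaid on the loss contour plot to show the optimization path
coords = np.stack([deltas @ d1, deltas @ d2], axis=1)
```

The loss surface is then evaluated on a grid spanned by `d1` and `d2` so that the projected path and the contours share the same coordinate system.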

Work based on:

Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, and Tom Goldstein. Visualizing the Loss Landscape of Neural Nets. NeurIPS, 2018.

Made with Python as part of a master's thesis at FDT, ITMO University.
