Update README.md for float8 unification (#895)
vkuzo committed Sep 16, 2024
1 parent b2e1d49 commit da13bf2
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions torchao/float8/README.md
@@ -11,6 +11,8 @@ throughput speedups of up to 1.5x on 128 GPU LLaMa 3 70B pretraining jobs.

:warning: <em>The codebase is stable, but backwards compatibility is not yet guaranteed.</em>

:warning: <em>These APIs are training-only and float8-only, and we plan to [unify them with the rest of torchao](https://github.com/pytorch/ao/issues/894) in the future.</em>

# Single GPU User API

We provide three per-tensor scaling strategies: dynamic, delayed and static. See https://arxiv.org/pdf/2209.05433.pdf, Section 4.3 for more details. These strategies are configurable separately for activations (`input`), weights (`weight`) and gradients (`grad_output`).
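To make the difference between the three strategies concrete, here is a minimal pure-Python sketch of how a per-tensor scale could be chosen in each case. It is illustrative only and is not the torchao implementation: the helper names (`dynamic_scale`, `delayed_scale`, `static_scale`) are hypothetical, the float8 format is assumed to be e4m3 with a maximum finite value of 448, and the delayed variant assumes the scale is derived from the maximum of a history of previously observed absolute maxima.

```python
# Illustrative sketch only; all function names here are hypothetical,
# not part of the torchao API.

E4M3_MAX = 448.0  # assumed target format: largest finite float8 e4m3 value


def amax(values):
    """Largest absolute value in a flat list of floats."""
    return max(abs(v) for v in values)


def scale_from_amax(a, fp8_max=E4M3_MAX, eps=1e-12):
    """Scale that maps an absolute maximum `a` onto the float8 range."""
    return fp8_max / max(a, eps)  # eps guards against division by zero


def dynamic_scale(tensor):
    """Dynamic: derive the scale from the tensor being cast right now."""
    return scale_from_amax(amax(tensor))


def delayed_scale(amax_history):
    """Delayed: derive the scale from amax values seen on earlier iterations.

    Assumption: the history is reduced with max().
    """
    return scale_from_amax(max(amax_history))


def static_scale(fixed_scale):
    """Static: use a scale chosen ahead of time, independent of the data."""
    return fixed_scale


weights = [0.5, -2.0, 1.25]
history = [1.0, 4.0, 3.5]

print(dynamic_scale(weights))   # 224.0
print(delayed_scale(history))   # 112.0
print(static_scale(64.0))       # 64.0
```

Under these assumptions, dynamic scaling tracks the current data exactly at the cost of computing the maximum on every cast, delayed scaling reuses statistics from earlier steps, and static scaling does no data-dependent work at all.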