Commit

Clarify llm download/loading instructions (#25145)
* Clarify llm download/loading instructions

* Fix instructions

* Default model name

* Install beam

* Provide some extra info on LLM
damccorm authored Feb 5, 2023
1 parent 16cb63b commit aee2c84
Showing 2 changed files with 22 additions and 8 deletions.
@@ -88,9 +88,9 @@ def parse_args(argv):
   parser.add_argument(
       "--model_name",
       dest="model_name",
-      required=True,
+      required=False,
       help="Path to the model's state_dict.",
-      default="t5-small",
+      default="t5-11b",
   )

   return parser.parse_known_args(args=argv)
@@ -25,17 +25,31 @@ RunInference works well on arbitrarily large models as long as they can fit on your hardware.
This example demonstrates running inference with a `T5` language model using `RunInference` in a pipeline. `T5` is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, with each task converted into a text-to-text format. The example uses `T5-11B`, which contains 11 billion parameters and is 45 GB in size. To work well on a variety of tasks, `T5` prepends a different prefix to the input corresponding to each task. For example, for translation the input would be `translate English to German: …`, and for summarization it would be `summarize: …`. For more information about `T5`, see the [T5 overview](https://huggingface.co/docs/transformers/model_doc/t5) in the Hugging Face documentation.
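
As an illustration of the prefix mechanism, here is a minimal sketch using the much smaller `t5-small` checkpoint (chosen purely to keep the illustration cheap; it is not part of this example):

```
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# A small checkpoint keeps this illustration cheap to run.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The prepended prefix selects the task; swap it for "summarize: ..."
# to get summarization instead of translation.
inputs = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
)
with torch.no_grad():
    output_ids = model.generate(inputs.input_ids, max_length=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```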

### Run the Pipeline
- First, install the required packages listed in [requirements.txt](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/large_language_modeling/requirements.txt) and pass the required arguments. You can download the `T5-11b` model from [Hugging Face Hub](https://huggingface.co/t5-11b) using:
First, install `apache-beam` 2.40 or greater:

- - git lfs install
- - git clone https://huggingface.co/t5-11b
- Note: It will download the checkpoint, then you need to convert it to the model state dict as mentioned [here](https://pytorch.org/tutorials/beginner/saving_loading_models.html#save-load-state-dict-recommended).
```
pip install apache-beam -U
```

Next, install the required packages listed in [requirements.txt](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/large_language_modeling/requirements.txt) and pass the required arguments. You can download the `T5-11b` model from [Hugging Face Hub](https://huggingface.co/t5-11b) with the following steps:

- Install Git LFS following the instructions [here](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage?platform=mac)
- Run `git lfs install`
- Run `git clone https://huggingface.co/t5-11b` (this may take a long time). This downloads the checkpoint; you then need to convert it to the model state dict as described [here](https://pytorch.org/tutorials/beginner/saving_loading_models.html#save-load-state-dict-recommended):

```
import torch
from transformers import T5ForConditionalGeneration

# Load the full model from the cloned checkpoint, then save only its
# state dict, which is what the pipeline expects.
model = T5ForConditionalGeneration.from_pretrained("path/to/cloned/t5-11b")
torch.save(model.state_dict(), "path/to/save/state_dict.pth")
```
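
As a quick sanity check (not part of the example), you can confirm that the saved file contains the expected mapping of parameter names to tensors:

```
import torch

# Load the saved state dict on CPU and inspect it; T5's shared embedding
# table, for example, should appear under a key like "shared.weight".
state_dict = torch.load("path/to/save/state_dict.pth", map_location="cpu")
print(f"{len(state_dict)} tensors saved")
print(sorted(state_dict)[:5])  # a few parameter names
```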

You can view the code on [GitHub](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/inference/large_language_modeling/main.py). You can run the pipeline in one of two ways (a sketch of its core follows the list below):

1. Locally on your machine: `python main.py --runner DirectRunner --model_state_dict_path <local or remote path to state_dict>`. You need to have 45 GB of disk space available to run this example.
- 2. On Google Cloud using Dataflow: `python main.py --runner DataflowRunner --model_state_dict_path <local or remote path to state_dict> --project PROJECT_ID --region REGION --requirements_file requirements.txt --temp_location GCS_PATH`. Make sure to pass other arguments as mentioned [here](https://cloud.google.com/dataflow/docs/guides/setting-pipeline-options#setting_required_options)
2. On Google Cloud using Dataflow: `python main.py --runner DataflowRunner --model_state_dict_path <gs://path/to/saved/state_dict.pth> --project <PROJECT_ID> --region <REGION> --requirements_file requirements.txt --temp_location <gs://path/to/temp/location> --experiments "use_runner_v2,no_use_multiple_sdk_containers" --machine_type=n2-standard-16`. You can also pass other configuration parameters as described [here](https://cloud.google.com/dataflow/docs/guides/setting-pipeline-options#setting_required_options).
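
For orientation, here is a minimal sketch of the core of such a `RunInference` pipeline. It assumes `make_tensor_model_fn` (available in newer Beam releases) and the handler arguments shown, which are assumptions rather than a copy of `main.py`; `main.py` is the authoritative version:

```
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import (
    PytorchModelHandlerTensor,
    make_tensor_model_fn,
)
from transformers import AutoConfig, AutoTokenizer, T5ForConditionalGeneration

model_name = "t5-11b"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# T5 is generative, so inference should call model.generate() rather than
# model.forward(); make_tensor_model_fn wraps the named method.
model_handler = PytorchModelHandlerTensor(
    state_dict_path="path/to/save/state_dict.pth",
    model_class=T5ForConditionalGeneration,
    model_params={"config": AutoConfig.from_pretrained(model_name)},
    inference_fn=make_tensor_model_fn("generate"),
)

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | beam.Create(["translate English to German: The house is wonderful."])
        | beam.Map(lambda text: tokenizer(text, return_tensors="pt").input_ids[0])
        | RunInference(model_handler)
        | beam.Map(lambda r: tokenizer.decode(r.inference, skip_special_tokens=True))
        | beam.Map(print)
    )
```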

### Pipeline Steps
The pipeline contains the following steps: