Add support for hpu_backend and Resnet50 compile example
wozna committed Jun 10, 2024
1 parent 36049cb commit 6dff71c
Showing 6 changed files with 110 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/README.md
@@ -41,6 +41,7 @@ TorchServe is a performant, flexible and easy to use tool for serving PyTorch ea
- [TorchServe Integrations](../examples/README.md#torchserve-integrations)
- [TorchServe UseCases](../examples/README.md#usecases)
* [Workflow Examples](https://github.com/pytorch/serve/tree/master/examples/Workflows) - Examples of how to compose models in a workflow with TorchServe
* [Resnet50 HPU compile](../examples/pt2/torch_compile_hpu/README.md) - An example of how to run the model in compile mode with the HPU device

## Advanced Features

81 changes: 81 additions & 0 deletions examples/pt2/torch_compile_hpu/README.md
@@ -0,0 +1,81 @@

# TorchServe Inference with torch.compile and the HPU Backend for a ResNet50 Model

This guide shows how to optimize a ResNet50 model using `torch.compile` with the [HPU backend](https://docs.habana.ai/en/latest/PyTorch/Inference_on_PyTorch/Getting_Started_with_Inference.html), with the goal of improving inference performance when the model is served through TorchServe. `torch.compile` performs just-in-time compilation of PyTorch models: the model is compiled on its first invocation rather than ahead of time.
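
Outside of TorchServe, the same backend can be exercised directly. Below is a minimal stand-alone sketch (an illustration, not part of this example's files), assuming a Gaudi machine with the Habana PyTorch stack installed and `PT_HPU_LAZY_MODE=0` set in the environment:

```python
# Stand-alone sketch of torch.compile with the HPU backend (assumes a Gaudi
# host with the Habana PyTorch bridge installed and PT_HPU_LAZY_MODE=0).
import torch
import habana_frameworks.torch.core as htcore  # noqa: F401  # registers the "hpu" device
import torchvision

model = torchvision.models.resnet50(weights=None).eval().to("hpu")
model = torch.compile(model, backend="hpu_backend")

with torch.no_grad():
    # The first call triggers compilation; later calls reuse the compiled graph.
    out = model(torch.randn(1, 3, 224, 224).to("hpu"))
```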

### Prerequisites
- `Intel® Gaudi® AI accelerator software for PyTorch` - see the [Installation Guide](https://docs.habana.ai/en/latest/Installation_Guide/index.html), which covers installation, software verification, and ongoing software updates and management.

## Workflow
1. Configure torch.compile.
2. Create model archive.
3. Start TorchServe.
4. Run Inference.
5. Stop TorchServe.

First, navigate to `examples/pt2/torch_compile_hpu`
```bash
cd examples/pt2/torch_compile_hpu
```

### 1. Configure torch.compile

`torch.compile` accepts various configuration options that can influence performance. Explore the available options in the [official PyTorch documentation](https://pytorch.org/docs/stable/generated/torch.compile.html).


In this example, we use the following configuration, provided in the `model-config.yaml` file:

```yaml
minWorkers: 1
maxWorkers: 1
pt2: {backend: "hpu_backend"}
```
`pt2: {backend: "hpu_backend"}` - this line enables compile mode; if you remove it from the config file, the model will run in eager mode.
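
Conceptually, this option tells the serving handler to wrap the loaded model with `torch.compile`, forwarding the backend string from the YAML file. A simplified sketch of the idea (not TorchServe's exact internals, which live in the base handler and may differ):

```python
# Simplified illustration of how a pt2 entry from model-config.yaml could be
# applied to the loaded model.
import torch

def maybe_compile(model, pt2_config):
    if pt2_config:  # e.g. {"backend": "hpu_backend"}
        return torch.compile(model, backend=pt2_config["backend"])
    return model  # no pt2 entry -> eager mode
```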

### 2. Create model archive

Download the pre-trained model and prepare the model archive. `PT_HPU_LAZY_MODE=0` disables the default lazy execution mode, which is required when working with `torch.compile`:
```bash
wget https://download.pytorch.org/models/resnet50-11ad3fa6.pth
mkdir model_store
PT_HPU_LAZY_MODE=0 torch-model-archiver --model-name resnet-50 --version 1.0 --model-file model.py \
--serialized-file resnet50-11ad3fa6.pth --export-path model_store \
--extra-files ../../image_classifier/index_to_name.json --handler hpu_image_classifier.py \
--config-file model-config.yaml
```

### 3. Start TorchServe

Start the TorchServe server using the following command:
```bash
PT_HPU_LAZY_MODE=0 torchserve --start --ncs --model-store model_store --models resnet-50.mar
```
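
Optionally, verify that the server is up before sending inference requests. A small sketch using TorchServe's `/ping` endpoint (assumes the `requests` package is installed):

```python
# Optional health check (illustrative): TorchServe's inference API exposes /ping.
import requests

resp = requests.get("http://127.0.0.1:8080/ping")
print(resp.json())  # expected: {"status": "Healthy"}
```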

### 4. Run Inference

**Note:** `torch.compile` requires a warm-up phase to reach optimal performance. Ensure you run at least as many inference requests as the configured `maxWorkers` before measuring performance.

```bash
# Open a new terminal
cd examples/pt2/torch_compile_hpu
curl http://127.0.0.1:8080/predictions/resnet-50 -T ../../image_classifier/kitten.jpg
```

The expected output will be JSON-formatted classification probabilities, such as:

```json
{
  "tabby": 0.2724992632865906,
  "tiger_cat": 0.1374046504497528,
  "Egyptian_cat": 0.046274710446596146,
  "lynx": 0.003206699388101697,
  "lens_cap": 0.002257900545373559
}
```
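
The same request can be issued from Python. A minimal client sketch (hypothetical, not part of this example's files; assumes the `requests` package is installed), including a few warm-up calls per the note above:

```python
# Hypothetical Python equivalent of the curl call above.
import requests

URL = "http://127.0.0.1:8080/predictions/resnet-50"
IMAGE = "../../image_classifier/kitten.jpg"

with open(IMAGE, "rb") as f:
    payload = f.read()

# Warm-up: torch.compile compiles on the first call(s), so send a few
# requests before timing anything.
for _ in range(3):
    requests.post(URL, data=payload)

print(requests.post(URL, data=payload).json())
```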

### 5. Stop TorchServe
Stop TorchServe with the following command:

```bash
torchserve --stop
```
18 changes: 18 additions & 0 deletions examples/pt2/torch_compile_hpu/hpu_image_classifier.py
@@ -0,0 +1,18 @@
import habana_frameworks.torch.core as htcore # nopycln: import
import torch

from ts.torch_handler.image_classifier import ImageClassifier


class HPUImageClassifier(ImageClassifier):
    def set_hpu(self):
        self.map_location = "hpu"
        self.device = torch.device(self.map_location)

    def _load_pickled_model(self, model_dir, model_file, model_pt_path):
        """
        Overriding this method lets us set the device to HPU and reuse the
        default base handler without modifying it.
        """
        model = super()._load_pickled_model(model_dir, model_file, model_pt_path)
        self.set_hpu()
        return model
3 changes: 3 additions & 0 deletions examples/pt2/torch_compile_hpu/model-config.yaml
@@ -0,0 +1,3 @@
minWorkers: 1
maxWorkers: 1
pt2: {backend: "hpu_backend"}
6 changes: 6 additions & 0 deletions examples/pt2/torch_compile_hpu/model.py
@@ -0,0 +1,6 @@
from torchvision.models.resnet import Bottleneck, ResNet


class ImageClassifier(ResNet):
    def __init__(self):
        # Bottleneck blocks in a [3, 4, 6, 3] layout define the ResNet50 architecture.
        super(ImageClassifier, self).__init__(Bottleneck, [3, 4, 6, 3])
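
As a quick sanity check (a sketch, not part of the commit), this class should load the downloaded ResNet50 checkpoint without missing or unexpected keys:

```python
# Run from examples/pt2/torch_compile_hpu after downloading the checkpoint.
import torch
from model import ImageClassifier

model = ImageClassifier()
model.load_state_dict(torch.load("resnet50-11ad3fa6.pth", map_location="cpu"))
model.eval()
```
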
1 change: 1 addition & 0 deletions ts/utils/util.py
@@ -27,6 +27,7 @@ class PT2Backend(str, enum.Enum):
    IPEX = "ipex"
    TORCHXLA_TRACE_ONCE = "torchxla_trace_once"
    OPENVINO = "openvino"
    HPU_BACKEND = "hpu_backend"


logger = logging.getLogger(__name__)
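
A quick way to confirm the new member is registered (an illustrative sketch; this enum is presumably what TorchServe consults when validating the `pt2` backend string from `model-config.yaml`):

```python
# Minimal check of the new enum member; run from a TorchServe checkout.
from ts.utils.util import PT2Backend

assert PT2Backend.HPU_BACKEND.value == "hpu_backend"
assert "hpu_backend" in {b.value for b in PT2Backend}
```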
