Update readmes #35

Merged: 2 commits, Feb 2, 2024
11 changes: 0 additions & 11 deletions Makefile

This file was deleted.

7 changes: 5 additions & 2 deletions chatbot-langchain/README.md
@@ -7,9 +7,12 @@ podman build -t stchat . -f builds/Containerfile
```
### Run image locally

-Make sure your model service is up and running before starting this container image.
+Make sure the playground model service is up and running before starting this container image.
+To start the model service, refer to [the playground document](../playground/README.md)


```bash
-podman run -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://10.88.0.1:8001/v1 stchat
+podman run --rm -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://10.88.0.1:8001/v1 stchat
```

Interact with the application from your local browser at `localhost:8501`
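Once the container is running, a quick smoke test can confirm the UI is reachable before opening the browser. This is only a sketch: the `/_stcore/health` path is an assumption based on recent Streamlit releases (older ones used `/healthz`).

```shell
# Probe the Streamlit app on the published port; prints "up" on success.
# The /_stcore/health route is assumed from recent Streamlit versions.
curl -sf http://localhost:8501/_stcore/health >/dev/null && echo "up"
```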
4 changes: 2 additions & 2 deletions finetune/README.md
@@ -26,7 +26,7 @@ This also assumes that `<location/of/your/data/>` contains the following 2 files
### Run the image

```bash
-podman run -it -v <location/of/your/data/>:/locallm/data/ finetunellm
+podman run --rm -it -v <location/of/your/data/>:/locallm/data/ finetunellm
```
This will run 10 iterations of LoRA finetuning and generate a new model that can be exported and used in another chat application. I'll caution that 10 iterations is likely insufficient to see a real change in the model outputs, but it serves here for demo purposes.

@@ -56,4 +56,4 @@ podman run -it -v <location/of/your/data/>:/locallm/data/ \
-e NEW_MODEL=<name-of-new-finetuned-model.gguf>
finetunellm

-```
+```
36 changes: 36 additions & 0 deletions playground/README.md
@@ -0,0 +1,36 @@
### Build Model Service

From this directory,

```bash
podman build -t playground:image .
```

### Download Model

At the time of this writing, two models are known to work with this service:

- **Llama2-7b**
- Download URL: [https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf)
- **Mistral-7b**
- Download URL: [https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf)

```bash
cd ../models
wget <Download URL>
cd ../
```
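For example, fetching the Mistral-7b build listed above might look like this (the filename is taken from its download URL; adjust if you pick Llama2-7b instead):

```shell
cd ../models
# Multi-GB download; the filename is the last path segment of the URL above.
wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf
cd ../
```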

### Deploy Model Service

Deploy the LLM server and volume mount the model of choice.

```bash
podman run --rm -it -d \
-p 8001:8001 \
-v Local/path/to/locallm/models:/locallm/models:ro,Z \
-e MODEL_PATH=models/<model-filename> \
-e HOST=0.0.0.0 \
-e PORT=8001 \
playground:image
```
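After the container starts, a liveness probe is worth running. This sketch assumes the service exposes an OpenAI-compatible API under `/v1` (the chat app's `MODEL_SERVICE_ENDPOINT` points at a `/v1` path, which suggests it does):

```shell
# Assumption: the model service speaks the OpenAI-compatible /v1 API.
# A healthy server should answer with JSON describing the loaded model(s).
curl -s http://localhost:8001/v1/models
```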
9 changes: 4 additions & 5 deletions rag-langchain/README.md
@@ -8,12 +8,11 @@ This example will deploy a local RAG application using a chromadb server, a llam
Use the existing ChromaDB image to deploy a vector store service.

* `podman pull chromadb/chroma`
-* `podman run -it -p 8000:8000 chroma`
+* `podman run --rm -it -p 8000:8000 chroma`
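To verify the vector store is up before wiring the app to it, ChromaDB builds of this era expose a heartbeat endpoint; the exact route is an assumption and may differ across releases:

```shell
# Assumed ChromaDB heartbeat route; a healthy server returns a small JSON
# payload with a nanosecond timestamp.
curl -s http://localhost:8000/api/v1/heartbeat
```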

### Deploy Model Service

Deploy the LLM server and volume mount the model of choice.
-* `podman run -it -p 8001:8001 -v Local/path/to/locallm/models:/locallm/models:Z -e MODEL_PATH=models/llama-2-7b-chat.Q5_K_S.gguf -e HOST=0.0.0.0 -e PORT=8001 playground`
+To start the model service, refer to [the playground model-service document](../playground/README.md)

### Build and Deploy RAG app
Deploy a small application that can populate the vectorDB and generate a response with the LLM.
@@ -31,5 +30,5 @@ snapshot_download(repo_id="BAAI/bge-base-en-v1.5",
Follow the instructions below to build your container image and run it locally.

* `podman build -t ragapp rag-langchain -f rag-langchain/builds/Containerfile`
-* `podman run -it -p 8501:8501 -v Local/path/to/locallm/models/:/rag/models:Z -v Local/path/to/locallm/data:/rag/data:Z ragapp -- -H 10.88.0.1 -m http://10.88.0.1:8001/v1`
+* `podman run --rm -it -p 8501:8501 -v Local/path/to/locallm/models/:/rag/models:Z -v Local/path/to/locallm/data:/rag/data:Z ragapp -- -H 10.88.0.1 -m http://10.88.0.1:8001/v1`

2 changes: 1 addition & 1 deletion summarizer/README.md
@@ -40,7 +40,7 @@ The user should provide the model name, the architecture and image name they wan
Once the model service image is built, it can be run with the following:

```bash
-podman run -it -p 7860:7860 summarizer
+podman run --rm -it -p 7860:7860 summarizer
```
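Port 7860 suggests a Gradio UI; a minimal check that the app is serving, making no assumptions about specific routes:

```shell
# Confirm something is answering on 7860; -I fetches the headers only,
# and head keeps just the HTTP status line.
curl -sI http://localhost:7860 | head -n 1
```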
### Interact with the app
