Playground for RAG (Retrieval-Augmented Generation) locally / on-prem
This project implements a RAG system using Postgres with pgvector for efficient similarity search, and Ollama for embeddings and text generation.
- Easy setup and cleanup scripts
- Vector similarity search using pgvector
- Local embedding generation with Ollama
- Text generation using Ollama models
- Multiple interaction modes:
  - Single query mode
  - Continuous mode for multiple one-turn queries
  - Multi-turn conversation mode with context retention
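At its core, vector similarity search ranks stored embeddings by their distance to a query embedding. A minimal sketch in plain Python of cosine distance (what pgvector's `<=>` operator computes), with toy two-dimensional vectors standing in for real embeddings:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance (1 - cosine similarity), as computed by pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Rank a few toy "document" vectors against a query vector: smallest distance first.
query = [1.0, 0.0]
docs = {"jacket": [0.9, 0.1], "tent": [0.0, 1.0]}
ranked = sorted(docs, key=lambda name: cosine_distance(query, docs[name]))
```

In the real system, Postgres does this ranking server-side (e.g. `ORDER BY embedding <=> $1 LIMIT n`) instead of in Python.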
- Docker
- Python 3
- Ollama CLI installed (for local embedding generation and text generation)
- `seed_data.json`: seed data for the playground with embeddings for products, copied over from Azure-Samples/rag-postgres-openai-python
- `seed_data_no_embeds.json`: the same seed data but without embeddings, for testing embedding generation
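The no-embeds variant can be derived from the full seed file by dropping the embedding field. A sketch (the `embedding` key name and the record shape are assumptions about the file layout, illustrated here with an inline toy record rather than the real file):

```python
import json

def strip_embeddings(items: list[dict]) -> list[dict]:
    # Drop the (assumed) "embedding" key from each record, keeping everything else.
    return [{k: v for k, v in item.items() if k != "embedding"} for item in items]

# Toy record standing in for one entry of seed_data.json
sample = [{"id": 1, "name": "Waterproof jacket", "embedding": [0.1, 0.2, 0.3]}]
stripped = strip_embeddings(sample)
print(json.dumps(stripped))
```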
- Clone this repository:

  ```bash
  git clone https://github.com/your-username/rag-playground
  cd rag-playground
  ```

- Run the preparation script:

  ```bash
  chmod +x prepare.sh
  ./prepare.sh
  ```
This script will:

- Pull the required Ollama models
- Set up a Python virtual environment
- Install the required Python packages
- Create a `.env` file from `.env.example`
- Start the Postgres database with pgvector

After the script finishes:

- Update the `.env` file with your specific configuration if needed.
- Load the sample data and create embeddings:

  ```bash
  python3 main.py setup
  ```
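For orientation, a hypothetical `.env` might look like the following. Only `TEXT_GEN_MODEL_NAME` is actually named in this README; every other key and value here is an assumption about typical Postgres and Ollama settings, so defer to `.env.example` for the real layout:

```shell
# Hypothetical values; copy .env.example and adjust rather than using these verbatim.
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=rag_playground
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres

# Ollama model names here are illustrative assumptions.
EMBED_MODEL_NAME=nomic-embed-text
TEXT_GEN_MODEL_NAME=llama3
```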
Ensure your Python virtual environment is activated:

```bash
source venv/bin/activate
```

Run a single search query:

```bash
python3 main.py search --query "Your search query here"
```

Example:

```bash
python3 main.py search --query "I need a waterproof jacket for hiking below $200 USD"
```
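Under the hood, a single query typically follows embed → retrieve → generate. A rough sketch of that pipeline, assuming the `ollama` Python client and a `psycopg`-style connection; the table/column names, model names, and the `rag_answer` helper are illustrative assumptions, not the project's actual code:

```python
import json

def build_prompt(query: str, docs: list[str]) -> str:
    """Combine retrieved documents and the user query into one generation prompt."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def rag_answer(query: str, conn, n: int = 5) -> str:
    # Requires a running Ollama server and the pgvector database from prepare.sh.
    import ollama  # assumed available in the project's virtual environment
    emb = ollama.embeddings(model="nomic-embed-text", prompt=query)["embedding"]
    with conn.cursor() as cur:
        # Schema names here are assumptions about the project's database layout.
        cur.execute(
            "SELECT description FROM products ORDER BY embedding <=> %s::vector LIMIT %s",
            (json.dumps(emb), n),
        )
        docs = [row[0] for row in cur.fetchall()]
    return ollama.generate(model="llama3", prompt=build_prompt(query, docs))["response"]

# The prompt-building step is pure and can be exercised without any server:
example = build_prompt("Any waterproof jackets?", ["TrailBlazer jacket: waterproof, $150"])
```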
Run multiple one-turn queries in succession:

```bash
python3 main.py search --continuous
```

In continuous mode, you can use the following commands:

- `help`: Display available commands
- `clear`: Clear the screen
- `exit`: Exit the program
Engage in a multi-turn conversation with context retention:

```bash
python3 main.py search --multi-turn
```

In multi-turn mode, you can use the following commands:

- `help`: Display available commands
- `clear`: Clear the screen
- `clear history`: Clear the conversation history
- `exit`: Exit the program
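Context retention in multi-turn mode amounts to keeping prior turns and folding them into the next prompt. A minimal sketch of that idea (an assumed structure, not the project's implementation; the product name is made up):

```python
class Conversation:
    """Keeps (role, text) turns so follow-up questions can reference earlier ones."""

    def __init__(self):
        self.history: list[tuple[str, str]] = []

    def add(self, role: str, text: str) -> None:
        self.history.append((role, text))

    def clear(self) -> None:  # backs the `clear history` command
        self.history.clear()

    def prompt_for(self, query: str) -> str:
        past = "\n".join(f"{role}: {text}" for role, text in self.history)
        return f"{past}\nuser: {query}" if past else f"user: {query}"

conv = Conversation()
conv.add("user", "Any waterproof jackets?")
conv.add("assistant", "Yes, the TrailBlazer jacket is waterproof.")
followup = conv.prompt_for("Is it under $200?")
```

The follow-up prompt now carries the earlier turns, so "it" can resolve to the jacket mentioned before.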
- To add more products or update existing ones, modify the `data/seed_data_no_embeds.json` file and re-run the setup process.
- Adjust the number of results returned by modifying the `n` parameter in the `vector_similarity_search` function in `rag_search.py`.
- Change the language model for response generation by updating the `TEXT_GEN_MODEL_NAME` in your `.env` file.
To remove all generated files, stop containers, and clean up the project:

```bash
chmod +x cleanup.sh
./cleanup.sh
```
If you encounter any issues:
- Ensure that Docker is running and that you have the necessary permissions.
- Check that Ollama is installed correctly and the required models are available.
- Verify that your `.env` file is configured correctly.
- If you're having database connection issues, ensure that the Postgres container is running and that the connection details in your `.env` file are correct.
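A quick sanity check for the connection settings can be scripted: flag any required keys that are missing or empty before attempting to connect. A sketch (the key names are assumptions about the `.env` layout, matching nothing authoritative in this README):

```python
# Assumed connection keys; adjust to whatever .env.example actually defines.
REQUIRED_KEYS = {"POSTGRES_HOST", "POSTGRES_PORT", "POSTGRES_DB",
                 "POSTGRES_USER", "POSTGRES_PASSWORD"}

def missing_settings(env: dict) -> set:
    """Return which of the (assumed) required connection keys are absent or empty."""
    return {k for k in REQUIRED_KEYS if not env.get(k)}

# Example: a partially filled configuration
cfg = {"POSTGRES_HOST": "localhost", "POSTGRES_PORT": "5432"}
```

Running `missing_settings` on the environment loaded from `.env` pinpoints which values still need filling in.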
If problems persist - good luck!
This project is licensed under the MIT License - see the LICENSE file for details.