A containerized Nginx reverse proxy setup using OpenResty for managing Ollama services.
- OpenResty-based: Built on OpenResty (Alpine) for enhanced performance and Lua scripting capabilities
- Configurable Proxy: Easily configurable reverse proxy for Ollama API services
- Docker Compose Ready: Simple deployment with Docker Compose
- Rate Limiting Support: Built-in rate limiting configuration (commented out by default)
- Large File Support: Configured to handle large model files up to 100MB
- Timezone Support: Pre-configured for Asia/Taipei timezone
- Clone this repository:
  ```bash
  git clone https://github.com/Rui0828/Nginx-Ollama.git
  cd Nginx-Ollama
  ```
- Create your Nginx configuration files in the `conf.d` directory:
  ```bash
  mkdir -p conf.d
  ```
- Add your server configurations to `conf.d/` (see the Configuration section below)
- Start the services:
  ```bash
  docker-compose up -d
  ```
- Access your services through `http://localhost:8080`
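For reference, a minimal `docker-compose.yml` along these lines would match the defaults described in this README (port mapping, volumes, timezone); the file shipped in this repository may differ in detail:

```yaml
# Hypothetical minimal compose file; see the docker-compose.yml in this repo for the real version.
services:
  nginx:
    build: .
    ports:
      - "8080:80"                                # host 8080 -> container 80
    volumes:
      - ./conf.d:/etc/nginx/conf.d               # server configurations
      - ./other_tools:/etc/nginx/other_tools     # additional tools (Lua scripts, keys.txt)
    environment:
      - TZ=Asia/Taipei
    extra_hosts:
      - "host.docker.internal:host-gateway"      # lets the proxy reach Ollama on the host (needed on Linux)
```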
```
├── Dockerfile
├── docker-compose.yml
├── nginx.conf
├── conf.d/         # Your server configurations go here
└── other_tools/    # Additional tools and scripts
```
The project includes a pre-configured Ollama proxy with API key authentication. The main configuration lives in `conf.d/ollama.conf`, which proxies requests to `http://host.docker.internal:11434` through the `/ollama/` path.
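As an illustration only, a stripped-down `conf.d/ollama.conf` along these lines would produce the behavior described above; the actual file in the repo (and the exact path to `check_key.lua`) may differ:

```nginx
# Illustrative sketch; consult conf.d/ollama.conf in this repo for the real configuration.
server {
    listen 80;

    location /ollama/ {
        # API key check implemented in Lua (see the Authentication section below);
        # the script path here is assumed, not taken from the repo.
        access_by_lua_file /etc/nginx/other_tools/ollama/check_key.lua;

        # Strip the /ollama/ prefix and forward to the Ollama server on the Docker host
        proxy_pass http://host.docker.internal:11434/;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```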
```bash
# Using X-API-Key header
curl -H "X-API-Key: your-secret-key" http://localhost:8080/ollama/api/tags

# Using Authorization Bearer token
curl -H "Authorization: Bearer your-secret-key" http://localhost:8080/ollama/api/tags

# Using query parameter
curl "http://localhost:8080/ollama/api/tags?api_key=your-secret-key"
```
To use the OpenAI SDK or other OpenAI-compatible clients, use the `/v1` endpoint:

```bash
# OpenAI API compatible endpoint
curl -H "Authorization: Bearer your-secret-key" \
     -H "Content-Type: application/json" \
     -d '{"model": "llama2", "messages": [{"role": "user", "content": "Hello!"}]}' \
     http://localhost:8080/ollama/v1/chat/completions
```
Python example with the OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/ollama/v1",
    api_key="your-secret-key"
)

response = client.chat.completions.create(
    model="llama2",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)

print(response.choices[0].message.content)
```
This setup includes a robust API key authentication system for securing access to your Ollama services. Authentication is implemented with OpenResty's Lua scripting capability through `check_key.lua`, which validates API keys against a whitelist stored in `other_tools/ollama/keys.txt`.
The system supports multiple ways to provide your API key:
- X-API-Key Header (recommended):
  ```bash
  curl -H "X-API-Key: your-secret-key" http://localhost:8080/ollama/api/tags
  ```
- Authorization Bearer Token:
  ```bash
  curl -H "Authorization: Bearer your-secret-key" http://localhost:8080/ollama/api/tags
  ```
- Query Parameter:
  ```bash
  curl "http://localhost:8080/ollama/api/tags?api_key=your-secret-key"
  ```
- Add/Remove Keys: Edit the `other_tools/ollama/keys.txt` file, one key per line (see the snippet after this list for one way to generate a new key):
  ```
  your-secret-key
  abc123
  supersecret
  another-key-here
  ```
- Restart Required: After modifying keys, restart the container:
  ```bash
  docker-compose restart nginx
  ```
- Security Best Practices:
  - Use strong, randomly generated keys
  - Regularly rotate your API keys
  - Keep the `keys.txt` file secure and backed up
  - Consider using environment variables for production deployments
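One possible workflow for adding a strong, randomly generated key (the `openssl` invocation is just one option):

```bash
# Generate a random 64-character hex key, append it to the whitelist, then reload the proxy
NEW_KEY=$(openssl rand -hex 32)
echo "$NEW_KEY" >> other_tools/ollama/keys.txt
echo "New API key: $NEW_KEY"
docker-compose restart nginx
```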
The authentication system automatically allows `OPTIONS` preflight requests to pass through without authentication, ensuring proper CORS support for web applications.
- 401 Unauthorized: Returned when no API key is provided or the key is invalid
- Error Message: `Unauthorized: Invalid or missing API Key`
To enable rate limiting, uncomment the following line in `nginx.conf`:

```nginx
limit_req_zone $binary_remote_addr zone=api:10m rate=50r/s;
```

Then add this to your server block:

```nginx
limit_req zone=api burst=20 nodelay;
```
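For orientation, the two directives live at different levels of the configuration (the `api` zone name matches the line above); a rough sketch:

```nginx
# limit_req_zone belongs in the http context (nginx.conf);
# limit_req is applied per server or location (e.g. in conf.d/ollama.conf)
http {
    limit_req_zone $binary_remote_addr zone=api:10m rate=50r/s;

    server {
        location /ollama/ {
            limit_req zone=api burst=20 nodelay;
            # ... proxy settings ...
        }
    }
}
```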
The container uses the following configuration:
- Timezone: Asia/Taipei
- Exposed Port: 80 (mapped to 8080 on host)
- Log Location: `/var/log/nginx/`

The following host directories are mounted into the container:
- `./conf.d:/etc/nginx/conf.d` - Server configurations
- `./other_tools:/etc/nginx/other_tools` - Additional tools
The configuration includes optimized settings for handling AI model requests:
- Worker Processes: Auto-scaled based on CPU cores
- Worker Connections: 8192 per worker
- Client Max Body Size: 100MB
- Proxy Timeouts: 600 seconds for large model operations
- Keep-Alive: 65 seconds
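In nginx terms, those settings correspond roughly to directives like the following (values taken from the list above; the exact layout of `nginx.conf` in this repo may differ):

```nginx
worker_processes auto;              # auto-scaled to CPU cores

events {
    worker_connections 8192;        # connections per worker
}

http {
    client_max_body_size 100M;      # allow large request bodies
    keepalive_timeout 65;           # keep-alive, in seconds

    # generous timeouts for slow model loads and long generations
    proxy_connect_timeout 600s;
    proxy_send_timeout 600s;
    proxy_read_timeout 600s;
}
```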
```bash
docker build -t nginx-ollama .

docker run -d \
  -p 8080:80 \
  -v $(pwd)/conf.d:/etc/nginx/conf.d \
  -v $(pwd)/other_tools:/etc/nginx/other_tools \
  nginx-ollama
```
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
For issues and questions, please open an issue in the repository.