Commit e111ee3

docs(examples): Enhance SDK with new examples and environment configuration (#2)

* docs(examples): Enhance SDK with new examples and environment configuration
* docs: Update README and examples to clarify tool listing and usage in SDK
* fix: Update client initialization URLs to include /v1 as the base URL for consistency
* docs(tools-use): Enhance README and examples with tool usage and definitions
* docs(examples): Enhance chat completion examples with error handling and streaming support
* docs(examples): Update README files to enhance clarity and usage instructions
* docs: Enhance MCP tools section in README with server-side management details

Signed-off-by: Eden Reich <eden.reich@gmail.com>

1 parent 70a3229 commit e111ee3

22 files changed (+804, -105 lines)

.gitattributes

Lines changed: 2 additions & 0 deletions

```diff
@@ -1 +1,3 @@
 .devcontainer/** linguist-vendored=true
+
+inference_gateway/models.py linguist-generated=true
```

.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -6,3 +6,4 @@ dist
 .coverage
 node_modules/
 .mypy_cache/
+**/.env
```

README.md

Lines changed: 84 additions & 29 deletions

````diff
@@ -17,7 +17,9 @@
 - [Error Handling](#error-handling)
 - [Advanced Usage](#advanced-usage)
   - [Using Tools](#using-tools)
+  - [Listing Available MCP Tools](#listing-available-mcp-tools)
   - [Custom HTTP Configuration](#custom-http-configuration)
+- [Examples](#examples)
 - [License](#license)
 
 A modern Python SDK for interacting with the [Inference Gateway](https://github.com/edenreich/inference-gateway), providing a unified interface to multiple AI providers.
@@ -41,17 +43,17 @@ pip install inference-gateway
 ### Basic Usage
 
 ```python
-from inference_gateway import InferenceGatewayClient, Message, MessageRole
+from inference_gateway import InferenceGatewayClient, Message
 
 # Initialize client
-client = InferenceGatewayClient("http://localhost:8080")
+client = InferenceGatewayClient("http://localhost:8080/v1")
 
 # Simple chat completion
 response = client.create_chat_completion(
     model="openai/gpt-4",
     messages=[
-        Message(role=MessageRole.SYSTEM, content="You are a helpful assistant"),
-        Message(role=MessageRole.USER, content="Hello!")
+        Message(role="system", content="You are a helpful assistant"),
+        Message(role="user", content="Hello!")
     ]
 )
 
@@ -70,18 +72,18 @@ print(response.choices[0].message.content)
 from inference_gateway import InferenceGatewayClient
 
 # Basic configuration
-client = InferenceGatewayClient("http://localhost:8080")
+client = InferenceGatewayClient("http://localhost:8080/v1")
 
 # With authentication
 client = InferenceGatewayClient(
-    "http://localhost:8080",
+    "http://localhost:8080/v1",
     token="your-api-token",
     timeout=60.0  # Custom timeout
 )
 
 # Using httpx instead of requests
 client = InferenceGatewayClient(
-    "http://localhost:8080",
+    "http://localhost:8080/v1",
     use_httpx=True
 )
 ```
@@ -105,13 +107,13 @@ print("OpenAI models:", openai_models)
 #### Standard Completion
 
 ```python
-from inference_gateway import Message, MessageRole
+from inference_gateway import Message
 
 response = client.create_chat_completion(
     model="openai/gpt-4",
     messages=[
-        Message(role=MessageRole.SYSTEM, content="You are a helpful assistant"),
-        Message(role=MessageRole.USER, content="Explain quantum computing")
+        Message(role="system", content="You are a helpful assistant"),
+        Message(role="user", content="Explain quantum computing")
     ],
     max_tokens=500
 )
@@ -126,7 +128,7 @@ print(response.choices[0].message.content)
 for chunk in client.create_chat_completion_stream(
     model="ollama/llama2",
     messages=[
-        Message(role=MessageRole.USER, content="Tell me a story")
+        Message(role="user", content="Tell me a story")
     ],
     use_sse=True
 ):
@@ -136,7 +138,7 @@ for chunk in client.create_chat_completion_stream(
 for chunk in client.create_chat_completion_stream(
     model="anthropic/claude-3",
     messages=[
-        Message(role=MessageRole.USER, content="Explain AI safety")
+        Message(role="user", content="Explain AI safety")
     ],
     use_sse=False
 ):
@@ -186,43 +188,96 @@ except InferenceGatewayError as e:
 ### Using Tools
 
 ```python
-# List available MCP tools works when MCP_ENABLE and MCP_EXPOSE are set on the gateway
-tools = client.list_tools()
-print("Available tools:", tools)
+# Define a weather tool using type-safe Pydantic models
+from inference_gateway.models import ChatCompletionTool, FunctionObject, FunctionParameters
+
+weather_tool = ChatCompletionTool(
+    type="function",
+    function=FunctionObject(
+        name="get_current_weather",
+        description="Get the current weather in a given location",
+        parameters=FunctionParameters(
+            type="object",
+            properties={
+                "location": {
+                    "type": "string",
+                    "description": "The city and state, e.g. San Francisco, CA"
+                },
+                "unit": {
+                    "type": "string",
+                    "enum": ["celsius", "fahrenheit"],
+                    "description": "The temperature unit to use"
+                }
+            },
+            required=["location"]
+        )
+    )
+)
 
-# Use tools in chat completion works when MCP_ENABLE and MCP_EXPOSE are set to false on the gateway
+# Using tools in a chat completion
 response = client.create_chat_completion(
     model="openai/gpt-4",
-    messages=[...],
-    tools=[
-        {
-            "type": "function",
-            "function": {
-                "name": "get_current_weather",
-                "description": "Get the current weather",
-                "parameters": {...}
-            }
-        }
-    ]
+    messages=[
+        Message(role="system", content="You are a helpful assistant with access to weather information"),
+        Message(role="user", content="What is the weather like in New York?")
+    ],
+    tools=[weather_tool]  # Pass the tool definition
 )
+
+print(response.choices[0].message.content)
+
+# Check if the model made a tool call
+if response.choices[0].message.tool_calls:
+    for tool_call in response.choices[0].message.tool_calls:
+        print(f"Tool called: {tool_call.function.name}")
+        print(f"Arguments: {tool_call.function.arguments}")
 ```
 
+### Listing Available MCP Tools
+
+```python
+# List available MCP tools (requires MCP_ENABLE and MCP_EXPOSE to be set on the gateway)
+tools = client.list_tools()
+print("Available tools:", tools)
+```
+
+**Server-Side Tool Management**
+
+The SDK currently supports listing available MCP tools, which is particularly useful for UI applications that need to display connected tools to users. The key advantage is that tools are managed server-side:
+
+- **Automatic Tool Injection**: Tools are automatically inferred and injected into requests by the Inference Gateway server
+- **Simplified Client Code**: No need to manually manage or configure tools in your client application
+- **Transparent Tool Calls**: During streaming chat completions with configured MCP servers, tool calls appear in the response stream - no special handling required except optionally displaying them to users
+
+This architecture allows you to focus on LLM interactions while the gateway handles all tool management complexities behind the scenes.
+
 ### Custom HTTP Configuration
 
 ```python
 # With custom headers
 client = InferenceGatewayClient(
-    "http://localhost:8080",
+    "http://localhost:8080/v1",
     headers={"X-Custom-Header": "value"}
 )
 
 # With proxy settings
 client = InferenceGatewayClient(
-    "http://localhost:8080",
+    "http://localhost:8080/v1",
     proxies={"http": "http://proxy.example.com"}
 )
 ```
 
+## Examples
+
+For comprehensive examples demonstrating various use cases, see the [examples](examples/) directory:
+
+- [List LLMs](examples/list/) - How to list available models
+- [Chat](examples/chat/) - Basic and advanced chat completion examples
+- [Tools](examples/tools/) - Working with function tools
+- [MCP](examples/mcp/) - Model Context Protocol integration examples
+
+Each example includes a detailed README with setup instructions and explanations.
+
 ## License
 
 This SDK is distributed under the MIT License, see [LICENSE](LICENSE) for more information.
````
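The new README code stops after printing the tool call; the usual next step is to dispatch the call to a local handler and return its result to the model. A minimal offline sketch of that dispatch step, where `get_current_weather` is a hypothetical handler and the hand-built `call` dict stands in for a `response.choices[0].message.tool_calls` entry (the SDK's actual response objects use attribute access, as shown above):

```python
import json

def get_current_weather(location, unit="celsius"):
    # Hypothetical handler; a real one would call a weather API.
    return {"location": location, "temperature": 22, "unit": unit}

# Map tool names to local handlers.
HANDLERS = {"get_current_weather": get_current_weather}

def dispatch(tool_call):
    # tool_call mirrors the {"function": {"name", "arguments"}} shape of a
    # tool-call entry; arguments arrive as a JSON-encoded string.
    fn = HANDLERS[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return fn(**args)

call = {"function": {"name": "get_current_weather",
                     "arguments": json.dumps({"location": "New York, NY"})}}
result = dispatch(call)
print(result)
```

In a full loop, `result` would be sent back as a tool-role message so the model can produce its final answer.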

Taskfile.yml

Lines changed: 31 additions & 7 deletions

```diff
@@ -43,7 +43,7 @@ tasks:
         --output inference_gateway/models.py
         --output-model-type pydantic_v2.BaseModel
         --enum-field-as-literal all
-        --target-python-version {{.PYTHON_VERSION}}
+        --target-python-version 3.12
         --use-schema-description
         --use-generic-container-types
         --use-standard-collections
@@ -58,7 +58,8 @@
         --strict-nullable
         --allow-population-by-field-name
         --snake-case-field
-        --strip-default-none
+        --use-default
+        --use-default-kwarg
         --use-title-as-name
       - echo "✅ Models generated successfully"
       - task: format
@@ -67,17 +68,17 @@
     desc: Format code with black and isort
     cmds:
       - echo "Formatting code..."
-      - black inference_gateway/ tests/
-      - isort inference_gateway/ tests/
+      - black inference_gateway/ tests/ examples/
+      - isort inference_gateway/ tests/ examples/
      - echo "✅ Code formatted"
 
   lint:
     desc: Run all linting checks
     cmds:
       - echo "Running linting checks..."
-      - black --check inference_gateway/ tests/
-      - isort --check-only inference_gateway/ tests/
-      - mypy inference_gateway/
+      - black --check inference_gateway/ tests/ examples/
+      - isort --check-only inference_gateway/ tests/ examples/
+      - mypy inference_gateway/ examples/
       - echo "✅ All linting checks passed"
 
   test:
@@ -122,6 +123,29 @@
       - python -m build
       - echo "✅ Package built successfully"
 
+  install-global:
+    desc: Build and install the package globally for testing
+    deps:
+      - build
+    cmds:
+      - echo "Installing package globally..."
+      - pip uninstall -y inference-gateway || true
+      - pip install dist/*.whl --force-reinstall
+      - echo "✅ Package installed globally successfully"
+
+  install-global-dev:
+    desc: Build and install the package globally for testing (skip tests)
+    deps:
+      - clean
+      - format
+    cmds:
+      - echo "Building package (skipping tests)..."
+      - python -m build
+      - echo "Installing package globally..."
+      - pip uninstall -y inference-gateway || true
+      - pip install dist/*.whl --force-reinstall
+      - echo "✅ Package installed globally successfully"
+
   docs:serve:
     desc: Serve documentation locally (placeholder for future docs)
     cmds:
```
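The `--enum-field-as-literal all` flag in the codegen task above is what makes the generated Pydantic models accept plain strings like `role="user"` instead of `MessageRole` enum members, matching the README changes in this commit. A rough sketch of the difference, using illustrative (not the exact generated) field names:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Literal

# Enum style (the older generated code exposed a MessageRole enum):
class MessageRole(str, Enum):
    SYSTEM = "system"
    USER = "user"

# Literal style (roughly what datamodel-code-generator emits with
# --enum-field-as-literal all; shown here on a plain dataclass):
@dataclass
class Message:
    role: Literal["system", "user", "assistant", "tool"]
    content: str

# Callers can now pass plain strings instead of enum members:
m = Message(role="user", content="Hello!")
```

Note that `Literal` is a static-typing construct: a plain dataclass will not reject an invalid role string at runtime, whereas the actual generated Pydantic models validate it.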

examples/.env.example

Lines changed: 48 additions & 0 deletions

```diff
@@ -0,0 +1,48 @@
+
+# General settings
+ENVIRONMENT=production
+ENABLE_TELEMETRY=false
+ENABLE_AUTH=false
+# Model Context Protocol (MCP)
+MCP_ENABLE=false
+MCP_EXPOSE=false
+MCP_SERVERS=
+MCP_CLIENT_TIMEOUT=5s
+MCP_DIAL_TIMEOUT=3s
+MCP_TLS_HANDSHAKE_TIMEOUT=3s
+MCP_RESPONSE_HEADER_TIMEOUT=3s
+MCP_EXPECT_CONTINUE_TIMEOUT=1s
+MCP_REQUEST_TIMEOUT=5s
+# OpenID Connect
+OIDC_ISSUER_URL=http://keycloak:8080/realms/inference-gateway-realm
+OIDC_CLIENT_ID=inference-gateway-client
+OIDC_CLIENT_SECRET=
+# Server settings
+SERVER_HOST=0.0.0.0
+SERVER_PORT=8080
+SERVER_READ_TIMEOUT=30s
+SERVER_WRITE_TIMEOUT=30s
+SERVER_IDLE_TIMEOUT=120s
+SERVER_TLS_CERT_PATH=
+SERVER_TLS_KEY_PATH=
+# Client settings
+CLIENT_TIMEOUT=30s
+CLIENT_MAX_IDLE_CONNS=20
+CLIENT_MAX_IDLE_CONNS_PER_HOST=20
+CLIENT_IDLE_CONN_TIMEOUT=30s
+CLIENT_TLS_MIN_VERSION=TLS12
+# Providers
+ANTHROPIC_API_URL=https://api.anthropic.com/v1
+ANTHROPIC_API_KEY=
+CLOUDFLARE_API_URL=https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai
+CLOUDFLARE_API_KEY=
+COHERE_API_URL=https://api.cohere.ai
+COHERE_API_KEY=
+GROQ_API_URL=https://api.groq.com/openai/v1
+GROQ_API_KEY=
+OLLAMA_API_URL=http://ollama:8080/v1
+OLLAMA_API_KEY=
+OPENAI_API_URL=https://api.openai.com/v1
+OPENAI_API_KEY=
+DEEPSEEK_API_URL=https://api.deepseek.com
+DEEPSEEK_API_KEY=
```
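In the examples, this file is consumed by Docker's `--env-file` flag; if you want to read the same settings from Python, a minimal stdlib parser is enough (libraries like python-dotenv handle quoting and interpolation, which this sketch deliberately skips):

```python
import tempfile
from pathlib import Path

def load_env_file(path):
    """Parse KEY=VALUE lines, skipping blanks and # comments (no quoting/expansion)."""
    env = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '='
            env[key.strip()] = value.strip()
    return env

# Demo: write a tiny .env fragment and parse it.
demo = Path(tempfile.gettempdir()) / "example.env"
demo.write_text("# comment\nMCP_ENABLE=false\nSERVER_PORT=8080\nOPENAI_API_KEY=\n")
cfg = load_env_file(demo)
```

Empty values such as `OPENAI_API_KEY=` parse to empty strings, matching how the gateway treats unset keys.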

examples/README.md

Lines changed: 26 additions & 0 deletions

````diff
@@ -0,0 +1,26 @@
+# Examples
+
+Before starting with the examples, ensure you have the inference-gateway up and running:
+
+1. Copy the `.env.example` file to `.env` and set your provider key.
+
+2. Set your preferred Large Language Model (LLM) provider for the examples:
+
+   ```sh
+   export LLM_NAME=groq/meta-llama/llama-4-scout-17b-16e-instruct
+   ```
+
+3. Run the Docker container:
+
+   ```sh
+   docker run --rm -it -p 8080:8080 --env-file .env -e $LLM_NAME ghcr.io/inference-gateway/inference-gateway:0.7.1
+   ```
+
+It is recommended to set `ENVIRONMENT=development` in your `.env` file to enable debug mode.
+
+The following examples demonstrate how to use the Inference Gateway SDK for various tasks:
+
+- [List LLMs](list/README.md)
+- [Chat](chat/README.md)
+- [Tools](tools/README.md)
+- [MCP](mcp/README.md)
````

examples/__init__.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -0,0 +1 @@
+# Examples package
```
