create cu python library #1

Draft
wants to merge 1 commit into
base: main
116 changes: 116 additions & 0 deletions .gitignore
@@ -0,0 +1,116 @@
# Python
**/__pycache__/
**/*.py[cod]
**/*$py.class
*.so
__init__.py.bak
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
**/*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
env/
ENV/
.venv/

# IDE
.idea/
.vscode/
*.swp
*.swo

# Testing
.coverage
htmlcov/
.pytest_cache/
.tox/

# Distribution
*.tar.gz
*.zip

# Logs
*.log

# Local development
.env
.env.local
.env.*.local

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Virtual environments
venv/
env/
ENV/

# IDEs
.vscode/
.idea/
*.swp
*.swo
179 changes: 178 additions & 1 deletion README.md
@@ -1 +1,178 @@
# cu-playwright-python
# Kernel Python Sample App - Computer Use

This is a simple Kernel application that implements a prompt loop using Anthropic Computer Use.

It generally follows the [Anthropic Reference Implementation](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo) but replaces `xdotool` and `gnome-screenshot` with Playwright.

# cu-playwright-python

Computer Use Agent for Python - Browser automation with Claude using Anthropic's Computer Use capabilities.

This library provides a clean, simple interface to Anthropic's Computer Use capabilities, allowing Claude to interact with web pages through Playwright. It's the Python equivalent of [cu-playwright-ts](https://github.com/anthropics/cu-playwright-ts).

## Features

- 🤖 **AI-Powered Browser Automation**: Let Claude control web browsers with natural language
- 🎯 **Simple API**: Clean, intuitive interface for complex browser interactions
- 📊 **Structured Responses**: Get validated data back using Pydantic models
- 🛠️ **Built on Playwright**: Reliable browser automation underneath
- ⚡ **Async/Await**: Modern Python async support

## Installation

```bash
pip install cu-playwright-python
```

You'll also need to install Playwright browsers:

```bash
playwright install chromium
```

## Quick Start

```python
import asyncio
import os
from playwright.async_api import async_playwright
from cu_playwright_python import ComputerUseAgent

async def main():
    # Set up your Anthropic API key
    api_key = os.getenv("ANTHROPIC_API_KEY")

    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch(headless=False)
        page = await browser.new_page()

        # Create agent
        agent = ComputerUseAgent(
            api_key=api_key,
            page=page
        )

        # Navigate and interact
        await page.goto("https://example.com")
        result = await agent.execute("What is the title of this page?")
        print(result)

        await browser.close()

asyncio.run(main())
```
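
The Quick Start reads the key with `os.getenv`, which returns `None` when the variable is unset, so a missing key would only surface later as an API error. A minimal sketch of a fail-fast check follows. The helper name `require_api_key` is hypothetical and not part of this library.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ANTHROPIC_API_KEY, or raise if it is missing or blank.

    Hypothetical helper for illustration; not provided by cu-playwright-python.
    """
    # Default to the real process environment; accept a mapping for testing.
    source = os.environ if env is None else env
    key = source.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return key
```

Calling `require_api_key()` before launching the browser would stop the script early instead of after Chromium has started.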

## Usage

### Basic Text Responses

```python
# Simple queries return text
result = await agent.execute("Describe what you see on this page")
print(result) # "This page contains a header with..."
```

### Structured Data with Pydantic

```python
from pydantic import BaseModel

class ProductInfo(BaseModel):
    name: str
    price: float
    in_stock: bool

# Get structured data back
product = await agent.execute(
    "Find the main product on this page and extract its details",
    schema=ProductInfo
)
print(f"Product: {product.name}, Price: ${product.price}")
```

### Advanced Options

```python
result = await agent.execute(
    "Fill out the contact form with realistic data",
    system_prompt_suffix="Be extra careful with form validation",
    thinking_budget=2048  # More reasoning tokens
)
```

## API Reference

### ComputerUseAgent

The main class for browser automation with Claude.

#### Constructor

```python
ComputerUseAgent(
    api_key: str,                           # Anthropic API key
    page: Page,                             # Playwright page instance
    model: str = "claude-sonnet-4-20250514" # Claude model to use
)
```

#### execute()

Execute a computer use task with Claude.

```python
async def execute(
    query: str,                                # Task description
    *,
    schema: Optional[Type[BaseModel]] = None,  # Pydantic model for structured responses
    system_prompt_suffix: str = "",            # Additional system instructions
    thinking_budget: int = 1024                # Token budget for Claude's reasoning
) -> Union[str, BaseModel]
```
```

**Parameters:**
- `query`: Natural language description of what you want Claude to do
- `schema`: Optional Pydantic model class for structured responses
- `system_prompt_suffix`: Additional instructions appended to the system prompt
- `thinking_budget`: Token budget for Claude's internal reasoning process
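
The default `thinking_budget` of 1024 matches the minimum that Anthropic's extended thinking accepts for `budget_tokens`. The sketch below checks a budget before a call, assuming this library forwards `thinking_budget` as `budget_tokens` (the README does not confirm this). The helper `check_thinking_budget` and the `max_tokens` argument are hypothetical, and the upper-bound rule applies to standard, non-interleaved thinking.

```python
MIN_THINKING_BUDGET = 1024  # minimum budget_tokens accepted for extended thinking


def check_thinking_budget(thinking_budget: int, max_tokens: int) -> int:
    """Validate a thinking budget and return it unchanged.

    Hypothetical helper; `max_tokens` stands for the response token limit
    of the request and is not a parameter of execute().
    """
    if thinking_budget < MIN_THINKING_BUDGET:
        raise ValueError(
            f"thinking_budget must be at least {MIN_THINKING_BUDGET}, got {thinking_budget}"
        )
    # Assumption: without interleaved thinking, the budget must leave room
    # for the visible response, so it has to stay below max_tokens.
    if thinking_budget >= max_tokens:
        raise ValueError("thinking_budget must be smaller than max_tokens")
    return thinking_budget
```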

**Returns:**
- `str`: When no schema is provided
- `BaseModel`: Instance of the provided schema class when schema is specified
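
Because the return type depends on whether `schema` was passed, code that handles both cases needs to branch on it. A minimal sketch, assuming Pydantic v2 (where models expose `model_dump()`). The helper `result_to_dict` is hypothetical and not part of this library.

```python
from typing import Any, Dict


def result_to_dict(result: Any) -> Dict[str, Any]:
    """Normalize an execute() result into a plain dict.

    Hypothetical helper for illustration only.
    """
    if isinstance(result, str):
        # No schema was given: wrap the free-text answer.
        return {"text": result}
    # A schema was given: Pydantic v2 models provide model_dump().
    return result.model_dump()
```

This keeps logging or serialization code identical for text and structured responses.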

## Examples

See [example.py](example.py) for comprehensive usage examples including:

- Basic page interactions
- Structured data extraction with Pydantic models
- Complex multi-step workflows
- Custom system prompts

## Requirements

- Python 3.9+
- An Anthropic API key ([get one here](https://console.anthropic.com/))
- Playwright browsers installed

## Model Compatibility

This library works with Anthropic's Computer Use compatible models:

- `claude-sonnet-4-20250514` (recommended)
- Other Computer Use enabled models

See [Anthropic's documentation](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/computer-use-tool#model-compatibility) for the latest model compatibility information.

## Related Projects

- [cu-playwright-ts](https://github.com/anthropics/cu-playwright-ts) - TypeScript version
- [Anthropic Computer Use Demo](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo)

## License

MIT - see [LICENSE](LICENSE) for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
11 changes: 11 additions & 0 deletions __init__.py
@@ -0,0 +1,11 @@
"""
cu-playwright-python: Computer Use Agent for Python

A Python library for automating browser interactions with Claude using Anthropic's Computer Use capabilities.
"""

from .agent import ComputerUseAgent
from .tools import ToolVersion

__version__ = "0.1.0"
__all__ = ["ComputerUseAgent", "ToolVersion"]