Computer Use Playwright SDK

A TypeScript SDK that combines Anthropic's Computer Use capabilities with Playwright for browser automation tasks. This SDK provides a clean, type-safe interface for automating browser interactions using Claude's computer use abilities.

Features

🤖 Simple API: Single ComputerUseAgent class for all computer use tasks
🔄 Dual Response Types: Support for both text and structured (JSON) responses
🛡️ Type Safety: Full TypeScript support with Zod schema validation
⚡ Optimized: Clean error handling and robust JSON parsing
🎯 Focused: Clean API surface with sensible defaults

Installation

npm install @onkernel/cu-playwright-ts
# or
yarn add @onkernel/cu-playwright-ts
# or
bun add @onkernel/cu-playwright-ts

Quick Start

import { chromium } from 'playwright';
import { ComputerUseAgent } from '@onkernel/cu-playwright-ts';

const browser = await chromium.launch({ headless: false });
const page = await browser.newPage();

// Navigate to the target page yourself before handing control to the agent
await page.goto("https://news.ycombinator.com/");

const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});

// Simple text response
const answer = await agent.execute('Tell me the title of the top story');
console.log(answer);

await browser.close();

API Reference

`ComputerUseAgent`

The main class for computer use automation.

Constructor

new ComputerUseAgent(options: {
  apiKey: string;
  page: Page;
  model?: string;
})

Parameters:

apiKey (string): Your Anthropic API key. Get one from the Anthropic Console
page (Page): Playwright page instance to control
model (string, optional): Anthropic model to use. Defaults to 'claude-sonnet-4-20250514'

Supported Models: See Anthropic's Computer Use documentation for the latest model compatibility.

`execute()` Method

async execute<T = string>(
  query: string,
  schema?: z.ZodSchema<T>,
  options?: {
    systemPromptSuffix?: string;
    thinkingBudget?: number;
  }
): Promise<T>

Parameters:

query (string): The task description for Claude to execute
schema (ZodSchema, optional): Zod schema for structured responses. When provided, the response will be validated against this schema
options (object, optional):
- systemPromptSuffix (string): Additional instructions appended to the system prompt
- thinkingBudget (number): Token budget for Claude's internal reasoning process. Default: 1024. See Extended Thinking documentation for details

Returns:

Promise<T>: When schema is provided, returns validated data of type T
Promise<string>: When no schema is provided, returns the text response

Usage Examples

Text Response

import { ComputerUseAgent } from '@onkernel/cu-playwright-ts';

// Navigate to the target page first
await page.goto("https://news.ycombinator.com/");

const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});

const result = await agent.execute(
  'Tell me the title of the top story on this page'
);
console.log(result); // "Title of the top story"

Structured Response with Zod

import { z } from 'zod';
import { ComputerUseAgent } from '@onkernel/cu-playwright-ts';

const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});

const HackerNewsStory = z.object({
  title: z.string(),
  points: z.number(),
  author: z.string(),
  comments: z.number(),
  url: z.string().optional(),
});

const stories = await agent.execute(
  'Get the top 5 Hacker News stories with their details',
  z.array(HackerNewsStory).max(5)
);

console.log(stories);
// [
//   {
//     title: "Example Story",
//     points: 150,
//     author: "user123",
//     comments: 42,
//     url: "https://example.com"
//   },
//   ...
// ]

Advanced Options

const result = await agent.execute(
  'Complex task requiring more thinking',
  undefined, // No schema for text response
  {
    systemPromptSuffix: 'Be extra careful with form submissions.',
    thinkingBudget: 4096, // More thinking tokens for complex tasks
  }
);

Environment Setup

Anthropic API Key: Set your API key as an environment variable:
```
export ANTHROPIC_API_KEY=your_api_key_here
```
Playwright: Install Playwright and browser dependencies:
```
npx playwright install
```
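
The examples in this README read the key with `process.env.ANTHROPIC_API_KEY!`. The non-null assertion only silences the compiler; it does not check the value at runtime. A minimal sketch of failing fast instead, assuming a hypothetical helper named `requireEnv` (it is not part of the SDK):

```typescript
// Hypothetical helper (not part of the SDK): read a required variable from
// an environment-like object and fail fast if it is missing or blank.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Intended usage in a Node.js script:
//   const apiKey = requireEnv(process.env, 'ANTHROPIC_API_KEY');
//   const agent = new ComputerUseAgent({ apiKey, page });
```

Passing the environment object in as an argument keeps the helper easy to test without touching the real process environment.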

Computer Use Parameters

This SDK leverages Anthropic's Computer Use API with the following key parameters:

Model Selection

Claude 3.5 Sonnet: Best balance of speed and capability for most tasks
Claude 4 Models: Enhanced reasoning with extended thinking capabilities
Claude 3.7 Sonnet: Advanced reasoning with thinking transparency

Thinking Budget

The thinkingBudget parameter controls Claude's internal reasoning process:

1024 tokens (default): Suitable for simple tasks
4096+ tokens: Better for complex reasoning tasks
16k+ tokens: Recommended for highly complex multi-step operations

See Anthropic's Extended Thinking guide for optimization tips.
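
One way to keep these tiers consistent across a codebase is to name them once and look the budget up. The sketch below is illustrative only: `TaskComplexity`, `THINKING_BUDGETS`, and `thinkingBudgetFor` are hypothetical names, and "16k+" is read here as 16 * 1024 tokens.

```typescript
// Hypothetical mapping from task complexity to a thinkingBudget value,
// following the tiers listed above. The tier names are illustrative.
type TaskComplexity = 'simple' | 'complex' | 'multiStep';

const THINKING_BUDGETS: Record<TaskComplexity, number> = {
  simple: 1024, // SDK default
  complex: 4096, // complex reasoning tasks
  multiStep: 16384, // "16k+" read here as 16 * 1024 (an assumption)
};

function thinkingBudgetFor(complexity: TaskComplexity): number {
  return THINKING_BUDGETS[complexity];
}

// Intended usage:
//   await agent.execute(query, undefined, {
//     thinkingBudget: thinkingBudgetFor('complex'),
//   });
```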

Error Handling

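The SDK throws an error when a task fails, for example when no response is received from Claude. Wrap calls to execute() in try/catch:

try {
  const result = await agent.execute('Your task here');
  console.log(result);
} catch (error) {
  // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('No response received')) {
    console.log('Agent did not receive a response from Claude');
  } else {
    console.log('Other error:', message);
  }
}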
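
Some failures may be transient, in which case retrying the call can help. The following is a minimal sketch of a generic retry wrapper, not something the SDK provides: `withRetry`, `maxAttempts`, and `baseDelayMs` are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper (not part of the SDK): retry an async task a fixed
// number of times, waiting a little longer after each failure.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  if (maxAttempts < 1) {
    throw new RangeError('maxAttempts must be at least 1');
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Linear backoff: 1x, 2x, 3x ... the base delay.
        await new Promise<void>((resolve) =>
          setTimeout(resolve, baseDelayMs * attempt),
        );
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}

// Intended usage:
//   const title = await withRetry(() =>
//     agent.execute('Tell me the title of the top story'),
//   );
```

Each retry runs the task again against the page in whatever state the previous attempt left it, so this pattern suits read-only queries better than tasks with side effects such as form submissions.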

Best Practices

Use specific, clear instructions: Prefer "Click the red 'Submit' button" over "click submit"
For complex tasks, break them down: Use step-by-step instructions in your query
Optimize thinking budget: Start with default (1024) and increase for complex tasks
Handle errors gracefully: Implement proper error handling for production use
Use structured responses: When you need specific data format, use Zod schemas
Develop with headless: false: Run with a visible browser during development so you can watch and debug the agent's actions
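
As an illustration of breaking a complex task down, the sketch below issues one `execute()` call per sub-task instead of one long query. `TextAgent` and `runSteps` are hypothetical names, and a `ComputerUseAgent` used without a schema is assumed to fit the `TextAgent` shape.

```typescript
// Minimal structural type for anything with a text-returning execute().
// A ComputerUseAgent used without a schema is assumed to fit this shape.
interface TextAgent {
  execute(query: string): Promise<string>;
}

// Hypothetical helper: run sub-tasks in order and collect each response.
// It stops at the first failure, since later steps usually depend on earlier ones.
async function runSteps(agent: TextAgent, steps: string[]): Promise<string[]> {
  const results: string[] = [];
  for (const step of steps) {
    results.push(await agent.execute(step));
  }
  return results;
}

// Intended usage:
//   const [, title] = await runSteps(agent, [
//     "Click the 'new' link in the top navigation bar",
//     'Tell me the title of the first story on this page',
//   ]);
```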

Security Considerations

⚠️ Important: Computer use can interact with any visible application. Always:

Run in isolated environments (containers/VMs) for production
Avoid providing access to sensitive accounts or data
Review Claude's actions in logs before production deployment
Use allowlisted domains when possible

See Anthropic's Computer Use Security Guide for detailed security recommendations.
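
One possible way to approach domain allowlisting is a host check combined with Playwright's request interception. The sketch below is an assumption about how this could be done, not a feature of the SDK: `isAllowedUrl` is a hypothetical helper.

```typescript
// Hypothetical helper (not part of the SDK): check whether a URL's host is
// on an allowlist. A host matches if it equals an entry or is a subdomain of it.
function isAllowedUrl(url: string, allowedHosts: string[]): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // Unparseable URLs are rejected.
  }
  return allowedHosts.some(
    (host) => hostname === host || hostname.endsWith(`.${host}`),
  );
}

// Intended usage with Playwright request interception:
//   await page.route('**/*', (route) =>
//     isAllowedUrl(route.request().url(), ['ycombinator.com'])
//       ? route.continue()
//       : route.abort(),
//   );
```

Matching on the parsed hostname rather than on a substring of the URL avoids accepting lookalike hosts such as `evilycombinator.com`.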
