Trebuchet

We pray that we are wrong


Launching AI workloads with precision and power

Overview

Trebuchet is a high-performance microservices orchestration framework built in Rust, designed to address critical performance bottlenecks in AI and data processing pipelines. By combining Rust's safety guarantees with zero-cost abstractions, Trebuchet enables developers to:

  1. Replace performance bottlenecks in Python/React applications with high-performance Rust microservices
  2. Integrate seamlessly with existing Python ML code and React frontends
  3. Orchestrate complex AI workflows through a powerful CLI and programmatic interfaces
  4. Monitor and optimize resource utilization across distributed systems
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│                    ┌──────────────────┐                         │
│                    │  Trebuchet CLI   │                         │
│                    └──────────────────┘                         │
│                              │                                  │
│  ┌──────────────┐   ┌────────┴───────┐   ┌──────────────────┐  │
│  │              │   │                │   │                  │  │
│  │ Python/React │◄──┤ Trebuchet Core ├──►│ AI Model Registry│  │
│  │   Bridge     │   │                │   │                  │  │
│  │              │   └────────┬───────┘   └──────────────────┘  │
│  └──────┬───────┘            │                                  │
│         │            ┌───────┴──────┐                          │
│         │            │              │                          │
│  ┌──────▼───────┐   ┌▼─────────────▼┐   ┌──────────────────┐  │
│  │              │   │               │   │                  │  │
│  │ Python/React │   │Performance    │   │  Message Bus &   │  │
│  │ Applications │   │Microservices  │   │  API Gateway     │  │
│  │              │   │               │   │                  │  │
│  └──────────────┘   └───────────────┘   └──────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key Features

1. High-Performance Microservices

Trebuchet provides specialized Rust microservices to address critical performance bottlenecks:

  • Heihachi Audio Engine - High-throughput audio analysis and transformation with zero-copy processing
  • Gospel NLP Service - High-performance natural language processing with sentiment analysis, keyword extraction, and document similarity
  • Purpose Data Pipeline - ML model management and efficient data transformation pipelines
  • Combine Data Integration - Powerful multi-source data fusion with flexible join operations and conflict resolution
  • Model Router - Intelligent routing between specialized AI models

2. Advanced AI Model Integration

Trebuchet's AI Model Registry provides a unified approach to working with ML models:

  • Multi-framework support for PyTorch, TensorFlow, ONNX, and more (see the trait sketch below)
  • Intelligent model selection based on task requirements
  • Dynamic scaling of inference resources
  • Model versioning and A/B testing capabilities
  • Performance monitoring and automatic optimization
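
One way to picture the multi-framework support is a backend-agnostic inference trait that each framework implements. The following is a hypothetical sketch, not the registry's actual API:

// Hypothetical backend-agnostic inference trait; all names are illustrative
pub trait InferenceBackend {
    /// Framework identifier, e.g. "pytorch", "tensorflow", "onnx"
    fn framework(&self) -> &str;

    /// Run inference on a flat f32 input, returning a flat f32 output
    fn infer(&self, input: &[f32]) -> Result<Vec<f32>, InferenceError>;
}

/// Error type shared by all backends in this sketch
#[derive(Debug)]
pub struct InferenceError(pub String);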

3. Interoperability Bridges

Trebuchet seamlessly integrates with existing Python and React.js codebases:

  • Python Interop Bridge using PyO3 for zero-copy data exchange
  • WASM Frontend Integration for browser-based AI acceleration
  • JSON-over-stdio for simple command-line tool integration (a minimal sketch follows this list)
  • Language-agnostic API for integration with any programming language
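
The stdio bridge can be as small as a loop that reads one JSON request per line and writes one JSON response per line. Here is a minimal sketch using serde_json; the request and response shapes are illustrative, not Trebuchet's actual protocol:

// Hypothetical JSON-over-stdio worker: one JSON request per line on stdin,
// one JSON response per line on stdout; field names are illustrative
use std::io::{self, BufRead, Write};

use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct Request {
    id: u64,
    text: String,
}

#[derive(Serialize)]
struct Response {
    id: u64,
    word_count: usize,
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    for line in stdin.lock().lines() {
        let request: Request = serde_json::from_str(&line?).expect("malformed request");
        let response = Response {
            id: request.id,
            word_count: request.text.split_whitespace().count(),
        };
        writeln!(stdout, "{}", serde_json::to_string(&response).unwrap())?;
    }
    Ok(())
}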

4. Powerful CLI and TUI

Trebuchet provides intuitive interfaces for managing complex workflows:

  • Shell completion and context-aware help
  • Rich terminal interface with real-time visualizations
  • Remote management capabilities
  • Plugin system for extensibility

5. Robust Orchestration

Trebuchet handles complex workflow orchestration:

  • Declarative pipeline definitions using YAML or Rust code
  • Dependency resolution between processing stages
  • Automatic resource allocation based on workload characteristics
  • Error recovery with configurable retry policies (see the sketch below)
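
As an illustration of what a configurable retry policy can look like, here is a small exponential-backoff helper; the RetryPolicy type is a hypothetical sketch, not Trebuchet's actual API:

// Hypothetical retry helper with exponential backoff; names are illustrative
use std::time::Duration;

pub struct RetryPolicy {
    pub max_attempts: u32,
    pub base_delay: Duration,
}

impl RetryPolicy {
    pub async fn run<T, E, F, Fut>(&self, mut operation: F) -> Result<T, E>
    where
        F: FnMut() -> Fut,
        Fut: std::future::Future<Output = Result<T, E>>,
    {
        let mut attempt = 0;
        loop {
            match operation().await {
                Ok(value) => return Ok(value),
                // Out of attempts: surface the final error
                Err(e) if attempt + 1 >= self.max_attempts => return Err(e),
                Err(_) => {
                    // Exponential backoff: base_delay * 2^attempt
                    tokio::time::sleep(self.base_delay * 2u32.pow(attempt)).await;
                    attempt += 1;
                }
            }
        }
    }
}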

Technical Approach

Trebuchet leverages Rust's unique strengths to deliver exceptional performance and reliability:

Memory Management

Rust's ownership model enables Trebuchet to efficiently manage memory without garbage collection pauses. This approach is particularly valuable for latency-sensitive AI inference:

// Example of zero-copy, lazy audio processing in Trebuchet
pub fn process_audio_chunk<'a>(input: &'a [f32]) -> impl Iterator<Item = f32> + 'a {
    input.iter()
         .map(|sample| sample * 2.0)      // Lazy transformation; the input buffer is never copied
         .filter(|sample| *sample > 0.1)  // Gate out low-amplitude samples
}

The ownership model translates to measurable benefits:

| Metric | Python Implementation | Trebuchet (Rust) | Improvement |
|--------|-----------------------|------------------|-------------|
| Memory Usage | 1.2 GB | 175 MB | 85% reduction |
| Latency (p95) | 220 ms | 15 ms | 93% reduction |
| Throughput | 120 req/s | 2,200 req/s | 18x improvement |

Concurrency Model

Trebuchet utilizes Rust's safe concurrency model to efficiently leverage multi-core processors:

// Example of Trebuchet's parallel data processing
use rayon::prelude::*; // Provides par_iter for data parallelism

pub async fn process_dataset(data: Vec<Sample>) -> Result<Vec<ProcessedSample>> {
    // Split data into chunks for parallel processing
    let chunks: Vec<_> = data.chunks(CHUNK_SIZE).collect();

    // Process chunks in parallel across all available cores
    let results: Result<Vec<_>> = chunks
        .par_iter()
        .map(|chunk| process_chunk(chunk))
        .collect();

    // Flatten the per-chunk results, propagating the first error if any
    Ok(results?.into_iter().flatten().collect())
}

Because Rust has no equivalent of Python's Global Interpreter Lock (GIL), this approach achieves true parallelism across cores.

Type System

Trebuchet leverages Rust's type system to catch configuration errors as early as possible, combining compile-time type checks with explicit runtime validation:

// Example of Trebuchet's type-safe model configuration
use std::collections::HashMap;

use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct ModelConfig {
    pub model_type: ModelType,
    pub parameters: HashMap<String, Parameter>,
    pub input_shape: Vec<usize>,
    pub output_shape: Vec<usize>,
}

// Explicit validation catches invalid configurations before a model is loaded
impl ModelConfig {
    pub fn validate(&self) -> Result<(), ConfigError> {
        match self.model_type {
            ModelType::Transformer => {
                if !self.parameters.contains_key("attention_heads") {
                    return Err(ConfigError::MissingParameter("attention_heads".to_string()));
                }
                // Additional validation...
            }
            _ => {} // Other model types...
        }
        Ok(())
    }
}
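
Usage is then a straightforward construct-and-validate. The values below are illustrative, and the Parameter::Int variant is assumed for this sketch:

// Constructing and validating a configuration; values are illustrative
// and Parameter::Int is an assumed variant for this sketch
let mut parameters = HashMap::new();
parameters.insert("attention_heads".to_string(), Parameter::Int(8));

let config = ModelConfig {
    model_type: ModelType::Transformer,
    parameters,
    input_shape: vec![1, 512],
    output_shape: vec![1, 512, 768],
};
config.validate()?;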

Architecture

Trebuchet is built as a modular microservices architecture with the following key components:

Core Components

  1. Trebuchet CLI - The central command-line interface for interacting with all components
  2. Trebuchet Core - The service orchestration and communication layer
  3. Python Interop Bridge - Bi-directional communication between Rust and Python
  4. WASM Frontend Bridge - Integration with React.js applications
  5. AI Model Registry - Centralized management of ML models

Performance Microservices

  1. Heihachi Audio Engine - High-performance audio processing
  2. Gospel NLP Service - Natural language processing and text analytics
  3. Purpose Model Manager - ML model lifecycle management and data pipelines
  4. Combine Data Integration - Data fusion and integration engine
  5. Model Router - Intelligent AI model routing and ensemble capabilities

Gospel NLP Service

The Gospel microservice provides high-performance natural language processing capabilities:

  • Document Analysis - Word count, sentence structure, readability scoring
  • Sentiment Analysis - Detect positive, negative, and neutral sentiment
  • Keyword Extraction - Identify important concepts and themes
  • Language Detection - Identify the language of documents
  • Document Similarity - Find related content using Jaccard similarity (a minimal sketch of the metric follows the example below)
  • Parallel Processing - Process multiple documents concurrently

// Example of document analysis with Gospel
let service = NlpService::new();
service.add_document(Document::new(
    "doc1",
    "Important Report",
    "This is a critical document with high priority content."
)).await?;

let analysis = service.analyze_document("doc1").await?;
println!("Sentiment: {:?}", analysis.sentiment);
println!("Keywords: {:?}", analysis.keywords);

Combine Data Integration Service

The Combine microservice provides robust data integration capabilities:

  • Multi-source Data Fusion - Integrate data from diverse sources
  • Flexible Join Operations - Support for union, intersection, left/right/full outer joins
  • Record Matching - Exact and fuzzy matching strategies
  • Conflict Resolution - Multiple strategies for resolving field conflicts
  • Schema Mapping - Convert between different data schemas
  • Extensible Readers/Writers - Support for CSV, JSON, databases, and APIs

// Example of data integration with Combine
let service = create_data_integration_service();

// Register data sources
service.register_source(primary_source).await?;
service.register_source(secondary_source).await?;

// Configure integration
let config = IntegrationConfig {
    primary_source: "customers",
    secondary_sources: vec!["orders"],
    mode: IntegrationMode::LeftJoin,
    key_fields: vec!["customer_id"],
    matching_strategy: KeyMatchingStrategy::Exact,
    conflict_strategy: ConflictStrategy::PreferPrimary,
    timeout_ms: None,
};

// Perform integration
let integrated_data = service.integrate(&config).await?;

Purpose Model Management Service

The Purpose microservice provides comprehensive ML model management:

  • Model Storage - Efficient storage and retrieval of model artifacts
  • Model Registry - Track model metadata, versions, and dependencies
  • Inference Management - Standardized inference across model types
  • Data Pipelines - Composable data transformation pipelines
  • Record Processing - Filter, map, and aggregate operations
  • Monitoring - Track performance metrics and resource usage

// Example of model management with Purpose
// Register a pipeline for data preprocessing
let mut pipeline = Pipeline::new("text_preprocessing");
pipeline.add_transform(FilterTransform::new(
    "remove_empty",
    "content",
    FilterOperator::NotEq,
    Value::String("".to_string())
));

// Execute the pipeline
let processed_data = service.execute_pipeline(
    "text_preprocessing",
    "input.csv",
    "csv",
    Some(("output.csv", "csv"))
).await?;

Communication Infrastructure

  1. Message Bus - Asynchronous communication between services (sketched below)
  2. API Gateway - External API exposure and request handling
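
As a sketch of the publish/subscribe pattern the message bus provides, here is a minimal example built on tokio's broadcast channel; it illustrates the concept rather than Trebuchet's actual bus API:

// Illustrative publish/subscribe over a tokio broadcast channel;
// Trebuchet's actual message bus API may differ
use tokio::sync::broadcast;

#[derive(Clone, Debug)]
struct Event {
    topic: String,
    payload: String,
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = broadcast::channel::<Event>(64);

    // A subscriber task receives every published event
    let subscriber = tokio::spawn(async move {
        while let Ok(event) = rx.recv().await {
            println!("[{}] {}", event.topic, event.payload);
        }
    });

    // Publish an event to all current subscribers
    tx.send(Event {
        topic: "gospel.analysis".into(),
        payload: "doc1 analyzed".into(),
    })
    .unwrap();

    // Dropping the sender closes the channel and ends the subscriber loop
    drop(tx);
    subscriber.await.unwrap();
}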

Deployment Components

  1. Package Manager - Versioning and dependency management
  2. Configuration System - Centralized configuration management

AI Model Integration

Trebuchet provides multiple strategies for AI model integration:

1. Native Rust ML

// Example of using tch-rs for PyTorch integration in Rust
use anyhow::Result;
use tch::{nn, nn::Module, Device, Kind, Tensor};

fn create_model(vs: &nn::Path) -> impl Module {
    nn::seq()
        .add(nn::linear(vs / "layer1", 784, 128, Default::default()))
        .add_fn(|xs| xs.relu())
        .add(nn::linear(vs / "layer2", 128, 10, Default::default()))
}

fn main() -> Result<()> {
    let mut vs = nn::VarStore::new(Device::Cpu);
    let model = create_model(&vs.root());

    // Load pre-trained weights (load takes &mut self, hence `mut vs`)
    vs.load("model.safetensors")?;

    // Run inference
    let input = Tensor::rand(&[1, 784], (Kind::Float, Device::Cpu));
    let output = model.forward(&input);

    println!("Output: {:?}", output);
    Ok(())
}

2. Python Integration via FFI

// Example of PyO3 integration for Python ML libraries
use pyo3::prelude::*;

fn run_inference(input_data: Vec<f32>) -> PyResult<Vec<f32>> {
    Python::with_gil(|py| {
        let torch = PyModule::import(py, "torch")?;
        let model = PyModule::import(py, "my_model")?;

        // Convert Rust data to a PyTorch tensor (Vec<f32> converts to a Python list)
        let input_tensor = torch.getattr("tensor")?.call1((input_data,))?;

        // Run inference
        let output = model.getattr("predict")?.call1((input_tensor,))?;

        // Convert back to Rust; assumes predict returns a list-like of floats
        let output_vec: Vec<f32> = output.extract()?;
        Ok(output_vec)
    })
}

3. ONNX Runtime Integration

// Example of ONNX Runtime integration (using the `onnxruntime` crate)
use ndarray::Array2;
use onnxruntime::{environment::Environment, tensor::OrtOwnedTensor, GraphOptimizationLevel};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize ONNX environment
    let environment = Environment::builder().build()?;

    // Create session; inference mutates internal buffers, hence `mut session`
    let mut session = environment
        .new_session_builder()?
        .with_optimization_level(GraphOptimizationLevel::Basic)?
        .with_number_threads(4)?
        .with_model_from_file("model.onnx")?;

    // Create a 1x5 input tensor from an ndarray array
    let input = Array2::from_shape_vec((1, 5), vec![1.0f32, 2.0, 3.0, 4.0, 5.0])?;

    // Run inference
    let outputs: Vec<OrtOwnedTensor<f32, _>> = session.run(vec![input])?;

    println!("Output: {:?}", outputs[0]);
    Ok(())
}

4. Model Selection and Orchestration

Trebuchet's Model Registry dynamically selects the optimal model implementation based on task requirements:

┌───────────────────┐
│                   │
│  Task Request     │
│                   │
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐    ┌───────────────────┐
│                   │    │                   │
│  Model Registry   │───►│   Model Catalog   │
│                   │    │                   │
└─────────┬─────────┘    └───────────────────┘
          │
          ▼
┌───────────────────┐
│  Selection Logic  │
│  - Task type      │
│  - Input size     │
│  - Latency req.   │
│  - Resource avail.│
└─────────┬─────────┘
          │
          ▼
┌───────────────────────────────────────────────┐
│                                               │
│ ┌─────────────┐ ┌──────────┐ ┌─────────────┐ │
│ │ Rust Native │ │ Python   │ │ External    │ │
│ │ Models      │ │ Models   │ │ API Models  │ │
│ └─────────────┘ └──────────┘ └─────────────┘ │
│                                               │
└───────────────────────────────────────────┬───┘
                                            │
                                            ▼
                                  ┌───────────────────┐
                                  │                   │
                                  │    Response       │
                                  │                   │
                                  └───────────────────┘

The selection process uses a weighted scoring function:

$$S(m, t) = \alpha \cdot P(m) + \beta \cdot A(m) + \gamma \cdot C(m, t) + \delta \cdot L(m, t)$$

Where:

  • $S(m, t)$ is the score for model $m$ on task $t$
  • $P(m)$ is the performance score for model $m$
  • $A(m)$ is the accuracy score for model $m$
  • $C(m, t)$ is the compatibility score between model $m$ and task $t$
  • $L(m, t)$ is the estimated latency for model $m$ on task $t$ (weighted so that lower latency raises the score, i.e. $\delta < 0$)
  • $\alpha, \beta, \gamma, \delta$ are weighting coefficients
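
A direct transcription of the scoring function; the weight values and component scores below are illustrative placeholders:

// Illustrative transcription of S(m, t); weights and inputs are placeholders
struct Weights {
    alpha: f64,
    beta: f64,
    gamma: f64,
    delta: f64,
}

fn score(performance: f64, accuracy: f64, compatibility: f64, latency_ms: f64, w: &Weights) -> f64 {
    // delta is chosen negative so that higher latency lowers the score
    w.alpha * performance + w.beta * accuracy + w.gamma * compatibility + w.delta * latency_ms
}

fn main() {
    let w = Weights { alpha: 0.4, beta: 0.3, gamma: 0.2, delta: -0.001 };
    let s = score(0.9, 0.94, 1.0, 28.0, &w);
    println!("S(m, t) = {s:.3}");
}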

Performance Benchmarks

Trebuchet's performance has been evaluated on several real-world workloads:

Audio Processing (Heihachi Engine)

Task: Process 1 hour of 48kHz stereo audio with multiple spectral transforms

| Implementation | Processing Time | Memory Usage | CPU Utilization |
|----------------|-----------------|--------------|-----------------|
| Python (librosa) | 342 seconds | 4.2 GB | 105% (1.05 cores) |
| Trebuchet (Rust) | 17 seconds | 650 MB | 780% (7.8 cores) |

NLP Processing (Gospel Engine)

Task: Process 10,000 documents with sentiment analysis and keyword extraction

| Implementation | Processing Time | Memory Usage | Documents/sec |
|----------------|-----------------|--------------|---------------|
| Python (NLTK/spaCy) | 118 seconds | 3.7 GB | 84.7 docs/s |
| Trebuchet (Rust) | 9.2 seconds | 850 MB | 1,087 docs/s |

Data Integration (Combine Engine)

Task: Merge and reconcile 500,000 records from 3 data sources

| Implementation | Processing Time | Memory Usage | Records/sec |
|----------------|-----------------|--------------|-------------|
| Python (pandas) | 48 seconds | 6.2 GB | 10,416 rec/s |
| Trebuchet (Rust) | 5.1 seconds | 740 MB | 98,039 rec/s |

Model Inference (Model Router)

Task: Multi-model inference across 10,000 samples

| Implementation | Latency (p95) | Throughput | Accuracy |
|----------------|---------------|------------|----------|
| Python (direct) | 320 ms | 65 samples/s | 94.3% |
| Trebuchet (Rust) | 28 ms | 820 samples/s | 94.1% |

Getting Started

Trebuchet is currently in development. Here's how to get started once the initial release is available:

Prerequisites

  • Rust 1.60 or higher
  • Python 3.8+ (for interoperability features)
  • Node.js 16+ (for WASM integration)

Installation

# Clone the repository
git clone https://github.com/fullscreen-triangle/trebuchet.git
cd trebuchet

# Build the project
cargo build --release

# Install the CLI
cargo install --path trebuchet-cli

# Verify installation
trebuchet --version

Basic Usage

# Initialize a new project
trebuchet init my-project

# Configure interop with existing Python project
trebuchet config interop --python-path /path/to/python/project

# Run a simple workflow
trebuchet run workflow.yaml

# Start the interactive TUI
trebuchet tui

Example Workflow Definition

# workflow.yaml
name: text-processing-pipeline
version: "1.0"

inputs:
  documents:
    type: file
    pattern: "*.txt"
    
stages:
  - name: nlp-analysis
    service: gospel
    operation: analyze
    config:
      extract_keywords: true
      analyze_sentiment: true
      
  - name: data-enrichment
    service: combine
    operation: integrate
    depends_on: nlp-analysis
    config:
      secondary_sources:
        - metadata-db
      mode: left_join
      key_fields:
        - document_id
        
  - name: model-inference
    service: model-router
    operation: infer
    depends_on: data-enrichment
    config:
      model: text-classifier-v2
      threshold: 0.75
      
outputs:
  classifications:
    source: model-inference
    format: json

Roadmap

Trebuchet is being developed incrementally, with a focus on delivering value at each stage:

Phase 1: Foundation (Q3 2023)

  • Trebuchet CLI development
  • Core service architecture
  • Python Interop Bridge
  • Basic workflow orchestration

Phase 2: Performance Services (Q4 2023)

  • Heihachi Audio Engine
  • Gospel NLP Service
  • Purpose Model Manager
  • Combine Data Integration
  • Message Bus implementation
  • Configuration system

Phase 3: AI Integration (Q1 2024)

  • AI Model Registry
  • ONNX Runtime integration
  • Model selection algorithms
  • Performance monitoring

Phase 4: Full System (Q2 2024)

  • API Gateway
  • WASM Frontend Bridge
  • Comprehensive documentation
  • Cloud deployment templates

Contributing

Trebuchet is an open-source project and welcomes contributions. Here's how you can help:

  1. Report bugs and feature requests through GitHub issues
  2. Submit pull requests for bug fixes and features
  3. Improve documentation by submitting corrections and examples
  4. Share benchmarks and performance improvements

Please see CONTRIBUTING.md for more details.


License

Trebuchet is available under the MIT License. See the LICENSE file for more details.


Built with ❤️ by [Your Name/Organization]
