
AIDE ML — The Machine Learning Engineering Agent

LLM‑driven agent that writes, evaluates & improves machine‑learning code.

PyPI · Python 3.10+ · arXiv paper · MIT License · PyPI Downloads

Use in Production? Try Weco →

What Is AIDE ML?

AIDE ML is the open‑source “reference build” of the AIDE algorithm, a tree‑search agent that autonomously drafts, debugs and benchmarks code until a user‑defined metric is maximised (or minimised). It ships as a research‑friendly Python package with batteries‑included utilities (CLI, visualisation, config presets) so that academics and engineer‑researchers can replicate the paper, test new ideas, or prototype ML pipelines.

Tree Search Visualization

Layer | Description | Where to find it
AIDE algorithm | LLM‑guided agentic tree search in the space of code. | Described in our paper.
AIDE ML repo (this repo) | Lean implementation for experimentation & extension. | pip install aideml
Weco product | The platform generalizes AIDE's capabilities to broader code optimization scenarios, providing experiment tracking and enhanced user control. | weco.ai

Who should use it?

  • Agent‑architecture researchers – swap in new search heuristics, evaluators or LLM back‑ends.
  • ML practitioners – quickly build a high‑performance ML pipeline for a given dataset.

Key Capabilities

  • Natural‑language task specification – Point the agent at a dataset and describe the goal and metric in plain English; no YAML grids or bespoke wrappers. Example: aide data_dir=… goal="Predict churn" eval="AUROC"
  • Iterative agentic tree search – Each Python script becomes a node in a solution tree; LLM‑generated patches spawn children; metric feedback prunes and guides the search (see the sketch below). OpenAI's MLE-bench (75 Kaggle competitions) found that AIDE's tree search wins 4× more medals than the best linear agent (OpenHands).
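
For intuition only, here is a minimal sketch of that loop in plain Python. It is not the repo's implementation; propose_patch, run_and_score and pick_parent are hypothetical stand‑ins for AIDE's LLM calls, code execution and node‑selection policy.

# Illustrative sketch of LLM-guided tree search over candidate scripts.
# NOT AIDE's actual code: propose_patch, run_and_score and pick_parent
# stand in for the LLM call, sandboxed execution and node selection.
from dataclasses import dataclass, field

@dataclass
class Node:
    code: str                      # candidate Python script
    metric: float | None = None    # validation metric after execution (None if it crashed)
    children: list["Node"] = field(default_factory=list)

def tree_search(task, steps, propose_patch, run_and_score, pick_parent):
    root = Node(code=propose_patch(task, parent_code=None))   # initial draft
    root.metric = run_and_score(root.code)
    nodes = [root]
    for _ in range(steps):
        parent = pick_parent(nodes)                           # e.g. best node so far, or a buggy node to debug
        child = Node(code=propose_patch(task, parent_code=parent.code))
        child.metric = run_and_score(child.code)              # metric feedback guides the search
        parent.children.append(child)
        nodes.append(child)
    # assumes the metric is to be maximised
    return max((n for n in nodes if n.metric is not None), key=lambda n: n.metric)

In AIDE itself the drafting, debugging and node selection are richer (for example, multiple drafts per step via agent.search.num_drafts), but the draft → execute → score → branch cycle above is the core idea.
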
Utility features provided by this repo
  • HTML visualiser – inspect the full solution tree and code attached to each node.
  • Streamlit UI – prototype ML solutions interactively in the browser.
  • Model‑neutral plumbing – OpenAI, Anthropic, Gemini, or any local LLM that speaks the OpenAI API.

Featured Research built on/with AIDE

Institution | Paper / Project Name | Links
OpenAI | MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering | Paper, GitHub
METR | RE-Bench: Evaluating frontier AI R&D capabilities of language-model agents against human experts | Paper, GitHub
Sakana AI | The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search | Paper, GitHub
Meta | The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements | Paper, GitHub
Meta | AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench | Paper, GitHub

Know another public project that cites or forks AIDE?
Open a PR and add it to the table!

How to Use AIDE ML

Quick Start

# 1  Install
pip install -U aideml

# 2  Set an LLM key
export OPENAI_API_KEY=<your-key>  # https://platform.openai.com/api-keys

# 3  Run an optimisation
aide data_dir="example_tasks/house_prices" \
     goal="Predict the sales price for each house" \
     eval="RMSE between log‑prices"

After the run finishes you’ll find:

  • logs/<id>/best_solution.py – best code found
  • logs/<id>/tree_plot.html – click to inspect the solution tree
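
If you prefer to open these artifacts from Python instead of a file browser, here is a small sketch (the paths come from the list above; <id> is your run's folder name under logs/):

# Inspect a finished run's artifacts. Replace <id> with your run's folder name.
import webbrowser
from pathlib import Path

run_dir = Path("logs") / "<id>"
print((run_dir / "best_solution.py").read_text())                 # best code found
webbrowser.open((run_dir / "tree_plot.html").resolve().as_uri())  # open the solution tree in a browser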

Web UI

pip install -U aideml   # adds streamlit
cd aide/webui
streamlit run app.py

Use the sidebar to paste your API key, upload data, set Goal & Metric, then press Run AIDE.

The UI shows live logs, the solution tree, and the best code.


Advanced CLI Options

# Choose a different coding model and run 50 steps
aide agent.code.model="claude-4-sonnet" \
     agent.steps=50 \
     data_dir=… goal=… eval=…

Common flags

Flag | Purpose | Default
agent.code.model | LLM used to write code | gpt-4-turbo
agent.steps | Improvement iterations | 20
agent.search.num_drafts | Drafts per step | 5

Use AIDE ML Inside Python

import aide
import logging

def main():
    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    aide_logger = logging.getLogger("aide")
    aide_logger.setLevel(logging.INFO)
    print("Starting experiment...")
    exp = aide.Experiment(
        data_dir="example_tasks/bitcoin_price",  # replace this with your own directory
        goal="Build a time series forecasting model for bitcoin close price.",  # replace with your own goal description
        eval="RMSLE"  # replace with your own evaluation metric
    )

    best_solution = exp.run(steps=2)

    print(f"Best solution has validation metric: {best_solution.valid_metric}")
    print(f"Best solution code: {best_solution.code}")
    print("Experiment finished.")

if __name__ == '__main__':
    main()

Power‑User Extras

Local LLM (Ollama example)

export OPENAI_BASE_URL="http://localhost:11434/v1"
aide agent.code.model="qwen2.5" data_dir=… goal=… eval=…

Docker

docker build -t aide .
docker run -it --rm \
  -v "${LOGS_DIR:-$(pwd)/logs}:/app/logs" \
  -v "${WORKSPACE_BASE:-$(pwd)/workspaces}:/app/workspaces" \
  -v "$(pwd)/aide/example_tasks:/app/data" \
  -e OPENAI_API_KEY="your-actual-api-key" \
  aide data_dir=/app/data/house_prices goal="Predict price" eval="RMSE"

Development install

git clone https://github.com/WecoAI/aideml.git
cd aideml && pip install -e .

Citation

If you use AIDE in your work, please cite the following paper:

@article{aide2025,
      title={AIDE: AI-Driven Exploration in the Space of Code}, 
      author={Zhengyao Jiang and Dominik Schmidt and Dhruv Srikanth and Dixing Xu and Ian Kaplan and Deniss Jacenko and Yuxiang Wu},
      year={2025},
      eprint={2502.13138},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2502.13138}, 
}
