Merged
20 changes: 10 additions & 10 deletions README.md
@@ -65,16 +65,16 @@ ART is an open-source RL framework that improves agent reliability by allowing L

## 📒 Notebooks

-| Agent Task | Example Notebook | Description | Comparative Performance |
-| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **ART•E LangGraph** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/langgraph/art-e-langgraph.ipynb) | Qwen 2.5 7B learns to search emails using LangGraph | [Link coming soon] |
-| **MCP•RL** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/mcp-rl/mcp-rl.ipynb) | Qwen 2.5 3B masters the NWS MCP server | [Link coming soon] |
-| **ART•E [RULER]** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/art-e.ipynb) | Qwen 2.5 7B learns to search emails using RULER | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/email_agent/accuracy-training-progress.svg" height="72"> [benchmarks](/examples/art-e/art_e/evaluate/display_benchmarks.ipynb) |
-| **2048** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb) | Qwen 2.5 3B learns to play 2048 | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/2048/accuracy-training-progress.svg" height="72"> [benchmarks](/examples/2048/benchmark_2048.ipynb) |
-| **Temporal Clue** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/temporal_clue/temporal-clue.ipynb) | Qwen 2.5 7B learns to solve Temporal Clue | [Link coming soon] |
-| **Tic Tac Toe** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/tic_tac_toe/tic-tac-toe.ipynb) | Qwen 2.5 3B learns to play Tic Tac Toe | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/tic-tac-toe-local/accuracy-training-progress.svg" height="72"> [benchmarks](/examples/tic_tac_toe/benchmark_tic_tac_toe.ipynb) |
-| **Codenames** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/codenames/Codenames_RL.ipynb) | Qwen 2.5 3B learns to play Codenames | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/codenames/win_rate_over_time.png" height="72"> [benchmarks](/examples/codenames/Codenames_RL.ipynb) |
-| **AutoRL [RULER]** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/auto_rl.ipynb) | Train Qwen 2.5 7B to master any task | [Link coming soon] |
+| Agent Task | Example Notebook | Description | Comparative Performance |
+| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **ART•E LangGraph** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/langgraph/art-e-langgraph.ipynb) | Qwen 2.5 7B learns to search emails using LangGraph | [Link coming soon] |
+| **MCP•RL** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/mcp-rl/mcp-rl.ipynb) | Qwen 2.5 3B masters the NWS MCP server | [Link coming soon] |
+| **ART•E [RULER]** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/art-e.ipynb) | Qwen 2.5 7B learns to search emails using RULER | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/email_agent/accuracy-training-progress.svg" height="72"> [benchmarks](/dev/art-e/art_e/evaluate/display_benchmarks.ipynb) |
+| **2048** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb) | Qwen 2.5 3B learns to play 2048 | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/2048/accuracy-training-progress.svg" height="72"> [benchmarks](/examples/2048/display_benchmarks.ipynb) |
+| **Temporal Clue** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/temporal_clue/temporal-clue.ipynb) | Qwen 2.5 7B learns to solve Temporal Clue | [Link coming soon] |
+| **Tic Tac Toe** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/tic_tac_toe/tic-tac-toe.ipynb) | Qwen 2.5 3B learns to play Tic Tac Toe | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/tic-tac-toe-local/accuracy-training-progress.svg" height="72"> [benchmarks](/examples/tic_tac_toe/display-benchmarks.ipynb) |
+| **Codenames** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/codenames/Codenames_RL.ipynb) | Qwen 2.5 3B learns to play Codenames | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/codenames/win_rate_over_time.png" height="72"> [benchmarks](https://github.com/OpenPipe/art-notebooks/blob/main/examples/codenames/Codenames_RL.ipynb) |
+| **AutoRL [RULER]** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/auto_rl.ipynb) | Train Qwen 2.5 7B to master any task | [Link coming soon] |

## 📰 ART News

2 changes: 1 addition & 1 deletion docs/fundamentals/art-backend.mdx
@@ -141,7 +141,7 @@ To see `LocalBackend` and `SkyPilotBackend` in action, try the examples below.
<Card
title="2048 Notebook"
icon="laptop-code"
-href="https://colab.research.google.com/github/openpipe/art/blob/main/examples/2048/2048.ipynb"
+href="https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb"
horizontal={true}
arrow={true}
>
2 changes: 1 addition & 1 deletion docs/getting-started/about.mdx
@@ -109,7 +109,7 @@ The ART client can be installed into projects designed to run on any machine tha
<Card
title="Train an agent to play 2048"
icon="robot"
-href="https://colab.research.google.com/github/openpipe/art/blob/main/examples/2048/2048.ipynb"
+href="https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb"
horizontal={true}
arrow={true}
></Card>
17 changes: 10 additions & 7 deletions docs/getting-started/notebooks.mdx
@@ -7,12 +7,15 @@ icon: "book"

<div className="full-width">

-| Agent Task | Notebook | Description | Performance |
-| ----------------- | -------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **ART•E [RULER]** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art/blob/main/examples/art-e.ipynb) | Qwen 2.5 7B learns to search emails using RULER | <a href="https://github.com/OpenPipe/ART/blob/main/examples/art-e/art_e/evaluate/display_benchmarks.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/email_agent/accuracy-training-progress.svg" width="72" style={{margin: "0"}} /></a> |
-| **2048** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art/blob/main/examples/2048/2048.ipynb) | Qwen 2.5 3B learns to play 2048 | <a href="https://github.com/OpenPipe/ART/blob/main/examples/2048/benchmark_2048.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/2048/accuracy-training-progress.svg" width="72" style={{margin: "0"}} /></a> |
-| **Temporal Clue** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art/blob/main/examples/temporal_clue/temporal-clue.ipynb) | Qwen 2.5 7B learns to solve Temporal Clue | [Link coming soon] |
-| **Tic Tac Toe** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art/blob/main/examples/tic_tac_toe/tic-tac-toe.ipynb) | Qwen 2.5 3B learns to play Tic Tac Toe | <a href="https://github.com/OpenPipe/ART/blob/main/examples/tic_tac_toe/benchmark_tic_tac_toe.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/tic-tac-toe-local/accuracy-training-progress.svg" width="72" style={{margin: "0"}} /></a> |
-| **Codenames** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art/blob/main/examples/codenames/Codenames_RL.ipynb) | Qwen 2.5 3B learns to play Codenames | <a href="https://github.com/OpenPipe/ART/blob/main/examples/codenames/Codenames_RL.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/codenames/win_rate_over_time.png" width="72" style={{margin: "0"}} /></a> |
+| Agent Task | Notebook | Description | Performance |
+| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **ART•E LangGraph** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/langgraph/art-e-langgraph.ipynb) | Qwen 2.5 7B learns to search emails using LangGraph | [Link coming soon] |
+| **MCP•RL** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/mcp-rl/mcp-rl.ipynb) | Qwen 2.5 3B masters the NWS MCP server | [Link coming soon] |
+| **ART•E [RULER]** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/art-e.ipynb) | Qwen 2.5 7B learns to search emails using RULER | <a href="https://github.com/OpenPipe/ART/blob/main/dev/art-e/art_e/evaluate/display_benchmarks.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/email_agent/accuracy-training-progress.svg" width="72" style={{margin: "0"}} /></a> |
+| **2048** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb) | Qwen 2.5 3B learns to play 2048 | <a href="https://github.com/OpenPipe/ART/blob/main/examples/2048/display_benchmarks.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/2048/accuracy-training-progress.svg" width="72" style={{margin: "0"}} /></a> |
+| **Temporal Clue** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/temporal_clue/temporal-clue.ipynb) | Qwen 2.5 7B learns to solve Temporal Clue | [Link coming soon] |
+| **Tic Tac Toe** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/tic_tac_toe/tic-tac-toe.ipynb) | Qwen 2.5 3B learns to play Tic Tac Toe | <a href="https://github.com/OpenPipe/ART/blob/main/examples/tic_tac_toe/display-benchmarks.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/tic-tac-toe-local/accuracy-training-progress.svg" width="72" style={{margin: "0"}} /></a> |
+| **Codenames** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/codenames/Codenames_RL.ipynb) | Qwen 2.5 3B learns to play Codenames | <a href="https://github.com/OpenPipe/art-notebooks/blob/main/examples/codenames/Codenames_RL.ipynb"><img src="https://github.com/OpenPipe/ART/raw/main/assets/benchmarks/codenames/win_rate_over_time.png" width="72" style={{margin: "0"}} /></a> |
+| **AutoRL [RULER]** | [🏋️&nbsp;Train&nbsp;agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/auto_rl.ipynb) | Train Qwen 2.5 7B to master any task | [Link coming soon] |

</div>
4 changes: 2 additions & 2 deletions docs/getting-started/quick-start.mdx
@@ -24,13 +24,13 @@ If you'd like to enable observability while working through this guide, create a

- [Weights & Biases](https://wandb.ai/home)

-Once you have your Weights & Biases API key, open the [notebook](https://colab.research.google.com/github/openpipe/art/blob/main/examples/2048/2048.ipynb) in Google Colab and set them in the **Environment Variables** cell.
+Once you have your Weights & Biases API key, open the [notebook](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb) in Google Colab and set them in the **Environment Variables** cell.

Once your API keys are set, or if you won't need observability while completing this walkthrough, continue on to the next step.

## Step 2: Prepare your notebook

-If you haven't already, open the [notebook](https://colab.research.google.com/github/openpipe/art/blob/main/examples/2048/2048.ipynb) in Google Colab and connect to a T4 runtime environment.
+If you haven't already, open the [notebook](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb) in Google Colab and connect to a T4 runtime environment.

<Accordion title="Connecting to a T4 GPU">
In the top bar of your Google Colab notebook, find *Runtime* > *Change runtime
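
Review note: the Colab link changes in this diff all follow one pattern, moving `github/openpipe/art/blob/main/examples/` to `github/openpipe/art-notebooks/blob/main/examples/`. Below is a minimal sketch of how one might check docs for leftover old-style links. The helper names `migrate_colab_links` and `find_stale_links` are hypothetical and not part of this PR or of ART. The sketch covers only the Colab prefix, not the separate benchmark-notebook renames shown above.

```python
import re

# Old-style Colab links point at the `art` repo; the diff moves them to `art-notebooks`.
# The trailing "/" after "art" keeps this from matching "art-notebooks".
OLD_PREFIX = r"https://colab\.research\.google\.com/github/openpipe/art/blob/main/examples/"
NEW_PREFIX = "https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/"


def migrate_colab_links(text: str) -> str:
    """Rewrite old-style Colab links to the art-notebooks repo (hypothetical helper)."""
    return re.sub(OLD_PREFIX, NEW_PREFIX, text, flags=re.IGNORECASE)


def find_stale_links(text: str) -> list[str]:
    """Return every old-style Colab link still present in the text."""
    # Stop at whitespace, a closing parenthesis, or a quote so Markdown/JSX
    # delimiters are not swallowed into the URL.
    pattern = OLD_PREFIX + r"[^\s)\"']*"
    return re.findall(pattern, text, flags=re.IGNORECASE)
```

Running `find_stale_links` over each file under `docs/` and the README would list any Colab link the migration missed. Applying `migrate_colab_links` twice gives the same result as applying it once.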