41 changes: 13 additions & 28 deletions README.md
@@ -21,43 +21,28 @@ Train multi-step agents for real-world tasks using GRPO.
 
 </div>
 
-## 🦜🔗 LangGraph Integration: Build Smarter Multi-Step Agents
+## 📏 RULER: Zero-Shot Agent Rewards
 
-ART's **LangGraph integration** enables you to train sophisticated ReAct-style agents that improve through reinforcement learning. Build agents that reason, use tools, and adapt their behavior over time without manual prompt engineering.
+**RULER** (Relative Universal LLM-Elicited Rewards) eliminates the need for hand-crafted reward functions by using an LLM-as-judge to automatically score agent trajectories. Simply define your task in the system prompt, and RULER handles the rest—**no labeled data, expert feedback, or reward engineering required**.
 
 ✨ **Key Benefits:**
 
-- **Automatic behavior improvement** - Train agents to get better at multi-step reasoning
-- **Tool usage optimization** - Learn when and how to use tools more effectively
-- **Seamless integration** - Drop-in replacement for LangGraph's LLM initialization
-- **RULER compatibility** - Train without hand-crafted reward functions
+- **2-3x faster development** - Skip reward function engineering entirely
+- **General-purpose** - Works across any task without modification
+- **Strong performance** - Matches or exceeds hand-crafted rewards in 3/4 benchmarks
+- **Easy integration** - Drop-in replacement for manual reward functions
 
 ```python
-import art
-from art.langgraph import init_chat_model, wrap_rollout
-from langgraph.prebuilt import create_react_agent
+# Before: Hours of reward engineering
+def complex_reward_function(trajectory):
+    # 50+ lines of careful scoring logic...
+    pass
 
-async def email_rollout(model: art.Model, scenario: str) -> art.Trajectory:
-    # Create LangGraph agent with ART's chat model
-    chat_model = init_chat_model(model.name)
-    agent = create_react_agent(chat_model, tools)
-
-    await agent.ainvoke({"messages": [("user", scenario)]})
-    return art.Trajectory(reward=1.0, messages_and_choices=[])
-
-# Train your agent
-scenarios = ["Find urgent emails", "Search Q4 budget"]
-
-# Using wrap_rollout (captures interactions automatically)
-groups = await art.gather_trajectory_groups([
-    art.TrajectoryGroup(wrap_rollout(model, email_rollout)(model, s) for _ in range(4))
-    for s in scenarios
-])
-
-await model.train(groups)
+# After: One line with RULER
+judged_group = await ruler_score_group(group, "openai/o3")
 ```
 
-[📖 Learn more about LangGraph integration →](https://art.openpipe.ai/integrations/langgraph-integration) | [🏋️ Try the notebook →](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/langgraph/art-e-langgraph.ipynb)
+[📖 Learn more about RULER →](https://art.openpipe.ai/fundamentals/ruler)
 
 ## ART Overview

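For readers skimming the diff: here is a minimal sketch of how the new one-liner slots into the same training loop the removed LangGraph example used. It is an illustration, not the exact API: it assumes `ruler_score_group` is imported from `art.rewards` (the path used in ART's RULER docs), and the helper name `train_with_ruler` plus its `rollout` and `scenarios` parameters are hypothetical stand-ins for any ART rollout coroutine, such as the removed `email_rollout`.

```python
import art
from art.rewards import ruler_score_group  # assumed import path; see ART's RULER docs

async def train_with_ruler(model: art.Model, rollout, scenarios: list[str]) -> None:
    # Gather several trajectories per scenario, mirroring the removed example.
    groups = await art.gather_trajectory_groups([
        art.TrajectoryGroup(rollout(model, s) for _ in range(4))
        for s in scenarios
    ])

    # RULER replaces the hand-written reward function: an LLM judge
    # ("openai/o3" here) scores each group's trajectories relative to
    # one another and writes the results back as rewards.
    judged_groups = [await ruler_score_group(g, "openai/o3") for g in groups]

    # Train exactly as before, now on judge-assigned rewards.
    await model.train(judged_groups)
```

Note that scoring happens per `TrajectoryGroup` rather than per trajectory: the "Relative" in RULER's name reflects that the judge ranks trajectories within a group against each other instead of assigning each one an absolute score.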