From 05228bc8c443c8b0d14994da9110d76cf353ab65 Mon Sep 17 00:00:00 2001
From: arcticfly
Date: Thu, 28 Aug 2025 11:51:54 -0700
Subject: [PATCH 1/3] Add step: to steps

---
 docs/tutorials/open-deep-research.mdx | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/tutorials/open-deep-research.mdx b/docs/tutorials/open-deep-research.mdx
index c4b57c2d7..54ec373ab 100644
--- a/docs/tutorials/open-deep-research.mdx
+++ b/docs/tutorials/open-deep-research.mdx
@@ -27,7 +27,7 @@
 Total cost: ~$100
 
 
-## Step 1: Clone the starter repo and install dependencies
+### Step 1: Clone the starter repo and install dependencies
 
 To get started, clone [Open Deep Research Training](https://github.com/OpenPipe/open_deep_research_training), which contains the following pieces of our RL pipeline:
 
@@ -40,7 +40,7 @@ Once the repository is cloned, install dependencies. If you haven't already, ins
 
 Then install the project dependencies by running `uv sync`.
 
-### 2. Install SkyPilot/RunPod
+### Step 2: Install SkyPilot/RunPod
 
 We'll be using `LocalBackend` to manage the GPU that your model will be trained on. In order to provision a GPU for your training run, you'll need to have SkyPilot installed on your machine and provide it with the credentials to spin up machines on at least one infra provider.
 
@@ -48,11 +48,11 @@ We recommend using RunPod because of their ease of use, but any infra provider t
 
 Follow RunPod's **Getting Started** guide [here](https://docs.runpod.io/integrations/skypilot/). You'll have to provide a credit card to use RunPod, but you'll only pay for the time your GPUs are running.
 
-### 3. Set up optional environment variables found in `.env.example`
+### Step 3: Set up optional environment variables found in `.env.example`
 
 Copy `.env.example` to `.env` at the root of the repository, and fill in the values for the environment variables. If you're unsure about any of the values, refer to [ENV_INSTRUCTIONS.md](https://github.com/OpenPipe/open_deep_research_training/blob/main/ENV_INSTRUCTIONS.md).
 
-### 4. Run the training scripts
+### Step 4: Run the training scripts
 
 You'll want to run these scripts in this order:
 
@@ -87,7 +87,7 @@ The following steps execute when a training run on a new cluster begins:
 - **Upload the final model checkpoint.**
   - This usually takes a few minutes.
 
-### 5. Generate the benchmarks
+### Step 5: Generate the benchmarks
 
 Run the benchmark script to evaluate your trained models:
 
@@ -103,7 +103,7 @@ This script will:
 
 Then run the `display_benchmarks.ipynb` notebook to visualize the results and generate comparison charts.
 
-### 6. Shutting down the cluster
+### Step 6: Shutting down the cluster
 
 When you're done training and running benchmarks, you can shut down the cluster by running:

From 5aee4825e14ada90114b5edec8e291c049b1ba4d Mon Sep 17 00:00:00 2001
From: arcticfly
Date: Thu, 28 Aug 2025 11:52:31 -0700
Subject: [PATCH 2/3] Increase estimated cost

---
 docs/tutorials/open-deep-research.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/tutorials/open-deep-research.mdx b/docs/tutorials/open-deep-research.mdx
index 54ec373ab..30a3c3b87 100644
--- a/docs/tutorials/open-deep-research.mdx
+++ b/docs/tutorials/open-deep-research.mdx
@@ -23,7 +23,7 @@
 Reading time: 45 min
 
 Training time: 30hr
 
-Total cost: ~$100
+Total cost: ~$350
 
 

From 657b5b916a350ddb99985e87f4abf238d9dd9d90 Mon Sep 17 00:00:00 2001
From: arcticfly
Date: Thu, 28 Aug 2025 11:55:47 -0700
Subject: [PATCH 3/3] Update deep research tutorial

---
 docs/tutorials/open-deep-research.mdx | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/tutorials/open-deep-research.mdx b/docs/tutorials/open-deep-research.mdx
index 30a3c3b87..79c08ccd9 100644
--- a/docs/tutorials/open-deep-research.mdx
+++ b/docs/tutorials/open-deep-research.mdx
@@ -5,7 +5,7 @@
 description: "Train a deep research agent to exceed SOTA performance using GRPO"
 icon: "magnifying-glass"
 ---
 
-This tutorial demonstrates how to train an LLM using GRPO to exceed SOTA performance at deep research. Specifically, you will be using the [ART](https://github.com/OpenPipe/ART) library to specialize an agent for [Langchain's open deep research](https://github.com/langchain-ai/open_deep_research) framework, and will evaluate your agent's performance using [DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents](https://github.com/Ayanami0730/deep_research_bench).
+This tutorial demonstrates how to train your own deep research agent using GRPO to exceed Sonnet 4's performance. Specifically, you will be using the [ART](https://github.com/OpenPipe/ART) library to specialize Qwen 2.5 14B for [Langchain's open deep research](https://github.com/langchain-ai/open_deep_research) framework, and will evaluate your agent's performance using [DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents](https://github.com/Ayanami0730/deep_research_bench).
 
 In addition to the GRPO training step, you will also run an initial SFT training run to improve the model's baseline performance.
@@ -21,7 +21,7 @@
 
 
 Reading time: 45 min
 
-Training time: 30hr
+Training time: 30 hr
 
 Total cost: ~$350
@@ -144,7 +144,7 @@ To learn more about ART, check out another tutorial or look through our notebook