From 9c747ee26d28f4be5defc0663d9551ea11341284 Mon Sep 17 00:00:00 2001
From: Andrew Schilling <aschilling@nvidia.com>
Date: Thu, 10 Apr 2025 17:42:14 +0000
Subject: [PATCH 1/5] Rename docs files to use hyphens instead of underscores

Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
---
 ...ding_new_models.md => adding-new-models.md} |  0
 .../chat-datasets.md}                          |  0
 .../checkpointing.md                           |  0
 .../design-and-philosophy.md}                  |  0
 .../{design_docs => design-docs}/generation.md |  0
 docs/design-docs/gpu-logger.md                 |  0
 docs/{design_docs => design-docs}/logger.md    |  0
 docs/{design_docs => design-docs}/padding.md   |  0
 docs/{design_docs => design-docs}/uv.md        |  0
 docs/guides/sft.md                             |  2 +-
 docs/index.md                                  | 18 +++++++++---------
 ...cal_workstation.md => local-workstation.md} |  0
 12 files changed, 10 insertions(+), 10 deletions(-)
 rename docs/{adding_new_models.md => adding-new-models.md} (100%)
 rename docs/{design_docs/chat_datasets.md => design-docs/chat-datasets.md} (100%)
 rename docs/{design_docs => design-docs}/checkpointing.md (100%)
 rename docs/{design_docs/design_and_philosophy.md => design-docs/design-and-philosophy.md} (100%)
 rename docs/{design_docs => design-docs}/generation.md (100%)
 create mode 100644 docs/design-docs/gpu-logger.md
 rename docs/{design_docs => design-docs}/logger.md (100%)
 rename docs/{design_docs => design-docs}/padding.md (100%)
 rename docs/{design_docs => design-docs}/uv.md (100%)
 rename docs/{local_workstation.md => local-workstation.md} (100%)

diff --git a/docs/adding_new_models.md b/docs/adding-new-models.md
similarity index 100%
rename from docs/adding_new_models.md
rename to docs/adding-new-models.md
diff --git a/docs/design_docs/chat_datasets.md b/docs/design-docs/chat-datasets.md
similarity index 100%
rename from docs/design_docs/chat_datasets.md
rename to docs/design-docs/chat-datasets.md
diff --git a/docs/design_docs/checkpointing.md b/docs/design-docs/checkpointing.md
similarity index 100%
rename from docs/design_docs/checkpointing.md
rename to docs/design-docs/checkpointing.md
diff --git a/docs/design_docs/design_and_philosophy.md b/docs/design-docs/design-and-philosophy.md
similarity index 100%
rename from docs/design_docs/design_and_philosophy.md
rename to docs/design-docs/design-and-philosophy.md
diff --git a/docs/design_docs/generation.md b/docs/design-docs/generation.md
similarity index 100%
rename from docs/design_docs/generation.md
rename to docs/design-docs/generation.md
diff --git a/docs/design-docs/gpu-logger.md b/docs/design-docs/gpu-logger.md
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/docs/design_docs/logger.md b/docs/design-docs/logger.md
similarity index 100%
rename from docs/design_docs/logger.md
rename to docs/design-docs/logger.md
diff --git a/docs/design_docs/padding.md b/docs/design-docs/padding.md
similarity index 100%
rename from docs/design_docs/padding.md
rename to docs/design-docs/padding.md
diff --git a/docs/design_docs/uv.md b/docs/design-docs/uv.md
similarity index 100%
rename from docs/design_docs/uv.md
rename to docs/design-docs/uv.md
diff --git a/docs/guides/sft.md b/docs/guides/sft.md
index 4d452b109d..8a67da85e8 100644
--- a/docs/guides/sft.md
+++ b/docs/guides/sft.md
@@ -29,7 +29,7 @@ SFT datasets in Reinforcer are encapsulated using classes. Each SFT data class i
   1. `formatted_ds`: The dictionary of formatted datasets. This dictionary should contain `train` and `validation` splits, and each split should conform to the format described below.
   2. `task_spec`: The `TaskDataSpec` for this dataset. This should specify the name you choose for this dataset as well as the `custom_template` for this dataset. More on custom templates below.
 
-SFT datasets are expected to follow the HuggingFace chat format. Refer to the [chat dataset document](../design_docs/chat_datasets.md) for details. If your data is not in the correct format, simply write a preprocessing script to convert the data into this format. [data/hf_datasets/squad.py](../../nemo_reinforcer/data/hf_datasets/squad.py) has an example:
+SFT datasets are expected to follow the HuggingFace chat format. Refer to the [chat dataset document](../design-docs/chat-datasets.md) for details. If your data is not in the correct format, simply write a preprocessing script to convert the data into this format. [data/hf_datasets/squad.py](../../nemo_reinforcer/data/hf_datasets/squad.py) has an example:
 
 ```python
 def format_squad(data):
diff --git a/docs/index.md b/docs/index.md
index 553778ff98..0b802b0ce2 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -6,7 +6,7 @@
 :caption: 🖥️  Environment Start
 :hidden:
 
-local_workstation.md
+local-workstation.md
 cluster.md
 
 ```
@@ -15,7 +15,7 @@ cluster.md
 :caption: 📚 Guides
 :hidden:
 
-adding_new_models.md
+adding-new-models.md
 guides/sft.md
 guides/grpo.md
 guides/eval.md
@@ -41,11 +41,11 @@ apidocs/index.rst
 :caption: 📐 Design Docs
 :hidden:
 
-design_docs/design_and_philosophy.md
-design_docs/padding.md
-design_docs/logger.md
-design_docs/uv.md
-design_docs/chat_datasets.md
-design_docs/generation.md
-design_docs/checkpointing.md
+design-docs/design-and-philosophy.md
+design-docs/padding.md
+design-docs/logger.md
+design-docs/uv.md
+design-docs/chat-datasets.md
+design-docs/generation.md
+design-docs/checkpointing.md
 ```
diff --git a/docs/local_workstation.md b/docs/local-workstation.md
similarity index 100%
rename from docs/local_workstation.md
rename to docs/local-workstation.md

From 19e3d85c0452977d7c633f59ba8ad126dc66a312 Mon Sep 17 00:00:00 2001
From: Andrew Schilling <aschilling@nvidia.com>
Date: Thu, 10 Apr 2025 17:59:27 +0000
Subject: [PATCH 2/5] Add index pages for guides and design docs (SEO fixes)

Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
---
 docs/design-docs/index.md | 12 ++++++++++++
 docs/guides/index.md      |  9 +++++++++
 2 files changed, 21 insertions(+)
 create mode 100644 docs/design-docs/index.md
 create mode 100644 docs/guides/index.md

diff --git a/docs/design-docs/index.md b/docs/design-docs/index.md
new file mode 100644
index 0000000000..e178a61002
--- /dev/null
+++ b/docs/design-docs/index.md
@@ -0,0 +1,12 @@
+```{toctree}
+:caption: 📐 Design Docs
+:hidden:
+
+design-and-philosophy.md
+padding.md
+logger.md
+uv.md
+chat-datasets.md
+generation.md
+checkpointing.md
+```
\ No newline at end of file
diff --git a/docs/guides/index.md b/docs/guides/index.md
new file mode 100644
index 0000000000..4276cc8d22
--- /dev/null
+++ b/docs/guides/index.md
@@ -0,0 +1,9 @@
+```{toctree}
+:caption: 📚 Guides
+:hidden:
+
+adding-new-models.md
+sft.md
+grpo.md
+eval.md
+```
\ No newline at end of file

From f25343778b5d8cffa58cd82c61e0538a40b0939f Mon Sep 17 00:00:00 2001
From: Andrew Schilling <aschilling@nvidia.com>
Date: Thu, 10 Apr 2025 18:46:11 +0000
Subject: [PATCH 3/5] Trying to sign commit

Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
---
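Reviewer note (not part of the commit message): the hunk below touches the
text that describes the multiplicative probability error metric, i.e. the
mean over sampled tokens of exp(|logprobs-train-fwk_i - logprobs-sampling-fwk_i|).
A minimal sketch of that formula, assuming the per-token log-probabilities
are already available as two equal-length sequences of floats. The function
name and argument names are hypothetical and are not taken from the repository.

```python
import math
from typing import Sequence


def multiplicative_prob_error(
    train_logprobs: Sequence[float],
    sampling_logprobs: Sequence[float],
) -> float:
    """Mean of exp(|train - sampling|) over tokens (hypothetical helper).

    Both inputs hold log-probabilities of the same sampled tokens, one
    scored by the training framework and one by the sampling framework.
    """
    if len(train_logprobs) != len(sampling_logprobs):
        raise ValueError("both sequences must have the same number of tokens")
    if len(train_logprobs) == 0:
        raise ValueError("at least one token is required")
    total = sum(
        math.exp(abs(t - s)) for t, s in zip(train_logprobs, sampling_logprobs)
    )
    return total / len(train_logprobs)


# Identical log-probabilities give the minimum possible value of 1.0.
perfect = multiplicative_prob_error([-0.5, -2.0], [-0.5, -2.0])

# A single token whose probability differs by a factor of 2 gives about 2.0.
off_by_2x = multiplicative_prob_error([-1.0], [-1.0 - math.log(2.0)])
```

A value of 1.0 means the two frameworks agree exactly on the sampled tokens;
larger values indicate a larger multiplicative gap in token probability.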
 docs/adding-new-models.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/adding-new-models.md b/docs/adding-new-models.md
index c39642ea69..a682b4c3a7 100644
--- a/docs/adding-new-models.md
+++ b/docs/adding-new-models.md
@@ -20,7 +20,7 @@ $$
 \frac{1}{n}\sum_{i=1}^{n\text{(tokens)}}\exp\left(\left\|\text{logprobs-train-fwk}_i - \text{logprobs-sampling-fwk}_i\right\|\right)
 $$
 
-where samples are drawn as $x \sim \pi_{\text{sampling-framework}}$
+Where samples are drawn as $x \sim \pi_{\text{sampling-framework}}$
 
 as a measure of multiplicative probability error for sampled tokens. Note that this is not exhaustive (the sampling framework could lack distribution support and we wouldn't catch it here, as $x \sim \pi_{\text{sampling-framework}}$). To get a much stricter guarantee on correctness, you should run this metric twice and average the results, where in the second run, you sample $x \sim \pi_{\text{training-framework}}$. In practice, we use just the former in our tests and find it sufficient.
 

From 3a878c516dbd80574d417ec181cef35307c3c9e0 Mon Sep 17 00:00:00 2001
From: Andrew Schilling <aschilling@nvidia.com>
Date: Thu, 10 Apr 2025 19:10:28 +0000
Subject: [PATCH 4/5] Capitalize "As" in adding-new-models.md

Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
---
 docs/adding-new-models.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/adding-new-models.md b/docs/adding-new-models.md
index a682b4c3a7..41aac4d07b 100644
--- a/docs/adding-new-models.md
+++ b/docs/adding-new-models.md
@@ -22,7 +22,7 @@ $$
 
 Where samples are drawn as $x \sim \pi_{\text{sampling-framework}}$
 
-as a measure of multiplicative probability error for sampled tokens. Note that this is not exhaustive (the sampling framework could lack distribution support and we wouldn't catch it here, as $x \sim \pi_{\text{sampling-framework}}$). To get a much stricter guarantee on correctness, you should run this metric twice and average the results, where in the second run, you sample $x \sim \pi_{\text{training-framework}}$. In practice, we use just the former in our tests and find it sufficient.
+As a measure of multiplicative probability error for sampled tokens. Note that this is not exhaustive (the sampling framework could lack distribution support and we wouldn't catch it here, as $x \sim \pi_{\text{sampling-framework}}$). To get a much stricter guarantee on correctness, you should run this metric twice and average the results, where in the second run, you sample $x \sim \pi_{\text{training-framework}}$. In practice, we use just the former in our tests and find it sufficient.
 
 ## Understanding Discrepancies Between Backends
 

From 9d89cb5228020a55655e149920502a2b5a7b49db Mon Sep 17 00:00:00 2001
From: Andrew Schilling <aschilling@nvidia.com>
Date: Tue, 15 Apr 2025 17:02:24 +0000
Subject: [PATCH 5/5] Revert capitalization of "where" in adding-new-models.md

Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
---
 docs/adding-new-models.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/adding-new-models.md b/docs/adding-new-models.md
index 41aac4d07b..34aaaaf3b0 100644
--- a/docs/adding-new-models.md
+++ b/docs/adding-new-models.md
@@ -20,7 +20,7 @@ $$
 \frac{1}{n}\sum_{i=1}^{n\text{(tokens)}}\exp\left(\left\|\text{logprobs-train-fwk}_i - \text{logprobs-sampling-fwk}_i\right\|\right)
 $$
 
-Where samples are drawn as $x \sim \pi_{\text{sampling-framework}}$
+where samples are drawn as $x \sim \pi_{\text{sampling-framework}}$
 
 As a measure of multiplicative probability error for sampled tokens. Note that this is not exhaustive (the sampling framework could lack distribution support and we wouldn't catch it here, as $x \sim \pi_{\text{sampling-framework}}$). To get a much stricter guarantee on correctness, you should run this metric twice and average the results, where in the second run, you sample $x \sim \pi_{\text{training-framework}}$. In practice, we use just the former in our tests and find it sufficient.