From 4d9e5d7034e38041bc022b4dbc576bd8446cc21d Mon Sep 17 00:00:00 2001
From: Stas Bekman
Date: Tue, 23 Mar 2021 22:26:23 -0700
Subject: [PATCH 1/2] [doc] pipeline

As @g-karthik flagged in
https://github.com/microsoft/DeepSpeed/pull/659#discussion_r600132598
my previous correction PR had one sentence that said the wrong thing.
So this PR attempts to rectify that.

Thank you!
---
 docs/_tutorials/pipeline.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/_tutorials/pipeline.md b/docs/_tutorials/pipeline.md
index 70790c82b301..dcd7666cfe1a 100644
--- a/docs/_tutorials/pipeline.md
+++ b/docs/_tutorials/pipeline.md
@@ -276,9 +276,9 @@ For example, a machine with 16 GPUs must have as much local CPU memory as 16 tim
 DeepSpeed provides a `LayerSpec` class that delays the construction of modules
 until the model layers have been partitioned across workers.
-Then each worker will allocate only the layers it's assigned to. So, continuing the
+Then each worker will allocate only the layers it's assigned to. So, comparing to the
 example from the previous paragraph, a machine with 16 GPUs will need to allocate a
-total of 1x model size on its CPU, compared to 16x in the LayerSpec example.
+total of 1x model size on its CPU memory and not 16x.
 Here is an example of the abbreviated AlexNet model, but expressed only with
 `LayerSpec`s.
 Note that the syntax is almost unchanged: `nn.ReLU(inplace=True)`

From 1ca1570dc49e8b2b85290ceac5a33d8768bcac6f Mon Sep 17 00:00:00 2001
From: Stas Bekman
Date: Tue, 23 Mar 2021 22:27:55 -0700
Subject: [PATCH 2/2] tweak

---
 docs/_tutorials/pipeline.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/_tutorials/pipeline.md b/docs/_tutorials/pipeline.md
index dcd7666cfe1a..0d847ea18752 100644
--- a/docs/_tutorials/pipeline.md
+++ b/docs/_tutorials/pipeline.md
@@ -277,8 +277,8 @@ For example, a machine with 16 GPUs must have as much local CPU memory as 16 tim
 DeepSpeed provides a `LayerSpec` class that delays the construction of modules
 until the model layers have been partitioned across workers.
 Then each worker will allocate only the layers it's assigned to. So, comparing to the
-example from the previous paragraph, a machine with 16 GPUs will need to allocate a
-total of 1x model size on its CPU memory and not 16x.
+example from the previous paragraph, using `LayerSpec` a machine with 16 GPUs will need to
+allocate a total of 1x model size on its CPU memory and not 16x.
 Here is an example of the abbreviated AlexNet model, but expressed only with
 `LayerSpec`s.
 Note that the syntax is almost unchanged: `nn.ReLU(inplace=True)`
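For context on the doc text these patches touch: the delayed-construction idea behind `LayerSpec` can be sketched in plain Python. This is a toy illustration, not DeepSpeed's actual implementation (the real class lives in `deepspeed.pipe`); the `Linear` stand-in and the rank/slicing scheme here are invented for the example.

```python
class LayerSpec:
    """Sketch of delayed construction: record the layer class and its
    arguments now, but build the actual layer only when asked."""
    def __init__(self, typename, *args, **kwargs):
        self.typename = typename
        self.args = args
        self.kwargs = kwargs

    def build(self):
        # Only here is memory for the layer actually allocated.
        return self.typename(*self.args, **self.kwargs)


class Linear:
    """Stand-in for a real layer; a real one would allocate parameter
    tensors in __init__, which is exactly what LayerSpec postpones."""
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


# Specs are cheap: no layer memory is allocated yet.
specs = [LayerSpec(Linear, 10, 10) for _ in range(16)]

# With 16 workers, each worker builds only its own slice of the model,
# so the machine holds ~1x the model size in total rather than 16x
# (one full copy per worker, as with eagerly constructed modules).
rank, world_size = 3, 16
per_worker = len(specs) // world_size
mine = [s.build() for s in specs[rank * per_worker:(rank + 1) * per_worker]]
print(len(mine))  # this worker materialized just its assigned layers
```

The point mirrors the sentence the patches fix: eager construction makes every one of the 16 workers hold the whole model in CPU memory (16x), while specs let each worker materialize only its partition (1x in total).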