
Conversation

@cloud-fan
Contributor

What changes were proposed in this pull request?

Part of the reason TaskMetrics and its callers are complicated is the optional metrics we collect: input, output, shuffle read, and shuffle write. I think we can always track them and simply assign 0 as the initial values. It is usually very obvious whether a task is supposed to read any data or not. By always tracking them, we can remove a lot of `map`, `foreach`, `flatMap`, and `getOrElse(0L)` calls throughout Spark.

This patch also changes a few behaviors.

  1. Removed the distinction between data read/write methods (e.g. Hadoop, Memory, Network).
  2. Accumulate all data reads and writes, rather than only those from the first method. (Fixes SPARK-5225)
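The simplification can be sketched as follows. The class and field names below are illustrative stand-ins, not the actual TaskMetrics API:

```scala
// Before: each metrics group is optional, so every caller must unwrap it.
case class OldShuffleReadMetrics(recordsRead: Long)
case class OldTaskMetrics(shuffleRead: Option[OldShuffleReadMetrics])

val before = OldTaskMetrics(shuffleRead = None)
// Callers end up littered with map/getOrElse chains like this:
val recordsBefore = before.shuffleRead.map(_.recordsRead).getOrElse(0L)

// After: metrics are always present and initialized to 0, so reads are direct.
class ShuffleReadMetricsSketch { var recordsRead: Long = 0L }
class TaskMetricsSketch { val shuffleRead = new ShuffleReadMetricsSketch }

val after = new TaskMetricsSketch
// A task that read no shuffle data simply reports 0, with no Option handling.
val recordsAfter = after.shuffleRead.recordsRead
```

Both styles report 0 for a task with no shuffle input; the always-tracked version just gets there without any optional plumbing.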

How was this patch tested?

existing tests.

This is based on #12388, with more test fixes.

@cloud-fan
Contributor Author

cc @rxin

fetchWaitTime = internal.fetchWaitTime,
remoteBytesRead = internal.remoteBytesRead,
totalBlocksFetched = internal.totalBlocksFetched,
recordsRead = internal.recordsRead
Contributor Author


Looks like the internal and external shuffle read metrics don't match; `localBytesRead` is missing.
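A sketch of the fuller copy being asked for, with `localBytesRead` included. The case classes below are hypothetical stand-ins for the real internal and external metric types; only the field names follow the snippet above:

```scala
// Hypothetical holders for the internal and externally-exposed
// shuffle-read metrics.
case class InternalShuffleRead(
    fetchWaitTime: Long,
    remoteBytesRead: Long,
    localBytesRead: Long,
    totalBlocksFetched: Long,
    recordsRead: Long)

case class ExternalShuffleRead(
    fetchWaitTime: Long,
    remoteBytesRead: Long,
    localBytesRead: Long,
    totalBlocksFetched: Long,
    recordsRead: Long)

def toExternal(internal: InternalShuffleRead): ExternalShuffleRead =
  ExternalShuffleRead(
    fetchWaitTime = internal.fetchWaitTime,
    remoteBytesRead = internal.remoteBytesRead,
    localBytesRead = internal.localBytesRead, // the field the comment flags as missing
    totalBlocksFetched = internal.totalBlocksFetched,
    recordsRead = internal.recordsRead)
```

Without the `localBytesRead` line, locally-fetched shuffle bytes would silently drop out of the external view.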

@SparkQA

SparkQA commented Apr 15, 2016

Test build #55926 has finished for PR 12417 at commit cbc154f.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 15, 2016

Test build #55931 has finished for PR 12417 at commit 2711075.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

} else {
JNothing
}
val shuffleWriteMetrics: JValue = if (taskMetrics.shuffleWriteMetrics.isUpdated) {
Contributor


Can't we always output the metrics, and just fix the JSON protocol test?
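What this suggests, sketched without the json4s `JValue`/`JNothing` types the real protocol uses (the names below are illustrative): with always-tracked metrics the writer can emit the block unconditionally rather than guarding on `isUpdated`.

```scala
// Hypothetical always-present shuffle-write metrics, initialized to 0.
class ShuffleWriteMetricsSketch { var shuffleBytesWritten: Long = 0L }

// No isUpdated guard: even a task that wrote nothing produces a
// well-formed block of zeros instead of an absent field.
def shuffleWriteToJson(m: ShuffleWriteMetricsSketch): String =
  s"""{"Shuffle Bytes Written": ${m.shuffleBytesWritten}}"""

val untouched = new ShuffleWriteMetricsSketch
shuffleWriteToJson(untouched) // {"Shuffle Bytes Written": 0}
```

Emitting zeros unconditionally is exactly why the follow-up had to regenerate the gold JSON answers in the test suites.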

@SparkQA

SparkQA commented Apr 15, 2016

Test build #2791 has finished for PR 12417 at commit 2711075.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 15, 2016

Test build #2793 has finished for PR 12417 at commit 2711075.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Apr 15, 2016

Going to merge this first. We can fix the test thing later.

@asfgit asfgit closed this in 8028a28 Apr 15, 2016
asfgit pushed a commit that referenced this pull request Apr 18, 2016
## What changes were proposed in this pull request?

This PR is a follow-up to #12417: we now always track input/output/shuffle metrics in the Spark JSON protocol and status API.

Most of the line changes come from regenerating the gold answers for `HistoryServerSuite`, which now include a lot of 0 values for read/write metrics.

## How was this patch tested?

existing tests.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #12462 from cloud-fan/follow.
lw-lin pushed a commit to lw-lin/spark that referenced this pull request Apr 20, 2016
…te metrics

## What changes were proposed in this pull request?

Part of the reason TaskMetrics and its callers are complicated is the optional metrics we collect: input, output, shuffle read, and shuffle write. I think we can always track them and simply assign 0 as the initial values. It is usually very obvious whether a task is supposed to read any data or not. By always tracking them, we can remove a lot of `map`, `foreach`, `flatMap`, and `getOrElse(0L)` calls throughout Spark.

This patch also changes a few behaviors.

1. Removed the distinction between data read/write methods (e.g. Hadoop, Memory, Network).
2. Accumulate all data reads and writes, rather than only those from the first method. (Fixes SPARK-5225)

## How was this patch tested?

existing tests.

This is based on apache#12388, with more test fixes.

Author: Reynold Xin <rxin@databricks.com>
Author: Wenchen Fan <wenchen@databricks.com>

Closes apache#12417 from cloud-fan/metrics-refactor.
lw-lin pushed a commit to lw-lin/spark that referenced this pull request Apr 20, 2016
## What changes were proposed in this pull request?

This PR is a follow-up to apache#12417: we now always track input/output/shuffle metrics in the Spark JSON protocol and status API.

Most of the line changes come from regenerating the gold answers for `HistoryServerSuite`, which now include a lot of 0 values for read/write metrics.

## How was this patch tested?

existing tests.

Author: Wenchen Fan <wenchen@databricks.com>

Closes apache#12462 from cloud-fan/follow.
