mosche (Member) commented Mar 6, 2023

Change the default storage level to the default of Spark's Dataset API (MEMORY_AND_DISK) rather than the default of the RDD API / the SparkRunner (MEMORY_ONLY).

Closes #25737
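
To illustrate why the default matters, here is a minimal toy model in plain Java. It is not Spark's or Beam's implementation: the `Level` enum, the `place` method, and the partition sizes and memory budget are hypothetical names and numbers chosen for the sketch. It relies only on the documented behavior of the two levels: with MEMORY_ONLY, partitions that do not fit in memory are not cached and are recomputed when needed, while with MEMORY_AND_DISK they are spilled to disk instead.

```java
public class StorageLevelSketch {
    /** Toy stand-ins for two Spark storage levels (hypothetical, not Spark's StorageLevel class). */
    enum Level {
        MEMORY_ONLY(false),
        MEMORY_AND_DISK(true);

        final boolean useDisk;

        Level(boolean useDisk) {
            this.useDisk = useDisk;
        }
    }

    /**
     * Assigns each partition to memory while the budget allows it.
     * Returns {inMemory, onDisk, recomputed} partition counts.
     */
    static int[] place(int[] partitionSizes, int memoryBudget, Level level) {
        int used = 0;
        int inMemory = 0;
        int onDisk = 0;
        int recomputed = 0;
        for (int size : partitionSizes) {
            if (used + size <= memoryBudget) {
                used += size;   // fits: keep the partition in memory
                inMemory++;
            } else if (level.useDisk) {
                onDisk++;       // does not fit: spill to disk
            } else {
                recomputed++;   // does not fit: not cached, recomputed on access
            }
        }
        return new int[] {inMemory, onDisk, recomputed};
    }

    public static void main(String[] args) {
        int[] sizes = {40, 40, 40, 40}; // assumed partition sizes
        int budget = 100;               // assumed memory budget
        for (Level level : Level.values()) {
            int[] r = place(sizes, budget, level);
            System.out.println(level + ": inMemory=" + r[0] + ", onDisk=" + r[1] + ", recomputed=" + r[2]);
        }
    }
}
```

In this model, the same data that forces recomputation under MEMORY_ONLY is served from disk under MEMORY_AND_DISK, which is the trade-off the new default opts into.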



github-actions bot (Contributor) commented Mar 6, 2023

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment "assign set of reviewers".

mosche (Member, Author) commented Mar 7, 2023

Run Java PreCommit

mosche (Member, Author) commented Mar 7, 2023

R: @aromanenko-dev

github-actions bot (Contributor) commented Mar 7, 2023

Stopping reviewer notifications for this pull request: review was requested by someone other than the bot, so the bot is ceding control.

aromanenko-dev (Contributor) left a comment

LGTM

mosche (Member, Author) commented Mar 8, 2023

Run Spark Runner Tpcds Tests

mosche (Member, Author) commented Mar 8, 2023

Run Spark Runner Tpcds Tests

mosche (Member, Author) commented Mar 8, 2023

Run Spark Runner Tpcds Tests

@mosche mosche merged commit de8f317 into apache:master Mar 8, 2023
@mosche mosche deleted the 25737_spark_ds_default_storage_level branch March 8, 2023 09:57
ruslan-ikhsan pushed a commit to akvelon/beam that referenced this pull request Mar 10, 2023
Development

Successfully merging this pull request may close these issues.

[Task]: Spark Dataset runner should use default storage level MEMORY_AND_DISK
