Skip to content
This repository was archived by the owner on Jun 14, 2024. It is now read-only.

Conversation

@apoorvedave1
Copy link
Contributor

What is the context for this pull request?

Hyperspace throws error for refreshIndex api for following cases:

  • with delete flag enabled, and no data files were deleted => HyperspaceException
  • with append flag enabled, and no new data files were appended => HyperspaceException

Ideally the behavior should be, a refreshIndex just end operation saying "Nothing to do!!".
This PR introduces another exception type: NoChangesDetected. The base Action class eats up these exception with logging what happened and why it's not necessary to continue this operation.
The end user just sees log warnings but their code is not broken if source data is not modified.

What changes were proposed in this pull request?

Change the behavior of refresh api for deletes and appends if source data is not modified.

Does this PR introduce any user-facing change?

Yes:
Hyperspace will NOT throw error for refreshIndex api for following cases:

  • with delete flag enabled, and no data files were deleted => logWarning and continue
  • with append flag enabled, and no new data files were appended => logWarning and continue

How was this patch tested?

updated unit tests for this case

@apoorvedave1 apoorvedave1 self-assigned this Oct 8, 2020
Comment on lines 60 to 64

if (deletedFiles.isEmpty) {
throw NoChangesDetected("Refresh aborted as no deleted source data file found.")
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reviewers: this order is changed because if there are no deleted files, we don't really need to go further and fail the operation for lineage column missing.

@apoorvedave1 apoorvedave1 changed the title Delete and APpend refresh apis should not throw error if data isn't changed Delete and Append refresh apis should not throw error if data isn't changed Oct 8, 2020
@sezruby
Copy link
Collaborator

sezruby commented Oct 8, 2020

I think it would be good to check the condition - no data change - for "full" refresh mode?
It doesn't need to be handled with this PR, but it seems necessary.

@apoorvedave1
Copy link
Contributor Author

I think it would be good to check the condition - no data change - for "full" refresh mode?
It doesn't need to be handled with this PR, but it seems necessary.

Fair enough, created this issue #190

@apoorvedave1 apoorvedave1 requested a review from imback82 October 8, 2020 19:26
@imback82
Copy link
Contributor

imback82 commented Oct 8, 2020

@apoorvedave1 Can you check the build failure?

logWarning(e.msg)
case e: Exception =>
logEvent(event(appInfo, message = s"Operation Failed: ${e.getMessage}."))
logEvent(event(appInfo, message = s"Operation failed: ${e.getMessage}"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reviewers: the trailing full stop "." is removed in the lines after ${e.getMessage} because there are cases when the e.getMessage contains a . (refer this for example)

@apoorvedave1 apoorvedave1 requested a review from imback82 October 8, 2020 23:00
Copy link
Contributor

@imback82 imback82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor comments, but LGTM, thanks @apoorvedave1!

Apoorve Dave and others added 5 commits October 8, 2020 17:31
protected lazy val spark: SparkSession = SparkSession
.builder()
.master(s"local[$numParallelism]")
.config(HYPERSPACE_EVENT_LOGGER_CLASS_KEY, "com.microsoft.hyperspace.MockEventLogger")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reviewers: setting this config at the beginning of tests because it's a lazy val in a object EventLogger class, and hence it is initialized only once at the beginning of the test suite.

I tried setting this within a test but since the lazy val got initialized before this config could take effect, it did not work.

Copy link
Contributor

@imback82 imback82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @apoorvedave1!

@imback82 imback82 merged commit ba8733f into microsoft:master Oct 9, 2020
@apoorvedave1 apoorvedave1 deleted the errorhandling branch March 16, 2021 22:40
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants