-
Notifications
You must be signed in to change notification settings - Fork 116
Support refresh quick mode by using Hybrid Scan #238
Conversation
src/main/scala/com/microsoft/hyperspace/index/rules/RuleUtils.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/actions/RefreshQuickAction.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/actions/RefreshQuickAction.scala
Outdated
Show resolved
Hide resolved
src/test/scala/com/microsoft/hyperspace/index/E2EHyperspaceRulesTests.scala
Show resolved
Hide resolved
| // In case the index has appended files and/or deleted files which are updated by quick | ||
| // refresh mode, we should handle them in query time by using Hybrid Scan. | ||
| // Because quick refresh doesn't update index data, but just metadata in IndexLogEntry. | ||
| lazy val quickRefreshedIndex = index.hasSourceUpdate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are using it only once, do we need to create a separate (lazy) val for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mainly for readability and comments. + It's only required when HybridScan is enabled, so it's lazy val.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a comment if you added lazy to avoid creating sets for appended files, etc.
src/main/scala/com/microsoft/hyperspace/index/rules/RuleUtils.scala
Outdated
Show resolved
Hide resolved
src/test/scala/com/microsoft/hyperspace/index/RefreshIndexTests.scala
Outdated
Show resolved
Hide resolved
|
@imback82 @pirz @apoorvedave1 Could you review this change? Thanks! |
pirz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thanks @sezruby
|
LGTM, thanks @sezruby |
imback82
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (few minor points), thanks @sezruby!
src/main/scala/com/microsoft/hyperspace/actions/RefreshQuickAction.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/index/rules/RuleUtils.scala
Outdated
Show resolved
Hide resolved
| // In case the index has appended files and/or deleted files which are updated by quick | ||
| // refresh mode, we should handle them in query time by using Hybrid Scan. | ||
| // Because quick refresh doesn't update index data, but just metadata in IndexLogEntry. | ||
| lazy val quickRefreshedIndex = index.hasSourceUpdate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a comment if you added lazy to avoid creating sets for appended files, etc.
src/main/scala/com/microsoft/hyperspace/index/rules/RuleUtils.scala
Outdated
Show resolved
Hide resolved
src/main/scala/com/microsoft/hyperspace/actions/RefreshQuickAction.scala
Outdated
Show resolved
Hide resolved
| (index.sourceFileInfoSet -- exist, filesAppended) | ||
| if (!HyperspaceConf.hybridScanEnabled(spark) && index.hasSourceUpdate) { | ||
| // If the index contains the source update info, we need to handle | ||
| // appendedFiles and deletedFiles in IndexLogEntry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add back the comment that signature validation was done by this time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the context for this pull request?
Fixes #174
Fixes #169
What changes were proposed in this pull request?
Support quick refresh of indexes which only updates the metadata -
Updatedata - of the index.After the refresh, the latest signature will be used for index selection process (getCandidateIndexes) and the appended files and deleted files of
Updatewill be handled by Hybrid Scan.Does this PR introduce any user-facing change?
Yes
How was this patch tested?
Unit test