require empty db keyspace to run hash#184
Merged
smacker merged 4 commits into src-d:master from smacker:require_empty_db_keyspace on Jan 29, 2019
carlosms
reviewed
Jan 28, 2019
  }

  def isDBEmpty(session: Session, mode: String): Boolean = {
    var row = session.execute(s"select count(*) from $keyspace.${tables.docFreq} where id='$mode'").one()
Contributor
Is the performance of `select count(*)` OK to run before every command? Would it improve if it were `select count(*) ... limit 1`?
Contributor
Author
Currently it's okay, because this table can contain at most 2 rows.
But it's a good point, and it's better to update the query in case we change that. 👍
Contributor
ok, I missed that about the table :D
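One way the emptiness check from this thread could avoid counting is to select a single column with `limit 1` and test whether any row came back. The sketch below is only an illustration under assumptions: `RowSource` is a hypothetical stand-in for the Cassandra session, and the keyspace, table, and mode values are example names, not taken from gemini.

```scala
// Hypothetical stand-in for a Cassandra session: runs a CQL string and
// returns the first row, if any. Not part of gemini or of any driver.
trait RowSource {
  def firstRow(cql: String): Option[Map[String, Any]]
}

object EmptyCheck {
  // Builds an existence query that asks for at most one matching row.
  def existsQuery(keyspace: String, table: String, mode: String): String =
    s"select id from $keyspace.$table where id='$mode' limit 1"

  // The table counts as empty for `mode` when the query returns no row.
  def isDBEmpty(src: RowSource, keyspace: String, table: String, mode: String): Boolean =
    src.firstRow(existsQuery(keyspace, table, mode)).isEmpty
}
```

With at most 2 rows in the table, as noted above, the difference is negligible today; the sketch only matters if the table grows.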
carlosms
approved these changes
Jan 28, 2019
Contributor
carlosms
left a comment
👍, left a small suggestion
    if (!gemini.isDBEmpty(cassandra, config.mode)) {
      println("Database keyspace is not empty! Hashing may produce wrong results. " +
        "Please choose another keyspace or pass --replace argument")
Contributor
Suggested change
-        "Please choose another keyspace or pass --replace argument")
+        "Please choose another keyspace or pass the --replace option")
Contributor
Author
Fixed the first commit with the new message.
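The guard discussed in this thread can be sketched as a small pure decision function. This is a minimal sketch, not the code in the PR: `HashGuard`, `Decision`, and `decide` are hypothetical names, and letting `--replace` take precedence over the emptiness check is an assumption.

```scala
object HashGuard {
  // Possible outcomes of the pre-hash check (hypothetical names).
  sealed trait Decision
  case object Proceed extends Decision          // keyspace is empty: hash as usual
  case object CleanThenProceed extends Decision // --replace given: clean up, then hash
  final case class Abort(message: String) extends Decision

  // Message text as worded in the suggested change above.
  val notEmptyMessage: String =
    "Database keyspace is not empty! Hashing may produce wrong results. " +
      "Please choose another keyspace or pass the --replace option"

  // Assumption: --replace wins even when the keyspace is already empty,
  // since cleaning an empty keyspace is harmless.
  def decide(dbEmpty: Boolean, replace: Boolean): Decision =
    if (replace) CleanThenProceed
    else if (dbEmpty) Proceed
    else Abort(notEmptyMessage)
}
```

Keeping the decision separate from the printing and the exit makes each branch easy to test without a database.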
Hashing can't be executed incrementally, because calculating document frequencies requires the full input. This commit checks whether the hashtables and docfreq tables are empty, and gemini exits with an error if they are not. It also introduces a new flag, --replace, which cleans up the db for the current hashing mode. Signed-off-by: Maxim Sukharev <max@smacker.ru>
It allows passing just `--cassandra` instead of `--cassandra=true`. Signed-off-by: Maxim Sukharev <max@smacker.ru>
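To illustrate what a unit-typed flag means, here is a hand-rolled sketch in which the mere presence of `--cassandra` turns the option on. It does not use the CLI library from the PR; `Flags`, `Config`, and `parse` are hypothetical names introduced only for this example.

```scala
object Flags {
  // Hypothetical config holder; only the one flag is modelled.
  final case class Config(cassandra: Boolean = false)

  // Unit-style flag: seeing "--cassandra" is enough, no "=true" value needed.
  def parse(args: Seq[String]): Config =
    args.foldLeft(Config()) {
      case (c, "--cassandra") => c.copy(cassandra = true)
      case (c, _)             => c // every other argument is ignored in this sketch
    }
}
```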
Signed-off-by: Maxim Sukharev <max@smacker.ru>
se7entyse7en
suggested changes
Jan 28, 2019
Signed-off-by: Maxim Sukharev <max@smacker.ru>
se7entyse7en
approved these changes
Jan 28, 2019
smola
approved these changes
Jan 28, 2019
Hashing can't be executed incrementally, because calculating document
frequencies requires the full input.
This commit checks whether the hashtables and docfreq tables are empty, and gemini
exits with an error if they are not.
It also introduces a new flag, `--replace`, which cleans up the db for the
current hashing mode.
There is also a separate commit that changes the type of the cassandra flag to unit.
It allows passing just `--cassandra` instead of `--cassandra=true` (for consistency).
Output when db isn't empty: