Skip to content

Schema refresh review fixes.#931

Merged
emtwo merged 1 commit intomasterfrom
schema-refresh-improvement-review
Apr 4, 2019
Merged

Schema refresh review fixes.#931
emtwo merged 1 commit intomasterfrom
schema-refresh-improvement-review

Conversation

@jezdez
Copy link
Copy Markdown

@jezdez jezdez commented Apr 4, 2019

Just a minor follow-up to #930.

@jezdez jezdez requested a review from emtwo April 4, 2019 12:19
noop_query = None
configuration_properties = None
data_sample_query = None
sample_query = None
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just to stay consistent with the noop_query name.

Comment thread redash/tasks/queries.py
models.db.session.commit()
logger.info("Deleted %d unused query results.", deleted_count)


Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's stay with PEP8 and do two linebreaks between functions.

Comment thread redash/tasks/queries.py
def cleanup_data_in_table(table_model):
removed_metadata = table_model.query.filter(
table_model.exists == False,
table_model.exists.is_(False),
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also something that flake8 reported in my editor.

Comment thread redash/tasks/queries.py
if is_old_data:
table_model.query.filter(
table_model.id == removed_metadata_row.id,
).delete()
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioneded on IRC this function should do the cleanup completely on the db side to prevent having to fetch the data to delete it afterwards. In my experience such cleanup function tend to break with race conditions if the list of things to delete exceeds memory or runtime limits.

Comment thread redash/tasks/queries.py
def insert_or_update_table_metadata(data_source, existing_tables_set, table_data):
# Update all persisted tables that exist to reflect this.
persisted_tables = TableMetadata.query.filter(
TableMetadata.name.in_(tuple(existing_tables_set)),
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No needs to convert the set to a tuple for IN queries.

Comment thread redash/tasks/queries.py
TableMetadata.data_source_id == ds.id,
).all()

for j, table in enumerate(all_existing_persisted_tables):
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No enumerate needed here.

Comment thread redash/tasks/queries.py
existing_columns_set = set()

# Clear the set for the next round
existing_columns_set.clear()
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clearing the set was the purpose of this right?

Copy link
Copy Markdown

@emtwo emtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jannis!

@emtwo emtwo merged commit 5d03b54 into master Apr 4, 2019
@jezdez jezdez deleted the schema-refresh-improvement-review branch April 15, 2019 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants