Skip to content

Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer#936

Merged
emtwo merged 4 commits intomasterfrom
emtwo/schema_ui_fixes
Apr 11, 2019
Merged

Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer#936
emtwo merged 4 commits intomasterfrom
emtwo/schema_ui_fixes

Conversation

@emtwo
Copy link
Copy Markdown

@emtwo emtwo commented Apr 8, 2019

These are a couple of small UI tweaks related to schema browser changes.

Edit: I've added some more changes.

  1. get_schema() is now faster with only two postgres queries
  2. Column metadata is available when Athena uses glue
  3. Fixing a bug in the metadata processing

@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from d98c01b to 5fd1c54 Compare April 9, 2019 18:07
@emtwo emtwo requested review from jezdez and washort April 10, 2019 14:59
@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from 5fd1c54 to fbf7ce3 Compare April 10, 2019 17:44
Copy link
Copy Markdown

@washort washort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

Comment thread redash/models/__init__.py Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be

Suggested change
if column.exists == False:
if not column.exists:

Comment thread redash/models/__init__.py Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use .setdefault(column.table_id, {}).append({...}) here

@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from fbf7ce3 to 6b369e2 Compare April 10, 2019 19:44
@emtwo emtwo force-pushed the emtwo/schema_ui_fixes branch from 6b369e2 to 42f691a Compare April 11, 2019 01:42
@emtwo emtwo merged this pull request into master Apr 11, 2019
emtwo pushed a commit that referenced this pull request Apr 11, 2019
…ty example column in schema drawer (#936)

* Closes #934, #935: Remove type from schema browser and don't show empty example column

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.
emtwo pushed a commit that referenced this pull request Apr 11, 2019
…ty example column in schema drawer (#936)

* Closes #934, #935: Remove type from schema browser and don't show empty example column

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.
@jezdez jezdez deleted the emtwo/schema_ui_fixes branch April 15, 2019 14:44
jezdez pushed a commit that referenced this pull request May 13, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.
@jezdez jezdez mentioned this pull request May 13, 2019
2 tasks
jezdez pushed a commit that referenced this pull request May 16, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
washort pushed a commit that referenced this pull request Jun 10, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
jezdez pushed a commit that referenced this pull request Jun 13, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
washort pushed a commit that referenced this pull request Jun 27, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>
washort pushed a commit that referenced this pull request Jun 28, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
jezdez pushed a commit that referenced this pull request Aug 19, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
washort pushed a commit that referenced this pull request Sep 16, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
washort pushed a commit that referenced this pull request Sep 17, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
emtwo pushed a commit that referenced this pull request Nov 5, 2019
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Co-authored-by: Alison <github@bankofknowledge.net>

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.
jezdez pushed a commit that referenced this pull request Jan 22, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Co-authored-by: Alison <github@bankofknowledge.net>
jezdez pushed a commit that referenced this pull request Feb 5, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Co-authored-by: Alison <github@bankofknowledge.net>
jezdez added a commit that referenced this pull request May 4, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
jezdez added a commit that referenced this pull request May 14, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
robhudson pushed a commit that referenced this pull request Jun 11, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Remove old migrations.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
emtwo pushed a commit that referenced this pull request Jul 15, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Remove old migrations.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
jezdez added a commit that referenced this pull request Oct 15, 2020
* Process extra column metadata for a few sql-based data sources.

* Add Table and Column metadata tables.

* Periodically update table and column schema tables in a celery task.

* Fetching schema returns data from table and column metadata tables.

* Add tests for backend changes.

* Front-end shows extra table metadata and uses new schema response.

* Delete datasource schema data when deleting a data source.

* Process and store data source schema when a data source is first created or after a migration.

* Tables should have a unique name per datasource.

* Addressing review comments.

* Update migration file for mixins.

* Appease PEP8

* Upgrade migration file for rebase.

* Cascade delete.

* Adding org_id

* Remove redundant column and table prefixes.

* Non-existing tables and columns should be filtered out on the server side not client side.

* Fetching table samples should be optional and should happen in a separate task per table.

* Allow users to force a schema refresh.

* Use updated_at to help prune old schema metadata periodically.

* Using settings.SCHEMAS_REFRESH_QUEUE

* fix for getredash#2426 test

* more stable test_interactive_new

* Closes #927, #928: Schema refresh improvements.

* Closes #934, #935: Remove type from schema browser and don't show empty example column in schema drawer (#936)

* Speed up schema fetch requests with fewer postgres queries.

* Add column metadata to Athena glue processing.

* Fix bug assuming 'metadata' exists for every table.

* Closes #939: Persisted, existing table metadata should be updated.

* Sample processing should be rate-limited.

* Add cli command for refreshing data samples.

* Schema refreshes should not overwrite column 'example' field.

* refresh_samples() should filter tables_to_sample on the datasource's id being sampled

* Correctly wrap long text in schema drawer.

Schema Improvements Part 2: Add data source config options.

Adding BigQuery schema drawer with data types and samples.

Add empty migration to replace the removed schedule_until migration

Add merge migration.

Fix spacing issue with data scanned value in query execution metadata.

Increase schema refresh timeout.

Remove old migrations.

Co-authored-by: Alison <github@bankofknowledge.net>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants