Closed
29 changes: 29 additions & 0 deletions caravel/bin/caravel
@@ -125,6 +125,35 @@ def refresh_druid():
            "[" + cluster.cluster_name + "]")
    session.commit()

@manager.option(
    '-p', '--prefix', default='data_',
Contributor

I'd make the prefix not required and empty by default so one can insert all the tables of a database if needed.

Contributor Author

I don't think that is a good idea, because some tables in the caravel db store user, permission, etc. data, and they don't need to be inserted into Caravel.

Contributor

Yeah, but 'data_' is completely arbitrary. If you don't want to read all the tables, fine; just don't add a default that would work only for you :)
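
To make the trade-off in this thread concrete, here is a minimal sketch of the filtering step with an empty default prefix. `select_tables` and the sample table names are hypothetical and not part of this PR; the sketch only assumes the `(database_name, table_name)` tuples that `synctable` builds.

```python
def select_tables(all_tables, existing_tables, prefix=""):
    """Return (db_name, table_name) pairs that match prefix and are not registered yet.

    Hypothetical helper: it mirrors the filter and set difference in synctable,
    but it is not code from this PR.
    """
    # str.startswith("") is True for every string, so an empty prefix keeps all tables.
    candidates = [row for row in all_tables if row[1].startswith(prefix)]
    # The set difference drops tables that are already registered;
    # sorted() only makes the output order stable.
    return sorted(set(candidates) - set(existing_tables))


# Made-up sample data for illustration.
all_tables = [("main", "data_sales"), ("main", "ab_user"), ("main", "data_visits")]
existing = [("main", "data_sales")]

print(select_tables(all_tables, existing))           # empty prefix: every unregistered table
print(select_tables(all_tables, existing, "data_"))  # explicit prefix: only data_ tables
```

With an empty default, the caller opts in to a prefix such as `data_` explicitly, which also covers the concern about internal user and permission tables.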

    help="Sync Table Prefix")
def synctable(prefix):
    ''' Sync DB Table with Caravel Table'''
    existing_tables = []
    for row in db.session.query(caravel.models.SqlaTable).all():
        existing_tables += [(row.database.database_name, row.name)]
    all_tables = []
    for caravel_db in db.session.query(caravel.models.Database).all():
        for table_name in caravel_db.get_sqla_engine().table_names():
            all_tables += [(caravel_db.database_name, table_name)]

    all_tables = [row for row in all_tables if row[1].startswith(prefix)]

    need_insert = list(set(all_tables) - set(existing_tables))

    for db_name, table_name in need_insert:
        tbl = caravel.models.SqlaTable(table_name=table_name)
        tbl.description = ""
        tbl.is_featured = False
        tbl.database = db.session.query(caravel.models.Database).filter_by(database_name=db_name).first()
Contributor

I think you can query this once, outside the loop.

Contributor Author

You mean #152?
It is not the same on each iteration of the loop.
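
One possible reading of the suggestion, sketched under assumptions: the database differs per iteration, but all `Database` rows could be fetched once and indexed by name, so the loop does a dictionary lookup instead of a query. `Database` and `fetch_all_databases` below are hypothetical stand-ins for `caravel.models.Database` and `db.session.query(caravel.models.Database).all()`; this is not code from the PR.

```python
from collections import namedtuple

# Hypothetical stand-in for caravel.models.Database rows.
Database = namedtuple("Database", ["id", "database_name"])


def fetch_all_databases():
    # Stand-in for db.session.query(caravel.models.Database).all()
    return [Database(1, "main"), Database(2, "warehouse")]


# Made-up sample of the (db_name, table_name) pairs the loop iterates over.
need_insert = [
    ("main", "data_sales"),
    ("warehouse", "data_orders"),
    ("main", "data_visits"),
]

# One query up front, indexed by database name.
databases_by_name = {d.database_name: d for d in fetch_all_databases()}

assignments = []
for db_name, table_name in need_insert:
    # Dictionary lookup replaces the per-iteration filter_by(...).first() query.
    database = databases_by_name[db_name]
    assignments.append((table_name, database.id))

print(assignments)
```

This keeps the per-table database assignment while issuing a single query, regardless of how many tables need inserting.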

        db.session.merge(tbl)
Contributor

db.session.add(tbl) should be enough

        db.session.commit()
        tbl.fetch_metadata()

        print("[{db}] {table} insert success.".format(db=db_name, table=table_name))

    print("[{}] Sync table complete.".format(len(need_insert)))

if __name__ == "__main__":
    manager.run()