AWS Glue Catalog for Iceberg ingest extension #17392
private Catalog setupGlueCatalog() {
  catalog = new GlueCatalog();
  catalogProperties.put(CatalogProperties.WAREHOUSE_LOCATION, warehousePath);
  catalog.initialize(CATALOG_NAME, catalogProperties);
Catalog properties must include these key-value pairs:
"type": "glue",
"catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
"io-impl": "org.apache.iceberg.aws.s3.S3FileIO"
The warehouse path must be of the form s3://bucket/path.
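To make the two review comments above concrete, here is a minimal sketch of building and checking the property map. The class name `GlueCatalogProps` and the method `build` are hypothetical and are not part of the PR. The `warehouse` key is the value of `CatalogProperties.WAREHOUSE_LOCATION` in Iceberg; the S3 prefix check is an assumption based on the comment about the warehouse path.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: collects the properties the
// reviewer listed and rejects a warehouse path that is not on S3.
final class GlueCatalogProps {
    static Map<String, String> build(String warehousePath) {
        // Assumption from the review comment: the path must be s3://bucket/path.
        if (warehousePath == null || !warehousePath.startsWith("s3://")) {
            throw new IllegalArgumentException(
                "warehouse path must look like s3://bucket/path, got: " + warehousePath);
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("type", "glue");
        props.put("catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog");
        props.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO");
        // "warehouse" is the string behind CatalogProperties.WAREHOUSE_LOCATION.
        props.put("warehouse", warehousePath);
        return props;
    }
}
```

A map built this way could then be passed to `catalog.initialize(CATALOG_NAME, catalogProperties)` as in the diff excerpt, assuming the Iceberg AWS jars are on the classpath.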
AWS-related environment variables must be available wherever the Druid cluster is running.
Re: "AWS-related environment variables must be available wherever the Druid cluster is running."
Could we add more information about this to the docs, specifically for the Glue catalog?
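As a rough illustration of the environment requirement discussed above, the sketch below reports which variables are unset. The thread does not say which variables are needed; `AWS_REGION`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY` are the standard names read by the AWS SDK and are used here as an assumption. The class `AwsEnvCheck` is hypothetical. The AWS SDK can also obtain credentials from other sources (for example profiles or instance roles), so a missing variable is not necessarily an error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check, not part of the PR.
final class AwsEnvCheck {
    // Assumed variable names; the PR thread does not enumerate them.
    static final List<String> EXPECTED =
        List.of("AWS_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY");

    // Returns the expected variables that are absent or blank in the given environment.
    static List<String> missing(Map<String, String> env) {
        List<String> out = new ArrayList<>();
        for (String name : EXPECTED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                out.add(name);
            }
        }
        return out;
    }
}
```

On a Druid node this could be called as `AwsEnvCheck.missing(System.getenv())` and the result logged before the catalog is initialized.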
Yes, I will do that. I recently figured out that there is a simpler approach in the Iceberg API itself for choosing the catalog. I am spending some time checking whether that would make this substantially more modular and let it work with every Iceberg catalog implementation on the fly.
While testing, I hit an error. Please let me know if anyone has seen a similar error message; it is related to IcebergInputSource from the Iceberg extension not being found as a subtype of input source.
@shekhar-rajak Thank you for working on this!
Thanks! I found that there was already After adding into the existing list. I am able to run it.
I realise the lib folder is not copying the jars from druid-iceberg-extension/lib, which are needed at runtime. When I copied those jars, GlueCatalog was detected and I was able to load the Iceberg table.
We need integration tests for the Glue catalog. That needs a separate discussion and test pipeline.
  <version>${iceberg.core.version}</version>
</dependency>
<!-- GlueCatalog class-->
<dependency>
@shekhar-rajak Catalog changes look good to me.
Updated the doc and PR as per the review comment.
a2l007 left a comment
LGTM, Thanks for the contribution @shekhar-rajak !
* iceberg glue catalog dependencies added
* GlueIcebergCatalog added in druid module
* default version of iceberg glue catalog implementation - basics
* basic tests added
* removed dependecy iceberg-aws-bundle
* glue catalog support - docs update for iceberg
* Update IcebergDruidModule.java
* Update IcebergDruidModule.java
* updates in dependencies and warehousePath must be under catalogProp
* removed some dependencies - which not required
* only glue sdk added
* update license
* avro exclusion removed
* doc update
* doc update
* set the type to glue
* minor change
* minor change
* fixing codestyle
* checkstyle fixes
* checkstyle fixes
* checkstyle fixes
* dependency check fixes
* update pom for ignore warning for glue catalog
* compile scope needed - iceberg-aws and awssdk
* updates pom with comment
* minor change
* mvn dependency check in iceberg extension
* revert pom.xml changes
* aws sdk sts and s3 for gluecatalog initialize
* dependency check - ignore aws sdk s3 and sts
---------
Co-authored-by: SHEKHAR PRASAD RAJAK <shekhar_rajak@apple.com>
Fixes #17352.
Description
Release note
Key changed/added classes in this PR
GlueIcebergCatalog
Note: Integration testing needs a separate discussion / changes.