Skip to content

Conversation

@henrib
Copy link
Contributor

@henrib henrib commented Mar 20, 2024

What changes were proposed in this pull request?

This is a basic implementation of the Iceberg REST Catalog as a service embedded within HMS.

Why are the changes needed?

For Hive users that have started using Iceberg to store data and want to expose those same tables to external query engines, implementing the Iceberg REST catalog embedded within HMS allows an easy upgrade and deployment path.
Although a dedicated service to host the REST catalog is certainly desirable, it also implies deploying and managing another service.

Does this PR introduce any user-facing change?

It does since HMS then 'hosts' a servlet that implements the Iceberg REST Catalog API.

Is the change a dependency upgrade?

No.

How was this patch tested?

Unit tests.

Henri Biestro and others added 25 commits November 13, 2023 17:43
- add CatalogClass conf to instantiate a HiveCatalog (thrift client);
- added metrics;
- add CatalogClass conf to instantiate a HiveCatalog (thrift client);
- added metrics;
- Introduce HiveActor abstract class that captures all methods needed by the (Iceberg) catalog to interact with Hive (HMS);
- Implement thrift-client (HiveCatalogActor) and embedded client (HMSActor);
- Removed property specifying catalog implementation;
- Added property specifying HiveActor implementation class (hive.iceberg.catalog.actor.class);
- Refactored tests accordingly;
@henrib henrib changed the title Iceberg REST Catalog HIVE-28059: Iceberg REST Catalog Mar 20, 2024
Henri Biestro added 2 commits March 20, 2024 23:29
- Introduce HiveActor abstract class that captures all methods needed by the (Iceberg) catalog to interact with Hive (HMS);
- Implement thrift-client (HiveCatalogActor) and embedded client (HMSActor);
- Removed property specifying catalog implementation;
- Added property specifying HiveActor implementation class (hive.iceberg.catalog.actor.class);
- Refactored tests accordingly;
- Introduce HiveActor abstract class that captures all methods needed by the (Iceberg) catalog to interact with Hive (HMS);
- Implement thrift-client (HiveCatalogActor) and embedded client (HMSActor);
- Removed property specifying catalog implementation;
- Added property specifying HiveActor implementation class (hive.iceberg.catalog.actor.class);
- Refactored tests accordingly;
"hive.metastore.catalog.servlet.path", "icecli",
"Catalog servlet path component of URL endpoint."
),
CATALOG_SERVLET_PORT("hive.metastore.catalog.servlet.port",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use more appropriate name here. I could not find out neither from the name, nor from the description this property is for Iceberg Rest Server.

"hive.metastore.properties.namespaces", "",
"Property-maps namespace map; a JSON map where keys are namespace names, values namespace" +
"class and/or static factory method"),
CATALOG_SERVLET_PATH("hive.metastore.catalog.servlet.path",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use more appropriate name here. I could not find out neither from the name, nor from the description this property is for Iceberg Rest Server

" 0 will let the system determine the catalog server port," +
" positive value will be used as-is."
),
CATALOG_SERVLET_AUTH("hive.metastore.catalog.servlet.auth",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use more appropriate name here. I could not find out neither from the name, nor from the description this property is for Iceberg Rest Server

import org.slf4j.LoggerFactory;


public class HMSCatalogServer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think HMSRESTCatalogServer would be better.

* The RESTCatalogServlet provides a servlet implementation used in combination with a
* RESTCatalogAdaptor to proxy the REST Spec to any Catalog implementation.
*/
public class HMSCatalogServlet extends HttpServlet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think HMSRESTCatalogServlet would be better.

* Original @ https://github.com/apache/iceberg/blob/main/core/src/test/java/org/apache/iceberg/rest/RESTCatalogAdapter.java
* Adaptor class to translate REST requests into {@link Catalog} API calls.
*/
public class HMSCatalogAdapter implements RESTClient {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think HMSRESTCatalogAdapter would be better.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HMSCatalogActor implements HiveActor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think HMSRESTCatalogActor would be better.

Copy link
Contributor

@zratkai zratkai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check my comments!

@henrib
Copy link
Contributor Author

henrib commented Dec 16, 2024

Please check my comments!

Will do :-)

@henrib henrib marked this pull request as ready for review January 10, 2025 14:46
@henrib henrib marked this pull request as draft January 10, 2025 14:47
@henrib henrib closed this Jan 10, 2025
@henrib henrib deleted the HMS-Catalog branch January 10, 2025 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants