-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Open search integrations #34693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open search integrations #34693
Conversation
…opensearchqueryoperator unit test
| class OpenSearchHook(AwsBaseHook): | ||
| """ | ||
| This Hook provides a thin wrapper around the OpenSearch client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AwsBaseHook/AwsGenericHook uses only for boto3 / botocore clients. Internals of this hook intend to use with this libraries and may work incorrectly if someone call public methods of AwsBaseHook/AwsGenericHook
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense then to create a base OpenSearch hook class in an OpenSearch providers package and use that as the base class for an AWS based Hook for OpenSearch? Most of the logic in this could be moved to a base hook for Open Search and then have the AWS Specific one just override the auth method used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it how it works in general. There is some exceptions but most (if not all) of them created years ago. there is also exists hooks in Amazon Provider which not depends on AwsBaseHook, example:
AwsBaseHook it wrapper around one of the boto3 (botocore) clients, e.g.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would there be a good reason to have any operators that do things like create domains in a DAG? With the above that seems like where an open search hook that is a sub class of AwsBaseHook would come in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think it might be useful then I don't see any obstacles for add wrappers around this boto3 clients. For hook it is pretty simple, just create simple thin wrapper
airflow/airflow/providers/amazon/aws/hooks/sqs.py
Lines 23 to 40 in bd2f156
| class SqsHook(AwsBaseHook): | |
| """ | |
| Interact with Amazon Simple Queue Service. | |
| Provide thin wrapper around :external+boto3:py:class:`boto3.client("sqs") <SQS.Client>`. | |
| Additional arguments (such as ``aws_conn_id``) may be specified and | |
| are passed down to the underlying AwsBaseHook. | |
| .. seealso:: | |
| - :class:`airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook` | |
| """ | |
| def __init__(self, *args, **kwargs) -> None: | |
| kwargs["client_type"] = "sqs" | |
| super().__init__(*args, **kwargs) | |
And optionally add useful methods
| conn_type = "opensearch" | ||
| hook_name = "AWS Open Search Hook" | ||
|
|
||
| def __init__(self, *args: Any, open_search_conn_id: str, log_query: bool, **kwargs: Any): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def __init__(self, *args: Any, open_search_conn_id: str, log_query: bool, **kwargs: Any): | |
| def __init__(self, *, open_search_conn_id: str, log_query: bool, **kwargs: Any): |
I would recommend to accept in hook constructor keyword only arguments, positional arguments prevent to do some changes in the future without breaking changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should create support of OpenSearch (as product) in a separate provider. Otherwise this feature would be exclusively available only for AWS.
And after that we could add support of AWS OpenSearch (as managed service) into Amazon provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working now on creating a providers package for OpenSearch and moving the main Hook logic there with the hook for AWS Open Search taking that as a base class and only doing overrides of the AWS Open Search specific requirements such as the auth and the extras requiring a region_name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets have seperate PR for open search integration and depend PR for AWS related code on top of it.
It also requires to follow protocol of adding new provider
https://github.com/apache/airflow/blob/main/PROVIDERS.rst#accepting-new-community-providers in terms of mailing list thread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay let me revert the changes that added the new open search provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes are reverted. @eladkal @Taragolis What would be your preference on the approach for the new provider, I can get everything onto a new branch and cut a different PR and then mail the dev list for lazy consensus given open search is an apache licensed product, or do we need a full discussion and vote?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think a discussion is needed it seems as it fits the criteria for lazy consensus vote as popular vendor neutral open source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Draft PR #34705
| "mypy-boto3-appflow>=1.24.0", | ||
| "mypy-boto3-rds>=1.24.0", | ||
| "mypy-boto3-redshift-data>=1.24.0", | ||
| "mypy-boto3-s3>=1.24.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy packages moved out of the amazon-provider
| "mypy-boto3-appflow>=1.24.0", | |
| "mypy-boto3-rds>=1.24.0", | |
| "mypy-boto3-redshift-data>=1.24.0", | |
| "mypy-boto3-s3>=1.24.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is logo of OpenSearch as a product, logo of Amazon OpenSearch Service (ElasticSearch and OpenSearch)

you could take it from https://aws.amazon.com/architecture/icons/
e595762 to
851c3ab
Compare
| @@ -0,0 +1,189 @@ | |||
| # | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite related to this comment thread but are these operators specific to the AWS managed service? Should not these operators be in the opensearch provider?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is on hold right now until the other PR is ready. Once that is ready I will be updating this PR with pieces in place that handle the separate provider and really what my thought process here is going to be is that the particular Operators for the managed service just need to be sub classes of the Open Search provider ones and replace the hook with the one required for the managed service piece. I am open to other ideas here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like very good to me 👍
| from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook | ||
|
|
||
|
|
||
| class OpenSearchHook(AwsBaseHook): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this hook needs to be modified after #34705 is merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am working on that now, everything in the amazon provider will have a prefix of Aws. And this hook will be a child of the open search hook and potentially the AwsBaseHook as well to avoid rewriting the logic for getting a boto3 client.
|
Closing this PR in favor of actually just adding the auth required in the OpenSearch provider to simplify dependency management and code management. I will open a new PR in the coming days for adding the Auth piece to the OpenSearch provider |
This draft is to reduce the overall amount of code to have to review the first time and to get some eyes on an issue with the providers.yaml pre-commit check error. The intention here is a base for OpenSearch functionality. The Operators allow the use of either a high level or low level version of Open Search clients. The logo that you see is likely going to be replaced as it needs to be resized. Additional Unit Tests will be coming in a second revision.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.