Add kerberos dependency to Impala Provider #32304
Do we need this here? Can't users install it separately in their setup if they need it? Users who do not want Kerberos authentication and use other mechanisms like LDAP would get an additional dependency that may not be needed, no?
Yes, indeed, this will bother users who do not need it. My reasoning was the following: for Hadoop, the provider is shipped with kerberos. Thus, to stay consistent, it makes sense to bundle it for Impala too (which is set up on top of a Hadoop system).
Should I propose a PR to add kerberos as an optional dependency to hdfs and impala?
okay. @eladkal could you please help us here on what could be the appropriate way?
See example in "amazon/provider.yaml" (additional-extras)
additional-extras:
  - name: pandas
    dependencies:
      - pandas>=0.17.1
  # There is conflict between boto3 and aiobotocore dependency botocore.
  # TODO: We can remove it once boto3 and aiobotocore both have compatible botocore version or
  # boto3 have native async support and we move away from aio aiobotocore
  - name: aiobotocore
    dependencies:
      - aiobotocore[boto3]>=2.2.0
  - name: cncf.kubernetes
    dependencies:
      - apache-airflow-providers-cncf-kubernetes>=7.2.0
Edited the PR to make it optional. I'll open a PR to update the docs on airflow-website asap.
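For illustration, following the additional-extras pattern from amazon/provider.yaml, the optional dependency might be declared in the Impala provider's provider.yaml roughly as below. This is only a sketch: the extra name `kerberos` and the unpinned requirement are assumptions, not copied from the final diff.

```yaml
# Hypothetical fragment of the Impala provider's provider.yaml.
# The extra name and the lack of a version bound are assumptions.
additional-extras:
  - name: kerberos
    dependencies:
      - kerberos
```

With an extra declared this way, users who need GSSAPI would presumably opt in at install time (for example `pip install 'apache-airflow-providers-apache-impala[kerberos]'`), while LDAP-only setups would not pull in the package.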
104b8f4 to
6413aff
Compare
6413aff to
a5c6678
Compare
Hello,
Using the ImpalaHook with kerberos ("auth_mechanism": "GSSAPI") fails with the following error:
Solution
The kerberos module, an optional dependency of impyla, is not bundled with Airflow. (Only requests-kerberos, for hdfs, and pykerberos come by default with Airflow, if I'm not mistaken.)
This PR adds kerberos as a dependency, fixing the above error.
About license
The package is under the Apache 2.0 license (see the GitHub repo and PyPI).