Skip to content

Conversation

@mik-laj
Copy link
Member

@mik-laj mik-laj commented Jul 16, 2020

I didn't use generic OpenID library because they are more difficult to use than Google library. On the other hand, we also need to implement the client in this module. Its implementation requires things that are Google Vendored, e.g. obtaining the Token ID using a metaserver.

Hopefully we will be able to come up with a more generic solution in the future, but this is a step forward. I would like to support authorizations with Keycloak.


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • [XI will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Jul 16, 2020
@mik-laj mik-laj marked this pull request as draft July 16, 2020 00:17
@mik-laj mik-laj force-pushed the api-google-openid branch from 87a566e to b51354e Compare July 17, 2020 13:08
@mik-laj mik-laj force-pushed the api-google-openid branch from cc21632 to a4b5fd9 Compare July 17, 2020 18:47
@mik-laj mik-laj changed the title Add Google Authorization for experimental API Add Google Authentication for experimental API Jul 17, 2020
@mik-laj mik-laj marked this pull request as ready for review July 17, 2020 22:24
@mik-laj mik-laj requested review from potiuk and turbaszek July 17, 2020 22:24

log = logging.getLogger(__name__)

_GOOGLE_ISSUERS = ["accounts.google.com", "https://accounts.google.com"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but would it make sense to use tuple here to assure immutability?

"""Initializes authentication."""


def _get_id_token_from_request(r):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware that pylint does not check arguments names, only variables 👀

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR.

except exceptions.GoogleAuthError:
return None

# This check is part of 1.19.0 (2020-07-09), In order not to create strong version requirements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what 1.19.0 is

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +208 to +215
if __name__ == "__main__":
from google.auth.transport import requests

request_adaapter = requests.Request()

creds = get_default_id_token_credentials(target_audience=None)
creds.refresh(request=request_adaapter)
print(creds.token)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a helpful code because the ADC flow is very complex and sometimes you may need to manually check the tokens. At the beginning of the file, I added documentation that explains how to use it.

@mik-laj mik-laj requested a review from turbaszek July 22, 2020 00:30
"""Base API client for all API clients."""

def __init__(self, api_base_url, auth):
def __init__(self, api_base_url, auth=None, session=None):
Copy link
Member

@turbaszek turbaszek Jul 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding type hints? There's ongoing effort to increase coverage #9708

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire module has no types. I'm afraid that if I add annotations in one place, I will have to make a lot of additional changes.
I can see that the IDE already shows me various errors, and then it can get even worse.
Screenshot 2020-07-22 at 10 56 42
After that, we should retire this API client with Airflow 2.0 and start using the OpenAPI based client.

Comment on lines +75 to +76
except json.JSONDecodeError:
raise exceptions.DefaultCredentialsError(f"File {filename} is not a valid json file.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should propagate some information from the original error? This may help but not sure how much

Copy link
Member Author

@mik-laj mik-laj Jul 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do it.
https://www.python.org/dev/peps/pep-3134/

import json
try:
    json.loads('{A}')
except:
    raise Exception("BBB")
Traceback (most recent call last):
  File "a.py", line 4, in <module>
    json.loads('{A}')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "a.py", line 6, in <module>
    raise Exception("BBB")
Exception: BBB

Did I miss something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about single exception instead of two - less traceback but that's just a suggestion

[api]
auth_backend = airflow.providers.google.common.auth_backend.google_openid

It is also highly recommended to configure an OAuth2 audience so that the generated used tokens can only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It is also highly recommended to configure an OAuth2 audience so that the generated used tokens can only
It is also highly recommended to configure an OAuth2 audience so that the generated tokens can only

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It is also highly recommended to configure an OAuth2 audience so that the generated used tokens can only
It is also highly recommended to configure an OAuth2 audience so that the generated tokens are restricted to use by Airflow only.

Copy link
Member

@turbaszek turbaszek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🐈

credential_type = info.get("type")

if credential_type == _AUTHORIZED_USER_TYPE:
current_credentials = oauth2_credentials.Credentials.from_authorized_user_info(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have this logic on the Airflow side? This looks like an issue for the Google library.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may try to move it to another library in the future, but this is only one module, so in my opinion, there is no strong need to do it immediately. We will still improve this code and when it is mature, we can move it to another library.

@mik-laj mik-laj merged commit 39a0288 into apache:master Jul 22, 2020
@mik-laj mik-laj deleted the api-google-openid branch July 22, 2020 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:API Airflow's REST/HTTP API

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants