PoC: Implement Docker Registry Auth Protocol #33602
0234992 to
5a225c9
Compare
|
THAT one looks pretty complete, not a PoC :D. I did not have time to look at the details, but it sounds cool, and @o-nikolas, @vandonr-amz and team should likely take a look :) |
5a225c9 to
2939b43
Compare
Well, for me this implementation still sits somewhere between a dirty hack and something missing in the Connection+Hook integration. (Option 1) Just subclass the existing DockerHook, DockerOperator, docker decorator, and DockerSwarmOperator. This requires supporting all of them independently, updating the decorator, and tracking whether changes in the parent classes break the child classes. (Option 2) Create something like
|
|
I think trying to come up with anything "common" for all potential "docker-like" operators is unnecessary effort. Like all attempts to make things DRY and reusable, it also introduces coupling (and that's really what you refer to as "needs changes for Airflow 2 and back-compatibility"). This is a "middle" solution, as you say: it does not integrate with Connection as deeply as it could, but IMHO it's good. It's OK that the implementation of the protocol is tied to "docker-only", and it's OK that a particular implementation (AWS in this case) maps, in code, the generic "Airflow connection" credentials to those of your DockerAuth. This is a very elegant solution where a small piece of "service-specific code" joins the two sides. Airflow is all about "platform as code", and we should do more of this: take the "Airflow realm" (hooks, connections, etc.) and write small-ish snippets of code that make it work with "other realms" (such as Docker authentication). Trying to merge them is not really efficient, because we have many such worlds to connect. So in a way, what we have in Connections/Hooks is the Least Common Denominator: not perfect and not super flexible, but at the same time you can easily write a piece of code like yours that makes the information stored in Connections usable for building more complex authentication/refreshing schemes (like what you do). |
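The "small piece of service-specific code" idea can be illustrated with a minimal sketch. Every name here (`RegistryCredentials`, `RegistryAuth`, `BasicAuth`, `TokenExchangeAuth`, `login_kwargs`) is hypothetical and is not taken from this PR or from the Docker provider. A plain dict stands in for an Airflow Connection, and the token exchange is an injected callable rather than a real cloud SDK call.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Protocol


@dataclass(frozen=True)
class RegistryCredentials:
    """Credentials in the shape a registry login needs (hypothetical)."""

    username: str
    password: str
    registry: str


class RegistryAuth(Protocol):
    """Hypothetical protocol: anything that can produce registry credentials."""

    def get_credentials(self, conn: Mapping[str, str]) -> RegistryCredentials: ...


class BasicAuth:
    """Generic case: the stored login/password are used as they are."""

    def get_credentials(self, conn: Mapping[str, str]) -> RegistryCredentials:
        return RegistryCredentials(conn["login"], conn["password"], conn["host"])


class TokenExchangeAuth:
    """Service-specific case: swap stored credentials for a short-lived token.

    `fetch_token` is an assumption standing in for a vendor SDK call; it is
    injected so that the sketch runs without any network access.
    """

    def __init__(self, fetch_token: Callable[[str, str], str], username: str = "token-user"):
        self._fetch_token = fetch_token
        self._username = username

    def get_credentials(self, conn: Mapping[str, str]) -> RegistryCredentials:
        token = self._fetch_token(conn["login"], conn["password"])
        return RegistryCredentials(self._username, token, conn["host"])


def login_kwargs(conn: Mapping[str, str], auth: RegistryAuth) -> dict:
    """The docker side only sees the protocol, never the service details."""
    creds = auth.get_credentials(conn)
    return {
        "username": creds.username,
        "password": creds.password,
        "registry": creds.registry,
    }
```

The point of the design is that the docker-facing function depends only on the protocol, so a new credential source is one more small class rather than a change to the hook or operators.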
|
I'm still not sure that the hook should be the part responsible for the connection, but we have what we have. If we go through the community providers, we can easily find that many of them implement their own unique way of integrating with Connection and of resolving/merging parameters. Let me show how I would see a simple integration, where the parameters are not lost between operator and hook, are well documented, can be validated before use (right now we validate only in the UI), and can also be used to build the UI and auto-generated documentation.

    from functools import cached_property

    from airflow.models import BaseOperator
    from pydantic import BaseModel, Field

    # awesome_common_merger and awesome_client are placeholders, and the
    # airflow_* keyword arguments below are hypothetical markers rather than
    # existing pydantic or Airflow options.


    class AwesomeConnection(BaseModel):
        """Awesome Connection"""

        class Config:
            ...

        connection_id: str = Field(
            title="Name",
            description="The name value of insight",
        )
        login: str = Field(
            title="Awesome Login",
            description="Description to Awesome ID",
            airflow_connection_only=True,
        )
        password: str = Field(
            title="Awesome Password",
            description=(
                "Description to Awesome Password, e.g. "
                "where it could be obtained in case cloud services"
            ),
            airflow_sensitive=True,
            airflow_connection_only=True,
        )
        hostname: str = Field(
            title="Awesome Hostname",
            description="Foo bar spam egg",
        )
        code_only_parameter: str = Field(
            title="Awesome Parameter",
            description="Foo bar spam egg",
            airflow_user_code_only=True,
        )


    class AwesomeHook:
        def __init__(self, *, connection: AwesomeConnection):
            self._connection = connection

        @cached_property
        def client(self):
            # Two alternative ways to resolve the final values; either would do.
            resolved_connection_info = awesome_common_merger(self._connection)
            # or
            resolved_connection_info = self._connection.merge_with(conn_id=self._connection.connection_id)
            return awesome_client(
                login=resolved_connection_info.login,
                password=resolved_connection_info.password,
                hostname=resolved_connection_info.hostname,
                code_only_parameter=resolved_connection_info.code_only_parameter,
            )

        def do_something_awesome(self, param1, param2):
            return self.client.sudo_rm_rf_root_dir(param1, param2, preserve_root=False)

        def do_something(self, param1, param3):
            return self.client.sleep(param1, param3)


    class AwesomeOperatorDoSomethingAwesome(BaseOperator):
        def __init__(self, *, connection: AwesomeConnection, param1, param2, **kwargs):
            super().__init__(**kwargs)
            self._connection = connection
            self.param1 = param1
            self.param2 = param2

        @cached_property
        def hook(self):
            return AwesomeHook(connection=self._connection)

        def execute(self, context):
            return self.hook.do_something_awesome(self.param1, self.param2)


    class AwesomeOperatorDoSomething(BaseOperator):
        def __init__(self, *, connection: AwesomeConnection, param1, param3, **kwargs):
            super().__init__(**kwargs)
            self._connection = connection
            self.param1 = param1
            self.param3 = param3

        @cached_property
        def hook(self):
            return AwesomeHook(connection=self._connection)

        def execute(self, context):
            return self.hook.do_something(self.param1, self.param3)

This primitive solution wouldn't work well with something abstract like HttpHook. And I can't find a way to start moving from "put whatever you want into the connection and hook, and you don't even have to extend the documentation" to something partially strict, which gives no chance to use something unsafe from the connection and restricts providing sensitive information (tokens, passwords, etc.) directly through user code. Maybe some intermediate solution exists that doesn't break everything for everyone. |
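The "partially strict" merging described above could look roughly like the following sketch. It is an assumption-laden illustration, not the proposal itself: it uses stdlib dataclasses instead of pydantic, the `connection_only` and `user_code_only` metadata flags are hypothetical, and `merge_with` takes the stored values as a plain mapping rather than looking them up by `conn_id`.

```python
from dataclasses import dataclass, field, fields
from typing import Mapping, Optional


@dataclass(frozen=True)
class AwesomeConnection:
    connection_id: str
    hostname: Optional[str] = None
    # Hypothetical flag: the value may only come from the stored connection.
    login: Optional[str] = field(default=None, metadata={"connection_only": True})
    password: Optional[str] = field(default=None, metadata={"connection_only": True})
    # Hypothetical flag: the value may only come from user code.
    code_only_parameter: Optional[str] = field(
        default=None, metadata={"user_code_only": True}
    )

    def __post_init__(self):
        # Reject secrets passed directly in user (DAG) code.
        for f in fields(self):
            if f.metadata.get("connection_only") and getattr(self, f.name) is not None:
                raise ValueError(f"{f.name!r} must come from the stored connection")

    def merge_with(self, stored: Mapping[str, Optional[str]]) -> dict:
        """Combine values set in code with values from the stored connection."""
        resolved = {}
        for f in fields(self):
            code_value = getattr(self, f.name)
            if f.metadata.get("user_code_only"):
                # Never read from storage, even if a value is present there.
                resolved[f.name] = code_value
            elif f.metadata.get("connection_only"):
                resolved[f.name] = stored.get(f.name)
            else:
                # Ordinary field: code overrides storage when it is set.
                resolved[f.name] = (
                    code_value if code_value is not None else stored.get(f.name)
                )
        return resolved
```

With this shape, `AwesomeConnection(connection_id="c", password="leak")` fails at construction time, while a hostname given in code simply overrides the stored one. Whether such rules belong in the connection model or in the hook is exactly the open question raised in the comment.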
|
I still don't get why |
It does not expect docker because previously |
|
(And yes, this test should probably use an artificial rather than the real "generated/provider_dependencies.json". I will fix it shortly; it causes some unnecessary friction when you add implicit dependencies between providers that weren't there before.) |
|
I will do it after we merge this PR. |
I would rather say
If this solution is accepted, we might finally be able to remove the exclusive implementation of AWS IAM authentication in PostgresHook and MySQLHook in the same way. |
2939b43 to
a9864e8
Compare
a9864e8 to
3d8abce
Compare
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |


This is a proof of concept that adds the ability to provide additional credential methods to DockerHook and the related operators/decorators. It is a kind of continuation of #26162.
ToDo: