Skip to content

Conversation

@MaksYermak
Copy link
Contributor

Create operators for working with Auto ML for Vertex AI service. Includes operators, hooks, example dags, tests and docs.

Co-authored-by: Wojciech Januszek januszek@google.com
Co-authored-by: Lukasz Wyszomirski wyszomirski@google.com
Co-authored-by: Maksim Yermakou maksimy@google.com


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@boring-cyborg boring-cyborg bot added area:providers kind:documentation provider:google Google (including GCP) related issues labels Feb 9, 2022
@MaksYermak MaksYermak changed the title Auto ml vertex ai operators Create Auto ML operators for Vertex AI service Feb 9, 2022
@MaksYermak MaksYermak force-pushed the auto-ml-vertex-ai-operators branch 2 times, most recently from 4a541c5 to 809af22 Compare February 11, 2022 15:59
Comment on lines 43 to 60
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as in #21267 (comment) , let's try to be consistent

Comment on lines +87 to +57
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that some time ago Google recommended to avoid such interfaces because there was a lot of issues like "can we add foo_bar argument to operator XYZ". Instead it was suggested that an operator should accept a "body" - an object that is accepted by the underlying API.

CC @potiuk @mik-laj @kosteev

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with both approaches as long as Google want to maintain it ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaksYermak Is this consistent with approach that we have in other google operators?
I think it should be good as long as we consistent everywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kosteev Yes it is consistent. We have the same approach in other google operators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx.

Copy link
Member

@turbaszek turbaszek Feb 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaksYermak @kosteev it's a matter of definition: For example Google AutoML operators accept model argument instead of particular elements of model as we do here.

Some time ago we had a move that resulted in refactoring Dataproc and BigQuery (as well others) operators to follow "single input" approach because it was considered more generic and easier to follow. Also it is more inline with Google change between approach in v1 and v2 clients: https://googleapis.dev/python/dataproc/latest/UPGRADING.html#method-calls

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine to go with it. this looks like a nicely generated piece of code, doesn't it?

@potiuk
Copy link
Member

potiuk commented Feb 15, 2022

Looks like there are some import errors though :(

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Feb 15, 2022
@github-actions
Copy link

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

Copy link
Member

@turbaszek turbaszek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see this comment, let's try to be consistent 👌

Copy link
Member

@turbaszek turbaszek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see this comment, let's try to be consistent 👌

@MaksYermak MaksYermak force-pushed the auto-ml-vertex-ai-operators branch from 809af22 to 1f0ac36 Compare February 16, 2022 13:30
@MaksYermak
Copy link
Contributor Author

Please see this comment, let's try to be consistent 👌

@potiuk @turbaszek I have updated links for Vertex AI with new approach. One more thing I with @lwyszomi and @wojsamjan have decided to create a separate package for links. @potiuk @turbaszek what do you think about it?

@MaksYermak MaksYermak force-pushed the auto-ml-vertex-ai-operators branch from 1f0ac36 to ecdfa1d Compare February 17, 2022 09:37
@MaksYermak MaksYermak force-pushed the auto-ml-vertex-ai-operators branch from ecdfa1d to 3e8184f Compare February 17, 2022 14:57
should have a transformation. If an input column has no transformations on it, such a column is
ignored by the training, except for the targetColumn, which should have no transformations
defined on. Only one of column_transformations or column_specs should be passed. Consider using
column_specs as column_transformations will be deprecated eventually.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaksYermak can you please clarify this deprecation eventually?
This is the first time we added this operator why do we warn about deprecation (and if there is something that user needs to be aware of why do we warn only in docstring?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eladkal thank you for your comment. It is my mistake I haven´t noticed that this parameter is deprecated in the google-cloud-aiplatform library. I will create PR with deprecation warning for this parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eladkal I have added deprecated warning for this parameter in this PR #24467

@potiuk potiuk added this to the Airflow 2.3.1 milestone May 1, 2022
@ephraimbuddy ephraimbuddy removed this from the Airflow 2.3.1 milestone May 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers kind:documentation okay to merge It's ok to merge this PR as it does not require more tests provider:google Google (including GCP) related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants