Skip to content

Conversation

@piffall
Copy link
Contributor

@piffall piffall commented May 14, 2018

Make sure you have checked all steps below.

JIRA

Description

  • Add support for scale dataproc clusters:

Tests

  • Implementation of tests:

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

Code Quality

  • Passes git diff upstream/master -u -- "*.py" | flake8 --diff

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small remarks. Could you also add this to the documentation? Maybe provide an example. I like the idea of the operator.

@kaxil Can you take a look?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we poll the status instead of having a fixed sleep?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not. API has rate limits, and I see that all other dataproc operators are using 15s as sleep time. So I respect that previous decision.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d is not in the regex group, so this branch of the elif will never be invoked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixing this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move this to the init?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are wright! Fixing this! Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this one to a new line as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing this. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Can you change # Unit test for the DataprocClusterDeleteOperator to # Unit test for the DataprocClusterScaleOperatorTest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing this. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change it to The name of the cluster to scale

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing this. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update this to The region for the dataproc cluster as now it does support regions like europe-west

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing this. Thanks!

Copy link
Member

@kaxil kaxil May 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have a try.. except block for 404 errors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can only happen if the operation is not found, that should never happen, so I think it should raise. Do you that that would be better to add the try..catch block and raise the exception after logging it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would want to log it tbh.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error logged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason for selecting 15 seconds? or is it just the smallest time you think dataproc would take to scale up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in my experience it couldn't be done in less time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just add a link to https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scaling-clusters and say "For more information visit {LINK}".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link added.

@kaxil
Copy link
Member

kaxil commented May 14, 2018

Can you squash your commits?

Can you also fix the following flake8 error?

Running flake8 on the diff in the range 042c3f2a..8fa2eb3d (6 commit(s)):
--------------------------------------------------------------------------------
tests/contrib/operators/test_dataproc_operator.py:267:54: F841 local variable '_' is assigned to but never used
                with self.assertRaises(TypeError) as _:
                                                     ^
tests/contrib/operators/test_dataproc_operator.py:309:54: F841 local variable '_' is assigned to but never used
                with self.assertRaises(TypeError) as _:
                                                     ^
ERROR: InvocationError for command '/home/travis/build/apache/incubator-airflow/scripts/ci/flake8_diff.sh' (exited with code 1)

Thank you for adding this operator. This would be quite helpful.

@piffall
Copy link
Contributor Author

piffall commented May 14, 2018

Commits squashed and flake8 errors fixed. Waiting for test results.

@codecov-io
Copy link

codecov-io commented May 14, 2018

Codecov Report

Merging #3357 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3357   +/-   ##
=======================================
  Coverage   76.41%   76.41%           
=======================================
  Files         203      203           
  Lines       14993    14993           
=======================================
  Hits        11457    11457           
  Misses       3536     3536

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9236349...e6e6bfa. Read the comment docs.

@piffall
Copy link
Contributor Author

piffall commented May 14, 2018

I think that is all fixed. Thank you guys!

@piffall piffall closed this May 14, 2018
@piffall piffall reopened this May 14, 2018
@piffall piffall closed this May 14, 2018
@piffall piffall reopened this May 14, 2018
@kaxil
Copy link
Member

kaxil commented May 14, 2018

@piffall
Copy link
Contributor Author

piffall commented May 14, 2018

@kaxil Documentation added to code.rst and integration.rst files.

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@asfgit asfgit closed this in 9c915c1 May 14, 2018
asfgit pushed a commit that referenced this pull request May 14, 2018
Closes #3357 from piffall/master

(cherry picked from commit 9c915c1)
Signed-off-by: Kaxil Naik <kaxilnaik@gmail.com>
aliceabe pushed a commit to aliceabe/incubator-airflow that referenced this pull request Jan 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants