[BEAM-3444] Fix python3 flake8 errors e999 #4376

holdenk · 2018-01-10T00:03:13Z

Fix the E999 errors reported by flake8 under Python 3, and attempt to add Python 3 flake8 to our linting, since we've had some new issues pop up which would have been caught by this.

holdenk · 2018-01-19T03:37:35Z

Ping @aaltay @ehudm @robertwb

aaltay · 2018-01-19T17:55:31Z

sdks/python/apache_beam/examples/complete/tfidf.py

@@ -151,10 +151,14 @@ def compute_term_frequency(uri_count_and_total):
    # receives the value we listed after the lambda in Map(). Additional side


This comment block is referring to the removed lambda.

aaltay

Thank you @holdenk!

aaltay · 2018-01-19T17:55:50Z

sdks/python/apache_beam/examples/complete/tfidf.py

    # receives the value we listed after the lambda in Map(). Additional side
    # inputs (and ordinary Python values, too) can be provided to MapFns and
    # DoFns in this way.
+    def div_word_count_by_total(word_counts, total):


Rename word_counts to word_count?

aaltay · 2018-01-19T17:59:01Z

sdks/python/apache_beam/runners/portability/fn_api_runner_test.py


    except:
-      print res._metrics_by_stage
+      print(res._metrics_by_stage)


let's use logging. logging.info perhaps?

I'm not sure. Since this is done in a test, I think printing might actually be the right behaviour.

Yes, printing is fine. It's just so that if this test fails, there is some context.

aaltay · 2018-01-19T18:43:07Z

sdks/python/apache_beam/runners/worker/data_plane_test.py

@@ -50,7 +51,7 @@ def call_fn():
      thread.join(timeout_secs)
      if exc_info:
        t, v, tb = exc_info  # pylint: disable=unbalanced-tuple-unpacking


Do we still need disable=unbalanced-tuple-unpacking? We do not have it in data_plane.py:186.

I'm not sure; either way it's probably unrelated to the flake8 changes. I think doing a follow-up sweep of the linter disable statements could be a good starter task for someone later?

aaltay · 2018-01-19T18:44:27Z

sdks/python/apache_beam/transforms/util.py

    odd_one_out = [sorted_data[-1]] if len(sorted_data) % 2 == 1 else []
    # Sort the pairs by how different they are.
+
+    def div_keys(kv1_kv_2):


Rename kv1_kv_2 to kv1_kv2.

aaltay · 2018-01-19T18:50:18Z

sdks/python/apache_beam/typehints/typecheck.py

      error_msg = ('Runtime type violation detected within ParDo(%s): '
                   '%s' % (self.full_label, e))
-      raise TypeCheckError, error_msg, sys.exc_info()[2]
+      raise_with_traceback(TypeCheckError(error_msg), sys.exc_info()[2])


Can we use from future.utils import raise_ similar to other changes? (Like raise_(TypeCheckError, error_msg, sys.exc_info()[2]))

I mean we can, but since we're constructing a new one this is maybe nicer? raise_ is a little ugly imho. But I'm happy to use it in more places if we need to.

+1 to raise_with_traceback.
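
For readers comparing the two spellings: Python 3 removed the three-argument `raise`, and the traceback is attached to the exception instance instead. Below is a minimal Python 3-only sketch of that pattern using only the standard library. It is not the Beam code: `TypeCheckError` is a stand-in class, and `failing_user_code`, `check` and `frames_of` are hypothetical helpers for illustration.

```python
import sys
import traceback


class TypeCheckError(Exception):
    """Stand-in for the Beam exception of the same name (assumption)."""


def failing_user_code():
    raise ValueError("bad element")


def check():
    try:
        failing_user_code()
    except ValueError as e:
        error_msg = "Runtime type violation detected: %s" % e
        # Python 2 spelling: raise TypeCheckError, error_msg, sys.exc_info()[2]
        # Python 3 spelling: attach the traceback to the new exception instance.
        raise TypeCheckError(error_msg).with_traceback(sys.exc_info()[2])


def frames_of(exc):
    """Return the function names recorded in an exception's traceback."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


if __name__ == "__main__":
    try:
        check()
    except TypeCheckError as exc:
        # The frame of the original failure is still visible.
        print(frames_of(exc))
```

The point of the sketch is only that the new exception keeps the frames of the original failure, which is what the compatibility helpers discussed here are meant to preserve on both Python versions.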

aaltay · 2018-01-19T18:55:15Z

sdks/python/apache_beam/utils/retry.py

            except StopIteration:
              # Re-raise the original exception since we finished the retries.
-              raise exn, None, exn_traceback  # pylint: disable=raising-bad-type
+              raise_with_traceback(exn, exn_traceback)


Should we use raise_?

since we don't need to be explicit about the msg I think this is good.

aaltay · 2018-01-19T19:59:19Z

sdks/python/run_mini_py3lint.sh

+fi
+
+echo "Running flake8 for module $MODULE:"
+flake8 $MODULE --count --select=E999 --show-source --statistics


Why do we need a separate file? I see that we are running the same thing in run_pylint.sh already.

So, many of the current linters don't pass when run in a Py3 env. We can copy them over one at a time.

Sounds good.

Optional: We don't need mini in the name.

aaltay · 2018-01-19T21:01:34Z

sdks/python/run_pylint.sh

  "avroio_test.py"
  "datastore_wordcount.py"
  "datastoreio_test.py"
+  "hadoopfilesystem.py"


Why do we need exceptions for a growing list of files?

@ehudm specifically for this file.

Just saw this (my username is udim). Was wondering why this file was added to this list?

aaltay · 2018-01-19T21:02:33Z

sdks/python/tox.ini

+whitelist_externals=time
+commands =
+  time pip install -e .[test]
+  time {toxinidir}/run_mini_py3lint.sh


Do we really need two lint environments? We run this command in the single lint environment.

We need a new one for Python3 linting

To be clear, flake8 only catches the E999 issues when run in Python3.
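
A rough illustration of why the interpreter matters: E999 is flake8's code for a file that fails to parse, and parsing follows the grammar of the interpreter flake8 runs under. The sketch below uses the built-in `compile()` rather than flake8 itself. The `print` and `raise` snippets are taken from the diffs in this PR; the tuple-parameter form of `div_keys` is an assumed Python 2 spelling, and `parses` is a hypothetical helper.

```python
def parses(source):
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# Valid on Python 2 only; each one is a syntax error on Python 3.
PY2_ONLY = [
    "print res._metrics_by_stage",
    "raise TypeCheckError, error_msg, tb",
    "def div_keys((kv1, kv2)): pass",  # assumed Python 2 form
]

# Spellings that parse on both Python 2 and Python 3.
PORTABLE = [
    "print(res._metrics_by_stage)",
    "def div_keys(kv1_kv2):\n    kv1, kv2 = kv1_kv2",
]

if __name__ == "__main__":
    for snippet in PY2_ONLY + PORTABLE:
        print(repr(snippet), "parses:", parses(snippet))
```

Run under Python 2, every snippet parses, which is why a Python 2 lint environment alone would not flag them.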

aaltay · 2018-01-19T21:03:17Z

sdks/python/tox.ini

  time {toxinidir}/run_pylint.sh
 passenv = TRAVIS*

+[testenv:lint2]


Perhaps rename this to lint3, lint_py3, or something similar. The 2 might be confusing.

holdenk · 2018-02-12T08:04:50Z

Gentle re-ping :)

holdenk · 2018-02-14T12:35:25Z

Ping @aaltay ?

holdenk · 2018-02-15T04:32:29Z

Ping @aaltay @robertwb ? Would be nice to get in so people stop committing syntactically invalid py3 code.

holdenk · 2018-02-15T13:27:03Z

OK, so it looks like y'all merged some conflicting changes. I'm open to updating this again, but I'd like to know when to expect a review so I don't chase my tail (especially since y'all request rebasing).

holdenk · 2018-02-17T00:44:52Z

Let’s try a ping during west coast working hours. CC @robertwb @aaltay

aaltay · 2018-02-17T02:57:39Z

@holdenk I can make a pass over this next week. I do not think you need to rebase until we agree on the comments; that way you only need to do one rebase after the last round. If you prefer, it is fine to rebase at any point as well.

holdenk · 2018-02-17T06:36:13Z

OK, I know in some other projects PRs which aren't mergeable tend to get lower review priority, but if that is not the case here I'll focus my time on actual changes, which are more fun/useful.

holdenk · 2018-02-21T11:45:23Z

Ping? Are there other reviewers interested in Py3 migration I should reach out to?

holdenk · 2018-02-23T01:01:34Z

@robertwb ?

robertwb

Only minor comments. The loss of tuple unpacking for function arguments is particularly sad for our use, but so be it.

robertwb · 2018-02-23T01:47:23Z

sdks/python/apache_beam/examples/complete/tfidf.py

    # inputs (and ordinary Python values, too) can be provided to MapFns and
    # DoFns in this way.
+    def div_word_count_by_total(word_count, total):
+      (word, count) = word_count


Optional: You don't need parentheses in the unpacking or the packing into tuples.
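
As a sketch of the suggestion: Python 3 no longer accepts tuple parameters in a function signature (PEP 3113), so the pair is unpacked in the body, and neither the unpacking nor the returned tuple needs parentheses. Only the `def` line and the unpacking appear in the diff above; the return expression here is an assumption for illustration.

```python
def div_word_count_by_total(word_count, total):
    # Python 2 allowed `def f((word, count), total)`; on Python 3 the
    # pair has to be unpacked inside the function body instead.
    word, count = word_count
    # Assumed body: the term frequency of `word` relative to `total`.
    return word, float(count) / total


if __name__ == "__main__":
    print(div_word_count_by_total(("beam", 3), 12))
```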

robertwb · 2018-02-23T01:54:21Z

sdks/python/apache_beam/runners/portability/fn_api_runner_test.py


    except:
-      print res._metrics_by_stage
+      print(res._metrics_by_stage)


Yes, printing is fine. It's just so that if this test fails, there is some context.

robertwb · 2018-02-23T02:04:05Z

sdks/python/apache_beam/runners/worker/data_plane.py

          if self._exc_info:
-            raise self.exc_info[0], self.exc_info[1], self.exc_info[2]
+            t, v, tb = self._exc_info
+            raise_(t, v, tb)


Or raise_(*self._exc_info)?

And elsewhere.

I'm flexible, I like the explicit unpack but sure.

robertwb · 2018-02-23T02:06:09Z

sdks/python/apache_beam/typehints/typecheck.py

      error_msg = ('Runtime type violation detected within ParDo(%s): '
                   '%s' % (self.full_label, e))
-      raise TypeCheckError, error_msg, sys.exc_info()[2]
+      raise_with_traceback(TypeCheckError(error_msg), sys.exc_info()[2])


+1 to raise_with_traceback.

holdenk · 2018-02-23T03:03:38Z

So on Feb 15th @robertwb merged a fix for some of these, except using a different library and without tests for regressions. I'm going to rebase this to simply act as a test for the fixes in that PR (although it would have been nice if we had merged this before).

holdenk · 2018-02-23T03:10:34Z

Just to clarify, I don't necessarily mean that one or the other should be merged; it's just that with the long review cycle relative to the dev cycle of the small patches we end up with a lot of conflicts (which is OK, if less than most fun).

aaltay

LGTM.

One general comment: how does the py3 environment work on Jenkins? Those machines do not have python3 installed (pending: https://issues.apache.org/jira/browse/BEAM-3671). #4610 is also blocked on that issue.

aaltay · 2018-02-23T04:15:54Z

sdks/python/run_mini_py3lint.sh

+fi
+
+echo "Running flake8 for module $MODULE:"
+flake8 $MODULE --count --select=E999 --show-source --statistics


Sounds good.

Optional: We don't need mini in the name.

holdenk · 2018-02-23T04:43:39Z

@aaltay I was thinking of keeping mini to indicate it's not the full lint until we have full linting working in Python 3. Does that sound reasonable? Open to changing if not.

holdenk · 2018-02-23T04:44:23Z

And the answer is -- for Jenkins it does not (although I can remove the env from the list if we want to merge it based on local runs while we wait for Jenkins to update).

aaltay · 2018-02-23T04:52:47Z

The name is fine either way (with or without mini), it is up to you. We can always change it later.

Let's remove it from the env list until the JIRA I mentioned is fixed, because otherwise running tox without specifying an environment will fail on Jenkins. This is sad, though, since it will not prevent merging more breaking changes. (You can add a comment about this to the JIRA and we can add it back to the list once python3 is installed on the workers.)

holdenk · 2018-02-23T04:58:18Z

Added https://issues.apache.org/jira/browse/BEAM-3738 with a link to 3671

aaltay · 2018-02-23T05:03:43Z

Sounds good. I can merge this once tests pass. I would suggest squashing your commits.

holdenk · 2018-02-23T05:33:20Z

Sure I can squash this down to one.

@aaltay

…d by flake8
Mini style fixes
Quick unpacking fixes
Fix type tests to match the changed func
Fix band to bandc oops
Passe the message text through
Switch mostly to raise_ even though raise_with_traceback should be closer.
Make mini_py3lint executable
Install future as well
Fix tfidf div_word_count_by_total function (oops)
Take @aaltay's comments into account
Update comment, explicitly unpack tuple in test.
Standardize on six to match @luke-zhu's work
Add the envlists
Remove lint_py3 for now

aaltay · 2018-02-23T20:01:06Z

There is a lint error

Running pylint for module apache_beam:
************* Module apache_beam.runners.worker.data_plane
E:187,12: Too many positional arguments for function call (too-many-function-args)
************* Module apache_beam.runners.worker.sdk_worker
E:293, 8: Too many positional arguments for function call (too-many-function-args)
************* Module apache_beam.runners.worker.data_plane_test
E: 54, 8: Too many positional arguments for function call (too-many-function-args)


holdenk · 2018-02-24T11:15:02Z

ok there we go :)

holdenk · 2018-02-26T22:57:57Z

West coast working time ping @aaltay :)

aaltay · 2018-02-27T01:23:41Z

Thank you. Merged.

udim · 2018-03-02T01:05:47Z

sdks/python/tox.ini

  python setup.py test
 passenv = TRAVIS*

-[testenv:lint]


Why was the name changed?
Please revert or update https://beam.apache.org/contribute/contribution-guide/

Sure, I'll ping you on the PR for that.

shoyer · 2018-04-12T23:47:15Z

sdks/python/apache_beam/runners/common.py

          + step_annotation)
      new_exn._tagged_with_step = True
-    raise new_exn, None, original_traceback
+    six.raise_from(new_exn, original_traceback)


This use of raise_from instead of reraise led to a bug with dropped stacktraces: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-3956

Are the other uses of raise_from instead of reraise in this PR appropriate?
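
To make the difference concrete, here is a hedged Python 3-only sketch that uses the standard library rather than six. A "raise from" takes a cause, which must be an exception, while carrying over a traceback needs a reraise-style call. To my understanding, on Python 2 `six.raise_from` simply raises the new exception and ignores its second argument, which matches the dropped stack traces described in the JIRA. `user_fn`, `augment` and `frame_names` are hypothetical names, not the Beam code.

```python
import sys
import traceback


def user_fn():
    raise ValueError("boom")


def augment(use_traceback):
    """Re-raise a failure from user_fn as a RuntimeError tagged with a step."""
    try:
        user_fn()
    except ValueError as exn:
        original_traceback = sys.exc_info()[2]
        new_exn = RuntimeError("%s [while running 'step']" % exn)
        if use_traceback:
            # In the spirit of Python 2's `raise new_exn, None, original_traceback`:
            # the new exception keeps the frames of the original failure.
            raise new_exn.with_traceback(original_traceback)
        # A plain raise: the traceback of new_exn starts at this line, so
        # the frame inside user_fn is not part of it.
        raise new_exn from None


def frame_names(use_traceback):
    """Return the function names in the traceback of the re-raised error."""
    try:
        augment(use_traceback)
    except RuntimeError as err:
        return [f.name for f in traceback.extract_tb(err.__traceback__)]


if __name__ == "__main__":
    print("with traceback:   ", frame_names(True))
    print("without traceback:", frame_names(False))
```

Only the first variant records `user_fn` in the traceback, which is the information a pipeline author needs to locate the failing line.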

holdenk changed the title ~~[BEAM-3444] Fix python3 flake8 errors e999~~ [WIP][BEAM-3444] Fix python3 flake8 errors e999 Jan 11, 2018

holdenk force-pushed the BEAM-3444-fix-flake8-errors-e999 branch 2 times, most recently from d55ded3 to 7f5e1bc Compare January 18, 2018 03:05

holdenk changed the title ~~[WIP][BEAM-3444] Fix python3 flake8 errors e999~~ [BEAM-3444] Fix python3 flake8 errors e999 Jan 19, 2018

aaltay reviewed Jan 19, 2018

holdenk force-pushed the BEAM-3444-fix-flake8-errors-e999 branch from 57eb4a8 to 0df7e5a Compare February 12, 2018 07:22

holdenk force-pushed the BEAM-3444-fix-flake8-errors-e999 branch from 0df7e5a to 2522a55 Compare February 15, 2018 13:33

robertwb reviewed Feb 23, 2018

holdenk force-pushed the BEAM-3444-fix-flake8-errors-e999 branch from 2522a55 to 996eba1 Compare February 23, 2018 03:07

aaltay reviewed Feb 23, 2018

holdenk force-pushed the BEAM-3444-fix-flake8-errors-e999 branch from 3b048b8 to b70af6d Compare February 23, 2018 07:14

holdenk force-pushed the BEAM-3444-fix-flake8-errors-e999 branch from b70af6d to 9850d3f Compare February 23, 2018 09:06

holdenk added 2 commits February 23, 2018 17:21

Fix some raise_from to reraise.

9579372

vcfio somehow has some sort issues. It's not overly important and hop…

9f7aa9b

…efully we can remove isort after the py3 migration is complete and just depend on pylint.

aaltay merged commit d29f7b8 into apache:master Feb 27, 2018

udim reviewed Mar 2, 2018

cclauss mentioned this pull request Mar 2, 2018

Test: Py3.6 flake8 --select=E901,E999,F821,F822,F823 #4610

Closed

shoyer reviewed Apr 12, 2018
