@hmc-cs-mdrissi
Contributor

@hmc-cs-mdrissi hmc-cs-mdrissi commented Aug 29, 2021

Update all usages of collections.Sequence/collections.Iterable/etc. to use collections.abc, as the old names are deprecated and will be removed in 3.10. collections.abc was added in 3.3, so all Python versions Beam supports should be compatible with this change.

@tvalentyn This fix was motivated by seeing the deprecation warning and this ticket, https://issues.apache.org/jira/browse/BEAM-12000?jql=text%20~%20%223.9%22. Doesn't solve the main work for the ticket, but covers one easy change.
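
A minimal sketch of the kind of change described above. It is not taken from the Beam diff: the helper name `is_sequence_like` is hypothetical and only illustrates swapping the deprecated `collections.Sequence` alias for `collections.abc.Sequence`.

```python
import collections.abc


def is_sequence_like(value):
    """Return True for sequences other than str/bytes (hypothetical helper)."""
    # Before the change this check would have read:
    #   isinstance(value, collections.Sequence)  # deprecated alias
    if isinstance(value, (str, bytes)):
        return False
    return isinstance(value, collections.abc.Sequence)


print(is_sequence_like([1, 2, 3]))
print(is_sequence_like({"a": 1}))
```

The explicit `import collections.abc` matters here: the check does not rely on some other module having loaded the submodule first.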

@codecov

codecov bot commented Aug 29, 2021

Codecov Report

Merging #15415 (c2f8a52) into master (cbb363f) will increase coverage by 0.70%.
The diff coverage is 90.90%.

@@            Coverage Diff             @@
##           master   #15415      +/-   ##
==========================================
+ Coverage   83.83%   84.54%   +0.70%     
==========================================
  Files         440      443       +3     
  Lines       59886    59128     -758     
==========================================
- Hits        50207    49990     -217     
+ Misses       9679     9138     -541     
Impacted Files Coverage Δ
sdks/python/apache_beam/utils/proto_utils.py 89.13% <ø> (+26.63%) ⬆️
sdks/python/apache_beam/transforms/trigger.py 91.41% <50.00%> (+1.79%) ⬆️
.../apache_beam/runners/direct/transform_evaluator.py 91.04% <100.00%> (+0.59%) ⬆️
...ks/python/apache_beam/runners/worker/sideinputs.py 88.07% <100.00%> (+0.11%) ⬆️
sdks/python/apache_beam/typehints/typecheck.py 97.56% <100.00%> (+0.01%) ⬆️
sdks/python/apache_beam/typehints/typehints.py 93.84% <100.00%> (+0.49%) ⬆️
sdks/python/apache_beam/typehints/row_type.py 82.35% <0.00%> (-1.86%) ⬇️
sdks/python/apache_beam/io/source_test_utils.py 88.47% <0.00%> (-1.39%) ⬇️
sdks/python/apache_beam/internal/metrics/metric.py 90.42% <0.00%> (-1.07%) ⬇️
...apache_beam/runners/dataflow/internal/apiclient.py 76.22% <0.00%> (-0.58%) ⬇️
... and 177 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@tvalentyn tvalentyn changed the title Update deprecated collections to use collections.abc in python sdk [BEAM-12000] Update deprecated collections to use collections.abc in python sdk Aug 30, 2021
Contributor

@tvalentyn tvalentyn left a comment

Thanks a lot for this fix.

Should we import collections.abc as well? I see the tests are passing, so I'm not sure whether all the affected codepaths were triggered, but I am seeing this locally:

:~$ python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import collections
>>> isinstance(object, collections.abc.Iterable)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/collections/__init__.py", line 68, in __getattr__
    raise AttributeError(f'module {__name__!r} has no attribute {name!r}')
AttributeError: module 'collections' has no attribute 'abc'
>>> import collections.abc
>>> isinstance(object, collections.abc.Iterable)
False

@pabloem
Member

pabloem commented Sep 10, 2021

what's the status for this PR?

@hmc-cs-mdrissi
Contributor Author

hmc-cs-mdrissi commented Sep 14, 2021

I've updated the import to cover collections.abc. I'm unsure why the tests pass, since breakpoints show that some tests definitely do hit those lines, but better to be safe. The most likely reason is that some other import indirectly loads collections.abc. It's a commonly imported library, so most of the time it will already have been loaded from somewhere.
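
A small sketch of the "indirect import" explanation, under the assumption that it is what happens in the test run. The function `load_abc_elsewhere` is a hypothetical stand-in for any third-party module that imports `collections.abc`; once it has run, the submodule is visible through the plain `collections` name as well.

```python
import collections
import sys


def load_abc_elsewhere():
    # Stand-in for "some other import". The name bound by this statement is
    # local to the function, but the submodule is loaded process-wide and
    # attached to the `collections` package as an attribute.
    import collections.abc  # noqa: F401


load_abc_elsewhere()

submodule_loaded = "collections.abc" in sys.modules
attribute_visible = hasattr(collections, "abc")
print(submodule_loaded, attribute_visible)
```

Because this depends on import order elsewhere in the process, code that uses `collections.abc` is safer importing it explicitly.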

The PR was mainly stalled because I needed to get a dev environment set up. When I made the PR, I made the edits without setting up an environment. Now that I have Beam set up locally, I added a simple unit test that should hopefully fix the coverage failure, which was the only thing blocking the PR earlier. For coverage, I noticed overload lines were counting as untested when they should be skipped, so I added overload to exclude_lines.

RAT precommit hook error confuses me,

* What went wrong:
09:37:41 Execution failed for task ':rat'.
09:37:41 > A failure occurred while executing org.nosphere.apache.rat.RatWork
09:37:41    > Apache Rat audit failure - 1 unapproved license
09:37:41      	See file:///home/jenkins/jenkins-slave/workspace/beam_PreCommit_RAT_Commit/src/build/reports/rat/index.html
09:37:41 
09:37:41 * Try:
09:37:41 Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
09:37:41 

pylint error I'll fix in a bit.

edit: On coverage check I'm a little confused. Why is there both .coveragerc and setup.cfg with coverage options? Which one is used by CI? Can there only be one?

def pack_Any(msg):
  # type: (message.Message) -> any_pb2.Any
  pass
  ...
Contributor Author

pass and ... are similar but not the same. With pass, some type checkers infer a return type of None, and Pyright will give you an error for using pass here. ... is the typical way to mark overloads as stubs.
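
A minimal sketch of overload stubs written with `...`. The function `double` is hypothetical, and the sketch uses inline annotations rather than the `# type:` comments seen in the Beam snippet.

```python
from typing import Optional, overload


@overload
def double(value: int) -> int: ...


@overload
def double(value: None) -> None: ...


def double(value: Optional[int]) -> Optional[int]:
    # The single runtime implementation behind both overload stubs.
    if value is None:
        return None
    return value * 2


print(double(21))
print(double(None))
```

The `@overload` bodies are never executed; only the final undecorated definition runs, which is why a placeholder body is enough for the stubs.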

Contributor Author

Are test files even type checked? I tried running mypy on this file and got an error. Also discovered mypy.ini for the repo has an invalid value giving this error message,

mypy.ini: [mypy]: follow_imports: invalid choice 'true' (choose from 'normal', 'silent', 'skip', 'error')

The mypy error that prevents it from type checking this file is,

apache_beam/portability/api/metrics_pb2.pyi:117: error: invalid syntax [syntax]

Looks like the root cause of that error is a comment that uses type: notation in a docstring, which mypy confuses with a standard type comment.

Contributor

Interesting, I think generated _pb2.py files should be excluded from the type checks.

Contributor

Didn't know about this usage of Ellipsis. Found: https://mypy.readthedocs.io/en/stable/stubs.html and python/typing#109 where this is discussed.

Contributor Author

The pb2.py is not type checked, but pb2.pyi is needed for type inference. Looks like proto file has a comment that confuses mypy. I can make a tweak to the comment to make it fine for mypy.

args: ["--rcfile=sdks/python/.pylintrc"]
files: ^sdks/python/apache_beam/
exclude: *exclude
- repo: https://github.com/pycqa/isort
Contributor Author

Added because CI runs isort but pre-commit doesn't; pre-commit should have the same behavior as CI. The setup.cfg tweaks I made were to match run_pylint.sh in CI.

@hmc-cs-mdrissi
Contributor Author

hmc-cs-mdrissi commented Sep 14, 2021

I'm confused by test failure on ubuntu-latest 3.7, but 3.7 works on other platforms + other ubuntus work. None of the changes I made look platform related. Can that check be re-run/potentially flaky?

isort failure looks like I need to find one more config option to make pre-commit match ci fully.

@tvalentyn
Contributor

> I'm confused by test failure on ubuntu-latest 3.7, but 3.7 works on other platforms + other ubuntus work. None of the changes I made look platform related. Can that check be re-run/potentially flaky?

It's a flake, it's being fixed in BEAM-12794.

> Why is there both .coveragerc and setup.cfg with coverage options? Which one is used by CI? Can there only be one?

I don't know, you could ask on dev@ or @udim might know.

orig_msg = unpack_Any(packed_msg, timestamp_pb2.Timestamp)
none_msg = unpack_Any(packed_msg, None)
assert proto_timestamp == orig_msg
assert none_msg is None
Contributor

Is this a leftover, given that you have a separate test_none_pack?

Contributor Author

test_None_pack checks packing None. unpack_Any does not allow None for the msg, but does allow None for the msg type, so this is intended. I wanted to test both packing a None msg and unpacking with a None class.

output = target/site/cobertura/coverage.xml

[isort]
line_length = 120
Contributor

where does this number come from?

Contributor Author

Contributor

Came across https://issues.apache.org/jira/browse/BEAM-3745, which touches on isort & lint. Perhaps we can close it already, or after your changes.


# Don't complain about missing debug-only code:
def __repr__
if self\.debug
Contributor

How about if self\.debug_logging_enabled, to avoid pattern-matching if self.debug_options, which occurs in the codebase?
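
A short sketch of the concern, assuming the exclusion patterns are applied as regular expressions with search-style (partial) matching. The two sample source lines are made up for illustration.

```python
import re

# Patterns as they would appear in the coverage exclusion list.
broad = re.compile(r"if self\.debug")
narrow = re.compile(r"if self\.debug_logging_enabled")

# Hypothetical source lines: only the second is debug-only code.
source_lines = [
    "if self.debug_options.experiments:",
    "if self.debug_logging_enabled:",
]

broad_hits = [line for line in source_lines if broad.search(line)]
narrow_hits = [line for line in source_lines if narrow.search(line)]

print(broad_hits)
print(narrow_hits)
```

The broad pattern excludes both lines from coverage, including the `debug_options` branch that should be measured, while the narrow pattern excludes only the debug-logging branch.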

Contributor Author

Sure. The reason I used if self.debug was to make .coveragerc consistent with the setup.cfg coverage option. For the infra PR I can make them both self.debug_logging_enabled.

@@ -0,0 +1,25 @@
import unittest
Contributor

RAT check wants you to add a license here, see other files.

Contributor Author

Thank you, will fix.

@tvalentyn
Contributor

Thanks, I left some comments. It would be better to split infra changes in this PR into self-contained changes, either as separate PRs or separate commits appropriately squashed. If you choose to have multiple commits in the same PR, please don't squash the commits before review on this PR finalizes.

@hmc-cs-mdrissi
Contributor Author

I'll split the PR into two PRs: one for the infra changes and one for the unit test + collections.abc change. The unit test is only needed with the collections.abc change so that the coverage check is satisfied.

@pabloem
Member

pabloem commented Sep 23, 2021

hi there! any updates on this PR? : )

@tvalentyn
Contributor

Closing in favor of #15850. @hmc-cs-mdrissi, feel free to rebase this PR to keep the infrastructure changes that you made and reopen it. Thank you!

@tvalentyn tvalentyn closed this Jan 14, 2022