@nielsbasjes
Contributor

@nielsbasjes nielsbasjes commented Nov 11, 2020

This adds a Docker-based build environment that gives developers a predictable working environment with all needed tools present.

On a Linux machine (I ONLY tested this on my Ubuntu 20.04 laptop) just run the start-build-env.sh script to build and start a Docker-based build setup that wraps around the current copy of the Beam source tree.

This is based upon the initiative started by @omarismail94 #12837 and copies large parts from https://github.com/apache/hadoop

As suggested by @TheNeuralBit in #12837 I was able to run ./gradlew :sdks:python:container:py37:docker inside this docker setup which actually created a 2.39GB docker image apache/beam_python3.7_sdk:2.27.0.dev

TODO:

  • Check if the entire build works in this.
  • Write the documentation.

@nielsbasjes
Contributor Author

nielsbasjes commented Nov 11, 2020

R: @TheNeuralBit
R: @omarismail94

@nielsbasjes
Contributor Author

R: @TheNeuralBit

@nielsbasjes nielsbasjes force-pushed the BEAM-10891-Development-Docker branch 4 times, most recently from f66285d to 5814f96 Compare November 12, 2020 13:48
@nielsbasjes nielsbasjes changed the title [WIP][BEAM-10891] Standardized developer build environment using Docker [BEAM-10891] Standardized developer build environment using Docker Nov 12, 2020
@nielsbasjes
Contributor Author

Status:
Several things fail when I use this Docker image and run ./gradlew check.

One I'm kinda stuck on how to fix it is this one:

> Task :sdks:java:core:compileJava
...
compiler message file broken: key=compiler.misc.msg.bug arguments=11.0.9, {1}, {2}, {3}, {4}, {5}, {6}, {7}
java.lang.NoSuchMethodError: 'void com.sun.tools.javac.util.Log.error(com.sun.tools.javac.util.JCDiagnostic$DiagnosticPosition, java.lang.String, java.lang.Object[])'
        at com.google.errorprone.ErrorProneError.logFatalError(ErrorProneError.java:55)
        at com.google.errorprone.ErrorProneAnalyzer.finished(ErrorProneAnalyzer.java:155)
        at jdk.compiler/com.sun.tools.javac.api.MultiTaskListener.finished(MultiTaskListener.java:132)
        at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.flow(JavaCompiler.java:1418)
        at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.flow(JavaCompiler.java:1365)

Suggestions are welcome.

@TheNeuralBit
Member

One I'm kinda stuck on how to fix it is this one:

Aha! I ran into this same error just now; it's because I had JAVA_HOME set to use a Java 11 JDK. Switching back to Java 8 resolved it. We don't yet support Java 11, can you have the image specify Java 8?
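
A minimal sketch of how a setup script could detect this mismatch, assuming the version string has already been read from the JDK (for example from the output of java -version). The helper names java_major_version and is_supported_jdk are hypothetical and not part of this PR; 1.8.0_275 is only an example of the pre-Java-9 numbering, while 11.0.9 is the version shown in the compiler message above.

```python
import re


def java_major_version(version: str) -> int:
    """Return the major Java version for strings like '1.8.0_275' or '11.0.9'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Java version string: {version!r}")
    first = int(match.group(1))
    # Before Java 9 the scheme was 1.<major>, so 1.8.0_275 means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def is_supported_jdk(version: str, required_major: int = 8) -> bool:
    """Hypothetical check: True only when the JDK matches the required major version."""
    return java_major_version(version) == required_major
```

With this, is_supported_jdk("11.0.9") would be False, which is the situation that produced the NoSuchMethodError above.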

Contributor

I think you can add time to the package installation, and that will remove the need for this. I did that on my end and it worked!

Contributor Author

Yes, already copied that improvement from your version.

@omarismail94
Contributor

omarismail94 commented Nov 23, 2020

I just pulled the branch and ran

./gradlew check

Only 2 tests failed, but I think their cause is known:

I think with that, we are good to go! That script is pretty awesome @nielsbasjes!

@nielsbasjes
Contributor Author

I found a very subtle permission problem in the Go part.
Working on it.

@nielsbasjes
Contributor Author

What I ran into is that if you run go get github.com/linkedin/goavro while building the docker image the contents of ${HOME}/.cache/go-build will be owned by root. This causes permission issues during the build.
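
A minimal sketch of how one could confirm this kind of ownership problem, assuming a POSIX system. The function name paths_not_owned_by is hypothetical; it only reports paths and changes nothing.

```python
import os
from typing import List


def paths_not_owned_by(root: str, uid: int) -> List[str]:
    """List directories and files under root whose owner is not the given uid."""
    foreign = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            # lstat inspects the entry itself and does not follow symlinks.
            if os.lstat(path).st_uid != uid:
                foreign.append(path)
    return sorted(foreign)
```

Inside the container, something like paths_not_owned_by(os.path.expanduser("~/.cache/go-build"), os.getuid()) would list the entries that were created as root during the image build.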

@nielsbasjes
Contributor Author

I'm in doubt about something.

Right now I put the ~/.gradle of the docker image in the real directory ~/.beam_docker_build_env/.gradle/

The reason for that is that if I put it in a directory that is part of the source tree it will cause IntelliJ to see thousands of changed files way too often.

If I link it to the real ~/.gradle then this causes problems because if you also want to build it outside the docker the paths are different and some of these files have a path generated in them.

The downside is that if you have multiple copies of the beam code cloned then these will all share this one instance and may very well (untested) cause conflicts.

What is the best approach here?

@omarismail94
Contributor

omarismail94 commented Nov 23, 2020

@nielsbasjes I am not familiar with how this works, and would love your thoughts on this:

In the status quo (without docker containers), if you have multiple copies of beam repos, wouldn't they all use ~/.gradle, meaning the potential for conflict already exists?

If so, I do not think it matters which setup you go with. It seems like the former may be cleaner as it separates the .gradle directory of docker builds vs non-docker builds.

Let me know what you think!

@TheNeuralBit
Member

I agree with @omarismail94's assessment

@nielsbasjes
Contributor Author

The current version still fails to complete ./gradlew build.
2 problems right now:

First, this error that "sometimes" occurs:

-----------
* What went wrong:
Execution failed for task ':sdks:go:test:load:resolveBuildDependencies'.
> Exception in resolution, message is:
  Cannot resolve dependency:github.com/etcd-io/etcd: commit='11214aa33bf5a47d3d9d8dafe0f6b97237dfe921', urls=[https://github.com/etcd-io/etcd.git, git@github.com:etcd-io/etcd.git]
  Resolution stack is:
  +- github.com/apache/beam/sdks/go/test/load
   +- ./github.com/apache/beam/sdks/go@/home/nbasjes/beam/sdks/go

The funny thing is that apparently the code is locked to a commit in etcd that was put in in February 2018, more than 2 years ago.

Second, building the website consistently fails with this:

> Task :website:startDockerContainer
16b1c9631a0b35b77e763b80145fa07320be545c679c118c329ceaa3968a44eb

> Task :website:installDependencies FAILED
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "chdir to cwd (\"/opt/website/www\") set in config.json failed: no such file or directory": unknown

> Task :website:initGitSubmodules FAILED
fatal: Not a git repository (or any parent up to mount point /opt)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

> Task :website:buildCodeSamples FAILED
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "chdir to cwd (\"/opt/website/www\") set in config.json failed: no such file or directory": unknown

A container for the image beam-website:latest is started.
I checked and the directory /opt in the container is empty (i.e. /opt/website/www does not exist)

A generic problem I found here is that if you run this multiple times you end up with multiple Docker containers running that are not terminated automatically. I'm ignoring this for now.

@nielsbasjes
Contributor Author

I ran a test with build.dependsOn buildWebsite removed from website/build.gradle to see if the rest of the build is ok.

So far ./gradlew :sdks:go:test:goVet keeps failing with about 150 errors like these:

$ ./gradlew :sdks:go:test:goVet
Starting a Gradle Daemon, 1 incompatible and 3 stopped Daemons could not be reused, use --status for details
Configuration on demand is an incubating feature.

> Task :sdks:go:goPrepare
Use project GOPATH: /home/nbasjes/beam/sdks/go/.gogradle/project_gopath

> Task :sdks:go:test:goPrepare
Use project GOPATH: /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath

> Task :sdks:go:test:resolveBuildDependencies
Resolving ./github.com/apache/beam/sdks/go@/home/nbasjes/beam/sdks/go
.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/load/vendor/google.golang.org/genproto/googleapis/cloud/bigquery/connection/v1beta1/connection.pb.go:37:2: cannot find package "google.golang.org/protobuf/types/known/fieldmaskpb" in any of:
        /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/load/vendor/google.golang.org/protobuf/types/known/fieldmaskpb (vendor tree)
        /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/google.golang.org/protobuf/types/known/fieldmaskpb
        /home/nbasjes/.gradle/go/binary/1.12/go/src/google.golang.org/protobuf/types/known/fieldmaskpb (from $GOROOT)
        /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/google.golang.org/protobuf/types/known/fieldmaskpb (from $GOPATH)
.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/load/vendor/github.com/etcd-io/etcd/clientv3/namespace/kv.go:20:2: cannot find package "github.com/coreos/etcd/clientv3" in any of:
        /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/load/vendor/github.com/coreos/etcd/clientv3 (vendor tree)
        /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/coreos/etcd/clientv3
        /home/nbasjes/.gradle/go/binary/1.12/go/src/github.com/coreos/etcd/clientv3 (from $GOROOT)
        /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/github.com/coreos/etcd/clientv3 (from $GOPATH)

and

can't load package: package github.com/apache/beam/sdks/go/test/load/vendor/golang.org/x/net/lif: build constraints exclude all Go files in /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/load/vendor/golang.org/x/net/lif
can't load package: package github.com/apache/beam/sdks/go/test/load/vendor/cloud.google.com/go/internal/godocfx: build constraints exclude all Go files in /home/nbasjes/beam/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/load/vendor/cloud.google.com/go/internal/godocfx

My knowledge of Go and Gradle is too limited to have any idea how to fix this.

Please help

@nielsbasjes
Contributor Author

@omarismail94
These two fail for me too:

  • Task :sdks:java:extensions:ml:test. This fails due to DLP API permissions I have on my GCP project.
    When I ran these they all failed because of Caused by: java.io.IOException: The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
    The ones that fail seem to be IT tests using Google Cloud features (which logically require authentication).
    So apparently the build has expectations of the build environment that cannot be met everywhere.
    I wonder how these are run in a CI environment like Jenkins?

  • Task :sdks:go:test:goVet. I think this fails due to a known issue: https://issues.apache.org/jira/browse/BEAM-4831
    The pull request linked to that issue was merged in July 2018 so I think this is a different problem.
    I have tried downgrading the Go version to (the latest) 1.12, as this is the version defined in all of the Gradle build scripts.
    Not much difference in build success.
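
For the credentials error in the first item above, a minimal sketch of a pre-flight check, assuming only the environment variable named in the error message is considered. The function name credentials_file is hypothetical, and the sketch deliberately ignores the other case the message mentions (running in Google Compute Engine).

```python
import os
from typing import Mapping, Optional


def credentials_file(env: Mapping[str, str]) -> Optional[str]:
    """Return the credentials path if the variable is set and the file exists."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if path and os.path.isfile(path):
        return path
    # Either the variable is unset or it points at a file that is missing.
    return None
```

Calling credentials_file(os.environ) before starting the build would make it clear up front whether tests that need Google credentials can be expected to pass in this environment.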

@TheNeuralBit
Member

@lostluck, @youngoli any advice for the goVet errors?

For the GCP permissions issue, if this is an IT I don't think it should be run as part of :check; maybe it's getting misclassified as a unit test? Note that the Jenkins workers have credentials for the apache-beam-testing GCP project so they can run integration tests.

@TheNeuralBit
Member

TheNeuralBit commented Nov 30, 2020

The extensions:ml build.gradle has explicitly added *IT.java to the :test task:

project.test {
def gcpProject = project.findProperty("gcpProject") ?: 'apache-beam-testing'
include "**/**IT.class"

There should be a separate :integrationTest task for these. We could run it as part of the Java PostCommit

EDIT: I drafted #13444 for this
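
To illustrate the separation being proposed, here is a minimal sketch that splits compiled test classes by the IT suffix. It assumes the naming convention alone decides the category; the function name split_tests and the example file names are hypothetical, and this is not how Gradle itself evaluates include patterns.

```python
from typing import Iterable, List, Tuple


def split_tests(class_files: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition class files into (unit tests, integration tests) by the IT suffix."""
    unit: List[str] = []
    integration: List[str] = []
    for path in class_files:
        if path.endswith("IT.class"):
            integration.append(path)
        else:
            unit.append(path)
    return unit, integration
```

Under this convention only the first list would belong in :test, and the second would go to a separate :integrationTest task.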

@nielsbasjes
Contributor Author

@TheNeuralBit
Looks like this code was added as part of https://issues.apache.org/jira/browse/BEAM-9147
So I assume that means we should make that a separate issue?

@TheNeuralBit
Member

So I assume that means we should make that a separate issue?

Yeah we can handle that separately.

I would like to get to the bottom of the goVet issue in this PR though. Alternatively, if we can confirm this is working for Python and Java development, we could file a follow-up jira to figure out the issues with the Go SDK.

(Also FYI there's been some discussion about this effort on the dev@ mailing list)

@nielsbasjes
Contributor Author

I created https://issues.apache.org/jira/browse/BEAM-11363 for the IT issue

@nielsbasjes
Contributor Author

I just read the dev list and what you wrote. I think committing it as it is now and improving from there is a good idea.
That will make it a lot easier for the go experts to try this and reproduce the goVet, website and ML problems.
I'll rebase and squash to 1 commit on master so this can be merged cleanly.

@nielsbasjes nielsbasjes force-pushed the BEAM-10891-Development-Docker branch from 661605a to 5e6296f Compare November 30, 2020 20:41
@ajamato

ajamato commented Dec 1, 2020

Thanks @nielsbasjes! I looked this over and I have a few minor suggestions.

@ajamato, @rohdesamuel - would you be willing to try this out and see if it works for your purposes?

I'll give it a try @rohdesamuel FYI you can use this to pull in the PR

git fetch origin pull/13308/head:BEAM-10891-Development-Docker

git checkout BEAM-10891-Development-Docker

(assumes origin is set to apache beam git repo)

@ajamato

ajamato commented Dec 1, 2020

Okay

Okay. I gave it a try. In its current state, IMO, I wouldn't use it; it would be too complicated to use compared to what I have now.

I was basically just aiming to go through the commands here, run the Python tests, run the lint and format steps, etc., and make sure they all work:
https://cwiki.apache.org/confluence/display/BEAM/Python+Tips

First, I couldn't run pyenv to set up a virtualenv and perform the steps in that guide.
We would need to install:

  • pyenv
  • pyenv-virtualenv

Then I would be able to set up a Python virtualenv and perform the steps in the wiki.

Even better, I would like to see this set up a Python virtualenv inside the Docker container as well. I think everything can run from within there.

I didn't really attempt steps for java though.

Also, I needed sudo to run start-build-env.sh. I am wondering if we can have this work without sudo, as I imagine this is giving my Docker instance too many permissions.

sudo ./start-build-env.sh

@TheNeuralBit
Member

First, I couldn't run pyenv to set up a virtualenv and perform the steps in that guide.

Could you just skip the virtualenv and install everything in that guide directly into the container's python install?

@TheNeuralBit
Member

I've been experimenting with the container in a clean clone of Beam on this branch. Weirdly, it seems that :sdks:go:test:goVet fails consistently when I run ./gradlew check. It continues to fail when I run it directly (./gradlew :sdks:go:test:goVet), until I remove all the vendor directories (find sdks/go/ -name vendor -type d -exec rm -rf {} \;). Then it succeeds consistently!

... until I run ./gradlew check again. Is it possible some other target is spoiling the vendor directories?

I suspect this is a separate issue. If this can be repro'd outside of the dev container, let's file a bug and move on (I'm out of time for today, I'll try it myself tomorrow if no one else does).
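
For anyone trying to reproduce this, a rough Python counterpart of the find command above, written as a sketch: it lists the vendor directories first so they can be inspected before anything is deleted. The function names find_vendor_dirs and remove_vendor_dirs are hypothetical. Unlike the plain find invocation, it does not descend into a vendor directory once it has found one.

```python
import os
import shutil
from typing import List


def find_vendor_dirs(root: str) -> List[str]:
    """Return every real directory named 'vendor' below root, outermost only."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "vendor" in dirnames:
            path = os.path.join(dirpath, "vendor")
            # Prune so os.walk does not descend into a directory slated for removal.
            dirnames.remove("vendor")
            if not os.path.islink(path):
                found.append(path)
    return sorted(found)


def remove_vendor_dirs(root: str) -> List[str]:
    """Delete the directories reported by find_vendor_dirs and return their paths."""
    removed = find_vendor_dirs(root)
    for path in removed:
        shutil.rmtree(path)
    return removed
```

Comparing the output of find_vendor_dirs("sdks/go") before and after ./gradlew check might show which target recreates the directories.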

@nielsbasjes
Contributor Author

@ajamato I did some googling on how to set up pyenv and pyenv-virtualenv, and I've added that to the Docker image (along with pre-downloading the latest patch releases of the Python versions Beam needs).

Note that my knowledge level around python is very limited so other than just installing these tools I do not know how to proceed.

So feedback is appreciated.

@nielsbasjes
Contributor Author

nielsbasjes commented Dec 2, 2020

The Python build now fails with this error (by the looks of it, with the same cause as pypa/virtualenv#2006).

Error log:

nbasjes@[Beam Build Env.]:~/beam {BEAM-10891-Development-Docker} ]
$ cat /home/nbasjes/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py3-yapf-check/py3-yapf-check/log/py3-yapf-check-0.log
action: py3-yapf-check, msg: getenv
cwd: /home/nbasjes/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py3-yapf-check
cmd: /home/nbasjes/beam/build/gradleenv/-1227304282/bin/python -m virtualenv --no-download --python /home/nbasjes/beam/build/gradleenv/-1227304282/bin/python py3-yapf-check
ImportError: cannot import name 'enquote_executable' from 'distlib.scripts' (/home/nbasjes/beam/build/gradleenv/-1227304282/lib/python3.8/site-packages/distlib/scripts.py)
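
A minimal sketch for checking whether an installed module exposes a given name, which is what the ImportError above is complaining about. The helper name module_exposes is hypothetical; it would need to be run with the same interpreter that appears in the cmd line of the log.

```python
import importlib


def module_exposes(module_name: str, attribute: str) -> bool:
    """Return True if the module can be imported and has the named attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or broken in this interpreter.
        return False
    return hasattr(module, attribute)
```

Here the interesting call would be module_exposes("distlib.scripts", "enquote_executable"); a False result would be consistent with the installed distlib not providing the name that virtualenv tries to import.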

@nielsbasjes
Contributor Author

@TheNeuralBit I ran the current docker image locally with your changes from #13444 included.

Turns out that sdks/java/io/google-cloud-platform also has a few tests that require Google credentials to run.
  • DicomIOTest.test_Dicom_failedMetadataRead
  • FhirIOTest.test_FhirIO_failedReads
  • FhirIOTest.test_FhirIO_failedWrites
  • HL7v2IOTest.test_HL7v2IO_failedReads
  • HL7v2IOTest.test_HL7v2IO_failedWrites

All of these also failed with Caused by: java.io.IOException: The Application Default Credentials are not available.

My current guess is that these should be moved to be an IT test instead of a regular test.

Do you want to include the #13444 fixes?

@nielsbasjes
Contributor Author

nielsbasjes commented Dec 2, 2020

Strange; after a ./gradlew check the file sdks/go/pkg/beam/transforms/stats/stats.shims.go is modified.
The changes seem like code layout/styling differences. For example:

-       f := fn.(func(float64,float64) (float64))
+       f := fn.(func(float64, float64) float64)

@TheNeuralBit
Member

Honestly I'm not sure it makes sense to include pyenv and pyenv-virtualenv by default. This is yet another level of containerization that confuses things. Once users have started the build environment they could just treat its system python(s) like a virtualenv.

We should consider pre-installing the tools from https://cwiki.apache.org/confluence/display/BEAM/Python+Tips for the system python though.

For those that want to use pyenv inside the build env container they're free to install it.

@TheNeuralBit
Member

My current guess is that these should be moved to be an IT test instead of a regular test.

Do you want to include the #13444 fixes?

That's probably the right thing to do, but I want to bring it up with the dev list first. I'll use BEAM-11363 to track fixing those tests as well. For now you could try just deleting or @Ignore-ing those tests to see if you can get it to pass.

@nielsbasjes
Contributor Author

@TheNeuralBit I had a look at the python page you mentioned regarding the tooling on it.
Seems to me that what I have now installs most of the tools mentioned there.
Including the virtualenv and pyenv tools.

So at this point I'm a bit confused what you guys want to have installed and what not regarding the python tools.

@nielsbasjes nielsbasjes force-pushed the BEAM-10891-Development-Docker branch from 74c9a6d to 3c715c6 Compare December 4, 2020 15:03
@omarismail94
Contributor

@nielsbasjes I think he meant having the following packages:

  • tox
  • yapf==0.29.0
  • pytest

I see tox as part of the commit files already, but not the latter 2

@nielsbasjes
Contributor Author

@omarismail94 Cool, can you give me a hint what the correct place is to put them in? (I'm a real Python noob)

Member

@TheNeuralBit TheNeuralBit left a comment

Thank you for all your work on this @nielsbasjes, I'm sure this will make life much easier for new contributors. I would really like it if this could successfully run ./gradlew check, but it looks like the issues we're running into there are independent of this container. I filed BEAM-11402 to address that separately.

Just one last request - I think we should avoid the extra layer of pyenv for the reasons I mentioned above. I can merge once that's addressed.

@nielsbasjes
Contributor Author

As far as I can tell, the current image pre-installs pyenv, but it is not activated by default. Please explain what you would like me to change.

@TheNeuralBit
Member

Please explain what you would like me to change.

Let's just not install pyenv and its dependencies

@TheNeuralBit TheNeuralBit merged commit 60fe232 into apache:master Dec 5, 2020
@TheNeuralBit
Member

🎉 thanks again @nielsbasjes and @omarismail94!
