80 changes: 80 additions & 0 deletions Dockerfile
@@ -0,0 +1,80 @@
###############################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
###############################################################################

# Dockerfile for container with dev prerequisites


FROM ubuntu:20.04

ARG DEBIAN_FRONTEND=noninteractive

# Update and install common packages
RUN apt -q update \
    && apt install -y software-properties-common apt-utils apt-transport-https ca-certificates \
    && add-apt-repository -y ppa:deadsnakes/ppa \
    && apt-get -q install -y --no-install-recommends \
        curl \
        gnupg-agent \
        rsync \
        git \
        vim \
        locales \
        wget \
        time \
        openjdk-8-jdk \
        python3-setuptools \
        python3-pip \
        python3.5 \
        python3.6 \
        python3.7 \
        python2.7 \
        virtualenv \
        tox

# Setup to install docker
RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
RUN apt-key fingerprint 0EBFCD88
RUN add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
RUN apt-get install -y docker-ce docker-ce-cli containerd.io

# Set the locale
Contributor Author

@omarismail94 omarismail94 Nov 23, 2020

Switching the locale to UTF-8 because some tests (those that rely on Java's getBytes()) fail when the image's default locale is used.

I used the code from https://stackoverflow.com/a/28406007/14101188 to implement this.

Contributor

Cool fix.
Yet to me this sounds like a serious bug in these parts of Beam.
I would even consider setting the locale to something that is never used and fixing the code so that it always runs correctly.
For now: Let's just make sure the build works.

Member

Yes, it's our intention not to use any APIs that depend on the locale (e.g. getBytes() calls should specify the encoding). This seems like something one of our static analysis plugins could verify, and I'm a little surprised it's not happening already.
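
To make "specify the encoding" concrete, here is a minimal sketch. The class and method names are hypothetical and are not taken from Beam; the only real API used is `String.getBytes(Charset)` with `StandardCharsets.UTF_8`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Beam.
public class ExplicitCharsetExample {

    // Always encodes with UTF-8, independent of LANG/LC_ALL or -Dfile.encoding.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent; the escape keeps
        // this source file pure ASCII so the compiler's encoding does not matter.
        String text = "caf\u00e9";
        // Prints the signed byte values of the UTF-8 encoding.
        System.out.println(Arrays.toString(toUtf8(text)));
    }
}
```

With an explicit charset, a test comparing against a fixed byte array should behave the same in a container with or without a UTF-8 locale.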

Member

I just verified that errorprone will raise a compile error if we use APIs that depend on the locale. I changed an existing call to getBytes():

> Task :sdks:java:core:compileTestJava
/usr/local/google/home/bhulette/working_dir/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/util/ExposedByteArrayInputStreamTest.java:77: warning: [DefaultCharset] Implicit use of the platform default charset, which can result in differing behaviour between JVM executions or incorrect behavior if the encoding of the data source doesn't match expectations.
    assertArrayEquals("ello World!".getBytes(), ret);
                                            ^
    (see http://errorprone.info/bugpattern/DefaultCharset)
  Did you mean 'assertArrayEquals("ello World!".getBytes(UTF_8), ret);' or 'assertArrayEquals("ello World!".getBytes(Charset.defaultCharset()), ret);'?

We can't control whether one of our dependencies is doing this, though; maybe that's what's going on. We should use this opportunity to file a Jira about the specific tests that are sensitive to locale (if one doesn't exist already).

Contributor

Another possible source of this class of problem is generated code.

Member

I spoke with @omarismail94 about this offline. It looks like the issue is that we have getBytes(Charset.defaultCharset()) in many places, which errorprone allows (in fact, it's one of the suggested corrections in the error message).
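
As a rough illustration of why that suggested correction does not remove the locale dependency, the sketch below (hypothetical class and method names, not Beam code) checks that passing `Charset.defaultCharset()` yields the same bytes as the no-argument `getBytes()`. The call is explicit in form only; the result still follows the platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Beam.
public class DefaultCharsetExample {

    // True when the "explicit default" form produces the same bytes as the
    // implicit form, i.e. it is still tied to the platform default charset.
    static boolean sameAsImplicitDefault(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    // True only when the platform default happens to encode s like UTF-8 does.
    static boolean matchesUtf8(String s) {
        return Arrays.equals(
            s.getBytes(Charset.defaultCharset()),
            s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo";
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("same as implicit default: " + sameAsImplicitDefault(text));
        // This line may print false in a container without a UTF-8 locale.
        System.out.println("matches UTF-8: " + matchesUtf8(text));
    }
}
```

Running this in a container with and without the UTF-8 locale settings is one way to see whether a given JVM's default charset changes with the locale.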

Contributor Author

I just changed them all and set up a new PR: #13410

RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && \
    locale-gen
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

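# Set Python 3.6 as the default `python`.
# A shell alias defined in a RUN step does not persist into later layers or the running container, so link the interpreter instead.
RUN ln -sf /usr/bin/python3.6 /usr/local/bin/python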

# Install Go
RUN wget https://golang.org/dl/go1.15.5.linux-amd64.tar.gz && tar -C /usr/local -xzf go1.15.5.linux-amd64.tar.gz
# ENV PATH $PATH:/usr/local/go/bin
ENV GOROOT /usr/local/go
ENV PATH $PATH:$GOROOT/bin

# Set work directory
WORKDIR /workspaces/beam
ENV GOPATH /workspaces/beam/sdks/go/examples/.gogradle/project_gopath
RUN go get github.com/linkedin/goavro

# Install grpcio-tools mypy-protobuf for `python3 sdks/python/setup.py sdist` to work
RUN pip3 install grpcio-tools mypy-protobuf
5 changes: 5 additions & 0 deletions website/www/site/content/en/contribute/_index.md
@@ -94,6 +94,11 @@ $ go get github.com/linkedin/goavro

gLinux users should configure their machines for sudoless Docker.

Alternatively, you can use the [Dockerfile](https://github.com/apache/beam/blob/master/Dockerfile) to set up a container that meets the requirements above, and mount your clone of the Beam repo into the container. For example:
```shell script
docker run -it --network=host -v /var/run/docker.sock:/var/run/docker.sock --mount type=bind,source="$PWD",target=/workspaces/beam <IMAGE_ID> bash
```

### Connect With the Beam community

1. Consider subscribing to the [dev@ mailing list](/community/contact-us/), especially