Skip to content

Conversation

@dbnicholson
Copy link
Contributor

@dbnicholson dbnicholson commented Apr 18, 2022

Here are various fixes I put together when working with the build in a Docker container. I believe everything should work the same as before with the documented build_docker and run_docker make targets.


Dockerfile Outdated
Comment on lines 62 to 69
RUN make -C /tmp setup SDK=$ANDROIDSDK && \
rm -f /tmp/Makefile

# install python dependencies
COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt && \
rm -f /tmp/requirements.txt && \
useradd -lm kivy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flipping through commit messages

Interesting approach here, and I understand the argument for cache invalidation. However, my local testing confirms my assumption that the COPY itself will invalidate cache regardless of the state of the filesystem in the proceeding RUN.

If my understanding of the intention here is correct, I think this defeats the purpose of the changes to caching layer order, as well as the added complexity of rm? Since any change to Makefile will now result in cache invalidation really early on in the process.

I'm testing using the same version of Docker as our builder -- are things different on the version you're running @dbnicholson?

On another note:

Apart from that, can you think ofthere any consequences to having make setup run as root rather than kivy? @rtibbles

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the cache invalidation on unrelated Makefile changes is really unfortunate. The only real advantage I guess is that (for example) changing an asset doesn't cause you to reinstall the SDK.

My other thought here was to split out the SDK install commands to a script instead of having them embedded in the Makefile. Right now no matter where you put it in the Dockerfile, you need the Makefile in the build container before you can install the SDK. Whether that's by root or unprivileged and whether it's just the Makefile or the entire checkout being copied, you're going to reinstall the SDK. And that's really slow.

I haven't seen any issue using the SDK as installed by root. That's what happens in the Ubuntu VMs used in the Github workflows. They're not installing the SDK at all and are instead using the pre-installed one in /usr/local/lib/android.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's any issue with running as root inside the container, as @dbnicholson says, it is what happens inside Github Actions containers.

I had originally put as much as possible into the Makefile, in order to allow for easy invalidation of caches when the Makefile changes. It's a slightly dumb mechanism, but I've had more issues with cache not invalidating than with it invalidating too often, so I've pushed it the other way in my previous work here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I like that most of the logic is in the Makefile - provides a unified location to update scripts for both docker users and local builders. Puts in in a place where either can modify it and propagate changes to the other workflow.

Which was my main "argument" against the ENV definitions below.

Given that the cache would be invalidated with Makefile changes, I was questioning whether the layer rearrangement was worth it; I figure a change to Makefile is high on the "most likely to be updated" list. But your point about changing an asset is valid.

I'm not doing active dev here, but my impression is that this repo isn't meant to change very often at all, regularly building based off of updated Kolibri artifacts. Right now, those are copied in during build time. And moving setup back is probably the right move. Let's keep it!

I'm unsure what the intention behind the proceeding rms are, but I don't see any reason to remove them 🤷

# creates a volume for reuse between builds, holding p4a's android distro
docker create -it \
--mount type=volume,src=${P4A_CACHE},dst=${CONTAINER_HOME}/.local \
--mount type=volume,src=${GRADLEW_CACHE},dst=${GRADLEW_PATH} \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

63010a2

Without this, gradlew downloads gradle every time the container is run.

🙌

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did get really tired of watching that get downloaded.

I think this could actually be the entire ~/.gradle directory without an issue. Before I figured out the GRADLE_OPTS environment variable, I was messing with ~/.gradle/gradle.properties. Then I had a conflict with managing the volume vs pre-seeding part of it.

@DXCanas
Copy link
Member

DXCanas commented Apr 18, 2022

Submitted so quickly I forgot to add a summary to my review:
Overall, these are all really neat changes.

Thanks a lot for the thought put in here. Deferred to others for some, and requested changes based primarily on the cache busting concerns and "multiple sources of truth" scenario.

@dbnicholson
Copy link
Contributor Author

I started rebasing this on #110 and while poking at this again and found I could vastly improve this setup so long as the p4a in use has kivy/python-for-android#2582. Then you don't need the kivy user in the container at all. I made a PR in learningequality/python-for-android#1 to do that.

@rtibbles
Copy link
Member

I have merged that PR, @dbnicholson - will fix the issue you noted in my PR, and update the commit hash to include your update.

Without this, the host python built by p4a doesn't have the ssl module
and the pip using it falls over trying to download from https URLs. See
kivy/python-for-android#2178.
This actually has no effect on the build. `ARG` does not take defaults
from environment variables unless they're previously defined in the
Dockerfile using `ENV`. Regardless, neither the `pip` install nor the
`make setup` commands make use of the `ARCH` environment variable, so
even if it was set correctly it wouldn't matter.
This doesn't do anything useful since the changes to the environment
only last as long as the process started by `RUN` is active.
Furthermore, the process is `sh` and not `bash`, where `source` is not
actually a builtin. If you look at the output, you'll see it produces an
error that's ignored.
requirements.txt doesn't exist in the container until the entire
directory is copied later. Copy it separately to a temporary file since
that also means docker will use a cached layer unless requirements.txt
has changed and not any file in the checkout.
Like the github Ubuntu VMs, make a `ndk-bundle` symlink pointing to the
default NDK version. This makes it simpler to set the `ANDROIDNDK`
environment variable since the NDK version doesn't need to be included.
There's no reason the SDK needs to be installed as the unprivileged
user. None of the debian or python packages are. This also sets the
environment variables expected by the Makefile and then there's no need
for using the `.env` file or passing variables related to it when
running docker.
This is by far the longest step in the image build. Since it doesn't
depend on the python dependencies, move it earlier since it's much more
likely that the python-for-android version in `requirements.txt` will
change before the SDK does.
It's kinda neat that you can just pass the make target as the command to
the container, but it makes it more difficult to use the container in
other ways since you have to override the entrypoint.
If no arguments are passed you'll get the image's default command, but
this allows running something different with the rest of the setup.
The need to copy things in and out of the container so it can run as the
real kivy user is a burden. In fact, the only reason you need to do this
is because p4a and other tools store intermediate outputs in the user's
home directory instead of in the current directory.

Instead, we can fake that by running the container as the current user
and setting the HOME environment variable to a writable location in the
container. Furthermore, we can mount a volume at this path and cache all
of those outputs.

Once the container is running as the current user, usage is much simpler
as we don't need to worry about ownership of files. The source directory
can be mounted directly into the container and made the current working
directory. Then the container environment behaves almost exactly like
running the commands directly in the current directory. We don't need a
user inside the container or have to copy the source tree into the
image.

This depends on p4a passing through the HOME environment variable during
builds. See kivy/python-for-android#2582.
Copy link
Member

@DXCanas DXCanas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the functional changes I had requested are dismissed!

@rtibbles rtibbles merged commit e37bff0 into learningequality:develop Apr 20, 2022
@dbnicholson dbnicholson deleted the docker-fixes branch April 20, 2022 19:40
@dbnicholson
Copy link
Contributor Author

Heh, you merged so fast I didn't even have time to describe something I changed :)

The last commit is the real reason I set off to do all of this - run entirely as the current user without any copying in or out of the image and without embedding the source into the image. In my testing it works great, but that's not what got reviewed before. I'm happy to switch that back, but I do think it's better and more robust this way.

@rtibbles
Copy link
Member

I read the code (quickly) before I merged - and those changes made sense. I think given that P4A does have this implicit dependency on the user HOME dir, it makes a lot of sense to properly alias that, rather than trying to workaround it by moving stuff in and out of the container. Not embedding the source into the image also seems preferable.

If this somehow breaks something later, then I'll complain!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants