Skip to content
This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Conversation

@larroy
Copy link
Contributor

@larroy larroy commented Dec 13, 2019

Fixes #16753

@larroy
Copy link
Contributor Author

larroy commented Dec 13, 2019

@leezu

@leezu
Copy link
Contributor

leezu commented Dec 14, 2019

Thanks @larroy. The same update needs to be done for the other arches as the failure mode affects all arches. Either in this PR or separate PRs.

# The container is pinned for preventing CI failures on updates, swap below to use
# the upstream container
#FROM dockcross/linux-armv7
FROM mxnetci/dockcross-linux-armv7-pinned
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this container be given another tag besides latest https://hub.docker.com/r/mxnetci/dockcross-linux-armv7-pinned/tags

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you only specify the name mxnetci/dockcross-linux-armv7-pinned, instead of mxnetci/dockcross-linux-armv7-pinned:TAG, docker will always pull the version tagged as latest: mxnetci/dockcross-linux-armv7-pinned:latest.
But that version changes whenever someone pushes a new version of dockcross-linux-armv7-pinned to mxnetci. So it becomes impossible to test if an update of dockcross-linux-armv7-pinned breaks the CI. Instead all CI runs will be changed immediately.

If instead you add a tag TAG for the particular version of the mxnetci/dockcross-linux-armv7-pinned container at https://hub.docker.com/r/mxnetci/dockcross-linux-armv7-pinned/tags, our Dockerfiles here on Github can mention the tag mxnetci/dockcross-linux-armv7-pinned:TAG. Then, when pushing a new update of dockcross-linux-armv7-pinned to mxnetci and tagged as TAG2, the CI will continue to use the old version until a PR is opened to update to use TAG2 on github.

Copy link
Contributor Author

@larroy larroy Dec 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's the idea, that we push the container if it can be built, so we don't have to constantly update this file. Another job would push the container to this tag whenever it has been built successfully, so we don't need the :TAG here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then there is no need to use mxnetci. Then we can just directly use the dockcross repository.
You mentioned that in the past dockcross sometimes broke our build. I thought we used pinning to fix that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mxnetci is a fully automatically handled repository while MXNetcipinned is manually curated. In case there's a security breach, the pinned repo wouldn't be impacted. Since the pinned image usually is no longer available at dockcross since they don't have tagging, the image would then be lost if the image gets deleted by the automatic process. Thus, I'd recommend to stay on the separate pinned repo

Copy link
Contributor Author

@larroy larroy Dec 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pinned image needs to be maintained automatically. It's a good point with the credentials. We might consider having it in a private pipeline with a different repo to preserve those credentials. Edit: now that I think of it from a security perspective doesn't matter much since we are generating the mxnetci images from an unsafe environment.

@larroy
Copy link
Contributor Author

larroy commented Dec 16, 2019

@mxnet-label-bot add [pr-awaiting-review]

@lanking520 lanking520 added the pr-awaiting-review PR is waiting for code review label Dec 16, 2019
@leezu leezu self-requested a review December 17, 2019 03:54
Copy link
Contributor

@leezu leezu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I withdraw the approval until we reach consensus about if 1) we need to pin images 2) if we do need pinned images, how they should be maintained 3) where the pinned images should live.
In my understanding, if we maintain pinned images automatically, we don't need to pin in the first place and can rely on the upstream repo directly.

Copy link
Contributor

@leezu leezu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I withdraw the approval until we reach consensus about if 1) we need to pin images 2) if we do need pinned images, how they should be maintained 3) where the pinned images should live.
In my understanding, if we maintain pinned images automatically, we don't need to pin in the first place and can rely on the upstream repo directly.

@leezu leezu self-requested a review December 17, 2019 03:56
@larroy
Copy link
Contributor Author

larroy commented Dec 17, 2019

@leezu how do you suggest to proceed to reach consensus? Also wouldn't it be good to have a hotfix first for armv7 then reach consensus later?

@leezu
Copy link
Contributor

leezu commented Dec 17, 2019

Maybe consensus is the wrong word. Essentially I'm not clear what's the advantage of using mxnetci/dockcross-X without version tag and automating the updates to mxnetci compared to directly using the upstream dockcross.

@leezu
Copy link
Contributor

leezu commented Dec 17, 2019

Regarding Hotfix, armv7 is not sufficient, we need to update all arches. It's straightforward to update, ie. d3571d1 passes CI (part of #17031)

@larroy
Copy link
Contributor Author

larroy commented Dec 18, 2019

Well, why do you think it was pinned in the first place? It happened a few times that dockcross updating their containers broke the CI as we couldn't compile anymore. Otherwise it wouldn't be pinned. Cross compilation is very tricky, so if it starts failing at some point and people don't know how to fix it then the most likely outcome is that the job would be disabled.
]

@leezu
Copy link
Contributor

leezu commented Dec 18, 2019

@larroy fair enough, but if you don't pin it inside the code-base but rather by pushing the relevant container to the mxnetcipinned docker repository automatically, the same breakage will occur.
Just the way of intervention would be different (someone needs to roll-back the change in container on mxnetcipinned. That is very opaque for all bystanders, thus may lead to the same outcome is that a failing testcase would be disabled)
Or how would you suggest to avoid it?
My understanding is that the only way to avoid it would be to pin it inside the mxnet codebase as done currently?

@larroy
Copy link
Contributor Author

larroy commented Dec 18, 2019

One possible idea is to push the pinned container only if the build works. Or leave it pinned until we have the need to update and is done manually. What do you think is better? or you have a better idea?

@leezu
Copy link
Contributor

leezu commented Dec 19, 2019

Yes, I agree the update should be done automatically when it doesn't break the CI build.
Instead of replicating the checking logic in a separate system and using that to decide whether to push a new container to the mxnetcipinned repo (which then is immediately used by any CI run), we can use the source code to fix the container (via tag) and have a bot to open pull requests whenever the base container is updated.

Then the CI will test the pull request as it normal and it can be merged if it passes.

@leezu
Copy link
Contributor

leezu commented Dec 23, 2019

As part of #16753, the docker containers have been updated and switched to the upstream version.
#17151 tracks reintroducing pinning.

@larroy larroy closed this Jan 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

pr-awaiting-review PR is waiting for code review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fail to build using docker

4 participants