Skip to content

checkpoint: resolve symlink for external bind mount#2902

Merged
mrunalp merged 1 commit intoopencontainers:masterfrom
liusdu:checkpoint
Jun 25, 2021
Merged

checkpoint: resolve symlink for external bind mount#2902
mrunalp merged 1 commit intoopencontainers:masterfrom
liusdu:checkpoint

Conversation

@liusdu
Copy link
Copy Markdown

@liusdu liusdu commented Apr 13, 2021

runc resolves symlink before doing bind mount. So
we should save original path while formatting CriuReq for
checkpoint.

Signed-off-by: Liu Hua weldonliu@tencent.com

@kolyshkin
Copy link
Copy Markdown
Contributor

@liusdu thank you for the patch! Nice that you're using checkpoint/restore!

Can you add a test case (to e.g. tests/integration/checkpoint.bats?)

@kolyshkin
Copy link
Copy Markdown
Contributor

@liusdu ^^^

@liusdu
Copy link
Copy Markdown
Author

liusdu commented Apr 17, 2021

@liusdu ^^^

@kolyshkin Sorry for this late, checkpoint/restore of runc works very well for me.
The test cases is added, please take a look~~

Comment thread tests/integration/checkpoint.bats Outdated
@liusdu
Copy link
Copy Markdown
Author

liusdu commented Apr 21, 2021

@kolyshkin

Comment thread tests/integration/checkpoint.bats Outdated
options: ["bind"]
}]'
simple_cr

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra empty line

Comment thread libcontainer/container_linux.go Outdated

func (c *linuxContainer) addCriuDumpMount(req *criurpc.CriuReq, m *configs.Mount) {
mountDest := strings.TrimPrefix(m.Destination, c.config.Rootfs)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra empty line

@kolyshkin
Copy link
Copy Markdown
Contributor

LGTM except for a couple of nits. @adrianreber PTAL

@adrianreber
Copy link
Copy Markdown
Contributor

@liusdu Can you give an example where you saw this error. It looks correct, but I am curious when this can happen.

@liusdu
Copy link
Copy Markdown
Author

liusdu commented Apr 22, 2021

@adrianreber Look at following Dockerfile

FROM openjdk:8-jre
RUN wget  -qO-  'https://www.apache.org/dyn/closer.cgi?action=download&filename=flink/flink-1.12.2/flink-1.12.2-bin-scala_2.11.tgz' |tar -zvx -C /opt
RUN  ln -s  /opt/flink-1.12.2  /opt/flink

If we pass configuration as volume to container, we can meet this type of error.

docker run -d -v ${FLINK_CONF}:/opt/flink/conf ${flink_image}`

@liusdu
Copy link
Copy Markdown
Author

liusdu commented Apr 22, 2021

@kolyshkin thanks for you review~

Comment thread tests/integration/checkpoint.bats
runc resolves symlink before doing bind mount. So
we should save original path while formatting CriuReq for
checkpoint.

Signed-off-by: Liu Hua <weldonliu@tencent.com>
@liusdu
Copy link
Copy Markdown
Author

liusdu commented Apr 29, 2021

ping @kolyshkin @adrianreber

Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kolyshkin
Copy link
Copy Markdown
Contributor

close/reopen to re-kick CI

@kolyshkin kolyshkin closed this May 3, 2021
@kolyshkin kolyshkin reopened this May 3, 2021
@kolyshkin kolyshkin mentioned this pull request May 3, 2021
7 tasks
@liusdu liusdu closed this May 6, 2021
@liusdu liusdu reopened this May 6, 2021
Comment thread tests/integration/checkpoint.bats
@haircommander
Copy link
Copy Markdown
Contributor

LGTM

func (c *linuxContainer) addCriuDumpMount(req *criurpc.CriuReq, m *configs.Mount) {
mountDest := strings.TrimPrefix(m.Destination, c.config.Rootfs)
if dest, err := securejoin.SecureJoin(c.config.Rootfs, mountDest); err == nil {
mountDest = dest[len(c.config.Rootfs):]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder -- what happens if the mount is moved from inside the container (with a simple rename) after the filesystem has been set up? Is this also something CRIU doesn't support / really doesn't like?

Copy link
Copy Markdown
Author

@liusdu liusdu May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @cyphar Since CRIU does not know how to restore bind mounts(master in host namespace), CRIU need this type of mount marked as external(via --external). runc formats --external parameter according to spec

So for situation you mentioned, CRIU does not work properly. btw, should we export --external for runc. Then users can mark external resource themselves?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder -- what happens if the mount is moved from inside the container (with a simple rename) after the filesystem has been set up? Is this also something CRIU doesn't support / really doesn't like?

I do not understand this question. But initially I would have said that CRIU can handle it. Now @liusdu says it is not possible. I probably misunderstood the question.

So for situation you mentioned, CRIU does not work properly. btw, should we export --external for runc. Then users can mark external resource themselves?

For Kubernetes, restoring containers in different Pods, I will need CRIU's --external to be available at the runc level.

Currently runc informs CRIU about all external mounts but it is not possible to change the location of the external mount between checkpoint and restore. For Kubernetes I will need to change the location of an external mount. The most obvious bind mount currently is the location of the secrets directory which has the Pod UID in its container external path and I need to tell CRIU/runc that that directory should be mounted from some other location.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @cyphar ..

@liusdu
Copy link
Copy Markdown
Author

liusdu commented Jun 21, 2021

@haircommander can you approve this pr?

@haircommander
Copy link
Copy Markdown
Contributor

It LGTM, but I'm not a maintainer so my approval wouldn't help unfortunately
@kolyshkin @mrunalp @AkihiroSuda PTAL

@kolyshkin kolyshkin added this to the 1.1.0 milestone Jun 21, 2021
Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kolyshkin
Copy link
Copy Markdown
Contributor

We're in the middle of 1.0.0 GA release. Once it's done, I think we can merge this.

Copy link
Copy Markdown
Contributor

@mrunalp mrunalp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mrunalp mrunalp merged commit 1079288 into opencontainers:master Jun 25, 2021
@kolyshkin
Copy link
Copy Markdown
Contributor

There's something wrong with CI :(

@kolyshkin
Copy link
Copy Markdown
Contributor

Might be caused by #3029. @liusdu can you take a look? The failed CI can be seen from the Actions tab, https://github.com/opencontainers/runc/actions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants