Skip to content

Support for exporting nydus compression type#2581

Merged
tonistiigi merged 6 commits into
moby:masterfrom
imeoer:nydus-compression-type
Nov 8, 2022
Merged

Support for exporting nydus compression type#2581
tonistiigi merged 6 commits into
moby:masterfrom
imeoer:nydus-compression-type

Conversation

@imeoer
Copy link
Copy Markdown
Contributor

@imeoer imeoer commented Jan 26, 2022

Related issue: #2046

Nydus image is a container accelerated image format provided by the Dragonfly image-service project, which offers the ability to pull image data on demand, without waiting for the entire image pull to complete and then start the container. It has been put in production usage and shown vast improvements over the OCI v1 image format in terms of container launching speed, image space, and network bandwidth efficiency, as well as data integrity. Nydus image can be flexibly configured as a FUSE-based user-space filesystem or in-kernel FS with nydus daemon in user-space:

  • Lightweight integration with VM-based containers runtime like KataContainers. KataContainers is supported Nydus as a native image acceleration solution.
  • Nydus closely cooperates with Linux in-kernel disk filesystem containers' rootfs can directly be set up by EROFS over Fscache with lazy pulling capability. The corresponding changes had been merged into Linux kernel since v5.19.

Nydus has provided a conversion tool Nydusify for converting OCIv1 image to Nydus image and integrated into Harbor Acceld as a conversion driver, which assumes that the OCI image is already available in the registry, but a better way would be to build the Nydus images directly from the build system instead of using the conversion tool, which would increase the speed of the image export, so we experimentally integrated the Nydus export in Buildkit.

image

A manifest example of nydus image can be found here. Unlike other compression formats (gzip, zstd, eStargz, etc.) in OCI image, nydus is divided into two types of layer, blob and bootstrap, where blob serves as the data and metadata for all files of each layer of the image, and bootstrap serves as the metadata of the whole image, the bootstrap is equivalent to the view of the whole image filesystem after all layers overlay, nydus use this layer form to reduce the number of HTTP request and FUSE overlay's overhead on preparing the container's rootfs. For example, for an OCI image with 3 layers, the corresponding nydus image is 4 layers (3 layers of blob + 1 layer of bootstrap).

The converter package in nydus-snapshotter do the actual layer compression, this PR imports the package to implement the export of nydus compression type.

@imeoer imeoer marked this pull request as draft January 26, 2022 10:54
@imeoer imeoer force-pushed the nydus-compression-type branch 4 times, most recently from 6e3e5de to 6dbd90a Compare January 27, 2022 08:07
@imeoer imeoer marked this pull request as ready for review January 27, 2022 08:08
@imeoer imeoer force-pushed the nydus-compression-type branch from 6dbd90a to 1cac153 Compare January 28, 2022 01:48
@imeoer imeoer force-pushed the nydus-compression-type branch from 1cac153 to 73d370b Compare February 7, 2022 06:42
@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Feb 7, 2022

@ktock @AkihiroSuda PTAL

Comment thread cache/converter.go Outdated
Comment thread cache/nydus.go Outdated
Comment thread cache/blobs.go Outdated
@ktock
Copy link
Copy Markdown
Collaborator

ktock commented Feb 8, 2022

BuildKit doesn't seem to be able to use the exported nydus-compressed image and cache.
e.g. the second build in the following with inline cache panics.

# mkdir -p /tmp/ctx
cat <<EOF > /tmp/ctx/Dockerfile
FROM registry2-buildkit:5000/ubuntu:20.04
RUN echo hello > /hello
EOF
# buildctl build --progress=plain --frontend=dockerfile.v0 --local context=/tmp/ctx --local dockerfile=/tmp/ctx \
                 --output type=image,name=registry2-buildkit:5000/ubuntu:20.04-hello,push=true,compression=nydus,oci-mediatypes=true \
                 --export-cache type=inline
# buildctl build --progress=plain --frontend=dockerfile.v0 --local context=/tmp/ctx --local dockerfile=/tmp/ctx \
                 --output type=image,name=registry2-buildkit:5000/ubuntu:20.04-hello-zstd,push=true,compression=zstd,oci-mediatypes=true \
                 --import-cache type=registry,ref=registry2-buildkit:5000/ubuntu:20.04-hello
...
panic: runtime error: index out of range [4] with length 4

goroutine 748 [running]:
github.com/moby/buildkit/cache/remotecache.(*contentCacheImporter).importInlineCache.func1.1()
	/go/src/github.com/moby/buildkit/cache/remotecache/import.go:188 +0xda5
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/go/src/github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:57 +0x67
created by golang.org/x/sync/errgroup.(*Group).Go
	/go/src/github.com/moby/buildkit/vendor/golang.org/x/sync/errgroup/errgroup.go:54 +0x92

@imeoer imeoer force-pushed the nydus-compression-type branch from 73d370b to bd9ae80 Compare February 8, 2022 10:15
@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Feb 8, 2022

BuildKit doesn't seem to be able to use the exported nydus-compressed image and cache. e.g. the second build in the following with inline cache panics.

@ktock Thanks for the testing, the panic indeed occurs.

This seems to be a tricky problem. Nydus image consists of multiple blobs and a bootstrap (one more layer than the cache records), and the blob does not correspond to the cache record one by one (probably fewer layers than the cache record).

Is it possible to consider not supporting the inline cache export for nydus image? Will submit a new commit to do this.

@imeoer imeoer changed the title Support for exporting nydus compression type [WIP] Support for exporting nydus compression type Feb 15, 2022
@imeoer imeoer force-pushed the nydus-compression-type branch 5 times, most recently from 798a223 to a9362be Compare February 17, 2022 04:03
@imeoer imeoer changed the title [WIP] Support for exporting nydus compression type Support for exporting nydus compression type Feb 17, 2022
@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Feb 17, 2022

@ktock rebased code to resolve the conflicts with the PR and add workaround for inline export on the commit. PTAL.

@imeoer imeoer force-pushed the nydus-compression-type branch 4 times, most recently from 3b53729 to 8c2ce98 Compare February 21, 2022 02:49
Copy link
Copy Markdown
Collaborator

@ktock ktock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we seem to have 2 implementation of computeBlobChain(computeBlobChain and computeNydusBlobChain).
Some concerns about maintainability:

  • How can guarantee/keep the compatibility of these implementations?
  • Will we need 2 duplicated patches on these implementations for 1 futural change?

Comment thread cache/refs.go Outdated
Comment thread exporter/containerimage/writer.go Outdated
Comment thread cache/manager_test.go Outdated
Comment thread docs/nydus.md Outdated
@imeoer imeoer force-pushed the nydus-compression-type branch from 8c2ce98 to 6b2f9e8 Compare March 2, 2022 09:43
@imeoer imeoer changed the title Support for exporting nydus compression type [WIP] Support for exporting nydus compression type Mar 3, 2022
Comment thread Dockerfile Outdated
ARG TARGETOS
ARG TARGETARCH
SHELL ["/bin/bash", "-c"]
RUN set -e; [[ "$TARGETOS" == "linux" ]] && [[ "$TARGETARCH" == "amd64" || "$TARGETARCH" == "arm64" ]] || exit 1;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely to break the integration-tests-base on other platforms (arm, s390x, ppc64le, riscv64).

Can we just build nydus in this stage, or could you upload binaries for all the archs?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added os/arch check, since nydus has only been tested in production on linux/amd64 and linux/arm64 machines, it is currently only guaranteed to be available on these two platforms, in the future we will add support for other linux archs or macOS, etc, thanks.

Need to depend on some rust components if build in this stage, as said here, how about letting us remove the arch check and keep available on amd64 and arm64 first? nydus will release more arches in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will release more other platforms for making sure the integration tests are available as much as possible.

Comment thread cache/manager_test.go Outdated
Comment on lines +1555 to +1561
// Currently the nydus compression type only ensures the consistency of archive/tar
// implementation in Go, so need to skip the system tar command case for it.
if k == 1 && (i == compression.Nydus || j == compression.Nydus) {
continue
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that this breaks the cache if a conversion from a layer which is tar-ed by non-go implementation into nydus then later another conversion happens from that nydus blob to another compression.

Copy link
Copy Markdown
Contributor Author

@imeoer imeoer Aug 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply! I want to explain that nydus is not an archive format, if we convert a system tar to nydus and then convert nydus back to tar, although nydus can guarantee filesystem consistency, but it can’t do a complete restore of the system tar, it will change its file entry path, for example, ./foo1 -> foo1 (it lost ./ for file path in the tar header), but nydus makes sure that the decompressed tar is as consistent as possible with the hash of OCI tar format (in go implementation), and does strictly guarantee in filesystem consistency. So can we consider skipping the hash check test case between nydus and system tar first?

@ktock
Copy link
Copy Markdown
Collaborator

ktock commented Aug 17, 2022

Another idea for moving this integration forward: #3037

@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Aug 17, 2022

Another idea for moving this integration forward: #3037

@ktock Good idea, this proposal might make cleaner for the job of appending an extra bootstrap layer to nydus image, is it a blocker for this PR? Does the cache changes in this PR still available? Maybe we can move this change forward first, and then use the exporter plugin to refactor.

@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Aug 24, 2022

These are the points that I think are different between nydus and other compression types involved in this PR. Hope we can continue to give some opinions on whether these will block moving on. cc @AkihiroSuda @ktock

  1. Nydus compression type can't be mixed with other compression types in the same image, so if cache record is other/nydus layer, but the target is another compression type, need to convert forcibly (force-compression=true).

  2. Need to append an extra bootstrap layer to the manifest of nydus image, this layer represents the whole metadata of filesystem view for the entire image, this requires skipping the layer when importing inline cache.

  3. Nydus is not an archive format, but a filesystem format, if we convert a system tar to nydus and then convert nydus back to tar, although nydus can guarantee filesystem consistency, but it can’t do a complete restore of the system tar, it will change its file entry path, for example, ./foo1 -> foo1 (it lost ./ for file path in the tar header), but nydus makes sure that the decompressed tar is as consistent as possible with the hash of OCI tar format (in go implementation), and does strictly guarantee in filesystem consistency.

@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Aug 30, 2022

ping @AkihiroSuda @ktock @tonistiigi ~

@ktock
Copy link
Copy Markdown
Collaborator

ktock commented Aug 30, 2022

Yes, I agree that they are blockers. And another consideration is how can we maintain it (e.g. how can we guarantee the emitted nydus isn't broken and maintain all nydus-related exceptions). #3037 is another approach for solving these issues. #2045 looks like integrates nydus as an exporter implementation so I thought that #3037 helps that approach moving forward.

@bergwolf
Copy link
Copy Markdown

bergwolf commented Sep 2, 2022

@AkihiroSuda @ktock @imeoer @bergwolf had a meeting discussing this PR and reached the following consensus:

  1. We can live with the current three Nydus exceptions in the PR, with proper unit tests to help validate them.
  2. We will add go -tags to mark out nydus code to allow non-nydus build (which is the default build mode).
  3. Meanwhile, @ktock and @imeoer will work on an exporter plugin framework to support compression types like Nydus.
  4. The Nydus team commit to helping maintain Nydus functionality in buildkit, including incremental improvements and bugfixes.

@tonistiigi what do you think of the above? If it sounds ok to you, we'll proceed with the change.

@AkihiroSuda
Copy link
Copy Markdown
Member

(Follow-up: the go tag is the short-term solution; the long term solution is the exporter plugin)

@tonistiigi
Copy link
Copy Markdown
Member

what do you think of the above?

sgtm. Make sure that as much code as possible is in the gotag files and that there aren't many exceptions in the shared files.

@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Sep 7, 2022

cc @AkihiroSuda @ktock @tonistiigi @bergwolf

Hi, we have moved the nydus codes to the gotag files, please take a look, the following points need to be clarified:

  1. the nydus features and related tests will only be run when -tags=nydus is enabled. the unit test for system tar handling by the nydus compression type will still be skipped in manager_test.go, and this part of the code can be removed when migrating to the future exporter plugin.

  2. because the nydus image has an additional bootstrap layer, in order to avoid exception changes to the remote cache, nydus does not support cache import/export (maybe we can support it in exporter plugin), the relevant limitations have been added to nydus.md.

Comment thread cache/blobs.go Outdated
Comment thread cache/blobs.go Outdated
Comment on lines +115 to +117
case compression.Nydus:
compressorFunc, finalize = compressNydus(ctx, comp)
mediaType = nydusify.MediaTypeNydusBlob
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we introduce -tags=nydus, I think we should move all if comp.Type == compression.Nydus and case compression.Nydus into *_nydus.go file.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now if -tags=nydus not to be enabled, the nydus related condition branch will return errdefs.ErrNotImplemented in cache/not_nydus.go. Seems it's difficult to put case and if code paths into a separate go file, any suggestions for this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we still keep the constant compression.Nydus ? If we try to put it into *_nydus.go, we need to reimplement some methods, I think the maintainability can become poor.

Copy link
Copy Markdown
Contributor Author

@imeoer imeoer Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ktock Adjusted some methods for reducing nydusify package dependency, PTAL.

If we want to completely remove the compression.Nydus dependency when -tags=nydus not be enabled, I think we could implement an interface like this for each compression type:

type Compression interface {
	String() string
	Compress(context.Context, compression.Config) (...)
	Decompress(context.Context, ocispecs.Descriptor) (...)
	NeedsConversion(context.Context, content.Store, ocispecs.Descriptor) (bool, error)
	...
}

The advantage of this is that we can remove switch case code path, and ensure that don't miss any compression.Type handling, WDYT? If possible, I can submit a separate PR to implement it first.

Copy link
Copy Markdown
Contributor Author

@imeoer imeoer Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage of this is that we can remove switch case code path, and ensure that don't miss any compression.Type handling, WDYT? If possible, I can submit a separate PR to implement it first.

@ktock @AkihiroSuda Introduced a new compression.Compression interface in #3136, PTAL.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AkihiroSuda @ktock Fixed, PTAL, thanks!

@bergwolf
Copy link
Copy Markdown

@AkihiroSuda @ktock @tonistiigi With #3136 merged, this PR is pretty clean in terms of code invasion in the general code path. Could you give it another spin and see if it is suitable for merging? As discussed earlier, we'll continue improving the nydus functionality in buildkit and we look forward to putting it into large-scale production use.

Nydus image is a container accelerated image format provided by the
Dragonfly image-service project, which offers the ability to pull
image data on demand, without waiting for the entire image pull to
complete and then start the container. It has been put in production
usage and shown vast improvements over the old OCI image format in
terms of container launching speed, image space, and network bandwidth
efficiency, as well as data integrity. Nydus image can be flexibly
configured as a FUSE-based user-space filesystem or in-kernel
EROFS (from Linux kernel v5.16) with Nydus daemon in user-space,
integrating with VM-based container runtime like KataContainers
is much easier.

Nydus has provided a conversion tool Nydusify for converting OCIv1
image to Nydus image and integrated into Harbor Acceld as a conversion
driver, which assumes that the OCI image is already available in the
registry, but a better way would be to build the Nydus images directly
from the build system instead of using the conversion tool, which would
increase the speed of the image export, so we experimentally integrated
the Nydus export in Buildkit.

Unlike other compression formats (gzip, estargz, etc.) in OCI image,
nydus is divided into two types of layer, blob, and bootstrap, where
blob serves as the data part of each layer of the image, and bootstrap
serves as the metadata of the whole image, the bootstrap is equivalent
to the view of the whole image filesystem after all layers overlay. For
example, for an OCI image with 3 layers, the corresponding nydus image
is 4 layers (3 layers of blob + 1 layer of bootstrap).

The nydus-snapshotter project provides a package to do the actual layer
compression, this commit imports the package to implement the export of
nydus compression type.

Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
For nydus image, appending an extra nydus bootstrap layer
to the manifest, this layer represents the whole metadata
of filesystem view for the entire image.

Nydus uses this extra boostrap layer to reduce the number
of HTTP request and FUSE overlay's overhead on preparing
the container's rootfs.

For now we just support this in image type exporter.

Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
Comment thread cache/not_nydus.go
Comment thread docs/nydus.md Outdated
Comment thread util/compression/not_nydus.go
Comment thread cache/not_nydus.go Outdated
Comment thread cache/nydus.go Outdated
Comment thread exporter/containerimage/writer.go Outdated
Introduce nydus image and show that how to export nydus image, and
add known limitations.

Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
Signed-off-by: Yan Song <imeoer@linux.alibaba.com>
@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Nov 3, 2022

@ktock @AkihiroSuda Updated, please help to take a glance again, thanks!

@imeoer
Copy link
Copy Markdown
Contributor Author

imeoer commented Nov 7, 2022

Thanks for all! cc @tonistiigi Please help to take a final look, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants