Containerd separates the concepts of a content store, which holds image data in compressed form for distribution, and snapshots, which hold image data in a form that containers can use. Docker, for example, doesn't duplicate image data this way.
The question is whether buildkit should also introduce this concept or hide it behind the containerd snapshot implementation. Currently, it is doing the latter. A buildkit snapshot internally holds a reference to the content store blob, and buildctl du shows the sum of their sizes.
This makes it possible to define a snapshot implementation that doesn't need to duplicate the data, provided the implementation is smart enough to do so while remaining 100% stable.
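To make the current arrangement concrete, here is a minimal Go sketch of a snapshot record that keeps a reference to its content store blob and reports usage as the sum of both sizes. All names here (`blobRef`, `snapshotRecord`, `usage`) are hypothetical and only illustrate the idea; they are not buildkit's or containerd's actual types.

```go
package main

import "fmt"

// blobRef is a hypothetical reference to a compressed blob in the content store.
type blobRef struct {
	Digest string
	Size   int64 // compressed size in bytes
}

// snapshotRecord is a hypothetical snapshot entry that hides the blob
// behind the snapshot, as described above.
type snapshotRecord struct {
	ID           string
	UnpackedSize int64    // size of the unpacked snapshot data in bytes
	Blob         *blobRef // nil when no blob is kept for this snapshot
}

// usage returns the snapshot size plus the size of the referenced blob, if any.
func usage(r snapshotRecord) int64 {
	total := r.UnpackedSize
	if r.Blob != nil {
		total += r.Blob.Size
	}
	return total
}

func main() {
	records := []snapshotRecord{
		{ID: "with-blob", UnpackedSize: 100, Blob: &blobRef{Digest: "sha256:example", Size: 40}},
		{ID: "no-blob", UnpackedSize: 7},
	}
	for _, r := range records {
		fmt.Printf("%s\t%d\n", r.ID, usage(r))
	}
}
```

A snapshot implementation that avoids duplicating data would correspond to the `Blob == nil` case in this sketch, where only one copy contributes to the reported size.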
Things get more complicated when preparing for cache import and export. These features should work independently of the snapshotter implementation and the exporter type, and implementing them would likely require something very similar to a content store. This is also how workers should share data in a distributed workflow.
Thoughts?