Skip to content

Conversation

@mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented Sep 16, 2017

This teaches copy.Image to handle manifest lists, instead of failing, by choosing an image appropriate for the current system, and copying only that one.

Ultimately we want to copy the whole manifest list; this does not yet do that.

NOTE: This is an alternative to #344, and should have a similar effect for Linux users. macOS users trying to access Linux images will instead start seeing errors like

Error choosing an image from manifest list docker://busybox:latest: no image found in manifest list for architecture amd64, OS darwin 

unless they explicitly make a types.SystemContext{OSChoice:"linux"} override. This may be fine for some uses, but the advertised use of skopeo copy for copying between docker/distribution registries suffers (until we implement a copy of the whole manifest list).

types.ImageSource is modified to understand a set of images distinguished by an optional instanceDigest; (as opposed to creating new types.ImageSource instances for each image instance, this allows sharing state, e.g. docker load, or an unpacked archive, for all of the instances). Then copy.Image is split into a per-manifest-list initialization, and a per-image-instance copy operation, so the per-manifest-list part can choose a single instance if necessary.

NOTE: One of the implications of this is that signature verification now happens only for the individual instances; manifest list signatures are not required and not processed at all.

The overall diff is probably difficult to review, please see the individual commits, and read the individual commit messages messages for more detailed discussion:

  • Fix an incorrect variable reference in storageImageSource.GetSignatures
  • Move the core logic of types.Image.IsMultiImage to c/i/manifest
  • Remove types.Image.IsMultiImage
  • Do the isMultiImage decision already on an UnparsedImage, not an Image
  • Rename copy.imageCopier to copier
  • Split copy.imageCopier from copy.copier
  • Use copier and imageCopier instead of parameters and variables
  • Add copier.Printf
  • FIXME Replace GetTargetManifest by GetManifest(instanceDigest)
  • FIXME: Add a instanceDigest parameter to GetSignatures
  • Support instanceDigest in types.UnparsedImage and types.Image
  • Do not Close the ImageSource in UnparsedImage/Image
  • Split copier.copyOneImage from copy.Image
  • Finally, if the copy.Image source is a manifest list, copy a single instance
  • Add architecture/OS overrides to types.SystemContext
  • Revert "Hotfix: Do not fetch manifest lists from Docker registries"

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 16, 2017

@runcom RFC. I still need to update/add a few tests.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 16, 2017

BTW as much as types.ImageReference.NewImage is convenient, what it does for manifest lists (.Manifest(), and the manifest digest, are the manifest list, but everything else is obtained by parsing a selected manifest instance, and notably .UpdatedImage() creates a new manifest from the instance, not for the list) is pretty likely to be incorrect or at least very surprising for most callers.

I’m very tempted to break the API, even if just to rename it to NewImageForDefaultInstance or NewImageWithWeirdManifestListHandling, just to force callers to think this through.

But I think even better would be eliminating the .NewImage() method entirely — it originally exists for the docker.Image and storage.storageImage overrides, which can be done better in other ways — and replacing all of those mostly copy&pasted implementations with a single helper, perhaps image.FromReference or something like that, and then we can think about handling manifests lists in there (maybe just by failing when the image is a manifest list and the caller did not expect that).

(This is all obviously for a separate PR, just noting that this PR highlights how NewImage is a bad fit.)

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 16, 2017

See containers/skopeo#420 for the corresponding skopeo update.

@runcom
Copy link
Member

runcom commented Sep 16, 2017

@cyphar @hallyn PTAL

@runcom
Copy link
Member

runcom commented Sep 16, 2017

NOTE: This is an alternative to #344, and should have a similar effect for Linux users. macOS users trying to access Linux images will instead start seeing errors like

I'm fine with this, macOS users now have to choose arch/os and that's fine because skopeo allows for this kind of fine-grained operations. So +1 on this and the flags you added in skopeo PR.

Ultimately we want to copy the whole manifest list; this does not yet do that.

assuming you're talking about destinations like oci which supports manifest index/list, I'm fine with this either, let's tackle that problem later (unless some contributor want to take it, hint hint) (for the record CRI-O isn't affected by this.)

NOTE: One of the implications of this is that signature verification now happens only for the individual instances; manifest list signatures are not required and not processed at all.

Good. That seems ok as a start.

@mtrmac overall the approach looks good to me. I'll make a detailed code review once fully ready

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 18, 2017

NOTE: This is an alternative to #344, and should have a similar effect for Linux users. macOS users trying to access Linux images will instead start seeing errors like

I'm fine with this, macOS users now have to choose arch/os and that's fine because skopeo allows for this kind of fine-grained operations.

Well, we used to recommend (skopeo copy docker://… docker://…) and it would mostly Just Work; now users would be forced to add --override-os, and after copying manifest lists is implemented, they may want to remove --override-os again.

OTOH quite a few of the Hub images now have 7 or so variants, and it’s pretty likely that most users don’t actually want to always copy of all of them. So forcing users to make an explicit choice might not be so bad.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 18, 2017

@mtrmac overall the approach looks good to me. I'll make a detailed code review once fully ready

Done (added a few noddy tests to dir: and containers-storage, and a test for chooseDigestFromManifestList).

@mtrmac mtrmac changed the title RFC: Support copying a single image from manifest lists Support copying a single image from manifest lists Sep 18, 2017
@mtrmac
Copy link
Collaborator Author

mtrmac commented Sep 27, 2017

@runcom PTAL

@cyphar
Copy link
Contributor

cyphar commented Oct 3, 2017

I will take a look over this when I get a chance (this week hopefully).

(Entirely unrelated to the rest of this patch set.)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We will be removing it from types.Image altogether; and c/i/manifest
is a better place for hard-coding MIME types anyway.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Instead, have copy.Image call .Manifest() and use manifest.MIMETypeIsMultiImage
directly.

This is not locally an improvement; but we want to move the IsMultiImage check
to work on types.UnparsedImage, and because types.Image is a superset of
types.UnparsedImage, for consistency we would have to implement a
completely useless sourcedImage.IsMultiImage which works with the cached data.

Instead, let's just drop the method from the types.Image interface and work
with the MIME type directly.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We don't actually need to parse the image to make this decision.

Also, we are going to do signature checking on the individual images in the list
in the future, so the list/non-list decision needs to happen before
signature checking.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The copier name is not that great, but we will want separate objects
for a per-manifest-list state (e.g. information about blobs, which
may be shared) and per-image-instace state (e.g. manifest conversions).
For lack of better ideas, copier will be the per-manifest-list one,
and imageCopier the per-image-instance one.

For now this commit only renames the old copier object, splitting
it into copier/imageCopier will happen later.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Splits state related to an individual image (i.e. individual
manifest and its edits) from state shared across images in a possible
manifest list.

Does not change anything else about the code structure; cleanups will come later.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Make more functinos methods of copier/imageCopier, and use fields
of these objects instead of local variables in copy.Image.

This primarily makes it clear which values are associated with which
concepts and state, and simplifies rearranging the large copy.Image.

This makes the lifetime of some values a bit more opaque, but IDE
reference lookup tools are eaiser to use with only one copy of a value
to track, which is roughly a wash.

The one real downside is that tests are now creating mock copier/imageCopier
objects with only the fields necessary to test a particular function,
i.e. the objects are mostly invalid.  Perhaps that can be cleared up later
with better initialization helpers and some common "unusable ImageSource" mocks.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This simplifies the ubiquitous fmt.Fprintf(c.reportWriter, ...) calls,
and eliminates the private writeReport helper.

We also have to initialize copier earlier, so this moves it to the approximately
right place, when all its values are known.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will make the code paths more uniform for consumers of the
primary manifest and the manifest instances.

(Having an explicit support for manifest instances is necessary
for transports like docker-daemon: / oci-archive:, which
contain several images but setting up an ImageSource is very
expensive, or which don't even allow referencing images by digest.)

This is a direct replacement of GetTargetManifest, and should
not change behavior; notably the OCI implementation is still
blindly guessing the manifest type although it is probably
available in the index.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This mirrors GetManifest, and allows / requires signatures to be per-instance.

Also add implementations in docker: and atomic:, the only transports which
support both manifest lists and signatures.

Does not change behavior yet, the only user always specifies nil
instanceDigest.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
A types.UnparsedImage is now created by image.UnparsedInstance, with the
caller specifying nil or a digest.

A types.Image is (preferably) created by image.FromUnparsedImage based
on a specific instance.  For continuity/compatibility, image.FromSource
continues working the old way, i.e. using the default instance, and
transparently parsing data from a chosen instance (but still returning
the manifest list in .Manifest()).

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Remove the .Close() methods from UnparsedImage/Image, which closed the
underlying ImageSource.  Instead, just require the caller to ensure
that the ImageSource is not closed as long as the UnparsedImage/Image
are used.

This allows using several independent UnparsedImage/Image instances
for a shared ImageSource; notably independent Image objects for the
individual image instances in a manifest list.  (copy.Image is already
simpler although it is only using a single instance.)

To keep ImageReference.NewImage simple and not to break all the external
callers of this, also add a simple ImageCloser wrapper which retains
the ImageSource closing functionality, and return it from image.FromSource
and ImageReference.NewImage implementations.

(It's very likely many of the NewImage callers would be surprised by how this
handles manifest lists, and it is very tempting to break this API, at least
by renaming, to force the callers to consider this; however, this would be
better done after eliminating the need of ImageReference.NewImage entirely,
by replacing the specialized types.Image extensions with something else, first.)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
After building all that infrastructure, things start to fall into place:
separate the code common to handling individual images and manifest lists from
the code to copy a single image.

This only extracts a method, does not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Finally: If the top-level image is a manifest list, select an appropriate instance,
create a new types.UnparsedImage for the instance, and copy only that instance.
The instanceDigest handling in UnparsedImage and dependencies will take care
of handling the rest.

For several transports we should instead copy all of the instances in the manifest
list; this is not implemented yet, but at least the source instance access
part is ready.  (Remaining: listing images in a manifest list,
instanceDigest handling in destinations, updating and copying the manifest list.)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... and use them for choosing an image from a manifest list, and
verifying whether an image is acceptable in a MustMatchRuntimeOS()
destination.  Propagate the types.SystemContext through the
call stack as necessary.

This adds no users and seems not all that important, but after
we re-enable fetching manifest lists, docker_transport_test.go does a
"//busybox".NewImage(), which nowadays mean copying from a manifest list,
and then we need this override to keep tests working on non-Linux platforms
for which Docker does not publish images.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Now that types.SystemContext allows overriding the runtime architecture
and OS, it is simple enough to write at least a smoke test for
chooseDigestFromManifestList.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This reverts commit bf1ca2e.

We can now process the manifest lists ourselves.

Because docker_transport_test.go tests "//busybox".NewImage(),
which will now process transport lists, also add an architecture/OS
override to keep the test working e.g. on macOS, for which Docker
does not publish an image in the manifest list.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 15, 2017

@runcom PTAL. Tests pass in containers/skopeo#420 (at least on Linux right now)

@runcom
Copy link
Member

runcom commented Nov 15, 2017

This may be fine for some uses, but the advertised use of skopeo copy for copying between docker/distribution registries suffers (until we implement a copy of the whole manifest list).

I'm actually afraid of this one, right now official images are now manifest lists, what would happen if I copy the redis image which is a list to another registry with this patch? (AFK right now)

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 15, 2017

This may be fine for some uses, but the advertised use of skopeo copy for copying between docker/distribution registries suffers (until we implement a copy of the whole manifest list).

I'm actually afraid of this one, right now official images are now manifest lists, what would happen if I copy the redis image which is a list to another registry with this patch? (AFK right now)

It will select a variant for your current runtime environment (or the one you have chosen via SystemContext) and copy only that one image, not the whole manifest list.

For most people who only care about x86_64 this seems to be fine, perhaps even better than copying all of the architectures. macOS users wanting to copy Linux images will be forced to add an extra CLI option.

copy/copy.go Outdated
if src.IsMultiImage() {
multiImage, err := isMultiImage(src)
if err != nil {
return errors.Wrapf(err, "Error determining manifest MIME type for %s", transports.ImageName(srcRef))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"error determining ..."

}()
multiImage, err := isMultiImage(unparsedImage)
if err != nil {
return errors.Wrapf(err, "Error determining manifest MIME type for %s", transports.ImageName(srcRef))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lower case Error.


// imageCopier tracks state specific to a single image (possibly an item of a manifest list)
type imageCopier struct {
c *copier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the object be a copier rather then a "c"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used all over the place, and consistent with func (c *copier) receivers of the per-manifest-list code.

}
}
]
} No newline at end of file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an actual busybox manifest coming from Docker Hub; it is most representative unmodified.

@rhatdan
Copy link
Member

rhatdan commented Nov 15, 2017

Looks like throuhout the code the error messages switch from lower case to upper case and back. The standard for the tools that use this library is lower case for errors.
Probably should go through all error returns and lowercase them, to make them consistent.
In a different PR.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 15, 2017

Looks like throuhout the code the error messages switch from lower case to upper case and back. The standard for the tools that use this library is lower case for errors.

The library has been fairly consistently using upper case; historically primarily because parts of skopeo have had minimal error handling like if err != nil {return err} ultimately resulting in logrus.Fatalf(the_error_from_containers_image).

@runcom
Copy link
Member

runcom commented Nov 16, 2017

yeah I guess it's fine since it's also docker push behavior to copy just one image for now. We'll revisit this later.

@runcom
Copy link
Member

runcom commented Nov 16, 2017

LGTM

Approved with PullApprove

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 16, 2017

👍

Approved with PullApprove

@mtrmac mtrmac merged commit 46ad350 into containers:master Nov 16, 2017
@mtrmac mtrmac deleted the manifest-lists branch November 16, 2017 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants