Skip to content
This repository was archived by the owner on Sep 23, 2022. It is now read-only.

Add proposal F#52

Merged
jdolitsky merged 2 commits intoopencontainers:mainfrom
silvin-lubecki:proposal-f
Jun 6, 2022
Merged

Add proposal F#52
jdolitsky merged 2 commits intoopencontainers:mainfrom
silvin-lubecki:proposal-f

Conversation

@silvin-lubecki
Copy link
Copy Markdown
Contributor

@silvin-lubecki silvin-lubecki commented Jun 4, 2022

This proposal tries to follow up with all the comments made on the issue #50

Signed-off-by: Silvin Lubecki <silvin.lubecki@docker.com>
Copy link
Copy Markdown
Contributor

@sudo-bmitch sudo-bmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are non-blocking. We can merge and iterate. Race conditions and existing runtime support are the big questions for me, since we can't count on ETags with existing registries.

A new concern we didn't consider before is scalability, since this has limits on the size of the index manifest, and runtimes may limit the annotation size.

Comment thread docs/proposals/PROPOSAL_F.md Outdated
Example of a reference OCI descriptor, pointing to a SBOM referencing an alpine image on a specific platform:
```json
{
"mediaType": "application/vnd.cncf.oras.artifact.manifest.v1+json",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to pick a media type known to OCI here.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean from this list or that one needs to be defined for SBOMs specifically?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That list looks right. For a manifest, anything other than a docker image, docker manifest list, OCI image, or OCI index is going to be rejected by most registries.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa


#### Well-known annotations
A **well-known** annotations section is added to the [Annotations](https://github.com/opencontainers/image-spec/blob/main/annotations.md) section.
Those annotations **SHOULD** be added to the referenced image descriptor.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question I'll have when assessing the proposal is what happens when the annotations don't match.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean if the annotations are not from the "well-known annotations" list ?
Then maybe I should add:
"An encountered annotation that is unknown to the implementation MUST be ignored." ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

Comment thread docs/proposals/PROPOSAL_F.md Outdated
> If multiple manifests match a client or runtime's requirements, the first matching entry SHOULD be used.

Example:
```json
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
```json
```jsonc

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

> The distribution-spec **SHOULD** solve the race condition occurring when [multiple parties modify an OCI Index in parallel](https://github.com/opencontainers/distribution-spec/pull/251).
Querying for a reference involves pulling either existing tags pointing to an index that includes the artifact and image, or pulling new tags that use the digest of the referenced image in the tag value.

> Registries are expected to handle nested OCI Index correctly.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need testing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added Testing is needed to verify that a registry supports nested OCI Index. in the Backwards Compatibility.2 section.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@silvin-lubecki - What is the recommendation when the index size is more than 4MB. Implementations currently reject manifests that are too large.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is up to registry owners to decide. 4 MB is really big for an object that you expect people to fetch very often.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was discussed in opencontainers/distribution-spec#260 and officially clarified in opencontainers/distribution-spec#293 😅


### Registry HTTP API

> The distribution-spec **SHOULD** solve the race condition occurring when [multiple parties modify an OCI Index in parallel](https://github.com/opencontainers/distribution-spec/pull/251).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For existing registries that don't have this feature yet, are we accepting the race condition?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need to. Depending on the time horizon for fixing this, we could provide guidance for how clients can work around the race.

I would worry that the guidance would become the standard though 😬

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could say that, if the response doesn't include an ETag, that the client SHOULD re-pull the manifest and check that the contents are what it expects, and if it doesn't, it SHOULD go through the pull-update-push dance again, until the update lands racelessly. Or, client's MAY consider the case an error and make users re-push until the race doesn't happen, possibly losing information along the way.

This seems like both a.) enough to ensure the race doesn't practically happen, and b.) a slow and annoying enough protocol that it still encourages registries to support ETags so it isn't necessary. If a client wants to assume the registry will support ETags they can, and just fail loudly if it doesn't.

This has the possible negative side effect of pushing the updated manifest multiple times, each time it may lose the race and need to retry. Again, this is more reason a registry should prefer not to need this, and just support ETags.

Ultimately there's not a lot we can do to make the protocol bullet-proof, except just promote ETags everywhere.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could say that, if the response doesn't include an ETag, that the client SHOULD re-pull the manifest and check that the contents are what it expects, and if it doesn't, it SHOULD go through the pull-update-push dance again, until the update lands racelessly. Or, client's MAY consider the case an error and make users re-push until the race doesn't happen, possibly losing information along the way.

I agree with this, but I wonder if it should be detailed in this proposal. I think it's more a distribution-spec issue 🤔

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how long it takes to push various blobs, the client could have cached a manifests for a long while (in computer time) and overwrite a manifest that was pushed and verified by another client. In other words, there are lots of races, and we'll probably need to spend some effort specifying the client requirements to minimize (but not prevent) these races.

We also need to factor in that some tooling may see an artifact get pushed and attempt to intentionally delete that artifact, and we'd need a way to differentiate that from a race condition.

Comment thread docs/proposals/PROPOSAL_F.md Outdated
1. As an artifact producer, I want to update an existing artifact with a newer artifact.
- Yes: New artifacts are associated by pushing the artifact and the unique tag referencing the target manifest.
1. As an artifact producer, I want to push multiple artifacts concurrently (possibly from separate systems) without encountering race conditions or other errors.
- Yes: Each artifact may be pushed separately with a unique artifact manifest and tag that references the target manifest.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weren't we getting away from unique tags with this proposal? If multiple tags exist, do all manifests need to be pulled, and how does the client know which one is the newest?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that I understand the use case here. Is it intended to tease out that index modifications are racy?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have clarified, multiple digest tags. I thought you were pushing for a single predictable digest tag, and not a different digest tag for each version of the extended index.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, indeed, it's a copy/paste I should have resolved 😅
Changing to:
No: If two manifests with the same digest are pushed to the same registry path, a race condition will occur. However, this problem exists in current registries.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

Comment thread docs/proposals/PROPOSAL_F.md Outdated
- E.g. `registry.example.org/project-d:sha256-0000000000000000000000000000000000000000000000000000000000000000.0404040404040404`
- `<alg>`: the digest algorithm
- `<ref>`: the referenced digest (limit of 64 characters)
- `<hash>`: hash of this artifact (limit of 16 characters)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the artifact hash still needed with this proposal? I thought that was going away to avoid the tag listing.

Copy link
Copy Markdown

@chris-crone chris-crone Jun 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed when you would like to add metadata to an existing object.

As the proposal is that you append metadata to the OCI index, you do not need to list tags. You only need to have your unsigned existing object's digest and then you can compute the tag (using the above formula) for the index with metadata.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that mixing up the reference digest with the artifact hash? The original index you want to extend is the first digest. The digest of the new index with the modified fields is the second hash, and that isn't known by consumers in advance so they need to use a tag list to find them.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distribution spec doesn't seem to have a way to pull/push index manifests. Would the new OCI index be a nested index?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Distribution spec doesn't seem to have a way to pull/push index manifests." I'm confused, distribution-spec allows this. The nested index concern is with clients that assume an index is only 1 level and can find their platform by scanning the index without recursion.

My question above is getting into how many tags will there be pointing to index manifests for a given "thing". My original understanding is there's:
original tag -> original index or extended index
digest tag -> extended index

And the extended index includes an entry for the original index, and could have entries for each of the original index entries, flattening the search done by clients. That would mean there's only one digest tag for each referenced digest, pointing to the extended index, negating the need for the additional <hash> entry.

Copy link
Copy Markdown

@chris-crone chris-crone Jun 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nishakm

Distribution spec doesn't seem to have a way to pull/push index manifests

I'm not a distribution spec expert but I believe they're the same as other manifests

There's a useful diagram here to see what's happening.

@sudo-bmitch

That would mean there's only one digest tag for each referenced digest, pointing to the extended index, negating the need for the additional entry.

Yes, I believe that you're right. I didn't read it closely enough the first time.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sudo-bmitch yes totally, removing it 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

- `<ref>`: the referenced digest (limit of 64 characters)
- `<hash>`: hash of this artifact (limit of 16 characters)

If the referenced image is an OCI Index, the manifests **SHOULD** be inserted first, for backward compatibility with runtimes.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If the referenced image is an OCI Index, the manifests **SHOULD** be inserted first, for backward compatibility with runtimes.
If the referenced image is an OCI Index, the manifests **MAY** be inserted first, for backward compatibility with runtimes.

I would change this to MAY as it's up to the people creating the artifacts if they want to provide backward compatibility or not.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be prescriptive here, so we can maximize the backward compatibility ? Also I don't see good reasons not to do it 🤔

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't see good reasons not to do it

It adds complexity and duplication which is not ideal. Will we still want this in 1-2 years time?

Shouldn't we be prescriptive here, so we can maximize the backward compatibility ?

By putting "MAY", we're making the how of backward compatibility well known but we allow tool builders to decide if they need it for their tooling.

I'd be curious to see what others think here. I won't die on this hill :)

},
// signatures added to the linux/arm64
"annotations": {
"dev.cosignproject.cosign/signature.e59879c": "...",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, the working group mission statement is:

Propose how to describe and query relationships between objects stored in an OCI registry.

@sudo-bmitch @nishakm @jdolitsky @imjasonh I believe that exactly how images are signed or SBOMs are stored is out of scope for this working group. Is that correct?

I raise this as I think we should be clear that we're using illustrative examples to solve for the working group's mission, not trying to define signing or SBOM for artifacts across the industry.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the goal is to define how additional data is associated to an existing image. We don't want to get into signing formats. It might make sense to discuss this not as signatures, but an optimization for smaller data types.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it helps, we could replace dev.cosignproject... with "org.favorite.icecream": "rocky-road" or something, to make it clear it's a demonstration of a value that's expected to be small, and not a specific signing implementation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tried to keep the proposal as generic as I could, those signatures only appear in the examples. Changing them with what @imjasonh suggested 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

Comment thread docs/proposals/PROPOSAL_F.md Outdated
"os": "linux"
},
"annotations": {
"dev.cosignproject.cosign/signature.e59879c": "...",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the hex suffixes here and org.opencontainers.signatures.index ? Or is this intentionally vague as the role of the WG is to not actually describe these keys?

Should we define/recommend a key structure for signatures. Eg. sign.<vendor>/<ftype>.<uniq>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a carry over from this comment: #50 (comment)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is, but I'm removing it for more generic annotation examples.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

#### Pre-defined annotations

Reference annotations are added to the [Annotations](https://github.com/opencontainers/image-spec/blob/main/annotations.md) section.
* `org.opencontainers.reference.digest`: digest referencing an image. The digest **SHOULD** be referenced as OCI Descriptor in the root OCI index **OR** in the nested OCI index to avoid a weak reference.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be worded as "SHOULD reference an OCI Descriptor"?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when an artifact/image's reference annotation points to a nested index which references the same artifact/image? Is there any way a client can resolve this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when an artifact/image's reference annotation points to a nested index which references the same artifact/image? Is there any way a client can resolve this?

What would be the use case ? I don't really get the workflow that would produce it 🤔

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be worded as "SHOULD reference an OCI Descriptor"?

My intent was to enforce the image producers to make sure that if they reference a digest with this annotation, they also reference it using an OCI Descriptor, otherwise it's only a weak pointer and nothing prevents the registry from garbage collecting the referenced digest.

Comment thread docs/proposals/PROPOSAL_F.md Outdated
- E.g. `registry.example.org/project-d:sha256-0000000000000000000000000000000000000000000000000000000000000000.0404040404040404`
- `<alg>`: the digest algorithm
- `<ref>`: the referenced digest (limit of 64 characters)
- `<hash>`: hash of this artifact (limit of 16 characters)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distribution spec doesn't seem to have a way to pull/push index manifests. Would the new OCI index be a nested index?

Copy link
Copy Markdown
Contributor

@jdolitsky jdolitsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fantastic. Thank you.

Could we also add a row to the table of proposals in the README? https://github.com/opencontainers/wg-reference-types#proposals

- Removing references to cosign in the examples
- Use jsonc
- Remove tag hash
- Fixed copy paste in requirements

Signed-off-by: Silvin Lubecki <silvin.lubecki@docker.com>
@silvin-lubecki
Copy link
Copy Markdown
Contributor Author

Could we also add a row to the table of proposals in the README? https://github.com/opencontainers/wg-reference-types#proposals

Fixed 47a0daa

Copy link
Copy Markdown
Contributor

@jdolitsky jdolitsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets put it in and iterate from there!

@jdolitsky jdolitsky merged commit 74a04fd into opencontainers:main Jun 6, 2022
@jdolitsky
Copy link
Copy Markdown
Contributor

For any unresolved topics from this PR, please add here: #53

Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants