Type system changes // canonical string representation #432

Merged
duglin merged 3 commits into cloudevents:master from clemensv:type-system-changes
Jun 6, 2019

Conversation

@clemensv
Contributor

@clemensv clemensv commented May 23, 2019

This amends the type system section so that any type can be represented as a runtime, transport, or encoding sees fit, but all implementations MUST support a canonical string representation.

Fixes: #396
Fixes: #413

Signed-off-by: Clemens Vasters clemensv@microsoft.com

Signed-off-by: Clemens Vasters <clemensv@microsoft.com>
@clemensv clemensv changed the title Type system changes with canonical reprsentation Type system changes with canonical representation May 23, 2019
@clemensv clemensv changed the title Type system changes with canonical representation Type system changes // canonical string representation May 23, 2019
Signed-off-by: Clemens Vasters <clemensv@microsoft.com>
Signed-off-by: Clemens Vasters <clemensv@microsoft.com>
Comment thread spec.md
The following abstract data types are available for use in attributes. Each of
these types MAY be represented differently by different event formats and in
transport metadata fields. This specification defines a canonical string-encoding
for each type that MUST be supported by all implementations.
Collaborator

Even if that impl only supports transports/formats that don't use the string encoding?

Contributor Author

@duglin Yes, all implementations need to expect a string value to come across the wire where the "native" representation would be a binary-encoded number, for instance. Interop.
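
A minimal sketch of what "expect a string value to come across the wire" could look like on the receiving side, assuming an Integer attribute. The function name `read_integer` and the accepted digit pattern are illustrative assumptions, not taken from the spec or from any SDK.

```python
import re

# Assumed pattern for the canonical string form of an Integer:
# an optional minus sign followed by ASCII digits.
_INTEGER_TEXT = re.compile(r"-?[0-9]+")


def read_integer(value):
    """Return an int from a native int or from its string representation."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("booleans are not Integer attribute values")
    if isinstance(value, int):
        # Native representation, e.g. from a transport with typed metadata.
        return value
    if isinstance(value, str):
        # String representation, e.g. from an HTTP header.
        if not _INTEGER_TEXT.fullmatch(value):
            raise ValueError(f"not a valid integer string: {value!r}")
        return int(value)
    raise TypeError(f"unsupported representation: {type(value).__name__}")
```

Either `read_integer(42)` or `read_integer("42")` yields the same native value, which is the interop property being asked for.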

Comment thread spec.md
the runtime/language native type that best corresponds to the abstract type.

For example, the `time` attribute might be represented by the language's native
*datetime* type in a given implementation, but it MUST be settable providing
Collaborator

"MUST be settable"... not sure what this means. Does this mean the SDK's setter MUST accept a "string" value from the user? Or is this more about how there are certain transports that will use the string-encoding and impls that support that transport must be prepared to convert between strings and the native type?

Collaborator

Oh, the next sentence covers the transport stuff. So I think I'm back to not understanding what the "settable" part means in practice.

Contributor Author

@duglin "settable" means that the implementation MUST ALWAYS allow the app to set a time value via its string representation, either from the SDK side (e.g. you're building a forwarder that maps between transports) or from the wire side, even if the "native" representation of the type isn't string. E.g. set_time(time_t) and set_time(string).
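
The `set_time(time_t)` / `set_time(string)` pair could be sketched in Python as a single setter that accepts either form. The `Event` class and its method names are hypothetical and do not come from any SDK. The string branch uses `datetime.fromisoformat`, which handles common RFC3339 timestamps but is not a complete RFC3339 parser.

```python
from datetime import datetime, timezone


class Event:
    """Hypothetical event holder, for illustration only."""

    def __init__(self):
        self._time = None

    def set_time(self, value):
        """Accept the native datetime type or the string representation."""
        if isinstance(value, datetime):
            self._time = value
        elif isinstance(value, str):
            # Older Python versions reject a trailing "Z", so map it to an
            # explicit UTC offset before parsing.
            self._time = datetime.fromisoformat(value.replace("Z", "+00:00"))
        else:
            raise TypeError(f"unsupported time value: {type(value).__name__}")

    def get_time(self):
        """Return the time as the native datetime type."""
        return self._time

    def get_time_string(self):
        """Return the time as an ISO 8601 / RFC3339-style string."""
        return self._time.isoformat()


event = Event()
event.set_time("2019-05-23T12:00:00Z")  # from the wire or a forwarder
event.set_time(datetime(2019, 5, 23, 12, 0, tzinfo=timezone.utc))  # from the app
```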

Collaborator

OK, that helps clarify things, thanks.

However, is this out of scope for the spec? Should this be guidance in the sdk doc? I'm not sure the spec, which is just about (mainly) defining the metadata, should get into how the metadata is set/retrieved by the app/user. Especially via a "MUST" - as non-normative guidance I can see though.

@duglin
Collaborator

duglin commented May 30, 2019

I believe this proposal means:

  • for transports, like http, that do not have a type-system for their metadata the attributes will all be serialized as strings
  • if the receiver of the CloudEvent knows the "true" type of a particular attribute then it MAY choose to expose that attribute in its "true" data type to the user/app
  • this then implies that unknown extensions would always be exposed to the user/app as a string

For example:
If there's an HTTP header of the form:

ce-myextension: { "foo": "bar" }

the receiver of this CloudEvent would pass along the value of myextension as { "foo": "bar" } to the user/app as a string, not as a map.

Do I have this right @clemensv ?
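
The reading above can be sketched as a receiver that converts only the attributes whose type it knows and passes everything else through as a string. The `ce-` prefix comes from the example header. The `KNOWN_TYPES` table and the function name are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical table of attributes whose abstract type the receiver knows.
KNOWN_TYPES = {
    "time": lambda text: datetime.fromisoformat(text.replace("Z", "+00:00")),
}


def attributes_from_headers(headers):
    """Collect ce-* headers; convert known attributes, keep the rest as str."""
    attributes = {}
    for name, value in headers.items():
        lowered = name.lower()
        if not lowered.startswith("ce-"):
            continue  # not a CloudEvents attribute header
        attribute = lowered[len("ce-"):]
        convert = KNOWN_TYPES.get(attribute)
        # Unknown extensions are exposed exactly as received: a string.
        attributes[attribute] = convert(value) if convert else value
    return attributes


attrs = attributes_from_headers({
    "ce-myextension": '{ "foo": "bar" }',
    "ce-time": "2019-05-30T10:00:00Z",
    "Content-Type": "application/json",
})
```

Here `attrs["myextension"]` stays the string `{ "foo": "bar" }` rather than becoming a map, while `attrs["time"]` is exposed as a native `datetime`.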

@duglin
Collaborator

duglin commented May 30, 2019

@JemDay in #413 (comment) you said you would prefer to keep the type info in the serialized name of the attribute. Can you elaborate on how this info would be used? In particular in cases where the extension is unknown by both the receiving SDK and app. While I can imagine some amount of type checking might be done (e.g. it can check that 5k5 isn't a valid int), w/o understanding more about the attribute I'm not sure that's a ton of value. The receiver can't even really display any help text or description text around that attribute; it just has the CloudEvent attribute name. So then we need to ask whether it's more helpful or painful to the application to have a mixture of extension types given to it, rather than knowing all extensions will just be strings. As a dev, if all I'm doing is passing along this info, I would think that not having to do some kind of reflection on each attribute might be easier and less error-prone.

@duglin duglin added the v0.3 label May 30, 2019
@timbray

timbray commented May 30, 2019

I'm assuming this would not affect the JSON serialization. A number would still have to be encoded as a JSON number, right?

@clemensv
Contributor Author

clemensv commented Jun 3, 2019

@duglin

  • for transports, like http, that do not have a type-system for their metadata the attributes will all be serialized as strings

Yes.

  • if the receiver of the CloudEvent knows the "true" type of a particular attribute then it MAY choose to expose that attribute in its "true" data type to the user/app

Yes.

  • this then implies that unknown extensions would always be exposed to the user/app as a string

Yes.

@clemensv
Contributor Author

clemensv commented Jun 3, 2019

"I'm assuming this would not affect the JSON serialization. A number would still have to be encoded as a JSON number, right?"

Correct, @timbray. Since people have been objecting to having quotes in HTTP header values when they're strings, I'm now defining CloudEvents as having its own string encoding for its type system, but I deep-link into the JSON spec for Integer and Map and lean on RFC3986 for URI-reference and RFC3339 for Time directly.

The net effect of all this is that JSON numbers are still JSON numbers and that maps contained in metadata are still proper JSON, but that all the quotes around strings disappear from CloudEvents headers. The design keeps everything else the same.
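
A rough illustration of that net effect, assuming a helper that leaves strings bare and renders integers and maps as JSON text. The attribute names `myint` and `mymap` and the helper `to_header_value` are made up for this sketch. Time and URI-reference values are not covered here.

```python
import json


def to_header_value(value):
    """Sketch of a header encoding: strings unquoted, the rest as JSON text."""
    if isinstance(value, str):
        return value  # no surrounding quotes in the header
    return json.dumps(value)  # integers and maps remain valid JSON


attrs = {"source": "/sensors/42", "myint": 7, "mymap": {"foo": "bar"}}

# JSON event format: unchanged, numbers stay JSON numbers.
json_body = json.dumps(attrs)

# HTTP headers: same attributes, strings without quotes.
headers = {f"ce-{name}": to_header_value(value) for name, value in attrs.items()}
```

In `headers`, the `ce-source` value is the bare text `/sensors/42`, while `ce-mymap` still parses as JSON.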

Also, I'm allowing each encoding, transport, and runtime to use whatever native type system it has, so a Time expression might always be an int64 UNIX epoch while you're in a Java and AMQP world. But since I mandate the ability to convert to/from these well-defined string formats, we can always transcode via string to, say, a C# DateTime Tick, which (surprise!) has a different epoch offset.
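
The transcoding path can be sketched as follows, assuming the UNIX epoch value is in milliseconds (the comment does not say which unit). .NET `DateTime` ticks count 100-nanosecond intervals since 0001-01-01T00:00:00. The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
TICKS_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)  # tick zero for .NET DateTime


def epoch_millis_to_string(millis):
    """Native form on one side (assumed epoch milliseconds) to a timestamp string."""
    return (UNIX_EPOCH + timedelta(milliseconds=millis)).isoformat()


def string_to_ticks(text):
    """Timestamp string to the native form on the other side (100 ns ticks)."""
    moment = datetime.fromisoformat(text.replace("Z", "+00:00"))
    delta = moment - TICKS_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    # 1 second = 10,000,000 ticks; 1 microsecond = 10 ticks.
    return whole_seconds * 10_000_000 + delta.microseconds * 10


wire_value = epoch_millis_to_string(1_500)  # 1.5 s after the UNIX epoch
ticks = string_to_ticks(wire_value)
```

Neither side needs to know the other's epoch; both only need to agree on the string form.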

@clemensv
Contributor Author

clemensv commented Jun 4, 2019

@duglin "@JemDay in #413 (comment) you said you would prefer to keep the type info in the serialized name of the attribute." (see complete comment above)

Concrete data types only really matter where the event interfaces with the programming abstraction, and the application will generally have an opinion or expectation about those if it handles them. In a C# producer app, you may set an attribute as DateTime and in the receiving Java consumer app you might want to get that attribute value as a Date, but whether that value traveled as an AMQP timestamp or as an RFC3339 string inside an HTTP header doesn't really matter as long as conversions are defined and those conversions don't cause information loss.

@n3wscott
Member

n3wscott commented Jun 5, 2019

This proposal would solve #413 and resolve #396

@duglin
Collaborator

duglin commented Jun 6, 2019

Approved on the 6/6 call


Successfully merging this pull request may close these issues.

Header values in HTTP Binary Content Mode

4 participants