Suggest padding for ints #1026
Fixes cloudevents#923 Signed-off-by: Doug Davis <dug@microsoft.com>
|
@BenBeattieHood @jroper Any thoughts on this? |
|
Thanks @duglin :) I appreciate the additional work, but I think the original is clearer - sequence is a string, and used to order the events - therefore it's inferable that integers are required to be zero-padded. Zero padding integers might be worth calling out specifically, but I don't think the additional text around recommendations/'should's helps make this point clearer in this case. And including commentary on whitespace might conflict with formatting (eg yml vs json) and so should be defined in the format instead. |
|
@BenBeattieHood thanks for responding. Got a question for ya... while I agree that requiring leading zeros makes things easier, because then (basically) sequenceType can be ignored and people can always just compare "sequence" using strcmp, what do you see sequenceType actually being used for then? I mean, it might be kind of interesting to know that it can be thought of as an "int", but for the purpose of ensuring ordering that's not really necessary, right? |
FYI: we, for instance, considered using sequenceType to also define what the sequence value relates to. E.g., is it valid to compare sequence for a given subject, a given event type, across subjects and event types, ... But we discarded that, as we thought this is not the real intent of sequenceType. |
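
To illustrate the padding point raised above, here is a minimal Python sketch. The helper name `pad_sequence` and the fixed width of 10 are assumptions made for this example only; neither comes from the spec or from this PR. It shows that plain string comparison (what strcmp would do) only matches numeric order once the values are zero-padded to a common width.

```python
def pad_sequence(n: int, width: int = 10) -> str:
    """Encode a non-negative integer as a fixed-width, zero-padded string.

    Hypothetical helper: the width of 10 is an arbitrary choice for this sketch.
    """
    if n < 0:
        raise ValueError("negative values are not handled in this sketch")
    digits = str(n)
    if len(digits) > width:
        # The value no longer fits: this is the "runs out of padding" case.
        raise OverflowError(f"{n} does not fit in {width} digits")
    return digits.zfill(width)


values = (9, 10, 100)

# Without padding, string order disagrees with numeric order.
unpadded = [str(n) for n in values]
print(sorted(unpadded))  # ['10', '100', '9']

# With padding, string order and numeric order agree.
padded = [pad_sequence(n) for n in values]
print(sorted(padded) == padded)  # True
```

Note that the producer has to pick the width up front; a value that outgrows it cannot be encoded without breaking the ordering for consumers that compare strings.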
|
@BenBeattieHood see if the new edits look better |
|
@duglin I like the new revision - it seems much clearer to me. Thank you! In answer to your question, I guess in the case where one runs out of padding on an int, or moves from one string encoding to another, then sequenceType can help transmit this change. But yep, it's probably overengineering, and just removing sequenceType will be simpler. @c-pius I believe sequence is only ever necessary across a single aggregate instance (i.e. across all events from a single sensor, or across all events from a single domain entity instance). Sequence isn't really important across anything else, as it's really there to ensure events are consumed in the same order as they are emitted from an event source instance. (more info on this here) |
|
My thoughts are that sequence should be opaque, and use case specific. I don't expect any clients to actually use it to try and sort events; most messaging providers are capable of delivering in order within certain constraints. What I do expect is for it to be used for offset-based consumption restarting: when extending consumption out to a device via a TCP connection, e.g. gRPC or WebSockets, and the connection is interrupted, the sequence can be used to indicate what the last event you consumed was, so the stream can be resumed. In that case, the value is completely opaque. I'm not a fan of the lexicographical ordering requirement. Consider event sources that come from Cassandra: if you want an ordered sequence in Cassandra, best practice is to use time-based UUIDs. These produce a stable ordering based on time, but are not lexicographically sortable. We use these for our event sourced journal in Akka when running on Cassandra. Many other distributed databases, e.g. Spanner or Yugabyte, have a transaction timestamp that is commonly used to order events. These are usually 64-bit ints and they don't increase by one. So the requirement to increase by one excludes event sources from those databases too. Another thing to consider is that an event source might not want to reveal its throughput, as this could be confidential business data, and so instead of emitting a plain sequence number, encrypted sequence numbers might make more sense, which obviously are not lexicographically sortable. So, I would prefer to make sequence opaque, with any meaning taken from it being communicated out of band. |
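
As a rough illustration of the time-based UUID point above, the following Python sketch builds two version 1 UUIDs from hand-picked timestamps and compares them both by timestamp and by string form. The helper `v1_from_timestamp`, the fixed clock sequence, and the node value are assumptions for this example; real event sources would generate these values themselves.

```python
import uuid


def v1_from_timestamp(ts: int) -> uuid.UUID:
    """Build a version 1 UUID from a 60-bit timestamp (hypothetical helper).

    The clock sequence and node are fixed, made-up values for illustration.
    """
    time_low = ts & 0xFFFFFFFF          # low 32 bits come FIRST in the string form
    time_mid = (ts >> 32) & 0xFFFF      # next 16 bits
    time_hi = (ts >> 48) & 0x0FFF       # top 12 bits; version bits are set below
    return uuid.UUID(
        fields=(time_low, time_mid, time_hi, 0x80, 0x00, 0x000000000001),
        version=1,
    )


earlier = v1_from_timestamp(0x0000_0000_FFFF_FFFF)
later = v1_from_timestamp(0x0000_0001_0000_0000)

# Ordering by the embedded timestamp is what one would expect...
print(earlier.time < later.time)   # True

# ...but ordering by string form is reversed, because the string starts
# with the low bits of the timestamp.
print(str(earlier) < str(later))   # False
```

Because the canonical string form leads with the least significant timestamp bits, a consumer comparing these values as plain strings would see a different order than the one the source intended.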
|
@BenBeattieHood any thoughts on @jroper's comment? |
jskeet
left a comment
I think this is a reasonable fudge for the moment, but if we ever want to actually standardize sequence/sequencetype, I think we'll need a bit more work.
|
Based on the above, it seems to me that this PR at least clarifies the current state of things... yes, Virginia, you need to pad things. But there are still some questions around things like the opacity of "sequence" and "lexicographical ordering" that might be good to discuss in a follow-up issue/PR. |
|
@markpeek need to address the "wrapping" issue too - not sure what to do about negative numbers |
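
For context on the wrapping and negative-number concerns, here is a small Python sketch of how a zero-padded encoding could behave in those two cases. The width of 4 and the `encode` helper are assumptions for illustration only and do not reflect any agreed behavior.

```python
WIDTH = 4  # arbitrary width chosen for this sketch


def encode(n: int) -> str:
    """Zero-pad an integer to WIDTH characters (the sign counts toward the width)."""
    return format(n, f"0{WIDTH}d")


# Negative numbers: -2 comes before -1 numerically, but not as strings.
neg_lower, neg_higher = encode(-2), encode(-1)
negatives_sort_correctly = neg_lower < neg_higher

# Wrapping: a counter at its maximum rolls over to zero, so the newer
# event gets a string that sorts BEFORE the older one.
max_value = 10**WIDTH - 1
before_wrap = encode(max_value)
after_wrap = encode((max_value + 1) % 10**WIDTH)
wrap_sorts_correctly = before_wrap < after_wrap

print(negatives_sort_correctly, wrap_sorts_correctly)
```

In both cases plain string comparison gives the wrong order, which suggests that any padding rule would also need to say something about sign handling and rollover.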
|
I'm going to suggest we adopt #1031 instead of this PR during today's call. |
|
Per 7/21 call, closing in favor of #1031 |