If we print flattened from the first example in the README, we get this:
referenced: [{ }], root: [{ helper: #0 }, { helper: { } }, { helper: #0 }]
which is about as space-efficient as possible for a JSON-like format. However, if we print the actual JSON data produced for that example, we see something a lot more wasteful:
{"root":{"value":{"unkeyed":[{"value":{"keyed":{"helper":{"reference":0}}}},{"value":{"keyed":{"helper":{"value":{"keyed":{}}}}}},{"value":{"keyed":{"helper":{"reference":0}}}}]}},"referenced":[{"keyed":{}}]}
There are three aspects of a single main issue:
- pretty much every object is wrapped in a { "value": ... } or { "reference": ... };
- every encoding container (keyed, unkeyed, single value) is also wrapped in a { "keyed": ... } or similar;
- (not visible in the above example) every primitive value is wrapped in a { "int8": ... } or similar, as appropriate.
All of these annotations are necessary since FlattenedContainer is encoded after CyclicEncoder has completed, and decoded before CyclicDecoder decodes the actual object graph — it doesn't have access to the actual structure of the object graph during decoding, so the encoded data needs to contain all of this information.
I see two ways to solve this.
1. Create a custom, JSON-like format that FlattenedContainer can convert itself to and from
This would simply add toData() and init(from data:) methods to FlattenedContainer, which could be used instead of running an encoder on it.
The conformance to Codable can remain, but will not be used or needed for this method.
Advantages:
- A highly-efficient encoding format can be created, or even multiple formats, e.g. a JSON-like format and also a binary format.
- The (wasteful) Codable implementation remains as-is for compatibility and can continue to be used.
Disadvantages:
- Introduces a proprietary file format.
- Requires writing and maintaining custom data generation and parsing code.
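To make option 1 more concrete, here is a minimal sketch of what a toData() / init(from data:) pair might look like. Everything in it is an assumption for illustration: Node and CompactContainer are hypothetical stand-ins for FlattenedContainer's internals (which this issue does not show), primitives are reduced to strings, and the "#<id>" / "$<text>" markers are just one possible way to tell references and strings apart.

```swift
import Foundation

// Hypothetical stand-in for the nodes stored in FlattenedContainer.
// Primitives are reduced to strings to keep the sketch short.
enum Node: Equatable {
    case keyed([String: Node])
    case unkeyed([Node])
    case reference(Int)
    case string(String)

    // Assumed convention: "#<id>" marks a reference and "$<text>" marks a
    // string, so the two can never be confused.
    var jsonObject: Any {
        switch self {
        case .keyed(let dict): return dict.mapValues { $0.jsonObject }
        case .unkeyed(let array): return array.map { $0.jsonObject }
        case .reference(let id): return "#\(id)"
        case .string(let text): return "$" + text
        }
    }

    init?(jsonObject: Any) {
        if let dict = jsonObject as? [String: Any] {
            var result: [String: Node] = [:]
            for (key, value) in dict {
                guard let node = Node(jsonObject: value) else { return nil }
                result[key] = node
            }
            self = .keyed(result)
        } else if let array = jsonObject as? [Any] {
            var result: [Node] = []
            for element in array {
                guard let node = Node(jsonObject: element) else { return nil }
                result.append(node)
            }
            self = .unkeyed(result)
        } else if let text = jsonObject as? String, let marker = text.first {
            let rest = String(text.dropFirst())
            if marker == "#", let id = Int(rest) {
                self = .reference(id)
            } else if marker == "$" {
                self = .string(rest)
            } else {
                return nil
            }
        } else {
            return nil
        }
    }
}

enum FormatError: Error { case malformed }

// Hypothetical stand-in for FlattenedContainer itself.
struct CompactContainer: Equatable {
    var referenced: [Node]
    var root: Node
}

extension CompactContainer {
    // Serialises without any { "value": ... } or { "keyed": ... } wrappers.
    func toData() throws -> Data {
        let object: [String: Any] = [
            "referenced": referenced.map { $0.jsonObject },
            "root": root.jsonObject,
        ]
        return try JSONSerialization.data(withJSONObject: object, options: [.sortedKeys])
    }

    init(from data: Data) throws {
        guard let object = try JSONSerialization.jsonObject(with: data) as? [String: Any],
              let rawReferenced = object["referenced"] as? [Any],
              let rawRoot = object["root"],
              let root = Node(jsonObject: rawRoot) else {
            throw FormatError.malformed
        }
        var referenced: [Node] = []
        for raw in rawReferenced {
            guard let node = Node(jsonObject: raw) else { throw FormatError.malformed }
            referenced.append(node)
        }
        self.init(referenced: referenced, root: root)
    }
}

// The README example from the top of this issue, rebuilt by hand.
let helper = Node.keyed(["helper": .reference(0)])
let container = CompactContainer(
    referenced: [.keyed([:])],
    root: .unkeyed([helper, .keyed(["helper": .keyed([:])]), helper])
)
let data = try container.toData()
let restored = try CompactContainer(from: data)
print(String(decoding: data, as: UTF8.self))
print(restored == container)
```

The sketch also shows the cost listed above: the marker convention and the parsing code are exactly the kind of proprietary detail that would need to be documented and maintained.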
2. Merge the two encoding steps
The API usage would change from
let flattened = try! CyclicEncoder().flatten(object)
let data = try! JSONEncoder().encode(flattened)
let decoded = try! JSONDecoder().decode(FlattenedContainer.self, from: data)
let unflattened = try! CyclicDecoder().decode(MyObject.self, from: decoded)
to something like
let data = try! CyclicEncoder().encode(object, with: JSONEncoder())
let unflattened = try! CyclicDecoder().decode(MyObject.self, from: data, with: JSONDecoder())
Essentially the idea is for CyclicEncoder/Decoder to "wrap" the proper encoder/decoder, encoding an array of referenced objects and the root object, and intercepting calls to encode(_:) so that encoding an object of type T actually encodes, e.g., an enum Referenceable<T>, which can specify either the object or a reference id. The decoder would be implemented similarly.
Advantages:
- Does a good job of preserving the structure of the object graph, aside from extra nesting for Referenceable<T>, and the addition of the referenced objects array at the root.
Disadvantages:
- Not sure how to effectively and unambiguously encode a Referenceable<T>. A simple possibility is encoding [5] for a reference, and [-1, <the object>] for an actual value. However, this is still quite intrusive and may not be ideal if e.g. the serialised JSON would be processed directly later.
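For reference, the [5] / [-1, <the object>] idea could be sketched roughly as follows. This is only an illustration under assumptions: the Referenceable<T> name comes from the text above, but the Codable implementation and the Helper type are made up for the example, and it does not show how CyclicEncoder would intercept calls to insert the wrapper.

```swift
import Foundation

// Hypothetical wrapper: either the object itself or the id of a referenced object.
enum Referenceable<T: Codable>: Codable {
    case value(T)
    case reference(Int)

    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        let tag = try container.decode(Int.self)
        if tag == -1 {
            // [-1, <the object>] carries an actual value.
            self = .value(try container.decode(T.self))
        } else {
            // [<id>] is a reference into the referenced objects array.
            self = .reference(tag)
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.unkeyedContainer()
        switch self {
        case .value(let value):
            try container.encode(-1)
            try container.encode(value)
        case .reference(let id):
            try container.encode(id)
        }
    }
}

// Made-up payload type for the example.
struct Helper: Codable, Equatable {
    var name: String
}

let encoder = JSONEncoder()
let referenceData = try encoder.encode(Referenceable<Helper>.reference(5))
let valueData = try encoder.encode(Referenceable<Helper>.value(Helper(name: "a")))
let referenceJSON = String(decoding: referenceData, as: UTF8.self)
let valueJSON = String(decoding: valueData, as: UTF8.self)
print(referenceJSON) // [5]
print(valueJSON)     // [-1,{"name":"a"}]

// Round trip: the tag in the first array slot decides which case is decoded.
let decodedValue = try JSONDecoder().decode(Referenceable<Helper>.self, from: valueData)
if case .value(let helper) = decodedValue {
    print(helper.name)
}
```

Even in this small form, every value gains an array wrapper, which is the intrusiveness mentioned above for anyone reading the JSON directly.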
Time permitting, I'll most likely get started on implementing both options and then compare the finished implementations to see which one is better in practice.
Please leave a comment if you have a relevant use case which would be worth considering when implementing this.