JsonReceiver implements a basic mapping between Postgres scalar types and similar JSON scalar types: string, boolean, number. It does not implement any mapping to compound (object, array) types, however, even when an "obvious" mapping exists (for example from Postgres json[b] values); all other values are emitted as JSON strings (presumably according to the Postgres default text representation for that type?), meaning that we see output like:
{
  "tx": [
    {
      "schema": "public",
      "table": "questions",
      "colnames": [
        "title",
        "tags"
      ],
      "coltypes": [
        "text",
        "question_tags"
      ],
      "values": [
        "How do I parse this nested JSON?",
        "[\"json\", \"postgresql\", \"python\"]"
      ],
      "op": "I"
    }
  ]
}
when it might be more convenient for the replisome consumer to get something like
{
  ...
  "values": [
    "How do I parse this nested JSON?",
    ["json", "postgresql", "python"],
  ...
}
Note also that the coltype of tags is question_tags and not jsonb, even though we have defined:
CREATE DOMAIN question_tags AS jsonb;
There are at least two good reasons why the JsonReceiver should not (by default) emit nested JSON in this way, however:
- the empty object {} is currently used to represent the special case of "unchanged TOASTed value" in the values array, and since large json[b] fields are TOASTable, there would be no way to distinguish the "unchanged" and "changed to empty object" cases (at least without some metadata outside the values array, which the consumer must know to look for).
- if the consumer is just inserting values into another Postgres instance, it would need to serialise the nested JSON again (an unnecessary decode/encode roundtrip).
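The first ambiguity can be seen directly (a toy illustration, not replisome code):

```python
import json

# In the receiver's output, an unchanged TOASTed column is represented
# by the empty object {} in the values array.
unchanged_toast_marker = {}

# A jsonb column genuinely updated to an empty object arrives as the
# string "{}"; parsing it yields the same Python value...
parsed = json.loads("{}")

# ...so after naive nested-JSON parsing the two cases are
# indistinguishable.
print(parsed == unchanged_toast_marker)  # prints True
```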
It is proposed to instead provide a filter class which can be optionally plugged in to parse nested JSON before it reaches the consumer class. The implementation would be trivial for json[b] fields (just call json.loads() on the relevant fields in values).
There is a problem with DOMAIN type aliases, however: as mentioned above, JsonReceiver emits the alias rather than the underlying type, so there is no way for the proposed filter to detect these fields as JSON-able automatically. The filter would need to be configured with a set of type aliases which it should also parse.
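A minimal sketch of such a filter, covering both the json[b] case and configurable DOMAIN aliases. The class name, the process_row hook, and the extra_types parameter are all hypothetical here, not existing replisome API:

```python
import json


class JsonParseFilter:
    """Hypothetical filter: parse JSON-encoded string values in a row
    back into nested structures before they reach the consumer."""

    # Built-in Postgres types whose text representation is valid JSON.
    JSON_TYPES = {"json", "jsonb"}

    def __init__(self, extra_types=()):
        # DOMAIN aliases (e.g. "question_tags") must be listed
        # explicitly, since JsonReceiver reports the alias, not the
        # underlying jsonb type.
        self.types = self.JSON_TYPES | set(extra_types)

    def process_row(self, coltypes, values):
        # The isinstance check leaves the {} "unchanged TOAST" marker
        # alone: it is already a dict, not a JSON-encoded string.
        return [
            json.loads(value)
            if coltype in self.types and isinstance(value, str)
            else value
            for coltype, value in zip(coltypes, values)
        ]


f = JsonParseFilter(extra_types={"question_tags"})
print(f.process_row(
    ["text", "question_tags"],
    ["How do I parse this nested JSON?",
     '["json", "postgresql", "python"]'],
))
# prints ['How do I parse this nested JSON?', ['json', 'postgresql', 'python']]
```

With this in place the consumer sees a real nested list for the tags column, while rows containing the {} marker pass through untouched.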
Finally, we could extend the filter to also convert other JSON-like Postgres types (hstore, text[] come to mind), perhaps by using the decoding layer from psycopg2?