lds: parse wal2json string values according to their type #879
ab-pm wants to merge 6 commits into graphile:v4 from
Conversation
[wal2json serialises all (non-numeric, non-boolean, non-null) database values as strings](https://github.com/eulerto/wal2json/blob/f8bab055fafc196539c9652b6b5584228498c6eb/wal2json.c#L1337-L1384) into the JSON, and so far we failed to take their types into account when converting them to records. This now:

* makes wal2json output strings as far as possible (in particular for numeric types whose values did not always have a JSON number representation)
* parses the values [just like `node-postgres` does it](https://github.com/brianc/node-postgres/blob/32ec730b51a1fd73bf97d65105c949729fa9ec80/packages/pg/lib/result.js#L68-L73)
* instructs wal2json to output the type oids instead of the type names for this
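The node-postgres-style per-value parsing step can be sketched as follows. The parser map below is only a stand-in for the real `pg.types` registry; the oids shown (23 = `int4`, 114 = `json`, 3802 = `jsonb`, 1700 = `numeric`) are illustrative entries, not an exhaustive list:

```javascript
// Stand-in for pg.types: map a type oid to a parser for its text form.
const parsers = new Map([
  [23, v => parseInt(v, 10)],  // int4
  [114, v => JSON.parse(v)],   // json
  [3802, v => JSON.parse(v)],  // jsonb
  [1700, v => parseFloat(v)],  // numeric (as string via numeric-data-types-as-string)
]);

const identity = v => v;

function parseValue(oid, value) {
  // nulls and values wal2json already emitted as native JSON pass through
  if (value === null || typeof value !== "string") return value;
  // unknown oids fall back to the raw string, like an identity parser
  return (parsers.get(oid) || identity)(value);
}
```

Unknown oids deliberately fall back to the raw string, which matches the "noop identity function by default" behaviour described below.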
Let’s make this opt-in so people who don’t need it don’t incur the additional cost; add something like
…ding` to `PgLogicalDecoding` and add doc comments to describe them
Done. I notice the CI appears to be running only some older Postgres versions with some older wal2json; apparently
ab-pm force-pushed from 2b9657c to ab8d299
* Refactor `changeToRecord` and `changeToPk` from exports into methods of `PgLogicalDecoding`
* add `LdsOptions.types` to choose parsers, use noop identity function by default
* call `changeToRecord` only once on updates
* add `LdsOptions.params`
* fix `numeric-data-types-as-string` to be set only if `types` are passed
* adjust the `parse` function accordingly, to handle `number` values without a type parser
* set `include-type-oids` only if `types` are passed
* adjust `changeToRecord`/`changeToPk` to call `parse` only if type oids are present
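The last point can be sketched roughly like this. Note this is a toy, not the actual implementation: the `columntypeoids` key name on the wal2json change object and the `parse` signature are assumptions here, and the inline parser only handles one oid for illustration:

```javascript
// Assumed stand-in for the real parse dispatch; real code would consult
// a type-parser registry keyed by oid.
function parse(oid, value) {
  if (typeof value !== "string") return value;
  if (oid === 23) return parseInt(value, 10); // int4, for illustration
  return value;
}

// changeToRecord-style helper: zip wal2json's parallel column arrays into a
// record, parsing values only when type oids are present (i.e. only when the
// caller configured `types`, which makes us request include-type-oids).
function changeToRecord(change) {
  const { columnnames, columnvalues, columntypeoids } = change;
  const record = {};
  columnnames.forEach((name, i) => {
    const value = columnvalues[i];
    record[name] = columntypeoids ? parse(columntypeoids[i], value) : value;
  });
  return record;
}
```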
ab-pm force-pushed from 2a3685d to b87a3c8
… and try to ignore the `software-properties-common` package that is not available on Debian 13 and later? (I have no idea what I'm doing)
ab-pm force-pushed from b87a3c8 to 49ec36f
```js
  ...params,
})
.flatMap(entry => (typeof entry[1] == "string" ? entry : []))
.map(pg.Client.prototype.escapeLiteral)
```
Not pretty. Unfortunately the top-level export pg.escapeLiteral is only available since brianc/node-postgres#2954, and I didn't want to require pg@^8.11.0…
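For context, the escaping `escapeLiteral` performs is small enough to inline if the prototype reach-in feels too fragile. This is a sketch of the algorithm node-postgres uses (double embedded single quotes, and prefix with `E` while doubling backslashes when any backslash occurs):

```javascript
// Minimal standalone PostgreSQL string-literal escaper, mirroring the
// algorithm in node-postgres' Client.prototype.escapeLiteral.
function escapeLiteral(str) {
  let hasBackslash = false;
  let escaped = "'";
  for (const c of str) {
    if (c === "'") {
      escaped += c + c;               // '' escapes a single quote
    } else if (c === "\\") {
      escaped += "\\\\";              // backslashes are doubled...
      hasBackslash = true;
    } else {
      escaped += c;
    }
  }
  escaped += "'";
  if (hasBackslash) {
    escaped = " E" + escaped;         // ...and require the E'' syntax
  }
  return escaped;
}
```

Inlining it would lift the `pg@^8.11.0` requirement without reaching into `pg.Client.prototype`.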
```sh
  apt-transport-https \
  ca-certificates \
  build-essential \
  pkg-config \
  git \
  postgresql-server-dev-$PG_MAJOR
add-apt-repository "deb http://ftp.debian.org/debian testing main contrib"
apt-get update
# add-apt-repository "deb http://ftp.debian.org/debian testing main contrib"
# apt-get update
```
You may want to not merge this. As the commit message indicates, I have no idea what I was doing; but apparently installing ca-certificates solved the TLS problem when git-cloning the wal2json repository?
I'm not sure what the Debian testing repository is good for. If it's still needed, would simply echoing these lines into some /etc/apt/sources.list.d/???.list file work?
Description
We're using @graphile/lds in a LiveSource (similar to subscriptions-lds) that is not only used for monitoring live queries, but also publishes the actual row data. We discovered a bug in our application where the announcement `.data` that is passed to the `subscribeToLogicalDecoding` callback did not match the expected value - we assumed it would basically look like the results from `pg.query('SELECT * FROM schema.table')`. However we found that some of the properties contained plain strings!

The culprit appears to be that wal2json outputs most record values as strings, basically as `value::text` (I believe?) and not `to_json(value)` as one might expect. The changes emitted by `PgLogicalDecoding` are then simply transformed from key/value lists to a dictionary, but no type-specific parsing is performed (the `columntypes` are available but ignored).

Performance impact
unknown
Correctness impact
As far as I can tell, most live queries listen for changes to records by the record identifier (primary key), and those typically consist of integers and texts, where this is not an issue. I fail to understand how the `liveCollection` predicates work and what they check for.

In our case, we were looking at the published data and tried to access some properties that should have been `Date`s, `JSON` values or `Array`s (`timestamptz`, `jsonb` and `int[]` respectively). The issue surely occurs with other rarely used datatypes, like enum(array)s, ranges, custom composite types, etc.

Checklist
* `yarn lint:fix` passes.
* `yarn test` passes.
* `RELEASE_NOTES.md`: as I did not find any

Further steps
This PR does report the issue and serves to start the discussion. If you're interested in merging the proposed fix, I'd like to continue with some refactorings; in particular I think it should be possible to inject custom type parsers. For that I would suggest making `parse`, `changeToRecord` and `changeToPk` methods of the `PgLogicalDecoding` class, and using `this.types` instead of `pg.types`, where `.types` should be a `pg.CustomTypesConfig`-shaped object that can be configured via the constructor `options` and falls back to `pg.types` if not passed.
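The injection idea could look roughly like this. The class below is a toy stand-in for `PgLogicalDecoding`, not the real implementation; the only thing it assumes about a `pg.CustomTypesConfig`-shaped object is that it exposes `getTypeParser(oid)`:

```javascript
// Default registry used when no custom `types` option is passed; the real
// fallback would be pg.types, this one only knows int4 for illustration.
const defaultTypes = {
  getTypeParser(oid) {
    if (oid === 23) return v => parseInt(v, 10); // int4
    return v => v;                               // identity for everything else
  },
};

class PgLogicalDecodingSketch {
  constructor(options = {}) {
    // custom parsers win; otherwise fall back to the default registry
    this.types = options.types || defaultTypes;
  }

  parse(oid, value) {
    // only raw strings need parsing; native JSON values pass through
    return typeof value === "string"
      ? this.types.getTypeParser(oid)(value)
      : value;
  }
}
```

This keeps the zero-config path on identity parsing while letting callers reuse whatever parser setup they already feed to node-postgres.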