Our write approach goes something like this:
- Do some reads to figure out upserts etc.
- Do some writes to temporary tables and
fulltext_values.
- Flush changes into the
transactions table.
- Flush changes into the
datoms table.
- Drop the temporary tables.
- Read back from
transactions to build a transaction report.
- Apply changes — parts, idents, and attributes — to the
schema, idents, and parts tables.
All of this happens in one database transaction.
One concern about all of this is that there will be significant fragmentation: in each transact we write to datoms and transactions, interleaving changes (and to other tables, too). Even with WAL, this is a long way from an optimized append-only workflow
SQLite allows for attached databases, and writes to attached databases are atomic if not using WAL, and atomic but not crash-safe otherwise.
We could thus split our data, literally making our materialized datoms table and its indices a separate materialized store. (And we could do this via our transaction listener if we weren't so concerned about atomicity and speed.)
So long as the transaction to the transaction log database completes first, we can replay transacted datoms into the datoms database in the event of necessary crash recovery.
This approach would allow each database to be tuned independently, would prevent the writer's readback of the transaction log from impacting the datoms-oriented page cache used by readers, and potentially allow us to scale better.
The transactor would have a connection to both DBs, attached. Readers would typically only use a connection to the datoms database.
Our write approach goes something like this:
fulltext_values.transactionstable.datomstable.transactionsto build a transaction report.schema,idents, andpartstables.All of this happens in one database transaction.
One concern about all of this is that there will be significant fragmentation: in each
transactwe write todatomsandtransactions, interleaving changes (and to other tables, too). Even with WAL, this is a long way from an optimized append-only workflowSQLite allows for attached databases, and writes to attached databases are atomic if not using WAL, and atomic but not crash-safe otherwise.
We could thus split our data, literally making our materialized
datomstable and its indices a separate materialized store. (And we could do this via our transaction listener if we weren't so concerned about atomicity and speed.)So long as the transaction to the transaction log database completes first, we can replay transacted datoms into the datoms database in the event of necessary crash recovery.
This approach would allow each database to be tuned independently, would prevent the writer's readback of the transaction log from impacting the datoms-oriented page cache used by readers, and potentially allow us to scale better.
The transactor would have a connection to both DBs, attached. Readers would typically only use a connection to the datoms database.