We have four fields in datoms: index_vaet, index_avet, index_fulltext and unique_value.
These fields appear in datoms and in the indices idx_datoms_eavt and idx_datoms_aevt.
Some rows also appear in idx_datoms_avet, idx_datoms_vaet, idx_datoms_fulltext, and idx_datoms_unique_value.
That is: each datom is responsible for anywhere from 12 to 28 TINYINTs in the DB. Sometimes these won't contribute to space (if they're zero, I've read that sqlite should pack them down to nothing), but assuming one byte per field, half a million datoms will add up to 14MB to the DB just for these flags.
These flags are entirely derived from the schema: a datom's attribute is the sole determinant (AFAICS) of whether these flags are 1 or 0 for a row.
Now, partial indexes cannot refer to other tables or call functions, so we can't simply join against the schema in the CREATE INDEX … WHERE clause. But we could use a more complex operator-driven expression, including bitmasks, to compress these flags.
We could also consider approaches to simply removing columns:
index_vaet and index_fulltext are mutually exclusive, so they could be two integer values in a single field.
index_vaet corresponds to :db/valueType :db.type/ref, which is already implicitly represented by a value_type_tag of 0, so we can filter on that instead.
Finally, we could consider direct schema creation as an approach: rather than having idx_datoms_fulltext, for example, we could create per-attribute indices when we register a schema fragment that includes :fulltext true:
CREATE INDEX idx_datoms_fulltext_page_title ON datoms (value_type_tag, v, a, e) WHERE a = 65538 -- The interned attribute.
I don't know if sqlite will happily use a collection of such indices (perhaps it'd be quicker for real-world queries!), but it's worth exploring.
We have four fields in
datoms:index_vaet,index_avet,index_fulltextandunique_value.These fields appear in
datomsand in the indicesidx_datoms_eavtandidx_datoms_aevt.Some rows also appear in
idx_datoms_avet,idx_datoms_vaet,idx_datoms_fulltext, andidx_datoms_unique_value.That is: each datom is responsible for anywhere from 12 to 28
TINYINTs in the DB. Sometimes these won't contribute to space (if they're zero, I've read that sqlite should pack them down to nothing), but assuming one byte per field, half a million datoms will add up to 14MB to the DB just for these flags.These flags are entirely derived from the schema: a datom's attribute is the sole determinant (AFAICS) of whether these flags are
1or0for a row.Now, partial indexes cannot refer to other tables or call functions, so we can't simply join against the schema in the
CREATE INDEX … WHEREclause. But we could use a more complex operator-driven expression, including bitmasks, to compress these flags.We could also consider approaches to simply removing columns:
index_vaetandindex_fulltextare mutually exclusive, so they could be two integer values in a single field.index_vaetcorresponds to:db/valueType :db.type/ref, which is already implicitly represented by avalue_type_tagof 0, so we can filter on that instead.Finally, we could consider direct schema creation as an approach: rather than having
idx_datoms_fulltext, for example, we could create per-attribute indices when we register a schema fragment that includes:fulltext true:CREATE INDEX idx_datoms_fulltext_page_title ON datoms (value_type_tag, v, a, e) WHERE a = 65538 -- The interned attribute.I don't know if sqlite will happily use a collection of such indices (perhaps it'd be quicker for real-world queries!), but it's worth exploring.