Consider an alternative disk representation for datom flags

We have four fields in `datoms`: `index_vaet`, `index_avet`, `index_fulltext` and `unique_value`.

These fields appear in `datoms` and in the indices `idx_datoms_eavt` and `idx_datoms_aevt`.

Some rows also appear in `idx_datoms_avet`, `idx_datoms_vaet`,  `idx_datoms_fulltext`, and `idx_datoms_unique_value`.

That is: each datom is responsible for anywhere from 12 to 28 `TINYINT`s in the DB. Sometimes these won't contribute to space (if they're zero, I've read that sqlite should pack them down to nothing), but assuming one byte per field, half a million datoms will add up to 14MB to the DB just for these flags.

These flags are entirely derived from the schema: a datom's attribute is the sole determinant (AFAICS) of whether these flags are `1` or `0` for a row.

Now, partial indexes cannot refer to other tables or call functions, so we can't simply join against the schema in the `CREATE INDEX … WHERE` clause. But we could use a more complex operator-driven expression, including bitmasks, to compress these flags.

We could also consider approaches to simply removing columns:
- `index_vaet` and `index_fulltext` are mutually exclusive, so they could be two integer values in a single field. 
- `index_vaet` corresponds to `:db/valueType :db.type/ref`, which is already implicitly represented by a `value_type_tag` of 0, so we can filter on that instead.

Finally, we could consider direct schema creation as an approach: rather than having `idx_datoms_fulltext`, for example, we could create per-attribute indices when we register a schema fragment that includes `:fulltext true`:

`CREATE INDEX idx_datoms_fulltext_page_title
ON datoms (value_type_tag, v, a, e)
WHERE a = 65538     -- The interned attribute.`

I don't know if sqlite will happily use a collection of such indices (perhaps it'd be quicker for real-world queries!), but it's worth exploring.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider an alternative disk representation for datom flags #29

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Consider an alternative disk representation for datom flags #29

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions