beam-migrate: CREATE INDEX support#789
Conversation
| -- Collect user-created secondary indices. | ||
| -- | ||
| -- Excludes: | ||
| -- - primary keys | ||
| -- - indices that back a constraint (i.e. those created implicitly by UNIQUE/EXCLUDE) | ||
| -- - expression indices e.g. CREATE INDEX ON users (LOWER(email)) | ||
| secondaryIndexes <- | ||
| map (\(schema, tblNm, idxNm, isUniq, cols) -> | ||
| Db.SomeDatabasePredicate | ||
| (Db.TableHasIndex (Db.QualifiedName schema tblNm) idxNm (V.toList cols) isUniq)) <$> | ||
| Pg.query_ conn (fromString (unlines | ||
| [ -- NULL out 'public' since it is the implicit default schema in Postgres | ||
| "SELECT NULLIF(ns.nspname, 'public'), c.relname, i.relname, ix.indisunique," | ||
| -- re-aggregate column names in index-key order (see ORDINALITY below) | ||
| , " array_agg(a.attname ORDER BY k.n ASC)" | ||
| , "FROM pg_index ix" | ||
| , "JOIN pg_class c ON c.oid = ix.indrelid" | ||
| , "JOIN pg_class i ON i.oid = ix.indexrelid" | ||
| , "JOIN pg_namespace ns ON ns.oid = c.relnamespace" | ||
| -- ORDINALITY allows retaining ordering of index columns | ||
| , "CROSS JOIN unnest(ix.indkey) WITH ORDINALITY k(attid, n)" | ||
| , "JOIN pg_attribute a ON a.attnum = k.attid AND a.attrelid = ix.indrelid" | ||
| -- only regular tables (not views, sequences, etc.) | ||
| , "WHERE c.relkind = 'r'" | ||
| -- exclude Postgres system schemas | ||
| , " AND ns.nspname NOT LIKE 'pg_%'" | ||
| , " AND ns.nspname != 'information_schema'" | ||
| -- exclude primary key indices | ||
| , " AND NOT ix.indisprimary" | ||
| -- exclude indices created implicitly by a UNIQUE or EXCLUDE constraint | ||
| , " AND NOT EXISTS (SELECT 1 FROM pg_constraint con WHERE con.conindid = ix.indexrelid)" | ||
| -- exclude expression indices: a key column number of 0 means that | ||
| -- position is an expression (e.g. lower(col)) rather than a plain | ||
| -- column reference, which TableHasIndex cannot represent | ||
| , " AND NOT EXISTS (SELECT 1 FROM unnest(ix.indkey) AS k(attnum) WHERE k.attnum = 0)" | ||
| , "GROUP BY ns.nspname, c.relname, i.relname, ix.indisunique" ])) | ||
|
|
There was a problem hiding this comment.
Claude Code helped me write this, because I was completely out of my depth with it. I can't really vouch for its correctness.
There was a problem hiding this comment.
Based on a cursory reading of the documentation it looks fine. It's exercised in tests
|
Thanks! I should be able to review in the next few days. As a side-note: did you check out other attempts at this? e.g. #335 |
I wasn't aware of this approach, I'll take a look this week. |
|
#335 was adding the notion of index to #335 has a lot of complexity due to trying to derive indices from a separate record description of indices using generics, with a separate #335 was missing SQLite support, and had no schema introspection queries. #335 also made I took a couple of changes from that PR and pushed them here:
|
|
In conclusion, I think this is simpler than #335 because it avoids the machinery for deriving secondary indices using Generics. That part also seemed to be what caused #335 to get bogged down. With this PR the approach is a bit more manual (in particular, the secondary indices are named manually). None of this is meant as a judgement on #335 because that functionality is indeed quite appealing. My use case is mainly to automate away manual index creation and have the migrations framework handle it, which this PR does quite well. But otherwise the approach here is a bit barebones in comparison. |
| , Eq (Sql92CreateIndexOptionsSyntax syntax) | ||
| , Hashable (Sql92CreateIndexOptionsSyntax syntax) | ||
| ) => IsSql92CreateDropIndexSyntax syntax where | ||
| data family Sql92CreateIndexOptionsSyntax syntax |
There was a problem hiding this comment.
I'm not familiar with data families. I would have expected a closed type family instead:
class ( IsSql92DdlCommandSyntax syntax
, Show (Sql92CreateIndexOptionsSyntax syntax)
, Eq (Sql92CreateIndexOptionsSyntax syntax)
, Hashable (Sql92CreateIndexOptionsSyntax syntax)
) => IsSql92CreateDropIndexSyntax syntax where
type Sql92CreateIndexOptionsSyntax syntaxThat's how other syntaxes are represented in Beam.
Is there an advantage to using data families?
There was a problem hiding this comment.
It's one less indirection. With this setup one writes:
instance IsSql92CreateDropIndexSyntax MySyntax where
data Sql92CreateIndexOptionsSyntax MySyntax
= MySyntaxIndexOptions { field1 :: Ty1, field2 :: Ty2 }
deriving stock (Eq, Ord, Generic)
deriving anyclass Hashablewhereas with a type family it would be:
instance IsSql92CreateDropIndexSyntax MySyntax where
type Sql92CreateIndexOptionsSyntax MySyntax = MySyntaxIndexOptions
data MySyntaxIndexOptions
= MySyntaxIndexOptions { field1 :: Ty1, field2 :: Ty2 }
deriving stock (Eq, Ord, Generic)
deriving anyclass HashableThe latter is strictly more boilerplate, and also less permissive as one cannot write unsaturated type families while one can write unsaturated data families. Perhaps not so relevant here, but it does make some type-level programming idioms impossible, e.g. (rough sketch):
type KnownSyntaxes = [PgCommandSyntax, SqliteCommandSyntax]
type AllIndicesSupport :: (Type -> Constraint) -> Constraint
type AllIndicesSupport c = All (c . Sql92CreateIndexOptionsSyntax) KnownSyntaxesAll that said, if you think it would be better for consistency I can switch the code to using type families.
There was a problem hiding this comment.
In the type family case, MySyntaxIndexOptions has a standalone type, whereas the data family does not. Can the data family instance be used in a standalone way?
Imagine a backend-specific function on an index. Something like:
someFunc :: PgSyntaxIndexOptions -> Pg ()In the data family case, would that be written like this?
someFunc :: Sql92CreateIndexOptionsSyntax PgSyntax -> Pg ()If so, I lean towards keeping the consistency with other bits of beam by using type families
There was a problem hiding this comment.
Yes, it would be written as you say.
There was a problem hiding this comment.
Allright, let's keep the consistency by using a type family instead of data family and then we can wrap this PR up!
Thank you for your patience through this review process
There was a problem hiding this comment.
I've updated the class to use a type family. It has the unfortunate effect of being more ambiguous, because a type signature such as
indexIsUnique :: Sql92CreateIndexOptionsSyntax syntax -> Boolwhich used to be unambiguous is now ambiguous (because syntax only appears guarded under a type family).
If you look at the commit I think you'll agree things are quite a bit less ergonomic like this, but I agree it's also important to keep the interface consistent.
| -- Collect user-created secondary indices. | ||
| -- | ||
| -- Excludes: | ||
| -- - primary keys | ||
| -- - indices that back a constraint (i.e. those created implicitly by UNIQUE/EXCLUDE) | ||
| -- - expression indices e.g. CREATE INDEX ON users (LOWER(email)) | ||
| secondaryIndexes <- | ||
| map (\(schema, tblNm, idxNm, isUniq, cols) -> | ||
| Db.SomeDatabasePredicate | ||
| (Db.TableHasIndex (Db.QualifiedName schema tblNm) idxNm (V.toList cols) isUniq)) <$> | ||
| Pg.query_ conn (fromString (unlines | ||
| [ -- NULL out 'public' since it is the implicit default schema in Postgres | ||
| "SELECT NULLIF(ns.nspname, 'public'), c.relname, i.relname, ix.indisunique," | ||
| -- re-aggregate column names in index-key order (see ORDINALITY below) | ||
| , " array_agg(a.attname ORDER BY k.n ASC)" | ||
| , "FROM pg_index ix" | ||
| , "JOIN pg_class c ON c.oid = ix.indrelid" | ||
| , "JOIN pg_class i ON i.oid = ix.indexrelid" | ||
| , "JOIN pg_namespace ns ON ns.oid = c.relnamespace" | ||
| -- ORDINALITY allows retaining ordering of index columns | ||
| , "CROSS JOIN unnest(ix.indkey) WITH ORDINALITY k(attid, n)" | ||
| , "JOIN pg_attribute a ON a.attnum = k.attid AND a.attrelid = ix.indrelid" | ||
| -- only regular tables (not views, sequences, etc.) | ||
| , "WHERE c.relkind = 'r'" | ||
| -- exclude Postgres system schemas | ||
| , " AND ns.nspname NOT LIKE 'pg_%'" | ||
| , " AND ns.nspname != 'information_schema'" | ||
| -- exclude primary key indices | ||
| , " AND NOT ix.indisprimary" | ||
| -- exclude indices created implicitly by a UNIQUE or EXCLUDE constraint | ||
| , " AND NOT EXISTS (SELECT 1 FROM pg_constraint con WHERE con.conindid = ix.indexrelid)" | ||
| -- exclude expression indices: a key column number of 0 means that | ||
| -- position is an expression (e.g. lower(col)) rather than a plain | ||
| -- column reference, which TableHasIndex cannot represent | ||
| , " AND NOT EXISTS (SELECT 1 FROM unnest(ix.indkey) AS k(attnum) WHERE k.attnum = 0)" | ||
| , "GROUP BY ns.nspname, c.relname, i.relname, ix.indisunique" ])) | ||
|
|
There was a problem hiding this comment.
Based on a cursory reading of the documentation it looks fine. It's exercised in tests
80445a4 to
8a454b7
Compare
This commit adds functionality to add secondary indices to a database to speed up queries. The user-facing API consists of the 'addTableIndex' function, and the helper 'selectorColumnName' function. Backend support goes via the new typeclass 'IsSql92CreateDropIndexSyntax', with support in both the SQLite and Postgres backends.
|
The failures on 9.12 and 9.14 seem spurious as they have to do with the installation of |
|
Awesome @sheaf ! Thanks for your contribution |
|
I should be able to make new releases for beam-migrate / beam-sqlite / beam-postgres today |
This PR adds functionality to add secondary indices to a database to speed up queries.
The user-facing API consists of the
addTableIndexfunction, and the helperindexColfunction.Backend support goes via the new typeclass
IsSql92CreateDropIndexSyntax, with support in both the SQLite and Postgres backends.