- Linux
- ghc 8.6
- cabal-install 3.2
- PostgreSQL backend
I have a very simple delete cascade function:
```haskell
deleteCascadeSlotNo :: MonadIO m => Word64 -> ReaderT SqlBackend m Bool
deleteCascadeSlotNo slotNo = do
  keys <- selectKeysList [ BlockSlotNo ==. Just slotNo ] []
  mapM_ deleteCascade keys
  pure $ not (null keys)
```
It is not enforced by the schema, but I know that `slotNo` is unique, and I have confirmed (by adding debug output) that the list of keys contains at most one element. However, the database has 25 tables (with more to come) with a relatively high level of linkage between the tables, in part to ensure proper delete/cascade behavior.
Currently, when `deleteCascadeSlotNo` is used on the `block` table (over 5 million entries) it typically only needs to delete the most recently added row or two. However, for debugging/testing I may want to delete many more rows, and this is where we run into problems. For larger delete/cascade operations, the time taken increases dramatically:
| Rows to delete | Time taken |
| --- | --- |
| 1000 | 0m44s |
| 2000 | 1m25s |
| 4000 | 3m04s |
| 8000 | 7m14s |
| 16000 | 13m18s |
The time taken is pretty much linear in the number of rows to be deleted, but for large numbers of rows (say 1 million) it can still be slow (30 minutes or more).
The slowness seems to be due to the fact that the operation requires a lot of back-and-forth communication between Haskell/Persistent and PostgreSQL. The delete/cascade operation would be much quicker if more of the work was handed off to Postgres.
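As a rough illustration of handing the work to Postgres: if the foreign keys in the schema were declared with `ON DELETE CASCADE` (which, as noted above, they currently are not; Persistent's `deleteCascade` walks the references in Haskell instead), a single raw `DELETE` would cascade server-side in one round trip. This is only a sketch, the table/column names (`block`, `slot_no`) and the function name are assumptions:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Monad.IO.Class (MonadIO)
import Control.Monad.Trans.Reader (ReaderT)
import Data.Word (Word64)
import Database.Persist.Sql (PersistValue (PersistInt64), SqlBackend, rawExecute)

-- Hypothetical fast path: one statement, one round trip.
-- Assumes the schema's foreign keys carry ON DELETE CASCADE so that
-- Postgres itself removes the dependent rows in the other tables.
deleteBlocksFromSlot :: MonadIO m => Word64 -> ReaderT SqlBackend m ()
deleteBlocksFromSlot slotNo =
  rawExecute
    "DELETE FROM block WHERE slot_no >= ?"
    [PersistInt64 (fromIntegral slotNo)]
```

The trade-off is that the cascade logic then lives in the database schema rather than in the Haskell entity definitions, so the two must be kept in sync.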