Implement upsert resolution algorithm. #283
23ea4c9 to 6831925
It's worth mentioning that I do have a vision for the relationship between |
| use mentat_tx::entities::{Entity, OpType}; | ||
| use errors::{ErrorKind, Result, ResultExt}; | ||
| use types::{Attribute, AttributeBitFlags, DB, Entid, IdentMap, Partition, PartitionMap, Schema, TypedValue, ValueType}; | ||
| use types::{AVMap, AVPair, Attribute, AttributeBitFlags, DB, Entid, IdentMap, Partition, PartitionMap, Schema, TypedValue, TxReport, ValueType}; |
There was a problem hiding this comment.
Perhaps
{
Break,
This,
Line,
};
?
There was a problem hiding this comment.
Doesn't that just replace a "too wide" problem with a "too high" problem? Or maybe I'm taking you too literally?
use stuff::{
Long, List, Of, Things, To, Import,
Split, Over, Several, Lines,
};
?
There was a problem hiding this comment.
It leaves us in Java's import situation: a sorted line-list with one diff line per insert or removal. Not the most compact representation, but good at minimally and clearly representing change over time.
There was a problem hiding this comment.
I'm happy to push these onto multiple lines for clean diffs. I'll do it throughout.
| let bootstrap_db = DB::new(bootstrap_partition_map, bootstrap::bootstrap_schema()); | ||
| bootstrap_db.transact_internal(&tx, &bootstrap::bootstrap_entities()[..], bootstrap::TX0, now())?; | ||
| // TODO: return to transact_internal to self manage the encompassing SQLite transaction. |
| }).collect(); | ||
| // TODO: only cache the latest of these statements. Changing the set of partitions isn't | ||
| // supported in the Clojure implementation at all, and might not be supported in Mentat soon, |
| } | ||
| /// Allocate a single fresh entid in the given `partition`. | ||
| fn allocate_entid(&mut self, partition: String) -> i64 { |
There was a problem hiding this comment.
Can you use a &str here instead of a String? After all, you can't allocate an entid in a partition that isn't already in the partition map…
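To make the suggestion concrete, here is a minimal sketch of an allocator that borrows the partition name. The `Partition` and `Db` types below are hypothetical stand-ins, not the types in this PR, and returning `Option<i64>` for an unknown partition is an assumption made for the example (the function under review returns `i64`).

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a partition: only tracks the next free entid.
struct Partition {
    index: i64,
}

/// Hypothetical stand-in for the DB: a partition map keyed by partition ident.
struct Db {
    partition_map: BTreeMap<String, Partition>,
}

impl Db {
    /// Allocate a single fresh entid in the given `partition`.
    /// Taking `&str` means callers never have to build a `String` just to do a lookup;
    /// `BTreeMap<String, _>` accepts a `&str` key because `String: Borrow<str>`.
    fn allocate_entid(&mut self, partition: &str) -> Option<i64> {
        let p = self.partition_map.get_mut(partition)?;
        let entid = p.index;
        p.index += 1;
        Some(entid)
    }
}

fn main() {
    let mut partition_map = BTreeMap::new();
    partition_map.insert(":db.part/tx".to_string(), Partition { index: 1000 });
    let mut db = Db { partition_map };

    // No `.to_string()` at the call site.
    let tx = db.allocate_entid(":db.part/tx");
    println!("allocated {:?}", tx);
}
```

With this shape, the call in `transact` would no longer need `":db.part/tx".to_string()`.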
| } | ||
| /// Allocate `n` fresh entids in the given `partition`. | ||
| fn allocate_entids(&mut self, partition: String, n: usize) -> Range<i64> { |
| // TODO: move this to the transactor layer. | ||
| pub fn transact(&mut self, conn: &rusqlite::Connection, entities: &[Entity]) -> Result<TxReport> { | ||
| let tx_instant = now(); // Label the transaction with the timestamp when we first see it: leading edge. | ||
| let tx = self.allocate_entid(":db.part/tx".to_string()); |
| /// This approach is explained in https://github.com/mozilla/mentat/wiki/Transacting. | ||
| // TODO: move this to the transactor layer. | ||
| pub fn transact_internal(&self, conn: &rusqlite::Connection, entities: &[Entity], tx_id: Entid, tx_instant: i64) -> Result<TxReport> { | ||
| pub fn transact_internal(&mut self, conn: &rusqlite::Connection, entities: &[Entity], tx_id: Entid, tx_instant: i64) -> Result<TxReport> { |
There was a problem hiding this comment.
This is indeed starting to solidify on a wonky abstraction: transact_internal should take a reference to a DB as input, and return a report that refers to a (potentially changed) DB, no? Or, rather, altering the current internal DB of the conn is a side-effect.
Its contract is that the DB must match the contents of the database, so it can't take just any DB…
There was a problem hiding this comment.
I see you commented further, but I think there's a duality in many Rust expressions:
&mut self -> () and caller owns coordination; &self -> Self and callee owns coordination.
I elected for the former since I expect to build a Conn to manage the coordination in #296.
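The two styles can be put side by side in a small sketch. The `Db` type here is a hypothetical miniature (a single counter), and the method names are invented for illustration; it only shows the shape of each signature, not how the transactor works.

```rust
/// Hypothetical miniature of a DB: just the next entid to hand out.
#[derive(Debug, PartialEq)]
struct Db {
    next_entid: i64,
}

impl Db {
    /// Style 1: `&mut self -> ()`. The value is updated in place, so the caller
    /// must coordinate who holds the unique mutable reference.
    fn transact_in_place(&mut self, n: i64) {
        self.next_entid += n;
    }

    /// Style 2: `&self -> Self`. The input is left untouched and the callee
    /// builds and returns the successor value.
    fn transact_pure(&self, n: i64) -> Db {
        Db { next_entid: self.next_entid + n }
    }
}

fn main() {
    let mut a = Db { next_entid: 100 };
    a.transact_in_place(3);

    let b = Db { next_entid: 100 };
    let c = b.transact_pure(3);

    // Both styles arrive at the same state; only ownership of the update differs.
    assert_eq!(a, c);
    // The pure style leaves its input unchanged.
    assert_eq!(b.next_entid, 100);
}
```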
| /// A transaction on its way to being applied. | ||
| #[derive(Debug)] | ||
| pub struct Tx<'conn> { | ||
| /// The metadata to use to interpret the transaction entities with. |
There was a problem hiding this comment.
Ah, gotcha. I just didn't get this far yet.
rnewman
left a comment
There was a problem hiding this comment.
I'd like to see the domain concepts firm up sooner rather than later; having a mutating transact operation on a db seems super wrong…
| /// The transaction ID of the transaction. | ||
| pub tx_id: Entid, | ||
| /// The timestamp when the transaction began to be commited. |
There was a problem hiding this comment.
Fixed, here and another place.
| impl<'conn> Tx<'conn> { | ||
| pub fn new(db: &'conn mut DB, conn: &'conn rusqlite::Connection, tx_id: Entid, tx_instant: i64) -> Tx<'conn> { | ||
| Tx { db: db, |
| Tx { db: db, | ||
| conn: conn, | ||
| tx_id: tx_id, | ||
| tx_instant: tx_instant } |
There was a problem hiding this comment.
Trailing comma and } on its own line
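For reference, the requested layout looks roughly like this. The struct is a hypothetical, simplified version of the one under review (the real one also borrows a DB and a SQLite connection), so only the formatting is meant to carry over.

```rust
/// Hypothetical, simplified stand-in for the struct under review.
struct Tx {
    tx_id: i64,
    tx_instant: i64,
}

impl Tx {
    fn new(tx_id: i64, tx_instant: i64) -> Tx {
        // One field per line, a trailing comma after the last field,
        // and the closing brace on its own line.
        Tx {
            tx_id: tx_id,
            tx_instant: tx_instant,
        }
    }
}

fn main() {
    let tx = Tx::new(1, 2);
    println!("tx {} at {}", tx.tx_id, tx.tx_instant);
}
```

With this layout, adding or removing a field touches exactly one diff line.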
| let added = false; | ||
| /// Update the current partition map materialized view. | ||
| // TODO: only update changed partitions. | ||
| pub fn update_partition_map(&self, conn: &rusqlite::Connection) -> Result<()> { |
There was a problem hiding this comment.
Is it sensible to use &mut rusqlite::Connection here as a signal that this is a mutating operation?
There was a problem hiding this comment.
Indeed, only one thread should ever be using the same Connection, so perhaps we want type SQLite = &mut rusqlite::Connection;…?
There was a problem hiding this comment.
I haven't really worked out the details, but I think there's an entirely different division of responsibilities in the future. Right now, the data structures DB and Tx "own" the work of updating the SQLite store and processing an incoming transaction. My vision is that there's a split between the "updating the SQLite gubbins" and "term rewriting, resolution, etc" on top.
Following your suggestion, we might indeed define a type
type SQLiteConnection = &mut rusqlite::Connection;
and a trait that defines the "low level interface", like
trait ConcreteRepresentation {
    pub fn lookup_avs(...);
    pub fn insert_non_fts_one(...);
    ...
}
Then, we might
impl ConcreteRepresentation for SQLiteConnection {
    pub fn lookup_avs(...) { // Using SQLite-specific features, etc.
        ...
    }
}
That would separate the DB type from the implementation details that are specific to SQLite. You could imagine that some ConcreteRepresentation implementations might not support fulltext indexing at all; or that there's an implementation that uses Postgres or some other backing store.
Does that seem sensible?
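A compilable sketch of that split might look like the following. Everything in it is an assumption made for illustration: the method signatures are invented (the proposal elides them), an in-memory `MemoryStore` stands in for a SQLite-backed implementation so the example has no dependencies, and `upsert` is a hypothetical caller rather than the algorithm in this PR. Two details differ from the pseudocode: trait items cannot carry `pub`, and a type alias for a reference needs a lifetime parameter, e.g. `type SQLiteConnection<'a> = &'a mut rusqlite::Connection;`.

```rust
use std::collections::HashMap;

type Entid = i64;

/// Hypothetical low-level storage interface; method shapes are invented for illustration.
trait ConcreteRepresentation {
    /// For each (attribute, value) pair, return the entid that already carries it, if any.
    fn lookup_avs(&self, avs: &[(Entid, String)]) -> Vec<Option<Entid>>;
    /// Record a single non-fulltext datom.
    fn insert_non_fts_one(&mut self, e: Entid, a: Entid, v: String);
}

/// In-memory backing store standing in for SQLite, Postgres, or any other store.
#[derive(Default)]
struct MemoryStore {
    by_av: HashMap<(Entid, String), Entid>,
}

impl ConcreteRepresentation for MemoryStore {
    fn lookup_avs(&self, avs: &[(Entid, String)]) -> Vec<Option<Entid>> {
        avs.iter().map(|av| self.by_av.get(av).cloned()).collect()
    }

    fn insert_non_fts_one(&mut self, e: Entid, a: Entid, v: String) {
        self.by_av.insert((a, v), e);
    }
}

/// Higher-level logic depends only on the trait, never on the concrete store.
/// Returns the existing entid for (a, v) if there is one, else records `fresh`.
fn upsert<S: ConcreteRepresentation>(store: &mut S, fresh: Entid, a: Entid, v: &str) -> Entid {
    let found = store.lookup_avs(&[(a, v.to_string())])[0];
    match found {
        Some(existing) => existing,
        None => {
            store.insert_non_fts_one(fresh, a, v.to_string());
            fresh
        }
    }
}

fn main() {
    let mut store = MemoryStore::default();
    let first = upsert(&mut store, 100, 1, "a@example.com");
    let second = upsert(&mut store, 101, 1, "a@example.com");
    println!("first = {}, second = {}", first, second);
}
```

Because `upsert` is generic over the trait, a store without fulltext support or with a different backing database could be swapped in without touching the resolution logic.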
I explicitly am not supporting opening existing databases yet, let alone upgrading databases from earlier versions. That can follow fast once basic transactions are supported.
This adds TempId entities, but we can't disambiguate String temporary IDs from values without the use of the schema, so there's no new value branch. Similarly, we can't disambiguate lookup-ref values from two-element list values without a schema, so we remove this entirely. We'll handle the ambiguity later in the transactor.
This converts an existing test to EDN: https://github.com/mozilla/mentat/blob/84a80f40f5c888f8452d07bd15f3b5fba49d3963/test/datomish/db_test.cljc#L193.
This is very preliminary, since we don't have a real connection type to manage transactions and their metadata yet.
67cb18f to f715a27
This PR will end up implementing #184. Right now it's just a placeholder for some issues I want to file.