Skip to content
This repository was archived by the owner on Sep 12, 2018. It is now read-only.

Implement upsert resolution algorithm.#283

Merged
ncalexan merged 13 commits into
mozilla:rustfrom
ncalexan:upsert-resolution
Feb 15, 2017
Merged

Implement upsert resolution algorithm.#283
ncalexan merged 13 commits into
mozilla:rustfrom
ncalexan:upsert-resolution

Conversation

@ncalexan
Copy link
Copy Markdown
Member

This PR will end up implementing #184. Right now it's just a place-holder for some issues I want to file.

@ncalexan
Copy link
Copy Markdown
Member Author

@rnewman, @jsantell this is ready for initial review. There are some loose ends like collecting tempids (and testing them) and several follow up tickets (supporting lookup refs, map notation). But it's best to get some review and hopefully get this stuff landed to get other people involved.

@ncalexan
Copy link
Copy Markdown
Member Author

It's worth mentioning that I do have a vision for the relationship between DB, Tx, and (future) Conn, but I'm not close enough to the implementation to worry too much about getting it right just yet. So Tx and DB remain inter-twingled for now; the division of responsibilities will become clearer in time.

Copy link
Copy Markdown
Collaborator

@rnewman rnewman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In-progress.

Comment thread db/src/db.rs Outdated
use mentat_tx::entities::{Entity, OpType};
use errors::{ErrorKind, Result, ResultExt};
use types::{Attribute, AttributeBitFlags, DB, Entid, IdentMap, Partition, PartitionMap, Schema, TypedValue, ValueType};
use types::{AVMap, AVPair, Attribute, AttributeBitFlags, DB, Entid, IdentMap, Partition, PartitionMap, Schema, TypedValue, TxReport, ValueType};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps

{
  Break,
  This,
  Line,
};

?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that just replace a "too wide" problem with a "too high" problem? Or maybe I'm taking you to literally?

use stuff::{
    Long, List, Of, Things, To, Import,
    Split, Over, Several, Lines,
};

?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It leaves us in Java's import situation: a sorted line-list with one diff line per insert or removal. Not the most compact representation, but good at minimally and clearly representing change over time.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to push these onto multiple lines for clean diffs. I'll do it throughout.

Comment thread db/src/db.rs Outdated

let bootstrap_db = DB::new(bootstrap_partition_map, bootstrap::bootstrap_schema());
bootstrap_db.transact_internal(&tx, &bootstrap::bootstrap_entities()[..], bootstrap::TX0, now())?;
// TODO: return to transact_internal to self manage the encompassing SQLite transaction.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "self-manage"

Comment thread db/src/db.rs Outdated
}).collect();

// TODO: only cache the latest of these statements. Changing the set of partitions isn't
// supported in the Clojure implementationat all, and might not be supported in Mentat soon,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "implementation at"

Comment thread db/src/db.rs Outdated
}

/// Allocate a single fresh entid in the given `partition`.
fn allocate_entid(&mut self, partition: String) -> i64 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use a &str here instead of a String? After all, you can't allocate an entid in a partition that isn't already in the partition map…

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread db/src/db.rs Outdated
}

/// Allocate `n` fresh entids in the given `partition`.
fn allocate_entids(&mut self, partition: String, n: usize) -> Range<i64> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

Comment thread db/src/db.rs Outdated
// TODO: move this to the transactor layer.
pub fn transact(&mut self, conn: &rusqlite::Connection, entities: &[Entity]) -> Result<TxReport> {
let tx_instant = now(); // Label the transaction with the timestamp when we first see it: leading edge.
let tx = self.allocate_entid(":db.part/tx".to_string());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… same.

Comment thread db/src/db.rs Outdated
/// This approach is explained in https://github.com/mozilla/mentat/wiki/Transacting.
// TODO: move this to the transactor layer.
pub fn transact_internal(&self, conn: &rusqlite::Connection, entities: &[Entity], tx_id: Entid, tx_instant: i64) -> Result<TxReport> {
pub fn transact_internal(&mut self, conn: &rusqlite::Connection, entities: &[Entity], tx_id: Entid, tx_instant: i64) -> Result<TxReport> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed starting to solidify on a wonky abstraction: transact_internal should take a reference to a DB as input, and return a report that refers to a (potentially changed) DB, no? Or, rather, altering the current internal DB of the conn is a side-effect.

Its contract is that the DB must match the contents of the database, so it can't take just any DB

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you commented further, but I think a duality in many Rust expressions:

  • &mut self -> () and caller owns coordination;
  • &self -> Self and callee owns coordination.

I elected for the former since I expect to build a Conn to manage the coordination in #296.

Comment thread db/src/tx.rs
/// A transaction on its way to being applied.
#[derive(Debug)]
pub struct Tx<'conn> {
/// The metadata to use to interpret the transaction entities with.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, gotcha. I just didn't get this far yet.

Copy link
Copy Markdown
Collaborator

@rnewman rnewman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see the domain concepts firm up sooner rather than later; having a mutating transact operation on a db seems super wrong…

Comment thread db/src/tx.rs Outdated
/// The transaction ID of the transaction.
pub tx_id: Entid,

/// The timestamp when the transaction began to be commited.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"committed"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, here and another place.

Comment thread db/src/tx.rs Outdated

impl<'conn> Tx<'conn> {
pub fn new(db: &'conn mut DB, conn: &'conn rusqlite::Connection, tx_id: Entid, tx_instant: i64) -> Tx<'conn> {
Tx { db: db,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newline after {

Comment thread db/src/tx.rs Outdated
Tx { db: db,
conn: conn,
tx_id: tx_id,
tx_instant: tx_instant }
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing comma and } on its own line

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread db/src/db.rs
let added = false;
/// Update the current partition map materialized view.
// TODO: only update changed partitions.
pub fn update_partition_map(&self, conn: &rusqlite::Connection) -> Result<()> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it sensible to use &mut rusqlite::Connection here as a signal that this is a mutating operation?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, only one thread should ever be using the same Connection, so perhaps we want type SQLite = &mut rusqlite::Connection;…?

Copy link
Copy Markdown
Member Author

@ncalexan ncalexan Feb 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really worked out the details, but I think there's an entirely different division of responsibilities in the future. Right now, the data structures DB and Tx "own" the work of updating the SQLite store and processing an incoming transaction. My vision is that there's a split between the "updating the SQLite gubbins" and "term rewriting, resolution, etc" on top.

Following your suggestion, we might indeed define a type

type SQLiteConnection = &mut rusqlite::Connection;

and a trait that defines the "low level interface", like

trait ConcreteRepresentation {
  pub fn lookup_avs(...);
  pub fn insert_non_fts_one(...);
  ...
}

Then, we might

impl ConcreteRepresentation for SQLiteConnection {
  pub fn lookup_avs(...) { // Using SQLite-specific features, etc.
  ...
}

That would separate the DB type from the implementation details that are specific to SQLite. You could imagine that some ConcreteRepresentation implementations might not support fulltext indexing at all; or that there's an implementation that uses Postgres or some other backing store.

Does that seem sensible?

I explicitly am not supporting opening existing databases yet, let
alone upgrading databases from earlier versions.  That can follow fast
once basic transactions are supported.
This adds TempId entities, but we can't disambiguate String temporary
IDs from values without the use of the schema, so there's no new value
branch.  Similarly, we can't disambiguate lookup-ref values from two
element list values without a schema, so we remove this entirely.
We'll handle the ambiguity later in the transactor.
This is very preliminary, since we don't have a real connection type
to manage transactions and their metadata yet.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants