Skip to content

.Net: Change Sqlite connector to accept connection string instead of DbConnection#10972

Merged
roji merged 2 commits intomicrosoft:feature-vector-data-preb2from
roji:Sqlite
Mar 14, 2025
Merged

.Net: Change Sqlite connector to accept connection string instead of DbConnection#10972
roji merged 2 commits intomicrosoft:feature-vector-data-preb2from
roji:Sqlite

Conversation

@roji
Copy link
Copy Markdown
Member

@roji roji commented Mar 14, 2025

This is a resubmit of #10550, which was closed when its target branch (feature-vector-data-preb1) was deleted.

  • I believe that IVectorStore and IVectorStoreRecordCollection should have clear guarantees around thread-safety/concurrency, which should be clearly documented etc. For example, we shouldn't have one implementation/instance of IVectorStoreRecordCollection which is thread-safe, and another which isn't (in the way that SqliteVectorStoreRecordCollection is, by closing over SqliteConnection, which itself isn't thread-safe).
  • If we agree on the above, I believe the MEVD abstractions should be thread-safe; IVectorStore and IVectorStoreRecordCollection conceptually represent/reference database-side entities (the database itself, a collection within the database), and not e.g. a connection to the database (like DbConnection); so an instance doesn't represent something that's client-side stateful (again, like a connection). Also, in most databases, the client is thread-safe, representing simply a way to execute calls (e.g. via gRPC) on the database.
  • The proposed implementation is very similar to the PostgreSQL one - a (pooled) connection is retrieved and opened for each operation, and then closed. Note that this means that Data Source=:memory: cannot be used in most cases, since the databases only lives while the connection is open, and disappears immediately when closed. This isn't perfect, but supporting in-memory would mean no concurrent usage.
  • Unlike the PostgreSQL implementation, there's no SqliteDataSource like there is NpgsqlDataSource (but I hope there will be at some point - see #30991); there's a DbDataSource abstraction in ADO.NET which we could theoretically accept, but that's not accepted usage, and in any case, we require a SqliteConnection (not DbConnection) in order to e.g. call LoadExtension(), and so would need to down-cast DbConnections coming out of DbDataSource in any case. As a result, I'm proposing that for now, the Sqlite connector simply accept a connection string, which is the natural/common way for ADO.NET-related things to work. In the future, when a SqliteDataSource is introduced, we can add support for that like we have for NpgsqlDataSource.
  • I started out trying to preserve (but obsolete) the ability to accept SqliteConnection (to avoid the breaking change), but it turned out quite complicated, requiring lots of refactoring just for that, and a bit of risk of getting it wrong. Since this would be a temporary, one-month obsoletion only, and this is a SQLite-only break, I'm proposing to break this now rather than do all the effort and take on the risk for the two months we have. So I've obsoleted the relevant APIs with [Obsolete(error: true)], to have an informative error message for the user.

Closes #10454

/cc @westey-m @dmytrostruk @adamsitnik

@roji roji requested a review from a team as a code owner March 14, 2025 09:28
@roji roji changed the base branch from main to feature-vector-data-preb2 March 14, 2025 09:28
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel memory labels Mar 14, 2025
@github-actions github-actions Bot changed the title Change Sqlite connector to accept connection string instead of DbConnection .Net: Change Sqlite connector to accept connection string instead of DbConnection Mar 14, 2025
Copy link
Copy Markdown
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @roji !

@roji roji merged commit ce572c8 into microsoft:feature-vector-data-preb2 Mar 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kernel Issues or pull requests impacting the core kernel memory .NET Issue or Pull requests regarding .NET code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

.Net MEVD: Clarify thread-safety of the abstraction

4 participants