Proposal: New Architecture for `sqlx::query!()` Macros

# Status Quo

Currently, the default mode of operation for the `sqlx::query!()` family of macros is to connect to a running database server (or open that database in-process for SQLite) as specified by the `DATABASE_URL` environment variable, and then use database-specific commands to parse and analyze the passed query, to get information on bind parameters and output columns and generate code based on that information.

As an option, the macros can also operate in an "offline" mode where instead of connecting to a database server, they read a `sqlx-data.json` file created by `cargo sqlx prepare` which contains the same data and use that to generate code.

The implementation is relatively straightforward and functions well enough, but has some drawbacks:

### Compiling a project with a lot of `query!()` macro invocations is painfully slow.

Since each macro invocation involves opening and working with a TCP connection to a database server from within an asynchronous runtime, the query macros execute a _lot_ more code than your typical procedural macro. This is compounded by the fact that procedural macros are exclusively compiled in _debug mode_ which means no optimizations (which async code relies pretty heavily on to be fast).

I have tried playing around with setting `rustflags` in a `config.toml` to force `sqlx-macros` to compile in release mode but it didn't really help and seemed to actually _worsen_ compile times if it did anything at all; the most noticeable thing that it did was clobber all the cached compiler artifacts for the whole dependency tree, requiring a full recompile.

Macro invocations are currently executed serially which further compounds the problem.

Currently as a mitigation we only start a single runtime and then cache it in a `lazy_static!()`; since proc macros are executed in a separate process that is kept running for the duration of the compiler, this saves some overhead per invocation. I've been meaning to see if caching a `Pool` of connections saves any overhead as well. I imagine it would, at least somewhat. We could cache analysis results as well, but that would only help for queries that are copy-pasted a lot.

### Many people find the idea of procedural macros opening arbitrary TCP connections to be highly unpalatable.

Oftentimes when someone comes across SQLx and starts reading about the query macros, their first reaction is something along the lines of, "wow, the compiler actually lets you _do_ that? That's disgusting!"

Fair enough, the current procedural macro environment is extremely _laissez-faire_ and there have been calls to lock down what procedural macros (and build scripts) can actually do, thanks to sensationalized proof-of-concept projects showing how a proc macro executed by a transitive dependency could steal sensitive information like SSH keys and the like.

Personally, I think the threat is kind of overblown, or at least rather moot, considering a malicious crate in your dependency tree could accomplish all kinds of nastiness already, just at runtime instead of compile time. I fear that a big push to sandbox procedural macros and build scripts could lure people into a false sense of security and make them feel like they don't need to audit their dependencies. But, I digress.

### The `query!()` macros don't play that well with IDEs.

This situation has improved drastically compared to when we first introduced SQLx, to the point where IntelliJ-Rust and Rust Analyzer can both provide code completion for `sqlx::query!()`. However, it can still take some time to get results, in part thanks to point 1 here, and the macros aren't "pure" in the sense that their output could change from run to run based on schema changes, without the compiler or IDEs being able to pick up on it. They should be working from cached output where possible, and be able to invalidate that cache in a way that IDE plugins can learn about.

### Offline mode is finicky to use and it's easy to forget to run `cargo sqlx prepare` before committing.

Offline mode isn't easily discoverable (in part thanks to just how dense the documentation on `sqlx::query!()` has gotten... we really need to move that stuff to an MDBook or something). It's also easy to forget to run `cargo sqlx prepare` before committing, which often leads to really confusing CI errors.

In the past, what I've done is add `cargo sqlx prepare && git add sqlx-data.json` to `.git/hooks/precommit` to make sure I don't forget, but this makes committing take forever thanks to point 1.

### The macros don't have a way to ensure that the database is up-to-date with migrations.

This is somewhat on purpose as we don't want the macros automatically executing migrations which may end up breaking the user's local development database and/or clobbering manually generated test data. Still, it does get annoying.

# Proposal

Instead of having the macros connect directly to the database server at compile time, they could communicate with a daemon process that the user starts and leaves running while working on their project.

I'm imagining this daemon process as another subcommand of `sqlx-cli`, perhaps `cargo sqlx watch` or something like that. It would watch `target/sqlx` which the macros would write their query strings to as files. I considered using a crate like [`interprocess`](https://docs.rs/interprocess/latest/interprocess/) to let the macros and daemon talk directly, but I think communicating with the filesystem is better because that's a medium that the compiler natively understands as I'll elaborate on in a second.

Since a `cargo install`ed binary is built in release mode by default, the latency of the actual request to parse and analyze the query should improve quite a bit.

The macros could submit their request to the daemon and then return some dummy code (or even a temporary error, like ``Waiting for query analysis. Is `cargo sqlx watch` running?``), telling the compiler to watch `target/sqlx` for the daemon to write the result of processing the query. Then the compiler could move on to the next macro invocation instead of blocking on the completion of the first one. This would let the macros essentially run in parallel, which should significantly reduce overall compile times.

Assuming #570 is implemented as part of this rearchitecting, the daemon could also automatically update entries in `.sqlx/` for successfully parsed queries so that `cargo sqlx prepare` doesn't have to be manually run before committing.

This would also be more amenable to sandboxing as we would only require write access to `target/` and read access to the project, which wouldn't be doing anything more than what build scripts are already expected to do, and any sandboxing scheme that is implemented would be required to support. Support for Windows named pipes or Unix domain sockets would probably be right out in this case unless we specifically asked for it, and that would be an uphill battle.

The daemon could also periodically re-run the analysis of queries to pick up changes to the database schema.

@mehcode suggested that the daemon could automatically migrate the database if necessary, so no other manual setup would be needed. It could even spin up a temporary database so migrations could be automatically tested and synchronized with the query macros without breaking a local development database.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: New Architecture for `sqlx::query!()` Macros #1598

Status Quo

Compiling a project with a lot of `query!()` macro invocations is painfully slow.

Many people find the idea of procedural macros opening arbitrary TCP connections to be highly unpalatable.

The `query!()` macros don't play that well with IDEs.

Offline mode is finicky to use and it's easy to forget to run `cargo sqlx prepare` before committing.

The macros don't have a way to ensure that the database is up-to-date with migrations.

Proposal

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Proposal: New Architecture for sqlx::query!() Macros #1598

Description

Status Quo

Compiling a project with a lot of query!() macro invocations is painfully slow.

Many people find the idea of procedural macros opening arbitrary TCP connections to be highly unpalatable.

The query!() macros don't play that well with IDEs.

Offline mode is finicky to use and it's easy to forget to run cargo sqlx prepare before committing.

The macros don't have a way to ensure that the database is up-to-date with migrations.

Proposal

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Proposal: New Architecture for `sqlx::query!()` Macros #1598

Compiling a project with a lot of `query!()` macro invocations is painfully slow.

The `query!()` macros don't play that well with IDEs.

Offline mode is finicky to use and it's easy to forget to run `cargo sqlx prepare` before committing.