Skip to content

SQL: counters for rows read and written#979

Merged
smerritt merged 2 commits intocloudflare:mainfrom
smerritt:smerritt/sqlite-row-counts-amalgamated
Aug 10, 2023
Merged

SQL: counters for rows read and written#979
smerritt merged 2 commits intocloudflare:mainfrom
smerritt:smerritt/sqlite-row-counts-amalgamated

Conversation

@smerritt
Copy link
Copy Markdown
Contributor

@smerritt smerritt commented Aug 4, 2023

In a durable object, state.storage.sql now has two new properties: rowsRead and rowsWritten. These count the number of rows read and written by the underlying SQLite database.

This is more than just the number of rows returned from queries or the number of records inserted/updated; it counts all rows read by the database engine in the processing of a query. For example, SELECT MAX(unindexed_column) FROM a_table returns a single result but reads every row from disk, and rowsRead reflects that.

@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from 217fadf to 91b9597 Compare August 4, 2023 23:09
@smerritt smerritt marked this pull request as ready for review August 4, 2023 23:20
@kentonv
Copy link
Copy Markdown
Member

kentonv commented Aug 5, 2023

Does this patch come directly from libSQL? If so we need to include license and copyright information, probably in the description header for each patch file. (Looks like it's MIT license, so no issues using it, but still need to include the copyright.)

Comment thread src/workerd/util/sqlite.h Outdated
Comment thread src/workerd/util/sqlite.h Outdated
Comment thread src/workerd/util/sqlite.c++ Outdated
Comment thread src/workerd/util/sqlite.c++
Comment thread src/workerd/api/sql.h Outdated
Comment thread src/workerd/util/sqlite.c++ Outdated
@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch 2 times, most recently from 381b2b6 to eb97f7c Compare August 7, 2023 23:52
@smerritt
Copy link
Copy Markdown
Contributor Author

smerritt commented Aug 7, 2023

The patch now contains an appropriate copyright header pointing at libsql.

@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from eb97f7c to 085518a Compare August 7, 2023 23:56
Comment thread src/workerd/util/sqlite.h Outdated
Comment thread src/workerd/api/sql.c++ Outdated

auto& query = state.query;

obj->rowsRead = query.getRowsRead();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it seems wasteful to query these counters on every single row even if the app doesn't ever use them. (And even apps that do use them probably wouldn't check them on every row.)

Maybe we could have the implementation of getRowsRead() delegate to the underlying Query if it is still active. Then we only need to make a copy when the query is done.

Or maybe this isn't expensive enough to matter?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're cheap enough that this probably doesn't matter, but I agree that it looks wasteful, which is a trap for future programmers to fall into.

I took your suggestion of saving them off when the Query is done. I didn't see a way of invalidating a query out from under a Cursor, but if you know of one, then please tell me because that'll make the cursor falsely report 0 for row counts.

@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from 085518a to cf38956 Compare August 9, 2023 23:16
This takes the row-counting logic from libsql[1] and sticks it into
our copy of SQLite.

libsql is an ambitious fork of SQLite, so switching out the entire
library would be quite risky. Porting the row-counting logic from
libsql (which is MIT-licensed) is a much less risky change, though it
does require us to carry a patch around forever, or at least until
SQLite implements something similar, if they ever do.

[1] A fork of sqlite that accepts contributions.
    https://github.com/libsql/libsql
The `storage.sql` object in JS now has two new properties: `rowsRead`
and `rowsWritten`, which contain appropriate numbers. Similarly,
workerd::api::SqliteDatabase::getRows[Read|Written] provide the same
numbers in C++.

The counters are only updated after a full result set has been read. I
thought about updating the counters as each row was read, but that has
the disadvantage of counting rows that weren't useful due to errors.
If one wanted to meter usage based on rows, one would only want to
count rows for successfully-executed queries.
@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from cf38956 to 26f0498 Compare August 10, 2023 16:37
@smerritt smerritt merged commit bb06753 into cloudflare:main Aug 10, 2023
@smerritt smerritt deleted the smerritt/sqlite-row-counts-amalgamated branch August 10, 2023 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants