sqldb: add invoice schema and sql queries by positiveblue · Pull Request #7354 · lightningnetwork/lnd

positiveblue · 2023-01-24T17:39:02Z

This is the first PR for #6288

In this PR we define the tables, indexes and queries for our invoice schema.

We aim to support sqlite/postgres out of the box. The sql code in the migration/query files need to work for both engines with the minimum changes and we want to avoid using specific functionalities like postgres SEQUENCEs.

We will use sqlc to generate go code from SQL. The tool uses the config defined in sqlc.yaml. By default we will set the engine to postgres but the generated code also works for sqlite.

The invoice schema can be divided in three parts:

Bolt11 invoices: All the common tables to keep track of created invoices, their htlcs and their payments.
AMP invoices: Most of the logic is shared with BOLT11 invoices, but we add a couple of tables to be able to keep track of the different payments for each AMP invoice (identified by the set_id) and the extra data of each htlc.
Invoice Events: This aims to support a new functionality that is not currently supported by the KV store. We are interested in keep a record of the events that the an invoice went through.

Because the queries are not generated at runtime, we need to write generic ones that allow us to filter/order using different filter parameters that can be set to NULL when we don't want to apply in a specific query.

For filtering, we use the pattern

(
    column = sqlc.narg('param_name') OR
    sqlc.narg('param_name') IS NULL
) AND (
    column = sqlc.narg('param_name_2') OR
    sqlc.narg('param_name_2') IS NULL
)

We also can also use

    CASE
        WHEN sqlc.narg('param_name')=TRUE THEN (custom_logic)
        ELSE TRUE 
    END

Ordering is a bit more complicated, because we cannot add ASC/DESC in a CASE-WHEN-THEN-ELSE so we need to unroll the cases

ORDER BY
    CASE
        WHEN sqlc.narg('ording_param') = FALSE THEN column1  
        ELSE NULL
    END ASC,
    CASE
        WHEN sqlc.narg('ordering_param') = TRUE  THEN column1  
        ELSE NULL
    END DESC

A good example of a real query using this patterns is FilterInvocies.

There are a couple of things that I am considering to add, but I would like to know others opinions.

Currently, row ids (INTEGER PRIMARY KEY) are mapped to int32. That means that we have a counter valid until 2,147,483,647 which is 1 invoice/second for the next ~68 years. In Go code we use uint64 for them. uint64 is not a supported sql type, but we may want to use the largest integer value (int64) for this.
Invoice payments are stored in their own table. Some invoice types can have a one invoice -> many peyments so a JOIN
make sense in this case. However, it could be interesting to add information about the "latest" payment in the invoice table. In that way we would fetch the invoice and its payment information in one trip to the db, whenever it's possible.
Most of the invoices will have the same feature set, based on their type. Maybe we could add an "features_set_enum" instead of storing them for each invoice. That would also gives us the flexibility of having a different feature set for the same kind of invoice in case we add it only to new invoices.

Here is a diagram of the proposed schema.

NOTE: the structure of this PR has changed so some of the old comments apply now to other PRs. This one is only about the SQL schema.

ziggie1984

Hi, impressive work! :)
Did a first pass-through in preparation of the review club next week.

ziggie1984 · 2023-01-28T14:18:18Z

question: Could you explain why we are naming the test files with the prefix test instead of the suffix test?

yes, comes from taproot_assets but as you noticed it would be good to change it to match the Go's good practices

This feedback will be added in #7343

ziggie1984 · 2023-01-28T14:21:43Z

question: would it be reasonable to only include this file when building for example with the kvdb_sqlite package, this came to my mind because later in the test files you mark the with specific tags like test_db_postgres or test_db_sqlite so was wondering why in the test code but not here?

sputn1ck

Great work @positiveblue . Added some qs and need to research the SEQUENCES stuff a bit

sputn1ck · 2023-02-02T11:26:05Z

I'm not sure about the implications of storing with or without the time zone. E.g. assuming I'm running my node in my home in utc+0 and later migrating that to a remote server that is in a different timezone, my gut feeling is that would be a conflict here?

I think it's ok to use no timezone until we make sure to display all printed timestamps as UTC or alternatively transform to the user's timezone (when printing).

@sputn1ck your gut feeling is right here: time/timezone == software engineering headache

As Andras said, the idea is to not use timezone (and everything is sitll consistent) until you want to show it to the user

sputn1ck · 2023-02-02T11:39:11Z

nit: hodl 😄

haha I checked it in our code and we call them hodl invoices

I think it's they real name hold-invoices though...

bhandras

Cool PR, great work so far @positiveblue 🚀 🚀

bhandras · 2023-02-02T16:20:26Z

I think it's ok to use no timezone until we make sure to display all printed timestamps as UTC or alternatively transform to the user's timezone (when printing).

bhandras · 2023-02-02T16:22:06Z

bhandras

Looking really good to me, awesome work @positiveblue 🥇

lightninglabs-deploy · 2023-07-05T18:01:47Z

@bhandras: review reminder
@Roasbeef: review reminder
@sputn1ck: review reminder
@ziggie1984: review reminder

ziggie1984

Great PR 🤓, learned a lot how to cleanly setup sql schemas with sqlc.

Just had some nits and questions.

Regarding your remarks:

I would prefer int64 for the primary key, especially because sql dbs can easily deployed on other hosts and compacted on the fly so I consider the additonal space not as critical. On the other hand a migration later one is not a big deal as well?
So regarding the payment information when fetching an invoice, do I understand it right that you propose just adding the latest successful payment rather then all when using MPP or AMP for example?
Very much in favour of making the feature set space requirement more efficient but also keep the possibility to have it separate for each invoice just in case we want to be more flexible?

ziggie1984 · 2023-07-06T10:34:20Z

+    amount_msat BIGINT NOT NULL, 
+
+    -- Delta to use for the time-lock of the CLTV extended to the final hop.
+    cltv_delta INTEGER NOT NULL,


Q: For Bolt12 invoices there will be a "cltv_delta" included in the blinded path, however there will be no specific final_hop delta as in bolt11 invoices, there might be still a receiver required delta but it will be handled differently I think. So my question is, maybe we should allow a NULL value then?

ziggie1984 · 2023-07-06T10:43:57Z

so for every update there will be a corresponding invoice_event?

ziggie1984 · 2023-07-06T11:04:52Z

wondering why the set_id is stored as bytes but he htlc_id we go for text ? Is text more efficient that BLOB?

ziggie1984 · 2023-07-07T12:59:10Z

+    -- not have leave this empty, like Keysends.
+    payment_request TEXT UNIQUE, 
+
+    -- The invoice state.


in the next commit you filter for pending_only which only hits for 0 and 3, would it make sense then to define the states in the comments here as well?

I think we can leave that for the application layer (Go) given that "pending_only"/"pending" is not a state by itself but an invoice is considered pending when its state is "open"/"accepted"

ziggie1984 · 2023-07-07T14:26:34Z

+
+-- name: SelectAMPInvoicePayments :many
+SELECT aip.*, ip.*
+FROM amp_invoice_payments aip LEFT JOIN invoice_payments ip ON aip.settled_index = ip.id


Q: Could you elaborate why we want to show information from the invoice_payments table here? Will those not be equal to the ones in the amp_invoice_payments table?

Yes, sql generates a struct where fields in both tables get a standard name, fore example:

type SelectAMPInvoicePaymentsRow struct { SetID []byte State int16 CreatedAt time.Time SettledIndex sql.NullInt32 InvoiceID int32 ID sql.NullInt32 SettledAt sql.NullTime AmountPaidMsat sql.NullInt64 InvoiceID_2 sql.NullInt32 }

We could "avoid" the overlap and specify a .* for one of the tables and only the distinct columns in the other table. The problem is that next time that we add a column to the second table, we need to update this query too. Because it is not much data, I think it's better to leave it as is.

Sounds good, now I understand thanks.

ziggie1984 · 2023-07-07T14:58:57Z

Q: We do not use ENUM types here because we would need to redefine them in the go code which makes things more complicated than having everything in one place ?

bhandras

LGTM, pending @ziggie1984's comments.
Great to see this coming together, will be super nice to use SQL finally! 🚀 🚀

sqlc is a tool that generates fully type-safe idiomatic code from SQL. The result is Go code can then used execute the queries in the database. The noraml flow looks like: - The developer write some sql that will update the schema in the database: new tables, indices, etc - The developer updates the set of queries that will use the new schema. - `sqlc` generates type-safe interfaces to those queries. - The developer can then write application code that calls the methods generated by sqlc. The tool configuration needs to live in the repo's root and its name is `sqlc.yaml`. LND will support out of the box sqlite and postgres. The sql code needs to be (almost) the same for both engines, so we cannot use custom functions like `ANY` in postgres. The SQLC config file needs to define what is the target engine, we will set postgres but the generated code can be executed by sqlite too. In some specific cases, we will `match and replace` some sql lines to be sure the table definitions are valid for the targeted engine.

This is the schema for "ordinal" BOLT11 invoices. The invoices table aims to keep an entry for each invoice, BOLT11 or not, that will be supported. Invoice related HTLCs will be stored in a separete table than forwarded htlcs. SQLite does not support `SEQUENCE`. We achieve atomic autoincrementals using primary keys with autoincrement/serial. An invoice `AddIndex` translates to `invoices(id)` while `SettleIndex` is `invoice_payments(id)`.

Set of queries to deal with invoices. A couple of things to take into account: - Because the queries are not rewritten at runtime, we cannot have a generic `INSERT` with different tuples. - Because the queries are not rewritten at runtime, we cannot build one query with only the filters that matter for that queries. The two options are a combinatorial approach (a new query for every permutation) or a generic query using the pattern ``` SELECT * FROM table WHERE ( -- Can be read as: -- Match the filter 1 value if filter_1 != nil column_1 >= sqlc.narg('filter_1') OR sqlc.narg('filter_1') IS NULL ) AND ( column_2 >= sqlc.narg('filter_2') OR sqlc.narg('filter_2') IS NULL ) ... ```

Schema for AMP invocies. AMP invoices can be paid multiple times and each payment to an AMP invoice is identified by a `set_id`. The A in AMP stands for `Atomic`. All the htlcs belonging to the same AMP payment (share the same set_id) will be resolved at the same time with the same result: settled/canceled. AMP invoices do not have an "invoice preimage". Instead, each htcl has its own hash/preimage. When a new htlc is added the hash for that htlc is attached to it. When all the htlcs of a set_id have been received we are able to compute the preimage for each one of them.

run `make sqlc` All the code in this commit is auto-generated.

sputn1ck

Awesome work @positiveblue 🥇

ziggie1984

Great Job 🎉

Roasbeef

Excellent PR! This series beings a marked improvement for the general persistence situation within lnd.

Left some non-blocking comments that can be followed up on later. Some may end up changing the schema, but we can do so in a breaking manner, as people can't yet use the definitions, etc.

LGTM 🌪️

Roasbeef · 2023-07-12T23:53:33Z

+
+    -- The encoded payment request for this invoice. Some invoice types may 
+    -- not have leave this empty, like Keysends.
+    payment_request TEXT UNIQUE, 


I think we have a bit of room to further decouple things here. Eg: in the future a logical invoice will have both a BOLT 11 and BOLT 12 invoice format, along w/ w/e might arise in the future. So we could spin this out into another table that then references this main invoice table.

Similarly, not every invoice will have a 1:1 payment hash to pre-image relationship. Eg: AMP invoices only have a payment_addr as the hash+preimage are generated by the sender.

Roasbeef · 2023-07-12T23:54:11Z

+    state SMALLINT NOT NULL,
+
+    -- The accumulated value of all the htlcs settled for this invoice.
+    amount_paid_msat BIGINT NOT NULL,


This in theory could be derived via a view that sums over the set of payments to the invoice itself. Another factor here is the set of active HTLCs as well.

Roasbeef · 2023-07-12T23:54:31Z

+    -- The accumulated value of all the htlcs settled for this invoice.
+    amount_paid_msat BIGINT NOT NULL,
+
+    -- This field will be true for AMP invoices.


So then the general invoice feature table was dropped in favor of this/

Roasbeef · 2023-07-12T23:55:38Z

+CREATE INDEX IF NOT EXISTS invoice_htlc_custom_records_htlc_id_idx ON invoice_htlc_custom_records(htlc_id);
+
+-- invoice_payments contains the information of a settled invoice payment. 
+CREATE TABLE IF NOT EXISTS invoice_payments (


👍

In the future we can add another event table here for stuff like: cancel, accept, hodl, etc.

Roasbeef · 2023-07-12T23:58:18Z

@@ -0,0 +1,54 @@
+-- amp_invoices_payments  
+CREATE TABLE IF NOT EXISTS amp_invoice_payments (


Perhaps also nice to have easy access to the amount here as well? Then can just read this out rather than needing to also fetch+sum over the set of invoice HTLCs.

Roasbeef · 2023-07-12T23:59:40Z

+SELECT aip.*, ip.*
+FROM amp_invoice_payments aip LEFT JOIN invoice_payments ip ON aip.settled_index = ip.id
+WHERE (
+    set_id = sqlc.narg('set_id') OR 


If you look at the way the queries exist as is, I think what you also want here is based on the payment_addr as well. Then given that, you can fetch them all for a payment addr w/o an intermediate look up for the invoice itself.

Roasbeef · 2023-07-13T00:01:07Z

In the future if we want very strict typing here, we can make a set "enum values" table, then point to that itself (all items unique, essentially pre-populated).

You can also do stuff like:

type CHECK(type in ('THIS', THAT'))

etc

positiveblue mentioned this pull request Jan 24, 2023

[WIP] sqldb: InvoiceDB interface implementation #7357

Closed

saubyk assigned positiveblue Jan 26, 2023

saubyk added invoices sql database Related to the database/storage of LND labels Jan 26, 2023

saubyk added this to the v0.16.0 milestone Jan 26, 2023

saubyk requested a review from sputn1ck January 26, 2023 18:25

sputn1ck reviewed Jan 26, 2023

View reviewed changes

Comment thread sqldb/sqlc/migrations/000001_sequences.up.sql Outdated

ziggie1984 reviewed Jan 28, 2023

View reviewed changes

saubyk requested a review from bhandras January 31, 2023 18:26

sputn1ck self-requested a review February 2, 2023 11:16

sputn1ck reviewed Feb 2, 2023

View reviewed changes

sputn1ck self-requested a review February 2, 2023 11:59

bhandras reviewed Feb 2, 2023

View reviewed changes

saubyk modified the milestones: v0.16.0, v0.16.1 Feb 14, 2023

saubyk modified the milestones: v0.16.1, v0.17.0 Mar 13, 2023

positiveblue mentioned this pull request May 11, 2023

[Epic] channeldb+invoices: create native SQL schema for invoice storage #6288

Closed

5 tasks

positiveblue force-pushed the invoice-sql-schema branch from 65b3307 to 3972d08 Compare May 30, 2023 15:36

positiveblue force-pushed the invoice-sql-schema branch 2 times, most recently from 0fd6898 to c9c2e47 Compare June 8, 2023 00:05

positiveblue mentioned this pull request Jun 9, 2023

[WIP] Invocies: refactor InvoiceDB methods #7542

Closed

positiveblue force-pushed the invoice-sql-schema branch 2 times, most recently from c902354 to c9d9247 Compare June 10, 2023 00:11

positiveblue changed the title ~~[WIP] sqldb: add schema for invoices (table and queries)~~ sqldb: add invoice schema and sql queries Jun 10, 2023

positiveblue changed the base branch from master to 0-17-0-staging June 10, 2023 01:00

positiveblue changed the base branch from 0-17-0-staging to master June 10, 2023 01:00

positiveblue marked this pull request as ready for review June 10, 2023 01:01

positiveblue mentioned this pull request Jun 10, 2023

sqldb: new package supporting sql backends #7343

Merged