Skip to content

routing: allow runtime updates to mission control config#4909

Merged
Roasbeef merged 7 commits into
lightningnetwork:masterfrom
carlaKC:mc-paramsapi
Feb 11, 2021
Merged

routing: allow runtime updates to mission control config#4909
Roasbeef merged 7 commits into
lightningnetwork:masterfrom
carlaKC:mc-paramsapi

Conversation

@carlaKC
Copy link
Copy Markdown
Collaborator

@carlaKC carlaKC commented Jan 11, 2021

This PR adds the ability to get and set mission control's config via rpc, allowing for experimentation with these values without a restart.

Since we already have locking in mission control, we can just acquire the lock and update the cfg.

@carlaKC carlaKC changed the title routing: allow upd routing: allow runtime updates to mission control config Jan 11, 2021
@Roasbeef Roasbeef added this to the 0.13.0 milestone Jan 12, 2021
@Roasbeef Roasbeef requested review from bhandras and halseth January 12, 2021 03:30
@Roasbeef Roasbeef added enhancement Improvements to existing features / behaviour routing mission control labels Jan 12, 2021
Copy link
Copy Markdown
Collaborator

@bhandras bhandras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty straightforward changes and nice refactoring along the way.
My only real comment is that since we can change the mission control config during route calculation (since locking is per point) we may create an unwanted path as a result. It's very unlikely as a single user but may be likely as service provider. I'd consider fixing that too, although it's probably not a huge deal.

Comment thread routing/missioncontrol.go Outdated
Comment thread routing/probability_estimator.go Outdated
Comment thread lnrpc/routerrpc/router.proto Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unsure about using integer percents for the probability. Maybe better to use floats since it results in simpler code and also less confusing? Also for weight?

@halseth
Copy link
Copy Markdown
Contributor

halseth commented Jan 12, 2021

Just from a high level look, I'm wondering if we can do this without introducing race conditions?

Even though we get/set the config behind a mutex, MissionControl itself still reads the config without acquiring the lock first.

@carlaKC
Copy link
Copy Markdown
Collaborator Author

carlaKC commented Jan 12, 2021

MissionControl itself still reads the config without acquiring the lock first.

I thought I'd checked all of these, but looks like ReportPaymentSuccess and ReportPaymentFailure don't acquire the lock for adding results to the store. Although they do acquire it later to update MC, so wondering whether than can be moved up. Otherwise the other calls to MC all use the lock so it shouldn't be too much of an issue?

@halseth
Copy link
Copy Markdown
Contributor

halseth commented Jan 13, 2021

@carlaKC I was thinking about places like this: https://github.com/lightningnetwork/lnd/blob/d7ecde2972bfb949c6df717d53ebfbb3283de90b/routing/probability_estimator.go#L82

One alternative to swapping out the mission control config, is to create a new MissionControl with the new config and swap that out instead.

@carlaKC carlaKC force-pushed the mc-paramsapi branch 2 times, most recently from daac715 to 7318702 Compare January 15, 2021 13:48
We are going to use the config struct to allow getting and setting
of the mission control config in the commits that follow. Self node
is not something we want to change, so we move it out for better
separation.
All of the other mission control exported functions acquire their locks
immediately, and do not lock in the subsequent unexported functions.
This commit moves the lock up for the report payment functions so that
mission control's config values are covered by this lock, in preparation
for allowing config to be updated at runtime. Moving this lock means
that we will hold the lock for the additional time it takes to store a
single result, AddResult, to the store.
In preparation for allowing live update of mc config, we extract our
probability estimator cfg for easy update and add validation.
Copy link
Copy Markdown
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good :)

control more willing to try hops that we have no information about, lower
values will discourage trying these hops.
*/
float hop_probability = 2;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use double instead?

@joostjager
Copy link
Copy Markdown
Contributor

Perhaps it is worth documenting in the proto that the runtime settings will only be in effect until the next restart.

An alternative would be to hot-reload the config file, either automatically or via a cli command (similar to systemctl daemon-reload). That would at least keep the config in one place. It also opens up the possibility to hot-reload other config settings without adding rpcs for all of them (which aren't really useful other than for experimentation).

@halseth
Copy link
Copy Markdown
Contributor

halseth commented Jan 22, 2021

I think just adding to the doc that it won't be persisted is good enough for now?

@carlaKC
Copy link
Copy Markdown
Collaborator Author

carlaKC commented Jan 25, 2021

Perhaps it is worth documenting in the proto that the runtime settings will only be in effect until the next restart.

Indeed, will add some comments on the rpc making a note of it.

An alternative would be to hot-reload the config file, either automatically or via a cli command (similar to systemctl daemon-reload). That would at least keep the config in one place. It also opens up the possibility to hot-reload other config settings without adding rpcs for all of them (which aren't really useful other than for experimentation).

I think this would be a nice feature to have, but likely a larger project since not everybody runs lnd with a config file. Although it's kicking the can down the road a bit, I'd like to go ahead with adding this rpc because mission control is a pretty underused feature IMO, and surfacing it will help people experiment with it. If we find ourselves exposing other config rpcs, then it's time to talk hot reload.

@Roasbeef Roasbeef added the P2 should be fixed if one has time label Jan 28, 2021
@carlaKC
Copy link
Copy Markdown
Collaborator Author

carlaKC commented Feb 10, 2021

My only real comment is that since we can change the mission control config during route calculation

I don't think that there's a clean way to do this, without restructuring the way payment sessions use mission control quite significantly, which I'm not sure is worth the effort.

It's very unlikely as a single user but may be likely as service provider

If we do run into problems here, I think there could be an option to provide specific MC params for an individual payment, so that you can specify exactly what you want for that route. WDYT?

@carlaKC carlaKC requested a review from bhandras February 10, 2021 08:16
Copy link
Copy Markdown
Collaborator

@bhandras bhandras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🧮

"rely on historical results, expressed as " +
"value in [0;1]",
}, cli.UintFlag{
Name: "pmtnr",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maxpayments ?

@Roasbeef Roasbeef merged commit 7b0ea3c into lightningnetwork:master Feb 11, 2021
@alexbosworth
Copy link
Copy Markdown
Contributor

    The amount of time mission control will take to restore a penalized node
    or channel back to 50% success probability, expressed as a unix timestamp
    in seconds. Setting this value to a higher value will penalize failures for
    longer, making mission control less likely to route through nodes and
    channels that we have previously recorded failures for.
    */
    uint64 half_life_seconds = 1;

If the value is just the number of seconds, why is it described as a unix timestamp?

@carlaKC
Copy link
Copy Markdown
Collaborator Author

carlaKC commented Mar 5, 2021

If the value is just the number of seconds, why is it described as a unix timestamp?

Shouldn't be, fix in #5078.

yaslama added a commit to breez/lnd that referenced this pull request Jul 19, 2021
roeierez added a commit to breez/lnd that referenced this pull request Sep 5, 2021
…rical-sync

* commit 'b444ae37125d32eaab1d68e73c4b1b12bf6451bc':
  backport lightningnetwork#5061 - routing: add mission control import rpc
  backport lightningnetwork#4909 - routing: allow runtime updates to mission control config
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Improvements to existing features / behaviour mission control P2 should be fixed if one has time routing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants