Execute TypeManager timers in the correct runtime context #3309

Merged
sergeybykov merged 4 commits into
dotnet:masterfrom
benjaminpetit:fix-versioning-scheduling
Aug 16, 2017

@benjaminpetit
Contributor

Replaces PR #3256.

In Silo.DoStart(), we should schedule TypeManager.Initialize() on the TypeManager's context (it is a system target), because it starts a timer that may make grain calls.
The timer started there also needs to invoke its callback on the TypeManager's context.

Also, TypeManager.Initialize() should be called after the silo has joined the cluster; otherwise the starting silo cannot talk to silos that are already active in the cluster. This means there is a very small window in which a silo that has just joined the cluster can make placement decisions without respecting versioning strategies.

A more complete fix will be implemented in 2.0, since it will likely be a breaking change.
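
The scheduling point in the description above (a timer callback should run on its owner's context, not on whatever thread the timer fires from) can be illustrated with a minimal, language-neutral sketch in Python. This is not Orleans code: `SchedulingContext`, `run_timer`, and `current_context` are hypothetical names invented for the illustration, and the single worker task only stands in for a system target's context.

```python
import asyncio
import contextvars

# Hypothetical marker for "which scheduling context is the current code running on".
current_context = contextvars.ContextVar("current_context", default=None)


class SchedulingContext:
    """Hypothetical stand-in for a system target's context: runs work items one at a time."""

    def __init__(self, name):
        self.name = name
        self._queue = asyncio.Queue()

    def post(self, work):
        """Queue a zero-argument coroutine function to run on this context."""
        self._queue.put_nowait(work)

    def stop(self):
        """Ask the worker to exit after it has drained earlier work items."""
        self._queue.put_nowait(None)

    async def run(self):
        # Everything executed by this worker observes this context's name.
        current_context.set(self.name)
        while True:
            work = await self._queue.get()
            if work is None:  # sentinel posted by stop()
                return
            await work()


async def run_timer(context, callback, period, ticks):
    """Fire `ticks` times; each tick posts the callback to `context` instead of calling it inline."""
    for _ in range(ticks):
        await asyncio.sleep(period)
        context.post(callback)


async def demo():
    seen = []

    async def refresh():
        # In the PR's scenario, this is where grain calls might be made.
        seen.append(current_context.get())

    ctx = SchedulingContext("TypeManager")
    worker = asyncio.create_task(ctx.run())
    await run_timer(ctx, refresh, period=0.01, ticks=3)
    ctx.stop()
    await worker
    return seen
```

Running `asyncio.run(demo())` records the context name observed by each callback. Because the timer only posts work and never calls `refresh` itself, every tick is observed on the owning context.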

@benjaminpetit
Contributor Author

The test failure is not related to this PR.

@dotnet-bot test netstandard-win-functional

@sergeybykov sergeybykov merged commit 554b1e0 into dotnet:master Aug 16, 2017
@gabikliot
Contributor

The race you described, a small window after the silo becomes active but before it is done starting, does indeed exist in a number of places in Orleans. Essentially everything that must finish initializing after the silo becomes active suffers from that race. For example, the reminders system target: to cope, it has its own started flag, since it may start receiving messages before it has started. The two approaches to deal with this are either to reject requests until started, or to queue them (locally, with Tasks it's easy), finish init, and then do the queued work.

This is fundamental to the fact that our membership join is not lock-step synchronous with handshaking all silos (we thought that would kill scalability). Instead, each silo joins and everyone else learns about it asynchronously.
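
The two coping strategies mentioned in the comment above (reject until started, or queue locally and run the queued work once init finishes) could look roughly like the following. It is a sketch in Python rather than Orleans code; `StartGate`, `NotStartedError`, and `mark_started` are hypothetical names, and the "started flag" is modeled with an `asyncio.Event`.

```python
import asyncio


class NotStartedError(Exception):
    """Raised by the 'reject' policy when a request arrives before startup finishes."""


class StartGate:
    """Hypothetical helper guarding a component that may receive messages before it has started."""

    def __init__(self, queue_until_started):
        self._queue_until_started = queue_until_started
        self._started = asyncio.Event()  # plays the role of the "started" flag

    async def call(self, handler, *args):
        if not self._started.is_set():
            if not self._queue_until_started:
                raise NotStartedError("component has not finished initialization")
            await self._started.wait()  # parked here until mark_started() is called
        return await handler(*args)

    def mark_started(self):
        """Signal that initialization is complete and release any parked requests."""
        self._started.set()


async def demo():
    async def handle(msg):
        return f"handled {msg}"

    # Policy 1: reject requests until started.
    rejecting = StartGate(queue_until_started=False)
    try:
        await rejecting.call(handle, "early")
        rejected = False
    except NotStartedError:
        rejected = True

    # Policy 2: queue early requests and run them after init completes.
    queueing = StartGate(queue_until_started=True)
    pending = [asyncio.create_task(queueing.call(handle, i)) for i in range(3)]
    await asyncio.sleep(0)  # let the early requests reach the gate
    still_waiting = not any(task.done() for task in pending)
    queueing.mark_started()  # initialization finished
    results = await asyncio.gather(*pending)
    return rejected, still_waiting, results
```

Rejecting is simpler but pushes retries onto the callers. Queueing hides the startup window from callers at the cost of holding their requests in memory until initialization completes.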

sergeybykov pushed a commit to sergeybykov/orleans that referenced this pull request Aug 17, 2017
@benjaminpetit benjaminpetit deleted the fix-versioning-scheduling branch March 15, 2018 16:55
@github-actions github-actions Bot locked and limited conversation to collaborators Dec 9, 2023

5 participants