-
Notifications
You must be signed in to change notification settings - Fork 496
Implement state-machine utility #5670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
2529892 to
cd6eb1a
Compare
cd6eb1a to
a990c72
Compare
harrishancock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got partway through state-machine.h. It appears to include a lot of unused code. It looks like the only class we actually use in this PR is ComposableStateMachine. Can we delete everything but that class, and add the other, more tightly-scoped state machines when we have use cases for them? (I have a sneaking suspicion that we won't...) I don't want to review code that won't ever be used, and I don't want to approve the PR without having reviewed the code in it, so that seems like the fastest way to get this merged.
In terms of correctness, the main thing that stood out to me so far is const-correctness of the transition lock.
guybedford
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If only KJ supported pattern matching I guess! I appreciate how the state machine solves a very real UAF issue, and that is well motivated.
Good decision keeping the compression refactor straightforward enough to verify that it is correct.
It is still a little bit worrying having so much state machine code without it being integrated, but as long as all follow-ons are suitably vetted and tested this approach can work for iteration further.
I wasn't able to fully read all state machine internals to fully understand the transition models - I would also be curious to understand how it handles more highly connected state transitions with constraints.
Agreed with @harrishancock that the naming isn't quite on first glance with withState. Noting for the main functions exactly what they do ("calls the callback only if the state is the given state?") would help readability either way.
Great to see systematic improvements to underlying state models.
55996ea to
37ee448
Compare
|
Significant updates and usages. I recommend starting with the final state of the PR rather than going through commit-by-commit. Will be far less confusing that way. |
harrishancock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished reviewing state-machine.h. I have only nits and little suggestions that can be deferred at this point, but I'm holding off approving until I can read the rest of the changes in the PR.
54b874f to
9d1860b
Compare
|
@harrishancock ... added a commit addressing those review comments. |
harrishancock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through and left some comments, but nothing particularly urgent I think. I'd say let's merge, but maybe try to get this into an internal release with as few other changes as possible.
A few last thoughts:
- I noticed that
state.visitForGc()isn't used everywhere I'd expect it to be. There's a comment mentioning thekj::Own&conundrum. Are there other reasons it couldn't be used everywhere? - Some classes use a
= StateMachine::create()field initializer, some classes usecreate()in a constructor initializer list, and still others leave their states uninitialized, then immediately calltransitionTo(). An incremental improvement might be to delete StateMachine's default constructor, or make it private, forcing everyone to use thecreate()factory function. This would get rid of the possibility of uninitialized states, and make sure all states are properly initialized in initializer lists or field initializers. This would remove one barrier to implementingvisitForGc()in terms ofvisit(). - After seeing how frequently we need to use the
getUnsafe()family of functions, I'm leaning more and more toward some sort of scoped accessor macro, maybeWD_IF_STATE, like in my previous comment philosophizing aboutKJ_IF_SOMEintegration. The first step would be to giveTransitionLockatryGet()member function, like @guybedford suggested. But if we do that, we'd need to figure out how to defer state transitions to when the TransitionLock is destroyed to support many of the places we callgetUnsafe(), which raises the main confusing bit from my perspective: how to delineate the difference between OperationScope and TransitionLock, when to use one versus the other. The best answer isn't super clear to me. - Related to the previous point, we lose a little bit of safety by switching away from
KJ_SWITCH_ONEOF, in that that macro forced us to handle every possible state. So, adding states now runs a greater risk of forgetting to adjust existing code to handle the new state. The verbosity ofKJ_SWITCH_ONEOFis a burden, though, so I'm not saying it's wrong to switch away from it in the general case -- just that in some places we might want to guarantee that we've exhaustively handled all the states. A newWD_SWITCH_STATEmacro could hide a TransitionLock, but still let us perform the exhaustive switch.
None of the above points are blocking, they're just suggestions. We can iterate more after we merge.
df69240 to
a8899d7
Compare
|
Ok... @guybedford and @harrishancock ... I'm going to split this PR up now that it's essentially done. It's too big to land safely as is. My plan is to split it across multiple PRS using autogates as much as possible but that's not always going to be easy and it's likely to force us to roll it out likely over multiple weeks. |
This was an already existing bug that when piping a readable stream through a transform stream, and the transform's readable side is canceled, the source readable would not be correctly released and marked as closed. With this fix, verified that the behavior is correct per the spec and matches other implementations. Should not be a breaking change as this as previously broken and would error in inconsistent ways (IdentityTransformStream would error differently than a regular TransformStream), now both have consistent and correct behavior.
e490fd5 to
fbb6fb1
Compare
|
The generated output of Full Type Diff |
|
@anonrig ... do not bother reviewing this PR. I'm in the process of splitting it into multiple other PRs since we cannot land the bulk of these changes without autogate(s). |
Yes, I'm just sharing a funny fact about Github :-) |
|
|
||
| #include "nbytes.h" | ||
|
|
||
| #include <workerd/api/system-streams.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by: #5755
| @@ -1,10 +1,52 @@ | |||
| #include "identity-transform-stream.h" | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by #5763
| } | ||
| } | ||
| }; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by #5767
| @@ -1,7 +1,9 @@ | |||
| #include "common.h" | |||
| #include "readable-source.h" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by #5767
| #include "common.h" | ||
| #include "writable-sink.h" | ||
|
|
||
| #include <workerd/api/util.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by #5767
| } | ||
| } | ||
| }; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by #5767
| @@ -1,6 +1,7 @@ | |||
| #include "common.h" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by #5767
| @@ -1,29 +1,49 @@ | |||
| #include "writable-sink.h" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled by #5767
Scratching a bit of a long term itch here. The bulk of the implementation here was done by claude under heavy supervision and iteration.
Throughout the code base we make use of kj::OneOf for state machines. The typical pattern is something like:
This pattern is simple and effective but suffers from a lack of guard rails. For instance, state transitions can be made out of order, or can be made while there are still dangling references. Over the years we've had numerous cases of UAF's and other issues due to invalidated references after incorrectly timed state changes, etc. The direct use of kj::OneOf is effective but can be error prone.
This PR introduces a new state-machine.h utility that thinly wraps around kj::OneOf while putting some guard rails in place to protect against state transitions occuring at the wrong time, reducing the likelihood of UAFs and other memory issues, and generally just improving the quality of the code. An example use in compression.c++ is included as a proof point.
ComposableStateMachine< TerminalStates<Ended>, ErrorState<kj::Exception>, ActiveState<Open>, Open, Ended, kj::Exception> state;The intention would be to first transition the code in api/streams to use this, then apply usage more generally across the code base.
The specific implementation of the tool was an exercise in using claude. It required a decent amount of back and forth and refinement. Claude definitely did not get everything right in the first, second, or even third pass so the session was highly interactive, requiring quite a bit of back and forth with the tool to generate the right behavior. Test coverage should be complete and the tests detail how the utility should be used. There are also extensive doc comments throughout. See the changes in compression.c++ for an illustration of how existing code would be migrated.
The code comments go into significant detail on how the mechanism works, and even includes a migration guide. A significant portion of the LOC changed in the PR are just comments. The test provides full coverage and plenty of examples on how the API would be used. For review, I recommend starting with the code comments and tests in order to best understand what the code is doing. Then, move on to the actual implementation detail.
Overall: this is part of the streams cleanup effort. One of the issues we've had with the streams implementation are unclear/incorrect streams state transitions leading to things like UAFs. This is part of the effort to clean up that implementation to make it safer overall.
It's a large PR, yes, but there's exceedingly little risk here. The only functional changes are to compression.c++, which already has good test coverage. The changes there are minimal and shouldn't introduce any risk.