In error recovery scenarios (for example for invalid Ruby code) Prism can return unexpected node types in fields they wouldn't be for valid Ruby. For example:
Prism parses this as a ModuleNode with a DefNode in the constant_path field. The allowed/expected nodes are listed in config.yml here and we use on error: to signal that in error recovery scenarios there might be something other than ConstantReadNode or ConstantPathNode. In this case, we only have on error: MissingNode but it can actually be DefNode or something else (perhaps this is an oversight and all possible error nodes should be listed here?).
Clients using Prism may want to deal with invalid Ruby syntax in all kinds of different ways, but I suspect the most common approach would be to simply ignore unexpected nodes and maybe stop traversing that branch of the AST. In any case, clients cannot expect only valid node types in fields, as shown above.
One way for clients to deal with unexpected node types would be to guard based on each of the expected node types, for example:
if (!PM_NODE_TYPE(node, PM_CONSTANT_READ_NODE) && !PM_NODE_TYPE(node, PM_CONSTANT_PATH_NODE)) {
// handle error case
}
In some cases this might be quite verbose, with many valid node types.
Another approach (assuming the on error indicators in config.yml are exhaustive) would be to check for known error cases:
if (PM_NODE_TYPE(node, PM_MISSING_NODE) || PM_NODE_TYPE(node, PM_DEF_NODE)) {
// handle error case
}
Again, this could potentially be quite verbose and I'm not sure if config.yml is currently expected to exhaustively list all possible on error cases.
Proposal
Prism could introduce an UnexpectedNode type (similar to MissingNode) that would be returned in error recovery scenarios, wrapping the original node type.
if (PM_NODE_TYPE(node, PM_MISSING_NODE) || PM_NODE_TYPE(node, PM_UNEXPECTED_NODE)) {
// handle error case
}
Or introduce a helper covering both cases:
if (PM_ERROR_NODE(node)) {
// handle error case
}
The client could more easily just ignore error recovery cases, or they could do something appropriate with node->child in the case of UnexpectedNode.
I'm sure there are other approaches too, but this one seemed quite straightforward. However, introducing a new node type would be a breaking change, so perhaps there is a non-breaking alternative, such as adding a flag to existing node types?
Prototype
I prototyped it here but I thought I should open an issue first, to see if this is something you would consider, rather than spending more time tidying up the code for a PR. That prototype also ensures MissingNode is always added in cases where a required field is null, but that's potentially a separate issue.
In error recovery scenarios (for example for invalid Ruby code) Prism can return unexpected node types in fields they wouldn't be for valid Ruby. For example:
Prism parses this as a
ModuleNodewith aDefNodein theconstant_pathfield. The allowed/expected nodes are listed inconfig.ymlhere and we useon error:to signal that in error recovery scenarios there might be something other thanConstantReadNodeorConstantPathNode. In this case, we only haveon error: MissingNodebut it can actually beDefNodeor something else (perhaps this is an oversight and all possible error nodes should be listed here?).Clients using Prism may want to deal with invalid Ruby syntax in all kinds of different ways, but I suspect the most common approach would be to simply ignore unexpected nodes and maybe stop traversing that branch of the AST. In any case, clients cannot expect only valid node types in fields, as shown above.
One way for clients to deal with unexpected node types would be to guard based on each of the expected node types, for example:
In some cases this might be quite verbose, with many valid node types.
Another approach (assuming the
on errorindicators inconfig.ymlare exhaustive) would be to check for known error cases:Again, this could potentially be quite verbose and I'm not sure if
config.ymlis currently expected to exhaustively list all possibleon errorcases.Proposal
Prism could introduce an
UnexpectedNodetype (similar toMissingNode) that would be returned in error recovery scenarios, wrapping the original node type.Or introduce a helper covering both cases:
The client could more easily just ignore error recovery cases, or they could do something appropriate with
node->childin the case ofUnexpectedNode.I'm sure there are other approaches too, but this one seemed quite straightforward. However, introducing a new node type would be a breaking change, so perhaps there is a non-breaking alternative, such as adding a flag to existing node types?
Prototype
I prototyped it here but I thought I should open an issue first, to see if this is something you would consider, rather than spending more time tidying up the code for a PR. That prototype also ensures
MissingNodeis always added in cases where a required field is null, but that's potentially a separate issue.