Implement support for xsl:iterate#122
Conversation
| pub struct Iterate { | ||
| pub context_names: ContextNames, | ||
| pub loop_name: Name, | ||
| pub var_atom: AtomS, | ||
| pub params: Vec<IterateParam>, | ||
| pub expr: Box<ExprS>, | ||
| pub on_complete: Option<Box<ExprS>>, | ||
| } |
There was a problem hiding this comment.
I'm not sure if my IR changes are good here. This ir::Iterate definition feels way too specific in the way it includes iterate control flow (loop_name), iterate parameters (params), and the on_complete statement, but given the bytecode I need to generate in compile_iterate, I couldn't see a way to weave those without making them a part of ir::Iterate
| self.compile_sequence_loop_end(span); | ||
| // pop sequence length name & index; | ||
| self.scopes.pop_name(); | ||
| self.scopes.pop_name(); |
There was a problem hiding this comment.
Is there a reason for compile_sequence_loop_end to only pop the values added by compile_sequence_loop_init and not the names?
(I guess compile_quantified is special?)
There was a problem hiding this comment.
I can't recall a good reason so if that refactoring is possible I'm happy to see it applied!
There was a problem hiding this comment.
IIRC, I looked at it, and it would require an extra VarSet instruction in compile_quantified. Might get around to it at some point, especially if something brings me back to the loop code 😅
| } | ||
| // Finally, emit a value as the result of IterateLetNext (typ. an empty sequence) | ||
| // self.builder.emit_constant(sequence::Sequence::default(), span); | ||
| self.compile_expr(&iterate_let_next.return_expr)?; |
There was a problem hiding this comment.
As evidenced by dead code: Not sure if I want the next-iteration in the IR have a return_expr expression or not.
Rationale for including it: it makes the IR more flexible and consistent with ir::IterateBreak
Rationale against: it's never going to be set to anything but an empty sequence by XSLT 3.1 (next-iteration cannot specify extra children; those always go before the next-iteration element (But xsl:break also could have any extra children go before it, except the standard does allow it to specify extra children to add, huh!)).
| type_: param.as_.clone(), | ||
| }) | ||
| }) | ||
| .collect::<error::SpannedResult<Vec<_>>>()?; |
There was a problem hiding this comment.
Is there a simpler Rust pattern to use around here?
| return_expr: Box::new(return_expr.expr()), | ||
| }); | ||
| let result = return_expr.bind_expr_no_span(&mut self.variables, let_next); | ||
| Ok(result) |
There was a problem hiding this comment.
(result can be inlined, will clean up later)
| } | ||
| // Then, store them back into the variables (in reverse, so they match up) | ||
| for param in iterate_let_next.params.iter().rev() { | ||
| self.compile_variable_set(¶m.name, span)?; |
There was a problem hiding this comment.
(Currently, parameters set by IterateLetNext do update variables before the iteration is over; I could fix that by adding more variables (basically have two slots on the stack for each variable all the time), but it would result in less optimal bytecode, especially when next-iteration doesn't set all parameters.)
|
Nice! Hey, just a heads up; I'm on vacation for the next few days so I'll likely be able to give feedback next week. |
I didn't know I created such an optimization! |
|
Er... I meant "the tail call optimization just described in the previous point", but alas, my brain moved on before my fingers could finish typing that 😂 Take your time! I'll probably move on to other things with |
|
It's interesting that this only makes 4 iterate tests pass; what's holding up the passing of the other tests? Are those missing iterate features or other XSLT features that are blocking them? |
faassen
left a comment
There was a problem hiding this comment.
Thank you for all this hard work! You figured out a lot of details and it's great to see you managed to extend the IR & IR -> bytecode compiler in particular.
I feel bad for taking a while to review. It's also difficult to build enough context to do a proper review.
There are two things helping here that make me approve it:
-
this mostly deals with XSLT
-
a few more things are added to the IR; they might be overly specific but I'm good with that for now.
-
the interpreter is virtually untouched besides errors.
So given all this, I'm going to err on the side of accepting PRs on this topic, because I don't want to block development and I'd only get in the way. When the interpreter is touched or there is a significant IR compilation change that's where I want to ensure XPath doesn't have a regression, but we have a lot of tests to cover us there.
| self.compile_sequence_loop_end(span); | ||
| // pop sequence length name & index; | ||
| self.scopes.pop_name(); | ||
| self.scopes.pop_name(); |
There was a problem hiding this comment.
I can't recall a good reason so if that refactoring is possible I'm happy to see it applied!
So...
xsl:iterateis a bit of a weird one.Within XSL's functional language, it fits as a tail-recursive function which produces the rest of the elements at that place in the template.
Here's a possible example:
However, translating that to Xee's functional IR turns out to be tricky:
xsl:next-iterationnor thexsl:chooseare really in "tail position" (just like1 + x()is not a tail-call); instead we have something like:ir::If-s andir::Let-s); but that's gnarly, since we would have to turn any comma operators containing xsl:choose-s and similar conditionals along the way "inside-out" to get them out of the tail position.xsl:iterateis a tail-recursive function. That feels even less tractable than the previous point...So, instead of all of that tail-recursive nastiness ( 😇 ), this PR ends up implementing
xsl:next-iterationandxsl:breakas sort-of functional effects scoped by thexsl:iterateloop: they imperatively set values on thexsl:iterateloop that control the next iteration of the loop, however those imperative side-effects cannot be observed by the subsequent code in the current iteration of the loop. Within the IR,IterateBreakandIterateLetNextact as sort-of decorators: they cause a change to the execution of the subsequent iteration, but otherwise pass expression results unchanged.Hope that description helps explain my line of thought in implementing what I did 😊