Implement support for xsl:iterate by bojidar-bg · Pull Request #122 · Paligo/xee

bojidar-bg · 2025-09-11T11:43:13Z

So... xsl:iterate is a bit of a weird one.
Within XSL's functional language, it fits as a tail-recursive function which produces the rest of the elements at that place in the template.
Here's a possible example:

<xsl:iterate select="/foo">
  <xsl:param name="n" select="0"/>
  <xsl:value-of select="$n"/>
  <xsl:choose> <!-- tail-position choose -->
    <xsl:when test="$n = 4">
      <xsl:break/> <!-- tail early return -->
    </xsl:when>
    <xsl:otherwise>
      Text
      <xsl:next-iteration> <!-- tail recursive call -->
        <xsl:with-param name="n" select="$n + 1"/>
      </xsl:next-iteration>
    </xsl:otherwise>
  </xsl:choose>
</xsl:iterate>
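
For readers who think in Rust rather than XSLT, the tail-recursive reading of that example can be sketched roughly as follows. This is only an illustration under assumptions: the names `iterate` and `run_recursive`, the string slice standing in for the selected nodes, and the `Vec<String>` output are all made up here, and Rust itself does not guarantee tail-call elimination.

```rust
/// Hypothetical model of the xsl:iterate example as a tail-recursive function.
/// `items` stands in for the selected sequence, `n` for the xsl:param.
fn iterate(items: &[&str], n: i64, out: &mut Vec<String>) {
    // Input exhausted: the loop simply ends.
    let Some((_current, rest)) = items.split_first() else {
        return;
    };
    out.push(n.to_string()); // xsl:value-of select="$n"
    if n == 4 {
        return; // xsl:break: early return in tail position
    }
    out.push("Text".to_string());
    iterate(rest, n + 1, out) // xsl:next-iteration: tail call with the new $n
}

/// Runs the loop with the parameter's initial value of 0.
fn run_recursive(items: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    iterate(items, 0, &mut out);
    out
}
```

With two input items this yields `0`, `Text`, `1`, `Text`; with five or more items the fifth iteration emits `4` and stops.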

However, translating that to Xee's functional IR turns out to be tricky:

Xee's IR encodes appending elements using its special comma operator. As such, neither the xsl:next-iteration nor the xsl:choose is really in "tail position" (just as x() in 1 + x() is not a tail call); instead we have something like:

ir::Binary {
  op: Comma,
  left: <..everything before xsl:choose..>,
  right: ir::If {
    <...>
    else_: ir::Binary {
      op: Comma,
      left: <..everything before xsl:next-iteration..>,
      right: <tail-position xsl:next-iteration>
    }
  }
}
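
To make the "not in tail position" point concrete, here is a minimal sketch of a tail-position check over a toy IR. The `Expr` enum and every function name below are hypothetical and only mimic the shapes shown above; this is not Xee's actual `ir` module.

```rust
/// Hypothetical miniature IR; the variants only mimic the shapes shown above.
enum Expr {
    Atom(&'static str),
    Comma(Box<Expr>, Box<Expr>),
    If { cond: Box<Expr>, then: Box<Expr>, else_: Box<Expr> },
    NextIteration,
}

/// Returns true when every NextIteration inside `expr` sits in tail position,
/// given whether `expr` itself is in tail position.
fn next_iteration_in_tail(expr: &Expr, is_tail: bool) -> bool {
    match expr {
        Expr::Atom(_) => true,
        Expr::NextIteration => is_tail,
        // The results of both operands still have to be appended afterwards,
        // so neither operand of a comma is a tail.
        Expr::Comma(left, right) => {
            next_iteration_in_tail(left, false) && next_iteration_in_tail(right, false)
        }
        // A conditional passes its own tail-ness on to its branches only.
        Expr::If { cond, then, else_ } => {
            next_iteration_in_tail(cond, false)
                && next_iteration_in_tail(then, is_tail)
                && next_iteration_in_tail(else_, is_tail)
        }
    }
}

/// The comma-wrapped shape from the IR listing above.
fn comma_wrapped_example() -> Expr {
    Expr::Comma(
        Box::new(Expr::Atom("before choose")),
        Box::new(Expr::If {
            cond: Box::new(Expr::Atom("$n = 4")),
            then: Box::new(Expr::Atom("break")),
            else_: Box::new(Expr::Comma(
                Box::new(Expr::Atom("before next-iteration")),
                Box::new(Expr::NextIteration),
            )),
        }),
    )
}

/// A conditional at the root with next-iteration directly in a branch.
fn bare_conditional_example() -> Expr {
    Expr::If {
        cond: Box::new(Expr::Atom("$n = 4")),
        then: Box::new(Expr::Atom("break")),
        else_: Box::new(Expr::NextIteration),
    }
}
```

Under these assumptions the check fails for the comma-wrapped shape and passes only once the conditional sits at the root with the next-iteration directly in a branch.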

We could presumably place the next-iteration in tail position near the root of the IR tree, outside of any binary operators (yet still inside ir::If-s and ir::Let-s). But that's gnarly: we would have to turn every comma operator that contains an xsl:choose or a similar conditional "inside-out" along the way, so that the conditional, rather than the comma operator, ends up in tail position.
Alternatively, we could make some form of tail call optimization in the IR-to-bytecode compiler understand comma operators, which would preserve the idea that xsl:iterate is a tail-recursive function. That feels even less tractable than the previous option...

So, instead of all of that tail-recursive nastiness ( 😇 ), this PR implements xsl:next-iteration and xsl:break as sort-of functional effects scoped by the xsl:iterate loop. They imperatively set values on the xsl:iterate loop that control its next iteration, but those side effects cannot be observed by subsequent code in the current iteration. Within the IR, IterateBreak and IterateLetNext act as sort-of decorators: they change how the subsequent iteration executes, but otherwise pass expression results through unchanged.
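
The "effects scoped by the loop" idea can be sketched in plain Rust. Everything here is an assumption made for illustration: `LoopState`, `iterate_break`, `iterate_let_next`, `body` and `run_loop` are invented names, and the sketch stages the new parameter value until the iteration ends, which is a simplification rather than a description of the bytecode this PR emits.

```rust
/// Hypothetical per-iteration control state owned by the loop.
struct LoopState {
    break_requested: bool,
    next_n: Option<i64>,
}

/// Decorator-style effect: records the break, passes `value` through unchanged.
fn iterate_break(state: &mut LoopState, value: Vec<String>) -> Vec<String> {
    state.break_requested = true;
    value
}

/// Decorator-style effect: records the next value of `n`, passes `value` through.
fn iterate_let_next(state: &mut LoopState, n: i64, value: Vec<String>) -> Vec<String> {
    state.next_n = Some(n);
    value
}

/// One iteration of the example body; it never reads the state back.
fn body(n: i64, state: &mut LoopState) -> Vec<String> {
    let mut out = vec![n.to_string()];
    if n == 4 {
        out.extend(iterate_break(state, Vec::new()));
    } else {
        out.push("Text".to_string());
        out.extend(iterate_let_next(state, n + 1, Vec::new()));
    }
    out
}

/// The loop itself is the only place where the recorded effects are applied.
fn run_loop(items: &[&str]) -> Vec<String> {
    let mut n = 0;
    let mut out = Vec::new();
    for _item in items {
        let mut state = LoopState { break_requested: false, next_n: None };
        out.extend(body(n, &mut state));
        if state.break_requested {
            break;
        }
        if let Some(next) = state.next_n {
            n = next;
        }
    }
    out
}
```

Because the effects are applied by the loop rather than by a call in tail position, it no longer matters that the comma operator wraps them.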

Hope that description helps explain my line of thought in implementing what I did 😊

bojidar-bg · 2025-09-11T11:44:48Z

xee-ir/src/ir.rs

+pub struct Iterate {
+    pub context_names: ContextNames,
+    pub loop_name: Name,
+    pub var_atom: AtomS,
+    pub params: Vec<IterateParam>,
+    pub expr: Box<ExprS>,
+    pub on_complete: Option<Box<ExprS>>,
+}


I'm not sure if my IR changes are good here. This ir::Iterate definition feels way too specific in how it includes iterate control flow (loop_name), iterate parameters (params), and the on_complete statement. But given the bytecode I need to generate in compile_iterate, I couldn't see a way to weave those in without making them part of ir::Iterate.

bojidar-bg · 2025-09-11T11:46:08Z

xee-ir/src/function_compiler.rs

+        self.compile_sequence_loop_end(span);
+        // pop sequence length name & index;
+        self.scopes.pop_name();
+        self.scopes.pop_name();


Is there a reason for compile_sequence_loop_end to pop only the values added by compile_sequence_loop_init and not the names?
(I guess compile_quantified is special?)

faassen · 2025-09-18T10:26:41Z

I can't recall a good reason, so if that refactoring is possible I'm happy to see it applied!

bojidar-bg

IIRC, I looked at it, and it would require an extra VarSet instruction in compile_quantified. Might get around to it at some point, especially if something brings me back to the loop code 😅

bojidar-bg · 2025-09-11T11:56:30Z

xee-ir/src/function_compiler.rs

+        }
+        // Finally, emit a value as the result of IterateLetNext (typ. an empty sequence)
+        // self.builder.emit_constant(sequence::Sequence::default(), span);
+        self.compile_expr(&iterate_let_next.return_expr)?;


As evidenced by the dead code: I'm not sure whether next-iteration should have a return_expr expression in the IR or not.
Rationale for including it: it makes the IR more flexible and consistent with ir::IterateBreak.
Rationale against: XSLT 3.0 will never set it to anything but an empty sequence. xsl:next-iteration cannot specify extra children; those always go before the xsl:next-iteration element. (xsl:break could equally have any extra children go before it, yet the standard does allow it to specify extra children to add, huh!)

bojidar-bg · 2025-09-11T11:59:50Z

xee-xslt-compiler/src/ast_ir.rs

+                    type_: param.as_.clone(),
+                })
+            })
+            .collect::<error::SpannedResult<Vec<_>>>()?;


Is there a simpler Rust pattern to use around here?
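
For context on that question: collecting an iterator of `Result`s into a `Result<Vec<_>, _>` is the standard short-circuiting idiom, and the main alternative is an explicit loop with `?`. A minimal sketch comparing the two follows; `Param`, `parse_param` and the string-based error type are hypothetical stand-ins, not the types used in ast_ir.rs.

```rust
/// Hypothetical stand-in for a compiled parameter.
#[derive(Debug, PartialEq)]
struct Param {
    name: String,
    value: i64,
}

/// Fallible per-item conversion, playing the role of the closure body in the diff.
fn parse_param(raw: &str) -> Result<Param, String> {
    let (name, value) = raw
        .split_once('=')
        .ok_or_else(|| format!("missing '=' in {raw:?}"))?;
    let value = value
        .parse::<i64>()
        .map_err(|e| format!("bad value in {raw:?}: {e}"))?;
    Ok(Param { name: name.to_string(), value })
}

/// Iterator style: collect() stops at the first Err and returns it.
fn parse_all_collect(raw: &[&str]) -> Result<Vec<Param>, String> {
    raw.iter().map(|r| parse_param(r)).collect()
}

/// Loop style: the same behaviour, spelled out with `?`.
fn parse_all_loop(raw: &[&str]) -> Result<Vec<Param>, String> {
    let mut params = Vec::with_capacity(raw.len());
    for r in raw {
        params.push(parse_param(r)?);
    }
    Ok(params)
}
```

When the function's return type already names the target, the turbofish on `collect` can be dropped, as in `parse_all_collect`; otherwise the two forms are largely a matter of taste.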

bojidar-bg · 2025-09-11T12:00:34Z

xee-xslt-compiler/src/ast_ir.rs

+            return_expr: Box::new(return_expr.expr()),
+        });
+        let result = return_expr.bind_expr_no_span(&mut self.variables, let_next);
+        Ok(result)


(result can be inlined, will clean up later)

bojidar-bg · 2025-09-11T12:08:37Z

xee-ir/src/function_compiler.rs

+        }
+        // Then, store them back into the variables (in reverse, so they match up)
+        for param in iterate_let_next.params.iter().rev() {
+            self.compile_variable_set(&param.name, span)?;


(Currently, parameters set by IterateLetNext do update variables before the iteration is over; I could fix that by adding more variables (basically have two slots on the stack for each variable all the time), but it would result in less optimal bytecode, especially when next-iteration doesn't set all parameters.)
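
The "push all values, then store in reverse" order from the diff above can be illustrated with a toy stack machine. This is a sketch under assumptions: `Vars`, `Update`, `new_a`, `new_b` and `let_next` are invented for the example and do not correspond to Xee's bytecode or builder API.

```rust
/// Hypothetical variable slots: two loop parameters, `a` and `b`.
type Vars = [i64; 2];

/// One with-param: which slot to update and how to compute the new value.
struct Update {
    slot: usize,
    compute: fn(&Vars) -> i64,
}

fn new_a(vars: &Vars) -> i64 {
    vars[1] // a := $b
}

fn new_b(vars: &Vars) -> i64 {
    vars[0] // b := $a
}

/// Evaluates every new value first, then stores them back in reverse order.
fn let_next(vars: &mut Vars, updates: &[Update]) {
    let mut stack: Vec<i64> = Vec::new();
    // First, push each new value; all of them still see the old variables.
    for update in updates {
        stack.push((update.compute)(vars));
    }
    // Then pop in reverse, so each value lands in the slot it was computed for.
    for update in updates.iter().rev() {
        let value = stack.pop().expect("one stacked value per update");
        vars[update.slot] = value;
    }
}
```

In this toy model, swapping `a` and `b` works because every value is computed before any store happens; whatever runs after the stores within the same iteration would already see the new values, which is the caveat raised above.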

faassen · 2025-09-11T15:59:52Z

Nice! Hey, just a heads up; I'm on vacation for the next few days so I'll likely be able to give feedback next week.

faassen · 2025-09-11T16:02:02Z

the tail call optimization in the IR-to-bytecode compiler

I didn't know I created such an optimization!

bojidar-bg · 2025-09-11T17:45:57Z

Er... I meant "the tail call optimization just described in the previous point", but alas, my brain moved on before my fingers could finish typing that 😂

Take your time! I'll probably move on to other things with xsl:param/xsl:with-param; those seem like low-hanging fruit that would make a lot of tests pass moving forward. (:

faassen · 2025-09-18T10:24:46Z

It's interesting that this only makes 4 iterate tests pass; what's holding up the passing of the other tests? Are those missing iterate features or other XSLT features that are blocking them?

faassen

Thank you for all this hard work! You figured out a lot of details and it's great to see you managed to extend the IR & IR -> bytecode compiler in particular.

I feel bad for taking a while to review. It's also difficult to build enough context to do a proper review.

There are three things helping here that make me approve it:

this mostly deals with XSLT
a few more things are added to the IR; they might be overly specific but I'm good with that for now.
the interpreter is virtually untouched besides errors.

So given all this, I'm going to err on the side of accepting PRs on this topic, because I don't want to block development and I'd only get in the way. When the interpreter is touched or there is a significant IR compilation change that's where I want to ensure XPath doesn't have a regression, but we have a lot of tests to cover us there.

bojidar-bg added 2 commits September 10, 2025 14:52

Implement initial support for xsl:iterate

fbc7c18

Implement support for xsl:iterate parameters

afc2daa

faassen approved these changes Sep 18, 2025

faassen merged commit 0667627 into Paligo:main Sep 18, 2025
1 check passed

github-actions bot mentioned this pull request Aug 30, 2025

chore: release #116

Open

bojidar-bg mentioned this pull request Oct 28, 2025

XSLT AST parser does not handle variable names correctly #127

Open
