
Conversation


@dogakarakas dogakarakas commented Jul 13, 2025

This pull request adds a set of algebraic simplification rewrites to the DaphneIR canonicalizer. These rewrites target common algebraic patterns to improve optimization and simplify the intermediate representation. The following rewrites have been implemented and can be found in src/ir/daphneir/Canonicalize.cpp:

Static Rewrites:
1) sumAll(ewAdd(X, Y)) → ewAdd(sumAll(X), sumAll(Y))
2) sumAll(transpose(X)) → sumAll(X) (sketched in C++ below)
3) sum(lambda * X) → lambda * sum(X) (only when lambda is a scalar and X is a matrix)
4) trace(X @ Y) (i.e., sum(diagVector(X @ Y))) → sum(X * transpose(Y))
5) (X @ Y)[i, j] → X[i, :] @ Y[:, j] (only applicable if neither input is transposed)

Dynamic Rewrite:
1) X[a:b, c:d] = Y → X = Y, if dims(X) == dims(Y) (only applicable when both matrices have the same element type)
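
A minimal C++ sketch of rewrite 2) as an MLIR canonicalization pattern. The op names (daphne::AllAggSumOp, daphne::TransposeOp) and the accessor getArg() are assumptions about the DaphneIR definitions; the actual code in src/ir/daphneir/Canonicalize.cpp may differ in detail:

```cpp
#include "ir/daphneir/Daphne.h"
#include "mlir/IR/PatternMatch.h"

// Rewrite sumAll(transpose(X)) -> sumAll(X): the sum over all elements
// is invariant under transposition, so the transpose can be dropped.
mlir::LogicalResult mlir::daphne::AllAggSumOp::canonicalize(mlir::daphne::AllAggSumOp op,
                                                            mlir::PatternRewriter &rewriter) {
    // Only fire if the argument is directly produced by a transpose.
    auto transposeOp = op.getArg().getDefiningOp<mlir::daphne::TransposeOp>();
    if (!transposeOp)
        return mlir::failure();
    // Sum over the un-transposed input; keep the original result type,
    // since a rewrite must not change the type of the final result.
    rewriter.replaceOpWithNewOp<mlir::daphne::AllAggSumOp>(op, op.getType(), transposeOp.getArg());
    return mlir::success();
}
```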

All simplifications are tested using LLVM's FileCheck tool. The FileCheck tests verifying each rewrite are located in the test/util/ directory, and script-level test cases can be found under test/api/cli/expressions/.

These changes improve canonicalization by recognizing and applying equivalent but more efficient patterns, both statically and dynamically.

@pdamme pdamme self-requested a review July 17, 2025 10:07
@pdamme pdamme added the student project and AMLS summer 2025 labels Jul 17, 2025
@dogakarakas dogakarakas marked this pull request as draft July 17, 2025 15:32
@dogakarakas dogakarakas marked this pull request as ready for review July 17, 2025 16:48
@dogakarakas dogakarakas marked this pull request as draft July 17, 2025 16:49
@dogakarakas dogakarakas marked this pull request as ready for review July 17, 2025 17:29
@pdamme pdamme left a comment


Thanks for this contribution, @dogakarakas. Adding more simplification rewrites to DAPHNE is great as it can significantly improve both the runtime and memory consumption of DaphneDSL scripts. Overall, the code of your simplification rewrites makes sense and already looks quite good. However, a few less obvious points about the simplification rewrites should be improved. Furthermore, testing needs special attention. It would be great if you could address the following points:

Required changes: (must be addressed before we can merge this PR)

  1. Use CompilerUtils::isScaType()/hasScaType() (see src/compiler/utils/CompilerUtils.h) instead of your own isScalar(). That's important for consistency. One may be tempted to classify every SSA value that is not a matrix, frame, or unknown as a scalar, but it's not that simple. There are other data types (e.g., columns and lists) and we may add more types in the future. The mentioned utility functions are the single point of truth for determining if a type is a scalar type.

  2. Avoid creating ops from MLIR's arith dialect, create ops from the daphne dialect instead. When rewriting daphne::EwAddOp/EwMulOp, you sometimes create the new ops as arith::AddIOp/AddFOp/MulIOp/MulFOp. The arith dialect is quite low-level, which is why you need to check if the operands are integers or floats; such checks can be avoided when you create DAPHNE's own ops. Furthermore, the introduction of arith ops may prevent subsequent simplification rewrites from DAPHNE, which target EwAddOp/EwMulOp.

  3. Use CompilerUtils::isConstant<bool>() to check if the transpose arguments of MatMulOp are known at DaphneDSL compile-time. There are multiple ops for constants in MLIR and it is easy to forget something. That's why we use this central utility function whenever we need the compile-time constant value of an mlir::Value.

  4. Fix issues with the types of newly created intermediate results. When creating a new MLIR op, one usually has to specify its result type. You made a good attempt at selecting the right result types for the ops newly created by your rewrites. However, when the inputs of an expression have different value types, subtle problems can arise. For instance, consider your rewrite sum(X+Y) -> sum(X) + sum(Y). Here, you create two new sumAll ops and give them the result type of the original sumAll op. Now assume X is a matrix with value type si64 (signed 64-bit integer) and Y is a matrix with value type f64 (64-bit floating-point). Then, the type of X+Y is f64 (integer plus float is float) and the type of sum(X+Y) is f64, too. However, the type of sum(X) is si64 (the same as the value type of X), while you would assign f64. This mismatch can lead to problems later, e.g., a kernel for the summation of an si64-matrix with an f64 result type might be missing. To circumvent this kind of problem, I recommend the following: Create ops producing new intermediate results with DAPHNE's unknown type; the type will be inferred in a later compiler pass (the inference pass). Create the final op replacing the original op with the original op's result type (as you already do), because the final result type should not be changed by a rewrite. (A minimal sketch of this approach follows after this list.)

  5. Fix potential bugs related to transposed arguments of MatMulOp. DaphneIR's MatMulOp has four arguments: the two matrices to be multiplied and two booleans that indicate if the corresponding input matrix should be interpreted as transposed (inspired by the underlying BLAS routines, e.g., dgemm()). Looking at your code for the rewrite (X @ Y)[i, j] -> X[i, ] @ Y[, j], I think I spot two bugs that need further investigation:

    • I think in the two cases when exactly one of the input matrices is transposed, sliceRow and sliceCol need to be swapped.
    • I think the newly created MatMulOp should retain the two transpose booleans of the original MatMulOp instead of setting them to false.

    Please add test cases to check if your rewrite behaves correctly in all four combinations of (non-)transposed inputs of the matrix multiplication.

  6. Revise the test cases. We need two kinds of test cases:

    • On the one hand, we need script-level test cases that check if the expressions addressed by your rewrites yield correct results (no matter how DAPHNE calculates them internally). For instance, add a DaphneDSL script as simple as print(sum(t([1, 2, 3, 4]))); and check if it really prints 10\n. These test cases should reside in test/api/cli/expressions/.
    • On the other hand, we need IR test cases which take an input IR, apply the canonicalization pass (through daphne-opt), and check if your rewrites have really been applied (by checking if certain operations exist or don't exist after the pass). To that end, we use LLVM's FileCheck tool. Examples of such test cases can be found in test/codegen/ and test/util/. You added your test cases in test/util/. However, you made them call daphne --explain ... instead of daphne-opt.
      The way you test at the moment may stem from a misunderstanding: in our meeting, I mentioned script-level test cases that use --explain to print the IR, plus checks on the IR output, as an alternative.
  7. Undo changes unrelated to this PR. This PR proposes a few changes that are not related to the task and would not be useful on the main branch. They might originate from changes you made for your local setup. In detail, please undo the changes to build.sh (not setting -j1 will also make the CI checks run faster), daphne-opt/daphne-opt.cpp (only formatting changes), llvm.sh (not needed), and scripts/examples/extensions/myKernels/myKernels.cpp (only formatting changes).
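
Regarding point 4, here is a minimal sketch of the suggested typing discipline for the sum(X+Y) -> sum(X) + sum(Y) rewrite. The op names, the accessors, and the daphne::UnknownType usage are assumptions to be checked against the codebase:

```cpp
// Inside a hypothetical rewrite of sum(X+Y) -> sum(X) + sum(Y): `op` is the
// original sumAll op, `add` the inner ewAdd that produced its argument.
mlir::Value X = add.getLhs(); // accessor names are assumptions
mlir::Value Y = add.getRhs();
mlir::Type unknown = mlir::daphne::UnknownType::get(rewriter.getContext());
mlir::Location loc = op.getLoc();
// New intermediate results get the unknown type; the later inference pass
// assigns the correct types (e.g., si64 for sum(X) if X has value type si64).
mlir::Value sumX = rewriter.create<mlir::daphne::AllAggSumOp>(loc, unknown, X);
mlir::Value sumY = rewriter.create<mlir::daphne::AllAggSumOp>(loc, unknown, Y);
// Only the op replacing the original one keeps the original result type,
// because a rewrite must not change the final result type.
rewriter.replaceOpWithNewOp<mlir::daphne::EwAddOp>(op, op.getType(), sumX, sumY);
```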

Optional changes: (recommended, but not required for merging)
You can further improve the implementation of your rewrites as follows:

  • You don't need to "check if the results of the inner sums are indeed scalars"; the sumAll op always returns a scalar.
  • Feel free to use rewriter.replaceOpWithNewOp() instead of rewriter.create() followed by rewriter.replaceOp(); it's shorter (see the sketch below).
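
For the second optional point, the two variants look roughly like this (a sketch; op, resType, lhs, and rhs are placeholders):

```cpp
// Two-step variant: create the new op, then replace the original op with it.
auto newOp = rewriter.create<mlir::daphne::EwMulOp>(op.getLoc(), resType, lhs, rhs);
rewriter.replaceOp(op, newOp.getResult());

// Equivalent one-step variant: create the new op and replace in one call.
rewriter.replaceOpWithNewOp<mlir::daphne::EwMulOp>(op, resType, lhs, rhs);
```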

I hope this feedback helps you to improve your PR. Feel free to share your thoughts on these points.

@dogakarakas dogakarakas marked this pull request as draft July 25, 2025 17:25
…, hasScaType, isConstant} for consistency and correctness; replaced arith ops with DAPHNE dialect ops to preserve rewrite applicability. Fixed type handling in rewrites, preserved transpose flags in MatMulOp, and revised/extended both IR- and script-level tests while removing unrelated changes.

pdamme commented Jul 30, 2025

Thanks for the revisions so far, @dogakarakas.

A short clarification regarding point 5 (transposed arguments of MatMulOp): DaphneIR's MatMulOp has four arguments (two matrices; two booleans, which indicate if the respective matrix is transposed). However, DaphneDSL's matrix multiplication operator @ only has the two matrices as inputs, e.g., X @ Y. To (indirectly) control the two boolean arguments of MatMulOp from DaphneDSL, simply transpose the operands, e.g., t(X) @ t(Y). There are four cases: X @ Y, X @ t(Y), t(X) @ Y, and t(X) @ t(Y). However, I realized that due to some current limitations (see #447 for some background), only a transpose of the right-hand-side operand is factored into the MatMulOp at the moment. Thus, we can reduce point 5 to two cases: X @ Y and X @ t(Y) should work correctly; at the same time, the rewrite should only be applied if the transpose flag of the left-hand-side input is false.


pdamme commented Jul 30, 2025

PS: Thinking further about it, I think your rewrite will usually not see a MatMulOp with its transpose args set to true. The reason is that transposes are only factored into MatMulOp after property inference (because MatMulOp::canonicalize() takes the sparsity of the inputs into account), while your rewrite can be applied before property inference. Therefore, in your rewrite, please just check that the two transpose arguments of MatMulOp are false, and stop the rewrite otherwise.
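
In code, that guard might look roughly like this (a sketch; matMulOp is a placeholder, and the accessor names getTransa()/getTransb() as well as the exact return shape of CompilerUtils::isConstant<bool>() are assumptions, see src/compiler/utils/CompilerUtils.h):

```cpp
// Only apply the rewrite if both transpose arguments of the MatMulOp are
// compile-time constants with value false; bail out in all other cases.
auto [taKnown, taVal] = CompilerUtils::isConstant<bool>(matMulOp.getTransa());
auto [tbKnown, tbVal] = CompilerUtils::isConstant<bool>(matMulOp.getTransb());
if (!taKnown || !tbKnown || taVal || tbVal)
    return mlir::failure();
```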

We could add and test the remaining three cases later, once we have fixed the current limitations of the transpose args of MatMul, but that's beyond the scope of this PR.

@dogakarakas dogakarakas marked this pull request as ready for review August 3, 2025 13:34