@kurisu6912
Collaborator

@kurisu6912 kurisu6912 commented Dec 2, 2025

This PR integrates Z3 into the TVM arith analyzer.

  • Only integer expressions are supported; TVM lanes and vscale are not.
  • It uses Z3's integer theory to prove expressions in TVM.

TODO

Added API

  • analyzer.get_smtlib2(): Returns an SMT-LIB2 string containing all constraints, variables, and debug information
  • analyzer.get_smtlib2(expr): Same as get_smtlib2(), with expr added as the proof target
  • analyzer.set_z3_timeout_ms(t): Sets the Z3 timeout in milliseconds
  • analyzer.set_z3_max_step(step): Sets the Z3 max step
  • analyzer.get_z3_stats(): Returns Z3 statistics (as a str)

Example

from tvm.arith import Analyzer
from tvm.tir.expr import And, Var

analyzer = Analyzer()
a = Var('a', 'int32')
b = Var('b', 'int32')
c = Var('c', 'int32')

# Assume a, b, and c are all positive while inside this scope.
with analyzer.constraint_scope(And(And(a > 0, b > 0), c > 0)):
    try_to_prove = (a - b) // c * c + b <= a
    print(analyzer.can_prove(try_to_prove))    # proving result
    print(analyzer.get_smtlib2())              # constraints only
    print(analyzer.get_smtlib2(try_to_prove))  # constraints plus the negated goal
    print(analyzer.get_z3_stats())             # solver statistics as a string

This example gives the proving result True.
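
A short derivation of why the goal holds may help here. It assumes that `//` denotes floor division and uses only the constraint c > 0; it is a sketch of the underlying arithmetic, not a description of how Z3 reaches the result.

```latex
\[
c > 0
\;\Longrightarrow\;
\left\lfloor \frac{a-b}{c} \right\rfloor \le \frac{a-b}{c}
\;\Longrightarrow\;
\left\lfloor \frac{a-b}{c} \right\rfloor \cdot c \le a - b
\;\Longrightarrow\;
\left\lfloor \frac{a-b}{c} \right\rfloor \cdot c + b \le a .
\]
```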

analyzer.get_smtlib2():

(set-option :timeout 5)
; Entering Scope
; Entering Scope
; constraint: a > 0 and b > 0 and c > 0
; 
(set-info :status unknown)
(declare-fun a () Int)
(declare-fun b () Int)
(declare-fun c () Int)
(assert (<= (- 2147483648) a))
(assert (<= a 2147483647))
(assert (<= (- 2147483648) b))
(assert (<= b 2147483647))
(assert (<= (- 2147483648) c))
(assert (<= c 2147483647))
(assert
 (let (($x43 (> c 0)))
 (let (($x37 (> b 0)))
 (let (($x32 (> a 0)))
 (let (($x38 (and $x32 $x37)))
 (and $x38 $x43))))))
(check-sat)
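
The range assertions in the dump above bound each int32 variable to the signed 32-bit interval. The following sketch reproduces those lines with a hypothetical helper, `int_range_asserts`, which is not part of the PR and only illustrates where the constants come from.

```python
from typing import List


def int_range_asserts(name: str, bits: int = 32) -> List[str]:
    """Emit SMT-LIB2 lines declaring `name` as an Int limited to a signed range.

    Hypothetical helper for illustration; the PR's actual encoder is not shown here.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    # SMT-LIB2 writes a negative literal as (- N).
    lo_text = f"(- {-lo})"
    return [
        f"(declare-fun {name} () Int)",
        f"(assert (<= {lo_text} {name}))",
        f"(assert (<= {name} {hi}))",
    ]


lines = int_range_asserts("a")
print("\n".join(lines))
```

Because the variables are modelled as unbounded integers with explicit range assertions, the bounds apply to the variables themselves rather than to every intermediate expression.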

analyzer.get_smtlib2(try_to_prove):

(set-option :timeout 5)
; Entering Scope
; Entering Scope
; constraint: a > 0 and b > 0 and c > 0
; Trying to prove: (a - b) // c * c + b <= a
; 
(set-info :status unknown)
(declare-fun a () Int)
(declare-fun b () Int)
(declare-fun c () Int)
(assert (<= (- 2147483648) a))
(assert (<= a 2147483647))
(assert (<= (- 2147483648) b))
(assert (<= b 2147483647))
(assert (<= (- 2147483648) c))
(assert (<= c 2147483647))
(assert
 (let (($x43 (> c 0)))
 (let (($x37 (> b 0)))
 (let (($x32 (> a 0)))
 (let (($x38 (and $x32 $x37)))
 (and $x38 $x43))))))
(assert
 (let ((?x53 (- a b)))
 (let ((?x54 (div ?x53 c)))
 (let ((?x55 (* ?x54 c)))
 (let ((?x56 (+ ?x55 b)))
 (let (($x57 (<= ?x56 a)))
 (not $x57)))))))
(check-sat)
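
The second dump asserts the constraints together with the negation of the goal, so an unsatisfiable result means the goal is proved. The sketch below imitates that refutation style with a brute-force search over a small domain. The function `find_counterexample` and the chosen domain are assumptions made for illustration, and a bounded search is of course much weaker than an SMT proof.

```python
from itertools import product


def find_counterexample(constraint, goal, values):
    """Return the first (a, b, c) that satisfies `constraint` but violates `goal`."""
    for a, b, c in product(values, repeat=3):
        # `constraint` is checked first, so `goal` never divides by c == 0.
        if constraint(a, b, c) and not goal(a, b, c):
            return (a, b, c)
    return None


values = range(-8, 9)  # small illustrative domain, not the full int32 range


def positive(a, b, c):
    return a > 0 and b > 0 and c > 0


def goal(a, b, c):
    return (a - b) // c * c + b <= a


def equality(a, b, c):
    return (a - b) // c * c + b == a


# No assignment violates the goal: the bounded analogue of "unsat".
proved = find_counterexample(positive, goal, values)

# A goal that is false in general yields a witness: the analogue of "sat".
witness = find_counterexample(positive, equality, values)
print(proved, witness)
```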

analyzer.get_z3_stats():

(:added-eqs                      9
 :arith-eq-adapter               3
 :arith-bound-propagations-cheap 3
 :arith-conflicts                1
 :arith-fixed-eqs                3
 :arith-lower                    8
 :arith-make-feasible            3
 :arith-max-columns              20
 :arith-max-rows                 8
 :arith-propagations             3
 :arith-upper                    10
 :conflicts                      1
 :del-clause                     15
 :max-memory                     36.29
 :memory                         19.86
 :mk-bool-var                    28
 :mk-clause                      15
 :num-allocs                     8949891
 :num-checks                     1
 :propagations                   7
 :rlimit-count                   550)

Summary by CodeRabbit

  • New Features

    • Optional Z3 SMT solver integration for stronger arithmetic proving and verification
    • Build/config options to enable PyPI-provided Z3 and adjust runtime/install search paths
  • Breaking Changes

    • A cumulative-sum API was tightened to accept only buffer inputs (may affect callers)
  • Tests

    • Extensive new arithmetic, simplification, iter-map, int-set, and CUDA-targeted test coverage
  • Chores

    • Added z3-solver dependency and updated third-party submodule and packaging/tooling entries


@coderabbitai
Contributor

coderabbitai bot commented Dec 2, 2025

Walkthrough

Adds optional Z3 integration to the build and packaging (CMake + Python packaging), introduces multiple new arithmetic test modules (including Z3-backed proofs), parameterizes CUDA GEMM tests, tightens cumsum_fragment's signature to tir.Buffer, updates TVM submodule pointer, and adds z3-solver to requirements.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Z3 Integration (Build & CMake) | CMakeLists.txt, cmake/pypi-z3/FindZ3.cmake | Adds USE_Z3 and USE_PYPI_Z3 CMake options; conditionally loads cmake/pypi-z3 and find_package(Z3) when opted in; adjusts BUILD_RPATH and TILELANG_INSTALL_RPATH for Z3 libraries on Apple/UNIX; sets per-target INSTALL_RPATH for tilelang, tilelang_module, tvm, tvm_runtime; introduces imported target z3::libz3. |
| Packaging & Dependencies | pyproject.toml, requirements.txt, requirements-dev.txt, requirements-test.txt | Adds z3-solver>=4.13.0 to dependencies and build-system requirements; adds patchelf to build requires (platform-specific); updates manylinux image and wheel repair exclusions to omit libz3.so. |
| Arithmetic Analysis Tests | testing/python/arith/test_arith_hard.py, testing/python/arith/test_arith_intset.py, testing/python/arith/test_arith_iter_affine_map.py, testing/python/arith/test_arith_simplify.py | Adds four comprehensive test modules exercising Analyzer, int-set reasoning, iter-affine-map analysis, simplification and SMT-LIB2 / Z3-backed proofs, including many new test helpers and entrypoints. |
| CUDA GEMM Tests Parameterization | testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py | Parameterizes SM90/SM80 tests with explicit trans_A/trans_B args, requires CUDA and compute-version guards, and threads new params through run_gemm_sp_* call chain. |
| Tests Cleanup / Minor Fixes | testing/python/transform/test_tilelang_transform_legalize_safe_memory_access.py, tilelang/utils/sparse.py | Comments out an issue-specific test and wrapper; removes a non-functional comment in sparse.py except block. |
| API Signature Restriction | tilelang/language/reduce.py | Narrows cumsum_fragment macro signature: src and dst now typed as tir.Buffer (previously allowed BufferRegion/BufferLoad). |
| Submodule Update | 3rdparty/tvm | Advances TVM submodule pointer from 2b1ead1a... to 4d3ec925... (no code-level changes). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

  • Review hotspots:
    • CMakeLists.txt / cmake/pypi-z3/FindZ3.cmake: RPATH logic, imported target properties, and conditional find_package behavior.
    • Packaging changes in pyproject.toml: build-system requires and wheel repair exclusions.
    • New test modules: scan for flaky assumptions, Z3 dependency usage, and test entrypoints.
    • tilelang/language/reduce.py: verify callers are compatible with narrowed cumsum_fragment signature.

Possibly related PRs

Suggested reviewers

  • LeiWang1999
  • XuehaiPan

Poem

🐰 I hopped into CMake with a cheerful click,
I fetched Z3 from PyPI — quick, quick, quick!
Tests grew bold proofs, SMT whispers at night,
Buffers got picky, now signatures are tight.
Submodule bumped, wheels tuned just right — hooray!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.64% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title '[Feat] Integrate Z3 in TVM Arith Analyzer' clearly and specifically describes the main feature addition, integrating the Z3 SMT solver into TVM's Arith Analyzer, which aligns with the core purpose of this PR as confirmed by the objectives and file changes. |

@github-actions

github-actions bot commented Dec 2, 2025

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@LeiWang1999 LeiWang1999 self-requested a review December 2, 2025 15:41
@SiriusNEO SiriusNEO self-requested a review December 4, 2025 18:48
oraluben previously approved these changes Dec 8, 2025
@oraluben
Collaborator

oraluben commented Dec 15, 2025

macOS support is pending on tile-ai/tvm#15. cc @kurisu6912

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
CMakeLists.txt (1)

313-314: Variable name mismatch: Python3_SITELIB is undefined; use Python_SITELIB.

Line 279 calls find_package(Python REQUIRED ...), which defines Python_SITELIB (not Python3_SITELIB). The BUILD_RPATH references will expand to empty, breaking Z3 library discovery at runtime during development builds.

Apply this diff to fix the variable name:

-  set_property(TARGET tvm APPEND PROPERTY BUILD_RPATH ${Python3_SITELIB}/z3/lib)
-  set_property(TARGET tvm APPEND PROPERTY BUILD_RPATH ${Python3_SITELIB}/z3/bin)
+  set_property(TARGET tvm APPEND PROPERTY BUILD_RPATH ${Python_SITELIB}/z3/lib)
+  set_property(TARGET tvm APPEND PROPERTY BUILD_RPATH ${Python_SITELIB}/z3/bin)
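
To check which directories the corrected BUILD_RPATH entries would point at, a small script can compute them from the interpreter. This is a sketch under the assumption that CMake's Python_SITELIB corresponds to the "purelib" path reported by `sysconfig`. The helper name `z3_rpath_candidates` is hypothetical, and the `z3/lib` and `z3/bin` subdirectories are taken from the diff above.

```python
import os
import sysconfig


def z3_rpath_candidates(site_dir=None):
    """Return the z3/lib and z3/bin directories under a site-packages directory."""
    # Assumption: Python_SITELIB matches sysconfig's "purelib" path.
    site_dir = site_dir or sysconfig.get_path("purelib")
    return [os.path.join(site_dir, "z3", sub) for sub in ("lib", "bin")]


for path in z3_rpath_candidates():
    print(path, "exists" if os.path.isdir(path) else "missing")
```

If both directories are reported as missing, the z3-solver package is probably not installed in the interpreter that CMake picked up.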
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ca48b1 and 1f9e489.

📒 Files selected for processing (3)
  • CMakeLists.txt (2 hunks)
  • pyproject.toml (4 hunks)
  • requirements-dev.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • requirements-dev.txt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
  • GitHub Check: Build SDist
🔇 Additional comments (5)
pyproject.toml (3)

46-46: Platform-conditional Z3 and patchelf dependencies are correctly scoped.

The conditional placement (platform_system == 'Linux') in both dependencies and build-system.requires ensures Z3-related tooling is only pulled on Linux, avoiding unnecessary dependencies on macOS and Windows. This aligns with the Z3 packaging strategy outlined in the PR (avoiding SONAME issues across platforms).

Also applies to: 57-62


236-236: manylinux version bump reflects Z3 glibc requirement.

The change from manylinux_2_28 (glibc 2.28) to manylinux_2_34 (glibc 2.34) on aarch64 aligns with Z3's minimum glibc requirement on ARM, as noted in the PR discussion. This ensures compatibility without linker errors on ARM builds.


265-265: libz3.so exclusion from wheel repair is correct.

Excluding libz3.so from auditwheel's repair prevents the Z3 shared library from being bundled twice (once from the PyPI wheel, once from repair), which could cause symbol conflicts or packaging issues.

CMakeLists.txt (2)

318-335: INSTALL_RPATH logic correctly differentiates platforms and Z3 availability.

The conditional setup (Apple with @loader_path paths, UNIX with $ORIGIN and optional Z3 library path) is well-structured and matches the packaging strategy. macOS uses loader_path for relocatability, and Linux respects both TVM FFI and Z3 library locations when Z3 is enabled.


341-362: Install-time Z3 soname patching is correctly guarded.

The patchelf logic is properly gated behind both if(USE_Z3 AND USE_PYPI_Z3) and if(UNIX AND NOT APPLE), ensuring it only runs on Linux when Z3 is actually enabled. The glob pattern *.so* is appropriate for Linux-only context. This approach aligns with the PR's strategy of patching Z3's custom SONAME to a stable libz3.so to avoid runtime binding issues.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
CMakeLists.txt (1)

225-231: Unconditional USE_Z3=ON overrides user configuration and surprises opt-out users.

Setting USE_Z3 unconditionally forces Z3 support even for users explicitly building with -DUSE_Z3=OFF or wanting a minimal CPU-only build. This was previously flagged as a major issue; a gating option is needed.

Apply this diff to respect user configuration:

+# Gate Z3 support behind TileLang option (default OFF for minimal builds)
+option(TILELANG_USE_Z3 "Enable Z3 support in TVM arith analyzer" OFF)
+
-set(USE_Z3      ON CACHE STRING "Use Z3 SMT solver for TileLang optimizations")
-set(USE_PYPI_Z3 ON CACHE BOOL   "Use Z3 provided by PyPI z3-solver package")
-
-if(USE_Z3 AND USE_PYPI_Z3)
+if(TILELANG_USE_Z3)
+  set(USE_Z3      ON CACHE STRING "Use Z3 SMT solver for TileLang optimizations")
+  set(USE_PYPI_Z3 ON CACHE BOOL   "Use Z3 provided by PyPI z3-solver package")
   list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_LIST_DIR}/cmake/pypi-z3")
   find_package(Z3 REQUIRED)
 endif()

This allows users to opt-in via -DTILELANG_USE_Z3=ON while respecting explicit -DUSE_Z3=OFF choices.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61102b0 and 86a668c.

📒 Files selected for processing (1)
  • CMakeLists.txt (2 hunks)
🔇 Additional comments (1)
CMakeLists.txt (1)

305-323: RPATH configuration is well-structured and platform-aware.

The INSTALL_RPATH setup correctly differentiates between platforms (macOS @loader_path, Linux $ORIGIN), properly gates Z3-specific paths, and applies settings consistently to all relevant targets. Assuming the Python_SITELIB fix above is applied, this segment will function correctly.

@kurisu6912 kurisu6912 linked an issue Dec 15, 2025 that may be closed by this pull request
1 task
LeiWang1999 previously approved these changes Dec 16, 2025
SiriusNEO previously approved these changes Dec 16, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py (2)

219-220: Consider removing decorators from helper function.

The @tilelang.testing.requires_cuda and @tilelang.testing.requires_cuda_compute_version decorators on run_gemm_sp_sm90 are redundant since this is a helper function, not a pytest-discovered test. Pytest markers only affect test functions (those starting with test_). The effective decorators are on test_gemm_sp_sm90 (lines 310-311), so these can be safely removed for clarity.

Apply this diff:

-@tilelang.testing.requires_cuda
-@tilelang.testing.requires_cuda_compute_version(9, 0)
 def run_gemm_sp_sm90(

264-266: Consider removing decorators from helper function.

Similar to run_gemm_sp_sm90, these decorators on run_gemm_sp_sm80 are redundant. The effective compute-version guards are already on test_gemm_sp_sm80 (lines 334-336), so these can be removed for clarity.

Apply this diff:

-@tilelang.testing.requires_cuda
-@tilelang.testing.requires_cuda_compute_version_ge(8, 0)
-@tilelang.testing.requires_cuda_compute_version_le(8, 9)
 def run_gemm_sp_sm80(
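
The reasoning in both nitpicks can be illustrated with a stand-in decorator. The sketch assumes, as the comments state, that the decorators behave like markers that only attach metadata for the test runner. `requires_feature` and `collect_and_run` are hypothetical names and do not reflect how tilelang.testing is actually implemented.

```python
def requires_feature(name):
    """Hypothetical marker-style decorator: it tags the function and nothing else."""
    def decorate(func):
        func.required_features = getattr(func, "required_features", ()) + (name,)
        return func
    return decorate


@requires_feature("cuda")
def run_helper(x):
    # The tag is metadata only; a direct call always executes the body.
    return x * 2


def collect_and_run(func, available, *args):
    """Mimic a test runner that skips functions whose tagged requirements are unmet."""
    missing = [f for f in getattr(func, "required_features", ()) if f not in available]
    return "skipped" if missing else func(*args)


direct = run_helper(21)                              # the tag is ignored
via_runner = collect_and_run(run_helper, set(), 21)  # the runner honours the tag
print(direct, via_runner)
```

If a decorator instead wraps the function and skips at call time, it would still take effect on a helper, so the actual implementation is worth checking before removing it.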
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3f20b8 and 0fc8549.

📒 Files selected for processing (1)
  • testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-14T07:56:11.098Z
Learnt from: lucifer1004
Repo: tile-ai/tilelang PR: 1256
File: testing/python/jit/test_tilelang_jit_gemm_nvrtc.py:55-115
Timestamp: 2025-11-14T07:56:11.098Z
Learning: In `testing/python/jit/test_tilelang_jit_gemm_nvrtc.py`, the global function `tilelang_callback_cuda_postproc` registered via `tvm.register_global_func(..., override=True)` is intentionally not restored after the test completes, as the persistent behavior is expected.

Applied to files:

  • testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py
🧬 Code graph analysis (1)
testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py (3)
tilelang/testing/__init__.py (3)
  • requires_cuda_compute_version (37-101)
  • requires_cuda_compute_version_ge (104-105)
  • requires_cuda_compute_version_le (120-121)
tilelang/tileop/gemm/__init__.py (2)
  • trans_A (101-102)
  • trans_B (105-106)
tilelang/tileop/gemm_sp/gemm_sp_base.py (2)
  • trans_A (45-46)
  • trans_B (49-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build SDist
  • GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
🔇 Additional comments (1)
testing/python/tilelibrary/test_tilelang_tilelibrary_gemm_sp.py (1)

233-234: LGTM: Transpose parameter additions are correct.

The trans_A and trans_B parameters are properly threaded through the entire call chain (kernel generation → run_gemm_sp → validation) and correctly control matrix transposition in the sparse GEMM tests. The test parameterization provides good coverage of transpose combinations.

Also applies to: 279-280
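
As a rough illustration of what covering the transpose combinations means, the sketch below computes a reference product for every (trans_A, trans_B) pair on plain nested lists. `ref_gemm` is a hypothetical helper, and it assumes the flags mean "transpose the operand before multiplying"; the real tests may define the layout differently.

```python
from itertools import product


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def ref_gemm(a, b, trans_a=False, trans_b=False):
    """Reference C = op(A) @ op(B), where op optionally transposes its operand."""
    a = transpose(a) if trans_a else a
    b = transpose(b) if trans_b else b
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# Enumerate all four flag combinations, as a parameterized test would.
results = {flags: ref_gemm(A, B, *flags) for flags in product([False, True], repeat=2)}
for flags, c in results.items():
    print(flags, c)
```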

@LeiWang1999 LeiWang1999 merged commit 9c21586 into tile-ai:main Dec 17, 2025
11 of 15 checks passed
@yyttt6
Collaborator

yyttt6 commented Dec 17, 2025

This PR brings performance improvements for mha_fwd_bhsd_wgmma_pipelined, as well as other cases such as warp_specialize_gemm_copy1_gemm_0.
[Image: speedup_bar]
