
feat(tracy): integrate Tracy performance profiling support#498

Merged
chenghuaWang merged 3 commits into UbiquitousLearning:v2 from chenghuaWang:v2
Oct 31, 2025

@chenghuaWang
Collaborator

@chenghuaWang chenghuaWang commented Oct 31, 2025

  • Add Tracy profiling support with new CMake option MLLM_TRACY_ENABLE
  • Integrate Tracy memory tracking in CPU allocator (TracyAlloc/TracyFree)
  • Add Tracy zones to key engine components (Dispatcher, Context, MemoryManager)
  • Update Tracy CMakeLists to link against Tracy::TracyClient
  • Include Tracy headers and zone macros in relevant source files
  • Disable Tracy by default in macOS build configuration
  • Install MllmTracy target when Tracy is enabled
  • Link MllmRT with MllmTracy when Tracy support is active
  • Refactor Tracy header guards and include paths for better compatibility

Summary by CodeRabbit

Release Notes

  • New Features

    • Added profiling instrumentation to memory management, task dispatching, context operations, and dispatcher systems.
  • Chores

    • Updated build system with new installation rules and configurable integration options for platform-specific builds.

chenghuaWang and others added 2 commits October 31, 2025 16:40
@coderabbitai
Contributor

coderabbitai Bot commented Oct 31, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Tracy performance profiling instrumentation is integrated throughout the codebase. Changes include adding Tracy header includes and zone-scoping macros to core components, updating CMake configurations to conditionally link and install the MllmTracy library, modifying Tracy header files with pragma once directives, and disabling Tracy in the macOS Accelerate build configuration.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Build Configuration | CMakeLists.txt, mllm/CMakeLists.txt, mllm/tracy_perf/CMakeLists.txt, tasks/build_osx_apple_silicon_accelerate.yaml | Adds conditional installation rules for the MllmTracy target, links MllmRT to MllmTracy when MLLM_TRACY_ENABLE is true, updates Tracy library linkage to use Tracy::TracyClient, adds public include directories, and disables Tracy in the macOS Accelerate build. |
| Tracy Headers | mllm/tracy_perf/Tracy.hpp, mllm/tracy_perf/Tracy.cpp | Replaces header guards with #pragma once, adds copyright headers, updates include directives from quotes to angle brackets, and restructures include statements under the MLLM_TRACY_ENABLE guard. |
| Performance Instrumentation | mllm/backends/cpu/CPUAllocator.cpp, mllm/backends/cpu/CPUDispatcher.cpp, mllm/engine/Context.cpp, mllm/engine/DispatcherManager.cpp, mllm/engine/MemoryManager.cpp | Adds Tracy header includes and MLLM_TRACY_ZONE_SCOPED macros at the entry points of alloc/free operations, task processing, operation submission, and dispatcher task distribution. |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ProfiledFunc as Profiled Function
    participant Tracy as Tracy Profiler

    Caller->>ProfiledFunc: invoke function
    Note over ProfiledFunc: MLLM_TRACY_ZONE_SCOPED<br/>(zone starts)
    activate ProfiledFunc
    rect rgb(200, 220, 255)
        Note over Tracy: Zone timing active
        ProfiledFunc->>ProfiledFunc: original logic
    end
    deactivate ProfiledFunc
    Note over ProfiledFunc: zone ends
    ProfiledFunc->>Caller: return
    ProfiledFunc->>Tracy: emit timing data
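
The scoped-zone behaviour in the diagram above is the usual RAII pattern: an object constructed at function entry starts the timer, and its destructor reports the elapsed time when the scope exits. The following is a minimal stand-alone sketch of that idea, not Tracy's implementation. `ScopedZone`, `ZoneRecord`, `zoneLog`, and `profiledWork` are hypothetical names used only for illustration, and the log is deliberately not thread-safe.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// One finished zone: its name and how long the scope was active.
struct ZoneRecord {
  std::string name;
  std::chrono::nanoseconds elapsed;
};

// Hypothetical in-process sink standing in for the profiler.
// Not thread-safe; a real profiler uses per-thread queues.
std::vector<ZoneRecord>& zoneLog() {
  static std::vector<ZoneRecord> log;
  return log;
}

// RAII zone: the constructor starts timing, the destructor emits the record.
class ScopedZone {
 public:
  explicit ScopedZone(std::string name)
      : name_(std::move(name)), start_(std::chrono::steady_clock::now()) {}

  ~ScopedZone() {
    const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start_);
    zoneLog().push_back({std::move(name_), elapsed});
  }

  ScopedZone(const ScopedZone&) = delete;
  ScopedZone& operator=(const ScopedZone&) = delete;

 private:
  std::string name_;
  std::chrono::steady_clock::time_point start_;
};

// Hypothetical instrumented function: the zone covers the whole body.
int profiledWork(int n) {
  ScopedZone zone("profiledWork");
  int sum = 0;
  for (int i = 1; i <= n; ++i) sum += i;
  return sum;  // the zone's destructor runs as the function returns
}
```

Calling `profiledWork(10)` returns 55 and leaves exactly one record named `profiledWork` in `zoneLog()`. Because the record is emitted from the destructor, early returns and exceptions are covered without extra code at each exit point.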

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify Tracy::TracyClient linkage is correctly resolved in CMake
  • Confirm MLLM_TRACY_ZONE_SCOPED macro is defined and accessible across all instrumented files
  • Check CPUAllocator.cpp preprocessor guard usage (TracyAlloc/TracyFree calls)
  • Ensure pragma once replacement in Tracy.hpp doesn't conflict with any existing header guards

Poem

🐰 With Tracy's gaze across the land,
Each zone now glows with timing grand,
From memory pools to CPU's command,
From dispatch tasks to every strand—
Performance secrets we now understand! ⏱️

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "feat(tracy): integrate Tracy performance profiling support" directly and accurately reflects the primary objective of this changeset. Across all modified files, including CMakeLists configurations, CPU allocator, dispatchers, memory management, and build configurations, the consistent theme is integrating Tracy profiling capabilities. The title is concise, specific, and uses clear conventional commit format without vague terminology or noise. A reviewer scanning the repository history would immediately understand that this PR adds Tracy performance profiling support to the project. |


@chenghuaWang
Collaborator Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Oct 31, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e81356 and f512ec5.

📒 Files selected for processing (11)
  • CMakeLists.txt (1 hunks)
  • mllm/CMakeLists.txt (1 hunks)
  • mllm/backends/cpu/CPUAllocator.cpp (2 hunks)
  • mllm/backends/cpu/CPUDispatcher.cpp (2 hunks)
  • mllm/engine/Context.cpp (2 hunks)
  • mllm/engine/DispatcherManager.cpp (2 hunks)
  • mllm/engine/MemoryManager.cpp (3 hunks)
  • mllm/tracy_perf/CMakeLists.txt (1 hunks)
  • mllm/tracy_perf/Tracy.cpp (1 hunks)
  • mllm/tracy_perf/Tracy.hpp (1 hunks)
  • tasks/build_osx_apple_silicon_accelerate.yaml (1 hunks)
🧰 Additional context used
🪛 Clang (14.0.6)
mllm/backends/cpu/CPUAllocator.cpp

[error] 26-26: do not use pointer arithmetic

(cppcoreguidelines-pro-bounds-pointer-arithmetic,-warnings-as-errors)


[error] 31-31: do not use pointer arithmetic

(cppcoreguidelines-pro-bounds-pointer-arithmetic,-warnings-as-errors)


[error] 31-31: do not use C-style cast to convert between unrelated types

(cppcoreguidelines-pro-type-cstyle-cast,-warnings-as-errors)

🔇 Additional comments (15)
tasks/build_osx_apple_silicon_accelerate.yaml (1)

13-13: LGTM! Appropriate to disable Tracy on macOS Accelerate builds.

Disabling Tracy for this specific build configuration is a sensible choice, likely to avoid platform-specific issues or maintain build performance.

mllm/backends/cpu/CPUDispatcher.cpp (2)

8-8: LGTM! Tracy header correctly included.


43-43: LGTM! Zone scoping correctly placed.

The MLLM_TRACY_ZONE_SCOPED macro is appropriately positioned at the function entry to profile the entire task processing logic.

mllm/engine/DispatcherManager.cpp (2)

8-8: LGTM! Tracy header correctly included.


18-21: LGTM! Zone scoping correctly profiles the dispatch path.

The Tracy zone correctly wraps the dispatcher receive call, enabling profiling of task submission overhead.

mllm/tracy_perf/Tracy.hpp (2)

1-4: LGTM! Modern header structure with appropriate copyright.

The use of #pragma once is a cleaner alternative to traditional include guards and is widely supported by modern compilers.


6-15: LGTM! Tracy integration correctly implemented.

The angle brackets <tracy/Tracy.hpp> appropriately treat Tracy as an external library, and the conditional macro definitions ensure clean compilation when Tracy is disabled.
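
For readers who have not opened the diff, a wrapper header of the kind described here might look roughly like the sketch below. The macro name `MLLM_TRACY_ZONE_SCOPED`, the `MLLM_TRACY_ENABLE` guard, and the `<tracy/Tracy.hpp>` include come from this PR; mapping the macro to Tracy's `ZoneScoped` is an assumption, and `submitOp` is a hypothetical call site. A real header would also start with `#pragma once`, omitted here so the snippet compiles as a single translation unit.

```cpp
#include <cassert>

#ifdef MLLM_TRACY_ENABLE
// Tracy is treated as an external library, hence the angle brackets.
#include <tracy/Tracy.hpp>
// Assumption: forward to Tracy's scoped-zone macro.
#define MLLM_TRACY_ZONE_SCOPED ZoneScoped
#else
// Tracy disabled: expand to nothing so call sites compile unchanged.
#define MLLM_TRACY_ZONE_SCOPED
#endif

// Hypothetical instrumented function; not taken from the mllm sources.
int submitOp(int x) {
  MLLM_TRACY_ZONE_SCOPED;  // becomes an empty statement when Tracy is off
  assert(x >= 0);
  return x * 2;
}
```

With `MLLM_TRACY_ENABLE` undefined, the macro expands to nothing and `submitOp(21)` simply returns 42, so instrumented files build the same way whether or not Tracy is linked.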

mllm/engine/Context.cpp (2)

10-10: LGTM! Tracy header correctly included.


46-46: LGTM! Zone scoping correctly profiles the operation submission path.

The Tracy zone appropriately covers the entire flow including device selection, op creation, and task submission.

mllm/tracy_perf/CMakeLists.txt (1)

3-4: LGTM! Modern CMake practices applied.

Using the Tracy::TracyClient imported target and exposing public include directories follows modern CMake best practices and ensures proper dependency management.

mllm/CMakeLists.txt (1)

81-84: LGTM! Tracy integration correctly wired into the build.

The conditional compilation block properly adds the Tracy subdirectory and links the MllmTracy library to MllmRT when Tracy support is enabled.

mllm/engine/MemoryManager.cpp (2)

6-6: LGTM! Tracy profiling properly integrated.

The Tracy header include and zone instrumentation are correctly placed for profiling memory allocation operations.

Also applies to: 29-29


63-63: LGTM! Consistent profiling instrumentation.

The Tracy zone for the free method matches the pattern used in alloc.

mllm/backends/cpu/CPUAllocator.cpp (1)

6-6: LGTM! TracyAlloc correctly tracks allocation.

The TracyAlloc call properly tracks the malloc'd pointer with the correct size (including alignment offset).

Also applies to: 22-24

CMakeLists.txt (1)

303-310: LGTM! Installation rule properly configured.

The MllmTracy installation follows the same pattern as other targets and is correctly gated by the MLLM_TRACY_ENABLE flag.

Comment thread mllm/backends/cpu/CPUAllocator.cpp
Comment thread mllm/tracy_perf/Tracy.cpp
The TracyFree call was placed after the free() call, which could lead to
undefined behavior. This change ensures that TracyFree is called before
the memory is actually freed, allowing proper tracking and profiling
of memory deallocation when Tracy is enabled.

Additionally, remove redundant tracy include in Tracy.cpp to avoid
potential conflicts with the local tracy implementation.
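
The ordering this commit describes can be sketched as follows: report the allocation right after `malloc` succeeds, and report the free before `free` releases the block, so the profiler never sees an event for an address that has already been handed back. This is a sketch under stated assumptions, not the code in CPUAllocator.cpp. `alignedAlloc`, `alignedFree`, `trackAlloc`, `trackFree`, and `events` are hypothetical names, and storing the raw pointer just below the aligned address is one common layout that may differ from what mllm does. `TracyAlloc` and `TracyFree` are only called when `MLLM_TRACY_ENABLE` is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

#ifdef MLLM_TRACY_ENABLE
#include <tracy/Tracy.hpp>
#endif

// Hypothetical event log so the call order can be inspected without Tracy.
std::vector<std::string>& events() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical hooks: forward to Tracy when enabled, always log the event.
void trackAlloc(void* p, std::size_t size) {
#ifdef MLLM_TRACY_ENABLE
  TracyAlloc(p, size);
#else
  (void)p;
  (void)size;
#endif
  events().push_back("track_alloc");
}

void trackFree(void* p) {
#ifdef MLLM_TRACY_ENABLE
  TracyFree(p);
#else
  (void)p;
#endif
  events().push_back("track_free");
}

// Over-allocate, align upward, and stash the raw pointer below the result.
void* alignedAlloc(std::size_t size, std::size_t alignment) {
  assert(alignment >= alignof(void*) && (alignment & (alignment - 1)) == 0);
  const std::size_t total = size + alignment - 1 + sizeof(void*);
  void* raw = std::malloc(total);
  if (raw == nullptr) return nullptr;
  events().push_back("malloc");
  trackAlloc(raw, total);  // report after the allocation exists
  const auto base = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
  const auto mask = static_cast<std::uintptr_t>(alignment) - 1;
  void* user = reinterpret_cast<void*>((base + mask) & ~mask);
  static_cast<void**>(user)[-1] = raw;
  return user;
}

void alignedFree(void* user) {
  if (user == nullptr) return;
  void* raw = static_cast<void**>(user)[-1];
  trackFree(raw);  // report before the memory is released
  std::free(raw);
  events().push_back("free");
}
```

Allocating and freeing one block leaves the log as `malloc`, `track_alloc`, `track_free`, `free`. Note that the same raw pointer is passed to both hooks, so the profiler can pair the free event with its allocation.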
@chenghuaWang chenghuaWang merged commit 04ddf7a into UbiquitousLearning:v2 Oct 31, 2025
3 checks passed