feat: Add benchmark for Qwen3 and update readme about benchmark #487
chenghuaWang merged 3 commits into UbiquitousLearning:v2 from
Conversation
Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI. You can disable this status message by setting the …

Walkthrough

This PR establishes a complete LLM benchmark tool framework for testing model performance. Changes include: adding ignore rules for virtual environments and model artifacts, defining an abstract benchmark interface with virtual methods, implementing Qwen3 benchmark support with timing instrumentation and performance metrics, and creating comprehensive documentation for the benchmark tool.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Main as Benchmark Runner
    participant Template as BenchmarkTemplate
    participant Model as Qwen3Model

    Main->>Template: init(config_path, model_path, cache_length)
    Template->>Model: Load config & weights
    Model-->>Template: Ready

    rect rgba(200, 220, 255, 0.3)
        Note over Main: For each PP/TG configuration
        Main->>Template: warmup()
        Template->>Model: Generate test input
        Model-->>Template: Complete
        loop 3 benchmark runs
            Main->>Template: clear() - Reset KV cache
            rect rgba(255, 220, 200, 0.3)
                Main->>Template: run(pp, tg) - Time generation
                Template->>Model: streamGenerate with timing
                Note over Model: Measure: TTFT, prefill speed, decode speed
                Model-->>Template: BenchmarkTemplateResult
                Template-->>Main: Metrics
            end
            Main->>Main: Sleep 5s (cooldown)
        end
    end
    Main->>Main: Compute averages & print results
```
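As a companion to the diagram, here is a minimal C++ sketch of that control flow. `BenchmarkTemplate`, `BenchmarkTemplateResult`, `warmup`, `clear`, and `run` are names from this PR; the loop body itself is illustrative, not the PR's actual main.cpp:

```cpp
#include <chrono>
#include <thread>
#include <utility>
#include <vector>

// Illustrative sketch of the benchmark loop the diagram describes; the real
// main.cpp may differ in argument handling and output formatting.
void runSuite(BenchmarkTemplate& benchmark,
              const std::vector<std::pair<int, int>>& pp_tg_pairs) {
  for (const auto& [pp, tg] : pp_tg_pairs) {
    benchmark.warmup();
    std::vector<BenchmarkTemplateResult> results;
    for (int turn = 0; turn < 3; ++turn) {
      benchmark.clear();  // reset KV cache between turns
      results.push_back(benchmark.run(pp, tg));
      std::this_thread::sleep_for(std::chrono::seconds(5));  // cooldown
    }
    // ...average TTFT / prefill speed / decode speed over `results` and print
  }
}
```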
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

The changes span multiple related files with varying complexity. The Qwen3 implementation requires careful review of timing instrumentation and the streaming generation flow. The benchmark orchestration logic in main.cpp involves threading and result aggregation. However, the changes are cohesive and follow a consistent pattern (implementing a benchmark framework), which moderates the overall review burden.

Possibly related PRs
Poem
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 4
🧹 Nitpick comments (8)
tools/mllm-llm-benchmark/README.md (2)
77-83: Add a language to the code fence to satisfy markdownlint (MD040).

````diff
-```
+```text
 MLLM Build Version : abc123def456
 ...
-```
+```
````
183-195: Correct the string lowercasing in the All.hpp example.

`tolower(model_name)` isn't valid for `std::string`. Use `std::transform`.

```diff
- auto normalized_model_name = tolower(model_name);
+ std::string normalized_model_name = model_name;
+ std::transform(normalized_model_name.begin(), normalized_model_name.end(),
+                normalized_model_name.begin(),
+                [](unsigned char c){ return static_cast<char>(std::tolower(c)); });
```

tools/mllm-llm-benchmark/main.cpp (1)
97-107: Avoid shadowing Argparse vars; use clearer names.

Also compute averages using `results.size()` to stay correct if the run count changes.

```diff
- for (auto [pp, tg] : pp_tg_pairs) {
+ for (auto [prompt_len, gen_len] : pp_tg_pairs) {
-   mllm::print(" Prompt Length (PP) :", pp);
-   mllm::print(" Generation Length (TG):", tg);
+   mllm::print(" Prompt Length (PP) :", prompt_len);
+   mllm::print(" Generation Length (TG):", gen_len);
```

And later:

```diff
- auto result = benchmark->run(pp, tg);
+ auto result = benchmark->run(prompt_len, gen_len);
```

And for averages/summary:

```diff
- mllm::print("Configuration: PP=", pp, " TG=", tg);
+ mllm::print("Configuration: PP=", prompt_len, " TG=", gen_len);
- avg_ttft /= 3.0f;
- avg_prefill_speed /= 3.0f;
- avg_decode_speed /= 3.0f;
+ const float denom = results.empty() ? 1.0f : static_cast<float>(results.size());
+ avg_ttft /= denom;
+ avg_prefill_speed /= denom;
+ avg_decode_speed /= denom;
```

tools/mllm-llm-benchmark/models/Qwen3_W4A32_KAI.hpp (4)
96-98: KV cache "clear" may be incomplete; prefer an explicit clear if available.

Resetting the sequence count might not release buffers or indices. Use the cache's clear/flush API if it exists.

```diff
- model_->kvCache().setCurrentSeqCnt(0);
+ // Prefer a full clear if the API provides one
+ // model_->kvCache().clear(); // or equivalent
+ model_->kvCache().setCurrentSeqCnt(0);
```

Run the provided script in the main.cpp comment to list kvCache methods.
124-145: Use steady_clock for timing; high_resolution_clock may be non-monotonic.

steady_clock is more stable for latency metrics.

```diff
- auto prefill_start = std::chrono::high_resolution_clock::now();
- auto decode_start = std::chrono::high_resolution_clock::now();
- auto decode_end = std::chrono::high_resolution_clock::now();
+ using Clock = std::chrono::steady_clock;
+ auto prefill_start = Clock::now();
+ auto decode_start = Clock::now();
+ auto decode_end = Clock::now();
```

The duration computations stay unchanged:

```cpp
auto prefill_duration = std::chrono::duration_cast<std::chrono::microseconds>(decode_start - prefill_start).count();
auto decode_duration = std::chrono::duration_cast<std::chrono::microseconds>(decode_end - decode_start).count();
```
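For context, a hedged sketch of how these microsecond durations typically map onto the reported metrics; `pp`, `tg`, and the duration variables come from the quoted code, while the exact formulas in the PR's Qwen3_W4A32_KAI.hpp may differ:

```cpp
// Illustrative metric computation; the PR's actual formulas may differ
// (e.g., whether the first decoded token counts toward decode speed).
float ttft_ms       = static_cast<float>(prefill_duration) / 1000.0F;
float prefill_speed = static_cast<float>(pp) * 1'000'000.0F
                      / static_cast<float>(prefill_duration);  // tokens/s
float decode_speed  = static_cast<float>(tg) * 1'000'000.0F
                      / static_cast<float>(decode_duration);   // tokens/s
```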
118-123: Confirm the correct generation arg keys.

Some runtimes expect `max_new_tokens` instead of `max_length`. Verify the expected keys to ensure bounded generation.

```diff
- args["max_length"] = mllm::AnyValue(max_len);
+ // If the runtime expects max_new_tokens:
+ // args["max_new_tokens"] = mllm::AnyValue(max_len);
```

Use the verification script in the main.cpp comment to grep for supported keys.
5-8: Drop the unused include.

The header shouldn't pull in heavy includes unnecessarily.

```diff
-#include <thread>
```

tools/mllm-llm-benchmark/models/BenchmarkTemplate.hpp (1)
21-23: Consider explicitly defining special member functions.

As flagged by static analysis, the class defines a virtual destructor but doesn't address copy/move semantics. For abstract interface classes, it's best practice to explicitly delete copy and move operations to prevent object slicing and clarify intent.

Apply this diff to follow the Rule of Five and silence the static-analysis warning:

```diff
 class BenchmarkTemplate {
  public:
   virtual ~BenchmarkTemplate() = default;
+  BenchmarkTemplate(const BenchmarkTemplate&) = delete;
+  BenchmarkTemplate& operator=(const BenchmarkTemplate&) = delete;
+  BenchmarkTemplate(BenchmarkTemplate&&) = delete;
+  BenchmarkTemplate& operator=(BenchmarkTemplate&&) = delete;
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- .gitignore (1 hunks)
- examples/qwen3/main.cpp (1 hunks)
- tools/mllm-llm-benchmark/README.md (1 hunks)
- tools/mllm-llm-benchmark/main.cpp (2 hunks)
- tools/mllm-llm-benchmark/models/BenchmarkTemplate.hpp (1 hunks)
- tools/mllm-llm-benchmark/models/Qwen3_W4A32_KAI.hpp (1 hunks)
🧰 Additional context used
🪛 Clang (14.0.6)
tools/mllm-llm-benchmark/models/Qwen3_W4A32_KAI.hpp
[error] 5-5: 'memory' file not found
(clang-diagnostic-error)
[error] 14-14: invalid case style for class 'Qwen3_W4A32_KAI_Benchmark'
(readability-identifier-naming,-warnings-as-errors)
[error] 16-16: method 'init' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 16-16: 3 adjacent parameters of 'init' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 62-62: variable 'warmup_length' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 63-63: variable 'warmup_gen' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 76-76: variable 'inputs' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 79-79: variable 'args' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 80-80: variable 'max_len' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 100-100: method 'run' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 100-100: 2 adjacent parameters of 'run' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 100-100: parameter name 'pp' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 100-100: parameter name 'tg' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 103-103: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 103-103: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 103-103: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 115-115: variable 'inputs' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 118-118: variable 'args' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 119-119: variable 'max_len' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 146-146: uninitialized record type: 'result'
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 155-155: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 162-162: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
tools/mllm-llm-benchmark/models/BenchmarkTemplate.hpp
[error] 21-21: class 'BenchmarkTemplate' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
🪛 markdownlint-cli2 (0.18.1)
tools/mllm-llm-benchmark/README.md
77-77: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (3)
examples/qwen3/main.cpp (1)
63-64: No functional change; OK to keep. Visually separates error output from the perf summary.
tools/mllm-llm-benchmark/models/BenchmarkTemplate.hpp (2)
10-14: Clean benchmark result structure.

The struct design is appropriate for holding benchmark metrics. The documentation clearly explains each field's purpose and units.
25-60: Well-designed benchmark interface.

The abstract interface provides a clear contract for benchmark implementations, covering initialization, warmup, execution, and cleanup. The method signatures are appropriate, and the documentation effectively explains each method's purpose and parameters.
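For readers following along without the source open, the interface described above looks roughly like this. This is a sketch reconstructed from this review and the PR's sequence diagram; the exact signatures live in BenchmarkTemplate.hpp:

```cpp
#include <string>

// Rough shape of the interface as described in this review.
struct BenchmarkTemplateResult {
  float ttft;           // time to first token, in ms
  float prefill_speed;  // tokens/s
  float decode_speed;   // tokens/s
};

class BenchmarkTemplate {
 public:
  virtual ~BenchmarkTemplate() = default;
  // Load config and weights; cache_length bounds the KV cache.
  virtual void init(const std::string& config_path, const std::string& model_path,
                    int cache_length) = 0;
  virtual void warmup() = 0;                                // one untimed generation
  virtual BenchmarkTemplateResult run(int pp, int tg) = 0;  // timed: pp prompt tokens, tg generated tokens
  virtual void clear() = 0;                                 // reset KV cache between runs
};
```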
```cpp
// Actual run for 3 turns and gives avg results. Each turn will sleep for 5 seconds to let the SoC or GPU/NPU cool down.
mllm::print("\n========================================");
mllm::print("Starting Benchmark Tests");
mllm::print("========================================\n");
```
🧩 Analysis chain
Harden CLI: require mandatory args; set thread count; validate PP/TG lists.
- Mark `-n`, `-m`, `-c`, `-pp`, `-tg` as required; fail fast if missing.
- Use `num_threads` if set (hook it to the runtime's thread setter).
- Validate non-empty PP/TG lists and catch `stoi` errors.
🏁 Script executed:
```bash
#!/bin/bash
# Find thread count setter and kvCache APIs across the repo.
rg -nP 'setNumThreads|num_threads|threads\W*set' -C2
rg -n 'kvCache\(\)\.' -C2
# Confirm which gen args are supported
rg -nP 'max_length|max_new_tokens' -C2
```

Length of output: 18650
🏁 Script executed:
```bash
#!/bin/bash
# Examine the main.cpp file to see argument setup and validation
cat tools/mllm-llm-benchmark/main.cpp | head -150
```

Length of output: 6072
🏁 Script executed:
```bash
#!/bin/bash
# Search for PP/TG parsing and usage
rg -n 'pp\.value|tg\.value|\bpp\b|\btg\b' tools/mllm-llm-benchmark/main.cpp -A2 -B2
```

Length of output: 2272
🏁 Script executed:
```bash
#!/bin/bash
# Search for num_threads usage in generation calls
rg -n 'num_threads|generate' tools/mllm-llm-benchmark/main.cpp -A3 -B1
```

Length of output: 502
🏁 Script executed:
```bash
#!/bin/bash
# Check if there's error handling for stoi or parsing
rg -n 'stoi|try|catch|parse|exception' tools/mllm-llm-benchmark/main.cpp
```

Length of output: 1040
🏁 Script executed:
```bash
#!/bin/bash
# Check benchmark initialization and what parameters it accepts
rg -n 'benchmark->init|benchmark->run|benchmark->warmup' tools/mllm-llm-benchmark/main.cpp -A1 -B1
```

Length of output: 324
🏁 Script executed:
```bash
#!/bin/bash
# Search for benchmark class definition and its methods
fd -t f -e hpp -e cpp | xargs grep -l 'class.*Benchmark' | head -5
```

Length of output: 0
🏁 Script executed:
```bash
#!/bin/bash
# Check if num_threads is used anywhere else in the benchmark tool
rg -n 'num_threads' tools/mllm-llm-benchmark/ -C1
```

Length of output: 392
🏁 Script executed:
```bash
#!/bin/bash
# Search for .required() usage pattern in the codebase to confirm it's supported
rg -n '\.required\(\)' --type cpp -C1 | head -30
```

Length of output: 0
🏁 Script executed:
```bash
#!/bin/bash
# Find and examine Argparse.hpp to see available methods
fd -name "Argparse.hpp" -type f
```

Length of output: 230
🏁 Script executed:
```bash
#!/bin/bash
# Examine the Argparse header for method signatures
fd -name "Argparse.hpp" -type f -exec cat {} \;
```

Length of output: 230
🏁 Script executed:
```bash
#!/bin/bash
# Find Argparse.hpp with proper escaping
find . -name "Argparse.hpp" -type f
```

Length of output: 25
🏁 Script executed:
```bash
#!/bin/bash
# Search for Argparse class definition
rg -n 'class Argparse' -A 20 --type cpp --type hpp | head -50
```

Length of output: 27
🏁 Script executed:
```bash
#!/bin/bash
# Check for .required or .mandatory or constraint methods in Argparse
rg -n '\bArgparse::add|\.help\(|\.required\(|\bmandatory\(' --type cpp --type hpp | head -30
```

Length of output: 27
🏁 Script executed:
```bash
#!/bin/bash
# Directly read the Argparse.hpp file
cat ./mllm/utils/Argparse.hpp | head -200
```

Length of output: 5337
🏁 Script executed:
```bash
#!/bin/bash
# Search for method definitions in Argparse.hpp
cat ./mllm/utils/Argparse.hpp | grep -n 'required\|mandatory\|constraint' -A2 -B2
```

Length of output: 672
Harden CLI: mark mandatory args; use thread count; validate PP/TG; handle parse errors.
The following issues need fixing:
- Mark required arguments (`-n`, `-m`, `-c`, `-pp`, `-tg`) using `.required()` on lines 18-23. Argparse supports this and enforces it during `parse()`.
- Hook `num_threads` to the runtime: it is currently retrieved at line 21 but never used. Call `mllm::setMaximumNumThreads(num_threads.get())` after parsing, or pass it to `benchmark->init()` if the method accepts it.
- Validate PP/TG non-empty and handle `stoi` errors (lines 75-83): wrap the parsing loops in try-catch to catch `std::invalid_argument` and `std::out_of_range` from `stoi()`. Also validate that `pp.get()` and `tg.get()` are non-empty before parsing.
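A minimal sketch of the third bullet, assuming the PP/TG flags carry comma-separated integer lists; `parseLengthList` is a hypothetical helper, and `mllm::print` is the logger used elsewhere in this tool:

```cpp
#include <cstdlib>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative hardening of the PP/TG parsing; the helper itself is
// hypothetical and only shows the validation pattern described above.
std::vector<int> parseLengthList(const std::string& csv, const char* flag) {
  if (csv.empty()) {
    mllm::print("Error:", flag, "must be a non-empty comma-separated list");
    std::exit(1);
  }
  std::vector<int> out;
  std::stringstream ss(csv);
  std::string item;
  while (std::getline(ss, item, ',')) {
    try {
      out.push_back(std::stoi(item));
    } catch (const std::invalid_argument&) {
      mllm::print("Error: invalid number in", flag, ":", item);
      std::exit(1);
    } catch (const std::out_of_range&) {
      mllm::print("Error: value out of range in", flag, ":", item);
      std::exit(1);
    }
  }
  return out;
}
```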
🤖 Prompt for AI Agents
In tools/mllm-llm-benchmark/main.cpp around lines 92-96 and touching argument
definitions at lines 18-23 and parsing/validation at 75-83: mark the CLI args
-n, -m, -c, -pp, -tg as required by calling .required() on their definitions so
parse() enforces presence; after parsing, call
mllm::setMaximumNumThreads(num_threads.get()) (or pass num_threads.get() into
benchmark->init() if preferred) to apply the thread count; before converting pp
and tg strings ensure pp.get() and tg.get() are non-empty and wrap the stoi
loops in try-catch blocks catching std::invalid_argument and std::out_of_range,
logging a clear error and exiting/returning on parse failure.
```cpp
      // Run benchmark
      auto result = benchmark->run(pp, tg);

      // TODO Sleep some times.
      results.push_back(result);

      mllm::print(" TTFT :", result.ttft, "ms");
      mllm::print(" Prefill Speed:", result.prefill_speed, "tokens/s");
      mllm::print(" Decode Speed :", result.decode_speed, "tokens/s");
```
Wrap `run` in try/catch to prevent the entire suite aborting on one failure.

Also log and continue.

```diff
-      // Run benchmark
-      auto result = benchmark->run(pp, tg);
-      results.push_back(result);
+      // Run benchmark
+      try {
+        auto result = benchmark->run(prompt_len, gen_len);
+        results.push_back(result);
+      } catch (const std::exception& ex) {
+        mllm::print(" Error during run: ", ex.what());
+        continue; // proceed with next run
+      }
```
+ }📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement (the prints sit inside the try block so `result` stays in scope, as the AI-agent prompt below describes):

```cpp
      // Run benchmark
      try {
        auto result = benchmark->run(pp, tg);
        results.push_back(result);

        mllm::print(" TTFT :", result.ttft, "ms");
        mllm::print(" Prefill Speed:", result.prefill_speed, "tokens/s");
        mllm::print(" Decode Speed :", result.decode_speed, "tokens/s");
      } catch (const std::exception& ex) {
        mllm::print(" Error during run: ", ex.what());
        continue; // proceed with next run
      }
```
🤖 Prompt for AI Agents
In tools/mllm-llm-benchmark/main.cpp around lines 114-121, wrap the call to
benchmark->run(pp, tg) along with pushing the result and the mllm::print lines
in a try block so a single benchmark failure doesn't abort the whole suite; on
success push_back the result and print speeds as before, on failure catch
std::exception& e and log the error (e.g. with mllm::print including e.what()),
also add a catch(...) to log unknown failures, and simply continue the loop
without pushing a result.
```cpp
  auto param = mllm::load(model_path, mllm::ModelFileVersion::kV2);
  model_->load(param);

  mllm::print("Model initialized successfully");
}
```
Don't hard-code ModelFileVersion; autodetect or make it configurable.

Weights may be V1 or V2. For robustness, detect the version from the file or pass it in.

```diff
-  auto param = mllm::load(model_path, mllm::ModelFileVersion::kV2);
+  // If available, detect the file version; otherwise, expose a parameter or document V2-only.
+  auto param = mllm::load(model_path, /* TODO: detect or inject version */ mllm::ModelFileVersion::kV2);
```

If detection exists (e.g., `mllm::detectModelFileVersion(model_path)`), use it here. Otherwise, document V2-only in the README and CLI help.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In tools/mllm-llm-benchmark/models/Qwen3_W4A32_KAI.hpp around lines 29-33, the
code hard-codes ModelFileVersion::kV2 when loading the model; change this to
either autodetect the version from model_path (e.g., call
mllm::detectModelFileVersion(model_path) and pass the returned enum to
mllm::load) or make the version configurable (read from a CLI flag or config and
use that value), and if no detection API is available update the README and CLI
help text to explicitly state the code only supports V2 files so callers know
the limitation.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
LGTM
Summary by CodeRabbit

New Features
- LLM benchmark tool with an abstract benchmark interface (registered via All.hpp) and a Qwen3 implementation (Qwen3_W4A32_KAI.hpp) reporting TTFT, prefill speed, and decode speed.

Documentation
- README for the benchmark tool.

Chores
- Ignore rules for virtual environments and model artifacts.