feat(qwen3): add config and quantization files for 0.6B model #482

Merged
chenghuaWang merged 5 commits into UbiquitousLearning:v2 from chenghuaWang:v2 on Oct 16, 2025

@chenghuaWang
Collaborator

  • Add new kai_sme.cpp and kai_sme.hpp files with proper copyright headers
  • Implement ARM-specific linear kernel using SME instructions
  • Include necessary header guards and license information
  • Remove empty KernelSelector files that were not being used

chenghuaWang and others added 3 commits October 16, 2025 08:37
Added new source files Nn.cc and Compile.cc to the MllmFFIExtension library
in CMakeLists.txt to extend the FFI interface.

feat(build): format MLIR installation script

Reformatted the cmake command in install_mlir.sh to a single line for better
readability and consistency in the build script.
- Add new kai_sme.cpp and kai_sme.hpp files with proper copyright headers
- Implement ARM-specific linear kernel using SME instructions
- Include necessary header guards and license information
- Remove empty KernelSelector files that were not being used
@coderabbitai
Contributor

coderabbitai Bot commented Oct 16, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


- Add `config_0.6B_w4a8_i8mm_kai.json` with model architecture settings
- Add `quant_cfg_0.6B_w4a8_i8mm_kai.json` with layer-wise quantization hints
- Configure KaiLinear implementation types for various modules
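
As a rough illustration of what "layer-wise quantization hints" could look like once loaded, the sketch below resolves a per-module implementation type with a fallback. The type alias `QuantHints`, the function `lookupImplType`, and every key and value used with them are hypothetical; the actual schema of `quant_cfg_0.6B_w4a8_i8mm_kai.json` is not shown on this page.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory form of a layer-wise hint table.
// Keys are module names, values are implementation-type labels.
// Neither reflects the real JSON schema used by this PR.
using QuantHints = std::map<std::string, std::string>;

// Return the hint registered for `module`, or `fallback` when the
// module has no entry in the table.
inline std::string lookupImplType(const QuantHints& hints,
                                  const std::string& module,
                                  const std::string& fallback) {
  const auto it = hints.find(module);
  return it == hints.end() ? fallback : it->second;
}
```

A per-module table with an explicit fallback keeps unlisted layers on a default path, which is one plausible reason to ship hints separately from the architecture config.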

perf(cpu): add label support for KaiLinear implementations

- Insert labels for kai linear implementations to enable goto jumps
- Optimize forward path by switching implementations based on input shape
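
A minimal sketch of shape-based switching, assuming the forward path distinguishes a single-row input (a decode step) from a multi-row input (prefill). The enum `LinearImpl`, its members, and `selectLinearImpl` are hypothetical names for illustration; the real KaiLinear implementation types and the label/goto mechanism from this commit are not reproduced here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical kernel variants; the PR's actual implementation
// types are not listed on this page.
enum class LinearImpl { kGemvSingleRow, kGemmMultiRow };

// Assumption: the activation is a [rows, K] matrix. One row is
// treated as a matrix-vector case, anything else as matrix-matrix.
inline LinearImpl selectLinearImpl(std::int64_t rows) {
  return rows == 1 ? LinearImpl::kGemvSingleRow : LinearImpl::kGemmMultiRow;
}

// Human-readable name for logging or debugging.
inline std::string implName(LinearImpl impl) {
  switch (impl) {
    case LinearImpl::kGemvSingleRow: return "gemv_single_row";
    case LinearImpl::kGemmMultiRow: return "gemm_multi_row";
  }
  return "unknown";
}
```

Choosing the variant from the input shape at call time lets one layer serve both prefill and decode without rebuilding the module.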

refactor(mllm): comment out memory cleanup temporarily

- Comment out `clearAll()` call in `shutdownContext()`
- Mark as FIXME for CUDA compatibility

style(qwen3): reformat function signature for readability

- Reformat `makeRotaryPosEmbedding` function declaration to fit within
  line limits
- Improve code style consistency

fix(qwen3): remove redundant finish token callback

- Remove unnecessary finish token callback in Qwen3Session
- Clean up post-processing logic for radix tree insertion
chenghuaWang changed the title from "feat(cpu): add kai_sme kernel implementation placeholder for ARM linear operations" to "feat(qwen3): add config and quantization files for 0.6B model" on Oct 16, 2025
Adds a new devcontainer.json file for cu128 environment with comprehensive
VS Code extension setup including Python, C++, debugging, and formatting
tools.
chenghuaWang merged commit 72f292e into UbiquitousLearning:v2 on Oct 16, 2025
3 checks passed