fix: macOS 26 BNNS compiler compatibility for Sortformer #11
Merged
Alex-Wengg merged 1 commit into main on Jan 29, 2026
macOS 26 introduced stricter validation in the BNNS graph compiler that rejects CoreML models where input and output tensors share the same name. The Sortformer head module had `chunk_pre_encoder_embs` as both an input and output (pass-through for state management), which now fails with:

"Function main has tensor chunk_pre_encoder_embs as both an input and output. Inputs and outputs must be distinct, please add an explicit identity op."

Changes:
- coreml_wrappers.py: Add explicit identity ops (+ 0.0) to create distinct output tensors in SortformerHeadWrapper
- convert_to_coreml.py: Rename output tensors to *_out suffix (chunk_pre_encoder_embs_out, chunk_pre_encoder_lengths_out)

Models converted with this fix require FluidAudio to be updated to read from the new output names.

Fixes FluidInference/FluidAudio#265
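The naming rule described above can be checked before conversion with a few lines of plain Python. This is only a sketch of the idea, not the BNNS compiler's validation and not code from this PR: the helper names are hypothetical, `speaker_preds` is a made-up output name, and treating `chunk_pre_encoder_lengths` as a pass-through input is an assumption based on it being renamed alongside `chunk_pre_encoder_embs`.

```python
from typing import Dict, Iterable, List


def shared_tensor_names(inputs: Iterable[str], outputs: Iterable[str]) -> List[str]:
    """Return the names that appear as both an input and an output, sorted."""
    return sorted(set(inputs) & set(outputs))


def rename_colliding_outputs(
    inputs: Iterable[str], outputs: Iterable[str], suffix: str = "_out"
) -> Dict[str, str]:
    """Map each output name to a name that no longer matches any input name."""
    taken = set(inputs)
    return {name: (name + suffix if name in taken else name) for name in outputs}


# Hypothetical head-module signature; only the chunk_pre_encoder_* names
# come from the PR description.
head_inputs = ["chunk_pre_encoder_embs", "chunk_pre_encoder_lengths"]
head_outputs = ["speaker_preds", "chunk_pre_encoder_embs", "chunk_pre_encoder_lengths"]

collisions = shared_tensor_names(head_inputs, head_outputs)
renames = rename_colliding_outputs(head_inputs, head_outputs)

print(collisions)
print(renames)
```

Renaming is only half of the fix described in this PR: the wrapper also adds an identity op (`+ 0.0`) so that each renamed output is a distinct tensor rather than the input passed straight through.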
46ed4cc to 119b54d
Alex-Wengg added a commit to FluidInference/FluidAudio that referenced this pull request on Jan 26, 2026
## Summary

- Fixes macOS 26 BNNS compiler error for Sortformer models
- Updates to use V2 models with renamed output tensors
- Adds Float16 support for head module outputs

## Problem

macOS 26 introduced stricter validation in the BNNS graph compiler that rejects CoreML models where input and output tensors share the same name:

```
Failed to configure ML Program for the feature types declared in the model description. Function main has tensor chunk_pre_encoder_embs as both an input and output. Inputs and outputs must be distinct, please add an explicit identity op.
```

## Solution

### Code Changes

1. **SortformerModelInference.swift**:
   - Read from renamed outputs (`chunk_pre_encoder_embs_out`, `chunk_pre_encoder_lengths_out`)
   - Handle Float16 output (head module uses fp16 precision)
2. **ModelNames.swift**:
   - Update model names to V2 (`SortformerV2`, `SortformerNvidiaLowV2`, `SortformerNvidiaHighV2`)

### Model Changes (separate PR)

V2 models converted with:
- Explicit identity ops (`+ 0.0`) to create distinct output tensors
- Renamed output tensor names to `*_out` suffix

See: FluidInference/mobius#11

## Testing

Tested on macOS 26.1 (Build 25B78), Apple M2:
- Model loads successfully (~90ms warm, ~1.2s cold)
- Inference works correctly (9-10x RTFx)
- DER results match expected values

Fixes #265
Alex-Wengg pushed a commit that referenced this pull request on Feb 3, 2026
## Summary

- Rename output tensors to `*_out` suffix (`chunk_pre_encoder_embs_out`, `chunk_pre_encoder_lengths_out`)

## Problem

macOS 26 introduced stricter validation in the BNNS graph compiler that rejects CoreML models where input and output tensors share the same name.

## Solution

- Add explicit identity ops (`+ 0.0`) in `SortformerHeadWrapper` to create distinct output tensors

## Breaking Change

Models converted with this fix require FluidAudio to read from new output names:

- `chunk_pre_encoder_embs` → `chunk_pre_encoder_embs_out`
- `chunk_pre_encoder_lengths` → `chunk_pre_encoder_lengths_out`

V2 models uploaded to HuggingFace use these new names.
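For a consumer that has to load both older and newly converted models, the rename could be handled with a small lookup like the one below. It is a hypothetical Python sketch: `RENAMED_OUTPUTS` and `read_output` are illustrative names, the prediction is modeled as a plain dict, and this is not how FluidAudio implements it (the FluidAudio change is in Swift and switches to the V2 models).

```python
from typing import Any, Mapping

# Old -> new output names, as listed in the breaking-change note.
RENAMED_OUTPUTS = {
    "chunk_pre_encoder_embs": "chunk_pre_encoder_embs_out",
    "chunk_pre_encoder_lengths": "chunk_pre_encoder_lengths_out",
}


def read_output(prediction: Mapping[str, Any], legacy_name: str) -> Any:
    """Fetch an output by its pre-fix name, preferring the renamed key."""
    new_name = RENAMED_OUTPUTS.get(legacy_name, legacy_name)
    if new_name in prediction:
        return prediction[new_name]
    if legacy_name in prediction:
        # Model converted before the fix: the output still has the old name.
        return prediction[legacy_name]
    raise KeyError(f"model output has neither {new_name!r} nor {legacy_name!r}")


# Placeholder values standing in for real model outputs.
v2_prediction = {"chunk_pre_encoder_embs_out": [0.1, 0.2]}
v1_prediction = {"chunk_pre_encoder_embs": [0.3, 0.4]}

embs_v2 = read_output(v2_prediction, "chunk_pre_encoder_embs")
embs_v1 = read_output(v1_prediction, "chunk_pre_encoder_embs")
```

Checking the renamed key first keeps the new models on the fast path, and a missing output fails with a message that lists both names that were tried.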
Fixes FluidInference/FluidAudio#265