fix(build): preserve identifiers in strip-and-sync framework fallback #1259
Greptile Summary
This PR extends the strip-and-sync fallback (introduced in #1258) for non-standard PyInstaller-embedded Python.framework bundles to preserve the original codesign identifier before stripping pre-existing signatures. Without this, re-signing the canonical binary would let
Confidence Score: 5/5
Safe to merge: the fix is logically correct and the only remaining findings are minor style suggestions. The core invariant of the patch (capture identifier → strip → re-sign with original identifier → sync) is implemented in the right order with a sensible fallback. No P0/P1 issues were found. No files require special attention.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant S as build_app_tauri.sh
participant CS as codesign
participant FS as Filesystem
S->>FS: find fw -type f | xargs file | grep Mach-O
FS-->>S: [fw_bin1, fw_bin2, fw_bin3, ...]
Note over S: First iteration (canonical)
S->>CS: codesign -d fw_bin1 (capture identifier)
CS-->>S: Identifier=com.python.python
S->>CS: codesign --remove-signature fw_bin1
S->>CS: sign_binary_with_identifier fw_bin1 com.python.python
CS-->>S: Signed canonical binary
Note over S: Subsequent iterations (duplicates)
S->>CS: codesign --remove-signature fw_bin2
S->>FS: cp -p fw_bin1 fw_bin2
S->>CS: codesign --remove-signature fw_bin3
S->>FS: cp -p fw_bin1 fw_bin3
Note over S: All copies now byte-identical and share the same signature
Reviews (1): Last reviewed commit: "fix(build): preserve identifiers in stri..."
while IFS= read -r fw_bin; do
    if [ -z "$canonical_identifier" ]; then
        canonical_identifier=$(codesign -d "$fw_bin" 2>&1 | awk -F= '/^Identifier=/{print substr($0, index($0, "=") + 1)}')
        if [ -z "$canonical_identifier" ]; then
            canonical_identifier="$(basename "$fw_bin")"
        fi
    fi
Identifier captured from first find result: order is non-deterministic
find "$fw" -type f produces filesystem-order output with no ordering guarantee, so the binary whose identifier is captured as the canonical one is unpredictable. In the normal case this is harmless — all PyInstaller copies of Python carry the same original identifier — but if a prior aborted run left one copy re-signed with a degraded identifier (e.g., Python), that copy could surface first and its bad identifier would be propagated to all others.
A small defensive improvement would be to prefer the deepest versioned path (e.g., Versions/3.x/Python) over the shallow symlink alias, or at least log which binary's identifier was selected so it's visible in CI output.
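One way the suggestion above could look, as a minimal sketch rather than the code in this PR: sort the candidate list so the choice no longer depends on filesystem order, prefer a path under Versions/ when one exists, and log the selection to stderr. The helper name pick_canonical is hypothetical, and the Versions/ preference is an assumption about the bundle layout. It takes the first match in sorted order, not strictly the deepest path.

```shell
# Hypothetical helper (not part of build_app_tauri.sh): pick a canonical
# binary deterministically from newline-separated paths on stdin.
pick_canonical() {
    local sorted preferred
    # Bytewise sort removes the dependence on filesystem enumeration order.
    sorted=$(LC_ALL=C sort)
    # Assumed layout: prefer the first path under Versions/ if there is one.
    preferred=$(printf '%s\n' "$sorted" | grep '/Versions/' | head -n 1 || true)
    if [ -z "$preferred" ]; then
        preferred=$(printf '%s\n' "$sorted" | head -n 1)
    fi
    # Log the choice to stderr so it is visible in CI output.
    echo "canonical binary: $preferred" >&2
    printf '%s\n' "$preferred"
}
```

In the fallback loop this would be fed the same Mach-O list that find produces, and the identifier would be captured from its result instead of from whichever file happens to come first.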
while IFS= read -r fw_bin; do
    if [ -z "$canonical_identifier" ]; then
        canonical_identifier=$(codesign -d "$fw_bin" 2>&1 | awk -F= '/^Identifier=/{print substr($0, index($0, "=") + 1)}')
        if [ -z "$canonical_identifier" ]; then
-F= field separator is set but unused
awk -F= splits on = into fields, but the print statement uses substr($0, index($0,"=") + 1) (positional) rather than $2. The -F= flag has no effect here and can be dropped for clarity.
    if [ -z "$canonical_identifier" ]; then
        canonical_identifier=$(codesign -d "$fw_bin" 2>&1 | awk '/^Identifier=/{print substr($0, index($0, "=") + 1)}')
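A quick way to check the claim is to run both awk programs on the same input. The sample line below is made up for illustration; it only follows the Identifier=... shape that the awk pattern expects, with an extra = in the value to show why substr on $0 is the safer extraction compared with $2.

```shell
# Illustrative input only; the identifier value is invented.
sample='Identifier=com.example.python=3'

# Original form: -F= is set, but the action only reads $0.
with_fs=$(printf '%s\n' "$sample" | awk -F= '/^Identifier=/{print substr($0, index($0, "=") + 1)}')

# Suggested form: same program without -F=.
without_fs=$(printf '%s\n' "$sample" | awk '/^Identifier=/{print substr($0, index($0, "=") + 1)}')

# For contrast: $2 stops at the second "=" and loses the rest of the value.
via_field=$(printf '%s\n' "$sample" | awk -F= '/^Identifier=/{print $2}')

echo "$with_fs"     # com.example.python=3
echo "$without_fs"  # com.example.python=3
echo "$via_field"   # com.example.python
```

Both substr variants print the full value, so dropping -F= changes nothing except readability.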
I pushed one more fix instead of just restating the same theory.
What I found
The current
The fallback logic did this:
That is too aggressive. Inside these PyInstaller
So even if preserving the identifier fixed the
What I changed
New commit on this PR:
The fallback now:
So the dedup invariant is now much tighter:
Why this is better
This matches the real problem more closely. The bug isn't "everything inside Python.framework is one duplicate set". Grouping by stripped-content hash preserves that distinction.
Verification
If this still fails after merge/test on
…igning fallback
Rebases on top of ActivityWatch#1255-ActivityWatch#1258 and adds two correctness improvements:
1. Strip all existing signatures before comparison, so content hashing identifies true duplicates rather than nonce-only signature differences from PyInstaller's pre-sign codesign_identity step.
2. Group binaries by SHA-256 content hash instead of treating the whole framework as one duplicate set. This correctly handles the (unlikely but possible) case where a framework contains genuinely different Mach-O files; only true duplicates share a signed payload.
Both changes preserve master's existing patterns: identifier preservation via temp copy, codesign --force --options runtime, and the ambiguous-framework fallback structure.
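The grouping step described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the committed code: the helper names sha256_of and plan_sync are hypothetical, signatures are assumed to have been stripped already (for example with codesign --remove-signature), and paths are assumed to contain no tab characters. The function only prints a plan, one "sign" line for the first file of each hash group and one "copy" line for each duplicate.

```shell
# Hypothetical helper: SHA-256 of a file, using whichever tool is available.
sha256_of() {
    if command -v shasum >/dev/null 2>&1; then
        shasum -a 256 "$1" | awk '{print $1}'
    else
        sha256sum "$1" | awk '{print $1}'
    fi
}

# Hypothetical helper: read already-stripped binary paths on stdin and print
#   sign<TAB>path              for the first file seen with a given hash
#   copy<TAB>canonical<TAB>path  for every later file with the same hash
plan_sync() {
    LC_ALL=C sort | while IFS= read -r path; do
        printf '%s\t%s\n' "$(sha256_of "$path")" "$path"
    done | awk -F'\t' '
        !($1 in canon) { canon[$1] = $2; print "sign\t" $2; next }
        { print "copy\t" canon[$1] "\t" $2 }'
}
```

A caller would then sign each "sign" path once and cp -p it over the paths listed in the matching "copy" lines, so files with genuinely different content never receive each other's payload.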
50875e1 to 563dbdf
Rebased on master + consolidated
Rebased onto latest master (which includes #1255-#1258) and consolidated the 4 previous commits into 1 clean commit (
What changed from the previous version:
Net diff: +50/-52 lines in
This is strictly more correct than the current "always sync all" approach, and maintains all the hard-won fixes from #1255-#1258.
Next step: If notarization still fails after this merge, the remaining debug loop should happen on a macOS machine with
Summary
Python
Why
Build Tauri on master is still failing on macOS in run 24240587094 after #1258. The notarization rejection log shows all embedded Python.framework copies are still rejected as "The signature of the binary is invalid."
The current strip-and-sync fallback fixed divergent signatures, but it re-signs the canonical binary without preserving the original identifier. For these embedded Python binaries, that means codesign uses the path/basename-derived identifier (Python) instead of the original binary identifier, which is exactly the regression fixed earlier in #1255.
This patch reapplies that lesson to the strip-and-sync fallback: capture the original identifier before removing the signature, then pass it explicitly during re-signing.
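The capture, strip, re-sign order can be sketched as below. This is a minimal illustration, not the function in build_app_tauri.sh: the name resign_preserving_identifier and its two-argument shape are hypothetical, and it reads the identifier from codesign -dv output, where the Identifier= line appears on stderr. The basename fallback mirrors the one in the reviewed snippet.

```shell
# Hypothetical helper: re-sign one binary while keeping its original
# codesign identifier. $1 = binary path, $2 = signing identity.
resign_preserving_identifier() {
    local bin="$1" identity="$2" identifier
    # 1. Capture the identifier BEFORE the signature is removed.
    identifier=$(codesign -dv "$bin" 2>&1 \
        | awk '/^Identifier=/{print substr($0, index($0, "=") + 1)}')
    # Fall back to the basename if nothing could be read (e.g. unsigned input).
    [ -n "$identifier" ] || identifier=$(basename "$bin")
    # 2. Strip the pre-existing signature.
    codesign --remove-signature "$bin"
    # 3. Re-sign, passing the captured identifier explicitly so codesign
    #    does not derive one from the file name.
    codesign --force --options runtime \
        --identifier "$identifier" --sign "$identity" "$bin"
}
```

Calling it on the canonical binary before the cp -p sync keeps every copy byte-identical while retaining the original identifier.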
Verification
bash -n scripts/package/build_app_tauri.sh