Summary
Extend the SymbolDemangler to handle MSVC-mangled symbols (?-prefixed) found in Windows PE binaries. The core Rust and C++ Itanium ABI demangling is already fully implemented; this issue tracks the remaining gap for Windows-native symbols.
Current State
The SymbolDemangler in src/classification/symbols.rs already handles:
- ✅ Rust legacy mangling (_ZN prefix) via rustc-demangle = "0.1.27"
- ✅ Rust v0 mangling (_R prefix) via rustc-demangle
- ✅ C++ Itanium ABI (_Z prefix) via cpp_demangle = "0.5.1"
- ✅ Pipeline integration in classify_strings() within src/pipeline/mod.rs
- ✅ Graceful fallback: failed demanglings are tracked and warned
The Gap
The pipeline in src/pipeline/mod.rs already detects MSVC-mangled symbols via:
```rust
let looks_mangled =
    s.text.starts_with("_Z") || s.text.starts_with("_R") || s.text.starts_with('?');
```
However, SymbolDemangler::is_mangled() and SymbolDemangler::demangle() have no handling for ?-prefixed MSVC symbols. The pipeline flags them, but SymbolDemangler passes them through unchanged, so no Windows PE import or export symbol that uses MSVC mangling gets demangled output.
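To make the mismatch concrete, here is a minimal std-only sketch. The first function mirrors the pipeline check quoted above. The second is an assumed stand-in for the current is_mangled(), reconstructed from the description in this issue rather than copied from the source. The name falls_through_gap is hypothetical.

```rust
/// Prefix check as quoted from the pipeline, applied to a plain &str.
fn pipeline_looks_mangled(text: &str) -> bool {
    text.starts_with("_Z") || text.starts_with("_R") || text.starts_with('?')
}

/// ASSUMPTION: stand-in for today's SymbolDemangler::is_mangled(), which
/// per this issue recognises only the Rust and Itanium prefixes.
fn demangler_is_mangled_today(symbol: &str) -> bool {
    symbol.starts_with("_R") || symbol.starts_with("_Z")
}

/// True for symbols the pipeline flags but the demangler does not recognise.
fn falls_through_gap(symbol: &str) -> bool {
    pipeline_looks_mangled(symbol) && !demangler_is_mangled_today(symbol)
}
```

Under these assumptions, only ?-prefixed symbols fall through the gap; Rust and Itanium symbols are recognised on both sides.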
MSVC-mangled symbol example:
- Mangled: ?printf@@YAHPEBDZZ
- Expected demangled: int __cdecl printf(char const *,...)
This is a high-value gap because Stringy targets PE binary analysis (src/container/pe/) and MSVC is the dominant toolchain for Windows binaries.
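For readers unfamiliar with the MSVC scheme, the sketch below shows the coarse layout of a simple mangled name: a leading ?, the name fragments separated by @ (innermost first), a terminating @@, then the type encoding. It is illustrative only and not a demangler. The function name split_msvc_symbol is hypothetical, and the sketch deliberately rejects special names and templates and does not resolve back-references.

```rust
/// Splits a simple MSVC-mangled symbol into its scope path (outermost
/// first) and the trailing type encoding. Returns None for anything this
/// sketch does not model.
fn split_msvc_symbol(symbol: &str) -> Option<(Vec<&str>, &str)> {
    let rest = symbol.strip_prefix('?')?;
    // The qualified name is terminated by "@@".
    let (qualified, encoding) = rest.split_once("@@")?;
    // A further '?' marks special names (constructors, operators) or
    // templates, which need a real demangler.
    if qualified.is_empty() || encoding.is_empty() || qualified.contains('?') {
        return None;
    }
    // Name fragments are stored innermost first, so reverse them.
    let mut path: Vec<&str> = qualified.split('@').collect();
    if path.iter().any(|part| part.is_empty()) {
        return None;
    }
    path.reverse();
    Some((path, encoding))
}
```

For ?printf@@YAHPEBDZZ this yields the path ["printf"] and the encoding YAHPEBDZZ, where YA marks a __cdecl free function and H an int return type. Decoding the rest of the encoding is the job of the demangling crate.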
Proposed Solution
1. Add Dependency
Add an MSVC demangling crate to Cargo.toml, for example msvc-demangler.
2. Extend SymbolDemangler
```rust
// In src/classification/symbols.rs
pub fn is_mangled(&self, symbol: &str) -> bool {
    symbol.starts_with("_R") // Rust v0
        || symbol.starts_with("_Z") // Itanium ABI (Rust legacy + C++)
        || symbol.starts_with('?') // MSVC
}

fn try_demangle_internal(&self, symbol: &str) -> Option<String> {
    if symbol.starts_with("_R") {
        return self.try_rust_demangle(symbol);
    }
    if symbol.starts_with("_Z") {
        // Rust legacy symbols share the _Z prefix with C++, so try Rust first.
        return self
            .try_rust_demangle(symbol)
            .or_else(|| self.try_cpp_demangle(symbol));
    }
    if symbol.starts_with('?') {
        return self.try_msvc_demangle(symbol);
    }
    None
}

fn try_msvc_demangle(&self, symbol: &str) -> Option<String> {
    // Use msvc-demangler crate
    let demangled =
        msvc_demangler::demangle(symbol, msvc_demangler::DemangleFlags::llvm()).ok()?;
    // Treat an unchanged result as a failed demangling.
    if demangled != symbol {
        Some(demangled)
    } else {
        None
    }
}
```
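One possible way to keep is_mangled() and the dispatch from drifting apart again is to derive both from a single prefix classifier. The following is a std-only sketch of that idea, not a description of the existing code. ManglingScheme and detect_scheme are hypothetical names.

```rust
/// Mangling schemes distinguished by prefix (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ManglingScheme {
    /// `_R` prefix.
    RustV0,
    /// `_Z` prefix, shared by Rust legacy and C++ Itanium symbols.
    Itanium,
    /// `?` prefix.
    Msvc,
}

/// Classifies a symbol by its prefix alone; None means "not mangled".
fn detect_scheme(symbol: &str) -> Option<ManglingScheme> {
    if symbol.starts_with("_R") {
        Some(ManglingScheme::RustV0)
    } else if symbol.starts_with("_Z") {
        Some(ManglingScheme::Itanium)
    } else if symbol.starts_with('?') {
        Some(ManglingScheme::Msvc)
    } else {
        None
    }
}

/// is_mangled expressed in terms of the classifier, so both stay in sync.
fn is_mangled(symbol: &str) -> bool {
    detect_scheme(symbol).is_some()
}
```

With this shape, try_demangle_internal could match on the detected scheme, and adding a new scheme would require touching only one prefix table.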
3. Tests Required
Add comprehensive tests to src/classification/symbols.rs:
- is_mangled correctly returns true for ?-prefixed symbols
- demangle produces correct output for common MSVC patterns (plain functions, namespaced methods, overloaded operators)
- Fallback to original on invalid/unknown MSVC-mangled input
- No regression on existing Rust and C++ demangling tests
Add integration tests using real PE binary fixtures (if available under tests/fixtures/).
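The fallback requirement can be exercised without the real crate by injecting the backend. The sketch below assumes a hypothetical Demangled result type and a stub backend that knows a single symbol; none of these names come from the existing SymbolDemangler API.

```rust
/// Outcome of a demangling attempt (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct Demangled {
    /// Text to display: demangled form on success, input otherwise.
    text: String,
    /// The mangled input, kept only when demangling changed the text.
    original: Option<String>,
}

/// Runs `backend` and falls back to the input on failure or no-op output.
fn demangle_with_fallback<F>(symbol: &str, backend: F) -> Demangled
where
    F: Fn(&str) -> Option<String>,
{
    match backend(symbol) {
        Some(readable) if readable != symbol => Demangled {
            text: readable,
            original: Some(symbol.to_string()),
        },
        _ => Demangled {
            text: symbol.to_string(),
            original: None,
        },
    }
}

/// Stub standing in for a real MSVC demangler; it knows one symbol only.
fn stub_backend(symbol: &str) -> Option<String> {
    match symbol {
        "?bar@foo@@YAXXZ" => Some("void __cdecl foo::bar(void)".to_string()),
        _ => None,
    }
}
```

A unit test can then assert that a known symbol yields readable text plus the preserved original, while an invalid ?-prefixed string comes back unchanged with no original recorded.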
Acceptance Criteria
- SymbolDemangler::is_mangled() returns true for ?-prefixed symbols
- SymbolDemangler::demangle() correctly demangles MSVC symbols into human-readable form
- Demangled MSVC symbols carry Tag::DemangledSymbol and preserve the original in original_text
- cargo deny check passes with the new dependency
Requirements
Requirement ID: 4.1
Task ID
stringy-analyzer/symbol-processing
Related Work
src/classification/symbols.rs: extend SymbolDemangler
src/pipeline/mod.rs: classify_strings() already detects ? prefix; no pipeline changes needed
src/container/pe/: primary consumer of MSVC-demangled symbols