Version 0.2 aims to significantly enhance Stringy's capability to extract meaningful strings from binaries by adding three critical features that complement the existing format detection and section parsing infrastructure.
Goals
This epic tracks the implementation of three interconnected features that will make Stringy more effective at extracting actionable intelligence from binaries:
1. PE Resources Extraction
Context: Windows PE executables contain resource sections with rich metadata including version info, manifests, dialog boxes, menus, and embedded strings that are currently not being extracted.
Proposed Solution:
Extend the PE container parser (src/container/pe.rs) to parse resource directories
Extract strings from:
Version information (product name, company, file description)
Manifest files (embedded XML)
String tables (STRINGTABLE resources)
Dialog and menu definitions
Tag extracted strings with pe-resource, version-info, manifest semantic tags
Assign high scores to version/manifest strings (they're highly relevant)
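As a rough illustration of the tagging and scoring idea above, the sketch below maps a resource type ID to semantic tags and a base score. The RT_* values are the standard Windows resource type IDs and the tag names come from this proposal. The `ResourceClass` struct, the `resource_tags` function, and the score numbers are hypothetical placeholders, not existing Stringy code.

```rust
// Standard Windows resource type IDs (as defined in the Windows SDK).
const RT_MENU: u16 = 4;
const RT_DIALOG: u16 = 5;
const RT_STRING: u16 = 6;
const RT_VERSION: u16 = 16;
const RT_MANIFEST: u16 = 24;

/// Hypothetical result type: tags plus a base score for one resource type.
#[derive(Debug, Clone, Copy)]
pub struct ResourceClass {
    pub tags: &'static [&'static str],
    pub base_score: i32,
}

/// Hypothetical helper. Score values are placeholders chosen only so that
/// version and manifest strings rank above generic resource strings.
pub fn resource_tags(type_id: u16) -> Option<ResourceClass> {
    match type_id {
        RT_VERSION => Some(ResourceClass {
            tags: &["pe-resource", "version-info"],
            base_score: 90,
        }),
        RT_MANIFEST => Some(ResourceClass {
            tags: &["pe-resource", "manifest"],
            base_score: 80,
        }),
        RT_STRING | RT_DIALOG | RT_MENU => Some(ResourceClass {
            tags: &["pe-resource"],
            base_score: 50,
        }),
        // Icons, bitmaps, etc. carry no strings of interest in this sketch.
        _ => None,
    }
}
```

Returning `None` for unhandled types keeps the caller free to skip non-string resources without special cases.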
Acceptance Criteria:
Parse PE resource directory structure
Extract version info fields (FileDescription, ProductName, CompanyName, etc.)
Extract manifest XML strings
Handle STRINGTABLE resources
Add appropriate semantic tagging
Include resource strings in scoring/ranking system
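For the STRINGTABLE criterion, a minimal std-only sketch of decoding one string table block may help. It relies on the documented layout: each block holds up to 16 slots, and each slot is a u16 count of UTF-16 code units followed by that many little-endian code units, with a zero count marking an empty slot. The function name and return shape are assumptions for illustration. Locating the block inside the resource directory is out of scope here.

```rust
/// Hypothetical helper: decode one STRINGTABLE block.
/// `block_id` is the resource name ID of the block; string IDs are
/// derived as (block_id - 1) * 16 + slot index.
pub fn decode_string_table_block(block_id: u16, data: &[u8]) -> Vec<(u32, String)> {
    let mut out = Vec::new();
    let mut pos = 0usize;
    for index in 0..16u32 {
        // Each slot starts with a u16 length counted in UTF-16 code units.
        let len_bytes = match data.get(pos..pos + 2) {
            Some(b) => b,
            None => break, // truncated block: stop instead of panicking
        };
        let len = u16::from_le_bytes([len_bytes[0], len_bytes[1]]) as usize;
        pos += 2;
        if len == 0 {
            continue; // empty slot
        }
        let raw = match data.get(pos..pos + len * 2) {
            Some(b) => b,
            None => break,
        };
        pos += len * 2;
        let units: Vec<u16> = raw
            .chunks_exact(2)
            .map(|c| u16::from_le_bytes([c[0], c[1]]))
            .collect();
        let id = u32::from(block_id).saturating_sub(1) * 16 + index;
        // Lossy decoding keeps malformed surrogates from aborting extraction.
        out.push((id, String::from_utf16_lossy(&units)));
    }
    out
}
```

Bounds are checked with `get` so a malformed or truncated resource yields fewer strings rather than a panic.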
2. Rust Symbol Demangling
Context: The project already has rustc-demangle as a dependency and mentions Rust demangling in the README, but demangling is not yet fully integrated into the extraction pipeline.
Proposed Solution:
Integrate rustc-demangle into the symbol processing pipeline
Demangle Rust symbols found in:
Import/export tables
Symbol tables (.symtab, .dynsym for ELF)
Debug sections (if present)
Store both mangled and demangled versions for context
Add rust-symbol semantic tag
Prioritize demangled symbols in ranking (more human-readable = more useful)
Acceptance Criteria:
Detect Rust mangled symbols (legacy _ZN prefix with a trailing 17h<hash>E component, and the v0 _R prefix; _ZN alone also matches C++ symbols)
Apply rustc-demangle to detected symbols
Include demangled names in output alongside mangled versions
Add semantic tagging for Rust symbols
Update scoring to favor demangled symbols
Handle demangle failures gracefully
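The detection step could look roughly like the std-only sketch below. It is a cheap pre-filter only: the actual demangling would still be delegated to rustc-demangle (for example through its `try_demangle` function), and any failure there should leave the mangled name in place. The enum and function names are hypothetical. The heuristic assumes legacy symbols end in a 17h<16 hex digits>E hash component and ignores vendor suffixes such as `.llvm.*`.

```rust
/// Mangling scheme a symbol appears to use (illustrative names).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RustMangling {
    Legacy, // `_ZN...17h<16 hex digits>E`
    V0,     // `_R...`
}

/// Hypothetical pre-filter: returns None for anything that does not look
/// like a Rust symbol, so callers can skip the demangler entirely.
pub fn detect_rust_mangling(symbol: &str) -> Option<RustMangling> {
    // Mach-O prepends one extra underscore to every symbol name.
    let s = if symbol.starts_with("__ZN") || symbol.starts_with("__R") {
        &symbol[1..]
    } else {
        symbol
    };

    if let Some(rest) = s.strip_prefix("_R") {
        // In the current v0 encoding the path begins with an uppercase tag.
        let first = rest.bytes().next()?;
        return if first.is_ascii_uppercase() {
            Some(RustMangling::V0)
        } else {
            None
        };
    }

    // Legacy scheme: Itanium-style nesting plus a trailing hash component.
    let body = s.strip_prefix("_ZN")?.strip_suffix('E')?;
    let bytes = body.as_bytes();
    if bytes.len() < 19 {
        return None;
    }
    let tail = &bytes[bytes.len() - 19..];
    if tail.starts_with(b"17h") && tail[3..].iter().all(u8::is_ascii_hexdigit) {
        Some(RustMangling::Legacy)
    } else {
        None // most likely a C++ symbol
    }
}
```

A C++ symbol could still match the legacy check by coincidence, so the demangler's own result should remain the final authority.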
3. Import/Export Names Enhancement
Context: The container parsers already extract import/export symbols (foundation exists), but they need to be fully integrated into the string extraction and ranking pipeline.
Proposed Solution:
Complete integration of import/export extraction across all formats:
ELF: .dynsym, .symtab sections
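PE: Import Directory Table, Export Directory Table (names are read via the Export Name Pointer Table; the Export Address Table holds only addresses)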
Mach-O: LC_DYLD_INFO_ONLY, symbol table
Add semantic classification:
Tag with import, export, symbol-name
Detect API categories (crypto, network, file I/O)
Identify suspicious or high-value API calls
Integrate into ranking system with adjustable scores
Support filtering by import/export in CLI
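One possible shape for the API category detection is a small lookup keyed by symbol name, sketched below. The three categories come from this proposal. The `ApiCategory` enum, the `categorize_api` function, and the handful of table entries are illustrative assumptions. A real table would be much larger and probably data-driven.

```rust
/// Categories named in the proposal (enum itself is hypothetical).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ApiCategory {
    Crypto,
    Network,
    FileIo,
}

/// Illustrative entries only; not an exhaustive or authoritative list.
const API_TABLE: &[(&str, ApiCategory)] = &[
    ("CryptEncrypt", ApiCategory::Crypto),
    ("BCryptDecrypt", ApiCategory::Crypto),
    ("connect", ApiCategory::Network),
    ("WSAStartup", ApiCategory::Network),
    ("InternetOpenUrl", ApiCategory::Network),
    ("CreateFile", ApiCategory::FileIo),
    ("fopen", ApiCategory::FileIo),
];

/// Hypothetical classifier: exact match first, then a match with the
/// Win32 ANSI/wide suffix (`A` or `W`) removed.
pub fn categorize_api(name: &str) -> Option<ApiCategory> {
    let base = name
        .strip_suffix('W')
        .or_else(|| name.strip_suffix('A'))
        .unwrap_or(name);
    API_TABLE
        .iter()
        .find(|(api, _)| *api == name || *api == base)
        .map(|(_, category)| *category)
}
```

Symbols with no entry return `None`, which leaves them tagged only as import, export, or symbol-name.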
Acceptance Criteria:
Extract import names from all three binary formats
Extract export names from all three binary formats
Add semantic tags for symbol types and API categories
Integrate into main string extraction pipeline
Include in scoring/ranking algorithm
Add CLI flags: --imports, --exports, --symbols
Test with real-world binaries (ELF, PE, Mach-O)
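The exact semantics of the three flags are not specified above. The sketch below assumes that --imports and --exports each restrict output to that kind, that --symbols selects both, and that no flag means no filtering. The types and method names are hypothetical, and a real implementation would presumably hook into the existing CLI parser rather than scan raw arguments.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SymbolKind {
    Import,
    Export,
}

/// Hypothetical filter state derived from the proposed CLI flags.
#[derive(Debug, Default, PartialEq, Eq, Clone, Copy)]
pub struct SymbolFilter {
    pub imports: bool,
    pub exports: bool,
}

impl SymbolFilter {
    /// Build a filter from raw arguments (assumed flag semantics, see above).
    pub fn from_args<'a>(args: impl IntoIterator<Item = &'a str>) -> Self {
        let mut filter = SymbolFilter::default();
        for arg in args {
            match arg {
                "--imports" => filter.imports = true,
                "--exports" => filter.exports = true,
                "--symbols" => {
                    filter.imports = true;
                    filter.exports = true;
                }
                _ => {} // unrelated arguments are ignored in this sketch
            }
        }
        filter
    }

    /// With no flag set, every symbol kind passes.
    pub fn allows(&self, kind: SymbolKind) -> bool {
        if !self.imports && !self.exports {
            return true;
        }
        match kind {
            SymbolKind::Import => self.imports,
            SymbolKind::Export => self.exports,
        }
    }
}
```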
Dependencies
goblin: Already in use for binary format parsing
rustc-demangle: Already in dependencies
No new external dependencies required
Implementation Order
Recommended sequence to minimize integration complexity:
Import/Export Enhancement (foundational) - builds on existing container parsers
Testing Strategy
Success Metrics
Related Documentation