Summary
Implement structured extraction of PE resources, specifically VERSIONINFO and STRINGTABLE data, which contain rich metadata and user-facing strings often crucial for malware analysis, software identification, and reverse engineering workflows.
Background & Context
Windows PE (Portable Executable) binaries embed structured resources in the .rsrc section containing valuable string data:
- VERSIONINFO: File metadata including ProductName, FileDescription, CompanyName, LegalCopyright, FileVersion, and ProductVersion
- STRINGTABLE: Localized UI strings, error messages, and application text
- Other resources: Dialog templates, menus, and accelerator tables (future work)
The current implementation uses goblin for PE parsing, but goblin exposes the resource section only as raw bytes. The pelite crate is a pure Rust PE parser with comprehensive resource parsing capabilities, which makes structured extraction straightforward.
Implementation Plan
Phase 1: Foundation (This Issue)
- Add pelite dependency to Cargo.toml
  - Version: Latest stable (0.10.x recommended)
  - Evaluate if pelite should replace or complement goblin
- Create resource extraction module at src/extraction/pe_resources.rs
  - Define resource types enum (VersionInfo, StringTable, etc.)
  - Implement resource enumeration using pelite's resource directory walker
  - Add error handling for malformed resource sections
- Extend existing types in src/types.rs
  - Add ResourceMetadata struct for VERSIONINFO fields
  - Add ResourceStringTable struct for STRINGTABLE entries
  - Extend FoundString to include resource context
- Integration with PE parser (src/container/pe.rs)
  - Add resource section parsing to PeParser::parse()
  - Populate ContainerInfo with resource metadata
  - Ensure existing section classification still works
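As a starting point for the type work in Phase 1, the shapes below are one possible sketch. Only the type names ResourceMetadata and ResourceStringTable and the VERSIONINFO field list come from this issue; the ResourceType enum name, the from_id and set helpers, and all field names are assumptions for illustration. The numeric IDs are the standard Windows resource type IDs (RT_STRING = 6, RT_VERSION = 16).

```rust
use std::collections::BTreeMap;

/// Resource kinds relevant to this issue (hypothetical enum name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResourceType {
    /// RT_STRING (type ID 6)
    StringTable,
    /// RT_VERSION (type ID 16)
    VersionInfo,
    /// Any other numeric resource type, kept for future work.
    Other(u16),
}

impl ResourceType {
    /// Map a numeric resource type ID to the enum.
    pub fn from_id(id: u16) -> Self {
        match id {
            6 => ResourceType::StringTable,
            16 => ResourceType::VersionInfo,
            other => ResourceType::Other(other),
        }
    }
}

/// VERSIONINFO string fields listed in this issue (field names are assumptions).
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct ResourceMetadata {
    pub product_name: Option<String>,
    pub file_description: Option<String>,
    pub company_name: Option<String>,
    pub legal_copyright: Option<String>,
    pub file_version: Option<String>,
    pub product_version: Option<String>,
}

impl ResourceMetadata {
    /// Store one key/value pair; returns false when the key is not tracked.
    pub fn set(&mut self, key: &str, value: &str) -> bool {
        let slot = match key {
            "ProductName" => &mut self.product_name,
            "FileDescription" => &mut self.file_description,
            "CompanyName" => &mut self.company_name,
            "LegalCopyright" => &mut self.legal_copyright,
            "FileVersion" => &mut self.file_version,
            "ProductVersion" => &mut self.product_version,
            _ => return false,
        };
        *slot = Some(value.to_owned());
        true
    }
}

/// STRINGTABLE entries for one language, keyed by string ID (assumed layout).
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct ResourceStringTable {
    pub language_id: u16,
    pub entries: BTreeMap<u16, String>,
}
```

Optional fields keep a missing VERSIONINFO key distinct from an empty value, which may matter when a binary ships deliberately blank metadata.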
Phase 2: Resource Extraction (Follow-up)
- Implement VERSIONINFO extraction with key-value pair parsing
- Implement STRINGTABLE extraction with locale handling
- Add Unicode and ANSI string decoding for resource data
- Map extracted strings to FoundString with StringSource::ResourceString
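To make the STRINGTABLE step concrete, here is a minimal std-only sketch of decoding one RT_STRING block. It relies on the documented on-disk layout: each block holds 16 slots, each slot is a u16 length in UTF-16 code units followed by that many little-endian code units, and the string ID is (block_id - 1) * 16 + slot. The function name parse_string_block and the String error type are assumptions, not part of the existing codebase; locating the block bytes (for example through pelite) and ANSI decoding are out of scope here.

```rust
/// Decode one RT_STRING block into (string ID, text) pairs.
/// Empty slots (length 0) are skipped; truncated data is reported as an error.
pub fn parse_string_block(block_id: u16, data: &[u8]) -> Result<Vec<(u32, String)>, String> {
    if block_id == 0 {
        return Err("string block IDs start at 1".to_string());
    }
    // First string ID covered by this block.
    let base = (u32::from(block_id) - 1) * 16;
    let mut out = Vec::new();
    let mut pos = 0usize;

    for slot in 0..16u32 {
        // Length prefix: number of UTF-16 code units, not bytes.
        let len_bytes = data
            .get(pos..pos + 2)
            .ok_or_else(|| format!("truncated length prefix at slot {}", slot))?;
        let len = u16::from_le_bytes([len_bytes[0], len_bytes[1]]) as usize;
        pos += 2;

        // Bounds-checked slice of the string payload.
        let raw = data
            .get(pos..pos + len * 2)
            .ok_or_else(|| format!("truncated string data at slot {}", slot))?;
        pos += len * 2;

        if len == 0 {
            continue;
        }
        let units: Vec<u16> = raw
            .chunks_exact(2)
            .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
            .collect();
        // Lossy decoding keeps going on unpaired surrogates instead of failing.
        out.push((base + slot, String::from_utf16_lossy(&units)));
    }
    Ok(out)
}
```

Lossy decoding is a design choice for this sketch: obfuscated samples sometimes carry invalid UTF-16, and a replacement character is usually more useful to an analyst than a dropped string.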
Phase 3: Testing & Documentation (Follow-up)
- Unit tests with sample PE resource data
- Integration tests with real-world PE binaries
- Document resource extraction architecture
- Add examples to README
Technical Considerations
- Dual parser strategy: Consider using pelite specifically for resource extraction while keeping goblin for section/import/export parsing
- Performance: Resource parsing should be optional via CLI flag (e.g., --extract-resources)
- Malformed binaries: Gracefully handle corrupted resource directories (common in packed/obfuscated malware)
- Memory safety: pelite operates on byte slices; ensure proper bounds checking
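The malformed-binary and bounds-checking points can be illustrated with a small validation pass over a resource directory header. This is a sketch under assumptions: ResourceError and directory_entry_count are hypothetical names, and the code reads raw bytes directly rather than going through pelite. It uses the standard IMAGE_RESOURCE_DIRECTORY layout (16-byte header with NumberOfNamedEntries at offset 12 and NumberOfIdEntries at offset 14, followed by 8-byte entries).

```rust
/// Errors for malformed resource data (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ResourceError {
    /// Not enough bytes at `offset` to read `needed` bytes.
    Truncated { offset: usize, needed: usize },
    /// The header declares more entries than the remaining bytes can hold.
    TooManyEntries { declared: usize, max_possible: usize },
}

/// Read the entry count of the resource directory at `dir_offset` inside the
/// .rsrc bytes, rejecting headers that cannot be valid for the given data.
pub fn directory_entry_count(rsrc: &[u8], dir_offset: usize) -> Result<usize, ResourceError> {
    const HEADER_LEN: usize = 16;
    const ENTRY_LEN: usize = 8;
    let truncated = ResourceError::Truncated { offset: dir_offset, needed: HEADER_LEN };

    // checked_add guards against offset overflow from hostile inputs.
    let end = match dir_offset.checked_add(HEADER_LEN) {
        Some(end) => end,
        None => return Err(truncated),
    };
    let header = match rsrc.get(dir_offset..end) {
        Some(header) => header,
        None => return Err(truncated),
    };

    let named = u16::from_le_bytes([header[12], header[13]]) as usize;
    let ids = u16::from_le_bytes([header[14], header[15]]) as usize;
    let declared = named + ids;

    // Entries follow the header directly; they must fit in what is left.
    let max_possible = (rsrc.len() - end) / ENTRY_LEN;
    if declared > max_possible {
        return Err(ResourceError::TooManyEntries { declared, max_possible });
    }
    Ok(declared)
}
```

A caller could treat any ResourceError as "no resource strings for this file" and continue with the other extraction passes, so a corrupted directory in a packed sample does not abort the whole run.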
Success Criteria
References
- src/container/pe.rs:50 (SectionType::Resources)
Related Work
- Milestone: v0.1 (MVP)
- Depends on: Existing PE parser infrastructure
- Enables: Future work on dialog/menu resource parsing, icon extraction
@traycerai branch:3-implement-pe-section-classification-and-importexport-table-parsing