Implement PE Section Classification and Import/Export Table Parsing

## Summary

Enhance the PE (Portable Executable) parser to classify sections by how likely they are to contain strings, and parse the import and export tables to extract symbols and metadata from Windows executables.

## Context

PE binaries structure data differently from ELF or Mach-O binaries. Windows executables typically store:
- **Read-only strings** in `.rdata` (read-only data) sections - high value for string extraction
- **Initialized data** in `.data` sections - lower priority, often contains runtime state
- **Import tables** listing DLL dependencies and function names - valuable for understanding program behavior
- **Export tables** defining public API surfaces - critical for DLL analysis

Currently, StringyMcStringFace lacks PE-specific intelligence to prioritize these sections appropriately, which means:
- We treat all sections equally, missing optimization opportunities
- We don't extract import/export symbols that provide high-signal strings
- UTF-16LE strings common in PE binaries aren't prioritized correctly
- Section scoring doesn't account for PE-specific characteristics

## Proposed Solution

### 1. Section Classification Enhancement

Implement section weight assignment based on well-known PE section names:

```rust
// Proposed section scoring for PE
fn classify_pe_section(section: &PESection) -> SectionWeight {
    match section.name.as_str() {
        ".rdata" => SectionWeight::High,              // Read-only data, likely strings
        ".rsrc" => SectionWeight::Medium,             // Resources, may contain strings
        ".data" => SectionWeight::Low,                // Writable, runtime state
        ".text" => SectionWeight::Low,                // Executable code, few strings
        ".bss" | ".reloc" => SectionWeight::VeryLow,  // Unlikely to contain strings
        _ => SectionWeight::Medium,                   // Default for unknown sections
    }
}
```
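Section names are only a convention, and packers or custom linkers often rename them. As a possible fallback for names the match does not recognise, a sketch like the one below could derive a weight from the section's `Characteristics` flags instead. The flag values come from the Microsoft PE format specification. The function name `classify_by_characteristics`, the ordering of the checks, and the local `SectionWeight` definition are assumptions made for illustration, not existing project code.

```rust
/// Hypothetical weight enum; the project's real `SectionWeight` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SectionWeight {
    VeryLow,
    Low,
    Medium,
    High,
}

// Section flag values as listed in the Microsoft PE format specification.
pub const IMAGE_SCN_CNT_CODE: u32 = 0x0000_0020;
pub const IMAGE_SCN_CNT_INITIALIZED_DATA: u32 = 0x0000_0040;
pub const IMAGE_SCN_CNT_UNINITIALIZED_DATA: u32 = 0x0000_0080;
pub const IMAGE_SCN_MEM_EXECUTE: u32 = 0x2000_0000;
pub const IMAGE_SCN_MEM_WRITE: u32 = 0x8000_0000;

/// Assumed fallback: weight a section by its flags when its name is unknown.
pub fn classify_by_characteristics(characteristics: u32) -> SectionWeight {
    let has = |flag: u32| characteristics & flag != 0;

    if has(IMAGE_SCN_CNT_UNINITIALIZED_DATA) {
        // No bytes on disk, so nothing to extract.
        SectionWeight::VeryLow
    } else if has(IMAGE_SCN_CNT_CODE) || has(IMAGE_SCN_MEM_EXECUTE) {
        // Machine code; strings are rare here.
        SectionWeight::Low
    } else if has(IMAGE_SCN_CNT_INITIALIZED_DATA) && !has(IMAGE_SCN_MEM_WRITE) {
        // Read-only initialized data behaves like `.rdata`.
        SectionWeight::High
    } else if has(IMAGE_SCN_CNT_INITIALIZED_DATA) {
        // Writable initialized data behaves like `.data`.
        SectionWeight::Low
    } else {
        SectionWeight::Medium
    }
}
```

Flags alone cannot tell `.rsrc` or `.reloc` apart from `.rdata` (all three are usually read-only initialized data), so in this sketch the name-based match would run first and the flags would only be consulted for unrecognised names.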

### 2. Import/Export Table Parsing

Extract symbol names from PE import and export directories:

- Parse import directory to extract DLL names and imported function names
- Parse export directory to extract exported function names (for DLL analysis)
- Tag extracted symbols with appropriate metadata (`source: ImportTable`, `source: ExportTable`)
- Assign high scores to these strings as they represent high-confidence identifiers
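The tagging step described above might look roughly like the following. This sketch deliberately does not call `goblin`; `RawImport` and `RawExport` are hypothetical stand-ins for whatever the PE parser yields, and `TaggedString`, `SYMBOL_SCORE`, and `collect_symbols` are likewise illustrative names. Names are modelled as `Option` because PE imports and exports can be ordinal-only.

```rust
use std::collections::BTreeSet;

/// Only the two variants proposed in this issue are shown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StringSource {
    ImportTable,
    ExportTable,
}

/// Hypothetical output record for an extracted symbol.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TaggedString {
    pub text: String,
    pub source: StringSource,
    pub score: u32,
}

/// Hypothetical parser output: an imported symbol and the DLL providing it.
pub struct RawImport<'a> {
    pub dll: &'a str,
    pub name: Option<&'a str>, // None for ordinal-only imports
}

/// Hypothetical parser output: an exported symbol.
pub struct RawExport<'a> {
    pub name: Option<&'a str>, // None for ordinal-only exports
}

/// Assumed fixed score for table-derived symbols; the real scale is undecided.
pub const SYMBOL_SCORE: u32 = 100;

fn tagged(text: &str, source: StringSource) -> TaggedString {
    TaggedString { text: text.to_string(), source, score: SYMBOL_SCORE }
}

pub fn collect_symbols(imports: &[RawImport<'_>], exports: &[RawExport<'_>]) -> Vec<TaggedString> {
    let mut out = Vec::new();
    let mut seen_dlls = BTreeSet::new();

    for import in imports {
        // Emit each DLL name once; DLL names are case-insensitive on Windows.
        if seen_dlls.insert(import.dll.to_ascii_lowercase()) {
            out.push(tagged(import.dll, StringSource::ImportTable));
        }
        if let Some(name) = import.name {
            out.push(tagged(name, StringSource::ImportTable));
        }
    }
    for export in exports {
        if let Some(name) = export.name {
            out.push(tagged(name, StringSource::ExportTable));
        }
    }
    out
}
```

Keeping the conversion separate from the parser call would let the unit tests cover empty tables and ordinal-only entries without needing a PE fixture.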

### 3. Integration Points

- Update `PEParser` in `crates/stringy-analyzer/src/parsers/pe.rs`
- Extend `SectionInfo` to include PE-specific weight heuristics
- Add import/export extraction to the symbol extraction pipeline
- Ensure UTF-16LE detection prioritizes `.rdata` sections
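For the UTF-16LE point, a minimal scanner over a section's raw bytes could be sketched as follows. It is intentionally simplified and is not the project's existing detector: it only accepts code units in the printable ASCII range, and it only checks 2-byte-aligned positions relative to the start of the slice. The function name `extract_utf16le` and the `min_chars` parameter are assumptions.

```rust
/// Scan `data` for runs of printable-ASCII UTF-16LE code units.
/// Returns `(byte_offset, text)` pairs for runs of at least `min_chars`.
pub fn extract_utf16le(data: &[u8], min_chars: usize) -> Vec<(usize, String)> {
    let mut found = Vec::new();
    let mut current = String::new();
    let mut start = 0usize;

    for (i, pair) in data.chunks_exact(2).enumerate() {
        let unit = u16::from_le_bytes([pair[0], pair[1]]);
        if (0x20..=0x7E).contains(&unit) {
            if current.is_empty() {
                start = i * 2; // byte offset of the first code unit in the run
            }
            // Safe narrowing: the range check above guarantees an ASCII value.
            current.push(unit as u8 as char);
        } else if current.len() >= min_chars {
            // Run ended and is long enough: keep it and reset the buffer.
            found.push((start, std::mem::take(&mut current)));
        } else {
            current.clear();
        }
    }
    // A run may extend to the end of the section without a terminator.
    if current.len() >= min_chars {
        found.push((start, current));
    }
    found
}
```

Running a scanner like this per section, rather than over the whole file, is what would let the section weight be attached to each UTF-16LE hit.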

## Technical Requirements

**Requirement 1.2**: Section classification by string likelihood  
**Requirement 1.4**: Import/export table parsing

### Dependencies
- `goblin` PE parser capabilities
- Existing `SectionWeight` enum may need extension
- `StringSource` enum needs `ImportTable` and `ExportTable` variants

### Performance Considerations
- Import/export parsing is typically fast: the tables are small relative to the binary, although large DLLs can export thousands of symbols
- Section classification is O(n) in the number of sections (usually fewer than 10)
- No significant performance impact expected

## Acceptance Criteria

- [ ] Implement section weight classification for PE-specific sections (`.rdata`, `.data`, `.rsrc`, `.text`, `.bss`, `.reloc`)
- [ ] Parse PE import directory and extract DLL names and imported function names
- [ ] Parse PE export directory and extract exported function names
- [ ] Add `ImportTable` and `ExportTable` variants to `StringSource` enum
- [ ] Assign appropriate scores to import/export strings (high priority)
- [ ] Add unit tests for section classification logic
- [ ] Add integration tests using sample PE binaries (e.g., a simple DLL with exports)
- [ ] `cargo clippy -- -D warnings` passes without errors
- [ ] Add benchmarks with criterion for PE parsing performance
- [ ] Use insta for snapshot testing of import/export extraction results
- [ ] Update justfile recipes for PE-specific test cases
- [ ] Ensure CI pipeline passes all checks
- [ ] Document PE-specific behavior in module-level docs

## Test Cases

### Unit Tests
- Section weight assignment for known PE section names
- Section weight for unknown/custom section names
- Empty import/export table handling

### Integration Tests
- Parse a PE binary with imports (e.g., `kernel32.dll` functions)
- Parse a DLL with exports
- Verify UTF-16LE strings from `.rdata` score higher than `.data`
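The last check could be expressed against a scoring function along these lines. The multiplier values, the length cap, and the names `multiplier` and `score_string` are placeholders chosen for the sketch; the actual ranking formula belongs to Requirement 2.1 and is not defined in this issue.

```rust
/// Hypothetical weight enum; the project's real `SectionWeight` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SectionWeight {
    VeryLow,
    Low,
    Medium,
    High,
}

impl SectionWeight {
    /// Placeholder multipliers; only their relative order matters here.
    pub fn multiplier(self) -> u32 {
        match self {
            SectionWeight::VeryLow => 1,
            SectionWeight::Low => 2,
            SectionWeight::Medium => 3,
            SectionWeight::High => 4,
        }
    }
}

/// Placeholder score: string length (capped at 64) scaled by section weight.
pub fn score_string(text: &str, weight: SectionWeight) -> u32 {
    let base = text.chars().count().min(64) as u32;
    base * weight.multiplier()
}
```

With any formula of this shape, the same string found in `.rdata` (`High`) would score above a copy found in `.data` (`Low`), which is the property the integration test needs to assert.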

### Snapshot Tests (insta)
- Import table extraction output
- Export table extraction output
- Section classification results

## Related Work

- Builds on existing `goblin` PE parser foundation
- Complements ELF/Mach-O section classification (issues #1, #2)
- Feeds into ranking system (Requirement 2.1)
- Enables better YARA rule generation for PE binaries

## References

- [PE Format Specification (Microsoft)](https://learn.microsoft.com/en-us/windows/win32/debug/pe-format)
- [goblin PE documentation](https://docs.rs/goblin/latest/goblin/pe/index.html)

---

**Task-ID**: stringy-analyzer/pe-section-classification  
**Requirements**: 1.2, 1.4  
**Milestone**: v0.1
@traycerai branch:3-implement-pe-section-classification-and-importexport-table-parsing