Implement String Ranking Engine with Configurable Scoring System

## Summary

Create a flexible RankingEngine that assigns importance scores to extracted strings based on their semantic tags, source location, and section characteristics. This enables prioritization of potentially interesting strings in binary analysis.

## Background

The string analyzer extracts and classifies strings from binaries with semantic tags (URLs, IPs, file paths, etc.), section types (code, data, resources), and source locations (imports, exports, section data). However, not all strings are equally interesting for analysis. A ranking system is needed to:

- **Prioritize high-value strings** (e.g., network indicators, file paths, registry keys)
- **Deprioritize noise** (e.g., common debug strings, version info)
- **Weight by context** (strings from executable sections vs. debug sections)
- **Enable customizable scoring** for different analysis scenarios (malware analysis, reverse engineering, compliance scanning)

## Proposed Solution

### Architecture

Create `src/classification/ranking.rs` with the following components:

1. **RankingEngine struct**: Main scoring engine with configurable weights
2. **ScoreConfig struct**: Configuration for tag weights, source weights, and section type multipliers
3. **StringScore struct**: Returned score with breakdown for transparency
4. **Default scoring profiles**: Presets for common use cases (malware analysis, general strings, etc.)

### Scoring Algorithm

```
final_score = (tag_weight + source_weight) × section_type_multiplier
```

**Tag Weights** (base importance):
- High value (8-10): URLs, Domains, IPv4/IPv6, Email, Registry paths
- Medium value (5-7): File paths, GUIDs, Base64 (potential encoding)
- Lower value (2-4): Format strings, User agents
- Contextual (variable): Imports/Exports (depends on name), Version strings

**Source Weights**:
- ImportName/ExportName: +3 (API calls are interesting)
- SectionData: +2 (hardcoded strings)
- ResourceString: +1 (UI strings, less critical)
- DebugInfo: -2 (usually noise)

**Section Type Multipliers**:
- Code sections: ×1.5 (strings in executable code are unusual)
- StringData/ReadOnlyData: ×1.0 (expected location)
- WritableData: ×1.2 (potentially modified at runtime)
- Resources: ×0.8 (often benign UI strings)
- Debug: ×0.3 (low priority noise)
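
As a quick sanity check of the formula, here is a minimal sketch of the arithmetic. The function name `raw_score` is hypothetical, and the specific tag weights (9.0 for a URL, 3.0 for a format string) are assumed values picked from inside the ranges listed above, not decided weights.

```rust
/// Raw score from the formula above:
/// (tag_weight + source_weight) × section_type_multiplier.
/// `raw_score` is a hypothetical helper name used only for illustration.
pub fn raw_score(tag_weight: f32, source_weight: f32, section_multiplier: f32) -> f32 {
    (tag_weight + source_weight) * section_multiplier
}

/// URL (assumed weight 9.0) found in SectionData (+2) inside a Code section (×1.5).
pub fn url_in_code_section() -> f32 {
    raw_score(9.0, 2.0, 1.5)
}

/// Format string (assumed weight 3.0) from DebugInfo (-2) inside a Debug section (×0.3).
pub fn format_string_in_debug_section() -> f32 {
    raw_score(3.0, -2.0, 0.3)
}
```

Under these assumed weights the URL scores (9 + 2) × 1.5 = 16.5 and the debug format string scores (3 - 2) × 0.3 = 0.3. Note that a negative sum is amplified, not dampened, by a multiplier above 1.0, which may be worth deciding on explicitly.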

### Implementation Details

```rust
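use std::collections::HashMap;

// `Tag`, `StringSource`, and `SectionType` come from `src/classification/mod.rs`
// and must implement `Eq + Hash` to serve as map keys.
use super::{SectionType, StringSource, Tag};

/// Scoring engine driven by a configurable set of weights.
pub struct RankingEngine {
    config: ScoreConfig,
}

/// Weights and multipliers consulted by `RankingEngine`.
pub struct ScoreConfig {
    tag_weights: HashMap<Tag, f32>,
    source_weights: HashMap<StringSource, f32>,
    section_multipliers: HashMap<SectionType, f32>,
}

/// Final score plus the components that produced it.
pub struct StringScore {
    pub total: f32,
    pub tag_weight: f32,
    pub source_weight: f32,
    pub section_multiplier: f32,
}

impl RankingEngine {
    pub fn new(config: ScoreConfig) -> Self {
        Self { config }
    }

    pub fn with_defaults() -> Self {
        todo!("build the default profile from the weights listed above")
    }

    pub fn score(&self, tag: &Tag, source: StringSource, section: SectionType) -> StringScore {
        todo!("look up the three weights and apply the scoring formula")
    }
}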
```
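
To make the intended behavior concrete, here is one possible self-contained sketch of the engine. The enum variants are stand-ins for the real `Tag`, `StringSource`, and `SectionType` types, the default weights are sample values taken from the ranges in the Scoring Algorithm section, and falling back to a weight of 0.0 and a multiplier of 1.0 for missing entries is an assumption rather than a settled decision.

```rust
use std::collections::HashMap;

// Stand-in enums: the real types live in `src/classification/mod.rs`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Tag { Url, FilePath, FormatString }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum StringSource { ImportName, SectionData, ResourceString, DebugInfo }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum SectionType { Code, StringData, WritableData, Resources, Debug }

#[derive(Debug, Clone, Default)]
pub struct ScoreConfig {
    pub tag_weights: HashMap<Tag, f32>,
    pub source_weights: HashMap<StringSource, f32>,
    pub section_multipliers: HashMap<SectionType, f32>,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct StringScore {
    pub total: f32,
    pub tag_weight: f32,
    pub source_weight: f32,
    pub section_multiplier: f32,
}

pub struct RankingEngine {
    config: ScoreConfig,
}

impl RankingEngine {
    pub fn new(config: ScoreConfig) -> Self {
        Self { config }
    }

    /// Sample defaults: tag weights are assumed values inside the proposed ranges.
    pub fn with_defaults() -> Self {
        Self::new(ScoreConfig {
            tag_weights: HashMap::from([
                (Tag::Url, 9.0),
                (Tag::FilePath, 6.0),
                (Tag::FormatString, 3.0),
            ]),
            source_weights: HashMap::from([
                (StringSource::ImportName, 3.0),
                (StringSource::SectionData, 2.0),
                (StringSource::ResourceString, 1.0),
                (StringSource::DebugInfo, -2.0),
            ]),
            section_multipliers: HashMap::from([
                (SectionType::Code, 1.5),
                (SectionType::StringData, 1.0),
                (SectionType::WritableData, 1.2),
                (SectionType::Resources, 0.8),
                (SectionType::Debug, 0.3),
            ]),
        })
    }

    /// Applies (tag_weight + source_weight) × section_multiplier.
    /// Assumption: missing weights count as 0.0, missing multipliers as 1.0.
    pub fn score(&self, tag: &Tag, source: StringSource, section: SectionType) -> StringScore {
        let tag_weight = self.config.tag_weights.get(tag).copied().unwrap_or(0.0);
        let source_weight = self.config.source_weights.get(&source).copied().unwrap_or(0.0);
        let section_multiplier = self
            .config
            .section_multipliers
            .get(&section)
            .copied()
            .unwrap_or(1.0);
        StringScore {
            total: (tag_weight + source_weight) * section_multiplier,
            tag_weight,
            source_weight,
            section_multiplier,
        }
    }
}
```

With these sample defaults, `RankingEngine::with_defaults().score(&Tag::Url, StringSource::SectionData, SectionType::Code)` yields a total of 16.5, and the returned `StringScore` carries the three components that produced it.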

## Acceptance Criteria

- [ ] `RankingEngine` struct created with configurable scoring
- [ ] `ScoreConfig` supports custom weights for tags, sources, and sections
- [ ] Default scoring profile implemented with sensible weights
- [ ] `score()` method returns detailed `StringScore` with breakdown
- [ ] Unit tests for various scoring combinations
- [ ] Documentation with examples of customization
- [ ] Integration point in classification module (`mod.rs`)
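
For the customization criteria, one option is a small builder on top of `ScoreConfig`. The sketch below is only an illustration: `ScoreConfigBuilder`, its method names, the `malware_profile` preset, and the trimmed-down enums are all hypothetical, and the weights are sample values.

```rust
use std::collections::HashMap;

// Trimmed-down stand-ins for the real classification types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Tag { Url, VersionString }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum SectionType { Code, Debug }

#[derive(Debug, Clone, Default)]
pub struct ScoreConfig {
    tag_weights: HashMap<Tag, f32>,
    section_multipliers: HashMap<SectionType, f32>,
}

impl ScoreConfig {
    /// Starts a builder with no weights set.
    pub fn builder() -> ScoreConfigBuilder {
        ScoreConfigBuilder::default()
    }

    pub fn tag_weight(&self, tag: Tag) -> Option<f32> {
        self.tag_weights.get(&tag).copied()
    }

    pub fn section_multiplier(&self, section: SectionType) -> Option<f32> {
        self.section_multipliers.get(&section).copied()
    }
}

/// Hypothetical builder: each setter consumes and returns the builder for chaining.
#[derive(Debug, Default)]
pub struct ScoreConfigBuilder {
    config: ScoreConfig,
}

impl ScoreConfigBuilder {
    pub fn tag_weight(mut self, tag: Tag, weight: f32) -> Self {
        self.config.tag_weights.insert(tag, weight);
        self
    }

    pub fn section_multiplier(mut self, section: SectionType, multiplier: f32) -> Self {
        self.config.section_multipliers.insert(section, multiplier);
        self
    }

    pub fn build(self) -> ScoreConfig {
        self.config
    }
}

/// Hypothetical preset with sample weights for a malware-analysis scenario.
pub fn malware_profile() -> ScoreConfig {
    ScoreConfig::builder()
        .tag_weight(Tag::Url, 10.0)
        .tag_weight(Tag::VersionString, 2.0)
        .section_multiplier(SectionType::Debug, 0.3)
        .build()
}
```

Presets written this way stay readable as the number of tags grows, and a caller can start from a preset-like chain and override a single weight without touching the maps directly.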

## Technical Notes

- Use `f32` for scores to allow fractional weights
- Consider using builder pattern for `ScoreConfig` customization
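- Scores should be normalized (0-100 range recommended). With the weights listed above, the raw score peaks at (10 + 3) × 1.5 = 19.5, and the negative DebugInfo weight can push it below zero for tags weighted under 2, so normalization needs both rescaling and clamping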
- Future enhancement: Machine learning-based weight tuning
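
The normalization note could be handled with a linear rescale followed by a clamp. This is a sketch under assumptions: the function name `normalize` and the choice of passing the raw bounds as parameters are hypothetical, and the bounds themselves would have to be derived from whichever `ScoreConfig` is active.

```rust
/// Maps a raw score onto 0-100 given the expected raw bounds.
/// Values outside [raw_min, raw_max] are clamped to the ends of the range.
/// Assumption: a degenerate range (raw_max <= raw_min) maps everything to 0.0.
pub fn normalize(raw: f32, raw_min: f32, raw_max: f32) -> f32 {
    if raw_max <= raw_min {
        return 0.0;
    }
    let scaled = (raw - raw_min) / (raw_max - raw_min) * 100.0;
    scaled.clamp(0.0, 100.0)
}
```

For example, with assumed bounds of 0.0 and 19.5, a raw score of 9.75 maps to 50, while a negative raw score is clamped to 0.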

## Dependencies

- Requires existing types from `src/classification/mod.rs`: `Tag`, `StringSource`, `SectionType`
- No external crate dependencies expected for MVP

## References

- Requirements: 5.1
- Task-ID: stringy-analyzer/ranking-system-foundation
