Enhance ELF symbol extraction with comprehensive type support and visibility filtering #55
Caution: Review failed. The pull request is closed.
Summary by CodeRabbit
Walkthrough
Adds extensive developer documentation and Rust project standards; updates CI/docs workflow and Cargo configuration; introduces fixture-based integration tests and Criterion ELF benchmarks; and significantly enhances ELF parsing with DT_NEEDED extraction, symbol versioning, symbol-to-library mapping, broader symbol classification, and related APIs/tests.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Test as Integration Test
participant Parser as ElfParser
participant ELF as goblin::Elf
participant Vers as VersionInfo
Test->>Parser: call extract_imports(elf, libraries)
Parser->>Parser: extract_needed_libraries(elf)
Parser->>ELF: read DT_NEEDED, dynsym, versym, verneed
ELF-->>Parser: libraries[], dynsyms[], versym_table, verneed_entries
loop each undefined dynsym
Parser->>Vers: resolve_versym(elf, sym_index)
Vers-->>Parser: version_index or None
Parser->>Vers: parse_verneed_entry(elf, version_index)
Vers-->>Parser: (library_name, version) or None
Parser->>Parser: get_symbol_providing_library(sym_index, libraries)
Parser-->>Test: emit ImportInfo { name, library?, type, ... }
end
Test->>Parser: call extract_exports(elf)
loop each defined symbol in syms/dynsyms
alt visible and supported type
Parser-->>Test: emit ExportInfo { name, type, visibility, ... }
end
end
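The version lookup in the diagram (versym index, then verneed entry, then library name) can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the PR's code: `VerneedAux` and `resolve_library` are hypothetical names, and the `.gnu.version` and `.gnu.version_r` tables are assumed to have been decoded into plain slices already. The constants are the standard GNU symbol-versioning values.

```rust
/// Simplified stand-in for one auxiliary record of a `.gnu.version_r` entry.
/// (Hypothetical type for this sketch; not the parser's actual structure.)
#[derive(Debug, Clone)]
pub struct VerneedAux {
    /// Version index that `.gnu.version` entries refer to (`vna_other`).
    pub index: u16,
    /// Library that provides this version, e.g. "libc.so.6" (`vn_file`).
    pub library: String,
    /// Version name, e.g. "GLIBC_2.2.5" (`vna_name`).
    pub version: String,
}

/// Reserved version indices that carry no library information.
const VER_NDX_LOCAL: u16 = 0;
const VER_NDX_GLOBAL: u16 = 1;
/// High bit of a versym entry marks the symbol version as hidden.
const VERSYM_HIDDEN: u16 = 0x8000;

/// Maps a dynamic symbol index to `(library, version)` when version
/// information is available, and returns `None` otherwise.
pub fn resolve_library<'a>(
    sym_index: usize,
    versym: &[u16],
    verneed: &'a [VerneedAux],
) -> Option<(&'a str, &'a str)> {
    // Out-of-range symbol indices simply have no version entry.
    let raw = *versym.get(sym_index)?;
    // Strip the hidden flag to get the bare version index.
    let version_index = raw & !VERSYM_HIDDEN;
    if version_index == VER_NDX_LOCAL || version_index == VER_NDX_GLOBAL {
        return None;
    }
    verneed
        .iter()
        .find(|aux| aux.index == version_index)
        .map(|aux| (aux.library.as_str(), aux.version.as_str()))
}
```

Symbols without a usable version entry come back as `None`, which matches the optional `library?` field shown on `ImportInfo` in the diagram.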
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches
✅ Passed checks (5 passed)
…ibility filtering Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
…pendency management settings - Added "CC0-1.0" and "Unlicense" to the list of allowed licenses. - Cleared the skip-tree section for better clarity. - Introduced new configuration for allowing specific organizations for git sources. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Updated the documentation for enhanced symbol extraction, detailing import/export detection and library dependencies. - Refined comments in the ELF parser code for better clarity and future use. - Cleaned up whitespace in test cases to improve readability. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…ptimization guidelines - Introduced comprehensive documentation for Rust coding standards, emphasizing the use of Rust 2024 Edition, zero warnings policy, and structured error handling with `thiserror`. - Established error handling patterns detailing structured error types, context, propagation, and recovery strategies. - Added performance optimization standards focusing on high-performance binary processing, memory management, and benchmarking practices using Criterion. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
… reviews, performance tuning, security hardening, LLM updates, and task management - Introduced comprehensive documentation for CI checks to ensure code changes pass all checks before merging. - Added guidelines for using CodeRabbit to identify and address code issues. - Created a detailed code review process focusing on quality improvements while preserving public APIs. - Documented performance tuning strategies for analyzing and optimizing code performance. - Established security hardening practices to enhance the security posture of the codebase. - Updated the llms.txt file to reflect changes in documentation and specifications. - Added a workflow for managing the next task in the checklist, ensuring compliance with project standards. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…parsing - Updated the Rust edition in Cargo.toml from 2021 to 2024. - Introduced a new benchmark suite for ELF parsing in `benches/elf.rs`, including tests for full parsing, imports, and exports. - Enhanced documentation in `binary-formats.md` to clarify symbol-to-library mapping and its implementation details. - Modified the ELF parser to support library mapping using version information, improving the accuracy of symbol attribution. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Replaced dynamic C code generation in ELF integration tests with pre-compiled binary fixtures for consistency and reliability. - Updated tests to verify ELF parsing, imports, and exports using the new fixture files. - Added new integration tests for Mach-O and PE formats, ensuring comprehensive coverage across binary formats. - Introduced a README in the fixtures directory to document the purpose and rebuilding instructions for the test binaries. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Updated the GitHub Actions workflow to streamline the rustdoc build process by removing unnecessary target directory specification. - Ensured the rustdoc output is correctly copied to the documentation directory after building. - Improved overall clarity and efficiency of the documentation build steps. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Bumped `clap` version from 4.5.48 to 4.5.51. - Updated `goblin` from 0.10.1 to 0.10.3. - Upgraded `insta` from 1.0 to 1.43. - Increased `tempfile` version from 3.8 to 3.23. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
@coderabbitai full review
✅ Actions performed: Full review triggered.
…ibility filtering (#55)
* Initial plan
* Enhance ELF symbol extraction with comprehensive symbol types and visibility filtering
  Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
* Add insta snapshot tests for ELF symbol extraction
  Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
* Update documentation for enhanced ELF symbol extraction
  Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
* Update deny.toml to include additional allowed licenses and refine dependency management settings
  - Added "CC0-1.0" and "Unlicense" to the list of allowed licenses.
  - Cleared the skip-tree section for better clarity.
  - Introduced new configuration for allowing specific organizations for git sources.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
* Enhance ELF symbol extraction documentation and improve code clarity
  - Updated the documentation for enhanced symbol extraction, detailing import/export detection and library dependencies.
  - Refined comments in the ELF parser code for better clarity and future use.
  - Cleaned up whitespace in test cases to improve readability.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
* Add Rust coding standards, error handling patterns, and performance optimization guidelines
  - Introduced comprehensive documentation for Rust coding standards, emphasizing the use of Rust 2024 Edition, zero warnings policy, and structured error handling with `thiserror`.
  - Established error handling patterns detailing structured error types, context, propagation, and recovery strategies.
  - Added performance optimization standards focusing on high-performance binary processing, memory management, and benchmarking practices using Criterion.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
* Add new command documentation for CI checks, CodeRabbit reviews, code reviews, performance tuning, security hardening, LLM updates, and task management
  - Introduced comprehensive documentation for CI checks to ensure code changes pass all checks before merging.
  - Added guidelines for using CodeRabbit to identify and address code issues.
  - Created a detailed code review process focusing on quality improvements while preserving public APIs.
  - Documented performance tuning strategies for analyzing and optimizing code performance.
  - Established security hardening practices to enhance the security posture of the codebase.
  - Updated the llms.txt file to reflect changes in documentation and specifications.
  - Added a workflow for managing the next task in the checklist, ensuring compliance with project standards.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
* Update Cargo.toml for Rust 2024 Edition and add benchmarking for ELF parsing
  - Updated the Rust edition in Cargo.toml from 2021 to 2024.
  - Introduced a new benchmark suite for ELF parsing in `benches/elf.rs`, including tests for full parsing, imports, and exports.
  - Enhanced documentation in `binary-formats.md` to clarify symbol-to-library mapping and its implementation details.
  - Modified the ELF parser to support library mapping using version information, improving the accuracy of symbol attribution.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
* Refactor ELF integration tests to use fixtures and improve structure
  - Replaced dynamic C code generation in ELF integration tests with pre-compiled binary fixtures for consistency and reliability.
  - Updated tests to verify ELF parsing, imports, and exports using the new fixture files.
  - Added new integration tests for Mach-O and PE formats, ensuring comprehensive coverage across binary formats.
  - Introduced a README in the fixtures directory to document the purpose and rebuilding instructions for the test binaries.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
* Refactor GitHub Actions workflow for documentation build
  - Updated the GitHub Actions workflow to streamline the rustdoc build process by removing unnecessary target directory specification.
  - Ensured the rustdoc output is correctly copied to the documentation directory after building.
  - Improved overall clarity and efficiency of the documentation build steps.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
* Update docs workflow
* Update dependencies in Cargo.toml
  - Bumped `clap` version from 4.5.48 to 4.5.51.
  - Updated `goblin` from 0.10.1 to 0.10.3.
  - Upgraded `insta` from 1.0 to 1.43.
  - Increased `tempfile` version from 3.8 to 3.23.
  Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
---------
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
Co-authored-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
The ELF parser extracted only `STT_FUNC` symbols and lacked library dependency parsing and visibility filtering, limiting import/export analysis compared to the PE parser.
Changes
Symbol Type Support
- Adds `STT_TLS` (thread-local storage) and `STT_GNU_IFUNC` (indirect functions) in addition to the existing `STT_FUNC`, `STT_OBJECT`, and `STT_NOTYPE`

Visibility Filtering
- Filters out symbols marked `STV_HIDDEN` or `STV_INTERNAL`, exposing only externally visible symbols

Library Dependency Extraction
- Adds `extract_needed_libraries()` to parse `DT_NEEDED` entries from the dynamic section (e.g., `libc.so.6`)

Example
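A minimal sketch of the type and visibility filtering described above, written against raw `st_info` and `st_other` bytes so that it needs no external crate. The function names are hypothetical and not taken from the PR. The constants are the standard ELF values.

```rust
// Symbol type values from the ELF specification (low 4 bits of `st_info`).
const STT_NOTYPE: u8 = 0;
const STT_OBJECT: u8 = 1;
const STT_FUNC: u8 = 2;
const STT_TLS: u8 = 6;
const STT_GNU_IFUNC: u8 = 10;

// Visibility values (low 2 bits of `st_other`).
const STV_INTERNAL: u8 = 1;
const STV_HIDDEN: u8 = 2;

/// True when the symbol type is one of the five types listed above.
pub fn is_supported_type(st_info: u8) -> bool {
    matches!(
        st_info & 0x0f,
        STT_NOTYPE | STT_OBJECT | STT_FUNC | STT_TLS | STT_GNU_IFUNC
    )
}

/// True unless the symbol is marked hidden or internal.
/// `STV_DEFAULT` and `STV_PROTECTED` both count as externally visible.
pub fn is_externally_visible(st_other: u8) -> bool {
    !matches!(st_other & 0x03, STV_INTERNAL | STV_HIDDEN)
}

/// Combined filter an export pass could apply to each defined symbol.
pub fn is_exportable(st_info: u8, st_other: u8) -> bool {
    is_supported_type(st_info) && is_externally_visible(st_other)
}
```

For instance, a global function with default visibility (`st_info = 0x12`, `st_other = 0`) passes the filter, while the same symbol with `st_other = 2` (hidden) does not.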
Testing
Original prompt
This section details the original issue you should resolve.
<issue_title>Enhance ELF Dynamic Symbol Extraction with Library Mapping and Comprehensive Symbol Classification</issue_title>
<issue_description>## Summary
Enhance the existing ELF import/export extraction to comprehensively parse the dynamic section, extract library dependencies (DT_NEEDED entries), map symbols to their originating libraries, and improve symbol classification beyond just functions.
## Context
The current ELF parser in `src/container/elf.rs` provides basic import/export extraction by analyzing the dynamic symbol table (`.dynsym`). However, it has several limitations.

This enhancement extends the existing functionality to provide more complete import/export analysis, matching the comprehensiveness of the PE parser implementation.
## Technical Background
ELF Dynamic Section Structure:
- `DT_NEEDED` entries that specify required shared libraries
- `DT_SYMTAB`, `DT_STRTAB`, and `DT_HASH`/`DT_GNU_HASH` for symbol resolution
- `DT_VERNEED`, `DT_VERDEF`, and `DT_VERSYM` for symbol versioning

Symbol Classification:
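As a rough illustration of this kind of classification, the sketch below decodes the binding from `st_info` and uses `st_shndx` to separate imports from exports. The `Binding` and `Role` enums and both functions are hypothetical names for this example only. Treating local symbols as neither import nor export is an assumption of the sketch, not something the issue states.

```rust
/// Symbol binding, taken from the high 4 bits of `st_info`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Binding {
    Local,
    Global,
    Weak,
    /// OS- or processor-specific binding values.
    Other(u8),
}

/// How a dynamic symbol is treated by this sketch.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Role {
    Import,
    Export,
    Ignored,
}

/// Section index used by undefined symbols.
const SHN_UNDEF: u16 = 0;

/// Decodes the binding from `st_info` (STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2).
pub fn binding(st_info: u8) -> Binding {
    match st_info >> 4 {
        0 => Binding::Local,
        1 => Binding::Global,
        2 => Binding::Weak,
        other => Binding::Other(other),
    }
}

/// Undefined global or weak symbols are imports; defined ones are exports.
/// Assumption of this sketch: local and unknown bindings are skipped.
pub fn role(st_info: u8, st_shndx: u16) -> Role {
    match (binding(st_info), st_shndx) {
        (Binding::Local, _) | (Binding::Other(_), _) => Role::Ignored,
        (_, SHN_UNDEF) => Role::Import,
        _ => Role::Export,
    }
}
```

A weak undefined function (`st_info = 0x22`, `st_shndx = 0`) is classified as an import here, which keeps optional dependencies visible in the import list.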
## Proposed Solution
### Implementation Steps
1. Parse `DT_NEEDED` entries from dynamic section
2. Enhance import extraction (`extract_imports`)
3. Enhance export extraction (`extract_exports`)
4. Extend data structures if needed
   - `ImportInfo` to include symbol type and version
   - `ExportInfo` to include symbol type and binding
5. Add comprehensive unit tests
6. Add integration tests with real binaries (`/bin/ls` equivalent)

### Code Structure
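One possible shape for the first step, parsing `DT_NEEDED` entries, is sketched below using only the standard library. `DynEntry`, `read_str`, and `needed_libraries` are hypothetical names, and the dynamic section and its string table are assumed to be available as an already-decoded slice and a byte buffer. A real implementation would read both through the ELF parsing library instead.

```rust
// Dynamic tag values from the ELF specification.
const DT_NULL: u64 = 0;
const DT_NEEDED: u64 = 1;

/// One `(d_tag, d_val)` pair from the dynamic section (simplified).
#[derive(Debug, Clone, Copy)]
pub struct DynEntry {
    pub tag: u64,
    pub val: u64,
}

/// Reads the NUL-terminated string starting at `offset` in the dynamic
/// string table. Returns `None` for out-of-range offsets, missing
/// terminators, or invalid UTF-8.
fn read_str(strtab: &[u8], offset: usize) -> Option<&str> {
    let tail = strtab.get(offset..)?;
    let end = tail.iter().position(|&b| b == 0)?;
    std::str::from_utf8(&tail[..end]).ok()
}

/// Collects the library names referenced by `DT_NEEDED` entries,
/// stopping at the `DT_NULL` terminator and skipping malformed entries.
pub fn needed_libraries(dynamic: &[DynEntry], strtab: &[u8]) -> Vec<String> {
    dynamic
        .iter()
        .take_while(|entry| entry.tag != DT_NULL)
        .filter(|entry| entry.tag == DT_NEEDED)
        // `as usize` keeps the sketch short; a real parser should
        // reject values that do not fit in `usize`.
        .filter_map(|entry| read_str(strtab, entry.val as usize))
        .map(str::to_owned)
        .collect()
}
```

Skipping malformed entries rather than returning an error is a design choice of this sketch. It keeps the remaining dependencies usable when a single string offset is corrupt.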
## Requirements
4.2, 4.3
## Acceptance Criteria
- `cargo test` passing
- `cargo clippy -- -D warnings` passes
- Documentation in `docs/` explaining the enhanced extraction

## Dependencies
## Task-ID
stringy-analyzer/elf-import-export-extraction</issue_desc...