Pr/compressor integration #1560

Draft
sgerber-main wants to merge 4 commits into Xilinx:dev from sgerber-main:pr/compressor-integration

@sgerber-main

No description provided.

@sgerber-main force-pushed the pr/compressor-integration branch from 91b219d to 17e2e8d on April 21, 2026 12:34

Port of compressor-python library for efficient low-bitwidth dot product
computation using LUT primitives instead of DSP blocks.

Architecture:
- Counter-based compressor trees
- Fused accumulation with constant propagation
- Target-specific primitive selection (CARRY4/CARRY8/LOOKAHEAD8)
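
For readers unfamiliar with counter-based compression, here is a minimal Python sketch of the general idea, not the generator's actual code. It assumes unsigned operands and plain 3:2 counters (full adders); all function names are hypothetical, and the real generator picks target-specific primitives and fuses accumulation, which this sketch does not model.

```python
from collections import defaultdict


def full_adder(a, b, c):
    """3:2 counter: three bits of one weight -> (sum bit, carry bit)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def partial_product_heap(xs, ws, bits):
    """Collect AND-gate partial products of unsigned operands by bit weight."""
    heap = defaultdict(list)
    for x, w in zip(xs, ws):
        for j in range(bits):
            for k in range(bits):
                heap[j + k].append(((x >> j) & 1) & ((w >> k) & 1))
    return heap


def compress(heap):
    """Apply 3:2 counters until no column holds more than two bits."""
    cols = {weight: list(column) for weight, column in heap.items()}
    while any(len(column) > 2 for column in cols.values()):
        nxt = defaultdict(list)
        for weight, column in cols.items():
            full = len(column) - len(column) % 3  # bits consumed by counters
            for i in range(0, full, 3):
                s, c = full_adder(*column[i:i + 3])
                nxt[weight].append(s)        # sum stays in this column
                nxt[weight + 1].append(c)    # carry moves one column up
            nxt[weight].extend(column[full:])  # leftovers pass through
        cols = dict(nxt)
    return cols


def heap_value(cols):
    """Stand-in for the final carry-propagate adder over the two rows."""
    return sum(bit << weight for weight, column in cols.items() for bit in column)


def dot_via_compressor(xs, ws, bits=2):
    return heap_value(compress(partial_product_heap(xs, ws, bits)))
```

Each counter replaces three bits with two while preserving the column sum, so the loop always terminates and the reduced heap still encodes the dot product.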

FPGA Support:
- Versal: Fully functional
- 7-Series: Functional without fused accumulation and gate absorption (not ready for mvau integration)
- UltraScale/UltraScale+: Not yet implemented

Integration scripts for both dotp_comp and add_multi optimization modes included.

Implementation:
- Python-based compressor graph construction and optimization
- SystemVerilog template expansion for RTL generation
- mul_comp_map module for partial product broadcasting

This commit adds the generator infrastructure only. Integration with
FINN's RTL backend follows in subsequent commits.

Wire the compressor generator into FINN's RTL MVAU datapath, enabling
LUT-based dot product computation as an alternative to DSP blocks.

RTL Datapath Changes (finn-rtllib/mvu/):
- mvu_vvu_axi.sv: Add USE_COMPRESSOR parameter and conditional instantiation
- add_multi.sv: Add CATCH_COMP macro for generated compressor module instantiation
- mvu_vvu_axi_wrapper.v: Propagate COMP_PIPELINE_DEPTH parameter

FINN Backend Integration (matrixvectoractivation_rtl.py):
- Add compressor eligibility checks (_is_dotp_comp_eligible)
- Conditionally generate dotp_comp and add_multi compressor modules
- Include generated RTL files in build
- Propagate USE_COMPRESSOR and COMP_PIPELINE_DEPTH template variables
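
As an illustration of the template-variable propagation listed above, a minimal sketch follows. It assumes `$NAME$`-style placeholders in the wrapper template; the template text, the `expand_template` helper, and the empty port list are hypothetical and only stand in for the real wrapper. Only the parameter names `USE_COMPRESSOR` and `COMP_PIPELINE_DEPTH` and the module name `mvu_vvu_axi` come from this PR.

```python
import re

# Hypothetical excerpt of a wrapper template; the real file has more parameters.
TEMPLATE = """\
mvu_vvu_axi #(
  .USE_COMPRESSOR($USE_COMPRESSOR$),
  .COMP_PIPELINE_DEPTH($COMP_PIPELINE_DEPTH$)
) inst ();
"""


def expand_template(template, variables):
    """Substitute $NAME$ placeholders and fail loudly on any left unfilled."""
    out = template
    for name, value in variables.items():
        out = out.replace(f"${name}$", str(value))
    leftover = re.findall(r"\$[A-Z0-9_]+\$", out)
    if leftover:
        raise KeyError(f"unfilled template variables: {leftover}")
    return out


rtl = expand_template(TEMPLATE, {"USE_COMPRESSOR": 1, "COMP_PIPELINE_DEPTH": 2})
```

Raising on leftover placeholders catches the case where one code path (for example MVU) sets a variable and another (for example VVU) forgets it.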

Versal MVAU can use compressor-based compute instead of DSP blocks.
7-Series and UltraScale+ not yet supported.

Test infrastructure:
- XSim testbench templates (dotp_comp_tb, add_multi_comp_tb, mul_comp_map_tb)
- Vivado TCL simulation scripts (dotp_comp, add_multi_comp, dotp)
- Test runner scripts: run_tests.sh (21 core configs), run_dotp_comp_tests.sh (8 configs), run_add_multi_comp_tests.sh (8 configs)
- Common test utilities (test_common.sh)

Complete 7-Series support with gate absorption optimization and fused
accumulation. Add UltraScale/UltraScale+ target (reuses 7-Series primitives,
Vivado maps CARRY4→CARRY8 transparently).
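
The target relationship described above could be modelled roughly as follows. This is a sketch under assumptions: the `Target` dataclass, its fields, and `primitives_needed` are hypothetical and are not the classes added in this PR. The Versal-to-LOOKAHEAD8 pairing is inferred from the primitive list in the first commit, and the UltraScale entry reuses the 7-Series primitive as the commit message states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    carry_primitive: str  # primitive emitted in the generated RTL
    chain_bits: int       # bits handled by one primitive instance


SEVEN_SERIES = Target("7-Series", "CARRY4", 4)
# Emits the same CARRY4-based RTL; Vivado remaps it to CARRY8 during synthesis.
ULTRASCALE = Target("UltraScale", SEVEN_SERIES.carry_primitive, SEVEN_SERIES.chain_bits)
VERSAL = Target("Versal", "LOOKAHEAD8", 8)


def primitives_needed(target, width):
    """Number of carry primitives needed to cover an adder of `width` bits."""
    return -(-width // target.chain_bits)  # ceiling division
```

For a 10-bit final adder this gives three CARRY4 instances on 7-Series and UltraScale, and two LOOKAHEAD8 instances on Versal.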

Key Changes:
- Implement 7-Series gate absorption (MuxCYPredAdder, MuxCYRippleSum)
- Fix 7-Series fused accumulation and carry chain wiring
- Fix compressor generation bugs (mul_comp_map indexing, N=1 passthrough, MuxCYAtom06)
- Add UltraScale() target class and remove UltraScale+ restrictions
- Remove RTL bitwidth restrictions: 2-3 bit networks now eligible for compressor path
- Add BIPOLAR datatype guard (RTL doesn't support BIPOLAR)
- Unified add_multi.sv generation for OOC synthesis
- VVU template variable consistency (USE_COMPRESSOR, COMP_PIPELINE_DEPTH)
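
To make the eligibility rules in this list concrete, here is a hedged sketch of such a check. It is not the body of `_is_dotp_comp_eligible`; the function name, the family strings, and the `max_bits` threshold are assumptions for illustration. Only the BIPOLAR guard, the supported families, and the fact that 2-3 bit operands are eligible are taken from the commit message.

```python
# Family identifiers are hypothetical spellings, not FINN's actual strings.
SUPPORTED_FAMILIES = {"versal", "7series", "ultrascale", "ultrascale+"}


def is_compressor_eligible(family, in_dtype, w_dtype, in_bits, w_bits, max_bits=8):
    """Return True if a layer may use the LUT compressor path (sketch only)."""
    if family.lower() not in SUPPORTED_FAMILIES:
        return False
    # The RTL does not support BIPOLAR, so reject it on either operand.
    if "BIPOLAR" in (in_dtype.upper(), w_dtype.upper()):
        return False
    # max_bits is a hypothetical upper bound; the PR does not state one.
    # There is deliberately no lower bound, so 2-3 bit operands pass.
    return in_bits <= max_bits and w_bits <= max_bits
```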

All three FPGA families (Versal, 7-Series, UltraScale+) now fully supported.

@sgerber-main force-pushed the pr/compressor-integration branch from 17e2e8d to 03bfca4 on April 21, 2026 13:31