Port of compressor-python library for efficient low-bitwidth dot product computation using LUT primitives instead of DSP blocks.

Architecture:
- Counter-based compressor trees
- Fused accumulation with constant propagation
- Target-specific primitive selection (CARRY4/CARRY8/LOOKAHEAD8)

FPGA Support:
- Versal: Fully functional
- 7-Series: Functional without fused accumulation and gate absorption (not ready for mvau integration)
- UltraScale/UltraScale+: Not yet implemented

Integration scripts for both dotp_comp and add_multi optimization modes included.

Implementation:
- Python-based compressor graph construction and optimization
- SystemVerilog template expansion for RTL generation
- mul_comp_map module for partial product broadcasting

This commit adds the generator infrastructure only. Integration with FINN's RTL backend follows in subsequent commits.
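To make the "counter-based compressor tree" idea concrete, here is a minimal behavioural sketch in Python. It is not the generator from this PR: it assumes unsigned operands, uses only 3:2 counters (full adders), and every function name is hypothetical. It models the arithmetic only, with no reference to LUT or carry-chain primitives.

```python
def partial_product_columns(a, b, bits):
    """Bit-level partial products of sum(a[i] * b[i]), grouped by bit weight.

    Assumes unsigned operands of `bits` bits each (an assumption of this sketch).
    """
    cols = [[] for _ in range(2 * bits)]
    for x, y in zip(a, b):
        for i in range(bits):
            for j in range(bits):
                cols[i + j].append(((x >> i) & 1) & ((y >> j) & 1))
    return cols


def compress(cols):
    """Apply stages of 3:2 counters until every column holds at most 2 bits."""
    cols = [list(c) for c in cols]
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for w, col in enumerate(cols):
            k = 0
            while len(col) - k >= 3:
                s = col[k] + col[k + 1] + col[k + 2]
                nxt[w].append(s & 1)        # sum bit stays at weight w
                nxt[w + 1].append(s >> 1)   # carry bit moves to weight w + 1
                k += 3
            nxt[w].extend(col[k:])          # leftover bits pass through
        while nxt and not nxt[-1]:          # drop unused top columns
            nxt.pop()
        cols = nxt
    return cols


def final_add(cols):
    """Stand-in for the final carry-propagate adder over the two remaining rows."""
    return sum(bit << w for w, col in enumerate(cols) for bit in col)


def dot_via_compressor(a, b, bits):
    cols = compress(partial_product_columns(a, b, bits))
    assert all(len(c) <= 2 for c in cols)
    return final_add(cols)


if __name__ == "__main__":
    a, b = [3, 1, 2, 3], [2, 3, 1, 3]
    print(dot_via_compressor(a, b, bits=2), sum(x * y for x, y in zip(a, b)))
```

Each counter replaces three bits of weight w with one bit of weight w and one of weight w + 1, so the weighted sum of all bits is preserved at every stage. Only the final two rows need a carry-propagate addition.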
Wire the compressor generator into FINN's RTL MVAU datapath, enabling LUT-based dot product computation as an alternative to DSP blocks.

RTL Datapath Changes (finn-rtllib/mvu/):
- mvu_vvu_axi.sv: Add USE_COMPRESSOR parameter and conditional instantiation
- add_multi.sv: Add CATCH_COMP macro for generated compressor module instantiation
- mvu_vvu_axi_wrapper.v: Propagate COMP_PIPELINE_DEPTH parameter

FINN Backend Integration (matrixvectoractivation_rtl.py):
- Add compressor eligibility checks (_is_dotp_comp_eligible)
- Conditionally generate dotp_comp and add_multi compressor modules
- Include generated RTL files in build
- Propagate USE_COMPRESSOR and COMP_PIPELINE_DEPTH template variables

Versal MVAU can use compressor-based compute instead of DSP blocks. 7-Series and UltraScale+ not yet supported.
Test infrastructure:
- XSim testbench templates (dotp_comp_tb, add_multi_comp_tb, mul_comp_map_tb)
- Vivado TCL simulation scripts (dotp_comp, add_multi_comp, dotp)
- Test runner scripts: run_tests.sh (21 core configs), run_dotp_comp_tests.sh (8 configs), run_add_multi_comp_tests.sh (8 configs)
- Common test utilities (test_common.sh)
Complete 7-Series support with gate absorption optimization and fused accumulation. Add UltraScale/UltraScale+ target (reuses 7-Series primitives, Vivado maps CARRY4→CARRY8 transparently).

Key Changes:
- Implement 7-Series gate absorption (MuxCYPredAdder, MuxCYRippleSum)
- Fix 7-Series fused accumulation and carry chain wiring
- Fix compressor generation bugs (mul_comp_map indexing, N=1 passthrough, MuxCYAtom06)
- Add UltraScale() target class and remove UltraScale+ restrictions
- Remove RTL bitwidth restrictions: 2-3 bit networks now eligible for compressor path
- Add BIPOLAR datatype guard (RTL doesn't support BIPOLAR)
- Unified add_multi.sv generation for OOC synthesis
- VVU template variable consistency (USE_COMPRESSOR, COMP_PIPELINE_DEPTH)

All three FPGA families (Versal, 7-Series, UltraScale+) now fully supported.
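As a rough illustration of the kind of guard described above, the sketch below rejects BIPOLAR datatypes and unknown FPGA families. It is an assumption-laden stand-in, not the `_is_dotp_comp_eligible` method from this PR: the function name, the string-based datatype and family identifiers, and the exact rule set are all hypothetical.

```python
# Hypothetical family identifiers; the real backend may name these differently.
SUPPORTED_FAMILIES = {"versal", "7series", "ultrascale", "ultrascale_plus"}


def is_compressor_eligible(weight_dtype: str, input_dtype: str, family: str) -> bool:
    """Return True if the compressor path may be used (illustrative rules only)."""
    # BIPOLAR guard: the commit message states the RTL does not support BIPOLAR.
    if "BIPOLAR" in (weight_dtype, input_dtype):
        return False
    # Only families with a compressor target are eligible in this sketch.
    return family in SUPPORTED_FAMILIES


if __name__ == "__main__":
    print(is_compressor_eligible("INT2", "UINT3", "versal"))
    print(is_compressor_eligible("BIPOLAR", "UINT3", "versal"))
```

Keeping the check a pure function of datatypes and target makes it easy to unit-test separately from RTL generation.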