Making multiplication and inner product more efficient #1
Open
goforashutosh wants to merge 6 commits into DCMMC:main from
Hi, we used this repo for matrix multiplication, which involved a large number of multiplications and inner products. The diff below shows a lot of changes due to formatting differences in our code, but this fork only modifies two functions from the original repo: `signed_div_scale` and `inner_product`.

For `signed_div_scale`, we enforce the bound `|q| < 2^(3p)` by translating `q` by `2^(3p)` and doing a range check. This reduces the number of cells required per multiplication call from ~250 to ~90 (for lookup bits = 12).

For `inner_product`, we change the implementation so that `signed_div_scale` is called only once, since this is the expensive call. This takes the number of cells from `90*N` to `N+90`, where `N` is the size of the vectors in the inner product.
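To make the two ideas above concrete, here is a minimal plain-integer sketch in Python. It is not the circuit code from this fork: every function name, the choice `P = 8`, and the floor-style rescale are assumptions made for illustration only. The cell estimate simply restates the `90*N` versus `N+90` figures quoted above. Note that a single unsigned range check on the shifted value, as modelled here, also admits `q = -2^(3p)`, so it is very slightly looser than the strict bound.

```python
# Plain-integer model of the two changes; NOT the circuit implementation.
# All names and constants below are hypothetical.

P = 8  # assumed number of fractional bits (the "p" in 2^(3p))


def in_signed_range(q: int, p: int = P) -> bool:
    """Model the bound on q with one unsigned range check.

    Shifting q by 2^(3p) maps -2^(3p) <= q < 2^(3p) onto
    0 <= q + 2^(3p) < 2^(3p + 1), which is a single (3p + 1)-bit check.
    """
    shifted = q + (1 << (3 * p))
    return 0 <= shifted < (1 << (3 * p + 1))


def div_scale(x: int, p: int = P) -> int:
    """Rescale a product back to p fractional bits.

    Assumption: floor division by 2^p (Python's >> floors for negatives);
    the real signed_div_scale may round differently.
    """
    return x >> p


def inner_product_per_term(a, b, p: int = P) -> int:
    """Rescale every product individually: N rescale calls."""
    return sum(div_scale(x * y, p) for x, y in zip(a, b))


def inner_product_single_rescale(a, b, p: int = P) -> int:
    """Accumulate raw products, then rescale once: 1 rescale call."""
    return div_scale(sum(x * y for x, y in zip(a, b)), p)


def cell_estimate(n: int, cells_per_rescale: int = 90) -> dict:
    """Restate the cell counts quoted in this PR for vectors of size n."""
    return {"per_term": cells_per_rescale * n, "single": n + cells_per_rescale}
```

For example, with `P = 8` the vectors `[256, 512]` and `[128, 256]` represent `[1.0, 2.0]` and `[0.5, 1.0]`, and both variants return `640` (that is, 2.5). The two variants can differ in the last bit when the individual products have non-zero low bits, because the single-rescale version truncates only once.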