Add more multiplication primitives by jamierpond · Pull Request #107 · thecppzoo/zoo

jamierpond · 2025-01-13T02:35:46Z

add main logic
add even lane mask
even/odd lane mask
clean up a little

jamierpond · 2025-01-13T02:36:58Z

inc/zoo/swar/associative_iteration.h

+
+using S = SWAR<4, u32>;
+
+static_assert(S::oddLaneMask().value() == 0xF0F0'F0F0);


I'm aware these tests are not formatted nicely; this is just a draft for visibility.
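
For readers unfamiliar with the lane masks being asserted above, here is a minimal standalone sketch of how even and odd lane masks can be computed for a 32-bit word. It is not the library's implementation: in the PR `oddLaneMask` is a static member of `SWAR`, whereas the free function templates `evenLaneMask` and `oddLaneMask` below are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Illustrative only: selects every other lane of LaneBits bits in a 32-bit
// word, starting at lane 0 (the least significant lane).
template<int LaneBits>
constexpr std::uint32_t evenLaneMask() {
    static_assert(LaneBits > 0 && 32 % (2 * LaneBits) == 0,
                  "this sketch only supports lanes that pair up evenly in 32 bits");
    constexpr std::uint32_t laneOnes = (std::uint32_t{1} << LaneBits) - 1;
    std::uint32_t mask = 0;
    for (int shift = 0; shift < 32; shift += 2 * LaneBits) {
        mask |= laneOnes << shift;
    }
    return mask;
}

// The odd lanes are the even lanes moved up by one lane.
template<int LaneBits>
constexpr std::uint32_t oddLaneMask() {
    return evenLaneMask<LaneBits>() << LaneBits;
}

static_assert(evenLaneMask<4>() == 0x0F0F'0F0F);
static_assert(oddLaneMask<4>() == 0xF0F0'F0F0);
```

The last assertion matches the value checked in the diff for `SWAR<4, u32>`.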

thecppzoo

Thanks for this draft.
This work surfaces several important questions:

Support for non-power-of-two lane sizes.
What are we going to do with two's complement signs? Perhaps this is not fullMultiplication but safeMultiplication.
We have to implement "negation" (flipping the sign in two's complement).

That being said, we are in an excellent position to also implement division as multiplication by the reciprocal, which would be useful at least for compile-time divisors.
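
As a scalar illustration of division as multiplication by the reciprocal for a compile-time divisor, the sketch below divides a 32-bit unsigned value by 3. It is not SWAR code and not part of this PR; `divideBy3` is a hypothetical name, and a lane-wise version would presumably need the widening product discussed in this review.

```cpp
#include <cstdint>

// Illustrative only: 0xAAAAAAAB is ceil(2^33 / 3), so multiplying by it and
// shifting right by 33 bits yields x / 3 for every 32-bit unsigned x.
constexpr std::uint32_t divideBy3(std::uint32_t x) {
    constexpr std::uint64_t reciprocal = 0xAAAA'AAABu;
    // The product needs up to 64 bits, hence the wider intermediate type.
    return static_cast<std::uint32_t>((x * reciprocal) >> 33);
}

static_assert(divideBy3(0) == 0);
static_assert(divideBy3(2) == 0);
static_assert(divideBy3(3) == 1);
static_assert(divideBy3(0xFFFF'FFFF) == 0x5555'5555);
```

Note that the intermediate product is twice as wide as the operands, which is why a double-precision multiplication is the natural building block here.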

Please merge the auto declarations into a single one: this does not coerce the types, but verifies that all the declared variables have the same type. Compare these two separate declarations:

auto a = initialize_a(inputsForA);
auto b = initialize_b(inputsForB);

In that code, the types are not coerced (very good!) but a and b may be of different types.

auto
    a = initA(iA),
    b = initB(iB);

We still don't coerce the types, but if a and b are deduced to different types, it is a compilation error.
This is especially useful here in the SWAR library.
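
A small self-contained sketch of the merged declaration follows. The initializers `initA`, `initB` and `initWide` are hypothetical stand-ins invented for this example; they are not functions from the library.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical initializers, for illustration only.
constexpr std::uint32_t initA(std::uint32_t x) { return x + 1; }
constexpr std::uint32_t initB(std::uint32_t x) { return x * 2; }
constexpr std::uint64_t initWide(std::uint32_t x) { return x; }

constexpr std::uint32_t sumMerged(std::uint32_t iA, std::uint32_t iB) {
    // One `auto`, two declarators: both must deduce to the same type.
    auto
        a = initA(iA),
        b = initB(iB);
    static_assert(std::is_same_v<decltype(a), decltype(b)>);
    // The following would be rejected by the compiler, because the deduced
    // types differ (std::uint32_t versus std::uint64_t):
    //     auto c = initA(iA), w = initWide(iB);
    return a + b;
}

static_assert(sumMerged(1, 2) == 6);
```

With SWAR types, where lane width and base type are part of the type, the same check would catch accidentally mixing differently laned values.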

inc/zoo/swar/SWAR.h

thecppzoo · 2025-01-13T05:29:47Z

inc/zoo/swar/associative_iteration.h

+   auto [l_even, l_odd] = doublePrecision(multiplicand);
+   auto [r_even, r_odd] = doublePrecision(multiplier);
+   auto res_even = multiplication_OverflowUnsafe(l_even, r_even);
+   auto res_odd = multiplication_OverflowUnsafe(l_odd, r_odd);


Merge these declarations into a single auto declaration; the idea is that this way you verify they are all of the same type.

Also, TODO: signed multiplication.

thecppzoo · 2025-01-13T05:48:00Z

Perhaps we can also provide a "widening multiplication" that doubles the lane size. For example, x86-64 has instructions that multiply two register-size values and produce a result with double the number of bits, using the "DX:AX" register pair for the result: for 64 bits, RDX holds the upper 64 bits and RAX the lower. In this way, the multiplication also widens. Ask Claude what the name of this is.
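
To make the idea concrete, here is a minimal scalar sketch of a widening multiplication in portable C++: two 32-bit operands produce a 64-bit product that is returned as lower and upper halves, analogous to the RDX:RAX pair. The names `WideProduct32` and `wideningMultiply` are assumptions made for this example and are not part of the library.

```cpp
#include <cstdint>

// Illustrative result type: the two halves of a 64-bit product.
struct WideProduct32 {
    std::uint32_t lower, upper;
};

constexpr WideProduct32 wideningMultiply(std::uint32_t a, std::uint32_t b) {
    // Widen one operand first so the multiplication is done in 64 bits.
    std::uint64_t product = std::uint64_t{a} * b;
    return {
        static_cast<std::uint32_t>(product),
        static_cast<std::uint32_t>(product >> 32)
    };
}

// (2^32 - 1)^2 = 2^64 - 2^33 + 1
static_assert(wideningMultiply(0xFFFF'FFFF, 0xFFFF'FFFF).lower == 0x0000'0001);
static_assert(wideningMultiply(0xFFFF'FFFF, 0xFFFF'FFFF).upper == 0xFFFF'FFFE);
```

A SWAR version would do the same per lane, which is where splitting into even and odd lanes of double precision comes in.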

thecppzoo · 2025-01-16T23:32:20Z

inc/zoo/swar/associative_iteration.h

+   SWAR<NB, T> result;
+   SWAR<NB, T> overflow;


This is not overflow.

thecppzoo · 2025-01-16T23:34:14Z

inc/zoo/swar/associative_iteration.h

+
+template <int NB, typename T>
+constexpr auto
+doublingMultiplication(SWAR<NB, T> multiplicand, SWAR<NB, T> multiplier) {


"doubling" is confusing here. doublePrecisionMultiplication is fine, multiplicationByDoublingPrecision, ...

thecppzoo · 2025-01-16T23:35:04Z

inc/zoo/swar/associative_iteration.h

+}
+
+template <int NB, typename T>
+constexpr MultiplicationResult<NB, T>


Why the explicit return type?

thecppzoo · 2025-01-16T23:39:25Z

inc/zoo/swar/associative_iteration.h

+template<int NB, typename T>
+constexpr auto saturatingExponentiation(
+    SWAR<NB, T> x,


Absolutely not.
We are not removing the non-saturating exponentiation to provide only the saturating one. Don't do that.
The general operation is always a prerequisite for the more specific one.
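
One way to read this principle, sketched on plain 8-bit scalars rather than SWAR lanes: keep the general exponentiation and build the saturating variant on top of it. Everything below is a hypothetical illustration; in particular, reporting overflow through a `PowerResult` struct is an assumption of this sketch, not the interface used in the PR.

```cpp
#include <cstdint>

// Illustrative result: the wrapped (modulo 256) power plus an overflow flag.
struct PowerResult {
    std::uint8_t value;
    bool overflowed;
};

// General operation: wraps around on overflow and reports that it did.
constexpr PowerResult exponentiation(std::uint8_t base, unsigned exponent) {
    unsigned result = 1;
    bool overflowed = false;
    for (unsigned i = 0; i < exponent; ++i) {
        result *= base;
        if (result > 0xFF) {
            overflowed = true;
            result &= 0xFF; // keep only the low 8 bits
        }
    }
    return { static_cast<std::uint8_t>(result), overflowed };
}

// Specific operation, expressed in terms of the general one.
constexpr std::uint8_t saturatingExponentiation(std::uint8_t base, unsigned exponent) {
    auto r = exponentiation(base, exponent);
    return r.overflowed ? std::uint8_t{0xFF} : r.value;
}

static_assert(exponentiation(3, 4).value == 81);
static_assert(exponentiation(2, 8).value == 0 && exponentiation(2, 8).overflowed);
static_assert(saturatingExponentiation(2, 8) == 0xFF);
```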

thecppzoo · 2025-01-16T23:41:13Z

inc/zoo/swar/associative_iteration.h

+   SWAR<NB, T> lower;
+   SWAR<NB, T> upper;


thecppzoo · 2025-01-16T23:42:43Z

inc/zoo/swar/associative_iteration.h

+       over_even = D{(lower.value() & UpperHalfOfLanes) >> HalfLane},
+       over_odd = D{(upper.value() & UpperHalfOfLanes) >> HalfLane};


The intra-lane shift primitives allow you to provide the mask.
Please use those primitives instead of deploying the pickaxe.
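
The manual pattern in the diff is "mask, then shift". A hedged sketch of wrapping it in a named helper is shown below for 8-bit lanes in a 32-bit word; `shiftLanesRightMasked` and `upperHalfOfEachLane` are hypothetical names for this example, and the signature of the library's actual intra-lane shift primitive is not reproduced here.

```cpp
#include <cstdint>

// Hypothetical helper, not the zoo API: keeps only the bits selected by
// `mask`, then shifts right, so no bit crosses into a neighbouring lane
// provided the mask clears the low `count` bits of every lane.
constexpr std::uint32_t shiftLanesRightMasked(
    std::uint32_t value, std::uint32_t mask, int count
) {
    return (value & mask) >> count;
}

// For 8-bit lanes: move the upper half of each lane into its lower half.
constexpr std::uint32_t upperHalfOfEachLane(std::uint32_t lanes) {
    constexpr std::uint32_t UpperHalfOfLanes = 0xF0F0'F0F0;
    constexpr int HalfLane = 4;
    return shiftLanesRightMasked(lanes, UpperHalfOfLanes, HalfLane);
}

static_assert(upperHalfOfEachLane(0xABCD'1234) == 0x0A0C'0103);
```

The benefit of a named primitive is that the mask and the shift count travel together, so the intent is visible at the call site.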

thecppzoo · 2025-01-16T23:45:06Z

inc/zoo/swar/SWAR.h

+template <int NBits, typename T>
+constexpr static auto consumeMSB(SWAR<NBits, T> s) noexcept {
+    using S = SWAR<NBits, T>;
+    auto msbCleared = s & ~S{S::MostSignificantBit};
+    return S{static_cast<T>(msbCleared.value() << 1)};
+}



I am not sold on promoting this to the main header of SWAR.
This really seems to be an artifact of the "regressive" direction of "associative iteration"; it does not cohere enough with the SWAR library itself.
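
For reference, this is what the function in the diff computes, restated as a standalone sketch for 4-bit lanes in a 32-bit word. `consumeMSB4` is a hypothetical name used only here; the version in the PR is generic over `SWAR<NBits, T>`.

```cpp
#include <cstdint>

// Illustrative only: clears the most significant bit of every 4-bit lane,
// then shifts left by one, so the next bit of each lane becomes its MSB
// without spilling into the neighbouring lane.
constexpr std::uint32_t consumeMSB4(std::uint32_t lanes) {
    constexpr std::uint32_t MostSignificantBits = 0x8888'8888;
    return (lanes & ~MostSignificantBits) << 1;
}

// Lanes 0xF and 0x9 become 0xE and 0x2.
static_assert(consumeMSB4(0x0000'00F9) == 0x0000'00E2);
```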

thecppzoo · 2025-01-16T23:47:32Z

inc/zoo/swar/associative_iteration.h

+auto
+doublePrecisionMultiplication(SWAR<NB, T> multiplicand, SWAR<NB, T> multiplier) {
+   auto
+       icand = doublePrecision(multiplicand),


Nice! I never thought about omitting the prefix.

thecppzoo · 2025-01-16T23:49:01Z

I cannot resist commenting on how elegant this is all looking.
The primitives for doubling and halving precision were a success.

jamierpond added 4 commits January 12, 2025 18:33

add main logic

b4db29e

add even lane mask

1261006

even/odd lane mask

f976ae4

clean up a little

f0720bd

jamierpond requested a review from thecppzoo January 13, 2025 02:35

clean some more

f928811

jamierpond commented Jan 13, 2025

just return overflow

51f2987

thecppzoo reviewed Jan 13, 2025

jamierpond added 4 commits January 14, 2025 12:29

wow seems to be working

35fffa7

tidy a little

b87b408

rm spurious tests

07be9f9

rename

93e4bac

jamierpond marked this pull request as ready for review January 15, 2025 01:39

jamierpond changed the title from "Draft: fullMultiplication" to "Add more multiplication primitives" Jan 15, 2025

jamierpond added 3 commits January 14, 2025 22:44

start test refactor

05468a2

tidy tests

ac45f1b

rm tests

2302504

jamierpond requested a review from thecppzoo January 16, 2025 23:28

thecppzoo reviewed Jan 16, 2025

add consume msb

2214ac8

thecppzoo reviewed Jan 16, 2025

rename lower/upper

61a7506

thecppzoo reviewed Jan 16, 2025

inc/zoo/swar/associative_iteration.h

Comment on lines +477 to +478

+   SWAR<NB, T> lower;
+   SWAR<NB, T> upper;

thecppzoo · Jan 16, 2025

Merge these declarations.

thecppzoo reviewed Jan 16, 2025

make doubling multi nicer

7b41db0

thecppzoo reviewed Jan 16, 2025

jamierpond added 3 commits January 16, 2025 17:14

consolidate exponentation and make naming consistent

5cf88df

works

3a65ed2

mv tests

2b613ee

jamierpond requested a review from thecppzoo February 25, 2025 08:08

jamierpond added 2 commits February 25, 2025 00:10

oops

fa0667b

tidy

dea354a

jamierpond force-pushed the jp/overflow-multi-safe-clean branch from 9fb833d to dea354a Compare February 25, 2025 08:27

jamierpond added 4 commits February 25, 2025 00:27

tidy

81237a6

tidy

b792e30

make pair for generatlity

5d41262

tidy

cfb1072



2 participants