Add new level-0 macro layer. #830

Merged
devinamatthews merged 20 commits into master from new-level0
Feb 27, 2025
@devinamatthews
Member

@devinamatthews devinamatthews commented Nov 3, 2024

Details:

  • Developed by @fgvanzee and @devinamatthews.
  • Level-0 scalar macros have moved from a name-based system (e.g. bli_dcopys( ... )) to a macro argument-based system (bli_tcopys( d,d, ... )).
  • All macros are explicitly mixed-type.
  • All input and output operands can have a distinct type (precision and/or domain). Unnecessary computations and spurious NaN/Inf propagation are avoided in mixed-domain cases.
  • All macros which do math (i.e. not copy/set/etc.) take an additional computational precision.
  • Tile-level macros, 1m, broadcast-B, and other extensions are also included.
  • All macros should correctly handle aliasing of input and output operands (this needs to be rigorously checked).
  • The macros work generically over the defined types -- new types only need limited support (primarily conversion to other types and basic math).
  • Fixes #828 (Cannot use complex bli_scal2s with &x == &y).
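
The move from name-based to macro argument-based dispatch described in the list above can be sketched with token pasting. This is a minimal illustration only, not the BLIS implementation: every `demo_` and `DEMO_` name is hypothetical, only the real types `s` and `d` are covered, and the argument order is an assumption chosen for the example.

```c
#include <stdio.h>

/* Hypothetical type table: a type character is pasted into a typedef name. */
typedef float  demo_ctype_s;
typedef double demo_ctype_d;
#define DEMO_CTYPE(ch) demo_ctype_##ch

/* Mixed-type copy: read x as type chx, store into y as type chy. */
#define demo_tcopys(chx, chy, x, y) \
    do { \
        const DEMO_CTYPE(chx) demo_x_ = (x); \
        (y) = (DEMO_CTYPE(chy))demo_x_; \
    } while (0)

/* Mixed-type y := y + a*x, with the arithmetic carried out in the
   computational precision chc before converting to the type of y. */
#define demo_taxpys(cha, chx, chy, chc, a, x, y) \
    do { \
        const DEMO_CTYPE(cha) demo_a_ = (a); \
        const DEMO_CTYPE(chx) demo_x_ = (x); \
        const DEMO_CTYPE(chc) demo_r_ = \
            (DEMO_CTYPE(chc))(y) + \
            (DEMO_CTYPE(chc))demo_a_ * (DEMO_CTYPE(chc))demo_x_; \
        (y) = (DEMO_CTYPE(chy))demo_r_; \
    } while (0)

/* Small wrappers so the macros can be exercised like functions. */
double demo_copy_s_to_d(float x)
{
    double y = 0.0;
    demo_tcopys(s, d, x, y);
    return y;
}

float demo_axpy_s_in_d(float a, float x, float y)
{
    /* single-precision operands, double-precision computation */
    demo_taxpys(s, s, s, d, a, x, y);
    return y;
}
```

One macro body serves every type combination here because the types are looked up from the character arguments, which is the property the description attributes to the new layer.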

@devinamatthews devinamatthews marked this pull request as draft November 3, 2024 22:02
@devinamatthews
Member Author

@fgvanzee I'm going to first rigorously check this fixes #828 and also add some tests.

@devinamatthews devinamatthews marked this pull request as ready for review November 5, 2024 00:10
@devinamatthews
Member Author

@fgvanzee everything works now, with a full level-0 testsuite. "In-place" axpys, axpbys, xpbys, and scal2s also tested and work correctly.
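
For readers unfamiliar with why "in-place" complex operations need care, the following sketch shows one way a complex `y := a * x` can go wrong when `x` and `y` are the same object. It is an illustration under assumptions, not the code from this PR or the exact cause of #828: the `demo_` names and the struct layout are hypothetical.

```c
#include <stdio.h>

/* Hypothetical complex type for the illustration. */
typedef struct { double re, im; } demo_dcomplex;

/* Naive y := a * x. If y aliases x, y->re is overwritten before
   x->re is read for the imaginary part, giving a wrong result. */
void demo_scal2_naive(const demo_dcomplex *a, const demo_dcomplex *x,
                      demo_dcomplex *y)
{
    y->re = a->re * x->re - a->im * x->im;
    y->im = a->re * x->im + a->im * x->re;
}

/* Alias-safe y := a * x. Both parts are computed into temporaries
   before anything is stored, so x == y is handled correctly. */
void demo_scal2_safe(const demo_dcomplex *a, const demo_dcomplex *x,
                     demo_dcomplex *y)
{
    const double re = a->re * x->re - a->im * x->im;
    const double im = a->re * x->im + a->im * x->re;
    y->re = re;
    y->im = im;
}
```

With `a = i` and `x = 1 + 2i`, the correct product is `-2 + i`. Calling the safe version with `x` and `y` pointing at the same object produces that value, while the naive version produces `-2 - 2i`.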

@fgvanzee
Member

fgvanzee commented Nov 9, 2024

Awesome! Many thanks for finishing this up.

Comment thread docs/Multithreading.md Outdated
BLIS disables multithreading by default. In order to allow multithreaded parallelism from BLIS, you must first enable multithreading explicitly at configure-time.

- As of this writing, BLIS optionally supports multithreading via OpenMP or POSIX threads(or both).
+ As of this writing, BLIS optionally supports multithreading via OpenMP or POSIX bli_threads(or both).
Member

Seems to be a search-and-replace misfire.

Member

@fgvanzee fgvanzee Nov 15, 2024

Oh wait, or maybe it's not a typo? Is this meant to reference the bli_pthread_*() API wrapper?

Member Author

It is a typo.

@devinamatthews
Member Author

devinamatthews commented Nov 15, 2024 via email

devinamatthews and others added 2 commits November 15, 2024 15:00
Revert typo in docs. [ci skip]
Member

@fgvanzee fgvanzee left a comment

I didn't review the C++ files, but everything else looks good.

@devinamatthews
Member Author

OK, I'll want to wait to merge this 'til I clear a few of the other PRs out of the queue.

@devinamatthews
Member Author

@fgvanzee are we out of Travis CI credits again?

@devinamatthews devinamatthews merged commit a014a08 into master Feb 27, 2025
@devinamatthews devinamatthews deleted the new-level0 branch February 27, 2025 19:48
devinamatthews added a commit that referenced this pull request Jun 7, 2025
Details:
- Developed by @fgvanzee and @devinamatthews.
- Level-0 scalar macros have moved from a name-based system (e.g. `bli_dcopys( ... )`) to a macro argument-based system (`bli_tcopys( d,d, ... )`).
- All macros are explicitly mixed-type.
- All input and output operands can have a distinct type (precision and/or domain). Unnecessary computations and spurious NaN/Inf propagation are avoided in mixed-domain cases.
- All macros which do math (i.e. not copy/set/etc.) take an additional computational precision.
- Tile-level macros, 1m, broadcast-B, and other extensions are also included.
- All macros should correctly handle aliasing of input and output operands (this needs to be rigorously checked).
- The macros work generically over the defined types -- new types only need limited support (primarily conversion to other types and basic math).
- For code outside of core BLIS (optimized kernels, sandboxes, etc.), a selection of legacy macros have been added which translate to the new level-0 macros. Behavior is unchanged.
- A standalone, templated C++ testsuite for the level-0 macros has been added. It is currently included as part of the CircleCI tests.
- Const-correctness of level-0 macros is also checked.

(cherry picked from commit a014a08)