Add new level-0 macro layer. #830
Merged
devinamatthews merged 20 commits into master on Feb 27, 2025
Details:
- Developed by @fgvanzee and @devinamatthews.
- Level-0 scalar macros have moved from a name-based system (e.g. `bli_dcopys( ... )`) to a macro argument-based system (`bli_tcopys( d,d, ... )`).
- All macros are explicitly mixed-type.
- All input and output operands can have a distinct type (precision and/or domain). Unnecessary computations and spurious NaN/Inf propagation are avoided in mixed-domain cases.
- All macros which do math (i.e. not copy/set/etc.) take an additional computational precision.
- Tile-level macros, 1m, broadcast-B, and other extensions are also included.
- All macros should correctly handle aliasing of input and output operands (this needs to be rigorously checked).
- The macros work generically over the defined types. New types only need limited support (primarily conversion to other types and basic math).
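For readers unfamiliar with the macro argument-based style, the following is a minimal sketch of how type characters passed as macro arguments can select C types through token pasting. Every `ex_`/`EX_` name here is hypothetical and chosen for illustration; this is not the BLIS implementation, and it covers only real `float`/`double` operands.

```c
#include <assert.h>

/* Hypothetical sketch: the ex_/EX_ names below are illustrative and are
 * NOT the BLIS definitions. Only real types (s, d) are modeled. */

/* Map a one-letter type character to a C type by token pasting. */
#define EX_CTYPE_s float
#define EX_CTYPE_d double
#define EX_CTYPE(ch) EX_CTYPE_##ch

/* Mixed-type copy, y := x. chx and chy name the types of x and y. */
#define ex_tcopys(chx, chy, x, y) \
    do { \
        EX_CTYPE(chx) ex_x_ = (x); \
        (y) = (EX_CTYPE(chy))ex_x_; \
    } while (0)

/* Mixed-type axpy, y := y + a*x. The extra character chc selects the
 * precision in which the arithmetic is carried out. Inputs are read into
 * temporaries before y is written. */
#define ex_taxpys(cha, chx, chy, chc, a, x, y) \
    do { \
        EX_CTYPE(cha) ex_a_ = (a); \
        EX_CTYPE(chx) ex_x_ = (x); \
        EX_CTYPE(chy) ex_y_ = (y); \
        (y) = (EX_CTYPE(chy))( (EX_CTYPE(chc))ex_y_ \
                             + (EX_CTYPE(chc))ex_a_ * (EX_CTYPE(chc))ex_x_ ); \
    } while (0)

/* Small usage example. */
static void ex_demo(void)
{
    float  xs = 1.5f;
    double yd = 0.0;
    ex_tcopys(s, d, xs, yd);          /* float -> double copy */
    assert(yd == 1.5);

    double a = 2.0;
    float  x = 3.0f, y = 1.0f;
    ex_taxpys(d, s, s, d, a, x, y);   /* float storage, double arithmetic */
    assert(y == 7.0f);
}
```

One macro body serves every type combination in this sketch, which is the appeal of passing type characters as arguments rather than encoding them in the macro name.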
@fgvanzee everything works now, with a full level-0 testsuite. The "in-place" axpys, axpbys, xpbys, and scal2s cases are also tested and work correctly.
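To illustrate why "in-place" cases deserve their own tests, here is a hedged sketch of one way a complex `y := alpha * x` can go wrong when `x` and `y` are the same object. The `ex_` names and the struct are hypothetical and are not taken from BLIS; the sketch only shows the general hazard, not the specific behavior of any BLIS macro.

```c
#include <assert.h>

/* Hypothetical illustration; not the BLIS types or macros. */
typedef struct { double real, imag; } ex_dcomplex;

/* Naive y := alpha * x. If x and y alias, y->real is overwritten before
 * x->real is read for the imaginary part, so the result is wrong. */
static void ex_scal2s_naive(const ex_dcomplex *alpha,
                            const ex_dcomplex *x, ex_dcomplex *y)
{
    y->real = alpha->real * x->real - alpha->imag * x->imag;
    y->imag = alpha->real * x->imag + alpha->imag * x->real;
}

/* Alias-safe variant: read every input into a local before writing y. */
static void ex_scal2s_safe(const ex_dcomplex *alpha,
                           const ex_dcomplex *x, ex_dcomplex *y)
{
    const double ar = alpha->real, ai = alpha->imag;
    const double xr = x->real,     xi = x->imag;
    y->real = ar * xr - ai * xi;
    y->imag = ar * xi + ai * xr;
}

/* In-place check with alpha = i and x = 1 + 2i; the exact product is -2 + i. */
static void ex_demo(void)
{
    const ex_dcomplex alpha = { 0.0, 1.0 };

    ex_dcomplex v = { 1.0, 2.0 };
    ex_scal2s_safe(&alpha, &v, &v);
    assert(v.real == -2.0 && v.imag == 1.0);

    ex_dcomplex w = { 1.0, 2.0 };
    ex_scal2s_naive(&alpha, &w, &w);
    assert(w.real == -2.0 && w.imag != 1.0);  /* imaginary part is corrupted */
}
```

An out-of-place test would pass for both variants, which is why a separate in-place test is needed to tell them apart.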
Awesome! Many thanks for finishing this up.
fgvanzee reviewed Nov 15, 2024
In docs/Multithreading.md:
 BLIS disables multithreading by default. In order to allow multithreaded parallelism from BLIS, you must first enable multithreading explicitly at configure-time.
-As of this writing, BLIS optionally supports multithreading via OpenMP or POSIX threads(or both).
+As of this writing, BLIS optionally supports multithreading via OpenMP or POSIX bli_threads(or both).
Seems to be a search-and-replace misfire.
Oh wait, or maybe it's not a typo? Is this meant as a reference to the bli_pthread_*() API wrapper?
Yep, please revert.
Revert typo in docs. [ci skip]
fgvanzee approved these changes Nov 25, 2024
I didn't review the C++ files, but everything else looks good.
OK, I'll want to wait to merge this 'til I clear a few of the other PRs out of the queue.
@fgvanzee are we out of Travis CI credits again?
Also add some missing in-place tests.
devinamatthews added a commit that referenced this pull request Jun 7, 2025
Details:
- Developed by @fgvanzee and @devinamatthews.
- Level-0 scalar macros have moved from a name-based system (e.g. `bli_dcopys( ... )`) to a macro argument-based system (`bli_tcopys( d,d, ... )`).
- All macros are explicitly mixed-type.
- All input and output operands can have a distinct type (precision and/or domain). Unnecessary computations and spurious NaN/Inf propagation are avoided in mixed-domain cases.
- All macros which do math (i.e. not copy/set/etc.) take an additional computational precision.
- Tile-level macros, 1m, broadcast-B, and other extensions are also included.
- All macros should correctly handle aliasing of input and output operands (this needs to be rigorously checked).
- The macros work generically over the defined types. New types only need limited support (primarily conversion to other types and basic math).
- For code outside of core BLIS (optimized kernels, sandboxes, etc.), a selection of legacy macros has been added which translate to the new level-0 macros. Behavior is unchanged.
- A standalone, templated C++ testsuite for the level-0 macros has been added. It is currently included as part of the CircleCI tests.
- Const-correctness of level-0 macros is also checked.

(cherry picked from commit a014a08)
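The point about spurious NaN/Inf propagation in mixed-domain cases can be made concrete with a small sketch. It assumes a real scalar multiplying a complex value and compares promoting the real scalar to a complex number with a zero imaginary part against scaling each component directly. The `ex_` names are hypothetical and do not correspond to BLIS macros.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical illustration; not the BLIS types or macros. */
typedef struct { double real, imag; } ex_dcomplex;

/* Promote the real scalar to (alpha, 0) and do a full complex multiply.
 * The 0 * x.imag term becomes NaN when x.imag is infinite. */
static ex_dcomplex ex_scal_promoted(double alpha, ex_dcomplex x)
{
    const double ar = alpha, ai = 0.0;
    ex_dcomplex y;
    y.real = ar * x.real - ai * x.imag;
    y.imag = ar * x.imag + ai * x.real;
    return y;
}

/* Mixed-domain version: scale each component by the real scalar only.
 * Fewer flops, and no 0 * Inf product is ever formed. */
static ex_dcomplex ex_scal_mixed(double alpha, ex_dcomplex x)
{
    ex_dcomplex y;
    y.real = alpha * x.real;
    y.imag = alpha * x.imag;
    return y;
}

/* Compare both approaches on x = 1 + Inf*i scaled by alpha = 2. */
static void ex_demo(void)
{
    const ex_dcomplex x = { 1.0, INFINITY };

    const ex_dcomplex p = ex_scal_promoted(2.0, x);
    assert(isnan(p.real));                 /* spurious NaN in the real part */

    const ex_dcomplex m = ex_scal_mixed(2.0, x);
    assert(m.real == 2.0 && isinf(m.imag));
}
```

The sketch assumes IEEE 754 arithmetic without value-changing optimizations such as `-ffast-math`.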
Referenced issue: `bli_scal2s` with `&x == &y` #828.