@martin-frbg
Collaborator

For #2068. Some of the implementations (notably POWER, MIPS64 and SPARC) are somewhat inefficient: they replace the fabs calls of their corresponding ?asum template with an otherwise unneeded fmov, solely to avoid rewriting the logic.
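
As a rough illustration of what "a copy of ?asum with the fabs calls removed" amounts to, here is a minimal C sketch for the real-valued case. The function names `ref_asum` and `ref_sum` and their signatures are hypothetical and are not the kernel interfaces used in this PR; the sketch only shows the relationship between the two loops.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical reference loop in the style of an asum kernel:
 * accumulates the absolute values of n elements with stride incx. */
static double ref_asum(size_t n, const double *x, size_t incx)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += fabs(x[i * incx]);
    return acc;
}

/* The same loop with the fabs call removed (hypothetical name):
 * accumulates the signed values, leaving the structure unchanged. */
static double ref_sum(size_t n, const double *x, size_t incx)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i * incx];
    return acc;
}
```

In hand-written assembly the fabs usually sits between a load and an add on different registers, which is presumably why some ports substitute a register move rather than restructure the loop.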

as trivial copies of asum/zsasum with the fabs calls replaced by fmov to preserve code structure
(trivial copies of the respective ?asum with the fabs calls removed)
as trivial copies of the respective ?asum kernels with the fabs calls removed
as trivial copy of asum with the fabs calls removed
as trivial copy of ?asum with the fabs calls removed
as trivial copy of ?asum with the fabs replaced by mov to preserve code structure
as trivial copy of ?asum with the fabs replaced by fmr to preserve code structure
as trivial copy of ?asum with the fabs replaced by fmov to preserve code structure
as trivial copy of ?asum with the fabs calls removed
as trivial copy of ?asum with the fabs calls removed
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
@martin-frbg added this to the 0.3.6 milestone on Mar 30, 2019