Added BLAS, CBLAS interfaces and test functions for gemm_batch and axpbyv APIs.#566
Conversation
Details: - gemm_batch API computes a series of GEMM for groups of general matrices. - Each group contains matrices with same parameters. - This API is part BLAS extension APIs. AMD-Internal: [CPUPL-1184] Change-Id: Ic23772830eb1d157da4db45158a039b0826419fd
AMD-Internal: [CPUPL-1184] Change-Id: Icc5c41429b0d92f1a66a955769cc0518ca4706ee
Details: - The axpby routines perform a vector-vector operation defined as y = a*x + b*y where a, b are scalars and x, y are vectors. - This API is part of BLAS-like extension APIs Change-Id: I17a53b03bba97de7ae1995a9f086084bd241bcdc AMD-Internal: [CPUPL-1118]
Details: - Corrected "#if" directive in line 89 - Commented out "#define print" to disable printing the vectors Change-Id: I9ec3cbfb716540dd3e2264f5c3925d9e0c0c294a
|
@Meghana-vankadari Thanks for this PR. We would like you to make one key change before we can merge this. We noticed that you used C preprocessor macro templates for the BLAS APIs to However, this PR also adds CBLAS APIs with explicitly defined functions for the There are also minor whitespace changes I'd like to make, but I can make those myself just prior to merging. Let us know if you have any questions. |
|
@fgvanzee, do we have any automation/script for whitespace/coding standard corrections? If it is available would like to integrate in our local git. |
|
@Meghana-vankadari @devinamatthews I'm having second thoughts about the templatization for the So now I'm thinking that it may be better to leave the code as-is. |
|
Well, it can be done with macros fairly easily but it doesn't really matter much either way. I am fine with non-templated code I guess as long as it is not our responsibility to maintain it. |
Agreed. |
Details: - Relocated cblas_?axpby.c and cblas_?gemm_batch.c files into a new 'extra' subdirectory within frame/compat/cblas/src. - Relocated bla_axpby.c and bla_gemm_batch.c (and their headers) into a new 'extra' subdirectory within frame/compat. - Reorganized prototypes and related definitions in cblas.h and cblas_f77.h. - Whitespace updates. - Comment updates.
Details: - Fixed a long-standing bug in common.mk that for some reason never manifested until now. Previously, CBLAS source files were compiled *without* the location of cblas.h being specified via a -I flag. I'm not sure why this worked, but it may be due to the fact that the cblas.h file resided in the same directory as all of the CBLAS source, and perhaps compilers implicitly add a -I flag for the directory that corresponds to the location of the source file being compiled. This bug only showed up because some CBLAS-like source code was moved into an 'extra' subdirectory of that frame/compat/cblas/src directory. After moving the code, compilation for those files failed (because the cblas.h header file, presumably, could not be found in the same location). This bug was fixed within common.mk by explicitly adding the cblas.h directory to the list of -I flags passed to the compiler. - Comment updates.
Details: - Expanded the BLAS compatibility layer to include support for ?axpby_() and ?gemm_batch_(). The former is a straightforward BLAS-like interface into the axpbyv operation while the latter implements a batched gemm via loops over bli_?gemm(). Also expanded the CBLAS compatibility layer to include support for cblas_?axpby() and cblas_?gemm_batch(), which serve as wrappers to the corresponding (new) BLAS-like APIs. Thanks to Meghana Vankadari for submitting these new APIs via flame#566. - Fixed a long-standing bug in common.mk that for some reason never manifested until now. Previously, CBLAS source files were compiled *without* the location of cblas.h being specified via a -I flag. I'm not sure why this worked, but it may be due to the fact that the cblas.h file resided in the same directory as all of the CBLAS source, and perhaps compilers implicitly add a -I flag for the directory that corresponds to the location of the source file being compiled. This bug only showed up because some CBLAS-like source code was moved into an 'extra' subdirectory of that frame/compat/cblas/src directory. After moving the code, compilation for those files failed (because the cblas.h header file, presumably, could not be found in the same location). This bug was fixed within common.mk by explicitly adding the cblas.h directory to the list of -I flags passed to the compiler. - Added test_axpbyv.c and test_gemm_batch.c files to 'test' directory, and updated test/Makefile to build those drivers. - Fixed typo in error message string in cblas_sgemm.c.
Details: - Expanded the BLAS compatibility layer to include support for ?axpby_() and ?gemm_batch_(). The former is a straightforward BLAS-like interface into the axpbyv operation while the latter implements a batched gemm via loops over bli_?gemm(). Also expanded the CBLAS compatibility layer to include support for cblas_?axpby() and cblas_?gemm_batch(), which serve as wrappers to the corresponding (new) BLAS-like APIs. Thanks to Meghana Vankadari for submitting these new APIs via #566. - Fixed a long-standing bug in common.mk that for some reason never manifested until now. Previously, CBLAS source files were compiled *without* the location of cblas.h being specified via a -I flag. I'm not sure why this worked, but it may be due to the fact that the cblas.h file resided in the same directory as all of the CBLAS source, and perhaps compilers implicitly add a -I flag for the directory that corresponds to the location of the source file being compiled. This bug only showed up because some CBLAS-like source code was moved into an 'extra' subdirectory of that frame/compat/cblas/src directory. After moving the code, compilation for those files failed (because the cblas.h header file, presumably, could not be found in the same location). This bug was fixed within common.mk by explicitly adding the cblas.h directory to the list of -I flags passed to the compiler. - Added test_axpbyv.c and test_gemm_batch.c files to 'test' directory, and updated test/Makefile to build those drivers. - Fixed typo in error message string in cblas_sgemm.c. - (cherry picked from commit 7bc8ab4)
Added BLAS, CBLAS interfaces and test functions for gemm_batch and axpbyv APIs.
AMD BLIS Upstream:
This PR includes following commits for AMD BLIS version 3.0.1
f001b18 Added few changes in test_axpbyv.c file
3440b84 Added BLAS and CBLAS interfaces for axpby API
9869388 Added CBLAS interface and test file for gemm_batch API
9b4be5f Added BLAS interface for gemm_batch API