Skip to content

Added BLAS, CBLAS interfaces and test functions for gemm_batch and axpbyv APIs.#566

Merged
fgvanzee merged 6 commits intoflame:masterfrom
Meghana-vankadari:new_apis
Nov 11, 2021
Merged

Added BLAS, CBLAS interfaces and test functions for gemm_batch and axpbyv APIs.#566
fgvanzee merged 6 commits intoflame:masterfrom
Meghana-vankadari:new_apis

Conversation

@Meghana-vankadari
Copy link
Copy Markdown
Contributor

Added BLAS, CBLAS interfaces and test functions for gemm_batch and axpbyv APIs.

AMD BLIS Upstream:

This PR includes following commits for AMD BLIS version 3.0.1

f001b18 Added few changes in test_axpbyv.c file
3440b84 Added BLAS and CBLAS interfaces for axpby API
9869388 Added CBLAS interface and test file for gemm_batch API
9b4be5f Added BLAS interface for gemm_batch API

Details:
- gemm_batch API computes a series of GEMM for groups of general
  matrices.
- Each group contains matrices with same parameters.
- This API is part BLAS extension APIs.

AMD-Internal: [CPUPL-1184]
Change-Id: Ic23772830eb1d157da4db45158a039b0826419fd
AMD-Internal: [CPUPL-1184]
Change-Id: Icc5c41429b0d92f1a66a955769cc0518ca4706ee
Details:
- The axpby routines perform a vector-vector operation defined as
  y = a*x + b*y where a, b are scalars and x, y are vectors.
- This API is part of BLAS-like extension APIs

Change-Id: I17a53b03bba97de7ae1995a9f086084bd241bcdc
AMD-Internal: [CPUPL-1118]
Details:
- Corrected "#if" directive in line 89
- Commented out "#define print" to disable printing the vectors

Change-Id: I9ec3cbfb716540dd3e2264f5c3925d9e0c0c294a
@Meghana-vankadari Meghana-vankadari changed the title New apis Added BLAS, CBLAS interfaces and test functions for gemm_batch and axpbyv APIs. Oct 26, 2021
@fgvanzee
Copy link
Copy Markdown
Member

@Meghana-vankadari Thanks for this PR. We would like you to make one key change before we can merge this.

We noticed that you used C preprocessor macro templates for the BLAS APIs to axpbyv and gemm_batch. This is good.

However, this PR also adds CBLAS APIs with explicitly defined functions for the s, d, c, and z datatypes. I can understand why you may have chosen to do it this way since it matches the style of the existing CBLAS code (which itself is based on netlib code), but we've decided that we would like these new CBLAS APIs (for axpbyv and gemm_batch) to use C preprocessor templates as well. We think this will reduce code maintenance going forward.

There are also minor whitespace changes I'd like to make, but I can make those myself just prior to merging.

Let us know if you have any questions.

@dzambare
Copy link
Copy Markdown
Contributor

@fgvanzee, do we have any automation/script for whitespace/coding standard corrections? If it is available would like to integrate in our local git.

@fgvanzee
Copy link
Copy Markdown
Member

@dzambare We do not have any scripts, unfortunately, but the coding conventions are outlined here.

@fgvanzee
Copy link
Copy Markdown
Member

fgvanzee commented Nov 10, 2021

@Meghana-vankadari @devinamatthews I'm having second thoughts about the templatization for the axpby and gemm_batch CBLAS APIs. The reason is that in order to follow CBLAS's calling conventions, the real domain matrices and vectors are passed via typed pointers (e.g. const float* and const double*), but for the complex domain interfaces, matrices and vectors are passed via const void* pointers. This can't easily be encoded into a cpp macro template.

So now I'm thinking that it may be better to leave the code as-is.

@devinamatthews
Copy link
Copy Markdown
Member

Well, it can be done with macros fairly easily but it doesn't really matter much either way. I am fine with non-templated code I guess as long as it is not our responsibility to maintain it.

@fgvanzee
Copy link
Copy Markdown
Member

I am fine with non-templated code I guess as long as it is not our responsibility to maintain it.

Agreed.

Details:
- Relocated cblas_?axpby.c and cblas_?gemm_batch.c files into a new
  'extra' subdirectory within frame/compat/cblas/src.
- Relocated bla_axpby.c and bla_gemm_batch.c (and their headers) into
  a new 'extra' subdirectory within frame/compat.
- Reorganized prototypes and related definitions in cblas.h and
  cblas_f77.h.
- Whitespace updates.
- Comment updates.
Details:
- Fixed a long-standing bug in common.mk that for some reason never
  manifested until now. Previously, CBLAS source files were compiled
  *without* the location of cblas.h being specified via a -I flag.
  I'm not sure why this worked, but it may be due to the fact that
  the cblas.h file resided in the same directory as all of the CBLAS
  source, and perhaps compilers implicitly add a -I flag for the
  directory that corresponds to the location of the source file being
  compiled. This bug only showed up because some CBLAS-like source code
  was moved into an 'extra' subdirectory of that frame/compat/cblas/src
  directory. After moving the code, compilation for those files failed
  (because the cblas.h header file, presumably, could not be found in
  the same location). This bug was fixed within common.mk by explicitly
  adding the cblas.h directory to the list of -I flags passed to the
  compiler.
- Comment updates.
@fgvanzee fgvanzee merged commit 7bc8ab4 into flame:master Nov 11, 2021
dzambare pushed a commit to Meghana-vankadari/blis that referenced this pull request Jan 6, 2022
Details:
- Expanded the BLAS compatibility layer to include support for 
  ?axpby_() and ?gemm_batch_(). The former is a straightforward
  BLAS-like interface into the axpbyv operation while the latter
  implements a batched gemm via loops over bli_?gemm(). Also
  expanded the CBLAS compatibility layer to include support for
  cblas_?axpby() and cblas_?gemm_batch(), which serve as wrappers to 
  the corresponding (new) BLAS-like APIs. Thanks to Meghana Vankadari
  for submitting these new APIs via flame#566.
- Fixed a long-standing bug in common.mk that for some reason never
  manifested until now. Previously, CBLAS source files were compiled
  *without* the location of cblas.h being specified via a -I flag.
  I'm not sure why this worked, but it may be due to the fact that
  the cblas.h file resided in the same directory as all of the CBLAS
  source, and perhaps compilers implicitly add a -I flag for the
  directory that corresponds to the location of the source file being
  compiled. This bug only showed up because some CBLAS-like source code
  was moved into an 'extra' subdirectory of that frame/compat/cblas/src
  directory. After moving the code, compilation for those files failed
  (because the cblas.h header file, presumably, could not be found in
  the same location). This bug was fixed within common.mk by explicitly
  adding the cblas.h directory to the list of -I flags passed to the
  compiler.
- Added test_axpbyv.c and test_gemm_batch.c files to 'test' directory,
  and updated test/Makefile to build those drivers.
- Fixed typo in error message string in cblas_sgemm.c.
fgvanzee added a commit that referenced this pull request Sep 10, 2022
Details:
- Expanded the BLAS compatibility layer to include support for
  ?axpby_() and ?gemm_batch_(). The former is a straightforward
  BLAS-like interface into the axpbyv operation while the latter
  implements a batched gemm via loops over bli_?gemm(). Also
  expanded the CBLAS compatibility layer to include support for
  cblas_?axpby() and cblas_?gemm_batch(), which serve as wrappers to
  the corresponding (new) BLAS-like APIs. Thanks to Meghana Vankadari
  for submitting these new APIs via #566.
- Fixed a long-standing bug in common.mk that for some reason never
  manifested until now. Previously, CBLAS source files were compiled
  *without* the location of cblas.h being specified via a -I flag.
  I'm not sure why this worked, but it may be due to the fact that
  the cblas.h file resided in the same directory as all of the CBLAS
  source, and perhaps compilers implicitly add a -I flag for the
  directory that corresponds to the location of the source file being
  compiled. This bug only showed up because some CBLAS-like source code
  was moved into an 'extra' subdirectory of that frame/compat/cblas/src
  directory. After moving the code, compilation for those files failed
  (because the cblas.h header file, presumably, could not be found in
  the same location). This bug was fixed within common.mk by explicitly
  adding the cblas.h directory to the list of -I flags passed to the
  compiler.
- Added test_axpbyv.c and test_gemm_batch.c files to 'test' directory,
  and updated test/Makefile to build those drivers.
- Fixed typo in error message string in cblas_sgemm.c.
- (cherry picked from commit 7bc8ab4)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants