
Conversation

@corbett5
Collaborator

@corbett5 corbett5 commented Apr 23, 2021

Creates a new release and adds a number of new features:

  • You can specify the allocators a specific ChaiBuffer should use, and construct an Array with that buffer.
  • You can allocate an Array directly on device. This only works when the Array was previously empty, but that should be the most common use case.
  • Added half precision math functions.
  • Added a conversion from ArrayView< T, N, NDIM > to ArrayView< U, N, NDIM >; the use case I had was converting between arrays of __half and __half2.
  • Improved the error logging when you hit an LVARRAY_ERROR (or ASSERT) on device.
  • Added functions to memcpy an ArraySlice; they use Umpire, so they work between memory spaces.
  • Added a zero method to ArrayView and CRSMatrixView that uses memset instead of launching a kernel to zero the values.
  • arrayManipulation::resize (which is used by pretty much every class) now uses memset when it zero-initializes memory. This should be a lot faster for large arrays; it may be slower for very small ones, but it would be easy to add a separate path for that case.
  • Replaced the MemorySpace enum with using MemorySpace = camp::resource::Platform.

@corbett5 corbett5 force-pushed the corbett/tc-update branch from 310260f to 33b6aa3 Compare April 28, 2021 09:12
@corbett5
Collaborator Author

Closes #172.

@corbett5 corbett5 force-pushed the corbett/tc-update branch from 152a525 to 5efa405 Compare April 29, 2021 21:49
@corbett5 corbett5 force-pushed the corbett/tc-update branch from 5efa405 to 4cc725d Compare April 29, 2021 22:21
@corbett5 corbett5 marked this pull request as ready for review April 29, 2021 22:32
@corbett5 corbett5 requested a review from rrsettgast April 29, 2021 22:32
@corbett5 corbett5 changed the title Corbett/tc update 0.2.0 release Apr 29, 2021
@corbett5
Collaborator Author

Both Spheral and Tribol would like a new release, plus the original release didn't include the Python functionality.

depends_on('python +shared +pic', when='+pylvarray')
depends_on('py-numpy@1.19: +blas +lapack +force-parallel-build', when='+pylvarray')
depends_on('py-scipy@1.5.2: +force-parallel-build', when='+pylvarray')
# depends_on('py-scipy@1.5.2: +force-parallel-build', when='+pylvarray')
Member

Intentional?

Collaborator Author

No, thanks!

Member

I don't understand. Was the exclusion of scipy intentional? If this is the python that must be used when interfacing with LvArray, then should we be including a large collection of packages like scipy, matplotlib, etc.?

Collaborator Author

No, I was testing out the NumPy build on Lassen, which I thought I had working with ESSL, but I guess not. SciPy takes a while to build, so I commented it out. We should only list packages here that need to be built from source; everything else can and should be pip installed.

*/
void reserve( INDEX_TYPE const newCapacity )
{ bufferManipulation::reserve( this->m_dataBuffer, this->size(), newCapacity ); }
{ bufferManipulation::reserve( this->m_dataBuffer, this->size(), MemorySpace::host, newCapacity ); }
Member

shouldn't the space be propagated to the outer interface?

Collaborator Author

I don't think so. At the moment the only way to allocate on device (without moving from host) is to call resizeWithoutInitializationOrDestruction on an empty Array. If the Array already has an allocation (on host or device) it won't work. This covers the majority use case, and if you want to resize a device-only Array you can create a new device-only Array and copy the values over. Doing a more general resize/reallocation on device without launching a kernel is tricky and not possible in all cases, for example whenever there is a default value. I'd certainly like to expand this capability, and as soon as there's a use case it should be easy to add.

Doing the single-parameter resize where the resize dimension is not the one with the largest stride is serial (it basically inserts new values in many places), and doing so on device would involve either creating a new device allocation each time or moving the data back to host and doing the work there.

* If the buffer is allocated using Umpire then the Umpire ResourceManager is used, otherwise std::memset is used.
* @note The memset occurs in the last space the array was used in and the view is moved and touched in that space.
*/
inline void zero() const
Member

do we want to allow setting the memory space?

Collaborator Author

Definitely, since it can avoid a move. I think the best way to do this is to have registerTouch allocate in the given space if the allocation doesn't already exist and then touch it there. That way all memcpy has to do is replace move( space, true ) with registerTouch( space ). I don't want to add that to this PR though, so I'll create an issue.

@corbett5 corbett5 mentioned this pull request May 2, 2021
@corbett5 corbett5 merged commit ab0d302 into develop May 6, 2021
@corbett5 corbett5 deleted the corbett/tc-update branch May 6, 2021 06:35
corbett5 added a commit that referenced this pull request May 6, 2021
* Added Spack config for Summit.

* Added allocator construction to ChaiBuffer and buffer construction to Array.

* Added half precision math functions.

* Updated RAJA to 0.13.0

* Added type conversion to ArrayView.

* Improved CUDA error macro messages.

* Added memcpy and memset functions.

* Use camp Platform for MemorySpace.

* Copyright update.

* Spack and release changes.

* Removed need for matrix output macro.

* Memcpy changes.

* Uncrustify changes.

* Removed constexpr from math functions.

* Changes for CUDA 11.

* Updated release date.