0.2.0 release #231
Conversation
Closes #172.
Both Spheral and Tribol would like a new release, plus the original release didn't have the Python stuff.
```diff
 depends_on('python +shared +pic', when='+pylvarray')
 depends_on('py-numpy@1.19: +blas +lapack +force-parallel-build', when='+pylvarray')
-depends_on('py-scipy@1.5.2: +force-parallel-build', when='+pylvarray')
+# depends_on('py-scipy@1.5.2: +force-parallel-build', when='+pylvarray')
```
Intentional?
No, thanks!
I don't understand. Was the exclusion of scipy intentional? If this is the Python that must be used when interfacing with LvArray, then should we be including a large collection of packages like scipy, matplotlib, etc.?
No, I was testing out the NumPy build on Lassen, which I thought I had working with ESSL, but I guess not. SciPy takes a while to build, so I commented it out. We should only list packages here that need to be built from source; everything else can and should be pip installed.
```diff
 */
 void reserve( INDEX_TYPE const newCapacity )
-{ bufferManipulation::reserve( this->m_dataBuffer, this->size(), newCapacity ); }
+{ bufferManipulation::reserve( this->m_dataBuffer, this->size(), MemorySpace::host, newCapacity ); }
```
Shouldn't the space be propagated to the outer interface?
I don't think so. At the moment the only way to allocate on device (without moving from host) is to call resizeWithoutInitializationOrDestruction on an empty Array; if the Array already has an allocation (on host or device), it won't work. This is probably the majority use case, and if you want to resize a device-only Array you can create a new device-only Array and copy the values over. Doing a more general resize/reallocation on device without launching a kernel is tricky, and not possible in all cases, such as any time you have a default value. I'd certainly like to expand this capability, and as soon as there's a use case it should be easy to add.

Doing the single-parameter resize where the resize dimension is not the largest stride is serial (basically inserting new values in a bunch of places), and doing so on device would involve either creating a new device allocation each time or moving it back to host and doing the work there.
```cpp
 * If the buffer is allocated using Umpire then the Umpire ResourceManager is used, otherwise std::memset is used.
 * @note The memset occurs in the last space the array was used in and the view is moved and touched in that space.
 */
inline void zero() const
```
Do we want to allow setting the memory space?
Definitely, since it can avoid a move. I think the best way to do this is to have registerTouch allocate in the given space if it didn't already exist and then touch it there. That way all memcpy has to do is replace move( space, true ) with registerTouch( space ). I don't really want to add that to this PR, though, so I'll create an issue.
* Added Spack config for Summit.
* Added allocator construction to ChaiBuffer and buffer construction to Array.
* Added half precision math functions.
* Updated RAJA to 0.13.0.
* Added type conversion to ArrayView.
* Improved CUDA error macro messages.
* Added memcpy and memset functions.
* Use camp Platform for MemorySpace.
* Copyright update.
* Spack and release changes.
* Removed need for matrix output macro.
* Memcpy changes.
* Uncrustify changes.
* Removed constexpr from math functions.
* Changes for CUDA 11.
* Updated release date.
Creates a new release as well as adding a bunch of new features:
* Added a way to specify the allocators a `ChaiBuffer` should use, and construct an `Array` with that buffer.
* Added the ability to allocate an `Array` directly on device. This only works when the `Array` was previously empty, but this should be the most common use case.
* Added a type conversion from `ArrayView< T, N, NDIM >` to `ArrayView< U, N, NDIM >`; the use case I had was to convert between arrays of `__half` and `__half2`.
* Added the ability to use `LVARRAY_ERROR` (or `ASSERT`) on device.
* Added the ability to `memcpy` an `ArraySlice`; it uses Umpire so it works between memory spaces.
* Added a `zero` method to `ArrayView` and `CRSMatrixView` that will use `memset` instead of launching a kernel to zero it out.
* `arrayManipulation::resize` (which is used by pretty much every class) now uses `memset` when it will be zero initializing memory. This should be a lot faster for large arrays; it's possible it will be slower for really small things, but it would be easy enough to add a different path for small cases.
* Replaced the `MemorySpace` enum with `using MemorySpace = camp::resource::Platform`.