We have to talk about:
- support for both CUDA and ROCm
- Array API: how CuPy arrays can still work with NumPy functions through `__array_ufunc__` and `__array_function__`
- stream execution
`__array_ufunc__` and `__array_function__`
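
To make the dispatch idea concrete, here is a minimal sketch of the two hooks using plain NumPy only, so it runs without a GPU. `DeviceArray` is a hypothetical stand-in for an array type such as `cupy.ndarray`; it is not CuPy's implementation, and it "computes" by forwarding to NumPy on a wrapped buffer where a real GPU library would call its own kernels.

```python
import numpy as np


class DeviceArray:
    """Hypothetical stand-in for a GPU array type; wraps a NumPy buffer."""

    def __init__(self, data):
        self.data = np.asarray(data)

    @staticmethod
    def _unwrap(obj):
        # Recursively replace DeviceArray instances with their buffers,
        # so nested arguments like np.concatenate([a, b]) also work.
        if isinstance(obj, DeviceArray):
            return obj.data
        if isinstance(obj, (list, tuple)):
            return type(obj)(DeviceArray._unwrap(o) for o in obj)
        return obj

    @staticmethod
    def _wrap(result):
        # Keep array results in the custom type; pass scalars through.
        return DeviceArray(result) if isinstance(result, np.ndarray) else result

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # NumPy calls this for ufuncs such as np.sqrt or np.add.
        if "out" in kwargs:
            return NotImplemented  # out= is not handled in this sketch
        args = self._unwrap(inputs)
        return self._wrap(getattr(ufunc, method)(*args, **kwargs))

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this for non-ufunc functions such as np.sum
        # or np.concatenate.
        return self._wrap(func(*self._unwrap(args), **self._unwrap(kwargs)))


a = DeviceArray([1.0, 4.0, 9.0])
roots = np.sqrt(a)              # routed through __array_ufunc__
joined = np.concatenate([a, a]) # routed through __array_function__
print(type(roots).__name__, type(joined).__name__)
```

The point for the talk is that the caller writes ordinary `np.*` calls and the array type decides where the work happens. With CuPy the same calls would be handed to CuPy's own implementations, assuming CuPy provides an equivalent of the NumPy function being called.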