Is this a duplicate?
Area
Not sure
Is your feature request related to a problem? Please describe.
Usage of cuda.parallel in applications like llm.c (example) is currently blocked by the lack of support for cache-modified iterators.
Describe the solution you'd like
We need a functional alternative to cache-modified iterators in cuda.parallel.itertools. The design might follow the API that @fbusato came up with in #2487. For instance:
```python
d_input = cp.array([8, 6, 7, 5, 3, 0, 9], dtype=dtype)
d_streaming_input = cudax.itertools.accessor(d_input, "eviction_policy::no_allocation")
cudax.reduce(d_streaming_input)
```

should lead to streaming loads of `d_input` (the `ld.global.cs` instruction in PTX).
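To make the proposal concrete, here is a minimal host-side sketch of what the `accessor` wrapper could look like. This is purely hypothetical: `StreamingAccessor` and the `accessor` factory are illustrative names, not part of any existing cuda.parallel API, and the actual implementation would need to thread the eviction policy through to codegen so the load is emitted as `ld.global.cs`.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class StreamingAccessor:
    # Hypothetical wrapper pairing a device array with an eviction
    # policy. An algorithm like cudax.reduce would inspect
    # `eviction_policy` and lower loads from `array` to
    # cache-modified PTX loads (e.g. ld.global.cs for streaming).
    array: Any
    eviction_policy: str

def accessor(array: Any, eviction_policy: str) -> StreamingAccessor:
    # Hypothetical factory mirroring the API sketched above;
    # validation of the policy string is an assumption.
    known = {
        "eviction_policy::no_allocation",
        "eviction_policy::first",
        "eviction_policy::last",
    }
    if eviction_policy not in known:
        raise ValueError(f"unknown eviction policy: {eviction_policy}")
    return StreamingAccessor(array, eviction_policy)
```

An algorithm entry point such as `cudax.reduce` would then dispatch on the wrapper type rather than a raw array, keeping the default (non-streaming) path unchanged.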
Describe alternatives you've considered
No response
Additional context
No response