This repository was archived by the owner on Oct 7, 2025. It is now read-only.

Conversation

@penpornk (Member)

We would love to hear your thoughts on the OpenXLA CPU Strategy. This is a master RFC covering the overall plan; follow-up RFCs will flesh out specific details.

For easier discussion, we have put the full RFC in a Google Doc. Please feel free to leave comments/questions directly on the doc or post a longer-form comment here. The feedback period will be open until October 31, 2023. Thank you very much!

@jpienaar merged commit b32b2e0 into openxla:main on Jan 18, 2024
copybara-service bot pushed a commit to jax-ml/jax that referenced this pull request Jul 26, 2024
XLA:CPU is preparing to switch from compiling the whole XLA program into a single LLVM function to a mode where each fusion/kernel has its own entry point and a thin runtime dispatches compute functions concurrently. This execution mode does not work well with while loops that run tiny computations for a large number of iterations. As with the GPU backend, use vmap to avoid excessive runtime overheads.

Context: openxla/community#96
PiperOrigin-RevId: 656119575
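
As a rough illustration of the commit message's advice, here is a minimal JAX sketch (not code from this PR; `tiny_update` and the shapes are hypothetical). It contrasts a fine-grained loop, which pays one kernel dispatch per iteration under a per-fusion runtime, with a vmapped version that batches the same work into a single kernel:

```python
import jax
import jax.numpy as jnp
from jax import lax

def tiny_update(x):
    # A deliberately tiny computation: as a loop body, per-iteration
    # dispatch overhead dominates the actual compute.
    return x * 2.0 + 1.0

xs = jnp.arange(10_000, dtype=jnp.float32)

@jax.jit
def loop_version(xs):
    # One tiny kernel dispatch per iteration.
    def body(i, acc):
        return acc.at[i].set(tiny_update(xs[i]))
    return lax.fori_loop(0, xs.shape[0], body, jnp.zeros_like(xs))

@jax.jit
def vmapped_version(xs):
    # vmap turns the per-element update into one batched kernel,
    # amortizing runtime dispatch cost across all elements.
    return jax.vmap(tiny_update)(xs)

assert jnp.allclose(loop_version(xs), vmapped_version(xs))
```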
copybara-service bot pushed a commit to jax-ml/jax that referenced this pull request Jul 26, 2024 (same commit message; PiperOrigin-RevId: 656199716)
nitins17 pushed a commit to google-ml-infra/jax-fork that referenced this pull request Aug 27, 2024 (same commit message; PiperOrigin-RevId: 656199716)