Conversation

@valechen (Contributor) commented on Aug 4, 2025

  1. Move "calculate_max_output_tiles_analytically" to lean_atten.py
  2. Combine "get_num_splits_and_buffer_sizes" and "get_lean_attention_params" into a single helper
  3. Create an _attention_inner() @jit function for the inner-loop attention calculation (sketched below)
  4. Add handling for the total_tiles < num_SMs case (see the host-side sketch below)
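
For item 3, the idea is to factor the per-tile softmax/accumulate update into its own `@triton.jit` helper so the outer loop only handles tile indexing. Below is a minimal sketch of such a helper, assuming an online-softmax update over one K/V tile; the argument names and signature are illustrative assumptions, not the actual `_attention_inner()` added in lean_atten.py.

```python
import triton
import triton.language as tl


@triton.jit
def _attention_inner(q, k, v, acc, m_i, l_i, qk_scale):
    # One K/V tile of the online-softmax attention update.
    # Assumed shapes: q [BLOCK_M, HEAD_DIM], k [HEAD_DIM, BLOCK_N] (pre-transposed),
    # v [BLOCK_N, HEAD_DIM], acc [BLOCK_M, HEAD_DIM], m_i / l_i [BLOCK_M].
    qk = tl.dot(q, k) * qk_scale              # scores for this tile
    m_new = tl.maximum(m_i, tl.max(qk, 1))    # updated running row max
    alpha = tl.exp(m_i - m_new)               # rescale factor for the old state
    p = tl.exp(qk - m_new[:, None])           # unnormalized probabilities
    acc = acc * alpha[:, None] + tl.dot(p.to(v.dtype), v)
    l_i = l_i * alpha + tl.sum(p, 1)
    return acc, m_new, l_i
```

Returning the `(acc, m, l)` triple keeps the online-softmax state explicit, so the caller can carry it across tiles and do the final `acc / l[:, None]` normalization once after the loop.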

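Item 4 is a launch-side concern: for small problems the total number of lean tiles can be smaller than the number of SMs, so a fixed grid of num_SMs CTAs would leave some CTAs with no work. One way to handle that is to clamp the grid and partition the tiles evenly; the sketch below is an assumed illustration (function name and return shape are not from the PR).

```python
def assign_tiles_to_ctas(total_tiles: int, num_SMs: int):
    """Return (num_ctas, list of per-CTA [start, end) tile ranges)."""
    if total_tiles == 0:
        return 0, []
    # Launch no more CTAs than there are tiles, so no CTA gets an empty range.
    num_ctas = min(num_SMs, total_tiles)
    base, rem = divmod(total_tiles, num_ctas)
    ranges, start = [], 0
    for cta in range(num_ctas):
        count = base + (1 if cta < rem else 0)  # spread the remainder over the first CTAs
        ranges.append((start, start + count))
        start += count
    return num_ctas, ranges
```

For example, `assign_tiles_to_ctas(10, 304)` yields a grid of 10 CTAs with one tile each, instead of launching 304 CTAs of which 294 would exit immediately.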
@valechen requested a review from rahulbatra85 on Aug 4, 2025 22:28
@valechen requested a review from vgokhale on Aug 5, 2025 16:26
@valechen merged commit 86a6fd2 into main on Aug 6, 2025
14 checks passed
@valechen deleted the la_opt2 branch on Aug 6, 2025 16:03