@brunomazzottiamd brunomazzottiamd commented Dec 18, 2025

Motivation

The Triton test step of AITER CI takes too long and has become a bottleneck for merging PRs. We should programmatically select which Triton tests to run, based on the diff content of a given PR. The full test suite can be executed periodically on the main branch only.

This PR proposes a Python script that, given

  • a source branch with changes to be merged
  • a target branch to merge into
  • the AITER codebase

automatically selects which Triton tests validate the changes to be merged.
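As a rough illustration of the first step (finding what changed between the two branches), here is a minimal sketch. The helper names `build_diff_command`, `parse_diff_output` and `changed_files` are hypothetical and are not taken from the script; the use of the three-dot `git diff` form is also an assumption.

```python
import subprocess
from pathlib import PurePosixPath


def build_diff_command(source: str, target: str) -> list[str]:
    """Build a git command listing files changed on `source` since it diverged from `target`."""
    # Assumption: the three-dot form compares against the merge base of both branches.
    return ["git", "diff", "--name-only", f"{target}...{source}"]


def parse_diff_output(stdout: str) -> list[PurePosixPath]:
    """Turn the output of `git diff --name-only` into a list of paths, skipping blank lines."""
    return [PurePosixPath(line.strip()) for line in stdout.splitlines() if line.strip()]


def changed_files(source: str, target: str = "main") -> list[PurePosixPath]:
    """Run git and return the changed files (needs a git repository with both branches)."""
    result = subprocess.run(
        build_diff_command(source, target),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails, e.g. on an unknown branch
    )
    return parse_diff_output(result.stdout)
```

Keeping the command construction and the output parsing separate from the `subprocess.run` call makes both pieces testable without a git checkout.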

Technical Details

Script help text:

$ python .github/scripts/select_triton_tests.py --help
usage: select_triton_tests.py [-h] [-s SOURCE] [-t TARGET] [-v ENV_VAR] [-f ENV_FILE] [-l {critical,error,warning,info,debug,off}]

select which Triton tests to run based on git diff

options:
  -h, --help            show this help message and exit
  -s SOURCE, --source SOURCE
                        source branch, defaults to current branch
  -t TARGET, --target TARGET
                        target branch, defaults to main
  -v ENV_VAR, --env-var ENV_VAR
                        environment variable to store which tests to run, defaults to TRITON_TEST
  -f ENV_FILE, --env-file ENV_FILE
                        environment file to write, won't write anything if absent
  -l {critical,error,warning,info,debug,off}, --log-level {critical,error,warning,info,debug,off}
                        log level to enable (default: info)
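For readers who want to see how such a command line could be declared, the following is a minimal `argparse` sketch that mirrors the options in the help text above. It is not the script's actual code: `build_parser` and `write_env_file` are hypothetical names, and the space-separated `NAME=value` layout of the environment file is an assumption.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence

LOG_LEVELS = ("critical", "error", "warning", "info", "debug", "off")


def build_parser() -> argparse.ArgumentParser:
    """Declare the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="select_triton_tests.py",
        description="select which Triton tests to run based on git diff",
    )
    parser.add_argument("-s", "--source", help="source branch, defaults to current branch")
    parser.add_argument("-t", "--target", default="main", help="target branch, defaults to main")
    parser.add_argument(
        "-v", "--env-var", default="TRITON_TEST",
        help="environment variable to store which tests to run, defaults to TRITON_TEST",
    )
    parser.add_argument(
        "-f", "--env-file", type=Path,
        help="environment file to write, won't write anything if absent",
    )
    parser.add_argument(
        "-l", "--log-level", choices=LOG_LEVELS, default="info",
        help="log level to enable (default: info)",
    )
    return parser


def write_env_file(env_var: str, tests: Sequence[str], env_file: Optional[Path]) -> None:
    """Write `NAME=value` to the environment file, or do nothing when no file was given."""
    if env_file is None:
        return
    # Assumption: selected test paths are joined with spaces so pytest receives them as arguments.
    env_file.write_text(f"{env_var}={' '.join(tests)}\n", encoding="utf-8")
```

With no arguments, `build_parser().parse_args([])` yields `target="main"`, `env_var="TRITON_TEST"`, `log_level="info"`, and `None` for both `source` and `env_file`.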

Implementation details:

  • Uses the Python subprocess module to run git commands and determine what changed between the source and target branches.
  • Uses the Python pathlib module to list all Triton source files in the AITER code base (kernels, kernel configurations, unit tests, benchmark scripts).
  • Uses the Python ast module to recursively parse the Triton source files and track all dependency relations. A dependency can link two source files, or a kernel configuration file and a kernel source file.
  • The dependency relations are encoded in a networkx directed graph. The graph is then traversed starting from the files in the diff until unit tests are reached. Every reachable unit test should be executed to validate the proposed changes.
  • No code is executed; the entire dependency graph is built through static analysis.
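The static analysis and traversal described above can be sketched as follows. This is a simplified illustration, not the script itself: it uses a plain dictionary and a breadth-first search from the standard library in place of networkx, it only follows absolute imports, and the module names in the example are made up. With networkx, the same reachability query could be expressed with `networkx.descendants` on a directed graph whose edges point from a dependency to its dependents (the edge direction is an assumption).

```python
import ast
from collections import deque
from typing import Callable, Iterable


def imported_modules(source_code: str) -> set[str]:
    """Collect absolute module names imported by a source string, without executing it."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source_code)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules


def build_reverse_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each known module to the set of known modules that import it."""
    dependents: dict[str, set[str]] = {name: set() for name in sources}
    for name, code in sources.items():
        for dep in imported_modules(code):
            if dep in dependents:  # ignore third-party imports such as triton
                dependents[dep].add(name)
    return dependents


def reachable_tests(
    dependents: dict[str, set[str]],
    changed: Iterable[str],
    is_test: Callable[[str], bool],
) -> set[str]:
    """Breadth-first search from the changed modules, keeping only the test modules reached."""
    seen: set[str] = set()
    queue = deque(c for c in changed if c in dependents)  # skip files missing from the graph
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(dependents[node] - seen)
    return {n for n in seen if is_test(n)}


# Hypothetical module names, for illustration only.
sources = {
    "kernels.rope": "import triton\n",
    "kernels.gmm": "",
    "ops.rope": "from kernels.rope import rope_kernel\n",
    "tests.test_rope": "import ops.rope\n",
    "tests.test_gmm": "import kernels.gmm\n",
}
graph = build_reverse_graph(sources)
selected = reachable_tests(graph, ["kernels.rope"], lambda n: n.startswith("tests."))
print(sorted(selected))  # ['tests.test_rope']
```

A changed file that is absent from the graph is silently skipped here; a real tool would more likely log a warning in that case.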

Test Plan

Tested locally with two scenarios:

  • Test case A - changed the following Triton source files:
    • aiter/ops/triton/_triton_kernels/mha.py → MHA forward kernel
    • aiter/ops/triton/_triton_kernels/mha_onekernel_bwd.py → MHA backward kernel
    • aiter/ops/triton/_triton_kernels/rope.py → RoPE kernel
    • aiter/ops/triton/configs/gfx942-GMM.json → GMM kernel configuration
    • op_tests/op_benchmarks/triton/bench_topk.py → Top-k benchmark script
  • Test case B - changed files other than Triton sources.
  • Test case C - this feature branch, i.e. bmazzott/run-triton-tests-selectively, which has changes in:
    • aiter/ops/triton/_triton_kernels/fused_gemm_afp4wfp4_a16w16.py → standardization of get_gemm_config arguments
    • aiter/ops/triton/_triton_kernels/fused_gemm_afp4wfp4_mul_add.py → standardization of get_gemm_config arguments
    • aiter/ops/triton/_triton_kernels/gemm_afp4wfp4.py → standardization of get_gemm_config arguments
    • aiter/ops/triton/_triton_kernels/gmm.py → standardization of kernel config file pattern
  • Test case D - changed an untested Triton kernel:
    • aiter/ops/triton/_triton_kernels/pod_attention.py → untested kernel

Test Result

Summarized script output:

Test case A:

INFO|There are 5 Triton source files in the diff:
INFO|* aiter\ops\triton\_triton_kernels\mha.py
INFO|* aiter\ops\triton\_triton_kernels\mha_onekernel_bwd.py
INFO|* aiter\ops\triton\_triton_kernels\rope.py
INFO|* aiter\ops\triton\configs\gfx942-GMM.json
INFO|* op_tests\op_benchmarks\triton\bench_topk.py
INFO|There are 7 tests reachable from the Triton diff:
INFO|* op_tests\triton_tests\attention\test_mha.py
INFO|* op_tests\triton_tests\fusions\test_fused_kv_cache.py
INFO|* op_tests\triton_tests\fusions\test_fused_qk_concat.py
INFO|* op_tests\triton_tests\rope\test_fused_qkv_split_qk_rope.py
INFO|* op_tests\triton_tests\rope\test_rope.py
INFO|* op_tests\triton_tests\test_gmm.py
INFO|* op_tests\triton_tests\test_topk.py
INFO|Finished, execution took 7.41 seconds.

Test case B:

INFO|There are no Triton source files in diff, there's no need to run Triton tests.

Test case C:

INFO|There are 4 Triton source files in the diff:
INFO|* aiter\ops\triton\_triton_kernels\fused_gemm_afp4wfp4_a16w16.py
INFO|* aiter\ops\triton\_triton_kernels\fused_gemm_afp4wfp4_mul_add.py
INFO|* aiter\ops\triton\_triton_kernels\gemm_afp4wfp4.py
INFO|* aiter\ops\triton\_triton_kernels\gmm.py
INFO|There are 9 tests reachable from the Triton diff:
INFO|* op_tests\triton_tests\gemm\basic\test_gemm_a16wfp4.py
INFO|* op_tests\triton_tests\gemm\basic\test_gemm_afp4wfp4.py
INFO|* op_tests\triton_tests\gemm\batched\test_batched_gemm_a16wfp4.py
INFO|* op_tests\triton_tests\gemm\fused\test_fused_gemm_afp4wfp4_a16w16.py
INFO|* op_tests\triton_tests\gemm\fused\test_fused_gemm_afp4wfp4_mul_add.py
INFO|* op_tests\triton_tests\gemm\fused\test_fused_gemm_afp4wfp4_split_cat.py
INFO|* op_tests\triton_tests\quant\test_fused_mxfp4_quant.py
INFO|* op_tests\triton_tests\test_activation.py
INFO|* op_tests\triton_tests\test_gmm.py
INFO|Finished, execution took 2.36 seconds.

Test case D:

INFO|There is 1 Triton source file in the diff:
INFO|* aiter\ops\triton\_triton_kernels\pod_attention.py
WARNING|Triton source file [aiter\ops\triton\_triton_kernels\pod_attention.py] isn't in the dependency graph, it's unreachable.
WARNING|Couldn't find any test file related to Triton diff.
WARNING|Please check test selection script, there might be a bug in it.
WARNING|Please check Triton code base, there may be untested kernels.
INFO|Finished, execution took 2.34 seconds.

Triton test execution speedup

Using this PR as an example:

  • Source branch: bmazzott/run-triton-tests-selectively
  • Target branch: main

| Metric | Full Triton test suite | Selected Triton tests based on PR diff content | Comment |
|---|---|---|---|
| Test files | 60 | 9 | 85% fewer test files |
| Test cases | 159062 | 17096 | 89.3% fewer test cases |
| Running time | 2h 2m 19s (7322s) | 5m 22s (322s) | 22.7x speedup |
TODO before merging

  • Address all comments made by Bruno Mazzotti, the PR author.
  • Find a way to integrate the script with the CI infrastructure and GitHub Actions. I think the AITER CI team (Xin Huang + Leonid Drozdov) can help with this.

IMPORTANT NOTICE!

  • The sole goal of this PR is to run the script on CI for every subsequent PR to be merged.
  • Under no circumstances should a script execution error be considered a CI failure.
  • For now, we should not trust the script output, just monitor it and fix issues as they appear. Only once we are confident in the result should it be used to select which tests to run.
  • Another must-have condition for letting the script drive test selection is a periodic run of the full Triton test suite on the main branch.

Submission Checklist

@brunomazzottiamd brunomazzottiamd self-assigned this Dec 18, 2025
@brunomazzottiamd brunomazzottiamd added enhancement New feature or request triton labels Dec 18, 2025
brunomazzottiamd

This comment was marked as resolved.

@brunomazzottiamd brunomazzottiamd force-pushed the bmazzott/run-triton-tests-selectively branch 2 times, most recently from 64e8d2c to dc110b6 Compare January 6, 2026 21:13
@brunomazzottiamd brunomazzottiamd force-pushed the bmazzott/run-triton-tests-selectively branch 5 times, most recently from 6e37b57 to 45d86fc Compare January 7, 2026 19:31
source_branch,
)
else:
git_check_branch(source_branch)
Member

I'm not sure why CI didn't find the branch. Could you add some debug steps like git branch -r and git branch?

Contributor Author

Fine, I'll add some git commands before running the script to see what's going on. Thank you for the suggestion.

Contributor Author

I couldn't run any git command in the container with docker exec -w /workspace triton_test git ....

I was getting the following error:

fatal: detected dubious ownership in repository at '/workspace'
To add an exception for this directory, call:
	git config --global --add safe.directory /workspace

This error comes from a Git security check that triggers when the repository's owner differs from the user running the git command. It prevents unauthorized scripts from running in repositories not owned by the current user. It can be resolved either by explicitly marking the directory as safe in the Git configuration or by changing the directory's ownership to the current user.

After running git config --global --add safe.directory /workspace the test selection script is working. However, I'm not sure if this is the best solution! I'm open to suggestions!

Contributor Author

For the sake of reference:

docker exec -w /workspace triton_test \
git config --global --add safe.directory /workspace

@brunomazzottiamd brunomazzottiamd force-pushed the bmazzott/run-triton-tests-selectively branch 5 times, most recently from 10d6be7 to c744074 Compare January 8, 2026 17:43
@brunomazzottiamd brunomazzottiamd force-pushed the bmazzott/run-triton-tests-selectively branch from eabb64d to c43f997 Compare January 8, 2026 18:23
Contributor Author

@brunomazzottiamd brunomazzottiamd left a comment

Minor issues to be resolved.

git fetch origin --no-tags \
+refs/heads/${{ github.base_ref }}:refs/remotes/origin/${{ github.base_ref }}
# TODO: Evaluate if this step should be removed before merging.
Contributor Author

Resolve before merging.

triton_test \
pip install pytest
# TODO: Evaluate if the security exception is the best way to fix the script execution.
Contributor Author

Resolve before merging.

python .github/scripts/select_triton_tests.py \
--source '${{ github.head_ref }}' --target 'remotes/origin/${{ github.base_ref }}' \
--env-var TRITON_TEST --env-file "${ENV_FILE}"
# TODO: Comment the following command before merging.
Contributor Author

Resolve before merging, i.e. comment the following line.

@brunomazzottiamd
Contributor Author

Marked old discussions as resolved, hiding them.

@brunomazzottiamd
Contributor Author

@gyohuangxin, I have good news to share: The test selection script is working on CI! Taking this PR as an example, we're able to achieve a 22x speedup on test execution. Please take a look at "Triton test execution speedup" section of PR description, I updated it with more details.

I'd be glad if you could review the proposed changes on .github/workflows/triton-test.yaml.

Tomorrow I'll try to handle forked PRs, I think it's a matter of fetching the source and target branches.

@brunomazzottiamd brunomazzottiamd force-pushed the bmazzott/run-triton-tests-selectively branch from c43f997 to 57a8c61 Compare January 9, 2026 12:02
# TODO: Evaluate if the security exception is the best way to fix the script execution.
# TODO: Uncomment [docker exec -w /workspace triton_test cat "${ENV_FILE}" >> "${GITHUB_ENV}"]
# command to enable test selection.
- name: Triton Test Selection Script
Member

Since we will always run all tests on the main branch, we should skip this step when running on main.

Suggested change
- name: Triton Test Selection Script
- name: Triton Test Selection Script
if: ${{ github.ref != 'refs/heads/main' }}

Member

Regarding forked PRs, I think the simplest approach is to still run the full test suite. So we can change it to the following:

Suggested change
- name: Triton Test Selection Script
- name: Triton Test Selection Script
if: ${{ github.ref != 'refs/heads/main' && !github.event.pull_request.head.repo.fork }}

Contributor Author

I'm trying to enable forked PRs, but I'm having a hard time... I'll try a bit more today, if I can't get it to work then I'll add the if statement as you suggested.

Contributor Author

I did this:

      - name: Run a step only if it's from a fork
        if: ${{ github.event.pull_request.head.repo.fork }}
        run: echo "This PR is from a fork."

      - name: Run a step only if it's from the same repo (not a fork)
        if: ${{ !github.event.pull_request.head.repo.fork }}
        run: echo "This PR is from an internal branch."

It's always printing This PR is from a fork., for #1682 (not a fork) and #1804 (a fork). What's going on? I'm very confused...
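
For reference, another way to tell forked PRs apart is to compare the head and base repository names in the pull request event payload, which GitHub Actions exposes as a JSON file at `GITHUB_EVENT_PATH`. The sketch below only illustrates that comparison and is not a diagnosis of the behavior described above; `is_fork_pr` and `load_event` are hypothetical helper names, and the repository names in the example are made up.

```python
import json
import os
from typing import Optional


def is_fork_pr(event: dict) -> bool:
    """Return True when the PR head lives in a different repository than the base."""
    pr = event.get("pull_request")
    if not pr:
        return False  # not a pull request event
    head_repo = pr["head"]["repo"]
    if head_repo is None:
        # Assumption: a missing head repository (e.g. a deleted fork) is treated as a fork.
        return True
    return head_repo["full_name"] != pr["base"]["repo"]["full_name"]


def load_event(path: Optional[str] = None) -> dict:
    """Load the event payload, defaulting to the file referenced by GITHUB_EVENT_PATH."""
    path = path or os.environ["GITHUB_EVENT_PATH"]
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Made-up payloads, for illustration only.
internal = {"pull_request": {"head": {"repo": {"full_name": "org/project"}},
                             "base": {"repo": {"full_name": "org/project"}}}}
forked = {"pull_request": {"head": {"repo": {"full_name": "user/project"}},
                           "base": {"repo": {"full_name": "org/project"}}}}
print(is_fork_pr(internal), is_fork_pr(forked))  # False True
```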

echo "Running Triton Tests..."
docker exec -w /workspace triton_test mkdir -p test-reports
docker exec -w /workspace triton_test pytest -v ${{ env.TRITON_TEST }} --junitxml=test-reports/triton.xml
docker exec -w /workspace triton_test pytest -v ${TRITON_TEST} --junitxml=test-reports/triton.xml
Member

As we skip the tests on the main branch, we need to add another step.
echo ${{ env.TRITON_TEST }} >> "${GITHUB_ENV}"

Contributor Author

From my previous tests, I think this isn't required. The initial value of ${TRITON_TEST} is ${{ env.TRITON_TEST }}, so as long as we don't modify the ${GITHUB_ENV} file it should be fine. I'll run more tests later today to confirm my hypothesis.

@gyohuangxin
Member

@brunomazzottiamd Congratulations! This is a huge amount of work, thanks so much for all the effort for AIter CI!

@brunomazzottiamd
Contributor Author

Forked PR test being done in #1804.

@brunomazzottiamd brunomazzottiamd force-pushed the bmazzott/run-triton-tests-selectively branch 4 times, most recently from 5d3c6c5 to 4102cb9 Compare January 9, 2026 18:41
* Add benchmarks and test selection script to Triton paths filter.
* Install NetworkX dependency of test selection script.
* Fetch target branch from remote.
* Show available branches: It's for debugging purposes, maybe it could
  be removed before merging.
* Run test selection script: It's still a dry run, the result of test
  selection script isn't being used by pytest yet.
@brunomazzottiamd brunomazzottiamd force-pushed the bmazzott/run-triton-tests-selectively branch from 4102cb9 to a0662d4 Compare January 9, 2026 19:05