NVIDIA_TF32_OVERRIDE=0 in test fixtures #4392

Merged
wujingyue merged 1 commit into main from wjy/env on May 7, 2025
@wujingyue
Collaborator

Our unit tests heavily depend on this flag and our CI already sets this when running any tests. I think it's better to set this in test fixtures instead so we don't have to remember to do so when running tests locally.

@wujingyue wujingyue requested review from naoyam and xwang233 May 7, 2025 19:16
@wujingyue
Collaborator Author

!test

@github-actions

github-actions bot commented May 7, 2025

Description

  • Set NVIDIA_TF32_OVERRIDE=0 in test fixtures

  • Ensures consistent test environment locally and in CI

  • Added environment variable setting in multiple test files


Changes walkthrough 📝

Relevant files
Enhancement
  • utils.cpp: Set NVIDIA_TF32_OVERRIDE in C++ test fixtures
    tests/cpp/utils.cpp (+5/-0)
    Added setting of NVIDIA_TF32_OVERRIDE environment variable to 0 in NVFuserTest constructor

  • utils.py: Set NVIDIA_TF32_OVERRIDE in Python test fixtures
    python/nvfuser/testing/utils.py (+2/-0)
    Added setting of NVIDIA_TF32_OVERRIDE environment variable to 0 in setup_class method

  • conftest.py: Set NVIDIA_TF32_OVERRIDE in multidevice test fixtures
    tests/python/multidevice/conftest.py (+3/-0)
    Added setting of NVIDIA_TF32_OVERRIDE environment variable to 0 in multidevice_test fixture
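
For readers unfamiliar with the pattern, here is a minimal sketch of what pinning the flag in a class-level setup hook could look like. It is not the code from this PR: the class names `ExampleTestBase` and `TestFlagIsPinned` are hypothetical, and only the `setup_class` hook name and the environment variable come from the summary above.

```python
import os

TF32_OVERRIDE = "NVIDIA_TF32_OVERRIDE"


class ExampleTestBase:
    """Hypothetical base class; a stand-in for a shared test fixture class."""

    @classmethod
    def setup_class(cls):
        # pytest calls setup_class once before the tests of a class run.
        # Setting the variable here means nobody has to export it by hand
        # before running the tests locally.
        os.environ[TF32_OVERRIDE] = "0"


class TestFlagIsPinned(ExampleTestBase):
    """Hypothetical test class that inherits the setup hook."""

    def test_flag_is_pinned(self):
        assert os.environ[TF32_OVERRIDE] == "0"
```

Every test class that inherits from the base class picks up the setting, which mirrors the intent stated in the description: the fixture, not the developer's shell, is responsible for the flag.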

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
⚡ Recommended focus areas for review

Error Handling

The current error handling for setenv only logs a warning. Consider adding more robust error handling or ensuring that the test suite can handle this failure gracefully.

    if (setenv(kTf32Override, "0", /*overwrite=*/1) != 0) {
      TORCH_WARN("Failed to set ", kTf32Override, " to 0");
    }

Fixture Scope

Setting os.environ["NVIDIA_TF32_OVERRIDE"] in the multidevice_test fixture might not be the appropriate scope. Consider if this should be set at a higher level or if there are any side effects on other tests.

    os.environ["NVIDIA_TF32_OVERRIDE"] = "0"
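
One way to address the side-effect concern is to restore the previous value once the tests finish. The sketch below is an illustration only and is not part of this PR; the helper name `pinned_env` is hypothetical, and it uses only the standard library.

```python
import contextlib
import os


@contextlib.contextmanager
def pinned_env(name, value):
    """Set os.environ[name] to value, restoring the previous state on exit.

    Hypothetical helper, not taken from the PR.
    """
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable was unset before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


# Usage: the flag is pinned only inside the block.
with pinned_env("NVIDIA_TF32_OVERRIDE", "0"):
    assert os.environ["NVIDIA_TF32_OVERRIDE"] == "0"
```

In a pytest suite, a helper like this could be wrapped in a fixture declared with `scope="session"` and `autouse=True` in a top-level conftest.py, which would apply the setting to every test rather than only to tests that request one particular fixture. Whether that broader scope is wanted here is a judgment call for the maintainers.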

@wujingyue wujingyue merged commit a9f306a into main May 7, 2025
44 of 45 checks passed
@wujingyue wujingyue deleted the wjy/env branch May 7, 2025 20:33