
utils: handle flash_attn missing from importlib packages_distributions without crashing #45524

Open

SAY-5 wants to merge 1 commit into huggingface:main from SAY-5:fix/flash-attn-keyerror-45520

Conversation


@SAY-5 SAY-5 commented Apr 20, 2026

Fixes #45520.

is_flash_attn_2_available, is_flash_attn_3_available, is_flash_attn_4_available, and is_flash_attn_greater_or_equal all do two checks:

is_available, _ = _is_package_available("flash_attn", return_version=True)
is_available = is_available and "flash-attn" in [
    pkg.replace("_", "-") for pkg in PACKAGE_DISTRIBUTION_MAPPING["flash_attn"]
]

Step 1 uses importlib.util.find_spec, which returns a spec if any flash_attn import is findable — an editable install, a namespace package, a bundled shim, or a stub module sitting under another project. Step 2 then assumes that every findable import name also has an entry in importlib.metadata.packages_distributions().

That assumption does not hold in practice. On Python 3.13 with ComfyUI setups (as reported in #45520), and more generally in any environment where the flash_attn import is resolvable via a non-pip source, packages_distributions() has no "flash_attn" key. Short-circuit evaluation of the outer `and` does not protect us here: step 1 found a spec, so `is_available` is True, the right-hand operand is evaluated, and the subscript PACKAGE_DISTRIBUTION_MAPPING["flash_attn"] raises. The KeyError fires during `transformers` import and takes down the whole process before any model is loaded.
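For concreteness, here is a minimal standalone sketch of the two standard-library calls involved (the variable names and comments are mine; "flash_attn" stands in for any import name that resolves without matching pip metadata):

import importlib.metadata
import importlib.util

spec = importlib.util.find_spec("flash_attn")          # step 1: non-None for any findable import
mapping = importlib.metadata.packages_distributions()  # import name -> [distribution names]

if spec is not None and "flash_attn" not in mapping:
    # The situation from #45520: the module is importable, but
    # mapping["flash_attn"] would raise KeyError.
    print("flash_attn is importable but has no distribution metadata")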

Swap the four raising subscripts for .get(name, []). If the name is missing from the distribution map we simply conclude that the requested flash-attention flavour is not properly installed — which is the answer is_flash_attn_*_available() would have returned anyway — instead of raising. The inner helper _is_package_available already wraps the same subscript in a try/except, so we are only making the outer call sites match that contract.
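Concretely, the changed call sites take roughly this shape (a sketch of the pattern, not the exact diff):

is_available, _ = _is_package_available("flash_attn", return_version=True)
is_available = is_available and "flash-attn" in [
    # .get() yields [] when the import name has no entry in the distribution
    # map, so the membership test evaluates to False instead of raising KeyError.
    pkg.replace("_", "-") for pkg in PACKAGE_DISTRIBUTION_MAPPING.get("flash_attn", [])
]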

@Rocketknight1
Member

cc @vasqu

Contributor

@vasqu vasqu left a comment


Thank you, seems valid in either case. Can you

  1. Update the same in modeling flash attn utils
    FLASH_ATTENTION_COMPATIBILITY_MATRIX = {
        2: {
            "flash_attn_version": 2,
            "general_availability_check": is_flash_attn_2_available,
            "pkg_availability_check": lambda *args, **kwargs: importlib.util.find_spec("flash_attn") is not None
            and "flash-attn" in [pkg.replace("_", "-") for pkg in PACKAGE_DISTRIBUTION_MAPPING["flash_attn"]],
            "supported_devices": (
                (is_torch_cuda_available, "cuda"),
                (is_torch_mlu_available, "mlu"),
                (is_torch_npu_available, "npu"),
                (is_torch_xpu_available, "xpu"),
            ),
            "custom_supported_devices": (
                (is_torch_npu_available, "Detect using FlashAttention2 on Ascend NPU."),
                (
                    is_torch_xpu_available,
                    f"Detect using FlashAttention2 (via kernel `{FLASH_ATTN_KERNEL_FALLBACK['flash_attention_2']}`) on XPU.",
                ),
            ),
        },
        3: {
            "flash_attn_version": 3,
            "general_availability_check": is_flash_attn_3_available,
            "pkg_availability_check": lambda *args, **kwargs: importlib.util.find_spec("flash_attn_interface") is not None
            and "flash-attn-3" in [pkg.replace("_", "-") for pkg in PACKAGE_DISTRIBUTION_MAPPING["flash_attn_interface"]],
            "supported_devices": ((is_torch_cuda_available, "cuda"),),
            "cuda_min_major_version": 8,  # Ampere
        },
        4: {
            "flash_attn_version": 4,
            "general_availability_check": is_flash_attn_4_available,
            "pkg_availability_check": lambda *args, **kwargs: importlib.util.find_spec("flash_attn") is not None
            and "flash-attn-4" in [pkg.replace("_", "-") for pkg in PACKAGE_DISTRIBUTION_MAPPING["flash_attn"]],
            "supported_devices": ((is_torch_cuda_available, "cuda"),),
            "cuda_min_major_version": 9,  # Hopper
        },
    }
  2. Add a small test similar to the existing `def test_not_available_flash(self)` (a rough sketch follows below).
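For illustration, a rough sketch of what such a test could look like; the class name, module path, and patch target are assumptions rather than the existing test, and if the availability checks are cached (e.g. via lru_cache) the cache would need clearing first:

    import unittest
    from unittest import mock

    import transformers.utils.import_utils as import_utils


    class FlashAttnDistributionMapTest(unittest.TestCase):
        def test_flash_attn_missing_from_distribution_map(self):
            # Simulate an environment where the flash_attn import may resolve but
            # packages_distributions() produced no "flash_attn" entry.
            with mock.patch.dict(import_utils.PACKAGE_DISTRIBUTION_MAPPING, clear=True):
                # With the .get(name, []) lookup these return False instead of raising KeyError.
                self.assertFalse(import_utils.is_flash_attn_2_available())
                self.assertFalse(import_utils.is_flash_attn_3_available())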



Development

Successfully merging this pull request may close these issues.

KeyError: 'flash_attn' in import_utils.py when running on Python 3.13

3 participants